Re: [HACKERS] Make more portable TAP tests of initdb

2015-05-02 Thread Michael Paquier
On Sun, May 3, 2015 at 8:09 AM, Noah Misch  wrote:
> On Fri, May 01, 2015 at 03:23:44PM +0900, Michael Paquier wrote:
>> On Thu, Apr 30, 2015 at 12:17 PM, Noah Misch  wrote:
>> > As I pondered this, I felt it would do better to solve a different problem.
>> > The "rm -rf" invocations presumably crept in to reduce peak disk usage.
>> > Considering the relatively-high duration of a successful initdb run, I doubt
>> > we get good value from so many positive tests.  I squashed those positive
>> > tests into one, as attached.  (I changed the order of negative tests but kept
>> > them all.)  This removed the need for intra-script cleaning, and it
>> > accelerated script runtime from 22s to 9s.  Does this seem good to you?
>>
>> That's a trade-off between code simplicity and test performance. FWIW, I
>> like the test suite with many positive tests because it makes the test
>> flow easy to read. And as this test suite is meant to grow with new tests
>> in the future, I am guessing that people may not always follow the
>> restriction this patch adds.
>
> True.
>
>>  command_fails(
>> -   [ 'initdb', "$tempdir/data", '-X', 'pgxlog' ],
>> +   [ 'initdb', $datadir, '-X', 'pgxlog' ],
>> 'relative xlog directory not allowed');
>> $datadir should be the last argument. This would break on platforms
>> where src/port/getopt_long.c is used, like Windows.
>
> I committed that as a separate patch; thanks.
>
>> +command_ok([ 'initdb', '-N', '-T', 'german', '-X', $xlogdir, $datadir ],
>> +   'successful creation');
>> This is grouping the test for nosync, existing nonempty xlog
>> directory, and the one for selection of the default dictionary.
>> Shouldn't this test name be updated otherwise then?
>>
>> Could you add a comment mentioning that tests are grouped to reduce
>> I/O impact during a run?
>
> Committed with a comment added.  The "successful creation" test name is
> generic because it's the designated place to add testing of new options.

Thanks. The overall result looks good to me.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)

2015-05-02 Thread Shigeru HANADA
Thanks for the comments.

On 2015/05/01 22:35, Robert Haas wrote:
> On Mon, Apr 27, 2015 at 5:07 AM, Shigeru Hanada
>  wrote:
>> 2015-04-27 11:00 GMT+09:00 Kouhei Kaigai :
>>> Hanada-san, could you adjust your postgres_fdw patch according to
>>> the above new (previous?) definition.
>> 
>> The attached v14 patch is the revised version for your v13 patch.  It also 
>> contains changes for Ashutosh’s comments.
> 
> We should probably move this discussion to a new thread now that the
> other patch is committed.  Changing subject line accordingly.
> 
> Generally, there's an awful lot of changes in this patch - it is over
> 2000 insertions and more than 450 deletions - and it's not awfully
> obvious why all of those changes are there.  I think this patch needs
> a detailed README to accompany it explaining what the various changes
> in the patch are and why those things got changed; or maybe there is a
> way to break it up into multiple patches so that we can take a more
> incremental approach.  I am really suspicious of the amount of
> wholesale reorganization of code that this patch is doing.  It's
> really hard to validate that a reorganization like that is necessary,
> or that it's correct, and it's gonna make back-patching noticeably
> harder in the future.  If we really need this much code churn it needs
> careful justification; if we don't, we shouldn't do it.
> 

I agree.  I’ll write a detailed description for the patch and repost a new 
version rebased onto current HEAD.  I’m sorry, but it will take a day or so...

> +SET enable_mergejoin = off; -- planner chooses MergeJoin even though it
> has higher costs, so disable it for testing.
> 
> This seems awfully strange.  Why would the planner choose a plan if it
> had a higher cost?

I thought so too, but I couldn’t reproduce such a situation now.  I’ll 
investigate further.  If the issue is gone, I’ll remove that SET statement 
to keep the tests straightforward.


> 
> -* If the table or the server is configured to use remote estimates,
> -* identify which user to do remote access as during planning.  This
> +* Identify which user to do remote access as during planning.  This
> * should match what ExecCheckRTEPerms() does.  If we fail due
> to lack of
> * permissions, the query would have failed at runtime anyway.
> */
> -   if (fpinfo->use_remote_estimate)
> -   {
> -   RangeTblEntry *rte = planner_rt_fetch(baserel->relid, root);
> -   Oid userid = rte->checkAsUser ?
> rte->checkAsUser : GetUserId();
> -
> -   fpinfo->user = GetUserMapping(userid, 
> fpinfo->server->serverid);
> -   }
> -   else
> -   fpinfo->user = NULL;
> +   rte = planner_rt_fetch(baserel->relid, root);
> +   fpinfo->userid = rte->checkAsUser ? rte->checkAsUser : GetUserId();
> 
> So, wait a minute, remote estimates aren't optional any more?

No, it seems to have been removed accidentally.  I’ll check the reason again, 
but I’ll revert the change unless I find a problem.

--
Shigeru HANADA
shigeru.han...@gmail.com








Re: [HACKERS] Implementing SQL ASSERTION

2015-05-02 Thread David Fetter
On Sat, May 02, 2015 at 10:42:24PM +0100, Joe Wildish wrote:
> 
> > On 1 May 2015, at 19:51, Robert Haas  wrote:
> > 
> > On Thu, Apr 30, 2015 at 6:36 PM, Joe Wildish
> >  wrote:
> >> I’m wondering if there are other people out there working on implementing 
> >> SQL ASSERTION functionality?
> >> 
> >> I’ve recently spent a bit of time looking to implement the execution 
> >> models described in “Applied Mathematics for Database Professionals” by 
> >> Toon Koppelaars and Lex de Haan.  I’ve gotten as far as execution model 3 
> >> and am now looking at deriving polarity of involved tables to do EM4 
> >> (described in some detail in “Deriving Production Rules for Constraint 
> >> Maintenance”, Ceri & Widom, VLDB Conference 1990, p555-577). EM5 & EM6 
> >> look rather more difficult but I’m intending to try and implement those, 
> >> too.
> >> 
> >> If there are other people working on this stuff it would be great to 
> >> collaborate.
> > 
> > I don't know of anyone working on this.  It sounds very difficult.
> 
> The book I mention details a series of execution models, where each 
> successive model aims to validate the assertion in a more efficient manner 
> than the last. This is achieved by performing static analysis of the 
> assertion's expression to determine under what circumstances the assertion 
> need be (re)checked. Briefly:
> 
> EM1: after all DML statements;
> EM2: only after DML statements involving tables mentioned in the assertion 
> expression;
> EM3: only after DML statements involving the columns mentioned in the 
> assertion expression;
> EM4: only after DML statements involving the columns, plus if the statement 
> has a “polarity” that may affect the assertion expression.
> 
> “Polarity" here means that one is able to (statically) determine if only 
> INSERTS and not DELETES can affect an expression or vice-versa.
> 
> EMs 5 and 6 are further enhancements that make use of querying the 
> “transition effect” data of what actually changed in a statement, to 
> determine if the assertion expression need be validated. I’ve not done as 
> much reading around this topic yet so am concentrating on EMs 1-4.
> 
> I agree it is a difficult problem but there are a fair number of published 
> academic papers relating to this topic. The AM4DP book draws a lot of this 
> research together and presents the execution models.
> 
> I may start writing up on a blog of where I get to, and then post further to 
> this list, if there is interest.

I suspect that you would get a lot further with a PoC patch including
the needed documentation.  Remember to include how this would work at
all the transaction isolation levels and combinations of same that we
support.  Recall also to include the lock strength needed.  Just about
anything can be done with a database-wide lock :)

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] BRIN range operator class

2015-05-02 Thread Andreas Karlsson

On 04/06/2015 09:36 PM, Emre Hasegeli wrote:

Yes but they were also required by this patch.  This version adds more
functions and operators.  I can split them appropriately after your
review.



Ok, sounds fine to me.


It is now split.


In which order should I apply the patches?

I also agree with Alvaro's comments.

--
Andreas Karlsson




Re: [HACKERS] Make more portable TAP tests of initdb

2015-05-02 Thread Noah Misch
On Fri, May 01, 2015 at 03:23:44PM +0900, Michael Paquier wrote:
> On Thu, Apr 30, 2015 at 12:17 PM, Noah Misch  wrote:
> > As I pondered this, I felt it would do better to solve a different problem.
> > The "rm -rf" invocations presumably crept in to reduce peak disk usage.
> > Considering the relatively-high duration of a successful initdb run, I doubt
> > we get good value from so many positive tests.  I squashed those positive
> > tests into one, as attached.  (I changed the order of negative tests but kept
> > them all.)  This removed the need for intra-script cleaning, and it
> > accelerated script runtime from 22s to 9s.  Does this seem good to you?
> 
> That's a trade-off between code simplicity and test performance. FWIW, I
> like the test suite with many positive tests because it makes the test
> flow easy to read. And as this test suite is meant to grow with new tests
> in the future, I am guessing that people may not always follow the
> restriction this patch adds.

True.

>  command_fails(
> -   [ 'initdb', "$tempdir/data", '-X', 'pgxlog' ],
> +   [ 'initdb', $datadir, '-X', 'pgxlog' ],
> 'relative xlog directory not allowed');
> $datadir should be the last argument. This would break on platforms
> where src/port/getopt_long.c is used, like Windows.

I committed that as a separate patch; thanks.

> +command_ok([ 'initdb', '-N', '-T', 'german', '-X', $xlogdir, $datadir ],
> +   'successful creation');
> This is grouping the test for nosync, existing nonempty xlog
> directory, and the one for selection of the default dictionary.
> Shouldn't this test name be updated otherwise then?
> 
> Could you add a comment mentioning that tests are grouped to reduce
> I/O impact during a run?

Committed with a comment added.  The "successful creation" test name is
generic because it's the designated place to add testing of new options.




Re: [HACKERS] Implementing SQL ASSERTION

2015-05-02 Thread Joe Wildish

> On 1 May 2015, at 19:51, Robert Haas  wrote:
> 
> On Thu, Apr 30, 2015 at 6:36 PM, Joe Wildish
>  wrote:
>> I’m wondering if there are other people out there working on implementing 
>> SQL ASSERTION functionality?
>> 
>> I’ve recently spent a bit of time looking to implement the execution models 
>> described in “Applied Mathematics for Database Professionals” by Toon 
>> Koppelaars and Lex de Haan.  I’ve gotten as far as execution model 3 and am 
>> now looking at deriving polarity of involved tables to do EM4 (described in 
>> some detail in “Deriving Production Rules for Constraint Maintenance”, Ceri 
>> & Widom, VLDB Conference 1990, p555-577). EM5 & EM6 look rather more 
>> difficult but I’m intending to try and implement those, too.
>> 
>> If there are other people working on this stuff it would be great to 
>> collaborate.
> 
> I don't know of anyone working on this.  It sounds very difficult.

The book I mention details a series of execution models, where each successive 
model aims to validate the assertion in a more efficient manner than the last. 
This is achieved by performing static analysis of the assertion's expression to 
determine under what circumstances the assertion need be (re)checked. Briefly:

EM1: after all DML statements;
EM2: only after DML statements involving tables mentioned in the assertion 
expression;
EM3: only after DML statements involving the columns mentioned in the assertion 
expression;
EM4: only after DML statements involving the columns, plus if the statement has 
a “polarity” that may affect the assertion expression.

“Polarity" here means that one is able to (statically) determine if only 
INSERTS and not DELETES can affect an expression or vice-versa.

EMs 5 and 6 are further enhancements that make use of querying the “transition 
effect” data of what actually changed in a statement, to determine if the 
assertion expression need be validated. I’ve not done as much reading around 
this topic yet so am concentrating on EMs 1-4.
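
As an illustration of the EM2-EM4 distinctions above, consider a hypothetical 
budget constraint.  CREATE ASSERTION here is the SQL-standard syntax (not 
implemented in PostgreSQL), and the emp/dept tables are invented for the 
example:

```sql
-- Hypothetical schema; SQL-standard assertion syntax (illustration only):
CREATE ASSERTION budget_respected CHECK (
    NOT EXISTS (
        SELECT 1
        FROM   dept d
        WHERE  d.budget < (SELECT coalesce(sum(e.salary), 0)
                           FROM   emp e
                           WHERE  e.dept_id = d.id)
    )
);
```

EM2 would recheck this after any DML on emp or dept; EM3 narrows that to 
statements touching emp.salary, emp.dept_id, dept.budget or dept.id; EM4 adds 
polarity: an INSERT into emp (or a salary increase) can violate the assertion, 
while a DELETE from emp never can.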

I agree it is a difficult problem but there are a fair number of published 
academic papers relating to this topic. The AM4DP book draws a lot of this 
research together and presents the execution models.

I may start writing up on a blog of where I get to, and then post further to 
this list, if there is interest.

Regards.
-Joe





Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-05-02 Thread Fabien COELHO


It might be done in the other direction, though - the "writer" thread 
might collect current results at the end of the interval.


Yep, you can indeed accumulate per thread and sum on the end of the 
interval, but a lock is still needed if you want exact figures.


ISTM that this is what is done for --progress, without even bothering to 
lock anything (even if transactions are fast and some inconsistency arises, 
this is just for showing how things are faring on stderr, so no big deal).



"fprintf" does a lot of processing, so it is the main issue.


The question is whether the processing happens while holding the lock, 
though. I don't think that's the case.


Alas, I'm pretty nearly sure it is necessarily the case: there is no 
intermediate buffer in fprintf; the buffering is done through the file 
handle, so a lock must be acquired throughout format processing.


https://www.gnu.org/software/libc/manual/html_node/Streams-and-Threads.html#Streams-and-Threads

"The POSIX standard requires that by default the stream operations are 
atomic.  I.e., issuing two stream operations for the same stream in two 
threads at the same time will cause the operations to be executed as if 
they were issued sequentially. The buffer operations performed while 
reading or writing are protected from other uses of the same stream. To do 
this each stream has an internal lock object which has to be (implicitly) 
acquired before any work can be done. "


--
Fabien.




Re: [HACKERS] parallel mode and parallel contexts

2015-05-02 Thread Noah Misch
On Tue, Apr 21, 2015 at 02:08:00PM -0400, Robert Haas wrote:
> On Tue, Mar 24, 2015 at 3:26 PM, Andres Freund  wrote:

[proposal: teach deadlock detector about parallel master waiting on workers]

> > deadlock.c is far from simple, and at least I don't find the control
> > flow to be particularly clear. So it's not easy.  It'd be advantageous
> > to tackle things at that level because it'd avoid the need to acquire
> > locks on some lock's waitqueue when blocking; we're going to do that a
> > lot.
> >
> > But it seems to me that it should be possible to succeed: In
> > FindLockCycleRecurse(), in the case that we're not waiting for an actual
> > lock (checkProc->links.next == NULL) we can add a case that considers
> > the 'blocking parallelism' case. ISTM that that's just a
> > FindLockCycleRecurse() call on the process that we're waiting for. We'd
> > either have to find the other side's locktag for DEADLOCK_INFO or invent
> > another field in there; but that seems like a solvable problem.

> 1. The master doesn't actually *wait* until the very end of the parallel 
> phase.
> 2. When the master does wait, it waits for all of the parallel workers
> at once, not each one individually.
> 
> So, I don't think anything as simplistic as teaching a blocking
> shm_mq_receive() to tip off the deadlock detector that we're waiting
> for the process on the other end of that particular queue can ever
> work.  Because of point #2, that never happens.  When I first started
> thinking about how to fix this, I said, well, that's no big deal, we
> can just advertise the whole list of processes that we're waiting for
> in shared memory, rather than just the one.  This is a bit tricky,
> though. Any general API for any backend to advertise that it's waiting
> for an arbitrary subset of the other backends would require O(n^2)
> shared memory state.  That wouldn't be completely insane, but it's not
> great, either.  For this particular case, we could optimize that down
> to O(n) by just chaining all of the children of a given parallel group
> leader in a linked list whose nodes are inlined in their PGPROCs, but that
> doesn't feel very general, because it forecloses the possibility of
> the children ever using that API, and I think they might need to.  If
> nothing else, they might need to advertise that they're blocking on
> the master if they are trying to send data, the queue is full, and
> they have to wait for the master to drain some of it before they can
> proceed.

Perhaps, rather than model it as master M waiting on worker list W1|W2|W3,
model it with queue-nonempty and queue-nonfull events, one pair per queue.  M
subscribes to queue0-nonempty, and each of W1,W2,W3 publish to it.  M
publishes to queue0-nonfull, and the workers subscribe.  An edge forms in the
deadlock detector's graph when all of an event's subscribers wait on it.  (One
can approximate that in userspace with a pair of advisory locks.)

> After thinking about it a bit more, I realized that even if we settle
> on some solution to that problem, there's another issues: the
> wait-edges created by this system don't really have the same semantics
> as regular lock waits.  Suppose A is waiting on a lock held by B and B
> is waiting for a lock held by A; that's a deadlock.  But if A is
> waiting for B to write to a tuple queue and B is waiting for A to read
> from a tuple queue, that's not a deadlock if the queues in question
> are the same.

I do see a deadlock.  B wants to block until A reads from the queue, so it
advertises that and sleeps.  Waking up deadlock_timeout later, it notices that
A is waiting for B to write something.  B will not spontaneously suspend
waiting to write to the queue, nor will A suspend waiting to read from the
queue.  Thus, the deadlock is valid.  This assumes the deadlock detector
reasons from an authoritative picture of pending waits and that we reliably
wake up a process when the condition it sought has arrived.  We have that for
heavyweight lock acquisitions.  The assumption may be incompatible with
Andres's hope, quoted above, of avoiding the full lock acquisition procedure.

> A further difference in the semantics of these wait-edges is that if
> process A is awaiting AccessExclusiveLock on a resource held by B, C,
> and D at AccessShareLock, it needs to wait for all of those processes
> to release their locks before it can do anything else.  On the other
> hand, if process A is awaiting tuples from B, C, and D, it just needs
> ONE of those processes to emit tuples in order to make progress.  Now

Right.

> maybe that doesn't make any difference in practice, because even if
> two of those processes are making lively progress and A is receiving
> tuples from them and processing them like gangbusters, that's probably
> not going to help the third one get unstuck.  If we adopt the approach
> of discounting that possibility, then as long as the parallel leader
> can generate tuples locally (that is, local_scan_done = false) we
> don't

Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-05-02 Thread Tomas Vondra

Hi,

On 05/02/15 15:30, Fabien COELHO wrote:


Hello,


The counters are updated when the transaction is finished anyway?


Yes, but the thread does not know it's time to write the results until
it completes the first transaction after the interval ends ...

Let's say the very first query in thread #1 takes a minute for some
reason, while the other threads process 100 transactions per second. So
before the thread #1 can report 0 transactions for the first second, the
other threads already have results for the 60 intervals.

I think there's no way to make this work except for somehow tracking
timestamp of the last submitted results for each thread, and only
flushing results older than the minimum of the timestamps. But that's
not trivial - it certainly is more complicated than just writing into a
shared file descriptor.


I agree that such an approach would be horrible for very limited
value. However, I was suggesting that a transaction be counted only when
it is finished, so the aggregated data is to be understood as referring
to "finished transactions in the interval", and what is in progress
would be counted in the next interval anyway.


That only works if every single transaction is immediately written into 
the shared buffer/file, but that would require acquiring a lock shared 
by all the threads. And that's not really free - for cases with many 
clients doing tiny transactions, this might be a big issue, for example.


That's why I suggested that each client use a shared buffer for the 
results, and only submit them once the interval is over. Submitting the 
results, however, happens on the first transaction of the next interval; 
if that transaction is long, the results would not be submitted.


It might be done in the other direction, though - the "writer" thread 
might collect current results at the end of the interval.



Merging results for each transaction would not have this issue, but
it would also use the lock much more frequently, and that seems
like a pretty bad idea (especially for the workloads with short
transactions that you suggested are a bad match for the detailed
log - this would make the aggregated log bad too).

Also notice that all the threads will try to merge the data
(and thus acquire the lock) at almost the same time - this is
especially true for very short transactions. I would be surprised
if this did not cause issues on read-only workloads with large
numbers of threads.


ISTM that the aggregated version should fare better than the
detailed log, whatever is done: the key performance issue is that
fprintf is slow; with the aggregated log these calls are infrequent,
and only arithmetic remains in the critical section.


Probably.


(2) The feature would not be available for the thread-emulation with
this approach, but I do not see this as a particular issue as I
think that it is pretty much only dead code and a maintenance burden.


I'm not willing to investigate that, nor am I willing to implement
another feature that works only sometimes (I've done that in the past,
and I find it a bad practice).


[...]


After the small discussion I triggered, I've submitted a patch to drop
thread fork-emulation from pgbench.


OK, good.




[...]
Also, if the lock for the shared buffer is cheaper than the lock
required for fprintf, it may still be an improvement.


Yep. "fprintf" does a lot of processing, so it is the main issue.


The question is whether the processing happens while holding the lock, 
though. I don't think that's the case.



--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] CTE optimization fence on the todo list?

2015-05-02 Thread Andrew Dunstan


On 05/01/2015 07:24 PM, Josh Berkus wrote:


(A possible compromise position would be to offer a new GUC to
enable/disable the optimization globally; that would add only a reasonably
small amount of control code, and people who were afraid of the change
breaking their apps would probably want a global disable anyway.)

We'd need the GUC.  I know of a lot of cases where people are using WITH
clauses specifically to override the query planner, and requiring them
to edit all of their queries in order to enable the old behavior would
become an upgrade barrier.



+100

This could be a very bad, almost impossible to catch, behaviour break. 
Even if we add the GUC, we're probably going to be imposing very 
significant code audit costs on some users.


cheers

andrew






Re: [HACKERS] PATCH: pgbench - logging aggregated info and transactions at the same time

2015-05-02 Thread Fabien COELHO




In doLog, the sample rate test should be mixed in the tlogfile condition
so as to avoid calling rand if there is no logging anyway.

 if (tlogfile && sample_rate != 0 && ...)


Oops, wrong logic. Rather:

  if (tlogfile && (sample_rate == 0 || ... random stuff))

--
Fabien.




Re: [HACKERS] PATCH: pgbench - logging aggregated info and transactions at the same time

2015-05-02 Thread Fabien COELHO



another thing that I find annoying on pgbench is that you can either log
aggregated summary (per interval) or detailed transaction info (possibly
sampled), but not both at the same time.


Here is a review.

Patch applied, compiled and run. Note that it needs a rebase after the 
pgbench move to src/bin.



Overall comments:

This feature is a good thing: it removes a current limitation which is 
not really logical.


I think that the user interface must be improved. I would suggest that 
option "-a/--aggregate-log" should trigger the aggregate log with a 
default interval of, for instance, 1 or 10 seconds, and that 
"--aggregate-interval" would change the default.


This patch could probably wait for the resolution of the pgbench merged-log 
patch submitted in parallel, as it seems to me that the two interfere 
significantly.



Detailed comments:

In doLog, the sample rate test should be mixed in the tlogfile condition
so as to avoid calling rand if there is no logging anyway.

  if (tlogfile && sample_rate != 0 && ...)

The error message when the aggregate log file cannot be opened refers to the 
"transaction" log file.

The documentation update is missing.

--
Fabien.




Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-05-02 Thread Fabien COELHO


Hello,


The counters are updated when the transaction is finished anyway?


Yes, but the thread does not know it's time to write the results until
it completes the first transaction after the interval ends ...

Let's say the very first query in thread #1 takes a minute for some
reason, while the other threads process 100 transactions per second. So
before the thread #1 can report 0 transactions for the first second, the
other threads already have results for the 60 intervals.

I think there's no way to make this work except for somehow tracking
timestamp of the last submitted results for each thread, and only
flushing results older than the minimum of the timestamps. But that's
not trivial - it certainly is more complicated than just writing into a
shared file descriptor.


I agree that such an approach would be horrible for very limited 
value. However, I was suggesting that a transaction be counted only when it 
is finished, so the aggregated data is to be understood as referring to 
"finished transactions in the interval", and what is in progress would be 
counted in the next interval anyway.



Merging results for each transaction would not have this issue, but it
would also use the lock much more frequently, and that seems like a
pretty bad idea (especially for the workloads with short transactions
that you suggested are a bad match for the detailed log - this would
make the aggregated log bad too).

Also notice that all the threads will try to merge the data (and
thus acquire the lock) at almost the same time - this is especially true
for very short transactions. I would be surprised if this did not cause
issues on read-only workloads with large numbers of threads.


ISTM that the aggregated version should fare better than the detailed log,
whatever is done: the key performance issue is that fprintf is slow; 
with the aggregated log these calls are infrequent, and only arithmetic 
remains in the critical section.



(2) The feature would not be available for the thread-emulation with
this approach, but I do not see this as a particular issue as I
think that it is pretty much only dead code and a maintenance burden.


I'm not willing to investigate that, nor am I willing to implement
another feature that works only sometimes (I've done that in the past,
and I find it a bad practice).


[...]


After the small discussion I triggered, I've submitted a patch to drop 
thread fork-emulation from pgbench.



[...]
Also, if the lock for the shared buffer is cheaper than the lock
required for fprintf, it may still be an improvement.


Yep. "fprintf" does a lot of processing, so it is the main issue.

--
Fabien.




Re: [HACKERS] Rounding to even for numeric data type

2015-05-02 Thread Fabien COELHO



So, attached is a patch that does 1) and 2) to make clear to the user
how numeric and double precision behave regarding rounding. I am
adding it to CF 2015-06 to keep track of it...


Quick review: patches applies, make check is fine, all is well.

Two minor suggestions:

All the casting tests could be put in "numeric.sql", as they are all 
related to numeric; that would also avoid duplicating the value lists.


For the documentation, I would also add 3.5 so that rounding to even is 
even clearer :-)


--
Fabien.




Re: [HACKERS] feature freeze and beta schedule

2015-05-02 Thread Kohei KaiGai
2015-05-02 1:37 GMT+09:00 Andres Freund :
> * ctidscan as an example of custom-scan
>   Hasn't really gotten sufficient review.
>   => Move
>
I have to agree.

> * Join pushdown support for foreign tables
>   Hasn't gotten particularly much review yet. And it doesn't look
>   entirely ready to me on a very quick scan. But I don't know that much
>   about the area.
>   => Move?
>
I think its overall design has been discussed and people reached
consensus. Even though some documentation / comment work is still
required, it is not a fundamental design change, is it?
I expect the join pushdown of postgres_fdw will reach a committable
state within two weeks.

Thanks,
--
KaiGai Kohei 

