Re: [PERFORM] Any better plan for this query?..

2009-05-20 Thread Simon Riggs
On Wed, 2009-05-20 at 07:17 -0400, Robert Haas wrote: > On Wed, May 20, 2009 at 4:11 AM, Simon Riggs wrote: > > The Hash node is fully executed before we start pulling rows through the > > Hash Join node. So the Hash Join node will know at execution time > > whether or not it will continue to mai

Re: [PERFORM] Any better plan for this query?..

2009-05-20 Thread Robert Haas
On Wed, May 20, 2009 at 4:11 AM, Simon Riggs wrote: > The Hash node is fully executed before we start pulling rows through the > Hash Join node. So the Hash Join node will know at execution time > whether or not it will continue to maintain sorted order. So we put the > Sort node into the plan, th

Re: [PERFORM] Any better plan for this query?..

2009-05-20 Thread Simon Riggs
On Tue, 2009-05-19 at 23:54 -0400, Robert Haas wrote: > I don't think it's a good idea to write off the idea of implementing > this optimization at some point. I see a lot of queries that join one > fairly large table against a whole bunch of little tables, and then > sorting the results by a co

Re: [PERFORM] Any better plan for this query?..

2009-05-19 Thread Robert Haas
On Tue, May 19, 2009 at 6:49 PM, Tom Lane wrote: > Simon Riggs writes: >> On Tue, 2009-05-19 at 08:58 -0400, Tom Lane wrote: >>> Nonsense.  The planner might think some other plan is cheaper, but >>> it definitely knows how to do this, and has since at least 8.1. > >> Please look at Dimitri's pla

Re: [PERFORM] Any better plan for this query?..

2009-05-19 Thread Tom Lane
Simon Riggs writes: > On Tue, 2009-05-19 at 08:58 -0400, Tom Lane wrote: >> Nonsense. The planner might think some other plan is cheaper, but >> it definitely knows how to do this, and has since at least 8.1. > Please look at Dimitri's plan. If it can remove the pointless sort, why > does it not

Re: [PERFORM] Any better plan for this query?..

2009-05-19 Thread Merlin Moncure
On Tue, May 19, 2009 at 3:15 PM, Dimitri wrote: > On 5/19/09, Merlin Moncure wrote: >> On Tue, May 19, 2009 at 11:53 AM, Dimitri wrote: >>> the query is *once* prepared via PQexec, >>> then it's looping with "execute" via PQexec. >>> Why PQexecPrepared will be better in my case?.. >> >> It can b

Re: [PERFORM] Any better plan for this query?..

2009-05-19 Thread Dimitri
On 5/19/09, Merlin Moncure wrote: > On Tue, May 19, 2009 at 11:53 AM, Dimitri wrote: >> On 5/19/09, Merlin Moncure wrote: >>> On Mon, May 18, 2009 at 6:32 PM, Dimitri wrote: Thanks Dave for correction, but I'm also curious where the time is wasted in this case?.. 0.84ms is d

Re: [PERFORM] Any better plan for this query?..

2009-05-19 Thread Dimitri
On 5/19/09, Scott Carey wrote: > > On 5/19/09 3:46 AM, "Dimitri" wrote: > >> On 5/19/09, Scott Carey wrote: >>> >>> On 5/18/09 3:32 PM, "Dimitri" wrote: >>> On 5/18/09, Scott Carey wrote: > Great data Dimitri!' Thank you! :-) > > I see a few key trends in the po

Re: [PERFORM] Any better plan for this query?..

2009-05-19 Thread Merlin Moncure
On Tue, May 19, 2009 at 11:53 AM, Dimitri wrote: > On 5/19/09, Merlin Moncure wrote: >> On Mon, May 18, 2009 at 6:32 PM, Dimitri wrote: >>> Thanks Dave for correction, but I'm also curious where the time is >>> wasted in this case?.. >>> >>> 0.84ms is displayed by "psql" once the result output i

Re: [PERFORM] Any better plan for this query?..

2009-05-19 Thread Scott Carey
On 5/19/09 3:46 AM, "Dimitri" wrote: > On 5/19/09, Scott Carey wrote: >> >> On 5/18/09 3:32 PM, "Dimitri" wrote: >> >>> On 5/18/09, Scott Carey wrote: Great data Dimitri!' >>> >>> Thank you! :-) >>> I see a few key trends in the poor scalability: The throughput

Re: [PERFORM] Any better plan for this query?..

2009-05-19 Thread Scott Carey
On 5/19/09 5:01 AM, "Matthew Wakeling" wrote: > On Tue, 19 May 2009, Simon Riggs wrote: >>> Speaking of avoiding large sorts, I'd like to push again for partial >>> sorts. This is the situation where an index provides data sorted by >>> column "a", and the query requests data sorted by "a, b". C

Re: [PERFORM] Any better plan for this query?..

2009-05-19 Thread Simon Riggs
On Tue, 2009-05-19 at 13:01 +0100, Matthew Wakeling wrote: > That leads me on to another topic. Consider the query: > > SELECT * FROM table ORDER BY a, b > > where the column "a" is declared UNIQUE and has an index. Does Postgres > eliminate "b" from the ORDER BY, and therefore allow fetching

Re: [PERFORM] Any better plan for this query?..

2009-05-19 Thread Simon Riggs
On Tue, 2009-05-19 at 08:58 -0400, Tom Lane wrote: > Simon Riggs writes: > > Both plans for this query show an IndexScan on a two column-index, with > > an Index Condition of equality on the leading column. The ORDER BY > > specifies a sort by the second index column, so the top-level Sort is > >

Re: [PERFORM] Any better plan for this query?..

2009-05-19 Thread Dimitri
On 5/19/09, Merlin Moncure wrote: > On Mon, May 18, 2009 at 6:32 PM, Dimitri wrote: >> Thanks Dave for correction, but I'm also curious where the time is >> wasted in this case?.. >> >> 0.84ms is displayed by "psql" once the result output is printed, and I >> got similar time within my client (us

Re: [PERFORM] Any better plan for this query?..

2009-05-19 Thread Dimitri
The response time is not progressive, it's simply jumping, it's likely since 16 sessions there is a sort of serialization happening somewhere.. As well on 16 sessions the throughput in TPS is near the same as on 8 (response time is only twice bigger for the moment), but on 32 it's dramatically drop

Re: [PERFORM] Any better plan for this query?..

2009-05-19 Thread Robert Haas
On May 19, 2009, at 7:36 AM, Simon Riggs wrote: On Tue, 2009-05-19 at 12:17 +0100, Matthew Wakeling wrote: Yes, Postgres has been missing the boat on this one for a while. +1 on requesting this feature. That's an optimizer feature. Speaking of avoiding large sorts, I'd like to push agai

Re: [PERFORM] Any better plan for this query?..

2009-05-19 Thread Merlin Moncure
On Mon, May 18, 2009 at 6:32 PM, Dimitri wrote: > Thanks Dave for correction, but I'm also curious where the time is > wasted in this case?.. > > 0.84ms is displayed by "psql" once the result output is printed, and I > got similar time within my client (using libpq) which is not printing > any out

Re: [PERFORM] Any better plan for this query?..

2009-05-19 Thread Simon Riggs
On Tue, 2009-05-19 at 14:00 +0200, Dimitri wrote: > I may confirm the issue with hash join - it's repeating both with > prepared and not prepared statements - it's curious because initially > the response time is lowering near ~1ms (the lowest seen until now) > and then once workload growing to 1

Re: [PERFORM] Any better plan for this query?..

2009-05-19 Thread Tom Lane
Simon Riggs writes: > Both plans for this query show an IndexScan on a two column-index, with > an Index Condition of equality on the leading column. The ORDER BY > specifies a sort by the second index column, so the top-level Sort is > superfluous in this case. > My understanding is that we don'

Re: [PERFORM] Any better plan for this query?..

2009-05-19 Thread Matthew Wakeling
On Tue, 19 May 2009, Simon Riggs wrote: Speaking of avoiding large sorts, I'd like to push again for partial sorts. This is the situation where an index provides data sorted by column "a", and the query requests data sorted by "a, b". Currently, Postgres sorts the entire data set, whereas it need

Re: [PERFORM] Any better plan for this query?..

2009-05-19 Thread Dimitri
I may confirm the issue with hash join - it's repeating both with prepared and not prepared statements - it's curious because initially the response time is lowering near ~1ms (the lowest seen until now) and then once workload growing to 16 sessions it's jumping to 2.5ms, then with 32 sessions it's

Re: [PERFORM] Any better plan for this query?..

2009-05-19 Thread Simon Riggs
On Tue, 2009-05-19 at 12:36 +0100, Simon Riggs wrote: > Partially sorted data takes much less effort to sort (OK, not zero, I > grant) so this seems like a high complexity, lower value feature. I > agree it should be on the TODO, just IMHO at a lower priority than some > other features. Perhaps

Re: [PERFORM] Any better plan for this query?..

2009-05-19 Thread Simon Riggs
On Tue, 2009-05-19 at 12:17 +0100, Matthew Wakeling wrote: > Yes, Postgres has been missing the boat on this one for a while. +1 on > requesting this feature. That's an optimizer feature. > Speaking of avoiding large sorts, I'd like to push again for partial > sorts. This is the situation where

Re: [PERFORM] Any better plan for this query?..

2009-05-19 Thread Matthew Wakeling
On Tue, 19 May 2009, Simon Riggs wrote: Both plans for this query show an IndexScan on a two column-index, with an Index Condition of equality on the leading column. The ORDER BY specifies a sort by the second index column, so the top-level Sort is superfluous in this case. My understanding is t

Re: [PERFORM] Any better plan for this query?..

2009-05-19 Thread Simon Riggs
On Mon, 2009-05-18 at 19:00 -0400, Tom Lane wrote: > Simon Riggs writes: > > In particular, running the tests repeatedly using > > H.REF_OBJECT = '01' > > rather than varying the value seems likely to benefit MySQL. One thing to note in terms of optimisation of this query is that we

Re: [PERFORM] Any better plan for this query?..

2009-05-19 Thread Dimitri
On 5/19/09, Simon Riggs wrote: > > On Tue, 2009-05-19 at 00:33 +0200, Dimitri wrote: >> > >> > In particular, running the tests repeatedly using >> >H.REF_OBJECT = '01' >> > rather than varying the value seems likely to benefit MySQL. The >> >> let me repeat again - the reference is *r

Re: [PERFORM] Any better plan for this query?..

2009-05-19 Thread Dimitri
On 5/19/09, Scott Carey wrote: > > On 5/18/09 3:32 PM, "Dimitri" wrote: > >> On 5/18/09, Scott Carey wrote: >>> Great data Dimitri!' >> >> Thank you! :-) >> >>> >>> I see a few key trends in the poor scalability: >>> >>> The throughput scales roughly with %CPU fairly well. But CPU used >>> does

Re: [PERFORM] Any better plan for this query?..

2009-05-19 Thread Dimitri
No, Tom, the query cache was off. I put it always explicitly off on MySQL as it has scalability issues. Rgds, -Dimitri On 5/19/09, Tom Lane wrote: > Simon Riggs writes: >> In particular, running the tests repeatedly using >> H.REF_OBJECT = '01' >> rather than varying the value see

Re: [PERFORM] Any better plan for this query?..

2009-05-19 Thread Simon Riggs
On Tue, 2009-05-19 at 00:33 +0200, Dimitri wrote: > > > > In particular, running the tests repeatedly using > > H.REF_OBJECT = '01' > > rather than varying the value seems likely to benefit MySQL. The > > let me repeat again - the reference is *random*, > the '01' value I've u

Re: [PERFORM] Any better plan for this query?..

2009-05-18 Thread Scott Carey
On 5/18/09 3:32 PM, "Dimitri" wrote: > On 5/18/09, Scott Carey wrote: >> Great data Dimitri!' > > Thank you! :-) > >> >> I see a few key trends in the poor scalability: >> >> The throughput scales roughly with %CPU fairly well. But CPU used doesn't >> go past ~50% on the 32 core tests. Th

Re: [PERFORM] Any better plan for this query?..

2009-05-18 Thread Tom Lane
Simon Riggs writes: > In particular, running the tests repeatedly using > H.REF_OBJECT = '01' > rather than varying the value seems likely to benefit MySQL. ... mumble ... query cache? regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-pe

Re: [PERFORM] Any better plan for this query?..

2009-05-18 Thread Dimitri
On 5/18/09, Simon Riggs wrote: > > On Mon, 2009-05-18 at 20:00 +0200, Dimitri wrote: > >> >From my point of view it needs first to understand where the time is >> wasted on a single query (even when the statement is prepared it runs >> still slower comparing to MySQL). > > There is still a signifi

Re: [PERFORM] Any better plan for this query?..

2009-05-18 Thread Dimitri
On 5/18/09, Scott Carey wrote: > Great data Dimitri!' Thank you! :-) > > I see a few key trends in the poor scalability: > > The throughput scales roughly with %CPU fairly well. But CPU used doesn't > go past ~50% on the 32 core tests. This indicates lock contention. > You should not look on

Re: [PERFORM] Any better plan for this query?..

2009-05-18 Thread Dimitri
Thanks Dave for correction, but I'm also curious where the time is wasted in this case?.. 0.84ms is displayed by "psql" once the result output is printed, and I got similar time within my client (using libpq) which is not printing any output.. Rgds, -Dimitri On 5/18/09, Dave Dutcher wrote: > >

Re: [PERFORM] Any better plan for this query?..

2009-05-18 Thread Dave Dutcher
What I don't understand is the part where you talking about disabling hash joins: >* result: planner replaced hash join is replaced by merge join >* execution time: 0.84ms ! >* NOTE: curiously planner is expecting to execute this query in 0.29ms - so it's supposed from its logic to be

Re: [PERFORM] Any better plan for this query?..

2009-05-18 Thread Scott Carey
Great data Dimitri!' I see a few key trends in the poor scalability: The throughput scales roughly with %CPU fairly well. But CPU used doesn't go past ~50% on the 32 core tests. This indicates lock contention. Other proof of lock contention are the mutex locks / sec graph which climbs rapidly

Re: [PERFORM] Any better plan for this query?..

2009-05-18 Thread Simon Riggs
On Mon, 2009-05-18 at 20:00 +0200, Dimitri wrote: > >From my point of view it needs first to understand where the time is > wasted on a single query (even when the statement is prepared it runs > still slower comparing to MySQL). There is still a significant number of things to say about these n

Re: [PERFORM] Any better plan for this query?..

2009-05-18 Thread Dimitri
Folks, I've just published a full report including all results here: http://dimitrik.free.fr/db_STRESS_PostgreSQL_837_and_84_May2009.html >From my point of view it needs first to understand where the time is wasted on a single query (even when the statement is prepared it runs still slower compari

Re: [PERFORM] Any better plan for this query?..

2009-05-18 Thread Simon Riggs
On Wed, 2009-05-13 at 23:23 +0200, Dimitri Fontaine wrote: > As I think I need this solution too, I've coded a PG module to > scratch > that itch this morning, and just published it (BSD licenced) on > pgfoundry: >http://preprepare.projects.postgresql.org/README.html >http://cvs.pgfou

Re: [PERFORM] Any better plan for this query?..

2009-05-18 Thread Simon Riggs
On Thu, 2009-05-14 at 20:25 +0200, Dimitri wrote: > # lwlock_wait_8.4.d `pgrep -n postgres` >Lock IdMode Combined Time (ns) > FirstLockMgrLock Exclusive 803700 >BufFreelistLock Exclusive 3001600 > First

Re: [PERFORM] Any better plan for this query?..

2009-05-14 Thread Dimitri
Hi Scott, let me now finish my report and regroup all data together, and then we'll continue discussion as it'll come more in debug/profile phase.. - I'll be not polite from my part to send some tons of attachments to the mail list :-) Rgds, -Dimitri On 5/13/09, Scott Carey wrote: > > On 5/13/0

Re: [PERFORM] Any better plan for this query?..

2009-05-14 Thread Dimitri
It's absolutely great! it'll not help here because a think time is 0. but for any kind of solution with a spooler it's a must to try! Rgds, -Dimitri On 5/13/09, Dimitri Fontaine wrote: > Hi, > > Le 13 mai 09 à 18:42, Scott Carey a écrit : >>> will not help, as each client is *not* disconnecting/

Re: [PERFORM] Any better plan for this query?..

2009-05-14 Thread Dimitri
Folks, sorry, I'm outpassed little bit by the events :-)) I've finished tests with PREPARE/EXECUTE - it's much faster of course, and the max TSP is 15.000 now on 24 cores! - I've done various tests to see where is the limit bottleneck may be present - it's more likely something timer or interrupt

Re: [PERFORM] Any better plan for this query?..

2009-05-14 Thread Simon Riggs
On Tue, 2009-05-12 at 14:28 +0200, Dimitri wrote: > As problem I'm considering a scalability issue on Read-Only workload - > only selects, no disk access, and if on move from 8 to 16 cores we > gain near 100%, on move from 16 to 32 cores it's only 10%... Dimitri, Will you be re-running the Read

Re: [PERFORM] Any better plan for this query?..

2009-05-13 Thread Dimitri Fontaine
Hi, Le 13 mai 09 à 18:42, Scott Carey a écrit : will not help, as each client is *not* disconnecting/reconnecting during the test, as well PG is keeping well even 256 users. And TPS limit is reached already on 64 users, don't think pooler will help here. Actually, it might help a little. Post

Re: [PERFORM] Any better plan for this query?..

2009-05-13 Thread Kevin Grittner
Dimitri wrote: > The idea is good, but *only* pooling will be not enough. I mean if > all what pooler is doing is only keeping no more than N backends > working - it'll be not enough. You never know what exactly your > query will do - if you choose your N value to be sure to not > overload CPU an

Re: [PERFORM] Any better plan for this query?..

2009-05-13 Thread Scott Carey
On 5/13/09 3:22 AM, "Dimitri" wrote: > Hi Scott, > > On 5/12/09, Scott Carey wrote: >> Although nobody wants to support it, he should try the patch that Jignesh K. >> Shah (from Sun) proposed that makes ProcArrayLock lighter-weight. If it >> makes 32 cores much faster, then we have a smoking

Re: [PERFORM] Any better plan for this query?..

2009-05-13 Thread Dimitri
The idea is good, but *only* pooling will be not enough. I mean if all what pooler is doing is only keeping no more than N backends working - it'll be not enough. You never know what exactly your query will do - if you choose your N value to be sure to not overload CPU and then some of your queries

Re: [PERFORM] Any better plan for this query?..

2009-05-13 Thread Kevin Grittner
Glenn Maynard wrote: > I'm sorry, but I'm confused. Everyone keeps talking about > connection pooling, but Dimitri has said repeatedly that each client > makes a single connection and then keeps it open until the end of > the test, not that it makes a single connection per SQL query. > Connecti

Re: [PERFORM] Any better plan for this query?..

2009-05-13 Thread Robert Haas
On Tue, May 12, 2009 at 11:18 AM, Tom Lane wrote: > Matthew Wakeling writes: >> On Tue, 12 May 2009, Simon Riggs wrote: >>> No, we spawn then authenticate. > >> But you still have a single thread doing the accept() and spawn. At some >> point (maybe not now, but in the future) this could become a

Re: [PERFORM] Any better plan for this query?..

2009-05-13 Thread Dimitri
On MySQL there is no changes if I set the number of sessions in the config file to 400 or to 2000 - for 2000 it'll just allocate more memory. After latest fix with default_statistics_target=5, version 8.3.7 is running as fast as 8.4, even 8.4 is little little bit slower. I understand your positio

Re: [PERFORM] Any better plan for this query?..

2009-05-13 Thread Dimitri
I'm also confused, but seems discussion giving also other ideas :-) But yes, each client is connecting to the database server only *once*. To presice how the test is running: - 1 client is started => 1 in total - sleep ... - 1 another client is started => 2 in total - sleep .. - 2 another c

Re: [PERFORM] Any better plan for this query?..

2009-05-13 Thread Dimitri
Hi Scott, On 5/12/09, Scott Carey wrote: > Although nobody wants to support it, he should try the patch that Jignesh K. > Shah (from Sun) proposed that makes ProcArrayLock lighter-weight. If it > makes 32 cores much faster, then we have a smoking gun. > > Although everyone here is talking about

Re: [PERFORM] Any better plan for this query?..

2009-05-13 Thread Dimitri
On 5/12/09, Robert Haas wrote: > On Tue, May 12, 2009 at 1:00 PM, Dimitri wrote: >> On MySQL there is no changes if I set the number of sessions in the >> config file to 400 or to 2000 - for 2000 it'll just allocate more >> memory. > > I don't care whether the setting affects the speed of MySQL.

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Glenn Maynard
I'm sorry, but I'm confused. Everyone keeps talking about connection pooling, but Dimitri has said repeatedly that each client makes a single connection and then keeps it open until the end of the test, not that it makes a single connection per SQL query. Connection startup costs shouldn't be an

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Robert Haas
On Tue, May 12, 2009 at 12:49 PM, Tom Lane wrote: > 1. There is no (portable) way to pass the connection from the postmaster > to another pre-existing process. [Googles.] It's not obvious to me that SCM_RIGHTS is non-portable, and Windows has an API call WSADuplicateSocket() specifically for thi

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Robert Haas
On Tue, May 12, 2009 at 11:22 AM, Dimitri wrote: > Robert, what I'm testing now is 256 users max. The workload is growing > progressively from 1, 2, 4, 8 ... to 256 users. Of course the Max > throughput is reached on the number of users equal to 2 * number of > cores, but what's important for me h

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Stephen Frost
* Aidan Van Dyk (ai...@highrise.ca) wrote: > But, what really does preforking give us? A 2 or 3% improvement? The > forking isn't the expensive part, the per-database setup that happens is > the expensive setup... Obviously that begs the question- why not support pre-fork with specific database

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Aidan Van Dyk
* Joshua D. Drake [090512 19:27]: > Apache solved this problem back when it was still called NSCA HTTPD. Why > aren't we preforking again? Of course, preforking and connection pooling are totally different beast... But, what really does preforking give us? A 2 or 3% improvement? The forking

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Joshua D. Drake
On Tue, 2009-05-12 at 20:34 -0400, Aidan Van Dyk wrote: > * Joshua D. Drake [090512 19:27]: > > > Apache solved this problem back when it was still called NSCA HTTPD. Why > > aren't we preforking again? > > Of course, preforking and connection pooling are totally different > beast... > Yes an

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Greg Stark
On Tue, May 12, 2009 at 5:49 PM, Tom Lane wrote: > See previous discussions.  IIRC, there are two killer points: > > 1. There is no (portable) way to pass the connection from the postmaster > to another pre-existing process. The Apache model is to have all the backends call accept. So incoming co

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Alvaro Herrera
Dimitri Fontaine escribió: > A much better idea to solve this, in my opinion, would be to have > pgbouncer as a postmaster child, integrated into PostgreSQL. It allows > for choosing whether you want session pooling, transaction pooling or > statement pooling, which is a more deterministic w

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Robert Haas
On Tue, May 12, 2009 at 4:24 PM, Simon Riggs wrote: > > On Tue, 2009-05-12 at 15:52 -0400, Robert Haas wrote: >> On Tue, May 12, 2009 at 12:49 PM, Tom Lane wrote: >> > 1. There is no (portable) way to pass the connection from the postmaster >> > to another pre-existing process. >> >> [Googles.]  

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Robert Haas
On Tue, May 12, 2009 at 1:00 PM, Dimitri wrote: > On MySQL there is no changes if I set the number of sessions in the > config file to 400 or to 2000 - for 2000 it'll just allocate more > memory. I don't care whether the setting affects the speed of MySQL. I want to know if it affects the speed

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Joshua D. Drake
On Tue, 2009-05-12 at 21:24 +0100, Simon Riggs wrote: > On Tue, 2009-05-12 at 15:52 -0400, Robert Haas wrote: > > On Tue, May 12, 2009 at 12:49 PM, Tom Lane wrote: > > > 1. There is no (portable) way to pass the connection from the postmaster > > > to another pre-existing process. > > > > [Google

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Scott Carey
Although nobody wants to support it, he should try the patch that Jignesh K. Shah (from Sun) proposed that makes ProcArrayLock lighter-weight. If it makes 32 cores much faster, then we have a smoking gun. Although everyone here is talking about this as an 'unoptimal' solution, the fact is there i

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Dimitri Fontaine
Hi, Le 12 mai 09 à 18:32, Robert Haas a écrit : implement this same logic internally? IOW, when a client disconnects, instead of having the backend exit immediately, have it perform the equivalent of DISCARD ALL and then stick around for a minute or two and, if a new connection request arrives

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Simon Riggs
On Tue, 2009-05-12 at 15:52 -0400, Robert Haas wrote: > On Tue, May 12, 2009 at 12:49 PM, Tom Lane wrote: > > 1. There is no (portable) way to pass the connection from the postmaster > > to another pre-existing process. > > [Googles.] It's not obvious to me that SCM_RIGHTS is non-portable, > an

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Simon Riggs
On Tue, 2009-05-12 at 11:30 -0700, Scott Carey wrote: > the fact is there is no evidence that a connection pooler will fix the > scalability from 16 > 32 cores. There has been much analysis over a number of years of the effects of the ProcArrayLock, specifically the O(N^2) effect of increasing nu

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Alvaro Herrera
Andres Freund escribió: > Naturally it would still be nice to be good in this not optimal workload... I find it hard to justify wasting our scarce development resources into optimizing such a contrived workload. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Tom Lane
Robert Haas writes: > AIUI, whenever the connection pooler switches to serving a new client, > it tells the PG backend to DISCARD ALL. But why couldn't we just > implement this same logic internally? IOW, when a client disconnects, > instead of having the backend exit immediately, have it perfor

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Dimitri
No, they keep connections till the end of the test. Rgds, -Dimitri On 5/12/09, Joshua D. Drake wrote: > On Tue, 2009-05-12 at 17:22 +0200, Dimitri wrote: >> Robert, what I'm testing now is 256 users max. The workload is growing >> progressively from 1, 2, 4, 8 ... to 256 users. Of course the Max

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Joshua D. Drake
On Tue, 2009-05-12 at 17:22 +0200, Dimitri wrote: > Robert, what I'm testing now is 256 users max. The workload is growing > progressively from 1, 2, 4, 8 ... to 256 users. Of course the Max > throughput is reached on the number of users equal to 2 * number of > cores, but what's important for me h

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Dimitri
On 5/12/09, Stefan Kaltenbrunner wrote: > Dimitri wrote: >> Hi Stefan, >> >> sorry, I did not have a time to bring all details into the toolkit - >> but at least I published it instead to tell a "nice story" about :-) > > fair point and appreciated. But it seems important that benchmarking > resul

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Kevin Grittner
Dimitri wrote: > Of course the Max throughput is reached on the number of users equal > to 2 * number of cores I'd expect that when disk I/O is not a significant limiting factor, but I've seen a "sweet spot" of (2 * cores) + (effective spindle count) for loads involving a lot of random I/O.

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Dimitri
Robert, what I'm testing now is 256 users max. The workload is growing progressively from 1, 2, 4, 8 ... to 256 users. Of course the Max throughput is reached on the number of users equal to 2 * number of cores, but what's important for me here - database should continue to keep the workload! - res

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Tom Lane
Matthew Wakeling writes: > On Tue, 12 May 2009, Simon Riggs wrote: >> No, we spawn then authenticate. > But you still have a single thread doing the accept() and spawn. At some > point (maybe not now, but in the future) this could become a bottleneck > given very short-lived connections. More

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Stefan Kaltenbrunner
Matthew Wakeling wrote: On Tue, 12 May 2009, Simon Riggs wrote: won't connect operations be all handled by a single thread - the parent postmaster? No, we spawn then authenticate. But you still have a single thread doing the accept() and spawn. At some point (maybe not now, but in the futur

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Matthew Wakeling
On Tue, 12 May 2009, Simon Riggs wrote: won't connect operations be all handled by a single thread - the parent postmaster? No, we spawn then authenticate. But you still have a single thread doing the accept() and spawn. At some point (maybe not now, but in the future) this could become a bo

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Simon Riggs
On Tue, 2009-05-12 at 16:00 +0100, Matthew Wakeling wrote: > won't connect operations be all handled by a > single thread - the parent postmaster? No, we spawn then authenticate. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-perf

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Matthew Wakeling
On Tue, 12 May 2009, Stefan Kaltenbrunner wrote: But what I get from your answer is that you are basically doing one connect/disconnect per client and the testcase you are talking about has 256 clients? Correct me if I'm wrong, but won't connect operations be all handled by a single thread -

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Robert Haas
On Tue, May 12, 2009 at 8:59 AM, Dimitri wrote: > Wait wait, currently I'm playing the "stress scenario", so there are > only 256 sessions max, but thing time is zero (full stress). Scenario > with 1600 users is to test how database is solid just to keep a huge > amount of users, but doing only on

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Stefan Kaltenbrunner
Dimitri wrote: Hi Stefan, sorry, I did not have a time to bring all details into the toolkit - but at least I published it instead to tell a "nice story" about :-) fair point and appreciated. But it seems important that benchmarking results can be verified by others as well... The client

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Dimitri
Hi Stefan, sorry, I did not have a time to bring all details into the toolkit - but at least I published it instead to tell a "nice story" about :-) The client process is a binary compiled with libpq. Client is interpreting a scenario script and publish via SHM a time spent on each SQL request. I

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Dimitri
Wait wait, currently I'm playing the "stress scenario", so there are only 256 sessions max, but thing time is zero (full stress). Scenario with 1600 users is to test how database is solid just to keep a huge amount of users, but doing only one transaction per second (very low global TPS comparing t

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Stefan Kaltenbrunner
Dimitri wrote: Folks, before you start to think "what a dumb guy doing a dumb thing" :-)) I'll explain you few details: it's for more than 10 years I'm using a db_STRESS kit (http://dimitrik.free.fr/db_STRESS.html) to check databases performance and scalability. Until now I was very happy with r

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Dimitri
For the moment I'm even not considering any scalability issues on the Read+Write workload - it may always be related to the storage box, and storage latency or controller/cache efficiency may play a lot. As problem I'm considering a scalability issue on Read-Only workload - only selects, no disk a

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Simon Riggs
On Tue, 2009-05-12 at 12:19 +0200, Dimitri wrote: > What I'm trying to do now is to understand what exactly is the > problem. You're running with 1600 users, which is above the scalability limit uncovered (by Sun...) during earlier benchmarking. The scalability issues are understood but currentl

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Heikki Linnakangas
Dimitri wrote: What I discovered so far with all your help: - the impact of a planner - the impact of the analyze target - the impact of prepare / execute - scalability limit on 32 cores You've received good advice on how to minimize the impact of the first three points, and using thos

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Simon Riggs
On Tue, 2009-05-12 at 13:16 +0200, Dimitri wrote: > Wow, Simon! :-)) > > yes, I'm working in Sun Benchmark Center :-)) > (I'm not using my Sun email on public lists only to avid a spam) > > and as came here and asking questions it's probably proving my > intentions to show PostgreSQL in its bes

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Dimitri
Wow, Simon! :-)) yes, I'm working in Sun Benchmark Center :-)) (I'm not using my Sun email on public lists only to avid a spam) and as came here and asking questions it's probably proving my intentions to show PostgreSQL in its best light, no?.. - I never liked "not honest" comparisons :-)) Rega

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Simon Riggs
On Tue, 2009-05-12 at 12:19 +0200, Dimitri wrote: > For my big surprise, MySQL was faster! Ours too. ** I bet you $1000 that I can improve the performance of your benchmark results with PostgreSQL. You give me $1000 up-front and if I can't improve your high end numbers I'll give you $2000 back.

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Dimitri
Folks, before you start to think "what a dumb guy doing a dumb thing" :-)) I'll explain you few details: it's for more than 10 years I'm using a db_STRESS kit (http://dimitrik.free.fr/db_STRESS.html) to check databases performance and scalability. Until now I was very happy with results it gave me

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Dimitri Fontaine
Hi, Dimitri writes: >>> So, why I don't use prepare here: let's say I'm testing the worst >>> stress case :-) Imagine you have thousands of such kind of queries - >>> you cannot prepare all of them! :-) >> >> Thousands? Surely there'll be a dozen or three of most common queries, >> to which yo

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Andres Freund
Hi, On 05/12/2009 12:46 AM, Dimitri wrote: So, why I don't use prepare here: let's say I'm testing the worst stress case :-) Imagine you have thousands of such kind of queries - you cannot prepare all of them! :-) or you'll maybe prepare it once, but as I showed previously in this thread prepa

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Dimitri
Nice to know. But again, if this cache is kept only on the client side it'll be always lost on disconnect. And if clients are "short-lived" it'll not help. BTW, is there an option to say "do execution plan as simple as possible"? If you're sure about your data and your indexes - don't need to spen

Re: [PERFORM] Any better plan for this query?..

2009-05-12 Thread Dimitri
>> So, why I don't use prepare here: let's say I'm testing the worst >> stress case :-) Imagine you have thousands of such kind of queries - >> you cannot prepare all of them! :-) > > Thousands? Surely there'll be a dozen or three of most common queries, > to which you pass different parameters.

Re: [PERFORM] Any better plan for this query?..

2009-05-11 Thread Heikki Linnakangas
Dimitri wrote: Now, as you see from your explanation, the Part #2 is the most dominant - so why instead to blame this query not to implement a QUERY PLANNER CACHE??? - in way if any *similar* query is recognized by parser we simply *reuse* the same plan?.. At least in JDBC, there's several open

Re: [PERFORM] Any better plan for this query?..

2009-05-11 Thread Alvaro Herrera
Dimitri escribió: > Hi Aidan, > > thanks a lot for this detailed summary! > > So, why I don't use prepare here: let's say I'm testing the worst > stress case :-) Imagine you have thousands of such kind of queries - > you cannot prepare all of them! :-) Thousands? Surely there'll be a dozen or

Re: [PERFORM] Any better plan for this query?..

2009-05-11 Thread Dimitri
Hi Aidan, thanks a lot for this detailed summary! So, why I don't use prepare here: let's say I'm testing the worst stress case :-) Imagine you have thousands of such kind of queries - you cannot prepare all of them! :-) or you'll maybe prepare it once, but as I showed previously in this thread

  1   2   >