Re: [PERFORM] Hash Anti Join performance degradation

2011-05-31 Thread Cédric Villemain
2011/5/31 Robert Haas : > On Thu, May 26, 2011 at 8:33 AM, panam wrote: >> Any third party confirmation? > > Yeah, it definitely looks like there is some kind of bug here.  Or if > not a bug, then a very surprising feature.  EXPLAIN ANALYZE outputs > from your proposed test attached.  Here's a uni

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-31 Thread Robert Haas
On Thu, May 26, 2011 at 8:33 AM, panam wrote: > Any third party confirmation? Yeah, it definitely looks like there is some kind of bug here. Or if not a bug, then a very surprising feature. EXPLAIN ANALYZE outputs from your proposed test attached. Here's a unified diff of the two outputs:

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-26 Thread Craig Ringer
On 05/27/2011 02:13 AM, Cédric Villemain wrote: I am not an hibernate expert, but I'll surprised if you can not drive hibernate to do what you want. If nothing else, you can do a native query in hand-written SQL through Hibernate. ORMs are useful tools for some jobs, but it's good to be able

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-26 Thread Cédric Villemain
2011/5/26 panam : > Hi all, > > > Cédric Villemain-3 wrote: >> >> without explaining further why the antijoin has bad performance >> without cluster, I wonder why you don't use this query : >> >> SELECT  b.id, >>                   max(m.id) >> FROM box b, message m >> WHERE m.box_id = b.id >> GROUP

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-26 Thread panam
Sorry, SELECT MAX(e.id) FROM event_message e WHERE e.box_id = id as posted previously should actually read SELECT max(m1.id) FROM message m1 WHERE m1.box_id = b.id) so I tried this already. Regards, panam -- View this message in context: http://postgresql.1045698.n5.nabble.com/Hash-Anti-J

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-26 Thread Kevin Grittner
panam wrote: > I cannot use it because of the way that query is generated > (by hibernate). > > The (simplyfied) base query is just > > SELECT b.id from box > > the subquery > > (SELECT m1.id FROM message m1 >LEFT JOIN message m2 > ON (m1.box_id = m2.box_id AND m1.id < m2.id )

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-26 Thread panam
Hi all, Cédric Villemain-3 wrote: > > without explaining further why the antijoin has bad performance > without cluster, I wonder why you don't use this query : > > SELECT b.id, > max(m.id) > FROM box b, message m > WHERE m.box_id = b.id > GROUP BY b.id; > > looks similar an

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-26 Thread Kevin Grittner
Cédric Villemain wrote: > 2011/5/26 panam : >> "max_connections";"100" >> "work_mem";"1GB" Each connection can allocate work_mem, potentially several times. On a machines without hundreds of GB of RAM, that pair of settings could cause severe swapping. >> "Patholgical" query: >> >> select

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-26 Thread Cédric Villemain
2011/5/26 panam : > Hi there, > > > Kevin Grittner wrote: >> >>> Is there a way to determine the values actually used? >> The pg_settings view.  Try the query shown here: >> http://wiki.postgresql.org/wiki/Server_Configuration >> > Thanks Kevin, very usful. Here is the output: > > "version";"Postgr

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-26 Thread panam
Hi there, Kevin Grittner wrote: > >> Is there a way to determine the values actually used? > The pg_settings view. Try the query shown here: > http://wiki.postgresql.org/wiki/Server_Configuration > Thanks Kevin, very usful. Here is the output: "version";"PostgreSQL 9.0.4, compiled by Visual C

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-25 Thread Craig Ringer
On 05/26/2011 12:42 AM, panam wrote: So, would you like to further investigate my previous issue (I think it is still strange that performance suddenly dropped that dramatically)? It's a bit beyond me, but I suspect that it'd be best if you could hang onto the dump file in case someone has th

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-25 Thread Kevin Grittner
panam wrote: > Is there a way to determine the values actually used? The pg_settings view. Try the query shown here: http://wiki.postgresql.org/wiki/Server_Configuration -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscrip

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-25 Thread panam
Hi all, @Tom, > BTW, this query doesn't actually match the EXPLAIN outputs... You're right, it is actually just the "heavy" subquery of a larger query which can be found here: http://pastebin.com/fuGrt0tB > One other thing I'm not following is how come it's using hash temp files > at all, when y

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-24 Thread Craig Ringer
On 24/05/11 22:34, panam wrote: >> The usual cause is that the statistics for estimated row counts cross a >> threshold that makes the query planner think that a different kind of >> plan will be faster. > > Hm, as far as i understand the plans, they are equivalent, aren't they? Yes, they are,

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-24 Thread Tom Lane
panam writes: > In my DB, there is a large table containing messages and one that contains > message boxes. > Messages are assigned to boxes via a child parent link m->b. > In order to obtain the last message for a specific box, I use the following > SQL: > SELECT m1.id FROM message m1 LEFT JOIN

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-24 Thread panam
Hi Craig and Cédric, Thanks for the very informative introduction to the netiquette here and thanks for sharing your time. I wasn't aware of http://explain.depesz.com/, very useful. So, here are the query plans: http://explain.depesz.com/s/6AU (1st from previous post, good) http://explain.depesz.c

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-24 Thread Cédric Villemain
2011/5/24 panam : > Hi, > > In my DB, there is a large table containing messages and one that contains > message boxes. > Messages are assigned to boxes via a child parent link m->b. > In order to obtain the last message for a specific box, I use the following > SQL: > > SELECT m1.id FROM message m

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-23 Thread Craig Ringer
On 24/05/11 12:14, panam wrote: > Hi, > > In my DB, there is a large table containing messages and one that contains > message boxes. > Messages are assigned to boxes via a child parent link m->b. > In order to obtain the last message for a specific box, I use the following > SQL: > > SELECT m1.i

[PERFORM] Hash Anti Join performance degradation

2011-05-23 Thread panam
Hi, In my DB, there is a large table containing messages and one that contains message boxes. Messages are assigned to boxes via a child parent link m->b. In order to obtain the last message for a specific box, I use the following SQL: SELECT m1.id FROM message m1 LEFT JOIN message m2 ON (m1.box_