[PERFORM] queries with lots of UNIONed relations

2011-01-13 Thread Jon Nelson
I was recently asked to look into why one particular set of queries was taking a long time. The queries are all of the same form. They select the UNION of a few columns from around 100 tables. The query in particular was taking some 7-8 minutes to run. On a whim, I changed the query from this

Re: [PERFORM] queries with lots of UNIONed relations

2011-01-13 Thread Tom Lane
Jon Nelson jnelson+pg...@jamponi.net writes: In the former case, the query plan was a bitmap heap scan for each table. Then those results were Appended, Sorted, Uniqued, Sorted again, and then returned. In the latter, before Appending, each table's results were run through HashAggregate.

Re: [PERFORM] queries with lots of UNIONed relations

2011-01-13 Thread Jon Nelson
On Thu, Jan 13, 2011 at 11:13 AM, Tom Lane t...@sss.pgh.pa.us wrote: Jon Nelson jnelson+pg...@jamponi.net writes: In the former case, the query plan was a bitmap heap scan for each table. Then those results were Appended, Sorted, Uniqued, Sorted again, and then returned. In the latter,

Re: [PERFORM] queries with lots of UNIONed relations

2011-01-13 Thread Tom Lane
Jon Nelson jnelson+pg...@jamponi.net writes: Your comment regarding each individual de-duplication looked like it would fit in work_mem doesn't really make sense, exactly. Maybe I'm misunderstanding you. Yeah. What I was suggesting was to NOT add the DISTINCT's, but instead raise work_mem

Re: [PERFORM] queries with lots of UNIONed relations

2011-01-13 Thread Jon Nelson
On Thu, Jan 13, 2011 at 2:05 PM, Tom Lane t...@sss.pgh.pa.us wrote: Jon Nelson jnelson+pg...@jamponi.net writes: Your comment regarding each individual de-duplication looked like it would fit in work_mem doesn't really make sense, exactly. Maybe I'm misunderstanding you. Yeah.  What I was

[PERFORM] queries with lots of UNIONed relations

2011-01-13 Thread Jon Nelson
I was recently asked to look into why one particular set of queries was taking a long time. The queries are all of the same form. They select the UNION of a few columns from around 100 tables. The query in particular was taking some 7-8 minutes to run. On a whim, I changed the query from this

Re: [PERFORM] adding foreign key constraint locks up table

2011-01-13 Thread kakarukeys
On Jan 9, 11:34 am, robertmh...@gmail.com (Robert Haas) wrote: On Wed, Jan 5, 2011 at 2:09 AM, kakarukeys kakaruk...@gmail.com wrote: As requested, here are some output of the investigative queries, run when the problem occurred. I could see some locks there, but I don't know why the alter

Re: [PERFORM] queries with lots of UNIONed relations

2011-01-13 Thread Robert Haas
On Thu, Jan 13, 2011 at 3:12 PM, Jon Nelson jnelson+pg...@jamponi.net wrote: I still think that having UNION do de-duplication of each contributory relation is a beneficial thing to consider -- especially if postgresql thinks the uniqueness is not very high. This might be worth a TODO. --

Re: [PERFORM] queries with lots of UNIONed relations

2011-01-13 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: On Thu, Jan 13, 2011 at 3:12 PM, Jon Nelson jnelson+pg...@jamponi.net wrote: I still think that having UNION do de-duplication of each contributory relation is a beneficial thing to consider -- especially if postgresql thinks the uniqueness is not very

Re: [PERFORM] queries with lots of UNIONed relations

2011-01-13 Thread Robert Haas
On Thu, Jan 13, 2011 at 5:26 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Thu, Jan 13, 2011 at 3:12 PM, Jon Nelson jnelson+pg...@jamponi.net wrote: I still think that having UNION do de-duplication of each contributory relation is a beneficial thing to

Re: [PERFORM] queries with lots of UNIONed relations

2011-01-13 Thread Robert Haas
On Thu, Jan 13, 2011 at 5:41 PM, Robert Haas robertmh...@gmail.com wrote: On Thu, Jan 13, 2011 at 5:26 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Thu, Jan 13, 2011 at 3:12 PM, Jon Nelson jnelson+pg...@jamponi.net wrote: I still think that having UNION

Re: [PERFORM] queries with lots of UNIONed relations

2011-01-13 Thread Andy Colson
On 1/13/2011 4:42 PM, Robert Haas wrote: On Thu, Jan 13, 2011 at 5:41 PM, Robert Haasrobertmh...@gmail.com wrote: On Thu, Jan 13, 2011 at 5:26 PM, Tom Lanet...@sss.pgh.pa.us wrote: Robert Haasrobertmh...@gmail.com writes: On Thu, Jan 13, 2011 at 3:12 PM, Jon Nelsonjnelson+pg...@jamponi.net

Re: [PERFORM] queries with lots of UNIONed relations

2011-01-13 Thread Robert Haas
On Thu, Jan 13, 2011 at 5:47 PM, Andy Colson a...@squeakycode.net wrote: I don't believe there is any case where hashing each individual relation is a win compared to hashing them all together.  If the optimizer were smart enough to be considering the situation as a whole, it would always do

Re: [PERFORM] queries with lots of UNIONed relations

2011-01-13 Thread Andy Colson
On 1/13/2011 4:49 PM, Robert Haas wrote: On Thu, Jan 13, 2011 at 5:47 PM, Andy Colsona...@squeakycode.net wrote: I don't believe there is any case where hashing each individual relation is a win compared to hashing them all together. If the optimizer were smart enough to be considering the

Re: [PERFORM] queries with lots of UNIONed relations

2011-01-13 Thread Jon Nelson
On Thu, Jan 13, 2011 at 4:49 PM, Robert Haas robertmh...@gmail.com wrote: On Thu, Jan 13, 2011 at 5:47 PM, Andy Colson a...@squeakycode.net wrote: I don't believe there is any case where hashing each individual relation is a win compared to hashing them all together.  If the optimizer were

Re: [PERFORM] queries with lots of UNIONed relations

2011-01-13 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: On Thu, Jan 13, 2011 at 5:26 PM, Tom Lane t...@sss.pgh.pa.us wrote: I don't believe there is any case where hashing each individual relation is a win compared to hashing them all together.  If the optimizer were smart enough to be considering the

Re: [PERFORM] queries with lots of UNIONed relations

2011-01-13 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: Yeah, I'm all wet, because you'd still have to re-de-duplicate at the end. But then why did the OP get a speedup? *scratches head* He was reporting that 2 levels of hashing was faster than sort+uniq (with the sorts swapping to disk, no doubt). One

Re: [PERFORM] queries with lots of UNIONed relations

2011-01-13 Thread Jon Nelson
On Thu, Jan 13, 2011 at 5:05 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Thu, Jan 13, 2011 at 5:26 PM, Tom Lane t...@sss.pgh.pa.us wrote: I don't believe there is any case where hashing each individual relation is a win compared to hashing them all

Re: [PERFORM] queries with lots of UNIONed relations

2011-01-13 Thread Tom Lane
Jon Nelson jnelson+pg...@jamponi.net writes: On Thu, Jan 13, 2011 at 5:05 PM, Tom Lane t...@sss.pgh.pa.us wrote: If you have enough memory to de-dup them individually, you surely have enough to de-dup all at once. If everything were available up-front, sure. However, and please correct me if

Re: [PERFORM] queries with lots of UNIONed relations

2011-01-13 Thread Mladen Gogala
On 1/13/2011 5:41 PM, Robert Haas wrote: You might be right, but I'm not sure. Suppose that there are 100 inheritance children, and each has 10,000 distinct values, but none of them are common between the tables. In that situation, de-duplicating each individual table requires a hash table

Re: [PERFORM] The good, old times

2011-01-13 Thread Craig Ringer
On 01/12/2011 10:16 PM, Guillaume Cottenceau wrote: What's your point and in what is it related to that ML? Given the package names, I suspect this is a poorly-expressed complaint about the performance of downloads from the pgdg/psqlrpms site. If that was the original poster's intent, they