Re: [PERFORM] No hash join across partitioned tables?
On Sat, Oct 16, 2010 at 8:29 AM, Alvaro Herrera alvhe...@commandprompt.comwrote: Excerpts from Samuel Gendler's message of sáb oct 16 02:35:46 -0300 2010: An issue with automatically analyzing the entire hierarchy is 'abstract' table definitions. I've got a set of tables for storing the same data at different granularities of aggregation. Within each granularity, I've got partitions, but because the set of columns is identical for each granularity, I've got an abstract table definition that is inherited by everything. I don't need or want statistics kept on that table because I never query across the abstract table, only the parent table of each aggregation granularity Hmm, I think you'd be better served by using LIKE instead of regular inheritance. Yep. I inherited the architecture, though, and changing it hasn't been a high priority. --sam
Re: [PERFORM] No hash join across partitioned tables?
Excerpts from Samuel Gendler's message of lun oct 18 03:13:01 -0300 2010: On Sat, Oct 16, 2010 at 8:29 AM, Alvaro Herrera alvhe...@commandprompt.comwrote: Excerpts from Samuel Gendler's message of sáb oct 16 02:35:46 -0300 2010: An issue with automatically analyzing the entire hierarchy is 'abstract' table definitions. I've got a set of tables for storing the same data at different granularities of aggregation. Within each granularity, I've got partitions, but because the set of columns is identical for each granularity, I've got an abstract table definition that is inherited by everything. I don't need or want statistics kept on that table because I never query across the abstract table, only the parent table of each aggregation granularity Hmm, I think you'd be better served by using LIKE instead of regular inheritance. Yep. I inherited the architecture, though, and changing it hasn't been a high priority. I understand that; my point is merely that maybe we shouldn't work through many hoops to solve this particular facet of the problem, because it seems to be pilot error. (If you really needed to avoid the extra I/O that would be caused by unnecessary analyzes, you could turn autovac off for the abstract tables). -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] No hash join across partitioned tables?
Excerpts from Samuel Gendler's message of sáb oct 16 02:35:46 -0300 2010: An issue with automatically analyzing the entire hierarchy is 'abstract' table definitions. I've got a set of tables for storing the same data at different granularities of aggregation. Within each granularity, I've got partitions, but because the set of columns is identical for each granularity, I've got an abstract table definition that is inherited by everything. I don't need or want statistics kept on that table because I never query across the abstract table, only the parent table of each aggregation granularity Hmm, I think you'd be better served by using LIKE instead of regular inheritance. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] No hash join across partitioned tables?
Excerpts from Robert Haas's message of mié jun 09 15:47:55 -0400 2010: In going back through emails I had marked as possibly needing another look before 9.0 is released, I came across this issue again. As I understand it, analyze (or analyse) now collects statistics for both the parent individually, and for the parent and its children together. However, as I further understand it, autovacuum won't actually fire off an analyze unless there's enough activity on the parent table considered individually to warrant it. So if you have an empty parent and a bunch of children with data in it, your stats will still stink, unless you analyze by hand. So, is there something we could now do about this, while there's still time before 9.1? I haven't followed this issue very closely, but it seems to me that what we want is that we want an ANALYZE in a child table to be mutated into an analyze of its parent table, if the conditions are right; and that an ANALYZE of a parent removes the child tables from being analyzed on the same run. If we analyze the parent, do we also update the children stats, or is it just that we keep two stats for the parent, one with children and one without, both being updated when the parent is analyzed? If the latter's the case, maybe we should modify ANALYZE a bit more, so that we can analyze the whole hierarchy in one go, and store the lot of stats with a single pass (each child alone, the parent alone, the parent plus children). However it's not real clear how would this work with multiple inheritance levels. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] No hash join across partitioned tables?
Alvaro Herrera alvhe...@commandprompt.com writes: If we analyze the parent, do we also update the children stats, or is it just that we keep two stats for the parent, one with children and one without, both being updated when the parent is analyzed? The latter. The trick here is that we need to fire an analyze on the parent even though only its children may have had any updates. If the latter's the case, maybe we should modify ANALYZE a bit more, so that we can analyze the whole hierarchy in one go, and store the lot of stats with a single pass (each child alone, the parent alone, the parent plus children). However it's not real clear how would this work with multiple inheritance levels. It's also not clear how it works without blowing out memory... regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] No hash join across partitioned tables?
On Fri, Oct 15, 2010 at 10:22 PM, Tom Lane t...@sss.pgh.pa.us wrote: Alvaro Herrera alvhe...@commandprompt.com writes: If we analyze the parent, do we also update the children stats, or is it just that we keep two stats for the parent, one with children and one without, both being updated when the parent is analyzed? The latter. The trick here is that we need to fire an analyze on the parent even though only its children may have had any updates. If the latter's the case, maybe we should modify ANALYZE a bit more, so that we can analyze the whole hierarchy in one go, and store the lot of stats with a single pass (each child alone, the parent alone, the parent plus children). However it's not real clear how would this work with multiple inheritance levels. An issue with automatically analyzing the entire hierarchy is 'abstract' table definitions. I've got a set of tables for storing the same data at different granularities of aggregation. Within each granularity, I've got partitions, but because the set of columns is identical for each granularity, I've got an abstract table definition that is inherited by everything. I don't need or want statistics kept on that table because I never query across the abstract table, only the parent table of each aggregation granularity create table abstract_fact_table ( time timestamp, measure1 bigint, measure2 bigint, measure3 bigint, fk1 bigint, fk2 bigint ); create table minute_scale_fact_table ( } inherits abstract_fact_table; // Then there are several partitions for minute scale data create table hour_scale_fact_table ( ) inherits abstract_fact_table; // then several partitions for hour scale data etc. I do run queries on the minute_scale_fact_table and hour_scale_fact_table but never do so on abstract_fact_table. I could certainly modify my schema such that the abstract table goes away entirely easily enough, but I find this easier for new developers to come in and comprehend, since the similarity between the table definitions is explicit. I'm glad this topic came up, as I was unaware that I need to run analyze on the parent partitions separately - and no data is every inserted directly into the top level of each granularity hierarchy, so it will never fire by itself. If I am using ORM and I've got functionality in a common baseclass in the source code, I'll often implement its mapping in the database via a parent table that the table for any subclass mapping can inherit from. Again, I have no interest in maintaining statistics on the parent table, since I never query against it directly.
Re: [PERFORM] No hash join across partitioned tables?
Tom Lane wrote: Bruce Momjian br...@momjian.us writes: Tom Lane wrote: I would have liked to get to this for 9.0 but I feel it's a bit late now. What do we want to do about the above issue? TODO item. Added to TODO: Have autoanalyze of parent tables occur when child tables are modified * http://archives.postgresql.org/message-id/aanlktinx8lltekwcyeq1rxvz6wmjvknezfxw5tknn...@mail.gmail.com I am surprised there is no documentation update requirement for this. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] No hash join across partitioned tables?
Bruce Momjian br...@momjian.us writes: I am surprised there is no documentation update requirement for this. Somebody put something about it in the docs a few days ago, IIRC. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] No hash join across partitioned tables?
On Fri, Jul 2, 2010 at 4:58 PM, Tom Lane t...@sss.pgh.pa.us wrote: Bruce Momjian br...@momjian.us writes: I am surprised there is no documentation update requirement for this. Somebody put something about it in the docs a few days ago, IIRC. That was me. http://archives.postgresql.org/pgsql-committers/2010-06/msg00144.php -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] No hash join across partitioned tables?
Robert Haas wrote: On Fri, Jul 2, 2010 at 4:58 PM, Tom Lane t...@sss.pgh.pa.us wrote: Bruce Momjian br...@momjian.us writes: I am surprised there is no documentation update requirement for this. Somebody put something about it in the docs a few days ago, IIRC. That was me. http://archives.postgresql.org/pgsql-committers/2010-06/msg00144.php Oh, thanks, I missed that. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] No hash join across partitioned tables?
Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: In going back through emails I had marked as possibly needing another look before 9.0 is released, I came across this issue again. As I understand it, analyze (or analyse) now collects statistics for both the parent individually, and for the parent and its children together. However, as I further understand it, autovacuum won't actually fire off an analyze unless there's enough activity on the parent table considered individually to warrant it. So if you have an empty parent and a bunch of children with data in it, your stats will still stink, unless you analyze by hand. Check. Assuming my understanding of the problem is correct, we could: (a) fix it, (b) document that you should consider periodic manual analyze commands in this situation, or (c) do nothing. Thoughts? The objections to (a) are that it might result in excessive ANALYZE work if not done intelligently, and that we haven't got a patch ready anyway. I would have liked to get to this for 9.0 but I feel it's a bit late now. What do we want to do about the above issue? -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] No hash join across partitioned tables?
Bruce Momjian br...@momjian.us writes: Tom Lane wrote: I would have liked to get to this for 9.0 but I feel it's a bit late now. What do we want to do about the above issue? TODO item. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] No hash join across partitioned tables?
On Tue, Mar 2, 2010 at 12:23 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Thu, Feb 25, 2010 at 7:03 PM, Tom Lane t...@sss.pgh.pa.us wrote: Partially. There are stats now but autovacuum is not bright about when to update them. Is that something you're planning to fix for 9.0? If not, we at least need to document what we intend for people to do about it. I want to look at it, but I'm not sure whether the fix will be small enough that we want to put it in during beta. In going back through emails I had marked as possibly needing another look before 9.0 is released, I came across this issue again. As I understand it, analyze (or analyse) now collects statistics for both the parent individually, and for the parent and its children together. However, as I further understand it, autovacuum won't actually fire off an analyze unless there's enough activity on the parent table considered individually to warrant it. So if you have an empty parent and a bunch of children with data in it, your stats will still stink, unless you analyze by hand. Assuming my understanding of the problem is correct, we could: (a) fix it, (b) document that you should consider periodic manual analyze commands in this situation, or (c) do nothing. Thoughts? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] No hash join across partitioned tables?
Robert Haas robertmh...@gmail.com writes: In going back through emails I had marked as possibly needing another look before 9.0 is released, I came across this issue again. As I understand it, analyze (or analyse) now collects statistics for both the parent individually, and for the parent and its children together. However, as I further understand it, autovacuum won't actually fire off an analyze unless there's enough activity on the parent table considered individually to warrant it. So if you have an empty parent and a bunch of children with data in it, your stats will still stink, unless you analyze by hand. Check. Assuming my understanding of the problem is correct, we could: (a) fix it, (b) document that you should consider periodic manual analyze commands in this situation, or (c) do nothing. Thoughts? The objections to (a) are that it might result in excessive ANALYZE work if not done intelligently, and that we haven't got a patch ready anyway. I would have liked to get to this for 9.0 but I feel it's a bit late now. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] No hash join across partitioned tables?
Robert Haas robertmh...@gmail.com writes: On Thu, Feb 25, 2010 at 7:03 PM, Tom Lane t...@sss.pgh.pa.us wrote: Partially. There are stats now but autovacuum is not bright about when to update them. Is that something you're planning to fix for 9.0? If not, we at least need to document what we intend for people to do about it. I want to look at it, but I'm not sure whether the fix will be small enough that we want to put it in during beta. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] No hash join across partitioned tables?
On Tue, Mar 2, 2010 at 4:23 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Thu, Feb 25, 2010 at 7:03 PM, Tom Lane t...@sss.pgh.pa.us wrote: Partially. There are stats now but autovacuum is not bright about when to update them. Is that something you're planning to fix for 9.0? If not, we at least need to document what we intend for people to do about it. I want to look at it, but I'm not sure whether the fix will be small enough that we want to put it in during beta. I am pretty sure many people will appreciate it, even if it isn't going to be small. Is that stat collection across child tables any useful by it self ? -- GJ
Re: [PERFORM] No hash join across partitioned tables?
Did this get addressed? --- Tom Lane wrote: Kris Jurka bo...@ejurka.com writes: The real problem is getting reasonable stats to pass through the partition Append step, so it can make a reasonable estimate of the join output size. I dug around a bit and concluded that the lack of stats for the Append relation is indeed the main problem. It's not so much the bad join size estimate (although that could hurt for cases where you need to join this result to another table). Rather, it's that the planner is deliberately biased against picking hash joins in the absence of stats for the inner relation. Per the comments for estimate_hash_bucketsize: * If no statistics are available, use a default estimate of 0.1. This will * discourage use of a hash rather strongly if the inner relation is large, * which is what we want. We do not want to hash unless we know that the * inner rel is well-dispersed (or the alternatives seem much worse). While we could back off the default a bit here, I think it'd be better to fix it by not punting on the stats-for-append-relations problem. That doesn't seem like material for 8.4 at this point, though. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com PG East: http://www.enterprisedb.com/community/nav-pg-east-2010.do + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] No hash join across partitioned tables?
Bruce Momjian br...@momjian.us writes: Did this get addressed? Partially. There are stats now but autovacuum is not bright about when to update them. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] No hash join across partitioned tables?
Kris Jurka bo...@ejurka.com writes: So the default disable_cost isn't enough to push it to use the hash join plan and goes back to nestloop. Since disable_cost hasn't been touched since January 2000, perhaps it's time to bump that up to match today's hardware and problem sizes? I think disable_cost was originally set at a time when costs were integers :-(. Yeah, there's probably no reason not to throw another zero or two on it. Is there another issue here besides that one? I think you were hoping that the hash join would be faster than the alternatives, but the cost estimate says it's a lot slower. Is that actually the case? regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] No hash join across partitioned tables?
Kris Jurka bo...@ejurka.com writes: The hash join takes less than twenty seconds, the other two joins I killed after five minutes. I can try to collect explain analyze results later today if you'd like. Please, unless the test case you already posted has similar behavior. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] No hash join across partitioned tables?
Tom Lane wrote: Is there another issue here besides that one? I think you were hoping that the hash join would be faster than the alternatives, but the cost estimate says it's a lot slower. Is that actually the case? The hash join takes less than twenty seconds, the other two joins I killed after five minutes. I can try to collect explain analyze results later today if you'd like. Kris Jurka -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] No hash join across partitioned tables?
Tom Lane wrote: Kris Jurka bo...@ejurka.com writes: The hash join takes less than twenty seconds, the other two joins I killed after five minutes. I can try to collect explain analyze results later today if you'd like. Attached are the explain analyze results. The analyze part hits the hash join worst of all, so I've also included the timings without analyzing. Method Time (ms) Time w/Analyze (ms) nestloop 304853 319060 merge514517 683757 hash 18957 143731 Kris Jurka Aggregate (cost=116546928.75..116546928.76 rows=1 width=0) (actual time=143731.602..143731.604 rows=1 loops=1) - Hash Join (cost=1470.48..108802907.84 rows=3097608361 width=0) (actual time=308.015..143724.055 rows=2437 loops=1) Hash Cond: ((l.vin)::text = (i.vin)::text) - Append (cost=0.00..332098.75 rows=18450775 width=18) (actual time=0.069..99984.899 rows=18449996 loops=1) - Seq Scan on liens l (cost=0.00..14.00 rows=400 width=21) (actual time=0.003..0.003 rows=0 loops=1) - Seq Scan on liens_s1 l (cost=0.00..18633.44 rows=917444 width=18) (actual time=0.060..1828.740 rows=917444 loops=1) - Seq Scan on liens_s2 l (cost=0.00..20.92 rows=1192 width=18) (actual time=0.010..2.274 rows=1192 loops=1) - Seq Scan on liens_s3 l (cost=0.00..53793.79 rows=2934179 width=18) (actual time=0.054..5777.782 rows=2934179 loops=1) - Seq Scan on liens_s4 l (cost=0.00..21069.39 rows=1214139 width=18) (actual time=0.065..2413.429 rows=1214139 loops=1) - Seq Scan on liens_s5 l (cost=0.00..29966.37 rows=1726837 width=18) (actual time=0.046..3394.974 rows=1726837 loops=1) - Seq Scan on liens_s6 l (cost=0.00..10587.18 rows=462918 width=18) (actual time=0.053..936.379 rows=462918 loops=1) - Seq Scan on liens_s7 l (cost=0.00..14.00 rows=400 width=21) (actual time=0.003..0.003 rows=0 loops=1) - Seq Scan on liens_s8 l (cost=0.00..86004.68 rows=4956168 width=18) (actual time=0.045..9729.965 rows=4956182 loops=1) - Seq Scan on liens_s9 l (cost=0.00..320.29 rows=18429 width=18) (actual time=0.010..34.880 rows=18429 loops=1) - Seq Scan on liens_s10 l (cost=0.00..18398.16 rows=951016 width=18) (actual time=0.055..1889.948 rows=951016 loops=1) - Seq Scan on liens_s11 l (cost=0.00..9956.22 rows=543022 width=18) (actual time=0.055..1070.156 rows=543022 loops=1) - Seq Scan on liens_s12 l (cost=0.00..78813.85 rows=4541785 width=18) (actual time=0.012..9431.035 rows=4541792 loops=1) - Seq Scan on liens_s13 l (cost=0.00..4506.46 rows=182846 width=18) (actual time=0.049..374.788 rows=182846 loops=1) - Hash (cost=1050.77..1050.77 rows=33577 width=18) (actual time=256.374..256.374 rows=33297 loops=1) - Append (cost=0.00..1050.77 rows=33577 width=18) (actual time=0.019..188.152 rows=33297 loops=1) - Seq Scan on impounds i (cost=0.00..11.40 rows=140 width=21) (actual time=0.002..0.002 rows=0 loops=1) - Seq Scan on impounds_s1 i (cost=0.00..11.40 rows=140 width=21) (actual time=0.002..0.002 rows=0 loops=1) - Seq Scan on impounds_s2 i (cost=0.00..913.87 rows=29587 width=18) (actual time=0.008..60.728 rows=29587 loops=1) - Seq Scan on impounds_s3 i (cost=0.00..18.14 rows=414 width=18) (actual time=0.009..0.848 rows=414 loops=1) - Seq Scan on impounds_s4 i (cost=0.00..95.96 rows=3296 width=18) (actual time=0.012..6.894 rows=3296 loops=1) Total runtime: 143731.788 ms (26 rows) QUERY PLAN --- Aggregate (cost=57241210.61..57241210.62 rows=1 width=0) (actual time=683467.350..683467.352 rows=1 loops=1) - Merge Join (cost=2940810.41..49497189.70 rows=3097608361 width=0) (actual time=434026.342..683460.545 rows=2437 loops=1) Merge Cond: ((l.vin)::text = (i.vin)::text) - Sort (cost=2937235.46..2983362.40 rows=18450775 width=18) (actual time=433519.957..637389.755 rows=18449961 loops=1) Sort Key: l.vin Sort Method: external merge Disk: 504728kB - Append (cost=0.00..332098.75 rows=18450775 width=18) (actual time=14.764..102905.170 rows=18449996 loops=1) - Seq Scan on liens l (cost=0.00..14.00 rows=400 width=21) (actual time=0.003..0.003 rows=0 loops=1) - Seq Scan on liens_s1 l (cost=0.00..18633.44 rows=917444 width=18) (actual time=14.755..2167.668 rows=917444 loops=1) - Seq Scan on liens_s2 l (cost=0.00..20.92 rows=1192 width=18) (actual time=0.012..2.304
[PERFORM] No hash join across partitioned tables?
PG (8.3.7) doesn't seem to want to do a hash join across two partitioned tables. I have two partition hierarchies: impounds (with different impound sources) and liens (with vehicle liens from different companies). Trying to match those up gives: EXPLAIN SELECT COUNT(*) FROM impounds i JOIN liens l ON (i.vin = l.vin); Aggregate (cost=11164042.66..11164042.67 rows=1 width=0) - Nested Loop (cost=0.27..3420012.94 rows=3097611886 width=0) Join Filter: ((i.vin)::text = (l.vin)::text) - Append (cost=0.00..1072.77 rows=33577 width=21) - Seq Scan on impounds i (cost=0.00..11.40 rows=140 width=21) - Seq Scan on impounds_s1 i (cost=0.00..926.87 rows=29587 width=18) - Seq Scan on impounds_s2 i (cost=0.00..99.96 rows=3296 width=18) - Seq Scan on impounds_s3 i (cost=0.00..23.14 rows=414 width=18) - Seq Scan on impounds_s4 i (cost=0.00..11.40 rows=140 width=21) - Append (cost=0.27..101.64 rows=15 width=21) - Bitmap Heap Scan on liens l (cost=0.27..5.60 rows=2 width=21) Recheck Cond: ((l.vin)::text = (i.vin)::text) - Bitmap Index Scan on liens_pk (cost=0.00..0.27 rows=2 width=0) Index Cond: ((l.vin)::text = (i.vin)::text) - Index Scan using liens_s1_pk on liens_s1 l (cost=0.00..7.02 rows=1 width=18) Index Cond: ((l.vin)::text = (i.vin)::text) - Index Scan using liens_s2_pk on liens_s2 l (cost=0.00..3.47 rows=1 width=21) Index Cond: ((l.vin)::text = (i.vin)::text) - Index Scan using newliens_s3_pk on liens_s3 l (cost=0.00..7.52 rows=1 width=18) Index Cond: ((l.vin)::text = (i.vin)::text) - Index Scan using newliens_s4_pk on liens_s4 l (cost=0.00..7.67 rows=1 width=18) Index Cond: ((l.vin)::text = (i.vin)::text) - Index Scan using newliens_s5_pk on liens_s5 l (cost=0.00..7.62 rows=1 width=18) Index Cond: ((l.vin)::text = (i.vin)::text) - Index Scan using newliens_s6_pk on liens_s6 l (cost=0.00..7.61 rows=1 width=18) Index Cond: ((l.vin)::text = (i.vin)::text) - Index Scan using newliens_s7_pk on liens_s7 l (cost=0.00..7.50 rows=1 width=18) Index Cond: ((l.vin)::text = (i.vin)::text) - Index Scan using newliens_s8_pk on liens_s8 l (cost=0.00..7.36 rows=1 width=18) Index Cond: ((l.vin)::text = (i.vin)::text) - Index Scan using newliens_s9_pk on liens_s9 l (cost=0.00..7.43 rows=1 width=18) Index Cond: ((l.vin)::text = (i.vin)::text) - Index Scan using newliens_s10_pk on liens_s10 l (cost=0.00..7.79 rows=1 width=18) Index Cond: ((l.vin)::text = (i.vin)::text) - Index Scan using newliens_s11_pk on liens_s11 l (cost=0.00..8.07 rows=1 width=18) Index Cond: ((l.vin)::text = (i.vin)::text) - Index Scan using newliens_s12_pk on liens_s12 l (cost=0.00..8.45 rows=1 width=18) Index Cond: ((l.vin)::text = (i.vin)::text) - Index Scan using newliens_s13_pk on liens_s13 l (cost=0.00..8.53 rows=1 width=18) Index Cond: ((l.vin)::text = (i.vin)::text) This takes quite a while as it's got to do tons of index probes which results it tons of random IO. I killed this after five minutes of running. But if I do: CREATE TABLE i1 AS SELECT * FROM impounds; CREATE TABLE l1 AS SELECT * FROM liens; I get a reasonable plan, which runs in about 15 seconds, from: EXPLAIN SELECT COUNT(*) FROM i1 i JOIN l1 l ON (i.vin = l.vin); Aggregate (cost=749054.78..749054.79 rows=1 width=0) - Hash Join (cost=1444.18..748971.43 rows=8 width=0) Hash Cond: ((l.vin)::text = (i.vin)::text) - Seq Scan on l1 l (cost=0.00..332068.96 rows=18449996 width=18) - Hash (cost=1027.97..1027.97 rows=33297 width=18) - Seq Scan on i1 i (cost=0.00..1027.97 rows=33297 width=18) I've tried to force the hash join plan on the partitioned tables via: set enable_nestloop to off; This results in a merge join plan which needs to do a giant sort, again killed after five minutes. Aggregate (cost=58285765.20..58285765.21 rows=1 width=0) - Merge Join (cost=4077389.31..50541735.48 rows=3097611886 width=0) Merge Cond: ((i.vin)::text = (l.vin)::text) - Sort (cost=4286.45..4370.39 rows=33577 width=21) Sort Key: i.vin - Append (cost=0.00..1072.77 rows=33577 width=21) - Seq Scan on impounds i (cost=0.00..11.40 rows=140 width=21) - [Seq Scans on other partitions] - Materialize
Re: [PERFORM] No hash join across partitioned tables?
Kris Jurka bo...@ejurka.com writes: PG (8.3.7) doesn't seem to want to do a hash join across two partitioned tables. Could we see the whole declaration of these tables? (pg_dump -s output would be convenient) regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] No hash join across partitioned tables?
On Thu, 16 Apr 2009, Tom Lane wrote: Kris Jurka bo...@ejurka.com writes: PG (8.3.7) doesn't seem to want to do a hash join across two partitioned tables. Could we see the whole declaration of these tables? (pg_dump -s output would be convenient) The attached table definition with no data wants to mergejoin first, but after disabling mergejoin it does indeed do a hashjoin. Looking back at the cost estimates for the merge and nestloop joins, it seems to be selecting the number of rows in the cartesian product * .005 while the number of output rows in this case is 2437 (cartesian product * 4e-9). Perhaps the cost estimates for the real data are so high because of this bogus row count that the fudge factor to disable mergejoin isn't enough? Kris Jurka CREATE TABLE impounds ( vin character varying(17) NOT NULL, impounddate date NOT NULL, eventtypeid integer NOT NULL, ori character varying(9), platenumber character varying(8), platestate character varying(2), plateyear integer, platetype character varying(2), vehicleyear integer, vehiclemake text, vehiclemodel text, vehiclestyle text, vehiclecolor text, townotes text, damagenotes text, platenotes text, custodynotes text, sourcenotes text, referencenumber1 text, referencenumber2 text, referencenumber3 text, contactname text, contactphone text ); ALTER TABLE ONLY impounds ADD CONSTRAINT impounds_pk PRIMARY KEY (vin, impounddate, eventtypeid); CREATE TABLE impounds_s1 () INHERITS (impounds); CREATE TABLE impounds_s2 () INHERITS (impounds); CREATE TABLE impounds_s3 () INHERITS (impounds); CREATE TABLE impounds_s4 () INHERITS (impounds); ALTER TABLE impounds_s1 ADD CONSTRAINT impounds_s1_pk PRIMARY KEY (vin, impounddate); ALTER TABLE impounds_s2 ADD CONSTRAINT impounds_s2_pk PRIMARY KEY (vin, impounddate); ALTER TABLE impounds_s3 ADD CONSTRAINT impounds_s3_pk PRIMARY KEY (vin, impounddate); ALTER TABLE impounds_s4 ADD CONSTRAINT impounds_s4_pk PRIMARY KEY (vin, impounddate); CREATE TABLE liens ( agentid integer NOT NULL, vin character varying(17) NOT NULL, liendate date, accountnumber character varying(50), customername character varying(50), state character varying(2), vehiclemake character varying(20), vehiclemodel character varying(20), vehicleyear integer ); ALTER TABLE ONLY liens ADD CONSTRAINT liens_pk PRIMARY KEY (vin, agentid); CREATE TABLE liens_s1 () INHERITS (liens); CREATE TABLE liens_s2 () INHERITS (liens); CREATE TABLE liens_s3 () INHERITS (liens); CREATE TABLE liens_s4 () INHERITS (liens); CREATE TABLE liens_s5 () INHERITS (liens); CREATE TABLE liens_s6 () INHERITS (liens); CREATE TABLE liens_s7 () INHERITS (liens); ALTER TABLE liens_s1 ADD CONSTRAINT lines_s1_pk PRIMARY KEY (vin); ALTER TABLE liens_s2 ADD CONSTRAINT lines_s2_pk PRIMARY KEY (vin); ALTER TABLE liens_s3 ADD CONSTRAINT lines_s3_pk PRIMARY KEY (vin); ALTER TABLE liens_s4 ADD CONSTRAINT lines_s4_pk PRIMARY KEY (vin); ALTER TABLE liens_s5 ADD CONSTRAINT lines_s5_pk PRIMARY KEY (vin); ALTER TABLE liens_s6 ADD CONSTRAINT lines_s6_pk PRIMARY KEY (vin); ALTER TABLE liens_s7 ADD CONSTRAINT lines_s7_pk PRIMARY KEY (vin); -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] No hash join across partitioned tables?
On Thu, 16 Apr 2009, Kris Jurka wrote: Perhaps the cost estimates for the real data are so high because of this bogus row count that the fudge factor to disable mergejoin isn't enough? Indeed, I get these cost estimates on 8.4b1 with an increased disable_cost value: nestloop: 11171206.18 merge: 58377401.39 hash: 116763544.76 So the default disable_cost isn't enough to push it to use the hash join plan and goes back to nestloop. Since disable_cost hasn't been touched since January 2000, perhaps it's time to bump that up to match today's hardware and problem sizes? This isn't even a particularly big problem, it's joing 18M rows against 30k. The real problem is getting reasonable stats to pass through the partition Append step, so it can make a reasonable estimate of the join output size. Kris Jurka -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance