Re: [PERFORM] No hash join across partitioned tables?

2010-10-18 Thread Samuel Gendler
On Sat, Oct 16, 2010 at 8:29 AM, Alvaro Herrera
alvhe...@commandprompt.comwrote:

 Excerpts from Samuel Gendler's message of sáb oct 16 02:35:46 -0300 2010:

  An issue with automatically analyzing the entire hierarchy is 'abstract'
  table definitions.  I've got a set of tables for storing the same data at
  different granularities of aggregation.  Within each granularity, I've
 got
  partitions, but because the set of columns is identical for each
  granularity, I've got an abstract table definition that is inherited by
  everything.  I don't need or want statistics kept on that table because I
  never query across the abstract table, only the parent table of each
  aggregation granularity

 Hmm, I think you'd be better served by using LIKE instead of regular
 inheritance.


Yep.  I inherited the architecture, though, and changing it hasn't been a
high priority.

--sam


Re: [PERFORM] No hash join across partitioned tables?

2010-10-18 Thread Alvaro Herrera
Excerpts from Samuel Gendler's message of lun oct 18 03:13:01 -0300 2010:
 On Sat, Oct 16, 2010 at 8:29 AM, Alvaro Herrera
 alvhe...@commandprompt.comwrote:
 
  Excerpts from Samuel Gendler's message of sáb oct 16 02:35:46 -0300 2010:
 
   An issue with automatically analyzing the entire hierarchy is
   'abstract' table definitions.  I've got a set of tables for
   storing the same data at different granularities of aggregation.
   Within each granularity, I've got partitions, but because the set
   of columns is identical for each granularity, I've got an abstract
   table definition that is inherited by everything.  I don't need or
   want statistics kept on that table because I never query across
   the abstract table, only the parent table of each aggregation
   granularity
 
  Hmm, I think you'd be better served by using LIKE instead of regular
  inheritance.

 Yep.  I inherited the architecture, though, and changing it hasn't been a
 high priority.

I understand that; my point is merely that maybe we shouldn't work
through many hoops to solve this particular facet of the problem,
because it seems to be pilot error.  (If you really needed to avoid the
extra I/O that would be caused by unnecessary analyzes, you could turn
autovac off for the abstract tables).

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] No hash join across partitioned tables?

2010-10-16 Thread Alvaro Herrera
Excerpts from Samuel Gendler's message of sáb oct 16 02:35:46 -0300 2010:

 An issue with automatically analyzing the entire hierarchy is 'abstract'
 table definitions.  I've got a set of tables for storing the same data at
 different granularities of aggregation.  Within each granularity, I've got
 partitions, but because the set of columns is identical for each
 granularity, I've got an abstract table definition that is inherited by
 everything.  I don't need or want statistics kept on that table because I
 never query across the abstract table, only the parent table of each
 aggregation granularity

Hmm, I think you'd be better served by using LIKE instead of regular
inheritance.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] No hash join across partitioned tables?

2010-10-15 Thread Alvaro Herrera
Excerpts from Robert Haas's message of mié jun 09 15:47:55 -0400 2010:

 In going back through emails I had marked as possibly needing another
 look before 9.0 is released, I came across this issue again.  As I
 understand it, analyze (or analyse) now collects statistics for both
 the parent individually, and for the parent and its children together.
  However, as I further understand it, autovacuum won't actually fire
 off an analyze unless there's enough activity on the parent table
 considered individually to warrant it.  So if you have an empty parent
 and a bunch of children with data in it, your stats will still stink,
 unless you analyze by hand.

So, is there something we could now do about this, while there's still
time before 9.1?

I haven't followed this issue very closely, but it seems to me that what
we want is that we want an ANALYZE in a child table to be mutated into
an analyze of its parent table, if the conditions are right; and that an
ANALYZE of a parent removes the child tables from being analyzed on the
same run.

If we analyze the parent, do we also update the children stats, or is it
just that we keep two stats for the parent, one with children and one
without, both being updated when the parent is analyzed?

If the latter's the case, maybe we should modify ANALYZE a bit more, so
that we can analyze the whole hierarchy in one go, and store the lot of
stats with a single pass (each child alone, the parent alone, the parent
plus children).  However it's not real clear how would this work with
multiple inheritance levels.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] No hash join across partitioned tables?

2010-10-15 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 If we analyze the parent, do we also update the children stats, or is it
 just that we keep two stats for the parent, one with children and one
 without, both being updated when the parent is analyzed?

The latter.

The trick here is that we need to fire an analyze on the parent even
though only its children may have had any updates.

 If the latter's the case, maybe we should modify ANALYZE a bit more, so
 that we can analyze the whole hierarchy in one go, and store the lot of
 stats with a single pass (each child alone, the parent alone, the parent
 plus children).  However it's not real clear how would this work with
 multiple inheritance levels.

It's also not clear how it works without blowing out memory...

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] No hash join across partitioned tables?

2010-10-15 Thread Samuel Gendler
On Fri, Oct 15, 2010 at 10:22 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Alvaro Herrera alvhe...@commandprompt.com writes:
  If we analyze the parent, do we also update the children stats, or is it
  just that we keep two stats for the parent, one with children and one
  without, both being updated when the parent is analyzed?

 The latter.

 The trick here is that we need to fire an analyze on the parent even
 though only its children may have had any updates.

  If the latter's the case, maybe we should modify ANALYZE a bit more, so
  that we can analyze the whole hierarchy in one go, and store the lot of
  stats with a single pass (each child alone, the parent alone, the parent
  plus children).  However it's not real clear how would this work with
  multiple inheritance levels.


An issue with automatically analyzing the entire hierarchy is 'abstract'
table definitions.  I've got a set of tables for storing the same data at
different granularities of aggregation.  Within each granularity, I've got
partitions, but because the set of columns is identical for each
granularity, I've got an abstract table definition that is inherited by
everything.  I don't need or want statistics kept on that table because I
never query across the abstract table, only the parent table of each
aggregation granularity

create table abstract_fact_table (
time timestamp,
measure1 bigint,
measure2 bigint,
measure3 bigint,
fk1 bigint,
fk2 bigint
);

create table minute_scale_fact_table (
} inherits abstract_fact_table;

// Then there are several partitions for minute scale data

create table hour_scale_fact_table (
) inherits abstract_fact_table;

// then several partitions for hour scale data

etc.  I do run queries on the minute_scale_fact_table and
hour_scale_fact_table but never do so on abstract_fact_table.  I could
certainly modify my schema such that the abstract table goes away entirely
easily enough, but I find this easier for new developers to come in and
comprehend, since the similarity between the table definitions is explicit.

I'm glad this topic came up, as I was unaware that I need to run analyze on
the parent partitions separately - and no data is every inserted directly
into the top level of each granularity hierarchy, so it will never fire by
itself.

If I am using ORM and I've got functionality in a common baseclass in the
source code, I'll often implement its mapping in the database via a parent
table that the table for any subclass mapping can inherit from.  Again, I
have no interest in maintaining statistics on the parent table, since I
never query against it directly.


Re: [PERFORM] No hash join across partitioned tables?

2010-07-02 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Tom Lane wrote:
  I would have liked to get to this for 9.0 but I feel it's a bit late
  now.
 
  What do we want to do about the above issue?
 
 TODO item.

Added to TODO:

Have autoanalyze of parent tables occur when child tables are modified

* 
http://archives.postgresql.org/message-id/aanlktinx8lltekwcyeq1rxvz6wmjvknezfxw5tknn...@mail.gmail.com
 

I am surprised there is no documentation update requirement for this.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] No hash join across partitioned tables?

2010-07-02 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 I am surprised there is no documentation update requirement for this.

Somebody put something about it in the docs a few days ago, IIRC.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] No hash join across partitioned tables?

2010-07-02 Thread Robert Haas
On Fri, Jul 2, 2010 at 4:58 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Bruce Momjian br...@momjian.us writes:
 I am surprised there is no documentation update requirement for this.

 Somebody put something about it in the docs a few days ago, IIRC.

That was me.

http://archives.postgresql.org/pgsql-committers/2010-06/msg00144.php

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] No hash join across partitioned tables?

2010-07-02 Thread Bruce Momjian
Robert Haas wrote:
 On Fri, Jul 2, 2010 at 4:58 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Bruce Momjian br...@momjian.us writes:
  I am surprised there is no documentation update requirement for this.
 
  Somebody put something about it in the docs a few days ago, IIRC.
 
 That was me.
 
 http://archives.postgresql.org/pgsql-committers/2010-06/msg00144.php

Oh, thanks, I missed that.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] No hash join across partitioned tables?

2010-07-01 Thread Bruce Momjian
Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  In going back through emails I had marked as possibly needing another
  look before 9.0 is released, I came across this issue again.  As I
  understand it, analyze (or analyse) now collects statistics for both
  the parent individually, and for the parent and its children together.
   However, as I further understand it, autovacuum won't actually fire
  off an analyze unless there's enough activity on the parent table
  considered individually to warrant it.  So if you have an empty parent
  and a bunch of children with data in it, your stats will still stink,
  unless you analyze by hand.
 
 Check.
 
  Assuming my understanding of the problem is correct, we could:
 
  (a) fix it,
  (b) document that you should consider periodic manual analyze commands
  in this situation, or
  (c) do nothing.
 
  Thoughts?
 
 The objections to (a) are that it might result in excessive ANALYZE work
 if not done intelligently, and that we haven't got a patch ready anyway.
 I would have liked to get to this for 9.0 but I feel it's a bit late
 now.

What do we want to do about the above issue?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] No hash join across partitioned tables?

2010-07-01 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Tom Lane wrote:
 I would have liked to get to this for 9.0 but I feel it's a bit late
 now.

 What do we want to do about the above issue?

TODO item.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] No hash join across partitioned tables?

2010-06-09 Thread Robert Haas
On Tue, Mar 2, 2010 at 12:23 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Thu, Feb 25, 2010 at 7:03 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Partially.  There are stats now but autovacuum is not bright about
 when to update them.

 Is that something you're planning to fix for 9.0?  If not, we at least
 need to document what we intend for people to do about it.

 I want to look at it, but I'm not sure whether the fix will be small
 enough that we want to put it in during beta.

In going back through emails I had marked as possibly needing another
look before 9.0 is released, I came across this issue again.  As I
understand it, analyze (or analyse) now collects statistics for both
the parent individually, and for the parent and its children together.
 However, as I further understand it, autovacuum won't actually fire
off an analyze unless there's enough activity on the parent table
considered individually to warrant it.  So if you have an empty parent
and a bunch of children with data in it, your stats will still stink,
unless you analyze by hand.

Assuming my understanding of the problem is correct, we could:

(a) fix it,
(b) document that you should consider periodic manual analyze commands
in this situation, or
(c) do nothing.

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] No hash join across partitioned tables?

2010-06-09 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 In going back through emails I had marked as possibly needing another
 look before 9.0 is released, I came across this issue again.  As I
 understand it, analyze (or analyse) now collects statistics for both
 the parent individually, and for the parent and its children together.
  However, as I further understand it, autovacuum won't actually fire
 off an analyze unless there's enough activity on the parent table
 considered individually to warrant it.  So if you have an empty parent
 and a bunch of children with data in it, your stats will still stink,
 unless you analyze by hand.

Check.

 Assuming my understanding of the problem is correct, we could:

 (a) fix it,
 (b) document that you should consider periodic manual analyze commands
 in this situation, or
 (c) do nothing.

 Thoughts?

The objections to (a) are that it might result in excessive ANALYZE work
if not done intelligently, and that we haven't got a patch ready anyway.
I would have liked to get to this for 9.0 but I feel it's a bit late
now.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] No hash join across partitioned tables?

2010-03-02 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Feb 25, 2010 at 7:03 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Partially.  There are stats now but autovacuum is not bright about
 when to update them.

 Is that something you're planning to fix for 9.0?  If not, we at least
 need to document what we intend for people to do about it.

I want to look at it, but I'm not sure whether the fix will be small
enough that we want to put it in during beta.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] No hash join across partitioned tables?

2010-03-02 Thread Grzegorz Jaśkiewicz
On Tue, Mar 2, 2010 at 4:23 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Robert Haas robertmh...@gmail.com writes:
  On Thu, Feb 25, 2010 at 7:03 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Partially.  There are stats now but autovacuum is not bright about
  when to update them.

  Is that something you're planning to fix for 9.0?  If not, we at least
  need to document what we intend for people to do about it.

 I want to look at it, but I'm not sure whether the fix will be small
 enough that we want to put it in during beta.

 I am pretty sure many people will appreciate it, even if it isn't going to
be small.

Is that stat collection across child tables any useful by it self ?

-- 
GJ


Re: [PERFORM] No hash join across partitioned tables?

2010-02-25 Thread Bruce Momjian

Did this get addressed?

---

Tom Lane wrote:
 Kris Jurka bo...@ejurka.com writes:
  The real problem is getting reasonable stats to pass through the partition 
  Append step, so it can make a reasonable estimate of the join output size.
 
 I dug around a bit and concluded that the lack of stats for the Append
 relation is indeed the main problem.  It's not so much the bad join size
 estimate (although that could hurt for cases where you need to join this
 result to another table).  Rather, it's that the planner is deliberately
 biased against picking hash joins in the absence of stats for the inner
 relation.  Per the comments for estimate_hash_bucketsize:
 
  * If no statistics are available, use a default estimate of 0.1.  This will
  * discourage use of a hash rather strongly if the inner relation is large,
  * which is what we want.  We do not want to hash unless we know that the
  * inner rel is well-dispersed (or the alternatives seem much worse).
 
 While we could back off the default a bit here, I think it'd be better
 to fix it by not punting on the stats-for-append-relations problem.
 That doesn't seem like material for 8.4 at this point, though.
 
   regards, tom lane
 
 -- 
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com
  PG East:  http://www.enterprisedb.com/community/nav-pg-east-2010.do
  + If your life is a hard drive, Christ can be your backup. +

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] No hash join across partitioned tables?

2010-02-25 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Did this get addressed?

Partially.  There are stats now but autovacuum is not bright about
when to update them.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] No hash join across partitioned tables?

2009-04-17 Thread Tom Lane
Kris Jurka bo...@ejurka.com writes:
 So the default disable_cost isn't enough to push it to use the hash join 
 plan and goes back to nestloop.  Since disable_cost hasn't been touched 
 since January 2000, perhaps it's time to bump that up to match today's 
 hardware and problem sizes?

I think disable_cost was originally set at a time when costs were
integers :-(.  Yeah, there's probably no reason not to throw another
zero or two on it.

Is there another issue here besides that one?  I think you were hoping
that the hash join would be faster than the alternatives, but the cost
estimate says it's a lot slower.  Is that actually the case?

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] No hash join across partitioned tables?

2009-04-17 Thread Tom Lane
Kris Jurka bo...@ejurka.com writes:
 The hash join takes less than twenty seconds, the other two joins I 
 killed after five minutes.  I can try to collect explain analyze results 
 later today if you'd like.

Please, unless the test case you already posted has similar behavior.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] No hash join across partitioned tables?

2009-04-17 Thread Kris Jurka

Tom Lane wrote:


Is there another issue here besides that one?  I think you were hoping
that the hash join would be faster than the alternatives, but the cost
estimate says it's a lot slower.  Is that actually the case?



The hash join takes less than twenty seconds, the other two joins I 
killed after five minutes.  I can try to collect explain analyze results 
later today if you'd like.


Kris Jurka

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] No hash join across partitioned tables?

2009-04-17 Thread Kris Jurka

Tom Lane wrote:

Kris Jurka bo...@ejurka.com writes:
The hash join takes less than twenty seconds, the other two joins I 
killed after five minutes.  I can try to collect explain analyze results 
later today if you'd like.




Attached are the explain analyze results.  The analyze part hits the 
hash join worst of all, so I've also included the timings without analyzing.


Method   Time (ms)  Time w/Analyze (ms)
nestloop 304853 319060
merge514517 683757
hash  18957 143731

Kris Jurka
 Aggregate  (cost=116546928.75..116546928.76 rows=1 width=0) (actual 
time=143731.602..143731.604 rows=1 loops=1)
   -  Hash Join  (cost=1470.48..108802907.84 rows=3097608361 width=0) (actual 
time=308.015..143724.055 rows=2437 loops=1)
 Hash Cond: ((l.vin)::text = (i.vin)::text)
 -  Append  (cost=0.00..332098.75 rows=18450775 width=18) (actual 
time=0.069..99984.899 rows=18449996 loops=1)
   -  Seq Scan on liens l  (cost=0.00..14.00 rows=400 width=21) 
(actual time=0.003..0.003 rows=0 loops=1)
   -  Seq Scan on liens_s1 l  (cost=0.00..18633.44 rows=917444 
width=18) (actual time=0.060..1828.740 rows=917444 loops=1)
   -  Seq Scan on liens_s2 l  (cost=0.00..20.92 rows=1192 
width=18) (actual time=0.010..2.274 rows=1192 loops=1)
   -  Seq Scan on liens_s3 l  (cost=0.00..53793.79 rows=2934179 
width=18) (actual time=0.054..5777.782 rows=2934179 loops=1)
   -  Seq Scan on liens_s4 l  (cost=0.00..21069.39 rows=1214139 
width=18) (actual time=0.065..2413.429 rows=1214139 loops=1)
   -  Seq Scan on liens_s5 l  (cost=0.00..29966.37 rows=1726837 
width=18) (actual time=0.046..3394.974 rows=1726837 loops=1)
   -  Seq Scan on liens_s6 l  (cost=0.00..10587.18 rows=462918 
width=18) (actual time=0.053..936.379 rows=462918 loops=1)
   -  Seq Scan on liens_s7 l  (cost=0.00..14.00 rows=400 width=21) 
(actual time=0.003..0.003 rows=0 loops=1)
   -  Seq Scan on liens_s8 l  (cost=0.00..86004.68 rows=4956168 
width=18) (actual time=0.045..9729.965 rows=4956182 loops=1)
   -  Seq Scan on liens_s9 l  (cost=0.00..320.29 rows=18429 
width=18) (actual time=0.010..34.880 rows=18429 loops=1)
   -  Seq Scan on liens_s10 l  (cost=0.00..18398.16 rows=951016 
width=18) (actual time=0.055..1889.948 rows=951016 loops=1)
   -  Seq Scan on liens_s11 l  (cost=0.00..9956.22 rows=543022 
width=18) (actual time=0.055..1070.156 rows=543022 loops=1)
   -  Seq Scan on liens_s12 l  (cost=0.00..78813.85 rows=4541785 
width=18) (actual time=0.012..9431.035 rows=4541792 loops=1)
   -  Seq Scan on liens_s13 l  (cost=0.00..4506.46 rows=182846 
width=18) (actual time=0.049..374.788 rows=182846 loops=1)
 -  Hash  (cost=1050.77..1050.77 rows=33577 width=18) (actual 
time=256.374..256.374 rows=33297 loops=1)
   -  Append  (cost=0.00..1050.77 rows=33577 width=18) (actual 
time=0.019..188.152 rows=33297 loops=1)
 -  Seq Scan on impounds i  (cost=0.00..11.40 rows=140 
width=21) (actual time=0.002..0.002 rows=0 loops=1)
 -  Seq Scan on impounds_s1 i  (cost=0.00..11.40 rows=140 
width=21) (actual time=0.002..0.002 rows=0 loops=1)
 -  Seq Scan on impounds_s2 i  (cost=0.00..913.87 
rows=29587 width=18) (actual time=0.008..60.728 rows=29587 loops=1)
 -  Seq Scan on impounds_s3 i  (cost=0.00..18.14 rows=414 
width=18) (actual time=0.009..0.848 rows=414 loops=1)
 -  Seq Scan on impounds_s4 i  (cost=0.00..95.96 rows=3296 
width=18) (actual time=0.012..6.894 rows=3296 loops=1)
 Total runtime: 143731.788 ms
(26 rows)

QUERY PLAN
---
 Aggregate  (cost=57241210.61..57241210.62 rows=1 width=0) (actual 
time=683467.350..683467.352 rows=1 loops=1)
   -  Merge Join  (cost=2940810.41..49497189.70 rows=3097608361 width=0) 
(actual time=434026.342..683460.545 rows=2437 loops=1)
 Merge Cond: ((l.vin)::text = (i.vin)::text)
 -  Sort  (cost=2937235.46..2983362.40 rows=18450775 width=18) (actual 
time=433519.957..637389.755 rows=18449961 loops=1)
   Sort Key: l.vin
   Sort Method:  external merge  Disk: 504728kB
   -  Append  (cost=0.00..332098.75 rows=18450775 width=18) 
(actual time=14.764..102905.170 rows=18449996 loops=1)
 -  Seq Scan on liens l  (cost=0.00..14.00 rows=400 
width=21) (actual time=0.003..0.003 rows=0 loops=1)
 -  Seq Scan on liens_s1 l  (cost=0.00..18633.44 
rows=917444 width=18) (actual time=14.755..2167.668 rows=917444 loops=1)
 -  Seq Scan on liens_s2 l  (cost=0.00..20.92 rows=1192 
width=18) (actual time=0.012..2.304 

[PERFORM] No hash join across partitioned tables?

2009-04-16 Thread Kris Jurka


PG (8.3.7) doesn't seem to want to do a hash join across two partitioned 
tables.  I have two partition hierarchies: impounds (with different 
impound sources) and liens (with vehicle liens from different companies). 
Trying to match those up gives:


EXPLAIN SELECT COUNT(*)
FROM impounds i
JOIN liens l ON (i.vin = l.vin);

 Aggregate  (cost=11164042.66..11164042.67 rows=1 width=0)
   -  Nested Loop  (cost=0.27..3420012.94 rows=3097611886 width=0)
 Join Filter: ((i.vin)::text = (l.vin)::text)
 -  Append  (cost=0.00..1072.77 rows=33577 width=21)
   -  Seq Scan on impounds i  (cost=0.00..11.40 rows=140 width=21)
   -  Seq Scan on impounds_s1 i  (cost=0.00..926.87 rows=29587 
width=18)
   -  Seq Scan on impounds_s2 i  (cost=0.00..99.96 rows=3296 
width=18)
   -  Seq Scan on impounds_s3 i  (cost=0.00..23.14 rows=414 
width=18)
   -  Seq Scan on impounds_s4 i  (cost=0.00..11.40 rows=140 
width=21)
 -  Append  (cost=0.27..101.64 rows=15 width=21)
   -  Bitmap Heap Scan on liens l  (cost=0.27..5.60 rows=2 
width=21)
 Recheck Cond: ((l.vin)::text = (i.vin)::text)
 -  Bitmap Index Scan on liens_pk  (cost=0.00..0.27 rows=2 
width=0)
   Index Cond: ((l.vin)::text = (i.vin)::text)
   -  Index Scan using liens_s1_pk on liens_s1 l  (cost=0.00..7.02 
rows=1 width=18)
 Index Cond: ((l.vin)::text = (i.vin)::text)
   -  Index Scan using liens_s2_pk on liens_s2 l  (cost=0.00..3.47 
rows=1 width=21)
 Index Cond: ((l.vin)::text = (i.vin)::text)
   -  Index Scan using newliens_s3_pk on liens_s3 l  
(cost=0.00..7.52 rows=1 width=18)
 Index Cond: ((l.vin)::text = (i.vin)::text)
   -  Index Scan using newliens_s4_pk on liens_s4 l  
(cost=0.00..7.67 rows=1 width=18)
 Index Cond: ((l.vin)::text = (i.vin)::text)
   -  Index Scan using newliens_s5_pk on liens_s5 l  
(cost=0.00..7.62 rows=1 width=18)
 Index Cond: ((l.vin)::text = (i.vin)::text)
   -  Index Scan using newliens_s6_pk on liens_s6 l  
(cost=0.00..7.61 rows=1 width=18)
 Index Cond: ((l.vin)::text = (i.vin)::text)
   -  Index Scan using newliens_s7_pk on liens_s7 l  
(cost=0.00..7.50 rows=1 width=18)
 Index Cond: ((l.vin)::text = (i.vin)::text)
   -  Index Scan using newliens_s8_pk on liens_s8 l  
(cost=0.00..7.36 rows=1 width=18)
 Index Cond: ((l.vin)::text = (i.vin)::text)
   -  Index Scan using newliens_s9_pk on liens_s9 l  
(cost=0.00..7.43 rows=1 width=18)
 Index Cond: ((l.vin)::text = (i.vin)::text)
   -  Index Scan using newliens_s10_pk on liens_s10 l  
(cost=0.00..7.79 rows=1 width=18)
 Index Cond: ((l.vin)::text = (i.vin)::text)
   -  Index Scan using newliens_s11_pk on liens_s11 l  
(cost=0.00..8.07 rows=1 width=18)
 Index Cond: ((l.vin)::text = (i.vin)::text)
   -  Index Scan using newliens_s12_pk on liens_s12 l  
(cost=0.00..8.45 rows=1 width=18)
 Index Cond: ((l.vin)::text = (i.vin)::text)
   -  Index Scan using newliens_s13_pk on liens_s13 l  
(cost=0.00..8.53 rows=1 width=18)
 Index Cond: ((l.vin)::text = (i.vin)::text)


This takes quite a while as it's got to do tons of index probes which 
results it tons of random IO.  I killed this after five minutes of 
running.


But if I do:

CREATE TABLE i1 AS SELECT * FROM impounds;
CREATE TABLE l1 AS SELECT * FROM liens;

I get a reasonable plan, which runs in about 15 seconds, from:

EXPLAIN SELECT COUNT(*)
FROM i1 i
JOIN l1 l ON (i.vin = l.vin);

 Aggregate  (cost=749054.78..749054.79 rows=1 width=0)
   -  Hash Join  (cost=1444.18..748971.43 rows=8 width=0)
 Hash Cond: ((l.vin)::text = (i.vin)::text)
 -  Seq Scan on l1 l  (cost=0.00..332068.96 rows=18449996 
width=18)

 -  Hash  (cost=1027.97..1027.97 rows=33297 width=18)
   -  Seq Scan on i1 i  (cost=0.00..1027.97 rows=33297 
width=18)



I've tried to force the hash join plan on the partitioned tables via:

set enable_nestloop to off;

This results in a merge join plan which needs to do a giant sort, again 
killed after five minutes.


 Aggregate  (cost=58285765.20..58285765.21 rows=1 width=0)
   -  Merge Join  (cost=4077389.31..50541735.48 rows=3097611886 width=0)
 Merge Cond: ((i.vin)::text = (l.vin)::text)
 -  Sort  (cost=4286.45..4370.39 rows=33577 width=21)
   Sort Key: i.vin
   -  Append  (cost=0.00..1072.77 rows=33577 width=21)
 -  Seq Scan on impounds i  (cost=0.00..11.40 rows=140 
width=21)
 -  [Seq Scans on other partitions]
 -  Materialize  

Re: [PERFORM] No hash join across partitioned tables?

2009-04-16 Thread Tom Lane
Kris Jurka bo...@ejurka.com writes:
 PG (8.3.7) doesn't seem to want to do a hash join across two partitioned 
 tables.

Could we see the whole declaration of these tables?  (pg_dump -s output
would be convenient)

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] No hash join across partitioned tables?

2009-04-16 Thread Kris Jurka



On Thu, 16 Apr 2009, Tom Lane wrote:


Kris Jurka bo...@ejurka.com writes:

PG (8.3.7) doesn't seem to want to do a hash join across two partitioned
tables.


Could we see the whole declaration of these tables?  (pg_dump -s output
would be convenient)



The attached table definition with no data wants to mergejoin first, but 
after disabling mergejoin it does indeed do a hashjoin.


Looking back at the cost estimates for the merge and nestloop joins, it 
seems to be selecting the number of rows in the cartesian product * .005 
while the number of output rows in this case is 2437 (cartesian product * 
4e-9).  Perhaps the cost estimates for the real data are so high because 
of this bogus row count that the fudge factor to disable mergejoin isn't 
enough?


Kris Jurka
CREATE TABLE impounds (
vin character varying(17) NOT NULL,
impounddate date NOT NULL,
eventtypeid integer NOT NULL,
ori character varying(9),
platenumber character varying(8),
platestate character varying(2),
plateyear integer,
platetype character varying(2),
vehicleyear integer,
vehiclemake text,
vehiclemodel text,
vehiclestyle text,
vehiclecolor text,
townotes text,
damagenotes text,
platenotes text,
custodynotes text,
sourcenotes text,
referencenumber1 text,
referencenumber2 text,
referencenumber3 text,
contactname text,
contactphone text
);



ALTER TABLE ONLY impounds
ADD CONSTRAINT impounds_pk PRIMARY KEY (vin, impounddate, eventtypeid);

CREATE TABLE impounds_s1 () INHERITS (impounds);
CREATE TABLE impounds_s2 () INHERITS (impounds);
CREATE TABLE impounds_s3 () INHERITS (impounds);
CREATE TABLE impounds_s4 () INHERITS (impounds);

ALTER TABLE impounds_s1 ADD CONSTRAINT impounds_s1_pk PRIMARY KEY (vin, 
impounddate);
ALTER TABLE impounds_s2 ADD CONSTRAINT impounds_s2_pk PRIMARY KEY (vin, 
impounddate);
ALTER TABLE impounds_s3 ADD CONSTRAINT impounds_s3_pk PRIMARY KEY (vin, 
impounddate);
ALTER TABLE impounds_s4 ADD CONSTRAINT impounds_s4_pk PRIMARY KEY (vin, 
impounddate);



CREATE TABLE liens (
agentid integer NOT NULL,
vin character varying(17) NOT NULL,
liendate date,
accountnumber character varying(50),
customername character varying(50),
state character varying(2),
vehiclemake character varying(20),
vehiclemodel character varying(20),
vehicleyear integer
);

ALTER TABLE ONLY liens
ADD CONSTRAINT liens_pk PRIMARY KEY (vin, agentid);

CREATE TABLE liens_s1 () INHERITS (liens);
CREATE TABLE liens_s2 () INHERITS (liens);
CREATE TABLE liens_s3 () INHERITS (liens);
CREATE TABLE liens_s4 () INHERITS (liens);
CREATE TABLE liens_s5 () INHERITS (liens);
CREATE TABLE liens_s6 () INHERITS (liens);
CREATE TABLE liens_s7 () INHERITS (liens);

ALTER TABLE liens_s1 ADD CONSTRAINT lines_s1_pk PRIMARY KEY (vin);
ALTER TABLE liens_s2 ADD CONSTRAINT lines_s2_pk PRIMARY KEY (vin);
ALTER TABLE liens_s3 ADD CONSTRAINT lines_s3_pk PRIMARY KEY (vin);
ALTER TABLE liens_s4 ADD CONSTRAINT lines_s4_pk PRIMARY KEY (vin);
ALTER TABLE liens_s5 ADD CONSTRAINT lines_s5_pk PRIMARY KEY (vin);
ALTER TABLE liens_s6 ADD CONSTRAINT lines_s6_pk PRIMARY KEY (vin);
ALTER TABLE liens_s7 ADD CONSTRAINT lines_s7_pk PRIMARY KEY (vin);


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] No hash join across partitioned tables?

2009-04-16 Thread Kris Jurka



On Thu, 16 Apr 2009, Kris Jurka wrote:

Perhaps the cost estimates for the real data are so high because of this 
bogus row count that the fudge factor to disable mergejoin isn't enough?




Indeed, I get these cost estimates on 8.4b1 with an increased 
disable_cost value:


nestloop:  11171206.18
merge: 58377401.39
hash: 116763544.76

So the default disable_cost isn't enough to push it to use the hash join 
plan and goes back to nestloop.  Since disable_cost hasn't been touched 
since January 2000, perhaps it's time to bump that up to match today's 
hardware and problem sizes?  This isn't even a particularly big problem, 
it's joing 18M rows against 30k.


The real problem is getting reasonable stats to pass through the partition 
Append step, so it can make a reasonable estimate of the join output size.


Kris Jurka


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance