On Sun, Jul 5, 2009 at 10:34 PM, Robert Haas<robertmh...@gmail.com> wrote: > On Jul 5, 2009, at 10:02 AM, Gregory Stark <st...@mit.edu> wrote: >> >> Here's a copy of the merge-append patch that I sent months ago merged up to >> head. I haven't really added any additional functionality since then. > > Can you provide some more details about the objective of this patch? Or a > link to previous discussion?
It's one piece of the puzzle of how to deal with partitioned tables more completely. The basic problem is that currently the planner assumes all Append nodes produce unordered output. That means there's no way for the planner to use any indexes on partitions to produce a desired ordering. That means it's impossible to do a merge join or to satisfy an ORDER BY clause without introducing a sort even if you have matching indexes on every partition. Lots of common cases arise with partitioned tables where this kicks in. Many of them will be eliminated as our partitioning support gets more intelligent and is able to recognize how the partitioning key relates to the requested order or collapse singleton partition scans in cases where the parent table currently interferes. However there will always be cases where those mechanisms to simplify the plan sufficiently and we end up with an Append node of unordered partitions and we need this as a back-stop to avoid awful plans. The merge-append node works by teaching the Append node what sort order it's aiming to produce. The append node then keeps a heap of slots and returns the tuple from the top child plan replacing it in the heap with the next plan. This produces plans like this: QUERY PLAN ------------------------------------------------------------------------------------ Result (cost=0.20..489.00 rows=9600 width=4) -> Append (cost=0.20..489.00 rows=9600 width=4) -> Index Scan using p_pkey on p (cost=0.00..80.25 rows=2400 width=4) -> Index Scan using p1_pkey on p1 p (cost=0.00..80.25 rows=2400 width=4) -> Index Scan using p2_pkey on p2 p (cost=0.00..80.25 rows=2400 width=4) -> Index Scan using p3_pkey on p3 p (cost=0.00..80.25 rows=2400 width=4) (6 rows) Instead of plans like this which is the best we can do today: QUERY PLAN -------------------------------------------------------------------------- Sort (cost=770.98..794.98 rows=9600 width=4) Sort Key: public.p.i -> Result (cost=0.00..136.00 rows=9600 width=4) -> Append (cost=0.00..136.00 rows=9600 width=4) -> Seq Scan on p (cost=0.00..34.00 rows=2400 width=4) -> Seq Scan on p1 p (cost=0.00..34.00 rows=2400 width=4) -> Seq Scan on p2 p (cost=0.00..34.00 rows=2400 width=4) -> Seq Scan on p3 p (cost=0.00..34.00 rows=2400 width=4) (8 rows) -- greg http://mit.edu/~gsstark/resume.pdf -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers