Re: [PATCHES] Ordered Append WIP patch v1

Heikki Linnakangas Mon, 31 Mar 2008 13:15:16 -0700

Gregory Stark wrote:

Here's the WIP patch I described on -hackers to implemented "ordered" append
nodes.

I think it would be better to create completely separate executor nodefor this, rather than tack this feature on the regular Append node. Thetwo don't seem to have much in common. Looking at AppendState, forexample, as_firstplan and as_lastplan are not interesting for theMergeAppend, and likewise MergeAppend needs a whole bunch of fields notinteresting to regular Append. The fact that the first statement inExecAppend is "if(!node->as_is_ordered)", with zero lines of sharedcodes between them, also suggests that it would be a good idea to keepthem separate.

As you point out in the comments, it's quite confusing to have indexesinto the slots array and the heap array. I still can't get my headaround that. Would it help to define a struct along the lines of:


struct {
  TupleTableSlot *slot;
  PlanState *plan;
};

and store pointers to those in the heap?

1) I still haven't completely figured out what to do with equivalence classes.
   My hack of just stuffing all the append subrel vars into there seems to
   work fine. I need to understand what's going on to see if there's really a
   problem with it or not.

I don't understand that well enough to comment... I presume the FIXME inthe patch is related to this?

4) I haven't handled mark/restore or random access. I think they could be
   handled and they'll probably be worth the complexity but I'm not sure.

For Mark/restore, I suppose you would mark/restore all subnodes, andstore/restore the heap. For reversing direction, you would reverse theheap. Doesn't sound too hard, but it's probably better to make it workwithout them at first before adding any bells and whistles...

5) Is it considering too many paths? Are there some which maybe aren't worth
   considering? For example, maybe it only makes sense to take best start_cost
   paths since if that plan doesn't dominate then the best total_cost plan is
   likely to be the sequential scans + unordered append + sort.

I don't understand this paragraph. Are you referring to this comment inthe patch:

/* If we can't find any plans with the right order just add the
 * cheapest total plan to both paths, we'll have to sort it
 * anyways. Perhaps consider not generating a startup path for such
 * pathkeys since it'll be unlikely to be the cheapest startup
 * cost. */

Having to sort one or two of the sub-plans might not be to be expensiveat all. For example, having to sort the empty parent relation of apartitioned table is essentially free. It's probably never cheaper touse the Merge Append if you need to sort *all* of the children, but ifyou can avoid sorting even one of them, it might very well be worthwhile.

6) I haven't looked at setops yet but this is particularly attractive for the
   UNION (as opposed to UNION ALL) case.


Yes. And DISTINCT.

7) I copied/adapted a bunch of bits from tuplesort to maintain the heap and do
   the scankey comparisons. I could refactor that code back into tuplesort but
   it would mean twisting around tuplesort quite a bit. Essentially it would
   mean introducing a new type of tuplesort which would start off in
   FINAL_MERGE state only it would have to behave differently since we don't
   want it prefetching lots of records like FINAL_MERGE does, I don't think.

Doesn't seem worthwhile. The heap management part of the patch is quiteshort.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-patches mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches

Re: [PATCHES] Ordered Append WIP patch v1

Reply via email to