2010/9/3 Hans-Jürgen Schönig <h...@cybertec.at>:
> On Sep 2, 2010, at 1:20 AM, Robert Haas wrote:
>> I agree. Explicit partitioning may open up some additional optimization 
>> possibilities in certain cases, but Merge Append is more general and 
>> extremely valuable in its own right.
> we have revised greg's wonderful work and ported the entire thing to head.
> it solves the problem of merge_append. i did some testing earlier on today 
> and it seems most important cases are working nicely.

First, thanks for merging this up to HEAD.  I took a look through this
patch tonight, and the previous reviews thereof that I was able to
find, most notably Tom's detailed review on 2009-07-26.  I'm not sure
whether or not it's accidental that this didn't get added to the CF,
but here's an attempt to enumerate the things that seem like they need
to be fixed.  The quotes labeled "TGL" are from the aforementioned
review by Tom.

1. The code in set_append_rel_pathlist() that accumulates the pathkeys
of all sub-paths is, as it says, and as previous discussed, O(n^2).
In a previous email on this topic, Tom suggested on possible approach
for this problem: choose the largest child relation and call it the
leader, and consider only the pathkeys for that relation rather than
all of them.  I think it would be nice if we can find a way to be a
bit smarter, though, because that won't necessarily always find the
best path.  One idea I had is to choose some arbitrary limit on how
long the all_pathkeys list is allowed to become and iterate over the
children from largest to smallest, stopping early if you hit that
limit.  But thinking about it a little more, can't we just adjust the
way we do this so that it's not O(n^2)?  It seems we're only concerned
with equality here, so what about using a hash table?  We could hash
the PathKey pointers in each list, but not the lists or listcells

2. TGL: "you need an explicit flag to say 'we'll do a merge', not just
rely on whether the path has pathkeys."  This makes sense and doesn't
seem difficult.

3. TGL: "Speaking of sorting, it's not entirely clear to me how the
patch ensures that all the child plans produce the necessary sort keys
as output columns, and especially not how it ensures that they all get
produced in the *same* output columns.  This might accidentally manage
to work because of the "throwaway" call to make_sort_from_pathkeys(),
but at the very least that's a misleading comment."  I'm not sure what
needs to be done about this; I'm going to look at this further.

4. TGL: "In any case, I'm amazed that it's not failing regression
tests all over the place with those critical tests in
make_sort_from_pathkeys lobotomized by random #ifdef FIXMEs.  Perhaps
we need some more regression tests...".  Obviously, we need to remove
that lobotomy and insert the correct fix for whatever problem it was
trying to solve.  Adding some regression tests seems wise, too.

5. TGL: "In the same vein, the hack to 'short circuit' the append
stuff for a single child node is simply wrong, because it doesn't
allow for column number variances.  Please remove it."  This seems
like straightforward cleanup, and maybe another candidate for a
regression test.  (Actually, I notice that the patch has NO regression
tests at all, which surely can't be right for something of this type,
though admittedly since we didn't have EXPLAIN (COSTS OFF) when this
was first written it might have been difficult to write anything
meaningful at the time.)

6. The dummy call to cost_sort() seems like a crock; what that
function does doesn't seem particularly relevant to the cost of the
merge operation.  Just off the top of my head, it looks like the cost
of the merge step will be roughly O(lg n) * the cost of comparing two
tuples * the total number of tuples from all child paths.  In practice
it might be less, because once some of the paths run out of tuples the
number of comparisons will drop, I think.  But the magnitude of that
effect seems difficult to predict, and may be rather small, so perhaps
we should just ignore it.

7. I think there's some basic code cleanup needed here, also: comment
formatting, variable naming, etc.

Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to