Re: [HACKERS] path toward faster partition pruning

Robert Haas Fri, 02 Feb 2018 13:06:06 -0800

On Fri, Feb 2, 2018 at 9:33 AM, Robert Haas <[email protected]> wrote:
>> Updated set of patches attached (patches 0002 onward mostly unchanged,
>> except I incorporated the delta patch posted by David upthread).
>
> Committed 0001.  Thanks.


Some preliminary thoughts...

Regarding 0002, I can't help noticing that this adds a LOT of new code
to partition.c.  With 0002 applied, it climbs into the top 2% of all
".c" files in terms of lines of code.  It seems to me, though, that
maybe it would make sense to instead add all of this code to some new
file .c file, e.g. src/backend/optimizer/util/partprune.c.  I realize
that's a little awkward in this case because we're hoping to use this
code at runtime and not just in the optimizer, but I don't have a
better idea.  Using src/backend/catalog as a dumping-ground for code
that doesn't have a clear-cut place to live is not a superior
alternative, for sure.  And it seems to me that the code you're adding
here is really quite similar to what we've already got in that
directory -- for example, predtest.c, which currently does partition
pruning, lives there; so does clauses.c, whose evaluate_expr facility
this patch wants to use; so does relnode.c, which the other patches
modify; and in general this looks an awful lot like other optimizer
logic that decomposes clauses.  I'm open to other suggestions but I
don't think adding all of this directly into partition.c is a good
plan.

If we do add a new file for this code, the header comment for that
file would be a good place to write an overall explanation of this new
facility.  The individual bits and pieces seem to have good comments
but an overall explanation of what's going on here seems to be
lacking.

It doesn't seem good that get_partitions_from_clauses requires us to
reopen the relation.  I'm going to give my standard review feedback
any time someone injects additional relation_open or heap_open calls
into a patch: please look for a way to piggyback on one of the places
that already has the relation open instead of adding code to re-open
it elsewhere.  Reopening it is not entirely free, and, especially when
NoLock is used, makes it hard to tell whether we're doing the locking
correctly.  Note that we've already got things like
set_relation_partition_info (which copies the bounds) and
set_baserel_partition_key_exprs (which copies, with some partitioning,
the partitioning expressions) and also find_partition_scheme, but
instead of using that existing data from the RelOptInfo, this patch is
digging it directly out of the relcache entry, which doesn't seem
great.

The changes to set_append_rel_pathlist probably make it slower in the
case where rte->relkind != RELKIND_PARTITIONED_TABLE.  We build a
whole new list that we don't really need.  How about just keeping the
(appinfo->parent_relid != parentRTindex) test in the loop and setting
rel_appinfos to either root->append_rel_list or
rel->live_part_appinfos as appropriate?

I understand why COLLATION_MATCH think that a collation OID match is
OK, but why is InvalidOid also OK?  Can you add a comment?  Maybe some
test cases, too?

How fast is this patch these days, compared with the current approach?
 It would be good to test both when nearly all of the partitions are
pruned and when almost none of the partitions are pruned.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [HACKERS] path toward faster partition pruning

Reply via email to