On Mon, Aug 7, 2017 at 2:54 AM, Amit Langote
<langote_amit...@lab.ntt.co.jp> wrote:
> I think Amit Khandekar mentioned this on the UPDATE partition key thread [1].

Yes, similar discussion.

> As long as find_all_inheritors() is a place only to determine the order in
> which partitions will be locked, it's fine.  My concern is about the time
> of actual locking, which in the current planner implementation happens so
> early that we end up needlessly locking all the partitions.

I don't think avoiding that problem is going to be easy.  We need a
bunch of per-relation information, like the size of each relation, and
what indexes it has, and how big they are, and the statistics for each
one.  Someone proposed at one point that every partition should be
required to have the same indexes, but (1) we didn't implement it that
way and (2) even if we had, it wouldn't solve this problem, because
the sizes would still vary.

Note that I'm not saying this isn't a good problem to solve, just that
it's likely to be a very hard problem to solve.

> The locking-partitions-too-soon issue, I think, is an important one and
> ISTM, we'd want to lock the partitions after we've determined the specific
> ones a query needs to scan using the information returned by
> RelationGetPartitionDispatchInfo.  That means the latter had better have locked
> the relations whose cached partition descriptors will be used to determine
> the result that it produces.  One way to do that might be to lock all the
> tables in the list returned by find_all_inheritors that are partitioned
> tables before calling RelationGetPartitionDispatchInfo.  It seems that the
> approach you've outlined below will let us do that.

Yeah, I think so.  I think we could possibly open and lock partitioned
children only, then prune away leaf partitions that we can determine
aren't needed, then open and lock the leaf partitions that are needed.
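
To make the ordering concrete, here is a minimal standalone sketch of that
two-pass idea. It is a toy model, not PostgreSQL code: ToyRel, toy_lock() and
lock_for_key() are hypothetical names, pruning is reduced to a simple range
check on an integer key, and "locking" just sets a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;

/* Toy stand-in for a relation in a range-partitioned hierarchy. */
typedef struct ToyRel
{
    Oid     oid;
    bool    is_partitioned;     /* true for partitioned (non-leaf) tables */
    int     lo;                 /* inclusive lower bound of keys covered */
    int     hi;                 /* exclusive upper bound of keys covered */
    bool    locked;
} ToyRel;

/* Pretend to take a lock; real code would go through the lock manager. */
static void
toy_lock(ToyRel *rel)
{
    rel->locked = true;
}

/*
 * Lock what a query on "key" needs.  rels[] is assumed to already be in the
 * agreed lock order, so both passes acquire locks in that order.
 * Returns the number of relations locked.
 */
static size_t
lock_for_key(ToyRel *rels, size_t nrels, int key)
{
    size_t  nlocked = 0;

    assert(rels != NULL || nrels == 0);

    /* Pass 1: lock partitioned tables only; their bounds drive pruning. */
    for (size_t i = 0; i < nrels; i++)
    {
        if (rels[i].is_partitioned)
        {
            toy_lock(&rels[i]);
            nlocked++;
        }
    }

    /* Pass 2: lock only the leaf partitions that survive pruning. */
    for (size_t i = 0; i < nrels; i++)
    {
        if (!rels[i].is_partitioned &&
            key >= rels[i].lo && key < rels[i].hi)
        {
            toy_lock(&rels[i]);
            nlocked++;
        }
    }

    return nlocked;
}
```

In this model a query that touches one leaf locks every partitioned table
plus that single leaf, and leaves the pruned leaf partitions unlocked.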

> BTW, IIUC, we'll have two lists of OIDs: one in the
> find_all_inheritors order, say, L1 and the other determined by using
> partitioning-specific information for the given query, say L2.
> To lock, we iterate L1 and if a given member is in L2, we lock it.  It
> might be possible to make it as cheap as O(n log n).

Commonly, we'll prune no partitions or all but one; and we should be
able to make those cases very fast.  Other cases can cost a little
more, but I'll certainly complain about anything more than O(n lg n).
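
As a rough illustration of the L1/L2 scheme, the sketch below handles the
common cases first and falls back to sort-plus-binary-search otherwise. It is
standalone C, not a patch: oids_to_lock() and oid_cmp() are hypothetical
names, and it assumes L2 is a duplicate-free subset of L1. The general case
costs about O(n log n) with a typical qsort() implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned int Oid;

/* Three-way comparison of two OIDs, for qsort() and bsearch(). */
static int
oid_cmp(const void *a, const void *b)
{
    Oid     x = *(const Oid *) a;
    Oid     y = *(const Oid *) b;

    return (x > y) - (x < y);
}

/*
 * Copy into out[] the members of l1[] that also appear in l2[], preserving
 * l1's order (the lock order).  Assumes l2 is a duplicate-free subset of l1
 * and that out[] has room for n1 entries.  Returns the number of OIDs
 * written, or (size_t) -1 on allocation failure.
 */
static size_t
oids_to_lock(const Oid *l1, size_t n1, const Oid *l2, size_t n2, Oid *out)
{
    Oid    *sorted;
    size_t  nout = 0;

    assert(n2 <= n1);

    /* Fast paths: everything pruned, nothing pruned, or one survivor. */
    if (n2 == 0)
        return 0;
    if (n2 == n1)
    {
        memcpy(out, l1, n1 * sizeof(Oid));
        return n1;
    }
    if (n2 == 1)
    {
        out[0] = l2[0];
        return 1;
    }

    /* General case: sort a copy of l2, then binary-search each l1 member. */
    sorted = malloc(n2 * sizeof(Oid));
    if (sorted == NULL)
        return (size_t) -1;
    memcpy(sorted, l2, n2 * sizeof(Oid));
    qsort(sorted, n2, sizeof(Oid), oid_cmp);

    for (size_t i = 0; i < n1; i++)
    {
        if (bsearch(&l1[i], sorted, n2, sizeof(Oid), oid_cmp) != NULL)
            out[nout++] = l1[i];
    }

    free(sorted);
    return nout;
}
```

The caller would then lock the OIDs in out[] from first to last, which keeps
the find_all_inheritors order regardless of how L2 was produced.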

>> 3. While we're optimizing, in the first loop inside of
>> RelationGetPartitionDispatchInfo, don't call heap_open().  Instead,
>> use get_rel_relkind() to see whether we've got a partitioned table; if
>> so, open it.  If not, there's no need.
> That's what the proposed refactoring patch 0002 actually does.


> Maybe, we can make the initial patch use syscache to get the relkind for a
> given child.  If the syscache bloat is unbearable, we go with the
> denormalization approach.

Yeah.  Maybe if you write that patch, you can also test it to see how
bad the bloat is.
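
For reference, the relkind check from point 3 has roughly the following
shape. This is a standalone toy, not the 0002 patch: toy_get_relkind() is a
hypothetical stand-in for a catalog lookup such as get_rel_relkind(), the
catalog is just an array, and "opening" a relation is reduced to counting.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int Oid;

#define TOY_RELKIND_RELATION            'r'
#define TOY_RELKIND_PARTITIONED_TABLE   'p'

/* Toy catalog row: just enough to answer "what kind of relation is this?" */
typedef struct ToyCatalogRow
{
    Oid     oid;
    char    relkind;
} ToyCatalogRow;

/* Hypothetical lookup; returns '\0' when the OID is not in the toy catalog. */
static char
toy_get_relkind(const ToyCatalogRow *cat, size_t ncat, Oid relid)
{
    for (size_t i = 0; i < ncat; i++)
    {
        if (cat[i].oid == relid)
            return cat[i].relkind;
    }
    return '\0';
}

/*
 * Walk the children and count how many would actually be opened when only
 * partitioned tables are opened and leaf partitions are skipped.
 */
static size_t
count_opens(const ToyCatalogRow *cat, size_t ncat,
            const Oid *children, size_t nchildren)
{
    size_t  nopened = 0;

    assert(children != NULL || nchildren == 0);

    for (size_t i = 0; i < nchildren; i++)
    {
        /* Cheap check first; open the relation only if it is partitioned. */
        if (toy_get_relkind(cat, ncat, children[i]) ==
            TOY_RELKIND_PARTITIONED_TABLE)
            nopened++;          /* real code would open the relation here */
    }

    return nopened;
}
```

The sketch says nothing about how much cache the lookups consume, which is
the part that would still need testing.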

Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)