On 2017/08/21 13:11, Ashutosh Bapat wrote: > On Sat, Aug 19, 2017 at 1:21 AM, Robert Haas <robertmh...@gmail.com> wrote: >> On Fri, Aug 18, 2017 at 1:17 AM, Ashutosh Bapat >> <ashutosh.ba...@enterprisedb.com> wrote: >>> 0004 patch in partition-wise join patchset has code to expand >>> partition hierarchy. That patch is expanding inheritance hierarchy in >>> depth first manner. Robert commented that instead of depth first >>> manner, it will be better if we expand it in partitioned tables first >>> manner. With the latest changes in your patch-set I don't see the >>> reason for expanding in partitioned tables first order. Can you please >>> elaborate if we still need to expand in partitioned table first >>> manner? May be we should just address the expansion issue in 0004 >>> instead of dividing it in two patches. >> >> Let me see if I can clarify. I think there are three requirements here: >> >> A. Amit wants to be able to prune leaf partitions before opening and >> locking those relations, so that pruning can be done earlier and, >> therefore, more cheaply. > > We could actually prune partitioned tables thus pruning whole > partitioned tree. Do we want to then lock those partitioned tables but > not the leaves in that tree?
I think it would be nice if we keep the current approach of expanding the whole partition tree in expand_inherited_rtentry(), at least to know how many more entries a given partitioned table will add to the query's range table. It would be nice, because that way, we don't have to worry *right away* about modifying the planner to cope with some new behavior whereby range table entries will get added at some later point. Then, as you might already know, if we want to use the partition bound order when expanding the whole partition tree, we will depend on the relcache entries of the partitioned tables in that tree, which will require us to take locks on them. It does sound odd that we may end up locking a child *partitioned* table that is potentially prune-able, but maybe there is some way to relinquish that lock once we find out that it is pruned after all. >> B. Partition-wise join wants to expand the inheritance hierarchy a >> level at a time instead of all at once, ending up with rte->inh = true >> entries for intermediate partitioned tables. > > And create AppendRelInfos which pair children with their partitioned > parent rather than the root. There should be *some* way to preserve the parent-child RT index mapping and to preserve the multi-level hierarchy, a way that doesn't map all the child tables in a partition tree to the root table's RT index. AppendRelInfo is one way of doing that mapping currently, but if we continue to treat it as the only way (for the purpose of mapping), we will be stuck with the way they are created and manipulated. Especially, if we are going to always depend on the fact that root->append_rel_list contains all the required AppendRelInfos, then we will always have to fully expand the inheritance in expand_inherited_rtentry() (by fully I mean, locking and opening all the child tables, instead of just the partitioned tables). In a world where we don't want to open the partition child tables in expand_inherited_rtentry(), we cannot build the corresponding AppendRelInfos there. Note that this is not about completely dispelling AppendRelInfos-for-partition-child-tables, but about doing without them being present in root->append_rel_list. We would still need them to be able to use adjust_appendrel_attrs(), etc., but we can create them at a different time and store them in a place that's not root->append_rel_list; For example, inside the RelOptInfo of the child table. Or perhaps, we can still add them to root->append_rel_list, but will need to be careful about the places that depend on the timing of AppendRelInfos being present there. >> So in the end game I think >> expand_inherited_rtentry looks approximately like this: >> >> 1. Calling find_all_inheritors with a new only-lock-the-partitions >> flag. This should result in locking all partitioned tables in the >> inheritance hierarchy in breadth-first, low-OID-first order. (When >> the only-lock-the-partitions isn't specified, all partitioned tables >> should still be locked before any unpartitioned tables, so that the >> locking order in that case is consistent with what we do here.) > > I am confused. When "only-lock-the-partitions" is true, do we expect > intermediate partitioned tables to be locked? Why then "only" in the > flag? I guess Robert meant to say lock-only-"partitioned"-tables? >> 2. Iterate over the partitioned tables identified in step 1 in the >> order in which they were returned. For each one: >> - Decide which children can be pruned. >> - Lock the unpruned, non-partitioned children in low-OID-first order. >> >> 3. Make another pass over the inheritance hierarchy, starting at the >> root. Traverse the whole hierarchy in breadth-first in *bound* order. >> Add RTEs and AppendRelInfos as we go -- these will have rte->inh = >> true for partitioned tables and rte->inh = false for leaf partitions. > > These two seem to be based on the assumption that we have to lock all > the partitioned tables even if they can be pruned. > > For regular inheritance there is only a single parent, so traversing > the list returned by find_all_inheritors suffices. For partitioned > hierarchy, we need to know the parent of every child, which is not > part of the find_all_inheritors() output list. Even if it returns only > the partitioned children, they themselves may have a parent different > from the root partition. So, we have to discard the output of > find_all_inheritors() for partitioned hierarchy and traverse the > children as per their orders in oids array in PartitionDesc. May be > it's better to separate the guts of expand_inherited_rtentry(), which > create AppendRelInfos, RTEs and rowmarks for the children into a > separate routine. Use that routine in two different functions > expand_inherited_rtentry() and expand_partitioned_rtentry() for > regular inheritance and partitioned inheritance resp. The functions > will use two different traversal methods appropriate for traversing > the children in either case. I just posted a patch  that implements something like this, but implementation details might seem different. It doesn't however implement a solution to the problem that you pose that partitioned child tables that are prune-able are locked. Thanks, Amit  https://www.postgresql.org/message-id/098b9c71-1915-1a2a-8d52-1a7a50ce79e8%40lab.ntt.co.jp -- Sent via pgsql-hackers mailing list (firstname.lastname@example.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers