Hi Jesper. Firstly, thanks for looking at the patch.
On 2017/09/26 22:00, Jesper Pedersen wrote: > Hi Amit, > > On 09/15/2017 04:50 AM, Amit Langote wrote: >> On 2017/09/15 11:16, Amit Langote wrote: >>> I will post rebased patches later today, although I think the overall >>> design of the patch on the planner side of things is not quite there yet. >>> Of course, your and others' feedback is greatly welcome. >> >> Rebased patches attached. Because Dilip complained earlier today about >> clauses of the form (const op var) not causing partition-pruning, I've >> added code to commute the clause where it is required. Some other >> previously mentioned limitations remain -- no handling of OR clauses, no >> elimination of redundant clauses for given partitioning column, etc. >> >> A note about 0001: this patch overlaps with >> 0003-Canonical-partition-scheme.patch from the partitionwise-join patch >> series that Ashutosh Bapat posted yesterday . Because I implemented >> the planner-portion of this patch based on what 0001 builds, I'm posting >> it here. It might actually turn out that we will review and commit >> 0003-Canonical-partition-scheme.patch on that thread, but meanwhile apply >> 0001 if you want to play with the later patches. I would certainly like >> to review 0003-Canonical-partition-scheme.patch myself, but won't be able >> to immediately (see below). >> > > Could you share your thoughts on the usage of PartitionAppendInfo's > min_datum_idx / max_datum_idx ? Especially in relation to hash partitions. > > I'm looking at get_partitions_for_keys. Sure. You may have noticed that min_datum_idx and max_datum_idx in PartitionAppendInfo are offsets in the PartitionDescData.boundinfo.datums array, of datums that lie within the query-specified range (that is, using =, >, >=, <, <= btree operators in the query). That array contains bounds of all partitions sorted in the btree operator class defined order, at least for list and range partitioning. I haven't (yet) closely looked at the composition of hash partition datums in PartitionBoundInfo, which perhaps have different ordering properties (or maybe none) than list and range partitioning datums. Now, since they are offsets of datums in PartitionBoundInfo, not indexes of partitions themselves, their utility outside partition.c might be questionable. But partition-wise join patch, for example, to determine if two partitioned tables can be joined partition-wise, is going to check if PartitionBoundInfos in RelOptInfos of two partitioned tables are bound-to-bound equal . Partition-pruning may select only a subset of partitions of each of the joining partitioned tables. Equi-join requirement for partition-wise join means that the subset of partitions will be same on both sides of the join. My intent of having min_datum_idx, max_datum_idx, along with contains_null_partition, and contains_default_partition in PartitionAppendInfo is to have sort of a cross check that those values end up being same on both sides of the join after equi-join requirement has been satisfied. That is, get_partitions_for_keys will have chosen the same set of partitions for both partitioned tables and hence will have set the same values for those fields. As mentioned above, that may be enough for list and range partitioning, but since hash partition datums do not appear to have the same properties as list and range datums , we may need an additional field(s) to describe the hash partition selected by get_partitions_for_keys. I guess only one field will be enough, that will be the offset in the datums array of the hash partition chosen for the query or -1 if query quals couldn't conclusively determine one (maybe not all partition keys were specified to be hashed or some or all used non-equality operator). Hope that answers your question at least to some degree. Now, there are some points Robert mentioned in his reply that I will need to also consider in the patch, which I'll go do now. :) Thanks, Amit  https://www.postgresql.org/message-id/CAFjFpRc4UdCYknBai9pBu2GA1h4nZVNPDmzgs4jOkqFamT1huA%40mail.gmail.com  It appears that in the hash partitioning case, unlike list and range partitioning, PartitionBoundInfo doesn't contain values that are directly comparable to query-specified constants, but a pair (modulus, remainder) for each partition. We'll first hash *all* the key values (mentioned in the query) using the partitioning hash machinery and then determine the hash partition index by using (hash % largest_modulus) as offset into the PartitionBoundInfo.indexes array. -- Sent via pgsql-hackers mailing list (firstname.lastname@example.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers