Re: [HACKERS] advanced partition matching algorithm for partition-wise join
On Thu, Oct 12, 2017 at 9:46 PM, Robert Haas wrote: > On Wed, Oct 11, 2017 at 7:08 AM, Ashutosh Bapat > wrote: >> Here's updated patch set based on the basic partition-wise join >> committed. The patchset applies on top of the patch to optimize the >> case of dummy partitioned tables [1]. >> >> Right now, the advanced partition matching algorithm bails out when >> either of the joining relations has a default partition. > > So is that something you are going to fix? > Yes, if time permits. I had left the patch unattended while basic partition-wise join was getting committed. Now that it's committed, I rebased it. It still has TODOs and some work is required to improve it. But for the patch to be really complete, we have to deal with the problem of missing partitions described before. I am fine collaborating if someone else wants to pick it up. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] advanced partition matching algorithm for partition-wise join
On Wed, Oct 11, 2017 at 7:08 AM, Ashutosh Bapat wrote: > Here's updated patch set based on the basic partition-wise join > committed. The patchset applies on top of the patch to optimize the > case of dummy partitioned tables [1]. > > Right now, the advanced partition matching algorithm bails out when > either of the joining relations has a default partition. So is that something you are going to fix? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] advanced partition matching algorithm for partition-wise join
On Thu, Sep 7, 2017 at 7:34 PM, Antonin Houska wrote: > Ashutosh Bapat wrote: > >> I have fixed the issues which were marked as TODOs in the attached >> patches. Also, I have included your test change patch in my series of >> patches. > > I've noticed that partition_bounds_merge() is called twice from > make_join_rel(): > > * build_join_rel -> build_joinrel_partition_info -> partition_bounds_merge > > * try_partition_wise_join -> partition_bounds_merge > > Is this intentional, or just a thinko? > This is expected. partition_bounds_merge() also returns the pairs of matching partitions. So, we have to call that function for every pair of joining relations. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] advanced partition matching algorithm for partition-wise join
Ashutosh Bapat wrote: > I have fixed the issues which were marked as TODOs in the attached > patches. Also, I have included your test change patch in my series of > patches. I've noticed that partition_bounds_merge() is called twice from make_join_rel(): * build_join_rel -> build_joinrel_partition_info -> partition_bounds_merge * try_partition_wise_join -> partition_bounds_merge Is this intentional, or just a thinko? -- Antonin Houska Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de, http://www.cybertec.at -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] advanced partition matching algorithm for partition-wise join
On Tue, Sep 5, 2017 at 4:34 PM, Ashutosh Bapat < ashutosh.ba...@enterprisedb.com> wrote: > I have fixed the issues which were marked as TODOs in the attached > patches. Also, I have included your test change patch in my series of > patches. Are there any other issues you have commented out? > > Thanks Ashutosh, All commented issue got fixed. I am working on some combinations of N-way joins to test partition matching, will send those as well once done.
Re: [HACKERS] advanced partition matching algorithm for partition-wise join
PFA the patches rebased on the latest sources. There are also fixes for some of the crashes and bugs reported. I haven't yet included the testcase patch in the main patchset. On Mon, Aug 28, 2017 at 12:44 PM, Rajkumar Raghuwanshi wrote: > On Mon, Aug 21, 2017 at 12:43 PM, Ashutosh Bapat > wrote: >> >> TODOs >> --- >> 1. Add tests for advanced partition matching algorithm > > > Hi Ashutosh, > > I have applied all partition-wise-join patches (v26) and tested feature. I > have modified partition_join.sql file and added extra test cases to test > partition matching. > > Attaching WIP test case patch which as of now have some server crashes and a > data corruptions issue which is commented in the file itself and need to be > removed once issue got solved. Also some of queries is not picking or > picking partition-wise-join as per expectation which may need some > adjustment. > > > > > > -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company From 865242c79b56f021dc619bc028480097d11bb69a Mon Sep 17 00:00:00 2001 From: Ashutosh Bapat Date: Thu, 6 Jul 2017 14:15:22 +0530 Subject: [PATCH 11/12] Modify bound comparision functions to accept members of PartitionKey Functions partition_bound_cmp(), partition_rbound_cmp() and partition_rbound_datum_cmp() are required to merge partition bounds from joining relations. While doing so, we do not have access to the PartitionKey of either relations. So, modify these functions to accept only required members of PartitionKey so that the functions can be reused for merging bounds. Ashutosh Bapat. --- src/backend/catalog/partition.c | 76 ++- 1 file changed, 44 insertions(+), 32 deletions(-) diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c index 96a64ce..d42e1b5 100644 --- a/src/backend/catalog/partition.c +++ b/src/backend/catalog/partition.c @@ -126,15 +126,17 @@ static List *generate_partition_qual(Relation rel); static PartitionRangeBound *make_one_range_bound(PartitionKey key, int index, List *datums, bool lower); -static int32 partition_rbound_cmp(PartitionKey key, - Datum *datums1, PartitionRangeDatumKind *kind1, - bool lower1, PartitionRangeBound *b2); -static int32 partition_rbound_datum_cmp(PartitionKey key, - Datum *rb_datums, PartitionRangeDatumKind *rb_kind, +static int32 partition_rbound_cmp(int partnatts, FmgrInfo *partsupfunc, + Oid *partcollation, Datum *datums1, + PartitionRangeDatumKind *kind1, bool lower1, + PartitionRangeBound *b2); +static int32 partition_rbound_datum_cmp(int partnatts, FmgrInfo *partsupfunc, + Oid *partcollation, Datum *rb_datums, + PartitionRangeDatumKind *rb_kind, Datum *tuple_datums); -static int32 partition_bound_cmp(PartitionKey key, - PartitionBoundInfo boundinfo, +static int32 partition_bound_cmp(int partnatts, FmgrInfo *partsupfunc, + Oid *partcollation, PartitionBoundInfo boundinfo, int offset, void *probe, bool probe_is_bound); static int partition_bound_bsearch(PartitionKey key, PartitionBoundInfo boundinfo, @@ -719,8 +721,9 @@ check_new_partition_bound(char *relname, Relation parent, * First check if the resulting range would be empty with * specified lower and upper bounds */ -if (partition_rbound_cmp(key, lower->datums, lower->kind, true, - upper) >= 0) +if (partition_rbound_cmp(key->partnatts, key->partsupfunc, + key->partcollation, lower->datums, + lower->kind, true, upper) >= 0) { ereport(ERROR, (errcode(ERRCODE_INVALID_OBJECT_DEFINITION), @@ -771,9 +774,11 @@ check_new_partition_bound(char *relname, Relation parent, { int32 cmpval; - cmpval = partition_bound_cmp(key, boundinfo, - offset + 1, upper, - true); + cmpval = partition_bound_cmp(key->partnatts, + key->partsupfunc, + key->partcollation, + boundinfo, offset + 1, + upper, true); if (cmpval < 0) { /* @@ -2138,7 +2143,9 @@ qsort_partition_rbound_cmp(const void *a, const void *b, void *arg) PartitionRangeBound *b2 = (*(PartitionRangeBound *const *) b); PartitionKey key = (PartitionKey) arg; - return partition_rbound_cmp(key, b1->datums, b1->kind, b1->lower, b2); + return partition_rbound_cmp(key->partnatts, key->partsupfunc, +key->partcollation, b1->datums, b1->kind, +b1->lower, b2); } /* @@ -2155,7 +2162,7 @@ qsort_partition_rbound_cmp(const void *a, const void *b, void *arg) * two contiguous partitions. */ static int32 -partition_rbound_cmp(PartitionKey key, +partition_rbound_cmp(int partnatts, FmgrInfo *partsupfunc, Oid *partcollation, Datum *datums1, PartitionRangeDatumKind *kind1, bool lower1, PartitionRangeBound *b2) { @@ -2165,7 +2172,7 @@ partition_rbound_cmp(PartitionKey key, PartitionRangeDatumK
[HACKERS] advanced partition matching algorithm for partition-wise join
The patch-set in [1] supports partition-wise join when the partition bounds and partition keys of the joining tables exactly match. The last two patches in the last few patch-sets in that thread implement more advanced partition matching code. In order to avoid mixing reviews for advanced partition matching and the basic partition-wise join implementation, I am starting a new thread to discuss the same. I am attaching the last two patches from that patch set here. The new partition matching algorithm handles following cases when a given partition on one side has at most one matching partition matching on the other side. 1. When the ranges of the joining tables do not match exactly E.g. partition table t1 has partitions t1p1 (0 - 100), t1p2 (150 - 200) and partition table t2 has partitions t2p1 (0 - 50), t2p2 (100 - 175). In this case (t1p1, t2p1) and (t1p2, t2p2) form the matching partition pairs, which can be joined. While matching the pairs, we also compute the partition bounds for the resulting join. An INNER join between t1 and t2 will have ranges (0 - 50) since no row with 50 <= key < 100 from t1p1 is going to find a matching row in t2p1 and (150 - 175) since no row with 100 <= key < 150 from t2p2 is going to find a matching row in t1p2 and no row with 175 <= key < 200 in t1p2 is going to find a matching row in t1p2. A t1 LEFT join t2 on the other hand will have ranges same as the outer relation i.e. t1, (0 - 100), (150 - 200) since all rows from t1 will be part of the join. Thus depending upon the type of join the partition bounds of the resultant join relation change. Similarly for list partitioned table, when the lists do not match exactly, the algorithm finds matching pairs of partitions and the lists of resultant join relation. E.g. t1 has partitions t1p1 ('a', 'b', 'c'), t1p2 ('e', 'f') and t2 has partitions t2p1 ('a', 'b'), t2p2 ('d', 'e', 'f'). In this case (t1p1, t2p1) and (t2p1, t2p2) form the matching pairs which are joined. Inner join will have bounds ('a','b'), ('e', 'f') and t1 LEFT JOIN t2 will have bounds same as t1. 2. When one or both side have at least one partition that does not have matching partition on the other side. E.g. t1 has partitions t1p1 ('a','b'), t1p2 ('c','d') and t2 has only one partition t2p1 ('a','b') OR t1 has partitions t1p1 (0 - 100), t1p2 (100 - 200) and t2 has only one partition t2p1 (0 - 100). In this case as well different types of joins will have different partition bounds for the result using similar rules described above. 3. A combination of 1 and 2 e.g. t1 has partitions t1p1('a','b','c'), t1p2('d','e','f') and t2 has a single partition t2p1 ('a','b', 'z'). Algorithm - The pairs of matching partitions and the partition bounds of the join are calculated by an algorithm similar to merge join. In such a join, it can be observed that every partition on either side, contributes to at most one partition of the resultant join relation. Thus for every partition on either side, we keep track of the partition of resultant join (if any), which it contributes to. If multiple partitions from any of the joining relations map to a single partition of the resultant join, we need to gang those partitions together before joining the partition/s from the other side. Since we do not have infrastructure for ganging multiple arbitrary RelOptInfos together in a parent RelOptInfo, we do not support such a partitionw-wise join right now. We stop merging the bounds immediately when we detect such a case. For list partitioned tables, we compare list values from both the sides, starting with the lowest. If the two list values being compared match, corresponding partitions from both sides form a pair of partitions to be joined. We record this mapping and also include the list value in join bounds. If the two list values do not match and the lower of those two comes from the outer side of the join, we include it in the join bounds. We advance to the next list value on side with the lower list value continuing the process of merging till list values on at least one side are exhausted. If the remaining values are from the outer side, we include those in the join partition bounds. Every list value included in the join bounds, and its originating partition/s are associated with appropriate partition of the resultant join. For more details please see partition_list_bounds_merge() in the attached patch. In case of range partitioned tables, we compare the ranges of the partitions in increasing order of their bounds. If two ranges being compared overlap, corresponding partitions from both sides form a pair of partitions to be joined. We record this mapping and also include the merged range in the bounds of resultant join. The overlapping ranges are merged based on the type of join as described above. If either of the ranges completely precedes the other, and it's on the outer side, we include that range in the bounds of resultant join. We advance to the next range on the si