Hi,

Here's a patch with some cosmetic fixes to 0002, to be applied on top of 0002.

On Tue, May 16, 2017 at 1:02 PM, Ashutosh Bapat
<ashutosh.ba...@enterprisedb.com> wrote:
> On Sun, May 14, 2017 at 12:30 PM, amul sul <sula...@gmail.com> wrote:
>> On Fri, May 12, 2017 at 10:39 PM, Ashutosh Bapat
>> <ashutosh.ba...@enterprisedb.com> wrote:
>>> On Fri, May 12, 2017 at 6:08 PM, amul sul <sula...@gmail.com> wrote:
>>>> Hi,
>>>>
>>>> Please find the following updated patches attached:
>>>>
>>>> 0001-Cleanup.patch : Does some cleanup and code refactoring required
>>>> for hash partition patch. Otherwise, there will be unnecessary diff in
>>>> 0002 patch
>>>
>>> Thanks for splitting the patch.
>>>
>>> +    if (isnull[0])
>>> +        cur_index = partdesc->boundinfo->null_index;
>>> This code assumes that null_index will be set to -1 when has_null is
>>> false. Per RelationBuildPartitionDesc() this is true. But probably we
>>> should write this code as
>>> if (isnull[0])
>>> {
>>>     if (partdesc->boundinfo->has_null)
>>>         cur_index = partdesc->boundinfo->null_index;
>>> }
>>> That way we are certain that when has_null is false, cur_index = -1
>>> similar to the original code.
>>>
>> Okay will add this.
>
> Thanks.
>
>> I still don't understood point of having has_null
>> variable, if no null accepting partition exists then null_index is
>> alway set to -1 in RelationBuildPartitionDesc. Anyway, let not change
>> the original code.
>
> I agree. has_null might have been folded as null_index == -1. But
> that's not the problem of this patch.
>
> 0001 looks good to me now.
>
>
>>
>> [...]
>>>
>>> +    if (key->strategy == PARTITION_STRATEGY_HASH)
>>> +    {
>>> +        ndatums = nparts;
>>> +        hbounds = (PartitionHashBound **) palloc(nparts *
>>> +                                                 sizeof(PartitionHashBound *));
>>> +        i = 0;
>>> +        foreach (cell, boundspecs)
>>> +        {
>>> +            PartitionBoundSpec *spec = lfirst(cell);
>>> +
>>> [ clipped ]
>>> +            hbounds[i]->index = i;
>>> +            i++;
>>> +        }
>>> For list and range partitioned table we order the bounds so that two
>>> partitioned tables have them in the same order irrespective of order in
>>> which they are specified by the user or hence stored in the catalogs.
>>> The partitions then get indexes according the order in which their
>>> bounds appear in ordered arrays of bounds. Thus any two partitioned
>>> tables with same partition specification always have same
>>> PartitionBoundInfoData. This helps in partition-wise join to match
>>> partition bounds of two given tables. Above code assigns the indexes to
>>> the partitions as they appear in the catalogs. This means that two
>>> partitioned tables with same partition specification but different
>>> order for partition bound specification will have different
>>> PartitionBoundInfoData represenation.
>>>
>>> If we do that, probably partition_bounds_equal() would reduce to just
>>> matching indexes and the last element of datums array i.e. the greatest
>>> modulus datum. If ordered datums array of two partitioned table do not
>>> match exactly, the mismatch can be because missing datums or different
>>> datums. If it's a missing datum it will change the greatest modulus or
>>> have corresponding entry in indexes array as -1. If the entry differs
>>> it will cause mismatching indexes in the index arrays.
>>>
>> Make sense, will fix this.
>
> I don't see this being addressed in the patches attached in the reply to
> Dilip.
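
To make the ordering suggestion above concrete, here is a rough sketch (not
taken from the posted patches; the PartitionHashBound fields and the
hbounds/boundinfo variables are assumptions based on the quoted snippet) of
how RelationBuildPartitionDesc() could canonicalize hash bounds by sorting
them on (modulus, remainder) before assigning indexes:

/* Sketch only: compare two hash bounds by modulus, then by remainder. */
static int32
qsort_partition_hbound_cmp(const void *a, const void *b)
{
    PartitionHashBound *h1 = (*(PartitionHashBound *const *) a);
    PartitionHashBound *h2 = (*(PartitionHashBound *const *) b);

    if (h1->modulus != h2->modulus)
        return (h1->modulus < h2->modulus) ? -1 : 1;
    if (h1->remainder != h2->remainder)
        return (h1->remainder < h2->remainder) ? -1 : 1;
    return 0;
}

    /* ... after filling hbounds[0 .. nparts - 1] as in the quoted code ... */
    qsort(hbounds, nparts, sizeof(PartitionHashBound *),
          qsort_partition_hbound_cmp);

    /* Partitions now get their indexes following the canonical bound order. */
    for (i = 0; i < nparts; i++)
        boundinfo->indexes[i] = hbounds[i]->index;

With a canonical order like this, two tables with the same specification
should end up with identical PartitionBoundInfoData, and
partition_bounds_equal() could be reduced to comparing the greatest modulus
and the indexes array, as described above.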
>
>>
>>>
>>> +        if (key->partattrs[i] != 0)
>>> +        {
>>> +            keyCol = (Node *) makeVar(1,
>>> +                                      key->partattrs[i],
>>> +                                      key->parttypid[i],
>>> +                                      key->parttypmod[i],
>>> +                                      key->parttypcoll[i],
>>> +                                      0);
>>> +
>>> +            /* Form hash_fn(value) expression */
>>> +            keyCol = (Node *) makeFuncExpr(key->partsupfunc[i].fn_oid,
>>> +                                           get_fn_expr_rettype(&key->partsupfunc[i]),
>>> +                                           list_make1(keyCol),
>>> +                                           InvalidOid,
>>> +                                           InvalidOid,
>>> +                                           COERCE_EXPLICIT_CALL);
>>> +        }
>>> +        else
>>> +        {
>>> +            keyCol = (Node *) copyObject(lfirst(partexprs_item));
>>> +            partexprs_item = lnext(partexprs_item);
>>> +        }
>>> I think we should add FuncExpr for column Vars as well as expressions.
>>>
>> Okay, will fix this.
>
> Here, please add a check similar to get_quals_for_range()
> 1840         if (partexprs_item == NULL)
> 1841             elog(ERROR, "wrong number of partition key expressions");
>
>
>>
>>> I think we need more comments for compute_hash_value(), mix_hash_value()
>>> and satisfies_hash_partition() as to what each of them accepts and what
>>> it computes.
>>>
>>> +    /* key's hash values start from third argument of function. */
>>> +    if (!PG_ARGISNULL(i + 2))
>>> +    {
>>> +        values[i] = PG_GETARG_DATUM(i + 2);
>>> +        isnull[i] = false;
>>> +    }
>>> +    else
>>> +        isnull[i] = true;
>>> You could write this as
>>> isnull[i] = PG_ARGISNULL(i + 2);
>>> if (isnull[i])
>>>     values[i] = PG_GETARG_DATUM(i + 2);
>>>
>> Okay.
>
> If we have used this technique somewhere else in PG code, please
> mention that function/place.
> /*
>  * Rotate hash left 1 bit before mixing in the next column. This
>  * prevents equal values in different keys from cancelling each other.
>  */
>
>
>>
>>> +    foreach (lc, $5)
>>> +    {
>>> +        DefElem    *opt = (DefElem *) lfirst(lc);
>>> A search on WITH in gram.y shows that we do not handle WITH options in
>>> gram.y. Usually they are handled at the transformation stage. Why is
>>> this an exception? If you do that, we can have all the error handling
>>> in transformPartitionBound().
>>>
>> If so, ForValues need to return list for hash and PartitionBoundSpec
>> for other two; wouldn't that break code consistency? And such
>> validation is not new in gram.y see xmltable_column_el.
>
> Thanks for pointing that out. Ok, then may be leave it in gram.y. But
> may be we should move the error handling in transform function.
>
>
>>
>>> +DATA(insert OID = 5028 (  satisfies_hash_partition PGNSP PGUID 12 1 0
>>> 2276 0 f f f f f f i s 3 0 16 "23 23 2276" _null_ _null_ _null_ _null_
>>> _null_ satisfies_hash_partition _null_ _null_ _null_ ));
>>> Why is third argument to this function ANY? Shouldn't it be INT4ARRAY
>>> (variadic INT4)?
>>>
>> Will use INT4ARRAY in next patch, but I am little sceptical of it. we
>> need an unsigned int32, but unfortunately there is not variadic uint32
>> support. How about INT8ARRAY?
>
> Hmm, I think as long as the binary representation of given unsigned
> integer doesn't change in the function call, we could cast an INT32
> datums into unsigned int32, so spending extra 4 bytes per partition
> key doesn't look like worth the effort.
>
> A related question is, all hash functions have return type as
> "integer" but internally they return uint32. Why not to do the same
> for this function as well?
>
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
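
On the last two points, a rough sketch of what satisfies_hash_partition()
could look like if it simply reinterprets its int4 arguments as unsigned
values. This is not the code from 0002: the argument layout follows the
quoted CHECK expressions, the rotate step follows the quoted mix_hash_value()
comment, and the XOR used to fold in each column's hash is only an
assumption for illustration.

/*
 * Sketch only: modulus and remainder arrive as int4 but are reinterpreted
 * as uint32; the cast does not change the binary representation, so no
 * int8/int8[] signature is needed.
 */
Datum
satisfies_hash_partition(PG_FUNCTION_ARGS)
{
    uint32      modulus = (uint32) PG_GETARG_INT32(0);
    uint32      remainder = (uint32) PG_GETARG_INT32(1);
    uint32      rowHash = 0;
    int         i;

    /* Key hash values start from the third argument of the function. */
    for (i = 2; i < PG_NARGS(); i++)
    {
        /*
         * Rotate hash left 1 bit before mixing in the next column, so that
         * equal values in different keys do not cancel each other.
         */
        rowHash = (rowHash << 1) | ((rowHash & 0x80000000) ? 1 : 0);

        if (!PG_ARGISNULL(i))
            rowHash ^= (uint32) PG_GETARG_INT32(i);     /* illustrative mix */
    }

    PG_RETURN_BOOL(rowHash % modulus == remainder);
}

Keeping the SQL-visible arguments as plain integer and doing the cast inside
the function avoids spending the extra 4 bytes per partition key that an
int8-based signature would require, as noted above.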
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 987bb73..e9b09dd 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -58,7 +58,8 @@
  * In the case of range partitioning, ndatums will typically be far less than
  * 2 * nparts, because a partition's upper bound and the next partition's lower
  * bound are the same in most common cases, and we only store one of them.
- * In case of hash partitioning, ndatums be same as the number of partitions.
+ * In case of hash partitioning, ndatums will be same as the number of
+ * partitions.
  *
  * For range and list partitioned tables, datums is an array of datum-tuples
  * with key->partnatts datums each.
@@ -1504,12 +1505,12 @@ make_partition_op_expr(PartitionKey key, int keynum,
 /*
  * get_qual_for_hash
  *
- * Given a list of partition columns, modulus and remainder this function
- * returns an expression Node for the partition table's CHECK constraint.
+ * Given a list of partition columns, modulus and remainder corresponding to a
+ * partition, this function returns CHECK constraint expression Node for that
+ * partition.
  *
- * For example, given a partition definition such as:
- * CREATE TABLE simple_hash (a int, b char(10))
- *   PARTITION BY HASH (a, b);
+ * For a partitioned table defined as:
+ * CREATE TABLE simple_hash (a int, b char(10)) PARTITION BY HASH (a, b);
  *
  * CREATE TABLE p_p1 PARTITION OF simple_hash
  *   FOR VALUES WITH (modulus 2, remainder 1);
@@ -1520,16 +1521,16 @@ make_partition_op_expr(PartitionKey key, int keynum,
  * CREATE TABLE p_p4 PARTITION OF simple_hash
  *   FOR VALUES WITH (modulus 8, remainder 4);
  *
- * This function will return one of the following in the form of a
- * subexpression:
+ * This function will return one of the following in the form of an
+ * expression:
  *
  * for p_p1: satisfies_hash_partition(2, 1, hash_fn_1(a), hash_fn_2(b))
  * for p_p2: satisfies_hash_partition(4, 2, hash_fn_1(a), hash_fn_2(b))
  * for p_p3: satisfies_hash_partition(8, 0, hash_fn_1(a), hash_fn_2(b))
  * for p_p4: satisfies_hash_partition(8, 4, hash_fn_1(a), hash_fn_2(b))
  *
- * hash_fn_1 and hash_fn_2 will be datatype-specific hash functions for
- * column a and b respectively.
+ * where hash_fn_1 and hash_fn_2 are datatype-specific hash functions for
+ * columns a and b respectively.
  */
 static List *
 get_qual_for_hash(PartitionKey key, PartitionBoundSpec *spec)
@@ -1568,7 +1569,6 @@ get_qual_for_hash(PartitionKey key, PartitionBoundSpec *spec)
 
 		/* Left operand */
 		if (key->partattrs[i] != 0)
-		{
 			keyCol = (Node *) makeVar(1,
 									  key->partattrs[i],
 									  key->parttypid[i],
@@ -1576,7 +1576,6 @@ get_qual_for_hash(PartitionKey key, PartitionBoundSpec *spec)
 									  key->parttypcoll[i],
 									  0);
-		}
 		else
 		{
 			keyCol = (Node *) copyObject(lfirst(partexprs_item));
@@ -2684,7 +2683,7 @@ mix_hash_value(int nkeys, uint32 *hash_array, bool *isnull)
 		/*
 		 * Rotate hash left 1 bit before mixing in the next column. This
 		 * prevents equal values in different keys from cancelling each other.
-		 * */
+		 */
 		rowHash = (rowHash << 1) | ((rowHash & 0x80000000) ? 1 : 0);
 
 		if (!isnull[i])
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 46344ee..8f8bbcd 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -13231,8 +13231,9 @@ transformPartitionSpec(Relation rel, PartitionSpec *partspec, char *strategy)
 		}
 
 		/*
-		 * Collation is irrelevant for hash partition key, because hash
-		 * operator classes provide only equality, not ordering.
+		 * Hash operator classes provide only equality, not ordering.
+		 * Collation, which is relevant for ordering and not equality, is
+		 * irrelevant for hash partitioning.
 		 */
 		if (*strategy == PARTITION_STRATEGY_HASH && pelem->collation != NIL)
 		{
@@ -13404,9 +13405,10 @@ ComputePartitionAttrs(Relation rel, List *partParams, AttrNumber *partattrs,
 
 		partcollation[attn] = attcollation;
 
-		/* Identify opclass to use. For list and range partitioning we use only
-		 * btree operators, which seems enough for those. For hash partitioning,
-		 * we use hash operators. */
+		/*
+		 * Identify opclass to use. For list and range partitioning we use only
+		 * btree operator class, which seems enough for those. For hash partitioning,
+		 * we use hash operator class. */
 		if (strategy == PARTITION_STRATEGY_HASH)
 		{
 			am_method = "hash";