On 2017/09/27 22:41, Jesper Pedersen wrote:
> On 09/27/2017 03:05 AM, amul sul wrote:
>>>> Attached rebased patch, thanks.
>>>>
>>>>
>>> While reading through the patch I thought it would be better to keep
>>> MODULUS and REMAINDER in caps, if CREATE TABLE was in caps too in order to
>>> highlight that these are "keywords" for hash partition.
>>>
>>> Also updated some of the documentation.
>>>
>>>
>> Thanks a lot for the patch, included in the attached version.​
>>
> 
> Thank you.
> 
> Based on [1] I have moved the patch to "Ready for Committer".

Thanks a lot Amul for working on this.  Like Jesper said, the patch looks
pretty good overall.  I was looking at the latest version with intent to
study certain things about hash partitioning the way patch implements it,
during which I noticed some things.

+      The modulus must be a positive integer, and the remainder must a

must be a

+      suppose you have a hash-partitioned table with 8 children, each of
which
+      has modulus 8, but find it necessary to increase the number of
partitions
+      to 16.

Might it be a good idea to say 8 "partitions" instead of "children" in the
first sentence?

+      each modulus-8 partition until none remain.  While this may still
involve
+      a large amount of data movement at each step, it is still better than
+      having to create a whole new table and move all the data at once.
+     </para>
+

I read the paragraph that ends with the above text and started wondering
if the example to redistribute data in hash partitions by detaching and
attaching with new modulus/remainder could be illustrated with an example?
Maybe in the examples section of the ALTER TABLE page?

+      Since hash operator class provide only equality, not ordering,
collation

Either "Since hash operator classes provide" or "Since hash operator class
provides"

Other than the above points, patch looks good.


By the way, I noticed a couple of things about hash partition constraints:

1. In get_qual_for_hash(), using
get_fn_expr_rettype(&key->partsupfunc[i]), which returns InvalidOid for
the lack of fn_expr being set to non-NULL value, causes funcrettype of the
FuncExpr being generated for hashing partition key columns to be set to
InvalidOid, which I think is wrong.  That is, the following if condition
in get_fn_expr_rettype() is always satisfied:

    if (!flinfo || !flinfo->fn_expr)
        return InvalidOid;

I think we could use get_func_rettype(&key->partsupfunc[i].fn_oid)
instead.  Attached patch
hash-v21-set-funcexpr-funcrettype-correctly.patch, which applies on top
v21 of your patch.

2. It seems that the reason constraint exclusion doesn't work with hash
partitions as implemented by the patch is that predtest.c:
operator_predicate_proof() returns false even without looking into the
hash partition constraint, which is of the following form:

satisfies_hash_partition(<mod>, <rem>, <key1-exthash>,..)

beccause the above constraint expression doesn't translate into a a binary
opclause (an OpExpr), which operator_predicate_proof() knows how to work
with.  So, false is returned at the beginning of that function by the
following code:

    if (!is_opclause(predicate))
        return false;

For example,

create table p (a int) partition by hash (a);
create table p0 partition of p for values with (modulus 4, remainder 0);
create table p1 partition of p for values with (modulus 4, remainder 1);
\d+ p0
<...>
Partition constraint: satisfies_hash_partition(4, 0, hashint4extended(a,
'8816678312871386367'::bigint))

-- both p0 and p1 scanned
explain select * from p where satisfies_hash_partition(4, 0,
hashint4extended(a, '8816678312871386367'::bigint));
                                             QUERY PLAN

----------------------------------------------------------------------------------------------------
 Append  (cost=0.00..96.50 rows=1700 width=4)
   ->  Seq Scan on p0  (cost=0.00..48.25 rows=850 width=4)
         Filter: satisfies_hash_partition(4, 0, hashint4extended(a,
'8816678312871386367'::bigint))
   ->  Seq Scan on p1  (cost=0.00..48.25 rows=850 width=4)
         Filter: satisfies_hash_partition(4, 0, hashint4extended(a,
'8816678312871386367'::bigint))
(5 rows)

-- both p0 and p1 scanned
explain select * from p where satisfies_hash_partition(4, 1,
hashint4extended(a, '8816678312871386367'::bigint));
                                             QUERY PLAN

----------------------------------------------------------------------------------------------------
 Append  (cost=0.00..96.50 rows=1700 width=4)
   ->  Seq Scan on p0  (cost=0.00..48.25 rows=850 width=4)
         Filter: satisfies_hash_partition(4, 1, hashint4extended(a,
'8816678312871386367'::bigint))
   ->  Seq Scan on p1  (cost=0.00..48.25 rows=850 width=4)
         Filter: satisfies_hash_partition(4, 1, hashint4extended(a,
'8816678312871386367'::bigint))
(5 rows)


I looked into how satisfies_hash_partition() works and came up with an
idea that I think will make constraint exclusion work.  What if we emitted
the hash partition constraint in the following form instead:

hash_partition_mod(hash_partition_hash(key1-exthash, key2-exthash),
                   <mod>) = <rem>

With that form, constraint exclusion seems to work as illustrated below:

\d+ p0
<...>
Partition constraint:
(hash_partition_modulus(hash_partition_hash(hashint4extended(a,
'8816678312871386367'::bigint)), 4) = 0)

-- note only p0 is scanned
explain select * from p where
hash_partition_modulus(hash_partition_hash(hashint4extended(a,
'8816678312871386367'::bigint)), 4) = 0;
                     QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------
 Append  (cost=0.00..61.00 rows=13 width=4)
   ->  Seq Scan on p0  (cost=0.00..61.00 rows=13 width=4)
         Filter:
(hash_partition_modulus(hash_partition_hash(hashint4extended(a,
'8816678312871386367'::bigint)), 4) = 0)
(3 rows)

-- note only p1 is scanned
explain select * from p where
hash_partition_modulus(hash_partition_hash(hashint4extended(a,
'8816678312871386367'::bigint)), 4) = 1;
                                                        QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------
 Append  (cost=0.00..61.00 rows=13 width=4)
   ->  Seq Scan on p1  (cost=0.00..61.00 rows=13 width=4)
         Filter:
(hash_partition_modulus(hash_partition_hash(hashint4extended(a,
'8816678312871386367'::bigint)), 4) = 1)
(3 rows)

I tried to implement that in the attached
hash-v21-hash-part-constraint.patch, which applies on top v21 of your
patch (actually on top of
hash-v21-set-funcexpr-funcrettype-correctly.patch, which I think should be
applied anyway as it fixes a bug of the original patch).

What do you think?  Eventually, the new partition-pruning method [1] will
make using constraint exclusion obsolete, but it might be a good idea to
have it working if we can.

Thanks,
Amit

[1] https://commitfest.postgresql.org/14/1272/
From 7773ba29d73e2e5cdb4fa7fee68c4598b70bf22f Mon Sep 17 00:00:00 2001
From: amit <amitlangot...@gmail.com>
Date: Thu, 28 Sep 2017 13:26:00 +0900
Subject: [PATCH 3/3] Make constraint exclusion work with hash partitions

---
 src/backend/catalog/partition.c | 104 +++++++++++++++++++++++++++++-----------
 src/include/catalog/pg_proc.h   |   6 ++-
 src/include/fmgr.h              |   1 +
 3 files changed, 82 insertions(+), 29 deletions(-)

diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index a085a7a0be..ba437c13c8 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -28,6 +28,7 @@
 #include "catalog/pg_inherits.h"
 #include "catalog/pg_inherits_fn.h"
 #include "catalog/pg_opclass.h"
+#include "catalog/pg_operator.h"
 #include "catalog/pg_partitioned_table.h"
 #include "catalog/pg_type.h"
 #include "commands/tablecmds.h"
@@ -174,7 +175,8 @@ static void get_partition_dispatch_recurse(Relation rel, 
Relation parent,
 static uint64 compute_hash_value(PartitionKey key, Datum *values, bool 
*isnull);
 
 /* SQL-callable function for use in hash partition CHECK constraints */
-PG_FUNCTION_INFO_V1(satisfies_hash_partition);
+PG_FUNCTION_INFO_V1(hash_partition_modulus);
+PG_FUNCTION_INFO_V1(hash_partition_hash);
 
 /*
  * RelationBuildPartitionDesc
@@ -1705,14 +1707,14 @@ make_partition_op_expr(PartitionKey key, int keynum,
  * This function will return one of the following in the form of an
  * expression:
  *
- * for p_p1: satisfies_hash_partition(2, 1, hash_fn_1_extended(a, HASH_SEED),
- *                                                                             
        hash_fn_2_extended(b, HASH_SEED))
- * for p_p2: satisfies_hash_partition(4, 2, hash_fn_1_extended(a, HASH_SEED),
- *                                                                             
        hash_fn_2_extended(b, HASH_SEED))
- * for p_p3: satisfies_hash_partition(8, 0, hash_fn_1_extended(a, HASH_SEED),
- *                                                                             
        hash_fn_2_extended(b, HASH_SEED))
- * for p_p4: satisfies_hash_partition(8, 4, hash_fn_1_extended(a, HASH_SEED),
- *                                                                             
        hash_fn_2_extended(b, HASH_SEED))
+ * for p_p1: hash_partition_hash(hash_fn_1_extended(a, HASH_SEED),
+ *                                                              
hash_fn_2_extended(b, HASH_SEED)) % 4 = 1
+ * for p_p2: hash_partition_hash(hash_fn_1_extended(a, HASH_SEED),
+ *                                                              
hash_fn_2_extended(b, HASH_SEED)) % 4 = 2
+ * for p_p3: hash_partition_hash(hash_fn_1_extended(a, HASH_SEED),
+ *                                                              
hash_fn_2_extended(b, HASH_SEED)) % 8 = 0
+ * for p_p4: hash_partition_hash(hash_fn_1_extended(a, HASH_SEED),
+ *                                                              
hash_fn_2_extended(b, HASH_SEED)) % 8 = 4
  *
  * where hash_fn_1_extended and hash_fn_2_extended are datatype-specific hash
  * functions(extended version) for columns a and b respectively and HASH_SEED
@@ -1722,11 +1724,13 @@ static List *
 get_qual_for_hash(Relation parent, PartitionBoundSpec *spec)
 {
        PartitionKey key = RelationGetPartitionKey(parent);
-       FuncExpr   *fexpr;
+       FuncExpr   *hash_fexpr,
+                          *mod_fexpr;
+       Expr       *opexpr;
        Node       *modulusConst;
        Node       *remainderConst;
        Node       *hashSeedConst;
-       List       *args;
+       List       *args = NIL;
        ListCell   *partexprs_item;
        int                     i;
 
@@ -1758,13 +1762,13 @@ get_qual_for_hash(Relation parent, PartitionBoundSpec 
*spec)
                                                                           
false,
                                                                           
FLOAT8PASSBYVAL);
 
-       args = list_make2(modulusConst, remainderConst);
        partexprs_item = list_head(key->partexprs);
 
        /* Add an argument for each key column. */
        for (i = 0; i < key->partnatts; i++)
        {
                Node       *keyCol;
+               FuncExpr   *fexpr;
 
                /* Left operand */
                if (key->partattrs[i] != 0)
@@ -1794,14 +1798,26 @@ get_qual_for_hash(Relation parent, PartitionBoundSpec 
*spec)
                args = lappend(args, fexpr);
        }
 
-       fexpr = makeFuncExpr(F_SATISFIES_HASH_PARTITION,
-                                                BOOLOID,
-                                                args,
-                                                InvalidOid,
-                                                InvalidOid,
-                                                COERCE_EXPLICIT_CALL);
+       hash_fexpr = makeFuncExpr(F_HASH_PARTITION_HASH,
+                                                         INT8OID,
+                                                         args,
+                                                         InvalidOid,
+                                                         InvalidOid,
+                                                         COERCE_EXPLICIT_CALL);
+       mod_fexpr = makeFuncExpr(F_HASH_PARTITION_MODULUS,
+                                                        INT4OID,
+                                                        list_make2(hash_fexpr, 
modulusConst),
+                                                        InvalidOid,
+                                                        InvalidOid,
+                                                        COERCE_EXPLICIT_CALL);
+
+       /* Make an fexpr = true expression */
+       opexpr = make_opclause(Int4EqualOperator, BOOLOID, false,
+                                                  (Expr *) mod_fexpr,
+                                                  (Expr *) remainderConst,
+                                                  InvalidOid, InvalidOid);
 
-       return list_make1(fexpr);
+       return list_make1(opexpr);
 }
 
 /*
@@ -3273,7 +3289,43 @@ compute_hash_value(PartitionKey key, Datum *values, bool 
*isnull)
 }
 
 /*
- * satisfies_hash_partition
+ * hash_partition_modulus
+ *
+ * This is a SQL-callable function for use in hash partition constraint that
+ * takes as arguments a 64-bit hash value and modulus and returns the
+ * remainder obtained by dividing the hash value with modulus.
+ */
+Datum
+hash_partition_modulus(PG_FUNCTION_ARGS)
+{
+       uint64          arg1 = PG_GETARG_UINT64(0);
+       int32           arg2 = PG_GETARG_INT32(1);
+
+       if (arg2 == 0)
+       {
+               ereport(ERROR,
+                               (errcode(ERRCODE_DIVISION_BY_ZERO),
+                                errmsg("division by zero")));
+               /* ensure compiler realizes we mustn't reach the division (gcc 
bug) */
+               PG_RETURN_NULL();
+       }
+
+       /*
+        * Some machines throw a floating-point exception for INT_MIN % -1, 
which
+        * is a bit silly since the correct answer is perfectly well-defined,
+        * namely zero.  (It's not clear this ever happens when dealing with
+        * int16, but we might as well have the test for safety.)
+        */
+       if (arg2 == -1)
+               PG_RETURN_INT32(0);
+
+       /* No overflow is possible */
+
+       PG_RETURN_INT32(arg1 % arg2);
+}
+
+/*
+ * hash_partition_hash
  *
  * This is a SQL-callable function for use in hash partition constraints takes
  * an already computed hash values of each partition key attribute, and combine
@@ -3285,11 +3337,9 @@ compute_hash_value(PartitionKey key, Datum *values, bool 
*isnull)
  * See get_qual_for_hash() for usage.
  */
 Datum
-satisfies_hash_partition(PG_FUNCTION_ARGS)
+hash_partition_hash(PG_FUNCTION_ARGS)
 {
-       int                     modulus = PG_GETARG_INT32(0);
-       int                     remainder = PG_GETARG_INT32(1);
-       short           nkeys = PG_NARGS() - 2;
+       short           nkeys = PG_NARGS();
        int                     i;
        Datum           hash_array[PARTITION_MAX_KEYS];
        bool            isnull[PARTITION_MAX_KEYS];
@@ -3301,14 +3351,14 @@ satisfies_hash_partition(PG_FUNCTION_ARGS)
                 * Partition key attribute's hash values start from third 
argument of
                 * function.
                 */
-               isnull[i] = PG_ARGISNULL(i + 2);
+               isnull[i] = PG_ARGISNULL(i);
 
                if (!isnull[i])
-                       hash_array[i] = PG_GETARG_DATUM(i + 2);
+                       hash_array[i] = PG_GETARG_DATUM(i);
        }
 
        /* Form a single 64-bit hash value */
        rowHash = mix_hash_value(nkeys, hash_array, isnull);
 
-       PG_RETURN_BOOL(rowHash % modulus == remainder);
+       PG_RETURN_UINT64(rowHash);
 }
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 470b1f6482..fa7e9a6f2b 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -5523,8 +5523,10 @@ DATA(insert OID = 3354 (  pg_ls_waldir                   
         PGNSP PGUID 12 10 20 0 0 f f f f t t
 DESCR("list of files in the WAL directory");
 
 /* hash partitioning constraint function */
-DATA(insert OID = 5028 ( satisfies_hash_partition PGNSP PGUID 12 1 0 2276 0 f 
f f f f f i s 3 0 16 "23 23 1007" _null_ _null_ _null_ _null_ _null_ 
satisfies_hash_partition _null_ _null_ _null_ ));
-DESCR("hash partition CHECK constraint");
+DATA(insert OID = 315 (  hash_partition_modulus                   PGNSP PGUID 
12 1 0 0 0 f f f f t f i s 2 0 23 "20 23" _null_ _null_ _null_ _null_ _null_ 
hash_partition_modulus _null_ _null_ _null_ ));
+DESCR("to use in hash partition CHECK constraint");
+DATA(insert OID = 5028 ( hash_partition_hash PGNSP PGUID 12 1 0 2276 0 f f f f 
f f i s 1 0 20 "1007" _null_ _null_ _null_ _null_ _null_ hash_partition_hash 
_null_ _null_ _null_ ));
+DESCR("to use in hash partition CHECK constraint");
 
 /*
  * Symbolic values for provolatile column: these indicate whether the result
diff --git a/src/include/fmgr.h b/src/include/fmgr.h
index b604a5c162..7535f95382 100644
--- a/src/include/fmgr.h
+++ b/src/include/fmgr.h
@@ -245,6 +245,7 @@ extern struct varlena *pg_detoast_datum_packed(struct 
varlena *datum);
 #define PG_GETARG_FLOAT4(n)  DatumGetFloat4(PG_GETARG_DATUM(n))
 #define PG_GETARG_FLOAT8(n)  DatumGetFloat8(PG_GETARG_DATUM(n))
 #define PG_GETARG_INT64(n)      DatumGetInt64(PG_GETARG_DATUM(n))
+#define PG_GETARG_UINT64(n)     DatumGetUInt64(PG_GETARG_DATUM(n))
 /* use this if you want the raw, possibly-toasted input datum: */
 #define PG_GETARG_RAW_VARLENA_P(n)     ((struct varlena *) 
PG_GETARG_POINTER(n))
 /* use this if you want the input datum de-toasted: */
-- 
2.11.0

From 1f2bda192b3da7667fabaece2b7037e335ad6ec5 Mon Sep 17 00:00:00 2001
From: amit <amitlangot...@gmail.com>
Date: Thu, 28 Sep 2017 13:36:47 +0900
Subject: [PATCH 2/3] Fix get_qual_for_hash to set correct function return type

---
 src/backend/catalog/partition.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 3696b9a711..a085a7a0be 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -1785,7 +1785,7 @@ get_qual_for_hash(Relation parent, PartitionBoundSpec 
*spec)
 
                /* Form hash_fn(keyCol) expression */
                fexpr = makeFuncExpr(key->partsupfunc[i].fn_oid,
-                                                        
get_fn_expr_rettype(&key->partsupfunc[i]),
+                                                        
get_func_rettype(key->partsupfunc[i].fn_oid),
                                                         list_make2(keyCol, 
hashSeedConst),
                                                         InvalidOid,
                                                         InvalidOid,
-- 
2.11.0

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to