On Sun, Apr 26, 2020 at 7:41 PM James Coleman <jtc...@gmail.com> wrote: > > On Sun, Apr 26, 2020 at 4:49 PM Tomas Vondra > <tomas.von...@2ndquadrant.com> wrote: > > > > On Sun, Apr 26, 2020 at 02:46:19PM -0400, James Coleman wrote: > > >On Sat, Apr 25, 2020 at 8:31 PM Tomas Vondra > > ><tomas.von...@2ndquadrant.com> wrote: > > >> > > >> On Sat, Apr 25, 2020 at 06:47:41PM -0400, James Coleman wrote: > > >> >On Sat, Apr 25, 2020 at 5:41 PM David Rowley <dgrowle...@gmail.com> > > >> >wrote: > > >> >> > > >> >> On Sun, 26 Apr 2020 at 00:40, Tomas Vondra > > >> >> <tomas.von...@2ndquadrant.com> wrote: > > >> >> > This reminds me our attempts to add bloom filters to hash joins, > > >> >> > which I > > >> >> > think ran into mostly the same challenge of deciding when the bloom > > >> >> > filter can be useful and is worth the extra work. > > >> >> > > >> >> Speaking of that, it would be interesting to see how a test where you > > >> >> write the query as IN(VALUES(...)) instead of IN() compares. It would > > >> >> be interesting to know if the planner is able to make a more suitable > > >> >> choice and also to see how all the work over the years to improve Hash > > >> >> Joins compares to the bsearch with and without the bloom filter. > > >> > > > >> >It would be interesting. > > >> > > > >> >It also makes one wonder about optimizing these into to hash > > >> >joins...which I'd thought about over at [1]. I think it'd be a very > > >> >significant effort though. > > >> > > > >> > > >> I modified the script to also do the join version of the query. I can > > >> only run it on my laptop at the moment, so the results may be a bit > > >> different from those I shared before, but it's interesting I think. > > >> > > >> In most cases it's comparable to the binsearch/bloom approach, and in > > >> some cases it actually beats them quite significantly. It seems to > > >> depend on how expensive the comparison is - for "int" the comparison is > > >> very cheap and there's almost no difference. For "text" the comparison > > >> is much more expensive, and there are significant speedups. > > >> > > >> For example the test with 100k lookups in array of 10k elements and 10% > > >> match probability, the timings are these > > >> > > >> master: 62362 ms > > >> binsearch: 201 ms > > >> bloom: 65 ms > > >> hashjoin: 36 ms > > >> > > >> I do think the explanation is fairly simple - the bloom filter > > >> eliminates about 90% of the expensive comparisons, so it's 20ms plus > > >> some overhead to build and check the bits. The hash join probably > > >> eliminates a lot of the remaining comparisons, because the hash table > > >> is sized to have one tuple per bucket. > > >> > > >> Note: I also don't claim the PoC has the most efficient bloom filter > > >> implementation possible. I'm sure it could be made faster. > > >> > > >> Anyway, I'm not sure transforming this to a hash join is worth the > > >> effort - I agree that seems quite complex. But perhaps this suggest we > > >> should not be doing binary search and instead just build a simple hash > > >> table - that seems much simpler, and it'll probably give us about the > > >> same benefits. > > > > > >That's actually what I originally thought about doing, but I chose > > >binary search since it seemed a lot easier to get off the ground. > > > > > > > OK, that makes perfect sense. > > > > >If we instead build a hash is there anything else we need to be > > >concerned about? For example, work mem? I suppose for the binary > > >search we already have to expand the array, so perhaps it's not all > > >that meaningful relative to that... > > > > > > > I don't think we need to be particularly concerned about work_mem. We > > don't care about it now, and it's not clear to me what we could do about > > it - we already have the array in memory anyway, so it's a bit futile. > > Furthermore, if we need to care about it, it probably applies to the > > binary search too. > > > > >I was looking earlier at what our standard hash implementation was, > > >and it seemed less obvious what was needed to set that up (so binary > > >search seemed a faster proof of concept). If you happen to have any > > >pointers to similar usages I should look at, please let me know. > > > > > > > I think the hash join implementation is far too complicated. It has to > > care about work_mem, so it implements batching, etc. That's a lot of > > complexity we don't need here. IMO we could use either the usual > > dynahash, or maybe even the simpler simplehash. > > > > FWIW it'd be good to verify the numbers I shared, i.e. checking that the > > benchmarks makes sense and running it independently. I'm not aware of > > any issues but it was done late at night and only ran on my laptop. > > Some quick calculations (don't have the scripting in a form I can > attach yet; using this as an opportunity to hack on a genericized > performance testing framework of sorts) suggest your results are > correct. I was also testing on my laptop, but I showed 1.) roughly > equivalent results for IN (VALUES ...) and IN (<list>) for integers, > but when I switch to (short; average 3 characters long) text values I > show the hash join on VALUES is twice as fast as the binary search. > > Given that, I'm planning to implement this as a hash lookup and share > a revised patch.
I've attached a patch series as before, but with an additional patch that switches to using dynahash instead of binary search. Whereas before the benchmarking ended up with a trimodal distribution (i.e., master with IN <list>, patch with IN <list>, and either with IN VALUES), the hash implementation brings us back to an effectively bimodal distribution -- though the hash scalar array op expression implementation for text is about 5% faster than the hash join. Current outstanding thoughts (besides comment/naming cleanup): - The saop costing needs to be updated to match, as Tomas pointed out. - Should we be concerned about single execution cases? For example, is the regression of speed on a simple SELECT x IN something we should try to defeat by only kicking in the optimization if we execute in a loop at least twice? That might be of particular interest to pl/pgsql. - Should we have a test for an operator with a non-strict function? I'm not aware of any built-in ops that have that characteristic; would you suggest just creating a fake one for the test? James
From cd760f5d333a2f1c7b871f05d4a231c0381c660d Mon Sep 17 00:00:00 2001 From: James Coleman <jtc...@gmail.com> Date: Mon, 27 Apr 2020 08:51:28 -0400 Subject: [PATCH v2 2/2] Try using dynahash --- src/backend/executor/execExpr.c | 29 ++-- src/backend/executor/execExprInterp.c | 197 ++++++++++++---------- src/include/executor/execExpr.h | 7 +- src/test/regress/expected/expressions.out | 7 + src/test/regress/sql/expressions.sql | 2 + 5 files changed, 138 insertions(+), 104 deletions(-) diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c index c202cc7e89..6249db5426 100644 --- a/src/backend/executor/execExpr.c +++ b/src/backend/executor/execExpr.c @@ -31,6 +31,7 @@ #include "postgres.h" #include "access/nbtree.h" +#include "access/hash.h" #include "catalog/objectaccess.h" #include "catalog/pg_type.h" #include "executor/execExpr.h" @@ -949,10 +950,13 @@ ExecInitExprRec(Expr *node, ExprState *state, { ScalarArrayOpExpr *opexpr = (ScalarArrayOpExpr *) node; Oid func; + Oid hash_func; Expr *scalararg; Expr *arrayarg; FmgrInfo *finfo; FunctionCallInfo fcinfo; + FmgrInfo *hash_finfo; + FunctionCallInfo hash_fcinfo; AclResult aclresult; bool useBinarySearch = false; @@ -983,11 +987,6 @@ ExecInitExprRec(Expr *node, ExprState *state, { Datum arrdatum = ((Const *) arrayarg)->constvalue; ArrayType *arr = (ArrayType *) DatumGetPointer(arrdatum); - Oid orderingOp; - Oid orderingFunc; - Oid opfamily; - Oid opcintype; - int16 strategy; int nitems; /* @@ -998,21 +997,24 @@ ExecInitExprRec(Expr *node, ExprState *state, if (nitems >= MIN_ARRAY_SIZE_FOR_BINARY_SEARCH) { /* - * Find the ordering op that matches the originally + * Find the hash op that matches the originally * planned equality op. */ - orderingOp = get_ordering_op_for_equality_op(opexpr->opno, NULL); - get_ordering_op_properties(orderingOp, &opfamily, &opcintype, &strategy); - orderingFunc = get_opfamily_proc(opfamily, opcintype, opcintype, BTORDER_PROC); + useBinarySearch = get_op_hash_functions(opexpr->opno, NULL, &hash_func); /* * But we may not have one, so fall back to the * default implementation if necessary. */ - if (OidIsValid(orderingFunc)) + if (useBinarySearch) { - useBinarySearch = true; - func = orderingFunc; + hash_finfo = palloc0(sizeof(FmgrInfo)); + hash_fcinfo = palloc0(SizeForFunctionCallInfo(2)); + fmgr_info(hash_func, hash_finfo); + fmgr_info_set_expr((Node *) node, hash_finfo); + InitFunctionCallInfoData(*hash_fcinfo, hash_finfo, 2, + opexpr->inputcollid, NULL, NULL); + InvokeFunctionExecuteHook(hash_func); } } } @@ -1042,6 +1044,9 @@ ExecInitExprRec(Expr *node, ExprState *state, scratch.d.scalararraybinsearchop.finfo = finfo; scratch.d.scalararraybinsearchop.fcinfo_data = fcinfo; scratch.d.scalararraybinsearchop.fn_addr = finfo->fn_addr; + scratch.d.scalararraybinsearchop.hash_finfo = hash_finfo; + scratch.d.scalararraybinsearchop.hash_fcinfo_data = hash_fcinfo; + scratch.d.scalararraybinsearchop.hash_fn_addr = hash_finfo->fn_addr; } else { diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c index 5bebafbf0c..e54e807c6b 100644 --- a/src/backend/executor/execExprInterp.c +++ b/src/backend/executor/execExprInterp.c @@ -178,8 +178,6 @@ ExecAggPlainTransByRef(AggState *aggstate, AggStatePerTrans pertrans, AggStatePerGroup pergroup, ExprContext *aggcontext, int setno); -static int compare_array_elements(const void *a, const void *b, void *arg); - /* * Prepare ExprState for interpreted execution. */ @@ -3593,6 +3591,51 @@ ExecEvalScalarArrayOp(ExprState *state, ExprEvalStep *op) *op->resnull = resultnull; } +static ExprEvalStep *current_saop_op; + +/* + * Hash function for elements. + * + * We use the element type's default hash opclass, and the column collation + * if the type is collation-sensitive. + */ +/* XXX: Name function to be specific to saop binsearch? */ +static uint32 +element_hash(const void *key, Size keysize) +{ + Datum hash; + FunctionCallInfo fcinfo = current_saop_op->d.scalararraybinsearchop.hash_fcinfo_data; + + fcinfo->args[0].value = *((const Datum *) key); + fcinfo->args[0].isnull = false; + + /* The keysize parameter is superfluous here */ + hash = current_saop_op->d.scalararraybinsearchop.hash_fn_addr(fcinfo); + + return DatumGetUInt32(hash); +} + +/* + * Matching function for elements, to be used in hashtable lookups. + */ +/* XXX: Name function to be specific to saop binsearch? */ +static int +element_match(const void *key1, const void *key2, Size keysize) +{ + Datum result; + FunctionCallInfo fcinfo = current_saop_op->d.scalararraybinsearchop.fcinfo_data; + + fcinfo->args[0].value = *((const Datum *) key1); + fcinfo->args[0].isnull = false; + fcinfo->args[1].value = *((const Datum *) key2); + fcinfo->args[1].isnull = false; + + /* The keysize parameter is superfluous here */ + result = current_saop_op->d.scalararraybinsearchop.fn_addr(fcinfo); + + return DatumGetBool(result) ? 0 : 1; +} + /* * Evaluate "scalar op ANY (const array)". * @@ -3613,16 +3656,19 @@ ExecEvalScalarArrayOpBinSearch(ExprState *state, ExprEvalStep *op, ExprContext * FunctionCallInfo fcinfo = op->d.scalararraybinsearchop.fcinfo_data; bool strictfunc = op->d.scalararraybinsearchop.finfo->fn_strict; ArrayType *arr; + Datum scalar = fcinfo->args[0].value; Datum result; bool resultnull; - bool *elem_nulls; - int l = 0, - r, - res; + bool hashfound; + int res; + + /* If we're only executing once, do we need a way to fall back to the regular loop? */ /* We don't setup a binary search op if the array const is null. */ Assert(!*op->resnull); + current_saop_op = op; + /* * If the scalar is NULL, and the function is strict, return NULL; no * point in executing the search. @@ -3634,104 +3680,88 @@ ExecEvalScalarArrayOpBinSearch(ExprState *state, ExprEvalStep *op, ExprContext * } /* Preprocess the array the first time we execute the op. */ - if (op->d.scalararraybinsearchop.elem_values == NULL) + if (op->d.scalararraybinsearchop.elements_tab == NULL) { - /* Cache the original lhs so we can scribble on it. */ - Datum scalar = fcinfo->args[0].value; - bool scalar_isnull = fcinfo->args[0].isnull; - int num_nonnulls = 0; - MemoryContext old_cxt; - MemoryContext array_cxt; int16 typlen; bool typbyval; char typalign; + HASHCTL elem_hash_ctl; + int nitems; + int num_nulls = 0; + char *s; + bits8 *bitmap; + int bitmask; arr = DatumGetArrayTypeP(*op->resvalue); + nitems = ArrayGetNItems(ARR_NDIM(arr), ARR_DIMS(arr)); get_typlenbyvalalign(ARR_ELEMTYPE(arr), &typlen, &typbyval, &typalign); - array_cxt = AllocSetContextCreate( - econtext->ecxt_per_query_memory, - "scalararraybinsearchop context", - ALLOCSET_SMALL_SIZES); - old_cxt = MemoryContextSwitchTo(array_cxt); + MemSet(&elem_hash_ctl, 0, sizeof(elem_hash_ctl)); + elem_hash_ctl.keysize = sizeof(Datum); + elem_hash_ctl.entrysize = sizeof(Datum); + elem_hash_ctl.hash = element_hash; + elem_hash_ctl.match = element_match; + elem_hash_ctl.hcxt = econtext->ecxt_per_query_memory; + op->d.scalararraybinsearchop.elements_tab = hash_create("Scalar array op expr elements", + nitems, + &elem_hash_ctl, + HASH_ELEM | HASH_FUNCTION | HASH_COMPARE | HASH_CONTEXT); + + s = (char *) ARR_DATA_PTR(arr); + bitmap = ARR_NULLBITMAP(arr); + bitmask = 1; + for (int i = 0; i < nitems; i++) + { + Datum element; + + /* Get array element, checking for NULL. */ + if (bitmap && (*bitmap & bitmask) == 0) + { + num_nulls++; + } + else + { + element = fetch_att(s, typbyval, typlen); + s = att_addlength_pointer(s, typlen, s); + s = (char *) att_align_nominal(s, typalign); - deconstruct_array(arr, - ARR_ELEMTYPE(arr), - typlen, typbyval, typalign, - &op->d.scalararraybinsearchop.elem_values, &elem_nulls, &op->d.scalararraybinsearchop.num_elems); + hash_search(op->d.scalararraybinsearchop.elements_tab, (const void *) &element, HASH_ENTER, NULL); + } - /* Remove null entries from the array. */ - for (int j = 0; j < op->d.scalararraybinsearchop.num_elems; j++) - { - if (!elem_nulls[j]) - op->d.scalararraybinsearchop.elem_values[num_nonnulls++] = op->d.scalararraybinsearchop.elem_values[j]; + /* Advance bitmap pointer if any. */ + if (bitmap) + { + bitmask <<= 1; + if (bitmask == 0x100) + { + bitmap++; + bitmask = 1; + } + } } /* * Remember if we had any nulls so that we know if we need to execute * non-strict functions with a null lhs value if no match is found. */ - op->d.scalararraybinsearchop.has_nulls = num_nonnulls < op->d.scalararraybinsearchop.num_elems; - op->d.scalararraybinsearchop.num_elems = num_nonnulls; + op->d.scalararraybinsearchop.has_nulls = num_nulls > 0; /* - * Setup the fcinfo for sorting. We've removed nulls, so both lhs and - * rhs values are guaranteed to be non-null. + * We only setup a binary search op if we have > 8 elements, so we don't + * need to worry about adding an optimization for the empty array case. */ - fcinfo->args[0].isnull = false; - fcinfo->args[1].isnull = false; - - /* Sort the array and remove duplicate elements. */ - qsort_arg(op->d.scalararraybinsearchop.elem_values, op->d.scalararraybinsearchop.num_elems, sizeof(Datum), - compare_array_elements, op); - op->d.scalararraybinsearchop.num_elems = qunique_arg(op->d.scalararraybinsearchop.elem_values, op->d.scalararraybinsearchop.num_elems, sizeof(Datum), - compare_array_elements, op); - - /* Restore the lhs value after we scribbed on it for sorting. */ - fcinfo->args[0].value = scalar; - fcinfo->args[0].isnull = scalar_isnull; - - MemoryContextSwitchTo(old_cxt); + Assert(nitems > 0); } - /* - * We only setup a binary search op if we have > 8 elements, so we don't - * need to worry about adding an optimization for the empty array case. - */ - Assert(!(op->d.scalararraybinsearchop.num_elems <= 0 && !op->d.scalararraybinsearchop.has_nulls)); - - /* Assume no match will be found until proven otherwise. */ - result = BoolGetDatum(false); + /* Check the hash to see if we have a match. */ + hash_search(op->d.scalararraybinsearchop.elements_tab, (const void *) &scalar, HASH_FIND, &hashfound); + result = BoolGetDatum(hashfound); resultnull = false; - /* Binary search through the array. */ - r = op->d.scalararraybinsearchop.num_elems - 1; - while (l <= r) - { - int i = (l + r) / 2; - - fcinfo->args[1].value = op->d.scalararraybinsearchop.elem_values[i]; - - /* Call comparison function */ - fcinfo->isnull = false; - res = DatumGetInt32(op->d.scalararraybinsearchop.fn_addr(fcinfo)); - - if (res == 0) - { - result = BoolGetDatum(true); - resultnull = false; - break; - } - else if (res > 0) - l = i + 1; - else - r = i - 1; - } - /* * If we didn't find a match in the array, we still might need to handle * the possibility of null values (we've previously removed them from the @@ -3761,19 +3791,6 @@ ExecEvalScalarArrayOpBinSearch(ExprState *state, ExprEvalStep *op, ExprContext * *op->resnull = resultnull; } -/* XXX: Name function to be specific to saop binsearch? */ -static int -compare_array_elements(const void *a, const void *b, void *arg) -{ - ExprEvalStep *op = (ExprEvalStep *) arg; - FunctionCallInfo fcinfo = op->d.scalararraybinsearchop.fcinfo_data; - - fcinfo->args[0].value = *((const Datum *) a); - fcinfo->args[1].value = *((const Datum *) b); - - return DatumGetInt32(op->d.scalararraybinsearchop.fn_addr(fcinfo)); -} - /* * Evaluate a NOT NULL domain constraint. */ diff --git a/src/include/executor/execExpr.h b/src/include/executor/execExpr.h index ac4478d060..2e93b1f990 100644 --- a/src/include/executor/execExpr.h +++ b/src/include/executor/execExpr.h @@ -554,13 +554,16 @@ typedef struct ExprEvalStep /* for EEOP_SCALARARRAYOP_BINSEARCH */ struct { - int num_elems; bool has_nulls; - Datum *elem_values; + HTAB *elements_tab; FmgrInfo *finfo; /* function's lookup data */ FunctionCallInfo fcinfo_data; /* arguments etc */ /* faster to access without additional indirection: */ PGFunction fn_addr; /* actual call address */ + FmgrInfo *hash_finfo; /* function's lookup data */ + FunctionCallInfo hash_fcinfo_data; /* arguments etc */ + /* faster to access without additional indirection: */ + PGFunction hash_fn_addr; /* actual call address */ } scalararraybinsearchop; /* for EEOP_XMLEXPR */ diff --git a/src/test/regress/expected/expressions.out b/src/test/regress/expected/expressions.out index 55b57b9c59..f37dfe1ce2 100644 --- a/src/test/regress/expected/expressions.out +++ b/src/test/regress/expected/expressions.out @@ -197,3 +197,10 @@ select null::int in (10, 9, 2, 8, 3, 7, 4, 6, 5, null); (1 row) +select 'a' in ('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'); + ?column? +---------- + t +(1 row) + +-- TODO: test non-strict op? diff --git a/src/test/regress/sql/expressions.sql b/src/test/regress/sql/expressions.sql index 3cb850d838..c30fe66c5e 100644 --- a/src/test/regress/sql/expressions.sql +++ b/src/test/regress/sql/expressions.sql @@ -76,3 +76,5 @@ select 1 in (null, null, null, null, null, null, null, null, null, null, null); select 1 in (10, 9, 2, 8, 3, 7, 4, 6, 5, 1, null); select null::int in (10, 9, 2, 8, 3, 7, 4, 6, 5, 1); select null::int in (10, 9, 2, 8, 3, 7, 4, 6, 5, null); +select 'a' in ('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'); +-- TODO: test non-strict op? -- 2.17.1
From 08742543d7865d5f25c24c26bf1014924035c9eb Mon Sep 17 00:00:00 2001 From: jcoleman <jtc...@gmail.com> Date: Fri, 10 Apr 2020 21:40:50 +0000 Subject: [PATCH v2 1/2] Binary search const arrays in OR'd ScalarArrayOps Currently all scalar array op expressions execute as a linear search through the array argument. However when OR semantics are desired it's possible to instead use a binary search. Here we apply that optimization to constant arrays (so we don't need to worry about teaching expression execution when params change) of at least length 9 (since very short arrays average to the same number of comparisons for linear searches and thus avoid the preprocessing necessary for a binary search). --- src/backend/executor/execExpr.c | 79 +++++++-- src/backend/executor/execExprInterp.c | 193 ++++++++++++++++++++++ src/include/executor/execExpr.h | 14 ++ src/test/regress/expected/expressions.out | 39 +++++ src/test/regress/sql/expressions.sql | 11 ++ 5 files changed, 326 insertions(+), 10 deletions(-) diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c index c6a77bd66f..c202cc7e89 100644 --- a/src/backend/executor/execExpr.c +++ b/src/backend/executor/execExpr.c @@ -49,6 +49,7 @@ #include "utils/lsyscache.h" #include "utils/typcache.h" +#define MIN_ARRAY_SIZE_FOR_BINARY_SEARCH 9 typedef struct LastAttnumInfo { @@ -947,11 +948,13 @@ ExecInitExprRec(Expr *node, ExprState *state, case T_ScalarArrayOpExpr: { ScalarArrayOpExpr *opexpr = (ScalarArrayOpExpr *) node; + Oid func; Expr *scalararg; Expr *arrayarg; FmgrInfo *finfo; FunctionCallInfo fcinfo; AclResult aclresult; + bool useBinarySearch = false; Assert(list_length(opexpr->args) == 2); scalararg = (Expr *) linitial(opexpr->args); @@ -964,12 +967,58 @@ ExecInitExprRec(Expr *node, ExprState *state, if (aclresult != ACLCHECK_OK) aclcheck_error(aclresult, OBJECT_FUNCTION, get_func_name(opexpr->opfuncid)); - InvokeFunctionExecuteHook(opexpr->opfuncid); /* Set up the primary fmgr lookup information */ finfo = palloc0(sizeof(FmgrInfo)); fcinfo = palloc0(SizeForFunctionCallInfo(2)); - fmgr_info(opexpr->opfuncid, finfo); + func = opexpr->opfuncid; + + /* + * If we have a constant array and want OR semantics, then we + * can use a binary search to implement the op instead of + * looping through the entire array for each execution. + */ + if (opexpr->useOr && arrayarg && IsA(arrayarg, Const) && + !((Const *) arrayarg)->constisnull) + { + Datum arrdatum = ((Const *) arrayarg)->constvalue; + ArrayType *arr = (ArrayType *) DatumGetPointer(arrdatum); + Oid orderingOp; + Oid orderingFunc; + Oid opfamily; + Oid opcintype; + int16 strategy; + int nitems; + + /* + * Only do the optimization if we have a large enough + * array to make it worth it. + */ + nitems = ArrayGetNItems(ARR_NDIM(arr), ARR_DIMS(arr)); + if (nitems >= MIN_ARRAY_SIZE_FOR_BINARY_SEARCH) + { + /* + * Find the ordering op that matches the originally + * planned equality op. + */ + orderingOp = get_ordering_op_for_equality_op(opexpr->opno, NULL); + get_ordering_op_properties(orderingOp, &opfamily, &opcintype, &strategy); + orderingFunc = get_opfamily_proc(opfamily, opcintype, opcintype, BTORDER_PROC); + + /* + * But we may not have one, so fall back to the + * default implementation if necessary. + */ + if (OidIsValid(orderingFunc)) + { + useBinarySearch = true; + func = orderingFunc; + } + } + } + + InvokeFunctionExecuteHook(func); + fmgr_info(func, finfo); fmgr_info_set_expr((Node *) node, finfo); InitFunctionCallInfoData(*fcinfo, finfo, 2, opexpr->inputcollid, NULL, NULL); @@ -981,18 +1030,28 @@ ExecInitExprRec(Expr *node, ExprState *state, /* * Evaluate array argument into our return value. There's no * danger in that, because the return value is guaranteed to - * be overwritten by EEOP_SCALARARRAYOP, and will not be - * passed to any other expression. + * be overwritten by EEOP_SCALARARRAYOP[_BINSEARCH], and will + * not be passed to any other expression. */ ExecInitExprRec(arrayarg, state, resv, resnull); /* And perform the operation */ - scratch.opcode = EEOP_SCALARARRAYOP; - scratch.d.scalararrayop.element_type = InvalidOid; - scratch.d.scalararrayop.useOr = opexpr->useOr; - scratch.d.scalararrayop.finfo = finfo; - scratch.d.scalararrayop.fcinfo_data = fcinfo; - scratch.d.scalararrayop.fn_addr = finfo->fn_addr; + if (useBinarySearch) + { + scratch.opcode = EEOP_SCALARARRAYOP_BINSEARCH; + scratch.d.scalararraybinsearchop.finfo = finfo; + scratch.d.scalararraybinsearchop.fcinfo_data = fcinfo; + scratch.d.scalararraybinsearchop.fn_addr = finfo->fn_addr; + } + else + { + scratch.opcode = EEOP_SCALARARRAYOP; + scratch.d.scalararrayop.element_type = InvalidOid; + scratch.d.scalararrayop.useOr = opexpr->useOr; + scratch.d.scalararrayop.finfo = finfo; + scratch.d.scalararrayop.fcinfo_data = fcinfo; + scratch.d.scalararrayop.fn_addr = finfo->fn_addr; + } ExprEvalPushStep(state, &scratch); break; } diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c index 113ed1547c..5bebafbf0c 100644 --- a/src/backend/executor/execExprInterp.c +++ b/src/backend/executor/execExprInterp.c @@ -76,6 +76,7 @@ #include "utils/timestamp.h" #include "utils/typcache.h" #include "utils/xml.h" +#include "lib/qunique.h" /* * Use computed-goto-based opcode dispatch when computed gotos are available. @@ -177,6 +178,8 @@ ExecAggPlainTransByRef(AggState *aggstate, AggStatePerTrans pertrans, AggStatePerGroup pergroup, ExprContext *aggcontext, int setno); +static int compare_array_elements(const void *a, const void *b, void *arg); + /* * Prepare ExprState for interpreted execution. */ @@ -425,6 +428,7 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull) &&CASE_EEOP_DOMAIN_CHECK, &&CASE_EEOP_CONVERT_ROWTYPE, &&CASE_EEOP_SCALARARRAYOP, + &&CASE_EEOP_SCALARARRAYOP_BINSEARCH, &&CASE_EEOP_XMLEXPR, &&CASE_EEOP_AGGREF, &&CASE_EEOP_GROUPING_FUNC, @@ -1464,6 +1468,14 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull) EEO_NEXT(); } + EEO_CASE(EEOP_SCALARARRAYOP_BINSEARCH) + { + /* too complex for an inline implementation */ + ExecEvalScalarArrayOpBinSearch(state, op, econtext); + + EEO_NEXT(); + } + EEO_CASE(EEOP_DOMAIN_NOTNULL) { /* too complex for an inline implementation */ @@ -3581,6 +3593,187 @@ ExecEvalScalarArrayOp(ExprState *state, ExprEvalStep *op) *op->resnull = resultnull; } +/* + * Evaluate "scalar op ANY (const array)". + * + * This is an optimized version of ExecEvalScalarArrayOp that only supports + * ANY (i.e., OR semantics) because it binary searches through the array for a + * match after sorting the array and removing null and duplicate entries. + * + * Source array is in our result area, scalar arg is already evaluated into + * fcinfo->args[0]. + * + * The operator always yields boolean, and we combine the results across all + * array elements using OR. Of course we short-circuit as soon as the result + * is known. + */ +void +ExecEvalScalarArrayOpBinSearch(ExprState *state, ExprEvalStep *op, ExprContext *econtext) +{ + FunctionCallInfo fcinfo = op->d.scalararraybinsearchop.fcinfo_data; + bool strictfunc = op->d.scalararraybinsearchop.finfo->fn_strict; + ArrayType *arr; + Datum result; + bool resultnull; + bool *elem_nulls; + int l = 0, + r, + res; + + /* We don't setup a binary search op if the array const is null. */ + Assert(!*op->resnull); + + /* + * If the scalar is NULL, and the function is strict, return NULL; no + * point in executing the search. + */ + if (fcinfo->args[0].isnull && strictfunc) + { + *op->resnull = true; + return; + } + + /* Preprocess the array the first time we execute the op. */ + if (op->d.scalararraybinsearchop.elem_values == NULL) + { + /* Cache the original lhs so we can scribble on it. */ + Datum scalar = fcinfo->args[0].value; + bool scalar_isnull = fcinfo->args[0].isnull; + int num_nonnulls = 0; + MemoryContext old_cxt; + MemoryContext array_cxt; + int16 typlen; + bool typbyval; + char typalign; + + arr = DatumGetArrayTypeP(*op->resvalue); + + get_typlenbyvalalign(ARR_ELEMTYPE(arr), + &typlen, + &typbyval, + &typalign); + + array_cxt = AllocSetContextCreate( + econtext->ecxt_per_query_memory, + "scalararraybinsearchop context", + ALLOCSET_SMALL_SIZES); + old_cxt = MemoryContextSwitchTo(array_cxt); + + deconstruct_array(arr, + ARR_ELEMTYPE(arr), + typlen, typbyval, typalign, + &op->d.scalararraybinsearchop.elem_values, &elem_nulls, &op->d.scalararraybinsearchop.num_elems); + + /* Remove null entries from the array. */ + for (int j = 0; j < op->d.scalararraybinsearchop.num_elems; j++) + { + if (!elem_nulls[j]) + op->d.scalararraybinsearchop.elem_values[num_nonnulls++] = op->d.scalararraybinsearchop.elem_values[j]; + } + + /* + * Remember if we had any nulls so that we know if we need to execute + * non-strict functions with a null lhs value if no match is found. + */ + op->d.scalararraybinsearchop.has_nulls = num_nonnulls < op->d.scalararraybinsearchop.num_elems; + op->d.scalararraybinsearchop.num_elems = num_nonnulls; + + /* + * Setup the fcinfo for sorting. We've removed nulls, so both lhs and + * rhs values are guaranteed to be non-null. + */ + fcinfo->args[0].isnull = false; + fcinfo->args[1].isnull = false; + + /* Sort the array and remove duplicate elements. */ + qsort_arg(op->d.scalararraybinsearchop.elem_values, op->d.scalararraybinsearchop.num_elems, sizeof(Datum), + compare_array_elements, op); + op->d.scalararraybinsearchop.num_elems = qunique_arg(op->d.scalararraybinsearchop.elem_values, op->d.scalararraybinsearchop.num_elems, sizeof(Datum), + compare_array_elements, op); + + /* Restore the lhs value after we scribbed on it for sorting. */ + fcinfo->args[0].value = scalar; + fcinfo->args[0].isnull = scalar_isnull; + + MemoryContextSwitchTo(old_cxt); + } + + /* + * We only setup a binary search op if we have > 8 elements, so we don't + * need to worry about adding an optimization for the empty array case. + */ + Assert(!(op->d.scalararraybinsearchop.num_elems <= 0 && !op->d.scalararraybinsearchop.has_nulls)); + + /* Assume no match will be found until proven otherwise. */ + result = BoolGetDatum(false); + resultnull = false; + + /* Binary search through the array. */ + r = op->d.scalararraybinsearchop.num_elems - 1; + while (l <= r) + { + int i = (l + r) / 2; + + fcinfo->args[1].value = op->d.scalararraybinsearchop.elem_values[i]; + + /* Call comparison function */ + fcinfo->isnull = false; + res = DatumGetInt32(op->d.scalararraybinsearchop.fn_addr(fcinfo)); + + if (res == 0) + { + result = BoolGetDatum(true); + resultnull = false; + break; + } + else if (res > 0) + l = i + 1; + else + r = i - 1; + } + + /* + * If we didn't find a match in the array, we still might need to handle + * the possibility of null values (we've previously removed them from the + * array). + */ + if (!DatumGetBool(result) && op->d.scalararraybinsearchop.has_nulls) + { + if (strictfunc) + { + /* Had nulls, so strict function implies null. */ + result = (Datum) 0; + resultnull = true; + } + else + { + /* Execute function will null rhs just once. */ + fcinfo->args[1].value = (Datum) 0; + fcinfo->args[1].isnull = true; + + res = DatumGetInt32(op->d.scalararraybinsearchop.fn_addr(fcinfo)); + result = BoolGetDatum(res == 0); + resultnull = fcinfo->isnull; + } + } + + *op->resvalue = result; + *op->resnull = resultnull; +} + +/* XXX: Name function to be specific to saop binsearch? */ +static int +compare_array_elements(const void *a, const void *b, void *arg) +{ + ExprEvalStep *op = (ExprEvalStep *) arg; + FunctionCallInfo fcinfo = op->d.scalararraybinsearchop.fcinfo_data; + + fcinfo->args[0].value = *((const Datum *) a); + fcinfo->args[1].value = *((const Datum *) b); + + return DatumGetInt32(op->d.scalararraybinsearchop.fn_addr(fcinfo)); +} + /* * Evaluate a NOT NULL domain constraint. */ diff --git a/src/include/executor/execExpr.h b/src/include/executor/execExpr.h index dbe8649a57..ac4478d060 100644 --- a/src/include/executor/execExpr.h +++ b/src/include/executor/execExpr.h @@ -213,6 +213,7 @@ typedef enum ExprEvalOp /* evaluate assorted special-purpose expression types */ EEOP_CONVERT_ROWTYPE, EEOP_SCALARARRAYOP, + EEOP_SCALARARRAYOP_BINSEARCH, EEOP_XMLEXPR, EEOP_AGGREF, EEOP_GROUPING_FUNC, @@ -550,6 +551,18 @@ typedef struct ExprEvalStep PGFunction fn_addr; /* actual call address */ } scalararrayop; + /* for EEOP_SCALARARRAYOP_BINSEARCH */ + struct + { + int num_elems; + bool has_nulls; + Datum *elem_values; + FmgrInfo *finfo; /* function's lookup data */ + FunctionCallInfo fcinfo_data; /* arguments etc */ + /* faster to access without additional indirection: */ + PGFunction fn_addr; /* actual call address */ + } scalararraybinsearchop; + /* for EEOP_XMLEXPR */ struct { @@ -728,6 +741,7 @@ extern void ExecEvalSubscriptingRefAssign(ExprState *state, ExprEvalStep *op); extern void ExecEvalConvertRowtype(ExprState *state, ExprEvalStep *op, ExprContext *econtext); extern void ExecEvalScalarArrayOp(ExprState *state, ExprEvalStep *op); +extern void ExecEvalScalarArrayOpBinSearch(ExprState *state, ExprEvalStep *op, ExprContext *econtext); extern void ExecEvalConstraintNotNull(ExprState *state, ExprEvalStep *op); extern void ExecEvalConstraintCheck(ExprState *state, ExprEvalStep *op); extern void ExecEvalXmlExpr(ExprState *state, ExprEvalStep *op); diff --git a/src/test/regress/expected/expressions.out b/src/test/regress/expected/expressions.out index 4f4deaec22..55b57b9c59 100644 --- a/src/test/regress/expected/expressions.out +++ b/src/test/regress/expected/expressions.out @@ -158,3 +158,42 @@ select count(*) from date_tbl 12 (1 row) +-- +-- Tests for ScalarArrayOpExpr binary search optimization +-- +select 1 in (10, 9, 2, 8, 3, 7, 4, 6, 5, 1); + ?column? +---------- + t +(1 row) + +select 1 in (10, 9, 2, 8, 3, 7, 4, 6, 5, null); + ?column? +---------- + +(1 row) + +select 1 in (null, null, null, null, null, null, null, null, null, null, null); + ?column? +---------- + +(1 row) + +select 1 in (10, 9, 2, 8, 3, 7, 4, 6, 5, 1, null); + ?column? +---------- + t +(1 row) + +select null::int in (10, 9, 2, 8, 3, 7, 4, 6, 5, 1); + ?column? +---------- + +(1 row) + +select null::int in (10, 9, 2, 8, 3, 7, 4, 6, 5, null); + ?column? +---------- + +(1 row) + diff --git a/src/test/regress/sql/expressions.sql b/src/test/regress/sql/expressions.sql index 1ca8bb151c..3cb850d838 100644 --- a/src/test/regress/sql/expressions.sql +++ b/src/test/regress/sql/expressions.sql @@ -65,3 +65,14 @@ select count(*) from date_tbl where f1 not between symmetric '1997-01-01' and '1998-01-01'; select count(*) from date_tbl where f1 not between symmetric '1997-01-01' and '1998-01-01'; + +-- +-- Tests for ScalarArrayOpExpr binary search optimization +-- + +select 1 in (10, 9, 2, 8, 3, 7, 4, 6, 5, 1); +select 1 in (10, 9, 2, 8, 3, 7, 4, 6, 5, null); +select 1 in (null, null, null, null, null, null, null, null, null, null, null); +select 1 in (10, 9, 2, 8, 3, 7, 4, 6, 5, 1, null); +select null::int in (10, 9, 2, 8, 3, 7, 4, 6, 5, 1); +select null::int in (10, 9, 2, 8, 3, 7, 4, 6, 5, null); -- 2.17.1