On Sun, Apr 26, 2020 at 7:41 PM James Coleman <jtc...@gmail.com> wrote:
>
> On Sun, Apr 26, 2020 at 4:49 PM Tomas Vondra
> <tomas.von...@2ndquadrant.com> wrote:
> >
> > On Sun, Apr 26, 2020 at 02:46:19PM -0400, James Coleman wrote:
> > >On Sat, Apr 25, 2020 at 8:31 PM Tomas Vondra
> > ><tomas.von...@2ndquadrant.com> wrote:
> > >>
> > >> On Sat, Apr 25, 2020 at 06:47:41PM -0400, James Coleman wrote:
> > >> >On Sat, Apr 25, 2020 at 5:41 PM David Rowley <dgrowle...@gmail.com> 
> > >> >wrote:
> > >> >>
> > >> >> On Sun, 26 Apr 2020 at 00:40, Tomas Vondra 
> > >> >> <tomas.von...@2ndquadrant.com> wrote:
> > >> >> > This reminds me our attempts to add bloom filters to hash joins, 
> > >> >> > which I
> > >> >> > think ran into mostly the same challenge of deciding when the bloom
> > >> >> > filter can be useful and is worth the extra work.
> > >> >>
> > >> >> Speaking of that, it would be interesting to see how a test where you
> > >> >> write the query as IN(VALUES(...)) instead of IN() compares. It would
> > >> >> be interesting to know if the planner is able to make a more suitable
> > >> >> choice and also to see how all the work over the years to improve Hash
> > >> >> Joins compares to the bsearch with and without the bloom filter.
> > >> >
> > >> >It would be interesting.
> > >> >
> > >> >It also makes one wonder about optimizing these into to hash
> > >> >joins...which I'd thought about over at [1]. I think it'd be a very
> > >> >significant effort though.
> > >> >
> > >>
> > >> I modified the script to also do the join version of the query. I can
> > >> only run it on my laptop at the moment, so the results may be a bit
> > >> different from those I shared before, but it's interesting I think.
> > >>
> > >> In most cases it's comparable to the binsearch/bloom approach, and in
> > >> some cases it actually beats them quite significantly. It seems to
> > >> depend on how expensive the comparison is - for "int" the comparison is
> > >> very cheap and there's almost no difference. For "text" the comparison
> > >> is much more expensive, and there are significant speedups.
> > >>
> > >> For example the test with 100k lookups in array of 10k elements and 10%
> > >> match probability, the timings are these
> > >>
> > >>    master:  62362 ms
> > >>    binsearch: 201 ms
> > >>    bloom:      65 ms
> > >>    hashjoin:   36 ms
> > >>
> > >> I do think the explanation is fairly simple - the bloom filter
> > >> eliminates about 90% of the expensive comparisons, so it's 20ms plus
> > >> some overhead to build and check the bits. The hash join probably
> > >> eliminates a lot of the remaining comparisons, because the hash table
> > >> is sized to have one tuple per bucket.
> > >>
> > >> Note: I also don't claim the PoC has the most efficient bloom filter
> > >> implementation possible. I'm sure it could be made faster.
> > >>
> > >> Anyway, I'm not sure transforming this to a hash join is worth the
> > >> effort - I agree that seems quite complex. But perhaps this suggest we
> > >> should not be doing binary search and instead just build a simple hash
> > >> table - that seems much simpler, and it'll probably give us about the
> > >> same benefits.
> > >
> > >That's actually what I originally thought about doing, but I chose
> > >binary search since it seemed a lot easier to get off the ground.
> > >
> >
> > OK, that makes perfect sense.
> >
> > >If we instead build a hash is there anything else we need to be
> > >concerned about? For example, work mem? I suppose for the binary
> > >search we already have to expand the array, so perhaps it's not all
> > >that meaningful relative to that...
> > >
> >
> > I don't think we need to be particularly concerned about work_mem. We
> > don't care about it now, and it's not clear to me what we could do about
> > it - we already have the array in memory anyway, so it's a bit futile.
> > Furthermore, if we need to care about it, it probably applies to the
> > binary search too.
> >
> > >I was looking earlier at what our standard hash implementation was,
> > >and it seemed less obvious what was needed to set that up (so binary
> > >search seemed a faster proof of concept). If you happen to have any
> > >pointers to similar usages I should look at, please let me know.
> > >
> >
> > I think the hash join implementation is far too complicated. It has to
> > care about work_mem, so it implements batching, etc. That's a lot of
> > complexity we don't need here. IMO we could use either the usual
> > dynahash, or maybe even the simpler simplehash.
> >
> > FWIW it'd be good to verify the numbers I shared, i.e. checking that the
> > benchmarks makes sense and running it independently. I'm not aware of
> > any issues but it was done late at night and only ran on my laptop.
>
> Some quick calculations (don't have the scripting in a form I can
> attach yet; using this as an opportunity to hack on a genericized
> performance testing framework of sorts) suggest your results are
> correct. I was also testing on my laptop, but I showed 1.) roughly
> equivalent results for IN (VALUES ...) and IN (<list>) for integers,
> but when I switch to (short; average 3 characters long) text values I
> show the hash join on VALUES is twice as fast as the binary search.
>
> Given that, I'm planning to implement this as a hash lookup and share
> a revised patch.

I've attached a patch series as before, but with an additional patch
that switches to using dynahash instead of binary search.

Whereas before the benchmarking ended up with a trimodal distribution
(i.e., master with IN <list>, patch with IN <list>, and either with IN
VALUES), the hash implementation brings us back to an effectively
bimodal distribution -- though the hash scalar array op expression
implementation for text is about 5% faster than the hash join.

Current outstanding thoughts (besides comment/naming cleanup):

- The saop costing needs to be updated to match, as Tomas pointed out.

- Should we be concerned about single execution cases? For example, is
the regression of speed on a simple SELECT x IN something we should
try to defeat by only kicking in the optimization if we execute in a
loop at least twice? That might be of particular interest to pl/pgsql.

- Should we have a test for an operator with a non-strict function?
I'm not aware of any built-in ops that have that characteristic; would
you suggest just creating a fake one for the test?

James
From cd760f5d333a2f1c7b871f05d4a231c0381c660d Mon Sep 17 00:00:00 2001
From: James Coleman <jtc...@gmail.com>
Date: Mon, 27 Apr 2020 08:51:28 -0400
Subject: [PATCH v2 2/2] Try using dynahash

---
 src/backend/executor/execExpr.c           |  29 ++--
 src/backend/executor/execExprInterp.c     | 197 ++++++++++++----------
 src/include/executor/execExpr.h           |   7 +-
 src/test/regress/expected/expressions.out |   7 +
 src/test/regress/sql/expressions.sql      |   2 +
 5 files changed, 138 insertions(+), 104 deletions(-)

diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index c202cc7e89..6249db5426 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -31,6 +31,7 @@
 #include "postgres.h"
 
 #include "access/nbtree.h"
+#include "access/hash.h"
 #include "catalog/objectaccess.h"
 #include "catalog/pg_type.h"
 #include "executor/execExpr.h"
@@ -949,10 +950,13 @@ ExecInitExprRec(Expr *node, ExprState *state,
 			{
 				ScalarArrayOpExpr *opexpr = (ScalarArrayOpExpr *) node;
 				Oid			func;
+				Oid			hash_func;
 				Expr	   *scalararg;
 				Expr	   *arrayarg;
 				FmgrInfo   *finfo;
 				FunctionCallInfo fcinfo;
+				FmgrInfo   *hash_finfo;
+				FunctionCallInfo hash_fcinfo;
 				AclResult	aclresult;
 				bool		useBinarySearch = false;
 
@@ -983,11 +987,6 @@ ExecInitExprRec(Expr *node, ExprState *state,
 				{
 					Datum		arrdatum = ((Const *) arrayarg)->constvalue;
 					ArrayType  *arr = (ArrayType *) DatumGetPointer(arrdatum);
-					Oid			orderingOp;
-					Oid			orderingFunc;
-					Oid			opfamily;
-					Oid			opcintype;
-					int16		strategy;
 					int			nitems;
 
 					/*
@@ -998,21 +997,24 @@ ExecInitExprRec(Expr *node, ExprState *state,
 					if (nitems >= MIN_ARRAY_SIZE_FOR_BINARY_SEARCH)
 					{
 						/*
-						 * Find the ordering op that matches the originally
+						 * Find the hash op that matches the originally
 						 * planned equality op.
 						 */
-						orderingOp = get_ordering_op_for_equality_op(opexpr->opno, NULL);
-						get_ordering_op_properties(orderingOp, &opfamily, &opcintype, &strategy);
-						orderingFunc = get_opfamily_proc(opfamily, opcintype, opcintype, BTORDER_PROC);
+						useBinarySearch = get_op_hash_functions(opexpr->opno, NULL, &hash_func);
 
 						/*
 						 * But we may not have one, so fall back to the
 						 * default implementation if necessary.
 						 */
-						if (OidIsValid(orderingFunc))
+						if (useBinarySearch)
 						{
-							useBinarySearch = true;
-							func = orderingFunc;
+							hash_finfo = palloc0(sizeof(FmgrInfo));
+							hash_fcinfo = palloc0(SizeForFunctionCallInfo(2));
+							fmgr_info(hash_func, hash_finfo);
+							fmgr_info_set_expr((Node *) node, hash_finfo);
+							InitFunctionCallInfoData(*hash_fcinfo, hash_finfo, 2,
+													 opexpr->inputcollid, NULL, NULL);
+							InvokeFunctionExecuteHook(hash_func);
 						}
 					}
 				}
@@ -1042,6 +1044,9 @@ ExecInitExprRec(Expr *node, ExprState *state,
 					scratch.d.scalararraybinsearchop.finfo = finfo;
 					scratch.d.scalararraybinsearchop.fcinfo_data = fcinfo;
 					scratch.d.scalararraybinsearchop.fn_addr = finfo->fn_addr;
+					scratch.d.scalararraybinsearchop.hash_finfo = hash_finfo;
+					scratch.d.scalararraybinsearchop.hash_fcinfo_data = hash_fcinfo;
+					scratch.d.scalararraybinsearchop.hash_fn_addr = hash_finfo->fn_addr;
 				}
 				else
 				{
diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
index 5bebafbf0c..e54e807c6b 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -178,8 +178,6 @@ ExecAggPlainTransByRef(AggState *aggstate, AggStatePerTrans pertrans,
 					   AggStatePerGroup pergroup,
 					   ExprContext *aggcontext, int setno);
 
-static int	compare_array_elements(const void *a, const void *b, void *arg);
-
 /*
  * Prepare ExprState for interpreted execution.
  */
@@ -3593,6 +3591,51 @@ ExecEvalScalarArrayOp(ExprState *state, ExprEvalStep *op)
 	*op->resnull = resultnull;
 }
 
+static ExprEvalStep *current_saop_op;
+
+/*
+ * Hash function for elements.
+ *
+ * We use the element type's default hash opclass, and the column collation
+ * if the type is collation-sensitive.
+ */
+/* XXX: Name function to be specific to saop binsearch? */
+static uint32
+element_hash(const void *key, Size keysize)
+{
+	Datum hash;
+	FunctionCallInfo fcinfo = current_saop_op->d.scalararraybinsearchop.hash_fcinfo_data;
+
+	fcinfo->args[0].value = *((const Datum *) key);
+	fcinfo->args[0].isnull = false;
+
+	/* The keysize parameter is superfluous here */
+	hash = current_saop_op->d.scalararraybinsearchop.hash_fn_addr(fcinfo);
+
+	return DatumGetUInt32(hash);
+}
+
+/*
+ * Matching function for elements, to be used in hashtable lookups.
+ */
+/* XXX: Name function to be specific to saop binsearch? */
+static int
+element_match(const void *key1, const void *key2, Size keysize)
+{
+	Datum result;
+	FunctionCallInfo fcinfo = current_saop_op->d.scalararraybinsearchop.fcinfo_data;
+
+	fcinfo->args[0].value = *((const Datum *) key1);
+	fcinfo->args[0].isnull = false;
+	fcinfo->args[1].value = *((const Datum *) key2);
+	fcinfo->args[1].isnull = false;
+
+	/* The keysize parameter is superfluous here */
+	result = current_saop_op->d.scalararraybinsearchop.fn_addr(fcinfo);
+
+	return DatumGetBool(result) ? 0 : 1;
+}
+
 /*
  * Evaluate "scalar op ANY (const array)".
  *
@@ -3613,16 +3656,19 @@ ExecEvalScalarArrayOpBinSearch(ExprState *state, ExprEvalStep *op, ExprContext *
 	FunctionCallInfo fcinfo = op->d.scalararraybinsearchop.fcinfo_data;
 	bool		strictfunc = op->d.scalararraybinsearchop.finfo->fn_strict;
 	ArrayType  *arr;
+	Datum		scalar = fcinfo->args[0].value;
 	Datum		result;
 	bool		resultnull;
-	bool	   *elem_nulls;
-	int			l = 0,
-				r,
-				res;
+	bool		hashfound;
+	int			res;
+
+	/* If we're only executing once, do we need a way to fall back to the regular loop? */
 
 	/* We don't setup a binary search op if the array const is null. */
 	Assert(!*op->resnull);
 
+	current_saop_op = op;
+
 	/*
 	 * If the scalar is NULL, and the function is strict, return NULL; no
 	 * point in executing the search.
@@ -3634,104 +3680,88 @@ ExecEvalScalarArrayOpBinSearch(ExprState *state, ExprEvalStep *op, ExprContext *
 	}
 
 	/* Preprocess the array the first time we execute the op. */
-	if (op->d.scalararraybinsearchop.elem_values == NULL)
+	if (op->d.scalararraybinsearchop.elements_tab == NULL)
 	{
-		/* Cache the original lhs so we can scribble on it. */
-		Datum		scalar = fcinfo->args[0].value;
-		bool		scalar_isnull = fcinfo->args[0].isnull;
-		int			num_nonnulls = 0;
-		MemoryContext old_cxt;
-		MemoryContext array_cxt;
 		int16		typlen;
 		bool		typbyval;
 		char		typalign;
+		HASHCTL		elem_hash_ctl;
+		int			nitems;
+		int			num_nulls = 0;
+		char	   *s;
+		bits8	   *bitmap;
+		int			bitmask;
 
 		arr = DatumGetArrayTypeP(*op->resvalue);
+		nitems = ArrayGetNItems(ARR_NDIM(arr), ARR_DIMS(arr));
 
 		get_typlenbyvalalign(ARR_ELEMTYPE(arr),
 							 &typlen,
 							 &typbyval,
 							 &typalign);
 
-		array_cxt = AllocSetContextCreate(
-										  econtext->ecxt_per_query_memory,
-										  "scalararraybinsearchop context",
-										  ALLOCSET_SMALL_SIZES);
-		old_cxt = MemoryContextSwitchTo(array_cxt);
+		MemSet(&elem_hash_ctl, 0, sizeof(elem_hash_ctl));
+		elem_hash_ctl.keysize = sizeof(Datum);
+		elem_hash_ctl.entrysize = sizeof(Datum);
+		elem_hash_ctl.hash = element_hash;
+		elem_hash_ctl.match = element_match;
+		elem_hash_ctl.hcxt = econtext->ecxt_per_query_memory;
+		op->d.scalararraybinsearchop.elements_tab = hash_create("Scalar array op expr elements",
+								   nitems,
+								   &elem_hash_ctl,
+								   HASH_ELEM | HASH_FUNCTION | HASH_COMPARE | HASH_CONTEXT);
+
+		s = (char *) ARR_DATA_PTR(arr);
+		bitmap = ARR_NULLBITMAP(arr);
+		bitmask = 1;
+		for (int i = 0; i < nitems; i++)
+		{
+			Datum		element;
+
+			/* Get array element, checking for NULL. */
+			if (bitmap && (*bitmap & bitmask) == 0)
+			{
+				num_nulls++;
+			}
+			else
+			{
+				element = fetch_att(s, typbyval, typlen);
+				s = att_addlength_pointer(s, typlen, s);
+				s = (char *) att_align_nominal(s, typalign);
 
-		deconstruct_array(arr,
-						  ARR_ELEMTYPE(arr),
-						  typlen, typbyval, typalign,
-						  &op->d.scalararraybinsearchop.elem_values, &elem_nulls, &op->d.scalararraybinsearchop.num_elems);
+				hash_search(op->d.scalararraybinsearchop.elements_tab, (const void *) &element, HASH_ENTER, NULL);
+			}
 
-		/* Remove null entries from the array. */
-		for (int j = 0; j < op->d.scalararraybinsearchop.num_elems; j++)
-		{
-			if (!elem_nulls[j])
-				op->d.scalararraybinsearchop.elem_values[num_nonnulls++] = op->d.scalararraybinsearchop.elem_values[j];
+			/* Advance bitmap pointer if any. */
+			if (bitmap)
+			{
+				bitmask <<= 1;
+				if (bitmask == 0x100)
+				{
+					bitmap++;
+					bitmask = 1;
+				}
+			}
 		}
 
 		/*
 		 * Remember if we had any nulls so that we know if we need to execute
 		 * non-strict functions with a null lhs value if no match is found.
 		 */
-		op->d.scalararraybinsearchop.has_nulls = num_nonnulls < op->d.scalararraybinsearchop.num_elems;
-		op->d.scalararraybinsearchop.num_elems = num_nonnulls;
+		op->d.scalararraybinsearchop.has_nulls = num_nulls > 0;
 
 		/*
-		 * Setup the fcinfo for sorting. We've removed nulls, so both lhs and
-		 * rhs values are guaranteed to be non-null.
+		 * We only setup a binary search op if we have > 8 elements, so we don't
+		 * need to worry about adding an optimization for the empty array case.
 		 */
-		fcinfo->args[0].isnull = false;
-		fcinfo->args[1].isnull = false;
-
-		/* Sort the array and remove duplicate elements. */
-		qsort_arg(op->d.scalararraybinsearchop.elem_values, op->d.scalararraybinsearchop.num_elems, sizeof(Datum),
-				  compare_array_elements, op);
-		op->d.scalararraybinsearchop.num_elems = qunique_arg(op->d.scalararraybinsearchop.elem_values, op->d.scalararraybinsearchop.num_elems, sizeof(Datum),
-															 compare_array_elements, op);
-
-		/* Restore the lhs value after we scribbed on it for sorting. */
-		fcinfo->args[0].value = scalar;
-		fcinfo->args[0].isnull = scalar_isnull;
-
-		MemoryContextSwitchTo(old_cxt);
+		Assert(nitems > 0);
 	}
 
-	/*
-	 * We only setup a binary search op if we have > 8 elements, so we don't
-	 * need to worry about adding an optimization for the empty array case.
-	 */
-	Assert(!(op->d.scalararraybinsearchop.num_elems <= 0 && !op->d.scalararraybinsearchop.has_nulls));
-
-	/* Assume no match will be found until proven otherwise. */
-	result = BoolGetDatum(false);
+	/* Check the hash to see if we have a match. */
+	hash_search(op->d.scalararraybinsearchop.elements_tab, (const void *) &scalar, HASH_FIND, &hashfound);
+	result = BoolGetDatum(hashfound);
 	resultnull = false;
 
-	/* Binary search through the array. */
-	r = op->d.scalararraybinsearchop.num_elems - 1;
-	while (l <= r)
-	{
-		int			i = (l + r) / 2;
-
-		fcinfo->args[1].value = op->d.scalararraybinsearchop.elem_values[i];
-
-		/* Call comparison function */
-		fcinfo->isnull = false;
-		res = DatumGetInt32(op->d.scalararraybinsearchop.fn_addr(fcinfo));
-
-		if (res == 0)
-		{
-			result = BoolGetDatum(true);
-			resultnull = false;
-			break;
-		}
-		else if (res > 0)
-			l = i + 1;
-		else
-			r = i - 1;
-	}
-
 	/*
 	 * If we didn't find a match in the array, we still might need to handle
 	 * the possibility of null values (we've previously removed them from the
@@ -3761,19 +3791,6 @@ ExecEvalScalarArrayOpBinSearch(ExprState *state, ExprEvalStep *op, ExprContext *
 	*op->resnull = resultnull;
 }
 
-/* XXX: Name function to be specific to saop binsearch? */
-static int
-compare_array_elements(const void *a, const void *b, void *arg)
-{
-	ExprEvalStep *op = (ExprEvalStep *) arg;
-	FunctionCallInfo fcinfo = op->d.scalararraybinsearchop.fcinfo_data;
-
-	fcinfo->args[0].value = *((const Datum *) a);
-	fcinfo->args[1].value = *((const Datum *) b);
-
-	return DatumGetInt32(op->d.scalararraybinsearchop.fn_addr(fcinfo));
-}
-
 /*
  * Evaluate a NOT NULL domain constraint.
  */
diff --git a/src/include/executor/execExpr.h b/src/include/executor/execExpr.h
index ac4478d060..2e93b1f990 100644
--- a/src/include/executor/execExpr.h
+++ b/src/include/executor/execExpr.h
@@ -554,13 +554,16 @@ typedef struct ExprEvalStep
 		/* for EEOP_SCALARARRAYOP_BINSEARCH */
 		struct
 		{
-			int			num_elems;
 			bool		has_nulls;
-			Datum	   *elem_values;
+			HTAB	   *elements_tab;
 			FmgrInfo   *finfo;	/* function's lookup data */
 			FunctionCallInfo fcinfo_data;	/* arguments etc */
 			/* faster to access without additional indirection: */
 			PGFunction	fn_addr;	/* actual call address */
+			FmgrInfo   *hash_finfo;	/* function's lookup data */
+			FunctionCallInfo hash_fcinfo_data;	/* arguments etc */
+			/* faster to access without additional indirection: */
+			PGFunction	hash_fn_addr;	/* actual call address */
 		}			scalararraybinsearchop;
 
 		/* for EEOP_XMLEXPR */
diff --git a/src/test/regress/expected/expressions.out b/src/test/regress/expected/expressions.out
index 55b57b9c59..f37dfe1ce2 100644
--- a/src/test/regress/expected/expressions.out
+++ b/src/test/regress/expected/expressions.out
@@ -197,3 +197,10 @@ select null::int in (10, 9, 2, 8, 3, 7, 4, 6, 5, null);
  
 (1 row)
 
+select 'a' in ('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j');
+ ?column? 
+----------
+ t
+(1 row)
+
+-- TODO: test non-strict op?
diff --git a/src/test/regress/sql/expressions.sql b/src/test/regress/sql/expressions.sql
index 3cb850d838..c30fe66c5e 100644
--- a/src/test/regress/sql/expressions.sql
+++ b/src/test/regress/sql/expressions.sql
@@ -76,3 +76,5 @@ select 1 in (null, null, null, null, null, null, null, null, null, null, null);
 select 1 in (10, 9, 2, 8, 3, 7, 4, 6, 5, 1, null);
 select null::int in (10, 9, 2, 8, 3, 7, 4, 6, 5, 1);
 select null::int in (10, 9, 2, 8, 3, 7, 4, 6, 5, null);
+select 'a' in ('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j');
+-- TODO: test non-strict op?
-- 
2.17.1

From 08742543d7865d5f25c24c26bf1014924035c9eb Mon Sep 17 00:00:00 2001
From: jcoleman <jtc...@gmail.com>
Date: Fri, 10 Apr 2020 21:40:50 +0000
Subject: [PATCH v2 1/2] Binary search const arrays in OR'd ScalarArrayOps

Currently all scalar array op expressions execute as a linear search
through the array argument. However when OR semantics are desired it's
possible to instead use a binary search. Here we apply that optimization
to constant arrays (so we don't need to worry about teaching expression
execution when params change) of at least length 9 (since very short
arrays average to the same number of comparisons for linear searches and
thus avoid the preprocessing necessary for a binary search).
---
 src/backend/executor/execExpr.c           |  79 +++++++--
 src/backend/executor/execExprInterp.c     | 193 ++++++++++++++++++++++
 src/include/executor/execExpr.h           |  14 ++
 src/test/regress/expected/expressions.out |  39 +++++
 src/test/regress/sql/expressions.sql      |  11 ++
 5 files changed, 326 insertions(+), 10 deletions(-)

diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index c6a77bd66f..c202cc7e89 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -49,6 +49,7 @@
 #include "utils/lsyscache.h"
 #include "utils/typcache.h"
 
+#define MIN_ARRAY_SIZE_FOR_BINARY_SEARCH 9
 
 typedef struct LastAttnumInfo
 {
@@ -947,11 +948,13 @@ ExecInitExprRec(Expr *node, ExprState *state,
 		case T_ScalarArrayOpExpr:
 			{
 				ScalarArrayOpExpr *opexpr = (ScalarArrayOpExpr *) node;
+				Oid			func;
 				Expr	   *scalararg;
 				Expr	   *arrayarg;
 				FmgrInfo   *finfo;
 				FunctionCallInfo fcinfo;
 				AclResult	aclresult;
+				bool		useBinarySearch = false;
 
 				Assert(list_length(opexpr->args) == 2);
 				scalararg = (Expr *) linitial(opexpr->args);
@@ -964,12 +967,58 @@ ExecInitExprRec(Expr *node, ExprState *state,
 				if (aclresult != ACLCHECK_OK)
 					aclcheck_error(aclresult, OBJECT_FUNCTION,
 								   get_func_name(opexpr->opfuncid));
-				InvokeFunctionExecuteHook(opexpr->opfuncid);
 
 				/* Set up the primary fmgr lookup information */
 				finfo = palloc0(sizeof(FmgrInfo));
 				fcinfo = palloc0(SizeForFunctionCallInfo(2));
-				fmgr_info(opexpr->opfuncid, finfo);
+				func = opexpr->opfuncid;
+
+				/*
+				 * If we have a constant array and want OR semantics, then we
+				 * can use a binary search to implement the op instead of
+				 * looping through the entire array for each execution.
+				 */
+				if (opexpr->useOr && arrayarg && IsA(arrayarg, Const) &&
+					!((Const *) arrayarg)->constisnull)
+				{
+					Datum		arrdatum = ((Const *) arrayarg)->constvalue;
+					ArrayType  *arr = (ArrayType *) DatumGetPointer(arrdatum);
+					Oid			orderingOp;
+					Oid			orderingFunc;
+					Oid			opfamily;
+					Oid			opcintype;
+					int16		strategy;
+					int			nitems;
+
+					/*
+					 * Only do the optimization if we have a large enough
+					 * array to make it worth it.
+					 */
+					nitems = ArrayGetNItems(ARR_NDIM(arr), ARR_DIMS(arr));
+					if (nitems >= MIN_ARRAY_SIZE_FOR_BINARY_SEARCH)
+					{
+						/*
+						 * Find the ordering op that matches the originally
+						 * planned equality op.
+						 */
+						orderingOp = get_ordering_op_for_equality_op(opexpr->opno, NULL);
+						get_ordering_op_properties(orderingOp, &opfamily, &opcintype, &strategy);
+						orderingFunc = get_opfamily_proc(opfamily, opcintype, opcintype, BTORDER_PROC);
+
+						/*
+						 * But we may not have one, so fall back to the
+						 * default implementation if necessary.
+						 */
+						if (OidIsValid(orderingFunc))
+						{
+							useBinarySearch = true;
+							func = orderingFunc;
+						}
+					}
+				}
+
+				InvokeFunctionExecuteHook(func);
+				fmgr_info(func, finfo);
 				fmgr_info_set_expr((Node *) node, finfo);
 				InitFunctionCallInfoData(*fcinfo, finfo, 2,
 										 opexpr->inputcollid, NULL, NULL);
@@ -981,18 +1030,28 @@ ExecInitExprRec(Expr *node, ExprState *state,
 				/*
 				 * Evaluate array argument into our return value.  There's no
 				 * danger in that, because the return value is guaranteed to
-				 * be overwritten by EEOP_SCALARARRAYOP, and will not be
-				 * passed to any other expression.
+				 * be overwritten by EEOP_SCALARARRAYOP[_BINSEARCH], and will
+				 * not be passed to any other expression.
 				 */
 				ExecInitExprRec(arrayarg, state, resv, resnull);
 
 				/* And perform the operation */
-				scratch.opcode = EEOP_SCALARARRAYOP;
-				scratch.d.scalararrayop.element_type = InvalidOid;
-				scratch.d.scalararrayop.useOr = opexpr->useOr;
-				scratch.d.scalararrayop.finfo = finfo;
-				scratch.d.scalararrayop.fcinfo_data = fcinfo;
-				scratch.d.scalararrayop.fn_addr = finfo->fn_addr;
+				if (useBinarySearch)
+				{
+					scratch.opcode = EEOP_SCALARARRAYOP_BINSEARCH;
+					scratch.d.scalararraybinsearchop.finfo = finfo;
+					scratch.d.scalararraybinsearchop.fcinfo_data = fcinfo;
+					scratch.d.scalararraybinsearchop.fn_addr = finfo->fn_addr;
+				}
+				else
+				{
+					scratch.opcode = EEOP_SCALARARRAYOP;
+					scratch.d.scalararrayop.element_type = InvalidOid;
+					scratch.d.scalararrayop.useOr = opexpr->useOr;
+					scratch.d.scalararrayop.finfo = finfo;
+					scratch.d.scalararrayop.fcinfo_data = fcinfo;
+					scratch.d.scalararrayop.fn_addr = finfo->fn_addr;
+				}
 				ExprEvalPushStep(state, &scratch);
 				break;
 			}
diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
index 113ed1547c..5bebafbf0c 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -76,6 +76,7 @@
 #include "utils/timestamp.h"
 #include "utils/typcache.h"
 #include "utils/xml.h"
+#include "lib/qunique.h"
 
 /*
  * Use computed-goto-based opcode dispatch when computed gotos are available.
@@ -177,6 +178,8 @@ ExecAggPlainTransByRef(AggState *aggstate, AggStatePerTrans pertrans,
 					   AggStatePerGroup pergroup,
 					   ExprContext *aggcontext, int setno);
 
+static int	compare_array_elements(const void *a, const void *b, void *arg);
+
 /*
  * Prepare ExprState for interpreted execution.
  */
@@ -425,6 +428,7 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
 		&&CASE_EEOP_DOMAIN_CHECK,
 		&&CASE_EEOP_CONVERT_ROWTYPE,
 		&&CASE_EEOP_SCALARARRAYOP,
+		&&CASE_EEOP_SCALARARRAYOP_BINSEARCH,
 		&&CASE_EEOP_XMLEXPR,
 		&&CASE_EEOP_AGGREF,
 		&&CASE_EEOP_GROUPING_FUNC,
@@ -1464,6 +1468,14 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
 			EEO_NEXT();
 		}
 
+		EEO_CASE(EEOP_SCALARARRAYOP_BINSEARCH)
+		{
+			/* too complex for an inline implementation */
+			ExecEvalScalarArrayOpBinSearch(state, op, econtext);
+
+			EEO_NEXT();
+		}
+
 		EEO_CASE(EEOP_DOMAIN_NOTNULL)
 		{
 			/* too complex for an inline implementation */
@@ -3581,6 +3593,187 @@ ExecEvalScalarArrayOp(ExprState *state, ExprEvalStep *op)
 	*op->resnull = resultnull;
 }
 
+/*
+ * Evaluate "scalar op ANY (const array)".
+ *
+ * This is an optimized version of ExecEvalScalarArrayOp that only supports
+ * ANY (i.e., OR semantics) because it binary searches through the array for a
+ * match after sorting the array and removing null and duplicate entries.
+ *
+ * Source array is in our result area, scalar arg is already evaluated into
+ * fcinfo->args[0].
+ *
+ * The operator always yields boolean, and we combine the results across all
+ * array elements using OR.  Of course we short-circuit as soon as the result
+ * is known.
+ */
+void
+ExecEvalScalarArrayOpBinSearch(ExprState *state, ExprEvalStep *op, ExprContext *econtext)
+{
+	FunctionCallInfo fcinfo = op->d.scalararraybinsearchop.fcinfo_data;
+	bool		strictfunc = op->d.scalararraybinsearchop.finfo->fn_strict;
+	ArrayType  *arr;
+	Datum		result;
+	bool		resultnull;
+	bool	   *elem_nulls;
+	int			l = 0,
+				r,
+				res;
+
+	/* We don't setup a binary search op if the array const is null. */
+	Assert(!*op->resnull);
+
+	/*
+	 * If the scalar is NULL, and the function is strict, return NULL; no
+	 * point in executing the search.
+	 */
+	if (fcinfo->args[0].isnull && strictfunc)
+	{
+		*op->resnull = true;
+		return;
+	}
+
+	/* Preprocess the array the first time we execute the op. */
+	if (op->d.scalararraybinsearchop.elem_values == NULL)
+	{
+		/* Cache the original lhs so we can scribble on it. */
+		Datum		scalar = fcinfo->args[0].value;
+		bool		scalar_isnull = fcinfo->args[0].isnull;
+		int			num_nonnulls = 0;
+		MemoryContext old_cxt;
+		MemoryContext array_cxt;
+		int16		typlen;
+		bool		typbyval;
+		char		typalign;
+
+		arr = DatumGetArrayTypeP(*op->resvalue);
+
+		get_typlenbyvalalign(ARR_ELEMTYPE(arr),
+							 &typlen,
+							 &typbyval,
+							 &typalign);
+
+		array_cxt = AllocSetContextCreate(
+										  econtext->ecxt_per_query_memory,
+										  "scalararraybinsearchop context",
+										  ALLOCSET_SMALL_SIZES);
+		old_cxt = MemoryContextSwitchTo(array_cxt);
+
+		deconstruct_array(arr,
+						  ARR_ELEMTYPE(arr),
+						  typlen, typbyval, typalign,
+						  &op->d.scalararraybinsearchop.elem_values, &elem_nulls, &op->d.scalararraybinsearchop.num_elems);
+
+		/* Remove null entries from the array. */
+		for (int j = 0; j < op->d.scalararraybinsearchop.num_elems; j++)
+		{
+			if (!elem_nulls[j])
+				op->d.scalararraybinsearchop.elem_values[num_nonnulls++] = op->d.scalararraybinsearchop.elem_values[j];
+		}
+
+		/*
+		 * Remember if we had any nulls so that we know if we need to execute
+		 * non-strict functions with a null lhs value if no match is found.
+		 */
+		op->d.scalararraybinsearchop.has_nulls = num_nonnulls < op->d.scalararraybinsearchop.num_elems;
+		op->d.scalararraybinsearchop.num_elems = num_nonnulls;
+
+		/*
+		 * Setup the fcinfo for sorting. We've removed nulls, so both lhs and
+		 * rhs values are guaranteed to be non-null.
+		 */
+		fcinfo->args[0].isnull = false;
+		fcinfo->args[1].isnull = false;
+
+		/* Sort the array and remove duplicate elements. */
+		qsort_arg(op->d.scalararraybinsearchop.elem_values, op->d.scalararraybinsearchop.num_elems, sizeof(Datum),
+				  compare_array_elements, op);
+		op->d.scalararraybinsearchop.num_elems = qunique_arg(op->d.scalararraybinsearchop.elem_values, op->d.scalararraybinsearchop.num_elems, sizeof(Datum),
+															 compare_array_elements, op);
+
+		/* Restore the lhs value after we scribbed on it for sorting. */
+		fcinfo->args[0].value = scalar;
+		fcinfo->args[0].isnull = scalar_isnull;
+
+		MemoryContextSwitchTo(old_cxt);
+	}
+
+	/*
+	 * We only setup a binary search op if we have > 8 elements, so we don't
+	 * need to worry about adding an optimization for the empty array case.
+	 */
+	Assert(!(op->d.scalararraybinsearchop.num_elems <= 0 && !op->d.scalararraybinsearchop.has_nulls));
+
+	/* Assume no match will be found until proven otherwise. */
+	result = BoolGetDatum(false);
+	resultnull = false;
+
+	/* Binary search through the array. */
+	r = op->d.scalararraybinsearchop.num_elems - 1;
+	while (l <= r)
+	{
+		int			i = (l + r) / 2;
+
+		fcinfo->args[1].value = op->d.scalararraybinsearchop.elem_values[i];
+
+		/* Call comparison function */
+		fcinfo->isnull = false;
+		res = DatumGetInt32(op->d.scalararraybinsearchop.fn_addr(fcinfo));
+
+		if (res == 0)
+		{
+			result = BoolGetDatum(true);
+			resultnull = false;
+			break;
+		}
+		else if (res > 0)
+			l = i + 1;
+		else
+			r = i - 1;
+	}
+
+	/*
+	 * If we didn't find a match in the array, we still might need to handle
+	 * the possibility of null values (we've previously removed them from the
+	 * array).
+	 */
+	if (!DatumGetBool(result) && op->d.scalararraybinsearchop.has_nulls)
+	{
+		if (strictfunc)
+		{
+			/* Had nulls, so strict function implies null. */
+			result = (Datum) 0;
+			resultnull = true;
+		}
+		else
+		{
+			/* Execute function will null rhs just once. */
+			fcinfo->args[1].value = (Datum) 0;
+			fcinfo->args[1].isnull = true;
+
+			res = DatumGetInt32(op->d.scalararraybinsearchop.fn_addr(fcinfo));
+			result = BoolGetDatum(res == 0);
+			resultnull = fcinfo->isnull;
+		}
+	}
+
+	*op->resvalue = result;
+	*op->resnull = resultnull;
+}
+
+/* XXX: Name function to be specific to saop binsearch? */
+static int
+compare_array_elements(const void *a, const void *b, void *arg)
+{
+	ExprEvalStep *op = (ExprEvalStep *) arg;
+	FunctionCallInfo fcinfo = op->d.scalararraybinsearchop.fcinfo_data;
+
+	fcinfo->args[0].value = *((const Datum *) a);
+	fcinfo->args[1].value = *((const Datum *) b);
+
+	return DatumGetInt32(op->d.scalararraybinsearchop.fn_addr(fcinfo));
+}
+
 /*
  * Evaluate a NOT NULL domain constraint.
  */
diff --git a/src/include/executor/execExpr.h b/src/include/executor/execExpr.h
index dbe8649a57..ac4478d060 100644
--- a/src/include/executor/execExpr.h
+++ b/src/include/executor/execExpr.h
@@ -213,6 +213,7 @@ typedef enum ExprEvalOp
 	/* evaluate assorted special-purpose expression types */
 	EEOP_CONVERT_ROWTYPE,
 	EEOP_SCALARARRAYOP,
+	EEOP_SCALARARRAYOP_BINSEARCH,
 	EEOP_XMLEXPR,
 	EEOP_AGGREF,
 	EEOP_GROUPING_FUNC,
@@ -550,6 +551,18 @@ typedef struct ExprEvalStep
 			PGFunction	fn_addr;	/* actual call address */
 		}			scalararrayop;
 
+		/* for EEOP_SCALARARRAYOP_BINSEARCH */
+		struct
+		{
+			int			num_elems;
+			bool		has_nulls;
+			Datum	   *elem_values;
+			FmgrInfo   *finfo;	/* function's lookup data */
+			FunctionCallInfo fcinfo_data;	/* arguments etc */
+			/* faster to access without additional indirection: */
+			PGFunction	fn_addr;	/* actual call address */
+		}			scalararraybinsearchop;
+
 		/* for EEOP_XMLEXPR */
 		struct
 		{
@@ -728,6 +741,7 @@ extern void ExecEvalSubscriptingRefAssign(ExprState *state, ExprEvalStep *op);
 extern void ExecEvalConvertRowtype(ExprState *state, ExprEvalStep *op,
 								   ExprContext *econtext);
 extern void ExecEvalScalarArrayOp(ExprState *state, ExprEvalStep *op);
+extern void ExecEvalScalarArrayOpBinSearch(ExprState *state, ExprEvalStep *op, ExprContext *econtext);
 extern void ExecEvalConstraintNotNull(ExprState *state, ExprEvalStep *op);
 extern void ExecEvalConstraintCheck(ExprState *state, ExprEvalStep *op);
 extern void ExecEvalXmlExpr(ExprState *state, ExprEvalStep *op);
diff --git a/src/test/regress/expected/expressions.out b/src/test/regress/expected/expressions.out
index 4f4deaec22..55b57b9c59 100644
--- a/src/test/regress/expected/expressions.out
+++ b/src/test/regress/expected/expressions.out
@@ -158,3 +158,42 @@ select count(*) from date_tbl
     12
 (1 row)
 
+--
+-- Tests for ScalarArrayOpExpr binary search optimization
+--
+select 1 in (10, 9, 2, 8, 3, 7, 4, 6, 5, 1);
+ ?column? 
+----------
+ t
+(1 row)
+
+select 1 in (10, 9, 2, 8, 3, 7, 4, 6, 5, null);
+ ?column? 
+----------
+ 
+(1 row)
+
+select 1 in (null, null, null, null, null, null, null, null, null, null, null);
+ ?column? 
+----------
+ 
+(1 row)
+
+select 1 in (10, 9, 2, 8, 3, 7, 4, 6, 5, 1, null);
+ ?column? 
+----------
+ t
+(1 row)
+
+select null::int in (10, 9, 2, 8, 3, 7, 4, 6, 5, 1);
+ ?column? 
+----------
+ 
+(1 row)
+
+select null::int in (10, 9, 2, 8, 3, 7, 4, 6, 5, null);
+ ?column? 
+----------
+ 
+(1 row)
+
diff --git a/src/test/regress/sql/expressions.sql b/src/test/regress/sql/expressions.sql
index 1ca8bb151c..3cb850d838 100644
--- a/src/test/regress/sql/expressions.sql
+++ b/src/test/regress/sql/expressions.sql
@@ -65,3 +65,14 @@ select count(*) from date_tbl
   where f1 not between symmetric '1997-01-01' and '1998-01-01';
 select count(*) from date_tbl
   where f1 not between symmetric '1997-01-01' and '1998-01-01';
+
+--
+-- Tests for ScalarArrayOpExpr binary search optimization
+--
+
+select 1 in (10, 9, 2, 8, 3, 7, 4, 6, 5, 1);
+select 1 in (10, 9, 2, 8, 3, 7, 4, 6, 5, null);
+select 1 in (null, null, null, null, null, null, null, null, null, null, null);
+select 1 in (10, 9, 2, 8, 3, 7, 4, 6, 5, 1, null);
+select null::int in (10, 9, 2, 8, 3, 7, 4, 6, 5, 1);
+select null::int in (10, 9, 2, 8, 3, 7, 4, 6, 5, null);
-- 
2.17.1

Reply via email to