Re: [PATCH] Add support function for containment operators

2024-01-26 Thread vignesh C
On Sun, 21 Jan 2024 at 00:31, Tom Lane  wrote:
>
> jian he  writes:
> > Now I see your point. If the transformed plan is right, the whole
> > added code should be fine.
> > but keeping the textrange_supp related test should be a good idea.
> > since we don't have SUBTYPE_OPCLASS related sql tests.
>
> Yeah, it's a little harder to make a table-less test for that case.
> I thought about using current_user or the like as a stable comparison
> value, but that introduces some doubt about what the collation would
> be.  That test seems cheap enough as-is, since it's handling only a
> tiny amount of data.
>
> Committed.

I have updated the commitfest entry to Committed as the patch is committed.

Regards,
Vignesh




Re: [PATCH] Add support function for containment operators

2024-01-20 Thread Tom Lane
jian he  writes:
> Now I see your point. If the transformed plan is right, the whole
> added code should be fine.
> but keeping the textrange_supp related test should be a good idea.
> since we don't have SUBTYPE_OPCLASS related sql tests.

Yeah, it's a little harder to make a table-less test for that case.
I thought about using current_user or the like as a stable comparison
value, but that introduces some doubt about what the collation would
be.  That test seems cheap enough as-is, since it's handling only a
tiny amount of data.

Committed.

regards, tom lane




Re: [PATCH] Add support function for containment operators

2024-01-16 Thread jian he
On Wed, Jan 17, 2024 at 5:46 AM Tom Lane  wrote:
>
> But perhaps someone has an argument for a different rule?
>
> Anyway, pending discussion of that point, I think the code is good
> to go.  I don't like the test cases much though: they expend many more
> cycles than necessary.  You could prove the same points just by
> looking at the expansion of expressions, eg.
>

your patch is far better!

IMHO, worried about the support function, the transformed plan
generates the wrong result,
so we add the tests to make it bullet proof.
Now I see your point. If the transformed plan is right, the whole
added code should be fine.
but keeping the textrange_supp related test should be a good idea.
since we don't have SUBTYPE_OPCLASS related sql tests.




Re: [PATCH] Add support function for containment operators

2024-01-16 Thread Tom Lane
jian he  writes:
> [ v5-0001-Simplify-containment-in-range-constants-with-supp.patch ]

I spent some time reviewing and cleaning up this code.  The major
problem I noted was that it doesn't spend any effort thinking about
cases where it'd be unsafe or undesirable to apply the transformation.
In particular, it's entirely uncool to produce a double-sided
comparison if the elemExpr is volatile.  These two expressions
do not have the same result:

select random() <@ float8range(0.1, 0.2);
select random() >= 0.1 and random() < 0.2;

(Yes, I'm aware that BETWEEN is broken in this respect.  All the
more reason why we mustn't break <@.)

Another problem is that even if the elemExpr isn't volatile,
it might be expensive enough that evaluating it twice is bad.
I am not sure where we ought to put the cutoff.  There are some
existing places where we set a 10X cpu_operator_cost limit on
comparable transformations, so I stole that logic in the attached.
But perhaps someone has an argument for a different rule?

Anyway, pending discussion of that point, I think the code is good
to go.  I don't like the test cases much though: they expend many more
cycles than necessary.  You could prove the same points just by
looking at the expansion of expressions, eg.

regression=# explain (verbose, costs off) select current_date <@ 
daterange(null,null);
   QUERY PLAN   

 Result
   Output: true
(2 rows)

regression=# explain (verbose, costs off) select current_date <@ 
daterange('-Infinity', '1997-04-10'::date, '[)');
   QUERY PLAN   
 
-
 Result
   Output: ((CURRENT_DATE >= '-infinity'::date) AND (CURRENT_DATE < 
'1997-04-10'::date))
(2 rows)

I'd suggest losing the temp table and just coding tests like these.

regards, tom lane

diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index d99b00b590..2d94a6b877 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -30,19 +30,20 @@
  */
 #include "postgres.h"
 
-#include "access/tupmacs.h"
 #include "common/hashfn.h"
-#include "lib/stringinfo.h"
 #include "libpq/pqformat.h"
 #include "miscadmin.h"
+#include "nodes/makefuncs.h"
 #include "nodes/miscnodes.h"
-#include "port/pg_bitutils.h"
+#include "nodes/supportnodes.h"
+#include "optimizer/clauses.h"
+#include "optimizer/cost.h"
+#include "optimizer/optimizer.h"
 #include "utils/builtins.h"
 #include "utils/date.h"
 #include "utils/lsyscache.h"
 #include "utils/rangetypes.h"
 #include "utils/timestamp.h"
-#include "varatt.h"
 
 
 /* fn_extra cache entry for one of the range I/O functions */
@@ -69,6 +70,12 @@ static Size datum_compute_size(Size data_length, Datum val, bool typbyval,
 			   char typalign, int16 typlen, char typstorage);
 static Pointer datum_write(Pointer ptr, Datum datum, bool typbyval,
 		   char typalign, int16 typlen, char typstorage);
+static Node *find_simplified_clause(PlannerInfo *root,
+	Expr *rangeExpr, Expr *elemExpr);
+static Expr *build_bound_expr(Expr *elemExpr, Datum val,
+			  bool isLowerBound, bool isInclusive,
+			  TypeCacheEntry *typeCache,
+			  Oid opfamily, Oid rng_collation);
 
 
 /*
@@ -2173,6 +2180,58 @@ make_empty_range(TypeCacheEntry *typcache)
 	return make_range(typcache, , , true, NULL);
 }
 
+/*
+ * Planner support function for elem_contained_by_range (<@ operator).
+ */
+Datum
+elem_contained_by_range_support(PG_FUNCTION_ARGS)
+{
+	Node	   *rawreq = (Node *) PG_GETARG_POINTER(0);
+	Node	   *ret = NULL;
+
+	if (IsA(rawreq, SupportRequestSimplify))
+	{
+		SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq;
+		FuncExpr   *fexpr = req->fcall;
+		Expr	   *leftop,
+   *rightop;
+
+		Assert(list_length(fexpr->args) == 2);
+		leftop = linitial(fexpr->args);
+		rightop = lsecond(fexpr->args);
+
+		ret = find_simplified_clause(req->root, rightop, leftop);
+	}
+
+	PG_RETURN_POINTER(ret);
+}
+
+/*
+ * Planner support function for range_contains_elem (@> operator).
+ */
+Datum
+range_contains_elem_support(PG_FUNCTION_ARGS)
+{
+	Node	   *rawreq = (Node *) PG_GETARG_POINTER(0);
+	Node	   *ret = NULL;
+
+	if (IsA(rawreq, SupportRequestSimplify))
+	{
+		SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq;
+		FuncExpr   *fexpr = req->fcall;
+		Expr	   *leftop,
+   *rightop;
+
+		Assert(list_length(fexpr->args) == 2);
+		leftop = linitial(fexpr->args);
+		rightop = lsecond(fexpr->args);
+
+		ret = find_simplified_clause(req->root, leftop, rightop);
+	}
+
+	PG_RETURN_POINTER(ret);
+}
+
 
 /*
  *--
@@ -2715,3 +2774,180 @@ datum_write(Pointer ptr, Datum datum, bool typbyval, char typalign,
 
 	return ptr;
 }
+
+/*
+ * Common code for the elem_contained_by_range and range_contains_elem
+ * support functions.  The caller has 

Re: [PATCH] Add support function for containment operators

2023-12-16 Thread jian he
fix a typo and also did a minor change.

from
+ if (lowerExpr != NULL && upperExpr != NULL)
+ return (Node *) makeBoolExpr(AND_EXPR, list_make2(lowerExpr, upperExpr),
-1);
+ else if (lowerExpr != NULL)
+ return (Node *) lowerExpr;
+ else if (upperExpr != NULL)
+ return (Node *) upperExpr;

to

+ if (lowerExpr != NULL && upperExpr != NULL)
+ return (Node *) makeBoolExpr(AND_EXPR, list_make2(lowerExpr, upperExpr),
-1);
+ else if (lowerExpr != NULL)
+ return (Node *) lowerExpr;
+ else if (upperExpr != NULL)
+ return (Node *) upperExpr;
+ else
+ {
+ Assert(false);
+ return NULL;
+ }

because cfbot says:

15:04:38.116] make -s -j${BUILD_JOBS} clean
[15:04:38.510] time make -s -j${BUILD_JOBS} world-bin
[15:04:43.272] rangetypes.c:2908:1: error: non-void function does not
return a value in all control paths [-Werror,-Wreturn-type]
[15:04:43.272] }
[15:04:43.272] ^
[15:04:43.272] 1 error generated.

also add some commit messages, I hope it will be useful.
From 334566a62163469f53cfd941ed87e6d6e54d756b Mon Sep 17 00:00:00 2001
From: pgaddict 
Date: Mon, 11 Dec 2023 19:14:01 +0800
Subject: [PATCH v5 1/1] Simplify containment in range constants with support function

doc:
https://www.postgresql.org/docs/current/extend-type-system.html#EXTEND-TYPES-POLYMORPHIC
Similarly, if there are positions declared anyrange and others declared anyelement or anyarray,
the actual range type in the anyrange positions must be a range 
whose subtype is the same type appearing in the anyelement positions
and the same as the element type of the anyarray positions.

proname => 'range_contains_elem', prorettype => 'bool', proargtypes => 'anyrange anyelement',
proname => 'elem_contained_by_range', prorettype => 'bool', proargtypes => 'anyelement anyrange'

It will work, because the range constant base type will be the same as the element.
so we can safely rewrite "element within a range or not" 
to element comparison with range lower bound 
and (&&) element comparison with range upper bound.
and these simple same data type comparisons can be supported by btree index.

---
 src/backend/utils/adt/rangetypes.c  | 196 
 src/backend/utils/adt/rangetypes_selfuncs.c |   6 +-
 src/include/catalog/pg_proc.dat |  12 +-
 src/test/regress/expected/rangetypes.out| 193 +++
 src/test/regress/sql/rangetypes.sql |  92 +
 5 files changed, 494 insertions(+), 5 deletions(-)

diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index 24bad529..0b3eab08 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -31,16 +31,24 @@
 #include "postgres.h"
 
 #include "access/tupmacs.h"
+#include "access/stratnum.h"
+#include "catalog/pg_range.h"
 #include "common/hashfn.h"
 #include "lib/stringinfo.h"
 #include "libpq/pqformat.h"
 #include "miscadmin.h"
 #include "nodes/miscnodes.h"
+#include "nodes/makefuncs.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/pg_list.h"
+#include "nodes/supportnodes.h"
 #include "port/pg_bitutils.h"
 #include "utils/builtins.h"
+#include "utils/fmgroids.h"
 #include "utils/date.h"
 #include "utils/lsyscache.h"
 #include "utils/rangetypes.h"
+#include "utils/syscache.h"
 #include "utils/timestamp.h"
 #include "varatt.h"
 
@@ -69,6 +77,10 @@ static Size datum_compute_size(Size data_length, Datum val, bool typbyval,
 			   char typalign, int16 typlen, char typstorage);
 static Pointer datum_write(Pointer ptr, Datum datum, bool typbyval,
 		   char typalign, int16 typlen, char typstorage);
+static Expr *build_bound_expr(Oid opfamily, TypeCacheEntry *typeCache,
+			  bool isLowerBound, bool isInclusive,
+			  Datum val, Expr *otherExpr, Oid rng_collation);
+static Node *find_simplified_clause(Const *rangeConst, Expr *otherExpr);
 
 
 /*
@@ -2173,6 +2185,57 @@ make_empty_range(TypeCacheEntry *typcache)
 	return make_range(typcache, , , true, NULL);
 }
 
+/*
+ * Planner support function for the elem_contained_by_range (<@) operator.
+ */
+Datum
+elem_contained_by_range_support(PG_FUNCTION_ARGS)
+{
+	Node	   *rawreq = (Node *) PG_GETARG_POINTER(0),
+			   *leftop,
+			   *rightop;
+	FuncExpr   *fexpr;
+
+	if (!IsA(rawreq, SupportRequestSimplify))
+		PG_RETURN_POINTER(NULL);
+
+	fexpr = ((SupportRequestSimplify *) rawreq)->fcall;
+
+	Assert(list_length(fexpr->args) == 2);
+	leftop = linitial(fexpr->args);
+	rightop = lsecond(fexpr->args);
+
+	if (!IsA(rightop, Const) || ((Const *) rightop)->constisnull)
+		PG_RETURN_POINTER(NULL);
+
+	PG_RETURN_POINTER(find_simplified_clause((Const *) rightop, (Expr *) leftop));
+}
+
+/*
+ * Planner support function for the range_contains_elem (@>) operator.
+ */
+Datum
+range_contains_elem_support(PG_FUNCTION_ARGS)
+{
+	Node	   *rawreq = (Node *) PG_GETARG_POINTER(0),
+			   *leftop,
+			   *rightop;
+	FuncExpr   *fexpr;
+
+	if (!IsA(rawreq, SupportRequestSimplify))
+		PG_RETURN_POINTER(NULL);
+
+	fexpr = ((SupportRequestSimplify *) rawreq)->fcall;
+
+	

Re: [PATCH] Add support function for containment operators

2023-11-12 Thread Kim Johan Andersson

On 12-11-2023 20:20, Laurenz Albe wrote:

On Sun, 2023-11-12 at 18:15 +0100, Laurenz Albe wrote:

I adjusted your patch according to my comments; what do you think?


I have added the patch to the January commitfest, with Jian and Kim as authors.
I hope that is OK with you.


Sounds great to me. Thanks to Jian for picking this up.

Regards,
Kim Johan Andersson






Re: [PATCH] Add support function for containment operators

2023-11-12 Thread Laurenz Albe
On Sun, 2023-11-12 at 18:15 +0100, Laurenz Albe wrote:
> I adjusted your patch according to my comments; what do you think?

I have added the patch to the January commitfest, with Jian and Kim as authors.
I hope that is OK with you.

Yours,
Laurenz Albe




Re: [PATCH] Add support function for containment operators

2023-11-12 Thread Laurenz Albe
On Fri, 2023-10-20 at 16:24 +0800, jian he wrote:
> [new patch]

Thanks, that patch works as expected and passes regression tests.

Some comments about the code:

> --- a/src/backend/utils/adt/rangetypes.c
> +++ b/src/backend/utils/adt/rangetypes.c
> @@ -558,7 +570,6 @@ elem_contained_by_range(PG_FUNCTION_ARGS)
>   PG_RETURN_BOOL(range_contains_elem_internal(typcache, r, val));
>  }
>  
> -
>  /* range, range -> bool functions */
>  
>  /* equality (internal version) */

Please don't change unrelated whitespace.

> +static Node *
> +find_simplified_clause(Const *rangeConst, Expr *otherExpr)
> +{
> + Form_pg_range pg_range;
> + HeapTuple   tup;
> + Oid opclassOid;
> + RangeBound  lower;
> + RangeBound  upper;
> + boolempty;
> + Oid rng_collation;
> + TypeCacheEntry *elemTypcache;
> + Oid opfamily =  InvalidOid;
> +
> + RangeType  *range = DatumGetRangeTypeP(rangeConst->constvalue);
> + TypeCacheEntry *rangetypcache = 
> lookup_type_cache(RangeTypeGetOid(range), TYPECACHE_RANGE_INFO);
> + {

This brace is unnecessary.  Perhaps a leftover from a removed conditional 
statement.

> + /* this part is get the range's SUBTYPE_OPCLASS from pg_range 
> catalog.
> +  * Refer load_rangetype_info function last line.
> +  * TypeCacheEntry->rngelemtype typcaheenetry either don't have 
> opclass entry or with default opclass.
> +  * Range's subtype opclass only in catalog table.
> + */

The comments in the patch need some more love.
Apart from the language, you should have a look at the style guide:

- single-line comments should start with lower case and have no period:

  /* example of a single-line comment */

- Multi-line comments should start with /* on its own line and end with */ on 
its
  own line.  They should use whole sentences:

  /*
   * In case a comment spans several lines, it should look like
   * this.  Try not to exceed 80 characters.
   */

> + tup = SearchSysCache1(RANGETYPE, 
> ObjectIdGetDatum(RangeTypeGetOid(range)));
> +
> + /* should not fail, since we already checked typtype ... */
> + if (!HeapTupleIsValid(tup))
> + elog(ERROR, "cache lookup failed for range type %u", 
> RangeTypeGetOid(range));

If this is a "can't happen" case, it should be an Assert.

> +
> + pg_range = (Form_pg_range) GETSTRUCT(tup);
> +
> + opclassOid = pg_range->rngsubopc;
> +
> + ReleaseSysCache(tup);
> +
> + /* get opclass properties and look up the comparison function */
> + opfamily = get_opclass_family(opclassOid);
> + }
> +
> + range_deserialize(rangetypcache, range, , , );
> + rng_collation = rangetypcache->rng_collation;
> +
> + if (empty)
> + {
> + /* If the range is empty, then there can be no matches. */
> + return makeBoolConst(false, false);
> + }
> + else if (lower.infinite && upper.infinite)
> + {
> + /* The range has no bounds, so matches everything. */
> + return makeBoolConst(true, false);
> + }
> + else
> + {

Many of the variables declared at the beginning of the function are only used in
this branch.  You should declare them here.

> +static Node *
> +match_support_request(Node *rawreq)
> +{
> + if (IsA(rawreq, SupportRequestSimplify))
> + {

To keep the indentation shallow, the preferred style is:

  if (/* case we don't handle */)
  return NULL;
  /* proceed without indentation */

> + SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq;
> + FuncExpr   *fexpr = req->fcall;
> + Node   *leftop;
> + Node   *rightop;
> + Const  *rangeConst;
> + Expr   *otherExpr;
> +
> + Assert(list_length(fexpr->args) == 2);
> +
> + leftop = linitial(fexpr->args);
> + rightop = lsecond(fexpr->args);
> +
> + switch (fexpr->funcid)
> + {
> + case F_ELEM_CONTAINED_BY_RANGE:
> + if (!IsA(rightop, Const) || ((Const *) 
> rightop)->constisnull)
> + return NULL;
> +
> + rangeConst = (Const *) rightop;
> + otherExpr = (Expr *) leftop;
> + break;
> +
> + case F_RANGE_CONTAINS_ELEM:
> + if (!IsA(leftop, Const) || ((Const *) 
> leftop)->constisnull)
> + return NULL;
> +
> + rangeConst = (Const *) leftop;
> + otherExpr = (Expr *) rightop;
> + break;
> +
> + default:
> + return NULL;
> 

Re: [PATCH] Add support function for containment operators

2023-10-20 Thread jian he
On Fri, Oct 20, 2023 at 12:01 AM Laurenz Albe  wrote:
>
> On Fri, 2023-10-13 at 14:26 +0800, jian he wrote:
> > Collation problem seems solved.
>
> I didn't review your patch in detail, there is still a problem
> with my example:
>
>   CREATE TYPE textrange AS RANGE (
>  SUBTYPE = text,
>  SUBTYPE_OPCLASS = text_pattern_ops
>   );
>
>   CREATE TABLE tx (t text COLLATE "cs-CZ-x-icu");
>
>   INSERT INTO tx VALUES ('a'), ('c'), ('d'), ('ch');
>
>   SELECT * FROM tx WHERE t <@ textrange('a', 'd');
>
>t
>   
>a
>c
>ch
>   (3 rows)
>
> That was correct.
>
>   EXPLAIN SELECT * FROM tx WHERE t <@ textrange('a', 'd');
>
>QUERY PLAN
>   
>Seq Scan on tx  (cost=0.00..30.40 rows=7 width=32)
>  Filter: ((t >= 'a'::text) AND (t < 'd'::text))
>   (2 rows)
>
> But that was weird.  The operators seem wrong.  Look at that

Thanks for pointing this out!

The problem is that TypeCacheEntry->rngelemtype typcaheentry don't
have the range's SUBTYPE_OPCLASS info.
So in find_simplified_clause, we need to get the range's
SUBTYPE_OPCLASS from the pg_catalog table.
Also in pg_range, column rngsubopc  is not null. so this should be fine.
From 810208a42e99109bcaf56cddae6968efbcf969c5 Mon Sep 17 00:00:00 2001
From: pgaddict 
Date: Fri, 20 Oct 2023 16:20:38 +0800
Subject: [PATCH v3 1/1]  Optimize quals (Expr <@ RangeConst) and (RangeConst
 @> Expr)

Previously these two quals will be processed as is.
This patch will rewritten the expression to expose more info to optimizer by
adding prosupport function to range_contains_elem, elem_contained_by_range.

Expr <@ rangeConst will be rewritten to "expr opr range_lower_bound and expr opr
range_upper_bound". (range bound inclusiveness will be handled properly).

Added some tests to validate the generated plan.
---
 src/backend/utils/adt/rangetypes.c  | 229 +++-
 src/backend/utils/adt/rangetypes_selfuncs.c |   6 +-
 src/include/catalog/pg_proc.dat |  12 +-
 src/test/regress/expected/rangetypes.out| 194 +
 src/test/regress/sql/rangetypes.sql | 101 +
 5 files changed, 535 insertions(+), 7 deletions(-)

diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index d65e5625..d100e55e 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -31,16 +31,24 @@
 #include "postgres.h"
 
 #include "access/tupmacs.h"
+#include "access/stratnum.h"
+#include "catalog/pg_range.h"
 #include "common/hashfn.h"
 #include "lib/stringinfo.h"
 #include "libpq/pqformat.h"
 #include "miscadmin.h"
 #include "nodes/miscnodes.h"
+#include "nodes/makefuncs.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/pg_list.h"
+#include "nodes/supportnodes.h"
 #include "port/pg_bitutils.h"
 #include "utils/builtins.h"
+#include "utils/fmgroids.h"
 #include "utils/date.h"
 #include "utils/lsyscache.h"
 #include "utils/rangetypes.h"
+#include "utils/syscache.h"
 #include "utils/timestamp.h"
 #include "varatt.h"
 
@@ -69,7 +77,11 @@ static Size datum_compute_size(Size data_length, Datum val, bool typbyval,
 			   char typalign, int16 typlen, char typstorage);
 static Pointer datum_write(Pointer ptr, Datum datum, bool typbyval,
 		   char typalign, int16 typlen, char typstorage);
-
+static Expr *build_bound_expr(Oid opfamily, TypeCacheEntry *typeCache,
+			  bool isLowerBound, bool isInclusive,
+			  Datum val, Expr *otherExpr, Oid rng_collation);
+static Node *find_simplified_clause(Const *rangeConst, Expr *otherExpr);
+static Node *match_support_request(Node *rawreq);
 
 /*
  *--
@@ -558,7 +570,6 @@ elem_contained_by_range(PG_FUNCTION_ARGS)
 	PG_RETURN_BOOL(range_contains_elem_internal(typcache, r, val));
 }
 
-
 /* range, range -> bool functions */
 
 /* equality (internal version) */
@@ -2173,6 +2184,29 @@ make_empty_range(TypeCacheEntry *typcache)
 	return make_range(typcache, , , true, NULL);
 }
 
+/*
+ * Planner support function for elem_contained_by_range operator
+ */
+Datum
+elem_contained_by_range_support(PG_FUNCTION_ARGS)
+{
+	Node	   *rawreq = (Node *) PG_GETARG_POINTER(0);
+	Node	   *ret = match_support_request(rawreq);
+
+	PG_RETURN_POINTER(ret);
+}
+
+/*
+ * Planner support function for range_contains_elem operator
+ */
+Datum
+range_contains_elem_support(PG_FUNCTION_ARGS)
+{
+	Node	   *rawreq = (Node *) PG_GETARG_POINTER(0);
+	Node	   *ret = match_support_request(rawreq);
+
+	PG_RETURN_POINTER(ret);
+}
 
 /*
  *--
@@ -2714,3 +2748,194 @@ datum_write(Pointer ptr, Datum datum, bool typbyval, char typalign,
 
 	return ptr;
 }
+/*
+ * build (Expr Operator Range_bound) expression. something like: expr >= lower(range)
+ * if operator both sides have collation, operator should use (right) range's collation
+ *
+*/
+static Expr *
+build_bound_expr(Oid 

Re: [PATCH] Add support function for containment operators

2023-10-19 Thread Laurenz Albe
On Fri, 2023-10-13 at 14:26 +0800, jian he wrote:
> Collation problem seems solved.

I didn't review your patch in detail, there is still a problem
with my example:

  CREATE TYPE textrange AS RANGE (
 SUBTYPE = text,
 SUBTYPE_OPCLASS = text_pattern_ops
  );

  CREATE TABLE tx (t text COLLATE "cs-CZ-x-icu");

  INSERT INTO tx VALUES ('a'), ('c'), ('d'), ('ch');

  SELECT * FROM tx WHERE t <@ textrange('a', 'd');

   t  
  
   a
   c
   ch
  (3 rows)

That was correct.

  EXPLAIN SELECT * FROM tx WHERE t <@ textrange('a', 'd');

   QUERY PLAN 
  
   Seq Scan on tx  (cost=0.00..30.40 rows=7 width=32)
 Filter: ((t >= 'a'::text) AND (t < 'd'::text))
  (2 rows)

But that was weird.  The operators seem wrong.  Look at that
query:

  SELECT * FROM tx WHERE t >= 'a' AND t < 'd';

   t 
  ═══
   a
   c
  (2 rows)

But the execution plan is identical...

I am not sure what is the problem here, but in my opinion the
operators shown in the execution plan should be like this:

  SELECT * FROM tx WHERE t ~>=~ 'a' AND t ~<~ 'd';

   t  
  
   a
   c
   ch
  (3 rows)

Yours,
Laurenz Albe




Re: [PATCH] Add support function for containment operators

2023-10-13 Thread jian he
On Tue, Aug 1, 2023 at 10:07 AM Laurenz Albe  wrote:
> >
> >
> > > I had an idea about this:
> > > So far, you only consider constant ranges.  But if we have a STABLE range
> > > expression, you could use an index scan for "expr <@ range", for example
> > > Index Cond (expr >= lower(range) AND expr < upper(range)).
> > >

The above part, not sure how to implement it, not sure it is doable.

Refactor:
drop SupportRequestIndexCondition and related code, since mentioned in
upthread, it's dead code.
refactor the regression test. (since data types with infinity cover
more cases than int4range, so I deleted some tests).

now there are 3 helper functions (build_bound_expr,
find_simplified_clause, match_support_request), 2 entry functions
(elem_contained_by_range_support, range_contains_elem_support)

Collation problem seems solved. Putting the following test on the
src/test/regress/sql/rangetypes.sql will not work. Maybe because of
the order of the regression test, in SQL-ASCII encoding, I cannot
create collation="cs-CZ-x-icu".

drop type if EXISTS textrange1, textrange2;
drop table if EXISTS collate_test1, collate_test2;
CREATE TYPE textrange1 AS RANGE (SUBTYPE = text, collation="C");
create type textrange2 as range(subtype=text, collation="cs-CZ-x-icu");
CREATE TABLE collate_test1 (b text COLLATE "en-x-icu" NOT NULL);
INSERT INTO collate_test1(b) VALUES ('a'), ('c'), ('d'), ('ch');
CREATE TABLE collate_test2 (b text NOT NULL);
INSERT INTO collate_test2(b) VALUES ('a'), ('c'), ('d'), ('ch');

--should include 'ch'
SELECT * FROM collate_test1 WHERE b <@ textrange1('a', 'd');
--should not include 'ch'
SELECT * FROM collate_test1 WHERE b <@ textrange2('a', 'd');
--should include 'ch'
SELECT * FROM collate_test2 WHERE b <@ textrange1('a', 'd');
--should not include 'ch'
SELECT * FROM collate_test2 WHERE b <@ textrange2('a', 'd');
-
From bcae7f8a0640b48b04f243660539e8670de43d41 Mon Sep 17 00:00:00 2001
From: pgaddict 
Date: Fri, 13 Oct 2023 12:43:41 +0800
Subject: [PATCH v2 1/1] Optimize qual cases like Expr <@ RangeConst and
 RangeConst @> Expr

Previously these two quals will be processed as is.
With this patch, adding prosupport
function to range_contains_elem, elem_contained_by_range.

So cases like Expr <@ rangeCOnst, rangeConst @> expr
will be rewritten to "expr opr range_lower_bound and expr opr
range_upper_bound".

Expr <@ rangeConst will be logically equivalent to rangeConst @> Expr,
if rangeConst is the same. added some tests to validate the generated
plan.
---
 src/backend/utils/adt/rangetypes.c  | 199 +++-
 src/backend/utils/adt/rangetypes_selfuncs.c |   6 +-
 src/include/catalog/pg_proc.dat |  12 +-
 src/test/regress/expected/rangetypes.out| 194 +++
 src/test/regress/sql/rangetypes.sql | 101 ++
 5 files changed, 505 insertions(+), 7 deletions(-)

diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index d65e5625..647bd5e4 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -31,13 +31,19 @@
 #include "postgres.h"
 
 #include "access/tupmacs.h"
+#include "access/stratnum.h"
 #include "common/hashfn.h"
 #include "lib/stringinfo.h"
 #include "libpq/pqformat.h"
 #include "miscadmin.h"
+#include "nodes/supportnodes.h"
+#include "nodes/makefuncs.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/pg_list.h"
 #include "nodes/miscnodes.h"
 #include "port/pg_bitutils.h"
 #include "utils/builtins.h"
+#include "utils/fmgroids.h"
 #include "utils/date.h"
 #include "utils/lsyscache.h"
 #include "utils/rangetypes.h"
@@ -69,7 +75,11 @@ static Size datum_compute_size(Size data_length, Datum val, bool typbyval,
 			   char typalign, int16 typlen, char typstorage);
 static Pointer datum_write(Pointer ptr, Datum datum, bool typbyval,
 		   char typalign, int16 typlen, char typstorage);
-
+static Expr *build_bound_expr(Oid opfamily, TypeCacheEntry *typeCache,
+			  bool isLowerBound, bool isInclusive,
+			  Datum val, Expr *otherExpr, Oid rng_collation);
+static Node *find_simplified_clause(Const *rangeConst, Expr *otherExpr);
+static Node *match_support_request(Node *rawreq);
 
 /*
  *--
@@ -558,7 +568,6 @@ elem_contained_by_range(PG_FUNCTION_ARGS)
 	PG_RETURN_BOOL(range_contains_elem_internal(typcache, r, val));
 }
 
-
 /* range, range -> bool functions */
 
 /* equality (internal version) */
@@ -2173,6 +2182,29 @@ make_empty_range(TypeCacheEntry *typcache)
 	return make_range(typcache, , , true, NULL);
 }
 
+/*
+ * Planner support function for elem_contained_by_range operator
+ */
+Datum
+elem_contained_by_range_support(PG_FUNCTION_ARGS)
+{
+	Node	   *rawreq = (Node *) PG_GETARG_POINTER(0);
+	Node	   *ret = match_support_request(rawreq);
+
+	PG_RETURN_POINTER(ret);
+}
+
+/*
+ * Planner support function for range_contains_elem operator
+ */
+Datum
+range_contains_elem_support(PG_FUNCTION_ARGS)

Re: [PATCH] Add support function for containment operators

2023-07-31 Thread Laurenz Albe
On Sat, 2023-07-08 at 08:11 +0200, Kim Johan Andersson wrote:
> On 07-07-2023 13:20, Laurenz Albe wrote:
> > I wrote:
> > > You implement both "SupportRequestIndexCondition" and 
> > > "SupportRequestSimplify",
> > > but when I experimented, the former was never called.  That does not
> > > surprise me, since any expression of the shape "expr <@ range constant"
> > > can be simplified.  Is the "SupportRequestIndexCondition" branch dead 
> > > code?
> > > If not, do you have an example that triggers it?
> 
> I would think it is dead code, I came to the same conclusion. So we can 
> drop SupportRequestIndexCondition, since the simplification happens to 
> take care of everything.
> 
> 
> > I had an idea about this:
> > So far, you only consider constant ranges.  But if we have a STABLE range
> > expression, you could use an index scan for "expr <@ range", for example
> > Index Cond (expr >= lower(range) AND expr < upper(range)).
> > 
> 
> I will try to look into this. Originally that was what I was hoping for, 
> but didn't see way of going about it.
> 
> Thanks for your comments, I will also look at the locale-related 
> breakage you spotted.

I have marked the patch as "returned with feedback".

I encourage you to submit an improved version in a future commitfest!

Yours,
Laurenz Albe




Re: [PATCH] Add support function for containment operators

2023-07-08 Thread Kim Johan Andersson

On 07-07-2023 13:20, Laurenz Albe wrote:

I wrote:

You implement both "SupportRequestIndexCondition" and "SupportRequestSimplify",
but when I experimented, the former was never called.  That does not
surprise me, since any expression of the shape "expr <@ range constant"
can be simplified.  Is the "SupportRequestIndexCondition" branch dead code?
If not, do you have an example that triggers it?


I would think it is dead code, I came to the same conclusion. So we can 
drop SupportRequestIndexCondition, since the simplification happens to 
take care of everything.




I had an idea about this:
So far, you only consider constant ranges.  But if we have a STABLE range
expression, you could use an index scan for "expr <@ range", for example
Index Cond (expr >= lower(range) AND expr < upper(range)).



I will try to look into this. Originally that was what I was hoping for, 
but didn't see way of going about it.


Thanks for your comments, I will also look at the locale-related 
breakage you spotted.


Regards,
Kimjand




Re: [PATCH] Add support function for containment operators

2023-07-07 Thread Laurenz Albe
I wrote:
> You implement both "SupportRequestIndexCondition" and 
> "SupportRequestSimplify",
> but when I experimented, the former was never called.  That does not
> surprise me, since any expression of the shape "expr <@ range constant"
> can be simplified.  Is the "SupportRequestIndexCondition" branch dead code?
> If not, do you have an example that triggers it?

I had an idea about this:
So far, you only consider constant ranges.  But if we have a STABLE range
expression, you could use an index scan for "expr <@ range", for example
Index Cond (expr >= lower(range) AND expr < upper(range)).

Yours,
Laurenz Albe




Re: [PATCH] Add support function for containment operators

2023-07-06 Thread Laurenz Albe
> On Sat, 2023-04-29 at 17:07 +0200, Kim Johan Andersson wrote:
> > I had noticed that performance wasn't great when using the @> or <@ 
> > operators when examining if an element is contained in a range.
> > Based on the discussion in [1] I would like to suggest the following 
> > changes:
> > 
> > This patch attempts to improve the row estimation, as well as opening 
> > the possibility of using a btree index scan when using the containment 
> > operators.

I managed to break the patch:

CREATE DATABASE czech ICU_LOCALE "cs-CZ" LOCALE "cs_CZ.utf8" TEMPLATE template0;

\c czech

CREATE TYPE textrange AS RANGE (SUBTYPE = text, SUBTYPE_OPCLASS = 
text_pattern_ops);

CREATE TABLE tx (t text);

INSERT INTO tx VALUES ('a'), ('c'), ('d'), ('ch');

EXPLAIN SELECT * FROM tx WHERE t <@ textrange('a', 'd');

 QUERY PLAN 

 Seq Scan on tx  (cost=0.00..30.40 rows=7 width=32)
   Filter: ((t >= 'a'::text) AND (t < 'd'::text))
(2 rows)

SELECT * FROM tx WHERE t <@ textrange('a', 'd');
ERROR:  could not determine which collation to use for string comparison
HINT:  Use the COLLATE clause to set the collation explicitly.


The replacement operators are wrong; it should be ~>=~ and ~<~ .

Also, there should be no error message.
The result should be 'a', 'c' and 'ch'.

Yours,
Laurenz Albe




Re: [PATCH] Add support function for containment operators

2023-07-06 Thread Laurenz Albe
On Sat, 2023-04-29 at 17:07 +0200, Kim Johan Andersson wrote:
> I had noticed that performance wasn't great when using the @> or <@ 
> operators when examining if an element is contained in a range.
> Based on the discussion in [1] I would like to suggest the following 
> changes:
> 
> This patch attempts to improve the row estimation, as well as opening 
> the possibility of using a btree index scan when using the containment 
> operators.
> 
> This is done via a new support function handling the following 2 requests:
> 
> * SupportRequestIndexCondition
> find_index_quals will build an operator clause, given at least one 
> finite RangeBound.
> 
> * SupportRequestSimplify
> find_simplified_clause will rewrite the containment operator into a 
> clause using inequality operators from the btree family (if available 
> for the element type).
> 
> A boolean constant is returned if the range is either empty or has no 
> bounds.
> 
> Performing the rewrite here lets the clausesel machinery provide the 
> same estimates as for normal scalar inequalities.
> 
> In both cases build_bound_expr is used to build the operator clauses 
> from RangeBounds.

I think that this is a small, but useful improvement.

The patch applies and builds without warning and passes "make 
installcheck-world"
with the (ample) new regression tests.

Some comments:

- About the regression tests:
  You are using EXPLAIN (ANALYZE, SUMMARY OFF, TIMING OFF, COSTS OFF).
  While that returns stable results, I don't see the added value.
  I think you should use EXPLAIN (COSTS OFF).  You don't need to test the
  actual number of result rows; we can trust that the executor  processes
  >= and < correctly.
  Plain EXPLAIN would speed up the regression tests, which is a good thing.

- About the implementation:
  You implement both "SupportRequestIndexCondition" and 
"SupportRequestSimplify",
  but when I experimented, the former was never called.  That does not
  surprise me, since any expression of the shape "expr <@ range constant"
  can be simplified.  Is the "SupportRequestIndexCondition" branch dead code?
  If not, do you have an example that triggers it?

- About the code:

  +static Node *
  +find_index_quals(Const *rangeConst, Expr *otherExpr, Oid opfamily)
  +{
  [...]
  +
  +   if (!(lower.infinite && upper.infinite))
  +   {
 [...]
  +   }
  +
  +   return NULL;

  To avoid deep indentation and to make the code more readable, I think
  it would be better to write

  if (!(lower.infinite && upper.infinite))
  return NULL;

  and unindent the rest of the code


  +static Node *
  +match_support_request(Node *rawreq)
  +{
  [...]
  +   switch (req->funcid)
  +   {
  +   case F_ELEM_CONTAINED_BY_RANGE:
  [...]
  +   case F_RANGE_CONTAINS_ELEM:
  [...]
  +   default:
  +   return NULL;
  +   }

  (This code appears twice.)

  The default clause should not be reachable, right?
  I think that there should be an Assert() to verify that.
  Perhaps something like

  Assert(req->funcid == F_ELEM_CONTAINED_BY_RANGE ||
 req->funcid == F_RANGE_CONTAINS_ELEM);

  if (req->funcid == F_ELEM_CONTAINED_BY_RANGE)
  {
  [...]
  }
  else if (req->funcid == F_RANGE_CONTAINS_ELEM)
  {
  [...]
  }

Yours,
Laurenz Albe




Re: [PATCH] Add support function for containment operators

2023-07-06 Thread Kim Johan Andersson
Any thoughts on this change?