Re: [PATCH] Add support function for containment operators
On Sun, 21 Jan 2024 at 00:31, Tom Lane wrote: > > jian he writes: > > Now I see your point. If the transformed plan is right, the whole > > added code should be fine. > > but keeping the textrange_supp related test should be a good idea. > > since we don't have SUBTYPE_OPCLASS related sql tests. > > Yeah, it's a little harder to make a table-less test for that case. > I thought about using current_user or the like as a stable comparison > value, but that introduces some doubt about what the collation would > be. That test seems cheap enough as-is, since it's handling only a > tiny amount of data. > > Committed. I have updated the commitfest entry to Committed as the patch is committed. Regards, Vignesh
Re: [PATCH] Add support function for containment operators
jian he writes: > Now I see your point. If the transformed plan is right, the whole > added code should be fine. > but keeping the textrange_supp related test should be a good idea. > since we don't have SUBTYPE_OPCLASS related sql tests. Yeah, it's a little harder to make a table-less test for that case. I thought about using current_user or the like as a stable comparison value, but that introduces some doubt about what the collation would be. That test seems cheap enough as-is, since it's handling only a tiny amount of data. Committed. regards, tom lane
Re: [PATCH] Add support function for containment operators
On Wed, Jan 17, 2024 at 5:46 AM Tom Lane wrote: > > But perhaps someone has an argument for a different rule? > > Anyway, pending discussion of that point, I think the code is good > to go. I don't like the test cases much though: they expend many more > cycles than necessary. You could prove the same points just by > looking at the expansion of expressions, eg. > your patch is far better! IMHO, worried about the support function, the transformed plan generates the wrong result, so we add the tests to make it bullet proof. Now I see your point. If the transformed plan is right, the whole added code should be fine. but keeping the textrange_supp related test should be a good idea. since we don't have SUBTYPE_OPCLASS related sql tests.
Re: [PATCH] Add support function for containment operators
jian he writes: > [ v5-0001-Simplify-containment-in-range-constants-with-supp.patch ] I spent some time reviewing and cleaning up this code. The major problem I noted was that it doesn't spend any effort thinking about cases where it'd be unsafe or undesirable to apply the transformation. In particular, it's entirely uncool to produce a double-sided comparison if the elemExpr is volatile. These two expressions do not have the same result: select random() <@ float8range(0.1, 0.2); select random() >= 0.1 and random() < 0.2; (Yes, I'm aware that BETWEEN is broken in this respect. All the more reason why we mustn't break <@.) Another problem is that even if the elemExpr isn't volatile, it might be expensive enough that evaluating it twice is bad. I am not sure where we ought to put the cutoff. There are some existing places where we set a 10X cpu_operator_cost limit on comparable transformations, so I stole that logic in the attached. But perhaps someone has an argument for a different rule? Anyway, pending discussion of that point, I think the code is good to go. I don't like the test cases much though: they expend many more cycles than necessary. You could prove the same points just by looking at the expansion of expressions, eg. regression=# explain (verbose, costs off) select current_date <@ daterange(null,null); QUERY PLAN Result Output: true (2 rows) regression=# explain (verbose, costs off) select current_date <@ daterange('-Infinity', '1997-04-10'::date, '[)'); QUERY PLAN - Result Output: ((CURRENT_DATE >= '-infinity'::date) AND (CURRENT_DATE < '1997-04-10'::date)) (2 rows) I'd suggest losing the temp table and just coding tests like these. regards, tom lane diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c index d99b00b590..2d94a6b877 100644 --- a/src/backend/utils/adt/rangetypes.c +++ b/src/backend/utils/adt/rangetypes.c @@ -30,19 +30,20 @@ */ #include "postgres.h" -#include "access/tupmacs.h" #include "common/hashfn.h" -#include "lib/stringinfo.h" #include "libpq/pqformat.h" #include "miscadmin.h" +#include "nodes/makefuncs.h" #include "nodes/miscnodes.h" -#include "port/pg_bitutils.h" +#include "nodes/supportnodes.h" +#include "optimizer/clauses.h" +#include "optimizer/cost.h" +#include "optimizer/optimizer.h" #include "utils/builtins.h" #include "utils/date.h" #include "utils/lsyscache.h" #include "utils/rangetypes.h" #include "utils/timestamp.h" -#include "varatt.h" /* fn_extra cache entry for one of the range I/O functions */ @@ -69,6 +70,12 @@ static Size datum_compute_size(Size data_length, Datum val, bool typbyval, char typalign, int16 typlen, char typstorage); static Pointer datum_write(Pointer ptr, Datum datum, bool typbyval, char typalign, int16 typlen, char typstorage); +static Node *find_simplified_clause(PlannerInfo *root, + Expr *rangeExpr, Expr *elemExpr); +static Expr *build_bound_expr(Expr *elemExpr, Datum val, + bool isLowerBound, bool isInclusive, + TypeCacheEntry *typeCache, + Oid opfamily, Oid rng_collation); /* @@ -2173,6 +2180,58 @@ make_empty_range(TypeCacheEntry *typcache) return make_range(typcache, , , true, NULL); } +/* + * Planner support function for elem_contained_by_range (<@ operator). + */ +Datum +elem_contained_by_range_support(PG_FUNCTION_ARGS) +{ + Node *rawreq = (Node *) PG_GETARG_POINTER(0); + Node *ret = NULL; + + if (IsA(rawreq, SupportRequestSimplify)) + { + SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq; + FuncExpr *fexpr = req->fcall; + Expr *leftop, + *rightop; + + Assert(list_length(fexpr->args) == 2); + leftop = linitial(fexpr->args); + rightop = lsecond(fexpr->args); + + ret = find_simplified_clause(req->root, rightop, leftop); + } + + PG_RETURN_POINTER(ret); +} + +/* + * Planner support function for range_contains_elem (@> operator). + */ +Datum +range_contains_elem_support(PG_FUNCTION_ARGS) +{ + Node *rawreq = (Node *) PG_GETARG_POINTER(0); + Node *ret = NULL; + + if (IsA(rawreq, SupportRequestSimplify)) + { + SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq; + FuncExpr *fexpr = req->fcall; + Expr *leftop, + *rightop; + + Assert(list_length(fexpr->args) == 2); + leftop = linitial(fexpr->args); + rightop = lsecond(fexpr->args); + + ret = find_simplified_clause(req->root, leftop, rightop); + } + + PG_RETURN_POINTER(ret); +} + /* *-- @@ -2715,3 +2774,180 @@ datum_write(Pointer ptr, Datum datum, bool typbyval, char typalign, return ptr; } + +/* + * Common code for the elem_contained_by_range and range_contains_elem + * support functions. The caller has
Re: [PATCH] Add support function for containment operators
fix a typo and also did a minor change. from + if (lowerExpr != NULL && upperExpr != NULL) + return (Node *) makeBoolExpr(AND_EXPR, list_make2(lowerExpr, upperExpr), -1); + else if (lowerExpr != NULL) + return (Node *) lowerExpr; + else if (upperExpr != NULL) + return (Node *) upperExpr; to + if (lowerExpr != NULL && upperExpr != NULL) + return (Node *) makeBoolExpr(AND_EXPR, list_make2(lowerExpr, upperExpr), -1); + else if (lowerExpr != NULL) + return (Node *) lowerExpr; + else if (upperExpr != NULL) + return (Node *) upperExpr; + else + { + Assert(false); + return NULL; + } because cfbot says: 15:04:38.116] make -s -j${BUILD_JOBS} clean [15:04:38.510] time make -s -j${BUILD_JOBS} world-bin [15:04:43.272] rangetypes.c:2908:1: error: non-void function does not return a value in all control paths [-Werror,-Wreturn-type] [15:04:43.272] } [15:04:43.272] ^ [15:04:43.272] 1 error generated. also add some commit messages, I hope it will be useful. From 334566a62163469f53cfd941ed87e6d6e54d756b Mon Sep 17 00:00:00 2001 From: pgaddict Date: Mon, 11 Dec 2023 19:14:01 +0800 Subject: [PATCH v5 1/1] Simplify containment in range constants with support function doc: https://www.postgresql.org/docs/current/extend-type-system.html#EXTEND-TYPES-POLYMORPHIC Similarly, if there are positions declared anyrange and others declared anyelement or anyarray, the actual range type in the anyrange positions must be a range whose subtype is the same type appearing in the anyelement positions and the same as the element type of the anyarray positions. proname => 'range_contains_elem', prorettype => 'bool', proargtypes => 'anyrange anyelement', proname => 'elem_contained_by_range', prorettype => 'bool', proargtypes => 'anyelement anyrange' It will work, because the range constant base type will be the same as the element. so we can safely rewrite "element within a range or not" to element comparison with range lower bound and (&&) element comparison with range upper bound. and these simple same data type comparisons can be supported by btree index. --- src/backend/utils/adt/rangetypes.c | 196 src/backend/utils/adt/rangetypes_selfuncs.c | 6 +- src/include/catalog/pg_proc.dat | 12 +- src/test/regress/expected/rangetypes.out| 193 +++ src/test/regress/sql/rangetypes.sql | 92 + 5 files changed, 494 insertions(+), 5 deletions(-) diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c index 24bad529..0b3eab08 100644 --- a/src/backend/utils/adt/rangetypes.c +++ b/src/backend/utils/adt/rangetypes.c @@ -31,16 +31,24 @@ #include "postgres.h" #include "access/tupmacs.h" +#include "access/stratnum.h" +#include "catalog/pg_range.h" #include "common/hashfn.h" #include "lib/stringinfo.h" #include "libpq/pqformat.h" #include "miscadmin.h" #include "nodes/miscnodes.h" +#include "nodes/makefuncs.h" +#include "nodes/nodeFuncs.h" +#include "nodes/pg_list.h" +#include "nodes/supportnodes.h" #include "port/pg_bitutils.h" #include "utils/builtins.h" +#include "utils/fmgroids.h" #include "utils/date.h" #include "utils/lsyscache.h" #include "utils/rangetypes.h" +#include "utils/syscache.h" #include "utils/timestamp.h" #include "varatt.h" @@ -69,6 +77,10 @@ static Size datum_compute_size(Size data_length, Datum val, bool typbyval, char typalign, int16 typlen, char typstorage); static Pointer datum_write(Pointer ptr, Datum datum, bool typbyval, char typalign, int16 typlen, char typstorage); +static Expr *build_bound_expr(Oid opfamily, TypeCacheEntry *typeCache, + bool isLowerBound, bool isInclusive, + Datum val, Expr *otherExpr, Oid rng_collation); +static Node *find_simplified_clause(Const *rangeConst, Expr *otherExpr); /* @@ -2173,6 +2185,57 @@ make_empty_range(TypeCacheEntry *typcache) return make_range(typcache, , , true, NULL); } +/* + * Planner support function for the elem_contained_by_range (<@) operator. + */ +Datum +elem_contained_by_range_support(PG_FUNCTION_ARGS) +{ + Node *rawreq = (Node *) PG_GETARG_POINTER(0), + *leftop, + *rightop; + FuncExpr *fexpr; + + if (!IsA(rawreq, SupportRequestSimplify)) + PG_RETURN_POINTER(NULL); + + fexpr = ((SupportRequestSimplify *) rawreq)->fcall; + + Assert(list_length(fexpr->args) == 2); + leftop = linitial(fexpr->args); + rightop = lsecond(fexpr->args); + + if (!IsA(rightop, Const) || ((Const *) rightop)->constisnull) + PG_RETURN_POINTER(NULL); + + PG_RETURN_POINTER(find_simplified_clause((Const *) rightop, (Expr *) leftop)); +} + +/* + * Planner support function for the range_contains_elem (@>) operator. + */ +Datum +range_contains_elem_support(PG_FUNCTION_ARGS) +{ + Node *rawreq = (Node *) PG_GETARG_POINTER(0), + *leftop, + *rightop; + FuncExpr *fexpr; + + if (!IsA(rawreq, SupportRequestSimplify)) + PG_RETURN_POINTER(NULL); + + fexpr = ((SupportRequestSimplify *) rawreq)->fcall; + +
Re: [PATCH] Add support function for containment operators
On 12-11-2023 20:20, Laurenz Albe wrote: On Sun, 2023-11-12 at 18:15 +0100, Laurenz Albe wrote: I adjusted your patch according to my comments; what do you think? I have added the patch to the January commitfest, with Jian and Kim as authors. I hope that is OK with you. Sounds great to me. Thanks to Jian for picking this up. Regards, Kim Johan Andersson
Re: [PATCH] Add support function for containment operators
On Sun, 2023-11-12 at 18:15 +0100, Laurenz Albe wrote: > I adjusted your patch according to my comments; what do you think? I have added the patch to the January commitfest, with Jian and Kim as authors. I hope that is OK with you. Yours, Laurenz Albe
Re: [PATCH] Add support function for containment operators
On Fri, 2023-10-20 at 16:24 +0800, jian he wrote: > [new patch] Thanks, that patch works as expected and passes regression tests. Some comments about the code: > --- a/src/backend/utils/adt/rangetypes.c > +++ b/src/backend/utils/adt/rangetypes.c > @@ -558,7 +570,6 @@ elem_contained_by_range(PG_FUNCTION_ARGS) > PG_RETURN_BOOL(range_contains_elem_internal(typcache, r, val)); > } > > - > /* range, range -> bool functions */ > > /* equality (internal version) */ Please don't change unrelated whitespace. > +static Node * > +find_simplified_clause(Const *rangeConst, Expr *otherExpr) > +{ > + Form_pg_range pg_range; > + HeapTuple tup; > + Oid opclassOid; > + RangeBound lower; > + RangeBound upper; > + boolempty; > + Oid rng_collation; > + TypeCacheEntry *elemTypcache; > + Oid opfamily = InvalidOid; > + > + RangeType *range = DatumGetRangeTypeP(rangeConst->constvalue); > + TypeCacheEntry *rangetypcache = > lookup_type_cache(RangeTypeGetOid(range), TYPECACHE_RANGE_INFO); > + { This brace is unnecessary. Perhaps a leftover from a removed conditional statement. > + /* this part is get the range's SUBTYPE_OPCLASS from pg_range > catalog. > + * Refer load_rangetype_info function last line. > + * TypeCacheEntry->rngelemtype typcaheenetry either don't have > opclass entry or with default opclass. > + * Range's subtype opclass only in catalog table. > + */ The comments in the patch need some more love. Apart from the language, you should have a look at the style guide: - single-line comments should start with lower case and have no period: /* example of a single-line comment */ - Multi-line comments should start with /* on its own line and end with */ on its own line. They should use whole sentences: /* * In case a comment spans several lines, it should look like * this. Try not to exceed 80 characters. */ > + tup = SearchSysCache1(RANGETYPE, > ObjectIdGetDatum(RangeTypeGetOid(range))); > + > + /* should not fail, since we already checked typtype ... */ > + if (!HeapTupleIsValid(tup)) > + elog(ERROR, "cache lookup failed for range type %u", > RangeTypeGetOid(range)); If this is a "can't happen" case, it should be an Assert. > + > + pg_range = (Form_pg_range) GETSTRUCT(tup); > + > + opclassOid = pg_range->rngsubopc; > + > + ReleaseSysCache(tup); > + > + /* get opclass properties and look up the comparison function */ > + opfamily = get_opclass_family(opclassOid); > + } > + > + range_deserialize(rangetypcache, range, , , ); > + rng_collation = rangetypcache->rng_collation; > + > + if (empty) > + { > + /* If the range is empty, then there can be no matches. */ > + return makeBoolConst(false, false); > + } > + else if (lower.infinite && upper.infinite) > + { > + /* The range has no bounds, so matches everything. */ > + return makeBoolConst(true, false); > + } > + else > + { Many of the variables declared at the beginning of the function are only used in this branch. You should declare them here. > +static Node * > +match_support_request(Node *rawreq) > +{ > + if (IsA(rawreq, SupportRequestSimplify)) > + { To keep the indentation shallow, the preferred style is: if (/* case we don't handle */) return NULL; /* proceed without indentation */ > + SupportRequestSimplify *req = (SupportRequestSimplify *) rawreq; > + FuncExpr *fexpr = req->fcall; > + Node *leftop; > + Node *rightop; > + Const *rangeConst; > + Expr *otherExpr; > + > + Assert(list_length(fexpr->args) == 2); > + > + leftop = linitial(fexpr->args); > + rightop = lsecond(fexpr->args); > + > + switch (fexpr->funcid) > + { > + case F_ELEM_CONTAINED_BY_RANGE: > + if (!IsA(rightop, Const) || ((Const *) > rightop)->constisnull) > + return NULL; > + > + rangeConst = (Const *) rightop; > + otherExpr = (Expr *) leftop; > + break; > + > + case F_RANGE_CONTAINS_ELEM: > + if (!IsA(leftop, Const) || ((Const *) > leftop)->constisnull) > + return NULL; > + > + rangeConst = (Const *) leftop; > + otherExpr = (Expr *) rightop; > + break; > + > + default: > + return NULL; >
Re: [PATCH] Add support function for containment operators
On Fri, Oct 20, 2023 at 12:01 AM Laurenz Albe wrote: > > On Fri, 2023-10-13 at 14:26 +0800, jian he wrote: > > Collation problem seems solved. > > I didn't review your patch in detail, there is still a problem > with my example: > > CREATE TYPE textrange AS RANGE ( > SUBTYPE = text, > SUBTYPE_OPCLASS = text_pattern_ops > ); > > CREATE TABLE tx (t text COLLATE "cs-CZ-x-icu"); > > INSERT INTO tx VALUES ('a'), ('c'), ('d'), ('ch'); > > SELECT * FROM tx WHERE t <@ textrange('a', 'd'); > >t > >a >c >ch > (3 rows) > > That was correct. > > EXPLAIN SELECT * FROM tx WHERE t <@ textrange('a', 'd'); > >QUERY PLAN > >Seq Scan on tx (cost=0.00..30.40 rows=7 width=32) > Filter: ((t >= 'a'::text) AND (t < 'd'::text)) > (2 rows) > > But that was weird. The operators seem wrong. Look at that Thanks for pointing this out! The problem is that TypeCacheEntry->rngelemtype typcaheentry don't have the range's SUBTYPE_OPCLASS info. So in find_simplified_clause, we need to get the range's SUBTYPE_OPCLASS from the pg_catalog table. Also in pg_range, column rngsubopc is not null. so this should be fine. From 810208a42e99109bcaf56cddae6968efbcf969c5 Mon Sep 17 00:00:00 2001 From: pgaddict Date: Fri, 20 Oct 2023 16:20:38 +0800 Subject: [PATCH v3 1/1] Optimize quals (Expr <@ RangeConst) and (RangeConst @> Expr) Previously these two quals will be processed as is. This patch will rewritten the expression to expose more info to optimizer by adding prosupport function to range_contains_elem, elem_contained_by_range. Expr <@ rangeConst will be rewritten to "expr opr range_lower_bound and expr opr range_upper_bound". (range bound inclusiveness will be handled properly). Added some tests to validate the generated plan. --- src/backend/utils/adt/rangetypes.c | 229 +++- src/backend/utils/adt/rangetypes_selfuncs.c | 6 +- src/include/catalog/pg_proc.dat | 12 +- src/test/regress/expected/rangetypes.out| 194 + src/test/regress/sql/rangetypes.sql | 101 + 5 files changed, 535 insertions(+), 7 deletions(-) diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c index d65e5625..d100e55e 100644 --- a/src/backend/utils/adt/rangetypes.c +++ b/src/backend/utils/adt/rangetypes.c @@ -31,16 +31,24 @@ #include "postgres.h" #include "access/tupmacs.h" +#include "access/stratnum.h" +#include "catalog/pg_range.h" #include "common/hashfn.h" #include "lib/stringinfo.h" #include "libpq/pqformat.h" #include "miscadmin.h" #include "nodes/miscnodes.h" +#include "nodes/makefuncs.h" +#include "nodes/nodeFuncs.h" +#include "nodes/pg_list.h" +#include "nodes/supportnodes.h" #include "port/pg_bitutils.h" #include "utils/builtins.h" +#include "utils/fmgroids.h" #include "utils/date.h" #include "utils/lsyscache.h" #include "utils/rangetypes.h" +#include "utils/syscache.h" #include "utils/timestamp.h" #include "varatt.h" @@ -69,7 +77,11 @@ static Size datum_compute_size(Size data_length, Datum val, bool typbyval, char typalign, int16 typlen, char typstorage); static Pointer datum_write(Pointer ptr, Datum datum, bool typbyval, char typalign, int16 typlen, char typstorage); - +static Expr *build_bound_expr(Oid opfamily, TypeCacheEntry *typeCache, + bool isLowerBound, bool isInclusive, + Datum val, Expr *otherExpr, Oid rng_collation); +static Node *find_simplified_clause(Const *rangeConst, Expr *otherExpr); +static Node *match_support_request(Node *rawreq); /* *-- @@ -558,7 +570,6 @@ elem_contained_by_range(PG_FUNCTION_ARGS) PG_RETURN_BOOL(range_contains_elem_internal(typcache, r, val)); } - /* range, range -> bool functions */ /* equality (internal version) */ @@ -2173,6 +2184,29 @@ make_empty_range(TypeCacheEntry *typcache) return make_range(typcache, , , true, NULL); } +/* + * Planner support function for elem_contained_by_range operator + */ +Datum +elem_contained_by_range_support(PG_FUNCTION_ARGS) +{ + Node *rawreq = (Node *) PG_GETARG_POINTER(0); + Node *ret = match_support_request(rawreq); + + PG_RETURN_POINTER(ret); +} + +/* + * Planner support function for range_contains_elem operator + */ +Datum +range_contains_elem_support(PG_FUNCTION_ARGS) +{ + Node *rawreq = (Node *) PG_GETARG_POINTER(0); + Node *ret = match_support_request(rawreq); + + PG_RETURN_POINTER(ret); +} /* *-- @@ -2714,3 +2748,194 @@ datum_write(Pointer ptr, Datum datum, bool typbyval, char typalign, return ptr; } +/* + * build (Expr Operator Range_bound) expression. something like: expr >= lower(range) + * if operator both sides have collation, operator should use (right) range's collation + * +*/ +static Expr * +build_bound_expr(Oid
Re: [PATCH] Add support function for containment operators
On Fri, 2023-10-13 at 14:26 +0800, jian he wrote: > Collation problem seems solved. I didn't review your patch in detail, there is still a problem with my example: CREATE TYPE textrange AS RANGE ( SUBTYPE = text, SUBTYPE_OPCLASS = text_pattern_ops ); CREATE TABLE tx (t text COLLATE "cs-CZ-x-icu"); INSERT INTO tx VALUES ('a'), ('c'), ('d'), ('ch'); SELECT * FROM tx WHERE t <@ textrange('a', 'd'); t a c ch (3 rows) That was correct. EXPLAIN SELECT * FROM tx WHERE t <@ textrange('a', 'd'); QUERY PLAN Seq Scan on tx (cost=0.00..30.40 rows=7 width=32) Filter: ((t >= 'a'::text) AND (t < 'd'::text)) (2 rows) But that was weird. The operators seem wrong. Look at that query: SELECT * FROM tx WHERE t >= 'a' AND t < 'd'; t ═══ a c (2 rows) But the execution plan is identical... I am not sure what is the problem here, but in my opinion the operators shown in the execution plan should be like this: SELECT * FROM tx WHERE t ~>=~ 'a' AND t ~<~ 'd'; t a c ch (3 rows) Yours, Laurenz Albe
Re: [PATCH] Add support function for containment operators
On Tue, Aug 1, 2023 at 10:07 AM Laurenz Albe wrote: > > > > > > > I had an idea about this: > > > So far, you only consider constant ranges. But if we have a STABLE range > > > expression, you could use an index scan for "expr <@ range", for example > > > Index Cond (expr >= lower(range) AND expr < upper(range)). > > > The above part, not sure how to implement it, not sure it is doable. Refactor: drop SupportRequestIndexCondition and related code, since mentioned in upthread, it's dead code. refactor the regression test. (since data types with infinity cover more cases than int4range, so I deleted some tests). now there are 3 helper functions (build_bound_expr, find_simplified_clause, match_support_request), 2 entry functions (elem_contained_by_range_support, range_contains_elem_support) Collation problem seems solved. Putting the following test on the src/test/regress/sql/rangetypes.sql will not work. Maybe because of the order of the regression test, in SQL-ASCII encoding, I cannot create collation="cs-CZ-x-icu". drop type if EXISTS textrange1, textrange2; drop table if EXISTS collate_test1, collate_test2; CREATE TYPE textrange1 AS RANGE (SUBTYPE = text, collation="C"); create type textrange2 as range(subtype=text, collation="cs-CZ-x-icu"); CREATE TABLE collate_test1 (b text COLLATE "en-x-icu" NOT NULL); INSERT INTO collate_test1(b) VALUES ('a'), ('c'), ('d'), ('ch'); CREATE TABLE collate_test2 (b text NOT NULL); INSERT INTO collate_test2(b) VALUES ('a'), ('c'), ('d'), ('ch'); --should include 'ch' SELECT * FROM collate_test1 WHERE b <@ textrange1('a', 'd'); --should not include 'ch' SELECT * FROM collate_test1 WHERE b <@ textrange2('a', 'd'); --should include 'ch' SELECT * FROM collate_test2 WHERE b <@ textrange1('a', 'd'); --should not include 'ch' SELECT * FROM collate_test2 WHERE b <@ textrange2('a', 'd'); - From bcae7f8a0640b48b04f243660539e8670de43d41 Mon Sep 17 00:00:00 2001 From: pgaddict Date: Fri, 13 Oct 2023 12:43:41 +0800 Subject: [PATCH v2 1/1] Optimize qual cases like Expr <@ RangeConst and RangeConst @> Expr Previously these two quals will be processed as is. With this patch, adding prosupport function to range_contains_elem, elem_contained_by_range. So cases like Expr <@ rangeCOnst, rangeConst @> expr will be rewritten to "expr opr range_lower_bound and expr opr range_upper_bound". Expr <@ rangeConst will be logically equivalent to rangeConst @> Expr, if rangeConst is the same. added some tests to validate the generated plan. --- src/backend/utils/adt/rangetypes.c | 199 +++- src/backend/utils/adt/rangetypes_selfuncs.c | 6 +- src/include/catalog/pg_proc.dat | 12 +- src/test/regress/expected/rangetypes.out| 194 +++ src/test/regress/sql/rangetypes.sql | 101 ++ 5 files changed, 505 insertions(+), 7 deletions(-) diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c index d65e5625..647bd5e4 100644 --- a/src/backend/utils/adt/rangetypes.c +++ b/src/backend/utils/adt/rangetypes.c @@ -31,13 +31,19 @@ #include "postgres.h" #include "access/tupmacs.h" +#include "access/stratnum.h" #include "common/hashfn.h" #include "lib/stringinfo.h" #include "libpq/pqformat.h" #include "miscadmin.h" +#include "nodes/supportnodes.h" +#include "nodes/makefuncs.h" +#include "nodes/nodeFuncs.h" +#include "nodes/pg_list.h" #include "nodes/miscnodes.h" #include "port/pg_bitutils.h" #include "utils/builtins.h" +#include "utils/fmgroids.h" #include "utils/date.h" #include "utils/lsyscache.h" #include "utils/rangetypes.h" @@ -69,7 +75,11 @@ static Size datum_compute_size(Size data_length, Datum val, bool typbyval, char typalign, int16 typlen, char typstorage); static Pointer datum_write(Pointer ptr, Datum datum, bool typbyval, char typalign, int16 typlen, char typstorage); - +static Expr *build_bound_expr(Oid opfamily, TypeCacheEntry *typeCache, + bool isLowerBound, bool isInclusive, + Datum val, Expr *otherExpr, Oid rng_collation); +static Node *find_simplified_clause(Const *rangeConst, Expr *otherExpr); +static Node *match_support_request(Node *rawreq); /* *-- @@ -558,7 +568,6 @@ elem_contained_by_range(PG_FUNCTION_ARGS) PG_RETURN_BOOL(range_contains_elem_internal(typcache, r, val)); } - /* range, range -> bool functions */ /* equality (internal version) */ @@ -2173,6 +2182,29 @@ make_empty_range(TypeCacheEntry *typcache) return make_range(typcache, , , true, NULL); } +/* + * Planner support function for elem_contained_by_range operator + */ +Datum +elem_contained_by_range_support(PG_FUNCTION_ARGS) +{ + Node *rawreq = (Node *) PG_GETARG_POINTER(0); + Node *ret = match_support_request(rawreq); + + PG_RETURN_POINTER(ret); +} + +/* + * Planner support function for range_contains_elem operator + */ +Datum +range_contains_elem_support(PG_FUNCTION_ARGS)
Re: [PATCH] Add support function for containment operators
On Sat, 2023-07-08 at 08:11 +0200, Kim Johan Andersson wrote: > On 07-07-2023 13:20, Laurenz Albe wrote: > > I wrote: > > > You implement both "SupportRequestIndexCondition" and > > > "SupportRequestSimplify", > > > but when I experimented, the former was never called. That does not > > > surprise me, since any expression of the shape "expr <@ range constant" > > > can be simplified. Is the "SupportRequestIndexCondition" branch dead > > > code? > > > If not, do you have an example that triggers it? > > I would think it is dead code, I came to the same conclusion. So we can > drop SupportRequestIndexCondition, since the simplification happens to > take care of everything. > > > > I had an idea about this: > > So far, you only consider constant ranges. But if we have a STABLE range > > expression, you could use an index scan for "expr <@ range", for example > > Index Cond (expr >= lower(range) AND expr < upper(range)). > > > > I will try to look into this. Originally that was what I was hoping for, > but didn't see way of going about it. > > Thanks for your comments, I will also look at the locale-related > breakage you spotted. I have marked the patch as "returned with feedback". I encourage you to submit an improved version in a future commitfest! Yours, Laurenz Albe
Re: [PATCH] Add support function for containment operators
On 07-07-2023 13:20, Laurenz Albe wrote: I wrote: You implement both "SupportRequestIndexCondition" and "SupportRequestSimplify", but when I experimented, the former was never called. That does not surprise me, since any expression of the shape "expr <@ range constant" can be simplified. Is the "SupportRequestIndexCondition" branch dead code? If not, do you have an example that triggers it? I would think it is dead code, I came to the same conclusion. So we can drop SupportRequestIndexCondition, since the simplification happens to take care of everything. I had an idea about this: So far, you only consider constant ranges. But if we have a STABLE range expression, you could use an index scan for "expr <@ range", for example Index Cond (expr >= lower(range) AND expr < upper(range)). I will try to look into this. Originally that was what I was hoping for, but didn't see way of going about it. Thanks for your comments, I will also look at the locale-related breakage you spotted. Regards, Kimjand
Re: [PATCH] Add support function for containment operators
I wrote: > You implement both "SupportRequestIndexCondition" and > "SupportRequestSimplify", > but when I experimented, the former was never called. That does not > surprise me, since any expression of the shape "expr <@ range constant" > can be simplified. Is the "SupportRequestIndexCondition" branch dead code? > If not, do you have an example that triggers it? I had an idea about this: So far, you only consider constant ranges. But if we have a STABLE range expression, you could use an index scan for "expr <@ range", for example Index Cond (expr >= lower(range) AND expr < upper(range)). Yours, Laurenz Albe
Re: [PATCH] Add support function for containment operators
> On Sat, 2023-04-29 at 17:07 +0200, Kim Johan Andersson wrote: > > I had noticed that performance wasn't great when using the @> or <@ > > operators when examining if an element is contained in a range. > > Based on the discussion in [1] I would like to suggest the following > > changes: > > > > This patch attempts to improve the row estimation, as well as opening > > the possibility of using a btree index scan when using the containment > > operators. I managed to break the patch: CREATE DATABASE czech ICU_LOCALE "cs-CZ" LOCALE "cs_CZ.utf8" TEMPLATE template0; \c czech CREATE TYPE textrange AS RANGE (SUBTYPE = text, SUBTYPE_OPCLASS = text_pattern_ops); CREATE TABLE tx (t text); INSERT INTO tx VALUES ('a'), ('c'), ('d'), ('ch'); EXPLAIN SELECT * FROM tx WHERE t <@ textrange('a', 'd'); QUERY PLAN Seq Scan on tx (cost=0.00..30.40 rows=7 width=32) Filter: ((t >= 'a'::text) AND (t < 'd'::text)) (2 rows) SELECT * FROM tx WHERE t <@ textrange('a', 'd'); ERROR: could not determine which collation to use for string comparison HINT: Use the COLLATE clause to set the collation explicitly. The replacement operators are wrong; it should be ~>=~ and ~<~ . Also, there should be no error message. The result should be 'a', 'c' and 'ch'. Yours, Laurenz Albe
Re: [PATCH] Add support function for containment operators
On Sat, 2023-04-29 at 17:07 +0200, Kim Johan Andersson wrote: > I had noticed that performance wasn't great when using the @> or <@ > operators when examining if an element is contained in a range. > Based on the discussion in [1] I would like to suggest the following > changes: > > This patch attempts to improve the row estimation, as well as opening > the possibility of using a btree index scan when using the containment > operators. > > This is done via a new support function handling the following 2 requests: > > * SupportRequestIndexCondition > find_index_quals will build an operator clause, given at least one > finite RangeBound. > > * SupportRequestSimplify > find_simplified_clause will rewrite the containment operator into a > clause using inequality operators from the btree family (if available > for the element type). > > A boolean constant is returned if the range is either empty or has no > bounds. > > Performing the rewrite here lets the clausesel machinery provide the > same estimates as for normal scalar inequalities. > > In both cases build_bound_expr is used to build the operator clauses > from RangeBounds. I think that this is a small, but useful improvement. The patch applies and builds without warning and passes "make installcheck-world" with the (ample) new regression tests. Some comments: - About the regression tests: You are using EXPLAIN (ANALYZE, SUMMARY OFF, TIMING OFF, COSTS OFF). While that returns stable results, I don't see the added value. I think you should use EXPLAIN (COSTS OFF). You don't need to test the actual number of result rows; we can trust that the executor processes >= and < correctly. Plain EXPLAIN would speed up the regression tests, which is a good thing. - About the implementation: You implement both "SupportRequestIndexCondition" and "SupportRequestSimplify", but when I experimented, the former was never called. That does not surprise me, since any expression of the shape "expr <@ range constant" can be simplified. Is the "SupportRequestIndexCondition" branch dead code? If not, do you have an example that triggers it? - About the code: +static Node * +find_index_quals(Const *rangeConst, Expr *otherExpr, Oid opfamily) +{ [...] + + if (!(lower.infinite && upper.infinite)) + { [...] + } + + return NULL; To avoid deep indentation and to make the code more readable, I think it would be better to write if (!(lower.infinite && upper.infinite)) return NULL; and unindent the rest of the code +static Node * +match_support_request(Node *rawreq) +{ [...] + switch (req->funcid) + { + case F_ELEM_CONTAINED_BY_RANGE: [...] + case F_RANGE_CONTAINS_ELEM: [...] + default: + return NULL; + } (This code appears twice.) The default clause should not be reachable, right? I think that there should be an Assert() to verify that. Perhaps something like Assert(req->funcid == F_ELEM_CONTAINED_BY_RANGE || req->funcid == F_RANGE_CONTAINS_ELEM); if (req->funcid == F_ELEM_CONTAINED_BY_RANGE) { [...] } else if (req->funcid == F_RANGE_CONTAINS_ELEM) { [...] } Yours, Laurenz Albe
Re: [PATCH] Add support function for containment operators
Any thoughts on this change?