On Tue, 2023-01-24 at 19:43 -0600, Justin Pryzby wrote: > I think "an optimization, if applicable" is either too terse, or > somehow > wrong. Maybe: > > > Enables or disables the use of abbreviated keys, a sort > > optimization...
Done. > > + optimization could return wrong results. Set to > > + <literal>true</literal> if certain that > > <function>strxfrm()</function> > > + can be trusted. > > "if you are certain"; or "if it is ..." Done. Thank you, rebased patch attached. -- Jeff Davis PostgreSQL Contributor Team - AWS
From 39ed011cc51ba3a4af5e3b559a7b8de25fb895a5 Mon Sep 17 00:00:00 2001 From: Jeff Davis <j...@j-davis.com> Date: Sat, 21 Jan 2023 12:44:07 -0800 Subject: [PATCH v2] Introduce GUCs to control abbreviated keys sort optimization. The setting sort_abbreviated_keys turns the optimization on or off overall. The optimization relies on collation providers, which are complex dependencies, and the performance of the optimization may rely on many factors. Introducing a GUC allows easier diagnosis when this optimization results in worse perforamnce. The setting trust_strxfrm replaces the define TRUST_STRXFRM, allowing users to experiment with the abbreviated keys optimization when using the libc provider. Previously, the optimization only applied to collations using the ICU provider unless specially compiled. By default, allowed only for superusers (because an incorrect setting could lead to wrong results), but can be granted to others. --- doc/src/sgml/config.sgml | 40 ++++++++++++++++++++++ src/backend/utils/adt/varlena.c | 10 +++--- src/backend/utils/misc/guc_tables.c | 24 +++++++++++++ src/backend/utils/sort/tuplesortvariants.c | 17 ++++++--- 4 files changed, 82 insertions(+), 9 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index f985afc009..8f55b89f35 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -11252,6 +11252,46 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir' </listitem> </varlistentry> + <varlistentry id="guc-sort-abbreviated-keys" xreflabel="sort_abbreviated_keys"> + <term><varname>sort_abbreviated_keys</varname> (<type>boolean</type>) + <indexterm> + <primary><varname>sort_abbreviated_keys</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Enables or disables the use of abbreviated sort keys, a sort optimization, + if applicable. The default is <literal>true</literal>. Disabling may + be useful to diagnose problems or measure performance. + </para> + </listitem> + </varlistentry> + + <varlistentry id="guc-trust-strxfrm" xreflabel="trust_strxfrm"> + <term><varname>trust_strxfrm</varname> (<type>boolean</type>) + <indexterm> + <primary><varname>trust_strxfrm</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Abbreviated keys, a sort optimization, depends on correct behavior of + the operating system function <function>strxfrm()</function> when + using a collation with the <literal>libc</literal> provider. On some + platforms <function>strxfrm()</function> does not return results + consistent with <function>strcoll()</function>, which means the + optimization could return wrong results. Set to + <literal>true</literal> if it is certain that + <function>strxfrm()</function> can be trusted. + </para> + <para> + The default value is <literal>false</literal>. This setting has no + effect if <xref linkend="guc-sort-abbreviated-keys"/> is set to + <literal>false</literal>. + </para> + </listitem> + </varlistentry> + <varlistentry id="guc-trace-locks" xreflabel="trace_locks"> <term><varname>trace_locks</varname> (<type>boolean</type>) <indexterm> diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c index fd81c47474..c270022483 100644 --- a/src/backend/utils/adt/varlena.c +++ b/src/backend/utils/adt/varlena.c @@ -45,6 +45,9 @@ /* GUC variable */ int bytea_output = BYTEA_OUTPUT_HEX; +/* GUC to enable use of strxfrm() for abbreviated keys */ +bool trust_strxfrm = false; + typedef struct varlena unknown; typedef struct varlena VarString; @@ -2115,7 +2118,7 @@ varstr_sortsupport(SortSupport ssup, Oid typid, Oid collid) * other libc other than Cygwin has so far been shown to have a problem, * we take the conservative course of action for right now and disable * this categorically. (Users who are certain this isn't a problem on - * their system can define TRUST_STRXFRM.) + * their system can set the trust_strxfrm GUC to true.) * * Even apart from the risk of broken locales, it's possible that there * are platforms where the use of abbreviated keys should be disabled at @@ -2128,10 +2131,9 @@ varstr_sortsupport(SortSupport ssup, Oid typid, Oid collid) * categorically, we may still want or need to disable it for particular * platforms. */ -#ifndef TRUST_STRXFRM - if (!collate_c && !(locale && locale->provider == COLLPROVIDER_ICU)) + if (!trust_strxfrm && !collate_c && + !(locale && locale->provider == COLLPROVIDER_ICU)) abbreviate = false; -#endif /* * If we're using abbreviated keys, or if we're using a locale-aware diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c index 4ac808ed22..fd4a02fbf5 100644 --- a/src/backend/utils/misc/guc_tables.c +++ b/src/backend/utils/misc/guc_tables.c @@ -102,6 +102,8 @@ extern bool trace_syncscan; #ifdef DEBUG_BOUNDED_SORT extern bool optimize_bounded_sort; #endif +extern bool sort_abbreviated_keys; +extern bool trust_strxfrm; /* * Options for enum values defined in this module. @@ -1673,6 +1675,28 @@ struct config_bool ConfigureNamesBool[] = }, #endif + { + {"sort_abbreviated_keys", PGC_USERSET, DEVELOPER_OPTIONS, + gettext_noop("Enables the use of abbreviated sort keys."), + NULL, + GUC_NOT_IN_SAMPLE | GUC_EXPLAIN + }, + &sort_abbreviated_keys, + true, + NULL, NULL, NULL + }, + + { + {"trust_strxfrm", PGC_SUSET, DEVELOPER_OPTIONS, + gettext_noop("Allow use of strxfrm() for abbreviated keys optimization for libc provider."), + NULL, + GUC_NOT_IN_SAMPLE + }, + &trust_strxfrm, + false, + NULL, NULL, NULL + }, + #ifdef WAL_DEBUG { {"wal_debug", PGC_SUSET, DEVELOPER_OPTIONS, diff --git a/src/backend/utils/sort/tuplesortvariants.c b/src/backend/utils/sort/tuplesortvariants.c index eb6cfcfd00..ba16779f97 100644 --- a/src/backend/utils/sort/tuplesortvariants.c +++ b/src/backend/utils/sort/tuplesortvariants.c @@ -37,6 +37,8 @@ #define DATUM_SORT 2 #define CLUSTER_SORT 3 +bool sort_abbreviated_keys = true; + static void removeabbrev_heap(Tuplesortstate *state, SortTuple *stups, int count); static void removeabbrev_cluster(Tuplesortstate *state, SortTuple *stups, @@ -185,7 +187,8 @@ tuplesort_begin_heap(TupleDesc tupDesc, sortKey->ssup_nulls_first = nullsFirstFlags[i]; sortKey->ssup_attno = attNums[i]; /* Convey if abbreviation optimization is applicable in principle */ - sortKey->abbreviate = (i == 0 && base->haveDatum1); + if (sort_abbreviated_keys) + sortKey->abbreviate = (i == 0 && base->haveDatum1); PrepareSortSupportFromOrderingOp(sortOperators[i], sortKey); } @@ -295,7 +298,8 @@ tuplesort_begin_cluster(TupleDesc tupDesc, (scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0; sortKey->ssup_attno = scanKey->sk_attno; /* Convey if abbreviation optimization is applicable in principle */ - sortKey->abbreviate = (i == 0 && base->haveDatum1); + if (sort_abbreviated_keys) + sortKey->abbreviate = (i == 0 && base->haveDatum1); Assert(sortKey->ssup_attno != 0); @@ -379,7 +383,8 @@ tuplesort_begin_index_btree(Relation heapRel, (scanKey->sk_flags & SK_BT_NULLS_FIRST) != 0; sortKey->ssup_attno = scanKey->sk_attno; /* Convey if abbreviation optimization is applicable in principle */ - sortKey->abbreviate = (i == 0 && base->haveDatum1); + if (sort_abbreviated_keys) + sortKey->abbreviate = (i == 0 && base->haveDatum1); Assert(sortKey->ssup_attno != 0); @@ -499,7 +504,8 @@ tuplesort_begin_index_gist(Relation heapRel, sortKey->ssup_nulls_first = false; sortKey->ssup_attno = i + 1; /* Convey if abbreviation optimization is applicable in principle */ - sortKey->abbreviate = (i == 0 && base->haveDatum1); + if (sort_abbreviated_keys) + sortKey->abbreviate = (i == 0 && base->haveDatum1); Assert(sortKey->ssup_attno != 0); @@ -573,7 +579,8 @@ tuplesort_begin_datum(Oid datumType, Oid sortOperator, Oid sortCollation, * can't, because a datum sort only stores a single copy of the datum; the * "tuple" field of each SortTuple is NULL. */ - base->sortKeys->abbreviate = !typbyval; + if (sort_abbreviated_keys) + base->sortKeys->abbreviate = !typbyval; PrepareSortSupportFromOrderingOp(sortOperator, base->sortKeys); -- 2.34.1