Re: check_strxfrm_bug()

2023-07-10 Thread Thomas Munro
On Tue, Jul 11, 2023 at 2:28 AM Peter Eisentraut  wrote:
> This looks sensible to me.
>
> If we ever need mbstowcs_l() etc. outside of pg_locale.c, then the
> proper way would be to make a mbstowcs_l.c file in src/port/.
>
> But I like your approach for now because it moves us more firmly into
> the direction of having it contained in pg_locale.c, instead of having
> some knowledge global and some local.

Thanks.  Pushed.

That leaves only one further cleanup opportunity from discussion this
thread, for future work: a TRUST_STRXFRM-ectomy.




Re: check_strxfrm_bug()

2023-07-10 Thread Peter Eisentraut

On 10.07.23 04:51, Thomas Munro wrote:

On Sun, Jul 9, 2023 at 6:35 PM Thomas Munro  wrote:

On Sun, Jul 9, 2023 at 6:20 PM Peter Eisentraut  wrote:

So I don't think this code is correct.  AFAICT, there is nothing right
now that can possibly define HAVE_MBSTOWCS_L on Windows/MSVC.  Was that
the intention?


Yes, that was my intention.  Windows actually doesn't have them.


Thinking about that some more...  Its _XXX implementations don't deal
with UTF-8 the way Unix-based developers would expect, and are
therefore just portability hazards, aren't they?  What would we gain
by restoring the advertisement that they are available?  Perhaps we
should go the other way completely and remove the relevant #defines
from win32_port.h, and fully confine knowledge of them to pg_locale.c?
  It knows how to deal with that.  Here is a patch trying this idea
out, with as slightly longer explanation.


This looks sensible to me.

If we ever need mbstowcs_l() etc. outside of pg_locale.c, then the 
proper way would be to make a mbstowcs_l.c file in src/port/.


But I like your approach for now because it moves us more firmly into 
the direction of having it contained in pg_locale.c, instead of having 
some knowledge global and some local.






Re: check_strxfrm_bug()

2023-07-09 Thread Thomas Munro
On Sun, Jul 9, 2023 at 6:35 PM Thomas Munro  wrote:
> On Sun, Jul 9, 2023 at 6:20 PM Peter Eisentraut  wrote:
> > So I don't think this code is correct.  AFAICT, there is nothing right
> > now that can possibly define HAVE_MBSTOWCS_L on Windows/MSVC.  Was that
> > the intention?
>
> Yes, that was my intention.  Windows actually doesn't have them.

Thinking about that some more...  Its _XXX implementations don't deal
with UTF-8 the way Unix-based developers would expect, and are
therefore just portability hazards, aren't they?  What would we gain
by restoring the advertisement that they are available?  Perhaps we
should go the other way completely and remove the relevant #defines
from win32_port.h, and fully confine knowledge of them to pg_locale.c?
 It knows how to deal with that.  Here is a patch trying this idea
out, with as slightly longer explanation.
From 1c04d781195266b6bc88d8a5a584a64aac07f613 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Mon, 10 Jul 2023 13:41:27 +1200
Subject: [PATCH] Don't expose Windows' mbstowcs_l() and wcstombs_l().

Windows has similar functions with leading underscores, and we provided
the rename via a macro in win32_port.h.  In fact its functions are not
always good replacements for the Unix functions, since they can't deal
with UTF-8.  They are only currently used by pg_locale.c, which is
careful to redirect to other Windows routines for UTF-8.  Given that
portability hazard, it seem unlikely to be a good idea to encourage any
other code to think of these functions as being available outside
pg_locale.c.  Any code that thinks it wants these functions probably
wants our wchar2char() or char2wchar() routines instead, or it won't
actually work on Windows in UTF-8 databases.

Furthermore, some major libc implementations including glibc don't have
them (they only have the standard variants without _l), so external code
is very unlikely to require them to exist.

diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 0eb764e897..61daa8190b 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -154,36 +154,41 @@ static void icu_set_collation_attributes(UCollator *collator, const char *loc,
 		 UErrorCode *status);
 #endif
 
-#ifndef WIN32
 /*
  * POSIX doesn't define _l-variants of these functions, but several systems
- * have them.  We provide our own replacements here.  For Windows, we have
- * macros in win32_port.h.
+ * have them.  We provide our own replacements here.
  */
 #ifndef HAVE_MBSTOWCS_L
 static size_t
 mbstowcs_l(wchar_t *dest, const char *src, size_t n, locale_t loc)
 {
+#ifdef WIN32
+	return _mbstowcs_l(dest, src, n, loc);
+#else
 	size_t		result;
 	locale_t	save_locale = uselocale(loc);
 
 	result = mbstowcs(dest, src, n);
 	uselocale(save_locale);
 	return result;
+#endif
 }
 #endif
 #ifndef HAVE_WCSTOMBS_L
 static size_t
 wcstombs_l(char *dest, const wchar_t *src, size_t n, locale_t loc)
 {
+#ifdef WIN32
+	return _wcstombs_l(dest, src, n, loc);
+#else
 	size_t		result;
 	locale_t	save_locale = uselocale(loc);
 
 	result = wcstombs(dest, src, n);
 	uselocale(save_locale);
 	return result;
-}
 #endif
+}
 #endif
 
 /*
diff --git a/src/include/port/win32_port.h b/src/include/port/win32_port.h
index b957d5c598..0e6d608330 100644
--- a/src/include/port/win32_port.h
+++ b/src/include/port/win32_port.h
@@ -455,8 +455,6 @@ extern int	_pglstat64(const char *name, struct stat *buf);
 #define strcoll_l _strcoll_l
 #define strxfrm_l _strxfrm_l
 #define wcscoll_l _wcscoll_l
-#define wcstombs_l _wcstombs_l
-#define mbstowcs_l _mbstowcs_l
 
 /*
  * Versions of libintl >= 0.18? try to replace setlocale() with a macro
-- 
2.41.0



Re: check_strxfrm_bug()

2023-07-09 Thread Thomas Munro
On Sun, Jul 9, 2023 at 6:20 PM Peter Eisentraut  wrote:
> So I don't think this code is correct.  AFAICT, there is nothing right
> now that can possibly define HAVE_MBSTOWCS_L on Windows/MSVC.  Was that
> the intention?

Yes, that was my intention.  Windows actually doesn't have them.  The
autoconf/MinGW test result was telling the truth.  Somehow I had to
make the three build systems agree on this.  Either by strong-arming
all three of them to emit a hard-coded claim that it does, or by
removing the test that produces a different answer in different build
systems.  I will happily do it the other way if you insist, which
would involve restoring the meson.build and Solultion.pm kludges you
quoted, but I'd also have to add a compatible kludge to configure.ac.
It doesn't seem like an improvement to me but I don't feel strongly
about it.  In the end, Solution.pm and configure.ac will be vaporised
by lasers, so we'll be left with 0 or 1 special cases.  I don't care
much, but I like 0, it's nice and round.




Re: check_strxfrm_bug()

2023-07-09 Thread Peter Eisentraut

On 07.07.23 22:30, Thomas Munro wrote:

Thinking about how to bring this all into "normal" form -- where
HAVE_XXX means "system defines XXX", not "system defines XXX || we
define a replacement"


HAVE_XXX means "code can use XXX", doesn't matter how it got there (it 
could also be a libpgport replacement).


So I don't think this code is correct.  AFAICT, there is nothing right 
now that can possibly define HAVE_MBSTOWCS_L on Windows/MSVC.  Was that 
the intention?


I think these changes would need to be reverted to make this correct:

-# MSVC has replacements defined in src/include/port/win32_port.h.
-if cc.get_id() == 'msvc'
-  cdata.set('HAVE_WCSTOMBS_L', 1)
-  cdata.set('HAVE_MBSTOWCS_L', 1)
-endif

-   HAVE_MBSTOWCS_L => 1,
+   HAVE_MBSTOWCS_L => undef,

-   HAVE_WCSTOMBS_L => 1,
+   HAVE_WCSTOMBS_L => undef,





Re: check_strxfrm_bug()

2023-07-08 Thread Thomas Munro
On Sat, Jul 8, 2023 at 10:13 AM Thomas Munro  wrote:
> Thanks.  I will wait a bit to see if Peter has any other comments and
> then push this.  I haven't actually tested with Solution.pm due to
> lack of CI for that, but fingers crossed, since the build systems will
> now agree, reducing the screw-up surface.

Done.  Let's see what the build farm thinks...




Re: check_strxfrm_bug()

2023-07-07 Thread Thomas Munro
On Sat, Jul 8, 2023 at 8:52 AM Tristan Partin  wrote:
> Should you wrap the two _l function replacements in HAVE_USELOCALE
> instead of WIN32?

I find that more confusing, and I'm also not sure if HAVE_USELOCALE is
even going to survive (based on your nearby thread).  I mean, by the
usual criteria that we applied to a lot of other system library
functions in the 16 cycle, I'd drop it.  It's in the standard and all
relevant systems have it except Windows which we have to handle with
special pain-in-the-neck logic anyway.

> > +if not cc.has_type('locale_t', prefix: '#include ') and 
> > cc.has_type('locale_t', prefix: '#include ')
>
> I wouldn't mind a line break after the 'and'.

Ah, right, I am still learning what is allowed in this syntax...  will do.

> Other than these comments, the patch looks fine to me.

Thanks.  I will wait a bit to see if Peter has any other comments and
then push this.  I haven't actually tested with Solution.pm due to
lack of CI for that, but fingers crossed, since the build systems will
now agree, reducing the screw-up surface.




Re: check_strxfrm_bug()

2023-07-07 Thread Tristan Partin
On Fri Jul 7, 2023 at 3:30 PM CDT, Thomas Munro wrote:
> On Fri, Jul 7, 2023 at 4:20 AM Peter Eisentraut  wrote:
> > I think the correct solution is to set HAVE_MBSTOWCS_L in Solution.pm.
> > Compare HAVE_FSEEKO, which is set in Solution.pm with fseeko being
> > defined in win32_port.h.
>
> In this version I have it in there, but set it to undef.  This way
> configure (MinGW), meson (MinGW or MSVC), and Solution.pm all agree.
> This version passes on CI (not quite sure what I screwed up before).
>
> To restate the problem I am solving with this Solution.pm change +
> associated changes in the C: configure+MinGW disabled the libc
> provider completely before.  With this patch that is fixed, and that
> forced me to address the inconsistency (because if you have the libc
> provider but no _l functions, you hit uselocale() code paths that
> Windows can't do).  It was a bug, really, but I don't plan to
> back-patch anything (nobody really uses configure+MinGW builds AFAIK,
> they exist purely as a developer [in]convenience).  But that explains
> why I needed to make a change.
>
> Thinking about how to bring this all into "normal" form -- where
> HAVE_XXX means "system defines XXX", not "system defines XXX || we
> define a replacement" -- leads to the attached.  Do you like this one?

Should you wrap the two _l function replacements in HAVE_USELOCALE
instead of WIN32?

> +if not cc.has_type('locale_t', prefix: '#include ') and 
> cc.has_type('locale_t', prefix: '#include ')

I wouldn't mind a line break after the 'and'.

Other than these comments, the patch looks fine to me.

-- 
Tristan Partin
Neon (https://neon.tech)




Re: check_strxfrm_bug()

2023-07-07 Thread Thomas Munro
On Fri, Jul 7, 2023 at 4:20 AM Peter Eisentraut  wrote:
> I think the correct solution is to set HAVE_MBSTOWCS_L in Solution.pm.
> Compare HAVE_FSEEKO, which is set in Solution.pm with fseeko being
> defined in win32_port.h.

In this version I have it in there, but set it to undef.  This way
configure (MinGW), meson (MinGW or MSVC), and Solution.pm all agree.
This version passes on CI (not quite sure what I screwed up before).

To restate the problem I am solving with this Solution.pm change +
associated changes in the C: configure+MinGW disabled the libc
provider completely before.  With this patch that is fixed, and that
forced me to address the inconsistency (because if you have the libc
provider but no _l functions, you hit uselocale() code paths that
Windows can't do).  It was a bug, really, but I don't plan to
back-patch anything (nobody really uses configure+MinGW builds AFAIK,
they exist purely as a developer [in]convenience).  But that explains
why I needed to make a change.

Thinking about how to bring this all into "normal" form -- where
HAVE_XXX means "system defines XXX", not "system defines XXX || we
define a replacement" -- leads to the attached.  Do you like this one?
From a5decf23c6198eb469835b23e9f4ea7778df869c Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Tue, 4 Jul 2023 16:32:35 +1200
Subject: [PATCH v3] All supported systems have locale_t.

locale_t is defined by POSIX.1-2008 and SUSv4, and available on all
targeted systems.  For Windows, win32_port.h redirects to a partial
implementation called _locale_t.  It's time to remove a lot of
compile-time tests for HAVE_LOCALE_T, and a few associated comments and
dead code branches.

Since configure + MinGW builds didn't detect locale_t but now we assume
that all systems have it, further inconsistencies among the 3 Windows build
systems were revealed.  With this commit, we no longer define
HAVE_WCSTOMBS_L and HAVE_MBSTOWCS_L on any Windows build system, but
we have logic to deal with that so that replacements are available where
appropriate.

Reviewed-by: Noah Misch 
Reviewed-by: Tristan Partin 
Reviewed-by: Peter Eisentraut 
Discussion: https://postgr.es/m/CA%2BhUKGLg7_T2GKwZFAkEf0V7vbnur-NfCjZPKZb%3DZfAXSV1ORw%40mail.gmail.com
---
 config/c-library.m4  | 10 +---
 configure|  5 --
 meson.build  | 22 ++-
 src/backend/commands/collationcmds.c |  2 +-
 src/backend/regex/regc_pg_locale.c   | 52 +
 src/backend/utils/adt/formatting.c   | 18 --
 src/backend/utils/adt/like.c |  2 -
 src/backend/utils/adt/like_support.c |  2 -
 src/backend/utils/adt/pg_locale.c| 86 +++-
 src/include/pg_config.h.in   |  3 -
 src/include/utils/pg_locale.h|  7 +--
 src/tools/msvc/Solution.pm   |  5 +-
 12 files changed, 45 insertions(+), 169 deletions(-)

diff --git a/config/c-library.m4 b/config/c-library.m4
index c1dd804679..aa8223d2ef 100644
--- a/config/c-library.m4
+++ b/config/c-library.m4
@@ -86,9 +86,9 @@ AC_DEFUN([PGAC_STRUCT_SOCKADDR_SA_LEN],
 # PGAC_TYPE_LOCALE_T
 # --
 # Check for the locale_t type and find the right header file.  macOS
-# needs xlocale.h; standard is locale.h, but glibc also has an
-# xlocale.h file that we should not use.
-#
+# needs xlocale.h; standard is locale.h, but glibc <= 2.25 also had an
+# xlocale.h file that we should not use, so we check the standard
+# header first.
 AC_DEFUN([PGAC_TYPE_LOCALE_T],
 [AC_CACHE_CHECK([for locale_t], pgac_cv_type_locale_t,
 [AC_COMPILE_IFELSE([AC_LANG_PROGRAM(
@@ -102,10 +102,6 @@ locale_t x;],
 [])],
 [pgac_cv_type_locale_t='yes (in xlocale.h)'],
 [pgac_cv_type_locale_t=no])])])
-if test "$pgac_cv_type_locale_t" != no; then
-  AC_DEFINE(HAVE_LOCALE_T, 1,
-[Define to 1 if the system has the type `locale_t'.])
-fi
 if test "$pgac_cv_type_locale_t" = 'yes (in xlocale.h)'; then
   AC_DEFINE(LOCALE_T_IN_XLOCALE, 1,
 [Define to 1 if `locale_t' requires .])
diff --git a/configure b/configure
index c4463cb17a..842ef855fc 100755
--- a/configure
+++ b/configure
@@ -15120,11 +15120,6 @@ rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
 fi
 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_type_locale_t" >&5
 $as_echo "$pgac_cv_type_locale_t" >&6; }
-if test "$pgac_cv_type_locale_t" != no; then
-
-$as_echo "#define HAVE_LOCALE_T 1" >>confdefs.h
-
-fi
 if test "$pgac_cv_type_locale_t" = 'yes (in xlocale.h)'; then
 
 $as_echo "#define LOCALE_T_IN_XLOCALE 1" >>confdefs.h
diff --git a/meson.build b/meson.build
index 62942ead09..3d4a85f8c6 100644
--- a/meson.build
+++ b/meson.build
@@ -2283,17 +2283,12 @@ else
   cdata.set('STRERROR_R_INT', false)
 endif
 
-# Check for the locale_t type and find the right header file.  macOS
-# needs xlocale.h; standard is locale.h, but glibc also has an
-# xlocale.h file that we should not use.  MSVC has a replacement
-# defined in src/include/port/win32_port.h.
-if 

Re: check_strxfrm_bug()

2023-07-06 Thread Peter Eisentraut

On 05.07.23 00:15, Thomas Munro wrote:

On Tue, Jul 4, 2023 at 2:52 AM Tristan Partin  wrote:

The patch looks good to me as well. Happy to rebase my other patch on
this one.


Thanks.  Here is a slightly tidier version.  It passes on CI[1]
including the optional extra MinGW64/Meson task, and the
MinGW64/autoconf configure+build that is in the SanityCheck task.
There are two questions I'm hoping to get feedback on:  (1) I believe
that defining HAVE_MBSTOWCS_L etc in win32_port.h is the best idea
because that is also where we define mbstowcs_l() etc.  Does that make
sense?  (2) IIRC, ye olde Solution.pm system might break if I were to
completely remove HAVE_MBSTOWCS_L and HAVE_WCSTOMBS_L from Solution.pm
(there must be a check somewhere that compares it with pg_config.h.in
or something like that), but it would also break if I defined them as
1 there (macro redefinition).


I think the correct solution is to set HAVE_MBSTOWCS_L in Solution.pm. 
Compare HAVE_FSEEKO, which is set in Solution.pm with fseeko being 
defined in win32_port.h.






Re: check_strxfrm_bug()

2023-07-04 Thread Thomas Munro
On Wed, Jul 5, 2023 at 10:15 AM Thomas Munro  wrote:
> [1] https://cirrus-ci.com/build/5298278007308288

That'll teach me to be impatient.  I only waited for compiling to
finish after triggering the optional extra MinGW task before sending
the above email, figuring that the only risk was there, but then the
pg_upgrade task failed due to mismatched locales.  Apparently there is
something I don't understand yet about locale_t support under MinGW.




Re: check_strxfrm_bug()

2023-07-04 Thread Thomas Munro
On Tue, Jul 4, 2023 at 2:52 AM Tristan Partin  wrote:
> The patch looks good to me as well. Happy to rebase my other patch on
> this one.

Thanks.  Here is a slightly tidier version.  It passes on CI[1]
including the optional extra MinGW64/Meson task, and the
MinGW64/autoconf configure+build that is in the SanityCheck task.
There are two questions I'm hoping to get feedback on:  (1) I believe
that defining HAVE_MBSTOWCS_L etc in win32_port.h is the best idea
because that is also where we define mbstowcs_l() etc.  Does that make
sense?  (2) IIRC, ye olde Solution.pm system might break if I were to
completely remove HAVE_MBSTOWCS_L and HAVE_WCSTOMBS_L from Solution.pm
(there must be a check somewhere that compares it with pg_config.h.in
or something like that), but it would also break if I defined them as
1 there (macro redefinition).Will undef in Solution.pm be
acceptable (ie define nothing to avoid redefinition, but side-step the
sanity check)?  It's a bit of a kludge, but IIRC we're dropping that
3rd build system in 17 so maybe that's OK?  (Not tested as I don't
have Windows and CI doesn't test Solution.pm, so I'd be grateful if
someone who has Windows/Solution.pm setup could try this.)

[1] https://cirrus-ci.com/build/5298278007308288
From ab0705a9e169627a77bf4e3cc36ea53603162921 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Tue, 4 Jul 2023 16:32:35 +1200
Subject: [PATCH v2] All supported systems have locale_t.

locale_t is defined by POSIX.1-2008 and SUSv4, and available on all
targeted systems.  For Windows, win32_port.h redirects to a partial
implementation called _locale_t.  It's time to remove a lot of
compile-time tests for HAVE_LOCALE_T, and a few associated comments and
dead code branches.

Definitions for HAVE_WCSTOMBS_L and HAVE_MBSTOWCS_L had to be moved into
win32_port.h, because this change revealed that MinGW builds on Windows
were failing to detect these functions, but without them we couldn't
build due to the assumption that uselocale() exists.

Reviewed-by: Noah Misch 
Reviewed-by: Tristan Partin 
Discussion: https://postgr.es/m/CA%2BhUKGLg7_T2GKwZFAkEf0V7vbnur-NfCjZPKZb%3DZfAXSV1ORw%40mail.gmail.com
---
 config/c-library.m4  | 10 ++
 configure|  5 ---
 meson.build  | 22 +++-
 src/backend/commands/collationcmds.c |  2 +-
 src/backend/regex/regc_pg_locale.c   | 52 +---
 src/backend/utils/adt/formatting.c   | 18 --
 src/backend/utils/adt/like.c |  2 --
 src/backend/utils/adt/like_support.c |  2 --
 src/backend/utils/adt/pg_locale.c| 36 ---
 src/include/pg_config.h.in   |  3 --
 src/include/port/win32_port.h|  3 ++
 src/include/utils/pg_locale.h|  7 +---
 src/tools/msvc/Solution.pm   |  5 ++-
 13 files changed, 16 insertions(+), 151 deletions(-)

diff --git a/config/c-library.m4 b/config/c-library.m4
index c1dd804679..aa8223d2ef 100644
--- a/config/c-library.m4
+++ b/config/c-library.m4
@@ -86,9 +86,9 @@ AC_DEFUN([PGAC_STRUCT_SOCKADDR_SA_LEN],
 # PGAC_TYPE_LOCALE_T
 # --
 # Check for the locale_t type and find the right header file.  macOS
-# needs xlocale.h; standard is locale.h, but glibc also has an
-# xlocale.h file that we should not use.
-#
+# needs xlocale.h; standard is locale.h, but glibc <= 2.25 also had an
+# xlocale.h file that we should not use, so we check the standard
+# header first.
 AC_DEFUN([PGAC_TYPE_LOCALE_T],
 [AC_CACHE_CHECK([for locale_t], pgac_cv_type_locale_t,
 [AC_COMPILE_IFELSE([AC_LANG_PROGRAM(
@@ -102,10 +102,6 @@ locale_t x;],
 [])],
 [pgac_cv_type_locale_t='yes (in xlocale.h)'],
 [pgac_cv_type_locale_t=no])])])
-if test "$pgac_cv_type_locale_t" != no; then
-  AC_DEFINE(HAVE_LOCALE_T, 1,
-[Define to 1 if the system has the type `locale_t'.])
-fi
 if test "$pgac_cv_type_locale_t" = 'yes (in xlocale.h)'; then
   AC_DEFINE(LOCALE_T_IN_XLOCALE, 1,
 [Define to 1 if `locale_t' requires .])
diff --git a/configure b/configure
index 997d42d8f7..f1d4630007 100755
--- a/configure
+++ b/configure
@@ -15120,11 +15120,6 @@ rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
 fi
 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_type_locale_t" >&5
 $as_echo "$pgac_cv_type_locale_t" >&6; }
-if test "$pgac_cv_type_locale_t" != no; then
-
-$as_echo "#define HAVE_LOCALE_T 1" >>confdefs.h
-
-fi
 if test "$pgac_cv_type_locale_t" = 'yes (in xlocale.h)'; then
 
 $as_echo "#define LOCALE_T_IN_XLOCALE 1" >>confdefs.h
diff --git a/meson.build b/meson.build
index 3ea4b0d72a..82a966fcc2 100644
--- a/meson.build
+++ b/meson.build
@@ -2283,17 +2283,12 @@ else
   cdata.set('STRERROR_R_INT', false)
 endif
 
-# Check for the locale_t type and find the right header file.  macOS
-# needs xlocale.h; standard is locale.h, but glibc also has an
-# xlocale.h file that we should not use.  MSVC has a replacement
-# defined in src/include/port/win32_port.h.
-if 

Re: check_strxfrm_bug()

2023-07-03 Thread Tristan Partin
On Sun Jul 2, 2023 at 8:49 PM CDT, Thomas Munro wrote:
> On Sun, Jul 2, 2023 at 4:25 AM Noah Misch  wrote:
> > On Wed, Jun 28, 2023 at 01:02:21PM +1200, Thomas Munro wrote:
> > > On Wed, Jun 28, 2023 at 11:03 AM Thomas Munro  
> > > wrote:
> > > > The GCC build farm has just received some SPARC hardware new enough to
> > > > run modern Solaris (hostname gcc106), so if wrasse were moved over
> > > > there we could finally assume all systems have POSIX 2008 (AKA
> > > > SUSv4)'s locale_t.
> > >
> > > That would look something like this.
> >
> > This removes thirty-eight ifdefs, most of them located in the middle of
> > function bodies.  That's far more beneficial than most proposals to raise
> > minimum requirements.  +1 for revoking support for wrasse's OS version.
> > (wrasse wouldn't move, but it would stop testing v17+.)
>
> Great.  It sounds like I should wait a few days for any other feedback
> and then push this patch.  Thanks for looking.

The patch looks good to me as well. Happy to rebase my other patch on
this one.

-- 
Tristan Partin
Neon (https://neon.tech)




Re: check_strxfrm_bug()

2023-07-02 Thread Thomas Munro
On Sun, Jul 2, 2023 at 4:25 AM Noah Misch  wrote:
> On Wed, Jun 28, 2023 at 01:02:21PM +1200, Thomas Munro wrote:
> > On Wed, Jun 28, 2023 at 11:03 AM Thomas Munro  
> > wrote:
> > > The GCC build farm has just received some SPARC hardware new enough to
> > > run modern Solaris (hostname gcc106), so if wrasse were moved over
> > > there we could finally assume all systems have POSIX 2008 (AKA
> > > SUSv4)'s locale_t.
> >
> > That would look something like this.
>
> This removes thirty-eight ifdefs, most of them located in the middle of
> function bodies.  That's far more beneficial than most proposals to raise
> minimum requirements.  +1 for revoking support for wrasse's OS version.
> (wrasse wouldn't move, but it would stop testing v17+.)

Great.  It sounds like I should wait a few days for any other feedback
and then push this patch.  Thanks for looking.




Re: check_strxfrm_bug()

2023-07-01 Thread Noah Misch
On Wed, Jun 28, 2023 at 01:02:21PM +1200, Thomas Munro wrote:
> On Wed, Jun 28, 2023 at 11:03 AM Thomas Munro  wrote:
> > The GCC build farm has just received some SPARC hardware new enough to
> > run modern Solaris (hostname gcc106), so if wrasse were moved over
> > there we could finally assume all systems have POSIX 2008 (AKA
> > SUSv4)'s locale_t.
> 
> That would look something like this.

This removes thirty-eight ifdefs, most of them located in the middle of
function bodies.  That's far more beneficial than most proposals to raise
minimum requirements.  +1 for revoking support for wrasse's OS version.
(wrasse wouldn't move, but it would stop testing v17+.)




Re: check_strxfrm_bug()

2023-06-27 Thread Thomas Munro
On Wed, Jun 28, 2023 at 11:03 AM Thomas Munro  wrote:
> The GCC build farm has just received some SPARC hardware new enough to
> run modern Solaris (hostname gcc106), so if wrasse were moved over
> there we could finally assume all systems have POSIX 2008 (AKA
> SUSv4)'s locale_t.

That would look something like this.
From 87268870aa9b6f22eb0bf831614bd38c529b0c8c Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Wed, 28 Jun 2023 10:23:32 +1200
Subject: [PATCH] All supported computers have locale_t.

locale_t is defined by POSIX:1-2008 and SUSv4, and available on all
targeted systems.

Definitions for HAVE_WCSTOMBS_L and HAVE_MBSTOWCS_L had to be moved into
win32_port.h, because this change revealed that MinGW builds on Windows
were failing to detect these functions, but without them we couldn't
build due to the assumption that uselocale() exists.

Discussion: https://postgr.es/m/CA%2BhUKGLg7_T2GKwZFAkEf0V7vbnur-NfCjZPKZb%3DZfAXSV1ORw%40mail.gmail.com

diff --git a/config/c-library.m4 b/config/c-library.m4
index c1dd804679..aa8223d2ef 100644
--- a/config/c-library.m4
+++ b/config/c-library.m4
@@ -86,9 +86,9 @@ AC_DEFUN([PGAC_STRUCT_SOCKADDR_SA_LEN],
 # PGAC_TYPE_LOCALE_T
 # --
 # Check for the locale_t type and find the right header file.  macOS
-# needs xlocale.h; standard is locale.h, but glibc also has an
-# xlocale.h file that we should not use.
-#
+# needs xlocale.h; standard is locale.h, but glibc <= 2.25 also had an
+# xlocale.h file that we should not use, so we check the standard
+# header first.
 AC_DEFUN([PGAC_TYPE_LOCALE_T],
 [AC_CACHE_CHECK([for locale_t], pgac_cv_type_locale_t,
 [AC_COMPILE_IFELSE([AC_LANG_PROGRAM(
@@ -102,10 +102,6 @@ locale_t x;],
 [])],
 [pgac_cv_type_locale_t='yes (in xlocale.h)'],
 [pgac_cv_type_locale_t=no])])])
-if test "$pgac_cv_type_locale_t" != no; then
-  AC_DEFINE(HAVE_LOCALE_T, 1,
-[Define to 1 if the system has the type `locale_t'.])
-fi
 if test "$pgac_cv_type_locale_t" = 'yes (in xlocale.h)'; then
   AC_DEFINE(LOCALE_T_IN_XLOCALE, 1,
 [Define to 1 if `locale_t' requires .])
diff --git a/configure b/configure
index a1c3cb0036..b8bc03e2c7 100755
--- a/configure
+++ b/configure
@@ -15122,11 +15122,6 @@ rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
 fi
 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_type_locale_t" >&5
 $as_echo "$pgac_cv_type_locale_t" >&6; }
-if test "$pgac_cv_type_locale_t" != no; then
-
-$as_echo "#define HAVE_LOCALE_T 1" >>confdefs.h
-
-fi
 if test "$pgac_cv_type_locale_t" = 'yes (in xlocale.h)'; then
 
 $as_echo "#define LOCALE_T_IN_XLOCALE 1" >>confdefs.h
diff --git a/meson.build b/meson.build
index 7bc7310c44..7cc4cbc034 100644
--- a/meson.build
+++ b/meson.build
@@ -2283,17 +2283,12 @@ else
   cdata.set('STRERROR_R_INT', false)
 endif
 
-# Check for the locale_t type and find the right header file.  macOS
-# needs xlocale.h; standard is locale.h, but glibc also has an
+# Find the right header file for the locale_t type.  macOS
+# needs xlocale.h; standard is locale.h, but older glibc also had an
 # xlocale.h file that we should not use.  MSVC has a replacement
 # defined in src/include/port/win32_port.h.
-if cc.has_type('locale_t', prefix: '#include ')
-  cdata.set('HAVE_LOCALE_T', 1)
-elif cc.has_type('locale_t', prefix: '#include ')
-  cdata.set('HAVE_LOCALE_T', 1)
+if not cc.has_type('locale_t', prefix: '#include ') and cc.has_type('locale_t', prefix: '#include ')
   cdata.set('LOCALE_T_IN_XLOCALE', 1)
-elif cc.get_id() == 'msvc'
-  cdata.set('HAVE_LOCALE_T', 1)
 endif
 
 # Check if the C compiler understands typeof or a variant.  Define
@@ -2489,13 +2484,6 @@ if cc.has_function('syslog', args: test_c_args) and \
 endif
 
 
-# MSVC has replacements defined in src/include/port/win32_port.h.
-if cc.get_id() == 'msvc'
-  cdata.set('HAVE_WCSTOMBS_L', 1)
-  cdata.set('HAVE_MBSTOWCS_L', 1)
-endif
-
-
 # if prerequisites for unnamed posix semas aren't fulfilled, fall back to sysv
 # semaphores
 if sema_kind == 'unnamed_posix' and \
diff --git a/src/backend/commands/collationcmds.c b/src/backend/commands/collationcmds.c
index efb8b4d289..cc239b4d14 100644
--- a/src/backend/commands/collationcmds.c
+++ b/src/backend/commands/collationcmds.c
@@ -534,7 +534,7 @@ pg_collation_actual_version(PG_FUNCTION_ARGS)
 
 
 /* will we use "locale -a" in pg_import_system_collations? */
-#if defined(HAVE_LOCALE_T) && !defined(WIN32)
+#if !defined(WIN32)
 #define READ_LOCALE_A_OUTPUT
 #endif
 
diff --git a/src/backend/regex/regc_pg_locale.c b/src/backend/regex/regc_pg_locale.c
index 00ce735fdd..31e6300b5d 100644
--- a/src/backend/regex/regc_pg_locale.c
+++ b/src/backend/regex/regc_pg_locale.c
@@ -44,8 +44,7 @@
  * the platform's wchar_t representation matches what we do in pg_wchar
  * conversions.
  *
- * 3. Other collations are only supported on platforms that HAVE_LOCALE_T.
- * Here, we use the locale_t-extended forms of the  and 
+ * 3. Here, we use the locale_t-extended forms of the  and 
  

Re: check_strxfrm_bug()

2023-06-27 Thread Thomas Munro
The GCC build farm has just received some SPARC hardware new enough to
run modern Solaris (hostname gcc106), so if wrasse were moved over
there we could finally assume all systems have POSIX 2008 (AKA
SUSv4)'s locale_t.

It's slightly annoying that Windows has locale_t but doesn't have
uselocale().  It does have thread-local locales via another API,
though.  I wonder how hard it would be to get to a point where all
systems have uselocale() too, by supplying a replacement.  I noticed
that some other projects eg older versions of LLVM libcxx do that.  I
see from one of their discussions[1] that it worked, except that
thread-local locales are only available with one of the MinGW C
runtimes and not another.  We'd have to get to the bottom of that.

[1] https://reviews.llvm.org/D40181




Re: check_strxfrm_bug()

2023-06-12 Thread Heikki Linnakangas

On 12/06/2023 01:15, Thomas Munro wrote:

There are still at least a couple
of functions that lack XXX_l variants in the standard: mbstowcs() and
wcstombs() (though we use the non-standard _l variants if we find them
in ), but that's OK because we use uselocale() and not
setlocale(), because uselocale() is required to be thread-local.


Right, mbstowcs() and wcstombs() are already thread-safe, that's why 
there are no _l variants of them.



The use of setlocale() to set up the per-backend/per-database
default locale would have to be replaced with uselocale().
This recent bug report is also related to that: 
https://www.postgresql.org/message-id/17946-3e84cb577e9551c3%40postgresql.org. 
In a nutshell, libperl calls uselocale(), which overrides the setting we 
try set with setlocale().


--
Heikki Linnakangas
Neon (https://neon.tech)





Re: check_strxfrm_bug()

2023-06-11 Thread Thomas Munro
On Mon, Apr 17, 2023 at 8:00 AM Thomas Munro  wrote:
> On Sun, Dec 18, 2022 at 10:27 AM Thomas Munro  wrote:
> > With my garbage collector hat on, that made me wonder if there was
> > some more potential cleanup here: could we require locale_t yet?  The
> > last straggler systems on our target OS list to add the POSIX locale_t
> > stuff were Solaris 11.4 (2018) and OpenBSD 6.2 (2018).  Apparently
> > it's still too soon: we have two EOL'd OSes in the farm that are older
> > than that.  But here's an interesting fact about wrasse, assuming its
> > host is gcc211: it looks like it can't even apply further OS updates
> > because the hardware[1] is so old that Solaris doesn't support it
> > anymore[2].
>
> For the record, now the OpenBSD machines have been upgraded, so now
> "wrasse" is the last relevant computer on earth with no POSIX
> locale_t.  Unfortunately there is no reason to think it's going to go
> away soon, so I'm just noting this fact here as a reminder for when it
> eventually does...

Since talk of server threads erupted again, I just wanted to note that
this system locale API stuff would be on the long list of
micro-obstacles.  You'd *have* to use the locale_t-based interfaces
(or come up with replacements using a big ugly lock to serialise all
access to the process-global locale, or allow only ICU locale support
in that build, but those seem like strange lengths to go to just to
support a dead version of Solaris).  There are still at least a couple
of functions that lack XXX_l variants in the standard: mbstowcs() and
wcstombs() (though we use the non-standard _l variants if we find them
in ), but that's OK because we use uselocale() and not
setlocale(), because uselocale() is required to be thread-local.  The
use of setlocale() to set up the per-backend/per-database default
locale would have to be replaced with uselocale().  In other words, I
think wrasse would not be coming with us on this hypothetical quest.




Re: check_strxfrm_bug()

2023-04-22 Thread Jonathan S. Katz

On 4/19/23 9:34 PM, Thomas Munro wrote:

On Wed, Apr 19, 2023 at 2:31 PM Jonathan S. Katz  wrote:

To be clear, is the proposal to remove both "check_strxfrm_bug" and
"TRUST_STRXFRM"?

Given a bunch of folks who have expertise in this area of code all agree
with removing the above as part of the collation cleanups targeted for
v16, I'm inclined to agree. I don't really see the need for an explicit
RMT action, but based on the consensus this seems OK to add as an open item.


Thanks all.  I went ahead and removed check_strxfrm_bug().


Thanks! For housekeeping, I put this into "Open Items" and marked it as 
resolved.



I could write a patch to remove the libc strxfrm support, but since
Jeff recently wrote new code in 16 to abstract that stuff, he might
prefer to look at it?


I believe we'd be qualifying this as an open item too? If so, let's add it.

Thanks,

Jonathan


OpenPGP_signature
Description: OpenPGP digital signature


Re: check_strxfrm_bug()

2023-04-20 Thread Jeff Davis
On Thu, 2023-04-20 at 13:34 +1200, Thomas Munro wrote:
> I could write a patch to remove the libc strxfrm support, but since
> Jeff recently wrote new code in 16 to abstract that stuff, he might
> prefer to look at it?

+1 to removing it.

As far as how it's removed, we could directly check:

  if (!collate_c && !(locale && locale->provider == COLLPROVIDER_ICU))
abbreviate = false;

as it was before, or we could still try to hide it as a detail behind a
function. I don't have a strong opinion there, though I thought it
might be good for varlena.c to not know those internal details.

Regards,
Jeff Davis





Re: check_strxfrm_bug()

2023-04-19 Thread Thomas Munro
On Wed, Apr 19, 2023 at 2:31 PM Jonathan S. Katz  wrote:
> To be clear, is the proposal to remove both "check_strxfrm_bug" and
> "TRUST_STRXFRM"?
>
> Given a bunch of folks who have expertise in this area of code all agree
> with removing the above as part of the collation cleanups targeted for
> v16, I'm inclined to agree. I don't really see the need for an explicit
> RMT action, but based on the consensus this seems OK to add as an open item.

Thanks all.  I went ahead and removed check_strxfrm_bug().

I could write a patch to remove the libc strxfrm support, but since
Jeff recently wrote new code in 16 to abstract that stuff, he might
prefer to look at it?




Re: check_strxfrm_bug()

2023-04-19 Thread Tom Lane
Joe Conway  writes:
> I have wondered a few times, given the known issues with strxfrm, how is 
> the use in selfuncs.c still ok. Has anyone taken a hard look at that?

On the one hand, we only need approximately-correct results in that
code.  On the other, the result is fed to convert_string_to_scalar(),
which has a rather naive idea that it's dealing with ASCII text.
I've seen at least some strxfrm output that isn't even vaguely
textual-looking.

regards, tom lane




Re: check_strxfrm_bug()

2023-04-19 Thread Joe Conway

On 4/18/23 21:19, Thomas Munro wrote:

On Tue, Apr 18, 2023 at 11:52 AM Michael Paquier  wrote:

On Mon, Apr 17, 2023 at 03:40:14PM -0700, Peter Geoghegan wrote:
> +1 for getting rid of TRUST_STRXFRM.


+1

The situation is not improving fast, and requires hard work to follow
on each OS.  Clearly, mainstreaming ICU is the way to go.  libc
support will always have niche uses, to be compatible with other
software on the box, but trusting strxfrm doesn't seem to be on the
cards any time soon.


I have wondered a few times, given the known issues with strxfrm, how is 
the use in selfuncs.c still ok. Has anyone taken a hard look at that?


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: check_strxfrm_bug()

2023-04-18 Thread Jonathan S. Katz

On 4/18/23 9:19 PM, Thomas Munro wrote:

On Tue, Apr 18, 2023 at 11:52 AM Michael Paquier  wrote:

On Mon, Apr 17, 2023 at 03:40:14PM -0700, Peter Geoghegan wrote:

+1 for getting rid of TRUST_STRXFRM.


+1

The situation is not improving fast, and requires hard work to follow
on each OS.  Clearly, mainstreaming ICU is the way to go.  libc
support will always have niche uses, to be compatible with other
software on the box, but trusting strxfrm doesn't seem to be on the
cards any time soon.


[RMT hat, personal opinion on RMT]

To be clear, is the proposal to remove both "check_strxfrm_bug" and 
"TRUST_STRXFRM"?


Given a bunch of folks who have expertise in this area of code all agree 
with removing the above as part of the collation cleanups targeted for 
v16, I'm inclined to agree. I don't really see the need for an explicit 
RMT action, but based on the consensus this seems OK to add as an open item.


Thanks,

Jonathan


OpenPGP_signature
Description: OpenPGP digital signature


Re: check_strxfrm_bug()

2023-04-18 Thread Thomas Munro
On Tue, Apr 18, 2023 at 11:52 AM Michael Paquier  wrote:
> On Mon, Apr 17, 2023 at 03:40:14PM -0700, Peter Geoghegan wrote:
> > +1 for getting rid of TRUST_STRXFRM.

+1

The situation is not improving fast, and requires hard work to follow
on each OS.  Clearly, mainstreaming ICU is the way to go.  libc
support will always have niche uses, to be compatible with other
software on the box, but trusting strxfrm doesn't seem to be on the
cards any time soon.




Re: check_strxfrm_bug()

2023-04-17 Thread Michael Paquier
On Mon, Apr 17, 2023 at 03:40:14PM -0700, Peter Geoghegan wrote:
> On Mon, Apr 17, 2023 at 2:48 PM Tom Lane  wrote:
>> +1.  I wonder if we should go further and get rid of TRUST_STRXFRM
>> and the not-so-trivial amount of code around it (pg_strxfrm_enabled
>> etc).  Carrying that indefinitely in the probably-vain hope that
>> the libraries will become trustworthy seems rather pointless.
>> Besides, if such a miracle does occur, we can dig the code out
>> of our git history.
> 
> +1 for getting rid of TRUST_STRXFRM.
> 
> ICU-based collations (which aren't affected by TRUST_STRXFRM) are
> becoming the de facto standard (possibly even the de jure standard).
> So even if we thought that the situation with strxfrm() had improved,
> we'd still have little motivation to do anything about it.

Makes sense to do some cleanup now as this is new in the tree.
Perhaps somebody from the RMT would like to comment?

FYI, Jeff has also posted patches to replace this CFLAGS with a GUC:
https://www.postgresql.org/message-id/6ec4ad7f93f255dbb885da0a664d9c77ed4872c4.ca...@j-davis.com
--
Michael


signature.asc
Description: PGP signature


Re: check_strxfrm_bug()

2023-04-17 Thread Peter Geoghegan
On Mon, Apr 17, 2023 at 2:48 PM Tom Lane  wrote:
> +1.  I wonder if we should go further and get rid of TRUST_STRXFRM
> and the not-so-trivial amount of code around it (pg_strxfrm_enabled
> etc).  Carrying that indefinitely in the probably-vain hope that
> the libraries will become trustworthy seems rather pointless.
> Besides, if such a miracle does occur, we can dig the code out
> of our git history.

+1 for getting rid of TRUST_STRXFRM.

ICU-based collations (which aren't affected by TRUST_STRXFRM) are
becoming the de facto standard (possibly even the de jure standard).
So even if we thought that the situation with strxfrm() had improved,
we'd still have little motivation to do anything about it.

-- 
Peter Geoghegan




Re: check_strxfrm_bug()

2023-04-17 Thread Tom Lane
Nathan Bossart  writes:
> On Thu, Dec 15, 2022 at 03:22:59PM +1300, Thomas Munro wrote:
>> While studying Jeff's new crop of collation patches I noticed in
>> passing that check_strxfrm_bug() must surely by now be unnecessary.
>> The buffer overrun bugs were fixed a decade ago, and the relevant
>> systems are way out of support.  If you're worried that the bugs might
>> come back, then the test is insufficient: modern versions of both OSes
>> have strxfrm_l(), which we aren't checking.  In any case, we also
>> completely disable this stuff because of bugs and quality problems in
>> every other known implementation, via TRUST_STRXFRM (or rather the
>> lack of it).  So I think it's time to remove that function; please see
>> attached.

> Seems reasonable to me.

+1.  I wonder if we should go further and get rid of TRUST_STRXFRM
and the not-so-trivial amount of code around it (pg_strxfrm_enabled
etc).  Carrying that indefinitely in the probably-vain hope that
the libraries will become trustworthy seems rather pointless.
Besides, if such a miracle does occur, we can dig the code out
of our git history.

regards, tom lane




Re: check_strxfrm_bug()

2023-04-17 Thread Nathan Bossart
On Thu, Dec 15, 2022 at 03:22:59PM +1300, Thomas Munro wrote:
> While studying Jeff's new crop of collation patches I noticed in
> passing that check_strxfrm_bug() must surely by now be unnecessary.
> The buffer overrun bugs were fixed a decade ago, and the relevant
> systems are way out of support.  If you're worried that the bugs might
> come back, then the test is insufficient: modern versions of both OSes
> have strxfrm_l(), which we aren't checking.  In any case, we also
> completely disable this stuff because of bugs and quality problems in
> every other known implementation, via TRUST_STRXFRM (or rather the
> lack of it).  So I think it's time to remove that function; please see
> attached.

Seems reasonable to me.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: check_strxfrm_bug()

2023-04-16 Thread Thomas Munro
On Sun, Dec 18, 2022 at 10:27 AM Thomas Munro  wrote:
> With my garbage collector hat on, that made me wonder if there was
> some more potential cleanup here: could we require locale_t yet?  The
> last straggler systems on our target OS list to add the POSIX locale_t
> stuff were Solaris 11.4 (2018) and OpenBSD 6.2 (2018).  Apparently
> it's still too soon: we have two EOL'd OSes in the farm that are older
> than that.  But here's an interesting fact about wrasse, assuming its
> host is gcc211: it looks like it can't even apply further OS updates
> because the hardware[1] is so old that Solaris doesn't support it
> anymore[2].

For the record, now the OpenBSD machines have been upgraded, so now
"wrasse" is the last relevant computer on earth with no POSIX
locale_t.  Unfortunately there is no reason to think it's going to go
away soon, so I'm just noting this fact here as a reminder for when it
eventually does...




Re: check_strxfrm_bug()

2022-12-17 Thread Thomas Munro
On Thu, Dec 15, 2022 at 3:22 PM Thomas Munro  wrote:
> ... If you're worried that the bugs might
> come back, then the test is insufficient: modern versions of both OSes
> have strxfrm_l(), which we aren't checking.

With my garbage collector hat on, that made me wonder if there was
some more potential cleanup here: could we require locale_t yet?  The
last straggler systems on our target OS list to add the POSIX locale_t
stuff were Solaris 11.4 (2018) and OpenBSD 6.2 (2018).  Apparently
it's still too soon: we have two EOL'd OSes in the farm that are older
than that.  But here's an interesting fact about wrasse, assuming its
host is gcc211: it looks like it can't even apply further OS updates
because the hardware[1] is so old that Solaris doesn't support it
anymore[2].

[1] https://cfarm.tetaneutral.net/machines/list/
[2] https://support.oracle.com/knowledge/Sun%20Microsystems/2382427_1.html




check_strxfrm_bug()

2022-12-14 Thread Thomas Munro
Hi

While studying Jeff's new crop of collation patches I noticed in
passing that check_strxfrm_bug() must surely by now be unnecessary.
The buffer overrun bugs were fixed a decade ago, and the relevant
systems are way out of support.  If you're worried that the bugs might
come back, then the test is insufficient: modern versions of both OSes
have strxfrm_l(), which we aren't checking.  In any case, we also
completely disable this stuff because of bugs and quality problems in
every other known implementation, via TRUST_STRXFRM (or rather the
lack of it).  So I think it's time to remove that function; please see
attached.

Just by the way, if you like slow motion domino runs, check this out:

* Original pgsql-bugs investigation into strxfrm() inconsistencies
  
https://www.postgresql.org/message-id/flat/111d0e27-a8f3-4a84-a4e0-b0fb70386...@s24.com

* I happened to test that on bleeding-edge FreeBSD 11 (wasn't released
yet), because at that time FreeBSD was in the process of adopting
illumos's new collation code, and reported teething problems:
  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=208266

* FreeBSD, DragonFly and illumos's trees were then partially fixed by
the authors, but our strcolltest.c still showed some remaining
problems in some locales (and it still does on my local FreeBSD
battlestation):
  
https://github.com/freebsd/freebsd-src/commit/c48dc2a193b9befceda8dfc6f894d73251cc00a4
  https://www.illumos.org/rb/r/402/

* The authors traced the remaining problem to flaws in the Unicode
project's CLDR/POSIX data, and the report was accepted:
  https://www.illumos.org/issues/7962
  https://unicode-org.atlassian.net/browse/CLDR-10394

Eventually that'll be fixed, and (I guess) trigger at least a CLDR
minor version bump affecting all downstream consumers (ICU, ...).
Then... maybe... at least FreeBSD will finally pass that test.  I do
wonder whether other consumer libraries are also confused by that
problem source data, and if not, why not; are glibc's problems related
or just random code or data quality problems in different areas?  (I
also don't know why a problem in that data should affect strxfrm() and
strcoll() differently, but I don't plan to find out owing to an acute
shortage of round tuits).

But in the meantime, I still can't recommend turning on TRUST_STRXFRM
on any OS that I know of!  The strcolltest.c program certainly still
finds fault with glibc 2.36 despite the last update on that redhat
bugzilla ticket that suggested that the big resync back in 2.28 was
going to fix it.

To be fair, macOS does actually pass that test for all locales, but
the strxfrm() result is too narrow to be useful, according to comments
in our tree.  I would guess that a couple of other OSes with the old
Berkeley locale code are similar.
From 8a70596a7de680e02fb2a6db59e622a653f097c2 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Thu, 15 Dec 2022 13:38:57 +1300
Subject: [PATCH] Remove obsolete defense against strxfrm() bugs.

Old versions of Solaris and illumos had buffer overrun bugs in their
strxfrm() implementations.  The bugs were fixed a decade ago and the
relevant releases are long out of vendor support.  It's time to remove
the defense added by commit be8b06c3.

diff --git a/src/backend/main/main.c b/src/backend/main/main.c
index 37989ba567..bf6992d9fe 100644
--- a/src/backend/main/main.c
+++ b/src/backend/main/main.c
@@ -137,8 +137,6 @@ main(int argc, char *argv[])
 	 */
 	unsetenv("LC_ALL");
 
-	check_strxfrm_bug();
-
 	/*
 	 * Catch standard options before doing much else, in particular before we
 	 * insist on not being root.
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 2b42d9ccd8..aa60008ad7 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -1134,64 +1134,6 @@ IsoLocaleName(const char *winlocname)
 #endif			/* WIN32 && LC_MESSAGES */
 
 
-/*
- * Detect aging strxfrm() implementations that, in a subset of locales, write
- * past the specified buffer length.  Affected users must update OS packages
- * before using PostgreSQL 9.5 or later.
- *
- * Assume that the bug can come and go from one postmaster startup to another
- * due to physical replication among diverse machines.  Assume that the bug's
- * presence will not change during the life of a particular postmaster.  Given
- * those assumptions, call this no less than once per postmaster startup per
- * LC_COLLATE setting used.  No known-affected system offers strxfrm_l(), so
- * there is no need to consider pg_collation locales.
- */
-void
-check_strxfrm_bug(void)
-{
-	char		buf[32];
-	const int	canary = 0x7F;
-	bool		ok = true;
-
-	/*
-	 * Given a two-byte ASCII string and length limit 7, 8 or 9, Solaris 10
-	 * 05/08 returns 18 and modifies 10 bytes.  It respects limits above or
-	 * below that range.
-	 *
-	 * The bug is present in Solaris 8 as well; it is absent in Solaris 10
-	 * 01/13 and Solaris 11.2.  Affected locales inclu