EUC_* encodings: pass check-world

Noah Misch Tue, 17 Feb 2026 10:48:20 -0800

Three src/pl tests carry comments saying they fail in EUC_* encodings other
than EUC_JIS_2004.  I think those comments predate the psql \if and \gset
meta-commands, which make test skips less onerous, so let's use those
meta-commands to skip the tests, as attached.

Why now?  Commit c67bef3 added euc_kr.sql to exercise some code specific to
non-UTF8 multibyte encodings.  I want it to be possible to exercise that code
with settings that also pass check-world as a whole.  The alternative was to
use EUC_JIS_2004 or maybe MULE_INTERNAL to exercise that code.  EUC_KR has
more present-day relevance than MULE_INTERNAL or EUC_JIS_2004, neither of
which has a glibc locale.
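
For illustration only, and not part of the attached patch: the skipped tests
all depend on U+00A0 (no-break space), which the affected encodings cannot
represent.  The sketch below checks that with Python's bundled codecs.  It
assumes those codecs approximate PostgreSQL's own conversion tables, which are
maintained separately and may differ; the name mapping and the helper names
can_encode() and nbsp_support() are hypothetical.  EUC_TW is left out because
Python's standard library has no codec for it.

```python
# Illustrative only: Python's bundled codecs stand in for PostgreSQL's
# conversion tables, which are maintained separately and may differ.
NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE

# PostgreSQL encoding name -> Python codec name (assumed correspondence).
# EUC_TW is omitted: the Python standard library ships no codec for it.
CODECS = {
    "EUC_CN": "gb2312",
    "EUC_JP": "euc_jp",
    "EUC_KR": "euc_kr",
    "EUC_JIS_2004": "euc_jis_2004",
}


def can_encode(text, codec):
    """Return True if every character of text is representable in codec."""
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


def nbsp_support():
    """Map each PostgreSQL encoding name to whether U+00A0 is encodable."""
    return {pg_name: can_encode(NBSP, py_codec)
            for pg_name, py_codec in CODECS.items()}


if __name__ == "__main__":
    for name, ok in nbsp_support().items():
        print(f"{name}: {'encodable' if ok else 'no equivalent'}")
```

This only probes character repertoire.  The patch itself decides whether to
skip by comparing getdatabaseencoding() against the four encoding names.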

From: Noah Misch <[email protected]>

EUC_CN, EUC_JP, EUC_KR, EUC_TW: Skip U+00A0 tests instead of failing.

This entails alternative expected outputs, but psql \gset and \if have
reduced the maintenance burden.  Use those now, so settings that run the
new euc_kr.sql to completion don't get check-world failures.  That file
is new in commit c67bef3f3252a3a38bf347f9f119944176a796ce.  Back-patch
to v14, like that commit.

Reviewed-by: FIXME
Discussion: https://postgr.es/m/FIXME
Backpatch-through: 14

diff --git a/src/pl/plperl/GNUmakefile b/src/pl/plperl/GNUmakefile
index 558c764..d7c8917 100644
--- a/src/pl/plperl/GNUmakefile
+++ b/src/pl/plperl/GNUmakefile
@@ -62,7 +62,7 @@ endif
 
 REGRESS_OPTS = --dbname=$(PL_TESTDB) --dlpath=$(top_builddir)/src/test/regress
 REGRESS = plperl_setup plperl plperl_lc plperl_trigger plperl_shared \
-       plperl_elog plperl_util plperl_init plperlu plperl_array \
+       plperl_elog plperl_unicode plperl_util plperl_init plperlu plperl_array \
        plperl_call plperl_transaction plperl_env
 # if Perl can support two interpreters in one backend,
 # test plperl-and-plperlu cases
diff --git a/src/pl/plperl/expected/plperl_elog.out b/src/pl/plperl/expected/plperl_elog.out
index 6343962..042719d 100644
--- a/src/pl/plperl/expected/plperl_elog.out
+++ b/src/pl/plperl/expected/plperl_elog.out
@@ -97,16 +97,3 @@ NOTICE:  caught die
                    2
 (1 row)
 
--- Test non-ASCII error messages
---
--- Note: this test case is known to fail if the database encoding is
--- EUC_CN, EUC_JP, EUC_KR, or EUC_TW, for lack of any equivalent to
--- U+00A0 (no-break space) in those encodings.  However, testing with
--- plain ASCII data would be rather useless, so we must live with that.
-SET client_encoding TO UTF8;
-create or replace function error_with_nbsp() returns void language plperl as $$
-  elog(ERROR, "this message contains a no-break space");
-$$;
-select error_with_nbsp();
-ERROR:  this message contains a no-break space at line 2.
-CONTEXT:  PL/Perl function "error_with_nbsp"
diff --git a/src/pl/plperl/expected/plperl_elog_1.out b/src/pl/plperl/expected/plperl_elog_1.out
index a85dd17..42d4111 100644
--- a/src/pl/plperl/expected/plperl_elog_1.out
+++ b/src/pl/plperl/expected/plperl_elog_1.out
@@ -97,16 +97,3 @@ NOTICE:  caught die
                    2
 (1 row)
 
--- Test non-ASCII error messages
---
--- Note: this test case is known to fail if the database encoding is
--- EUC_CN, EUC_JP, EUC_KR, or EUC_TW, for lack of any equivalent to
--- U+00A0 (no-break space) in those encodings.  However, testing with
--- plain ASCII data would be rather useless, so we must live with that.
-SET client_encoding TO UTF8;
-create or replace function error_with_nbsp() returns void language plperl as $$
-  elog(ERROR, "this message contains a no-break space");
-$$;
-select error_with_nbsp();
-ERROR:  this message contains a no-break space at line 2.
-CONTEXT:  PL/Perl function "error_with_nbsp"
diff --git a/src/pl/plperl/expected/plperl_unicode.out b/src/pl/plperl/expected/plperl_unicode.out
new file mode 100644
index 0000000..3c48f2e
--- /dev/null
+++ b/src/pl/plperl/expected/plperl_unicode.out
@@ -0,0 +1,18 @@
+-- Test non-ASCII error messages
+--
+-- This test case would fail if the database encoding is EUC_CN, EUC_JP,
+-- EUC_KR, or EUC_TW, for lack of any equivalent to U+00A0 (no-break space) in
+-- those encodings.  However, testing with plain ASCII data would be rather
+-- useless, so we must live with that.
+SELECT getdatabaseencoding() IN ('EUC_CN', 'EUC_JP', 'EUC_KR', 'EUC_TW')
+  AS skip_test \gset
+\if :skip_test
+\quit
+\endif
+SET client_encoding TO UTF8;
+create or replace function error_with_nbsp() returns void language plperl as $$
+  elog(ERROR, "this message contains a no-break space");
+$$;
+select error_with_nbsp();
+ERROR:  this message contains a no-break space at line 2.
+CONTEXT:  PL/Perl function "error_with_nbsp"
diff --git a/src/pl/plperl/expected/plperl_unicode_1.out b/src/pl/plperl/expected/plperl_unicode_1.out
new file mode 100644
index 0000000..761de04
--- /dev/null
+++ b/src/pl/plperl/expected/plperl_unicode_1.out
@@ -0,0 +1,10 @@
+-- Test non-ASCII error messages
+--
+-- This test case would fail if the database encoding is EUC_CN, EUC_JP,
+-- EUC_KR, or EUC_TW, for lack of any equivalent to U+00A0 (no-break space) in
+-- those encodings.  However, testing with plain ASCII data would be rather
+-- useless, so we must live with that.
+SELECT getdatabaseencoding() IN ('EUC_CN', 'EUC_JP', 'EUC_KR', 'EUC_TW')
+  AS skip_test \gset
+\if :skip_test
+\quit
diff --git a/src/pl/plperl/meson.build b/src/pl/plperl/meson.build
index f3a9350..ff41812 100644
--- a/src/pl/plperl/meson.build
+++ b/src/pl/plperl/meson.build
@@ -88,6 +88,7 @@ tests += {
       'plperl_trigger',
       'plperl_shared',
       'plperl_elog',
+      'plperl_unicode',
       'plperl_util',
       'plperl_init',
       'plperlu',
diff --git a/src/pl/plperl/sql/plperl_elog.sql b/src/pl/plperl/sql/plperl_elog.sql
index 9ea1350..032fd8b 100644
--- a/src/pl/plperl/sql/plperl_elog.sql
+++ b/src/pl/plperl/sql/plperl_elog.sql
@@ -76,18 +76,3 @@ return $a + $b;
 $$;
 
 select indirect_die_caller();
-
--- Test non-ASCII error messages
---
--- Note: this test case is known to fail if the database encoding is
--- EUC_CN, EUC_JP, EUC_KR, or EUC_TW, for lack of any equivalent to
--- U+00A0 (no-break space) in those encodings.  However, testing with
--- plain ASCII data would be rather useless, so we must live with that.
-
-SET client_encoding TO UTF8;
-
-create or replace function error_with_nbsp() returns void language plperl as $$
-  elog(ERROR, "this message contains a no-break space");
-$$;
-
-select error_with_nbsp();
diff --git a/src/pl/plperl/sql/plperl_unicode.sql b/src/pl/plperl/sql/plperl_unicode.sql
new file mode 100644
index 0000000..7e1ad74
--- /dev/null
+++ b/src/pl/plperl/sql/plperl_unicode.sql
@@ -0,0 +1,19 @@
+-- Test non-ASCII error messages
+--
+-- This test case would fail if the database encoding is EUC_CN, EUC_JP,
+-- EUC_KR, or EUC_TW, for lack of any equivalent to U+00A0 (no-break space) in
+-- those encodings.  However, testing with plain ASCII data would be rather
+-- useless, so we must live with that.
+SELECT getdatabaseencoding() IN ('EUC_CN', 'EUC_JP', 'EUC_KR', 'EUC_TW')
+  AS skip_test \gset
+\if :skip_test
+\quit
+\endif
+
+SET client_encoding TO UTF8;
+
+create or replace function error_with_nbsp() returns void language plperl as $$
+  elog(ERROR, "this message contains a no-break space");
+$$;
+
+select error_with_nbsp();
diff --git a/src/pl/plpython/expected/plpython_unicode.out b/src/pl/plpython/expected/plpython_unicode.out
index fd54b0b..bd8d9c5 100644
--- a/src/pl/plpython/expected/plpython_unicode.out
+++ b/src/pl/plpython/expected/plpython_unicode.out
@@ -1,11 +1,16 @@
 --
 -- Unicode handling
 --
--- Note: this test case is known to fail if the database encoding is
--- EUC_CN, EUC_JP, EUC_KR, or EUC_TW, for lack of any equivalent to
--- U+00A0 (no-break space) in those encodings.  However, testing with
--- plain ASCII data would be rather useless, so we must live with that.
+-- This test case would fail if the database encoding is EUC_CN, EUC_JP,
+-- EUC_KR, or EUC_TW, for lack of any equivalent to U+00A0 (no-break space) in
+-- those encodings.  However, testing with plain ASCII data would be rather
+-- useless, so we must live with that.
 --
+SELECT getdatabaseencoding() IN ('EUC_CN', 'EUC_JP', 'EUC_KR', 'EUC_TW')
+  AS skip_test \gset
+\if :skip_test
+\quit
+\endif
 SET client_encoding TO UTF8;
 CREATE TABLE unicode_test (
        testvalue  text NOT NULL
diff --git a/src/pl/plpython/expected/plpython_unicode_1.out b/src/pl/plpython/expected/plpython_unicode_1.out
new file mode 100644
index 0000000..f8b21fd
--- /dev/null
+++ b/src/pl/plpython/expected/plpython_unicode_1.out
@@ -0,0 +1,12 @@
+--
+-- Unicode handling
+--
+-- This test case would fail if the database encoding is EUC_CN, EUC_JP,
+-- EUC_KR, or EUC_TW, for lack of any equivalent to U+00A0 (no-break space) in
+-- those encodings.  However, testing with plain ASCII data would be rather
+-- useless, so we must live with that.
+--
+SELECT getdatabaseencoding() IN ('EUC_CN', 'EUC_JP', 'EUC_KR', 'EUC_TW')
+  AS skip_test \gset
+\if :skip_test
+\quit
diff --git a/src/pl/plpython/sql/plpython_unicode.sql b/src/pl/plpython/sql/plpython_unicode.sql
index 14f7b4e..f45844b 100644
--- a/src/pl/plpython/sql/plpython_unicode.sql
+++ b/src/pl/plpython/sql/plpython_unicode.sql
@@ -1,11 +1,16 @@
 --
 -- Unicode handling
 --
--- Note: this test case is known to fail if the database encoding is
--- EUC_CN, EUC_JP, EUC_KR, or EUC_TW, for lack of any equivalent to
--- U+00A0 (no-break space) in those encodings.  However, testing with
--- plain ASCII data would be rather useless, so we must live with that.
+-- This test case would fail if the database encoding is EUC_CN, EUC_JP,
+-- EUC_KR, or EUC_TW, for lack of any equivalent to U+00A0 (no-break space) in
+-- those encodings.  However, testing with plain ASCII data would be rather
+-- useless, so we must live with that.
 --
+SELECT getdatabaseencoding() IN ('EUC_CN', 'EUC_JP', 'EUC_KR', 'EUC_TW')
+  AS skip_test \gset
+\if :skip_test
+\quit
+\endif
 
 SET client_encoding TO UTF8;
 
diff --git a/src/pl/tcl/expected/pltcl_unicode.out b/src/pl/tcl/expected/pltcl_unicode.out
index eea7d70..d33afd7 100644
--- a/src/pl/tcl/expected/pltcl_unicode.out
+++ b/src/pl/tcl/expected/pltcl_unicode.out
@@ -1,11 +1,16 @@
 --
 -- Unicode handling
 --
--- Note: this test case is known to fail if the database encoding is
--- EUC_CN, EUC_JP, EUC_KR, or EUC_TW, for lack of any equivalent to
--- U+00A0 (no-break space) in those encodings.  However, testing with
--- plain ASCII data would be rather useless, so we must live with that.
+-- This test case would fail if the database encoding is EUC_CN, EUC_JP,
+-- EUC_KR, or EUC_TW, for lack of any equivalent to U+00A0 (no-break space) in
+-- those encodings.  However, testing with plain ASCII data would be rather
+-- useless, so we must live with that.
 --
+SELECT getdatabaseencoding() IN ('EUC_CN', 'EUC_JP', 'EUC_KR', 'EUC_TW')
+  AS skip_test \gset
+\if :skip_test
+\quit
+\endif
 SET client_encoding TO UTF8;
 CREATE TABLE unicode_test (
     testvalue  text NOT NULL
diff --git a/src/pl/tcl/expected/pltcl_unicode_1.out b/src/pl/tcl/expected/pltcl_unicode_1.out
new file mode 100644
index 0000000..f8b21fd
--- /dev/null
+++ b/src/pl/tcl/expected/pltcl_unicode_1.out
@@ -0,0 +1,12 @@
+--
+-- Unicode handling
+--
+-- This test case would fail if the database encoding is EUC_CN, EUC_JP,
+-- EUC_KR, or EUC_TW, for lack of any equivalent to U+00A0 (no-break space) in
+-- those encodings.  However, testing with plain ASCII data would be rather
+-- useless, so we must live with that.
+--
+SELECT getdatabaseencoding() IN ('EUC_CN', 'EUC_JP', 'EUC_KR', 'EUC_TW')
+  AS skip_test \gset
+\if :skip_test
+\quit
diff --git a/src/pl/tcl/sql/pltcl_unicode.sql b/src/pl/tcl/sql/pltcl_unicode.sql
index f000604..a09e499 100644
--- a/src/pl/tcl/sql/pltcl_unicode.sql
+++ b/src/pl/tcl/sql/pltcl_unicode.sql
@@ -1,11 +1,16 @@
 --
 -- Unicode handling
 --
--- Note: this test case is known to fail if the database encoding is
--- EUC_CN, EUC_JP, EUC_KR, or EUC_TW, for lack of any equivalent to
--- U+00A0 (no-break space) in those encodings.  However, testing with
--- plain ASCII data would be rather useless, so we must live with that.
+-- This test case would fail if the database encoding is EUC_CN, EUC_JP,
+-- EUC_KR, or EUC_TW, for lack of any equivalent to U+00A0 (no-break space) in
+-- those encodings.  However, testing with plain ASCII data would be rather
+-- useless, so we must live with that.
 --
+SELECT getdatabaseencoding() IN ('EUC_CN', 'EUC_JP', 'EUC_KR', 'EUC_TW')
+  AS skip_test \gset
+\if :skip_test
+\quit
+\endif
 
 SET client_encoding TO UTF8;
