Re: 039_end_of_wal: error in "xl_tot_len zero" test

2024-05-13 Thread Anton Voloshin

On 13/05/2024 00:39, Tom Lane wrote:

Hm.  It occurs to me that there *is* a system-specific component to
the amount of WAL emitted during initdb: the number of locales
that "locale -a" prints translates directly to the number of
rows inserted into pg_collation. [...]


Yes. Another system-specific circumstance affecting WAL position is the 
name length of the unix user doing initdb. I've seen 039_end_of_wal 
failing consistently under user  but working fine with , 
both on the same machine at the same time.


To be more precise, on one particular machine under those particular 
circumstances (including set of locales) it would work for any username 
with length < 8 or >= 16, but would fail for length 8..15 (in bytes, not 
characters, if non-ASCII usernames were used).


--
Anton Voloshin
Postgres Professional, The Russian Postgres Company
https://postgrespro.ru




Re: pgsql: psql: add an optional execution-count limit to \watch.

2024-04-26 Thread Anton Voloshin

On 26/04/2024 17:38, Anton Voloshin wrote:

I will try to report this to Perl community later.


Reported under https://github.com/Perl/perl5/issues/22176

Perl 5.36.3 seems to be fine (latest stable release before 5.38.x).
5.38.0 and 5.38.2 are broken.

--
Anton Voloshin
Postgres Professional, The Russian Postgres Company
https://postgrespro.ru




Re: pgsql: psql: add an optional execution-count limit to \watch.

2024-04-26 Thread Anton Voloshin

On 26/04/2024 05:20, Tom Lane wrote:

Haven't we worked around that everywhere it matters, in commits such
as 8421f6bce and 605062227?


Yes, needing 8421f6bce and 605062227 was, perhaps, surprising, but 
reasonable. Unlike breaking floating point constants in the source code. 
But, I guess, you're right and, since it does look like a Perl bug, 
we'll have to work around that in all places where we use floating-point 
constants in Perl code, which are surprisingly few.


> For me, check-world passes under
> LANG=ru_RU, even with perl 5.38.2 (where I do confirm that your
> test script fails).  The buildfarm isn't unhappy either.

Indeed, check-world seems to run fine on my machine and on the bf as well.

Grepping and browsing through, I've only found three spots with \d\.\d 
directly in Perl code as a float, only one of them needs correction.


1. src/test/perl/PostgreSQL/Test/Kerberos.pm in master
src/test/kerberos/t/001_auth.pl in REL_16_STABLE
> if ($krb5_version >= 1.15)

I guess adding use locale ':numeric' would be easiest workaround here.
Alternatively, we could also split version into krb5_major_version and 
krb5_minor_version while parsing krb5-config --version's output above, 
but I don't think that's warranted. So I suggest something along the 
lines of 0001-use-numeric-locale-in-kerberos-test-rel16.patch and 
*-master.patch (attached, REL_16 and master need this change in 
different places).


I did verify by providing fake 'krb5-config' that before the fix, with 
LANG=ru_RU.UTF-8 and Perl 5.38.2 and with, say, krb5 "version" 1.13 it 
would still add the "listen" lines to kdc.conf by mistake (presumably, 
confusing some versions of kerberos).


2 and 3. contrib/intarray/bench/create_test.pl
> if (rand() < 0.7)
and
> if ($#sect < 0 || rand() < 0.1)

PostgreSQL::Test::Utils is not used there, so it's OK, no change needed.

I did not find any other float constants in .pl/.pm files in master (I 
could have missed something).


> Particularly in
> this way --- what are we supposed to do, write "if (0 < 0,5)"?
> That means something else.

Yep. I will try to report this to Perl community later.

--
Anton Voloshin
Postgres Professional, The Russian Postgres Company
https://postgrespro.ruFrom bc187cd95f91d5a7fe97ba4ccfaf75b77e2f03c8 Mon Sep 17 00:00:00 2001
From: Anton Voloshin 
Date: Fri, 26 Apr 2024 12:24:49 +
Subject: [PATCH] use numeric locale in kerberos test to work around perl bug

After b124104e7 we became susceptible to Perl bug (at least on Perl
5.38.2), where on locales with non-dot fractional part separator,
floating-point constants in perl source are broken (1.15 would be '1').
Work around that by using numeric locale in this file.
---
 src/test/kerberos/t/001_auth.pl | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/test/kerberos/t/001_auth.pl b/src/test/kerberos/t/001_auth.pl
index d74e4af464e..9475f14d50e 100644
--- a/src/test/kerberos/t/001_auth.pl
+++ b/src/test/kerberos/t/001_auth.pl
@@ -23,6 +23,8 @@ use PostgreSQL::Test::Utils;
 use PostgreSQL::Test::Cluster;
 use Test::More;
 use Time::HiRes qw(usleep);
+# Work around the Perl 5.38.2 bug: we need floating-point constants.
+use locale ':numeric';
 
 if ($ENV{with_gssapi} ne 'yes')
 {
-- 
2.43.0

From d6ee9a5567cea503c211fa41a0f741c71aa60ad5 Mon Sep 17 00:00:00 2001
From: Anton Voloshin 
Date: Fri, 26 Apr 2024 14:23:28 +
Subject: [PATCH] use numeric locale in kerberos test to work around perl bug

After b124104e7 we became susceptible to Perl bug (at least on Perl
5.38.2), where on locales with non-dot fractional part separator,
floating-point constants in perl source are broken (1.15 would be '1').
Work around that by using numeric locale in this file.
---
 src/test/perl/PostgreSQL/Test/Kerberos.pm | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/test/perl/PostgreSQL/Test/Kerberos.pm b/src/test/perl/PostgreSQL/Test/Kerberos.pm
index f7810da9c1d..2c29c67329e 100644
--- a/src/test/perl/PostgreSQL/Test/Kerberos.pm
+++ b/src/test/perl/PostgreSQL/Test/Kerberos.pm
@@ -9,6 +9,8 @@ package PostgreSQL::Test::Kerberos;
 use strict;
 use warnings FATAL => 'all';
 use PostgreSQL::Test::Utils;
+# Work around the Perl 5.38.2 bug: we need floating-point constants.
+use locale ':numeric';
 
 our ($krb5_bin_dir, $krb5_sbin_dir, $krb5_config, $kinit, $klist,
 	 $kdb5_util, $kadmin_local, $krb5kdc,
-- 
2.43.0



Re: pgsql: psql: add an optional execution-count limit to \watch.

2024-04-25 Thread Anton Voloshin

Hello, hackers.

On 18/04/2023 20:34, Tom Lane wrote (on pgsql-committers):
> I shall now retire to a safe distance and watch the buildfarm.

Unfortunately, on fresh perl (5.38.2 verified) and on ru_RU.UTF-8 
locale, it breaks basic float comparison: 0 < 0.5 is no longer true.


This is the reproduction on REL_16_STABLE (but it affects master
as well), using fresh Ubuntu 24.04 container.

0. I've used lxc to get a fresh container:
$ lxc launch ubuntu-daily:noble u2404
But I don't think lxc or containerization in general matters in this 
case. Also, I think any environment with fresh enough Perl would work, 
Ubuntu 24.04 is just an easy example.


(obviously, install necessary dev packages)

1. Generate ru_RU.UTF-8 locale:
a. In /etc/locale.gen, uncomment the line:
# ru_RU.UTF-8 UTF-8

b. Run locale-gen as root. For me, it says:
$ sudo locale-gen
Generating locales (this might take a while)...
  en_US.UTF-8... done
  ru_RU.UTF-8... done
Generation complete.

2. Apply 0001-demo-of-weird-Perl-setlocale-effect-on-float-numbers.patch
(adding src/test/authentication/t/999_broken.pl)

3. Run the test
LANG=ru_RU.UTF-8 make check -C src/test/authentication 
PROVE_TESTS=t/999_broken.pl PROVE_FLAGS=--verbose


The test is, basically:
use PostgreSQL::Test::Utils;
use Test::More tests => 1;
ok(0 < 0.5, "0 < 0.5");

If I comment-out the "use PostgreSQL::Test::Utils" line, the test works. 
Otherwise it fails to notice that 0 is less than 0.5.


Alternatively, the test fails if I replace that "use" line with
BEGIN {
use POSIX qw(locale_h);
setlocale(LC_NUMERIC, "");
}

"BEGIN" part is essential: mere use/setlocale is fine.

Also, adding
use locale;
or even
use locale ':numeric';
fixes the test, but I doubt whether it's a good idea to add that to 
Utils.pm.


Obviously, one of the reasons is that according to ru_RU.UTF-8 locale 
for LC_NUMERIC, fractional part separator is ",", not ".". So one could, 
technically, parse "0.5" as "0" and then unparsed ".5" tail. I think it 
might even be a Perl bug, because, according to my quick browsing of man 
perlfunc (setlocale) and man perllocale, this should not affect the code 
outside "use locale", not in such a fundamental way. After all, we're 
talking not about strtod etc, but about floating-point numbers in the 
source code.


P.S. $ perl --version

This is perl 5, version 38, subversion 2 (v5.38.2) built for 
x86_64-linux-gnu-thread-multi

(with 44 registered patches, see perl -V for more detail)

P.P.S. I'm replying to pgsql-hackers, even though part of previous 
discussion have been on pgsql-committers. Hopefully, it's OK.


--
Anton Voloshin
Postgres Professional, The Russian Postgres Company
https://postgrespro.ruFrom 6d1719de49b1c690e1fb7aeed8fc760f053a2a31 Mon Sep 17 00:00:00 2001
From: Anton Voloshin 
Date: Thu, 25 Apr 2024 17:23:26 +
Subject: [PATCH] demo of weird Perl setlocale effect on float numbers
 comparison

---
 src/test/authentication/t/999_broken.pl | 8 
 1 file changed, 8 insertions(+)
 create mode 100644 src/test/authentication/t/999_broken.pl

diff --git a/src/test/authentication/t/999_broken.pl b/src/test/authentication/t/999_broken.pl
new file mode 100644
index 000..e65a272e1cb
--- /dev/null
+++ b/src/test/authentication/t/999_broken.pl
@@ -0,0 +1,8 @@
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Utils;
+
+use Test::More tests => 1;
+
+ok(0 < 0.5, "0 < 0.5");
-- 
2.43.0



Re: Combine headerscheck and cpluspluscheck scripts

2024-04-16 Thread Anton Voloshin

Hello, hackers,

On 10/03/2024 12:03, Peter Eisentraut wrote:

Committed, thanks.


This commit (7b8e2ae2f) have turned cpluspluscheck script into a 
--cplusplus option for headerscheck.  I propose to update the 
src/tools/pginclude/README correspondingly, please see the attached patch.


--
Anton Voloshin
Postgres Professional, The Russian Postgres Company
https://postgrespro.ruFrom a50e58f117341e8a9df5f97fa05630f7b8f4bd86 Mon Sep 17 00:00:00 2001
From: Anton Voloshin 
Date: Tue, 16 Apr 2024 17:19:28 +0300
Subject: [PATCH] Update src/tools/pginclude/README to match recent changes to
 cpluspluscheck

7b8e2ae2f have turned cpluspluscheck from separate script into
a --cplusplus option for headerscheck. Update README correspondingly.
---
 src/tools/pginclude/README | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/src/tools/pginclude/README b/src/tools/pginclude/README
index 712eca76fb3..65372057dad 100644
--- a/src/tools/pginclude/README
+++ b/src/tools/pginclude/README
@@ -1,10 +1,10 @@
 src/tools/pginclude/README
 
-NOTE: headerscheck and cpluspluscheck are in current use, and any
-problems they find should generally get fixed.  The other scripts
-in this directory have not been used in some time, and have issues.
-pgrminclude in particular has a history of creating more problems
-than it fixes.  Be very wary of applying their results blindly.
+NOTE: headerscheck and headerscheck --cplusplus are in current use, and any
+problems they find should generally get fixed.  The other scripts in this
+directory have not been used in some time, and have issues.  pgrminclude in
+particular has a history of creating more problems than it fixes.  Be very
+wary of applying their results blindly.
 
 
 pginclude
@@ -84,16 +84,16 @@ prerequisite, even if postgres_fe.h or c.h would be more appropriate.
 Also note that the contents of macros are not checked; this is intentional.
 
 
-cpluspluscheck
-==
+headerscheck --cplusplus
+
 
-This script can be run to verify that all Postgres include files meet
-the project convention that they will compile as C++ code.  Although
-the project's coding language is C, some people write extensions in C++,
-so it's helpful for include files to be C++-clean.
+The headerscheck in --cplusplus mode can be run to verify that all Postgres
+include files meet the project convention that they will compile as C++
+code.  Although the project's coding language is C, some people write
+extensions in C++, so it's helpful for include files to be C++-clean.
 
 A small number of header files are exempted from this requirement,
-and are skipped by the cpluspluscheck script.
+and are skipped by the script in the --cplusplus mode.
 
 The easy way to run the script is to say "make -s cpluspluscheck" in
 the top-level build directory after completing a build.  You should
-- 
2.43.2



Re: 039_end_of_wal: error in "xl_tot_len zero" test

2024-02-15 Thread Anton Voloshin

Hello, Thomas,

On 19/01/2024 01:35, Thomas Munro wrote:

I don't yet have an opinion on the best way to
do it though.  Would it be enough to add emit_message($node, 0) after
advance_out_of_record_splitting_zone()?


Yes, indeed that seems to be enough. At least I could not produce any 
more "xl_tot_len zero" failures with that addition.


I like this solution the best.


Tolerating the two different messages would weaken the test.


I agree, I also don't really like this option.

--
Anton Voloshin
Postgres Professional, The Russian Postgres Company
https://postgrespro.ru




039_end_of_wal: error in "xl_tot_len zero" test

2024-01-18 Thread Anton Voloshin

Hello, hackers,

I believe there is a small problem in the 039_end_of_wal.pl's 
"xl_tot_len zero" test. It supposes that after immediate shutdown the 
server, upon startup recovery, should always produce a message matching 
"invalid record length at .*: wanted 24, got 0". However, if the 
circumstances are just right and we happened to hit exactly on the edge 
of WAL page, then the message on startup recovery would be "invalid 
magic number  in log segment .*, offset .*". The test does not take 
that into account.


Now, reproducing this is somewhat tricky, because exact position in WAL 
at the test time depends on what exactly initdb did, and that not only 
differs in different major verisons, but also depends on e.g. username 
length, locales available, and, perhaps, more. Even though originally 
this problem was found "in the wild" on one particular system on one 
particular code branch, I've written small helper patch to make 
reproduction on master easier, see 
0001-repro-for-039_end_of_wal-s-problem-with-page-end.patch.


This patch adds single emit_message of (hopefully) the right size to 
make sure we hit end of WAL block right by the time we call 
$node->stop('immediate') in "xl_tot_len zero" test. With this patch, 
"xl_tot_len zero" test fails every time because the server writes 
"invalid magic number  in log segment" while the test still only 
expects "invalid record length at .*: wanted 24, got 0". If course, this 
0001 patch is *not* meant to be committed, but only as an issue 
reproduction helper.


I can think of two possible fixes:

1. Update advance_out_of_record_splitting_zone to also avoid stopping at
   exactly the block end:

 my $page_offset = $end_lsn % $WAL_BLOCK_SIZE;
-while ($page_offset >= $WAL_BLOCK_SIZE - $page_threshold)
+while ($page_offset >= $WAL_BLOCK_SIZE - $page_threshold || 
$page_offset <= $SizeOfXLogShortPHD)

 {
see 0002-fix-xl_tot_len-zero-test-amend-advance_out_of.patch

We need to compare with $SizeOfXLogShortPHD (and not with zero) because 
at that point, even though we didn't actually write out new WAL page

yet, it's header is already in place in memory and taken in account
for LSN reporting.

2. Alternatively, amend "xl_tot_len zero" test to expect "invalid magic
   number  in WAL segment" message as well:

 $node->start;
 ok( $node->log_contains(
+"invalid magic number  in WAL segment|" .
 "invalid record length at .*: expected at least 24, got 0", 
$log_size

 ),
see 0003-alt.fix-for-xl_tot_len-zero-test-accept-invalid.patch

I think it makes sense to backport whatever the final change would be to 
all branches with 039_end_of_wal (REL_12+).


Any thoughts?

Anton Voloshin
Postgres Professional, The Russian Postgres Company
https://postgrespro.ruFrom 5f12139816f6c1bc7d625ba8007aedb8f5d04a71 Mon Sep 17 00:00:00 2001
From: Anton Voloshin 
Date: Thu, 18 Jan 2024 12:45:12 +0300
Subject: [PATCH 1/3] repro for 039_end_of_wal's problem with page
 end

Tags: commitfest_hotfix
---
 src/test/recovery/t/039_end_of_wal.pl | 20 
 1 file changed, 20 insertions(+)

diff --git a/src/test/recovery/t/039_end_of_wal.pl b/src/test/recovery/t/039_end_of_wal.pl
index f9acc83c7d0..ecf9b6089de 100644
--- a/src/test/recovery/t/039_end_of_wal.pl
+++ b/src/test/recovery/t/039_end_of_wal.pl
@@ -36,6 +36,8 @@ my $WAL_SEGMENT_SIZE;
 my $WAL_BLOCK_SIZE;
 my $TLI;
 
+my $SizeOfXLogShortPHD = 24; # rounded up to 8 bytes
+
 # Build path of a WAL segment.
 sub wal_segment_path
 {
@@ -258,9 +260,27 @@ my $prev_lsn;
 note "Single-page end-of-WAL detection";
 ###
 
+my $lsn = get_insert_lsn($node);
+my $lsn_offset = $lsn % $WAL_BLOCK_SIZE;
+my $empty_msg_size = ( ( ($ENV{EMPTY_MSG_SIZE} || 51) + 7) / 8) * 8;
+my $commit_msg_size = ( (34 + 7) / 8) * 8;
+# empty logical msg and commit message take this many bytes of WAL:
+my $empty_msg_overhead = $empty_msg_size + $commit_msg_size;
+# cound overhead twice to account for two emit_message calls below:
+# we want to hit the page edge after the second call.
+my $target_lsn_offset = $WAL_BLOCK_SIZE * 2 - $empty_msg_overhead * 2;
+my $msg_size = ($target_lsn_offset - $lsn_offset) % $WAL_BLOCK_SIZE;
+# If we happen to switch to the next WAL block mid-message, reduce the message
+# by $SizeOfXLogShortPHD (minimal page header) to hit the same target.
+if ($lsn_offset + $msg_size >= $WAL_BLOCK_SIZE) { $msg_size -= $SizeOfXLogShortPHD; }
+print "QWE0: $lsn WAL_BLOCK_SIZE='$WAL_BLOCK_SIZE', lsn_offset='$lsn_offset' target_lsn_offset='$target_lsn_offset' msg_size='$msg_size'\n";
+emit_message($node, $msg_size);
+printf "QWE1: %s - after corrective msg\n", $node->safe_psql('postgres', "SELECT pg_current_wal_lsn()");
 # xl_

Re: duplicate function declaration in multirangetypes_selfuncs.c

2023-04-21 Thread Anton Voloshin

On 21/04/2023 14:14, Daniel Gustafsson wrote:

I'll take care of these in a bit (unless someone finds more, or objects)
backpatching them to their respective origins branches


Thanks!

I went through master with
find . -name "*.[ch]" -exec bash -c 'echo {}; uniq -d {}' \;|sed -E 
'/^[[:space:]*]*$/d;'


and could not find any other obvious unintentional duplicates, except 
the two mentioned already. There seems to be some strange duplicates in 
snowball, but that's external and generated code and I could not figure 
out quickly whether those are intentional or not. Hopefully, they are 
harmless or intentional.


All other duplicated lines I've analyzed seem to be intentional.

Granted, I've mostly ignored lines without ';', also I could have missed 
something, but currently I'm not aware of any other unintentionally 
duplicated lines.


--
Anton Voloshin
Postgres Professional, The Russian Postgres Company
https://postgrespro.ru




Re: duplicate function declaration in multirangetypes_selfuncs.c

2023-04-21 Thread Anton Voloshin

On 21/04/2023 13:45, Pavel Borisov wrote:

The patch is attached. Anyone to commit?


Speaking of duplicates, I just found another one:
>break;
>break;
in src/interfaces/ecpg/preproc/variable.c
(in all stable branches).

Sorry for missing it in the previous letter.

Additional patch attached. Or both could go in the same commit, it's up 
to committer.


--
Anton Voloshin
Postgres Professional, The Russian Postgres Company
https://postgrespro.ruFrom 05c9979de0596f9bbe89304f54e20b759205b57b Mon Sep 17 00:00:00 2001
From: Anton Voloshin 
Date: Fri, 21 Apr 2023 13:55:05 +0300
Subject: [PATCH] Remove duplicate break in find_struct_member's switch

---
 src/backend/utils/adt/multirangetypes_selfuncs.c | 1 -
 src/interfaces/ecpg/preproc/variable.c   | 1 -
 2 files changed, 2 deletions(-)

diff --git a/src/backend/utils/adt/multirangetypes_selfuncs.c b/src/backend/utils/adt/multirangetypes_selfuncs.c
index e9326f8342a..cefc4710fd4 100644
--- a/src/backend/utils/adt/multirangetypes_selfuncs.c
+++ b/src/backend/utils/adt/multirangetypes_selfuncs.c
@@ -35,7 +35,6 @@ static double calc_multirangesel(TypeCacheEntry *typcache,
  VariableStatData *vardata,
  const MultirangeType *constval, Oid operator);
 static double default_multirange_selectivity(Oid operator);
-static double default_multirange_selectivity(Oid operator);
 static double calc_hist_selectivity(TypeCacheEntry *typcache,
 	VariableStatData *vardata,
 	const MultirangeType *constval,
diff --git a/src/interfaces/ecpg/preproc/variable.c b/src/interfaces/ecpg/preproc/variable.c
index 2a2b9531187..b23ed5edf46 100644
--- a/src/interfaces/ecpg/preproc/variable.c
+++ b/src/interfaces/ecpg/preproc/variable.c
@@ -105,7 +105,6 @@ find_struct_member(char *name, char *str, struct ECPGstruct_member *members, int
 		else
 			return find_struct_member(name, ++end, members->type->u.members, brace_level);
 		break;
-		break;
 	case '.':
 		if (members->type->type == ECPGt_array)
 			return find_struct_member(name, end, members->type->u.element->u.members, brace_level);
-- 
2.40.0



duplicate function declaration in multirangetypes_selfuncs.c

2023-04-21 Thread Anton Voloshin

Hello, hackers,

we have a duplicate line, declaration of default_multirange_selectivity() in
src/backend/utils/adt/multirangetypes_selfuncs.c:

static double default_multirange_selectivity(Oid operator);
static double default_multirange_selectivity(Oid operator);

Affected branches: REL_14_STABLE and above.

Both lines come from the same commit:
> commit 6df7a9698bb036610c1e8c6d375e1be38cb26d5f
> Author: Alexander Korotkov 
> Date:   Sun Dec 20 07:20:33 2020
>
> Multirange datatypes

No harm from this duplication, still, I suggest to clean it up for 
tidiness' sake.


--
Anton Voloshin
Postgres Professional, The Russian Postgres Company
https://postgrespro.ru




patch suggestion: Fix citext_utf8 test's "Turkish I" with ICU collation provider

2022-10-21 Thread Anton Voloshin

Hello, hackers.

In current master, as well as in REL_15_STABLE, installcheck in 
contrib/citext fails in most locales, if we use ICU as a locale provider:


$ rm -fr data; initdb --locale-provider icu --icu-locale en-US -D data 
&& pg_ctl -D data -l logfile start && make -C contrib/citext 
installcheck; pg_ctl -D data stop; cat contrib/citext/regression.diffs

...
test citext   ... ok  457 ms
test citext_utf8  ... FAILED   21 ms
...
diff -u 
/home/ashutosh/pg/REL_15_STABLE/contrib/citext/expected/citext_utf8.out 
/home/ashutosh/pg/REL_15_STABLE/contrib/citext/results/citext_utf8.out
--- 
/home/ashutosh/pg/REL_15_STABLE/contrib/citext/expected/citext_utf8.out 
   2022-07-14 17:45:31.747259743 +0300
+++ 
/home/ashutosh/pg/REL_15_STABLE/contrib/citext/results/citext_utf8.out 
   2022-10-21 19:43:21.146044062 +0300

@@ -54,7 +54,7 @@
 SELECT 'i'::citext = 'İ'::citext AS t;
  t
 ---
- t
+ f
 (1 row)

The reason is that in ICU lowercasing Unicode symbol "İ" (U+0130
"LATIN CAPITAL LETTER I WITH DOT ABOVE") can give two valid results:
- "i", i.e. "U+0069 LATIN SMALL LETTER I" in "tr" and "az" locales.
- "i̇", i.e. "U+0069 LATIN SMALL LETTER I" followed by "U+0307 COMBINING
  DOT ABOVE" in all other locales I've tried (including "en-US", "de",
  "ru", etc).
And the way this test is currently written only accepts plain latin "i", 
which might be true in glibc, but is not so in ICU. Verified on ICU 
70.1, but I've seen this on few other ICU versions as well, so I think 
this is probably an ICU's feature, not a bug(?).


Since we probably want installcheck in contrib/citext to pass on
databases with various locales, including reasonable ICU-based ones,
I suggest to fix this test by accepting either of outputs as valid.

I can see two ways of doing that:
1. change SQL in the test to use "IN" instead of "=";
2. add an alternative output.

I think in this case "IN" is better, because that allows a single 
comment to address both possible outputs and to avoid unnecessary 
duplication.


I've attached a patch authored mostly by my colleague, Roman Zharkov, as 
one possible fix.


Only versions 15+ are affected.

Any comments?

--
Anton Voloshin
Postgres Professional, The Russian Postgres Company
https://postgrespro.ruFrom 5cfbb59a11d099fa9b8703502fd8aac936a02c4d Mon Sep 17 00:00:00 2001
From: Roman Zharkov 
Date: Fri, 21 Oct 2022 19:56:19 +0300
Subject: [PATCH] Fix citext_utf8 test's "Turkish I" with ICU collation
 provider
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

With the ICU collation provider the Turkish unicode symbol "İ" (U+0130
"LATIN CAPITAL LETTER I WITH DOT ABOVE") has two lowercase variants:
- "i", i.e. "U+0069 LATIN SMALL LETTER I", in "tr" and "az" locales.
- "i̇", i.e. "U+0069 LATIN SMALL LETTER I" followed by "U+0307 COMBINING
  DOT ABOVE" in all other locales I've tried (including "en-US", "de",
  "ru", etc).

So, add both variants to the test.
---
 contrib/citext/expected/citext_utf8.out | 6 --
 contrib/citext/sql/citext_utf8.sql  | 6 --
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/contrib/citext/expected/citext_utf8.out b/contrib/citext/expected/citext_utf8.out
index 666b07ccec4..62f0794028f 100644
--- a/contrib/citext/expected/citext_utf8.out
+++ b/contrib/citext/expected/citext_utf8.out
@@ -48,10 +48,12 @@ SELECT 'Ä'::citext <> 'Ä'::citext AS t;
  t
 (1 row)
 
--- Test the Turkish dotted I. The lowercase is a single byte while the
+-- Test the Turkish dotted I. The lowercase might be a single byte while the
 -- uppercase is multibyte. This is why the comparison code can't be optimized
 -- to compare string lengths.
-SELECT 'i'::citext = 'İ'::citext AS t;
+-- Note that lower('İ') is 'i' (U+0069) in tr and az locales,
+-- but 'i̇' (U+0069 U+0307) in C and most (all?) other locales.
+SELECT 'İ'::citext in ('i'::citext, 'i̇'::citext) AS t;
  t 
 ---
  t
diff --git a/contrib/citext/sql/citext_utf8.sql b/contrib/citext/sql/citext_utf8.sql
index d068000b423..942daa9ce50 100644
--- a/contrib/citext/sql/citext_utf8.sql
+++ b/contrib/citext/sql/citext_utf8.sql
@@ -24,10 +24,12 @@ SELECT 'À'::citext <> 'B'::citext AS t;
 SELECT 'Ä'::text   <> 'Ä'::text   AS t;
 SELECT 'Ä'::citext <> 'Ä'::citext AS t;
 
--- Test the Turkish dotted I. The lowercase is a single byte while the
+-- Test the Turkish dotted I. The lowercase might be a single byte while the
 -- uppercase is multibyte. This is why the comparison code can't be optimized
 -- to compare string lengths.
-SELECT 'i'::citext = 'İ'::citext AS t;
+-- Note that lower('İ') is 'i' (U+0069) in tr and az locales,
+-- but 'i̇' (U+0069 U+0307) in C and most (all?) other locales.
+SELECT 'İ'::citext in ('i'::citext, 'i̇'::citext) AS t;
 
 -- Regression.
 SELECT 'láska'::citext <> 'laská'::citext AS t;
-- 
2.38.1



Re: Make #else/#endif comments more consistent

2022-08-29 Thread Anton Voloshin

On 29/08/2022 14:50, Peter Eisentraut wrote:
I usually try to follow the guidelines in 
<https://www.gnu.org/prep/standards/html_node/Comments.html>, which 
pretty much match what you are proposing.


Thank you for the link, it's a useful one and the wording is better than 
mine.


Note that for _MSC_VER in particular there is some trickiness: We 
generally use it to tell apart different MSVC compiler versions.


That's certainly true in branches <= 15, but in master, to my surprise, 
I don't see any numerical comparisons of _MSC_VER since the recent 
6203583b7.


I'm not sure explicit !defined(_MSC_VER) is all that more clear
than !_MSC_VER in the commentary. After all, we would probably
never (?) see an actual
#if (!_MSC_VER)
in a real code.

So I have mixed feelings on forcing define() on _MSC_VER, but if you 
insist, I don't mind much either way.


What about other changes? Are there any obviously wrong or missed ones?

--
Anton Voloshin
Postgres Professional, The Russian Postgres Company
https://postgrespro.ru





Make #else/#endif comments more consistent

2022-08-29 Thread Anton Voloshin

Hello,

while reading the postgres code, occasionally I see a little bit of 
inconsistency in the comments after #else (and corresponding #endif).


In some places #else/endif's comment expresses condition for else block 
to be active:

#ifdef HAVE_UUID_OSSP
...
#else   /* !HAVE_UUID_OSSP */
...
#endif  /* HAVE_UUID_OSSP */

and in others -- just the opposite:

#ifdef SHA2_UNROLL_TRANSFORM
...
#else   /* SHA2_UNROLL_TRANSFORM */
...
#endif  /* SHA2_UNROLL_TRANSFORM */

Also, #endif comment after #else might expresses condition for else 
block to be active:

#ifdef USE_ICU
...
#else   /* !USE_ICU */
...
#endif  /* !USE_ICU */

or it might be just the opposite, like in HAVE_UUID_OSSP and 
SHA2_UNROLL_TRANSFORM examples above.



I propose making them more consistent. Would the following guidelines be 
acceptable?



1. #else/#elif/#endif's comment, if present, should reflect the
condition of the #else/#elif block as opposed to always being a copy
of #if/ifdef/ifndef condition.

e.g. prefer this:
#if LLVM_VERSION_MAJOR > 11
...
#else   /* LLVM_VERSION_MAJOR <= 11 */
...
#endif  /* LLVM_VERSION_MAJOR <= 11 */

over this:

#if LLVM_VERSION_MAJOR > 11
...
#else   /* LLVM_VERSION_MAJOR > 11 */
...
#endif  /* LLVM_VERSION_MAJOR > 11 */


2. In #else/#elif/#endif comments, prefer A to defined(A).

E.g. prefer this:
#endif  /* DMETAPHONE_MAIN */
over
#endif  /* defined DMETAPHONE_MAIN */

And this:
#else   /* !_MSC_VER */
over
#else   /* !defined(_MSC_VER) */


3. Textual hand-crafted condition comments are perfectly fine.
Like this:
#else   /* no ppoll(), so use select() */


4. #else/#endif condition comment, if present, should reflect the
*effective* condition, i.e. condition taking into account previous
#if/#elif-s.

E.g. do this:
#if defined(HAVE_INT128)
...
#elif defined(HAS_64_BIT_INTRINSICS)
...
#else   /* !HAVE_INT128 && !HAS_64_BIT_INTRINSICS */
...
#endif  /* !HAVE_INT128 && !HAS_64_BIT_INTRINSICS */


5. Comment of the form "!A && !B", if deemed complicated enough, may
also be expressed as "neither A nor B" for easier reading.

Example:
#if (defined(HAVE_LANGINFO_H) && defined(CODESET)) || defined(WIN32)
...
#else   /* neither (HAVE_LANGINFO_H && CODESET) 
nor WIN32 */

...
#endif  /* neither (HAVE_LANGINFO_H && CODESET) 
nor WIN32 */



6. Use "!" as opposed to "not" to be consistent. E.g. do this:
#ifdef LOCK_DEBUG
...
#else   /* !LOCK_DEBUG */
...
#endif  /* !LOCK_DEBUG */

as opposed to:

#ifdef LOCK_DEBUG
...
#else   /* not LOCK_DEBUG */
...
#endif  /* not LOCK_DEBUG */


The draft of proposed changes is attached as
0001-Make-else-endif-comments-more-consistent.patch
In the patch I've also cleaned up some minor things, like removing
occasional "//" comments within "/* */" ones.

Any thoughts?
--
Anton Voloshin
Postgres Professional, The Russian Postgres Company
https://postgrespro.ruFrom a118eb1c4caa1a140cd9a8c4230b91c7bfb91773 Mon Sep 17 00:00:00 2001
From: Anton Voloshin 
Date: Sat, 27 Aug 2022 15:56:11 +0300
Subject: [PATCH] Make else/endif comments more consistent

This only changes condition comments after some preprocessor directives
(mostly else and endif).

1. #else/#elif/#endif's comment, if present, should reflect the condition
of the #else/#elif block as opposed to always being a copy of #if/ifdef/ifndef
condition.

e.g. do this:
#if LLVM_VERSION_MAJOR > 11
...
#else   /* LLVM_VERSION_MAJOR <= 11 */
...
#endif  /* LLVM_VERSION_MAJOR <= 11 */

as opposed to

#if LLVM_VERSION_MAJOR > 11
...
#else   /* LLVM_VERSION_MAJOR > 11 */
...
#endif  /* LLVM_VERSION_MAJOR > 11 */


2. In #else/#elif/#endif comments, prefer A to defined(A).

E.g. prefer this:
#endif  /* DMETAPHONE_MAIN */
over
#endif  /* defined DMETAPHONE_MAIN */

And this:
#else   /* !_MSC_VER */
over
#else   /* !defined(_MSC_VER) */


3. Textual hand-crafted condition comments are perfectly fine.
Like this:
#else   /* no ppoll(), so use select() */


4. #else/#endif condition comment, if present, should reflect the *effective*
condition, i.e. condition taking into account previous #if/#elif-s.

E.g. do this:
#if defined(H

[PATCH] allow src/tools/msvc/*.bat files to be called from the root of the source tree

2021-12-29 Thread Anton Voloshin

Hello,

currently, on Windows/MSVC, src\tools\msvc\*.bat files mostly require 
being in that src\tools\msvc directory first.


I suggest an obvious fix:

diff --git a/src/tools/msvc/build.bat b/src/tools/msvc/build.bat
index 4001ac1d0d1..407b6559cfb 100755
--- a/src/tools/msvc/build.bat
+++ b/src/tools/msvc/build.bat
@@ -3,4 +3,4 @@ REM src/tools/msvc/build.bat
 REM all the logic for this now belongs in build.pl. This file really
 REM only exists so you don't have to type "perl build.pl"
 REM Resist any temptation to add any logic here.
-@perl build.pl %*
+@perl %~dp0\build.pl %*
diff --git a/src/tools/msvc/install.bat b/src/tools/msvc/install.bat
index d03277eff2b..98edf6bdffb 100644
--- a/src/tools/msvc/install.bat
+++ b/src/tools/msvc/install.bat
@@ -3,4 +3,4 @@ REM src/tools/msvc/install.bat
 REM all the logic for this now belongs in install.pl. This file really
 REM only exists so you don't have to type "perl install.pl"
 REM Resist any temptation to add any logic here.
-@perl install.pl %*
+@perl %~dp0\install.pl %*
diff --git a/src/tools/msvc/vcregress.bat b/src/tools/msvc/vcregress.bat
index a981d3a6aa1..0d65c823e13 100644
--- a/src/tools/msvc/vcregress.bat
+++ b/src/tools/msvc/vcregress.bat
@@ -3,4 +3,4 @@ REM src/tools/msvc/vcregress.bat
 REM all the logic for this now belongs in vcregress.pl. This file really
 REM only exists so you don't have to type "perl vcregress.pl"
 REM Resist any temptation to add any logic here.
-@perl vcregress.pl %*
+@perl %~dp0\vcregress.pl %*

This patch uses standard windows cmd's %~dp0 to get the complete path 
(drive, "d", and path, "p") of the currently executing .bat file to get 
proper path of a .pl file to execute. I find the following link useful 
whenever I need to remember details on cmd's %-substitution rules: 
https://ss64.com/nt/syntax-args.html


With this change, one can call those .bat files, e.g. 
src\tools\msvc\build.bat, without leaving the root of the source tree.


Not sure if similar change should be applied to pgflex.bat and 
pgbison.bat -- never used them on Windows and they seem to require being 
called from the root, but perhaps they deserve a similar change.


If accepted, do you think this change is worthy of back-porting?

Please advise if you think this change is a beneficial one.

P.S. Yes, I am aware of very probable upcoming move to meson, but until 
then this little patch really helps me whenever I have to deal with 
Windows and MSVC from the command line. Besides, it could help old 
branches as well.


--
Anton Voloshin

https://postgrespro.ru
Postgres Professional, The Russian Postgres Companydiff --git a/src/tools/msvc/build.bat b/src/tools/msvc/build.bat
index 4001ac1d0d1..407b6559cfb 100755
--- a/src/tools/msvc/build.bat
+++ b/src/tools/msvc/build.bat
@@ -3,4 +3,4 @@ REM src/tools/msvc/build.bat
 REM all the logic for this now belongs in build.pl. This file really
 REM only exists so you don't have to type "perl build.pl"
 REM Resist any temptation to add any logic here.
-@perl build.pl %*
+@perl %~dp0\build.pl %*
diff --git a/src/tools/msvc/install.bat b/src/tools/msvc/install.bat
index d03277eff2b..98edf6bdffb 100644
--- a/src/tools/msvc/install.bat
+++ b/src/tools/msvc/install.bat
@@ -3,4 +3,4 @@ REM src/tools/msvc/install.bat
 REM all the logic for this now belongs in install.pl. This file really
 REM only exists so you don't have to type "perl install.pl"
 REM Resist any temptation to add any logic here.
-@perl install.pl %*
+@perl %~dp0\install.pl %*
diff --git a/src/tools/msvc/vcregress.bat b/src/tools/msvc/vcregress.bat
index a981d3a6aa1..0d65c823e13 100644
--- a/src/tools/msvc/vcregress.bat
+++ b/src/tools/msvc/vcregress.bat
@@ -3,4 +3,4 @@ REM src/tools/msvc/vcregress.bat
 REM all the logic for this now belongs in vcregress.pl. This file really
 REM only exists so you don't have to type "perl vcregress.pl"
 REM Resist any temptation to add any logic here.
-@perl vcregress.pl %*
+@perl %~dp0\vcregress.pl %*


Re: Triage for unimplemented geometric operators

2021-12-07 Thread Anton Voloshin

On 07/12/2021 06:18, Tom Lane wrote:

So my inclination for HEAD is to implement poly_distance and nuke
the others.  I'm a bit less sure about the back branches, but maybe
just de-document all of them there.


I agree, seems to be a reasonable compromise. Removing 20+-years old 
unused and slightly misleading code probably should overweight the 
natural inclination to implement all of the functions promised in the 
catalog. Enhancing software by deleting the code is not an everyday 
chance and IMHO should be taken, even when it requires an extra 
catversion bump.


I am kind of split on whether it is worth it to implement poly_distance 
in back branches. Maybe there is a benefit in implementing: it should 
not cause any reasonable incompatibilities and will introduce somewhat 
better compatibility with v15+. We could even get away with not updating 
v10..12' docs on poly_distance because it's not mentioned anyway.


I agree on de-documenting all of the unimplemented functions in v13 and 
v14. Branches before v13 should not require any changes (including 
documentation) because detailed table on which operators support which 
geometry primitives only came in 13, and I could not find in e.g. 12's 
documentation any references to the cases you listed previously:

> dist_lb:   <->(line,box)
> dist_bl:   <->(box,line)
> close_sl:  lseg ## line
> close_lb:  line ## box
> poly_distance: polygon <-> polygon
> path_center:   @@ path

--
Anton Voloshin
Postgres Professional, The Russian Postgres Company
https://postgrespro.ru




Re: fix warnings in 9.6, 10, 11's contrib when compiling without openssl

2021-11-11 Thread Anton Voloshin

On 11/11/2021 08:14, Michael Paquier wrote:

I would suggest to send any improvements for imath directly to
upstream, then Postgres would just feed on that when we update it:
https://github.com/creachadair/imath


I didn't realise imath comes (occasionally) from external repo. Thank you.

The compilation warnings are only relevant for their old releases, 
before imath v1.27 where they've replaced macros with inline functions, see

https://github.com/creachadair/imath/commit/d4760eef8

So imath in upstream doesn't require updates (especially since those 
warnings come specifically from pgindent's work, not from the upstream 
source). But Postgres' 10_STABLE and 11_STABLE still use old version 
with CLAMP as a macro.


Noah Misch was last to update imath from upstream in February 2019 and 
that change became part of REL_12_STABLE and later:

> Import changes from IMath versions (1.3, 1.29].
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=48e24ba6b7fd3bfd156b51e8d768fd48df0d288b

Noah and others, do you think it would be a good idea to update imath in 
REL_10_STABLE and REL_11_STABLE to a later upstream version to avoid 
those warnings about misleading indent? (see beginning of this thread) 
Admittedly, those warnings can only be seen in unusual circumstances (no 
openssl), but still, as you have mentioned in your commit message,

> We're better off naively tracking upstream than reactively
> maintaining a twelve-year-old snapshot of upstream.

Perhaps even update imath in all relevant stable branches, 10..14 
inclusive, to the same upstream version?


On 11/11/2021 08:14, Michael Paquier wrote:

Peter has removed this code as of db7d1a7, but there could still be a
point in updating imath on the stable branches where it exists if
there is an issue with it impacting pgcrypto without OpenSSL.


Perhaps that would be a good idea, see above.

--
Anton Voloshin
https://postgrespro.ru
Postgres Professional, The Russian Postgres Company




fix warnings in 9.6, 10, 11's contrib when compiling without openssl

2021-11-10 Thread Anton Voloshin

Hello,

after plain ./configure without options (including noticeable absence of 
--with-openssl) and make,
make -C contrib COPT=-Werror gives error with gcc-11 on REL_9_6_STABLE, 
REL_10_STABLE or REL_11_STABLE.


The warning/error is about misleading indent in 
contrib/pgcrypto/imath.c's CLAMP macro:


imath.c: In function ‘mp_int_add’:
imath.c:133:1: error: this ‘while’ clause does not guard... 
[-Werror=misleading-indentation]

  133 | while(uz_ > 1 && (*dz_-- == 0)) --uz_;MP_USED(z_)=uz_;}while(0)
  | ^
imath.c:670:17: note: in expansion of macro ‘CLAMP’
  670 | CLAMP(c);
  | ^
In file included from imath.c:34:
imath.h:68:26: note: ...this statement, but the latter is misleadingly 
indented as if it were guarded by the ‘while’

   68 | #define MP_USED(Z)   ((Z)->used)
  |  ^
imath.c:133:39: note: in expansion of macro ‘MP_USED’
  133 | while(uz_ > 1 && (*dz_-- == 0)) --uz_;MP_USED(z_)=uz_;}while(0)
  |   ^~~
imath.c:670:17: note: in expansion of macro ‘CLAMP’
  670 | CLAMP(c);
  | ^

pgcrypto-fix-imath-clamp-warning.patch, attached, is a suggested minimal 
fix:

diff --git a/contrib/pgcrypto/imath.c b/contrib/pgcrypto/imath.c
index b94a51b81a4..801b843cbe3 100644
--- a/contrib/pgcrypto/imath.c
+++ b/contrib/pgcrypto/imath.c
@@ -130,7 +130,8 @@ do{T *u_=(A),*v_=u_+(N)-1;while(u_xch=*u_;*u_++=*v_;*v_--=xch;}}while(0)

 #else
 #define CLAMP(Z) \
 do{mp_int z_=(Z);mp_size uz_=MP_USED(z_);mp_digit 
*dz_=MP_DIGITS(z_)+uz_-1;\

-while(uz_ > 1 && (*dz_-- == 0)) --uz_;MP_USED(z_)=uz_;}while(0)
+while(uz_ > 1 && (*dz_-- == 0)) --uz_;\
+MP_USED(z_)=uz_;}while(0)
 #endif

 #undef MIN

The same patch works for all 9.6, 10 and 11. It's not needed in 12 or 
later, they compile without warnings just fine even without --with-openssl.


In addition, 9.6 (unlike 10 and 11) gives four "array-parameter=" 
warnings about contrib/pgcrypto/sha2.c:


sha2.c:552:20: error: argument 1 of type ‘uint8[]’ {aka ‘unsigned 
char[]’} with mismatched bound [-Werror=array-parameter=]

  552 | SHA256_Final(uint8 digest[], SHA256_CTX *context)
  |  ~~^~~~
In file included from sha2.c:44:
sha2.h:93:30: note: previously declared as ‘uint8[32]’ {aka ‘unsigned 
char[32]’}
   93 | voidSHA256_Final(uint8[SHA256_DIGEST_LENGTH], 
SHA256_CTX *);

  |  ^~~

and the same for SHA{512,384,224}_Final.

pgcrypto-fix-sha2-warning.patch, attached, is a suggested minimal fix 
for that:

 void
-SHA256_Final(uint8 digest[], SHA256_CTX *context)
+SHA256_Final(uint8 digest[SHA256_DIGEST_LENGTH], SHA256_CTX *context)
 {
etc.

Please consider fixing those warnings.

--
Anton Voloshin
Postgres Professional, The Russian Postgres Company
https://postgrespro.ru
diff --git a/contrib/pgcrypto/imath.c b/contrib/pgcrypto/imath.c
index b94a51b81a4..801b843cbe3 100644
--- a/contrib/pgcrypto/imath.c
+++ b/contrib/pgcrypto/imath.c
@@ -130,7 +130,8 @@ do{T *u_=(A),*v_=u_+(N)-1;while(u_ 1 && (*dz_-- == 0)) --uz_;MP_USED(z_)=uz_;}while(0)
+while(uz_ > 1 && (*dz_-- == 0)) --uz_;\
+MP_USED(z_)=uz_;}while(0)
 #endif
 
 #undef MIN
diff --git a/contrib/pgcrypto/sha2.c b/contrib/pgcrypto/sha2.c
index 231f9dfbb0e..11311deda8d 100644
--- a/contrib/pgcrypto/sha2.c
+++ b/contrib/pgcrypto/sha2.c
@@ -549,7 +549,7 @@ SHA256_Last(SHA256_CTX *context)
 }
 
 void
-SHA256_Final(uint8 digest[], SHA256_CTX *context)
+SHA256_Final(uint8 digest[SHA256_DIGEST_LENGTH], SHA256_CTX *context)
 {
 	/* If no digest buffer is passed, we don't bother doing this: */
 	if (digest != NULL)
@@ -877,7 +877,7 @@ SHA512_Last(SHA512_CTX *context)
 }
 
 void
-SHA512_Final(uint8 digest[], SHA512_CTX *context)
+SHA512_Final(uint8 digest[SHA512_DIGEST_LENGTH], SHA512_CTX *context)
 {
 	/* If no digest buffer is passed, we don't bother doing this: */
 	if (digest != NULL)
@@ -922,7 +922,7 @@ SHA384_Update(SHA384_CTX *context, const uint8 *data, size_t len)
 }
 
 void
-SHA384_Final(uint8 digest[], SHA384_CTX *context)
+SHA384_Final(uint8 digest[SHA384_DIGEST_LENGTH], SHA384_CTX *context)
 {
 	/* If no digest buffer is passed, we don't bother doing this: */
 	if (digest != NULL)
@@ -966,7 +966,7 @@ SHA224_Update(SHA224_CTX *context, const uint8 *data, size_t len)
 }
 
 void
-SHA224_Final(uint8 digest[], SHA224_CTX *context)
+SHA224_Final(uint8 digest[SHA224_DIGEST_LENGTH], SHA224_CTX *context)
 {
 	/* If no digest buffer is passed, we don't bother doing this: */
 	if (digest != NULL)


Re: [PATCH] Print error when libpq-refs-stamp fails

2021-10-04 Thread Anton Voloshin

Hello,

On 28/09/2021 05:55, Rachel Heaton wrote:

Hello,

While developing I got this error and it was difficult to figure out 
what was going on.


Thanks to Jacob, I was able to learn the context of the failure, so we 
created this small patch.


-   ! nm -A -u $< 2>/dev/null | grep -v __cxa_atexit | grep exit
+   @if nm -a -u $< 2>/dev/null | grep -v __cxa_atexit | grep exit; then \
+		echo 'libpq must not be linked against any library calling exit'; 
exit 1; \

+   fi

Could you please confirm that the change from -A to -a in nm arguments 
in this patch is intentional?


--
Anton Voloshin
Postgres Professional, The Russian Postgres Company
https://postgrespro.ru




Re: missing warning in pg_import_system_collations

2021-09-10 Thread Anton Voloshin

On 10/09/2021 01:37, Tom Lane wrote:

Another approach we could take is to deem the comment incorrect and
just remove it, codifying the current behavior of silently ignoring
unrecognized encodings.  The reason that seems like it might be
appropriate is that the logic immediately below this bit silently
ignores encodings that are known but are frontend-only:

 if (!PG_VALID_BE_ENCODING(enc))
 continue;/* ignore locales for client-only encodings */

It's sure not very clear to me why one case deserves a message and the
other not.  Perhaps they both do, which would lead to adding another
DEBUG1 message here.


I'm not an expert in locales, but I think it makes some sense to be 
silent about encodings we have consciously decided to ignore as we have 
them in our tables, but marked them as frontend-only 
(!PG_VALID_BE_ENCODING(enc)).
Just like it makes sense to do give a debug-level warning about 
encodings seen in locale -a output but not recognized by us at all 
(pg_get_encoding_from_locale(localebuf, false) < 0).


Therefore I think your patch with duplicated error message is better 
than what we have currently. I don't see how adding debug-level messages 
about skipping frontend-only encodings would be of any significant use here.


Unless someone more experienced in locales' subtleties would like to 
chime in.


--
Anton Voloshin
Postgres Professional, The Russian Postgres Company
https://postgrespro.ru




Re: missing warning in pg_import_system_collations

2021-09-09 Thread Anton Voloshin

On 09/09/2021 21:51, Tom Lane wrote:

What you propose to do here would promote this case from
ought-to-be-DEBUG1 to WARNING, which seems to me to be way too much in the
user's face.  Or, if there actually is a case for complaining, then all
those messages ought to be WARNING not DEBUG1. ...

Assuming we don't want to change pg_get_encoding_from_locale()'s API,
the simplest fix is to duplicate its error message, so more or less

  if (enc < 0)
  {
-/* error message printed by pg_get_encoding_from_locale() */
+elog(DEBUG1, "could not determine encoding for locale \"%s\"",
+ localebuf)));
  continue;
  }


Upon thinking a little more, I agree.
The warnings I happen to get from initdb on my current machine (with 
many various locales installed, more than on a typical box) are:


performing post-bootstrap initialization ... 2021-09-09 22:04:01.678 +07 
[482312] WARNING:  could not determine encoding for locale 
"hy_AM.armscii8": codeset is "ARMSCII-8"
2021-09-09 22:04:01.679 +07 [482312] WARNING:  could not determine 
encoding for locale "ka_GE": codeset is "GEORGIAN-PS"
2021-09-09 22:04:01.679 +07 [482312] WARNING:  could not determine 
encoding for locale "kk_KZ": codeset is "PT154"
2021-09-09 22:04:01.679 +07 [482312] WARNING:  could not determine 
encoding for locale "kk_KZ.rk1048": codeset is "RK1048"
2021-09-09 22:04:01.686 +07 [482312] WARNING:  could not determine 
encoding for locale "tg_TJ": codeset is "KOI8-T"
2021-09-09 22:04:01.686 +07 [482312] WARNING:  could not determine 
encoding for locale "th_TH": codeset is "TIS-620"

ok

While they are definitely interesting as DEBUG1, not so as a WARNING.

So, +1 from me for your proposed elog(DEBUG1, ...); patch. Thank you.

--
Anton Voloshin
Postgres Professional, The Russian Postgres Company
https://postgrespro.ru




missing warning in pg_import_system_collations

2021-09-09 Thread Anton Voloshin

Hello hackers,

In pg_import_system_collations() there is this fragment of code:

enc = pg_get_encoding_from_locale(localebuf, false);
if (enc < 0)
{
/* error message printed by pg_get_encoding_from_locale() */
continue;
}

However, false passed to pg_get_encoding_from_locale() means 
write_message argument is false, so no error message is ever printed.

I propose an obvious patch (see attachment).

Introduced in aa17c06fb in January 2017 when debug was replaced by 
false, so I guess back-patching through 10 would be appropriate.


--
Anton Voloshin
Postgres Professional, The Russian Postgres Company
https://postgrespro.ru
diff --git a/src/backend/commands/collationcmds.c b/src/backend/commands/collationcmds.c
index 4075f991a03..b601ca4d68c 100644
--- a/src/backend/commands/collationcmds.c
+++ b/src/backend/commands/collationcmds.c
@@ -594,7 +594,7 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
 continue;
 			}
 
-			enc = pg_get_encoding_from_locale(localebuf, false);
+			enc = pg_get_encoding_from_locale(localebuf, /*write_message=*/true);
 			if (enc < 0)
 			{
 /* error message printed by pg_get_encoding_from_locale() */


[patch] remove strver's leftover from error message in Solution.pm

2021-06-25 Thread Anton Voloshin

Hello,

in src/tools/msvc/Solution.pm (in the current master) there is a 
leftover from the past:

> confess "Bad format of version: $self->{strver}\n";

strver has been gone since 8f4fb4c6 in 2019, so I suggest an obvious 
one-line fix in the patch attached:


diff --git a/src/tools/msvc/Solution.pm b/src/tools/msvc/Solution.pm
index a7b8f720b55..fcb43b0ca05 100644
--- a/src/tools/msvc/Solution.pm
+++ b/src/tools/msvc/Solution.pm
@@ -176,7 +176,7 @@ sub GenerateFiles

if ($package_version !~ /^(\d+)(?:\.(\d+))?/)
{
-   confess "Bad format of version: 
$self->{strver}\n";
+   confess "Bad format of version: 
$package_version\n";
}
$majorver = sprintf("%d", $1);
$minorver = sprintf("%d", $2 ? $2 : 0);

I think this should be backported to REL_13_STABLE, but not to 
REL_12_STABLE and earlier, where strver was still present.


--
Anton Voloshin,
Postgres Professional, The Russian Postgres Company
https://postgrespro.ru
diff --git a/src/tools/msvc/Solution.pm b/src/tools/msvc/Solution.pm
index a7b8f720b55..fcb43b0ca05 100644
--- a/src/tools/msvc/Solution.pm
+++ b/src/tools/msvc/Solution.pm
@@ -176,7 +176,7 @@ sub GenerateFiles
 
 			if ($package_version !~ /^(\d+)(?:\.(\d+))?/)
 			{
-confess "Bad format of version: $self->{strver}\n";
+confess "Bad format of version: $package_version\n";
 			}
 			$majorver = sprintf("%d", $1);
 			$minorver = sprintf("%d", $2 ? $2 : 0);


back-port one-line gcc-10+ warning fix to REL_10_STABLE

2021-06-07 Thread Anton Voloshin

Hello, hackers,

Currently, REL_10_STABLE can't be compiled with gcc-10 or 11, -Werror 
and "./configure" without arguments. E.g. gcc-11 gives an error:


objectaddress.c:1618:99: error: ‘typeoids’ may be used uninitialized 
[-Werror=maybe-uninitialized]
 1618 | 
  ObjectIdGetDatum(typeoids[1]),

...
objectaddress.c: In function ‘get_object_address’:
objectaddress.c:1578:33: note: ‘typeoids’ declared here
 1578 | Oid typeoids[2];
  | ^~~~

gcc-10 gives a similar error.

I propose to back-port a small part of Tom Lane's commit 9a725f7b5cb7, 
which was somehow never back-ported to REL_10_STABLE. The fix is

explicit initialization to InvalidOid for the typeoids[2] variable involved.

Even if, technically, the initialization is probably not required (or
so I've heard), in PostgreSQL 11+ it was deemed that explicit 
initialization is acceptable here to avoid compiler warning.


Please note that above-mentioned commit 9a725f7b5cb7 adds initialization 
for a variable from the previous line, typenames[2] as well, but since 
gcc 10 and 11 don't warn on that, I guess there is no need to add that 
initialization as well.


The proposed one-line patch is attached, but basically it is:
diff --git a/src/backend/catalog/objectaddress.c 
b/src/backend/catalog/objectaddress.c

index b0ff255a593..8cc9dc003c8 100644
--- a/src/backend/catalog/objectaddress.c
+++ b/src/backend/catalog/objectaddress.c
@@ -1591,6 +1591,7 @@ get_object_address_opf_member(ObjectType objtype,
famaddr = get_object_address_opcf(OBJECT_OPFAMILY, copy, false);

/* find out left/right type names and OIDs */
+   typeoids[0] = typeoids[1] = InvalidOid;
i = 0;
foreach(cell, lsecond(object))
{

I've verified that all other current branches, i.e. 
REL9_6_STABLE..REL_13_STABLE (excluding REL_10_STABLE) and master can 
compile cleanly even with bare ./configure without arguments using gcc-11.


--
Anton Voloshin
Postgres Professional, The Russian Postgres Company
https://postgrespro.ru
diff --git a/src/backend/catalog/objectaddress.c b/src/backend/catalog/objectaddress.c
index b0ff255a593..8cc9dc003c8 100644
--- a/src/backend/catalog/objectaddress.c
+++ b/src/backend/catalog/objectaddress.c
@@ -1591,6 +1591,7 @@ get_object_address_opf_member(ObjectType objtype,
 	famaddr = get_object_address_opcf(OBJECT_OPFAMILY, copy, false);
 
 	/* find out left/right type names and OIDs */
+	typeoids[0] = typeoids[1] = InvalidOid;
 	i = 0;
 	foreach(cell, lsecond(object))
 	{


Re: possible repalloc() in icu_convert_case()

2021-04-04 Thread Anton Voloshin

On 04.04.2021 19:20, Tom Lane wrote:

repalloc is likely to be more expensive, since it implies copying
data which isn't helpful here.  I think this code is fine as-is.


Oh, you are right, thanks. I did not think properly about copying in 
repalloc.


--
Anton Voloshin
Postgres Professional: https://www.postgrespro.com
Russian Postgres Company




possible repalloc() in icu_convert_case()

2021-04-04 Thread Anton Voloshin

Hello,

in src/backend/utils/adt/formatting.c, in icu_convert_case() I see:
if (status == U_BUFFER_OVERFLOW_ERROR)
{
/* try again with adjusted length */
pfree(*buff_dest);
*buff_dest = palloc(len_dest * sizeof(**buff_dest));
...

Is there any reason why this should not be repalloc()?

In case it should be, I've attached a corresponding patch.

--
Anton Voloshin
Postgres Professional: https://www.postgrespro.com
Russian Postgres Company
diff --git a/src/backend/utils/adt/formatting.c 
b/src/backend/utils/adt/formatting.c
index 783c7b5e7a..409067e4a0 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -1588,8 +1588,7 @@ icu_convert_case(ICU_Convert_Func func, pg_locale_t 
mylocale,
if (status == U_BUFFER_OVERFLOW_ERROR)
{
/* try again with adjusted length */
-   pfree(*buff_dest);
-   *buff_dest = palloc(len_dest * sizeof(**buff_dest));
+   *buff_dest = repalloc(*buff_dest, len_dest * sizeof(**buff_dest));
status = U_ZERO_ERROR;
len_dest = func(*buff_dest, len_dest, buff_source, len_source,
mylocale->info.icu.locale, );


[PATCH] typo fix in collationcmds.c: "if they are distinct"

2021-04-04 Thread Anton Voloshin

Hello,

just a quick patch for a single-letter typo in a comment
in src/backend/commands/collationcmds.c
...
 * set of language+region combinations, whereas the latter only returns
-* language+region combinations of they are distinct from the language's
+* language+region combinations if they are distinct from the language's
 * base collation.  So there might not be a de-DE or en-GB, which 
would be

...
(please see the attached patch).

--
Anton Voloshin
Postgres Professional: https://www.postgrespro.com
Russian Postgres Company
diff --git a/src/backend/commands/collationcmds.c 
b/src/backend/commands/collationcmds.c
index 55a0e24a35..b8ec6f5756 100644
--- a/src/backend/commands/collationcmds.c
+++ b/src/backend/commands/collationcmds.c
@@ -577,7 +577,7 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
 * We use uloc_countAvailable()/uloc_getAvailable() rather than
 * ucol_countAvailable()/ucol_getAvailable().  The former returns a full
 * set of language+region combinations, whereas the latter only returns
-* language+region combinations of they are distinct from the language's
+* language+region combinations if they are distinct from the language's
 * base collation.  So there might not be a de-DE or en-GB, which would be
 * confusing.
 */