Re: cpluspluscheck/headerscheck require build in REL_16_STABLE
On 2024-04-27 09:14, John Naylor wrote:
> On Wed, Apr 17, 2024 at 7:21 AM John Naylor wrote:
>> On Mon, Apr 15, 2024 at 9:20 PM Marina Polyakova wrote:
>>> Everything seems to work with this patch, thank you!
>>
>> Glad to hear it -- I'll push next week when I get back from vacation,
>> unless there are objections.
>
> Pushed, thanks for the report!

Thank you again!

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: cpluspluscheck/headerscheck require build in REL_16_STABLE
On 2024-04-13 08:40, John Naylor wrote:
> On Fri, Apr 12, 2024 at 11:51 PM Marina Polyakova wrote:
>> ... In the other branches everything is fine: these problems begin with
>> commits [2] (jsonpath_gram.h) and [3] (gram.h), and in the master branch
>> there are no such problems after commit [4]. The attached diff.patch
>> fixes this issue for me. (IIUC internal headers generated by Bison are
>> usually excluded from such checks, so I also excluded gramparse.h and
>> jsonpath_internal.h...)
>
> I'm not in favor of this patch, because these files build fine on master,
> so there is no good reason to exclude them. We should arrange so that
> they build fine on PG16 as well. The problem is, not all the required
> headers are generated when invoking `make headerscheck`. The attached
> patch brings in some Makefile rules from master to make this work. Does
> this fix it for you?

Everything seems to work with this patch, thank you!

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
cpluspluscheck/headerscheck require build in REL_16_STABLE
Hello, hackers!

When running cpluspluscheck/headerscheck on REL_16_STABLE [1] I found that
everything was OK only if it was preceded by a build and make
maintainer-clean was not used:

$ ./configure --without-icu --with-perl --with-python > configure.txt && make -j16 > make.txt && make -j16 clean > clean.txt && make -j16 cpluspluscheck > cpluspluscheck.txt

$ ./configure --without-icu --with-perl --with-python > configure_1.txt && make -j16 > make.txt && make -j16 distclean > clean.txt && ./configure --without-icu --with-perl --with-python > configure_2.txt && make -j16 cpluspluscheck > cpluspluscheck.txt

Otherwise cpluspluscheck/headerscheck will fail:

$ ./configure --without-icu --with-perl --with-python > configure_1.txt && make -j16 > make.txt && make -j16 maintainer-clean > clean.txt && ./configure --without-icu --with-perl --with-python > configure_2.txt && make -j16 cpluspluscheck > cpluspluscheck.txt

$ ./configure --without-icu --with-perl --with-python > configure.txt && make -j16 cpluspluscheck > cpluspluscheck.txt

In file included from /tmp/cpluspluscheck.Zy4645/test.cpp:3:
/home/marina/postgrespro/postgrespro/src/backend/parser/gramparse.h:29:10: fatal error: gram.h: No such file or directory
   29 | #include "gram.h"
      |          ^~~~
compilation terminated.
In file included from /tmp/cpluspluscheck.Zy4645/test.cpp:3:
/home/marina/postgrespro/postgrespro/src/backend/utils/adt/jsonpath_internal.h:26:10: fatal error: jsonpath_gram.h: No such file or directory
   26 | #include "jsonpath_gram.h"
      |          ^
compilation terminated.
make: *** [GNUmakefile:141: cpluspluscheck] Error 1

In the other branches everything is fine: these problems begin with commits
[2] (jsonpath_gram.h) and [3] (gram.h), and in the master branch there are
no such problems after commit [4].

The attached diff.patch fixes this issue for me. (IIUC internal headers
generated by Bison are usually excluded from such checks, so I also
excluded gramparse.h and jsonpath_internal.h...)
[1] https://github.com/postgres/postgres/commit/e177da5c87a10abac97c028bfb427bafb7353aa2
[2] https://github.com/postgres/postgres/commit/dac048f71ebbcf2f980d280711f8ff8001331c5d
[3] https://github.com/postgres/postgres/commit/ecaf7c5df54f7fa9df2fdc7225d2bb4e283f0081
[4] https://github.com/postgres/postgres/commit/721856ff24b3722ce8e894e5a32c9c063cd48455

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/tools/pginclude/cpluspluscheck b/src/tools/pginclude/cpluspluscheck
index 4e09c4686b30b973fd4ca2443d04b5228e904f4b..e77979c97ebd09dff18cb98090254bd3724de0ff 100755
--- a/src/tools/pginclude/cpluspluscheck
+++ b/src/tools/pginclude/cpluspluscheck
@@ -121,14 +121,15 @@ do
 	# We can't make these Bison output files compilable standalone
 	# without using "%code require", which old Bison versions lack.
-	# parser/gram.h will be included by parser/gramparse.h anyway.
 	test "$f" = contrib/cube/cubeparse.h && continue
 	test "$f" = contrib/seg/segparse.h && continue
 	test "$f" = src/backend/bootstrap/bootparse.h && continue
 	test "$f" = src/backend/parser/gram.h && continue
+	test "$f" = src/backend/parser/gramparse.h && continue
 	test "$f" = src/backend/replication/repl_gram.h && continue
 	test "$f" = src/backend/replication/syncrep_gram.h && continue
 	test "$f" = src/backend/utils/adt/jsonpath_gram.h && continue
+	test "$f" = src/backend/utils/adt/jsonpath_internal.h && continue
 	test "$f" = src/bin/pgbench/exprparse.h && continue
 	test "$f" = src/pl/plpgsql/src/pl_gram.h && continue
 	test "$f" = src/interfaces/ecpg/preproc/preproc.h && continue
diff --git a/src/tools/pginclude/headerscheck b/src/tools/pginclude/headerscheck
index 8dee1b56709d06a9ef1f5e80b956a079fa8a1e6a..30e0e47684e571e8c86459654a0ad37326d798e0 100755
--- a/src/tools/pginclude/headerscheck
+++ b/src/tools/pginclude/headerscheck
@@ -116,14 +116,15 @@ do
 	# We can't make these Bison output files compilable standalone
 	# without using "%code require", which old Bison versions lack.
-	# parser/gram.h will be included by parser/gramparse.h anyway.
 	test "$f" = contrib/cube/cubeparse.h && continue
 	test "$f" = contrib/seg/segparse.h && continue
 	test "$f" = src/backend/bootstrap/bootparse.h && continue
 	test "$f" = src/backend/parser/gram.h && continue
+	test "$f" = src/backend/parser/gramparse.h && continue
 	test "$f" = src/backend/replication/repl_gram.h && continue
 	test "$f" = src/backend/replication/syncrep_gram.h && continue
 	test "$f" = src/backend/utils/adt/jsonpath_gram.h && continue
+	test "$f" = src/backend/utils/adt/jsonpath_internal.h && continue
 	test "$f" = src/bin/pgbench/exprparse.h && continue
 	test "$f" = src/pl/plpgsql/src/pl_gram.h && continue
 	test "$f" = src/interfaces/ecpg/preproc/preproc.h && continue
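The failure mode the Makefile-rule fix addresses can be sketched with a toy checker. All file names below are stand-ins, and a simple "does every locally-included header exist?" test replaces the real compiler invocation used by cpluspluscheck:

```shell
# Minimal sketch of the dependency problem: a header that includes a
# Bison-generated header can only be checked after the generator has run.
# File names and the check itself are invented for illustration.
set -eu
tmp=$(mktemp -d)
cd "$tmp"

# gramparse.h includes a header that only exists after Bison has run
printf '#include "gram.h"\n' > gramparse.h

check_header() {
    # extract the "..." includes and require each one to be present
    sed -n 's/#include "\(.*\)"/\1/p' "$1" | while read -r inc; do
        [ -f "$inc" ] || exit 1
    done
}

if check_header gramparse.h; then
    echo "gramparse.h: ok"
else
    echo "gramparse.h: fails before gram.h is generated"
fi

: > gram.h   # stand-in for the Bison step that a prior full build would run
check_header gramparse.h && echo "gramparse.h: ok after generation"
```

This is why the original report only saw failures after `make maintainer-clean`, which removes the generated headers, and why the committed fix generates the required headers as part of the check target rather than excluding the files.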
Re: Conflict between regression tests namespace & transactions due to recent changes
On 2023-05-19 17:59, Tom Lane wrote:
> Marina Polyakova writes:
>> On 2023-05-19 09:03, Michael Paquier wrote:
>>> FYI, the buildfarm is seeing some spurious failures as well:
>>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=schnauzer=2023-05-1904%3A29%3A42
>>
>> Yes, it is the same error. Here's another one in version 13:
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=schnauzer=2023-05-18%2022:37:49
>
> Right. I went ahead and pushed the fix in hopes of stabilizing things.
> (I went with "trans_abc" as the new table name, for consistency with
> some other nearby names.)
>
> regards, tom lane

Thank you! I missed the same changes in version 11 :(

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: Conflict between regression tests namespace & transactions due to recent changes
On 2023-05-19 09:03, Michael Paquier wrote:
> On Wed, May 17, 2023 at 02:39:10PM +0900, Michael Paquier wrote:
>> On Tue, May 16, 2023 at 11:02:45AM +0300, Marina Polyakova wrote:
>>> It confuses me a little that different methods are used for the same
>>> purpose. But the namespace test checks schemas. So see
>>> diff_abc_to_txn_table.patch which replaces abc with txn_table in the
>>> transaction test.
>>
>> Looks OK seen from here. Thanks!
>
> FYI, the buildfarm is seeing some spurious failures as well:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=schnauzer=2023-05-19 04%3A29%3A42
> --
> Michael

Yes, it is the same error. Here's another one in version 13:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=schnauzer=2023-05-18%2022:37:49

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: Conflict between regression tests namespace & transactions due to recent changes
On 2023-05-16 02:19, Michael Paquier wrote:
> On Mon, May 15, 2023 at 11:23:18PM +0300, Marina Polyakova wrote:
>> Maybe use a separate schema for all new objects in the transaction
>> test?.. See diff_set_tx_schema.patch.
>
> Sure, you could do that to bypass the failure (without the "public"
> actually?), leaving non-generic names around. Still I'd agree with Tom
> here and just rename the objects to something more in line with the
> context of the test to make things a bit more greppable. These could be
> renamed as transaction_tab or transaction_view, for example.
> --
> Michael

It confuses me a little that different methods are used for the same
purpose. But the namespace test checks schemas. So see
diff_abc_to_txn_table.patch which replaces abc with txn_table in the
transaction test.

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/test/regress/expected/transactions.out b/src/test/regress/expected/transactions.out
index 2b2cff7d9120a139532d1a6a2f2bc79e463bc4fa..21efb900ef1bc30338f1102f6a40f1d25be0db9c 100644
--- a/src/test/regress/expected/transactions.out
+++ b/src/test/regress/expected/transactions.out
@@ -609,10 +609,10 @@ drop function inverse(int);
 -- performed in the aborted subtransaction
 begin;
 savepoint x;
-create table abc (a int);
-insert into abc values (5);
-insert into abc values (10);
-declare foo cursor for select * from abc;
+create table txn_table (a int);
+insert into txn_table values (5);
+insert into txn_table values (10);
+declare foo cursor for select * from txn_table;
 fetch from foo;
  a
---
@@ -625,11 +625,11 @@ fetch from foo;
 ERROR: cursor "foo" does not exist
 commit;
 begin;
-create table abc (a int);
-insert into abc values (5);
-insert into abc values (10);
-insert into abc values (15);
-declare foo cursor for select * from abc;
+create table txn_table (a int);
+insert into txn_table values (5);
+insert into txn_table values (10);
+insert into txn_table values (15);
+declare foo cursor for select * from txn_table;
 fetch from foo;
  a
---
@@ -698,7 +698,7 @@ COMMIT;
 DROP FUNCTION create_temp_tab();
 DROP FUNCTION invert(x float8);
 -- Tests for AND CHAIN
-CREATE TABLE abc (a int);
+CREATE TABLE txn_table (a int);
 -- set nondefault value so we have something to override below
 SET default_transaction_read_only = on;
 START TRANSACTION ISOLATION LEVEL REPEATABLE READ, READ WRITE, DEFERRABLE;
@@ -720,8 +720,8 @@ SHOW transaction_deferrable;
 on
 (1 row)
 
-INSERT INTO abc VALUES (1);
-INSERT INTO abc VALUES (2);
+INSERT INTO txn_table VALUES (1);
+INSERT INTO txn_table VALUES (2);
 COMMIT AND CHAIN; -- TBLOCK_END
 SHOW transaction_isolation;
 transaction_isolation
@@ -741,11 +741,11 @@ SHOW transaction_deferrable;
 on
 (1 row)
 
-INSERT INTO abc VALUES ('error');
+INSERT INTO txn_table VALUES ('error');
 ERROR: invalid input syntax for type integer: "error"
-LINE 1: INSERT INTO abc VALUES ('error');
-                               ^
-INSERT INTO abc VALUES (3); -- check it's really aborted
+LINE 1: INSERT INTO txn_table VALUES ('error');
+                                     ^
+INSERT INTO txn_table VALUES (3); -- check it's really aborted
 ERROR: current transaction is aborted, commands ignored until end of transaction block
 COMMIT AND CHAIN; -- TBLOCK_ABORT_END
 SHOW transaction_isolation;
@@ -766,7 +766,7 @@ SHOW transaction_deferrable;
 on
 (1 row)
 
-INSERT INTO abc VALUES (4);
+INSERT INTO txn_table VALUES (4);
 COMMIT;
 START TRANSACTION ISOLATION LEVEL REPEATABLE READ, READ WRITE, DEFERRABLE;
 SHOW transaction_isolation;
@@ -788,10 +788,10 @@ SHOW transaction_deferrable;
 (1 row)
 
 SAVEPOINT x;
-INSERT INTO abc VALUES ('error');
+INSERT INTO txn_table VALUES ('error');
 ERROR: invalid input syntax for type integer: "error"
-LINE 1: INSERT INTO abc VALUES ('error');
-                               ^
+LINE 1: INSERT INTO txn_table VALUES ('error');
+                                     ^
 COMMIT AND CHAIN; -- TBLOCK_ABORT_PENDING
 SHOW transaction_isolation;
 transaction_isolation
@@ -811,7 +811,7 @@ SHOW transaction_deferrable;
 on
 (1 row)
 
-INSERT INTO abc VALUES (5);
+INSERT INTO txn_table VALUES (5);
 COMMIT;
 START TRANSACTION ISOLATION LEVEL REPEATABLE READ, READ WRITE, DEFERRABLE;
 SHOW transaction_isolation;
@@ -873,7 +873,7 @@ SHOW transaction_deferrable;
 off
 (1 row)
 
-INSERT INTO abc VALUES (6);
+INSERT INTO txn_table VALUES (6);
 ROLLBACK AND CHAIN; -- TBLOCK_ABORT_PENDING
 SHOW transaction_isolation;
 transaction_isolation
@@ -893,10 +893,10 @@ SHOW transaction_deferrable;
 off
 (1 row)
 
-INSERT INTO abc VALUES ('error');
+INSERT INTO txn_table VALUES ('error');
 ERROR: invalid input syntax for type integer: "error"
-LINE 1: INSERT INTO abc VALUES ('error');
-                               ^
+LINE 1: INSERT INTO txn_table VALUES ('error');
+                                     ^
 ROLLBACK AND CHAIN
Re: Conflict between regression tests namespace & transactions due to recent changes
On 2023-05-15 19:16, Tom Lane wrote:
> Marina Polyakova writes:
>> IIUC the conflict was caused by
>>
>> +SET search_path to public, test_ns_schema_1;
>> +CREATE SCHEMA test_ns_schema_2
>> +       CREATE VIEW abc_view AS SELECT a FROM abc;
>>
>> because the parallel regression test transactions had already created
>> the table abc and was trying to drop it.
>
> Hmm. I'd actually pin the blame on transactions.sql here. Creating a
> table named as generically as "abc" is horribly bad practice in a set
> of concurrent tests. namespace.sql is arguably okay, since it's
> creating that table name in a private schema. I'd be inclined to fix
> this by doing s/abc/something-else/g in transactions.sql.
>
> regards, tom lane

Maybe use a separate schema for all new objects in the transaction
test?.. See diff_set_tx_schema.patch.

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/test/regress/expected/transactions.out b/src/test/regress/expected/transactions.out
index 2b2cff7d9120a139532d1a6a2f2bc79e463bc4fa..6a3758bafb54327fbe341dfd89457e53ffce47dd 100644
--- a/src/test/regress/expected/transactions.out
+++ b/src/test/regress/expected/transactions.out
@@ -1,6 +1,10 @@
 --
 -- TRANSACTIONS
 --
+-- Use your schema to avoid conflicts with objects with the same names in
+-- parallel tests.
+CREATE SCHEMA test_tx_schema;
+SET search_path TO test_tx_schema,public;
 BEGIN;
 CREATE TABLE xacttest (a smallint, b real);
 INSERT INTO xacttest VALUES
diff --git a/src/test/regress/sql/transactions.sql b/src/test/regress/sql/transactions.sql
index 7ee5f6aaa55a686f5484664c3fae224c08bde42a..0e6d6453edaa776425524d43abf5ae3cfca3ac45 100644
--- a/src/test/regress/sql/transactions.sql
+++ b/src/test/regress/sql/transactions.sql
@@ -2,6 +2,11 @@
 -- TRANSACTIONS
 --
 
+-- Use your schema to avoid conflicts with objects with the same names in
+-- parallel tests.
+CREATE SCHEMA test_tx_schema;
+SET search_path TO test_tx_schema,public;
+
 BEGIN;
 CREATE TABLE xacttest (a smallint, b real);
Conflict between regression tests namespace & transactions due to recent changes
Hello, hackers!

When running tests for version 15, we found a conflict between the
regression tests namespace & transactions due to recent changes [1].

diff -w -U3 .../src/test/regress/expected/transactions.out .../src/bin/pg_upgrade/tmp_check/results/transactions.out
--- .../src/test/regress/expected/transactions.out ...
+++ .../src/bin/pg_upgrade/tmp_check/results/transactions.out ...
@@ -899,6 +899,9 @@
 RESET default_transaction_read_only;
 DROP TABLE abc;
+ERROR: cannot drop table abc because other objects depend on it
+DETAIL: view test_ns_schema_2.abc_view depends on table abc
+HINT: Use DROP ... CASCADE to drop the dependent objects too.
 -- Test assorted behaviors around the implicit transaction block created
 -- when multiple SQL commands are sent in a single Query message. These
 -- tests rely on the fact that psql will not break SQL commands apart at a
...

IIUC the conflict was caused by

+SET search_path to public, test_ns_schema_1;
+CREATE SCHEMA test_ns_schema_2
+       CREATE VIEW abc_view AS SELECT a FROM abc;

because the parallel regression test transactions had already created the
table abc and was trying to drop it. ISTM the patch diff.patch fixes this
problem...
[1] https://github.com/postgres/postgres/commit/dbd5795e7539ec9e15c0d4ed2d05b1b18d2a3b09

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/test/regress/expected/namespace.out b/src/test/regress/expected/namespace.out
index a62fd8ded015e6ee56784e4341e5ac1a4ed5621f..bcb6a75cf230fdc30f728201c2a2baef1838ed18 100644
--- a/src/test/regress/expected/namespace.out
+++ b/src/test/regress/expected/namespace.out
@@ -35,10 +35,15 @@ SHOW search_path;
 -- verify that the correct search_path preserved
 -- after creating the schema and on commit
+-- create your table to not use the same table from transactions test
 BEGIN;
 SET search_path to public, test_ns_schema_1;
 CREATE SCHEMA test_ns_schema_2
-       CREATE VIEW abc_view AS SELECT a FROM abc;
+       CREATE VIEW abc_view AS SELECT a FROM abc
+       CREATE TABLE abc (
+               a serial,
+               b int UNIQUE
+       );
 SHOW search_path;
  search_path
--
@@ -53,7 +58,9 @@ SHOW search_path;
 (1 row)
 
 DROP SCHEMA test_ns_schema_2 CASCADE;
-NOTICE: drop cascades to view test_ns_schema_2.abc_view
+NOTICE: drop cascades to 2 other objects
+DETAIL: drop cascades to table test_ns_schema_2.abc
+drop cascades to view test_ns_schema_2.abc_view
 -- verify that the objects were created
 SELECT COUNT(*) FROM pg_class WHERE relnamespace =
 (SELECT oid FROM pg_namespace WHERE nspname = 'test_ns_schema_1');
diff --git a/src/test/regress/sql/namespace.sql b/src/test/regress/sql/namespace.sql
index 3474f5ecf4215068fd89ec52f9ee8b95ddff36bb..95b9ed7b00807222e94e30221348f58039c63257 100644
--- a/src/test/regress/sql/namespace.sql
+++ b/src/test/regress/sql/namespace.sql
@@ -28,10 +28,16 @@ SHOW search_path;
 -- verify that the correct search_path preserved
 -- after creating the schema and on commit
+-- create your table to not use the same table from transactions test
 BEGIN;
 SET search_path to public, test_ns_schema_1;
 CREATE SCHEMA test_ns_schema_2
-       CREATE VIEW abc_view AS SELECT a FROM abc;
+       CREATE VIEW abc_view AS SELECT a FROM abc
+       CREATE TABLE abc (
+               a serial,
+               b int UNIQUE
+       );
 SHOW search_path;
 COMMIT;
 SHOW search_path;
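The collision at the heart of this thread is easy to reproduce outside the database. As a loose analogy (directories standing in for tables in a shared namespace; nothing here touches pg_regress, and all names are invented), two concurrent tests that pick the same generic object name interfere, while test-prefixed names cannot:

```shell
# Toy analogy: mkdir/rmdir stand in for CREATE/DROP of an object in a
# namespace shared by parallel tests.
set -u
tmp=$(mktemp -d)
cd "$tmp"

create() { mkdir "$1" 2>/dev/null; }   # fails if the name is already taken
drop()   { rmdir "$1"; }

create abc && echo "test A: created abc"
create abc || echo "test B: cannot create abc, name already in use"
drop abc

# Names scoped per test (in the spirit of the pushed trans_abc rename)
# cannot collide:
create transactions_abc
create namespace_abc
echo "prefixed names: no conflict"
```

This is the rationale behind both proposed fixes: either give each test its own schema (a private namespace), or, as was committed, give the objects test-specific, greppable names.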
Re: Fix order of checking ICU options in initdb and create database
On 2022-11-12 22:43, Jose Arthur Benetasso Villanova wrote:
> Hello Marina. I just reviewed your patch. It applied cleanly at my
> current master (commit d6a3dbe14f98d867b2fc3faeb99d2d3c2a48ca67). Also,
> it worked as described in email. Since it's a clarification in an error
> message, I think the documentation is fine. I played a bit with "make
> check", creating a database in my native language (pt_BR), testing with
> some data and everything worked as expected.

Hello! Thank you!

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: Fix order of checking ICU options in initdb and create database
On 2022-10-29 14:33, Marina Polyakova wrote:
> Hello! This is the last proposed patch on this subject [1] moved to a
> separate thread for Commitfest.

Also added a patch to export with_icu when running src/bin/scripts tests
[1]. The problem can be reproduced as

$ meson setup && ninja && meson test --print-errorlogs --suite setup --suite scripts

[1] https://cirrus-ci.com/task/4825664661487616

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

From a6de5922fc533c88c7288051d8797d021ae91dae Mon Sep 17 00:00:00 2001
From: Marina Polyakova
Date: Sat, 29 Oct 2022 14:25:05 +0300
Subject: [PATCH v1 2/2] Fix order of checking ICU options in initdb and
 create database

First check if ICU is supported in this build. Then check that the
selected encoding is supported BY ICU (the encoding can be set from
lc_ctype which can be set from an environment variable). Only then check
the option for the ICU locale.
---
 src/backend/commands/dbcommands.c |  6 ++
 src/backend/utils/adt/pg_locale.c | 10 ++---
 src/bin/initdb/initdb.c           | 36 ++-
 src/bin/initdb/t/001_initdb.pl    | 21 ++
 src/bin/scripts/t/020_createdb.pl | 31 ++
 src/include/utils/pg_locale.h     |  2 +-
 6 files changed, 69 insertions(+), 37 deletions(-)

diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index 16e422138b..4c1e79fca3 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -1030,6 +1030,11 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
 	if (dblocprovider == COLLPROVIDER_ICU)
 	{
+#ifndef USE_ICU
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("ICU is not supported in this build")));
+#else
 		if (!(is_encoding_supported_by_icu(encoding)))
 			ereport(ERROR,
 					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1046,6 +1051,7 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
 				 errmsg("ICU locale must be specified")));
 
 		check_icu_locale(dbiculocale);
+#endif
 	}
 	else
 	{
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 2b42d9ccd8..743f11e1d1 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -1945,15 +1945,12 @@ icu_set_collation_attributes(UCollator *collator, const char *loc)
 	}
 }
 
-#endif							/* USE_ICU */
-
 /*
  * Check if the given locale ID is valid, and ereport(ERROR) if it isn't.
  */
 void
 check_icu_locale(const char *icu_locale)
 {
-#ifdef USE_ICU
 	UCollator  *collator;
 	UErrorCode	status;
 
@@ -1967,13 +1964,10 @@ check_icu_locale(const char *icu_locale)
 	if (U_ICU_VERSION_MAJOR_NUM < 54)
 		icu_set_collation_attributes(collator, icu_locale);
 	ucol_close(collator);
-#else
-	ereport(ERROR,
-			(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-			 errmsg("ICU is not supported in this build")));
-#endif
 }
 
+#endif							/* USE_ICU */
+
 /*
  * These functions convert from/to libc's wchar_t, *not* pg_wchar_t.
  * Therefore we keep them here rather than with the mbutils code.
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index f61a043055..82e6644f89 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -2047,9 +2047,12 @@ check_locale_encoding(const char *locale, int user_enc)
  *
  * this should match the similar check in the backend createdb() function
  */
-static bool
+static void
 check_icu_locale_encoding(int user_enc)
 {
+#ifndef USE_ICU
+	pg_fatal("ICU is not supported in this build");
+#else
 	if (!(is_encoding_supported_by_icu(user_enc)))
 	{
 		pg_log_error("encoding mismatch");
@@ -2058,9 +2061,17 @@ check_icu_locale_encoding(int user_enc)
 		pg_log_error_hint("Rerun %s and either do not specify an encoding explicitly, "
 						  "or choose a matching combination.",
 						  progname);
-		return false;
+		exit(1);
 	}
-	return true;
+
+	if (!icu_locale)
+		pg_fatal("ICU locale must be specified");
+
+	/*
+	 * In supported builds, the ICU locale ID will be checked by the backend
+	 * during post-bootstrap initialization.
+	 */
+#endif
 }
 
 /*
@@ -2113,20 +2124,6 @@ setlocales(void)
 	check_locale_name(LC_CTYPE, lc_messages, );
 	lc_messages = canonname;
 #endif
-
-	if (locale_provider == COLLPROVIDER_ICU)
-	{
-		if (!icu_locale)
-			pg_fatal("ICU locale must be specified");
-
-		/*
-		 * In supported builds, the ICU locale ID will be checked by the
-		 * backend during post-bootstrap initialization.
-		 */
-#ifndef USE_ICU
-		pg_fatal("ICU is not supported in this build");
-#endif
-	}
 }
 
 /*
@@ -2388,9 +2385,8 @@ setup_locale_encoding(void)
 		!check_locale_encoding(lc_collate, encodingid))
 		exit(1);				/* check_locale_encoding printed the error */
 
-	if (locale_provider == COLLPROVIDER_ICU &&
-		!check_icu_locale_enc
Fix order of checking ICU options in initdb and create database
Hello! This is the last proposed patch on this subject [1] moved to a
separate thread for Commitfest.

It looks like the current order of checking ICU options in initdb and
create database in PG 15+ is not user-friendly. Examples:

1. initdb reports a missing ICU locale although it may already report that
the selected encoding cannot be used:

$ initdb --encoding sql-ascii --locale-provider icu hoge
...
initdb: error: ICU locale must be specified

$ initdb --encoding sql-ascii --locale-provider icu --icu-locale en-US hoge
...
initdb: error: encoding mismatch
initdb: detail: The encoding you selected (SQL_ASCII) is not supported with the ICU provider.
initdb: hint: Rerun initdb and either do not specify an encoding explicitly, or choose a matching combination.

2. initdb/create database report problems with the ICU locale/encoding
although they may already report that ICU is not supported in this build:

2.1.

$ initdb --locale-provider icu hoge
...
initdb: error: ICU locale must be specified

$ initdb --locale-provider icu --icu-locale en-US hoge
...
initdb: error: ICU is not supported in this build

$ createdb --locale-provider icu hoge
createdb: error: database creation failed: ERROR: ICU locale must be specified

$ createdb --locale-provider icu --icu-locale en-US hoge
createdb: error: database creation failed: ERROR: ICU is not supported in this build

2.2.

$ createdb --locale-provider icu --icu-locale en-US --encoding sql-ascii hoge
createdb: error: database creation failed: ERROR: encoding "SQL_ASCII" is not supported with ICU provider

$ createdb --locale-provider icu --icu-locale en-US --encoding utf8 hoge
createdb: error: database creation failed: ERROR: ICU is not supported in this build

The patch v1-0001-Fix-order-of-checking-ICU-options-in-initdb-and-c.patch
fixes this:

1.

$ initdb --encoding sql-ascii --locale-provider icu hoge
...
initdb: error: encoding mismatch
initdb: detail: The encoding you selected (SQL_ASCII) is not supported with the ICU provider.
initdb: hint: Rerun initdb and either do not specify an encoding explicitly, or choose a matching combination.

2.1.

$ initdb --locale-provider icu hoge
...
initdb: error: ICU is not supported in this build

$ createdb --locale-provider icu hoge
createdb: error: database creation failed: ERROR: ICU is not supported in this build

2.2.

$ createdb --locale-provider icu --icu-locale en-US --encoding sql-ascii hoge
createdb: error: database creation failed: ERROR: ICU is not supported in this build

A side effect of the proposed patch in initdb is that if the ICU locale is
missing (or ICU is not supported in this build), the provider, locales and
encoding are reported before the error message:

$ initdb --locale-provider icu hoge
The files belonging to this database system will be owned by user "marina".
This user must also own the server process.

The database cluster will be initialized with this locale configuration:
  provider:    icu
  LC_COLLATE:  en_US.UTF-8
  LC_CTYPE:    en_US.UTF-8
  LC_MESSAGES: en_US.UTF-8
  LC_MONETARY: ru_RU.UTF-8
  LC_NUMERIC:  ru_RU.UTF-8
  LC_TIME:     ru_RU.UTF-8
The default database encoding has been set to "UTF8".
initdb: error: ICU locale must be specified

$ initdb --locale-provider icu hoge
The files belonging to this database system will be owned by user "marina".
This user must also own the server process.

The database cluster will be initialized with this locale configuration:
  provider:    icu
  LC_COLLATE:  en_US.UTF-8
  LC_CTYPE:    en_US.UTF-8
  LC_MESSAGES: en_US.UTF-8
  LC_MONETARY: ru_RU.UTF-8
  LC_NUMERIC:  ru_RU.UTF-8
  LC_TIME:     ru_RU.UTF-8
The default database encoding has been set to "UTF8".
initdb: error: ICU is not supported in this build

I was thinking about another master-only version of the patch that first
checks everything for provider, locales and encoding, but IMO it is worse
[2]...
[1] https://www.postgresql.org/message-id/e94aca035bf0b92fac42d204ad385552%40postgrespro.ru
[2] https://www.postgresql.org/message-id/79f410460c4fc9534000785adb8bf39a%40postgrespro.ru

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

From da761095a7e39463ec6083484c4b1b924cacf387 Mon Sep 17 00:00:00 2001
From: Marina Polyakova
Date: Sat, 29 Oct 2022 14:25:05 +0300
Subject: [PATCH v1] Fix order of checking ICU options in initdb and create
 database

First check if ICU is supported in this build. Then check that the
selected encoding is supported BY ICU (the encoding can be set from
lc_ctype which can be set from an environment variable). Only then check
the option for the ICU locale.
---
 src/backend/commands/dbcommands.c |  6 ++
 src/backend/utils/adt/pg_locale.c | 10 ++---
 src/bin/initdb/initdb.c           | 36 ++-
 src/bin/initdb/t/001_initdb.pl    | 21 ++
 src/bin/scripts/t/020_create
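The ordering the patch establishes can be sketched as a small shell function. The function, its arguments, and the simplified messages are invented stand-ins for the real initdb/createdb checks (the real encoding test is is_encoding_supported_by_icu, not a literal SQL_ASCII comparison):

```shell
# Hypothetical sketch of the reordered checks: build support first, then
# the encoding, and only then the --icu-locale option.
set -u
check_icu_options() {
    # $1 = build has ICU (yes/no), $2 = encoding, $3 = ICU locale (may be empty)
    [ "$1" = yes ] || { echo "error: ICU is not supported in this build"; return 1; }
    case "$2" in
        SQL_ASCII) echo "error: encoding \"$2\" is not supported with ICU provider"; return 1 ;;
    esac
    [ -n "$3" ] || { echo "error: ICU locale must be specified"; return 1; }
    echo "ok"
}

# Build without ICU: the build error wins, even though the locale is also missing
check_icu_options no UTF8 ""
# Build with ICU but unsupported encoding: the encoding error comes before the locale one
check_icu_options yes SQL_ASCII ""
# Only when the build and encoding checks pass is the missing locale reported
check_icu_options yes UTF8 ""
# All three satisfied
check_icu_options yes UTF8 en-US
```

The point of the reorder is that each reported error is the first one the user can actually do nothing about by fixing a later option, which matches the examples shown above.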
Fix database creation during installchecks for ICU cluster
Hello! This is a copy of [1] moved to a separate thread for Commitfest.

I discovered an interesting behaviour during installcheck runs on PG 15+
when the cluster was initialized with the ICU locale provider:

$ initdb --locale-provider icu --icu-locale en-US -D data && pg_ctl -D data -l logfile start

1) The ECPG tests fail because they use the SQL_ASCII encoding [2], the
database template0 uses the ICU locale provider, and SQL_ASCII is not
supported by ICU:

$ make -C src/interfaces/ecpg/ installcheck
...
== creating database "ecpg1_regression" ==
ERROR: encoding "SQL_ASCII" is not supported with ICU provider
ERROR: database "ecpg1_regression" does not exist
command failed: "/home/marina/postgresql/master/my/inst/bin/psql" -X -c "CREATE DATABASE \"ecpg1_regression\" TEMPLATE=template0 ENCODING='SQL_ASCII'" -c "ALTER DATABASE \"ecpg1_regression\" SET lc_messages TO 'C';ALTER DATABASE \"ecpg1_regression\" SET lc_monetary TO 'C';ALTER DATABASE \"ecpg1_regression\" SET lc_numeric TO 'C';ALTER DATABASE \"ecpg1_regression\" SET lc_time TO 'C';ALTER DATABASE \"ecpg1_regression\" SET bytea_output TO 'hex';ALTER DATABASE \"ecpg1_regression\" SET timezone_abbreviations TO 'Default';" "postgres"

2) The option --no-locale in pg_regress is described as "use C locale" [3].
But in this case the created databases actually use the ICU locale provider
with the ICU cluster locale from template0 (see
diff_check_backend_used_provider.txt):

$ make NO_LOCALE=1 installcheck

In regression.diffs:

diff -U3 /home/marina/postgresql/master/src/test/regress/expected/test_setup.out /home/marina/postgresql/master/src/test/regress/results/test_setup.out
--- /home/marina/postgresql/master/src/test/regress/expected/test_setup.out	2022-09-27 05:31:27.674628815 +0300
+++ /home/marina/postgresql/master/src/test/regress/results/test_setup.out	2022-10-21 15:09:31.232992885 +0300
@@ -143,6 +143,798 @@
 \set filename :abs_srcdir '/data/person.data'
 COPY person FROM :'filename';
 VACUUM ANALYZE person;
+NOTICE: varstrfastcmp_locale sss->collate_c 0 sss->locale 0xefacd0
+NOTICE: varstrfastcmp_locale sss->locale->provider i
+NOTICE: varstrfastcmp_locale sss->locale->info.icu.locale en-US
...

The patch v1-0001-Fix-database-creation-during-installchecks-for-IC.patch
fixes both issues for me.
[1] https://www.postgresql.org/message-id/727b5d5160f845dcf5e0818e625a6e56%40postgrespro.ru
[2] https://github.com/postgres/postgres/blob/ce20f8b9f4354b46b40fd6ebf7ce5c37d08747e0/src/interfaces/ecpg/test/Makefile#L18
[3] https://github.com/postgres/postgres/blob/ce20f8b9f4354b46b40fd6ebf7ce5c37d08747e0/src/test/regress/pg_regress.c#L1992

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/access/hash/hashfunc.c b/src/backend/access/hash/hashfunc.c
index b57ed946c42bb54ede800e95045aa937a8dbad85..b3c0f6f753f8428274389844ccf9778a7ed47ea4 100644
--- a/src/backend/access/hash/hashfunc.c
+++ b/src/backend/access/hash/hashfunc.c
@@ -281,6 +281,14 @@ hashtext(PG_FUNCTION_ARGS)
 	if (!lc_collate_is_c(collid))
 		mylocale = pg_newlocale_from_collation(collid);
 
+	elog(NOTICE, "hashtext lc_collate_is_c(collid) %d mylocale %p", lc_collate_is_c(collid), mylocale);
+	if (mylocale)
+	{
+		elog(NOTICE, "hashtext mylocale->provider %c", mylocale->provider);
+		if (mylocale->provider == COLLPROVIDER_ICU)
+			elog(NOTICE, "hashtext mylocale->info.icu.locale %s", mylocale->info.icu.locale ? mylocale->info.icu.locale : "(null)");
+	}
+
 	if (!mylocale || mylocale->deterministic)
 	{
 		result = hash_any((unsigned char *) VARDATA_ANY(key),
@@ -337,6 +345,14 @@ hashtextextended(PG_FUNCTION_ARGS)
 	if (!lc_collate_is_c(collid))
 		mylocale = pg_newlocale_from_collation(collid);
 
+	elog(NOTICE, "hashtextextended lc_collate_is_c(collid) %d mylocale %p", lc_collate_is_c(collid), mylocale);
+	if (mylocale)
+	{
+		elog(NOTICE, "hashtextextended mylocale->provider %c", mylocale->provider);
+		if (mylocale->provider == COLLPROVIDER_ICU)
+			elog(NOTICE, "hashtextextended mylocale->info.icu.locale %s", mylocale->info.icu.locale ? mylocale->info.icu.locale : "(null)");
+	}
+
 	if (!mylocale || mylocale->deterministic)
 	{
 		result = hash_any_extended((unsigned char *) VARDATA_ANY(key),
diff --git a/src/backend/regex/regc_pg_locale.c b/src/backend/regex/regc_pg_locale.c
index 02d462a659778016f3c4479d425ba0a84feb6e26..9627c84a7ccfb4c4013556a51c989e9e6d611634 100644
--- a/src/backend/regex/regc_pg_locale.c
+++ b/src/backend/regex/regc_pg_locale.c
@@ -243,6 +243,8 @@ pg_set_regex_collation(Oid collation)
 			 errhint("Use the COLLATE clause to
Re: ICU for global collation
Hello! I discovered an interesting behaviour during installcheck runs when the cluster was initialized with ICU locale provider: $ initdb --locale-provider icu --icu-locale en-US -D data && pg_ctl -D data -l logfile start 1) The ECPG tests fail because they use the SQL_ASCII encoding [1], the database template0 uses the ICU locale provider and SQL_ASCII is not supported by ICU: $ make -C src/interfaces/ecpg/ installcheck ... == creating database "ecpg1_regression" == ERROR: encoding "SQL_ASCII" is not supported with ICU provider ERROR: database "ecpg1_regression" does not exist command failed: "/home/marina/postgresql/master/my/inst/bin/psql" -X -c "CREATE DATABASE \"ecpg1_regression\" TEMPLATE=template0 ENCODING='SQL_ASCII'" -c "ALTER DATABASE \"ecpg1_regression\" SET lc_messages TO 'C';ALTER DATABASE \"ecpg1_regression\" SET lc_monetary TO 'C';ALTER DATABASE \"ecpg1_regression\" SET lc_numeric TO 'C';ALTER DATABASE \"ecpg1_regression\" SET lc_time TO 'C';ALTER DATABASE \"ecpg1_regression\" SET bytea_output TO 'hex';ALTER DATABASE \"ecpg1_regression\" SET timezone_abbreviations TO 'Default';" "postgres" 2) The option --no-locale in pg_regress is described as "use C locale" [2]. 
But in this case the created databases actually use the ICU locale provider with the ICU cluster locale from template0 (see diff_check_backend_used_provider.patch): $ make NO_LOCALE=1 installcheck In regression.diffs: diff -U3 /home/marina/postgresql/master/src/test/regress/expected/test_setup.out /home/marina/postgresql/master/src/test/regress/results/test_setup.out --- /home/marina/postgresql/master/src/test/regress/expected/test_setup.out 2022-09-27 05:31:27.674628815 +0300 +++ /home/marina/postgresql/master/src/test/regress/results/test_setup.out 2022-10-21 15:09:31.232992885 +0300 @@ -143,6 +143,798 @@ \set filename :abs_srcdir '/data/person.data' COPY person FROM :'filename'; VACUUM ANALYZE person; +NOTICE: varstrfastcmp_locale sss->collate_c 0 sss->locale 0xefacd0 +NOTICE: varstrfastcmp_locale sss->locale->provider i +NOTICE: varstrfastcmp_locale sss->locale->info.icu.locale en-US ... The patch diff_fix_pg_regress_create_database.patch fixes both issues for me. [1] https://github.com/postgres/postgres/blob/ce20f8b9f4354b46b40fd6ebf7ce5c37d08747e0/src/interfaces/ecpg/test/Makefile#L18 [2] https://github.com/postgres/postgres/blob/ce20f8b9f4354b46b40fd6ebf7ce5c37d08747e0/src/test/regress/pg_regress.c#L1992 -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Companydiff --git a/src/backend/access/hash/hashfunc.c b/src/backend/access/hash/hashfunc.c index b57ed946c42bb54ede800e95045aa937a8dbad85..b3c0f6f753f8428274389844ccf9778a7ed47ea4 100644 --- a/src/backend/access/hash/hashfunc.c +++ b/src/backend/access/hash/hashfunc.c @@ -281,6 +281,14 @@ hashtext(PG_FUNCTION_ARGS) if (!lc_collate_is_c(collid)) mylocale = pg_newlocale_from_collation(collid); + elog(NOTICE, "hashtext lc_collate_is_c(collid) %d mylocale %p", lc_collate_is_c(collid), mylocale); + if (mylocale) + { + elog(NOTICE, "hashtext mylocale->provider %c", mylocale->provider); + if (mylocale->provider == COLLPROVIDER_ICU) + elog(NOTICE, "hashtext 
mylocale->info.icu.locale %s", mylocale->info.icu.locale ? mylocale->info.icu.locale : "(null)"); + } + if (!mylocale || mylocale->deterministic) { result = hash_any((unsigned char *) VARDATA_ANY(key), @@ -337,6 +345,14 @@ hashtextextended(PG_FUNCTION_ARGS) if (!lc_collate_is_c(collid)) mylocale = pg_newlocale_from_collation(collid); + elog(NOTICE, "hashtextextended lc_collate_is_c(collid) %d mylocale %p", lc_collate_is_c(collid), mylocale); + if (mylocale) + { + elog(NOTICE, "hashtextextended mylocale->provider %c", mylocale->provider); + if (mylocale->provider == COLLPROVIDER_ICU) + elog(NOTICE, "hashtextextended mylocale->info.icu.locale %s", mylocale->info.icu.locale ? mylocale->info.icu.locale : "(null)"); + } + if (!mylocale || mylocale->deterministic) { result = hash_any_extended((unsigned char *) VARDATA_ANY(key), diff --git a/src/backend/regex/regc_pg_locale.c b/src/backend/regex/regc_pg_locale.c index 02d462a659778016f3c4479d425ba0a84feb6e26..9627c84a7ccfb4c4013556a51c989e9e6d611634 100644 --- a/src/backend/regex/regc_pg_locale.c +++ b/src/backend/regex/regc_pg_locale.c @@ -243,6 +243,8 @@ pg_set_regex_collation(Oid collation) errhint("Use the COLLATE clause to set the collation explicitly."))); } + elog(NOTICE, "pg_set_regex_collation lc_ctype_is_c(collid) %d", lc_ctype_is_c(collation)); + if (lc_ctype_is_c(collation)) { /* C/
Re: ICU for global collation
On 2022-10-01 15:07, Peter Eisentraut wrote: On 22.09.22 20:06, Marina Polyakova wrote: On 2022-09-21 17:53, Peter Eisentraut wrote: Committed with that test, thanks. I think that covers all the ICU issues you reported for PG15 for now? I thought about the order of the ICU checks - if it is ok to check that the selected encoding is supported by ICU after printing all the locale & encoding information, why not move almost all the ICU checks here?.. It's possible that we can do better, but I'm not going to add things like that to PG 15 at this point unless it fixes a faulty behavior. Will PG 15 always have this order of ICU checks? Is the current behaviour correct enough? On the other hand, there may be a better fix for PG 16+ and not all changes can be backported... On 2022-09-16 10:56, Peter Eisentraut wrote: On 15.09.22 17:41, Marina Polyakova wrote: I agree with you. Here's another version of the patch. The locale/encoding checks and reports in initdb have been reordered, because now the encoding is set first and only then the ICU locale is checked. I committed something based on the first version of your patch. This reordering of the messages here was a little too much surgery for me at this point. For instance, there are also messages in #ifdef WIN32 code that would need to be reordered as well. I kept the overall structure of the code the same and just inserted the additional proposed checks. If you want to pursue the reordering of the checks and messages overall, a patch for the master branch could be considered. I've worked on this again (see attached patch) but I'm not sure if the encoding mismatch messages are clear enough without the full locale information. 
For

$ initdb -D data --icu-locale en --locale-provider icu

compare the outputs:

The database cluster will be initialized with this locale configuration:
  provider:    icu
  ICU locale:  en
  LC_COLLATE:  de_DE.iso885915@euro
  LC_CTYPE:    de_DE.iso885915@euro
  LC_MESSAGES: en_US.utf8
  LC_MONETARY: de_DE.iso885915@euro
  LC_NUMERIC:  de_DE.iso885915@euro
  LC_TIME:     de_DE.iso885915@euro
The default database encoding has been set to "UTF8".
initdb: error: encoding mismatch
initdb: detail: The encoding you selected (UTF8) and the encoding that the selected locale uses (LATIN9) do not match. This would lead to misbehavior in various character string processing functions.
initdb: hint: Rerun initdb and either do not specify an encoding explicitly, or choose a matching combination.

and

Encoding "UTF8" implied by locale will be set as the default database encoding.
initdb: error: encoding mismatch
initdb: detail: The encoding you selected (UTF8) and the encoding that the selected locale uses (LATIN9) do not match. This would lead to misbehavior in various character string processing functions.
initdb: hint: Rerun initdb and either do not specify an encoding explicitly, or choose a matching combination.

The same without ICU, e.g. for

$ initdb -D data

the output with locale information:

The database cluster will be initialized with this locale configuration:
  provider:    libc
  LC_COLLATE:  en_US.utf8
  LC_CTYPE:    de_DE.iso885915@euro
  LC_MESSAGES: en_US.utf8
  LC_MONETARY: de_DE.iso885915@euro
  LC_NUMERIC:  de_DE.iso885915@euro
  LC_TIME:     de_DE.iso885915@euro
The default database encoding has accordingly been set to "LATIN9".
initdb: error: encoding mismatch
initdb: detail: The encoding you selected (LATIN9) and the encoding that the selected locale uses (UTF8) do not match. This would lead to misbehavior in various character string processing functions.
initdb: hint: Rerun initdb and either do not specify an encoding explicitly, or choose a matching combination.
and the "shorter" version: Encoding "LATIN9" implied by locale will be set as the default database encoding. initdb: error: encoding mismatch initdb: detail: The encoding you selected (LATIN9) and the encoding that the selected locale uses (UTF8) do not match. This would lead to misbehavior in various character string processing functions. initdb: hint: Rerun initdb and either do not specify an encoding explicitly, or choose a matching combination. BTW, what did you mean that "there are also messages in #ifdef WIN32 code that would need to be reordered as well"?.. -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Companydiff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c index 96b46cbc020ef8b85b6d54d3d4ca8ad116277832..242d68f58287aeb6f95619c2ce8b78e38433cf18 100644 --- a/src/backend/commands/dbcommands.c +++ b/src/backend/commands/dbcommands.c @@ -1030,6 +1030,11 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt) if (dblocprovider == COLLPROVIDER_ICU) { +#ifndef USE_ICU + ereport(ERROR, +(errcode(ERRCODE_FEATURE_NOT_SUPPORTE
Re: ICU for global collation
On 2022-09-21 17:53, Peter Eisentraut wrote: Committed with that test, thanks. I think that covers all the ICU issues you reported for PG15 for now?

I thought about the order of the ICU checks - if it is ok to check that the selected encoding is supported by ICU after printing all the locale & encoding information, why not move almost all the ICU checks here?.. Examples of the work of the attached patch:

1. ICU locale vs supported encoding:

1.1.

$ initdb --encoding sql-ascii --locale-provider icu hoge
...
initdb: error: encoding mismatch
initdb: detail: The encoding you selected (SQL_ASCII) is not supported with the ICU provider.
initdb: hint: Rerun initdb and either do not specify an encoding explicitly, or choose a matching combination.

1.2. (like before)

$ initdb --encoding sql-ascii --icu-locale en-US hoge
initdb: error: --icu-locale cannot be specified unless locale provider "icu" is chosen

$ createdb --encoding sql-ascii --icu-locale en-US hoge
createdb: error: database creation failed: ERROR: ICU locale cannot be specified unless locale provider is ICU

2. For builds without ICU:

2.1.

$ initdb --locale-provider icu hoge
...
initdb: error: ICU is not supported in this build

$ createdb --locale-provider icu hoge
createdb: error: database creation failed: ERROR: ICU is not supported in this build

2.2. (like before)

$ initdb --icu-locale en-US hoge
initdb: error: --icu-locale cannot be specified unless locale provider "icu" is chosen

$ createdb --icu-locale en-US hoge
createdb: error: database creation failed: ERROR: ICU locale cannot be specified unless locale provider is ICU

2.3.

$ createdb --locale-provider icu --icu-locale en-US --encoding sql-ascii hoge
createdb: error: database creation failed: ERROR: ICU is not supported in this build

4. About errors in initdb:

4.1. If icu_locale is not specified, but it is required, then we get this:

$ initdb --locale-provider icu hoge
The files belonging to this database system will be owned by user "marina".
This user must also own the server process.

The database cluster will be initialized with this locale configuration:
  provider:    icu
  LC_COLLATE:  en_US.UTF-8
  LC_CTYPE:    en_US.UTF-8
  LC_MESSAGES: en_US.UTF-8
  LC_MONETARY: ru_RU.UTF-8
  LC_NUMERIC:  ru_RU.UTF-8
  LC_TIME:     ru_RU.UTF-8
The default database encoding has been set to "UTF8".
initdb: error: ICU locale must be specified

Almost the same if ICU is not supported in this build:

$ initdb --locale-provider icu hoge
The files belonging to this database system will be owned by user "marina".
This user must also own the server process.

The database cluster will be initialized with this locale configuration:
  provider:    icu
  LC_COLLATE:  en_US.UTF-8
  LC_CTYPE:    en_US.UTF-8
  LC_MESSAGES: en_US.UTF-8
  LC_MONETARY: ru_RU.UTF-8
  LC_NUMERIC:  ru_RU.UTF-8
  LC_TIME:     ru_RU.UTF-8
The default database encoding has been set to "UTF8".
initdb: error: ICU is not supported in this build

4.2. If icu_locale is specified for the wrong provider, the error is reported at program start, as before:

$ initdb --icu-locale en-US hoge
initdb: error: --icu-locale cannot be specified unless locale provider "icu" is chosen

-- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c index e0753c1badcc299e2bac45f3bdd2f23f59d70cbc..2589a2523e07c9543c99c7d7b446438d62382b89 100644 --- a/src/backend/commands/dbcommands.c +++ b/src/backend/commands/dbcommands.c @@ -1030,6 +1030,11 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt) if (dblocprovider == COLLPROVIDER_ICU) { +#ifndef USE_ICU + ereport(ERROR, +(errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("ICU is not supported in this build"))); +#else if (!(is_encoding_supported_by_icu(encoding))) ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), @@ -1046,6 +1051,7 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt) errmsg("ICU locale must be specified"))); 
check_icu_locale(dbiculocale); +#endif } else { diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c index 2b42d9ccd8b751bcb5cda3a1f4c7803a68bc0a4a..743f11e1d1fd62d500fc2846976472d13e08046b 100644 --- a/src/backend/utils/adt/pg_locale.c +++ b/src/backend/utils/adt/pg_locale.c @@ -1945,15 +1945,12 @@ icu_set_collation_attributes(UCollator *collator, const char *loc) } } -#endif /* USE_ICU */ - /* * Check if the given locale ID is valid, and ereport(ERROR) if it isn't. */ void check_icu_locale(const char *icu_locale) { -#ifdef USE_ICU UCollator *collator; UErrorCode status; @@ -1967,13 +1964,10 @@ check_icu_locale(const char *icu_locale) if (U_ICU_VERSION_MAJOR_NUM < 54) icu_set_collati
Re: ICU for global collation
On 2022-09-20 12:59, Peter Eisentraut wrote: On 17.09.22 10:33, Marina Polyakova wrote: 3. The locale provider is ICU, but it has not yet been set from the template database: $ initdb --locale-provider icu --icu-locale en-US -D data && pg_ctl -D data -l logfile start && createdb --icu-locale ru-RU --template template0 mydb ... createdb: error: database creation failed: ERROR: ICU locale cannot be specified unless locale provider is ICU Please see attached patch for a fix. Does that work for you? Yes, it works. The following test checks this fix: diff --git a/src/bin/scripts/t/020_createdb.pl b/src/bin/scripts/t/020_createdb.pl index b87d8fc63b5246b02bcd4499aae815269b60df7c..c2464a99618cd7ca5616cc21121e1e4379b52baf 100644 --- a/src/bin/scripts/t/020_createdb.pl +++ b/src/bin/scripts/t/020_createdb.pl @@ -71,6 +71,14 @@ if ($ENV{with_icu} eq 'yes') $node2->command_ok( [ 'createdb', '-T', 'template0', '--locale-provider=libc', 'foobar55' ], 'create database with libc provider from template database with icu provider'); + + $node2->command_ok( + [ + 'createdb', '-T', 'template0', '--icu-locale', + 'en-US', 'foobar56' + ], + 'create database with icu locale from template database with icu provider' + ); } else { -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: ICU for global collation
On 2022-09-16 10:56, Peter Eisentraut wrote: On 15.09.22 17:41, Marina Polyakova wrote: I agree with you. Here's another version of the patch. The locale/encoding checks and reports in initdb have been reordered, because now the encoding is set first and only then the ICU locale is checked. I committed something based on the first version of your patch. This reordering of the messages here was a little too much surgery for me at this point. For instance, there are also messages in #ifdef WIN32 code that would need to be reordered as well. I kept the overall structure of the code the same and just inserted the additional proposed checks. If you want to pursue the reordering of the checks and messages overall, a patch for the master branch could be considered. Thank you! I already wrote about the order of the ICU checks in initdb/create database, they were the only reason to propose such changes... -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: ICU for global collation
Thanks to Kyotaro Horiguchi's review, we found out that there are interesting cases due to the order of some ICU checks:

1. ICU locale vs supported encoding:

1.1. On 2022-09-15 09:52, Kyotaro Horiguchi wrote: If I executed initdb as follows, I would be told to specify --icu-locale option.

$ initdb --encoding sql-ascii --locale-provider icu hoge
...
initdb: error: ICU locale must be specified

However, when I reran the command, it complains about incompatible encoding this time. I think it's more user-friendly to check for the encoding compatibility before the check for missing --icu-locale option.

1.2. (ok?)

$ initdb --encoding sql-ascii --icu-locale en-US hoge
initdb: error: --icu-locale cannot be specified unless locale provider "icu" is chosen

$ initdb --encoding sql-ascii --icu-locale en-US --locale-provider icu hoge
...
initdb: error: encoding mismatch
initdb: detail: The encoding you selected (SQL_ASCII) is not supported with the ICU provider.
initdb: hint: Rerun initdb and either do not specify an encoding explicitly, or choose a matching combination.

$ createdb --encoding sql-ascii --icu-locale en-US hoge
createdb: error: database creation failed: ERROR: ICU locale cannot be specified unless locale provider is ICU

$ createdb --encoding sql-ascii --icu-locale en-US --locale-provider icu hoge
createdb: error: database creation failed: ERROR: encoding "SQL_ASCII" is not supported with ICU provider

2. For builds without ICU:

2.1.

$ initdb --locale-provider icu hoge
...
initdb: error: ICU locale must be specified

$ initdb --locale-provider icu --icu-locale en-US hoge
...
initdb: error: ICU is not supported in this build

$ createdb --locale-provider icu hoge
createdb: error: database creation failed: ERROR: ICU locale must be specified

$ createdb --locale-provider icu --icu-locale en-US hoge
createdb: error: database creation failed: ERROR: ICU is not supported in this build

IMO, it would be more user-friendly to report the unsupported build on the first run too..
2.2. (ok?)

$ initdb --icu-locale en-US hoge
initdb: error: --icu-locale cannot be specified unless locale provider "icu" is chosen

$ initdb --icu-locale en-US --locale-provider icu hoge
...
initdb: error: ICU is not supported in this build

$ createdb --icu-locale en-US hoge
createdb: error: database creation failed: ERROR: ICU locale cannot be specified unless locale provider is ICU

$ createdb --icu-locale en-US --locale-provider icu hoge
createdb: error: database creation failed: ERROR: ICU is not supported in this build

2.3.

$ createdb --locale-provider icu --icu-locale en-US --encoding sql-ascii hoge
createdb: error: database creation failed: ERROR: encoding "SQL_ASCII" is not supported with ICU provider

$ createdb --locale-provider icu --icu-locale en-US --encoding utf8 hoge
createdb: error: database creation failed: ERROR: ICU is not supported in this build

IMO, it would be more user-friendly to report the unsupported build on the first run too..

3. The locale provider is ICU, but it has not yet been set from the template database:

$ initdb --locale-provider icu --icu-locale en-US -D data && pg_ctl -D data -l logfile start && createdb --icu-locale ru-RU --template template0 mydb
...
createdb: error: database creation failed: ERROR: ICU locale cannot be specified unless locale provider is ICU

-- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
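The ordering questions in cases 1-3 reduce to which check fires first. Below is a minimal Python model (not PostgreSQL source; the function name and the encoding table are invented for illustration) of the ordering argued for in this thread: report an unsupported build first, then encoding compatibility, then the missing ICU locale.

```python
# Hypothetical model of the ICU option checks discussed above; the encoding
# set is illustrative and deliberately incomplete.
ICU_SUPPORTED_ENCODINGS = {"UTF8", "LATIN1", "LATIN9", "EUC_JP"}

def icu_checks(encoding, icu_locale, provider="icu", built_with_icu=True):
    """Return the first error message that would be reported, or None."""
    # --icu-locale with a non-ICU provider is rejected up front, as today.
    if icu_locale is not None and provider != "icu":
        return '--icu-locale cannot be specified unless locale provider "icu" is chosen'
    if provider != "icu":
        return None
    # Proposed ordering: unsupported build first...
    if not built_with_icu:
        return "ICU is not supported in this build"
    # ...then encoding compatibility...
    if encoding not in ICU_SUPPORTED_ENCODINGS:
        return f'encoding "{encoding}" is not supported with ICU provider'
    # ...and only then the missing locale.
    if icu_locale is None:
        return "ICU locale must be specified"
    return None
```

Under this ordering, `initdb --locale-provider icu hoge` on a build without ICU reports the missing ICU support on the very first run (case 2.1), and an unsupported encoding is reported before the missing --icu-locale complaint (case 1.1).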
Re: ICU for global collation
On 2022-09-16 11:11, Kyotaro Horiguchi wrote: At Fri, 16 Sep 2022 09:49:28 +0300, Marina Polyakova wrote in In continuation of options check: AFAICS the following checks in initdb if (locale_provider == COLLPROVIDER_ICU) { if (!icu_locale) pg_fatal("ICU locale must be specified"); /* * In supported builds, the ICU locale ID will be checked by the * backend during post-bootstrap initialization. */ #ifndef USE_ICU pg_fatal("ICU is not supported in this build"); #endif } are executed approximately when they are executed in create database after getting all the necessary data from the template database: initdb doesn't work that way, but anyway, I realized that I am proposing to move that code in setlocales() to the caller function as the result. I don't think setlocales() is the place for the code because icu locale has no business with what the function does. That being said there's no obvious reason we *need* to move the code out to its caller. Excuse me, but could you explain your last sentence in more detail? I read that this code is not for setlocales and then - that it should not be moved from here, so I'm confused... +errmsg("encoding \"%s\" is not supported with ICU provider", + pg_log_error("encoding \"%s\" is not supported with ICU provider", +pg_encoding_to_char(encodingid)); I might be wrong, but the messages look wrong to me. The alternatives below might work. "encoding \"%s\" is not supported by ICU" "encoding \"%s\" cannot be used for/with ICU locales" The message indicates that the selected encoding cannot be used with the ICU provider because it does not support it. But if the text of the message becomes better and clearer, I will only be glad. + pg_log_error_hint("Rerun %s and choose a matching combination.", + progname); This doesn't seem to provide users with useful information. 
It was committed in a more verbose form: pg_log_error_hint("Rerun %s and either do not specify an encoding explicitly, " "or choose a matching combination.", -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: ICU for global collation
On 2022-09-16 10:57, Peter Eisentraut wrote: On 16.09.22 09:31, Marina Polyakova wrote: IMO it is hardly understandable from the program output either - it looks like I manually chose the encoding UTF8. Maybe first inform about selected encoding?.. Yes, I included something like that in the patch just committed. diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c index 6aeec8d426c52414b827686781c245291f27ed1f..348bbbeba0f5bc7ff601912bf883510d580b814c 100644 --- a/src/bin/initdb/initdb.c +++ b/src/bin/initdb/initdb.c @@ -2310,7 +2310,11 @@ setup_locale_encoding(void) } if (!encoding && locale_provider == COLLPROVIDER_ICU) + { encodingid = PG_UTF8; + printf(_("The default database encoding has been set to \"%s\" for a better experience with the ICU provider.\n"), + pg_encoding_to_char(encodingid)); + } else if (!encoding) { int ctype_enc; Thank you! -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: ICU for global collation
On 2022-09-16 07:55, Kyotaro Horiguchi wrote: At Thu, 15 Sep 2022 18:41:31 +0300, Marina Polyakova wrote in P.S. While working on the patch, I discovered that UTF8 encoding is always used for the ICU provider in initdb unless it is explicitly specified by the user: if (!encoding && locale_provider == COLLPROVIDER_ICU) encodingid = PG_UTF8; IMO this creates additional errors for locales with other encodings: $ initdb --locale de_DE.iso885915@euro --locale-provider icu --icu-locale de-DE ... initdb: error: encoding mismatch initdb: detail: The encoding you selected (UTF8) and the encoding that the selected locale uses (LATIN9) do not match. This would lead to misbehavior in various character string processing functions. initdb: hint: Rerun initdb and either do not specify an encoding explicitly, or choose a matching combination. And ICU supports many encodings, see the contents of pg_enc2icu_tbl in encnames.c... It seems to me the best default that fits almost all cases using icu locales. So, we need to specify encoding explicitly in that case. $ initdb --encoding iso-8859-15 --locale de_DE.iso885915@euro --locale-provider icu --icu-locale de-DE However, I think it is hardly understandable from the documentation. (I checked this using euc-jp [1] so it might be wrong..) [1] initdb --encoding euc-jp --locale ja_JP.eucjp --locale-provider icu --icu-locale ja-x-icu regards. Thank you! IMO it is hardly understandable from the program output either - it looks like I manually chose the encoding UTF8. Maybe first inform about selected encoding?.. 
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c index 6aeec8d426c52414b827686781c245291f27ed1f..348bbbeba0f5bc7ff601912bf883510d580b814c 100644 --- a/src/bin/initdb/initdb.c +++ b/src/bin/initdb/initdb.c @@ -2310,7 +2310,11 @@ setup_locale_encoding(void) } if (!encoding && locale_provider == COLLPROVIDER_ICU) + { encodingid = PG_UTF8; + printf(_("The default database encoding has been set to \"%s\" for a better experience with the ICU provider.\n"), + pg_encoding_to_char(encodingid)); + } else if (!encoding) { int ctype_enc; ISTM that such choices (e.g. UTF8 for Windows in some cases) are described in the documentation [1] as By default, initdb uses the locale provider libc, takes the locale settings from the environment, and determines the encoding from the locale settings. This is almost always sufficient, unless there are special requirements. [1] https://www.postgresql.org/docs/devel/app-initdb.html -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
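The default-encoding behavior under discussion — the quoted `if (!encoding && locale_provider == COLLPROVIDER_ICU) encodingid = PG_UTF8;` — can be summarized with a short sketch. This is a hypothetical Python model, not initdb code, and the locale-to-encoding table is an invented stand-in:

```python
# Illustrative mapping only; real initdb derives this from the OS locale.
LOCALE_IMPLIED_ENCODING = {"de_DE.iso885915@euro": "LATIN9", "en_US.utf8": "UTF8"}

def choose_encoding(explicit_encoding, locale_provider, lc_ctype):
    """Model of the default selection: an explicit -E wins; otherwise the
    ICU provider defaults to UTF8; otherwise the encoding implied by
    LC_CTYPE is used."""
    if explicit_encoding:
        return explicit_encoding
    if locale_provider == "icu":
        return "UTF8"
    return LOCALE_IMPLIED_ENCODING[lc_ctype]
```

This is why `initdb --locale de_DE.iso885915@euro --locale-provider icu` ends up with UTF8 and then fails the later encoding-mismatch check against the LATIN9 that the libc locale implies.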
Re: ICU for global collation
On 2022-09-15 09:52, Kyotaro Horiguchi wrote: If I executed initdb as follows, I would be told to specify --icu-locale option. $ initdb --encoding sql-ascii --locale-provider icu hoge ... initdb: error: ICU locale must be specified However, when I reran the command, it complains about incompatible encoding this time. I think it's more user-friendly to check for the encoding compatibility before the check for missing --icu-locale option. regards. In continuation of options check: AFAICS the following checks in initdb if (locale_provider == COLLPROVIDER_ICU) { if (!icu_locale) pg_fatal("ICU locale must be specified"); /* * In supported builds, the ICU locale ID will be checked by the * backend during post-bootstrap initialization. */ #ifndef USE_ICU pg_fatal("ICU is not supported in this build"); #endif } are executed approximately when they are executed in create database after getting all the necessary data from the template database: if (dblocprovider == COLLPROVIDER_ICU) { /* * This would happen if template0 uses the libc provider but the new * database uses icu. */ if (!dbiculocale) ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("ICU locale must be specified"))); } if (dblocprovider == COLLPROVIDER_ICU) check_icu_locale(dbiculocale); But perhaps the check that --icu-locale cannot be specified unless locale provider icu is chosen should also be moved here? So all these checks will be in one place and it will use the provider from the template database (which could be icu): $ initdb --locale-provider icu --icu-locale en-US -D data && pg_ctl -D data -l logfile start && createdb --icu-locale ru-RU --template template0 mydb ... createdb: error: database creation failed: ERROR: ICU locale cannot be specified unless locale provider is ICU -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
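In other words, the suggestion is to validate --icu-locale against the provider that is effectively inherited from the template database, not only against the command-line option. A rough Python model of that idea (hypothetical names; this is not the dbcommands.c code):

```python
def createdb_icu_check(opt_provider, opt_icu_locale, template_provider):
    """Model: the effective locale provider defaults to the template
    database's provider, and the --icu-locale sanity check uses that
    effective value. Returns an error message or None."""
    provider = opt_provider or template_provider
    if opt_icu_locale and provider != "icu":
        return "ICU locale cannot be specified unless locale provider is ICU"
    return None
```

Under this model, `createdb --icu-locale ru-RU --template template0 mydb` succeeds when template0 itself uses the icu provider, instead of failing as in the transcript above.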
Re: ICU for global collation
On 2022-09-15 09:52, Kyotaro Horiguchi wrote: If I executed initdb as follows, I would be told to specify --icu-locale option. $ initdb --encoding sql-ascii --locale-provider icu hoge ... initdb: error: ICU locale must be specified However, when I reran the command, it complains about incompatible encoding this time. I think it's more user-friendly to check for the encoding compatibility before the check for missing --icu-locale option. regards. I agree with you. Here's another version of the patch. The locale/encoding checks and reports in initdb have been reordered, because now the encoding is set first and only then the ICU locale is checked. P.S. While working on the patch, I discovered that UTF8 encoding is always used for the ICU provider in initdb unless it is explicitly specified by the user: if (!encoding && locale_provider == COLLPROVIDER_ICU) encodingid = PG_UTF8; IMO this creates additional errors for locales with other encodings: $ initdb --locale de_DE.iso885915@euro --locale-provider icu --icu-locale de-DE ... initdb: error: encoding mismatch initdb: detail: The encoding you selected (UTF8) and the encoding that the selected locale uses (LATIN9) do not match. This would lead to misbehavior in various character string processing functions. initdb: hint: Rerun initdb and either do not specify an encoding explicitly, or choose a matching combination. And ICU supports many encodings, see the contents of pg_enc2icu_tbl in encnames.c... 
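The failure mode described here is the encoding-mismatch check comparing the selected (or ICU-defaulted) encoding against the encoding implied by the libc locale. A small illustrative sketch — the table and names are invented, and this is a model rather than initdb's actual check:

```python
# Invented sample of what a libc locale name implies; not exhaustive.
LIBC_LOCALE_ENCODING = {"de_DE.iso885915@euro": "LATIN9", "ja_JP.eucjp": "EUC_JP"}

def encoding_mismatch(selected_encoding, lc_ctype):
    """Return an error detail string if the selected encoding conflicts with
    the encoding the locale uses, else None."""
    implied = LIBC_LOCALE_ENCODING.get(lc_ctype)
    if implied and implied != selected_encoding:
        return (f"The encoding you selected ({selected_encoding}) and the encoding "
                f"that the selected locale uses ({implied}) do not match.")
    return None
```

With the ICU provider forcing UTF8 as the default, any libc locale implying a different encoding trips this check unless the encoding is given explicitly.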
-- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c index 6ff48bb18f3639ae45d9528b32df51a4aebc60c0..f248ad42b77c8c0cf2089963d4357b120914ce20 100644 --- a/src/backend/commands/dbcommands.c +++ b/src/backend/commands/dbcommands.c @@ -1034,6 +1034,12 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt) if (dblocprovider == COLLPROVIDER_ICU) { + if (!(is_encoding_supported_by_icu(encoding))) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("encoding \"%s\" is not supported with ICU provider", + pg_encoding_to_char(encoding)))); + /* * This would happen if template0 uses the libc provider but the new * database uses icu. @@ -1042,10 +1048,9 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt) ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("ICU locale must be specified"))); - } - if (dblocprovider == COLLPROVIDER_ICU) check_icu_locale(dbiculocale); + } /* * Check that the new encoding and locale settings match the source diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c index 6aeec8d426c52414b827686781c245291f27ed1f..999e7c2bf8dc707063eddf19a5c27eed03d3ca09 100644 --- a/src/bin/initdb/initdb.c +++ b/src/bin/initdb/initdb.c @@ -2092,20 +2092,30 @@ setlocales(void) check_locale_name(LC_CTYPE, lc_messages, &canonname); lc_messages = canonname; #endif +} - if (locale_provider == COLLPROVIDER_ICU) +static void +check_icu_locale_encoding(void) +{ + if (!(is_encoding_supported_by_icu(encodingid))) { - if (!icu_locale) - pg_fatal("ICU locale must be specified"); + pg_log_error("encoding \"%s\" is not supported with ICU provider", + pg_encoding_to_char(encodingid)); + pg_log_error_hint("Rerun %s and choose a matching combination.", + progname); + exit(1); + } - /* - * In supported builds, the ICU locale ID will be checked by the - * backend during post-bootstrap initialization. 
- */ + if (!icu_locale) + pg_fatal("ICU locale must be specified"); + + /* + * In supported builds, the ICU locale ID will be checked by the backend + * during post-bootstrap initialization. + */ #ifndef USE_ICU - pg_fatal("ICU is not supported in this build"); + pg_fatal("ICU is not supported in this build"); #endif - } } /* @@ -2281,34 +2291,6 @@ setup_locale_encoding(void) { setlocales(); - if (locale_provider == COLLPROVIDER_LIBC && - strcmp(lc_ctype, lc_collate) == 0 && - strcmp(lc_ctype, lc_time) == 0 && - strcmp(lc_ctype, lc_numeric) == 0 && - strcmp(lc_ctype, lc_monetary) == 0 && - strcmp(lc_ctype, lc_messages) == 0 && - (!icu_locale || strcmp(lc_ctype, icu_locale) == 0)) - printf(_("The database cluster will be initialized with locale \"%s\".\n"), lc_ctype); - else - { - printf(_("The database cluster will be initialized with this locale configuration:\n")); - printf(_(" provider:%s\n"), collprovider_name(locale_provider)); - if (icu_locale) - printf(_(" ICU locale: %s\n"), icu_locale); - printf(_(" LC_COLLATE: %s\n" - " LC_CTYPE:%s\n" - " LC_MESSAGES: %s\n" - " LC_MONETARY: %s\n"
Re: ICU for global collation
Hello! I was surprised that it is allowed to create clusters/databases where the default ICU collations do not actually work due to unsupported encodings:

$ initdb --encoding SQL_ASCII --locale-provider icu --icu-locale en-US -D data && pg_ctl -D data -l logfile start && psql -c "SELECT 'a' < 'b'" template1
...
waiting for server to start done
server started
ERROR: encoding "SQL_ASCII" not supported by ICU

$ createdb --encoding SQL_ASCII --locale-provider icu --icu-locale en-US --template template0 mydb && psql -c "SELECT 'a' < 'b'" mydb
ERROR: encoding "SQL_ASCII" not supported by ICU

The patch diff_check_icu_encoding.patch prohibits the creation of such objects...

-- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index 6ff48bb18f3639ae45d9528b32df51a4aebc60c0..07758d15e8613d5a049537ddf2c5992e57ad6424 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -1042,11 +1042,16 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
 			ereport(ERROR,
 					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 					 errmsg("ICU locale must be specified")));
-	}
 
-	if (dblocprovider == COLLPROVIDER_ICU)
 		check_icu_locale(dbiculocale);
+
+		if (!(is_encoding_supported_by_icu(encoding)))
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+					 errmsg("encoding \"%s\" is not supported with ICU provider",
+							pg_encoding_to_char(encoding))));
+	}
+
 	/*
 	 * Check that the new encoding and locale settings match the source
 	 * database.  We insist on this because we simply copy the source data ---
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index e00837ecacf4885cf2a176168c283f3e67c6eb53..8a762ced8340c9d8256f7832a4c19a43f1d5538a 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -2362,6 +2362,16 @@ setup_locale_encoding(void)
 	if (!check_locale_encoding(lc_ctype, encodingid) ||
 		!check_locale_encoding(lc_collate, encodingid))
 		exit(1);				/* check_locale_encoding printed the error */
+
+	if (locale_provider == COLLPROVIDER_ICU &&
+		!(is_encoding_supported_by_icu(encodingid)))
+	{
+		pg_log_error("encoding \"%s\" is not supported with ICU provider",
+					 pg_encoding_to_char(encodingid));
+		pg_log_error_hint("Rerun %s and choose a matching combination.",
+						  progname);
+		exit(1);
+	}
 }
diff --git a/src/bin/initdb/t/001_initdb.pl b/src/bin/initdb/t/001_initdb.pl
index a37f6dd9b334b6ee22d9fdd4d51422795cb54a39..e4bb3d0cdd9c23729c5fb97886374f8df558f239 100644
--- a/src/bin/initdb/t/001_initdb.pl
+++ b/src/bin/initdb/t/001_initdb.pl
@@ -118,6 +118,15 @@ if ($ENV{with_icu} eq 'yes')
 		],
 		qr/FATAL: could not open collator for locale/,
 		'fails for invalid ICU locale');
+
+	command_fails_like(
+		[
+			'initdb', '--no-sync',
+			'--locale-provider=icu', '--icu-locale=en',
+			'--encoding=SQL_ASCII', "$tempdir/dataX"
+		],
+		qr/error: encoding "SQL_ASCII" is not supported with ICU provider/,
+		'encoding "SQL_ASCII" is not supported with ICU provider');
 }
 else
 {
diff --git a/src/bin/scripts/t/020_createdb.pl b/src/bin/scripts/t/020_createdb.pl
index e91c1d013d08d8bd1e3a92f2aba958c5c7713ca6..eaab3caa32669ead068719d98bb953c5c6ff5a17 100644
--- a/src/bin/scripts/t/020_createdb.pl
+++ b/src/bin/scripts/t/020_createdb.pl
@@ -50,6 +50,16 @@ if ($ENV{with_icu} eq 'yes')
 		],
 		'fails for invalid ICU locale');
 
+	$node->command_fails_like(
+		[
+			'createdb', '-T',
+			'template0', '--locale-provider=icu',
+			'--icu-locale=en', '--encoding=SQL_ASCII',
+			'foobarX'
+		],
+		qr/ERROR: encoding "SQL_ASCII" is not supported with ICU provider/,
+		'encoding "SQL_ASCII" is not supported with ICU provider');
+
 	# additional node, which uses the icu provider
 	my $node2 = PostgreSQL::Test::Cluster->new('icu');
 	$node2->init(extra => ['--locale-provider=icu', '--icu-locale=en']);
Re: ICU for global collation
On 2022-09-13 15:41, Peter Eisentraut wrote: On 13.09.22 07:34, Marina Polyakova wrote: I agree with you that it is more comfortable and more similar to what has already been done in initdb. IMO it would be easier to do it like this: diff --git a/src/bin/scripts/createdb.c b/src/bin/scripts/createdb.c index e523e58b2189275dc603a06324a2f28b0f49d8b7..a1482df3d981a680dd3322052e7c03ddacc8dc26 100644 --- a/src/bin/scripts/createdb.c +++ b/src/bin/scripts/createdb.c @@ -161,12 +161,10 @@ main(int argc, char *argv[]) if (locale) { - if (lc_ctype) - pg_fatal("only one of --locale and --lc-ctype can be specified"); - if (lc_collate) - pg_fatal("only one of --locale and --lc-collate can be specified"); - lc_ctype = locale; - lc_collate = locale; + if (!lc_ctype) + lc_ctype = locale; + if (!lc_collate) + lc_collate = locale; } if (encoding) done that way Thank you! BTW it's somewhat crummy that it uses a string comparison, so if you write "UTF8" without a dash, it says this; it took me a few minutes to see the difference... postgres=# create database a LC_COLLATE "en_US.UTF8" LC_CTYPE "en_US.UTF8" LOCALE "en_US.UTF8"; ERROR: new collation (en_US.UTF8) is incompatible with the collation of the template database (en_US.UTF-8) Perhaps we could check the locale itself with the function normalize_libc_locale_name (collationcmds.c). But ISTM that the current check is a safety net in case the function pg_get_encoding_from_locale (chklocale.c) returns -1 or PG_SQL_ASCII... This is not new behavior in PG15, is it? No, it has always existed [1] AFAICS.. [1] https://github.com/postgres/postgres/commit/61d967498802ab86d8897cb3c61740d7e9d712f6 -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: ICU for global collation
On 2022-09-09 19:46, Justin Pryzby wrote: In pg14: |postgres=# create database a LC_COLLATE C LC_CTYPE C LOCALE C; |ERROR: conflicting or redundant options |DETAIL: LOCALE cannot be specified together with LC_COLLATE or LC_CTYPE. In pg15: |postgres=# create database a LC_COLLATE "en_US.UTF-8" LC_CTYPE "en_US.UTF-8" LOCALE "en_US.UTF-8" ; |CREATE DATABASE f2553d430 actually relaxed the restriction by removing this check: - if (dlocale && (dcollate || dctype)) - ereport(ERROR, - (errcode(ERRCODE_SYNTAX_ERROR), -errmsg("conflicting or redundant options"), -errdetail("LOCALE cannot be specified together with LC_COLLATE or LC_CTYPE."))); But isn't the right fix to do the corresponding thing in createdb (relaxing the frontend restriction rather than reverting its relaxation in the backend). diff --git a/src/bin/scripts/createdb.c b/src/bin/scripts/createdb.c index e523e58b218..5b80e56dfd9 100644 --- a/src/bin/scripts/createdb.c +++ b/src/bin/scripts/createdb.c @@ -159,15 +159,10 @@ main(int argc, char *argv[]) exit(1); } - if (locale) - { - if (lc_ctype) - pg_fatal("only one of --locale and --lc-ctype can be specified"); - if (lc_collate) - pg_fatal("only one of --locale and --lc-collate can be specified"); + if (locale && !lc_ctype) lc_ctype = locale; + if (locale && !lc_collate) lc_collate = locale; - } if (encoding) { I agree with you that it is more comfortable and more similar to what has already been done in initdb. 
IMO it would be easier to do it like this: diff --git a/src/bin/scripts/createdb.c b/src/bin/scripts/createdb.c index e523e58b2189275dc603a06324a2f28b0f49d8b7..a1482df3d981a680dd3322052e7c03ddacc8dc26 100644 --- a/src/bin/scripts/createdb.c +++ b/src/bin/scripts/createdb.c @@ -161,12 +161,10 @@ main(int argc, char *argv[]) if (locale) { - if (lc_ctype) - pg_fatal("only one of --locale and --lc-ctype can be specified"); - if (lc_collate) - pg_fatal("only one of --locale and --lc-collate can be specified"); - lc_ctype = locale; - lc_collate = locale; + if (!lc_ctype) + lc_ctype = locale; + if (!lc_collate) + lc_collate = locale; } if (encoding) Should we change the behaviour of createdb and CREATE DATABASE in previous major versions?.. BTW it's somewhat crummy that it uses a string comparison, so if you write "UTF8" without a dash, it says this; it took me a few minutes to see the difference... postgres=# create database a LC_COLLATE "en_US.UTF8" LC_CTYPE "en_US.UTF8" LOCALE "en_US.UTF8"; ERROR: new collation (en_US.UTF8) is incompatible with the collation of the template database (en_US.UTF-8) Perhaps we could check the locale itself with the function normalize_libc_locale_name (collationcmds.c). But ISTM that the current check is a safety net in case the function pg_get_encoding_from_locale (chklocale.c) returns -1 or PG_SQL_ASCII... -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: ICU for global collation
Hello! IMO after adding ICU for global collations [1] the behaviour of createdb and CREATE DATABASE is a bit inconsistent when both the locale and lc_collate (or locale and lc_ctype) options are used:

$ createdb mydb --locale C --lc-collate C --template template0
createdb: error: only one of --locale and --lc-collate can be specified

$ psql -c "create database mydb locale = 'C' lc_collate = 'C' template = 'template0'" postgres
CREATE DATABASE

From the CREATE DATABASE documentation [2]:

locale
This is a shortcut for setting LC_COLLATE and LC_CTYPE at once. If you specify this, you cannot specify either of those parameters.

The patch diff_return_back_create_database_error.patch restores the removed check in CREATE DATABASE so that it behaves like createdb, as before...

[1] https://github.com/postgres/postgres/commit/f2553d43060edb210b36c63187d52a632448e1d2
[2] https://www.postgresql.org/docs/devel/sql-createdatabase.html

-- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index 6ff48bb18f3639ae45d9528b32df51a4aebc60c0..0a22cace11d9df6b5fc085bfd7b86319f4b13165 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -851,6 +851,12 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
 				 parser_errposition(pstate, defel->location)));
 	}
 
+	if (dlocale && (dcollate || dctype))
+		ereport(ERROR,
+				(errcode(ERRCODE_SYNTAX_ERROR),
+				 errmsg("conflicting or redundant options"),
+				 errdetail("LOCALE cannot be specified together with LC_COLLATE or LC_CTYPE.")));
+
 	if (downer && downer->arg)
 		dbowner = defGetString(downer);
 	if (dtemplate && dtemplate->arg)
Re: ICU for global collation
My colleague Andrew Bille found another bug in master (b4e936859dc441102eb0b6fb7a104f3948c90490) and REL_15_STABLE (2c63b0930aee1bb5c265fad4a65c9d0b62b1f9da): pg_collation.colliculocale is not dumped. See check_icu_locale.sh: In the old cluster: SELECT collname, colliculocale FROM pg_collation WHERE collname = 'testcoll_backwards' collname | colliculocale +--- testcoll_backwards | @colBackwards=yes (1 row) In the new cluster: SELECT collname, colliculocale FROM pg_collation WHERE collname = 'testcoll_backwards' collname | colliculocale +--- testcoll_backwards | (1 row) diff_dump_colliculocale.patch works for me. -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Companyinitdb -D data_old && pg_ctl -D data_old -l logfile_old start && psql -ac "CREATE COLLATION testcoll_backwards (provider = icu, locale = '@colBackwards=yes')" postgres && echo "In the old cluster:" && psql -ac "SELECT collname, colliculocale FROM pg_collation WHERE collname = 'testcoll_backwards'" postgres && pg_dump postgres > dump_postgres.sql && pg_ctl -D data_old stop && initdb -D data_new && pg_ctl -D data_new -l logfile_new start && psql -v ON_ERROR_STOP=1 -f dump_postgres.sql postgres && echo "In the new cluster:" && psql -ac "SELECT collname, colliculocale FROM pg_collation WHERE collname = 'testcoll_backwards'" postgres && pg_ctl -D data_new stop diff --git a/src/bin/pg_dump/Makefile b/src/bin/pg_dump/Makefile index 2f524b09bf53a55037013d78148f8cbca4fa7eee..9dc5a784dd2d1ce58a6284f19c54730364779c4d 100644 --- a/src/bin/pg_dump/Makefile +++ b/src/bin/pg_dump/Makefile @@ -17,6 +17,7 @@ top_builddir = ../../.. 
include $(top_builddir)/src/Makefile.global export GZIP_PROGRAM=$(GZIP) +export with_icu override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS) LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport) diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c index 2c6891573296b395c5bd6b627d061de657355e9c..e8e8f69e30c8a787e5dff157f5d1619917638d7d 100644 --- a/src/bin/pg_dump/pg_dump.c +++ b/src/bin/pg_dump/pg_dump.c @@ -13084,9 +13084,11 @@ dumpCollation(Archive *fout, const CollInfo *collinfo) int i_collisdeterministic; int i_collcollate; int i_collctype; + int i_colliculocale; const char *collprovider; const char *collcollate; const char *collctype; + const char *colliculocale; /* Do nothing in data-only dump */ if (dopt->dataOnly) @@ -13117,6 +13119,13 @@ dumpCollation(Archive *fout, const CollInfo *collinfo) appendPQExpBufferStr(query, "true AS collisdeterministic, "); + if (fout->remoteVersion >= 15) + appendPQExpBufferStr(query, + "colliculocale, "); + else + appendPQExpBufferStr(query, + "NULL AS colliculocale, "); + appendPQExpBuffer(query, "collcollate, " "collctype " @@ -13130,10 +13139,24 @@ dumpCollation(Archive *fout, const CollInfo *collinfo) i_collisdeterministic = PQfnumber(res, "collisdeterministic"); i_collcollate = PQfnumber(res, "collcollate"); i_collctype = PQfnumber(res, "collctype"); + i_colliculocale = PQfnumber(res, "colliculocale"); collprovider = PQgetvalue(res, 0, i_collprovider); - collcollate = PQgetvalue(res, 0, i_collcollate); - collctype = PQgetvalue(res, 0, i_collctype); + + if (!PQgetisnull(res, 0, i_collcollate)) + collcollate = PQgetvalue(res, 0, i_collcollate); + else + collcollate = NULL; + + if (!PQgetisnull(res, 0, i_collctype)) + collctype = PQgetvalue(res, 0, i_collctype); + else + collctype = NULL; + + if (!PQgetisnull(res, 0, i_colliculocale)) + colliculocale = PQgetvalue(res, 0, i_colliculocale); + else + colliculocale = NULL; appendPQExpBuffer(delq, "DROP COLLATION %s;\n", 
fmtQualifiedDumpable(collinfo)); @@ -13156,17 +13179,28 @@ dumpCollation(Archive *fout, const CollInfo *collinfo) if (strcmp(PQgetvalue(res, 0, i_collisdeterministic), "f") == 0) appendPQExpBufferStr(q, ", deterministic = false"); - if (strcmp(collcollate, collctype) == 0) + if (colliculocale != NULL) { appendPQExpBufferStr(q, ", locale = "); - appendStringLiteralAH(q, collcollate, fout); + appendStringLiteralAH(q, colliculocale, fout); } else { - appendPQExpBufferStr(q, ", lc_collate = "); - appendStringLiteralAH(q, collcollate, fout); - appendPQExpBufferStr(q, ", lc_ctype = "); - appendStringLiteralAH(q, collctype, fout); + Assert(collcollate != NULL); + Assert(collctype != NULL); + + if (strcmp(collcollate, collctype) == 0) + { + appendPQExpBufferStr(q, ", locale = "); + appendStringLi
Re: ICU for global collation
On 2022-08-22 17:10, Peter Eisentraut wrote: On 15.08.22 14:06, Marina Polyakova wrote:

1.1) It looks like there's a bug in the function get_db_infos (src/bin/pg_upgrade/info.c), where the version of the old cluster is always checked:

if (GET_MAJOR_VERSION(old_cluster.major_version) < 1500)
	snprintf(query + strlen(query), sizeof(query) - strlen(query),
			 "'c' AS datlocprovider, NULL AS daticulocale, ");
else
	snprintf(query + strlen(query), sizeof(query) - strlen(query),
			 "datlocprovider, daticulocale, ");

With the simple patch

diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index df374ce4b362b4c6c87fc1fd0e476e5d6d353d9e..53ea348e211d3ac38334292bc16cb814bc13bb87 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -319,7 +319,7 @@ get_db_infos(ClusterInfo *cluster)
 	snprintf(query, sizeof(query),
 			 "SELECT d.oid, d.datname, d.encoding, d.datcollate, d.datctype, ");
-	if (GET_MAJOR_VERSION(old_cluster.major_version) < 1500)
+	if (GET_MAJOR_VERSION(cluster->major_version) < 1500)
 		snprintf(query + strlen(query), sizeof(query) - strlen(query),
 				 "'c' AS datlocprovider, NULL AS daticulocale, ");
 	else

fixed

1.2) It looks like the mentioned assertion in dbcommands.c conflicts with the following lines earlier:

if (dbiculocale == NULL)
	dbiculocale = src_iculocale;

fixed

I'm wondering if this is not a fully-supported feature (because createdb creates an SQL command with LC_COLLATE and LC_CTYPE options instead of the LOCALE option) or is it a bug in CREATE DATABASE?.. From src/backend/commands/dbcommands.c:

if (dblocprovider == COLLPROVIDER_ICU && !dbiculocale)
{
	if (dlocale && dlocale->arg)
		dbiculocale = defGetString(dlocale);
}

I think this piece of code was left over from some earlier attempts to specify the libc locale and the icu locale with one option, which never really worked well. The CREATE DATABASE man page does not mention that LOCALE provides the default for ICU_LOCALE. Hence, I think we should delete this.

Thank you!
-- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: ICU for global collation
first 3 databases in the cluster in much the same way... P.S. FYI there seems to be a bug for very old ICU versions: in master (92fce4e1eda9b24d73f583fbe9b58f4e03f097a4): $ initdb -D data && pg_ctl -D data -l logfile start && psql -c "CREATE DATABASE mydb LOCALE \"C.UTF-8\" LOCALE_PROVIDER icu TEMPLATE template0" postgres && psql -c "SELECT 1" mydb WARNING: database "mydb" has a collation version mismatch DETAIL: The database was created using collation version 49.192.0.42, but the operating system provides version 49.192.5.42. HINT: Rebuild all objects in this database that use the default collation and run ALTER DATABASE mydb REFRESH COLLATION VERSION, or build PostgreSQL with the right library version. See the additional output (diff_log_icu_collator_locale.patch) in the logfile: 2022-08-20 11:38:30.162 MSK [136546] LOG: check_icu_locale uloc_getDefault() en_US 2022-08-20 11:38:30.162 MSK [136546] STATEMENT: CREATE DATABASE mydb LOCALE "C.UTF-8" LOCALE_PROVIDER icu TEMPLATE template0 2022-08-20 11:38:30.163 MSK [136546] LOG: check_icu_locale icu_locale C.UTF-8 valid_locale en_US version 49.192.0.42 2022-08-20 11:38:30.163 MSK [136546] STATEMENT: CREATE DATABASE mydb LOCALE "C.UTF-8" LOCALE_PROVIDER icu TEMPLATE template0 2022-08-20 11:38:30.163 MSK [136546] LOG: get_collation_actual_version uloc_getDefault() en_US 2022-08-20 11:38:30.163 MSK [136546] STATEMENT: CREATE DATABASE mydb LOCALE "C.UTF-8" LOCALE_PROVIDER icu TEMPLATE template0 2022-08-20 11:38:30.163 MSK [136546] LOG: get_collation_actual_version icu_locale C.UTF-8 valid_locale en_US version 49.192.0.42 2022-08-20 11:38:30.163 MSK [136546] STATEMENT: CREATE DATABASE mydb LOCALE "C.UTF-8" LOCALE_PROVIDER icu TEMPLATE template0 2022-08-20 11:38:30.224 MSK [136548] LOG: make_icu_collator uloc_getDefault() c 2022-08-20 11:38:30.225 MSK [136548] LOG: make_icu_collator icu_locale C.UTF-8 valid_locale root version 49.192.5.42 2022-08-20 11:38:30.225 MSK [136548] LOG: get_collation_actual_version 
uloc_getDefault() c 2022-08-20 11:38:30.225 MSK [136548] LOG: get_collation_actual_version icu_locale C.UTF-8 valid_locale root version 49.192.5.42 2022-08-20 11:38:30.225 MSK [136548] WARNING: database "mydb" has a collation version mismatch 2022-08-20 11:38:30.225 MSK [136548] DETAIL: The database was created using collation version 49.192.0.42, but the operating system provides version 49.192.5.42. 2022-08-20 11:38:30.225 MSK [136548] HINT: Rebuild all objects in this database that use the default collation and run ALTER DATABASE mydb REFRESH COLLATION VERSION, or build PostgreSQL with the right library version. [1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=loach=2022-08-18%2006%3A25%3A17 -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Companydiff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c index 1a047a97d74e7fe5b2cad22cd54cf30f7129bf84..21cfbfffdd1b58df821adbdd577fb599bc2747cf 100644 --- a/src/backend/utils/adt/pg_locale.c +++ b/src/backend/utils/adt/pg_locale.c @@ -1406,6 +1406,14 @@ make_icu_collator(const char *iculocstr, #ifdef USE_ICU UCollator *collator; UErrorCode status; + const char *default_icu_locale; + UVersionInfo versioninfo; + char buf[U_MAX_VERSION_STRING_LENGTH]; + const char *valid_locale; + + default_icu_locale = uloc_getDefault(); + elog(LOG, "make_icu_collator uloc_getDefault() %s", + default_icu_locale ? 
default_icu_locale : "(null)"); status = U_ZERO_ERROR; collator = ucol_open(iculocstr, ); @@ -1417,6 +1425,20 @@ make_icu_collator(const char *iculocstr, if (U_ICU_VERSION_MAJOR_NUM < 54) icu_set_collation_attributes(collator, iculocstr); + status = U_ZERO_ERROR; + valid_locale = ucol_getLocaleByType(collator, ULOC_VALID_LOCALE, ); + if (U_FAILURE(status) || valid_locale == NULL) + ereport(ERROR, +(errmsg("failed to get valid locale for collator with " + "requested locale \"%s\": %s", + iculocstr, u_errorName(status; + + ucol_getVersion(collator, versioninfo); + u_versionToString(versioninfo, buf); + + elog(LOG, "make_icu_collator icu_locale %s valid_locale %s version %s", + iculocstr ? iculocstr : "(null)", valid_locale, buf); + /* We will leak this string if the caller errors later :-( */ resultp->info.icu.locale = MemoryContextStrdup(TopMemoryContext, iculocstr); resultp->info.icu.ucol = collator; @@ -1654,6 +1676,12 @@ get_collation_actual_version(char collprovider, const char *collcollate) UErrorCode status; UVersionInfo versioninfo; char buf[U_MAX_VERSION_STRING_LENGTH]; + const char *default_icu_locale; + const char *valid_locale; + + default_icu_locale = uloc_getDefault(); + elog(LOG, "get_collation_actual_versi
Re: ICU for global collation
(errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("ICU locale must be specified"))); } + else + dbiculocale = NULL; if (dblocprovider == COLLPROVIDER_ICU) check_icu_locale(dbiculocale); 2) CREATE DATABASE does not always require the icu locale unlike initdb and createdb: $ initdb -D data --locale en_US.UTF-8 --locale-provider icu ... initdb: error: ICU locale must be specified $ initdb -D data --locale en_US.UTF-8 $ pg_ctl -D data -l logfile start $ createdb mydb --locale en_US.UTF-8 --template template0 --locale-provider icu createdb: error: database creation failed: ERROR: ICU locale must be specified $ psql -c "CREATE DATABASE mydb LOCALE \"en_US.UTF-8\" TEMPLATE template0 LOCALE_PROVIDER icu" postgres CREATE DATABASE $ psql -c "CREATE DATABASE mydb TEMPLATE template0 LOCALE_PROVIDER icu" postgres ERROR: ICU locale must be specified I'm wondering if this is not a fully-supported feature (because createdb creates an SQL command with LC_COLLATE and LC_CTYPE options instead of LOCALE option) or is it a bug in CREATE DATABASE?.. From src/backend/commands/dbcommands.c: if (dblocprovider == COLLPROVIDER_ICU && !dbiculocale) { if (dlocale && dlocale->arg) dbiculocale = defGetString(dlocale); } [1] https://github.com/postgres/postgres/commit/f2553d43060edb210b36c63187d52a632448e1d2 -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: Reduce the time required for a database recovery from archive.
Hello everyone in this thread! On 2021-03-18 18:04, David Steele wrote:

Seems like there should have been a patch attached?

IMO there's a technical problem with sending, receiving (or displaying on the site) emails from the pgsql-hackers list. By subscribing to this list I received the attached patch from the email [1]. And my colleague Roman Zharkov said that the 'Resend email' button at that link helped him to receive the email with the attached patch. On the other hand, following this link in the browser I do not see the attached patch. Do you think it is worth writing about this issue to webmaster(dot)postgresql(dot)org?.. Just in case I'm lucky, this email contains the lost patch.

[1] https://www.postgresql.org/message-id/4047CC05-1AF5-454B-850B-ED37374A2AC0%40postgrespro.ru

-- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

From c071e8ee78aac811feaf54c4374c1a998409733e Mon Sep 17 00:00:00 2001
From: Dmitry Shulga
Date: Fri, 18 Dec 2020 12:38:58 +0700
Subject: [PATCH] Reduce time required to recover database from archive.

Originally, database recovery from archive was performed by sequentially receiving files with WAL records and applying them against the database. Delivery of the files containing WAL records is performed by running a command specified by the GUC parameter restore_command. When receiving each WAL file takes a long time, recovery stands idle most of the time, waiting for the files to arrive. If the time required to apply the WAL records from an archive file is significantly less than the time required to deliver the file from the archive, this leads to unproductive idling after the current WAL segment is applied and before the next WAL segment is received. As a consequence, the wall time required to recover a database from an archive log can be unacceptably long.
To reduce the total time required to restore a database from archive, the procedure for delivering WAL files was redesigned to allow concurrent loading of WAL files. At postmaster start, a few background processes are spawned to load WAL files from the archive in parallel. The number of processes started to preload WAL files is determined by the new GUC parameter wal_prefetch_workers. The number of WAL files to prefetch from the archive is limited by the new GUC parameter wal_max_prefetch_amount.

Additionally, refactoring was done to extract duplicate code, used in the files xlogarchive.c and xlogrestore.c, into standalone functions and move it to the file xlogutils.c.

Author: Dmitry Shulga
Reviewed-by: Anna Akenteva
Tested-by: Roman Zharkov

diff --git a/src/backend/access/transam/Makefile b/src/backend/access/transam/Makefile
index 595e02de722..ffbf8090f45 100644
--- a/src/backend/access/transam/Makefile
+++ b/src/backend/access/transam/Makefile
@@ -32,7 +32,8 @@ OBJS = \
 	xlogfuncs.o \
 	xloginsert.o \
 	xlogreader.o \
-	xlogutils.o
+	xlogutils.o \
+	xlogrestore.o
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 13f1d8c3dc7..f0a0c68725e 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -37,6 +37,7 @@
 #include "access/xloginsert.h"
 #include "access/xlogreader.h"
 #include "access/xlogutils.h"
+#include "access/xlogrestore.h"
 #include "catalog/catversion.h"
 #include "catalog/pg_control.h"
 #include "catalog/pg_database.h"
@@ -3684,10 +3685,11 @@ XLogFileRead(XLogSegNo segno, int emode, TimeLineID tli,
 				 xlogfname);
 		set_ps_display(activitymsg);
 
-		restoredFromArchive = RestoreArchivedFile(path, xlogfname,
-												  "RECOVERYXLOG",
-												  wal_segment_size,
-												  InRedo);
+		restoredFromArchive = RestoreCommandXLog(path, xlogfname,
+												 "RECOVERYXLOG",
+												 wal_segment_size,
+												 InRedo);
+
 		if (!restoredFromArchive)
 			return -1;
 		break;
diff --git
a/src/backend/access/transam/xlogarchive.c b/src/backend/access/transam/xlogarchive.c
index f39dc4ddf1a..fb4023f1cec 100644
--- a/src/backend/access/transam/xlogarchive.c
+++ b/src/backend/access/transam/xlogarchive.c
@@ -22,6 +22,7 @@
 #include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "access/xlogarchive.h"
+#include "access/xlogutils.h"
 #include "common/archive.h"
 #include "miscadmin.h"
 #include "postmaster/startup.h"
@@ -55,13 +56,8 @@ RestoreArchivedFile(char *path, const char *xlogfname,
 					bool cleanupEnabled)
 {
 	char		xlogpath[MAXPGPATH];
-	char	   *xlogRestoreCmd;
 	char		lastRestartPointFname[MAXPGPATH];
 	int			rc;
-	struct stat stat_buf;
-	XLogSegNo	restartSegNo;
-	XLogRecPtr	restartRedoPtr;
-	TimeLineID	restartTli;
 
 	/*
 	 * Ignore restore_command when not in archive recovery (meaning we are
Re: pgbench: option delaying queries till connections establishment?
Hello! On 2020-11-14 20:07, Alexander Korotkov wrote: Hmm... Let's see the big picture. You've recently committed a patchset, which greatly improved the performance of GetSnapshotData(). And you're making further improvements in this direction. But you're getting trouble in measuring the effect, because Postgres is still stuck on ProcArrayLock. And in this thread you propose a workaround for that implemented on the pgbench side. My very dumb idea is following: should we finally give a chance to more fair lwlocks rather than inventing workarounds? As I remember, your major argument against more fair lwlocks was the idea that we should fix lwlocks use-cases rather than lwlock mechanism themselves. But can we expect that we fix all the lwlocks use-case in any reasonable prospect? My guess is 'no'. Links 1. https://www.postgresql.org/message-id/CAPpHfdvJhO1qutziOp%3Ddy8TO8Xb4L38BxgKG4RPa1up1Lzh_UQ%40mail.gmail.com Sorry I'm not familiar with the internal architecture of snapshots, locks etc. in postgres, but I wanted to ask - what exact kind of patch for fair lwlocks do you want to offer to the community? I applied the 6-th version of the patch for fair lwlocks from [1] to the old master branch (see commit [2]), started many threads in pgbench (-M prepared -c 1000 -j 500 -T 10 -P1 -S) and I did not receive stable first progress reports, which IIUC are one of the advantages of the discussed patch for the pgbench (see [3])... [1] https://www.postgresql.org/message-id/CAPpHfduV3v3EG7K74-9htBZz_mpE993zGz-%3D2k5RNA3tqabUAA%40mail.gmail.com [2] https://github.com/postgres/postgres/commit/84d514887f9ca673ae688d00f8b544e70f1ab270 [3] https://www.postgresql.org/message-id/20200227185129.hikscyenomnlrord%40alap3.anarazel.de -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: pgbench: option delaying queries till connections establishment?
Hello! On 2020-11-13 08:44, kuroda.hay...@fujitsu.com wrote: Dear Fabien, and this will wait till its time comes. In the mean time, I think that you should put the patch status as you see fit, independently of the other patch: it seems unlikely that they would be committed together, and I'll have to merge the remaining one anyway. OK. I found the related thread[1], and I understood you will submit another patch on the thread. PostgreSQL Patch Tester says all regression tests are passed, and I change the status to "Ready for committer." [1]: https://commitfest.postgresql.org/31/2827/ Thank you for discussing with me. Hayato Kuroda FUJITSU LIMITED From the mentioned thread [2]: While trying to test a patch that adds a synchronization barrier in pgbench [1] on Windows, Thanks for trying that, I do not have a windows setup for testing, and the sync code I wrote for Windows is basically blind coding:-( FYI: 1) It looks like pgbench will no longer support Windows XP due to the function DeleteSynchronizationBarrier. From https://docs.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-deletesynchronizationbarrier : Minimum supported client: Windows 8 [desktop apps only] Minimum supported server: Windows Server 2012 [desktop apps only] On Windows Server 2008 R2 (MSVC 2013) the 6-th version of the patch [1] has compiled without (new) warnings, but when running pgbench I got the following error: The procedure entry point DeleteSynchronizationBarrier could not be located in the dynamic link library KERNEL32.dll. IMO, it looks like either old Windows systems should not call new functions, or we should throw them a compilation error. (Update MIN_WINNT to 0x0602 = Windows 8 in src/include/port/win32.h?) 
In the second case it looks like the documentation should be updated too, see doc/src/sgml/installation.sgml:

PostgreSQL can be expected to work on these operating systems: Linux (all recent distributions), Windows (XP and later), FreeBSD, OpenBSD, NetBSD, macOS, AIX, HP/UX, and Solaris. Other Unix-like systems may also work but are not currently being tested. In most cases, all CPU architectures supported by a given operating system will work. Look below to see if there is information specific to your operating system, particularly if using an older system. <...> The native Windows port requires a 32 or 64-bit version of Windows 2000 or later. Earlier operating systems do not have sufficient infrastructure (but Cygwin may be used on those). MinGW, the Unix-like build tools, and MSYS, a collection of Unix tools required to run shell scripts like configure, can be downloaded from http://www.mingw.org/. Neither is required to run the resulting binaries; they are needed only for creating the binaries.

[2] https://www.postgresql.org/message-id/e5a09b790db21356376b6e73673aa07c%40postgrespro.ru

-- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: pgbench stopped supporting large number of client connections on Windows
On 2020-11-07 01:01, Fabien COELHO wrote: Hello Marina, Hello, Fabien! Thank you for your comments! While trying to test a patch that adds a synchronization barrier in pgbench [1] on Windows, Thanks for trying that, I do not have a Windows setup for testing, and the sync code I wrote for Windows is basically blind coding:-(

FYI: 1) It looks like pgbench will no longer support Windows XP due to the function DeleteSynchronizationBarrier. From https://docs.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-deletesynchronizationbarrier : Minimum supported client: Windows 8 [desktop apps only] Minimum supported server: Windows Server 2012 [desktop apps only] On Windows Server 2008 R2 (MSVC 2013) the sixth version of the patch [1] compiled without (new) warnings, but when running pgbench I got the following error: The procedure entry point DeleteSynchronizationBarrier could not be located in the dynamic link library KERNEL32.dll.

2) On Windows Server 2019 (MSVC 2019) the sixth version of the patch [1] with fix_max_client_conn_on_Windows.patch compiled without (new) warnings. I made a few runs (-M prepared -c 100 -j 10 -T 10 -P1 -S) with and without your patches. On Linux (-M prepared -c 1000 -j 500 -T 10 -P1 -S) your patches fix problems with progress reports as in [2], but on Windows I did not notice such changes, see attached pgbench_runs_linux_vs_windows.zip.

Almost the same thing happens with reindexdb and vacuumdb (built on commit [3]): Windows fd implementation is somehow buggy because it does not return the smallest number available, and then with the assumption that select uses a dense array indexed with them (true on Linux, less so on Windows, which probably uses a sparse array), so that the number gets over the limit, even if fewer are actually used, hence the catch, as you noted. I agree with you. It looks like the structure fd_set just contains the sockets used by this application on Windows, and the macro FD_SETSIZE is used only here.
From https://docs.microsoft.com/en-us/windows/win32/api/winsock/ns-winsock-fd_set : typedef struct fd_set { u_int fd_count; SOCKET fd_array[FD_SETSIZE]; } fd_set, FD_SET, *PFD_SET, *LPFD_SET; From https://docs.microsoft.com/en-us/windows/win32/winsock/maximum-number-of-sockets-supported-2 : The maximum number of sockets that a Windows Sockets application can use is not affected by the manifest constant FD_SETSIZE. This value defined in the Winsock2.h header file is used in constructing the FD_SET structures used with the select function.

IIUC the checks below are not correct on Windows, since on this system sockets can have values equal to or greater than FD_SETSIZE (see Windows documentation [4] and pgbench debug output in attached pgbench_debug.txt). Okay. But then, how may one detect that there are too many fds in the set? I think that an earlier version of the code needed to make assumptions about the internal implementation of Windows (there is a counter somewhere in the Windows fd_set struct), which was rejected because it was breaking the interface. Now your patch is basically resurrecting that. I tried to keep the behaviour "we check if the socket value can be used in select() at runtime", but now I will also read that thread... Why not, if there is no other solution, but this is quite depressing, and because it breaks the interface it would be broken if Windows changed its internals for some reason:-( It looks like if the internals of the structure fd_set are changed, we will also have problems with the function pgwin32_select from src/backend/port/win32/socket.c, because it uses fd_set.fd_count too?.. (I'm writing responses to the rest of your comments but it takes time...) [1] https://www.postgresql.org/message-id/alpine.DEB.2.22.394.2011021726390.989361%40pseudo [2] https://www.postgresql.org/message-id/20200227185129.hikscyenomnlrord%40alap3.anarazel.de -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: pgbench stopped supporting large number of client connections on Windows
On 2020-11-06 23:54, Ranier Vilela wrote: Hi Marina, Hello! Nice catch. Thank you! src/bin/pgbench/pgbench.c, the function add_socket_to_set: if (fd < 0 || fd >= FD_SETSIZE) { /* * Doing a hard exit here is a bit grotty, but it doesn't seem worth * complicating the API to make it less grotty. */ pg_log_fatal("too many client connections for select()"); exit(1); } It seems to me that the limit is hardcoded in src/backend/port/win32/socket.c: FD_SETSIZE * 2, which would be 2048? 1) If you mean the function pgwin32_select in the file src/backend/port/win32/socket.c, IIUC it is only used in the backend, see src/include/port/win32_port.h: #ifndef FRONTEND <...> #define select(n, r, w, e, timeout) pgwin32_select(n, r, w, e, timeout) <...> #endif /* FRONTEND */ 2) It looks like FD_SETSIZE does not set a limit on the socket value on Windows, see https://docs.microsoft.com/en-us/windows/win32/winsock/maximum-number-of-sockets-supported-2 : The maximum number of sockets that a Windows Sockets application can use is not affected by the manifest constant FD_SETSIZE. This value defined in the Winsock2.h header file is used in constructing the FD_SET structures used with the select function. -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
pgbench stopped supporting large number of client connections on Windows
Hello, hackers! While trying to test a patch that adds a synchronization barrier in pgbench [1] on Windows, I found that since the commit "Use ppoll(2), if available, to wait for input in pgbench." [2] I cannot use a large number of client connections in pgbench on my Windows virtual machines (Windows Server 2008 R2 and Windows Server 2019), for example:

bin\pgbench.exe -c 90 -S -T 3 postgres starting vacuum...end. too many client connections for select()

Almost the same thing happens with reindexdb and vacuumdb (built on commit [3]):

bin\reindexdb.exe -j 95 postgres reindexdb: fatal: too many jobs for this platform -- try 90 bin\vacuumdb.exe -j 95 postgres vacuumdb: vacuuming database "postgres" vacuumdb: fatal: too many jobs for this platform -- try 90

IIUC the checks below are not correct on Windows, since on this system sockets can have values equal to or greater than FD_SETSIZE (see Windows documentation [4] and pgbench debug output in attached pgbench_debug.txt). src/bin/pgbench/pgbench.c, the function add_socket_to_set: if (fd < 0 || fd >= FD_SETSIZE) { /* * Doing a hard exit here is a bit grotty, but it doesn't seem worth * complicating the API to make it less grotty. */ pg_log_fatal("too many client connections for select()"); exit(1); } src/bin/scripts/scripts_parallel.c, the function ParallelSlotsSetup: /* * Fail and exit immediately if trying to use a socket in an * unsupported range. POSIX requires open(2) to use the lowest * unused file descriptor and the hint given relies on that. */ if (PQsocket(conn) >= FD_SETSIZE) { pg_log_fatal("too many jobs for this platform -- try %d", i); exit(1); } I tried to fix this, see attached fix_max_client_conn_on_Windows.patch (based on commit [3]). I checked it for reindexdb and vacuumdb, and it works for simple databases (1025 jobs are not allowed and 1024 jobs are ok).
Unfortunately, pgbench was getting connection errors when it tried to use 1000 jobs on my virtual machines, although there were no errors for fewer jobs (500) and the same number of clients (1000)... Any suggestions are welcome! [1] https://www.postgresql.org/message-id/flat/20200227180100.zyvjwzcpiokfsqm2%40alap3.anarazel.de [2] https://github.com/postgres/postgres/commit/60e612b602999e670f2d57a01e52799eaa903ca9 [3] https://github.com/postgres/postgres/commit/48e1291342dd7771cf8c67aa1d7ec1f394b95dd8 [4] From https://docs.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-select : Internally, socket handles in an fd_set structure are not represented as bit flags as in Berkeley Unix. Their data representation is opaque. -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c index 3057665bbec567331ad5ea03d31af707f5e91b4c..7a54638db191982d538cabf007d82715fa254b6a 100644 --- a/src/bin/pgbench/pgbench.c +++ b/src/bin/pgbench/pgbench.c @@ -62,6 +62,7 @@ #include "common/string.h" #include "fe_utils/cancel.h" #include "fe_utils/conditional.h" +#include "fe_utils/socket_utils.h" #include "getopt_long.h" #include "libpq-fe.h" #include "pgbench.h" @@ -6698,7 +6699,7 @@ clear_socket_set(socket_set *sa) static void add_socket_to_set(socket_set *sa, int fd, int idx) { - if (fd < 0 || fd >= FD_SETSIZE) + if (fd < 0 || !check_fd_set_size(fd, &sa->fds)) { /* * Doing a hard exit here is a bit grotty, but it doesn't seem worth diff --git a/src/bin/scripts/scripts_parallel.c b/src/bin/scripts/scripts_parallel.c index ec264a269a7d7a506c347112af6be183b2aec9ce..61977741c7c5f76b1966ab8e59792a1d2b53b934 100644 --- a/src/bin/scripts/scripts_parallel.c +++ b/src/bin/scripts/scripts_parallel.c @@ -25,6 +25,7 @@ #include "common.h" #include "common/logging.h" #include "fe_utils/cancel.h" +#include "fe_utils/socket_utils.h" #include "scripts_parallel.h" static void
init_slot(ParallelSlot *slot, PGconn *conn); @@ -144,6 +145,16 @@ ParallelSlotsGetIdle(ParallelSlot *slots, int numslots) if (sock < 0) continue; + /* + * Fail and exit immediately if trying to use a socket in an + * unsupported range. + */ + if (!check_fd_set_size(sock, &slotset)) + { + pg_log_fatal("too many jobs for this platform -- try %d", i); + exit(1); + } + FD_SET(sock, &slotset); if (sock > maxFd) maxFd = sock; @@ -221,18 +232,6 @@ ParallelSlotsSetup(const ConnParams *cparams, for (i = 1; i < numslots; i++) { conn = connectDatabase(cparams, progname, echo, false, true); - - /* - * Fail and exit immediately if trying to use a socket in an - * unsupported range. POSIX requires open(2) to use the lowest - * unused file descriptor and the hint given relies o
Re: pg_upgrade check fails on Solaris 10
On 2019-09-23 19:41, Alvaro Herrera wrote: On 2019-Sep-23, Marina Polyakova wrote: The branch REL9_4_STABLE (commit 8a17afe84be6fefe76d0d2f4d26c5ee075e64487) has the same issue - according to the release table [2] it is still supported, isn't it?... Yeah, but pg_upgrade is in contrib/ in 9.4, so nowhere as good as from 9.5 onwards; and it's going to die in a couple of months anyway, so I'm not thrilled about fixing this there. If you *need* to have this fixed in 9.4, we can do that, but do you? No, we don't. I just noticed :-) -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: pg_upgrade check fails on Solaris 10
On 2019-09-18 17:36, Alvaro Herrera wrote: On 2019-Sep-17, Marina Polyakova wrote: Hello, hackers! We got an error for pg_upgrade check on the branch REL_11_STABLE (commit 40ad4202513c72f5c1beeb03e26dfbc8890770c0) on Solaris 10 because IIUC the argument to the sed command is not enclosed in quotation marks (see [1]): Hmm, I'm surprised it has taken this long to detect the problem. Looking at the buildfarm members [1] castoroides and protosciurus, IIUC they do not check pg_upgrade. And I was the lucky one who ran the branch with the latest commits on our buildfarm... Attached diff.patch fixes the problem. I have pushed it to all branches that have src/bin/pg_upgrade (namely, 9.5 onwards), thanks. I hope this won't make the msys/mingw machines angry ;-) Thank you! I ran pg_upgrade tests for MSYS, everything is fine. The branch REL9_4_STABLE (commit 8a17afe84be6fefe76d0d2f4d26c5ee075e64487) has the same issue - according to the release table [2] it is still supported, isn't it?... [1] https://buildfarm.postgresql.org/cgi-bin/show_members.pl [2] https://www.postgresql.org/support/versioning/ -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
pg_upgrade check fails on Solaris 10
Hello, hackers! We got an error for pg_upgrade check on the branch REL_11_STABLE (commit 40ad4202513c72f5c1beeb03e26dfbc8890770c0) on Solaris 10 because IIUC the argument to the sed command is not enclosed in quotation marks (see [1]): $ gmake -C src/bin/pg_upgrade/ check <...> MAKE=gmake bindir="/home/buildfarm/mpolyakova/postgrespro_REL_11_STABLE/inst/bin" libdir="/home/buildfarm/mpolyakova/postgrespro_REL_11_STABLE/inst/lib" EXTRA_REGRESS_OPTS="" /bin/sh test.sh --install test.sh: MSYS/MINGW/: not found gmake: *** [check] Error 1 gmake: Leaving directory `/home/buildfarm/mpolyakova/postgrespro_REL_11_STABLE/src/bin/pg_upgrade' $ sed: command garbled: s/ Attached diff.patch fixes the problem. About the system: SunOS, Release 5.10, KernelID Generic_141444-09. About the used shell: according to the manual, it comes from the package SUNWcsu. Thanks to Victor Wagner for his help to investigate this issue. [1] $ man sh <...> Quoting The following characters have a special meaning to the shell and cause termination of a word unless quoted: ; & ( ) | ^ < > newline space tab -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Companydiff --git a/src/bin/pg_upgrade/test.sh b/src/bin/pg_upgrade/test.sh index 424d89c3feeced91531920e919a2d9e06cc4baa6..f258983b18f8276cfef3db0a425d59fead6493d9 100644 --- a/src/bin/pg_upgrade/test.sh +++ b/src/bin/pg_upgrade/test.sh @@ -33,7 +33,7 @@ standard_initdb() { # What flavor of host are we on? # Treat MINGW* (msys1) and MSYS* (msys2) the same. -testhost=`uname -s | sed s/^MSYS/MINGW/` +testhost=`uname -s | sed 's/^MSYS/MINGW/'` # Establish how the server will listen for connections case $testhost in
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
On 2018-11-16 22:59, Alvaro Herrera wrote: On 2018-Sep-05, Marina Polyakova wrote: v11-0001-Pgbench-errors-use-the-RandomState-structure-for.patch - a patch for the RandomState structure (this is used to reset a client's random seed during the repeating of transactions after serialization/deadlock failures). Pushed this one with minor stylistic changes (the most notable of which is the move of initRandomState to where the rest of the random generator infrastructure is, instead of in a totally random place). Thanks, Thank you very much! I'm going to send a new patch set by the end of this week (I'm sorry, I was very busy with the release of Postgres Pro 11...). -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
On 12-09-2018 17:04, Fabien COELHO wrote: Hello Marina, You can get other errors that cannot happen for only one client if you use shell commands in meta commands: Or if you use untrusted procedural languages in SQL expressions (see the used file in the attachments): Or if you try to create a function and perhaps replace an existing one: Sure. Indeed there can be shell errors, perl errors, create function conflicts... I do not understand what your point is wrt these. I'm mostly saying that your patch should focus on implementing the retry feature when appropriate, and avoid changing the behavior (error displayed, abort or not) on features unrelated to serialization & deadlock errors. Maybe there are inconsistencies, and a "bug"/"feature" worth fixing, but if so that should be a separate patch, if possible, and if these are bugs they could be backpatched. For now I'm still convinced that pgbench should keep on aborting on "\set" or SQL syntax errors, and show clear error messages on these, and your examples have not changed my mind on that point. I'm fine with renaming the field if it makes things clearer. They are all counters, so naming them "cnt" or "total_cnt" does not help much. Maybe "succeeded" or "success" to show what is really counted? Perhaps renaming StatsData.cnt is better than just adding a comment to this field. But IMO we have the same problem (They are all counters, so naming them "cnt" or "total_cnt" does not help much.) for CState.cnt, which cannot be named in the same way because it also includes skipped and failed transactions. Hmmm. CState's cnt seems only used to implement -t anyway? I'm okay if it has a different name, especially if it has different semantics. Ok! I think I was arguing only about cnt in StatsData. The discussion about this has become entangled from the beginning because, as I wrote in [1], at first I misread your original proposal...
[1] https://www.postgresql.org/message-id/d318cdee8f96de6b1caf2ce684ffe4db%40postgrespro.ru -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
0 DO $$ Or if you try to create a function and perhaps replace an existing one: starting vacuum...end. client 0 got an error in command 0 (SQL) of script 0; ERROR: duplicate key value violates unique constraint "pg_proc_proname_args_nsp_index" DETAIL: Key (proname, proargtypes, pronamespace)=(my_function, , 2200) already exists. client 0 got an error in command 0 (SQL) of script 0; ERROR: tuple concurrently updated client 1 got an error in command 0 (SQL) of script 0; ERROR: tuple concurrently updated client 1 got an error in command 0 (SQL) of script 0; ERROR: tuple concurrently updated client 1 got an error in command 0 (SQL) of script 0; ERROR: tuple concurrently updated client 1 got an error in command 0 (SQL) of script 0; ERROR: tuple concurrently updated client 0 got an error in command 0 (SQL) of script 0; ERROR: tuple concurrently updated client 1 got an error in command 0 (SQL) of script 0; ERROR: tuple concurrently updated client 1 got an error in command 0 (SQL) of script 0; ERROR: tuple concurrently updated client 0 got an error in command 0 (SQL) of script 0; ERROR: tuple concurrently updated transaction type: pgbench_create_function.sql scaling factor: 1 query mode: simple number of clients: 2 number of threads: 1 number of transactions per client: 10 number of transactions actually processed: 10/20 number of failures: 10 (50.000%) number of serialization failures: 0 (0.000%) number of deadlock failures: 0 (0.000%) number of other SQL failures: 10 (50.000%) maximum number of tries: 1 latency average = 82.881 ms (including failures) tps = 12.065492 (including connections establishing) tps = 12.092216 (excluding connections establishing) statement latencies in milliseconds and failures: 82.549 10 CREATE OR REPLACE FUNCTION my_function() RETURNS integer AS 'select 1;' LANGUAGE SQL; Why not handle client errors that can occur (but they may also not occur) the same way? 
(For example, always abort the client, or conversely do not abort in these cases.) Here's an example of such an error: client 5 got an error in command 1 (SQL) of script 0; ERROR: division by zero This is an interesting case. For me we must stop the script because the client is asking for something "stupid", and retrying the same won't change the outcome, the division will still be by zero. It is the client's responsibility not to ask for something stupid, the bench script is buggy, it should not submit illegal SQL queries. This is quite different from submitting something legal which happens to fail. ... I'm not sure that having "--debug" imply this option is useful: As there are two distinct options, the user may be allowed to trigger one or the other as they wish? I'm not sure that the main debugging output will give a good clue of what's happened without full messages about errors, retries and failures... I'm mostly arguing for letting the user decide what they want. These lines are quite long - do you suggest wrapping them this way? Sure, if it is too long, then wrap. Ok! The name of the function getTransactionStatus does not seem to correspond fully to what the function does. There is a passthru case which should be either avoided or clearly commented. I don't quite understand you - do you mean that in fact this function finds out whether we are in a (failed) transaction block or not? Or do you mean that the case of PQTRANS_INTRANS is also ok?... The former: although the function is named "getTransactionStatus", it does not really return the "status" of the transaction (aka PQstatus()?). Thank you, I'll think about how to improve it. Perhaps the name checkTransactionStatus will be better... I'd insist in a comment that "cnt" does not include "skipped" transactions (anymore). If you mean CState.cnt, I'm not sure if this is practically useful because the code uses only the sum of all client transactions including skipped and failed...
Maybe we can rename this field to nxacts or total_cnt? I'm fine with renaming the field if it makes things clearer. They are all counters, so naming them "cnt" or "total_cnt" does not help much. Maybe "succeeded" or "success" to show what is really counted? Perhaps renaming StatsData.cnt is better than just adding a comment to this field. But IMO we have the same problem (They are all counters, so naming them "cnt" or "total_cnt" does not help much.) for CState.cnt, which cannot be named in the same way because it also includes skipped and failed transactions. -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

DO $$ my $directory = "my_directory"; mkdir $directory or elog(ERROR, qq{could not create the directory "$directory": $!}); select(undef, undef, undef, 0.001); rmdir $directory or elog(ERROR, qq{could not delete the directory "$directory": $!}); $$ LANGUAGE plperlu;
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
On 11-09-2018 16:47, Marina Polyakova wrote: On 08-09-2018 16:03, Fabien COELHO wrote: Hello Marina, I'd insist in a comment that "cnt" does not include "skipped" transactions (anymore). If you mean CState.cnt, I'm not sure if this is practically useful because the code uses only the sum of all client transactions including skipped and failed... Maybe we can rename this field to nxacts or total_cnt? Sorry, I misread your proposal the first time. Ok! -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
g"); + clientAborted(st, + "perhaps the backend died while processing"); keep on one line? I tried not to break the limit of 80 characters, but if you think that this is better, I'll change it. Overall, the comment text in StatsData is very clear. However they are not clearly linked to the struct fields. I'd suggest that each field when used should be quoted, so as to separate English from code, and the struct name should always be used explicitly when possible. Ok! I'd insist in a comment that "cnt" does not include "skipped" transactions (anymore). If you mean CState.cnt, I'm not sure if this is practically useful because the code uses only the sum of all client transactions including skipped and failed... Maybe we can rename this field to nxacts or total_cnt? About v11-4. I do not feel that these changes are very useful/important for now. I'd propose that you prioritize updating 11-3 so that we can have another round about it as soon as possible, and keep that one for later. Ok! -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
On 08-09-2018 10:17, Fabien COELHO wrote: Hello Marina, Hello, Fabien! About the two first preparatory patches. v11-0001-Pgbench-errors-use-the-RandomState-structure-for.patch - a patch for the RandomState structure (this is used to reset a client's random seed during the repeating of transactions after serialization/deadlock failures). Same version as the previous one, which was ok. Still applies, compiles, passes tests. Fine with me. v11-0002-Pgbench-errors-use-the-Variables-structure-for-c.patch - a patch for the Variables structure (this is used to reset client variables during the repeating of transactions after serialization/deadlock failures). Simpler version, applies cleanly on top of previous patch, compiles and global & local "make check" are ok. Fine with me as well. Glad to hear it :) -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
On 17-08-2018 14:04, Fabien COELHO wrote: ... Or perhaps we can use a more detailed failure status so for each type of failure we always know the command name (argument "cmd") and whether the client is aborted. Something like this (but in comparison with the first variant ISTM overly complicated): I agree. I do not think that it would be useful given that the same thing is done on all meta-command error cases in the end. Ok! -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
On 17-08-2018 10:49, Fabien COELHO wrote: Hello Marina, Detailed -r report. I understand from the doc that the retry number on the detailed per-statement report is to identify at what point errors occur? Probably this is more or less always at the same point on a given script, so that the most interesting feature is to report the number of retries at the script level. This may depend on various factors.. for example: [...] 21.239 5 36 UPDATE xy3 SET y = y + :delta WHERE x = :x3; 21.360 5 39 UPDATE xy2 SET y = y + :delta WHERE x = :x2; Ok, not always the same point, and you confirm that it identifies where the error is raised which leads to a retry. Yes, I confirm this. I'll try to write more clearly about this in the documentation... So we can write something like this: All the transactions are divided into several types depending on their execution. Firstly, they can be divided into transactions that we started to execute, and transactions which were skipped (it was too late to execute them). Secondly, running transactions fall into 2 main types: is there any command that got a failure during the last execution of the transaction script or not? Thus Here is an attempt at having a more precise and shorter version, not sure it is much better than yours, though: """ Transactions are counted depending on their execution and outcome. First a transaction may have started or not: skipped transactions occur under --rate and --latency-limit when the client is too late to execute them. Secondly, a started transaction may ultimately succeed or fail on some error, possibly after some retries when --max-tries is not one. Thus """ Thank you! I'd suggest to put "ANOTHER_SQL_FAILURE" as the last option, otherwise "another" does not make sense yet. Maybe firstly put a general group, and then special cases?... I understand it more as a catch all default "none of the above" case. Ok! 
commandFailed: I'm not thrilled by the added boolean, which is partially redundant with the second argument. Do you mean that it is partially redundant with the argument "cmd" and, for example, that meta command errors never cause the client to abort? Yes. And also I'm not sure we should want this boolean at all. Perhaps we can use a separate function to print the messages about the client's abortion, something like this (it is assumed that all abortions happen when processing SQL commands): static void clientAborted(CState *st, const char *message) { pgbench_error(..., "client %d aborted in command %d (SQL) of script %d; %s\n", st->id, st->command, st->use_file, message); } Or perhaps we can use a more detailed failure status so for each type of failure we always know the command name (argument "cmd") and whether the client is aborted. Something like this (but in comparison with the first variant ISTM overly complicated): /* * For the failures during script execution. */ typedef enum FailureStatus { NO_FAILURE = 0, /* * Failures in meta commands. In these cases the failed transaction is * terminated. */ META_SET_FAILURE, META_SETSHELL_FAILURE, META_SHELL_FAILURE, META_SLEEP_FAILURE, META_IF_FAILURE, META_ELIF_FAILURE, /* * Failures in SQL commands. In cases of serialization/deadlock failures a * failed transaction is re-executed from the very beginning if possible; * otherwise the failed transaction is terminated. */ SERIALIZATION_FAILURE, DEADLOCK_FAILURE, OTHER_SQL_FAILURE, /* other failures in SQL commands that are not * listed by themselves above */ /* * Failures while processing SQL commands. In this case the client is * aborted. */ SQL_CONNECTION_FAILURE } FailureStatus; [...] If in such cases one command is placed on several lines, ISTM that the code is more understandable if curly brackets are used... Hmmm.
Such basic style changes are avoided because they break backpatching, so we try to avoid gratuitous changes unless there is a strong added value, which does not seem to be the case here. Ok! -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
I'm not thrilled by the added boolean, which is partially redundant with the second argument. Do you mean that it is partially redundant with the argument "cmd" and, for example, that meta command errors never cause the client to abort? if (per_script_stats) - accumStats(&sql_script[st->use_file].stats, skipped, latency, lag); + { + accumStats(&sql_script[st->use_file].stats, skipped, latency, lag, + st->failure_status, st->retries); + } } I do not see the point of changing the style here. If in such cases one command is placed on several lines, ISTM that the code is more understandable if curly brackets are used... [1] https://www.postgresql.org/message-id/fcc2512cdc9e6bc49d3b489181f454da%40postgrespro.ru -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
On 12-08-2018 12:14, Fabien COELHO wrote: Hello Marina, Hello, Fabien! v10-0003-Pgbench-errors-use-the-Variables-structure-for-c.patch - a patch for the Variables structure (this is used to reset client variables during the repeating of transactions after serialization/deadlock failures). This patch adds an explicit structure to manage Variables, which is useful to reset these on pgbench script retries, which is the purpose of the whole patch series. About part 3: Patch applies cleanly, On 12-08-2018 12:17, Fabien COELHO wrote: About part 3: Patch applies cleanly, I forgot: compiles, global & local "make check" are ok. I'm glad to hear it :-) * typo in comments: "varaibles" I'm sorry, I'll fix it. * About enlargeVariables: multiple INT_MAX error handling looks strange, especially as this code can never be triggered because pgbench would be dead long before having allocated INT_MAX variables. So I would not bother to add such checks. ... I'm not sure that the size_t casts here and there are useful for any practical values likely to be encountered by pgbench. Looking at the code of the functions, for example, ParseScript and psql_scan_setup, where the integer variable is used for the size of the entire script - ISTM that you are right... Therefore the size_t casts will also be removed. ISTM that if something is amiss it will fail in pg_realloc anyway. IIUC, if physical RAM is not enough, this may depend on the size of the swap. Also I do not like the ExpBuf stuff, as usual. The exponential allocation seems overkill. I'd simply add a constant number of slots, with a simple rule: /* reallocated with a margin */ if (max_vars < needed) max_vars = needed + 8; So in the end the function should be much simpler. Ok! -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
On 10-08-2018 17:19, Arthur Zakirov wrote: On Fri, Aug 10, 2018 at 04:46:04PM +0300, Marina Polyakova wrote: > +1 from me to keep initial name "pgbench_error". "pgbench_log" for new > function looks nice to me. I think it is better than just "log", > because "log" may conflict with natural logarithmic function (see "man 3 > log"). Do you think that pgbench_log (or another whose name speaks only about logging) will look good, for example, with FATAL? Because this means that the logging function also processes errors and calls exit(1) if necessary... Yes, why not. "_log" just means that you want to log some message with the specified log level. Moreover, those messages sometimes aren't errors: pgbench_error(LOG, "starting vacuum..."); "pgbench_log" is already used as the default filename prefix for transaction logging. > I agree with Fabien. Calling pgbench_error() inside pgbench_error() > could be dangerous. I think "fmt" checking could be removed, or we may > use Assert() I would prefer not to use Assert in this case because IIUC they are mostly used for testing. I'd vote to remove this check altogether. I don't see any place where it is possible to call pgbench_error() passing an empty "fmt". pgbench_error(..., "%s", PQerrorMessage(con)); ? -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
On 10-08-2018 15:53, Arthur Zakirov wrote: On Thu, Aug 09, 2018 at 06:17:22PM +0300, Marina Polyakova wrote: > * ErrorLevel > > If ErrorLevel is used for things which are not errors, its name should > not include "Error"? Maybe "LogLevel"?

On the one hand, this sounds better to me too. On the other hand, will not this be in some kind of conflict with error level codes in elog.h?..

I think it shouldn't, because those error levels are backend levels. pgbench is a client-side utility with its own code; it shares some code with libpq and other utilities, but elog.h isn't one of them.

I agree with you on this :) I just meant that maybe it would be better to call this group in the same way because they are used in general for the same purpose?..

> This point also suggests that maybe "pgbench_error" is misnamed as well > (ok, I know I suggested it in place of ereport, but e stands for error > there), as it is called on errors, but also on other things. Maybe > "pgbench_log"? Or just simply "log" or "report", as it is really a > local function, which does not need a prefix? That would mean that > "pgbench_simple_error", which is indeed called on errors, could keep > its initial name "pgbench_error", and be called on errors.

About the name "log" - we already have the function doLog, so perhaps the name "report" will be better.. But as with ErrorLevel, will not this be in some kind of conflict with ereport which is also used for the levels DEBUG... / LOG / INFO?

+1 from me to keep the initial name "pgbench_error". "pgbench_log" for the new function looks nice to me. I think it is better than just "log", because "log" may conflict with the natural logarithmic function (see "man 3 log").

Do you think that pgbench_log (or another name that speaks only about logging) will look good, for example, with FATAL? Because this means that the logging function also processes errors and calls exit(1) if necessary..

> pgbench_error calls pgbench_error. Hmmm, why not. I agree with Fabien.
Calling pgbench_error() inside pgbench_error() could be dangerous. I think "fmt" checking could be removed, or we may use Assert() I would like not to use Assert in this case because IIUC they are mostly used for testing. or fprintf()+exit(1) at least. Ok! -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
On 10-08-2018 11:33, Fabien COELHO wrote: Hello Marina,

I'd suggest to let lookupCreateVariable, putVariable* as they are, call pgbench_error with a level which does not stop the execution, and abort if necessary from the callers with an "aborted because of putVariable/eval/... error" message, as it was done before.

There's one more problem: if this is a client failure, an error message inside any of these functions should be printed at the level DEBUG_FAILS; otherwise it should be printed at the level LOG. Or do you suggest using the error level as an argument for these functions?

No. I suggest that the called function does only one simple thing, probably "DEBUG", and that the *caller* prints a message if it is unhappy about the failure of the called function, as it is currently done. This allows the caller to provide context as well, eg "setting variable %s failed while ". The user can rerun under debug for precision if they need it.

Ok!

I'm still not over-enthusiastic with these changes, and still think that it should be an independent patch, not submitted together with the "retry on error" feature.

In the next version I will put the error patch last, so it will be possible to compare the "retry on error" feature with it and without it, and let the committer decide how it is better) -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
report, but e stands for error there), as it is called on errors, but also on other things. Maybe "pgbench_log"? Or just simply "log" or "report", as it is really a local function, which does not need a prefix? That would mean that "pgbench_simple_error", which is indeed called on errors, could keep its initial name "pgbench_error", and be called on errors.

About the name "log" - we already have the function doLog, so perhaps the name "report" will be better.. But as with ErrorLevel, will not this be in some kind of conflict with ereport which is also used for the levels DEBUG... / LOG / INFO?

Alternatively, the debug/logging code could be left as it is (i.e. direct print to stderr) and the function only called when there is some kind of error, in which case it could be named with "error" in its name (or elog/ereport...).

As I wrote in [2]: because of my patch there may be a lot of such code (see v7 in [1]):

    - fprintf(stderr,
    -         "malformed variable \"%s\" value: \"%s\"\n",
    -         var->name, var->svalue);
    + if (debug_level >= DEBUG_FAILS)
    + {
    +     fprintf(stderr,
    +             "malformed variable \"%s\" value: \"%s\"\n",
    +             var->name, var->svalue);
    + }

    - if (debug)
    + if (debug_level >= DEBUG_ALL)
          fprintf(stderr, "client %d sending %s\n", st->id, sql);

That's why it was suggested to make the error function which hides all these things (see [2]): There is a lot of checks like "if (debug_level >= DEBUG_FAILS)" with corresponding fprintf(stderr..) I think it's time to do it like in the main code, wrap with some function like log(level, msg). And IIUC macros will not help in the absence of __VA_ARGS__.

* PQExpBuffer

I still do not see a positive value from importing PQExpBuffer complexity and cost into pgbench, as the resulting code is not very readable and it adds malloc/free cycles, so I'd try to avoid using PQExpBuf as much as possible. ISTM that all usages could be avoided in the patch, and most should be avoided even if ExpBuffer is imported because it is really useful somewhere.
- to call pgbench_error from pgbench_simple_error, you can do a pgbench_log_va(level, format, va_list) version called both from pgbench_error & pgbench_simple_error.

- for the PGBENCH_DEBUG function, do separate calls per type; the very small partial code duplication is worth avoiding ExpBuf IMO.

- for doCustom debug: I'd just let the printf as it is, with a comment, as it is really very internal stuff for debug. Or I'd just snprintf something in a static buffer.

- for syntax_error: it should terminate, so it should call pgbench_error(FATAL, ...). Idem, I'd either keep the printf then call pgbench_error(FATAL, "syntax error found\n") for a final message, or snprintf in a static buffer.

- for listAvailableScript: I'd simply call "pgbench_error(LOG" several times, once per line.

I see building a string with a format (printfExpBuf..) and then calling the pgbench_error function with just a "%s" format on the result as not very elegant, because the second format is somehow hacked around.

Ok! About using a static buffer in doCustom debug or in syntax_error - I'm not sure that this is always possible because ISTM that the variable name can be quite large.

* bool client

I'm unconvinced by this added boolean just to switch the level on encountered errors. I'd suggest to let lookupCreateVariable, putVariable* as they are, call pgbench_error with a level which does not stop the execution, and abort if necessary from the callers with an "aborted because of putVariable/eval/... error" message, as it was done before.

There's one more problem: if this is a client failure, an error message inside any of these functions should be printed at the level DEBUG_FAILS; otherwise it should be printed at the level LOG. Or do you suggest using the error level as an argument for these functions?

pgbench_error calls pgbench_error. Hmmm, why not.
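The first suggestion above - a shared va_list worker so that both entry points forward to one place and pgbench_error never calls itself - could look roughly like this. The level names and signatures are assumptions for illustration (the int return, echoing vfprintf, is just a convenience), not the committed API:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical error levels, loosely modeled on the discussion above. */
typedef enum { ELEVEL_DEBUG, ELEVEL_LOG, ELEVEL_FATAL } ErrorLevel;

/* Shared worker: both public entry points forward their va_list here. */
static int
pgbench_log_va(ErrorLevel level, const char *fmt, va_list args)
{
    int     n = vfprintf(stderr, fmt, args);

    if (level >= ELEVEL_FATAL)
        exit(1);                /* fatal messages terminate the run */
    return n;
}

static int
pgbench_error(ErrorLevel level, const char *fmt, ...)
{
    va_list args;
    int     n;

    va_start(args, fmt);
    n = pgbench_log_va(level, fmt, args);
    va_end(args);
    return n;
}

static int
pgbench_simple_error(const char *fmt, ...)
{
    va_list args;
    int     n;

    va_start(args, fmt);
    n = pgbench_log_va(ELEVEL_LOG, fmt, args);
    va_end(args);
    return n;
}
```

With this shape, neither public function needs to call the other, which sidesteps the recursion concern raised earlier in the thread.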
[1] https://www.postgresql.org/message-id/alpine.DEB.2.21.1806100837380.3655%40lancre [2] https://www.postgresql.org/message-id/b692de21caaed13c59f31c06d0098488%40postgrespro.ru -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
On 07-08-2018 19:21, Fabien COELHO wrote: Hello Marina, Hello, Fabien!

v10-0001-Pgbench-errors-use-the-RandomState-structure-for.patch - a patch for the RandomState structure (this is used to reset a client's random seed during the repeating of transactions after serialization/deadlock failures).

About this v10 part 1: Patch applies cleanly, compiles, global & local make check both ok. The random state is cleanly separated so that it will be easy to reset it on client error handling. ISTM that the pgbench side is deterministic with the separation of the seeds for different uses. Code is clean, comments are clear. :-)

I'm wondering what is the rationale for the "xseed" field name? In particular, what does the "x" stand for?

I called it "...seed" instead of "data" because perhaps "data" is too general a name for use here (but I'm not entirely sure what Alvaro Herrera meant in [1], see my answer in [2]). I called it "xseed" to match the arguments of the functions _dorand48 / pg_erand48 / pg_jrand48 in the file erand48.c. IIUC they use a linear congruential generator, and perhaps "xseed" means the sequence with the name X of pseudorandom values of size 48 bits (X_0, X_1, ... X_n) where X_0 is the seed / the start value.

[1] https://www.postgresql.org/message-id/20180711180417.3ytmmwmonsr5lra7@alvherre.pgsql

LGTM, though I'd rename the random_state struct members so that it wouldn't look as confusing. Maybe that's just me.

[2] https://www.postgresql.org/message-id/cb2cde10e4e7a10a38b48e9cae8fbd28%40postgrespro.ru -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
On 11-07-2018 22:34, Fabien COELHO wrote: can we try something like this? PGBENCH_ERROR_START(DEBUG_FAIL) { PGBENCH_ERROR("client %d repeats the failed transaction (try %d",

Argh, no? I was thinking of something much more trivial:

    pgbench_error(DEBUG, "message format %d %s...", 12, "hello world");

If you really need some complex dynamic buffer, and I would prefer that you avoid that, then the fallback is:

    if (level >= DEBUG)
    {
        initPQstuff();
        ...
        pgbench_error(DEBUG, "fixed message... %s\n", msg);
        freePQstuff();
    }

The point is to avoid building the message with dynamic allocation and so on if in the end it is not used.

Ok! About avoidance - I'm afraid there's one more piece of debugging code with the same problem:

    else if (command->type == META_COMMAND)
    {
        ...
        initPQExpBuffer(&errmsg_buf);
        printfPQExpBuffer(&errmsg_buf, "client %d executing \\%s",
                          st->id, argv[0]);
        for (i = 1; i < argc; i++)
            appendPQExpBuffer(&errmsg_buf, " %s", argv[i]);
        appendPQExpBufferChar(&errmsg_buf, '\n');
        ereport(ELEVEL_DEBUG, (errmsg("%s", errmsg_buf.data)));
        termPQExpBuffer(&errmsg_buf);

-- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
On 11-07-2018 21:04, Alvaro Herrera wrote: Just a quick skim while refreshing what were those error reporting API changes about ... Thank you!

On 2018-May-21, Marina Polyakova wrote: v9-0001-Pgbench-errors-use-the-RandomState-structure-for-.patch - a patch for the RandomState structure (this is used to reset a client's random seed during the repeating of transactions after serialization/deadlock failures).

LGTM, though I'd rename the random_state struct members so that it wouldn't look as confusing. Maybe that's just me.

IIUC, do you like "xseed" instead of "data"?

     typedef struct RandomState
     {
    -    unsigned short data[3];
    +    unsigned short xseed[3];
     } RandomState;

Or do you want to rename "random_state" in the structures RetryState / CState / TState? Thanks to Fabien Coelho's comments in [1], TState can contain several RandomStates for different purposes, something like this:

    /*
     * Thread state
     */
    typedef struct
    {
        ...
        /*
         * Separate randomness for each thread. Each thread option uses its
         * own random state to make all of them independent of each other
         * and therefore deterministic at the thread level.
         */
        RandomState choose_script_rs; /* random state for selecting a script */
        RandomState throttling_rs;    /* random state for transaction throttling */
        RandomState sampling_rs;      /* random state for log sampling */
        ...
    } TState;

v9-0002-Pgbench-errors-use-the-Variables-structure-for-cl.patch - a patch for the Variables structure (this is used to reset client variables during the repeating of transactions after serialization/deadlock failures).

Please don't allocate Variable structs one by one. First time allocate some decent number (say 8) and then enlarge by doubling the size. That way you save realloc overhead. We use this technique everywhere else, no reason to do differently here. Other than that, LGTM. Ok!
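Alvaro's allocation scheme (start at a decent number of slots, double when full) is the classic amortized-growth pattern; a minimal sketch, with all names hypothetical:

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Sketch of the suggested scheme: allocate 8 slots up front and double
 * when full, amortizing the realloc overhead.  Names are illustrative,
 * not from the actual patch.
 */
typedef struct
{
    char  **slots;      /* variable slots */
    int     nused;      /* slots in use */
    int     nallocated; /* slots allocated */
} VariablePool;

static void
ensure_slot(VariablePool *p)
{
    if (p->nused < p->nallocated)
        return;                     /* a free slot is available */

    p->nallocated = (p->nallocated == 0) ? 8 : p->nallocated * 2;
    p->slots = realloc(p->slots, p->nallocated * sizeof(char *));
    assert(p->slots != NULL);       /* pg_realloc would exit instead */
}
```

Doubling gives O(1) amortized insertion cost: n insertions trigger only O(log n) realloc calls, which is why the backend uses the same technique elsewhere.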
[1] https://www.postgresql.org/message-id/alpine.DEB.2.21.1806090810090.5307%40lancre

While reading your patch, it occurs to me that a run is not deterministic at the thread level under throttling and sampling, because the random state is solicited differently depending on when transactions end. This suggests that maybe each thread random_state use should have its own random state.

-- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
On 11-07-2018 20:49, Alvaro Herrera wrote: On 2018-Jul-11, Marina Polyakova wrote: can we try something like this?

    PGBENCH_ERROR_START(DEBUG_FAIL)
    {
        PGBENCH_ERROR("client %d repeats the failed transaction (try %d",
                      st->id, st->retries + 1);
        if (max_tries)
            PGBENCH_ERROR("/%d", max_tries);
        if (latency_limit)
        {
            PGBENCH_ERROR(", %.3f%% of the maximum time of tries was used",
                          getLatencyUsed(st, ));
        }
        PGBENCH_ERROR(")\n");
    }
    PGBENCH_ERROR_END();

I didn't quite understand what these PGBENCH_ERROR() functions/macros are supposed to do. Care to explain?

It is used only to print a string with the given arguments to stderr. Probably it might be just the function pgbench_error and not a macro..

P.S. This is my mistake, I did not think that PGBENCH_ERROR_END does not know the elevel for calling exit(1) if the elevel >= ERROR.

-- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
or not. ISTM that it will be shown in the final report. If I want debug, I ask for --debug, otherwise I think that the command should do what it was asked for, i.e. run scripts, collect performance statistics and show them at the end.

In particular, when running with retries enabled, the user is expecting deadlock/serialization errors, so they are not "errors" as such for them. They also help to understand which limit of retries was violated or how close we were to these limits during the execution of a specific transaction. But I agree with you that they are costly and can be skipped if the failure type is never retried. Maybe it is better to split them into multiple error function calls?..

Debugging message costs should only be incurred when under --debug, not otherwise. Ok!

IIUC instead of this part of the code

    initPQExpBuffer(&errmsg_buf);
    printfPQExpBuffer(&errmsg_buf,
                      "client %d repeats the failed transaction (try %d",
                      st->id, st->retries + 1);
    if (max_tries)
        appendPQExpBuffer(&errmsg_buf, "/%d", max_tries);
    if (latency_limit)
    {
        appendPQExpBuffer(&errmsg_buf,
                          ", %.3f%% of the maximum time of tries was used",
                          getLatencyUsed(st, ));
    }
    appendPQExpBufferStr(&errmsg_buf, ")\n");
    pgbench_error(DEBUG_FAIL, "%s", errmsg_buf.data);
    termPQExpBuffer(&errmsg_buf);

can we try something like this?

    PGBENCH_ERROR_START(DEBUG_FAIL)
    {
        PGBENCH_ERROR("client %d repeats the failed transaction (try %d",
                      st->id, st->retries + 1);
        if (max_tries)
            PGBENCH_ERROR("/%d", max_tries);
        if (latency_limit)
        {
            PGBENCH_ERROR(", %.3f%% of the maximum time of tries was used",
                          getLatencyUsed(st, ));
        }
        PGBENCH_ERROR(")\n");
    }
    PGBENCH_ERROR_END();

You have added 20-column alignment prints. This looks like too much and generates much too large lines. Probably 10 (billion) would be enough. I have already asked you about this in [2]: Probably:-) The variables for the numbers of failures and retries are of type int64 since the variable for the total number of transactions has the same type.
That's why such a large alignment (as I understand it now, 20 characters are enough). Do you prefer floating alignments, depending on the maximum number of failures/retries for any command in any script?

An int64 counter is not likely to reach its limit anytime soon:-) If the column display limit is ever reached, ISTM that then the text is just misaligned, which is a minor and rare inconvenience. If very wide columns are used, then it does not fit my terminal and the report text will always be wrapped around, which makes it harder to read, every time. Ok!

The latency limit to 900 ms try is a bad idea because it takes a lot of time. I did such tests before and they were removed by Tom Lane because of determinism and time issues. I would comment this test out for now. Ok! If it doesn't bother you - can you tell me more about the causes of these determinism issues?.. Tests for some other failures that cannot be retried are already added to 001_pgbench_with_server.pl.

Some farm animals are very slow, so you cannot really assume much about time one way or another. Thanks!

I do not understand why there is so much text about the "in failed sql transaction" stuff, while we are mainly interested in serialization & deadlock errors, and this only falls in some "other" category. There seems to be more details about other errors than about deadlock & serializable errors. The reporting should focus on what is of interest, either all errors, or some detailed split of these errors.

<...>

* "errors_in_failed_tx" is some subcounter of "errors", for a special case. Why it is there fails me [I finally understood, and I think it should be removed, see end of review]. If we wanted to distinguish, then we should distinguish homogeneously: maybe just count the different error types, eg have things like "deadlock_errors", "serializable_errors", "other_errors", "internal_pgbench_errors", which would be orthogonal to one another, and "errors" could be recomputed from these. Thank you, I agree with you.
Unfortunately each new error type adds 1 or 2 new columns of maximum width 20 to the per-statement report.

The fact that some data are collected does not mean that they should all be reported in detail. We can have a detailed error count and report the sum of these errors for instance, or have some more verbose/detailed reports as options (eg --latencies does just that). Ok!

-- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
(trigger one or a few deadlocks), idem for serializable, maybe idem for other errors if any.

<...>

The latency limit to 900 ms try is a bad idea because it takes a lot of time. I did such tests before and they were removed by Tom Lane because of determinism and time issues. I would comment this test out for now. Ok! If it doesn't bother you - can you tell me more about the causes of these determinism issues?.. Tests for some other failures that cannot be retried are already added to 001_pgbench_with_server.pl.

The challenge is to do that reliably and efficiently, i.e. so that the test does not rely on chance and is still quite efficient. The trick you use is to run an interactive psql in parallel to pgbench so as to play with concurrent locks. That is interesting, but deserves more comments and explanation, eg before the test functions. Maybe this could be achieved within pgbench by using some wait stuff in PL/pgSQL so that concurrent clients can wait for one another based on data in an unlogged table updated by a CALL within an "embedded" transaction? Not sure.

<...>

Anyway, TAP tests should be much lighter (in total time), and if possible much simpler. I'll try, thank you.. Otherwise, maybe a (simple) pgbench-side thread barrier could help, but this would require more thinking. Tests must pass if we use --disable-thread-safety..

Documentation = Not looked at in much detail for now. Just a few comments: Having the "most important settings" on lines 1-6 and 8 (i.e. skipping 7) looks silly. The important ones should simply be the first ones, and the 8th is not that important, or it is in 7th position. Ok!

I do not understand why there is so much text about the "in failed sql transaction" stuff, while we are mainly interested in serialization & deadlock errors, and this only falls in some "other" category. There seems to be more details about other errors than about deadlock & serializable errors.
The reporting should focus on what is of interest, either all errors, or some detailed split of these errors.

<...>

* "errors_in_failed_tx" is some subcounter of "errors", for a special case. Why it is there fails me [I finally understood, and I think it should be removed, see end of review]. If we wanted to distinguish, then we should distinguish homogeneously: maybe just count the different error types, eg have things like "deadlock_errors", "serializable_errors", "other_errors", "internal_pgbench_errors", which would be orthogonal to one another, and "errors" could be recomputed from these. Thank you, I agree with you.

Unfortunately each new error type adds 1 or 2 new columns of maximum width 20 to the per-statement report (to report errors and possibly retries of this type in this statement), and we already have 2 new columns for all errors and retries. So I'm not sure that we need to add anything other than statistics about all the errors and all the retries in general.

The documentation should state clearly what the counted errors are, and then what their effects on the reported stats are. The "Errors and Serialization/Deadlock Retries" section is a good start in that direction, but it does not talk about pgbench internal errors (eg "cos(true)"). I think it should be more explicit about errors. Thank you, I'll try to improve it.

Option --max-tries default value should be spelled out in the doc. If you mean that it is set to 1 if neither of the options --max-tries or --latency-limit is explicitly used, I'll fix this.

"Client's run is aborted", do you mean "Pgbench run is aborted"? No, other clients continue their run as usual.

* FailureStatus states are not homogeneously named. I'd suggest to use *_FAILURE for all cases. The miscellaneous case should probably be the last. I do not understand the distinction between ANOTHER_FAILURE & IN_FAILED_SQL_TRANSACTION. Why should it be needed?
[again, see end of review]

<...>

"If a failed transaction block does not terminate in the current script": this just looks like a very bad idea, and explains my general ranting above about this error condition. ISTM that the only reasonable option is that a pgbench script should be enforced as a transaction, or a set of transactions, but cannot be a "piece" of a transaction, i.e. a pgbench script with "BEGIN;" but without a corresponding "COMMIT" is a user error and warrants an abort, so that there is no need to manage these "in aborted transaction" errors everywhere and report about them and document them extensively.

This means adding a check when a script is finished or starting that PQtransactionStatus(const PGconn *conn) == PQTRANS_IDLE, and abort if not with a fatal error. Then we can forget about these "in tx errors" counting, reporting and so on, and just have to document the restriction. Ok!

[1] https://www.postgresql.org/message-id/alpine.DEB.2.20.1801031720270.20034%40lancre [2] https://www.postgresql.org/message-id/e4c5e8cefa4a8e88f1273b0f1ee29...@postgrespro.ru -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
On 13-06-2018 22:44, Fabien COELHO wrote: Hello Marina,

I suppose that this is related; because of my patch there may be a lot of such code (see v7 in [1]):

    - fprintf(stderr,
    -         "malformed variable \"%s\" value: \"%s\"\n",
    -         var->name, var->svalue);
    + if (debug_level >= DEBUG_FAILS)
    + {
    +     fprintf(stderr,
    +             "malformed variable \"%s\" value: \"%s\"\n",
    +             var->name, var->svalue);
    + }

    - if (debug)
    + if (debug_level >= DEBUG_ALL)
          fprintf(stderr, "client %d sending %s\n", st->id, sql);

I'm not sure that debug messages need to be kept after debug, if it is about debugging pgbench itself. That is debatable.

AFAICS it is not about debugging pgbench itself, but about more detailed information that can be used to understand what exactly happened during its launch. In the case of errors this helps to distinguish between failures or errors by type (including which limit for retries was violated and how far it was exceeded for the serialization/deadlock errors).

The code adapts/duplicates existing server-side "ereport" stuff and brings it to the frontend, where the logging needs are somehow quite different. I'd prefer to avoid duplication and/or have some code sharing.

I was recommended to use the same interface in [3]: On elog/errstart: we already have a convention for what ereport() calls look like; I suggest to use that instead of inventing your own.

The "elog" interface already exists, it is not an invention. "ereport" is a hack which is somehow necessary in some cases. I prefer a simple function call if possible for the purpose, and ISTM that this is the case. That is a lot of complication which is justified server-side where logging requirements are special, but in this case I see it as overkill. I think we need ereport() if we want to make detailed error messages (see examples in [1])..

If it really needs to be duplicated, I'd suggest to put all this stuff in separate files.
If we want to do that, I think that it would belong to fe_utils, where it could/should be used by all front-end programs. I'll try to do it..

Dunno. If you only need one "elog" function which prints a message to stderr and decides whether to abort/exit/whatever, maybe it can just be kept in pgbench. If there are several complicated functions and macros, better with a file. So I'd say it depends. So my current view is that if you only need an "elog" function, it is simpler to add it to "pgbench.c". Thank you!

For logging purposes, ISTM that the "elog" macro interface is nicer, closer to the existing "fprintf(stderr", as it would not introduce the additional parentheses hack for "rest".

I was also recommended to use ereport() instead of elog() in [3]: Probably. Are you hoping that advice from different reviewers will be consistent? That seems optimistic:-) To make the patch committable there should be no objection to it..

[1] https://www.postgresql.org/message-id/c89fcc380a19380260b5ea463efc1416%40postgrespro.ru -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
On 13-06-2018 22:59, Alvaro Herrera wrote: For context: in the backend, elog() is only used for internal messages (i.e. "can't-happen" conditions), and ereport() is used for user-facing messages. There are many things ereport() has that elog() doesn't, such as additional message fields (HINT, DETAIL, etc) that I think could have some use in pgbench as well. If you use elog() then you can't have that.

AFAIU originally it was not supposed that the pgbench error messages have these fields, so will it be good to change the final output to stderr?.. For example:

    - fprintf(stderr, "%s", PQerrorMessage(con));
    - fprintf(stderr, "(ignoring this error and continuing anyway)\n");
    + ereport(LOG,
    +         (errmsg("Ignoring the server error and continuing anyway"),
    +          errdetail("%s", PQerrorMessage(con))));

    - fprintf(stderr, "%s", PQerrorMessage(con));
    - if (sqlState && strcmp(sqlState, ERRCODE_UNDEFINED_TABLE) == 0)
    - {
    -     fprintf(stderr, "Perhaps you need to do initialization (\"pgbench -i\") in database \"%s\"\n",
    -             PQdb(con));
    - }
    -
    - exit(1);
    + ereport(ERROR,
    +         (errmsg("Server error"),
    +          errdetail("%s", PQerrorMessage(con)),
    +          sqlState && strcmp(sqlState, ERRCODE_UNDEFINED_TABLE) == 0 ?
    +          errhint("Perhaps you need to do initialization (\"pgbench -i\") in database \"%s\"\n",
    +                  PQdb(con)) : 0));

-- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
which maybe would exit or abort depending on level, and possibly not actually report under some levels and/or some conditions. For that, it could be enough to just provide a nice "elog" function.

I agree that elog() can be coded in this way. To use ereport() I need a structure to store the error level as a condition to exit.

In conclusion, which you can disagree with because maybe I have missed something... anyway I currently think that:

- it should be an independent submission
- possibly at "fe_utils" level
- possibly just a nice "elog" function is enough, if so just do that.

I hope I answered all this above..

[1] https://www.postgresql.org/message-id/453fa52de88477df2c4a2d82e09e461c%40postgrespro.ru [2] https://www.postgresql.org/message-id/20180405180807.0bc1114f%40wp.localdomain [3] https://www.postgresql.org/message-id/20180508105832.6o3uf3npfpjgk5m7%40alvherre.pgsql -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
On 09-06-2018 16:31, Fabien COELHO wrote: Hello Marina, Hello!

v9-0002-Pgbench-errors-use-the-Variables-structure-for-cl.patch - a patch for the Variables structure (this is used to reset client variables during the repeating of transactions after serialization/deadlock failures).

About this second patch: This extracts the variable-holding structure, so that it is somehow easier to reset them to their initial state on transaction failures, the management of which is the ultimate aim of this patch series. It is also cleaner this way. Patch applies cleanly on top of the previous one (there are no real interactions with it). It compiles cleanly. Global & pgbench "make check" are both ok. :-)

The structure typedef does not need a name. "typedef struct { } V...". Ok!

I tend to disagree with naming things after their type, eg "array". I'd suggest "vars" instead. "nvariables" could be "nvars" for consistency with that and "vars_sorted", and because "foo.variables->nvariables" starts looking heavy. I'd suggest to put the "Variables" type declaration just after the "Variable" type declaration in the file. Thank you, I agree and I'll fix all this.

-- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
On 09-06-2018 9:55, Fabien COELHO wrote: Hello Marina, Hello!

v9-0001-Pgbench-errors-use-the-RandomState-structure-for-.patch - a patch for the RandomState structure (this is used to reset a client's random seed during the repeating of transactions after serialization/deadlock failures).

A few comments about this first patch. Thank you very much!

Patch applies cleanly, compiles, global & pgbench "make check" ok. I'm mostly ok with the changes, which cleanly separate the different uses of random between threads (script choice, throttle delay, sampling...) and clients (random*() calls). Glad to hear it :)

This change is necessary so that a client can restart a transaction deterministically (at the client level at least), which is the ultimate aim of the patch series.

A few remarks: The RandomState struct is 6 bytes, which will induce some padding when used. This is life and pre-existing. No problem. ISTM that the struct itself does not need a name, ie. "typedef struct { ... } RandomState" is enough. Ok!

There could be clear comments, say in the TState and CState structs, about what randomness is impacted (i.e. script choices, etc.). Thank you, I'll add them.

getZipfianRand, computeHarmonicZipfian: the "thread" parameter was justified because it was used for two fields. As the random state is separated, I'd suggest that the other argument should be a zipfcache pointer. I agree with you and I will change it.

While reading your patch, it occurs to me that a run is not deterministic at the thread level under throttling and sampling, because the random state is solicited differently depending on when transactions end. This suggests that maybe each thread random_state use should have its own random state. Thank you, I'll fix this.

In passing, and totally unrelated to this patch: I've always been a little puzzled about why a quite small 48-bit internal state random generator is used.
I understand the need for pg to have a portable & state-controlled thread-safe random generator, but why this particular small one was chosen escapes me. The source code (src/port/erand48.c, copyright 1993...) looks optimized for 16-bit architectures, which is probably pretty inefficient to run on 64-bit architectures. Maybe this could be updated with something more consistent with today's processors, providing more quality at a lower cost. This sounds interesting, thanks! *went to look for a multiplier and a summand that are large enough and are mutually prime..* -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
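For reference, the generator under discussion is the classic rand48-family 48-bit linear congruential generator, x(n+1) = (a*x(n) + c) mod 2^48, with the standard constants a = 0x5DEECE66D and c = 0xB. On a 64-bit machine the whole step is one multiply-add plus a mask — a minimal sketch (not the erand48.c code itself, which juggles three 16-bit limbs):

```c
#include <assert.h>
#include <stdint.h>

/* The standard rand48 constants (POSIX drand48/erand48). */
#define RAND48_A	UINT64_C(0x5DEECE66D)
#define RAND48_C	UINT64_C(0xB)
#define RAND48_MASK	((UINT64_C(1) << 48) - 1)

/*
 * One step of the 48-bit LCG: x = (a*x + c) mod 2^48.
 * A single 64-bit multiply-add and a mask, versus the 16-bit limb
 * arithmetic the 1993-era code carries around.
 */
static uint64_t
lcg48_next(uint64_t *state)
{
	*state = (RAND48_A * *state + RAND48_C) & RAND48_MASK;
	return *state;
}
```

This is also why the state is only 6 bytes: the entire generator state is the 48-bit value of x.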
Re: [HACKERS] path toward faster partition pruning
On 09-05-2018 17:30, Alvaro Herrera wrote: Marina Polyakova wrote: Hello everyone in this thread! I got a similar server crash as in [1] on the master branch since the commit 9fdb675fc5d2de825414e05939727de8b120ae81 when the assertion fails because the second argument ScalarArrayOpExpr is not a Const or an ArrayExpr, but is an ArrayCoerceExpr (see [2]): Hello Marina, thanks for reporting this. I have pushed all fixes derived from this report -- thanks to Amit and Michaël for those. I verified your test case no longer crashes. If you have more elaborate test cases, please do try these too. Hello, thank you all very much! :) -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] path toward faster partition pruning
On 07-05-2018 4:37, Michael Paquier wrote: On Fri, May 04, 2018 at 12:32:23PM +0300, Marina Polyakova wrote: I got a similar server crash as in [1] on the master branch since the commit 9fdb675fc5d2de825414e05939727de8b120ae81 when the assertion fails because the second argument ScalarArrayOpExpr is not a Const or an ArrayExpr, but is an ArrayCoerceExpr (see [2]): Indeed, I can see the crash. I have been playing with this stuff and I am in the middle of writing the patch, but let's track this properly for now. Thank you very much! -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] path toward faster partition pruning
Hello everyone in this thread! I got a similar server crash as in [1] on the master branch since commit 9fdb675fc5d2de825414e05939727de8b120ae81: the assertion fails because the second argument of the ScalarArrayOpExpr is not a Const or an ArrayExpr, but an ArrayCoerceExpr (see [2]):

=# create table list_parted ( a varchar ) partition by list (a);
=# create table part_ab_cd partition of list_parted for values in ('ab', 'cd');
=# CREATE OR REPLACE FUNCTION public.x_stl_text_integer ( )
RETURNS text STABLE AS
$body$
BEGIN
	RAISE NOTICE 's text integer';
	RETURN 1::text;
END;
$body$
LANGUAGE 'plpgsql';
=# explain (costs off) select * from list_parted where a in ('ab', 'cd', x_stl_text_integer());
server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>

[1] https://www.postgresql.org/message-id/CAKcux6nCsCmu9oUnnuKZkeBenYvUFbU2Lt4q2MFNEb7QErzn8w%40mail.gmail.com
[2] partprune.c, function match_clause_to_partition_key:

if (IsA(rightop, Const))
{
	...
}
else
{
	ArrayExpr  *arrexpr = castNode(ArrayExpr, rightop);		# fails here
	...
}

-- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: Add support for printing/reading MergeAction nodes
Sorry for this late reply, I was very busy with the patch for pgbench.. On 04-04-2018 20:07, Simon Riggs wrote: ... Which debug mode are we talking about, please? -d 5 -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: Add support for printing/reading MergeAction nodes
I'm sorry I was very busy with the patch for pgbench.. On 04-04-2018 19:19, Tom Lane wrote: ... BTW, poking around in the grammar, I notice that MergeStmt did not get added to RuleActionStmt. That seems like a rather serious omission. Thank you very much! I will try to do this, if you do not mind, of course.. -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Add support for printing/reading MergeAction nodes
Hello, hackers! When debugging is enabled for server logging, isolation tests fail because there're no corresponding output functions for InsertStmt / DeleteStmt / UpdateStmt that are used in the output of the MergeAction nodes (see the attached regressions diffs and output). I also attached a try that makes the tests pass. Sorry if I missed that it was already discussed somewhere. -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index c8d9626..2411658 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -409,6 +409,68 @@ _outMergeAction(StringInfo str, const MergeAction *node)
 }
 
 static void
+_outInferClause(StringInfo str, const InferClause *node)
+{
+	WRITE_NODE_TYPE("INFERCLAUSE");
+
+	WRITE_NODE_FIELD(indexElems);
+	WRITE_NODE_FIELD(whereClause);
+	WRITE_STRING_FIELD(conname);
+	WRITE_LOCATION_FIELD(location);
+}
+
+static void
+_outOnConflictClause(StringInfo str, const OnConflictClause *node)
+{
+	WRITE_NODE_TYPE("ONCONFLICTCLAUSE");
+
+	WRITE_ENUM_FIELD(action, OnConflictAction);
+	WRITE_NODE_FIELD(infer);
+	WRITE_NODE_FIELD(targetList);
+	WRITE_NODE_FIELD(whereClause);
+	WRITE_LOCATION_FIELD(location);
+}
+
+static void
+_outInsertStmt(StringInfo str, const InsertStmt *node)
+{
+	WRITE_NODE_TYPE("INSERT");
+
+	WRITE_NODE_FIELD(relation);
+	WRITE_NODE_FIELD(cols);
+	WRITE_NODE_FIELD(selectStmt);
+	WRITE_NODE_FIELD(onConflictClause);
+	WRITE_NODE_FIELD(returningList);
+	WRITE_NODE_FIELD(withClause);
+	WRITE_ENUM_FIELD(override, OverridingKind);
+}
+
+static void
+_outDeleteStmt(StringInfo str, const DeleteStmt *node)
+{
+	WRITE_NODE_TYPE("DELETE");
+
+	WRITE_NODE_FIELD(relation);
+	WRITE_NODE_FIELD(usingClause);
+	WRITE_NODE_FIELD(whereClause);
+	WRITE_NODE_FIELD(returningList);
+	WRITE_NODE_FIELD(withClause);
+}
+
+static void
+_outUpdateStmt(StringInfo str, const UpdateStmt *node)
+{
+	WRITE_NODE_TYPE("UPDATE");
+
+	WRITE_NODE_FIELD(relation);
+	WRITE_NODE_FIELD(targetList);
+	WRITE_NODE_FIELD(whereClause);
+	WRITE_NODE_FIELD(fromClause);
+	WRITE_NODE_FIELD(returningList);
+	WRITE_NODE_FIELD(withClause);
+}
+
+static void
 _outAppend(StringInfo str, const Append *node)
 {
 	WRITE_NODE_TYPE("APPEND");
@@ -3682,6 +3744,21 @@ outNode(StringInfo str, const void *obj)
 			case T_MergeAction:
 				_outMergeAction(str, obj);
 				break;
+			case T_InferClause:
+				_outInferClause(str, obj);
+				break;
+			case T_OnConflictClause:
+				_outOnConflictClause(str, obj);
+				break;
+			case T_InsertStmt:
+				_outInsertStmt(str, obj);
+				break;
+			case T_DeleteStmt:
+				_outDeleteStmt(str, obj);
+				break;
+			case T_UpdateStmt:
+				_outUpdateStmt(str, obj);
+				break;
 			case T_Append:
 				_outAppend(str, obj);
 				break;
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 4518fa0..13891b1 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1620,6 +1620,93 @@ _readMergeAction(void)
 }
 
 /*
+ * _readInferClause
+ */
+static InferClause *
+_readInferClause(void)
+{
+	READ_LOCALS(InferClause);
+
+	READ_NODE_FIELD(indexElems);
+	READ_NODE_FIELD(whereClause);
+	READ_STRING_FIELD(conname);
+	READ_LOCATION_FIELD(location);
+
+	READ_DONE();
+}
+
+/*
+ * _readOnConflictClause
+ */
+static OnConflictClause *
+_readOnConflictClause(void)
+{
+	READ_LOCALS(OnConflictClause);
+
+	READ_ENUM_FIELD(action, OnConflictAction);
+	READ_NODE_FIELD(infer);
+	READ_NODE_FIELD(targetList);
+	READ_NODE_FIELD(whereClause);
+	READ_LOCATION_FIELD(location);
+
+	READ_DONE();
+}
+
+/*
+ * _readInsertStmt
+ */
+static InsertStmt *
+_readInsertStmt(void)
+{
+	READ_LOCALS(InsertStmt);
+
+	READ_NODE_FIELD(relation);
+	READ_NODE_FIELD(cols);
+	READ_NODE_FIELD(selectStmt);
+	READ_NODE_FIELD(onConflictClause);
+	READ_NODE_FIELD(returningList);
+	READ_NODE_FIELD(withClause);
+	READ_ENUM_FIELD(override, OverridingKind);
+
+	READ_DONE();
+}
+
+/*
+ * _readDeleteStmt
+ */
+static DeleteStmt *
+_readDeleteStmt(void)
+{
+	READ_LOCALS(DeleteStmt);
+
+	READ_NODE_FIELD(relation);
+	READ_NODE_FIELD(usingClause);
+	READ_NODE_FIELD(whereClause);
+	READ_NODE_FIELD(returningList);
+	READ_NODE_FIELD(withClause);
+
+	READ_DONE();
+}
+
+/*
+ * _readUpdateStmt
+ */
+static UpdateStmt *
+_readUpdateStmt(void)
+{
+	READ_LOCALS(UpdateStmt);
+
+	READ_NODE_FIELD(relation);
+	READ_NODE_FIELD(targetList);
+	READ_NODE_FIELD(whereClause);
+	READ_NODE_FIELD(fromClause);
+	READ_NODE_FIELD(returningList);
+	READ_NODE_FIELD(withClause);
+
+	READ_DONE();
+}
+
+/*
  * _readAppend
  */
 static Append *
@@ -2620,6 +2707,16 @@ parseNodeString(void)
 		return_value = _readModifyTable();
 	else if (MATCH("MERGEACTION", 11))
 		return_value = _readMergeAction();
+	else if (MATCH("INFERCLAUSE", 11))
+		return_value = _readInferClause();
+	else if (MATCH("ONCONFLICT
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
On 25-03-2018 15:23, Fabien COELHO wrote: Hm, I took a look at both threads about the patch, and it seems to me now that it's overcomplicated. With the recently committed enhancements of pgbench (\if, \when) it becomes close to impossible to retry a transaction in case of failure. So the initial approach of just rolling back such a transaction looks more attractive. Yep. Many thanks to both of you! I'm working on a patch in this direction.. I think that the best approach for now is simply to reset (command zero, random generator) and start over the whole script, without attempting to be more intelligent. The limitations should be clearly documented (one transaction per script), though. That would be a significant enhancement already. I'm not sure that we can always do this, because we can get new errors until we finish the failed transaction block, and we need to destroy the conditional stack.. -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
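The "reset and start over" approach Fabien describes can be sketched in a few lines. All names here (`CState`, `cs_func_rs`, `restartAfterFailure`) are hypothetical stand-ins for pgbench's client state, not the patch's actual code; the point is just that rewinding to command zero and restoring the saved random seed makes the retry replay the same random choices:

```c
#include <assert.h>

/* Hypothetical per-client random state, as in the RandomState patch. */
typedef struct
{
	unsigned short xseed[3];	/* 48-bit seed for random*() calls */
} RandomState;

/* Hypothetical, heavily simplified pgbench client state. */
typedef struct
{
	int			command;		/* current command number in the script */
	RandomState cs_func_rs;		/* random state used by script functions */
	int			retries;		/* how many times we started over */
} CState;

/*
 * Start the whole script over after a serialization/deadlock failure:
 * rewind to command zero and restore the random state saved at script
 * start, so the retried run is deterministic at the client level.
 */
static void
restartAfterFailure(CState *st, const RandomState *saved)
{
	st->command = 0;
	st->cs_func_rs = *saved;
	st->retries++;
}
```

Marina's caveat still applies: before this reset can happen, the failed transaction block must be finished and the conditional (\if) stack torn down.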
Re: WIP Patch: Pgbench Serialization and deadlock errors
On 05-03-2018 18:21, David Steele wrote: Hello Marina, Hello, David! On 1/12/18 12:01 PM, Marina Polyakova wrote: ... This patch was marked Waiting on Author on Jan 8 and no new patch was submitted before this commitfest. I think we should mark this patch as Returned with Feedback. I'm sorry, I was very busy with the patch to precalculate stable functions.. I'm working on a new version of this patch for pgbench unless, of course, it's too late for v11. -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: WIP Patch: Precalculate stable functions, infrastructure v1
Ok! On 02-03-2018 22:56, Andres Freund wrote: Hi, On 2018-03-02 11:22:01 +0300, Marina Polyakova wrote: I fixed the failure that Thomas pointed out to me, and I'm finishing work on it, but it took me a while to study this part of the executor.. I unfortunately think that makes this too late for v11, and we should mark this as returned with feedback. Greetings, Andres Freund -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: WIP Patch: Precalculate stable functions, infrastructure v1
Hello! I fixed the failure that Thomas pointed out to me, and I'm finishing work on it, but it took me a while to study this part of the executor.. On 02-03-2018 0:11, Andres Freund wrote: Hi, On 2018-02-01 08:01:48 +0300, Marina Polyakova wrote: > ISTM, there might be some value to consider all of them in the design of > the new mechanism. I'm sorry, the other parts have occupied all the time, and I'll work on it.. That has, as far as I can see, not happened. And the patch has been reported as failing by Thomas a while ago. So I'm inclined to mark this as returned with feedback for this CF? - Andres -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: master check fails on Windows Server 2008
On 21-02-2018 18:51, Tom Lane wrote: Marina Polyakova <m.polyak...@postgrespro.ru> writes: On 20-02-2018 21:23, Tom Lane wrote: I continue to wonder if it's not better to just remove the option and thereby simplify our lives. What's the actual value of having it anymore? I agree with you, but I have too little experience to vote for removing this option. I've started a separate thread to propose removal of the option, at https://postgr.es/m/10862.1519228...@sss.pgh.pa.us Thank you! -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: master check fails on Windows Server 2008
On 20-02-2018 21:23, Tom Lane wrote: Marina Polyakova <m.polyak...@postgrespro.ru> writes: On 20-02-2018 3:37, Tom Lane wrote: 4. Try to tweak the stats_ext.sql test conditions in some more refined way to get the test to pass everywhere. This'd be a lot of work with no guarantee of success, so I'm not too excited about it. Thank you for your explanations! I'll try to do something in this direction.. OK. The least painful fix might be to establish a different work_mem setting just for that one query. However, if you're intent on putting work into continued support of --disable-float8-byval, I would *strongly* suggest setting up a buildfarm member that runs that way, because otherwise we're pretty much guaranteed to break it again. Oh, thank you again! I continue to wonder if it's not better to just remove the option and thereby simplify our lives. What's the actual value of having it anymore? I agree with you, but I have too little experience to vote for removing this option. -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: master check fails on Windows Server 2008
On 20-02-2018 3:37, Tom Lane wrote: Ah-hah. I can reproduce the described failure if I configure with --disable-float8-byval on an otherwise 64-bit machine. It appears that the minimum work_mem setting that will allow this query to use a hashagg plan on such a configuration is about 155kB (which squares with the results you show). On the other hand, in a normal 64-bit configuration, work_mem settings of 160kB or more cause other failures (due to other plans switching to hashagg), and on a 32-bit machine I see such failures with work_mem of 150kB or more. So there's basically no setting of work_mem that would make all these tests pass everywhere. I see several things we could do about this:

1. Nothing; just say "sorry, we don't promise that the regression tests pass with no plan differences on nonstandard configurations". Given that --disable-float8-byval has hardly any real-world use, there is not a lot of downside to that.

2. Decide that --disable-float8-byval, and for that matter --disable-float4-byval, have no real-world use at all and take them out. There was some point in those options back when we cared about binary compatibility with version-zero C functions, but now I'm not sure why anyone would use them.

3. Drop that one test case from stats_ext.sql; I'm not sure how much additional test value it's actually bringing.

4. Try to tweak the stats_ext.sql test conditions in some more refined way to get the test to pass everywhere. This'd be a lot of work with no guarantee of success, so I'm not too excited about it.

Thank you for your explanations! I'll try to do something in this direction..

5. Something else?

regards, tom lane

-- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
master check fails on Windows Server 2008
Hello, hackers! I got a permanent failure of master (commit 2a41507dab0f293ff241fe8ae326065998668af8) check on Windows Server 2008. Regression output and diffs as well as config.pl are attached. I used the following commands:

build.bat > build.txt
vcregress.bat check > check.txt

Binary search has shown that this failure begins with commit bed9ef5a16239d91d97a1fa2efd9309c3cbbc4b2 (Rework the stats_ext test). On the previous commit (70ec3f1f8f0b753c38a1a582280a02930d7cac5f) the check passes. I'm trying to figure out what went wrong, and any suspicions are welcome. About the system: Windows Server 2008 R2 Standard, Service Pack 1, 64-bit. -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

*** C:/Users/buildfarm/mpolyakova/postgrespro_master/src/test/regress/expected/stats_ext.out	Fri Feb 16 12:56:00 2018
--- C:/Users/buildfarm/mpolyakova/postgrespro_master/src/test/regress/results/stats_ext.out	Fri Feb 16 13:09:47 2018
***************
*** 312,322 ****
  EXPLAIN (COSTS off) SELECT COUNT(*) FROM ndistinct GROUP BY b, c, d;
        QUERY PLAN
! -
! HashAggregate
     Group Key: b, c, d
     ->  Seq Scan on ndistinct
! (3 rows)
  
  EXPLAIN (COSTS off) SELECT COUNT(*) FROM ndistinct GROUP BY a, d;
--- 312,324 ----
  EXPLAIN (COSTS off) SELECT COUNT(*) FROM ndistinct GROUP BY b, c, d;
        QUERY PLAN
! ---
! GroupAggregate
     Group Key: b, c, d
+   ->  Sort
+     Sort Key: b, c, d
     ->  Seq Scan on ndistinct
! (5 rows)

==

test tablespace               ... ok
parallel group (20 tests):  boolean char name varchar text int2 int4 txid int8 bit float8 float4 oid pg_lsn regproc rangetypes enum money uuid numeric
     boolean                  ... ok
     char                     ... ok
     name                     ... ok
     varchar                  ... ok
     text                     ... ok
     int2                     ... ok
     int4                     ... ok
     int8                     ... ok
     oid                      ... ok
     float4                   ... ok
     float8                   ... ok
     bit                      ... ok
     numeric                  ... ok
     txid                     ... ok
     uuid                     ... ok
     enum                     ... ok
     money                    ... ok
     rangetypes               ... ok
     pg_lsn                   ... ok
     regproc                  ... ok
test strings                  ... ok
test numerology               ... ok
parallel group (20 tests):  lseg box path line point tstypes reltime circle interval date time timetz macaddr8 tinterval abstime macaddr inet timestamptz timestamp polygon
     point                    ... ok
     lseg                     ... ok
     line                     ... ok
     box                      ... ok
     path                     ... ok
     polygon                  ... ok
     circle                   ... ok
     date                     ... ok
     time                     ... ok
     timetz                   ... ok
     timestamp                ... ok
     timestamptz              ... ok
     interval                 ... ok
     abstime                  ... ok
     reltime                  ... ok
     tinterval                ... ok
     inet                     ... ok
     macaddr                  ... ok
     macaddr8                 ... ok
     tstypes                  ... ok
parallel group (9 tests):  geometry horology expressions oidjoins misc_sanity type_sanity comments regex opr_sanity
     geometry                 ... ok
     horology                 ... ok
     regex                    ... ok
     oidjoins                 ... ok
     type_sanity              ... ok
     opr_sanity               ... ok
     misc_sanity              ... ok
     comments                 ... ok
     expressions              ... ok
test insert                   ... ok
test insert_conflict          ... ok
test create_function_1        ... ok
test create_type              ... ok
test create_table             ... ok
test create_function_2        ... ok
parallel group (3 tests):  copy copydml copyselect
     copy                     ... ok
     copyselect               ... ok
     copydml                  ... ok
parallel group (3 tests):  create_misc create_operator create_procedure
     create_misc              ... ok
     create_operator          ... ok
     create_procedure         ... ok
parallel group (2 tests):  create_view create_index
     create_index             ... ok
     create_view              ... ok
parallel group (15 tests):  create_aggregate create_function_3 create_cast roleattributes create_am typed_table hash_func vacuum drop_if_exis
Re: master plpython check fails on Solaris 10
On 14-02-2018 17:54, Tom Lane wrote: Marina Polyakova <m.polyak...@postgrespro.ru> writes: On 14-02-2018 3:43, Peter Eisentraut wrote: OK, can you get some kind of stack trace or other debugging information? I got this backtrace from gdb: Hmm, so the only question in my mind is how did this ever work for anyone? The basic problem here is that, instead of using the ErrorContextCallback.arg field provided for the purpose, plpython_error_callback is using PLy_current_execution_context() to try to identify the context it's supposed to report on. In the example, that points to the context associated with the inner DO block, not with the outer procedure. That context looks like it should reliably have curr_proc->proname == NULL, so how come this test case doesn't crash for everybody? In any case the expected output for the transaction_test4 case is obviously wrong. Rather than reporting the transaction_test4 function and then the inner DO block, it's reporting 2 levels of transaction_test4. That seems to trace directly to both levels of error context callback looking at the same execution context. I think we need to fix the error callback code so that it uses the "arg" field to find the relevant procedure, and that that's a back-patchable fix, because nested plpython functions would show this up as wrong in any case. That would also let us undo the not-terribly-safe practice of having code between the PLy_push_execution_context call and the PG_TRY that is supposed to ensure the context gets popped again. Thank you very much! I'll try to implement this. While I'm bitching ... I wonder how long it's been since the comment for PLy_procedure_name() had anything to do with its actual behavior. AFAIU this has always been like that, since the commit 1ca717f3... -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
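The fix Tom describes — identify the procedure through `ErrorContextCallback.arg` rather than through a global "current execution context" — follows PostgreSQL's standard error-context pattern. A minimal standalone sketch (the stack-walking "error report" is simulated here; `run_with_context` and `last_context` are hypothetical demo scaffolding, not the real elog machinery):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified version of PostgreSQL's error-context callback stack. */
typedef struct ErrorContextCallback
{
	struct ErrorContextCallback *previous;
	void		(*callback) (void *arg);
	void	   *arg;
} ErrorContextCallback;

static ErrorContextCallback *error_context_stack = NULL;
static char last_context[64];

/*
 * The point of the fix: report the procedure name handed to us via
 * "arg", not whatever some global "current execution context" says.
 */
static void
error_callback(void *arg)
{
	strncpy(last_context, (const char *) arg, sizeof(last_context) - 1);
}

static void
run_with_context(const char *proname)
{
	ErrorContextCallback cb;

	cb.previous = error_context_stack;
	cb.callback = error_callback;
	cb.arg = (void *) proname;	/* identifies *this* procedure */
	error_context_stack = &cb;

	/* ... body; on error, elog would walk the stack like this ... */
	for (ErrorContextCallback *c = error_context_stack; c; c = c->previous)
		c->callback(c->arg);

	error_context_stack = cb.previous;
}
```

With `arg` carrying the procedure, a nested DO block pushes its own entry without disturbing what the outer function's callback reports — which is exactly what goes wrong when both callbacks consult the same global context.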
Re: master plpython check fails on Solaris 10
On 14-02-2018 3:43, Peter Eisentraut wrote: On 2/13/18 05:40, Marina Polyakova wrote: Binary search has shown that this failure begins with commit 8561e4840c81f7e345be2df170839846814fa004 (Transaction control in PL procedures.). On the previous commit (b9ff79b8f17697f3df492017d454caa9920a7183) there's no plpython_transaction test and plpython check passes. OK, can you get some kind of stack trace or other debugging information? I got this backtrace from gdb:

Program received signal SIGSEGV, Segmentation fault.
0x7d13bb50 in strlen () from /lib/64/libc.so.1
(gdb) bt
#0  0x7d13bb50 in strlen () from /lib/64/libc.so.1
#1  0x7d1aabf8 in _ndoprnt () from /lib/64/libc.so.1
#2  0x7d1ad3d4 in vsnprintf () from /lib/64/libc.so.1
#3  0x0001009ab174 in pvsnprintf (buf=0x100d680e3 "PL/Python function \"", '\177' times>..., len=933, fmt=0x100d684a0 "PL/Python function \"%s\"", args=0x7fff77f8) at psprintf.c:121
#4  0x000100492bd0 in appendStringInfoVA (str=0x7fff7738, fmt=0x100d684a0 "PL/Python function \"%s\"", args=0x7fff77f8) at stringinfo.c:130
#5  0x000100911038 in errcontext_msg (fmt=0x7aa1eb98 "PL/Python function \"%s\"") at elog.c:1021
#6  0x7aa11ea8 in plpython_error_callback (arg=0x100e7d7e8) at plpy_main.c:402
#7  0x00010090e9f8 in errfinish (dummy=0) at elog.c:438
#8  0x000100482e78 in SPI_commit () at spi.c:211
#9  0x7aa13e90 in PLy_commit (self=0x0, args=0x0) at plpy_plpymodule.c:602
#10 0x71a1ca40 in PyEval_EvalFrameEx (f=0x70619548, throwflag=2147449896) at Python/ceval.c:4004
#11 0x71a1d8a0 in PyEval_EvalFrameEx (f=0x706193a0, throwflag=2147450456) at Python/ceval.c:4106
#12 0x71a1ed20 in PyEval_EvalCodeEx (co=0x7060e6b0, globals=0x705236b4, locals=0x71bf9940 <_PyThreadState_Current>, args=0x0, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3252
#13 0x71a1ef64 in PyEval_EvalCode (co=0x7060e6b0, globals=0x70525c58, locals=0x70525c58) at Python/ceval.c:667
#14 0x7aa10938 in PLy_procedure_call (proc=0x7fff8580, kargs=0x7aa1e128 "args", vargs=0x7060d290) at plpy_exec.c:1031
#15 0x7aa0cf08 in PLy_exec_function (fcinfo=0x7fff86f0, proc=0x7fff8580) at plpy_exec.c:107
#16 0x7aa11c64 in plpython_inline_handler (fcinfo=0x7fff8be0) at plpy_main.c:353
#17 0x00010091e44c in OidFunctionCall1Coll (functionId=16385, collation=0, arg1=4311287992) at fmgr.c:1327
#18 0x0001003746ec in ExecuteDoStmt (stmt=0x100f8fa70, atomic=1 '\001') at functioncmds.c:2183
#19 0x0001006f420c in standard_ProcessUtility (pstmt=0x100f90768, queryString=0x70528054 "DO LANGUAGE plpythonu $x$ plpy.commit() $x$", context=PROCESS_UTILITY_QUERY, params=0x0, queryEnv=0x0, dest=0x100cd6708 , completionTag=0x7fff93f8 "") at utility.c:533
#20 0x0001006f3b04 in ProcessUtility (pstmt=0x100f90768, queryString=0x70528054 "DO LANGUAGE plpythonu $x$ plpy.commit() $x$", context=PROCESS_UTILITY_QUERY, params=0x0, queryEnv=0x0, dest=0x100cd6708 , completionTag=0x7fff93f8 "") at utility.c:358
#21 0x0001004886e0 in _SPI_execute_plan (plan=0x7fff95b0, paramLI=0x0, snapshot=0x0, crosscheck_snapshot=0x0, read_only=0 '\000', fire_triggers=1 '\001', tcount=0) at spi.c:2207
#22 0x000100483980 in SPI_execute (src=0x70528054 "DO LANGUAGE plpythonu $x$ plpy.commit() $x$", read_only=0 '\000', tcount=0) at spi.c:416
#23 0x7aa17698 in PLy_spi_execute_query (query=0x70528054 "DO LANGUAGE plpythonu $x$ plpy.commit() $x$", limit=0) at plpy_spi.c:331
#24 0x7aa16c3c in PLy_spi_execute (self=0x0, args=0x7051fb50) at plpy_spi.c:168
#25 0x719af824 in PyCFunction_Call (func=0x7062a098, arg=0x7051fb50, kw=0x0) at Objects/methodobject.c:116
#26 0x71a1cfb0 in PyEval_EvalFrameEx (f=0x7af29c20, throwflag=2147457704) at Python/ceval.c:4020
#27 0x71a1d8a0 in PyEval_EvalFrameEx (f=0x706191f8, throwflag=2147458264) at Python/ceval.c:4106
#28 0x71a1ed20 in PyEval_EvalCodeEx (co=0x7072cf30, globals=0x70523624, locals=0x71bf9940 <_PyThreadState_Current>, args=0x0, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:3252
#29 0x71a1ef64 in PyEval_EvalCode (co=0x7072cf30, globals=0x70525910, locals=0x70525910) at Python/ceval.c:667
---Type <return> to continue, or q to quit---
#30 0x7aa10938 in PLy_procedure_
master plpython check fails on Solaris 10
Hello, hackers! I got a permanent failure of master (commit ebdb42a0d6a61b93a5bb9f4204408edf5959332c) plpython check on Solaris 10. Regression output and diffs are attached. I used the following commands:

./configure CC="ccache gcc" CFLAGS="-m64 -I/opt/csw/include" PKG_CONFIG_PATH="/opt/csw/lib/pkgconfig:/usr/local/lib/pkgconfig" LDFLAGS="-L/opt/csw/lib/sparcv9 -L/usr/local/lib/64" --enable-cassert --enable-debug --enable-nls --enable-tap-tests --with-perl --with-tcl --with-python --with-gssapi --with-openssl --with-ldap --with-libxml --with-libxslt --with-icu
gmake > make_results.txt
gmake -C src/pl/plpython check

Binary search has shown that this failure begins with commit 8561e4840c81f7e345be2df170839846814fa004 (Transaction control in PL procedures.). On the previous commit (b9ff79b8f17697f3df492017d454caa9920a7183) there's no plpython_transaction test and plpython check passes. About the system: SunOS, Release 5.10, KernelID Generic_141444-09.

P.S. It seems that there's a similar failure on Windows, and I'm trying to reproduce it..

-- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

*** /home/buildfarm/mpolyakova/postgrespro_core_353_stable_func/src/pl/plpython/expected/plpython_transaction.out	Tue Feb 13 13:00:33 2018
--- /home/buildfarm/mpolyakova/postgrespro_core_353_stable_func/src/pl/plpython/results/plpython_transaction.out	Tue Feb 13 13:34:39 2018
***************
*** 88,136 ****
  return 1
  $$;
  SELECT transaction_test4();
! ERROR:  spiexceptions.InvalidTransactionTermination: invalid transaction termination
! CONTEXT:  Traceback (most recent call last):
!   PL/Python function "transaction_test4", line 2, in <module>
!     plpy.execute("DO LANGUAGE plpythonu $x$ plpy.commit() $x$")
! PL/Python function "transaction_test4"
! -- commit inside subtransaction (prohibited)
! DO LANGUAGE plpythonu $$
! s = plpy.subtransaction()
! s.enter()
! plpy.commit()
! $$;
! WARNING:  forcibly aborting a subtransaction that has not been exited
! ERROR:  cannot commit while a subtransaction is active
! CONTEXT:  PL/Python anonymous code block
! -- commit inside cursor loop
! CREATE TABLE test2 (x int);
! INSERT INTO test2 VALUES (0), (1), (2), (3), (4);
! TRUNCATE test1;
! DO LANGUAGE plpythonu $$
! for row in plpy.cursor("SELECT * FROM test2 ORDER BY x"):
!     plpy.execute("INSERT INTO test1 (a) VALUES (%s)" % row['x'])
!     plpy.commit()
! $$;
! ERROR:  cannot commit transaction while a cursor is open
! CONTEXT:  PL/Python anonymous code block
! SELECT * FROM test1;
!  a | b
! ---+---
! (0 rows)
!
! -- rollback inside cursor loop
! TRUNCATE test1;
! DO LANGUAGE plpythonu $$
! for row in plpy.cursor("SELECT * FROM test2 ORDER BY x"):
!     plpy.execute("INSERT INTO test1 (a) VALUES (%s)" % row['x'])
!     plpy.rollback()
! $$;
! ERROR:  cannot abort transaction while a cursor is open
! CONTEXT:  PL/Python anonymous code block
! SELECT * FROM test1;
!  a | b
! ---+---
! (0 rows)
!
! DROP TABLE test1;
! DROP TABLE test2;
--- 88,94 ----
  return 1
  $$;
  SELECT transaction_test4();
! server closed the connection unexpectedly
! 	This probably means the server terminated abnormally
! 	before or while processing the request.
! connection to server was lost

==

*** /home/buildfarm/mpolyakova/postgrespro_core_353_stable_func/src/pl/plpython/expected/plpython_drop.out	Wed Feb 7 17:27:50 2018
--- /home/buildfarm/mpolyakova/postgrespro_core_353_stable_func/src/pl/plpython/results/plpython_drop.out	Tue Feb 13 13:34:39 2018
***************
*** 1,6 ****
! --
! -- For paranoia's sake, don't leave an untrusted language sitting around
! --
! SET client_min_messages = WARNING;
! DROP EXTENSION plpythonu CASCADE;
! DROP EXTENSION IF EXISTS plpython2u CASCADE;
--- 1 ----
! psql: FATAL:  the database system is in recovery mode

==

test plpython_schema          ... ok
test plpython_populate        ... ok
test plpython_test            ... ok
test plpython_do              ... ok
test plpython_global          ... ok
test plpython_import          ... ok
test plpython_spi             ... ok
test plpython_newline         ... ok
test plpython_void            ... ok
test plpython_call            ... ok
test plpython_params          ... ok
test plpython_setof           ... ok
test plpython_record          ... ok
test plpython_trigger         ... ok
test plpython_types           ... ok
test plpython_error           ... ok
test plpython_ereport         ... ok
test plpython_unicode         ... ok
test plpython_quote           ... ok
test plpython_composite      ... ok
test plpython_subtransaction ... ok
test plpython_transaction    ... FAILED (test process exited with exit code 2)
test plpython_drop           ... FAILED (test process exited with exit code 2)
Re: WIP Patch: Precalculate stable functions, infrastructure v1
Hello! Thank you for reporting! I'll try to get it on our buildfarm.. On 05-02-2018 0:10, Thomas Munro wrote: On Thu, Feb 1, 2018 at 6:01 PM, Marina Polyakova <m.polyak...@postgrespro.ru> wrote: This is the 8-th version of the patch for the precalculation of stable or immutable functions, stable or immutable operators and other nonvolatile expressions. This is a try to fix the most problems (I'm sorry, it took some time..) that Tom Lane and Andres Freund mentioned in [1], [2] and [3]. It is based on the top of master, and on my computer make check-world passes. And I'll continue work on it. Hi Marina, FYI I saw a repeatable crash in the contrib regression tests when running make check-world with this patch applied. test hstore_plperl... FAILED (test process exited with exit code 2) test hstore_plperlu ... FAILED (test process exited with exit code 2) test create_transform ... FAILED I'm not sure why it passes for you but fails here, but we can see from the backtrace[1] that ExecInitExprRec is receiving a null node pointer on this Ubuntu Trusty GCC 4.8 amd64 system. [1] https://travis-ci.org/postgresql-cfbot/postgresql/builds/337255374 -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: WIP Patch: Precalculate stable functions, infrastructure v1
Hello! On 24-01-2018 23:36, Andres Freund wrote: On 2018-01-24 15:10:56 -0500, Tom Lane wrote: ... Keeping the stored value of a CachedExpr in a Param slot is an interesting idea indeed. We keep coming back to this, IIRC we had a pretty similar discussion around redesigning caseValue_datum/isNull domainValue_datum/isNull to be less ugly. There also was https://www.postgresql.org/message-id/20171116182208.kcvf75nfaldv3...@alap3.anarazel.de where we discussed using something similar to PARAM_EXEC Param nodes to allow inlining of volatile functions. ISTM, there might be some value to consider all of them in the design of the new mechanism. Thank you both very much for this discussion and for the link on that thread! Now I'm working on the patch, thanks to Tom Lane's comments earlier [1], and I'll try to implement something of this.. [1] https://www.postgresql.org/message-id/403e0ae329c6868b3f3467eac92cc04d%40postgrespro.ru -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: WIP Patch: Precalculate stable functions, infrastructure v1
On 17-01-2018 1:05, Tom Lane wrote: [ I'm sending this comment separately because I think it's an issue Andres might take an interest in. ] Marina Polyakova <m.polyak...@postgrespro.ru> writes: [ v7-0001-Precalculate-stable-and-immutable-functions.patch ] Another thing that's bothering me is that the execution semantics you're proposing for CachedExpr seem rather inflexible. AFAICS, once a CachedExpr has run once, it will hang on to the result value and keep returning that for the entire lifespan of the compiled expression. We already noted that that breaks plpgsql's "simple expression" logic, and it seems inevitable to me that it will be an issue for other places as well. I think it'd be a better design if we had some provision for resetting the cached values, short of recompiling the expression from scratch. One way that occurs to me to do this is to replace the simple boolean isExecuted flags with a generation counter, and add a master generation counter to ExprState. The rule for executing CachedExpr would be "if my generation counter is different from the ExprState's counter, then evaluate the subexpression and copy the ExprState's counter into mine". Then the procedure for forcing recalculation of cached values is just to increment the ExprState's counter. There are other ways one could imagine doing this --- for instance, I initially thought of keeping the master counter in the ExprContext being used to run the expression. But you need some way to remember what counter value was used last with a particular expression, so probably keeping it in ExprState is better. Or we could just brute-force it by providing a function that runs through a compiled expression step list and resets the isExecuted flag for each EEOP_CACHEDEXPR_IF_CACHED step it finds. A slightly less brute-force way is to link those steps together in a list, so that the function doesn't have to visit irrelevant steps. 
If the reset function were seldom used then the extra cycles for this wouldn't be very expensive. But I'm not sure it will be seldom used --- it seems like plpgsql simple expressions will be doing this every time --- so I think the counter approach might be a better idea. Thank you very much! I'll try to implement something from this. I'm curious to know whether Andres has some other ideas, or whether he feels this is all a big wart on the compiled-expression concept. I don't think there are any existing cases where we keep any meaningful state across executions of a compiled-expression data structure; maybe that's a bad idea in itself. -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
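The generation-counter scheme Tom describes can be made concrete with a minimal C sketch. All names here are hypothetical stand-ins, not the actual executor structures or the real Datum-based cached values:

```c
#include <stdint.h>
#include <assert.h>

/* Hypothetical simplification of ExprState: only the master counter. */
typedef struct ExprStateSketch
{
    uint64_t generation;    /* bump this to invalidate every cached value */
} ExprStateSketch;

/* Hypothetical simplification of a CachedExpr's per-expression state. */
typedef struct CachedExprSketch
{
    uint64_t generation;    /* counter value seen at the last evaluation */
    int      cached_value;  /* stand-in for the saved Datum */
} CachedExprSketch;

int n_evals = 0;            /* counts real evaluations, for demonstration */

int evaluate_subexpr(void)
{
    n_evals++;
    return 42;              /* pretend this is an expensive stable function */
}

/* "If my generation counter differs from the ExprState's counter, then
 * evaluate the subexpression and copy the ExprState's counter into mine." */
int eval_cached(ExprStateSketch *es, CachedExprSketch *ce)
{
    if (ce->generation != es->generation)
    {
        ce->cached_value = evaluate_subexpr();
        ce->generation = es->generation;
    }
    return ce->cached_value;
}

/* Forcing recalculation is one increment, no matter how many CachedExprs
 * the compiled expression contains. */
void reset_cached_exprs(ExprStateSketch *es)
{
    es->generation++;
}
```

Starting the master counter at a value no CachedExpr is initialized with (e.g. master at 1, caches at 0) makes the first execution evaluate naturally without a special case.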
Re: WIP Patch: Precalculate stable functions, infrastructure v1
the infrastructure of ParamListInfoData. And what in the world is CheckBoundParams about? The internal documentation in this patch isn't quite nonexistent, but it's well short of being in a committable state IMO. I'll try to improve it, for CheckBoundParams (if I understood you correctly) and others. -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: master make check fails on Solaris 10
On 18-01-2018 20:49, Marina Polyakova wrote: On 18-01-2018 20:34, Tom Lane wrote: ... What you could do in the meantime is work on finding a variation of Victor's test that will detect the bug regardless of -O level. If we do have hope that future gcc versions will handle this correctly, we'll need a better test rather than just summarily dismissing host_cpu = sparc. Thanks, I'll try.. I tried different gcc options, but it did not help.. Perhaps searching the gcc source code would clarify something, but I'm sorry, I'm too busy for that right now.. -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: master make check fails on Solaris 10
On 18-01-2018 20:34, Tom Lane wrote: I wrote: ... But ... let's not panic, but wait and see the final result of the discussion on the gcc PR. Jakub at least seems to think it ought to be a supportable case. What you could do in the meantime is work on finding a variation of Victor's test that will detect the bug regardless of -O level. If we do have hope that future gcc versions will handle this correctly, we'll need a better test rather than just summarily dismissing host_cpu = sparc. Thanks, I'll try.. -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: master make check fails on Solaris 10
On 18-01-2018 20:24, Tom Lane wrote: Marina Polyakova <m.polyak...@postgrespro.ru> writes: On 18-01-2018 19:53, Tom Lane wrote: So basically, we're outta luck and we have to consider __int128 as unsupportable on SPARC. I'm inclined to mechanize that as a test on $host_cpu. At least that means we don't need an AC_RUN test ;-) %-)) :-) Can I do something else about this problem?.. I don't see any other workable alternative. The box we're in as far as the interaction with MAXALIGN goes is still the same as it was a month ago: raising MAXALIGN is impractical, and so is allowing some datatypes to have more-than-MAXALIGN alignment specs. I suppose you could imagine declaring int128s that are in any sort of palloc'd storage as, in effect, char[16], and always memcpy'ing to and from local variables that're declared int128 whenever you want to do arithmetic with them. But ugh. I can't see taking that sort of notational and performance hit for just one non-mainstream architecture. Really, this is something that the compiler ought to do for us, IMO. If the gcc guys don't want to be bothered, OK, but that tells you more about the priority they place on SPARC support than anything else. Thank you very much for your explanations! So I'll go through all of your comments on my patch about stable functions when the next work day begins in Moscow) -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
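For illustration, the char[16]-plus-memcpy workaround Tom sketches (and rejects on notational and performance grounds) would look roughly like this. This is a hypothetical sketch, not anything PostgreSQL adopted; the type and function names are made up:

```c
#include <string.h>
#include <assert.h>

/* Keep int128 values in palloc'd structures as raw bytes, so the struct
 * needs only 1-byte alignment instead of 16-byte alignment. */
typedef struct Int128Storage
{
    char data[16];          /* effectively char[16] */
} Int128Storage;

__int128 load_int128(const Int128Storage *src)
{
    __int128 tmp;           /* a local variable is properly aligned */
    memcpy(&tmp, src->data, sizeof(tmp));
    return tmp;
}

void store_int128(Int128Storage *dst, __int128 val)
{
    memcpy(dst->data, &val, sizeof(val));
}

/* Every arithmetic step pays two memcpys -- the performance hit in question. */
void accum_int128(Int128Storage *state, __int128 newval)
{
    store_int128(state, load_int128(state) + newval);
}
```

Note that __int128 is a gcc/clang extension, so this sketch assumes one of those compilers on a 64-bit target.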
Re: master make check fails on Solaris 10
On 18-01-2018 19:53, Tom Lane wrote: Marina Polyakova <m.polyak...@postgrespro.ru> writes: On 18-01-2018 17:56, Tom Lane wrote: Weird. Maybe the gcc bug only manifests with certain optimization flags? That's not what I'd have expected from Victor's theory about why the code is wrong, but if it only shows up some of the time, it's hard to think of another explanation. Thank you! Using ./configure CC="gcc" CFLAGS="-m64 -O1" on commit 9c7d06d60680 with your patch, I got this: [ configure check passes ] But make check got the same failures, and I see the same debug output as in [1].. Interesting. Maybe the parameter-passing misbehavior that Victor's test is looking for isn't the only associated bug. P.S. As I understand it, this comment on bugzilla [2] is also about this. [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83925#c6 Even more interesting, see c7 that was just posted there: Eric Botcazou 2018-01-18 16:22:48 UTC 128-bit types require 128-bit alignment on SPARC64 so we cannot support that. So basically, we're outta luck and we have to consider __int128 as unsupportable on SPARC. I'm inclined to mechanize that as a test on $host_cpu. At least that means we don't need an AC_RUN test ;-) %-)) :-) Can I do something else about this problem?.. -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: master make check fails on Solaris 10
On 18-01-2018 17:56, Tom Lane wrote: Marina Polyakova <m.polyak...@postgrespro.ru> writes: Applying your patch on commit f033462d8f77c40b7d6b33c5116e50118fb4699d and using the configuration command from [1], I got: checking for __int128... yes checking for __int128 alignment bug... broken ... And make check-world passes. Victor said that he used a much simpler configuration command, and I'm trying to figure out what's changed.. Weird. Maybe the gcc bug only manifests with certain optimization flags? That's not what I'd have expected from Victor's theory about why the code is wrong, but if it only shows up some of the time, it's hard to think of another explanation. Thank you! Using ./configure CC="gcc" CFLAGS="-m64 -O1" on commit 9c7d06d60680 with your patch, I got this: checking for __int128... yes checking for __int128 alignment bug... ok checking alignment of PG_INT128_TYPE... 16 In pg_config.h: /* The normal alignment of `PG_INT128_TYPE', in bytes. */ #define ALIGNOF_PG_INT128_TYPE 16 ... /* Define to the name of a signed 128-bit integer type. */ #define PG_INT128_TYPE __int128 But make check got the same failures, and I see the same debug output as in [1].. P.S. As I understand it, this comment on bugzilla [2] is also about this. [1] https://www.postgresql.org/message-id/90ab676392c8f9c84431976147097cf0%40postgrespro.ru [2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83925#c6 -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: master make check fails on Solaris 10
On 17-01-2018 18:50, Marina Polyakova wrote: On 17-01-2018 18:28, Tom Lane wrote: BTW, now that you've demonstrated that the bug exists in a current gcc release, you should definitely file a bug at https://gcc.gnu.org/bugzilla/ I think you can just give them int128test2.c as-is as a test case. Please do that and let me know the PR number --- I think it would be good to cite the bug specifically in the comments for our configure code. Thanks, I'll try to do it. If I understand correctly, its PR number is 83925 (see [1]). [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83925 -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: master make check fails on Solaris 10
Thank you! On 17-01-2018 19:33, Tom Lane wrote: Attached is a draft patch to incorporate Victor's slimmed-down test into configure. If you have a chance, could you confirm it does the right thing on your Sparc machine? Victor Wagner <vitus(at)wagner(dot)pp(dot)ru> writes: It seems that what it does is not exactly a right thing. I've applied it to commit 9c7d06d60680 in master and see following checking for __int128 alignment bug... ok As far as I understand your patch, there should be: checking for __int128 alignment bug... broken ... Then in the pg_config.h I see /* The normal alignment of `PG_INT128_TYPE', in bytes. */ #define ALIGNOF_PG_INT128_TYPE 16 /* Define to the name of a signed 128-bit integer type. */ #define PG_INT128_TYPE __int128 ... However, make check passes. Uh ... how could that be? If the output of configure is exactly the same as before the patch, how could the end result be different? Applying your patch on commit f033462d8f77c40b7d6b33c5116e50118fb4699d and using the configuration command from [1], I got: checking for __int128... yes checking for __int128 alignment bug... broken Nothing is defined for int128 in pg_config.h: /* The normal alignment of `PG_INT128_TYPE', in bytes. */ /* #undef ALIGNOF_PG_INT128_TYPE */ ... /* Define to the name of a signed 128-bit integer type. */ /* #undef PG_INT128_TYPE */ And make check-world passes. Victor said that he used a much simpler configuration command, and I'm trying to figure out what's changed.. BTW, it would be a good idea to set up a buildfarm member on that machine, if you care about whether that configuration continues to work in the future. Victor answered this in [2]: Really we already have buildfarm member on this machine. It is just member of PostgresPro private buildfarm, not of big comminity buildfarm. So, I'll register it in the big buildfarm as soon as I figure out how to distribute limited resources of this machine between two buildfarms. P.S. 
I found trailing whitespace in line 80: ! int128a q; [1] https://www.postgresql.org/message-id/0d3a9fa264cebe1cb9966f37b7c06e86%40postgrespro.ru [2] https://www.postgresql.org/message-id/20180117203648.2626d97a%40wagner.wagner.home -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: master make check fails on Solaris 10
On 17-01-2018 18:28, Tom Lane wrote: BTW, now that you've demonstrated that the bug exists in a current gcc release, you should definitely file a bug at https://gcc.gnu.org/bugzilla/ I think you can just give them int128test2.c as-is as a test case. Please do that and let me know the PR number --- I think it would be good to cite the bug specifically in the comments for our configure code. Thanks, I'll try to do it. -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: master make check fails on Solaris 10
Sorry, diff.patch is attached now. On 17-01-2018 18:02, Marina Polyakova wrote: [I added Victor Wagner as co-researcher of this problem] On 13-01-2018 21:10, Tom Lane wrote: In the end this might just be an instance of the old saw about avoiding dot-zero releases. Have you tried a newer gcc? (Digging in their bugzilla finds quite a number of __int128 bugs fixed in 5.4.x, though none look to be specifically about misaligned data.) gcc 5.5.0 (from [1]) did not fix the problem.. On 16-01-2018 2:41, Tom Lane wrote: Marina Polyakova <m.polyak...@postgrespro.ru> writes: On 13-01-2018 21:10, Tom Lane wrote: I'm not sure there's much we can do about this. Dropping the use of the alignment spec isn't a workable option. If there were a simple way for configure to detect that the compiler generates bad code for that, we could have it do so and reject use of __int128, but it'd be up to you to come up with a workable test. I'll think about it.. Attached is a possible test program. I can confirm it passes on a machine with working __int128, but I have no idea whether it will detect the problem on yours. If not, maybe you can tweak it? Thank you! Using gcc 5.5.0 it prints that everything is ok. But, investigating the regression diffs, we found out that the error occurs when we pass int128 as not the first argument to the function (perhaps its value is replaced by the value of some address):

-- Use queries from random.sql
SELECT count(*) FROM onek; -- Everything is ok
...
SELECT random, count(random) FROM RANDOM_TBL GROUP BY random HAVING count(random) > 3; -- Everything is ok

postgres=# SELECT * FROM RANDOM_TBL ORDER BY random; -- Print current data
 random
     78
     86
     98
     98
(4 rows)

postgres=# SELECT AVG(random) FROM RANDOM_TBL
postgres-# HAVING AVG(random) NOT BETWEEN 80 AND 120; -- Oops!
 avg
---
 79446934848446476698976780288
(1 row)

Debug output from the last query (see attached diff.patch, it is based on commit 9c7d06d60680c7f00d931233873dee81fdb311c6 of master):

makeInt128AggState
int8_avg_accum val 98
int8_avg_accum val_int128 as 2 x int64: 0 98
int8_avg_accum val_int128 bytes: 0062
int8_avg_accum state 100e648d8
int8_avg_accum 1007f2e94
do_int128_accum int128 newval as 2 x int64: 4306826968 0
do_int128_accum int128 newval bytes: 000100B4F6D8
do_int128_accum state 100e648d8
do_int128_accum 1007f1e30
int8_avg_accum val 86
int8_avg_accum val_int128 as 2 x int64: 0 86
int8_avg_accum val_int128 bytes: 0056
int8_avg_accum state 100e648d8
int8_avg_accum 1007f2e94
do_int128_accum int128 newval as 2 x int64: 4306826968 0
do_int128_accum int128 newval bytes: 000100B4F6D8
do_int128_accum state 100e648d8
do_int128_accum 1007f1e30
int8_avg_accum val 98
int8_avg_accum val_int128 as 2 x int64: 0 98
int8_avg_accum val_int128 bytes: 0062
int8_avg_accum state 100e648d8
int8_avg_accum 1007f2e94
do_int128_accum int128 newval as 2 x int64: 4306826968 0
do_int128_accum int128 newval bytes: 000100B4F6D8
do_int128_accum state 100e648d8
do_int128_accum 1007f1e30
int8_avg_accum val 78
int8_avg_accum val_int128 as 2 x int64: 0 78
int8_avg_accum val_int128 bytes: 004E
int8_avg_accum state 100e648d8
int8_avg_accum 1007f2e94
do_int128_accum int128 newval as 2 x int64: 4306826968 0
do_int128_accum int128 newval bytes: 000100B4F6D8
do_int128_accum state 100e648d8
do_int128_accum 1007f1e30
numeric_poly_avg
int128_to_numericvar
int128_to_numericvar int128 val as 2 x int64: 17227307872 0
int128_to_numericvar int128 val bytes: 000402D3DB60

(val_int128 in the function int8_avg_accum is correct, but newval in the function do_int128_accum is not equal to it. val in the function int128_to_numericvar is (4 * 4306826968).) Based on this, we modified the test program (see attached).
Here is its output on Solaris 10 for different alignment requirements for int128 (on my machine, where make check-world passes, everything is OK) (ALIGNOF_PG_INT128_TYPE is 16 on Solaris 10):

$ gcc -D PG_ALIGN_128=16 -m64 -o int128test2 int128test2.c
$ ./int128test2
basic aritmetic OK
pass int 16 OK
pass uint 16 OK
pass int 32 OK
pass int 64 OK
pass int 128 OK
$ gcc -D PG_ALIGN_128=8 -m64 -o int128test2 int128test2.c
$ ./int128test2
basic aritmetic OK
pass int 16 FAILED
pass uint 16 FAILED
pass int 32 FAILED
pass int 64 FAILED
pass int 128 OK

Maybe some pass test from int128test2.c can be used to test __int128?

P.S. I suppose g.b should be 97656250 to get 4005:

struct glob128 {
    __int128 start;
    char pad;
    int128a a;
    int128a b;
    int128a c;
    int128a d;
} g = {0, 'p', 48828125, 97656255, 0, 0};
...
g.b = (g.b << 12) + 5;    /* 4005 */
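The failure mode described above -- an __int128 value corrupted when passed as a non-first argument -- can be distilled into a check along these lines. This is a sketch in the spirit of the "pass" tests in int128test2.c, not the actual test file; the function names are made up:

```c
#include <assert.h>

/* The callee should see exactly the value the caller passed, regardless of
 * argument position; in the broken SPARC64 case discussed above, it
 * reportedly did not. noinline keeps the call from being optimized away. */
__attribute__((noinline))
__int128 pass_second(int dummy, __int128 val)
{
    (void) dummy;
    return val;
}

int check_pass_second(void)
{
    __int128 v = ((__int128) 1 << 64) + 98;     /* wider than 64 bits */

    return pass_second(4, v) == v;
}
```

On a working ABI this trivially succeeds; the point of such a test is to detect the miscompilation at configure time instead of at make check time.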
Re: master make check fails on Solaris 10
[I added Victor Wagner as co-researcher of this problem] On 13-01-2018 21:10, Tom Lane wrote: In the end this might just be an instance of the old saw about avoiding dot-zero releases. Have you tried a newer gcc? (Digging in their bugzilla finds quite a number of __int128 bugs fixed in 5.4.x, though none look to be specifically about misaligned data.) gcc 5.5.0 (from [1]) did not fix the problem.. On 16-01-2018 2:41, Tom Lane wrote: Marina Polyakova <m.polyak...@postgrespro.ru> writes: On 13-01-2018 21:10, Tom Lane wrote: I'm not sure there's much we can do about this. Dropping the use of the alignment spec isn't a workable option. If there were a simple way for configure to detect that the compiler generates bad code for that, we could have it do so and reject use of __int128, but it'd be up to you to come up with a workable test. I'll think about it.. Attached is a possible test program. I can confirm it passes on a machine with working __int128, but I have no idea whether it will detect the problem on yours. If not, maybe you can tweak it? Thank you! Using gcc 5.5.0 it prints that everything is ok. But, investigating the regression diffs, we found out that the error occurs when we pass int128 as not the first argument to the function (perhaps its value is replaced by the value of some address):

-- Use queries from random.sql
SELECT count(*) FROM onek; -- Everything is ok
...
SELECT random, count(random) FROM RANDOM_TBL GROUP BY random HAVING count(random) > 3; -- Everything is ok

postgres=# SELECT * FROM RANDOM_TBL ORDER BY random; -- Print current data
 random
     78
     86
     98
     98
(4 rows)

postgres=# SELECT AVG(random) FROM RANDOM_TBL
postgres-# HAVING AVG(random) NOT BETWEEN 80 AND 120; -- Oops!
 avg
---
 79446934848446476698976780288
(1 row)

Debug output from the last query (see attached diff.patch, it is based on commit 9c7d06d60680c7f00d931233873dee81fdb311c6 of master):

makeInt128AggState
int8_avg_accum val 98
int8_avg_accum val_int128 as 2 x int64: 0 98
int8_avg_accum val_int128 bytes: 0062
int8_avg_accum state 100e648d8
int8_avg_accum 1007f2e94
do_int128_accum int128 newval as 2 x int64: 4306826968 0
do_int128_accum int128 newval bytes: 000100B4F6D8
do_int128_accum state 100e648d8
do_int128_accum 1007f1e30
int8_avg_accum val 86
int8_avg_accum val_int128 as 2 x int64: 0 86
int8_avg_accum val_int128 bytes: 0056
int8_avg_accum state 100e648d8
int8_avg_accum 1007f2e94
do_int128_accum int128 newval as 2 x int64: 4306826968 0
do_int128_accum int128 newval bytes: 000100B4F6D8
do_int128_accum state 100e648d8
do_int128_accum 1007f1e30
int8_avg_accum val 98
int8_avg_accum val_int128 as 2 x int64: 0 98
int8_avg_accum val_int128 bytes: 0062
int8_avg_accum state 100e648d8
int8_avg_accum 1007f2e94
do_int128_accum int128 newval as 2 x int64: 4306826968 0
do_int128_accum int128 newval bytes: 000100B4F6D8
do_int128_accum state 100e648d8
do_int128_accum 1007f1e30
int8_avg_accum val 78
int8_avg_accum val_int128 as 2 x int64: 0 78
int8_avg_accum val_int128 bytes: 004E
int8_avg_accum state 100e648d8
int8_avg_accum 1007f2e94
do_int128_accum int128 newval as 2 x int64: 4306826968 0
do_int128_accum int128 newval bytes: 000100B4F6D8
do_int128_accum state 100e648d8
do_int128_accum 1007f1e30
numeric_poly_avg
int128_to_numericvar
int128_to_numericvar int128 val as 2 x int64: 17227307872 0
int128_to_numericvar int128 val bytes: 000402D3DB60

(val_int128 in the function int8_avg_accum is correct, but newval in the function do_int128_accum is not equal to it. val in the function int128_to_numericvar is (4 * 4306826968).) Based on this, we modified the test program (see attached).
Here is its output on Solaris 10 for different alignment requirements for int128 (on my machine, where make check-world passes, everything is OK) (ALIGNOF_PG_INT128_TYPE is 16 on Solaris 10):

$ gcc -D PG_ALIGN_128=16 -m64 -o int128test2 int128test2.c
$ ./int128test2
basic aritmetic OK
pass int 16 OK
pass uint 16 OK
pass int 32 OK
pass int 64 OK
pass int 128 OK
$ gcc -D PG_ALIGN_128=8 -m64 -o int128test2 int128test2.c
$ ./int128test2
basic aritmetic OK
pass int 16 FAILED
pass uint 16 FAILED
pass int 32 FAILED
pass int 64 FAILED
pass int 128 OK

Maybe some pass test from int128test2.c can be used to test __int128?

P.S. I suppose g.b should be 97656250 to get 4005:

struct glob128 {
    __int128 start;
    char pad;
    int128a a;
    int128a b;
    int128a c;
    int128a d;
} g = {0, 'p', 48828125, 97656255, 0, 0};
...
g.b = (g.b << 12) + 5;    /* 4005 */

[1] https://www.opencsw.org -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: WIP Patch: Precalculate stable functions, infrastructure v1
Thank you so much for your comments!! I'll answer a bit later because now I'm trying to find a test for int128 on Solaris 10.. [1] On 17-01-2018 1:05, Tom Lane wrote: [ I'm sending this comment separately because I think it's an issue Andres might take an interest in. ] Marina Polyakova <m.polyak...@postgrespro.ru> writes: [ v7-0001-Precalculate-stable-and-immutable-functions.patch ] Another thing that's bothering me is that the execution semantics you're proposing for CachedExpr seem rather inflexible. AFAICS, once a CachedExpr has run once, it will hang on to the result value and keep returning that for the entire lifespan of the compiled expression. We already noted that that breaks plpgsql's "simple expression" logic, and it seems inevitable to me that it will be an issue for other places as well. I think it'd be a better design if we had some provision for resetting the cached values, short of recompiling the expression from scratch. One way that occurs to me to do this is to replace the simple boolean isExecuted flags with a generation counter, and add a master generation counter to ExprState. The rule for executing CachedExpr would be "if my generation counter is different from the ExprState's counter, then evaluate the subexpression and copy the ExprState's counter into mine". Then the procedure for forcing recalculation of cached values is just to increment the ExprState's counter. There are other ways one could imagine doing this --- for instance, I initially thought of keeping the master counter in the ExprContext being used to run the expression. But you need some way to remember what counter value was used last with a particular expression, so probably keeping it in ExprState is better. Or we could just brute-force it by providing a function that runs through a compiled expression step list and resets the isExecuted flag for each EEOP_CACHEDEXPR_IF_CACHED step it finds. 
A slightly less brute-force way is to link those steps together in a list, so that the function doesn't have to visit irrelevant steps. If the reset function were seldom used then the extra cycles for this wouldn't be very expensive. But I'm not sure it will be seldom used --- it seems like plpgsql simple expressions will be doing this every time --- so I think the counter approach might be a better idea. I'm curious to know whether Andres has some other ideas, or whether he feels this is all a big wart on the compiled-expression concept. I don't think there are any existing cases where we keep any meaningful state across executions of a compiled-expression data structure; maybe that's a bad idea in itself. regards, tom lane [1] https://www.postgresql.org/message-id/18209.1516059711%40sss.pgh.pa.us -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: master make check fails on Solaris 10
On 12-01-2018 14:05, Marina Polyakova wrote: - on the previous commit (272c2ab9fd0a604e3200030b1ea26fd464c44935) the same failures occur (see the attached regression diffs and output); - on commit bf54c0f05c0a58db17627724a83e1b6d4ec2712c make check-world passes. I'll try to find out from what commit it started.. Binary search has shown that all these failures begin with commit 7518049980be1d90264addab003476ae105f70d4 (Prevent int128 from requiring more than MAXALIGN alignment.). On the previous commit (91aec93e6089a5ba49cce0aca3bf7f7022d62ea4) make check-world passes. -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: master make check fails on Solaris 10
On 12-01-2018 18:12, Alvaro Herrera wrote: Marina Polyakova wrote: - on the previous commit (272c2ab9fd0a604e3200030b1ea26fd464c44935) the same failures occur (see the attached regression diffs and output); - on commit bf54c0f05c0a58db17627724a83e1b6d4ec2712c make check-world passes. I'll try to find out from what commit it started.. Don't you have any suspicions?) Perhaps you can use "git bisect". Thanks, I'm doing the same thing :) -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: WIP Patch: Pgbench Serialization and deadlock errors
On 12-01-2018 15:47, Fabien COELHO wrote: Hello Marina, A partial answer, to focus on how the feature may be simplified. Thank you! I apologise as it seems that I overlooked one of your mails. Changing the thread has not helped. I'm wondering whether the whole feature could be simplified by considering that one script is one "transaction" (it is from the report point of view at least), and that any retry is for the full script only, from its beginning. That would remove the trying to guess at transactions begin or end, avoid scanning manually for subcommands, and so on. - Would it make sense? - Would it be ok for your use case? I'm not sure if this makes sense: if we retry the script from the very beginning including successful transactions, there may be errors.. What I'm suggesting is to simplify the problem by adding some restrictions to what kind of case is handled, so as to simplify the code a lot. I'd start by stating (i.e. documenting) that the feature assumes that one script is just *one* transaction. Note that pgbench somehow already assumes that one script is one transaction when it reports performance anyway. If you want 2 transactions, then you have to put them in two scripts, which looks fine with me. Different transactions are expected to be independent, otherwise they should be merged into one transaction. Therefore if the script consists of several single statements (= several transactions), you cannot retry them. For example, the script looked like this: UPDATE xy1 SET x = 1 WHERE y = 1; UPDATE xy2 SET x = 2 WHERE y = 2; UPDATE xy3 SET x = 3 WHERE y = 3; If this restriction is ok for you, I'll simplify the code :) Under these restrictions, ISTM that a retry is something like: case ABORTED: if (we want to retry) { // do necessary stats // reset the initial state (random, vars, current command) state = START_TX; // loop } else { // count as failed... state = FINISHED; // or done. 
} break; If we successfully complete a failed transaction block and process its end command in CSTATE_END_COMMAND, we may want to make a retry. So do you think that in this case it is ok to go to CSTATE_ABORTED at the end of CSTATE_END_COMMAND?.. Once this works, maybe it could go a step further by restarting at savepoints. I'd put restrictions there to ease detecting a savepoint so that it cannot occur in a compound command for instance. This would probably require a new state. Fine. We discussed the savepoints earlier in [1]: - if there's a failure what savepoint we should rollback to and start the execution again? ... Maybe to go to the last one, if it is not successful go to the previous one etc. Retrying the entire transaction may take less time.. Well, I do not know that. My 0.02 € is that if there was a savepoint then this is natural the restarting point of a transaction which has some recoverable error. A detail: For file contents, maybe the << 'EOF' here-document syntax would help instead of using concatenated backslashed strings everywhere. Thanks, but I'm not sure that it is better to open file handlers for printing in variables.. Perl here-document stuff is just a multiline string, no file is involved, it is different from the shell. Oh, now googling was successful, thanks) [1] https://www.postgresql.org/message-id/alpine.DEB.2.20.1707141637300.7871%40lancre -- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
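Under the one-script-is-one-transaction restriction, the retry step Fabien sketches could look roughly like this in pgbench's state machine. All names here are hypothetical; the real pgbench client state is more involved:

```c
#include <assert.h>

typedef enum
{
    RETRY_STATE_START_TX,       /* rerun the whole script from the beginning */
    RETRY_STATE_FINISHED        /* give up and count the script as failed */
} RetryNextState;

typedef struct RetryClient
{
    unsigned int random_state;  /* RNG state as saved at transaction start */
    int          command;       /* index of the current command in the script */
    int          tries;         /* attempts made so far, including the first */
    int          max_tries;     /* give up after this many attempts */
} RetryClient;

/* Called from the ABORTED case after a serialization failure or deadlock. */
RetryNextState handle_aborted(RetryClient *st, unsigned int saved_random)
{
    if (st->tries < st->max_tries)
    {
        /* record retry stats here, then reset the initial state */
        st->random_state = saved_random;
        st->command = 0;
        st->tries++;
        return RETRY_STATE_START_TX;
    }
    /* record the failure here */
    return RETRY_STATE_FINISHED;
}
```

Restoring the RNG state saved at transaction start is what makes the retried script draw the same random values, so the retry really replays the same transaction rather than a new one.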