Re: Add two missing tests in 035_standby_logical_decoding.pl

2023-05-03 Thread Drouvot, Bertrand

Hi,

On 5/4/23 6:43 AM, Amit Kapila wrote:

On Thu, May 4, 2023 at 8:37 AM vignesh C  wrote:


Thanks for posting the updated patch. I ran this test in a loop of
100 iterations to verify that there were no failures due to race
conditions; all 100 runs passed successfully.

One suggestion:
"wal file" should be changed to "WAL file":
+# Request a checkpoint on the standby to trigger the WAL file(s) removal
+$node_standby->safe_psql('postgres', 'checkpoint;');
+
+# Verify that the wal file has not been retained on the standby
+my $standby_walfile = $node_standby->data_dir . '/pg_wal/' . $walfile_name;



Thanks for the verification. I have pushed the patch.



Thanks!

I've marked the CF entry as Committed and moved the associated
PostgreSQL 16 Open Item to "resolved before 16beta1".

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Order changes in PG16 since ICU introduction

2023-05-03 Thread Jeff Davis
On Fri, 2023-04-28 at 14:35 -0700, Jeff Davis wrote:
> On Thu, 2023-04-27 at 14:23 +0200, Daniel Verite wrote:
> > This should be pg_strcasecmp(...) == 0
> 
> Good catch, thank you! Fixed in updated patches.

Rebased patches.

=== 0001: do not convert C to en-US-u-va-posix

I plan to commit this soon. If someone specifies "C", they are probably
expecting memcmp()-like behavior, or some kind of error/warning that it
can't be provided.

Removing this transformation means that if you specify iculocale=C,
you'll get an error or warning (depending on icu_validation_level),
because C is not a recognized ICU locale. Depending on how some of the
other issues in this thread are sorted out, we may want to relax the
validation.

=== 0002: fix @euro, etc. in ICU >= 64

I'd like to commit this soon too, but I'll wait for someone to take a
look. It makes mapping libc names to ICU locale names more reliable,
regardless of the ICU version.

It doesn't solve the problem for locales like "de__PHONEBOOK", but
that doesn't seem to be a libc format (I think it's just an old ICU
format), so I don't see a big reason to carry it forward. It might be another
reason to turn down the validation level to WARNING, though.

=== 0003: support C memcmp() behavior with ICU provider

The current patch 0003 has a problem, because in previous postgres
versions (going all the way back), we allowed "C" as a valid ICU
locale, which would actually be passed to ICU as a locale name. But ICU
didn't recognize it, and it would end up opening the root locale. So we
can't simply redefine "C" to mean "memcmp", because that would
potentially break indexes.

I see the following potential solutions:

   1. Represent the memcmp behavior with iculocale=NULL, or some other
catalog hack, so that we can distinguish between a locale "C" upgraded
from a previous version (which should pass "C" to ICU and get the root
locale), and a new collation defined with locale "C" (which should have
memcmp behavior). The catalog representation for locale information is
already complex, so I'm not excited about this option, but it will
work.

   2. When provider=icu and locale=C, magically transform that into
provider=libc to get memcmp-like behavior for new collations but
preserve the existing behavior for upgraded collations. Not especially
clean, but if we issue a NOTICE perhaps that would avoid confusion.

   3. Like #2, except create a new provider type "none" which may be
slightly less confusing.

=== 0004: make LOCALE apply to ICU for CREATE DATABASE

To understand this patch it helps to understand the confusing situation
with CREATE DATABASE in version 15:

The keywords LC_CTYPE and LC_COLLATE set the server environment
LC_CTYPE/LC_COLLATE for that database and can be specified regardless
of the provider. LOCALE can be specified along with (or instead of)
LC_CTYPE and LC_COLLATE, in which case whichever of LC_CTYPE or
LC_COLLATE is unspecified defaults to the setting of LOCALE. Iff the
provider is libc, LC_CTYPE and LC_COLLATE also act as the database
default collation's locale. If the provider is icu, then none of
LOCALE, LC_CTYPE, or LC_COLLATE affect the database default collation's
locale at all; that's controlled by ICU_LOCALE (which may be omitted if
the template's daticulocale is non-NULL).

The idea of patch 0004 is to address the last part, which is probably
the most confusing aspect. But for that to work smoothly, we need
something like 0003 so that LOCALE=C gives the same semantics
regardless of the provider.

Regards,
Jeff Davis

From ddda683963959a175dff17ab0e3d8519641498b9 Mon Sep 17 00:00:00 2001
From: Jeff Davis 
Date: Fri, 21 Apr 2023 14:03:57 -0700
Subject: [PATCH v4 1/4] ICU: do not convert locale 'C' to 'en-US-u-va-posix'.

The conversion was intended to be for convenience, but it's more
likely to be confusing than useful.

The user can still directly specify 'en-US-u-va-posix' if desired.

Discussion: https://postgr.es/m/f83f089ee1e9acd5dbbbf3353294d24e1f196e95.ca...@j-davis.com
---
 src/backend/utils/adt/pg_locale.c | 19 +--
 src/bin/initdb/initdb.c   | 17 +
 .../regress/expected/collate.icu.utf8.out |  8 
 src/test/regress/sql/collate.icu.utf8.sql |  4 
 4 files changed, 14 insertions(+), 34 deletions(-)

diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index f0b6567da1..51b4221a39 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -2782,26 +2782,10 @@ icu_language_tag(const char *loc_str, int elevel)
 {
 #ifdef USE_ICU
 	UErrorCode	 status;
-	char		 lang[ULOC_LANG_CAPACITY];
 	char		*langtag;
 	size_t		 buflen = 32;	/* arbitrary starting buffer size */
 	const bool	 strict = true;
 
-	status = U_ZERO_ERROR;
-	uloc_getLanguage(loc_str, lang, ULOC_LANG_CAPACITY, &status);
-	if (U_FAILURE(status))
-	{
-		if (elevel > 0)
-			ereport(elevel,
-	(errmsg("could not get language from locale \"%s\": 

Re: PL/Python: Fix return in the middle of PG_TRY() block.

2023-05-03 Thread Nathan Bossart
On Wed, May 03, 2023 at 09:54:13PM -0700, Nathan Bossart wrote:
> Here's a new patch that removes the volatile marker from pltdata.

Gah, right after I sent that, I realized we can remove one more volatile
marker.  Sorry for the noise.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 24a4e91b3cdd66b985fab70f3aed8f6a202b31fe Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Wed, 3 May 2023 11:32:43 -0700
Subject: [PATCH v5 1/1] Fix improper returns in PG_TRY blocks.

If we exit a PG_TRY block early via "continue", "break", "goto", or
"return", we'll skip unwinding its exception stack.  This change
moves a couple of such "return" statements in PL/Python out of
PG_TRY blocks.  This was introduced in d0aa965c0a and affects all
supported versions.

We might also be able to add compile-time checks to avoid
recurrence, but that is left as a future exercise.

Reported-by: Mikhail Gribkov, Xing Guo
Author: Xing Guo
Reviewed-by: Michael Paquier, Andres Freund, Tom Lane
Discussion: https://postgr.es/m/CAMEv5_v5Y%2B-D%3DCO1%2Bqoe16sAmgC4sbbQjz%2BUtcHmB6zcgS%2B5Ew%40mail.gmail.com
Discussion: https://postgr.es/m/CACpMh%2BCMsGMRKFzFMm3bYTzQmMU5nfEEoEDU2apJcc4hid36AQ%40mail.gmail.com
Backpatch-through: 11 (all supported versions)
---
 src/pl/plpython/plpy_exec.c | 54 -
 1 file changed, 36 insertions(+), 18 deletions(-)

diff --git a/src/pl/plpython/plpy_exec.c b/src/pl/plpython/plpy_exec.c
index 923703535a..fcb363891a 100644
--- a/src/pl/plpython/plpy_exec.c
+++ b/src/pl/plpython/plpy_exec.c
@@ -411,15 +411,20 @@ static PyObject *
 PLy_function_build_args(FunctionCallInfo fcinfo, PLyProcedure *proc)
 {
 	PyObject   *volatile arg = NULL;
-	PyObject   *volatile args = NULL;
+	PyObject   *args = NULL;
 	int			i;
 
+	/*
+	 * Make any Py*_New() calls before the PG_TRY block so that we can quickly
+	 * return NULL on failure.  We can't return within the PG_TRY block, else
+	 * we'd miss unwinding the exception stack.
+	 */
+	args = PyList_New(proc->nargs);
+	if (!args)
+		return NULL;
+
 	PG_TRY();
 	{
-		args = PyList_New(proc->nargs);
-		if (!args)
-			return NULL;
-
 		for (i = 0; i < proc->nargs; i++)
 		{
 			PLyDatumToOb *arginfo = &proc->args[i];
@@ -683,19 +688,34 @@ PLy_trigger_build_args(FunctionCallInfo fcinfo, PLyProcedure *proc, HeapTuple *r
 			   *pltlevel,
 			   *pltrelid,
 			   *plttablename,
-			   *plttableschema;
-	PyObject   *pltargs,
+			   *plttableschema,
 			   *pytnew,
-			   *pytold;
-	PyObject   *volatile pltdata = NULL;
+			   *pytold,
+			   *pltdata;
+	PyObject   *volatile pltargs = NULL;
 	char	   *stroid;
 
-	PG_TRY();
+	/*
+	 * Make any Py*_New() calls before the PG_TRY block so that we can quickly
+	 * return NULL on failure.  We can't return within the PG_TRY block, else
+	 * we'd miss unwinding the exception stack.
+	 */
+	pltdata = PyDict_New();
+	if (!pltdata)
+		return NULL;
+
+	if (tdata->tg_trigger->tgnargs)
 	{
-		pltdata = PyDict_New();
-		if (!pltdata)
+		pltargs = PyList_New(tdata->tg_trigger->tgnargs);
+		if (!pltargs)
+		{
+			Py_DECREF(pltdata);
 			return NULL;
+		}
+	}
 
+	PG_TRY();
+	{
 		pltname = PLyUnicode_FromString(tdata->tg_trigger->tgname);
 		PyDict_SetItemString(pltdata, "name", pltname);
 		Py_DECREF(pltname);
@@ -835,12 +855,9 @@ PLy_trigger_build_args(FunctionCallInfo fcinfo, PLyProcedure *proc, HeapTuple *r
 			int			i;
 			PyObject   *pltarg;
 
-			pltargs = PyList_New(tdata->tg_trigger->tgnargs);
-			if (!pltargs)
-			{
-				Py_DECREF(pltdata);
-				return NULL;
-			}
+			/* pltargs should have been allocated before the PG_TRY block. */
+			Assert(pltargs);
+
 			for (i = 0; i < tdata->tg_trigger->tgnargs; i++)
 			{
 				pltarg = PLyUnicode_FromString(tdata->tg_trigger->tgargs[i]);
@@ -861,6 +878,7 @@ PLy_trigger_build_args(FunctionCallInfo fcinfo, PLyProcedure *proc, HeapTuple *r
 	}
 	PG_CATCH();
 	{
+		Py_XDECREF(pltargs);
 		Py_XDECREF(pltdata);
 		PG_RE_THROW();
 	}
-- 
2.25.1



Re: PL/Python: Fix return in the middle of PG_TRY() block.

2023-05-03 Thread Nathan Bossart
On Wed, May 03, 2023 at 01:58:38PM -0700, Nathan Bossart wrote:
> With this change, pltdata isn't modified in the PG_TRY section, and the
> only modification of pltargs happens after all elogs.  It might be worth
> keeping pltargs volatile in case someone decides to add an elog() in the
> future, but I see no need to keep it for pltdata.

Here's a new patch that removes the volatile marker from pltdata.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 53d2d942b8ee553439da6f324186a62ebb7fce1f Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Wed, 3 May 2023 11:32:43 -0700
Subject: [PATCH v4 1/1] Fix improper returns in PG_TRY blocks.

If we exit a PG_TRY block early via "continue", "break", "goto", or
"return", we'll skip unwinding its exception stack.  This change
moves a couple of such "return" statements in PL/Python out of
PG_TRY blocks.  This was introduced in d0aa965c0a and affects all
supported versions.

We might also be able to add compile-time checks to avoid
recurrence, but that is left as a future exercise.

Reported-by: Mikhail Gribkov, Xing Guo
Author: Xing Guo
Reviewed-by: Michael Paquier, Andres Freund, Tom Lane
Discussion: https://postgr.es/m/CAMEv5_v5Y%2B-D%3DCO1%2Bqoe16sAmgC4sbbQjz%2BUtcHmB6zcgS%2B5Ew%40mail.gmail.com
Discussion: https://postgr.es/m/CACpMh%2BCMsGMRKFzFMm3bYTzQmMU5nfEEoEDU2apJcc4hid36AQ%40mail.gmail.com
Backpatch-through: 11 (all supported versions)
---
 src/pl/plpython/plpy_exec.c | 52 +
 1 file changed, 35 insertions(+), 17 deletions(-)

diff --git a/src/pl/plpython/plpy_exec.c b/src/pl/plpython/plpy_exec.c
index 923703535a..1e3efd4d86 100644
--- a/src/pl/plpython/plpy_exec.c
+++ b/src/pl/plpython/plpy_exec.c
@@ -414,12 +414,17 @@ PLy_function_build_args(FunctionCallInfo fcinfo, PLyProcedure *proc)
 	PyObject   *volatile args = NULL;
 	int			i;
 
+	/*
+	 * Make any Py*_New() calls before the PG_TRY block so that we can quickly
+	 * return NULL on failure.  We can't return within the PG_TRY block, else
+	 * we'd miss unwinding the exception stack.
+	 */
+	args = PyList_New(proc->nargs);
+	if (!args)
+		return NULL;
+
 	PG_TRY();
 	{
-		args = PyList_New(proc->nargs);
-		if (!args)
-			return NULL;
-
 		for (i = 0; i < proc->nargs; i++)
 		{
 			PLyDatumToOb *arginfo = &proc->args[i];
@@ -683,19 +688,34 @@ PLy_trigger_build_args(FunctionCallInfo fcinfo, PLyProcedure *proc, HeapTuple *r
 			   *pltlevel,
 			   *pltrelid,
 			   *plttablename,
-			   *plttableschema;
-	PyObject   *pltargs,
+			   *plttableschema,
 			   *pytnew,
-			   *pytold;
-	PyObject   *volatile pltdata = NULL;
+			   *pytold,
+			   *pltdata;
+	PyObject   *volatile pltargs = NULL;
 	char	   *stroid;
 
-	PG_TRY();
+	/*
+	 * Make any Py*_New() calls before the PG_TRY block so that we can quickly
+	 * return NULL on failure.  We can't return within the PG_TRY block, else
+	 * we'd miss unwinding the exception stack.
+	 */
+	pltdata = PyDict_New();
+	if (!pltdata)
+		return NULL;
+
+	if (tdata->tg_trigger->tgnargs)
 	{
-		pltdata = PyDict_New();
-		if (!pltdata)
+		pltargs = PyList_New(tdata->tg_trigger->tgnargs);
+		if (!pltargs)
+		{
+			Py_DECREF(pltdata);
 			return NULL;
+		}
+	}
 
+	PG_TRY();
+	{
 		pltname = PLyUnicode_FromString(tdata->tg_trigger->tgname);
 		PyDict_SetItemString(pltdata, "name", pltname);
 		Py_DECREF(pltname);
@@ -835,12 +855,9 @@ PLy_trigger_build_args(FunctionCallInfo fcinfo, PLyProcedure *proc, HeapTuple *r
 			int			i;
 			PyObject   *pltarg;
 
-			pltargs = PyList_New(tdata->tg_trigger->tgnargs);
-			if (!pltargs)
-			{
-				Py_DECREF(pltdata);
-				return NULL;
-			}
+			/* pltargs should have been allocated before the PG_TRY block. */
+			Assert(pltargs);
+
 			for (i = 0; i < tdata->tg_trigger->tgnargs; i++)
 			{
 				pltarg = PLyUnicode_FromString(tdata->tg_trigger->tgargs[i]);
@@ -861,6 +878,7 @@ PLy_trigger_build_args(FunctionCallInfo fcinfo, PLyProcedure *proc, HeapTuple *r
 	}
 	PG_CATCH();
 	{
+		Py_XDECREF(pltargs);
 		Py_XDECREF(pltdata);
 		PG_RE_THROW();
 	}
-- 
2.25.1



Re: Add two missing tests in 035_standby_logical_decoding.pl

2023-05-03 Thread Amit Kapila
On Thu, May 4, 2023 at 8:37 AM vignesh C  wrote:
>
> Thanks for posting the updated patch. I ran this test in a loop of
> 100 iterations to verify that there were no failures due to race
> conditions; all 100 runs passed successfully.
>
> One suggestion:
> "wal file" should be changed to "WAL file":
> +# Request a checkpoint on the standby to trigger the WAL file(s) removal
> +$node_standby->safe_psql('postgres', 'checkpoint;');
> +
> +# Verify that the wal file has not been retained on the standby
> +my $standby_walfile = $node_standby->data_dir . '/pg_wal/' . $walfile_name;
>

Thanks for the verification. I have pushed the patch.

-- 
With Regards,
Amit Kapila.




Re: "CREATE RULE ... ON SELECT": redundant?

2023-05-03 Thread Ian Lawrence Barwick
On Thu, May 4, 2023 at 12:51, Tom Lane  wrote:
>
> Ian Lawrence Barwick  writes:
> > While poking around at an update for that, unless I'm missing something
> > it is now not possible to use "CREATE RULE ... ON SELECT" for any kind
> > of relation, given that it's disallowed on views / material views already.
>
> What makes you think it's disallowed on views?  You do need to use
> CREATE OR REPLACE, since the rule will already exist.

Ah, "OR REPLACE". Knew I was missing something.

> regression=# create view v as select * from int8_tbl ;
> CREATE VIEW
> regression=# create or replace rule "_RETURN" as on select to v do instead 
> select q1, q2+1 as q2 from int8_tbl ;
> CREATE RULE
> regression=# \d+ v
>  View "public.v"
>  Column |  Type  | Collation | Nullable | Default | Storage | Description
> ++---+--+-+-+-
>  q1 | bigint |   |  | | plain   |
>  q2 | bigint |   |  | | plain   |
> View definition:
>  SELECT int8_tbl.q1,
> int8_tbl.q2 + 1 AS q2
>FROM int8_tbl;
>
> Now, this is certainly syntax that's deprecated in favor of using
> CREATE OR REPLACE VIEW, but I'm very hesitant to remove it.  ISTR
> that ancient pg_dump files used it in cases involving circular
> dependencies.  If you want to adjust the docs to say that it's
> deprecated in favor of CREATE OR REPLACE VIEW, I could get on
> board with that.

'k, I will work on a doc patch.

Thanks

Ian Barwick




RE: [PoC] pg_upgrade: allow to upgrade publisher node

2023-05-03 Thread Hayato Kuroda (Fujitsu)
Dear Alvaro,

Thanks for giving suggestion!

> A point on preserving physical replication slots: because we change WAL
> format from one major version to the next (adding new messages or
> changing format for other messages), we can't currently rely on physical
> slots working across different major versions.
> 
> So IMO, for now don't bother with physical replication slot
> preservation, but do keep the option name as specific to logical slots.

Based on Julien's advice, we have already decided not to include physical
slots in this patch, and the option name has been changed.
I think you have explicitly confirmed that we are going the correct way. Thanks!

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



Re: "CREATE RULE ... ON SELECT": redundant?

2023-05-03 Thread Tom Lane
Ian Lawrence Barwick  writes:
> While poking around at an update for that, unless I'm missing something it is
> now not possible to use "CREATE RULE ... ON SELECT" for any kind of relation,
> given that it's disallowed on views / material views already.

What makes you think it's disallowed on views?  You do need to use
CREATE OR REPLACE, since the rule will already exist.

regression=# create view v as select * from int8_tbl ;
CREATE VIEW
regression=# create or replace rule "_RETURN" as on select to v do instead 
select q1, q2+1 as q2 from int8_tbl ;
CREATE RULE
regression=# \d+ v
 View "public.v"
 Column |  Type  | Collation | Nullable | Default | Storage | Description 
++---+--+-+-+-
 q1 | bigint |   |  | | plain   | 
 q2 | bigint |   |  | | plain   | 
View definition:
 SELECT int8_tbl.q1,
int8_tbl.q2 + 1 AS q2
   FROM int8_tbl;

Now, this is certainly syntax that's deprecated in favor of using
CREATE OR REPLACE VIEW, but I'm very hesitant to remove it.  ISTR
that ancient pg_dump files used it in cases involving circular
dependencies.  If you want to adjust the docs to say that it's
deprecated in favor of CREATE OR REPLACE VIEW, I could get on
board with that.

regards, tom lane




"CREATE RULE ... ON SELECT": redundant?

2023-05-03 Thread Ian Lawrence Barwick
Hi

Commit b23cd185f [1] forbids manual creation of ON SELECT rule on
a table, and updated the main rules documentation [2], but didn't update
the corresponding CREATE RULE page [3].

[1] 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b23cd185fd5410e5204683933f848d4583e34b35
[2] https://www.postgresql.org/docs/devel/rules-views.html
[3] https://www.postgresql.org/docs/devel/sql-createrule.html

While poking around at an update for that, unless I'm missing something it is
now not possible to use "CREATE RULE ... ON SELECT" for any kind of relation,
given that it's disallowed on views / material views already.

Assuming that's the case, that makes this useless syntax in a non-SQL-standard
command, is there any reason to keep it in the grammar at all?

Attached suggested patch removes it entirely and updates the CREATE RULE
documentation.

Apart from removing ON SELECT from the grammar, the main change is the removal
of usage checks in DefineQueryRewrite(), as the only time it is called with the
event_type set to "CMD_SELECT" is when a view/matview is created, and presumably
we can trust the internal caller to do the right thing. I added an Assert in
just in case, dunno if that's really needed. In passing, a redundant workaround
for pre-7.3 rule names gets removed as well.

I note that with or without this change, pg_get_ruledef() e.g. executed with:

  SELECT pg_get_ruledef(oid) FROM pg_rewrite WHERE
ev_class='some_view'::regclass;

emits SQL for CREATE RULE which can no longer be executed; I don't think there
is anything which can be done about that other than noting it as a historical
implementation oddity.


Regards

Ian Barwick
commit 3cf32a2c68d83e632e37b20be5bad7fb02945750
Author: Ian Barwick 
Date:   Tue May 2 22:17:47 2023 +0900

Remove SQL syntax support for CREATE RULE ... ON SELECT.

Commit b23cd185f forbids creation of an ON SELECT rule on tables,
which means it is no longer possible to create an ON SELECT rule
on any relation type, making the ON SELECT event option entirely
redundant.

This patch removes the syntax, and also updates the CREATE RULE
documentation page.

diff --git a/doc/src/sgml/ref/create_rule.sgml b/doc/src/sgml/ref/create_rule.sgml
index dbf4c93784..0f031bdc2f 100644
--- a/doc/src/sgml/ref/create_rule.sgml
+++ b/doc/src/sgml/ref/create_rule.sgml
@@ -27,7 +27,7 @@ CREATE [ OR REPLACE ] RULE name AS
 
 where event can be one of:
 
-SELECT | INSERT | UPDATE | DELETE
+INSERT | UPDATE | DELETE
 
  
 
@@ -58,18 +58,6 @@ CREATE [ OR REPLACE ] RULE name AS
More information about the rules system is in .
   
 
-  
-   Presently, ON SELECT rules must be unconditional
-   INSTEAD rules and must have actions that consist
-   of a single SELECT command.  Thus, an
-   ON SELECT rule effectively turns the table into
-   a view, whose visible contents are the rows returned by the rule's
-   SELECT command rather than whatever had been
-   stored in the table (if anything).  It is considered better style
-   to write a CREATE VIEW command than to create a
-   real table and define an ON SELECT rule for it.
-  
-
   
You can create the illusion of an updatable view by defining
ON INSERT, ON UPDATE, and
@@ -134,12 +122,11 @@ CREATE [ OR REPLACE ] RULE name AS
 event
 
  
-  The event is one of SELECT,
-  INSERT, UPDATE, or
-  DELETE.  Note that an
-  INSERT containing an ON
-  CONFLICT clause cannot be used on tables that have
-  either INSERT or UPDATE
+  The event is one of INSERT,
+  UPDATE, or DELETE.
+  Note that an INSERT containing an
+  ON CONFLICT clause cannot be used on tables
+  that have either INSERT or UPDATE
   rules.  Consider using an updatable view instead.
  
 
@@ -246,22 +233,22 @@ CREATE [ OR REPLACE ] RULE name AS
It is very important to take care to avoid circular rules.  For
example, though each of the following two rule definitions are
accepted by PostgreSQL, the
-   SELECT command would cause
+   INSERT command would cause
PostgreSQL to report an error because
of recursive expansion of a rule:
 
 
-CREATE RULE "_RETURN" AS
-ON SELECT TO t1
-DO INSTEAD
-SELECT * FROM t2;
+CREATE RULE t1_insert AS
+  ON INSERT TO t1
+  DO INSTEAD
+INSERT INTO t2 VALUES (NEW.id);
 
-CREATE RULE "_RETURN" AS
-ON SELECT TO t2
-DO INSTEAD
-SELECT * FROM t1;
+CREATE RULE t2_insert AS
+  ON INSERT TO t2
+  DO INSTEAD
+INSERT INTO t1 VALUES (NEW.id);
 
-SELECT * FROM t1;
+INSERT INTO t1 VALUES(1);
 
   
 
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index a723d9db78..cf74f75792 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -10781,8 +10781,7 @@ RuleActionStmtOrEmpty:
 			|	/*EMPTY*/			{ $$ = NULL; }
 		;
 
-event:		SELECT	{ $$ = CMD_SELECT; }
-			| UPDATE	{ $$ = CMD_UPDATE; }
+event:		UPDATE	{ $$ = CMD_UPDATE; }
 			| 

Re: Add two missing tests in 035_standby_logical_decoding.pl

2023-05-03 Thread vignesh C
On Wed, 3 May 2023 at 15:59, Amit Kapila  wrote:
>
> On Tue, May 2, 2023 at 4:52 PM Drouvot, Bertrand
>  wrote:
> >
> >
> > As per your suggestion, changing the insert ordering (as in the attached V8)
> > now makes it work on the failing environment too.
> >
>
> I think it is better to use wait_for_replay_catchup() to wait for
> standby to catch up. I have changed that and a comment in the
> attached. I'll push this tomorrow unless there are further comments.

Thanks for posting the updated patch. I ran this test in a loop of
100 iterations to verify that there were no failures due to race
conditions; all 100 runs passed successfully.

One suggestion:
"wal file" should be changed to "WAL file":
+# Request a checkpoint on the standby to trigger the WAL file(s) removal
+$node_standby->safe_psql('postgres', 'checkpoint;');
+
+# Verify that the wal file has not been retained on the standby
+my $standby_walfile = $node_standby->data_dir . '/pg_wal/' . $walfile_name;

Regards,
Vignesh




Re: Direct I/O

2023-05-03 Thread Thomas Munro
On Wed, Apr 19, 2023 at 7:35 AM Greg Stark  wrote:
> On Mon, 17 Apr 2023 at 17:45, Thomas Munro  wrote:
> > (2) without a page cache, you really need to size your shared_buffers
> > adequately and we can't do that automatically.
>
> Well I'm more optimistic... That may not always be impossible.
> We've already added the ability to add more shared memory after
> startup. We could implement the ability to add or remove shared buffer
> segments after startup.

Yeah, there are examples of systems going back decades with multiple
buffer pools.  In some you can add more space later, and in some you
can also configure pools with different block sizes (imagine if you
could set your extremely OLTP tables to use 4KB blocks for reduced
write amplification and then perhaps even also promise that your
storage doesn't need FPIs for that size because you know it's
perfectly safe™, and imagine if you could set some big write-only
history tables to use 32KB blocks because some compression scheme
works better, etc), and you might also want different cache
replacement algorithms in different pools.  Complex and advanced stuff
no doubt and I'm not suggesting that's anywhere near a reasonable
thing for us to think about now (as a matter of fact in another thread
you can find me arguing for fully unifying our existing segregated
SLRU buffer pools with the one true buffer pool), but since we're
talking pie-in-the-sky ideas around the water cooler...




Re: Overhauling "Routine Vacuuming" docs, particularly its handling of freezing

2023-05-03 Thread Peter Geoghegan
On Wed, May 3, 2023 at 2:59 PM Peter Geoghegan  wrote:
> Coming up with a new user-facing name for xidStopLimit is already on
> my TODO list (it's surprisingly hard). I have used that name so far
> because it unambiguously refers to the exact thing that I want to talk
> about when discussing the worst case. Other than that, it's a terrible
> name.

What about "XID allocation overload"? The implication that I'm going
for here is that the system was misconfigured, or there was otherwise
some kind of imbalance between XID supply and demand. It also seems to
convey the true gravity of the situation -- it's *bad*, to be sure,
but in many environments it's a survivable condition.

One possible downside of this name is that it could suggest that all
that needs to happen is for autovacuum to catch up on vacuuming. In
reality the user *will* probably have to do more than just wait before
the system's ability to allocate new XIDs returns, because (in all
likelihood) autovacuum just won't be able to catch up unless and until
the user (say) drops a replication slot. Even still, the name seems to
work; it describes the conceptual model of the system accurately. Even
before the user drops the replication slot, autovacuum will at least
*try* to get the system back to being able to allocate new XIDs once
more.

-- 
Peter Geoghegan




Re: issue with meson builds on msys2

2023-05-03 Thread Andrew Dunstan


On 2023-05-03 We 14:26, Andres Freund wrote:

Hi,

On 2023-05-03 09:20:28 -0400, Andrew Dunstan wrote:

On 2023-04-27 Th 18:18, Andres Freund wrote:

Hi,

On 2023-04-26 09:59:05 -0400, Andrew Dunstan wrote:

Still running into this, and I am rather stumped. This is a blocker for
buildfarm support for meson:

Here's a simple illustration of the problem. If I do the identical test with
a non-meson build there is no problem:

This happens 100% reproducibly?

For a sufficiently modern installation of msys2 (20230318 version) this is
reproducible on autoconf builds as well.

Oh. Seems like something we need to dig into independent of meson then :(



The main thing that's now an issue on Windows is support for various options
like libxml2. I installed the libxml2 distro from the package manager scoop,
generated .lib files for the libxml2 and libxslt DLLs, and was able to build
with autoconf on msys2, and with our MSVC support, but not with meson in
either case. It looks like we need to expand the logic in meson.build for a
number of these, just as we have done for perl, python, openssl, ldap etc.

I seriously doubt that trying to support every possible packaging thing on
windows is a good idea. What's the point of building against libraries from a
packaging solution that doesn't even come with .lib files? Windows already is
a massive pain to support for postgres, making it even more complicated / less
predictable is a really bad idea.

IMO, for windows, the path we should go down is to provide one documented way
to build the dependencies (e.g. using vcpkg or conan, perhaps also supporting
msys distributed libs), and define using something else to be unsupported (in
the "we don't help you", not in the "we explicitly try to break things"
sense).  And it should be something that understands needing to build debug
and non-debug libraries.



I'm not familiar with conan. I have struggled considerably with vcpkg in 
the past.


I don't think there is any one perfect answer.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com


Re: Overhauling "Routine Vacuuming" docs, particularly its handling of freezing

2023-05-03 Thread Peter Geoghegan
Hi Samay,

On Tue, May 2, 2023 at 11:40 PM samay sharma  wrote:
> Thanks for taking the time to do this. It is indeed difficult work.

Thanks for the review! I think that this is something that would
definitely benefit from a perspective such as yours.

> There are things I like about the changes you've proposed and some where I 
> feel that the previous section was easier to understand.

That makes sense, and I think that I agree with every point you've
raised, bar none. I'm pleased to see that you basically agree with the
high level direction.

I would estimate that the version you looked at (v2) is perhaps 35%
complete. So some of the individual problems you noticed were a direct
consequence of the work just not being anywhere near complete. I'll
try to do a better job of tracking the relative maturity of each
commit/patch in each commit message, going forward.

Anything that falls under "25.2.1. Recovering Disk Space" is
particularly undeveloped in v2. The way that I broke that up into a
bunch of WARNINGs/NOTEs/TIPs was just a short term way of breaking it
up into pieces, so that the structure was very approximately what I
wanted. I actually think that the stuff about CLUSTER and VACUUM FULL
belongs in a completely different chapter. Since it is not "Routine
Vacuuming" at all.

>> 2. Renamed "Preventing Transaction ID Wraparound Failures" to
>> "Freezing to manage the transaction ID space". Now we talk about
>> wraparound as a subtopic of freezing, not vice-versa. (This is a
>> complete rewrite, as described by later items in this list).
>
> +1 on this too. Freezing is a normal part of vacuuming and while the 
> aggressive vacuums are different, I think just talking about the worst case 
> scenario while referring to it is alarmist.

Strangely enough, Postgres 16 is the first version that instruments
freezing in its autovacuum log reports. I suspect that some long term
users will find it quite surprising to see how much (or how little)
freezing takes place in non-aggressive VACUUMs.

The introduction of page-level freezing will make it easier and more
natural to tune settings like vacuum_freeze_min_age, with the aim of
smoothing out the burden of freezing over time (particularly by making
non-aggressive VACUUMs freeze more). Page-level freezing removes any
question of not freezing every tuple on a page (barring cases where
"removable cutoff" is noticeably held back by an old MVCC snapshot).
This makes it more natural to think of freezing as a process that
makes it okay to store data in individual physical heap pages, long
term.

> 1) While I agree that bundling VACUUM and VACUUM FULL is not the right way, 
> moving all VACUUM FULL references into tips and warnings also seems 
> excessive. I think it's probably best to just have a single paragraph which 
> talks about VACUUM FULL as I do think it should be mentioned in the 
> reclaiming disk space section.

As I mentioned briefly already, my intention is to move it to another
chapter entirely. I was thinking of "Chapter 29. Monitoring Disk
Usage". The "Routine Vacuuming" docs would then link to this sect1 --
something along the lines of "non-routine commands to reclaim a lot of
disk space in the event of extreme bloat".

> 2) I felt that the new section, "Freezing to manage the transaction ID space" 
> could be made simpler to understand. As an example, I understood what the 
> parameters (autovacuum_freeze_max_age, vacuum_freeze_table_age) do and how 
> they interact better in the previous version of the docs.

Agreed. I'm going to split it up some more. I think that the current
"25.2.2.1. VACUUM's Aggressive Strategy" should be split in two, so we
go from talking about aggressive VACUUMs to Antiwraparound
autovacuums. Finding the least confusing way of explaining it has been
a focus of mine in the last few days.

> 4) I think we should explicitly call out that seeing an anti-wraparound 
> VACUUM or "VACUUM table (to prevent wraparound)" is normal and that it's just 
> a VACUUM triggered due to the table having unfrozen rows with an XID older 
> than autovacuum_freeze_max_age. I've seen many users panicking on seeing this 
> and feeling that they are close to a wraparound.

That has also been my exact experience. Users are terrified, usually
for no good reason at all. I'll make sure that this comes across in
the next revision of the patch series.

> Also, we should be more clear about how it's different from VACUUMs triggered 
> due to the scale factors (cancellation behavior, being triggered when 
> autovacuum is disabled etc.).

Right. Though I think that the biggest point of confusion for users is
how *few* differences there really are between antiwraparound
autovacuum, and any other kind of autovacuum that happens to use
VACUUM's aggressive strategy. There is really only one important
difference: the autocancellation behavior. This is an autovacuum
behavior, not a VACUUM behavior -- so the "VACUUM side" doesn't know
anything about that at all.

> 

Re: refactoring relation extension and BufferAlloc(), faster COPY

2023-05-03 Thread Andres Freund
Hi,

On 2023-05-03 19:29:46 +, Muhammad Malik wrote:
> > I use a script like:
>
> > c=16;psql -c 'DROP TABLE IF EXISTS copytest_0; CREATE TABLE copytest_0(data 
> > text not null);' && time /srv/dev/build/m-opt/src/bin/pgbench/pgbench -n 
> > -P1 -c$c -j$c -t$((1024/$c)) -f ~/tmp/copy.sql && psql -c 'TRUNCATE 
> > copytest_0'
>
> > >[1] COPY (SELECT repeat(random()::text, 5) FROM generate_series(1, 
> > >10)) TO '/tmp/copytest_data_text.copy' WITH (FORMAT text);
> > >[2] COPY (SELECT repeat(random()::text, 5) FROM generate_series(1, 
> > >6*10)) TO '/tmp/copytest_data_text.copy' WITH (FORMAT text);
>
> When I ran this script it did not insert anything into the copytest_0 table. 
> It only generated a single copytest_data_text.copy file of size 9.236MB.
> Please help me understand how this is 'pgbench running COPY into a single 
> table'.

That's the data generation for the file to be COPYed in. The script passed to
pgbench is just something like

COPY copytest_0 FROM '/tmp/copytest_data_text.copy';
or
COPY copytest_0 FROM '/tmp/copytest_data_binary.copy';


> Also what are the 'seconds' and 'tbl-MBs' metrics that were reported.

"seconds" is the total time for inserting the file N times (N = 1024 for the
small files, 64 for the larger ones). "tbl-MBs" is the size of the resulting
table in MB, divided by that time, i.e. a measure of throughput.

Greetings,

Andres Freund




Re: PL/Python: Fix return in the middle of PG_TRY() block.

2023-05-03 Thread Nathan Bossart
On Wed, May 03, 2023 at 04:33:32PM -0400, Tom Lane wrote:
> Nathan Bossart  writes:
>> Here's a new version of the patch.  Besides adding comments and a commit
>> message, I made sure to decrement the reference count for pltargs in the
>> PG_CATCH block (which means that pltargs likely needs to be volatile).
> 
> Hmm, actually I think these changes should allow you to *remove* some
> volatile markers.  IIUC, you need volatile for variables that are declared
> outside PG_TRY but modified within it.  That is the case for these
> pointers as the code stands, but your patch is changing them to the
> non-risky case where they are assigned once before entering PG_TRY.
> 
> (My mental model of this is that without "volatile", the compiler
> may keep the variable in a register, creating the hazard that longjmp
> will revert the variable's value to what it was at setjmp time thanks
> to the register save/restore that those functions do.  But if it hasn't
> changed value since entering PG_TRY, then that doesn't matter.)

Ah, I think you are right.  elog.h states as follows:

 * Note: if a local variable of the function containing PG_TRY is modified
 * in the PG_TRY section and used in the PG_CATCH section, that variable
 * must be declared "volatile" for POSIX compliance.  This is not mere
 * pedantry; we have seen bugs from compilers improperly optimizing code
 * away when such a variable was not marked.  Beware that gcc's -Wclobbered
 * warnings are just about entirely useless for catching such oversights.

With this change, pltdata isn't modified in the PG_TRY section, and the
only modification of pltargs happens after all elogs.  It might be worth
keeping pltargs volatile in case someone decides to add an elog() in the
future, but I see no need to keep it for pltdata.
 
-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: PL/Python: Fix return in the middle of PG_TRY() block.

2023-05-03 Thread Tom Lane
Nathan Bossart  writes:
> Here's a new version of the patch.  Besides adding comments and a commit
> message, I made sure to decrement the reference count for pltargs in the
> PG_CATCH block (which means that pltargs likely needs to be volatile).

Hmm, actually I think these changes should allow you to *remove* some
volatile markers.  IIUC, you need volatile for variables that are declared
outside PG_TRY but modified within it.  That is the case for these
pointers as the code stands, but your patch is changing them to the
non-risky case where they are assigned once before entering PG_TRY.

(My mental model of this is that without "volatile", the compiler
may keep the variable in a register, creating the hazard that longjmp
will revert the variable's value to what it was at setjmp time thanks
to the register save/restore that those functions do.  But if it hasn't
changed value since entering PG_TRY, then that doesn't matter.)

> I'm
> not too wild about moving the chunk of code for pltargs like this, but I
> haven't thought of a better option.  We could error instead of returning
> NULL, but IIUC that would go against d0aa965's stated purpose.

Agreed, throwing an error in these situations doesn't improve matters.

regards, tom lane




Re: PL/Python: Fix return in the middle of PG_TRY() block.

2023-05-03 Thread Nathan Bossart
Here's a new version of the patch.  Besides adding comments and a commit
message, I made sure to decrement the reference count for pltargs in the
PG_CATCH block (which means that pltargs likely needs to be volatile).  I'm
not too wild about moving the chunk of code for pltargs like this, but I
haven't thought of a better option.  We could error instead of returning
NULL, but IIUC that would go against d0aa965's stated purpose.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
From 12daf35ca398a34046d911270283f2cb7ebcbf3f Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Wed, 3 May 2023 11:32:43 -0700
Subject: [PATCH v3 1/1] Fix improper returns in PG_TRY blocks.

If we exit a PG_TRY block early via "continue", "break", "goto", or
"return", we'll skip unwinding its exception stack.  This change
moves a couple of such "return" statements in PL/Python out of
PG_TRY blocks.  This was introduced in d0aa965c0a and affects all
supported versions.

We might also be able to add compile-time checks to avoid
recurrence, but that is left as a future exercise.

Reported-by: Mikhail Gribkov, Xing Guo
Author: Xing Guo
Reviewed-by: Michael Paquier, Andres Freund, Tom Lane
Discussion: https://postgr.es/m/CAMEv5_v5Y%2B-D%3DCO1%2Bqoe16sAmgC4sbbQjz%2BUtcHmB6zcgS%2B5Ew%40mail.gmail.com
Discussion: https://postgr.es/m/CACpMh%2BCMsGMRKFzFMm3bYTzQmMU5nfEEoEDU2apJcc4hid36AQ%40mail.gmail.com
Backpatch-through: 11 (all supported versions)
---
 src/pl/plpython/plpy_exec.c | 48 +
 1 file changed, 33 insertions(+), 15 deletions(-)

diff --git a/src/pl/plpython/plpy_exec.c b/src/pl/plpython/plpy_exec.c
index 923703535a..9a483daa8e 100644
--- a/src/pl/plpython/plpy_exec.c
+++ b/src/pl/plpython/plpy_exec.c
@@ -414,12 +414,17 @@ PLy_function_build_args(FunctionCallInfo fcinfo, PLyProcedure *proc)
 	PyObject   *volatile args = NULL;
 	int			i;
 
+	/*
+	 * Make any Py*_New() calls before the PG_TRY block so that we can quickly
+	 * return NULL on failure.  We can't return within the PG_TRY block, else
+	 * we'd miss unwinding the exception stack.
+	 */
+	args = PyList_New(proc->nargs);
+	if (!args)
+		return NULL;
+
 	PG_TRY();
 	{
-		args = PyList_New(proc->nargs);
-		if (!args)
-			return NULL;
-
 		for (i = 0; i < proc->nargs; i++)
 		{
			PLyDatumToOb *arginfo = &proc->args[i];
@@ -683,19 +688,34 @@ PLy_trigger_build_args(FunctionCallInfo fcinfo, PLyProcedure *proc, HeapTuple *r
 			   *pltlevel,
 			   *pltrelid,
 			   *plttablename,
-			   *plttableschema;
-	PyObject   *pltargs,
+			   *plttableschema,
 			   *pytnew,
 			   *pytold;
+	PyObject   *volatile pltargs = NULL;
 	PyObject   *volatile pltdata = NULL;
 	char	   *stroid;
 
-	PG_TRY();
+	/*
+	 * Make any Py*_New() calls before the PG_TRY block so that we can quickly
+	 * return NULL on failure.  We can't return within the PG_TRY block, else
+	 * we'd miss unwinding the exception stack.
+	 */
+	pltdata = PyDict_New();
+	if (!pltdata)
+		return NULL;
+
+	if (tdata->tg_trigger->tgnargs)
 	{
-		pltdata = PyDict_New();
-		if (!pltdata)
+		pltargs = PyList_New(tdata->tg_trigger->tgnargs);
+		if (!pltargs)
+		{
+			Py_DECREF(pltdata);
 			return NULL;
+		}
+	}
 
+	PG_TRY();
+	{
 		pltname = PLyUnicode_FromString(tdata->tg_trigger->tgname);
 		PyDict_SetItemString(pltdata, "name", pltname);
 		Py_DECREF(pltname);
@@ -835,12 +855,9 @@ PLy_trigger_build_args(FunctionCallInfo fcinfo, PLyProcedure *proc, HeapTuple *r
 			int			i;
 			PyObject   *pltarg;
 
-			pltargs = PyList_New(tdata->tg_trigger->tgnargs);
-			if (!pltargs)
-			{
-				Py_DECREF(pltdata);
-				return NULL;
-			}
+			/* pltargs should have been allocated before the PG_TRY block. */
+			Assert(pltargs);
+
 			for (i = 0; i < tdata->tg_trigger->tgnargs; i++)
 			{
 pltarg = PLyUnicode_FromString(tdata->tg_trigger->tgargs[i]);
@@ -861,6 +878,7 @@ PLy_trigger_build_args(FunctionCallInfo fcinfo, PLyProcedure *proc, HeapTuple *r
 	}
 	PG_CATCH();
 	{
+		Py_XDECREF(pltargs);
 		Py_XDECREF(pltdata);
 		PG_RE_THROW();
 	}
-- 
2.25.1



Re: issue with meson builds on msys2

2023-05-03 Thread Andrew Dunstan


On 2023-05-03 We 09:20, Andrew Dunstan wrote:



On 2023-04-27 Th 18:18, Andres Freund wrote:

Hi,

On 2023-04-26 09:59:05 -0400, Andrew Dunstan wrote:

Still running into this, and I am rather stumped. This is a blocker for
buildfarm support for meson:

Here's a simple illustration of the problem. If I do the identical test with
a non-meson build there is no problem:

This happens 100% reproducible?



For a sufficiently modern installation of msys2 (20230318 version) 
this is reproducible on autoconf builds as well.


For now it's off my list of meson blockers. I will pursue the issue 
when I have time, but for now the IPC::Run workaround is sufficient.


The main thing that's now an issue on Windows is support for various 
options like libxml2. I installed the libxml2 distro from the package 
manager scoop, generated .lib files for the libxml2 and libxslt DLLs, 
and was able to build with autoconf on msys2, and with our MSVC 
support, but not with meson in either case. It looks like we need to 
expand the logic in meson.build for a number of these, just as we have 
done for perl, python, openssl, ldap etc.






I've actually made some progress on this front. I grabbed and built 
https://github.com/pkgconf/pkgconf.git (with meson :-) )


After that I set PKG_CONFIG_PATH to point to where the libxml .pc files 
are installed, and lo and behold the meson/msvc build worked with libxml 
/ libxslt. I did have to move libxml's openssl.pc file aside, as the 
distro's version of openssl is extremely old, and we don't want to use 
it (I'm using 3.1.0).


Of course, this imposes an extra build dependency for Windows, but it's 
not too onerous.


It also means that if anyone wants to use some dependency without a .pc 
file they would need to create one. I'll keep trying to expand the list 
of things I configure with.


Next targets will include ldap, lz4 and zstd.

I also need to test this with msys2, so far I have only tested with MSVC.


cheers


andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: refactoring relation extension and BufferAlloc(), faster COPY

2023-05-03 Thread Muhammad Malik

> I use a script like:

> c=16;psql -c 'DROP TABLE IF EXISTS copytest_0; CREATE TABLE copytest_0(data 
> text not null);' && time /srv/dev/build/m-opt/src/bin/pgbench/pgbench -n -P1 
> -c$c -j$c -t$((1024/$c)) -f ~/tmp/copy.sql && psql -c 'TRUNCATE copytest_0'

> >[1] COPY (SELECT repeat(random()::text, 5) FROM generate_series(1, 10)) 
> >TO '/tmp/copytest_data_text.copy' WITH (FORMAT text);
> >[2] COPY (SELECT repeat(random()::text, 5) FROM generate_series(1, 
> >6*10)) TO '/tmp/copytest_data_text.copy' WITH (FORMAT text);

When I ran this script it did not insert anything into the copytest_0 table. It 
only generated a single copytest_data_text.copy file of size 9.236MB.
Please help me understand how this is 'pgbench running COPY into a single 
table'. Also, what are the 'seconds' and 'tbl-MBs' metrics that were reported?

Thank you,
Muhammad




Re: SQL/JSON revisited

2023-05-03 Thread Matthias Kurz
On Wed, 3 May 2023 at 20:17, Alvaro Herrera  wrote:

>
> I would suggest to start a new thread with updated patches, and then a new
> commitfest entry can be created with those.
>

Whoever starts that new thread, please link it here, I am keen to
follow it ;) Thanks a lot!
Thanks a lot for all your hard work btw, it's highly appreciated!

Best,
Matthias


Re: refactoring relation extension and BufferAlloc(), faster COPY

2023-05-03 Thread Andres Freund
Hi,

On 2023-05-03 18:25:51 +, Muhammad Malik wrote:
> Could you please share repro steps for running these benchmarks? I am doing 
> performance testing in this area and want to use the same benchmarks.

The email should contain all the necessary information. What are you missing?


c=16;psql -c 'DROP TABLE IF EXISTS copytest_0; CREATE TABLE copytest_0(data
text not null);' && time /srv/dev/build/m-opt/src/bin/pgbench/pgbench -n -P1
-c$c -j$c -t$((1024/$c)) -f ~/tmp/copy.sql && psql -c 'TRUNCATE copytest_0'


> I've done a fair bit of benchmarking of this patchset. For COPY it comes out
> ahead everywhere. It's possible that there's a very small regression for
> extremly IO miss heavy workloads, more below.
>
>
> server "base" configuration:
>
> max_wal_size=150GB
> shared_buffers=24GB
> huge_pages=on
> autovacuum=0
> backend_flush_after=2MB
> max_connections=5000
> wal_buffers=128MB
> wal_segment_size=1GB
>
> benchmark: pgbench running COPY into a single table. pgbench -t is set
> according to the client count, so that the same amount of data is inserted.
> This is done oth using small files ([1], ringbuffer not effective, no dirty
> data to write out within the benchmark window) and a bit larger files ([2],
> lots of data to write out due to ringbuffer).

I use a script like:

c=16;psql -c 'DROP TABLE IF EXISTS copytest_0; CREATE TABLE copytest_0(data 
text not null);' && time /srv/dev/build/m-opt/src/bin/pgbench/pgbench -n -P1 
-c$c -j$c -t$((1024/$c)) -f ~/tmp/copy.sql && psql -c 'TRUNCATE copytest_0'


> [1] COPY (SELECT repeat(random()::text, 5) FROM generate_series(1, 10)) 
> TO '/tmp/copytest_data_text.copy' WITH (FORMAT text);
> [2] COPY (SELECT repeat(random()::text, 5) FROM generate_series(1, 6*10)) 
> TO '/tmp/copytest_data_text.copy' WITH (FORMAT text);


Greetings,

Andres Freund




Re: issue with meson builds on msys2

2023-05-03 Thread Andres Freund
Hi,

On 2023-05-03 09:20:28 -0400, Andrew Dunstan wrote:
> On 2023-04-27 Th 18:18, Andres Freund wrote:
> > Hi,
> > 
> > On 2023-04-26 09:59:05 -0400, Andrew Dunstan wrote:
> > > Still running into this, and I am rather stumped. This is a blocker for
> > > buildfarm support for meson:
> > > 
> > > Here's a simple illustration of the problem. If I do the identical test 
> > > with
> > > a non-meson build there is no problem:
> > This happens 100% reproducible?

> For a sufficiently modern installation of msys2 (20230318 version) this is
> reproducible on autoconf builds as well.

Oh. Seems like something we need to dig into independent of meson then :(


> The main thing that's now an issue on Windows is support for various options
> like libxml2. I installed the libxml2 distro from the package manager scoop,
> generated .lib files for the libxml2 and libxslt DLLs, and was able to build
> with autoconf on msys2, and with our MSVC support, but not with meson in
> either case. It looks like we need to expand the logic in meson.build for a
> number of these, just as we have done for perl, python, openssl, ldap etc.

I seriously doubt that trying to support every possible packaging thing on
windows is a good idea. What's the point of building against libraries from a
packaging solution that doesn't even come with .lib files? Windows already is
a massive pain to support for postgres, making it even more complicated / less
predictable is a really bad idea.

IMO, for windows, the path we should go down is to provide one documented way
to build the dependencies (e.g. using vcpkg or conan, perhaps also supporting
msys distributed libs), and define using something else to be unsupported (in
the "we don't help you", not in the "we explicitly try to break things"
sense).  And it should be something that understands needing to build debug
and non-debug libraries.

Greetings,

Andres Freund




Re: refactoring relation extension and BufferAlloc(), faster COPY

2023-05-03 Thread Muhammad Malik
Hi,

Could you please share repro steps for running these benchmarks? I am doing 
performance testing in this area and want to use the same benchmarks.

Thanks,
Muhammad

From: Andres Freund 
Sent: Friday, October 28, 2022 7:54 PM
To: pgsql-hack...@postgresql.org ; Thomas Munro 
; Melanie Plageman 
Cc: Yura Sokolov ; Robert Haas 
Subject: refactoring relation extension and BufferAlloc(), faster COPY

Hi,

I'm working to extract independently useful bits from my AIO work, to reduce
the size of that patchset. This is one of those pieces.

In workloads that extend relations a lot, we end up being extremely contended
on the relation extension lock. We've attempted to address that to some degree
by using batching, which helps, but only so much.

The fundamental issue, in my opinion, is that we do *way* too much while
holding the relation extension lock.  We acquire a victim buffer, if that
buffer is dirty, we potentially flush the WAL, then write out that
buffer. Then we zero out the buffer contents. Call smgrextend().

Most of that work does not actually need to happen while holding the relation
extension lock. As far as I can tell, the minimum that needs to be covered by
the extension lock is the following:

1) call smgrnblocks()
2) insert buffer[s] into the buffer mapping table at the location returned by
   smgrnblocks
3) mark buffer[s] as IO_IN_PROGRESS


1) obviously has to happen with the relation extension lock held because
otherwise we might miss another relation extension. 2+3) need to happen with
the lock held, because otherwise another backend not doing an extension could
read the block before we're done extending, dirty it, write it out, and then
have it overwritten by the extending backend.
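In rough pseudocode, the narrowed critical section described above might look like this (all function names here are illustrative sketches, not the actual patch's API):

```c
/* Pseudocode sketch only -- not the actual patch. */
victim = AcquireCleanVictimBuffer();	/* pin, clean, evict: no extension lock needed */

LockRelationForExtension(rel, ExclusiveLock);
first_block = smgrnblocks(smgr, forknum);					/* 1) */
InsertBufferMappingEntries(smgr, forknum, first_block, victim);	/* 2) */
MarkBuffersIoInProgress(victim);							/* 3) */
UnlockRelationForExtension(rel, ExclusiveLock);

/*
 * Zeroing the buffer contents, extending the file on disk, and any
 * WAL flush / write-out needed to clean the victim buffer can all
 * happen outside the extension lock.
 */
```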


The reason we currently do so much work while holding the relation extension
lock is that bufmgr.c does not know about the relation lock and that relation
extension happens entirely within ReadBuffer* - there's no way to use a
narrower scope for the lock.


My fix for that is to add a dedicated function for extending relations, that
can acquire the extension lock if necessary (callers can tell it to skip that,
e.g., when initially creating an init fork). This routine is called by
ReadBuffer_common() when P_NEW is passed in, to provide backward
compatibility.


To be able to acquire victim buffers outside of the extension lock, victim
buffers are now acquired separately from inserting the new buffer mapping
entry. Victim buffer are pinned, cleaned, removed from the buffer mapping
table and marked invalid. Because they are pinned, clock sweeps in other
backends won't return them. This is done in a new function,
[Local]BufferAlloc().

This is similar to Yuri's patch at [0], but not that similar to earlier or
later approaches in that thread. I don't really understand why that thread
went on to ever more complicated approaches, when the basic approach shows
plenty gains, with no issues around the number of buffer mapping entries that
can exist etc.



Other interesting bits I found:

a) For workloads that [mostly] fit into s_b, the smgwrite() that BufferAlloc()
   does, nearly doubles the amount of writes. First the kernel ends up writing
   out all the zeroed out buffers after a while, then when we write out the
   actual buffer contents.

   The best fix for that seems to be to optionally use posix_fallocate() to
   reserve space, without dirtying pages in the kernel page cache. However, it
   looks like that's only beneficial when extending by multiple pages at once,
   because it ends up causing one filesystem-journal entry for each extension
   on at least some filesystems.

   I added 'smgrzeroextend()' that can extend by multiple blocks, without the
   caller providing a buffer to write out. When extending by 8 or more blocks,
   posix_fallocate() is used if available, otherwise pg_pwritev_with_retry() is
   used to extend the file.


b) I found that is quite beneficial to bulk-extend the relation with
   smgrextend() even without concurrency. The reason for that is the primarily
   the aforementioned dirty buffers that our current extension method causes.

   One bit that stumped me for quite a while is to know how much to extend the
   relation by. RelationGetBufferForTuple() drives the decision whether / how
   much to bulk extend purely on the contention on the extension lock, which
   obviously does not work for non-concurrent workloads.

   After quite a while I figured out that we actually have good information on
   how much to extend by, at least for COPY /
   heap_multi_insert(). heap_multi_insert() can compute how much space is
   needed to store all tuples, and pass that on to
   RelationGetBufferForTuple().

   For that to be accurate we need to recompute that number whenever we use an
   already partially filled page. That's not great, but doesn't appear to be a
   measurable overhead.


c) Contention on the FSM and the pages returned by it is a serious 

Re: SQL/JSON revisited

2023-05-03 Thread Alvaro Herrera
On 2023-May-03, Matthias Kurz wrote:

> On Wed, 5 Apr 2023 at 09:53, Alvaro Herrera  wrote:
> 
> > Okay, I've marked the CF entry as committed then.
> 
> This was marked as committed in the 2023-03 commitfest, however there are
> still patches missing (for example the JSON_TABLE one).
> However, I can not see an entry in the current 2023-07 Commitfest.
> I think it would be a good idea for a new entry in the current commitfest,
> just to not forget about the not-yet-committed features.

Yeah ... These remaining patches have to be rebased, and a few things
fixed (I left a few review comments somewhere).  I would suggest to
start a new thread with updated patches, and then a new commitfest entry
can be created with those.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Las mujeres son como hondas:  mientras más resistencia tienen,
 más lejos puedes llegar con ellas"  (Jonas Nightingale, Leap of Faith)




Re: [PATCH] Clarify the behavior of the system when approaching XID wraparound

2023-05-03 Thread Peter Geoghegan
On Wed, May 3, 2023 at 12:30 AM John Naylor
 wrote:
> I went to go find the phrase that I thought I was reacting to, and ... 
> nothing. I am also baffled.  My comment was inexcusable.

I'd quite like to drop this topic, and get on with the work at hand.
But before I do that, I ask you to consider one thing: if you were
mistaken about my words (or their intent) on this occasion, isn't it
also possible that it wasn't the first time?

I never had the opportunity to sit down to talk with you face to face
before now. If things had been different (if we managed to talk at one
of the PGCons before COVID, say), then maybe this incident would have
happened in just the same way. I can't help but think that some face
time would have prevented the whole episode, though.

You have every right to dislike me on a personal level, of course, but
if you do then I'd very much prefer that it be due to one of my actual
flaws. I'm not a petty man -- I don't resent the success of others.
I've always thought that you do rather good work. Plus I'm just not in
the habit of obstructing things that I directly benefit from.

-- 
Peter Geoghegan




Re: SQL/JSON revisited

2023-05-03 Thread Matthias Kurz
On Wed, 5 Apr 2023 at 09:53, Alvaro Herrera  wrote:

>
> Okay, I've marked the CF entry as committed then.
>

This was marked as committed in the 2023-03 commitfest, however there are
still patches missing (for example the JSON_TABLE one).
However, I can not see an entry in the current 2023-07 Commitfest.
I think it would be a good idea for a new entry in the current commitfest,
just to not forget about the not-yet-committed features.

Thanks!
Matthias


Re: psql: Add role's membership options to the \du+ command

2023-05-03 Thread Jonathan S. Katz

On 5/3/23 12:25 PM, Tom Lane wrote:

"David G. Johnston"  writes:

On Wed, May 3, 2023 at 9:00 AM Jonathan S. Katz 
wrote:

I don't see why this is an open item as this feature was not committed
for v16. Open items typically revolve around committed features.



The argument is that updating the psql \d views to show the newly added
options is something that the patch to add those options should have done
before being committed.


Yeah, if there is not any convenient way to see that info in psql
then that seems like a missing part of the feature.


[RMT hat]

OK -- I was rereading the thread again to see if I could glean that 
insight. There was a comment buried in the thread with David's opinion 
on that front, but no one had +1'd that.


However, if this is for feature completeness, I'll withdraw the closing 
of the open item, but would strongly suggest we complete it in time for 
Beta 1.


[Personal hat]

Looking at Pavel's latest patch, I do find the output easy to 
understand, though do we need to explicitly list "empty" if there are no 
membership permissions?


Thanks,

Jonathan




Re: psql: Add role's membership options to the \du+ command

2023-05-03 Thread Tom Lane
"David G. Johnston"  writes:
> On Wed, May 3, 2023 at 9:00 AM Jonathan S. Katz 
> wrote:
>> I don't see why this is an open item as this feature was not committed
>> for v16. Open items typically revolve around committed features.

> The argument is that updating the psql \d views to show the newly added
> options is something that the patch to add those options should have done
> before being committed.

Yeah, if there is not any convenient way to see that info in psql
then that seems like a missing part of the feature.

regards, tom lane




Re: psql: Add role's membership options to the \du+ command

2023-05-03 Thread David G. Johnston
On Wed, May 3, 2023 at 9:00 AM Jonathan S. Katz 
wrote:

> On 4/13/23 8:44 AM, Pavel Luzanov wrote:
>
> > P.S. If no objections I plan to add this patch to Open Items for v16
> > https://wiki.postgresql.org/wiki/PostgreSQL_16_Open_Items
>
> [RMT hat]
>
> I don't see why this is an open item as this feature was not committed
> for v16. Open items typically revolve around committed features.
>
> Unless someone makes a convincing argument otherwise, I'll remove this
> from the Open Items list[1] tomorrow.
>
> Thanks,
>
> Jonathan
>
> [1] https://wiki.postgresql.org/wiki/PostgreSQL_16_Open_Items


The argument is that updating the psql \d views to show the newly added
options is something that the patch to add those options should have done
before being committed.  Or, at worse, we should decide now that we don't
want to do so and spare people the effort of trying to get this committed
later.

David J.


Re: psql: Add role's membership options to the \du+ command

2023-05-03 Thread Jonathan S. Katz

On 4/13/23 8:44 AM, Pavel Luzanov wrote:


P.S. If no objections I plan to add this patch to Open Items for v16
https://wiki.postgresql.org/wiki/PostgreSQL_16_Open_Items


[RMT hat]

I don't see why this is an open item as this feature was not committed 
for v16. Open items typically revolve around committed features.


Unless someone makes a convincing argument otherwise, I'll remove this 
from the Open Items list[1] tomorrow.


Thanks,

Jonathan

[1] https://wiki.postgresql.org/wiki/PostgreSQL_16_Open_Items




Re: pg_stat_io not tracking smgrwriteback() is confusing

2023-05-03 Thread Jonathan S. Katz

On 4/27/23 11:36 AM, Melanie Plageman wrote:

Thanks for the review!

On Wed, Apr 26, 2023 at 10:22 PM Kyotaro Horiguchi
 wrote:


At Wed, 26 Apr 2023 17:08:14 -0400, Melanie Plageman 
 wrote in

On Mon, Apr 24, 2023 at 9:29 PM Melanie Plageman
 wrote:

I've yet to cook up a client backend test case (e.g. with COPY). I've taken
that as a todo.


It was trivial to see client backend writebacks in almost any scenario
once I set backend_flush_after. I wonder if it is worth mentioning the
various "*flush_after" gucs in the docs?


I have a few outstanding questions:

1) Does it make sense for writebacks to count the number of blocks for
which writeback was requested or the number of calls to smgrwriteback() or
the number of actual syscalls made? We don't actually know from outside
of mdwriteback() how many FileWriteback() calls we will make.


So, in the attached v3, I've kept the first method: writebacks are the
number of blocks which the backend has requested writeback of. I'd like
it to be clear in the docs exactly what writebacks are (so that people
know not to add them together with writes or something like that). I
made an effort but could use further docs review.


+Number of units of size op_bytes which the backend
+requested the kernel write out to permanent storage.

I just want to mention that it is not necessarily the same as the
number of system calls, but I'm not sure what others think about that.


My thinking is that some other IO operations, for example, extends,
count the number of blocks extended and not the number of syscalls.


+Time spent in writeback operations in milliseconds (if
+track_io_timing is enabled, otherwise zero). This
+does not necessarily count the time spent by the kernel writing the
+data out. The backend will initiate write-out of the dirty pages and
+wait only if the request queue is full.

The last sentence looks like it's taken from the sync_file_range() man
page, but I think it's a bit too detailed. We could just say, "The
time usually only includes the time it takes to queue write-out
requests.", bit I'm not sure wh...


Ah, yes, I indeed took heavy inspiration from the sync_file_range()
man page :) I've modified this comment in the attached v4. I didn't want
to say "usually" since I imagine it is quite workload and configuration
dependent.


2) I'm a little nervous about not including IOObject in the writeback
context. Technically, there is nothing stopping local buffer code from
calling IssuePendingWritebacks(). Right now, local buffer code doesn't
do ScheduleBufferTagForWriteback(). But it doesn't seem quite right to
hardcode in IOOBJECT_RELATION when there is nothing wrong with
requesting writeback of local buffers (AFAIK). What do you think?


I've gone ahead and added IOObject to the WritebackContext.


The smgropen call in IssuePendingWritebacks below clearly shows that
the function only deals with shared buffers.


   /* and finally tell the kernel to write the data to storage */
   reln = smgropen(currlocator, InvalidBackendId);
   smgrwriteback(reln, BufTagGetForkNum(&tag), tag.blockNum, nblocks);


Yes, as it is currently, IssuePendingWritebacks() is only used for shared
buffers. My rationale for including IOObject is that localbuf.c calls
smgr* functions and there isn't anything stopping it from calling
smgrwriteback() or using WritebackContexts (AFAICT).


The callback-related code fully depends on callers following its
expectation. So we can rewrite the following comment added to
InitBufferPool with a more confident tone.

+* Initialize per-backend file flush context. IOObject is initialized to
+* IOOBJECT_RELATION and IOContext to IOCONTEXT_NORMAL since these are 
the
+* most likely targets for writeback. The backend can overwrite these as
+* appropriate.


I have updated this comment to be more confident and specific.


Or I actually think we might not even need to pass around the io_*
parameters and could just pass immediate values to the
pgstat_count_io_op_time call. If we ever start using shared buffers
for thing other than relation files (for example SLRU?), we'll have to
consider the target individually for each buffer block. That being
said, I'm fine with how it is either.


In IssuePendingWritebacks() we don't actually know which IOContext we
are issuing writebacks for when we call pgstat_count_io_op_time() (we do
issue pending writebacks for other IOContexts than IOCONTEXT_NORMAL). I
agree IOObject is not strictly necessary right now. I've kept IOObject a
member of WritebackContext for the reasons I mention above, however, I
am open to removing it if it adds confusion.


[RMT hat]

Horiguchi-san: do the changes that Melanie made address your feedback?

It'd be good if we can get this into Beta 1 if everyone is comfortable 
with the patch.


Thanks,

Jonathan




Re: Move defaults toward ICU in 16?

2023-05-03 Thread Jonathan S. Katz

On 4/17/23 2:33 PM, Tom Lane wrote:

Jeff Davis  writes:

Is now a reasonable time to check it in and see what breaks? It looks
like there are quite a few buildfarm members that specify neither --
with-icu nor --without-icu.


I see you just pinged buildfarm-members again, so I'd think it's
polite to give people 24 hours or so to deal with that before
you break things.


[RMT hat]

This thread has fallen silent and the RMT wanted to check in.

The RMT did have a brief discussion on $SUBJECT. We agree with several 
points that regardless of if/when ICU becomes the default collation 
provider for PostgreSQL, we'll likely have to flush out several issues. 
The question is how long we want that period to be before releasing the 
default.


Right now, and in absence of critical issues or objections, the RMT is 
OK with leaving in ICU as the default collation provider for Beta 1. If 
we're to revert back to glibc, we recommend doing this before Beta 2.


However, if there are strong objections to this proposal, please do 
state them.


Thanks,

Jonathan




Re: Rename 'lpp' to 'lp' in heapam.c

2023-05-03 Thread Robert Haas
On Tue, May 2, 2023 at 10:18 PM David Rowley  wrote:
> I don't really agree that one is any more correct than the other. I
> also don't think we should be making changes like this as doing this
> may give some false impression that we have some standard to follow
> here that a local variable of a given type must be given a certain
> name. To comply with such a standard seems like it would take close to
> an endless number of patches which would just result in wasted
> reviewer and committer time and give us nothing but pain while back
> patching.
>
> -1 from me.

I agree with David. This seems like pointless code churn.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: issue with meson builds on msys2

2023-05-03 Thread Andrew Dunstan


On 2023-04-27 Th 18:18, Andres Freund wrote:

Hi,

On 2023-04-26 09:59:05 -0400, Andrew Dunstan wrote:

Still running into this, and I am rather stumped. This is a blocker for
buildfarm support for meson:

Here's a simple illustration of the problem. If I do the identical test with
a non-meson build there is no problem:

This happens 100% reproducible?



For a sufficiently modern installation of msys2 (20230318 version) this 
is reproducible on autoconf builds as well.


For now it's off my list of meson blockers. I will pursue the issue when 
I have time, but for now the IPC::Run workaround is sufficient.


The main thing that's now an issue on Windows is support for various 
options like libxml2. I installed the libxml2 distro from the package 
manager scoop, generated .lib files for the libxml2 and libxslt DLLs, 
and was able to build with autoconf on msys2, and with our MSVC support, 
but not with meson in either case. It looks like we need to expand the 
logic in meson.build for a number of these, just as we have done for 
perl, python, openssl, ldap etc.



cheers


andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: Logging parallel worker draught

2023-05-03 Thread Robert Haas
On Tue, May 2, 2023 at 6:57 AM Amit Kapila  wrote:
> We can output this at the LOG level to avoid running the server at
> DEBUG1 level. There are a few other cases where we are not able to
> spawn the worker or process and those are logged at the LOG level. For
> example, "could not fork autovacuum launcher process .." or "too many
> background workers". So, not sure, if this should get a separate
> treatment. If we fear this can happen frequently enough that it can
> spam the LOG then a GUC may be worthwhile.

I think we should definitely be afraid of that. I am in favor of a separate GUC.

> > What I was wondering was whether we would be better off putting this
> > into the statistics collector, vs. doing it via logging. Both
> > approaches seem to have pros and cons.
>
> I think it could be easier for users to process the information if it
> is available via some view, so there is a benefit in putting this into
> the stats subsystem.

Unless we do this instead.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg_stat_io for the startup process

2023-05-03 Thread Melih Mutlu
Hi,

Andres Freund  wrote the following on Thu, 27 Apr 2023 at 19:27:

> Huh - do you have a link to the failure? That's how I would expect it to be
> done.


I think I should have registered it at the beginning of
PerformWalRecovery() and not just before the main redo loop,
since HandleStartupProcInterrupts is called before the loop too.
Here's the error log, in any case [1].

>  #ifdef WAL_DEBUG
> >   if (XLOG_DEBUG ||
> >   (record->xl_rmid == RM_XACT_ID &&
> trace_recovery_messages <= DEBUG2) ||
> > @@ -3617,6 +3621,13 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr,
> bool randAccess,
> >   /* Do background tasks
> that might benefit us later. */
> >
>  KnownAssignedTransactionIdsIdleMaintenance();
> >
> > + /*
> > +  * Need to disable flush
> timeout to avoid unnecessary
> > +  * wakeups. Enable it
> again after a WAL record is read
> > +  * in PerformWalRecovery.
> > +  */
> > +
>  disable_startup_stat_flush_timeout();
> > +
> >   (void)
> WaitLatch(&XLogRecoveryCtl->recoveryWakeupLatch,
> >
>   WL_LATCH_SET | WL_TIMEOUT |
> >
>   WL_EXIT_ON_PM_DEATH,
>
> I think always disabling the timer here isn't quite right - we want to
> wake up
> *once* in WaitForWALToBecomeAvailable(), otherwise we'll not submit pending
> stats before waiting - potentially for a long time - for WAL. One way
> would be
> just explicitly report before the WaitLatch().
>
> Another approach, I think better, would be to not use
> enable_timeout_every(),
> and to explicitly rearm the timer in HandleStartupProcInterrupts(). When
> called from WaitForWALToBecomeAvailable(), we'd not rearm, and instead do
> so
> at the end of WaitForWALToBecomeAvailable().  That way we also wouldn't
> repeatedly fire between calls to HandleStartupProcInterrupts().
>

The attached patch is probably not doing what you asked. IIUC,
HandleStartupProcInterrupts should arm the timer, but shouldn't arm it
when called from WaitForWALToBecomeAvailable; instead, the timer should be
armed again at the very end of WaitForWALToBecomeAvailable. I've been
thinking about how to do this properly. Should HandleStartupProcInterrupts
take a parameter to decide whether the timer needs to be armed? Or do we
need an additional global flag to rearm the timer? Any thoughts?

[1]
https://api.cirrus-ci.com/v1/artifact/task/5282291971260416/testrun/build/testrun/recovery/010_logical_decoding_timelines/log/010_logical_decoding_timelines_replica.log

Best,
-- 
Melih Mutlu
Microsoft


v2-0001-Add-timeout-to-flush-stats-during-startup-s-main-replay-loop.patch
Description: Binary data


Re: Add two missing tests in 035_standby_logical_decoding.pl

2023-05-03 Thread Drouvot, Bertrand

Hi,

On 5/3/23 12:29 PM, Amit Kapila wrote:

On Tue, May 2, 2023 at 4:52 PM Drouvot, Bertrand
 wrote:



As per your suggestion, changing the insert ordering (like in V8 attached) 
makes it now work on the failing environment too.



I think it is better to use wait_for_replay_catchup() to wait for
standby to catch up.


Oh right, that's a discussion we already had in [1], I should have thought 
about it.


 I have changed that and a comment in the
attached. I'll push this tomorrow unless there are further comments.



LGTM, thanks!

[1]: 
https://www.postgresql.org/message-id/acbac69e-9ae8-c546-3216-8ecb38e7a93d%40gmail.com

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-05-03 Thread Alvaro Herrera
On 2023-May-02, Julien Rouhaud wrote:

> On Tue, May 02, 2023 at 12:55:18PM +0200, Alvaro Herrera wrote:

> > A point on preserving physical replication slots: because we change WAL
> > format from one major version to the next (adding new messages or
> > changing format for other messages), we can't currently rely on physical
> > slots working across different major versions.
> 
> I don't think anyone suggested to do physical replication over different major
> versions.

They didn't, but a man can dream.  (Anyway, we agree on it not working
for various reasons.)

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"No es bueno caminar con un hombre muerto"




Re: Add two missing tests in 035_standby_logical_decoding.pl

2023-05-03 Thread Amit Kapila
On Tue, May 2, 2023 at 4:52 PM Drouvot, Bertrand
 wrote:
>
>
> As per your suggestion, changing the insert ordering (like in V8 attached) 
> makes it now work on the failing environment too.
>

I think it is better to use wait_for_replay_catchup() to wait for
standby to catch up. I have changed that and a comment in the
attached. I'll push this tomorrow unless there are further comments.

-- 
With Regards,
Amit Kapila.


v9-0001-Test-that-invalidated-logical-slots-doesn-t-retai.patch
Description: Binary data


Re: Add PQsendSyncMessage() to libpq

2023-05-03 Thread Alvaro Herrera
On 2023-May-02, Robert Haas wrote:

> On Mon, May 1, 2023 at 8:42 PM Michael Paquier  wrote:
> > Another thing that may matter in terms of extensibility?  Would a
> > boolean argument really be the best design?  Could it be better to
> > have instead one API with a bits32 and some flags controlling its
> > internals?
> 
> I wondered that, too. If we never add any more Boolean parameters to
> this function then that would end up a waste, but maybe we will and
> then it will be genius. Not sure what's best.

I agree that adding a flag is the way to go, since it improves the chances
that we won't end up with ten different functions in case we decide to
have eight other behaviors.  One more function and we're done.  And
while I can't think of any use for a future flag, we (I) already didn't
think of this one either, so let's not make the same mistake.

We already have 'int' flag masks in PQcopyResult() and
PQsetTraceFlags().  We were using bits32 initially for flag stuff in the
PQtrace facilities, until [1] reminded us that we shouldn't let c.h
creep into app-land, so that was turned into plain 'int'.

[1] 
https://www.postgresql.org/message-id/TYAPR01MB2990B6C6A32ACF15D97AE94AFEBD0%40TYAPR01MB2990.jpnprd01.prod.outlook.com

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"No nos atrevemos a muchas cosas porque son difíciles,
pero son difíciles porque no nos atrevemos a hacerlas" (Séneca)




Re: Support logical replication of DDLs

2023-05-03 Thread Alvaro Herrera
Patch 0001 adds a new event trigger type that can be fired, but it's
missing documentation and its own tests.  (I think part of the docs are
in 0002, but that seems to be only the changes to the supported
operations table, without any other explanation for it in sect1
event-trigger-definition, and examples showing it at work).  Adding a
new event trigger type is quite a major thing because it's user visible,
so a commit that adds that should be self-contained.  Users will want to
use it for other things as soon as it's in, for reasons other than what
you're adding it for.  This also means that you'll want to keep other
things separate, such as adding AlterTableStmt->table_like and the move
of structs from event_trigger.c to event_trigger.h ... and is
EventTriggerAlterTypeStart/End necessary in 0001 as well, or should it
be separate?

(I find a patch series posted as a single .tar.gz not very friendly.  I think
compression is okay, but perhaps compress each patch separately.)

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/




Re: [PATCH] Clarify the behavior of the system when approaching XID wraparound

2023-05-03 Thread John Naylor
On Wed, May 3, 2023 at 10:04 AM Peter Geoghegan  wrote:
>
> That's not the only reason, though. I also genuinely don't have the
> foggiest notion what was behind what you said. In particular, this
> part still makes zero sense to me:
>
> "Claim that others are holding you back, and then try to move the
> goalposts in their work"

I went to find the phrase that I thought I was reacting to, and ...
nothing. I am also baffled.  My comment was inexcusable.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: Perform streaming logical transactions by background workers and parallel apply

2023-05-03 Thread Amit Kapila
On Tue, May 2, 2023 at 9:46 AM Amit Kapila  wrote:
>
> On Tue, May 2, 2023 at 9:06 AM Zhijie Hou (Fujitsu)
>  wrote:
> >
> > On Friday, April 28, 2023 2:18 PM Masahiko Sawada  
> > wrote:
> > >
> > > >
> > > > Alexander, does the proposed patch fix the problem you are facing?
> > > > Sawada-San, and others, do you see any better way to fix it than what
> > > > has been proposed?
> > >
> > > I'm concerned that the idea of relying on IsNormalProcessingMode()
> > > might not be robust since if we change the meaning of
> > > IsNormalProcessingMode() some day it would silently break again. So I
> > > prefer using something like InitializingApplyWorker, or another idea
> > > would be to do cleanup work (e.g., fileset deletion and lock release)
> > > in a separate callback that is registered after connecting to the
> > > database.
> >
> > Thanks for the review. I agree that it’s better to use a new variable here.
> > Attach the patch for the same.
> >
>
> + *
> + * However, if the worker is being initialized, there is no need to release
> + * locks.
>   */
> - LockReleaseAll(DEFAULT_LOCKMETHOD, true);
> + if (!InitializingApplyWorker)
> + LockReleaseAll(DEFAULT_LOCKMETHOD, true);
>
> Can we slightly reword this comment as: "The locks will be acquired
> once the worker is initialized."?
>

After making this modification, I pushed your patch. Thanks!

-- 
With Regards,
Amit Kapila.




Re: Overhauling "Routine Vacuuming" docs, particularly its handling of freezing

2023-05-03 Thread samay sharma
Hi,

On Mon, Apr 24, 2023 at 2:58 PM Peter Geoghegan  wrote:

> My work on page-level freezing for PostgreSQL 16 has some remaining
> loose ends to tie up with the documentation. The "Routine Vacuuming"
> section of the docs has no mention of page-level freezing. It also
> doesn't mention the FPI optimization added by commit 1de58df4. This
> isn't a small thing to leave out; I fully expect that the FPI
> optimization will very significantly alter when and how VACUUM
> freezes. The cadence will look quite a lot different.
>
> It seemed almost impossible to fit in discussion of page-level
> freezing to the existing structure. In part this is because the
> existing documentation emphasizes the worst case scenario, rather than
> talking about freezing as a maintenance task that affects physical
> heap pages in roughly the same way as pruning does. There isn't a
> clean separation of things that would allow me to just add a paragraph
> about the FPI thing.
>
> Obviously it's important that the system never enters xidStopLimit
> mode -- not being able to allocate new XIDs is a huge problem. But it
> seems unhelpful to define that as the only goal of freezing, or even
> the main goal. To me this seems similar to defining the goal of
> cleaning up bloat as avoiding completely running out of disk space;
> while it may be "the single most important thing" in some general
> sense, it isn't all that important in most individual cases. There are
> many very bad things that will happen before that extreme worst case
> is hit, which are far more likely to be the real source of pain.
>
> There are also very big structural problems with "Routine Vacuuming",
> that I also propose to do something about. Honestly, it's a huge mess
> at this point. It's nobody's fault in particular; there has been
> accretion after accretion added, over many years. It is time to
> finally bite the bullet and do some serious restructuring. I'm hoping
> that I don't get too much push back on this, because it's already very
> difficult work.
>

Thanks for taking the time to do this. It is indeed difficult work. I'll
give my perspective as someone who has not read the vacuum code but has
learnt most of what I know about autovacuum / vacuuming by reading the
"Routine Vacuuming" page tens of times.


>
> Attached patch series shows what I consider to be a much better
> overall structure. To make this convenient to take a quick look at, I
> also attach a prebuilt version of routine-vacuuming.html (not the only
> page that I've changed, but the most important set of changes by far).
>
> This initial version is still quite lacking in overall polish, but I
> believe that it gets the general structure right. That's what I'd like
> to get feedback on right now: can I get agreement with me about the
> general nature of the problem? Does this high level direction seem
> like the right one?
>

There are things I like about the changes you've proposed and some where I
feel that the previous section was easier to understand. I'll comment
inline on the summary below and will put in a few points about things I
think can be improved at the end.


>
> The following list is a summary of the major changes that I propose:
>
> 1. Restructures the order of items to match the actual processing
> order within VACUUM (and ANALYZE), rather than jumping from VACUUM to
> ANALYZE and then back to VACUUM.
>
> This flows a lot better, which helps with later items that deal with
> freezing/wraparound.
>

+1


>
> 2. Renamed "Preventing Transaction ID Wraparound Failures" to
> "Freezing to manage the transaction ID space". Now we talk about
> wraparound as a subtopic of freezing, not vice-versa. (This is a
> complete rewrite, as described by later items in this list).
>

+1 on this too. Freezing is a normal part of vacuuming, and while
aggressive vacuums are different, I think talking only about the worst-case
scenario when referring to freezing is alarmist.


>
> 3. All of the stuff about modulo-2^32 arithmetic is moved to the
> storage chapter, where we describe the heap tuple header format.
>
> It seems crazy to me that the second sentence in our discussion of
> wraparound/freezing is still:
>
> "But since transaction IDs have limited size (32 bits) a cluster that
> runs for a long time (more than 4 billion transactions) would suffer
> transaction ID wraparound: the XID counter wraps around to zero, and
> all of a sudden transactions that were in the past appear to be in the
> future"
>
> Here we start the whole discussion of wraparound (a particularly
> delicate topic) by describing how VACUUM used to work 20 years ago,
> before the invention of freezing. That was the last time that a
> PostgreSQL cluster could run for 4 billion XIDs without freezing. The
> invariant is that we activate xidStopLimit mode protections to avoid a
> "distance" between any two unfrozen XIDs that exceeds about 2 billion
> XIDs. So why on earth are we talking about 4 billion XIDs? This is the
> most