Re: benchmark results comparing versions 15.2 and 16

2023-05-05 Thread MARK CALLAGHAN
I have two more runs of the benchmark in progress so we will have 3 results
for each of the test cases to confirm that the small regressions are
repeatable.

On Fri, May 5, 2023 at 1:34 PM Andres Freund  wrote:


> One thing that's somewhat odd is that there's very marked changes in l.i0's
> p99 latency for the four clients cases - but whether 15 or 16 are better
> differs between the runs.
>

From the response time sections for l.i0, the [1ms, 4ms) bucket has 99% or
more for all 6 cases.
For example,
https://mdcallag.github.io/reports/23_05_04_ibench.beelink.pg16b.1u.1tno.cached/all.html#l.i0.rt
But yes, the p99 is as you state. I will wade through my test scripts
tomorrow to see how the p99 is computed.
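As a side note on that script check: a p99 estimated by interpolating within bucketed response times stays inside the bucket that crosses the 99% mark. A minimal, hypothetical sketch of such an estimator (not the actual benchmark scripts):

```python
# Hypothetical sketch: estimating p99 from response-time histogram buckets
# (bounds in ms). Illustrates that a bucket-based estimate cannot leave the
# bucket containing the 99th-percentile sample.

def p99_from_buckets(buckets):
    """buckets: list of (low_ms, high_ms, count); linear interpolation."""
    total = sum(c for _, _, c in buckets)
    target = 0.99 * total
    seen = 0
    for low, high, count in buckets:
        if seen + count >= target:
            frac = (target - seen) / count  # position inside this bucket
            return low + frac * (high - low)
        seen += count
    return buckets[-1][1]

# 99.2% of samples in [1, 4) ms; tail buckets barely move the estimate
hist = [(0, 1, 0), (1, 4, 9920), (4, 16, 60), (16, 256, 20)]
print(p99_from_buckets(hist))  # ≈ 3.99 ms
```

If the [1ms, 4ms) bucket truly holds 99%+ of samples, a bucket-based p99 would stay under 4 ms, so reported values far above that would suggest the p99 comes from a different code path, which is what the script check would confirm.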

I do wonder if there's something getting scheduled in some of these runs
> increasing latency?
>

Do you mean interference from other processes? Both the big and small
servers run Ubuntu 22.04 server (no X) and there shouldn't be much else
running, although Google Cloud adds a few extra services that run in the
background.


> Or what we're seeing depends on the time between the start of the server
> and
> the start of the benchmark? It is interesting that the per-second
> throughput
> graph shows a lot of up/down at the end:
>
> https://mdcallag.github.io/reports/23_05_04_ibench.beelink.pg16b.4u.1tyes.1g/tput.l.i0.html
>
> Both 15 and 16 have one very high result at 70s, 15 then has one low, but
> 16
> has two low results.
>

Immediately prior to l.i0 the database directory is wiped and then Postgres
is initialized and started.

The IPS vs time graphs are more interesting for benchmark steps that run
longer. Alas, this can't run too long if the resulting database is to fit
in <= 16G. But that is a problem for another day. The IPS vs time graphs
are not a flat line, but I am not ready to pursue that as a problem unless it
shows multi-second write-stalls (fortunately it does not).
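For when that pursuit happens, a write-stall check over the per-second throughput data could look like the following hypothetical sketch (the real harness may differ):

```python
# Hedged sketch (not the actual benchmark harness): flag multi-second write
# stalls in a per-second inserts-per-second (IPS) series. A "stall" here is
# a run of consecutive seconds where throughput drops below a small fraction
# of the median.

def find_stalls(ips, min_len=2, frac=0.05):
    med = sorted(ips)[len(ips) // 2]
    threshold = frac * med
    stalls, start = [], None
    for i, v in enumerate(ips):
        if v <= threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_len:
                stalls.append((start, i - 1))
            start = None
    if start is not None and len(ips) - start >= min_len:
        stalls.append((start, len(ips) - 1))
    return stalls

series = [900, 880, 910, 0, 0, 0, 870, 905]  # a 3-second stall at t=3..5
print(find_stalls(series))  # → [(3, 5)]
```

The `min_len` knob distinguishes a one-second dip (common and usually benign) from the multi-second stalls worth investigating.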

-- 
Mark Callaghan
mdcal...@gmail.com


Re: [PATCH] Add native windows on arm64 support

2023-05-05 Thread Tom Lane
Michael Paquier  writes:
> Seeing how simple this has become, I would be really tempted to get
> that applied, especially if there's a buildfarm machine able to follow
> up..  On the one hand, we are in a stability period for v16, so it may
> not be the best moment ever.  On the other hand, waiting one more year
> sounds like a waste for a patch that just adds new code paths.

Indeed, there's not much in this patch ... but it's impossible to see
how "run on an entirely new platform" isn't a new feature.  Moreover,
it's a platform that very few of us will have any ability to support
or debug for.  I can't see how it's a good idea to shove this in
post-feature-freeze.  Let's plan to push it in when v17 opens.

regards, tom lane




Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

2023-05-05 Thread Tatsuo Ishii
>> The attached test patch is mostly the same as in the previous patch
>> set, but it doesn't fail on row_number anymore as the main patch
>> only rejects aggregate functions. The test patch also adds a test for
> 
>> +SELECT sum(orbit) RESPECT NULLS OVER () FROM planets; -- succeeds
> 
> I think the standard does not allow RESPECT NULLS to be specified for
> window functions other than lead, lag, first_value, last_value and
> nth_value. Unless we agree that PostgreSQL violates the standard in this
> regard, you should not allow RESPECT NULLS for window functions except
> lead etc. and aggregates.
> 
> See my patch.
> 
>> +/*
>> + * Window function option clauses
>> + */
>> +opt_null_treatment:
>> +RESPECT NULLS_P 
>> { $$ = RESPECT_NULLS; }
>> +| IGNORE_P NULLS_P  
>> { $$ = IGNORE_NULLS; }
>> +| /*EMPTY*/ 
>> { $$ = NULL_TREATMENT_NOT_SET; }
>> +;
> 
> With this, you can check whether a null treatment clause was used in
> each window function.
> 
> In my previous patch I did the check in parse/analysis but I think
> it's better to be checked in each window function. This way,
> 
> - no need to add a column to pg_proc.
> 
> - user-defined window functions can decide for themselves whether
>   they accept the null treatment option.

Attached is the patch to implement this (on top of your patch).

test=# SELECT row_number() RESPECT NULLS OVER () FROM (SELECT 1) AS s;
ERROR:  window function row_number cannot have RESPECT NULLS or IGNORE NULLS
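The rule from the standard discussed above (null treatment only for lead, lag, first_value, last_value and nth_value) can be sketched outside the C code as a simple per-function check; this is an illustration in Python, not the patch itself:

```python
# Illustrative sketch of a per-function null-treatment check, modeled on
# the idea behind ErrorOutNullTreatment() in the patch. The allowed set is
# the five navigation functions the standard names, per the thread.

NULL_TREATMENT_OK = {"lead", "lag", "first_value", "last_value", "nth_value"}

def check_null_treatment(func_name, null_treatment):
    """null_treatment: None (clause absent), 'RESPECT NULLS', or 'IGNORE NULLS'."""
    if null_treatment is None:
        return  # NULL_TREATMENT_NOT_SET: always fine
    if func_name not in NULL_TREATMENT_OK:
        raise ValueError(
            f"window function {func_name} cannot have RESPECT NULLS or IGNORE NULLS")

check_null_treatment("lag", "IGNORE NULLS")   # accepted
check_null_treatment("row_number", None)      # accepted: clause absent
# check_null_treatment("row_number", "RESPECT NULLS") would raise
```

Doing the check per function, rather than in parse analysis, mirrors the design point above: no new pg_proc column, and user-defined window functions keep control over whether they accept the clause.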

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index fac0e05dee..b8519d9890 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -74,6 +74,7 @@ typedef struct WindowObjectData
 	int64		*win_nonnulls;	/* tracks non-nulls in ignore nulls mode */
 	int			nonnulls_size;	/* track size of the win_nonnulls array */
 	int			nonnulls_len;	/* track length of the win_nonnulls array */
+	WindowFunc	*wfunc;			/* WindowFunc of this function */
 } WindowObjectData;
 
 /*
@@ -2634,6 +2635,8 @@ ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags)
 winobj->nonnulls_len = 0;
 			}
 
+			winobj->wfunc = wfunc;
+
 			/* It's a real window function, so set up to call it. */
 			fmgr_info_cxt(wfunc->winfnoid, >flinfo,
 		  econtext->ecxt_per_query_memory);
@@ -3881,3 +3884,24 @@ WinGetFuncArgCurrent(WindowObject winobj, int argno, bool *isnull)
 	return ExecEvalExpr((ExprState *) list_nth(winobj->argstates, argno),
 		econtext, isnull);
 }
+
+/*
+ * Error out if this window function specifies null treatment.
+ */
+void
+ErrorOutNullTreatment(WindowObject winobj)
+{
+	char			*fname;
+
+	Assert(WindowObjectIsValid(winobj));
+
+	if (winobj->wfunc->null_treatment == NULL_TREATMENT_NOT_SET)
+		return;
+
+	fname = get_func_name(winobj->wfunc->winfnoid);
+
+	ereport(ERROR,
+			(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+			 errmsg("window function %s cannot have RESPECT NULLS or IGNORE NULLS",
+	fname)));
+}
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 01fd16acf9..05e64c4569 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -2475,6 +2475,7 @@ eval_const_expressions_mutator(Node *node,
 newexpr->winstar = expr->winstar;
 newexpr->winagg = expr->winagg;
 newexpr->ignore_nulls = expr->ignore_nulls;
+newexpr->null_treatment = expr->null_treatment;
 newexpr->location = expr->location;
 
 return (Node *) newexpr;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 58c00bfd4f..e131428e85 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -276,6 +276,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 	MergeWhenClause *mergewhen;
 	struct KeyActions *keyactions;
 	struct KeyAction *keyaction;
+	NullTreatment	nulltreatment;
 }
 
 %type 	stmt toplevel_stmt schema_stmt routine_body_stmt
@@ -633,7 +634,6 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 opt_frame_clause frame_extent frame_bound
 %type 	opt_window_exclusion_clause
 %type 		opt_existing_window_name
-%type  null_treatment
 %type  opt_if_not_exists
 %type  opt_unique_null_treatment
 %type 	generated_when override_kind
@@ -662,6 +662,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 	json_object_constructor_null_clause_opt
 	json_array_constructor_null_clause_opt
 
+%type 		null_treatment
+
 /*
  * Non-keyword token types.  These are hard-wired into the "flex" lexer.
  * They must be listed first so that their numeric codes do not 

Re: [PATCH] Add native windows on arm64 support

2023-05-05 Thread Michael Paquier
On Tue, May 02, 2023 at 08:51:09AM +0100, Niyas Sait wrote:
> I've attached a new version (v8) to fix the above indentation issue.
> 
> Could someone please help with the review ?

Seeing how simple this has become, I would be really tempted to get
that applied, especially if there's a buildfarm machine able to follow
up..  On the one hand, we are in a stability period for v16, so it may
not be the best moment ever.  On the other hand, waiting one more year
sounds like a waste for a patch that just adds new code paths.

RMT members, any thoughts?
--
Michael


signature.asc
Description: PGP signature


Re: Tab completion for CREATE SCHEMAAUTHORIZATION

2023-05-05 Thread Michael Paquier
On Tue, May 02, 2023 at 01:19:49PM +0100, Dagfinn Ilmari Mannsåker wrote:
> Dagfinn Ilmari Mannsåker  writes:
>> This is for completing the word CREATE itself after CREATE SCHEMA
>> [[] AUTHORIZATION] .  The things that can come after that
>> are already handled generically earlier in the function:
>>
>> /* CREATE */
>> /* complete with something you can create */
>> else if (TailMatches("CREATE"))
>> matches = rl_completion_matches(text, create_command_generator);
>>
>> create_command_generator uses the words_after_create array, which lists
>> all the things that can be created.

You are right.  I had completely forgotten that this code path would
append everything that supports CREATE for a CREATE SCHEMA command :)

> But, looking closer at the docs, only tables, views, indexes, sequences
> and triggers can be created as part of a CREATE SCHEMA statement. Maybe
> we should add a HeadMatches("CREATE", "SCHEMA") exception in the above?

Yes, it looks like we are going to need an exception and append only
the keywords that are supported, or we will end up recommending mostly
things that are not accepted by the parser.
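The proposed exception can be sketched as follows; the function and list names here are illustrative, not psql's actual tab-complete.c code, and the restricted set follows the object types listed in the CREATE SCHEMA docs as cited above:

```python
# Rough sketch of the suggested HeadMatches("CREATE", "SCHEMA") exception:
# after CREATE inside a CREATE SCHEMA statement, offer only the object
# types the parser accepts there, instead of everything creatable.

ALL_CREATABLE = ["DATABASE", "INDEX", "ROLE", "SCHEMA", "SEQUENCE",
                 "TABLE", "TRIGGER", "VIEW"]   # abridged for illustration
# Per the docs cited in the thread, CREATE SCHEMA sub-commands are limited to:
IN_SCHEMA = ["TABLE", "VIEW", "INDEX", "SEQUENCE", "TRIGGER"]

def complete_after_create(words):
    """words: tokens typed so far, ending just after a CREATE keyword."""
    if len(words) >= 2 and words[0] == "CREATE" and words[1] == "SCHEMA":
        return sorted(IN_SCHEMA)
    return sorted(ALL_CREATABLE)

print(complete_after_create(["CREATE"]))
print(complete_after_create(["CREATE", "SCHEMA", "s1", "AUTHORIZATION",
                             "bob", "CREATE"]))
```

The head-match test is what keeps, e.g., DATABASE out of the suggestions once the statement is known to be a CREATE SCHEMA.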
--
Michael


signature.asc
Description: PGP signature


Re: Add LZ4 compression in pg_dump

2023-05-05 Thread Michael Paquier
On Fri, May 05, 2023 at 02:13:28PM +, gkokola...@pm.me wrote:
> Good point. I thought about it before submitting the patch. I
> concluded that given the complexity and operations involved in
> LZ4Stream_read_internal() and the rest of the pg_dump/pg_restore
> code, the memset() call will be negligible. However from the
> readability point of view, the function is a bit cleaner with the
> memset(). 
> 
> I will not object to any suggestion though, as this is a very
> trivial point. Please find attached a v2 of the patch following the
> suggested approach.

Please note that an open item has been added for this stuff.
--
Michael


signature.asc
Description: PGP signature


Re: MERGE lacks ruleutils.c decompiling support!?

2023-05-05 Thread Tom Lane
Michael Paquier  writes:
> +WHEN NOT MATCHED
> +   AND s.a > 100
> +   THEN INSERT (id, data) OVERRIDING SYSTEM VALUE
> +   VALUES (s.a, DEFAULT)

> About OVERRIDING, I can see that this is still missing coverage for
> OVERRIDING USER VALUE.

Yeah, I couldn't see that covering that too was worth any cycles.

regards, tom lane




Re: MERGE lacks ruleutils.c decompiling support!?

2023-05-05 Thread Michael Paquier
On Fri, May 05, 2023 at 05:06:34PM -0400, Tom Lane wrote:
> I did a bit of review and more work on this:
> 
> * Added the missing OVERRIDING support
> 
> * Played around with the pretty-printing indentation
> 
> * Improved test coverage, and moved the test to rules.sql where
> I thought it fit more naturally.
> 
> I think we could probably commit this, though I've not tried it
> in v15 yet.

Seems rather OK..

+WHEN NOT MATCHED
+   AND s.a > 100
+   THEN INSERT (id, data) OVERRIDING SYSTEM VALUE
+   VALUES (s.a, DEFAULT)

About OVERRIDING, I can see that this is still missing coverage for
OVERRIDING USER VALUE.
--
Michael


signature.asc
Description: PGP signature


Re: Autogenerate some wait events code and documentation

2023-05-05 Thread Michael Paquier
On Sat, May 06, 2023 at 11:23:17AM +0900, Michael Paquier wrote:
>> I'll look at v7 when the v17 branch opens and propose the separate patch
>> mentioned above at that time too.
> 
> Thanks, again.

By the way, while browsing through the patch, I have noticed two
things:
- The ordering of the items for Lock and LWLock is incorrect.
- We are missing some of the LWLock entries, like CommitTsBuffer,
XactBuffer or WALInsert, as of all the entries in
BuiltinTrancheNames.

My apologies for not noticing that earlier.  This exists in v6 as much
as v7.
--
Michael


signature.asc
Description: PGP signature


Re: Autogenerate some wait events code and documentation

2023-05-05 Thread Michael Paquier
On Thu, May 04, 2023 at 08:39:49AM +0200, Drouvot, Bertrand wrote:
> On 5/1/23 1:59 AM, Michael Paquier wrote:
> I'm not sure I like it. First, it does break the "Note" ordering as compared
> to the current documentation. That's not a big deal though.
> 
> Secondly, what if we need to add some note(s) in the future for
> another wait class? Having all the notes after all the tables are
> generated would sound weird to me.

Appending these notes at the end of all the tables does not strike me
as a big deal, TBH.  But, well, my sole opinion is not the final choice
either.  For now, I am mostly tempted to keep the generation script as
minimalistic as possible.

> We could discuss another approach for the "Note" part if there is a
> need to add one for an existing/new wait class though.

Documenting what's expected from the wait event classes is critical in
the .txt file as that's what developers are going to look at when
adding a new wait event.  Adding them in the header is less appealing
to me considering that it is now generated, and the docs provide a lot
of explanation as well.

>> This has the extra consequence of requiring a change in
>> wait_event.h so that PG_WAIT_BUFFER_PIN is renamed to PG_WAIT_BUFFERPIN,
>> equally fine by me.  Logically, this rename should be done in a patch
>> of its own, for clarity.
> 
> Yes, I can look at it.
> [...]
> Agree, I'll look at this.

Thanks!

> I'll look at v7 when the v17 branch opens and propose the separate patch
> mentioned above at that time too.

Thanks, again.
--
Michael


signature.asc
Description: PGP signature


Re: Add two missing tests in 035_standby_logical_decoding.pl

2023-05-05 Thread Amit Kapila
On Fri, May 5, 2023 at 7:53 PM Drouvot, Bertrand
 wrote:
>
>
> On 5/5/23 2:28 PM, Amit Kapila wrote:
> > On Fri, May 5, 2023 at 5:36 PM Drouvot, Bertrand
> >  wrote:
> >
> > It seems that for some reason the current WAL file is not switched.
>
> Oh wait, here is a NON failing one: 
> https://cirrus-ci.com/task/5086849685782528 (I modified the
> .cirrus.yml so that we can download the "testrun.zip" file even if the test 
> is not failing).
>
> So, in this testrun.zip we can see, that the test is ok:
>
> $ grep -i retain 
> ./build/testrun/recovery/035_standby_logical_decoding/log/regress_log_035_standby_logical_decoding
> [10:06:08.789](0.000s) ok 19 - invalidated logical slots do not lead to 
> retaining WAL
>
> and that the WAL file we expect to be removed is:
>
> $ grep "WAL file is" 
> ./build/testrun/recovery/035_standby_logical_decoding/log/regress_log_035_standby_logical_decoding
> [10:06:08.789](0.925s) # BDT WAL file is 
> /Users/admin/pgsql/build/testrun/recovery/035_standby_logical_decoding/data/t_035_standby_logical_decoding_standby_data/pgdata/pg_wal/00010003
>
> This WAL file has been removed by the standby:
>
> $ grep -i 00010003 
> ./build/testrun/recovery/035_standby_logical_decoding/log/035_standby_logical_decoding_standby.log
>  | grep -i recy
> 2023-05-05 10:06:08.787 UTC [17521][checkpointer] DEBUG:  0: recycled 
> write-ahead log file "00010003"
>
> But on the primary, it has been recycled way after that time:
>
> $ grep -i 00010003 
> ./build/testrun/recovery/035_standby_logical_decoding/log/035_standby_logical_decoding_primary.log
>  | grep -i recy
> 2023-05-05 10:06:13.370 UTC [16785][checkpointer] DEBUG:  0: recycled 
> write-ahead log file "00010003"
>
> As, the checkpoint on the primary after the WAL file switch only recycled 
> (001 and 002):
>
> $ grep -i recycled 
> ./build/testrun/recovery/035_standby_logical_decoding/log/035_standby_logical_decoding_primary.log
> 2023-05-05 10:05:57.196 UTC [16785][checkpointer] LOG:  0: checkpoint 
> complete: wrote 4 buffers (3.1%); 0 WAL file(s) added, 0 removed, 0 recycled; 
> write=0.001 s, sync=0.001 s, total=0.027 s; sync files=0, longest=0.000 s, 
> average=0.000 s; distance=11219 kB, estimate=11219 kB; lsn=0/260, redo 
> lsn=0/228
> 2023-05-05 10:06:08.138 UTC [16785][checkpointer] DEBUG:  0: recycled 
> write-ahead log file "00010001"
> 2023-05-05 10:06:08.138 UTC [16785][checkpointer] DEBUG:  0: recycled 
> write-ahead log file "00010002"
> 2023-05-05 10:06:08.138 UTC [16785][checkpointer] LOG:  0: checkpoint 
> complete: wrote 20 buffers (15.6%); 0 WAL file(s) added, 0 removed, 2 
> recycled; write=0.001 s, sync=0.001 s, total=0.003 s; sync files=0, 
> longest=0.000 s, average=0.000 s; distance=32768 kB, estimate=32768 kB; 
> lsn=0/4D0, redo lsn=0/498
>
>
> So, even on a successful test, we can see that the WAL file we expect to be 
> removed on the standby has not been recycled on the primary before the test.
>

Okay, one possibility for the file not being removed on the primary is
that at the time of the checkpoint (when we compute RedoRecPtr), the WAL
switch and insert have not yet been performed; in that case it will
compute the RedoRecPtr as a location before those operations, which would
be in the *3 file. However, it is not clear how that is possible except
via a background checkpoint happening at that point, and from the LOGs it
appears that the checkpoint triggered by the test has recycled the WAL
files.
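For reference, the segment arithmetic being discussed can be sketched as follows (a simplified model, not server code, assuming the default 16 MB segment size): a checkpoint may only recycle segments strictly older than the one holding the redo pointer, which is why a redo pointer computed before the WAL switch would keep the *3 file alive.

```python
# Simplified model of WAL segment recycling (assumes 16 MB segments).
# Segment file names are TLI (8 hex digits) + the segment number split
# into high/low 32-bit halves (8 hex digits each).

WAL_SEG_SIZE = 16 * 1024 * 1024

def seg_no(lsn):
    return lsn // WAL_SEG_SIZE

def seg_file_name(tli, segno):
    return "%08X%08X%08X" % (tli, segno // 0x100000000, segno % 0x100000000)

def recyclable(redo_lsn, tli, existing_segnos):
    """Segments strictly older than the redo pointer's segment."""
    keep_from = seg_no(redo_lsn)
    return [seg_file_name(tli, s) for s in existing_segnos if s < keep_from]

# Redo pointer inside segment 3: segments 1 and 2 are candidates, 3 is kept.
redo = 3 * WAL_SEG_SIZE + 0x498
print(recyclable(redo, 1, [1, 2, 3]))
# → ['000000010000000000000001', '000000010000000000000002']
```

In the failing scenario above, a redo pointer still inside the *2 segment would make `recyclable()` return only the *1 file, leaving *3 behind on the primary.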

> > I think we need to add more DEBUG info to find that
> > out. Can you please try to print 'RedoRecPtr', '_logSegNo', and
> > recptr?
> >
>
> Yes, will do.
>

Okay, thanks, please try to print similar locations on standby in
CreateRestartPoint().

-- 
With Regards,
Amit Kapila.




Re: Order changes in PG16 since ICU introduction

2023-05-05 Thread Jeff Davis
On Fri, 2023-04-21 at 20:12 -0400, Robert Haas wrote:
> On Fri, Apr 21, 2023 at 5:56 PM Jeff Davis  wrote:
> > Most of the complaints seem to be complaints about v15 as well, and
> > while those complaints may be a reason to not make ICU the default,
> > they are also an argument that we should continue to learn and try
> > to
> > fix those issues because they exist in an already-released version.
> > Leaving it the default for now will help us fix those issues rather
> > than hide them.
> > 
> > It's still early, so we have plenty of time to revert the initdb
> > default if we need to.
> 
> That's fair enough, but I really think it's important that some
> energy
> get invested in providing adequate documentation for this stuff. Just
> patching the code is not enough.

Attached a significant documentation patch.

I tried to make it comprehensive without trying to be exhaustive, and I
separated the explanation of language tags from what collation settings
you can include in a language tag, so hopefully that's more clear.

I added quite a few examples spread throughout the various sections,
and I preserved the existing examples at the end. I also left all of
the external links at the bottom for those interested enough to go
beyond what's there.

I didn't add additional documentation for ICU rules. There are so many
options for collations that it's hard for me to think of realistic
examples to specify the rules directly, unless someone wants to invent
a new language. Perhaps useful if working with an interesting text file
format with special treatment for delimiters?

I asked the question about rules here:

https://www.postgresql.org/message-id/e861ac4fdae9f9f5ce2a938a37bcb5e083f0f489.camel%40cybertec.at

and got some limited response about addressing sort complaints. That
sounds reasonable, but a lot of that can also be handled just by
specifying the right collation settings. Someone who understands the
use case better could add some more documentation.


-- 
Jeff Davis
PostgreSQL Contributor Team - AWS


From b09515bfaf5e9de330138ec4a627d02a7947de1a Mon Sep 17 00:00:00 2001
From: Jeff Davis 
Date: Thu, 27 Apr 2023 14:43:46 -0700
Subject: [PATCH v1] Doc improvements for language tags and custom ICU
 collations.

Separate the documentation for language tags from the documentation
for the available collation settings which can be included in a
language tag.

Include tables of the available options, more details about the
effects of each option, and additional examples.

Also include an explanation of the "levels" of textual features and
how they relate to collation.
---
 doc/src/sgml/charset.sgml | 656 +++---
 1 file changed, 535 insertions(+), 121 deletions(-)

diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index 6dd95b8966..be74064168 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -377,7 +377,125 @@ initdb --locale-provider=icu --icu-locale=en
 variants and customization options.

   
+  
+   ICU Locales
+   
+ICU Locale Names
+
+ The ICU format for the locale name is a Language Tag.
+
+
+CREATE COLLATION mycollation1 (PROVIDER = icu, LOCALE = 'ja-JP');
+CREATE COLLATION mycollation2 (PROVIDER = icu, LOCALE = 'fr');
+
+
+   
+   
+Locale Canonicalization and Validation
+
+ When defining a new ICU collation object or database with ICU as the
+ provider, the given locale name is transformed ("canonicalized") into a
+ language tag if not already in that form. For instance,
+
+
+CREATE COLLATION mycollation3 (PROVIDER = icu, LOCALE = 'en-US-u-kn-true');
+NOTICE:  using standard form "en-US-u-kn" for locale "en-US-u-kn-true"
+CREATE COLLATION mycollation4 (PROVIDER = icu, LOCALE = 'de_DE.utf8');
+NOTICE:  using standard form "de-DE" for locale "de_DE.utf8"
+
+
+ If you see such a message, ensure that the PROVIDER and
+ LOCALE are as you expect, and consider specifying
+ directly as the canonical language tag instead of relying on the
+ transformation.
+
+
+ 
+  ICU can transform most libc locale names, as well as some other formats,
+  into language tags for easier transition to ICU. If a libc locale name
+  is used in ICU, it may not have precisely the same behavior as in libc.
+ 
+
+
+ If there is some problem interpreting the locale name, or if it represents
+ a language or region that ICU does not recognize, a message will be reported:
 
+
+SET icu_validation_level = ERROR;
+CREATE COLLATION nonsense (PROVIDER = icu, LOCALE = 'nonsense');
+ERROR:  ICU locale "nonsense" has unknown language "nonsense"
+HINT:  To disable ICU locale validation, set parameter icu_validation_level to DISABLED.
+
+
+  controls how the message is
+ reported. If set below ERROR, the collation will still
+ be created, but the behavior may not be what the user intended.
+
+   
+   
+Language Tag
+
+ Basic language tags are 

Re: [PATCH] Allow Postgres to pick an unused port to listen

2023-05-05 Thread Cary Huang
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, failed
Spec compliant:   not tested
Documentation:not tested

Hello

This is one of those features that is beneficial to a handful of people in 
specific test cases. It may not benefit the majority of the users but is 
certainly not useless either. As long as it can be disabled and enough tests 
have been run to ensure it won't have a significant impact on working 
components while disabled, it should be fine in my opinion. Regarding where the 
selected port shall be saved (postmaster.pid, read by pg_ctl or saved in a 
dedicated file), I see that postmaster.pid already contains a port number on
line 4, so recording one there is nothing new; we can simply replace the
configured value with the one selected by the system.

I applied and tested the patch and found that the system can indeed start
when port is set to 0, but line 4 of postmaster.pid does not store the port
number selected by the system; rather, it stores 0, which is the same as
configured. So I am not able to find out the port number that my PG is
running on, at least not in a straightforward way.

thank you
==
Cary Huang
HighGo Software
www.highgo.ca

Bancolombia Open Source Program Office - Proposal of contribution on lock inactive users

2023-05-05 Thread Francisco Luis Arbelaez Lopez
Dear PostgreSQL Community,

Hope you all are doing very well.
My name is Francisco Arbelaez and I would like to introduce myself on behalf
of Bancolombia's Open Source Office.
From the Bancolombia organization, we have the responsibility to migrate
privately licensed technology to Open Source; more specifically, database
migration from Oracle to PostgreSQL.
In one of these migrations, we needed to implement an automatic lockout
strategy for users who do not log into the database within a given number
of days, similar to functionality that already exists in the Oracle
environment.
Based on the official contribution documentation, we have reviewed the TODOs
and the PostgreSQL archives and have not found any related contribution. From
the Open Source Office of Bancolombia, we have a possible contribution which
could help solve this need and increase the value of PostgreSQL as an
Open Source database engine.

If it matches your needs and objectives, I would like to receive more 
information related to the next steps of this contribution.

Hope to hear from you soon.

Best regards,
Francisco Arbelaez
Open Source Office of Bancolombia




Francisco Luis Arbeláez López.

Ingeniero de Software Oficina Open Source.

Vicepresidencia Servicios de Tecnología

404 - 41628

Medellín – Colombia

flarb...@bancolombia.com.co




Re: MERGE lacks ruleutils.c decompiling support!?

2023-05-05 Thread Tom Lane
Alvaro Herrera  writes:
> Here's a first attempt.  I mostly just copied code from the insert and
> update support routines.  There's a couple of things missing still, but
> I'm not sure I'll get to it tonight.  I only tested to the extent of
> what's in the new regression test.

I did a bit of review and more work on this:

* Added the missing OVERRIDING support

* Played around with the pretty-printing indentation

* Improved test coverage, and moved the test to rules.sql where
I thought it fit more naturally.

I think we could probably commit this, though I've not tried it
in v15 yet.

regards, tom lane

diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 461735e84f..60f9d08d5d 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -411,6 +411,8 @@ static void get_update_query_targetlist_def(Query *query, List *targetList,
 			RangeTblEntry *rte);
 static void get_delete_query_def(Query *query, deparse_context *context,
  bool colNamesVisible);
+static void get_merge_query_def(Query *query, deparse_context *context,
+bool colNamesVisible);
 static void get_utility_query_def(Query *query, deparse_context *context);
 static void get_basic_select_query(Query *query, deparse_context *context,
    TupleDesc resultDesc, bool colNamesVisible);
@@ -5448,6 +5450,10 @@ get_query_def(Query *query, StringInfo buf, List *parentnamespace,
 			get_delete_query_def(query, , colNamesVisible);
 			break;
 
+		case CMD_MERGE:
+			get_merge_query_def(query, , colNamesVisible);
+			break;
+
 		case CMD_NOTHING:
 			appendStringInfoString(buf, "NOTHING");
 			break;
@@ -7044,6 +7050,128 @@ get_delete_query_def(Query *query, deparse_context *context,
 }
 
 
+/* --
+ * get_merge_query_def- Parse back a MERGE parsetree
+ * --
+ */
+static void
+get_merge_query_def(Query *query, deparse_context *context,
+	bool colNamesVisible)
+{
+	StringInfo	buf = context->buf;
+	RangeTblEntry *rte;
+	ListCell   *lc;
+
+	/* Insert the WITH clause if given */
+	get_with_clause(query, context);
+
+	/*
+	 * Start the query with MERGE INTO relname
+	 */
+	rte = rt_fetch(query->resultRelation, query->rtable);
+	Assert(rte->rtekind == RTE_RELATION);
+	if (PRETTY_INDENT(context))
+	{
+		appendStringInfoChar(buf, ' ');
+		context->indentLevel += PRETTYINDENT_STD;
+	}
+	appendStringInfo(buf, "MERGE INTO %s%s",
+	 only_marker(rte),
+	 generate_relation_name(rte->relid, NIL));
+
+	/* Print the relation alias, if needed */
+	get_rte_alias(rte, query->resultRelation, false, context);
+
+	/* Print the source relation and join clause */
+	get_from_clause(query, " USING ", context);
+	appendContextKeyword(context, " ON ",
+		 -PRETTYINDENT_STD, PRETTYINDENT_STD, 2);
+	get_rule_expr(query->jointree->quals, context, false);
+
+	/* Print each merge action */
+	foreach(lc, query->mergeActionList)
+	{
+		MergeAction *action = lfirst_node(MergeAction, lc);
+
+		appendContextKeyword(context, " WHEN ",
+			 -PRETTYINDENT_STD, PRETTYINDENT_STD, 2);
+		appendStringInfo(buf, "%sMATCHED", action->matched ? "" : "NOT ");
+
+		if (action->qual)
+		{
+			appendContextKeyword(context, " AND ",
+ -PRETTYINDENT_STD, PRETTYINDENT_STD, 3);
+			get_rule_expr(action->qual, context, false);
+		}
+		appendContextKeyword(context, " THEN ",
+			 -PRETTYINDENT_STD, PRETTYINDENT_STD, 3);
+
+		if (action->commandType == CMD_INSERT)
+		{
+			/* This generally matches get_insert_query_def() */
+			List	   *strippedexprs = NIL;
+			const char *sep = "";
+			ListCell   *lc2;
+
+			appendStringInfoString(buf, "INSERT");
+
+			if (action->targetList)
+appendStringInfoString(buf, " (");
+			foreach(lc2, action->targetList)
+			{
+TargetEntry *tle = (TargetEntry *) lfirst(lc2);
+
+Assert(!tle->resjunk);
+
+appendStringInfoString(buf, sep);
+sep = ", ";
+
+appendStringInfoString(buf,
+	   quote_identifier(get_attname(rte->relid,
+	tle->resno,
+	false)));
+strippedexprs = lappend(strippedexprs,
+		processIndirection((Node *) tle->expr,
+		   context));
+			}
+			if (action->targetList)
+appendStringInfoChar(buf, ')');
+
+			if (action->override)
+			{
+if (action->override == OVERRIDING_SYSTEM_VALUE)
+	appendStringInfoString(buf, " OVERRIDING SYSTEM VALUE");
+else if (action->override == OVERRIDING_USER_VALUE)
+	appendStringInfoString(buf, " OVERRIDING USER VALUE");
+			}
+
+			if (strippedexprs)
+			{
+appendContextKeyword(context, " VALUES (",
+	 -PRETTYINDENT_STD, PRETTYINDENT_STD, 4);
+get_rule_list_toplevel(strippedexprs, context, false);
+appendStringInfoChar(buf, ')');
+			}
+			else
+appendStringInfoString(buf, " DEFAULT VALUES");
+		}
+		else if (action->commandType == CMD_UPDATE)
+		{
+			appendStringInfoString(buf, "UPDATE SET ");
+			get_update_query_targetlist_def(query, 

Re: benchmark results comparing versions 15.2 and 16

2023-05-05 Thread Andres Freund
Hi,

On 2023-05-05 10:45:12 -0700, MARK CALLAGHAN wrote:
> This is mostly a hobby project for me - my other hobby is removing
> invasive weeds.

Hah :)


> Summary of the results:
> * from r1 - insert-heavy (l.i0, l.i1) and create indexes (l.x) steps are
> ~2% slower in PG 16
> * from r2 - create index (l.x) step is ~4% slower in PG 16
> * from r3 - regressions are similar to r1
> * from r4, r5 and r6 - regressions are mostly worse than r1, r2, r3. Note
> r4, r5, r6 are the same workload as r1, r2, r3 except the database is
> cached by PG for r1, r2, r3 so the r4, r5 and r6 benchmarks will do much
> more copying from the OS page cache into the Postgres buffer pool.

One thing that's somewhat odd is that there's very marked changes in l.i0's
p99 latency for the four clients cases - but whether 15 or 16 are better
differs between the runs.

r2)
p99
20m.pg152_o3_native_lto.cx7 300
20m.pg16prebeta.cx7 23683

r3)
p99
20m.pg152_o3_native_lto.cx7 70245
20m.pg16prebeta.cx7 8191

r5)
p99
20m.pg152_o3_native_lto.cx7 11188
20m.pg16prebeta.cx7 72720

r6)
p99
20m.pg152_o3_native_lto.cx7 1898
20m.pg16prebeta.cx7 31666


I do wonder if there's something getting scheduled in some of these runs
increasing latency?

Or what we're seeing depends on the time between the start of the server and
the start of the benchmark? It is interesting that the per-second throughput
graph shows a lot of up/down at the end:
https://mdcallag.github.io/reports/23_05_04_ibench.beelink.pg16b.4u.1tyes.1g/tput.l.i0.html

Both 15 and 16 have one very high result at 70s, 15 then has one low, but 16
has two low results.

Greetings,

Andres Freund




Re: PGDOCS - Replica Identity quotes

2023-05-05 Thread David Zhang

On 2023-03-16 4:46 p.m., Peter Smith wrote:

A rebase was needed due to the recent REPLICA IDENTITY push [1].

PSA v2.


-   A published table must have a replica identity configured in
+   A published table must have a replica identity configured in

+1

 order to be able to replicate UPDATE
 and DELETE operations, so that appropriate rows to
 update or delete can be identified on the subscriber side.  By default,
 this is the primary key, if there is one.  Another unique index (with
 certain additional requirements) can also be set to be the replica
 identity.  If the table does not have any suitable key, then it can be set
-   to replica identity full, which means the entire row becomes
-   the key.  When replica identity full is specified,
+   to REPLICA IDENTITY FULL, which means the entire row becomes
+   the key.  When REPLICA IDENTITY FULL is specified,
 indexes can be used on the subscriber side for searching the rows.  Candidate
 indexes must be btree, non-partial, and have at least one column reference
 (i.e. cannot consist of only expressions).  These restrictions
 on the non-unique index properties adhere to some of the restrictions that
 are enforced for primary keys.  If there are no such suitable indexes,
 the search on the subscriber side can be very inefficient, therefore
-   replica identity full should only be used as a
+   REPLICA IDENTITY FULL should only be used as a
 fallback if no other solution is possible.  If a replica identity other
IMO, it would be better to just change "full" to "FULL". On the one hand,
it emphasizes that "FULL" is one of the specific values (DEFAULT | USING
INDEX index_name | FULL | NOTHING); on the other hand, it leaves
"replica identity" in lowercase, which is more consistent with the
terminology used throughout this paragraph.

-   than full is set on the publisher side, a replica identity
+   than FULL is set on the publisher side, a replica identity

+1

 comprising the same or fewer columns must also be set on the subscriber
 side.  See  for details on
 how to set the replica identity.  If a table without a replica identity is


David





Re: MERGE lacks ruleutils.c decompiling support!?

2023-05-05 Thread Alvaro Herrera
On 2023-May-05, Tom Lane wrote:

> (I'd guess that decompiling the WHEN clause would take a nontrivial
> amount of new code, so maybe fixing it on such short notice is
> impractical.  But ugh.)

Here's a first attempt.  I mostly just copied code from the insert and
update support routines.  There's a couple of things missing still, but
I'm not sure I'll get to it tonight.  I only tested to the extent of
what's in the new regression test.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
 It does it in a really, really complicated way
 why does it need to be complicated?
 Because it's MakeMaker.
From a89d46592ba4d113f57a890b63ef123d8470db45 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Fri, 5 May 2023 20:40:38 +0200
Subject: [PATCH] add ruleutils.c support for MERGE

---
 src/backend/utils/adt/ruleutils.c   | 119 
 src/test/regress/expected/merge.out |  68 
 src/test/regress/sql/merge.sql  |  38 +
 3 files changed, 225 insertions(+)

diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 461735e84f0..851488dd6d3 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -411,6 +411,8 @@ static void get_update_query_targetlist_def(Query *query, List *targetList,
 			RangeTblEntry *rte);
 static void get_delete_query_def(Query *query, deparse_context *context,
  bool colNamesVisible);
+static void get_merge_query_def(Query *query, deparse_context *context,
+bool colNamesVisible);
 static void get_utility_query_def(Query *query, deparse_context *context);
 static void get_basic_select_query(Query *query, deparse_context *context,
    TupleDesc resultDesc, bool colNamesVisible);
@@ -5448,6 +5450,10 @@ get_query_def(Query *query, StringInfo buf, List *parentnamespace,
 			get_delete_query_def(query, &context, colNamesVisible);
 			break;
 
+		case CMD_MERGE:
+			get_merge_query_def(query, &context, colNamesVisible);
+			break;
+
 		case CMD_NOTHING:
 			appendStringInfoString(buf, "NOTHING");
 			break;
@@ -7043,6 +7049,119 @@ get_delete_query_def(Query *query, deparse_context *context,
 	}
 }
 
+/* --
+ * get_merge_query_def- Parse back a MERGE parsetree
+ * --
+ */
+static void
+get_merge_query_def(Query *query, deparse_context *context,
+	bool colNamesVisible)
+{
+
+	StringInfo	buf = context->buf;
+	RangeTblEntry *rte;
+	ListCell   *lc;
+
+	/* Insert the WITH clause if given */
+	get_with_clause(query, context);
+
+	/*
+	 * Start the query with MERGE INTO relname
+	 */
+	rte = rt_fetch(query->resultRelation, query->rtable);
+	Assert(rte->rtekind == RTE_RELATION);
+	if (PRETTY_INDENT(context))
+	{
+		appendStringInfoChar(buf, ' ');
+		context->indentLevel += PRETTYINDENT_STD;
+	}
+	appendStringInfo(buf, "MERGE INTO %s%s",
+	 only_marker(rte),
+	 generate_relation_name(rte->relid, NIL));
+
+	/* Print the relation alias, if needed */
+	get_rte_alias(rte, query->resultRelation, false, context);
+
+	get_from_clause(query, " USING ", context);
+	appendContextKeyword(context, " ON ",
+		 -PRETTYINDENT_STD, PRETTYINDENT_STD, 1);
+	get_rule_expr(query->jointree->quals, context, false);
+
+	appendStringInfoChar(buf, '\n');	/* FIXME proper way to do this? */
+
+	/* FIXME missing: OVERRIDING clause */
+	foreach(lc, query->mergeActionList)
+	{
+		MergeAction *action = lfirst_node(MergeAction, lc);
+
+		appendStringInfo(buf, "WHEN %sMATCHED", action->matched ? "" : "NOT ");
+
+		if (action->qual)
+		{
+			appendContextKeyword(context, " AND ",
+ -PRETTYINDENT_STD, PRETTYINDENT_STD, 1);
+			get_rule_expr(action->qual, context, false);
+		}
+		appendContextKeyword(context, " THEN ", -PRETTYINDENT_STD, PRETTYINDENT_STD, 1);
+
+		if (action->commandType == CMD_INSERT)
+		{
+			ListCell   *lc2;
+			char	   *sep = "";
+			List	   *strippedexprs = NIL;
+
+			appendStringInfoString(buf, "INSERT ");
+
+			/*
+			 * This matches what get_insert_query_def does for the VALUES
+			 * clause
+			 */
+			if (action->targetList)
+appendStringInfoChar(buf, '(');
+			foreach(lc2, action->targetList)
+			{
+TargetEntry *tle = (TargetEntry *) lfirst(lc2);
+
+Assert(!tle->resjunk);
+
+appendStringInfoString(buf, sep);
+sep = ", ";
+
+appendStringInfoString(buf, quote_identifier(get_attname(rte->relid,
+		 tle->resno,
+		 false)));
+strippedexprs = lappend(strippedexprs,
+		processIndirection((Node *) tle->expr,
+		   context));
+			}
+			if (action->targetList)
+appendStringInfoString(buf, ") ");
+			if (strippedexprs)
+			{
+appendContextKeyword(context, "VALUES (",
+	 -PRETTYINDENT_STD, PRETTYINDENT_STD, 2);
+get_rule_list_toplevel(strippedexprs, context, false);
+appendStringInfoChar(buf, ')');
+			}
+			else
+appendStringInfoString(buf, "DEFAULT VALUES");
+		}
+		else if (action->commandType == CMD_UPDATE)
+		{
+			

benchmark results comparing versions 15.2 and 16

2023-05-05 Thread MARK CALLAGHAN
Let me know if results like this shouldn't be posted here.

This is mostly a hobby project for me - my other hobby is removing
invasive weeds. I am happy to answer questions and run more tests, but
turnaround for answers won't be instant. Getting results from Linux perf
for these tests is on my TODO list. For now I am just re-running a subset
of these to get more certainty that the regressions are real and not noise.

These are results for the insert benchmark on a small server comparing
performance for versions 15.2 and 16. For version 16 I built from source at
git sha 1ab763fc.
https://github.com/postgres/postgres/commit/1ab763fc22adc88e5d779817e7b42b25a9dd7c9e

Late in the version 15 beta release cycle this benchmark found a
significant regression. I don't see anything significant this time, but
there are potential small regressions.

More detail on how I run the benchmark and the HW is here, the server is
small -- Beelink with 8 cores, 16G RAM and 1T NVMe SSD.
http://smalldatum.blogspot.com/2023/05/the-insert-benchmark-postgres-versions.html

Performance reports are linked below. But first, disclaimers:
* the goal is to determine whether there are CPU improvements or
regressions. To make that easier to spot I disable fsync on commit.
* my scripts compute CPU/operation where operation is a SQL statement.
However, CPU in this case is measured by vmstat and includes CPU from the
benchmark client and Postgres server
* the regressions here are small, usually less than 5%, which means it can
be hard to distinguish between normal variance and a regression, but the
results are repeatable
* the links below are to the Summary section which includes throughput
(absolute and relative). The relative throughput is the (throughput for PG
16 / throughput for PG 15.2) and
* I used the same compiler options for the builds of 15.2 and 16

Summary of the results:
* from r1 - insert-heavy (l.i0, l.i1) and create indexes (l.x) steps are
~2% slower in PG 16
* from r2 - create index (l.x) step is ~4% slower in PG 16
* from r3 - regressions are similar to r1
* from r4, r5 and r6 - regressions are mostly worse than r1, r2, r3. Note
r4, r5, r6 are the same workload as r1, r2, r3 except the database is
cached by PG for r1, r2, r3 so the r4, r5 and r6 benchmarks will do much
more copying from the OS page cache into the Postgres buffer pool.

I will repeat r1, r2, r4 and r5 but with the tests run in a different order
to confirm this isn't just noise.

Database cached by Postgres:
r1) 1 table, 1 client -
https://mdcallag.github.io/reports/23_05_04_ibench.beelink.pg16b.1u.1tno.cached/all.html#summary
r2) 4 tables, 4 clients -
https://mdcallag.github.io/reports/23_05_04_ibench.beelink.pg16b.4u.1tno.cached/all.html#summary
r3) 1 table, 4 clients -
https://mdcallag.github.io/reports/23_05_04_ibench.beelink.pg16b.4u.1tyes.cached/all.html#summary

Database cached by OS but not by Postgres:
r4) 1 table, 1 client -
https://mdcallag.github.io/reports/23_05_04_ibench.beelink.pg16b.1u.1tno.1g/all.html#summary
r5) 4 tables, 4 clients -
https://mdcallag.github.io/reports/23_05_04_ibench.beelink.pg16b.4u.1tno.1g/all.html#summary
r6) 1 table, 4 clients -
https://mdcallag.github.io/reports/23_05_04_ibench.beelink.pg16b.4u.1tyes.1g/all.html#summary


-- 
Mark Callaghan
mdcal...@gmail.com


Re: psql: Add role's membership options to the \du+ command

2023-05-05 Thread David G. Johnston
On Wed, May 3, 2023 at 9:30 AM Jonathan S. Katz wrote:

> [Personal hat]
>
> Looking at Pavel's latest patch, I do find the output easy to
> understand, though do we need to explicitly list "empty" if there are no
> membership permissions?
>
>
Yes.  I dislike having the equivalent of null embedded within the output
here.  I would rather label it for what it is.  As a membership without any
attributes has no real purpose I don't see how the choice matters and at
least empty both stands out visually and can be grepped.

The SQL language uses the words "by" and "from" in its syntax; I'm against
avoiding them in our presentation here without a clearly superior
alternative that doesn't require a majority of people to have to translate
the symbol " / " back into the word " by " in order to read the output.

But if it is really a blocker then maybe we should produce 3 separate
newline separated columns, one for the member of role, one for the list of
attributes, and one with the grantor.  The column headers can be translated
more easily as single nouns.  The readability quite probably would end up
being equivalent (maybe even better) in tabular form instead of sentence
form.

David J.


First draft of back-branch release notes is done

2023-05-05 Thread Tom Lane
... at

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=56e869a0987c93f594e73c1c3e49274de5c502d3

As usual, please send corrections/comments by Sunday.

regards, tom lane




Re: MERGE lacks ruleutils.c decompiling support!?

2023-05-05 Thread Alvaro Herrera
On 2023-May-05, Tom Lane wrote:

> I made this function:
> 
> CREATE OR REPLACE FUNCTION test_fun()
> RETURNS void
> LANGUAGE SQL
> BEGIN ATOMIC
> MERGE INTO target
> USING source s on s.id = target.id
> WHEN MATCHED THEN
>   UPDATE SET data = s.data
> WHEN NOT MATCHED THEN
>   INSERT VALUES (s.id, s.data);
> end;
> 
> It appears to work fine, but:
> 
> regression=# \sf+ test_fun()
> ERROR:  unrecognized query command type: 5
> 
> and it also breaks pg_dump.  Somebody screwed up pretty badly
> here.  Is there any hope of fixing it for Monday's releases?
> 
> (I'd guess that decompiling the WHEN clause would take a nontrivial
> amount of new code, so maybe fixing it on such short notice is
> impractical.  But ugh.)

Hmm, there is *some* code in ruleutils for MERGE, but clearly something
is missing.  Let me have a look ...

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: Tables getting stuck at 's' state during logical replication

2023-05-05 Thread Padmavathi G
Some background on the setup on which I am trying to carry out the upgrade:

We have a pod in a kubernetes cluster which contains the postgres 11 image.
We are following the logical replication process for upgrade

Steps followed for logical replication:

1. Created a new pod in the same kubernetes cluster with the latest
postgres 15 image
2. Created a publication  (say publication 1) in the old pod including all
tables in a database
3. Created a subscription (say subscription 1) in the new pod for the above
mentioned publication
4. When monitoring the subscription via pg_subscription_rel in the
subscriber, I noticed that out of 45 tables, 20 were in the 'r' state and 25
were in the 's' state, and they remained in those states for almost 2 days
with no improvement. But the logs showed that the tables
in the 's' state also had "synchronization workers for 
finished".
5. Then I removed the tables which got stuck in the 's' state from
publication 1 and created a new publication (publication 2) with only these
tables which got stuck and created a new subscription (subscription 2) for
this publication in the subscriber.
6. Now on monitoring subscription 2 via pg_subscription_rel I noticed that
out of 25, now 12 were in 'r' state and 13 again got stuck in 's' state.
I repeated this process of dropping the stuck tables from the publication
and creating a new publication and subscription, and in this way I was
finally able to bring all tables into sync. But the tables were still
present in the replication origins.
7. On querying pg_replication_origin, I saw that every
subscription had one origin and every table which got stuck in each
publication had one origin with roname pg__. Even though they
were stuck, these replication origins were not removed.


On Fri, May 5, 2023 at 4:04 PM Amit Kapila  wrote:

> On Fri, May 5, 2023 at 3:04 PM Padmavathi G 
> wrote:
> >
> > Hello,
> > I am trying out logical replication for upgrading postgres instance from
> version 11 to 15.x. In the process, I noticed that some tables get stuck in
> the 's' state during logical replication and they do not move to the 'r'
> state. I tried to drop the subscription and create a new subscriber, but
> the same thing repeated. Also, each time different tables get stuck, it is
> not like the same tables get stuck every time.
> >
>
> This is strange. BTW, we don't save slots after the upgrade, so the
> subscriptions in the upgraded node won't be valid. We have some
> discussion on this topic in threads [1][2]. So, I think after the
> upgrade one needs to anyway re-create the subscriptions. Can you share
> your exact steps for the upgrade and what is the state before the
> upgrade? Is it possible to share some minimal test case to show the
> exact problem you are facing?
>
> [1] -
> https://www.postgresql.org/message-id/20230217075433.u5mjly4d5cr4hcfe%40jrouhaud
> [2] -
> https://www.postgresql.org/message-id/TYAPR01MB58664C81887B3AF2EB6B16E3F5939%40TYAPR01MB5866.jpnprd01.prod.outlook.com
>
>
> --
> With Regards,
> Amit Kapila.
>


Re: Add PQsendSyncMessage() to libpq

2023-05-05 Thread Anton Kirilov
Hello,

On Thu, 4 May 2023, 11:36 Alvaro Herrera <alvhe...@alvh.no-ip.org> wrote:
> On 2023-May-04, Anton Kirilov wrote:
> If you want to make sure it's fully flushed, your only option is to have
> the call block.


Surely PQflush() returning 0 would signify that the output buffer has been 
fully flushed? Which means that there is another, IMHO simpler option than 
introducing an extra flag - make the new function return the same values as 
PQflush(), i.e. 0 for no error and fully flushed output, -1 for error, and 1 
for partial flush (so that the user may start polling for write readiness). Of 
course, the function would never return 1 (but would block instead) unless the 
user has called PQsetnonblocking() beforehand.

Best wishes,
Anton Kirilov


Re: Add two missing tests in 035_standby_logical_decoding.pl

2023-05-05 Thread Drouvot, Bertrand




On 5/5/23 2:28 PM, Amit Kapila wrote:

On Fri, May 5, 2023 at 5:36 PM Drouvot, Bertrand
 wrote:

It seems that for some reason the current WAL file is not switched.


Oh wait, here is a NON failing one: https://cirrus-ci.com/task/5086849685782528 
(I modified the
.cirrus.yml so that we can download the "testrun.zip" file even if the test is 
not failing).

So, in this testrun.zip we can see, that the test is ok:

$ grep -i retain 
./build/testrun/recovery/035_standby_logical_decoding/log/regress_log_035_standby_logical_decoding
[10:06:08.789](0.000s) ok 19 - invalidated logical slots do not lead to 
retaining WAL

and that the WAL file we expect to be removed is:

$ grep "WAL file is" 
./build/testrun/recovery/035_standby_logical_decoding/log/regress_log_035_standby_logical_decoding
[10:06:08.789](0.925s) # BDT WAL file is 
/Users/admin/pgsql/build/testrun/recovery/035_standby_logical_decoding/data/t_035_standby_logical_decoding_standby_data/pgdata/pg_wal/00010003

This WAL file has been removed by the standby:

$ grep -i 00010003 
./build/testrun/recovery/035_standby_logical_decoding/log/035_standby_logical_decoding_standby.log
 | grep -i recy
2023-05-05 10:06:08.787 UTC [17521][checkpointer] DEBUG:  0: recycled write-ahead log 
file "00010003"

But on the primary, it has been recycled way after that time:

$ grep -i 00010003 
./build/testrun/recovery/035_standby_logical_decoding/log/035_standby_logical_decoding_primary.log
 | grep -i recy
2023-05-05 10:06:13.370 UTC [16785][checkpointer] DEBUG:  0: recycled write-ahead log 
file "00010003"

Indeed, the checkpoint on the primary after the WAL file switch only recycled
001 and 002:

$ grep -i recycled 
./build/testrun/recovery/035_standby_logical_decoding/log/035_standby_logical_decoding_primary.log
2023-05-05 10:05:57.196 UTC [16785][checkpointer] LOG:  0: checkpoint 
complete: wrote 4 buffers (3.1%); 0 WAL file(s) added, 0 removed, 0 recycled; 
write=0.001 s, sync=0.001 s, total=0.027 s; sync files=0, longest=0.000 s, 
average=0.000 s; distance=11219 kB, estimate=11219 kB; lsn=0/260, redo 
lsn=0/228
2023-05-05 10:06:08.138 UTC [16785][checkpointer] DEBUG:  0: recycled write-ahead log 
file "00010001"
2023-05-05 10:06:08.138 UTC [16785][checkpointer] DEBUG:  0: recycled write-ahead log 
file "00010002"
2023-05-05 10:06:08.138 UTC [16785][checkpointer] LOG:  0: checkpoint 
complete: wrote 20 buffers (15.6%); 0 WAL file(s) added, 0 removed, 2 recycled; 
write=0.001 s, sync=0.001 s, total=0.003 s; sync files=0, longest=0.000 s, 
average=0.000 s; distance=32768 kB, estimate=32768 kB; lsn=0/4D0, redo 
lsn=0/498


So, even on a successful test, we can see that the WAL file we expect to be 
removed on the standby has not been recycled on the primary before the test.


I think we need to add more DEBUG info to find that
out. Can you please try to print 'RedoRecPtr', '_logSegNo', and
recptr?



Yes, will do.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Add LZ4 compression in pg_dump

2023-05-05 Thread gkokolatos
--- Original Message ---
On Friday, May 5th, 2023 at 3:23 PM, Andrew Dunstan  wrote:

> On 2023-05-05 Fr 06:02, gkokola...@pm.me wrote:
>
>> --- Original Message ---
>> On Friday, May 5th, 2023 at 8:00 AM, Alexander Lakhin <exclus...@gmail.com> wrote:
>>
>>> 23.03.2023 20:10, Tomas Vondra wrote:
>>>
 So pushed all three parts, after updating the commit messages a bit.

 This leaves the empty-data issue (which we have a fix for) and the
 switch to LZ4F. And then the zstd part.
>>>
>>> I'm sorry that I haven't noticed/checked that before, but when trying to
>>> perform check-world with Valgrind I've discovered another issue presumably
>>> related to LZ4File_gets().
>>> When running under Valgrind:
>>> PROVE_TESTS=t/002_pg_dump.pl make check -C src/bin/pg_dump/
>>> I get:
>>> ...
>>> 07:07:11.683 ok 1939 - compression_lz4_dir: glob check for
>>> .../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir/*.dat.lz4
>>> # Running: pg_restore --jobs=2 
>>> --file=.../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir.sql
>>> .../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir
>>>
>>> ==00:00:00:00.579 2811926== Conditional jump or move depends on 
>>> uninitialised value(s)
>>> ==00:00:00:00.579 2811926== at 0x4853376: rawmemchr 
>>> (vg_replace_strmem.c:1548)
>>> ==00:00:00:00.579 2811926== by 0x4C96A67: _IO_str_init_static_internal 
>>> (strops.c:41)
>>> ==00:00:00:00.579 2811926== by 0x4C693A2: _IO_strfile_read (strfile.h:95)
>>> ==00:00:00:00.579 2811926== by 0x4C693A2: __isoc99_sscanf 
>>> (isoc99_sscanf.c:28)
>>> ==00:00:00:00.579 2811926== by 0x11DB6F: _LoadLOs 
>>> (pg_backup_directory.c:458)
>>> ==00:00:00:00.579 2811926== by 0x11DD1E: _PrintTocData 
>>> (pg_backup_directory.c:422)
>>> ==00:00:00:00.579 2811926== by 0x118484: restore_toc_entry 
>>> (pg_backup_archiver.c:882)
>>> ==00:00:00:00.579 2811926== by 0x1190CC: RestoreArchive 
>>> (pg_backup_archiver.c:699)
>>> ==00:00:00:00.579 2811926== by 0x10F25D: main (pg_restore.c:414)
>>> ==00:00:00:00.579 2811926==
>>> ...
>>>
>>> It looks like the line variable returned by gets_func() here is not
>>> null-terminated:
>>> while ((CFH->gets_func(line, MAXPGPATH, CFH)) != NULL)
>>>
>>> {
>>> ...
>>> if (sscanf(line, "%u %" CppAsString2(MAXPGPATH) "s\n", &oid, lofname) != 2)
>>> ...
>>> And Valgrind doesn't like it.
>>
>> Valgrind is correct to not like it. LZ4Stream_gets() got modeled after
>> gets() when it should have been modeled after fgets().
>>
>> Please find a patch attached to address it.
>
> Isn't using memset here a bit wasteful? Why not just put a null at the end 
> after calling LZ4Stream_read_internal(), which tells you how many bytes it 
> has written?

Good point. I thought about it before submitting the patch. I concluded that 
given the complexity and operations involved in LZ4Stream_read_internal() and 
the rest of the pg_dump/pg_restore code, the memset() call will be negligible. 
However from the readability point of view, the function is a bit cleaner with 
the memset().

I will not object to any suggestion though, as this is a very trivial point. 
Please find attached a v2 of the patch following the suggested approach.

Cheers,

//Georgios

> cheers
>
> andrew
>
> --
> Andrew Dunstan
> EDB:
> https://www.enterprisedb.com

From 65dbce1eb81597e3dd44eff62d8d667b0a3322da Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos 
Date: Fri, 5 May 2023 09:47:02 +
Subject: [PATCH v2] Null terminate the output buffer of LZ4Stream_gets

LZ4Stream_gets did not null terminate its output buffer. Its callers expected
the buffer to be null terminated so they passed it around to functions such as
sscanf with unintended consequences.

Reported-by: Alexander Lakhin
---
 src/bin/pg_dump/compress_lz4.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index 423e1b7976..0f447919b2 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -459,6 +459,10 @@ LZ4Stream_read_internal(LZ4State *state, void *ptr, int ptrsize, bool eol_flag)
 	if (!LZ4Stream_init(state, size, false /* decompressing */ ))
 		return -1;
 
+	/* No work needs to be done for a zero-sized output buffer */
+	if (size <= 0)
+		return 0;
+
 	/* Verify that there is enough space in the outbuf */
 	if (size > state->buflen)
 	{
@@ -636,7 +640,9 @@ LZ4Stream_gets(char *ptr, int size, CompressFileHandle *CFH)
 	LZ4State   *state = (LZ4State *) CFH->private_data;
 	int			ret;
 
-	ret = LZ4Stream_read_internal(state, ptr, size, true);
+	Assert(size > 1);
+
+	ret = LZ4Stream_read_internal(state, ptr, size - 1, true);
 	if (ret < 0 || (ret == 0 && !LZ4Stream_eof(CFH)))
 		pg_fatal("could not read from input file: %s", LZ4Stream_get_error(CFH));
 
@@ -644,6 +650,12 @@ LZ4Stream_gets(char *ptr, int size, CompressFileHandle *CFH)
 	if (ret == 0)
 		return NULL;
 
+	/*
+	 * Our caller expects the return string to be 

MERGE lacks ruleutils.c decompiling support!?

2023-05-05 Thread Tom Lane
I made this function:

CREATE OR REPLACE FUNCTION test_fun()
RETURNS void
LANGUAGE SQL
BEGIN ATOMIC
MERGE INTO target
USING source s on s.id = target.id
WHEN MATCHED THEN
  UPDATE SET data = s.data
WHEN NOT MATCHED THEN
  INSERT VALUES (s.id, s.data);
end;

It appears to work fine, but:

regression=# \sf+ test_fun()
ERROR:  unrecognized query command type: 5

and it also breaks pg_dump.  Somebody screwed up pretty badly
here.  Is there any hope of fixing it for Monday's releases?

(I'd guess that decompiling the WHEN clause would take a nontrivial
amount of new code, so maybe fixing it on such short notice is
impractical.  But ugh.)

regards, tom lane




Re: Add LZ4 compression in pg_dump

2023-05-05 Thread Andrew Dunstan


On 2023-05-05 Fr 06:02, gkokola...@pm.me wrote:





--- Original Message ---
On Friday, May 5th, 2023 at 8:00 AM, Alexander Lakhin wrote:




23.03.2023 20:10, Tomas Vondra wrote:


So pushed all three parts, after updating the commit messages a bit.

This leaves the empty-data issue (which we have a fix for) and the
switch to LZ4F. And then the zstd part.


I'm sorry that I haven't noticed/checked that before, but when trying to
perform check-world with Valgrind I've discovered another issue presumably
related to LZ4File_gets().
When running under Valgrind:
PROVE_TESTS=t/002_pg_dump.pl make check -C src/bin/pg_dump/
I get:
...
07:07:11.683 ok 1939 - compression_lz4_dir: glob check for
.../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir/*.dat.lz4
# Running: pg_restore --jobs=2 
--file=.../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir.sql
.../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir

==00:00:00:00.579 2811926== Conditional jump or move depends on uninitialised 
value(s)
==00:00:00:00.579 2811926== at 0x4853376: rawmemchr (vg_replace_strmem.c:1548)
==00:00:00:00.579 2811926== by 0x4C96A67: _IO_str_init_static_internal 
(strops.c:41)
==00:00:00:00.579 2811926== by 0x4C693A2: _IO_strfile_read (strfile.h:95)
==00:00:00:00.579 2811926== by 0x4C693A2: __isoc99_sscanf (isoc99_sscanf.c:28)
==00:00:00:00.579 2811926== by 0x11DB6F: _LoadLOs (pg_backup_directory.c:458)
==00:00:00:00.579 2811926== by 0x11DD1E: _PrintTocData 
(pg_backup_directory.c:422)
==00:00:00:00.579 2811926== by 0x118484: restore_toc_entry 
(pg_backup_archiver.c:882)
==00:00:00:00.579 2811926== by 0x1190CC: RestoreArchive 
(pg_backup_archiver.c:699)
==00:00:00:00.579 2811926== by 0x10F25D: main (pg_restore.c:414)
==00:00:00:00.579 2811926==
...

It looks like the line variable returned by gets_func() here is not
null-terminated:
while ((CFH->gets_func(line, MAXPGPATH, CFH)) != NULL)

{
...
if (sscanf(line, "%u %" CppAsString2(MAXPGPATH) "s\n", &oid, lofname) != 2)
...
And Valgrind doesn't like it.


Valgrind is correct to not like it. LZ4Stream_gets() got modeled after
gets() when it should have been modeled after fgets().

Please find a patch attached to address it.




Isn't using memset here a bit wasteful? Why not just put a null at the 
end after calling LZ4Stream_read_internal(), which tells you how many 
bytes it has written?



cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com


[DOC] Update ALTER SUBSCRIPTION documentation

2023-05-05 Thread Robert Sjöblom


Hi,

We have recently used the PostgreSQL documentation when setting up our 
logical replication. We noticed there was a step missing in the 
documentation on how to drop a logical replication subscription with a 
replication slot attached.


We clarify the documentation to include prerequisites for running the 
DROP SUBSCRIPTION command. Please see attached patch.


Best regards,
Robert Sjöblom,
Oscar Carlberg
--
The content of this e-mail is confidential and intended solely for the
addressee. Any distribution, copying or use of the content is prohibited
without the sender's permission. If this message has reached the wrong
addressee by mistake, please delete the original message and notify the
sender by e-mail.
diff --git a/doc/src/sgml/ref/drop_subscription.sgml b/doc/src/sgml/ref/drop_subscription.sgml
index 8d997c983f..81b3051070 100644
--- a/doc/src/sgml/ref/drop_subscription.sgml
+++ b/doc/src/sgml/ref/drop_subscription.sgml
@@ -86,8 +86,10 @@ DROP SUBSCRIPTION [ IF EXISTS ] name
    DROP SUBSCRIPTION command will fail.  To proceed in
-   this situation, disassociate the subscription from the replication slot by
-   executing ALTER SUBSCRIPTION ... SET (slot_name = NONE).
+   this situation, first disable the subscription with
+   ALTER SUBSCRIPTION ... DISABLE.  Then disassociate the
+   subscription from the replication slot by executing
+   ALTER SUBSCRIPTION ... SET (slot_name = NONE).
After that, DROP SUBSCRIPTION will no longer attempt any
actions on a remote host.  Note that if the remote replication slot still
exists, it (and any related table synchronization slots) should then be


Re: Should CreateExprContext() be using ALLOCSET_DEFAULT_SIZES?

2023-05-05 Thread Daniel Gustafsson
> On 17 Feb 2023, at 05:01, Andres Freund  wrote:

> ISTM that we really shouldn't use ALLOCSET_DEFAULT_SIZES for expression
> contexts, as they most commonly see only a few small, or no, allocations.

Looking into this I think you are correct.

> ISTM that we could save a reasonable amount of memory by using a smaller
> initial size.

I experimented with the below trivial patch in CreateExprContext:

-   return CreateExprContextInternal(estate, ALLOCSET_DEFAULT_SIZES);
+   return CreateExprContextInternal(estate, ALLOCSET_START_SMALL_SIZES);

Across various (unscientific) benchmarks, including expression heavy TPC-H
queries, I can see consistent reductions in memory use and tiny (within the
margin of error) increases in performance.  More importantly, I didn't see a
case of slowdowns with this applied or any adverse effects in terms of memory
use.  Whenever the initial size isn't big enough, the expression runtime
presumably exceeds the overhead of growing the allocation?

--
Daniel Gustafsson





Re: Fix typos and inconsistencies for v16

2023-05-05 Thread Michael Paquier
On Tue, May 02, 2023 at 12:26:31PM +0900, Michael Paquier wrote:
> On Fri, Apr 21, 2023 at 12:00:00PM +0300, Alexander Lakhin wrote:
>> 4. CommitTSBuffer -> CommitTsBuffer // the inconsistency exists since 
>> 5da14938f; maybe this change should be backpatched
> 
> Yes, we'd better backpatch that.  I agree that it seems more sensible
> here to switch the compiled value rather than what the docs have been
> using for years.  Perhaps somebody has a different opinion?

Hearing nothing, I have now applied this part down to 13, on time for
the next minor release.
--
Michael


signature.asc
Description: PGP signature


Re: Add two missing tests in 035_standby_logical_decoding.pl

2023-05-05 Thread Amit Kapila
On Fri, May 5, 2023 at 5:36 PM Drouvot, Bertrand
 wrote:
>
> On 5/5/23 12:58 PM, Amit Kapila wrote:
> > On Fri, May 5, 2023 at 4:02 PM Drouvot, Bertrand
> >  wrote:
>
> > How did you conclude that 00010003 is the file the
> > test is expecting to be removed?
> >
> because I added a note in the test that way:
>
> "
> @@ -535,6 +539,7 @@ $node_standby->safe_psql('postgres', 'checkpoint;');
>
>   # Verify that the WAL file has not been retained on the standby
>   my $standby_walfile = $node_standby->data_dir . '/pg_wal/' . $walfile_name;
> +note "BDT WAL file is $standby_walfile";
>   ok(!-f "$standby_walfile",
>  "invalidated logical slots do not lead to retaining WAL");
> "
>
> so that I can check in the test log file:
>
> grep "WAL file is" 
> ./build/testrun/recovery/035_standby_logical_decoding/log/regress_log_035_standby_logical_decoding
> [08:23:31.931](2.217s) # BDT WAL file is 
> /Users/admin/pgsql/build/testrun/recovery/035_standby_logical_decoding/data/t_035_standby_logical_decoding_standby_data/pgdata/pg_wal/00010003
>

It seems the current WAL file is not switched for some reason. I think
we need to add more DEBUG info to find that out. Can you please try to
print 'RedoRecPtr', '_logSegNo', and
recptr?

/*
* Delete old log files, those no longer needed for last checkpoint to
* prevent the disk holding the xlog from growing full.
*/
XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
KeepLogSeg(recptr, &_logSegNo);


-- 
With Regards,
Amit Kapila.




Re: Add two missing tests in 035_standby_logical_decoding.pl

2023-05-05 Thread Drouvot, Bertrand




On 5/5/23 12:58 PM, Amit Kapila wrote:

On Fri, May 5, 2023 at 4:02 PM Drouvot, Bertrand
 wrote:



How did you conclude that 00010003 is the file the
test is expecting to be removed?


because I added a note in the test that way:

"
@@ -535,6 +539,7 @@ $node_standby->safe_psql('postgres', 'checkpoint;');

 # Verify that the WAL file has not been retained on the standby
 my $standby_walfile = $node_standby->data_dir . '/pg_wal/' . $walfile_name;
+note "BDT WAL file is $standby_walfile";
 ok(!-f "$standby_walfile",
"invalidated logical slots do not lead to retaining WAL");
"

so that I can check in the test log file:

grep "WAL file is" 
./build/testrun/recovery/035_standby_logical_decoding/log/regress_log_035_standby_logical_decoding
[08:23:31.931](2.217s) # BDT WAL file is 
/Users/admin/pgsql/build/testrun/recovery/035_standby_logical_decoding/data/t_035_standby_logical_decoding_standby_data/pgdata/pg_wal/00010003

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: issue with meson builds on msys2

2023-05-05 Thread Andrew Dunstan


On 2023-05-04 Th 19:54, Andres Freund wrote:

Hi,

On 2023-05-03 09:20:28 -0400, Andrew Dunstan wrote:

On 2023-04-27 Th 18:18, Andres Freund wrote:

On 2023-04-26 09:59:05 -0400, Andrew Dunstan wrote:

Still running into this, and I am rather stumped. This is a blocker for
buildfarm support for meson:

Here's a simple illustration of the problem. If I do the identical test with
a non-meson build there is no problem:

Does this happen 100% reproducibly?

For a sufficiently modern installation of msys2 (20230318 version) this is
reproducible on autoconf builds as well.

For now it's off my list of meson blockers. I will pursue the issue when I
have time, but for now the IPC::Run workaround is sufficient.

Hm. I can't reproduce this in my test win10 VM, unfortunately. What OS / OS
version is the host? Any chance to get systeminfo.exe output or something like
that?



It's a Windows Server 2019 (v1809) instance running on AWS.


Here's an extract from systeminfo:


OS Name:   Microsoft Windows Server 2019 Datacenter
OS Version:    10.0.17763 N/A Build 17763
OS Manufacturer:   Microsoft Corporation
OS Configuration:  Standalone Server
OS Build Type: Multiprocessor Free
Registered Owner:  EC2
Registered Organization:   Amazon.com
Product ID:    00430-0-0-AA796
Original Install Date: 4/24/2023, 10:28:31 AM
System Boot Time:  4/24/2023, 1:49:59 PM
System Manufacturer:   Amazon EC2
System Model:  t3.large
System Type:   x64-based PC
Processor(s):  1 Processor(s) Installed.
   [01]: Intel64 Family 6 Model 85 Stepping 7 GenuineIntel ~2500 Mhz

BIOS Version:  Amazon EC2 1.0, 10/16/2017
Windows Directory: C:\Windows
System Directory:  C:\Windows\system32
Boot Device:   \Device\HarddiskVolume1
System Locale: en-us;English (United States)
Input Locale:  en-us;English (United States)
Time Zone: (UTC) Coordinated Universal Time
Total Physical Memory: 8,090 MB
Available Physical Memory: 4,843 MB
Virtual Memory: Max Size:  10,010 MB
Virtual Memory: Available: 7,405 MB
Virtual Memory: In Use:    2,605 MB




I think we ought to do something here. If newer environments cause failures
like this, it seems likely that this will spread to more and more applications
over time...



Just to reassure myself I have not been hallucinating, I repeated the test.


pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf/root/HEAD/inst
$ /usr/bin/perl -e 'system(qq{"bin/pg_ctl" -D data-C -w -l logfile start 
> startlog 2>&1}) ; print $? ? "BANG: $?\n" : "OK\n";'

OK

pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf/root/HEAD/inst
$ /usr/bin/perl -e 'system(qq{"bin/pg_ctl" -D data-C -w -l logfile stop 
> stoplog 2>&1}) ; print $? ? "BANG: $?\n" : "OK\n";'

BANG: 33280


If you want to play I can arrange access.


cheers


andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: Add two missing tests in 035_standby_logical_decoding.pl

2023-05-05 Thread Amit Kapila
On Fri, May 5, 2023 at 4:02 PM Drouvot, Bertrand
 wrote:
>
> On 5/5/23 11:29 AM, Amit Kapila wrote:
> > On Fri, May 5, 2023 at 1:16 PM Drouvot, Bertrand
> >  wrote:
> >>
> >>
> >> After multiple attempts, I got one failing one.
> >>
> >> Issue is that we expect this file to be removed:
> >>
> >> [07:24:27.261](0.899s) #WAL file is 
> >> /Users/admin/pgsql/build/testrun/recovery/035_standby_logical_decoding/data/t_035_standby_logical_decoding_standby_data/pgdata/pg_wal/00010003
> >>
> >> But the standby emits:
> >>
> >> 2023-05-05 07:24:27.216 UTC [17909][client backend] 
> >> [035_standby_logical_decoding.pl][3/6:0] LOG:  statement: checkpoint;
> >> 2023-05-05 07:24:27.216 UTC [17745][checkpointer] LOG:  restartpoint 
> >> starting: immediate wait
> >> 2023-05-05 07:24:27.259 UTC [17745][checkpointer] LOG:  attempting to 
> >> remove WAL segments older than log file 0002
> >>
> >> So it seems the test is not right (missing activity??), not sure why yet.
> >>
> >
> > Can you try to print the value returned by
> > XLogGetReplicationSlotMinimumLSN() in KeepLogSeg() on standby? Also,
> > please try to print "attempting to remove WAL segments ..." on the
> > primary. We can see if, by any chance, some slot is holding us back from
> > removing the required WAL file.
> >
>
> I turned DEBUG2 on. We can also see on the primary:
>
> 2023-05-05 08:23:30.843 UTC [16833][checkpointer] LOCATION:  
> CheckPointReplicationSlots, slot.c:1576
> 2023-05-05 08:23:30.844 UTC [16833][checkpointer] DEBUG:  0: snapshot of 
> 0+0 running transaction ids (lsn 0/4D0 oldest xid 746 latest complete 745 
> next xid 746)
> 2023-05-05 08:23:30.844 UTC [16833][checkpointer] LOCATION:  
> LogCurrentRunningXacts, standby.c:1377
> 2023-05-05 08:23:30.845 UTC [16833][checkpointer] LOG:  0: BDT1 about to 
> call RemoveOldXlogFiles in CreateCheckPoint
> 2023-05-05 08:23:30.845 UTC [16833][checkpointer] LOCATION:  
> CreateCheckPoint, xlog.c:6835
> 2023-05-05 08:23:30.845 UTC [16833][checkpointer] LOG:  0: attempting to 
> remove WAL segments older than log file 0002
> 2023-05-05 08:23:30.845 UTC [16833][checkpointer] LOCATION:  
> RemoveOldXlogFiles, xlog.c:3560
> 2023-05-05 08:23:30.845 UTC [16833][checkpointer] DEBUG:  0: recycled 
> write-ahead log file "00010001"
> 2023-05-05 08:23:30.845 UTC [16833][checkpointer] LOCATION:  RemoveXlogFile, 
> xlog.c:3708
> 2023-05-05 08:23:30.845 UTC [16833][checkpointer] DEBUG:  0: recycled 
> write-ahead log file "00010002"
> 2023-05-05 08:23:30.845 UTC [16833][checkpointer] LOCATION:  RemoveXlogFile, 
> xlog.c:3708
> 2023-05-05 08:23:30.845 UTC [16833][checkpointer] DEBUG:  0: 
> SlruScanDirectory invoking callback on pg_subtrans/
>
> So, 00010003 is not removed on the primary.
>

How did you conclude that 00010003 is the file the
test is expecting to be removed?


--
With Regards,
Amit Kapila.




Re: Tables getting stuck at 's' state during logical replication

2023-05-05 Thread Amit Kapila
On Fri, May 5, 2023 at 3:04 PM Padmavathi G  wrote:
>
> Hello,
> I am trying out logical replication for upgrading postgres instance from 
> version 11 to 15.x. In the process, I noticed that some tables get stuck in 
> the 's' state during logical replication and they do not move to the 'r' 
> state. I tried to drop the subscription and create a new subscriber, but the 
> same thing repeated. Also, each time different tables get stuck, it is not 
> like the same tables get stuck every time.
>

This is strange. BTW, we don't save slots after the upgrade, so the
subscriptions in the upgraded node won't be valid. We have some
discussion on this topic in threads [1][2]. So, I think after the
upgrade one needs to anyway re-create the subscriptions. Can you share
your exact steps for the upgrade and what is the state before the
upgrade? Is it possible to share some minimal test case to show the
exact problem you are facing?

[1] - 
https://www.postgresql.org/message-id/20230217075433.u5mjly4d5cr4hcfe%40jrouhaud
[2] - 
https://www.postgresql.org/message-id/TYAPR01MB58664C81887B3AF2EB6B16E3F5939%40TYAPR01MB5866.jpnprd01.prod.outlook.com


-- 
With Regards,
Amit Kapila.




Re: Add two missing tests in 035_standby_logical_decoding.pl

2023-05-05 Thread Drouvot, Bertrand




On 5/5/23 11:29 AM, Amit Kapila wrote:

On Fri, May 5, 2023 at 1:16 PM Drouvot, Bertrand
 wrote:



After multiple attempts, I got one failing one.

Issue is that we expect this file to be removed:

[07:24:27.261](0.899s) #WAL file is 
/Users/admin/pgsql/build/testrun/recovery/035_standby_logical_decoding/data/t_035_standby_logical_decoding_standby_data/pgdata/pg_wal/00010003

But the standby emits:

2023-05-05 07:24:27.216 UTC [17909][client backend] 
[035_standby_logical_decoding.pl][3/6:0] LOG:  statement: checkpoint;
2023-05-05 07:24:27.216 UTC [17745][checkpointer] LOG:  restartpoint starting: 
immediate wait
2023-05-05 07:24:27.259 UTC [17745][checkpointer] LOG:  attempting to remove 
WAL segments older than log file 0002

So it seems the test is not right (missing activity??), not sure why yet.



Can you try to print the value returned by
XLogGetReplicationSlotMinimumLSN() in KeepLogSeg() on standby? Also,
please try to print "attempting to remove WAL segments ..." on the
primary. We can see if, by any chance, some slot is holding us back from
removing the required WAL file.



I turned DEBUG2 on. We can also see on the primary:

2023-05-05 08:23:30.843 UTC [16833][checkpointer] LOCATION:  
CheckPointReplicationSlots, slot.c:1576
2023-05-05 08:23:30.844 UTC [16833][checkpointer] DEBUG:  0: snapshot of 
0+0 running transaction ids (lsn 0/4D0 oldest xid 746 latest complete 745 
next xid 746)
2023-05-05 08:23:30.844 UTC [16833][checkpointer] LOCATION:  
LogCurrentRunningXacts, standby.c:1377
2023-05-05 08:23:30.845 UTC [16833][checkpointer] LOG:  0: BDT1 about to 
call RemoveOldXlogFiles in CreateCheckPoint
2023-05-05 08:23:30.845 UTC [16833][checkpointer] LOCATION:  CreateCheckPoint, 
xlog.c:6835
2023-05-05 08:23:30.845 UTC [16833][checkpointer] LOG:  0: attempting to 
remove WAL segments older than log file 0002
2023-05-05 08:23:30.845 UTC [16833][checkpointer] LOCATION:  
RemoveOldXlogFiles, xlog.c:3560
2023-05-05 08:23:30.845 UTC [16833][checkpointer] DEBUG:  0: recycled write-ahead log 
file "00010001"
2023-05-05 08:23:30.845 UTC [16833][checkpointer] LOCATION:  RemoveXlogFile, 
xlog.c:3708
2023-05-05 08:23:30.845 UTC [16833][checkpointer] DEBUG:  0: recycled write-ahead log 
file "00010002"
2023-05-05 08:23:30.845 UTC [16833][checkpointer] LOCATION:  RemoveXlogFile, 
xlog.c:3708
2023-05-05 08:23:30.845 UTC [16833][checkpointer] DEBUG:  0: 
SlruScanDirectory invoking callback on pg_subtrans/

So, 00010003 is not removed on the primary.

It has been recycled on:

2023-05-05 08:23:38.605 UTC [16833][checkpointer] DEBUG:  0: recycled write-ahead log 
file "00010003"

Which is later than the test:

[08:23:31.931](0.000s) not ok 19 - invalidated logical slots do not lead to 
retaining WAL

FWIW, the failing test with DEBUG2 can be found there: 
https://cirrus-ci.com/task/5615316688961536

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Support logical replication of DDLs

2023-05-05 Thread Peter Smith
I revisited the 0005 patch to verify all changes made to address my
previous review comments [1][2][3][4] were OK.

Not all changes were made quite as expected, and there were a few
other things I noticed in passing.

==

1. General

I previously [1] wrote a comment:
Use consistent uppercase for JSON and DDL instead of sometimes json
and ddl. There are quite a few random examples in the commit message
but might be worth searching the entire patch to make all comments
also consistent case.

Now I still find a number of lowercase "json" and "ddl" strings.

~~~

2.

Some of my previous review comments were replied [5] as "Will add this
in a later version"... or "Keeping this same for now".

IMO it would be much better to add a "TODO" or a "FIXME" comment
directly into the source for anything described like that, otherwise
the list of things "to do later" is just going to get misplaced in all
the long thread posts, and/or other reviews are going to report
exactly the same missing stuff again later. This review comment
applies also to PG DOCS which are known as incomplete or under
discussion - I think there should be a TODO/FIXME in those SGML files
or the Commit message.

e.g. [1]#9b
e.g. [2]#12abc
e.g. [2]#13c
e.g. [2]#14b
e.g. [2]#15ab
e.g. [2]#17
e.g. [3] (table excessively long)
e.g. [4]#2
e.g. [4]#11

~~~

3. Commit message

Executing a non-immutable expression during the table rewrite phase is not
allowed, as it may result in different data between publisher and subscriber.
While some may suggest converting the rewrite inserts to updates and replicate
them afte the ddl command to maintain data consistency. But it doesn't work if
the replica identity column is altered in the command. This is because the
rewrite inserts do not contain the old values and therefore cannot be converted
to update.

~

3a.
Grammar and typo need fixing for "While some may suggest converting
the rewrite inserts to updates and replicate them afte the ddl command
to maintain data consistency. But it doesn't work if the replica
identity column is altered in the command."

~

3b.
"rewrite inserts to updates"
Consider using uppercase for the INSERTs and UPDATEs

~~~

4.
LIMIT:

--> LIMITATIONS: (??)


==
contrib/test_decoding/sql/ddl.sql

5.
+SELECT 'ddl msg2' FROM pg_logical_emit_ddl_message('ddl msg2', 16394,
1, '{"fmt": "CREATE SCHEMA %{if_not_exists}s %{name}I
%{authorization}s", "name": "foo", "authorization": {"fmt":
"AUTHORIZATION %{authorization_role}I", "present": false,
"authorization_role": null}, "if_not_exists": ""}');

Previously ([4]#1) I had asked what is the point of setting a JSON
payload here when the JSON payload is never used. You might as well
pass the string "banana" to achieve the same thing AFAICT. I think the
reply [5] to the question was wrong. If this faked JSON is used for
some good reason then there ought to be a test comment to say the
reason. Otherwise, the fake JSON just confuses the purpose of this
test so it should be removed/simplified.

==
contrib/test_decoding/test_decoding.c

6. pg_decode_ddl_message

Previously ([4] #4b) I asked if it was necessary to use
appendBinaryStringInfo, instead of just appendStringInfo. I guess it
doesn't matter much, but I think the question was not answered.

==
doc/src/sgml/catalogs.sgml

7.
+ 
+  
+   evtisinternal char
+  
+  
+   True if the event trigger is internally generated.
+  
+ 

Why was this called a 'char' type instead of a 'bool' type?

==
doc/src/sgml/logical-replication.sgml

8.
+  
+For example, a CREATE TABLE command executed on the publisher gets
+WAL-logged, and forwarded to the subscriber to replay; a subsequent "ALTER
+SUBSCRIPTION ... REFRESH PUBLICATION" is performed on the
subscriber database so any
+following DML changes on the new table can be replicated.
+  

In my previous review comments ([2] 11b) I suggested for this to say
"then an implicit ALTER..." instead of "a subsequent ALTER...". I
think the "implicit" part got accidentally missed.

~~~

9.
+
+  
+In ADD COLUMN ... DEFAULT clause and
+ALTER COLUMN TYPE clause of ALTER
+TABLE command, the functions and operators used in
+expression must be immutable.
+  
+

IMO this is hard to read. It might be easier if expressed as 2
separate bullet points.

SUGGESTION
For ALTER TABLE ... ADD COLUMN ... DEFAULT, the functions and
operators used in expressions must be immutable.

For ALTER TABLE ... ADD COLUMN TYPE, the functions and operators used
in expressions must be immutable.

~~~

10.
+  
+To change the column type, first add a new column of the desired
+type, then update the new column value with the old column value,
+and finnally drop the old column and rename the new column to the
+old column.
+  

/finnally/finally/

==
.../access/rmgrdesc/logicalddlmsgdesc.c

11. 

Re: Add LZ4 compression in pg_dump

2023-05-05 Thread gkokolatos





--- Original Message ---
On Friday, May 5th, 2023 at 8:00 AM, Alexander Lakhin  
wrote:


> 
> 
> 23.03.2023 20:10, Tomas Vondra wrote:
> 
> > So pushed all three parts, after updating the commit messages a bit.
> > 
> > This leaves the empty-data issue (which we have a fix for) and the
> > switch to LZ4F. And then the zstd part.
> 
> 
> I'm sorry that I haven't noticed/checked that before, but when trying to
> perform check-world with Valgrind I've discovered another issue presumably
> related to LZ4File_gets().
> When running under Valgrind:
> PROVE_TESTS=t/002_pg_dump.pl make check -C src/bin/pg_dump/
> I get:
> ...
> 07:07:11.683 ok 1939 - compression_lz4_dir: glob check for
> .../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir/*.dat.lz4
> # Running: pg_restore --jobs=2 
> --file=.../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir.sql
> .../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir
> 
> ==00:00:00:00.579 2811926== Conditional jump or move depends on uninitialised 
> value(s)
> ==00:00:00:00.579 2811926== at 0x4853376: rawmemchr (vg_replace_strmem.c:1548)
> ==00:00:00:00.579 2811926== by 0x4C96A67: _IO_str_init_static_internal 
> (strops.c:41)
> ==00:00:00:00.579 2811926== by 0x4C693A2: _IO_strfile_read (strfile.h:95)
> ==00:00:00:00.579 2811926== by 0x4C693A2: __isoc99_sscanf (isoc99_sscanf.c:28)
> ==00:00:00:00.579 2811926== by 0x11DB6F: _LoadLOs (pg_backup_directory.c:458)
> ==00:00:00:00.579 2811926== by 0x11DD1E: _PrintTocData 
> (pg_backup_directory.c:422)
> ==00:00:00:00.579 2811926== by 0x118484: restore_toc_entry 
> (pg_backup_archiver.c:882)
> ==00:00:00:00.579 2811926== by 0x1190CC: RestoreArchive 
> (pg_backup_archiver.c:699)
> ==00:00:00:00.579 2811926== by 0x10F25D: main (pg_restore.c:414)
> ==00:00:00:00.579 2811926==
> ...
> 
> It looks like the line variable returned by gets_func() here is not
> null-terminated:
> while ((CFH->gets_func(line, MAXPGPATH, CFH)) != NULL)
> 
> {
> ...
> if (sscanf(line, "%u %" CppAsString2(MAXPGPATH) "s\n", &oid, lofname) != 2)
> ...
> And Valgrind doesn't like it.
> 

Valgrind is correct to not like it. LZ4Stream_gets() got modeled after
gets() when it should have been modeled after fgets().

Please find a patch attached to address it.

Cheers,
//Georgios

> Best regards,
> Alexander

From 587873da2b563c59b281051c2636cda667abf099 Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos 
Date: Fri, 5 May 2023 09:47:02 +
Subject: [PATCH] Null terminate the output buffer of LZ4Stream_gets

LZ4Stream_gets did not null terminate its output buffer. Its callers expected
the buffer to be null terminated so they passed it around to functions such as
sscanf with unintended consequences.

Reported-by: Alexander Lakhin
---
 src/bin/pg_dump/compress_lz4.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/bin/pg_dump/compress_lz4.c b/src/bin/pg_dump/compress_lz4.c
index 423e1b7976..26c9a8b280 100644
--- a/src/bin/pg_dump/compress_lz4.c
+++ b/src/bin/pg_dump/compress_lz4.c
@@ -459,6 +459,10 @@ LZ4Stream_read_internal(LZ4State *state, void *ptr, int ptrsize, bool eol_flag)
 	if (!LZ4Stream_init(state, size, false /* decompressing */ ))
 		return -1;
 
+	/* No work needs to be done for a zero-sized output buffer */
+	if (size <= 0)
+		return 0;
+
 	/* Verify that there is enough space in the outbuf */
 	if (size > state->buflen)
 	{
@@ -636,7 +640,12 @@ LZ4Stream_gets(char *ptr, int size, CompressFileHandle *CFH)
 	LZ4State   *state = (LZ4State *) CFH->private_data;
 	int			ret;
 
-	ret = LZ4Stream_read_internal(state, ptr, size, true);
+	Assert(size > 1);
+
+	/* Our caller expects the return string to be NULL terminated */
+	memset(ptr, '\0', size);
+
+	ret = LZ4Stream_read_internal(state, ptr, size - 1, true);
 	if (ret < 0 || (ret == 0 && !LZ4Stream_eof(CFH)))
 		pg_fatal("could not read from input file: %s", LZ4Stream_get_error(CFH));
 
-- 
2.34.1



Re: Add two missing tests in 035_standby_logical_decoding.pl

2023-05-05 Thread Amit Kapila
On Fri, May 5, 2023 at 2:59 PM Amit Kapila  wrote:
>
> On Fri, May 5, 2023 at 1:16 PM Drouvot, Bertrand
>  wrote:
> >
> >
> > After multiple attempts, I got one failing one.
> >
> > Issue is that we expect this file to be removed:
> >
> > [07:24:27.261](0.899s) #WAL file is 
> > /Users/admin/pgsql/build/testrun/recovery/035_standby_logical_decoding/data/t_035_standby_logical_decoding_standby_data/pgdata/pg_wal/00010003
> >
> > But the standby emits:
> >
> > 2023-05-05 07:24:27.216 UTC [17909][client backend] 
> > [035_standby_logical_decoding.pl][3/6:0] LOG:  statement: checkpoint;
> > 2023-05-05 07:24:27.216 UTC [17745][checkpointer] LOG:  restartpoint 
> > starting: immediate wait
> > 2023-05-05 07:24:27.259 UTC [17745][checkpointer] LOG:  attempting to 
> > remove WAL segments older than log file 0002
> >
> > So it seems the test is not right (missing activity??), not sure why yet.
> >
>
> Can you try to print the value returned by
> XLogGetReplicationSlotMinimumLSN() in KeepLogSeg() on standby? Also,
> please try to print "attempting to remove WAL segments ..." on the
> primary. We can see if, by any chance, some slot is holding us back from
> removing the required WAL file.
>

We can also probably check the values of 'endptr',  'receivePtr', and
replayPtr on standby in the below code:

CreateRestartPoint()
{
...
/*
* Retreat _logSegNo using the current end of xlog replayed or received,
* whichever is later.
*/
receivePtr = GetWalRcvFlushRecPtr(NULL, NULL);
replayPtr = GetXLogReplayRecPtr();
endptr = (receivePtr < replayPtr) ? replayPtr : receivePtr;
KeepLogSeg(endptr, &_logSegNo);
...
}

-- 
With Regards,
Amit Kapila.




Feature: Add reloption support for table access method

2023-05-05 Thread 吴昊
Hi pg-hackers,


While writing an extension that implements a new storage engine via the table
access method, I found that the existing code makes strong assumptions about
heap tables. Here are 3 issues that I currently have:


1. The index access method has a callback to handle reloptions, but the table
access method hasn't. We can't add storage-specific parameters with the `WITH`
clause when creating a table.

2. The existing code strongly assumes that the data file of a table is
structured as a series of physical files named by a hard-coded rule (the
relfilenode number, with an optional segment suffix). That may only fit
heap-like tables. A new storage engine may have its own structure for how its
data files are organized. The problem shows up when dropping a table.

3. The existing code also assumes that the data file consists of a series of
fixed-size blocks. That may not suit other storage engines. Are there any
suggestions for this situation?


The rest of this mail discusses the first issue. It looks reasonable to add a
similar callback to struct TableAmRoutine and parse reloptions through that
callback. The patch is attached.


Another thing about reloptions is that the current API passes a parameter
`validate` to tell the parse function whether to check and raise an error. The
parse function doesn't have enough context about where these reloptions are
used:
1. CREATE TABLE ... WITH(...)
2. ALTER TABLE ... SET ...
3. ALTER TABLE ... RESET ...
The context matters because some reloptions may not be changed after the
table is created, while others may.



I wonder if this change makes sense for you.


The attached patch only adds the callback for TAM-specific implementations;
it does not include the change about usage context.


Regards,
Hao Wu


reloptions.diffs
Description: Binary data


Tables getting stuck at 's' state during logical replication

2023-05-05 Thread Padmavathi G
Hello,
I am trying out logical replication for upgrading postgres instance from
version 11 to 15.x. In the process, I noticed that some tables get stuck in
the 's' state during logical replication and they do not move to the 'r'
state. I tried to drop the subscription and create a new subscriber, but
the same thing repeated. Also, each time different tables get stuck, it is
not like the same tables get stuck every time. This also happens to tables
that do not have frequent writes to them. Then I tried to drop the tables
which got stuck from the original publication and created a new publication
and new subscribers just with the tables which got stuck in the previous
subscription. Now, in this new subscription, some tables go to 'r' state
(even though they did not in the older subscription) and some tables remain
in 's' state. If I continue doing this process for 5-6 times I am able to
cover all of the tables. Since these tables get stuck at 's' state, the
replication origins are not deleted automatically when I delete the table
from the publication and I had to manually delete the replication origin.
The logs contain "table synchronization worker for ... has finished" even for
tables in the 's' state. Can somebody please help with this issue, as I do not
want to manually intervene by creating a new publication and subscription for
tables that get stuck at 's'.

Configuration

max_replication_slots : 75 in publisher and subscriber
max_worker_processes: 60 (in subscriber)
max_logical_replication_workers: 55 (in subscriber)


Thanks,

Padmavathi


Re: Add two missing tests in 035_standby_logical_decoding.pl

2023-05-05 Thread Amit Kapila
On Fri, May 5, 2023 at 1:16 PM Drouvot, Bertrand
 wrote:
>
>
> After multiple attempts, I got one failing one.
>
> Issue is that we expect this file to be removed:
>
> [07:24:27.261](0.899s) #WAL file is 
> /Users/admin/pgsql/build/testrun/recovery/035_standby_logical_decoding/data/t_035_standby_logical_decoding_standby_data/pgdata/pg_wal/00010003
>
> But the standby emits:
>
> 2023-05-05 07:24:27.216 UTC [17909][client backend] 
> [035_standby_logical_decoding.pl][3/6:0] LOG:  statement: checkpoint;
> 2023-05-05 07:24:27.216 UTC [17745][checkpointer] LOG:  restartpoint 
> starting: immediate wait
> 2023-05-05 07:24:27.259 UTC [17745][checkpointer] LOG:  attempting to remove 
> WAL segments older than log file 0002
>
> So it seems the test is not right (missing activity??), not sure why yet.
>

Can you try to print the value returned by
XLogGetReplicationSlotMinimumLSN() in KeepLogSeg() on standby? Also,
please try to print "attempting to remove WAL segments ..." on the
primary. We can see if, by any chance, some slot is holding us back from
removing the required WAL file.

-- 
With Regards,
Amit Kapila.




Re: A bug with ExecCheckPermissions

2023-05-05 Thread Alvaro Herrera
On 2023-May-04, Tom Lane wrote:

> It looks like this patch caused a change in the order of output from
> the sepgsql tests [1].  If you expected it to re-order permissions
> checking then this is probably fine, and we should just update the
> expected output.

Yeah, looks correct.  Fix pushed.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Every machine is a smoke machine if you operate it wrong enough."
https://twitter.com/libseybieda/status/1541673325781196801




Re: Add two missing tests in 035_standby_logical_decoding.pl

2023-05-05 Thread Drouvot, Bertrand




On 5/5/23 9:04 AM, Amit Kapila wrote:

On Fri, May 5, 2023 at 11:08 AM Drouvot, Bertrand
 wrote:


On 5/4/23 6:43 AM, Amit Kapila wrote:

On Thu, May 4, 2023 at 8:37 AM vignesh C  wrote:


Thanks for posting the updated patch, I had run this test in a loop of
100 times to verify that there was no failure because of race
conditions. The 100 times execution passed successfully.

One suggestion:
"wal file" should be changed to "WAL file":
+# Request a checkpoint on the standby to trigger the WAL file(s) removal
+$node_standby->safe_psql('postgres', 'checkpoint;');
+
+# Verify that the wal file has not been retained on the standby
+my $standby_walfile = $node_standby->data_dir . '/pg_wal/' . $walfile_name;



Thanks for the verification. I have pushed the patch.



It looks like there is still something wrong with this test as there
are a bunch of cfbot errors on this new test (mainly on macOS - Ventura - 
Meson).



Is it possible for you to point me to those failures?


I'll try to reproduce with more debug info.



Okay, thanks!



After multiple attempts, I got one failing one.

Issue is that we expect this file to be removed:

[07:24:27.261](0.899s) #WAL file is 
/Users/admin/pgsql/build/testrun/recovery/035_standby_logical_decoding/data/t_035_standby_logical_decoding_standby_data/pgdata/pg_wal/00010003

But the standby emits:

2023-05-05 07:24:27.216 UTC [17909][client backend] 
[035_standby_logical_decoding.pl][3/6:0] LOG:  statement: checkpoint;
2023-05-05 07:24:27.216 UTC [17745][checkpointer] LOG:  restartpoint starting: 
immediate wait
2023-05-05 07:24:27.259 UTC [17745][checkpointer] LOG:  attempting to remove 
WAL segments older than log file 0002

So it seems the test is not right (missing activity??), not sure why yet.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Add two missing tests in 035_standby_logical_decoding.pl

2023-05-05 Thread Drouvot, Bertrand

Hi,

On 5/5/23 9:11 AM, vignesh C wrote:

On Fri, 5 May 2023 at 12:34, Amit Kapila  wrote:


On Fri, May 5, 2023 at 11:08 AM Drouvot, Bertrand
 wrote:


On 5/4/23 6:43 AM, Amit Kapila wrote:

On Thu, May 4, 2023 at 8:37 AM vignesh C  wrote:


Thanks for posting the updated patch, I had run this test in a loop of
100 times to verify that there was no failure because of race
conditions. The 100 times execution passed successfully.

One suggestion:
"wal file" should be changed to "WAL file":
+# Request a checkpoint on the standby to trigger the WAL file(s) removal
+$node_standby->safe_psql('postgres', 'checkpoint;');
+
+# Verify that the wal file has not been retained on the standby
+my $standby_walfile = $node_standby->data_dir . '/pg_wal/' . $walfile_name;



Thanks for the verification. I have pushed the patch.



It looks like there is still something wrong with this test as there
are a bunch of cfbot errors on this new test (mainly on macOS - Ventura - 
Meson).



Is it possible for you to point me to those failures?


I think these failures are occurring in the cfbot; one such instance is at:
https://cirrus-ci.com/task/6642271152504832?logs=test_world#L39
https://api.cirrus-ci.com/v1/artifact/task/6642271152504832/testrun/build/testrun/recovery/035_standby_logical_decoding/log/regress_log_035_standby_logical_decoding



Yeah, thanks, that's one of them.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Add two missing tests in 035_standby_logical_decoding.pl

2023-05-05 Thread vignesh C
On Fri, 5 May 2023 at 12:34, Amit Kapila  wrote:
>
> On Fri, May 5, 2023 at 11:08 AM Drouvot, Bertrand
>  wrote:
> >
> > On 5/4/23 6:43 AM, Amit Kapila wrote:
> > > On Thu, May 4, 2023 at 8:37 AM vignesh C  wrote:
> > >>
> > >> Thanks for posting the updated patch, I had run this test in a loop of
> > >> 100 times to verify that there was no failure because of race
> > >> conditions. The 100 times execution passed successfully.
> > >>
> > >> One suggestion:
> > >> "wal file" should be changed to "WAL file":
> > >> +# Request a checkpoint on the standby to trigger the WAL file(s) removal
> > >> +$node_standby->safe_psql('postgres', 'checkpoint;');
> > >> +
> > >> +# Verify that the wal file has not been retained on the standby
> > >> +my $standby_walfile = $node_standby->data_dir . '/pg_wal/' . 
> > >> $walfile_name;
> > >>
> > >
> > > Thanks for the verification. I have pushed the patch.
> > >
> >
> > It looks like there is still something wrong with this test as there
> > are a bunch of cfbot errors on this new test (mainly on macOS - Ventura - 
> > Meson).
> >
>
> Is it possible for you to point me to those failures?

I think these failures are occurring in the cfbot; one such instance is at:
https://cirrus-ci.com/task/6642271152504832?logs=test_world#L39
https://api.cirrus-ci.com/v1/artifact/task/6642271152504832/testrun/build/testrun/recovery/035_standby_logical_decoding/log/regress_log_035_standby_logical_decoding

Regards,
Vignesh




Re: Add two missing tests in 035_standby_logical_decoding.pl

2023-05-05 Thread Amit Kapila
On Fri, May 5, 2023 at 11:08 AM Drouvot, Bertrand
 wrote:
>
> On 5/4/23 6:43 AM, Amit Kapila wrote:
> > On Thu, May 4, 2023 at 8:37 AM vignesh C  wrote:
> >>
> >> Thanks for posting the updated patch, I had run this test in a loop of
> >> 100 times to verify that there was no failure because of race
> >> conditions. The 100 times execution passed successfully.
> >>
> >> One suggestion:
> >> "wal file" should be changed to "WAL file":
> >> +# Request a checkpoint on the standby to trigger the WAL file(s) removal
> >> +$node_standby->safe_psql('postgres', 'checkpoint;');
> >> +
> >> +# Verify that the wal file has not been retained on the standby
> >> +my $standby_walfile = $node_standby->data_dir . '/pg_wal/' . 
> >> $walfile_name;
> >>
> >
> > Thanks for the verification. I have pushed the patch.
> >
>
> It looks like there is still something wrong with this test as there
> are a bunch of cfbot errors on this new test (mainly on macOS - Ventura - 
> Meson).
>

Is it possible for you to point me to those failures?

> I'll try to reproduce with more debug infos.
>

Okay, thanks!

-- 
With Regards,
Amit Kapila.




Re: Add LZ4 compression in pg_dump

2023-05-05 Thread Alexander Lakhin

23.03.2023 20:10, Tomas Vondra wrote:

So pushed all three parts, after updating the commit messages a bit.

This leaves the empty-data issue (which we have a fix for) and the
switch to LZ4F. And then the zstd part.



I'm sorry I hadn't noticed/checked this before, but when running check-world under Valgrind I discovered another issue, presumably related to LZ4File_gets().
When running under Valgrind:
PROVE_TESTS=t/002_pg_dump.pl make check -C src/bin/pg_dump/
I get:
...
[07:07:11.683](0.000s) ok 1939 - compression_lz4_dir: glob check for .../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir/*.dat.lz4
# Running: pg_restore --jobs=2 --file=.../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir.sql .../src/bin/pg_dump/tmp_check/tmp_test_HB6A/compression_lz4_dir

==00:00:00:00.579 2811926== Conditional jump or move depends on uninitialised value(s)
==00:00:00:00.579 2811926==    at 0x4853376: rawmemchr (vg_replace_strmem.c:1548)
==00:00:00:00.579 2811926==    by 0x4C96A67: _IO_str_init_static_internal (strops.c:41)
==00:00:00:00.579 2811926==    by 0x4C693A2: _IO_strfile_read (strfile.h:95)
==00:00:00:00.579 2811926==    by 0x4C693A2: __isoc99_sscanf (isoc99_sscanf.c:28)
==00:00:00:00.579 2811926==    by 0x11DB6F: _LoadLOs (pg_backup_directory.c:458)
==00:00:00:00.579 2811926==    by 0x11DD1E: _PrintTocData (pg_backup_directory.c:422)
==00:00:00:00.579 2811926==    by 0x118484: restore_toc_entry (pg_backup_archiver.c:882)
==00:00:00:00.579 2811926==    by 0x1190CC: RestoreArchive (pg_backup_archiver.c:699)
==00:00:00:00.579 2811926==    by 0x10F25D: main (pg_restore.c:414)
==00:00:00:00.579 2811926==
...

It looks like the line variable returned by gets_func() here is not
null-terminated:
    while ((CFH->gets_func(line, MAXPGPATH, CFH)) != NULL)
    {
...
    if (sscanf(line, "%u %" CppAsString2(MAXPGPATH) "s\n", &oid, lofname) != 2)
...
And Valgrind doesn't like it.

Best regards,
Alexander