Re: logical decoding and replication of sequences

2022-03-12 Thread Peter Eisentraut

Further review (based on 20220310 patch):

 doc/src/sgml/ref/create_publication.sgml   |   3 +

For the clauses added to the synopsis, descriptions should be added
below.  See attached patch for a start.

 src/backend/commands/sequence.c|  79 ++

There is quite a bit of overlap between ResetSequence() and
ResetSequence2(), but I couldn't see a good way to combine them that
genuinely saves code and complexity.  So maybe it's ok.

Actually, ResetSequence2() is not really "reset", it's just "set".
Maybe pick a different function name.

 src/backend/commands/subscriptioncmds.c| 272 +++

The code added in AlterSubscription_refresh() seems to be entirely
copy-and-paste from the tables case.  I think this could be combined
by concatenating the lists from fetch_table_list() and
fetch_sequence_list() and looping over it once.  The same also applies
to CreateSubscription(), although the code duplication is smaller
there.

This in turn means that fetch_table_list() and fetch_sequence_list()
can be combined, so that you don't actually need any extensive new
code in CreateSubscription() and AlterSubscription_refresh() for
sequences.  This could go on, you can combine more of the underlying
code, like pg_publication_tables and pg_publication_sequences and so
on.

 src/backend/replication/logical/proto.c|  52 ++

The added protocol message needs to be documented.  See attached patch
for a start.

The sequence message does not contain the sequence Oid, unlike the
relation message.  Would that be good to add?

 src/backend/replication/logical/worker.c   |  56 ++

Maybe the Asserts in apply_handle_sequence() should be elogs.  These
are checking what is sent over the network, so we don't want a
bad/evil peer able to trigger asserts.  And in non-assert builds these
conditions would be unchecked.

 src/backend/replication/pgoutput/pgoutput.c|  82 +-

I find the logic in get_rel_sync_entry() confusing.  You add a section for

if (!publish && is_sequence)

but then shouldn't the code below that be something like

if (!publish && !is_sequence)

 src/bin/pg_dump/t/002_pg_dump.pl   |  38 +-

This adds a new publication "pub4", but the tests already contain a
"pub4".  I'm not sure why this even works, but perhaps the new one
should be "pub5", unless there is a deeper meaning.

 src/include/catalog/pg_publication_namespace.h |   3 +-

I don't like how the distinction between table and sequence is done
using a bool field.  That affects also the APIs in pg_publication.c
and publicationcmds.c especially.  There is a lot of unadorned "true"
and "false" being passed around that isn't very clear, and it all
appears to originate at this catalog.  I think we could use a char
field here that uses the relkind constants.  That would also make the
code in pg_publication.c etc. slightly clearer.


See attached patch for more small tweaks.

Your patch still contains a number of XXX and FIXME comments, which in 
my assessment are all more or less correct, so I didn't comment on those 
separately.


Other than that, this seems pretty good.

Earlier in the thread I commented on some aspects of the new grammar 
(e.g., do we need FOR ALL SEQUENCES?).  I think this would be useful to 
review again after all the new logical replication patches are in.  I 
don't want to hold up this patch for that at this point.

From bdded82050841d3b71308ce82110efd21d99ea53 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Sun, 13 Mar 2022 07:38:46 +0100
Subject: [PATCH] fixup! Add support for decoding sequences to built-in
 replication

---
 doc/src/sgml/protocol.sgml| 119 ++
 doc/src/sgml/ref/alter_publication.sgml   |   2 +-
 doc/src/sgml/ref/create_publication.sgml  |  42 +---
 src/backend/catalog/pg_publication.c  |   8 +-
 src/backend/commands/subscriptioncmds.c   |   2 +-
 src/backend/parser/gram.y |  14 ---
 src/backend/replication/logical/worker.c  |   2 +-
 src/bin/pg_dump/pg_dump.c |   6 +-
 src/test/regress/expected/publication.out |   2 +-
 src/test/regress/sql/object_address.sql   |   1 +
 src/test/regress/sql/publication.sql  |   2 +-
 src/test/subscription/t/029_sequences.pl  |  14 +--
 12 files changed, 165 insertions(+), 49 deletions(-)

diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 9178c779ba..49c05e1866 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -7055,6 +7055,125 @@ Logical Replication Message Formats
 
 
 
+
+
+Sequence
+
+
+
+
+
+
+
+Byte1('X')
+
+
+
+Identifies the message as a sequence message.
+
+
+
+
+
+Int32 (TransactionId)
+
+
+
+   Xid of the transaction (only present for streamed transactions).
+   This field is available since protocol version 2.
+
+
+
+
+
+Int8(0)
+
+
+
+Flags; currently unused.
+
+
+
+
+
+Int64 

Re: range_agg with multirange inputs

2022-03-12 Thread Chapman Flack
On 03/11/22 22:18, Paul Jungwirth wrote:
> Arg, fixed.
> 
>> In range_agg_transfn, you've changed the message in the "must be called
>> with a range or multirange"; that seems like another good candidate to
>> be an elog.
> 
> Agreed. Updated here.

This looks good to me and passes installcheck-world, so I'll push
the RfC button.

> Sharing a prosrc seems even less remarkable than sharing an aggfinalfn.
> You're right there are no cases for other finalfns yet, but I don't think
> there is anything special about finalfns that would make this a weirder
> thing to do there than with ordinary functions.

You sent me back to look at how many of those there are. I get 42 cases
of shared prosrc (43 now).

The chief subgroup of those looks to involve sharing between parameter
signatures where the types have identical layouts and the semantic
differences are unimportant to the function in question (comparisons
between bit or between varbit, overlaps taking timestamp or timestamptz,
etc.).

The other prominent group is range and multirange constructors, where
the C function has an obviously generic name like range_constructor2
and gets shared by a bunch of SQL declarations.

I think here we've added the first instance where the C function is
shared by SQL-declared functions accepting two different polymorphic
pseudotypes. But it's clearly simple and works, nothing objectionable
about it.

I had experimented with renaming multirange_agg_finalfn to just
range_agg_finalfn so it would just look like two overloads of one
function sharing a prosrc, and ultimately gave up because genbki.pl
couldn't resolve the OID where the name is used in pg_aggregate.dat.

That's why it surprised me to see three instances where other functions
(overlaps, isfinite, name) do use the same SQL name for different
overloads, but the explanation seems to be that nothing else at genbki
time refers to those, so genbki's unique-name limitation doesn't affect
them.

Neither here nor there for this patch, but an interesting new thing
I learned while reviewing it.

Regards,
-Chap




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2022-03-12 Thread Justin Pryzby
On Sun, Mar 13, 2022 at 09:45:35AM +0900, Michael Paquier wrote:
> On Wed, Mar 09, 2022 at 10:50:45AM -0600, Justin Pryzby wrote:
> > Rebased over 9e9858389 (Michael may want to look at the tuplestore part?).
> 
> Are you referring to the contents of 0003 here that changes the
> semantics of pg_ls_dir_files() regarding its setup call?

Yes, as it has this:

-   SetSingleFuncCall(fcinfo, SRF_SINGLE_USE_EXPECTED);
...
-   SetSingleFuncCall(fcinfo, 0);
...
+   if (flags & LS_DIR_METADATA)
+       SetSingleFuncCall(fcinfo, 0);
+   else
+       SetSingleFuncCall(fcinfo, SRF_SINGLE_USE_EXPECTED);

-- 
Justin




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2022-03-12 Thread Michael Paquier
On Wed, Mar 09, 2022 at 10:50:45AM -0600, Justin Pryzby wrote:
> Rebased over 9e9858389 (Michael may want to look at the tuplestore part?).

Are you referring to the contents of 0003 here that changes the
semantics of pg_ls_dir_files() regarding its setup call?
--
Michael


signature.asc
Description: PGP signature


Re: Allow async standbys wait for sync replication

2022-03-12 Thread Nathan Bossart
On Tue, Mar 08, 2022 at 06:01:23PM -0800, Andres Freund wrote:
> To me it's architecturally the completely wrong direction. We should move in
> the *other* direction, i.e. allow WAL to be sent to standbys before the
> primary has finished flushing it locally. Which requires similar
> infrastructure to what we're discussing here.

I think this is a good point.  After all, WALRead() has the following
comment:

 * XXX probably this should be improved to suck data directly from the
 * WAL buffers when possible.

Once you have all the infrastructure for that, holding back WAL replay on
async standbys based on synchronous replication might be relatively easy.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: EXPLAIN vs track_io_timing=on vs tests

2022-03-12 Thread Andres Freund
Hi,

On 2020-10-29 16:10:37 -0700, Andres Freund wrote:
> I run my development instances with track_io_timing=on, as I've found
> that to be really useful. Unfortunately that causes tests to fail
> whenever I forget to turn that off to run installcheck.
> 
> The diffs are caused by the additional data shown in the explain tests:
> ...
> -   "Temp Written Blocks": N+
> +   "Temp Written Blocks": N,   +
> +   "I/O Read Time": N.N,   +
> +   "I/O Write Time": N.N   +
> ...
> 
> 
> First, why is the output of these fields conditional when using a
> non-text format?  Seems we instead should output -1 or null. The latter
> seems a bit clearer, but is a bit json specific. I guess we could add a
> ExplainPropertyNull() or such?

Not addressed so far.


> Second, as long as it is conditional, would anybody mind if I put a
> track_io_timing=false into explain.sql? We don't try to make the tests
> pass with every possible option set, but track_io_timing seems common
> enough?

Done that now.

Greetings,

Andres Freund




Re: create_index test fails when synchronous_commit = off @ master

2022-03-12 Thread Andres Freund
On 2022-03-11 10:38:22 +0300, Aleksander Alekseev wrote:
> > Here is an updated patch that includes the commit message.
> 
> Since this is a trivial change and no one seems to object to it so far, I'm
> going to re-check the patch against the recent `master` and change its
> status to "Ready for Committer" somewhere next week.

Pushed it now.

- Andres




Re: Optimize external TOAST storage

2022-03-12 Thread Nathan Bossart
I think this is an interesting idea.  My first thought is that it would be
nice if we didn't have to first compress the data to see whether we wanted
to store it compressed or not, but I don't think there is much we can do
about that.  In any case, we aren't adding much work in the write path
compared to the potential savings in the read path, so that is probably
okay.

Do you think it is worth making this configurable?  I don't think it is
outside the realm of possibility for some users to care more about disk
space than read performance.  And maybe the threshold for this optimization
could differ based on the workload.

 extern void toast_tuple_externalize(ToastTupleContext *ttc, int attribute,
 int options);
+extern void toast_tuple_opt_externalize(ToastTupleContext *ttc, int attribute,
+int options, Datum old_toast_value,
+ToastAttrInfo *old_toast_attr);

Could we bake this into toast_tuple_externalize() so that all existing
callers benefit from this optimization?  Is there a reason to export both
functions?  Perhaps toast_tuple_externalize() should be renamed and made
static, and then this new function could be called
toast_tuple_externalize() (which would be a wrapper around the internal
function).

+/* Sanity check: if data is not compressed then we can proceed as usual. */
+if (!VARATT_IS_COMPRESSED(DatumGetPointer(*value)))
+toast_tuple_externalize(ttc, attribute, options);

With a --enable-cassert build, this line causes assertion failures in the
call to GetMemoryChunkContext() from pfree().  'make check' is enough to
reproduce it.  Specifically, it fails the following assertion:

AssertArg(MemoryContextIsValid(context));

I didn't see an existing commitfest entry for this patch.  I'd encourage
you to add one in the July commitfest:

https://commitfest.postgresql.org/38/

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Add index scan progress to pg_stat_progress_vacuum

2022-03-12 Thread Nathan Bossart
On Sat, Mar 12, 2022 at 07:00:06AM +, Imseih (AWS), Sami wrote:
>> nitpick: Can we remove the extra spaces in the parentheses?
> 
> fixed
> 
>> What does it mean if there isn't an entry in the map?  Is this actually
>> expected, or should we ERROR instead?
> 
> I cleaned up the code here and added comments. 
> 
>> I think the number of entries should be shared between
>> VacuumWorkerProgressShmemInit() and VacuumWorkerProgressShmemSize().
>> Otherwise, we might update one and not the other.
> 
> Fixed
> 
>> I think we should elaborate a bit more in this comment.  It's difficult to
>> follow what this is doing without referencing the comment above
>> set_vacuum_worker_progress().
> 
> More comments added
> 
> I also simplified the 0002 patch as well.

These patches look pretty good to me.  Barring additional feedback, I
intend to mark this as ready-for-committer early next week.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Issue with pg_stat_subscription_stats

2022-03-12 Thread Andres Freund
Hi,

On 2022-03-12 08:28:35 +0530, Amit Kapila wrote:
> On Sat, Mar 12, 2022 at 2:14 AM Melanie Plageman
>  wrote:
> >
> > So, I noticed that pg_stat_reset_subscription_stats() wasn't working
> > properly, and, upon further investigation, I'm not sure the view
> > pg_stat_subscription_stats is being properly populated.
> >
> 
> I have tried the below scenario based on this:
> Step:1 Create some data that generates conflicts and lead to apply
> failures and then check in the view:

I think the problem is present when there was *no* conflict
previously. Because nothing populates the stats entry without an error, the
reset doesn't have anything to set the stats_reset field in, which then means
that the stats_reset field is NULL even though stats have been reset.

I'll just repeat what I've said before: making variable-numbered stats
individually resettable is a bad idea.

Greetings,

Andres Freund




Re: psql - add SHOW_ALL_RESULTS option

2022-03-12 Thread Fabien COELHO



Are you planning to send a rebased patch for this commit fest?


Argh, I did it in a reply in another thread:-( Attached v15.

So as to help move things forward, I'd suggest that we should not care 
too much about corner-case repetition of some error messages which are due to 
libpq internals, so I could remove the ugly buffer reset from the patch and 
have the repetition; if/when the issue is fixed later in libpq then the 
repetition will be removed, fine! The issue is that we just expose the 
strange behavior of libpq, which is libpq's to solve, not psql's.


See attached v16 which removes the libpq workaround.

--
Fabien.

diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index e0abe34bb6..8f7f93172a 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -50,8 +50,28 @@ BEGIN \;
 SELECT 2.0 AS "float" \;
 SELECT 'world' AS "text" \;
 COMMIT;
+ float 
+---
+   2.0
+(1 row)
+
+ text  
+---
+ world
+(1 row)
+
 -- compound with empty statements and spurious leading spacing
 \;\;   SELECT 3 + 3 \;\;\;   SELECT ' ' || ' !' \;\;   SELECT 1 + 4 \;;
+ ?column? 
+--
+6
+(1 row)
+
+ ?column? 
+--
+   !
+(1 row)
+
  ?column? 
 --
 5
@@ -61,6 +81,11 @@ COMMIT;
 SELECT 1 + 1 + 1 AS "add" \gset
 SELECT :add + 1 + 1 AS "add" \;
 SELECT :add + 1 + 1 AS "add" \gset
+ add 
+-
+   5
+(1 row)
+
 -- set operator
 SELECT 1 AS i UNION SELECT 2 ORDER BY i;
  i 
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index caabb06c53..f01adb1fd2 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -127,18 +127,11 @@ echo '\x \\ SELECT * FROM foo;' | psql
commands included in the string to divide it into multiple
transactions.  (See 
for more details about how the server handles multi-query strings.)
-   Also, psql only prints the
-   result of the last SQL command in the string.
-   This is different from the behavior when the same string is read from
-   a file or fed to psql's standard input,
-   because then psql sends
-   each SQL command separately.
   
   
-   Because of this behavior, putting more than one SQL command in a
-   single -c string often has unexpected results.
-   It's better to use repeated -c commands or feed
-   multiple commands to psql's standard input,
+   If having several commands executed in one transaction is not desired, 
+   use repeated -c commands or feed multiple commands to
+   psql's standard input,
either using echo as illustrated above, or
via a shell here-document, for example:
 
@@ -3570,10 +3563,6 @@ select 1\; select 2\; select 3;
 commands included in the string to divide it into multiple
 transactions.  (See 
 for more details about how the server handles multi-query strings.)
-psql prints only the last query result
-it receives for each request; in this example, although all
-three SELECTs are indeed executed, psql
-only prints the 3.
 
 
   
@@ -4160,6 +4149,18 @@ bar
   
 
   
+SHOW_ALL_RESULTS
+
+
+When this variable is set to off, only the last
+result of a combined query (\;) is shown instead of
+all of them.  The default is on.  The off behavior
+is for compatibility with older versions of psql.
+
+
+  
+
+   
 SHOW_CONTEXT
 
 
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index d65b9a124f..7903075975 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -34,6 +34,8 @@ static bool DescribeQuery(const char *query, double *elapsed_msec);
 static bool ExecQueryUsingCursor(const char *query, double *elapsed_msec);
 static bool command_no_begin(const char *query);
 static bool is_select_command(const char *query);
+static int SendQueryAndProcessResults(const char *query, double *pelapsed_msec,
+	bool is_watch, const printQueryOpt *opt, FILE *printQueryFout, bool *tx_ended);
 
 
 /*
@@ -354,7 +356,7 @@ CheckConnection(void)
  * Returns true for valid result, false for error state.
  */
 static bool
-AcceptResult(const PGresult *result)
+AcceptResult(const PGresult *result, bool show_error)
 {
 	bool		OK;
 
@@ -385,7 +387,7 @@ AcceptResult(const PGresult *result)
 break;
 		}
 
-	if (!OK)
+	if (!OK && show_error)
 	{
 		const char *error = PQerrorMessage(pset.db);
 
@@ -473,6 +475,18 @@ ClearOrSaveResult(PGresult *result)
 	}
 }
 
+/*
+ * Consume all results
+ */
+static void
+ClearOrSaveAllResults()
+{
+	PGresult	*result;
+
+	while ((result = PQgetResult(pset.db)) != NULL)
+		ClearOrSaveResult(result);
+}
+
 
 /*
  * Print microtiming output.  Always print raw milliseconds; if the interval
@@ -573,7 

Re: preserving db/ts/relfilenode OIDs across pg_upgrade (was Re: storing an explicit nonce)

2022-03-12 Thread Justin Pryzby
On Tue, Jan 25, 2022 at 10:19:53AM +0530, Shruthi Gowda wrote:
> On Tue, Jan 25, 2022 at 1:14 AM Robert Haas  wrote:
> > On Sat, Jan 22, 2022 at 2:20 AM Shruthi Gowda  wrote:
> > > Agree. In the latest patch, the template0 and postgres OIDs are fixed
> > > to unused manually assigned OIDs 4 and 5 respectively. These OIDs are
> > > no more listed as unused OIDs.
> >
> > Thanks. Committed with a few more cosmetic changes.
> 
> Thanks, Robert for committing this patch.

If I'm not wrong, this can be closed.
https://commitfest.postgresql.org/37/3296/




[GSOC 22] GUI representation of monitoring System Activity with the system_stats Extension in pgAdmin 4

2022-03-12 Thread Arjun Prashanth
Hello PGSQL Mailing List,
I am Arjun, a 2nd-year CSE student, and I am eager to contribute to this
community. I was interested in the GUI System Admin Dashboard project
mentioned here.

I am a React/Next.js web developer and am familiar with TypeScript.
I have created a project with React; it is available on my GitHub page
and hosted with Firebase.
I am also quite familiar with the basics of Python Flask and Jinja2
templating. I also know SQL (SQL*Plus) and NoSQL (MongoDB) for querying
databases.
I have built a couple of projects with Flask:

   1. Microblogging Application (GitHub | Heroku)
   2. Price Drop Notifier (GitHub | Heroku)


I am willing to take up a proficiency test if required.

Regarding this project, it would be helpful if someone could direct me to
resources so that I can familiarize myself with it.

Hoping to hear from you soon.
Yours Sincerely
Arjun


Re: support for MERGE

2022-03-12 Thread Justin Pryzby
On Sat, Jan 29, 2022 at 12:03:35AM -0600, Justin Pryzby wrote:
> Note that MergeWhenClause and MergeAction have the node definition in a
> different order than the header, which is a bit confusing.

The .h files still order these fields differently than the other .h files, and
then the node funcs (at least MergeAction) also have a different order than the
.h files.

> diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
> index 1617702d9d..c8e8254b16 100644
> --- a/src/include/nodes/parsenodes.h
> +++ b/src/include/nodes/parsenodes.h
> @@ -117,7 +117,7 @@ typedef struct Query
> +typedef struct MergeWhenClause
> +{
> + NodeTag type;
> + boolmatched;/* true=MATCHED, false=NOT 
> MATCHED */
> + CmdType commandType;/* INSERT/UPDATE/DELETE/DO NOTHING */
> + Node   *condition;  /* WHEN conditions (raw parser) */
> + List   *targetList; /* INSERT/UPDATE targetlist */
> + /* the following members are only useful for INSERT action */
> + List   *cols;   /* optional: names of the 
> target columns */
> + List   *values; /* VALUES to INSERT, or NULL */
> + OverridingKind override;/* OVERRIDING clause */
> +} MergeWhenClause;

> +/*
> + * WHEN [NOT] MATCHED THEN action info
> + */
> +typedef struct MergeAction
> +{
> + NodeTag type;
> + boolmatched;/* true=MATCHED, false=NOT 
> MATCHED */
> + OverridingKind override;/* OVERRIDING clause */
> + Node   *qual;   /* transformed WHEN conditions 
> */
> + CmdType commandType;/* INSERT/UPDATE/DELETE/DO NOTHING */
> + List   *targetList; /* the target list (of TargetEntry) */
> + List   *updateColnos;   /* target attribute numbers of an 
> UPDATE */
> +} MergeAction;

> diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
> index d4f8455a2b..234a701045 100644
> --- a/src/backend/nodes/copyfuncs.c
> +++ b/src/backend/nodes/copyfuncs.c
> +static MergeAction *
> +_copyMergeAction(const MergeAction *from)
> +{
> + MergeAction *newnode = makeNode(MergeAction);
> +
> + COPY_SCALAR_FIELD(matched);
> + COPY_SCALAR_FIELD(commandType);
> + COPY_SCALAR_FIELD(override);
> + COPY_NODE_FIELD(qual);
> + COPY_NODE_FIELD(targetList);
> + COPY_NODE_FIELD(updateColnos);

> +static MergeWhenClause *
> +_copyMergeWhenClause(const MergeWhenClause *from)
> +{
> + MergeWhenClause *newnode = makeNode(MergeWhenClause);
> +
> + COPY_SCALAR_FIELD(matched);
> + COPY_SCALAR_FIELD(commandType);
> + COPY_NODE_FIELD(condition);
> + COPY_NODE_FIELD(targetList);
> + COPY_NODE_FIELD(cols);
> + COPY_NODE_FIELD(values);
> + COPY_SCALAR_FIELD(override);
> + return newnode;
> +}

> diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
> index f1002afe7a..5e1ff02a55 100644
> --- a/src/backend/nodes/equalfuncs.c
> +++ b/src/backend/nodes/equalfuncs.c
> @@ -841,6 +841,20 @@ _equalOnConflictExpr(const OnConflictExpr *a, const 
> OnConflictExpr *b)
> +static bool
> +_equalMergeAction(const MergeAction *a, const MergeAction *b)
> +{
> + COMPARE_SCALAR_FIELD(matched);
> + COMPARE_SCALAR_FIELD(commandType);
> + COMPARE_SCALAR_FIELD(override);
> + COMPARE_NODE_FIELD(qual);
> + COMPARE_NODE_FIELD(targetList);
> + COMPARE_NODE_FIELD(updateColnos);

> +static bool
> +_equalMergeWhenClause(const MergeWhenClause *a, const MergeWhenClause *b)
> +{
> + COMPARE_SCALAR_FIELD(matched);
> + COMPARE_SCALAR_FIELD(commandType);
> + COMPARE_NODE_FIELD(condition);
> + COMPARE_NODE_FIELD(targetList);
> + COMPARE_NODE_FIELD(cols);
> + COMPARE_NODE_FIELD(values);
> + COMPARE_SCALAR_FIELD(override);
> +
> + return true;
> +}

> diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
> index 6bdad462c7..7549b27b39 100644
> --- a/src/backend/nodes/outfuncs.c
> +++ b/src/backend/nodes/outfuncs.c
> @@ -429,6 +429,21 @@ _outModifyTable(StringInfo str, const ModifyTable *node)
> +static void
> +_outMergeWhenClause(StringInfo str, const MergeWhenClause *node)
> +{
> + WRITE_NODE_TYPE("MERGEWHENCLAUSE");
> +
> + WRITE_BOOL_FIELD(matched);
> + WRITE_ENUM_FIELD(commandType, CmdType);
> + WRITE_NODE_FIELD(condition);
> + WRITE_NODE_FIELD(targetList);
> + WRITE_NODE_FIELD(cols);
> + WRITE_NODE_FIELD(values);
> + WRITE_ENUM_FIELD(override, OverridingKind);

> +static void
> +_outMergeAction(StringInfo str, const MergeAction *node)
> +{
> + WRITE_NODE_TYPE("MERGEACTION");
> +
> + WRITE_BOOL_FIELD(matched);
> + WRITE_ENUM_FIELD(commandType, CmdType);
> + WRITE_ENUM_FIELD(override, OverridingKind);
> + WRITE_NODE_FIELD(qual);
> + WRITE_NODE_FIELD(targetList);
> + WRITE_NODE_FIELD(updateColnos);
> +}

> diff 

Re: support for MERGE

2022-03-12 Thread Alvaro Herrera
FYI I intend to get the ModifyTable split patch (0001+0003) pushed
hopefully on Tuesday or Wednesday next week, unless something really
ugly is found on it.

As for MERGE proper, I'm aiming at getting that one pushed on the week
starting March 21st, though of course I'll spend some more time on it
and will probably post further versions of it before that.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
Thou shalt check the array bounds of all strings (indeed, all arrays), for
surely where thou typest "foo" someone someday shall type
"supercalifragilisticexpialidocious" (5th Commandment for C programmers)




Re: Add parameter jit_warn_above_fraction

2022-03-12 Thread Justin Pryzby
On Mon, Mar 07, 2022 at 04:02:16PM +0100, Magnus Hagander wrote:
> On Mon, Mar 7, 2022 at 2:09 PM Justin Pryzby  wrote:
> > On Mon, Mar 07, 2022 at 01:10:32PM +0100, Magnus Hagander wrote:
> > > On Fri, Feb 25, 2022 at 5:23 PM Justin Pryzby  
> > > wrote:
> > > >
> > > > I think it should be a NOTICE (or less?)
> > >
> > > Hmm. I'm not so sure.
> > >
> > > Other similar parameters often use LOG, but the downside of that is
> > > that it won't be sent to the client.
> > >
> > > The problem with using NOTICE is that it won't go to the log by
> > > default. It needs to be at least warning to do that.
> >
> > Anyone who uses this parameter is aleady going to be changing GUCs, so it
> > doesn't need to work by default.  The people who are likely to enable this
> > already monitor their logs and have probably customized their logging
> > configuration.
> 
> Sure, but if you set log_min_messages to NOTICE you are likely going
> to flood your logs beyond usefulness. And the whole idea of this
> parameter is you can keep it on during "normal times" to catch
> outliers.

I do set log_min_messages - not to NOTICE, but to INFO ;)

Would NOTICEs really flood most people's logs?  From what?  There aren't
that many, and most seem to be for DDL and utility commands.  We do, uh, more
of that than most people would say is reasonable.

I see a couple hundred of these per day:
| PostGIS: Unable to compute statistics for...
And during deployment, a few hundred more for IF NOT EXIST commands.
That's it.

I still think this GUC should not be WARNING.  If it's a warning, then *this*
will cause previously-quiet logs to be flooded - the opposite problem.  Which
is not desirable for a new guc.

https://www.postgresql.org/docs/current/runtime-config-logging.html

Severity  Usage                                                    syslog  eventlog
INFO      Provides information implicitly requested by the user,  INFO    INFORMATION
          e.g., output from VACUUM VERBOSE.
NOTICE    Provides information that might be helpful to users,    NOTICE  INFORMATION
          e.g., notice of truncation of long identifiers.
WARNING   Provides warnings of likely problems, e.g., COMMIT      NOTICE  WARNING
          outside a transaction block.

> > I don't understand - why doesn't it combine trivially with
> > log_min_duration_statement?  Are you saying that the default / pre-existing 
> > min
> > duration may not log all of the intended queries ?  I think that just means 
> > the
> > configuration needs to be changed.  The GUC needs to allow/help finding 
> > these
> > JIT issues, but going to require an admin's interaction in any case.  
> > Avoiding
> > false positives may be important for it to be useful at all.
> >
> > Hmm .. should it also combine with min_sample_rate ?  Maybe that addresses 
> > your
> > concern.
> 
> For one, what if I don't want to log any queries unless this is the
> case? I leave log_min_duration_statement at -1...

Then you will conclude that our logging is inadequate for your case.
You need to filter the lines which don't interest you.

> Or what if I want to log say only queries taking more than 60 seconds.
> Now I will miss all queries taking 30 seconds where 99.9% of the time
> is spent on JIT which I definitely *would* want.

If you're unwilling to deal with log entries which are uninteresting, then
clearly you'll need a separate GUC just for log_min_duration_jit_above_fraction
(and maybe another one for jit_above_duration_log_level).  Or implement generic
log filtering based on file/severity/message.

As I see it: most people won't set this GUC at all.  Those who do might enable
it with a high threshold value of log_min_duration_statement (probably a
pre-existing, configured value) to see what there is to see - the worst issues.

A small fraction of people will enable it with a low threshold value for
log_min_duration_statement - maybe only temporarily or per-session.  They'll be
willing to sift through the logs to look for interesting things, like queries
that take 10ms, 75% of which is in JIT, but run hundreds of times per second.

This feature intends to make it easier to identify queries affected by this
problem, but it doesn't need to be possible to do that without logging anything
else.

-- 
Justin




Re: [HACKERS] WIP aPatch: Pgbench Serialization and deadlock errors

2022-03-12 Thread Fabien COELHO



Hello Yugo-san,

About Pgbench error handling v16:

This patch set needs a minor rebase because of 506035b0. Otherwise, patch 
compiles, global and local "make check" are ok. Doc generation is ok.


This patch is in good shape, the code and comments are clear.
Some minor remarks below, including typos and a few small suggestions.


## About v16-1

This refactoring patch adds a struct for managing pgbench variables, instead of
mixing fields into the client state (CState) struct.

Patch compiles, global and local "make check" are both ok.

Although this patch is not necessary to add the feature, I'm fine with it as
improves pgbench source code readability.


## About v16-2

This last patch adds handling of serialization and deadlock errors to pgbench
transactions. This feature is desirable because it enlarge performance testing
options, and makes pgbench behave more like a database client application.

Possible future extensions enabled by this patch include handling disconnection
errors by trying to reconnect, for instance.

The documentation is clear and well written, at least for my non-native speaker
eyes and ears.

English: "he will be aborted" -> "it will be aborted".

I'm fine with renaming --report-latencies to --report-per-command as the later
is clearer about what the options does.

I'm still not sure I like the "failure detailed" option; ISTM that the report
could always be detailed. That would remove some complexity, and I do not think
that people executing a bench with error handling would mind having the details.
No big deal.

printVerboseErrorMessages: I'd make the buffer static and initialized only once
so that there is no significant malloc/free cycle involved when calling the 
function.
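A minimal sketch of that init-once pattern (hypothetical code with an invented
function name, not pgbench's actual implementation, which would use a
PQExpBuffer):

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Format an error message into a buffer that is allocated once and
 * reused, instead of paying a malloc/free cycle per call.  The buffer
 * only grows when a larger message requires it.
 */
static const char *
verbose_error_message(const char *sql, const char *cause)
{
	static char *buf = NULL;
	static size_t buflen = 0;
	size_t		need = strlen(sql) + strlen(cause) + 64;

	if (need > buflen)
	{
		buf = realloc(buf, need);	/* grows only when necessary */
		buflen = need;
	}
	snprintf(buf, buflen, "client aborted in command \"%s\": %s", sql, cause);
	return buf;
}
```

Subsequent calls with messages no longer than the current buffer reuse it
with no allocation at all.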

advanceConnectionState: I'd really prefer not to add new variables (res, status)
in the loop scope, and only declare them when actually needed in the state
branches, so as to avoid any unwanted interaction between states.

typo: "fullowing" -> "following"

Pipeline cleaning: the advance function is already so long, I'd put that in a
separate function and call it.

I think that the report should not remove data when they are 0, otherwise it
makes it harder to script around it (in failures_detailed on line 6284).

The tests cover the different cases. I tried to suggest a simpler approach
in a previous round, but it seems not so simple to do. They could be
simplified later, if possible.


--
Fabien.




Re: pg_stat_statements and "IN" conditions

2022-03-12 Thread Dmitry Dolgov
> On Thu, Mar 10, 2022 at 12:11:59PM -0500, Tom Lane wrote:
> Dmitry Dolgov <9erthali...@gmail.com> writes:
> > New status: Waiting on Author
>
> > This seems incorrect, as the only feedback I've got was "this is a bad
> > idea", and no reaction on follow-up questions.
>
> I changed the status because it seems to me there is no chance of
> this being committed as-is.
>
> 1. I think an absolute prerequisite before we could even consider
> changing the query jumbler rules this much is to do the work that was
> put off when the jumbler was moved into core: that is, provide some
> honest support for multiple query-ID generation methods being used at
> the same time.  Even if you successfully make a case for
> pg_stat_statements to act this way, other consumers of query IDs
> aren't going to be happy with it.
>
> 2. You haven't made a case for it.  The original complaint was
> about different lengths of IN lists not being treated as equivalent,
> but this patch has decided to do I'm-not-even-sure-quite-what
> about treating different Params as equivalent.  Plus you're trying
> to invoke eval_const_expressions in the jumbler; that is absolutely
> Not OK, for both safety and semantic reasons.
>
> If you backed off to just treating ArrayExprs containing different
> numbers of Consts as equivalent, maybe that'd be something we could
> adopt without fixing point 1.  I don't think anything that fuzzes the
> treatment of Params can get away with that, though.

Here is the limited version of list collapsing functionality, which
doesn't utilize eval_const_expressions and ignores most of the stuff
except ArrayExprs. Any thoughts/more suggestions?
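The intended behavior -- constants past a threshold contributing nothing to the
query fingerprint, so IN-lists of different lengths that share a prefix hash
alike -- can be illustrated with a toy jumble (a hypothetical sketch; the real
jumbler walks the parse tree, and `jumble_in_list` is an invented name):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Toy fingerprint: FNV-1a hash over at most `threshold` constants of an
 * IN-list; constants beyond the threshold contribute nothing, so lists
 * of different lengths with the same prefix produce the same hash.
 */
static uint64_t
jumble_in_list(const int *consts, size_t nconsts, size_t threshold)
{
	uint64_t	hash = 1469598103934665603ULL;	/* FNV-1a offset basis */

	for (size_t i = 0; i < nconsts && i < threshold; i++)
	{
		hash ^= (uint64_t) (uint32_t) consts[i];
		hash *= 1099511628211ULL;	/* FNV-1a prime */
	}
	return hash;
}
```

With a threshold of 5, `IN (1..6)` and `IN (1..10)` collapse into one entry,
while lists that differ within the first five constants remain distinct.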
>From ce9f2ed2466d28dbbef3310383d84eba58e5791b Mon Sep 17 00:00:00 2001
From: Dmitrii Dolgov <9erthali...@gmail.com>
Date: Sat, 12 Mar 2022 14:42:02 +0100
Subject: [PATCH v6] Prevent jumbling of every element in ArrayExpr

pg_stat_statements produces multiple entries for queries like

SELECT something FROM table WHERE col IN (1, 2, 3, ...)

depending on the number of parameters, because every element of ArrayExpr is
jumbled. Make Consts contribute nothing to the jumble hash if they're part of
a series and at a position further than the specified threshold.

Reviewed-by: Zhihong Yu, Sergey Dudoladov
Tested-by: Chengxi Sun
---
 .../expected/pg_stat_statements.out   | 412 ++
 .../pg_stat_statements/pg_stat_statements.c   |  26 +-
 .../sql/pg_stat_statements.sql| 107 +
 src/backend/utils/misc/guc.c  |  13 +
 src/backend/utils/misc/queryjumble.c  | 236 +-
 src/include/utils/queryjumble.h   |  10 +-
 6 files changed, 791 insertions(+), 13 deletions(-)

diff --git a/contrib/pg_stat_statements/expected/pg_stat_statements.out b/contrib/pg_stat_statements/expected/pg_stat_statements.out
index e0abe34bb6..e05a6f565a 100644
--- a/contrib/pg_stat_statements/expected/pg_stat_statements.out
+++ b/contrib/pg_stat_statements/expected/pg_stat_statements.out
@@ -1077,4 +1077,416 @@ SELECT COUNT(*) FROM pg_stat_statements WHERE query LIKE '%SELECT GROUPING%';
  2
 (1 row)
 
+--
+-- Consts merging
+--
+CREATE TABLE test_merge (id int, data int);
+-- IN queries
+-- No merging
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--
+ 
+(1 row)
+
+SELECT * FROM test_merge WHERE id IN (1, 2, 3, 4, 5, 6);
+ id | data 
++--
+(0 rows)
+
+SELECT * FROM test_merge WHERE id IN (1, 2, 3, 4, 5, 6, 7);
+ id | data 
++--
+(0 rows)
+
+SELECT * FROM test_merge WHERE id IN (1, 2, 3, 4, 5, 6, 7, 8);
+ id | data 
++--
+(0 rows)
+
+SELECT * FROM test_merge WHERE id IN (1, 2, 3, 4, 5, 6, 7, 8, 9);
+ id | data 
++--
+(0 rows)
+
+SELECT * FROM test_merge WHERE id IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
+ id | data 
++--
+(0 rows)
+
+SELECT query, calls FROM pg_stat_statements ORDER BY query COLLATE "C";
+ query  | calls 
++---
+ SELECT * FROM test_merge WHERE id IN ($1, $2, $3, $4, $5, $6)  | 1
+ SELECT * FROM test_merge WHERE id IN ($1, $2, $3, $4, $5, $6, $7)  | 1
+ SELECT * FROM test_merge WHERE id IN ($1, $2, $3, $4, $5, $6, $7, $8)  | 1
+ SELECT * FROM test_merge WHERE id IN ($1, $2, $3, $4, $5, $6, $7, $8, $9)  | 1
+ SELECT * FROM test_merge WHERE id IN ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10) | 1
+ SELECT pg_stat_statements_reset()  | 1
+ SELECT query, calls FROM pg_stat_statements ORDER BY query COLLATE "C" | 0
+(7 rows)
+
+-- Normal
+SET const_merge_threshold = 5;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset 
+--
+ 
+(1 row)
+
+SELECT * FROM test_merge WHERE id IN (1, 2, 3);
+ id | data 
++--
+(0 rows)
+
+SELECT query, calls FROM 

Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

2022-03-12 Thread Bharath Rupireddy
On Sat, Mar 12, 2022 at 2:09 AM Robert Haas  wrote:
>
> On Thu, Mar 10, 2022 at 9:38 PM Kyotaro Horiguchi
>  wrote:
> > It seems to me too rigorous that pg_get_wal_records_info/stats()
> > reject future LSNs as end-LSN and I think WARNING or INFO and stop at
> > the real end-of-WAL is more kind to users.  I think the same with the
> > restriction that start and end LSN are required to be different.
>
> In his review just yesterday, Jeff suggested this: "Don't issue
> WARNINGs or other messages for ordinary situations, like when
> pg_get_wal_records_info() hits the end of WAL." I think he's entirely
> right, and I don't think any patch that does otherwise should get
> committed. It is worth remembering that the results of queries are
> often examined by something other than a human being sitting at a psql
> terminal. Any tool that uses this is going to want to understand what
> happened from the result set, not by parsing strings that may show up
> inside warning messages.
>
> I think that the right answer here is to have a function that returns
> one row per record parsed, and each row should also include the start
> and end LSN of the record. If for some reason the WAL records return
> start after the specified start LSN (e.g. because we skip over a page
> header) or end before the specified end LSN (e.g. because we reach
> end-of-WAL) the user can figure it out from looking at the LSNs in the
> output rows and comparing them to the LSNs provided as input.
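The LSN-based detection described above amounts to two comparisons on the
consumer side (a hypothetical sketch; the helper names are invented, and LSNs
are shown as plain 64-bit integers):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Given the range the caller asked for and the LSNs actually returned,
 * work out what happened -- no parsing of warning text needed.
 */

/* The scan started past the requested start, e.g. over a page header. */
static bool
skipped_ahead(uint64_t requested_start, uint64_t first_record_start)
{
	return first_record_start > requested_start;
}

/* The scan stopped before the requested end, e.g. at end-of-WAL. */
static bool
stopped_at_end_of_wal(uint64_t requested_end, uint64_t last_record_end)
{
	return last_record_end < requested_end;
}
```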

Thanks Robert. I've removed the WARNING part and added end_lsn as suggested.

Thanks Kyotaro-san, Ashutosh and Jeff for your review. I tried to
address your review comments, if not all, but many.

Here's v9 patch-set please review it further.

Regards,
Bharath Rupireddy.


v9-0001-pg_walinspect.patch
Description: Binary data


v9-0001-pg_walinspect-tests.patch
Description: Binary data


v9-0001-pg_walinspect-docs.patch
Description: Binary data


Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

2022-03-12 Thread Bharath Rupireddy
On Fri, Mar 11, 2022 at 7:53 PM Ashutosh Sharma  wrote:
>
> Some comments on pg_walinspect-docc.patch this time:
>
> +   
> +
> + pg_get_wal_record_info(in_lsn pg_lsn, lsn OUT pg_lsn,
> prev_lsn OUT pg_lsn, xid OUT xid, resource_manager OUT text, length
> OUT int4, total_length OUT int4, description OUT text, block_ref OUT
> text, data OUT bytea, data_len OUT int4)
> +
>
> You may shorten this by mentioning just the function input parameters
> and specify "returns record" like shown below. So no need to specify
> all the OUT params.
>
> pg_get_wal_record_info(in_lsn pg_lsn) returns record.
>
> Please check the documentation for other functions for reference.
>
> ==
>
> +
> + pg_get_wal_records_info(start_lsn pg_lsn, end_lsn
> pg_lsn DEFAULT NULL, lsn OUT pg_lsn, prev_lsn OUT pg_lsn, xid OUT xid,
> resource_manager OUT text, length OUT int4, total_length OUT int4,
> description OUT text, block_ref OUT text, data OUT bytea, data_len OUT
> int4)
> +
>
> Same comment applies here as well. In the return type you can just
> mention - "returns setof record" like shown below:
>
> pg_get_wal_records_info(start_lsn pg_lsn, end_lsn pg_lsn) returns setof 
> records.
>
> You may also check for such optimizations at other places. I might
> have missed some.

I used the way verify_heapam shows the columns, as it looks good IMO,
and we can't show sample outputs for all of the functions in the
documentation.

> ==
>
> +
> +postgres=# select prev_lsn, xid, resource_manager, length,
> total_length, block_ref from pg_get_wal_records_info('0/158A7F0',
> '0/1591400');
> + prev_lsn  | xid | resource_manager | length | total_length |
>block_ref
> +---+-+--++--+--
> + 0/158A7B8 | 735 | Heap | 54 | 7838 | blkref
> #0: rel 1663/5/2619 blk 18 (FPW); hole: offset: 88, length: 408
> + 0/158A7F0 | 735 | Btree| 53 | 8133 | blkref
> #0: rel 1663/5/2696 blk 1 (FPW); hole: offset: 1632, length: 112
> + 0/158C6A8 | 735 | Heap | 53 |  873 | blkref
> #0: rel 1663/5/1259 blk 0 (FPW); hole: offset: 212, length: 7372
>
> Instead of specifying column names in the targetlist I think it's
> better to use "*" so that it will display all the output columns. Also
> you may shorten the gap between start and end lsn to reduce the output
> size.

All columns give a huge output, especially because of the data and
description columns, hence I'm not showing them in the sample output.

> ==
>
> Any reason for not specifying author name in the .sgml file. Do you
> want me to add my name to the author? :)

Haha. Thanks. I will add it in the v9 patch set, which I will post in a while.

Regards,
Bharath Rupireddy.




Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

2022-03-12 Thread Dilip Kumar
On Fri, Mar 11, 2022 at 11:51 PM Robert Haas  wrote:
>
> On Fri, Mar 11, 2022 at 1:10 PM Robert Haas  wrote:
> > I don't think you've adequately considered temporary relations here.
> > It seems to be that ReadBufferWithoutRelcache() could not be safe on a
> > temprel, because we'd need a BackendId to access the underlying
> > storage. So I think that ReadBufferWithoutRelcache can only accept
> > unlogged or permanent, and maybe the argument ought to be a Boolean
> > instead of a relpersistence value. I thought that this problem might
> > be only cosmetic, but I checked the code that actually does the copy,
> > and there's no filter there on relpersistence either. And I think
> > there should be.

Yeah, right.  For now, this API can only support unlogged or permanent
relations.  I will fix this.

> I hit "send" too quickly there:
>
> rhaas=# create database fudge;
> CREATE DATABASE
> rhaas=# \c fudge
> You are now connected to database "fudge" as user "rhaas".
> fudge=# create temp table q ();
> CREATE TABLE
> fudge=# ^Z
> [2]+  Stopped psql
> [rhaas Downloads]$ pg_ctl stop -mi
> waiting for server to shut down done
> server stopped
> [rhaas Downloads]$ %%
> psql
> \c
> You are now connected to database "fudge" as user "rhaas".
> fudge=# select * from pg_class where relpersistence='t';
>   oid  | relname | relnamespace | reltype | reloftype | relowner |
> relam | relfilenode | reltablespace | relpages | reltuples |
> relallvisible | reltoastrelid | relhasindex | relisshared |
> relpersistence | relkind | relnatts | relchecks | relhasrules |
> relhastriggers | relhassubclass | relrowsecurity | relforcerowsecurity
> | relispopulated | relreplident | relispartition | relrewrite |
> relfrozenxid | relminmxid | relacl | reloptions | relpartbound
> ---+-+--+-+---+--+---+-+---+--+---+---+---+-+-++-+--+---+-++++-++--+++--++++--
>  16388 | q   |16386 |   16390 | 0 |   10 |
> 2 |   16388 | 0 |0 |-1 | 0
> | 0 | f   | f   | t  | r
> |0 | 0 | f   | f  | f
> | f  | f   | t  | d
> | f  |  0 |  721 |  1 ||
>  |
> (1 row)
>
> fudge=# \c rhaas
> You are now connected to database "rhaas" as user "rhaas".
> rhaas=# alter database fudge is_template true;
> ALTER DATABASE
> rhaas=# create database cookies template fudge;
> CREATE DATABASE
> rhaas=# \c cookies
> You are now connected to database "cookies" as user "rhaas".
> cookies=# select count(*) from pg_class where relpersistence='t';
>  count
> ---
>  1
> (1 row)

I think this is not the right example to show the problem, no?  Because
you are showing the pg_class entry, and pg_class is not a temp relation,
so even if we avoid copying the temp relations, pg_class will be copied,
right?  So you will still see this uncleaned temp relation entry.  I
could reproduce exactly the same issue without my patch as well.

So I agree we need to avoid copying temp relations but with that the
above behavior will not change.  Am I missing something?
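The filter both reviewers agree on amounts to skipping temp relations when
deciding what to copy; a hypothetical sketch of that check (the function name
is invented, though the relpersistence codes are PostgreSQL's real catalog
values):

```c
#include <stdbool.h>

/* relpersistence codes as stored in pg_class */
#define RELPERSISTENCE_PERMANENT	'p'
#define RELPERSISTENCE_UNLOGGED		'u'
#define RELPERSISTENCE_TEMP			't'

/*
 * Decide whether a source database relation's storage should be copied
 * into the new database.  Temp relations belong to a single backend and
 * have no meaningful storage for anyone else, so they are skipped;
 * their leftover pg_class rows would still be copied with the catalog.
 */
static bool
should_copy_relation(char relpersistence)
{
	switch (relpersistence)
	{
		case RELPERSISTENCE_PERMANENT:
		case RELPERSISTENCE_UNLOGGED:
			return true;
		case RELPERSISTENCE_TEMP:
		default:
			return false;
	}
}
```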

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2022-03-12 Thread Fabien COELHO



Hello Justin,

Review about v34, up from v32 which I reviewed in January. v34 is an 
updated version of v32, without the part about lstat at the end of the 
series.


All 7 patches apply cleanly.

# part 01

One liner doc improvement about the directory flag.

OK.

# part 02

Add tests for various options on pg_ls_dir, and for pg_stat_file, which were not
exercised before.  "make check" is ok.

OK.

# part 03

This patch adds a new pg_ls_dir_metadata.  Internally, this is an extension of
pg_ls_dir_files function which is used by other pg_ls functions.  Doc ok.

New tests are added which check that the result columns are configured as
required, including error cases.

"make check" is ok.

OK.

# part 04

Add a new "isdir" column to "pg_ls_tmpdir" output.  This is a small behavioral
change.

I'm ok with that; however, I must say that I'm still unsure why we would
not jump directly to a "type" char column.  What is wrong with outputting
'd' or '-' instead of true or false?  This way, the interface would not
need to change if "lstat" is used later.  ISTM that the interface issue
should be somehow independent of the implementation issue, and we should
choose the right/best interface directly.


Independently, the documentation could be clearer about what happens to
"isdir" when the file is a link, and could note that the behavior may
change in the future.


About homogeneity, I note that some people may be against adding "isdir"
to other ls functions. I must say that I cannot see a good argument not to
do it, and that I hate dealing with systems which are not homogeneous,
because it creates surprises and some loss of time.


"make check" ok.

OK.

# part 05

Add isdir to other ls functions. Doc is updated.

Same as above, I'd prefer a char instead of a bool, as it is more extendable and
future-proof.

OK.

# part 06

This part extends and adds a test for pg_ls_logdir.
"make check" is ok.

OK.

# part 07

This part extends pg_stat_file with more date information.

"make check" is ok.

OK.

# doc

Overall doc generation is OK.

--
Fabien.