Re: Identify missing publications from publisher while create/alter subscription.

2022-03-28 Thread Amit Kapila
On Sat, Mar 26, 2022 at 7:53 PM vignesh C  wrote:
>
> The patch was not applying on HEAD, attached patch which is rebased on
> top of HEAD.
>

IIUC, this patch provides an option that allows us to give an error if,
while creating/altering a subscription, the user gives non-existent
publications. I am not sure how useful it is to add such behavior via
an option, especially when we know that it can occur in other ways,
e.g. after creating the subscription, users can independently drop the
publication from the publisher. I think it could be useful to provide
additional information here, but it would be better if we can follow
Euler's suggestion [1] in the thread, where he suggested issuing a
WARNING if the publications don't exist and documenting that the
subscription catalog can have non-existent publications.
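Something along these lines (a hypothetical session illustrating
Euler's suggested WARNING behavior, not the patch's option):

CREATE SUBSCRIPTION sub1
    CONNECTION 'host=publisher dbname=postgres'
    PUBLICATION pub_existing, pub_missing;
WARNING:  publication "pub_missing" does not exist on the publisher
CREATE SUBSCRIPTION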

I think we should avoid adding new options unless they are really
required and useful.

[1] - 
https://www.postgresql.org/message-id/a2f2fba6-40dd-44cc-b40e-58196bb77f1c%40www.fastmail.com

-- 
With Regards,
Amit Kapila.




Re: Add psql command to list constraints

2022-03-28 Thread Tatsuro Yamada

Hi Dag,


The patch adds the command "\dco" to list constraints in psql. This
seems useful to me.


Thank you!

 

The patch applies cleanly to HEAD, although some hunks have rather large
offsets.

As far as I can tell, the "\dco" command works as documented.

I have however found the following issues with the patch:

* A TAB character has been added to doc/src/sgml/ref/psql-ref.sgml -
   this should be replaced with spaces.


Fixed.



* The call to listConstraints at line 794 of src/bin/psql/command.c refers
   to &cmd[2]; this should rather be &cmd[3].

* The patch kills the "\dc" command in src/bin/psql/command.c
   This can be fixed by adding the following at line 800:
   else
       success = listConversions(pattern, show_verbose, show_system);



Oh, you are right! Fixed.

 

Another comment is that the "\dco" command outputs quite a lot of
information, which only fits in a wide terminal window. Would it be an
idea to only display the columns "Schema" and "Name" by default, and
use "+" to specify inclusion of the columns "Definition" and "Table"?
 


I fixed the output columns as you proposed.
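For example (hypothetical invocations; the output shape is as
discussed above):

-- Schema and Name only:
testdb=> \dco
-- adds the Table and Definition columns:
testdb=> \dco+
-- foreign-key constraints only, matching a pattern:
testdb=> \dco f orders*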

The current status of this patch is:

  - Addressed Dag's comments
  - Not implemented yet:
- Tab completion
- Regression test
- NOT NULL constraint, and so on (based on pg_attribute)

Please find attached the new patch.


Thanks,
Tatsuro Yamada




diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index caabb06..125ae3d 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1388,6 +1388,26 @@ testdb=
 
 
   
+\dco[cfptuxS+] [ pattern ]
+
+
+Lists constraints.
+If pattern
+is specified, only entries whose name matches the pattern are listed.
+The modifiers c (check), f (foreign key),
+p (primary key), t (trigger),
+u (unique), x (exclusion) can be
+appended to the command, filtering the kind of constraints to list.
+By default, only user-created constraints are shown; supply the
+S modifier to include system objects.
+If + is appended to the command name, each object
+is listed with its associated description.
+
+
+  
+
+
+  
 \dC[+] [ pattern ]
 
 
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 079f4a1..05ae25e 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -780,7 +780,26 @@ exec_command_d(PsqlScanState scan_state, bool active_branch, const char *cmd)
 				success = describeTablespaces(pattern, show_verbose);
 				break;
 			case 'c':
-				success = listConversions(pattern, show_verbose, show_system);
+				if (strncmp(cmd, "dco", 3) == 0)	/* Constraint */
+					switch (cmd[3])
+					{
+						case '\0':
+						case '+':
+						case 'S':
+						case 'c':
+						case 'f':
+						case 'p':
+						case 't':
+						case 'u':
+						case 'x':
+							success = listConstraints(&cmd[3], pattern, show_verbose, show_system);
+							break;
+						default:
+							status = PSQL_CMD_UNKNOWN;
+							break;
+					}
+				else
+					success = listConversions(pattern, show_verbose, show_system);
 				break;
 			case 'C':
 				success = listCasts(pattern, show_verbose);
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 4dddf08..7acd25a 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -19,6 +19,7 @@
 #include "catalog/pg_attribute_d.h"
 #include "catalog/pg_cast_d.h"
 #include "catalog/pg_class_d.h"
+#include "catalog/pg_constraint_d.h"
 #include "catalog/pg_default_acl_d.h"
 #include "common.h"
 #include "common/logging.h"
@@ -4599,6 +4600,109 @@ listExtendedStats(const char *pattern)
 }
 
 /*
+ * \dco
+ *
+ * Describes constraints
+ *
+ * As with \d, you can specify the kinds of constraints you want:
+ *
+ * c for check
+ * f for foreign key
+ * p for primary key
+ * t for trigger
+ * u for unique
+ * x for exclusion
+ *
+ * 

Re: Skipping logical replication transactions on subscriber side

2022-03-28 Thread Amit Kapila
On Mon, Mar 21, 2022 at 5:51 PM Euler Taveira  wrote:
>
> On Mon, Mar 21, 2022, at 12:25 AM, Amit Kapila wrote:
>
> I have fixed all the above comments as per your suggestion in the
> attached. Do let me know if something is missed?
>
> Looks good to me.
>

This patch is committed
(https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=208c5d65bbd60e33e272964578cb74182ac726a8).
Today, I have marked the corresponding entry in CF as committed.

-- 
With Regards,
Amit Kapila.




Re: Logical replication timeout problem

2022-03-28 Thread Masahiko Sawada
On Fri, Mar 25, 2022 at 5:33 PM Amit Kapila  wrote:
>
> On Fri, Mar 25, 2022 at 11:49 AM Masahiko Sawada  
> wrote:
> >
> > On Fri, Mar 25, 2022 at 2:23 PM wangw.f...@fujitsu.com
> >  wrote:
> >
> > Since commit 75b1521 added decoding of sequence to logical
> > replication, the patch needs to have pgoutput_sequence() call
> > update_progress().
> >
>
> Yeah, I also think this needs to be addressed. But apart from this, I
> want to know your and other's opinion on the following two points:
> a. Both this and the patch discussed in the nearby thread [1] add an
> additional parameter to
> WalSndUpdateProgress/OutputPluginUpdateProgress and it seems to me
> that both are required. The additional parameter 'last_write' added by
> this patch indicates: "If the last write is skipped then try (if we
> are close to wal_sender_timeout) to send a keepalive message to the
> receiver to avoid timeouts.". This means it can be used after any
> 'write' message. OTOH, the parameter 'skipped_xact' added by another
> patch [1] indicates if we have skipped sending anything for a
> transaction then sendkeepalive for synchronous replication to avoid
> any delays in such a transaction. Does this sound reasonable or can
> you think of a better way to deal with it?

These current approaches look good to me.
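(For reference, with both proposals applied the callback would end up
looking roughly like this -- a sketch only; 'skipped_xact' comes from
[1] and 'last_write' from this thread, and the committed signature may
well differ:)

static void
WalSndUpdateProgress(LogicalDecodingContext *ctx, XLogRecPtr lsn,
                     TransactionId xid, bool skipped_xact, bool last_write);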

> b. Do we want to backpatch the patch in this thread? I am reluctant to
> backpatch because it changes the exposed API which can have an impact
> and second there exists a workaround (user can increase
> wal_sender_timeout/wal_receiver_timeout).

Yeah, we should avoid API changes between minor versions. I feel it's
better to fix this for back-branches too, but we probably need a
different fix for them. The issue reported on this thread seems quite
confusing; it looks like a network problem but is not one. Also, the
user who faced this issue has to increase wal_sender_timeout due to
the decoded data size, which also delays the detection of real network
problems. That seems like an unrelated trade-off.

Regards,
-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Add pg_freespacemap extension sql test

2022-03-28 Thread Michael Paquier
On Mon, Mar 28, 2022 at 12:12:48PM +0900, Michael Paquier wrote:
> It seems to me here that the snapshot held by autovacuum during the
> scan of pg_database to find the relations to process is enough to
> prevent the FSM truncation, as the tuples cleaned up by the DELETE
> query still need to be visible.  One simple way to keep this test
> would be a custom configuration file with autovacuum disabled and
> NO_INSTALLCHECK.

Well, done this way.  We already do that in other tests that rely on a
FSM truncation to happen, like 008_fsm_truncation.pl.
--
Michael




Re: Add LZ4 compression in pg_dump

2022-03-28 Thread Michael Paquier
On Mon, Mar 28, 2022 at 08:36:15AM -0400, Robert Haas wrote:
> Well, I think the first thing we should do is get rid of enum
> WalCompressionMethod and use enum WalCompression instead. They've got
> the same elements and very similar names, but the WalCompressionMethod
> ones just have names like COMPRESSION_NONE, which is too generic,
> whereas WalCompressionMethod uses WAL_COMPRESSION_NONE, which is
> better. Then I think we should also rename the COMPR_ALG_* constants
> in pg_dump.h to names like DUMP_COMPRESSION_*. Once we do that we've
> got rid of all the unprefixed things that purport to be a list of
> compression algorithms.

Yes, having a centralized enum for the compression method would make
sense, along with the routines to parse and get the compression method
names.  At least that would be one step towards more unity in
src/common/.

> Then, if people are willing to adopt the syntax that the
> backup_compression.c/h stuff supports as a project standard (+1 from
> me) we can go the other way and rename that stuff to be more generic,
> taking backup out of the name.

I am not sure about the specification part, which is only used by base
backups and has no client-server requirements, so option values would
still require their own grammar.
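(For context, the specification grammar in question accepts strings of
the following shape on the pg_basebackup command line; the exact
option spellings are those of the in-progress patches and may change:)

pg_basebackup -D /path/to/backup --compress=server-zstd:level=5,workers=3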
--
Michael




Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-03-28 Thread Kyotaro Horiguchi
At Mon, 28 Mar 2022 12:17:50 -0400, Robert Haas  wrote 
in 
> On Mon, Mar 21, 2022 at 3:02 PM Alvaro Herrera  
> wrote:
> > > 2. Why not instead change the code so that the operation can succeed,
> > > by creating the prerequisite parent directories? Do we not have enough
> > > information for that? I'm not saying that we definitely should do it
> > > that way rather than this way, but I think we do take that approach in
> > > some cases.
> >
> > It seems we can choose freely between these two implementations -- I
> > mean I don't see any upsides or downsides to either one.
> 
> What got committed here feels inconsistent to me. Suppose we have a
> checkpoint, and then a series of operations that touch a tablespace,
> and then a drop database and drop tablespace. If the first operation
> happens to be CREATE DATABASE, then this patch is going to fix it by
> skipping the operation. However, if the first operation happens to be
> almost anything else, the way it's going to reference the dropped
> tablespace is via a block reference in a WAL record of a wide variety
> of types. That's going to result in a call to
> XLogReadBufferForRedoExtended() which will call
> XLogReadBufferExtended() which will do smgrcreate(smgr, forknum, true)
> which will in turn call TablespaceCreateDbspace() to fill in all the
> missing directories.

Right. I thought that recovery avoids that, but that's wrong.  This
behavior creates a bare (non-linked) directory within pg_tblspc.  The
directory would disappear soon if recovery proceeds to the consistency
point, though.

> I don't think that's very good. It would be reasonable to decide that
> we're never going to create the missing directories and instead just
> remember that they were not found so we can do a cross check. It's
> also reasonable to just create the directories on the fly. But doing a
> mix of those systems doesn't really seem like the right idea -
> particularly because it means that the cross-check system is probably
> not very effective at finding actual problems in the code.
> 
> Am I missing something here?

No. I agree that mixing them is not good.  On the other hand, we are
already doing that in heapam.  AFAICS it sometimes avoids creating a
new page but sometimes creates it.  But I don't mean to use that fact
to justify this patch doing the same, or to argue against it.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Support logical replication of DDLs

2022-03-28 Thread Dilip Kumar
On Thu, Mar 24, 2022 at 11:24 PM Zheng Li  wrote:
>
> Hi Dilip,
>
> Thanks for the feedback.
>
> > > > > The table creation WAL and table insert WAL are available. The tricky
> > > > > part is how do we break down this command into two parts (a normal
> > > > > CREATE TABLE followed by insertions) either from the parsetree or the
> > > > > WALs. I’ll have to dig more on this.
>
> > > I had put some more thought about this, basically, during CTAS we are
> > > generating the CreateStmt inside "create_ctas_internal" and executing
> > > it first before inserting the tuple, so can't we generate the
> > > independent sql just for creating the table, maybe using deparsing or
> > > something?
>
> Yes, deparsing might help for edge cases like this. However I found
> a simple solution for this specific case:
>
> The idea is to force skipping any direct data population (which can
> potentially cause data inconsistency on the subscriber)
> in CREATE TABLE AS and SELECT INTO commands on the subscriber by forcing the
> skipData flag in the intoClause of the parsetree after
> the logical replication worker parses the command. The data sync will
> be taken care of by the DML replication after the DDL replication
> finishes.

Okay, something like that should work, I am not sure it is the best
design though.
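(For reference, the approach amounts to something like this
hypothetical worker-side adjustment; the real implementation is in the
commit linked below:)

/*
 * Sketch only: after the apply worker parses a replicated utility
 * command, force WITH NO DATA semantics for CTAS/SELECT INTO so that
 * the initial rows arrive via normal DML replication instead.
 */
static void
force_skip_data(RawStmt *rs)
{
    if (IsA(rs->stmt, CreateTableAsStmt))
    {
        CreateTableAsStmt *ctas = (CreateTableAsStmt *) rs->stmt;

        ctas->into->skipData = true;    /* as if WITH NO DATA were given */
    }
}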

> This is implemented in the latest commit:
> https://github.com/zli236/postgres/commit/116c33451da8d44577b8d6fdb05c4b6998cd0167
>
> > > Apart from that I have one more question, basically if you are
> > > directly logging the sql query then how you are identifying under
> > > which schema you need to create that table, are you changing the sql
> > > and generating schema-qualified name?
> >
> > I was going through the patch and it seems you are logging the search
> > path as well along with the query so I think this will probably work.
>
> Yes, currently we log the search path as well as the user name. And we
> enforce the same search path and user name when applying the DDL command
> on the subscriber.

Yeah this looks fine to me.

>
> > I have got one more query while looking into the code.  In the below
> > code snippet you are logging DDL command only if it is a top level
> > query but there are no comments explaining what sort of queries we
> > don't want to log.  Suppose I am executing a DDL statement inside a PL
> > then that will not be a top level statement so is your intention to
> > block that as well or that is an unintentional side effect?
> >
> > +/*
> > + * Consider logging the DDL command if logical logging is
> > enabled and this is
> > + * a top level query.
> > + */
> > +if (XLogLogicalInfoActive() && isTopLevel)
> > +LogLogicalDDLCommand(parsetree, queryString);
>
> Good catch. The reason for having isTopLevel in the condition is
> because I haven't decided if a DDL statement inside a PL should
> be replicated from the user point of view. For example, if I execute a
> plpgsql function or a stored procedure which creates a table under the hood,
> does it always make sense to replicate the DDL without running the same
> function or stored procedure on the subscriber? It probably depends on
> the specific
> use case. Maybe we can consider making this behavior configurable by the user.

But then this could be true for DML as well, right?  If, after
replicating the function to the subscriber, we are sending the DML done
by the function, then what's the problem with DDL?  I mean, if there is
no design issue in implementing this, then I don't think there is much
point in blocking it, or even in providing configuration for this.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

2022-03-28 Thread Dilip Kumar
On Tue, Mar 29, 2022 at 12:38 AM Robert Haas  wrote:
>
> On Mon, Mar 28, 2022 at 2:18 AM Dilip Kumar  wrote:
> > > I have put the similar logic for relmap_update WAL replay as well,
> >
> > There was some mistake in the last patch, basically, for relmap update
> > also I have checked the missing tablespace directory but I should have
> > checked the missing database directory so I have fixed that.
> >
> > > Now, is it possible to get the FPI without smgr_create wal in other
> > > cases?  If it is then that problem is orthogonal to this path, but
> > > anyway I could not find any such scenario.
> >
> > I have dug further into it, tried manually removing the directory
> > before XLOG_FPI, but I noticed that during FPI also
> > XLogReadBufferExtended() takes care of creating the missing files
> > using smgrcreate(), and that in turn takes care of missing directory
> > creation, so I don't think we have any problem here.
>
> I don't understand whether XLOG_RELMAP_UPDATE should be just doing
> smgrcreate()

XLOG_RELMAP_UPDATE is for the complete database, so for which relnode
would it create the smgr?  I think you probably meant
TablespaceCreateDbspace()?

> as we would for most WAL records or whether it should be
> adopting the new system introduced by
> 49d9cfc68bf4e0d32a948fe72d5a0ef7f464944e. I wrote about this concern
> over here:

okay, thanks.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: SQL/JSON: functions

2022-03-28 Thread Andres Freund
On 2022-03-28 19:25:51 -0400, Tom Lane wrote:
> ... even more baffling: jabiru went green after the IS JSON patch.

Broken ccache contents somehow?




Re: MDAM techniques and Index Skip Scan patch

2022-03-28 Thread Peter Geoghegan
On Mon, Mar 28, 2022 at 7:07 PM Tom Lane  wrote:
> Right, that's the case I had in mind --- apologies if my terminology
> was faulty.  btree can actually handle such a case now, but what it
> fails to do is re-descend from the tree root instead of plowing
> forward in the index to find the next matching entry.

KNNGIST seems vaguely related to what we'd build for nbtree skip scan.
GiST index scans are "inherently loose", though. KNNGIST uses
a pairing heap/priority queue, which seems like the kind of thing
nbtree skip scan can avoid.

> +1.  We at least need to be sure we all are using these terms
> the same way.

Yeah, there are *endless* opportunities for confusion here.

-- 
Peter Geoghegan




Re: Add psql command to list constraints

2022-03-28 Thread Tatsuro Yamada

Hi All,


In the interests of trying to clean up the CF and keep things moving
I'll mark the patch rejected.


Thank you for managing the commitfest, and for the comments from many
hackers. I apologize for not being able to email more often due to
my busy day job.

First of all, I understand to a certain extent your opinion that we
can use \d and look at the constraints on a table-by-table basis as a
way to check the constraints.
However, suppose we want to look up, in reverse, the table a constraint
belongs to. In that case, there are two ways: (1) use "\d" to look
through all tables, or (2) execute a SELECT statement against
pg_constraint (see the sketch below). I think the proposed command is
more convenient than either of these.
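For reference, option (2) would look something like this (a sketch;
the constraint name is an example):

SELECT conrelid::regclass AS "table",
       conname,
       pg_get_constraintdef(oid) AS definition
FROM pg_constraint
WHERE conname = 'orders_customer_id_fkey';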

From another perspective, when looking at the comprehensiveness of
psql's meta-commands, constraints seem to be the only object type
without a dedicated command. Therefore, it would be better to add one
for the sake of completeness.

I think there is room for other discussions about this patch. Still,
at least there are people (including myself) who think it is useful.
I don't think there is anything wrong with this patch that would
complicate the code or cause performance degradation, so I would like to
continue developing it for those who want to use it.

However, I understand that it will not be ready in time for PG15, so
I would like to move forward with it for PG16. Therefore, it would be
better to change the status of the patch to "Waiting on Author" rather
than "Rejected".

P.S.
I'll send a new patch addressing Dag's comments in the next email.

Thanks,
Tatsuro Yamada






Re: [PATCH] Add extra statistics to explain for Nested Loop

2022-03-28 Thread Julien Rouhaud
Hi,

On Mon, Mar 28, 2022 at 03:09:12PM -0400, Greg Stark wrote:
> This patch got some very positive feedback and some significant amount
> of work earlier in the release cycle. The feedback from Julien earlier
> this month seemed pretty minor.
> 
> Ekaterina, is there any chance you'll be able to work on this this
> week and do you think it has a chance of making this release? Julien,
> do you think it's likely to be possible to polish for this release?

Most of the comments I have are easy to fix.  But I think that the real
problem is the significant overhead shown by Ekaterina, which for now
would apply even if you don't consume the new stats, for instance if
you have pg_stat_statements enabled.  And I'm still not sure what the
best way to avoid that is.




Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)

2022-03-28 Thread Justin Pryzby
On Sat, Mar 26, 2022 at 08:23:54PM +0900, Michael Paquier wrote:
> On Wed, Mar 23, 2022 at 03:17:35PM +0900, Michael Paquier wrote:
> > FWIW, per my review the bit of the patch set that I found the most
> > relevant is the addition of a note in the docs of pg_stat_file() about
> > the case where "filename" is a link, because the code internally uses
> > stat().   The function name makes that obvious, but that's not
> > commonly known, I guess.  Please see the attached, that I would be
> > fine to apply.
> 
> Hmm.  I am having second thoughts on this one, as on Windows we rely
> on GetFileInformationByHandle() for the emulation of stat() in
> win32stat.c, and it looks like this returns some information about the
> junction point and not the directory or file it is pointing to.

Where did you find that?  What metadata does it return about the
junction point?  We only care about a handful of fields.




Re: Window Function "Run Conditions"

2022-03-28 Thread David Rowley
On Wed, 23 Mar 2022 at 16:30, David Rowley  wrote:
>
> On Wed, 23 Mar 2022 at 11:24, David Rowley  wrote:
> > I think it's safer to just disable the optimisation when there are
> > multiple window clauses.  Multiple matching clauses are merged
> > already, so it's perfectly valid to have multiple window functions,
> > it's just they must share the same window clause.  I don't think
> > that's terrible as with the major use case that I have in mind for
> > this, the window function is only added to limit the number of rows.
> > In most cases I can imagine, there'd be no reason to have an
> > additional window function with different frame options.
>
> I've not looked into the feasibility of it, but I had a thought that
> we could just accumulate all the run-conditions in a new field in the
> PlannerInfo then just tag them onto the top-level WindowAgg when
> building the plan.
>
> I'm just not sure it would be any more useful than what the v3 patch
> is currently doing as intermediate WindowAggs would still need to
> process all rows.  I think it would only save the window function
> evaluation of the top-level WindowAgg for rows that don't match the
> run-condition.  All the supported window functions are quite cheap, so
> it's not a huge saving. I'd bet there would be example cases where it
> would be measurable though.

Another way of doing this that seems better is to make it so only the
top-level WindowAgg will stop processing when the run condition
becomes false.  Any intermediate WindowAggs must continue processing
tuples, but may skip evaluation of their WindowFuncs.

Doing things this way also allows us to handle cases where there is a
PARTITION BY clause. However, in this case, the top-level WindowAgg
must not stop processing and return NULL; instead, it can just act as
if it were an intermediate WindowAgg and stop evaluating
WindowFuncs.  The top-level WindowAgg must continue processing the
tuples so that the other partitions are also processed.

I made the v4 patch do things this way and tested the performance of
it vs current master.  Test 1 and 2 have PARTITION BY clauses. There's
a small performance increase from not evaluating the row_number()
function once rn <= 2 is no longer true.

Test 3 shows the same speedup as the original patch where we just stop
processing any further tuples when the run condition is no longer true
and there is no PARTITION BY clause.

Setup:
create table xy (x int, y int);
insert into xy select x,y from generate_series(1,1000)x,
generate_Series(1,1000)y;
create index on xy(x,y);
vacuum analyze xy;

Test 1:

explain analyze select * from (select x,y,row_number() over (partition
by x order by y) rn from xy) as xy where rn <= 2;

Master:

Execution Time: 359.553 ms
Execution Time: 354.235 ms
Execution Time: 357.646 ms

v4 patch:

Execution Time: 346.641 ms
Execution Time: 337.131 ms
Execution Time: 336.531 ms

(5% faster)

Test 2:

explain analyze select * from (select x,y,row_number() over (partition
by x order by y) rn from xy) as xy where rn = 1;

Master:

Execution Time: 359.046 ms
Execution Time: 357.601 ms
Execution Time: 357.977 ms

v4 patch:

Execution Time: 336.540 ms
Execution Time: 337.024 ms
Execution Time: 342.706 ms

(5.7% faster)

Test 3:

explain analyze select * from (select x,y,row_number() over (order by
x,y) rn from xy) as xy where rn <= 2;

Master:

Execution Time: 362.322 ms
Execution Time: 348.812 ms
Execution Time: 349.471 ms

v4 patch:

Execution Time: 0.060 ms
Execution Time: 0.037 ms
Execution Time: 0.037 ms

(~8000x faster)

One thing which I'm not sure about with the patch is how I'm handling
the evaluation of the runcondition in nodeWindowAgg.c.  Instead of
having ExecQual() evaluate an OpExpr such as "row_number() over (...)
<= 10", I'm replacing the WindowFunc with the Var in the targetlist
that corresponds to the given WindowFunc.  This saves having to double
evaluate the WindowFunc. Instead, the value of the Var can be taken
directly from the slot.  I don't know of anywhere else we do things
quite like that.  The runcondition is slightly similar to HAVING
clauses, but HAVING clauses don't work this way.  Maybe they would
have if slots had existed back then. Or maybe it's a bad idea to set a
precedent that the targetlist Vars must be evaluated already.  Does
anyone have any thoughts on this part?

v4 patch attached.

David
From 40dc5b3d521bc3490396bb6c3c2fb78714ef5242 Mon Sep 17 00:00:00 2001
From: David Rowley 
Date: Sat, 8 May 2021 23:43:25 +1200
Subject: [PATCH v4] Teach planner and executor about monotonic window funcs

Window functions such as row_number() always return a value higher than
the one previously returned within any given partition.  If a query only
requests the first few row numbers, then traditionally we would continue
evaluating the WindowAgg node until all tuples are exhausted.  However,
if someone, say, only wanted all row numbers <= 10, then we could just
stop once we get number 11.

Here we 

Re: Postgres restart in the middle of exclusive backup and the presence of backup_label file

2022-03-28 Thread Nathan Bossart
On Mon, Mar 28, 2022 at 04:30:27PM -0400, Stephen Frost wrote:
>> - By default, pg_start_backup can take a long time to finish.
>> + By default, pg_backup_start can take a long time to finish.
>>   This is because it performs a checkpoint, and the I/O
>>   required for the checkpoint will be spread out over a significant
>>   period of time, by default half your inter-checkpoint interval
> 
> Hrmpf.  Not this patch's fault, but the default isn't 'half your
> inter-checkpoint interval' anymore (checkpoint_completion_target was
> changed to 0.9 ... let's not discuss who did that and missed fixing
> this).  Overall though, maybe we should reword this to avoid having to
> remember to change it again if we ever change the completion target
> again?  Also it seems to imply that pg_backup_start is going to run its
> own independent checkpoint or something, which isn't the typical case.
> I generally explain what is going on here like this:
> 
> By default, pg_backup_start will wait for the next
> checkpoint to complete (see ref: checkpoint_timeout ... maybe also
> wal-configuration.html).

done

>> - The pg_stop_backup will return one row with three
>> + The pg_backup_stop will return one row with three
>>   values. The second of these fields should be written to a file named
>>   backup_label in the root directory of the backup. The
>>   third field should be written to a file named
> 
> I get that we have  and , but those are just
> formatting and don't show up in the actual text, and so it ends up
> being:
> 
> The pg_backup_stop will return
> 
> Which doesn't sound quite right to me.  I'd say we either drop 'The' or
> add 'function' after.  I realize that this patch is mostly doing a
> s/pg_stop_backup/pg_backup_stop/, but, still.

done

>> -Finishes performing an exclusive or non-exclusive on-line backup.
>> -The exclusive parameter must match the
>> -previous pg_start_backup call.
>> -In an exclusive backup, pg_stop_backup removes
>> -the backup label file and, if it exists, the tablespace map file
>> -created by pg_start_backup.  In a non-exclusive
>> -backup, the desired contents of these files are returned as part of
>> +Finishes performing an on-line backup.  The desired contents of the
>> +backup label file and the tablespace map file are returned as part of
>>  the result of the function, and should be written to files in the
>>  backup area (not in the data directory).
>> 
> 
> Given that this is a crucial part, which the exclusive mode has wrong,
> I'd be a bit more forceful about it, eg:
> 
> The contents of the backup label file and the tablespace map file must
> be stored as part of the backup and must NOT be stored in the live data
> directory (doing so will cause PostgreSQL to fail to restart in the
> event of a crash).

done

>>  The result of the function is a single record.
>>  The lsn column holds the backup's ending
>>  write-ahead log location (which again can be ignored).  The second and
>> -third columns are NULL when ending an exclusive
>> -backup; after a non-exclusive backup they hold the desired contents of
>> +third columns hold the desired contents of
>>  the label and tablespace map files.
>> 
>> 
> 
> I dislike saying 'desired' here as if it's something that we're nicely
> asking but maybe isn't a big deal, and just saying 'label' isn't good
> since we call it 'backup label' elsewhere and we should try hard to be
> consistent and clear.  How about:
> 
> The second column returns the contents of the backup label file, the
> third column returns the contents of the tablespace map file.  These
> must be stored as part of the backup and are required as part of the
> restore process.

done
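To make the documented flow concrete, the non-exclusive sequence with
the renamed functions reads roughly like this (a sketch; argument
names are per the patch and may differ in what gets committed):

SELECT pg_backup_start(label => 'nightly', fast => false);
-- ... copy the cluster's files to the backup area ...
SELECT lsn, labelfile, spcmapfile FROM pg_backup_stop();
-- Store labelfile as backup_label and spcmapfile as tablespace_map
-- in the backup itself, never in the live data directory.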

>>  {
>>  print_msg(_("WARNING: online backup mode is active\n"
>> -"Shutdown will not complete 
>> until pg_stop_backup() is called.\n\n"));
>> +"Shutdown will not complete 
>> until pg_backup_stop() is called.\n\n"));
>>  }
>>  
>>  print_msg(_("waiting for server to shut down..."));
>> @@ -1145,7 +1145,7 @@ do_restart(void)
>>  get_control_dbstate() != DB_IN_ARCHIVE_RECOVERY)
>>  {
>>  print_msg(_("WARNING: online backup mode is active\n"
>> -"Shutdown will not complete 
>> until pg_stop_backup() is called.\n\n"));
>> +"Shutdown will not complete 
>> until pg_backup_stop() is called.\n\n"));
>>  }
>>  
>>  print_msg(_("waiting for server to shut down..."));
> 
> This... can't actually happen anymore, right?  Shouldn't we just pull
> all of this out?

done

> On a 

Re: MDAM techniques and Index Skip Scan patch

2022-03-28 Thread Tom Lane
Peter Geoghegan  writes:
> The terminology in this area is a mess. MySQL calls
> SELECT-DISTINCT-that-matches-an-index "loose index scans". I think
> that you're talking about skip scan when you say "loose index scan".
> Skip scan is where there is an omitted prefix of columns in the SQL
> query -- omitted columns after the first column that lack an equality
> qual.

Right, that's the case I had in mind --- apologies if my terminology
was faulty.  btree can actually handle such a case now, but what it
fails to do is re-descend from the tree root instead of plowing
forward in the index to find the next matching entry.

> It might be useful for somebody to go write a "taxonomy of MDAM
> techniques", or a glossary.

+1.  We at least need to be sure we all are using these terms
the same way.

regards, tom lane




Re: MDAM techniques and Index Skip Scan patch

2022-03-28 Thread Peter Geoghegan
On Mon, Mar 28, 2022 at 5:21 PM Tom Lane  wrote:
> In any case, what I was on about is _bt_preprocess_keys() and
> adjacent code.  I'm surprised that those aren't more expensive
> than one palloc in _bt_first.  Maybe that logic falls through very
> quickly in simple cases, though.

I assume that it doesn't really appear in very simple cases (also
common cases). But delaying the scan setup work until execution time
does seem ugly. That's probably a good enough reason to refactor.

-- 
Peter Geoghegan




Re: MDAM techniques and Index Skip Scan patch

2022-03-28 Thread Peter Geoghegan
On Tue, Mar 22, 2022 at 1:55 PM Tom Lane  wrote:
> Peter asked me off-list to spend some time thinking about the overall
> direction we ought to be pursuing here.

Thanks for taking a look!

"5.5 Exploiting Key Prefixes" and "5.6 Ordered Retrieval" from "Modern
B-Tree Techniques" are also good, BTW.

The terminology in this area is a mess. MySQL calls
SELECT-DISTINCT-that-matches-an-index "loose index scans". I think
that you're talking about skip scan when you say "loose index scan".
Skip scan is where there is an omitted prefix of columns in the SQL
query -- omitted columns after the first column that lack an equality
qual. Pretty sure that MySQL/InnoDB can't do that -- it can only
"skip" to the extent required to make
SELECT-DISTINCT-that-matches-an-index perform well, but that's about
it.
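For instance, given a hypothetical table t with an index on (a, b):

-- "Loose index scan" in MySQL's sense: DISTINCT that matches the index.
SELECT DISTINCT a FROM t;

-- "Skip scan" as used here: the leading column "a" has no equality
-- qual, so the scan skips through each distinct value of "a" to apply
-- b = 42 against the index.
SELECT * FROM t WHERE b = 42;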

It might be useful for somebody to go write a "taxonomy of MDAM
techniques", or a glossary. The existing "Loose indexscan" Postgres
wiki page doesn't seem like enough. Something very high level and
explicit, with examples, just so we don't end up talking at cross
purposes too much.

> 1. Usually I'm in favor of doing this sort of thing in an index AM
> agnostic way, but here I don't see much point.  All of the ideas at
> stake rely fundamentally on having a lexicographically-ordered multi
> column index; but we don't have any of those except btree, nor do
> I think we're likely to get any soon.  This motivates the general
> tenor of my remarks below, which is "do it in access/nbtree/ not in
> the planner".

That was my intuition all along, but I didn't quite have the courage
to say so -- sounds too much like something that an optimizer
dilettante like me would be expected to say.  :-)

Seems like one of those things where lots of high level details
intrinsically need to be figured out on-the-fly, at execution time,
rather than during planning. Perhaps it'll be easier to correctly
determine that a skip scan plan is the cheapest in practice than to
accurately cost skip scan plans. If the only alternative is a
sequential scan, then perhaps a very approximate cost model will work
well enough. It's probably way too early to tell right now, though.

> I've worried in the past that we'd
> need planner/statistical support to figure out whether a loose scan
> is likely to be useful compared to just plowing ahead in the index.

I don't expect to be able to come up with a structure that leaves no
unanswered questions about future MDAM work -- it's not realistic to
expect everything to just fall into place. But that's okay. Just
having everybody agree on roughly the right conceptual model is the
really important thing. That now seems quite close, which I count as
real progress.

> 4. I find each of the above ideas to be far more attractive than
> optimizing SELECT-DISTINCT-that-matches-an-index, so I don't really
> understand why the current patchsets seem to be driven largely
> by that single use-case.  I wouldn't even bother with that for the
> initial patch.

I absolutely agree. I wondered about that myself in the past. My best
guess is that a certain segment of users are familiar with
SELECT-DISTINCT-that-matches-an-index from MySQL. And so to some
extent application frameworks evolved in a world where that capability
existed. IIRC Jesper once said that Hibernate relied on this
capability.

It's probably a lot easier to implement
SELECT-DISTINCT-that-matches-an-index if you have the MySQL storage
engine model, with concurrency control that's typically based on
two-phase locking. I think that MySQL does some amount of
deduplication in its executor here -- and *not* in what they call the storage
engine. This is exactly what I'd like to avoid in Postgres; as I said
"Maintenance of Index Order" (as the paper calls it) seems important,
and not something to be added later on. Optimizer and executor layers
that each barely know the difference between a skip scan and a full
index scan seems like something we might actually want to aim for,
rather than avoid. Teaching nbtree to transform quals into ranges
sounds odd at first, but it seems like the right approach now, on
balance -- that's the only *good* way to maintain index order.
(Maintaining index order is needed to avoid needing or relying on
deduplication in the executor proper, which is even inappropriate in
an implementation of SELECT-DISTINCT-that-matches-an-index IMO.)

--
Peter Geoghegan




Re: POC: GROUP BY optimization

2022-03-28 Thread Zhihong Yu
On Mon, Mar 28, 2022 at 5:49 PM Tomas Vondra 
wrote:

> Hi,
>
> Here's a rebased/improved version of the patch, with smaller parts
> addressing various issues. There are seven parts:
>
> 0001 - main part, just rebased
>
> 0002 - replace the debug GUC options with a single GUC to disable the
>optimization if needed
>
> 0003 - minor code cleanup, removal of unnecessary variable
>
> 0004 - various comment fixes (rewordings, typos, ...)
>
> 0005 - a minor code simplification, addressing FIXMEs from 0004
>
> 0006 - adds the new GUC to the docs
>
> 0007 - demonstrates plan changes with a disabled optimization
>
> The first 6 parts should be squashed and committed at once; I only kept
> them separate for clarity. The 0007 is merely a demonstration of the new
> GUC and that it disables the optimization.
>
> > Agree. Because it is a kind of automation we should allow user to switch
> > it off in the case of problems or manual tuning.
> > > Also, I looked through this patch. It has some minor problems:
> > 1. Multiple typos in the patch comment.
>
> I went through the comments and checked all of them for grammar mistakes
> and typos using a word processor, so hopefully that should be OK. But
> maybe there's still something wrong.
>
> > 2. The term 'cardinality of a key' - may be replace with 'number of
> > duplicates'?
>
> No, cardinality means "number of distinct values", so "duplicates" would
> be wrong. And I think "cardinality" is well established term, so I think
> it's fine.
>
> BTW I named the GUC enable_group_by_reordering, I wonder if it should be
> named differently, e.g. enable_groupby_reordering? Opinions?
>
>
> regards
>
> --
> Tomas Vondra
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

Hi,

For 0001-Optimize-order-of-GROUP-BY-keys-20220328.patch:

multiple parametes need to be

  parametes -> parameters

leave more expensive comparions

  comparions -> comparisons

+   if (has_fake_var == false)

The above can be written as:

   if (!has_fake_var)

+   nGroups = ceil(2.0 + sqrt(tuples) * (i + 1) / list_length(pathkeys));

Looks like the value of tuples doesn't change inside the loop.
You can precompute sqrt(tuples) outside the loop and store the value in a
variable.
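A self-contained illustration of the suggested hoisting (names are
invented for the example):

#include <math.h>

/* Illustrative only: compute the loop-invariant sqrt(tuples) once. */
static void
estimate_group_counts(double tuples, int npathkeys, double *nGroups)
{
    double  sqrt_tuples = sqrt(tuples); /* hoisted out of the loop */

    for (int i = 0; i < npathkeys; i++)
        nGroups[i] = ceil(2.0 + sqrt_tuples * (i + 1) / npathkeys);
}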

+   return -1;
+   else if (a->cost == b->cost)
+   return 0;
+   return 1;

the keyword 'else' is not needed.

+ * Returns newly allocated lists. If no reordering is possible (or needed),
+ * the lists are set to NIL.
+ */
+static bool
+get_cheapest_group_keys_order(PlannerInfo *root, double nrows,

It seems the comment for return value doesn't match the bool return type.

+   /* If this optimization is disabled, we're done. */
+   if (!debug_cheapest_group_by)

It seems enable_cheapest_group_by would be better name for the flag.

Cheers


Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-03-28 Thread Michael Paquier
On Mon, Mar 28, 2022 at 02:34:44PM +1300, Thomas Munro wrote:
> Just a thought:  we could consider back-patching
> allow_in_place_tablespaces, after a little while, if we're happy with
> how that is working out, if it'd be useful for verifying bug fixes in
> back branches.  It's non-end-user-facing testing infrastructure.

+1 for a backpatch on that.  That would be useful.
--
Michael




RE: Logical replication timeout problem

2022-03-28 Thread wangw.f...@fujitsu.com
On Mon, Mar 28, 2022 at 2:11 AM I wrote:
> Rebase the patch.

After reviewing another patch [1], I think this patch should also add a loop in
function WalSndUpdateProgress like the one in function WalSndWriteData.
So I have updated the patch to be consistent with the existing code and the
patch mentioned above.

Attach the new patch.
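(For reference, the existing loop in WalSndWriteData has roughly this
shape -- a sketch; the function names below are the existing ones in
walsender.c, while the placement inside WalSndUpdateProgress is what
the patch adds:)

for (;;)
{
    /* Check for input from the client, and for a reached timeout. */
    ProcessRepliesIfAny();
    WalSndCheckTimeOut();

    /* Send a keepalive if it is time for one. */
    WalSndKeepaliveIfNecessary();

    /* Done once there is no pending output. */
    if (!pq_is_send_pending())
        break;

    /* Sleep until the socket is writable or the timeout nears. */
    WalSndWait(WL_SOCKET_WRITEABLE,
               WalSndComputeSleeptime(GetCurrentTimestamp()),
               WAIT_EVENT_WAL_SENDER_WRITE_DATA);
}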

[1] - 
https://www.postgresql.org/message-id/OS0PR01MB5716946347F607F4CFB02FCE941D9%40OS0PR01MB5716.jpnprd01.prod.outlook.com

Regards,
Wang wei


v9-0001-Fix-the-logical-replication-timeout-during-large-.patch
Description:  v9-0001-Fix-the-logical-replication-timeout-during-large-.patch


Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-03-28 Thread Kyotaro Horiguchi
At Mon, 28 Mar 2022 10:37:04 -0400, Robert Haas  wrote 
in 
> On Fri, Mar 25, 2022 at 8:26 AM Alvaro Herrera  
> wrote:
> > On 2022-Mar-21, Alvaro Herrera wrote:
> > > I had a look at this latest version of the patch, and found some things
> > > to tweak.  Attached is v21 with three main changes from Kyotaro's v20:
> >
> > Pushed this, backpatching to 14 and 13.  It would have been good to
> > backpatch further, but there's an (textually trivial) merge conflict
> > related to commit e6d8069522c8.  Because that commit conceptually
> > touches the same area that this bugfix is about, I'm not sure that
> > backpatching further without a lot more thought is wise -- particularly
> > so when there's no way to automate the test in branches older than
> > master.
> >
> > This is quite annoying, considering that the bug was reported shortly
> > before 12 went into beta.
> 
> I think that the warnings this patch issues may cause some unnecessary
> end-user alarm. It seems to me that they are basically warning about a
> situation that is unusual but not scary. Isn't the appropriate level
> for that DEBUG1, maybe without the errhint?

log_invalid_page reports missing pages with DEBUG1 before reaching
consistency.  And since a missing directory is not an issue if all of
those reports are forgotten by the time we reach consistency, DEBUG1
sounds reasonable.  Maybe we could lower the DEBUG1 messages to DEBUG2
in XLogRememberMissingDir?

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




RE: Logical replication timeout problem

2022-03-28 Thread kuroda.hay...@fujitsu.com
Dear Amit, Wang,

> > I think maybe we do not need to deal with this use case.
> > The maximum number of table columns allowed by PG is 1600
> > (macro MaxHeapAttributeNumber), and after loop through all columns in the
> > function logicalrep_write_tuple, the function OutputPluginWrite will be 
> > invoked
> > immediately to actually send the data to the subscriber. This refreshes the
> > last time the subscriber received a message.
> > So I think this loop will not cause timeout issues.
> >
> 
> Right, I also don't think it can be a source of timeout.

OK. I have no comments for this version.


Best Regards,
Hayato Kuroda
FUJITSU LIMITED


Re: Allow file inclusion in pg_hba and pg_ident files

2022-03-28 Thread Michael Paquier
On Mon, Mar 28, 2022 at 04:33:30PM +0800, Julien Rouhaud wrote:
> Ok, v5 attached without the TAP tests and updated sysviews tests.

The update of the query related to pg_hba_file_rules in the regression
tests was independent, so I have split and applied that first, as of
091a971.

Now, for the rest, I have found one place in the docs that had an
incorrect link, two incorrect comments (aka
s/pg_hba.conf/pg_ident.conf/), and the indentation was a bit off.
Anyway, all that was cosmetic, so applied after adjusting all those
things.
--
Michael




Re: SQL/JSON: functions

2022-03-28 Thread Andrew Dunstan





> On Mar 28, 2022, at 7:25 PM, Tom Lane  wrote:
> 
> ... even more baffling: jabiru went green after the IS JSON patch.
> 
>   

Yeah, bizarre. Let’s see if I can upset that tomorrow with the next patch :-)

cheers

andrew 



Re: MDAM techniques and Index Skip Scan patch

2022-03-28 Thread Tom Lane
Peter Geoghegan  writes:
> We could get rid of dynamic allocations for BTStackData in
> _bt_first(), perhaps. The problem is that there is no simple,
> reasonable proof of the maximum height on a B-tree, even though a
> B-Tree with more than 7 or 8 levels seems extraordinarily unlikely.

Start with a few entries preallocated, and switch to dynamically
allocated space if there turn out to be more levels than that,
perhaps?  Not sure if it's worth the trouble.
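Something like this, perhaps (a sketch only; BTStackData is the real
nbtree struct, while the wrapper and its names are invented):

#define BT_PREALLOC_LEVELS 8	/* trees taller than this are very rare */

typedef struct BTStackAlloc
{
	BTStackData fixed[BT_PREALLOC_LEVELS];	/* preallocated entries */
	int			used;
} BTStackAlloc;

static BTStackData *
bt_stack_get(BTStackAlloc *a)
{
	if (a->used < BT_PREALLOC_LEVELS)
		return &a->fixed[a->used++];	/* common case: no palloc */
	return (BTStackData *) palloc(sizeof(BTStackData)); /* tall tree */
}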

In any case, what I was on about is _bt_preprocess_keys() and
adjacent code.  I'm surprised that those aren't more expensive
than one palloc in _bt_first.  Maybe that logic falls through very
quickly in simple cases, though.

regards, tom lane




Re: [HACKERS] WIP aPatch: Pgbench Serialization and deadlock errors

2022-03-28 Thread Tatsuo Ishii
> Hello,
> 
> On 2022-Mar-27, Tatsuo Ishii wrote:
> 
>> After:
>> interval_start num_transactions sum_latency sum_latency_2 min_latency 
>> max_latency
>>   { failures | serialization_failures deadlock_failures } [ sum_lag 
>> sum_lag_2 min_lag max_lag [ skipped ] ] [ retried retries ]
> 
> You're showing an indentation, but looking at the HTML output there is
> no such.  Is the HTML processor eating leading whitespace or something
> like that?

I just copied from my web browser screen (Firefox 98.0.2 on Ubuntu 20).

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: MDAM techniques and Index Skip Scan patch

2022-03-28 Thread Peter Geoghegan
On Tue, Mar 22, 2022 at 4:06 PM Andres Freund  wrote:
> Are you thinking of just moving the setup stuff in nbtree (presumably parts of
> _bt_first() / _bt_preprocess_keys()) or also stuff in
> ExecIndexBuildScanKeys()?
>
> The latter does show up a bit more heavily in profiles than nbtree specific
> setup, and given that it's generic executor type stuff, seems even more
> amenable to being moved to plan time.

When I was working on the patch series that became the nbtree Postgres
12 work, this came up. At one point I discovered that using palloc0()
for the insertion scankey in _bt_first() was a big problem with nested
loop joins -- it became a really noticeable bottleneck with one of my
test cases. I independently discovered what Tom must have figured out
back in 2005, when he committed d961a56899. This led to my making the
new insertion scan key structure (BTScanInsertData) not use dynamic
allocation. So _bt_first() is definitely performance critical for
certain types of queries.

We could get rid of dynamic allocations for BTStackData in
_bt_first(), perhaps. The problem is that there is no simple,
reasonable proof of the maximum height on a B-tree, even though a
B-Tree with more than 7 or 8 levels seems extraordinarily unlikely.
You could also invent a slow path (maybe do what we do in
_bt_insert_parent() in the event of a concurrent root page split/NULL
stack), but that runs into the problem of being awkward to test, and
pretty ugly. It's doable, but I wouldn't do it unless there was a
pretty noticeable payoff.

-- 
Peter Geoghegan




Re: refactoring basebackup.c (zstd workers)

2022-03-28 Thread Tom Lane
Justin Pryzby  writes:
> Also, why wouldn't *kwend be checked in any case ?

I suspect Robert wrote it that way intentionally --- but if so,
I agree it could do with more than zero commentary.

regards, tom lane




Re: refactoring basebackup.c (zstd workers)

2022-03-28 Thread Justin Pryzby
On Mon, Mar 28, 2022 at 05:39:31PM -0400, Robert Haas wrote:
> On Mon, Mar 28, 2022 at 4:53 PM Justin Pryzby  wrote:
> > I suggest to write it differently, as in 0002.
> 
> That doesn't seem better to me. What's the argument for it?

I find this much easier to understand:

		/* If we got an error or have reached the end of the string, stop. */
-		if (result->parse_error != NULL || *kwend == '\0' || *vend == '\0')
+		if (result->parse_error != NULL)
+			break;
+		if (*kwend == '\0')
+			break;
+		if (vend != NULL && *vend == '\0')
			break;

than

		/* If we got an error or have reached the end of the string, stop. */
-		if (result->parse_error != NULL || *kwend == '\0' || *vend == '\0')
+		if (result->parse_error != NULL ||
+			(vend == NULL ? *kwend == '\0' : *vend == '\0'))

Also, why wouldn't *kwend be checked in any case?




Re: [PATCH] Add extra statistics to explain for Nested Loop

2022-03-28 Thread Justin Pryzby
> > +static void show_loop_info(Instrumentation *instrument, bool isworker,
> > +  ExplainState *es);
> >
> > I think this should be done as a separate refactoring commit.

Right - the 0001 patch I sent seems independently beneficial, and makes the
changes in 0002 more apparent.  My 0001 could also be applied after the feature
freeze and before branching for v16.




Re: SQL/JSON: functions

2022-03-28 Thread Tom Lane
... even more baffling: jabiru went green after the IS JSON patch.

regards, tom lane




Re: [RFC] building postgres with meson -v8

2022-03-28 Thread Andres Freund
Hi,

On 2022-03-28 18:58:19 -0400, Tom Lane wrote:
> If we can commit meson build infrastructure without removing the
> existing infrastructure, then the buildfarm can continue to work,
> and we can roll out support for the new way slowly.

I think it's not a huge issue to have both for a while. Of course it's
annoying to have to update two files when adding a source file, but it's not
the end of the world for a limited time. IMO.


> It'd be fairly impractical to expect all buildfarm animals to update
> instantly anyhow.

And development workflows. I expect some unforeseen breakages somewhere, given
the variety of systems out there.

Greetings,

Andres Freund




Re: [RFC] building postgres with meson -v8

2022-03-28 Thread Tom Lane
Andrew Dunstan  writes:
> On 3/28/22 15:59, Andres Freund wrote:
>> In our context it could make sense to merge meson, after a few months of
>> shakeup remove the current windows buildsystems, and then in release + 1
>> remove the make based stuff.

That sounds like a decent plan.

> I'd like to get a stake in the ground and then start working on
> buildfarm support. Up to now I think it's been a bit too much of a
> moving target. Essentially that would mean an interim option for
> building with meson.

If we can commit meson build infrastructure without removing the
existing infrastructure, then the buildfarm can continue to work,
and we can roll out support for the new way slowly.  It'd be
fairly impractical to expect all buildfarm animals to update
instantly anyhow.

regards, tom lane




Re: [RFC] building postgres with meson -v8

2022-03-28 Thread Andrew Dunstan


On 3/28/22 15:59, Andres Freund wrote:
>
> One thing I'd like to discuss fairly soon is what kind of approach to take for
> integrating meson support. Most other projects I looked at kept parallel
> buildsystems for at least a release, so that there's one round of "broad" user
> feedback.



We did something similar when we moved from CVS to git.


>
> In our context it could make sense to merge meson, after a few months of
> shakeup remove the current windows buildsystems, and then in release + 1
> remove the make based stuff.
>
> But we can have that discussion that before the next CF, but still after
> code-freeze & immediate mopup.
>


I'd like to get a stake in the ground and then start working on
buildfarm support. Up to now I think it's been a bit too much of a
moving target. Essentially that would mean an interim option for
building with meson.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: On login trigger: take three

2022-03-28 Thread Tom Lane
Andres Freund  writes:
> On 2022-03-28 23:27:56 +0200, Daniel Gustafsson wrote:
>> Do you think this potential foot-gun is scary enough to reject this patch?
>> There are lots of creative ways to cause Nagios alerts from ones database, 
>> but
>> this has the potential to do so with a small bug in userland code.  Still, I
>> kind of like the feature so I'm indecisive.

> It does seem like a huge footgun. But also kinda useful. So I'm really +-0.

An on-login trigger is *necessarily* a foot-gun; I don't see that this
particular failure mode makes it any worse than it would be anyway.
There has to be some not-too-difficult-to-use way to bypass a broken
login trigger.  Assuming we are happy with the design for doing that,
might as well accept the hazards.

regards, tom lane




Re: Granting SET and ALTER SYSTE privileges for GUCs

2022-03-28 Thread Tom Lane
Mark Dilger  writes:
> I was about to write another patch using the HookStr form, but if you are 
> already editing, then I'll let you make the change.  I don't see a problem 
> with what you are proposing.

Don't sweat about that, I can easily rebase what I've done so far
over your updates.

regards, tom lane




Re: Granting SET and ALTER SYSTE privileges for GUCs

2022-03-28 Thread Tom Lane
I'm going to be mostly unavailable till Wednesday, but I'll leave
you with another thing to chew on:

regression=# create user joe;
CREATE ROLE
regression=# grant set on parameter plpgsql.extra_warnings to joe;
ERROR:  unrecognized configuration parameter "plpgsql.extra_warnings"

This is problematic, because once plpgsql is loaded it works:

regression=# load 'plpgsql';
LOAD
regression=# grant set on parameter plpgsql.extra_warnings to joe;
GRANT

If we then do

$ pg_dumpall -g

it falls over:
pg_dumpall: error: query failed: ERROR:  unrecognized configuration parameter 
"plpgsql.extra_warnings"

apparently because aclparameterdefault() is allergic to being asked about
unknown GUCs, and plpgsql is not loaded in pg_dumpall's session.  But
if pg_dumpall hadn't failed, it'd produce a dump script containing that
same command, which would fail at load time (because, again, plpgsql
isn't going to be loaded in the backend reading the restore script).

This is what I meant by saying that you can't just refuse to GRANT on
unknown GUCs.  It makes custom GUCs into a time bomb for dump/restore.
And that means you need a strategy for dealing with the possibility
that you don't know whether the GUC is USERSET or not.  I think though
that it might work to just assume that it isn't, in which case dumps
on unrecognized GUCs that really are USERSET will end up issuing an
explicit GRANT SET TO PUBLIC that we didn't actually need to do, but it
won't hurt anything.  (Testing that assertion would be a good thing
to do.)

regards, tom lane




Re: SQL/JSON: JSON_TABLE

2022-03-28 Thread Andrew Dunstan


On 3/28/22 15:48, Greg Stark wrote:
> FYI I think the patch failure in the cfbot is spurious because the
> cfbot got confused by Erik's patch.



The cfbot is likely to be confused until I am finished committing the
SQL/JSON patches. Just disregard it.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Granting SET and ALTER SYSTE privileges for GUCs

2022-03-28 Thread Mark Dilger



> On Mar 28, 2022, at 2:54 PM, Tom Lane  wrote:
> 
> Yeah, I know it's *possible* to make this work.  The question is why is
> it good to do it like this rather than to use the string API, now that
> we have the latter.  AFAICS this way just guarantees that the hook must
> do a catalog lookup in order to figure out what you're talking about.

Ok, thanks for clarifying.  I took the *HookStr versions of the hooks to be an 
alternative to be used when no Oid was present, something of a last resort.  I 
never thought much about using them under other circumstances.

> The core point here is that the actual identity of a GUC is its name.
> Any OID that may exist in pg_parameter_acl is just a nonce alias that
> means nothing to anybody.  Anyone who's trying to, say, enforce that
> Joe Blow can't change shared_buffers is going to need to see the GUC
> name.  (I am, btw, busy doing a lot of renaming in the patch to try
> to clarify that these OIDs are not identifiers for GUCs; imagining
> that they are just risks confusion.)

I was about to write another patch using the HookStr form, but if you are 
already editing, then I'll let you make the change.  I don't see a problem with 
what you are proposing.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Granting SET and ALTER SYSTE privileges for GUCs

2022-03-28 Thread Tom Lane
Mark Dilger  writes:
>> On Mar 28, 2022, at 2:16 PM, Tom Lane  wrote:
>> I just came across something odd in v12 that is still there in v13:
>> ExecGrant_Parameter uses InvokeObjectPostAlterHook not
>> InvokeObjectPostAlterHookArgStr.  This seems pretty inconsistent.
>> Is there a good argument for it?

> For SET and ALTER SYSTEM, the target of the action may not have an entry
> in pg_parameter_acl, nor an assigned Oid anywhere, so the only
> consistent way to pass the argument to the hook is by name.  For
> GRANT/REVOKE, the parameter must have an Oid, at least by the time the
> hook gets called.

Yeah, I know it's *possible* to make this work.  The question is why is
it good to do it like this rather than to use the string API, now that
we have the latter.  AFAICS this way just guarantees that the hook must
do a catalog lookup in order to figure out what you're talking about.

The core point here is that the actual identity of a GUC is its name.
Any OID that may exist in pg_parameter_acl is just a nonce alias that
means nothing to anybody.  Anyone who's trying to, say, enforce that
Joe Blow can't change shared_buffers is going to need to see the GUC
name.  (I am, btw, busy doing a lot of renaming in the patch to try
to clarify that these OIDs are not identifiers for GUCs; imagining
that they are just risks confusion.)

regards, tom lane




Re: SSL/TLS instead of SSL in docs

2022-03-28 Thread Daniel Gustafsson
> On 25 Mar 2022, at 22:01, Daniel Gustafsson  wrote:
>> On 25 Mar 2022, at 20:58, Robert Haas  wrote:

>> However, if we're not ready/willing to make a bigger change, then doing as
>> you have proposed here seems fine to me.
> 
> Thanks for the review!  Trying it out again just now, the patch still applies
> (with some offsets) and builds.

Barring objections I will go ahead and push this for 15.  It's the minimal
change but it might still help someone new to PostgreSQL who gets confused on
the choice of naming/wording.

--
Daniel Gustafsson   https://vmware.com/





Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2022-03-28 Thread Peter Geoghegan
On Mon, Mar 28, 2022 at 1:23 PM Peter Geoghegan  wrote:
> I doubt that the patch's use of pg_memory_barrier() in places like
> _bt_killitems() is correct.

I also doubt that posting list splits are handled correctly.

If there is an LP_DEAD bit set on a posting list on the primary, and
we need to do a posting list split against the posting tuple, we need
to be careful -- we cannot allow our new TID to look like it's LP_DEAD
immediately, before our transaction even commits/aborts. We cannot
swap out our new TID with an old LP_DEAD TID, because we'll think that
our new TID is LP_DEAD when we shouldn't.

This is currently handled by having the inserter do an early round of
simple/LP_DEAD index tuple deletion, using the "simpleonly" argument
from _bt_delete_or_dedup_one_page(). Obviously the primary cannot be
expected to know that one of its standbys has independently set a
posting list's LP_DEAD bit, though. At the very least you need to
teach the posting list split path in btree_xlog_insert() about all
this -- it's not necessarily sufficient to clear LP_DEAD bits in the
index AM's fpi_mask() routine.

Overall, I think that this patch has serious design flaws, and that
this issue is really just a symptom of a bigger problem.

-- 
Peter Geoghegan




Re: Granting SET and ALTER SYSTE privileges for GUCs

2022-03-28 Thread Mark Dilger



> On Mar 28, 2022, at 2:16 PM, Tom Lane  wrote:
> 
> I just came across something odd in v12 that is still there in v13:
> ExecGrant_Parameter uses InvokeObjectPostAlterHook not
> InvokeObjectPostAlterHookArgStr.  This seems pretty inconsistent.
> Is there a good argument for it?
> 

For SET and ALTER SYSTEM, the target of the action may not have an entry in 
pg_parameter_acl, nor an assigned Oid anywhere, so the only consistent way to 
pass the argument to the hook is by name.  For GRANT/REVOKE, the parameter must 
have an Oid, at least by the time the hook gets called.  Upthread there was 
some discussion of a hook not being able to assume a snapshot and working 
transaction, and hence not being able to query the catalogs.  I would think 
that in a GRANT or REVOKE that hasn't already errored, the hook would have a 
transaction and could look up whatever it likes?  There is a 
CommandCounterIncrement() call issued in objectNamesToOids() for new 
parameters, so by the time the hook is running it should be able to see the 
parameter.

Am I reasoning about this the wrong way?

> ... or, for that matter, why is there any such call at all?
> No other GRANT/REVOKE operation calls such a hook.

I think ALTER DEFAULT PRIVILEGES does, though that's not quite the same thing.  
I don't have a strong opinion on this.  Joshua, what's your take?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: refactoring basebackup.c (zstd workers)

2022-03-28 Thread Robert Haas
On Mon, Mar 28, 2022 at 4:53 PM Justin Pryzby  wrote:
> I suggest writing it differently, as in 0002.

That doesn't seem better to me. What's the argument for it?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: On login trigger: take three

2022-03-28 Thread Daniel Gustafsson
> On 28 Mar 2022, at 23:31, Andres Freund  wrote:
> 
> Hi,
> 
> On 2022-03-28 23:27:56 +0200, Daniel Gustafsson wrote:
>>> On 28 Mar 2022, at 19:10, Andres Freund  wrote:
>>> On 2022-03-28 15:57:37 +0300, a.soko...@postgrespro.ru wrote:
>> 
 +data initialization. It is vital that any event trigger using the
 +login event checks whether or not the database is in
 +recovery.
 
 Does any trigger really have to contain a pg_is_in_recovery() call?
>>> 
>>> Not *any* trigger, just any trigger that writes.
>> 
>> That's correct, the docs should be updated with something like the below I
>> reckon.
>> 
>>    It is vital that any event trigger using the login event
>>    which has side-effects checks whether or not the database is in recovery
>>    to ensure it is not performing modifications on hot standby nodes.
> 
> Maybe side-effects is a bit too general? Emitting a log message, rejecting a
> login, setting some GUCs, etc are all side-effects too.

Good point, it needs to say that modifications that cause WAL to be generated
are prohibited, but in a more user-friendly readable way.  Perhaps in a big red
warning box.
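
For illustration, a minimal sketch of a trigger written that way (the
user_login_log table is hypothetical, and the syntax assumes this patch's
proposed login event):

CREATE FUNCTION on_login_proc() RETURNS event_trigger AS $$
BEGIN
    -- skip the write entirely on a hot standby, where it would fail
    IF NOT pg_is_in_recovery() THEN
        INSERT INTO user_login_log (username, login_time)
        VALUES (session_user, now());
    END IF;
END;
$$ LANGUAGE plpgsql;

CREATE EVENT TRIGGER on_login_trigger ON login
    EXECUTE FUNCTION on_login_proc();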

 In this message
 (https://www.postgresql.org/message-id/20220312024652.lvgehszwke4hhove%40alap3.anarazel.de)
 it was only about triggers on hot standby, which run non-read-only queries
>>> 
>>> The problem precisely is that the login triggers run on hot standby nodes,
>>> and that if they do writes, you can't login anymore.
>> 
>> Do you think this potential foot-gun is scary enough to reject this patch?
>> There are lots of creative ways to cause Nagios alerts from one's database,
>> but this has the potential to do so with a small bug in userland code.  Still, I
>> kind of like the feature so I'm indecisive.
> 
> It does seem like a huge footgun. But also kinda useful. So I'm really +-0.

Looks like we are in agreement here.  I'm going to go over it again and sleep
on it some more before the deadline hits.

--
Daniel Gustafsson   https://vmware.com/





Re: On login trigger: take three

2022-03-28 Thread Andres Freund
Hi,

On 2022-03-28 23:27:56 +0200, Daniel Gustafsson wrote:
> > On 28 Mar 2022, at 19:10, Andres Freund  wrote:
> > On 2022-03-28 15:57:37 +0300, a.soko...@postgrespro.ru wrote:
> 
> >> +data initialization. It is vital that any event trigger using the
> >> +login event checks whether or not the database is in
> >> +recovery.
> >> 
> >> Does any trigger really have to contain a pg_is_in_recovery() call?
> > 
> > Not *any* trigger, just any trigger that writes.
> 
> That's correct, the docs should be updated with something like the below I
> reckon.
> 
> It is vital that any event trigger using the login event
> which has side-effects checks whether or not the database is in recovery to
> ensure it is not performing modifications on hot standby nodes.

Maybe side-effects is a bit too general? Emitting a log message, rejecting a
login, setting some GUCs, etc are all side-effects too.


> >> In this message
> >> (https://www.postgresql.org/message-id/20220312024652.lvgehszwke4hhove%40alap3.anarazel.de)
> >> it was only about triggers on hot standby, which run non-read-only queries
> > 
> > The problem precisely is that the login triggers run on hot standby nodes,
> > and that if they do writes, you can't login anymore.
> 
> Do you think this potential foot-gun is scary enough to reject this patch?
> There are lots of creative ways to cause Nagios alerts from one's database, but
> this has the potential to do so with a small bug in userland code.  Still, I
> kind of like the feature so I'm indecisive.

It does seem like a huge footgun. But also kinda useful. So I'm really +-0.

Greetings,

Andres Freund




Re: On login trigger: take three

2022-03-28 Thread Daniel Gustafsson
> On 28 Mar 2022, at 19:10, Andres Freund  wrote:
> On 2022-03-28 15:57:37 +0300, a.soko...@postgrespro.ru wrote:

>> +data initialization. It is vital that any event trigger using the
>> +login event checks whether or not the database is in
>> +recovery.
>> 
>> Does any trigger really have to contain a pg_is_in_recovery() call?
> 
> Not *any* trigger, just any trigger that writes.

That's correct, the docs should be updated with something like the below I
reckon.

It is vital that any event trigger using the login event
which has side-effects checks whether or not the database is in recovery to
ensure it is not performing modifications on hot standby nodes.

>> In this message
>> (https://www.postgresql.org/message-id/20220312024652.lvgehszwke4hhove%40alap3.anarazel.de)
>> it was only about triggers on hot standby, which run non-read-only queries
> 
> The problem precisely is that the login triggers run on hot standby nodes, and
> that if they do writes, you can't login anymore.

Do you think this potential foot-gun is scary enough to reject this patch?
There are lots of creative ways to cause Nagios alerts from one's database, but
this has the potential to do so with a small bug in userland code.  Still, I
kind of like the feature so I'm indecisive.

--
Daniel Gustafsson   https://vmware.com/





Re: Granting SET and ALTER SYSTE privileges for GUCs

2022-03-28 Thread Tom Lane
I just came across something odd in v12 that is still there in v13:
ExecGrant_Parameter uses InvokeObjectPostAlterHook not
InvokeObjectPostAlterHookArgStr.  This seems pretty inconsistent.
Is there a good argument for it?

... or, for that matter, why is there any such call at all?
No other GRANT/REVOKE operation calls such a hook.

regards, tom lane




Re: refactoring basebackup.c (zstd workers)

2022-03-28 Thread Justin Pryzby
On Mon, Mar 28, 2022 at 03:50:50PM -0400, Robert Haas wrote:
> On Sun, Mar 27, 2022 at 1:47 PM Tom Lane  wrote:
> > Coverity has a nitpick about this:
> >
> > /srv/coverity/git/pgsql-git/postgresql/src/common/backup_compression.c: 194 
> > in parse_bc_specification()
> > 193 /* Advance to next entry and loop around. */
> > >>> CID 1503251:  Null pointer dereferences  (REVERSE_INULL)
> > >>> Null-checking "vend" suggests that it may be null, but it has 
> > >>> already been dereferenced on all paths leading to the check.
> > 194 specification = vend == NULL ? kwend + 1 : vend + 1;
> > 195 }
> > 196 }
> >
> > Not sure if you should remove this null-check or add some other ones,
> > but I think you ought to do one or the other.
> 
> Yes, I think this is buggy.  I think there's only a theoretical bug
> right now, because the only keyword we have is "level" and that
> requires a value. But if I add an example keyword that does not
> require an associated value (as demonstrated in the attached patch)
> and do something like pg_basebackup -cfast -D whatever --compress
> lz4:example, then the present code will dereference "vend" even though
> it's NULL, which is not good. The attached patch also shows how I
> think that should be fixed.
> 
> As I hope is apparent, the first hunk of this patch is not for commit,
> and the second hunk is for commit.

Confirmed that it's a real issue with my patch for zstd long match mode.  But
you need to specify another option after the value-less flag option for it to
crash.

I suggest writing it differently, as in 0002.

This also fixes some rebase-induced errors with my previous patches, and adds
expect_boolean().
>From 9bedbfc6bfa471473a8b3479ffd1888d5da285ab Mon Sep 17 00:00:00 2001
From: Robert Haas 
Date: Mon, 28 Mar 2022 13:25:44 -0400
Subject: [PATCH 1/4] Allow parallel zstd compression when taking a base
 backup.

libzstd allows transparent parallel compression just by setting
an option when creating the compression context, so permit that
for both client and server-side backup compression. To use this,
use something like pg_basebackup --compress WHERE-zstd:workers=N
where WHERE is "client" or "server" and N is an integer.

When compression is performed on the server side, this will spawn
threads inside the PostgreSQL backend. While there is almost no
PostgreSQL server code which is thread-safe, the threads here are used
internally by libzstd and touch only data structures controlled by
libzstd.

Patch by me, based in part on earlier work by Dipesh Pandit
and Jeevan Ladhe. Reviewed by Justin Pryzby.
---
 doc/src/sgml/protocol.sgml| 12 +++--
 doc/src/sgml/ref/pg_basebackup.sgml   |  4 +-
 src/backend/replication/basebackup_zstd.c | 45 ---
 src/bin/pg_basebackup/bbstreamer_zstd.c   | 40 -
 src/bin/pg_basebackup/t/010_pg_basebackup.pl  |  5 +++
 src/bin/pg_verifybackup/t/009_extract.pl  | 29 +++-
 src/bin/pg_verifybackup/t/010_client_untar.pl | 33 --
 src/common/backup_compression.c   | 16 +++
 src/include/common/backup_compression.h   |  2 +
 src/test/perl/PostgreSQL/Test/Cluster.pm  |  3 +-
 10 files changed, 148 insertions(+), 41 deletions(-)

diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 2fa3cedfe9e..98f0bc3cc34 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -2739,17 +2739,23 @@ The commands accepted in replication mode are:
   option.  If the value is an integer, it specifies the compression
   level.  Otherwise, it should be a comma-separated list of items,
   each of the form keyword or
-  keyword=value. Currently, the only supported
-  keyword is level, which sets the compression
-  level.
+  keyword=value. Currently, the supported keywords
+  are level and workers.
 
 
 
+  The level keyword sets the compression level.
   For gzip the compression level should be an
   integer between 1 and 9, for lz4 an integer
   between 1 and 12, and for zstd an integer
   between 1 and 22.
  
+
+
+  The workers keyword sets the number of threads
+  that should be used for parallel compression. Parallel compression
+  is supported only for zstd.
+ 
 

 
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index d9233beb8e1..82f5f606250 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -424,8 +424,8 @@ PostgreSQL documentation
 integer, it specifies the compression level.  Otherwise, it should be
 a comma-separated list of items, each of the form
 keyword or keyword=value.
-Currently, the only supported keyword is level,
-which sets the compression 

Re: refactoring basebackup.c (zstd workers)

2022-03-28 Thread Tom Lane
Robert Haas  writes:
> On Sun, Mar 27, 2022 at 1:47 PM Tom Lane  wrote:
>> Not sure if you should remove this null-check or add some other ones,
>> but I think you ought to do one or the other.

> As I hope is apparent, the first hunk of this patch is not for commit,
> and the second hunk is for commit.

Looks plausible to me.

regards, tom lane




Re: [PATCH] Enable SSL library detection via PQsslAttribute

2022-03-28 Thread Daniel Gustafsson
> On 25 Mar 2022, at 23:45, Jacob Champion  wrote:
> 
> On Fri, 2022-03-25 at 18:00 -0400, Tom Lane wrote:
>> Jacob Champion  writes:
>>> Do I need to merge my tiny test program into the libpq_pipeline tests?
>> 
>> Doesn't seem worth the trouble to me, notably because you'd
>> then have to cope with non-SSL builds too.
> 
> Fine by me.
> 
> v5 moves the docs out of the Note, as requested by Daniel.

I went over this again and I think this version is ready for committer.  Having
tried to implement a new TLS library, I would add a small comment on
PQsslAttributeNames() for this; reverse-engineering what to implement is hard
enough as it is when the special cases aren't easily identifiable.  That can
easily be done when pushed, no need for a new version IMHO.

--
Daniel Gustafsson   https://vmware.com/





Re: jsonpath syntax extensions

2022-03-28 Thread Phil Krylov

Hi,

On 2022-03-21 21:09, Greg Stark wrote:

This patch seems to be getting ignored. Like David I'm a bit puzzled
because it doesn't seem like an especially obscure or difficult patch
to review. Yet it's been multiple years without even a superficial
"does it meet the coding requirements" review let alone a design
review.

Can we get a volunteer to at least give it a quick once-over? I don't
think it's ideal to be doing this in the last CF but neither is it
very appetizing to just shift it to the next CF without a review after
two years...


I have just one suggestion: the object subscription syntax, as
in '$["keyA","keyB"]', should probably not require the 'pg' prefix, as it is
part of the original JSONPath (https://goessner.net/articles/JsonPath/) and
is supported in multiple other implementations.


6. Object subscription syntax.  This gives us the ability to specify which
key to extract at runtime.  The syntax is the same as the ordinary array
subscription syntax.

 -- non-existent $.x is simply skipped in lax mode
 SELECT jsonb_path_query('{"a": "b", "b": "c"}', 'pg $[$.a, "x", 
"a"]');

  jsonb_path_query
 --
  "c"
  "b"


The variable reference support ('pg $[$.a]') probably _is_ a 
PostgreSQL-specific extension, though.
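
For comparison, a minimal sketch of the same kind of lookup without a
variable reference, which is the part plain JSONPath covers (still with
the patch's 'pg' prefix):

SELECT jsonb_path_query('{"a": "b", "b": "c"}', 'pg $["a","b"]');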


-- Ph.




Re: Postgres restart in the middle of exclusive backup and the presence of backup_label file

2022-03-28 Thread Stephen Frost
Greetings,

* Nathan Bossart (nathandboss...@gmail.com) wrote:
> On Thu, Mar 10, 2022 at 07:13:14PM -0500, Chapman Flack wrote:
> > Looks like this change to an example in func.sgml is not quite right:
> > 
> > -postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());
> > +postgres=# SELECT * FROM pg_walfile_name_offset(pg_backup_stop());
> > 
> > pg_backup_stop returns a record now, not just lsn. So this works for me:
> > 
> > +postgres=# SELECT * FROM pg_walfile_name_offset((pg_backup_stop()).lsn);
> 
> Ah, good catch.  I made this change in v7.  I considered doing something
> like this
> 
>   SELECT w.* FROM pg_backup_stop() b, pg_walfile_name_offset(b.lsn) w;
> 
> but I think your suggestion is simpler.

I tend to agree.  More below.

> Subject: [PATCH v7 1/1] remove exclusive backup mode
> diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
> index 0d69851bb1..c8b914c1aa 100644
> --- a/doc/src/sgml/backup.sgml
> +++ b/doc/src/sgml/backup.sgml
> @@ -881,19 +873,19 @@ test ! -f 
> /mnt/server/archivedir/000100A90065  cp pg_wal/0
> 
>  
>   Connect to the server (it does not matter which database) as a user with
> - rights to run pg_start_backup (superuser, or a user who has been granted
> + rights to run pg_backup_start (superuser, or a user who has been granted
>   EXECUTE on the function) and issue the command:
>  
> -SELECT pg_start_backup('label', false, false);
> +SELECT pg_backup_start(label => 'label', fast => false);
>  
>   where label is any string you want to use to uniquely
>   identify this backup operation. The connection
> - calling pg_start_backup must be maintained until the end of
> + calling pg_backup_start must be maintained until the end of
>   the backup, or the backup will be automatically aborted.
>  
>  
>  
> - By default, pg_start_backup can take a long time to finish.
> + By default, pg_backup_start can take a long time to finish.
>   This is because it performs a checkpoint, and the I/O
>   required for the checkpoint will be spread out over a significant
>   period of time, by default half your inter-checkpoint interval

Hrmpf.  Not this patch's fault, but the default isn't 'half your
inter-checkpoint interval' anymore (checkpoint_completion_target was
changed to 0.9 ... let's not discuss who did that and missed fixing
this).  Overall though, maybe we should reword this to avoid having to
remember to change it again if we ever change the completion target
again?  Also it seems to imply that pg_backup_start is going to run its
own independent checkpoint or something, which isn't the typical case.
I generally explain what is going on here like this:

By default, pg_backup_start will wait for the next
checkpoint to complete (see ref: checkpoint_timeout ... maybe also
wal-configuration.html).
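
A minimal sketch of the difference, using the new signature:

-- waits for the next spread-out checkpoint to complete
SELECT pg_backup_start(label => 'nightly', fast => false);
-- requests an immediate checkpoint instead
SELECT pg_backup_start(label => 'nightly', fast => true);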

> @@ -937,7 +925,7 @@ SELECT * FROM pg_stop_backup(false, true);
>   ready to archive.
>  
>  
> - The pg_stop_backup will return one row with three
> + The pg_backup_stop will return one row with three
>   values. The second of these fields should be written to a file named
>   backup_label in the root directory of the backup. The
>   third field should be written to a file named

I get that we have  and , but those are just
formatting and don't show up in the actual text, and so it ends up
being:

The pg_backup_stop will return

Which doesn't sound quite right to me.  I'd say we either drop 'The' or
add 'function' after.  I realize that this patch is mostly doing a
s/pg_stop_backup/pg_backup_stop/, but, still.

> diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
> index 8a802fb225..73096708cc 100644
> --- a/doc/src/sgml/func.sgml
> +++ b/doc/src/sgml/func.sgml
> @@ -25732,24 +25715,19 @@ LOG:  Grand total: 1651920 bytes in 201 blocks; 
> 622360 free (88 chunks); 1029560
>  spcmapfile text )
> 
> 
> -Finishes performing an exclusive or non-exclusive on-line backup.
> -The exclusive parameter must match the
> -previous pg_start_backup call.
> -In an exclusive backup, pg_stop_backup removes
> -the backup label file and, if it exists, the tablespace map file
> -created by pg_start_backup.  In a non-exclusive
> -backup, the desired contents of these files are returned as part of
> +Finishes performing an on-line backup.  The desired contents of the
> +backup label file and the tablespace map file are returned as part of
>  the result of the function, and should be written to files in the
>  backup area (not in the data directory).
> 

Given that this is a crucial part, which the exclusive mode has wrong,
I'd be a bit more forceful about it, eg:

The contents of the backup label file and the tablespace map file must
be stored as part of the backup and must NOT be stored in the live data
directory (doing so 

Re: Temporary tables versus wraparound... again

2022-03-28 Thread Andres Freund
Hi,

On 2022-03-28 16:11:55 -0400, Greg Stark wrote:
> From 4515075b644d1e38920eb5bdaaa898e1698510a8 Mon Sep 17 00:00:00 2001
> From: Greg Stark 
> Date: Tue, 22 Mar 2022 15:51:32 -0400
Subject: [PATCH v4 1/2] Update relfrozenxid when truncating temp tables
> 
Make ON COMMIT DELETE ROWS reset relfrozenxid and other table stats
> like normal truncate. Otherwise even typical short-lived transactions
> using temporary tables can easily cause them to reach relfrozenxid.

Might be worth mentioning that ON COMMIT DELETE is implemented as truncating
tables. If we actually implemented it as deleting rows, it'd not at all be
correct to reset relfrozenxid.


> Also add warnings when old temporary tables are found to still be in
> use during autovacuum. Long lived sessions using temporary tables are
> required to vacuum them themselves.

I'd do that in a separate patch.


> +/*
> + * Reset the relfrozenxid and other stats to the same values used when
> + * creating tables. This is used after non-transactional truncation.
> + *
> + * This reduces the need for long-running programs to vacuum their own
> + * temporary tables (since they're not covered by autovacuum) at least in the
> + * case where they're ON COMMIT DELETE ROWS.
> + *
> + * see also src/backend/commands/vacuum.c vac_update_relstats()
> + * also see AddNewRelationTuple() above
> + */
> +
> +static void
> +ResetVacStats(Relation rel)
> +{
> + HeapTuple   ctup;
> + Form_pg_class pgcform;
> + Relation classRel;
> +
> + /* Fetch a copy of the tuple to scribble on */
> + classRel = table_open(RelationRelationId, RowExclusiveLock);
> + ctup = SearchSysCacheCopy1(RELOID, ObjectIdGetDatum(RelationGetRelid(rel)));
>
> + if (!HeapTupleIsValid(ctup))
> + elog(ERROR, "pg_class entry for relid %u vanished during truncation",
> +  RelationGetRelid(rel));
> + pgcform = (Form_pg_class) GETSTRUCT(ctup);
> +
> + /*
> +  * Update relfrozenxid
> +  */
> +
> + pgcform->relpages = 0;
> + pgcform->reltuples = -1;
> + pgcform->relallvisible = 0;
> + pgcform->relfrozenxid = RecentXmin;

Hm. Is RecentXmin guaranteed to be valid at this point?


> + pgcform->relminmxid = GetOldestMultiXactId();

Ugh. That's pretty expensive for something now done at a much higher rate than
before.


> @@ -2113,20 +2126,31 @@ do_autovacuum(void)
>* Remember it so we can try to delete it later.
>*/
>   orphan_oids = lappend_oid(orphan_oids, relid);
> + } else if (temp_status == TEMP_NAMESPACE_NOT_TEMP) {
> + elog(LOG, "autovacuum: found temporary table \"%s.%s.%s\" in non-temporary namespace",
> +  get_database_name(MyDatabaseId),
> +  get_namespace_name(classForm->relnamespace),
> +  NameStr(classForm->relname));
> + } else if (temp_status == TEMP_NAMESPACE_IN_USE && wraparound) {

we put else if on a separate line from }. And { also is always on a separate
line.



Greetings,

Andres Freund




Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2022-03-28 Thread Peter Geoghegan
On Mon, Mar 28, 2022 at 12:40 PM Greg Stark  wrote:
> I'm seeing a recovery test failure. Not sure if this represents an
> actual bug or just a test that needs to be adjusted for the new
> behaviour.
>
> https://cirrus-ci.com/task/5711008294502400

I doubt that the patch's use of pg_memory_barrier() in places like
_bt_killitems() is correct. There is no way to know for sure if this
novel new lockless algorithm is correct or not, since it isn't
explained anywhere.

The existing use of memory barriers is pretty much limited to a
handful of performance critical code paths, none of which are in
access method code that reads from shared_buffers. So this is not a
minor oversight.

-- 
Peter Geoghegan




Re: [PATCH] Accept IP addresses in server certificate SANs

2022-03-28 Thread Jacob Champion
On Mon, 2022-03-28 at 11:17 +0200, Daniel Gustafsson wrote:
> Fixing up the switch_server_cert() calls and using default_ssl_connstr makes
> the test pass for me.  The required fixes are in the supplied 0004 diff, I 
> kept
> them separate to allow the original author to incorporate them without having
> to dig them out to see what changed (named to match the git format-patch 
> output
> since I think the CFBot just applies the patches in alphabetical order).

Thanks! Those changes look good to me; I've folded them into v11. This
is rebased on a newer HEAD so it should fix the apply failures that
Greg pointed out.

--Jacob
From 0025dcecab2d05c30e118105ea7d461ce98a8466 Mon Sep 17 00:00:00 2001
From: Jacob Champion 
Date: Wed, 24 Nov 2021 14:46:11 -0800
Subject: [PATCH v11 1/3] Move inet_net_pton() to src/port

This will be helpful for IP address verification in libpq.
---
 src/backend/utils/adt/Makefile  |  1 -
 src/include/port.h  |  3 +++
 src/include/utils/builtins.h|  4 
 src/port/Makefile   |  1 +
 src/{backend/utils/adt => port}/inet_net_pton.c | 16 +++-
 src/tools/msvc/Mkvcbuild.pm |  2 +-
 6 files changed, 20 insertions(+), 7 deletions(-)
 rename src/{backend/utils/adt => port}/inet_net_pton.c (96%)

diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 7c722ea2ce..ccaa46babf 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -44,7 +44,6 @@ OBJS = \
 	geo_spgist.o \
 	hbafuncs.o \
 	inet_cidr_ntop.o \
-	inet_net_pton.o \
 	int.o \
 	int8.o \
 	json.o \
diff --git a/src/include/port.h b/src/include/port.h
index 3d103a2b31..2852e5b58b 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -515,6 +515,9 @@ extern int	pg_codepage_to_encoding(UINT cp);
 extern char *pg_inet_net_ntop(int af, const void *src, int bits,
 			  char *dst, size_t size);
 
+/* port/inet_net_pton.c */
+extern int	pg_inet_net_pton(int af, const char *src, void *dst, size_t size);
+
 /* port/pg_strong_random.c */
 extern void pg_strong_random_init(void);
 extern bool pg_strong_random(void *buf, size_t len);
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 666e545496..67b9326dc8 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -93,10 +93,6 @@ extern int	xidLogicalComparator(const void *arg1, const void *arg2);
 extern char *pg_inet_cidr_ntop(int af, const void *src, int bits,
 			   char *dst, size_t size);
 
-/* inet_net_pton.c */
-extern int	pg_inet_net_pton(int af, const char *src,
-			 void *dst, size_t size);
-
 /* network.c */
 extern double convert_network_to_scalar(Datum value, Oid typid, bool *failure);
 extern Datum network_scan_first(Datum in);
diff --git a/src/port/Makefile b/src/port/Makefile
index bfe1feb0d4..c3ae7b3d5c 100644
--- a/src/port/Makefile
+++ b/src/port/Makefile
@@ -43,6 +43,7 @@ OBJS = \
 	bsearch_arg.o \
 	chklocale.o \
 	inet_net_ntop.o \
+	inet_net_pton.o \
 	noblock.o \
 	path.o \
 	pg_bitutils.o \
diff --git a/src/backend/utils/adt/inet_net_pton.c b/src/port/inet_net_pton.c
similarity index 96%
rename from src/backend/utils/adt/inet_net_pton.c
rename to src/port/inet_net_pton.c
index d3221a1313..bae50ba67e 100644
--- a/src/backend/utils/adt/inet_net_pton.c
+++ b/src/port/inet_net_pton.c
@@ -14,14 +14,18 @@
  * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT
  * OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
  *
- *	  src/backend/utils/adt/inet_net_pton.c
+ *	  src/port/inet_net_pton.c
  */
 
 #if defined(LIBC_SCCS) && !defined(lint)
 static const char rcsid[] = "Id: inet_net_pton.c,v 1.4.2.3 2004/03/17 00:40:11 marka Exp $";
 #endif
 
+#ifndef FRONTEND
 #include "postgres.h"
+#else
+#include "postgres_fe.h"
+#endif
 
 #include 
 #include 
@@ -29,9 +33,19 @@ static const char rcsid[] = "Id: inet_net_pton.c,v 1.4.2.3 2004/03/17 00:40:11 m
 #include 
 #include 
 
+#ifndef FRONTEND
 #include "utils/builtins.h" /* pgrminclude ignore */	/* needed on some
 		 * platforms */
 #include "utils/inet.h"
+#else
+/*
+ * In a frontend build, we can't include inet.h, but we still need to have
+ * sensible definitions of these two constants.  Note that pg_inet_net_ntop()
+ * assumes that PGSQL_AF_INET is equal to AF_INET.
+ */
+#define PGSQL_AF_INET	(AF_INET + 0)
+#define PGSQL_AF_INET6	(AF_INET + 1)
+#endif
 
 
 static int	inet_net_pton_ipv4(const char *src, u_char *dst);
diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index de8676d339..8491e31adb 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -100,7 +100,7 @@ sub mkvcbuild
 
 	our @pgportfiles = qw(
 	  chklocale.c explicit_bzero.c fls.c getpeereid.c getrusage.c inet_aton.c
-	  getaddrinfo.c gettimeofday.c inet_net_ntop.c kill.c open.c
+	  getaddrinfo.c gettimeofday.c inet_net_ntop.c 

Re: range_agg with multirange inputs

2022-03-28 Thread Greg Stark
Fwiw the cfbot is failing due to a duplicate OID. Traditionally we
didn't treat duplicate OIDs as reason to reject a patch because
they're inevitable as other patches get committed and the committer
can just renumber them.

I think the cfbot kind of changes this calculus since it's a pain to lose
the visibility into whether the rest of the tests are passing that the
cfbot normally gives us.

If it's not too much of a hassle, could you renumber and resubmit the patch
with an updated OID?

[10:54:57.606] su postgres -c "make -s -j${BUILD_JOBS} world-bin"
[10:54:57.927] Duplicate OIDs detected:
[10:54:57.927] 8000




Re: Temporary tables versus wraparound... again

2022-03-28 Thread Greg Stark
I had to rebase this again after Tom's cleanup of heap.c removing some includes.

I had to re-add snapmgr to access RecentXmin. It occurs to me to ask
whether RecentXmin is actually guaranteed to be set. I haven't
checked. I thought it was set when the first snapshot was taken and
presumably even if it's a non-transactional truncate we're still in a
transaction?

The patch also added heapam.h to heap.c which might seem like a layer
violation. I think it's ok since it's just to be able to update the
catalog (heap_inplace_update is in heapam.h).
From f8972baecbe50da9cb4265a2debabeb94cd32b47 Mon Sep 17 00:00:00 2001
From: Greg Stark 
Date: Tue, 22 Mar 2022 15:54:59 -0400
Subject: [PATCH v4 2/2] Add test for truncating temp tables advancing
 relfrozenxid

This test depends on other transactions not running at the same time
so that relfrozenxid can advance, so it has to be moved to its own
schedule.
---
 src/test/regress/expected/temp.out | 21 +
 src/test/regress/parallel_schedule | 10 +++---
 src/test/regress/sql/temp.sql  | 19 +++
 3 files changed, 47 insertions(+), 3 deletions(-)

diff --git a/src/test/regress/expected/temp.out b/src/test/regress/expected/temp.out
index a5b3ed34a3..1fee5521af 100644
--- a/src/test/regress/expected/temp.out
+++ b/src/test/regress/expected/temp.out
@@ -82,6 +82,27 @@ SELECT * FROM temptest;
 -
 (0 rows)
 
+DROP TABLE temptest;
+-- Test that ON COMMIT DELETE ROWS resets the relfrozenxid when the
+-- table is truncated. This requires this test not be run in parallel
+-- with other tests as concurrent transactions will hold back the
+-- globalxmin
+CREATE TEMP TABLE temptest(col int) ON COMMIT DELETE ROWS;
+SELECT relpages, reltuples, relallvisible, relfrozenxid FROM pg_class where oid = 'temptest'::regclass \gset old_
+BEGIN;
+INSERT INTO temptest (select generate_series(1,1000));
+ANALYZE temptest; -- update relpages, reltuples
+COMMIT;
+SELECT relpages, reltuples, relallvisible, relfrozenxid FROM pg_class where oid = 'temptest'::regclass \gset new_
+SELECT :old_relpages  = :new_relpages  AS pages_reset,
+   :old_reltuples = :new_reltuples AS tuples_reset,
+   :old_relallvisible = :new_relallvisible AS allvisible_reset,
+   :old_relfrozenxid <> :new_relfrozenxid  AS frozenxid_advanced;
+ pages_reset | tuples_reset | allvisible_reset | frozenxid_advanced 
+-+--+--+
+ t   | t| t| t
+(1 row)
+
 DROP TABLE temptest;
 -- Test ON COMMIT DROP
 BEGIN;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 58fab1de1a..86cdd2ddde 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -116,10 +116,14 @@ test: json jsonb json_encoding jsonpath jsonpath_encoding jsonb_jsonpath sqljson
 # --
 # Another group of parallel tests
 # with depends on create_misc
-# NB: temp.sql does a reconnect which transiently uses 2 connections,
-# so keep this parallel group to at most 19 tests
 # --
-test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
+test: plancache limit plpgsql copy2 domain rangefuncs prepare conversion truncate alter_table sequence polymorphism rowtypes returning largeobject with xml
+
+# --
+# Run this alone because it transiently uses 2 connections and also
+# tests relfrozenxid advances when truncating temp tables
+# --
+test: temp
 
 # --
 # Another group of parallel tests
diff --git a/src/test/regress/sql/temp.sql b/src/test/regress/sql/temp.sql
index 424d12b283..5f0c39b5e7 100644
--- a/src/test/regress/sql/temp.sql
+++ b/src/test/regress/sql/temp.sql
@@ -79,6 +79,25 @@ SELECT * FROM temptest;
 
 DROP TABLE temptest;
 
+-- Test that ON COMMIT DELETE ROWS resets the relfrozenxid when the
+-- table is truncated. This requires this test not be run in parallel
+-- with other tests as concurrent transactions will hold back the
+-- globalxmin
+CREATE TEMP TABLE temptest(col int) ON COMMIT DELETE ROWS;
+
+SELECT relpages, reltuples, relallvisible, relfrozenxid FROM pg_class where oid = 'temptest'::regclass \gset old_
+BEGIN;
+INSERT INTO temptest (select generate_series(1,1000));
+ANALYZE temptest; -- update relpages, reltuples
+COMMIT;
+SELECT relpages, reltuples, relallvisible, relfrozenxid FROM pg_class where oid = 'temptest'::regclass \gset new_
+SELECT :old_relpages  = :new_relpages  AS pages_reset,
+   :old_reltuples = :new_reltuples AS tuples_reset,
+   :old_relallvisible = :new_relallvisible AS allvisible_reset,
+   :old_relfrozenxid <> :new_relfrozenxid  AS frozenxid_advanced;
+
+DROP TABLE temptest;
+
 -- Test ON COMMIT DROP
 
 BEGIN;
-- 
2.35.1

From 4515075b644d1e38920eb5bdaaa898e1698510a8 Mon Sep 17 00:00:00 2001
From: Greg Stark 
Date: Tue, 22 Mar 2022 15:51:32 

Re: Add parameter jit_warn_above_fraction

2022-03-28 Thread Magnus Hagander
On Tue, Mar 22, 2022 at 12:50 AM Andres Freund  wrote:

> Hi,
>
> On 2022-03-07 13:10:32 +0100, Magnus Hagander wrote:
> > Meanwhile here is an updated based on your other comments above, as
> > well as those from Julien.
>
> This fails on cfbot, due to compiler warnings:
> https://cirrus-ci.com/task/5127667648299008?logs=mingw_cross_warning#L390


Huh. That's annoying. I forgot int64 is %ld on linux and %lld on Windows :/

PFA a fix for that.
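
For reference, once applied the GUC is set like this (a minimal sketch;
0.5 means "warn when JIT takes more than half of the total runtime"):

SET jit_warn_above_fraction = 0.5;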

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 05df48131d..4831fc362c 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -6618,6 +6618,21 @@ local0.*/var/log/postgresql

  
 
+ 
+  jit_warn_above_fraction (floating point)
+  
+   jit_warn_above_fraction configuration parameter
+  
+  
+   
+
+ Causes a warning to be written to the log if the time spent on JIT (see )
+ goes above this fraction of the total query runtime.
+ A value of 0 (the default) disables the warning.
+
+   
+ 
+
  
   log_startup_progress_interval (integer)
   
diff --git a/src/backend/tcop/fastpath.c b/src/backend/tcop/fastpath.c
index d429aa4663..a1bff893a3 100644
--- a/src/backend/tcop/fastpath.c
+++ b/src/backend/tcop/fastpath.c
@@ -196,6 +196,7 @@ HandleFunctionRequest(StringInfo msgBuf)
 	struct fp_info *fip;
 	bool		callit;
 	bool		was_logged = false;
+	int			msecs;
 	char		msec_str[32];
 
 	/*
@@ -305,7 +306,7 @@ HandleFunctionRequest(StringInfo msgBuf)
 	/*
 	 * Emit duration logging if appropriate.
 	 */
-	switch (check_log_duration(msec_str, was_logged))
+	switch (check_log_duration(&msecs, msec_str, was_logged))
 	{
 		case 1:
 			ereport(LOG,
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index ba2fcfeb4a..cd7c076b7c 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -998,7 +998,9 @@ exec_simple_query(const char *query_string)
 	bool		save_log_statement_stats = log_statement_stats;
 	bool		was_logged = false;
 	bool		use_implicit_block;
+	int			msecs;
 	char		msec_str[32];
+	int64		jit_time = 0;
 
 	/*
 	 * Report query to various monitoring facilities.
@@ -1257,6 +1259,16 @@ exec_simple_query(const char *query_string)
 
 		receiver->rDestroy(receiver);
 
+		/* Collect JIT timings in case it's active */
+		if (jit_enabled && jit_warn_above_fraction > 0 && portal->queryDesc && portal->queryDesc->estate->es_jit)
+		{
+			jit_time +=
+				INSTR_TIME_GET_MILLISEC(portal->queryDesc->estate->es_jit->instr.generation_counter) +
+				INSTR_TIME_GET_MILLISEC(portal->queryDesc->estate->es_jit->instr.inlining_counter) +
+				INSTR_TIME_GET_MILLISEC(portal->queryDesc->estate->es_jit->instr.optimization_counter) +
+				INSTR_TIME_GET_MILLISEC(portal->queryDesc->estate->es_jit->instr.emission_counter);
+		}
+
 		PortalDrop(portal, false);
 
 		if (lnext(parsetree_list, parsetree_item) == NULL)
@@ -1327,7 +1339,7 @@ exec_simple_query(const char *query_string)
 	/*
 	 * Emit duration logging if appropriate.
 	 */
-	switch (check_log_duration(msec_str, was_logged))
+	switch (check_log_duration(&msecs, msec_str, was_logged))
 	{
 		case 1:
 			ereport(LOG,
@@ -1343,6 +1355,16 @@ exec_simple_query(const char *query_string)
 			break;
 	}
 
+	if (jit_enabled && jit_warn_above_fraction > 0)
+	{
+		if (jit_time > msecs * jit_warn_above_fraction)
+		{
+			ereport(WARNING,
+	(errmsg("JIT total processing time was %lld ms of %d ms",
+			(long long) jit_time, msecs)));
+		}
+	}
+
 	if (save_log_statement_stats)
 		ShowUsage("QUERY STATISTICS");
 
@@ -1370,6 +1392,7 @@ exec_parse_message(const char *query_string,	/* string to execute */
 	CachedPlanSource *psrc;
 	bool		is_named;
 	bool		save_log_statement_stats = log_statement_stats;
+	int			msecs;
 	char		msec_str[32];
 
 	/*
@@ -1563,7 +1586,7 @@ exec_parse_message(const char *query_string,	/* string to execute */
 	/*
 	 * Emit duration logging if appropriate.
 	 */
-	switch (check_log_duration(msec_str, false))
+	switch (check_log_duration(&msecs, msec_str, false))
 	{
 		case 1:
 			ereport(LOG,
@@ -1610,6 +1633,7 @@ exec_bind_message(StringInfo input_message)
 	MemoryContext oldContext;
 	bool		save_log_statement_stats = log_statement_stats;
 	bool		snapshot_set = false;
+	int			msecs;
 	char		msec_str[32];
 	ParamsErrorCbData params_data;
 	ErrorContextCallback params_errcxt;
@@ -2022,7 +2046,7 @@ exec_bind_message(StringInfo input_message)
 	/*
 	 * Emit duration logging if appropriate.
 	 */
-	switch (check_log_duration(msec_str, false))
+	switch (check_log_duration(&msecs, msec_str, false))
 	{
 		case 1:
 			ereport(LOG,
@@ -2068,6 +2092,7 @@ exec_execute_message(const char *portal_name, long max_rows)
 	bool		is_xact_command;
 	bool		execute_is_fetch;
 	bool		was_logged = false;
+	int			msecs;
 	char		msec_str[32];
 	

Re: [PATCH v2] use has_privs_for_role for predefined roles

2022-03-28 Thread Joe Conway

On 3/28/22 15:56, Robert Haas wrote:

On Mon, Mar 21, 2022 at 4:15 PM Joe Conway  wrote:

Robert -- any opinion on this? If I am not mistaken it is code that you
are actively working on.


Woops, I only just saw this. I don't mind if you want to change the
calls to is_member_of_role() in basebackup_server.c and
basebackup_to_shell.c to has_privs_of_role().


No worries -- I will take care of that shortly.


However, it's not clear to me why it's different than the calls we
have in other places, like calculate_database_size() and the
relatively widely-used check_is_member_of_role().


I will have to go refresh my memory, but when I looked at those sites 
closely it all made sense to me.


I think most if not all of them were checking for the ability to switch 
to the other role, not actually checking for privileges by virtue of 
belonging to that role.
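
A minimal sketch of that distinction (bob is a hypothetical role, with 
pg_monitor standing in for any predefined role):

CREATE ROLE bob LOGIN NOINHERIT;
GRANT pg_monitor TO bob;
-- is_member_of_role():  true, bob may SET ROLE pg_monitor
-- has_privs_of_role(): false, bob does not inherit its privileges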



As long as we have a bunch of different practices in different parts
of the code base I can't see people getting this right consistently
... leaving aside any possible disagreement about which way is
"right".
When I take the next pass I can consider whether additional comments 
will help and report back.


Joe
--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development




Re: [RFC] building postgres with meson -v8

2022-03-28 Thread Andres Freund
Hi,

On 2022-03-25 10:01:09 +0100, Peter Eisentraut wrote:
> On 22.03.22 03:22, Andres Freund wrote:
> > Attached is v8. It's just a rebase to resolve conflicts with recent changes.
> 
> I have committed the DLSUFFIX refactoring, and also a stripped-down version
> of the patch that makes update-unicode work with vpath.  This takes care of
> patches 0007 and 0009.

Thanks!


> The only other thing IMO that might be worth addressing in PG15 is the
> snowball scripts refactoring (0002 and 0003), but that doesn't seem quite
> ready yet.  (At least, it would need to be integrated into the distprep
> target, since it adds a dependency on perl.)  Maybe it's not worth it right
> now.

Not sure myself.


> With that, I suggest moving this patch set to CF 2022-07.

Done.


One thing I'd like to discuss fairly soon is what kind of approach to take for
integrating meson support. Most other projects I looked at kept parallel
buildsystems for at least a release, so that there's one round of "broad" user
feedback.

In our context it could make sense to merge meson, after a few months of
shakeup remove the current windows buildsystems, and then in release + 1
remove the make based stuff.

But we can have that discussion before the next CF, but still after
code-freeze & immediate mopup.


> A general comment on the remaining prereq patches:  We appear to be
> accumulating a mix of conventions for how "generate" scripts specify their
> output file.  Some have it built-in, some use the last argument, some use an
> option, which might be -o or --output.  Maybe we can gently work toward more
> commonality there.

Fair point.

Greetings,

Andres Freund




Re: [PATCH v2] use has_privs_for_role for predefined roles

2022-03-28 Thread Robert Haas
On Mon, Mar 21, 2022 at 4:15 PM Joe Conway  wrote:
> Robert -- any opinion on this? If I am not mistaken it is code that you
> are actively working on.

Woops, I only just saw this. I don't mind if you want to change the
calls to is_member_of_role() in basebackup_server.c and
basebackup_to_shell.c to has_privs_of_role(). However, it's not clear
to me why it's different than the calls we have in other places, like
calculate_database_size() and the relatively widely-used
check_is_member_of_role(). As long as we have a bunch of different
practices in different parts of the code base I can't see people
getting this right consistently ... leaving aside any possible
disagreement about which way is "right".

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: refactoring basebackup.c (zstd workers)

2022-03-28 Thread Robert Haas
On Sun, Mar 27, 2022 at 1:47 PM Tom Lane  wrote:
> Coverity has a nitpick about this:
>
> /srv/coverity/git/pgsql-git/postgresql/src/common/backup_compression.c: 194 
> in parse_bc_specification()
> 193 /* Advance to next entry and loop around. */
> >>> CID 1503251:  Null pointer dereferences  (REVERSE_INULL)
> >>> Null-checking "vend" suggests that it may be null, but it has already 
> >>> been dereferenced on all paths leading to the check.
> 194 specification = vend == NULL ? kwend + 1 : vend + 1;
> 195 }
> 196 }
>
> Not sure if you should remove this null-check or add some other ones,
> but I think you ought to do one or the other.

Yes, I think this is buggy.  I think there's only a theoretical bug
right now, because the only keyword we have is "level" and that
requires a value. But if I add an example keyword that does not
require an associated value (as demonstrated in the attached patch)
and do something like pg_basebackup -cfast -D whatever --compress
lz4:example, then the present code will dereference "vend" even though
it's NULL, which is not good. The attached patch also shows how I
think that should be fixed.

As I hope is apparent, the first hunk of this patch is not for commit,
and the second hunk is for commit.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


coverity-backup-compression-fix.patch
Description: Binary data


Re: CFBot has failures on 027_stream_regress for a number of patches

2022-03-28 Thread Andres Freund
Hi,

On 2022-03-28 15:28:20 -0400, Greg Stark wrote:
> I notice a number of patches have a failure in the cfbot on the
> 027_stream_regress test. I think these are related to a bug in that
> test being discussed in a thread somewhere though I don't have it
> handy. Is that right?

It looks more like a bug in the general regression tests that are more likely
to be triggered by 027_stream_regress.


> I think it doesn't indicate anything wrong with the individual patches right?

Depends on the patch, I think. It certainly also has triggered for bugs in
patches...


> For an example the "Add checkpoint and redo LSN to LogCheckpointEnd
> log message" patch which is a fairly simple patch adding a few details
> to some log messages is failing with this which doesn't make much
> sense since it shouldn't be affecting the actual recovery at all:

027_stream_regress.pl runs the normal regression tests via streaming rep - the
failure here is on the primary (i.e. doesn't involve recovery itself). Due to
the smaller shared_buffers setting the test uses, some edge cases are
triggered more often.


> diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/cluster.out
> /tmp/cirrus-ci-build/src/test/recovery/tmp_check/results/cluster.out
> --- /tmp/cirrus-ci-build/src/test/regress/expected/cluster.out
> 2022-03-28 01:18:36.126774178 +
> +++ /tmp/cirrus-ci-build/src/test/recovery/tmp_check/results/cluster.out
> 2022-03-28 01:23:24.489517050 +
> @@ -467,7 +467,8 @@
>  where row(hundred, thousand, tenthous) <= row(lhundred, lthousand, ltenthous);
>   hundred | lhundred | thousand | lthousand | tenthous | ltenthous
>  -+--+--+---+--+---
> -(0 rows)
> +   0 |   99 |0 |   999 |0 |
> +(1 row)

This one is: 
https://postgr.es/m/CA%2BhUKGLV3wzmYFbNs%2BTZ1%2Bw0e%3Dhc61fcvrF3OmutuaPBuZMd0w%40mail.gmail.com

Greetings,

Andres Freund




Re: SQL/JSON: JSON_TABLE

2022-03-28 Thread Greg Stark
FYI I think the patch failure in the cfbot is spurious because the
cfbot got confused by Erik's patch.




Re: [PATCH] Accept IP addresses in server certificate SANs

2022-03-28 Thread Greg Stark
On Mon, 28 Mar 2022 at 05:17, Daniel Gustafsson  wrote:
>
> named to match the git format-patch output
> since I think the CFBot just applies the patches in alphabetical order).

The first patch doesn't seem to actually apply though so it doesn't
get to the subsequent patches.

http://cfbot.cputube.org/patch_37_3458.log

=== Applying patches on top of PostgreSQL commit ID
e26114c817b610424010cfbe91a743f591246ff1 ===
=== applying patch ./v10-0001-Move-inet_net_pton-to-src-port.patch
patching file src/backend/utils/adt/Makefile
Hunk #1 FAILED at 44.
1 out of 1 hunk FAILED -- saving rejects to file
src/backend/utils/adt/Makefile.rej
patching file src/include/port.h
patching file src/include/utils/builtins.h
patching file src/port/Makefile
patching file src/port/inet_net_pton.c (renamed from
src/backend/utils/adt/inet_net_pton.c)
patching file src/tools/msvc/Mkvcbuild.pm


-- 
greg




Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2022-03-28 Thread Greg Stark
On Tue, 22 Mar 2022 at 09:52, Michail Nikolaev
 wrote:
>
> Thanks for notifying me. BTW, some kind of automatic email in case of
> status change could be very helpful.

I agree but realize the cfbot is quite new and I guess the priority is
to work out any kinks before spamming people with false positives.

> New version is attached, build is passing
> (https://cirrus-ci.com/build/5599876384817152), so, moving it back to
> "ready for committer" .

I'm seeing a recovery test failure. Not sure if this represents an
actual bug or just a test that needs to be adjusted for the new
behaviour.

https://cirrus-ci.com/task/5711008294502400

[14:42:46.885] # Failed test 'no new index hint bits are set on new standby'
[14:42:46.885] # at t/027_standby_index_lp_dead.pl line 262.
[14:42:46.885] # got: '12'
[14:42:46.885] # expected: '11'
[14:42:47.147]
[14:42:47.147] # Failed test 'hint not marked as standby-safe'
[14:42:47.147] # at t/027_standby_index_lp_dead.pl line 263.
[14:42:47.147] # got: '1'
[14:42:47.147] # expected: '0'
[14:42:49.723] # Looks like you failed 2 tests of 30.
[14:42:49.750] [14:42:49] t/027_standby_index_lp_dead.pl ...
[14:42:49.761] Dubious, test returned 2 (wstat 512, 0x200)
[14:42:49.761] Failed 2/30 subtests




-- 
greg




Re: SQL/JSON: functions

2022-03-28 Thread Tom Lane
Andres Freund  writes:
> On 2022-03-28 14:57:20 -0400, Andrew Dunstan wrote:
>> That didn't help, there are no differences that matter (just #line
>> directives as I did a vpath build). :-(

> Yea. I didn't see any differences when comparing to a non-vpath build that
> runs tests successfully. Pretty weird.

Unsurprisingly, these files match what I built, too.  So the problem
is downstream of the flex/bison runs.  Baffling :-(

regards, tom lane




Re: Add checkpoint and redo LSN to LogCheckpointEnd log message

2022-03-28 Thread Greg Stark
This patch is currently showing up with a test failure in the CFBot;
however, I do *not* believe this is a bug in the patch. I think it's a
bug in that test which is being discussed elsewhere.

It's also a very short and straightforward patch that a committer
could probably make a decision about whether it's a good idea or not
and then apply it quickly if so.

Just to give people a leg up and an idea how short the patch is...
Here's the entire patch:


diff --git a/src/backend/access/transam/xlog.c
b/src/backend/access/transam/xlog.c
index ed16f279b1..b85c76d8f8 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6121,7 +6121,8 @@ LogCheckpointEnd(bool restartpoint)
  "%d WAL file(s) added, %d removed, %d recycled; "
  "write=%ld.%03d s, sync=%ld.%03d s, total=%ld.%03d s; "
  "sync files=%d, longest=%ld.%03d s, average=%ld.%03d s; "
- "distance=%d kB, estimate=%d kB",
+ "distance=%d kB, estimate=%d kB; "
+ "lsn=%X/%X, redo lsn=%X/%X",
  CheckpointStats.ckpt_bufs_written,
  (double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers,
  CheckpointStats.ckpt_segs_added,
@@ -6134,14 +6135,21 @@ LogCheckpointEnd(bool restartpoint)
  longest_msecs / 1000, (int) (longest_msecs % 1000),
  average_msecs / 1000, (int) (average_msecs % 1000),
  (int) (PrevCheckPointDistance / 1024.0),
- (int) (CheckPointDistanceEstimate / 1024.0)));
+ (int) (CheckPointDistanceEstimate / 1024.0),
+ /*
+ * ControlFileLock is not required as we are the only
+ * updater of these variables.
+ */
+ LSN_FORMAT_ARGS(ControlFile->checkPoint),
+ LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo)));
  else
  ereport(LOG,
  (errmsg("checkpoint complete: wrote %d buffers (%.1f%%); "
  "%d WAL file(s) added, %d removed, %d recycled; "
  "write=%ld.%03d s, sync=%ld.%03d s, total=%ld.%03d s; "
  "sync files=%d, longest=%ld.%03d s, average=%ld.%03d s; "
- "distance=%d kB, estimate=%d kB",
+ "distance=%d kB, estimate=%d kB; "
+ "lsn=%X/%X, redo lsn=%X/%X",
  CheckpointStats.ckpt_bufs_written,
  (double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers,
  CheckpointStats.ckpt_segs_added,
@@ -6154,7 +6162,13 @@ LogCheckpointEnd(bool restartpoint)
  longest_msecs / 1000, (int) (longest_msecs % 1000),
  average_msecs / 1000, (int) (average_msecs % 1000),
  (int) (PrevCheckPointDistance / 1024.0),
- (int) (CheckPointDistanceEstimate / 1024.0)));
+ (int) (CheckPointDistanceEstimate / 1024.0),
+ /*
+ * ControlFileLock is not required as we are the only
+ * updater of these variables.
+ */
+ LSN_FORMAT_ARGS(ControlFile->checkPoint),
+ LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo)));
 }

 /*
-- 
2.27.0




Re: [PATCH v2] use has_privs_for_role for predefined roles

2022-03-28 Thread Joe Conway

On 3/21/22 16:15, Joe Conway wrote:

On 3/20/22 12:38, Stephen Frost wrote:

Greetings,

On Sun, Mar 20, 2022 at 18:31 Joshua Brindle 
mailto:joshua.brin...@crunchydata.com>> 
wrote:


On Sun, Mar 20, 2022 at 12:27 PM Joe Conway mailto:m...@joeconway.com>> wrote:
 >
 > On 3/3/22 11:26, Joshua Brindle wrote:
 > > On Thu, Feb 10, 2022 at 2:37 PM Joe Conway mailto:m...@joeconway.com>> wrote:
 > >>
 > >> On 2/10/22 14:28, Nathan Bossart wrote:
 > >> > On Wed, Feb 09, 2022 at 04:39:11PM -0500, Joe Conway wrote:
 > >> >> On 2/9/22 13:13, Nathan Bossart wrote:
 > >> >>> I do wonder if users find the differences between
predefined roles and role
 > >> >>> attributes confusing.  INHERIT doesn't govern role
attributes, but it will
 > >> >>> govern predefined roles when this patch is applied.  Maybe
the role
 > >> >>> attribute system should eventually be deprecated in favor
of using
 > >> >>> predefined roles for everything.  Or perhaps the
predefined roles should be
 > >> >>> converted to role attributes.
 > >> >>
 > >> >> Yep, I was suggesting that the latter would have been
preferable to me while
 > >> >> Robert seemed to prefer the former. Honestly I could be
happy with either of
 > >> >> those solutions, but as I alluded to that is probably a
discussion for the
 > >> >> next development cycle since I don't see us doing that big
a change in this
 > >> >> one.
 > >> >
 > >> > I agree.  I still think Joshua's proposed patch is a
worthwhile improvement
 > >> > for v15.
 > >>
 > >> +1
 > >>
 > >> I am planning to get into it in detail this weekend. So far I have
 > >> really just ensured it merges cleanly and passes make world.
 > >
 > > Rebased patch to apply to master attached.
 >
 > Well longer than I planned, but finally took a closer look.
 >
 > I made one minor editorial fix to Joshua's patch, rebased to current
 > master, and added two missing call sites that presumably are
related to
 > recent commits for pg_basebackup.

FWIW I pinged Stephen when I saw the basebackup changes included
is_member_of and he didn't think they necessarily needed to be changed
since they aren't accessible to humans and you can't SET ROLE on a
replication connection in order to access the role where inheritance
was blocked.

Yeah, though that should really be very clearly explained in comments 
around that code as anything that uses is_member should really be viewed 
as an exception.  I also wouldn’t be against using has_privs there 
anyway and saying that you shouldn’t be trying to connect as one role on 
a replication connection and using the privs of another when you don’t 
automatically inherit those rights, but on the assumption that the 
committer intended is_member there because SET ROLE isn’t something one 
does on replication connections, I’m alright with it being as is.



Robert -- any opinion on this? If I am not mistaken it is code that you
are actively working on.


Lacking any feedback from Robert, I removed my changes related to 
basebackup and pushed Joshua's patch with one minor editorial fix. What 
to do with the basebackup changes can go on the open items list for pg15 
I guess.


Joe
--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development




CFBot has failures on 027_stream_regress for a number of patches

2022-03-28 Thread Greg Stark
I notice a number of patches have a failure in the cfbot on the
027_stream_regress test. I think these are related to a bug in that
test being discussed in a thread somewhere though I don't have it
handy. Is that right?

I think it doesn't indicate anything wrong with the individual patches, right?

For example, the "Add checkpoint and redo LSN to LogCheckpointEnd
log message" patch, which simply adds a few details
to some log messages, is failing with the diff below, which doesn't make
much sense since it shouldn't affect the actual recovery at all:


diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/cluster.out /tmp/cirrus-ci-build/src/test/recovery/tmp_check/results/cluster.out
--- /tmp/cirrus-ci-build/src/test/regress/expected/cluster.out	2022-03-28 01:18:36.126774178 +0000
+++ /tmp/cirrus-ci-build/src/test/recovery/tmp_check/results/cluster.out	2022-03-28 01:23:24.489517050 +0000
@@ -467,7 +467,8 @@
 where row(hundred, thousand, tenthous) <= row(lhundred, lthousand, ltenthous);
  hundred | lhundred | thousand | lthousand | tenthous | ltenthous
 -+--+--+---+--+---
-(0 rows)
+   0 |   99 |0 |   999 |0 |  
+(1 row)

 reset enable_indexscan;
 reset maintenance_work_mem;


-- 
greg




Re: SQL/JSON: functions

2022-03-28 Thread Andres Freund
Hi,

On 2022-03-28 14:57:20 -0400, Andrew Dunstan wrote:
> That didn't help, there are no differences that matter (just #line
> directives as I did a vpath build). :-(

Yea. I didn't see any differences when comparing to a non-vpath build that
runs tests successfully. Pretty weird.

Nikola, unless remote access turns out to be possible for one of us, could you
perhaps try to build interactively and see whether it reproduces there?

Greetings,

Andres Freund




Re: Accommodate startup process in a separate ProcState array slot instead of in MaxBackends slots.

2022-03-28 Thread Robert Haas
On Sat, Mar 26, 2022 at 2:23 AM Bharath Rupireddy  wrote:
> Thanks. It makes sense to put the comment in SInvalShmemSize.
> Attaching v2 patch. Please review it.

How about this version, which I have edited lightly for grammar?

-- 
Robert Haas
EDB: http://www.enterprisedb.com


sinvalshmemsize-comment-rmh.patch
Description: Binary data


Re: Granting SET and ALTER SYSTE privileges for GUCs

2022-03-28 Thread Tom Lane
Mark Dilger  writes:
> On Mar 28, 2022, at 11:31 AM, Tom Lane  wrote:
>> I think we probably have to trash the core-regression-tests part of
>> the patch altogether and instead use a TAP test for whatever testing
>> we want to do.  It might be all right to test SET privileges without
>> testing ALTER SYSTEM, but I'm not sure there's much point in that.

> How about putting them under src/test/modules/unsafe_tests ?

Ah, that could work; I'd forgotten about that subdirectory.

It'd still be a good idea if they didn't fail when run twice in a row.

regards, tom lane




Re: Granting SET and ALTER SYSTE privileges for GUCs

2022-03-28 Thread Tom Lane
Mark Dilger  writes:
> Version 12 also introduces \dcp (mnemonic, "Describe Configuration 
> Parameter") for listing parameters, with \dcp+ also showing the acl, like:

The fact that that code is not dry behind the ears is painfully obvious.
It's not documented in psql-ref, not tested anywhere AFAICS, and its
handling of the pattern parameter is inconsistent with every other
\d command.  The wildcard character should be * not %.  It only
accidentally fails to dump core if no pattern is given, too.

> \dcp[+] only shows "user" and "superuser" parameters:

Why make that restriction?  Also, I find it astonishing that this doesn't
show the GUC's value by default.  The non-plus form of the command seems
useless as it stands, or at least its intended use-case is so narrow
I can't see it.  If we're to have it at all, it seems like it ought to
be a reasonably useful shortcut for interrogating pg_settings.  I'd
expect the base set of columns to be name, value, and possibly unit;
then add ACL with +.  I'm not sure that GucContext belongs in this at all,
but if it does, it's a + column.

On the whole perhaps this should be taken out again; it's a bit
late in the cycle to be introducing new features, especially ones
as subject to bikeshedding as a \d command is.  My ideas about what
columns to show probably wouldn't match anyone else's ... and we
haven't even gotten to whether \dcp is an okay choice of name.

regards, tom lane




Re: [PATCH] Add extra statistics to explain for Nested Loop

2022-03-28 Thread Greg Stark
This patch got some very positive feedback and some significant amount
of work earlier in the release cycle. The feedback from Julien earlier
this month seemed pretty minor.

Ekaterina, is there any chance you'll be able to work on this this
week and do you think it has a chance of making this release? Julien,
do you think it's likely to be possible to polish for this release?

Otherwise I guess we should move it to the next CF but it seems a
shame given how much work has been done and how close it is.

On Mon, 7 Mar 2022 at 00:17, Julien Rouhaud  wrote:
>
> Hi,
>
> On Thu, Feb 03, 2022 at 12:59:03AM +0300, Ekaterina Sokolova wrote:
> >
> > I apply the new version of patch.
> >
> > I wanted to measure overheads, but could't choose correct way. Thanks for
> > idea with auto_explain.
> > I loaded it and made 10 requests of pgbench (number of clients: 1, of
> > threads: 1).
> > I'm not sure I chose the right way to measure overhead, so any suggestions
> > are welcome.
> > Current results are in file overhead_v0.txt.
> >
> > Please feel free to share your suggestions and comments. Regards,
> >
>
> >| master latency (ms) | master tps |  | new latency (ms) |   new tps
> > --
> >  1 |2,462| 406,190341 |  |   4,485  | 222,950527
> >  2 |3,877| 257,89813  |  |   4,141  | 241,493395
> >  3 |3,789| 263,935811 |  |   2,837  | 352,522297
> >  4 |3,53 | 283,310196 |  |   5,510  | 181,488203
> >  5 |3,413| 292,997363 |  |   6,475  | 154,432999
> >  6 |3,757| 266,148564 |  |   4,073  | 245,507218
> >  7 |3,752| 266,560043 |  |   3,901  | 256,331385
> >  8 |4,389| 227,847524 |  |   4,658  | 214,675196
> >  9 |4,341| 230,372282 |  |   4,220  | 236,983672
> > 10 |3,893| 256,891104 |  |   7,059  | 141,667139
> > --
> > avg|3,7203   | 275,215136 |  |   4,03   | 224,8052031
> >
> >
> > master/new latency | 0,92315 |
> > master/new tps | 1,22424 |
> >
> > new/master latency | 1,08325 |
> > new/master tps | 0,81683 |
>
> The overhead is quite significant (at least for OLTP-style workload).
>
> I think this should be done with a new InstrumentOption, like
> INSTRUMENT_LOOP_DETAILS or something like that, and set it where appropriate.
> Otherwise you will have to pay that overhead even if you won't use the new
> fields at all.  It could be EXPLAIN (ANALYZE, VERBOSE OFF), but also simply
> using pg_stat_statements which doesn't seem acceptable.
>
> One problem is that some extensions (like pg_stat_statements) can rely on
> INSTRUMENT_ALL but may or may not care about those extra counters.  Maybe we
> should remove that alias and instead provide two (like INSTRUMENT_ALL_VERBOSE
> and INSTRUMENT_ALL_SOMETHINGELSE, I don't have any bright name right now) so
> that authors can decide what they need instead of silently having such
> extension ruin the performance for no reason.
>
> About the implementation itself:
>
> +static void show_loop_info(Instrumentation *instrument, bool isworker,
> +  ExplainState *es);
>
> I think this should be done as a separate refactoring commit.
>
> +   /*
> +* this is first loop
> +*
> +* We only initialize the min values. We don't need to bother with the
> +* max, because those are 0 and the non-zero values will get updated a
> +* couple lines later.
> +*/
> +   if (instr->nloops == 0)
> +   {
> +   instr->min_t = totaltime;
> +   instr->min_tuples = instr->tuplecount;
> +   }
> +
> +   if (instr->min_t > totaltime)
> +   instr->min_t = totaltime;
> +
> +   if (instr->max_t < totaltime)
> +   instr->max_t = totaltime;
> +
> instr->ntuples += instr->tuplecount;
> +
> +   if (instr->min_tuples > instr->tuplecount)
> +   instr->min_tuples = instr->tuplecount;
> +
> +   if (instr->max_tuples < instr->tuplecount)
> +   instr->max_tuples = instr->tuplecount;
> +
> instr->nloops += 1;
>
> Why do you need to initialize min_t and min_tuples but not max_t and
> max_tuples while both will initially be 0 and possibly updated afterwards?
>
> I think you should either entirely remove this if (instr->nloops == 0) part,
> or handle the initialization in an else block (as sketched below).
>
>
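
To make that last suggestion concrete, here is a minimal sketch of the
restructured update (a hypothetical rearrangement of the code quoted above,
not taken from the patch itself):

    /* first loop: seed min and max from this iteration's numbers */
    if (instr->nloops == 0)
    {
        instr->min_t = totaltime;
        instr->max_t = totaltime;
        instr->min_tuples = instr->tuplecount;
        instr->max_tuples = instr->tuplecount;
    }
    else
    {
        /* later loops: only widen the running min/max */
        if (instr->min_t > totaltime)
            instr->min_t = totaltime;
        if (instr->max_t < totaltime)
            instr->max_t = totaltime;
        if (instr->min_tuples > instr->tuplecount)
            instr->min_tuples = instr->tuplecount;
        if (instr->max_tuples < instr->tuplecount)
            instr->max_tuples = instr->tuplecount;
    }

    instr->ntuples += instr->tuplecount;
    instr->nloops += 1;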


-- 
greg




Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

2022-03-28 Thread Robert Haas
On Mon, Mar 28, 2022 at 2:18 AM Dilip Kumar  wrote:
> > I have put similar logic in the relmap_update WAL replay as well,
>
> There was a mistake in the last patch: for the relmap update
> I checked for the missing tablespace directory, but I should have
> checked for the missing database directory, so I have fixed that.
>
> > Now, is it possible to get the FPI without smgr_create wal in other
> > cases?  If it is then that problem is orthogonal to this patch, but
> > anyway I could not find any such scenario.
>
> I have dug further into it and tried manually removing the directory
> before XLOG_FPI, but I noticed that during FPI
> XLogReadBufferExtended() takes care of creating the missing files
> using smgrcreate(), which in turn takes care of creating the missing
> directory, so I don't think we have any problem here.

I don't understand whether XLOG_RELMAP_UPDATE should be just doing
smgrcreate() as we would for most WAL records or whether it should be
adopting the new system introduced by
49d9cfc68bf4e0d32a948fe72d5a0ef7f464944e. I wrote about this concern
over here:

http://postgr.es/m/CA+TgmoYcUPL+WOJL2ZzhH=zmrhj0iOQ=icfm0suyqbbqzea...@mail.gmail.com

But apart from that question your adaptations here look reasonable to me.

--
Robert Haas
EDB: http://www.enterprisedb.com




Re: Add psql command to list constraints

2022-03-28 Thread Greg Stark
In the interests of trying to clean up the CF and keep things moving
I'll mark the patch rejected.

That doesn't mean the decision can't change or that nobody's allowed
to keep discussing it. It's just that that seems to be the decision
right now and there are too many patches queued up to keep things in a
pending state waiting for a more decisive conclusion. We can always
change it if the winds shift...




Re: SQL/JSON: functions

2022-03-28 Thread Andrew Dunstan


On 3/28/22 14:31, Nikola Ivanov wrote:
> Hi Andres,
>
> Archive with the files is attached.



That didn't help, there are no differences that matter (just #line
directives as I did a vpath build). :-(


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Patch: Code comments: why some text-handling functions are leakproof

2022-03-28 Thread Greg Stark
I'm going to mark this returned with feedback.

If you have a chance to update the patch moving the documentation to
xfunc.sgml the way Tom describes make sure to create a new commitfest
entry. I would suggest submitting the patch as a followup on this
thread so when it's added to the commitfest it links to this whole
discussion.


On Mon, 28 Feb 2022 at 17:12, Tom Lane  wrote:
>
> Robert Haas  writes:
> > On Tue, Jan 11, 2022 at 2:07 AM Gurjeet Singh  wrote:
> >> This is more or less a verbatim copy of Tom's comment in email thread at 
> >> [1].
> >>
> >> I could not find an appropriate spot to place these comments, so I placed 
> >> them on bttextcmp() function, The only other place that I could see we can 
> >> place these comments is in the file src/backend/optimizer/README, because 
> >> there is some consideration given to leakproof functions in optimizer 
> >> docs. But these comments seem quite out of place in optimizer docs.
>
> > It doesn't seem particularly likely that someone who is thinking about
> > changing this in the future would notice the comment in the place
> > where you propose to put it, nor that they would read the optimizer
> > README.
>
> Agreed.  I think if we wanted to make an upgrade in the way function
> leakproofness is documented, we ought to add a section about it in
> xfunc.sgml, adjacent to the one about function volatility categories.
> This could perhaps consolidate some of the existing documentation mentions
> of leakproofness, as well as adding text similar to what Gurjeet suggests.
>
> > Furthermore, I don't know that everyone agrees with Tom about this. I
> > do agree that it's more important to mark relational operators
> > leakproof than other things, and I also agree that conservatism is
> > warranted. But that does not mean that someone could not make a
> > compelling argument for marking other functions leakproof.
>
> ISTM the proposed text does a reasonable job of explaining why
> we made the decisions currently embedded in pg_proc.proleakproof.
> If we make some other decisions in future, updating the rationale
> in the docs would be an appropriate part of that.
>
> regards, tom lane
>
>


--
greg




Re: Granting SET and ALTER SYSTE privileges for GUCs

2022-03-28 Thread Mark Dilger



> On Mar 28, 2022, at 11:31 AM, Tom Lane  wrote:
> 
> I think we probably have to trash the core-regression-tests part of
> the patch altogether and instead use a TAP test for whatever testing
> we want to do.  It might be all right to test SET privileges without
> testing ALTER SYSTEM, but I'm not sure there's much point in that.

How about putting them under src/test/modules/unsafe_tests ?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Proposal: Support custom authentication methods using hooks

2022-03-28 Thread Stephen Frost
Greetings,

* Andres Freund (and...@anarazel.de) wrote:
> On 2022-03-18 15:23:21 -0400, Stephen Frost wrote:
> > * Andres Freund (and...@anarazel.de) wrote:
> > > On 2022-03-18 00:45:49 -0400, Stephen Frost wrote:
> > > > > > I also don’t think that I agree that it’s acceptable to only have the
> > > > > > ability to extend the authentication on the server side as that implies a
> > > > > > whole bunch of client-side work that goes above and beyond just
> > > > > > implementing an authentication system if it’s not possible to leverage
> > > > > > libpq (which nearly all languages out there use..).  Not addressing that
> > > > > > side of it also makes me concerned that whatever we do here won’t be
> > > > > > right/enough/etc.
> > > > >
> > > > > You're skipping over my point of everything that can be made to use
> > > > > SASL-SCRAM-256 just working with the existing libpq? Again?
> > > 
> > > > The plan is to make scram pluggable on the client side?  That isn’t 
> > > > what’s
> > > > been proposed here that I’ve seen.
> > > 
> > > Libpq and many other drivers already speak SASL-SCRAM-256. If you're
> > > implementing an authentication method that wants to store secrets outside of
> > > postgres (centrally in another postgres instance, ldap, mysql or whatnot) you
> > > don't need to make scram pluggable client-side. From the perspective of the
> > > client it'd look *exactly* like the normal auth flow with the server.
> > 
> > Then the only way to get support for something like, say, OAUTH, is to
> > modify the core code.
> 
> It wasn't the goal of the patch to add oauth support. I think it's a good sign
> that the patch allowed Jacob to move the server side code in an extension, but
> it's a benefit not a requirement.

I disagree that it's actually at all useful when considering this
change, specifically because it doesn't move the actual goalposts at
all when it comes to adding support for new authentication methods for
PG overall as the core code still has to be modified through the regular
process.  I get that the idea of this patch isn't to add oauth, but I
don't think it's sensible to claim that the changes to the oauth patch
to use these hooks on just the server side while still having a bunch of
code in libpq for oauth makes this set of hooks make sense.  I
definitely don't think we should ever even consider adding support for
something on the libpq side which require a 'custom' auth module on the
server side that isn't included in core.  Putting it into contrib would
just ultimately make it really painful for users to deal with without
any benefit that I can see either as we'd still have to maintain it and
update it through the regular PG process.  Trying hard to give the
benefit of the doubt here, maybe you could argue that by having the
module in contrib and therefore not loaded unless requested that the
system might be overall 'more secure', but given that we already require
admins to specify what the auth method is with pg_hba and that hasn't
been a source of a lot of CVEs around somehow getting the wrong code in
there, I just don't see it as a sensible justification.

> I think we might be able to build ontop of this to make things even more
> extensible and it's worth having the discussion whether the proposal can be
> made more extensible with a reasonable amount of complexity. But that's
> different from faulting the patch for not achieving something it didn't set
> out to do.

That it wasn't clear just what the idea of this patch was strikes me as
another point against it, really.  The thread talked about custom
authentication methods and the modification to pg_hba clearly made it
out to be a top-level alternative to SCRAM and the other methods even
though it sounds like that wasn't the intent, or maybe was but was only
part of it (but then, what's the other part..?  That's still unclear to
me even after reading these latest emails..).

> > That strikes me as entirely defeating the point of having this be
> > extensible, since you still have to modify the core code to get support
> > for it.
> 
> Since it wasn't the goal to add oauth...

But.. a patch was specifically proposed on this thread to use this to
add oauth, with encouragment from the folks working on this patch, and
that was then used, even just above, as a reason why this was a useful
change to make, but I don't see what about it makes it such.

> > I don't get why such an exercise would have been done if the goal is to
> > just allow what's in rolpassword to be pulled from somewhere else or why
> > we would be talking about different auth methods in the first place if
> > that's the goal here.
> 
> It's called that way because auth.c calls it that. See
> 
> void
> ClientAuthentication(Port *port)
> ...
>   switch (port->hba->auth_method)
> 
> even though the different auth_methods use a smaller number of ways of
> exchanges of secrets than we have "auth 

Re: Granting SET and ALTER SYSTE privileges for GUCs

2022-03-28 Thread Tom Lane
I've started reviewing this patch in earnest, and almost immediately
hit a serious problem: these regression tests aren't even a little bit
committable.  For one thing, they fail if you do "make installcheck"
twice in a row.  This seems to be because the first run leaves some
cruft behind in pg_parameter_acl that affects the results of
subsequent runs.  But the bigger problem is that it is ABSOLUTELY NOT
OKAY to test ALTER SYSTEM during an "installcheck" run.  That would be
true even if you cleaned up all the settings by the end of the run, as
the patch fails to do (and for extra demerit, it leaves a superuser
role laying around).  Even transient effects on the behavior of
sessions in other DBs aren't acceptable in installcheck mode, IMO.

I think we probably have to trash the core-regression-tests part of
the patch altogether and instead use a TAP test for whatever testing
we want to do.  It might be all right to test SET privileges without
testing ALTER SYSTEM, but I'm not sure there's much point in that.

regards, tom lane




Re: role self-revocation

2022-03-28 Thread Robert Haas
On Mon, Mar 28, 2022 at 10:51 AM Stephen Frost  wrote:
> > > > So I propose to commit something like what I posted here:
> > > > http://postgr.es/m/ca+tgmobgek0jraowqvpqhsxcfbdfitxsomoebhmmmhmj4gl...@mail.gmail.com
> > >
> > > +1, although the comments might need some more work.  In particular,
> > > I'm not sure that this bit is well stated:
>
> Also +1 on this.

OK, done using Tom's proposed wording.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Extensible Rmgr for Table AMs

2022-03-28 Thread Andres Freund
Hi,

On 2022-03-23 21:43:08 -0700, Jeff Davis wrote:
>  /* must be kept in sync with RmgrData definition in xlog_internal.h */
>  #define PG_RMGR(symname,name,redo,desc,identify,startup,cleanup,mask,decode) \
> - { name, redo, desc, identify, startup, cleanup, mask, decode },
> + &(struct RmgrData){ name, redo, desc, identify, startup, cleanup, mask, decode },
>  
> -const RmgrData RmgrTable[RM_MAX_ID + 1] = {
> +static RmgrData *RmgrTable[RM_MAX_ID + 1] = {
>  #include "access/rmgrlist.h"
>  };

I think this has been discussed before, but to me it's not obvious that it's a
good idea to change RmgrTable from RmgrData to RmgrData *. That adds an
indirection, without obvious benefit.


> +
> +/*
> + * Start up all resource managers.
> + */
> +void
> +StartupResourceManagers()

(void)


> +void
> +RegisterCustomRmgr(RmgrId rmid, RmgrData *rmgr)
> +{
> + if (rmid < RM_MIN_CUSTOM_ID)
> + ereport(PANIC, errmsg("custom rmgr id %d is out of range", rmid));
> +
> + if (!process_shared_preload_libraries_in_progress)
> + ereport(ERROR,
> + (errmsg("custom rmgr must be registered while initializing modules in shared_preload_libraries")));
> +
> + ereport(LOG,
> + (errmsg("registering custom rmgr \"%s\" with ID %d",
> + rmgr->rm_name, rmid)));
> +
> + if (RmgrTable[rmid] != NULL)
> + ereport(PANIC,
> + (errmsg("custom rmgr ID %d already registered with name \"%s\"",
> + rmid, RmgrTable[rmid]->rm_name)));
> +
> + /* check for existing rmgr with the same name */
> + for (int i = 0; i <= RM_MAX_ID; i++)
> + {
> + const RmgrData *existing_rmgr = RmgrTable[i];
> +
> + if (existing_rmgr == NULL)
> + continue;
> +
> + if (!strcmp(existing_rmgr->rm_name, rmgr->rm_name))
> + ereport(PANIC,
> + (errmsg("custom rmgr \"%s\" has the same name as builtin rmgr",
> + existing_rmgr->rm_name)));
> + }
> +
> + /* register it */
> + RmgrTable[rmid] = rmgr;
> +}

Random idea: Might be worth emitting the id->name mapping just after a redo
location is determined, to make it easier to debug things.
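
Something along these lines, perhaps (a sketch only; the log level and the
exact placement once the redo location is known are guesses):

    /* sketch: dump the custom rmgr id->name mapping once redo starts */
    for (int rmid = RM_MIN_CUSTOM_ID; rmid <= RM_MAX_ID; rmid++)
    {
        if (RmgrTable[rmid] != NULL)
            ereport(LOG,
                    (errmsg("redo will use custom rmgr %d (\"%s\")",
                            rmid, RmgrTable[rmid]->rm_name)));
    }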


> +RmgrData *
> +GetRmgr(RmgrId rmid)
> +{
> + return RmgrTable[rmid];
> +}

Given this is so simple, why incur the cost of a function call? Why not
instead continue to expose RmgrTable and move GetRmgr() into the header as a
static inline? As-is this also prevents the compiler from optimizing across
repeated GetRmgr() calls (which often won't be possible anyway, but still)..
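
That is, roughly (a sketch of the alternative, assuming RmgrTable is
exported rather than kept static):

    /* in the header: expose the table and let callers inline the lookup */
    extern PGDLLIMPORT RmgrData *RmgrTable[RM_MAX_ID + 1];

    static inline RmgrData *
    GetRmgr(RmgrId rmid)
    {
        return RmgrTable[rmid];
    }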


> - if (record->xl_rmid > RM_MAX_ID)
> + if (record->xl_rmid > RM_MAX_BUILTIN_ID && record->xl_rmid < RM_MIN_CUSTOM_ID)
>   {
>   report_invalid_record(state,
>   "invalid resource manager ID %u at %X/%X",

Shouldn't this continue to enforce RM_MAX_ID as well?
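i.e., presumably the check needs both bounds, something like:

    if (record->xl_rmid > RM_MAX_ID ||
        (record->xl_rmid > RM_MAX_BUILTIN_ID &&
         record->xl_rmid < RM_MIN_CUSTOM_ID))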


> @@ -1604,12 +1603,7 @@ PerformWalRecovery(void)
>  
>   InRedo = true;
>  
> - /* Initialize resource managers */
> - for (rmid = 0; rmid <= RM_MAX_ID; rmid++)
> - {
> - if (RmgrTable[rmid].rm_startup != NULL)
> - RmgrTable[rmid].rm_startup();
> - }
> + StartupResourceManagers();

Personally I'd rather name it ResourceManagersStartup() or RmgrStartup().


> @@ -1871,7 +1860,7 @@ ApplyWalRecord(XLogReaderState *xlogreader, XLogRecord *record, TimeLineID *repl
>   xlogrecovery_redo(xlogreader, *replayTLI);
>  
>   /* Now apply the WAL record itself */
> - RmgrTable[record->xl_rmid].rm_redo(xlogreader);
> + GetRmgr(record->xl_rmid)->rm_redo(xlogreader);
>  
>   /*
>* After redo, check whether the backup pages associated with the WAL

So we have just added one indirect call and one pointer indirection
(previously RmgrTable could be resolved by the linker, now it needs to be
dereferenced again), that's not too bad. I was afraid there'd be multiple
calls.


> @@ -2101,16 +2090,16 @@ xlog_outdesc(StringInfo buf, XLogReaderState *record)
>   uint8   info = XLogRecGetInfo(record);
>   const char *id;
>  
> - appendStringInfoString(buf, RmgrTable[rmid].rm_name);
> + appendStringInfoString(buf, GetRmgr(rmid)->rm_name);
>   appendStringInfoChar(buf, '/');
>  
> - id = RmgrTable[rmid].rm_identify(info);
> + id = GetRmgr(rmid)->rm_identify(info);
>   if (id == NULL)
>   appendStringInfo(buf, "UNKNOWN (%X): ", info & ~XLR_INFO_MASK);
>   else
>   appendStringInfo(buf, "%s: ", id);
>  
> - RmgrTable[rmid].rm_desc(buf, record);
> + GetRmgr(rmid)->rm_desc(buf, record);
>  }

Like here. It's 

Re: Estimating HugePages Requirements?

2022-03-28 Thread Nathan Bossart
On Thu, Mar 24, 2022 at 01:31:08PM -0700, Nathan Bossart wrote:
> A couple of other options to consider:
> 
> 1) Always set log_min_messages to WARNING/ERROR/FATAL for 'postgres -C'.
> We might need some special logic for handling the case where the user is
> inspecting the log_min_messages parameter.  With this approach, you'd
> probably never get extra output unless something was wrong (e.g., database
> already running when inspecting a runtime-computed GUC).  Also, this would
> silence any extra output that you might see today with non-runtime-computed
> GUCs.
> 
> 2) Add some way to skip just the shutdown message (e.g., a variable set
> when output_config_variable is true).  With this approach, you wouldn't get
> extra output by default, but you still might if log_min_messages is set to
> something like DEBUG3.  This wouldn't impact any extra output that you see
> today with non-runtime-computed GUCs.

I've attached a first attempt at option 1.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
From f36d04f5d19c673a7be788d6d4955b148f580095 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Mon, 28 Mar 2022 10:12:23 -0700
Subject: [PATCH v3 1/1] Silence extra logging with 'postgres -C'.

Presently, the server may emit a variety of extra log messages when inspecting
GUC values.  For example, when inspecting a runtime-computed GUC, the server
will always emit a "database system is shut down" LOG (unless the user has set
log_min_messages higher than LOG).  To avoid these extra log messages, this
change sets log_min_messages to FATAL when -C is used (unless the user has set
it to PANIC).  At FATAL, the user will still receive messages explaining why a
GUC's value cannot be inspected.
---
 src/backend/postmaster/postmaster.c | 34 +
 1 file changed, 30 insertions(+), 4 deletions(-)

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 80bb269599..7ee20d7c69 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -897,6 +897,26 @@ PostmasterMain(int argc, char *argv[])
 
 	if (output_config_variable != NULL)
 	{
+		const char *logging_setting;
+		int			flags;
+
+		/*
+		 * Silence any extra log messages when -C is used.  Before adjusting
+		 * log_min_messages, we save its value in case the user is inspecting
+		 * it.  If we are inspecting a runtime-computed GUC, it is possible that
+		 * a preloaded library's _PG_init() will override our custom setting for
+		 * log_min_messages, but trying to deal with that is probably more
+		 * trouble than it's worth.
+		 *
+		 * If the user has set log_min_messages to PANIC, we leave it at PANIC.
+		 * This has the side effect of silencing helpful FATAL messages (e.g.,
+		 * when trying to inspect a runtime-computed GUC on a running server),
+		 * but presumably the user has set it there for good reason.
+		 */
+		logging_setting = GetConfigOption("log_min_messages", false, false);
+		if (log_min_messages < FATAL)
+			SetConfigOption("log_min_messages", "FATAL", PGC_INTERNAL, PGC_S_OVERRIDE);
+
 		/*
 		 * If this is a runtime-computed GUC, it hasn't yet been initialized,
 		 * and the present value is not useful.  However, this is a convenient
@@ -908,17 +928,23 @@ PostmasterMain(int argc, char *argv[])
 		 * on running servers for those GUCs, but using this option now would
 		 * lead to incorrect results for them.
 		 */
-		int			flags = GetConfigOptionFlags(output_config_variable, true);
+		flags = GetConfigOptionFlags(output_config_variable, true);
 
 		if ((flags & GUC_RUNTIME_COMPUTED) == 0)
 		{
+			const char *config_val;
+
 			/*
 			 * "-C guc" was specified, so print GUC's value and exit.  No
 			 * extra permission check is needed because the user is reading
-			 * inside the data dir.
+			 * inside the data dir.  If we are inspecting log_min_messages,
+			 * print the saved value.
 			 */
-			const char *config_val = GetConfigOption(output_config_variable,
-	 false, false);
+			if (pg_strcasecmp(output_config_variable, "log_min_messages") == 0)
+config_val = logging_setting;
+			else
+config_val = GetConfigOption(output_config_variable,
+			 false, false);
 
 			puts(config_val ? config_val : "");
 			ExitPostmaster(0);
-- 
2.25.1



Re: multithreaded zstd backup compression for client and server

2022-03-28 Thread Robert Haas
On Mon, Mar 28, 2022 at 12:57 PM Robert Haas  wrote:
> Here's an updated and rebased version of my patch.

Well, that only updated the comment on the client side. Let's try again.

--
Robert Haas
EDB: http://www.enterprisedb.com


v3-0001-Allow-parallel-zstd-compression-when-taking-a-bas.patch
Description: Binary data


Re: Assert in pageinspect with NULL pages

2022-03-28 Thread Maxim Orlov
I've just noticed that the test in this patch relies on the fact that heap
pages either have no PageSpecial area or have one whose size differs from
that of the btree special area:

CREATE TABLE test1 (a int8, b int4range);

SELECT bt_page_items(get_raw_page('test1', 0));


In the current state this is so, but it is not guaranteed. For example, in
the proposed 64xid patch [1] we introduce PageSpecial into heap pages, and
its size can happen to equal the size of the btree page special area, so the
check above no longer holds, even if we pass a heap page into pageinspect
pretending it is a btree page. So the test will fail.


Generally this seems to be not just a wrong test but a consequence of the
bigger fact that we cannot always distinguish different AMs' pages. So I'd
propose at least that we not conclude a page is valid based solely on its
PageSpecial size being right.
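
The kind of special-size test in question looks roughly like this (a sketch
of the general pattern, not the exact pageinspect code):

    /*
     * Heuristic "is this a btree page?" check; a heap page whose special
     * area happened to have the same size would pass it.
     */
    if (PageGetSpecialSize(page) != MAXALIGN(sizeof(BTPageOpaqueData)))
        ereport(ERROR,
                (errcode(ERRCODE_DATA_CORRUPTED),
                 errmsg("input page is not a valid btree page")));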

[1]
https://www.postgresql.org/message-id/flat/CACG%3DezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe%3DpyyjVWA%40mail.gmail.com
-- 
Best regards,
Maxim Orlov.


Re: On login trigger: take three

2022-03-28 Thread Andres Freund
Hi,

On 2022-03-28 15:57:37 +0300, a.soko...@postgrespro.ru wrote:
> +data initialization. It is vital that any event trigger using the
> +login event checks whether or not the database is in
> +recovery.
> 
> Does any trigger really have to contain a pg_is_in_recovery() call?

Not *any* trigger, just any trigger that writes.


> In this message
> (https://www.postgresql.org/message-id/20220312024652.lvgehszwke4hhove%40alap3.anarazel.de)
> it was only about triggers on hot standby that run non-read-only queries

The problem precisely is that the login triggers run on hot standby nodes, and
that if they do writes, you can't login anymore.

Greetings,

Andres Freund




Re: multithreaded zstd backup compression for client and server

2022-03-28 Thread Robert Haas
On Sun, Mar 27, 2022 at 4:50 PM Justin Pryzby  wrote:
> Actually, I suggest to remove those comments:
> | "We check for failure here because..."
>
> That should be the rule rather than the exception, so shouldn't require
justifying why one might check the return value of library and system calls.

I went for modifying the comment rather than removing it. I agree with
you that checking for failure doesn't really require justification,
but I think that in a case like this it is useful to explain what we
know about why it might fail.

> In bbsink_zstd_new(), I think you need to check to see if workers were
> requested (same as the issue you found with "level").

Fixed.
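
Presumably with a guard parallel to the level case, along these lines (a
sketch; the exact option-flag and field names here are assumptions):

    /* only set nbWorkers if the user actually asked for workers */
    if ((compress->options & BACKUP_COMPRESSION_OPTION_WORKERS) != 0)
    {
        ret = ZSTD_CCtx_setParameter(mysink->cctx, ZSTD_c_nbWorkers,
                                     compress->workers);
        if (ZSTD_isError(ret))
            elog(ERROR, "could not set compression worker count to %d: %s",
                 compress->workers, ZSTD_getErrorName(ret));
    }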

> src/backend/replication/basebackup_zstd.c:  elog(ERROR, "could not set zstd compression level to %d: %s",
> src/bin/pg_basebackup/bbstreamer_gzip.c:  pg_log_error("could not set compression level %d: %s",
> src/bin/pg_basebackup/bbstreamer_zstd.c:  pg_log_error("could not set compression level to: %d: %s",
>
> I'm not sure why these messages sometimes mention the current compression
> method and sometimes don't.  I suggest that they shouldn't - errcontext will
> have the algorithm, and the user already specified it anyway.  It'd allow the
> compiler to merge strings.

I don't think that errcontext() helps here. On the client side, it
doesn't exist. On the server side, it's not in use. I do see a
STATEMENT: line in the server log when a replication command
throws a server-side error, which is similar, but pg_basebackup
doesn't display that STATEMENT line. I don't really know how to
balance the legitimate desire for fewer messages against the
also-legitimate desire for clarity about where things are failing. I'm
slightly inclined to think that including the algorithm name is
better, because options are in the end algorithm-specific, but it's
certainly debatable. I would be interested in hearing other
opinions...

Here's an updated and rebased version of my patch.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


v2-0001-Allow-parallel-zstd-compression-when-taking-a-bas.patch
Description: Binary data


Re: multithreaded zstd backup compression for client and server

2022-03-28 Thread Robert Haas
On Mon, Mar 28, 2022 at 12:52 PM Dagfinn Ilmari Mannsåker  wrote:
> > True, but that also means it shows up in the actual failure message,
> > which seems too verbose. By just using 'print', it ends up in the log
> > file if it's needed, but not anywhere else. Maybe there's a better way
> > to do this, but I don't think using note() is what I want.
>
> That is the difference between note() and diag(): note() prints to
> stdout so is not visible under a non-verbose prove run, while diag()
> prints to stderr so it's always visible.

OK, but print doesn't do either of those things. The output only shows
up in the log file, even with --verbose. Here's an example of what the
log file looks like:

# Running: pg_verifybackup -n -m /Users/rhaas/pgsql/src/bin/pg_verifybackup/tmp_check/t_008_untar_primary_data/backup/server-backup/backup_manifest -e /Users/rhaas/pgsql/src/bin/pg_verifybackup/tmp_check/t_008_untar_primary_data/backup/extracted-backup
backup successfully verified
ok 6 - verify backup, compression gzip

As you can see, there is a line here that does not begin with #. That
line is the standard output of a command that was run by the test
script.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: multithreaded zstd backup compression for client and server

2022-03-28 Thread Dagfinn Ilmari Mannsåker
Robert Haas  writes:

> On Thu, Mar 24, 2022 at 9:19 AM Dagfinn Ilmari Mannsåker  wrote:
>> Per the TAP protocol, every line of non-test-result output should be
>> prefixed by "# ". The note() function does this for you, see
>> https://metacpan.org/pod/Test::More#Diagnostics for details.
>
> True, but that also means it shows up in the actual failure message,
> which seems too verbose. By just using 'print', it ends up in the log
> file if it's needed, but not anywhere else. Maybe there's a better way
> to do this, but I don't think using note() is what I want.

That is the difference between note() and diag(): note() prints to
stdout so is not visible under a non-verbose prove run, while diag()
prints to stderr so it's always visible.

- ilmari




Re: multithreaded zstd backup compression for client and server

2022-03-28 Thread Robert Haas
On Thu, Mar 24, 2022 at 9:19 AM Dagfinn Ilmari Mannsåker  wrote:
> Per the TAP protocol, every line of non-test-result output should be
> prefixed by "# ". The note() function does this for you, see
> https://metacpan.org/pod/Test::More#Diagnostics for details.

True, but that also means it shows up in the actual failure message,
which seems too verbose. By just using 'print', it ends up in the log
file if it's needed, but not anywhere else. Maybe there's a better way
to do this, but I don't think using note() is what I want.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: fixing a few backup compression goofs

2022-03-28 Thread Robert Haas
On Fri, Mar 25, 2022 at 9:23 AM Dipesh Pandit  wrote:
> The changes look good to me.

Thanks. Committed.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pgsql: Add 'basebackup_to_shell' contrib module.

2022-03-28 Thread Magnus Hagander
On Sun, Mar 27, 2022 at 12:28 AM Tom Lane  wrote:

> Thomas Munro  writes:
> > On Sat, Mar 26, 2022 at 4:14 PM Justin Pryzby  wrote:
> >> I see it here (and in cfbot), although I'm not sure how you created a
> new
> >> patch for the active CF, and not for the next CF.
>
> > Anyone who has ever been a CF manager has this power, it seems.  I did
> > it myself once, by accident, and got told off by the active CF
> > manager.
>
> I'm not sure what the policy is for that.  I have done it myself,
> although I've never been a CF manager, so maybe it was granted
> to all committers?
>

It is not. In fact, you have some strange in-between power, shared only by
you and those pginfra members who are *not* developers... I've made
you a "full cf manager" now so it's at least consistent :)

And yes, the way it works now is once a cf manager always a cf manager. We
haven't had enough of them that it's been something worth considering yet.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: standby recovery fails (tablespace related) (tentative patch and discussion)

2022-03-28 Thread Robert Haas
On Mon, Mar 21, 2022 at 3:02 PM Alvaro Herrera  wrote:
> > 2. Why not instead change the code so that the operation can succeed,
> > by creating the prerequisite parent directories? Do we not have enough
> > information for that? I'm not saying that we definitely should do it
> > that way rather than this way, but I think we do take that approach in
> > some cases.
>
> It seems we can choose freely between these two implementations -- I
> mean I don't see any upsides or downsides to either one.

What got committed here feels inconsistent to me. Suppose we have a
checkpoint, and then a series of operations that touch a tablespace,
and then a drop database and drop tablespace. If the first operation
happens to be CREATE DATABASE, then this patch is going to fix it by
skipping the operation. However, if the first operation happens to be
almost anything else, the way it's going to reference the dropped
tablespace is via a block reference in a WAL record of a wide variety
of types. That's going to result in a call to
XLogReadBufferForRedoExtended() which will call
XLogReadBufferExtended() which will do smgrcreate(smgr, forknum, true)
which will in turn call TablespaceCreateDbspace() to fill in all the
missing directories.
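
Schematically, that redo path is (all of these are existing calls, shown
here only to make the chain explicit):

    XLogReadBufferForRedoExtended()
        -> XLogReadBufferExtended()
            -> smgrcreate(smgr, forknum, true)   /* isRedo = true */
                -> TablespaceCreateDbspace()     /* creates missing dirs */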

I don't think that's very good. It would be reasonable to decide that
we're never going to create the missing directories and instead just
remember that they were not found so we can do a cross check. It's
also reasonable to just create the directories on the fly. But doing a
mix of those systems doesn't really seem like the right idea -
particularly because it means that the cross-check system is probably
not very effective at finding actual problems in the code.

Am I missing something here?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: SQL/JSON: functions

2022-03-28 Thread Nikola Ivanov
ok, I have enabled it. Will send it after the next build.


Regards

On Mon, 28 Mar 2022 at 18:39, Andres Freund  wrote:

> Hi,
>
> On 2022-03-28 18:05:19 +0300, Nikola Ivanov wrote:
> > Let me check what I can do with the access. I will get back to you in
> > an hour.
>
> Perhaps you can temporarily enable keep_error_builds, and send in
> src/interfaces/ecpg/preproc/c_kwlist_d.h
> src/interfaces/ecpg/preproc/pgc.c
> src/interfaces/ecpg/preproc/preproc.h
> src/interfaces/ecpg/preproc/ecpg_kwlist_d.h
> src/interfaces/ecpg/preproc/preproc.c
> from the failed build directory? It seems something there have to differ.
>
> Regards,
>
> Andres
>


Re: SQL/JSON: functions

2022-03-28 Thread Nikola Ivanov
Hi Andrew,

Let me check what I can do with the access. I will get back to you in
an hour.

Regards

On Mon, Mar 28, 2022, 17:30 Andrew Dunstan  wrote:

>
> On 3/28/22 09:35, Tom Lane wrote:
> > Andrew Dunstan  writes:
> >> On 3/27/22 19:14, Tom Lane wrote:
> >>> What's worse, I'm unable to replicate the failure on an OpenBSD 7.0
> >>> system here.  So there's something odd about jabiru's build
> >>> environment; but what?
> >> It's hard to see how this could be caused by the OS environment. Maybe a
> >> flaky bison/flex? I'm going to be pretty reluctant to revert this based
> >> on this error.
> > No, I wouldn't recommend reverting.  Perhaps if we dig down and find
> > something reproducible here, we could fix it --- but right now,
> > given my failure to reproduce, I think there's just something broken
> > on jabiru.
> >
> >
>
>
>
> Yeah. I have just duplicated your non-replication on a fresh instance.
>
>
> Nikola Ivanov, can you give us any assistance or give us access to the
> machine?
>
>
> cheers
>
>
> andrew
>
>
> --
> Andrew Dunstan
> EDB: https://www.enterprisedb.com
>
>

