Re: fix tablespace handling in pg_combinebackup

2024-04-19 Thread Thomas Munro
I don't know how to fix 82023d47^ on Windows[1][2], but in case it's
useful, here's a small clue.  I see that Perl's readlink() does in
fact know how to read "junction points" on Windows[3], so if that was
the only problem it might have been very close to working.  One
difference is that our own src/port/dirmod.c's pgreadlink() also
strips \??\ from the returned paths (something to do with the
various forms of NT path), but when I tried that:

my $olddir = readlink("$backup_path/pg_tblspc/$tsoid")
  || die "readlink $backup_path/pg_tblspc/$tsoid: $!";

+   # strip NT path prefix (see src/port/dirmod.c pgreadlink())
+   $olddir =~ s/^\\\?\?\\// if $PostgreSQL::Test::Utils::windows_os;

... it still broke[4].  So I'm not sure what's going on...

[1] https://github.com/postgres/postgres/runs/24040897199
[2] 
https://api.cirrus-ci.com/v1/artifact/task/5550091866472448/testrun/build/testrun/pg_combinebackup/002_compare_backups/log/002_compare_backups_pitr1.log
[3] 
https://github.com/Perl/perl5/blob/f936cd91ee430786a1bb6068a4a7c8362610dd5f/win32/win32.c#L2041
[4] https://cirrus-ci.com/task/6746621638082560
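For illustration, the prefix strip described above can be sketched in Python; the helper name is made up here, and this is only a model of what src/port/dirmod.c's pgreadlink() does, not PostgreSQL code:

```python
import re

def strip_nt_prefix(path):
    """Strip the NT namespace prefix \\??\\ that Windows junction point
    targets can carry (pgreadlink() in src/port/dirmod.c strips the same
    prefix).  Illustrative helper only."""
    return re.sub(r'^\\\?\?\\', '', path)

# A junction target as the raw Windows API might report it:
print(strip_nt_prefix('\\??\\C:\\pgdata\\ts1'))  # C:\pgdata\ts1
# Paths without the prefix pass through unchanged:
print(strip_nt_prefix('C:\\pgdata\\ts1'))        # C:\pgdata\ts1
```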




Re: subscription/026_stats test is intermittently slow?

2024-04-19 Thread Alexander Lakhin

Hello Michael and Robert,

20.04.2024 05:57, Michael Paquier wrote:

On Fri, Apr 19, 2024 at 01:57:41PM -0400, Robert Haas wrote:

It looks to me like in the first run it took 3 minutes for the
replay_lsn to catch up to the desired value, and in the second run,
two seconds. I think I have seen previous instances where something
similar happened, although in those cases I did not stop to record any
details. Have others seen this? Is there something we can/should do
about it?

FWIW, I've seen delays with this test on a few occasions as well.
Thanks for mentioning it.


It reminds me of
https://www.postgresql.org/message-id/858a7622-2c81-1687-d1df-1322dfcb2e72%40gmail.com

At least, I could reproduce such a delay with the attached patch applied.

Best regards,
Alexander

diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
index 66070e9131..9368a3a897 100644
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -227,6 +227,7 @@ WaitForReplicationWorkerAttach(LogicalRepWorker *worker,
 
 		if (rc & WL_LATCH_SET)
 		{
+pg_usleep(30);
 			ResetLatch(MyLatch);
 			CHECK_FOR_INTERRUPTS();
 		}
diff --git a/src/test/subscription/t/026_stats.pl b/src/test/subscription/t/026_stats.pl
index d1d68fad9a..a82f1c051c 100644
--- a/src/test/subscription/t/026_stats.pl
+++ b/src/test/subscription/t/026_stats.pl
@@ -37,6 +37,7 @@ sub create_sub_pub_w_errors
 		$db,
 		qq[
 	BEGIN;
+SELECT pg_sleep(0.5);
 	CREATE TABLE $table_name(a int primary key);
 	INSERT INTO $table_name VALUES (1);
 	COMMIT;


Re: UniqueKey v2

2024-04-19 Thread Andy Fan


Hello Antonin,

> zhihuifan1...@163.com wrote:
>
>> Here is the v3, ...
>
> I'm trying to enhance the join removal functionality (will post my patch in a
> separate thread soon) and I consider your patch very helpful in this
> area.

Thanks for these words.  Points 2) and 3) at [1] are pretty interesting
to me, and "enhance join removal" is another interesting use case.

> Following is my review. Attached are also some fixes and one enhancement:
> propagation of the unique keys (UK) from a subquery to the parent query
> (0004). (Note that 0001 is just your patch rebased).

Thanks for that!  More enhancements, like UniqueKey support for
partitioned tables, are needed.  This post is mainly meant to check
whether people are still interested in this.

> Are you going to submit the patch to the first CF of PG 18?

Since there is still known work to do, I'm not sure whether it is OK to
submit it to the CF.  What do you think about this part?

>
> Please let me know if I can contribute to the effort by reviewing or writing
> some code.

Absolutely yes!  Please feel free to review or write any of them, and do
remember to add yourself to the author list if you do.

Thanks for your review suggestions; I will get to this as soon as I
find time, hopefully within 4 weeks.

[1]
https://www.postgresql.org/message-id/7mlamswjp81p.fsf%40e18c07352.et15sqa

-- 
Best Regards
Andy Fan





Re: tablecmds.c/MergeAttributes() cleanup

2024-04-19 Thread Ashutosh Bapat
On Sat, Apr 20, 2024 at 9:30 AM Alexander Lakhin wrote:

> Hello,
>
> 30.01.2024 09:22, Ashutosh Bapat wrote:
> >
> >> Please look at the segmentation fault triggered by the following query
> since
> >> 4d969b2f8:
> >> CREATE TABLE t1(a text COMPRESSION pglz);
> >> CREATE TABLE t2(a text);
> >> CREATE TABLE t3() INHERITS(t1, t2);
> >> NOTICE:  merging multiple inherited definitions of column "a"
> >> server closed the connection unexpectedly
> >>   This probably means the server terminated abnormally
> >>   before or while processing the request.
> >>
> >> Core was generated by `postgres: law regression [local] CREATE TABLE
>  '.
> >> Program terminated with signal SIGSEGV, Segmentation fault.
> >>
> >> (gdb) bt
> >> #0  __strcmp_avx2 () at ../sysdeps/x86_64/multiarch/strcmp-avx2.S:116
> >> #1  0x5606fbcc9d52 in MergeAttributes (columns=0x0,
> supers=supers@entry=0x5606fe293d30, relpersistence=112 'p',
> >> is_partition=false, supconstr=supconstr@entry=0x7fff4046d410,
> supnotnulls=supnotnulls@entry=0x7fff4046d418)
> >>   at tablecmds.c:2811
> >> #2  0x5606fbccd764 in DefineRelation (stmt=stmt@entry=0x5606fe26a130,
> relkind=relkind@entry=114 'r', ownerId=10,
> >> ownerId@entry=0, typaddress=typaddress@entry=0x0,
> >>   queryString=queryString@entry=0x5606fe2695c0 "CREATE TABLE t3()
> INHERITS(t1, t2);") at tablecmds.c:885
> > This bug existed even before the refactoring.  It happens because strcmp()
> > is called on NULL input (t2's compression is NULL). I already have a
> > fix for this and will be posting it in [1].
> >
> > [1]
> https://www.postgresql.org/message-id/flat/24656cec-d6ef-4d15-8b5b-e8dfc9c833a7%40eisentraut.org
> >
>
> Now that that fix is closed with RwF [1], shouldn't this crash issue be
> added to Open Items for v17?
> (I couldn't reproduce the crash on 4d969b2f8~1 nor on REL_16_STABLE.)
>
> https://commitfest.postgresql.org/47/4813/


Yes, please.  This issue probably surfaced again after we reverted the
compression and storage fix?  If that's the case, please add it to the
open items.

-- 
Best Wishes,
Ashutosh Bapat


Re: Typos in the code and README

2024-04-19 Thread David Rowley
On Fri, 19 Apr 2024 at 20:13, Daniel Gustafsson  wrote:
> Thanks, I incorporated these into 0001 before pushing.  All the commits in this
> patchset are now applied.

Here are a few more to see if it motivates anyone else to do a more
thorough search for another batch.

Fixes duplicate words spanning multiple lines plus an outdated
reference to "streaming read" which was renamed to "read stream" late
in that patch's development.

Duplicate words were found using:
ag "\s([a-zA-Z]{2,})[\s*]*\n\1\b"
ag "\s([a-zA-Z]{2,})\n(\s*\*\s*)\1\b"
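The two `ag` patterns above can be sketched as Python regexes (a rough translation for illustration; `ag` matches across lines natively, and the sample strings below are taken from the fixes in the attached diff):

```python
import re

# Pattern 1: a word duplicated across a plain line break.
# Pattern 2: a word duplicated across a line break inside a C block
# comment (an intervening " * " continuation marker).
DUP_PLAIN = re.compile(r'\s([a-zA-Z]{2,})[\s*]*\n\1\b')
DUP_COMMENT = re.compile(r'\s([a-zA-Z]{2,})\n(\s*\*\s*)\1\b')

def find_duplicate_words(text):
    """Return words that appear duplicated across a line break."""
    hits = [m.group(1) for m in DUP_PLAIN.finditer(text)]
    hits += [m.group(1) for m in DUP_COMMENT.finditer(text)]
    return hits

plain = "BufferAccessStrategy object passed to the\nthe read stream API."
comment = "This frees up the memory space dedicated to\n * to store dead TIDs."
print(find_duplicate_words(plain))    # ['the']
print(find_duplicate_words(comment))  # ['to']
```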

David
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 4a4cf76269..32e7d3c146 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -1122,7 +1122,7 @@ heap_beginscan(Relation relation, Snapshot snapshot,
/*
 * Set up a read stream for sequential scans and TID range scans. This
 * should be done after initscan() because initscan() allocates the
-* BufferAccessStrategy object passed to the streaming read API.
+* BufferAccessStrategy object passed to the read stream API.
 */
if (scan->rs_base.rs_flags & SO_TYPE_SEQSCAN ||
scan->rs_base.rs_flags & SO_TYPE_TIDRANGESCAN)
diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index de109acc89..0c5379666b 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -13,7 +13,7 @@
  * autovacuum_work_mem) memory space to keep track of dead TIDs.  If the
  * TID store is full, we must call lazy_vacuum to vacuum indexes (and to vacuum
  * the pages that we've pruned). This frees up the memory space dedicated to
- * to store dead TIDs.
+ * store dead TIDs.
  *
  * In practice VACUUM will often complete its initial pass over the target
  * heap relation without ever running out of space to store TIDs.  This means
diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c
index 4548a91723..a2510cf80c 100644
--- a/src/backend/catalog/namespace.c
+++ b/src/backend/catalog/namespace.c
@@ -108,7 +108,7 @@
  * (if one exists).
  *
  * activeSearchPath is always the actually active path; it points to
- * to baseSearchPath which is the list derived from namespace_search_path.
+ * baseSearchPath which is the list derived from namespace_search_path.
  *
  * If baseSearchPathValid is false, then baseSearchPath (and other derived
  * variables) need to be recomputed from namespace_search_path, or retrieved
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index cb39adcd0e..ed8104ccb4 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -601,7 +601,7 @@ update_and_persist_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid)
  * metadata of the slot as per the data received from the primary server.
  *
  * The slot is created as a temporary slot and stays in the same state until the
- * the remote_slot catches up with locally reserved position and local slot is
+ * remote_slot catches up with locally reserved position and local slot is
  * updated. The slot is then persisted and is considered as sync-ready for
  * periodic syncs.
  *
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index 100f6454e5..a691aed1f4 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -244,7 +244,7 @@ smgropen(RelFileLocator rlocator, ProcNumber backend)
 
 /*
  * smgrpin() -- Prevent an SMgrRelation object from being destroyed at end of
- * of transaction
+ * transaction
  */
 void
 smgrpin(SMgrRelation reln)


Re: tablecmds.c/MergeAttributes() cleanup

2024-04-19 Thread Alexander Lakhin

Hello,

30.01.2024 09:22, Ashutosh Bapat wrote:



Please look at the segmentation fault triggered by the following query since
4d969b2f8:
CREATE TABLE t1(a text COMPRESSION pglz);
CREATE TABLE t2(a text);
CREATE TABLE t3() INHERITS(t1, t2);
NOTICE:  merging multiple inherited definitions of column "a"
server closed the connection unexpectedly
  This probably means the server terminated abnormally
  before or while processing the request.

Core was generated by `postgres: law regression [local] CREATE TABLE
 '.
Program terminated with signal SIGSEGV, Segmentation fault.

(gdb) bt
#0  __strcmp_avx2 () at ../sysdeps/x86_64/multiarch/strcmp-avx2.S:116
#1  0x5606fbcc9d52 in MergeAttributes (columns=0x0, 
supers=supers@entry=0x5606fe293d30, relpersistence=112 'p',
is_partition=false, supconstr=supconstr@entry=0x7fff4046d410, 
supnotnulls=supnotnulls@entry=0x7fff4046d418)
  at tablecmds.c:2811
#2  0x5606fbccd764 in DefineRelation (stmt=stmt@entry=0x5606fe26a130, 
relkind=relkind@entry=114 'r', ownerId=10,
ownerId@entry=0, typaddress=typaddress@entry=0x0,
  queryString=queryString@entry=0x5606fe2695c0 "CREATE TABLE t3() INHERITS(t1, 
t2);") at tablecmds.c:885

This bug existed even before the refactoring.  It happens because strcmp()
is called on NULL input (t2's compression is NULL). I already have a
fix for this and will be posting it in [1].

[1] 
https://www.postgresql.org/message-id/flat/24656cec-d6ef-4d15-8b5b-e8dfc9c833a7%40eisentraut.org



Now that that fix is closed with RwF [1], shouldn't this crash issue be
added to Open Items for v17?
(I couldn't reproduce the crash on 4d969b2f8~1 nor on REL_16_STABLE.)

https://commitfest.postgresql.org/47/4813/

Best regards,
Alexander




Re: Stability of queryid in minor versions

2024-04-19 Thread Michael Paquier
On Sat, Apr 20, 2024 at 01:56:48PM +1200, David Rowley wrote:
> Thanks for the review. I've now pushed this, backpatching to 12.

You've split that into two separate paragraphs with 2d3389c28c5c.
Thanks for the commit.
--
Michael


signature.asc
Description: PGP signature


Re: subscription/026_stats test is intermittently slow?

2024-04-19 Thread Michael Paquier
On Fri, Apr 19, 2024 at 01:57:41PM -0400, Robert Haas wrote:
> It looks to me like in the first run it took 3 minutes for the
> replay_lsn to catch up to the desired value, and in the second run,
> two seconds. I think I have seen previous instances where something
> similar happened, although in those cases I did not stop to record any
> details. Have others seen this? Is there something we can/should do
> about it?

FWIW, I've seen delays with this test on a few occasions as well.
Thanks for mentioning it.
--
Michael




Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

2024-04-19 Thread Michael Paquier
On Fri, Apr 19, 2024 at 10:41:30AM +0900, Michael Paquier wrote:
> This comes from the contents of the dump for
> regress_pg_dump_table_am_2, that uses heap as table AM.  A SET is
> issued for it before dumping regress_pg_dump_table_am_parent and its
> partitions.  One trick that I can think of to make the output parsing
> of the test more palatable is to switch the AMs used by the two
> partitions, so as we finish with two SET queries before each partition
> rather than one before the partitioned table.  See the attached for
> the idea.

FYI, as this is an open item, I am planning to wrap that at the
beginning of next week after a second lookup.  If there are any
comments, feel free.
--
Michael




Re: Remove deprecated header file resowner_private.h.

2024-04-19 Thread Michael Paquier
On Sat, Apr 20, 2024 at 10:07:45AM +0800, Xing Guo wrote:
> I noticed that the header file resowner_private.h is deprecated and no
> longer useful after commit b8bff07[^1]. We should remove it.

Nice catch, looks like a `git rm` has been slippery here.  It is
indeed confusing to keep it around now that all these routines are
mostly internal or have been switched to static inline that work as
wrappers of some other resowner routines.

Will clean up.
--
Michael




Remove deprecated header file resowner_private.h.

2024-04-19 Thread Xing Guo
Hi hackers,

I noticed that the header file resowner_private.h is deprecated and no
longer useful after commit b8bff07[^1]. We should remove it.

[^1]: 
https://github.com/postgres/postgres/commit/b8bff07daa85c837a2747b4d35cd5a27e73fb7b2

Best Regards,
Xing
From ab13ca575e92f26a4b936c0d4fa92df23865c2d4 Mon Sep 17 00:00:00 2001
From: Xing Guo 
Date: Sat, 20 Apr 2024 10:00:23 +0800
Subject: [PATCH v1] Remove deprecated header file.

The resowner_private.h is deprecated after b8bff07.
---
 src/include/utils/resowner_private.h | 117 ---
 1 file changed, 117 deletions(-)
 delete mode 100644 src/include/utils/resowner_private.h

diff --git a/src/include/utils/resowner_private.h b/src/include/utils/resowner_private.h
deleted file mode 100644
index 02b6ed2f2d..00
--- a/src/include/utils/resowner_private.h
+++ /dev/null
@@ -1,117 +0,0 @@
-/*-
- *
- * resowner_private.h
- *	  POSTGRES resource owner private definitions.
- *
- * See utils/resowner/README for more info.
- *
- *
- * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group
- * Portions Copyright (c) 1994, Regents of the University of California
- *
- * src/include/utils/resowner_private.h
- *
- *-
- */
-#ifndef RESOWNER_PRIVATE_H
-#define RESOWNER_PRIVATE_H
-
-#include "storage/dsm.h"
-#include "storage/fd.h"
-#include "storage/lock.h"
-#include "utils/catcache.h"
-#include "utils/plancache.h"
-#include "utils/resowner.h"
-#include "utils/snapshot.h"
-
-
-/* support for buffer refcount management */
-extern void ResourceOwnerEnlargeBuffers(ResourceOwner owner);
-extern void ResourceOwnerRememberBuffer(ResourceOwner owner, Buffer buffer);
-extern void ResourceOwnerForgetBuffer(ResourceOwner owner, Buffer buffer);
-
-/* support for IO-in-progress management */
-extern void ResourceOwnerEnlargeBufferIOs(ResourceOwner owner);
-extern void ResourceOwnerRememberBufferIO(ResourceOwner owner, Buffer buffer);
-extern void ResourceOwnerForgetBufferIO(ResourceOwner owner, Buffer buffer);
-
-/* support for local lock management */
-extern void ResourceOwnerRememberLock(ResourceOwner owner, LOCALLOCK *locallock);
-extern void ResourceOwnerForgetLock(ResourceOwner owner, LOCALLOCK *locallock);
-
-/* support for catcache refcount management */
-extern void ResourceOwnerEnlargeCatCacheRefs(ResourceOwner owner);
-extern void ResourceOwnerRememberCatCacheRef(ResourceOwner owner,
-			 HeapTuple tuple);
-extern void ResourceOwnerForgetCatCacheRef(ResourceOwner owner,
-		   HeapTuple tuple);
-extern void ResourceOwnerEnlargeCatCacheListRefs(ResourceOwner owner);
-extern void ResourceOwnerRememberCatCacheListRef(ResourceOwner owner,
- CatCList *list);
-extern void ResourceOwnerForgetCatCacheListRef(ResourceOwner owner,
-			   CatCList *list);
-
-/* support for relcache refcount management */
-extern void ResourceOwnerEnlargeRelationRefs(ResourceOwner owner);
-extern void ResourceOwnerRememberRelationRef(ResourceOwner owner,
-			 Relation rel);
-extern void ResourceOwnerForgetRelationRef(ResourceOwner owner,
-		   Relation rel);
-
-/* support for plancache refcount management */
-extern void ResourceOwnerEnlargePlanCacheRefs(ResourceOwner owner);
-extern void ResourceOwnerRememberPlanCacheRef(ResourceOwner owner,
-			  CachedPlan *plan);
-extern void ResourceOwnerForgetPlanCacheRef(ResourceOwner owner,
-			CachedPlan *plan);
-
-/* support for tupledesc refcount management */
-extern void ResourceOwnerEnlargeTupleDescs(ResourceOwner owner);
-extern void ResourceOwnerRememberTupleDesc(ResourceOwner owner,
-		   TupleDesc tupdesc);
-extern void ResourceOwnerForgetTupleDesc(ResourceOwner owner,
-		 TupleDesc tupdesc);
-
-/* support for snapshot refcount management */
-extern void ResourceOwnerEnlargeSnapshots(ResourceOwner owner);
-extern void ResourceOwnerRememberSnapshot(ResourceOwner owner,
-		  Snapshot snapshot);
-extern void ResourceOwnerForgetSnapshot(ResourceOwner owner,
-		Snapshot snapshot);
-
-/* support for temporary file management */
-extern void ResourceOwnerEnlargeFiles(ResourceOwner owner);
-extern void ResourceOwnerRememberFile(ResourceOwner owner,
-	  File file);
-extern void ResourceOwnerForgetFile(ResourceOwner owner,
-	File file);
-
-/* support for dynamic shared memory management */
-extern void ResourceOwnerEnlargeDSMs(ResourceOwner owner);
-extern void ResourceOwnerRememberDSM(ResourceOwner owner,
-	 dsm_segment *);
-extern void ResourceOwnerForgetDSM(ResourceOwner owner,
-   dsm_segment *);
-
-/* support for JITContext management */
-extern void ResourceOwnerEnlargeJIT(ResourceOwner owner);
-extern void ResourceOwnerRememberJIT(ResourceOwner owner,
-	 Datum handle);
-extern void ResourceOwnerForgetJIT(ResourceOwner owner,
-

Re: Stability of queryid in minor versions

2024-04-19 Thread David Rowley
On Tue, 16 Apr 2024 at 15:16, Michael Paquier  wrote:
>
> On Tue, Apr 16, 2024 at 02:04:22PM +1200, David Rowley wrote:
> > It makes sense to talk about the hashing variations closer to the
> > object identifier part.
>
> Mostly what I had in mind.  A separate  for the new part you are
> adding at the end of the first part feels a bit more natural split
> here.  Feel free to discard my comment if you think that's not worth
> it.

Thanks for the review. I've now pushed this, backpatching to 12.

David




Re: brininsert optimization opportunity

2024-04-19 Thread Michael Paquier
On Fri, Apr 19, 2024 at 04:14:00PM +0200, Tomas Vondra wrote:
> FWIW I've pushed both patches, which resolves the open item, so I've
> moved it to the "resolved" part on wiki.

Thanks, Tomas!
--
Michael




Re: Use XLOG_CONTROL_FILE macro everywhere?

2024-04-19 Thread Michael Paquier
On Fri, Apr 19, 2024 at 10:12:14AM +0200, Daniel Gustafsson wrote:
> Off the cuff that seems to make sense; it does make it easier to grep for uses
> of the control file.

+1 for switching to the macro where we can.  Still, I don't see a
point in rushing and would just switch that once v18 opens up for
business. 
--
Michael




Re: Support a wildcard in backtrace_functions

2024-04-19 Thread Michael Paquier
On Fri, Apr 19, 2024 at 04:17:18PM -0400, Robert Haas wrote:
> On Fri, Apr 19, 2024 at 3:24 PM Tom Lane  wrote:
>> I can't say that I care for "backtrace_on_internal_error".
>> Re-reading that thread, I see I argued for having *no* GUC and
>> just enabling that behavior all the time.  I lost that fight,
>> but it should have been clear that a GUC of this exact shape
>> is a design dead end --- and that's what we're seeing now.
> 
> Yeah, I guess I have to agree with that.

Ah, I have missed this argument.

> So the question before us right now is whether there's a palatable
> alternative to completely ripping out a feature that both you and I
> seem to agree does something useful. I don't think we necessarily need
> to leap to the conclusion that a revert is radically less risky than
> some other alternative. Now, if there's not some obvious alternative
> upon which we can (mostly) all agree, then maybe that's where we end
> up. But I would like us to be looking to save the features we can.

Removing this GUC and making the backend react by default the same way
as when this GUC was enabled sounds like a sensible route here.  This
stuff is useful.
--
Michael




Re: Direct SSL connection with ALPN and HBA rules

2024-04-19 Thread Heikki Linnakangas

On 19/04/2024 19:48, Jacob Champion wrote:

On Fri, Apr 19, 2024 at 6:56 AM Heikki Linnakangas  wrote:

With direct SSL negotiation, we always require ALPN.


  (As an aside: I haven't gotten to test the version of the patch that
made it into 17 yet, but from a quick glance it looks like we're not
rejecting mismatched ALPN during the handshake as noted in [1].)


Ah, good catch, that fell through the cracks. Agreed, the client should 
reject a direct SSL connection if the server didn't send ALPN. I'll add 
that to the Open Items so we don't forget again.



I don't see direct SSL negotiation as a security feature.


`direct` mode is not, since it's opportunistic, but I think people are
going to use `requiredirect` as a security feature. At least, I was
hoping to do that myself...


Rather, the
point is to reduce connection latency by saving one round-trip. For
example, if gssencmode=prefer, but the server refuses GSS encryption, it
seems fine to continue with negotiated SSL, instead of establishing a
new connection with direct SSL.


Well, assuming the user is okay with plaintext negotiation at all.
(Was that fixed before the patch went in? Is GSS negotiation still
allowed even with requiredirect?)


To disable sending the startup packet in plaintext, you need to use 
sslmode=require. Same as before the patch. GSS is still allowed, as it 
takes precedence over SSL if both are enabled in libpq. Per the docs:



Note that if gssencmode is set to prefer, a GSS connection is
attempted first. If the server rejects GSS encryption, SSL is
negotiated over the same TCP connection using the traditional
postgres protocol, regardless of sslnegotiation. In other words, the
direct SSL handshake is not used, if a TCP connection has already
been established and can be used for the SSL handshake.
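The precedence described in that passage can be modeled as a tiny decision function.  This is a simplified illustration, not libpq's actual logic; the parameter values ('prefer', 'require', 'disable', 'direct', 'postgres') are assumed from the documented option names:

```python
def encryption_attempts(gssencmode, sslnegotiation):
    """Order of encryption attempts, per the quoted docs (simplified
    model, not libpq code).  GSS takes precedence over SSL when both are
    enabled; after a GSS rejection the TCP connection already exists, so
    SSL falls back to the traditional negotiated handshake regardless of
    sslnegotiation."""
    if gssencmode in ('prefer', 'require'):
        attempts = ['gss']
        if gssencmode == 'prefer':
            # Server rejected GSS: negotiate SSL on the same connection.
            attempts.append('negotiated-ssl')
        return attempts
    if sslnegotiation == 'direct':
        return ['direct-ssl']
    return ['negotiated-ssl']

# Even with direct SSL requested, a refused GSS attempt leads to
# negotiated SSL on the already-established connection:
print(encryption_attempts('prefer', 'direct'))   # ['gss', 'negotiated-ssl']
print(encryption_attempts('disable', 'direct'))  # ['direct-ssl']
```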




What would be the use case of requiring
direct SSL in the server? What extra security do you get?


You get protection against attacks that could have otherwise happened
during the plaintext portion of the handshake. That has architectural
implications for more advanced uses of SCRAM, and it should prevent
any repeats of CVE-2021-23222/23214. And if the peer doesn't pass the
TLS handshake, they can't send you anything that you might forget is
untrusted (like, say, an error message).


Can you elaborate on the more advanced uses of SCRAM?


Controlling these in HBA is a bit inconvenient, because you only find
out after authentication if it's allowed or not. So if e.g. direct SSL
connections are disabled for a user,


Hopefully disabling direct SSL piecemeal is not a desired use case?
I'm not sure it makes sense to focus on that. Forcing it to be enabled
shouldn't have the same problem, should it?


Forcing it to be enabled piecemeal based on role or database has similar 
problems. Forcing it enabled for all connections seems sensible, though. 
Forcing it enabled based on the client's source IP address, but not 
user/database would be somewhat sensible too, but we don't currently 
have the HBA code to check the source IP and accept/reject SSLRequest 
based on that. The HBA rejection always happens after the client has 
sent the startup packet.


--
Heikki Linnakangas
Neon (https://neon.tech)





Re: fix tablespace handling in pg_combinebackup

2024-04-19 Thread Robert Haas
On Fri, Apr 19, 2024 at 4:18 PM Tom Lane  wrote:
> Robert Haas  writes:
> > On Fri, Apr 19, 2024 at 3:31 PM Tom Lane  wrote:
> >> That would be a reasonable answer if we deem the problem to be
> >> just "the buildfarm is unhappy".  What I'm wondering about is
> >> whether the feature will be useful to end users with this
> >> pathname length restriction.
>
> > Possibly you're getting a little too enthusiastic about these revert
> > requests, because I'd say it's at least a decade too late to get rid
> > of pg_basebackup.
>
> I misunderstood the context then.  I thought you had just added
> support for tablespaces in this area.  If pg_basebackup has been
> choking on overly-long tablespace symlinks this whole time, then
> the lack of field complaints suggests it's not such a common
> case after all.

No, the commit that caused all this was a 1-line code change. It was a
pretty stupid mistake which I would have avoided if I'd had proper
test case coverage for it, but I didn't do that originally. I think my
underlying reason for not doing the work was that I feared it would be
hard to test in a way that was stable. But, the existence of a bug
obviously proved that the test cases were needed. As expected,
however, that was hard to do without breaking things.

If you look at the error message you sent me, you can see that while
it's a pg_combinebackup test that is failing, the actual failing
program is pg_basebackup.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: allow changing autovacuum_max_workers without restarting

2024-04-19 Thread Nathan Bossart
On Fri, Apr 19, 2024 at 02:42:13PM -0400, Robert Haas wrote:
> I think this could help a bunch of users, but I'd still like to
> complain, not so much with the desire to kill this patch as with the
> desire to broaden the conversation.

I think I subconsciously hoped this would spark a bigger discussion...

> Now, before this patch, there is a fairly good reason for that, which
> is that we need to reserve shared memory resources for each autovacuum
> worker that might potentially run, and the system can't know how much
> shared memory you'd like to reserve for that purpose. But if that were
> the only problem, then this patch would probably just be proposing to
> crank up the default value of that parameter rather than introducing a
> second one. I bet Nathan isn't proposing that because his intuition is
> that it will work out badly, and I think he's right. I bet that
> cranking up the number of allowed workers will often result in running
> more workers than we really should. One possible negative consequence
> is that we'll end up with multiple processes fighting over the disk in
> a situation where they should just take turns. I suspect there are
> also ways that we can be harmed - in broadly similar fashion - by cost
> balancing.

Even if we were content to bump up the default value of
autovacuum_max_workers and tell folks to just mess with the cost settings,
there are still probably many cases where bumping up the number of workers
further would be necessary.  If you have a zillion tables, turning
cost-based vacuuming off completely may be insufficient to keep up, at
which point your options become limited.  It can be difficult to tell
whether you might end up in this situation over time as your workload
evolves.  In any case, it's not clear to me that bumping up the default
value of autovacuum_max_workers would do more good than harm.  I get the
idea that the default of 3 is sufficient for a lot of clusters, so there'd
really be little upside to changing it AFAICT.  (I guess this proves your
point about my intuition.)

> So I feel like what this proposal reveals is that we know that our
> algorithm for ramping up the number of running workers doesn't really
> work. And maybe that's just a consequence of the general problem that
> we have no global information about how much vacuuming work there is
> to be done at any given time, and therefore we cannot take any kind of
> sensible guess about whether 1 more worker will help or hurt. Or,
> maybe there's some way to do better than what we do today without a
> big rewrite. I'm not sure. I don't think this patch should be burdened
> with solving the general problem here. But I do think the general
> problem is worth some discussion.

I certainly don't want to hold up $SUBJECT for a larger rewrite of
autovacuum scheduling, but I also don't want to shy away from a larger
rewrite if it's an idea whose time has come.  I'm looking forward to
hearing your ideas in your pgconf.dev talk.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: fix tablespace handling in pg_combinebackup

2024-04-19 Thread Tom Lane
Robert Haas  writes:
> On Fri, Apr 19, 2024 at 3:31 PM Tom Lane  wrote:
>> That would be a reasonable answer if we deem the problem to be
>> just "the buildfarm is unhappy".  What I'm wondering about is
>> whether the feature will be useful to end users with this
>> pathname length restriction.

> Possibly you're getting a little too enthusiastic about these revert
> requests, because I'd say it's at least a decade too late to get rid
> of pg_basebackup.

I misunderstood the context then.  I thought you had just added
support for tablespaces in this area.  If pg_basebackup has been
choking on overly-long tablespace symlinks this whole time, then
the lack of field complaints suggests it's not such a common
case after all.

regards, tom lane




Re: Support a wildcard in backtrace_functions

2024-04-19 Thread Robert Haas
On Fri, Apr 19, 2024 at 3:24 PM Tom Lane  wrote:
> I can't say that I care for "backtrace_on_internal_error".
> Re-reading that thread, I see I argued for having *no* GUC and
> just enabling that behavior all the time.  I lost that fight,
> but it should have been clear that a GUC of this exact shape
> is a design dead end --- and that's what we're seeing now.

Yeah, I guess I have to agree with that.

> > But on the other hand, in my personal experience,
> > backtrace_on_internal_error would be the right thing in a really large
> > number of cases.
>
> That's why I thought we could get away with doing it automatically.
> Sure, more control would be better.  But if we just hard-wire it for
> the moment then we aren't locking in what the eventual design for
> that control will be.

So the question before us right now is whether there's a palatable
alternative to completely ripping out a feature that both you and I
seem to agree does something useful. I don't think we necessarily need
to leap to the conclusion that a revert is radically less risky than
some other alternative. Now, if there's not some obvious alternative
upon which we can (mostly) all agree, then maybe that's where we end
up. But I would like us to be looking to save the features we can.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: postgres_fdw fails because GMT != UTC

2024-04-19 Thread Tom Lane
Etsuro Fujita  writes:
> On Thu, Apr 4, 2024 at 3:49 PM Laurenz Albe  wrote:
>> On Thu, 2024-04-04 at 02:19 -0400, Tom Lane wrote:
>>> I am not quite clear on how broken an installation needs to be to
>>> reject "UTC" as a time zone setting, except that the breakage cannot
>>> be subtle.  However, I notice that our code in pgtz.c and other
>>> places treats "GMT" as a hard-wired special case ... but not "UTC".
>>> I wonder if we ought to modify those places to force "UTC" down the
>>> same hard-wired paths.  If we acted like that, this would have worked
>>> no matter how misconfigured the installation was.
>>> 
>>> An alternative answer could be to change postgres_fdw to send "GMT"
>>> not "UTC".  That's ugly from a standards-compliance viewpoint, but
>>> it would fix this problem even with a non-updated remote server,
>>> and I think postgres_fdw is generally intended to work with even
>>> very old remote servers.
>>> 
>>> Or we could do both.

> +1 for both (assuming that the latter does not make the postgres_fdw
> code complicated).

I looked briefly at changing the server like this, and decided that
it would be a little invasive, if only because there would be
documentation and such to update.  Example question: should we change
the boot-time default value of the timezone GUC from "GMT" to "UTC"?
Probably, but I doubt we want to back-patch that, nor does it seem
like something to be messing with post-feature-freeze.  So I'm
in favor of working on that when the tree opens for v18, but not
right now.

However, we can change postgres_fdw at basically no cost AFAICS.
That's the more important part anyway I think.  If your own server
burps because it's got a bad timezone database, you are probably in
a position to do something about that, while you may have no control
over a remote server.  (As indeed the original complainant didn't.)
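
The asymmetry can be sketched as a hypothetical session on a server whose
timezone database is missing or broken (exact error text may vary by version):

```sql
-- 'GMT' is hard-wired by pg_tzset() in pgtz.c and never consults the tz
-- database; 'UTC' does consult it, and fails if it's missing or broken.
SET timezone = 'UTC';   -- may fail: invalid value for parameter "TimeZone": "UTC"
SET timezone = 'GMT';   -- accepted regardless of the tz database's state
```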

So I propose to apply and back-patch the attached, and leave
it at that for now.

regards, tom lane

diff --git a/contrib/postgres_fdw/connection.c b/contrib/postgres_fdw/connection.c
index 603e043af4..33e8054f64 100644
--- a/contrib/postgres_fdw/connection.c
+++ b/contrib/postgres_fdw/connection.c
@@ -671,10 +671,12 @@ configure_remote_session(PGconn *conn)
 	 * anyway.  However it makes the regression test outputs more predictable.
 	 *
 	 * We don't risk setting remote zone equal to ours, since the remote
-	 * server might use a different timezone database.  Instead, use UTC
-	 * (quoted, because very old servers are picky about case).
+	 * server might use a different timezone database.  Instead, use GMT
+	 * (quoted, because very old servers are picky about case).  That's
+	 * guaranteed to work regardless of the remote's timezone database,
+	 * because pg_tzset() hard-wires it (at least in PG 9.2 and later).
 	 */
-	do_sql_command(conn, "SET timezone = 'UTC'");
+	do_sql_command(conn, "SET timezone = 'GMT'");
 
 	/*
 	 * Set values needed to ensure unambiguous data output from remote.  (This


Re: fix tablespace handling in pg_combinebackup

2024-04-19 Thread Robert Haas
On Fri, Apr 19, 2024 at 3:31 PM Tom Lane  wrote:
> That would be a reasonable answer if we deem the problem to be
> just "the buildfarm is unhappy".  What I'm wondering about is
> whether the feature will be useful to end users with this
> pathname length restriction.

Possibly you're getting a little too enthusiastic about these revert
requests, because I'd say it's at least a decade too late to get rid
of pg_basebackup.

As discussed elsewhere, I do rather hope that pg_combinebackup will
eventually know how to operate on tar files as well, but right now it
doesn't.

--
Robert Haas
EDB: http://www.enterprisedb.com




Re: fix tablespace handling in pg_combinebackup

2024-04-19 Thread Tom Lane
Robert Haas  writes:
> On Fri, Apr 19, 2024 at 2:44 PM Tom Lane  wrote:
>> This patch failed to survive contact with the buildfarm.  It looks
>> like the animals that are unhappy are choking like this:
>> pg_basebackup: error: backup failed: ERROR:  symbolic link target too long 
>> for tar format: file name "pg_tblspc/16415", target 
>> "/home/bf/bf-build/olingo/HEAD/pgsql.build/testrun/pg_combinebackup/002_compare_backups/data/tmp_test_bV72/ts"

> Ah, crap. That sucks. As far as I've been able to find, we have no
> code in the tree that knows how to generate symlinks longer than 99
> characters (see tarCreateHeader). I can search around and see if I can
> find something else out there on the Internet.

wikipedia has some useful info:

https://en.wikipedia.org/wiki/Tar_(computing)#POSIX.1-2001/pax

However, post-feature-freeze is not the time to be messing with
implementing pax.  Should we revert handling of tablespaces in this
program for now?
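For what it's worth, the 100-byte ustar linkname limit and the pax escape
hatch are easy to demonstrate with Python's tarfile module (illustration
only, not PostgreSQL code):

```python
import io
import tarfile

# A symlink entry whose target exceeds ustar's 100-byte linkname field.
info = tarfile.TarInfo(name="pg_tblspc/16415")
info.type = tarfile.SYMTYPE
info.linkname = "/very/long/prefix" * 7  # 119 characters

# Plain ustar (the format tarCreateHeader emits) cannot represent it.
ustar_ok = True
try:
    with tarfile.open(fileobj=io.BytesIO(), mode="w",
                      format=tarfile.USTAR_FORMAT) as tf:
        tf.addfile(info)
except ValueError:
    ustar_ok = False   # "linkname is too long"

# POSIX.1-2001 pax stores the long target in an extended header instead.
pax_buf = io.BytesIO()
with tarfile.open(fileobj=pax_buf, mode="w",
                  format=tarfile.PAX_FORMAT) as tf:
    tf.addfile(info)
```

Reading the pax archive back returns the full 119-character target; GNU
tar's older @LongLink extension achieves the same thing in a non-POSIX way.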

> But I think in 010_pg_basebackup.pl we actually work harder to avoid
> the problem than I had realized:
> ...
> Maybe I need to clone that workaround here.

That would be a reasonable answer if we deem the problem to be
just "the buildfarm is unhappy".  What I'm wondering about is
whether the feature will be useful to end users with this
pathname length restriction.

regards, tom lane




Re: Support a wildcard in backtrace_functions

2024-04-19 Thread Tom Lane
Robert Haas  writes:
> On Fri, Apr 19, 2024 at 2:08 PM Tom Lane  wrote:
>> I've not been following this thread, so I don't have an opinion
>> about what the design ought to be.  But if we still aren't settled
>> on it by now, I think the prudent thing is to revert the feature
>> out of v17 altogether, and try again in v18.  When we're still
>> designing something after feature freeze, that is a good indication
>> that we are trying to ship something that's not ready for prime time.

> There are two features at issue here. One is
> backtrace_on_internal_error, committed as
> a740b213d4b4d3360ad0cac696e47e5ec0eb8864. The other is a feature to
> produce backtraces for all errors, which was originally proposed as
> backtrace_functions='*', backtrace_functions_level=ERROR but which has
> subsequently been proposed with a few other spellings that involve
> merging that functionality into backtrace_on_internal_error. To the
> extent that there's an open question here for PG17, it's not about
> reverting this patch (which AIUI has never been committed) but about
> reverting the earlier patch for backtrace_on_internal_error. So is
> that the right thing to do?

I can't say that I care for "backtrace_on_internal_error".
Re-reading that thread, I see I argued for having *no* GUC and
just enabling that behavior all the time.  I lost that fight,
but it should have been clear that a GUC of this exact shape
is a design dead end --- and that's what we're seeing now.

> Well, on the one hand, I confess to having had a passing thought that
> backtrace_on_internal_error is awfully specific. Surely, user A's
> criterion for which messages should have backtraces might be anything,
> and we cannot reasonably add backtrace_on_$WHATEVER for all $WHATEVER
> in some large set. And the debate here suggests that I wasn't the only
> one to have that concern. So that argues for a revert.

Exactly.

> But on the other hand, in my personal experience,
> backtrace_on_internal_error would be the right thing in a really large
> number of cases.

That's why I thought we could get away with doing it automatically.
Sure, more control would be better.  But if we just hard-wire it for
the moment then we aren't locking in what the eventual design for
that control will be.

regards, tom lane




Re: fix tablespace handling in pg_combinebackup

2024-04-19 Thread Robert Haas
On Fri, Apr 19, 2024 at 2:44 PM Tom Lane  wrote:
> Robert Haas  writes:
> > On Thu, Apr 18, 2024 at 1:45 PM Andres Freund  wrote:
> >> Just to be clear: I don't want the above to block merging your test. If you
> >> think you want to do it the way you did, please do.
>
> > I think I will go ahead and do that soon, then.
>
> This patch failed to survive contact with the buildfarm.  It looks
> like the animals that are unhappy are choking like this:
>
> pg_basebackup: error: backup failed: ERROR:  symbolic link target too long 
> for tar format: file name "pg_tblspc/16415", target 
> "/home/bf/bf-build/olingo/HEAD/pgsql.build/testrun/pg_combinebackup/002_compare_backups/data/tmp_test_bV72/ts"
>
> So whether it works depends on how long the path to the animal's build
> root is.
>
> This is not good at all, because if the buildfarm is hitting this
> limit then actual users are likely to hit it as well.  But doesn't
> POSIX define some way to get longer symlink paths into tar format?
> (If not POSIX, I bet there's a widely-accepted GNU extension.)

Ah, crap. That sucks. As far as I've been able to find, we have no
code in the tree that knows how to generate symlinks longer than 99
characters (see tarCreateHeader). I can search around and see if I can
find something else out there on the Internet.

I feel like this is not a new problem but one I've had to dodge
before. In fact, I think I had to dodge it locally when developing
this patch. I believe that in various test cases, we rely on the fact
that PostgreSQL::Test::Utils::tempdir() produces pathnames that tend
to be shorter than the ones you get if you generate a path using
$node->backup_dir or $node->basedir or whatever. And I think if I
don't do it that way, it fails even locally on my machine. But, I
thought we were using that workaround elsewhere successfully, so I
expected it to be OK.

But I think in 010_pg_basebackup.pl we actually work harder to avoid
the problem than I had realized:

my $sys_tempdir = PostgreSQL::Test::Utils::tempdir_short;
...
# Test backup of a tablespace using tar format.
# Symlink the system located tempdir to our physical temp location.
# That way we can use shorter names for the tablespace directories,
# which hopefully won't run afoul of the 99 character length limit.
my $real_sys_tempdir = "$sys_tempdir/tempdir";
dir_symlink "$tempdir", $real_sys_tempdir;

mkdir "$tempdir/tblspc1";
my $realTsDir = "$real_sys_tempdir/tblspc1";

Maybe I need to clone that workaround here.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Support a wildcard in backtrace_functions

2024-04-19 Thread Robert Haas
On Fri, Apr 19, 2024 at 2:08 PM Tom Lane  wrote:
> Robert Haas  writes:
> > There are some things that are pretty hard to change once we've
> > released them. For example, if we had a function or operator and
> > somebody embeds it in a view definition, removing or renaming it
> > prevents people from upgrading. But GUCs are not as bad.
>
> Really?  Are we certain that nobody will put SETs of this GUC
> into their applications, or even just activate it via ALTER DATABASE
> SET?  If they've done that, then a GUC change means dump/reload/upgrade
> failure.

That's fair. That feature isn't super-widely used, but it wouldn't be
crazy for someone to use it with this feature, either.

> I've not been following this thread, so I don't have an opinion
> about what the design ought to be.  But if we still aren't settled
> on it by now, I think the prudent thing is to revert the feature
> out of v17 altogether, and try again in v18.  When we're still
> designing something after feature freeze, that is a good indication
> that we are trying to ship something that's not ready for prime time.

There are two features at issue here. One is
backtrace_on_internal_error, committed as
a740b213d4b4d3360ad0cac696e47e5ec0eb8864. The other is a feature to
produce backtraces for all errors, which was originally proposed as
backtrace_functions='*', backtrace_functions_level=ERROR but which has
subsequently been proposed with a few other spellings that involve
merging that functionality into backtrace_on_internal_error. To the
extent that there's an open question here for PG17, it's not about
reverting this patch (which AIUI has never been committed) but about
reverting the earlier patch for backtrace_on_internal_error. So is
that the right thing to do?

Well, on the one hand, I confess to having had a passing thought that
backtrace_on_internal_error is awfully specific. Surely, user A's
criterion for which messages should have backtraces might be anything,
and we cannot reasonably add backtrace_on_$WHATEVER for all $WHATEVER
in some large set. And the debate here suggests that I wasn't the only
one to have that concern. So that argues for a revert.

But on the other hand, in my personal experience,
backtrace_on_internal_error would be the right thing in a really large
number of cases. I was disappointed to see it added as a developer
option with GUC_NOT_IN_SAMPLE. My vote would have been to put it in
postgresql.conf and enable it by default. We have a somewhat debatable
habit of using the exact same message in many places with similar
kinds of problems, and when a production system manages to hit one of
those errors, figuring out what actually went wrong is hard. In fact,
it's often hard even when the error text only occurs in one or two
places, because it's often some very low-level part of the code where
you can't get enough context to understand the problem without knowing
where that code got called from. So I sort of hate to see one of the
most useful extensions of backtrace functionality that I can
personally imagine get pulled back out of the tree because it turns
out that someone else has something else that they want.

I wonder whether a practical solution here might be to replace
backtrace_on_internal_error=true|false with
backtrace_on_error=true|internal|false. (Yes, I know that more
proposed resolutions is not necessarily what we need right now, but I
can't resist floating the idea.)

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: fix tablespace handling in pg_combinebackup

2024-04-19 Thread Tom Lane
Robert Haas  writes:
> On Thu, Apr 18, 2024 at 1:45 PM Andres Freund  wrote:
>> Just to be clear: I don't want the above to block merging your test. If you
>> think you want to do it the way you did, please do.

> I think I will go ahead and do that soon, then.

This patch failed to survive contact with the buildfarm.  It looks
like the animals that are unhappy are choking like this:

pg_basebackup: error: backup failed: ERROR:  symbolic link target too long for 
tar format: file name "pg_tblspc/16415", target 
"/home/bf/bf-build/olingo/HEAD/pgsql.build/testrun/pg_combinebackup/002_compare_backups/data/tmp_test_bV72/ts"

So whether it works depends on how long the path to the animal's build
root is.

This is not good at all, because if the buildfarm is hitting this
limit then actual users are likely to hit it as well.  But doesn't
POSIX define some way to get longer symlink paths into tar format?
(If not POSIX, I bet there's a widely-accepted GNU extension.)

regards, tom lane




Re: allow changing autovacuum_max_workers without restarting

2024-04-19 Thread Robert Haas
On Fri, Apr 19, 2024 at 11:43 AM Nathan Bossart
 wrote:
> Removed in v2.  I also noticed that I forgot to update the part about when
> autovacuum_max_workers can be changed.  *facepalm*

I think this could help a bunch of users, but I'd still like to
complain, not so much with the desire to kill this patch as with the
desire to broaden the conversation.

Part of the underlying problem here is that, AFAIK, neither PostgreSQL
as a piece of software nor we as human beings who operate PostgreSQL
databases have much understanding of how autovacuum_max_workers should
be set. It's relatively easy to hose yourself by raising
autovacuum_max_workers to try to make things go faster, but produce
the exact opposite effect due to how the cost balancing stuff works.
But, even if you have the correct use case for autovacuum_max_workers,
something like a few large tables that take a long time to vacuum plus
a bunch of smaller ones that can't get starved just because the big
tables are in the midst of being processed, you might well ask
yourself why it's your job to figure out the correct number of
workers.
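As a rough illustration of the cost-balancing trap (a deliberate
simplification of what autovacuum.c actually does: the global cost limit is
rebalanced across running workers, so adding workers mostly shrinks each
worker's share rather than raising total throughput):

```python
# Simplified model: autovacuum_vacuum_cost_limit is an I/O budget shared by
# all active workers; per-table settings and sleeping workers are ignored.
def per_worker_cost_limit(vacuum_cost_limit: int, active_workers: int) -> float:
    return vacuum_cost_limit / active_workers

for n in (1, 3, 10):
    share = per_worker_cost_limit(200, n)
    print(f"{n:2d} workers -> {share:6.1f} cost units each, 200 total")
```

Total throughput stays pinned at the global limit, which is why raising
autovacuum_max_workers alone can make each big-table vacuum slower.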

Now, before this patch, there is a fairly good reason for that, which
is that we need to reserve shared memory resources for each autovacuum
worker that might potentially run, and the system can't know how much
shared memory you'd like to reserve for that purpose. But if that were
the only problem, then this patch would probably just be proposing to
crank up the default value of that parameter rather than introducing a
second one. I bet Nathan isn't proposing that because his intuition is
that it will work out badly, and I think he's right. I bet that
cranking up the number of allowed workers will often result in running
more workers than we really should. One possible negative consequence
is that we'll end up with multiple processes fighting over the disk in
a situation where they should just take turns. I suspect there are
also ways that we can be harmed - in broadly similar fashion - by cost
balancing.

So I feel like what this proposal reveals is that we know that our
algorithm for ramping up the number of running workers doesn't really
work. And maybe that's just a consequence of the general problem that
we have no global information about how much vacuuming work there is
to be done at any given time, and therefore we cannot take any kind of
sensible guess about whether 1 more worker will help or hurt. Or,
maybe there's some way to do better than what we do today without a
big rewrite. I'm not sure. I don't think this patch should be burdened
with solving the general problem here. But I do think the general
problem is worth some discussion.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Support a wildcard in backtrace_functions

2024-04-19 Thread Tom Lane
Robert Haas  writes:
> There are some things that are pretty hard to change once we've
> released them. For example, if we had a function or operator and
> somebody embeds it in a view definition, removing or renaming it
> prevents people from upgrading. But GUCs are not as bad.

Really?  Are we certain that nobody will put SETs of this GUC
into their applications, or even just activate it via ALTER DATABASE
SET?  If they've done that, then a GUC change means dump/reload/upgrade
failure.
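
A concrete (hypothetical) version of that failure mode:

```sql
-- Attach the GUC to a database; pg_dump will now capture the setting.
ALTER DATABASE appdb SET backtrace_on_internal_error = on;

-- The dump replays it verbatim:
--   ALTER DATABASE appdb SET backtrace_on_internal_error TO 'on';
-- On a server where the GUC was renamed or removed, the restore stops with
-- something like:
--   ERROR:  unrecognized configuration parameter "backtrace_on_internal_error"
```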

I've not been following this thread, so I don't have an opinion
about what the design ought to be.  But if we still aren't settled
on it by now, I think the prudent thing is to revert the feature
out of v17 altogether, and try again in v18.  When we're still
designing something after feature freeze, that is a good indication
that we are trying to ship something that's not ready for prime time.

regards, tom lane




Re: WIP Incremental JSON Parser

2024-04-19 Thread Robert Haas
On Fri, Apr 19, 2024 at 1:35 AM Michael Paquier  wrote:
> Are you sure that relying on Temp::File is a good thing overall?  The
> current temporary file knowledge is encapsulated within Utils.pm, with
> files removed or kept depending on PG_TEST_NOCLEAN.  So it would be
> just more consistent to rely on the existing facilities instead?
> test_json_parser is the only code path in the whole tree that directly
> uses File::Temp.  The rest of the TAP tests relies on Utils.pm for
> temp file paths.

Yeah, I think this patch invented a new solution to a problem that
we've solved in a different way everywhere else. I think we should
change it to match what we do in general.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




subscription/026_stats test is intermittently slow?

2024-04-19 Thread Robert Haas
I just did a run of the regression test where this test was the last
one to finish by quite a lot. Key log entries:

[13:35:48.583](0.039s) # initializing database system by copying initdb template
...
[13:35:52.397](0.108s) ok 5 - Check reset timestamp for
'test_tab1_sub' is newer after second reset.

 Begin standard error

psql::1: NOTICE:  created replication slot "test_tab2_sub" on publisher

 End standard error

Waiting for replication conn test_tab2_sub's replay_lsn to pass
0/151E8C8 on publisher

done

[13:38:53.706](181.310s) ok 6 - Check that table 'test_tab2' now has 1 row.
...
[13:38:54.344](0.294s) 1..13

I reran the test and it looks very different:

[13:54:01.703](0.090s) ok 5 - Check reset timestamp for
'test_tab1_sub' is newer after second reset.
...
Waiting for replication conn test_tab2_sub's replay_lsn to pass
0/151E900 on publisher
...
[13:54:03.006](1.303s) ok 6 - Check that table 'test_tab2' now has 1 row.

It looks to me like in the first run it took 3 minutes for the
replay_lsn to catch up to the desired value, and in the second run,
two seconds. I think I have seen previous instances where something
similar happened, although in those cases I did not stop to record any
details. Have others seen this? Is there something we can/should do
about it?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Support event trigger for logoff

2024-04-19 Thread Tom Lane
Japin Li  writes:
> I see that [1] has already implemented the login event trigger, so why not
> implement a logoff event trigger?

What happens if the session crashes, or otherwise fails unexpectedly?

> There is a problem with the regression test when using \c to create a new
> session: the old and new sessions might run concurrently, which can make
> the checks unstable.

> Any thoughts?

Let's not go there at all.  I don't think there's enough field
demand to justify dealing with this concept.

regards, tom lane




Re: Support a wildcard in backtrace_functions

2024-04-19 Thread Robert Haas
On Fri, Apr 19, 2024 at 7:31 AM Jelte Fennema-Nio  wrote:
> While there maybe isn't consensus on what a new design exactly looks
> like, I do feel like there's consensus on this thread that the
> backtrace_on_internal_error GUC is almost certainly not the design
> that we want. I guess a more conservative approach to this problem
> would be to revert the backtrace_on_internal_error commit and agree on
> a better design for PG18. But I didn't think that would be necessary
> if we could agree on the name for a more flexibly named GUC, which
> seemed quite possible to me (after a little bit of bikeshedding).

I think Peter is correct that this presupposes we more or less agree
on the final destination. For example, I think that log_backtrace =
error | internal | all is a bit odd; why not backtrace_errcodes =
? I've written a logging hook for
EDB that can filter out error messages by error code, so I don't think
it's at all a stretch to think that someone might want to do something
similar here. I agree that it's somewhat likely that the name we want
going forward isn't backtrace_on_internal_error, but I don't think we
know that completely for certain, or what new name we necessarily
want.

> I think releasing a GUC (even if it's just meant for developers) in
> PG17 and then deprecating it for a newer version in PG18 wouldn't be a
> great look. And even if that's not a huge problem, it still seems
> better not to have the problem at all. Renaming the GUC now seems to
> have only upsides to me: worst case the new design turns out not to be
> what we want either, and we have to deprecate the GUC. But in the best
> case we don't need to deprecate anything.

There are some things that are pretty hard to change once we've
released them. For example, if we had a function or operator and
somebody embeds it in a view definition, removing or renaming it
prevents people from upgrading. But GUCs are not as bad. If you don't
port your postgresql.conf forward to the new version, or if you
haven't uncommented this particular setting, then there's no issue at
all, and when there is a problem, removing a GUC setting from
postgresql.conf is pretty straightforward compared to getting some
construct out of your application code. I agree it's not amazing if we
end up changing this exactly one release after it was introduced, but
we don't really know that we're going to change it next release, or at
all, and even if we do, I still don't think that's a catastrophe.

I'm not completely certain that "let's just leave this alone for right
now" is the correct conclusion, but the fact that we might need to
rename a GUC down the road is not, by itself, a critical flaw in the
status quo.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Direct SSL connection with ALPN and HBA rules

2024-04-19 Thread Jacob Champion
On Fri, Apr 19, 2024 at 6:56 AM Heikki Linnakangas  wrote:
> On 19/04/2024 08:06, Michael Paquier wrote:
> > Since 91044ae4baea (require ALPN for direct SSL connections) and
> > d39a49c1e459 (direct hanshake), direct SSL connections are supported
> > (yeah!), still the thread where this has been discussed does not cover
> > the potential impact on HBA rules:
> > https://www.postgresql.org/message-id/CAM-w4HOEAzxyY01ZKOj-iq%3DM4-VDk%3DvzQgUsuqiTFjFDZaebdg%40mail.gmail.com
> >
> > My point is, would there be a point in being able to enforce that ALPN
> > is used from the server rather than just relying on the client-side
> > sslnegotiation to decide if direct SSL connections should be forced or
> > not?

I'm a little confused about whether we're talking about requiring ALPN
or requiring direct connections. I think you mean the latter, Michael?

Personally, I was hoping that we'd have a postgresql.conf option to
reject every network connection that wasn't direct SSL, but I ran out
of time to review the patchset for 17. I would like to see server-side
enforcement of direct SSL in some way, eventually. I hadn't given much
thought to HBA, though.

> I don't think ALPN gives any meaningful security benefit, when used with
> the traditional 'postgres' SSL negotiation. There's little possibility
> of mixing that up with some other protocol, so I don't see any need to
> enforce it from server side. This was briefly discussed on that original
> thread [1].

Agreed. By the time you've issued a traditional SSL startup packet,
and the server responds with a go-ahead, it's pretty much understood
what protocol is in use.

> With direct SSL negotiation, we always require ALPN.

 (As an aside: I haven't gotten to test the version of the patch that
made it into 17 yet, but from a quick glance it looks like we're not
rejecting mismatched ALPN during the handshake as noted in [1].)

> I don't see direct SSL negotiation as a security feature.

`direct` mode is not, since it's opportunistic, but I think people are
going to use `requiredirect` as a security feature. At least, I was
hoping to do that myself...

> Rather, the
> point is to reduce connection latency by saving one round-trip. For
> example, if gssencmode=prefer, but the server refuses GSS encryption, it
> seems fine to continue with negotiated SSL, instead of establishing a
> new connection with direct SSL.

Well, assuming the user is okay with plaintext negotiation at all.
(Was that fixed before the patch went in? Is GSS negotiation still
allowed even with requiredirect?)

> What would be the use case of requiring
> direct SSL in the server? What extra security do you get?

You get protection against attacks that could have otherwise happened
during the plaintext portion of the handshake. That has architectural
implications for more advanced uses of SCRAM, and it should prevent
any repeats of CVE-2021-23222/23214. And if the peer doesn't pass the
TLS handshake, they can't send you anything that you might forget is
untrusted (like, say, an error message).

> Controlling these in HBA is a bit inconvenient, because you only find
> out after authentication if it's allowed or not. So if e.g. direct SSL
> connections are disabled for a user,

Hopefully disabling direct SSL piecemeal is not a desired use case?
I'm not sure it makes sense to focus on that. Forcing it to be enabled
shouldn't have the same problem, should it?

--Jacob

[1] 
https://www.postgresql.org/message-id/CAOYmi%2B%3DcnV-8V8TndSkEF6Htqa7qHQUL_KnQU8-DrT0Jjnm3_Q%40mail.gmail.com




Re: Support a wildcard in backtrace_functions

2024-04-19 Thread Robert Haas
On Thu, Apr 18, 2024 at 3:02 AM Michael Paquier  wrote:
> As this is an open item, let's move on here.
>
> Attached is a proposal of patch for this open item, switching
> backtrace_on_internal_error to backtrace_mode with two values:
> - "none", to log no backtraces.
> - "internal", to log backtraces for internal errors.
>
> The rest of the proposals had better happen as a v18 discussion, where
> extending this GUC is a benefit.

-1. That's just weird. There's no reason to replace a Boolean with a
non-Boolean without adding a third value. Either we decide this
concern is important enough to justify a post-feature-freeze design
change, and add the third value now, or we leave it alone and revisit
it in a future release. I'm probably OK with either one, but being
halfway in between has no appeal for me.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg_combinebackup does not detect missing files

2024-04-19 Thread Robert Haas
On Thu, Apr 18, 2024 at 7:36 PM David Steele  wrote:
> Sure -- pg_combinebackup would only need to verify the data that it
> uses. I'm not suggesting that it should do an exhaustive verify of every
> single backup in the chain. Though I can see how it sounded that way
> since with pg_verifybackup that would pretty much be your only choice.
>
> The beauty of doing verification in pg_combinebackup is that it can do a
> lot less than running pg_verifybackup against every backup but still get
> a valid result. All we care about is that the output is correct -- if
> there is corruption in an unused part of an earlier backup
> pg_combinebackup doesn't need to care about that.
>
> As far as I can see, pg_combinebackup already checks most of the boxes.
> The only thing I know that it can't do is detect missing files and that
> doesn't seem like too big a thing to handle.

Hmm, that's an interesting perspective. I've always been very
skeptical of doing verification only around missing files and not
anything else. I figured that wouldn't be particularly meaningful, and
that's pretty much the only kind of validation that's even
theoretically possible without a bunch of extra overhead, since we
compute checksums on entire files rather than, say, individual blocks.
And you could really only do it for the final backup in the chain,
because you should end up accessing all of those files, but the same
is not true for the predecessor backups. So it's a very weak form of
verification.

But I looked into it and I think you're correct that, if you restrict
the scope in the way that you suggest, we can do it without much
additional code, or much additional run-time. The cost is basically
that, instead of only looking for a backup_manifest entry when we
think we can reuse its checksum, we need to do a lookup for every
single file in the final input directory. Then, after processing all
such files, we need to iterate over the hash table one more time and
see what files were never touched. That seems like an acceptably low
cost to me. So, here's a patch.
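The shape of that check, sketched in Python rather than the C of the actual
patch (the JSON layout matches pg_basebackup's backup_manifest; the helper
name here is invented):

```python
import json
import os

def files_missing_from_backup(backup_dir: str) -> list[str]:
    """List manifest entries with no corresponding file on disk."""
    with open(os.path.join(backup_dir, "backup_manifest")) as f:
        manifest = json.load(f)
    expected = {entry["Path"] for entry in manifest["Files"]}

    present = set()
    for root, _, names in os.walk(backup_dir):
        for name in names:
            rel = os.path.relpath(os.path.join(root, name), backup_dir)
            present.add(rel.replace(os.sep, "/"))

    # Anything listed in the manifest but never seen on disk is missing.
    return sorted(expected - present)
```

Per Robert's description, the committed approach instead reuses the manifest
hash table it already builds for checksum lookups and flags the entries that
were never touched, so it avoids a second directory walk.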

I do think there's some chance that this will encourage people to
believe that pg_combinebackup is better at finding problems than it
really is or ever will be, and I also question whether it's right to
keep changing stuff after feature freeze. But I have a feeling most
people here are going to think this is worth including in 17. Let's
see what others say.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


v1-0001-pg_combinebackup-Detect-missing-files-when-possib.patch
Description: Binary data


Support event trigger for logoff

2024-04-19 Thread Japin Li

Hi, hackers

I see that [1] has already implemented the login event trigger, so why not
implement a logoff event trigger?

My friend Song Jinzhou and I have tried to implement the logoff event
trigger; the patch is attached.

There is a problem with the regression test when using \c to create a new
session: the old and new sessions might run concurrently, which can make
the checks unstable.

Any thoughts?
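For concreteness, usage with the attached patch would presumably mirror the
existing login event trigger (the function and trigger names below are just
for illustration; the "logoff" event is proposed syntax, not part of released
PostgreSQL):

```sql
-- Hypothetical sketch assuming the proposed "logoff" event.
CREATE FUNCTION log_logoff() RETURNS event_trigger
LANGUAGE plpgsql AS $$
BEGIN
    RAISE LOG 'user % disconnected from %', session_user, current_database();
END;
$$;

CREATE EVENT TRIGGER on_logoff ON logoff
    EXECUTE FUNCTION log_logoff();
```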

[1] 
https://www.postgresql.org/message-id/0d46d29f-4558-3af9-9c85-7774e14a7709%40postgrespro.ru

--
Regards,
Japin Li

From ef7c6aa408d0a6a97d7b0d2e093b71e279b1b0dc Mon Sep 17 00:00:00 2001
From: Japin Li 
Date: Mon, 15 Apr 2024 09:11:41 +0800
Subject: [PATCH v1 1/1] Add support event triggers for logoff

---
 doc/src/sgml/catalogs.sgml|  13 ++
 doc/src/sgml/ecpg.sgml|   2 +
 doc/src/sgml/event-trigger.sgml   |   5 +
 src/backend/commands/dbcommands.c |  17 +-
 src/backend/commands/event_trigger.c  | 181 +-
 src/backend/tcop/postgres.c   |   8 +
 src/backend/utils/cache/evtcache.c|   2 +
 src/backend/utils/init/globals.c  |   1 +
 src/backend/utils/init/postinit.c |   1 +
 src/bin/psql/tab-complete.c   |   2 +-
 src/include/catalog/pg_database.dat   |   2 +-
 src/include/catalog/pg_database.h |   3 +
 src/include/commands/event_trigger.h  |   1 +
 src/include/miscadmin.h   |   1 +
 src/include/tcop/cmdtaglist.h |   1 +
 src/include/utils/evtcache.h  |   1 +
 .../regress/expected/event_trigger_logoff.out |  49 +
 src/test/regress/parallel_schedule|   2 +
 src/test/regress/sql/event_trigger_logoff.sql |  26 +++
 19 files changed, 309 insertions(+), 9 deletions(-)
 create mode 100644 src/test/regress/expected/event_trigger_logoff.out
 create mode 100644 src/test/regress/sql/event_trigger_logoff.sql

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 2907079e2a..621fbfde98 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -3076,6 +3076,19 @@ SCRAM-SHA-256$iteration count:
   
  
 
+ 
+  
+   dathaslogoffevt bool
+  
+  
+Indicates that there are logoff event triggers defined for this database.
+This flag is used to avoid extra lookups on the
+pg_event_trigger table during each backend
+startup.  This flag is used internally by PostgreSQL
+and should not be manually altered or read for monitoring purposes.
+  
+ 
+
  
   
datconnlimit int4
diff --git a/doc/src/sgml/ecpg.sgml b/doc/src/sgml/ecpg.sgml
index e7a53f3c9d..d223843157 100644
--- a/doc/src/sgml/ecpg.sgml
+++ b/doc/src/sgml/ecpg.sgml
@@ -4765,6 +4765,7 @@ encoding = 0 (type: 5)
 datistemplate = t (type: 1)
 datallowconn = t (type: 1)
 dathasloginevt = f (type: 1)
+dathaslogoffevt = f (type: 1)
 datconnlimit = -1 (type: 5)
 datfrozenxid = 379 (type: 1)
 dattablespace = 1663 (type: 1)
@@ -4790,6 +4791,7 @@ encoding = 0 (type: 5)
 datistemplate = f (type: 1)
 datallowconn = t (type: 1)
 dathasloginevt = f (type: 1)
+dathaslogoffevt = f (type: 1)
 datconnlimit = -1 (type: 5)
 datfrozenxid = 379 (type: 1)
 dattablespace = 1663 (type: 1)
diff --git a/doc/src/sgml/event-trigger.sgml b/doc/src/sgml/event-trigger.sgml
index 8e009cca05..ed996a456e 100644
--- a/doc/src/sgml/event-trigger.sgml
+++ b/doc/src/sgml/event-trigger.sgml
@@ -29,6 +29,7 @@
  occurs in the database in which it is defined. Currently, the only
  supported events are
  login,
+ logoff,
  ddl_command_start,
  ddl_command_end,
  table_rewrite
@@ -54,6 +55,10 @@
  the in-progress login trigger.

 
+   
+ The logoff event occurs when a user logs off the system.
+   
+

  The ddl_command_start event occurs just before the
  execution of a CREATE, ALTER, DROP,
diff --git a/src/backend/commands/dbcommands.c b/src/backend/commands/dbcommands.c
index 8229dfa1f2..a6784a0ea6 100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -116,7 +116,7 @@ static void movedb_failure_callback(int code, Datum arg);
 static bool get_db_info(const char *name, LOCKMODE lockmode,
 		Oid *dbIdP, Oid *ownerIdP,
 		int *encodingP, bool *dbIsTemplateP, bool *dbAllowConnP, bool *dbHasLoginEvtP,
-		TransactionId *dbFrozenXidP, MultiXactId *dbMinMultiP,
+		bool *dbHasLogoffEvtP, TransactionId *dbFrozenXidP, MultiXactId *dbMinMultiP,
 		Oid *dbTablespace, char **dbCollate, char **dbCtype, char **dbLocale,
 		char **dbIcurules,
 		char *dbLocProvider,
@@ -680,6 +680,7 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt)
 	char	   *src_collversion = NULL;
 	bool		src_istemplate;
 	bool		src_hasloginevt = false;
+	bool		src_haslogoffevt = false;
 	bool		src_allowconn;
 	TransactionId src_frozenxid = InvalidTransactionId;
 	MultiXactId 

Re: allow changing autovacuum_max_workers without restarting

2024-04-19 Thread Nathan Bossart
On Thu, Apr 18, 2024 at 05:05:03AM +, Imseih (AWS), Sami wrote:
> I looked at the patch set. With the help of DEBUG2 output, I tested to ensure
> that the autovacuum_cost_limit balance adjusts correctly when the
> autovacuum_max_workers value increases/decreases. I did not think the
> patch would break this behavior, but it's important to verify this.

Great.

> 1. A nit. There should be a tab here.
> 
> -   dlist_head  av_freeWorkers;
> +   dclist_head av_freeWorkers;

I dare not argue with pgindent.

> 2. autovacuum_max_worker_slots documentation:
> 
> +   
> +Note that the value of  
> is
> +silently capped to this value.
> +   
> 
> This comment looks redundant in the docs, since the entry 
> for autovacuum_max_workers that follows mentions the
> same.

Removed in v2.  I also noticed that I forgot to update the part about when
autovacuum_max_workers can be changed.  *facepalm*

> 3. The docs for autovacuum_max_workers should mention that when
> the value changes, consider adjusting the autovacuum_cost_limit/cost_delay. 
> 
> This is not something new. Even in the current state, users should think 
> about 
> these settings. However, it seems even more important if this value is to be 
> dynamically adjusted.

I don't necessarily disagree that it might be worth mentioning these
parameters, but I would argue that this should be proposed in a separate
thread.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
From 466f31a23605755a8e3d17c362c9f4940bf91da0 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Sat, 13 Apr 2024 15:00:08 -0500
Subject: [PATCH v2 1/4] Rename autovacuum_max_workers to
 autovacuum_max_worker_slots.

---
 doc/src/sgml/config.sgml |  8 
 doc/src/sgml/maintenance.sgml|  4 ++--
 doc/src/sgml/runtime.sgml| 12 ++--
 src/backend/access/transam/xlog.c|  2 +-
 src/backend/postmaster/autovacuum.c  |  8 
 src/backend/postmaster/postmaster.c  |  2 +-
 src/backend/storage/lmgr/proc.c  |  6 +++---
 src/backend/utils/init/postinit.c| 12 ++--
 src/backend/utils/misc/guc_tables.c  |  6 +++---
 src/backend/utils/misc/postgresql.conf.sample|  2 +-
 src/include/postmaster/autovacuum.h  |  2 +-
 src/include/utils/guc_hooks.h|  4 ++--
 .../modules/xid_wraparound/t/001_emergency_vacuum.pl |  2 +-
 src/test/modules/xid_wraparound/t/003_wraparounds.pl |  2 +-
 14 files changed, 36 insertions(+), 36 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d8e1282e12..b4d67a93b6 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1914,7 +1914,7 @@ include_dir 'conf.d'


 Note that when autovacuum runs, up to
-       <xref linkend="guc-autovacuum-max-workers"/> times this memory
+       <xref linkend="guc-autovacuum-max-worker-slots"/> times this memory
 may be allocated, so be careful not to set the default value
 too high.  It may be useful to control for this by separately
 setting .
@@ -8534,10 +8534,10 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
   
  
 
- 
-  autovacuum_max_workers (integer)
+ 
+  autovacuum_max_worker_slots (integer)
   
-   autovacuum_max_workers configuration parameter
+   autovacuum_max_worker_slots configuration parameter
   
   
   
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 0be90bdc7e..5373acba41 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -864,9 +864,9 @@ HINT:  Execute a database-wide VACUUM in that database.
 seconds.  (Therefore, if the installation has N databases,
 a new worker will be launched every
 autovacuum_naptime/N seconds.)
-    A maximum of <xref linkend="guc-autovacuum-max-workers"/> worker processes
+    A maximum of <xref linkend="guc-autovacuum-max-worker-slots"/> worker processes
 are allowed to run at the same time. If there are more than
-autovacuum_max_workers databases to be processed,
+autovacuum_max_worker_slots databases to be processed,
 the next database will be processed as soon as the first worker finishes.
 Each worker process will check each table within its database and
 execute VACUUM and/or ANALYZE as needed.
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index 6047b8171d..26a02034c8 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -781,13 +781,13 @@ psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such

 SEMMNI
 Maximum number of semaphore identifiers (i.e., sets)
-at least ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 5) / 16) plus room for other applications
+at least ceil((max_connections + autovacuum_max_worker_slots + max_wal_senders + max_worker_processes + 5) / 16) plus 

Re: ECPG cleanup and fix for clang compile-time problem

2024-04-19 Thread Tom Lane
One other bit of randomness that I noticed: ecpg's parse.pl has
this undocumented bit of logic:

if ($a eq 'IDENT' && $prior eq '%nonassoc')
{

# add more tokens to the list
$str = $str . "\n%nonassoc CSTRING";
}

The net effect of that is that, where gram.y writes

%nonassoc   UNBOUNDED NESTED    /* ideally would have same precedence as IDENT */
%nonassoc   IDENT PARTITION RANGE ROWS GROUPS PRECEDING FOLLOWING CUBE ROLLUP
            SET KEYS OBJECT_P SCALAR VALUE_P WITH WITHOUT PATH
%left       Op OPERATOR         /* multi-character ops and user-defined operators */

preproc.c has

 %nonassoc UNBOUNDED NESTED
 %nonassoc IDENT
%nonassoc CSTRING PARTITION RANGE ROWS GROUPS PRECEDING FOLLOWING CUBE ROLLUP
 SET KEYS OBJECT_P SCALAR VALUE_P WITH WITHOUT PATH
 %left Op OPERATOR

If you don't find that scary as heck, I suggest reading the very long
comment just in front of the cited lines of gram.y.  The argument why
assigning these keywords a precedence at all is OK depends heavily
on it being the same precedence as IDENT, yet here's ECPG randomly
breaking that.

We seem to have avoided problems though, because if I fix things
by manually editing preproc.y to re-join the lines:

 %nonassoc IDENT CSTRING PARTITION RANGE ROWS GROUPS PRECEDING FOLLOWING CUBE ROLLUP

the generated preproc.c doesn't change at all.  Actually, I can
take CSTRING out of this list altogether and it still doesn't
change the results ... although looking at how CSTRING is used,
it looks safer to give it the same precedence as IDENT.

I think we should change parse.pl to give one or the other of these
results before something more serious breaks there.

regards, tom lane




Re: Idea Feedback: psql \h misses -> Offers Links?

2024-04-19 Thread Euler Taveira
On Wed, Apr 17, 2024, at 2:47 PM, Kirk Wolak wrote:
>   I often use the ctrl-click on the link after getting help in psql.  A great 
> feature.
> 
> Challenge, when there is no help, you don't get any link.
> 
>   My thought process is to add a default response that would take them to
> https://www.postgresql.org/search/?u=%2Fdocs%2F16%2F={TOKEN} 
> 
> 
> *Example:*
> \h current_setting
> No help available for "current_setting".
> Try \h with no arguments to see available help.

That's because current_setting is a function. Help says:

postgres=# \?
.
.
.
Help
  \? [commands]  show help on backslash commands
  \? options show help on psql command-line options
  \? variables   show help on special variables
  \h [NAME]  help on syntax of SQL commands, * for all commands

It is just for SQL commands.

> https://www.postgresql.org/search/?u=%2Fdocs%2F16%2F=current_setting
> 
>   To me, this is a huge step in helping me get to the docs.
> 
> This is Question 1: Do others see the potential value here?

Yes. However, I expect an exact and direct answer. There will be cases where the
first result is not the one you are looking for. (You are expecting the
function or parameter description, but another page is at the top because it is
more relevant.) The referred URL does not point you to the direct link.
Instead, you have to click again to be able to check the content.

> Question 2: What if we allowed the users to set some extra link Templates 
> using \pset??
> 
> \pset help_assist_link_1 =  https://www.google.com/search?q={token} 
> '
> \pset help_assist_link_2 = 
> 'https://wiki.postgresql.org/index.php?title=Special%3ASearch={token}=Go
>  
> '

That's a different idea. Are you proposing to provide URLs if this psql
variable is set and it doesn't find an entry (say \h foo)? I'm not sure if it
is a good idea to allow third-party URLs (even if it is configurable).

IMO we should expand \h to list documentation references for functions and GUCs
using SGML files. We already did it for SQL commands. Another broader idea is
to build an inverted index similar to what Index [1] provides. The main problem
with this approach is to create a dependency between documentation build and
psql. Maybe there is a reasonable way to obtain the links for each term.


[1] https://www.postgresql.org/docs/current/bookindex.html


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: brininsert optimization opportunity

2024-04-19 Thread Tomas Vondra
On 4/19/24 00:13, Tomas Vondra wrote:
> ...
> 
> It's a bit too late for me to push this now, I'll do so early tomorrow.
> 

FWIW I've pushed both patches, which resolves the open item, so I've
moved it to the "resolved" section on the wiki.


thanks

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Possible to exclude a database from loading a shared_preload_libraries module?

2024-04-19 Thread Tom Lane
Rajan Pandey  writes:
> Hi, I have a use case where I want a particular database to not load a few
> modules in shared_preload_libraries.
> I was wondering if there's any way to tweak the codebase to achieve this.

No.  It's not even theoretically possible, because the whole point of
shared_preload_libraries is that the module gets loaded at postmaster
start.

Depending on what the module does, it might work to load it
per-session with session_preload_libraries, which you could attach
to the specific database(s) where you want it to be effective.
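For example, a minimal sketch (auto_explain is just an illustrative module
name; this only works for modules that support per-session loading):

```sql
-- Load the module only for sessions on the databases where it is wanted:
ALTER DATABASE appdb SET session_preload_libraries = 'auto_explain';
-- Other databases leave the setting empty, so the library is never
-- loaded there.
```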

> Otherwise, can I block the modules' hooks/bgw from performing actions on my
> particular database?

That would be a feature for the individual module to implement.

regards, tom lane




Re: Direct SSL connection with ALPN and HBA rules

2024-04-19 Thread Heikki Linnakangas

On 19/04/2024 08:06, Michael Paquier wrote:

> Hi all,
> (Heikki in CC.)


(Adding Jacob)


> Since 91044ae4baea (require ALPN for direct SSL connections) and
> d39a49c1e459 (direct handshake), direct SSL connections are supported
> (yeah!), still the thread where this has been discussed does not cover
> the potential impact on HBA rules:
> https://www.postgresql.org/message-id/CAM-w4HOEAzxyY01ZKOj-iq%3DM4-VDk%3DvzQgUsuqiTFjFDZaebdg%40mail.gmail.com
>
> My point is, would there be a point in being able to enforce that ALPN
> is used from the server rather than just relying on the client-side
> sslnegotiation to decide if direct SSL connections should be forced or
> not?
>
> Hence, I'd imagine that we could have an HBA option for hostssl rules,
> like a negotiation={direct,postgres,all} that cross-checks
> Port->alpn_used with the option value in a hostssl entry, rejecting
> the use of connections using direct connections or the default
> protocol if these are not used, giving users a way to avoid one.  As
> this is a new thing, there may be an argument in this option for
> security reasons, as well, so as it would be possible for operators to
> turn that off in the server.


I don't think ALPN gives any meaningful security benefit, when used with 
the traditional 'postgres' SSL negotiation. There's little possibility 
of mixing that up with some other protocol, so I don't see any need to 
enforce it from server side. This was briefly discussed on that original 
thread [1]. With direct SSL negotiation, we always require ALPN.


I don't see direct SSL negotiation as a security feature. Rather, the 
point is to reduce connection latency by saving one round-trip. For 
example, if gssencmode=prefer, but the server refuses GSS encryption, it 
seems fine to continue with negotiated SSL, instead of establishing a 
new connection with direct SSL. What would be the use case of requiring 
direct SSL in the server? What extra security do you get?


Controlling these in HBA is a bit inconvenient, because you only find 
out after authentication if it's allowed or not. So if e.g. direct SSL 
connections are disabled for a user, the client would still establish a 
direct SSL connection, send the startup packet, and only then get 
rejected. The client would not know if it was rejected because of the 
direct SSL or for some reason, so it needs to retry with negotiated SSL. 
Currently, as it is master, if the TLS layer is established with direct 
SSL, you never need to retry with traditional negotiation, or vice versa.


[1] 
https://www.postgresql.org/message-id/CAAWbhmjetCVgu9pHJFkQ4ejuXuaz2mD1oniXokRHft0usCa7Yg%40mail.gmail.com


--
Heikki Linnakangas
Neon (https://neon.tech)





Re: [PATCH] Avoid mixing custom and OpenSSL BIO functions

2024-04-19 Thread Daniel Gustafsson
> On 19 Apr 2024, at 15:13, David Benjamin  wrote:
> 
> Circling back here, anything else needed from my end on this patch?

Patience mostly; v17 very recently entered feature freeze, so a lot of the
developers are busy finding and fixing bugs to stabilize the release.  When the
window for new features opens again I am sure this patch will get reviews (I
have it on my personal TODO and hope to get to it).

--
Daniel Gustafsson





Re: [PATCH] Avoid mixing custom and OpenSSL BIO functions

2024-04-19 Thread David Benjamin
Circling back here, anything else needed from my end on this patch?

On Wed, Feb 21, 2024, 17:04 David Benjamin  wrote:

> Thanks for the very thorough comments! I've attached a new version of the
> patch.
>
> On Thu, Feb 15, 2024 at 4:17 PM Andres Freund  wrote:
>
>> Hi,
>>
>> On 2024-02-11 13:19:00 -0500, David Benjamin wrote:
>> > I've attached a patch for the master branch to fix up the custom BIOs
>> used
>> > by PostgreSQL, in light of the issues with the OpenSSL update recently.
>> > While c82207a548db47623a2bfa2447babdaa630302b9 (switching from
>> BIO_get_data
>> > to BIO_get_app_data) resolved the immediate conflict, I don't think it
>> > addressed the root cause, which is that PostgreSQL was mixing up two BIO
>> > implementations, each of which expected to be driving the BIO.
>>
>> Yea, that's certainly not nice - and I think we've been somewhat lucky it
>> hasn't caused more issues. There's some nasty possibilities, e.g.
>> sock_ctrl()
>> partially enabling ktls without our initialization having called
>> ktls_enable(). Right now that just means ktls is unusable, but it's not
>> hard
>> to imagine accidentally ending up sending unencrypted data.
>>
>
> Yeah. Even if, say, the ktls bits work, given you all care enough about
> I/O to have wanted to wrap the BIO, I assume you'd want to pick up those
> features on your own terms, e.g. by implementing the BIO_CTRLs yourself.
>
>
>> I've in the past looked into not using a custom BIO [1], but I have my
>> doubts
>> about that being a good idea. I think medium term we want to be able to do
>> network IO asynchronously, which seems quite hard to do when using
>> openssl's
>> socket BIO.
>
>
>
> > Once we've done that, we're free to use BIO_set_data. While
>> BIO_set_app_data
>> > works fine, I've reverted back to BIO_set_data because it's more
>> commonly used.
>> > app_data depends on OpenSSL's "ex_data" mechanism, which is a tad
>> heavier under
>> > the hood.
>>
>> At first I was a bit wary of that, because it requires us to bring back
>> the
>> fallback implementation. But you're right, it's noticeably heavier than
>> BIO_get_data(), and we do call BIO_get_app_data() fairly frequently.
>>
>
> TBH, I doubt it makes any real difference perf-wise. But I think
> BIO_get_data is a bit more common in this context.
>
>
>> > That leaves ctrl. ctrl is a bunch of operations (it's ioctl). The only
>> > operation that matters is BIO_CTRL_FLUSH, which is implemented as a
>> no-op. All
>> > other operations are unused. It's once again good that they're unused
>> because
>> > otherwise OpenSSL might mess with postgres's socket, break the
>> assumptions
>> > around interrupt handling, etc.
>>
>> How did you determine that only FLUSH is required? I didn't even really
>> find
>> documentation about what the intended semantics actually are.
>>
>
> The unhelpful answer is that my day job is working on BoringSSL, so I've
> spent a lot of time with this mess. :-) But, yeah, it's not well-documented
> at all. OpenSSL ends up calling BIO_flush at the end of each batch of
> writes in libssl. TBH, I suspect that was less intentional and more an
> emergent property of them internally layering a buffer BIO at one point in
> the process, but it's long been part of the (undocumented) API contract.
> Conversely, I don't think OpenSSL can possibly make libssl *require* a
> new BIO_CTRL because they'd break every custom BIO anyone has ever used
> with the library.
>
>
>> E.g. should we implement BIO_CTRL_EOF? Sure, it wasn't really supported so
>> far, because we never set it, but is that right?
>
>
> Ah hmm, BIO_CTRL_EOF is... a bit of a mess. OpenSSL kind of messed things
> up. So, up until recently, I would have said that BIO_CTRL_EOF was not part
> of the interface here. OpenSSL 1.0.x did not implement it for sockets, and
> the BIO_read API *already* had a way to signal EOF: did you return zero
> or -1?
>
> Then OpenSSL 1.1.x introduced size_t-based BIO_read_ex APIs. However, in
> the process, they *forgot that EOF and error are different things* and
> made it impossible to detect EOFs if you use BIO_read_ex! They never
> noticed this, because they didn't actually convert their own code to their
> new API. See this discussion, which alas ended with OpenSSL deciding to
> ignore the problem and not even document their current interface.
> https://github.com/openssl/openssl/issues/8208
>
> Though they never responded, they seem to have tacitly settled using the
> out-of-band BIO_eof function (which is BIO_CTRL_EOF) as the way to signal
> EOF for BIO_read_ex. This is kind of fiddly, but is at least a well-defined
> option. But the problem is no one's BIO_METHODs, including their own, are
> read_ex-based. They all implement the old read callback. But someone might
> call BIO_read_ex on a read-based BIO_METHOD. IMO, BIO_read_ex should be
> responsible for translating between the two EOF conventions. For example,
> it could automatically set a flag when the read callback returns 0 

Re: Okay to remove mention of mystery @ and ~ operators?

2024-04-19 Thread Daniel Gustafsson
> On 19 Apr 2024, at 13:49, Daniel Gustafsson  wrote:
>> On 19 Apr 2024, at 12:31, Aleksander Alekseev  
>> wrote:

>> Here is the patch.
> 
> Nice catch, and thanks for the patch.  I'll apply it with a backpatch to when
> they were removed.

Done, thanks for the report and the patch!

--
Daniel Gustafsson





Possible to exclude a database from loading a shared_preload_libraries module?

2024-04-19 Thread Rajan Pandey
Hi, I have a use case where I want a particular database to not load a few
modules in shared_preload_libraries.
I was wondering if there's any way to tweak the codebase to achieve this.
Otherwise, can I block the modules' hooks/bgw from performing actions on my
particular database?

-- 
Regards
Rajan Pandey


Re: Okay to remove mention of mystery @ and ~ operators?

2024-04-19 Thread Daniel Gustafsson
> On 19 Apr 2024, at 12:31, Aleksander Alekseev  
> wrote:
> 
> Hi,
> 
>>> This page says that the `@` and `~` operators on various types can be 
>>> accelerated by a GiST index.
>>> 
>>> https://www.postgresql.org/docs/current/gist-builtin-opclasses.html
>>> 
>>> These operators have been listed in the file since it was created in 2014, 
>>> but if they exist then I don't know how to use them or what they do.
>>> 
>>> Code examples, for clarity:
>>> 
>>>> select box '(0,0),(1, 1)' ~ box '(2,2),(3,3)';
>>> operator does not exist: box ~ box
>>> 
>>>> select box '(0,0),(1, 1)' @ box '(2,2),(3,3)';
>>> operator does not exist: box @ box
>>> 
>>> If they're a typo or some removed thing then I'd like to remove them from 
>>> the page. This email is me asking to find out if I'm wrong about that 
>>> before I try to submit a patch (also very happy for someone with a 
>>> committer bit to just fix this).
>> 
>> Indeed, there is no @(box,box) or ~(box,box) in the \dAo output. These
>> operators were removed by 2f70fdb0644c back in 2020.
>> 
>> I will submit a patch for the documentation shortly. Thanks for reporting.
> 
> Here is the patch.

Nice catch, and thanks for the patch.  I'll apply it with a backpatch to when
they were removed.
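For anyone who lands here with old queries: the removed geometric containment
operators were superseded by the <@ and @> spellings, e.g.:

```sql
-- The removed "@"/"~" operators correspond to these supported ones:
SELECT box '(0,0),(1,1)' <@ box '(2,2),(3,3)';  -- "is contained in"
SELECT box '(2,2),(3,3)' @> box '(0,0),(1,1)';  -- "contains"
```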

--
Daniel Gustafsson





Re: documentation structure

2024-04-19 Thread jian he
On Wed, Apr 17, 2024 at 7:07 PM Dagfinn Ilmari Mannsåker
 wrote:
>
>
> > It'd also be quite useful if clients could render more of the documentation
> > for functions. People are used to language servers providing full
> > documentation for functions etc...
>
> A more user-friendly version of \df+ (maybe spelled \hf, for symmetry
> with \h for commands?) would certainly be nice.
>

I think `\hf` would be useful.
Otherwise people first need to google to find the function's HTML page,
then use Ctrl+F to locate the specific function entry.

For \hf
we may need to offer a doc URL link,
but currently many functions are not directly linkable in the docs.
Also, one page can contain many sections.
I guess just linking directly to a nearby position in an HTML page
should be fine.


We could also add URL anchors for individual functions, like MySQL does
(https://dev.mysql.com/doc/refman/8.3/en/string-functions.html#function_concat).
I am not sure it is an elegant solution.




Re: UniqueKey v2

2024-04-19 Thread Antonin Houska
zhihuifan1...@163.com wrote:

> Here is the v3, ...

I'm trying to enhance the join removal functionality (will post my patch in a
separate thread soon) and I consider your patch very helpful in this
area.

Following is my review. Attached are also some fixes and one enhancement:
propagation of the unique keys (UK) from a subquery to the parent query
(0004). (Note that 0001 is just your patch rebased).


* I think that, before EC is considered suitable for an UK, its ec_opfamilies
  field needs to be checked. I try to do that in
  find_ec_position_matching_expr(), see 0004.


* Set-returning functions (SRFs) can make a "distinct path" necessary even if
  the join output is unique.
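  A minimal illustration of that point (a sketch; table and column names are
  made up):

```sql
-- t.id is unique, yet an SRF in the target list can still produce
-- duplicate output rows, so the DISTINCT step cannot be elided:
CREATE TEMP TABLE t (id int PRIMARY KEY);
INSERT INTO t VALUES (1);
SELECT DISTINCT id, generate_series(1, 2) / 3 AS g FROM t;
-- without DISTINCT: (1,0) and (1,0); with DISTINCT: a single row
```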


* RelOptInfo.notnullattrs

  My understanding is that this field contains the set of attributes whose
  uniqueness is guaranteed by the unique key. They are acceptable because they
  are either 1) not allowed to be NULL due to NOT NULL constraint or 2) NULL
  value makes the containing row filtered out, so the row cannot break
  uniqueness of the output. Am I right?

  If so, I suggest to change the field name to something less generic, maybe
  'uniquekey_attrs' or 'uniquekey_candidate_attrs', and adding a comment that
  more checks are needed before particular attribute can actually be used in
  UniqueKey.


* add_uniquekey_for_uniqueindex()

  I'd appreciate an explanation why the "single-row UK" is created. I think
  the reason for unique_exprs==NIL is that a restriction clause VAR=CONST
  exists for each column of the unique index. Whether I'm right or not, a
  comment should state clearly what the reason is.


* uniquekey_useful_for_merging()

  How does uniqueness relate to merge join? In README.uniquekey you seem to
  point out that a single row is always sorted, but I don't think this
  function is related to that fact. (Instead, I'd expect that pathkeys are
  added to all paths for a single-row relation, but I'm not sure you do that
  in the current version of the patch.)


* is_uniquekey_useful_afterjoin()

  Now that my patch (0004) allows propagation of the unique keys from a
  subquery to the upper query, I was wondering if the UniqueKey structure
  needs the 'use_for_distinct' field. I mean we should now propagate the unique
  keys to the parent query whether the subquery has DISTINCT clause or not. I
  noticed that the field is checked by is_uniquekey_useful_afterjoin(), so I
  changed the function to always returned true. However nothing changed in the
  output of regression tests (installcheck). Do you insist that the
  'use_for_distinct' field is needed?


* uniquekey_contains_multinulls()

  ** Instead of calling the function when trying to use the UK, how about
 checking the ECs when considering creation of the UK? If the tests fail,
 just don't create the UK.

  ** What does the 'multi' word in the function name mean?


* relation_is_distinct_for()

  The function name is too similar to rel_is_distinct_for(). I think the name
  should indicate that you are checking the relation against a set of
  pathkeys. Maybe rel_is_distinct_for_pathkeys() (and remove 'distinct' from
  the argument name)? At the same time, it might be good to rename
  rel_is_distinct_for() to rel_is_distinct_for_clauses().


* uniquekey_contains_in()

  Shouldn't this be uniquekey_contained_in()? And likewise, shouldn't the
  comment be " ... if UniqueKey is contained in the list of EquivalenceClass"
  ?

  (In general, even though I'm not a native English speaker, I think I see quite
  a few grammar issues, which often make reading the comments/documentation
  a bit difficult.)


* Combining the UKs

  IMO this is the most problematic part of the patch. You call
  populate_joinrel_uniquekeys() for the same join multiple times, each time
  with a different 'restrictlist', and you try to do two things at the same
  time: 1) combine the UKs of the input relations into the UKs of the join
  relation, 2) check if the join relation can be marked single-row.

  I think that both 1) and 2) should be independent from join order, and thus
  both computations should only take place once for given set of input
  relations. And I think they should be done separately:

  1) Compute the join UKs

  As you admit in a comment in populate_joinrel_uniquekeys(), neither join
  method nor clauses should matter. So I think you only need to pick the
  "component UKs" (i.e. UKs of the input relations) which are usable above
  that join (i.e. neither the join itself nor any join below sets any column
  of the UK to NULL) and combine them.

  Of course one problem is that the number of combinations can grow
  exponentially as new relations are joined. I'm not sure it's necessary to
  combine the UKs (and to discard some of them) immediately. Instead, maybe we
  can keep lists of UKs only for base relations, and postpone picking the
  suitable UKs and combining them until we actually need to check the relation
  uniqueness.

  2) Check if the join relation 

Re: Add SPLIT PARTITION/MERGE PARTITIONS commands

2024-04-19 Thread Justin Pryzby
On Thu, Apr 11, 2024 at 10:20:53PM -0400, Robert Haas wrote:
> On Thu, Apr 11, 2024 at 9:54 PM Alexander Korotkov  
> wrote:
> > I think we shouldn't unconditionally copy schema name and
> > relpersistence from the parent table.  Instead we should throw the
> > error on a mismatch like CREATE TABLE ... PARTITION OF ... does.  I'm
> > working on revising this fix.
> 
> We definitely shouldn't copy the schema name from the parent table. It
> should be possible to schema-qualify the new partition names, and if
> you don't, then the search_path should determine where they get
> placed.

+1.  Alexander Lakhin reported an issue with schemas and SPLIT, and I
noticed an issue with schemas with MERGE.  The issue I hit occurs
when MERGE'ing into a partition with the same name, and it's fixed like
so:

--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -21526,8 +21526,7 @@ ATExecMergePartitions(List **wqueue, AlteredTableInfo *tab, Relation rel,
 	{
 		/* Create partition table with generated temporary name. */
 		sprintf(tmpRelName, "merge-%u-%X-tmp", RelationGetRelid(rel), MyProcPid);
-		mergePartName = makeRangeVar(get_namespace_name(RelationGetNamespace(rel)),
-									 tmpRelName, -1);
+		mergePartName = makeRangeVar(mergePartName->schemaname, tmpRelName, -1);
 	}
 	createPartitionTable(mergePartName,
 						 makeRangeVar(get_namespace_name(RelationGetNamespace(rel)),

> One of the things I dislike about this type of feature -- not this
> implementation specifically, but just this kind of idea in general --
> is that the syntax mentions a whole bunch of tables but in a way where
> you can't set their properties. Persistence, reloptions, whatever.
> There's just no place to mention any of that stuff - and if you wanted
> to create a place, you'd have to invent special syntax for each
> separate thing. That's why I think it's good that the normal way of
> creating a partition is CREATE TABLE .. PARTITION OF. Because that
> way, we know that the full power of the CREATE TABLE statement is
> always available, and you can set anything that you could set for a
> table that is not a partition.

Right.  The current feature is useful and will probably work for 90% of
people's partitioned tables.

Currently, CREATE TABLE .. PARTITION OF does not create stats objects on
the child table, but MERGE PARTITIONS does, which seems strange.
Maybe stats should not be included on the new child?

Note that stats on the parent table are not analogous to indexes -
partitioned indexes do nothing other than cause indexes to be created on
any new/attached partitions.  But stats objects on the parent 1) cause
extended stats to be collected and computed across the whole partition
hierarchy, and 2) do not cause stats to be computed for the individual
partitions.

Partitions can have different column definitions, for example null
constraints, FKs, defaults.  And currently, if you MERGE partitions,
those will all be lost (or rather, replaced by whatever LIKE parent
gives).  I think that's totally fine - anyone using different defaults
on child tables could either not use MERGE PARTITIONS, or fix up the
defaults afterwards.  There's not much confusion that the details of the
differences between individual partitions will be lost when the
individual partitions are merged and no longer exist.
But I think it'd be useful to document how the new partitions will be
constructed.

-- 
Justin




Re: Support a wildcard in backtrace_functions

2024-04-19 Thread Jelte Fennema-Nio
On Fri, 19 Apr 2024 at 10:58, Peter Eisentraut  wrote:
> This presupposes that there is consensus about how the future
> functionality should look like.  This topic has gone through half a
> dozen designs over a few months, and I think it would be premature to
> randomly freeze that discussion now and backport that design.

While there maybe isn't consensus on what a new design exactly looks
like, I do feel like there's consensus on this thread that the
backtrace_on_internal_error GUC is almost certainly not the design
that we want. I guess a more conservative approach to this problem
would be to revert the backtrace_on_internal_error commit and agree on
a better design for PG18. But I didn't think that would be necessary
if we could agree on the name for a more flexibly named GUC, which
seemed quite possible to me (after a little bit of bikeshedding).

> If a better, more comprehensive design arises for PG18, I think it would
> be pretty easy to either remove backtrace_on_internal_error or just
> internally remap it.

I think releasing a GUC (even if it's just meant for developers) in
PG17 and then deprecating it for a newer version in PG18 wouldn't be a
great look. And even if that's not a huge problem, it still seems
better not to have the problem at all. Renaming the GUC now seems to
have only upsides to me: worst case the new design turns out not to be
what we want either, and we have to deprecate the GUC. But in the best
case we don't need to deprecate anything.




Re: AIX support

2024-04-19 Thread Sriram RK
For any compiler/hardware-related issue, we should be able to provide support.
We are in the process of identifying the AIX systems that can be added to the 
CI/buildfarm environment.

Regards,
Sriram.


Re: POC: GROUP BY optimization

2024-04-19 Thread jian he
On Thu, Apr 18, 2024 at 6:58 PM Alexander Korotkov  wrote:
>
> Thank you for the fixes you've proposed.  I didn't look much into
> details yet, but I think the main concern Tom expressed in [1] is
> whether the feature is reasonable at all.  I think at this stage the
> most important thing is to come up with convincing examples showing
> how huge performance benefits it could cause.  I will return to this
> later today and will try to provide some convincing examples.
>

hi.
I found a case where it improved performance.

+-- GROUP BY optimization by reorder columns
+CREATE TABLE btg AS SELECT
+ i % 100 AS x,
+ i % 100 AS y,
+ 'abc' || i % 10 AS z,
+ i AS w
+FROM generate_series(1,1) AS i;
+CREATE INDEX abc ON btg(x,y);
+ANALYZE btg;
+
I changed
+FROM generate_series(1,1) AS i;
to
+ FROM generate_series(1, 1e6) AS i;

Then I found that the performance of these 2 queries improved a lot.
A: explain(analyze) SELECT count(*) FROM btg GROUP BY w, x, y, z ORDER
BY y, x \watch i=0.1 c=10
B: explain(analyze) SELECT count(*) FROM btg GROUP BY w, x, z, y ORDER
BY y, x, z, w \watch i=0.1 c=10

After setting (enable_seqscan, enable_hashagg) from on to off:
queryA execution time from 1533.013 ms to 533.430 ms
queryB execution time from 1996.817 ms to 497.020 ms
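
If it helps intuition: the win comes from a sort-based aggregate being able to
stream over input that is already ordered on (a permutation of) the grouping
keys, instead of re-sorting or hashing. A toy Python sketch of that idea (not
PostgreSQL code; itertools.groupby, like a sort-based Aggregate node, requires
the rows to arrive grouped on the chosen key order):

```python
from itertools import groupby

# Rows shaped like the btg test table: (x, y, z, w), presorted on (y, x)
# as an index scan on (x, y) / ORDER BY y, x could deliver them.
rows = sorted(
    [(i % 100, i % 100, f"abc{i % 10}", i) for i in range(1, 1001)],
    key=lambda r: (r[1], r[0]),
)

def grouped_counts(rows, key):
    """Stream over presorted rows, emitting one (key, count) per group."""
    return [(k, sum(1 for _ in g)) for k, g in groupby(rows, key=key)]

# GROUP BY y, x matches the incoming order directly.  GROUP BY x, y names
# the same key set, so a planner that reorders grouping keys to (y, x)
# can reuse the order instead of sorting or hashing from scratch.
counts = grouped_counts(rows, key=lambda r: (r[1], r[0]))
print(len(counts))   # 100 groups
```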




Re: Okay to remove mention of mystery @ and ~ operators?

2024-04-19 Thread Aleksander Alekseev
Hi,

> > This page says that the `@` and `~` operators on various types can be 
> > accelerated by a GiST index.
> >
> > https://www.postgresql.org/docs/current/gist-builtin-opclasses.html
> >
> > These operators have been listed in the file since it was created in 2014, 
> > but if they exist then I don't know how to use them or what they do.
> >
> > Code examples, for clarity:
> >
> > > select box '(0,0),(1, 1)' ~ box '(2,2),(3,3)';
> > operator does not exist: box ~ box
> >
> > > select box '(0,0),(1, 1)' @ box '(2,2),(3,3)';
> > operator does not exist: box @ box
> >
> > If they're a typo or some removed thing then I'd like to remove them from 
> > the page. This email is me asking to find out if I'm wrong about that 
> > before I try to submit a patch (also very happy for someone with a 
> > committer bit to just fix this).
>
> Indeed, there is no @(box,box) or ~(box,box) in the \dAo output. These
> operators were removed by 2f70fdb0644c back in 2020.
>
> I will submit a patch for the documentation shortly. Thanks for reporting.

Here is the patch.

-- 
Best regards,
Aleksander Alekseev


v1-0001-Remove-mention-of-and-operators.patch
Description: Binary data


Re: Okay to remove mention of mystery @ and ~ operators?

2024-04-19 Thread Aleksander Alekseev
Hi,

> This page says that the `@` and `~` operators on various types can be 
> accelerated by a GiST index.
>
> https://www.postgresql.org/docs/current/gist-builtin-opclasses.html
>
> These operators have been listed in the file since it was created in 2014, 
> but if they exist then I don't know how to use them or what they do.
>
> Code examples, for clarity:
>
> > select box '(0,0),(1, 1)' ~ box '(2,2),(3,3)';
> operator does not exist: box ~ box
>
> > select box '(0,0),(1, 1)' @ box '(2,2),(3,3)';
> operator does not exist: box @ box
>
> If they're a typo or some removed thing then I'd like to remove them from the 
> page. This email is me asking to find out if I'm wrong about that before I 
> try to submit a patch (also very happy for someone with a committer bit to 
> just fix this).

Indeed, there is no @(box,box) or ~(box,box) in the \dAo output. These
operators were removed by 2f70fdb0644c back in 2020.

I will submit a patch for the documentation shortly. Thanks for reporting.

-- 
Best regards,
Aleksander Alekseev




Re: Disallow changing slot's failover option in transaction block

2024-04-19 Thread shveta malik
On Fri, Apr 19, 2024 at 6:09 AM Zhijie Hou (Fujitsu)
 wrote:
>
> Here is V2 patch which includes the changes for CREATE SUBSCRIPTION as
> suggested. Since we don't connect pub to alter slot when (create_slot=false)
> anymore, the restriction that disallows failover=true when connect=false is
> also removed.

Thanks for the patch. I feel getSubscription() also needs to fetch the
'subfailover' option independent of dopt->binary_upgrade, i.e., it needs
changes similar to those in dumpSubscription(). I tested pg_dump;
currently it does not dump the failover parameter for failover-enabled
subscriptions, perhaps due to the same bug.  The create-sub and alter-sub
changes look good and work well.

thanks
Shveta




Okay to remove mention of mystery @ and ~ operators?

2024-04-19 Thread Colin Caine
Hello

This page says that the `@` and `~` operators on various types can be
accelerated by a GiST index.

https://www.postgresql.org/docs/current/gist-builtin-opclasses.html

These operators have been listed in the file since it was created in 2014,
but if they exist then I don't know how to use them or what they do.

Code examples, for clarity:

> select box '(0,0),(1, 1)' ~ box '(2,2),(3,3)';
operator does not exist: box ~ box

> select box '(0,0),(1, 1)' @ box '(2,2),(3,3)';
operator does not exist: box @ box

If they're a typo or some removed thing then I'd like to remove them from
the page. This email is me asking to find out if I'm wrong about that
before I try to submit a patch (also very happy for someone with a
committer bit to just fix this).

Cheers,
Col


Re: [PATCH] Replace magic constant 3 with NUM_MERGE_MATCH_KINDS

2024-04-19 Thread Aleksander Alekseev
Hi,

> > Fair point. PFA the alternative version of the patch.
> >
>
> Thanks. Committed.

Thanks. I see a few pieces of code that use special FOO_NUMBER enum
values instead of a macro. Should we refactor these pieces
accordingly? PFA another patch.
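
The underlying pattern — deriving the element count from the enum itself
rather than maintaining a separate constant that can drift out of sync — can
be illustrated in Python terms (this is just an analogy, not the C patch;
the enum members mirror MERGE's match kinds):

```python
from enum import Enum, auto

class MergeMatchKind(Enum):
    # Hypothetical mirror of the C enum's members.
    MATCHED = auto()
    NOT_MATCHED_BY_SOURCE = auto()
    NOT_MATCHED_BY_TARGET = auto()

# Derived from the enum itself, so adding a member updates it automatically;
# this is what a NUM_MERGE_MATCH_KINDS macro (or trailing sentinel value)
# achieves in the C code.
NUM_MERGE_MATCH_KINDS = len(MergeMatchKind)
print(NUM_MERGE_MATCH_KINDS)   # 3
```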

-- 
Best regards,
Aleksander Alekseev


v1-0001-Use-macro-to-define-the-number-of-enum-values.patch
Description: Binary data


Re: Trigger violates foreign key constraint

2024-04-19 Thread Aleksander Alekseev
Hi,

> Perhaps we should leave the system triggers out of the discussion
> entirely?  More or less like:
>
> If a foreign key constraint specifies referential actions (that
> is, cascading updates or deletes), those actions are performed via
> ordinary SQL update or delete commands on the referencing table.
> In particular, any triggers that exist on the referencing table
> will be fired for those changes.  If such a trigger modifies or
> blocks the effect of one of these commands, the end result could
> be to break referential integrity.  It is the trigger programmer's
> responsibility to avoid that.

That's perfect!

-- 
Best regards,
Aleksander Alekseev




Re: [PATCH] Replace magic constant 3 with NUM_MERGE_MATCH_KINDS

2024-04-19 Thread Dean Rasheed
On Thu, 18 Apr 2024 at 13:00, Aleksander Alekseev
 wrote:
>
> Fair point. PFA the alternative version of the patch.
>

Thanks. Committed.

Regards,
Dean




Re: Add SPLIT PARTITION/MERGE PARTITIONS commands

2024-04-19 Thread Alexander Lakhin

18.04.2024 20:49, Alvaro Herrera wrote:

On 2024-Apr-18, Alexander Lakhin wrote:


I think the feature implementation should also provide tab completion
for SPLIT/MERGE.

I don't think that we should be imposing on feature authors or
committers the task of filling in tab-completion for whatever features
they contribute.  I mean, if they want to add that, cool; but if not,
somebody else can do that, too.  It's not a critical piece.


I agree; I just wanted to note the gap in the current implementation.
But now, thanks to Dagfinn, we have the tab completion too.

I have also a question regarding "ALTER TABLE ... SET ACCESS METHOD". The
current documentation says:
When applied to a partitioned table, there is no data to rewrite, but
partitions created afterwards will default to the given access method
unless overridden by a USING clause.

But MERGE/SPLIT behave differently (if one can assume that MERGE/SPLIT
create new partitions under the hood):
CREATE ACCESS METHOD heap2 TYPE TABLE HANDLER heap_tableam_handler;

CREATE TABLE t (i int, PRIMARY KEY(i)) PARTITION BY RANGE (i);
ALTER TABLE t SET ACCESS METHOD heap2;
CREATE TABLE tp_0 PARTITION OF t FOR VALUES FROM (0) TO (1);
CREATE TABLE tp_1 PARTITION OF t FOR VALUES FROM (1) TO (2);
\d+ t
  Partitioned table "public.t"
...
Access method: heap2

  Table "public.tp_0"
...
Access method: heap2

  Table "public.tp_1"
...
Access method: heap2

ALTER TABLE t MERGE PARTITIONS (tp_0, tp_1) INTO tp_0;
  Partitioned table "public.t"
...
Access method: heap2

  Table "public.tp_0"
...
Access method: heap

Shouldn't it be changed, what do you think?

Best regards,
Alexander




Re: Support a wildcard in backtrace_functions

2024-04-19 Thread Peter Eisentraut

On 18.04.24 11:54, Jelte Fennema-Nio wrote:

On Thu, 18 Apr 2024 at 10:50, Peter Eisentraut  wrote:

Why exactly is this an open item?  Is there anything wrong with the
existing feature?

The name of the GUC backtrace_on_internal_error is so specific that
it's impossible to extend our backtrace behaviour in future releases
without adding yet another backtrace GUC. You started the discussion
on renaming it upthread:


This presupposes that there is consensus about how the future 
functionality should look like.  This topic has gone through half a 
dozen designs over a few months, and I think it would be premature to 
randomly freeze that discussion now and backport that design.


If a better, more comprehensive design arises for PG18, I think it would 
be pretty easy to either remove backtrace_on_internal_error or just 
internally remap it.






Re: Disallow changing slot's failover option in transaction block

2024-04-19 Thread Bertrand Drouvot
Hi,

On Fri, Apr 19, 2024 at 12:39:40AM +, Zhijie Hou (Fujitsu) wrote:
> Here is V2 patch which includes the changes for CREATE SUBSCRIPTION as
> suggested. Since we don't connect pub to alter slot when (create_slot=false)
> anymore, the restriction that disallows failover=true when connect=false is
> also removed.

Thanks!

+  specified in the subscription. When creating the slot, ensure the slot
+  property failover matches the counterpart parameter
+  of the subscription.

The slot could be created before or after the subscription is created, so I
think it needs a bit of rewording (as here it sounds like the sub is already
created), something like:

"Always ensure the slot property failover matches the
counterpart parameter of the subscription and vice versa."

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Idea Feedback: psql \h misses -> Offers Links?

2024-04-19 Thread Peter Eisentraut

On 18.04.24 23:29, Kirk Wolak wrote:
On Thu, Apr 18, 2024 at 2:37 PM Peter Eisentraut wrote:


On 17.04.24 19:47, Kirk Wolak wrote:
 > *Example:*
 > \h current_setting
 > No help available for "current_setting".
 > Try \h with no arguments to see available help.
 >
 >
https://www.postgresql.org/search/?u=%2Fdocs%2F16%2F=current_setting 


One problem is that this search URL does not actually produce any
useful
information about current_setting.

I see what you mean, but doesn't that imply our web search feature is 
weak?  That's the full name of an existing function, and it's in the 
index. But it cannot be found if searched from the website?


Maybe it's weak, or maybe we are using it wrong, I don't know.

\h has always been (a) local help, and (b) help specifically about SQL 
commands.  If we are going to vastly expand the scope, we need to think 
it through more thoroughly.  I could see some kind of \onlinehelp 
command, or maybe even redesigning \h altogether.


Also, as you say, the function is in the documentation index, so there 
should be a deterministic way to go directly to exactly the target 
destination.  Maybe the full-text search functionality of the web site 
is the wrong interface for that.






Re: Cannot find a working 64-bit integer type on Illumos

2024-04-19 Thread Peter Eisentraut

On 18.04.24 02:31, Thomas Munro wrote:

For limits, why do we have this:

- * stdint.h limits aren't guaranteed to have compatible types with our fixed
- * width types. So just define our own.

?  I mean, how could they not have compatible types?


The commit for this was 62e2a8dc2c7.  The problem was that something like


snprintf(bufm, sizeof(bufm), INT64_FORMAT, SEQ_MINVALUE);

could issue a warning if, say, INT64_FORMAT, which refers to our own 
int64, is based on long int, but SEQ_MINVALUE, which was then INT64_MIN, 
which refers to int64_t, which could be long long int.


So this is correct.  If we introduce the use of int64_t, then you need 
to be consistent still:


int64, PG_INT64_MIN, PG_INT64_MAX, INT64_FORMAT

int64_t, INT64_MIN, INT64_MAX, PRId64




Re: promotion related handling in pg_sync_replication_slots()

2024-04-19 Thread shveta malik
On Fri, Apr 19, 2024 at 11:37 AM shveta malik  wrote:
>
> On Fri, Apr 19, 2024 at 10:53 AM Bertrand Drouvot
>  wrote:
> >
> > Hi,
> >
> > On Thu, Apr 18, 2024 at 05:36:05PM +0530, shveta malik wrote:
> > > Please find v8 attached. Changes are:
> >
> > Thanks!
> >
> > A few comments:
>
> Thanks for reviewing.
>
> > 1 ===
> >
> > @@ -1440,7 +1461,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t 
> > startup_data_len)
> >  * slotsync_worker_onexit() but that will need the connection to be 
> > made
> >  * global and we want to avoid introducing global for this purpose.
> >  */
> > -   before_shmem_exit(slotsync_failure_callback, 
> > PointerGetDatum(wrconn));
> > +   before_shmem_exit(slotsync_worker_disconnect, 
> > PointerGetDatum(wrconn));
> >
> > The comment above this change still states "Register the failure callback 
> > once
> > we have the connection", I think it has to be reworded a bit now that v8 is
> > making use of slotsync_worker_disconnect().
> >
> > 2 ===
> >
> > +* Register slotsync_worker_onexit() before we register
> > +* ReplicationSlotShmemExit() in BaseInit(), to ensure that during 
> > exit of
> > +* slot sync worker, ReplicationSlotShmemExit() is called first, 
> > followed
> > +* by slotsync_worker_onexit(). Startup process during promotion 
> > waits for
> >
> > Worth to mention in shmem_exit() (where it "while 
> > (--before_shmem_exit_index >= 0)"
> > or before the shmem_exit() definition) that ReplSlotSyncWorkerMain() relies 
> > on
> > this LIFO behavior? (not sure if there is other "strong" LIFO requirement in
> > other part of the code).
>
> I see other modules as well relying on LIFO behavior.
> Please see applyparallelworker.c, where
> 'before_shmem_exit(pa_shutdown)' needs to be done after
> 'before_shmem_exit(logicalrep_worker_onexit)' (commit id 3d144c6).
> Also in postinit.c, I see such comments atop
> 'before_shmem_exit(ShutdownPostgres, 0)'.
> I feel we can skip adding this specific comment about
> ReplSlotSyncWorkerMain() in ipc.c, as none of the other modules have
> added one either. I will address the rest of your comments in the
> next version.
>
> > 3 ===
> >
> > +* Startup process during promotion waits for slot sync to finish 
> > and it
> > +* does that by checking the 'syncing' flag.
> >
> > worth to mention ShutDownSlotSync()?
> >
> > 4 ===
> >
> > I did a few tests manually (launching ShutDownSlotSync() through gdb / with 
> > and
> > without sync worker and with / without pg_sync_replication_slots() running
> > concurrently) and it looks like it works as designed.
>
> Thanks for testing it.
>
> > Having said that, the logic that is in place to take care of the corner 
> > cases
> > described up-thread seems reasonable to me.

Please find v9 with the above comments addressed.

thanks
Shveta


v9-0001-Handle-stopSignaled-during-sync-function-call.patch
Description: Binary data


Re: Typos in the code and README

2024-04-19 Thread Daniel Gustafsson
> On 16 Apr 2024, at 15:37, Nazir Bilal Yavuz  wrote:

> I realized two small typos: 'sgmr' -> 'smgr'. You may want to include
> them in 0001.

Thanks, I incorporated these into 0001 before pushing.  All the commits in this
patchset are now applied.

--
Daniel Gustafsson





Re: Use XLOG_CONTROL_FILE macro everywhere?

2024-04-19 Thread Daniel Gustafsson
> On 19 Apr 2024, at 05:50, Anton A. Melnikov  wrote:

> May be better use this macro everywhere in C code?

Off the cuff that seems to make sense, it does make it easier to grep for uses
of the control file.

--
Daniel Gustafsson





Re: Cutting support for OpenSSL 1.0.1 and 1.0.2 in 17~?

2024-04-19 Thread Peter Eisentraut

On 19.04.24 07:37, Michael Paquier wrote:

On Thu, Apr 18, 2024 at 12:53:43PM +0200, Peter Eisentraut wrote:

If everything is addressed, I agree that 0001, 0003, and 0004 can go into
PG17, the rest later.


About the PG17 bits, would you agree about a backpatch?  Or perhaps
you disagree?


I don't think any of these need to be backpatched, at least right now.

0001 is just a cosmetic documentation tweak, has no reason to be 
backpatched.


0003 adds new functionality for LibreSSL.  While the code looks 
straightforward, we have little knowledge about how it works in 
practice.  How is the buildfarm coverage of LibreSSL (with SSL tests 
enabled!)?  If people are keen on this, it might be better to get it 
into PG17 and at least let it go through a few months of beta testing.


0004 effectively just enhances an error message for LibreSSL; there is 
little reason to backpatch this.






Re: Show WAL write and fsync stats in pg_stat_io

2024-04-19 Thread Nazir Bilal Yavuz
Hi,

On Mon, 19 Feb 2024 at 10:28, Nazir Bilal Yavuz  wrote:
>
> Hi,
>
> On Thu, 18 Jan 2024 at 04:22, Michael Paquier  wrote:
> >
> > On Wed, Jan 17, 2024 at 03:20:39PM +0300, Nazir Bilal Yavuz wrote:
> > > I agree with your points. While the other I/O related work is
> > > happening we can discuss what we should do in the variable op_byte
> > > cases. Also, this is happening only for WAL right now but if we try to
> > > extend pg_stat_io in the future, that problem possibly will rise
> > > again. So, it could be good to come up with a general solution, not
> > > only for WAL.
> >
> > Okay, I've marked the patch as RwF for this CF.
>
> I wanted to inform you that the 73f0a13266 commit changed all WALRead
> calls to read variable bytes, only the WAL receiver was reading
> variable bytes before.

I want to start working on this again if possible. I will try to
summarize the current status:

* With the 73f0a13266 commit, the WALRead() function started to read
variable bytes in every case. Before, only the WAL receiver was
reading variable bytes.

* With the 91f2cae7a4 commit, WALReadFromBuffers() is merged. We were
discussing what we have to do when this is merged. It is decided that
WALReadFromBuffers() does not call pgstat_report_wait_start() because
this function does not perform any IO [1]. We may follow the same
logic by not including these to pg_stat_io?

* With the b5a9b18cd0 commit, streaming I/O is merged but AFAIK this
does not block anything related to putting WAL stats in pg_stat_io.

If I am not missing any new changes, the only remaining problem is the
variable-byte reads. We have discussed a couple of solutions:

1- Change op_bytes to something like -1, 0, 1, NULL etc. and document
that this means some variable byte I/O is happening.

I kind of dislike this solution because even if only read I/O happens in
variable bytes, it will look like write and extend I/Os happen in
variable bytes as well. As a mitigation, it could be documented that only
read I/Os can happen in variable bytes for now.

2- Use op_bytes_[read | write | extend] columns instead of one
op_bytes column, also use the first solution.

This can solve the first solution's weakness but it introduces two
more columns. This is more future proof compared to the first solution
if there is a chance that some variable I/O could happen in write and
extend calls as well in the future.

3- Create a new pg_stat_io_wal view to put WAL I/Os here instead of pg_stat_io.

pg_stat_io could remain untouchable and we will have flexibility to
edit this new view as much as we want. But the original aim of the
pg_stat_io is evaluating all I/O from a single view and adding a new
view breaks this aim.
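
To make the trade-off between options 1 and 2 concrete, here is a toy Python
model of the two row shapes (the class and column names are mine, not actual
pg_stat_io definitions):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class IoRowOpt1:
    """Option 1: a single op_bytes column; None marks variable-sized I/O."""
    reads: int
    writes: int
    extends: int
    op_bytes: Optional[int]

@dataclass
class IoRowOpt2:
    """Option 2: one op_bytes column per operation type."""
    reads: int
    writes: int
    extends: int
    op_bytes_read: Optional[int]
    op_bytes_write: Optional[int]
    op_bytes_extend: Optional[int]

# Option 1: once op_bytes is None, write/extend byte totals become
# uncomputable even though only the reads are actually variable-sized.
row1 = IoRowOpt1(reads=10, writes=5, extends=2, op_bytes=None)

# Option 2: per-op columns keep write/extend totals computable.
row2 = IoRowOpt2(10, 5, 2, op_bytes_read=None,
                 op_bytes_write=8192, op_bytes_extend=8192)
write_bytes = row2.writes * row2.op_bytes_write
print(write_bytes)   # 40960
```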

I hope that I did not miss anything and my explanations are clear.

Any kind of feedback would be appreciated.


[1] 
https://www.postgresql.org/message-id/CAFiTN-sE7CJn-ZFj%2B-0Wv6TNytv_fp4n%2BeCszspxJ3mt77t5ig%40mail.gmail.com

--
Regards,
Nazir Bilal Yavuz
Microsoft




Re: Cutting support for OpenSSL 1.0.1 and 1.0.2 in 17~?

2024-04-19 Thread Daniel Gustafsson
> On 19 Apr 2024, at 07:37, Michael Paquier  wrote:
> 
> On Thu, Apr 18, 2024 at 12:53:43PM +0200, Peter Eisentraut wrote:
>> If everything is addressed, I agree that 0001, 0003, and 0004 can go into
>> PG17, the rest later.
> 
> About the PG17 bits, would you agree about a backpatch?  Or perhaps
> you disagree?

If we want to, 0001 can be backpatched to v15, 0004 to v13, and 0003 all the way.
I don't have strong opinions either way.

--
Daniel Gustafsson





Re: Can't find not null constraint, but \d+ shows that

2024-04-19 Thread Tender Wang
Alvaro Herrera  于2024年4月19日周五 02:49写道:

> On 2024-Apr-13, jian he wrote:
>
> > I wonder is there any incompatibility issue, or do we need to say
> something
> > about the new behavior when dropping a key column?
>
> Umm, yeah, maybe we should document it in ALTER TABLE DROP PRIMARY KEY
> and in the release notes to note the different behavior.
>
> > only minor cosmetic issue:
> > + if (unconstrained_cols)
> > i would like change it to
> > + if (unconstrained_cols != NIL)
> >
> > + foreach(lc, unconstrained_cols)
> > we can change to
> > + foreach_int(attnum, unconstrained_cols)
> > per commit
> >
> https://git.postgresql.org/cgit/postgresql.git/commit/?id=14dd0f27d7cd56ffae9ecdbe324965073d01a9ff
>
> Ah, yeah.  I did that, rewrote some comments and refined the tests a
> little bit to ensure the pg_upgrade behavior is sane.  I intend to get
> this pushed tomorrow, if nothing ugly comes up.
>

The new patch looks good to me.


>
> CI run: https://cirrus-ci.com/build/5471117953990656
>
>

-- 
Tender Wang
OpenPie:  https://en.openpie.com/


Re: promotion related handling in pg_sync_replication_slots()

2024-04-19 Thread shveta malik
On Fri, Apr 19, 2024 at 10:53 AM Bertrand Drouvot
 wrote:
>
> Hi,
>
> On Thu, Apr 18, 2024 at 05:36:05PM +0530, shveta malik wrote:
> > Please find v8 attached. Changes are:
>
> Thanks!
>
> A few comments:

Thanks for reviewing.

> 1 ===
>
> @@ -1440,7 +1461,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t 
> startup_data_len)
>  * slotsync_worker_onexit() but that will need the connection to be 
> made
>  * global and we want to avoid introducing global for this purpose.
>  */
> -   before_shmem_exit(slotsync_failure_callback, PointerGetDatum(wrconn));
> +   before_shmem_exit(slotsync_worker_disconnect, 
> PointerGetDatum(wrconn));
>
> The comment above this change still states "Register the failure callback once
> we have the connection", I think it has to be reworded a bit now that v8 is
> making use of slotsync_worker_disconnect().
>
> 2 ===
>
> +* Register slotsync_worker_onexit() before we register
> +* ReplicationSlotShmemExit() in BaseInit(), to ensure that during 
> exit of
> +* slot sync worker, ReplicationSlotShmemExit() is called first, 
> followed
> +* by slotsync_worker_onexit(). Startup process during promotion 
> waits for
>
> Worth to mention in shmem_exit() (where it "while (--before_shmem_exit_index 
> >= 0)"
> or before the shmem_exit() definition) that ReplSlotSyncWorkerMain() relies on
> this LIFO behavior? (not sure if there is other "strong" LIFO requirement in
> other part of the code).

I see other modules as well relying on LIFO behavior.
Please see applyparallelworker.c, where
'before_shmem_exit(pa_shutdown)' needs to be done after
'before_shmem_exit(logicalrep_worker_onexit)' (commit id 3d144c6).
Also in postinit.c, I see such comments atop
'before_shmem_exit(ShutdownPostgres, 0)'.
I feel we can skip adding this specific comment about
ReplSlotSyncWorkerMain() in ipc.c, as none of the other modules have
added one either. I will address the rest of your comments in the
next version.
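
The LIFO behavior being relied on fits in a few lines; this is my own toy
model of ipc.c's before_shmem_exit list, not the actual code:

```python
# Callbacks registered earlier run later: the list is drained from the
# end, mirroring ipc.c's "while (--before_shmem_exit_index >= 0)".  So
# slotsync_worker_onexit(), registered before ReplicationSlotShmemExit(),
# fires after it at exit time.
before_shmem_exit_list = []
fired = []

def before_shmem_exit(callback):
    before_shmem_exit_list.append(callback)

def shmem_exit():
    while before_shmem_exit_list:
        before_shmem_exit_list.pop()()

before_shmem_exit(lambda: fired.append("slotsync_worker_onexit"))
before_shmem_exit(lambda: fired.append("ReplicationSlotShmemExit"))
shmem_exit()
print(fired)   # ['ReplicationSlotShmemExit', 'slotsync_worker_onexit']
```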

> 3 ===
>
> +* Startup process during promotion waits for slot sync to finish and 
> it
> +* does that by checking the 'syncing' flag.
>
> worth to mention ShutDownSlotSync()?
>
> 4 ===
>
> I did a few tests manually (launching ShutDownSlotSync() through gdb / with 
> and
> without sync worker and with / without pg_sync_replication_slots() running
> concurrently) and it looks like it works as designed.

Thanks for testing it.

> Having said that, the logic that is in place to take care of the corner cases
> described up-thread seems reasonable to me.

thanks
Shveta