date:20240430

On Sat, Apr 27, 2024 at 08:33:55PM +0200, Daniel Gustafsson wrote:
> > On 27 Apr 2024, at 20:32, Daniel Gustafsson  wrote:
> 
> > That's a good point, there is potential for more code removal here.  The
> > attached 0001 takes a stab at it while it's fresh in mind, I'll revisit 
> > before
> > the July CF to see if there is more that can be done.
> 
> ..and again with the attachment. Not enough coffee.

My remark was originally about pq_init_crypto_lib that does the
locking initialization, and your new patch a bit more, as of:

-/* This stuff need be done only once. */
-if (!SSL_initialized)
-{
-#ifdef HAVE_OPENSSL_INIT_SSL
-OPENSSL_init_ssl(OPENSSL_INIT_LOAD_CONFIG, NULL);
-#else
-OPENSSL_config(NULL);
-SSL_library_init();
-SSL_load_error_strings();
-#endif
-SSL_initialized = true;
-}

OPENSSL_init_ssl() has replaced SSL_library_init(), marked as
deprecated, and even this step is mentioned as not required anymore
with 1.1.0~:
https://www.openssl.org/docs/man1.1.1/man3/OPENSSL_init_ssl.html

Same with OPENSSL_init_crypto(), replacing OPENSSL_config(), again not
required in 1.1.0~:
https://www.openssl.org/docs/man1.1.1/man3/OPENSSL_init_crypto.html

SSL_load_error_strings() is recommended as not to use in 1.1.0,
replaced by the others:
https://www.openssl.org/docs/man3.2/man3/SSL_load_error_strings.html

While OpenSSL will be able to cope with that, how much of that applies
to LibreSSL?  SSL_load_error_strings(), OPENSSL_init_ssl(),
OPENSSL_CONFIG() are OK based on the docs:
https://man.archlinux.org/man/extra/libressl/libressl-OPENSSL_config.3.en
https://man.archlinux.org/man/extra/libressl/libressl-OPENSSL_init_ssl.3.en
https://man.archlinux.org/man/extra/libressl/libressl-ERR_load_crypto_strings.3.en

So +1 to remove all this code after a closer lookup.  I would
recommend to update the documentation of PQinitSSL and PQinitOpenSSL
to tell that these become useless and are deprecated.

ERR_clear_error();
-
#ifdef USE_RESOWNER_FOR_HMAC

Some noise diff.
--
Michael

signature.asc
Description: PGP signature

Re: partitioning and identity column

On Tue, Apr 30, 2024 at 04:29:11PM +0530, Ashutosh Bapat wrote:
> PFA patch which fixes all the three problems.

Please note that this was not tracked as an open item, so I have added
one referring to the failures reported by Alexander.
--
Michael


signature.asc
Description: PGP signature

Re: Fix parallel vacuum buffer usage reporting

2024-04-30 Thread Alena Rybakina

Hi!

On 30.04.2024 05:18, Masahiko Sawada wrote:

On Fri, Apr 26, 2024 at 9:12 PM Alena Rybakina wrote:

Hi!

The same script was run, but using vacuum verbose analyze, and I saw the
difference again in the fifth step:
with your patch: buffer usage: 32312 hits, 607 misses, 1566 dirtied
master: buffer usage: 32346 hits, 573 misses, 1360 dirtied

Isn't there a chance for the checkpointer to run during this time? That could
make the conditions between the two runs slightly different and explain the
change in buffer report.

[0]https://github.com/postgres/postgres/blob/8a1b31e6e59631807a08a4e9465134c343bbdf5e/src/backend/access/heap/vacuumlazy.c#L2826-L2831

Looking at the script, you won't trigger the problem.

Thank you for the link I accounted it in my next experiments.

I repeated the test without processing checkpoints with a single index, and the
number of pages in the buffer used almost matched:

master branch: buffer usage: 32315 hits, 606 misses, 4486 dirtied

with applied patch v4 version: buffer usage: 32315 hits, 606 misses, 4489
dirtied

I think you are right - the problem was interfering with the checkpoint
process, by the way I checked the first version patch. To cut a long story
short, everything is fine now with one index.

Just in case, I'll explain: I considered this case because your patch could
have impact influenced it too.

On 25.04.2024 10:17, Anthonin Bonnefoy wrote:

On Wed, Apr 24, 2024 at 4:01 PM Alena Rybakina wrote:

I tested the main postgres branch with and without your fix using a script that
was written by me. It consists of five scenarios and I made a comparison in the
logs between the original version of the master branch and the master branch
with your patch:

Hi! Thanks for the tests.

I have attached a test file (vacuum_check_logs.sql)

The reporting issue will only happen if there's a parallel index vacuum and it
will only happen if there's at least 2 indexes [0]. You will need to create an
additional index.

Speaking of the problem, I added another index and repeated the test and found
a significant difference:

I found it when I commited the transaction (3):

master: 2964 hits, 0 misses, 0 dirtied

with applied patch v4 version: buffer usage: 33013 hits, 0 misses, 3 dirtied

When I deleted all the data from the table and later started vacuum verbose
again (4):

master: buffer usage: 51486 hits, 0 misses, 0 dirtied

with applied patch v4 version:buffer usage: 77924 hits, 0 misses, 0 dirtied

when I inserted 1 million data into the table and updated it (5):

master:buffer usage: 27904 hits, 5021 misses, 1777 dirtied

with applied patch v4 version:buffer usage: 41051 hits, 9973 misses, 2564
dirtied

As I see, the number of pages is significantly more than it was in the master
branch and ,frankly, I couldn't fully figure out if it was a mistake or not.

I think that the patch fixes the problem correctly.

I've run pgindent and updated the commit message. I realized that
parallel vacuum was introduced in pg13 but buffer usage reporting in
VACUUM command was implemented in pg15. Therefore, in pg13 and pg14,
VACUUM (PARALLEL) is available but VACUUM (PARALLEL, VERBOSE) doesn't
show the buffer usage report. Autovacuum does show the buffer usage
report but parallel autovacuum is not supported. Therefore, we should
backpatch it down to 15, not 13.

I'm going to push the patch down to pg15, barring any objections.

Regards,
I agree with you about porting and I saw that the patch is working
correctly.

--
Regards,
Alena Rybakina
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company

Re: pg_sequence_last_value() for unlogged sequences on standbys

On Tue, Apr 30, 2024 at 09:05:31PM -0500, Nathan Bossart wrote:
> This ended up being easier than I expected.  While unlogged sequences are
> only supported on v15 and above, temporary sequences have been around since
> v7.2, so this will probably need to be back-patched to all supported
> versions.

Unlogged and temporary relations cannot be accessed during recovery,
so I'm OK with your change to force a NULL for both relpersistences.
However, it seems to me that you should also drop the
pg_is_other_temp_schema() in system_views.sql for the definition of
pg_sequences.  Doing that on HEAD now would be OK, but there's nothing
urgent to it so it may be better done once v18 opens up.  Note that
pg_is_other_temp_schema() is only used for this sequence view, which
is a nice cleanup.

By the way, shouldn't we also change the function to return NULL for a
failed permission check?  It would be possible to remove the
has_sequence_privilege() as well, thanks to that, and a duplication
between the code and the function view.  I've been looking around a
bit, noticing one use of this function in check_pgactivity (nagios
agent), and its query also has a has_sequence_privilege() so returning
NULL would simplify its definition in the long-run.  I'd suspect other
monitoring queries to do something similar to bypass permission
errors.

> The added test case won't work for v12-v14 since it uses an
> unlogged sequence.

That would require a BackgroundPsql to maintain the connection to the
primary, so not having a test is OK by me.
--
Michael

signature.asc
Description: PGP signature

Re: Fix parallel vacuum buffer usage reporting

2024-04-30 Thread Masahiko Sawada

On Tue, Apr 30, 2024 at 3:34 PM Anthonin Bonnefoy
 wrote:
>
> I've done some additional tests to validate the reported numbers. Using 
> pg_statio, it's possible to get the minimum number of block hits (Full script 
> attached).
>
> -- Save block hits before vacuum
> SELECT pg_stat_force_next_flush();
> SELECT heap_blks_hit, idx_blks_hit FROM pg_statio_all_tables where 
> relname='vestat' \gset
> vacuum (verbose, index_cleanup on) vestat;
> -- Check the difference
> SELECT pg_stat_force_next_flush();
> SELECT heap_blks_hit - :heap_blks_hit as delta_heap_hit,
>idx_blks_hit - :idx_blks_hit as delta_idx_hit,
>heap_blks_hit - :heap_blks_hit + idx_blks_hit - :idx_blks_hit as sum
> FROM pg_statio_all_tables where relname='vestat';
>
> Output:
> ...
> buffer usage: 14676 hits, 0 misses, 667 dirtied
> buffer usage (new): 16081 hits, 0 misses, 667 dirtied
> ...
>  -[ RECORD 1 ]--+--
> delta_heap_hit | 9747
> delta_idx_hit  | 6325
> sum| 16072
>
> From pg_statio, we had 16072 blocks for the relation + indexes.
> Pre-patch, we are under reporting with 14676.
> Post-patch, we have 16081. The 9 additional block hits come from vacuum 
> accessing catalog tables like pg_class or pg_class_oid_index.
>

Thank you for further testing! I've pushed the patch.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Re: Streaming I/O, vectored I/O (WIP)

2024-04-30 Thread Thomas Munro

On Wed, May 1, 2024 at 2:51 PM David Rowley  wrote:
> On Wed, 24 Apr 2024 at 14:32, David Rowley  wrote:
> > I've attached a patch with a few typo fixes and what looks like an
> > incorrect type for max_ios. It's an int16 and I think it needs to be
> > an int. Doing "max_ios = Min(max_ios, PG_INT16_MAX);" doesn't do
> > anything when max_ios is int16.
>
> No feedback, so I'll just push this in a few hours unless anyone has anything.

Patch looks correct, thanks.  Please do.  (Sorry, running a bit behind
on email ATM...  I also have a few more typos around here from an
off-list email from Mr Lakhin, will get to that soon...)

Re: query_id, pg_stat_activity, extended query protocol

2024-04-30 Thread Imseih (AWS), Sami

Here is a new rev of the patch which deals with the scenario
mentioned by Andrei [1] in which the queryId may change
due to a cached query invalidation.


[1] 
https://www.postgresql.org/message-id/724348C9-8023-41BC-895E-80634E79A538%40amazon.com

Regards,

Sami



0001-v2-Fix-Extended-Query-Protocol-handling-of-queryId.patch
Description: 0001-v2-Fix-Extended-Query-Protocol-handling-of-queryId.patch

Re: add tab-complete for memory, serialize option and other minor issues.

On Sat, Apr 27, 2024 at 11:15:47AM -0400, Tom Lane wrote:
> https://www.postgresql.org/message-id/3870833.1712696581%40sss.pgh.pa.us
> 
> Post-feature-freeze is no time to be messing with behavior as basic
> as WORD_BREAKS, though.

Indeed.

By the way, that psql completion patch has fallen through the cracks
and I don't see a point in waiting for that, so I have applied it.
Note that the patch did not order the options according to the docs,
which was consistent on HEAD but not anymore with the patch. 
--
Michael


signature.asc
Description: PGP signature

Re: Streaming I/O, vectored I/O (WIP)

2024-04-30 Thread David Rowley

On Wed, 24 Apr 2024 at 14:32, David Rowley  wrote:
> I've attached a patch with a few typo fixes and what looks like an
> incorrect type for max_ios. It's an int16 and I think it needs to be
> an int. Doing "max_ios = Min(max_ios, PG_INT16_MAX);" doesn't do
> anything when max_ios is int16.

No feedback, so I'll just push this in a few hours unless anyone has anything.

David

Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation

2024-04-30 Thread Masahiko Sawada

On Wed, May 1, 2024 at 2:29 AM Tom Lane  wrote:
>
> Alexander Korotkov  writes:
> > I agree that storing char signedness might seem weird.  But it appears
> > that we already store indexes that depend on char signedness.  So,
> > it's effectively property of bits-on-disk even though it affects
> > indirectly.  Then I see two options to make the picture consistent.
> > 1) Assume that char signedness is somehow a property of bits-on-disk
> > even though it's weird.  Then pg_trgm indexes are correct, but we need
> > to store char signedness in pg_control.
> > 2) Assume that char signedness is not a property of bits-on-disk.
> > Then pg_trgm indexes are buggy and need to be fixed.
> > What do you think?
>
> The problem with option (2) is the assumption that pg_trgm's behavior
> is the only bug of this kind, either now or in the future.  I think
> that's just about an impossible standard to meet, because there's no
> realistic way to test whether char signedness is affecting things.
> (Sure, you can compare results across platforms, but maybe you
> just didn't test the right case.)
>
> Also, the bigger picture here is the seeming assumption that "if
> we change pg_trgm then it will be safe to replicate from x86 to
> arm".  I don't believe that that's a good idea and I'm unwilling
> to promise that it will work, regardless of what we do about
> char signedness.  That being the case, I don't want to invest a
> lot of effort in the signedness issue.

I think that the char signedness issue is an issue also for developers
(and extension authors) since it could lead to confusion and potential
bugs in the future due to that. x86 developers would think of char as
always being signed and write code that will misbehave on arm
machines. For example, since logical replication should behave
correctly even in cross-arch replication all developers need to be
aware of that. I thought of using the -funsigned-char (or
-fsigned-char) compiler flag to avoid that but it would have a broader
impact not only on indexes.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

Hi Noah,

On Wed, May 1, 2024 at 5:24 AM Noah Misch  wrote:
> On Thu, Apr 25, 2024 at 04:59:54PM +0400, Pavel Borisov wrote:
> > 0001: Optimize speed by avoiding heap visibility checking for different
> > non-deduplicated index tuples as proposed by Noah Misch
> >
> > Speed measurements on my laptop using the exact method recommended by Noah
> > upthread:
> > Current master branch: checkunique off: 144s, checkunique on: 419s
> > With patch 0001: checkunique off: 141s, checkunique on: 171s
>
> Where is the CPU time going to make it still be 21% slower w/ checkunique on?
> It's a great improvement vs. current master, but I don't have an obvious
> explanation for the remaining +21%.

I think there is at least extra index tuples comparison.

--
Regards,
Alexander Korotkov

Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

2024-04-30 Thread Noah Misch

On Thu, Apr 25, 2024 at 04:59:54PM +0400, Pavel Borisov wrote:
> 0001: Optimize speed by avoiding heap visibility checking for different
> non-deduplicated index tuples as proposed by Noah Misch
> 
> Speed measurements on my laptop using the exact method recommended by Noah
> upthread:
> Current master branch: checkunique off: 144s, checkunique on: 419s
> With patch 0001: checkunique off: 141s, checkunique on: 171s

Where is the CPU time going to make it still be 21% slower w/ checkunique on?
It's a great improvement vs. current master, but I don't have an obvious
explanation for the remaining +21%.

Re: pg_sequence_last_value() for unlogged sequences on standbys

2024-04-30 Thread Nathan Bossart

On Tue, Apr 30, 2024 at 08:13:17PM -0500, Nathan Bossart wrote:
> Good point.  I'll work on a patch along these lines, then.

This ended up being easier than I expected.  While unlogged sequences are
only supported on v15 and above, temporary sequences have been around since
v7.2, so this will probably need to be back-patched to all supported
versions.  The added test case won't work for v12-v14 since it uses an
unlogged sequence.  I'm not sure it's worth constructing a test case for
temporary sequences.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 71008f13da88f41a205e0643129162df9d2ebc81 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Tue, 30 Apr 2024 20:54:51 -0500
Subject: [PATCH v1 1/1] Fix pg_sequence_last_value() for non-permanent
 sequences on standbys.

---
 src/backend/commands/sequence.c   | 18 +-
 src/test/recovery/t/001_stream_rep.pl |  8 
 2 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index 46103561c3..659d2ad4fc 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -1780,7 +1780,7 @@ pg_sequence_last_value(PG_FUNCTION_ARGS)
 	Buffer		buf;
 	HeapTupleData seqtuple;
 	Form_pg_sequence_data seq;
-	bool		is_called;
+	bool		is_called = false;
 	int64		result;
 
 	/* open and lock sequence */
@@ -1792,12 +1792,20 @@ pg_sequence_last_value(PG_FUNCTION_ARGS)
  errmsg("permission denied for sequence %s",
 		RelationGetRelationName(seqrel;
 
-	seq = read_seq_tuple(seqrel, , );
+	/*
+	 * For the benefit of the pg_sequences system view, we return NULL for
+	 * temporary and unlogged sequences on standbys instead of throwing an
+	 * error.
+	 */
+	if (RelationIsPermanent(seqrel) || !RecoveryInProgress())
+	{
+		seq = read_seq_tuple(seqrel, , );
 
-	is_called = seq->is_called;
-	result = seq->last_value;
+		is_called = seq->is_called;
+		result = seq->last_value;
 
-	UnlockReleaseBuffer(buf);
+		UnlockReleaseBuffer(buf);
+	}
 	sequence_close(seqrel, NoLock);
 
 	if (is_called)
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 5311ade509..4c698b5ce1 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -95,6 +95,14 @@ $result = $node_standby_2->safe_psql('postgres', "SELECT * FROM seq1");
 print "standby 2: $result\n";
 is($result, qq(33|0|t), 'check streamed sequence content on standby 2');
 
+# Check pg_sequence_last_value() returns NULL for unlogged sequence on standby
+$node_primary->safe_psql('postgres',
+	"CREATE UNLOGGED SEQUENCE ulseq; SELECT nextval('ulseq')");
+$node_primary->wait_for_replay_catchup($node_standby_1);
+is($node_standby_1->safe_psql('postgres',
+	"SELECT pg_sequence_last_value('ulseq'::regclass) IS NULL"),
+	't', 'pg_sequence_last_value() on unlogged sequence on standby 1');
+
 # Check that only READ-only queries can run on standbys
 is($node_standby_1->psql('postgres', 'INSERT INTO tab_int VALUES (1)'),
 	3, 'read-only queries on standby 1');
-- 
2.25.1

Re: pg_sequence_last_value() for unlogged sequences on standbys

2024-04-30 Thread Nathan Bossart

On Tue, Apr 30, 2024 at 09:06:04PM -0400, Tom Lane wrote:
> Nathan Bossart  writes:
>> If you create an unlogged sequence on a primary, pg_sequence_last_value()
>> for that sequence on a standby will error like so:
>>  postgres=# select pg_sequence_last_value('test'::regclass);
>>  ERROR:  could not open file "base/5/16388": No such file or directory
> 
>> As pointed out a few years ago [0], this function is undocumented, so
>> there's no stated contract to uphold.  I lean towards just returning NULL
>> because that's what we'll have to put in the relevant pg_sequences field
>> anyway, but I can see an argument for fixing the ERROR to align with what
>> you see when you try to access unlogged relations on a standby (i.e.,
>> "cannot access temporary or unlogged relations during recovery").
> 
> Yeah, I agree with putting that logic into the function.  Putting
> such conditions into the SQL of a system view is risky because it
> is really, really painful to adjust the SQL in a released version.
> You could back-patch a fix for this if done at the C level, but
> I doubt we'd go to the trouble if it's done in the view.

Good point.  I'll work on a patch along these lines, then.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Re: pg_sequence_last_value() for unlogged sequences on standbys

Nathan Bossart  writes:
> If you create an unlogged sequence on a primary, pg_sequence_last_value()
> for that sequence on a standby will error like so:
>   postgres=# select pg_sequence_last_value('test'::regclass);
>   ERROR:  could not open file "base/5/16388": No such file or directory

> As pointed out a few years ago [0], this function is undocumented, so
> there's no stated contract to uphold.  I lean towards just returning NULL
> because that's what we'll have to put in the relevant pg_sequences field
> anyway, but I can see an argument for fixing the ERROR to align with what
> you see when you try to access unlogged relations on a standby (i.e.,
> "cannot access temporary or unlogged relations during recovery").

Yeah, I agree with putting that logic into the function.  Putting
such conditions into the SQL of a system view is risky because it
is really, really painful to adjust the SQL in a released version.
You could back-patch a fix for this if done at the C level, but
I doubt we'd go to the trouble if it's done in the view.

regards, tom lane

pg_sequence_last_value() for unlogged sequences on standbys

2024-04-30 Thread Nathan Bossart

If you create an unlogged sequence on a primary, pg_sequence_last_value()
for that sequence on a standby will error like so:

postgres=# select pg_sequence_last_value('test'::regclass);
ERROR:  could not open file "base/5/16388": No such file or directory

This function is used by the pg_sequences system view, which fails with the
same error on standbys.  The two options I see are:

* Return a better ERROR and adjust pg_sequences to avoid calling this
  function for unlogged sequences on standbys.
* Return NULL from pg_sequence_last_value() if called for an unlogged
  sequence on a standby.

As pointed out a few years ago [0], this function is undocumented, so
there's no stated contract to uphold.  I lean towards just returning NULL
because that's what we'll have to put in the relevant pg_sequences field
anyway, but I can see an argument for fixing the ERROR to align with what
you see when you try to access unlogged relations on a standby (i.e.,
"cannot access temporary or unlogged relations during recovery").

Thoughts?

[0] 
https://postgr.es/m/CAAaqYe8JL8Et2DoO0RRjGaMvy7-C6eDH-2wHXK-gp3dOssvBkQ%40mail.gmail.com

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Re: TLS certificate alternate trust paths issue in libpq - certificate chain validation failing

2024-04-30 Thread Jacob Champion

On Tue, Apr 30, 2024 at 2:41 PM Thomas Spear  wrote:
> The full details can be found at github.com/pgjdbc/pgjdbc/discussions/3236 - 
> in summary, both jdbc-postgres and the psql cli seem to be affected by an 
> issue validating the certificate chain up to a publicly trusted root 
> certificate that has cross-signed an intermediate certificate coming from a 
> Postgres server in Azure, when using sslmode=verify-full and trying to rely 
> on the default path for sslrootcert.

Hopefully someone more familiar with the Azure cross-signing setup
sees something obvious and chimes in, but in the meantime there are a
couple things I can think to ask:

1. Are you sure that the server is actually putting the cross-signed
intermediate in the chain it's serving to the client?

2. What version of OpenSSL? There used to be validation bugs with
alternate trust paths; hopefully you're not using any of those (I
think they're old as dirt), but it doesn't hurt to know.

3. Can you provide a sample public certificate chain that should
validate and doesn't?

Thanks,
--Jacob

TLS certificate alternate trust paths issue in libpq - certificate chain validation failing

2024-04-30 Thread Thomas Spear

Hello, I've recently joined the list on a tip from one of the maintainers
of jdbc-postgres as I wanted to discuss an issue we've run into and find
out if the fix we've worked out is the right thing to do, or if there is
actually a bug that needs to be fixed.

The full details can be found at github.com/pgjdbc/pgjdbc/discussions/3236
- in summary, both jdbc-postgres and the psql cli seem to be affected by an
issue validating the certificate chain up to a publicly trusted root
certificate that has cross-signed an intermediate certificate coming from a
Postgres server in Azure, when using sslmode=verify-full and trying to rely
on the default path for sslrootcert.

The workaround we came up with is to add the original root cert, not the
root that cross-signed the intermediate, to root.crt, in order to avoid
needing to specify sslrootcert=. This allows the full
chain to be verified.

I believe that either one should be able to be placed there without me
needing to explicitly specify sslrootcert=, but if I use
the CA that cross-signed the intermediate cert, and don't specify
sslrootcert= the chain validation fails.

Thank you,

Thomas

pg_parse_json() should not leak token copies on failure

2024-04-30 Thread Jacob Champion

Hi,

When a client of our JSON parser registers semantic action callbacks,
the parser will allocate copies of the lexed tokens to pass into those
callbacks. The intent is for the callbacks to always take ownership of
the copied memory, and if they don't want it then they can pfree() it.

However, if parsing fails on the token before the callback is invoked,
that allocation is leaked. That's not a problem for the server side,
or for clients that immediately exit on parse failure, but if the JSON
parser gets added to libpq for OAuth, it will become more of a
problem.

(I'm building a fuzzer to flush out these sorts of issues; not only
does it complain loudly about the leaks, but the accumulation of
leaked memory puts a hard limit on how long a fuzzer run can last. :D)

Attached is a draft patch to illustrate what I mean, but it's
incomplete: it only solves the problem for scalar values. We also make
a copy of object field names, which is much harder to fix, because we
make only a single copy and pass that to both the object_field_start
and object_field_end callbacks. Consider:

- If a client only implements object_field_start, it takes ownership
of the field name when we call it. It can free the allocation if it
decides that the field is irrelevant.
- The same is true for clients that only implement object_field_end.
- If a client implements both callbacks, it takes ownership of the
field name when we call object_field_start. But irrelevant field names
can't be freed during that callback, because the same copy is going to
be passed to object_field_end. And object_field_end isn't guaranteed
to be called, because parsing could fail for any number of reasons
between now and then. So what code should be responsible for cleanup?
The parser doesn't know whether the callback already freed the pointer
or kept a reference to it in semstate.

Any thoughts on how we can improve this? I was thinking we could maybe
make two copies of the field name and track ownership individually?

Thanks,
--Jacob


0001-WIP-fix-leak-of-scalar-value-on-lex-failure.patch
Description: Binary data

Re: Add SPLIT PARTITION/MERGE PARTITIONS commands

2024-04-30 Thread Dmitry Koval


Hi!

30.04.2024 6:00, Alexander Lakhin пишет:

Maybe I'm doing something wrong, but the following script:
CREATE TABLE t (i int, PRIMARY KEY(i)) PARTITION BY RANGE (i);
CREATE TABLE tp_0 PARTITION OF t FOR VALUES FROM (0) TO (1);
CREATE TABLE tp_1 PARTITION OF t FOR VALUES FROM (1) TO (2);

CREATE TABLE t2 (LIKE t INCLUDING ALL);
CREATE TABLE tp2 (LIKE tp_0 INCLUDING ALL);
creates tables t2, tp2 without not-null constraints.


To create partitions is used the "CREATE TABLE ... LIKE ..." command 
with the "EXCLUDING INDEXES" modifier (to speed up the insertion of values).


CREATE TABLE t (i int, PRIMARY KEY(i)) PARTITION BY RANGE(i);
CREATE TABLE t2 (LIKE t INCLUDING ALL EXCLUDING INDEXES EXCLUDING IDENTITY);
\d+ t2;
...
Not-null constraints:
"t2_i_not_null" NOT NULL "i"
Access method: heap


[1] 
https://github.com/postgres/postgres/blob/d12b4ba1bd3eedd862064cf1dad5ff107c5cba90/src/backend/commands/tablecmds.c#L21215

--
With best regards,
Dmitry Koval

Postgres Professional: http://postgrespro.com

Re: Build with meson + clang + sanitizer resulted in undefined reference

2024-04-30 Thread Dmitry Dolgov

> On Thu, Apr 25, 2024 at 06:38:58PM +0300, Maxim Orlov wrote:
>
> And then upon build I've got overwhelmed by thousands of undefined
> reference errors.
>
> fe-auth-scram.c:(.text+0x17a): undefined reference to
> `__ubsan_handle_builtin_unreachable'
> /usr/bin/ld: fe-auth-scram.c:(.text+0x189): undefined reference to
> `__ubsan_handle_type_mismatch_v1_abort'
> /usr/bin/ld: fe-auth-scram.c:(.text+0x195): undefined reference to
> `__ubsan_handle_type_mismatch_v1_abort'
> /usr/bin/ld: fe-auth-scram.c:(.text+0x1a1): undefined reference to
> `__ubsan_handle_type_mismatch_v1_abort'
> /usr/bin/ld: fe-auth-scram.c:(.text+0x1ad): undefined reference to
> `__ubsan_handle_type_mismatch_v1_abort'
> /usr/bin/ld: fe-auth-scram.c:(.text+0x1b9): undefined reference to
> `__ubsan_handle_type_mismatch_v1_abort'

Seems to be a meson quirk [1]. I could reproduce this, and adding
-Db_lundef=false on top of your configuration solved the issue.

[1]: https://github.com/mesonbuild/meson/issues/3853

Re: Add SPLIT PARTITION/MERGE PARTITIONS commands

2024-04-30 Thread Justin Pryzby

On Thu, Apr 11, 2024 at 08:00:00PM +0300, Alexander Lakhin wrote:
> 11.04.2024 16:27, Dmitry Koval wrote:
> > 
> > Added correction (and test), see 
> > v3-0001-Fix-for-SPLIT-MERGE-partitions-of-temporary-table.patch.
> 
> Thank you for the correction, but may be an attempt to merge into implicit
> pg_temp should fail just like CREATE TABLE ... PARTITION OF ... does?
> 
> Please look also at another anomaly with schemas:
> CREATE SCHEMA s1;
> CREATE TABLE t (i int) PARTITION BY RANGE (i);
> CREATE TABLE tp_0_2 PARTITION OF t
>   FOR VALUES FROM (0) TO (2);
> ALTER TABLE t SPLIT PARTITION tp_0_2 INTO
>   (PARTITION s1.tp0 FOR VALUES FROM (0) TO (1), PARTITION s1.tp1 FOR VALUES 
> FROM (1) TO (2));
> results in:
> \d+ s1.*
> Did not find any relation named "s1.*"
> \d+ tp*
>   Table "public.tp0"

Hi,

Is this issue already fixed ?

I wasn't able to reproduce it.  Maybe it only happened with earlier
patch versions applied ?

Thanks,
-- 
Justin

Re: tablecmds.c/MergeAttributes() cleanup

On Tue, Apr 30, 2024 at 2:19 AM Ashutosh Bapat
 wrote:
> On Mon, Apr 29, 2024 at 6:46 PM Robert Haas  wrote:
>> On Sat, Apr 20, 2024 at 12:17 AM Ashutosh Bapat
>>  wrote:
>> > Yes please. Probably this issue surfaced again after we reverted 
>> > compression and storage fix? Please  If that's the case, please add it to 
>> > the open items.
>>
>> This is still on the open items list and I'm not clear who, if anyone,
>> is working on fixing it.
>>
>> It would be good if someone fixed it. :-)
>
> Here's a patch fixing it.
>
> I have added the reproducer provided by Alexander as a test. I thought of 
> improving that test further to test the compression of the inherited table 
> but did not implement it since we haven't documented the behaviour of 
> compression with inheritance. Defining and implementing compression behaviour 
> for inherited tables was the goal of 
> 0413a556990ba628a3de8a0b58be020fd9a14ed0, which has been reverted.

I took a look at this patch. Currently this case crashes:

CREATE TABLE cmdata(f1 text COMPRESSION pglz);
CREATE TABLE cmdata3(f1 text);
CREATE TABLE cminh() INHERITS (cmdata, cmdata3);

The patch makes this succeed, but I was initially unclear why it
didn't make it fail with an error instead: you can argue that cmdata
has pglz and cmdata3 has default and those are different. It seems
that prior precedent goes both ways -- we treat the absence of a
STORAGE specification as STORAGE EXTENDED and it conflicts with an
explicit storage specification on some other inheritance parent - but
on the other hand, we treat the absence of a default as compatible
with any explicit default, similar to what happens here. But I
eventually realized that you're just putting back behavior that we had
in previous releases: pre-v17, the code already works the way this
patch makes it do, and MergeChildAttribute() is already coded similar
to this. As Alexander Lakhin said upthread, 4d969b2f8 seems to have
broken this.

So now I think this is committable, but I can't do it now because I
won't be around for the next few hours in case the buildfarm blows up.
I can do it tomorrow, or perhaps Peter would like to handle it since
it seems to have been his commit that introduced the issue.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Support tid range scan in parallel?

2024-04-30 Thread Cary Huang

Hi David

Thank you for your reply.

> From a CPU point of view, I'd hard to imagine that a SELECT * query
> without any other items in the WHERE clause other than the TID range
> quals would run faster with multiple workers than with 1.  The problem
> is the overhead of pushing tuples to the main process often outweighs
> the benefits of the parallelism.  However, from an I/O point of view
> on a server with slow enough disks, I can imagine there'd be a
> speedup.

yeah, this is generally true. With everything set to default, the planner would 
not choose parallel sequential scan if the scan range covers mostly all tuples 
of a table (to reduce the overhead of pushing tuples to main proc as you 
mentioned). It is preferred when the target data is small but the table is 
huge. In my case, it is also the same, the planner by default uses normal tid 
range scan, so I had to alter cost parameters to influence the planner's 
decision. This is where I found that with WHERE clause only containing TID 
ranges that cover the entire table would result faster with parallel workers, 
at least in my environment.

> Of course, it may be beneficial to have parallel TID Range for other
> cases when more row filtering or aggregation is being done as that
> requires pushing fewer tuples over from the parallel worker to the
> main process. It just would be good to get to the bottom of if there's
> still any advantage to parallelism when no filtering other than the
> ctid quals is being done now that we've less chance of having to wait
> for I/O coming from disk with the read streams code.

I believe so too. I shared my test procedure below with ctid being the only 
quals. 

>> below is the timing to complete a select query covering all the records in a 
>> simple 2-column table with 40 million records,
>>
>> - tid range scan takes 10216ms
>> - tid range scan with 2 workers takes 7109ms
>> - sequential scan with 2 workers takes 8499ms
>
> Can you share more details about this test? i.e. the query, what the
> times are that you've measured (EXPLAIN ANALYZE, or SELECT, COPY?).
> Also, which version/commit did you patch against? I was wondering if
> the read stream code added in v17 would result in the serial case
> running faster because the parallelism just resulted in more I/O
> concurrency.

Yes of course. These numbers were obtained earlier this year on master with the 
patch applied most likely without the read stream code you mentioned. The patch 
attached here is rebased to commit dd0183469bb779247c96e86c2272dca7ff4ec9e7 on 
master, which is quite recent and should have the read stream code for v17 as I 
can immediately tell that the serial scans run much faster now in my setup. I 
increased the records on the test table from 40 to 100 million because serial 
scans are much faster now. Below is the summary and details of my test. Note 
that I only include the EXPLAIN ANALYZE details of round1 test. Round2 is the 
same except for different execution times. 

[env]
- OS: Ubuntu 18.04
- CPU: 4 cores @ 3.40 GHz
- MEM: 16 GB

[test table setup]
initdb with all default values
CREATE TABLE test (a INT, b TEXT);
INSERT INTO test VALUES(generate_series(1,1), 'testing');
SELECT min(ctid), max(ctid) from test;
  min  | max
---+--
 (0,1) | (540540,100)
(1 row)

[summary]
round 1:
tid range scan: 14915ms
tid range scan 2 workers: 12265ms
seq scan with 2 workers: 12675ms

round2:
tid range scan: 12112ms
tid range scan 2 workers: 10649ms
seq scan with 2 workers: 11206ms

[details of EXPLAIN ANALYZE below]

[default tid range scan]
EXPLAIN ANALYZE SELECT a FROM test WHERE ctid >= '(1,0)' AND ctid <= 
'(540540,100)';
 QUERY PLAN

 Tid Range Scan on test  (cost=0.01..1227029.81 rows=68648581 width=4) (actual 
time=0.188..12280.791 rows=9815 loops=1)
   TID Cond: ((ctid >= '(1,0)'::tid) AND (ctid <= '(540540,100)'::tid))
 Planning Time: 0.817 ms
 Execution Time: 14915.035 ms
(4 rows)

[parallel tid range scan with 2 workers]
set parallel_setup_cost=0;
set parallel_tuple_cost=0;
set min_parallel_table_scan_size=0;
set max_parallel_workers_per_gather=2;

EXPLAIN ANALYZE SELECT a FROM test WHERE ctid >= '(1,0)' AND ctid <= 
'(540540,100)';
   QUERY PLAN
-
 Gather  (cost=0.01..511262.43 rows=68648581 width=4) (actual 
time=1.322..9249.197 rows=9815 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Tid Range Scan on test  (cost=0.01..511262.43 rows=28603575 
width=4) (actual time=0.332..4906.262 rows=3272 loops=3)
 TID Cond: ((ctid >= '(1,0)'::tid) AND (ctid <= '(540540,100)'::tid))
 Planning Time: 0.213 ms

Re: pg17 issues with not-null contraints

2024-04-30 Thread Justin Pryzby

On Tue, Apr 30, 2024 at 01:52:02PM -0400, Robert Haas wrote:
> On Thu, Apr 18, 2024 at 12:52 PM Justin Pryzby  wrote:
> > I'm not totally clear on what's intended in v17 - maybe it'd be dead
> > code, and maybe it shouldn't even be applied to master branch.  But I do
> > think it's worth patching earlier versions (even though it'll be less
> > useful than having done so 5 years ago).
> 
> This thread is still on the open items list, but I'm not sure whether
> there's still stuff here that needs to be fixed for the current
> release. If not, this thread should be moved to the "resolved before
> 17beta1" section. If so, we should try to reach consensus on what the
> remaining issues are and what we're going to do about them.

I think the only thing that's relevant for v17 is this:

On Tue, Apr 16, 2024 at 08:11:49PM +0200, Alvaro Herrera wrote:
> Speaking of which, I wonder if I should modify pg16's tests so that they
> leave behind tables set up in this way, to immortalize pg_upgrade testing.

The patch on the other thread for pg_upgrade --check is an old issue
affecting all stable releases.

-- 
Justin

Control flow in logical replication walsender

2024-04-30 Thread Christophe Pettus



Hi,

I wanted to check my understanding of how control flows in a walsender doing 
logical replication.  My understanding is that the (single) thread in each 
walsender process, in the simplest case, loops on:

1. Pull a record out of the WAL.
2. Pass it to the recorder buffer code, which,
3. Sorts it out into the appropriate in-memory structure for that transaction 
(spilling to disk as required), and then continues with #1, or,
4. If it's a commit record, it iteratively passes the transaction data one 
change at a time to,
5. The logical decoding plugin, which returns the output format of that plugin, 
and then,
6. The walsender sends the output from the plugin to the client. It cycles on 
passing the data to the plugin and sending it to the client until it runs out 
of changes in that transaction, and then resumes reading the WAL in #1.

In particular, I wanted to confirm that while it is pulling the reordered 
transaction and sending it to the plugin (and thence to the client), that 
particular walsender is *not* reading new WAL records or putting them in the 
reorder buffer.

The specific issue I'm trying to track down is an enormous pileup of spill 
files.  This is in a non-supported version of PostgreSQL (v11), so an upgrade 
may fix it, but at the moment, I'm trying to find a cause and a mitigation.

Re: pg17 issues with not-null contraints

On Thu, Apr 18, 2024 at 12:52 PM Justin Pryzby  wrote:
> I'm not totally clear on what's intended in v17 - maybe it'd be dead
> code, and maybe it shouldn't even be applied to master branch.  But I do
> think it's worth patching earlier versions (even though it'll be less
> useful than having done so 5 years ago).

This thread is still on the open items list, but I'm not sure whether
there's still stuff here that needs to be fixed for the current
release. If not, this thread should be moved to the "resolved before
17beta1" section. If so, we should try to reach consensus on what the
remaining issues are and what we're going to do about them.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

[PATCH] json_lex_string: don't overread on bad UTF8

2024-04-30 Thread Jacob Champion

Hi all,

When json_lex_string() hits certain types of invalid input, it calls
pg_encoding_mblen_bounded(), which assumes that its input is
null-terminated and calls strnlen(). But the JSON lexer is constructed
with an explicit string length, and we don't ensure that the string is
null-terminated in all cases, so we can walk off the end of the
buffer. This isn't really relevant on the server side, where you'd
have to get a superuser to help you break string encodings, but for
client-side usage on untrusted input (such as my OAuth patch) it would
be more important.

Attached is a draft patch that explicitly checks against the
end-of-string pointer and clamps the token_terminator to it. Note that
this removes the only caller of pg_encoding_mblen_bounded() and I'm
not sure what we should do with that function. It seems like a
reasonable API, just not here.

The new test needs to record two versions of the error message, one
for invalid token and one for invalid escape sequence. This is
because, for smaller chunk sizes, the partial-token logic in the
incremental JSON parser skips the affected code entirely when it can't
find an ending double-quote.

Tangentially: Should we maybe rethink pieces of the json_lex_string
error handling? For example, do we really want to echo an incomplete
multibyte sequence once we know it's bad? It also looks like there are
places where the FAIL_AT_CHAR_END macro is called after the `s`
pointer has already advanced past the code point of interest. I'm not
sure if that's intentional.

Thanks,
--Jacob


0001-json_lex_string-don-t-overread-on-bad-UTF8.patch
Description: Binary data

Re: A problem about partitionwise join

On Wed, Feb 21, 2024 at 6:25 AM Richard Guo  wrote:
> This patch was returned due to 'lack of interest'.  However, upon
> verification, it appears that the reported issue still exists, and the
> proposed fix in the thread remains valid.  Hence, resurrect this patch
> after rebasing it on master.  I've also written a detailed commit
> message which hopefully can help people review the changes more
> effectively.

I think it's slightly questionable whether this patch is worthwhile.
The case memorialized in the regression tests, t1.a = t2.a AND t1.a =
t2.b, is a very weird thing to do. The case mentioned in the original
email, foo.k1 = bar.k1 and foo.k2 = bar.k2 and foo.k2 = 16, seems like
something that could realistically happen, especially when there are
views in use (e.g. the view joins foo and bar, and then someone
queries the view for one of the join columns). In such a case, it's
possible that foo.k2 = 16 is selective enough that we really don't
care about partition-wise join any more, but it's also possible that
it's not too selective and we do care about partition-wise join. So I
don't think that the case that the patch fixes is something that can
ever happen, but I do think it's probably fairly rare that brings any
benefit, which is why I thought that EC-based matching was an OK
approach to this problem initially. Perhaps that was the wrong idea,
though.

Does the additional logic added by this patch have a noticeable
performance cost?

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation

Alexander Korotkov  writes:
> I agree that storing char signedness might seem weird.  But it appears
> that we already store indexes that depend on char signedness.  So,
> it's effectively property of bits-on-disk even though it affects
> indirectly.  Then I see two options to make the picture consistent.
> 1) Assume that char signedness is somehow a property of bits-on-disk
> even though it's weird.  Then pg_trgm indexes are correct, but we need
> to store char signedness in pg_control.
> 2) Assume that char signedness is not a property of bits-on-disk.
> Then pg_trgm indexes are buggy and need to be fixed.
> What do you think?

The problem with option (2) is the assumption that pg_trgm's behavior
is the only bug of this kind, either now or in the future.  I think
that's just about an impossible standard to meet, because there's no
realistic way to test whether char signedness is affecting things.
(Sure, you can compare results across platforms, but maybe you
just didn't test the right case.)

Also, the bigger picture here is the seeming assumption that "if
we change pg_trgm then it will be safe to replicate from x86 to
arm".  I don't believe that that's a good idea and I'm unwilling
to promise that it will work, regardless of what we do about
char signedness.  That being the case, I don't want to invest a
lot of effort in the signedness issue.  Option (1) is clearly
a small change with little if any risk of future breakage.
Option (2) ... not so much.

regards, tom lane

Re: TerminateOtherDBBackends code comments inconsistency.

2024-04-30 Thread Noah Misch

On Tue, Apr 30, 2024 at 09:10:52AM +0530, Amit Kapila wrote:
> On Tue, Apr 30, 2024 at 2:58 AM Noah Misch  wrote:
> > On Mon, Apr 29, 2024 at 10:18:35AM +0530, Amit Kapila wrote:
> > > On Mon, Apr 22, 2024 at 9:56 PM Noah Misch  wrote:
> >
> > > 3a9b18b309 didn't change the docs of pg_terminate_backend and whatever
> > > is mentioned w.r.t permissions in the doc of that function sounds
> > > valid for drop database force to me. Do you have any specific proposal
> > > in your mind?
> >
> > Something like the attached.
> 
> LGTM.
> 
> >  One could argue the function should also check
> > isBackgroundWorker and ignore even bgworkers that set proc->roleId, but I've
> > not done that.
> 
> What is the argument for ignoring such workers?

One of the proposed code comments says, "For bgworker authors, it's convenient
to be able to recommend FORCE if a worker is blocking DROP DATABASE
unexpectedly."  That argument is debatable, but I do think it applies equally
to bgworkers whether or not they set proc->roleId.

Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation

On Tue, Apr 30, 2024 at 7:54 PM Tom Lane  wrote:
> Alexander Korotkov  writes:
> > Given this, should we try to do better with binary compatibility
> > checks using ControlFileData?  AFAICS they are supposed to check if
> > the database cluster is binary compatible with the running
> > architecture.  But it obviously allows incompatibilities.
>
> Perhaps.  pg_control already covers endianness, which I think
> is the root of the hashing differences I showed.  Adding a field
> for char signedness feels a little weird, since it's not directly
> a property of the bits-on-disk, but maybe we should.

I agree that storing char signedness might seem weird.  But it appears
that we already store indexes that depend on char signedness.  So,
it's effectively property of bits-on-disk even though it affects
indirectly.  Then I see two options to make the picture consistent.
1) Assume that char signedness is somehow a property of bits-on-disk
even though it's weird.  Then pg_trgm indexes are correct, but we need
to store char signedness in pg_control.
2) Assume that char signedness is not a property of bits-on-disk.
Then pg_trgm indexes are buggy and need to be fixed.
What do you think?

--
Regards,
Alexander Korotkov

Re: Avoid orphaned objects dependencies, take 3

2024-04-30 Thread Alexander Lakhin


Hi Bertrand,

25.04.2024 10:20, Bertrand Drouvot wrote:

postgres=# CREATE FUNCTION f() RETURNS int LANGUAGE SQL RETURN f() + 1;
ERROR:  cache lookup failed for function 16400

This stuff does appear before we get a chance to call the new 
depLockAndCheckObject()
function.

I think this is what Tom was referring to in [1]:

"
So the only real fix for this would be to make every object lookup in the entire
system do the sort of dance that's done in RangeVarGetRelidExtended.
"

The fact that those kind of errors appear also somehow ensure that no orphaned
dependencies can be created.


I agree; the only thing that I'd change here, is the error code.

But I've discovered yet another possibility to get a broken dependency.
Please try this script:
res=0
numclients=20
for ((i=1;i<=100;i++)); do
for ((c=1;c<=numclients;c++)); do
  echo "
CREATE SCHEMA s_$c;
CREATE CONVERSION myconv_$c FOR 'LATIN1' TO 'UTF8' FROM iso8859_1_to_utf8;
ALTER CONVERSION myconv_$c SET SCHEMA s_$c;
  " | psql >psql1-$c.log 2>&1 &
  echo "DROP SCHEMA s_$c RESTRICT;" | psql >psql2-$c.log 2>&1 &
done
wait
pg_dump -f db.dump || { echo "on iteration $i"; res=1; break; }
for ((c=1;c<=numclients;c++)); do
  echo "DROP SCHEMA s_$c CASCADE;" | psql >psql3-$c.log 2>&1
done
done
psql -c "SELECT * FROM pg_conversion WHERE connamespace NOT IN (SELECT oid FROM 
pg_namespace);"

It fails for me (with the v4 patch applied) as follows:
pg_dump: error: schema with OID 16392 does not exist
on iteration 1
  oid  | conname  | connamespace | conowner | conforencoding | contoencoding |  
    conproc  | condefault
---+--+--+--++---+---+
 16396 | myconv_6 |    16392 |   10 |  8 | 6 | 
iso8859_1_to_utf8 | f

Best regards,
Alexander

Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation

Alexander Korotkov  writes:
> Given this, should we try to do better with binary compatibility
> checks using ControlFileData?  AFAICS they are supposed to check if
> the database cluster is binary compatible with the running
> architecture.  But it obviously allows incompatibilities.

Perhaps.  pg_control already covers endianness, which I think
is the root of the hashing differences I showed.  Adding a field
for char signedness feels a little weird, since it's not directly
a property of the bits-on-disk, but maybe we should.

regards, tom lane

Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation

On Tue, Apr 23, 2024 at 5:57 PM Tom Lane  wrote:
> "Guo, Adam"  writes:
> > I would like to report an issue with the pg_trgm extension on
> > cross-architecture replication scenarios. When an x86_64 standby
> > server is replicating from an aarch64 primary server or vice versa,
> > the gist_trgm_ops opclass returns different results on the primary
> > and standby.
>
> I do not think that is a supported scenario.  Hash functions and
> suchlike are not guaranteed to produce the same results on different
> CPU architectures.  As a quick example, I get
>
> regression=# select hashfloat8(34);
>  hashfloat8
> 
>21570837
> (1 row)
>
> on x86_64 but
>
> postgres=# select hashfloat8(34);
>  hashfloat8
> 
>  -602898821
> (1 row)
>
> on ppc32 thanks to the endianness difference.

Given this, should we try to do better with binary compatibility
checks using ControlFileData?  AFAICS they are supposed to check if
the database cluster is binary compatible with the running
architecture.  But it obviously allows incompatibilities.

--
Regards,
Alexander Korotkov

Re: SQL:2011 application time

2024-04-30 Thread Paul Jungwirth


On 4/30/24 09:24, Robert Haas wrote:

Peter, could you have a look at
http://postgr.es/m/47550967-260b-4180-9791-b224859fe...@illuminatedcomputing.com
and express an opinion about whether each of those proposals are (a)
good or bad ideas and (b) whether they need to be fixed for the
current release?


Here are the same patches but rebased. I've added a fourth which is my progress on adding the CHECK 
constraint. I don't really consider it finished though, because it has these problems:


- The CHECK constraint should be marked as an internal dependency of the PK, so that you can't drop 
it, and it gets dropped when you drop the PK. I don't see a good way to tie the two together though, 
so I'd appreciate any advice there. They are separate AlterTableCmds, so how do I get the 
ObjectAddress of both constraints at the same time? I wanted to store the PK's ObjectAddress on the 
Constraint node, but since ObjectAddress isn't a Node it doesn't work.


- The CHECK constraint should maybe be hidden when you say `\d foo`? Or maybe not, but that's what 
we do with FK triggers.


- When you create partitions you get a warning about the constraint already existing, because it 
gets created via the PK and then also the partitioning code tries to copy it. Solving the first 
issue here should solve this nicely though.


Alternately we could just fix the GROUP BY functional dependency code to only accept b-tree indexes. 
But I think the CHECK constraint approach is a better solution.


Thanks,

--
Paul  ~{:-)
p...@illuminatedcomputing.comFrom 0dbc008a654ab1fdc5f492345ee4575c352499d3 Mon Sep 17 00:00:00 2001
From: "Paul A. Jungwirth" 
Date: Sun, 24 Mar 2024 21:46:30 -0700
Subject: [PATCH v2 1/4] Fix ON CONFLICT DO NOTHING/UPDATE for temporal indexes

A PRIMARY KEY or UNIQUE constraint with WITHOUT OVERLAPS will be a GiST
index, not a B-Tree, but it will still have indisunique set. The code
for ON CONFLICT fails if it sees a non-btree index that has indisunique.
This commit fixes that and adds some tests. But now that we can't just
test indisunique, we also need some extra checks to prevent DO UPDATE
from running against a WITHOUT OVERLAPS constraint (because the conflict
could happen against more than one row, and we'd only update one).
---
 src/backend/catalog/index.c   |   1 +
 src/backend/executor/execIndexing.c   |   2 +-
 src/backend/optimizer/util/plancat.c  |   4 +-
 src/include/nodes/execnodes.h |   1 +
 .../regress/expected/without_overlaps.out | 176 ++
 src/test/regress/sql/without_overlaps.sql | 113 +++
 6 files changed, 294 insertions(+), 3 deletions(-)

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 5a8568c55c9..1fd543cc550 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -2443,6 +2443,7 @@ BuildIndexInfo(Relation index)
  >ii_ExclusionOps,
  >ii_ExclusionProcs,
  >ii_ExclusionStrats);
+		ii->ii_HasWithoutOverlaps = ii->ii_Unique;
 	}
 
 	return ii;
diff --git a/src/backend/executor/execIndexing.c b/src/backend/executor/execIndexing.c
index 9f05b3654c1..faa37ca56db 100644
--- a/src/backend/executor/execIndexing.c
+++ b/src/backend/executor/execIndexing.c
@@ -210,7 +210,7 @@ ExecOpenIndices(ResultRelInfo *resultRelInfo, bool speculative)
 		 * If the indexes are to be used for speculative insertion, add extra
 		 * information required by unique index entries.
 		 */
-		if (speculative && ii->ii_Unique)
+		if (speculative && ii->ii_Unique && !ii->ii_HasWithoutOverlaps)
 			BuildSpeculativeIndexInfo(indexDesc, ii);
 
 		relationDescs[i] = indexDesc;
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 130f838629f..a398d7a78d1 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -815,7 +815,7 @@ infer_arbiter_indexes(PlannerInfo *root)
 		 */
 		if (indexOidFromConstraint == idxForm->indexrelid)
 		{
-			if (!idxForm->indisunique && onconflict->action == ONCONFLICT_UPDATE)
+			if ((!idxForm->indisunique || idxForm->indisexclusion) && onconflict->action == ONCONFLICT_UPDATE)
 ereport(ERROR,
 		(errcode(ERRCODE_WRONG_OBJECT_TYPE),
 		 errmsg("ON CONFLICT DO UPDATE not supported with exclusion constraints")));
@@ -837,7 +837,7 @@ infer_arbiter_indexes(PlannerInfo *root)
 		 * constraints), so index under consideration can be immediately
 		 * skipped if it's not unique
 		 */
-		if (!idxForm->indisunique)
+		if (!idxForm->indisunique || idxForm->indisexclusion)
 			goto next;
 
 		/* Build BMS representation of plain (non expression) index attrs */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index d927ac44a82..fdfaef284e9 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -204,6 +204,7 @@ typedef struct IndexInfo
 	bool		ii_Summarizing;
 	int			ii_ParallelWorkers;
 	Oid			ii_Am;
+	bool

Re: SQL:2011 application time

On Fri, Apr 26, 2024 at 3:41 PM Paul Jungwirth
 wrote:
> On 4/26/24 12:25, Robert Haas wrote:
> > I think this thread should be added to the open items list.
>
> Thanks! I sent a request to pgsql-www to get edit permission. I didn't 
> realize there was a wiki page
> tracking things like this. I agree it needs to be fixed if we want to include 
> the feature.

Great, I see that it's on the list now.

Peter, could you have a look at
http://postgr.es/m/47550967-260b-4180-9791-b224859fe...@illuminatedcomputing.com
and express an opinion about whether each of those proposals are (a)
good or bad ideas and (b) whether they need to be fixed for the
current release?

Thanks,

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation

Masahiko Sawada  writes:
> On Tue, Apr 30, 2024 at 12:37 PM Tom Lane  wrote:
>> I think this will break existing indexes that are working fine.
>> Yeah, it would have been better to avoid the difference, but
>> it's too late now.

> True. So it will be a PG18 item.

How will it be any better in v18?  It's still an on-disk
compatibility break for affected platforms.

Now, people could recover by reindexing affected indexes,
but I think we need to have a better justification than this
for making them do so.

regards, tom lane

Re: Support LIKE with nondeterministic collations

2024-04-30 Thread Daniel Verite

Peter Eisentraut wrote:

> This patch adds support for using LIKE with nondeterministic
>  collations.  So you can do things such as
> 
> col LIKE 'foo%' COLLATE case_insensitive

Nice!

> The pattern is partitioned into substrings at wildcard characters
> (so 'foo%bar' is partitioned into 'foo', '%', 'bar') and then then
> whole predicate matches if a match can be found for each partition
> under the applicable collation

Trying with a collation that ignores punctuation:

  postgres=# CREATE COLLATION "ign_punct" (
provider = 'icu',
locale='und-u-ka-shifted',
deterministic = false
  );

  postgres=# SELECT '.foo.' like 'foo' COLLATE ign_punct;
   ?column? 
  --
   t
  (1 row)

  postgres=# SELECT '.foo.' like 'f_o' COLLATE ign_punct;
   ?column? 
  --
   t
  (1 row)

  postgres=# SELECT '.foo.' like '_oo' COLLATE ign_punct;
   ?column? 
  --
   f
  (1 row)

The first two results look fine, but the next one is inconsistent.


Best regards,
-- 
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite

Re: BitmapHeapScan streaming read user and prelim refactoring

2024-04-30 Thread Daniel Gustafsson

> On 26 Apr 2024, at 15:04, Melanie Plageman  wrote:

> If this seems correct to you, are you okay with the rest of the fix
> and test? We could close this open item once the patch is acceptable.

From reading the discussion and the patch this seems like the right fix to me.
Does the test added here aptly cover 04e72ed617be in terms its functionality?

--
Daniel Gustafsson

Re: CREATE TABLE/ProcessUtility hook behavior change

2024-04-30 Thread jian he

On Tue, Apr 30, 2024 at 4:35 PM David Steele  wrote:
>
> On 4/30/24 12:57, jian he wrote:
> > On Tue, Apr 30, 2024 at 10:26 AM David Steele  wrote:
> >>
> >> Since bb766cde cannot be readily applied to older commits in master I'm
> >> unable to continue bisecting to find the ALTER TABLE behavioral change.
> >>
> >> This seems to leave me with manual code inspection and there have been a
> >> lot of changes in this area, so I'm hoping somebody will know why this
> >> ALTER TABLE change happened before I start digging into it.
> >

I just tested these two commits.
https://git.postgresql.org/cgit/postgresql.git/commit/?id=3da13a6257bc08b1d402c83feb2a35450f988365
https://git.postgresql.org/cgit/postgresql.git/commit/?id=b0e96f311985bceba79825214f8e43f65afa653a

i think it's related to the catalog not null commit.
it will alter table and add not null constraint internally (equivalent
to `alter table test alter id set not null`).

Re: partitioning and identity column

2024-04-30 Thread Ashutosh Bapat

On Sun, Apr 28, 2024 at 12:29 PM Alexander Lakhin 
wrote:

> 27.04.2024 18:00, Alexander Lakhin wrote:
> >
> > Please look also at another script, which produces the same error:
>
> I've discovered yet another problematic case:
> CREATE TABLE tbl1 (a int GENERATED ALWAYS AS IDENTITY, b text)
>  PARTITION BY LIST (a);
> CREATE TABLE tbl2 (b text, a int NOT NULL);
> ALTER TABLE tbl1 ATTACH PARTITION tbl2 DEFAULT;
>
> INSERT INTO tbl2 DEFAULT VALUES;
> ERROR:  no owned sequence found
>
> Though it works with tbl2(a int NOT NULL, b text)...
> Take a look at this too, please.
>

Thanks Alexander for the report.

PFA patch which fixes all the three problems.

I had not fixed getIdentitySequence() to fetch identity sequence associated
with the partition because I thought it would be better to fail with an
error when it's not used correctly. But these bugs show 1. the error is
misleading and unwanted 2. there are more places where adding that logic
to  getIdentitySequence() makes sense. Fixed the function in these patches.
Now callers like transformAlterTableStmt have to be careful not to call the
function on a partition.

I have examined all the callers of getIdentitySequence() and they seem to
be fine. The code related to SetIdentity, DropIdentity is not called for
partitions, errors for which are thrown elsewhere earlier.

-- 
Best Wishes,
Ashutosh Bapat
From bce2e8d573040901b18c0c9f6884f0fd44f50724 Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat 
Date: Tue, 30 Apr 2024 15:32:34 +0530
Subject: [PATCH] Fix assorted bugs related to identity column in partitioned
 tables

When changing data type of a column of a partitioned table craft ALTER SEQUENCE
command only once. Partitions do not have identity sequences of their own and
thus do not need ALTER SEQUENCE command for each partition.

Fix getIdentitySequence() to fetch the identity sequence associated with the
top level partitioned table when Relation of a partition is passed to it. While
doing so translate the attribute number of the partition into the attribute
number of the partitioned table.

Author: Ashutosh Bapat
Discussion: https://www.postgresql.org/message-id/3b8a9dc1-bbc7-0ef5-6863-c432afac7...@gmail.com
---
 src/backend/catalog/pg_depend.c| 31 ++-
 src/backend/commands/tablecmds.c   |  2 +-
 src/backend/parser/parse_utilcmd.c | 52 +++---
 src/backend/rewrite/rewriteHandler.c   | 19 +-
 src/include/catalog/dependency.h   |  2 +-
 src/test/regress/expected/identity.out | 32 +++-
 src/test/regress/sql/identity.sql  | 11 +-
 7 files changed, 101 insertions(+), 48 deletions(-)

diff --git a/src/backend/catalog/pg_depend.c b/src/backend/catalog/pg_depend.c
index f85a898de8..5366f7820c 100644
--- a/src/backend/catalog/pg_depend.c
+++ b/src/backend/catalog/pg_depend.c
@@ -23,10 +23,12 @@
 #include "catalog/pg_constraint.h"
 #include "catalog/pg_depend.h"
 #include "catalog/pg_extension.h"
+#include "catalog/partition.h"
 #include "commands/extension.h"
 #include "miscadmin.h"
 #include "utils/fmgroids.h"
 #include "utils/lsyscache.h"
+#include "utils/syscache.h"
 #include "utils/rel.h"
 
 
@@ -941,10 +943,35 @@ getOwnedSequences(Oid relid)
  * Get owned identity sequence, error if not exactly one.
  */
 Oid
-getIdentitySequence(Oid relid, AttrNumber attnum, bool missing_ok)
+getIdentitySequence(Relation rel, AttrNumber attnum, bool missing_ok)
 {
-	List	   *seqlist = getOwnedSequences_internal(relid, attnum, DEPENDENCY_INTERNAL);
+	Oid			relid;
+	List	   *seqlist;
 
+	/*
+	 * The identity sequence is associated with the topmost partitioned table,
+	 * which might have column order different than the given partition.
+	 */
+	if (RelationGetForm(rel)->relispartition)
+	{
+		List	   *ancestors =
+			get_partition_ancestors(RelationGetRelid(rel));
+		HeapTuple	ctup = SearchSysCacheAttNum(RelationGetRelid(rel), attnum);
+		const char *attname = NameStr(((Form_pg_attribute) GETSTRUCT(ctup))->attname);
+		HeapTuple	ptup;
+
+		relid = llast_oid(ancestors);
+		ptup = SearchSysCacheAttName(relid, attname);
+		attnum = ((Form_pg_attribute) GETSTRUCT(ptup))->attnum;
+
+		ReleaseSysCache(ctup);
+		ReleaseSysCache(ptup);
+		list_free(ancestors);
+	}
+	else
+		relid = RelationGetRelid(rel);
+
+	seqlist = getOwnedSequences_internal(relid, attnum, DEPENDENCY_INTERNAL);
 	if (list_length(seqlist) > 1)
 		elog(ERROR, "more than one owned sequence found");
 	else if (seqlist == NIL)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 3556240c8e..9d32b2c495 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -8516,7 +8516,7 @@ ATExecDropIdentity(Relation rel, const char *colName, bool missing_ok, LOCKMODE
 	if (!recursing)
 	{
 		/* drop the internal sequence */
-		seqid = getIdentitySequence(RelationGetRelid(rel), attnum, false);
+		seqid = getIdentitySequence(rel, attnum, false);

Re: pg_input_error_info doc 2 exampled crammed together

On Sun, Apr 28, 2024 at 10:07:49PM -0700, David G. Johnston wrote:
> Agreed.  The column names are self-explanatory if you’ve seen errors
> before.  The values are immaterial.  Plus we don’t generally use
> psql-specific features in our examples.

Okay, I've just cleaned up that a bit with f6ab942f5de0.
--
Michael


signature.asc
Description: PGP signature

Re: Removing unneeded self joins

On Tue, Apr 30, 2024 at 9:00 AM Alexander Lakhin  wrote:
> 23.10.2023 12:47, Alexander Korotkov wrote:
> > I think this patch makes substantial improvement to query planning.
> > It has received plenty of reviews.  The code is currently in quite
> > good shape.  I didn't manage to find the cases when this optimization
> > causes significant overhead to planning time.  Even if such cases will
> > be spotted there is a GUC option to disable this feature.  So, I'll
> > push this if there are no objections.
>
> I've discovered another failure, introduced by d3d55ce57.
> Please try the following:
> CREATE TABLE t (a int unique, b float);
> SELECT * FROM t NATURAL JOIN LATERAL
>   (SELECT * FROM t t2 TABLESAMPLE SYSTEM (t.b)) t2;

I think we should just forbid SJE in case when relations to be merged
have cross-references with lateral vars.  The draft patch for this is
attached.  I'd like to ask Alexander to test it, Richard and Andrei to
review it.  Thank you!

--
Regards,
Alexander Korotkov


sje_skip_cross_lateral_vars.patch
Description: Binary data

Re: Direct SSL connection with ALPN and HBA rules

2024-04-30 Thread Daniel Gustafsson

> On 29 Apr 2024, at 21:06, Heikki Linnakangas  wrote:

> Oh I was not aware sslrootcert=system works like that. That's a bit 
> surprising, none of the other ssl-related settings imply or require that SSL 
> is actually used. Did we intend to set a precedence for new settings with 
> that?

It was very much intentional, and documented, an sslmode other than verify-full
makes little sense when combined with sslrootcert=system.  It wasn't intended
to set a precedence (though there is probably a fair bit of things we can do,
getting this right is hard enough as it is), rather it was footgun prevention.

--
Daniel Gustafsson

Re: using extended statistics to improve join estimates

2024-04-30 Thread Andy Fan

Hello Justin,

Thanks for showing interest on this!

> On Sun, Apr 28, 2024 at 10:07:01AM +0800, Andy Fan wrote:
>> 's/estimiatedcluases/estimatedclauses/' typo error in the
>> commit message is not fixed since I have to regenerate all the commits
>
> Maybe you know this, but some of these patches need to be squashed.
> Regenerating the patches to address feedback is the usual process.
> When they're not squished, it makes it hard to review the content of the
> patches.

You might overlooked the fact that the each individual commit is just to
make the communication effectively (easy to review) and all of them
will be merged into 1 commit at the last / during the process of review. 

Even though if something make it hard to review, I am pretty happy to
regenerate the patches, but does 's/estimiatedcluases/estimatedclauses/'
belongs to this category? I'm pretty sure that is not the only typo
error or inapproprate word, if we need to regenerate the 22 patches
because of that, we have to regenerate that pretty often.

Do you mind to provide more feedback once and I can merge all of them in
one modification or you think the typo error has blocked the review
process?

>
> For example:
> [PATCH v1 18/22] Fix error "unexpected system attribute" when join with 
> system attr
> ..adds .sql regression tests, but the expected .out isn't updated until
> [PATCH v1 19/22] Fix the incorrect comment on extended stats.
>
> That fixes an elog() in Tomas' original commit, so it should probably be
> 002 or 003.

Which elog are you talking about?

> It might make sense to keep the first commit separate for
> now, since it's nice to keep Tomas' original patch "pristine" to make
> more apparent the changes you're proposing.

This is my goal as well, did you find anything I did which break this
rule, that's absoluately not my intention.

> Another:
> [PATCH v1 20/22] Add fastpath when combine the 2 MCV like eqjoinsel_inner.
> ..doesn't compile without
> [PATCH v1 21/22] When mcv->ndimensions == list_length(clauses), handle it 
> same as
>
> Your 022 patch fixes a typo in your 002 patch, which means that first
> one reads a patch with a typo, and then later, a 10 line long patch
> reflowing the comment with a typo fixed.

I would like to regenerate the 22 patches if you think the typo error
make the reivew process hard. I can do such things but not willing to
do that often.

>
> A good guideline is that each patch should be self-contained, compiling
> and passing tests.  Which is more difficult with a long stack of
> patches.

I agree.

-- 
Best Regards
Andy Fan

Re: [PATCH] Fix bug when calling strncmp in check_authmethod_valid

2024-04-30 Thread Daniel Gustafsson

> On 30 Apr 2024, at 04:41, Jingxian Li  wrote:

> Attached is a patch that fixes bug when calling strncmp function, in 
> which case the third argument (authmethod - strchr(authmethod, ' ')) 
> may be negative, which is not as expected..

The calculation is indeed incorrect, but the lack of complaints of it being
broken made me wonder why this exist in the first place.  This dates back to
e7029b212755, just shy of 2 decades old, which added --auth with support for
strings with auth-options to ident and pam like --auth 'pam ' and
'ident sameuser'.  Support for options to ident was removed in 01c1a12a5bb4 but
options to pam is still supported (although not documented), but was AFAICT
broken in commit 8a02339e9ba3 some 12 years ago with this strncmp().

-  if (strncmp(authmethod, *p, (authmethod - strchr(authmethod, ' '))) == 0)
+  if (strncmp(authmethod, *p, (strchr(authmethod, ' ') - authmethod)) == 0)

This with compare "pam postgresql" with "pam" and not "pam " so the length
should be "(strchr(authmethod, ' ') - authmethod + 1)" since "pam " is a method
separate from "pam" in auth_methods_{host|local}.  We don't want to allow "md5
" as that's not a method in the array of valid methods.

But, since it's been broken in all supported versions of postgres and has
AFAICT never been documented to exist, should we fix it or just remove it?  We
don't support auth-options for any other methods, like clientcert to cert for
example.  If we fix it we should also document that it works IMHO.

--
Daniel Gustafsson

Re: Removing unneeded self joins