Re: Docs: Encourage strong server verification with SCRAM

2023-05-28 Thread Michael Paquier
On Sun, May 28, 2023 at 02:21:53PM -0400, Jonathan S. Katz wrote:
> The above assumes that the reader reviewed the previous paragraph and
> followed the guidelines there. However, we can make it explicit. Please see
> attached.

Yeah, I was under the same impression as Jacob that we don't insist
enough in this new paragraph about what sslmode ought to be when I
initially read the patch.  However, looking at the html page produced,
what you have to refer to the previous paragraph looks OK to me.
--
Michael


signature.asc
Description: PGP signature


Re: contrib/spi/insert_username.c comment typo?

2023-05-28 Thread John Naylor
On Mon, May 29, 2023 at 11:31 AM jian he 
wrote:
> hi.
> reading through contrib/spi/insert_username.c
>
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=contrib/spi/insert_username.c;h=a2e1747ff74c7667665dcc334f54ad368885d83c;hb=HEAD
> 36 /* sanity checks from autoinc.c */
> 37 if (!CALLED_AS_TRIGGER(fcinfo))
> 38 /* internal error */
> 39 elog(ERROR, "insert_username: not fired by trigger manager");
>
> should it be  /* sanity checks from insert_username.c */ ?

I believe it's saying the checks were copied from autoinc.c.

--
John Naylor
EDB: http://www.enterprisedb.com


contrib/spi/insert_username.c comment typo?

2023-05-28 Thread jian he
hi.
reading through contrib/spi/insert_username.c
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=contrib/spi/insert_username.c;h=a2e1747ff74c7667665dcc334f54ad368885d83c;hb=HEAD
36

/* sanity checks from autoinc.c */
37

if (!CALLED_AS_TRIGGER(fcinfo))
38

/* internal error */
39

elog(ERROR, "insert_username: not fired by trigger manager");

should it be  /* sanity checks from insert_username.c */ ?


Re: Cleaning up nbtree after logical decoding on standby work

2023-05-28 Thread Heikki Linnakangas

On 29/05/2023 03:31, Michael Paquier wrote:

On Fri, May 26, 2023 at 04:48:37PM -0700, Peter Geoghegan wrote:

I'd have thought the subject line "Cleaning up nbtree after logical
decoding on standby work" made it quite clear that this patch was
targeting 16.


Hmm, okay.  I was understanding that as something for v17, honestly.


IMO this makes sense for v16. These new arguments were introduced in 
v16, so if we have second thoughts, now is the right time to change 
them, before v16 is released. It will reduce the backpatching effort in 
the future; if we apply this in v17, then v16 will be the odd one out.


For the patch itself:


@@ -75,6 +74,10
  * _bt_search() -- Search the tree for a particular scankey,
  * or more precisely for the first leaf page it could be on.
  *
+ * rel must always be provided.  heaprel must be provided by callers that pass
+ * access = BT_WRITE, since we may need to allocate a new root page for caller
+ * in passing (or when finishing a page split results in a parent page split).
+ *
  * The passed scankey is an insertion-type scankey (see nbtree/README),
  * but it can omit the rightmost column(s) of the index.
  *


Maybe add an assertion for that in _bt_search(), too. I know you added 
one in _bt_getroot(), and _bt_search() calls that as the very first 
thing. But I think it would be useful as documentation in _bt_search(), too.


Maybe it would be more straightforward to always require heapRel in 
_bt_search() and _bt_getroot(), regardless of whether it's BT_READ or 
BT_WRITE, even if the functions don't use it with BT_READ. It would be 
less mental effort in the callers to just always pass in 'heapRel'.


Overall, +1 on this patch, and +1 for committing it to v16.

--
Heikki Linnakangas
Neon (https://neon.tech)





Re: Implement generalized sub routine find_in_log for tap test

2023-05-28 Thread vignesh C
On Sat, 27 May 2023 at 17:32, Andrew Dunstan  wrote:
>
>
> On 2023-05-26 Fr 20:35, vignesh C wrote:
>
> On Fri, 26 May 2023 at 04:09, Michael Paquier  wrote:
>
> On Thu, May 25, 2023 at 06:34:20PM +0100, Dagfinn Ilmari Mannsåker wrote:
>
> However, none of the other functions in ::Utils know anything about node
> objects, which makes me think it should be a method on the node itself
> (i.e. in PostgreSQL::Test::Cluster) instead.  Also, I think log_contains
> would be a better name, since it just returns a boolean.  The name
> find_in_log makes me think it would return the log lines matching the
> pattern, or the position of the match in the file.
>
> In that case, the slurp_file() call would have to be fully qualified,
> since ::Cluster uses an empty import list to avoid polluting the method
> namespace with imported functions.
>
> Hmm.  connect_ok() and connect_fails() in Cluster.pm have a similar
> log comparison logic, feeding from the offset of a log file.  Couldn't
> you use the same code across the board for everything?  Note that this
> stuff is parameterized so as it is possible to check if patterns match
> or do not match, for multiple patterns.  It seems to me that we could
> use the new log finding routine there as well, so how about extending
> it a bit more?  You would need, at least:
> - One parameter for log entries matching.
> - One parameter for log entries not matching.
>
> I felt adding these to log_contains was making the function slightly
> complex with multiple checks. I was not able to make it simple with
> the approach I tried. How about having a common function
> check_connect_log_contents which has the common log contents check for
> connect_ok and connect_fails function like the v2-0002 patch attached.
>
>
> +$offset = 0 unless defined $offset;
>
>
> This is unnecessary, as slurp_file() handles it appropriately, and in fact 
> doing this is slightly inefficient, as it will cause slurp_file to do a 
> redundant seek.
>
> FYI there's a simpler way to say it if we wanted to:
>
> $offset //= 0;

Thanks for the comment, the attached v3 version patch has the changes
for the same.

Regards,
Vignesh
From 02cda5bc291d2e951b971081981a336e22c74ec1 Mon Sep 17 00:00:00 2001
From: Vignesh C 
Date: Sat, 27 May 2023 05:36:34 +0530
Subject: [PATCH v3 2/2] Move common connection log content verification code
 to a common function.

Move common connection log content verification code to a common
function.
---
 src/test/perl/PostgreSQL/Test/Cluster.pm | 122 +++
 1 file changed, 60 insertions(+), 62 deletions(-)

diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 4df7dd4dec..912892e28b 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -2168,21 +2168,19 @@ sub pgbench
 
 =pod
 
-=item $node->connect_ok($connstr, $test_name, %params)
+=item $node->check_connect_log_contents($offset, $test_name, %parameters)
 
-Attempt a connection with a custom connection string.  This is expected
-to succeed.
+Check connection log contents.
 
 =over
 
-=item sql => B
+=item $test_name
 
-If this parameter is set, this query is used for the connection attempt
-instead of the default.
+Name of test for error messages.
 
-=item expected_stdout => B
+=item $offset
 
-If this regular expression is set, matches it with the output generated.
+Offset of the log file.
 
 =item log_like => [ qr/required message/ ]
 
@@ -2200,6 +2198,58 @@ passed to C.
 
 =cut
 
+sub check_connect_log_contents
+{
+	my ($self, $test_name, $offset, %params) = @_;
+
+	my (@log_like, @log_unlike);
+	if (defined($params{log_like}))
+	{
+		@log_like = @{ $params{log_like} };
+	}
+	if (defined($params{log_unlike}))
+	{
+		@log_unlike = @{ $params{log_unlike} };
+	}
+
+	if (@log_like or @log_unlike)
+	{
+		my $log_contents =
+		  PostgreSQL::Test::Utils::slurp_file($self->logfile, $offset);
+
+		while (my $regex = shift @log_like)
+		{
+			like($log_contents, $regex, "$test_name: log matches");
+		}
+		while (my $regex = shift @log_unlike)
+		{
+			unlike($log_contents, $regex, "$test_name: log does not match");
+		}
+	}
+}
+
+=pod
+
+=item $node->connect_ok($connstr, $test_name, %params)
+
+Attempt a connection with a custom connection string.  This is expected
+to succeed.
+
+=over
+
+=item sql => B
+
+If this parameter is set, this query is used for the connection attempt
+instead of the default.
+
+=item expected_stdout => B
+
+If this regular expression is set, matches it with the output generated.
+
+=back
+
+=cut
+
 sub connect_ok
 {
 	local $Test::Builder::Level = $Test::Builder::Level + 1;
@@ -2215,16 +2265,6 @@ sub connect_ok
 		$sql = "SELECT \$\$connected with $connstr\$\$";
 	}
 
-	my (@log_like, @log_unlike);
-	if (defined($params{log_like}))
-	{
-		@log_like = @{ $params{log_like} };
-	}
-	if (defined($params{log_unlike}))
-	{
-		@log_unlike = @{ $params{log_unlike} };
-	}
-
 	my $log_location = -s 

[16Beta1][doc] Add BackendType for standalone backends

2023-05-28 Thread Shinoda, Noriyoshi (PN Japan FSIP)
Hi, hackers.
In PostgreSQL 16 Beta 1, standalone backend was added to the backend type by 
this patch [1]. I think this patch will change the value of backend_type column 
in the pg_stat_activity view, but it's not explained in the documentation. The 
attached patch fixes monitoring.sgml.

[1] Add BackendType for standalone backends
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=0c679464a837079acc75ff1d45eaa83f79e05690

Regards, 
Noriyoshi Shinoda


monitoring_sgml_v1.diff
Description: monitoring_sgml_v1.diff


Re: Cleaning up nbtree after logical decoding on standby work

2023-05-28 Thread Michael Paquier
On Fri, May 26, 2023 at 04:48:37PM -0700, Peter Geoghegan wrote:
> I'd have thought the subject line "Cleaning up nbtree after logical
> decoding on standby work" made it quite clear that this patch was
> targeting 16.

Hmm, okay.  I was understanding that as something for v17, honestly.

> It's not refactoring work -- not really. The whole idea of outright
> removing the use of P_NEW in nbtree was where I landed with this after
> a couple of hours of work. In fact I almost posted a version without
> that, though that was worse in every way to my final approach.
> 
> I first voiced concerns about this whole area way back on April 4,
> which is only 3 days after commit 61b313e4 went in:
> 
> https://postgr.es/m/CAH2-Wz=jgryxwm74g1khst0znpunhezyjnvsjno2t3jswtb...@mail.gmail.com

Sure.  My take is that if this patch were to be sent at the beginning
of April, it could have been considered in v16.  However, deciding
such a matter at the end of May after beta1 has been released is a
different call.  You may want to make sure that the RMT is OK with
that, at the end.
--
Michael


signature.asc
Description: PGP signature


Re: benchmark results comparing versions 15.2 and 16

2023-05-28 Thread David Rowley
On Tue, 23 May 2023 at 07:40, MARK CALLAGHAN  wrote:

(pg15)

> --- Q2.10k : explain analyze SELECT c FROM sbtest1 WHERE id BETWEEN 1000 
> AND 1001 order by c;
>   QUERY PLAN
> ---
>  Sort  (cost=1114.85..1137.28 rows=8971 width=121) (actual 
> time=36.561..37.429 rows=10001 loops=1)
>Sort Key: c
>Sort Method: quicksort  Memory: 2025kB
>->  Index Scan using sbtest1_pkey on sbtest1  (cost=0.44..525.86 rows=8971 
> width=121) (actual time=0.022..3.776 rows=10001 loops=1)
>  Index Cond: ((id >= 1000) AND (id <= 1001))
>  Planning Time: 0.059 ms
>  Execution Time: 38.224 ms

(pg16 b1)

>QUERY PLAN
> 
>  Sort  (cost=1259.19..1284.36 rows=10068 width=121) (actual 
> time=14.419..15.042 rows=10001 loops=1)
>Sort Key: c
>Sort Method: quicksort  Memory: 1713kB
>->  Index Scan using sbtest1_pkey on sbtest1  (cost=0.44..589.80 
> rows=10068 width=121) (actual time=0.023..3.473 rows=10001 loops=1)
>  Index Cond: ((id >= 1000) AND (id <= 1001))
>  Planning Time: 0.049 ms
>  Execution Time: 15.700 ms

It looks like the improvements here are due to qsort being faster on
v16.  To get an idea of the time taken to perform the actual qsort,
you can't use the first row time vs last row time in the Sort node, as
we must (obviously) have performed the sort before outputting the
first row.  I think you could get a decent idea of the time taken to
perform the qsort by subtracting the actual time for the final Index
scan row from the actual time for the Sort's first row.  That's 36.561
- 3.776 = 32.785 ms for pg15's plan, and 14.419 - 3.473 = 10.946 ms
pg16 b1's

c6e0fe1f2 might have helped improve some of that performance, but I
suspect there must be something else as ~3x seems much more than I'd
expect from reducing the memory overheads.  Testing versions before
and after that commit might give a better indication.

David




Re: Docs: Encourage strong server verification with SCRAM

2023-05-28 Thread Jonathan S. Katz

On 5/26/23 6:47 PM, Jacob Champion wrote:

On Thu, May 25, 2023 at 6:10 PM Jonathan S. Katz  wrote:



+   To prevent server spoofing from occurring when using
+   scram-sha-256 password authentication
+   over a network, you should ensure you are connecting using SSL.


seems to backtrack on the recommendation -- you have to use
sslmode=verify-full, not just SSL, to avoid handing a weak(er) hash to
an untrusted party.


The above assumes that the reader reviewed the previous paragraph and 
followed the guidelines there. However, we can make it explicit. Please 
see attached.


Thanks,

Jonathan
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
index dbe23db54f..9a9fa7b206 100644
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -2014,6 +2014,19 @@ pg_dumpall -p 5432 | psql -d postgres -p 5433
CA.
   
 
+  
+   To prevent server spoofing from occurring when using
+   scram-sha-256 password authentication
+   over a network, you should ensure that you connect to the server using SSL
+   and with one of the anti-spoofing methods described in the previous
+   paragraph. Additionally, the SCRAM implementation in
+   libpq cannot protect the entire authentication
+   exchange, but using the channel_binding=require 
connection
+   parameter provides a mitigation against server spoofing. An attacker that
+   uses a rogue server to intercept a SCRAM exchange can use offline analysis 
to
+   determine the hashed password from the client.
+  
+
   
 To prevent spoofing with GSSAPI, the server must be configured to accept
 only hostgssenc connections


OpenPGP_signature
Description: OpenPGP digital signature


Re: abi-compliance-checker

2023-05-28 Thread Peter Geoghegan
On Sun, May 28, 2023 at 8:37 AM Tom Lane  wrote:
> I gather it'd catch things like NodeTag enum assignments changing,
> which is something we really need to have a check for.

Right. Any ABI break that involves machine-generated translation units
seems particularly prone to being overlooked. Just eyeballing code
(and perhaps double-checking struct layout using pahole) seems
inadequate.

I'll try to come up with a standard abi-compliance-checker Postgres
workflow once I'm back from pgCon. It already looks like
abi-compliance-checker is capable of taking most of the guesswork out
of ABI compatibility in stable releases, without any real downside,
which is encouraging. I have spent very little time on this, so it's
quite possible that some detail or other was overlooked.

-- 
Peter Geoghegan




Re: abi-compliance-checker

2023-05-28 Thread Tom Lane
I wrote:
> (Which reminds me that I forgot to turn on the ad-hoc check for
> that in gen_node_support.pl.  I'll go do that, but it'd be better
> to have a more general-purpose solution.)

Oh, scratch that, it's not supposed to happen until we make the
v16 branch.  It'd still be better to not need it.

regards, tom lane




Re: abi-compliance-checker

2023-05-28 Thread Tom Lane
Peter Geoghegan  writes:
> I tried comparing REL_11_0 to REL_11_20. Attached is the report for that.

Nice!

> I don't have time to study this in detail today, but the report seems
> to have a plausible variety of issues. I noticed that it warns about
> the breaking signature change to _bt_pagedel(). This is the
> theoretical ABI break that I mentioned in the commit message of commit
> b0229f26. This is arguably a false positive, since the tool doesn't
> understand my reasoning about why it's okay in this particular
> instance (namely "any extension that called that function was already
> severely broken"). Obviously the tool couldn't possibly be expected to
> know better in these kinds of situations, though, so whether or not it
> counts as a false positive is just semantics.

Agreed.  The point of such a tool is to make sure that we notice
any ABI breaks; it can't be expected to make engineering judgments
about whether the alternatives are worse.  For instance, I see that
it noticed commit 1f28ec6be (Rename rbtree.c functions to use "rbt"
prefix not "rb" prefix), which is not something we would have done
of our own choosing, but on balance it seemed the best solution.

I gather it'd catch things like NodeTag enum assignments changing,
which is something we really need to have a check for.

(Which reminds me that I forgot to turn on the ad-hoc check for
that in gen_node_support.pl.  I'll go do that, but it'd be better
to have a more general-purpose solution.)

regards, tom lane




Re: abi-compliance-checker

2023-05-28 Thread Tom Lane
Peter Geoghegan  writes:
> Attached is a html report that was generated by a tool called
> abi-compliance-checker/abi-dumper [1][2] (by using
> "abi-compliance-checker -l libTest ... ") . I deliberately introduced
> 2 ABI compatibility issues affecting postgres, just to see what the
> tool had to say about it.

This seems pretty cool.  I agree that we're in dire need of some
tool of this sort for checking back-branch patches.  I wonder
though if it'll have false-positive problems.  Have you tried it
on live rather than mocked-up cases, for instance 13.0 vs 13.11?

regards, tom lane




Re: Large files for relations

2023-05-28 Thread Thomas Munro
On Sun, May 28, 2023 at 2:48 AM Thomas Munro  wrote:
> (you'd need over 2 billion
> directories ...

directory *entries* (segment files), I meant to write there.




Re: Large files for relations

2023-05-28 Thread Thomas Munro
On Thu, May 25, 2023 at 1:08 PM Stephen Frost  wrote:
> * Peter Eisentraut (peter.eisentr...@enterprisedb.com) wrote:
> > On 24.05.23 02:34, Thomas Munro wrote:
> > > * pg_upgrade would convert if source and target don't match
> >
> > This would be good, but it could also be an optional or later feature.
>
> Agreed.

OK.  I do have a patch for that, but I'll put that (+ copy_file_range)
aside for now so we can talk about the basic feature.  Without that,
pg_upgrade just rejects mismatching clusters as it always did, no
change required.

> > > I would probably also leave out those Windows file API changes, too.
> > > --rel-segsize would simply refuse larger sizes until someone does the
> > > work on that platform, to keep the initial proposal small.
> >
> > Those changes from off_t to pgoff_t?  Yes, it would be good to do without
> > those.  Apart of the practical problems that have been brought up, this was
> > a major annoyance with the proposed patch set IMO.

+1, it was not nice.

Alright, since I had some time to kill in an airport, here is a
starter patch for initdb --rel-segsize.  Some random thoughts:

Another potential option name would be --segsize, if we think we're
going to use this for temp files too eventually.

Maybe it's not so beautiful to have that global variable
rel_segment_size (which replaces REL_SEGSIZE everywhere).  Another
idea would be to make it static in md.c and call smgrsetsegmentsize(),
or something like that.  That could be a nice place to compute the
"shift" value up front, instead of computing it each time in
blockno_to_segno(), but that's probably not worth bothering with (?).
BSR/LZCNT/CLZ instructions are pretty fast on modern chips.  That's
about the only place where someone could say that this change makes
things worse for people not interested in the new feature, so I was
careful to get rid of / and % operations with no-longer-constant RHS.

I had to promote segment size to int64 (global variable, field in
control file), because otherwise it couldn't represent
--rel-segsize=32TB (it'd be too big by one).  Other ideas would be to
store the shift value instead of the size, or store the max block
number, eg subtract one, or use InvalidBlockNumber to mean "no limit"
(with more branches to test for it).  The only problem I ran into with
the larger type was that 'SHOW segment_size' now needs a custom show
function because we don't have int64 GUCs.

A C type confusion problem that I noticed: some code uses BlockNumber
and some code uses int for segment numbers.  It's not really a
reachable problem for practical reasons (you'd need over 2 billion
directories and VFDs to reach it), but it's wrong to use int if
segment size can be set as low as BLCKSZ (one file per block); you
could have more segments than an int can represent.  We could go for
uint32, BlockNumber or create SegmentNumber (which I think I've
proposed before, and lost track of...).  We can address that
separately (perhaps by finding my old patch...)
From c6809aafd147d0ac286ab73c2d8fbe571c698550 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Fri, 26 May 2023 01:41:11 +1200
Subject: [PATCH 1/2] Allow relation segment size to be set by initdb.

Previously, relation segment size was a rarely modified compile time
option.  Make it an initdb option, so that users with very large tables
can avoid using so many files and file descriptors.

The initdb option --rel-segsize is modeled on the existing --wal-segsize
option.

The data type used to store the size is int64, not BlockNumber, because
it seems reasonable to want to be able to say --rel-segsize=32TB (=
don't use segments at all), but that would overflow uint32.

The default behavior is unchanged: 1GB segments.  On Windows, we can't
go above 2GB for now due (we'd have to make a lot of changes due to
Windows' small off_t).

Discussion: https://postgr.es/m/CA%2BhUKG%2BBGXwMbrvzXAjL8VMGf25y_ga_XnO741g10y0%3Dm6dDiA%40mail.gmail.com

diff --git a/configure b/configure
index 1b415142d1..a3dee3ea74 100755
--- a/configure
+++ b/configure
@@ -841,8 +841,6 @@ enable_coverage
 enable_dtrace
 enable_tap_tests
 with_blocksize
-with_segsize
-with_segsize_blocks
 with_wal_blocksize
 with_CC
 with_llvm
@@ -1551,9 +1549,6 @@ Optional Packages:
   --with-pgport=PORTNUM   set default port number [5432]
   --with-blocksize=BLOCKSIZE
   set table block size in kB [8]
-  --with-segsize=SEGSIZE  set table segment size in GB [1]
-  --with-segsize-blocks=SEGSIZE_BLOCKS
-  set table segment size in blocks [0]
   --with-wal-blocksize=BLOCKSIZE
   set WAL block size in kB [8]
   --with-CC=CMD   set compiler (deprecated)
@@ -3731,85 +3726,6 @@ cat >>confdefs.h <<_ACEOF
 _ACEOF
 
 
-#
-# Relation segment size
-#
-
-
-
-# Check whether --with-segsize was given.
-if test "${with_segsize+set}" = set; then :
-  withval=$with_segsize;
-  case $withval in
-yes)
-  as_fn_error $? "argument required for