Re: Remove last traces of HPPA support

2023-10-20 Thread Andres Freund
Hi,

On 2023-10-20 22:06:55 -0400, Tom Lane wrote:
> Noah Misch  writes:
> > If the next thing is a patch removing half of the fallback atomics, that is
> > a solid reason to remove hppa.
> 
> Agreed, though I don't think we have a clear proposal as to what
> else to remove.
> 
> > The code removed in the last proposed patch was
> > not that and was code that never changes, hence my reaction.
> 
> Mmm ... I'd agree that the relevant stanzas of s_lock.h/.c haven't
> changed in a long time, but port/atomics/ is of considerably newer
> vintage and is still receiving a fair amount of churn.  Moreover,
> much of what I proposed to remove from there is HPPA-only code with
> exactly no parallel in other arches (specifically, the bits in
> atomics/fallback.h).  So I don't feel comfortable that it will
> continue to work without benefit of testing.  We're taking a risk
> just hoping that it will continue to work in the back branches until
> they hit EOL.  Expecting that it'll continue to work going forward,
> sans testing, seems like the height of folly.

It'd be one thing to continue supporting an almost-guaranteed-to-be-unused
platform if we expected it to become more popular or complete enough to be
usable, as was the case for e.g. risc-v a few years ago. But I doubt we'll
find anybody out there who believes there's a potential future upward trend
for HPPA.

IMO a single person looking at HPPA code for a few minutes is a cost that more
than outweighs the potential benefits of continuing "supporting" this dead
arch. Even code that doesn't need to change has costs, particularly if it's
intermingled with actually important code (which spinlocks certainly are).

Greetings,

Andres Freund




Re: Add support for AT LOCAL

2023-10-20 Thread Tom Lane
Andres Freund  writes:
> On 2023-10-19 10:38:14 -0400, Robert Haas wrote:
>> To be honest, I'm not entirely sure that even AIX gcc support is
>> delivering enough value per unit work to justify keeping it around.
>> But the xlc situation is worse.

> Agreed with both. If it were just a platform that didn't need special casing
> in a bunch of places, it'd be one thing, but its linkage model is so odd that
> it makes no sense to keep AIX support around. But I'll take what I can get...

The other thread recently referred to:

https://www.postgresql.org/message-id/flat/20220702183354.a6uhja35wta7agew%40alap3.anarazel.de

was mostly about how AIX's choice that alignof(double) < alignof(int64)
breaks a whole bunch of assumptions in our code.  AFAICS we've done
nothing to resolve that, and nobody really wants to deal with it,
and there's no good reason to think that fixing it would improve
portability to any other platform.  So maybe there's an adequate
case for just nuking AIX support altogether?  I can't recall the
last time I saw a report from an actual AIX end user.

regards, tom lane




Re: Add support for AT LOCAL

2023-10-20 Thread Andres Freund
Hi,

On 2023-10-19 10:38:14 -0400, Robert Haas wrote:
> On Wed, Oct 18, 2023 at 7:33 PM Noah Misch  wrote:
> > I feel the gravity and longevity of xlc bugs has been out of proportion with
> > the compiler's contribution to PostgreSQL.  I would find it reasonable to
> > revoke xlc support in v17+, leaving AIX gcc support in place.
> 
> +1 for this proposal. I just think this is getting silly. We're saying
> that we only have access to 1 or 2 AIX machines, and most of us have
> access to none, and the compiler has serious code generation bugs that
> are present both in a release 11 years old and in the current release,
> meaning they went unfixed for 10 years, and we can't report
> bugs or get them fixed when we find them, and the use of this
> particular compiler in the buildfarm isn't finding any issues that
> matter anywhere else.

+1.


> To be honest, I'm not entirely sure that even AIX gcc support is
> delivering enough value per unit work to justify keeping it around.
> But the xlc situation is worse.

Agreed with both. If it were just a platform that didn't need special casing
in a bunch of places, it'd be one thing, but its linkage model is so odd that
it makes no sense to keep AIX support around. But I'll take what I can get...

Greetings,

Andres Freund




Re: post-recovery amcheck expectations

2023-10-20 Thread Noah Misch
On Mon, Oct 09, 2023 at 04:46:26PM -0700, Peter Geoghegan wrote:
> On Wed, Oct 4, 2023 at 7:52 PM Noah Misch  wrote:

> Might make sense to test the fix for this issue using a similar
> approach: by adding custom code that randomly throws errors at a point
> that stresses the implementation. I'm referring to the point at which
> VACUUM is between the first and second phase of page deletion: right
> before (or directly after) _bt_unlink_halfdead_page() is called. That
> isn't fundamentally different to the approach from your TAP test, but
> seems like it might add some interesting variation.

My initial manual test was like that, actually.

> > I lean toward fixing this by
> > having amcheck scan left; if left links reach only half-dead or deleted
> > pages, that's as good as the present child block being P_LEFTMOST.
> 
> Also my preference.

Done mostly that way, except I didn't accept deleted pages.  Making this work
on !readonly would take more than that, and readonly shouldn't need that.

> > There's a
> > different error from bt_index_check(), and I've not yet studied how to fix
> > that:
> >
> >   ERROR:  left link/right link pair in index "not_leftmost_pk" not in agreement
> >   DETAIL:  Block=0 left block=0 left link from block=4294967295.
> 
> This looks like this might be a straightforward case of confusing
> P_NONE for InvalidBlockNumber (or vice-versa). They're both used to
> indicate "no such block" here.

Roughly so.  It turned out this scenario was passing leftcurrent=P_NONE to
bt_recheck_sibling_links(), causing that function to use BTPageGetOpaque() on
the metapage.

> > For some other amcheck expectations, the comments suggest reliance on the
> > bt_index_parent_check() ShareLock.  I haven't tried to make test cases for
> > them, but perhaps recovery can trick them the same way.  Examples:
> >
> >   errmsg("downlink or sibling link points to deleted block in index \"%s\"",
> >   errmsg("block %u is not leftmost in index \"%s\"",
> >   errmsg("block %u is not true root in index \"%s\"",
> 
> These are all much older. They're certainly all from before the
> relevant checks were first added (by commit d114cc53), and seem much
> less likely to be buggy.

After I fixed the original error, the "block %u is not leftmost" surfaced
next.  The attached patch fixes that, too.  I didn't investigate the others.
The original test was flaky in response to WAL flush timing, but this one
survives thousands of runs.
Author: Noah Misch 
Commit: Noah Misch 

amcheck: Distinguish interrupted page deletion from corruption.

This prevents false-positive reports about "the first child of leftmost
target page is not leftmost of its level", "block %u is not leftmost"
and "left link/right link pair".  They appeared if amcheck ran before
VACUUM cleaned things, after a cluster exited recovery between the
first-stage and second-stage WAL records of a deletion.  Back-patch to
v11 (all supported versions).

Reviewed by FIXME.

Discussion: https://postgr.es/m/20231005025232.c7.nmi...@google.com

diff --git a/contrib/amcheck/Makefile b/contrib/amcheck/Makefile
index b82f221..f830bd3 100644
--- a/contrib/amcheck/Makefile
+++ b/contrib/amcheck/Makefile
@@ -12,6 +12,7 @@ PGFILEDESC = "amcheck - function for verifying relation integrity"
 
 REGRESS = check check_btree check_heap
 
+EXTRA_INSTALL = contrib/pg_walinspect
 TAP_TESTS = 1
 
 ifdef USE_PGXS
diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index 5b55cf3..51d8107 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -42,6 +42,7 @@ tests += {
   't/001_verify_heapam.pl',
   't/002_cic.pl',
   't/003_cic_2pc.pl',
+  't/004_pitr.pl',
 ],
   },
 }
diff --git a/contrib/amcheck/t/004_pitr.pl b/contrib/amcheck/t/004_pitr.pl
new file mode 100644
index 000..6bcc159
--- /dev/null
+++ b/contrib/amcheck/t/004_pitr.pl
@@ -0,0 +1,82 @@
+# Copyright (c) 2021-2023, PostgreSQL Global Development Group
+
+# Test integrity of intermediate states by PITR to those states
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# origin node: generate WAL records of interest.
+my $origin = PostgreSQL::Test::Cluster->new('origin');
+$origin->init(has_archiving => 1, allows_streaming => 1);
+$origin->append_conf('postgresql.conf', 'autovacuum = off');
+$origin->start;
+$origin->backup('my_backup');
+# Create a table with each of 6 PK values spanning 1/4 of a block.  Delete the
+# first four, so one index leaf is eligible for deletion.  Make a replication
+# slot just so pg_walinspect will always have access to later WAL.
+my $setup = safe_psql('postgres', "SELECT pg_current_wal_lsn()");
+# VACUUM to delete the aforementioned leaf page.  Force an XLogFlush() by
+# dropping a permanent table.  That way, the XLogReader infrastructure can

Remove extraneous break condition in logical slot advance function

2023-10-20 Thread Bharath Rupireddy
Hi,

There exists an extraneous break condition in
pg_logical_replication_slot_advance(). When the end of WAL or moveto
LSN is reached, the main while condition helps to exit the loop, so no
separate break condition is needed. Attached patch removes it.

Thoughts?

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v1-0001-Remove-extraneous-break-condition-in-logical-slot.patch
Description: Binary data


Re: Fix output of zero privileges in psql

2023-10-20 Thread Erik Wienhold
On 2023-10-20 22:35 +0200, David G. Johnston wrote:
> Ok, I found my mis-understanding and better understand where you are all
> coming from now; I apparently had the usage of NULL flip-flopped.
> 
> Taking pg_tablespace as an example.  Its "spcacl" column produces NULL for
> default privileges and '{}'::text[] for empty privileges.
> 
> Thus, today:
> empty: array_to_string('{}'::text[], '\n') produces an empty string
> default: array_to_string(null, '\n') produces null which then was printed
> as a hard-coded empty string via forcibly changing \pset null
> 
> I was thinking the two cases were reversed.
> 
> My proposal for changing printACLColumn is thus:
> 
> case when spcacl is null then ''
>  when cardinality(spcacl) = 0 then '(none)'
>  else array_to_string(spcacl, E'\\n')
> end as "Access privileges"
> 
> In short, I don't want default privileges to start to obey \pset null when
> it never has before and is documented as displaying the empty string.  I do
> want the empty string produced by empty privileges to change to (none) so
> that it no longer is indistinguishable from our choice of presentation for
> the default privilege case.
> 
> Mechanically, we remove the existing \pset null for these metacommands and
> move it into the query via the added CASE expression in the printACLColumn
> function.
> 
> This gets rid of NULL as an output value for this column and makes the
> patch regarding \pset null discussion unnecessary.  And it leaves the
> existing well-established empty string for default privileges alone (and
> changing this is what I believe Tom is against and I agree on that point).

I haven't thought of this yet.  The attached v3 of my initial patch
does that.  It also includes Laurenz' fix to no longer ignore \pset null
(minus the doc changes that suggest using \pset null to distinguish
between default and empty privileges because that's no longer needed).

-- 
Erik
>From 5e3f1840a5a6f49ccfa7f61c040b24638daad421 Mon Sep 17 00:00:00 2001
From: Erik Wienhold 
Date: Sun, 17 Sep 2023 20:54:48 +0200
Subject: [PATCH v3] Fix output of empty privileges in psql

Print "(none)" for empty privileges instead of nothing so that the user
is able to distinguish them from default privileges.  \pset null was
ignored to always print default privileges as empty strings.  This
kludge is now removed by explicitly returning the empty string for
default privileges with the nice side effect that all describe commands
now honor \pset null.

Meta commands affected by empty privileges:

\db+ \dD+ \des+ \dew+ \df+ \dL+ \dl+ \dn+ \dp \dT+ \l

Default privileges start as NULL::aclitem[] in various catalog columns
but revoking the default privileges leaves an empty aclitem[].  Using
\pset null '(default)' as a workaround for spotting default privileges
did not work because the meta commands ignored this setting.

The privileges shown by \dconfig+ and \ddp as well as the column
privileges shown by \dp are not affected by this change because the
respective aclitem[] is reset to NULL or deleted from the catalog
instead of leaving empty arrays.

Commands \des+ and \dew+ are not covered in src/test/regress because no
foreign data wrapper is available at this point to create a foreign
server.

Handling of empty privileges by Erik Wienhold.  Fixing \pset null by
Laurenz Albe.
---
 doc/src/sgml/ddl.sgml  |  4 +-
 src/bin/psql/describe.c| 51 +++-
 src/test/regress/expected/psql.out | 94 ++
 src/test/regress/sql/psql.sql  | 45 ++
 4 files changed, 149 insertions(+), 45 deletions(-)

diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 075ff32991..1c17c4a967 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -2353,7 +2353,9 @@ GRANT SELECT (col1), UPDATE (col1) ON mytable TO miriam_rw;
   
If the Access privileges column is empty for a given
object, it means the object has default privileges (that is, its
-   privileges entry in the relevant system catalog is null).  Default
+   privileges entry in the relevant system catalog is null).  The column shows
+   (none) for empty privileges (that is, no privileges at
+   all, even for the object owner &mdash; a rare occurrence).  Default
privileges always include all privileges for the owner, and can include
some privileges for PUBLIC depending on the object
type, as explained above.  The first GRANT
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index bac94a338c..0d7f86d80f 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -124,7 +124,6 @@ describeAggregates(const char *pattern, bool verbose, bool showSystem)
if (!res)
return false;
 
-   myopt.nullPrint = NULL;
myopt.title = _("List of aggregate functions");
myopt.translate_header = true;
 
@@ -197,7 +196,6 @@ describeAccessMethods(const char *pattern, bool verbose)
if (!res)
return 

Re: Remove last traces of HPPA support

2023-10-20 Thread Tom Lane
Noah Misch  writes:
> If the next thing is a patch removing half of the fallback atomics, that is a
> solid reason to remove hppa.

Agreed, though I don't think we have a clear proposal as to what
else to remove.

> The code removed in the last proposed patch was
> not that and was code that never changes, hence my reaction.

Mmm ... I'd agree that the relevant stanzas of s_lock.h/.c haven't
changed in a long time, but port/atomics/ is of considerably newer
vintage and is still receiving a fair amount of churn.  Moreover,
much of what I proposed to remove from there is HPPA-only code with
exactly no parallel in other arches (specifically, the bits in
atomics/fallback.h).  So I don't feel comfortable that it will
continue to work without benefit of testing.  We're taking a risk
just hoping that it will continue to work in the back branches until
they hit EOL.  Expecting that it'll continue to work going forward,
sans testing, seems like the height of folly.

regards, tom lane




Re: Fix output of zero privileges in psql

2023-10-20 Thread Erik Wienhold
On 2023-10-20 08:42 +0200, Laurenz Albe wrote:
> If you want to proceed with your patch, you could send a new version.

v2 attached.

> I think it could do with an added line of documentation to the
> "Privileges" chapter, and I'd say that the regression tests should be
> in "psql.sql" (not that it is very important).

I added some docs.  There will be merge conflicts when combining with
your v5!  Also moved the regression tests to psql.sql which is the right
place for testing meta commands.

-- 
Erik
>From 170ca82fa7b74cc9102582cdf48042131d5ae016 Mon Sep 17 00:00:00 2001
From: Erik Wienhold 
Date: Sun, 17 Sep 2023 20:54:48 +0200
Subject: [PATCH v2] Fix output of empty privileges in psql

Print "(none)" for empty privileges instead of nothing so that the user
is able to distinguish empty from default privileges.  This affects the
following meta commands:

\db+ \dD+ \des+ \dew+ \df+ \dL+ \dl+ \dn+ \dp \dT+ \l

Default privileges start as NULL::aclitem[] in various catalog columns
but revoking the default privileges leaves an empty aclitem array.
Using \pset null '(null)' as a workaround to spot default privileges
does not work because the meta commands ignore this setting.

The privileges shown by \dconfig+ and \ddp as well as the column
privileges shown by \dp are not affected by this change because the
respective aclitem[] is reset to NULL or deleted from the catalog
instead of leaving empty arrays.

Commands \des+ and \dew+ are not covered in src/test/regress because no
foreign data wrapper is available at this point to create a foreign
server.
---
 doc/src/sgml/ddl.sgml  |  4 +-
 src/bin/psql/describe.c|  6 +-
 src/test/regress/expected/psql.out | 94 ++
 src/test/regress/sql/psql.sql  | 45 ++
 4 files changed, 147 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 075ff32991..1c17c4a967 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -2353,7 +2353,9 @@ GRANT SELECT (col1), UPDATE (col1) ON mytable TO miriam_rw;
   
If the Access privileges column is empty for a given
object, it means the object has default privileges (that is, its
-   privileges entry in the relevant system catalog is null).  Default
+   privileges entry in the relevant system catalog is null).  The column shows
+   (none) for empty privileges (that is, no privileges at
+   all, even for the object owner &mdash; a rare occurrence).  Default
privileges always include all privileges for the owner, and can include
some privileges for PUBLIC depending on the object
type, as explained above.  The first GRANT
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index bac94a338c..154d244d97 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -6718,7 +6718,11 @@ static void
 printACLColumn(PQExpBuffer buf, const char *colname)
 {
 	appendPQExpBuffer(buf,
-					  "pg_catalog.array_to_string(%s, E'\\n') AS \"%s\"",
+					  "CASE pg_catalog.cardinality(%s)\n"
+					  "  WHEN 0 THEN '%s'\n"
+					  "  ELSE pg_catalog.array_to_string(%s, E'\\n')\n"
+					  "END AS \"%s\"",
+					  colname, gettext_noop("(none)"),
 					  colname, gettext_noop("Access privileges"));
 }
 
diff --git a/src/test/regress/expected/psql.out b/src/test/regress/expected/psql.out
index c70205b98a..08f854d0a8 100644
--- a/src/test/regress/expected/psql.out
+++ b/src/test/regress/expected/psql.out
@@ -6663,3 +6663,97 @@ DROP ROLE regress_du_role0;
 DROP ROLE regress_du_role1;
 DROP ROLE regress_du_role2;
 DROP ROLE regress_du_admin;
+-- Test empty privileges.
+BEGIN;
+WARNING:  there is already a transaction in progress
+-- Create an owner for tested objects because output contains owner info.
+-- Must be superuser to be owner of tablespace.
+CREATE ROLE regress_zeropriv_owner SUPERUSER;
+SET LOCAL ROLE regress_zeropriv_owner;
+ALTER TABLESPACE regress_tblspace OWNER TO CURRENT_USER;
+REVOKE ALL ON TABLESPACE regress_tblspace FROM CURRENT_USER;
+\db+ regress_tblspace
+List of tablespaces
+       Name       |         Owner          |    Location     | Access privileges | Options |  Size   | Description 
+------------------+------------------------+-----------------+-------------------+---------+---------+-------------
+ regress_tblspace | regress_zeropriv_owner | pg_tblspc/16385 | (none)            |         | 0 bytes | 
+(1 row)
+
+CREATE DOMAIN regress_zeropriv_domain AS int;
+REVOKE ALL ON DOMAIN regress_zeropriv_domain FROM CURRENT_USER, PUBLIC;
+\dD+ regress_zeropriv_domain
+List of domains
+ Schema |  Name   |  Type   | Collation | 

Re: The documentation for storage type 'plain' actually allows single byte header

2023-10-20 Thread Bruce Momjian
On Fri, Sep 29, 2023 at 06:45:52PM -0400, Tom Lane wrote:
> Laurenz Albe  writes:
> > On Fri, 2023-09-29 at 18:19 -0400, Bruce Momjian wrote:
> >> Where did we end with this?  Is a doc patch the solution?
> 
> > I don't think this went anywhere, and a doc patch is not the solution.
> > Tom has argued convincingly that single-byte headers are an effect of the 
> > TOAST
> > system, and that STORAGE PLAIN should disable all effects of TOAST.
> 
> Well, that was the original idea: you could use STORAGE PLAIN if you
> had C code that wasn't yet toast-aware.  However, given the lack of
> complaints, it seems there's no non-toast-aware code left anywhere.
> And that's not too surprising, because the evolutionary pressure to
> fix such code would be mighty strong, and a lot of time has passed.
> 
> I'm now inclined to think that changing the docs is better than
> changing the code; we'd be more likely to create new problems than
> fix anything useful.
> 
> I wonder though if there's really just one place claiming that
> that's how it works.  A trawl through the code comments might
> be advisable.

[ Discussion moved to hackers, same subject. ]

Here is the original thread from pgsql-docs:


https://www.postgresql.org/message-id/flat/167336599095.2667301.15497893107226841625%40wrigleys.postgresql.org

The report is about single-byte headers being used for varlena values
with PLAIN storage.

Here is the reproducible report:

CREATE EXTENSION pageinspect;
CREATE TABLE test(a VARCHAR(1) STORAGE PLAIN);
INSERT INTO test VALUES (repeat('A',10));

Now peek into the page with pageinspect functions

SELECT left(encode(t_data, 'hex'), 40) FROM
heap_page_items(get_raw_page('test', 0));

	This returned value of "1741414141414141414141".
	Here the first byte 0x17 = 0001 0111 in binary.
	Length + 1 is stored in the length bits (1-7).  So Len = 0001011 in
	binary = 11 [base-10], and 11 - 1 = 10 [base-10], which exactly
	matches the expected length.  Further, the data "41" repeated 10
	times also indicates character A (65 or 0x41 in ASCII) repeated 10
	times.

I researched this and thought it would be a case where we were lacking a
check before creating a single-byte header, but I couldn't find anything
missing.  I think the problem is that the _source_ tupleDesc attstorage
attribute is being used to decide if we should use a short header, while
it is really the storage type of the destination that we should be
checking.  Unfortunately, I don't think the destination is accessible at
the location where we are deciding about a short header.

I am confused about how to proceed.  I feel we need to fully understand why
this is happening before we adjust anything.  Here is a backtrace --- the
short header is being created in fill_val() and the attstorage value
there is 'x'/EXTENDED.

---

#0  fill_val (att=0x56306f61dae8, bit=0x0, bitmask=0x7ffcfcfc1fb4, dataP=0x7ffcfcfc1f90, infomask=0x56306f61e25c, datum=94766026487048, isnull=false) at heaptuple.c:278
#1  0x56306e7800eb in heap_fill_tuple (tupleDesc=0x56306f61dad0, values=0x56306f61dc20, isnull=0x56306f61dc28, data=0x56306f61e260 "", data_size=11, infomask=0x56306f61e25c, bit=0x0) at heaptuple.c:427
#2  0x56306e781708 in heap_form_tuple (tupleDescriptor=0x56306f61dad0, values=0x56306f61dc20, isnull=0x56306f61dc28) at heaptuple.c:1181
#3  0x56306ea13dcb in tts_virtual_copy_heap_tuple (slot=0x56306f61dbd8) at execTuples.c:280
#4  0x56306ea1346e in ExecCopySlotHeapTuple (slot=0x56306f61dbd8) at ../../../src/include/executor/tuptable.h:463
#5  0x56306ea14928 in tts_buffer_heap_copyslot (dstslot=0x56306f61e1a8, srcslot=0x56306f61dbd8) at execTuples.c:798
#6  0x56306ea4342e in ExecCopySlot (dstslot=0x56306f61e1a8, srcslot=0x56306f61dbd8) at ../../../src/include/executor/tuptable.h:487
#7  0x56306ea44785 in ExecGetInsertNewTuple (relinfo=0x56306f61d678, planSlot=0x56306f61dbd8) at nodeModifyTable.c:685
#8  0x56306ea49123 in ExecModifyTable (pstate=0x56306f61d470) at nodeModifyTable.c:3789
#9  0x56306ea0ef3c in ExecProcNodeFirst (node=0x56306f61d470) at execProcnode.c:464
#10 0x56306ea03702 in ExecProcNode (node=0x56306f61d470) at ../../../src/include/executor/executor.h:273
#11 0x56306ea05fe2 in ExecutePlan (estate=0x56306f61d228, planstate=0x56306f61d470, use_parallel_mode=false, operation=CMD_INSERT, sendTuples=false, numberTuples=0, direction=ForwardScanDirection, dest=0x56306f588170, execute_once=true) at execMain.c:1670
#12 0x56306ea03c63 in standard_ExecutorRun (queryDesc=0x56306f527888, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:365
#13 0x56306ea03aee in ExecutorRun (queryDesc=0x56306f527888, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:309
#14 0x56306ec70cf5 in ProcessQuery 

Re: LLVM 16 (opaque pointers)

2023-10-20 Thread Tom Lane
Thomas Munro  writes:
> (It'd be nice if the
> build farm logged "$LLVM_CONFIG --version" somewhere.)

It's not really the buildfarm script's responsibility to do that,
but feel free to make configure do so.

regards, tom lane




Re: Remove last traces of HPPA support

2023-10-20 Thread Noah Misch
On Fri, Oct 20, 2023 at 12:40:00PM -0700, Andres Freund wrote:
> On 2023-10-19 17:23:04 -0700, Noah Misch wrote:
> > On Thu, Oct 19, 2023 at 11:16:28AM -0400, Tom Lane wrote:
> > > We removed support for the HP-UX OS in v16, but left in support
> > > for the PA-RISC architecture, mainly because I thought that its
> > > spinlock mechanism is weird enough to be a good stress test
> > > for our spinlock infrastructure.  It still is that, but my
> > > one remaining HPPA machine has gone to the great recycle heap
> > > in the sky.  There seems little point in keeping around nominal
> > > support for an architecture that we can't test and no one is
> > > using anymore.
> > > 
> > > Hence, the attached removes the remaining support for HPPA.
> > > Any objections?
> > 
> > I wouldn't do this.  NetBSD/hppa still claims to exist, as does the OpenBSD
> > equivalent.  I presume its pkgsrc compiles this code.  The code is basically
> > zero-maintenance, so there's not much to gain from deleting it preemptively.
> 
> In addition to the point Tom has made, I think it's also not correct that hppa
> doesn't impose a burden: hppa is the only one of our architectures that doesn't
> actually support atomic operations, requiring us to have infrastructure to
> backfill atomics using spinlocks. This does preclude some uses of atomics,
> e.g. in signal handlers - I think Thomas wanted to do so for some concurrency
> primitive.

If the next thing is a patch removing half of the fallback atomics, that is a
solid reason to remove hppa.  The code removed in the last proposed patch was
not that and was code that never changes, hence my reaction.




Re: LLVM 16 (opaque pointers)

2023-10-20 Thread Thomas Munro
On Sat, Oct 21, 2023 at 12:02 PM Thomas Munro  wrote:
> On Sat, Oct 21, 2023 at 11:12 AM Andres Freund  wrote:
> > I'm quite sure that jiting did pass on ppc64 at some point.
>
> Yeah, I remember debugging it on EDB's POWER machine.  First off, we
> know that LLVM < 7 doesn't work for us on POWER, because:
>
> https://www.postgresql.org/message-id/CAEepm%3D39F_B3Ou8S3OrUw%2BhJEUP3p%3DwCu0ug-TTW67qKN53g3w%40mail.gmail.com
>
> That was fixed:
>
> https://github.com/llvm/llvm-project/commit/a95b0df5eddbe7fa1e9f8fe0b1ff62427e1c0318
>
> So I think that means that we'd first have to go through those animals
> and figure out which ones have older LLVM, and ignore those results --
> they just can't use --with-llvm.  Unfortunately there doesn't seem to
> be any clue on the version from the paths used by OpenSUSE.  Mark, do
> you know?

Adding Mark to this subthread.  Concretely, could you please disable
--with-llvm on any of those machines running LLVM < 7?  And report
what version any remaining animals are running?  (It'd be nice if the
build farm logged "$LLVM_CONFIG --version" somewhere.)  One of them
seems to have clang 5 which is a clue -- if the LLVM is also 5 it's
just not going to work, as LLVM is one of those forwards-only projects
that doesn't back-patch fixes like that.




Re: [PoC] pg_upgrade: allow to upgrade publisher node

2023-10-20 Thread Bharath Rupireddy
On Fri, Oct 20, 2023 at 8:51 PM Zhijie Hou (Fujitsu)
 wrote:
>
> Attach the new version patch.

Thanks. Here are some comments on v55 patch:

1. A nit:
+
+/*
+ * We also skip decoding in 'fast_forward' mode. In passing set the
+ * 'processing_required' flag to indicate, were it not for this mode,
+ * processing *would* have been required.
+ */
How about "We also skip decoding in fast_forward mode. In passing set
the processing_required flag to indicate that if it were not for
fast_forward mode, processing would have been required."?

2. Don't we need InvalidateSystemCaches() after FreeDecodingContext()?

+/* Clean up */
+FreeDecodingContext(ctx);

3. Don't we need to put CreateDecodingContext in PG_TRY-PG_CATCH with
InvalidateSystemCaches() in PG_CATCH block? I think we need to clear
all timetravel entries with InvalidateSystemCaches(), no?

4. The following assertion better be an error? Or we ensure that
binary_upgrade_slot_has_caught_up isn't called for an invalidated slot
at all?
+
+/* Slots must be valid as otherwise we won't be able to scan the WAL */
+Assert(MyReplicationSlot->data.invalidated == RS_INVAL_NONE);

5. This better be an error instead of returning false? IMO, null value
for slot name is an error.
+/* Quick exit if the input is NULL */
+if (PG_ARGISNULL(0))
+PG_RETURN_BOOL(false);

6. A nit: how about is_decodable_txn or is_decodable_change or some
other instead of just a plain name processing_required?
+/* Do we need to process any change in 'fast_forward' mode? */
+boolprocessing_required;

7. Can the following pg_fatal message be consistent and start with
lowercase letter something like "expected 0 logical replication slots
"?
+pg_fatal("Expected 0 logical replication slots but found %d.",
+ nslots_on_new);

8. s/problem/problematic - "A list of problematic slots is in the file:\n"
+ "A list of the problem slots is in the file:\n"

9. IMO, binary_upgrade_logical_replication_slot_has_caught_up seems
better, meaningful and consistent despite a bit long than just
binary_upgrade_slot_has_caught_up.

10. How about an assert that the passed-in replication slot is logical
in binary_upgrade_slot_has_caught_up?

11. How about adding CheckLogicalDecodingRequirements too in
binary_upgrade_slot_has_caught_up after CheckSlotPermissions just in
case?

12. Not necessary but adding ReplicationSlotValidateName(slot_name,
ERROR); for the passed-in slotname in
binary_upgrade_slot_has_caught_up may be a good idea, at least in
assert builds to help with input validations.

13. Can the functionality of LogicalReplicationSlotHasPendingWal be
moved to binary_upgrade_slot_has_caught_up and get rid of a separate
function LogicalReplicationSlotHasPendingWal? Or is it that the
function exists in logical.c to avoid extra dependencies between
logical.c and pg_upgrade_support.c?

14. I think it's better to check if the old cluster contains the
necessary function binary_upgrade_slot_has_caught_up instead of just
relying on major version.
+/* Logical slots can be migrated since PG17. */
+if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1600)
+return;

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Server crash on RHEL 9/s390x platform against PG16

2023-10-20 Thread Andres Freund
Hi,

On 2023-09-12 15:27:21 +0530, Suraj Kharage wrote:
> [edb@9428da9d2137 postgres]$ cat /etc/redhat-release
> AlmaLinux release 9.2 (Turquoise Kodkod)
> [edb@9428da9d2137 postgres]$ lscpu
> Architecture:      s390x
>   CPU op-mode(s):  32-bit, 64-bit
>   Address sizes:   39 bits

Can you provide the rest of the lscpu output?  There have been issues with Z14
vs Z15:
https://github.com/llvm/llvm-project/issues/53009

You're apparently not hitting that, but given that fact, you either are on a
slightly older CPU, or you have applied a patch to work around it. Because
otherwise your build instructions below would hit that problem, I think.


> physical, 48 bits virtual  Byte Order: Big Endian
> Configure command:
> ./configure --prefix=/home/edb/postgres/ --with-lz4 --with-zstd --with-llvm
> --with-perl --with-python --with-tcl --with-openssl --enable-nls
> --with-libxml --with-libxslt --with-systemd --with-libcurl --without-icu
> --enable-debug --enable-cassert --with-pgport=5414

Hm, based on "--with-libcurl" this isn't upstream postgres, correct? Have you
verified the issue reproduces on upstream postgres?

> 
> *Test case:*
> CREATE TABLE rm32044_t1
> (
> pkey   integer,
> val  text
> );
> CREATE TABLE rm32044_t2
> (
> pkey   integer,
> label  text,
> hidden boolean
> );
> CREATE TABLE rm32044_t3
> (
> pkey integer,
> val integer
> );
> CREATE TABLE rm32044_t4
> (
> pkey integer
> );
> insert into rm32044_t1 values ( 1 , 'row1');
> insert into rm32044_t1 values ( 2 , 'row2');
> insert into rm32044_t2 values ( 1 , 'hidden', true);
> insert into rm32044_t2 values ( 2 , 'visible', false);
> insert into rm32044_t3 values (1 , 1);
> insert into rm32044_t3 values (2 , 1);
> 
> postgres=# SELECT * FROM rm32044_t1 LEFT JOIN rm32044_t2 ON rm32044_t1.pkey
> = rm32044_t2.pkey, rm32044_t3 LEFT JOIN rm32044_t4 ON rm32044_t3.pkey =
> rm32044_t4.pkey order by rm32044_t1.pkey,label,hidden;

> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> The connection to the server was lost. Attempting reset: Failed.

I tried this on both master and 16, without hitting this issue.

If you can reproduce the issue on upstream postgres, can you share more about
your configuration?

Greetings,

Andres Freund




Re: LLVM 16 (opaque pointers)

2023-10-20 Thread Andres Freund
Hi,

On 2023-10-21 12:02:51 +1300, Thomas Munro wrote:
> On Sat, Oct 21, 2023 at 11:12 AM Andres Freund  wrote:
> > > I doubt I can get anywhere near an s390x though, and we definitely had
> > > pre-existing problems on that arch.
> >
> > IMO an LLVM bug, rather than a postgres bug, but I guess it's all a matter of
> > perspective.
> > https://github.com/llvm/llvm-project/issues/53009#issuecomment-1042748553
> 
> Ah, good to know about that.  But there are also reports of crashes in
> released versions that manage to get past that ABI-wobble business:
> 
> https://www.postgresql.org/message-id/flat/CAF1DzPXjpPxnsgySz2Zjm8d2dx7%3DJ070C%2BMQBFh%2B9NBNcBKCAg%40mail.gmail.com

Trying to debug that now, using access to an s390x box provided by Mark...

Greetings,

Andres Freund




Re: LLVM 16 (opaque pointers)

2023-10-20 Thread Thomas Munro
On Sat, Oct 21, 2023 at 11:12 AM Andres Freund  wrote:
> On 2023-10-21 10:48:47 +1300, Thomas Munro wrote:
> > On Thu, Oct 19, 2023 at 6:20 AM Thomas Munro  wrote:
> > I see that Mark has also just enabled --with-llvm on some POWER Linux
> > animals, and they have failed in various ways.  The failures are
> > strangely lacking in detail.  It seems we didn't have coverage before,
> > and I recall that there were definitely versions of LLVM that *didn't*
> > work for our usage in the past, which I'll need to dredge out of the
> > archives.  I will try to get onto a cfarm POWER machine and see if I
> > can reproduce that, before and after these commits, and whose bug is
> > it etc.
>
> I'm quite sure that JITing did pass on ppc64 at some point.

Yeah, I remember debugging it on EDB's POWER machine.  First off, we
know that LLVM < 7 doesn't work for us on POWER, because:

https://www.postgresql.org/message-id/CAEepm%3D39F_B3Ou8S3OrUw%2BhJEUP3p%3DwCu0ug-TTW67qKN53g3w%40mail.gmail.com

That was fixed:

https://github.com/llvm/llvm-project/commit/a95b0df5eddbe7fa1e9f8fe0b1ff62427e1c0318

So I think that means that we'd first have to go through those animals
and figure out which ones have older LLVM, and ignore those results --
they just can't use --with-llvm.  Unfortunately there doesn't seem to
be any clue on the version from the paths used by OpenSUSE.  Mark, do
you know?

> > I doubt I can get anywhere near an s390x though, and we definitely had
> > pre-existing problems on that arch.
>
> IMO an LLVM bug, rather than a postgres bug, but I guess it's all a matter of
> perspective.
> https://github.com/llvm/llvm-project/issues/53009#issuecomment-1042748553

Ah, good to know about that.  But there are also reports of crashes in
released versions that manage to get past that ABI-wobble business:

https://www.postgresql.org/message-id/flat/CAF1DzPXjpPxnsgySz2Zjm8d2dx7%3DJ070C%2BMQBFh%2B9NBNcBKCAg%40mail.gmail.com




Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan

2023-10-20 Thread Peter Geoghegan
On Sun, Oct 15, 2023 at 1:50 PM Peter Geoghegan  wrote:
> Attached is v4, which applies cleanly on top of HEAD. This was needed
> due to Alexander Korotkov's commit e0b1ee17, "Skip checking of scan
> keys required for directional scan in B-tree".
>
> Unfortunately I have more or less dealt with the conflicts on HEAD by
> disabling the optimization from that commit, for the time being.

Attached is v5, which deals with the conflict with the optimization
added by Alexander Korotkov's commit e0b1ee17 sensibly: the
optimization is now only disabled in cases with array scan keys.
(It'd be very hard to make it work with array scan keys, since an
important principle for my patch is that we can change search-type
scan keys right in the middle of any _bt_readpage() call).

v5 also fixes a longstanding open item for the patch: we no longer
call _bt_preprocess_keys() with a buffer lock held, which was a bad
idea at best, and unsafe (due to the syscache lookups within
_bt_preprocess_keys) at worst. A new, minimal version of the function
(called _bt_preprocess_keys_leafbuf) is called at the same point
instead. That change, combined with the array binary search stuff
(which was added back in v2), makes the total amount of work performed
with a buffer lock held totally reasonable in all cases. It's even
okay in extreme or adversarial cases with many millions of array keys.

Making this _bt_preprocess_keys_leafbuf approach work has a downside:
it requires that _bt_preprocess_keys be a little less aggressive about
removing redundant scan keys, in order to meet certain assumptions
held by the new _bt_preprocess_keys_leafbuf function. Essentially,
_bt_preprocess_keys must now worry about current and future array key
values when determining redundancy among scan keys -- not just the
current array key values. _bt_preprocess_keys knows nothing about
SK_SEARCHARRAY scan keys on HEAD, because on HEAD there is a strict
1:1 correspondence between the number of primitive index scans and the
number of array keys (actually, the number of distinct combinations of
array keys). Obviously that's no longer the case with the patch
(that's the whole point of the patch).

It's easiest to understand how elimination of redundant quals needs to
work in v5 by way of an example. Consider the following query:

select count(*), two, four, twenty, hundred
from
  tenk1
where
  two in (0, 1) and four in (1, 2, 3)
  and two < 1;

Notice that "two" appears in the where clause twice. First it appears
as an SAOP, and then as an inequality. Right now, on HEAD, the
primitive index scan where the SAOP's scankey is "two = 0" renders
"two < 1" redundant. However, the subsequent primitive index scan
where "two = 1" does *not* render "two < 1" redundant. This has
implications for the mechanism in the patch, since the patch will
perform one big primitive index scan for all array constants, with
only a single _bt_preprocess_keys call at the start of its one and
only _bt_first call (but with multiple _bt_preprocess_keys_leafbuf
calls once we reach the leaf level).

The compromise that I've settled on in v5 is to teach
_bt_preprocess_keys to *never* treat "two < 1" as redundant with such
a query -- even though there is some squishy sense in which "two < 1"
is indeed still redundant (for the first SAOP key of value 0). My
approach is reasonably well targeted in that it mostly doesn't affect
queries that don't need it. But it will add cycles to some badly
written queries that wouldn't have had them in earlier Postgres
versions. I'm not entirely sure how much this matters, but my current
sense is that it doesn't matter all that much. This is the kind of
thing that is hard to test and poorly tested, so simplicity is even
more of a virtue than usual.

Note that the changes to _bt_preprocess_keys in v5 *don't* affect how
we determine if the scan has contradictory quals, which is generally
more important. With contradictory quals, _bt_first can avoid reading
any data from the index. OTOH eliminating redundant quals (i.e. the
thing that v5 *does* change) merely makes evaluating index quals less
expensive via preprocessing-away unneeded scan keys. In other words,
while it's possible that the approach taken by v5 will add CPU cycles
in a small number of cases, it should never result in more page
accesses.

--
Peter Geoghegan


v5-0001-Enhance-nbtree-ScalarArrayOp-execution.patch
Description: Binary data


Re: LLVM 16 (opaque pointers)

2023-10-20 Thread Andres Freund
Hi,

On 2023-10-21 10:48:47 +1300, Thomas Munro wrote:
> On Thu, Oct 19, 2023 at 6:20 AM Thomas Munro  wrote:
> > Interestingly, a new problem just showed up on the RHEL9 s390x
> > machine "lora", where a previously reported problem [1] apparently
> > re-appeared.  It complains about incompatible layout, previously
> > blamed on mismatch between clang and LLVM versions.  I can see that
> > its clang is v15 from clues in the config log, but I don't know which
> > version of LLVM is being used.  However, I see now that --with-llvm
> > was literally just turned on, so there is no reason to think that this
> > would have worked before or this work is relevant.  Strange though --
> > we must be able to JIT further than that on s390x because we have
> > crash reports in other threads (ie we made it past this and into other
> > more advanced brokenness).
> 
> I see that Mark has also just enabled --with-llvm on some POWER Linux
> animals, and they have failed in various ways.  The failures are
> strangely lacking in detail.  It seems we didn't have coverage before,
> and I recall that there were definitely versions of LLVM that *didn't*
> work for our usage in the past, which I'll need to dredge out of the
> archives.  I will try to get onto a cfarm POWER machine and see if I
> can reproduce that, before and after these commits, and whose bug is
> it etc.

I'm quite sure that JITing did pass on ppc64 at some point.


> I doubt I can get anywhere near an s390x though, and we definitely had
> pre-existing problems on that arch.

IMO an LLVM bug, rather than a postgres bug, but I guess it's all a matter of
perspective.
https://github.com/llvm/llvm-project/issues/53009#issuecomment-1042748553

I had made another bug report about this issue at some point, but I can't
refind it right now.

Greetings,

Andres Freund




Re: Remove last traces of HPPA support

2023-10-20 Thread Andres Freund
Hi,

On 2023-10-20 17:46:59 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2023-10-20 15:59:42 -0400, Tom Lane wrote:
> >> Hmm, are you saying there's more of port/atomics/ that could be
> >> removed?  What exactly?
> 
> > I was thinking we could remove the whole fallback path for atomic operations,
> > but it's a bit less, because we likely don't want to mandate support for 64bit
> > atomics yet.
> 
> Yeah.  That'd be tantamount to desupporting 32-bit arches altogether,
> I think.  I'm not ready to go there yet.

It shouldn't be tantamount to that - many 32bit archs support 64bit atomic
operations. E.g. x86 has supported them since the 586 (in 1993). However, arm
only added them to 32 bit, in an extension, comparatively recently...


> > That'd still allow removing more than half of
> > src/include/port/atomics/fallback.h and src/backend/port/atomics.c - and more
> > if we finally decided to require a spinlock implementation.
> 
> In the wake of 1c72d82c2, it seems likely that requiring some kind of
> spinlock implementation is not such a big lift.  Certainly, a machine
> without that hasn't been a fit target for production in a very long
> time, so maybe we should just drop that semaphore-based emulation.

Yep. And the performance drop due to not having spinlocks is also getting worse
over time, with CPU-bound workloads having become a lot more common due to
larger amounts of memory and much, much faster IO.


> >> Do we really want to assume that all future architectures will have atomic
> >> operations?
> 
> > Yes. Outside of the tiny microcontrollers, which obviously won't run postgres,
> > I cannot see any future architecture not having support for atomic operations.
> 
> I'd like to refine what that means a bit more.  Are we assuming that a
> machine providing any of the gcc atomic intrinsics (of a given width) will
> provide all of them? Or is there a specific subset that we can emulate the
> rest on top of?

Right now we don't require that. As long as we know how to do atomic compare
exchange, we backfill all other atomic operations using compare-exchange -
albeit less efficiently (there are no retries for atomic-add when implemented
directly, but there are retries when using cmpxchg; the difference can be
significant under contention).

Practically speaking I think it's quite unlikely that a compiler + arch
combination will have only some intrinsics of some width - I think all
compilers have infrastructure to fall back to compare-exchange when there's no
dedicated atomic operation for some intrinsic.

Greetings,

Andres Freund




Re: LLVM 16 (opaque pointers)

2023-10-20 Thread Tom Lane
Thomas Munro  writes:
> I doubt I can get anywhere near an s390x though, and we definitely had
> pre-existing problems on that arch.

Yeah.  Too bad there's no s390x in the gcc compile farm.
(I'm wondering how straight a line to draw between that fact
and llvm's evident shortcomings on s390x.)

I'm missing my old access to Red Hat's dev machines.  But in
the meantime, Mark's clearly got beaucoup access to s390
machines, so I wonder if he can let you into any of them?

regards, tom lane




Re: LLVM 16 (opaque pointers)

2023-10-20 Thread Mark Wong
On Sat, Oct 21, 2023 at 10:48:47AM +1300, Thomas Munro wrote:
> On Thu, Oct 19, 2023 at 6:20 AM Thomas Munro  wrote:
> > Interestingly, a new problem just showed up on the RHEL9 s390x
> > machine "lora", where a previously reported problem [1] apparently
> > re-appeared.  It complains about incompatible layout, previously
> > blamed on mismatch between clang and LLVM versions.  I can see that
> > its clang is v15 from clues in the config log, but I don't know which
> > version of LLVM is being used.  However, I see now that --with-llvm
> > was literally just turned on, so there is no reason to think that this
> > would have worked before or this work is relevant.  Strange though --
> > we must be able to JIT further than that on s390x because we have
> > crash reports in other threads (ie we made it past this and into other
> > more advanced brokenness).
> 
> I see that Mark has also just enabled --with-llvm on some POWER Linux
> animals, and they have failed in various ways.  The failures are
> strangely lacking in detail.  It seems we didn't have coverage before,
> and I recall that there were definitely versions of LLVM that *didn't*
> work for our usage in the past, which I'll need to dredge out of the
> archives.  I will try to get onto a cfarm POWER machine and see if I
> can reproduce that, before and after these commits, and whose bug is
> it etc.

Yeah, I'm slowly enabling --with-llvm on POWER, s390x, and aarch64 (but
none here yet as I write this)...

> I doubt I can get anywhere near an s390x though, and we definitely had
> pre-existing problems on that arch.

If you want to send me your ssh key, I have access to these systems
through OSUOSL and LinuxFoundation programs.

Regards,
Mark

--
Mark Wong
EDB https://enterprisedb.com




Re: LLVM 16 (opaque pointers)

2023-10-20 Thread Thomas Munro
On Thu, Oct 19, 2023 at 6:20 AM Thomas Munro  wrote:
> Interestingly, a new problem just showed up on the RHEL9 s390x
> machine "lora", where a previously reported problem [1] apparently
> re-appeared.  It complains about incompatible layout, previously
> blamed on mismatch between clang and LLVM versions.  I can see that
> its clang is v15 from clues in the config log, but I don't know which
> version of LLVM is being used.  However, I see now that --with-llvm
> was literally just turned on, so there is no reason to think that this
> would have worked before or this work is relevant.  Strange though --
> we must be able to JIT further than that on s390x because we have
> crash reports in other threads (ie we made it past this and into other
> more advanced brokenness).

I see that Mark has also just enabled --with-llvm on some POWER Linux
animals, and they have failed in various ways.  The failures are
strangely lacking in detail.  It seems we didn't have coverage before,
and I recall that there were definitely versions of LLVM that *didn't*
work for our usage in the past, which I'll need to dredge out of the
archives.  I will try to get onto a cfarm POWER machine and see if I
can reproduce that, before and after these commits, and whose bug is
it etc.

I doubt I can get anywhere near an s390x though, and we definitely had
pre-existing problems on that arch.




Re: Remove last traces of HPPA support

2023-10-20 Thread Tom Lane
Andres Freund  writes:
> On 2023-10-20 15:59:42 -0400, Tom Lane wrote:
>> Hmm, are you saying there's more of port/atomics/ that could be
>> removed?  What exactly?

> I was thinking we could remove the whole fallback path for atomic operations,
> but it's a bit less, because we likely don't want to mandate support for 64bit
> atomics yet.

Yeah.  That'd be tantamount to desupporting 32-bit arches altogether,
I think.  I'm not ready to go there yet.

> That'd still allow removing more than half of
> src/include/port/atomics/fallback.h and src/backend/port/atomics.c - and more
> if we finally decided to require a spinlock implementation.

In the wake of 1c72d82c2, it seems likely that requiring some kind of
spinlock implementation is not such a big lift.  Certainly, a machine
without that hasn't been a fit target for production in a very long
time, so maybe we should just drop that semaphore-based emulation.

>> Do we really want to assume that all future architectures will have atomic
>> operations?

> Yes. Outside of the tiny microcontrollers, which obviously won't run postgres,
> I cannot see any future architecture not having support for atomic operations.

I'd like to refine what that means a bit more.  Are we assuming that
a machine providing any of the gcc atomic intrinsics (of a given
width) will provide all of them?  Or is there a specific subset that
we can emulate the rest on top of?

regards, tom lane




Re: Remove last traces of HPPA support

2023-10-20 Thread Andres Freund
Hi,

On 2023-10-20 15:59:42 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > In addition to the point Tom has made, I think it's also not correct that hppa
> > doesn't impose a burden: hppa is the only one of our architectures that doesn't
> > actually support atomic operations, requiring us to have infrastructure to
> > backfill atomics using spinlocks. This does preclude some uses of atomics,
> > e.g. in signal handlers - I think Thomas wanted to do so for some concurrency
> > primitive.
> 
> Hmm, are you saying there's more of port/atomics/ that could be
> removed?  What exactly?

I was thinking we could remove the whole fallback path for atomic operations,
but it's a bit less, because we likely don't want to mandate support for 64bit
atomics yet. That'd still allow removing more than half of
src/include/port/atomics/fallback.h and src/backend/port/atomics.c - and more
if we finally decided to require a spinlock implementation.


> Do we really want to assume that all future architectures will have atomic
> operations?

Yes. Outside of the tiny microcontrollers, which obviously won't run postgres,
I cannot see any future architecture not having support for atomic operations.

Greetings,

Andres Freund




Re: Fix output of zero privileges in psql

2023-10-20 Thread David G. Johnston
On Fri, Oct 20, 2023 at 12:57 PM Tom Lane  wrote:

> "David G. Johnston"  writes:
> > On Fri, Oct 20, 2023 at 12:34 PM Tom Lane  wrote:
> >> As near as I can tell, doing both things (the \pset null fix and
> >> substituting "(none)" for empty privileges) would be an acceptable
> >> answer to everyone who has commented.  Let's proceed with both
> >> patches, or combine them into one if there are merge conflicts.
>
> > I'm under the impression that removing the null representation of empty
> > privileges by making them (none) removes all known \d commands that output
> > nulls and don't obey \pset null.
>
> How so?  IIUC the proposal is to substitute "(none)" for empty-string
> ACLs, not null ACLs.  The \pset change should be addressing an
> independent case.
>

Ok, I found my misunderstanding and better understand where you are all
coming from now; I apparently had the usage of NULL flip-flopped.

Taking pg_tablespace as an example.  Its "spcacl" column produces NULL for
default privileges and '{}'::text[] for empty privileges.

Thus, today:
empty: array_to_string('{}'::text[], '\n') produces an empty string
default: array_to_string(null, '\n') produces null which then was printed
as a hard-coded empty string via forcibly changing \pset null

I was thinking the two cases were reversed.

My proposal for changing printACLColumn is thus:

case when spcacl is null then ''
 when cardinality(spcacl) = 0 then '(none)'
 else array_to_string(spcacl, E'\\n')
end as "Access privileges"

In short, I don't want default privileges to start to obey \pset null when
it never has before and is documented as displaying the empty string.  I do
want the empty string produced by empty privileges to change to (none) so
that it no longer is indistinguishable from our choice of presentation for
the default privilege case.

Mechanically, we remove the existing \pset null for these metacommands and
move it into the query via the added CASE expression in the printACLColumn
function.

This gets rid of NULL as an output value for this column and makes the
patch regarding \pset null discussion unnecessary.  And it leaves the
existing well-established empty string for default privileges alone (and
changing this is what I believe Tom is against and I agree on that point).

David J.


Re: Remove last traces of HPPA support

2023-10-20 Thread Tom Lane
Andres Freund  writes:
> In addition to the point Tom has made, I think it's also not correct that hppa
> doesn't impose a burden: hppa is the only one of our architectures that doesn't
> actually support atomic operations, requiring us to have infrastructure to
> backfill atomics using spinlocks. This does preclude some uses of atomics,
> e.g. in signal handlers - I think Thomas wanted to do so for some concurrency
> primitive.

Hmm, are you saying there's more of port/atomics/ that could be
removed?  What exactly?  Do we really want to assume that all
future architectures will have atomic operations?

regards, tom lane




Re: Fix output of zero privileges in psql

2023-10-20 Thread Tom Lane
"David G. Johnston"  writes:
> On Fri, Oct 20, 2023 at 12:34 PM Tom Lane  wrote:
>> As near as I can tell, doing both things (the \pset null fix and
>> substituting "(none)" for empty privileges) would be an acceptable
>> answer to everyone who has commented.  Let's proceed with both
>> patches, or combine them into one if there are merge conflicts.

> I'm under the impression that removing the null representation of empty
> privileges by making them (none) removes all known \d commands that output
> nulls and don't obey \pset null.

How so?  IIUC the proposal is to substitute "(none)" for empty-string
ACLs, not null ACLs.  The \pset change should be addressing an
independent case.

regards, tom lane




Re: Fix output of zero privileges in psql

2023-10-20 Thread David G. Johnston
On Fri, Oct 20, 2023 at 12:34 PM Tom Lane  wrote:

> Laurenz Albe  writes:
> > I am not sure how to proceed. Perhaps it would indeed be better to have
> > two competing commitfest entries.  Both could be "ready for committer",
> > and the committers can decide what they prefer.
>
> As near as I can tell, doing both things (the \pset null fix and
> substituting "(none)" for empty privileges) would be an acceptable
> answer to everyone who has commented.  Let's proceed with both
> patches, or combine them into one if there are merge conflicts.
>
>
I'm under the impression that removing the null representation of empty
privileges by making them (none) removes all known \d commands that output
nulls and don't obey \pset null.  At least, the existing \pset null patch
doesn't purport to fix anything that would become not applicable if the
(none) patch goes in.  I.e., at present they are mutually exclusive.

David J.


Re: Remove last traces of HPPA support

2023-10-20 Thread Andres Freund
Hi,

On 2023-10-19 17:23:04 -0700, Noah Misch wrote:
> On Thu, Oct 19, 2023 at 11:16:28AM -0400, Tom Lane wrote:
> > We removed support for the HP-UX OS in v16, but left in support
> > for the PA-RISC architecture, mainly because I thought that its
> > spinlock mechanism is weird enough to be a good stress test
> > for our spinlock infrastructure.  It still is that, but my
> > one remaining HPPA machine has gone to the great recycle heap
> > in the sky.  There seems little point in keeping around nominal
> > support for an architecture that we can't test and no one is
> > using anymore.
> > 
> > Hence, the attached removes the remaining support for HPPA.
> > Any objections?
> 
> I wouldn't do this.  NetBSD/hppa still claims to exist, as does the OpenBSD
> equivalent.  I presume its pkgsrc compiles this code.  The code is basically
> zero-maintenance, so there's not much to gain from deleting it preemptively.

In addition to the point Tom has made, I think it's also not correct that hppa
doesn't impose a burden: hppa is the only one of our architectures that doesn't
actually support atomic operations, requiring us to have infrastructure to
backfill atomics using spinlocks. This does preclude some uses of atomics,
e.g. in signal handlers - I think Thomas wanted to do so for some concurrency
primitive.

Greetings,

Andres Freund




Re: Meson build updates

2023-10-20 Thread Andres Freund
Hi,

On 2023-07-12 18:53:03 -0500, Tristan Partin wrote:
> Did you need anything more from the "Make finding pkg-config(python3)
> more robust" patch? That one doesn't seem to have been applied yet.

Sorry, was overloaded at the time and then lost track of it. Pushed now!

Greetings,

Andres Freund




Re: Fix output of zero privileges in psql

2023-10-20 Thread Tom Lane
Laurenz Albe  writes:
> I am not sure how to proceed. Perhaps it would indeed be better to have
> two competing commitfest entries.  Both could be "ready for committer",
> and the committers can decide what they prefer.

As near as I can tell, doing both things (the \pset null fix and
substituting "(none)" for empty privileges) would be an acceptable
answer to everyone who has commented.  Let's proceed with both
patches, or combine them into one if there are merge conflicts.

regards, tom lane




Re: Remove last traces of HPPA support

2023-10-20 Thread Tom Lane
I wrote:
> Noah Misch  writes:
>> On Thu, Oct 19, 2023 at 11:16:28AM -0400, Tom Lane wrote:
>>> Hence, the attached removes the remaining support for HPPA.

>> I wouldn't do this.  NetBSD/hppa still claims to exist, as does the OpenBSD
>> equivalent.  I presume its pkgsrc compiles this code.  The code is basically
>> zero-maintenance, so there's not much to gain from deleting it preemptively.

> I doubt it: I don't think anyone is routinely building very much of
> pkgsrc for backwater hardware like HPPA, on either distro.

I dug a bit further on this point.  The previous discussion about
our policy for old-hardware support was here:

https://www.postgresql.org/message-id/flat/959917.1657522169%40sss.pgh.pa.us#47f7af4817dc8dc0d8901d1ee965971e

The existence of a NetBSD/sh3el package for Postgres didn't stop
us from dropping SuperH support.  Moreover, the page showing the
existence of that package:

https://ftp.netbsd.org/pub/pkgsrc/current/pkgsrc/databases/postgresql14-server/index.html

also shows a build for VAX, which we know positively would not
have passed regression tests, so they certainly weren't testing
those builds.  (And, to the point here, it does *not* show any
build for hppa.)

The bottom line, though, is that IMV we agreed in that thread to a
policy that no architecture will be considered supported unless
it has a representative in the buildfarm.  We've since enforced
that policy in the case of loongarch64, so it seems established.
With my HPPA animal gone, and nobody very likely to step up with
a replacement, HPPA no longer meets that threshold requirement.

regards, tom lane




Re: controlling meson's parallelism (and some whining)

2023-10-20 Thread Tristan Partin

On Fri Oct 20, 2023 at 11:22 AM CDT, Tristan Partin wrote:

On Thu Oct 19, 2023 at 12:44 PM CDT, Robert Haas wrote:
> The obvious fix to this is to just tell 'meson test' how many 
> processes I'd like it to run. I thought maybe I could just do 'meson 
> -j8 test' but that does not work, because the option is 
> --num-processes and has no short version. Even typing -j8 every time 
> would be kind of annoying; typing --num-processes 8 every time is 
> ridiculously verbose.


I submitted a patch[0] to Meson to add -j.

[0]: https://github.com/mesonbuild/meson/pull/12403


You will see this in the 1.3.0 release which will be happening soon™️.

--
Tristan Partin
Neon (https://neon.tech)




Re: Improve WALRead() to suck data directly from WAL buffers when possible

2023-10-20 Thread Bharath Rupireddy
On Thu, Oct 12, 2023 at 4:13 AM Andres Freund  wrote:
>
> On 2023-10-03 16:05:32 -0700, Jeff Davis wrote:
> > On Sat, 2023-01-14 at 12:34 -0800, Andres Freund wrote:
> > > One benefit would be that it'd make it more realistic to use direct
> > > IO for WAL
> > > - for which I have seen significant performance benefits. But when we
> > > afterwards have to re-read it from disk to replicate, it's less
> > > clearly a win.
> >
> > Does this patch still look like a good fit for your (or someone else's)
> > plans for direct IO here? If so, would committing this soon make it
> > easier to make progress on that, or should we wait until it's actually
> > needed?
>
> I think it'd be quite useful to have. Even with the code as of 16, I see
> better performance in some workloads with debug_io_direct=wal,
> wal_sync_method=open_datasync compared to any other configuration. Except of
> course that it makes walsenders more problematic, as they suddenly require
> read IO. Thus having support for walsenders to send directly from wal buffers
> would be beneficial, even without further AIO infrastructure.

I'm attaching the v11 patch set with the following changes:
- Improved input validation in the function that reads WAL from WAL
buffers in 0001 patch.
- Improved test module's code in 0002 patch.
- Modernized meson build file in 0002 patch.
- Added commit messages for both the patches.
- Ran pgindent on both the patches.

Any thoughts are welcome.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 6615590d795a6068897a8aa348d9e699442bb07b Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Fri, 20 Oct 2023 15:49:23 +
Subject: [PATCH v11] Allow WAL reading from WAL buffers

This commit gives WALRead() the capability to read WAL from WAL
buffers when possible. When requested WAL isn't available in WAL
buffers, the WAL is read from the WAL file as usual. It relies on
WALBufMappingLock so that no one replaces the WAL buffer page that
we're reading from. It skips reading from WAL buffers if
WALBufMappingLock can't be acquired immediately. In other words,
it doesn't wait for WALBufMappingLock to be available. This helps
reduce the contention on WALBufMappingLock.

This commit benefits the callers of WALRead(), namely walsenders
and pg_walinspect, which can now avoid reading WAL from the WAL
file (possibly avoiding disk IO). Tests show that the WAL buffers
hit ratio stood at 95% for 1 primary, 1 sync standby, 1 async
standby, with pgbench --scale=300 --client=32 --time=900. In other
words, 95% of the time the walsenders avoided reading from the
file and the associated pread system calls:
https://www.postgresql.org/message-id/CALj2ACXKKK%3DwbiG5_t6dGao5GoecMwRkhr7GjVBM_jg54%2BNa%3DQ%40mail.gmail.com

This commit also helps when direct IO is enabled for WAL. Reading
WAL from WAL buffers brings performance back close to that observed
without direct IO for WAL:
https://www.postgresql.org/message-id/CALj2ACV6rS%2B7iZx5%2BoAvyXJaN4AG-djAQeM1mrM%3DYSDkVrUs7g%40mail.gmail.com

This commit also paves the way for the following features in the
future:
- Improved synchronous replication performance by replicating
directly from WAL buffers.
- An opt-in way for walreceivers to receive unflushed WAL.
More details here:
https://www.postgresql.org/message-id/20231011224353.cl7c2s222dw3de4j%40awork3.anarazel.de

Author: Bharath Rupireddy
Reviewed-by: Dilip Kumar, Andres Freund
Reviewed-by: Nathan Bossart, Kuntal Ghosh
Discussion: https://www.postgresql.org/message-id/CALj2ACXKKK%3DwbiG5_t6dGao5GoecMwRkhr7GjVBM_jg54%2BNa%3DQ%40mail.gmail.com
---
 src/backend/access/transam/xlog.c   | 208 
 src/backend/access/transam/xlogreader.c |  45 -
 src/include/access/xlog.h   |   6 +
 3 files changed, 257 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index cea13e3d58..9553a880f1 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1706,6 +1706,214 @@ GetXLogBuffer(XLogRecPtr ptr, TimeLineID tli)
 	return cachedPos + ptr % XLOG_BLCKSZ;
 }
 
+/*
+ * Read WAL from WAL buffers.
+ *
+ * Read 'count' bytes of WAL from WAL buffers into 'buf', starting at location
+ * 'startptr', on timeline 'tli' and return total read bytes.
+ *
+ * Note that this function reads as much as it can from WAL buffers, meaning,
+ * it may not read all the requested 'count' bytes. Caller must be aware of
+ * this and deal with it.
+ *
+ * Note that this function is not available to frontend code, as WAL buffers
+ * are a mechanism internal to the server.
+ */
+Size
+XLogReadFromBuffers(XLogReaderState *state,
+	XLogRecPtr startptr,
+	TimeLineID tli,
+	Size count,
+	char *buf)
+{
+	XLogRecPtr	ptr;
+	Size		nbytes;
+	Size		ntotal;
+	Size		nbatch;
+	char	   *batchstart;
+	TimeLineID	current_timeline;
+
+	/*
+	 * Do some input parameter validations to fail 

Re: [PATCH] Extend ALTER OPERATOR to support adding commutator, negator, hashes, and merges

2023-10-20 Thread Tom Lane
jian he  writes:
> errmsg("operator attribute \"negator\" cannot be changed if it has
> already been set")));
> I feel like the above message is not very helpful.

I think it's okay to be concise about this as long as the operator
we're referring to is the target of the ALTER.  I agree that when
we're complaining about some *other* operator, we'd better spell
out which one we mean, and I made some changes to the patch to
improve that.

Pushed after a round of editorialization -- mostly cosmetic
stuff, except for tweaking some error messages.  I shortened the
test cases a bit too, as I thought they were somewhat excessive
to have as a permanent thing.

regards, tom lane




Re: Is this a problem in GenericXLogFinish()?

2023-10-20 Thread Jeff Davis
On Thu, 2023-10-19 at 16:12 -0400, Robert Haas wrote:
> For what it's worth, though, I think it would be better to
> just make these cases exceptions to your Assert

OK, I'll probably commit something like v4 then.

I still have a question though: if a buffer is exclusive-locked,
unmodified and clean, and then the caller registers it and later does
PageSetLSN (just as if it were dirty), is that a definite bug?

There are a couple callsites where the control flow is complex enough
that it's hard to be sure the buffer is always marked dirty before
being registered (like in log_heap_visible(), as I mentioned upthread).
But those callsites are all doing PageSetLSN, unlike the hash index
case.
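As a toy illustration of the invariant in question, consider a model where a buffer tracks whether it has been registered and dirtied, and stamping an LSN on a registered-but-clean buffer is flagged as the suspect case. This is purely illustrative (the struct, function names, and return-value convention are invented), not bufmgr code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy buffer state: a clean page with an advanced LSN could be evicted
 * without ever being written back, which is why dirtying should precede
 * the LSN stamp. */
typedef struct
{
    bool     dirty;
    bool     registered;
    uint64_t page_lsn;
} ToyBuffer;

static void mark_dirty(ToyBuffer *b)   { b->dirty = true; }
static void register_buf(ToyBuffer *b) { b->registered = true; }

/* Returns false instead of asserting, so the condition is easy to test:
 * stamping an LSN on a registered-but-clean buffer is the dubious case
 * being asked about above. */
static bool
set_page_lsn(ToyBuffer *b, uint64_t lsn)
{
    if (b->registered && !b->dirty)
        return false;
    b->page_lsn = lsn;
    return true;
}
```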

Regards,
Jeff Davis





Re: controlling meson's parallelism (and some whining)

2023-10-20 Thread Tristan Partin

On Thu Oct 19, 2023 at 12:44 PM CDT, Robert Haas wrote:
The obvious fix to this is to just tell 'meson test' how many 
processes I'd like it to run. I thought maybe I could just do 'meson 
-j8 test' but that does not work, because the option is 
--num-processes and has no short version. Even typing -j8 every time 
would be kind of annoying; typing --num-processes 8 every time is 
ridiculously verbose.


I submitted a patch[0] to Meson to add -j.

[0]: https://github.com/mesonbuild/meson/pull/12403

--
Tristan Partin
Neon (https://neon.tech)




Re: trying again to get incremental backup

2023-10-20 Thread David Steele

On 10/19/23 16:00, Robert Haas wrote:

On Thu, Oct 19, 2023 at 3:18 PM David Steele  wrote:

0001 looks pretty good to me. The only thing I find a little troublesome
is the repeated construction of file names with/without segment numbers
in ResetUnloggedRelationsInDbspaceDir(), e.g.:

+   if (segno == 0)
+   snprintf(dstpath, sizeof(dstpath), "%s/%u",
+dbspacedirname, relNumber);
+   else
+   snprintf(dstpath, sizeof(dstpath), "%s/%u.%u",
+dbspacedirname, relNumber, 
segno);


If this happened three times I'd definitely want a helper function, but
even with two I think it would be a bit nicer.


Personally I think that would make the code harder to read rather than
easier. I agree that repeating code isn't great, but this is a
relatively brief idiom and pretty self-explanatory. If other people
agree with you I can change it, but to me it's not an improvement.


Then I'm fine with it as is.
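For the record, the helper being discussed might have looked something like this sketch (the name relation_file_path() is invented here; the real code keeps the two snprintf() calls inline):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper consolidating the two snprintf() idioms: segment 0
 * has no ".segno" suffix, later segments do. */
static void
relation_file_path(char *dst, size_t dstlen, const char *dbspacedirname,
                   unsigned relNumber, unsigned segno)
{
    if (segno == 0)
        snprintf(dst, dstlen, "%s/%u", dbspacedirname, relNumber);
    else
        snprintf(dst, dstlen, "%s/%u.%u", dbspacedirname, relNumber, segno);
}
```

Whether this is clearer than the inline version is exactly the judgment call made above.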


0002 is definitely a good idea. FWIW pgBackRest does this conversion but
also errors if it does not succeed. We have never seen a report of this
error happening in the wild, so I think it must be pretty rare if it
does happen.


Cool, but ... how about the main patch set? It's nice to get some of
these refactoring bits and pieces out of the way, but if I spend the
effort to work out what I think are the right answers to the remaining
design questions for the main patch set and then find out after I've
done all that that you have massive objections, I'm going to be
annoyed. I've been trying to get this feature into PostgreSQL for
years, and if I don't succeed this time, I want the reason to be
something better than "well, I didn't find out that David disliked X
until five minutes before I was planning to type 'git push'."


I simply have not had time to look at the main patch set in any detail.


I'm not really concerned about detailed bug-hunting in the main
patches just yet. The time for that will come. But if you have views
on how to resolve the design questions that I mentioned in a couple of
emails back, or intend to advocate vigorously against the whole
concept for some reason, let's try to sort that out sooner rather than
later.


In my view this feature puts the cart way before the horse. I'd think 
higher priority features might be parallelism, a backup repository, 
expiration management, archiving, or maybe even a restore command.


It seems the only goal here is to make pg_basebackup a tool for external 
backup software to use, which might be OK, but I don't believe this 
feature really advances pg_basebackup as a usable piece of stand-alone 
software. If people really think that start/stop backup is too 
complicated an interface how are they supposed to track page 
incrementals and get them to a place where pg_combinebackup can put them 
back together? If automation is required to use this feature, 
shouldn't pg_basebackup implement that automation?


I have plenty of thoughts about the implementation as well, but I have a 
lot on my plate right now and I don't have time to get into it.


I don't plan to stand in your way on this feature. I'm reviewing what 
patches I can out of courtesy and to be sure that nothing adjacent to 
your work is being affected. My apologies if my reviews are not meeting 
your expectations, but I am contributing as my time constraints allow.


Regards,
-David




RE: [PoC] pg_upgrade: allow to upgrade publisher node

2023-10-20 Thread Zhijie Hou (Fujitsu)
On Friday, October 20, 2023 11:24 AM vignesh C  wrote:
> 
> On Thu, 19 Oct 2023 at 16:16, Hayato Kuroda (Fujitsu)
>  wrote:
> >
> > Dear Vignesh,
> >
> > Thanks for revieing! New patch can be available in [1].
> >
> > > Few comments:
> > > 1) Even if we comment 3rd point "Emit a non-transactional message",
> > > test_slot2 still appears in the
> > > invalid_logical_replication_slots.txt
> > > file. There is something wrong here.
> > > +   # 2. Advance the slot test_slot2 up to the current WAL location,
> but
> > > +   #test_slot1 still has unconsumed WAL records.
> > > +   $old_publisher->safe_psql('postgres',
> > > +   "SELECT pg_replication_slot_advance('test_slot2',
> > > + NULL);");
> > > +
> > > +   # 3. Emit a non-transactional message. test_slot2 detects
> > > + the message
> > > so
> > > +   #that this slot will be also reported by upcoming
> pg_upgrade.
> > > +   $old_publisher->safe_psql('postgres',
> > > +   "SELECT count(*) FROM
> > > + pg_logical_emit_message('false',
> > > 'prefix', 'This is a non-transactional message');"
> > > +   );
> >
> > The comment was updated based on others. How do you think?
> 
> I mean if we comment or remove this statement like in the attached patch, the
> test is still passing with 'The slot "test_slot2" has not consumed the WAL 
> yet', in
> this case should the test_slot2 be still invalid as we have called
> pg_replication_slot_advance for test_slot2.

It's because we pass NULL to pg_replication_slot_advance(). We should pass 
pg_current_wal_lsn() instead. I have fixed it in the V55 version.

Best Regards,
Hou zj


RE: [PoC] pg_upgrade: allow to upgrade publisher node

2023-10-20 Thread Zhijie Hou (Fujitsu)
On Friday, October 20, 2023 9:50 AM Peter Smith  wrote:
> 
> Here are some review comments for v54-0001

Thanks for the review.

> 
> ==
> src/backend/replication/slot.c
> 
> 1.
> + if (*invalidated && SlotIsLogical(s) && IsBinaryUpgrade) {
> + ereport(ERROR, errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("replication slots must not be invalidated during the
> + upgrade"), errhint("\"max_slot_wal_keep_size\" must not be set to -1
> + during the
> upgrade"));
> + }
> 
> This new error is replacing the old code:
> + Assert(max_slot_wal_keep_size_mb == -1);
> 
> Is that errhint correct? Shouldn't it say "must" instead of "must not"?

Fixed.

> 
> ==
> src/bin/pg_upgrade/t/003_upgrade_logical_replication_slots.pl
> 
> 2. General formating
> 
> Some of the "]);" formatting and indenting for the multiple SQL commands is
> inconsistent.
> 
> For example,
> 
> + $old_publisher->safe_psql(
> + 'postgres', qq[
> + SELECT pg_create_logical_replication_slot('test_slot1',
> + 'test_decoding'); SELECT
> + pg_create_logical_replication_slot('test_slot2', 'test_decoding'); ]
> + );
> 
> versus
> 
> + $old_publisher->safe_psql(
> + 'postgres', qq[
> + CREATE TABLE tbl AS SELECT generate_series(1, 10) AS a; SELECT
> + pg_replication_slot_advance('test_slot2', NULL); SELECT count(*) FROM
> + pg_logical_emit_message('false', 'prefix',
> 'This is a non-transactional message');
> + ]);
> 

Fixed.

> ~~~
> 
> 3.
> +# Set up some settings for the old cluster, so that we can ensures that
> +initdb # will be done.
> +my @initdb_params = ();
> +push @initdb_params, ('--encoding', 'UTF-8'); push @initdb_params,
> +('--locale', 'C'); $node_params{extra} = \@initdb_params;
> +
> +$old_publisher->init(%node_params);
> 
> Why would initdb not be done if these were not set? I didn't understand the
> comment.
> 
> /so that we can ensures/to ensure/

The node->init() will use a previously initialized cluster if no parameter was
specified, but that cluster could be of the wrong version when doing a
cross-version test, so we set something to force initdb to happen.

I added some explanation in the comment.

> ~~~
> 
> 4.
> +# XXX: For PG9.6 and prior, the TAP Cluster.pm assigns
> +'max_wal_senders' and # 'max_connections' to the same value (10). But
> +these versions considered # max_wal_senders as a subset of
> +max_connections, so setting the same value # will fail. This adjustment
> +will not be needed when packages for older #versions are defined.
> +if ($old_publisher->pg_version->major <= 9.6) {
> +$old_publisher->append_conf(  'postgresql.conf', qq[  max_wal_senders =
> +5  max_connections = 10  ]); }
> 
> 4a.
> IMO remove the complicated comment trying to explain the problem and just
> to unconditionally set the values you want.
> 
> SUGGESTION#1
> # Older PG version had different rules for the inter-dependency of
> 'max_wal_senders' and 'max_connections', # so assign values which will work
> for all PG versions.
> $old_publisher->append_conf(
>   'postgresql.conf', qq[
>   max_wal_senders = 5
>   max_connections = 10
>   ]);
> 
> ~~

As Kuroda-san mentioned, we may fix Cluster.pm later, so I kept the XXX comment
but simplified it based on your suggestion.

Attach the new version patch.

Best Regards,
Hou zj


v55-0001-pg_upgrade-Allow-to-replicate-logical-replicatio.patch
Description:  v55-0001-pg_upgrade-Allow-to-replicate-logical-replicatio.patch


Re: Patch: Improve Boolean Predicate JSON Path Docs

2023-10-20 Thread David Wheeler
On Oct 19, 2023, at 23:49, Erik Wienhold  wrote:

> I don't even know what that represents, probably not some fancy file
> compression.

Oh, weird. Trying from a webmail client instead.

Best,

David


v5-0001-Improve-boolean-predicate-JSON-Path-docs.patch
Description: Binary data


Re: trying again to get incremental backup

2023-10-20 Thread Jakub Wartak
Hi Robert,

On Wed, Oct 4, 2023 at 10:09 PM Robert Haas  wrote:
>
> On Tue, Oct 3, 2023 at 2:21 PM Robert Haas  wrote:
> > Here's a new patch set, also addressing Jakub's observation that
> > MINIMUM_VERSION_FOR_WAL_SUMMARIES needed updating.
>
> Here's yet another new version.[..]

Okay, so more good news - related to patch version #4.
A not-so-tiny stress test consisting of a pgbench run for 24h straight
(with incremental backups every 2h, based on an initial full backup),
followed by two PITRs (one not using the incremental backup and one
using it, to illustrate the performance point - and to potentially spot
any errors in between). In both cases it worked fine. Pgbench has this
behaviour that it doesn't cause space growth over time - it produces
lots of WAL instead. Some stats:

START DBSIZE: ~3.3GB (pgbench -i -s 200 --partitions=8)
END DBSIZE: ~4.3GB
RUN DURATION: 24h (pgbench -P 1 -R 100 -T 86400)
WALARCHIVES-24h: 77GB
FULL-DB-BACKUP-SIZE: 3.4GB
INCREMENTAL-BACKUP-11-SIZE: 3.5GB
Env: Azure VM D4s (4VCPU), Debian 11, gcc 10.2, normal build (asserts
and debug disabled)
The increments were taken every 2h just to see if they would fail for
any reason - they did not.

PITR RTO RESULTS (copy/pg_combinebackup time + recovery time):
1. time to restore from fullbackup (+ recovery of 24h WAL[77GB]): 53s
+ 4640s =~ 78min
2. time to restore from fullbackup+incremental backup from 2h ago (+
recovery of 2h WAL [5.4GB]): 68s + 190s =~ 4min18s

I could probably pre-populate the DB with 1TB of cold data (not touched
by pgbench at all), just for the sake of argument, and that would
demonstrate how space-efficient the incremental backup can be, but most
of the time would be wasted on copying the 1TB here...

> - I would like some feedback on the generation of WAL summary files.
> Right now, I have it enabled by default, and summaries are kept for a
> week. That means that, with no additional setup, you can take an
> incremental backup as long as the reference backup was taken in the
> last week.

I've just noticed one thing while recovery is in progress: is
summarization working during recovery - in the background - expected
behaviour? I'm wondering because after a freshly restored and recovered
DB one would need to create a *new* full backup, and only from that
point would new summaries have any use?
Sample log:

2023-10-20 11:10:02.288 UTC [64434] LOG:  restored log file
"000100020022" from archive
2023-10-20 11:10:02.599 UTC [64434] LOG:  restored log file
"000100020023" from archive
2023-10-20 11:10:02.769 UTC [64446] LOG:  summarized WAL on TLI 1 from
2/139B1130 to 2/239B1518
2023-10-20 11:10:02.923 UTC [64434] LOG:  restored log file
"000100020024" from archive
2023-10-20 11:10:03.193 UTC [64434] LOG:  restored log file
"000100020025" from archive
2023-10-20 11:10:03.345 UTC [64432] LOG:  restartpoint starting: wal
2023-10-20 11:10:03.407 UTC [64446] LOG:  summarized WAL on TLI 1 from
2/239B1518 to 2/25B609D0
2023-10-20 11:10:03.521 UTC [64434] LOG:  restored log file
"000100020026" from archive
2023-10-20 11:10:04.429 UTC [64434] LOG:  restored log file
"000100020027" from archive

>
> - On a related note, I haven't yet tested this on a standby, which is
> a thing that I definitely need to do. I don't know of a reason why it
> shouldn't be possible for all of this machinery to work on a standby
> just as it does on a primary, but then we need the WAL summarizer to
> run there too, which could end up being a waste if nobody ever tries
> to take an incremental backup. I wonder how that should be reflected
> in the configuration. We could do something like what we've done for
> archive_mode, where on means "only on if this is a primary" and you
> have to say always if you want it to run on standbys as well ... but
> I'm not sure if that's a design pattern that we really want to
> replicate into more places. I'd be somewhat inclined to just make
> whatever configuration parameters we need to configure this thing on
> the primary also work on standbys, and you can set each server up as
> you please. But I'm open to other suggestions.

I'll try to play with some standby restores in future, stay tuned.

Regards,
-J.




Re: SQL:2011 application time

2023-10-20 Thread jian he
Hi.
based on v16.

/* Look up the FOR PORTION OF name requested. */
range_attno = attnameAttNum(targetrel, range_name, false);
if (range_attno == InvalidAttrNumber)
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_COLUMN),
errmsg("column or period \"%s\" of relation \"%s\" does not exist",
range_name,
RelationGetRelationName(targetrel)),
parser_errposition(pstate, forPortionOf->range_name_location)));
attr = TupleDescAttr(targetrel->rd_att, range_attno - 1);
// TODO: check attr->attisdropped (?),
// and figure out concurrency issues with that in general.
// It should work the same as updating any other column.

I don't think we need to check attr->attisdropped here, because the
above function attnameAttNum() already does the job.

bool
get_typname_and_namespace(Oid typid, char **typname, char **typnamespace)
{
HeapTuple tp;

tp = SearchSysCache1(TYPEOID, ObjectIdGetDatum(typid));
if (HeapTupleIsValid(tp))
{
Form_pg_type typtup = (Form_pg_type) GETSTRUCT(tp);

*typname = pstrdup(NameStr(typtup->typname));
*typnamespace = get_namespace_name(typtup->typnamespace);
ReleaseSysCache(tp);
return *typnamespace;

"return *typnamespace;" should be "return true"?
Maybe rename it to get_typname_and_typnamespace?
---
if (!get_typname_and_namespace(attr->atttypid, &range_type_name,
&range_type_namespace))
elog(ERROR, "missing range type %d", attr->atttypid);

you can just `elog(ERROR, "missing range type %s", range_type_name);` ?
Also, this should be placed just below if (!type_is_range(attr->atttypid))?
---
src/backend/catalog/objectaddress.c

if (OidIsValid(per->perrelid))
{
StringInfoData rel;

initStringInfo(&rel);
getRelationDescription(&rel, per->perrelid, false);
appendStringInfo(&buffer, _("period %s on %s"),
NameStr(per->pername), rel.data);
pfree(rel.data);
}
else
{
appendStringInfo(&buffer, _("period %s"),
NameStr(per->pername));
}

Periods are always associated with a table, so is the above else branch correct?
---
File: src/backend/commands/tablecmds.c
7899: /*
7900: * this test is deliberately not attisdropped-aware, since if one tries to
7901: * add a column matching a dropped column name, it's gonna fail anyway.
7902: *
7903: * XXX: Does this hold for periods?
7904: */
7905: attTuple = SearchSysCache2(ATTNAME,
7906:ObjectIdGetDatum(RelationGetRelid(rel)),
7907:PointerGetDatum(pername));

XXX: Does this hold for periods?
Yes. We can add the following two SQL statements for code coverage.
alter table pt add period for tableoid (ds, de);
alter table pt add period for "pg.dropped.4" (ds, de);




Re: Add the ability to limit the amount of memory that can be allocated to backends.

2023-10-20 Thread Stephen Frost
Greetings,

* Andrei Lepikhov (a.lepik...@postgrespro.ru) wrote:
> The only issue I worry about is the uncertainty and clutter that can be
> created by this feature. In the worst case, when we have a complex error
> stack (including the extension's CATCH sections, exceptions in stored
> procedures, etc.), the backend will throw the memory limit error repeatedly.

I'm not seeing what additional uncertainty or clutter there is- this is,
again, exactly the same as what happens today on a system with
overcommit disabled and I don't feel like we get a lot of complaints
about this today.

> Of course, one failed backend looks better than a surprisingly killed
> postmaster, but the mix of different error reports and details looks
> terrible and challenging to debug in the case of trouble. So, may we throw a
> FATAL error if we reach this limit while handling an exception?

I don't see why we'd do that when we can do better- we just fail
whatever the ongoing query or transaction is and allow further requests
on the same connection.  We already support exactly that and it works
really rather well and I don't see why we'd throw that away because
there's a different way to get an OOM error.

If you want to make the argument that we should throw FATAL on OOM when
handling an exception, that's something you could argue independently of
this effort already today, but I don't think you'll get agreement that
it's an improvement.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: [PoC/RFC] Multiple passwords, interval expirations

2023-10-20 Thread Stephen Frost
Greetings,

* Jeff Davis (pg...@j-davis.com) wrote:
> On Wed, 2023-10-18 at 14:48 -0400, Stephen Frost wrote:
> > Right, we need more observability, agreed, but that's not strictly
> > necessary of this patch and could certainly be added independently. 
> > Is
> > there really a need to make this observability a requirement of this
> > particular change?
> 
> I won't draw a line in the sand, but it feels like something should be
> there to help the user keep track of which password they might want to
> keep. At least a "created on" date or something.

Sure, no objection to adding that and seems like it should be fairly
easy ... but then again, I tend to feel that we should do that for all
of the objects in the system and we've got some strong feelings against
doing that from others.  Perhaps this case is different to them, in
which case, great, but if it's not, it'd be unfortunate for this feature
to get bogged down due to that.

> > > (Aside: is the uniqueness of the salt enforced in the current
> > > patch?)
> > 
> > Err, the salt has to be *identical* for each password of a given
> > user,
> > not unique, so I'm a bit confused here.
> 
> Sorry, my mistake.

Sure, no worries.

> If the client needs to use the same salt as existing passwords, can you
> still use PQencryptPasswordConn() on the client to avoid sending the
> plaintext password to the server?

Short answer, yes ... but seems that wasn't actually done yet.  Requires
a bit of additional work, since the client needs to get the existing
salt (note that as part of the SCRAM typical exchange, the client is
provided with the salt, so this isn't exposing anything new) to use to
construct what is then sent to the server to store.  We'll also need to
decide how to handle the case if the client tries to send a password
that doesn't have the same salt as the existing ones (regardless of how
many passwords we end up supporting).  Perhaps we require the user,
through the grammar, to make clear if they want to add a password, and
then error if they don't provide a matching salt, or if they want to
remove existing passwords and replace with the new one.
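One piece of that additional work, checking whether a client-supplied secret shares the salt of an existing one, can be sketched over the textual SCRAM secret format "SCRAM-SHA-256$&lt;iter&gt;:&lt;salt&gt;$&lt;StoredKey&gt;:&lt;ServerKey&gt;". The function names below are invented for this sketch, and it does no validation beyond locating the delimiters.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Locate the base64 salt inside a SCRAM secret of the shape
 * "SCRAM-SHA-256$<iter>:<salt>$<StoredKey>:<ServerKey>".  Returns a span
 * into the original string; no base64 decoding is attempted. */
static bool
scram_salt_span(const char *secret, const char **salt, size_t *len)
{
    const char *p = strchr(secret, '$');
    if (p == NULL) return false;
    p = strchr(p + 1, ':');
    if (p == NULL) return false;
    const char *q = strchr(p + 1, '$');
    if (q == NULL) return false;
    *salt = p + 1;
    *len = (size_t) (q - (p + 1));
    return true;
}

/* Two stored secrets can coexist for one role only if they share a salt. */
static bool
same_salt(const char *a, const char *b)
{
    const char *sa, *sb;
    size_t la, lb;

    if (!scram_salt_span(a, &sa, &la) || !scram_salt_span(b, &sb, &lb))
        return false;
    return la == lb && memcmp(sa, sb, la) == 0;
}
```

A server-side check along these lines is what would drive the "error versus replace" decision described above when the salts don't match.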

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: A performance issue with Memoize

2023-10-20 Thread Pavel Stehule
On Fri, Oct 20, 2023 at 13:01, Richard Guo  wrote:

> I noticed $subject with the query below.
>
> set enable_memoize to off;
>
> explain (analyze, costs off)
> select * from tenk1 t1 left join lateral
> (select t1.two as t1two, * from tenk1 t2 offset 0) s
> on t1.two = s.two;
>  QUERY PLAN
>
> 
>  Nested Loop Left Join (actual time=0.050..59578.053 rows=5000 loops=1)
>->  Seq Scan on tenk1 t1 (actual time=0.027..2.703 rows=1 loops=1)
>->  Subquery Scan on s (actual time=0.004..4.819 rows=5000 loops=1)
>  Filter: (t1.two = s.two)
>  Rows Removed by Filter: 5000
>  ->  Seq Scan on tenk1 t2 (actual time=0.002..3.834 rows=1
> loops=1)
>  Planning Time: 0.666 ms
>  Execution Time: 60937.899 ms
> (8 rows)
>
> set enable_memoize to on;
>
> explain (analyze, costs off)
> select * from tenk1 t1 left join lateral
> (select t1.two as t1two, * from tenk1 t2 offset 0) s
> on t1.two = s.two;
> QUERY PLAN
>
> --
>  Nested Loop Left Join (actual time=0.061..122684.607 rows=5000
> loops=1)
>->  Seq Scan on tenk1 t1 (actual time=0.026..3.367 rows=1 loops=1)
>->  Memoize (actual time=0.011..9.821 rows=5000 loops=1)
>  Cache Key: t1.two, t1.two
>  Cache Mode: binary
>  Hits: 0  Misses: 1  Evictions:   Overflows: 0  Memory
> Usage: 1368kB
>  ->  Subquery Scan on s (actual time=0.008..5.188 rows=5000
> loops=1)
>Filter: (t1.two = s.two)
>Rows Removed by Filter: 5000
>->  Seq Scan on tenk1 t2 (actual time=0.004..4.081
> rows=1 loops=1)
>  Planning Time: 0.607 ms
>  Execution Time: 124431.388 ms
> (12 rows)
>
> The execution time (best of 3) is 124431.388 VS 60937.899 with and
> without memoize.
>
> The Memoize runtime stats 'Hits: 0  Misses: 1  Evictions: '
> seems suspicious to me, so I've looked into it a little bit, and found
> that the MemoizeState's keyparamids and its outerPlan's chgParam are
> always different, and that makes us have to purge the entire cache each
> time we rescan the memoize node.
>
> But why are they always different?  Well, for the query above, we have
> two NestLoopParam nodes, one (with paramno 1) is created when we replace
> outer-relation Vars in the scan qual 't1.two = s.two', the other one
> (with paramno 0) is added from the subquery's subplan_params, which was
> created when we replaced uplevel vars with Param nodes for the subquery.
> That is to say, the chgParam would be {0, 1}.
>
> When it comes to replace outer-relation Vars in the memoize keys, the
> two 't1.two' Vars are both replaced with the NestLoopParam with paramno
> 1, because it is the first NLP we see in root->curOuterParams that is
> equal to the Vars in memoize keys.  That is to say, the memoize node's
> keyparamids is {1}.
>
> I haven't thought thoroughly about the fix yet.  But one way I'm
> thinking is that in create_subqueryscan_plan() we can first add the
> subquery's subplan_params to root->curOuterParams, and then replace
> outer-relation Vars in scan_clauses afterwards.  That can make us be
> able to share the same PARAM_EXEC slot for the same Var that both
> belongs to the subquery's uplevel vars and to the NestLoop's
> outer-relation vars.  To be concrete, something like attached.
>
> With this change the same query runs much faster and the Memoize runtime
> stats looks more normal.
>
> explain (analyze, costs off)
> select * from tenk1 t1 left join lateral
> (select t1.two as t1two, * from tenk1 t2 offset 0) s
> on t1.two = s.two;
>   QUERY PLAN
>
> --
>  Nested Loop Left Join (actual time=0.063..21177.476 rows=5000 loops=1)
>->  Seq Scan on tenk1 t1 (actual time=0.025..5.415 rows=1 loops=1)
>->  Memoize (actual time=0.001..0.234 rows=5000 loops=1)
>  Cache Key: t1.two, t1.two
>  Cache Mode: binary
>  Hits: 9998  Misses: 2  Evictions: 0  Overflows: 0  Memory Usage:
> 2735kB
>  ->  Subquery Scan on s (actual time=0.009..5.169 rows=5000
> loops=2)
>Filter: (t1.two = s.two)
>Rows Removed by Filter: 5000
>->  Seq Scan on tenk1 t2 (actual time=0.006..4.050
> rows=1 loops=2)
>  Planning Time: 0.593 ms
>  Execution Time: 22486.621 ms
> (12 rows)
>
> Any thoughts?
>

+1

it would be great to fix this problem - I've seen this issue a few times.

Regards

Pavel



> Thanks
> Richard
>


RE: Initial Schema Sync for Logical Replication

2023-10-20 Thread Kumar, Sachin

> From: vignesh C 
> Sent: Thursday, October 19, 2023 10:41 AM
> Can you rebase the patch and post the complete set of required changes for
> the concurrent DDL, I will have a look at them.

Sure, I will try to send the complete rebased patch within a week.

Regards
Sachin



A performance issue with Memoize

2023-10-20 Thread Richard Guo
I noticed $subject with the query below.

set enable_memoize to off;

explain (analyze, costs off)
select * from tenk1 t1 left join lateral
(select t1.two as t1two, * from tenk1 t2 offset 0) s
on t1.two = s.two;
 QUERY PLAN

 Nested Loop Left Join (actual time=0.050..59578.053 rows=5000 loops=1)
   ->  Seq Scan on tenk1 t1 (actual time=0.027..2.703 rows=1 loops=1)
   ->  Subquery Scan on s (actual time=0.004..4.819 rows=5000 loops=1)
 Filter: (t1.two = s.two)
 Rows Removed by Filter: 5000
 ->  Seq Scan on tenk1 t2 (actual time=0.002..3.834 rows=1
loops=1)
 Planning Time: 0.666 ms
 Execution Time: 60937.899 ms
(8 rows)

set enable_memoize to on;

explain (analyze, costs off)
select * from tenk1 t1 left join lateral
(select t1.two as t1two, * from tenk1 t2 offset 0) s
on t1.two = s.two;
QUERY PLAN
--
 Nested Loop Left Join (actual time=0.061..122684.607 rows=5000 loops=1)
   ->  Seq Scan on tenk1 t1 (actual time=0.026..3.367 rows=1 loops=1)
   ->  Memoize (actual time=0.011..9.821 rows=5000 loops=1)
 Cache Key: t1.two, t1.two
 Cache Mode: binary
 Hits: 0  Misses: 1  Evictions:   Overflows: 0  Memory
Usage: 1368kB
 ->  Subquery Scan on s (actual time=0.008..5.188 rows=5000
loops=1)
   Filter: (t1.two = s.two)
   Rows Removed by Filter: 5000
   ->  Seq Scan on tenk1 t2 (actual time=0.004..4.081
rows=1 loops=1)
 Planning Time: 0.607 ms
 Execution Time: 124431.388 ms
(12 rows)

The execution time (best of 3) is 124431.388 VS 60937.899 with and
without memoize.

The Memoize runtime stats 'Hits: 0  Misses: 1  Evictions: '
seems suspicious to me, so I've looked into it a little bit, and found
that the MemoizeState's keyparamids and its outerPlan's chgParam are
always different, and that makes us have to purge the entire cache each
time we rescan the memoize node.

But why are they always different?  Well, for the query above, we have
two NestLoopParam nodes, one (with paramno 1) is created when we replace
outer-relation Vars in the scan qual 't1.two = s.two', the other one
(with paramno 0) is added from the subquery's subplan_params, which was
created when we replaced uplevel vars with Param nodes for the subquery.
That is to say, the chgParam would be {0, 1}.

When it comes to replacing outer-relation Vars in the memoize keys, the
two 't1.two' Vars are both replaced with the NestLoopParam with paramno
1, because it is the first NLP we see in root->curOuterParams that is
equal to the Vars in the memoize keys.  That is to say, the memoize
node's keyparamids is {1}.

I haven't thought thoroughly about the fix yet.  But one way I'm
thinking of is that in create_subqueryscan_plan() we could first add the
subquery's subplan_params to root->curOuterParams, and then replace
outer-relation Vars in scan_clauses afterwards.  That lets the same
PARAM_EXEC slot be shared for a Var that belongs both to the subquery's
uplevel vars and to the NestLoop's outer-relation vars.  To be concrete,
something like the attached.

With this change the same query runs much faster and the Memoize runtime
stats look more normal.

explain (analyze, costs off)
select * from tenk1 t1 left join lateral
(select t1.two as t1two, * from tenk1 t2 offset 0) s
on t1.two = s.two;
  QUERY PLAN
--
 Nested Loop Left Join (actual time=0.063..21177.476 rows=5000 loops=1)
   ->  Seq Scan on tenk1 t1 (actual time=0.025..5.415 rows=1 loops=1)
   ->  Memoize (actual time=0.001..0.234 rows=5000 loops=1)
 Cache Key: t1.two, t1.two
 Cache Mode: binary
 Hits: 9998  Misses: 2  Evictions: 0  Overflows: 0  Memory Usage:
2735kB
 ->  Subquery Scan on s (actual time=0.009..5.169 rows=5000 loops=2)
   Filter: (t1.two = s.two)
   Rows Removed by Filter: 5000
   ->  Seq Scan on tenk1 t2 (actual time=0.006..4.050
rows=1 loops=2)
 Planning Time: 0.593 ms
 Execution Time: 22486.621 ms
(12 rows)

Any thoughts?

Thanks
Richard


v1-0001-Fix-a-performance-issue-with-Memoize.patch
Description: Binary data


Re: BRIN minmax multi - incorrect distance for infinite timestamp/date

2023-10-20 Thread Ashutosh Bapat
On Thu, Oct 19, 2023 at 6:11 PM Ashutosh Bapat
 wrote:
>
> I think we should provide generate_series(date, date, integer) which
> will use date + integer -> date.

Just to be clear, I don't mean that this patch should add it.

-- 
Best Wishes,
Ashutosh Bapat




Re: Use virtual tuple slot for Unique node

2023-10-20 Thread Ashutosh Bapat
On Thu, Oct 19, 2023 at 4:26 PM David Rowley  wrote:
>
> On Thu, 19 Oct 2023 at 22:29, David Rowley  wrote:
> > It's hard to imagine why there would be a slowdown as this query uses
> > a TTSOpsMinimalTuple slot type in the patch and the unpatched version.
>
> I shrunk down your table sizes to 10k rows instead of 1 million rows
> to reduce the CPU cache pressure on the queries.
>
> I ran pgbench for 1 minute on each query and did pg_prewarm on each
> table. Here are the times I got in milliseconds:
>
> Query   master   Master + 0001   compare
> Q12.576 1.979 130.17%
> Q29.546 9.941   96.03%
> Q39.069 9.536   95.10%
> Q47.285 7.208 101.07%
> Q57.585 6.904 109.86%
> Q6 162.253 161.434100.51%
> Q7   62.507   58.922106.08%
>
> I also noted down the slot type that nodeUnique.c is using in each of
> the queries:
>
> Q1 TTSOpsVirtual
> Q2 TTSOpsVirtual
> Q3 TTSOpsVirtual
> Q4 TTSOpsMinimalTuple
> Q5 TTSOpsVirtual
> Q6 TTSOpsMinimalTuple
> Q7 TTSOpsMinimalTuple
>
> So, I'm not really expecting Q4, Q6 or Q7 to change much. However, Q7
> does seem to be above noise level faster and I'm not sure why.

I ran my experiments again.  It seems that on my machine the execution
times do vary a bit.  I ran EXPLAIN ANALYZE on the query 5 times and
took the average of the execution times.  I did this three times.  For
each run the standard deviation was within 2%.  Here are the numbers:
master: 13548.33, 13878.88, 14572.52
master + 0001: 13734.58, 14193.83, 14574.73

So for me, I would say, this particular query performs the same with
or without patch.

>
> We can see that Q2 and Q3 become a bit slower.  This makes sense as
> tts_virtual_materialize() is quite a bit more complex than
> heap_copy_minimal_tuple() which is a simple palloc/memcpy.
>

If the source slot is a materialized virtual slot,
tts_virtual_copyslot() could perform a memcpy of the materialized data
itself rather than materialising from datums. That might be more
efficient.

> We'd likely see Q2 and Q3 do better with the patched version if there
> were more duplicates as there'd be less tuple deforming going on
> because of the virtual slots.
>
> Overall, the patched version is 5.55% faster than master.  However,
> it's pretty hard to say if we should do this or not. Q3 has a mix of
> varlena and byval types and that came out slower with the patched
> version.

Theoretically, using the same slot type is supposed to be faster.  We
use the same slot types for input and output in other places as well.
Maybe we should fix the above-mentioned inefficiency in
tts_virtual_copyslot()?

-- 
Best Wishes,
Ashutosh Bapat




Re: [patch] pg_basebackup: mention that spread checkpoints are the default in --help

2023-10-20 Thread Aleksander Alekseev
Hi,

> On Thu, Oct 19, 2023 at 10:30:04PM +0200, Michael Banck wrote:
> > Hrm right, but those have multiple options and they do not enumerate
> > them in the help string as do -F and -c - not sure what general project
> > policy here is for mentioning defaults in --help, I will check some of
> > the other commands.
>
> Then comes the point that this bloats the --help output.  A bunch of
> system commands I use on a daily-basis outside Postgres don't do that,
> so it's kind of hard to put a line on what's good or not in this area
> while we have the SGML and man pages to do the job, with always more
> details.

Right. Then I suggest merging the patch as is.

-- 
Best regards,
Aleksander Alekseev




Re: UniqueKey v2

2023-10-20 Thread zhihuifan1213


> i did some simple tests using text data type.
>
> it works with the primary key, not with unique indexes.
> it does not work when the column is unique, not null.
>
> The following is my test.

Can you simplify your test case please?  I can't understand what
"doesn't work" means here, or for which case.  FWIW, this feature has
nothing to do with the real data; I don't think inserting any data is
helpful unless I missed something.


-- 
Best Regards
Andy Fan





Re: [PATCH] Add support function for containment operators

2023-10-20 Thread jian he
On Fri, Oct 20, 2023 at 12:01 AM Laurenz Albe  wrote:
>
> On Fri, 2023-10-13 at 14:26 +0800, jian he wrote:
> > Collation problem seems solved.
>
> I didn't review your patch in detail, there is still a problem
> with my example:
>
>   CREATE TYPE textrange AS RANGE (
>  SUBTYPE = text,
>  SUBTYPE_OPCLASS = text_pattern_ops
>   );
>
>   CREATE TABLE tx (t text COLLATE "cs-CZ-x-icu");
>
>   INSERT INTO tx VALUES ('a'), ('c'), ('d'), ('ch');
>
>   SELECT * FROM tx WHERE t <@ textrange('a', 'd');
>
>t
>   
>a
>c
>ch
>   (3 rows)
>
> That was correct.
>
>   EXPLAIN SELECT * FROM tx WHERE t <@ textrange('a', 'd');
>
>QUERY PLAN
>   
>Seq Scan on tx  (cost=0.00..30.40 rows=7 width=32)
>  Filter: ((t >= 'a'::text) AND (t < 'd'::text))
>   (2 rows)
>
> But that was weird.  The operators seem wrong.  Look at that

Thanks for pointing this out!

The problem is that the TypeCacheEntry->rngelemtype typcache entry
doesn't have the range's SUBTYPE_OPCLASS info.  So in
find_simplified_clause, we need to get the range's SUBTYPE_OPCLASS from
the pg_catalog tables.  Also, in pg_range the column rngsubopc is NOT
NULL, so this should be fine.
From 810208a42e99109bcaf56cddae6968efbcf969c5 Mon Sep 17 00:00:00 2001
From: pgaddict 
Date: Fri, 20 Oct 2023 16:20:38 +0800
Subject: [PATCH v3 1/1]  Optimize quals (Expr <@ RangeConst) and (RangeConst
 @> Expr)

Previously these two quals were processed as is.  This patch rewrites
the expression to expose more info to the optimizer, by adding
prosupport functions to range_contains_elem and elem_contained_by_range.

Expr <@ rangeConst is rewritten to "expr opr range_lower_bound AND expr
opr range_upper_bound" (range bound inclusiveness is handled properly).

Added some tests to validate the generated plan.
---
 src/backend/utils/adt/rangetypes.c  | 229 +++-
 src/backend/utils/adt/rangetypes_selfuncs.c |   6 +-
 src/include/catalog/pg_proc.dat |  12 +-
 src/test/regress/expected/rangetypes.out| 194 +
 src/test/regress/sql/rangetypes.sql | 101 +
 5 files changed, 535 insertions(+), 7 deletions(-)

diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index d65e5625..d100e55e 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -31,16 +31,24 @@
 #include "postgres.h"
 
 #include "access/tupmacs.h"
+#include "access/stratnum.h"
+#include "catalog/pg_range.h"
 #include "common/hashfn.h"
 #include "lib/stringinfo.h"
 #include "libpq/pqformat.h"
 #include "miscadmin.h"
 #include "nodes/miscnodes.h"
+#include "nodes/makefuncs.h"
+#include "nodes/nodeFuncs.h"
+#include "nodes/pg_list.h"
+#include "nodes/supportnodes.h"
 #include "port/pg_bitutils.h"
 #include "utils/builtins.h"
+#include "utils/fmgroids.h"
 #include "utils/date.h"
 #include "utils/lsyscache.h"
 #include "utils/rangetypes.h"
+#include "utils/syscache.h"
 #include "utils/timestamp.h"
 #include "varatt.h"
 
@@ -69,7 +77,11 @@ static Size datum_compute_size(Size data_length, Datum val, bool typbyval,
 			   char typalign, int16 typlen, char typstorage);
 static Pointer datum_write(Pointer ptr, Datum datum, bool typbyval,
 		   char typalign, int16 typlen, char typstorage);
-
+static Expr *build_bound_expr(Oid opfamily, TypeCacheEntry *typeCache,
+			  bool isLowerBound, bool isInclusive,
+			  Datum val, Expr *otherExpr, Oid rng_collation);
+static Node *find_simplified_clause(Const *rangeConst, Expr *otherExpr);
+static Node *match_support_request(Node *rawreq);
 
 /*
  *--
@@ -558,7 +570,6 @@ elem_contained_by_range(PG_FUNCTION_ARGS)
 	PG_RETURN_BOOL(range_contains_elem_internal(typcache, r, val));
 }
 
-
 /* range, range -> bool functions */
 
 /* equality (internal version) */
@@ -2173,6 +2184,29 @@ make_empty_range(TypeCacheEntry *typcache)
 	return make_range(typcache, &lower, &upper, true, NULL);
 }
 
+/*
+ * Planner support function for elem_contained_by_range operator
+ */
+Datum
+elem_contained_by_range_support(PG_FUNCTION_ARGS)
+{
+	Node	   *rawreq = (Node *) PG_GETARG_POINTER(0);
+	Node	   *ret = match_support_request(rawreq);
+
+	PG_RETURN_POINTER(ret);
+}
+
+/*
+ * Planner support function for range_contains_elem operator
+ */
+Datum
+range_contains_elem_support(PG_FUNCTION_ARGS)
+{
+	Node	   *rawreq = (Node *) PG_GETARG_POINTER(0);
+	Node	   *ret = match_support_request(rawreq);
+
+	PG_RETURN_POINTER(ret);
+}
 
 /*
  *--
@@ -2714,3 +2748,194 @@ datum_write(Pointer ptr, Datum datum, bool typbyval, char typalign,
 
 	return ptr;
 }
+/*
+ * Build an (Expr Operator Range_bound) expression, e.g. "expr >= lower(range)".
+ * If both sides of the operator have a collation, the operator should use
+ * the (right) range's collation.
+ */
+static Expr *
+build_bound_expr(Oid 

Re: CRC32C Parallel Computation Optimization on ARM

2023-10-20 Thread Michael Paquier
On Fri, Oct 20, 2023 at 07:08:58AM +, Xiang Gao wrote:
> This patch uses a parallel computing optimization algorithm to
> improve crc32c computing performance on ARM. The algorithm comes
> from the Intel whitepaper
> "crc-iscsi-polynomial-crc32-instruction-paper". Input data is divided
> into three equal-sized blocks: three parallel CRCs (crc0, crc1,
> crc2) per 1024 bytes, one block being 42 (BLK_LENGTH) * 8 (step
> length: crc32c_u64) bytes.
> 
> Crc32c unitest: 
> https://gist.github.com/gaoxyt/138fd53ca1eead8102eeb9204067f7e4
> Crc32c benchmark: 
> https://gist.github.com/gaoxyt/4506c10fc06b3501445e32c4257113e9
> It gets ~2x speedup compared to linear Arm crc32c instructions.

Interesting.  Could you attach to this thread the test files you used
and the results obtained, please?  If this data gets deleted from
GitHub, it would no longer be possible to refer back to what you did or
to the related benchmark results.

Note that your patch is forgetting about meson; it just patches
./configure.
--
Michael


signature.asc
Description: PGP signature


CRC32C Parallel Computation Optimization on ARM

2023-10-20 Thread Xiang Gao
Hi all

This patch uses a parallel computing optimization algorithm to improve crc32c 
computing performance on ARM. The algorithm comes from the Intel whitepaper 
"crc-iscsi-polynomial-crc32-instruction-paper". Input data is divided into three 
equal-sized blocks: three parallel CRCs (crc0, crc1, crc2) per 1024 bytes, one 
block being 42 (BLK_LENGTH) * 8 (step length: crc32c_u64) bytes.

Crc32c unitest: https://gist.github.com/gaoxyt/138fd53ca1eead8102eeb9204067f7e4
Crc32c benchmark: 
https://gist.github.com/gaoxyt/4506c10fc06b3501445e32c4257113e9
It gets ~2x speedup compared to linear Arm crc32c instructions.

I'll create a CommitFests ticket for this submission.
Any comments or feedback are welcome.

IMPORTANT NOTICE: The contents of this email and any attachments are 
confidential and may also be privileged. If you are not the intended recipient, 
please notify the sender immediately and do not disclose the contents to any 
other person, use it for any purpose, or store or copy the information in any 
medium. Thank you.


0001-crc32c-parallel-computation-optimization-on-arm.patch
Description:  0001-crc32c-parallel-computation-optimization-on-arm.patch


Re: Fix output of zero privileges in psql

2023-10-20 Thread Laurenz Albe
On Fri, 2023-10-20 at 04:13 +0200, Erik Wienhold wrote:
> On 2023-10-17 04:05 +0200, David G. Johnston wrote:
> > Erik seems to favors (none)
> 
> Yes, with a slight favor for "(none)" because it's the least disruptive
> to users who change \pset null to a non-blank string.  The output of \dp
> etc. would still look the same for default privileges.
> 
> But I'm also okay with respecting \pset null because it is so simple.
> We will no longer hide the already documented null representation of
> default privileges by allowing the user to display the ACL as it is.
> And with Laurenz' patch we will also document the special case of zero
> privileges and how to distinguish it.

If you want to proceed with your patch, you could send a new version.

I think it could do with an added line of documentation to the
"Privileges" chapter, and I'd say that the regression tests should be
in "psql.sql" (not that it is very important).

I am not sure how to proceed. Perhaps it would indeed be better to have
two competing commitfest entries.  Both could be "ready for committer",
and the committers can decide what they prefer.

Yours,
Laurenz Albe