Re: Track in pg_replication_slots the reason why slots conflict?

2023-12-31 Thread shveta malik
On Mon, Jan 1, 2024 at 9:14 AM shveta malik  wrote:
>
> On Fri, Dec 29, 2023 at 3:35 PM Michael Paquier  wrote:
> >
> > On Fri, Dec 29, 2023 at 09:20:52AM +0530, Amit Kapila wrote:
> > > Does anyone have a preference for a column name? The options on the
> > > table are conflict_cause, conflicting_cause, conflict_reason. Any
> > > others? I was checking docs for similar usage and found
> > > "pg_event_trigger_table_rewrite_reason" function, so based on that we
> > > can even go with conflict_reason.
> >
> > "conflict_reason" sounds like the natural choice here.
>
> Do we have more comments on the patch apart from column_name?
>
> thanks
> Shveta

PFA v3 after changing column name to 'conflict_reason'

thanks
Shveta


v3-0001-Track-conflicting_reason-in-pg_replication_slots.patch
Description: Binary data


Re: Autonomous transactions 2023, WIP

2023-12-31 Thread Pavel Stehule
Hi

On Sun, 31 Dec 2023 at 15:15, Ivan Kush wrote:

>
> On 24.12.2023 15:38, Pavel Stehule wrote:
> > Can you show some benchmarks? I don't like this system too much but
> > maybe it can work enough.
> >
> > Still I am interested in possible use cases. If it should be used only
> > for logging, then we can implement something less generic, but surely
> > with better performance and stability. Logging to tables is a little
> > bit outdated.
> >
> > Regards
> >
> > Pavel
>
> All the use cases of pg_background, except asynchronous execution. If
> asynchronous execution is added later, then all of them =)
>
> For example, also:
>
> * conversion from Oracle's `PRAGMA AUTONOMOUS` to Postgres.
>
> * possibility to create functions that calls utility statements, like
> VACUUM, etc.
>

Almost all of these tasks are more or less dirty. It is a serious question
whether we want to integrate pg_background into core.

I don't have good benchmarks now. Some simple, like many INSERTs. Pool
> gives advantage, more tps compared to pg_background with increasing
> number of backends.
>
> The main advantage over pg_background is pool of workers. In this patch
> separate pool is created for each backend. At the current time I'm
> coding one shared pool for all backends.
>

I'm afraid this solution can be very significantly slower than logging to the
postgres log or forwarding to syslog.


>
> >
> >
> >  > 2. although the Oracle syntax is interesting, and I proposed
> > PRAGMA
> > more times,  it doesn't allow this functionality in other PL
> >
> > 2. Add `AUTONOMOUS` to `BEGIN` instead of `PRAGMA` in `DECLARE`?
> > `BEGIN
> > AUTONOMOUS`.
> > It shows immediately that we are in autonomous session, no need to
> > search in subsequent lines for keyword.
> >
> > ```
> > CREATE FUNCTION foo() RETURNS void AS $$
> > BEGIN AUTONOMOUS
> >INSERT INTO tbl VALUES (1);
> >BEGIN AUTONOMOUS
> > 
> > END;
> > END;
> > $$ LANGUAGE plpgsql;
> > ```
> >
> >  > CREATE OR REPLACE FUNCTION ...
> >  > AS $$
> >  > $$ LANGUAGE plpgsql AUTONOMOUS TRANSACTION;
> >
> > The downside with the keyword in function declaration, that we
> > will not
> > be able to create autonomous subblocks. With `PRAGMA AUTONOMOUS` or
> > `BEGIN AUTONOMOUS` it's possible to create them.
> >
> > ```
> > -- BEGIN AUTONOMOUS
> >
> > CREATE FUNCTION foo() RETURNS void AS $$
> > BEGIN
> >INSERT INTO tbl VALUES (1);
> >BEGIN AUTONOMOUS
> >  INSERT INTO tbl VALUES (2);
> >END;
> > END;
> > $$ LANGUAGE plpgsql;
> >
> >
> > -- or PRAGMA AUTONOMOUS
> >
> > CREATE FUNCTION foo() RETURNS void AS $$
> > BEGIN
> >INSERT INTO tbl VALUES (1);
> >BEGIN
> >DECLARE AUTONOMOUS_TRANSACTION;
> >  INSERT INTO tbl VALUES (2);
> >END;
> > END;
> > $$ LANGUAGE plpgsql;
> >
> >
> > START TRANSACTION;
> > foo();
> > ROLLBACK;
> > ```
> >
> > ```
> > Output:
> > 2
> > ```
> >
> >  > it doesn't allow this functionality in other PL
> >
> > I didn't work out on other PLs at the current time, but...
> >
> > ## Python
> >
> > In plpython we could use context managers, like was proposed in
> > Peter's
> > patch. ```
> >
> > with plpy.autonomous() as a:
> >  a.execute("INSERT INTO tbl VALUES (1) ");
> >
> > ```
> >
> > ## Perl
> >
> > I don't programm in Perl. But googling shows Perl supports subroutine
> > attributes. Maybe add `autonomous` attribute for autonomous
> execution?
> >
> > ```
> > sub foo :autonomous {
> > }
> > ```
> >
> > https://www.perl.com/article/untangling-subroutine-attributes/
> >
> >
> >  > Heikki wrote about the possibility to support threads in Postgres.
> >
> > 3. Do you mean this thread?
> >
> https://www.postgresql.org/message-id/flat/31cc6df9-53fe-3cd9-af5b-ac0d801163f4%40iki.fi
> > Thanks for info. Will watch it. Unfortunately it takes many years to
> > implement threads =(
> >
> >  > Surely, the first topic should be the method of implementation.
> > Maybe
> > I missed it, but there is no agreement of background worker based.
> > I agree. No consensus at the current time.
> > Pros of bgworkers are:
> > 1. this entity is already in Postgres.
> > 2. possibility of asynchronous execution of autonomous session in the
> > future. Like in pg_background extension. For asynchronous
> > execution we
> > need a separate process, bgworkers are this separate process.
> >
> > Also maybe later create autonomous workers themselves without using
> > bgworkers internally: launch of separate process, etc. But I think
> > will
> > be many common code with bgworkers.
> >
> >
> > On 21.12.2023 12:35, Pavel Stehule wrote:
> > > Hi
> > >
> > > although I like the idea related to autonomous transactions, 

Re: Next step towards 64bit XIDs: Switch to FullTransactionId for PGPROC->xid and XLogRecord->xl_xid

2023-12-31 Thread Maxim Orlov
On Fri, 29 Dec 2023 at 16:36, Matthias van de Meent <
boekewurm+postg...@gmail.com> wrote:

>
> I don't think this is an actionable change, as this wastes 4 more bytes
> (or 8 with alignment) in nearly all WAL records that don't use the
> HEAP/HEAP2/XLOG rmgrs, which would then be up to 10 (if not 14, when
> 64bit-aligned) bytes per record. Unless something like [0] gets committed
> this will add a significant write overhead to all operations, even if they
> are not doing anything that needs an XID.
>
> Also, I don't think we need to support transactions that stay alive and
> change things for longer than 2^31 concurrently created transactions, so we
> could well add a fxid to each WAL segment header (and checkpoint record?)
> and calculate the fxid of each record as a relative fxid off of that.
>
Thank you for your reply.  Yes, this is a good idea.  Actually, this is
exactly what I was trying to do at first.
But in this case, we would have to add more locks on TransamVariables->nextXid,
so I abandoned the idea.
Maybe I should look in this direction.
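For illustration, the relative-fxid scheme Matthias describes could look something like the sketch below. The helper name is made up (not from any patch), and it assumes every record's xid stays within 2^31 transactions of the full base xid written once per WAL segment header:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: reconstruct a 64-bit fxid from the 32-bit xid
 * stored in a WAL record and the full base xid recorded once per WAL
 * segment.  Valid as long as every record's xid is within 2^31
 * transactions of the base.
 */
static uint64_t
fxid_from_base(uint64_t base_fxid, uint32_t xid)
{
	/* start from the base's epoch with the record's low 32 bits */
	uint64_t	fxid = (base_fxid & ~UINT64_C(0xFFFFFFFF)) | xid;

	if (fxid + (UINT64_C(1) << 31) < base_fxid)
		fxid += UINT64_C(1) << 32;	/* xid wrapped into the next epoch */
	else if (fxid > base_fxid + (UINT64_C(1) << 31))
		fxid -= UINT64_C(1) << 32;	/* xid belongs to the previous epoch */
	return fxid;
}
```

The two adjustments handle records whose xid falls just after or just before an epoch boundary relative to the base, which is why the base only needs to be refreshed once per segment rather than per record.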

On Sun, 31 Dec 2023 at 03:44, Michael Paquier  wrote:

> And FWIW, echoing with Matthias, making these generic structures
> arbitrary larger is a non-starter.  We should try to make them
> shorter.
>

Yeah, obviously, this patch makes WAL bigger.  I'll try to look into the
idea of fxid calculation, as mentioned above.
It might in part be a "chicken and egg" situation.  It is very hard to
split the overall 64-bit XID patch into smaller pieces
with each part being meaningful and beneficial for the current 32-bit XID Postgres.

Anyway, thanks for the reply, I appreciate it.

-- 
Best regards,
Maxim Orlov.


Re: Build versionless .so for Android

2023-12-31 Thread Matthias Kuhn
Thanks for all the comments and help.

I have added the patch to the January CF.

It looks like meson does not currently support building for android, the
following output is what I get (but I have actually no experience with
meson):

meson.build:320:2: ERROR: Problem encountered: unknown host system:
android

Please find an attached patch which also includes a documentation section.
I am happy to adjust if needed.

Kind regards
Matthias
diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
index 75dc81a..9c88558 100644
--- a/doc/src/sgml/installation.sgml
+++ b/doc/src/sgml/installation.sgml
@@ -1989,6 +1989,14 @@ build-postgresql:
   among developers is to use PROFILE for one-time flag
   adjustments, while COPT might be kept set all the time.
  
+
+ 
+  When building libpq as shared libraries, the resulting files are suffixed
+  with the version libpq.5.[version] and an unversioned
+  symlink libpq.so to the versioned file is created. An
+  exception to that is Android where library file names must end with
+  .so. Building for Android is only supported using make.
+ 
 
   
  
diff --git a/src/Makefile.shlib b/src/Makefile.shlib
index 92d893e..4801b7a 100644
--- a/src/Makefile.shlib
+++ b/src/Makefile.shlib
@@ -187,7 +187,13 @@ endif
 ifeq ($(PORTNAME), linux)
   LINK.shared		= $(COMPILER) -shared
   ifdef soname
-LINK.shared		+= -Wl,-soname,$(soname)
+ifeq (linux-android,$(findstring linux-android,$(host_os)))
+#crosscompiling for android needs to create unversioned libs
+shlib		= lib$(NAME)$(DLSUFFIX)
+LINK.shared		+= -Wl,-soname,lib$(NAME)$(DLSUFFIX)
+else
+LINK.shared		+= -Wl,-soname,$(soname)
+endif
   endif
   BUILD.exports		= ( echo '{ global:'; $(AWK) '/^[^\#]/ {printf "%s;\n",$$1}' $<; echo ' local: *; };' ) >$@
   exports_file		= $(SHLIB_EXPORTS:%.txt=%.list)


Re: Commitfest manager January 2024

2023-12-31 Thread Shubham Khanna
On Sat, Dec 23, 2023 at 8:53 AM vignesh C  wrote:
>
> Hi,
>
> I didn't see anyone volunteering for the January Commitfest, so I'll
> volunteer to be CF manager for January 2024 Commitfest.

I can assist with the January 2024 Commitfest.

Thanks and Regards,
Shubham Khanna.




Re: Track in pg_replication_slots the reason why slots conflict?

2023-12-31 Thread shveta malik
On Fri, Dec 29, 2023 at 3:35 PM Michael Paquier  wrote:
>
> On Fri, Dec 29, 2023 at 09:20:52AM +0530, Amit Kapila wrote:
> > Does anyone have a preference for a column name? The options on the
> > table are conflict_cause, conflicting_cause, conflict_reason. Any
> > others? I was checking docs for similar usage and found
> > "pg_event_trigger_table_rewrite_reason" function, so based on that we
> > can even go with conflict_reason.
>
> "conflict_reason" sounds like the natural choice here.

Do we have more comments on the patch apart from column_name?

thanks
Shveta




Re: Commitfest manager January 2024

2023-12-31 Thread vignesh C
On Sun, 24 Dec 2023 at 18:40, vignesh C  wrote:
>
> On Sun, 24 Dec 2023 at 07:16, Michael Paquier  wrote:
> >
> > On Sat, Dec 23, 2023 at 08:52:38AM +0530, vignesh C wrote:
> > > I didn't see anyone volunteering for the January Commitfest, so I'll
> > > volunteer to be CF manager for January 2024 Commitfest.
> >
> > (Adding Magnus in CC.)
> >
> > That would be really helpful.  Thanks for helping!  Do you have the
> > admin rights on the CF app?  You are going to require them in order to
> > mark the CF as in-process, and you would also need to switch the CF
> > after that from "Future" to "Open" so as people can still post
> > patches once January one begins.
>
> I don't have admin rights for the CF app. Please provide admin rights.

I have not yet got the admin rights. Kindly provide admin rights for the CF app.

Regards,
Vignesh




Re: Add support for __attribute__((returns_nonnull))

2023-12-31 Thread John Naylor
On Thu, Dec 28, 2023 at 1:20 AM Tristan Partin  wrote:
> I recently wound up in a situation where I was checking for NULL return
> values of a function that couldn't ever return NULL because the
> inability to allocate memory was always elog(ERROR)ed (aborted).

It sounds like you have a ready example to test, so what happens with the patch?

As for whether any code generation changed, I'd start by checking if
anything in a non-debug binary changed at all.




Re: Commitfest 2024-01 starting in 3 days!

2023-12-31 Thread vignesh C
On Sun, 31 Dec 2023 at 06:19, Michael Paquier  wrote:
>
> On Fri, Dec 29, 2023 at 02:41:55PM +0530, vignesh C wrote:
> > Commitfest 2024-01 is starting in 3 days!
> > Please register the patches which have not yet registered. Also if
> > someone has some pending patch that is not yet submitted, please
> > submit and register for 2024-01 Commitfest.
> > I will be having a look at the commitfest entries, correcting the
> > status if any of them need to be corrected and update the authors.
> > Kindly keep the patch updated so that the reviewers/committers can
> > review the patches and get it committed.
>
> Please be careful that it should be possible to register patches until
> the 31st of December 23:59 AoE, as of:
> https://www.timeanddate.com/time/zones/aoe
>
> The status of the CF should be switched after this time, not before.

Thanks, I will take care of this.

Regards,
Vignesh




Re: SET ROLE x NO RESET

2023-12-31 Thread Joe Conway

On 12/30/23 17:19, Michał Kłeczek wrote:



On 30 Dec 2023, at 17:16, Eric Hanson  wrote:

What do you think of adding a NO RESET option to the SET ROLE command?


What I proposed some time ago is SET ROLE … GUARDED BY 'password', so 
that you could later: RESET ROLE WITH 'password'


I like that too, but see it as a separate feature. FWIW that is also 
supported by the set_user extension referenced elsewhere on this thread.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Build versionless .so for Android

2023-12-31 Thread Joe Conway

On 12/31/23 01:24, Michael Paquier wrote:

On Sun, Dec 31, 2023 at 06:08:04AM +0100, Matthias Kuhn wrote:

I was wondering if there are a) any comments on the approach and b) if it
should be handed in for a commitfest (currently waiting for the cooldown
period after account activation, I am not sure how long that is)


Such discussions are adapted in a commit fest entry.  No idea if there
is a cooldown period after account creation before being able to touch
the CF app contents, though.



FWIW I have expedited the cooldown period, so Matthias can log in right 
away.


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: [17] CREATE SUBSCRIPTION ... SERVER

2023-12-31 Thread Jeff Davis
On Fri, 2023-12-29 at 15:22 -0800, Jeff Davis wrote:
> On Tue, 2023-09-05 at 12:08 -0700, Jeff Davis wrote:
> > OK, so we could have a built-in FDW called pg_connection that would
> > do
> > the right kinds of validation; and then also allow other FDWs but
> > the
> > subscription would have to do its own validation.
> 
> Attached a rough rebased version.

Attached a slightly better version which fixes a pg_dump issue and
improves the documentation.

Regards,
Jeff Davis

From 0b8cb23157b86909d38cc10723f19d94787efed2 Mon Sep 17 00:00:00 2001
From: Jeff Davis 
Date: Wed, 23 Aug 2023 10:31:16 -0700
Subject: [PATCH v4] CREATE SUBSCRIPTION ... SERVER.

---
 doc/src/sgml/ref/alter_subscription.sgml  |  18 +-
 doc/src/sgml/ref/create_subscription.sgml |  16 +-
 doc/src/sgml/user-manag.sgml  |  12 +-
 src/backend/catalog/Makefile  |   1 +
 src/backend/catalog/pg_subscription.c |  17 +-
 src/backend/catalog/system_functions.sql  |   2 +
 src/backend/commands/subscriptioncmds.c   | 197 ++--
 src/backend/foreign/foreign.c | 214 ++
 src/backend/parser/gram.y |  20 ++
 src/backend/replication/logical/worker.c  |  12 +-
 src/bin/pg_dump/pg_dump.c |  63 +-
 src/bin/pg_dump/pg_dump.h |   1 +
 src/bin/psql/describe.c   |   2 +-
 src/bin/psql/tab-complete.c   |   2 +-
 src/include/catalog/meson.build   |   1 +
 src/include/catalog/pg_authid.dat |   5 +
 .../catalog/pg_foreign_data_wrapper.dat   |  22 ++
 src/include/catalog/pg_proc.dat   |   8 +
 src/include/catalog/pg_subscription.h |   5 +-
 src/include/foreign/foreign.h |   1 +
 src/include/nodes/parsenodes.h|   3 +
 src/test/regress/expected/foreign_data.out|  60 -
 src/test/regress/expected/subscription.out|  38 
 src/test/regress/sql/foreign_data.sql |  41 +++-
 src/test/regress/sql/subscription.sql |  39 
 src/test/subscription/t/001_rep_changes.pl|  57 +
 26 files changed, 804 insertions(+), 53 deletions(-)
 create mode 100644 src/include/catalog/pg_foreign_data_wrapper.dat

diff --git a/doc/src/sgml/ref/alter_subscription.sgml b/doc/src/sgml/ref/alter_subscription.sgml
index 6d36ff0dc9..6d219145a9 100644
--- a/doc/src/sgml/ref/alter_subscription.sgml
+++ b/doc/src/sgml/ref/alter_subscription.sgml
@@ -21,6 +21,7 @@ PostgreSQL documentation
 
  
 
+ALTER SUBSCRIPTION name SERVER servername
 ALTER SUBSCRIPTION name CONNECTION 'conninfo'
 ALTER SUBSCRIPTION name SET PUBLICATION publication_name [, ...] [ WITH ( publication_option [= value] [, ... ] ) ]
 ALTER SUBSCRIPTION name ADD PUBLICATION publication_name [, ...] [ WITH ( publication_option [= value] [, ... ] ) ]
@@ -94,13 +95,24 @@ ALTER SUBSCRIPTION name RENAME TO <
 

 
+   
+SERVER servername
+
+ 
+  This clause replaces the foreign server or connection string originally
+  set by  with the foreign server
+  servername.
+ 
+
+   
+

 CONNECTION 'conninfo'
 
  
-  This clause replaces the connection string originally set by
-  .  See there for more
-  information.
+  This clause replaces the foreign server or connection string originally
+  set by  with the connection
+  string conninfo.
  
 

diff --git a/doc/src/sgml/ref/create_subscription.sgml b/doc/src/sgml/ref/create_subscription.sgml
index f1c20b3a46..8cf67516cf 100644
--- a/doc/src/sgml/ref/create_subscription.sgml
+++ b/doc/src/sgml/ref/create_subscription.sgml
@@ -22,7 +22,7 @@ PostgreSQL documentation
  
 
 CREATE SUBSCRIPTION subscription_name
-CONNECTION 'conninfo'
+{ SERVER servername | CONNECTION 'conninfo' }
 PUBLICATION publication_name [, ...]
 [ WITH ( subscription_parameter [= value] [, ... ] ) ]
 
@@ -77,6 +77,15 @@ CREATE SUBSCRIPTION subscription_name

 
+   
+SERVER servername
+
+ 
+  A foreign server to use for the connection.
+ 
+
+   
+

 CONNECTION 'conninfo'
 
@@ -363,6 +372,11 @@ CREATE SUBSCRIPTION subscription_nametrue. Only superusers can set
   this value to false.
  
+ 
+  Only allowed when using a connection string. If using a foreign
+  server, specify password_required as part of the
+  user mapping for the foreign server, instead.
+ 
 

 
diff --git a/doc/src/sgml/user-manag.sgml b/doc/src/sgml/user-manag.sgml
index 92a299d2d3..4f4c20ba3c 100644
--- a/doc/src/sgml/user-manag.sgml
+++ b/doc/src/sgml/user-manag.sgml
@@ -687,11 +687,19 @@ DROP ROLE doomed_role;
Allow use of connection slots reserved via
.
   
+  
+   pg_create_connection
+   Allow users to specify a connection string directly in CREATE
+   SUBSCRIPTION.
+  
   

Re: pg_stat_statements: more test coverage

2023-12-31 Thread Tom Lane
Peter Eisentraut  writes:
> It looks like the failing configurations are exactly all the big-endian 
> ones: s390x, sparc, powerpc.  So it's possible that this is actually a 
> bug?  But unless someone can reproduce this locally and debug it, we 
> should probably revert this for now.

The reason for the platform-dependent behavior is that we're dealing
with a bunch of entries with identical "usage", so it's just about
random which ones actually get deleted.  I do not think our qsort()
has platform-dependent behavior --- but the initial contents of
entry_dealloc's array are filled in hash_seq_search order, and that
*is* platform-dependent.

Now, the test script realizes that hazard.  The bug seems to be that
it's wrong about how many competing usage-count-1 entries there are.
Instrumenting the calls to entry_alloc (which'll call entry_dealloc
when we hit 100 entries), I see

2023-12-31 13:21:39.160 EST client backend[3764867] pg_regress/max LOG:  
calling entry_alloc for 'SELECT pg_stat_statements_reset() IS NOT', cur hash 
size 0
2023-12-31 13:21:39.160 EST client backend[3764867] pg_regress/max LOG:  
calling entry_alloc for '$1', cur hash size 1
2023-12-31 13:21:39.160 EST client backend[3764867] pg_regress/max CONTEXT:  
SQL expression "1"
PL/pgSQL function inline_code_block line 3 at FOR with integer loop 
variable
2023-12-31 13:21:39.160 EST client backend[3764867] pg_regress/max LOG:  
calling entry_alloc for 'format($3, lpad(i::text, $4, $5))', cur hash size 2
2023-12-31 13:21:39.160 EST client backend[3764867] pg_regress/max CONTEXT:  
SQL expression "format('select * from t%s', lpad(i::text, 3, '0'))"
PL/pgSQL function inline_code_block line 4 at EXECUTE
2023-12-31 13:21:39.160 EST client backend[3764867] pg_regress/max LOG:  
calling entry_alloc for 'select * from t001', cur hash size 3
2023-12-31 13:21:39.160 EST client backend[3764867] pg_regress/max CONTEXT:  
SQL statement "select * from t001"
PL/pgSQL function inline_code_block line 4 at EXECUTE
...
2023-12-31 13:21:39.165 EST client backend[3764867] pg_regress/max LOG:  
calling entry_alloc for 'select * from t098', cur hash size 100
2023-12-31 13:21:39.165 EST client backend[3764867] pg_regress/max CONTEXT:  
SQL statement "select * from t098"
PL/pgSQL function inline_code_block line 4 at EXECUTE
2023-12-31 13:21:39.165 EST client backend[3764867] pg_regress/max LOG:  
entry_dealloc: zapping 10 of 100 victims

So the dealloc happens sooner than the script expects, and it's pure
chance that the test appeared to work anyway.

The test case needs to be rewritten to allow for more competing
usage-count-1 entries than it currently does.  Backing "98" down to
"95" might be enough, but I've not experimented (and I'd recommend
leaving more than the minimum amount of headroom, in case plpgsql
changes behavior about how many subexpressions get put into the
table).
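The "zapping 10 of 100" figure in the log above falls straight out of entry_dealloc's arithmetic; a standalone sketch of that calculation (assuming USAGE_DEALLOC_PERCENT at its default of 5, with hypothetical helper names):

```c
#include <assert.h>

#define USAGE_DEALLOC_PERCENT 5	/* pg_stat_statements default */

static int imax(int a, int b) { return a > b ? a : b; }
static int imin(int a, int b) { return a < b ? a : b; }

/*
 * Mirrors entry_dealloc's victim-count arithmetic: i is the number of
 * entries in the hash table when deallocation kicks in.
 */
static int
nvictims_for(int i)
{
	int			nvictims = imax(10, i * USAGE_DEALLOC_PERCENT / 100);

	return imin(nvictims, i);
}
```

So any dealloc triggered at 100 entries removes 10 of them, regardless of how many usage-count-1 entries the test script thought it had accumulated.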

Strongly recommend that while fixing the test, you stick in some
debugging elogs to verify when the dealloc calls actually happen
rather than assuming you predicted it correctly.  I did it as
attached.

regards, tom lane

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 6f62a531f7..ffdc14a1eb 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -1373,6 +1373,10 @@ pgss_store(const char *query, uint64 queryId,
 		if (!stored)
 			goto done;
 
+		elog(LOG, "calling entry_alloc for '%.40s', cur hash size %ld",
+			 norm_query ? norm_query : query, 
+			 hash_get_num_entries(pgss_hash));
+
 		/* OK to create a new hashtable entry */
 		entry = entry_alloc(, query_offset, query_len, encoding,
 			jstate != NULL);
@@ -2160,6 +2164,8 @@ entry_dealloc(void)
 	nvictims = Max(10, i * USAGE_DEALLOC_PERCENT / 100);
 	nvictims = Min(nvictims, i);
 
+	elog(LOG, "entry_dealloc: zapping %d of %d victims", nvictims, i);
+
 	for (i = 0; i < nvictims; i++)
 	{
 		hash_search(pgss_hash, [i]->key, HASH_REMOVE, NULL);


Re: Streaming I/O, vectored I/O (WIP)

2023-12-31 Thread Melanie Plageman
I've written a new version of the vacuum streaming read user on top of
the rebased patch set [1]. It differs substantially from Andres' and
includes several refactoring patches that can apply on top of master.
As such, I've proposed those in a separate thread [2]. I noticed mac
and windows fail to build on CI for my branch with the streaming read
code. I haven't had a chance to investigate  -- but I must have done
something wrong on rebase.

- Melanie

[1] https://github.com/melanieplageman/postgres/tree/stepwise_vac_streaming_read
[2] 
https://www.postgresql.org/message-id/CAAKRu_Yf3gvXGcCnqqfoq0Q8LX8UM-e-qbm_B1LeZh60f8WhWA%40mail.gmail.com




Confine vacuum skip logic to lazy_scan_skip

2023-12-31 Thread Melanie Plageman
Hi,

I've written a patch set for vacuum to use the streaming read interface
proposed in [1]. Making lazy_scan_heap() async-friendly required a bit
of refactoring of lazy_scan_heap() and lazy_scan_skip(). I needed to
confine all of the skipping logic -- previously spread across
lazy_scan_heap() and lazy_scan_skip() -- to lazy_scan_skip(). All of the
patches doing this and other preparation for vacuum to use the streaming
read API can be applied on top of master. The attached patch set does
this.

There are a few comments that still need to be updated. I also noticed I
needed to reorder and combine a couple of the commits. I wanted to
register this for the january commitfest, so I didn't quite have time
for the finishing touches.

- Melanie

[1] 
https://www.postgresql.org/message-id/flat/CA%2BhUKGJkOiOCa%2Bmag4BF%2BzHo7qo%3Do9CFheB8%3Dg6uT5TUm2gkvA%40mail.gmail.com
From 9cd579d6a20aef2aeeab6ef50d72e779d75bf7cd Mon Sep 17 00:00:00 2001
From: Melanie Plageman 
Date: Sat, 30 Dec 2023 16:18:40 -0500
Subject: [PATCH v1 1/7] lazy_scan_skip remove unnecessary local var rel_pages

lazy_scan_skip() only uses vacrel->rel_pages twice, so there seems to be
no reason to save it in a local variable, rel_pages.
---
 src/backend/access/heap/vacuumlazy.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 3b9299b8924..c4e0c077694 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1302,13 +1302,12 @@ static BlockNumber
 lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 			   bool *next_unskippable_allvis, bool *skipping_current_range)
 {
-	BlockNumber rel_pages = vacrel->rel_pages,
-next_unskippable_block = next_block,
+	BlockNumber next_unskippable_block = next_block,
 nskippable_blocks = 0;
 	bool		skipsallvis = false;
 
 	*next_unskippable_allvis = true;
-	while (next_unskippable_block < rel_pages)
+	while (next_unskippable_block < vacrel->rel_pages)
 	{
 		uint8		mapbits = visibilitymap_get_status(vacrel->rel,
 	   next_unskippable_block,
@@ -1331,7 +1330,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 		 *
 		 * Implement this by always treating the last block as unsafe to skip.
 		 */
-		if (next_unskippable_block == rel_pages - 1)
+		if (next_unskippable_block == vacrel->rel_pages - 1)
 			break;
 
 		/* DISABLE_PAGE_SKIPPING makes all skipping unsafe */
-- 
2.37.2

From 314dd9038593610583e4fe60ab62e0d46ea3be86 Mon Sep 17 00:00:00 2001
From: Melanie Plageman 
Date: Sat, 30 Dec 2023 16:30:59 -0500
Subject: [PATCH v1 2/7] lazy_scan_skip remove unneeded local var
 nskippable_blocks

nskippable_blocks can be easily derived from next_unskippable_block's
progress when compared to the passed in next_block.
---
 src/backend/access/heap/vacuumlazy.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index c4e0c077694..3b28ea2cdb5 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1302,8 +1302,7 @@ static BlockNumber
 lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 			   bool *next_unskippable_allvis, bool *skipping_current_range)
 {
-	BlockNumber next_unskippable_block = next_block,
-nskippable_blocks = 0;
+	BlockNumber next_unskippable_block = next_block;
 	bool		skipsallvis = false;
 
 	*next_unskippable_allvis = true;
@@ -1360,7 +1359,6 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 
 		vacuum_delay_point();
 		next_unskippable_block++;
-		nskippable_blocks++;
 	}
 
 	/*
@@ -1373,7 +1371,7 @@ lazy_scan_skip(LVRelState *vacrel, Buffer *vmbuffer, BlockNumber next_block,
 	 * non-aggressive VACUUMs.  If the range has any all-visible pages then
 	 * skipping makes updating relfrozenxid unsafe, which is a real downside.
 	 */
-	if (nskippable_blocks < SKIP_PAGES_THRESHOLD)
+	if (next_unskippable_block - next_block < SKIP_PAGES_THRESHOLD)
 		*skipping_current_range = false;
 	else
 	{
-- 
2.37.2

From 67043818003faa9cf3cdf10e6fdc6cbf6f8eee4c Mon Sep 17 00:00:00 2001
From: Melanie Plageman 
Date: Sat, 30 Dec 2023 16:22:12 -0500
Subject: [PATCH v1 3/7] Add lazy_scan_skip unskippable state

Future commits will remove all skipping logic from lazy_scan_heap() and
confine it to lazy_scan_skip(). To make those commits more clear, first
introduce the struct, VacSkipState, which will maintain the variables
needed to skip ranges less than SKIP_PAGES_THRESHOLD.

While we are at it, add additional information to the lazy_scan_skip()
comment, including descriptions of the role and expectations for its
function parameters.
---
 src/backend/access/heap/vacuumlazy.c | 105 ---
 src/tools/pgindent/typedefs.list |   1 +
 2 files changed, 64 insertions(+), 42 deletions(-)

diff 

Re: Permute underscore separated components of columns before fuzzy matching

2023-12-31 Thread Arne Roland

Hi!

Mikhail Gribkov  writes:

> > Honestly I'm not entirely sure fixing only two switched words is worth the
> > effort, but the declared goal is clearly achieved.
> >
> > I think the patch is good to go, although you need to fix code formatting.
>
> I took a brief look at this.  I concur that we shouldn't need to be
> hugely concerned about the speed of this code path.  However, we *do*
> need to be concerned about its maintainability, and I think the patch
> falls down badly there: it adds a chunk of very opaque and essentially
> undocumented code, that people will need to reverse-engineer anytime
> they are studying this function.  That could be alleviated perhaps
> with more work on comments, but I have to wonder whether it's worth
> carrying this logic at all.  It's a rather strange behavior to add,
> and I wonder if many users will want it.

I encounter this problem all the time. I don't know whether my clients 
are representative, but I see the problem whenever developers show me 
their code base.
It's an issue for column names and table names alike. I have personally 
spent hours watching developers try various permutations.
They rarely request this feature. Usually they are too embarrassed about 
not knowing their object names to request anything in that state.
But I want the database I support to be gentle and helpful to the user 
under these circumstances.


Regarding complexity: I think the permutation matrix is the thing that is 
easy to get wrong. I had a one-off bug writing it down initially.
I tried to explain the conceptual approach better with a longer comment 
than before.


    /*
     * Only consider mirroring permutations, since the three simple
     * rotations are already (or will be for a later underscore_current)
     * covered above.
     *
     * The entries of the permutation matrix tell us where we should copy
     * the tree segments to. The zeroth dimension iterates over the
     * permutations, while the first dimension iterates over the three
     * segments being permuted.
     * Considering the string A_B_C, the three segments are:
     * - before the initial underscore sections (A)
     * - between the underscore sections (B)
     * - after the later underscore sections (C)
     */
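As a rough standalone illustration of the mirroring permutations the comment describes (hypothetical helper, not the patch's actual code):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Sketch: the order in which the three segments of A_B_C are reassembled
 * for each mirroring permutation (the simple rotations are handled
 * separately, as noted in the comment above).
 */
static const int mirror_perm[3][3] = {
	{0, 2, 1},					/* A_C_B */
	{1, 0, 2},					/* B_A_C */
	{2, 1, 0},					/* C_B_A */
};

static const char *
permuted(const char *a, const char *b, const char *c, int which)
{
	static char out[256];
	const char *seg[3] = {a, b, c};

	snprintf(out, sizeof(out), "%s_%s_%s",
			 seg[mirror_perm[which][0]],
			 seg[mirror_perm[which][1]],
			 seg[mirror_perm[which][2]]);
	return out;
}
```

Each row of the matrix leaves exactly one segment in place and swaps the other two, which is what distinguishes the mirrorings from the rotations.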

If anything is still unclear, I'd appreciate feedback about what might 
be confusing about this.
I can't promise to be helpful if something breaks. But I have 
practically forgotten how I did it, and I still found it easy to extend 
as described below. It would have been embarrassing otherwise. Yet this 
gives me hope that it should be possible to enable others the same way.
I certainly want the code to be simple, with no need to reverse-engineer 
anything. Please let me know if there are difficult-to-understand bits 
left around.


> One thing that struck me is that no care is being taken for adjacent
> underscores (that is, "foo__bar" and similar cases).  It seems
> unlikely that treating the zero-length substring between the
> underscores as a word to permute is helpful; moreover, it adds
> an edge case that the string-moving logic could easily get wrong.
> I wonder if the code should treat any number of consecutive
> underscores as a single separator.  (Somewhat related: I think it
> will behave oddly when the first or last character is '_', since the
> outer loop ignores those positions.)

I wasn't sure how there could be any potential future bug in copying 
zero-length strings, i.e. doing nothing. And I still don't see one.


There is one point I agree with: doing this seems rarely helpful. I 
changed the code so that it treats sections delimited by an arbitrary 
number of underscores as a single separator.
So it never permutes with zero-length strings within. I also added 
functionality to skip the zero-length cases if we should encounter them 
at the end of the string.
So afaict there should be no zero-length swaps left. Please let me know 
whether this is more to your liking.


I also replaced the hard limit of underscores with more nuanced limits 
of permutations to try before giving up.


> > And it would be much more convenient to work with your patch if 
every next

> > version file will have a unique name (maybe something like "_v2", "_v3"
> > etc. suffixes)
>
>
> Please.  It's very confusing when there are multiple identically-named
> patches in a thread.

Sorry, I started doing this because I had confused the cf bot in the past 
about whether the patches should be applied on top of each other or not.


For me the cf-bot logic is a bit opaque there. But you are right, 
confusing patch readers is definitely worse. I'll try to use unique names. 
I hope the attached format is better.



One question about pgindent: I struggled a bit with getting the right 
version of bsd_indent. I found versions labeled 2.2.1 and 2.1.1, but 
apparently we work with 2.1.2. 

Re: pg_stat_statements: more test coverage

2023-12-31 Thread Tom Lane
Peter Eisentraut  writes:
> It looks like the failing configurations are exactly all the big-endian 
> ones: s390x, sparc, powerpc.  So it's possible that this is actually a 
> bug?  But unless someone can reproduce this locally and debug it, we 
> should probably revert this for now.

I see it failed on my animal mamba, so I should be able to reproduce
it there.  Will look.

regards, tom lane




Re: Fixing backslash dot for COPY FROM...CSV

2023-12-31 Thread Daniel Verite
  Hi,

The CI patch tester fails on this patch, because it has a label
at the end of a C block, which I'm learning is a C23 feature
that happens to be supported by gcc 11 [1], but is not portable.

PFA an update fixing this, plus removing an obsolete chunk
in the COPY documentation that v2 left out.


[1] https://gcc.gnu.org/onlinedocs/gcc/Mixed-Labels-and-Declarations.html


Best regards,
-- 
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite
From 98f38aff440ad683aa1bd7a30fd960f1d0101191 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Daniel=20V=C3=A9rit=C3=A9?= 
Date: Sun, 31 Dec 2023 14:47:05 +0100
Subject: [PATCH v4] Support backslash-dot on a line by itself as valid data in
 COPY FROM...CSV

---
 doc/src/sgml/ref/copy.sgml   |  6 +--
 doc/src/sgml/ref/psql-ref.sgml   |  4 --
 src/backend/commands/copyfromparse.c | 57 +++-
 src/bin/psql/copy.c  | 32 +++-
 src/test/regress/expected/copy.out   | 18 +
 src/test/regress/sql/copy.sql| 12 ++
 6 files changed, 67 insertions(+), 62 deletions(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 18ecc69c33..e719053e3d 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -761,11 +761,7 @@ COPY count
 format, \., the end-of-data marker, could also appear
 as a data value.  To avoid any misinterpretation, a \.
 data value appearing as a lone entry on a line is automatically
-quoted on output, and on input, if quoted, is not interpreted as the
-end-of-data marker.  If you are loading a file created by another
-application that has a single unquoted column and might have a
-value of \., you might need to quote that value in the
-input file.
+quoted on output.

 

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index cc7d797159..d94e3cacfc 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1119,10 +1119,6 @@ INSERT INTO tbl1 VALUES ($1, $2) \bind 'first value' 
'second value' \g
 destination, because all data must pass through the client/server
 connection.  For large amounts of data the SQL
 command might be preferable.
-Also, because of this pass-through method, \copy
-... from in CSV mode will erroneously
-treat a \. data value alone on a line as an
-end-of-input marker.
 
 
 
diff --git a/src/backend/commands/copyfromparse.c 
b/src/backend/commands/copyfromparse.c
index f553734582..67e046d450 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -136,14 +136,6 @@ if (1) \
} \
 } else ((void) 0)
 
-/* Undo any read-ahead and jump out of the block. */
-#define NO_END_OF_COPY_GOTO \
-if (1) \
-{ \
-   input_buf_ptr = prev_raw_ptr + 1; \
-   goto not_end_of_copy; \
-} else ((void) 0)
-
 /* NOTE: there's a copy of this in copyto.c */
 static const char BinarySignature[11] = "PGCOPY\n\377\r\n\0";
 
@@ -1137,7 +1129,6 @@ CopyReadLineText(CopyFromState cstate)
boolresult = false;
 
/* CSV variables */
-   boolfirst_char_in_line = true;
boolin_quote = false,
last_was_esc = false;
charquotec = '\0';
@@ -1335,10 +1326,9 @@ CopyReadLineText(CopyFromState cstate)
}
 
/*
-* In CSV mode, we only recognize \. alone on a line.  This is 
because
-* \. is a valid CSV data value.
+* In CSV mode, backslash is a normal character.
 */
-   if (c == '\\' && (!cstate->opts.csv_mode || first_char_in_line))
+   if (c == '\\' && !cstate->opts.csv_mode)
{
charc2;
 
@@ -1371,21 +1361,15 @@ CopyReadLineText(CopyFromState cstate)
 
if (c2 == '\n')
{
-   if (!cstate->opts.csv_mode)
-   ereport(ERROR,
-   
(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
-
errmsg("end-of-copy marker does not match previous newline style")));
-   else
-   NO_END_OF_COPY_GOTO;
+   ereport(ERROR,
+   
(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
+
errmsg("end-of-copy marker does not match previous newline style")));
}
else if 

Re: Password leakage avoidance

2023-12-31 Thread Jonathan S. Katz

On 12/31/23 9:50 AM, Magnus Hagander wrote:

On Wed, Dec 27, 2023 at 10:31 PM Jonathan S. Katz  wrote:


On 12/24/23 12:15 PM, Tom Lane wrote:


Maybe we need a PQcreaterole that provide the mechanism to set passwords
safely. It'd likely need to take all the options need for creating a
role, but that would at least give the underlying mechanism to ensure
we're always sending a hashed password to the server.


I'm kind of down on that, because it seems like it'd be quite hard to
design an easy-to-use C API that doesn't break the next time somebody
adds another option to CREATE USER.  What's so wrong with suggesting
that the password be set in a separate step?  (For comparison, typical
Unix utilities like useradd(8) also tell you to set the password
separately.)


Modern development frameworks tend to reduce things down to one step,
even for fairly complex operations. Additionally, a lot of these frameworks
don't even require a developer to build backend applications that
involve doing actual work on the backend (e.g. UNIX), so the approach
of useradd(8) et al. is not familiar. Adding the additional step would
lead to errors, e.g. the developer not calling the "change password"
function to create the obfuscated password. Granted, we can push the
problem down to driver authors to "be better" and simplify the process
for their end users, but that can still be error prone, having seen this
with driver authors implementing PostgreSQL SCRAM and having made
(correctable) mistakes that could have led to security issues.


This seems to confuse "driver" with "framework".

I would say the "two step" approach is perfectly valid for a driver
whereas as you say most people building say webapps or similar on top
of a framework will expect it to handle things for them. But that's
more of a framework thing than a driver thing, depending on
terminology. E.g. it would be up to the "Postgres support driver" in
django/rails/whatnot to reduce it down to one step, not to a low level
driver like libpq (or other low level drivers).

None of those frameworks are likely to want to require direct driver
access anyway, they *want* to take control of that process in my
experience.


Fair point on the framework/driver comparison, but the above still 
applies to drivers. As mentioned above, non-libpq drivers did make 
mistakes that could have led to security issues while implementing 
PostgreSQL SCRAM. Additionally, CVE-2021-23222[1] presented itself in 
both libpq and non-libpq drivers, either through the issue itself, or 
through implementing the protocol step in a way similar to libpq.


Keeping the implementation surface area simpler for driver maintainers 
does generally help mitigate further issues, though I'd defer to the 
driver maintainers if they agree with that statement.


Thanks,

Jonathan

[1] https://www.postgresql.org/support/security/CVE-2021-23222/



OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: Password leakage avoidance

2023-12-31 Thread Magnus Hagander
On Wed, Dec 27, 2023 at 10:31 PM Jonathan S. Katz  wrote:
>
> On 12/24/23 12:15 PM, Tom Lane wrote:
>
> >> Maybe we need a PQcreaterole that provide the mechanism to set passwords
> >> safely. It'd likely need to take all the options need for creating a
> >> role, but that would at least give the underlying mechanism to ensure
> >> we're always sending a hashed password to the server.
> >
> > I'm kind of down on that, because it seems like it'd be quite hard to
> > design an easy-to-use C API that doesn't break the next time somebody
> > adds another option to CREATE USER.  What's so wrong with suggesting
> > that the password be set in a separate step?  (For comparison, typical
> > Unix utilities like useradd(8) also tell you to set the password
> > separately.)
>
> Modern development frameworks tend to reduce things down to one step,
> even for fairly complex operations. Additionally, a lot of these frameworks
> don't even require a developer to build backend applications that
> involve doing actual work on the backend (e.g. UNIX), so the approach
> of useradd(8) et al. is not familiar. Adding the additional step would
> lead to errors, e.g. the developer not calling the "change password"
> function to create the obfuscated password. Granted, we can push the
> problem down to driver authors to "be better" and simplify the process
> for their end users, but that can still be error prone, having seen this
> with driver authors implementing PostgreSQL SCRAM and having made
> (correctable) mistakes that could have led to security issues.

This seems to confuse "driver" with "framework".

I would say the "two step" approach is perfectly valid for a driver
whereas as you say most people building say webapps or similar on top
of a framework will expect it to handle things for them. But that's
more of a framework thing than a driver thing, depending on
terminology. E.g. it would be up to the "Postgres support driver" in
django/rails/whatnot to reduce it down to one step, not to a low level
driver like libpq (or other low level drivers).

None of those frameworks are likely to want to require direct driver
access anyway, they *want* to take control of that process in my
experience.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: Remove useless GROUP BY columns considering unique index

2023-12-31 Thread jian he
On Fri, Dec 29, 2023 at 11:05 PM Zhang Mingli  wrote:
>
> Hi,
>
> This idea first came from "remove_useless_groupby_columns does not need to
> record constraint dependencies" [0], which points out that a
> unique index whose columns all have NOT NULL constraints could also do the
> work of a primary key when removing useless GROUP BY columns.
> I studied it and implemented the idea.
>
> [0]https://www.postgresql.org/message-id/flat/CAApHDvrdYa%3DVhOoMe4ZZjZ-G4ALnD-xuAeUNCRTL%2BPYMVN8OnQ%40mail.gmail.com
>

cosmetic issues:
+--
+-- Test removal of redundant GROUP BY columns using unique not null index.
+-- materialized view
+--
+create temp table t1 (a int, b int, c int, primary key (a, b), unique(c));
+create temp table t2 (a int, b int, c int not null, primary key (a,
b), unique(c));
+create temp table t3 (a int, b int, c int not null, d int not null,
primary key (a, b), unique(c, d));
+create temp table t4 (a int, b int not null, c int not null, d int
not null, primary key (a, b), unique(b, c), unique(d));
+create temp table t5 (a int not null, b int not null, c int not null,
d int not null, unique (a, b), unique(b, c), unique(a, c, d));
+
+-- Test unique index without not null constraint should not be used.
+explain (costs off) select * from t1 group by a,b,c;
+
+-- Test unique not null index beats primary key.
+explain (costs off) select * from t2 group by a,b,c;
+
+-- Test primary key beats unique not null index.
+explain (costs off) select * from t3 group by a,b,c,d;
+
+-- Test unique not null index with less columns wins.
+explain (costs off) select * from t4 group by a,b,c,d;
+
+-- Test unique not null indices have overlap.
+explain (costs off) select * from t5 group by a,b,c,d;
+
+drop table t1;
+drop table t2;
+drop table t3;
+drop table t4;
+

the line `-- materialized view` is unnecessary?
you forgot to drop table t5.

create temp table p_t2 (
  a int not null,
  b int not null,
  c int not null,
  d int,
  unique(a), unique(a,b),unique(a,b,c)
) partition by list(a);
create temp table p_t2_1 partition of p_t2 for values in(1);
create temp table p_t2_2 partition of p_t2 for values in(2);
explain (costs off, verbose) select * from p_t2 group by a,b,c,d;
your patch output:
  QUERY PLAN
--
 HashAggregate
   Output: p_t2.a, p_t2.b, p_t2.c, p_t2.d
   Group Key: p_t2.a
   ->  Append
 ->  Seq Scan on pg_temp.p_t2_1
   Output: p_t2_1.a, p_t2_1.b, p_t2_1.c, p_t2_1.d
 ->  Seq Scan on pg_temp.p_t2_2
   Output: p_t2_2.a, p_t2_2.b, p_t2_2.c, p_t2_2.d
(8 rows)

so it seems to work with partitioned tables, maybe you should add some
test cases with partition tables.


- * XXX This is only used to create derived tables, so NO INHERIT constraints
- * are always skipped.
+ * XXX When used to create derived tables, set skip_no_inherit to true,
+ * so that NO INHERIT constraints will be skipped.
  */
 List *
-RelationGetNotNullConstraints(Oid relid, bool cooked)
+RelationGetNotNullConstraints(Oid relid, bool cooked, bool skip_no_inherit)
 {
  List   *notnulls = NIL;
  Relation constrRel;
@@ -815,7 +815,7 @@ RelationGetNotNullConstraints(Oid relid, bool cooked)

  if (conForm->contype != CONSTRAINT_NOTNULL)
  continue;
- if (conForm->connoinherit)
+ if (skip_no_inherit && conForm->connoinherit)
  continue;

I don't think you need to refactor RelationGetNotNullConstraints.
However, I found it hard to explain (which one is the parent and which
one is the child is very confusing for me),
so I used the following script to test it:
DROP TABLE ATACC1, ATACC2;
CREATE TABLE ATACC1 (a int);
CREATE TABLE ATACC2 (b int,c int, unique(c)) INHERITS (ATACC1);
ALTER TABLE ATACC1 ADD NOT NULL a;
-- ALTER TABLE ATACC1 ADD NOT NULL a NO INHERIT;
ALTER TABLE ATACC2 ADD unique(a);
explain (costs off, verbose) select * from ATACC2 group by a,b,c;
-

create table t_0_3 (a int not null, b int not null, c int not null, d
int not null, unique (a, b), unique(b, c) DEFERRABLE, unique(d));
explain (costs off, verbose) select * from t_0_3 group by a,b,c,d;
   QUERY PLAN

 HashAggregate
   Output: a, b, c, d
   Group Key: t_0_3.a, t_0_3.b
   ->  Seq Scan on public.t_0_3
 Output: a, b, c, d
(5 rows)

the above part seems not right, it should be `Group Key: t_0_3.d`
so i changed to:

/* Skip constraints that are not UNIQUE */
+ if (con->contype != CONSTRAINT_UNIQUE)
+ continue;
+
+ /* Skip unique constraints that are condeferred */
+ if (con->condeferrable && con->condeferred)
+ continue;

I guess you have probably noticed that in the function
remove_useless_groupby_columns,
get_primary_key_attnos is followed by get_min_unique_not_null_attnos.
Also, get_min_unique_not_null_attnos's main code is very similar to
that of get_primary_key_attnos.

They both do `pg_constraint = table_open(ConstraintRelationId,
AccessShareLock);`
you might need to explain why we must open pg_constraint twice.
either 

Re: Autonomous transactions 2023, WIP

2023-12-31 Thread Ivan Kush



On 24.12.2023 15:38, Pavel Stehule wrote:
Can you show some benchmarks? I don't like this system too much but 
maybe it can work enough.


Still I am interested in possible use cases. If it should be used only 
for logging, then we can implement something less generic, but surely 
with better performance and stability. Logging to tables is a little 
bit outdated.


Regards

Pavel


All use cases of pg_background, except asynchronous execution. If we later 
add asynchronous execution, then all of them =)


For example, also:

* conversion from Oracle's `PRAGMA AUTONOMOUS` to Postgres.

* possibility to create functions that call utility statements, like 
VACUUM, etc.


I don't have good benchmarks now. Only some simple ones, like many INSERTs. 
The pool gives an advantage: more tps compared to pg_background as the 
number of backends increases.


The main advantage over pg_background is the pool of workers. In this patch 
a separate pool is created for each backend. At the current time I'm 
coding one shared pool for all backends.





 > 2. although the Oracle syntax is interesting, and I proposed
PRAGMA
more times,  it doesn't allow this functionality in other PL

2. Add `AUTONOMOUS` to `BEGIN` instead of `PRAGMA` in `DECLARE`?
`BEGIN
AUTONOMOUS`.
It shows immediately that we are in autonomous session, no need to
search in subsequent lines for keyword.

```
CREATE FUNCTION foo() RETURNS void AS $$
BEGIN AUTONOMOUS
   INSERT INTO tbl VALUES (1);
   BEGIN AUTONOMOUS
    
    END;
END;
$$ LANGUAGE plpgsql;
```

 > CREATE OR REPLACE FUNCTION ...
 > AS $$
 > $$ LANGUAGE plpgsql AUTONOMOUS TRANSACTION;

The downside with the keyword in function declaration, that we
will not
be able to create autonomous subblocks. With `PRAGMA AUTONOMOUS` or
`BEGIN AUTONOMOUS` it's possible to create them.

```
-- BEGIN AUTONOMOUS

CREATE FUNCTION foo() RETURNS void AS $$
BEGIN
   INSERT INTO tbl VALUES (1);
   BEGIN AUTONOMOUS
     INSERT INTO tbl VALUES (2);
   END;
END;
$$ LANGUAGE plpgsql;


-- or PRAGMA AUTONOMOUS

CREATE FUNCTION foo() RETURNS void AS $$
BEGIN
   INSERT INTO tbl VALUES (1);
   BEGIN
   DECLARE AUTONOMOUS_TRANSACTION;
     INSERT INTO tbl VALUES (2);
   END;
END;
$$ LANGUAGE plpgsql;


START TRANSACTION;
foo();
ROLLBACK;
```

```
Output:
2
```

 > it doesn't allow this functionality in other PL

I haven't worked on other PLs at the current time, but...

## Python

In plpython we could use context managers, like was proposed in
Peter's
patch. ```

with plpy.autonomous() as a:
 a.execute("INSERT INTO tbl VALUES (1) ");

```

## Perl

I don't programm in Perl. But googling shows Perl supports subroutine
attributes. Maybe add `autonomous` attribute for autonomous execution?

```
sub foo :autonomous {
}
```

https://www.perl.com/article/untangling-subroutine-attributes/


 > Heikki wrote about the possibility to support threads in Postgres.

3. Do you mean this thread?

https://www.postgresql.org/message-id/flat/31cc6df9-53fe-3cd9-af5b-ac0d801163f4%40iki.fi
Thanks for info. Will watch it. Unfortunately it takes many years to
implement threads =(

 > Surely, the first topic should be the method of implementation.
Maybe
I missed it, but there is no agreement of background worker based.
I agree. No consensus at the current time.
Pros of bgworkers are:
1. this entity is already in Postgres.
2. possibility of asynchronous execution of autonomous session in the
future. Like in pg_background extension. For asynchronous
execution we
need a separate process, bgworkers are this separate process.

Also maybe later we could create autonomous workers themselves without
using bgworkers internally: launching a separate process, etc. But I
think there will be much code in common with bgworkers.


On 21.12.2023 12:35, Pavel Stehule wrote:
> Hi
>
> although I like the idea related to autonomous transactions, I
don't
> think so this way is the best
>
> 1. The solution based on background workers looks too fragile -
it can
> be easy to exhaust all background workers, and because this
feature is
> proposed mainly for logging, then it is a little bit dangerous,
> because it means loss of possibility of logging.
>
> 2. although the Oracle syntax is interesting, and I proposed PRAGMA
> more times,  it doesn't allow this functionality in other PL
>
> I don't propose exactly  firebird syntax
>
https://firebirdsql.org/refdocs/langrefupd25-psql-autonomous-trans.html,

> but I think this solution is better than ADA's PRAGMAs. I can
imagine
> some special flag for function like
>
> CREATE OR REPLACE FUNCTION ...
> AS $$
> $$ 

Network failure may prevent promotion

2023-12-31 Thread Kyotaro Horiguchi
(Apology for resubmitting due to poor subject of the previous mail)
---
Hello.

We've noticed that when walreceiver is waiting for a connection to
complete, standby does not immediately respond to promotion
requests. In PG14, upon receiving a promotion request, walreceiver
terminates instantly, but in PG16, it waits for connection
timeout. This behavior is attributed to commit 728f86fec65, where a
part of libpqrcv_connect was simply replaced with a call to
libpqsrv_connect_params. This behavior can be verified by simply
dropping packets from the standby to the primary.

By a simple thought, in walreceiver, libpqsrv_connect_internal could
just call ProcessWalRcvInterrupts() instead of CHECK_FOR_INTERRUPTS(),
but this approach is quite ugly. Since ProcessWalRcvInterrupts()
originally calls CHECK_FOR_INTERRUPTS() and there are no standalone
calls to CHECK_FOR_INTERRUPTS() within walreceiver, I think it might
be better to use ProcDiePending instead of ShutdownRequestPending.  I
added a subset function of die() as the SIGTERM handler in walsender
in a crude patch attached.

What do you think about the issue, and the approach?

If there are no issues or objections with this method, I will continue
to refine this patch. For now, I plan to register it for the upcoming
commitfest.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




libpqsrv_connect_params should call ProcessWalRcvInterrupts

2023-12-31 Thread Kyotaro Horiguchi
Hello.

We've noticed that when walreceiver is waiting for a connection to
complete, standby does not immediately respond to promotion
requests. In PG14, upon receiving a promotion request, walreceiver
terminates instantly, but in PG16, it waits for connection
timeout. This behavior is attributed to commit 728f86fec65, where a
part of libpqrcv_connect was simply replaced with a call to
libpqsrv_connect_params. This behavior can be verified by simply
dropping packets from the standby to the primary.

By a simple thought, in walreceiver, libpqsrv_connect_internal could
just call ProcessWalRcvInterrupts() instead of CHECK_FOR_INTERRUPTS(),
but this approach is quite ugly. Since ProcessWalRcvInterrupts()
originally calls CHECK_FOR_INTERRUPTS() and there are no standalone
calls to CHECK_FOR_INTERRUPTS() within walreceiver, I think it might
be better to use ProcDiePending instead of ShutdownRequestPending.  I
added a subset function of die() as the SIGTERM handler in walsender
in a crude patch attached.

What do you think about the issue, and the approach?

If there are no issues or objections with this method, I will continue
to refine this patch. For now, I plan to register it for the upcoming
commitfest.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 693b3669ba..e503799bd8 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -738,7 +738,7 @@ libpqrcv_PQgetResult(PGconn *streamConn)
 		if (rc & WL_LATCH_SET)
 		{
 			ResetLatch(MyLatch);
-			ProcessWalRcvInterrupts();
+			CHECK_FOR_INTERRUPTS();
 		}
 
 		/* Consume whatever data is available from the socket */
@@ -1042,7 +1042,7 @@ libpqrcv_processTuples(PGresult *pgres, WalRcvExecResult *walres,
 	{
 		char	   *cstrs[MaxTupleAttributeNumber];
 
-		ProcessWalRcvInterrupts();
+		CHECK_FOR_INTERRUPTS();
 
 		/* Do the allocations in temporary context. */
 		oldcontext = MemoryContextSwitchTo(rowcontext);
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 26ded928a7..c53a8e6c89 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -147,39 +147,34 @@ static void XLogWalRcvSendReply(bool force, bool requestReply);
 static void XLogWalRcvSendHSFeedback(bool immed);
 static void ProcessWalSndrMessage(XLogRecPtr walEnd, TimestampTz sendTime);
 static void WalRcvComputeNextWakeup(WalRcvWakeupReason reason, TimestampTz now);
+static void WalRcvShutdownSignalHandler(SIGNAL_ARGS);
 
-/*
- * Process any interrupts the walreceiver process may have received.
- * This should be called any time the process's latch has become set.
- *
- * Currently, only SIGTERM is of interest.  We can't just exit(1) within the
- * SIGTERM signal handler, because the signal might arrive in the middle of
- * some critical operation, like while we're holding a spinlock.  Instead, the
- * signal handler sets a flag variable as well as setting the process's latch.
- * We must check the flag (by calling ProcessWalRcvInterrupts) anytime the
- * latch has become set.  Operations that could block for a long time, such as
- * reading from a remote server, must pay attention to the latch too; see
- * libpqrcv_PQgetResult for example.
- */
 void
-ProcessWalRcvInterrupts(void)
+WalRcvShutdownSignalHandler(SIGNAL_ARGS)
 {
-	/*
-	 * Although walreceiver interrupt handling doesn't use the same scheme as
-	 * regular backends, call CHECK_FOR_INTERRUPTS() to make sure we receive
-	 * any incoming signals on Win32, and also to make sure we process any
-	 * barrier events.
-	 */
-	CHECK_FOR_INTERRUPTS();
+	int			save_errno = errno;
 
-	if (ShutdownRequestPending)
+	/* Don't joggle the elbow of proc_exit */
+	if (!proc_exit_inprogress)
 	{
-		ereport(FATAL,
-(errcode(ERRCODE_ADMIN_SHUTDOWN),
- errmsg("terminating walreceiver process due to administrator command")));
+		InterruptPending = true;
+		ProcDiePending = true;
 	}
+
+	SetLatch(MyLatch);
+
+	errno = save_errno;
+	
 }
 
+/*
+ * Is current process a wal receiver?
+ */
+bool
+IsWalReceiver(void)
+{
+	return WalRcv != NULL;
+}
 
 /* Main entry point for walreceiver process */
 void
@@ -277,7 +272,7 @@ WalReceiverMain(void)
 	pqsignal(SIGHUP, SignalHandlerForConfigReload); /* set flag to read config
 	 * file */
 	pqsignal(SIGINT, SIG_IGN);
-	pqsignal(SIGTERM, SignalHandlerForShutdownRequest); /* request shutdown */
+	pqsignal(SIGTERM, WalRcvShutdownSignalHandler); /* request shutdown */
 	/* SIGQUIT handler was already set up by InitPostmasterChild */
 	pqsignal(SIGALRM, SIG_IGN);
 	pqsignal(SIGPIPE, SIG_IGN);
@@ -456,7 +451,7 @@ WalReceiverMain(void)
 			 errmsg("cannot continue WAL streaming, recovery has already ended")));
 
 /* Process any requests or signals received recently */
-

Re: pg_stat_statements: more test coverage

2023-12-31 Thread Peter Eisentraut

On 31.12.23 10:26, Julien Rouhaud wrote:

On Sun, Dec 31, 2023 at 2:28 PM Michael Paquier  wrote:


On Sat, Dec 30, 2023 at 08:39:47PM +0100, Peter Eisentraut wrote:

Ok, I have committed these two patches.


Please note that the buildfarm has turned red, as in:
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=pipit=2023-12-31%2001%3A12%3A22=misc-check

pg_stat_statements's regression.diffs holds more details:
SELECT query FROM pg_stat_statements WHERE query LIKE '%t001%' OR query LIKE 
'%t098%' ORDER BY query;
  query
  
- select * from t001
   select * from t098
-(2 rows)
+(1 row)


That's surprising.  I wanted to see if there was any specific
configuration but I get a 403.  I'm wondering if this is only due to
other tests being run concurrently evicting an entry earlier than
planned.


These tests are run in a separate instance and serially, so I don't 
think concurrency is an issue.


It looks like the failing configurations are exactly all the big-endian 
ones: s390x, sparc, powerpc.  So it's possible that this is actually a 
bug?  But unless someone can reproduce this locally and debug it, we 
should probably revert this for now.






Re: Things I don't like about \du's "Attributes" column

2023-12-31 Thread Pavel Luzanov

On 30.12.2023 17:33, Isaac Morland wrote:
Would it make sense to make the column non-nullable and always set it 
to infinity when there is no expiry?


A password is not required for roles. In many cases, external 
authentication is used in pg_hba.conf.

I think it would be strange to have 'infinity' for roles without a password.

Tom suggested having 'infinity' in the \du output for roles with a 
password.
My doubt is that this would hide the real values (the absence of values). 
So I suggested a separate column,
'Has password?', to show roles with a password, alongside an unmodified 
'Password expire time' column.


Yes, it's easy to replace NULL with "infinity" for roles with a 
password, but why?
What is the reason for this action? The absence of a value for 'expire 
time' clearly indicates that there is no time limit.
Also, I don't see a practical reason to execute the next command, since 
it does nothing:


ALTER ROLE .. VALID UNTIL 'infinity';

So I think that in most cases there is no "infinity" in the 
rolvaliduntil column.


But of course, I can be wrong.

Thank you for giving your opinion.

--
Pavel Luzanov
Postgres Professional: https://postgrespro.com


Re: A typo in a messsage?

2023-12-31 Thread Kyotaro Horiguchi
At Wed, 27 Dec 2023 13:34:50 +0700, John Naylor  wrote 
in 
> > Shouldn't the "is" following "LSN" be "in"?
> 
> Pushed.

At Wed, 27 Dec 2023 13:35:24 +0700, John Naylor  wrote 
in 
>> I think "crc" should be in all uppercase in general and a brief
>> grep'ing told me that it is almost always or consistently used in
>> uppercase in our tree.
> I pushed this also.

Thanks for pushing them!

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: pg_basebackup has an accidentaly separated help message

2023-12-31 Thread Kyotaro Horiguchi
At Tue, 26 Dec 2023 19:04:53 +0900, Michael Paquier  wrote 
in 
> On Mon, Dec 25, 2023 at 05:07:28PM +0900, Kyotaro Horiguchi wrote:
> > Yes. So, it turns out that they're found after they have been
> > committed.
> 
> No problem.  I've just applied what you had.  I hope this makes your
> life a bit easier ;)

Thanks for committing this!

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: pg_stat_statements: more test coverage

2023-12-31 Thread Julien Rouhaud
On Sun, Dec 31, 2023 at 2:28 PM Michael Paquier  wrote:
>
> On Sat, Dec 30, 2023 at 08:39:47PM +0100, Peter Eisentraut wrote:
> > Ok, I have committed these two patches.
>
> Please note that the buildfarm has turned red, as in:
> https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=pipit=2023-12-31%2001%3A12%3A22=misc-check
>
> pg_stat_statements's regression.diffs holds more details:
> SELECT query FROM pg_stat_statements WHERE query LIKE '%t001%' OR query LIKE 
> '%t098%' ORDER BY query;
>  query
>  
> - select * from t001
>   select * from t098
> -(2 rows)
> +(1 row)

That's surprising.  I wanted to see if there was any specific
configuration but I get a 403.  I'm wondering if this is only due to
other tests being run concurrently evicting an entry earlier than
planned.