Re: logical replication restrictions

2022-11-08 Thread Kyotaro Horiguchi
At Wed, 10 Aug 2022 17:33:00 -0300, "Euler Taveira"  wrote 
in 
> On Wed, Aug 10, 2022, at 9:39 AM, osumi.takami...@fujitsu.com wrote:
> > Minor review comments for v6.
> Thanks for your review. I'm attaching v7.

Using an interval is not standard for this kind of parameter, but it seems
convenient. On the other hand, it's not great that the month unit
introduces some subtle ambiguity.  This patch translates a month to 30
days, but I'm not sure that's the right thing to do. Perhaps we shouldn't
allow units larger than days.
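For reference, plain SQL already shows the 30-days-per-month assumption the
patch would bake in (this is just existing interval behavior, independent of
the patch):

```
-- PostgreSQL's interval arithmetic already treats one month as 30 days
-- for comparison purposes, the same assumption the patch makes.
SELECT extract(epoch FROM interval '1 month') / 86400 AS days_per_month;  -- 30
SELECT interval '1 month' = interval '30 days' AS treated_as_equal;       -- t
```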

apply_delay() chokes the message-receiving path, so even a not-so-long
delay can cause a replication timeout to fire.  I think we should
process walsender pings even while delaying.  Needing to make the
replication timeout longer than the apply delay is not great, I think.
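As an illustration of that workaround (a sketch only; which of the two
timeouts actually fires first depends on how the patch's receive loop
behaves), one would have to do something like:

```
-- Not great: push the replication timeouts above the configured apply delay.
ALTER SYSTEM SET wal_sender_timeout = '10min';     -- on the publisher
ALTER SYSTEM SET wal_receiver_timeout = '10min';   -- on the subscriber
SELECT pg_reload_conf();
```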

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: User functions for building SCRAM secrets

2022-11-08 Thread Michael Paquier
On Tue, Nov 08, 2022 at 04:57:09PM -0800, Jacob Champion wrote:
> But I guess that wouldn't really help with ALTER ROLE ... PASSWORD,
> because you can't parameterize it. Hm...

Yeah, and I'd like to think that this is never something we should
allow, either, as that could easily be a footgun for users (?).
--
Michael




Re: Locks release order in LogStandbySnapshot

2022-11-08 Thread Japin Li


On Wed, 09 Nov 2022 at 11:21, Andres Freund  wrote:
> I think it does. If we allow xid assignment before LogCurrentRunningXacts() is
> done, those new xids would not have been mentioned in the xl_running_xacts
> record, despite already running. Which I think would result in corrupted snapshots
> during hot standby and logical decoding.
>
>
>> Does it make any sense to release them in reversed acquisition order in
>> LogStandbySnapshot like ProcArrayRemove?
>
> I don't think it's worth optimizing for, this happens at a low frequency
> (whereas connection establishment can be very frequent). And due to the above,
> we can sometimes release ProcArrayLock earlier.
>

Thanks for the explanation!  Got it.

-- 
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Re: An attempt to avoid locally-committed-but-not-replicated-to-standby-transactions in synchronous replication

2022-11-08 Thread Andrey Borodin
On Thu, Sep 29, 2022 at 3:53 PM Bruce Momjian  wrote:
>
> So, what happens when an insufficient number of synchronous replicas
> reply?

It's a failover.

> Sessions hang because the synchronous behavior cannot be
> guaranteed.  We then _allow_ query cancel so the user or administrator
> can get out of the hung sessions and perhaps modify
> synchronous_standby_names.

Administrators should not modify synchronous_standby_names.
The administrator must shoot this node in the head.

> I have always felt this has to be done at the server level, meaning when
> a synchronous_standby_names replica is not responding after a certain
> timeout, the administrator must be notified by calling a shell command
> defined in a GUC and all sessions will ignore the replica.

Standbys are expelled from the waitlist according to quorum rules. I'd
propose not to invent more quorum rules involving shell scripts.
The administrator has already expressed how many standbys can be offline by
setting synchronous_standby_names. They actively asked for hanging
queries in case of insufficient standbys.
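For example, a quorum setting like the following already encodes how many
standbys may be lost before sessions start to hang:

```
-- Wait for any 2 of the 3 listed standbys: one standby can be offline
-- without blocking commits, but losing two makes sessions hang by design.
ALTER SYSTEM SET synchronous_standby_names = 'ANY 2 (s1, s2, s3)';
SELECT pg_reload_conf();
```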

We have reserved administrator connections for the case when all
connection slots are used by hanging queries.


Best regards, Andrey Borodin.




Re: [patch] Have psql's \d+ indicate foreign partitions

2022-11-08 Thread Michael Paquier
On Tue, Nov 08, 2022 at 03:38:22PM +0900, Ian Lawrence Barwick wrote:
> CF entry updated accordingly.

Missed this part, thanks..
--
Michael




Re: Improve logging when using Huge Pages

2022-11-08 Thread Michael Paquier
On Wed, Nov 09, 2022 at 11:47:57AM +0900, Kyotaro Horiguchi wrote:
> Honestly I can't come up with other users of the new
> log level. Another possible issue is that it might be a bit hard for people
> to connect that level to huge_pages=try, whereas I think we shouldn't
> put a description about the concrete impact range of that log level.
> 
> I came up with an alternative idea: add a new huge_pages value,
> try_report or try_verbose, which tells postgresql to *always* report
> the result of huge_pages = try.

Here is an extra idea for the bucket of ideas: switch the user-visible
value of huge_pages to "on" when we are at "try" and succeed in using
huge pages, and switch the visible value to "off" otherwise.  The idea of
Justin in [1] to use an internal runtime-computed GUC sounds sensible, as
well (say a boolean effective_huge_pages?).

[1]: https://www.postgresql.org/message-id/20221106130426.gg16...@telsasoft.com
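A rough sketch of how that could look from SQL (effective_huge_pages is only
the name floated above, nothing that exists today):

```
SHOW huge_pages;             -- what the user configured: off / on / try
SHOW effective_huge_pages;   -- hypothetical runtime-computed GUC: whether
                             -- huge pages were actually obtained under "try"
```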
--
Michael




Re: Suppressing useless wakeups in walreceiver

2022-11-08 Thread Bharath Rupireddy
On Wed, Nov 9, 2022 at 2:38 AM Nathan Bossart  wrote:
>
> On Tue, Nov 08, 2022 at 09:45:40PM +1300, Thomas Munro wrote:
> > On Tue, Nov 8, 2022 at 9:20 PM Bharath Rupireddy
> >  wrote:
> >> Thanks. Do we need a similar wakeup approach for logical replication
> >> workers in worker.c? Or is it okay that the nap time is 1sec there?
> >
> > Yeah, I think so.  At least for its idle/nap case (waiting for flush
> > is also a technically interesting case, but harder, and applies to
> > non-idle systems so the polling is a little less offensive).
>
> Bharath, are you planning to pick this up?  If not, I can probably spend
> some time on it.

Please go ahead!

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: allow segment size to be set to < 1GiB

2022-11-08 Thread Michael Paquier
On Tue, Nov 08, 2022 at 06:28:08PM -0800, Andres Freund wrote:
> FWIW, all tests pass with -Dsegsize_blocks=6 on HEAD.

Wow.  The relation page size influences some of the plans in the
main regression test suite, but this is nice to hear.  +1 from me for
more flexibility with this option at compile-time.
--
Michael




Re: Allow file inclusion in pg_hba and pg_ident files

2022-11-08 Thread Julien Rouhaud
Hi,

On Wed, Nov 09, 2022 at 09:51:17AM +0900, Michael Paquier wrote:
> On Tue, Nov 08, 2022 at 10:04:16AM +0900, Michael Paquier wrote:
> > CF bot unhappy as I have messed up with rules.out.  Rebased.  I have
> > removed the restriction on MAXPGPATH in AbsoluteConfigLocation() in
> > 0001, while on it.  The absolute paths built on GUC or ident
> > inclusions are the same.
>
> Rebased after 6bbd8b7g that is an equivalent of the previous 0001.

Thanks a lot!

> Julien, please note that this is waiting on author for now.  What do
> you think about the now-named v18-0001 and the addition of an
> ErrorContextCallback to provide more information about the list of
> included files on error?

Yes, I'm unfortunately fully aware that it's waiting on me.  I've been a bit
busy this week with $work but I will try to go back to it as soon as I can,
hopefully this week!




Re: Reviving lost replication slots

2022-11-08 Thread sirisha chamarthi
On Mon, Nov 7, 2022 at 11:17 PM Bharath Rupireddy <
bharath.rupireddyforpostg...@gmail.com> wrote:

> On Tue, Nov 8, 2022 at 12:08 PM sirisha chamarthi
>  wrote:
> >
> > On Fri, Nov 4, 2022 at 11:02 PM Amit Kapila 
> wrote:
> >>
> >> On Fri, Nov 4, 2022 at 1:40 PM sirisha chamarthi
> >>  wrote:
> >> >
> >> > A replication slot can be lost when a subscriber is not able to catch
> up with the load on the primary and the WAL to catch up exceeds
> max_slot_wal_keep_size. When this happens, target has to be reseeded
> (pg_dump) from the scratch and this can take longer. I am investigating the
> options to revive a lost slot.
> >> >
> >>
> >> Why in the first place one has to set max_slot_wal_keep_size if they
> >> care for WAL more than that?
> >
> >  Disk full is a typical case where we can't wait for the logical slots
> to catch up before truncating the log.
>
> If the max_slot_wal_keep_size is set appropriately and the replication
> lag is monitored properly along with some automatic actions such as
> replacing/rebuilding the standbys or subscribers (which may not be
> easy and cheap though), the chances of hitting the "lost replication"
> problem becomes less, but not zero always.
>

pg_dump and pg_restore can take several hours to days on a large database.
Keeping the WAL in the pg_wal folder (faster, smaller and costly disks?) is
not always an option.
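For what it's worth, at-risk and already-lost slots are visible today, so an
external service at least has something to monitor (plain SQL, not part of
this patch):

```
-- wal_status becomes 'unreserved' when a slot exceeds max_slot_wal_keep_size
-- and 'lost' once the required WAL has been removed; safe_wal_size shows how
-- much room is left before that happens.
SELECT slot_name, slot_type, wal_status, safe_wal_size
FROM pg_replication_slots;
```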


>
> >> If you have a case where you want to
> >> handle this case for some particular slot (where you are okay with the
> >> invalidation of other slots exceeding max_slot_wal_keep_size) then the
> >> other possibility could be to have a similar variable at the slot
> >> level but not sure if that is a good idea because you haven't
> >> presented any such case.
> >
> > IIUC, ability to fetch WAL from the archive as a fall back mechanism
> should automatically take care of all the lost slots. Do you see a need to
> take care of a specific slot? If the idea is not to download the wal files
> in the pg_wal directory, they can be placed in a slot specific folder
> (data/pg_replslot//) until they are needed while decoding and can be
> removed.
>
> Is the idea here the core copying back the WAL files from the archive?
> If yes, I think it is not something the core needs to do. This very
> well fits the job of an extension or an external module that revives
> the lost replication slots by copying WAL files from archive location.
>

The current code is throwing an error that the slot is lost because the
restart_lsn is set to an invalid LSN when the WAL is truncated by the
checkpointer. In order to build an external service that can revive a lost
slot, at a minimum we need the patch attached.


>
> Having said above, what's the best way to revive a lost replication
> slot today? Any automated way exists today? It seems like
> pg_replication_slot_advance() doesn't do anything for the
> invalidated/lost slots.
>

If the WAL is available in the pg_wal directory, the replication stream
resumes normally when the client connects, with the patch I posted.


>
> If it's a streaming replication slot, the standby will anyway jump to
> archive mode ignoring the replication slot and the slot will never be
> usable again unless somebody creates a new replication slot and
> provides it to the standby for reuse.
> If it's a logical replication slot, the subscriber will start to
> diverge from the publisher and the slot will have to be revived
> manually i.e. created again.
>

Physical slots can be revived by the standby downloading the WAL directly
from the archive. This patch is helpful for logical slots.


>
> --
> Bharath Rupireddy
> PostgreSQL Contributors Team
> RDS Open Source Databases
> Amazon Web Services: https://aws.amazon.com
>


Re: Reviving lost replication slots

2022-11-08 Thread sirisha chamarthi
On Tue, Nov 8, 2022 at 1:36 AM Amit Kapila  wrote:

> On Tue, Nov 8, 2022 at 12:08 PM sirisha chamarthi
>  wrote:
> >
> > On Fri, Nov 4, 2022 at 11:02 PM Amit Kapila 
> wrote:
> >>
> >> On Fri, Nov 4, 2022 at 1:40 PM sirisha chamarthi
> >>  wrote:
> >> >
> >> > A replication slot can be lost when a subscriber is not able to catch
> up with the load on the primary and the WAL to catch up exceeds
> max_slot_wal_keep_size. When this happens, target has to be reseeded
> (pg_dump) from the scratch and this can take longer. I am investigating the
> options to revive a lost slot.
> >> >
> >>
> >> Why in the first place one has to set max_slot_wal_keep_size if they
> >> care for WAL more than that?
> >
> >  Disk full is a typical case where we can't wait for the logical slots
> to catch up before truncating the log.
> >
>
> Ideally, in such a case the subscriber should fall back to the
> physical standby of the publisher but unfortunately, we don't yet have
> a functionality where subscribers can continue logical replication
> from physical standby. Do you think if we had such functionality it
> would serve our purpose?
>

I don't think streaming from a standby helps, as the disk layout is expected
to remain the same on the physical standby and the primary.



> >> If you have a case where you want to
> >> handle this case for some particular slot (where you are okay with the
> >> invalidation of other slots exceeding max_slot_wal_keep_size) then the
> >> other possibility could be to have a similar variable at the slot
> >> level but not sure if that is a good idea because you haven't
> >> presented any such case.
> >
> > IIUC, ability to fetch WAL from the archive as a fall back mechanism
> should automatically take care of all the lost slots. Do you see a need to
> take care of a specific slot?
> >
>
> No, I was just trying to see if your use case can be addressed in some
> other way. BTW, won't copying the WAL back from the archive again lead
> to a disk-full situation?
>
The idea is to download the WAL from the archive on demand as the slot
requires it, and throw away each segment once processed.
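To know which segment to fetch next, the slot's restart_lsn can be mapped to a
WAL file name (for slots that still have a valid restart_lsn):

```
SELECT slot_name,
       restart_lsn,
       pg_walfile_name(restart_lsn) AS next_needed_segment
FROM pg_replication_slots
WHERE restart_lsn IS NOT NULL;
```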


>
> --
> With Regards,
> Amit Kapila.
>


Re: Locks release order in LogStandbySnapshot

2022-11-08 Thread Andres Freund
Hi,

On 2022-11-09 11:03:04 +0800, Japin Li wrote:
> GetRunningTransactionData requires holding both ProcArrayLock and
> XidGenLock (in that order).  Then LogStandbySnapshot releases those
> locks in that order.  However, to reduce the frequency of having to
> wait for XidGenLock while holding ProcArrayLock, ProcArrayAdd releases
> them in reversed acquisition order.
>
> The comments of LogStandbySnapshot say:
> 
> > GetRunningTransactionData() acquired ProcArrayLock, we must release it.
> > For Hot Standby this can be done before inserting the WAL record
> > because ProcArrayApplyRecoveryInfo() rechecks the commit status using
> > the clog. For logical decoding, though, the lock can't be released
> > early because the clog might be "in the future" from the POV of the
> > historic snapshot. This would allow for situations where we're waiting
> > for the end of a transaction listed in the xl_running_xacts record
> > which, according to the WAL, has committed before the xl_running_xacts
> > record. Fortunately this routine isn't executed frequently, and it's
> > only a shared lock.
> 
> This comment is only for ProcArrayLock, not for XidGenLock. IIUC,
> LogCurrentRunningXacts doesn't need to hold XidGenLock, right?

I think it does. If we allow xid assignment before LogCurrentRunningXacts() is
done, those new xids would not have been mentioned in the xl_running_xacts
record, despite already running. Which I think would result in corrupted snapshots
during hot standby and logical decoding.


> Does it make any sense to release them in reversed acquisition order in
> LogStandbySnapshot like ProcArrayRemove?

I don't think it's worth optimizing for, this happens at a low frequency
(whereas connection establishment can be very frequent). And due to the above,
we can sometimes release ProcArrayLock earlier.

Greetings,

Andres Freund




Locks release order in LogStandbySnapshot

2022-11-08 Thread Japin Li

Hi, hackers

GetRunningTransactionData requires holding both ProcArrayLock and
XidGenLock (in that order).  Then LogStandbySnapshot releases those
locks in that order.  However, to reduce the frequency of having to
wait for XidGenLock while holding ProcArrayLock, ProcArrayAdd releases
them in reversed acquisition order.

The comments of LogStandbySnapshot say:

> GetRunningTransactionData() acquired ProcArrayLock, we must release it.
> For Hot Standby this can be done before inserting the WAL record
> because ProcArrayApplyRecoveryInfo() rechecks the commit status using
> the clog. For logical decoding, though, the lock can't be released
> early because the clog might be "in the future" from the POV of the
> historic snapshot. This would allow for situations where we're waiting
> for the end of a transaction listed in the xl_running_xacts record
> which, according to the WAL, has committed before the xl_running_xacts
> record. Fortunately this routine isn't executed frequently, and it's
> only a shared lock.

This comment is only for ProcArrayLock, not for XidGenLock. IIUC,
LogCurrentRunningXacts doesn't need to hold XidGenLock, right?

Does it make any sense to release them in reversed acquisition order in
LogStandbySnapshot like ProcArrayRemove?

-- 
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.

diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 7db86f7885..28d1c152bf 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -1283,6 +1283,9 @@ LogStandbySnapshot(void)
 	 */
 	running = GetRunningTransactionData();
 
+	/* GetRunningTransactionData() acquired XidGenLock, we must release it */
+	LWLockRelease(XidGenLock);
+
 	/*
 	 * GetRunningTransactionData() acquired ProcArrayLock, we must release it.
 	 * For Hot Standby this can be done before inserting the WAL record
@@ -1304,9 +1307,6 @@ LogStandbySnapshot(void)
 	if (wal_level >= WAL_LEVEL_LOGICAL)
 		LWLockRelease(ProcArrayLock);
 
-	/* GetRunningTransactionData() acquired XidGenLock, we must release it */
-	LWLockRelease(XidGenLock);
-
 	return recptr;
 }
 


Re: Asynchronous and "direct" IO support for PostgreSQL.

2022-11-08 Thread Andres Freund
Hi,

On 2022-10-12 14:45:26 +0900, Michael Paquier wrote:
> On Tue, Aug 31, 2021 at 10:56:59PM -0700, Andres Freund wrote:
> > I've attached the code for posterity, but the series is large enough that I
> > don't think it makes sense to do that all that often... The code is at
> > https://github.com/anarazel/postgres/tree/aio
> 
> I don't know what's the exact status here, but as there has been no
> activity for the past five months, I have just marked the entry as RwF
> for now.

We're trying to get a number of smaller prerequisite patches merged this CF
(aligned alloc, direction IO, dclist, bulk relation extension, ...). Once
that's done I'm planning to send out a new version of the (large) remainder of
the changes.

Greetings,

Andres Freund




Re: Add index scan progress to pg_stat_progress_vacuum

2022-11-08 Thread Andres Freund
Hi,

On 2022-11-04 13:27:34 +, Imseih (AWS), Sami wrote:
> diff --git a/src/backend/access/gin/ginvacuum.c 
> b/src/backend/access/gin/ginvacuum.c
> index b4fa5f6bf8..3d5e4600dc 100644
> --- a/src/backend/access/gin/ginvacuum.c
> +++ b/src/backend/access/gin/ginvacuum.c
> @@ -633,6 +633,9 @@ ginbulkdelete(IndexVacuumInfo *info, 
> IndexBulkDeleteResult *stats,
>   UnlockReleaseBuffer(buffer);
>   buffer = ReadBufferExtended(index, MAIN_FORKNUM, blkno,
>   
> RBM_NORMAL, info->strategy);
> +
> + if (info->report_parallel_progress)
> + 
> info->parallel_progress_callback(info->parallel_progress_arg);
>   }
>  
>   /* right now we found leftmost page in entry's BTree */

I don't think any of these progress callbacks should be done while pinning a
buffer and ...

> @@ -677,6 +680,9 @@ ginbulkdelete(IndexVacuumInfo *info, 
> IndexBulkDeleteResult *stats,
>   buffer = ReadBufferExtended(index, MAIN_FORKNUM, blkno,
>   
> RBM_NORMAL, info->strategy);
>   LockBuffer(buffer, GIN_EXCLUSIVE);
> +
> + if (info->report_parallel_progress)
> + 
> info->parallel_progress_callback(info->parallel_progress_arg);
>   }
>  
>   MemoryContextDelete(gvs.tmpCxt);

... definitely not while holding a buffer lock.


I also don't understand why info->parallel_progress_callback exists? It's only
set to parallel_vacuum_progress_report(). Why make this stuff more expensive
than it already has to be?



> +parallel_vacuum_progress_report(void *arg)
> +{
> + ParallelVacuumState *pvs = (ParallelVacuumState *) arg;
> +
> + if (IsParallelWorker())
> + return;
> +
> + pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED,
> +  
> pg_atomic_read_u32(&(pvs->shared->idx_completed_progress)));
> +}

So each of the places that calls this needs to make an additional external
function call for each page, just to find that there's nothing to do or to
make yet another indirect function call. This should probably be a static
inline function.

This is called, for every single page, just to read the number of indexes
completed by workers? A number that barely ever changes?

This seems all wrong.

Greetings,

Andres Freund




RE: Perform streaming logical transactions by background workers and parallel apply

2022-11-08 Thread Hayato Kuroda (Fujitsu)
Hi all,

I have tested the patch set in two cases, so I want to share the result. 


Case 1. deadlock caused by leader worker, parallel worker, and backend.

Case 2. deadlock caused by non-immutable trigger
===

It has worked well in both cases. PSA reports of what I did.
I can investigate more if anyone wants to check further.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

# Deadlock caused by leader worker, parallel worker, and backend.

## SCHEMA DEFINITIONS

[Publisher]

CREATE TABLE tbl1 (c int);
CREATE TABLE tbl2 (c1 int primary key, c2 int, c3 int);
CREATE PUBLICATION pub FOR ALL TABLES;

[Subscriber]
CREATE TABLE tbl1 (c int);
CREATE UNIQUE INDEX idx_tbl1 ON tbl1(c);
CREATE TABLE tbl2 (c1 int primary key, c2 int, c3 int);
CREATE SUBSCRIPTION sub CONNECTION 'port=$port_N1 user=postgres' PUBLICATION 
pub WITH(streaming = 'parallel', copy_data = 'false', two_phase = 'on');


### WORKLOADS


Tx1 on publisher
BEGIN;
INSERT INTO tbl1 SELECT i FROM generate_series(1, 5000);

Tx2 on subscriber
BEGIN;
INSERT INTO tbl2 VALUES (1, 2, 3);
INSERT INTO tbl1 VALUES (1);

Tx3 on publisher
BEGIN;
INSERT INTO tbl2 VALUES(1, 2, 3);
COMMIT;


### RESULTS

The following was copied from the log on the subscriber.

```
ERROR:  deadlock detected
DETAIL:  Process (LA) waits for ShareLock on transaction 743; blocked by 
process (Backend).
Process (Backend) waits for ShareLock on transaction 742; blocked by 
process (PA).
Process (PA) waits for AccessShareLock on object 16393 of class 6100 of 
database 0; blocked by process (LA).
Process (LA): 
Process (Backend): INSERT INTO tbl1 VALUES (1);
Process (PA): 
```

# deadlock caused by non-immutable trigger

## SCHEMA DEFINITIONS

[Publisher]

CREATE TABLE tbl1 (c int);
CREATE PUBLICATION pub FOR ALL TABLES;

[Subscriber]
CREATE TABLE tbl1 (c int);
CREATE TABLE tbl1_log (c int PRIMARY KEY);

CREATE FUNCTION record_insert() RETURNS trigger AS $record_insert$
BEGIN
  SET search_path TO 'public';
  INSERT INTO tbl1_log VALUES (NEW.c);
  RAISE LOG 'record_insert is fired';
  RESET search_path;
  RETURN NEW;
END;
$record_insert$
LANGUAGE plpgsql;

CREATE TRIGGER record_trigger AFTER INSERT ON tbl1 FOR EACH ROW EXECUTE 
FUNCTION record_insert();
ALTER TABLE tbl1 ENABLE ALWAYS TRIGGER record_trigger;

CREATE SUBSCRIPTION sub CONNECTION 'port=$port_N1 user=postgres' PUBLICATION 
pub WITH(streaming = 'parallel', copy_data = 'false', two_phase = 'on');


### WORKLOADS

Tx1 on publisher
BEGIN;
INSERT INTO tbl1 SELECT i FROM generate_series(1, 5000);

Tx2 on publisher
BEGIN;
INSERT INTO tbl1 VALUES (1);
COMMIT;
COMMIT;


### RESULTS

The following was copied from the log on the subscriber.

```
ERROR:  deadlock detected
DETAIL:  Process (LA) waits for ShareLock on transaction 735; blocked by 
process (PA).
Process (PA) waits for AccessShareLock on object 16400 of class 6100 of 
database 0; blocked by process (LA).
Process (LA): 
Process (PA): 
```




Re: Improve logging when using Huge Pages

2022-11-08 Thread Kyotaro Horiguchi
At Tue, 8 Nov 2022 11:34:53 +1300, Thomas Munro  wrote 
in 
> On Tue, Nov 8, 2022 at 4:56 AM Tom Lane  wrote:
> > Andres Freund  writes:
> > > On 2022-11-06 14:04:29 +0700, John Naylor wrote:
> > Agreed --- changing "on" to be exactly like "try" isn't an improvement.
> > If you want "try" semantics, choose "try".
> 
> Agreed, but how can we make the people who want a log message happy,
> without upsetting the people who don't want new log messages?  Hence
> my suggestion of a new level.  How about try_verbose?

Honestly I can't come up with other users of the new
log level. Another possible issue is that it might be a bit hard for people
to connect that level to huge_pages=try, whereas I think we shouldn't
put a description about the concrete impact range of that log level.

I came up with an alternative idea: add a new huge_pages value,
try_report or try_verbose, which tells postgresql to *always* report
the result of huge_pages = try.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Add connection active, idle time to pg_stat_activity

2022-11-08 Thread David G. Johnston
On Tue, Nov 8, 2022 at 7:37 PM Andres Freund  wrote:

> On 2022-11-08 19:25:27 -0700, David G. Johnston wrote:
> > Actually two, because I also suggest that not only is the duration
> recorded,
> > but a counter be incremented each time a given state becomes the
> currently
> > active state.  Seems like having access to a divisor of some form may be
> > useful.
>
> What for?
>

Because 5 hours of idle-in-transaction time in a single block means
something different than the same 5 hours accumulated across 300 mini-idles.

David J.


Re: Collation version tracking for macOS

2022-11-08 Thread Thomas Munro
On Tue, Nov 8, 2022 at 1:22 AM Peter Eisentraut
 wrote:
> I made a Homebrew repository for ICU versions 50 through 72:
> https://github.com/petere/homebrew-icu

Nice!

> All of these packages build and pass their self-tests on my machine.  So
> from that experience, I think maintaining a repository of ICU versions,
> and being able to install more than one for testing this feature, is
> feasible.

I wonder what the situation with CVEs is in older releases.  I heard a
rumour that upstream might only patch current + previous, leaving it
up to distros to back-patch to whatever they need to support, but I
haven't tried to track down cold hard evidence of this or think about
what it means for this project...




Re: Add connection active, idle time to pg_stat_activity

2022-11-08 Thread Andres Freund
On 2022-11-08 19:25:27 -0700, David G. Johnston wrote:
> Actually two, because I also suggest that not only is the duration recorded,
> but a counter be incremented each time a given state becomes the currently
> active state.  Seems like having access to a divisor of some form may be
> useful.

What for?




Re: [PATCH] Teach pg_waldump to extract FPIs from the WAL

2022-11-08 Thread sho kato
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:       tested, failed
Spec compliant:           tested, failed
Documentation:            tested, failed

Hello,

I tested this patch on Linux and there is no problem.
Also, I reviewed this patch and commented below.

@@ -439,6 +447,107 @@ XLogRecordHasFPW(XLogReaderState *record)
+   if (fork >= 0 && fork <= MAX_FORKNUM)
+   {
+   if (fork)
+   sprintf(forkname, "_%s", forkNames[fork]);
+   else
+   forkname[0] = 0;
+   }
+   else
+   pg_fatal("found invalid fork number: %u", fork);

Should we add a comment noting that the main fork is saved without a "_main"
suffix, for code readability?

@@ -679,6 +788,9 @@ usage(void)
 " (default: 1 or the value used in 
STARTSEG)\n"));
printf(_("  -V, --version  output version information, then 
exit\n"));
printf(_("  -w, --fullpage only show records with a full page 
write\n"));
+   printf(_("  -W, --save-fpi=pathsave full page images to given path as\n"
+" LSN.T.D.R.B_F\n"));
+   printf(_("  -X, --fixup-fpi=path   like --save-fpi but apply LSN fixups to 
saved page\n"));
printf(_("  -x, --xid=XID  only show records with transaction ID 
XID\n"));
printf(_("  -z, --stats[=record]   show statistics instead of records\n"
 " (optionally, show per-record 
statistics)\n"));

Since lower-case options are displayed at the top, should we switch the order 
of -x and -X?

@@ -972,6 +1093,25 @@ main(int argc, char **argv)
}
}

+   int dir_status = pg_check_dir(config.save_fpw_path);
+
+   if (dir_status < 0)
+   {
+   pg_log_error("could not access output directory: %s", 
config.save_fpw_path);
+   goto bad_argument;
+   }

Should we output %s enclosed with \"?

Regards,
Sho Kato

Re: allow segment size to be set to < 1GiB

2022-11-08 Thread Andres Freund
Hi,

On 2022-11-07 21:36:33 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > On 2022-11-07 12:52:25 -0500, Tom Lane wrote:
> >> How about instead allowing the segment size to be set in pages?
> 
> > In addition or instead of --with-segsize/-Dsegsize?
> 
> In addition to.  What I meant by "instead" was to replace
> your proposal of --with-segsize-mb.

Working on updating the patch.

One semi-interesting bit is that <= 5 blocks per segment fails, because
corrupt_page_checksum() doesn't know about segments and
src/bin/pg_basebackup/t/010_pg_basebackup.pl does

# induce further corruption in 5 more blocks
$node->stop;
for my $i (1 .. 5)
{
$node->corrupt_page_checksum($file_corrupt1, $i * $block_size);
}
$node->start;

I'd be content with not dealing with that, given the use case of the
functionality.  A buildfarm animal setting it to 10 seems to
suffice. Alternatively we could add segment support to
corrupt_page_checksum().

Opinions?

FWIW, all tests pass with -Dsegsize_blocks=6 on HEAD.

Greetings,

Andres Freund




Re: Add connection active, idle time to pg_stat_activity

2022-11-08 Thread David G. Johnston
On Tue, Nov 8, 2022 at 6:56 PM Andres Freund  wrote:

>
> Separately from that, I'm a bit worried about starting to add accumulative
> counters to pg_stat_activity. It's already gotten hard to use interactively
> due to the number of columns - and why stop with the columns you suggest?
> Why
> not show e.g. the total number of reads/writes, tuples inserted / deleted,
> etc. as well?
>
> I wonder if we shouldn't add a pg_stat_session or such for per-connection
> counters that show not the current state, but accumulated per-session
> state.
>
>
I would much rather go down this route than make the existing table wider.

pg_stat_activity_state_duration (this patch) [the table - for a given
backend - would be empty if track_activities is off]
pg_stat_activity_bandwidth_usage (if someone feels like implementing the
other items you mention)


I'm not really buying into the idea of having multiple states sum their
times together.  I would expect one column per state.  Actually two,
because I also suggest that not only is the duration recorded, but a
counter be incremented each time a given state becomes the currently active
state.  Seems like having access to a divisor of some form may be useful.

So 10 columns of data plus pid to join back to pg_stat_activity proper.
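Something along these lines, purely as a sketch (the view and column names
here are invented for illustration; nothing of the sort exists yet):

```
-- One duration plus one entry counter per backend state, keyed by pid.
SELECT a.pid, a.state,
       s.active_time,              s.active_count,
       s.idle_time,                s.idle_count,
       s.idle_in_transaction_time, s.idle_in_transaction_count
FROM pg_stat_activity AS a
JOIN pg_stat_session  AS s USING (pid);   -- hypothetical view
```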

David J.


Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-11-08 Thread David Rowley
On Tue, 8 Nov 2022 at 19:51, Richard Guo  wrote:
> For unsorted paths, the original logic here is to explicitly add a Sort
> path only for the cheapest-total path.  This patch changes that and may
> add a Sort path for other paths besides the cheapest-total path.  I
> think this may introduce some unnecessary path candidates.

Yeah, you're right. The patch shouldn't change that.  I've adjusted
the attached patch so that part works more like it did before.

v2 attached.

Thanks

David
diff --git a/src/backend/optimizer/path/costsize.c 
b/src/backend/optimizer/path/costsize.c
index 4c6b1d1f55..fe8573d92c 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -1959,8 +1959,8 @@ cost_incremental_sort(Path *path,
  double input_tuples, int width, Cost 
comparison_cost, int sort_mem,
  double limit_tuples)
 {
-   Coststartup_cost = 0,
-   run_cost = 0,
+   Coststartup_cost,
+   run_cost,
input_run_cost = input_total_cost - 
input_startup_cost;
double  group_tuples,
input_groups;
@@ -1969,10 +1969,9 @@ cost_incremental_sort(Path *path,
group_input_run_cost;
List   *presortedExprs = NIL;
ListCell   *l;
-   int i = 0;
boolunknown_varno = false;
 
-   Assert(presorted_keys != 0);
+   Assert(presorted_keys > 0 && presorted_keys < list_length(pathkeys));
 
/*
 * We want to be sure the cost of a sort is never estimated as zero, 
even
@@ -2025,12 +2024,11 @@ cost_incremental_sort(Path *path,
/* expression not containing any Vars with "varno 0" */
presortedExprs = lappend(presortedExprs, member->em_expr);
 
-   i++;
-   if (i >= presorted_keys)
+   if (foreach_current_index(l) + 1 >= presorted_keys)
break;
}
 
-   /* Estimate number of groups with equal presorted keys. */
+   /* Estimate the number of groups with equal presorted keys. */
if (!unknown_varno)
input_groups = estimate_num_groups(root, presortedExprs, 
input_tuples,

   NULL, NULL);
@@ -2039,22 +2037,19 @@ cost_incremental_sort(Path *path,
group_input_run_cost = input_run_cost / input_groups;
 
/*
-* Estimate average cost of sorting of one group where presorted keys 
are
-* equal.  Incremental sort is sensitive to distribution of tuples to 
the
-* groups, where we're relying on quite rough assumptions.  Thus, we're
-* pessimistic about incremental sort performance and increase its 
average
-* group size by half.
+* Estimate the average cost of sorting of one group where presorted 
keys
+* are equal.
 */
	cost_tuplesort(&group_startup_cost, &group_run_cost,
-  1.5 * group_tuples, width, comparison_cost, 
sort_mem,
+  group_tuples, width, comparison_cost, 
sort_mem,
   limit_tuples);
 
/*
 * Startup cost of incremental sort is the startup cost of its first 
group
 * plus the cost of its input.
 */
-   startup_cost += group_startup_cost
-   + input_startup_cost + group_input_run_cost;
+   startup_cost = group_startup_cost + input_startup_cost +
+  group_input_run_cost;
 
/*
 * After we started producing tuples from the first group, the cost of
@@ -2062,17 +2057,20 @@ cost_incremental_sort(Path *path,
 * group, plus the total cost to process the remaining groups, plus the
 * remaining cost of input.
 */
-   run_cost += group_run_cost
-   + (group_run_cost + group_startup_cost) * (input_groups - 1)
-   + group_input_run_cost * (input_groups - 1);
+   run_cost = group_run_cost + (group_run_cost + group_startup_cost) *
+  (input_groups - 1) + group_input_run_cost * 
(input_groups - 1);
 
/*
 * Incremental sort adds some overhead by itself. Firstly, it has to
 * detect the sort groups. This is roughly equal to one extra copy and
-* comparison per tuple. Secondly, it has to reset the tuplesort context
-* for every group.
+* comparison per tuple.
 */
run_cost += (cpu_tuple_cost + comparison_cost) * input_tuples;
+
+   /*
+* Additionally, we charge double cpu_tuple_cost for each input group to
+* account for the tuplesort_reset that's performed after each group.
+*/
run_cost += 2.0 * cpu_tuple_cost * 

Re: Add connection active, idle time to pg_stat_activity

2022-11-08 Thread Andres Freund
Hi,

On 2022-07-21 18:22:51 +0200, Sergey Dudoladov wrote:
> From b5298301a3f5223bd78c519ddcddbd1bec9cf000 Mon Sep 17 00:00:00 2001
> From: Sergey Dudoladov 
> Date: Wed, 20 Apr 2022 23:47:37 +0200
> Subject: [PATCH] pg_stat_activity: add 'total_active_time' and
>  'total_idle_in_transaction_time'
> 
> catversion bump because of the change in the contents of pg_stat_activity
> 
> Author: Sergey Dudoladov, based on the initial version by Rafia Sabih
> 
> Reviewed-by: Aleksander Alekseev, Bertrand Drouvot, and Atsushi Torikoshi
> 
> Discussion: 
> https://www.postgresql.org/message-id/flat/CA%2BFpmFcJF0vwi-SWW0wYO-c-FbhyawLq4tCpRDCJJ8Bq%3Dja-gA%40mail.gmail.com

Isn't this patch breaking pg_stat_database? You removed
pgstat_count_conn_active_time() etc and the declaration for pgStatActiveTime /
pgStatTransactionIdleTime (but left the definition in pgstat_database.c), but
didn't replace it with anything afaics.


Separately from that, I'm a bit worried about starting to add accumulative
counters to pg_stat_activity. It's already gotten hard to use interactively
due to the number of columns - and why stop with the columns you suggest? Why
not show e.g. the total number of reads/writes, tuples inserted / deleted,
etc. as well?

I wonder if we shouldn't add a pg_stat_session or such for per-connection
counters that show not the current state, but accumulated per-session state.

Greetings,

Andres Freund




Re: Slow standby snapshot

2022-11-08 Thread Thomas Munro
On Wed, Nov 9, 2022 at 1:54 PM Andres Freund  wrote:
> On 2022-11-09 11:42:36 +1300, Thomas Munro wrote:
> > On Sat, Sep 17, 2022 at 6:27 PM Simon Riggs
> >  wrote:
> > > I've cleaned up the comments and used a #define for the constant.
> > >
> > > IMHO this is ready for commit. I will add it to the next CF.
> >
> > FYI This had many successful cfbot runs but today it blew up on
> > Windows when the assertion TransactionIdPrecedesOrEquals(safeXid,
> > snap->xmin) failed:
> >
> > https://api.cirrus-ci.com/v1/artifact/task/5311549010083840/crashlog/crashlog-postgres.exe_1c40_2022-11-08_00-20-28-110.txt
>
> I don't immediately see how that could be connected to this patch - afaict
> that crash wasn't during recovery, and the modified functions should only be
> active during hot standby.

Indeed, sorry for the noise.




Re: Cygwin cleanup

2022-11-08 Thread Justin Pryzby
On Thu, Oct 20, 2022 at 10:40:40PM -0500, Justin Pryzby wrote:
> On Thu, Aug 04, 2022 at 04:16:06PM +1200, Thomas Munro wrote:
> > On Thu, Aug 4, 2022 at 3:38 PM Justin Pryzby  wrote:
> > > [train wreck]
> > 
> > Oh my, so I'm getting the impression we might actually be totally
> > unstable on Cygwin.  Which surprises me because ... wait a minute ...
> > lorikeet isn't even running most of the tests.  So... we don't really
> > know the degree to which any of this works at all?
> 
> Right.
> 
> Maybe it's of limited interest, but ..
> 
> This updates the patch to build and test with meson.
> Which first requires patching some meson.builds.
> I guess that's needed for some current BF members, too.
> Unfortunately, ccache+PCH causes gcc to crash :(

Resending with the 'only-if' line commented (doh).
And some fixes to 001 as Andres pointed out on another thread.

-- 
Justin
>From 2741472080eceac5cb6d002c39eaf319d7f72b50 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 30 Sep 2022 13:39:43 -0500
Subject: [PATCH 1/3] meson: other fixes for cygwin

XXX: what about HAVE_BUGGY_STRTOF ?

See: 20221021034040.gt16...@telsasoft.com
---
 meson.build  | 8 ++--
 src/port/meson.build | 4 
 src/test/regress/meson.build | 2 ++
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/meson.build b/meson.build
index ce2f223a409..ed24370672a 100644
--- a/meson.build
+++ b/meson.build
@@ -211,6 +211,10 @@ if host_system == 'aix'
 
 elif host_system == 'cygwin'
   cppflags += '-D_GNU_SOURCE'
+  dlsuffix = '.dll'
+  mod_link_args_fmt = ['@0@']
+  mod_link_with_name = 'lib@0@.exe.a'
+  mod_link_with_dir = 'libdir'
 
 elif host_system == 'darwin'
   dlsuffix = '.dylib'
@@ -2301,8 +2305,8 @@ gnugetopt_dep = cc.find_library('gnugetopt', required: false)
 #   (i.e., allow '-' as a flag character), so use our version on those platforms
 # - We want to use system's getopt_long() only if the system provides struct
 #   option
-always_replace_getopt = host_system in ['windows', 'openbsd', 'solaris']
-always_replace_getopt_long = host_system == 'windows' or not cdata.has('HAVE_STRUCT_OPTION')
+always_replace_getopt = host_system in ['windows', 'cygwin', 'openbsd', 'solaris']
+always_replace_getopt_long = host_system in ['windows', 'cygwin'] or not cdata.has('HAVE_STRUCT_OPTION')
 
 # Required on BSDs
 execinfo_dep = cc.find_library('execinfo', required: false)
diff --git a/src/port/meson.build b/src/port/meson.build
index c696f1b..0ba83cc7930 100644
--- a/src/port/meson.build
+++ b/src/port/meson.build
@@ -40,6 +40,10 @@ if host_system == 'windows'
 'win32setlocale.c',
 'win32stat.c',
   )
+elif host_system == 'cygwin'
+  pgport_sources += files(
+'dirmod.c',
+  )
 endif
 
 if cc.get_id() == 'msvc'
diff --git a/src/test/regress/meson.build b/src/test/regress/meson.build
index f1adcd9198c..72a23737fa7 100644
--- a/src/test/regress/meson.build
+++ b/src/test/regress/meson.build
@@ -12,6 +12,8 @@ regress_sources = pg_regress_c + files(
 host_tuple_cc = cc.get_id()
 if host_system == 'windows' and host_tuple_cc == 'gcc'
   host_tuple_cc = 'mingw'
+elif host_system == 'cygwin' and host_tuple_cc == 'gcc'
+  host_tuple_cc = 'cygwin'
 endif
 host_tuple = '@0@-@1@-@2@'.format(host_cpu, host_system, host_tuple_cc)
 
-- 
2.25.1

>From 8f31be4d0bf036df890e32568dbc056c36fd57c5 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Mon, 25 Jul 2022 23:05:10 +1200
Subject: [PATCH 2/3] WIP CI support for Cygwin.

ci-os-only: cygwin

See also: d8e78714-dc77-4a64-783f-e863ba4d9...@2ndquadrant.com

https://cirrus-ci.com/task/5145086722834432

XXX This should use a canned Docker image with all the right packages
installed?  But if the larger image is slower to start, then maybe not...
---
 .cirrus.yml   | 76 +++
 configure |  2 +-
 configure.ac  |  2 +-
 src/test/perl/PostgreSQL/Test/Cluster.pm  |  4 +-
 src/test/perl/PostgreSQL/Test/Utils.pm| 12 +++-
 src/test/recovery/t/020_archive_status.pl |  2 +-
 src/tools/ci/cores_backtrace.sh   | 33 +-
 src/tools/ci/pg_ci_base.conf  |  2 +
 8 files changed, 122 insertions(+), 11 deletions(-)

diff --git a/.cirrus.yml b/.cirrus.yml
index 9f2282471a9..02b0f3b7045 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -464,6 +464,82 @@ task:
   type: text/plain
 
 
+task:
+  name: Windows - Cygwin
+  only_if: $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*cygwin.*'
+  #XXX only_if: $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*cygwin.*'
+  timeout_in: 90m
+
+  env:
+CPUS: 4
+BUILD_JOBS: 4
+TEST_JOBS: 1
+CCACHE_DIR: /tmp/ccache
+CCACHE_LOGFILE: ccache.log
+CONFIGURE_FLAGS: --enable-cassert --enable-debug --with-ldap --with-ssl=openssl --with-libxml
+# --enable-tap-tests
+# --disable-dynamicbase
+# --with-gssapi
+CONFIGURE_CACHE: /tmp/ccache/configure.cache
+PG_TEST_USE_UNIX_SOCKETS: 

Re: User functions for building SCRAM secrets

2022-11-08 Thread Jacob Champion
On 11/8/22 12:26, Peter Eisentraut wrote:
> On 04.11.22 21:39, Jacob Champion wrote:
>> I don't think it's helpful for me to try to block progress on this
>> patchset behind the other one. But is there a way for me to help this
>> proposal skate in the same general direction? Could Peter's encryption
>> framework expand to fit this case in the future?
> 
> We already have support in libpq for doing this (PQencryptPasswordConn()).

Sure, but you can't access that in SQL, right? The hand-wavy part is to
combine that existing function with your transparent encryption
proposal, as a special-case encryptor whose output could be bound to the
query.

But I guess that wouldn't really help with ALTER ROLE ... PASSWORD,
because you can't parameterize it. Hm...
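For context, the closest thing today is to build the verifier on the client
side (e.g. with PQencryptPasswordConn() or psql's \password) and ship it as a
literal, since the server stores a string that is already in SCRAM format
verbatim. A rough illustration (secret shortened here):

```
-- The pre-computed secret goes in as a plain literal; there is no way to
-- pass it as a bind parameter to ALTER ROLE.
ALTER ROLE alice PASSWORD 'SCRAM-SHA-256$4096:c2FsdA==$storedkey...:serverkey...';
```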

--Jacob




Re: Slow standby snapshot

2022-11-08 Thread Andres Freund
Hi,

On 2022-11-09 11:42:36 +1300, Thomas Munro wrote:
> On Sat, Sep 17, 2022 at 6:27 PM Simon Riggs
>  wrote:
> > I've cleaned up the comments and used a #define for the constant.
> >
> > IMHO this is ready for commit. I will add it to the next CF.
> 
> FYI This had many successful cfbot runs but today it blew up on
> Windows when the assertion TransactionIdPrecedesOrEquals(safeXid,
> snap->xmin) failed:
> 
> https://api.cirrus-ci.com/v1/artifact/task/5311549010083840/crashlog/crashlog-postgres.exe_1c40_2022-11-08_00-20-28-110.txt

I don't immediately see how that could be connected to this patch - afaict
that crash wasn't during recovery, and the modified functions should only be
active during hot standby.

Greetings,

Andres Freund




Re: security_context_t marked as deprecated in libselinux 3.1

2022-11-08 Thread Michael Paquier
On Fri, Nov 04, 2022 at 08:49:24AM +0900, Michael Paquier wrote:
> In line of ad96696, seems like that it would make sense to do the same
> here even if the bar is lower.  sepgsql has not changed in years, so I
> suspect few conflicts.  Alvaro, if you want to take care of that,
> that's fine by me.  I could do it, but not before next week.

I got to look at that, now that the minor releases have been tagged,
and the change has no conflicts down to 9.3.  9.2 needed a slight
tweak, though, but it seemed fine as well.  (Tested the build on all
branches.)  So done all the way down, sticking with the new no-warning
policy for all the buildable branches.
--
Michael




Re: Allow file inclusion in pg_hba and pg_ident files

2022-11-08 Thread Michael Paquier
On Tue, Nov 08, 2022 at 10:04:16AM +0900, Michael Paquier wrote:
> CF bot unhappy as I have messed up with rules.out.  Rebased.  I have
> removed the restriction on MAXPGPATH in AbsoluteConfigLocation() in
> 0001, while on it.  The absolute paths built on GUC or ident
> inclusions are the same.

Rebased after 6bbd8b7g that is an equivalent of the previous 0001.
Julien, please note that this is waiting on author for now.  What do
you think about the now-named v18-0001 and the addition of an
ErrorContextCallback to provide more information about the list of
included files on error?
--
Michael
From 50441282e6b298ee737e6e1a490d77baedfe5d69 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Mon, 7 Nov 2022 13:35:44 +0900
Subject: [PATCH v18 1/2] Invent open_auth_file() in hba.c, to refactor auth
 file opening

This adds a check on the recursion depth when including auth files,
something that has never been done when processing '@' files for
database and user name lists in pg_hba.conf.
---
 src/include/libpq/hba.h  |   4 +-
 src/backend/libpq/hba.c  | 100 ++-
 src/backend/utils/adt/hbafuncs.c |  18 ++
 3 files changed, 79 insertions(+), 43 deletions(-)

diff --git a/src/include/libpq/hba.h b/src/include/libpq/hba.h
index 7ad227d34a..a84a5f0961 100644
--- a/src/include/libpq/hba.h
+++ b/src/include/libpq/hba.h
@@ -177,7 +177,9 @@ extern int	check_usermap(const char *usermap_name,
 extern HbaLine *parse_hba_line(TokenizedAuthLine *tok_line, int elevel);
 extern IdentLine *parse_ident_line(TokenizedAuthLine *tok_line, int elevel);
 extern bool pg_isblank(const char c);
+extern FILE *open_auth_file(const char *filename, int elevel, int depth,
+			char **err_msg);
 extern MemoryContext tokenize_auth_file(const char *filename, FILE *file,
-		List **tok_lines, int elevel);
+		List **tok_lines, int elevel, int depth);
 
 #endif			/* HBA_H */
diff --git a/src/backend/libpq/hba.c b/src/backend/libpq/hba.c
index a9f87ab5bf..d8c0b585e5 100644
--- a/src/backend/libpq/hba.c
+++ b/src/backend/libpq/hba.c
@@ -117,7 +117,8 @@ static const char *const UserAuthName[] =
 
 
 static List *tokenize_inc_file(List *tokens, const char *outer_filename,
-			   const char *inc_filename, int elevel, char **err_msg);
+			   const char *inc_filename, int elevel,
+			   int depth, char **err_msg);
 static bool parse_hba_auth_opt(char *name, char *val, HbaLine *hbaline,
 			   int elevel, char **err_msg);
 static int	regcomp_auth_token(AuthToken *token, char *filename, int line_num,
@@ -414,7 +415,7 @@ regexec_auth_token(const char *match, AuthToken *token, size_t nmatch,
  */
 static List *
 next_field_expand(const char *filename, char **lineptr,
-  int elevel, char **err_msg)
+  int elevel, int depth, char **err_msg)
 {
 	char		buf[MAX_TOKEN];
 	bool		trailing_comma;
@@ -431,7 +432,7 @@ next_field_expand(const char *filename, char **lineptr,
 		/* Is this referencing a file? */
 		if (!initial_quote && buf[0] == '@' && buf[1] != '\0')
 			tokens = tokenize_inc_file(tokens, filename, buf + 1,
-	   elevel, err_msg);
+	   elevel, depth + 1, err_msg);
 		else
 			tokens = lappend(tokens, make_auth_token(buf, initial_quote));
 	} while (trailing_comma && (*err_msg == NULL));
@@ -459,6 +460,7 @@ tokenize_inc_file(List *tokens,
   const char *outer_filename,
   const char *inc_filename,
   int elevel,
+  int depth,
   char **err_msg)
 {
 	char	   *inc_fullname;
@@ -468,24 +470,18 @@ tokenize_inc_file(List *tokens,
 	MemoryContext linecxt;
 
 	inc_fullname = AbsoluteConfigLocation(inc_filename, outer_filename);
+	inc_file = open_auth_file(inc_fullname, elevel, depth, err_msg);
 
-	inc_file = AllocateFile(inc_fullname, "r");
 	if (inc_file == NULL)
 	{
-		int			save_errno = errno;
-
-		ereport(elevel,
-(errcode_for_file_access(),
- errmsg("could not open secondary authentication file \"@%s\" as \"%s\": %m",
-		inc_filename, inc_fullname)));
-		*err_msg = psprintf("could not open secondary authentication file \"@%s\" as \"%s\": %s",
-			inc_filename, inc_fullname, strerror(save_errno));
+		/* error already logged */
 		pfree(inc_fullname);
 		return tokens;
 	}
 
 	/* There is possible recursion here if the file contains @ */
-	linecxt = tokenize_auth_file(inc_fullname, inc_file, _lines, elevel);
+	linecxt = tokenize_auth_file(inc_fullname, inc_file, _lines, elevel,
+ depth);
 
 	FreeFile(inc_file);
 	pfree(inc_fullname);
@@ -521,6 +517,59 @@ tokenize_inc_file(List *tokens,
 	return tokens;
 }
 
+/*
+ * open_auth_file
+ *		Open the given file.
+ *
+ * filename: the absolute path to the target file
+ * elevel: message logging level
+ * depth: recursion level of the file opened.
+ * err_msg: details about the error.
+ *
+ * Return value is the opened file.  On error, returns NULL with details
+ * about the error stored in "err_msg".
+ */
+FILE *
+open_auth_file(const char *filename, int elevel, 

Re: EINTR in ftruncate()

2022-11-08 Thread Thomas Munro
On Wed, Aug 17, 2022 at 7:51 AM Thomas Munro  wrote:
> On Sat, Jul 16, 2022 at 5:18 PM Thomas Munro  wrote:
> > On Sat, Jul 16, 2022 at 1:28 AM Tom Lane  wrote:
> > > Thomas Munro  writes:
> > > > On Fri, Jul 15, 2022 at 9:34 AM Tom Lane  wrote:
> > > >> (Someday we oughta go ahead and make our Windows signal API look more
> > > >> like POSIX, as I suggested back in 2015.  I'm still not taking
> > > >> point on that, though.)
> > >
> > > > For the sigprocmask() part, here's a patch that passes CI.  Only the
> > > > SIG_SETMASK case is actually exercised by our current code, though.
> > >
> > > Passes an eyeball check, but I can't actually test it.
> >
> > Thanks.  Pushed.
> >
> > I'm not brave enough to try to write a replacement sigaction() yet,
> > but it does appear that we could rip more ugliness and inconsistencies
> > that way, eg sa_mask.
>
> Here's a draft patch that adds a minimal sigaction() implementation
> for Windows.  It doesn't implement stuff we're not using, for example
> sa_sigaction functions, but it has the sa_mask feature so we can
> harmonize the stuff that I believe you were talking about.

Pushed.

As discussed before, a much better idea would probably be to use
latch-based wakeup instead of putting postmaster logic in signal
handlers, but in the meantime this gets rid of the extra
Windows-specific noise.




Re: [PATCH] Teach pg_waldump to extract FPIs from the WAL

2022-11-08 Thread David Christensen
Enclosed is v6, which squashes your refactor and adds the additional
recent suggestions.

Thanks!


v6-0001-Teach-pg_waldump-to-extract-FPIs-from-the-WAL-str.patch
Description: Binary data


Re: Crash in BRIN minmax-multi indexes

2022-11-08 Thread Justin Pryzby
On Mon, Oct 03, 2022 at 10:29:38PM +0200, Tomas Vondra wrote:
> I'll get this fixed in a couple days. Considering the benign nature of
> this issue (unnecessary assert) I'm not going to rush.

Is this still an outstanding issue ?

-- 
Justin




Re: psql: Add command to use extended query protocol

2022-11-08 Thread Corey Huinker
>
>
> Btw., this also allows doing things like
>
> SELECT $1, $2
> \bind '1' '2' \g
> \bind '3' '4' \g
>

That's one of the things I was hoping for. Very cool.


>
> This isn't a prepared statement being reused, but it relies on the fact
> that psql \g with an empty query buffer resends the previous query.
> Still kind of neat.


Yeah, if they wanted a prepared statement there's nothing stopping them.

Review:

Patch applies, tests pass.

Code is quite straightforward.

As for the docs, they're very clear and probably sufficient as-is, but I
wonder if we should we explicitly state that the bind-state and bind
parameters do not "stay around" after the query is executed? Suggestions in
bold:

 This command causes the extended query protocol (see ) to be used, unlike normal
 psql operation, which uses the simple
 query protocol.  *Extended query protocol will be used* *even if
no parameters are specified, s*o this command can be useful to test the
extended
 query protocol from psql. *This command affects only the next
query executed, all subsequent queries will use the regular query protocol
by default.*

Tests seem comprehensive. I went looking for the TAP test that this would
have replaced, but found none, and it seems the only test where we exercise
PQsendQueryParams is libpq_pipeline.c, so these tests are a welcome
addition.

Aside from the possible doc change, it looks ready to go.


Re: Commit fest 2022-11

2022-11-08 Thread Justin Pryzby
On Thu, Nov 03, 2022 at 07:43:03PM -0500, Justin Pryzby wrote:
> On Thu, Nov 03, 2022 at 11:33:57AM +0900, Ian Lawrence Barwick wrote:
> > 2022年11月2日(水) 19:10 Greg Stark :
> > > On Tue, 1 Nov 2022 at 06:56, Michael Paquier  wrote:
> > >
> > > > Two people showing up to help is really great, thanks!  I'll be around
> > > > as well this month, so I'll do my share of patches, as usual.
> > >
> > > Fwiw I can help as well -- starting next week. I can't do much this week 
> > > though.
> > >
> > > I would suggest starting with the cfbot to mark anything that isn't
> > > applying cleanly and passing tests (and looking for more than design
> > > feedback) as Waiting on Author and reminding the author that it's
> > > commitfest time and a good time to bring the patch into a clean state.
> > 
> > Sounds like a plan; I'll make a start on that today/tomorrow as I have
> > some time.
> 
> If I'm not wrong, Jacob used the CF app to bulk-mail people about
> patches not applying and similar things.  That seemed to work well, and
> doesn't require sending mails to dozens of threads.

If my script is not wrong, these patches add TAP tests, but don't update
the requisite ./meson.build file.  It seems like it'd be reasonable to
set them all as WOA until that's done.

$ for a in `git branch -a |sort |grep commitfest/40`; do \
    : echo "$a..."; \
    x=`git log -1 --compact-summary "$a"`; \
    echo "$x" |grep '/t/.*pl.*new' >/dev/null || continue; \
    echo "$x" |grep -Fw meson >/dev/null && continue; \
    git log -1 --oneline "$a"; \
  done
... [CF 40/3558] Allow file inclusion in pg_hba and pg_ident files
... [CF 40/3628] Teach pg_waldump to extract FPIs from the WAL stream
... [CF 40/3646] Skip replicating the tables specified in except table option
... [CF 40/3663] Switching XLog source from archive to streaming when primary 
available
... [CF 40/3670] pg_rewind: warn when checkpoint hasn't happened after promotion
... [CF 40/3729] Testing autovacuum wraparound
... [CF 40/3877] vacuumlo: add test to vacuumlo for test coverage
... [CF 40/3985] TDE key management patches

-- 
Justin




Re: libpq support for NegotiateProtocolVersion

2022-11-08 Thread Jacob Champion
On 11/8/22 00:40, Peter Eisentraut wrote:
> On 02.11.22 20:02, Jacob Champion wrote:
>> This new code path doesn't go through the message length checks that are
>> done for the 'R' and 'E' cases, and pqGetNegotiateProtocolVersion3()
>> doesn't take the message length to know where to stop anyway, so a
>> misbehaving server can chew up client resources.
> 
> Fixed in new patch.

pqGetNegotiateProtocolVersion3() is still ignoring the message length,
though; it won't necessarily stop at the message boundary.

> We could add negotiation in the future, but then we'd have to first have 
> a concrete case of something to negotiate about.  For example, if we 
> added an optional performance feature into the protocol, then one could 
> negotiate by falling back to not using that.  But for the kinds of 
> features I'm thinking about right now (column encryption), you can't 
> proceed if the feature is not supported.  So I think this would need to 
> be considered case by case.

I guess I'm wondering about the definition of "minor" version if the
client treats an increment as incompatible by default. But that's a
discussion for the future, and this patch is just improving the existing
behavior, so I'll pipe down and watch.

>> I think the documentation on NegotiateProtocolVersion (not introduced in
>> this patch) is misleading/wrong; it says that the version number sent
>> back is the "newest minor protocol version supported by the server for
>> the major protocol version requested by the client" which doesn't seem
>> to match the actual usage seen here.
> 
> I don't follow.  If libpq sends a protocol version of 3.1, then the 
> server responds by saying it supports only 3.0.  What are you seeing?

I see what you've described on my end, too. The sentence I quoted seemed
to imply that the server should respond with only the minor version (the
least significant 16 bits). I think it should probably just say "newest
protocol version" in the docs.

Thanks,
--Jacob




Re: WIN32 pg_import_system_collations

2022-11-08 Thread Juan José Santamaría Flecha
On Mon, Nov 7, 2022 at 4:08 PM Peter Eisentraut <
peter.eisentr...@enterprisedb.com> wrote:

>
> This looks pretty good to me.  The refactoring of the non-Windows parts
> makes sense.  The Windows parts look reasonable on manual inspection,
> but again, I don't have access to Windows here, so someone else should
> also look it over.
>
I was going to say that at least it is getting tested on the CI, but I
have found out that meson changes version(). That is fixed in this version.


> A small style issue: Change return (TRUE) to return TRUE.
>
Fixed.


> The code
>
> +   if (strlen(localebuf) == 5 && localebuf[2] == '-')
>
> might be too specific.  At least on some POSIX systems, I have seen
> locales with a three-letter language name.  Maybe you should look with
> strchr() and not be too strict about the exact position.
>
Ok, in this version the POSIX alias is created unconditionally.


> For the test patch, why is a separate test for non-UTF8 needed on
> Windows.  Does the UTF8 one not work?
>
Windows locales will retain their CP_ACP encoding unless you change the OS
code page to UTF8, which is still experimental [1].


> +   version() !~ 'Visual C\+\+'
>
> This probably won't work for MinGW.
>
When I proposed this patch it wouldn't have worked because of the
project's Windows minimum version requirement; now it should work in MinGW.
It actually doesn't, because most locales are failing with "skipping locale
with unrecognized encoding", but checking what's wrong
with pg_get_encoding_from_locale() in MinGW is a subject for another thread.

[1]
https://stackoverflow.com/questions/56419639/what-does-beta-use-unicode-utf-8-for-worldwide-language-support-actually-do

Regards,

Juan José Santamaría Flecha


v8-0001-WIN32-pg_import_system_collations.patch
Description: Binary data


v8-0002-Add-collate.windows.win1252-test.patch
Description: Binary data


Re: [PATCH] Teach pg_waldump to extract FPIs from the WAL

2022-11-08 Thread David Christensen
On Tue, Nov 8, 2022 at 4:45 PM Justin Pryzby  wrote:
>
> On Mon, Nov 07, 2022 at 05:01:01PM -0600, David Christensen wrote:
> > Hi Justin et al,
> >
> > Enclosed is v5 of this patch which now passes the CirrusCI checks for
> > all supported OSes. I went ahead and reworked the test a bit so it's a
> > little more amenable to the OS-agnostic approach for testing.
>
> Great, thanks.
>
> This includes the changes that I'd started a few months ago.
> Plus adding the test which was missing for meson.

Cool, will review, thanks.

> +format: 
> LSN.TSOID.DBOID.RELNODE.BLKNO
>
> I'd prefer if the abbreviations were "reltablespace" and "datoid"

Sure, no issues there.

> Also, should the test case call pg_relation_filenode() rather than using
> relfilenode directly ?  Is it a problem that the test code assumes
> pagesize=8192 ?

Both good points. Is pagesize just exposed via
`current_setting('block_size')` or is there a different approach?
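
For reference, something like this could fetch both values at run time
instead of hard-coding them (just a sketch; the table name is a placeholder):

    -- block size as configured for the cluster
    SELECT current_setting('block_size')::int AS block_size;

    -- relfilenode of the relation under test
    SELECT pg_relation_filenode('test_table'::regclass) AS relfilenode;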

David




Re: [PATCH] Teach pg_waldump to extract FPIs from the WAL

2022-11-08 Thread Justin Pryzby
On Mon, Nov 07, 2022 at 05:01:01PM -0600, David Christensen wrote:
> Hi Justin et al,
> 
> Enclosed is v5 of this patch which now passes the CirrusCI checks for
> all supported OSes. I went ahead and reworked the test a bit so it's a
> little more amenable to the OS-agnostic approach for testing.

Great, thanks.

This includes the changes that I'd started a few months ago.
Plus adding the test which was missing for meson.

+format: 
LSN.TSOID.DBOID.RELNODE.BLKNO

I'd prefer if the abbreviations were "reltablespace" and "datoid"

Also, should the test case call pg_relation_filenode() rather than using
relfilenode directly ?  Is it a problem that the test code assumes
pagesize=8192 ?

-- 
Justin
>From d4cb23bc2f0edcc65b0bef75fd2e8d6e5aeda21f Mon Sep 17 00:00:00 2001
From: David Christensen 
Date: Wed, 20 Apr 2022 19:59:35 -0500
Subject: [PATCH 1/2] Teach pg_waldump to extract FPIs from the WAL stream

Extracts full-page images from the WAL stream into a target directory,
which must be empty or not exist.  These images are subject to the same
filtering rules as normal display in pg_waldump, which means that you
can isolate the full page writes to a target relation, among other
things.

Files are saved with the filename:  with
formatting to make things somewhat sortable; for instance:

-01C0.1663.1.6117.0
-01000150.1664.0.6115.0
-010001E0.1664.0.6114.0
-01000270.1663.1.6116.0
-01000300.1663.1.6113.0
-01000390.1663.1.6112.0
-01000420.1663.1.8903.0
-010004B0.1663.1.8902.0
-01000540.1663.1.6111.0
-010005D0.1663.1.6110.0

If the FPI comes from a fork other than the main fork, the fork name
will be appended on the output file name; e.g.:

-014A4758.1663.1.12864.0_vm

It's noteworthy that the raw block images do not have the current LSN
stored with them in the WAL stream (as would be true for on-heap
versions of the blocks), nor would the checksum be updated in
them (though WAL itself has checksums, so there is some protection
there).  As such there are two versions of this functionality, one which
returns the raw page as it appears in the WAL (--save-fpi) and one
which applies the updated pd_lsn and pd_checksum (--fixup-fpi).

These images could be loaded/inspected via `pg_read_binary_file()` and
used in the `pageinspect` suite of tools to perform detailed analysis on
the pages in question, based on historical information, and may come in
handy for forensics work.
---
 doc/src/sgml/ref/pg_waldump.sgml  |  79 +++
 src/bin/pg_waldump/pg_waldump.c   | 151 +-
 src/bin/pg_waldump/t/002_save_fullpage.pl | 137 
 3 files changed, 366 insertions(+), 1 deletion(-)
 create mode 100644 src/bin/pg_waldump/t/002_save_fullpage.pl

diff --git a/doc/src/sgml/ref/pg_waldump.sgml b/doc/src/sgml/ref/pg_waldump.sgml
index d559f091e5f..cc3ce918905 100644
--- a/doc/src/sgml/ref/pg_waldump.sgml
+++ b/doc/src/sgml/ref/pg_waldump.sgml
@@ -240,6 +240,85 @@ PostgreSQL documentation

  
 
+ 
+   -W save_path
+   --save-fpi=save_path
+   -X save_path
+   --fixup-fpi=save_path
+   
+   
+Save full page images seen in the WAL stream to the
+save_path directory, which will be created
+if it does not exist.  The images saved will be subject to the same
+filtering and limiting criteria as display records, but in this
+mode pg_waldump will not output any other
+information.
+   
+   
+If invoked using
+the -X/--fixup-fpi
+option, this page image will include the pd_lsn of
+the replayed record rather than the raw page image; as well,
+the pd_checksum field will be updated if it had
+previously existed.
+   
+   
+The page images will be saved with the file
+format: LSN.TSOID.DBOID.RELNODE.BLKNOFORK
+
+The dot-separated components are (in order):
+
+
+ 
+  
+   
+Component
+Description
+   
+  
+
+  
+   
+LSN
+The LSN of the record with this block, formatted
+ as two 8-character hexadecimal numbers %08X-%08X
+   
+
+   
+TSOID
+tablespace OID for the block
+   
+
+   
+DBOID
+database OID for the block
+   
+
+   
+RELNODE
+relnode id for the block
+   
+
+   
+BLKNO
+the block number of this block
+   
+
+   
+FORK
+
+ if coming from the main fork, will be empty, otherwise will be
+ one of _fsm, _vm,
+ or _init.
+
+   
+  
+ 
+
+   
+   
+ 
+
  
   -x xid
   --xid=xid
diff --git 

Re: Slow standby snapshot

2022-11-08 Thread Thomas Munro
On Sat, Sep 17, 2022 at 6:27 PM Simon Riggs
 wrote:
> I've cleaned up the comments and used a #define for the constant.
>
> IMHO this is ready for commit. I will add it to the next CF.

FYI This had many successful cfbot runs but today it blew up on
Windows when the assertion TransactionIdPrecedesOrEquals(safeXid,
snap->xmin) failed:

https://api.cirrus-ci.com/v1/artifact/task/5311549010083840/crashlog/crashlog-postgres.exe_1c40_2022-11-08_00-20-28-110.txt




Re: Suppressing useless wakeups in walreceiver

2022-11-08 Thread Nathan Bossart
On Tue, Nov 08, 2022 at 09:45:40PM +1300, Thomas Munro wrote:
> On Tue, Nov 8, 2022 at 9:20 PM Bharath Rupireddy
>  wrote:
>> Thanks. Do we need a similar wakeup approach for logical replication
>> workers in worker.c? Or is it okay that the nap time is 1sec there?
> 
> Yeah, I think so.  At least for its idle/nap case (waiting for flush
> is also a technically interesting case, but harder, and applies to
> non-idle systems so the polling is a little less offensive).

Bharath, are you planning to pick this up?  If not, I can probably spend
some time on it.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Suppressing useless wakeups in walreceiver

2022-11-08 Thread Nathan Bossart
On Tue, Nov 08, 2022 at 08:46:26PM +1300, Thomas Munro wrote:
> On Tue, Nov 8, 2022 at 3:20 PM Thomas Munro  wrote:
>> This looks pretty good to me.  Thanks for picking it up!  I can live
>> with the change to use a global variable; it seems we couldn't quite
>> decide what the scope was for moving stuff into a struct (ie various
>> pre-existing stuff representing the walreceiver's state), but I'm OK
>> with considering that as a pure refactoring project for a separate
>> thread.  That array should be "static", though.
> 
> And with that change and a pgindent, pushed.

Thanks!

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Non-emergency patch for bug #17679

2022-11-08 Thread Tom Lane
Andres Freund  writes:
> On 2022-11-08 11:28:08 -0500, Tom Lane wrote:
>> Hence, the attached reverts everything 4ab5dae94 did to this function,
>> and most of 0e758ae89 too, and instead makes IsBinaryUpgrade an
>> additional reason to take the immediate-unlink path.

> I wonder if it's worth aiming slightly higher. There's plenty duplicated code
> between the first segment handling and the loop body. Perhaps the if at the
> top just should decide whether to unlink the first segment or not, and we then
> check that in the body of the loop for segno == 0?

I don't care for that.  I think the point here is precisely that
we want behavior A for the first segment and behavior B for the
remaining ones, and so I'd prefer to keep the code that does A
and the code that does B distinct.  It was a misguided attempt to
share that code that got us into trouble here in the first place.
Moreover, any future changes to either behavior will be that much
harder if we combine the implementations.

regards, tom lane




Re: Non-emergency patch for bug #17679

2022-11-08 Thread Andres Freund
Hi,

On 2022-11-08 11:28:08 -0500, Tom Lane wrote:
> In the release team's discussion leading up to commit 0e758ae89,
> Andres opined that what commit 4ab5dae94 had done to mdunlinkfork
> was a mess, and I concur.  It invented an entirely new code path
> through that function, and required two different behaviors from the
> segment-deletion loop.  I think a very straight line can be drawn
> between that extra complexity and the introduction of a nasty bug.
> It's all unnecessary too, because AFAICS all we really need is to
> apply the pre-existing behavior for temp tables and REDO mode
> to binary-upgrade mode as well.

I'm not sure I understand the current code. In the binary upgrade case we
currently *do* truncate the file in the else of "Delete or truncate the first
segment.", then again truncate it in the loop and then unlink it, right?


> Hence, the attached reverts everything 4ab5dae94 did to this function,
> and most of 0e758ae89 too, and instead makes IsBinaryUpgrade an
> additional reason to take the immediate-unlink path.
> 
> Barring objections, I'll push this after the release freeze lifts.

I wonder if it's worth aiming slightly higher. There's plenty duplicated code
between the first segment handling and the loop body. Perhaps the if at the
top just should decide whether to unlink the first segment or not, and we then
check that in the body of the loop for segno == 0?

Greetings,

Andres Freund




Re: User functions for building SCRAM secrets

2022-11-08 Thread Peter Eisentraut

On 04.11.22 21:39, Jacob Champion wrote:

It seems to me that the use case here is extremely similar to the one
being tackled by Peter E's client-side encryption [1]. People want to
write SQL to perform a cryptographic operation using a secret, and
then send the resulting ciphertext (or in this case, a one-way hash)
to the server, but ideally the server should not actually have the
secret.


It might be possible, but it's a bit of a reach.  For instance, there 
are no keys and no decryption associated with this kind of operation.



I don't think it's helpful for me to try to block progress on this
patchset behind the other one. But is there a way for me to help this
proposal skate in the same general direction? Could Peter's encryption
framework expand to fit this case in the future?


We already have support in libpq for doing this (PQencryptPasswordConn()).
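
For illustration, a minimal client-side sketch of that existing API; the
connection string, role name, and password below are placeholders, and the
server never sees the cleartext password:

    #include <stdio.h>
    #include <libpq-fe.h>

    int
    main(void)
    {
        PGconn *conn = PQconnectdb("dbname=postgres");  /* placeholder */
        char   *secret;

        if (PQstatus(conn) != CONNECTION_OK)
        {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
            return 1;
        }

        /* Build a SCRAM-SHA-256 secret locally, honoring the server's settings. */
        secret = PQencryptPasswordConn(conn, "new-password", "someuser",
                                       "scram-sha-256");
        if (secret == NULL)
        {
            fprintf(stderr, "%s", PQerrorMessage(conn));
            return 1;
        }

        /* The secret, not the password, is what goes into ALTER ROLE. */
        printf("ALTER ROLE someuser PASSWORD '%s';\n", secret);

        PQfreemem(secret);
        PQfinish(conn);
        return 0;
    }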




Re: [PATCHES] Post-special page storage TDE support

2022-11-08 Thread David Christensen
Looking into some CF bot failures which didn't show up locally.  Will
send a v3 when resolved.




Re: Introduce "log_connection_stages" setting.

2022-11-08 Thread Sergey Dudoladov
Hi hackers,

I've sketched an initial patch version; feedback is welcome.

Regards,
Sergey Dudoladov
From be2e6b5c2d6fff1021f52f150b4d849dfbd26ec7 Mon Sep 17 00:00:00 2001
From: Sergey Dudoladov 
Date: Tue, 8 Nov 2022 18:56:26 +0100
Subject: [PATCH] Introduce 'log_connection_messages'

This patch removes 'log_connections' and 'log_disconnections'
in favor of 'log_connection_messages', thereby introducing incompatibility

Author: Sergey Dudoladov

Reviewed-by:

Discussion:
https://www.postgresql.org/message-id/flat/CAA8Fd-qCB96uwfgMKrzfNs32mqqysi53yZFNVaRNJ6xDthZEgA%40mail.gmail.com
---
 doc/src/sgml/config.sgml  | 80 +++--
 src/backend/commands/variable.c   | 88 +++
 src/backend/libpq/auth.c  |  5 +-
 src/backend/postmaster/postmaster.c   |  5 +-
 src/backend/tcop/postgres.c   | 14 +--
 src/backend/utils/init/postinit.c |  2 +-
 src/backend/utils/misc/guc_tables.c   | 29 +++---
 src/backend/utils/misc/postgresql.conf.sample |  5 +-
 src/include/postgres.h|  5 ++
 src/include/postmaster/postmaster.h   |  1 -
 src/include/utils/guc_hooks.h |  3 +
 src/test/authentication/t/001_password.pl |  2 +-
 src/test/kerberos/t/001_auth.pl   |  2 +-
 src/test/ldap/t/001_auth.pl   |  2 +-
 src/test/recovery/t/013_crash_restart.pl  |  2 +-
 src/test/recovery/t/022_crash_temp_files.pl   |  2 +-
 src/test/recovery/t/032_relfilenode_reuse.pl  |  2 +-
 src/test/ssl/t/SSL/Server.pm  |  2 +-
 src/tools/ci/pg_ci_base.conf  |  3 +-
 19 files changed, 182 insertions(+), 72 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 559eb898a9..ab9d118c12 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -140,7 +140,7 @@
  An example of what this file might look like is:
 
 # This is a comment
-log_connections = yes
+log_connection_messages = all
 log_destination = 'syslog'
 search_path = '"$user", public'
 shared_buffers = 128MB
@@ -335,7 +335,7 @@ UPDATE pg_settings SET setting = reset_val WHERE name = 'configuration_parameter
passed to the postgres command via the
-c command-line parameter.  For example,
 
-postgres -c log_connections=yes -c log_destination='syslog'
+postgres -c log_connection_messages=all -c log_destination='syslog'
 
Settings provided in this way override those set via
postgresql.conf or ALTER SYSTEM,
@@ -6978,23 +6978,65 @@ local0.*/var/log/postgresql
   
  
 
- 
-  log_connections (boolean)
+ 
+  log_connection_messages (string)
   
-   log_connections configuration parameter
+   log_connection_messages configuration parameter
   
   
   

-Causes each attempted connection to the server to be logged,
-as well as successful completion of both client authentication (if
-necessary) and authorization.
+Causes stages of each attempted connection to the server to be logged. Example: authorized,disconnected.
+The default is the empty string meaning nothing is logged.
 Only superusers and users with the appropriate SET
 privilege can change this parameter at session start,
 and it cannot be changed at all within a session.
-The default is off.

 
+
+ Connection messages
+ 
+  
+  
+  
+   
+Name
+Description
+   
+  
+  
+   
+received
+Logs reception of a connection. At this point a connection has been received, but no further work has been done:
+Postgres is about to start interacting with a client.
+   
+
+   
+authenticated
+Logs the original identity that an authentication method employs to identify a user. In most cases, the identity string equals the PostgreSQL username,
+but some third-party authentication methods may alter the original user identifier before the server stores it. Failed authentication is always logged regardless of the value of this setting.
+   
+
+   
+authorized
+Logs successful completion of authorization. At this point the connection has been fully established.
+   
+
+   
+disconnected
+Logs session termination. The log output provides information similar to authorized,
+plus the duration of the session.
+   
+
+   
+all
+A convenience alias. When present, must be the only value in the list.
+   
+
+  
+ 
+
+

 
  Some client programs, like psql, attempt
@@ -7006,26 +7048,6 @@ local0.*/var/log/postgresql
   
  
 
- 
-  log_disconnections (boolean)
-  
-   log_disconnections configuration parameter
-  
-  
-  
-   
-Causes session terminations to 

Re: Add connection active, idle time to pg_stat_activity

2022-11-08 Thread Sergey Dudoladov
Hello hackers,

Is there anything we can do to facilitate merging of this patch?
It has been in the "ready-for-committer" state for 3 commitfests in a row now.

We would appreciate it if the patch made it into version 16: the need to
monitor idle-in-transaction connections is very real for us.

Regards,
Sergey Dudoladov




Out-of-date clang/llvm version lists in PGAC_LLVM_SUPPORT

2022-11-08 Thread Tom Lane
I happened to notice that these lists of supported versions haven't
been updated in a good long time:

  PGAC_PATH_PROGS(LLVM_CONFIG, llvm-config llvm-config-7 llvm-config-6.0 
llvm-config-5.0 llvm-config-4.0 llvm-config-3.9)

  PGAC_PATH_PROGS(CLANG, clang clang-7 clang-6.0 clang-5.0 clang-4.0 clang-3.9)

Given the lack of complaints, it seems likely that nobody is relying
on these.  Maybe we should just nuke them?  If not, I suppose we
better add 8 through 15.

I may be missing it, but it doesn't look like meson.build has any
equivalent lists.  So that might be an argument for getting rid
of the lists here?

regards, tom lane




Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-08 Thread Peter Geoghegan
On Fri, Nov 4, 2022 at 8:25 AM Masahiko Sawada  wrote:
> For parallel heap pruning, multiple workers will insert key-value
> pairs to the radix tree concurrently. The simplest solution would be a
> single lock to protect writes but the performance will not be good.
> Another solution would be that we can divide the tables into multiple
> ranges so that keys derived from TIDs are not conflicted with each
> other and have parallel workers process one or more ranges. That way,
> parallel vacuum workers can build *sub-trees* and the leader process
> can merge them. In use cases of lazy vacuum, since the write phase and
> read phase are separated the readers don't need to worry about
> concurrent updates.

I think that the VM snapshot concept can eventually be used to
implement parallel heap pruning. Since every page that will become a
scanned_pages is known right from the start with VM snapshots, it will
be relatively straightforward to partition these pages into distinct
ranges with an equal number of pages, one per worker planned. The VM
snapshot structure can also be used for I/O prefetching, which will be
more important with parallel heap pruning (and with aio).
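
As a rough sketch of that partitioning step, assuming the VM snapshot
already yields the number of scanned pages (the names below are
illustrative, not from any existing patch):

    typedef struct WorkerRange
    {
        int start;   /* index into the snapshot's scanned_pages[] */
        int count;   /* number of pages assigned to this worker */
    } WorkerRange;

    /*
     * Split n scanned pages into nworkers contiguous ranges of nearly equal
     * size; the first (n % nworkers) workers each get one extra page.
     */
    static void
    partition_scanned_pages(int n, int nworkers, WorkerRange *ranges)
    {
        int base = n / nworkers;
        int extra = n % nworkers;
        int pos = 0;

        for (int i = 0; i < nworkers; i++)
        {
            ranges[i].start = pos;
            ranges[i].count = base + (i < extra ? 1 : 0);
            pos += ranges[i].count;
        }
    }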

Working off of an immutable structure that describes which pages to
process right from the start is naturally easy to work with, in
general. We can "reorder work" flexibly (i.e. process individual
scanned_pages in any order that is convenient). Another example is
"changing our mind" about advancing relfrozenxid when it turns out
that we maybe should have decided to do that at the start of VACUUM
[1]. Maybe the specific "changing our mind" idea will turn out to not
be a very useful idea, but it is at least an interesting and thought
provoking concept.

[1] 
https://postgr.es/m/CAH2-WzkQ86yf==mgAF=cq0qelrwkx3htlw9qo+qx3zbwjjk...@mail.gmail.com
-- 
Peter Geoghegan




Non-emergency patch for bug #17679

2022-11-08 Thread Tom Lane
In the release team's discussion leading up to commit 0e758ae89,
Andres opined that what commit 4ab5dae94 had done to mdunlinkfork
was a mess, and I concur.  It invented an entirely new code path
through that function, and required two different behaviors from the
segment-deletion loop.  I think a very straight line can be drawn
between that extra complexity and the introduction of a nasty bug.
It's all unnecessary too, because AFAICS all we really need is to
apply the pre-existing behavior for temp tables and REDO mode
to binary-upgrade mode as well.

Hence, the attached reverts everything 4ab5dae94 did to this function,
and most of 0e758ae89 too, and instead makes IsBinaryUpgrade an
additional reason to take the immediate-unlink path.

Barring objections, I'll push this after the release freeze lifts.

regards, tom lane

diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index b1a2cb575d..0f642df3c1 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -263,6 +263,12 @@ mdcreate(SMgrRelation reln, ForkNumber forknum, bool isRedo)
  * The fact that temp rels and regular rels have different file naming
  * patterns provides additional safety.
  *
+ * We also don't do it while performing a binary upgrade.  There is no reuse
+ * hazard in that case, since after a crash or even a simple ERROR, the
+ * upgrade fails and the whole cluster must be recreated from scratch.
+ * Furthermore, it is important to remove the files from disk immediately,
+ * because we may be about to reuse the same relfilenumber.
+ *
  * All the above applies only to the relation's main fork; other forks can
  * just be removed immediately, since they are not needed to prevent the
  * relfilenumber from being recycled.  Also, we do not carefully
@@ -319,14 +325,15 @@ mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forknum, bool isRedo)
 {
 	char	   *path;
 	int			ret;
-	BlockNumber segno = 0;
 
 	path = relpath(rlocator, forknum);
 
 	/*
-	 * Delete or truncate the first segment.
+	 * Truncate and then unlink the first segment, or just register a request
+	 * to unlink it later, as described in the comments for mdunlink().
 	 */
-	if (isRedo || forknum != MAIN_FORKNUM || RelFileLocatorBackendIsTemp(rlocator))
+	if (isRedo || IsBinaryUpgrade || forknum != MAIN_FORKNUM ||
+		RelFileLocatorBackendIsTemp(rlocator))
 	{
 		if (!RelFileLocatorBackendIsTemp(rlocator))
 		{
@@ -351,7 +358,6 @@ mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forknum, bool isRedo)
 ereport(WARNING,
 		(errcode_for_file_access(),
 		 errmsg("could not remove file \"%s\": %m", path)));
-			segno++;
 		}
 	}
 	else
@@ -359,42 +365,25 @@ mdunlinkfork(RelFileLocatorBackend rlocator, ForkNumber forknum, bool isRedo)
 		/* Prevent other backends' fds from holding on to the disk space */
 		ret = do_truncate(path);
 
-		/*
-		 * Except during a binary upgrade, register request to unlink first
-		 * segment later, rather than now.
-		 *
-		 * If we're performing a binary upgrade, the dangers described in the
-		 * header comments for mdunlink() do not exist, since after a crash or
-		 * even a simple ERROR, the upgrade fails and the whole new cluster
-		 * must be recreated from scratch. And, on the other hand, it is
-		 * important to remove the files from disk immediately, because we may
-		 * be about to reuse the same relfilenumber.
-		 */
-		if (!IsBinaryUpgrade)
-		{
-			register_unlink_segment(rlocator, forknum, 0 /* first seg */ );
-			segno++;
-		}
+		/* Register request to unlink first segment later */
+		register_unlink_segment(rlocator, forknum, 0 /* first seg */ );
 	}
 
 	/*
-	 * Delete any remaining segments (we might or might not have dealt with
-	 * the first one above).
+	 * Delete any additional segments.
 	 */
 	if (ret >= 0)
 	{
 		char	   *segpath = (char *) palloc(strlen(path) + 12);
+		BlockNumber segno;
 
 		/*
 		 * Note that because we loop until getting ENOENT, we will correctly
 		 * remove all inactive segments as well as active ones.
 		 */
-		for (;; segno++)
+		for (segno = 1;; segno++)
 		{
-			if (segno == 0)
-strcpy(segpath, path);
-			else
-sprintf(segpath, "%s.%u", path, segno);
+			sprintf(segpath, "%s.%u", path, segno);
 
 			if (!RelFileLocatorBackendIsTemp(rlocator))
 			{


Re: Tables not getting vacuumed in postgres

2022-11-08 Thread Mark Dilger



> On Nov 8, 2022, at 5:21 AM, Karthik Jagadish (kjagadis)  
> wrote:
> 
> I didn’t get your point dead tuples are visible to transaction means? 
> Vacuuming job is to remove dead tuples right?

Please see https://www.2ndquadrant.com/en/blog/when-autovacuum-does-not-vacuum/ 
for more information about your question.  Specifically, you might look at the 
third section down, "Long transactions", which starts with "So, if the table is 
vacuumed regularly, surely it can’t accumulate a lot of dead rows, right?"  You 
might benefit from reading the entire article rather than skipping down to that 
section.

I hope it helps


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-11-08 Thread Masahiko Sawada
On Sat, Nov 5, 2022 at 6:23 PM John Naylor  wrote:
>
> On Fri, Nov 4, 2022 at 10:25 PM Masahiko Sawada  wrote:
> >
> > For parallel heap pruning, multiple workers will insert key-value
> > pairs to the radix tree concurrently. The simplest solution would be a
> > single lock to protect writes but the performance will not be good.
> > Another solution would be that we can divide the tables into multiple
> > ranges so that keys derived from TIDs are not conflicted with each
> > other and have parallel workers process one or more ranges. That way,
> > parallel vacuum workers can build *sub-trees* and the leader process
> > can merge them. In use cases of lazy vacuum, since the write phase and
> > read phase are separated the readers don't need to worry about
> > concurrent updates.
>
> It's a good idea to use ranges for a different reason -- readahead. See 
> commit 56788d2156fc3, which aimed to improve readahead for sequential scans. 
> It might work to use that as a model: Each worker prunes a range of 64 pages, 
> keeping the dead tids in a local array. At the end of the range: lock the tid 
> store, enter the tids into the store, unlock, free the local array, and get 
> the next range from the leader. It's possible contention won't be too bad, 
> and I suspect using small local arrays as-we-go would be faster and use less 
> memory than merging multiple sub-trees at the end.

Seems a promising idea. I think it might work well even in the current
parallel vacuum (i.e., single writer). I mean, I think we can have a
single lwlock for shared cases in the first version. If the overhead
of acquiring the lwlock per key-value insertion turns out not to be
negligible, we might want to try this idea.
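
To make the idea above concrete, a pseudocode-level sketch of the per-worker
loop; get_next_range_from_leader(), prune_page_collect_dead() and
tidstore_add_tids() are made-up names, only LWLockAcquire/LWLockRelease and
MaxHeapTuplesPerPage are existing APIs:

    #define PRUNE_RANGE_PAGES 64

    while (get_next_range_from_leader(&start_blk, &end_blk))
    {
        ItemPointer local_dead_tids;
        int         ndead = 0;

        local_dead_tids = palloc(sizeof(ItemPointerData) *
                                 MaxHeapTuplesPerPage * PRUNE_RANGE_PAGES);

        for (BlockNumber blk = start_blk; blk < end_blk; blk++)
            ndead += prune_page_collect_dead(blk, local_dead_tids + ndead);

        /* one lock acquisition per range, not per tuple or per page */
        LWLockAcquire(shared_tidstore_lock, LW_EXCLUSIVE);
        tidstore_add_tids(shared_tidstore, local_dead_tids, ndead);
        LWLockRelease(shared_tidstore_lock);

        pfree(local_dead_tids);
    }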

Apart from that, I'm going to incorporate the comments on 0004 patch
and try a pointer tagging.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Tables not getting vacuumed in postgres

2022-11-08 Thread Karthik Jagadish (kjagadis)
I didn't get your point about dead tuples being visible to a transaction. What 
does that mean? Vacuuming's job is to remove dead tuples, right?

Regards,
Karthik

From: Amul Sul 
Date: Tuesday, 8 November 2022 at 6:39 PM
To: Karthik Jagadish (kjagadis) 
Cc: pgsql-hack...@postgresql.org , Prasanna 
Satyanarayanan (prassaty) , Chandruganth Ayyavoo Selvam 
(chaayyav) , Jaganbabu M (jmunusam) 
Subject: Re: Tables not getting vacuumed in postgres
On Tue, Nov 8, 2022 at 6:11 PM Karthik Jagadish (kjagadis)
 wrote:
>
> Hi,
>
>
>
> Thanks for the response.
>
>
>
> But what I understand that insert update and delete would still work and will 
> not interfere with vacuuming process. Yes we do perform a lot of updates on 
> that particular table which is not vacuuming. Does it mean that it waiting 
> for the lock to be released?
>

Well, yes, that won't interfere but the primary job of autovacuum is
to remove the bloat, if the dead tuple(s) is visible to any
transaction, then not going to remove that.


>
>
> Regards,
>
> Karthik
>
>
>
> From: Amul Sul 
> Date: Tuesday, 8 November 2022 at 5:38 PM
> To: Karthik Jagadish (kjagadis) 
> Cc: pgsql-hack...@postgresql.org , Prasanna 
> Satyanarayanan (prassaty) , Chandruganth Ayyavoo Selvam 
> (chaayyav) , Jaganbabu M (jmunusam) 
> Subject: Re: Tables not getting vacuumed in postgres
>
> On Tue, Nov 8, 2022 at 5:00 PM Karthik Jagadish (kjagadis)
>  wrote:
> >
> > Hi,
> >
> > We have a NMS application where we are using postgres as database, what we 
> > are noticing is that vacuuming is not happening for certain tables for 2-3 
> > days and eventually the table bloats and disk space is running out.
> >
> > What could be the reason for auto vacuuming not happening for certain 
> > tables?
> >
>
> Check if there is any long-running or prepared transaction.
>
> Regards,
> Amul


Re: Check return value of pclose() correctly

2022-11-08 Thread Peter Eisentraut

On 01.11.22 21:30, Peter Eisentraut wrote:

There are some places where the return value is apparently intentionally
ignored, such as in error recovery paths, or psql ignoring a failure to
launch the pager.  (The intention can usually be inferred by the kind of
error checking attached to the corresponding popen() call.)  But there are a
few places in psql that I'm suspicious about that I have marked, but need to
think about further.


Hmm.  I would leave these out, I think.  setQFout() relies on the
result of openQueryOutputFile().  And this could make commands like
\watch less reliable.


I don't quite understand what you are saying here.  My point is that, 
for example, setQFout() thinks it's important to check the result of 
popen() and write an error message, but it doesn't check the result of 
pclose() at all.  I don't think that makes sense in practice.


I have looked this over again.  In these cases, if the piped-to command 
is faulty, you get a "broken pipe" error anyway, so the effect of not 
checking the pclose() result is negligible.  So I have removed the 
"FIXME" markers without further action.


There is also the question whether we want to check the exit status of a 
user-supplied command, such as in pgbench's \setshell.  I have dialed 
back my patch there, since I don't know what the current practice in 
pgbench scripts is.  If the command fails badly, pgbench will probably 
complain anyway about invalid output.


More important is that something like pg_upgrade does check the exit 
status when it calls pg_controldata etc.  That's what this patch 
accomplishes.
From 86b2b61d30f848b84c69437c7106dea8fdf3738d Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Tue, 8 Nov 2022 14:03:12 +0100
Subject: [PATCH v2] Check return value of pclose() correctly

Some callers didn't check the return value of pclose() or
ClosePipeStream() correctly.  Either they didn't check it at all or
they treated it like the return of fclose().

The correct way is to first check whether the return value is -1, and
then report errno, and then check the return value like a result from
system(), for which we already have wait_result_to_str() to make it
simpler.  To make this more compact, expand wait_result_to_str() to
also handle -1 explicitly.

Discussion: 
https://www.postgresql.org/message-id/flat/8cd9fb02-bc26-65f1-a809-b1cb360ee...@enterprisedb.com
---
 src/backend/commands/collationcmds.c | 11 ++-
 src/bin/pg_ctl/pg_ctl.c  |  3 +--
 src/bin/pg_upgrade/controldata.c | 12 +---
 src/bin/pg_upgrade/exec.c|  6 +-
 src/bin/pg_upgrade/option.c  |  6 +-
 src/bin/pgbench/pgbench.c|  2 +-
 src/bin/psql/command.c   | 18 ++
 src/common/wait_error.c  |  6 +-
 8 files changed, 50 insertions(+), 14 deletions(-)

diff --git a/src/backend/commands/collationcmds.c 
b/src/backend/commands/collationcmds.c
index 86fbc7fa019f..fbf45f05aa0f 100644
--- a/src/backend/commands/collationcmds.c
+++ b/src/backend/commands/collationcmds.c
@@ -639,6 +639,7 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
int naliases,
maxaliases,
i;
+   int pclose_rc;
 
/* expansible array of aliases */
maxaliases = 100;
@@ -745,7 +746,15 @@ pg_import_system_collations(PG_FUNCTION_ARGS)
}
}
 
-   ClosePipeStream(locale_a_handle);
+   pclose_rc = ClosePipeStream(locale_a_handle);
+   if (pclose_rc != 0)
+   {
+   ereport(ERROR,
+   (errcode_for_file_access(),
+errmsg("could not execute command 
\"%s\": %s",
+   "locale -a",
+   
wait_result_to_str(pclose_rc;
+   }
 
/*
 * Before processing the aliases, sort them by locale name.  
The point
diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
index a509fc9109e8..ceab603c47d9 100644
--- a/src/bin/pg_ctl/pg_ctl.c
+++ b/src/bin/pg_ctl/pg_ctl.c
@@ -2154,12 +2154,11 @@ adjust_data_dir(void)
fflush(NULL);
 
fd = popen(cmd, "r");
-   if (fd == NULL || fgets(filename, sizeof(filename), fd) == NULL)
+   if (fd == NULL || fgets(filename, sizeof(filename), fd) == NULL || 
pclose(fd) != 0)
{
write_stderr(_("%s: could not determine the data directory 
using command \"%s\"\n"), progname, cmd);
exit(1);
}
-   pclose(fd);
free(my_exec_path);
 
/* strip trailing newline and carriage return */
diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index 

Re: Tables not getting vacuumed in postgres

2022-11-08 Thread Amul Sul
On Tue, Nov 8, 2022 at 6:11 PM Karthik Jagadish (kjagadis)
 wrote:
>
> Hi,
>
>
>
> Thanks for the response.
>
>
>
> But what I understand that insert update and delete would still work and will 
> not interfere with vacuuming process. Yes we do perform a lot of updates on 
> that particular table which is not vacuuming. Does it mean that it waiting 
> for the lock to be released?
>

Well, yes, that won't interfere, but the primary job of autovacuum is
to remove the bloat; if the dead tuple(s) are visible to any
transaction, then it is not going to remove them.
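
For what it's worth, a quick way to spot such blockers (standard catalog
views; adjust thresholds to taste):

    -- long-running transactions that hold back the xmin horizon
    SELECT pid, state, xact_start, now() - xact_start AS xact_age, query
    FROM pg_stat_activity
    WHERE xact_start IS NOT NULL
    ORDER BY xact_start;

    -- prepared transactions that were never committed or rolled back
    SELECT gid, prepared, owner, database FROM pg_prepared_xacts;

    -- replication slots can also hold back the horizon
    SELECT slot_name, active, xmin, catalog_xmin FROM pg_replication_slots;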


>
>
> Regards,
>
> Karthik
>
>
>
> From: Amul Sul 
> Date: Tuesday, 8 November 2022 at 5:38 PM
> To: Karthik Jagadish (kjagadis) 
> Cc: pgsql-hack...@postgresql.org , Prasanna 
> Satyanarayanan (prassaty) , Chandruganth Ayyavoo Selvam 
> (chaayyav) , Jaganbabu M (jmunusam) 
> Subject: Re: Tables not getting vacuumed in postgres
>
> On Tue, Nov 8, 2022 at 5:00 PM Karthik Jagadish (kjagadis)
>  wrote:
> >
> > Hi,
> >
> > We have a NMS application where we are using postgres as database, what we 
> > are noticing is that vacuuming is not happening for certain tables for 2-3 
> > days and eventually the table bloats and disk space is running out.
> >
> > What could be the reason for auto vacuuming not happening for certain 
> > tables?
> >
>
> Check if there is any long-running or prepared transaction.
>
> Regards,
> Amul




Re: Tables not getting vacuumed in postgres

2022-11-08 Thread Karthik Jagadish (kjagadis)
Hi,

Thanks for the response.

But what I understand is that insert, update, and delete would still work and 
will not interfere with the vacuuming process. Yes, we do perform a lot of 
updates on that particular table which is not being vacuumed. Does it mean that 
it is waiting for the lock to be released?

Regards,
Karthik

From: Amul Sul 
Date: Tuesday, 8 November 2022 at 5:38 PM
To: Karthik Jagadish (kjagadis) 
Cc: pgsql-hack...@postgresql.org , Prasanna 
Satyanarayanan (prassaty) , Chandruganth Ayyavoo Selvam 
(chaayyav) , Jaganbabu M (jmunusam) 
Subject: Re: Tables not getting vacuumed in postgres
On Tue, Nov 8, 2022 at 5:00 PM Karthik Jagadish (kjagadis)
 wrote:
>
> Hi,
>
> We have a NMS application where we are using postgres as database, what we 
> are noticing is that vacuuming is not happening for certain tables for 2-3 
> days and eventually the table bloats and disk space is running out.
>
> What could be the reason for auto vacuuming not happening for certain tables?
>

Check if there is any long-running or prepared transaction.

Regards,
Amul


Re: psql: Add command to use extended query protocol

2022-11-08 Thread Peter Eisentraut

On 08.11.22 13:02, Daniel Verite wrote:

A pset variable to control the default seems reasonable as well.
The implication would be that if you set that pset variable there is
no way to have individual commands use simple query mode directly.

+1 except that it would be a \set variable for consistency with the
other execution-controlling variables. \pset variables control only
the display.


Is there a use case for a global setting?

It seems to me that that would be just another thing that a 
super-careful psql script would have to reset to get a consistent 
starting state.






Re: psql: Add command to use extended query protocol

2022-11-08 Thread Peter Eisentraut

On 05.11.22 07:34, Corey Huinker wrote:
The most compact idea I can think of is to have \bind and \endbind (or 
more terse equivalents \bp and \ebp)


SELECT * FROM foo WHERE type_id = $1 AND cost > $2 \bind 'param1' 
'param2' \endbind $2 \g filename.csv


I like it.  It makes my code even simpler, and it allows using all the 
different \g variants transparently.  See attached patch.


Maybe the end-bind param isn't needed at all, we just insist that bind 
params be single quoted strings or numbers, so the next slash command 
ends the bind list.


Right, the end-bind isn't needed.

Btw., this also allows doing things like

SELECT $1, $2
\bind '1' '2' \g
\bind '3' '4' \g

This isn't a prepared statement being reused, but it relies on the fact 
that psql \g with an empty query buffer resends the previous query. 
Still kind of neat.From 076df09c701c5e60172e2dc80602726d3761e55c Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Tue, 8 Nov 2022 13:33:29 +0100
Subject: [PATCH v2] psql: Add command to use extended query protocol

This adds a new psql command \bind that sets query parameters and
causes the next query to be sent using the extended query protocol.
Example:

SELECT $1, $2 \bind 'foo' 'bar' \g

This may be useful for psql scripting, but one of the main purposes is
also to be able to test various aspects of the extended query protocol
from psql and to write tests more easily.

Discussion: 
https://www.postgresql.org/message-id/flat/e8dd1cd5-0e04-3598-0518-a605159fe...@enterprisedb.com
---
 doc/src/sgml/ref/psql-ref.sgml | 33 ++
 src/bin/psql/command.c | 37 ++
 src/bin/psql/common.c  | 15 +++-
 src/bin/psql/help.c|  1 +
 src/bin/psql/settings.h|  3 +++
 src/bin/psql/tab-complete.c|  1 +
 src/test/regress/expected/psql.out | 31 +
 src/test/regress/sql/psql.sql  | 14 +++
 8 files changed, 134 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 9494f28063ad..df93a5ca9897 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -879,6 +879,39 @@ Meta-Commands
 
   
 
+  
+   \bind [ parameter ] ... 
+
+   
+
+ Sets query parameters for the next query execution, with the
+ specified parameters passed for any parameter placeholders
+ ($1 etc.).
+
+
+
+ Example:
+
+INSERT INTO tbl1 VALUES ($1, $2) \bind 'first value' 'second value' \g
+
+
+
+
+ This also works for query-execution commands besides
+ \g, such as \gx and
+ \gset.
+
+
+
+ This command causes the extended query protocol (see ) to be used, unlike normal
+ psql operation, which uses the simple
+ query protocol.  So this command can be useful to test the extended
+ query protocol from psql.
+
+   
+  
+
   
 \c or \connect [ 
-reuse-previous=on|off ] [ 
dbname [ username ] [ host ] [ port ] | conninfo ]
 
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index ab613dd49e0a..3b06169ba0dc 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -63,6 +63,7 @@ static backslashResult exec_command(const char *cmd,

PQExpBuffer query_buf,

PQExpBuffer previous_buf);
 static backslashResult exec_command_a(PsqlScanState scan_state, bool 
active_branch);
+static backslashResult exec_command_bind(PsqlScanState scan_state, bool 
active_branch);
 static backslashResult exec_command_C(PsqlScanState scan_state, bool 
active_branch);
 static backslashResult exec_command_connect(PsqlScanState scan_state, bool 
active_branch);
 static backslashResult exec_command_cd(PsqlScanState scan_state, bool 
active_branch,
@@ -308,6 +309,8 @@ exec_command(const char *cmd,
 
if (strcmp(cmd, "a") == 0)
status = exec_command_a(scan_state, active_branch);
+   else if (strcmp(cmd, "bind") == 0)
+   status = exec_command_bind(scan_state, active_branch);
else if (strcmp(cmd, "C") == 0)
status = exec_command_C(scan_state, active_branch);
else if (strcmp(cmd, "c") == 0 || strcmp(cmd, "connect") == 0)
@@ -453,6 +456,40 @@ exec_command_a(PsqlScanState scan_state, bool 
active_branch)
return success ? PSQL_CMD_SKIP_LINE : PSQL_CMD_ERROR;
 }
 
+/*
+ * \bind -- set query parameters
+ */
+static backslashResult
+exec_command_bind(PsqlScanState scan_state, bool active_branch)
+{
+   backslashResult status = PSQL_CMD_SKIP_LINE;
+
+   if (active_branch)
+   {
+   char   *opt;
+   int nparams = 0;
+   int nalloc 

Re: when the startup process doesn't (logging startup delays)

2022-11-08 Thread Bharath Rupireddy
On Tue, Nov 8, 2022 at 4:35 PM Thomas Munro  wrote:
>
> On Sat, Oct 30, 2021 at 7:44 AM Robert Haas  wrote:
> > Committed.
>
> Is it expected that an otherwise idle standby's recovery process
> receives SIGALRM every N seconds, or should the timer be canceled at
> that point, as there is no further progress to report?

Nice catch. Yeah, that seems unnecessary, see the below standby logs.
I think we need to call disable_timeout(STARTUP_PROGRESS_TIMEOUT, false),
something like the attached. I think there'll be no issue with the
patch since the StandbyMode gets reset only at the end of recovery (in
FinishWalRecovery()) but it can very well be set during recovery (in
ReadRecord()). Note that I also added an assertion in
has_startup_progress_timeout_expired(), just in case.

2022-11-08 11:28:23.563 UTC [980909] LOG:  SIGALRM handle_sig_alarm received
2022-11-08 11:28:23.563 UTC [980909] LOG:
startup_progress_timeout_handler called
2022-11-08 11:28:33.563 UTC [980909] LOG:  SIGALRM handle_sig_alarm received
2022-11-08 11:28:33.563 UTC [980909] LOG:
startup_progress_timeout_handler called
2022-11-08 11:28:43.563 UTC [980909] LOG:  SIGALRM handle_sig_alarm received
2022-11-08 11:28:43.563 UTC [980909] LOG:
startup_progress_timeout_handler called
2022-11-08 11:28:53.563 UTC [980909] LOG:  SIGALRM handle_sig_alarm received
2022-11-08 11:28:53.563 UTC [980909] LOG:
startup_progress_timeout_handler called

While at it, I noticed that we report redo progress for PITR, but we
don't report when the standby enters archive recovery mode, say due to a
failure in the connection to the primary or after the promote signal is
found. Wouldn't it be useful to report in this case as well, to track the
recovery progress?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 40ac498a76f38903bbee37108812907a76bb1a78 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Tue, 8 Nov 2022 12:18:11 +
Subject: [PATCH v1] Disable STARTUP_PROGRESS_TIMEOUT in standby mode

---
 src/backend/access/transam/xlogrecovery.c | 13 -
 src/backend/postmaster/startup.c  |  2 ++
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index cb07694aea..5326a98633 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -63,6 +63,7 @@
 #include "utils/pg_lsn.h"
 #include "utils/ps_status.h"
 #include "utils/pg_rusage.h"
+#include "utils/timeout.h"
 
 /* Unsupported old recovery command file names (relative to $PGDATA) */
 #define RECOVERY_COMMAND_FILE	"recovery.conf"
@@ -1653,7 +1654,17 @@ PerformWalRecovery(void)
 		 */
 		do
 		{
-			if (!StandbyMode)
+			/*
+			 * To avoid server log bloat, we don't report recovery progress in
+			 * a standby as it will always be in recovery unless promoted. We
+			 * also disable the timeout as we don't need it anymore.
+			 */
+			if (StandbyMode)
+			{
+if (get_timeout_active(STARTUP_PROGRESS_TIMEOUT))
+	disable_timeout(STARTUP_PROGRESS_TIMEOUT, false);
+			}
+			else
 ereport_startup_progress("redo in progress, elapsed time: %ld.%02d s, current LSN: %X/%X",
 		 LSN_FORMAT_ARGS(xlogreader->ReadRecPtr));
 
diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
index f99186eab7..1456e3ad3a 100644
--- a/src/backend/postmaster/startup.c
+++ b/src/backend/postmaster/startup.c
@@ -347,6 +347,8 @@ has_startup_progress_timeout_expired(long *secs, int *usecs)
 	int			useconds;
 	TimestampTz now;
 
+	Assert(get_timeout_active(STARTUP_PROGRESS_TIMEOUT) == true);
+
 	/* No timeout has occurred. */
 	if (!startup_progress_timer_expired)
 		return false;
-- 
2.34.1



Re: Tables not getting vacuumed in postgres

2022-11-08 Thread Amul Sul
On Tue, Nov 8, 2022 at 5:00 PM Karthik Jagadish (kjagadis)
 wrote:
>
> Hi,
>
> We have a NMS application where we are using postgres as database, what we 
> are noticing is that vacuuming is not happening for certain tables for 2-3 
> days and eventually the table bloats and disk space is running out.
>
> What could be the reason for auto vacuuming not happening for certain tables?
>

Check if there is any long-running or prepared transaction.

Regards,
Amul




Re: psql: Add command to use extended query protocol

2022-11-08 Thread Daniel Verite
David G. Johnston wrote:

> I would keep the \gp meta-command to force extended mode regardless
> of whether the query itself requires it.

+1

> A pset variable to control the default seems reasonable as well.
> The implication would be that if you set that pset variable there is
> no way to have individual commands use simple query mode directly.

+1 except that it would be a \set variable for consistency with the
other execution-controlling variables. \pset variables control only
the display.

BTW if we wanted to auto-detect that a query requires binding or the
extended query protocol, we need to keep in mind that for instance
"PREPARE stmt AS $1" must pass without binding, with both the simple
and the extended query protocol.
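
To illustrate with a runnable variant of that example, the $1 below belongs
to the prepared statement itself and must reach the server untouched, whether
the simple or the extended protocol is used:

    PREPARE stmt(int) AS SELECT $1;
    EXECUTE stmt(42);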


Best regards,
-- 
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite




Re: Perform streaming logical transactions by background workers and parallel apply

2022-11-08 Thread Amit Kapila
On Mon, Nov 7, 2022 at 6:49 PM houzj.f...@fujitsu.com
 wrote:
>
> On Friday, November 4, 2022 7:45 PM Amit Kapila  
> wrote:
> > 3.
> > apply_handle_stream_start(StringInfo s)
> > {
> > ...
> > + if (!first_segment)
> > + {
> > + /*
> > + * Unlock the shared object lock so that parallel apply worker
> > + * can continue to receive and apply changes.
> > + */
> > + parallel_apply_unlock(winfo->shared->stream_lock_id);
> > ...
> > }
> >
> > Can we have an assert before this unlock call that the lock must be
> > held? Similarly, if there are other places then we can have assert
> > there as well.
>
> It seems we don't have a standard API can be used without a transaction.
> Maybe we can use the list ParallelApplyLockids to check that ?
>

Yeah, that occurred to me as well, but I am not sure if it is a good
idea to maintain this list just for an assertion; if it turns out that
we need to maintain it for a different purpose, then we can probably
use it for the assert as well.

Few other comments/questions:
=
1.
apply_handle_stream_start(StringInfo s)
{
...

+ case TRANS_PARALLEL_APPLY:
...
...
+ /*
+ * Unlock the shared object lock so that the leader apply worker
+ * can continue to send changes.
+ */
+ parallel_apply_unlock(MyParallelShared->stream_lock_id, AccessShareLock);

As per the design in the email [1], this lock needs to be released by
the leader worker during stream start which means it should be
released under the state TRANS_LEADER_SEND_TO_PARALLEL. From the
comments as well, it is not clear to me why at this time leader is
supposed to be blocked. Is there a reason for doing differently than
what is proposed in the original design?

2. Similar to above, it is not clear why the parallel worker needs to
release the stream_lock_id lock at stream_commit and stream_prepare?

3. Am, I understanding correctly that you need to lock/unlock in
apply_handle_stream_abort() for the parallel worker because after
rollback to savepoint, there could be another set of stream or
transaction end commands for which you want to wait? If so, maybe an
additional comment would serve the purpose.

4.
The leader may have sent multiple streaming blocks in the queue
+ * When the child is processing a streaming block. So only try to
+ * lock if there is no message left in the queue.

Let's slightly reword this to: "By the time child is processing the
changes in the current streaming block, the leader may have sent
multiple streaming blocks. So, try to lock only if there is no message
left in the queue."

5.
+parallel_apply_unlock(uint16 lockid, LOCKMODE lockmode)
+{
+ if (!list_member_int(ParallelApplyLockids, lockid))
+ return;
+
+ UnlockSharedObjectForSession(SubscriptionRelationId, MySubscription->oid,
+ lockid, am_leader_apply_worker() ?
+ AccessExclusiveLock:
+ AccessShareLock);

This function should use lockmode argument passed rather than deciding
based on am_leader_apply_worker. I think this is anyway going to
change if we start using a different locktag as discussed in one of
the above emails.

6.
+
 /*
  * Common spoolfile processing.
  */
-static void
-apply_spooled_messages(TransactionId xid, XLogRecPtr lsn)
+void
+apply_spooled_messages(FileSet *stream_fileset, TransactionId xid,

Seems like a spurious line addition.

-- 
With Regards,
Amit Kapila.




Re: [PoC] Reducing planning time when tables have many partitions

2022-11-08 Thread Andrey Lepikhov

On 2/11/2022 15:27, Yuya Watari wrote:

I noticed that the previous patch does not apply to the current HEAD.
I attached the rebased version to this email.

Looking into the find_em_for_rel() changes, I see that you replaced
if (bms_is_subset(em->em_relids, rel->relids))
with an assertion.
According to get_ecmember_indexes(), the em_relids field of the returned
equivalence members can contain relids not mentioned in the relation.
I don't understand why it works now. For example, we can sort by t1.x,
but have an expression t1.x = t1.y * t2.z. Or have I missed something? If it
is not a mistake, maybe add a comment explaining why the assertion doesn't fail?


--
regards,
Andrey Lepikhov
Postgres Professional





Tables not getting vacuumed in postgres

2022-11-08 Thread Karthik Jagadish (kjagadis)
Hi,

We have a NMS application where we are using postgres as database, what we are 
noticing is that vacuuming is not happening for certain tables for 2-3 days and 
eventually the table bloats and disk space is running out.

What could be the reason for auto vacuuming not happening for certain tables?

Autovacuum is enabled

Regards,
Karthik


Re: when the startup process doesn't (logging startup delays)

2022-11-08 Thread Thomas Munro
On Sat, Oct 30, 2021 at 7:44 AM Robert Haas  wrote:
> Committed.

Is it expected that an otherwise idle standby's recovery process
receives SIGALRM every N seconds, or should the timer be canceled at
that point, as there is no further progress to report?




Re: Use proc instead of MyProc in ProcArrayGroupClearXid()/TransactionGroupUpdateXidStatus()

2022-11-08 Thread Bharath Rupireddy
On Tue, Nov 8, 2022 at 11:59 AM Amit Kapila  wrote:
>
> On Mon, Nov 7, 2022 at 3:17 PM rajesh singarapu  
> wrote:
> >
> > In both TransactionGroupUpdateXidStatus and ProcArrayGroupClearXid
> > global MyProc is used. for consistency, replaced with a function local 
> > variable.
> >
>
> In ProcArrayGroupClearXid(), currently, we always pass MyProc as proc,
> so the change suggested by you will work but I think if in the future
> someone calls it with a different proc, then the change suggested by
> you won't work.

Well, yes. Do you have any thoughts around such future usages of
ProcArrayGroupClearXid()?

> The change in TransactionGroupUpdateXidStatus() looks
> good but If we don't want to change ProcArrayGroupClearXid() then I am
> not sure if there is much value in making the change in
> TransactionGroupUpdateXidStatus().

AFAICS, there are many places in the code that use proc == MyProc (20
instances) or proc != MyProc (6 instances) sorts of things. I think
defining a macro, something like below, is better for readability.
However, I'm concerned that we might have to use it in 26 places.

#define IsPGPROCMine(proc) (proc != NULL && proc == MyProc)
or just
#define IsPGPROCMine(proc) (proc == MyProc)

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: hash_xlog_split_allocate_page: failed to acquire cleanup lock

2022-11-08 Thread Amit Kapila
On Mon, Nov 7, 2022 at 11:12 PM Robert Haas  wrote:
>
> On Mon, Oct 31, 2022 at 11:49 PM Amit Kapila  wrote:
> > I am fine with any of those. Would you like to commit or do you prefer
> > me to take care of this?
>
> Sorry for not responding to this sooner. I think it's too late to do
> anything about this for the current round of releases at this point,
> but I am fine if you want to take care of it after that.
>

Okay, I'll take care of this either later this week after the release
work is finished or early next week.

-- 
With Regards,
Amit Kapila.




Re: Reviving lost replication slots

2022-11-08 Thread Amit Kapila
On Tue, Nov 8, 2022 at 12:08 PM sirisha chamarthi
 wrote:
>
> On Fri, Nov 4, 2022 at 11:02 PM Amit Kapila  wrote:
>>
>> On Fri, Nov 4, 2022 at 1:40 PM sirisha chamarthi
>>  wrote:
>> >
>> > A replication slot can be lost when a subscriber is not able to catch up 
>> > with the load on the primary and the WAL to catch up exceeds 
>> > max_slot_wal_keep_size. When this happens, target has to be reseeded 
>> > (pg_dump) from the scratch and this can take longer. I am investigating 
>> > the options to revive a lost slot.
>> >
>>
>> Why in the first place one has to set max_slot_wal_keep_size if they
>> care for WAL more than that?
>
>  Disk full is a typical use where we can't wait until the logical slots to 
> catch up before truncating the log.
>

Ideally, in such a case the subscriber should fall back to the
physical standby of the publisher, but unfortunately, we don't yet have
functionality where subscribers can continue logical replication
from a physical standby. Do you think if we had such functionality it
would serve our purpose?

>> If you have a case where you want to
>> handle this case for some particular slot (where you are okay with the
>> invalidation of other slots exceeding max_slot_wal_keep_size) then the
>> other possibility could be to have a similar variable at the slot
>> level but not sure if that is a good idea because you haven't
>> presented any such case.
>
> IIUC, ability to fetch WAL from the archive as a fall back mechanism should 
> automatically take care of all the lost slots. Do you see a need to take care 
> of a specific slot?
>

No, I was just trying to see if your use case can be addressed in some
other way. BTW, won't copying the WAL back from the archive again lead
to a disk full situation?

-- 
With Regards,
Amit Kapila.




Re: Tracking last scan time

2022-11-08 Thread Dave Page
On Tue, 8 Nov 2022 at 04:10, Michael Paquier  wrote:

> On Mon, Nov 07, 2022 at 04:54:07PM +0900, Michael Paquier wrote:
> > FWIW, all the other areas of pgstatfuncs.c manipulate timestamptz
> > fields with a style like the attached.  That's a nit, still per the
> > role of consistency with the surroundings..
> >
> > Anyway, it seems to me that a regression test is in order before a
> > scan happens just after the relation creation, and the same problem
> > shows up with last_idx_scan.
>
> Hearing nothing, done this way as of d7744d5.  Thanks for the report,
> Robert.  And thanks for the patch, Dave.
>

Thank you!

-- 
Dave Page
Blog: https://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: https://www.enterprisedb.com


Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-11-08 Thread Andy Fan
Hi:


> cfbot reports the patch no longer applies [1].  As CommitFest 2022-11 is
> currently underway, this would be an excellent time to update the patch.
>

Thank you Ian & Andrey for taking care of this!  I am planning to start
a new thread for this topic in 2 weeks, and will post an updated patch
at that time.

-- 
Best Regards
Andy Fan


Re: Suppressing useless wakeups in walreceiver

2022-11-08 Thread Thomas Munro
On Tue, Nov 8, 2022 at 9:20 PM Bharath Rupireddy
 wrote:
> Thanks. Do we need a similar wakeup approach for logical replication
> workers in worker.c? Or is it okay that the nap time is 1sec there?

Yeah, I think so.  At least for its idle/nap case (waiting for flush
is also a technically interesting case, but harder, and applies to
non-idle systems so the polling is a little less offensive).




Re: Add proper planner support for ORDER BY / DISTINCT aggregates

2022-11-08 Thread Ronan Dunklau
On Tuesday, 8 November 2022 at 02:31:12 CET, David Rowley wrote:
> 1. Adjusts add_paths_to_grouping_rel so that we don't add a Sort path
> when we can add an Incremental Sort path instead. This removes quite a
> few redundant lines of code.

This seems sensible.

> 2. Removes the * 1.5 fuzz-factor in cost_incremental_sort()
> 3. Does various other code tidy stuff in cost_incremental_sort().
> 4. Removes the test from incremental_sort.sql that was ensuring the
> inferior Sort -> Sort plan was being used instead of the superior Sort
> -> Incremental Sort plan.
> 
> I'm not really 100% confident in the removal of the * 1.5 thing.
> I wonder if there's some reason we're not considering that might cause
> a performance regression if we were to remove it.

I'm not sure about it either. It seems to me that we were afraid of
regressions, and overcharging incremental sort just made us miss a new
optimization without changing existing plans. With ordered aggregates the
balance is a bit trickier: we are at risk of regressing either aggregate
plans or the more common ordered ones.
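
To make the scenario concrete, here is a rough illustration (hypothetical
table and index, just to show the kind of plan choice in question):

-- The index provides input already sorted on (a), so the (a, b) ordering
-- wanted by the ORDER BY aggregate can be produced with an Incremental
-- Sort on top of the index scan rather than a full Sort.
CREATE TABLE t (a int, b int);
CREATE INDEX t_a_idx ON t (a);

EXPLAIN (COSTS OFF)
SELECT a, array_agg(b ORDER BY b)
FROM t
GROUP BY a;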

--
Ronan Dunklau








Re: libpq support for NegotiateProtocolVersion

2022-11-08 Thread Peter Eisentraut

On 02.11.22 20:02, Jacob Champion wrote:

A few notes:


+               else if (beresp == 'v')
+               {
+                   if (pqGetNegotiateProtocolVersion3(conn))
+                   {
+                       /* We'll come back when there is more data */
+                       return PGRES_POLLING_READING;
+                   }
+                   /* OK, we read the message; mark data consumed */
+                   conn->inStart = conn->inCursor;
+                   goto error_return;
+               }


This new code path doesn't go through the message length checks that are
done for the 'R' and 'E' cases, and pqGetNegotiateProtocolVersion3()
doesn't take the message length to know where to stop anyway, so a
misbehaving server can chew up client resources.


Fixed in new patch.


It looks like the server is expecting to be able to continue the
conversation with a newer client after sending a
NegotiateProtocolVersion. Is an actual negotiation planned for the future?


The protocol documentation says:

| The client may then choose either
| to continue with the connection using the specified protocol version
| or to abort the connection.

In this case, we are choosing to abort the connection.

We could add negotiation in the future, but then we'd have to first have 
a concrete case of something to negotiate about.  For example, if we 
added an optional performance feature into the protocol, then one could 
negotiate by falling back to not using that.  But for the kinds of 
features I'm thinking about right now (column encryption), you can't 
proceed if the feature is not supported.  So I think this would need to 
be considered case by case.



I think the documentation on NegotiateProtocolVersion (not introduced in
this patch) is misleading/wrong; it says that the version number sent
back is the "newest minor protocol version supported by the server for
the major protocol version requested by the client" which doesn't seem
to match the actual usage seen here.


I don't follow.  If libpq sends a protocol version of 3.1, then the 
server responds by saying it supports only 3.0.  What are you seeing?
From 64dc983097553af9bed4293821bd45f1932e62c6 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Tue, 8 Nov 2022 09:24:29 +0100
Subject: [PATCH v3] libpq: Handle NegotiateProtocolVersion message

Before, receiving a NegotiateProtocolVersion message would result in a
confusing error message like

expected authentication request from server, but received v

This adds proper handling of this protocol message and produces an
on-topic error message from it.

Discussion: 
https://www.postgresql.org/message-id/flat/f9c7862f-b864-8ef7-a861-c4638c83e209%40enterprisedb.com
---
 src/interfaces/libpq/fe-connect.c   | 23 ++---
 src/interfaces/libpq/fe-protocol3.c | 50 +
 src/interfaces/libpq/libpq-int.h|  1 +
 3 files changed, 69 insertions(+), 5 deletions(-)

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 746e9b4f1efc..1e72eb92b073 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3246,10 +3246,11 @@ PQconnectPoll(PGconn *conn)
 
                /*
                 * Validate message type: we expect only an authentication
-                * request or an error here.  Anything else probably means
-                * it's not Postgres on the other end at all.
+                * request, NegotiateProtocolVersion, or an error here.
+                * Anything else probably means it's not Postgres on the other
+                * end at all.
                 */
-               if (!(beresp == 'R' || beresp == 'E'))
+               if (!(beresp == 'R' || beresp == 'v' || beresp == 'E'))
                {
                    appendPQExpBuffer(&conn->errorMessage,
                                      libpq_gettext("expected authentication request from server, but received %c\n"),
@@ -3267,14 +3268,15 @@ PQconnectPoll(PGconn *conn)
                /*
                 * Try to validate message length before using it.
                 * Authentication requests can't be very large, although GSS
-                * auth requests may not be that small.  Errors can be a
+                * auth requests may not be that small.  Same for
+                * NegotiateProtocolVersion.  Errors can be a
                 * little larger, but not 

Re: Suppressing useless wakeups in walreceiver

2022-11-08 Thread Bharath Rupireddy
On Tue, Nov 8, 2022 at 1:17 PM Thomas Munro  wrote:
>
> And with that change and a pgindent, pushed.

Thanks. Do we need a similar wakeup approach for logical replication
workers in worker.c? Or is it okay that the nap time is 1sec there?

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: PL/pgSQL cursors should get generated portal names by default

2022-11-08 Thread Kirk Wolak
On Mon, Nov 7, 2022 at 11:10 AM Jan Wieck  wrote:

> On 11/4/22 19:46, Tom Lane wrote:
> > Jan Wieck  writes:
> >> I need to do some testing on this. I seem to recall that the naming was
> >> originally done because a reference cursor is basically a named cursor
> >> that can be handed around between functions and even the top SQL level
> >> of the application. For the latter to work the application needs to
> know
> >> the name of the portal.
> >
> > Right.  With this patch, it'd be necessary to hand back the actual
> > portal name (by returning the refcursor value), or else manually
> > set the refcursor value before OPEN to preserve the previous behavior.
> > But as far as I saw, all our documentation examples show handing back
> > the portal name, so I'm hoping most people do it like that already.
>
> I was mostly concerned that we may unintentionally break underdocumented
> behavior that was originally implemented on purpose. As long as everyone
> is aware that this is breaking backwards compatibility in the way it
> does, that's fine.
>

I respect the concern, and applied some deeper thinking to it...

Here is the logic I am applying to this compatibility issue and what may
break.
[FWIW, my motto is to be wrong out loud, as you learn faster]

At first pass, I thought "Well, since this does not break a refcursor,
which is the obvious use case for RETURNING/PASSING, we are fine!"

But in trying to DEFEND this case, I have come up with an example of code
(that makes some SENSE, but would break):

CREATE FUNCTION test() RETURNS refcursor LANGUAGE plpgsql AS $$
DECLARE
    cur_this CURSOR FOR SELECT 1;
    ref_cur refcursor;
BEGIN
    OPEN cur_this;
    -- Using the NAME of the cursor as the portal name.
    -- Should do:  ref_cur := cur_this;  -- only works after OPEN
    ref_cur := 'cur_this';
    RETURN ref_cur;
END;
$$;

As noted in the comments, if the code were:
  ref_cur := 'cur_this';  -- Now you can't just use ref_cur := cur_this;
  OPEN cur_this;
  RETURN ref_cur;
then it would break under this change...  And even the CORRECT syntax would
break there, since the cursor is not yet opened, so "cur_this" is null.

Now, I have NO IDEA if someone would actually do this.  It is almost
pathological.  The use case would be a complex cursor with parameters,
and they changed the code to return a refcursor!
This was the ONLY use case I could think of that wasn't HACKY!

HACKY use cases involve a child routine setting:  local_ref_cursor :=
'cur_this'; in order to access a cursor that was NOT passed to the child.
FWIW, I tested this, and it works, and I can FETCH in the child routine,
and it affects the parent's LOOP as it should... WOW.  I would be HAPPY
to break such horrible code; it has to be a security concern at some level.
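
For clarity, a hypothetical sketch of that hacky child routine (names
invented for illustration, not taken from any real code):

-- The child guesses the parent's portal name instead of receiving a
-- refcursor argument, then advances the parent's open cursor.
CREATE FUNCTION child_peek() RETURNS int LANGUAGE plpgsql AS $$
DECLARE
    local_ref_cursor refcursor := 'cur_this';  -- name NOT passed in
    v int;
BEGIN
    FETCH local_ref_cursor INTO v;
    RETURN v;
END;
$$;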

Personally (and my 2 cents really shouldn't matter much), I think this
should still be fixed,
because I believe this use case is rare, it will break immediately,
the fix is trivial (just initialize cur_this := 'cur_this' in this
example),
and the fix removes the Orthogonal Behavior Tom pointed out, which led me
to report this.

I think I have exhausted the examples of how this impacts a VALID
refcursor implementation.  I believe any other versions are variations
of this!
And maybe we should document that if a refcursor for a cursor is to be
returned, the refcursor is ASSIGNED after the OPEN of the cursor, and
without the quotes, as:
  ref_cursor := cur_this;  -- assign the name after opening.
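
Putting it together, a minimal sketch of the recommended pattern (the same
hypothetical function as above) would be:

CREATE FUNCTION test() RETURNS refcursor LANGUAGE plpgsql AS $$
DECLARE
    cur_this CURSOR FOR SELECT 1;
    ref_cur refcursor;
BEGIN
    OPEN cur_this;
    -- Copy the cursor variable itself, not a hard-coded portal name,
    -- so the caller sees whatever portal name was actually assigned.
    ref_cur := cur_this;
    RETURN ref_cur;
END;
$$;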

Thanks!