Re: Perform streaming logical transactions by background workers and parallel apply

2022-12-07 Thread Amit Kapila
On Thu, Dec 8, 2022 at 12:42 PM Masahiko Sawada  wrote:
>
> On Wed, Dec 7, 2022 at 10:03 PM houzj.f...@fujitsu.com
>  wrote:
> >
> >
> > > +static void
> > > +ProcessParallelApplyInterrupts(void)
> > > +{
> > > +CHECK_FOR_INTERRUPTS();
> > > +
> > > +if (ShutdownRequestPending)
> > > +{
> > > +ereport(LOG,
> > > +(errmsg("logical replication parallel apply worker for subscription \"%s\" has finished",
> > > +MySubscription->name)));
> > > +
> > > +apply_worker_clean_exit(false);
> > > +}
> > > +
> > > +if (ConfigReloadPending)
> > > +{
> > > +ConfigReloadPending = false;
> > > +ProcessConfigFile(PGC_SIGHUP);
> > > +}
> > > +}
> > >
> > > I personally think that we don't need to have a function to do only
> > > these few things.
> >
> > I thought that introducing a new function makes the handling of worker-specific
> > interrupt logic similar to other existing ones, like
> > ProcessWalRcvInterrupts() in walreceiver.c and HandlePgArchInterrupts() in
> > pgarch.c ...
>
> I think the difference from them is that there is only one place to
> call ProcessParallelApplyInterrupts().
>

But I feel it is better to isolate this code in a separate function.
What if we decide to extend it further by having some logic to stop
workers after reloading the config?
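For illustration only (a hypothetical sketch, not code from the patch),
the helper could then grow along these lines without touching its caller:

static void
ProcessParallelApplyInterrupts(void)
{
    CHECK_FOR_INTERRUPTS();

    if (ShutdownRequestPending)
    {
        ereport(LOG,
                (errmsg("logical replication parallel apply worker for subscription \"%s\" has finished",
                        MySubscription->name)));

        apply_worker_clean_exit(false);
    }

    if (ConfigReloadPending)
    {
        ConfigReloadPending = false;
        ProcessConfigFile(PGC_SIGHUP);

        /*
         * Hypothetical future logic: exit cleanly if the reloaded
         * configuration no longer allows this worker to run.
         */
        if (max_parallel_apply_workers_per_subscription == 0)
            apply_worker_clean_exit(false);
    }
}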

> >
> > > ---
> > > server_version = walrcv_server_version(LogRepWorkerWalRcvConn);
> > > options.proto.logical.proto_version =
> > > +server_version >= 16 ?
> > > LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM :
> > > server_version >= 15 ?
> > > LOGICALREP_PROTO_TWOPHASE_VERSION_NUM :
> > > server_version >= 14 ?
> > > LOGICALREP_PROTO_STREAM_VERSION_NUM :
> > > LOGICALREP_PROTO_VERSION_NUM;
> > >
> > > Instead of always using the new protocol version, I think we can use
> > > LOGICALREP_PROTO_TWOPHASE_VERSION_NUM if the streaming is not
> > > 'parallel'. That way, we don't need to change protocol version check
> > > logic in pgoutput.c and don't need to expose defGetStreamingMode().
> > > What do you think?
> >
> > I think that some users can also use the new version number when trying to get
> > changes (via pg_logical_slot_peek_binary_changes or other functions), so I feel
> > leaving the check for the new version number seems fine.
> >
> > Besides, I feel even if we don't use the new version number, we still need to use
> > defGetStreamingMode to check if parallel mode is used, as we need to send
> > abort_lsn when parallel is in use. I might be missing something, sorry for
> > that. Can you please explain the idea a bit?
>
> My idea is that we use LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM if
> (server_version >= 16 && MySubscription->stream ==
> SUBSTREAM_PARALLEL). If the stream is SUBSTREAM_ON, we use
> LOGICALREP_PROTO_TWOPHASE_VERSION_NUM even if server_version is
> 16. That way, in pgoutput.c, we can send abort_lsn if the protocol
> version is LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM. We don't need
> to send "streaming = parallel" to the publisher since the publisher
> can decide whether or not to send abort_lsn based on the protocol
> version (still needs to send "streaming = on" though). I might be
> missing something.
>

What if we decide to send some more additional information as part of
another patch like we are discussing in the thread [1]? Now, we won't
be able to decide the version number based on just the streaming
option. Also, in such a case, even for
LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM, it may not be a good
idea to send additional abort information unless the user has used the
streaming=parallel option.

[1] - 
https://www.postgresql.org/message-id/CAGPVpCRWEVhXa7ovrhuSQofx4to7o22oU9iKtrOgAOtz_%3DY6vg%40mail.gmail.com

-- 
With Regards,
Amit Kapila.




Re: postgres_fdw: batch inserts vs. before row triggers

2022-12-07 Thread Etsuro Fujita
On Fri, Dec 2, 2022 at 4:54 PM Etsuro Fujita  wrote:
> On Sun, Nov 27, 2022 at 12:11 AM Tom Lane  wrote:
> > OK, as long as there's a reason for doing it that way, it's OK
> > by me.  I don't think that adding a field at the end of EState
> > is an ABI problem.
> >
> > We have to do something else than add to ResultRelInfo, though.
>
> OK, I removed from ResultRelInfo a field that I added in the commit to
> save the owning ModifyTableState if insert-pending, and added to
> EState another List member to save such ModifyTableStates, instead.  I
> am planning to apply this to not only back branches but HEAD, to make
> back-patching easy, if there are no objections.

There seems to be no objection, so pushed.

Best regards,
Etsuro Fujita




Re: Perform streaming logical transactions by background workers and parallel apply

2022-12-07 Thread Masahiko Sawada
On Wed, Dec 7, 2022 at 10:03 PM houzj.f...@fujitsu.com
 wrote:
>
> On Wednesday, December 7, 2022 7:51 PM Masahiko Sawada 
>  wrote:
> >
> > On Mon, Dec 5, 2022 at 1:29 PM houzj.f...@fujitsu.com
> >  wrote:
> > >
> > > On Sunday, December 4, 2022 7:17 PM houzj.f...@fujitsu.com
> > 
> > > >
> > > > Thursday, December 1, 2022 8:40 PM Amit Kapila
> > 
> > > > wrote:
> > > > > Some other comments:
> > > > ...
> > > > Attach the new version patch set which addressed most of the comments
> > > > received so far except some comments being discussed[1].
> > > > [1]
> > https://www.postgresql.org/message-id/OS0PR01MB57167BF64FC0891734C
> > 8E81A94149%40OS0PR01MB5716.jpnprd01.prod.outlook.com
> > >
> > > Attach a new version patch set which fixed a testcase failure on CFbot.
> >
> > Here are some comments on v56 0001, 0002 patches. Please ignore
> > comments if you already incorporated them in v57.
>
> Thanks for the comments!
>
> > +static void
> > +ProcessParallelApplyInterrupts(void)
> > +{
> > +CHECK_FOR_INTERRUPTS();
> > +
> > +if (ShutdownRequestPending)
> > +{
> > +ereport(LOG,
> > +(errmsg("logical replication parallel apply worker for subscription \"%s\" has finished",
> > +MySubscription->name)));
> > +
> > +apply_worker_clean_exit(false);
> > +}
> > +
> > +if (ConfigReloadPending)
> > +{
> > +ConfigReloadPending = false;
> > +ProcessConfigFile(PGC_SIGHUP);
> > +}
> > +}
> >
> > I personally think that we don't need to have a function to do only
> > these few things.
>
> I thought that introducing a new function makes the handling of worker-specific
> interrupt logic similar to other existing ones, like
> ProcessWalRcvInterrupts() in walreceiver.c and HandlePgArchInterrupts() in
> pgarch.c ...

I think the difference from them is that there is only one place to
call ProcessParallelApplyInterrupts().

>
> >
> > Should we change the names to something like
> > LOGICALREP_STREAM_PARALLEL?
>
> Agreed, will change.
>
> > ---
> > + * The lock graph for the above example will look as follows:
> > + * LA (waiting to acquire the lock on the unique index) -> PA (waiting to
> > + * acquire the lock on the remote transaction) -> LA
> >
> > and
> >
> > + * The lock graph for the above example will look as follows:
> > + * LA (waiting to acquire the transaction lock) -> PA-2 (waiting to acquire the
> > + * lock due to unique index constraint) -> PA-1 (waiting to acquire the stream
> > + * lock) -> LA
> >
> > "(waiting to acquire the lock on the remote transaction)" in the first
> > example and "(waiting to acquire the stream lock)" in the second
> > example is the same meaning, right? If so, I think we should use
> > either term for consistency.
>
> Will change.
>
> > ---
> > +bool   write_abort_info = (data->streaming ==
> > SUBSTREAM_PARALLEL);
> >
> > I think that instead of setting write_abort_info every time when
> > pgoutput_stream_abort() is called, we can set it once, probably in
> > PGOutputData, at startup.
>
> I thought that since we already have a "stream" flag in PGOutputData, I am not
> sure if it would be better to introduce another flag for the same option.

I see your point. Another way is to have it as a static variable like
publish_no_origin. But since it's a trivial change, I'm also fine with
the current code.

>
> > ---
> > server_version = walrcv_server_version(LogRepWorkerWalRcvConn);
> > options.proto.logical.proto_version =
> > +server_version >= 16 ?
> > LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM :
> > server_version >= 15 ?
> > LOGICALREP_PROTO_TWOPHASE_VERSION_NUM :
> > server_version >= 14 ?
> > LOGICALREP_PROTO_STREAM_VERSION_NUM :
> > LOGICALREP_PROTO_VERSION_NUM;
> >
> > Instead of always using the new protocol version, I think we can use
> > LOGICALREP_PROTO_TWOPHASE_VERSION_NUM if the streaming is not
> > 'parallel'. That way, we don't need to change protocol version check
> > logic in pgoutput.c and don't need to expose defGetStreamingMode().
> > What do you think?
>
> I think that some users can also use the new version number when trying to get
> changes (via pg_logical_slot_peek_binary_changes or other functions), so I feel
> leaving the check for the new version number seems fine.
>
> Besides, I feel even if we don't use the new version number, we still need to use
> defGetStreamingMode to check if parallel mode is used, as we need to send
> abort_lsn when parallel is in use. I might be missing something, sorry for
> that. Can you please explain the idea a bit?

My idea is that we use LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM if
(server_version >= 16 && MySubscription->stream ==
SUBSTREAM_PARALLEL). If the stream is SUBSTREAM_ON, we use
LOGICALREP_PROTO_TWOPHASE_VERSION_NUM even if server_version is

Re: Perform streaming logical transactions by background workers and parallel apply

2022-12-07 Thread Masahiko Sawada
On Thu, Dec 8, 2022 at 1:52 PM Amit Kapila  wrote:
>
> On Wed, Dec 7, 2022 at 6:33 PM houzj.f...@fujitsu.com
>  wrote:
> >
> > On Wednesday, December 7, 2022 7:51 PM Masahiko Sawada 
> >  wrote:
> > >
> >
> > > ---
> > > When max_parallel_apply_workers_per_subscription is changed to a value
> > > lower than the number of parallel worker running at that time, do we
> > > need to stop extra workers?
> >
> I think we can do this, like adding a check in the main loop of the leader worker, and
> checking every time after reloading the conf. OTOH, we will also stop the worker after
> finishing a transaction, so I am slightly unsure whether we need to add another
> check here.
> But I am fine with adding it if you think it would be better.
> >
>
> I think this is tricky because it is possible that all active workers
> are busy with long-running transactions, so, I think stopping them
> doesn't make sense.

Right, we should not stop running parallel workers.

> I think as long as we are freeing them after use
> it seems okay to me. OTOH, each time after finishing the transaction,
> we can stop the workers, if the workers in the free pool exceed
> 'max_parallel_apply_workers_per_subscription'.

Or the apply leader worker can check that after reloading the config file.
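As a rough sketch of that idea (the pool list and helper names below are
invented for illustration and are not the patch's actual API), the leader
could trim idle workers right after processing a config reload:

static void
pa_trim_free_workers(void)
{
    while (list_length(ParallelApplyWorkerPool) >
           max_parallel_apply_workers_per_subscription)
    {
        ParallelApplyWorkerInfo *winfo =
            (ParallelApplyWorkerInfo *) linitial(ParallelApplyWorkerPool);

        /* Stop one idle (free) worker; busy workers keep running. */
        pa_stop_idle_worker(winfo);
        ParallelApplyWorkerPool = list_delete_first(ParallelApplyWorkerPool);
    }
}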

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: WAL Insertion Lock Improvements (was: Re: Avoid LWLockWaitForVar() for currently held WAL insertion lock in WaitXLogInsertionsToFinish())

2022-12-07 Thread Bharath Rupireddy
On Tue, Dec 6, 2022 at 12:00 AM Andres Freund  wrote:
>
> FWIW, I don't see an advantage in 0003. If it allows us to make something else
> simpler / faster, cool, but on its own it doesn't seem worthwhile.

Thanks. I will discard it.

> I think it'd be safe to optimize LWLockConflictsWithVar(), due to some
> pre-existing, quite crufty, code. LWLockConflictsWithVar() says:
>
>  * Test first to see if it the slot is free right now.
>  *
>  * XXX: the caller uses a spinlock before this, so we don't need a 
> memory
>  * barrier here as far as the current usage is concerned.  But that 
> might
>  * not be safe in general.
>
> which happens to be true in the single, indirect, caller:
>
> /* Read the current insert position */
> SpinLockAcquire(&Insert->insertpos_lck);
> bytepos = Insert->CurrBytePos;
> SpinLockRelease(&Insert->insertpos_lck);
> reservedUpto = XLogBytePosToEndRecPtr(bytepos);
>
> I think at the very least we ought to have a comment in
> WaitXLogInsertionsToFinish() highlighting this.

So, using a spinlock ensures no memory ordering occurs while reading
lock->state in LWLockConflictsWithVar()? How does spinlock that gets
acquired and released in the caller WaitXLogInsertionsToFinish()
itself and the memory barrier in the called function
LWLockConflictsWithVar() relate here? Can you please help me
understand this a bit?

> It's not at all clear to me that the proposed fast-path for LWLockUpdateVar()
> is safe. I think at the very least we could end up missing waiters that we
> should have woken up.
>
> I think it ought to be safe to do something like
>
> pg_atomic_exchange_u64()..
> if (!(pg_atomic_read_u32(&lock->state) & LW_FLAG_HAS_WAITERS))
>   return;

pg_atomic_exchange_u64(&lock->state, exchange_with_what_)? Exchange
will change the value, no?

> because the pg_atomic_exchange_u64() will provide the necessary memory
> barrier.

I'm reading some comments [1]; are these also true for 64-bit atomic
CAS? Does it mean that an atomic CAS operation inherently provides a
memory barrier? Can you please point me to it if it's described better
somewhere else?

[1]
 * Full barrier semantics.
 */
static inline uint32
pg_atomic_exchange_u32(volatile pg_atomic_uint32 *ptr,

/*
 * Get and clear the flags that are set for this backend. Note that
 * pg_atomic_exchange_u32 is a full barrier, so we're guaranteed that the
 * read of the barrier generation above happens before we atomically
 * extract the flags, and that any subsequent state changes happen
 * afterward.
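To make sure I follow, here is my reading of the suggested fast path as a
rough sketch (not actual patch code, and assuming the variable itself
becomes a pg_atomic_uint64): the atomic exchange supplies the full barrier,
and the wait-list lock is skipped when nobody is waiting:

void
LWLockUpdateVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 val)
{
    /* Full barrier: publish the new value before inspecting waiters. */
    pg_atomic_exchange_u64(valptr, val);

    /* If nobody is waiting, skip taking the wait-list lock entirely. */
    if (!(pg_atomic_read_u32(&lock->state) & LW_FLAG_HAS_WAITERS))
        return;

    /* ... otherwise fall through to the existing wake-up logic ... */
}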

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: ANY_VALUE aggregate

2022-12-07 Thread David G. Johnston
On Wed, Dec 7, 2022 at 10:00 PM Vik Fearing  wrote:

> On 12/7/22 04:22, David G. Johnston wrote:
> > On Mon, Dec 5, 2022 at 10:40 PM Vik Fearing 
> wrote:
> >
> >> On 12/6/22 05:57, David G. Johnston wrote:
> >>> On Mon, Dec 5, 2022 at 9:48 PM Vik Fearing 
> >> wrote:
> >>>
>  I can imagine an optimization that would remove an ORDER BY clause
>  because it isn't needed for any other aggregate.
> >>>
> >>>
> >>> I'm referring to the query:
> >>>
> >>> select any_value(v order by v) from (values (2),(1),(3)) as vals (v);
> >>> // produces 1, per the documented implementation-defined behavior.
> >>
> >> Implementation-dependent.  It is NOT implementation-defined, per spec.
> >
> > I really don't care all that much about the spec here given that ORDER BY
> > in an aggregate call is non-spec.
>
>
> Well, this is demonstrably wrong.
>
>  <array aggregate function> ::=
>    ARRAY_AGG <left paren> <value expression>
>    [ ORDER BY <sort specification list> ]
>    <right paren>
>

Demoable only by you and a few others...

We should update our documentation - the source of SQL Standard knowledge
for mere mortals.

https://www.postgresql.org/docs/current/sql-expressions.html#SYNTAX-AGGREGATES

"Note: The ability to specify both DISTINCT and ORDER BY in an aggregate
function is a PostgreSQL extension."

Apparently only DISTINCT remains as our extension.


>
> > You are de-facto creating a first_value aggregate (which is by definition
> > non-standard) whether you like it or not.
>
>
> I am de jure creating an any_value aggregate (which is by definition
> standard) whether you like it or not.
>

Yes, both statements seem true.  At least until we decide to start ignoring
a user's explicit order by clause.


>
> >> If you care about which value you get back, use something else.
> >
> > There isn't a "something else" to use so that isn't presently an option.
>
>
> The query
>
>  SELECT proposed_first_value(x ORDER BY y) FROM ...
>
> is equivalent to
>
>  SELECT (ARRAY_AGG(x ORDER BY y))[1] FROM ...
>
> so I am not very sympathetic to your claim of "no other option".
>

Semantically, yes, in terms of performance, not so much, for any
non-trivial sized group.

I'm done, and apologize for getting too emotionally invested in this.  I
hope to get others to voice enough +1s to get a first_value function into
core along-side this one (which makes the above discussion either moot or
deferred until there is a concrete use case for ignoring an explicit ORDER
BY).  If that doesn't happen, well, it isn't going to make or break us
either way.

David J.


Re: PGDOCS - Logical replication GUCs - added some xrefs

2022-12-07 Thread Peter Smith
On Thu, Dec 8, 2022 at 10:49 AM samay sharma  wrote:
>
...

> Thanks for the changes. See a few points of feedback below.
>

Patch v7 addresses this feedback. PSA.

> > +
> > + For logical replication, 
> > publishers
> > + (servers that do  > linkend="sql-createpublication">CREATE 
> > PUBLICATION)
> > + replicate data to subscribers
> > + (servers that do  > linkend="sql-createsubscription">CREATE 
> > SUBSCRIPTION).
> > + Servers can also be publishers and subscribers at the same time. Note,
> > + the following sections refers to publishers as "senders". The 
> > parameter
> > + max_replication_slots has a different meaning for 
> > the
> > + publisher and subscriber, but all other parameters are relevant only 
> > to
> > + one side of the replication. For more details about logical 
> > replication
> > + configuration settings refer to
> > + .
> > +
>
> The second last line seems a bit odd here. In my last round of feedback, I 
> had meant to add the line "The parameter  " onwards to the top of 
> logical-replication-config.sgml.
>
> What if we made the top of logical-replication-config.sgml like below?
>
> Logical replication requires several configuration options to be set. Most 
> configuration options are relevant only on one side of the replication (i.e. 
> publisher or subscriber). However, max_replication_slots is applicable on 
> both sides but has different meanings on each side.


OK. Moving this note is not quite following the same pattern as the
"streaming replication" intro blurb, but anyway it looks fine when
moved, so I've done as suggested.


>> OK. I copied the tablesync note back to config.sgml definition of
>> 'max_replication_slots' and removed the link as suggested. Frankly, I
>> also thought it is a bit strange that the max_replication_slots in the
>> “Sending Servers” section was describing this parameter for
>> “Subscribers”. OTOH, I did not want to split the definition in half so
>> instead, I’ve added another Subscriber  that just refers
>> back to this place. It looks like an improvement to me.
>
>
> Hmm, I agree this is a tricky scenario. However, to me, it seems odd to 
> mention the parameter twice as this chapter of the docs just lists each 
> parameter and describes them. So, I'd probably remove the reference to it in 
> the subscriber section. We should describe it's usage in different places in 
> the logical replication part of the docs (as we do).

The 'max_replication_slots' is problematic because it is almost like
having 2 different GUCs that happen to have the same name. So I
preferred it also gets a mention in the “Subscriber” section to make
it obvious that it wears 2 hats, but IIUC you prefer that 2nd mention
is not present because typically each GUC should appear once only in
this chapter. TBH, I think both ways could be successfully argued for
or against -- so I’m just going to leave this as-is for now and let
the committer decide.

> > +   
> > + > linkend="guc-max-logical-replication-workers">max_logical_replication_workers
> > +must be set to at least the number of subscriptions (for apply 
> > workers), plus
> > +some reserve for the table synchronization workers. Configuration 
> > parameter
> > + > linkend="guc-max-worker-processes">max_worker_processes
> > +may need to be adjusted to accommodate for replication workers, at 
> > least (
> > + > linkend="guc-max-logical-replication-workers">max_logical_replication_workers
> > ++ 1). Note, some extensions and parallel queries 
> > also
> > +take worker slots from max_worker_processes.
> > +   
>
> Maybe do max_worker_processes in a new line like the rest.

OK. Done as suggested.

--
Kind Regards,
Peter Smith.
Fujitsu Australia


v7-0001-Logical-replication-GUCs-links-and-tidy.patch
Description: Binary data


Re: add \dpS to psql

2022-12-07 Thread Tom Lane
Nathan Bossart  writes:
> On Wed, Dec 07, 2022 at 11:48:20PM -0500, Isaac Morland wrote:
>> My previous analysis
>> shows that there is no vast hidden demand for new privilege bits. If we
>> implement MAINTAIN to control access to VACUUM, ANALYZE, REFRESH, CLUSTER,
>> and REINDEX, we will cover everything that I can find that has been seriously
>> discussed on this list, and still leave 3 unused bits for future expansion.

> If we added CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX as individual
> privilege bits, we'd still have 13 remaining for future use.

I think the appropriate question is not "have we still got bits left?".
It should be more like "under what plausible scenario would it be useful
to grant somebody CLUSTER but not VACUUM privileges on a table?".

I'm really thinking that MAINTAIN is the right level of granularity
here.  Or maybe it's worth segregating exclusive-lock from
not-exclusive-lock maintenance.  But I really fail to see how it's
useful to distinguish CLUSTER from REINDEX, say.

regards, tom lane




Re: add \dpS to psql

2022-12-07 Thread Isaac Morland
On Thu, 8 Dec 2022 at 00:07, Nathan Bossart 
wrote:

> On Wed, Dec 07, 2022 at 11:48:20PM -0500, Isaac Morland wrote:
> > For what it's worth, I wouldn't bother changing the format of the
> > permission bits to expand the pool of available bits.
>
> 7b37823 expanded AclMode to 64 bits, so we now have room for 16 additional
> privileges (after the addition of VACUUM and ANALYZE in b5d6382).
>
> > My previous analysis
> > shows that there is no vast hidden demand for new privilege bits. If we
> > implement MAINTAIN to control access to VACUUM, ANALYZE, REFRESH,
> CLUSTER,
> > and REINDEX, we will cover everything that I can find that has been seriously
> > discussed on this list, and still leave 3 unused bits for future
> expansion.
>
> If we added CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX as individual
> privilege bits, we'd still have 13 remaining for future use.
>

I was a bit imprecise. I was comparing to the state before the recent
changes - so 12 bits used out of 16, with MAINTAIN being the 13th bit. I
think in my mind it's still approximately 2019 on some level.


Re: RFC: Logging plan of the running query

2022-12-07 Thread torikoshia

On 2022-12-07 03:41, Andres Freund wrote:

Hi,

This patch does not currently build, due to a conflicting oid:

I suggest you choose a random oid out of the "development purposes" 
range:


Thanks for your advice!
Attached updated patch.


BTW, since this patch depends on ProcessInterrupts() and the EXPLAIN code
which is used in auto_explain, I feel that the following discussion
also applies to this patch.



--
https://www.postgresql.org/message-id/CA%2BTgmoYW_rSOW4JMQ9_0Df9PKQ%3DsQDOKUGA4Gc9D8w4wui8fSA%40mail.gmail.com

explaining a query is a pretty
complicated operation that involves catalog lookups and lots of
complicated stuff, and I don't think that it would be safe to do all
of that at any arbitrary point in the code where ProcessInterrupts()
happened to fire.


If I can't come up with some workaround during the next Commitfest, I'm 
going to withdraw this proposal.


--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION

From a0d2179826a0fa224eaf37ca00d14954b76fde6b Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi 
Date: Thu, 8 Dec 2022 12:42:22 +0900
Subject: [PATCH v25] Add function to log the plan of the query
currently running on the backend.

Currently, we have to wait for the query execution to finish
to check its plan. This is not so convenient when
investigating long-running queries on production environments
where we cannot use debuggers.
To improve this situation, this patch adds
pg_log_query_plan() function that requests to log the
plan of the specified backend process.

By default, only superusers are allowed to request to log the
plans because allowing any user to issue this request at an
unbounded rate would cause lots of log messages, which can
lead to denial of service.

On receipt of the request, at the next CHECK_FOR_INTERRUPTS(),
the target backend logs its plan at LOG_SERVER_ONLY level, so
that these plans will appear in the server log but not be sent
to the client.

Reviewed-by: Bharath Rupireddy, Fujii Masao, Dilip Kumar,
Masahiro Ikeda, Ekaterina Sokolova, Justin Pryzby, Kyotaro
Horiguchi, Robert Treat, Alena Rybakina

---
 doc/src/sgml/func.sgml   |  49 +++
 src/backend/access/transam/xact.c|  13 ++
 src/backend/catalog/system_functions.sql |   2 +
 src/backend/commands/explain.c   | 139 ++-
 src/backend/executor/execMain.c  |  14 ++
 src/backend/storage/ipc/procsignal.c |   4 +
 src/backend/storage/lmgr/lock.c  |   9 +-
 src/backend/tcop/postgres.c  |   4 +
 src/backend/utils/init/globals.c |   2 +
 src/include/catalog/pg_proc.dat  |   6 +
 src/include/commands/explain.h   |   3 +
 src/include/miscadmin.h  |   1 +
 src/include/storage/lock.h   |   2 -
 src/include/storage/procsignal.h |   1 +
 src/include/tcop/pquery.h|   2 +-
 src/test/regress/expected/misc_functions.out |  54 +--
 src/test/regress/sql/misc_functions.sql  |  41 --
 17 files changed, 318 insertions(+), 28 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index e57ffce971..d39a4e50a0 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25536,6 +25536,25 @@ SELECT collation for ('foo' COLLATE "de_DE");

   
 
+  
+   
+
+ pg_log_query_plan
+
+pg_log_query_plan ( pid integer )
+boolean
+   
+   
+Requests to log the plan of the query currently running on the
+backend with specified process ID.
+It will be logged at LOG message level and
+will appear in the server log based on the log
+configuration set (See 
+for more information), but will not be sent to the client
+regardless of .
+
+  
+
   

 
@@ -25756,6 +25775,36 @@ LOG:  Grand total: 1651920 bytes in 201 blocks; 622360 free (88 chunks); 1029560
 because it may generate a large number of log messages.

 
+   
+pg_log_query_plan can be used
+to log the plan of a backend process. For example:
+
+postgres=# SELECT pg_log_query_plan(201116);
+ pg_log_query_plan
+---
+ t
+(1 row)
+
+The format of the query plan is the same as when VERBOSE,
+COSTS, SETTINGS and
+FORMAT TEXT are used in the EXPLAIN
+command. For example:
+
+LOG:  query plan running on backend with PID 201116 is:
+Query Text: SELECT * FROM pgbench_accounts;
+Seq Scan on public.pgbench_accounts  (cost=0.00..52787.00 rows=200 width=97)
+  Output: aid, bid, abalance, filler
+Settings: work_mem = '1MB'
+
+Note that when statements are executed inside a function, only the
+plan of the most deeply nested query is logged.
+   
+
+   
+When a subtransaction is aborted, pg_log_query_plan
+cannot log the query plan after the abort.
+   
+
   
 
   
diff --git 

Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

2022-12-07 Thread Maciek Sakrejda
In the pg_stat_statements docs, there are several column descriptions like

   Total number of ... by the statement

You added an additional sentence to some describing the equivalent
pg_stat_io values, but you only added a period to the previous
sentence for shared_blks_read (for other columns, the additional
description just follows directly). These should be consistent.

Otherwise, the docs look good to me.




Re: add \dpS to psql

2022-12-07 Thread Nathan Bossart
On Wed, Dec 07, 2022 at 11:48:20PM -0500, Isaac Morland wrote:
> For what it's worth, I wouldn't bother changing the format of the
> permission bits to expand the pool of available bits.

7b37823 expanded AclMode to 64 bits, so we now have room for 16 additional
privileges (after the addition of VACUUM and ANALYZE in b5d6382).

> My previous analysis
> shows that there is no vast hidden demand for new privilege bits. If we
> implement MAINTAIN to control access to VACUUM, ANALYZE, REFRESH, CLUSTER,
> and REINDEX, we will cover everything that I can find that has seriously
> discussed on this list, and still leave 3 unused bits for future expansion.

If we added CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX as individual
privilege bits, we'd still have 13 remaining for future use.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Support logical replication of DDLs

2022-12-07 Thread vignesh C
On Wed, 7 Dec 2022 at 17:50, Alvaro Herrera  wrote:
>
> I think this patch is split badly.
>
> You have:
>
> 0001 an enormous patch including some required infrastructure, plus the
> DDL deparsing bits themselves.
>
> 0002 another enormous (though not as much) patch, this time for
> DDL replication using the above.
>
> 0003 a bugfix for 0001, which includes changes in both the
> infrastructure and the deparsing bits.
>
> 0004 test stuff for 0002.
>
> 0005 Another bugfix for 0001
>
> 0006 Another bugfix for 0001
>
> As presented, I think it has very little chance of being reviewed
> usefully.  A better way to go about this, I think, would be:
>
> 0001 - infrastructure bits to support the DDL deparsing parts (all these
> new functions in ruleutils.c, sequence.c, etc).  That means, everything
> (?) that's currently in your 0001 except ddl_deparse.c and friends.
> Clearly there are several independent changes here; maybe it is possible
> to break it down even further.  This patch or these patches should also
> include the parts of 0003, 0005, 0006 that require changes outside of
> ddl_deparse.c.
> I expect that this patch should be fairly small.
>
> 0002 - ddl_deparse.c and its very close friends.  This should not have
> any impact on places such as ruleutils.c, sequence.c, etc.  The parts of
> the bugfixes (0001, 0005, 0006) that touch this code should be merged
> here as well; there's no reason to have them as separate patches.  Some
> test code should be here also, though it probably doesn't need to aim to
> be complete.
> This one is likely to be very large, but also self-contained.
>
> 0003 - ddlmessage.c and friends.  I understand that DDL-messaging is
> supporting infrastructure for DDL replication; I think it should be its
> own patch.  Probably include its own simple-ish test bits.
> Not a very large patch.
>
> 0004 - DDL replication proper, including 0004.
> Probably not a very large patch either, not sure.
>
>
> Some review comments, just skimming:
> - 0002 adds some functions to event_trigger.c, but that doesn't seem to
> be their place.  Maybe some new file in src/backend/replication/logical
> would make more sense.
>
> - publication_deparse_ddl_command_end has a long strcmp() list; why?
> Maybe change things so that it compares some object type enum instead.
>
> - CreatePublication has a long list of command tags; is that good?
> Maybe it'd be better to annotate the list in cmdtaglist.h somehow.
>
> - The change in pg_dump's getPublications needs updated to 16.
>
> - Don't "git add" src/bin/pg_waldump/logicalddlmsgdesc.c, just update
> its Makefile and meson.build
>
> - I think psql's \dRp should not have the new column at the end.
> Maybe one of:
> + Name | Owner | DDL | All tables | Inserts | Updates | Deletes | Truncates | 
> Via root
> + Name | Owner | All tables | DDL | Inserts | Updates | Deletes | Truncates | 
> Via root
> + Name | Owner | All tables | Inserts | Updates | Deletes | Truncates | DDL | 
> Via root
> (I would not add the "s" at the end of that column title, also).

Thanks for the comments; they will make reviewing the patch easier.
There are a couple of earlier reviews, [1] and [2], whose comments are
spread across the code in the patch, and it would be difficult to
handle them after restructuring. So we will handle [1] and [2] first
and then work on the restructuring you suggested.

[1] - 
https://www.postgresql.org/message-id/CAHut%2BPsERMFwO8oK3LFH_3CRG%2B512T%2Bay_viWzrgNetbH2MwxA%40mail.gmail.com
[2] - 
https://www.postgresql.org/message-id/CAHut%2BPuxo_kq2toicNK_BQdeccK3REGW-Xv8tVauFvTNku6V-w%40mail.gmail.com

Regards,
Vignesh




Re: ANY_VALUE aggregate

2022-12-07 Thread Vik Fearing

On 12/7/22 04:22, David G. Johnston wrote:

On Mon, Dec 5, 2022 at 10:40 PM Vik Fearing  wrote:


On 12/6/22 05:57, David G. Johnston wrote:

On Mon, Dec 5, 2022 at 9:48 PM Vik Fearing 

wrote:



I can imagine an optimization that would remove an ORDER BY clause
because it isn't needed for any other aggregate.



I'm referring to the query:

select any_value(v order by v) from (values (2),(1),(3)) as vals (v);
// produces 1, per the documented implementation-defined behavior.


Implementation-dependent.  It is NOT implementation-defined, per spec.


I really don't care all that much about the spec here given that ORDER BY
in an aggregate call is non-spec.



Well, this is demonstrably wrong.

<array aggregate function> ::=
  ARRAY_AGG <left paren> <value expression>
  [ ORDER BY <sort specification list> ]
  <right paren>


We often loosen the spec rules when they don't make technical sense to
us, but I don't know of any example of when we have tightened them.


The function has to choose some row from among its inputs,



True.



and the system has to obey an order by specification added to the function call.



False.



You are de-facto creating a first_value aggregate (which is by definition
non-standard) whether you like it or not.



I am de jure creating an any_value aggregate (which is by definition 
standard) whether you like it or not.




I'm just saying to be upfront
and honest about it - our users do want such a capability so maybe accept
that there is a first time for everything.  Not that picking an
advantageous "implementation-dependent" implementation should be considered
deviating from the spec.



Someone writing:

select any_value(v) from (values (2),(1),(3)) as vals (v) order by v;

Is not presently, nor am I saying, promised the value 1.

I'm assuming you are thinking of the second query form, while the guarantee
only needs to apply to the first.


I am saying that a theoretical pg_aggregate.aggorderdoesnotmatter could
bestow upon ANY_VALUE the ability to make those two queries equivalent.



That theoretical idea should not be entertained.  Removing a user's
explicitly added ORDER BY should be off-limits.  Any approach at
optimization here should simply look at whether an ORDER BY is specified
and pass that information to the function.  If the function itself really
believes that ordering matters it can emit its own runtime exception
stating that fact and the user can fix their query.



It absolutely should be entertained, and I plan on doing so in an 
upcoming thread.  Whether it errors or ignores is something that should 
be discussed on that thread.




If you care about which value you get back, use something else.


There isn't a "something else" to use so that isn't presently an option.



The query

SELECT proposed_first_value(x ORDER BY y) FROM ...

is equivalent to

SELECT (ARRAY_AGG(x ORDER BY y))[1] FROM ...

so I am not very sympathetic to your claim of "no other option".



I suppose it comes down to what level of belief and care you have that
people will simply mis-use this function if it is added in its current form
to get the desired first_value effect that it produces.



People who rely on explicitly undefined behavior get what they deserve 
when the implementation changes.

--
Vik Fearing





Re: Avoid streaming the transaction which are skipped (in corner cases)

2022-12-07 Thread Amit Kapila
On Wed, Dec 7, 2022 at 9:35 AM shiy.f...@fujitsu.com
 wrote:
>
> On Wed, Dec 7, 2022 12:01 PM Dilip Kumar  wrote:
> >
> > >
> > > Thanks. I think it is better to go ahead with this patch and once we
> > > decide what is the right thing to do in terms of GUC then we can try
> > > to add additional tests for this. Anyway, it is not that the code
> > > added by this patch is not getting covered by existing tests. What do
> > > you think?
> >
> > That makes sense to me.
> >
>
> +1
>

Pushed.

-- 
With Regards,
Amit Kapila.




Re: Perform streaming logical transactions by background workers and parallel apply

2022-12-07 Thread Amit Kapila
On Wed, Dec 7, 2022 at 6:33 PM houzj.f...@fujitsu.com
 wrote:
>
> On Wednesday, December 7, 2022 7:51 PM Masahiko Sawada 
>  wrote:
> >
>
> > ---
> > When max_parallel_apply_workers_per_subscription is changed to a value
> > lower than the number of parallel worker running at that time, do we
> > need to stop extra workers?
>
> I think we can do this, like adding a check in the main loop of the leader worker, and
> checking every time after reloading the conf. OTOH, we will also stop the worker after
> finishing a transaction, so I am slightly unsure whether we need to add another
> check here.
> But I am fine with adding it if you think it would be better.
>

I think this is tricky because it is possible that all active workers
are busy with long-running transactions, so, I think stopping them
doesn't make sense. I think as long as we are freeing them after use
it seems okay to me. OTOH, each time after finishing the transaction,
we can stop the workers, if the workers in the free pool exceed
'max_parallel_apply_workers_per_subscription'. I don't know if it is
worth it.

-- 
With Regards,
Amit Kapila.




Re: add \dpS to psql

2022-12-07 Thread Isaac Morland
On Wed, 7 Dec 2022 at 23:25, Tom Lane  wrote:

> Nathan Bossart  writes:
> > I haven't formed an opinion on whether VACUUM FULL should get its own
> bit,
> > but FWIW І just finished writing the first draft of a patch set to add
> bits
> > for CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX.  I plan to post that
> > tomorrow.
>
> The fact that we just doubled the number of available bits doesn't
> mean we should immediately strive to use them up.  Perhaps it'd
> be better to subsume these retail privileges under some generic
> "maintenance action" privilege?
>

That was my original suggestion:

https://www.postgresql.org/message-id/CAMsGm5c4DycKBYZCypfV02s-SC8GwF%2BKeTt%3D%3DvbWrFn%2Bdz%3DKeg%40mail.gmail.com

In that message I review the history of permission bit growth. A bit later
in the discussion, I did some investigation into the history of demand for
new permission bits and I proposed calling the new privilege MAINTAIN:

https://www.postgresql.org/message-id/CAMsGm5d%3D2gi4kyKONUJyYFwen%3DbsWm4hz_KxLXkEhMmg5WSWTA%40mail.gmail.com

For what it's worth, I wouldn't bother changing the format of the
permission bits to expand the pool of available bits. My previous analysis
shows that there is no vast hidden demand for new privilege bits. If we
implement MAINTAIN to control access to VACUUM, ANALYZE, REFRESH, CLUSTER,
and REINDEX, we will cover everything that I can find that has been seriously
discussed on this list, and still leave 3 unused bits for future expansion.
There is even justification for stopping after this expansion: if it is
done, then schema changes (DDL) will only be able to be done by owner; data
changes (insert, update, delete, as well as triggering of automatic data
maintenance actions) will be able to be done by anybody who is granted
permission.

My guess is that if we ever do expand the privilege bit system, it should
be in a way that removes the limit entirely, replacing a bit map model with
something more like a table with one row for each individual grant, with a
field indicating which grant is involved. But that is a hypothetical future.


Re: add \dpS to psql

2022-12-07 Thread Nathan Bossart
On Wed, Dec 07, 2022 at 11:25:32PM -0500, Tom Lane wrote:
> The fact that we just doubled the number of available bits doesn't
> mean we should immediately strive to use them up.  Perhaps it'd
> be better to subsume these retail privileges under some generic
> "maintenance action" privilege?

That's fine with me, but I wouldn't be surprised if there's disagreement on
how to group the commands.  I certainly don't want to use up the rest of
the bits right away, but there might not be too many more existing
privileges after these three that deserve them.  Perhaps I should take
inventory and offer a plan for all the remaining privileges.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: add \dpS to psql

2022-12-07 Thread Tom Lane
Nathan Bossart  writes:
> I haven't formed an opinion on whether VACUUM FULL should get its own bit,
> but FWIW І just finished writing the first draft of a patch set to add bits
> for CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX.  I plan to post that
> tomorrow.

The fact that we just doubled the number of available bits doesn't
mean we should immediately strive to use them up.  Perhaps it'd
be better to subsume these retail privileges under some generic
"maintenance action" privilege?

regards, tom lane




Re: [PoC] Federated Authn/z with OAUTHBEARER

2022-12-07 Thread Andrey Chudnovsky
That being said, the Diagram 2 would look like this with our proposal:
  Application               libpq                                    Postgres
  -----------               -----                                    --------
  PQconnect --------------> |
                            | ---- Empty Token --------------------> Pre-Auth Hook
                            | <--- Error (w\ Root URL + Audience) --
  Hook <------------------- |
  [get token]               |
  PQconnect --------------> |
                            | ---- Access Token -------------------> Validator Hook
                            | <--- Authorization Success/Failure ---


Here the application takes care of all token acquisition logic, while
the server-side hook participates in the pre-authentication reply.

That is definitely a required scenario for the long term and the
easiest to implement in the client core.
And if we can do at least that flow in PG16 it will be a strong
foundation to provide more support for specific grants in libpq going
forward.

Does the diagram above look good to you? We can then start cleaning up
the patch to get that in first.

Thanks!
Andrey.


On Wed, Dec 7, 2022 at 3:22 PM Andrey Chudnovsky  wrote:
>
> > I think it's okay to have the extension and HBA collaborate to provide
> > discovery information. Your proposal goes further than that, though,
> > and makes the server aware of the chosen client flow. That appears to
> > be an architectural violation: why does an OAuth resource server need
> > to know the client flow at all?
>
> Ok. It may have been left there from intermediate iterations. We did
> consider making the extension drive the flow for a specific grant_type, but
> decided against that idea, for the same reason you point to.
> Is it correct that your main concern about use of grant_type was that
> it's propagated to the server? Then yes, we will remove sending it to
> the server.
>
> > Ideally, yes, but that only works if all identity providers implement
> > the same flows in compatible ways. We're already seeing instances
> > where that's not the case and we'll necessarily have to deal with that
> > up front.
>
> Yes, based on our analysis OIDC spec is detailed enough, that
> providers implementing that one, can be supported with generic code in
> libpq / client.
> Github specifically won't fit there though. Microsoft Azure AD,
> Google, Okta (including Auth0) will.
> Theoretically discovery documents can be returned from the extension
> (server-side) which is provider specific. Though we didn't plan to
> prioritize that.
>
> > That seems to be restating the goal of OAuth and OIDC. Can you explain
> > how the incompatible change allows you to accomplish this better than
> > standard implementations?
>
> Do you refer to passing grant_type to the server? Which we will get
> rid of in the next iteration. Or other incompatible changes as well?
>
> > Why? I claim that standard OAUTHBEARER can handle all of that. What
> > does your proposed architecture (the third diagram) enable that my
> > proposed hook (the second diagram) doesn't?
>
> The hook proposed on the 2nd diagram effectively delegates all Oauth
> flows implementations to the client.
> We propose libpq takes care of pulling OpenId discovery and coordination.
> Which is effectively Diagram 1 + more flows + server hook providing
> root url/audience.
>
> Created the diagrams with all components for 3 flows:
> 1. Authorization code grant (Clients with Browser access):
>   Application               libpq                                    Postgres
>   -----------               -----                                    --------
>   PQconnect [auth_code] --> |
>                             | ---- Empty Token --------------------> Pre-Auth Hook
>                             | <--- Error (w\ Root URL + Audience) --
Re: add \dpS to psql

2022-12-07 Thread Nathan Bossart
On Wed, Dec 07, 2022 at 08:39:30PM -0600, Justin Pryzby wrote:
> I think if vacuum privilege allows vacuum full, then it ought to also
> allow cluster.  But I suggest that it'd be even better if it doesn't
> allow either, and there was a separate privilege for those.
> 
> Disclaimer: I have not been following these threads.

I haven't formed an opinion on whether VACUUM FULL should get its own bit,
but FWIW І just finished writing the first draft of a patch set to add bits
for CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX.  I plan to post that
tomorrow.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: fix and document CLUSTER privileges

2022-12-07 Thread Nathan Bossart
On Wed, Dec 07, 2022 at 08:25:59PM -0600, Justin Pryzby wrote:
> Your patch makes it inconsistent with vacuum full, which is strange
> because vacuum full calls cluster.
> 
> postgres=> VACUUM FULL t;
> VACUUM
> postgres=> CLUSTER t;
> ERROR:  must be owner of table t

This is the existing behavior on HEAD.  I think it has been this way for a
while.  Granted, that doesn't mean it's ideal, but AFAICT it's intentional.

Looking closer, the database ownership check in
get_tables_to_cluster_partitioned() appears to have no meaningful effect.
In this code path, cluster_rel() will always be called with CLUOPT_RECHECK,
and that function only checks for table ownership.  Anything gathered in
get_tables_to_cluster_partitioned() that the user doesn't own will be
skipped.  So, I don't think my patch changes the behavior in any meaningful
way, but I still think it's worthwhile to make the checks consistent.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: allow segment size to be set to < 1GiB

2022-12-07 Thread Andres Freund
Hi,

On 2022-11-09 12:25:09 -0800, Andres Freund wrote:
> Updated patch attached.

I pushed it now.


> I made one autoconf and one meson CI task use a small block size, but just to
> ensure it works on both. I'd probably leave it set on one, so we keep the
> coverage for cfbot?

It doesn't seem to cost that much, so I left it set in those two tasks for
now.

Greetings,

Andres Freund




Re: Error-safe user functions

2022-12-07 Thread Corey Huinker
On Wed, Dec 7, 2022 at 12:17 PM Tom Lane  wrote:

> Corey Huinker  writes:
> > In my attempt to implement CAST...DEFAULT, I noticed that I immediately
> > needed an
> > OidInputFunctionCallSafe, which was trivial but maybe something we want
> to
> > add to the infra patch, but the comments around that function also
> somewhat
> > indicate that we might want to just do the work in-place and call
> > InputFunctionCallSafe directly. Open to both ideas.
>
> I'm a bit skeptical of that.  IMO using OidInputFunctionCall is only
> appropriate in places that will be executed just once per query.
>

That is what's happening when the expr of the existing CAST ( expr AS
typename ) is a constant and we want to just resolve the constant at parse
time.




Re: moving extraUpdatedCols out of RangeTblEntry (into ModifyTable)

2022-12-07 Thread Amit Langote
On Wed, Dec 7, 2022 at 8:54 PM Amit Langote  wrote:
> Per Alvaro's advice, forking this from [1].
>
> In that thread, Tom had asked if it wouldn't be better to find a new
> place to put extraUpdatedCols [2] instead of RangeTblEntry, given that
> the permission-checking fields are now no longer stored in
> RangeTblEntry.
>
> In [3] of the same thread, I proposed to move it into a List of
> Bitmapsets in ModifyTable, as implemented in the attached patch that I
> had been posting to that thread.
>
> The latest version of that patch is attached herewith.

Updated to replace a list_nth() with list_nth_node() and rewrote the
commit message.

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com


v2-0001-Add-per-result-relation-extraUpdatedCols-to-Modif.patch
Description: Binary data


Re: Assertion failure in SnapBuildInitialSnapshot()

2022-12-07 Thread Masahiko Sawada
On Mon, Nov 21, 2022 at 4:31 PM Amit Kapila  wrote:
>
> On Sat, Nov 19, 2022 at 6:35 AM Andres Freund  wrote:
> >
> > On 2022-11-18 11:20:36 +0530, Amit Kapila wrote:
> > > Okay, updated the patch accordingly.
> >
> > Assuming it passes tests etc, this'd work for me.
> >
>
> Thanks, Pushed.

The same assertion failure has been reported on another thread[1].
Since I could reproduce this issue several times in my environment
I've investigated the root cause.

I think there is a race condition of updating
procArray->replication_slot_xmin by CreateInitDecodingContext() and
LogicalConfirmReceivedLocation().

What I observed in the test was that a walsender process called:
SnapBuildProcessRunningXacts()
  LogicalIncreaseXminForSlot()
LogicalConfirmReceivedLocation()
  ReplicationSlotsComputeRequiredXmin(false).

In ReplicationSlotsComputeRequiredXmin() it acquired the
ReplicationSlotControlLock and got 0 as the minimum xmin since there
was no wal sender having effective_xmin. Before calling
ProcArraySetReplicationSlotXmin() (i.e. before acquiring
ProcArrayLock), another walsender process called
CreateInitDecodingContext(), acquired ProcArrayLock, computed
slot->effective_catalog_xmin, called
ReplicationSlotsComputeRequiredXmin(true). Since its
effective_catalog_xmin had been set, it got 39968 as the minimum xmin,
and updated replication_slot_xmin. However, as soon as the second
walsender released ProcArrayLock, the first walsender updated the
replication_slot_xmin to 0. After that, the second walsender called
SnapBuildInitialSnapshot(), and GetOldestSafeDecodingTransactionId()
returned an XID newer than snap->xmin.

One idea to fix this issue is that in
ReplicationSlotsComputeRequiredXmin(), we compute the minimum xmin
while holding both ProcArrayLock and ReplicationSlotControlLock, and
release only ReplicationSlotControlLock before updating the
replication_slot_xmin. I'm concerned it will increase the contention
on ProcArrayLock but I've attached the patch for discussion.
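To make the idea concrete, here is a rough sketch of the locking change
described above (illustrative only; the attached patch is the real
proposal):

void
ReplicationSlotsComputeRequiredXmin(bool already_locked)
{
    TransactionId agg_xmin = InvalidTransactionId;
    TransactionId agg_catalog_xmin = InvalidTransactionId;

    /* Hold ProcArrayLock across both the computation and the update. */
    if (!already_locked)
        LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);

    LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
    for (int i = 0; i < max_replication_slots; i++)
    {
        ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
        TransactionId effective_xmin;
        TransactionId effective_catalog_xmin;

        if (!s->in_use)
            continue;

        SpinLockAcquire(&s->mutex);
        effective_xmin = s->effective_xmin;
        effective_catalog_xmin = s->effective_catalog_xmin;
        SpinLockRelease(&s->mutex);

        if (TransactionIdIsValid(effective_xmin) &&
            (!TransactionIdIsValid(agg_xmin) ||
             TransactionIdPrecedes(effective_xmin, agg_xmin)))
            agg_xmin = effective_xmin;

        if (TransactionIdIsValid(effective_catalog_xmin) &&
            (!TransactionIdIsValid(agg_catalog_xmin) ||
             TransactionIdPrecedes(effective_catalog_xmin, agg_catalog_xmin)))
            agg_catalog_xmin = effective_catalog_xmin;
    }
    LWLockRelease(ReplicationSlotControlLock);

    /*
     * Still holding ProcArrayLock, so a concurrent walsender cannot
     * overwrite our value with a stale one.
     */
    ProcArraySetReplicationSlotXmin(agg_xmin, agg_catalog_xmin, true);

    if (!already_locked)
        LWLockRelease(ProcArrayLock);
}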

Regards,

[1] 
https://www.postgresql.org/message-id/tencent_7EB71DA5D7BA00EB0B429DCE45D0452B6406%40qq.com

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


fix_slot_xmin_race_condition.patch
Description: Binary data


Re: add \dpS to psql

2022-12-07 Thread Justin Pryzby
On Wed, Dec 07, 2022 at 10:48:49AM +0300, Pavel Luzanov wrote:
> Furthermore. The VACUUM privilege allows you to also execute VACUUM FULL.
> VACUUM and VACUUM FULL are commands with similar names, but work completely
> differently.
> It may be worth clarifying on this page:
> https://www.postgresql.org/docs/devel/ddl-priv.html
> 
> Something like: Allows VACUUM on a relation, including VACUUM FULL.

Since (as you said) they work completely differently, I think it'd be
more useful if vacuum_full were a separate privilege, rather than being
included in vacuum.  And cluster could be allowed whenever vacuum_full
is allowed.

> There is a very similar command to VACUUM FULL with a different name -
> CLUSTER.  The VACUUM privilege does not apply to the CLUSTER command.
> This is probably correct.

I think if vacuum privilege allows vacuum full, then it ought to also
allow cluster.  But I suggest that it'd be even better if it doesn't
allow either, and there was a separate privilege for those.

Disclaimer: I have not been following these threads.

-- 
Justin




Re: [DOCS] Stats views and functions not in order?

2022-12-07 Thread Peter Smith
Thanks for the ongoing feedback.

PSA patches for v9*

v9-0001 - Now the table rows are ordered per PeterE's suggestions [1]

v9-0002 - All the review comments from DavidJ [2] are addressed

v9-0003 - Unchanged since v8.

--
[1] 
https://www.postgresql.org/message-id/cfdb0030-8f62-ed6d-4246-8d9bf855bc48%40enterprisedb.com
[2] 
https://www.postgresql.org/message-id/CAKFQuwby7xWHek8%3D6UPaNXrhGA-i0B2zMOmBoGHgc4yaO8NH_w%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia.


v9-0001-Re-order-Table-28.2-Collected-Statistics-Views.patch
Description: Binary data


v9-0003-Add-Statistics-Views-section-and-refentry-for-eac.patch
Description: Binary data


v9-0002-Cleanup-view-name-hyperlinks-for-Tables-28.1-and-.patch
Description: Binary data


Re: fix and document CLUSTER privileges

2022-12-07 Thread Justin Pryzby
On Wed, Dec 07, 2022 at 02:39:24PM -0800, Nathan Bossart wrote:
> Hi hackers,
> 
> While looking into other opportunities for per-table permissions, I noticed
> a weird discrepancy in CLUSTER.  When evaluating whether the current user
> has permission to CLUSTER a table, we ordinarily just check for ownership.
> However, the database owner is also allowed to CLUSTER all partitions that
> are not shared.  This was added in 3f19e17, and I didn't see any discussion
> about it in the corresponding thread [0].
> 
> My first instinct is that we should just remove the database ownership
> check, which is what I've done in the attached patch.  I don't see any
> strong reason to complicate matters with special
> database-owner-but-not-shared checks like other commands (e.g., VACUUM).
> But perhaps we should do so just for consistency's sake.  Thoughts?

Your patch makes it inconsistent with vacuum full, which is strange
because vacuum full calls cluster.

postgres=> VACUUM FULL t;
VACUUM
postgres=> CLUSTER t;
ERROR:  must be owner of table t

BTW, it'd be helpful to copy the relevant parties on this kind of
message, especially if there's a new thread dedicated just to this.

-- 
Justin




Re: [PATCH] pg_dump: lock tables in batches

2022-12-07 Thread Tom Lane
Andres Freund  writes:
> On 2022-12-07 17:53:05 -0500, Tom Lane wrote:
>> Is "-s" mode actually a relevant criterion here?  With per-table COPY
>> commands added into the mix you could not possibly get better than 2x
>> improvement, and likely a good deal less.

> Well, -s isn't something used all that rarely, so it'd not be insane to
> optimize it in isolation. But more importantly, I think the potential win
> without -s is far bigger than 2x, because the COPYs can be done in parallel,
> whereas the locking happens in the non-parallel stage.

True, and there's the reduce-the-lock-window issue that Jacob mentioned.

> With just a 5ms delay, very well within normal network latency range, I get:
> [ a nice win ]

OK.  I'm struggling to figure out why I rejected this idea last year.
I know that I thought about it and I'm fairly certain I actually
tested it.  Maybe I only tried it with near-zero-latency local
loopback; but I doubt that, because the potential for network
latency was certainly a factor in that whole discussion.

One idea is that I might've tried it before getting rid of all the
other per-object queries, at which point it wouldn't have stood out
quite so much.  But I'm just guessing.  I have a nagging feeling
there was something else.

Oh well, I guess we can always revert it if we discover a problem later.

regards, tom lane




Re: Error-safe user functions

2022-12-07 Thread Tom Lane
Andres Freund  writes:
> I wonder if there are potential use-cases for levels other than ERROR. I can
> potentially see us wanting to defer some FATALs, e.g. when they occur in
> process exit hooks.

I thought about that early on, and concluded not.  The whole thing is
moot for levels less than ERROR, of course, and I'm having a hard
time seeing how it could be useful for FATAL or PANIC.  Maybe I just
lack imagination, but if a call is specifying FATAL rather than just
ERROR then it seems to me it's already a special snowflake rather
than something we could fold into a generic non-error behavior.

> For a moment I was worried that it could lead to odd behaviour that we don't
> do get_error_stack_entry() when !details_wanted, due to not raising an error
> we'd otherwise raise. But that's a should-never-be-reached case, so ...

I don't see how.  Returning false out of errsave_start causes the
errsave macro to immediately give control back to the caller, which
will go on about its business.
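For anyone following along, a simplified sketch of how such a macro can be
arranged (not quoted from the patch) makes that early-exit control flow
visible:

#define errsave_domain(context, domain, ...) \
    do { \
        struct Node *context_ = (context); \
        pg_prevent_errno_in_scope(); \
        /* when errsave_start() returns false, nothing else runs here */ \
        if (errsave_start(context_, domain)) \
            __VA_ARGS__, errsave_finish(context_, __FILE__, __LINE__, __func__); \
    } while (0)

#define errsave(context, ...) \
    errsave_domain(context, TEXTDOMAIN, __VA_ARGS__)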

> It seems somewhat ugly transport this knowledge via edata->elevel, but it's
> not too bad.

The LOG-vs-ERROR business, you mean?  Yeah.  I considered adding another
bool flag to ErrorData, but couldn't convince myself it was worth the
trouble.  If we find a problem we can do that sometime in future.

>> +/*
>> + * We cannot include nodes.h yet, so make a stub reference.  (This is also
>> + * used by fmgr.h, which doesn't want to depend on nodes.h either.)
>> + */
>> +typedef struct Node *NodePtr;

> Seems like it'd be easier to just forward declare the struct, and use the
> non-typedef'ed name in the header than to have to deal with these
> interdependencies and the differing typenames.

Meh.  I'm a little allergic to writing "struct foo *" in function argument
lists, because I so often see gcc pointing out that if struct foo isn't
yet known then that can silently mean something different than you
intended.  With the typedef, it either works or is an error, no halfway
about it.  And the struct way isn't really much better in terms of
having two different notations to use rather than only one.

> Perhaps worth noting here that the reason why the errsave_start/errsave_finish
> split exist differs a bit from the reason in ereport_domain()? "Over there"
> it's just about not wanting to incur overhead when the message isn't logged,
> but here we'll always have >= ERROR, but ->details_wanted can still lead to
> not wanting to incur the overhead.

Hmmm ... it seems like the same reason to me, we don't want to incur the
overhead if the "start" function says not to.

> Is it ok that we throw an error in lookup_rowtype_tupdesc()?

Yeah, that should fall in the category of internal errors I think.
I don't see how you'd reach that from a bad input string.

(Or to be more precise, the point of pg_input_is_valid is to tell
you whether the input string is valid, not to tell you whether the
type name is valid; if you're worried about the latter you need
a separate and earlier test.)
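
As a purely illustrative sketch (not code from the posted patches; the
"widget" type and its range check are made up), an input function written
for this mechanism would report a bad value roughly like so.  When the
caller supplies an ErrorSaveContext, ereturn() hands back the dummy Datum
instead of throwing:

#include "postgres.h"
#include "fmgr.h"

PG_FUNCTION_INFO_V1(widget_in);

Datum
widget_in(PG_FUNCTION_ARGS)
{
	const char *str = PG_GETARG_CSTRING(0);
	char	   *endptr;
	long		val;

	errno = 0;
	val = strtol(str, &endptr, 10);

	/* soft error: returns (Datum) 0 if an ErrorSaveContext was passed in */
	if (errno != 0 || endptr == str || *endptr != '\0' ||
		val < 0 || val > 9999)
		ereturn(fcinfo->context, (Datum) 0,
				errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
				errmsg("invalid input syntax for type %s: \"%s\"",
					   "widget", str));

	PG_RETURN_INT32((int32) val);
}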

regards, tom lane




Re: PGDOCS - Logical replication GUCs - added some xrefs

2022-12-07 Thread samay sharma
Hi,

On Tue, Dec 6, 2022 at 11:12 PM Peter Smith  wrote:

> On Tue, Dec 6, 2022 at 5:57 AM samay sharma 
> wrote:
> >
> > Hi,
> >
> > On Mon, Oct 24, 2022 at 12:45 AM Peter Smith 
> wrote:
> >>
> >> Hi hackers.
> >>
> >> There is a docs Logical Replication section "31.10 Configuration
> >> Settings" [1] which describes some logical replication GUCs, and
> >> details on how they interact with each other and how to take that into
> >> account when setting their values.
> >>
> >> There is another docs Server Configuration section "20.6 Replication"
> >> [2] which lists the replication-related GUC parameters, and what they
> >> are for.
> >>
> >> Currently AFAIK those two pages are unconnected, but I felt it might
> >> be helpful if some of the parameters in the list [2] had xref links to
> >> the additional logical replication configuration information [1]. PSA
> >> a patch to do that.
> >
> >
> > +1 on the patch. Some feedback on v5 below.
> >
>
> Thanks for your detailed review comments!
>
> I have changed most things according to your suggestions. Please check
> patch v6.
>

Thanks for the changes. See a few points of feedback below.

> +
> + For logical replication, publishers
> + (servers that do CREATE PUBLICATION)
> + replicate data to subscribers
> + (servers that do CREATE SUBSCRIPTION).
> + Servers can also be publishers and subscribers at the same time. Note,
> + the following sections refers to publishers as "senders". The parameter
> + max_replication_slots has a different meaning for the
> + publisher and subscriber, but all other parameters are relevant only to
> + one side of the replication. For more details about logical replication
> + configuration settings refer to
> + .
> +

The second last line seems a bit odd here. In my last round of feedback, I
had meant to add the line "The parameter  " onwards to the top of
logical-replication-config.sgml.

What if we made the top of logical-replication-config.sgml like below?

Logical replication requires several configuration options to be set. Most
configuration options are relevant only on one side of the replication
(i.e. publisher or subscriber). However, max_replication_slots is
applicable on both sides but has different meanings on each side.

>
> > > +
> > > + For logical replication configuration settings refer
> > > + also to .
> > > +
> > > +
> >
> > I feel the top paragraph needs to explain terminology for logical
> replication like it does for physical replication in addition to linking to
> the logical replication config page. I'm recommending this as we use terms
> like subscriber etc. in description of parameters without introducing them
> first.
> >
> > As an example, something like below might work.
> >
> > These settings control the behavior of the built-in streaming
> replication feature (see Section 27.2.5) and logical replication (link).
> >
> > For physical replication, servers will be either a primary or a standby
> server. Primaries can send data, while standbys are always receivers of
> replicated data. When cascading replication (see Section 27.2.7) is used,
> standby servers can also be senders, as well as receivers. Parameters are
> mainly for sending and standby servers, though some parameters have meaning
> only on the primary server. Settings may vary across the cluster without
> problems if that is required.
> >
> > For logical replication, servers will either be publishers (also called
> senders in the sections below) or subscribers. Publishers are ,
> Subscribers are...
> >
>
> OK. I split this blurb into 2 parts – streaming and logical
> replication. The streaming replication part is the same as before. The
> logical replication part is new.
>
> > > +   
> > > + See  for more details
> > > + about setting max_replication_slots for logical
> > > + replication.
> > > +
> >
> >
> > The link doesn't add any new information regarding max_replication_slots
> other than "to reserve some for table sync" and has a good amount of
> unrelated info. I think it might be useful to just put a line here asking
> to reserve some for table sync instead of linking to the entire logical
> replication config section.
> >
>
> OK. I copied the tablesync note back to config.sgml definition of
> 'max_replication_slots' and removed the link as suggested. Frankly, I
> also thought it is a bit strange that the max_replication_slots in the
> “Sending Servers” section was describing this parameter for
> “Subscribers”. OTOH, I did not want to split the definition in half so
> instead, I’ve added another Subscriber  that just refers
> back to this place. It looks like an improvement to me.
>

Hmm, I agree this is a tricky scenario. However, to me, it seems odd to
mention the parameter twice as this chapter of the docs just lists each
parameter and describes them. So, I'd probably remove the reference to it
in the subscriber 

Re: [PATCH] pg_dump: lock tables in batches

2022-12-07 Thread Andres Freund
Hi,

On 2022-12-07 17:53:05 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > With an artificial delay of 100ms, the perf difference between the batching
> > patch and not using the batching patch is huge. Huge enough that I don't have
> > the patience to wait for the non-batched case to complete.
> 
> Clearly, if you insert a sufficiently large artificial round-trip delay,
> even squeezing a single command out of a pg_dump run will appear
> worthwhile.  What I'm unsure about is whether it's worthwhile at
> realistic round-trip delays (where "realistic" means that the dump
> performance would otherwise be acceptable).  I think the reason I didn't
> pursue this last year is that experimentation convinced me the answer
> was "no".

It seems to be a win even without any artificial delay. Not a *huge* win, but
a noticable win. And even just a few ms make it quite painful.


> > With batching pg_dump -s -h localhost t10000 took 0:16.23 elapsed, without I
> > cancelled after 603 tables had been locked, which took 2:06.43.
> 
> Is "-s" mode actually a relevant criterion here?  With per-table COPY
> commands added into the mix you could not possibly get better than 2x
> improvement, and likely a good deal less.

Well, -s isn't something used all that rarely, so it'd not be insane to
optimize it in isolation. But more importantly, I think the potential win
without -s is far bigger than 2x, because the COPYs can be done in parallel,
whereas the locking happens in the non-parallel stage.

With just a 5ms delay, very well within normal network latency range, I get:

pg_dump.master -h localhost -j10 -f /tmp/pg_dump_backup -Fd t10000
2m7.830s

pg_dump.pipeline -h localhost -j10 -f /tmp/pg_dump_backup -Fd t10000
0m24.183s

pg_dump.batch -h localhost -j10 -f /tmp/pg_dump_backup -Fd t10000
0m24.321s

Greetings,

Andres Freund




Re: Error-safe user functions

2022-12-07 Thread Andres Freund
Hi,

On 2022-12-07 17:32:21 -0500, Tom Lane wrote:
> I already pushed the  elog-refactoring patch, since that seemed
> uncontroversial.  0001 attached covers the same territory as before,
> but I regrouped the rest so that 0002 installs the new test support
> functions, then 0003 adds both the per-datatype changes and
> corresponding test cases for bool, int4, arrays, and records.
> The idea here is that 0003 can be pointed to as a sample of what
> has to be done to datatype input functions, while the preceding
> patches can be cited as relevant documentation.  (I've not decided
> whether to squash 0001 and 0002 together or commit them separately.

I think they make sense as is.


> Does it make sense to break 0003 into 4 separate commits, or is
> that overkill?)

I think it'd be fine either way.


> + * If "context" is an ErrorSaveContext node, but the node creator only wants
> + * notification of the fact of a soft error without any details, just set
> + * the error_occurred flag in the ErrorSaveContext node and return false,
> + * which will cause us to skip the remaining error processing steps.
> + *
> + * Otherwise, create and initialize error stack entry and return true.
> + * Subsequently, errmsg() and perhaps other routines will be called to 
> further
> + * populate the stack entry.  Finally, errsave_finish() will be called to
> + * tidy up.
> + */
> +bool
> +errsave_start(NodePtr context, const char *domain)

I wonder if there are potential use-cases for levels other than ERROR. I can
potentially see us wanting to defer some FATALs, e.g. when they occur in
process exit hooks.


> +{
> + ErrorSaveContext *escontext;
> + ErrorData  *edata;
> +
> + /*
> +  * Do we have a context for soft error reporting?  If not, just punt to
> +  * errstart().
> +  */
> + if (context == NULL || !IsA(context, ErrorSaveContext))
> + return errstart(ERROR, domain);
> +
> + /* Report that a soft error was detected */
> + escontext = (ErrorSaveContext *) context;
> + escontext->error_occurred = true;
> +
> + /* Nothing else to do if caller wants no further details */
> + if (!escontext->details_wanted)
> + return false;
> +
> + /*
> +  * Okay, crank up a stack entry to store the info in.
> +  */
> +
> + recursion_depth++;
> +
> + /* Initialize data for this error frame */
> + edata = get_error_stack_entry();

For a moment I was worried that it could lead to odd behaviour that we don't
do get_error_stack_entry() when !details_wanted, due to not raising an error
we'd otherwise raise. But that's a should-never-be-reached case, so ...


> +/*
> + * errsave_finish --- end a "soft" error-reporting cycle
> + *
> + * If errsave_start() decided this was a regular error, behave as
> + * errfinish().  Otherwise, package up the error details and save
> + * them in the ErrorSaveContext node.
> + */
> +void
> +errsave_finish(NodePtr context, const char *filename, int lineno,
> +const char *funcname)
> +{
> + ErrorSaveContext *escontext = (ErrorSaveContext *) context;
> + ErrorData  *edata = &errordata[errordata_stack_depth];
> +
> + /* verify stack depth before accessing *edata */
> + CHECK_STACK_DEPTH();
> +
> + /*
> +  * If errsave_start punted to errstart, then elevel will be ERROR or
> +  * perhaps even PANIC.  Punt likewise to errfinish.
> +  */
> + if (edata->elevel >= ERROR)
> + {
> + errfinish(filename, lineno, funcname);
> + pg_unreachable();
> + }

It seems somewhat ugly to transport this knowledge via edata->elevel, but it's
not too bad.



> +/*
> + * We cannot include nodes.h yet, so make a stub reference.  (This is also
> + * used by fmgr.h, which doesn't want to depend on nodes.h either.)
> + */
> +typedef struct Node *NodePtr;

Seems like it'd be easier to just forward declare the struct, and use the
non-typedef'ed name in the header than to have to deal with these
interdependencies and the differing typenames.


> +/*--
> + * Support for reporting "soft" errors that don't require a full transaction
> + * abort to clean up.  This is to be used in this way:
> + *   errsave(context,
> + *   errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
> + *   errmsg("invalid input syntax for type %s: 
> \"%s\"",
> + *  "boolean", in_str),
> + *   ... other errxxx() fields as needed ...);
> + *
> + * "context" is a node pointer or NULL, and the remaining auxiliary calls
> + * provide the same error details as in ereport().  If context is not a
> + * pointer to an ErrorSaveContext node, then errsave(context, ...)
> + * behaves identically to ereport(ERROR, ...).  If context is a pointer
> + * to an ErrorSaveContext node, then the information provided by the
> + * auxiliary calls is stored in the context node and control returns
> + * 

Re: [PoC] Federated Authn/z with OAUTHBEARER

2022-12-07 Thread Andrey Chudnovsky
> I think it's okay to have the extension and HBA collaborate to provide
> discovery information. Your proposal goes further than that, though,
> and makes the server aware of the chosen client flow. That appears to
> be an architectural violation: why does an OAuth resource server need
> to know the client flow at all?

Ok. It may have been left there from intermediate iterations. We did
consider making the extension drive the flow for a specific grant_type, but
decided against that idea, for the same reason you point to.
Is it correct that your main concern about the use of grant_type was that
it's propagated to the server? Then yes, we will remove sending it to
the server.

> Ideally, yes, but that only works if all identity providers implement
> the same flows in compatible ways. We're already seeing instances
> where that's not the case and we'll necessarily have to deal with that
> up front.

Yes, based on our analysis the OIDC spec is detailed enough that
providers implementing it can be supported with generic code in
libpq / the client.
GitHub specifically won't fit there, though. Microsoft Azure AD,
Google, and Okta (including Auth0) will.
Theoretically, provider-specific discovery documents could be returned
from the extension (server-side), though we didn't plan to
prioritize that.

> That seems to be restating the goal of OAuth and OIDC. Can you explain
> how the incompatible change allows you to accomplish this better than
> standard implementations?

Do you refer to passing grant_type to the server? Which we will get
rid of in the next iteration. Or other incompatible changes as well?

> Why? I claim that standard OAUTHBEARER can handle all of that. What
> does your proposed architecture (the third diagram) enable that my
> proposed hook (the second diagram) doesn't?

The hook proposed in the 2nd diagram effectively delegates all OAuth
flow implementations to the client.
We propose that libpq take care of pulling OpenID discovery and coordination,
which is effectively Diagram 1 + more flows + a server hook providing the
root URL/audience.

Created the diagrams with all components for 3 flows:
1. Authorization code grant (Clients with Browser access):
  +--+ +--+
  | +---+  |
   |
  | PQconnect   |   |  |
   |
  | [auth_code] |   |  |
+---+
  |  -> |   | -- Empty Token > | >
|   |
  | | libpq | <- Error(w\ Root URL + Audience ) -- | <
| Pre-Auth  |
  | |   |  |
|  Hook |
  | |   |  |
+---+
  | |   |+--+  |  |
  | |   | ---[GET]-> | OIDC |  | Postgres |
  |  +--+   | <--Provider Metadata-- | Discovery|  |  |
  | +- < | Hook | < |+--+  |
   |
  | |+--+   |  |
   |
  | v   |   |  |
   |
  |  [get auth  |   |  |
   |
  |code]|   |  |
   |
  ||   |  |
   |
  | |   |   |  |
   |
  | +   |   |  |
   |
  | PQconnect > | ++ +--+  |
   |
  | | | iddawc | <-- [ Auth code ]-> | Issuer/  |  |  |
  | | || <-- Access Token -- | Authz Server |  |  |
  | | ++ +--+  |  |
  | |   |  |
+---+
  | |   | -- Access Token ---> | >
| Validator |
  | |   | < Authorization Success/Failure  | <
|   Hook|
  |  +--+   |  |
+---+
  |  +-< | Hook |   |  |
   |
  |  v   +--+   |  |
   |
  |[store   +---+  |
   |
  |  refresh_token]+--+
  +--+

2. Device code grant
  +--+ +--+
  | +---+  |
   |
  | PQconnect   |   |  |
   |
  | [auth_code] |   |   

Re: [PATCH] pg_dump: lock tables in batches

2022-12-07 Thread Jacob Champion
On Wed, Dec 7, 2022 at 2:53 PM Tom Lane  wrote:
> Is "-s" mode actually a relevant criterion here?  With per-table COPY
> commands added into the mix you could not possibly get better than 2x
> improvement, and likely a good deal less.

Don't we hit this code path in pg_upgrade? You won't see huge
round-trip times, of course, but you also won't see COPY.

Absolute performance aside, isn't there another concern that, the
longer it takes for us to lock the tables, the bigger the window there
is for someone to interfere with them between our catalog query and
the lock itself?

--Jacob




Re: Error-safe user functions

2022-12-07 Thread Tom Lane
Andrew Dunstan  writes:
> On 2022-12-07 We 17:32, Tom Lane wrote:
>> Does it make sense to break 0003 into 4 separate commits, or is
>> that overkill?)

> No strong opinion about 0001 and 0002. I'm happy enough with them as
> they are, but if you want to squash them that's ok. I wouldn't break up
> 0003. I think we're going to end up committing the remaining work in
> batches, although they would probably be a bit more thematically linked
> than these.

Yeah, we certainly aren't likely to do this work as
one-commit-per-datatype going forward.  I'm just wondering
how to do these initial commits so that they provide
good reference material.

regards, tom lane




Re: [PATCH] pg_dump: lock tables in batches

2022-12-07 Thread Tom Lane
Andres Freund  writes:
> With an artificial delay of 100ms, the perf difference between the batching
> patch and not using the batching patch is huge. Huge enough that I don't have
> the patience to wait for the non-batched case to complete.

Clearly, if you insert a sufficiently large artificial round-trip delay,
even squeezing a single command out of a pg_dump run will appear
worthwhile.  What I'm unsure about is whether it's worthwhile at
realistic round-trip delays (where "realistic" means that the dump
performance would otherwise be acceptable).  I think the reason I didn't
pursue this last year is that experimentation convinced me the answer
was "no".

> With batching pg_dump -s -h localhost t10000 took 0:16.23 elapsed, without I
> cancelled after 603 tables had been locked, which took 2:06.43.

Is "-s" mode actually a relevant criterion here?  With per-table COPY
commands added into the mix you could not possibly get better than 2x
improvement, and likely a good deal less.

regards, tom lane




Re: Error-safe user functions

2022-12-07 Thread Andrew Dunstan


On 2022-12-07 We 17:32, Tom Lane wrote:
> OK, here's a v4 that I think is possibly committable.
>
> I've changed all the comments and docs to use the "soft error"
> terminology, but since using "soft" in the actual function names
> didn't seem that appealing, they still use "safe".
>
> I already pushed the  elog-refactoring patch, since that seemed
> uncontroversial.  0001 attached covers the same territory as before,
> but I regrouped the rest so that 0002 installs the new test support
> functions, then 0003 adds both the per-datatype changes and
> corresponding test cases for bool, int4, arrays, and records.
> The idea here is that 0003 can be pointed to as a sample of what
> has to be done to datatype input functions, while the preceding
> patches can be cited as relevant documentation.  (I've not decided
> whether to squash 0001 and 0002 together or commit them separately.
> Does it make sense to break 0003 into 4 separate commits, or is
> that overkill?)
>

No strong opinion about 0001 and 0002. I'm happy enough with them as
they are, but if you want to squash them that's ok. I wouldn't break up
0003. I think we're going to end up committing the remaining work in
batches, although they would probably be a bit more thematically linked
than these.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: [PATCH] pg_dump: lock tables in batches

2022-12-07 Thread Andres Freund
Hi,

On 2022-12-07 13:32:42 -0800, Andres Freund wrote:
> On 2022-12-07 18:14:01 -0300, Fabrízio de Royes Mello wrote:
> > Here we have some numbers about the Aleksander's patch:
> 
> Hm. Were they taken in an assertion enabled build or such? Just testing the
> t10000 case on HEAD, I get 0:01.23 elapsed for an unpatched pg_dump in an
> optimized build. And that's on a machine with not all that fast cores.

Comparing the overhead on the server side.

CREATE OR REPLACE FUNCTION exec(v_sql text) RETURNS text LANGUAGE plpgsql AS 
$$BEGIN EXECUTE v_sql;RETURN v_sql;END;$$;

Acquire locks in separate statements, three times:

SELECT exec(string_agg(format('LOCK t%s;', i), '')) FROM generate_series(1, 10000) AS i;
1267.909 ms
1116.008 ms
1108.383 ms

Acquire all locks in a single statement, three times:
SELECT exec('LOCK '||string_agg(format('t%s', i), ', ')) FROM generate_series(1, 10000) AS i;
1210.732 ms
1101.390 ms
1105.331 ms

So there's some performance difference that's independent of the avoided
roundtrips - but it's pretty small.


With an artificial delay of 100ms, the perf difference between the batching
patch and not using the batching patch is huge. Huge enough that I don't have
the patience to wait for the non-batched case to complete.

With batching pg_dump -s -h localhost t10000 took 0:16.23 elapsed, without I
cancelled after 603 tables had been locked, which took 2:06.43.


This made me try out using pipeline mode. Seems to work nicely from what I can
tell. The timings are a tad slower than the "manual" batch mode - I think
that's largely due to the extended protocol just being overcomplicated.

Greetings,

Andres Freund




Re: add \dpS to psql

2022-12-07 Thread Nathan Bossart
On Wed, Dec 07, 2022 at 10:48:49AM +0300, Pavel Luzanov wrote:
> There is a very similar command to VACUUM FULL with a different name -
> CLUSTER.
> The VACUUM privilege does not apply to the CLUSTER command. This is probably
> correct.
> However, the documentation for the CLUSTER command does not say
> who can perform this command. I think it would be correct to add a sentence
> to the Notes section
> (https://www.postgresql.org/docs/devel/sql-cluster.html)
> similar to the one in the VACUUM documentation:
> 
> "To cluster a table, one must ordinarily be the table's owner or a
> superuser."

I created a new thread for this:

https://postgr.es/m/20221207223924.GA4182184%40nathanxps13

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




fix and document CLUSTER privileges

2022-12-07 Thread Nathan Bossart
Hi hackers,

While looking into other opportunities for per-table permissions, I noticed
a weird discrepancy in CLUSTER.  When evaluating whether the current user
has permission to CLUSTER a table, we ordinarily just check for ownership.
However, the database owner is also allowed to CLUSTER all partitions that
are not shared.  This was added in 3f19e17, and I didn't see any discussion
about it in the corresponding thread [0].

My first instinct is that we should just remove the database ownership
check, which is what I've done in the attached patch.  I don't see any
strong reason to complicate matters with special
database-owner-but-not-shared checks like other commands (e.g., VACUUM).
But perhaps we should do so just for consistency's sake.  Thoughts?

It was also noted elsewhere [1] that the privilege requirements for CLUSTER
are not documented.  The attached patch adds such documentation.

[0] https://postgr.es/m/20220411140609.GF26620%40telsasoft.com
[1] https://postgr.es/m/661148f4-c7f1-dec1-2bc8-29f3bd58e242%40postgrespro.ru

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
diff --git a/doc/src/sgml/ref/cluster.sgml b/doc/src/sgml/ref/cluster.sgml
index c37f4236f1..feb61e6ec2 100644
--- a/doc/src/sgml/ref/cluster.sgml
+++ b/doc/src/sgml/ref/cluster.sgml
@@ -67,7 +67,8 @@ CLUSTER [VERBOSE]
   
 
   
-   CLUSTER without any parameter reclusters all the
+   CLUSTER without a
+   table_name reclusters all the
previously-clustered tables in the current database that the calling user
owns, or all such tables if called by a superuser.  This
form of CLUSTER cannot be executed inside a transaction
@@ -132,6 +133,13 @@ CLUSTER [VERBOSE]
  
   Notes
 
+   
+To cluster a table, one must be the table's owner or a superuser.
+Database-wide clusters and clusters on partitioned tables will silently
+skip over any tables that the calling user does not have permission to
+cluster.
+   
+

 In cases where you are accessing single rows randomly
 within a table, the actual order of the data in the
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 07e091bb87..a8a51cc674 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -1693,9 +1693,7 @@ get_tables_to_cluster_partitioned(MemoryContext cluster_context, Oid indexOid)
 			continue;
 
 		/* Silently skip partitions which the user has no access to. */
-		if (!object_ownercheck(RelationRelationId, relid, GetUserId()) &&
-			(!object_ownercheck(DatabaseRelationId, MyDatabaseId, GetUserId()) ||
-			 IsSharedRelation(relid)))
+		if (!object_ownercheck(RelationRelationId, relid, GetUserId()))
 			continue;
 
 		/* Use a permanent memory context for the result list */


Re: Error-safe user functions

2022-12-07 Thread Tom Lane
OK, here's a v4 that I think is possibly committable.

I've changed all the comments and docs to use the "soft error"
terminology, but since using "soft" in the actual function names
didn't seem that appealing, they still use "safe".

I already pushed the  elog-refactoring patch, since that seemed
uncontroversial.  0001 attached covers the same territory as before,
but I regrouped the rest so that 0002 installs the new test support
functions, then 0003 adds both the per-datatype changes and
corresponding test cases for bool, int4, arrays, and records.
The idea here is that 0003 can be pointed to as a sample of what
has to be done to datatype input functions, while the preceding
patches can be cited as relevant documentation.  (I've not decided
whether to squash 0001 and 0002 together or commit them separately.
Does it make sense to break 0003 into 4 separate commits, or is
that overkill?)

Thoughts?

regards, tom lane

diff --git a/doc/src/sgml/ref/create_type.sgml b/doc/src/sgml/ref/create_type.sgml
index 693423e524..994dfc6526 100644
--- a/doc/src/sgml/ref/create_type.sgml
+++ b/doc/src/sgml/ref/create_type.sgml
@@ -900,6 +900,17 @@ CREATE TYPE name
function is written in C.
   
 
+  
+   In PostgreSQL version 16 and later,
+   it is desirable for base types' input functions to
+   return soft errors using the
+   new errsave()/ereturn()
+   mechanism, rather than throwing ereport()
+   exceptions as in previous versions.
+   See src/backend/utils/fmgr/README for more
+   information.
+  
+
  
 
  
diff --git a/src/backend/nodes/Makefile b/src/backend/nodes/Makefile
index 4368c30fdb..7c594be583 100644
--- a/src/backend/nodes/Makefile
+++ b/src/backend/nodes/Makefile
@@ -56,6 +56,7 @@ node_headers = \
 	nodes/bitmapset.h \
 	nodes/extensible.h \
 	nodes/lockoptions.h \
+	nodes/miscnodes.h \
 	nodes/replnodes.h \
 	nodes/supportnodes.h \
 	nodes/value.h \
diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index 7212bc486f..08992dfd47 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -68,6 +68,7 @@ my @all_input_files = qw(
   nodes/bitmapset.h
   nodes/extensible.h
   nodes/lockoptions.h
+  nodes/miscnodes.h
   nodes/replnodes.h
   nodes/supportnodes.h
   nodes/value.h
@@ -89,6 +90,7 @@ my @nodetag_only_files = qw(
   executor/tuptable.h
   foreign/fdwapi.h
   nodes/lockoptions.h
+  nodes/miscnodes.h
   nodes/replnodes.h
   nodes/supportnodes.h
 );
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index f5cd1b7493..a36aeb832e 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -71,6 +71,7 @@
 #include "libpq/libpq.h"
 #include "libpq/pqformat.h"
 #include "mb/pg_wchar.h"
+#include "nodes/miscnodes.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "postmaster/bgworker.h"
@@ -611,6 +612,128 @@ errfinish(const char *filename, int lineno, const char *funcname)
 	CHECK_FOR_INTERRUPTS();
 }
 
+
+/*
+ * errsave_start --- begin a "soft" error-reporting cycle
+ *
+ * If "context" isn't an ErrorSaveContext node, this behaves as
+ * errstart(ERROR, domain), and the errsave() macro ends up acting
+ * exactly like ereport(ERROR, ...).
+ *
+ * If "context" is an ErrorSaveContext node, but the node creator only wants
+ * notification of the fact of a soft error without any details, just set
+ * the error_occurred flag in the ErrorSaveContext node and return false,
+ * which will cause us to skip the remaining error processing steps.
+ *
+ * Otherwise, create and initialize error stack entry and return true.
+ * Subsequently, errmsg() and perhaps other routines will be called to further
+ * populate the stack entry.  Finally, errsave_finish() will be called to
+ * tidy up.
+ */
+bool
+errsave_start(NodePtr context, const char *domain)
+{
+	ErrorSaveContext *escontext;
+	ErrorData  *edata;
+
+	/*
+	 * Do we have a context for soft error reporting?  If not, just punt to
+	 * errstart().
+	 */
+	if (context == NULL || !IsA(context, ErrorSaveContext))
+		return errstart(ERROR, domain);
+
+	/* Report that a soft error was detected */
+	escontext = (ErrorSaveContext *) context;
+	escontext->error_occurred = true;
+
+	/* Nothing else to do if caller wants no further details */
+	if (!escontext->details_wanted)
+		return false;
+
+	/*
+	 * Okay, crank up a stack entry to store the info in.
+	 */
+
+	recursion_depth++;
+
+	/* Initialize data for this error frame */
+	edata = get_error_stack_entry();
+	edata->elevel = LOG;		/* signal all is well to errsave_finish */
+	set_stack_entry_domain(edata, domain);
+	/* Select default errcode based on the assumed elevel of ERROR */
+	edata->sqlerrcode = ERRCODE_INTERNAL_ERROR;
+
+	/*
+	 * Any allocations for this error state level should go into the caller's
+	 * context.  We don't need to pollute ErrorContext, or even require it to
+	 * exist, in this code path.
+	 */
+	edata->assoc_context = 

Re: Temporary tables versus wraparound... again

2022-12-07 Thread Greg Stark
On Thu, 1 Dec 2022 at 14:18, Andres Freund  wrote:
>
> I find it problematic that ResetVacStats() bypasses tableam. Normal vacuums
> etc go through tableam but you put a ResetVacStats() besides each call to
> table_relation_nontransactional_truncate().  Seems like this should just be in
> heapam_relation_nontransactional_truncate()?

So this seems a bit weird. The only other part of the tableam that
touches freezexid and minmxid is table_relation_set_new_filelocator()
which returns them via references so that the callers
(heap.c:heap_create() and relcache.c:RelationSetNewRelfilenumber())
can set them themselves.

I can't really duplicate that layering without changing the API of
heapam_relation_nontransactional_truncate(). Which if it changes would
be quite annoying I think for external pluggable tableams.

But you're right that where I've put it will update relfrozenxid and
minmxid even for relations that have a tableam handler that returns
InvalidXid and doesn't need it. That does seem inappropriate.

I could put it directly in the heapam_handler but is that a layering
issue to be doing a catalog update from within the tableam_handler?
There are no other catalog updates in there.

On the other hand the existing callers of
*_nontransactional_truncate() don't have any existing catalog updates
they want to make so perhaps returning the xid values by reference was
just for convenience to avoid doing an extra update and isn't needed
here.

-- 
greg




Re: [PATCH] pg_dump: lock tables in batches

2022-12-07 Thread Andres Freund
Hi,

On 2022-12-07 18:14:01 -0300, Fabrízio de Royes Mello wrote:
> Here we have some numbers about the Aleksander's patch:

Hm. Were they taken in an assertion enabled build or such? Just testing the
t1 case on HEAD, I get 0:01.23 elapsed for an unpatched pg_dump in an
optimized build. And that's on a machine with not all that fast cores.

Greetings,

Andres Freund




Re: Transaction timeout

2022-12-07 Thread Andrey Borodin
On Wed, Dec 7, 2022 at 10:23 AM Andres Freund  wrote:
>
> On 2022-12-03 09:41:04 -0800, Andrey Borodin wrote:
> > Fixed. Added test for this.
>
> The tests don't pass: https://cirrus-ci.com/build/4811553145356288
>
oops, sorry. Here's the fix. I hope to address other feedback on the
weekend. Thank you!

Best regards, Andrey Borodin.


v3-0001-Intorduce-transaction_timeout.patch
Description: Binary data


Re: [PATCH] pg_dump: lock tables in batches

2022-12-07 Thread Fabrízio de Royes Mello
On Wed, Dec 7, 2022 at 2:28 PM Tom Lane  wrote:
>
> Andres Freund  writes:
> > On 2022-12-07 10:44:33 -0500, Tom Lane wrote:
> >> I have a strong sense of deja vu here.  I'm pretty sure I experimented
> >> with this idea last year and gave up on it.  I don't recall exactly
> >> why, but either it didn't show any meaningful performance improvement
> >> for me or there was some actual downside (that I'm not remembering
> >> right now).
>
> > IIRC the case we were looking at around 989596152 were CPU bound
workloads,
> > rather than latency bound workloads. It'd not be surprising to have
cases
> > where batching LOCKs helps latency, but not CPU bound.
>
> Yeah, perhaps.  Anyway my main point is that I don't want to just assume
> this is a win; I want to see some actual performance tests.
>

Here we have some numbers about the Aleksander's patch:

1) Setup script

CREATE DATABASE t1000;
CREATE DATABASE t10000;
CREATE DATABASE t100000;

\c t1000
SELECT format('CREATE TABLE t%s(c1 INTEGER PRIMARY KEY, c2 TEXT, c3
TIMESTAMPTZ);', i) FROM generate_series(1, 1000) AS i \gexec

\c t10000
SELECT format('CREATE TABLE t%s(c1 INTEGER PRIMARY KEY, c2 TEXT, c3
TIMESTAMPTZ);', i) FROM generate_series(1, 10000) AS i \gexec

\c t100000
SELECT format('CREATE TABLE t%s(c1 INTEGER PRIMARY KEY, c2 TEXT, c3
TIMESTAMPTZ);', i) FROM generate_series(1, 100000) AS i \gexec

2) Execution script

time pg_dump -s t1000 > /dev/null
time pg_dump -s t10000 > /dev/null
time pg_dump -s t100000 > /dev/null

3) HEAD execution

$ time pg_dump -s t1000 > /dev/null
0.02user 0.01system 0:00.36elapsed 8%CPU (0avgtext+0avgdata
11680maxresident)k
0inputs+0outputs (0major+1883minor)pagefaults 0swaps

$ time pg_dump -s t10000 > /dev/null
0.30user 0.10system 0:05.04elapsed 8%CPU (0avgtext+0avgdata
57772maxresident)k
0inputs+0outputs (0major+14042minor)pagefaults 0swaps

$ time pg_dump -s t100000 > /dev/null
3.42user 2.13system 7:50.09elapsed 1%CPU (0avgtext+0avgdata
517900maxresident)k
0inputs+0outputs (0major+134636minor)pagefaults 0swaps

4) PATCH execution

$ time pg_dump -s t1000 > /dev/null
0.02user 0.00system 0:00.28elapsed 9%CPU (0avgtext+0avgdata
11700maxresident)k
0inputs+0outputs (0major+1886minor)pagefaults 0swaps

$ time pg_dump -s t10000 > /dev/null
0.18user 0.03system 0:02.17elapsed 10%CPU (0avgtext+0avgdata
57592maxresident)k
0inputs+0outputs (0major+14072minor)pagefaults 0swaps

$ time pg_dump -s t100000 > /dev/null
1.97user 0.32system 0:21.39elapsed 10%CPU (0avgtext+0avgdata
517932maxresident)k
0inputs+0outputs (0major+134892minor)pagefaults 0swaps

5) Summary

                HEAD     patch
  1k tables  0:00.36   0:00.28
 10k tables  0:05.04   0:02.17
100k tables  7:50.09   0:21.39

Seems we get a very good performance gain using Aleksander's patch. I used
"-s" to avoid wasting time issuing COPY for each relation (they are all
empty anyway) and to make the difference due to the round trip for LOCK
TABLE evident. This patch will also improve pg_upgrade execution over
databases with thousands of relations.

Regards,

--
Fabrízio de Royes Mello


Re: Error-safe user functions

2022-12-07 Thread Tom Lane
"David G. Johnston"  writes:
> On Wed, Dec 7, 2022 at 8:04 AM Andrew Dunstan  wrote:
>> I'm not sure InputFunctionCallSoft would be an improvement. Maybe
>> InputFunctionCallSoftError would be clearer, but I don't know that it's
>> much of an improvement either. The same goes for the other visible changes.

> InputFunctionCallSafe -> TryInputFunctionCall

I think we are already using "TryXXX" for code that involves catching
ereport errors.  Since the whole point here is that we are NOT doing
that, I think this naming would be more confusing than helpful.

> Unrelated observation: "Although the error stack is not large, we don't
> expect to run out of space." -> "Because the error stack is not large,
> assume that we will not run out of space and panic if we are wrong."?

That doesn't seem to make the point I wanted to make.

I've adopted your other suggestions in the v4 I'm preparing now.

regards, tom lane




[PATCH] psql: Add tab-complete for optional view parameters

2022-12-07 Thread Christoph Heiss

Hi all!

This adds tab-complete for optional parameters (as can be specified 
using WITH) for views, similarly to how it works for storage parameters 
of tables.


Thanks,
Christoph Heiss

From bd12c08a80b5973fdcac59def213216d06f150b0 Mon Sep 17 00:00:00 2001
From: Christoph Heiss 
Date: Wed, 7 Dec 2022 19:15:48 +0100
Subject: [PATCH] psql: Add tab-complete for optional view parameters

This adds them in the same manner as it works for storage parameters of
tables.

Signed-off-by: Christoph Heiss 
---
 src/bin/psql/tab-complete.c | 31 +++
 1 file changed, 27 insertions(+), 4 deletions(-)

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 89e7317c23..999a0e5aa1 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1310,6 +1310,13 @@ static const char *const table_storage_parameters[] = {
 	NULL
 };
 
+/* Optional parameters for CREATE VIEW and ALTER VIEW */
+static const char *const view_optional_parameters[] = {
+	"check_option",
+	"security_barrier",
+	"security_invoker",
+	NULL
+};
 
 /* Forward declaration of functions */
 static char **psql_completion(const char *text, int start, int end);
@@ -2139,7 +2146,7 @@ psql_completion(const char *text, int start, int end)
 	/* ALTER VIEW  */
 	else if (Matches("ALTER", "VIEW", MatchAny))
 		COMPLETE_WITH("ALTER COLUMN", "OWNER TO", "RENAME",
-	  "SET SCHEMA");
+	  "SET SCHEMA", "SET (", "RESET (");
 	/* ALTER VIEW xxx RENAME */
 	else if (Matches("ALTER", "VIEW", MatchAny, "RENAME"))
 		COMPLETE_WITH_ATTR_PLUS(prev2_wd, "COLUMN", "TO");
@@ -2155,6 +2162,12 @@ psql_completion(const char *text, int start, int end)
 	/* ALTER VIEW xxx RENAME COLUMN yyy */
 	else if (Matches("ALTER", "VIEW", MatchAny, "RENAME", "COLUMN", MatchAnyExcept("TO")))
 		COMPLETE_WITH("TO");
+	/* ALTER VIEW xxx SET ( yyy [= zzz] ) */
+	else if (Matches("ALTER", "VIEW", MatchAny, "SET", "("))
+		COMPLETE_WITH_LIST(view_optional_parameters);
+	/* ALTER VIEW xxx RESET ( yyy , ... ) */
+	else if (Matches("ALTER", "VIEW", MatchAny, "RESET", "("))
+		COMPLETE_WITH_LIST(view_optional_parameters);
 
 	/* ALTER MATERIALIZED VIEW  */
 	else if (Matches("ALTER", "MATERIALIZED", "VIEW", MatchAny))
@@ -3426,13 +3439,23 @@ psql_completion(const char *text, int start, int end)
 	}
 
 /* CREATE VIEW --- is allowed inside CREATE SCHEMA, so use TailMatches */
-	/* Complete CREATE [ OR REPLACE ] VIEW  with AS */
+	/* Complete CREATE [ OR REPLACE ] VIEW  with AS or WITH ( */
 	else if (TailMatches("CREATE", "VIEW", MatchAny) ||
 			 TailMatches("CREATE", "OR", "REPLACE", "VIEW", MatchAny))
+		COMPLETE_WITH("AS", "WITH (");
+	/* Complete CREATE [ OR REPLACE ] VIEW  WITH ( with supported options */
+	else if (TailMatches("CREATE", "VIEW", MatchAny, "WITH", "(") ||
+			 TailMatches("CREATE", "OR", "REPLACE", "VIEW", MatchAny, "WITH", "("))
+		COMPLETE_WITH_LIST(view_optional_parameters);
+	/* Complete CREATE [ OR REPLACE ] VIEW  WITH ( ... ) with "AS" */
+	else if (TailMatches("CREATE", "VIEW", MatchAny, "WITH", "(*)") ||
+			 TailMatches("CREATE", "OR", "REPLACE", "VIEW", MatchAny, "WITH", "(*)"))
 		COMPLETE_WITH("AS");
-	/* Complete "CREATE [ OR REPLACE ] VIEW  AS with "SELECT" */
+	/* Complete "CREATE [ OR REPLACE ] VIEW  [ WITH ( ... ) ] AS with "SELECT" */
 	else if (TailMatches("CREATE", "VIEW", MatchAny, "AS") ||
-			 TailMatches("CREATE", "OR", "REPLACE", "VIEW", MatchAny, "AS"))
+			 TailMatches("CREATE", "VIEW", MatchAny, "WITH", "(*)", "AS") ||
+			 TailMatches("CREATE", "OR", "REPLACE", "VIEW", MatchAny, "AS") ||
+			 TailMatches("CREATE", "OR", "REPLACE", "VIEW", MatchAny, "WITH", "(*)", "AS"))
 		COMPLETE_WITH("SELECT");
 
 /* CREATE MATERIALIZED VIEW */
-- 
2.38.1



Creating HeapTuple from char and date values

2022-12-07 Thread Amin
Hi All,

I am trying to create HeapTuple data structure.
First, I create a tuple descriptor:
 TupleDesc *td=CreateTemplateTupleDesc(colCount);
Then, for each variable, I do:
  TupleDescInitEntry(*td,v->varattno,NULL,v->vartype,v->vartypmod,0);
Then, I assign values:
  if int32: values[v->varattno-1]=Int8GetDatum(myValue);
Similarly for float.
Finally, I create the HeapTuple:
  HeapTuple tuple=heap_form_tuple(td,values,isnull);

Everything works fine with int and float. But I don't know how to handle
chars.
Let's say we have a character(10) column. One problem is v->vartypmod will
be set to 14. Shouldn't it be 10?
Second, how should I assign values? Is
values[v->varattno-1]=CStringGetDatum(myValue); correct? Should I set the
last parameter to TupleDescInitEntry? Why am I getting "invalid memory
alloc request size" or segfault with different configurations?
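
For illustration only (none of this is from the original mail; the column
names and values are invented): a character(10) column is expected to carry
an atttypmod of 14, i.e. the declared length 10 plus VARHDRSZ, and its Datum
must be a varlena built through the type's input function rather than a bare
C string from CStringGetDatum(), which is only right for cstring attributes.
A minimal sketch:

#include "postgres.h"
#include "access/htup_details.h"
#include "access/tupdesc.h"
#include "catalog/pg_type.h"
#include "utils/fmgrprotos.h"

static HeapTuple
make_sample_tuple(void)
{
	TupleDesc	td = CreateTemplateTupleDesc(2);
	Datum		values[2];
	bool		isnull[2] = {false, false};

	/* attribute numbers are 1-based; -1 typmod means "no modifier" */
	TupleDescInitEntry(td, (AttrNumber) 1, "id", INT4OID, -1, 0);
	/* character(10): typmod is declared length + VARHDRSZ = 14 */
	TupleDescInitEntry(td, (AttrNumber) 2, "code", BPCHAROID, 10 + VARHDRSZ, 0);

	values[0] = Int32GetDatum(42);
	/* build a properly padded bpchar datum via the type's input function */
	values[1] = DirectFunctionCall3(bpcharin,
									CStringGetDatum("ABC"),
									ObjectIdGetDatum(InvalidOid),
									Int32GetDatum(10 + VARHDRSZ));

	return heap_form_tuple(td, values, isnull);
}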


Re: [PoC] Federated Authn/z with OAUTHBEARER

2022-12-07 Thread Jacob Champion
On Mon, Dec 5, 2022 at 4:15 PM Andrey Chudnovsky  wrote:
> I think we can focus on the roles and responsibilities of the components 
> first.
> Details of the patch can be elaborated. Like "flow type code" is a
> mistake on our side, and we will use the term "grant_type" which is
> defined by OIDC spec. As well as details of usage of refresh_token.

(For the record, whether we call it "flow type" or "grant type"
doesn't address my concern.)

Basically yes. We propose an increase of the server-side hook's responsibility:
from just validating the token, to also returning the provider root URL
and required audience, and possibly providing more metadata in the
future.

I think it's okay to have the extension and HBA collaborate to provide
discovery information. Your proposal goes further than that, though,
and makes the server aware of the chosen client flow. That appears to
be an architectural violation: why does an OAuth resource server need
to know the client flow at all?

> Which is in our opinion aligned with SASL protocol, where the server
> side is responsible for telling the client auth requirements based on
> the requested role in the startup packet.

You've proposed an alternative SASL mechanism. There's nothing wrong
with that, per se, but I think it should be clear why we've chosen
something nonstandard.

> Our understanding is that in the original patch that information came
> purely from hba, and we propose extension being able to control that
> metadata.
> As we see extension as being owned by the identity provider, compared
> to HBA which is owned by the server administrator or cloud provider.

That seems reasonable, considering how tightly coupled the Issuer and
the token validation process are.

> 2. Server Owners / PAAS providers (On premise admins, Cloud providers,
> multi-cloud PAAS providers).
>- Install extensions and configure HBA to allow clients to
> authenticate with the identity providers of their choice.

(For a future conversation: they need to set up authorization, too,
with custom scopes or some other magic. It's not enough to check who
the token belongs to; even if Postgres is just using the verified
email from OpenID as an authenticator, you have to also know that the
user authorized the token -- and therefore the client -- to access
Postgres on their behalf.)

> 3. Client Application Developers (Data Wis, integration tools,
> PgAdmin, monitoring tools, e.t.c.)
>- Independent from specific Identity providers or server providers.
> Write one code for all identity providers.

Ideally, yes, but that only works if all identity providers implement
the same flows in compatible ways. We're already seeing instances
where that's not the case and we'll necessarily have to deal with that
up front.

>- Rely on application deployment owners to configure which OIDC
> provider to use across client and server setups.
> 4. Application Deployment Owners (End customers setting up applications)
>- The only actor actually aware of which identity provider to use.
> Configures the stack based on the Identity and PostgreSQL deployments
> they have.

(I have doubts that the roles will be as decoupled in practice as you
have described them, but I'd rather defer that for now.)

> The critical piece of the vision is (3.) above is applications
> agnostic of the identity providers. Those applications rely on
> properly configured servers and rich driver logic (libpq,
> com.postgresql, npgsql) to allow their application to popup auth
> windows or do service-to-service authentication with any provider. In
> our view that would significantly democratize the deployment of OAUTH
> authentication in the community.

That seems to be restating the goal of OAuth and OIDC. Can you explain
how the incompatible change allows you to accomplish this better than
standard implementations?

> In order to allow this separation, we propose:
> 1. HBA + Extension is the single source of truth of Provider root URL
> + Required Audience for each role. If some backfill for missing OIDC
> discovery is needed, the provider-specific extension would be
> providing it.
> 2. Client Application knows which grant_type to use in which scenario.
> But can be coded without knowledge of a specific provider. So can't
> provide discovery details.
> 3. Driver (libpq, others) - coordinate the authentication flow based
> on client grant_type and identity provider metadata to allow client
> applications to use any flow with any provider in a unified way.
>
> Yes, this would require a little more complicated flow between
> components than in your original patch.

Why? I claim that standard OAUTHBEARER can handle all of that. What
does your proposed architecture (the third diagram) enable that my
proposed hook (the second diagram) doesn't?

> And yes, more complexity comes
> with more opportunity to make bugs.
> However, I see PG Server and Libpq as the places which can have more
> complexity. For the purpose of making work for the 

Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)

2022-12-07 Thread Andres Freund
Hi,

On 2022-11-04 09:25:52 +0100, Drouvot, Bertrand wrote:
> Please find attached a rebase in v7.

cfbot complains that the docs don't build:
https://cirrus-ci.com/task/6694349031866368?logs=docs_build#L296

[03:24:27.317] ref/checkpoint.sgml:66: element para: validity error : Element 
para is not declared in para list of possible children

I've marked the patch as waiting-on-author for now.


There's been a bunch of architectural feedback too, but tbh, I don't know if
we came to any conclusion on that front...

Greetings,

Andres Freund




Re: Partial aggregates pushdown

2022-12-07 Thread Andres Freund
Hi,

On 2022-12-05 02:03:49 +, fujii.y...@df.mitsubishielectric.co.jp wrote:
> > Attaching minor fixes. I haven't proof-read all comments (but perhaps, they
> > need attention from some native speaker).
> Thank you. I fixed according to your patch.
> And I fixed have proof-read all comments and messages.

cfbot complains about some compiler warnings when building with clang:
https://cirrus-ci.com/task/6606268580757504

deparse.c:3459:22: error: equality comparison with extraneous parentheses 
[-Werror,-Wparentheses-equality]
if ((node->aggsplit == AGGSPLIT_SIMPLE)) {
 ~~~^~
deparse.c:3459:22: note: remove extraneous parentheses around the comparison to 
silence this warning
if ((node->aggsplit == AGGSPLIT_SIMPLE)) {
~   ^ ~
deparse.c:3459:22: note: use '=' to turn this equality comparison into an 
assignment
if ((node->aggsplit == AGGSPLIT_SIMPLE)) {
^~
=

Greetings,

Andres Freund




Re: daitch_mokotoff module

2022-12-07 Thread Andres Freund
Hi,

On 2022-02-03 15:27:32 +0100, Dag Lem wrote:
> Just some minor adjustments to the patch:
> 
> * Removed call to locale-dependent toupper()
> * Cleaned up input normalization

This patch currently fails in cfbot, likely because meson.build needs to be
adjusted (this didn't exist at the time you submitted this version of the
patch):

[23:43:34.796] contrib/fuzzystrmatch/meson.build:18:0: ERROR: File 
fuzzystrmatch--1.1.sql does not exist.


> -DATA = fuzzystrmatch--1.1.sql fuzzystrmatch--1.0--1.1.sql
> +DATA = fuzzystrmatch--1.2.sql fuzzystrmatch--1.1--1.2.sql 
> fuzzystrmatch--1.0--1.1.sql
>  PGFILEDESC = "fuzzystrmatch - similarities and distance between strings"


The patch seems to remove fuzzystrmatch--1.1.sql - I suggest not doing
that. In recent years our approach has been to just keep the "base version" of
the upgrade script, with extension creation running through the upgrade
scripts.

>  
> +
> +#include "daitch_mokotoff.h"
> +
> +#include "postgres.h"

Postgres policy is that the include of "postgres.h" has to be the first
include in every .c file.


> +#include "utils/builtins.h"
> +#include "mb/pg_wchar.h"
> +
> +#include 
> +
> +/* Internal C implementation */
> +static char *_daitch_mokotoff(char *word, char *soundex, size_t n);
> +
> +
> +PG_FUNCTION_INFO_V1(daitch_mokotoff);
> +Datum
> +daitch_mokotoff(PG_FUNCTION_ARGS)
> +{
> + text   *arg = PG_GETARG_TEXT_PP(0);
> + char   *string,
> +*tmp_soundex;
> + text   *soundex;
> +
> + /*
> +  * The maximum theoretical soundex size is several KB, however in 
> practice
> +  * anything but contrived synthetic inputs will yield a soundex size of
> +  * less than 100 bytes. We thus allocate and free a temporary work 
> buffer,
> +  * and return only the actual soundex result.
> +  */
> + string = pg_server_to_any(text_to_cstring(arg), VARSIZE_ANY_EXHDR(arg), 
> PG_UTF8);
> + tmp_soundex = palloc(DM_MAX_SOUNDEX_CHARS);

Seems that just using StringInfo to hold the soundex output would work better
than a static allocation?
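
For instance (an illustrative sketch only, not code from the patch; the
helper name is invented), the codes could be collected in a growable buffer
so no maximum size has to be guessed:

#include "postgres.h"
#include "lib/stringinfo.h"

/* join already-computed soundex codes with spaces; buffer grows as needed */
static char *
join_codes(char **codes, int ncodes)
{
	StringInfoData buf;

	initStringInfo(&buf);
	for (int i = 0; i < ncodes; i++)
	{
		if (i > 0)
			appendStringInfoChar(&buf, ' ');
		appendStringInfoString(&buf, codes[i]);
	}
	return buf.data;			/* palloc'd in the current memory context */
}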


> + if (!_daitch_mokotoff(string, tmp_soundex, DM_MAX_SOUNDEX_CHARS))

We imo shouldn't introduce new functions starting with _.


> +/* Mark soundex code tree node as leaf. */
> +static void
> +set_leaf(dm_leaves leaves_next, int *num_leaves_next, dm_node * node)
> +{
> + if (!node->is_leaf)
> + {
> + node->is_leaf = 1;
> + leaves_next[(*num_leaves_next)++] = node;
> + }
> +}
> +
> +
> +/* Find next node corresponding to code digit, or create a new node. */
> +static dm_node * find_or_create_node(dm_nodes nodes, int *num_nodes,
> +  
> dm_node * node, char code_digit)

PG code style is to have a line break between a function definition's return
type and the function name - like you actually do above.




> +/* Mapping from ISO8859-1 to upper-case ASCII */
> +static const char tr_iso8859_1_to_ascii_upper[] =
> +/*
> +"`abcdefghijklmnopqrstuvwxyz{|}~  
> ¡¢£¤¥¦§¨©ª«¬ 
> ®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ"
> +*/
> +"`ABCDEFGHIJKLMNOPQRSTUVWXYZ{|}~  !  
>
> ?AAECDNO*OYDSAAECDNO/OYDY";
> +
> +static char
> +iso8859_1_to_ascii_upper(unsigned char c)
> +{
> + return c >= 0x60 ? tr_iso8859_1_to_ascii_upper[c - 0x60] : c;
> +}
> +
> +
> +/* Convert an UTF-8 character to ISO-8859-1.
> + * Unconvertable characters are returned as '?'.
> + * NB! Beware of the domain specific conversion of Ą, Ę, and Ţ/Ț.
> + */
> +static char
> +utf8_to_iso8859_1(char *str, int *ix)

It seems decidedly not great to have custom encoding conversion routines in a
contrib module. Is there any way we can avoid this?
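
For example (illustrative only; note the behavioural difference that the
in-core conversion raises an error for characters with no LATIN1 equivalent,
whereas the patch folds them to '?'), the server's own conversion machinery
could be used instead of a private table:

#include "postgres.h"
#include "mb/pg_wchar.h"

/* convert a NUL-terminated UTF-8 string to ISO-8859-1 using core routines */
static char *
utf8_to_latin1(char *utf8_str)
{
	return (char *) pg_do_encoding_conversion((unsigned char *) utf8_str,
											  strlen(utf8_str),
											  PG_UTF8, PG_LATIN1);
}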


> +/* Generate all Daitch-Mokotoff soundex codes for word, separated by space. 
> */
> +static char *
> +_daitch_mokotoff(char *word, char *soundex, size_t n)
> +{
> + int i = 0,
> + j;
> + int letter_no = 0;
> + int ix_leaves = 0;
> + int num_nodes = 0,
> + num_leaves = 0;
> + dm_codes   *codes,
> +*next_codes;
> + dm_node*nodes;
> + dm_leaves  *leaves;
> +
> + /* First letter. */
> + if (!(codes = read_letter(word, )))
> + {
> + /* No encodable character in input. */
> + return NULL;
> + }
> +
> + /* Allocate memory for node tree. */
> + nodes = palloc(sizeof(dm_nodes));
> + leaves = palloc(2 * sizeof(dm_leaves));

So this allocates the worst case memory usage, is that right? That's quite a
bit of memory. Shouldn't nodes be allocated dynamically?

Instead of carefully freeing individual memory allocations, I think it would be 
better to create a 

Re: Transaction timeout

2022-12-07 Thread Andres Freund
Hi,

On 2022-12-03 09:41:04 -0800, Andrey Borodin wrote:
> Fixed. Added test for this.

The tests don't pass: https://cirrus-ci.com/build/4811553145356288

[00:54:35.337](1.251s) not ok 1 - no parameters missing from 
postgresql.conf.sample
[00:54:35.338](0.000s) #   Failed test 'no parameters missing from 
postgresql.conf.sample'
#   at /tmp/cirrus-ci-build/src/test/modules/test_misc/t/003_check_guc.pl line 
81.
[00:54:35.338](0.000s) #  got: '1'
# expected: '0'


I am just looking through a bunch of failing CF entries, so I'm perhaps
over-sensitized right now. But I'm a bit confused why there's so many
occasions of the tests clearly not having been run...


Michael, any reason 003_check_guc doesn't show the missing GUCs? It's not
particularly helpful to see "0 is different from 1". Seems that even just
something like
  is_deeply(\@missing_from_file, [], "no parameters missing from 
postgresql.conf.sample");
would be a decent improvement?

Greetings,

Andres Freund




Re: wake up logical workers after ALTER SUBSCRIPTION

2022-12-07 Thread Nathan Bossart
On Wed, Dec 07, 2022 at 02:07:11PM +0300, Melih Mutlu wrote:
> Do we also need to wake up all sync workers too? Even if not, I'm not
> actually sure whether doing that would harm anything though.
> Just asking since currently the patch wakes up all workers including sync
> workers if any still exists.

After sleeping on this, I think we can do better.  IIUC we can simply check
for AllTablesyncsReady() at the end of process_syncing_tables_for_apply()
and wake up the logical replication workers (which should just consist of
setting the current process's latch) if we are ready for two_phase mode.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 2457eca47c92b29409d5972fc5a3e1c8c038fa6d Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Mon, 21 Nov 2022 16:01:01 -0800
Subject: [PATCH v8 1/1] wake up logical workers as needed instead of relying
 on periodic wakeups

---
 src/backend/access/transam/xact.c   |  3 ++
 src/backend/commands/alter.c|  7 
 src/backend/commands/subscriptioncmds.c |  4 ++
 src/backend/replication/logical/tablesync.c | 10 +
 src/backend/replication/logical/worker.c| 46 +
 src/include/replication/logicalworker.h |  3 ++
 6 files changed, 73 insertions(+)

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 8086b857b9..dc00e66cfb 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -47,6 +47,7 @@
 #include "pgstat.h"
 #include "replication/logical.h"
 #include "replication/logicallauncher.h"
+#include "replication/logicalworker.h"
 #include "replication/origin.h"
 #include "replication/snapbuild.h"
 #include "replication/syncrep.h"
@@ -2360,6 +2361,7 @@ CommitTransaction(void)
 	AtEOXact_PgStat(true, is_parallel_worker);
 	AtEOXact_Snapshot(true, false);
 	AtEOXact_ApplyLauncher(true);
+	AtEOXact_LogicalRepWorkers(true);
 	pgstat_report_xact_timestamp(0);
 
 	CurrentResourceOwner = NULL;
@@ -2860,6 +2862,7 @@ AbortTransaction(void)
 		AtEOXact_HashTables(false);
 		AtEOXact_PgStat(false, is_parallel_worker);
 		AtEOXact_ApplyLauncher(false);
+		AtEOXact_LogicalRepWorkers(false);
 		pgstat_report_xact_timestamp(0);
 	}
 
diff --git a/src/backend/commands/alter.c b/src/backend/commands/alter.c
index 10b6fe19a2..d095cd3ced 100644
--- a/src/backend/commands/alter.c
+++ b/src/backend/commands/alter.c
@@ -59,6 +59,7 @@
 #include "commands/user.h"
 #include "miscadmin.h"
 #include "parser/parse_func.h"
+#include "replication/logicalworker.h"
 #include "rewrite/rewriteDefine.h"
 #include "tcop/utility.h"
 #include "utils/builtins.h"
@@ -279,6 +280,12 @@ AlterObjectRename_internal(Relation rel, Oid objectId, const char *new_name)
 		if (strncmp(new_name, "regress_", 8) != 0)
 			elog(WARNING, "subscriptions created by regression test cases should have names starting with \"regress_\"");
 #endif
+
+		/*
+		 * Wake up the logical replication workers to handle this change
+		 * quickly.
+		 */
+		LogicalRepWorkersWakeupAtCommit(objectId);
 	}
 	else if (nameCacheId >= 0)
 	{
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index d673557ea4..d6993c26e5 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -34,6 +34,7 @@
 #include "nodes/makefuncs.h"
 #include "pgstat.h"
 #include "replication/logicallauncher.h"
+#include "replication/logicalworker.h"
 #include "replication/origin.h"
 #include "replication/slot.h"
 #include "replication/walreceiver.h"
@@ -1362,6 +1363,9 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
 
 	InvokeObjectPostAlterHook(SubscriptionRelationId, subid, 0);
 
+	/* Wake up the logical replication workers to handle this change quickly. */
+	LogicalRepWorkersWakeupAtCommit(subid);
+
 	return myself;
 }
 
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 94e813ac53..509fe2eb19 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -105,6 +105,7 @@
 #include "pgstat.h"
 #include "replication/logicallauncher.h"
 #include "replication/logicalrelation.h"
+#include "replication/logicalworker.h"
 #include "replication/walreceiver.h"
 #include "replication/worker_internal.h"
 #include "replication/slot.h"
@@ -619,6 +620,15 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
 
 	if (started_tx)
 	{
+		/*
+		 * If we are ready to enable two_phase mode, wake up the logical
+		 * replication workers to handle this change quickly.
+		 */
+		CommandCounterIncrement();
+		if (MySubscription->twophasestate == LOGICALREP_TWOPHASE_STATE_PENDING &&
+			AllTablesyncsReady())
+			LogicalRepWorkersWakeupAtCommit(MyLogicalRepWorker->subid);
+
 		CommitTransactionCommand();
 		pgstat_report_stat(true);
 	}
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 

Re: [BUG] Logical replica crash if there was an error in a function.

2022-12-07 Thread Andres Freund
Hi,

On 2022-11-16 17:52:50 +0300, Anton A. Melnikov wrote:
> Sorry, I didn't fully understand what is required and
> added some functions to the test that spend extra CPU time. But I found
> that it is possible to make a test according to previous remarks by adding
> only a few extra queries to an existing test without any additional syncing.
> 
> Experimentally, I could not observe any significant difference in
> CPU usage due to this test addition.
> The difference in the CPU usage was less than the standard error.
> See 100_bugs-CPU-time.txt attached.
> 
> > > Additionally
> > > I've tried to reduce the overall number of nodes previously
> > > used in this test in a similar way.
> > 
> > Optimizing existing tests isn't an answer to that.  We could
> > install those optimizations without adding a new test case.
> 
> Oh sure! Here is a separate patch for this optimization:
> https://www.postgresql.org/message-id/eb7aa992-c2d7-6ce7-4942-0c784231a362%40inbox.ru
> In contrast with the previous case, there the difference in the CPU usage
> during 100_bugs.pl is substantial; it decreases by approximately 30%.

This CF entry causes tests to fail on all platforms:
https://cirrus-ci.com/build/5755408111894528

E.g.
https://api.cirrus-ci.com/v1/artifact/task/5298457144459264/testrun/build/testrun/subscription/100_bugs/log/regress_log_100_bugs

 Begin standard error
psql::1: NOTICE:  dropped replication slot "sub1" on publisher
 End standard error
timed out waiting for match: ERROR:  relation "error_name" does not exist at 
character at /tmp/cirrus-ci-build/src/test/subscription/t/100_bugs.pl line 115.

Greetings,

Andres Freund




Re: Error-safe user functions

2022-12-07 Thread Tom Lane
I wrote:
> Andres Freund  writes:
>> Is there a guarantee that input functions are stable or immutable?

> There's a project policy that that should be true.  That justifies
> marking things like record_in as stable --- if the per-column input
> functions could be volatile, record_in would need to be as well.
> There are other dependencies on it; see e.g. aab353a60, 3db6524fe.

I dug in the archives and found the thread leading up to aab353a60:

https://www.postgresql.org/message-id/flat/AANLkTik8v7O9QR9jjHNVh62h-COC1B0FDUNmEYMdtKjR%40mail.gmail.com

regards, tom lane




Re: Minimal logical decoding on standbys

2022-12-07 Thread Andres Freund
Hi,

On 2022-12-07 10:00:25 +0100, Drouvot, Bertrand wrote:
> > Please find attached a new patch series:
> > 
> > v27-0001-Add-info-in-WAL-records-in-preparation-for-logic.patch
> > v27-0002-Handle-logical-slot-conflicts-on-standby.patch
> > v27-0003-Allow-logical-decoding-on-standby.patch
> > v27-0004-New-TAP-test-for-logical-decoding-on-standby.patch
> > v27-0005-Doc-changes-describing-details-about-logical-dec.patch
> > v27-0006-Fixing-Walsender-corner-case-with-logical-decodi.patch

This failed on cfbot [1]. The tap output [2] has the following bit:

[09:48:56.216](5.979s) not ok 26 - cannot read from logical replication slot
[09:48:56.223](0.007s) #   Failed test 'cannot read from logical replication 
slot'
#   at C:/cirrus/src/test/recovery/t/034_standby_logical_decoding.pl line 422.
...
Warning: unable to close filehandle GEN150 properly: Bad file descriptor during 
global destruction.
Warning: unable to close filehandle GEN155 properly: Bad file descriptor during 
global destruction.

The "unable to close filehandle" stuff in my experience indicates an IPC::Run
process that wasn't ended before the tap test ended.

Greetings,

Andres Freund

[1] https://cirrus-ci.com/task/5092676671373312
[2] 
https://api.cirrus-ci.com/v1/artifact/task/5092676671373312/testrun/build/testrun/recovery/034_standby_logical_decoding/log/regress_log_034_standby_logical_decoding




Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2022-12-07 Thread Peter Geoghegan
On Wed, Dec 7, 2022 at 9:50 AM Andres Freund  wrote:
> performing post-bootstrap initialization ... 
> ../src/backend/access/transam/slru.c:1520:9: runtime error: load of 
> misaligned address 0x7fff6362db8c for type 'int64', which requires 8 byte 
> alignment
> 0x7fff6362db8c: note: pointer points here
>   01 00 00 00 00 00 00 00  d0 02 00 00 00 00 00 00  d0 02 00 00 00 00 00 00  
> 01 00 00 00 00 00 00 00

I bet that this alignment issue can be fixed by using PGAlignedBlock
instead of a raw char buffer for a page. (I'm guessing, I haven't
directly checked.)
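
A rough sketch of that idea, assuming the offending code currently reads SLRU
segment pages into a plain char array (PGAlignedBlock is the existing union in
c.h whose extra members force suitable alignment; the surrounding variables are
made up for the example):

    PGAlignedBlock buf;         /* char data[BLCKSZ] plus force-alignment members */
    char       *page = buf.data;

    if (read(fd, page, BLCKSZ) == BLCKSZ)
    {
        int64      *entries = (int64 *) page;  /* now aligned for 8-byte loads */

        /* ... scan entries[] as before ... */
    }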

-- 
Peter Geoghegan




Re: Error-safe user functions

2022-12-07 Thread Tom Lane
Andres Freund  writes:
> Is there a guarantee that input functions are stable or immutable?

There's a project policy that that should be true.  That justifies
marking things like record_in as stable --- if the per-column input
functions could be volatile, record_in would need to be as well.
There are other dependencies on it; see e.g. aab353a60, 3db6524fe.

> We don't
> have any volatile input functions in core PG:

Indeed, because type_sanity.sql checks that.

regards, tom lane




Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)

2022-12-07 Thread Andres Freund
Hi,

On 2022-12-07 11:40:08 +0300, Aleksander Alekseev wrote:
> Hi Michael,
>
> > The CF bot is showing some failures here.  You may want to
> > double-check.
>
> Thanks! PFA v48.

This causes a lot of failures with ubsan:

https://cirrus-ci.com/task/6035600772431872

performing post-bootstrap initialization ... 
../src/backend/access/transam/slru.c:1520:9: runtime error: load of misaligned 
address 0x7fff6362db8c for type 'int64', which requires 8 byte alignment
0x7fff6362db8c: note: pointer points here
  01 00 00 00 00 00 00 00  d0 02 00 00 00 00 00 00  d0 02 00 00 00 00 00 00  01 
00 00 00 00 00 00 00
  ^
==18947==Using libbacktrace symbolizer.
#0 0x564d7c45cc6b in SlruScanDirCbReportPresence 
../src/backend/access/transam/slru.c:1520
#1 0x564d7c45cd88 in SlruScanDirectory 
../src/backend/access/transam/slru.c:1595
#2 0x564d7c44872c in TruncateCLOG ../src/backend/access/transam/clog.c:889
#3 0x564d7c62ecd7 in vac_truncate_clog ../src/backend/commands/vacuum.c:1779
#4 0x564d7c6320a8 in vac_update_datfrozenxid 
../src/backend/commands/vacuum.c:1642
#5 0x564d7c632a78 in vacuum ../src/backend/commands/vacuum.c:537
#6 0x564d7c63347d in ExecVacuum ../src/backend/commands/vacuum.c:273
#7 0x564d7ca4afea in standard_ProcessUtility 
../src/backend/tcop/utility.c:866
#8 0x564d7ca4b723 in ProcessUtility ../src/backend/tcop/utility.c:530
#9 0x564d7ca46e81 in PortalRunUtility ../src/backend/tcop/pquery.c:1158
#10 0x564d7ca4755d in PortalRunMulti ../src/backend/tcop/pquery.c:1315
#11 0x564d7ca47c02 in PortalRun ../src/backend/tcop/pquery.c:791
#12 0x564d7ca40ecb in exec_simple_query ../src/backend/tcop/postgres.c:1238
#13 0x564d7ca43c01 in PostgresMain ../src/backend/tcop/postgres.c:4551
#14 0x564d7ca441a4 in PostgresSingleUserMain 
../src/backend/tcop/postgres.c:4028
#15 0x564d7c74d883 in main ../src/backend/main/main.c:197
#16 0x7fde7793dd09 in __libc_start_main 
(/lib/x86_64-linux-gnu/libc.so.6+0x23d09)
#17 0x564d7c2d30c9 in _start 
(/tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres+0x8530c9)

Greetings,

Andres Freund




Re: Error-safe user functions

2022-12-07 Thread David G. Johnston
On Wed, Dec 7, 2022 at 10:34 AM Andres Freund  wrote:

> > +{ oid => '8053',
> > +  descr => 'get error message if string is not valid input for data
> type',
> > +  proname => 'pg_input_invalid_message', provolatile => 's',
> > +  prorettype => 'text', proargtypes => 'text regtype int4',
> > +  prosrc => 'pg_input_invalid_message_mod' },
> > +
>
> Is there a guarantee that input functions are stable or immutable? We don't
> have any volatile input functions in core PG:
>
> SELECT provolatile, count(*) FROM pg_proc WHERE oid IN (SELECT typinput
> FROM pg_type) GROUP BY provolatile;
>
>
Effectively yes, though I'm not sure if it is formally documented or
otherwise enforced by the system.

The fact we allow stable is a bit of a sore spot; volatile would be a
terrible property for an I/O function.

David J.


Re: [PATCH] pg_dump: lock tables in batches

2022-12-07 Thread Andres Freund
Hi,

On 2022-12-07 12:28:03 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > On 2022-12-07 10:44:33 -0500, Tom Lane wrote:
> >> I have a strong sense of deja vu here.  I'm pretty sure I experimented
> >> with this idea last year and gave up on it.  I don't recall exactly
> >> why, but either it didn't show any meaningful performance improvement
> >> for me or there was some actual downside (that I'm not remembering
> >> right now).
> 
> > IIRC the case we were looking at around 989596152 were CPU bound workloads,
> > rather than latency bound workloads. It'd not be surprising to have cases
> > where batching LOCKs helps latency, but not CPU bound.
> 
> Yeah, perhaps.  Anyway my main point is that I don't want to just assume
> this is a win; I want to see some actual performance tests.

FWIW, one can simulate network latency with 'netem' on linux. Works even for
'lo'.

ping -c 3 -n localhost

64 bytes from ::1: icmp_seq=1 ttl=64 time=0.035 ms
64 bytes from ::1: icmp_seq=2 ttl=64 time=0.049 ms
64 bytes from ::1: icmp_seq=3 ttl=64 time=0.043 ms

tc qdisc add dev lo root netem delay 10ms

64 bytes from ::1: icmp_seq=1 ttl=64 time=20.1 ms
64 bytes from ::1: icmp_seq=2 ttl=64 time=20.2 ms
64 bytes from ::1: icmp_seq=3 ttl=64 time=20.2 ms

tc qdisc delete dev lo root netem

64 bytes from ::1: icmp_seq=1 ttl=64 time=0.036 ms
64 bytes from ::1: icmp_seq=2 ttl=64 time=0.047 ms
64 bytes from ::1: icmp_seq=3 ttl=64 time=0.050 ms


> > I wonder if "manual" batching is the best answer. Alexander, have you
> > considered using libpq level pipelining?
> 
> I'd be a bit nervous about how well that works with older servers.

I don't think there should be any problem - E.g. pgjdbc has been using
pipelining for ages.

Not sure if it's the right answer, just to be clear. I suspect that eventually
we're going to need to have a special "acquire pg_dump locks" function that is
cheaper than retail lock acquisition and perhaps deals more gracefully with
exceeding max_locks_per_transaction. Which would presumably not be pipelined.
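
For reference, a minimal sketch of what batching the LOCK TABLE round trips
through libpq's pipeline mode could look like on the pg_dump side (hypothetical,
not from the posted patch; it assumes conn is the archive connection and
qualified_names[] already holds the quoted relation names):

    PQenterPipelineMode(conn);
    for (int i = 0; i < ntables; i++)
    {
        char        query[1024];

        snprintf(query, sizeof(query),
                 "LOCK TABLE %s IN ACCESS SHARE MODE", qualified_names[i]);
        PQsendQueryParams(conn, query, 0, NULL, NULL, NULL, NULL, 0);
    }
    PQpipelineSync(conn);

    for (int i = 0; i < ntables; i++)
    {
        PGresult   *res = PQgetResult(conn);

        if (PQresultStatus(res) != PGRES_COMMAND_OK)
        { /* report the failed lock and bail out */ }
        PQclear(res);
        (void) PQgetResult(conn);   /* NULL terminator for this query's results */
    }
    PQclear(PQgetResult(conn));     /* consume the PGRES_PIPELINE_SYNC result */
    PQexitPipelineMode(conn);

This only uses the asynchronous calls that pipeline mode permits
(PQsendQueryParams rather than PQexec), so the locks still go out as ordinary
extended-protocol messages; the batching is purely client-side.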

Greetings,

Andres Freund




Re: Error-safe user functions

2022-12-07 Thread Andres Freund
Hi,

On 2022-12-06 15:21:09 -0500, Tom Lane wrote:
> +{ oid => '8050', descr => 'test whether string is valid input for data type',
> +  proname => 'pg_input_is_valid', provolatile => 's', prorettype => 'bool',
> +  proargtypes => 'text regtype', prosrc => 'pg_input_is_valid' },
> +{ oid => '8051', descr => 'test whether string is valid input for data type',
> +  proname => 'pg_input_is_valid', provolatile => 's', prorettype => 'bool',
> +  proargtypes => 'text regtype int4', prosrc => 'pg_input_is_valid_mod' },
> +{ oid => '8052',
> +  descr => 'get error message if string is not valid input for data type',
> +  proname => 'pg_input_invalid_message', provolatile => 's',
> +  prorettype => 'text', proargtypes => 'text regtype',
> +  prosrc => 'pg_input_invalid_message' },
> +{ oid => '8053',
> +  descr => 'get error message if string is not valid input for data type',
> +  proname => 'pg_input_invalid_message', provolatile => 's',
> +  prorettype => 'text', proargtypes => 'text regtype int4',
> +  prosrc => 'pg_input_invalid_message_mod' },
> +

Is there a guarantee that input functions are stable or immutable? We don't
have any volatile input functions in core PG:

SELECT provolatile, count(*) FROM pg_proc WHERE oid IN (SELECT typinput FROM 
pg_type) GROUP BY provolatile;

Greetings,

Andres Freund




Re: [PATCH] pg_dump: lock tables in batches

2022-12-07 Thread Tom Lane
Andres Freund  writes:
> On 2022-12-07 10:44:33 -0500, Tom Lane wrote:
>> I have a strong sense of deja vu here.  I'm pretty sure I experimented
>> with this idea last year and gave up on it.  I don't recall exactly
>> why, but either it didn't show any meaningful performance improvement
>> for me or there was some actual downside (that I'm not remembering
>> right now).

> IIRC the case we were looking at around 989596152 were CPU bound workloads,
> rather than latency bound workloads. It'd not be surprising to have cases
> where batching LOCKs helps latency, but not CPU bound.

Yeah, perhaps.  Anyway my main point is that I don't want to just assume
this is a win; I want to see some actual performance tests.

> I wonder if "manual" batching is the best answer. Alexander, have you
> considered using libpq level pipelining?

I'd be a bit nervous about how well that works with older servers.

regards, tom lane




Re: Error-safe user functions

2022-12-07 Thread Tom Lane
"David G. Johnston"  writes:
> So long as you aren't opposed to the idea if someone else does the work,
> adding sect2 is better than nothing even if it is just a stop-gap measure.

OK, we can agree on that.

As for the other point ---  not sure why I didn't remember this right off,
but the point of two test functions is that one exercises the code path
with details_wanted = true while the other exercises details_wanted =
false.  A combined function would only test the first case.

regards, tom lane




Re: Error-safe user functions

2022-12-07 Thread Tom Lane
Corey Huinker  writes:
> In my attempt to implement CAST...DEFAULT, I noticed that I immediately
> needed an
> OidInputFunctionCallSafe, which was trivial but maybe something we want to
> add to the infra patch, but the comments around that function also somewhat
> indicate that we might want to just do the work in-place and call
> InputFunctionCallSafe directly. Open to both ideas.

I'm a bit skeptical of that.  IMO using OidInputFunctionCall is only
appropriate in places that will be executed just once per query.
Otherwise, unless you have zero concern for performance, you should
be caching the function lookup.  (The test functions in my 0003 patch
illustrate the standard way to do that within SQL-callable functions.
If you're implementing CAST as a new kind of executable expression,
the lookup would likely happen in expression compilation.)
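
(For anyone following along, that standard caching pattern looks roughly like
the sketch below.  MyInputCache and the local variables are made up for the
example, and it assumes the proposed boolean-returning InputFunctionCallSafe();
getTypeInputInfo(), fmgr_info_cxt(), and fn_extra/fn_mcxt are the existing
facilities:

    typedef struct MyInputCache     /* hypothetical per-call-site cache */
    {
        Oid         typioparam;
        FmgrInfo    inputproc;
    } MyInputCache;

    MyInputCache *cache = (MyInputCache *) fcinfo->flinfo->fn_extra;

    if (cache == NULL)
    {
        Oid         infunc;

        cache = MemoryContextAlloc(fcinfo->flinfo->fn_mcxt, sizeof(MyInputCache));
        getTypeInputInfo(typoid, &infunc, &cache->typioparam);
        fmgr_info_cxt(infunc, &cache->inputproc, fcinfo->flinfo->fn_mcxt);
        fcinfo->flinfo->fn_extra = cache;
        /* a real implementation would also remember and recheck typoid */
    }

    if (!InputFunctionCallSafe(&cache->inputproc, string, cache->typioparam,
                               typmod, escontext, &result))
    { /* a soft error was recorded in escontext */ }

The syscache lookup and fmgr_info_cxt() then happen once per query, while the
per-row calls just reuse the cached FmgrInfo.)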

I don't say that OidInputFunctionCallSafe won't ever be useful, but
I doubt it's what we want in CAST.

regards, tom lane




Re: Error-safe user functions

2022-12-07 Thread Andres Freund
On 2022-12-07 09:20:33 -0500, Tom Lane wrote:
> Returning to the naming quagmire -- it occurred to me just now that
> it might be helpful to call this style of error reporting "soft"
> errors rather than "safe" errors, which'd provide a nice contrast
> with "hard" errors thrown by longjmp'ing.

+1




Re: [PATCH] pg_dump: lock tables in batches

2022-12-07 Thread Andres Freund
Hi,

On 2022-12-07 10:44:33 -0500, Tom Lane wrote:
> Aleksander Alekseev  writes:
> > What he proposes is taking the locks in batches.
> 
> I have a strong sense of deja vu here.  I'm pretty sure I experimented
> with this idea last year and gave up on it.  I don't recall exactly
> why, but either it didn't show any meaningful performance improvement
> for me or there was some actual downside (that I'm not remembering
> right now).

> This would've been in the leadup to 989596152 and adjacent commits.
> I took a quick look through the threads cited in those commit messages
> and didn't find anything about it, but I think the discussion had
> gotten scattered across more threads.  Some digging in the archives
> could be useful.

IIRC the case we were looking at around 989596152 were CPU bound workloads,
rather than latency bound workloads. It'd not be surprising to have cases
where batching LOCKs helps latency, but not CPU bound.


I wonder if "manual" batching is the best answer. Alexander, have you
considered using libpq level pipelining?

Greetings,

Andres Freund




Re: Error-safe user functions

2022-12-07 Thread David G. Johnston
On Wed, Dec 7, 2022 at 9:59 AM Tom Lane  wrote:

> "David G. Johnston"  writes:
>
> > Are you suggesting we should not go down the path that v8-0003 does in
> the
> > monitoring section cleanup thread?  I find the usability of Chapter 54
> > System Views to be superior to these two run-on chapters and would rather
> > we emulate it in both these places - for what is in the end very little
> > additional effort, all mechanical in nature.
>
> I have not been following that thread, and am not really excited about
> putting in a huge amount of documentation work here.  I'd just like 9.26
> to have a mini-TOC at the page head, which <sect2>'s would be enough for.
>
>
So long as you aren't opposed to the idea if someone else does the work,
adding sect2 is better than nothing even if it is just a stop-gap measure.

David J.


Re: Error-safe user functions

2022-12-07 Thread Corey Huinker
On Wed, Dec 7, 2022 at 9:20 AM Tom Lane  wrote:

> Andrew Dunstan  writes:
> > Perhaps we should add a type in the regress library that will never have
> > a safe input function, so we can test that the mechanism works as
> > expected in that case even after we adjust all the core data types'
> > input functions.
>
> I was intending that the existing "widget" type be that.  0003 already
> adds a comment to widget_in saying not to "fix" its one ereport call.
>
> Returning to the naming quagmire -- it occurred to me just now that
> it might be helpful to call this style of error reporting "soft"
> errors rather than "safe" errors, which'd provide a nice contrast
> with "hard" errors thrown by longjmp'ing.  That would lead to naming
> all the variant functions XXXSoft not XXXSafe.  There would still
> be commentary to the effect that "soft errors must be safe, in the
> sense that there's no question whether it's safe to continue
> processing the transaction".  Anybody think that'd be an
> improvement?


In my attempt to implement CAST...DEFAULT, I noticed that I immediately
needed an
OidInputFunctionCallSafe, which was trivial but maybe something we want to
add to the infra patch, but the comments around that function also somewhat
indicate that we might want to just do the work in-place and call
InputFunctionCallSafe directly. Open to both ideas.

Looking forward cascades up into coerce_type and its brethren, and
reimplementing those from a Node returner to a boolean returner with a Node
parameter seems a bit of a stretch, so I have to pick a point where the
code pivots from passing down a safe-mode indicator and passing back a
found_error indicator (which may be combine-able, as safe is always true
when the found_error pointer is not null, and always false when it isn't),
but for the most part things look do-able.


Re: Error-safe user functions

2022-12-07 Thread Tom Lane
"David G. Johnston"  writes:
> On Wed, Dec 7, 2022 at 9:06 AM Tom Lane  wrote:
>> "David G. Johnston"  writes:
>>> Why not do away with two separate functions and define a composite type
>>> (boolean, text) for is_valid to return?

>> I don't see any advantage to that.  It would be harder to use in both
>> use-cases.

> I don't really see a use case for either of them individually.

Uh, several people opined that pg_input_is_valid would be of field
interest.  If I thought these were only for testing purposes I wouldn't
be especially concerned about documenting them at all.

> Are you suggesting we should not go down the path that v8-0003 does in the
> monitoring section cleanup thread?  I find the usability of Chapter 54
> System Views to be superior to these two run-on chapters and would rather
> we emulate it in both these places - for what is in the end very little
> additional effort, all mechanical in nature.

I have not been following that thread, and am not really excited about
putting in a huge amount of documentation work here.  I'd just like 9.26
to have a mini-TOC at the page head, which <sect2>'s would be enough for.

regards, tom lane




Re: pg_upgrade: Make testing different transfer modes easier

2022-12-07 Thread Peter Eisentraut

On 02.12.22 13:04, Daniel Gustafsson wrote:

Wouldn't it be possible, and require fewer manual code changes, to accept this via an
extension to PROVE_FLAGS?  Any options after :: to prove are passed to the
test(s) [0] so we could perhaps inspect @ARGV for the mode if we invent a new
way to pass arguments.  Something along the lines of the untested sketch
below in the pg_upgrade test:

+# Optionally set the file transfer mode for the tests via arguments to PROVE
+my ($mode) = @ARGV;
+$mode = '--copy' unless defined $mode;

.. together with an extension to Makefile.global.in ..

-   $(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS) $(if 
$(PROVE_TESTS),$(PROVE_TESTS),t/*.pl)
+   $(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS) $(if 
$(PROVE_TESTS),$(PROVE_TESTS),t/*.pl) $(PROVE_TEST_ARGS)

.. should *I think* allow for passing the mode to the tests via:

make -C src/bin/pg_upgrade check PROVE_TEST_ARGS=":: --link"


I think this might be a lot of complication to get working robustly and 
in the different build systems.  Plus, what happens if you run all the 
test suites and want to pass some options to pg_upgrade and some to 
another test?


I think if we want to make this configurable on the fly, an environment 
variable would be much easier, like


my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';





Re: pg_upgrade: Make testing different transfer modes easier

2022-12-07 Thread Peter Eisentraut

On 02.12.22 01:56, Kyotaro Horiguchi wrote:

also thought about something like a "mode" option with an argument,
but given that we already have --link and --clone, this seemed the
most sensible.)

Thoughts?


When I read up to the point of the --copy option, what came to my mind
was the --mode= option.  IMHO, if I was going to add an option
to choose the copy behavior, I would add --mode option instead, like
pg_ctl does, as it implicitly signals that the suboptions are mutually
exclusive.


Ok, we have sort of one vote for each variant now.  Let's see if there 
are other opinions.






Re: [DOCS] Stats views and functions not in order?

2022-12-07 Thread David G. Johnston
On Tue, Dec 6, 2022 at 7:57 PM David G. Johnston 
wrote:

> On Tue, Dec 6, 2022 at 6:36 PM Peter Smith  wrote:
>
>> I'd like to "fix" this but IIUC there is no consensus yet about what
>> order is best for patch 0001, right?
>>
>>
> I'm planning on performing a more thorough review of 0003 and 0004
> tomorrow.
>
>
Compiled just fine.

I do think every row of the views table should be hyperlinked.  None of the
"xact" ones are for some reason.  For the sys/user ones just point to the
same place as the corresponding "all" link.

pg_stat_subscription_stats needs to be moved up to the "globals" section.

There are a bunch of trailing ". See" in the descriptions now that need to
be cleaned up. (0002)

David J.


Re: Error-safe user functions

2022-12-07 Thread David G. Johnston
On Wed, Dec 7, 2022 at 9:06 AM Tom Lane  wrote:

> "David G. Johnston"  writes:
> > Why not do away with two separate functions and define a composite type
> > (boolean, text) for is_valid to return?
>
> I don't see any advantage to that.  It would be harder to use in both
> use-cases.
>

I don't really see a use case for either of them individually.  If all you
are doing is printing them out in a test and checking the result in what
situation wouldn't you want to check that both the true/false and message
are as expected?  Plus, you don't have to figure out a name for the second
function.


>
> >> BTW, does anyone else agree that 9.26 is desperately in need of some
> >> <sect2> subdivisions?  It seems to have gotten a lot longer since
> >> I looked at it last.
>
> > I'd be inclined to do something like what we are attempting for Chapter
> 28
> > Monitoring Database Activity; introduce pagination through refentry and
> > build our own table of contents into it.
>
> I'd prefer to follow the model that already exists in 9.27,
> ie break it up with <sect2>'s, which provide a handy
> sub-table-of-contents.
>
>
I have a bigger issue with the non-pagination myself; the extra bit of
effort to manually create a tabular ToC (where we can add descriptions)
seems like a worthy price to pay.

Are you suggesting we should not go down the path that v8-0003 does in the
monitoring section cleanup thread?  I find the usability of Chapter 54
System Views to be superior to these two run-on chapters and would rather
we emulate it in both these places - for what is in the end very little
additional effort, all mechanical in nature.

David J.


Re: Think-o in foreign key comments

2022-12-07 Thread Peter Eisentraut

On 03.12.22 05:59, Ian Lawrence Barwick wrote:

On Sat, Dec 3, 2022 at 7:19, Paul Jungwirth wrote:

I noticed a few places in the new foreign key code where a comment says
"the ON DELETE SET NULL/DELETE clause". I believe it should say "ON
DELETE SET NULL/DEFAULT".

These comments were added in d6f96ed94e7, "Allow specifying column list
for foreign key ON DELETE SET actions." Here is a patch to correct them.


LGTM.

I do notice the same patch adds the function "validateFkOnDeleteSetColumns"
but the name in the comment preceding it is "validateFkActionSetColumns";
might as well fix that at the same time.


Committed with that addition and backpatched to PG15.





Re: Error-safe user functions

2022-12-07 Thread Tom Lane
"David G. Johnston"  writes:
> Why not do away with two separate functions and define a composite type
> (boolean, text) for is_valid to return?

I don't see any advantage to that.  It would be harder to use in both
use-cases.

>> BTW, does anyone else agree that 9.26 is desperately in need of some
>> <sect2> subdivisions?  It seems to have gotten a lot longer since
>> I looked at it last.

> I'd be inclined to do something like what we are attempting for Chapter 28
> Monitoring Database Activity; introduce pagination through refentry and
> build our own table of contents into it.

I'd prefer to follow the model that already exists in 9.27,
ie break it up with <sect2>'s, which provide a handy
sub-table-of-contents.

regards, tom lane




Re: Error-safe user functions

2022-12-07 Thread David G. Johnston
On Wed, Dec 7, 2022 at 8:23 AM Tom Lane  wrote:

> Andrew Dunstan  writes:
> > On 2022-12-07 We 09:20, Tom Lane wrote:
> >> Returning to the naming quagmire -- it occurred to me just now that
> >> it might be helpful to call this style of error reporting "soft"
> >> errors rather than "safe" errors, which'd provide a nice contrast
> >> with "hard" errors thrown by longjmp'ing.  That would lead to naming
> >> all the variant functions XXXSoft not XXXSafe.
>
> > I'm not sure InputFunctionCallSoft would be an improvement.
>
> Yeah, after reflecting on it a bit more I'm not that impressed with
> that as a function name either.
>
> (I think that "soft error" could be useful as informal terminology.
> AFAIR we don't use "hard error" in any formal way either, but there
> are certainly comments using that phrase.)
>
> More questions:
>
> * Anyone want to bikeshed about the new SQL-level function names?
> I'm reasonably satisfied with "pg_input_is_valid" for the bool-returning
> variant, but not so much with "pg_input_invalid_message" for the
> error-message-returning variant.  Thinking about "pg_input_error_message"
> instead, but that's not stellar either.
>

Why not do away with two separate functions and define a composite type
(boolean, text) for is_valid to return?


> * Where in the world shall we document these, if we document them?
> The only section of chapter 9 that seems even a little bit appropriate
> is "9.26. System Information Functions and Operators", and even there,
> they would need their own new table because they don't fit well in any
> existing table.
>

I would indeed just add a table there.


>
> BTW, does anyone else agree that 9.26 is desperately in need of some
> <sect2> subdivisions?  It seems to have gotten a lot longer since
> I looked at it last.
>
>
I'd be inclined to do something like what we are attempting for Chapter 28
Monitoring Database Activity; introduce pagination through refentry and
build our own table of contents into it.

David J.


Re: [PATCH] pg_dump: lock tables in batches

2022-12-07 Thread Tom Lane
Aleksander Alekseev  writes:
> What he proposes is taking the locks in batches.

I have a strong sense of deja vu here.  I'm pretty sure I experimented
with this idea last year and gave up on it.  I don't recall exactly
why, but either it didn't show any meaningful performance improvement
for me or there was some actual downside (that I'm not remembering
right now).

This would've been in the leadup to 989596152 and adjacent commits.
I took a quick look through the threads cited in those commit messages
and didn't find anything about it, but I think the discussion had
gotten scattered across more threads.  Some digging in the archives
could be useful.

regards, tom lane




Re: Error-safe user functions

2022-12-07 Thread David G. Johnston
On Wed, Dec 7, 2022 at 8:04 AM Andrew Dunstan  wrote:

>
> On 2022-12-07 We 09:20, Tom Lane wrote:
> > Andrew Dunstan  writes:
> >> Perhaps we should add a type in the regress library that will never have
> >> a safe input function, so we can test that the mechanism works as
> >> expected in that case even after we adjust all the core data types'
> >> input functions.
> > I was intending that the existing "widget" type be that.  0003 already
> > adds a comment to widget_in saying not to "fix" its one ereport call.
>
>
> Yeah, I see that, I must have been insufficiently caffeinated.
>
>
> >
> > Returning to the naming quagmire -- it occurred to me just now that
> > it might be helpful to call this style of error reporting "soft"
> > errors rather than "safe" errors, which'd provide a nice contrast
> > with "hard" errors thrown by longjmp'ing.  That would lead to naming
> > all the variant functions XXXSoft not XXXSafe.  There would still
> > be commentary to the effect that "soft errors must be safe, in the
> > sense that there's no question whether it's safe to continue
> > processing the transaction".  Anybody think that'd be an
> > improvement?
> >
> >
>
>
> I'm not sure InputFunctionCallSoft would be an improvement. Maybe
> InputFunctionCallSoftError would be clearer, but I don't know that it's
> much of an improvement either. The same goes for the other visible changes.
>
>
InputFunctionCallSafe -> TryInputFunctionCall

I think in create type saying "input functions to handle errors softly" is
an improvement over "input functions to return safe errors".

start->save->finish describes a soft error handling procedure quite well.
"safe" has baggage; all code should be "safe".

fmgr/README: "Handling Non-Exception Errors" -> "Soft Error Handling"

"typical safe error conditions include" -> "error conditions that can be
handled softly include"

(pg_input_is_valid) "input function has been updated to return "safe'
errors" -> "input function has been updated to soft error handling"


Unrelated observation: "Although the error stack is not large, we don't
expect to run out of space." -> "Because the error stack is not large,
assume that we will not run out of space and panic if we are wrong."?

David J.


Re: [PATCH] pg_dump: lock tables in batches

2022-12-07 Thread Fabrízio de Royes Mello
On Wed, Dec 7, 2022 at 12:09 PM Aleksander Alekseev <
aleksan...@timescale.com> wrote:
>
> Hi hackers,
>
> A colleague of mine reported a slight inconvenience with pg_dump.
>
> He is dumping the data from a remote server. There are several
> thousand tables in the database. Making a dump locally and/or
> using pg_basebackup and/or logical replication is not an option. So
> what pg_dump currently does is send LOCK TABLE queries one after
> another. Every query needs an extra round trip. So if we have, let's
> say, 2000 tables and every round trip takes 100 ms, then ~3.5 minutes
> are spent in a not very useful way.
>
> What he proposes is taking the locks in batches. I.e. instead of:
>
> LOCK TABLE foo IN ACCESS SHARE MODE;
> LOCK TABLE bar IN ACCESS SHARE MODE;
>
> do:
>
> LOCK TABLE foo, bar, ... IN ACCESS SHARE MODE;
>
> The proposed patch makes this change. It's pretty straightforward and
> as a side effect saves a bit of network traffic too.
>

+1 for that change. It will improve the dump for databases with thousands
of relations.

The code LGTM; it passes all tests and compiles without any warnings.

Regards,

-- 
Fabrízio de Royes Mello


Re: Error-safe user functions

2022-12-07 Thread Tom Lane
Andrew Dunstan  writes:
> On 2022-12-07 We 09:20, Tom Lane wrote:
>> Returning to the naming quagmire -- it occurred to me just now that
>> it might be helpful to call this style of error reporting "soft"
>> errors rather than "safe" errors, which'd provide a nice contrast
>> with "hard" errors thrown by longjmp'ing.  That would lead to naming
>> all the variant functions XXXSoft not XXXSafe.

> I'm not sure InputFunctionCallSoft would be an improvement.

Yeah, after reflecting on it a bit more I'm not that impressed with
that as a function name either.

(I think that "soft error" could be useful as informal terminology.
AFAIR we don't use "hard error" in any formal way either, but there
are certainly comments using that phrase.)

More questions:

* Anyone want to bikeshed about the new SQL-level function names?
I'm reasonably satisfied with "pg_input_is_valid" for the bool-returning
variant, but not so much with "pg_input_invalid_message" for the
error-message-returning variant.  Thinking about "pg_input_error_message"
instead, but that's not stellar either.

* Where in the world shall we document these, if we document them?
The only section of chapter 9 that seems even a little bit appropriate
is "9.26. System Information Functions and Operators", and even there,
they would need their own new table because they don't fit well in any
existing table.

BTW, does anyone else agree that 9.26 is desperately in need of some
<sect2> subdivisions?  It seems to have gotten a lot longer since
I looked at it last.

regards, tom lane




[PATCH] pg_dump: lock tables in batches

2022-12-07 Thread Aleksander Alekseev
Hi hackers,

A colleague of mine reported a slight inconvenience with pg_dump.

He is dumping the data from a remote server. There are several
thousand tables in the database. Making a dump locally and/or
using pg_basebackup and/or logical replication is not an option. So
what pg_dump currently does is send LOCK TABLE queries one after
another. Every query needs an extra round trip. So if we have, let's
say, 2000 tables and every round trip takes 100 ms, then ~3.5 minutes
are spent in a not very useful way.

What he proposes is taking the locks in batches. I.e. instead of:

LOCK TABLE foo IN ACCESS SHARE MODE;
LOCK TABLE bar IN ACCESS SHARE MODE;

do:

LOCK TABLE foo, bar, ... IN ACCESS SHARE MODE;

The proposed patch makes this change. It's pretty straightforward and
as a side effect saves a bit of network traffic too.

Thoughts?

-- 
Best regards,
Aleksander Alekseev


v1-0001-pg_dump-lock-tables-in-batches.patch
Description: Binary data


Re: Error-safe user functions

2022-12-07 Thread Andrew Dunstan


On 2022-12-07 We 09:20, Tom Lane wrote:
> Andrew Dunstan  writes:
>> Perhaps we should add a type in the regress library that will never have
>> a safe input function, so we can test that the mechanism works as
>> expected in that case even after we adjust all the core data types'
>> input functions.
> I was intending that the existing "widget" type be that.  0003 already
> adds a comment to widget_in saying not to "fix" its one ereport call.


Yeah, I see that, I must have been insufficiently caffeinated.


>
> Returning to the naming quagmire -- it occurred to me just now that
> it might be helpful to call this style of error reporting "soft"
> errors rather than "safe" errors, which'd provide a nice contrast
> with "hard" errors thrown by longjmp'ing.  That would lead to naming
> all the variant functions XXXSoft not XXXSafe.  There would still
> be commentary to the effect that "soft errors must be safe, in the
> sense that there's no question whether it's safe to continue
> processing the transaction".  Anybody think that'd be an
> improvement?
>
>   


I'm not sure InputFunctionCallSoft would be an improvement. Maybe
InputFunctionCallSoftError would be clearer, but I don't know that it's
much of an improvement either. The same goes for the other visible changes.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Error-safe user functions

2022-12-07 Thread David G. Johnston
On Wed, Dec 7, 2022 at 7:20 AM Tom Lane  wrote:

>
> Returning to the naming quagmire -- it occurred to me just now that
> it might be helpful to call this style of error reporting "soft"
> errors rather than "safe" errors, which'd provide a nice contrast
> with "hard" errors thrown by longjmp'ing.  That would lead to naming
> all the variant functions XXXSoft not XXXSafe.  There would still
> be commentary to the effect that "soft errors must be safe, in the
> sense that there's no question whether it's safe to continue
> processing the transaction".  Anybody think that'd be an
> improvement?
>
>
+1

David J.


Re: Allow pageinspect's bt_page_stats function to return a set of rows instead of a single row

2022-12-07 Thread Muhammad Usama
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:not tested

Looks good to me

The new status of this patch is: Ready for Committer


Re: Error-safe user functions

2022-12-07 Thread Tom Lane
Andrew Dunstan  writes:
> Perhaps we should add a type in the regress library that will never have
> a safe input function, so we can test that the mechanism works as
> expected in that case even after we adjust all the core data types'
> input functions.

I was intending that the existing "widget" type be that.  0003 already
adds a comment to widget_in saying not to "fix" its one ereport call.

Returning to the naming quagmire -- it occurred to me just now that
it might be helpful to call this style of error reporting "soft"
errors rather than "safe" errors, which'd provide a nice contrast
with "hard" errors thrown by longjmp'ing.  That would lead to naming
all the variant functions XXXSoft not XXXSafe.  There would still
be commentary to the effect that "soft errors must be safe, in the
sense that there's no question whether it's safe to continue
processing the transaction".  Anybody think that'd be an
improvement?

regards, tom lane




Re: Allow tests to pass in OpenSSL FIPS mode

2022-12-07 Thread Peter Eisentraut

On 13.10.22 12:26, Peter Eisentraut wrote:

I think that the other md5() computations done in the main regression
test suite could just be switched to use one of the sha*() functions
as they just want to put their hands on text values.  It looks like a
few of them have some expectations about the output size and
generate_series(), though this could be tweaked by making the
series shorter, for example.


Right, that's the rest of my original patch.  I'll come back with an 
updated version of that.


Here is the next step.  To contain the scope, I focused on just "make 
check" for now.  This patch removes all incidental calls to md5(), 
replacing them with sha256(), so that they'd pass with or without FIPS 
mode.  (Two tests would need alternative expected files: md5 and 
password.  I have not included those here.)


Some tests inspect the actual md5 result strings or build statistics 
based on them.  I have tried to carefully preserve the meaning of the 
original tests, to the extent that they could be inferred, in some cases 
adjusting example values by matching the md5 outputs to the equivalent 
sha256 outputs.  Some cases are tricky or mysterious or both and could 
use another look.
From 437c7c8b62b5574b017a5d05b5540e219abd6c4a Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Wed, 7 Dec 2022 13:32:26 +0100
Subject: [PATCH] Remove incidental md5() function uses from main regression
 tests

Most of these calls were to generate some random data.  These can be
replaced by appropriately adapted sha256() calls.

This will eventually allow these tests to pass in OpenSSL FIPS mode
(which does not allow MD5 use).

Similar work for other test suites will follow later.
---
 src/test/regress/expected/arrays.out|  18 +-
 src/test/regress/expected/brin.out  |   4 +-
 src/test/regress/expected/brin_multi.out|   8 +-
 src/test/regress/expected/compression.out   |  13 +-
 src/test/regress/expected/compression_1.out |  11 +-
 src/test/regress/expected/inherit.out   |   2 +-
 src/test/regress/expected/largeobject.out   |   2 +-
 src/test/regress/expected/matview.out   |   8 +-
 src/test/regress/expected/memoize.out   |   2 +-
 src/test/regress/expected/plpgsql.out   |  24 +-
 src/test/regress/expected/rowsecurity.out   | 591 ++--
 src/test/regress/expected/stats_ext.out |  14 +-
 src/test/regress/sql/arrays.sql |  18 +-
 src/test/regress/sql/brin.sql   |   4 +-
 src/test/regress/sql/brin_multi.sql |   8 +-
 src/test/regress/sql/compression.sql|   7 +-
 src/test/regress/sql/inherit.sql|   2 +-
 src/test/regress/sql/largeobject.sql|   2 +-
 src/test/regress/sql/matview.sql|   8 +-
 src/test/regress/sql/memoize.sql|   2 +-
 src/test/regress/sql/plpgsql.sql|   2 +-
 src/test/regress/sql/rowsecurity.sql|  14 +-
 src/test/regress/sql/stats_ext.sql  |  14 +-
 23 files changed, 386 insertions(+), 392 deletions(-)

diff --git a/src/test/regress/expected/arrays.out 
b/src/test/regress/expected/arrays.out
index 97920f38c2..ae269d4f50 100644
--- a/src/test/regress/expected/arrays.out
+++ b/src/test/regress/expected/arrays.out
@@ -2272,20 +2272,20 @@ select * from t1;
 (1 row)
 
 -- Check that arrays of composites are safely detoasted when needed
-create temp table src (f1 text);
+create temp table src (f1 bytea);
 insert into src
-  select string_agg(random()::text,'') from generate_series(1,1);
-create type textandtext as (c1 text, c2 text);
-create temp table dest (f1 textandtext[]);
-insert into dest select array[row(f1,f1)::textandtext] from src;
-select length(md5((f1[1]).c2)) from dest;
+  select string_agg(random()::text::bytea,'') from generate_series(1,1);
+create type byteaandbytea as (c1 bytea, c2 bytea);
+create temp table dest (f1 byteaandbytea[]);
+insert into dest select array[row(f1,f1)::byteaandbytea] from src;
+select length(sha256((f1[1]).c2)) from dest;
  length 
 
  32
 (1 row)
 
 delete from src;
-select length(md5((f1[1]).c2)) from dest;
+select length(sha256((f1[1]).c2)) from dest;
  length 
 
  32
@@ -2293,14 +2293,14 @@ select length(md5((f1[1]).c2)) from dest;
 
 truncate table src;
 drop table src;
-select length(md5((f1[1]).c2)) from dest;
+select length(sha256((f1[1]).c2)) from dest;
  length 
 
  32
 (1 row)
 
 drop table dest;
-drop type textandtext;
+drop type byteaandbytea;
 -- Tests for polymorphic-array form of width_bucket()
 -- this exercises the varwidth and float8 code paths
 SELECT
diff --git a/src/test/regress/expected/brin.out 
b/src/test/regress/expected/brin.out
index 73fa38396e..d4a139e50b 100644
--- a/src/test/regress/expected/brin.out
+++ b/src/test/regress/expected/brin.out
@@ -530,7 +530,7 @@ EXPLAIN (COSTS OFF) SELECT * FROM brin_test WHERE b = 1;
 CREATE TABLE brintest_3 (a text, b text, c text, d text);
 -- long random strings (~2000 chars each, so ~6kB for min/max 

Re: Error-safe user functions

2022-12-07 Thread Andrew Dunstan


On 2022-12-06 Tu 15:21, Tom Lane wrote:
> OK, here's a v3 responding to the comments from Andres.


Looks pretty good to me.


>
>  is preliminary refactoring of elog.c, with (I trust) no
> functional effect.  It gets rid of some pre-existing code duplication
> as well as setting up to let 0001's additions be less duplicative.
>
> 0001 adopts use of Node pointers in place of "void *".  To do this
> I needed an alias type in elog.h equivalent to fmgr.h's fmNodePtr.
> I decided that having two different aliases would be too confusing,
> so what I did here was to converge both elog.h and fmgr.h on using
> the same alias "typedef struct Node *NodePtr".  That has to be in
> elog.h since it's included first, from postgres.h.  (I thought of
> defining NodePtr in postgres.h, but postgres.h includes elog.h
> immediately so that wouldn't have looked very nice.)
>
> I also adopted Andres' recommendation that InputFunctionCallSafe
> return boolean.  I'm still not totally sold on that ... but it does
> end with array_in and record_in never using SAFE_ERROR_OCCURRED at
> all, so maybe the idea's OK.


Originally I wanted to make the new function look as much like the
original as possible, but I'm not wedded to that either. I can live with
it like this.


>
> 0002 adjusts the I/O functions for these API changes, and fixes
> my silly oversight about error cleanup in record_in.
>
> Given the discussion about testing requirements, I threw away the
> COPY hack entirely.  This 0003 provides a couple of SQL-callable
> functions that can be used to invoke a specific datatype's input
> function.  I haven't documented them, pending bikeshedding on
> names etc.  I also arranged to test array_in and record_in with
> a datatype that still throws errors, reserving the existing test
> type "widget" for that purpose.
>
> (I'm not intending to foreclose development of new COPY features
> in this area, just abandoning the idea that that's our initial
> test mechanism.)
>

The new functions on their own are likely to make plenty of people quite
happy once we've adjusted all the input functions.

Perhaps we should add a type in the regress library that will never have
a safe input function, so we can test that the mechanism works as
expected in that case even after we adjust all the core data types'
input functions.

Otherwise I think we're good to go.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: ANY_VALUE aggregate

2022-12-07 Thread David G. Johnston
On Wed, Dec 7, 2022 at 1:58 AM Pantelis Theodosiou 
wrote:

> On Tue, Dec 6, 2022 at 4:57 AM David G. Johnston
>  wrote:
> ...
> >
> >
> > I'm referring to the query:
> >
> > select any_value(v order by v) from (values (2),(1),(3)) as vals (v);
> > // produces 1, per the documented implementation-defined behavior.
> >
> > Someone writing:
> >
> > select any_value(v) from (values (2),(1),(3)) as vals (v) order by v;
> >
> > Is not presently, nor am I saying, promised the value 1.
> >
>
> Shouldn't the 2nd query be producing an error, as it has an implied
> GROUP BY () - so column v cannot appear (unless aggregated) in SELECT
> and ORDER BY?
>

Right, that should be written as:

select any_value(v) from (values (2),(1),(3) order by 1) as vals (v);

(you said SELECT; the discussion here is that any_value is going to be
added as a new aggregate function)

David J.


Re: Allow placeholders in ALTER ROLE w/o superuser

2022-12-07 Thread Alexander Korotkov
On Wed, Dec 7, 2022 at 1:28 AM Pavel Borisov  wrote:
> On Tue, 6 Dec 2022 at 19:01, Alexander Korotkov  wrote:
> >
> > On Mon, Dec 5, 2022 at 10:32 PM Alexander Korotkov  
> > wrote:
> > > On Mon, Dec 5, 2022 at 8:18 PM Tom Lane  wrote:
> > > > Alvaro Herrera  writes:
> > > > > I couldn't find any discussion of the idea of adding "(s)" to the
> > > > > variable name in order to mark the variable userset in the catalog, 
> > > > > and
> > > > > I have to admit I find it a bit strange.  Are we really agreed that
> > > > > that's the way to proceed?
> > > >
> > > > I hadn't been paying close attention to this thread, sorry.
> > > >
> > > > I agree that that seems like a very regrettable choice,
> > > > especially if you anticipate having to bump catversion anyway.
> > >
> > > I totally understand that this change requires a catversion bump.
> > > I've reflected this in the commit message.
> > >
> > > > Better to add a bool column to the catalog.
> > >
> > > What about adding a boolean array to the pg_db_role_setting? So,
> > > pg_db_role_setting would have the following columns.
> > >  * setdatabase oid
> > >  * setrole oid
> > >  * setconfig text[]
> > >  * setuser bool[]
> >
> > The revised patch implements this way of storing the USER SET flag. I
> > think it really became more structured and less cumbersome.
>
> I agree that the patch became more structured and the complications
> for string parameter suffixing have gone away. I've looked it through
> and don't see problems with it. The only two-line fix regarding
> variable initialization may be relevant (see v9). Tests pass and CI is
> also happy with it. I'd like to set it ready for committer if no
> objections.

Thank you, Pavel.
I've made a few minor improvements in the docs and comments.
I'm going to push this if no objections.

--
Regards,
Alexander Korotkov


0001-Add-USER-SET-parameter-values-for-pg_db_role_set-v10.patch
Description: Binary data


RE: Perform streaming logical transactions by background workers and parallel apply

2022-12-07 Thread houzj.f...@fujitsu.com
On Wednesday, December 7, 2022 7:51 PM Masahiko Sawada  
wrote:
> 
> On Mon, Dec 5, 2022 at 1:29 PM houzj.f...@fujitsu.com
>  wrote:
> >
> > On Sunday, December 4, 2022 7:17 PM houzj.f...@fujitsu.com
> 
> > >
> > > Thursday, December 1, 2022 8:40 PM Amit Kapila
> 
> > > wrote:
> > > > Some other comments:
> > > ...
> > > Attach the new version patch set which addressed most of the comments
> > > received so far except some comments being discussed[1].
> > > [1]
> https://www.postgresql.org/message-id/OS0PR01MB57167BF64FC0891734C
> 8E81A94149%40OS0PR01MB5716.jpnprd01.prod.outlook.com
> >
> > Attach a new version patch set which fixed a testcase failure on CFbot.
> 
> ---
> If a value of max_parallel_apply_workers_per_subscription is not
> sufficient, we get the LOG "out of parallel apply workers" every time
> when the apply worker doesn't launch a worker. But do we really need
> this log? It seems not consistent with
> max_sync_workers_per_subscription behavior. I think we can check if
> the number of running parallel workers is less than
> max_parallel_apply_workers_per_subscription before calling
> logicalrep_worker_launch(). What do you think?

(Sorry, I missed this comment in last email)

I personally feel giving a hint might help the user realize that
max_parallel_applyxxx is not enough for the current workload, and then they can
adjust the parameter. Otherwise, the user might not have an easy way to check if
more workers are needed.

Best regards,
Hou zj


RE: Perform streaming logical transactions by background workers and parallel apply

2022-12-07 Thread houzj.f...@fujitsu.com
On Wednesday, December 7, 2022 7:51 PM Masahiko Sawada  
wrote:
> 
> On Mon, Dec 5, 2022 at 1:29 PM houzj.f...@fujitsu.com
>  wrote:
> >
> > On Sunday, December 4, 2022 7:17 PM houzj.f...@fujitsu.com
> 
> > >
> > > Thursday, December 1, 2022 8:40 PM Amit Kapila
> 
> > > wrote:
> > > > Some other comments:
> > > ...
> > > Attach the new version patch set which addressed most of the comments
> > > received so far except some comments being discussed[1].
> > > [1]
> https://www.postgresql.org/message-id/OS0PR01MB57167BF64FC0891734C
> 8E81A94149%40OS0PR01MB5716.jpnprd01.prod.outlook.com
> >
> > Attach a new version patch set which fixed a testcase failure on CFbot.
> 
> Here are some comments on v56 0001, 0002 patches. Please ignore
> comments if you already incorporated them in v57.

Thanks for the comments!

> +static void
> +ProcessParallelApplyInterrupts(void)
> +{
> +CHECK_FOR_INTERRUPTS();
> +
> +if (ShutdownRequestPending)
> +{
> +ereport(LOG,
> +(errmsg("logical replication parallel
> apply worker for subscrip
> tion \"%s\" has finished",
> +MySubscription->name)));
> +
> +apply_worker_clean_exit(false);
> +}
> +
> +if (ConfigReloadPending)
> +{
> +ConfigReloadPending = false;
> +ProcessConfigFile(PGC_SIGHUP);
> +}
> +}
> 
> I personally think that we don't need to have a function to do only
> these few things.

I thought that introduce a new function make the handling of worker specific
Interrupts logic similar to other existing ones. Like:
ProcessWalRcvInterrupts () in walreceiver.c and HandlePgArchInterrupts() in
pgarch.c ...

> 
> Should we change the names to something like
> LOGICALREP_STREAM_PARALLEL?

Agreed, will change.

> ---
> + * The lock graph for the above example will look as follows:
> + * LA (waiting to acquire the lock on the unique index) -> PA (waiting to
> + * acquire the lock on the remote transaction) -> LA
> 
> and
> 
> + * The lock graph for the above example will look as follows:
> + * LA (waiting to acquire the transaction lock) -> PA-2 (waiting to acquire 
> the
> + * lock due to unique index constraint) -> PA-1 (waiting to acquire the 
> stream
> + * lock) -> LA
> 
> "(waiting to acquire the lock on the remote transaction)" in the first
> example and "(waiting to acquire the stream lock)" in the second
> example is the same meaning, right? If so, I think we should use
> either term for consistency.

Will change.

> ---
> +bool   write_abort_info = (data->streaming ==
> SUBSTREAM_PARALLEL);
> 
> I think that instead of setting write_abort_info every time when
> pgoutput_stream_abort() is called, we can set it once, probably in
> PGOutputData, at startup.

Since we already have a "stream" flag in PGOutputData, I am not sure it
would be better to introduce another flag for the same option.


> ---
> server_version = walrcv_server_version(LogRepWorkerWalRcvConn);
> options.proto.logical.proto_version =
> +server_version >= 16 ?
> LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM :
> server_version >= 15 ?
> LOGICALREP_PROTO_TWOPHASE_VERSION_NUM :
> server_version >= 14 ?
> LOGICALREP_PROTO_STREAM_VERSION_NUM :
> LOGICALREP_PROTO_VERSION_NUM;
> 
> Instead of always using the new protocol version, I think we can use
> LOGICALREP_PROTO_TWOPHASE_VERSION_NUM if the streaming is not
> 'parallel'. That way, we don't need to change protocl version check
> logic in pgoutput.c and don't need to expose defGetStreamingMode().
> What do you think?

I think some users may also use the new version number when trying to get
changes (via pg_logical_slot_peek_binary_changes or other functions), so I feel
keeping the check for the new version number is fine.

Besides, even if we don't use the new version number, I feel we still need to
use defGetStreamingMode to check whether parallel mode is in use, because we
need to send abort_lsn when parallel is in use. I might be missing something,
sorry for that. Could you please explain the idea a bit?

> ---
> When max_parallel_apply_workers_per_subscription is changed to a value
> lower than the number of parallel workers running at that time, do we
> need to stop extra workers?

I think we can do this, e.g. by adding a check in the main loop of the leader
worker and checking again every time after reloading the configuration. OTOH,
we will also stop the worker after it finishes a transaction, so I am slightly
unsure whether we need another check here. But I am fine with adding it if you
think it would be better.
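
To be concrete, the check I have in mind would sit next to the existing
config-reload handling in the leader, roughly as sketched below.
pa_free_worker_count() and pa_stop_idle_worker() are placeholder names for
illustration, not existing functions:

    if (ConfigReloadPending)
    {
        ConfigReloadPending = false;
        ProcessConfigFile(PGC_SIGHUP);

        /*
         * If the limit was lowered, ask idle parallel apply workers to exit
         * until we are back under the new limit.
         */
        while (pa_free_worker_count() > max_parallel_apply_workers_per_subscription)
            pa_stop_idle_worker();      /* placeholder */
    }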


> ---
> If a value of max_parallel_apply_workers_per_subscription is not
> sufficient, we get the LOG "out of parallel apply workers" every time
> when the apply worker doesn't launch a worker. But do we really need
> this log? It seems not consistent with
> max_sync_workers_per_subscription 

Re: Perform streaming logical transactions by background workers and parallel apply

2022-12-07 Thread Masahiko Sawada
On Wed, Dec 7, 2022 at 4:31 PM Amit Kapila  wrote:
>
> On Wed, Dec 7, 2022 at 10:10 AM Masahiko Sawada  wrote:
> >
> > On Wed, Dec 7, 2022 at 1:29 PM Amit Kapila  wrote:
> > >
> > > Right, but the leader will anyway exit at some point either due to an
> > > ERROR like "lost connection ... to parallel worker" or with a LOG
> > > like: "... will restart because of a parameter change" but I see your
> > > point. So, will it be better if we have a LOG message here and then
> > > proc_exit()? Do you have something else in mind for this?
> >
> > No, I was thinking that too. It's better to write a LOG message and do
> > proc_exit().
> >
> > Regarding the error "lost connection ... to parallel worker", it could
> > still happen depending on the timing even if the parallel worker
> > cleanly exits due to parameter changes, right? If so, I'm concerned
> > that it could lead to disabling the subscription unexpectedly if
> > disable_on_error is enabled.
> >
>
> If we want to avoid this then I think we have the following options
> (a) parallel apply skips checking parameter change (b) parallel worker
> won't exit on parameter change but will silently absorb the parameter
> and continue its processing; anyway, the leader will detect it and
> stop the worker for the parameter change
>
> Among these, the advantage of (b) is that it will allow reflecting the
> parameter change (that doesn't need restart) in the parallel worker.
> Do you have any better idea to deal with this?

I think (b) is better. We need to reflect the synchronous_commit
parameter also in parallel workers in the worker pool.
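
A rough sketch of how (b) could look in the parallel apply worker, assuming we
simply re-apply the settings that can change without a restart; re-applying
synchronous_commit here is my reading of what "absorb" would mean, not
something the current patch does:

    if (ConfigReloadPending)
    {
        ConfigReloadPending = false;
        ProcessConfigFile(PGC_SIGHUP);

        /* keep honoring the subscription's synchronous_commit setting */
        SetConfigOption("synchronous_commit", MySubscription->synccommit,
                        PGC_BACKEND, PGC_S_OVERRIDE);
    }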

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Support logical replication of DDLs

2022-12-07 Thread Alvaro Herrera
I think this patch is split badly.

You have:

0001 an enormous patch including some required infrastructure, plus the
DDL deparsing bits themselves.

0002 another enormous (though not as much) patch, this time for
DDL replication using the above.

0003 a bugfix for 0001, which includes changes in both the
infrastructure and the deparsing bits.

0004 test stuff for 0002.

0005 Another bugfix for 0001

0006 Another bugfix for 0001

As presented, I think it has very little chance of being reviewed
usefully.  A better way to go about this, I think, would be:

0001 - infrastructure bits to support the DDL deparsing parts (all these
new functions in ruleutils.c, sequence.c, etc).  That means, everything
(?) that's currently in your 0001 except ddl_deparse.c and friends.
Clearly there are several independent changes here; maybe it is possible
to break it down even further.  This patch or these patches should also
include the parts of 0003, 0005, 0006 that require changes outside of
ddl_deparse.c.
I expect that this patch should be fairly small.

0002 - ddl_deparse.c and its very close friends.  This should not have
any impact on places such as ruleutils.c, sequence.c, etc.  The parts of
the bugfixes (0001, 0005, 0006) that touch this code should be merged
here as well; there's no reason to have them as separate patches.  Some
test code should be here also, though it probably doesn't need to aim to
be complete.
This one is likely to be very large, but also self-contained.

0003 - ddlmessage.c and friends.  I understand that DDL-messaging is
supporting infrastructure for DDL replication; I think it should be its
own patch.  Probably include its own simple-ish test bits.
Not a very large patch.

0004 - DDL replication proper, including 0004.
Probably not a very large patch either, not sure.


Some review comments, just skimming:
- 0002 adds some functions to event_trigger.c, but that doesn't seem to
be their place.  Maybe some new file in src/backend/replication/logical
would make more sense.

- publication_deparse_ddl_command_end has a long strcmp() list; why?
Maybe change things so that it compares some object type enum instead.
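
For instance, something along these lines, assuming the CommandTag from the
EventTriggerData is available at that point (the set of tags handled below is
only illustrative, not what the patch should end up with):

    switch (trigdata->tag)
    {
        case CMDTAG_CREATE_TABLE:
        case CMDTAG_ALTER_TABLE:
            /* deparse and queue this command */
            break;
        default:
            /* not interesting for DDL replication */
            break;
    }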

- CreatePublication has a long list of command tags; is that good?
Maybe it'd be better to annotate the list in cmdtaglist.h somehow.
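
One possible shape for such an annotation, assuming cmdtaglist.h keeps roughly
its current columns (tag, textual name, event_trigger_ok, table_rewrite_ok,
rowcount) and we add one more boolean; the extra column name and its value are
made up for illustration:

    /* PG_CMDTAG(tag, textual name, event_trigger_ok, table_rewrite_ok,
     *           rowcount, ddl_replication_ok)   <-- hypothetical new column */
    PG_CMDTAG(CMDTAG_CREATE_TABLE, "CREATE TABLE", true, false, false, true)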

- The change in pg_dump's getPublications needs to be updated to 16.

- Don't "git add" src/bin/pg_waldump/logicalddlmsgdesc.c, just update
its Makefile and meson.build

- I think psql's \dRp should not have the new column at the end.
Maybe one of:
+ Name | Owner | DDL | All tables | Inserts | Updates | Deletes | Truncates | Via root
+ Name | Owner | All tables | DDL | Inserts | Updates | Deletes | Truncates | Via root
+ Name | Owner | All tables | Inserts | Updates | Deletes | Truncates | DDL | Via root
(I would not add the "s" at the end of that column title, also).


-- 
Álvaro Herrera               Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Las cosas son buenas o malas segun las hace nuestra opinión" (Lisias)




Re: moving extraUpdatedCols out of RangeTblEntry (into ModifyTable)

2022-12-07 Thread Amit Langote
On Wed, Dec 7, 2022 at 8:54 PM Amit Langote  wrote:
> Per Alvaro's advice, forking this from [1].
>
> In that thread, Tom had asked if it wouldn't be better to find a new
> place to put extraUpdatedCols [2] instead of RangeTblEntry, now that
> the permission-checking fields are no longer stored in RangeTblEntry.
>
> In [3] of the same thread, I proposed to move it into a List of
> Bitmapsets in ModifyTable, as implemented in the attached patch that I
> had been posting to that thread.
>
> The latest version of that patch is attached herewith.  I'll add this
> one to the January CF too.

Done.

https://commitfest.postgresql.org/41/4049/

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com




Directly associate primary key with user-defined tablespace

2022-12-07 Thread MichaelDBA

Hi all!

PG by default always creates the primary key in the default tablespace 
unless you specify it to use an index that is defined in a user-defined 
tablespace.


We can create indexes in user-defined tablespaces, so why can't we create
a primary key in a user-defined tablespace without having to associate 
it with an index? Something like:

ALTER TABLE myschema.mytable ADD PRIMARY KEY (akey) tablespace mytablespace;

Regards,

Michael Vitale






moving extraUpdatedCols out of RangeTblEntry (into ModifyTable)

2022-12-07 Thread Amit Langote
Per Alvaro's advice, forking this from [1].

In that thread, Tom had asked if it wouldn't be better to find a new
place to put extraUpdatedCols [2] instead of RangeTblEntry, now that
the permission-checking fields are no longer stored in RangeTblEntry.

In [3] of the same thread, I proposed to move it into a List of
Bitmapsets in ModifyTable, as implemented in the attached patch that I
had been posting to that thread.

The latest version of that patch is attached herewith.  I'll add this
one to the January CF too.

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com

[1] https://www.postgresql.org/message-id/ca+hiwqgjjdmuhdsfv-u2qhkjjt9st7xh9jxc_irsaq1taus...@mail.gmail.com
[2] https://www.postgresql.org/message-id/3098829.1658956718%40sss.pgh.pa.us
[3] https://www.postgresql.org/message-id/CA%2BHiwqEHoLgN%3DvSsaNMaHP-%2BqYPT40-ooySyrieXZHNzbSBj0w%40mail.gmail.com


v1-0001-Add-per-result-relation-extraUpdatedCols-to-Modif.patch
Description: Binary data

