Re: Perform streaming logical transactions by background workers and parallel apply
On Thu, Dec 8, 2022 at 12:42 PM Masahiko Sawada wrote:
>
> On Wed, Dec 7, 2022 at 10:03 PM houzj.f...@fujitsu.com wrote:
> >
> > > +static void
> > > +ProcessParallelApplyInterrupts(void)
> > > +{
> > > +    CHECK_FOR_INTERRUPTS();
> > > +
> > > +    if (ShutdownRequestPending)
> > > +    {
> > > +        ereport(LOG,
> > > +                (errmsg("logical replication parallel apply worker for subscription \"%s\" has finished",
> > > +                        MySubscription->name)));
> > > +
> > > +        apply_worker_clean_exit(false);
> > > +    }
> > > +
> > > +    if (ConfigReloadPending)
> > > +    {
> > > +        ConfigReloadPending = false;
> > > +        ProcessConfigFile(PGC_SIGHUP);
> > > +    }
> > > +}
> > >
> > > I personally think that we don't need to have a function to do only
> > > these few things.
> >
> > I thought that introducing a new function makes the handling of
> > worker-specific interrupts logic similar to other existing ones, like
> > ProcessWalRcvInterrupts() in walreceiver.c and
> > HandlePgArchInterrupts() in pgarch.c ...
>
> I think the difference from them is that there is only one place to
> call ProcessParallelApplyInterrupts().
>

But I feel it is better to isolate this code in a separate function. What
if we decide to extend it further by having some logic to stop workers
after reloading of config?

> > > ---
> > > server_version = walrcv_server_version(LogRepWorkerWalRcvConn);
> > > options.proto.logical.proto_version =
> > > +    server_version >= 16 ? LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM :
> > >     server_version >= 15 ? LOGICALREP_PROTO_TWOPHASE_VERSION_NUM :
> > >     server_version >= 14 ? LOGICALREP_PROTO_STREAM_VERSION_NUM :
> > >     LOGICALREP_PROTO_VERSION_NUM;
> > >
> > > Instead of always using the new protocol version, I think we can use
> > > LOGICALREP_PROTO_TWOPHASE_VERSION_NUM if the streaming is not
> > > 'parallel'. That way, we don't need to change the protocol version
> > > check logic in pgoutput.c and don't need to expose
> > > defGetStreamingMode(). What do you think?
> >
> > I think that some user can also use the new version number when trying
> > to get changes (via pg_logical_slot_peek_binary_changes or other
> > functions), so I feel leaving the check for the new version number
> > seems fine.
> >
> > Besides, I feel even if we don't use the new version number, we still
> > need to use defGetStreamingMode to check if parallel mode is in use,
> > as we need to send abort_lsn when parallel is in use. I might be
> > missing something, sorry for that. Can you please explain the idea a
> > bit?
>
> My idea is that we use LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM if
> (server_version >= 16 && MySubscription->stream == SUBSTREAM_PARALLEL).
> If the stream is SUBSTREAM_ON, we use
> LOGICALREP_PROTO_TWOPHASE_VERSION_NUM even if server_version is 16.
> That way, in pgoutput.c, we can send abort_lsn if the protocol version
> is LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM. We don't need to send
> "streaming = parallel" to the publisher since the publisher can decide
> whether or not to send abort_lsn based on the protocol version (still
> needs to send "streaming = on" though). I might be missing something.
>

What if we decide to send some more additional information as part of
another patch like we are discussing in the thread [1]? Now, we won't be
able to decide the version number based on just the streaming option. Also,
in such a case, even for LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM, it
may not be a good idea to send additional abort information unless the user
has used the streaming=parallel option.

[1] - https://www.postgresql.org/message-id/CAGPVpCRWEVhXa7ovrhuSQofx4to7o22oU9iKtrOgAOtz_%3DY6vg%40mail.gmail.com

--
With Regards,
Amit Kapila.
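[Editor's note: the version-selection logic being debated above can be sketched as follows. This is an illustrative sketch, not the patch itself: the constant names follow the PostgreSQL source, but the real code compares against full version numbers (e.g. 160000) rather than bare major versions, and the decision of whether to fold the streaming mode into the choice is exactly what is under discussion.]

```c
#include <assert.h>
#include <stdbool.h>

/* Protocol version constants, as named in src/include/replication/logicalproto.h */
#define LOGICALREP_PROTO_VERSION_NUM                 1
#define LOGICALREP_PROTO_STREAM_VERSION_NUM          2
#define LOGICALREP_PROTO_TWOPHASE_VERSION_NUM        3
#define LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM 4

/*
 * Sketch of Sawada-san's proposal: request the parallel-streaming protocol
 * only when the subscription actually uses streaming = parallel; otherwise
 * fall back to the two-phase protocol even against a v16 publisher.
 * server_version here is a plain major version for illustration.
 */
static int
choose_proto_version(int server_version, bool stream_parallel)
{
    if (server_version >= 16 && stream_parallel)
        return LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM;
    if (server_version >= 15)
        return LOGICALREP_PROTO_TWOPHASE_VERSION_NUM;
    if (server_version >= 14)
        return LOGICALREP_PROTO_STREAM_VERSION_NUM;
    return LOGICALREP_PROTO_VERSION_NUM;
}
```

Under this scheme the publisher can infer "send abort_lsn" purely from the negotiated protocol version, which is the property the proposal relies on.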
Re: postgres_fdw: batch inserts vs. before row triggers
On Fri, Dec 2, 2022 at 4:54 PM Etsuro Fujita wrote:
> On Sun, Nov 27, 2022 at 12:11 AM Tom Lane wrote:
> > OK, as long as there's a reason for doing it that way, it's OK
> > by me.  I don't think that adding a field at the end of EState
> > is an ABI problem.
> >
> > We have to do something else than add to ResultRelInfo, though.
>
> OK, I removed from ResultRelInfo a field that I added in the commit to
> save the owning ModifyTableState if insert-pending, and added to
> EState another List member to save such ModifyTableStates, instead.  I
> am planning to apply this to not only back branches but HEAD, to make
> back-patching easy, if there are no objections.

There seems to be no objection, so pushed.

Best regards,
Etsuro Fujita
Re: Perform streaming logical transactions by background workers and parallel apply
On Wed, Dec 7, 2022 at 10:03 PM houzj.f...@fujitsu.com wrote:
>
> On Wednesday, December 7, 2022 7:51 PM Masahiko Sawada wrote:
> >
> > On Mon, Dec 5, 2022 at 1:29 PM houzj.f...@fujitsu.com wrote:
> > >
> > > On Sunday, December 4, 2022 7:17 PM houzj.f...@fujitsu.com
> > > >
> > > > > Thursday, December 1, 2022 8:40 PM Amit Kapila wrote:
> > > > > Some other comments:
> > > > ...
> > > > Attach the new version patch set which addressed most of the
> > > > comments received so far except some comments being discussed[1].
> > > > [1] https://www.postgresql.org/message-id/OS0PR01MB57167BF64FC0891734C8E81A94149%40OS0PR01MB5716.jpnprd01.prod.outlook.com
> > >
> > > Attach a new version patch set which fixed a testcase failure on
> > > CFbot.
> >
> > Here are some comments on v56 0001, 0002 patches. Please ignore
> > comments if you already incorporated them in v57.
>
> Thanks for the comments!
>
> > +static void
> > +ProcessParallelApplyInterrupts(void)
> > +{
> > +    CHECK_FOR_INTERRUPTS();
> > +
> > +    if (ShutdownRequestPending)
> > +    {
> > +        ereport(LOG,
> > +                (errmsg("logical replication parallel apply worker for subscription \"%s\" has finished",
> > +                        MySubscription->name)));
> > +
> > +        apply_worker_clean_exit(false);
> > +    }
> > +
> > +    if (ConfigReloadPending)
> > +    {
> > +        ConfigReloadPending = false;
> > +        ProcessConfigFile(PGC_SIGHUP);
> > +    }
> > +}
> >
> > I personally think that we don't need to have a function to do only
> > these few things.
>
> I thought that introducing a new function makes the handling of
> worker-specific interrupts logic similar to other existing ones, like
> ProcessWalRcvInterrupts() in walreceiver.c and HandlePgArchInterrupts()
> in pgarch.c ...

I think the difference from them is that there is only one place to call
ProcessParallelApplyInterrupts().

> > Should we change the names to something like
> > LOGICALREP_STREAM_PARALLEL?
>
> Agreed, will change.

> > ---
> > + * The lock graph for the above example will look as follows:
> > + * LA (waiting to acquire the lock on the unique index) -> PA (waiting to
> > + * acquire the lock on the remote transaction) -> LA
> >
> > and
> >
> > + * The lock graph for the above example will look as follows:
> > + * LA (waiting to acquire the transaction lock) -> PA-2 (waiting to acquire the
> > + * lock due to unique index constraint) -> PA-1 (waiting to acquire the stream
> > + * lock) -> LA
> >
> > "(waiting to acquire the lock on the remote transaction)" in the first
> > example and "(waiting to acquire the stream lock)" in the second
> > example have the same meaning, right? If so, I think we should use
> > either term for consistency.
>
> Will change.
>
> > ---
> > +    bool    write_abort_info = (data->streaming == SUBSTREAM_PARALLEL);
> >
> > I think that instead of setting write_abort_info every time when
> > pgoutput_stream_abort() is called, we can set it once, probably in
> > PGOutputData, at startup.
>
> I thought that since we already have a "stream" flag in PGOutputData, I
> am not sure if it would be better to introduce another flag for the same
> option.

I see your point. Another way is to have it as a static variable like
publish_no_origin. But since it's a trivial change I'm fine also with the
current code.

> > ---
> > server_version = walrcv_server_version(LogRepWorkerWalRcvConn);
> > options.proto.logical.proto_version =
> > +    server_version >= 16 ? LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM :
> >     server_version >= 15 ? LOGICALREP_PROTO_TWOPHASE_VERSION_NUM :
> >     server_version >= 14 ? LOGICALREP_PROTO_STREAM_VERSION_NUM :
> >     LOGICALREP_PROTO_VERSION_NUM;
> >
> > Instead of always using the new protocol version, I think we can use
> > LOGICALREP_PROTO_TWOPHASE_VERSION_NUM if the streaming is not
> > 'parallel'. That way, we don't need to change the protocol version
> > check logic in pgoutput.c and don't need to expose
> > defGetStreamingMode(). What do you think?
>
> I think that some user can also use the new version number when trying
> to get changes (via pg_logical_slot_peek_binary_changes or other
> functions), so I feel leaving the check for the new version number seems
> fine.
>
> Besides, I feel even if we don't use the new version number, we still
> need to use defGetStreamingMode to check if parallel mode is in use, as
> we need to send abort_lsn when parallel is in use. I might be missing
> something, sorry for that. Can you please explain the idea a bit?

My idea is that we use LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM if
(server_version >= 16 && MySubscription->stream == SUBSTREAM_PARALLEL).
If the stream is SUBSTREAM_ON, we use
LOGICALREP_PROTO_TWOPHASE_VERSION_NUM even if server_version is 16. That
way, in pgoutput.c, we can send abort_lsn if the protocol version is
LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM. We don't need to send
"streaming = parallel" to the publisher since the publisher can decide
whether or not to send abort_lsn based on the protocol version (still
needs to send "streaming = on" though). I might be missing something.
Re: Perform streaming logical transactions by background workers and parallel apply
On Thu, Dec 8, 2022 at 1:52 PM Amit Kapila wrote:
>
> On Wed, Dec 7, 2022 at 6:33 PM houzj.f...@fujitsu.com wrote:
> >
> > On Wednesday, December 7, 2022 7:51 PM Masahiko Sawada wrote:
> > >
> > > ---
> > > When max_parallel_apply_workers_per_subscription is changed to a
> > > value lower than the number of parallel workers running at that
> > > time, do we need to stop extra workers?
> >
> > I think we can do this, like adding a check in the main loop of the
> > leader worker, and check every time after reloading the conf. OTOH, we
> > will also stop the worker after finishing a transaction, so I am
> > slightly not sure whether we need to add another check here. But I am
> > fine to add it if you think it would be better.
>
> I think this is tricky because it is possible that all active workers
> are busy with long-running transactions, so, I think stopping them
> doesn't make sense.

Right, we should not stop running parallel workers.

> I think as long as we are freeing them after use
> it seems okay to me. OTOH, each time after finishing the transaction,
> we can stop the workers, if the workers in the free pool exceed
> 'max_parallel_apply_workers_per_subscription'.

Or the apply leader worker can check that after reloading the config file.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
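[Editor's note: the "trim the free pool, leave busy workers alone" idea discussed above can be sketched as follows. The pool structure, field names, and the trim function are hypothetical illustrations; the real patch would stop workers via the launcher machinery (e.g. logicalrep_worker_stop()), not a counter.]

```c
#include <assert.h>

/* Hypothetical free pool of idle parallel apply workers. */
typedef struct WorkerPool
{
    int nidle;      /* workers currently sitting idle in the pool */
    int nstopped;   /* workers we have asked to exit (for illustration) */
} WorkerPool;

/*
 * Called after a transaction finishes (or after the leader reloads its
 * configuration): stop idle workers until the pool no longer exceeds
 * max_parallel_apply_workers_per_subscription.  Workers busy with
 * long-running transactions are deliberately left alone, per the
 * discussion above.
 */
static void
trim_free_pool(WorkerPool *pool, int max_workers)
{
    while (pool->nidle > max_workers)
    {
        pool->nidle--;
        pool->nstopped++;   /* stand-in for actually stopping a worker */
    }
}
```

This keeps the GUC an upper bound on pooled (idle) workers without ever interrupting an in-progress transaction.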
Re: WAL Insertion Lock Improvements (was: Re: Avoid LWLockWaitForVar() for currently held WAL insertion lock in WaitXLogInsertionsToFinish())
On Tue, Dec 6, 2022 at 12:00 AM Andres Freund wrote:
>
> FWIW, I don't see an advantage in 0003. If it allows us to make
> something else simpler / faster, cool, but on its own it doesn't seem
> worthwhile.

Thanks. I will discard it.

> I think it'd be safe to optimize LWLockConflictsWithVar(), due to some
> pre-existing, quite crufty, code. LWLockConflictsWithVar() says:
>
>  * Test first to see if it the slot is free right now.
>  *
>  * XXX: the caller uses a spinlock before this, so we don't need a memory
>  * barrier here as far as the current usage is concerned. But that might
>  * not be safe in general.
>
> which happens to be true in the single, indirect, caller:
>
>  /* Read the current insert position */
>  SpinLockAcquire(&Insert->insertpos_lck);
>  bytepos = Insert->CurrBytePos;
>  SpinLockRelease(&Insert->insertpos_lck);
>  reservedUpto = XLogBytePosToEndRecPtr(bytepos);
>
> I think at the very least we ought to have a comment in
> WaitXLogInsertionsToFinish() highlighting this.

So, using a spinlock ensures no memory reordering occurs while reading
lock->state in LWLockConflictsWithVar()? How does the spinlock that gets
acquired and released in the caller WaitXLogInsertionsToFinish() itself
relate to the memory barrier in the called function
LWLockConflictsWithVar()? Can you please help me understand this a bit?

> It's not at all clear to me that the proposed fast-path for
> LWLockUpdateVar() is safe. I think at the very least we could end up
> missing waiters that we should have woken up.
>
> I think it ought to be safe to do something like
>
> pg_atomic_exchange_u64()..
> if (!(pg_atomic_read_u32(&lock->state) & LW_FLAG_HAS_WAITERS))
>     return;

pg_atomic_exchange_u64(&lock->state, exchange_with_what_?). Exchange will
change the value, no?

> because the pg_atomic_exchange_u64() will provide the necessary memory
> barrier.

I'm reading some comments [1]; are these also true for 64-bit atomic CAS?
Does it mean that an atomic CAS operation inherently provides a memory
barrier? Can you please point me to where this is described better, if
somewhere else?

[1]
 * Full barrier semantics.
 */
static inline uint32
pg_atomic_exchange_u32(volatile pg_atomic_uint32 *ptr,

/*
 * Get and clear the flags that are set for this backend. Note that
 * pg_atomic_exchange_u32 is a full barrier, so we're guaranteed that the
 * read of the barrier generation above happens before we atomically
 * extract the flags, and that any subsequent state changes happen
 * afterward.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
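[Editor's note: the fast-path pattern Andres sketches can be illustrated with standard C11 atomics. This is not PostgreSQL's pg_atomic API, and DemoLock is a made-up structure; the point is only that a full-barrier exchange followed by a waiter-flag check cannot miss a waiter that set the flag before the value was published.]

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define LW_FLAG_HAS_WAITERS (1u << 30)  /* illustrative flag bit */

typedef struct
{
    _Atomic uint64_t insert_var;   /* the protected "insertingAt" value */
    _Atomic uint32_t state;        /* lock state word carrying the waiter flag */
    int              wakeups;      /* count of (simulated) wakeups */
} DemoLock;

/*
 * Sketch of the proposed LWLockUpdateVar() fast path: publish the new
 * value with a sequentially-consistent (full-barrier) atomic exchange,
 * then check the waiter flag.  Because the exchange is a full barrier,
 * the flag read cannot be reordered before the value store, so any
 * waiter that set LW_FLAG_HAS_WAITERS before we stored the value is
 * guaranteed to be observed and woken.
 */
static void
update_var(DemoLock *lock, uint64_t val)
{
    atomic_exchange(&lock->insert_var, val);    /* seq_cst: full barrier */

    if (!(atomic_load(&lock->state) & LW_FLAG_HAS_WAITERS))
        return;                                 /* fast path: nobody to wake */

    lock->wakeups++;                            /* stand-in for waking waiters */
}
```

A plain store plus plain load would not give this guarantee; the exchange is what makes the "return early when no waiters" shortcut safe.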
Re: ANY_VALUE aggregate
On Wed, Dec 7, 2022 at 10:00 PM Vik Fearing wrote:
> On 12/7/22 04:22, David G. Johnston wrote:
> > On Mon, Dec 5, 2022 at 10:40 PM Vik Fearing wrote:
> > > On 12/6/22 05:57, David G. Johnston wrote:
> > > > On Mon, Dec 5, 2022 at 9:48 PM Vik Fearing wrote:
> > > > > I can imagine an optimization that would remove an ORDER BY
> > > > > clause because it isn't needed for any other aggregate.
> > > >
> > > > I'm referring to the query:
> > > >
> > > > select any_value(v order by v) from (values (2),(1),(3)) as vals (v);
> > > > // produces 1, per the documented implementation-defined behavior.
> > >
> > > Implementation-dependent. It is NOT implementation-defined, per spec.
> >
> > I really don't care all that much about the spec here given that ORDER
> > BY in an aggregate call is non-spec.
>
> Well, this is demonstrably wrong.
>
> <array aggregate function> ::=
>     ARRAY_AGG <left paren> <value expression>
>     [ ORDER BY <sort specification list> ] <right paren>

Demoable only by you and a few others... We should update our
documentation - the source of SQL Standard knowledge for mere mortals.

https://www.postgresql.org/docs/current/sql-expressions.html#SYNTAX-AGGREGATES

"Note: The ability to specify both DISTINCT and ORDER BY in an aggregate
function is a PostgreSQL extension."

Apparently only DISTINCT remains as our extension.

> > You are de-facto creating a first_value aggregate (which is by
> > definition non-standard) whether you like it or not.
>
> I am de jure creating an any_value aggregate (which is by definition
> standard) whether you like it or not.

Yes, both statements seem true. At least until we decide to start ignoring
a user's explicit ORDER BY clause.

> > > If you care about which value you get back, use something else.
> >
> > There isn't a "something else" to use so that isn't presently an
> > option.
>
> The query
>
>     SELECT proposed_first_value(x ORDER BY y) FROM ...
>
> is equivalent to
>
>     SELECT (ARRAY_AGG(x ORDER BY y))[1] FROM ...
>
> so I am not very sympathetic to your claim of "no other option".

Semantically, yes; in terms of performance, not so much, for any
non-trivially sized group.

I'm done, and apologize for getting too emotionally invested in this. I
hope to get others to voice enough +1s to get a first_value function into
core alongside this one (which makes the above discussion either moot or
deferred until there is a concrete use case for ignoring an explicit ORDER
BY). If that doesn't happen, well, it isn't going to make or break us
either way.

David J.
Re: PGDOCS - Logical replication GUCs - added some xrefs
On Thu, Dec 8, 2022 at 10:49 AM samay sharma wrote:
> ...
> Thanks for the changes. See a few points of feedback below.

Patch v7 addresses this feedback. PSA.

> > +
> > + For logical replication, publishers
> > + (servers that do linkend="sql-createpublication">CREATE PUBLICATION)
> > + replicate data to subscribers
> > + (servers that do linkend="sql-createsubscription">CREATE SUBSCRIPTION).
> > + Servers can also be publishers and subscribers at the same time. Note,
> > + the following sections refers to publishers as "senders". The parameter
> > + max_replication_slots has a different meaning for the
> > + publisher and subscriber, but all other parameters are relevant only to
> > + one side of the replication. For more details about logical replication
> > + configuration settings refer to
> > + .
> > +
>
> The second last line seems a bit odd here. In my last round of feedback,
> I had meant to add the line "The parameter " onwards to the top of
> logical-replication-config.sgml.
>
> What if we made the top of logical-replication-config.sgml like below?
>
> Logical replication requires several configuration options to be set.
> Most configuration options are relevant only on one side of the
> replication (i.e. publisher or subscriber). However,
> max_replication_slots is applicable on both sides but has different
> meanings on each side.

OK. Moving this note is not quite following the same pattern as the
"streaming replication" intro blurb, but anyway it looks fine when moved,
so I've done as suggested.

> > OK. I copied the tablesync note back to the config.sgml definition of
> > 'max_replication_slots' and removed the link as suggested. Frankly, I
> > also thought it is a bit strange that the max_replication_slots in the
> > "Sending Servers" section was describing this parameter for
> > "Subscribers". OTOH, I did not want to split the definition in half, so
> > instead I've added another Subscriber entry that just refers back to
> > this place. It looks like an improvement to me.
>
> Hmm, I agree this is a tricky scenario. However, to me, it seems odd to
> mention the parameter twice as this chapter of the docs just lists each
> parameter and describes them. So, I'd probably remove the reference to
> it in the subscriber section. We should describe its usage in different
> places in the logical replication part of the docs (as we do).

The 'max_replication_slots' is problematic because it is almost like
having 2 different GUCs that happen to have the same name. So I preferred
that it also gets a mention in the "Subscriber" section to make it obvious
that it wears 2 hats, but IIUC you prefer that the 2nd mention not be
present, because typically each GUC should appear only once in this
chapter. TBH, I think both ways could be successfully argued for or
against -- so I'm just going to leave this as-is for now and let the
committer decide.

> > +
> > + linkend="guc-max-logical-replication-workers">max_logical_replication_workers
> > + must be set to at least the number of subscriptions (for apply workers), plus
> > + some reserve for the table synchronization workers. Configuration parameter
> > + linkend="guc-max-worker-processes">max_worker_processes
> > + may need to be adjusted to accommodate replication workers, at least (
> > + linkend="guc-max-logical-replication-workers">max_logical_replication_workers
> > + + 1). Note, some extensions and parallel queries also
> > + take worker slots from max_worker_processes.
> > +
>
> Maybe do max_worker_processes in a new line like the rest.

OK. Done as suggested.

--
Kind Regards,
Peter Smith.
Fujitsu Australia

v7-0001-Logical-replication-GUCs-links-and-tidy.patch
Description: Binary data
Re: add \dpS to psql
Nathan Bossart writes:
> On Wed, Dec 07, 2022 at 11:48:20PM -0500, Isaac Morland wrote:
> > My previous analysis shows that there is no vast hidden demand for new
> > privilege bits. If we implement MAINTAIN to control access to VACUUM,
> > ANALYZE, REFRESH, CLUSTER, and REINDEX, we will cover everything that
> > I can find that has been seriously discussed on this list, and still
> > leave 3 unused bits for future expansion.

> If we added CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX as
> individual privilege bits, we'd still have 13 remaining for future use.

I think the appropriate question is not "have we still got bits left?".
It should be more like "under what plausible scenario would it be useful
to grant somebody CLUSTER but not VACUUM privileges on a table?".
I'm really thinking that MAINTAIN is the right level of granularity here.
Or maybe it's worth segregating exclusive-lock from not-exclusive-lock
maintenance. But I really fail to see how it's useful to distinguish
CLUSTER from REINDEX, say.

            regards, tom lane
Re: add \dpS to psql
On Thu, 8 Dec 2022 at 00:07, Nathan Bossart wrote:
> On Wed, Dec 07, 2022 at 11:48:20PM -0500, Isaac Morland wrote:
> > For what it's worth, I wouldn't bother changing the format of the
> > permission bits to expand the pool of available bits.
>
> 7b37823 expanded AclMode to 64 bits, so we now have room for 16
> additional privileges (after the addition of VACUUM and ANALYZE in
> b5d6382).
>
> > My previous analysis shows that there is no vast hidden demand for new
> > privilege bits. If we implement MAINTAIN to control access to VACUUM,
> > ANALYZE, REFRESH, CLUSTER, and REINDEX, we will cover everything that
> > I can find that has seriously discussed on this list, and still leave
> > 3 unused bits for future expansion.
>
> If we added CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX as
> individual privilege bits, we'd still have 13 remaining for future use.

I was a bit imprecise. I was comparing to the state before the recent
changes - so 12 bits used out of 16, with MAINTAIN being the 13th bit. I
think in my mind it's still approximately 2019 on some level.
Re: RFC: Logging plan of the running query
On 2022-12-07 03:41, Andres Freund wrote:
> Hi,
>
> This patch does not currently build, due to a conflicting oid:
>
> I suggest you choose a random oid out of the "development purposes"
> range:

Thanks for your advice! Attached updated patch.

BTW, since this patch depends on ProcessInterrupts() and the EXPLAIN code
which is used in auto_explain, I'm feeling that the following discussion
also applies to this patch.

--
https://www.postgresql.org/message-id/CA%2BTgmoYW_rSOW4JMQ9_0Df9PKQ%3DsQDOKUGA4Gc9D8w4wui8fSA%40mail.gmail.com

> explaining a query is a pretty complicated operation that involves
> catalog lookups and lots of complicated stuff, and I don't think that
> it would be safe to do all of that at any arbitrary point in the code
> where ProcessInterrupts() happened to fire.

If I can't come up with some workaround during the next Commitfest, I'm
going to withdraw this proposal.

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION

From a0d2179826a0fa224eaf37ca00d14954b76fde6b Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi
Date: Thu, 8 Dec 2022 12:42:22 +0900
Subject: [PATCH v25] Add function to log the plan of the query currently
 running on the backend.

Currently, we have to wait for the query execution to finish to check its
plan. This is not so convenient when investigating long-running queries on
production environments where we cannot use debuggers. To improve this
situation, this patch adds pg_log_query_plan() function that requests to
log the plan of the specified backend process.

By default, only superusers are allowed to request to log the plans
because allowing any users to issue this request at an unbounded rate
would cause lots of log messages, which can lead to denial of service.

On receipt of the request, at the next CHECK_FOR_INTERRUPTS(), the target
backend logs its plan at LOG_SERVER_ONLY level, so that these plans will
appear in the server log but not be sent to the client.
Reviewed-by: Bharath Rupireddy, Fujii Masao, Dilip Kumar, Masahiro Ikeda, Ekaterina Sokolova, Justin Pryzby, Kyotaro Horiguchi, Robert Treat, Alena Rybakina --- doc/src/sgml/func.sgml | 49 +++ src/backend/access/transam/xact.c| 13 ++ src/backend/catalog/system_functions.sql | 2 + src/backend/commands/explain.c | 139 ++- src/backend/executor/execMain.c | 14 ++ src/backend/storage/ipc/procsignal.c | 4 + src/backend/storage/lmgr/lock.c | 9 +- src/backend/tcop/postgres.c | 4 + src/backend/utils/init/globals.c | 2 + src/include/catalog/pg_proc.dat | 6 + src/include/commands/explain.h | 3 + src/include/miscadmin.h | 1 + src/include/storage/lock.h | 2 - src/include/storage/procsignal.h | 1 + src/include/tcop/pquery.h| 2 +- src/test/regress/expected/misc_functions.out | 54 +-- src/test/regress/sql/misc_functions.sql | 41 -- 17 files changed, 318 insertions(+), 28 deletions(-) diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index e57ffce971..d39a4e50a0 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -25536,6 +25536,25 @@ SELECT collation for ('foo' COLLATE "de_DE"); + + + + pg_log_query_plan + +pg_log_query_plan ( pid integer ) +boolean + + +Requests to log the plan of the query currently running on the +backend with specified process ID. +It will be logged at LOG message level and +will appear in the server log based on the log +configuration set (See +for more information), but will not be sent to the client +regardless of . + + + @@ -25756,6 +25775,36 @@ LOG: Grand total: 1651920 bytes in 201 blocks; 622360 free (88 chunks); 1029560 because it may generate a large number of log messages. + +pg_log_query_plan can be used +to log the plan of a backend process. For example: + +postgres=# SELECT pg_log_query_plan(201116); + pg_log_query_plan +--- + t +(1 row) + +The format of the query plan is the same as when VERBOSE, +COSTS, SETTINGS and +FORMAT TEXT are used in the EXPLAIN +command. 
For example: + +LOG: query plan running on backend with PID 201116 is: +Query Text: SELECT * FROM pgbench_accounts; +Seq Scan on public.pgbench_accounts (cost=0.00..52787.00 rows=200 width=97) + Output: aid, bid, abalance, filler +Settings: work_mem = '1MB' + +Note that when statements are executed inside a function, only the +plan of the most deeply nested query is logged. + + + +When a subtransaction is aborted, pg_log_query_plan +cannot log the query plan after the abort. + + diff --git
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
In the pg_stat_statements docs, there are several column descriptions like
"Total number of ... by the statement". You added an additional sentence to
some of them describing the equivalent pg_stat_io values, but you only
added a period to the previous sentence for shared_blks_read (for the other
columns, the additional description just follows directly). These should be
consistent.

Otherwise, the docs look good to me.
Re: add \dpS to psql
On Wed, Dec 07, 2022 at 11:48:20PM -0500, Isaac Morland wrote:
> For what it's worth, I wouldn't bother changing the format of the
> permission bits to expand the pool of available bits.

7b37823 expanded AclMode to 64 bits, so we now have room for 16 additional
privileges (after the addition of VACUUM and ANALYZE in b5d6382).

> My previous analysis shows that there is no vast hidden demand for new
> privilege bits. If we implement MAINTAIN to control access to VACUUM,
> ANALYZE, REFRESH, CLUSTER, and REINDEX, we will cover everything that I
> can find that has seriously discussed on this list, and still leave 3
> unused bits for future expansion.

If we added CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX as individual
privilege bits, we'd still have 13 remaining for future use.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
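[Editor's note: the bit arithmetic in this exchange can be sketched as follows. The layout (lower half of the 64-bit AclMode for privileges, upper half for the matching grant-option bits) mirrors the design after 7b37823; the ACL_MAINTAIN bit and the counting helper are hypothetical illustrations, since no such privilege exists at this point in the thread.]

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t AclMode;

/*
 * After commit 7b37823, AclMode is 64 bits: the lower 32 bits hold
 * privilege bits and the upper 32 bits hold the corresponding
 * grant-option bits, so there is room for 32 distinct privileges.
 */
#define ACL_NUM_RIGHTS_USED   16   /* through VACUUM and ANALYZE (b5d6382) */
#define ACL_NUM_RIGHTS_MAX    32

/* Hypothetical single umbrella bit, per Tom Lane's MAINTAIN suggestion. */
#define ACL_MAINTAIN  ((AclMode) 1 << ACL_NUM_RIGHTS_USED)

/* The grant option for a privilege lives 32 bits above it. */
#define ACL_GRANT_OPTION_FOR(priv)  ((AclMode) (priv) << 32)

static int
remaining_privilege_bits(int used)
{
    return ACL_NUM_RIGHTS_MAX - used;
}
```

This reproduces the numbers in the emails: 16 bits free today, 13 free if CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX each get their own bit, and 15 free if a single MAINTAIN bit covers them all.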
Re: Support logical replication of DDLs
On Wed, 7 Dec 2022 at 17:50, Alvaro Herrera wrote:
>
> I think this patch is split badly.
>
> You have:
>
> 0001 an enormous patch including some required infrastructure, plus the
> DDL deparsing bits themselves.
>
> 0002 another enormous (though not as much) patch, this time for
> DDL replication using the above.
>
> 0003 a bugfix for 0001, which includes changes in both the
> infrastructure and the deparsing bits.
>
> 0004 test stuff for 0002.
>
> 0005 Another bugfix for 0001
>
> 0006 Another bugfix for 0001
>
> As presented, I think it has very little chance of being reviewed
> usefully. A better way to go about this, I think, would be:
>
> 0001 - infrastructure bits to support the DDL deparsing parts (all these
> new functions in ruleutils.c, sequence.c, etc). That means, everything
> (?) that's currently in your 0001 except ddl_deparse.c and friends.
> Clearly there are several independent changes here; maybe it is possible
> to break it down even further. This patch or these patches should also
> include the parts of 0003, 0005, 0006 that require changes outside of
> ddl_deparse.c.
> I expect that this patch should be fairly small.
>
> 0002 - ddl_deparse.c and its very close friends. This should not have
> any impact on places such as ruleutils.c, sequence.c, etc. The parts of
> the bugfixes (0001, 0005, 0006) that touch this code should be merged
> here as well; there's no reason to have them as separate patches. Some
> test code should be here also, though it probably doesn't need to aim to
> be complete.
> This one is likely to be very large, but also self-contained.
>
> 0003 - ddlmessage.c and friends. I understand that DDL-messaging is
> supporting infrastructure for DDL replication; I think it should be its
> own patch. Probably include its own simple-ish test bits.
> Not a very large patch.
>
> 0004 - DDL replication proper, including 0004.
> Probably not a very large patch either, not sure.
>
> Some review comments, just skimming:
>
> - 0002 adds some functions to event_trigger.c, but that doesn't seem to
> be their place. Maybe some new file in src/backend/replication/logical
> would make more sense.
>
> - publication_deparse_ddl_command_end has a long strcmp() list; why?
> Maybe change things so that it compares some object type enum instead.
>
> - CreatePublication has a long list of command tags; is that good?
> Maybe it'd be better to annotate the list in cmdtaglist.h somehow.
>
> - The change in pg_dump's getPublications needs to be updated to 16.
>
> - Don't "git add" src/bin/pg_waldump/logicalddlmsgdesc.c, just update
> its Makefile and meson.build
>
> - I think psql's \dRp should not have the new column at the end.
> Maybe one of:
> + Name | Owner | DDL | All tables | Inserts | Updates | Deletes | Truncates | Via root
> + Name | Owner | All tables | DDL | Inserts | Updates | Deletes | Truncates | Via root
> + Name | Owner | All tables | Inserts | Updates | Deletes | Truncates | DDL | Via root
> (I would not add the "s" at the end of that column title, also).

Thanks for the comments, these will make reviewing the patch easier.
There are a couple of review comments [1] and [2] which are spread across
the code; it will be difficult to handle them after restructuring the
patch, as those comments touch many places in the patch. So we will handle
[1] and [2] first and then work on the restructuring suggested by you.

[1] - https://www.postgresql.org/message-id/CAHut%2BPsERMFwO8oK3LFH_3CRG%2B512T%2Bay_viWzrgNetbH2MwxA%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CAHut%2BPuxo_kq2toicNK_BQdeccK3REGW-Xv8tVauFvTNku6V-w%40mail.gmail.com

Regards,
Vignesh
Re: ANY_VALUE aggregate
On 12/7/22 04:22, David G. Johnston wrote: On Mon, Dec 5, 2022 at 10:40 PM Vik Fearing wrote: On 12/6/22 05:57, David G. Johnston wrote: On Mon, Dec 5, 2022 at 9:48 PM Vik Fearing wrote: I can imagine an optimization that would remove an ORDER BY clause because it isn't needed for any other aggregate. I'm referring to the query: select any_value(v order by v) from (values (2),(1),(3)) as vals (v); // produces 1, per the documented implementation-defined behavior. Implementation-dependent. It is NOT implementation-defined, per spec. I really don't care all that much about the spec here given that ORDER BY in an aggregate call is non-spec. Well, this is demonstrably wrong. ::= ARRAY_AGG [ ORDER BY ] We often loosen the spec rules when they don't make technical sense to us, but I don't know of any example of when we have tightened them. The function has to choose some row from among its inputs, True. and the system has to obey an order by specification added to the function call. False. You are de-facto creating a first_value aggregate (which is by definition non-standard) whether you like it or not. I am de jure creating an any_value aggregate (which is by definition standard) whether you like it or not. I'm just saying to be upfront and honest about it - our users do want such a capability so maybe accept that there is a first time for everything. Not that picking an advantageous "implementation-dependent" implementation should be considered deviating from the spec. Someone writing: select any_value(v) from (values (2),(1),(3)) as vals (v) order by v; Is not presently, nor am I saying, promised the value 1. I'm assuming you are thinking of the second query form, while the guarantee only needs to apply to the first. I am saying that a theoretical pg_aggregate.aggorderdoesnotmatter could bestow upon ANY_VALUE the ability to make those two queries equivalent. That theoretical idea should not be entertained. 
Removing a user's explicitly added ORDER BY should be off-limits. Any approach at optimization here should simply look at whether an ORDER BY is specified and pass that information to the function. If the function itself really believes that ordering matters it can emit its own runtime exception stating that fact and the user can fix their query. It absolutely should be entertained, and I plan on doing so in an upcoming thread. Whether it errors or ignores is something that should be discussed on that thread. If you care about which value you get back, use something else. There isn't a "something else" to use so that isn't presently an option. The query SELECT proposed_first_value(x ORDER BY y) FROM ... is equivalent to SELECT (ARRAY_AGG(x ORDER BY y))[1] FROM ... so I am not very sympathetic to your claim of "no other option". I suppose it comes down to what level of belief and care you have that people will simply mis-use this function if it is added in its current form to get the desired first_value effect that it produces. People who rely on explicitly undefined behavior get what they deserve when the implementation changes. -- Vik Fearing
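The equivalence Vik claims between the proposed first_value call and indexing into ARRAY_AGG can be sanity-checked outside SQL. A minimal Python sketch (function names hypothetical, modeling the two aggregate calls over the same unordered input):

```python
# Model rows as (x, y) pairs; both "aggregates" see the same multiset of rows.
rows = [(20, 2), (10, 1), (30, 3)]

def proposed_first_value(rows, order_key):
    # first_value(x ORDER BY y): the x belonging to the smallest y.
    return min(rows, key=order_key)[0]

def array_agg_first(rows, order_key):
    # (ARRAY_AGG(x ORDER BY y))[1]: build the sorted array, take element 1.
    return sorted(rows, key=order_key)[0][0]

# Both forms deterministically yield the same value, 10 here.
assert proposed_first_value(rows, lambda r: r[1]) \
    == array_agg_first(rows, lambda r: r[1]) == 10
```

The point of the sketch is only that once an ORDER BY is honored, the "pick any row" aggregate degenerates into "pick the first row", which the ARRAY_AGG form already expresses in standard SQL.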
Re: Avoid streaming the transaction which are skipped (in corner cases)
On Wed, Dec 7, 2022 at 9:35 AM shiy.f...@fujitsu.com wrote: > > On Wed, Dec 7, 2022 12:01 PM Dilip Kumar wrote: > > > > > > > > Thanks. I think it is better to go ahead with this patch and once we > > > decide what is the right thing to do in terms of GUC then we can try > > > to add additional tests for this. Anyway, it is not that the code > > > added by this patch is not getting covered by existing tests. What do > > > you think? > > > > That makes sense to me. > > > > +1 > Pushed. -- With Regards, Amit Kapila.
Re: Perform streaming logical transactions by background workers and parallel apply
On Wed, Dec 7, 2022 at 6:33 PM houzj.f...@fujitsu.com wrote: > > On Wednesday, December 7, 2022 7:51 PM Masahiko Sawada > wrote: > > > > > --- > > When max_parallel_apply_workers_per_subscription is changed to a value > > lower than the number of parallel workers running at that time, do we > > need to stop extra workers? > > I think we can do this, like adding a check in the main loop of the leader > worker, and > checking every time after reloading the conf. OTOH, we will also stop the worker > after > finishing a transaction, so I am slightly unsure whether we need to add another > check here. > But I am fine with adding it if you think it would be better. > I think this is tricky because it is possible that all active workers are busy with long-running transactions, so stopping them doesn't make sense. I think as long as we are freeing them after use it seems okay to me. OTOH, each time after finishing a transaction, we could stop workers if the workers in the free pool exceed 'max_parallel_apply_workers_per_subscription'. I don't know if it is worth it. -- With Regards, Amit Kapila.
Re: add \dpS to psql
On Wed, 7 Dec 2022 at 23:25, Tom Lane wrote: > Nathan Bossart writes: > > I haven't formed an opinion on whether VACUUM FULL should get its own > bit, > > but FWIW I just finished writing the first draft of a patch set to add > bits > > for CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX. I plan to post that > > tomorrow. > > The fact that we just doubled the number of available bits doesn't > mean we should immediately strive to use them up. Perhaps it'd > be better to subsume these retail privileges under some generic > "maintenance action" privilege? > That was my original suggestion: https://www.postgresql.org/message-id/CAMsGm5c4DycKBYZCypfV02s-SC8GwF%2BKeTt%3D%3DvbWrFn%2Bdz%3DKeg%40mail.gmail.com In that message I review the history of permission bit growth. A bit later in the discussion, I did some investigation into the history of demand for new permission bits and I proposed calling the new privilege MAINTAIN: https://www.postgresql.org/message-id/CAMsGm5d%3D2gi4kyKONUJyYFwen%3DbsWm4hz_KxLXkEhMmg5WSWTA%40mail.gmail.com For what it's worth, I wouldn't bother changing the format of the permission bits to expand the pool of available bits. My previous analysis shows that there is no vast hidden demand for new privilege bits. If we implement MAINTAIN to control access to VACUUM, ANALYZE, REFRESH, CLUSTER, and REINDEX, we will cover everything that I can find that has been seriously discussed on this list, and still leave 3 unused bits for future expansion. There is even justification for stopping after this expansion: if it is done, then schema changes (DDL) will only be able to be done by the owner; data changes (insert, update, delete, as well as triggering of automatic data maintenance actions) will be able to be done by anybody who is granted permission.
My guess is that if we ever do expand the privilege bit system, it should be in a way that removes the limit entirely, replacing a bitmap model with something more like a table with one row for each individual grant, with a field indicating which grant is involved. But that is a hypothetical future.
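The two models being contrasted, a fixed bit budget versus a row per grant, can be sketched in a few lines. A hedged Python illustration (privilege names are illustrative only, not the actual AclMode layout):

```python
from enum import Flag, auto

# Bitmap model: each grantee gets one integer; every new privilege kind
# permanently consumes a bit from a fixed budget.
class Priv(Flag):
    SELECT = auto()
    INSERT = auto()
    UPDATE = auto()
    MAINTAIN = auto()   # one combined bit covering the maintenance actions

granted = Priv.SELECT | Priv.MAINTAIN
assert Priv.MAINTAIN in granted and Priv.INSERT not in granted

# Row-per-grant model: no bit budget at all; each grant is its own record,
# so adding a new privilege kind never requires widening a bitmask.
grants = {("alice", "t1", "SELECT"), ("alice", "t1", "MAINTAIN")}
assert ("alice", "t1", "MAINTAIN") in grants
```

The trade-off the message alludes to is visible even in the toy: the bitmap check is a single AND, while the row model pays a lookup per check but has no ceiling on the number of distinct privileges.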
Re: add \dpS to psql
On Wed, Dec 07, 2022 at 11:25:32PM -0500, Tom Lane wrote: > The fact that we just doubled the number of available bits doesn't > mean we should immediately strive to use them up. Perhaps it'd > be better to subsume these retail privileges under some generic > "maintenance action" privilege? That's fine with me, but I wouldn't be surprised if there's disagreement on how to group the commands. I certainly don't want to use up the rest of the bits right away, but there might not be too many more existing privileges after these three that deserve them. Perhaps I should take inventory and offer a plan for all the remaining privileges. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: add \dpS to psql
Nathan Bossart writes: > I haven't formed an opinion on whether VACUUM FULL should get its own bit, > but FWIW I just finished writing the first draft of a patch set to add bits > for CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX. I plan to post that > tomorrow. The fact that we just doubled the number of available bits doesn't mean we should immediately strive to use them up. Perhaps it'd be better to subsume these retail privileges under some generic "maintenance action" privilege? regards, tom lane
Re: [PoC] Federated Authn/z with OAUTHBEARER
That being said, Diagram 2 would look like this with our proposal:

[ASCII diagram, mangled in transit. Flow shown: PQconnect → libpq sends an empty token → the Postgres server replies with an error carrying the root URL and audience → libpq's pre-auth hook returns these to the application → the application acquires a token → a second PQconnect passes the access token → a server-side validator hook reports authorization success or failure.]

with the application taking care of all token acquisition logic, while the server-side hook participates in the pre-authentication reply. That is definitely a required scenario for the long term and the easiest to implement in the client core. And if we can do at least that flow in PG16 it will be a strong foundation for providing more support for specific grants in libpq going forward. Does the diagram above look good to you? We can then start cleaning up the patch to get that in first. Thanks! Andrey. On Wed, Dec 7, 2022 at 3:22 PM Andrey Chudnovsky wrote: > > > I think it's okay to have the extension and HBA collaborate to provide > > discovery information. Your proposal goes further than that, though, > > and makes the server aware of the chosen client flow. That appears to > > be an architectural violation: why does an OAuth resource server need > > to know the client flow at all? > > Ok. It may have been left there from intermediate iterations. We did > consider making the extension drive the flow for a specific grant_type, but > decided against that idea, for the same reason you point to. > Is it correct that your main concern about the use of grant_type was that > it's propagated to the server? Then yes, we will remove sending it to > the server. > > > Ideally, yes, but that only works if all identity providers implement > > the same flows in compatible ways. We're already seeing instances > > where that's not the case and we'll necessarily have to deal with that > > up front.
> > Yes, based on our analysis, the OIDC spec is detailed enough that > providers implementing it can be supported with generic code in > libpq / the client. > GitHub specifically won't fit there though. Microsoft Azure AD, > Google, Okta (including Auth0) will. > Theoretically discovery documents can be returned from the extension > (server-side), which is provider specific. Though we didn't plan to > prioritize that. > > > That seems to be restating the goal of OAuth and OIDC. Can you explain > > how the incompatible change allows you to accomplish this better than > > standard implementations? > > Do you refer to passing grant_type to the server? Which we will get > rid of in the next iteration. Or other incompatible changes as well? > > > Why? I claim that standard OAUTHBEARER can handle all of that. What > > does your proposed architecture (the third diagram) enable that my > > proposed hook (the second diagram) doesn't? > > The hook proposed in the 2nd diagram effectively delegates all OAuth > flow implementations to the client. > We propose libpq takes care of pulling OpenID discovery and coordination. > Which is effectively Diagram 1 + more flows + a server hook providing > the root URL/audience. > > Created the diagrams with all components for 3 flows: > 1. Authorization code grant (clients with browser access): > [ASCII diagram for the authorization code grant flow, mangled and truncated in transit.]
Re: add \dpS to psql
On Wed, Dec 07, 2022 at 08:39:30PM -0600, Justin Pryzby wrote: > I think if vacuum privilege allows vacuum full, then it ought to also > allow cluster. But I suggest that it'd be even better if it doesn't > allow either, and there was a separate privilege for those. > > Disclaimer: I have not been following these threads. I haven't formed an opinion on whether VACUUM FULL should get its own bit, but FWIW I just finished writing the first draft of a patch set to add bits for CLUSTER, REFRESH MATERIALIZED VIEW, and REINDEX. I plan to post that tomorrow. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: fix and document CLUSTER privileges
On Wed, Dec 07, 2022 at 08:25:59PM -0600, Justin Pryzby wrote: > Your patch makes it inconsistent with vacuum full, which is strange > because vacuum full calls cluster. > > postgres=> VACUUM FULL t; > VACUUM > postgres=> CLUSTER t; > ERROR: must be owner of table t This is the existing behavior on HEAD. I think it has been this way for a while. Granted, that doesn't mean it's ideal, but AFAICT it's intentional. Looking closer, the database ownership check in get_tables_to_cluster_partitioned() appears to have no meaningful effect. In this code path, cluster_rel() will always be called with CLUOPT_RECHECK, and that function only checks for table ownership. Anything gathered in get_tables_to_cluster_partitioned() that the user doesn't own will be skipped. So, I don't think my patch changes the behavior in any meaningful way, but I still think it's worthwhile to make the checks consistent. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
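Nathan's observation, that a gather-time ownership filter is redundant because cluster_rel() rechecks anyway, can be modeled in miniature. A toy Python sketch (structure hypothetical, mirroring the gather/recheck split rather than the real C code):

```python
def get_tables_to_cluster(tables):
    # Gather candidates broadly; any extra ownership filter here has no
    # visible effect, because cluster_rel() rechecks below.
    return list(tables)

def cluster_rel(table, user, recheck=True):
    # With CLUOPT_RECHECK, only table ownership matters; anything the
    # calling user doesn't own is silently skipped.
    if recheck and table["owner"] != user:
        return False
    return True

tables = [{"name": "t1", "owner": "alice"}, {"name": "t2", "owner": "bob"}]
clustered = [t["name"] for t in get_tables_to_cluster(tables)
             if cluster_rel(t, "alice")]
assert clustered == ["t1"]
```

Whether get_tables_to_cluster() returns bob's table or not, the outcome is the same, which is the sense in which the database-ownership check in the gather step "has no meaningful effect".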
Re: allow segment size to be set to < 1GiB
Hi, On 2022-11-09 12:25:09 -0800, Andres Freund wrote: > Updated patch attached. I pushed it now. > I made one autoconf and one meson CI task use a small block size, but just to > ensure it works on both. I'd probably leave it set on one, so we keep the > coverage for cfbot? It doesn't seem to cost that much, so I left it set in those two tasks for now. Greetings, Andres Freund
Re: Error-safe user functions
On Wed, Dec 7, 2022 at 12:17 PM Tom Lane wrote: > Corey Huinker writes: > > In my attempt to implement CAST...DEFAULT, I noticed that I immediately > > needed an > > OidInputFunctionCallSafe, which was trivial but maybe something we want > to > > add to the infra patch, but the comments around that function also > somewhat > > indicate that we might want to just do the work in-place and call > > InputFunctionCallSafe directly. Open to both ideas. > > I'm a bit skeptical of that. IMO using OidInputFunctionCall is only > appropriate in places that will be executed just once per query. > That is what's happening when the expr of the existing CAST ( expr AS typename ) is a constant and we want to just resolve the constant at parse time.
Re: moving extraUpdatedCols out of RangeTblEntry (into ModifyTable)
On Wed, Dec 7, 2022 at 8:54 PM Amit Langote wrote: > Per Alvaro's advice, forking this from [1]. > > In that thread, Tom had asked if it wouldn't be better to find a new > place to put extraUpdatedCols [2] instead of RangeTblEntry, along with > the permission-checking fields are now no longer stored in > RangeTblEntry. > > In [3] of the same thread, I proposed to move it into a List of > Bitmapsets in ModifyTable, as implemented in the attached patch that I > had been posting to that thread. > > The latest version of that patch is attached herewith. Updated to replace a list_nth() with list_nth_node() and rewrote the commit message. -- Thanks, Amit Langote EDB: http://www.enterprisedb.com v2-0001-Add-per-result-relation-extraUpdatedCols-to-Modif.patch Description: Binary data
Re: Assertion failure in SnapBuildInitialSnapshot()
On Mon, Nov 21, 2022 at 4:31 PM Amit Kapila wrote: > > On Sat, Nov 19, 2022 at 6:35 AM Andres Freund wrote: > > > > On 2022-11-18 11:20:36 +0530, Amit Kapila wrote: > > > Okay, updated the patch accordingly. > > > > Assuming it passes tests etc, this'd work for me. > > > > Thanks, Pushed. The same assertion failure has been reported on another thread[1]. Since I could reproduce this issue several times in my environment, I've investigated the root cause. I think there is a race condition between CreateInitDecodingContext() and LogicalConfirmReceivedLocation() when updating procArray->replication_slot_xmin. What I observed in the test was that a walsender process called:

SnapBuildProcessRunningXacts()
  LogicalIncreaseXminForSlot()
    LogicalConfirmReceivedLocation()
      ReplicationSlotsComputeRequiredXmin(false)

In ReplicationSlotsComputeRequiredXmin() it acquired the ReplicationSlotControlLock and got 0 as the minimum xmin, since there was no walsender having an effective_xmin. Before it called ProcArraySetReplicationSlotXmin() (i.e. before it acquired ProcArrayLock), another walsender process called CreateInitDecodingContext(), acquired ProcArrayLock, computed slot->effective_catalog_xmin, and called ReplicationSlotsComputeRequiredXmin(true). Since its effective_catalog_xmin had been set, it got 39968 as the minimum xmin and updated replication_slot_xmin. However, as soon as the second walsender released ProcArrayLock, the first walsender updated replication_slot_xmin to 0. After that, the second walsender called SnapBuildInitialSnapshot(), and GetOldestSafeDecodingTransactionId() returned an XID newer than snap->xmin. One idea to fix this issue is that in ReplicationSlotsComputeRequiredXmin(), we compute the minimum xmin while holding both ProcArrayLock and ReplicationSlotControlLock, and release only ReplicationSlotControlLock before updating replication_slot_xmin. I'm concerned this will increase contention on ProcArrayLock, but I've attached the patch for discussion.
Regards, [1] https://www.postgresql.org/message-id/tencent_7EB71DA5D7BA00EB0B429DCE45D0452B6406%40qq.com -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com fix_slot_xmin_race_condition.patch Description: Binary data
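The lost-update interleaving Sawada-san describes can be forced deterministically. A toy Python sketch (events substitute for scheduler timing; the names stand in for ProcArrayLock and procArray->replication_slot_xmin, not the real API):

```python
import threading

slot_xmin = 0             # stands in for procArray->replication_slot_xmin
proc_array_lock = threading.Lock()   # stands in for ProcArrayLock
a_computed = threading.Event()
b_done = threading.Event()

def walsender_a():
    # ReplicationSlotsComputeRequiredXmin(false): computes the minimum over
    # slots *before* taking ProcArrayLock; no slot has an effective_xmin
    # yet, so the result is 0.
    global slot_xmin
    agg_xmin = 0
    a_computed.set()
    b_done.wait()                    # B's update lands in this window
    with proc_array_lock:
        slot_xmin = agg_xmin         # stale write clobbers B's value

def walsender_b():
    # CreateInitDecodingContext(): sets effective_catalog_xmin and
    # recomputes/publishes the minimum (39968 in the report).
    global slot_xmin
    a_computed.wait()
    with proc_array_lock:
        slot_xmin = 39968
    b_done.set()

ta = threading.Thread(target=walsender_a)
tb = threading.Thread(target=walsender_b)
ta.start(); tb.start(); ta.join(); tb.join()
assert slot_xmin == 0    # lost update: the slot's xmin protection is gone
```

Holding the lock across both the computation and the write, as the proposed patch does for ProcArrayLock, removes the window in which walsender_a's computed value goes stale.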
Re: add \dpS to psql
On Wed, Dec 07, 2022 at 10:48:49AM +0300, Pavel Luzanov wrote: > Furthermore. The VACUUM privilege allows you to also execute VACUUM FULL. > VACUUM and VACUUM FULL are commands with similar names, but work completely > differently. > It may be worth clarifying on this page: > https://www.postgresql.org/docs/devel/ddl-priv.html > > Something like: Allows VACUUM on a relation, including VACUUM FULL. Since (as you said) they work completely differently, I think it'd be more useful if vacuum_full were a separate privilege, rather than being included in vacuum. And cluster could be allowed whenever vacuum_full is allowed. > There is a very similar command to VACUUM FULL with a different name - > CLUSTER. The VACUUM privilege does not apply to the CLUSTER command. > This is probably correct. I think if vacuum privilege allows vacuum full, then it ought to also allow cluster. But I suggest that it'd be even better if it doesn't allow either, and there was a separate privilege for those. Disclaimer: I have not been following these threads. -- Justin
Re: [DOCS] Stats views and functions not in order?
Thanks for the ongoing feedback. PSA patches for v9* v9-0001 - Now the table rows are ordered per PeterE's suggestions [1] v9-0002 - All the review comments from DavidJ [2] are addressed v9-0003 - Unchanged since v8. -- [1] https://www.postgresql.org/message-id/cfdb0030-8f62-ed6d-4246-8d9bf855bc48%40enterprisedb.com [2] https://www.postgresql.org/message-id/CAKFQuwby7xWHek8%3D6UPaNXrhGA-i0B2zMOmBoGHgc4yaO8NH_w%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia. v9-0001-Re-order-Table-28.2-Collected-Statistics-Views.patch Description: Binary data v9-0003-Add-Statistics-Views-section-and-refentry-for-eac.patch Description: Binary data v9-0002-Cleanup-view-name-hyperlinks-for-Tables-28.1-and-.patch Description: Binary data
Re: fix and document CLUSTER privileges
On Wed, Dec 07, 2022 at 02:39:24PM -0800, Nathan Bossart wrote: > Hi hackers, > > While looking into other opportunities for per-table permissions, I noticed > a weird discrepancy in CLUSTER. When evaluating whether the current user > has permission to CLUSTER a table, we ordinarily just check for ownership. > However, the database owner is also allowed to CLUSTER all partitions that > are not shared. This was added in 3f19e17, and I didn't see any discussion > about it in the corresponding thread [0]. > > My first instinct is that we should just remove the database ownership > check, which is what I've done in the attached patch. I don't see any > strong reason to complicate matters with special > database-owner-but-not-shared checks like other commands (e.g., VACUUM). > But perhaps we should do so just for consistency's sake. Thoughts? Your patch makes it inconsistent with vacuum full, which is strange because vacuum full calls cluster. postgres=> VACUUM FULL t; VACUUM postgres=> CLUSTER t; ERROR: must be owner of table t BTW, it'd be helpful to copy the relevant parties on this kind of message, especially if there's a new thread dedicated just to this. -- Justin
Re: [PATCH] pg_dump: lock tables in batches
Andres Freund writes: > On 2022-12-07 17:53:05 -0500, Tom Lane wrote: >> Is "-s" mode actually a relevant criterion here? With per-table COPY >> commands added into the mix you could not possibly get better than 2x >> improvement, and likely a good deal less. > Well, -s isn't something used all that rarely, so it'd not be insane to > optimize it in isolation. But more importantly, I think the potential win > without -s is far bigger than 2x, because the COPYs can be done in parallel, > whereas the locking happens in the non-parallel stage. True, and there's the reduce-the-lock-window issue that Jacob mentioned. > With just a 5ms delay, very well within normal network latency range, I get: > [ a nice win ] OK. I'm struggling to figure out why I rejected this idea last year. I know that I thought about it and I'm fairly certain I actually tested it. Maybe I only tried it with near-zero-latency local loopback; but I doubt that, because the potential for network latency was certainly a factor in that whole discussion. One idea is that I might've tried it before getting rid of all the other per-object queries, at which point it wouldn't have stood out quite so much. But I'm just guessing. I have a nagging feeling there was something else. Oh well, I guess we can always revert it if we discover a problem later. regards, tom lane
Re: Error-safe user functions
Andres Freund writes: > I wonder if there are potential use-cases for levels other than ERROR. I can > potentially see us wanting to defer some FATALs, e.g. when they occur in > process exit hooks. I thought about that early on, and concluded not. The whole thing is moot for levels less than ERROR, of course, and I'm having a hard time seeing how it could be useful for FATAL or PANIC. Maybe I just lack imagination, but if a call is specifying FATAL rather than just ERROR then it seems to me it's already a special snowflake rather than something we could fold into a generic non-error behavior. > For a moment I was worried that it could lead to odd behaviour that we don't > do get_error_stack_entry() when !details_wanted, due to not raising an error > we'd otherwise raise. But that's a should-never-be-reached case, so ... I don't see how. Returning false out of errsave_start causes the errsave macro to immediately give control back to the caller, which will go on about its business. > It seems somewhat ugly transport this knowledge via edata->elevel, but it's > not too bad. The LOG-vs-ERROR business, you mean? Yeah. I considered adding another bool flag to ErrorData, but couldn't convince myself it was worth the trouble. If we find a problem we can do that sometime in future. >> +/* >> + * We cannot include nodes.h yet, so make a stub reference. (This is also >> + * used by fmgr.h, which doesn't want to depend on nodes.h either.) >> + */ >> +typedef struct Node *NodePtr; > Seems like it'd be easier to just forward declare the struct, and use the > non-typedef'ed name in the header than to have to deal with these > interdependencies and the differing typenames. Meh. I'm a little allergic to writing "struct foo *" in function argument lists, because I so often see gcc pointing out that if struct foo isn't yet known then that can silently mean something different than you intended. With the typedef, it either works or is an error, no halfway about it. 
And the struct way isn't really much better in terms of having two different notations to use rather than only one. > Perhaps worth noting here that the reason why the errsave_start/errsave_finish > split exist differs a bit from the reason in ereport_domain()? "Over there" > it's just about not wanting to incur overhead when the message isn't logged, > but here we'll always have >= ERROR, but ->details_wanted can still lead to > not wanting to incur the overhead. Hmmm ... it seems like the same reason to me, we don't want to incur the overhead if the "start" function says not to. > Is it ok that we throw an error in lookup_rowtype_tupdesc()? Yeah, that should fall in the category of internal errors I think. I don't see how you'd reach that from a bad input string. (Or to be more precise, the point of pg_input_is_valid is to tell you whether the input string is valid, not to tell you whether the type name is valid; if you're worried about the latter you need a separate and earlier test.) regards, tom lane
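The calling convention being hashed out above can be mimicked in miniature. A hedged Python sketch of the soft-error pattern (names echo ErrorSaveContext/errsave but are not the actual C API):

```python
class ErrorSaveContext:
    def __init__(self, details_wanted=False):
        self.error_occurred = False
        self.details_wanted = details_wanted
        self.error = None

def errsave(context, message):
    # No context: behave like ereport(ERROR, ...) and raise.
    if context is None:
        raise ValueError(message)
    # Soft path: always record the fact of the error; keep the details
    # only if the context creator asked for them.
    context.error_occurred = True
    if context.details_wanted:
        context.error = message

def bool_in(s, context=None):
    # A datatype input function adapted to the soft-error convention.
    if s in ("t", "true"):
        return True
    if s in ("f", "false"):
        return False
    errsave(context, f'invalid input syntax for type boolean: "{s}"')
    return None    # soft-error callers get a dummy result back

ctx = ErrorSaveContext(details_wanted=True)
assert bool_in("t") is True
assert bool_in("maybe", ctx) is None and ctx.error_occurred
```

The details_wanted split is the same idea as skipping the error-stack entry in errsave_start(): callers like a hypothetical pg_input_is_valid only need the boolean fact, not the full message.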
Re: PGDOCS - Logical replication GUCs - added some xrefs
Hi, On Tue, Dec 6, 2022 at 11:12 PM Peter Smith wrote: > On Tue, Dec 6, 2022 at 5:57 AM samay sharma > wrote: > > > > Hi, > > > > On Mon, Oct 24, 2022 at 12:45 AM Peter Smith > wrote: > >> > >> Hi hackers. > >> > >> There is a docs Logical Replication section "31.10 Configuration > >> Settings" [1] which describes some logical replication GUCs, and > >> details on how they interact with each other and how to take that into > >> account when setting their values. > >> > >> There is another docs Server Configuration section "20.6 Replication" > >> [2] which lists the replication-related GUC parameters, and what they > >> are for. > >> > >> Currently AFAIK those two pages are unconnected, but I felt it might > >> be helpful if some of the parameters in the list [2] had xref links to > >> the additional logical replication configuration information [1]. PSA > >> a patch to do that. > > > > > > +1 on the patch. Some feedback on v5 below. > > > > Thanks for your detailed review comments! > > I have changed most things according to your suggestions. Please check > patch v6. > Thanks for the changes. See a few points of feedback below. > + > + For logical replication, publishers > + (servers that do CREATE PUBLICATION) > + replicate data to subscribers > + (servers that do CREATE SUBSCRIPTION). > + Servers can also be publishers and subscribers at the same time. Note, > + the following sections refers to publishers as "senders". The parameter > + max_replication_slots has a different meaning for the > + publisher and subscriber, but all other parameters are relevant only to > + one side of the replication. For more details about logical replication > + configuration settings refer to > + . > + The second last line seems a bit odd here. In my last round of feedback, I had meant to add the line "The parameter " onwards to the top of logical-replication-config.sgml. What if we made the top of logical-replication-config.sgml like below? 
Logical replication requires several configuration options to be set. Most configuration options are relevant only on one side of the replication (i.e. publisher or subscriber). However, max_replication_slots is applicable on both sides but has different meanings on each side. > > > > + > > > + For logical replication configuration > settings refer > > > + also to . > > > + > > > + > > > > I feel the top paragraph needs to explain terminology for logical > replication like it does for physical replication in addition to linking to > the logical replication config page. I'm recommending this as we use terms > like subscriber etc. in description of parameters without introducing them > first. > > > > As an example, something like below might work. > > > > These settings control the behavior of the built-in streaming > replication feature (see Section 27.2.5) and logical replication (link). > > > > For physical replication, servers will be either a primary or a standby > server. Primaries can send data, while standbys are always receivers of > replicated data. When cascading replication (see Section 27.2.7) is used, > standby servers can also be senders, as well as receivers. Parameters are > mainly for sending and standby servers, though some parameters have meaning > only on the primary server. Settings may vary across the cluster without > problems if that is required. > > > > For logical replication, servers will either be publishers (also called > senders in the sections below) or subscribers. Publishers are , > Subscribers are... > > > > OK. I split this blurb into 2 parts – streaming and logical > replication. The streaming replication part is the same as before. The > logical replication part is new. > > > > + > > > + See for more > details > > > + about setting max_replication_slots for > logical > > > + replication. 
> > > + > > > > > > The link doesn't add any new information regarding max_replication_slots > other than "to reserve some for table sync" and has a good amount of > unrelated info. I think it might be useful to just put a line here asking > to reserve some for table sync instead of linking to the entire logical > replication config section. > > > > OK. I copied the tablesync note back to config.sgml definition of > 'max_replication_slots' and removed the link as suggested. Frankly, I > also thought it is a bit strange that the max_replication_slots in the > “Sending Servers” section was describing this parameter for > “Subscribers”. OTOH, I did not want to split the definition in half so > instead, I’ve added another Subscriber that just refers > back to this place. It looks like an improvement to me. > Hmm, I agree this is a tricky scenario. However, to me, it seems odd to mention the parameter twice as this chapter of the docs just lists each parameter and describes them. So, I'd probably remove the reference to it in the subscriber
Re: [PATCH] pg_dump: lock tables in batches
Hi, On 2022-12-07 17:53:05 -0500, Tom Lane wrote: > Andres Freund writes: > > With an artificial delay of 100ms, the perf difference between the batching > > patch and not using the batching patch is huge. Huge enough that I don't > > have > > the patience to wait for the non-batched case to complete. > > Clearly, if you insert a sufficiently large artificial round-trip delay, > even squeezing a single command out of a pg_dump run will appear > worthwhile. What I'm unsure about is whether it's worthwhile at > realistic round-trip delays (where "realistic" means that the dump > performance would otherwise be acceptable). I think the reason I didn't > pursue this last year is that experimentation convinced me the answer > was "no". It seems to be a win even without any artificial delay. Not a *huge* win, but a noticable win. And even just a few ms make it quite painful. > > With batching pg_dump -s -h localhost t1 took 0:16.23 elapsed, without I > > cancelled after 603 tables had been locked, which took 2:06.43. > > Is "-s" mode actually a relevant criterion here? With per-table COPY > commands added into the mix you could not possibly get better than 2x > improvement, and likely a good deal less. Well, -s isn't something used all that rarely, so it'd not be insane to optimize it in isolation. But more importantly, I think the potential win without -s is far bigger than 2x, because the COPYs can be done in parallel, whereas the locking happens in the non-parallel stage. With just a 5ms delay, very well within normal network latency range, I get: pg_dump.master -h localhost -j10 -f /tmp/pg_dump_backup -Fd t1 2m7.830s pg_dump.pipeline -h localhost -j10 -f /tmp/pg_dump_backup -Fd t1 0m24.183s pg_dump.batch -h localhost -j10 -f /tmp/pg_dump_backup -Fd t1 0m24.321s Greetings, Andres Freund
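The win Andres measures is mostly latency arithmetic. A back-of-envelope Python sketch (the table count and batch size are assumed for illustration; only the 5 ms round trip matches the scenario above):

```python
def lock_phase_seconds(n_tables, rtt_ms, batch_size=1):
    # pg_dump issues LOCK TABLE statements during its serial startup phase,
    # paying one network round trip per statement. Batching several tables
    # into one statement makes the cost scale with ceil(n/batch) instead
    # of n.
    round_trips = -(-n_tables // batch_size)   # ceiling division
    return round_trips * rtt_ms / 1000.0

# 5000 tables at a 5 ms round trip: ~25 s of pure latency unbatched,
# ~0.25 s when locking 100 tables per round trip.
assert lock_phase_seconds(5000, 5) == 25.0
assert lock_phase_seconds(5000, 5, batch_size=100) == 0.25
```

Because this phase runs before the parallel COPY stage, the saved seconds come straight off the wall clock, which is why the -j10 numbers above improve by far more than 2x.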
Re: Error-safe user functions
Hi, On 2022-12-07 17:32:21 -0500, Tom Lane wrote: > I already pushed the elog-refactoring patch, since that seemed > uncontroversial. 0001 attached covers the same territory as before, > but I regrouped the rest so that 0002 installs the new test support > functions, then 0003 adds both the per-datatype changes and > corresponding test cases for bool, int4, arrays, and records. > The idea here is that 0003 can be pointed to as a sample of what > has to be done to datatype input functions, while the preceding > patches can be cited as relevant documentation. (I've not decided > whether to squash 0001 and 0002 together or commit them separately. I think they make sense as is. > Does it make sense to break 0003 into 4 separate commits, or is > that overkill?) I think it'd be fine either way. > + * If "context" is an ErrorSaveContext node, but the node creator only wants > + * notification of the fact of a soft error without any details, just set > + * the error_occurred flag in the ErrorSaveContext node and return false, > + * which will cause us to skip the remaining error processing steps. > + * > + * Otherwise, create and initialize error stack entry and return true. > + * Subsequently, errmsg() and perhaps other routines will be called to > further > + * populate the stack entry. Finally, errsave_finish() will be called to > + * tidy up. > + */ > +bool > +errsave_start(NodePtr context, const char *domain) I wonder if there are potential use-cases for levels other than ERROR. I can potentially see us wanting to defer some FATALs, e.g. when they occur in process exit hooks. > +{ > + ErrorSaveContext *escontext; > + ErrorData *edata; > + > + /* > + * Do we have a context for soft error reporting? If not, just punt to > + * errstart(). 
> + */ > + if (context == NULL || !IsA(context, ErrorSaveContext)) > + return errstart(ERROR, domain); > + > + /* Report that a soft error was detected */ > + escontext = (ErrorSaveContext *) context; > + escontext->error_occurred = true; > + > + /* Nothing else to do if caller wants no further details */ > + if (!escontext->details_wanted) > + return false; > + > + /* > + * Okay, crank up a stack entry to store the info in. > + */ > + > + recursion_depth++; > + > + /* Initialize data for this error frame */ > + edata = get_error_stack_entry(); For a moment I was worried that it could lead to odd behaviour that we don't do get_error_stack_entry() when !details_wanted, due to not raising an error we'd otherwise raise. But that's a should-never-be-reached case, so ... > +/* > + * errsave_finish --- end a "soft" error-reporting cycle > + * > + * If errsave_start() decided this was a regular error, behave as > + * errfinish(). Otherwise, package up the error details and save > + * them in the ErrorSaveContext node. > + */ > +void > +errsave_finish(NodePtr context, const char *filename, int lineno, > +const char *funcname) > +{ > + ErrorSaveContext *escontext = (ErrorSaveContext *) context; > + ErrorData *edata = &errordata[errordata_stack_depth]; > + > + /* verify stack depth before accessing *edata */ > + CHECK_STACK_DEPTH(); > + > + /* > + * If errsave_start punted to errstart, then elevel will be ERROR or > + * perhaps even PANIC. Punt likewise to errfinish. > + */ > + if (edata->elevel >= ERROR) > + { > + errfinish(filename, lineno, funcname); > + pg_unreachable(); > + } It seems somewhat ugly to transport this knowledge via edata->elevel, but it's not too bad. > +/* > + * We cannot include nodes.h yet, so make a stub reference. (This is also > + * used by fmgr.h, which doesn't want to depend on nodes.h either.
> + */ > +typedef struct Node *NodePtr; Seems like it'd be easier to just forward declare the struct, and use the non-typedef'ed name in the header than to have to deal with these interdependencies and the differing typenames. > +/*-- > + * Support for reporting "soft" errors that don't require a full transaction > + * abort to clean up. This is to be used in this way: > + * errsave(context, > + * errcode(ERRCODE_INVALID_TEXT_REPRESENTATION), > + * errmsg("invalid input syntax for type %s: > \"%s\"", > + * "boolean", in_str), > + * ... other errxxx() fields as needed ...); > + * > + * "context" is a node pointer or NULL, and the remaining auxiliary calls > + * provide the same error details as in ereport(). If context is not a > + * pointer to an ErrorSaveContext node, then errsave(context, ...) > + * behaves identically to ereport(ERROR, ...). If context is a pointer > + * to an ErrorSaveContext node, then the information provided by the > + * auxiliary calls is stored in the context node and control returns > + *
Re: [PoC] Federated Authn/z with OAUTHBEARER
> I think it's okay to have the extension and HBA collaborate to provide > discovery information. Your proposal goes further than that, though, > and makes the server aware of the chosen client flow. That appears to > be an architectural violation: why does an OAuth resource server need > to know the client flow at all? Ok. It may have been left there from intermediate iterations. We did consider making the extension drive the flow for a specific grant_type, but decided against that idea, for the same reason you point to. Is it correct that your main concern about the use of grant_type was that it's propagated to the server? Then yes, we will remove sending it to the server. > Ideally, yes, but that only works if all identity providers implement > the same flows in compatible ways. We're already seeing instances > where that's not the case and we'll necessarily have to deal with that > up front. Yes, based on our analysis the OIDC spec is detailed enough that providers implementing it can be supported with generic code in libpq / the client. GitHub specifically won't fit there, though; Microsoft Azure AD, Google, and Okta (including Auth0) will. Theoretically, provider-specific discovery documents could be returned from the extension (server-side), though we didn't plan to prioritize that. > That seems to be restating the goal of OAuth and OIDC. Can you explain > how the incompatible change allows you to accomplish this better than > standard implementations? Do you refer to passing grant_type to the server, which we will get rid of in the next iteration, or to other incompatible changes as well? > Why? I claim that standard OAUTHBEARER can handle all of that. What > does your proposed architecture (the third diagram) enable that my > proposed hook (the second diagram) doesn't? The hook proposed in the 2nd diagram effectively delegates all OAuth flow implementations to the client. We propose that libpq take care of pulling OpenID discovery and coordinating the flow.
Which is effectively Diagram 1 + more flows + a server hook providing the root URL/audience. Created diagrams with all components for 3 flows:

1. Authorization code grant (clients with browser access): [ASCII diagram, garbled in the archive. Recoverable content: PQconnect sends an empty token; the Postgres pre-auth hook returns an error carrying the root URL and audience; libpq's discovery hook fetches the OIDC provider metadata via GET; the client obtains an authorization code; iddawc exchanges the auth code with the issuer/authorization server for an access token; libpq sends the access token, which the server-side validator hook accepts or rejects; the refresh_token is stored.]

2. Device code grant: [ASCII diagram, truncated in the archive.]
Re: [PATCH] pg_dump: lock tables in batches
On Wed, Dec 7, 2022 at 2:53 PM Tom Lane wrote: > Is "-s" mode actually a relevant criterion here? With per-table COPY > commands added into the mix you could not possibly get better than 2x > improvement, and likely a good deal less. Don't we hit this code path in pg_upgrade? You won't see huge round-trip times, of course, but you also won't see COPY. Absolute performance aside, isn't there another concern: the longer it takes for us to lock the tables, the bigger the window there is for someone to interfere with them between our catalog query and the lock itself? --Jacob
Re: Error-safe user functions
Andrew Dunstan writes: > On 2022-12-07 We 17:32, Tom Lane wrote: >> Does it make sense to break 0003 into 4 separate commits, or is >> that overkill?) > No strong opinion about 0001 and 0002. I'm happy enough with them as > they are, but if you want to squash them that's ok. I wouldn't break up > 0003. I think we're going to end up committing the remaining work in > batches, although they would probably be a bit more thematically linked > than these. Yeah, we certainly aren't likely to do this work as one-commit-per-datatype going forward. I'm just wondering how to do these initial commits so that they provide good reference material. regards, tom lane
Re: [PATCH] pg_dump: lock tables in batches
Andres Freund writes: > With an artificial delay of 100ms, the perf difference between the batching > patch and not using the batching patch is huge. Huge enough that I don't have > the patience to wait for the non-batched case to complete. Clearly, if you insert a sufficiently large artificial round-trip delay, even squeezing a single command out of a pg_dump run will appear worthwhile. What I'm unsure about is whether it's worthwhile at realistic round-trip delays (where "realistic" means that the dump performance would otherwise be acceptable). I think the reason I didn't pursue this last year is that experimentation convinced me the answer was "no". > With batching pg_dump -s -h localhost t1 took 0:16.23 elapsed, without I > cancelled after 603 tables had been locked, which took 2:06.43. Is "-s" mode actually a relevant criterion here? With per-table COPY commands added into the mix you could not possibly get better than 2x improvement, and likely a good deal less. regards, tom lane
Re: Error-safe user functions
On 2022-12-07 We 17:32, Tom Lane wrote: > OK, here's a v4 that I think is possibly committable. > > I've changed all the comments and docs to use the "soft error" > terminology, but since using "soft" in the actual function names > didn't seem that appealing, they still use "safe". > > I already pushed the elog-refactoring patch, since that seemed > uncontroversial. 0001 attached covers the same territory as before, > but I regrouped the rest so that 0002 installs the new test support > functions, then 0003 adds both the per-datatype changes and > corresponding test cases for bool, int4, arrays, and records. > The idea here is that 0003 can be pointed to as a sample of what > has to be done to datatype input functions, while the preceding > patches can be cited as relevant documentation. (I've not decided > whether to squash 0001 and 0002 together or commit them separately. > Does it make sense to break 0003 into 4 separate commits, or is > that overkill?) > No strong opinion about 0001 and 0002. I'm happy enough with them as they are, but if you want to squash them that's ok. I wouldn't break up 0003. I think we're going to end up committing the remaining work in batches, although they would probably be a bit more thematically linked than these. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: [PATCH] pg_dump: lock tables in batches
Hi, On 2022-12-07 13:32:42 -0800, Andres Freund wrote: > On 2022-12-07 18:14:01 -0300, Fabrízio de Royes Mello wrote: > > Here we have some numbers about the Aleksander's patch: > > Hm. Were they taken in an assertion enabled build or such? Just testing the > t1 case on HEAD, I get 0:01.23 elapsed for an unpatched pg_dump in an > optimized build. And that's on a machine with not all that fast cores. Comparing the overhead on the server side. CREATE OR REPLACE FUNCTION exec(v_sql text) RETURNS text LANGUAGE plpgsql AS $$BEGIN EXECUTE v_sql;RETURN v_sql;END;$$; Acquire locks in separate statements, three times: SELECT exec(string_agg(format('LOCK t%s;', i), '')) FROM generate_series(1, 1) AS i; 1267.909 ms 1116.008 ms 1108.383 ms Acquire all locks in a single statement, three times: SELECT exec('LOCK '||string_agg(format('t%s', i), ', ')) FROM generate_series(1, 1) AS i; 1210.732 ms 1101.390 ms 1105.331 ms So there's some performance difference that's independent of the avoided roundtrips - but it's pretty small. With an artificial delay of 100ms, the perf difference between the batching patch and not using the batching patch is huge. Huge enough that I don't have the patience to wait for the non-batched case to complete. With batching pg_dump -s -h localhost t1 took 0:16.23 elapsed, without I cancelled after 603 tables had been locked, which took 2:06.43. This made me try out using pipeline mode. Seems to work nicely from what I can tell. The timings are a tad slower than the "manual" batch mode - I think that's largely due to the extended protocol just being overcomplicated. Greetings, Andres Freund
Re: add \dpS to psql
On Wed, Dec 07, 2022 at 10:48:49AM +0300, Pavel Luzanov wrote: > There is a very similar command to VACUUM FULL with a different name - > CLUSTER. > The VACUUM privilege does not apply to the CLUSTER command. This is probably > correct. > However, the documentation for the CLUSTER command does not say > who can perform this command. I think it would be correct to add a sentence > to the Notes section > (https://www.postgresql.org/docs/devel/sql-cluster.html) > similar to the one in the VACUUM documentation: > > "To cluster a table, one must ordinarily be the table's owner or a > superuser." I created a new thread for this: https://postgr.es/m/20221207223924.GA4182184%40nathanxps13 -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
fix and document CLUSTER privileges
Hi hackers, While looking into other opportunities for per-table permissions, I noticed a weird discrepancy in CLUSTER. When evaluating whether the current user has permission to CLUSTER a table, we ordinarily just check for ownership. However, the database owner is also allowed to CLUSTER all partitions that are not shared. This was added in 3f19e17, and I didn't see any discussion about it in the corresponding thread [0]. My first instinct is that we should just remove the database ownership check, which is what I've done in the attached patch. I don't see any strong reason to complicate matters with special database-owner-but-not-shared checks like other commands (e.g., VACUUM). But perhaps we should do so just for consistency's sake. Thoughts? It was also noted elsewhere [1] that the privilege requirements for CLUSTER are not documented. The attached patch adds such documentation. [0] https://postgr.es/m/20220411140609.GF26620%40telsasoft.com [1] https://postgr.es/m/661148f4-c7f1-dec1-2bc8-29f3bd58e242%40postgrespro.ru -- Nathan Bossart Amazon Web Services: https://aws.amazon.com diff --git a/doc/src/sgml/ref/cluster.sgml b/doc/src/sgml/ref/cluster.sgml index c37f4236f1..feb61e6ec2 100644 --- a/doc/src/sgml/ref/cluster.sgml +++ b/doc/src/sgml/ref/cluster.sgml @@ -67,7 +67,8 @@ CLUSTER [VERBOSE] - CLUSTER without any parameter reclusters all the + CLUSTER without a + table_name reclusters all the previously-clustered tables in the current database that the calling user owns, or all such tables if called by a superuser. This form of CLUSTER cannot be executed inside a transaction @@ -132,6 +133,13 @@ CLUSTER [VERBOSE] Notes + +To cluster a table, one must be the table's owner or a superuser. +Database-wide clusters and clusters on partitioned tables will silently +skip over any tables that the calling user does not have permission to +cluster. 
+ + In cases where you are accessing single rows randomly within a table, the actual order of the data in the diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c index 07e091bb87..a8a51cc674 100644 --- a/src/backend/commands/cluster.c +++ b/src/backend/commands/cluster.c @@ -1693,9 +1693,7 @@ get_tables_to_cluster_partitioned(MemoryContext cluster_context, Oid indexOid) continue; /* Silently skip partitions which the user has no access to. */ - if (!object_ownercheck(RelationRelationId, relid, GetUserId()) && - (!object_ownercheck(DatabaseRelationId, MyDatabaseId, GetUserId()) || - IsSharedRelation(relid))) + if (!object_ownercheck(RelationRelationId, relid, GetUserId())) continue; /* Use a permanent memory context for the result list */
Re: Error-safe user functions
OK, here's a v4 that I think is possibly committable. I've changed all the comments and docs to use the "soft error" terminology, but since using "soft" in the actual function names didn't seem that appealing, they still use "safe". I already pushed the elog-refactoring patch, since that seemed uncontroversial. 0001 attached covers the same territory as before, but I regrouped the rest so that 0002 installs the new test support functions, then 0003 adds both the per-datatype changes and corresponding test cases for bool, int4, arrays, and records. The idea here is that 0003 can be pointed to as a sample of what has to be done to datatype input functions, while the preceding patches can be cited as relevant documentation. (I've not decided whether to squash 0001 and 0002 together or commit them separately. Does it make sense to break 0003 into 4 separate commits, or is that overkill?) Thoughts? regards, tom lane diff --git a/doc/src/sgml/ref/create_type.sgml b/doc/src/sgml/ref/create_type.sgml index 693423e524..994dfc6526 100644 --- a/doc/src/sgml/ref/create_type.sgml +++ b/doc/src/sgml/ref/create_type.sgml @@ -900,6 +900,17 @@ CREATE TYPE name function is written in C. + + In PostgreSQL version 16 and later, + it is desirable for base types' input functions to + return soft errors using the + new errsave()/ereturn() + mechanism, rather than throwing ereport() + exceptions as in previous versions. + See src/backend/utils/fmgr/README for more + information. 
+ + diff --git a/src/backend/nodes/Makefile b/src/backend/nodes/Makefile index 4368c30fdb..7c594be583 100644 --- a/src/backend/nodes/Makefile +++ b/src/backend/nodes/Makefile @@ -56,6 +56,7 @@ node_headers = \ nodes/bitmapset.h \ nodes/extensible.h \ nodes/lockoptions.h \ + nodes/miscnodes.h \ nodes/replnodes.h \ nodes/supportnodes.h \ nodes/value.h \ diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl index 7212bc486f..08992dfd47 100644 --- a/src/backend/nodes/gen_node_support.pl +++ b/src/backend/nodes/gen_node_support.pl @@ -68,6 +68,7 @@ my @all_input_files = qw( nodes/bitmapset.h nodes/extensible.h nodes/lockoptions.h + nodes/miscnodes.h nodes/replnodes.h nodes/supportnodes.h nodes/value.h @@ -89,6 +90,7 @@ my @nodetag_only_files = qw( executor/tuptable.h foreign/fdwapi.h nodes/lockoptions.h + nodes/miscnodes.h nodes/replnodes.h nodes/supportnodes.h ); diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c index f5cd1b7493..a36aeb832e 100644 --- a/src/backend/utils/error/elog.c +++ b/src/backend/utils/error/elog.c @@ -71,6 +71,7 @@ #include "libpq/libpq.h" #include "libpq/pqformat.h" #include "mb/pg_wchar.h" +#include "nodes/miscnodes.h" #include "miscadmin.h" #include "pgstat.h" #include "postmaster/bgworker.h" @@ -611,6 +612,128 @@ errfinish(const char *filename, int lineno, const char *funcname) CHECK_FOR_INTERRUPTS(); } + +/* + * errsave_start --- begin a "soft" error-reporting cycle + * + * If "context" isn't an ErrorSaveContext node, this behaves as + * errstart(ERROR, domain), and the errsave() macro ends up acting + * exactly like ereport(ERROR, ...). + * + * If "context" is an ErrorSaveContext node, but the node creator only wants + * notification of the fact of a soft error without any details, just set + * the error_occurred flag in the ErrorSaveContext node and return false, + * which will cause us to skip the remaining error processing steps. 
+ * + * Otherwise, create and initialize error stack entry and return true. + * Subsequently, errmsg() and perhaps other routines will be called to further + * populate the stack entry. Finally, errsave_finish() will be called to + * tidy up. + */ +bool +errsave_start(NodePtr context, const char *domain) +{ + ErrorSaveContext *escontext; + ErrorData *edata; + + /* + * Do we have a context for soft error reporting? If not, just punt to + * errstart(). + */ + if (context == NULL || !IsA(context, ErrorSaveContext)) + return errstart(ERROR, domain); + + /* Report that a soft error was detected */ + escontext = (ErrorSaveContext *) context; + escontext->error_occurred = true; + + /* Nothing else to do if caller wants no further details */ + if (!escontext->details_wanted) + return false; + + /* + * Okay, crank up a stack entry to store the info in. + */ + + recursion_depth++; + + /* Initialize data for this error frame */ + edata = get_error_stack_entry(); + edata->elevel = LOG; /* signal all is well to errsave_finish */ + set_stack_entry_domain(edata, domain); + /* Select default errcode based on the assumed elevel of ERROR */ + edata->sqlerrcode = ERRCODE_INTERNAL_ERROR; + + /* + * Any allocations for this error state level should go into the caller's + * context. We don't need to pollute ErrorContext, or even require it to + * exist, in this code path. + */ + edata->assoc_context =
Re: Temporary tables versus wraparound... again
On Thu, 1 Dec 2022 at 14:18, Andres Freund wrote: > > I find it problematic that ResetVacStats() bypasses tableam. Normal vacuums > etc go through tableam but you put a ResetVacStats() besides each call to > table_relation_nontransactional_truncate(). Seems like this should just be in > heapam_relation_nontransactional_truncate()? So this seems a bit weird. The only other part of the tableam that touches freezexid and minmxid is table_relation_set_new_filelocator(), which returns them via references so that the callers (heap.c:heap_create() and relcache.c:RelationSetNewRelfilenumber()) can set them themselves. I can't really duplicate that layering without changing the API of heapam_relation_nontransactional_truncate(), which, if it changed, would I think be quite annoying for external pluggable tableams. But you're right that where I've put it will update relfrozenxid and minmxid even for relations that have a tableam handler that returns InvalidXid and doesn't need it. That does seem inappropriate. I could put it directly in the heapam_handler, but is it a layering issue to be doing a catalog update from within the tableam handler? There are no other catalog updates in there. On the other hand, the existing callers of *_nontransactional_truncate() don't have any existing catalog updates they want to make, so perhaps returning the xid values by reference was just for convenience to avoid doing an extra update and isn't needed here. -- greg
Re: [PATCH] pg_dump: lock tables in batches
Hi, On 2022-12-07 18:14:01 -0300, Fabrízio de Royes Mello wrote: > Here we have some numbers about the Aleksander's patch: Hm. Were they taken in an assertion enabled build or such? Just testing the t1 case on HEAD, I get 0:01.23 elapsed for an unpatched pg_dump in an optimized build. And that's on a machine with not all that fast cores. Greetings, Andres Freund
Re: Transaction timeout
On Wed, Dec 7, 2022 at 10:23 AM Andres Freund wrote: > > On 2022-12-03 09:41:04 -0800, Andrey Borodin wrote: > > Fixed. Added test for this. > > The tests don't pass: https://cirrus-ci.com/build/4811553145356288 > oops, sorry. Here's the fix. I hope to address other feedback on the weekend. Thank you! Best regards, Andrey Borodin. v3-0001-Intorduce-transaction_timeout.patch Description: Binary data
Re: [PATCH] pg_dump: lock tables in batches
On Wed, Dec 7, 2022 at 2:28 PM Tom Lane wrote: > > Andres Freund writes: > > On 2022-12-07 10:44:33 -0500, Tom Lane wrote: > >> I have a strong sense of deja vu here. I'm pretty sure I experimented > >> with this idea last year and gave up on it. I don't recall exactly > >> why, but either it didn't show any meaningful performance improvement > >> for me or there was some actual downside (that I'm not remembering > >> right now). > > > IIRC the case we were looking at around 989596152 were CPU bound workloads, > > rather than latency bound workloads. It'd not be surprising to have cases > > where batching LOCKs helps latency, but not CPU bound. > > Yeah, perhaps. Anyway my main point is that I don't want to just assume > this is a win; I want to see some actual performance tests. > Here we have some numbers about the Aleksander's patch: 1) Setup script CREATE DATABASE t1000; CREATE DATABASE t1; CREATE DATABASE t10; \c t1000 SELECT format('CREATE TABLE t%s(c1 INTEGER PRIMARY KEY, c2 TEXT, c3 TIMESTAMPTZ);', i) FROM generate_series(1, 1000) AS i \gexec \c t1 SELECT format('CREATE TABLE t%s(c1 INTEGER PRIMARY KEY, c2 TEXT, c3 TIMESTAMPTZ);', i) FROM generate_series(1, 1) AS i \gexec \c t10 SELECT format('CREATE TABLE t%s(c1 INTEGER PRIMARY KEY, c2 TEXT, c3 TIMESTAMPTZ);', i) FROM generate_series(1, 10) AS i \gexec 2) Execution script time pg_dump -s t1000 > /dev/null time pg_dump -s t1 > /dev/null time pg_dump -s t10 > /dev/null 3) HEAD execution $ time pg_dump -s t1000 > /dev/null 0.02user 0.01system 0:00.36elapsed 8%CPU (0avgtext+0avgdata 11680maxresident)k 0inputs+0outputs (0major+1883minor)pagefaults 0swaps $ time pg_dump -s t1 > /dev/null 0.30user 0.10system 0:05.04elapsed 8%CPU (0avgtext+0avgdata 57772maxresident)k 0inputs+0outputs (0major+14042minor)pagefaults 0swaps $ time pg_dump -s t10 > /dev/null 3.42user 2.13system 7:50.09elapsed 1%CPU (0avgtext+0avgdata 517900maxresident)k 0inputs+0outputs (0major+134636minor)pagefaults 0swaps 4) PATCH execution $ time 
pg_dump -s t1000 > /dev/null 0.02user 0.00system 0:00.28elapsed 9%CPU (0avgtext+0avgdata 11700maxresident)k 0inputs+0outputs (0major+1886minor)pagefaults 0swaps $ time pg_dump -s t1 > /dev/null 0.18user 0.03system 0:02.17elapsed 10%CPU (0avgtext+0avgdata 57592maxresident)k 0inputs+0outputs (0major+14072minor)pagefaults 0swaps $ time pg_dump -s t10 > /dev/null 1.97user 0.32system 0:21.39elapsed 10%CPU (0avgtext+0avgdata 517932maxresident)k 0inputs+0outputs (0major+134892minor)pagefaults 0swaps 5) Summary HEAD patch 1k tables 0:00.36 0:00.28 10k tables 0:05.04 0:02.17 100k tables 7:50.09 0:21.39 It seems we get a very good performance gain using Aleksander's patch. I used "-s" to avoid wasting time issuing COPY for each relation (even though they are all empty) and to make evident the difference due to the round trip for LOCK TABLE. This patch will also improve pg_upgrade execution over databases with thousands of relations. Regards, -- Fabrízio de Royes Mello
Re: Error-safe user functions
"David G. Johnston" writes: > On Wed, Dec 7, 2022 at 8:04 AM Andrew Dunstan wrote: >> I'm not sure InputFunctionCallSoft would be an improvement. Maybe >> InputFunctionCallSoftError would be clearer, but I don't know that it's >> much of an improvement either. The same goes for the other visible changes. > InputFunctionCallSafe -> TryInputFunctionCall I think we are already using "TryXXX" for code that involves catching ereport errors. Since the whole point here is that we are NOT doing that, I think this naming would be more confusing than helpful. > Unrelated observation: "Although the error stack is not large, we don't > expect to run out of space." -> "Because the error stack is not large, > assume that we will not run out of space and panic if we are wrong."? That doesn't seem to make the point I wanted to make. I've adopted your other suggestions in the v4 I'm preparing now. regards, tom lane
[PATCH] psql: Add tab-complete for optional view parameters
Hi all! This adds tab-complete for optional parameters (as can be specified using WITH) for views, similarly to how it works for storage parameters of tables. Thanks, Christoph HeissFrom bd12c08a80b5973fdcac59def213216d06f150b0 Mon Sep 17 00:00:00 2001 From: Christoph Heiss Date: Wed, 7 Dec 2022 19:15:48 +0100 Subject: [PATCH] psql: Add tab-complete for optional view parameters This adds them in the same matter as it works for storage parameters of tables. Signed-off-by: Christoph Heiss --- src/bin/psql/tab-complete.c | 31 +++ 1 file changed, 27 insertions(+), 4 deletions(-) diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c index 89e7317c23..999a0e5aa1 100644 --- a/src/bin/psql/tab-complete.c +++ b/src/bin/psql/tab-complete.c @@ -1310,6 +1310,13 @@ static const char *const table_storage_parameters[] = { NULL }; +/* Optional parameters for CREATE VIEW and ALTER VIEW */ +static const char *const view_optional_parameters[] = { + "check_option", + "security_barrier", + "security_invoker", + NULL +}; /* Forward declaration of functions */ static char **psql_completion(const char *text, int start, int end); @@ -2139,7 +2146,7 @@ psql_completion(const char *text, int start, int end) /* ALTER VIEW */ else if (Matches("ALTER", "VIEW", MatchAny)) COMPLETE_WITH("ALTER COLUMN", "OWNER TO", "RENAME", - "SET SCHEMA"); + "SET SCHEMA", "SET (", "RESET ("); /* ALTER VIEW xxx RENAME */ else if (Matches("ALTER", "VIEW", MatchAny, "RENAME")) COMPLETE_WITH_ATTR_PLUS(prev2_wd, "COLUMN", "TO"); @@ -2155,6 +2162,12 @@ psql_completion(const char *text, int start, int end) /* ALTER VIEW xxx RENAME COLUMN yyy */ else if (Matches("ALTER", "VIEW", MatchAny, "RENAME", "COLUMN", MatchAnyExcept("TO"))) COMPLETE_WITH("TO"); + /* ALTER VIEW xxx SET ( yyy [= zzz] ) */ + else if (Matches("ALTER", "VIEW", MatchAny, "SET", "(")) + COMPLETE_WITH_LIST(view_optional_parameters); + /* ALTER VIEW xxx RESET ( yyy , ... 
) */ + else if (Matches("ALTER", "VIEW", MatchAny, "RESET", "(")) + COMPLETE_WITH_LIST(view_optional_parameters); /* ALTER MATERIALIZED VIEW */ else if (Matches("ALTER", "MATERIALIZED", "VIEW", MatchAny)) @@ -3426,13 +3439,23 @@ psql_completion(const char *text, int start, int end) } /* CREATE VIEW --- is allowed inside CREATE SCHEMA, so use TailMatches */ - /* Complete CREATE [ OR REPLACE ] VIEW with AS */ + /* Complete CREATE [ OR REPLACE ] VIEW with AS or WITH ( */ else if (TailMatches("CREATE", "VIEW", MatchAny) || TailMatches("CREATE", "OR", "REPLACE", "VIEW", MatchAny)) + COMPLETE_WITH("AS", "WITH ("); + /* Complete CREATE [ OR REPLACE ] VIEW WITH ( with supported options */ + else if (TailMatches("CREATE", "VIEW", MatchAny, "WITH", "(") || + TailMatches("CREATE", "OR", "REPLACE", "VIEW", MatchAny, "WITH", "(")) + COMPLETE_WITH_LIST(view_optional_parameters); + /* Complete CREATE [ OR REPLACE ] VIEW WITH ( ... ) with "AS" */ + else if (TailMatches("CREATE", "VIEW", MatchAny, "WITH", "(*)") || + TailMatches("CREATE", "OR", "REPLACE", "VIEW", MatchAny, "WITH", "(*)")) COMPLETE_WITH("AS"); - /* Complete "CREATE [ OR REPLACE ] VIEW AS with "SELECT" */ + /* Complete "CREATE [ OR REPLACE ] VIEW [ WITH ( ... ) ] AS with "SELECT" */ else if (TailMatches("CREATE", "VIEW", MatchAny, "AS") || - TailMatches("CREATE", "OR", "REPLACE", "VIEW", MatchAny, "AS")) + TailMatches("CREATE", "VIEW", MatchAny, "WITH", "(*)", "AS") || + TailMatches("CREATE", "OR", "REPLACE", "VIEW", MatchAny, "AS") || + TailMatches("CREATE", "OR", "REPLACE", "VIEW", MatchAny, "WITH", "(*)", "AS")) COMPLETE_WITH("SELECT"); /* CREATE MATERIALIZED VIEW */ -- 2.38.1
Creating HeapTuple from char and date values
Hi All, I am trying to create HeapTuple data structure. First, I create a tuple descriptor: TupleDesc *td=CreateTemplateTupleDesc(colCount); Then, for each variable, I do: TupleDescInitEntry(*td,v->varattno,NULL,v->vartype,v->vartypmod,0); Then, I assign values: if int32: values[v->varattno-1]=Int8GetDatum(myValue); Similarly for float. Finally, I create the HeapTuple: HeapTuple tuple=heap_form_tuple(td,values,isnull); Everything works fine with int and float. But I don't know how to handle chars. Let's say we have a character(10) column. One problem is v->vartypmod will be set to 14. Shouldn't it be 10? Second, how should I assign values? Is values[v->varattno-1]=CStringGetDatum(myValue); correct? Should I set the last parameter to TupleDescInitEntry? Why am I getting "invalid memory alloc request size" or segfault with different configurations?
Re: [PoC] Federated Authn/z with OAUTHBEARER
On Mon, Dec 5, 2022 at 4:15 PM Andrey Chudnovsky wrote: > I think we can focus on the roles and responsibilities of the components > first. > Details of the patch can be elaborated. Like "flow type code" is a > mistake on our side, and we will use the term "grant_type" which is > defined by OIDC spec. As well as details of usage of refresh_token. (For the record, whether we call it "flow type" or "grant type" doesn't address my concern.) > Basically Yes. We propose an increase of the server side hook responsibility. > From just validating the token, to also return the provider root URL > and required audience. And possibly provide more metadata in the > future. I think it's okay to have the extension and HBA collaborate to provide discovery information. Your proposal goes further than that, though, and makes the server aware of the chosen client flow. That appears to be an architectural violation: why does an OAuth resource server need to know the client flow at all? > Which is in our opinion aligned with SASL protocol, where the server > side is responsible for telling the client auth requirements based on > the requested role in the startup packet. You've proposed an alternative SASL mechanism. There's nothing wrong with that, per se, but I think it should be clear why we've chosen something nonstandard. > Our understanding is that in the original patch that information came > purely from hba, and we propose extension being able to control that > metadata. > As we see extension as being owned by the identity provider, compared > to HBA which is owned by the server administrator or cloud provider. That seems reasonable, considering how tightly coupled the Issuer and the token validation process are. > 2. Server Owners / PAAS providers (On premise admins, Cloud providers, > multi-cloud PAAS providers). >- Install extensions and configure HBA to allow clients to > authenticate with the identity providers of their choice. 
(For a future conversation: they need to set up authorization, too, with custom scopes or some other magic. It's not enough to check who the token belongs to; even if Postgres is just using the verified email from OpenID as an authenticator, you have to also know that the user authorized the token -- and therefore the client -- to access Postgres on their behalf.) > 3. Client Application Developers (Data Wis, integration tools, > PgAdmin, monitoring tools, e.t.c.) >- Independent from specific Identity providers or server providers. > Write one code for all identity providers. Ideally, yes, but that only works if all identity providers implement the same flows in compatible ways. We're already seeing instances where that's not the case and we'll necessarily have to deal with that up front. >- Rely on application deployment owners to configure which OIDC > provider to use across client and server setups. > 4. Application Deployment Owners (End customers setting up applications) >- The only actor actually aware of which identity provider to use. > Configures the stack based on the Identity and PostgreSQL deployments > they have. (I have doubts that the roles will be as decoupled in practice as you have described them, but I'd rather defer that for now.) > The critical piece of the vision is (3.) above is applications > agnostic of the identity providers. Those applications rely on > properly configured servers and rich driver logic (libpq, > com.postgresql, npgsql) to allow their application to popup auth > windows or do service-to-service authentication with any provider. In > our view that would significantly democratize the deployment of OAUTH > authentication in the community. That seems to be restating the goal of OAuth and OIDC. Can you explain how the incompatible change allows you to accomplish this better than standard implementations? > In order to allow this separation, we propose: > 1. 
HBA + Extension is the single source of truth of Provider root URL > + Required Audience for each role. If some backfill for missing OIDC > discovery is needed, the provider-specific extension would be > providing it. > 2. Client Application knows which grant_type to use in which scenario. > But can be coded without knowledge of a specific provider. So can't > provide discovery details. > 3. Driver (libpq, others) - coordinate the authentication flow based > on client grant_type and identity provider metadata to allow client > applications to use any flow with any provider in a unified way. > > Yes, this would require a little more complicated flow between > components than in your original patch. Why? I claim that standard OAUTHBEARER can handle all of that. What does your proposed architecture (the third diagram) enable that my proposed hook (the second diagram) doesn't? > And yes, more complexity comes > with more opportunity to make bugs. > However, I see PG Server and Libpq as the places which can have more > complexity. For the purpose of making work for the
Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)
Hi, On 2022-11-04 09:25:52 +0100, Drouvot, Bertrand wrote: > Please find attached a rebase in v7. cfbot complains that the docs don't build: https://cirrus-ci.com/task/6694349031866368?logs=docs_build#L296 [03:24:27.317] ref/checkpoint.sgml:66: element para: validity error : Element para is not declared in para list of possible children I've marked the patch as waiting-on-author for now. There's been a bunch of architectural feedback too, but tbh, I don't know if we came to any conclusion on that front... Greetings, Andres Freund
Re: Partial aggregates pushdown
Hi, On 2022-12-05 02:03:49 +, fujii.y...@df.mitsubishielectric.co.jp wrote: > > Attaching minor fixes. I haven't proof-read all comments (but perhaps, they > > need attention from some native speaker). > Thank you. I fixed according to your patch. > And I fixed have proof-read all comments and messages. cfbot complains about some compiler warnings when building with clang: https://cirrus-ci.com/task/6606268580757504 deparse.c:3459:22: error: equality comparison with extraneous parentheses [-Werror,-Wparentheses-equality] if ((node->aggsplit == AGGSPLIT_SIMPLE)) { ~~~^~ deparse.c:3459:22: note: remove extraneous parentheses around the comparison to silence this warning if ((node->aggsplit == AGGSPLIT_SIMPLE)) { ~ ^ ~ deparse.c:3459:22: note: use '=' to turn this equality comparison into an assignment if ((node->aggsplit == AGGSPLIT_SIMPLE)) { ^~ = Greetings, Andres Freund
Re: daitch_mokotoff module
Hi, On 2022-02-03 15:27:32 +0100, Dag Lem wrote: > Just some minor adjustments to the patch: > > * Removed call to locale-dependent toupper() > * Cleaned up input normalization This patch currently fails in cfbot, likely because meson.build needs to be adjusted (this didn't exist at the time you submitted this version of the patch): [23:43:34.796] contrib/fuzzystrmatch/meson.build:18:0: ERROR: File fuzzystrmatch--1.1.sql does not exist. > -DATA = fuzzystrmatch--1.1.sql fuzzystrmatch--1.0--1.1.sql > +DATA = fuzzystrmatch--1.2.sql fuzzystrmatch--1.1--1.2.sql > fuzzystrmatch--1.0--1.1.sql > PGFILEDESC = "fuzzystrmatch - similarities and distance between strings" The patch seems to remove fuzzystrmatch--1.1.sql - I suggest not doing that. In recent years our approach has been to just keep the "base version" of the upgrade script, with extension creation running through the upgrade scripts. > > + > +#include "daitch_mokotoff.h" > + > +#include "postgres.h" Postgres policy is that the include of "postgres.h" has to be the first include in every .c file. > +#include "utils/builtins.h" > +#include "mb/pg_wchar.h" > + > +#include > + > +/* Internal C implementation */ > +static char *_daitch_mokotoff(char *word, char *soundex, size_t n); > + > + > +PG_FUNCTION_INFO_V1(daitch_mokotoff); > +Datum > +daitch_mokotoff(PG_FUNCTION_ARGS) > +{ > + text *arg = PG_GETARG_TEXT_PP(0); > + char *string, > +*tmp_soundex; > + text *soundex; > + > + /* > + * The maximum theoretical soundex size is several KB, however in > practice > + * anything but contrived synthetic inputs will yield a soundex size of > + * less than 100 bytes. We thus allocate and free a temporary work > buffer, > + * and return only the actual soundex result. > + */ > + string = pg_server_to_any(text_to_cstring(arg), VARSIZE_ANY_EXHDR(arg), > PG_UTF8); > + tmp_soundex = palloc(DM_MAX_SOUNDEX_CHARS); Seems that just using StringInfo to hold the soundex output would work better than a static allocation? 
> + if (!_daitch_mokotoff(string, tmp_soundex, DM_MAX_SOUNDEX_CHARS)) We imo shouldn't introduce new functions starting with _. > +/* Mark soundex code tree node as leaf. */ > +static void > +set_leaf(dm_leaves leaves_next, int *num_leaves_next, dm_node * node) > +{ > + if (!node->is_leaf) > + { > + node->is_leaf = 1; > + leaves_next[(*num_leaves_next)++] = node; > + } > +} > + > + > +/* Find next node corresponding to code digit, or create a new node. */ > +static dm_node * find_or_create_node(dm_nodes nodes, int *num_nodes, > + > dm_node * node, char code_digit) PG code style is to have a line break between a function definition's return type and the function name - like you actually do above. > +/* Mapping from ISO8859-1 to upper-case ASCII */ > +static const char tr_iso8859_1_to_ascii_upper[] = > +/* > +"`abcdefghijklmnopqrstuvwxyz{|}~ > ¡¢£¤¥¦§¨©ª«¬ > ®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ" > +*/ > +"`ABCDEFGHIJKLMNOPQRSTUVWXYZ{|}~ ! > > ?AAECDNO*OYDSAAECDNO/OYDY"; > + > +static char > +iso8859_1_to_ascii_upper(unsigned char c) > +{ > + return c >= 0x60 ? tr_iso8859_1_to_ascii_upper[c - 0x60] : c; > +} > + > + > +/* Convert an UTF-8 character to ISO-8859-1. > + * Unconvertable characters are returned as '?'. > + * NB! Beware of the domain specific conversion of Ą, Ę, and Ţ/Ț. > + */ > +static char > +utf8_to_iso8859_1(char *str, int *ix) It seems decidedly not great to have custom encoding conversion routines in a contrib module. Is there any way we can avoid this? > +/* Generate all Daitch-Mokotoff soundex codes for word, separated by space. > */ > +static char * > +_daitch_mokotoff(char *word, char *soundex, size_t n) > +{ > + int i = 0, > + j; > + int letter_no = 0; > + int ix_leaves = 0; > + int num_nodes = 0, > + num_leaves = 0; > + dm_codes *codes, > +*next_codes; > + dm_node*nodes; > + dm_leaves *leaves; > + > + /* First letter. 
*/ > + if (!(codes = read_letter(word, ))) > + { > + /* No encodable character in input. */ > + return NULL; > + } > + > + /* Allocate memory for node tree. */ > + nodes = palloc(sizeof(dm_nodes)); > + leaves = palloc(2 * sizeof(dm_leaves)); So this allocates the worst case memory usage, is that right? That's quite a bit of memory. Shouldn't nodes be allocated dynamically? Instead of carefully freeing individual memory allocations, I think it'd be better to create a
Re: Transaction timeout
Hi, On 2022-12-03 09:41:04 -0800, Andrey Borodin wrote: > Fixed. Added test for this. The tests don't pass: https://cirrus-ci.com/build/4811553145356288 [00:54:35.337](1.251s) not ok 1 - no parameters missing from postgresql.conf.sample [00:54:35.338](0.000s) # Failed test 'no parameters missing from postgresql.conf.sample' # at /tmp/cirrus-ci-build/src/test/modules/test_misc/t/003_check_guc.pl line 81. [00:54:35.338](0.000s) # got: '1' # expected: '0' I am just looking through a bunch of failing CF entries, so I'm perhaps over-sensitized right now. But I'm a bit confused why there's so many occasions of the tests clearly not having been run... Michael, any reason 003_check_guc doesn't show the missing GUCs? It's not particularly helpful to see "0 is different from 1". Seems that even just something like is_deeply(\@missing_from_file, [], "no parameters missing from postgresql.conf.sample"); would be a decent improvement? Greetings, Andres Freund
Re: wake up logical workers after ALTER SUBSCRIPTION
On Wed, Dec 07, 2022 at 02:07:11PM +0300, Melih Mutlu wrote: > Do we also need to wake up all sync workers too? Even if not, I'm not > actually sure whether doing that would harm anything though. > Just asking since currently the patch wakes up all workers including sync > workers if any still exists. After sleeping on this, I think we can do better. IIUC we can simply check for AllTablesyncsReady() at the end of process_syncing_tables_for_apply() and wake up the logical replication workers (which should just consist of setting the current process's latch) if we are ready for two_phase mode. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com

>From 2457eca47c92b29409d5972fc5a3e1c8c038fa6d Mon Sep 17 00:00:00 2001
From: Nathan Bossart
Date: Mon, 21 Nov 2022 16:01:01 -0800
Subject: [PATCH v8 1/1] wake up logical workers as needed instead of relying
 on periodic wakeups

---
 src/backend/access/transam/xact.c           |  3 ++
 src/backend/commands/alter.c                |  7
 src/backend/commands/subscriptioncmds.c     |  4 ++
 src/backend/replication/logical/tablesync.c | 10 +
 src/backend/replication/logical/worker.c    | 46 +
 src/include/replication/logicalworker.h     |  3 ++
 6 files changed, 73 insertions(+)

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 8086b857b9..dc00e66cfb 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -47,6 +47,7 @@
 #include "pgstat.h"
 #include "replication/logical.h"
 #include "replication/logicallauncher.h"
+#include "replication/logicalworker.h"
 #include "replication/origin.h"
 #include "replication/snapbuild.h"
 #include "replication/syncrep.h"
@@ -2360,6 +2361,7 @@ CommitTransaction(void)
 	AtEOXact_PgStat(true, is_parallel_worker);
 	AtEOXact_Snapshot(true, false);
 	AtEOXact_ApplyLauncher(true);
+	AtEOXact_LogicalRepWorkers(true);
 	pgstat_report_xact_timestamp(0);
 	CurrentResourceOwner = NULL;
@@ -2860,6 +2862,7 @@ AbortTransaction(void)
 	AtEOXact_HashTables(false);
 	AtEOXact_PgStat(false, is_parallel_worker);
 	AtEOXact_ApplyLauncher(false);
+	AtEOXact_LogicalRepWorkers(false);
 	pgstat_report_xact_timestamp(0);
 }

diff --git a/src/backend/commands/alter.c b/src/backend/commands/alter.c
index 10b6fe19a2..d095cd3ced 100644
--- a/src/backend/commands/alter.c
+++ b/src/backend/commands/alter.c
@@ -59,6 +59,7 @@
 #include "commands/user.h"
 #include "miscadmin.h"
 #include "parser/parse_func.h"
+#include "replication/logicalworker.h"
 #include "rewrite/rewriteDefine.h"
 #include "tcop/utility.h"
 #include "utils/builtins.h"
@@ -279,6 +280,12 @@ AlterObjectRename_internal(Relation rel, Oid objectId, const char *new_name)
 		if (strncmp(new_name, "regress_", 8) != 0)
 			elog(WARNING, "subscriptions created by regression test cases should have names starting with \"regress_\"");
 #endif
+
+		/*
+		 * Wake up the logical replication workers to handle this change
+		 * quickly.
+		 */
+		LogicalRepWorkersWakeupAtCommit(objectId);
 	}
 	else if (nameCacheId >= 0)
 	{

diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index d673557ea4..d6993c26e5 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -34,6 +34,7 @@
 #include "nodes/makefuncs.h"
 #include "pgstat.h"
 #include "replication/logicallauncher.h"
+#include "replication/logicalworker.h"
 #include "replication/origin.h"
 #include "replication/slot.h"
 #include "replication/walreceiver.h"
@@ -1362,6 +1363,9 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
 	InvokeObjectPostAlterHook(SubscriptionRelationId, subid, 0);
+	/* Wake up the logical replication workers to handle this change quickly. */
+	LogicalRepWorkersWakeupAtCommit(subid);
+
 	return myself;
 }

diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 94e813ac53..509fe2eb19 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -105,6 +105,7 @@
 #include "pgstat.h"
 #include "replication/logicallauncher.h"
 #include "replication/logicalrelation.h"
+#include "replication/logicalworker.h"
 #include "replication/walreceiver.h"
 #include "replication/worker_internal.h"
 #include "replication/slot.h"
@@ -619,6 +620,15 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
 	if (started_tx)
 	{
+		/*
+		 * If we are ready to enable two_phase mode, wake up the logical
+		 * replication workers to handle this change quickly.
+		 */
+		CommandCounterIncrement();
+		if (MySubscription->twophasestate == LOGICALREP_TWOPHASE_STATE_PENDING &&
+			AllTablesyncsReady())
+			LogicalRepWorkersWakeupAtCommit(MyLogicalRepWorker->subid);
+
 		CommitTransactionCommand();
 		pgstat_report_stat(true);
 	}

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index
Re: [BUG] Logical replica crash if there was an error in a function.
Hi, On 2022-11-16 17:52:50 +0300, Anton A. Melnikov wrote: > Sorry, i didn't fully understand what is required and > added some functions to the test that spend extra cpu time. But i found > that it is possible to make a test according to previous remarks by adding > only a few extra queries to an existent test without any additional syncing. > > Experimentally, i could not observe any significant difference in > CPU usage due to this test addition. > The difference in the CPU usage was less than standard error. > See 100_bugs-CPU-time.txt attached. > > > > Additionally > > > i've tried to reduce overall number of nodes previously > > > used in this test in a similar way. > > > > Optimizing existing tests isn't an answer to that. We could > > install those optimizations without adding a new test case. > > Oh sure! Here is a separate patch for this optimization: > https://www.postgresql.org/message-id/eb7aa992-c2d7-6ce7-4942-0c784231a362%40inbox.ru > On the contrary with previous case in that one the difference in the CPU usage > during 100_bugs.pl is essential; it decreases approximately by 30%. This CF entry causes tests to fail on all platforms: https://cirrus-ci.com/build/5755408111894528 E.g. https://api.cirrus-ci.com/v1/artifact/task/5298457144459264/testrun/build/testrun/subscription/100_bugs/log/regress_log_100_bugs Begin standard error psql::1: NOTICE: dropped replication slot "sub1" on publisher End standard error timed out waiting for match: ERROR: relation "error_name" does not exist at character at /tmp/cirrus-ci-build/src/test/subscription/t/100_bugs.pl line 115. Greetings, Andres Freund
Re: Error-safe user functions
I wrote: > Andres Freund writes: >> Is there a guarantee that input functions are stable or immutable? > There's a project policy that that should be true. That justifies > marking things like record_in as stable --- if the per-column input > functions could be volatile, record_in would need to be as well. > There are other dependencies on it; see e.g. aab353a60, 3db6524fe. I dug in the archives and found the thread leading up to aab353a60: https://www.postgresql.org/message-id/flat/AANLkTik8v7O9QR9jjHNVh62h-COC1B0FDUNmEYMdtKjR%40mail.gmail.com regards, tom lane
Re: Minimal logical decoding on standbys
Hi, On 2022-12-07 10:00:25 +0100, Drouvot, Bertrand wrote: > > Please find attached a new patch series: > > > > v27-0001-Add-info-in-WAL-records-in-preparation-for-logic.patch > > v27-0002-Handle-logical-slot-conflicts-on-standby.patch > > v27-0003-Allow-logical-decoding-on-standby.patch > > v27-0004-New-TAP-test-for-logical-decoding-on-standby.patch > > v27-0005-Doc-changes-describing-details-about-logical-dec.patch > > v27-0006-Fixing-Walsender-corner-case-with-logical-decodi.patch This failed on cfbot [1]. The tap output [2] has the following bit: [09:48:56.216](5.979s) not ok 26 - cannot read from logical replication slot [09:48:56.223](0.007s) # Failed test 'cannot read from logical replication slot' # at C:/cirrus/src/test/recovery/t/034_standby_logical_decoding.pl line 422. ... Warning: unable to close filehandle GEN150 properly: Bad file descriptor during global destruction. Warning: unable to close filehandle GEN155 properly: Bad file descriptor during global destruction. The "unable to close filehandle" stuff in my experience indicates an IPC::Run process that wasn't ended before the tap test ended. Greetings, Andres Freund [1] https://cirrus-ci.com/task/5092676671373312 [2] https://api.cirrus-ci.com/v1/artifact/task/5092676671373312/testrun/build/testrun/recovery/034_standby_logical_decoding/log/regress_log_034_standby_logical_decoding
Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)
On Wed, Dec 7, 2022 at 9:50 AM Andres Freund wrote: > performing post-bootstrap initialization ... > ../src/backend/access/transam/slru.c:1520:9: runtime error: load of > misaligned address 0x7fff6362db8c for type 'int64', which requires 8 byte > alignment > 0x7fff6362db8c: note: pointer points here > 01 00 00 00 00 00 00 00 d0 02 00 00 00 00 00 00 d0 02 00 00 00 00 00 00 > 01 00 00 00 00 00 00 00 I bet that this alignment issue can be fixed by using PGAlignedBlock instead of a raw char buffer for a page. (I'm guessing, I haven't directly checked.) -- Peter Geoghegan
Re: Error-safe user functions
Andres Freund writes: > Is there a guarantee that input functions are stable or immutable? There's a project policy that that should be true. That justifies marking things like record_in as stable --- if the per-column input functions could be volatile, record_in would need to be as well. There are other dependencies on it; see e.g. aab353a60, 3db6524fe. > We don't > have any volatile input functions in core PG: Indeed, because type_sanity.sql checks that. regards, tom lane
Re: XID formatting and SLRU refactorings (was: Add 64-bit XIDs into PostgreSQL 15)
Hi, On 2022-12-07 11:40:08 +0300, Aleksander Alekseev wrote: > Hi Michael, > > > The CF bot is showing some failures here. You may want to > > double-check. > > Thanks! PFA v48. This causes a lot of failures with ubsan: https://cirrus-ci.com/task/6035600772431872 performing post-bootstrap initialization ... ../src/backend/access/transam/slru.c:1520:9: runtime error: load of misaligned address 0x7fff6362db8c for type 'int64', which requires 8 byte alignment 0x7fff6362db8c: note: pointer points here 01 00 00 00 00 00 00 00 d0 02 00 00 00 00 00 00 d0 02 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ^ ==18947==Using libbacktrace symbolizer. #0 0x564d7c45cc6b in SlruScanDirCbReportPresence ../src/backend/access/transam/slru.c:1520 #1 0x564d7c45cd88 in SlruScanDirectory ../src/backend/access/transam/slru.c:1595 #2 0x564d7c44872c in TruncateCLOG ../src/backend/access/transam/clog.c:889 #3 0x564d7c62ecd7 in vac_truncate_clog ../src/backend/commands/vacuum.c:1779 #4 0x564d7c6320a8 in vac_update_datfrozenxid ../src/backend/commands/vacuum.c:1642 #5 0x564d7c632a78 in vacuum ../src/backend/commands/vacuum.c:537 #6 0x564d7c63347d in ExecVacuum ../src/backend/commands/vacuum.c:273 #7 0x564d7ca4afea in standard_ProcessUtility ../src/backend/tcop/utility.c:866 #8 0x564d7ca4b723 in ProcessUtility ../src/backend/tcop/utility.c:530 #9 0x564d7ca46e81 in PortalRunUtility ../src/backend/tcop/pquery.c:1158 #10 0x564d7ca4755d in PortalRunMulti ../src/backend/tcop/pquery.c:1315 #11 0x564d7ca47c02 in PortalRun ../src/backend/tcop/pquery.c:791 #12 0x564d7ca40ecb in exec_simple_query ../src/backend/tcop/postgres.c:1238 #13 0x564d7ca43c01 in PostgresMain ../src/backend/tcop/postgres.c:4551 #14 0x564d7ca441a4 in PostgresSingleUserMain ../src/backend/tcop/postgres.c:4028 #15 0x564d7c74d883 in main ../src/backend/main/main.c:197 #16 0x7fde7793dd09 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x23d09) #17 0x564d7c2d30c9 in _start 
(/tmp/cirrus-ci-build/build/tmp_install/usr/local/pgsql/bin/postgres+0x8530c9) Greetings, Andres Freund
Re: Error-safe user functions
On Wed, Dec 7, 2022 at 10:34 AM Andres Freund wrote: > > +{ oid => '8053', > > + descr => 'get error message if string is not valid input for data > type', > > + proname => 'pg_input_invalid_message', provolatile => 's', > > + prorettype => 'text', proargtypes => 'text regtype int4', > > + prosrc => 'pg_input_invalid_message_mod' }, > > + > > Is there a guarantee that input functions are stable or immutable? We don't > have any volatile input functions in core PG: > > SELECT provolatile, count(*) FROM pg_proc WHERE oid IN (SELECT typinput > FROM pg_type) GROUP BY provolatile; > > Effectively yes, though I'm not sure if it is formally documented or otherwise enforced by the system. The fact we allow stable is a bit of a sore spot, volatile would be a terrible property for an I/O function. David J.
Re: [PATCH] pg_dump: lock tables in batches
Hi, On 2022-12-07 12:28:03 -0500, Tom Lane wrote: > Andres Freund writes: > > On 2022-12-07 10:44:33 -0500, Tom Lane wrote: > >> I have a strong sense of deja vu here. I'm pretty sure I experimented > >> with this idea last year and gave up on it. I don't recall exactly > >> why, but either it didn't show any meaningful performance improvement > >> for me or there was some actual downside (that I'm not remembering > >> right now). > > > IIRC the case we were looking at around 989596152 were CPU bound workloads, > > rather than latency bound workloads. It'd not be surprising to have cases > > where batching LOCKs helps latency, but not CPU bound. > > Yeah, perhaps. Anyway my main point is that I don't want to just assume > this is a win; I want to see some actual performance tests.

FWIW, one can simulate network latency with 'netem' on linux. Works even for 'lo'.

ping -c 3 -n localhost
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.035 ms
64 bytes from ::1: icmp_seq=2 ttl=64 time=0.049 ms
64 bytes from ::1: icmp_seq=3 ttl=64 time=0.043 ms

tc qdisc add dev lo root netem delay 10ms
64 bytes from ::1: icmp_seq=1 ttl=64 time=20.1 ms
64 bytes from ::1: icmp_seq=2 ttl=64 time=20.2 ms
64 bytes from ::1: icmp_seq=3 ttl=64 time=20.2 ms

tc qdisc delete dev lo root netem
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.036 ms
64 bytes from ::1: icmp_seq=2 ttl=64 time=0.047 ms
64 bytes from ::1: icmp_seq=3 ttl=64 time=0.050 ms

> > I wonder if "manual" batching is the best answer. Alexander, have you > > considered using libpq level pipelining? > > I'd be a bit nervous about how well that works with older servers. I don't think there should be any problem - E.g. pgjdbc has been using pipelining for ages. Not sure if it's the right answer, just to be clear. I suspect that eventually we're going to need to have a special "acquire pg_dump locks" function that is cheaper than retail lock acquisition and perhaps deals more gracefully with exceeding max_locks_per_transaction. 
Which would presumably not be pipelined. Greetings, Andres Freund
Re: Error-safe user functions
Hi, On 2022-12-06 15:21:09 -0500, Tom Lane wrote: > +{ oid => '8050', descr => 'test whether string is valid input for data type', > + proname => 'pg_input_is_valid', provolatile => 's', prorettype => 'bool', > + proargtypes => 'text regtype', prosrc => 'pg_input_is_valid' }, > +{ oid => '8051', descr => 'test whether string is valid input for data type', > + proname => 'pg_input_is_valid', provolatile => 's', prorettype => 'bool', > + proargtypes => 'text regtype int4', prosrc => 'pg_input_is_valid_mod' }, > +{ oid => '8052', > + descr => 'get error message if string is not valid input for data type', > + proname => 'pg_input_invalid_message', provolatile => 's', > + prorettype => 'text', proargtypes => 'text regtype', > + prosrc => 'pg_input_invalid_message' }, > +{ oid => '8053', > + descr => 'get error message if string is not valid input for data type', > + proname => 'pg_input_invalid_message', provolatile => 's', > + prorettype => 'text', proargtypes => 'text regtype int4', > + prosrc => 'pg_input_invalid_message_mod' }, > + Is there a guarantee that input functions are stable or immutable? We don't have any volatile input functions in core PG: SELECT provolatile, count(*) FROM pg_proc WHERE oid IN (SELECT typinput FROM pg_type) GROUP BY provolatile; Greetings, Andres Freund
Re: [PATCH] pg_dump: lock tables in batches
Andres Freund writes: > On 2022-12-07 10:44:33 -0500, Tom Lane wrote: >> I have a strong sense of deja vu here. I'm pretty sure I experimented >> with this idea last year and gave up on it. I don't recall exactly >> why, but either it didn't show any meaningful performance improvement >> for me or there was some actual downside (that I'm not remembering >> right now). > IIRC the case we were looking at around 989596152 were CPU bound workloads, > rather than latency bound workloads. It'd not be surprising to have cases > where batching LOCKs helps latency, but not CPU bound. Yeah, perhaps. Anyway my main point is that I don't want to just assume this is a win; I want to see some actual performance tests. > I wonder if "manual" batching is the best answer. Alexander, have you > considered using libpq level pipelining? I'd be a bit nervous about how well that works with older servers. regards, tom lane
Re: Error-safe user functions
"David G. Johnston" writes: > So long as you aren't opposed to the idea if someone else does the work, > adding sect2 is better than nothing even if it is just a stop-gap measure. OK, we can agree on that. As for the other point --- not sure why I didn't remember this right off, but the point of two test functions is that one exercises the code path with details_wanted = true while the other exercises details_wanted = false. A combined function would only test the first case. regards, tom lane
Re: Error-safe user functions
Corey Huinker writes: > In my attempt to implement CAST...DEFAULT, I noticed that I immediately > needed an > OidInputFunctionCallSafe, which was trivial but maybe something we want to > add to the infra patch, but the comments around that function also somewhat > indicate that we might want to just do the work in-place and call > InputFunctionCallSafe directly. Open to both ideas. I'm a bit skeptical of that. IMO using OidInputFunctionCall is only appropriate in places that will be executed just once per query. Otherwise, unless you have zero concern for performance, you should be caching the function lookup. (The test functions in my 0003 patch illustrate the standard way to do that within SQL-callable functions. If you're implementing CAST as a new kind of executable expression, the lookup would likely happen in expression compilation.) I don't say that OidInputFunctionCallSafe won't ever be useful, but I doubt it's what we want in CAST. regards, tom lane
Re: Error-safe user functions
On 2022-12-07 09:20:33 -0500, Tom Lane wrote: > Returning to the naming quagmire -- it occurred to me just now that > it might be helpful to call this style of error reporting "soft" > errors rather than "safe" errors, which'd provide a nice contrast > with "hard" errors thrown by longjmp'ing. +1
Re: [PATCH] pg_dump: lock tables in batches
Hi, On 2022-12-07 10:44:33 -0500, Tom Lane wrote: > Aleksander Alekseev writes: > > What he proposes is taking the locks in batches. > > I have a strong sense of deja vu here. I'm pretty sure I experimented > with this idea last year and gave up on it. I don't recall exactly > why, but either it didn't show any meaningful performance improvement > for me or there was some actual downside (that I'm not remembering > right now). > This would've been in the leadup to 989596152 and adjacent commits. > I took a quick look through the threads cited in those commit messages > and didn't find anything about it, but I think the discussion had > gotten scattered across more threads. Some digging in the archives > could be useful. IIRC the case we were looking at around 989596152 were CPU bound workloads, rather than latency bound workloads. It'd not be surprising to have cases where batching LOCKs helps latency, but not CPU bound. I wonder if "manual" batching is the best answer. Alexander, have you considered using libpq level pipelining? Greetings, Andres Freund
Re: Error-safe user functions
On Wed, Dec 7, 2022 at 9:59 AM Tom Lane wrote: > "David G. Johnston" writes: > > > Are you suggesting we should not go down the path that v8-0003 does in > the > > monitoring section cleanup thread? I find the usability of Chapter 54 > > System Views to be superior to these two run-on chapters and would rather > > we emulate it in both these places - for what is in the end very little > > additional effort, all mechanical in nature. > > I have not been following that thread, and am not really excited about > putting in a huge amount of documentation work here. I'd just like 9.26 > to have a mini-TOC at the page head, which 's would be enough for. > > So long as you aren't opposed to the idea if someone else does the work, adding sect2 is better than nothing even if it is just a stop-gap measure. David J.
Re: Error-safe user functions
On Wed, Dec 7, 2022 at 9:20 AM Tom Lane wrote: > Andrew Dunstan writes: > > Perhaps we should add a type in the regress library that will never have > > a safe input function, so we can test that the mechanism works as > > expected in that case even after we adjust all the core data types' > > input functions. > > I was intending that the existing "widget" type be that. 0003 already > adds a comment to widget_in saying not to "fix" its one ereport call. > > Returning to the naming quagmire -- it occurred to me just now that > it might be helpful to call this style of error reporting "soft" > errors rather than "safe" errors, which'd provide a nice contrast > with "hard" errors thrown by longjmp'ing. That would lead to naming > all the variant functions XXXSoft not XXXSafe. There would still > be commentary to the effect that "soft errors must be safe, in the > sense that there's no question whether it's safe to continue > processing the transaction". Anybody think that'd be an > improvement? In my attempt to implement CAST...DEFAULT, I noticed that I immediately needed an OidInputFunctionCallSafe, which was trivial but maybe something we want to add to the infra patch, but the comments around that function also somewhat indicate that we might want to just do the work in-place and call InputFunctionCallSafe directly. Open to both ideas. Looking forward cascades up into coerce_type and its brethren, and reimplementing those from a Node returner to a boolean returner with a Node parameter seems a bit of a stretch, so I have to pick a point where the code pivots from passing down a safe-mode indicator and passing back a found_error indicator (which may be combine-able, as safe is always true when the found_error pointer is not null, and always false when it isn't), but for the most part things look do-able.
Re: Error-safe user functions
"David G. Johnston" writes: > On Wed, Dec 7, 2022 at 9:06 AM Tom Lane wrote: >> "David G. Johnston" writes: >>> Why not do away with two separate functions and define a composite type >>> (boolean, text) for is_valid to return? >> I don't see any advantage to that. It would be harder to use in both >> use-cases. > I don't really see a use case for either of them individually. Uh, several people opined that pg_input_is_valid would be of field interest. If I thought these were only for testing purposes I wouldn't be especially concerned about documenting them at all. > Are you suggesting we should not go down the path that v8-0003 does in the > monitoring section cleanup thread? I find the usability of Chapter 54 > System Views to be superior to these two run-on chapters and would rather > we emulate it in both these places - for what is in the end very little > additional effort, all mechanical in nature. I have not been following that thread, and am not really excited about putting in a huge amount of documentation work here. I'd just like 9.26 to have a mini-TOC at the page head, which 's would be enough for. regards, tom lane
Re: pg_upgrade: Make testing different transfer modes easier
On 02.12.22 13:04, Daniel Gustafsson wrote: Wouldn't it be possible, and less change-code-manual, to accept this via an extension to PROVE_FLAGS? Any options after :: to prove are passed to the test(s) [0] so we could perhaps inspect @ARGV for the mode if we invent a new way to pass arguments. Something along the lines of the untested sketch below in the pg_upgrade test:

+# Optionally set the file transfer mode for the tests via arguments to PROVE
+my $mode = (@ARGV);
+$mode = '--copy' unless defined;

.. together with an extension to Makefile.global.in ..

- $(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS) $(if $(PROVE_TESTS),$(PROVE_TESTS),t/*.pl)
+ $(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS) $(if $(PROVE_TESTS),$(PROVE_TESTS),t/*.pl) $(PROVE_TEST_ARGS)

.. should *I think* allow for passing the mode to the tests via: make -C src/bin/pg_upgrade check PROVE_TEST_ARGS=":: --link" I think this might be a lot of complication to get working robustly and in the different build systems. Plus, what happens if you run all the test suites and want to pass some options to pg_upgrade and some to another test? I think if we want to make this configurable on the fly, an environment variable would be much easier, like my $mode = $ENV{PG_TEST_PG_UPGRADE_MODE} || '--copy';
Re: pg_upgrade: Make testing different transfer modes easier
On 02.12.22 01:56, Kyotaro Horiguchi wrote: also thought about something like a "mode" option with an argument, but given that we already have --link and --clone, this seemed the most sensible.) Thoughts? When I read up to the point of the --copy option, what came to my mind was the --mode= option. IMHO, if I was going to add an option to choose the copy behavior, I would add --mode option instead, like pg_ctl does, as it implicitly signals that the suboptions are mutually exclusive. Ok, we have sort of one vote for each variant now. Let's see if there are other opinions.
Re: [DOCS] Stats views and functions not in order?
On Tue, Dec 6, 2022 at 7:57 PM David G. Johnston wrote: > On Tue, Dec 6, 2022 at 6:36 PM Peter Smith wrote: > >> I'd like to "fix" this but IIUC there is no consensus yet about what >> order is best for patch 0001, right? >> >> > I'm planning on performing a more thorough review of 0003 and 0004 > tomorrow. > > Compiled just fine. I do think every row of the views table should be hyperlinked. None of the "xact" ones are for some reason. For the sys/user ones just point to the same place as the corresponding "all" link. pg_stat_subscription_stats needs to be moved up to the "globals" section. There are a bunch of trailing ". See" in the descriptions now that need to be cleaned up. (0002) David J.
Re: Error-safe user functions
On Wed, Dec 7, 2022 at 9:06 AM Tom Lane wrote: > "David G. Johnston" writes: > > Why not do away with two separate functions and define a composite type > > (boolean, text) for is_valid to return? > > I don't see any advantage to that. It would be harder to use in both > use-cases. > I don't really see a use case for either of them individually. If all you are doing is printing them out in a test and checking the result in what situation wouldn't you want to check that both the true/false and message are as expected? Plus, you don't have to figure out a name for the second function. > > >> BTW, does anyone else agree that 9.26 is desperately in need of some > >> subdivisions? It seems to have gotten a lot longer since > >> I looked at it last. > > > I'd be inclined to do something like what we are attempting for Chapter > 28 > > Monitoring Database Activity; introduce pagination through refentry and > > build our own table of contents into it. > > I'd prefer to follow the model that already exists in 9.27, > ie break it up with 's, which provide a handy > sub-table-of-contents. > > I have a bigger issue with the non-pagination myself; the extra bit of effort to manually create a tabular ToC (where we can add descriptions) seems like a worthy price to pay. Are you suggesting we should not go down the path that v8-0003 does in the monitoring section cleanup thread? I find the usability of Chapter 54 System Views to be superior to these two run-on chapters and would rather we emulate it in both these places - for what is in the end very little additional effort, all mechanical in nature. David J.
Re: Think-o in foreign key comments
On 03.12.22 05:59, Ian Lawrence Barwick wrote: On Sat, 3 Dec 2022 at 7:19, Paul Jungwirth wrote: I noticed a few places in the new foreign key code where a comment says "the ON DELETE SET NULL/DELETE clause". I believe it should say "ON DELETE SET NULL/DEFAULT". These comments were added in d6f96ed94e7, "Allow specifying column list for foreign key ON DELETE SET actions." Here is a patch to correct them. LGTM. I do notice the same patch adds the function "validateFkOnDeleteSetColumns" but the name in the comment preceding it is "validateFkActionSetColumns", might as well fix that at the same time. Committed with that addition and backpatched to PG15.
Re: Error-safe user functions
"David G. Johnston" writes: > Why not do away with two separate functions and define a composite type > (boolean, text) for is_valid to return? I don't see any advantage to that. It would be harder to use in both use-cases. >> BTW, does anyone else agree that 9.26 is desperately in need of some >> subdivisions? It seems to have gotten a lot longer since >> I looked at it last. > I'd be inclined to do something like what we are attempting for Chapter 28 > Monitoring Database Activity; introduce pagination through refentry and > build our own table of contents into it. I'd prefer to follow the model that already exists in 9.27, ie break it up with 's, which provide a handy sub-table-of-contents. regards, tom lane
Re: Error-safe user functions
On Wed, Dec 7, 2022 at 8:23 AM Tom Lane wrote: > Andrew Dunstan writes: > > On 2022-12-07 We 09:20, Tom Lane wrote: > >> Returning to the naming quagmire -- it occurred to me just now that > >> it might be helpful to call this style of error reporting "soft" > >> errors rather than "safe" errors, which'd provide a nice contrast > >> with "hard" errors thrown by longjmp'ing. That would lead to naming > >> all the variant functions XXXSoft not XXXSafe. > > > I'm not sure InputFunctionCallSoft would be an improvement. > > Yeah, after reflecting on it a bit more I'm not that impressed with > that as a function name either. > > (I think that "soft error" could be useful as informal terminology. > AFAIR we don't use "hard error" in any formal way either, but there > are certainly comments using that phrase.) > > More questions: > > * Anyone want to bikeshed about the new SQL-level function names? > I'm reasonably satisfied with "pg_input_is_valid" for the bool-returning > variant, but not so much with "pg_input_invalid_message" for the > error-message-returning variant. Thinking about "pg_input_error_message" > instead, but that's not stellar either. > Why not do away with two separate functions and define a composite type (boolean, text) for is_valid to return? > * Where in the world shall we document these, if we document them? > The only section of chapter 9 that seems even a little bit appropriate > is "9.26. System Information Functions and Operators", and even there, > they would need their own new table because they don't fit well in any > existing table. > I would indeed just add a table there. > > BTW, does anyone else agree that 9.26 is desperately in need of some > subdivisions? It seems to have gotten a lot longer since > I looked at it last. > > I'd be inclined to do something like what we are attempting for Chapter 28 Monitoring Database Activity; introduce pagination through refentry and build our own table of contents into it. David J.
Re: [PATCH] pg_dump: lock tables in batches
Aleksander Alekseev writes: > What he proposes is taking the locks in batches. I have a strong sense of deja vu here. I'm pretty sure I experimented with this idea last year and gave up on it. I don't recall exactly why, but either it didn't show any meaningful performance improvement for me or there was some actual downside (that I'm not remembering right now). This would've been in the leadup to 989596152 and adjacent commits. I took a quick look through the threads cited in those commit messages and didn't find anything about it, but I think the discussion had gotten scattered across more threads. Some digging in the archives could be useful. regards, tom lane
Re: Error-safe user functions
On Wed, Dec 7, 2022 at 8:04 AM Andrew Dunstan wrote: > > On 2022-12-07 We 09:20, Tom Lane wrote: > > Andrew Dunstan writes: > >> Perhaps we should add a type in the regress library that will never have > >> a safe input function, so we can test that the mechanism works as > >> expected in that case even after we adjust all the core data types' > >> input functions. > > I was intending that the existing "widget" type be that. 0003 already > > adds a comment to widget_in saying not to "fix" its one ereport call. > > > Yeah, I see that, I must have been insufficiently caffeinated. > > > > > > Returning to the naming quagmire -- it occurred to me just now that > > it might be helpful to call this style of error reporting "soft" > > errors rather than "safe" errors, which'd provide a nice contrast > > with "hard" errors thrown by longjmp'ing. That would lead to naming > > all the variant functions XXXSoft not XXXSafe. There would still > > be commentary to the effect that "soft errors must be safe, in the > > sense that there's no question whether it's safe to continue > > processing the transaction". Anybody think that'd be an > > improvement? > > > > > > > I'm not sure InputFunctionCallSoft would be an improvement. Maybe > InputFunctionCallSoftError would be clearer, but I don't know that it's > much of an improvement either. The same goes for the other visible changes. > > InputFunctionCallSafe -> TryInputFunctionCall I think in create type saying "input functions to handle errors softly" is an improvement over "input functions to return safe errors". start->save->finish describes a soft error handling procedure quite well. safe has baggage, all code should be "safe". 
fmgr/README: "Handling Non-Exception Errors" -> "Soft Error Handling"
"typical safe error conditions include" -> "error conditions that can be handled softly include"
(pg_input_is_valid) "input function has been updated to return "safe" errors" -> "input function has been updated to use soft error handling"
Unrelated observation: "Although the error stack is not large, we don't expect to run out of space." -> "Because the error stack is not large, assume that we will not run out of space and panic if we are wrong."? David J.
Re: [PATCH] pg_dump: lock tables in batches
On Wed, Dec 7, 2022 at 12:09 PM Aleksander Alekseev < aleksan...@timescale.com> wrote: > > Hi hackers, > > A colleague of mine reported a slight inconvenience with pg_dump. > > He is dumping the data from a remote server. There are several > thousands of tables in the database. Making a dump locally and/or > using pg_basebackup and/or logical replication is not an option. So > what pg_dump currently does is sending LOCK TABLE queries one after > another. Every query needs an extra round trip. So if we have let's > say 2000 tables and every round trip takes 100 ms then ~3.5 minutes > are spent in the not most useful way. > > What he proposes is taking the locks in batches. I.e. instead of: > > LOCK TABLE foo IN ACCESS SHARE MODE; > LOCK TABLE bar IN ACCESS SHARE MODE; > > do: > > LOCK TABLE foo, bar, ... IN ACCESS SHARE MODE; > > The proposed patch makes this change. It's pretty straightforward and > as a side effect saves a bit of network traffic too. > +1 for that change. It will improve the dump for databases with thousands of relations. The code LGTM and it passes in all tests and compiles without any warning. Regards, -- Fabrízio de Royes Mello
Re: Error-safe user functions
Andrew Dunstan writes: > On 2022-12-07 We 09:20, Tom Lane wrote: >> Returning to the naming quagmire -- it occurred to me just now that >> it might be helpful to call this style of error reporting "soft" >> errors rather than "safe" errors, which'd provide a nice contrast >> with "hard" errors thrown by longjmp'ing. That would lead to naming >> all the variant functions XXXSoft not XXXSafe. > I'm not sure InputFunctionCallSoft would be an improvement. Yeah, after reflecting on it a bit more I'm not that impressed with that as a function name either. (I think that "soft error" could be useful as informal terminology. AFAIR we don't use "hard error" in any formal way either, but there are certainly comments using that phrase.) More questions: * Anyone want to bikeshed about the new SQL-level function names? I'm reasonably satisfied with "pg_input_is_valid" for the bool-returning variant, but not so much with "pg_input_invalid_message" for the error-message-returning variant. Thinking about "pg_input_error_message" instead, but that's not stellar either. * Where in the world shall we document these, if we document them? The only section of chapter 9 that seems even a little bit appropriate is "9.26. System Information Functions and Operators", and even there, they would need their own new table because they don't fit well in any existing table. BTW, does anyone else agree that 9.26 is desperately in need of some subdivisions? It seems to have gotten a lot longer since I looked at it last. regards, tom lane
[PATCH] pg_dump: lock tables in batches
Hi hackers, A colleague of mine reported a slight inconvenience with pg_dump. He is dumping the data from a remote server. There are several thousands of tables in the database. Making a dump locally and/or using pg_basebackup and/or logical replication is not an option. So what pg_dump currently does is sending LOCK TABLE queries one after another. Every query needs an extra round trip. So if we have let's say 2000 tables and every round trip takes 100 ms then ~3.5 minutes are spent in the not most useful way. What he proposes is taking the locks in batches. I.e. instead of: LOCK TABLE foo IN ACCESS SHARE MODE; LOCK TABLE bar IN ACCESS SHARE MODE; do: LOCK TABLE foo, bar, ... IN ACCESS SHARE MODE; The proposed patch makes this change. It's pretty straightforward and as a side effect saves a bit of network traffic too. Thoughts? -- Best regards, Aleksander Alekseev v1-0001-pg_dump-lock-tables-in-batches.patch Description: Binary data
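The batching idea is simple enough to model outside of pg_dump. Below is a rough Python sketch (not pg_dump's actual C implementation; the helper name and the batch size of 100 are arbitrary assumptions for illustration) showing how the per-table LOCK statements collapse into multi-table ones, eliminating most of the round trips:

```python
def batch_lock_statements(tables, batch_size=100):
    """Group table names into multi-table LOCK TABLE statements.

    Illustrative sketch of the batching idea from the patch: instead of
    one statement (and one round trip) per table, emit one statement per
    batch of tables.
    """
    statements = []
    for i in range(0, len(tables), batch_size):
        chunk = tables[i:i + batch_size]
        statements.append(
            "LOCK TABLE %s IN ACCESS SHARE MODE;" % ", ".join(chunk))
    return statements
```

With 2000 tables and a batch size of 100, this would issue 20 statements instead of 2000 — at 100 ms per round trip, roughly 2 seconds instead of ~3.5 minutes.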
Re: Error-safe user functions
On 2022-12-07 We 09:20, Tom Lane wrote: > Andrew Dunstan writes: >> Perhaps we should add a type in the regress library that will never have >> a safe input function, so we can test that the mechanism works as >> expected in that case even after we adjust all the core data types' >> input functions. > I was intending that the existing "widget" type be that. 0003 already > adds a comment to widget_in saying not to "fix" its one ereport call. Yeah, I see that, I must have been insufficiently caffeinated. > > Returning to the naming quagmire -- it occurred to me just now that > it might be helpful to call this style of error reporting "soft" > errors rather than "safe" errors, which'd provide a nice contrast > with "hard" errors thrown by longjmp'ing. That would lead to naming > all the variant functions XXXSoft not XXXSafe. There would still > be commentary to the effect that "soft errors must be safe, in the > sense that there's no question whether it's safe to continue > processing the transaction". Anybody think that'd be an > improvement? > > I'm not sure InputFunctionCallSoft would be an improvement. Maybe InputFunctionCallSoftError would be clearer, but I don't know that it's much of an improvement either. The same goes for the other visible changes. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: Error-safe user functions
On Wed, Dec 7, 2022 at 7:20 AM Tom Lane wrote: > > Returning to the naming quagmire -- it occurred to me just now that > it might be helpful to call this style of error reporting "soft" > errors rather than "safe" errors, which'd provide a nice contrast > with "hard" errors thrown by longjmp'ing. That would lead to naming > all the variant functions XXXSoft not XXXSafe. There would still > be commentary to the effect that "soft errors must be safe, in the > sense that there's no question whether it's safe to continue > processing the transaction". Anybody think that'd be an > improvement? > > +1 David J.
Re: Allow pageinspect's bt_page_stats function to return a set of rows instead of a single row
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            not tested

Looks good to me

The new status of this patch is: Ready for Committer
Re: Error-safe user functions
Andrew Dunstan writes: > Perhaps we should add a type in the regress library that will never have > a safe input function, so we can test that the mechanism works as > expected in that case even after we adjust all the core data types' > input functions. I was intending that the existing "widget" type be that. 0003 already adds a comment to widget_in saying not to "fix" its one ereport call. Returning to the naming quagmire -- it occurred to me just now that it might be helpful to call this style of error reporting "soft" errors rather than "safe" errors, which'd provide a nice contrast with "hard" errors thrown by longjmp'ing. That would lead to naming all the variant functions XXXSoft not XXXSafe. There would still be commentary to the effect that "soft errors must be safe, in the sense that there's no question whether it's safe to continue processing the transaction". Anybody think that'd be an improvement? regards, tom lane
Re: Allow tests to pass in OpenSSL FIPS mode
On 13.10.22 12:26, Peter Eisentraut wrote: I think that the other md5() computations done in the main regression test suite could just be switched to use one of the sha*() functions as they just want to put their hands on text values. It looks like a few of them have some expectations about the output size and generate_series(), though, but this could be tweaked by making the series shorter, for example. Right, that's the rest of my original patch. I'll come back with an updated version of that. Here is the next step. To contain the scope, I focused on just "make check" for now. This patch removes all incidental calls to md5(), replacing them with sha256(), so that they'd pass with or without FIPS mode. (Two tests would need alternative expected files: md5 and password. I have not included those here.) Some tests inspect the actual md5 result strings or build statistics based on them. I have tried to carefully preserve the meaning of the original tests, to the extent that they could be inferred, in some cases adjusting example values by matching the md5 outputs to the equivalent sha256 outputs. Some cases are tricky or mysterious or both and could use another look.

From 437c7c8b62b5574b017a5d05b5540e219abd6c4a Mon Sep 17 00:00:00 2001
From: Peter Eisentraut
Date: Wed, 7 Dec 2022 13:32:26 +0100
Subject: [PATCH] Remove incidental md5() function uses from main regression tests

Most of these calls were to generate some random data. These can be replaced by appropriately adapted sha256() calls. This will eventually allow these tests to pass in OpenSSL FIPS mode (which does not allow MD5 use). Similar work for other test suites will follow later.
---
 src/test/regress/expected/arrays.out        | 18 +-
 src/test/regress/expected/brin.out          |  4 +-
 src/test/regress/expected/brin_multi.out    |  8 +-
 src/test/regress/expected/compression.out   | 13 +-
 src/test/regress/expected/compression_1.out | 11 +-
 src/test/regress/expected/inherit.out       |  2 +-
 src/test/regress/expected/largeobject.out   |  2 +-
 src/test/regress/expected/matview.out       |  8 +-
 src/test/regress/expected/memoize.out       |  2 +-
 src/test/regress/expected/plpgsql.out       | 24 +-
 src/test/regress/expected/rowsecurity.out   | 591 ++--
 src/test/regress/expected/stats_ext.out     | 14 +-
 src/test/regress/sql/arrays.sql             | 18 +-
 src/test/regress/sql/brin.sql               |  4 +-
 src/test/regress/sql/brin_multi.sql         |  8 +-
 src/test/regress/sql/compression.sql        |  7 +-
 src/test/regress/sql/inherit.sql            |  2 +-
 src/test/regress/sql/largeobject.sql        |  2 +-
 src/test/regress/sql/matview.sql            |  8 +-
 src/test/regress/sql/memoize.sql            |  2 +-
 src/test/regress/sql/plpgsql.sql            |  2 +-
 src/test/regress/sql/rowsecurity.sql        | 14 +-
 src/test/regress/sql/stats_ext.sql          | 14 +-
 23 files changed, 386 insertions(+), 392 deletions(-)

diff --git a/src/test/regress/expected/arrays.out b/src/test/regress/expected/arrays.out
index 97920f38c2..ae269d4f50 100644
--- a/src/test/regress/expected/arrays.out
+++ b/src/test/regress/expected/arrays.out
@@ -2272,20 +2272,20 @@ select * from t1;
 (1 row)
 
 -- Check that arrays of composites are safely detoasted when needed
-create temp table src (f1 text);
+create temp table src (f1 bytea);
 insert into src
-  select string_agg(random()::text,'') from generate_series(1,1);
-create type textandtext as (c1 text, c2 text);
-create temp table dest (f1 textandtext[]);
-insert into dest select array[row(f1,f1)::textandtext] from src;
-select length(md5((f1[1]).c2)) from dest;
+  select string_agg(random()::text::bytea,'') from generate_series(1,1);
+create type byteaandbytea as (c1 bytea, c2 bytea);
+create temp table dest (f1 byteaandbytea[]);
+insert into dest select array[row(f1,f1)::byteaandbytea] from src;
+select length(sha256((f1[1]).c2)) from dest;
 length
 32
(1 row)

 delete from src;
-select length(md5((f1[1]).c2)) from dest;
+select length(sha256((f1[1]).c2)) from dest;
 length
 32
@@ -2293,14 +2293,14 @@ select length(md5((f1[1]).c2)) from dest;
 truncate table src;
 drop table src;
-select length(md5((f1[1]).c2)) from dest;
+select length(sha256((f1[1]).c2)) from dest;
 length
 32
(1 row)

 drop table dest;
-drop type textandtext;
+drop type byteaandbytea;
 -- Tests for polymorphic-array form of width_bucket()
 -- this exercises the varwidth and float8 code paths
 SELECT

diff --git a/src/test/regress/expected/brin.out b/src/test/regress/expected/brin.out
index 73fa38396e..d4a139e50b 100644
--- a/src/test/regress/expected/brin.out
+++ b/src/test/regress/expected/brin.out
@@ -530,7 +530,7 @@ EXPLAIN (COSTS OFF) SELECT * FROM brin_test WHERE b = 1;
 CREATE TABLE brintest_3 (a text, b text, c text, d text);
 -- long random strings (~2000 chars each, so ~6kB for min/max
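One detail worth noting about the expected output in the hunks above: PostgreSQL's md5() returns a 32-character hex string (text), while sha256() returns a 32-byte bytea, so length() happens to report 32 in both cases and the expected "32" survives the switch. A quick Python sanity check of those sizes, using hashlib as a stand-in for the SQL functions:

```python
import hashlib

data = b"some row data"  # arbitrary stand-in for a table value

# md5() in SQL yields the digest as 32 hex characters (text);
# sha256() yields a raw 32-byte digest (bytea). length() is 32 either way.
md5_text_len = len(hashlib.md5(data).hexdigest())      # 32 hex chars
sha256_bytes_len = len(hashlib.sha256(data).digest())  # 32 raw bytes
```

Had the test compared against sha256's hex form instead, the expected length would have become 64 and the expected files would have needed more churn.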
Re: Error-safe user functions
On 2022-12-06 Tu 15:21, Tom Lane wrote: > OK, here's a v3 responding to the comments from Andres. Looks pretty good to me. > > is preliminary refactoring of elog.c, with (I trust) no > functional effect. It gets rid of some pre-existing code duplication > as well as setting up to let 0001's additions be less duplicative. > > 0001 adopts use of Node pointers in place of "void *". To do this > I needed an alias type in elog.h equivalent to fmgr.h's fmNodePtr. > I decided that having two different aliases would be too confusing, > so what I did here was to converge both elog.h and fmgr.h on using > the same alias "typedef struct Node *NodePtr". That has to be in > elog.h since it's included first, from postgres.h. (I thought of > defining NodePtr in postgres.h, but postgres.h includes elog.h > immediately so that wouldn't have looked very nice.) > > I also adopted Andres' recommendation that InputFunctionCallSafe > return boolean. I'm still not totally sold on that ... but it does > end with array_in and record_in never using SAFE_ERROR_OCCURRED at > all, so maybe the idea's OK. Originally I wanted to make the new function look as much like the original as possible, but I'm not wedded to that either. I can live with it like this. > > 0002 adjusts the I/O functions for these API changes, and fixes > my silly oversight about error cleanup in record_in. > > Given the discussion about testing requirements, I threw away the > COPY hack entirely. This 0003 provides a couple of SQL-callable > functions that can be used to invoke a specific datatype's input > function. I haven't documented them, pending bikeshedding on > names etc. I also arranged to test array_in and record_in with > a datatype that still throws errors, reserving the existing test > type "widget" for that purpose. > > (I'm not intending to foreclose development of new COPY features > in this area, just abandoning the idea that that's our initial > test mechanism.) 
> The new functions on their own are likely to make plenty of people quite happy once we've adjusted all the input functions. Perhaps we should add a type in the regress library that will never have a safe input function, so we can test that the mechanism works as expected in that case even after we adjust all the core data types' input functions. Otherwise I think we're good to go. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: ANY_VALUE aggregate
On Wed, Dec 7, 2022 at 1:58 AM Pantelis Theodosiou wrote: > On Tue, Dec 6, 2022 at 4:57 AM David G. Johnston > wrote: > ... > > > > > > I'm referring to the query: > > > > select any_value(v order by v) from (values (2),(1),(3)) as vals (v); > > // produces 1, per the documented implementation-defined behavior. > > > > Someone writing: > > > > select any_value(v) from (values (2),(1),(3)) as vals (v) order by v; > > > > Is not presently, nor am I saying, promised the value 1. > > > > Shouldn't the 2nd query be producing an error, as it has an implied > GROUP BY () - so column v cannot appear (unless aggregated) in SELECT > and ORDER BY? > Right, that should be written as: select any_value(v) from (values (2),(1),(3) order by 1) as vals (v); (you said SELECT; the discussion here is that any_value is going to be added as a new aggregate function) David J.
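For illustration, the implementation-defined behavior discussed above — an aggregate-level ORDER BY making any_value return the first value in sort order, while a query-level ORDER BY promises nothing — can be modeled with a small hypothetical helper (this is only a model of the documented behavior, not how the aggregate would actually be implemented):

```python
def any_value_ordered(values):
    """Model of any_value(v ORDER BY v): when the aggregate itself
    carries an ORDER BY, the (implementation-defined) result is the
    first input in sort order. Hypothetical helper for illustration."""
    return sorted(values)[0]
```

So for the inputs (2), (1), (3) the aggregate-level ORDER BY form yields 1, matching the behavior described in the thread.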
Re: Allow placeholders in ALTER ROLE w/o superuser
On Wed, Dec 7, 2022 at 1:28 AM Pavel Borisov wrote: > On Tue, 6 Dec 2022 at 19:01, Alexander Korotkov wrote: > > > > On Mon, Dec 5, 2022 at 10:32 PM Alexander Korotkov > > wrote: > > > On Mon, Dec 5, 2022 at 8:18 PM Tom Lane wrote: > > > > Alvaro Herrera writes: > > > > > I couldn't find any discussion of the idea of adding "(s)" to the > > > > > variable name in order to mark the variable userset in the catalog, > > > > > and > > > > > I have to admit I find it a bit strange. Are we really agreed that > > > > > that's the way to proceed? > > > > > > > > I hadn't been paying close attention to this thread, sorry. > > > > > > > > I agree that that seems like a very regrettable choice, > > > > especially if you anticipate having to bump catversion anyway. > > > > > > I totally understand that this change requires a catversion bump. > > > I've reflected this in the commit message. > > > > > > > Better to add a bool column to the catalog. > > > > > > What about adding a boolean array to the pg_db_role_setting? So, > > > pg_db_role_setting would have the following columns. > > > * setdatabase oid > > > * setrole oid > > > * setconfig text[] > > > * setuser bool[] > > > > The revised patch implements this way for storing the USER SET flag. > > I think it really became more structured and less cumbersome. > > I agree that the patch became more structured and the complications > for string parameter suffixing have gone away. I've looked it through > and don't see problems with it. The only two-lines fix regarding > variable initializing may be relevant (see v9). Tests pass and CI is > also happy with it. I'd like to set it ready for committer if no > objections. Thank you, Pavel. I've made a few minor improvements in the docs and comments. I'm going to push this if no objections. -- Regards, Alexander Korotkov 0001-Add-USER-SET-parameter-values-for-pg_db_role_set-v10.patch Description: Binary data
RE: Perform streaming logical transactions by background workers and parallel apply
On Wednesday, December 7, 2022 7:51 PM Masahiko Sawada wrote: > > On Mon, Dec 5, 2022 at 1:29 PM houzj.f...@fujitsu.com > wrote: > > > > On Sunday, December 4, 2022 7:17 PM houzj.f...@fujitsu.com > > > > > > > Thursday, December 1, 2022 8:40 PM Amit Kapila > > > > wrote: > > > > Some other comments: > > > ... > > > Attach the new version patch set which addressed most of the comments > > > received so far except some comments being discussed[1]. > > > [1] > https://www.postgresql.org/message-id/OS0PR01MB57167BF64FC0891734C > 8E81A94149%40OS0PR01MB5716.jpnprd01.prod.outlook.com > > > > Attach a new version patch set which fixed a testcase failure on CFbot. > > --- > If a value of max_parallel_apply_workers_per_subscription is not > sufficient, we get the LOG "out of parallel apply workers" every time > when the apply worker doesn't launch a worker. But do we really need > this log? It seems not consistent with > max_sync_workers_per_subscription behavior. I think we can check if > the number of running parallel workers is less than > max_parallel_apply_workers_per_subscription before calling > logicalrep_worker_launch(). What do you think? (Sorry, I missed this comment in the last email) I personally feel giving a hint might help users realize that the max_parallel_applyxxx is not enough for the current workload, and then they can adjust the parameter. Otherwise, users might not have an easy way to check if more workers are needed. Best regards, Hou zj
RE: Perform streaming logical transactions by background workers and parallel apply
On Wednesday, December 7, 2022 7:51 PM Masahiko Sawada wrote: > > On Mon, Dec 5, 2022 at 1:29 PM houzj.f...@fujitsu.com > wrote: > > > > On Sunday, December 4, 2022 7:17 PM houzj.f...@fujitsu.com > > > > > > > Thursday, December 1, 2022 8:40 PM Amit Kapila > > > > wrote: > > > > Some other comments: > > > ... > > > Attach the new version patch set which addressed most of the comments > > > received so far except some comments being discussed[1]. > > > [1] > https://www.postgresql.org/message-id/OS0PR01MB57167BF64FC0891734C > 8E81A94149%40OS0PR01MB5716.jpnprd01.prod.outlook.com > > > > Attach a new version patch set which fixed a testcase failure on CFbot. > > Here are some comments on v56 0001, 0002 patches. Please ignore > comments if you already incorporated them in v57. Thanks for the comments! > +static void > +ProcessParallelApplyInterrupts(void) > +{ > +CHECK_FOR_INTERRUPTS(); > + > +if (ShutdownRequestPending) > +{ > +ereport(LOG, > +(errmsg("logical replication parallel > apply worker for subscrip > tion \"%s\" has finished", > +MySubscription->name))); > + > +apply_worker_clean_exit(false); > +} > + > +if (ConfigReloadPending) > +{ > +ConfigReloadPending = false; > +ProcessConfigFile(PGC_SIGHUP); > +} > +} > > I personally think that we don't need to have a function to do only > these few things. I thought that introducing a new function makes the handling of worker-specific interrupts similar to other existing ones, like ProcessWalRcvInterrupts() in walreceiver.c and HandlePgArchInterrupts() in pgarch.c ... > > Should we change the names to something like > LOGICALREP_STREAM_PARALLEL? Agreed, will change.
> --- > + * The lock graph for the above example will look as follows: > + * LA (waiting to acquire the lock on the unique index) -> PA (waiting to > + * acquire the lock on the remote transaction) -> LA > > and > > + * The lock graph for the above example will look as follows: > + * LA (waiting to acquire the transaction lock) -> PA-2 (waiting to acquire > the > + * lock due to unique index constraint) -> PA-1 (waiting to acquire the > stream > + * lock) -> LA > > "(waiting to acquire the lock on the remote transaction)" in the first > example and "(waiting to acquire the stream lock)" in the second > example is the same meaning, right? If so, I think we should use > either term for consistency. Will change. > --- > +bool write_abort_info = (data->streaming == > SUBSTREAM_PARALLEL); > > I think that instead of setting write_abort_info every time when > pgoutput_stream_abort() is called, we can set it once, probably in > PGOutputData, at startup. Since we already have a "stream" flag in PGOutputData, I am not sure it would be better to introduce another flag for the same option. > --- > server_version = walrcv_server_version(LogRepWorkerWalRcvConn); > options.proto.logical.proto_version = > +server_version >= 16 ? > LOGICALREP_PROTO_STREAM_PARALLEL_VERSION_NUM : > server_version >= 15 ? > LOGICALREP_PROTO_TWOPHASE_VERSION_NUM : > server_version >= 14 ? > LOGICALREP_PROTO_STREAM_VERSION_NUM : > LOGICALREP_PROTO_VERSION_NUM; > > Instead of always using the new protocol version, I think we can use > LOGICALREP_PROTO_TWOPHASE_VERSION_NUM if the streaming is not > 'parallel'. That way, we don't need to change protocl version check > logic in pgoutput.c and don't need to expose defGetStreamingMode(). > What do you think? I think that some users can also use the new version number when trying to get changes (via pg_logical_slot_peek_binary_changes or other functions), so I feel leaving the check for the new version number seems fine.
Besides, even if we don't use the new version number, I feel we still need
defGetStreamingMode() to check whether parallel mode is in use, because we
need to send abort_lsn when parallel is in use. I might be missing
something, sorry for that. Could you please explain the idea a bit?

> ---
> When max_parallel_apply_workers_per_subscription is changed to a value
> lower than the number of parallel workers running at that time, do we
> need to stop extra workers?

I think we can do this, e.g. by adding a check in the main loop of the
leader worker that runs each time after reloading the configuration. OTOH,
we also stop a worker after it finishes a transaction, so I am slightly
unsure whether we need another check here. But I am fine with adding it if
you think it would be better.

> ---
> If a value of max_parallel_apply_workers_per_subscription is not
> sufficient, we get the LOG "out of parallel apply workers" every time
> when the apply worker doesn't launch a worker. But do we really need
> this log? It seems not consistent with
> max_sync_workers_per_subscription
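The "stop extra workers after reload" idea discussed above could look roughly like the sketch below. All names here are hypothetical, not the actual PostgreSQL worker-pool API; the only point is the policy: idle workers are asked to exit until the pool fits the new limit, while workers mid-transaction are left to finish and be stopped afterwards.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical slot in the leader's parallel apply worker pool. */
typedef struct
{
	bool		running;
	bool		in_transaction;
} ApplyWorkerSlot;

/*
 * Called by the leader after a SIGHUP lowers
 * max_parallel_apply_workers_per_subscription: signal idle workers to
 * exit cleanly until we are within the new limit.  Returns the number of
 * workers still running.
 */
static int
trim_worker_pool(ApplyWorkerSlot *pool, int npool, int new_max)
{
	int			running = 0;

	for (int i = 0; i < npool; i++)
		if (pool[i].running)
			running++;

	for (int i = 0; i < npool && running > new_max; i++)
	{
		if (pool[i].running && !pool[i].in_transaction)
		{
			pool[i].running = false;	/* ask the worker to exit cleanly */
			running--;
		}
	}
	return running;
}
```

A worker that is mid-transaction stays above the limit temporarily and is stopped once its transaction finishes, matching the existing "stop the worker after finishing a transaction" behavior mentioned above.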
Re: Perform streaming logical transactions by background workers and parallel apply
On Wed, Dec 7, 2022 at 4:31 PM Amit Kapila wrote:
> On Wed, Dec 7, 2022 at 10:10 AM Masahiko Sawada wrote:
> > On Wed, Dec 7, 2022 at 1:29 PM Amit Kapila wrote:
> > >
> > > Right, but the leader will anyway exit at some point either due to an
> > > ERROR like "lost connection ... to parallel worker" or with a LOG
> > > like: "... will restart because of a parameter change" but I see your
> > > point. So, will it be better if we have a LOG message here and then
> > > proc_exit()? Do you have something else in mind for this?
> >
> > No, I was thinking that too. It's better to write a LOG message and do
> > proc_exit().
> >
> > Regarding the error "lost connection ... to parallel worker", it could
> > still happen depending on the timing even if the parallel worker
> > cleanly exits due to parameter changes, right? If so, I'm concerned
> > that it could lead to disabling the subscription unexpectedly if
> > disable_on_error is enabled.
>
> If we want to avoid this then I think we have the following options:
> (a) parallel apply skips checking parameter change; (b) parallel worker
> won't exit on parameter change but will silently absorb the parameter
> and continue its processing; anyway, the leader will detect it and
> stop the worker for the parameter change.
>
> Among these, the advantage of (b) is that it will allow reflecting a
> parameter change (that doesn't need a restart) in the parallel worker.
> Do you have any better idea to deal with this?

I think (b) is better. We need to reflect the synchronous_commit parameter
also in parallel workers in the worker pool.

Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
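Option (b) above amounts to the sketch below. All names are illustrative, not the actual PostgreSQL code; the point is only the shape of the behavior: on reload the parallel worker absorbs reloadable settings such as synchronous_commit and never exits on its own, so the leader cannot hit a spurious "lost connection ... to parallel worker" error (and disable_on_error cannot trip on it).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-worker settings absorbed on SIGHUP. */
typedef struct
{
	int			synchronous_commit;
} WorkerSettings;

/*
 * Option (b): absorb the reloadable setting silently and keep applying
 * changes.  Always returns false, meaning "do not exit"; restarting for
 * subscription-parameter changes is left entirely to the leader.
 */
static bool
worker_handle_reload(WorkerSettings *ws, int new_synchronous_commit)
{
	ws->synchronous_commit = new_synchronous_commit;	/* absorb silently */
	return false;
}
```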
Re: Support logical replication of DDLs
I think this patch is split badly. You have:

0001 an enormous patch including some required infrastructure, plus the
     DDL deparsing bits themselves.
0002 another enormous (though not as much) patch, this time for DDL
     replication using the above.
0003 a bugfix for 0001, which includes changes in both the infrastructure
     and the deparsing bits.
0004 test stuff for 0002.
0005 another bugfix for 0001.
0006 another bugfix for 0001.

As presented, I think it has very little chance of being reviewed usefully.
A better way to go about this, I think, would be:

0001 - infrastructure bits to support the DDL deparsing parts (all these
new functions in ruleutils.c, sequence.c, etc). That means, everything (?)
that's currently in your 0001 except ddl_deparse.c and friends. Clearly
there are several independent changes here; maybe it is possible to break
it down even further. This patch or these patches should also include the
parts of 0003, 0005, 0006 that require changes outside of ddl_deparse.c.
I expect that this patch should be fairly small.

0002 - ddl_deparse.c and its very close friends. This should not have any
impact on places such as ruleutils.c, sequence.c, etc. The parts of the
bugfixes (0003, 0005, 0006) that touch this code should be merged here as
well; there's no reason to have them as separate patches. Some test code
should be here also, though it probably doesn't need to aim to be complete.
This one is likely to be very large, but also self-contained.

0003 - ddlmessage.c and friends. I understand that DDL-messaging is
supporting infrastructure for DDL replication; I think it should be its
own patch. Probably include its own simple-ish test bits. Not a very large
patch.

0004 - DDL replication proper, including the tests currently in your 0004.
Probably not a very large patch either, not sure.

Some review comments, just skimming:

- 0002 adds some functions to event_trigger.c, but that doesn't seem to be
  their place. Maybe some new file in src/backend/replication/logical would
  make more sense.
- publication_deparse_ddl_command_end has a long strcmp() list; why? Maybe
  change things so that it compares some object type enum instead.

- CreatePublication has a long list of command tags; is that good? Maybe
  it'd be better to annotate the list in cmdtaglist.h somehow.

- The change in pg_dump's getPublications needs to be updated for
  version 16.

- Don't "git add" src/bin/pg_waldump/logicalddlmsgdesc.c, just update its
  Makefile and meson.build.

- I think psql's \dRp should not have the new column at the end. Maybe one
  of:
  + Name | Owner | DDL | All tables | Inserts | Updates | Deletes | Truncates | Via root
  + Name | Owner | All tables | DDL | Inserts | Updates | Deletes | Truncates | Via root
  + Name | Owner | All tables | Inserts | Updates | Deletes | Truncates | DDL | Via root
  (I would not add the "s" at the end of that column title, either.)

--
Álvaro Herrera        Breisgau, Deutschland        https://www.EnterpriseDB.com/
"Las cosas son buenas o malas segun las hace nuestra opinión" (Lisias)
Re: moving extraUpdatedCols out of RangeTblEntry (into ModifyTable)
On Wed, Dec 7, 2022 at 8:54 PM Amit Langote wrote:
> Per Alvaro's advice, forking this from [1].
>
> In that thread, Tom had asked if it wouldn't be better to find a new
> place to put extraUpdatedCols [2] instead of RangeTblEntry, along with
> the permission-checking fields that are now no longer stored in
> RangeTblEntry.
>
> In [3] of the same thread, I proposed to move it into a List of
> Bitmapsets in ModifyTable, as implemented in the attached patch that I
> had been posting to that thread.
>
> The latest version of that patch is attached herewith. I'll add this
> one to the January CF too.

Done. https://commitfest.postgresql.org/41/4049/

--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com
Directly associate primary key with user-defined tablespace
Hi all!

PG by default always creates the primary key's index in the default
tablespace, unless you tell it to use an existing index that is defined in
a user-defined tablespace. We can create indexes in user-defined
tablespaces, so why can't we create a primary key in a user-defined
tablespace without having to associate it with an index? Something like:

ALTER TABLE myschema.mytable ADD PRIMARY KEY (akey) TABLESPACE mytablespace;

Regards,
Michael Vitale
moving extraUpdatedCols out of RangeTblEntry (into ModifyTable)
Per Alvaro's advice, forking this from [1].

In that thread, Tom had asked if it wouldn't be better to find a new place
to put extraUpdatedCols [2] instead of RangeTblEntry, now that the
permission-checking fields are no longer stored in RangeTblEntry.

In [3] of the same thread, I proposed to move it into a List of Bitmapsets
in ModifyTable, as implemented in the attached patch that I had been
posting to that thread.

The latest version of that patch is attached herewith. I'll add this one
to the January CF too.

--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com

[1] https://www.postgresql.org/message-id/ca+hiwqgjjdmuhdsfv-u2qhkjjt9st7xh9jxc_irsaq1taus...@mail.gmail.com
[2] https://www.postgresql.org/message-id/3098829.1658956718%40sss.pgh.pa.us
[3] https://www.postgresql.org/message-id/CA%2BHiwqEHoLgN%3DvSsaNMaHP-%2BqYPT40-ooySyrieXZHNzbSBj0w%40mail.gmail.com

v1-0001-Add-per-result-relation-extraUpdatedCols-to-Modif.patch
Description: Binary data