RE: Bug in query rewriter - hasModifyingCTE not getting set
From: Tom Lane
> Greg Nancarrow writes:
> > On Sun, Feb 7, 2021 at 10:03 AM Tom Lane wrote:
> >> I think either the bit about rule_action is unnecessary, or most of
> >> the code immediately above this is wrong, because it's only updating
> >> flags in sub_action. Why do you think it's necessary to change
> >> rule_action in addition to sub_action?
>
> > I believe that the bit about rule_action IS necessary, as it's needed
> > for the case of INSERT...SELECT, so that hasModifyingCTE is set on the
> > rewritten INSERT (see comment above the call to getInsertSelectQuery(),
> > and the "KLUDGE ALERT" comment within that function).
>
> Hm. So after looking at this more, the problem is that the rewrite is
> producing something equivalent to
>
>     INSERT INTO bug6051_2
>       (WITH t1 AS (DELETE FROM bug6051 RETURNING *) SELECT * FROM t1);
>
> If you try to do that directly, the parser will give you the raspberry:
>
>     ERROR: WITH clause containing a data-modifying statement must be at the top level
>     LINE 2: (WITH t1 AS (DELETE FROM bug6051 RETURNING *) SELECT * FROM ...
>             ^
>
> The code throwing that error, in analyzeCTE(), explains:
>
>     /*
>      * We disallow data-modifying WITH except at the top level of a query,
>      * because it's not clear when such a modification should be executed.
>      */
>
> That semantic issue doesn't get any less pressing just because the query
> was generated by rewrite. So I now think that what we have to do is throw
> an error if we have a modifying CTE and sub_action is different from
> rule_action. Not quite sure how to phrase the error though.

I am +1 for throwing an error if we have a modifying CTE and sub_action is
different from rule_action. Since we already disallow data-modifying CTEs
that are not at the top level of a query, it is safe and consistent to
disallow the same case here.

Maybe we can output a message like the following?

"DO INSTEAD INSERT ... SELECT rules are not supported for INSERT
statements that contain data-modifying statements in WITH."

Best regards,
houzj
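For reference, a minimal sketch of the scenario (assuming the bug6051
tables from the regression tests that the rewritten query above refers to):

    CREATE TEMP TABLE bug6051 AS SELECT i FROM generate_series(1, 3) AS t(i);
    CREATE TEMP TABLE bug6051_2 (i int);
    CREATE RULE bug6051_ins AS ON INSERT TO bug6051 DO INSTEAD
        INSERT INTO bug6051_2 VALUES (NEW.i);
    -- the rewriter turns this into the INSERT ... (WITH ...) form shown above
    WITH t1 AS (DELETE FROM bug6051 RETURNING *)
    INSERT INTO bug6051 SELECT * FROM t1;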
Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?
On Wed, May 19, 2021 at 8:15 PM David Rowley wrote:
> On Mon, 17 May 2021 at 14:52, Andy Fan wrote:
> > Would marking the new added RestrictInfo.norm_selec > 1 be OK?
>
> There would be cases you'd want to not count the additional clauses in
> the selectivity estimation and there would be cases you would want to.
>
> For example:
>
> SELECT ... FROM t1 INNER JOIN t2 ON t1.dt = t2.dt
> WHERE t1.dt BETWEEN 'date1' AND 'date2';
>
> If you derived that t2.dt is also BETWEEN 'date1' AND 'date2' then
> you'd most likely want to include those quals for scans feeding merge,
> hash and non-parameterized nested loop joins, so you'd also want to
> count them in your selectivity estimations, else you'd feed junk
> values into the join selectivity estimations.

Yes, you are correct.

> Parameterized nested loop joins might be different, as if you were
> looking up an index for t1.dt values on some index on t2.dt, then
> you'd likely not want to bother also filtering out the between clause
> values too. They're redundant in that case.

I do not truly understand this.

> I imagined we'd have some functions in equivclass.c that allow you to
> choose if you wanted the additional filters or not.

Sounds like a good idea.

> Tom's example, WHERE a = b AND a IN (1,2,3), if a and b were in the
> same relation then you'd likely never want to include the additional
> quals. The only reason I could think that it would be a good idea is
> if "b" had an index but "a" didn't. I've not checked the code, but
> the index matching code might already allow that to work anyway.

+1 for this feature overall.

--
Best Regards
Andy Fan (https://www.aliyun.com/)
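For the archives, a concrete sketch of the example above (table
definitions invented):

    CREATE TABLE t1 (dt date);
    CREATE TABLE t2 (dt date);
    EXPLAIN SELECT *
    FROM t1 INNER JOIN t2 ON t1.dt = t2.dt
    WHERE t1.dt BETWEEN '2021-01-01' AND '2021-01-31';
    -- today only the scan on t1 gets the date filter; the proposal would
    -- let the planner also derive and consider
    --     t2.dt BETWEEN '2021-01-01' AND '2021-01-31'
    -- for scans feeding merge, hash and non-parameterized nestloop joins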
Re[3]: On login trigger: take three
Hi,
I have upgraded the patch for the 14th version.

> Tuesday, 16 March 2021, 14:32 +03:00 from Ivan Panchenko:
>
> Hi,
>
> Thank you, Konstantin, for this very good feature with numerous use cases.
> Please find the modified patch attached.
>
> I've added the 'enable_client_connection_trigger' GUC to the sample
> config file and also an additional example page to the docs.
> Check-world has passed and it is ready for committer.
>
>> Thursday, 28 January 2021, 12:04 +03:00 from Konstantin Knizhnik
>> <k.knizh...@postgrespro.ru>:
>>
>> On 28.01.2021 5:47, Amit Kapila wrote:
>>> On Mon, Dec 28, 2020 at 5:46 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
>>>> On Sat, Dec 26, 2020 at 4:04 PM Pavel Stehule <pavel.steh...@gmail.com> wrote:
>>>>> On Sat, 26 Dec 2020 at 8:00, Pavel Stehule <pavel.steh...@gmail.com> wrote:
>>>>>> Hi
>>>>>>
>>>>>>> Thank you.
>>>>>>> I have applied all your fixes in on_connect_event_trigger-12.patch.
>>>>>>>
>>>>>>> Concerning the enable_client_connection_trigger GUC, I think that it
>>>>>>> is really useful: it is the fastest and simplest way to disable login
>>>>>>> triggers in case of some problems with them (not only for the
>>>>>>> superuser itself, but for all users). Yes, it can also be done using
>>>>>>> "ALTER EVENT TRIGGER DISABLE". But assume that you have a lot of
>>>>>>> databases with different login policies enforced by on-login event
>>>>>>> triggers, and you want to temporarily disable them all, for example
>>>>>>> for testing purposes. In this case the GUC is the most convenient
>>>>>>> way to do it.
>>>>>>
>>>>>> There was a typo in the patch:
>>>>>>
>>>>>> diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
>>>>>> index f810789..8861f1b 100644
>>>>>> --- a/doc/src/sgml/config.sgml
>>>>>> +++ b/doc/src/sgml/config.sgml
>>>>>> @@ -1,4 +1,4 @@
>>>>>> -
>>>>>> +\
>>>>>>
>>>>>> I do not have any objections against functionality or design. I
>>>>>> tested the performance, and there is no negative impact when this
>>>>>> feature is not used. There is significant overhead related to plpgsql
>>>>>> runtime initialization, but when this trigger is used, then probably
>>>>>> some other PL/pgSQL procedures and functions will be used too, and
>>>>>> then this overhead can be ignored.
>>>>>>
>>>>>> * make without warnings
>>>>>> * make check-world passed
>>>>>> * doc build passed
>>>>>>
>>>>>> Possible ToDo:
>>>>>>
>>>>>> The documentation could contain a note that using connect triggers in
>>>>>> environments with short-lived sessions and very short fast queries,
>>>>>> without use of PL/pgSQL functions or procedures, can have a negative
>>>>>> impact on performance due to the overhead of initialization of the
>>>>>> PL/pgSQL engine.
>>>>>>
>>>>>> I'll mark this patch as ready for committers
>>>>>
>>>>> looks so this patch has no entry in commitfest app 2021-01
>>>>
>>>> Yeah, please register this patch before the next CommitFest[1] starts,
>>>> 2021-01-01 AoE[2].
>>>
>>> Konstantin, did you register this patch in any CF? Even though the
>>> reviewer seems to be happy with the patch, I am afraid that we might
>>> lose track of this unless we register it.
>>
>> Yes, certainly:
>> https://commitfest.postgresql.org/31/2900/
>>
>> --
>> Konstantin Knizhnik
>> Postgres Professional: http://www.postgrespro.com
>> The Russian Postgres Company

diff --git a/doc/src/sgml/bki.sgml b/doc/src/sgml/bki.sgml
index b33e59d5e4..2bb5804e76 100644
--- a/doc/src/sgml/bki.sgml
+++ b/doc/src/sgml/bki.sgml
@@ -182,7 +182,7 @@
 { oid = '1', oid_symbol = 'TemplateDbOid', descr = 'database\'s default template',
   datname = 'template1', encoding = 'ENCODING', datcollate = 'LC_COLLATE',
-  datctype = 'LC_CTYPE', datistemplate = 't', datallowconn = 't',
+  datctype = 'LC_CTYPE', datistemplate = 't', datallowconn = 't', dathaslogontriggers = 'f',
   datconnlimit = '-1', datlastsysoid = '0', datfrozenxid = '0', datminmxid = '1',
   dattablespace = 'pg_default', datacl = '_null_' },
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 6d06ad22b9..7e0113f1e8 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -2968,6 +2968,17 @@ SCRAM-SHA-256$iteration count:
+
+      dathaslogontriggers bool
+
+      Indicates that there are client connection triggers defined for this database.
+      This flag is used to avoid extra lookup of pg_event_trigger table on each backend startup.
+      This flag is used internally by Postgres and should not be manually changed by DBA or application.
+
+
       datconnlimit int4
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 7e32b0686c..d6d9b3eb34 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1035,6 +1035,24 @@ include_dir 'conf.d'
+
+      enable_client_connection_trigger (boolean)
+
+      enable_client_connection_trigger
Re: Transactions involving multiple postgres foreign servers, take 2
On 2021/05/11 13:37, Masahiko Sawada wrote:
> I've attached the updated patches that incorporated comments from
> Zhihong and Ikeda-san.

Thanks for updating the patches!

I have other comments, including trivial things.

a. about the "foreign_transaction_resolver_timeout" parameter

Now, the default value of "foreign_transaction_resolver_timeout" is 60
secs. Is there any reason? Although the following is a minor case, it may
confuse some users. An example case:

1. a client executes a transaction with 2PC while the resolver is
   processing FdwXactResolverProcessInDoubtXacts().
2. the resolution of the 1st transaction must wait until the other 2PC
   transactions are executed, or until the timeout.
3. if the client checks the result value of the 1st transaction, it must
   wait until resolution is finished to get atomic visibility (although
   this depends on how atomic visibility is realized).

The client may be made to wait up to
"foreign_transaction_resolver_timeout". Users may think it's stale. A
situation like this can be observed after testing with pgbench: some
unresolved transactions remain after benchmarking.

I assume that this default value refers to wal_sender, archiver, and so
on. But I think this parameter is more like "commit_delay". If so, 60
seconds seems to be a big value.

b. about the performance bottleneck (just sharing my simple benchmark
results)

The resolver process can easily become a performance bottleneck, although
I think some users will want this feature even if the performance is not
so good. I tested with a very simple workload on my laptop. The test
conditions were:

* two remote foreign partitions, and one transaction inserting an entry
  into each partition.
* local connections only. If NW latency became higher, the performance
  became worse.
* pgbench with 8 clients.

The test results are the following. The performance with 2PC is only 10%
of the performance without 2PC.

* with foreign_twophase_commit = required
  -> under a load of more than 10 TPS, the number of unresolved foreign
     transactions keeps increasing, and the run stops with the warning
     "Increase max_prepared_foreign_transactions".
* with foreign_twophase_commit = disabled
  -> 122 TPS in my environment.

c. v36-0001-Introduce-transaction-manager-for-foreign-transa.patch

* typo: s/tranasction/transaction/

* Is it better to move AtEOXact_FdwXact() in AbortTransaction() to before
  "if (IsInParallelMode())", to make the order the same as in
  CommitTransaction()?

* function names in fdwxact.c

  Although this depends on my feeling, "xact" means transaction. If your
  feeling is the same, the function names FdwXactRegisterXact and so on
  are odd to me. Would FdwXactRegisterEntry or FdwXactRegisterParticipant
  be better?

* Are the following better?

  - s/to register the foreign transaction by/to register the foreign
    transaction participant by/
  - s/The registered foreign transactions/The registered participants/
  - s/given foreign transaction/given foreign transaction participant/
  - s/Foreign transactions involved in the current transaction/Foreign
    transaction participants involved in the current transaction/

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION
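For reference, the benchmark setup was roughly shaped like the following
(a sketch: server and table names are invented, and foreign_twophase_commit
comes from the patch, not from core):

    CREATE TABLE t (id int, v int) PARTITION BY HASH (id);
    CREATE FOREIGN TABLE t_0 PARTITION OF t
        FOR VALUES WITH (MODULUS 2, REMAINDER 0) SERVER remote1;
    CREATE FOREIGN TABLE t_1 PARTITION OF t
        FOR VALUES WITH (MODULUS 2, REMAINDER 1) SERVER remote2;
    SET foreign_twophase_commit = 'required';
    -- each pgbench transaction inserts one row into each remote partition,
    -- so commit requires 2PC across both foreign servers (id values must
    -- hash to different partitions; chosen here for illustration only)
    INSERT INTO t VALUES (1, 1), (2, 1);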
Re: Refactor "mutually exclusive options" error reporting code in parse_subscription_options
On Wed, May 19, 2021 at 6:13 PM Bharath Rupireddy wrote:
> Thanks. I will work on the new structure ParseSubOption only for
> subscription options.

PSA v2 patch that has changes for 1) the new ParseSubOptions structure and
2) the error reporting code refactoring.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

From b72336b7dc803c4ebd49d25850c0b15d8fbdbbed Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy
Date: Thu, 20 May 2021 09:03:37 +0530
Subject: [PATCH v2] Refactor parse_subscription_options

Currently the parse_subscription_options function receives a lot (14) of
input parameters, which makes it inextensible when adding new parameters.
So, better wrap all the input parameters within a ParseSubOptions
structure to which new parameters can be added easily.

Also, remove similar code (with the only difference in the option) when
throwing errors for mutually exclusive options. Have a variable for the
option and use it in the error message. It saves some LOC and makes the
code look better with fewer ereport(ERROR) statements.
---
 src/backend/commands/subscriptioncmds.c | 348 +---
 1 file changed, 187 insertions(+), 161 deletions(-)

diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 8aa6de1785..feb9074436 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -51,6 +51,26 @@ static void check_duplicates_in_publist(List *publist, Datum *datums);
 static List *merge_publications(List *oldpublist, List *newpublist, bool addpub, const char *subname);
 static void ReportSlotConnectionError(List *rstates, Oid subid, char *slotname, char *err);
 
+/*
+ * Structure to hold subscription options for parsing
+ */
+typedef struct ParseSubOptions
+{
+	List	   *stmt_options;
+	bool	   *connect;
+	bool	   *enabled_given;
+	bool	   *enabled;
+	bool	   *create_slot;
+	bool	   *slot_name_given;
+	char	  **slot_name;
+	bool	   *copy_data;
+	char	  **synchronous_commit;
+	bool	   *refresh;
+	bool	   *binary_given;
+	bool	   *binary;
+	bool	   *streaming_given;
+	bool	   *streaming;
+} ParseSubOptions;
 
 /*
  * Common option parsing function for CREATE and ALTER SUBSCRIPTION commands.
@@ -60,16 +80,7 @@
  * accommodate that.
  */
 static void
-parse_subscription_options(List *options,
-						   bool *connect,
-						   bool *enabled_given, bool *enabled,
-						   bool *create_slot,
-						   bool *slot_name_given, char **slot_name,
-						   bool *copy_data,
-						   char **synchronous_commit,
-						   bool *refresh,
-						   bool *binary_given, bool *binary,
-						   bool *streaming_given, bool *streaming)
+parse_subscription_options(ParseSubOptions *opts)
 {
 	ListCell   *lc;
 	bool		connect_given = false;
@@ -78,45 +89,46 @@ parse_subscription_options(List *options,
 	bool		refresh_given = false;
 
 	/* If connect is specified, the others also need to be. */
-	Assert(!connect || (enabled && create_slot && copy_data));
+	Assert(!opts->connect || (opts->enabled && opts->create_slot &&
+							  opts->copy_data));
 
-	if (connect)
-		*connect = true;
-	if (enabled)
+	if (opts->connect)
+		*opts->connect = true;
+	if (opts->enabled)
 	{
-		*enabled_given = false;
-		*enabled = true;
+		*opts->enabled_given = false;
+		*opts->enabled = true;
 	}
-	if (create_slot)
-		*create_slot = true;
-	if (slot_name)
+	if (opts->create_slot)
+		*opts->create_slot = true;
+	if (opts->slot_name)
 	{
-		*slot_name_given = false;
-		*slot_name = NULL;
+		*opts->slot_name_given = false;
+		*opts->slot_name = NULL;
 	}
-	if (copy_data)
-		*copy_data = true;
-	if (synchronous_commit)
-		*synchronous_commit = NULL;
-	if (refresh)
-		*refresh = true;
-	if (binary)
+	if (opts->copy_data)
+		*opts->copy_data = true;
+	if (opts->synchronous_commit)
+		*opts->synchronous_commit = NULL;
+	if (opts->refresh)
+		*opts->refresh = true;
+	if (opts->binary)
 	{
-		*binary_given = false;
-		*binary = false;
+		*opts->binary_given = false;
+		*opts->binary = false;
 	}
-	if (streaming)
+	if (opts->streaming)
 	{
-		*streaming_given = false;
-		*streaming = false;
+		*opts->streaming_given = false;
+		*opts->streaming = false;
 	}
 
 	/* Parse options */
-	foreach(lc, options)
+	foreach(lc, opts->stmt_options)
 	{
 		DefElem    *defel = (DefElem *) lfirst(lc);
 
-		if (strcmp(defel->defname, "connect") == 0 && connect)
+		if (strcmp(defel->defname, "connect") == 0 && opts->connect)
 		{
 			if (connect_given)
 				ereport(ERROR,
@@ -124,19 +136,19 @@ parse_subscription_options(List *options,
 						 errmsg("conflicting or redundant options")));
 
 			connect_given = true;
-			*connect = defGetBoolean(defel);
+			*opts->connect = defGetBoolean(defel);
 		}
-		else if (strcmp(defel->defname, "enabled") == 0 && enabled)
+		else if (strcmp(defel->defname, "enabled") == 0 && opts->enabled)
 		{
-			if (*enabled_given)
+			if (*opts->enabled_given)
 				ereport(ERROR,
Adaptive Plan Sharing for PreparedStmt
Currently we are using a custom/generic plan strategy to handle the data
skew issue. However, it doesn't work well all the time. For example:

    SELECT * FROM t WHERE a BETWEEN $1 AND $2;

We assume the selectivity is 0.0025, but users may provide a large range
every time. Per our current strategy, a generic plan will be chosen, and
an index scan on a will be chosen. Oops..

I think something like Oracle's Adaptive Cursor Sharing should work. It
first calculates the selectivity with the real bind values, and then
generates/reuses different plans based on the similarity of the
selectivity. The challenges I can think of now are:

a). How to define the similarity.
b). How to adjust the similarity during the real run. For example, say we
    treat [1% ~ 10%] as similar, but we find that selectivity 20% used the
    same plan as 10%. What should be done here?

I am searching for the best place to invest in the optimizer aspect, and
the above idea is the one I can think of now. Any thoughts?

Thanks

--
Best Regards
Andy Fan (https://www.aliyun.com/)
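For illustration, the problematic behaviour can be reproduced with the
stock plancache machinery (a sketch):

    CREATE TABLE t (a int);
    CREATE INDEX ON t (a);
    PREPARE q AS SELECT * FROM t WHERE a BETWEEN $1 AND $2;
    -- after the first five executions a generic plan may be cached, whose
    -- default range selectivity ignores the actual bind values:
    EXPLAIN EXECUTE q(1, 1000000);
    -- SET plan_cache_mode = force_custom_plan;  -- today's manual workaround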
Re: Query about time zone patterns in to_char
+1 for the change. I quickly reviewed the patch and overall it looks good
to me. A few cosmetic suggestions:

1:
+RESET timezone;
+
+ CREATE TABLE TIMESTAMPTZ_TST (a int , b timestamptz);

Extra line.

2:
+SET timezone = '00:00';
+SELECT to_char(now(), 'of') as "Of", to_char(now(), 'tzh:tzm') as "tzh:tzm";

"O" should be small in the alias, just for consistency.

I am not sure whether we should backport this or not, but I don't see any
issues with back-patching.

On Sun, May 16, 2021 at 9:43 PM Nitin Jadhav wrote:
>
> > AFAICS, table 9.26 specifically shows which case-variants are supported.
> > If there are some others that happen to work, we probably shouldn't
> > remove them for fear of breaking poorly-written apps ... but that does
> > not imply that we need to support every case-variant.
>
> Thanks for the explanation. I also feel that we may not support every
> case-variant. But the other reason which triggered me to think the other
> way is that, as mentioned in commit [1] where this feature was added,
> these format patterns are compatible with Oracle, whereas Oracle supports
> both upper case and lower case patterns. I just wanted to get this point
> confirmed before concluding.
>
> [1] -
> commit 11b623dd0a2c385719ebbbdd42dd4ec395dcdc9d
> Author: Andrew Dunstan
> Date:   Tue Jan 9 14:25:05 2018 -0500
>
>     Implement TZH and TZM timestamp format patterns
>
>     These are compatible with Oracle and required for the datetime template
>     language for jsonpath in an upcoming patch.
>
>     Nikita Glukhov and Andrew Dunstan, reviewed by Pavel Stehule.
>
> Thanks & Regards,
> Nitin Jadhav
>
> On Sun, May 16, 2021 at 8:40 PM Tom Lane wrote:
> >
> > Nitin Jadhav writes:
> > > While understanding the behaviour of the to_char() function as
> > > explained in [1], I observed that some patterns related to time zones
> > > do not display values if we mention them in lower case. As shown in
> > > the sample output [2], the time zone related patterns TZH, TZM and OF
> > > output proper values when specified in upper case but do not work if
> > > we mention them in lower case. But other patterns like TZ, HH, etc.
> > > work fine with upper case as well as lower case.
> >
> > > I would like to know whether the current behaviour of TZH, TZM and OF
> > > is done intentionally and is as expected.
> >
> > AFAICS, table 9.26 specifically shows which case-variants are supported.
> > If there are some others that happen to work, we probably shouldn't
> > remove them for fear of breaking poorly-written apps ... but that does
> > not imply that we need to support every case-variant.
> >
> >                         regards, tom lane

--
Thanks & Regards,
Suraj kharage,
edbpostgres.com
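For reference, the behaviour under discussion (using a fixed offset, as in
the test above):

    SET timezone = '00:00';
    SELECT to_char(now(), 'TZH:TZM');  -- upper case prints the offset
    SELECT to_char(now(), 'tzh:tzm');  -- lower case does not print it
    SELECT to_char(now(), 'TZ'), to_char(now(), 'tz');  -- both variants work
    RESET timezone;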
Re: Installation of regress.so?
Michael Paquier writes:
> Could it be possible to install regress.so at least in the same
> location as pg_regress?

I don't think this is a great idea. Aside from the fact that we'd be
littering the install tree with a .so of no use to end users, I'm failing
to see how it really gets you anywhere unless you want to further require
regress.so from back versions to be loadable into the current server.

			regards, tom lane
Installation of regress.so?
Hi all,

While diving into a transformation of the tests of pg_upgrade to TAP, I am
getting annoyed by the fact that regress.so is needed if you upgrade an
older instance that holds the regression objects from the main regression
test suite. The buildfarm code uses a trick to copy regress.so from the
source code tree of the old instance into its installation. See in
PGBuild/Modules/TestUpgradeXversion.pm:

    # at some stage we stopped installing regress.so
    copy "$self->{pgsql}/src/test/regress/regress.so",
        "$installdir/lib/postgresql/regress.so"
        unless (-e "$installdir/lib/postgresql/regress.so");

This creates a hard dependency on the source code of the old instance when
attempting to create an old instance based on a dump, which is what the
buildfarm does, and something that I'd like to get support for in the TAP
tests of pg_upgrade in the tree.

Could it be possible to install regress.so at least in the same location
as pg_regress? This would still require the test to move regress.so into a
location from where the backend could load the library, but at least the
library would be accessible without a dependency on the source tree of the
old instance being upgraded from. To make that really usable, this would
require a backpatch, though..

Thoughts?
--
Michael

signature.asc
Description: PGP signature
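For reference, the dependency is visible in a database restored from the
main regression suite, e.g. with something like:

    SELECT proname, probin FROM pg_proc WHERE probin LIKE '%regress%';

Functions created by the regression suite (int44in, for example) keep a
probin that points at regress.so, which is why the library must be
reachable when the upgraded cluster is started.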
RE: Re: Parallel scan with SubTransGetTopmostTransaction assert coredump
Hi Greg,

Thanks a lot for your explanation and your fix. I think your fix can
resolve the core dump issue. With your fix, the parallel process resets
the transaction xmin from the ActiveSnapshot. But it changes the
transaction snapshot for all parallel scenarios, and I don't know whether
that brings in other issues. For testing only, I think it is enough.

So, can anybody explain what exactly the difference is between
ActiveSnapshot and TransactionSnapshot in a parallel worker process? Then
maybe we can find a better solution and try to really fix it.

Thanks,
Pengcheng

-----Original Message-----
From: Greg Nancarrow
Sent: May 18, 2021 17:15
To: Pengchengliu
Cc: Andres Freund; PostgreSQL-development
Subject: Re: Re: Parallel scan with SubTransGetTopmostTransaction assert coredump

On Tue, May 18, 2021 at 11:27 AM Pengchengliu wrote:
>
> Hi Greg,
>
>    Actually I am very confused about ActiveSnapshot and TransactionSnapshot.
> I don't know why the main process sends ActiveSnapshot and
> TransactionSnapshot separately. And what is the exact difference between
> them?
>    If you know, could you explain it for me? It will be very appreciated.

In the context of a parallel worker, I am a little confused too, so I
can't explain it either.

It is not really explained in the file
"src\backend\access\transam\README.parallel", which only mentions the
following as part of the state that needs to be copied to each worker:

  - The transaction snapshot.
  - The active snapshot, which might be different from the transaction
    snapshot.

So they might be different, but exactly when and why?

When I debugged a typical parallel-SELECT case, I found that prior to plan
execution, GetTransactionSnapshot() was called and its return value was
stored in both the QueryDesc and the estate (es_snapshot), which was then
pushed on the ActiveSnapshot stack. So by the time InitializeParallelDSM()
was called, the (top) ActiveSnapshot was the last snapshot returned from
GetTransactionSnapshot(). So why InitializeParallelDSM() calls
GetTransactionSnapshot() again is not clear to me (because isn't the
ActiveSnapshot then a potentially earlier snapshot? - which it shouldn't
be, AFAIK. And also, it's then different from the non-parallel case).

>    Before we know them exactly, I think we should not modify the
> TransactionSnapshot to ActiveSnapshot in the main process. If we do, the
> main process should send the ActiveSnapshot only.

I think it would be worth you trying my suggested change (if you have a
development environment, which I assume you have). Sure, IF it was deemed
a proper solution, you'd only send the one snapshot, and adjust
accordingly in ParallelWorkerMain(), but we need not worry about that in
order to test it.

Regards,
Greg Nancarrow
Fujitsu Australia
Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS
Michael Paquier writes:
> On Wed, May 19, 2021 at 04:23:55PM -0400, Tom Lane wrote:
>> * Replace the edata->resultRelInfo field with two fields, one for
>> the original parent and one for the actual/current target. Perhaps
>> this is worth doing, not sure.

> This one sounds more natural to me, though.

OK, I'll give that a try tomorrow.

> May I ask why you are not moving the snapshot pop and push into the
> finish() and create() routines for this patch?

That does need to happen, but I figured I'd leave it to the other patch,
since there are other things to change too for that snapshot problem.
Obviously, whichever patch goes in second will need trivial adjustments,
but I think it's logically clearer that way.

> Also, any thoughts
> about adding the trigger tests from 013_partition.pl to REL_13_STABLE?

Yeah, if this is a pre-existing problem then we should back-port the tests
that revealed it.

			regards, tom lane
Re: wal stats questions
On 2021/05/19 11:40, Fujii Masao wrote:
> Thanks for updating the patch! I modified some comments slightly and
> pushed that version of the patch.

Thanks a lot!

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION
Re: Commitfest app vs. pgsql-docs
On Wed, May 19, 2021 at 03:35:00PM -0400, Tom Lane wrote:
> Magnus Hagander writes:
>> Changing that to look globally can certainly be done. It takes a bit
>> of work I think, as there are no API endpoints today that will do
>> that, but those could be added.
>
> Ah. Personally, I'd settle for it checking -hackers, -docs and -bugs.
> Perhaps there's some case for -general as well.

FWIW, I have seen cases for -general in the past.
--
Michael

signature.asc
Description: PGP signature
Re: Skip partition tuple routing with constant partition key
On Thu, 20 May 2021 at 12:20, tsunakawa.ta...@fujitsu.com wrote:
> Yes, I want to make/keep it possible that application developers can be
> unaware of partitions. I believe that's why David-san, Alvaro-san, and
> you have made great efforts to improve partitioning performance. So,
> I'm +1 for what Hou-san is trying to achieve.
>
> Is there something you're concerned about? The amount and/or complexity
> of added code?

It would be good to see how close Amit's patch gets to the performance of
the original patch on this thread. As far as I can see, the difference is,
aside from the setup code to determine if the partition is constant, that
Amit's patch just requires an additional ExecPartitionCheck() call per
row. That should be pretty cheap when compared to the binary search to
find the partition for a RANGE or LIST partitioned table.

Houzj didn't mention how the table in the test was partitioned, so it's
hard to speculate how many comparisons would be done during a binary
search. Or maybe it was HASH partitioned and there was no binary search.

David
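For what it's worth, a sketch of the kind of workload where this shows up
(Houzj's actual partitioning scheme is unknown, so this is illustrative
only):

    CREATE TABLE t (ts timestamptz, v int) PARTITION BY RANGE (ts);
    CREATE TABLE t_20210501 PARTITION OF t
        FOR VALUES FROM ('2021-05-01') TO ('2021-05-02');
    CREATE TABLE t_20210502 PARTITION OF t
        FOR VALUES FROM ('2021-05-02') TO ('2021-05-03');
    -- bulk load where every row routes to the same partition, so the
    -- per-row binary search always lands on the same answer:
    INSERT INTO t
    SELECT '2021-05-01'::timestamptz + (i % 86400) * interval '1 second', i
    FROM generate_series(1, 1000000) i;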
Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS
On Thu, May 20, 2021 at 5:23 AM Tom Lane wrote:
> Amit Langote writes:
> > IOW, the patch you posted earlier seems like the way to go.
>
> I really dislike that patch. I think it's doubling down on the messy,
> unstructured coding patterns that got us into this situation to begin
> with. I'd prefer to expend a little effort on refactoring so that
> the ExecCleanupTupleRouting call can be moved to the cleanup function
> where it belongs.
>
> So, I propose the attached, which invents a new struct to carry
> the stuff we've discovered to be necessary. This makes the APIs
> noticeably cleaner IMHO.

Larger footprint, but definitely cleaner. Thanks.

> I did not touch the APIs of the apply_XXX_internal functions,
> as it didn't really seem to offer any notational advantage.
> We can't simply collapse them to take an "edata" as I did for
> apply_handle_tuple_routing, because the ResultRelInfo they're
> supposed to operate on could be different from the original one.
> I considered a couple of alternatives:
>
> * Replace their estate arguments with edata, but keep the separate
> ResultRelInfo arguments. This might be worth doing in future, if we
> add more fields to ApplyExecutionData. Right now it'd save nothing,
> and it'd create a risk of confusion about when to use the
> ResultRelInfo argument vs. edata->resultRelInfo.
>
> * Allow apply_handle_tuple_routing to overwrite edata->resultRelInfo
> with the partition child's RRI, then simplify the apply_XXX_internal
> functions to take just edata instead of separate estate and
> resultRelInfo args. I think this would work right now, but it seems
> grotty, and it might cause problems in future.
>
> * Replace the edata->resultRelInfo field with two fields, one for
> the original parent and one for the actual/current target. Perhaps
> this is worth doing, not sure.
>
> Thoughts?

IMHO, it would be better to keep the lowest-level
apply_handle_XXX_internal() out of this, because presumably we're only
cleaning up the mess in higher-level callers.

Somewhat related, one of the intentions behind a04daa97a43, which removed
es_result_relation_info in favor of passing the ResultRelInfo explicitly
to the executor's lower-level functions, was to avoid bugs caused by
failing to set/reset that global field correctly in higher-level callers.
Now worker.c is pretty small compared with the executor, but still it
seems like a good idea to follow the same principle here.

--
Amit Langote
EDB: http://www.enterprisedb.com
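For context, the code paths being refactored here run when the subscriber
applies changes into a partitioned table, roughly like this (names and
connection string invented for illustration):

    -- on the publisher:
    CREATE TABLE tab (a int PRIMARY KEY);
    CREATE PUBLICATION pub FOR TABLE tab;

    -- on the subscriber:
    CREATE TABLE tab (a int PRIMARY KEY) PARTITION BY RANGE (a);
    CREATE TABLE tab_1 PARTITION OF tab FOR VALUES FROM (0) TO (100);
    CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres' PUBLICATION pub;
    -- each replicated change now goes through apply_handle_tuple_routing()
    -- to locate the target partition before the per-tuple apply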
Re: Skip partition tuple routing with constant partition key
On Thu, 20 May 2021 at 01:17, Amit Langote wrote:
> I gave a shot to implementing your idea and ended up with the attached
> PoC patch, which does pass make check-world.

I only had a quick look at this.

+ if ((dispatch->key->strategy == PARTITION_STRATEGY_RANGE ||
+      dispatch->key->strategy == PARTITION_STRATEGY_RANGE))
+     dispatch->lastPartInfo = rri;

I think you must have meant to have one of these as
PARTITION_STRATEGY_LIST?

Wondering what your thoughts are on, instead of caching the last used
ResultRelInfo from the last call to ExecFindPartition(), caching the last
looked-up partition index in PartitionDescData? That way we could cache
lookups between statements. Right now your caching is not going to help
for single-row INSERTs, for example. For multi-level partition hierarchies
that would still require looping and checking the cached value at each
level.

I've not studied the code that builds and rebuilds PartitionDescData, so
there may be some reason we shouldn't do that. I know that's changed a bit
recently with DETACH CONCURRENTLY. However, provided the cached index is
not outside the bounds of the oids array, it shouldn't really matter if
the cached value happens to end up pointing to some other partition. If
that happens, we'll just fail the ExecPartitionCheck() and have to look
for the correct partition.

David
RE: Skip partition tuple routing with constant partition key
From: Amit Langote
> On Tue, May 18, 2021 at 11:11 AM houzj.f...@fujitsu.com wrote:
> > For some big data scenarios, we sometimes transfer data from one table
> > (storing only unexpired data) to another table (historical data) for
> > future analysis.
> > In this case, we import data into the historical table regularly
> > (could be once a day or half a day), and the data is likely to be
> > imported with a date label specified, so all of the data imported in
> > one run belongs to the same partition of a table partitioned by time
> > range.
>
> Is directing that data directly into the appropriate partition not an
> acceptable solution to address this particular use case? Yeah, I know
> we should avoid encouraging users to perform DML directly on
> partitions, but...

Yes, I want to make/keep it possible that application developers can be
unaware of partitions. I believe that's why David-san, Alvaro-san, and you
have made great efforts to improve partitioning performance. So, I'm +1
for what Hou-san is trying to achieve.

Is there something you're concerned about? The amount and/or complexity of
the added code?

Regards
Takayuki Tsunakawa
Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS
On Wed, May 19, 2021 at 02:36:03PM -0400, Andrew Dunstan wrote:
> Yeah, this area needs substantial improvement. I have seen similar sorts
> of nasty hangs, where the script is waiting forever for some process
> that hasn't got the shutdown message. At least we probably need some way
> of making sure the END handler doesn't abort early. Maybe
> PostgresNode::stop() needs a mode that handles failure more gracefully.
> Maybe it needs to try shutting down all the nodes and only call
> BAIL_OUT after trying all of them and getting a failure. But that might
> still leave us work to do on failures occurring pre-END.

For that, we could just make the END block call run_log() directly as
well, as this catches stderr and an error code. What about making the
shutdown a two-phase logic, by the way? Trigger an immediate stop, and if
it fails fall back to an extra kill9() to be on the safe side.

Have you seen this being a problem even in cases where the tests all
passed? If yes, it may be worth using the more aggressive flow even in the
case where the tests pass.
--
Michael

signature.asc
Description: PGP signature
Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS
On Wed, May 19, 2021 at 04:23:55PM -0400, Tom Lane wrote:
> I really dislike that patch. I think it's doubling down on the messy,
> unstructured coding patterns that got us into this situation to begin
> with. I'd prefer to expend a little effort on refactoring so that
> the ExecCleanupTupleRouting call can be moved to the cleanup function
> where it belongs.

Okay.

> I did not touch the APIs of the apply_XXX_internal functions,
> as it didn't really seem to offer any notational advantage.
> We can't simply collapse them to take an "edata" as I did for
> apply_handle_tuple_routing, because the ResultRelInfo they're
> supposed to operate on could be different from the original one.
> I considered a couple of alternatives:
>
> * Replace their estate arguments with edata, but keep the separate
> ResultRelInfo arguments. This might be worth doing in future, if we
> add more fields to ApplyExecutionData. Right now it'd save nothing,
> and it'd create a risk of confusion about when to use the
> ResultRelInfo argument vs. edata->resultRelInfo.

Not sure about this one. It may be better to wait until this gets more
expanded, if it gets expanded.

> * Allow apply_handle_tuple_routing to overwrite edata->resultRelInfo
> with the partition child's RRI, then simplify the apply_XXX_internal
> functions to take just edata instead of separate estate and
> resultRelInfo args. I think this would work right now, but it seems
> grotty, and it might cause problems in future.

Agreed that it does not seem like a good idea to blindly overwrite
resultRelInfo with the partition targeted for the apply.

> * Replace the edata->resultRelInfo field with two fields, one for
> the original parent and one for the actual/current target. Perhaps
> this is worth doing, not sure.

This one sounds more natural to me, though.

> Thoughts?

May I ask why you are not moving the snapshot pop and push into the
finish() and create() routines for this patch? Also, any thoughts about
adding the trigger tests from 013_partition.pl to REL_13_STABLE?
--
Michael

signature.asc
Description: PGP signature
Re: Commitfest app vs. pgsql-docs
On 2021-May-19, Andrew Dunstan wrote:

> It's just a reference after all. So someone supplies a reference to an
> email on an out of the way list. What's the evil that will occur? Not
> much really AFAICT.

... as long as it doesn't leak data from private lists ...

--
Álvaro Herrera       39°49'30"S 73°17'W
Re: Freenode woes
On Wed, May 19, 2021 at 10:19 AM Christoph Berg wrote:
>
> Fwiw, if the PostgreSQL project is considering moving the #postgresql
> IRC channel(s) elsewhere given [1,2], I'm a member of OFTC.net's network
> operations committee and would be happy to help.
>
> [1] https://gist.github.com/aaronmdjones/1a9a93ded5b7d162c3f58bdd66b8f491
> [2] https://fuchsnet.ch/freenode-resign-letter.txt

I've been wondering the same thing; given our relationship with SPI, OFTC
seems like an option worthy of consideration. For those unfamiliar, there
is additional info about the network at https://www.oftc.net

Robert Treat
PostgreSQL Project SPI Liaison
https://xzilla.net
Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS
Amit Langote writes:
> IOW, the patch you posted earlier seems like the way to go.

I really dislike that patch. I think it's doubling down on the messy,
unstructured coding patterns that got us into this situation to begin
with. I'd prefer to expend a little effort on refactoring so that
the ExecCleanupTupleRouting call can be moved to the cleanup function
where it belongs.

So, I propose the attached, which invents a new struct to carry
the stuff we've discovered to be necessary. This makes the APIs
noticeably cleaner IMHO.

I did not touch the APIs of the apply_XXX_internal functions,
as it didn't really seem to offer any notational advantage.
We can't simply collapse them to take an "edata" as I did for
apply_handle_tuple_routing, because the ResultRelInfo they're
supposed to operate on could be different from the original one.
I considered a couple of alternatives:

* Replace their estate arguments with edata, but keep the separate
ResultRelInfo arguments. This might be worth doing in future, if we
add more fields to ApplyExecutionData. Right now it'd save nothing,
and it'd create a risk of confusion about when to use the
ResultRelInfo argument vs. edata->resultRelInfo.

* Allow apply_handle_tuple_routing to overwrite edata->resultRelInfo
with the partition child's RRI, then simplify the apply_XXX_internal
functions to take just edata instead of separate estate and
resultRelInfo args. I think this would work right now, but it seems
grotty, and it might cause problems in future.

* Replace the edata->resultRelInfo field with two fields, one for
the original parent and one for the actual/current target. Perhaps
this is worth doing, not sure.

Thoughts?

			regards, tom lane

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 1432554d5a..c51a83f797 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -178,6 +178,14 @@ static HTAB *xidhash = NULL;
 /* BufFile handle of the current streaming file */
 static BufFile *stream_fd = NULL;
 
+typedef struct ApplyExecutionData
+{
+	EState	   *estate;			/* always needed */
+	ResultRelInfo *resultRelInfo;	/* always needed */
+	ModifyTableState *mtstate;	/* used for partitioned target rel */
+	PartitionTupleRouting *proute;	/* used for partitioned target rel */
+} ApplyExecutionData;
+
 typedef struct SubXactInfo
 {
 	TransactionId xid;			/* XID of the subxact */
@@ -239,8 +247,7 @@ static bool FindReplTupleInLocalRel(EState *estate, Relation localrel,
 									LogicalRepRelation *remoterel,
 									TupleTableSlot *remoteslot,
 									TupleTableSlot **localslot);
-static void apply_handle_tuple_routing(ResultRelInfo *relinfo,
-									   EState *estate,
+static void apply_handle_tuple_routing(ApplyExecutionData *edata,
 									   TupleTableSlot *remoteslot,
 									   LogicalRepTupleData *newtup,
 									   LogicalRepRelMapEntry *relmapentry,
@@ -336,20 +343,21 @@ handle_streamed_transaction(LogicalRepMsgType action, StringInfo s)
 
 /*
  * Executor state preparation for evaluation of constraint expressions,
- * indexes and triggers.
+ * indexes and triggers for the specified relation.
  *
- * resultRelInfo is a ResultRelInfo for the relation to be passed to the
- * executor routines.  The caller must open and close any indexes to be
- * updated independently of the relation registered here.
+ * Note that the caller must open and close any indexes to be updated.
  */
-static EState *
-create_estate_for_relation(LogicalRepRelMapEntry *rel,
-						   ResultRelInfo **resultRelInfo)
+static ApplyExecutionData *
+create_edata_for_relation(LogicalRepRelMapEntry *rel)
 {
+	ApplyExecutionData *edata;
 	EState	   *estate;
 	RangeTblEntry *rte;
+	ResultRelInfo *resultRelInfo;
 
-	estate = CreateExecutorState();
+	edata = (ApplyExecutionData *) palloc0(sizeof(ApplyExecutionData));
+
+	edata->estate = estate = CreateExecutorState();
 
 	rte = makeNode(RangeTblEntry);
 	rte->rtekind = RTE_RELATION;
@@ -358,13 +366,13 @@ create_estate_for_relation(LogicalRepRelMapEntry *rel,
 	rte->rellockmode = AccessShareLock;
 	ExecInitRangeTable(estate, list_make1(rte));
 
-	*resultRelInfo = makeNode(ResultRelInfo);
+	edata->resultRelInfo = resultRelInfo = makeNode(ResultRelInfo);
 
 	/*
 	 * Use Relation opened by logicalrep_rel_open() instead of opening it
 	 * again.
 	 */
-	InitResultRelInfo(*resultRelInfo, rel->localrel, 1, NULL, 0);
+	InitResultRelInfo(resultRelInfo, rel->localrel, 1, NULL, 0);
 
 	/*
 	 * We put the ResultRelInfo in the es_opened_result_relations list, even
@@ -377,29 +385,38 @@ create_estate_for_relation(LogicalRepRelMapEntry *rel,
 	 * an apply operation being responsible for that.
 	 */
 	estate->es_opened_result_relations =
-		lappend(estate->es_opened_result_relations, *resultRelInfo);
+		lappend(estate->es_opened_result_relations, resultRelInfo);
 
 	estate->es_output_cid = GetCurrentCommandId(true);
 
 	/* Prepare to
Re: Commitfest app vs. pgsql-docs
On 5/19/21 3:07 PM, Magnus Hagander wrote:
> On Wed, May 19, 2021 at 7:39 PM Tom Lane wrote:
>> Stephen Frost writes:
>>> * Laurenz Albe (laurenz.a...@cybertec.at) wrote:
>>>> Since we have a "documentation" section in the commitfest, it would
>>>> be useful to allow links to the -docs archives.
>>> ... or get rid of the pgsql-docs mailing list, as has been suggested
>>> before.
>> IIRC, the CF app also rejects threads on pgsql-bugs, which is even
>> more pointlessly annoying. Couldn't we just remove that restriction
>> altogether, and allow anything posted to some pgsql list?
> It's not technically rejecting anything, it's just explicitly looking
> in -hackers and doesn't even know the others exist :)
>
> Changing that to look globally can certainly be done. It takes a bit
> of work I think, as there are no API endpoints today that will do
> that, but those could be added.
>
> But just to be clear -- "some pgsql list" would include things like
> pgsql-general, the pgadmin lists, the non-english regional lists, etc.
> That may be fine, I just want to be sure everybody realizes that's
> what it means. Basically everything on
> https://www.postgresql.org/list/

It's just a reference after all. So someone supplies a reference to an
email on an out of the way list. What's the evil that will occur? Not much
really AFAICT.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Re: Commitfest app vs. pgsql-docs
Magnus Hagander writes:
> On Wed, May 19, 2021 at 7:39 PM Tom Lane wrote:
>> IIRC, the CF app also rejects threads on pgsql-bugs, which is even
>> more pointlessly annoying. Couldn't we just remove that restriction
>> altogether, and allow anything posted to some pgsql list?

> It's not technically rejecting anything, it's just explicitly looking
> in -hackers and doesn't even know the others exist :)
> Changing that to look globally can certainly be done. It takes a bit
> of work I think, as there are no API endpoints today that will do
> that, but those could be added.

Ah. Personally, I'd settle for it checking -hackers, -docs and -bugs.
Perhaps there's some case for -general as well.

			regards, tom lane
Re: pg_rewind fails if there is a read only file.
On 5/19/21 6:43 AM, Paul Guo wrote:
> Several weeks ago I saw this issue in a production environment. The
> read-only file looks like a credential file. Michael told me that
> usually such kinds of files are better kept in non-pgdata directories
> in production environments. Thinking about it further, it seems that
> pg_rewind should be more user-friendly and tolerate such scenarios.
>
> The failure error is "Permission denied" after open(). The reason is
> that open() fails with the below mode in open_target_file():
>
>     mode = O_WRONLY | O_CREAT | PG_BINARY;
>     if (trunc)
>         mode |= O_TRUNC;
>     dstfd = open(dstpath, mode, pg_file_create_mode);
>
> The fix should be quite simple: if open() fails with "Permission
> denied", we try to call chmod to add S_IWUSR just before open(), and
> call chmod to reset the flags soon after open(). A stat() call to get
> the previous st_mode flags is needed.

Presumably the user has a reason for adding the file read-only to the data
directory, and we shouldn't lightly ignore that. Michael's advice is
reasonable. This seems like a case of:

Patient: Doctor, it hurts when I do this.
Doctor: Stop doing that.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Re: Removed extra memory allocations from create_list_bounds
On Tue, May 18, 2021 at 01:49:12PM -0400, Robert Haas wrote:
> I see that you have made a theoretical argument for why this should be
> good for performance, but it would be better to have some test results
> that show that it works out in practice. Sometimes things seem like
> they ought to be more efficient but turn out to be less efficient when
> they are actually tried.

I see this as a code cleanup more than a performance optimization. I
couldn't see a measurable difference in my tests, involving CREATE TABLE
and SELECT. I think some of my patches could *increase* memory use, due to
power-of-two allocation logic. I think it's still a good idea, since it
doesn't seem to be the dominant memory allocation.

On Thu, May 20, 2021 at 12:21:19AM +0530, Nitin Jadhav wrote:
> > I see that you have made a theoretical argument for why this should be
> > good for performance, but it would be better to have some test results
> > that show that it works out in practice. Sometimes things seem like
> > they ought to be more efficient but turn out to be less efficient when
> > they are actually tried.
>
> After this I tried to create 10 partitions and observed the time taken
> to execute. Here is the data for 'with patch'.
>
> postgres@34077=#select 'create table t_' || i || ' partition of t for
> postgres'# values in (' || i || ');'
> postgres-# from generate_series(10001, 10010) i
> postgres-# \gexec

I think you should be sure to do this within a transaction, without
cassert, and maybe with fsync=off full_page_writes=off.

> If we observe above data, we can see the improvement with the patch.
> The average time taken to execute for the last 10 partitions is.

It'd be interesting to know which patch(es) contributed to the
improvement. It's possible that (say) patch 0001 alone gives all the gain,
or that 0002 diminishes the gains. I think there'll be an interest in
committing the smallest possible patch to realize the gains, to minimize
code churn and unrelated changes. LIST and RANGE might need to be checked
separately..

> With respect to memory usage, AFAIK the allocated memory gets cleaned
> during deallocation of the memory used by the memory context. So at
> the end of the query, we may see no difference in the memory usage but
> during the query execution it tries to get less memory than before.

You can check MAXRSS (at least on Linux) if you enable log_executor_stats,
like:

\set QUIET \\ SET client_min_messages=debug; SET log_executor_stats=on;
DROP TABLE p; CREATE TABLE p(i int) PARTITION BY LIST(i);
CREATE TABLE pd PARTITION OF p DEFAULT;
SELECT format('CREATE TABLE p%s PARTITION OF p FOR VALUES IN (%s)', a,a) FROM generate_series(1,999)a;\gexec \\ SELECT;

--
Justin
Re: Removed extra memory allocations from create_list_bounds
> Created a table with one column of type 'int' and partitioned by that
> column. Created 1 million partitions using the following queries.

Sorry, it's not 1 million. It's 10,000 partitions.

--
Thanks & Regards,
Nitin Jadhav

On Thu, May 20, 2021 at 12:21 AM Nitin Jadhav wrote:
>
> > 'git apply' is very picky. Use 'patch -p1' to apply your patches instead.
> >
> > Also, use 'git diff --check' or 'git log --check' before generating
> > patches to send, and fix any whitespace errors before submitting.
>
> Thanks for the suggestion. I will follow these.
>
> > I see that you have made a theoretical argument for why this should be
> > good for performance, but it would be better to have some test results
> > that show that it works out in practice. Sometimes things seem like
> > they ought to be more efficient but turn out to be less efficient when
> > they are actually tried.
>
> Created a table with one column of type 'int' and partitioned by that
> column. Created 1 million partitions using the following queries.
>
> create table t(a int) partition by list(a);
> select 'create table t_' || i || ' partition of t for
>         values in (' || i || ');'
> from generate_series(1, 10000) i
> \gexec
>
> After this I tried to create 10 partitions and observed the time taken
> to execute. Here is the data for 'with patch'.
>
> postgres@34077=#select 'create table t_' || i || ' partition of t for
> postgres'# values in (' || i || ');'
> postgres-# from generate_series(10001, 10010) i
> postgres-# \gexec
> CREATE TABLE
> Time: 36.863 ms
> CREATE TABLE
> Time: 46.645 ms
> CREATE TABLE
> Time: 44.915 ms
> CREATE TABLE
> Time: 39.660 ms
> CREATE TABLE
> Time: 42.188 ms
> CREATE TABLE
> Time: 43.163 ms
> CREATE TABLE
> Time: 44.374 ms
> CREATE TABLE
> Time: 45.117 ms
> CREATE TABLE
> Time: 40.340 ms
> CREATE TABLE
> Time: 38.604 ms
>
> The data for 'without patch' looks like this.
>
> postgres@31718=#select 'create table t_' || i || ' partition of t for
> postgres'# values in (' || i || ');'
> postgres-# from generate_series(10001, 10010) i
> postgres-# \gexec
> CREATE TABLE
> Time: 45.917 ms
> CREATE TABLE
> Time: 46.815 ms
> CREATE TABLE
> Time: 44.180 ms
> CREATE TABLE
> Time: 48.163 ms
> CREATE TABLE
> Time: 45.884 ms
> CREATE TABLE
> Time: 48.330 ms
> CREATE TABLE
> Time: 48.614 ms
> CREATE TABLE
> Time: 48.376 ms
> CREATE TABLE
> Time: 46.380 ms
> CREATE TABLE
> Time: 48.233 ms
>
> If we observe the above data, we can see the improvement with the patch.
> The average time taken to execute the last 10 partitions is:
> With patch    - 42.1869 ms
> Without patch - 47.0892 ms
>
> With respect to memory usage, AFAIK the allocated memory gets cleaned
> during deallocation of the memory used by the memory context. So at
> the end of the query we may see no difference in memory usage, but
> during query execution it tries to get less memory than before.
> Maybe in some worst-case scenario, if there is less memory available,
> we may see 'out of memory' errors without the patch but it may work
> with the patch. I have not done experiments from this angle. I am
> happy to do it if required.
>
> Please share your thoughts.
Re: Commitfest app vs. pgsql-docs
On Wed, May 19, 2021 at 7:39 PM Tom Lane wrote:
>
> Stephen Frost writes:
> > * Laurenz Albe (laurenz.a...@cybertec.at) wrote:
> >> Since we have a "documentation" section in the commitfest, it would
> >> be useful to allow links to the -docs archives.
>
> > ... or get rid of the pgsql-docs mailing list, as has been suggested
> > before.
>
> IIRC, the CF app also rejects threads on pgsql-bugs, which is even
> more pointlessly annoying. Couldn't we just remove that restriction
> altogether, and allow anything posted to some pgsql list?

It's not technically rejecting anything, it's just explicitly looking in
-hackers and doesn't even know the others exist :)

Changing that to look globally can certainly be done. It takes a bit of
work I think, as there are no API endpoints today that will do that, but
those could be added.

But just to be clear -- "some pgsql list" would include things like
pgsql-general, the pgadmin lists, the non-english regional lists, etc.
That may be fine, I just want to be sure everybody realizes that's what it
means. Basically everything on https://www.postgresql.org/list/

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
Re: SSL Tests for sslinfo extension
On 5/19/21 1:01 PM, Dagfinn Ilmari Mannsåker wrote:
> Daniel Gustafsson writes:
>
>> In order to be able to test extensions with SSL connections, allow
>> configure_test_server_for_ssl to create any extensions passed as
>> comma separated list. Each extension is created in all the test
>> databases which may or may not be useful.
>
> Why the comma-separated string, rather than an array reference,
> i.e. `extensions => [qw(foo bar baz)]`? Also, should it use `CREATE
> EXTENSION .. CASCADE`, in case the specified extensions depend on
> others?

Also, instead of one line per db there should be an inner loop over the db
names.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Re: Commitfest app vs. pgsql-docs
On 5/19/21 1:39 PM, Tom Lane wrote:
> Stephen Frost writes:
>> * Laurenz Albe (laurenz.a...@cybertec.at) wrote:
>>> Since we have a "documentation" section in the commitfest, it would
>>> be useful to allow links to the -docs archives.
>> ... or get rid of the pgsql-docs mailing list, as has been suggested
>> before.
> IIRC, the CF app also rejects threads on pgsql-bugs, which is even
> more pointlessly annoying. Couldn't we just remove that restriction
> altogether, and allow anything posted to some pgsql list?

+several

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Re: Removed extra memory allocations from create_list_bounds
> 'git apply' is very picky. Use 'patch -p1' to apply your patches instead.
>
> Also, use 'git diff --check' or 'git log --check' before generating
> patches to send, and fix any whitespace errors before submitting.

Thanks for the suggestion. I will follow these.

> I see that you have made a theoretical argument for why this should be
> good for performance, but it would be better to have some test results
> that show that it works out in practice. Sometimes things seem like
> they ought to be more efficient but turn out to be less efficient when
> they are actually tried.

Created a table with one column of type 'int' and partitioned by that
column. Created 1 million partitions using the following queries.

create table t(a int) partition by list(a);
select 'create table t_' || i || ' partition of t for
        values in (' || i || ');'
from generate_series(1, 10000) i
\gexec

After this I tried to create 10 partitions and observed the time taken to
execute. Here is the data for 'with patch'.

postgres@34077=#select 'create table t_' || i || ' partition of t for
postgres'# values in (' || i || ');'
postgres-# from generate_series(10001, 10010) i
postgres-# \gexec
CREATE TABLE
Time: 36.863 ms
CREATE TABLE
Time: 46.645 ms
CREATE TABLE
Time: 44.915 ms
CREATE TABLE
Time: 39.660 ms
CREATE TABLE
Time: 42.188 ms
CREATE TABLE
Time: 43.163 ms
CREATE TABLE
Time: 44.374 ms
CREATE TABLE
Time: 45.117 ms
CREATE TABLE
Time: 40.340 ms
CREATE TABLE
Time: 38.604 ms

The data for 'without patch' looks like this.

postgres@31718=#select 'create table t_' || i || ' partition of t for
postgres'# values in (' || i || ');'
postgres-# from generate_series(10001, 10010) i
postgres-# \gexec
CREATE TABLE
Time: 45.917 ms
CREATE TABLE
Time: 46.815 ms
CREATE TABLE
Time: 44.180 ms
CREATE TABLE
Time: 48.163 ms
CREATE TABLE
Time: 45.884 ms
CREATE TABLE
Time: 48.330 ms
CREATE TABLE
Time: 48.614 ms
CREATE TABLE
Time: 48.376 ms
CREATE TABLE
Time: 46.380 ms
CREATE TABLE
Time: 48.233 ms

If we observe the above data, we can see the improvement with the patch.
The average time taken to execute the last 10 partitions is:
With patch    - 42.1869 ms
Without patch - 47.0892 ms

With respect to memory usage, AFAIK the allocated memory gets cleaned
during deallocation of the memory used by the memory context. So at the
end of the query we may see no difference in memory usage, but during
query execution it tries to get less memory than before. Maybe in some
worst-case scenario, if there is less memory available, we may see 'out of
memory' errors without the patch but it may work with the patch. I have
not done experiments from this angle. I am happy to do it if required.

Please share your thoughts.

--
Thanks & Regards,
Nitin Jadhav

On Tue, May 18, 2021 at 11:19 PM Robert Haas wrote:
>
> On Tue, May 18, 2021 at 1:29 PM Nitin Jadhav wrote:
> > > The CFBOT had no issues with the patches, so I suspect an issue on
> > > your side.
> > > http://cfbot.cputube.org/nitin-jadhav.html
> >
> > I am getting the following error when I try to apply in my machine.
> >
> > $ git apply
> > ../patches/0001-Removed-extra-memory-allocations-from-create_list_bo.patch
> > ../patches/0001-Removed-extra-memory-allocations-from-create_list_bo.patch:18:
> > trailing whitespace.
>
> 'git apply' is very picky. Use 'patch -p1' to apply your patches instead.
>
> Also, use 'git diff --check' or 'git log --check' before generating
> patches to send, and fix any whitespace errors before submitting.
>
> I see that you have made a theoretical argument for why this should be
> good for performance, but it would be better to have some test results
> that show that it works out in practice. Sometimes things seem like
> they ought to be more efficient but turn out to be less efficient when
> they are actually tried.
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com
Re: pgbench test failing on 14beta1 on Debian/i386
On 5/19/21 6:32 AM, Fabien COELHO wrote: > > >> Confirmed, thanks for looking. I can reproduce it on my machine with >> -m32. It's somewhat annoying that the buildfarm didn't pick it up >> sooner :-( >> >> On Wed, 19 May 2021 at 08:28, Michael Paquier >> wrote: >>> >>> On Wed, May 19, 2021 at 09:06:16AM +0200, Fabien COELHO wrote: I see two simple approaches: (1) use another PRNG inside pgbench, eg Knuth's which was used in some previous submission and is very simple and IMHO better than the rand48 stuff. (2) extend pg_*rand48() to provide an unsigned 64 bits out of the 48 bits state. >>> >>> Or, (3) remove this test? I am not quite sure what there is to gain >>> with this extra test considering all the other tests with permute() >>> already present in this script. >> >> Yes, I think removing the test is the best option. It was originally >> added because there was a separate code path for larger permutation >> sizes that needed testing, but that's no longer the case so the test >> really isn't adding anything. > > Hmmm… > > It is the one test which worked in actually detecting an issue, so I > would not say that it is not adding anything, on the contrary, it did > prove its value! The permute function is expected to be deterministic > on different platforms and architectures, and it is not. > > I agree that removing the test will hide the issue effectively:-) but > ISTM more appropriate to solve the underlying issue and keep the test. > > I'd agree with a two phases approach: drop the test in the short term > and deal with the PRNG later. I'm so unhappy with this 48 bit PRNG > that I may be motivated enough to attempt to replace it, or at least > add a better (faster?? larger state?? same/better quality?) alternative. > Yeah, this does seem to be something that should be fixed rather than hidden. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS
On 5/18/21 11:03 PM, Michael Paquier wrote: > >> 3. Once the subscriber1 postmaster has exited, the TAP >> test will eventually time out, and then this happens: >> >> [.. logs ..] >> >> That is, because we failed to shut down subscriber1, the >> test script neglects to shut down subscriber2, and now >> things just sit indefinitely. So that's a robustness >> problem in the TAP infrastructure, rather than a bug in >> PG proper; but I still say it's a bug that needs fixing. > This one comes down to teardown_node() that uses system_or_bail(), > leaving things unfinished. I guess that we could be more aggressive > and ignore failures if we have a non-zero error code and that not all > the tests have passed within the END block of PostgresNode.pm. Yeah, this area needs substantial improvement. I have seen similar sorts of nasty hangs, where the script is waiting forever for some process that hasn't got the shutdown message. At least we probably need some way of making sure the END handler doesn't abort early. Maybe PostgresNode::stop() needs a mode that handles failure more gracefully. Maybe it needs to try shutting down all the nodes and only calling BAIL_OUT after trying all of them and getting a failure. But that might still leave us work to do on failures occurring pre-END. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: Commitfest app vs. pgsql-docs
> On 19 May 2021, at 19:39, Tom Lane wrote: > > Stephen Frost writes: >> * Laurenz Albe (laurenz.a...@cybertec.at) wrote: >>> Since we have a "documentation" section in the commitfest, it would >>> be useful to allow links to the -docs archives. > >> ... or get rid of the pgsql-docs mailing list, as has been suggested >> before. > > IIRC, the CF app also rejects threads on pgsql-bugs, which is even > more pointlessly annoying. Couldn't we just remove that restriction > altogether, and allow anything posted to some pgsql list? +1. Regardless of the fate of any individual list I think this is the most sensible thing for the CF app. -- Daniel Gustafsson https://vmware.com/
Re: cutting down the TODO list thread
Hi, I let this drop off my radar a few months ago, but I'm going to try to get back into the habit of looking at a few items a week. As before, let me know in the next few days if anyone has thoughts or objections. (Optimizer / Executor) - Consider increasing the default values of from_collapse_limit, join_collapse_limit, and/or geqo_threshold http://archives.postgresql.org/message-id/4136ffa0905210551u22eeb31bn5655dbe7c9a3a...@mail.gmail.com This seems to have been rejected. - Improve use of expression indexes for ORDER BY http://archives.postgresql.org/pgsql-hackers/2009-08/msg01553.php Skimming the thread, I'm not quite sure if index-only scans (not available at the time) solves the problem, or is orthogonal to it? - Modify the planner to better estimate caching effects http://archives.postgresql.org/pgsql-performance/2010-11/msg00117.php Huge discussion. This sounds like a research project, and maybe a risky one. - Allow shared buffer cache contents to affect index cost computations http://archives.postgresql.org/pgsql-hackers/2011-06/msg01140.php Related to the above, but has a more specific approach in mind. The discussion thread is not useful for getting one's head around how to think about the problem, much less to decide if it's worth working on -- it's mostly complaining about the review process. Independent of that, the idea of inspecting the buffer cache seems impractical. -- John Naylor EDB: http://www.enterprisedb.com
Re: Commitfest app vs. pgsql-docs
Stephen Frost writes: > * Laurenz Albe (laurenz.a...@cybertec.at) wrote: >> Since we have a "documentation" section in the commitfest, it would >> be useful to allow links to the -docs archives. > ... or get rid of the pgsql-docs mailing list, as has been suggested > before. IIRC, the CF app also rejects threads on pgsql-bugs, which is even more pointlessly annoying. Couldn't we just remove that restriction altogether, and allow anything posted to some pgsql list? regards, tom lane
Re: libpq_pipeline in tmp_install
Alvaro Herrera writes: > On 2021-May-19, Tom Lane wrote: >> +1, except that you should add documentation for NO_INSTALL to the >> list of definable symbols at the head of pgxs.mk, and to the list >> in extend.sgml (compare that for NO_INSTALLCHECK). > I propose this. WFM. regards, tom lane
Re: Improve documentation for pg_upgrade, standbys and rsync
On Wed, 2021-05-19 at 10:31 -0400, Stephen Frost wrote: > * Laurenz Albe (laurenz.a...@cybertec.at) wrote: > > I recently tried to upgrade a standby following the documentation, > > but I found it hard to understand, [...] > > Haven't had a chance to look at this in depth but improving things here > would be good. > > An additional thing that we should really be mentioning is to tell > people to go in and TRUNCATE all of their UNLOGGED tables before going > through this process, otherwise the rsync will end up spending a bunch > of time copying the files for UNLOGGED relations which you really don't > want. Thanks for the feedback and the suggestion. CCing -hackers so that I can add it to the commitfest. Yours, Laurenz Albe
Re: SSL Tests for sslinfo extension
Daniel Gustafsson writes: > In order to be able to test extensions with SSL connections, allow > configure_test_server_for_ssl to create any extensions passed as > comma separated list. Each extension is created in all the test > databases which may or may not be useful. Why the comma-separated string, rather than an array reference, i.e. `extensions => [qw(foo bar baz)]`? Also, should it use `CREATE EXTENSION .. CASCADE`, in case the specified extensions depend on others? - ilmari -- "The surreality of the universe tends towards a maximum" -- Skud's Law "Never formulate a law or axiom that you're not prepared to live with the consequences of." -- Skud's Meta-Law
Re: Commitfest app vs. pgsql-docs
Greetings, * Laurenz Albe (laurenz.a...@cybertec.at) wrote: > I would like to add a thread on pgsql-docs to the commitfest, but I > found that that cannot be done. > > What is the best way to proceed? > Since we have a "documentation" section in the commitfest, it would > be useful to allow links to the -docs archives. ... or get rid of the pgsql-docs mailing list, as has been suggested before. Thanks, Stephen signature.asc Description: PGP signature
Commitfest app vs. pgsql-docs
I would like to add a thread on pgsql-docs to the commitfest, but I found that that cannot be done. What is the best way to proceed? Since we have a "documentation" section in the commitfest, it would be useful to allow links to the -docs archives. Yours, Laurenz Albe
Re: Inaccurate error message when set fdw batch_size to 0
On Wed, May 19, 2021 at 5:20 PM Fujii Masao wrote: > > ereport(ERROR, > > (errcode(ERRCODE_SYNTAX_ERROR), > -errmsg("%s requires a > non-negative numeric value", > +errmsg("%s must be greater > than or equal to zero", > > def->defname))); > } > else if (strcmp(def->defname, "extensions") == 0) > @@ -142,7 +142,7 @@ postgres_fdw_validator(PG_FUNCTION_ARGS) > if (fetch_size <= 0) > ereport(ERROR, > > (errcode(ERRCODE_SYNTAX_ERROR), > -errmsg("%s requires a > non-negative integer value", > +errmsg("%s must be greater > than zero", > > I'm fine to convert "non-negative" word to "greater than" or "greater than > or equal to" in the messages. But this change also seems to get rid of > the information about the data type of the option from the message. > I'm not sure if this is an improvement. Probably isn't it better to > convert "requires a non-negative integer value" to "must be an integer value > greater than zero"? Thanks for the comments. Done that way. PSA v3 patch. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com v3-0001-Disambiguate-error-messages-that-use-non-negative.patch Description: Binary data
Add PortalDrop in exec_execute_message
Hi, hackers. I've been playing with the "autoprepared" patch, and got the isolation test "freeze-the-dead" stuck on its first VACUUM FREEZE statement. After some research I found that the issue is reproducible with the unmodified master branch if the extended protocol is used. I've prepared a ruby script for demonstration (because ruby-pg has a simple interface to PQsendQueryParams). Further investigation showed that it happens because the portal is not dropped inside exec_execute_message, and it is kept in the third session until COMMIT is called. Therefore the heap page remains pinned, and VACUUM FREEZE gets blocked inside LockBufferForCleanup. It seems that this is usually invisible to common users since either: - the command is run standalone and the transaction is closed immediately, or - the next PQsendQueryParams will initiate another unnamed portal using CreatePortal("", true, true) and this action will drop the previous one. But "freeze-the-dead" remains stuck since the third session cannot send COMMIT until the VACUUM FREEZE finishes. I propose to add PortalDrop at the 'if (completed)' branch of exec_execute_message. --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -2209,6 +2209,8 @@ exec_execute_message(const char *portal_name, long max_rows) if (completed) { + PortalDrop(portal, false); + if (is_xact_command) { With this change 'make check-world' runs without failures (at least on an empty configure with enable-cassert and enable-tap-tests). There is a small chance that applications exist which abuse seekable portals with the 'execute' protocol message, so not every completed portal can be safely dropped. But I believe there are some sane conditions that cover the common case. For example, isn't an empty-name check enough? Can a client reset or seek a portal with an empty name? regards, Sokolov Yura aka funny_falcon require 'pg' c1 = PG.connect(host: "localhost", dbname: "postgres") c2 = PG.connect(host: "localhost", dbname: "postgres") c3 = PG.connect(host: "localhost", dbname: "postgres") class PG::Connection def simple(sql) puts sql exec(sql) end def extended(sql) puts "#{sql}" exec_params(sql, []) end end c1.simple "DROP TABLE IF EXISTS tab_freeze" c1.simple " CREATE TABLE tab_freeze ( id int PRIMARY KEY, name char(3), x int); INSERT INTO tab_freeze VALUES (1, '111', 0); INSERT INTO tab_freeze VALUES (3, '333', 0); " c1.simple "BEGIN" c2.simple "BEGIN" c3.simple "BEGIN" c1.simple "UPDATE tab_freeze SET x = x + 1 WHERE id = 3" c2.extended "SELECT id FROM tab_freeze WHERE id = 3 FOR KEY SHARE" c3.extended "SELECT id FROM tab_freeze WHERE id = 3 FOR KEY SHARE" c1.simple "COMMIT" c2.simple "COMMIT" c2.simple "VACUUM FREEZE tab_freeze" c1.simple " BEGIN; SET LOCAL enable_seqscan = false; SET LOCAL enable_bitmapscan = false; SELECT * FROM tab_freeze WHERE id = 3; COMMIT; " c3.simple "COMMIT" c2.simple "VACUUM FREEZE tab_freeze" c1.simple "SELECT * FROM tab_freeze ORDER BY name, id"
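To make the empty-name idea above concrete, the hunk could be restricted to the unnamed portal; this is only a sketch of that condition, not a tested patch:

    /* In exec_execute_message(), once execution has run to completion. */
    if (completed)
    {
        /*
         * Sketch: eagerly drop only the unnamed portal. A named portal
         * could in principle still be re-executed or sought by the
         * client, so it is left alone here.
         */
        if (portal_name[0] == '\0')
            PortalDrop(portal, false);

        /* ... existing is_xact_command handling continues here ... */
    }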
Re: postgres_fdw - should we tighten up batch_size, fetch_size options against non-numeric values?
On Wed, May 19, 2021 at 5:02 PM Fujii Masao wrote: > On 2021/05/19 14:34, Bharath Rupireddy wrote: > > On Wed, May 19, 2021 at 8:28 AM Fujii Masao > > wrote: > > I agree with throwing an error for non-numeric junk though. > > Allowing that on the grounds of backwards compatibility > > seems like too much of a stretch. > > +1. > >>> > >>> +1. > >> > >> +1 > > > > Thanks all for your inputs. PSA which uses parse_int for > > batch_size/fech_size and parse_real for fdw_startup_cost and > > fdw_tuple_cost. > > Thanks for updating the patch! It basically looks good to me. > > - val = strtod(defGetString(def), ); > - if (*endp || val < 0) > + is_parsed = parse_real(defGetString(def), , 0, > NULL); > + if (!is_parsed || val < 0) > ereport(ERROR, > > (errcode(ERRCODE_SYNTAX_ERROR), > errmsg("%s requires a > non-negative numeric value", > > Isn't it better to check "!is_parsed" and "val < 0" separately like > reloptions.c does? That is, we should throw different error messages > for them? > > ERRCODE_SYNTAX_ERROR seems strange for this type of error? > ERRCODE_INVALID_PARAMETER_VALUE is better and more proper? Thanks for the comments. I added separate messages, changed the error code from ERRCODE_SYNTAX_ERROR to ERRCODE_INVALID_PARAMETER_VALUE and also quoted the option name in the error message. PSA v3 patch. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com v3-0001-Tighten-up-batch_size-fetch_size-options-against-.patch Description: Binary data
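For readers following along without the attachment, the separated checks might look roughly like this; the exact messages here are illustrative and may differ from the v3 patch:

    /* Sketch of split validation for the fetch_size option. */
    int         fetch_size;

    if (!parse_int(defGetString(def), &fetch_size, 0, NULL))
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("invalid value for option \"%s\": \"%s\"",
                        def->defname, defGetString(def))));
    if (fetch_size <= 0)
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("\"%s\" must be an integer value greater than zero",
                        def->defname)));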
Re: libpq_pipeline in tmp_install
On 2021-May-19, Tom Lane wrote: > Peter Eisentraut writes: > > On 10.05.21 20:26, Peter Eisentraut wrote: > >> The reason this is there is that the test suite uses PGXS to build the > >> test program, and so things get installed automatically. I suggest that > >> we should either write out the build system by hand to avoid this, or > >> maybe extend PGXS to support building programs but not installing them. > > > Here is a patch that implements the second solution, which turned out to > > be very easy. Great, thank you. > +1, except that you should add documentation for NO_INSTALL to the > list of definable symbols at the head of pgxs.mk, and to the list > in extend.sgml (compare that for NO_INSTALLCHECK). I propose this. -- Álvaro Herrera Valdivia, Chile diff --git a/doc/src/sgml/extend.sgml b/doc/src/sgml/extend.sgml index ec95b4eb01..1a4f208692 100644 --- a/doc/src/sgml/extend.sgml +++ b/doc/src/sgml/extend.sgml @@ -1672,6 +1672,16 @@ include $(PGXS) + + NO_INSTALL + + +don't define an install target, useful for test +modules that don't need their build products to be installed. + + + + NO_INSTALLCHECK diff --git a/src/makefiles/pgxs.mk b/src/makefiles/pgxs.mk index 271e7eaba8..224eac069c 100644 --- a/src/makefiles/pgxs.mk +++ b/src/makefiles/pgxs.mk @@ -49,6 +49,8 @@ # TAP_TESTS -- switch to enable TAP tests # ISOLATION -- list of isolation test cases # ISOLATION_OPTS -- additional switches to pass to pg_isolation_regress +# NO_INSTALL -- don't define an install target, useful for test modules +# that don't need their build products to be installed # NO_INSTALLCHECK -- don't define an installcheck target, useful e.g. if # tests require special configuration, or don't use pg_regress # EXTRA_CLEAN -- extra files to remove in 'make clean'
Re: pgbench test failing on 14beta1 on Debian/i386
Or, (3) remove this test? I am not quite sure what there is to gain with this extra test considering all the other tests with permute() already present in this script. Yes, I think removing the test is the best option. It was originally added because there was a separate code path for larger permutation sizes that needed testing, but that's no longer the case so the test really isn't adding anything. Hmmm… It is the one test which worked in actually detecting an issue, so I would not say that it is not adding anything, on the contrary, it did prove its value! The permute function is expected to be deterministic on different platforms and architectures, and it is not. In fact what it demonstrates is that the results from permute(), like all the other pgbench random functions, will vary by platform for sufficiently large size parameters. Indeed, it is the case if the underlying math uses doubles & large numbers. For integer-only computations it should be safe though, and permute should be in this category. I'd agree with a two phases approach: drop the test in the short term and deal with the PRNG later. I'm so unhappy with this 48 bit PRNG that I may be motivated enough to attempt to replace it, or at least add a better (faster?? larger state?? same/better quality?) alternative. I don't necessarily have a problem with that provided the replacement is well-chosen and has a proven track record (i.e., let's not invent our own PRNG). Yes, obviously, I'm not daft enough to reinvent a PRNG. The question is to choose one, motivate the choice, and build the relevant API for what pg needs, possibly with some benchmarking. For now though, I'll go remove the test. This also removes the reminder… -- Fabien.
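To make the kind of replacement being discussed concrete, here is one public-domain 64-bit generator with a proven track record, splitmix64 (commonly used on its own or to seed larger-state generators such as the xoshiro family); it is shown purely as an illustrative candidate, not as a decision:

    #include <stdint.h>

    /* splitmix64 by Sebastiano Vigna: 64-bit state, 64-bit output. */
    static uint64_t
    splitmix64(uint64_t *state)
    {
        uint64_t    z = (*state += UINT64_C(0x9E3779B97F4A7C15));

        z = (z ^ (z >> 30)) * UINT64_C(0xBF58476D1CE4E5B9);
        z = (z ^ (z >> 27)) * UINT64_C(0x94D049BB133111EB);
        return z ^ (z >> 31);
    }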
Re: libpq_pipeline in tmp_install
Peter Eisentraut writes: > On 10.05.21 20:26, Peter Eisentraut wrote: >> The reason this is there is that the test suite uses PGXS to build the >> test program, and so things get installed automatically. I suggest that >> we should either write out the build system by hand to avoid this, or >> maybe extend PGXS to support building programs but not installing them. > Here is a patch that implements the second solution, which turned out to > be very easy. +1, except that you should add documentation for NO_INSTALL to the list of definable symbols at the head of pgxs.mk, and to the list in extend.sgml (compare that for NO_INSTALLCHECK). regards, tom lane
Re: PG 14 release notes, first draft
These sound weird since markup was added in 6a5bde7d4: https://www.postgresql.org/docs/devel/release-14.html | Remove server and Chapter 34 support for the version 2 wire protocol (Heikki Linnakangas) ... | Pass doubled quote marks in Chapter 36 SQL command strings literally (Tom Lane) -Remove server and libpq support for the version 2 wire protocol (Heikki Linnakangas) +Remove server and support for the version 2 wire protocol (Heikki Linnakangas) > Force custom server variable names to match the pattern used for unquoted SQL > identifiers (Tom Lane) Say "Require" not force? > This setting was disabled in PostgreSQL version 13.3. "disabled" sounds like it was set to "off". Maybe say it was ignored. > Add long-running queries to be canceled if the client disconnects (Sergey > Cherkashin, Thomas Munro) Should say: Allow > The server variable client_connection_check_interval allows supporting > operating systems, e.g., Linux, to automatically cancel queries by > disconnected clients. say "some operating systems" ? > This can be disabled by turning client option "sslsni" off. "turning off" | Add %P to log_line_prefix to report the parallel group leader (Justin Pryzby) Maybe it should say "Allow %P in log_line_prefix to ...", otherwise it sounds like the default was changed. | Reduce the default value of vacuum_cost_page_miss (Peter Geoghegan) | This new default better reflects current hardware capabilities. Also say: the previous default was 10.
Re: Added missing tab completion for alter subscription set option
On Tue, May 18, 2021 at 9:20 PM Alvaro Herrera wrote: > > On 2021-May-14, vignesh C wrote: > > > While I was reviewing one of the logical decoding features, I found > > Streaming and binary options were missing in tab completion for the > > alter subscription set option, the attached patch has the changes for > > the same. > > Thoughts? > > I wish we didn't have to keep knowledge in the psql source on which > option names are to be used for each command. If we had some function > SELECT pg_completion_options('alter subscription set'); > that returned the list of options usable for each command, we wouldn't > have to ... psql would just retrieve the list of options for the current > command. > > Maintaining such a list does not seem hard -- for example we could just > have a function alongside parse_subscription_option() that returns the > names that are recognized by that one. If we drive the implementation > of both off a single struct, it would never be outdated. > I like the idea of maintaining a common list, that will also prevent options getting missed in the future. I will work on this and provide a patch for it. Regards, Vignesh
Freenode woes
Fwiw, if the PostgreSQL project is considering moving the #postgresql IRC channel(s) elsewhere given [1,2], I'm a member of OFTC.net's network operations committee and would be happy to help. [1] https://gist.github.com/aaronmdjones/1a9a93ded5b7d162c3f58bdd66b8f491 [2] https://fuchsnet.ch/freenode-resign-letter.txt Christoph
SSL Tests for sslinfo extension
While playing around with the recent SSL testharness changes I wrote a test suite for sslinfo as a side effect, which seemed valuable in its own right as we currently have no coverage of this code. The small change needed to the testharness is to support installing modules, which is broken out into 0001 for easier reading. I'll park this in the next commitfest for now. -- Daniel Gustafsson https://vmware.com/ 0002-Add-tests-for-sslinfo.patch Description: Binary data 0001-Extend-configure_test_server_for_ssl-to-add-extensio.patch Description: Binary data
Re: Skip partition tuple routing with constant partition key
On Tue, May 18, 2021 at 11:11 AM houzj.f...@fujitsu.com wrote: > > > Hmm, does this seem common enough for the added complexity to be > > worthwhile? > > > > I'd also like to know if there's some genuine use case for this. For testing > > purposes does not seem to be quite a good enough reason. > > Thanks for the response. > > For some big data scenario, we sometimes transfer data from one table(only > store not expired data) > to another table(historical data) for future analysis. > In this case, we import data into historical table regularly(could be one day > or half a day), > And the data is likely to be imported with date label specified, then all of > the data to be > imported this time belong to the same partition which partition by time range. Is directing that data directly into the appropriate partition not an acceptable solution to address this particular use case? Yeah, I know we should avoid encouraging users to perform DML directly on partitions, but... -- Amit Langote EDB: http://www.enterprisedb.com
Re: Skip partition tuple routing with constant partition key
On Tue, May 18, 2021 at 10:28 AM David Rowley wrote: > On Tue, 18 May 2021 at 01:31, Amit Langote wrote: > > Hmm, does this seem common enough for the added complexity to be worthwhile? > > I'd also like to know if there's some genuine use case for this. For > testing purposes does not seem to be quite a good enough reason. > > A slightly different optimization that I have considered and even > written patches before was to have ExecFindPartition() cache the last > routed to partition and have it check if the new row can go into that > one on the next call. I imagined there might be a use case for > speeding that up for RANGE partitioned tables since it seems fairly > likely that most use cases, at least for time series ranges will > always hit the same partition most of the time. Since RANGE requires > a binary search there might be some savings there. I imagine that > optimisation would never be useful for HASH partitioning since it > seems most likely that we'll be routing to a different partition each > time and wouldn't save much since routing to hash partitions are > cheaper than other types. LIST partitioning I'm not so sure about. It > seems much less likely than RANGE to hit the same partition twice in a > row. > > IIRC, the patch did something like call ExecPartitionCheck() on the > new tuple with the previously routed to ResultRelInfo. I think the > last used partition was cached somewhere like relcache (which seems a > bit questionable). Likely this would speed up the example case here > a bit. Not as much as the proposed patch, but it would likely apply in > many more cases. > > I don't think I ever posted the patch to the list, and if so I no > longer have access to it, so it would need to be done again. I gave a shot to implementing your idea and ended up with the attached PoC patch, which does pass make check-world. I do see some speedup: -- creates a range-partitioned table with 1000 partitions create unlogged table foo (a int) partition by range (a); select 'create unlogged table foo_' || i || ' partition of foo for values from (' || (i-1)*100000+1 || ') to (' || i*100000+1 || ');' from generate_series(1, 1000) i \gexec -- generates a 100 million record file copy (select generate_series(1, 100000000)) to '/tmp/100m.csv' csv; Times for loading that file compare as follows: HEAD: postgres=# copy foo from '/tmp/100m.csv' csv; COPY 100000000 Time: 31813.964 ms (00:31.814) postgres=# copy foo from '/tmp/100m.csv' csv; COPY 100000000 Time: 31972.942 ms (00:31.973) postgres=# copy foo from '/tmp/100m.csv' csv; COPY 100000000 Time: 32049.046 ms (00:32.049) Patched: postgres=# copy foo from '/tmp/100m.csv' csv; COPY 100000000 Time: 26151.158 ms (00:26.151) postgres=# copy foo from '/tmp/100m.csv' csv; COPY 100000000 Time: 28161.082 ms (00:28.161) postgres=# copy foo from '/tmp/100m.csv' csv; COPY 100000000 Time: 26700.908 ms (00:26.701) I guess it would be nice if we could fit in a solution for the use case that houzj mentioned as a special case. BTW, houzj, could you please check if a patch like this one helps the case you mentioned? -- Amit Langote EDB: http://www.enterprisedb.com ExecFindPartition-cache-partition-PoC.patch Description: Binary data
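For readers who don't want to open the attachment, the core of the idea looks roughly like the following; the field names are made up for illustration and need not match the PoC patch:

    /* Hypothetical fast path near the top of ExecFindPartition(). */
    if (dispatch->last_found_part >= 0)
    {
        ResultRelInfo *rri = dispatch->part_result_rels[dispatch->last_found_part];

        /*
         * If the new tuple satisfies the partition constraint of the
         * partition we routed to last time, reuse it and skip the
         * usual partition-bound search entirely.
         */
        if (ExecPartitionCheck(rri, slot, estate, false))
            return rri;
    }
    /* ...otherwise fall through to the normal bound lookup... */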
Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS
On 5/19/21 1:42 AM, Tom Lane wrote: I discovered $SUBJECT after wondering why hyrax hadn't reported in recently, and trying to run check-world under CCA to see if anything got stuck. Indeed it did --- although this doesn't explain the radio silence from hyrax, because that animal doesn't run any TAP tests. (Neither does avocet, which I think is the only other active CCA critter. So this could have been broken for a very long time.) There are three CCA animals on the same box (avocet, husky and trilobite) with different compilers, running in a round-robin manner. One cycle took about 14 days, but about a month ago the machine got stuck, requiring a hard reboot about a week ago (no idea why it got stuck). It has more CPU power now (8 cores instead of 2), so it should take less time to run one test cycle. avocet already ran all the tests, husky is running HEAD at the moment and then it'll be trilobite's turn ... AFAICS none of those runs seems to have failed or got stuck so far. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Refactor "mutually exclusive options" error reporting code in parse_subscription_options
On Wed, May 19, 2021 at 5:56 PM Amit Kapila wrote: > > On Wed, May 19, 2021 at 4:42 PM Bharath Rupireddy > wrote: > > > > On Wed, May 19, 2021 at 4:10 PM Amit Kapila wrote: > > > > > > On Wed, May 19, 2021 at 3:08 PM Bharath Rupireddy > > > wrote: > > > > > > > > On Wed, May 19, 2021 at 2:33 PM Amul Sul wrote: > > > > > > > > > > On Wed, May 19, 2021 at 2:09 PM Bharath Rupireddy > > > > > wrote: > > > > > > > > > > > > Hi, > > > > > > > > > > > > parse_subscription_options function has some similar code when > > > > > > throwing errors [with the only difference in the option]. I feel we > > > > > > could just use a variable for the option and use it in the error. > > > > > > I am not sure how much it helps to just refactor this part of the code > > > alone unless we need to add/change it more. Having said that, this > > > function is being modified by one of the proposed patches for logical > > > decoding of 2PC and I noticed that the proposed patch is adding more > > > parameters to this function which already takes 14 input parameters, > > > so I suggested refactoring it. See comment 11 in email[1]. See, if > > > that makes sense to you then we can refactor this function such that > > > it can be enhanced easily by future patches. > > > > Thanks Amit for the comments. I agree to move the parse options to a > > new structure ParseSubOptions as suggested. Then the function can just > > be parse_subscription_options(ParseSubOptions opts); I wonder if we > > should also have a structure for parse_publication_options as we might > > add new options there in the future? > > > > That function has just 5 parameters so not sure if that needs the same > treatment. Let's leave it for now. Thanks. I will work on the new structure ParseSubOptions only for subscription options. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
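As a sketch, the struct could look something like the following; the field list merely mirrors the current output parameters of parse_subscription_options() and is illustrative only:

    /* Illustrative only: bundles the current 14 parameters into one struct. */
    typedef struct ParseSubOptions
    {
        List    *options;           /* raw DefElem list from the statement */
        bool    *connect;
        bool    *enabled_given;
        bool    *enabled;
        bool    *create_slot;
        bool    *slot_name_given;
        char   **slot_name;
        bool    *copy_data;
        char   **synchronous_commit;
        bool    *refresh;
        bool    *binary_given;
        bool    *binary;
        bool    *streaming_given;
        bool    *streaming;
    } ParseSubOptions;

    extern void parse_subscription_options(ParseSubOptions *opts);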
Re: Alias collision in `refresh materialized view concurrently`
On Wed, May 19, 2021 at 5:33 PM Mathis Rudolf wrote: > > Hello, > > we had a Customer-Report in which `refresh materialized view > CONCURRENTLY` failed with: `ERROR: column reference "mv" is ambiguous` > > They're using `mv` as an alias for one column and this is causing a > collision with an internal alias. They also made it reproducible like this: > ``` > create materialized view testmv as select 'asdas' mv; --ok > create unique index on testmv (mv); --ok > refresh materialized view testmv; --ok > refresh materialized view CONCURRENTLY testmv; ---BAM! > ``` > > ``` > ERROR: column reference "mv" is ambiguous > LINE 1: ...alog.=) mv.mv AND newdata OPERATOR(pg_catalog.*=) mv) WHERE ... >^ > QUERY: CREATE TEMP TABLE pg_temp_4.pg_temp_218322_2 AS SELECT mv.ctid > AS tid, newdata FROM public.testmv mv FULL JOIN pg_temp_4.pg_temp_218322 > newdata ON (newdata.mv OPERATOR(pg_catalog.=) mv.mv AND newdata > OPERATOR(pg_catalog.*=) mv) WHERE newdata IS NULL OR mv IS NULL ORDER BY tid > ``` > > The corresponding Code is in `matview.c` in function > `refresh_by_match_merge`. With adding a prefix like `_pg_internal_` we > could make collisions pretty unlikely, without intrusive changes. > > The appended patch does this change for the aliases `mv`, `newdata` and > `newdata2`. I think it's better to have some random name, see below. We could either use "OIDNewHeap" or "MyBackendId" to make those column names unique and almost unguessable. So, something like "pg_temp1_", "pg_temp2_" or "pg_temp3_" and so on would be better IMO. snprintf(NewHeapName, sizeof(NewHeapName), "pg_temp_%u", OIDOldHeap); snprintf(namespaceName, sizeof(namespaceName), "pg_temp_%d", MyBackendId); With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
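Following that suggestion, the aliases in refresh_by_match_merge() could be derived along these lines; this is a sketch with made-up variable names, not code from a patch:

    /*
     * Build hard-to-guess aliases from the temp table's OID, in the
     * style of the existing pg_temp_%u naming shown above.
     */
    char        mv_alias[NAMEDATALEN];
    char        newdata_alias[NAMEDATALEN];
    char        newdata2_alias[NAMEDATALEN];

    snprintf(mv_alias, sizeof(mv_alias), "pg_temp1_%u", tempOid);
    snprintf(newdata_alias, sizeof(newdata_alias), "pg_temp2_%u", tempOid);
    snprintf(newdata2_alias, sizeof(newdata2_alias), "pg_temp3_%u", tempOid);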
Re: Refactor "mutually exclusive options" error reporting code in parse_subscription_options
On Wed, May 19, 2021 at 4:42 PM Bharath Rupireddy wrote: > > On Wed, May 19, 2021 at 4:10 PM Amit Kapila wrote: > > > > On Wed, May 19, 2021 at 3:08 PM Bharath Rupireddy > > wrote: > > > > > > On Wed, May 19, 2021 at 2:33 PM Amul Sul wrote: > > > > > > > > On Wed, May 19, 2021 at 2:09 PM Bharath Rupireddy > > > > wrote: > > > > > > > > > > Hi, > > > > > > > > > > parse_subscription_options function has some similar code when > > > > > throwing errors [with the only difference in the option]. I feel we > > > > > could just use a variable for the option and use it in the error. > > > > I am not sure how much it helps to just refactor this part of the code > > alone unless we need to add/change it more. Having said that, this > > function is being modified by one of the proposed patches for logical > > decoding of 2PC and I noticed that the proposed patch is adding more > > parameters to this function which already takes 14 input parameters, > > so I suggested refactoring it. See comment 11 in email[1]. See, if > > that makes sense to you then we can refactor this function such that > > it can be enhanced easily by future patches. > > Thanks Amit for the comments. I agree to move the parse options to a > new structure ParseSubOptions as suggested. Then the function can just > be parse_subscription_options(ParseSubOptions opts); I wonder if we > should also have a structure for parse_publication_options as we might > add new options there in the future? > That function has just 5 parameters so not sure if that needs the same treatment. Let's leave it for now. -- With Regards, Amit Kapila.
Re: Race condition in recovery?
On Tue, May 18, 2021 at 12:22 PM Kyotaro Horiguchi wrote: > And finally I think I could reach the situation the commit wanted to fix. > > I took a basebackup from a standby just before replaying the first > checkpoint of the new timeline (by using debugger), without copying > pg_wal. In this backup, the control file contains checkPointCopy of > the previous timeline. > > I modified StartXLOG so that expectedTLEs is set just after first > determining recoveryTargetTLI, then started the grandchild node. I > have the following error and the server fails to continue replication. > [postmaster] LOG: starting PostgreSQL 14beta1 on x86_64-pc-linux-gnu... > [startup] LOG: database system was interrupted while in recovery at log... > [startup] LOG: set expectedtles tli=6, length=1 > [startup] LOG: Probing history file for TLI=7 > [startup] LOG: entering standby mode > [startup] LOG: scanning segment 3 TLI 6, source 0 > [startup] LOG: Trying fetching history file for TLI=6 > [walreceiver] LOG: fetching timeline history file for timeline 5 from pri... > [walreceiver] LOG: fetching timeline history file for timeline 6 from pri... > [walreceiver] LOG: started streaming ... primary at 0/3000000 on timeline 5 > [walreceiver] DETAIL: End of WAL reached on timeline 5 at 0/30006E0. > [startup] LOG: unexpected timeline ID 1 in log segment 000000050000000000000003, offset 0 > [startup] LOG: Probing history file for TLI=7 > [startup] LOG: scanning segment 3 TLI 6, source 0 > (repeats forever) So IIUC, these logs show that "ControlFile->checkPointCopy.ThisTimeLineID" is 6 but the "ControlFile->checkPoint" record is on TL 5? I think if you had the old version of the code (before the commit) or the below code [1], right after initializing expectedTLEs, then you would have hit the FATAL error the patch had fixed. While debugging did you check what the "ControlFile->checkPoint" LSN was vs the first LSN of the first segment with TL6? expectedTLEs = readTimeLineHistory(recoveryTargetTLI); [1] if (tliOfPointInHistory(ControlFile->checkPoint, expectedTLEs) != ControlFile->checkPointCopy.ThisTimeLineID) { report(FATAL.. } -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?
On Mon, 17 May 2021 at 14:52, Andy Fan wrote: > Would marking the new added RestrictInfo.norm_selec > 1 be OK? There would be cases you'd want to not count the additional clauses in the selectivity estimation and there would be cases you would want to. For example: SELECT ... FROM t1 INNER JOIN t2 ON t1.dt = t2.dt WHERE t1.dt BETWEEN 'date1' AND 'date2'; If you derived that t2.dt is also BETWEEN 'date1' AND 'date2' then you'd most likely want to include those quals for scans feeding merge, hash and non-parameterized nested loop joins, so you'd also want to count them in your selectivity estimations, else you'd feed junk values into the join selectivity estimations. Parameterized nested loop joins might be different as if you were looking up t1.dt values on some index on t2.dt, then you'd likely not want to bother also filtering out the between clause values too. They're redundant in that case. I imagined we'd have some functions in equivclass.c that allows you to choose if you wanted the additional filters or not. Tom's example, WHERE a = b AND a IN (1,2,3), if a and b were in the same relation then you'd likely never want to include the additional quals. The only reason I could think that it would be a good idea is if "b" had an index but "a" didn't. I've not checked the code, but the index matching code might already allow that to work anyway. David
Re: pgbench test failing on 14beta1 on Debian/i386
On Wed, 19 May 2021 at 12:07, Fabien COELHO wrote: > > Attached patch disactivates the test with comments to outline that there > is an issue to fix… so it is *not* removed. > I opted to just remove the test rather than comment it out, since the issue highlighted isn't specific to permute(). Also changing the PRNG will completely change the results, so all the test values would require rewriting, rather than it just being a case of uncommenting the test and expecting it to work. > I'm obviously okay with providing an alternate PRNG, let me know if this > is the prefered option. > That's something for consideration in v15. If we do decide we want a new PRNG, it should apply across the board to all pgbench random functions. Regards, Dean
Alias collision in `refresh materialized view concurrently`
Hello, we had a Customer-Report in which `refresh materialized view CONCURRENTLY` failed with: `ERROR: column reference "mv" is ambiguous` They're using `mv` as an alias for one column and this is causing a collision with an internal alias. They also made it reproducible like this: ``` create materialized view testmv as select 'asdas' mv; --ok create unique index on testmv (mv); --ok refresh materialized view testmv; --ok refresh materialized view CONCURRENTLY testmv; ---BAM! ``` ``` ERROR: column reference "mv" is ambiguous LINE 1: ...alog.=) mv.mv AND newdata OPERATOR(pg_catalog.*=) mv) WHERE ... ^ QUERY: CREATE TEMP TABLE pg_temp_4.pg_temp_218322_2 AS SELECT mv.ctid AS tid, newdata FROM public.testmv mv FULL JOIN pg_temp_4.pg_temp_218322 newdata ON (newdata.mv OPERATOR(pg_catalog.=) mv.mv AND newdata OPERATOR(pg_catalog.*=) mv) WHERE newdata IS NULL OR mv IS NULL ORDER BY tid ``` The corresponding Code is in `matview.c` in function `refresh_by_match_merge`. With adding a prefix like `_pg_internal_` we could make collisions pretty unlikely, without intrusive changes. The appended patch does this change for the aliases `mv`, `newdata` and `newdata2`. Kind regards, Mathis From 121b06258b80e5097c963335794e9a89f7e6d3ec Mon Sep 17 00:00:00 2001 From: Mathis Rudolf Date: Mon, 17 May 2021 14:55:49 +0200 Subject: [PATCH] Fix alias collision for REFRESH MV CONCURRENTLY This patch adds the prefix _pg_internal_ to aliases like 'mv' and 'newdata' in 'refresh_by_match_merge()', which makes it unliky to cause any collisions with user created MVs. --- src/backend/commands/matview.c | 30 +++--- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c index 172ec6e982..04602dba80 100644 --- a/src/backend/commands/matview.c +++ b/src/backend/commands/matview.c @@ -629,12 +629,12 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner, */ resetStringInfo(); appendStringInfo(, - "SELECT newdata FROM %s newdata " - "WHERE newdata IS NOT NULL AND EXISTS " - "(SELECT 1 FROM %s newdata2 WHERE newdata2 IS NOT NULL " - "AND newdata2 OPERATOR(pg_catalog.*=) newdata " - "AND newdata2.ctid OPERATOR(pg_catalog.<>) " - "newdata.ctid)", + "SELECT pg_internal_newdata FROM %s pg_internal_newdata " + "WHERE pg_internal_newdata IS NOT NULL AND EXISTS " + "(SELECT 1 FROM %s pg_internal_newdata2 WHERE pg_internal_newdata2 IS NOT NULL " + "AND pg_internal_newdata2 OPERATOR(pg_catalog.*=) pg_internal_newdata " + "AND pg_internal_newdata2.ctid OPERATOR(pg_catalog.<>) " + "pg_internal_newdata.ctid)", tempname, tempname); if (SPI_execute(querybuf.data, false, 1) != SPI_OK_SELECT) elog(ERROR, "SPI_exec failed: %s", querybuf.data); @@ -662,8 +662,8 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner, resetStringInfo(); appendStringInfo(, "CREATE TEMP TABLE %s AS " - "SELECT mv.ctid AS tid, newdata " - "FROM %s mv FULL JOIN %s newdata ON (", + "SELECT pg_internal_mv.ctid AS tid, pg_internal_newdata " + "FROM %s pg_internal_mv FULL JOIN %s pg_internal_newdata ON (", diffname, matviewname, tempname); /* @@ -756,9 +756,9 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner, if (foundUniqueIndex) appendStringInfoString(, " AND "); -leftop = quote_qualified_identifier("newdata", +leftop = quote_qualified_identifier("pg_internal_newdata", NameStr(attr->attname)); -rightop = quote_qualified_identifier("mv", +rightop = quote_qualified_identifier("pg_internal_mv", NameStr(attr->attname)); generate_operator_clause(, @@ -786,8 
+786,8 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner, Assert(foundUniqueIndex); appendStringInfoString(, - " AND newdata OPERATOR(pg_catalog.*=) mv) " - "WHERE newdata IS NULL OR mv IS NULL " + " AND pg_internal_newdata OPERATOR(pg_catalog.*=) pg_internal_mv) " + "WHERE pg_internal_newdata IS NULL OR pg_internal_mv IS NULL " "ORDER BY tid"); /* Create the temporary "diff" table. */ @@ -813,10 +813,10 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner, /* Deletes must come before inserts; do them first. */ resetStringInfo(); appendStringInfo(, - "DELETE FROM %s mv WHERE ctid OPERATOR(pg_catalog.=) ANY " + "DELETE FROM %s pg_internal_mv WHERE ctid OPERATOR(pg_catalog.=) ANY " "(SELECT diff.tid FROM %s diff " "WHERE diff.tid IS NOT NULL " - "AND diff.newdata IS NULL)", + "AND diff.pg_internal_newdata IS NULL)", matviewname, diffname); if (SPI_exec(querybuf.data, 0) != SPI_OK_DELETE) elog(ERROR, "SPI_exec failed: %s", querybuf.data); @@ -824,7 +824,7 @@ refresh_by_match_merge(Oid matviewOid, Oid
Re: Parallel INSERT SELECT take 2
On Fri, May 14, 2021 at 6:24 PM houzj.f...@fujitsu.com wrote: > > Thanks for the comments, I have posted new version patches with this change. > > > How about reorganisation of the patches like the following? > > 0001: CREATE ALTER TABLE PARALLEL DML > > 0002: parallel-SELECT-for-INSERT (planner changes, > > max_parallel_hazard() update, XID changes) > > 0003: pg_get_parallel_safety() > > 0004: regression test updates > > Thanks, it looks good and I reorganized the latest patchset in this way. > > Attaching new version patches with the following change. > > 0003 > Change functions arg type to regclass. > > 0004 > remove updates for "serial_schedule". > I've got some comments for the V4 set of patches: (0001) (i) Patch comment needs a little updating (suggested change is below): Enable users to declare a table's parallel data-modification safety (SAFE/RESTRICTED/UNSAFE). Add a table property that represents parallel safety of a table for DML statement execution. It may be specified as follows: CREATE TABLE table_name PARALLEL DML { UNSAFE | RESTRICTED | SAFE }; ALTER TABLE table_name PARALLEL DML { UNSAFE | RESTRICTED | SAFE }; This property is recorded in pg_class's relparallel column as 'u', 'r', or 's', just like pg_proc's proparallel. The default is UNSAFE. The planner assumes that all of the table, its descendant partitions, and their ancillary objects have, at worst, the specified parallel safety. The user is responsible for its correctness. --- NOTE: The following sentence was removed from the original V4 0001 patch comment (since this version of the patch is not doing runtime parallel-safety checks on functions):. If the parallel processes find an object that is less safer than the assumed parallel safety during statement execution, it throws an ERROR and abort the statement execution. (ii) Update message to say "a foreign ...": BEFORE: + errmsg("cannot support parallel data modification on foreign or temporary table"))); AFTER: + errmsg("cannot support parallel data modification on a foreign or temporary table"))); (iii) strVal() macro already casts to "Value *", so the cast can be removed from the following: + char*parallel = strVal((Value *) def); (0003) (i) Suggested updates to the patch comment: Provide a utility function "pg_get_parallel_safety(regclass)" that returns records of (objid, classid, parallel_safety) for all parallel unsafe/restricted table-related objects from which the table's parallel DML safety is determined. The user can use this information during development in order to accurately declare a table's parallel DML safety. or to identify any problematic objects if a parallel DML fails or behaves unexpectedly. When the use of an index-related parallel unsafe/restricted function is detected, both the function oid and the index oid are returned. Provide a utility function "pg_get_max_parallel_hazard(regclass)" that returns the worst parallel DML safety hazard that can be found in the given relation. Users can use this function to do a quick check without caring about specific parallel-related objects. Regards, Greg Nancarrow Fujitsu Australia
Re: Inaccurate error message when set fdw batch_size to 0
On 2021/05/19 20:01, Bharath Rupireddy wrote: On Mon, May 17, 2021 at 4:23 PM Amit Kapila wrote: On Tue, May 11, 2021 at 11:28 AM Bharath Rupireddy wrote: On Mon, May 10, 2021 at 7:39 PM Tom Lane wrote: Bharath Rupireddy writes: On Mon, May 10, 2021 at 12:00 PM Tom Lane wrote: Yeah, this error message seems outright buggy. However, it's a minor matter. Also, some people think "positive" is the same thing as "non-negative", so maybe we need less ambiguous wording? Since value 0 can't be considered as either a positive or negative integer, I think we can do as following(roughly): if (value < 0) "requires a zero or positive integer value" if (value <= 0) "requires a positive integer value" I was thinking of avoiding the passive voice and writing "foo must be greater than zero" +1 for "foo must be greater than zero" if (foo <= 0) kind of errors. But, we also have some values for which zero is accepted, see below error messages. How about the error message "foo must be greater than or equal to zero"? +1 for your proposed message for the cases where we have a check if (foo < 0). Tom, Michael, do you see any problem with the proposed message? We would like to make a similar change at another place [1] so wanted to be consistent. [1] - https://www.postgresql.org/message-id/CALj2ACWGB9oHCR5ygkc8u6_QDqecObf9j2MxtOgsjZMMKsLj%3DQ%40mail.gmail.com Thanks all for your inputs. PSA v2 patch that uses the new convention. Thanks for the patch! ereport(ERROR, (errcode(ERRCODE_SYNTAX_ERROR), -errmsg("%s requires a non-negative numeric value", +errmsg("%s must be greater than or equal to zero", def->defname))); } else if (strcmp(def->defname, "extensions") == 0) @@ -142,7 +142,7 @@ postgres_fdw_validator(PG_FUNCTION_ARGS) if (fetch_size <= 0) ereport(ERROR, (errcode(ERRCODE_SYNTAX_ERROR), -errmsg("%s requires a non-negative integer value", +errmsg("%s must be greater than zero", I'm fine to convert "non-negative" word to "greater than" or "greater than or equal to" in the messages. But this change also seems to get rid of the information about the data type of the option from the message. I'm not sure if this is an improvement. Probably isn't it better to convert "requires a non-negative integer value" to "must be an integer value greater than zero"? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Re: postgres_fdw - should we tighten up batch_size, fetch_size options against non-numeric values?
On 2021/05/19 14:34, Bharath Rupireddy wrote: On Wed, May 19, 2021 at 8:28 AM Fujii Masao wrote: I agree with throwing an error for non-numeric junk though. Allowing that on the grounds of backwards compatibility seems like too much of a stretch. +1. +1. +1 Thanks all for your inputs. PSA which uses parse_int for batch_size/fech_size and parse_real for fdw_startup_cost and fdw_tuple_cost. Thanks for updating the patch! It basically looks good to me. - val = strtod(defGetString(def), ); - if (*endp || val < 0) + is_parsed = parse_real(defGetString(def), , 0, NULL); + if (!is_parsed || val < 0) ereport(ERROR, (errcode(ERRCODE_SYNTAX_ERROR), errmsg("%s requires a non-negative numeric value", Isn't it better to check "!is_parsed" and "val < 0" separately like reloptions.c does? That is, we should throw different error messages for them? ERRCODE_SYNTAX_ERROR seems strange for this type of error? ERRCODE_INVALID_PARAMETER_VALUE is better and more proper? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Incorrect usage of strtol, atoi for non-numeric junk inputs
Hi, While working on [1], I found that some parts of the code are using strtol and atoi without checking for non-numeric junk input strings. I found this strange. Most of the time users provide proper numeric strings but there can be some scenarios where these strings are not user-supplied but generated by some other code which may contain non-numeric strings. Shouldn't the code use strtol or atoi appropriately and error out in such cases? One way to fix this once and for all is to have a common API something like int pg_strtol/pg_str_convert_to_int(char *opt_name, char *opt_value) which returns a generic message upon invalid strings ("invalid value \"%s\" is provided for option \"%s\", opt_name, opt_value) and returns integers on successful parsing. Although this is not a critical issue, I would like to seek thoughts. [1] - https://www.postgresql.org/message-id/CALj2ACVMO6wY5Pc4oe1OCgUOAtdjHuFsBDw8R5uoYR86eWFQDA%40mail.gmail.com [2] strtol: vacuumlo.c --> ./vacuumlo -l '2323adfd' postgres -p '5432ERE' libpq_pipeline.c --> ./libpq_pipeline -r '2232adf' tests atoi: pg_amcheck.c --> ./pg_amcheck -j '1211efe' pg_basebackup --> ./pg_basebackup -Z 'EFEF' -s 'a$##' pg_receivewal.c --> ./pg_receivewal -p '54343GDFD' -s '54343GDFD' pg_recvlogical.c --> ./pg_recvlogical -F '-$#$#' -p '5432$$$' -s '100$$$' pg_checksums.c. --> ./pg_checksums -f '1212abc' pg_ctl.c --> ./pg_ctl -t 'efc' pg_dump.c --> ./pg_dump -j '454adc' -Z '4adc' --extra-float-digits '-14adc' pg_upgrade/option.c pgbench.c reindexdb.c vacuumdb.c pg_regress.c With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
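A minimal frontend-side sketch of such a helper, using only standard C; the function name and message wording are placeholders, not an existing API:

    #include <errno.h>
    #include <limits.h>
    #include <stdio.h>
    #include <stdlib.h>

    /*
     * Parse opt_value strictly as a base-10 integer, rejecting empty
     * strings, trailing junk and out-of-range values instead of silently
     * accepting them the way atoi() does.
     */
    static int
    pg_str_convert_to_int(const char *opt_name, const char *opt_value)
    {
        char       *endptr;
        long        val;

        errno = 0;
        val = strtol(opt_value, &endptr, 10);
        if (endptr == opt_value || *endptr != '\0' ||
            errno == ERANGE || val < INT_MIN || val > INT_MAX)
        {
            fprintf(stderr, "invalid value \"%s\" is provided for option \"%s\"\n",
                    opt_value, opt_name);
            exit(1);
        }
        return (int) val;
    }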
Re: pgbench test failing on 14beta1 on Debian/i386
On Wed, 19 May 2021 at 11:32, Fabien COELHO wrote: > > >> Or, (3) remove this test? I am not quite sure what there is to gain > >> with this extra test considering all the other tests with permute() > >> already present in this script. > > > > Yes, I think removing the test is the best option. It was originally > > added because there was a separate code path for larger permutation > > sizes that needed testing, but that's no longer the case so the test > > really isn't adding anything. > > Hmmm… > > It is the one test which worked in actually detecting an issue, so I would > not say that it is not adding anything, on the contrary, it did prove its > value! The permute function is expected to be deterministic on different > platforms and architectures, and it is not. > In fact what it demonstrates is that the results from permute(), like all the other pgbench random functions, will vary by platform for sufficiently large size parameters. > I'd agree with a two phases approach: drop the test in the short term and > deal with the PRNG later. I'm so unhappy with this 48 bit PRNG that I > may be motivated enough to attempt to replace it, or at least add a better > (faster?? larger state?? same/better quality?) alternative. > I don't necessarily have a problem with that provided the replacement is well-chosen and has a proven track record (i.e., let's not invent our own PRNG). For now though, I'll go remove the test. Regards, Dean
Re: Refactor "mutually exclusive options" error reporting code in parse_subscription_options
On Wed, May 19, 2021 at 4:10 PM Amit Kapila wrote: > > On Wed, May 19, 2021 at 3:08 PM Bharath Rupireddy > wrote: > > > > On Wed, May 19, 2021 at 2:33 PM Amul Sul wrote: > > > > > > On Wed, May 19, 2021 at 2:09 PM Bharath Rupireddy > > > wrote: > > > > > > > > Hi, > > > > > > > > parse_subscription_options function has some similar code when > > > > throwing errors [with the only difference in the option]. I feel we > > > > could just use a variable for the option and use it in the error. > > I am not sure how much it helps to just refactor this part of the code > alone unless we need to add/change it more. Having said that, this > function is being modified by one of the proposed patches for logical > decoding of 2PC and I noticed that the proposed patch is adding more > parameters to this function which already takes 14 input parameters, > so I suggested refactoring it. See comment 11 in email[1]. See, if > that makes sense to you then we can refactor this function such that > it can be enhanced easily by future patches. Thanks Amit for the comments. I agree to move the parse options to a new structure ParseSubOptions as suggested. Then the function can just be parse_subscription_options(ParseSubOptions opts); I wonder if we should also have a structure for parse_publication_options as we might add new options there in the future? If okay, I can work on these changes and attach it along with these error message changes. Thoughts? > > > > While this has no benefit at all, it saves some LOC and makes the code > > > > look better with lesser ereport(ERROR statements. PSA patch. > > > > > > > > Thoughts? > > > > > > I don't have a strong opinion on this, but the patch should add > > > __translator__ help comment for the error msg. > > > > Is the "/*- translator:" help comment something visible to the user or > > some other tool? > > > > We use similar comments at other places. So, it makes sense to retain > the comment as it might help translation tools. I will retain it. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
Re: pgbench test failing on 14beta1 on Debian/i386
Hello Dean, Or, (3) remove this test? I am not quite sure what there is to gain with this extra test considering all the other tests with permute() already present in this script. Yes, I think removing the test is the best option. It was originally added because there was a separate code path for larger permutation sizes that needed testing, but that's no longer the case so the test really isn't adding anything. Hmmm… It is the one test which worked in actually detecting an issue, so I would not say that it is not adding anything, on the contrary, it did prove its value! The permute function is expected to be deterministic on different platforms and architectures, and it is not. I agree that removing the test will hide the issue effectively:-) but ISTM more appropriate to solve the underlying issue and keep the test. I'd agree with a two phases approach: drop the test in the short term and deal with the PRNG later. I'm so unhappy with this 48 bit PRNG that I may be motivated enough to attempt to replace it, or at least add a better (faster?? larger state?? same/better quality?) alternative. Attached patch deactivates the test with comments to outline that there is an issue to fix… so it is *not* removed. I'm obviously okay with providing an alternate PRNG, let me know if this is the preferred option. -- Fabien.diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl index 2eaf9ab4c2..db65988f76 100644 --- a/src/bin/pgbench/t/001_pgbench_with_server.pl +++ b/src/bin/pgbench/t/001_pgbench_with_server.pl @@ -645,13 +645,15 @@ SELECT :v0, :v1, :v2, :v3; permute(0, 2, 5435) = 1 and permute(1, 2, 5435) = 0) -- 63 bits tests \set size debug(:max - 10) -\set t debug(permute(:size-1, :size, 5432) = 5301702756001087507 and \ - permute(:size-2, :size, 5432) = 8968485976055840695 and \ - permute(:size-3, :size, 5432) = 6708495591295582115 and \ - permute(:size-4, :size, 5432) = 2801794404574855121 and \ - permute(:size-5, :size, 5432) = 1489011409218895840 and \ - permute(:size-6, :size, 5432) = 2267749475878240183 and \ - permute(:size-7, :size, 5432) = 1300324176838786780) +-- FIXME non deterministic issue on i386 due to floating point different behavior +\set t debug(true) +-- \set t debug(permute(:size-1, :size, 5432) = 5301702756001087507 and \ -- permute(:size-2, :size, 5432) = 8968485976055840695 and \ -- permute(:size-3, :size, 5432) = 6708495591295582115 and \ -- permute(:size-4, :size, 5432) = 2801794404574855121 and \ -- permute(:size-5, :size, 5432) = 1489011409218895840 and \ -- permute(:size-6, :size, 5432) = 2267749475878240183 and \ -- permute(:size-7, :size, 5432) = 1300324176838786780) } });
Re: Inaccurate error message when set fdw batch_size to 0
On Mon, May 17, 2021 at 4:23 PM Amit Kapila wrote: > > On Tue, May 11, 2021 at 11:28 AM Bharath Rupireddy > wrote: > > > > On Mon, May 10, 2021 at 7:39 PM Tom Lane wrote: > > > > > > Bharath Rupireddy writes: > > > > On Mon, May 10, 2021 at 12:00 PM Tom Lane wrote: > > > >> Yeah, this error message seems outright buggy. However, it's a minor > > > >> matter. Also, some people think "positive" is the same thing as > > > >> "non-negative", so maybe we need less ambiguous wording? > > > > > > > Since value 0 can't be considered as either a positive or negative > > > > integer, I think we can do as following(roughly): > > > > > > > if (value < 0) "requires a zero or positive integer value" > > > > if (value <= 0) "requires a positive integer value" > > > > > > I was thinking of avoiding the passive voice and writing > > > > > > "foo must be greater than zero" > > > > +1 for "foo must be greater than zero" if (foo <= 0) kind of errors. > > But, we also have some values for which zero is accepted, see below > > error messages. How about the error message "foo must be greater than > > or equal to zero"? > > > > +1 for your proposed message for the cases where we have a check if > (foo < 0). Tom, Michael, do you see any problem with the proposed > message? We would like to make a similar change at another place [1] > so wanted to be consistent. > > [1] - > https://www.postgresql.org/message-id/CALj2ACWGB9oHCR5ygkc8u6_QDqecObf9j2MxtOgsjZMMKsLj%3DQ%40mail.gmail.com Thanks all for your inputs. PSA v2 patch that uses the new convention. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com v2-0001-Disambiguate-error-messages-that-use-non-negative.patch Description: Binary data
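For concreteness, a small C sketch of the two wordings being converged on; the option names and the error code below are placeholders for illustration, not taken from the patch:

/* option that must be strictly positive */
if (batch_size <= 0)
    ereport(ERROR,
            (errcode(ERRCODE_SYNTAX_ERROR),
             errmsg("\"%s\" must be greater than zero", "batch_size")));

/* option for which zero is a valid setting */
if (fetch_size < 0)
    ereport(ERROR,
            (errcode(ERRCODE_SYNTAX_ERROR),
             errmsg("\"%s\" must be greater than or equal to zero", "fetch_size")));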
pg_rewind fails if there is a read only file.
Several weeks ago I saw this issue in a production environment. The read-only file looks like a credential file. Michael told me that such files are usually better kept in non-pgdata directories in production environments. Thinking about it further, it seems that pg_rewind should be more user-friendly and tolerate such scenarios. The failure error is "Permission denied" after open(). The reason is that open() fails with the below mode in open_target_file() mode = O_WRONLY | O_CREAT | PG_BINARY; if (trunc) mode |= O_TRUNC; dstfd = open(dstpath, mode, pg_file_create_mode); The fix should be quite simple: if open() fails with "Permission denied", we try to call chmod to add S_IWUSR just before open() and call chmod to reset the flags soon after open(). A stat() call to get the previous st_mode flags is needed. Any other suggestions or thoughts? Thanks, -- Paul Guo (VMware)
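A rough, self-contained C sketch of that workaround; the helper name is hypothetical and error handling is abbreviated:

#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Open dstpath for writing; if the existing file is read-only, temporarily
 * add the owner-write bit, retry the open, then restore the original mode. */
static int
open_target_file_forced(const char *dstpath, int flags, mode_t create_mode)
{
    int         fd = open(dstpath, flags, create_mode);

    if (fd < 0 && errno == EACCES)
    {
        struct stat st;

        if (stat(dstpath, &st) == 0 &&
            chmod(dstpath, st.st_mode | S_IWUSR) == 0)
        {
            fd = open(dstpath, flags, create_mode);
            chmod(dstpath, st.st_mode); /* restore, whether or not open succeeded */
        }
    }
    return fd;
}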
Re: Refactor "mutually exclusive options" error reporting code in parse_subscription_options
On Wed, May 19, 2021 at 3:08 PM Bharath Rupireddy wrote: > > On Wed, May 19, 2021 at 2:33 PM Amul Sul wrote: > > > > On Wed, May 19, 2021 at 2:09 PM Bharath Rupireddy > > wrote: > > > > > > Hi, > > > > > > parse_subscription_options function has some similar code when > > > throwing errors [with the only difference in the option]. I feel we > > > could just use a variable for the option and use it in the error. I am not sure how much it helps to just refactor this part of the code alone unless we need to add/change it more. Having said that, this function is being modified by one of the proposed patches for logical decoding of 2PC and I noticed that the proposed patch is adding more parameters to this function which already takes 14 input parameters, so I suggested refactoring it. See comment 11 in email[1]. See, if that makes sense to you then we can refactor this function such that it can be enhanced easily by future patches. > > > While this has no benefit at all, it saves some LOC and makes the code > > > look better with lesser ereport(ERROR statements. PSA patch. > > > > > > Thoughts? > > > > I don't have a strong opinion on this, but the patch should add > > __translator__ help comment for the error msg. > > Is the "/*- translator:" help comment something visible to the user or > some other tool? > We use similar comments at other places. So, it makes sense to retain the comment as it might help translation tools. [1] - https://www.postgresql.org/message-id/CAA4eK1Jz64rwLyB6H7Z_SmEDouJ41KN42%3DVkVFp6JTpafJFG8Q%40mail.gmail.com -- With Regards, Amit Kapila.
RE: Forget close an open relation in ReorderBufferProcessTXN()
On Wednesday, May 19, 2021 1:52 PM Amit Kapila wrote: > > > I am not sure but I > > > > think we should prohibit truncate on user_catalog_tables as we > > > > prohibit truncate on system catalog tables (see below [1]) if we > > > > want plugin to lock them, otherwise, as you said it might lead to > deadlock. > > > > For the matter, I think we should once check all other operations > > > > where we can take an exclusive lock on [user]_catalog_table, say > > > > Cluster command, and compare the behavior of same on system > > > > catalog tables. > > > > > > > > [1] > > > > postgres=# truncate pg_class; > > > > ERROR: permission denied: "pg_class" is a system catalog > > > > postgres=# cluster pg_class; > > > > ERROR: there is no previously clustered index for table "pg_class" > > > > > > > > > > Please ignore the cluster command as we need to use 'using index' > > > with that command to make it successful. I just want to show the > > > truncate command behavior for which you have asked the question. > > Thank you so much for clarifying the direction. > > I agree with changing the TRUNCATE side. > > I'll make a patch based on this. > > > > Isn't it a better idea to start a new thread where you can summarize whatever > we have discussed here about user_catalog_tables? We might get the opinion > from others about the behavior change you are proposing. You are right. So, I've launched it with the patch for this. Best Regards, Takamichi Osumi
Re: pgbench test failing on 14beta1 on Debian/i386
Confirmed, thanks for looking. I can reproduce it on my machine with -m32. It's somewhat annoying that the buildfarm didn't pick it up sooner :-( On Wed, 19 May 2021 at 08:28, Michael Paquier wrote: On Wed, May 19, 2021 at 09:06:16AM +0200, Fabien COELHO wrote: I see two simple approaches: (1) use another PRNG inside pgbench, eg Knuth's which was used in some previous submission and is very simple and IMHO better than the rand48 stuff. (2) extend pg_*rand48() to provide an unsigned 64 bits out of the 48 bits state. Or, (3) remove this test? I am not quite sure what there is to gain with this extra test considering all the other tests with permute() already present in this script. Yes, I think removing the test is the best option. It was originally added because there was a separate code path for larger permutation sizes that needed testing, but that's no longer the case so the test really isn't adding anything. Hmmm… It is the one test which worked in actually detecting an issue, so I would not say that it is not adding anything, on the contrary, it did prove its value! The permute function is expected to be deterministic on different platforms and architectures, and it is not. I agree that removing the test will hide the issue effectively:-) but ISTM more appropriate to solve the underlying issue and keep the test. I'd agree with a two-phase approach: drop the test in the short term and deal with the PRNG later. I'm so unhappy with this 48 bit PRNG that I may be motivated enough to attempt to replace it, or at least add a better (faster?? larger state?? same/better quality?) alternative. -- Fabien.
Deadlock concern caused by TRUNCATE on user_catalog_table in synchronous mode
Hi I've been discussing user_catalog_table and the possibility of deadlock during synchronous mode of logical replication in [1]. I'll launch a new thread and summarize the contents so that anyone who is interested in this topic can join the discussion. We don't have any example of user_catalog_tables in the core code, so any response and idea related to this area is helpful. Currently, we don't disallow an output plugin from taking a lock on a user_catalog_table. Then, we can consider a deadlock scenario like the one below. 1. TRUNCATE command is performed on user_catalog_table. 2. TRUNCATE command locks the table and index with ACCESS EXCLUSIVE LOCK. 3. TRUNCATE waits for the subscriber's synchronization when synchronous_standby_names is set. 4. Here, the walsender hangs *if* it tries to acquire a lock on the user_catalog_table, because the table it wants to read is already locked by the TRUNCATE. (Here, we don't talk about the LOCK command because that discussion is in progress independently in another thread - [2]) Another important point here is that we can *not* know how and when a plugin does read-only access to a user_catalog_table in general, because it depends on the purpose of the plugin. Given that, I'm thinking that changing the behavior of the TRUNCATE side to error out when TRUNCATE is performed on a user_catalog_table will make the concern disappear. Kindly have a look at the attached patch. [1] - https://www.postgresql.org/message-id/MEYP282MB166933B1AB02B4FE56E82453B64D9%40MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM [2] - https://www.postgresql.org/message-id/CALDaNm1UB==gl9poad4etjfcygdjbphwezezocodnbd--kj...@mail.gmail.com Best Regards, Takamichi Osumi disallow_TRUNCATE_on_user_catalog_table_v01.patch Description: disallow_TRUNCATE_on_user_catalog_table_v01.patch
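As an illustration, the guard could look like the C sketch below; the placement on the TRUNCATE path, the error code, and the wording are assumptions, though RelationIsUsedAsCatalogTable() is the existing macro that tests the user_catalog_table reloption:

/* Hypothetical check, modeled on the existing system-catalog guard */
if (RelationIsUsedAsCatalogTable(rel))
    ereport(ERROR,
            (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
             errmsg("cannot truncate user catalog table \"%s\"",
                    RelationGetRelationName(rel))));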
Re: pg_get_wal_replay_pause_state() should not return 'paused' while a promotion is ongoing.
On 2021/05/19 16:43, Kyotaro Horiguchi wrote: +1 for adding some tests for pg_wal_replay_pause() but the test seems like checking only that pg_get_wal_replay_pause_state() returns the expected state value. Don't we need to check that the recovery is actually paused and that the promotion happens at expected LSN? Sounds good. Attached is the updated version of the patch. I added such checks into the test. BTW, while reading some recovery regression tests, I found that 013_crash_restart.pl has "use Time::HiRes qw(usleep)" but it seems not necessary. We can safely remove that? Patch attached. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION diff --git a/src/test/recovery/t/005_replay_delay.pl b/src/test/recovery/t/005_replay_delay.pl index 7f177afaed..496fa40fe1 100644 --- a/src/test/recovery/t/005_replay_delay.pl +++ b/src/test/recovery/t/005_replay_delay.pl @@ -1,13 +1,13 @@ # Copyright (c) 2021, PostgreSQL Global Development Group -# Checks for recovery_min_apply_delay +# Checks for recovery_min_apply_delay and recovery pause use strict; use warnings; use PostgresNode; use TestLib; -use Test::More tests => 1; +use Test::More tests => 3; # Initialize primary node my $node_primary = get_new_node('primary'); @@ -55,3 +55,58 @@ $node_standby->poll_query_until('postgres', # the configured apply delay. ok(time() - $primary_insert_time >= $delay, "standby applies WAL only after replication delay"); + + +# Check that recovery can be paused or resumed expectedly. +my $node_standby2 = get_new_node('standby2'); +$node_standby2->init_from_backup($node_primary, $backup_name, + has_streaming => 1); +$node_standby2->start; + +# Recovery is not yet paused. +is($node_standby2->safe_psql('postgres', + "SELECT pg_get_wal_replay_pause_state()"), + 'not paused', 'pg_get_wal_replay_pause_state() reports not paused'); + +# Request to pause recovery and wait until it's actually paused. +$node_standby2->safe_psql('postgres', "SELECT pg_wal_replay_pause()"); +$node_primary->safe_psql('postgres', + "INSERT INTO tab_int VALUES (generate_series(21,30))"); +$node_standby2->poll_query_until('postgres', + "SELECT pg_get_wal_replay_pause_state() = 'paused'") + or die "Timed out while waiting for recovery to be paused"; + +# Even if new WAL records are streamed from the primary, +# recovery in the paused state doesn't replay them. +my $receive_lsn = $node_standby2->safe_psql('postgres', + "SELECT pg_last_wal_receive_lsn()"); +my $replay_lsn = $node_standby2->safe_psql('postgres', + "SELECT pg_last_wal_replay_lsn()"); +$node_primary->safe_psql('postgres', + "INSERT INTO tab_int VALUES (generate_series(31,40))"); +$node_standby2->poll_query_until('postgres', + "SELECT '$receive_lsn'::pg_lsn < pg_last_wal_receive_lsn()") + or die "Timed out while waiting for new WAL to be streamed"; +is($node_standby2->safe_psql('postgres', + "SELECT pg_last_wal_replay_lsn()"), + qq($replay_lsn), 'no WAL is replayed in the paused state'); + +# Request to resume recovery and wait until it's actually resumed. +$node_standby2->safe_psql('postgres', "SELECT pg_wal_replay_resume()"); +$node_standby2->poll_query_until('postgres', + "SELECT pg_get_wal_replay_pause_state() = 'not paused' AND pg_last_wal_replay_lsn() > '$replay_lsn'::pg_lsn") + or die "Timed out while waiting for recovery to be resumed"; + +# Check that the paused state ends and promotion continues if a promotion +# is triggered while recovery is paused. 
+$node_standby2->safe_psql('postgres', "SELECT pg_wal_replay_pause()"); +$node_primary->safe_psql('postgres', + "INSERT INTO tab_int VALUES (generate_series(41,50))"); +$node_standby2->poll_query_until('postgres', + "SELECT pg_get_wal_replay_pause_state() = 'paused'") + or die "Timed out while waiting for recovery to be paused"; + +$node_standby2->promote; +$node_standby2->poll_query_until('postgres', + "SELECT NOT pg_is_in_recovery()") + or die "Timed out while waiting for promotion to finish"; diff --git a/src/test/recovery/t/013_crash_restart.pl b/src/test/recovery/t/013_crash_restart.pl index 66e43ffbe8..e1c36abe97 100644 --- a/src/test/recovery/t/013_crash_restart.pl +++ b/src/test/recovery/t/013_crash_restart.pl @@ -17,7 +17,6 @@ use PostgresNode; use TestLib; use Test::More; use Config; -use Time::HiRes qw(usleep); plan tests => 18;
Re: libpq_pipeline in tmp_install
On 10.05.21 20:26, Peter Eisentraut wrote: The test program libpq_pipeline produced by the test suite in src/test/modules/libpq_pipeline/ is installed into tmp_install as part of make check. This isn't a real problem by itself, but I think it creates a bit of an asymmetric situation that might be worth cleaning up. Before, the contents of tmp_install exactly matched an actual installation. There were no extra test programs installed. Also, the test suite code doesn't actually use that installed version, so it's not of any use, and it creates confusion about which copy is in use. The reason this is there is that the test suite uses PGXS to build the test program, and so things get installed automatically. I suggest that we should either write out the build system by hand to avoid this, or maybe extend PGXS to support building programs but not installing them. Here is a patch that implements the second solution, which turned out to be very easy. From 0c2f11034d66c072ffd2141bc5724c211ffb6ae8 Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Wed, 19 May 2021 08:12:15 +0200 Subject: [PATCH] Add NO_INSTALL option to pgxs Apply in libpq_pipeline test makefile, so that the test file is not installed into tmp_install. Discussion: https://www.postgresql.org/message-id/flat/cb9d16a6-760f-cd44-28d6-b091d5fb6ca7%40enterprisedb.com --- src/makefiles/pgxs.mk| 4 src/test/modules/libpq_pipeline/Makefile | 2 ++ 2 files changed, 6 insertions(+) diff --git a/src/makefiles/pgxs.mk b/src/makefiles/pgxs.mk index 271e7eaba8..95d0441e9b 100644 --- a/src/makefiles/pgxs.mk +++ b/src/makefiles/pgxs.mk @@ -227,6 +227,8 @@ all: all-lib endif # MODULE_big +ifndef NO_INSTALL + install: all installdirs ifneq (,$(EXTENSION)) $(INSTALL_DATA) $(addprefix $(srcdir)/, $(addsuffix .control, $(EXTENSION))) '$(DESTDIR)$(datadir)/extension/' @@ -336,6 +338,8 @@ endif # with_llvm uninstall: uninstall-lib endif # MODULE_big +endif # NO_INSTALL + clean: ifdef MODULES diff --git a/src/test/modules/libpq_pipeline/Makefile b/src/test/modules/libpq_pipeline/Makefile index b798f5fbbc..c9c5ae1beb 100644 --- a/src/test/modules/libpq_pipeline/Makefile +++ b/src/test/modules/libpq_pipeline/Makefile @@ -3,6 +3,8 @@ PROGRAM = libpq_pipeline OBJS = libpq_pipeline.o +NO_INSTALL = 1 + PG_CPPFLAGS = -I$(libpq_srcdir) PG_LIBS_INTERNAL += $(libpq_pgport) -- 2.31.1
Re: Refactor "mutually exclusive options" error reporting code in parse_subscription_options
On Wed, May 19, 2021 at 2:33 PM Amul Sul wrote: > > On Wed, May 19, 2021 at 2:09 PM Bharath Rupireddy > wrote: > > > > Hi, > > > > parse_subscription_options function has some similar code when > > throwing errors [with the only difference in the option]. I feel we > > could just use a variable for the option and use it in the error. > > While this has no benefit at all, it saves some LOC and makes the code > > look better with lesser ereport(ERROR statements. PSA patch. > > > > Thoughts? > > I don't have a strong opinion on this, but the patch should add > __translator__ help comment for the error msg. Is the "/*- translator:" help comment something visible to the user or some other tool? If not, I don't think that's necessary as the meaning of the error message is evident by looking at the error message itself. IMO, anyone who looks at that part of the code can understand it. With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
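For reference, the convention works like this: the comment is placed directly before errmsg() so that the gettext tooling copies it into the message catalogs, where it is visible to translators rather than to end users. The example below reuses the comment and message from the patch upthread:

ereport(ERROR,
        (errcode(ERRCODE_SYNTAX_ERROR),
/*- translator: both %s are strings of the form "option = value" */
         errmsg("%s and %s are mutually exclusive options",
                "connect = false", "enabled = true")));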
Re: Different compression methods for FPI
On Tue, May 18, 2021 at 10:06:18PM -0500, Justin Pryzby wrote: > On Mon, May 17, 2021 at 04:44:11PM +0900, Michael Paquier wrote: >> On Sun, Mar 21, 2021 at 02:30:04PM -0500, Justin Pryzby wrote: >> >> For this patch, this is going to require a bit more in terms of library >> linking as the block decompression is done in xlogreader.c, so that's one >> thing to worry about. > > I'm not sure what you mean here ? I was wondering if anything else was needed in terms of linking here. >> Don't think you need a hook here, but zlib, or any other method which >> is not supported by the build, had better not be listed instead. This >> ensures that the value cannot be assigned if the binaries don't >> support that. > > I think you're confusing the walmethods struct (which is unconditional) with > wal_compression_options, which is conditional. Indeed I was. For the note, walmethods is not necessary either as a new structure. You could just store a list of strings in xlogreader.c and make a note to keep the entries in a given order. That's simpler as well. >> The patch set is a gathering of various things, and not only things >> associated to the compression method used for FPIs. >> What is the point of that in patch 0002? >>> Subject: [PATCH 03/12] Make sure published XIDs are persistent >> Patch 0003 looks unrelated to this thread. > > ..for the reason that I gave: > > | And 2ndary patches from another thread to allow passing recovery tests. > |These two patches are a prerequisite for this patch to progress: > | * Run 011_crash_recovery.pl with wal_level=minimal > | * Make sure published XIDs are persistent I still don't understand why XID consistency has anything to do with the compression of FPIs. There is nothing preventing the testing of compression of FPIs, and please note this argument: https://www.postgresql.org/message-id/bef3b1e0-0b31-4f05-8e0a-f681cb918...@yandex-team.ru For example, I can just revert from my tree 0002 and 0003, and still perform tests of the various compression methods. I do agree that we are going to need to do something about this problem, but let's drop this stuff from the set of patches of this thread and just discuss them where they are needed. >> + { >> + {"wal_compression_method", PGC_SIGHUP, WAL_SETTINGS, >> + gettext_noop("Set the method used to compress full page images >> in the WAL."), >> + NULL >> + }, >> + &wal_compression_method, >> + WAL_COMPRESSION_PGLZ, wal_compression_options, >> + NULL, NULL, NULL >> + }, >> The interface is not quite right to me. I think that we should just >> change wal_compression to become an enum, with extra values for pglz >> and the new method. "on" would be a synonym for "pglz". > > Andrey gave a reason in March: > > | I hope one day we will compress all WAL, not just FPIs. Advanced archive > management tools already do so, why not compress it in walwriter? > | When this will be implemented, we could have wal_compression = {off, fpi, > all}. > > [...] > I don't see why we'd add a guc for configuring compression but not include > the 30 lines of code needed to support a 3rd method that is already used by > the > core server. Because that makes things confusing. We have no idea if we'll ever reach a point or even if it makes sense to have compression applied to multiple parts of WAL. So, for now, let's just use one single GUC and be done with it. Your argument is not tied to what's proposed on this thread either, and could go the other way around.
If we were to invent more compression concepts in WAL records, we could as well just go with a new GUC that lists the parts of the WAL where compression needs to be applied. I'd say to keep it to a minimum for now, that's an interface less confusing than what's proposed here. And you have not replaced BKPIMAGE_IS_COMPRESSED by a PGLZ-equivalent, so your patch set is eating more bits for BKPIMAGE_* than it needs to. By the way, it would be really useful for the user to print in pg_waldump -b the type of compression used :) A last point, and I think that this should be part of a study of the choice to made for an extra compression method: there is no discussion yet about the level of compression applied, which is something that can be applied to zstd, lz4 or even zlib. Perhaps there is an argument for a different GUC controlling that, so more benchmarking is a first step needed for this thread to move on. Benchmarking can happen with what's currently posted, of course. -- Michael signature.asc Description: PGP signature
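To make the single-GUC suggestion concrete, folding the method into wal_compression as an enum could look roughly like the C sketch below; the enum and value names are assumptions for illustration, not committed code:

typedef enum WalCompression
{
    WAL_COMPRESSION_NONE = 0,
    WAL_COMPRESSION_PGLZ,
    WAL_COMPRESSION_LZ4,
    WAL_COMPRESSION_ZSTD
} WalCompression;

static const struct config_enum_entry wal_compression_options[] = {
    {"pglz", WAL_COMPRESSION_PGLZ, false},
#ifdef USE_LZ4
    {"lz4", WAL_COMPRESSION_LZ4, false},
#endif
#ifdef USE_ZSTD
    {"zstd", WAL_COMPRESSION_ZSTD, false},
#endif
    {"on", WAL_COMPRESSION_PGLZ, false},    /* historical synonym */
    {"off", WAL_COMPRESSION_NONE, false},
    {NULL, 0, false}
};

Guarding unsupported methods with #ifdef, as sketched, is one way to ensure that a value cannot be assigned when the binaries lack the corresponding library, per the comment upthread.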
Re: Refactor "mutually exclusive options" error reporting code in parse_subscription_options
On Wed, May 19, 2021 at 2:09 PM Bharath Rupireddy wrote: > > Hi, > > parse_subscription_options function has some similar code when > throwing errors [with the only difference in the option]. I feel we > could just use a variable for the option and use it in the error. > While this has no benefit at all, it saves some LOC and makes the code > look better with lesser ereport(ERROR statements. PSA patch. > > Thoughts? I don't have a strong opinion on this, but the patch should add __translator__ help comment for the error msg. Regards, Amul
Refactor "mutually exclusive options" error reporting code in parse_subscription_options
Hi, parse_subscription_options function has some similar code when throwing errors [with the only difference in the option]. I feel we could just use a variable for the option and use it in the error. While this has no benefit at all, it saves some LOC and makes the code look better with lesser ereport(ERROR statements. PSA patch. Thoughts? With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com From 6e4824653d2849631cba3cfe1fe562e1b2823c4a Mon Sep 17 00:00:00 2001 From: Bharath Rupireddy Date: Sat, 1 May 2021 20:26:30 +0530 Subject: [PATCH v1] Refactoring of error code in parse_subscription_options --- src/backend/commands/subscriptioncmds.c | 50 - 1 file changed, 23 insertions(+), 27 deletions(-) diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c index 517c8edd3b..c4c9b4bd8c 100644 --- a/src/backend/commands/subscriptioncmds.c +++ b/src/backend/commands/subscriptioncmds.c @@ -227,25 +227,21 @@ parse_subscription_options(List *options, */ if (connect && !*connect) { + char *option = NULL; + /* Check for incompatible options from the user. */ if (enabled && *enabled_given && *enabled) - ereport(ERROR, - (errcode(ERRCODE_SYNTAX_ERROR), - /*- translator: both %s are strings of the form "option = value" */ - errmsg("%s and %s are mutually exclusive options", - "connect = false", "enabled = true"))); - - if (create_slot && create_slot_given && *create_slot) - ereport(ERROR, - (errcode(ERRCODE_SYNTAX_ERROR), - errmsg("%s and %s are mutually exclusive options", - "connect = false", "create_slot = true"))); + option = "enabled = true"; + else if (create_slot && create_slot_given && *create_slot) + option = "create_slot = true"; + else if (copy_data && copy_data_given && *copy_data) + option = "copy_data = true"; - if (copy_data && copy_data_given && *copy_data) + if (option) ereport(ERROR, (errcode(ERRCODE_SYNTAX_ERROR), errmsg("%s and %s are mutually exclusive options", - "connect = false", "copy_data = true"))); + "connect = false", option))); /* Change the defaults of other options. */ *enabled = false; @@ -259,31 +255,31 @@ parse_subscription_options(List *options, */ if (slot_name && *slot_name_given && !*slot_name) { + char *option = NULL; + if (enabled && *enabled_given && *enabled) - ereport(ERROR, - (errcode(ERRCODE_SYNTAX_ERROR), - /*- translator: both %s are strings of the form "option = value" */ - errmsg("%s and %s are mutually exclusive options", - "slot_name = NONE", "enabled = true"))); + option = "enabled = true"; + else if (create_slot && create_slot_given && *create_slot) + option = "create_slot = true"; - if (create_slot && create_slot_given && *create_slot) + if (option) ereport(ERROR, (errcode(ERRCODE_SYNTAX_ERROR), errmsg("%s and %s are mutually exclusive options", - "slot_name = NONE", "create_slot = true"))); + "slot_name = NONE", option))); + + option = NULL; if (enabled && !*enabled_given && *enabled) - ereport(ERROR, - (errcode(ERRCODE_SYNTAX_ERROR), - /*- translator: both %s are strings of the form "option = value" */ - errmsg("subscription with %s must also set %s", - "slot_name = NONE", "enabled = false"))); + option = "enabled = false"; + else if (create_slot && !create_slot_given && *create_slot) + option = "create_slot = false"; - if (create_slot && !create_slot_given && *create_slot) + if (option) ereport(ERROR, (errcode(ERRCODE_SYNTAX_ERROR), errmsg("subscription with %s must also set %s", - "slot_name = NONE", "create_slot = false"))); + "slot_name = NONE", option))); } } -- 2.25.1
Re: Added missing tab completion for alter subscription set option
On Tue, May 18, 2021 at 9:21 PM Alvaro Herrera wrote: > > On 2021-May-14, vignesh C wrote: > > > While I was reviewing one of the logical decoding features, I found > > Streaming and binary options were missing in tab completion for the > > alter subscription set option, the attached patch has the changes for > > the same. > > Thoughts? > > I wish we didn't have to keep knowledge in the psql source on which > option names are to be used for each command. If we had some function > SELECT pg_completion_options('alter subscription set'); > that returned the list of options usable for each command, we wouldn't > have to ... psql would just retrieve the list of options for the current > command. > > Maintaining such a list does not seem hard -- for example we could just > have a function alongside parse_subscription_option() that returns the > names that are recognized by that one. If we drive the implementation > of both off a single struct, it would never be outdated. Yeah, having something similar to table_storage_parameters works better. While on this, I found that not all the options are listed for the CREATE SUBSCRIPTION command in tab-complete.c; the missing ones are binary and streaming: else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "(")) COMPLETE_WITH("copy_data", "connect", "create_slot", "enabled", "slot_name", "synchronous_commit"); Similarly, CREATE and ALTER PUBLICATION don't have the publish_via_partition_root option: else if (HeadMatches("CREATE", "PUBLICATION") && TailMatches("WITH", "(")) COMPLETE_WITH("publish"); I think having some structures like below in subscriptioncmds.h, publicationcmds.h and using them in tab-complete.c would make more sense. static const char *const create_subscription_params[] = { "copy_data", "create_slot", "enabled", "slot_name", "synchronous_commit", "binary", "connect", "streaming", NULL }; static const char *const alter_subscription_set_params[] = { "binary", "slot_name", "streaming", "synchronous_commit", NULL }; static const char *const create_or_alter_publication_params[] = { "publish", "publish_via_partition_root", NULL }; With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com
Re: pgbench test failing on 14beta1 on Debian/i386
On Wed, 19 May 2021 at 00:35, Thomas Munro wrote: > > FWIW this is reproducible on my local Debian/gcc box with -m32, Confirmed, thanks for looking. I can reproduce it on my machine with -m32. It's somewhat annoying that the buildfarm didn't pick it up sooner :-( On Wed, 19 May 2021 at 08:28, Michael Paquier wrote: > > On Wed, May 19, 2021 at 09:06:16AM +0200, Fabien COELHO wrote: > > I see two simple approaches: > > > > (1) use another PRNG inside pgbench, eg Knuth's which was used in some > > previous submission and is very simple and IMHO better than the rand48 > > stuff. > > > > (2) extend pg_*rand48() to provide an unsigned 64 bits out of the 48 bits > > state. > > Or, (3) remove this test? I am not quite sure what there is to gain > with this extra test considering all the other tests with permute() > already present in this script. Yes, I think removing the test is the best option. It was originally added because there was a separate code path for larger permutation sizes that needed testing, but that's no longer the case so the test really isn't adding anything. Regards, Dean
Re: pg_get_wal_replay_pause_state() should not return 'paused' while a promotion is ongoing.
On Wed, May 19, 2021 at 11:55 AM Kyotaro Horiguchi wrote: > > At Wed, 19 May 2021 11:19:13 +0530, Dilip Kumar wrote > in > > On Wed, May 19, 2021 at 10:16 AM Fujii Masao > > wrote: > > > > > > On 2021/05/18 15:46, Michael Paquier wrote: > > > > On Tue, May 18, 2021 at 12:48:38PM +0900, Fujii Masao wrote: > > > >> Currently a promotion causes all available WAL to be replayed before > > > >> a standby becomes a primary whether it was in paused state or not. > > > >> OTOH, something like immediate promotion (i.e., standby becomes > > > >> a primary without replaying outstanding WAL) might be useful for > > > >> some cases. I don't object to that. > > > > > > > > Sounds like a "promotion immediate" mode. It does not sound difficult > > > > nor expensive to add a small test for that in one of the existing > > > > recovery tests triggerring a promotion. Could you add one based on > > > > pg_get_wal_replay_pause_state()? > > > > > > You're thinking to add the test like the following? > > > #1. Pause the recovery > > > #2. Confirm that pg_get_wal_replay_pause_state() returns 'paused' > > > #3. Trigger standby promotion > > > #4. Confirm that pg_get_wal_replay_pause_state() returns 'not paused' > > > > > > It seems not easy to do the test #4 stably because > > > pg_get_wal_replay_pause_state() needs to be executed > > > before the promotion finishes. > > > > Even for #2, we can not ensure that whether it will be 'paused' or > > 'pause requested'. > > We often use poll_query_until() to make sure some desired state is > reached. And, as Michael suggested, the function > pg_get_wal_replay_pause_state() still works at the time of > recovery_end_command. So a bit more detailed steps are: Right, if we are polling for the state change in #2 then that makes sense. > #0. Equip the server with recovery_end_command that waits for some > trigger then start the server. > #1. Pause the recovery > #2. Wait until pg_get_wal_replay_pause_state() returns 'paused' > #3. Trigger standby promotion > #4. Wait until pg_get_wal_replay_pause_state() returns 'not paused' > #5. Trigger recovery_end_command to let promotion proceed. +1 -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: Bug fix for tab completion of ALTER TABLE ... VALIDATE CONSTRAINT ...
On Tue, Apr 27, 2021 at 12:58:52PM +0300, Aleksander Alekseev wrote: > I've noticed there is no tab completion for ALTER TABLE xxx ADD. Here > is an alternative version of the patch that fixes this as well. Not > sure if this should be in the same commit though. - /* If we have ALTER TABLE DROP, provide COLUMN or CONSTRAINT */ - else if (Matches("ALTER", "TABLE", MatchAny, "DROP")) + /* If we have ALTER TABLE ADD|DROP, provide COLUMN or CONSTRAINT */ + else if (Matches("ALTER", "TABLE", MatchAny, "ADD|DROP")) Seems to me that the behavior to not complete with COLUMN or CONSTRAINT for ADD is intentional, as it is possible to specify a constraint or column name without the object type first. This introduces an inconsistent behavior with what we do for columns with ADD, for one. So a more consistent approach would be to list columns, constraints, COLUMN and CONSTRAINT in the list of options available after ADD. + else if (Matches("ALTER", "TABLE", MatchAny, "VALIDATE", "CONSTRAINT")) + { + completion_info_charp = prev3_wd; + COMPLETE_WITH_QUERY(Query_for_nonvalid_constraint_of_table); + } Specifying valid constraints is an authorized grammar, so it does not seem that bad to keep things as they are, either. I would leave that alone. -- Michael signature.asc Description: PGP signature
Re: pg_get_wal_replay_pause_state() should not return 'paused' while a promotion is ongoing.
At Wed, 19 May 2021 16:21:58 +0900, Fujii Masao wrote in > > > On 2021/05/19 15:25, Kyotaro Horiguchi wrote: > > At Wed, 19 May 2021 11:19:13 +0530, Dilip Kumar > > wrote in > >> On Wed, May 19, 2021 at 10:16 AM Fujii Masao > >> wrote: > >>> > >>> On 2021/05/18 15:46, Michael Paquier wrote: > On Tue, May 18, 2021 at 12:48:38PM +0900, Fujii Masao wrote: > > Currently a promotion causes all available WAL to be replayed before > > a standby becomes a primary whether it was in paused state or not. > > OTOH, something like immediate promotion (i.e., standby becomes > > a primary without replaying outstanding WAL) might be useful for > > some cases. I don't object to that. > > Sounds like a "promotion immediate" mode. It does not sound difficult > nor expensive to add a small test for that in one of the existing > recovery tests triggerring a promotion. Could you add one based on > pg_get_wal_replay_pause_state()? > >>> > >>> You're thinking to add the test like the following? > >>> #1. Pause the recovery > >>> #2. Confirm that pg_get_wal_replay_pause_state() returns 'paused' > >>> #3. Trigger standby promotion > >>> #4. Confirm that pg_get_wal_replay_pause_state() returns 'not paused' > >>> > >>> It seems not easy to do the test #4 stably because > >>> pg_get_wal_replay_pause_state() needs to be executed > >>> before the promotion finishes. > >> > >> Even for #2, we can not ensure that whether it will be 'paused' or > >> 'pause requested'. > > We often use poll_query_until() to make sure some desired state is > > reached. > > Yes. > > > And, as Michael suggested, the function > > pg_get_wal_replay_pause_state() still works at the time of > > recovery_end_command. So a bit more detailed steps are: > > IMO this idea is tricky and fragile, so I'm inclined to avoid that if Agreed, the recovery_end_command would be something like the following avoiding dependency on sh. However, I'm not sure it works as well on Windows.. recovery_end_command='perl -e "while( -f \'$trigfile\') {sleep 0.1;}"' > possible. > Attached is the POC patch to add the following tests. > > #1. Check that pg_get_wal_replay_pause_state() reports "not paused" at > #first. > #2. Request to pause archive recovery and wait until it's actually > #paused. > #3. Request to resume archive recovery and wait until it's actually > #resumed. > #4. Request to pause archive recovery and wait until it's actually > #paused. >Then, check that the paused state ends and promotion continues >if a promotion is triggered while recovery is paused. > > In #4, pg_get_wal_replay_pause_state() is not executed while promotion > is ongoing. #4 checks that pg_is_in_recovery() returns false and > the promotion finishes expectedly in that case. Isn't this test enough > for now? +1 for adding some tests for pg_wal_replay_pause() but the test seems like checking only that pg_get_wal_replay_pause_state() returns the expected state value. Don't we need to check that the recovery is actually paused and that the promotion happens at expected LSN? regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: pgbench test failing on 14beta1 on Debian/i386
On Wed, May 19, 2021 at 09:06:16AM +0200, Fabien COELHO wrote: > I see two simple approaches: > > (1) use another PRNG inside pgbench, eg Knuth's which was used in some > previous submission and is very simple and IMHO better than the rand48 > stuff. > > (2) extend pg_*rand48() to provide an unsigned 64 bits out of the 48 bits > state. Or, (3) remove this test? I am not quite sure what there is to gain with this extra test considering all the other tests with permute() already present in this script. -- Michael signature.asc Description: PGP signature
Re: pg_get_wal_replay_pause_state() should not return 'paused' while a promotion is ongoing.
On 2021/05/19 15:25, Kyotaro Horiguchi wrote: At Wed, 19 May 2021 11:19:13 +0530, Dilip Kumar wrote in On Wed, May 19, 2021 at 10:16 AM Fujii Masao wrote: On 2021/05/18 15:46, Michael Paquier wrote: On Tue, May 18, 2021 at 12:48:38PM +0900, Fujii Masao wrote: Currently a promotion causes all available WAL to be replayed before a standby becomes a primary whether it was in paused state or not. OTOH, something like immediate promotion (i.e., standby becomes a primary without replaying outstanding WAL) might be useful for some cases. I don't object to that. Sounds like a "promotion immediate" mode. It does not sound difficult nor expensive to add a small test for that in one of the existing recovery tests triggerring a promotion. Could you add one based on pg_get_wal_replay_pause_state()? You're thinking to add the test like the following? #1. Pause the recovery #2. Confirm that pg_get_wal_replay_pause_state() returns 'paused' #3. Trigger standby promotion #4. Confirm that pg_get_wal_replay_pause_state() returns 'not paused' It seems not easy to do the test #4 stably because pg_get_wal_replay_pause_state() needs to be executed before the promotion finishes. Even for #2, we can not ensure that whether it will be 'paused' or 'pause requested'. We often use poll_query_until() to make sure some desired state is reached. Yes. And, as Michael suggested, the function pg_get_wal_replay_pause_state() still works at the time of recovery_end_command. So a bit more detailed steps are: IMO this idea is tricky and fragile, so I'm inclined to avoid that if possible. Attached is the POC patch to add the following tests. #1. Check that pg_get_wal_replay_pause_state() reports "not paused" at first. #2. Request to pause archive recovery and wait until it's actually paused. #3. Request to resume archive recovery and wait until it's actually resumed. #4. Request to pause archive recovery and wait until it's actually paused. Then, check that the paused state ends and promotion continues if a promotion is triggered while recovery is paused. In #4, pg_get_wal_replay_pause_state() is not executed while promotion is ongoing. #4 checks that pg_is_in_recovery() returns false and the promotion finishes expectedly in that case. Isn't this test enough for now? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION diff --git a/src/test/recovery/t/002_archiving.pl b/src/test/recovery/t/002_archiving.pl index c675c0886c..8db7e47d13 100644 --- a/src/test/recovery/t/002_archiving.pl +++ b/src/test/recovery/t/002_archiving.pl @@ -6,7 +6,7 @@ use strict; use warnings; use PostgresNode; use TestLib; -use Test::More tests => 3; +use Test::More tests => 4; use File::Copy; # Initialize primary node, doing archives @@ -75,3 +75,42 @@ ok( !-f "$node_standby2_data/pg_wal/RECOVERYHISTORY", "RECOVERYHISTORY removed after promotion"); ok( !-f "$node_standby2_data/pg_wal/RECOVERYXLOG", "RECOVERYXLOG removed after promotion"); + +# Check that archive recovery can be paused or resumed expectedly. +my $node_standby3 = get_new_node('standby3'); +$node_standby3->init_from_backup($node_primary, $backup_name, + has_restoring => 1); +$node_standby3->start; + +# Archive recovery is not yet paused. +is($node_standby3->safe_psql('postgres', + "SELECT pg_get_wal_replay_pause_state()"), + 'not paused', 'pg_get_wal_replay_pause_state() reports not paused'); + +# Request to pause archive recovery and wait until it's actually paused. 
+$node_standby3->safe_psql('postgres', "SELECT pg_wal_replay_pause()"); +$node_primary->safe_psql('postgres', + "INSERT INTO tab_int VALUES (generate_series(2001,2010))"); +$node_standby3->poll_query_until('postgres', + "SELECT pg_get_wal_replay_pause_state() = 'paused'") + or die "Timed out while waiting for archive recovery to be paused"; + +# Request to resume archive recovery and wait until it's actually resumed. +$node_standby3->safe_psql('postgres', "SELECT pg_wal_replay_resume()"); +$node_standby3->poll_query_until('postgres', + "SELECT pg_get_wal_replay_pause_state() = 'not paused'") + or die "Timed out while waiting for archive recovery to be resumed"; + +# Check that the paused state ends and promotion continues if a promotion +# is triggered while recovery is paused. +$node_standby3->safe_psql('postgres', "SELECT pg_wal_replay_pause()"); +$node_primary->safe_psql('postgres', + "INSERT INTO tab_int VALUES (generate_series(2011,2020))"); +$node_standby3->poll_query_until('postgres', + "SELECT pg_get_wal_replay_pause_state() = 'paused'") + or die "Timed out while waiting for archive recovery to be paused"; + +$node_standby3->promote; +$node_standby3->poll_query_until('postgres', + "SELECT NOT pg_is_in_recovery()") + or die "Timed out while waiting for promotion to finish";
Re: pgbench test failing on 14beta1 on Debian/i386
Forgot to post the actual values: r = 2563421694876090368 r = 2563421694876090365 Smells a bit like a precision problem in the workings of pg_erand48(), but as soon as I saw floating point numbers I closed my laptop and ran for the door. Yup. This test has a touching, but entirely unwarranted, faith in pg_erand48() producing bit-for-bit the same values everywhere. Indeed. I argued against involving any float computation on principle, but Dean was confident it could work, and it did simplify the code, so it did not look that bad an option. I see two simple approaches: (1) use another PRNG inside pgbench, eg Knuth's which was used in some previous submission and is very simple and IMHO better than the rand48 stuff. (2) extend pg_*rand48() to provide an unsigned 64 bits out of the 48 bits state. Any preference? -- Fabien.
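For concreteness, minimal C sketches of both approaches; the function names here are hypothetical, and pg_jrand48() is the signed 32-bit output function of the existing 48-bit generator in src/port/erand48.c:

#include <stdint.h>

extern long pg_jrand48(unsigned short xseed[3]);    /* existing function */

/* (1) A tiny stand-alone 64-bit LCG, using Knuth's MMIX constants. */
static uint64_t
knuth_lcg_next(uint64_t *state)
{
    *state = *state * 6364136223846793005ULL + 1442695040888963407ULL;
    return *state;
}

/* (2) An unsigned 64-bit value assembled from two draws of the existing
 * 48-bit-state generator. */
static uint64_t
pg_rand48_u64(unsigned short xseed[3])
{
    uint64_t    hi = (uint32_t) pg_jrand48(xseed);
    uint64_t    lo = (uint32_t) pg_jrand48(xseed);

    return (hi << 32) | lo;
}

Either way the value is produced with integer arithmetic only, sidestepping the platform-dependent floating-point rounding that made the test non-deterministic on i386.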
Re: subscriptioncheck failure
On Wed, May 19, 2021 at 10:28 AM Amit Kapila wrote: > > On Tue, May 18, 2021 at 11:25 AM vignesh C wrote: > > > > Thanks for the comments, the attached patch has the changes for the same. > > > > Thanks, I have pushed your patch after making minor changes in the comments. Thanks for pushing this patch. Regards, Vignesh
Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS
On Wed, May 19, 2021 at 2:05 PM Michael Paquier wrote: > On Wed, May 19, 2021 at 10:26:28AM +0530, Amit Kapila wrote: > > How about moving AfterTriggerEndQuery() to apply_handle_*_internal > > calls? That way, we might not even need to change Push/Pop calls. > > Isn't that going to be a problem when a tuple is moved to a new > partition in the tuple routing? This does a DELETE followed by an > INSERT, but the operation is an UPDATE. That indeed doesn't work. Once AfterTriggerEndQuery() is called for the DELETE from apply_handle_delete_internal(), the after triggers of the subsequent INSERT can't be processed, instead causing: ERROR: AfterTriggerSaveEvent() called outside of query IOW, the patch you posted earlier seems like the way to go. -- Amit Langote EDB: http://www.enterprisedb.com
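In other words, the workable arrangement keeps a single after-trigger query level per replicated change, opened and closed at the outer handler. A rough C sketch of the shape (not the actual patch; the real handlers carry more context than shown):

/* inside the outer apply_handle_update(), simplified */
AfterTriggerBeginQuery();

/*
 * Apply the UPDATE here. With partition tuple routing this may internally
 * perform a DELETE on the old partition plus an INSERT on the new one,
 * but both stay within the single query level opened above.
 */

AfterTriggerEndQuery(estate);   /* fires queued AFTER triggers once */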
Re: pg_get_wal_replay_pause_state() should not return 'paused' while a promotion is ongoing.
At Wed, 19 May 2021 11:19:13 +0530, Dilip Kumar wrote in > On Wed, May 19, 2021 at 10:16 AM Fujii Masao > wrote: > > > > On 2021/05/18 15:46, Michael Paquier wrote: > > > On Tue, May 18, 2021 at 12:48:38PM +0900, Fujii Masao wrote: > > >> Currently a promotion causes all available WAL to be replayed before > > >> a standby becomes a primary whether it was in paused state or not. > > >> OTOH, something like immediate promotion (i.e., standby becomes > > >> a primary without replaying outstanding WAL) might be useful for > > >> some cases. I don't object to that. > > > > > > Sounds like a "promotion immediate" mode. It does not sound difficult > > > nor expensive to add a small test for that in one of the existing > > > recovery tests triggerring a promotion. Could you add one based on > > > pg_get_wal_replay_pause_state()? > > > > You're thinking to add the test like the following? > > #1. Pause the recovery > > #2. Confirm that pg_get_wal_replay_pause_state() returns 'paused' > > #3. Trigger standby promotion > > #4. Confirm that pg_get_wal_replay_pause_state() returns 'not paused' > > > > It seems not easy to do the test #4 stably because > > pg_get_wal_replay_pause_state() needs to be executed > > before the promotion finishes. > > Even for #2, we can not ensure that whether it will be 'paused' or > 'pause requested'. We often use poll_query_until() to make sure some desired state is reached. And, as Michael suggested, the function pg_get_wal_replay_pause_state() still works at the time of recovery_end_command. So a bit more detailed steps are: #0. Equip the server with recovery_end_command that waits for some trigger then start the server. #1. Pause the recovery #2. Wait until pg_get_wal_replay_pause_state() returns 'paused' #3. Trigger standby promotion #4. Wait until pg_get_wal_replay_pause_state() returns 'not paused' #5. Trigger recovery_end_command to let promotion proceed. regards. -- Kyotaro Horiguchi NTT Open Source Software Center