Re: Question about savepoint level?
On Mon, 24 Oct 2022 at 12:19, Japin Li wrote:
> Hi, hackers
>
> The TransactionStateData has savepointLevel field, however, I do not understand
> what is savepoint level, it seems all savepoints have the same savepointLevel,
> I want to know how the savepoint level changes.

I try to remove the savepointLevel, and it seems harmless. Any thoughts?

--
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.

From 1e5c015efc44bcf2bc93365e99740deb618eebfe Mon Sep 17 00:00:00 2001
From: Japin Li
Date: Mon, 24 Oct 2022 14:54:03 +0800
Subject: [PATCH v1] Remove useless savepoint level

---
 src/backend/access/transam/xact.c | 18 --
 1 file changed, 18 deletions(-)

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index fd5103a78e..e8a90a3a30 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -190,7 +190,6 @@ typedef struct TransactionStateData
 FullTransactionId fullTransactionId; /* my FullTransactionId */
 SubTransactionId subTransactionId; /* my subxact ID */
 char *name; /* savepoint name, if any */
- int savepointLevel; /* savepoint level */
 TransState state; /* low-level state */
 TBlockState blockState; /* high-level state */
 int nestingLevel; /* transaction nesting depth */
@@ -3234,12 +3233,10 @@ CommitTransactionCommand(void)
 case TBLOCK_SUBRESTART:
 {
 char *name;
-int savepointLevel;
 /* save name and keep Cleanup from freeing it */
 name = s->name;
 s->name = NULL;
-savepointLevel = s->savepointLevel;
 AbortSubTransaction();
 CleanupSubTransaction();
@@ -3247,7 +3244,6 @@ CommitTransactionCommand(void)
 DefineSavepoint(NULL);
 s = CurrentTransactionState; /* changed by push */
 s->name = name;
-s->savepointLevel = savepointLevel;
 /* This is the same as TBLOCK_SUBBEGIN case */
 AssertState(s->blockState == TBLOCK_SUBBEGIN);
@@ -3263,19 +3259,16 @@ CommitTransactionCommand(void)
 case TBLOCK_SUBABORT_RESTART:
 {
 char *name;
-int savepointLevel;
 /* save name and keep Cleanup from freeing it */
 name = s->name;
 s->name = NULL;
-savepointLevel = s->savepointLevel;
 CleanupSubTransaction();
 DefineSavepoint(NULL);
 s = CurrentTransactionState; /* changed by push */
 s->name = name;
-s->savepointLevel = savepointLevel;
 /* This is the same as TBLOCK_SUBBEGIN case */
 AssertState(s->blockState == TBLOCK_SUBBEGIN);
@@ -4352,11 +4345,6 @@ ReleaseSavepoint(const char *name)
 (errcode(ERRCODE_S_E_INVALID_SPECIFICATION),
 errmsg("savepoint \"%s\" does not exist", name)));
- /* disallow crossing savepoint level boundaries */
- if (target->savepointLevel != s->savepointLevel)
- ereport(ERROR,
-(errcode(ERRCODE_S_E_INVALID_SPECIFICATION),
- errmsg("savepoint \"%s\" does not exist within current savepoint level", name)));
 /*
 * Mark "commit pending" all subtransactions up to the target
@@ -4461,11 +4449,6 @@ RollbackToSavepoint(const char *name)
 (errcode(ERRCODE_S_E_INVALID_SPECIFICATION),
 errmsg("savepoint \"%s\" does not exist", name)));
- /* disallow crossing savepoint level boundaries */
- if (target->savepointLevel != s->savepointLevel)
- ereport(ERROR,
-(errcode(ERRCODE_S_E_INVALID_SPECIFICATION),
- errmsg("savepoint \"%s\" does not exist within current savepoint level", name)));
 /*
 * Mark "abort pending" all subtransactions up to the target
@@ -5253,7 +5236,6 @@ PushTransaction(void)
 s->parent = p;
 s->nestingLevel = p->nestingLevel + 1;
 s->gucNestLevel = NewGUCNestLevel();
- s->savepointLevel = p->savepointLevel;
 s->state = TRANS_DEFAULT;
 s->blockState = TBLOCK_SUBBEGIN;
 GetUserIdAndSecContext(&s->prevUser, &s->prevSecContext);
--
2.25.1
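
For readers unfamiliar with the field being discussed, a minimal illustration of current behaviour (stock PostgreSQL): savepoints created at any nesting depth can be released or rolled back to freely, because every TransactionStateData ends up with the same savepointLevel, so the level check removed by the patch never fires.

    BEGIN;
    SAVEPOINT a;
    SAVEPOINT b;                  -- nested under a, but same savepoint level
    ROLLBACK TO SAVEPOINT a;      -- allowed; the level boundary check is never triggered
    RELEASE SAVEPOINT a;
    COMMIT;

The "does not exist within current savepoint level" error is therefore unreachable with the current code, which is what prompts the question above.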
Re: Improve tab completion for ALTER STATISTICS
On Wed, Oct 19, 2022 at 04:06:51PM +0530, vignesh C wrote:
> I noticed that the tab completion for ALTER STATISTICS .. SET was not
> handled. The attached patch displays SCHEMA and STATISTICS for tab
> completion of ALTER STATISTICS name SET.

Indeed, it is a bit strange as we would get a list of settable parameters once the completion up to SET is done, rather than STATISTICS and SCHEMA. Your patch looks fine, so applied. Thanks!

--
Michael
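
For context, these are the two forms the completion is meant to cover (illustrative object names only):

    ALTER STATISTICS my_stats SET SCHEMA other_schema;
    ALTER STATISTICS my_stats SET STATISTICS 500;

so offering SCHEMA and STATISTICS after "ALTER STATISTICS name SET" is more useful than the generic list of settable server parameters.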
Re: [PATCH] minor bug fix for pg_dump --clean
On 10/24/22 03:01, Tom Lane wrote:
> Frédéric Yhuel writes:
>> When using pg_dump (or pg_restore) with option "--clean", there is some
>> SQL code to drop every object at the beginning.
>
> Yup ...
>
>> The DROP statement for a view involving circular dependencies is:
>> CREATE OR REPLACE VIEW [...]
>> (see commit message of d8c05aff for a much better explanation)
>> If the view is not in the "public" schema, and the target database is
>> empty, this statement fails, because the schema hasn't been created yet.
>> The attached patches are a TAP test which can be used to reproduce the
>> bug, and a proposed fix. They apply to the master branch.
>
> I am disinclined to accept this as a valid bug, because there's never been
> any guarantee that a --clean script would execute error-free in a database
> that doesn't match what the source database contains. (The pg_dump
> documentation used to say that in so many words. I see that whoever added
> the --if-exists option was under the fond illusion that that fixes all
> cases, which it surely does not. We need to back off the promises a bit
> there.)
>
> An example of a case that won't execute error-free is if the view having a
> circular dependency includes a column of a non-built-in data type. If you
> try to run that in an empty database, you'll get an error from the CREATE
> OR REPLACE VIEW's reference to that data type. For instance, if I adjust
> your test case to make the "payload" column be of type hstore, I get
> something like
>
> psql:dumpresult.sql:22: ERROR:  type "public.hstore" does not exist
> LINE 4:     NULL::public.hstore AS payload;
>                   ^
>
> The same type of failure occurs for user-defined functions and operators
> that use a non-built-in type, and I'm sure there are more cases in the
> same vein. But it gets *really* messy if the target database isn't
> completely empty, but contains objects with different properties than the
> dump script expects; for example, if the view being discussed here exists
> with a different column set than the script thinks, or if the dependency
> chains aren't all the same.
>
> If this fix were cleaner I might be inclined to accept it anyway, but it's
> not very clean at all --- for example, it's far from obvious to me what
> are the side-effects of changing the filter in RestoreArchive like that.
> Nor am I sure that the schema you want to create is guaranteed to get
> dropped again later in every use-case.

Hi Tom, Viktoria,

Thank you for your review, Viktoria! Thank you for this detailed explanation, Tom! I didn't have great hope for this patch. I thought that the TAP test could be accepted, but now I can see that it is clearly useless.

> So I think mainly what we ought to do here is to adjust the documentation
> to make it clearer that --clean is not guaranteed to work without errors
> unless the target database has the same set of objects as the source.
> --if-exists can reduce the set of error cases, but not eliminate it.
> Possibly we should be more enthusiastic about recommending --create
> --clean (ie, drop and recreate the whole database) instead.

I believe a documentation patch would be useful, indeed.

Best regards,
Frédéric
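
To make the failure mode concrete, a rough sketch of the kind of statement a --clean script emits for a view involved in a circular dependency (simplified; the real statement is generated by pg_dump, and the column list is just an example):

    -- emitted while "cleaning", before myschema exists in an empty target database
    CREATE OR REPLACE VIEW myschema.v AS
     SELECT NULL::integer AS id,
        NULL::public.hstore AS payload;   -- fails if myschema (or hstore) is missing

Recreating the whole database with pg_dump --create --clean, as recommended above, sidesteps both the missing-schema and the missing-type problems.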
Re: Allow file inclusion in pg_hba and pg_ident files
On Mon, Oct 24, 2022 at 01:33:12PM +0800, Julien Rouhaud wrote: > v12 attached, fixing multiple conflicts with recent activity. typedef struct TokenizedAuthLine { List *fields; /* List of lists of AuthTokens */ + char *file_name; /* File name * Hmm. While putting my eyes on this patch set for the first time in a few months (sorry!), the addition of file_name to TokenizedAuthLine in 0002 stands out. This impacts all the macros used for the validation of the ident and HBA lines, leading as well to a lot of bloat in the patch with patterns like that in 20~25 places: errcontext("line %d of configuration file \"%s\"", \ - line_num, IdentFileName))); \ + line_num, file_name))); \ [...] errcontext("line %d of configuration file \"%s\"", - line_num, HbaFileName))); + line_num, file_name))); Do you think that it would make sense to split that into its own patch? That would move the code toward less references to HbaFileName and IdentFileName, which is one step toward what we want for this thread. This tokenization of HBA and ident files is already shared, so moving this knowledge into TokenizedAuthLine looks helpful independently of the rest. -- Michael signature.asc Description: PGP signature
Re: Move backup-related code to xlogbackup.c/.h
On Wed, Oct 19, 2022 at 09:07:04PM +0530, Bharath Rupireddy wrote: > XLogBackupResetRunning() seemed better. +1 for above function names. I see what you are doing here. XLogCtl would still live in xlog.c, but we want to have functions that are able to manipulate some of its fields. I am not sure to like that much because it introduces a circling dependency between xlog.c and xlogbackup.c. As of HEAD, xlog.c calls build_backup_content() from xlogbackup.c, which is fine as xlog.c is kind of a central piece that feeds on the insert and recovery pieces. However your patch makes some code paths of xlogbackup.c call routines from xlog.c, and I don't think that we should do that. > I'm okay either way. > > Please see the attached v8 patch set. Among all that, CleanupBackupHistory() is different, still it has a dependency with some of the archiving pieces.. -- Michael signature.asc Description: PGP signature
PGDOCS - Logical replication GUCs - added some xrefs
Hi hackers. There is a docs Logical Replication section "31.10 Configuration Settings" [1] which describes some logical replication GUCs, and details on how they interact with each other and how to take that into account when setting their values. There is another docs Server Configuration section "20.6 Replication" [2] which lists the replication-related GUC parameters, and what they are for. Currently AFAIK those two pages are unconnected, but I felt it might be helpful if some of the parameters in the list [2] had xref links to the additional logical replication configuration information [1]. PSA a patch to do that. ~~ Meanwhile, I also suspect that the main blurb top of [1] is not entirely correct... it says "These settings control the behaviour of the built-in streaming replication feature", although some of the GUCs mentioned later in this section are clearly for "logical replication". Thoughts? -- [1] 31.10 Configuration Settings - https://www.postgresql.org/docs/current/logical-replication-config.html [2] 20.6 Replication - https://www.postgresql.org/docs/current/runtime-config-replication.html Kind Regards, Peter Smith. Fujitsu Australia v1-0001-Logical-replication-GUCs-added-some-docs-xrefs.patch Description: Binary data
Re: Allow file inclusion in pg_hba and pg_ident files
Hi, On Mon, Oct 24, 2022 at 04:13:51PM +0900, Michael Paquier wrote: > On Mon, Oct 24, 2022 at 01:33:12PM +0800, Julien Rouhaud wrote: > > v12 attached, fixing multiple conflicts with recent activity. > > typedef struct TokenizedAuthLine > { > List *fields; /* List of lists of AuthTokens */ > + char *file_name; /* File name * > > Hmm. While putting my eyes on this patch set for the first time in a > few months (sorry!), the addition of file_name to TokenizedAuthLine in > 0002 stands out. This impacts all the macros used for the validation > of the ident and HBA lines, leading as well to a lot of bloat in the > patch with patterns like that in 20~25 places: > errcontext("line %d of configuration file \"%s\"", \ > - line_num, IdentFileName))); \ > + line_num, file_name))); \ > [...] > errcontext("line %d of configuration file \"%s\"", > - line_num, HbaFileName))); > + line_num, file_name))); > > Do you think that it would make sense to split that into its own > patch? That would move the code toward less references to HbaFileName > and IdentFileName, which is one step toward what we want for this > thread. This tokenization of HBA and ident files is already shared, > so moving this knowledge into TokenizedAuthLine looks helpful > independently of the rest. It would also require to bring HbaLine->sourcefile. I'm afraid it would be weird to introduce such a refactoring in a separate commit just to pass a constant down multiple level of indirection, as all the macro will remain specific to either hba or ident anyway. I agree that there are quite a lot of s/XXXFileName/file_name/, but those aren't complicated, and keeping them in the same commit makes it easy to validate that none has been forgotten since the regression tests covering those messages are in that commit too. And of course while double checking that none was forgotten I realize that I missed the new regcomp_auth_token() which introduced a couple new usage of HbaFileName.
Re: effective_multixact_freeze_max_age issue
Hello!

On 18.10.2022 20:56, Peter Geoghegan wrote:
> The term "removable cutoff" is how VACUUM VERBOSE reports OldestXmin. I
> think that it's good to use the same terminology here.

Thanks for the explanation! Firstly, exactly this term confused me. Sure, the same terminology makes it all easier to understand.

>> Could you clarify this moment please? Would be very grateful.
>
> While this WARNING is triggered when a threshold controlled by
> autovacuum_freeze_max_age is crossed, it's not just a problem with
> freezing. It's convenient to use autovacuum_freeze_max_age to represent
> "a wildly excessive number of XIDs for OldestXmin to be held back by, no
> matter what". In practice it is usually already a big problem when
> OldestXmin is held back by far fewer XIDs than that, but it's hard to
> reason about when exactly we need to consider that a problem. However, we
> can at least be 100% sure of real problems when OldestXmin age reaches
> autovacuum_freeze_max_age. There is no longer any doubt that we need to
> warn the user, since antiwraparound autovacuum cannot work as designed at
> that point. But the WARNING is nevertheless not primarily (or not
> exclusively) about not being able to freeze. It's also about not being
> able to remove bloat.
>
> Freezing can be thought of as roughly the opposite process to removing
> dead tuples deleted by now committed transactions. OldestXmin is the
> cutoff both for removing dead tuples (which we want to get rid of
> immediately), and freezing live all-visible tuples (which we want to keep
> long term). FreezeLimit is usually 50 million XIDs before OldestXmin (the
> freeze_min_age default), but that's just how we implement lazy freezing,
> which is just an optimization.

That's clear. Thanks a lot!

>> As a variant, maybe split these checks for the freeze cutoffs and the
>> oldest xmins for clarity? The patch attached tries to do this.
>
> I don't think that this is an improvement. For one thing the FreezeLimit
> cutoff will have been held back (relative to nextXID-wise table age) by
> more than the freeze_min_age setting for a long time before this WARNING
> is hit -- so we're not going to show the WARNING in most cases where
> freeze_min_age has been completely ignored (it must be ignored in extreme
> cases because FreezeLimit must always be <= OldestXmin). Also, the
> proposed new WARNING is only seen when we're bound to also see the
> existing OldestXmin WARNING already. Why have 2 WARNINGs about exactly
> the same problem?

I didn't understand this moment. If the FreezeLimit is always older than OldestXmin or equal to it according to:

> FreezeLimit is usually 50 million XIDs before OldestXmin (the
> freeze_min_age default),

can't there be a situation like this?

                         ________________________________________
                        |       autovacuum_freeze_max_age        |
  past <___|____________|________________________|_______________|___> future
       FreezeLimit  safeOldestXmin           oldestXmin        nextXID
           |_____________________________________|
                        freeze_min_age

In that case the existing OldestXmin WARNING will not fire while the new one will. As the FreezeLimit is only an optimization, it's likely not a warning but a notice message before the OldestXmin WARNING and possible real problems in the future. Maybe it can be useful in such a form?

With best wishes,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
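
As a side note for anyone trying to observe the condition discussed above, a couple of standard monitoring queries (nothing specific to the proposed patch) show how far back the removable cutoff and database ages are being held:

    -- oldest snapshot xmin still held by a session on this node
    SELECT max(age(backend_xmin)) FROM pg_stat_activity;

    -- nextXID-relative age of each database's datfrozenxid
    SELECT datname, age(datfrozenxid) FROM pg_database ORDER BY 2 DESC;

When the first number approaches autovacuum_freeze_max_age, that is the sort of situation in which the OldestXmin WARNING discussed in this thread fires.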
Re: Pluggable toaster
Hi Nikita, I don't argue with most of what you say. I am just pointing out the reason why the chosen approach "N TOASTers x M TableAMs" will not work: > Don't you think that this is an arguable design decision? Basically > all we know about the underlying TableAM is that it stores tuples > _somehow_ and that tuples have TIDs [1]. That's it. We don't know if > it even has any sort of pages, whether they are fixed in size or not, > whether it uses shared buffers, etc. It may not even require TOAST. > [...] Also I completely agree with: > Implementing another Table AM just to implement another TOAST strategy seems > too > much, the TAM API is very heavy and complex, and you would have to add it as > a contrib. This is what I meant above when talking about the framework for simplifying this task: > It looks like the idea should be actually turned inside out. I.e. what > would be nice to have is some sort of _framework_ that helps TableAM > authors to implement TOAST (alternatively, the rest of the TableAM > except for TOAST) if the TableAM is similar to the default one. >From the user perspective it's much easier to think about one entity - TableAM, and choosing from heapam_with_default_toast and heapam_with_different_toast. >From the extension implementer point of view creating TableAMs is a difficult task. This is what the framework should address. Ideally the interface should be as simple as: CreateParametrizedDefaultHeapAM(SomeTOASTSubstitutionObject, ...other arguments, in the future...) Where the extension author should be worried only about an alternative TOAST implementation. I think at some point such a framework may address at least one more issue we have - an inability to change the page size on the table level. As it was shown by Tomas Vondra [1] the default 8 KB page size can be suboptimal depending on the load. So it would be nice if the user could change it without rebuilding PostgreSQL. Naturally this is out of scope of this particular patchset. I just wanted to point out opportunities we have here. [1]: https://www.postgresql.org/message-id/flat/b4861449-6c54-ccf8-e67c-c039228cdc6d%40enterprisedb.com -- Best regards, Aleksander Alekseev
Re: Question about savepoint level?
On Mon, Oct 24, 2022 at 3:00 PM Japin Li wrote: > On Mon, 24 Oct 2022 at 12:19, Japin Li wrote: > > The TransactionStateData has savepointLevel field, however, I do not > understand > > what is savepoint level, it seems all savepoints have the same > savepointLevel, > > I want to know how the savepoint level changes. > > I try to remove the savepointLevel, and it seems harmless. Any thoughts? ISTM the savepointLevel always remains the same as what is in TopTransactionStateData after looking at the codes. Now I also get confused. Maybe what we want is nestingLevel? Thanks Richard
Re: Crash after a call to pg_backup_start()
On 2022-Oct-24, Michael Paquier wrote: > On the contrary, it seems to me that putting the assertion within the > if() block makes the assertion weaker, because we would never check > for an incorrect state after do_pg_abort_backup() is registered (aka > any pg_backup_start() call) when not entering in this if() block. Reading it again, I agree with your conclusion, so I'll push as you proposed with some extra tests, after they complete running. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "La verdad no siempre es bonita, pero el hambre de ella sí"
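
For context, the SQL-level entry points involved (PostgreSQL 15 names); per the discussion above, the do_pg_abort_backup() callback is registered by the first call:

    SELECT pg_backup_start('mybackup', fast => true);
    -- ... copy the data directory ...
    SELECT * FROM pg_backup_stop();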
pg_dump: Refactor code that constructs ALTER ... OWNER TO commands
Avoid having to list all the possible object types twice. Instead, only _getObjectDescription() needs to know about specific object types. It communicates back to _printTocEntry() whether an owner is to be set. In passing, remove the logic to use ALTER TABLE to set the owner of views and sequences. This is no longer necessary. Furthermore, if pg_dump doesn't recognize the object type, this is now a fatal error, not a warning.From ef05e1dcecbdd51974c2dad0c552b314a7c00f4c Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Mon, 24 Oct 2022 11:46:37 +0200 Subject: [PATCH] pg_dump: Refactor code that constructs ALTER ... OWNER TO commands Avoid having to list all the possible object types twice. Instead, only _getObjectDescription() needs to know about specific object types. It communicates back to _printTocEntry() whether an owner is to be set. In passing, remove the logic to use ALTER TABLE to set the owner of views and sequences. This is no longer necessary. Furthermore, if pg_dump doesn't recognize the object type, this is now a fatal error, not a warning. --- src/bin/pg_dump/pg_backup_archiver.c | 130 ++- 1 file changed, 47 insertions(+), 83 deletions(-) diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c index 233198afc0c9..f39c0fa36fdc 100644 --- a/src/bin/pg_dump/pg_backup_archiver.c +++ b/src/bin/pg_dump/pg_backup_archiver.c @@ -72,7 +72,7 @@ typedef struct _parallelReadyList static ArchiveHandle *_allocAH(const char *FileSpec, const ArchiveFormat fmt, const int compression, bool dosync, ArchiveMode mode, SetupWorkerPtrType setupWorkerPtr); -static void _getObjectDescription(PQExpBuffer buf, TocEntry *te); +static void _getObjectDescription(PQExpBuffer buf, const TocEntry *te); static void _printTocEntry(ArchiveHandle *AH, TocEntry *te, bool isData); static char *sanitize_line(const char *str, bool want_hyphen); static void _doSetFixedOutputState(ArchiveHandle *AH); @@ -3398,27 +3398,27 @@ _selectTableAccessMethod(ArchiveHandle *AH, const char *tableam) * Extract an object description for a TOC entry, and append it to buf. * * This is used for ALTER ... OWNER TO. + * + * If the object type has no owner, do nothing. 
*/ static void -_getObjectDescription(PQExpBuffer buf, TocEntry *te) +_getObjectDescription(PQExpBuffer buf, const TocEntry *te) { const char *type = te->desc; - /* Use ALTER TABLE for views and sequences */ - if (strcmp(type, "VIEW") == 0 || strcmp(type, "SEQUENCE") == 0 || - strcmp(type, "MATERIALIZED VIEW") == 0) - type = "TABLE"; - /* objects that don't require special decoration */ if (strcmp(type, "COLLATION") == 0 || strcmp(type, "CONVERSION") == 0 || strcmp(type, "DOMAIN") == 0 || - strcmp(type, "TABLE") == 0 || - strcmp(type, "TYPE") == 0 || strcmp(type, "FOREIGN TABLE") == 0 || + strcmp(type, "MATERIALIZED VIEW") == 0 || + strcmp(type, "SEQUENCE") == 0 || + strcmp(type, "STATISTICS") == 0 || + strcmp(type, "TABLE") == 0 || strcmp(type, "TEXT SEARCH DICTIONARY") == 0 || strcmp(type, "TEXT SEARCH CONFIGURATION") == 0 || - strcmp(type, "STATISTICS") == 0 || + strcmp(type, "TYPE") == 0 || + strcmp(type, "VIEW") == 0 || /* non-schema-specified objects */ strcmp(type, "DATABASE") == 0 || strcmp(type, "PROCEDURAL LANGUAGE") == 0 || @@ -3427,33 +3427,28 @@ _getObjectDescription(PQExpBuffer buf, TocEntry *te) strcmp(type, "FOREIGN DATA WRAPPER") == 0 || strcmp(type, "SERVER") == 0 || strcmp(type, "PUBLICATION") == 0 || - strcmp(type, "SUBSCRIPTION") == 0 || - strcmp(type, "USER MAPPING") == 0) + strcmp(type, "SUBSCRIPTION") == 0) { appendPQExpBuffer(buf, "%s ", type); if (te->namespace && *te->namespace) appendPQExpBuffer(buf, "%s.", fmtId(te->namespace)); appendPQExpBufferStr(buf, fmtId(te->tag)); - return; } - /* BLOBs just have a name, but it's numeric so must not use fmtId */ - if (strcmp(type, "BLOB") == 0) + else if (strcmp(type, "BLOB") == 0) { appendPQExpBuffer(buf, "LARGE OBJECT %s", te->tag); - return; } - /* * These object types require additional decoration. Fortunately, the * information needed is exactly what's in the DROP command. */ - if (strcmp(type, "AGGREGATE") == 0 || - strcmp(type, "FUNCTION") == 0 || - strcmp(type, "OPERATO
Re: Reducing the chunk header sizes on all memory context types
On Thu, Oct 20, 2022 at 1:55 AM Andres Freund wrote: > > Hi, > > On 2022-10-11 10:21:17 +0700, John Naylor wrote: > > On Tue, Oct 11, 2022 at 5:31 AM David Rowley wrote: > > > > > > The proposed patches in [1] do aim to make additional usages of the > > > slab allocator, and I have a feeling that we'll want to fix the > > > performance of slab.c before those. Perhaps the Asserts are a better > > > option if we're to get the proposed radix tree implementation. > > > > Going by [1], that use case is not actually a natural fit for slab because > > of memory fragmentation. > > > > [1] > > https://www.postgresql.org/message-id/20220704220038.at2ane5xkymzzssb%40awork3.anarazel.de > > Not so sure about that - IIRC I made one slab for each different size class, > which seemed to work well and suit slab well? If that's the case, then great! The linked message didn't give me that impression, but I won't worry about it. -- John Naylor EDB: http://www.enterprisedb.com
Re: Question about savepoint level?
On 2022-Oct-24, Richard Guo wrote: > On Mon, Oct 24, 2022 at 3:00 PM Japin Li wrote: > > > I try to remove the savepointLevel, and it seems harmless. Any thoughts? > > ISTM the savepointLevel always remains the same as what is in > TopTransactionStateData after looking at the codes. Now I also get > confused. Maybe what we want is nestingLevel? This has already been discussed: https://postgr.es/m/1317297307-sup-7...@alvh.no-ip.org Now that we have transaction-controlling procedures, I think the next step is to add the SQL-standard feature that allows savepoint level control for them, which would make the savepointLevel no longer dead code. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "You're _really_ hosed if the person doing the hiring doesn't understand relational systems: you end up with a whole raft of programmers, none of whom has had a Date with the clue stick." (Andrew Sullivan)
Re: Patch proposal: make use of regular expressions for the username in pg_hba.conf
Hi, On 10/24/22 5:34 AM, Michael Paquier wrote: On Fri, Oct 21, 2022 at 02:10:37PM +0200, Drouvot, Bertrand wrote: On 10/21/22 2:58 AM, Michael Paquier wrote: I have spent a couple of hours doing a pass over v2, playing manually with regex patterns, reloads, the system views and item lists. The logic was fine, but I have adjusted a few things related to the comments and the documentation (particularly with the examples, removing one example and updating one with a regex that has a comma, needing double quotes). The CI and all my machines were green, and the test coverage looked sufficient. So, applied. Thanks! Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
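
For readers who have not followed the thread: the committed feature lets the user name field in pg_hba.conf be a regular expression when it starts with a slash, and an entry containing a comma must be double-quoted so it is not parsed as a list. An illustrative (not committed-documentation) entry:

    # pg_hba.conf
    host    all    "/^user_[0-9]{2,4}$"    192.168.0.0/24    scram-sha-256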
Re: Question about savepoint level?
On Mon, 24 Oct 2022 at 17:56, Alvaro Herrera wrote:
> This has already been discussed:
> https://postgr.es/m/1317297307-sup-7...@alvh.no-ip.org

Sorry for my lazy search.

> Now that we have transaction-controlling procedures, I think the next
> step is to add the SQL-standard feature that allows savepoint level
> control for them, which would make the savepointLevel no longer dead
> code.

So the savepoint level is used for the CREATE PROCEDURE ... OLD/NEW SAVEPOINT LEVEL syntax [1], right?

[1] https://www.ibm.com/docs/en/db2/10.1.0?topic=statements-create-procedure-sql

--
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
Re: Some comments that should've covered MERGE
On 2022-Oct-19, Richard Guo wrote: > Hi hackers, > > I happened to notice $subject. Attach a trivial patch for that. Thanks, applied. I did change the comment atop setTargetTable, which I thought could use a little bit more detail on what is happening, and also in its callsite in transformMergeStmt. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "La experiencia nos dice que el hombre peló millones de veces las patatas, pero era forzoso admitir la posibilidad de que en un caso entre millones, las patatas pelarían al hombre" (Ijon Tichy)
Re: Question about savepoint level?
On 2022-Oct-24, Japin Li wrote: > On Mon, 24 Oct 2022 at 17:56, Alvaro Herrera wrote: > > Now that we have transaction-controlling procedures, I think the next > > step is to add the SQL-standard feature that allows savepoint level > > control for them, which would make the savepointLevel no longer dead > > code. > > So the savepoint level is used for CREATE PROCEDURE ... OLD/NEW SAVEPOINT > LEVEL > syntax [1], right? > > [1] > https://www.ibm.com/docs/en/db2/10.1.0?topic=statements-create-procedure-sql Yeah, that's what I understand. The default behavior is the current behavior (OLD SAVEPOINT LEVEL). In a procedure that specifies NEW SAVEPOINT LEVEL trying to rollback a savepoint that was defined before the procedure was called is an error, which sounds a useful protection. -- Álvaro HerreraBreisgau, Deutschland — https://www.EnterpriseDB.com/ "El sentido de las cosas no viene de las cosas, sino de las inteligencias que las aplican a sus problemas diarios en busca del progreso." (Ernesto Hernández-Novich)
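
To illustrate the feature being suggested, a sketch in hypothetical syntax modelled on the SQL-standard/Db2 clause referenced above; none of this exists in PostgreSQL today:

    CREATE PROCEDURE safe_proc()
    LANGUAGE plpgsql
    NEW SAVEPOINT LEVEL               -- hypothetical clause, not implemented
    AS $$
    BEGIN
      -- ROLLBACK TO SAVEPOINT outer_sp;   -- would be rejected here, because
      --                                   -- outer_sp was defined before the call
      NULL;
    END;
    $$;

With OLD SAVEPOINT LEVEL (today's behaviour) the rollback to a pre-existing savepoint would be allowed.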
Re: Mingw task for Cirrus CI
Hi, > > + > > + on_failure: > > +<<: *on_failure_meson > > +cores_script: | > > + %BASH_EXE% -lc "cd build src/tools/ci/cores_backtrace.sh msys > build/tmp_install" > > + > > This is wrong - it should just archive the same files that the current > windows > task does. > Changed it with the on_failure from the other windows task. > Other than that, I think this is basically ready? > If you say so, then I think it's ready. Best, Melih v10-0001-Added-Windows-with-MinGW-environment-in-Cirrus-CI.patch Description: Binary data
Re: Pluggable toaster
Hi!

>I don't argue with most of what you say. I am just pointing out the
>reason why the chosen approach "N TOASTers x M TableAMs" will not
>work:

We assume that the TAM used in a custom Toaster works as it should work, and leave TAM internals to that TAM's developer - say, we do not want to change the internals of Heap AM.

We don't want to create some kind of silver bullet. There are already existing and widely-known (from production environments) problems with TOAST mechanics, and we suggest a not-too-complex way to solve them. As I mentioned before, Pluggable TOAST does not change Heap AM, and it is not intended to change TAMs.

>This is what I meant above when talking about the framework for
>simplifying this task:

That's a kind of generalized custom TOAST implementation. It is a very good intention, but keep in mind that different kinds of data require very different approaches to external storage - say, JSON TOAST works with maps of keys and values, while the super binary object (experimental name) does not work with the internals of TOASTed data except for searching. But we thought about that too, and reusable code resides in the toast_internals.c source - any custom Toaster working with Heap could use its insert, update and fetch methods, but deal with the data on its own.

Even with the general framework there must be a common interface which would be the entry point for those custom methods developed with the framework. That's what the TOAST API is - just an interface that all custom TOAST implementations use to have a common entry point from any TAM, with syntax support to plug in custom TOAST implementations from SQL. No less, but no more. Moreover, our patches show that even the Generic (default) TOAST implementation could still be left as-is, without the necessity to route it via our API, though that is logically wrong because a common API is meant to be common for all TOAST implementations without exceptions.

Have I answered your question? Please don't hesitate to point to any unclear parts, I'd be glad to explain. The main idea in the TOAST API is very elegant and light, and it is designed alike to Pluggable Storage (the Table AM API).

On Mon, Oct 24, 2022 at 12:10 PM Aleksander Alekseev <aleksan...@timescale.com> wrote:
> Hi Nikita,
>
> I don't argue with most of what you say. I am just pointing out the
> reason why the chosen approach "N TOASTers x M TableAMs" will not
> work:
>
> > Don't you think that this is an arguable design decision? Basically
> > all we know about the underlying TableAM is that it stores tuples
> > _somehow_ and that tuples have TIDs [1]. That's it. We don't know if
> > it even has any sort of pages, whether they are fixed in size or not,
> > whether it uses shared buffers, etc. It may not even require TOAST.
> > [...]
>
> Also I completely agree with:
>
> > Implementing another Table AM just to implement another TOAST strategy seems too
> > much, the TAM API is very heavy and complex, and you would have to add it as a contrib.
>
> This is what I meant above when talking about the framework for
> simplifying this task:
>
> > It looks like the idea should be actually turned inside out. I.e. what
> > would be nice to have is some sort of _framework_ that helps TableAM
> > authors to implement TOAST (alternatively, the rest of the TableAM
> > except for TOAST) if the TableAM is similar to the default one.
>
> From the user perspective it's much easier to think about one entity -
> TableAM, and choosing from heapam_with_default_toast and
> heapam_with_different_toast.
>
> From the extension implementer point of view creating TableAMs is a
> difficult task. This is what the framework should address. Ideally the
> interface should be as simple as:
>
> CreateParametrizedDefaultHeapAM(SomeTOASTSubstitutionObject, ...other
> arguments, in the future...)
>
> Where the extension author should be worried only about an alternative
> TOAST implementation.
>
> I think at some point such a framework may address at least one more
> issue we have - an inability to change the page size on the table
> level. As it was shown by Tomas Vondra [1] the default 8 KB page size
> can be suboptimal depending on the load. So it would be nice if the
> user could change it without rebuilding PostgreSQL. Naturally this is
> out of scope of this particular patchset. I just wanted to point out
> opportunities we have here.
>
> [1]:
> https://www.postgresql.org/message-id/flat/b4861449-6c54-ccf8-e67c-c039228cdc6d%40enterprisedb.com
>
> --
> Best regards,
> Aleksander Alekseev

--
Regards,
Nikita Malakhov
Postgres Professional
https://postgrespro.ru/
Re: Testing DDL Deparser
On 2022-Oct-20, Runqi Tian wrote: > My question regarding subcommand is actually on commands other than > ALTER TABLE. Let me give an example (You can also find this example in > the patch), when command like > > CREATE SCHEMA element_test CREATE TABLE foo (id int) > > is caught by ddl_command_end trigger, function > pg_event_trigger_ddl_commands() will return 2 records which I called > as subcommands in the previous email. Ah, right. I don't remember why we made these commands be separate; but for instance if you try to add a SERIAL column you'll also see one command to create the sequence, then the table, then the sequence gets its OWNED BY the column. I think the point is that we want to have some regularity so that an application can inspect the JSON blobs and perhaps modify them; if you make a bunch of sub-objects, this becomes more difficult. For DDL replication purposes perhaps this isn't very useful (since you just grab it and execute on the other side as-is), but other use cases might have other ideas. > Is this behavior expected? I thought the deparser is designed to > deparse the entire command to a single command instead of dividing > them into 2 commands. It is expected. > It seems that keeping separate test cases in deparser tests folder is > better than using the test cases in core regression tests folder > directly. I will write some test cases in the new deparser test > folder. Well, the reason to use the regular regression tests rather than separate, is that when a developer adds a new feature, they will add test cases for it in regular regression tests, so deparsing of their command will be automatically picked up by the DDL-deparse testing framework. We discussed at the time that another option would be to have patch reviewers ensure that the added DDL commands are also tested in the DDL-deparse framework, but nobody wants to add yet another thing that we have to remember (or, more likely, forget). > I see, it seems that a function to deparse DROP command to JSON output > is needed but not provided yet. I implemented a function > deparse_drop_ddl() in the testing harness, maybe we could consider > exposing a function in engine to deparse DROP command as > deparse_ddl_command() does. No objection against this idea. > updated to: > > 1, The deparsed JSON is the same as the expected string I would rather say "has the same effects as". > 1, Build DDL event triggers and DDL deparser into pg_regress tests so > that DDL commands in these tests can be captured and deparsed. > 2, Let the goal 3 implementation, aka the TAP test to execute test > cases from pg_regress, if sub and pub node don’t dump the same > results, some DDL commands must be changed. > > Solution 1 is more lighter weight as we only need to run pg_regress > once. Any other thoughts? We have several runs of pg_regress already -- apart from the normal one, we run it once in recovery/t/027_stream_regress.pl and once in the pg_upgrade test. I'm not sure that we necessarily need to avoid another one here, particularly if avoiding it would potentially pollute the results for the regular tests. I am okay with solution 2 personally. If we really wanted to optimize this, perhaps we should try to drive all three uses (pg_upgrade, stream_regress, this new test) from a single pg_regress run. But ISTM we don't *have* to. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "Hay dos momentos en la vida de un hombre en los que no debería especular: cuando puede permitírselo y cuando no puede" (Mark Twain)
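
For anyone wanting to reproduce the observation above, a minimal event-trigger setup (standard PostgreSQL, independent of the deparser patch) shows the two records returned for the compound CREATE SCHEMA command; the trigger and function names are just examples:

    CREATE OR REPLACE FUNCTION dump_ddl_commands() RETURNS event_trigger
    LANGUAGE plpgsql AS $$
    DECLARE
      r record;
    BEGIN
      -- one row is returned per captured command
      FOR r IN SELECT * FROM pg_event_trigger_ddl_commands()
      LOOP
        RAISE NOTICE 'tag: %, object: %', r.command_tag, r.object_identity;
      END LOOP;
    END;
    $$;

    CREATE EVENT TRIGGER dump_ddl ON ddl_command_end
      EXECUTE FUNCTION dump_ddl_commands();

    CREATE SCHEMA element_test CREATE TABLE foo (id int);
    -- expected (approximately): one NOTICE for CREATE SCHEMA and one for CREATE TABLE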
Re: Perform streaming logical transactions by background workers and parallel apply
On Fri, Oct 21, 2022 at 3:02 PM houzj.f...@fujitsu.com wrote: > Few comments on the 0001 and 0003 patches: v40-0001* == 1. + /* + * The queue used to transfer messages from the parallel apply worker to + * the leader apply worker. + */ + shm_mq_handle *error_mq_handle; Shall we say error messages instead of messages? 2. +/* + * Is there a message pending in parallel apply worker which we need to + * receive? + */ +volatile sig_atomic_t ParallelApplyMessagePending = false; Can we slightly change above comment to: "Is there a message sent by parallel apply worker which we need to receive?" 3. + + ThrowErrorData(&edata); + + /* Should not reach here after rethrowing an error. */ + error_context_stack = save_error_context_stack; Should we instead do Assert(false) after ThrowErrorData? 4. + * apply worker (c) necessary information to be shared among parallel apply + * workers and leader apply worker (i.e. in_parallel_apply_xact flag and the + * corresponding LogicalRepWorker slot information). I don't think here the comment needs to exactly say which variables are shared. necessary information to synchronize between parallel apply workers and leader apply worker. 5. + * The dynamic shared memory segment will contain (a) a shm_mq that can be + * used to send changes in the transaction from leader apply worker to parallel + * apply worker (b) another shm_mq that can be used to send errors In both (a) and (b), instead of "can be", we can use "is". 6. Note that we cannot skip the streaming transactions when using + * parallel apply workers because we cannot get the finish LSN before + * applying the changes. This comment is unclear about the action of parallel apply worker when finish LSN is set. We can add something like: "So, we don't start parallel apply worker when finish LSN is set by the user." v40-0003 == 7. The function RelationGetUniqueKeyBitmap() should be defined in relcache.c next to RelationGetIdentityKeyBitmap(). 8. +RelationGetUniqueKeyBitmap(Relation rel) { ... + if (!rel->rd_rel->relhasindex) + return NULL; It would be better to use "if (!RelationGetForm(relation)->relhasindex)" so as to be consistent with similar usage in RelationGetUniqueKeyBitmap. 9. In RelationGetUniqueKeyBitmap(), we must assert here that the historic snapshot is set as we are not taking a lock on index rels. The same is already ensured in RelationGetIdentityKeyBitmap(), is there a reason to be different here? -- With Regards, Amit Kapila.
Re: Perform streaming logical transactions by background workers and parallel apply
On Wed, Oct 12, 2022 at 3:04 PM Amit Kapila wrote: > > On Tue, Oct 11, 2022 at 5:52 AM Masahiko Sawada wrote: > > > > On Fri, Oct 7, 2022 at 2:00 PM Amit Kapila wrote: > > > > > > About your point that having different partition structures for > > > publisher and subscriber, I don't know how common it will be once we > > > have DDL replication. Also, the default value of > > > publish_via_partition_root is false which doesn't seem to indicate > > > that this is a quite common case. > > > > So how can we consider these concurrent issues that could happen only > > when streaming = 'parallel'? Can we restrict some use cases to avoid > > the problem or can we have a safeguard against these conflicts? > > > > Yeah, right now the strategy is to disallow parallel apply for such > cases as you can see in *0003* patch. Tightening the restrictions could work in some cases but there might still be coner cases and it could reduce the usability. I'm not really sure that we can ensure such a deadlock won't happen with the current restrictions. I think we need something safeguard just in case. For example, if the leader apply worker is waiting for a lock acquired by its parallel worker, it cancels the parallel worker's transaction, commits its transaction, and restarts logical replication. Or the leader can log the deadlock to let the user know. > > > We > > could find a new problematic scenario in the future and if it happens, > > logical replication gets stuck, it cannot be resolved only by apply > > workers themselves. > > > > I think users can change streaming option to on/off and internally the > parallel apply worker can detect and restart to allow replication to > proceed. Having said that, I think that would be a bug in the code and > we should try to fix it. We may need to disable parallel apply in the > problematic case. > > The other ideas that occurred to me in this regard are (a) provide a > reloption (say parallel_apply) at table level and we can use that to > bypass various checks like different Unique Key between > publisher/subscriber, constraints/expressions having mutable > functions, Foreign Key (when enabled on subscriber), operations on > Partitioned Table. We can't detect whether those are safe or not > (primarily because of a different structure in publisher and > subscriber) so we prohibit parallel apply but if users use this > option, we can allow it even in those cases. The parallel apply worker is assigned per transaction, right? If so, how can we know which tables are modified in the transaction in advance? and what if two tables whose reloptions are true and false are modified in the same transaction? > (b) While enabling the > parallel option in the subscription, we can try to match all the > table(s) information of the publisher/subscriber. It will be tricky to > make this work because say even if match some trigger function name, > we won't be able to match the function body. The other thing is when > at a later point the table definition is changed on the subscriber, we > need to again validate the information between publisher and > subscriber which I think would be difficult as we would be already in > between processing some message and getting information from the > publisher at that stage won't be possible. Indeed. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: Pluggable toaster
Hi Nikita, > >I don't argue with most of what you say. I am just pointing out the > >reason why the chosen approach "N TOASTers x M TableAMs" will not > >work: > > We assume that TAM used in custom Toaster works as it is should work, > and leave TAM internals to this TAM developer - say, we do not want to > change internals of Heap AM. > > We don't want to create some kind of silver bullet. This is exactly the point. In order to not to create a silver bullet, TOASTers should be limited to a single TableAM. The one we know uses pages of a known fixed size, the one that actually requires TOAST because pages are relatively small, etc. > That's what the TOAST API is - just an interface that all custom > TOAST implementations use to have a common entry point from any TAM, > [...] I believe this is the source of misunderstanding. Note that not _any_ TableAM needs TOAST to begin with. As an example, if one chooses to implement a column-organized TableAM that stores all text/bytea attributes in a separate dictionary file this person doesn't need TOAST and doesn't want to be constrained by the need of choosing one. For this reason the "N TOASTers x M TableAMs" approach is architecturally broken. > keep in mind that different kinds of data require very > different approach to external storage - say, JSON TOAST works with > maps of keys and values, [...] To clarify: is an ability to specify TOASTers for given columns and/or types also part of the plan? > Have I answered your question? Please don't hesitate to point to any unclear > parts, I'd be glad to explain that. No. To be honest, it looks like you are merely discarding most/any feedback the community provided so far. I really think that pluggable TOASTers would be a great feature. However if the goal is to get it into the core I doubt that we are going to make much progress with the current approach. -- Best regards, Aleksander Alekseev
Re: Perform streaming logical transactions by background workers and parallel apply
On Mon, Oct 24, 2022 at 11:41 AM Peter Smith wrote: > > Here are my review comments for v40-0001. > > == > > src/backend/replication/logical/worker.c > > > 1. should_apply_changes_for_rel > > + else if (am_parallel_apply_worker()) > + { > + if (rel->state != SUBREL_STATE_READY) > + ereport(ERROR, > + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > + errmsg("logical replication parallel apply worker for subscription > \"%s\" will stop", > + MySubscription->name), > + errdetail("Cannot handle streamed replication transaction using parallel " > +"apply workers until all tables are synchronized."))); > > 1a. > "transaction" -> "transactions" > > 1b. > "are synchronized" -> "have been synchronized." > > e.g. "Cannot handle streamed replication transactions using parallel > apply workers until all tables have been synchronized." > > ~~~ > > 2. maybe_reread_subscription > > + if (am_parallel_apply_worker()) > + ereport(LOG, > + (errmsg("logical replication parallel apply worker for subscription > \"%s\" will " > + "stop because the subscription was removed", > + MySubscription->name))); > + else > + ereport(LOG, > + (errmsg("logical replication apply worker for subscription \"%s\" will " > + "stop because the subscription was removed", > + MySubscription->name))); > > Maybe there is an easier way to code this instead of if/else and > cut/paste message text: > > SUGGESTION > > ereport(LOG, > (errmsg("logical replication %s for subscription \"%s\" will stop > because the subscription was removed", > am_parallel_apply_worker() ? "parallel apply worker" : "apply worker", > MySubscription->name))); > ~~~ > If we want to go this way then it may be better to record the appropriate string beforehand and use that here. -- With Regards, Amit Kapila.
Re: PATCH: Using BRIN indexes for sorted output
On 10/24/22 06:32, Justin Pryzby wrote:
> On Sat, Oct 15, 2022 at 02:33:50PM +0200, Tomas Vondra wrote:
>> Of course, if there are e.g. BTREE indexes this is going to be slower,
>> but people are unlikely to have both index types on the same column.
>
> On Sun, Oct 16, 2022 at 05:48:31PM +0200, Tomas Vondra wrote:
>> I don't think it's all that unfair. How likely is it to have both a BRIN
>> and btree index on the same column? And even if you do have such indexes
>
> Note that we (at my work) use unique, btree indexes on multiple columns
> for INSERT ON CONFLICT into the most-recent tables: UNIQUE(a,b,c,...),
> plus a separate set of indexes on all tables, used for searching:
> BRIN(a) and BTREE(b). I'd hope that the costing is accurate enough to
> prefer the btree index for searching the most-recent table, if that's
> what's faster (for example, if columns b and c are specified).

Well, the costing is very crude - at the moment it's pretty much just a copy of the existing BRIN costing. And the cost is likely going to increase, because brinsort needs to do a regular BRIN bitmap scan (more or less) and then also a sort (which is an extra cost, of course). So if it works now, I don't see why brinsort would break it. Moreover, if you don't have ORDER BY in the query, I don't see why we would create a brinsort at all.

But if you could test this once the costing gets improved, that'd be very valuable.

>> +    /* There must not be any TID scan in progress yet. */
>> +    Assert(node->ss.ss_currentScanDesc == NULL);
>> +
>> +    /* Initialize the TID range scan, for the provided block range. */
>> +    if (node->ss.ss_currentScanDesc == NULL)
>> +    {
>
> Why is this conditional on the condition that was just Assert()ed ?

Yeah, that's a mistake, due to how the code evolved.

>> +void
>> +cost_brinsort(BrinSortPath *path, PlannerInfo *root, double loop_count,
>> +              bool partial_path)
>
> It'd be nice to refactor existing code to avoid this part being so
> duplicative.
>
>> + * In some situations (particularly with OR'd index conditions) we may
>> + * have scan_clauses that are not equal to, but are logically implied by,
>> + * the index quals; so we also try a predicate_implied_by() check to see
>
> Isn't that somewhat expensive ?
>
> If that's known, then it'd be good to say that in the documentation.

Some of this is probably a residue from create_indexscan_path and may not be needed for this new node.

>> +{
>> +    {"enable_brinsort", PGC_USERSET, QUERY_TUNING_METHOD,
>> +        gettext_noop("Enables the planner's use of BRIN sort plans."),
>> +        NULL,
>> +        GUC_EXPLAIN
>> +    },
>> +    &enable_brinsort,
>> +    false,
>
> I think new GUCs should be enabled during patch development.
> Maybe in a separate 0002 patch "for CI only not for commit".
> That way "make check" at least has a chance to hit that new code paths.
>
> Also, note that indxpath.c had the var initialized to true.

Good point.

>> +    attno = (i + 1);
>> +    nranges = (nblocks / pagesPerRange);
>> +    node->bs_phase = (nullsFirst) ? BRINSORT_LOAD_NULLS : BRINSORT_LOAD_RANGE;
>
> I'm curious why you have parenthesis these places ?

Not sure, it seemed more readable when writing the code I guess.

>> +#ifndef NODEBrinSort_H
>> +#define NODEBrinSort_H
>
> NODEBRIN_SORT would be more consistent with NODEINCREMENTALSORT.
> But I'd prefer NODE_* - otherwise it looks like NO DEBRIN.

Yeah, stupid search/replace on the indexscan code, which was used as a starting point.

> This needed a bunch of work to pass any of the regression tests -
> even with the feature set to off.
>
> . meson.build needs the same change as the corresponding ./Makefile.
> . guc missing from postgresql.conf.sample
> . brin_validate.c is missing support for the opr function.
>   I gather you're planning on changing this part (?) but this allows to
>   pass tests for now.
> . mingw is warning about OidIsValid(pointer) in nodeBrinSort.c.
>   https://cirrus-ci.com/task/5771227447951360?logs=mingw_cross_warning#L969
> . Uninitialized catalog attribute.
> . Some typos in your other patches: "heuristics heuristics". ste.
>   lest (least).

Thanks, I'll get this fixed. I've posted the patch as a PoC to showcase it and gather some feedback; I should have mentioned it's incomplete in these ways.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Issue in GIN fast-insert: XLogBeginInsert + Read/LockBuffer ordering
On 2022-Oct-13, Michael Paquier wrote: > On Wed, Oct 12, 2022 at 08:54:34PM -0400, Tom Lane wrote: > > Don't we need to back-patch these fixes? > > I guess I should, though I have finished by not doing due to the > unlikeliness of the problem, where we would need the combination of a > page eviction with an error in the critical section to force a PANIC, > or a crash before the WAL gets inserted. Other opinions? I suppose it's a matter of whether any bugs are possible outside of Neon. If yes, then definitely this should be backpatched. Offhand, I don't see any. On the other hand, even if no bugs are known, then it's still valuable to have all code paths do WAL insertion in the same way, rather than having a single place that is alone in doing it in a different way. But if we don't know of any bugs, then backpatching might be more risk than not doing so. I confess I don't understand why is it important that XLogBeginInsert is called inside the critical section. It seems to me that that part is only a side-effect of having to acquire the buffer locks in the critical section. Right? I noticed that line 427 logs the GIN metapage with flag REGBUF_STANDARD; is the GIN metapage really honoring pd_upper? I see only pg_lower being set. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "La grandeza es una experiencia transitoria. Nunca es consistente. Depende en gran parte de la imaginación humana creadora de mitos" (Irulan)
[patch] Have psql's \d+ indicate foreign partitions
Hi Recently I have been working a lot with partitioned tables which contain a mix of local and foreign partitions, and find it would be very useful to be able to easily obtain an overview of which partitions are foreign and where they are located. Currently, executing "\d+" on a partitioned table lists the partitions like this: postgres=# \d+ parttest Partitioned table "public.parttest" Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description +-+---+--+-+--+-+--+- id | integer | | not null | | plain| | | val1 | text| | | | extended | | | val2 | text| | | | extended | | | Partition key: HASH (id) Partitions: parttest_10_0 FOR VALUES WITH (modulus 10, remainder 0), parttest_10_1 FOR VALUES WITH (modulus 10, remainder 1), parttest_10_2 FOR VALUES WITH (modulus 10, remainder 2), parttest_10_3 FOR VALUES WITH (modulus 10, remainder 3), parttest_10_4 FOR VALUES WITH (modulus 10, remainder 4), parttest_10_5 FOR VALUES WITH (modulus 10, remainder 5), parttest_10_6 FOR VALUES WITH (modulus 10, remainder 6), parttest_10_7 FOR VALUES WITH (modulus 10, remainder 7), parttest_10_8 FOR VALUES WITH (modulus 10, remainder 8), parttest_10_9 FOR VALUES WITH (modulus 10, remainder 9) which doesn't help much in that respect. Attached patch changes this output to: postgres=# \d+ parttest Partitioned table "public.parttest" Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description +-+---+--+-+--+-+--+- id | integer | | not null | | plain| | | val1 | text| | | | extended | | | val2 | text| | | | extended | | | Partition key: HASH (id) Partitions: parttest_10_0 FOR VALUES WITH (modulus 10, remainder 0), parttest_10_1 FOR VALUES WITH (modulus 10, remainder 1), server: "fdw_node2", parttest_10_2 FOR VALUES WITH (modulus 10, remainder 2), parttest_10_3 FOR VALUES WITH (modulus 10, remainder 3), server: "fdw_node2", parttest_10_4 FOR VALUES WITH (modulus 10, remainder 4), parttest_10_5 FOR VALUES WITH (modulus 10, remainder 5), server: "fdw_node2", parttest_10_6 FOR VALUES WITH (modulus 10, remainder 6), parttest_10_7 FOR VALUES WITH (modulus 10, remainder 7), server: "fdw_node2", parttest_10_8 FOR VALUES WITH (modulus 10, remainder 8), parttest_10_9 FOR VALUES WITH (modulus 10, remainder 9), server: "fdw_node2" which is much more informative, albeit a little more cluttered, but short of using emojis I can't see any better way (suggestions welcome). For completeness, output with child tables could look like this: postgres=# \d+ inhtest Table "public.inhtest" Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description +-+---+--+-+--+-+--+- id | integer | | not null | | plain| | | val1 | text| | | | extended | | | val2 | text| | | | extended | | | Child tables: inhtest_10_0, inhtest_10_1 (server: "fdw_node2"), inhtest_10_2, inhtest_10_3 (server: "fdw_node2"), inhtest_10_4, inhtest_10_5 (server: "fdw_node2"), inhtest_10_6, inhtest_10_7 (server: "fdw_node2"), inhtest_10_8, inhtest_10_9 (server: "fdw_node2") Access method: heap Will add to next CF. Regards Ian Barwick commit d5f5de96381b93a6ea1066d4abb4c6617e0af758 Author: Ian Barwick Date: Thu Oct 20 12:45:28 2022 +0900 psql: in \d+, indicate foreign partitions Currently with a partitioned table, \d+ lists the partitions and their partition key, but it would be useful to see which ones, if any, are foreign partitions. A simple way of doing this is, for foreign partitions, to display the name of the partition's foreign se
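
For anyone wanting to try the patch, a rough setup that produces the kind of mixed local/foreign partitioned table shown above (it assumes postgres_fdw is installed and a foreign server named fdw_node2 with a suitable user mapping already exists):

    CREATE TABLE parttest (id int NOT NULL, val1 text, val2 text)
      PARTITION BY HASH (id);

    -- a local partition
    CREATE TABLE parttest_10_0
      PARTITION OF parttest FOR VALUES WITH (modulus 10, remainder 0);

    -- a foreign partition living on the remote server
    CREATE FOREIGN TABLE parttest_10_1
      PARTITION OF parttest FOR VALUES WITH (modulus 10, remainder 1)
      SERVER fdw_node2;

    \d+ parttest

With the patch, the foreign partition's entry in the partition list is annotated with its server name, as in the sample output above.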
Re: Pluggable toaster
Hi,

>This is exactly the point. In order to not to create a silver bullet,
>TOASTers should be limited to a single TableAM. The one we know uses
>pages of a known fixed size, the one that actually requires TOAST
>because pages are relatively small, etc.

Currently all our TOAST implementations use Heap AM, except ones that use external (truly external, i.e. files outside DB) storage. Using Table AM Routine and routing AM methods calls via it is a topic for further discussion, if Pluggable TOAST will be committed. And even then it would be an open issue.

>I believe this is the source of misunderstanding. Note that not _any_
>TableAM needs TOAST to begin with. As an example, if one chooses to
>implement a column-organized TableAM that stores all text/bytea
>attributes in a separate dictionary file this person doesn't need
>TOAST and doesn't want to be constrained by the need of choosing one.
>For this reason the "N TOASTers x M TableAMs" approach is
>architecturally broken.

TOAST implementation is not necessary for Table AM. And the TOAST API is just an optional open interface - SET TOASTER is an option of the CREATE/ALTER TABLE command. In the previous discussion we never proposed an "N TOASTers x M TableAMs" approach.

>To clarify: is an ability to specify TOASTers for given columns and/or
>types also part of the plan?

For table columns it is already supported by the syntax part of the TOAST API. For data types we reserved the validation part of the API, but this support is still a subject for discussion, although we think it would be very handy for DB users: we could issue something like

CREATE TYPE ... TOASTER=jsonb_toaster ... ;
or
ALTER TYPE JSONB SET TOASTER jsonb_toaster;

and not have to set a special toaster for a jsonb column each time we create or alter a table with it.

>No. To be honest, it looks like you are merely discarding most/any
>feedback the community provided so far.

Very sorry to read that. Almost all of the requests in this discussion have been taken into account in the patches, and the most serious one - the pg_attribute expansion mentioned by Tom Lane and Robert Haas - is being fixed right now and will be ready very soon.

>I really think that pluggable TOASTers would be a great feature.
>However if the goal is to get it into the core I doubt that we are
>going to make much progress with the current approach.

We hope we will. This feature is in high demand among end users, and demand will only grow over time - the current TOAST limitations and their impact on DBMS performance are a serious drawback in comparison to competitors.

On Mon, Oct 24, 2022 at 2:55 PM Aleksander Alekseev <aleksan...@timescale.com> wrote:
> Hi Nikita,
>
> > >I don't argue with most of what you say. I am just pointing out the
> > >reason why the chosen approach "N TOASTers x M TableAMs" will not
> > >work:
> >
> > We assume that TAM used in custom Toaster works as it is should work,
> > and leave TAM internals to this TAM developer - say, we do not want to
> > change internals of Heap AM.
> >
> > We don't want to create some kind of silver bullet.
>
> This is exactly the point. In order to not to create a silver bullet,
> TOASTers should be limited to a single TableAM. The one we know uses
> pages of a known fixed size, the one that actually requires TOAST
> because pages are relatively small, etc.
>
> > That's what the TOAST API is - just an interface that all custom
> > TOAST implementations use to have a common entry point from any TAM,
> > [...]
>
> I believe this is the source of misunderstanding.
Note that not _any_ > TableAM needs TOAST to begin with. As an example, if one chooses to > implement a column-organized TableAM that stores all text/bytea > attributes in a separate dictionary file this person doesn't need > TOAST and doesn't want to be constrained by the need of choosing one. > > For this reason the "N TOASTers x M TableAMs" approach is > architecturally broken. > > > keep in mind that different kinds of data require very > > different approach to external storage - say, JSON TOAST works with > > maps of keys and values, [...] > > To clarify: is an ability to specify TOASTers for given columns and/or > types also part of the plan? > > > Have I answered your question? Please don't hesitate to point to any > unclear > > parts, I'd be glad to explain that. > > No. To be honest, it looks like you are merely discarding most/any > feedback the community provided so far. > > I really think that pluggable TOASTers would be a great feature. > However if the goal is to get it into the core I doubt that we are > going to make much progress with the current approach. > > -- > Best regards, > Aleksander Alekseev > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
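For context, the column-level syntax mentioned above is only a sketch of what the proposed TOAST API patchset might accept; the exact grammar is defined by the patches under review, not by any released PostgreSQL version, and the toaster and table names here are illustrative.

-- Hypothetical, based on the proposal above: attach a custom toaster
-- to a single column, or to a whole data type.
ALTER TABLE docs ALTER COLUMN payload SET TOASTER jsonb_toaster;
ALTER TYPE jsonb SET TOASTER jsonb_toaster;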
Re: [patch] Have psql's \d+ indicate foreign partitions
On Mon, Oct 24, 2022 at 09:44:18PM +0900, Ian Lawrence Barwick wrote:
> Recently I have been working a lot with partitioned tables which contain a mix
> of local and foreign partitions, and find it would be very useful to be able to
> easily obtain an overview of which partitions are foreign and where they are
> located.

> Partitions: parttest_10_0 FOR VALUES WITH (modulus 10, remainder 0),
>             parttest_10_1 FOR VALUES WITH (modulus 10, remainder 1), server: "fdw_node2",

> which is much more informative, albeit a little more cluttered, but

> @@ -3445,6 +3451,10 @@ describeOneTableDetails(const char *schemaname,
> 			if (child_relkind == RELKIND_PARTITIONED_TABLE ||
> 				child_relkind == RELKIND_PARTITIONED_INDEX)
> 				appendPQExpBufferStr(&buf, ", PARTITIONED");
> +			else if (child_relkind == RELKIND_FOREIGN_TABLE && is_partitioned)
> +				appendPQExpBuffer(&buf, ", server: \"%s\"", PQgetvalue(result, i, 4));
> +			else if (child_relkind == RELKIND_FOREIGN_TABLE && !is_partitioned)
> +				appendPQExpBuffer(&buf, " (server: \"%s\")", PQgetvalue(result, i, 4));
> 			if (strcmp(PQgetvalue(result, i, 2), "t") == 0)
> 				appendPQExpBufferStr(&buf, " (DETACH PENDING)");
> 			if (i < tuples - 1)

To avoid the clutter that you mentioned, I suggest that this should show that the table *is* foreign, but without the server - if you want to know the server (or its options), you can run another \d command for that (or run a SQL query). That's similar to what's shown if the child is partitioned: a suffix like ", PARTITIONED", but without showing the partition strategy.

I had a patch to allow \d++, and maybe showing the foreign server would be reasonable for that. But the patch got closed, evidently for lack of interest.
Re: Pluggable toaster
Hi Nikita,

> Using Table AM Routine and routing AM methods calls via it is a topic for
> further discussion,
> if Pluggable TOAST will be committed. [...] And even then it would be an open
> issue.

From personal experience with the project I have serious doubts this is going to happen. Before such invasive changes are going to be accepted there should be a clear understanding of how exactly TOASTers are supposed to be used. This should be part of the documentation in the patchset. Additionally there should be an open-source or source-available extension that actually demonstrates the benefits of TOASTers with reproducible benchmarks (we didn't even get to that part yet).

> TOAST implementation is not necessary for Table AM.

What other use cases for TOAST do you have in mind?

>> > Have I answered your question? Please don't hesitate to point to any
>> > unclear
>> > parts, I'd be glad to explain that.
>>
>> No. To be honest, it looks like you are merely discarding most/any
>> feedback the community provided so far.
>>
>> I really think that pluggable TOASTers would be a great feature.
>> However if the goal is to get it into the core I doubt that we are
>> going to make much progress with the current approach.

To clarify, the concern about "N TOASTers vs M TableAM" was expressed by Robert Haas back in Jan 2022:

> I agree ... but I'm also worried about what happens when we have
> multiple table AMs. One can imagine a new table AM that is
> specifically optimized for TOAST which can be used with an existing
> heap table. One can imagine a new table AM for the main table that
> wants to use something different for TOAST. So, I don't think it's
> right to imagine that the choice of TOASTer depends solely on the
> column data type. I'm not really sure how this should work exactly ...
> but it needs careful thought.

This is the most important open question so far to my knowledge. It was never addressed, it doesn't seem like there is a plan of doing so, the suggested alternative approach was ignored, nor are there any strong arguments that would defend this design choice and/or criticize the alternative one (other than general words "don't worry we know what we are doing"). This is what I mean by the community feedback being discarded.

--
Best regards,
Aleksander Alekseev
Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation
On Sun, Oct 23, 2022 at 9:32 PM Jeff Davis wrote: > It's possible this would be easier for users to understand: one process > that does cleanup work over time in a way that minimizes interference; > and another process that activates in more urgent situations (perhaps > due to misconfiguration of the first process). I think that the new "early" version of antiwraparound autovacuum (that can still be autocancelled) would simply be called autovacuum. It wouldn't appear as "autovacuum to prevent wraparound" in places like pg_stat_activity. For the most part users wouldn't have to care about the difference between these autovacuums and traditional non-antiwraparound autovacuums. They really would be exactly the same thing, so it would make sense if users typically noticed no difference whatsoever (at least in contexts like pg_stat_activity). > But we should be careful that we don't end up with more confusion. For > something like that to work, we'd probably want the second process to > not be configurable at all, and we'd want it to be issuing WARNINGs > pointing to what might be misconfigured, and otherwise just be > invisible. There should be some simple scheme for determining when an antiwraparound autovacuum (non-cancellable autovacuum to advance relfrozenxid/relminmxid) should run (applied by the autovacuum.c scheduling logic). Something like "table has attained an age that's now 2x autovacuum_freeze_max_age, or 1/2 of vacuum_failsafe_age, whichever is less". The really important thing is giving a regular/early autocancellable autovacuum triggered by age(relfrozenxid) *some* opportunity to run. I strongly suspect that the exact details won't matter too much, provided we manage to launch at least one such autovacuum before escalating to traditional antiwraparound autovacuum (which cannot be autocancelled). Even if regular/early autovacuum had just one opportunity to run to completion, we'd already be much better off. The hazards from blocking automated DDL in a way that leads to a very disruptive traffic jam (like in the Joyent Manta postmortem) would go way down. > > That way we wouldn't be fighting against the widely held perception > > that antiwraparound autovacuums are scary. > > There's certainly a terminology problem there. Just to brainstorm on > some new names, we might want to call it something like "xid > reclamation" or "xid horizon advancement". I think that we should simply call it autovacuum. Under this scheme, antiwraparound autovacuum would be a qualitatively different kind of operation to users (though not to vacuumlazy.c), because it would not be autocancellable in the standard way. And because users should take it as a signal that things aren't really working well (otherwise we wouldn't have reached the point of requiring a scary antiwraparound autovacuum in the first place). Right now antiwraparound autovacuums are both an emergency thing (or at least described as such in one or two areas of the source code), and a completely routine occurrence. This is deeply confusing. Separately, I plan on breaking out insert-triggered autovacuums from traditional dead tuple triggered autovacuums [1], which creates a need to invent some kind of name to differentiate the new table age triggering criteria from both insert-driven and dead tuple driven autovacuums. These are all fundamentally the same operations with the same urgency to users, though. 
We'd only need to describe the *criteria* that *triggered* the autovacuum in our autovacuum log report (actually we'd still report autovacuums as antiwraparound autovacuums in cases where that still happened, which won't be presented as just another triggering criterion in the report).

[1] https://www.postgresql.org/message-id/flat/cah2-wzneqmkmry8feudk8xdph37-4anygf7a04bwxoc1gkd...@mail.gmail.com

--
Peter Geoghegan
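For readers following along, the table age that drives this scheduling logic can be inspected directly; a simple query along these lines shows how close each table is to autovacuum_freeze_max_age in any given database.

SELECT c.relname,
       age(c.relfrozenxid) AS xid_age,
       current_setting('autovacuum_freeze_max_age')::int AS freeze_max_age
FROM pg_class c
WHERE c.relkind IN ('r', 'm', 't')
ORDER BY age(c.relfrozenxid) DESC
LIMIT 10;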
Re: parse partition strategy string in gram.y
Is there a reason why HASH partitioning does not currently support range partition bounds, where the values in the partition bounds would refer to the hashed value?

The advantage of hash partition bounds is that they are not domain-specific, as they are for ordinary RANGE partitions, but they are more flexible than MODULUS/REMAINDER partition bounds.

On 10/21/22, 9:48 AM, "Japin Li" wrote:

On Fri, 21 Oct 2022 at 20:34, Justin Pryzby wrote:
> On Fri, Oct 21, 2022 at 06:22:44PM +0800, Japin Li wrote:
>> Is there any way to get the regression tests diffs from Cirrus CI?
>> I did not find the diffs in [1].
>>
>> [1] https://cirrus-ci.com/build/4721735111540736
>
> They're called "main".
> I'm planning on submitting a patch to rename it to "regress", someday.
> See also: https://www.postgresql.org/message-id/20221001161420.GF6256%40telsasoft.com

Oh, thank you very much! I find it in testrun/build/testrun/main/regress [1].

[1] https://api.cirrus-ci.com/v1/artifact/task/6215926717612032/testrun/build/testrun/main/regress/regression.diffs

--
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
Re: Pluggable toaster
Hi!

>From personal experience with the project I have serious doubts this
>is going to happen. Before such invasive changes are going to be
>accepted there should be a clear understanding of how exactly TOASTers
>are supposed to be used. This should be part of the documentation in
>the patchset. Additionally there should be an open-source or
>source-available extension that actually demonstrates the benefits of
>TOASTers with reproducible benchmarks (we didn't even get to that part
>yet).

Actually, there's a documentation part in the patchset. Also, there is a README file explaining the API. In addition, we have several custom TOAST implementations with some results - they were published and presented at PGCon. I was asked to exclude the custom TOAST implementations and some further improvements from the first iteration, which is why the patchset currently consists of only 3 patches - the base core changes, the default TOAST implementation via the TOAST API, and the documentation package.

>What other use cases for TOAST do you have in mind?

The main use case is the same as for the TOAST mechanism - storing and retrieving oversized data. But we expanded this case with some details:
- update TOASTed data (the current TOAST implementation cannot update stored data - it marks the whole TOASTed object as dead and stores a new one);
- retrieve part of the stored data chunks without fully de-TOASTing the stored data (even with existing TOAST this is painful if you have to get just a small part of an object several hundred MB in size);
- be able to store objects larger than 1 GB;
- store more than 4 TB of TOASTed data for one table;
- optimize storage for fast search and retrieval of parts of a TOASTed object - this is a must-have for using JSON effectively; PostgreSQL is already playing catch-up in the JSON performance field.
For all these cases we have test results that show improvements in storage and performance.

>To clarify, the concern about "N TOASTers vs M TableAM" was expressed
>by Robert Haas back in Jan 2022:

>> I agree ... but I'm also worried about what happens when we have
>> multiple table AMs. One can imagine a new table AM that is
>> specifically optimized for TOAST which can be used with an existing
>> heap table. One can imagine a new table AM for the main table that
>> wants to use something different for TOAST. So, I don't think it's
>> right to imagine that the choice of TOASTer depends solely on the
>> column data type. I'm not really sure how this should work exactly ...
>> but it needs careful thought.

>This is the most important open question so far to my knowledge. It
>was never addressed, it doesn't seem like there is a plan of doing so,
>the suggested alternative approach was ignored, nor are there any
>strong arguments that would defend this design choice and/or criticize
>the alternative one (other than general words "don't worry we know
>what we are doing").

>This is what I mean by the community feedback being discarded.

Maybe there was some misunderstanding; I was new to this project and company at that time - I was introduced to it in the middle of December 2021 - but Theodor Sigaev gave an answer to Mr. Haas:

>Right. that's why we propose a validate method (may be, it's a wrong
>name, but I don't known better one) which accepts several arguments, one
>of which is table AM oid. If that method returns false then toaster
>isn't useful with current TAM, storage or/and compression kinds, etc.
And this is a generalized and, from the OOP point of view, correct means of ensuring that a concrete TOAST implementation is valid for the Table AM calling it.

On Mon, Oct 24, 2022 at 4:53 PM Aleksander Alekseev <aleksan...@timescale.com> wrote:
> Hi Nikita,
>
> > Using Table AM Routine and routing AM methods calls via it is a topic for further discussion,
> > if Pluggable TOAST will be committed. [...] And even then it would be an open issue.
>
> From personal experience with the project I have serious doubts this
> is going to happen. Before such invasive changes are going to be
> accepted there should be a clear understanding of how exactly TOASTers
> are supposed to be used. This should be part of the documentation in
> the patchset. Additionally there should be an open-source or
> source-available extension that actually demonstrates the benefits of
> TOASTers with reproducible benchmarks (we didn't even get to that part
> yet).
>
> > TOAST implementation is not necessary for Table AM.
>
> What other use cases for TOAST do you have in mind?
>
> >> > Have I answered your question? Please don't hesitate to point to any unclear
> >> > parts, I'd be glad to explain that.
> >>
> >> No. To be honest, it looks like you are merely discarding most/any
> >> feedback the community provided so far.
> >>
> >> I really think that pluggable TOASTers would be a great feature.
> >> However if the goal is to get it into the core I doubt that we are
> >> going to make much progress with the current approach.
> >
Re: effective_multixact_freeze_max_age issue
On Mon, Oct 24, 2022 at 1:18 AM Anton A. Melnikov wrote: > > Also, the proposed new WARNING is only seen when we're > > bound to also see the existing OldestXmin WARNING already. Why have 2 > > WARNINGs about exactly the same problem?> > > I didn't understand this moment. > > If the FreezeLimit is always older than OldestXmin or equal to it according > to: > > > FreezeLimit is usually 50 million XIDs before > > OldestXmin (the freeze_min_age default), > > can't there be a situation like this? I don't understand what you mean. FreezeLimit *isn't* always exactly 50 million XIDs before OldestXmin -- not anymore. In fact that's the main benefit of this work (commit c3ffa731). That detail has changed (and changed for the better). Though it will only be noticeable to users when an old snapshot holds back OldestXmin by a significant amount. It is true that we must always respect the classic "FreezeLimit <= OldestXmin" invariant. So naturally vacuum_set_xid_limits() continues to make sure that the invariant holds in all cases, if necessary by holding back FreezeLimit: + /* freezeLimit must always be <= oldestXmin */ + if (TransactionIdPrecedes(*oldestXmin, *freezeLimit)) + *freezeLimit = *oldestXmin; When we *don't* have to do this (very typical when vacuum_freeze_min_age is set to its default of 50 million), then FreezeLimit won't be affected by old snapshots. Overall, FreezeLimit must either be: 1. *Exactly* freeze_min_age XIDs before nextXID (note that it is nextXID instead of OldestXmin here, as of commit c3ffa731). or: 2. *Exactly* OldestXmin. FreezeLimit must always be either exactly 1 or exactly 2, regardless of anything else (like long running transactions/snapshots). Importantly, we still never interpret freeze_min_age as more than "autovacuum_freeze_max_age / 2" when determining FreezeLimit. While the safeOldestXmin cutoff is "nextXID - autovacuum_freeze_max_age". Before commit c3ffa731, FreezeLimit would sometimes be interpreted as exactly OldestXmin -- it would be set to OldestXmin directly when the WARNING was given. But now we get smoother behavior, without any big discontinuities in how FreezeLimit is set over time when OldestXmin is held back generally. -- Peter Geoghegan
Re: New docs chapter on Transaction Management and related changes
On Sun, 16 Oct 2022 at 02:08, Bruce Momjian wrote:
>
> On Fri, Oct 14, 2022 at 05:46:55PM -0400, Robert Treat wrote:
> > On Fri, Oct 14, 2022 at 3:51 PM Bruce Momjian wrote:
> > > Attached is the merged patch from all the great comments I received. I
> > > have also rebuilt the docs with the updated patch:
> > >
> > > https://momjian.us/tmp/pgsql/
> > >
> >
> > + RELEASE SAVEPOINT also subcommits and destroys
> > + all savepoints that were established after the named savepoint was
> > + established. This means that any subtransactions of the named savepoint
> > + will also be subcommitted and destroyed.
> >
> > Wonder if we should be more explicit that data changes are preserved,
> > not destroyed... something like:
> > "This means that any changes within subtransactions of the named
> > savepoint will be subcommitted and those subtransactions will be
> > destroyed."
>
> Good point. I reread the section and there was just too much confusion
> over subtransactions, partly because the behavior just doesn't map
> easily to subtransaction. I therefore merged all three paragraphs into
> one and tried to make the text saner; release_savepoint.sgml diff
> attached, URL content updated.

Just got around to reading this, thanks for the changes.

The rewording doesn't work for me. The use of the word "destroy" is very misleading, since the effect is to commit. So I think we must avoid use of the word destroy completely. Possible rewording:

RELEASE SAVEPOINT will subcommit the subtransaction established by the named savepoint, releasing any resources held by it. If there were any subtransactions created by the named savepoint, these will also be subcommitted.

The point is that savepoints create subtransactions, but they are not the only way to create them, so we cannot equate savepoint and subtransaction completely.

--
Simon Riggs
http://www.EnterpriseDB.com/
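As a concrete illustration of the semantics being discussed (assuming a pre-existing table tbl), releasing an outer savepoint subcommits everything established after it; no changes are discarded.

BEGIN;
INSERT INTO tbl VALUES (1);
SAVEPOINT a;
INSERT INTO tbl VALUES (2);
SAVEPOINT b;               -- a subtransaction established after savepoint a
INSERT INTO tbl VALUES (3);
RELEASE SAVEPOINT a;       -- subcommits the subtransactions of a and b;
                           -- rows 2 and 3 are kept, not discarded
COMMIT;                    -- all three rows become permanent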
[PATCH] minor optimization for ineq_histogram_selectivity()
Hello,

When studying the weird planner issue reported here [1], I came up with the attached patch. It reduces the probability of calling get_actual_variable_range(). The patch applies to the master branch.

How to test:

CREATE TABLE foo (a bigint, b TEXT) WITH (autovacuum_enabled = off);
INSERT INTO foo SELECT i%213, md5(i::text) from generate_series(1,100) i;
VACUUM ANALYZE foo;
SELECT * FROM pg_stats WHERE tablename = 'foo' AND attname='a'\gx
CREATE INDEX ON foo(a);
DELETE FROM foo WHERE a = 212;
EXPLAIN (BUFFERS) SELECT count(a) FROM foo WHERE a > 208;

Without this patch, you will observe at least 4694 shared hits (which are mostly heap fetches). If you apply the patch, you will observe very few of them. You should run the EXPLAIN on a standby if you want to observe the heap fetches more than one time (because of killed index tuples being ignored).

Best regards,
Frédéric

[1] https://www.postgresql.org/message-id/flat/CAECtzeVPM4Oi6dTdqVQmjoLkDBVChNj7ed3hNs1RGrBbwCJ7Cw%40mail.gmail.com

From d7602f0731a3c5cac59cf2fc81bc336f800cb11a Mon Sep 17 00:00:00 2001
From: Frédéric Yhuel
Date: Mon, 24 Oct 2022 16:33:26 +0200
Subject: [PATCH] minor optimization for ineq_histogram_selectivity()

Avoid calling get_actual_variable_range() when possible. With this change,
'probe' can still reach sslot.nvalues - 1 if needed, so the algorithm still
works correctly. But if the right end of the histogram bin is equal to
sslot.values[sslot.nvalues - 2], 'probe' will stay strictly less than
sslot.nvalues - 1 in the while loop, and we save a call to
get_actual_variable_range().

Use of get_actual_variable_range() can sometimes lead to surprisingly slow
planning time (see [1] and [2])

[1] https://www.postgresql.org/message-id/flat/CAECtzeVPM4Oi6dTdqVQmjoLkDBVChNj7ed3hNs1RGrBbwCJ7Cw%40mail.gmail.com
[2] https://www.postgresql.org/message-id/flat/36adf760-680b-7a4a-e019-64f4eaaf6ff7%40gmail.com
---
 src/backend/utils/adt/selfuncs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 69e0fb98f5..c1bc67705c 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -1108,7 +1108,7 @@ ineq_histogram_selectivity(PlannerInfo *root,

 		while (lobound < hibound)
 		{
-			int			probe = (lobound + hibound) / 2;
+			int			probe = (lobound + hibound - 1) / 2;
 			bool		ltcmp;

 			/*
--
2.30.2
Re: Add the ability to limit the amount of memory that can be allocated to backends.
Hello Reid,

could you rebase the patch again? It doesn't apply currently (http://cfbot.cputube.org/patch_40_3867.log). Thanks!

You mention that you want to prevent the compiler from getting cute. I don't think these comments are exactly helpful in their current state; it's probably fine to just omit them.

I don't understand the purpose of the result variable in exceeds_max_total_bkend_mem. What purpose does it serve?

I really like the simplicity of the suggestion here to prevent OOM. I intend to play around with a lot of backends once I get a rebased patch.

Regards
Arne

From: Reid Thompson
Sent: Thursday, September 15, 2022 4:58:19 PM
To: Ibrar Ahmed; pgsql-hackers@lists.postgresql.org
Cc: reid.thomp...@crunchydata.com; Justin Pryzby
Subject: Re: Add the ability to limit the amount of memory that can be allocated to backends.

On Thu, 2022-09-15 at 12:07 +0400, Ibrar Ahmed wrote:
>
> The patch does not apply; please rebase the patch.
>
> patching file src/backend/utils/misc/guc.c
> Hunk #1 FAILED at 3664.
> 1 out of 1 hunk FAILED -- saving rejects to file
> src/backend/utils/misc/guc.c.rej
>
> patching file src/backend/utils/misc/postgresql.conf.sample
>

rebased patches attached.

Thanks,
Reid
Re: effective_multixact_freeze_max_age issue
On Mon, Oct 24, 2022 at 7:56 AM Peter Geoghegan wrote: > I don't understand what you mean. FreezeLimit *isn't* always exactly > 50 million XIDs before OldestXmin -- not anymore. In fact that's the > main benefit of this work (commit c3ffa731). That detail has changed > (and changed for the better). Though it will only be noticeable to > users when an old snapshot holds back OldestXmin by a significant > amount. I meant that the new behavior will only have a noticeable impact when OldestXmin is significantly earlier than nextXID. Most of the time there won't be any old snapshots, which means that there will only be a negligible difference between OldestXmin and nextXID when things are operating normally (OldestXmin will still probably be a tiny bit earlier than nextXID, but not enough to matter). And so most of the time the difference between the old approach and the new approach will be completely negligible; 50 million XIDs is usually a huge number (it is usually far far larger than the difference between OldestXmin and nextXID). BTW, I have some sympathy for the argument that the WARNINGs that we have here may not be enough -- we only warn when the situation is already extremely serious. I just don't see any reason why that problem should be treated as a regression caused by commit c3ffa731. The WARNINGs may be inadequate, but that isn't new. -- Peter Geoghegan
Re: Pluggable toaster
Hi Nikita, > > > TOAST implementation is not necessary for Table AM. > > >What other use cases for TOAST do you have in mind? > > The main use case is the same as for the TOAST mechanism - storing and > retrieving > oversized data. But we expanded this case with some details - > - update TOASTed data (yes, current TOAST implementation cannot update stored > data - is marks whole TOASTED object as dead and stores new one); > - retrieve part of the stored data chunks without fully de-TOASTing stored > data (even > with existing TOAST this will be painful if you have to get just a small part > of the several > hundreds Mb sized object); > - be able to store objects of size larger than 1 Gb; > - store more than 4 Tb of TOASTed data for one table; > - optimize storage for fast search and retrieval of parts of TOASTed object - > this is > must-have for effectively using JSON, PostgreSQL already is in catching-up > position > in JSON performance field. I see. Actually most of this is what TableAM does. We just happened to give this part of TableAM a separate name. The only exception is the last case, when you create custom TOASTers for particular types and potentially specify them for the given column. All in all, this makes sense. > Right. that's why we propose a validate method (may be, it's a wrong > name, but I don't known better one) which accepts several arguments, one > of which is table AM oid. If that method returns false then toaster > isn't useful with current TAM, storage or/and compression kinds, etc. OK, I missed this message. So there was some misunderstanding after all, sorry for this. That's exactly what I wanted to know. It's much better than allowing any TOASTer to run on top of any TableAM. -- Best regards, Aleksander Alekseev
Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation
On Mon, 2022-10-24 at 07:25 -0700, Peter Geoghegan wrote: > The really important thing is giving a regular/early autocancellable > autovacuum triggered by age(relfrozenxid) *some* opportunity to run. +1. That principle seems both reasonable from a system standpoint and understandable to a user. > Even if regular/early autovacuum had just one > opportunity to run to completion, we'd already be much better off. By "opportunity", you mean that, regardless of configuration, the cancellable autovacuum would at least start; though it still might be cancelled by DDL. Right? > These are all fundamentally the same operations with the > same urgency to users, though. We'd only need to describe the > *criteria* that *triggered* the autovacuum in our autovacuum log > report Hmm... I'm worried that could be a bit confusing depending on how it's done. Let's be clear that it was merely the triggering criteria and doesn't necessarily represent the work that is being done. There are enough cases that it would be good to start a document and outline the end behavior that your patch series is designed to accomplish. In other words, a before/after of the interesting cases. -- Jeff Davis PostgreSQL Contributor Team - AWS
Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation
On Mon, Oct 24, 2022 at 8:43 AM Jeff Davis wrote: > > Even if regular/early autovacuum had just one > > opportunity to run to completion, we'd already be much better off. > > By "opportunity", you mean that, regardless of configuration, the > cancellable autovacuum would at least start; though it still might be > cancelled by DDL. Right? Yes, exactly. It might be difficult as a practical matter to make sure that we *reliably* give autovacuum.c the opportunity to launch a "standard" autovacuum tasked with advancing relfrozenxid (just after autovacuum_freeze_max_age is first crossed) before the point that a scary antiwraparound autovacuum is launched. So we might end up giving it more XID slack than it's likely to ever need (say by only launching a traditional antiwraparound autovacuum against tables that attain an age that is twice the value of autovacuum_freeze_max_age). These are all just details, though -- the important principle is that we try our utmost to give the less disruptive strategy a chance to succeed before concluding that it has failed, and then "escalating" to a traditional antiwraparound autovacuum. > > These are all fundamentally the same operations with the > > same urgency to users, though. We'd only need to describe the > > *criteria* that *triggered* the autovacuum in our autovacuum log > > report > > Hmm... I'm worried that could be a bit confusing depending on how it's > done. Let's be clear that it was merely the triggering criteria and > doesn't necessarily represent the work that is being done. Maybe it could be broken out into a separate "autovacuum triggered by: " line, seen only in the autovacuum log instrumentation (and not in the similar report output by a manual VACUUM VERBOSE). When we still end up "escalating" to an antiwraparound autovacuum, the "triggered by:" line would match whatever we'd show in the benign the non-cancellable-but-must-advance-relfrozenxid autovacuum case. The difference would be that we'd now be reporting on a different operation entirely (not just a regular autovacuum, a scary antiwraparound autovacuum). (Again, even these distinctions wouldn't be meaningful to vacuumlazy.c itself -- it would just need to handle the details around logging in a way that gave users the right idea. There wouldn't be any special discrete aggressive mode of operation anymore, assuming my big patch set gets into Postgres 16 too.) > There are enough cases that it would be good to start a document and > outline the end behavior that your patch series is designed to > accomplish. In other words, a before/after of the interesting cases. That's on my TODO list. Mostly it's an independent thing to this (antiwraparound) autovacuum stuff, despite the fact that both projects share the same underlying philosophy. -- Peter Geoghegan
Re: parse partition strategy string in gram.y
On 2022-Oct-24, Finnerty, Jim wrote:

> Is there a reason why HASH partitioning does not currently support
> range partition bounds, where the values in the partition bounds would
> refer to the hashed value?

Just lack of an implementation, I suppose.

> The advantage of hash partition bounds is that they are not
> domain-specific, as they are for ordinary RANGE partitions, but they
> are more flexible than MODULUS/REMAINDER partition bounds.

Well, modulus/remainder is what we have. If you have ideas for a different implementation, let's hear them. I suppose we would have to know about both the user interface and how it would work internally, from two perspectives: how does tuple routing work (ie. how to match a tuple's values to a set of bound values), and how does partition pruning work (ie. how do partition bounds match a query's restriction clauses).

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
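For reference, this is the existing modulus/remainder interface that any alternative hash bound syntax would have to coexist with or replace (table and column names are illustrative):

CREATE TABLE orders (id bigint, payload text) PARTITION BY HASH (id);

CREATE TABLE orders_p0 PARTITION OF orders FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE orders_p1 PARTITION OF orders FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE orders_p2 PARTITION OF orders FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE orders_p3 PARTITION OF orders FOR VALUES WITH (MODULUS 4, REMAINDER 3);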
Re: New docs chapter on Transaction Management and related changes
On Mon, Oct 24, 2022 at 11:02 AM Simon Riggs wrote: > > On Sun, 16 Oct 2022 at 02:08, Bruce Momjian wrote: > > > > On Fri, Oct 14, 2022 at 05:46:55PM -0400, Robert Treat wrote: > > > On Fri, Oct 14, 2022 at 3:51 PM Bruce Momjian wrote: > > > > Attached is the merged patch from all the great comments I received. I > > > > have also rebuilt the docs with the updated patch: > > > > > > > > https://momjian.us/tmp/pgsql/ > > > > > > > > > > + RELEASE SAVEPOINT also subcommits and destroys > > > + all savepoints that were established after the named savepoint was > > > + established. This means that any subtransactions of the named > > > savepoint > > > + will also be subcommitted and destroyed. > > > > > > Wonder if we should be more explicit that data changes are preserved, > > > not destroyed... something like: > > > "This means that any changes within subtransactions of the named > > > savepoint will be subcommitted and those subtransactions will be > > > destroyed." > > > > Good point. I reread the section and there was just too much confusion > > over subtransactions, partly because the behavior just doesn't map > > easily to subtransaction. I therefore merged all three paragraphs into > > one and tried to make the text saner; release_savepoint.sgml diff > > attached, URL content updated. > > Just got around to reading this, thanks for changes. > > The rewording doesn't work for me. The use of the word "destroy" is > very misleading, since the effect is to commit. > > So I think we must avoid use of the word destroy completely. Possible > rewording: > > RELEASE SAVEPOINT will subcommit the subtransaction > established by the named savepoint, releasing any resources held by > it. If there were any subtransactions created by the named savepoint, > these will also be subcommitted. > I think it should be "If there were any subtransactions of the named savepoint, these will also be subcommitted", but otherwise I think this wording should work. Robert Treat https://xzilla.net
fixing typo in comment for restriction_is_or_clause
Hi,

When I was looking at src/backend/optimizer/util/restrictinfo.c, I found a typo in one of the comments. I also took the chance to simplify the code a little bit. Please take a look at the patch.

Thanks

is-or.patch
Description: Binary data
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
On Thu, Oct 20, 2022 at 1:31 PM Andres Freund wrote: > > Hi, > > - we shouldn't do pgstat_count_io_op() while the buffer header lock is held, > if possible. I've changed this locally. It will be fixed in the next version I share. > > I wonder if we should add a "source" output argument to > StrategyGetBuffer(). Then nearly all the counting can happen in > BufferAlloc(). I think we can just check for BM_VALID being set before invalidating it in order to claim the buffer at the end of BufferAlloc(). Then we can count it as an eviction or reuse. > > - "repossession" is a very unintuitive name for me. If we want something like > it, can't we just name it reuse_failed or such? Repossession could be called eviction_failed or reuse_failed. Do we think we will ever want to use it to count buffers we released in other IOContexts (thus making the name eviction_failed better than reuse_failed)? > - Is it actually correct to count evictions in StrategyGetBuffer()? What if we > then decide to not use that buffer in BufferAlloc()? Yes, that'll be counted > via rejected, but that still leaves the eviction count to be "misleading"? I agree that counting evictions in StrategyGetBuffer() is incorrect. Checking BM_VALID at bottom of BufferAlloc() should be better. > On 2022-10-19 15:26:51 -0400, Melanie Plageman wrote: > > I have made some major changes in this area to make the columns more > > useful. I have renamed and split "clocksweeps". It is now "evicted" and > > "freelist acquired". This makes it clear when a block must be evicted > > from a shared buffer must be and may help to identify misconfiguration > > of shared buffers. > > I'm not sure freelist acquired is really that useful? If we don't add it, we > should however definitely not count buffers from the freelist as evictions. > > > > There is some nuance here that I tried to make clear in the docs. > > "freelist acquired" in a shared context is straightforward. > > "freelist acquired" in a strategy context is counted when a shared > > buffer is added to the strategy ring (not when it is reused). > > Not sure what the second half here means - why would a buffer that's not from > the freelist ever be counted as being from the freelist? > > > > "freelist_acquired" is confusing for local buffers but I wanted to > > distinguish between reuse/eviction of local buffers and initial > > allocation. "freelist_acquired" seemed more fitting because there is a > > clocksweep to find a local buffer and if it hasn't been allocated yet it > > is allocated in a place similar to where shared buffers acquire a buffer > > from the freelist. If I didn't count it here, I would need to make a new > > column only for local buffers called "allocated" or something like that. > > I think you're making this too granular. We need to have more detail than > today. But we don't necessarily need to catch every nuance. > I am fine with cutting freelist_acquired. The same actionable information that it could provide could be provided by "read", right? Also, removing it means I can remove the complicated explanation of how freelist_acquired should be interpreted in IOCONTEXT_LOCAL. Speaking of IOCONTEXT_LOCAL, I was wondering if it is confusing to call it IOCONTEXT_LOCAL since it refers to IO done for temporary tables. What if, in the future, we want to track other IO done using data in local memory? Also, what if we want to track other IO done using data from shared memory that is not in shared buffers? Would IOCONTEXT_SB and IOCONTEXT_TEMP be better? 
Should IOContext literally describe the context of the IO being done and there be a separate column which indicates the source of the data for the IO? Like wal_buffer, local_buffer, shared_buffer? Then if it is not block-oriented, it could be shared_mem, local_mem, or bypass?

If we had another dimension to the matrix, "data_src", which with block-oriented IO is equivalent to "buffer type", this could help with some of the clarity problems. We could remove the "reused" column and that becomes:

IOCONTEXT | DATA_SRC        | IOOP
strategy  | strategy_buffer | EVICT

Having data_src and iocontext simplifies the meaning of all io operations involving a strategy. Some operations are done on shared buffers and some on existing strategy buffers and this would be more clear without the addition of special columns for strategies.

> I have also added the columns "repossessed" and "rejected". "rejected"
> is when a bulkread rejects a strategy buffer because it is dirty and
> requires flush. Seeing a lot of rejections could indicate you need to
> vacuum. "repossessed" is the number of times a strategy buffer was
> pinned or in use by another backend and had to be removed from the
> strategy ring and replaced with a new shared buffer. This gives you some
> indication that there is contention on blocks recently used by a
> strategy.

> I don't immediately see a real use case f
Re: Pluggable toaster
Hi, Aleksander, thanks for the discussion! It seems to me that I have to add some parts of it to API documentation, to clarify the details on API purpose and use-cases. On Mon, Oct 24, 2022 at 6:37 PM Aleksander Alekseev < aleksan...@timescale.com> wrote: > Hi Nikita, > > > > > TOAST implementation is not necessary for Table AM. > > > > >What other use cases for TOAST do you have in mind? > > > > The main use case is the same as for the TOAST mechanism - storing and > retrieving > > oversized data. But we expanded this case with some details - > > - update TOASTed data (yes, current TOAST implementation cannot update > stored > > data - is marks whole TOASTED object as dead and stores new one); > > - retrieve part of the stored data chunks without fully de-TOASTing > stored data (even > > with existing TOAST this will be painful if you have to get just a small > part of the several > > hundreds Mb sized object); > > - be able to store objects of size larger than 1 Gb; > > - store more than 4 Tb of TOASTed data for one table; > > - optimize storage for fast search and retrieval of parts of TOASTed > object - this is > > must-have for effectively using JSON, PostgreSQL already is in > catching-up position > > in JSON performance field. > > I see. Actually most of this is what TableAM does. We just happened to > give this part of TableAM a separate name. The only exception is the > last case, when you create custom TOASTers for particular types and > potentially specify them for the given column. > > All in all, this makes sense. > > > Right. that's why we propose a validate method (may be, it's a wrong > > name, but I don't known better one) which accepts several arguments, one > > of which is table AM oid. If that method returns false then toaster > > isn't useful with current TAM, storage or/and compression kinds, etc. > > OK, I missed this message. So there was some misunderstanding after > all, sorry for this. > > That's exactly what I wanted to know. It's much better than allowing > any TOASTer to run on top of any TableAM. > > -- > Best regards, > Aleksander Alekseev > -- Regards, Nikita Malakhov Postgres Professional https://postgrespro.ru/
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
On Sun, Oct 23, 2022 at 6:35 PM Maciek Sakrejda wrote: > > On Wed, Oct 19, 2022 at 12:27 PM Melanie Plageman > wrote: > > > > v34 is attached. > > I think the column names need discussion. Also, the docs need more work > > (I added a lot of new content there). I could use feedback on the column > > names and definitions and review/rephrasing ideas for the docs > > additions. > > Nice! I think the expanded docs are great, and make this information > much easier to interpret. > > >+ io_context bulkread, existing > >+ dirty buffers in the ring requirng flush are > > "requiring" Thanks! > > >+ shared buffers were acquired from the freelist and added to the > >+ fixed-size strategy ring buffer. Shared buffers are added to the > >+ strategy ring lazily. If the current buffer in the ring is pinned or > >in > > This is the first mention of the term "strategy" in these docs. It's > not totally opaque, since there's some context, but maybe we should > either try to avoid that term or define it more explicitly? > I am thinking it might be good to define the term strategy for use in this view documentation. In the IOContext column documentation, I've added this ... avoid occupying an undue portion of the main shared buffer pool. This pattern is called a Buffer Access Strategy and the fixed-size ring buffer can be referred to as a strategy ring buffer. I was thinking this would allow me to refer to the strategy ring buffer more easily. I fear simply referring to "the" ring buffer throughout this view documentation will be confusing. > >+ io_contexts. This is equivalent to > >+ evicted for shared buffers in > >+ io_context shared, as the > >contents > >+ of the buffer are evicted but refers to the case when > >the > > I don't quite follow this: does this mean that I should expect > 'reused' and 'evicted' to be equal in the 'shared' context, because > they represent the same thing? Or will 'reused' just be null because > it's not distinct from 'evicted'? It looks like it's null right now, > but I find the wording here confusing. You should only see evictions when the strategy evicts shared buffers and reuses when the strategy evicts existing strategy buffers. How about this instead in this docs? the number of times an existing buffer in the strategy ring was reused as part of an operation in the bulkread, bulkwrite, or vacuum io_contexts. when a buffer access strategy reuses a buffer in the strategy ring, it must evict its contents, incrementing reused. when a buffer access strategy adds a new shared buffer to the strategy ring and this shared buffer is occupied, the buffer access strategy must evict the contents of the shared buffer, incrementing evicted. > > I've implemented a change using the same function pg_settings uses to > > turn the build-time parameter BLCKSZ into 8kB (get_config_unit_name()) > > using the flag GUC_UNIT_BLOCKS. I am unsure if this is better or worse > > than "block_size". I am feeling very conflicted about this column. > > Yeah, I guess it feels less natural here than in pg_settings, but it > still kind of feels like one way of doing this is better than two... So, Andres pointed out that it would be nice to be able to multiply the unit column by the operation column (e.g. select unit * reused from pg_stat_io...) and get a number of bytes. Then you can use pg_size_pretty to convert it to something more human readable. It probably shouldn't be called unit, then, since that would be the same name as pg_settings but a different meaning. I thought of "bytes_conversion". 
Then, non-block-oriented IO also wouldn't have to be in bytes. They could put 1000 or 1 for bytes_conversion. What do you think? - Melanie
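To make the proposal above concrete, usage could look roughly like this; pg_stat_io, reused, and bytes_conversion are the in-progress names from this thread and may well change before anything is committed.

-- Hypothetical query against the proposed view: convert per-operation
-- counts into a human-readable size.
SELECT io_context,
       pg_size_pretty(reused * bytes_conversion) AS reused_bytes
FROM pg_stat_io
WHERE reused IS NOT NULL;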
Question about "compound" queries.
Hello!

Please, could somebody explain what "compound" queries were created for? Maybe I'm calling them by the wrong name. It's about queries like:

SELECT 1 + 2 \; SELECT 2.0 AS "float" \; SELECT 1;

Such queries can neither be prepared nor used in the extended protocol; they fail with:

ERROR: cannot insert multiple commands into a prepared statement

What are their advantages? And what is the proper name for such queries? "Compound" or something else? I would be very grateful for clarification.

Best wishes,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: fix stats_fetch_consistency value in postgresql.conf.sample
On Mon, Sep 26, 2022 at 05:29:58PM +0900, Kyotaro Horiguchi wrote: > At Sat, 17 Sep 2022 23:53:07 -0500, Justin Pryzby > wrote in > > This is an alternative implementation, which still relies on adding the > > GUC_DYNAMIC, flag but doesn't require adding a new, sql-accessible > > function to convert the GUC to a pretty/human display value. > > Thanks! > > I'm not sure shared_buffer is GUC_DYNAMIC_DEFAULT, and we need to read It's set during initdb. > postgresql.conf.sample using SQL, but +1 for the direction. > > + AND NOT (sc.sample_value ~ '^0' AND current_setting(name) ~ '^0') -- > zeros may be written differently > + AND NOT (sc.sample_value='60s' AND current_setting(name) = '1min') -- > two ways to write 1min > > Mmm. Couldn't we get away from that explicit exceptions? Suggestions are welcomed. Rebased the patch. I also split the flag into DEFAULTS_COMPILE and DEFAULTS_INITDB, since that makes it easier to understand what the flags mean and the intent of the patch. And maybe allows fewer exclusions in patches like Peter's, which I think would only want to exclude compile-time defaults. -- Justin >From c3ac7bd40d26eb692f939d58935415ceb1cf308e Mon Sep 17 00:00:00 2001 From: Justin Pryzby Date: Tue, 19 Jul 2022 08:31:56 -0500 Subject: [PATCH 1/2] add DYNAMIC_DEFAULT for settings which vary by ./configure or platform or initdb --- doc/src/sgml/func.sgml | 15 ++ src/backend/utils/misc/guc_funcs.c | 6 ++- src/backend/utils/misc/guc_tables.c | 78 ++--- src/include/utils/guc.h | 2 + 4 files changed, 70 insertions(+), 31 deletions(-) diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 6e0425cb3dc..b8e05073d1a 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -24374,6 +24374,21 @@ SELECT collation for ('foo' COLLATE "de_DE"); FlagDescription + + + DEFAULT_COMPILE + Parameters with this flag have default values which vary by + platform, or compile-time options. + + + + + DEFAULT_INITDB + Parameters with this flag have default values which are + determined dynamically during initdb. 
+ + + EXPLAIN Parameters with this flag are included in diff --git a/src/backend/utils/misc/guc_funcs.c b/src/backend/utils/misc/guc_funcs.c index 108b3bd1290..2b666e8d014 100644 --- a/src/backend/utils/misc/guc_funcs.c +++ b/src/backend/utils/misc/guc_funcs.c @@ -538,7 +538,7 @@ ShowAllGUCConfig(DestReceiver *dest) Datum pg_settings_get_flags(PG_FUNCTION_ARGS) { -#define MAX_GUC_FLAGS 6 +#define MAX_GUC_FLAGS 8 char *varname = TextDatumGetCString(PG_GETARG_DATUM(0)); struct config_generic *record; int cnt = 0; @@ -551,6 +551,10 @@ pg_settings_get_flags(PG_FUNCTION_ARGS) if (record == NULL) PG_RETURN_NULL(); + if (record->flags & GUC_DEFAULT_COMPILE) + flags[cnt++] = CStringGetTextDatum("DEFAULT_COMPILE"); + if (record->flags & GUC_DEFAULT_INITDB) + flags[cnt++] = CStringGetTextDatum("DEFAULT_INITDB"); if (record->flags & GUC_EXPLAIN) flags[cnt++] = CStringGetTextDatum("EXPLAIN"); if (record->flags & GUC_NO_RESET) diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c index 6454d034e7e..885d3dd8b81 100644 --- a/src/backend/utils/misc/guc_tables.c +++ b/src/backend/utils/misc/guc_tables.c @@ -601,7 +601,7 @@ StaticAssertDecl(lengthof(GucContext_Names) == (PGC_USERSET + 1), const char *const GucSource_Names[] = { /* PGC_S_DEFAULT */ "default", - /* PGC_S_DYNAMIC_DEFAULT */ "default", + /* PGC_S_DYNAMIC_DEFAULT */ "default", // /* PGC_S_ENV_VAR */ "environment variable", /* PGC_S_FILE */ "configuration file", /* PGC_S_ARGV */ "command line", @@ -1191,7 +1191,7 @@ struct config_bool ConfigureNamesBool[] = {"debug_assertions", PGC_INTERNAL, PRESET_OPTIONS, gettext_noop("Shows whether the running server has assertion checks enabled."), NULL, - GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE + GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_DEFAULT_COMPILE }, &assert_enabled, #ifdef USE_ASSERT_CHECKING @@ -1367,7 +1367,8 @@ struct config_bool ConfigureNamesBool[] = { {"update_process_title", PGC_SUSET, PROCESS_TITLE, gettext_noop("Updates the process title to show the active SQL command."), - gettext_noop("Enables updating of the process title every time a new SQL command is received by the server.") + gettext_noop("Enables updating of the process title every time a new SQL command is received by the server."), + GUC_DEFAULT_COMPILE }, &update_process_title, #ifdef WIN32 @@ -2116,7 +2117,8 @@ struct config_int ConfigureNamesInt[] = { {"max_connections", PGC_POSTMASTER, CONN_AUTH_SETTINGS, gettext_noop("Sets the maximum number of concurrent connections."), - NULL + NULL, + GUC_DEFAULT_INITDB }, &MaxConnections, 100, 1, MAX_BACKENDS, @@ -2153,7 +2155,7 @@ struct config_int ConfigureNamesI
Re: Interesting areas for beginners
Thanks so much for the answers, I'll try to start looking at some patches. -- Matheus Alcantara
Re: Question about "compound" queries.
On Mon, Oct 24, 2022 at 3:02 PM Anton A. Melnikov wrote:
> Hello!
>
> Please, could somebody explain what "compound" queries were created for?
> Maybe I'm calling them by the wrong name. It's about queries like:
> SELECT 1 + 2 \; SELECT 2.0 AS "float" \; SELECT 1;
>
> Such queries can neither be prepared nor used in the extended protocol;
> they fail with:
> ERROR: cannot insert multiple commands into a prepared statement
> What are their advantages?
> And what is the proper name for such queries? "Compound" or something else?
> I would be very grateful for clarification.
>

I suspect they came about out of simplicity - being able to simply take a text file with a bunch of SQL commands in a script and send them as-is to the server without any client-side parsing and let the server just deal with it.

It works because the system needs to do those kinds of things anyway, so why not make it user-facing, even if most uses would find its restrictions make it undesirable to use.

David J.
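One practical property worth noting (assuming some table t exists): statements combined with \; travel to the server as a single query string and run in one implicit transaction, so an error in any of them rolls back the whole group.

INSERT INTO t VALUES (1) \; SELECT 1/0;
-- The division-by-zero error aborts the implicit transaction,
-- so the INSERT is rolled back as well.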
Re: [PATCH] Add `verify-system` sslmode to use system CA pool for server cert
On Mon, Oct 4, 2021 at 9:14 PM Bruce Momjian wrote: > On Tue, Sep 28, 2021 at 02:54:39AM -0700, tho...@habets.se wrote: > > And you say for complex setups. Fair enough. But currently I'd say the > > default is wrong, and what should be default is not configurable. > > Agreed, I think this needs much more discussion and documentation. I'd like to try to get this conversation started again. To pique interest I've attached a new version of 0001, which implements `sslrootcert=system` instead as suggested upthread. In 0002 I went further and switched the default sslmode to `verify-full` when using the system CA roots, because I feel pretty strongly that anyone interested in using public CA systems is also interested in verifying hostnames. (Otherwise, why make the switch?) Notes: - 0001, like Thomas' original patch, uses SSL_CTX_set_default_verify_paths(). This will load both a default file and a default directory. This is probably what most people want if they're using the system roots -- just give me whatever the local system wants me to use! -- but sslrootcert currently deals with files only, I think. Is that a problem? - The implementation in 0002 goes all the way down to conninfo_add_defaults(). Maybe this is overly complex. Should I just make sslmode a derived option, via connectOptions2()? Thanks, --Jacob From 14311929a443f25f5064cdb01b57fae8d575e66d Mon Sep 17 00:00:00 2001 From: Jacob Champion Date: Mon, 24 Oct 2022 15:24:11 -0700 Subject: [PATCH v2 2/2] libpq: default to verify-full for system CAs Weaker verification modes don't make much sense for public CAs. --- src/interfaces/libpq/fe-connect.c | 25 + src/interfaces/libpq/t/001_uri.pl | 30 -- src/test/ssl/t/001_ssltests.pl| 6 ++ 3 files changed, 59 insertions(+), 2 deletions(-) diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c index 746e9b4f1e..92b87d3318 100644 --- a/src/interfaces/libpq/fe-connect.c +++ b/src/interfaces/libpq/fe-connect.c @@ -5876,6 +5876,7 @@ static bool conninfo_add_defaults(PQconninfoOption *options, PQExpBuffer errorMessage) { PQconninfoOption *option; + PQconninfoOption *sslmode_default = NULL, *sslrootcert = NULL; char *tmp; /* @@ -5892,6 +5893,9 @@ conninfo_add_defaults(PQconninfoOption *options, PQExpBuffer errorMessage) */ for (option = options; option->keyword != NULL; option++) { + if (strcmp(option->keyword, "sslrootcert") == 0) + sslrootcert = option; /* save for later */ + if (option->val != NULL) continue; /* Value was in conninfo or service */ @@ -5936,6 +5940,13 @@ conninfo_add_defaults(PQconninfoOption *options, PQExpBuffer errorMessage) } continue; } + + /* + * sslmode is not specified. Let it be filled in with the compiled + * default for now, but if sslrootcert=system, we'll override the + * default later before returning. + */ + sslmode_default = option; } /* @@ -5969,6 +5980,20 @@ conninfo_add_defaults(PQconninfoOption *options, PQExpBuffer errorMessage) } } + /* + * Special handling for sslrootcert=system with no sslmode explicitly + * defined. In this case we want to strengthen the default sslmode to + * verify-full. 
+ */ + if (sslmode_default && sslrootcert) + { + if (sslrootcert->val && strcmp(sslrootcert->val, "system") == 0) + { + free(sslmode_default->val); + sslmode_default->val = strdup("verify-full"); + } + } + return true; } diff --git a/src/interfaces/libpq/t/001_uri.pl b/src/interfaces/libpq/t/001_uri.pl index beaf13b49c..ccfcd77680 100644 --- a/src/interfaces/libpq/t/001_uri.pl +++ b/src/interfaces/libpq/t/001_uri.pl @@ -8,7 +8,9 @@ use IPC::Run; # List of URIs tests. For each test the first element is the input string, the -# second the expected stdout and the third the expected stderr. +# second the expected stdout and the third the expected stderr. Optionally, +# additional arguments may specify key/value pairs which will override +# environment variables for the duration of the test. my @tests = ( [ q{postgresql://uri-user:secret@host:12345/db}, @@ -209,20 +211,44 @@ my @tests = ( q{postgres://%2Fvar%2Flib%2Fpostgresql/dbname}, q{dbname='dbname' host='/var/lib/postgresql' (local)}, q{}, + ], + # Usually the default sslmode is 'prefer' (for libraries with SSL) or + # 'disable' (for those without). This default changes to 'verify-full' if + # the system CA store is in use. + [ + q{postgresql://host?sslmode=disable}, + q{host='host' sslmode='disable' (inet)}, + q{}, + PGSSLROOTCERT => "system", + ], + [ + q{postgresql://host?sslmode=prefer}, + q{host='host' sslmode='prefer' (inet)}, + q{}, + PGSSLROOTCERT => "system", + ], + [ + q{postgresql://host?sslmode=verify-full}, + q{host='host' (inet)}, + q{}, + PGSSLROOTCERT => "system", ]); # test to run for each of the above test definitions sub test_uri { local $Test::Builder::Level = $Test::Builder::Level + 1;
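To make the proposed user-facing behaviour concrete, here is a tiny client-side sketch; the host and database names are placeholders, and the semantics in the comments are those of the 0001/0002 patches under discussion, not of any released libpq.

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    /* With 0001, sslrootcert=system asks OpenSSL to load its default CA
     * file/directory instead of ~/.postgresql/root.crt.  With 0002, leaving
     * sslmode unset here would be strengthened to verify-full rather than
     * falling back to the compiled-in default of prefer. */
    PGconn     *conn = PQconnectdb("host=db.example.com dbname=app sslrootcert=system");

    if (PQstatus(conn) != CONNECTION_OK)
        fprintf(stderr, "%s", PQerrorMessage(conn));
    else
        printf("connected, TLS in use: %s\n", PQsslInUse(conn) ? "yes" : "no");

    PQfinish(conn);
    return 0;
}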
Re: Crash after a call to pg_backup_start()
On Mon, Oct 24, 2022 at 11:39:19AM +0200, Alvaro Herrera wrote: > Reading it again, I agree with your conclusion, so I'll push as you > proposed with some extra tests, after they complete running. Thanks for the fix, Álvaro! -- Michael signature.asc Description: PGP signature
Re: parse partition strategy string in gram.y
Alvaro Herrera writes: > On 2022-Oct-24, Finnerty, Jim wrote: >> The advantage of hash partition bounds is that they are not >> domain-specific, as they are for ordinary RANGE partitions, but they >> are more flexible than MODULUS/REMAINDER partition bounds. I'm more than a bit skeptical of that claim. Under what circumstances (other than a really awful hash function, perhaps) would it make sense to not use equi-sized hash partitions? If you can predict that more stuff is going to go into one partition than another, then you need to fix your hash function, not invent more complication for the core partitioning logic. regards, tom lane
Re: [PATCHES] Post-special page storage TDE support
Hi, On Mon, Oct 24, 2022 at 12:55:53PM -0500, David Christensen wrote: > > Explicitly > locking (assuming you stay in your lane) should only need to guard > against access from other > backends of this type if using shared buffers, so will be use-case dependent. I'm not sure what you mean here? > This does have a runtime overhead due to moving some offset > calculations from compile time to > runtime. It is thought that the utility of this feature will outweigh > the costs here. Have you done some benchmarking to give an idea of how much overhead we're talking about? > Candidates for page features include 32-bit or 64-bit checksums, > encryption tags, or additional > per-page metadata. > > While we are not currently getting rid of the pd_checksum field, this > mechanism could be used to > free up that 16 bits for some other purpose. IIUC there's a hard requirement of initdb-time initialization, as there's otherwise no guarantee that you will find enough free space in each page at runtime. It seems like a very hard requirement for a full replacement of the current checksum approach (even if I agree that the current implementation limitations are far from ideal), especially since there's no technical reason that would prevent us from dynamically enabling data-checksums without doing all the work when the cluster is down.
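To illustrate what moving those calculations from compile time to run time means in practice, here is a purely illustrative fragment; SizeOfPageHeaderData and MAXALIGN are the existing macros, while cluster_reserved_page_size is a hypothetical variable standing in for whatever amount of per-page space the feature reserves at initdb time.

    Size        usable;

    /* Today: the expression folds down to a compile-time constant. */
    usable = BLCKSZ - MAXALIGN(SizeOfPageHeaderData);

    /* With initdb-time reserved space: the same quantity has to be computed
     * at run time from a value read from pg_control at startup. */
    usable = BLCKSZ - cluster_reserved_page_size - MAXALIGN(SizeOfPageHeaderData);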
Re: fixing typo in comment for restriction_is_or_clause
On Tue, Oct 25, 2022 at 12:19 AM Zhihong Yu wrote: > > Hi, > When I was looking at src/backend/optimizer/util/restrictinfo.c, I found a typo in one of the comments. Using "t" as an abbreviation for "true" was probably intentional, so not a typo. There is no doubt what the behavior is. > I also took the chance to simplify the code a little bit. It's perfectly clear and simple now, even if it doesn't win at "code golf". -- John Naylor EDB: http://www.enterprisedb.com
Re: Question about savepoint level?
On Mon, Oct 24, 2022 at 6:01 PM Alvaro Herrera wrote: > On 2022-Oct-24, Richard Guo wrote: > > ISTM the savepointLevel always remains the same as what is in > > TopTransactionStateData after looking at the codes. Now I also get > > confused. Maybe what we want is nestingLevel? > > This has already been discussed: > https://postgr.es/m/1317297307-sup-7...@alvh.no-ip.org > Now that we have transaction-controlling procedures, I think the next > step is to add the SQL-standard feature that allows savepoint level > control for them, which would make the savepointLevel no longer dead > code. Now I see the context. Thanks for pointing that out. Thanks Richard
Re: Some comments that should've covered MERGE
On Mon, Oct 24, 2022 at 6:59 PM Alvaro Herrera wrote: > On 2022-Oct-19, Richard Guo wrote: > > > Hi hackers, > > > > I happened to notice $subject. Attach a trivial patch for that. > > Thanks, applied. I did change the comment atop setTargetTable, which I > thought could use a little bit more detail on what is happening, and > also in its callsite in transformMergeStmt. Thanks for the fix! Thanks Richard
Re: fixing typo in comment for restriction_is_or_clause
On Tue, Oct 25, 2022 at 10:05 AM John Naylor wrote: > > On Tue, Oct 25, 2022 at 12:19 AM Zhihong Yu wrote: > > > > Hi, > > When I was looking at src/backend/optimizer/util/restrictinfo.c, I found > a typo in one of the comments. > > Using "t" as an abbreviation for "true" was probably intentional, so not a > typo. There is no doubt what the behavior is. > > > I also took the chance to simplify the code a little bit. > > It's perfectly clear and simple now, even if it doesn't win at "code golf". > Agree with your point. Do you think we can further make the one-line function a macro or an inline function in the .h file? I think this function is called quite frequently during planning, so maybe doing that would bring a little bit of efficiency. Thanks Richard
Re: fixing typo in comment for restriction_is_or_clause
On Tue, 25 Oct 2022 at 10:48, Richard Guo wrote: > On Tue, Oct 25, 2022 at 10:05 AM John Naylor > wrote: > >> >> On Tue, Oct 25, 2022 at 12:19 AM Zhihong Yu wrote: >> > >> > Hi, >> > When I was looking at src/backend/optimizer/util/restrictinfo.c, I found >> a typo in one of the comments. >> >> Using "t" as an abbreviation for "true" was probably intentional, so not a >> typo. There is no doubt what the behavior is. >> >> > I also took the chance to simplify the code a little bit. >> >> It's perfectly clear and simple now, even if it doesn't win at "code golf". >> > > Agree with your point. Do you think we can further make the one-line > function a macro or an inline function in the .h file? I think this > function is called quite frequently during planning, so maybe doing that > would bring a little bit of efficiency. > +1, same goes for restriction_is_securely_promotable. -- Regrads, Japin Li. ChengDu WenWu Information Technology Co.,Ltd.
Re: fixing typo in comment for restriction_is_or_clause
On Mon, Oct 24, 2022 at 7:58 PM Japin Li wrote: > > On Tue, 25 Oct 2022 at 10:48, Richard Guo wrote: > > On Tue, Oct 25, 2022 at 10:05 AM John Naylor < > john.nay...@enterprisedb.com> > > wrote: > > > >> > >> On Tue, Oct 25, 2022 at 12:19 AM Zhihong Yu wrote: > >> > > >> > Hi, > >> > When I was looking at src/backend/optimizer/util/restrictinfo.c, I > found > >> a typo in one of the comments. > >> > >> Using "t" as an abbreviation for "true" was probably intentional, so > not a > >> typo. There is no doubt what the behavior is. > >> > >> > I also took the chance to simplify the code a little bit. > >> > >> It's perfectly clear and simple now, even if it doesn't win at "code > golf". > >> > > > > Agree with your point. Do you think we can further make the one-line > > function a macro or an inline function in the .h file? I think this > > function is called quite frequently during planning, so maybe doing that > > would bring a little bit of efficiency. > > > > +1, same goes for restriction_is_securely_promotable. > > Hi, Thanks for the comments. Please take a look at patch v2. v2-is-or.patch Description: Binary data
Re: Perform streaming logical transactions by background workers and parallel apply
On Fri, Oct 21, 2022 at 6:32 PM houzj.f...@fujitsu.com wrote: > > On Thursday, October 20, 2022 5:49 PM Amit Kapila > wrote: > > On Thu, Oct 20, 2022 at 2:08 PM Peter Smith > > wrote: > > > > > > 7. get_transaction_apply_action > > > > > > > 12. get_transaction_apply_action > > > > > > > > I still felt like there should be some tablesync checks/comments in > > > > this function, just for sanity, even if it works as-is now. > > > > > > > > For example, are you saying ([3] #22b) that there might be rare > > > > cases where a Tablesync would call to parallel_apply_find_worker? > > > > That seems strange, given that "for streaming transactions that are > > > > being applied in the parallel ... we disallow applying changes on a > > > > table that is not in the READY state". > > > > > > > > -- > > > > > > Houz wrote [2] - > > > > > > I think because we won't try to start parallel apply worker in table > > > sync worker(see the check in parallel_apply_can_start()), so we won't > > > find any worker in parallel_apply_find_worker() which means > > > get_transaction_apply_action will return TRANS_LEADER_SERIALIZE. And > > > get_transaction_apply_action is a function which can be invoked for > > > all kinds of workers(same is true for all apply_handle_xxx functions), > > > so not sure if table sync check/comment is necessary. > > > > > > ~ > > > > > > Sure, and I believe you when you say it all works OK - but IMO there > > > is something still not quite right with this current code. For > > > example, > > > > > > e.g.1 the functional will return TRANS_LEADER_SERIALIZE for Tablesync > > > worker, and yet the comment for TRANS_LEADER_SERIALIZE says "means > > > that we are in the leader apply worker" (except we are not) > > > > > > e.g.2 we know for a fact that Tablesync workers cannot start their own > > > parallel apply workers, so then why do we even let the Tablesync > > > worker make a call to parallel_apply_find_worker() looking for > > > something we know will not be found? > > > > > > > I don't see much benefit in adding an additional check for tablesync workers > > here. It will unnecessarily make this part of the code look bit ugly. > > Thanks for the review, here is the new version patch set which addressed > Peter[1] > and Kuroda-san[2]'s comments. I've started to review this patch. I tested v40-0001 patch and have one question: IIUC even when most of the changes in the transaction are filtered out in pgoutput (eg., by relation filter or row filter), the walsender sends STREAM_START. This means that the subscriber could end up launching parallel apply workers also for almost empty (and streamed) transactions. For example, I created three subscriptions each of which subscribes to a different table. When I loaded a large amount of data into one table, all three (leader) apply workers received START_STREAM and launched their parallel apply workers. However, two of them finished without applying any data. I think this behaviour looks problematic since it wastes workers and rather decreases the apply performance if the changes are not large. Is it worth considering a way to delay launching a parallel apply worker until we find out the amount of changes is actually large? For example, the leader worker writes the streamed changes to files as usual and launches a parallel worker if the amount of changes exceeds a threshold or the leader receives the second segment. After that, the leader worker switches to send the streamed changes to parallel workers via shm_mq instead of files. 
Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: Improve tab completion for ALTER STATISTICS
On Mon, 24 Oct 2022 at 12:30, Michael Paquier wrote: > > On Wed, Oct 19, 2022 at 04:06:51PM +0530, vignesh C wrote: > > I noticed that the tab completion for ALTER STATISTICS .. SET was not > > handled. The attached patch displays SCHEMA and STATISTICS for tab > > completion of ALTER STATISTICS name SET. > > Indeed, it is a bit strange as we would get a list of settable > parameters once the completion up to SET is done, rather than > STATISTICS and SCHEMA. Your patch looks fine, so applied. Thanks! Thanks for pushing this. Regards, Vignesh
Outdated comments in AssignTransactionId?
AssignTransactionId has the following comment: /* * ensure this test matches similar one in * RecoverPreparedTransactions() */ if (nUnreportedXids >= PGPROC_MAX_CACHED_SUBXIDS || log_unknown_top) { ... } However, the similar test in RecoverPreparedTransactions was removed by commit 49e9281549, so the comment is now outdated. The attached patch removes the stale reference from AssignTransactionId. -- Regrads, Japin Li. ChengDu WenWu Information Technology Co.,Ltd. diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c index fd5103a78e..fe0cf5527b 100644 --- a/src/backend/access/transam/xact.c +++ b/src/backend/access/transam/xact.c @@ -741,10 +741,6 @@ AssignTransactionId(TransactionState s) unreportedXids[nUnreportedXids] = XidFromFullTransactionId(s->fullTransactionId); nUnreportedXids++; - /* - * ensure this test matches similar one in - * RecoverPreparedTransactions() - */ if (nUnreportedXids >= PGPROC_MAX_CACHED_SUBXIDS || log_unknown_top) {
Re: GUC values - recommended way to declare the C variables?
On Thu, Oct 20, 2022 at 6:52 PM Peter Smith wrote: > > On Thu, Oct 20, 2022 at 3:16 PM Justin Pryzby wrote: > > > > On Thu, Oct 20, 2022 at 11:56:58AM +1100, Peter Smith wrote: > > > Patch 0002 adds a sanity-check function called by > > > InitializeGUCOptions, as suggested by Tom [2]. This is to ensure that > > > the GUC C variable initial values are sensible and/or have not gone > > > stale compared with the compiled-in defaults of guc_tables.c. This > > > patch also changes some GUC C variable initial values which were > > > already found (by this sanity-checker) to be different. > > > > I like it. > > > > However it's fails on windows: > > > > https://cirrus-ci.com/task/5545965036765184 > > > > running bootstrap script ... FATAL: GUC (PGC_BOOL) update_process_title, > > boot_val=0, C-var=1 > > > > Maybe you need to exclude dynamically set gucs ? > > See also this other thread, where I added a flag identifying exactly > > that. https://commitfest.postgresql.org/40/3736/ > > I need to polish that patch some, but maybe it'll be useful for you, too. > > PSA patch set v3. This is essentially the same as before except now, utilizing the GUC_DEFAULT_COMPILE flag added by Justin's patch [1], the sanity-check skips over any dynamic compiler-dependent GUCs. Patch 0001 - GUC trivial mods to logical replication GUC C var declarations Patch 0002 - (TMP) Justin's patch adds the GUC_DEFAULT_COMPILE flag support -- this is now a prerequisite for 0003 Patch 0003 - GUC sanity-check comparisons of GUC C var declarations with the GUC defaults from guc_tables.c -- [1] Justin's patch of 24/Oct - https://www.postgresql.org/message-id/20221024220544.GJ16921%40telsasoft.com Kind Regards, Peter Smith. Fujitsu Australia v3-0003-GUC-C-variable-sanity-check.patch Description: Binary data v3-0001-GUC-tidy-some-C-variable-declarations.patch Description: Binary data v3-0002-TMP-Justins-DYNAMIC_DEFAULT-patch.patch Description: Binary data
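As a rough illustration of the 0003 sanity check, something along these lines could run from InitializeGUCOptions(); the array and field names are the real ones from guc_tables.c, while GUC_DEFAULT_COMPILE is the flag proposed in Justin's patch and the function name is made up.

/* Sketch only: compare each boolean GUC's C variable against its
 * compiled-in boot_val, skipping GUCs whose default is computed at
 * build time. */
static void
sanity_check_bool_guc_defaults(void)
{
    for (int i = 0; ConfigureNamesBool[i].gen.name != NULL; i++)
    {
        struct config_bool *conf = &ConfigureNamesBool[i];

        if (conf->gen.flags & GUC_DEFAULT_COMPILE)
            continue;           /* e.g. update_process_title on Windows */

        if (*conf->variable != conf->boot_val)
            elog(FATAL, "GUC (PGC_BOOL) %s, boot_val=%d, C-var=%d",
                 conf->gen.name, (int) conf->boot_val, (int) *conf->variable);
    }
    /* ... and similar loops for the int, real, string and enum tables. */
}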
Re: fixing typo in comment for restriction_is_or_clause
On Tue, 25 Oct 2022 at 11:07, Zhihong Yu wrote: > Please take a look at patch v2. Maybe we should define those functions in headers. See patch v3. -- Regrads, Japin Li. ChengDu WenWu Information Technology Co.,Ltd. diff --git a/src/backend/optimizer/util/restrictinfo.c b/src/backend/optimizer/util/restrictinfo.c index ef8df3d098..8a6812b4b1 100644 --- a/src/backend/optimizer/util/restrictinfo.c +++ b/src/backend/optimizer/util/restrictinfo.c @@ -375,41 +375,6 @@ commute_restrictinfo(RestrictInfo *rinfo, Oid comm_op) return result; } -/* - * restriction_is_or_clause - * - * Returns t iff the restrictinfo node contains an 'or' clause. - */ -bool -restriction_is_or_clause(RestrictInfo *restrictinfo) -{ - if (restrictinfo->orclause != NULL) - return true; - else - return false; -} - -/* - * restriction_is_securely_promotable - * - * Returns true if it's okay to evaluate this clause "early", that is before - * other restriction clauses attached to the specified relation. - */ -bool -restriction_is_securely_promotable(RestrictInfo *restrictinfo, - RelOptInfo *rel) -{ - /* - * It's okay if there are no baserestrictinfo clauses for the rel that - * would need to go before this one, *or* if this one is leakproof. - */ - if (restrictinfo->security_level <= rel->baserestrict_min_security || - restrictinfo->leakproof) - return true; - else - return false; -} - /* * get_actual_clauses * diff --git a/src/include/optimizer/restrictinfo.h b/src/include/optimizer/restrictinfo.h index 6d30bd5e9d..dbfbebff48 100644 --- a/src/include/optimizer/restrictinfo.h +++ b/src/include/optimizer/restrictinfo.h @@ -31,9 +31,6 @@ extern RestrictInfo *make_restrictinfo(PlannerInfo *root, Relids outer_relids, Relids nullable_relids); extern RestrictInfo *commute_restrictinfo(RestrictInfo *rinfo, Oid comm_op); -extern bool restriction_is_or_clause(RestrictInfo *restrictinfo); -extern bool restriction_is_securely_promotable(RestrictInfo *restrictinfo, - RelOptInfo *rel); extern List *get_actual_clauses(List *restrictinfo_list); extern List *extract_actual_clauses(List *restrictinfo_list, bool pseudoconstant); @@ -46,4 +43,36 @@ extern bool join_clause_is_movable_into(RestrictInfo *rinfo, Relids currentrelids, Relids current_and_outer); +/* + * restriction_is_or_clause + * + * Returns true iff the restrictinfo node contains an 'or' clause. + */ +static inline bool +restriction_is_or_clause(RestrictInfo *restrictinfo) +{ + return (restrictinfo->orclause != NULL); +} + +/* + * restriction_is_securely_promotable + * + * Returns true if it's okay to evaluate this clause "early", that is before + * other restriction clauses attached to the specified relation. + */ +static inline bool +restriction_is_securely_promotable(RestrictInfo *restrictinfo, + RelOptInfo *rel) +{ + /* + * It's okay if there are no baserestrictinfo clauses for the rel that + * would need to go before this one, *or* if this one is leakproof. + */ + if (restrictinfo->security_level <= rel->baserestrict_min_security || + restrictinfo->leakproof) + return true; + else + return false; +} + #endif /* RESTRICTINFO_H */
Re: fix stats_fetch_consistency value in postgresql.conf.sample
On Tue, Oct 25, 2022 at 9:05 AM Justin Pryzby wrote: > > On Mon, Sep 26, 2022 at 05:29:58PM +0900, Kyotaro Horiguchi wrote: > > At Sat, 17 Sep 2022 23:53:07 -0500, Justin Pryzby > > wrote in ... > Rebased the patch. > > I also split the flag into DEFAULTS_COMPILE and DEFAULTS_INITDB, since > that makes it easier to understand what the flags mean and the intent of > the patch. And maybe allows fewer exclusions in patches like Peter's, > which I think would only want to exclude compile-time defaults. > Thanks! FYI, I'm making use of this patch now as a prerequisite for my GUC C var sanity-checker [1]. -- [1] https://www.postgresql.org/message-id/CAHut%2BPss16YBiYYKyrZBvSp_4uSQfCy7aYfDXU0N8w5VZ5dd_g%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
Re: fixing typo in comment for restriction_is_or_clause
On Tue, Oct 25, 2022 at 11:46 AM Japin Li wrote: > > On Tue, 25 Oct 2022 at 11:07, Zhihong Yu wrote: > > Please take a look at patch v2. > > Maybe we should define those functions in headers. See patch v3. Yes, putting them in .h file is better to me. For the v3 patch, we can do the same one-line trick for restriction_is_securely_promotable. Thanks Richard
Re: Understanding, testing and improving our Windows filesystem code
I pushed the bug fixes from this series, without their accompanying tests. Here's a rebase of the test suite, with all those tests now squashed into the main test patch, and also the tell-Windows-to-be-more-like-Unix patch. Registered in the commitfest. From 1b70db7f4068df22b29e02d0c1b759195e0187ff Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Tue, 11 Oct 2022 17:26:02 +1300 Subject: [PATCH v3 1/4] Add suite of macros for writing TAP tests in C. Historically we had to load test modules into a PostgreSQL backend or write an ad-hoc standalone program to test C code. Let's provide a convenient way to write stand-alone tests for low-level C code. This commit defines a small set of macros that spit out results in TAP format. These can be improved and extended as required. Discussion: https://postgr.es/m/CA%2BhUKG%2BajSQ_8eu2AogTncOnZ5me2D-Cn66iN_-wZnRjLN%2Bicg%40mail.gmail.com --- src/include/lib/pg_tap.h | 138 +++ 1 file changed, 138 insertions(+) create mode 100644 src/include/lib/pg_tap.h diff --git a/src/include/lib/pg_tap.h b/src/include/lib/pg_tap.h new file mode 100644 index 00..3da46ac5d5 --- /dev/null +++ b/src/include/lib/pg_tap.h @@ -0,0 +1,138 @@ +/* + * Simple macros for writing tests in C that print results in TAP format, + * as consumed by "prove". + * + * Specification for the output format: https://testanything.org/ + */ + +#ifndef PG_TAP_H +#define PG_TAP_H + +#include +#include +#include + +/* Counters are global, so we can break our tests into multiple functions. */ +static int pg_test_count; +static int pg_fail_count; +static int pg_todo_count; + +/* + * Require an expression to be true. Used for set-up steps that are not + * reported as a test. + */ +#define PG_REQUIRE(expr) \ +if (!(expr)) { \ + printf("Bail out! requirement (" #expr ") failed at %s:%d\n", \ + __FILE__, __LINE__); \ + exit(EXIT_FAILURE); \ +} + +/* + * Like PG_REQUIRE, but log strerror(errno) before bailing. + */ +#define PG_REQUIRE_SYS(expr) \ +if (!(expr)) { \ + printf("Bail out! requirement (" #expr ") failed at %s:%d, error: %s\n", \ + __FILE__, __LINE__, strerror(errno)); \ + exit(EXIT_FAILURE); \ +} + +/* + * Test that an expression is true. An optional message can be provided, + * defaulting to the expression itself if not provided. + */ +#define PG_EXPECT(expr, ...) \ +do { \ + const char *messages[] = {#expr, __VA_ARGS__}; \ + const char *message = messages[lengthof(messages) - 1]; \ + pg_test_count++; \ + if (expr) { \ + printf("ok %d - %s\n", pg_test_count, message); \ + } else { \ + if (pg_todo_count == 0) \ + pg_fail_count++; \ + printf("not ok %d - %s (at %s:%d)%s\n", pg_test_count, \ + message, __FILE__, __LINE__, \ + pg_todo_count > 0 ? " # TODO" : ""); \ + } \ +} while (0) + +/* + * Like PG_EXPECT(), but also log strerror(errno) on failure. + */ +#define PG_EXPECT_SYS(expr, ...) \ +do { \ + const char *messages[] = {#expr, __VA_ARGS__}; \ + const char *message = messages[lengthof(messages) - 1]; \ + pg_test_count++; \ + if (expr) { \ + printf("ok %d - %s\n", pg_test_count, message); \ + } else { \ + if (pg_todo_count == 0) \ + pg_fail_count++; \ + printf("not ok %d - %s (at %s:%d), error: %s%s\n", pg_test_count, \ + message, __FILE__, __LINE__, strerror(errno), \ + pg_todo_count > 0 ? " # TODO" : ""); \ + } \ +} while (0) + + +/* + * Test that one int expression is equal to another, logging the values if not. + */ +#define PG_EXPECT_EQ(expr1, expr2, ...) 
\ +do { \ + const char *messages[] = {#expr1 " == " #expr2, __VA_ARGS__}; \ + const char *message = messages[lengthof(messages) - 1]; \ + int expr1_val = (expr1); \ + int expr2_val = (expr2); \ + pg_test_count++; \ + if (expr1_val == expr2_val) { \ + printf("ok %d - %s\n", pg_test_count, message); \ + } else { \ + if (pg_todo_count == 0) \ + pg_fail_count++; \ + printf("not ok %d - %s: %d != %d (at %s:%d)%s\n", pg_test_count, \ + message, expr1_val, expr2_val, __FILE__, __LINE__, \ + pg_todo_count > 0 ? " # TODO" : ""); \ + } \ +} while (0) + +/* + * Test that one C string expression is equal to another, logging the values if + * not. + */ +#define PG_EXPECT_EQ_STR(expr1, expr2, ...) \ +do { \ + const char *messages[] = {#expr1 " matches " #expr2, __VA_ARGS__}; \ + const char *message = messages[lengthof(messages) - 1]; \ + const char *expr1_val = (expr1); \ + const char *expr2_val = (expr2); \ + pg_test_count++; \ + if (strcmp(expr1_val, expr2_val) == 0) { \ + printf("ok %d - %s\n", pg_test_count, message); \ + } else { \ + if (pg_todo_count == 0) \ + pg_fail_count++; \ + printf("not ok %d - %s: \"%s\" vs \"%s\" (at %s:%d) %s\n", \ + pg_test_count, \ + message, expr1_val, expr2_val, __FILE__, __LINE__, \ + pg_todo_count > 0 ? " # TODO" : ""); \ + } \ +} while (0) + +/* + * The main function of a test program should begin and end with these + * functions. + */ +#define PG_BEGIN_TESTS() \ + setbuf(stdout, NULL); /* disable buffering for Windows */ + +#
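For reference, a test program built on these macros might look roughly like the following. The closing macro is cut off in the quoted patch, so the PG_END_TESTS() call below is an assumption about its name (presumably it emits the TAP plan line); everything else uses only the macros shown above.

#include "postgres_fe.h"        /* provides lengthof(), used by the macros */
#include "lib/pg_tap.h"

int
main(void)
{
    PG_BEGIN_TESTS();

    PG_EXPECT(2 + 2 == 4);
    PG_EXPECT_EQ(6 * 7, 42, "multiplication works");
    PG_EXPECT_EQ_STR("postgres", "postgres", "string comparison works");

    PG_END_TESTS();             /* assumed counterpart of PG_BEGIN_TESTS() */
    return pg_fail_count > 0 ? EXIT_FAILURE : EXIT_SUCCESS;
}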
Re: Issue in GIN fast-insert: XLogBeginInsert + Read/LockBuffer ordering
On Mon, Oct 24, 2022 at 02:22:16PM +0200, Alvaro Herrera wrote: > I suppose it's a matter of whether any bugs are possible outside of > Neon. If yes, then definitely this should be backpatched. Offhand, I > don't see any. On the other hand, even if no bugs are known, then it's > still valuable to have all code paths do WAL insertion in the same way, > rather than having a single place that is alone in doing it in a > different way. But if we don't know of any bugs, then backpatching > might be more risk than not doing so. I have been putting my mind on that once again, and I don't see how this would cause an issue in vanilla PG code. XLogBeginInsert() does its checks, meaning that we'd get a PANIC instead of an ERROR now that these calls are within a critical section, but that should not matter because we know that recovery has ended or we would not be able to do GIN insertions like that. Then, it only switches begininsert_called to true, which we use for sanity checks in the various WAL insert paths. As Matthias has outlined, Neon relies on begininsert_called more than we do currently. FWIW, I think that we're still fine not backpatching that, even considering the consistency of the code with stable branches. Now, if there is a strong trend in favor of a backpatch, I'd be fine with this conclusion as well. > I confess I don't understand why is it important that XLogBeginInsert is > called inside the critical section. It seems to me that that part is > only a side-effect of having to acquire the buffer locks in the critical > section. Right? Yeah, you are right that it would not matter for XLogBeginInsert(), though I'd like to think that this is a good practice on consistency grounds with anywhere else, and we respect what's documented in the README. > I noticed that line 427 logs the GIN metapage with flag REGBUF_STANDARD; > is the GIN metapage really honoring pd_upper? I see only pd_lower being > set. Hmm. Not sure. -- Michael signature.asc Description: PGP signature
Re: Issue in GIN fast-insert: XLogBeginInsert + Read/LockBuffer ordering
Michael Paquier writes: > On Mon, Oct 24, 2022 at 02:22:16PM +0200, Alvaro Herrera wrote: >> I confess I don't understand why is it important that XLogBeginInsert is >> called inside the critical section. It seems to me that that part is >> only a side-effect of having to acquire the buffer locks in the critical >> section. Right? > Yeah, you are right that it would not matter for XLogBeginInsert(), > though I'd like to think that this is a good practice on consistency > grounds with anywhere else, and we respect what's documented in the > README. Yeah --- it's documented that way, and there doesn't seem to be a good reason not to honor that here. regards, tom lane
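For readers following along, the ordering documented in access/transam/README -- which the committed fix makes GIN's fast-insert path follow as well -- looks roughly like this; the buffer, relation and record payload (xlrec) are placeholders.

    /* 1. Pin and exclusive-lock the buffers that will be modified. */
    LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);

    /* 2. Enter the critical section before changing shared state. */
    START_CRIT_SECTION();

    /* 3. Apply the page changes and mark the buffer dirty. */
    MarkBufferDirty(buffer);

    /* 4. Build and insert the WAL record, still inside the critical
     *    section, so XLogBeginInsert() comes after the buffer locking. */
    if (RelationNeedsWAL(rel))
    {
        XLogRecPtr  recptr;

        XLogBeginInsert();
        XLogRegisterBuffer(0, buffer, REGBUF_STANDARD);
        XLogRegisterData((char *) &xlrec, sizeof(xlrec));
        recptr = XLogInsert(RM_GIN_ID, XLOG_GIN_INSERT);
        PageSetLSN(BufferGetPage(buffer), recptr);
    }

    /* 5. Leave the critical section, then release the lock and pin. */
    END_CRIT_SECTION();
    UnlockReleaseBuffer(buffer);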
Re: GUC values - recommended way to declare the C variables?
On Tue, Oct 25, 2022 at 02:43:43PM +1100, Peter Smith wrote: > This is essentially the same as before except now, utilizing the > GUC_DEFAULT_COMPILE flag added by Justin's patch [1], the sanity-check > skips over any dynamic compiler-dependent GUCs. Yeah, this is a self-reminder that I should try to look at what's on the other thread. > Patch 0001 - GUC trivial mods to logical replication GUC C var declarations This one seems fine, so done. -- Michael signature.asc Description: PGP signature
Re: Getting rid of SQLValueFunction
On Fri, Oct 21, 2022 at 02:27:07PM +0900, Michael Paquier wrote: > I have looked at that, and the attribute mapping remains compatible > with past versions once the appropriate pg_proc entries are added. > The updated patch set attached does that (with a user() function as > well to keep the code a maximum simple), with more tests to cover the > attribute case mentioned upthread. Attached is a rebased patch set, as of the conflicts from 2e0d80c. -- Michael From 1d2d6e4803f3ab3e9d2c29efcc8dee0f8a53d699 Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Tue, 25 Oct 2022 14:15:14 +0900 Subject: [PATCH v3 1/2] Remove from SQLValueFunction all the entries using "name" as return type This commit changes six SQL keywords to use COERCE_SQL_SYNTAX rather than SQLValueFunction: - CURRENT_ROLE - CURRENT_USER - USER - SESSION_USER - CURRENT_CATALOG - CURRENT_SCHEMA Among the six, "user", "current_role" and "current_catalog" require specific SQL functions to allow ruleutils.c to map them to the SQL keywords these require when using COERCE_SQL_SYNTAX. Having pg_proc.proname match with the keyword ensures that the compatibility remains the same when projecting any of these keywords in a FROM clause to an attribute name when an alias is not specified. Tests are added to make sure that the mapping happens from the SQL keyword is correct, using a view defition that feeds on these keywords in one of the tests introduced by 40c24bf (both in a SELECT clause and when using each keyword in a FROM clause). SQLValueFunction is reduced to half its contents after this change, simplifying its logic a bit as there is no need to enforce a C collation anymore. I have made a few performance tests, with a million-ish calls to these keywords without seeing a difference in run-time or in perf profiles. The remaining SQLValueFunctions relate to timestamps. Bump catalog version. 
Reviewed-by: Corey Huinker Discussion: https://postgr.es/m/yzag3morycguu...@paquier.xyz --- src/include/catalog/pg_proc.dat | 9 +++ src/include/nodes/primnodes.h | 8 +- src/backend/executor/execExprInterp.c | 27 src/backend/nodes/nodeFuncs.c | 11 +++- src/backend/parser/gram.y | 30 +- src/backend/parser/parse_expr.c | 8 -- src/backend/parser/parse_target.c | 18 -- src/backend/utils/adt/ruleutils.c | 36 +-- 8 files changed, 55 insertions(+), 92 deletions(-) diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index 62a5b8e655..241366fc8e 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -1505,6 +1505,15 @@ { oid => '745', descr => 'current user name', proname => 'current_user', provolatile => 's', prorettype => 'name', proargtypes => '', prosrc => 'current_user' }, +{ oid => '9695', descr => 'current role name', + proname => 'current_role', provolatile => 's', prorettype => 'name', + proargtypes => '', prosrc => 'current_user' }, +{ oid => '9696', descr => 'user name', + proname => 'user', provolatile => 's', prorettype => 'name', + proargtypes => '', prosrc => 'current_user' }, +{ oid => '9697', descr => 'name of the current database', + proname => 'current_catalog', provolatile => 's', prorettype => 'name', + proargtypes => '', prosrc => 'current_database' }, { oid => '746', descr => 'session user name', proname => 'session_user', provolatile => 's', prorettype => 'name', proargtypes => '', prosrc => 'session_user' }, diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h index f71f551782..f6dd27edcc 100644 --- a/src/include/nodes/primnodes.h +++ b/src/include/nodes/primnodes.h @@ -1313,13 +1313,7 @@ typedef enum SQLValueFunctionOp SVFOP_LOCALTIME, SVFOP_LOCALTIME_N, SVFOP_LOCALTIMESTAMP, - SVFOP_LOCALTIMESTAMP_N, - SVFOP_CURRENT_ROLE, - SVFOP_CURRENT_USER, - SVFOP_USER, - SVFOP_SESSION_USER, - SVFOP_CURRENT_CATALOG, - SVFOP_CURRENT_SCHEMA + SVFOP_LOCALTIMESTAMP_N } SQLValueFunctionOp; typedef struct SQLValueFunction diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c index 9b9bbf00a9..6ebf5c287e 100644 --- a/src/backend/executor/execExprInterp.c +++ b/src/backend/executor/execExprInterp.c @@ -2495,15 +2495,10 @@ ExecEvalParamExtern(ExprState *state, ExprEvalStep *op, ExprContext *econtext) void ExecEvalSQLValueFunction(ExprState *state, ExprEvalStep *op) { - LOCAL_FCINFO(fcinfo, 0); SQLValueFunction *svf = op->d.sqlvaluefunction.svf; *op->resnull = false; - /* - * Note: current_schema() can return NULL. current_user() etc currently - * cannot, but might as well code those cases the same way for safety. - */ switch (svf->op) { case SVFOP_CURRENT_DATE: @@ -2525,28 +2520,6 @@ ExecEvalSQLValueFunction(ExprState *state, ExprEvalStep *op) case SVFOP_LOCALTIMESTAMP_N: *op->resvalue = TimestampGetDatum(GetSQLLocalTimestamp(svf->typmo
Some regression tests for the pg_control_*() functions
Hi all, As mentioned in [1], there is no regression tests for the SQL control functions: pg_control_checkpoint, pg_control_recovery, pg_control_system and pg_control_init. It would be minimal to check their execution, as of a "SELECT FROM func()", still some validation can be done on its output as long as the test is portable enough (needs transparency for wal_level, commit timestamps, etc.). Attached is a proposal to provide some coverage. Some of the checks could be just removed, like the ones for non-NULL fields, but I have written out everything to show how much could be done. Thoughts? [1]: https://www.postgresql.org/message-id/yzy0ilxnbmaxh...@paquier.xyz -- Michael diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out index 9f106c2a10..a5df523109 100644 --- a/src/test/regress/expected/misc_functions.out +++ b/src/test/regress/expected/misc_functions.out @@ -594,3 +594,89 @@ SELECT * FROM tenk1 a JOIN my_gen_series(1,10) g ON a.unique1 = g; Index Cond: (unique1 = g.g) (4 rows) +-- +-- Test functions for control data +-- +\x +SELECT checkpoint_lsn > '0/0'::pg_lsn AS checkpoint_lsn, +redo_lsn > '0/0'::pg_lsn AS redo_lsn, +redo_wal_file IS NOT NULL AS redo_wal_file, +timeline_id > 0 AS timeline_id, +prev_timeline_id > 0 AS prev_timeline_id, +next_xid IS NOT NULL AS next_xid, +next_oid > 0 AS next_oid, +next_multixact_id != '0'::xid AS next_multixact_id, +next_multi_offset IS NOT NULL AS next_multi_offset, +oldest_xid != '0'::xid AS oldest_xid, +oldest_xid_dbid > 0 AS oldest_xid_dbid, +oldest_active_xid IS NOT NULL AS oldest_active_xid, +oldest_multi_xid != '0'::xid AS oldest_multi_xid, +oldest_multi_dbid > 0 AS oldest_multi_dbid, +oldest_commit_ts_xid IS NOT NULL AS oldest_commit_ts_xid, +newest_commit_ts_xid IS NOT NULL AS newest_commit_ts_xid + FROM pg_control_checkpoint(); +-[ RECORD 1 ]+-- +checkpoint_lsn | t +redo_lsn | t +redo_wal_file| t +timeline_id | t +prev_timeline_id | t +next_xid | t +next_oid | t +next_multixact_id| t +next_multi_offset| t +oldest_xid | t +oldest_xid_dbid | t +oldest_active_xid| t +oldest_multi_xid | t +oldest_multi_dbid| t +oldest_commit_ts_xid | t +newest_commit_ts_xid | t + +SELECT max_data_alignment > 0 AS max_data_alignment, +database_block_size > 0 AS database_block_size, +blocks_per_segment > 0 AS blocks_per_segment, +wal_block_size > 0 AS wal_block_size, +max_identifier_length > 0 AS max_identifier_length, +max_index_columns > 0 AS max_index_columns, +max_toast_chunk_size > 0 AS max_toast_chunk_size, +large_object_chunk_size > 0 AS large_object_chunk_size, +float8_pass_by_value IS NOT NULL AS float8_pass_by_value, +data_page_checksum_version >= 0 AS data_page_checksum_version + FROM pg_control_init(); +-[ RECORD 1 ]--+-- +max_data_alignment | t +database_block_size| t +blocks_per_segment | t +wal_block_size | t +max_identifier_length | t +max_index_columns | t +max_toast_chunk_size | t +large_object_chunk_size| t +float8_pass_by_value | t +data_page_checksum_version | t + +SELECT min_recovery_end_lsn >= '0/0'::pg_lsn AS min_recovery_end_lsn, +min_recovery_end_timeline >= 0 AS min_recovery_end_timeline, +backup_start_lsn >= '0/0'::pg_lsn AS backup_start_lsn, +backup_end_lsn >= '0/0'::pg_lsn AS backup_end_lsn, +end_of_backup_record_required IS NOT NULL AS end_of_backup_record_required + FROM pg_control_recovery(); +-[ RECORD 1 ]-+-- +min_recovery_end_lsn | t +min_recovery_end_timeline | t +backup_start_lsn | t +backup_end_lsn| t +end_of_backup_record_required | t + +SELECT pg_control_version > 0 AS 
pg_control_version, +catalog_version_no > 0 AS catalog_version_no, +system_identifier >= 0 AS system_identifier, +pg_control_last_modified <= now() AS pg_control_last_modified + FROM pg_control_system(); +-[ RECORD 1 ]+-- +pg_control_version | t +catalog_version_no | t +system_identifier| t +pg_control_last_modified | t + diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql index 639e9b352c..e5e75b82f3 100644 --- a/src/test/regress/sql/misc_functions.sql +++ b/src/test/regress/sql/misc_functions.sql @@ -223,3 +223,47 @@ SELECT * FROM tenk1 a JOIN my_gen_series(1,1000) g ON a.unique1 = g; EXPLAIN (COSTS OFF) SELECT * FROM tenk1 a JOIN my_gen_series(1,10) g ON a.unique1 = g; + +-- +-- Test functions for control data +-- +\x +SELECT checkpoint_lsn > '0/0'::pg_lsn AS checkpoint_lsn, +redo_lsn > '0/0'::pg_lsn AS redo_lsn, +redo_wal_file IS NOT NULL AS redo_wal_file, +timeline_id > 0 AS timeline_id, +prev_timeline_id > 0 AS prev_timeline_id, +next_xid
Re: fixing typo in comment for restriction_is_or_clause
On Tue, 25 Oct 2022 at 12:01, Richard Guo wrote: > On Tue, Oct 25, 2022 at 11:46 AM Japin Li wrote: > >> >> On Tue, 25 Oct 2022 at 11:07, Zhihong Yu wrote: >> > Please take a look at patch v2. >> >> Maybe we should define those functions in headers. See patch v3. > > > Yes, putting them in .h file is better to me. For the v3 patch, we can > do the same one-line trick for restriction_is_securely_promotable. > Fixed. Please consider the v4 for further review. -- Regrads, Japin Li. ChengDu WenWu Information Technology Co.,Ltd. diff --git a/src/backend/optimizer/util/restrictinfo.c b/src/backend/optimizer/util/restrictinfo.c index ef8df3d098..8a6812b4b1 100644 --- a/src/backend/optimizer/util/restrictinfo.c +++ b/src/backend/optimizer/util/restrictinfo.c @@ -375,41 +375,6 @@ commute_restrictinfo(RestrictInfo *rinfo, Oid comm_op) return result; } -/* - * restriction_is_or_clause - * - * Returns t iff the restrictinfo node contains an 'or' clause. - */ -bool -restriction_is_or_clause(RestrictInfo *restrictinfo) -{ - if (restrictinfo->orclause != NULL) - return true; - else - return false; -} - -/* - * restriction_is_securely_promotable - * - * Returns true if it's okay to evaluate this clause "early", that is before - * other restriction clauses attached to the specified relation. - */ -bool -restriction_is_securely_promotable(RestrictInfo *restrictinfo, - RelOptInfo *rel) -{ - /* - * It's okay if there are no baserestrictinfo clauses for the rel that - * would need to go before this one, *or* if this one is leakproof. - */ - if (restrictinfo->security_level <= rel->baserestrict_min_security || - restrictinfo->leakproof) - return true; - else - return false; -} - /* * get_actual_clauses * diff --git a/src/include/optimizer/restrictinfo.h b/src/include/optimizer/restrictinfo.h index 6d30bd5e9d..b61707fd26 100644 --- a/src/include/optimizer/restrictinfo.h +++ b/src/include/optimizer/restrictinfo.h @@ -31,9 +31,6 @@ extern RestrictInfo *make_restrictinfo(PlannerInfo *root, Relids outer_relids, Relids nullable_relids); extern RestrictInfo *commute_restrictinfo(RestrictInfo *rinfo, Oid comm_op); -extern bool restriction_is_or_clause(RestrictInfo *restrictinfo); -extern bool restriction_is_securely_promotable(RestrictInfo *restrictinfo, - RelOptInfo *rel); extern List *get_actual_clauses(List *restrictinfo_list); extern List *extract_actual_clauses(List *restrictinfo_list, bool pseudoconstant); @@ -46,4 +43,33 @@ extern bool join_clause_is_movable_into(RestrictInfo *rinfo, Relids currentrelids, Relids current_and_outer); +/* + * restriction_is_or_clause + * + * Returns true iff the restrictinfo node contains an 'or' clause. + */ +static inline bool +restriction_is_or_clause(RestrictInfo *restrictinfo) +{ + return (restrictinfo->orclause != NULL); +} + +/* + * restriction_is_securely_promotable + * + * Returns true if it's okay to evaluate this clause "early", that is before + * other restriction clauses attached to the specified relation. + */ +static inline bool +restriction_is_securely_promotable(RestrictInfo *restrictinfo, + RelOptInfo *rel) +{ + /* + * It's okay if there are no baserestrictinfo clauses for the rel that + * would need to go before this one, *or* if this one is leakproof. + */ + return (restrictinfo->security_level <= rel->baserestrict_min_security || + restrictinfo->leakproof); +} + #endif /* RESTRICTINFO_H */
Re: GUC values - recommended way to declare the C variables?
On Tue, Oct 25, 2022 at 4:09 PM Michael Paquier wrote: > > On Tue, Oct 25, 2022 at 02:43:43PM +1100, Peter Smith wrote: > > This is essentially the same as before except now, utilizing the > > GUC_DEFAULT_COMPILE flag added by Justin's patch [1], the sanity-check > > skips over any dynamic compiler-dependent GUCs. > > Yeah, this is a self-reminder that I should try to look at what's on > the other thread. > > > Patch 0001 - GUC trivial mods to logical replication GUC C var declarations > > This one seems fine, so done. > -- Thanks for pushing v3-0001. PSA v4. Rebased the remaining 2 patches so the cfbot can still work. -- Kind Regards, Peter Smith. Fujitsu Australia v4-0001-TMP-Justins-DYNAMIC_DEFAULT-patch.patch Description: Binary data v4-0002-GUC-C-variable-sanity-check.patch Description: Binary data
Re: fixing typo in comment for restriction_is_or_clause
On Tue, Oct 25, 2022 at 9:48 AM Richard Guo wrote: > > > On Tue, Oct 25, 2022 at 10:05 AM John Naylor wrote: >> >> It's perfectly clear and simple now, even if it doesn't win at "code golf". > > > Agree with your point. Do you think we can further make the one-line > function a macro or an inline function in the .h file? I think this > function is called quite frequently during planning, so maybe doing that > would bring a little bit of efficiency. My point was misunderstood, which is: I don't think we need to do anything at all here if the goal was purely about aesthetics. If the goal has now changed to efficiency, I have no opinion about that yet since no evidence has been presented. -- John Naylor EDB: http://www.enterprisedb.com
Re: Perform streaming logical transactions by background workers and parallel apply
FYI - After a recent push, the v40-0001 patch no longer applies on the latest HEAD. [postgres@CentOS7-x64 oss_postgres_misc]$ git apply ../patches_misc/v40-0001-Perform-streaming-logical-transactions-by-parall.patch error: patch failed: src/backend/replication/logical/launcher.c:54 error: src/backend/replication/logical/launcher.c: patch does not apply -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Allow file inclusion in pg_hba and pg_ident files
On Mon, Oct 24, 2022 at 04:03:03PM +0800, Julien Rouhaud wrote: > It would also require to bring HbaLine->sourcefile. I'm afraid it would be > weird to introduce such a refactoring in a separate commit just to pass a > constant down multiple level of indirection, as all the macro will remain > specific to either hba or ident anyway. Putting my hands on it, I am not really afraid of doing that independently. From what I can see, this is cutting 23kB worth of diffs from 0002, reducing it from 94K to 71kB. > I agree that there are quite a lot of s/XXXFileName/file_name/, but those > aren't complicated, and keeping them in the same commit makes it easy to > validate that none has been forgotten since the regression tests covering > those > messages are in that commit too. Another advantage is that it minimizes the presence of the hardcoded HbaFileName and IdentFileName in hba.c, which is one thing we are trying to achieve here for the inclusion of more files. I found a bit strange that IdentLine had no sourcefile, actually. We track the file number but use it nowhere, and it seems to me that having more symmetry between both would be a good thing. So, the key of the logic is how we are going to organize the tokenization of the HBA and ident lines through all the inclusions.. As far as I get it, tokenize_auth_file() is the root call and tokenize_file_with_context() with its depth is able to work on each individual file, and it can optionally recurse depending on what's included. Why do you need to switch to the old context in tokenize_file_with_context()? Could it be simpler to switch once to linecxt outside of the internal routine? It looks like GetDirConfFiles() is another piece that can be refactored and reviewed on its own, as we use it in ParseConfigDirectory()@guc.c. -- Michael From 7c12a9dd2b23765c0e0d38da8140051a89c45fb4 Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Tue, 25 Oct 2022 15:17:27 +0900 Subject: [PATCH v13 1/3] Refactor knowledge of origin file in hba.c This limits the footprint of HbaFileName and IdentFileName to their entry loading point, easing the introduction of the inclusion logic. 
--- src/include/libpq/hba.h | 3 ++ src/backend/libpq/hba.c | 114 +--- 2 files changed, 63 insertions(+), 54 deletions(-) diff --git a/src/include/libpq/hba.h b/src/include/libpq/hba.h index cec2e2665f..bf896ac084 100644 --- a/src/include/libpq/hba.h +++ b/src/include/libpq/hba.h @@ -93,6 +93,7 @@ typedef struct AuthToken typedef struct HbaLine { + char *sourcefile; int linenumber; char *rawline; ConnType conntype; @@ -138,6 +139,7 @@ typedef struct HbaLine typedef struct IdentLine { + char *sourcefile; int linenumber; char *usermap; @@ -157,6 +159,7 @@ typedef struct IdentLine typedef struct TokenizedAuthLine { List *fields; /* List of lists of AuthTokens */ + char *file_name; /* File name of origin */ int line_num; /* Line number */ char *raw_line; /* Raw line text */ char *err_msg; /* Error message if any */ diff --git a/src/backend/libpq/hba.c b/src/backend/libpq/hba.c index ea92f02a47..6524b60610 100644 --- a/src/backend/libpq/hba.c +++ b/src/backend/libpq/hba.c @@ -641,6 +641,7 @@ tokenize_auth_file(const char *filename, FILE *file, List **tok_lines, tok_line = (TokenizedAuthLine *) palloc(sizeof(TokenizedAuthLine)); tok_line->fields = current_line; + tok_line->file_name = pstrdup(filename); tok_line->line_num = line_number; tok_line->raw_line = pstrdup(buf.data); tok_line->err_msg = err_msg; @@ -984,7 +985,7 @@ do { \ errmsg("authentication option \"%s\" is only valid for authentication methods %s", \ optname, _(validmethods)), \ errcontext("line %d of configuration file \"%s\"", \ - line_num, HbaFileName))); \ + line_num, file_name))); \ *err_msg = psprintf("authentication option \"%s\" is only valid for authentication methods %s", \ optname, validmethods); \ return false; \ @@ -1004,7 +1005,7 @@ do { \ errmsg("authentication method \"%s\" requires argument \"%s\" to be set", \ authname, argname), \ errcontext("line %d of configuration file \"%s\"", \ - line_num, HbaFileName))); \ + line_num, file_name))); \ *err_msg = psprintf("authentication method \"%s\" requires argument \"%s\" to be set", \ authname, argname); \ return NULL; \ @@ -1027,7 +1028,7 @@ do { \ (errcode(ERRCODE_CONFIG_FILE_ERROR), \ errmsg("missing entry at end of line"), \ errcontext("line %d of configuration file \"%s\"", \ - line_num, IdentFileName))); \ + line_num, file_name))); \ *err_msg = pstrdup("missing entry at end of line"); \ return NULL; \ } \ @@ -1040,7 +1041,7 @@ do { \ (errcode(ERRCODE_CONFIG_FILE_ERROR), \ errmsg("multiple values in ident field"), \ errcontext("line %d of configuration file \"%s\"", \ - line_num, IdentFileName))); \ +