Re: Added schema level support for publication.

2021-09-15 Thread Amit Kapila
On Wed, Sep 15, 2021 at 4:45 PM Greg Nancarrow  wrote:
>
> On Tue, Sep 14, 2021 at 6:38 PM vignesh C  wrote:
> >
> > I have handled this in the patch attached.
> >
>
> Regarding the following function in the v28-0002 patch:
>
> +/*
> + * Check if the relation schema is member of the schema list.
> + */
> +static void
> +RelSchemaIsMemberOfSchemaList(List *rels, List *schemaidlist, bool schemacheck)
>
> I think this function is not well named or commented, and I don't like
> how the "schemacheck" bool parameter determines the type of objects in
> the "rels" list.
>

I think after fixing the comments in my previous email, the rels list
will become the same for this function, but the extra parameter is
surely required for giving object-specific errors.

> I would suggest you simply split this function into two separate
> functions, corresponding to each of the blocks of the "if-else" within
> the for-loop of the existing RelSchemaIsMemberOfSchemaList function.
> The "Is" part of the existing "RelSchemaIsMemberOfSchemaList" function
> name implies a boolean return value, so it seems misleading.
> So I think the names of the two functions that I am suggesting should
> be "CheckNotAlreadyInPublication" or something similar.
>

I think if we write individual functions then we need to add new
functions as and when we add new object types like sequences. The
other idea could be to keep a single function as now, say
CheckObjSchemaNotAlreadyInPublication, and instead of the bool
parameter the patch has now, we can keep an enum parameter
"add_obj_type" for 'rel', 'schema', 'sequence'. We can either use the
existing enum PublicationObjSpecType or define a new one for the same.
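
For illustration only, a minimal sketch of the single-function shape being
suggested (all names and enum values below are placeholders, not settled code):

typedef enum PublicationObjSpecType
{
	PUBLICATIONOBJ_TABLE,
	PUBLICATIONOBJ_REL_IN_SCHEMA,
	PUBLICATIONOBJ_SEQUENCE		/* future object types slot in here */
} PublicationObjSpecType;

static void
CheckObjSchemaNotAlreadyInPublication(List *rels, List *schemaidlist,
									  PublicationObjSpecType checkobjtype)
{
	/* walk rels; on a conflict, report an error worded for checkobjtype */
}

That way, adding sequences later only means adding an enum value and an
error message, not a new function.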

-- 
With Regards,
Amit Kapila.




Re: Allow escape in application_name

2021-09-15 Thread Fujii Masao




On 2021/09/16 12:36, kuroda.hay...@fujitsu.com wrote:

> you mean that this is not strtoXXX, right?


Yes, the behavior of process_log_prefix_padding() is different from
strtoint() or pg_strtoint32(). For example, how the leading space
characters are handled is different. If we use a name like
pg_fast_strtoint(), we might be tempted to replace the existing strtoint()
or pg_strtoint32() with pg_fast_strtoint() because its name contains
"fast", for better performance. But we should not do that because
their behavior is different.
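
To make the difference concrete, here is a rough sketch (my own illustration,
not code from the patch) of why a padding parser cannot just be a strtoint()
variant: it must not skip leading whitespace, because in a format string
those characters are significant:

/* Illustrative sketch only: "  12" must mean "no padding" here, while
 * strtol()/strtoint() would skip the spaces and return 12. */
static int
parse_padding_sketch(const char *p, const char **endptr)
{
	int			sign = 1;
	int			val = 0;

	if (*p == '-')
	{
		sign = -1;
		p++;
	}
	/* deliberately no isspace() skipping here */
	while (*p >= '0' && *p <= '9')
		val = val * 10 + (*p++ - '0');

	*endptr = p;
	return val * sign;
}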


> If so we should
> go back to process_padding(). Horiguchi-san, do you have any idea?


At least for me, process_padding() sounds like the function actually
does the padding operation. This is not what the function does.
So how about something like parse_padding() or parse_padding_format()?


> And I added pg_restrict keywords for compiler optimizations.


I'm not sure how much it's worth doing this here. If this change is helpful
for better performance or something, IMO it's better to propose this
separately.


> I agreed that (2) and (3) break the rule that postgres_fdw.application_name
> should override the server option. Hence I chose (4): values[i] is
> substituted with buf.data in any case.
>
> Attached is the latest version.


Thanks! Will review the patch.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Column Filtering in Logical Replication

2021-09-15 Thread vignesh C
On Thu, Sep 16, 2021 at 8:45 AM Amit Kapila  wrote:
>
> On Wed, Sep 15, 2021 at 6:06 PM Alvaro Herrera  
> wrote:
> >
> > On 2021-Sep-15, vignesh C wrote:
> > > The patch
> > > Generic_object_type_parser_002_table_schema_publication.patch has the
> > > changes that were used to handle the parsing. Schema and Relation both
> > > are different objects, schema is of string type and relation is of
> > > RangeVar type. While parsing, schema name is parsed in string format
> > > and relation is parsed and converted to rangevar type, these objects
> > > will be then handled accordingly during post processing.
> >
> > Yeah, I think it'd be cleaner if the node type has two members,
> > something like this
> >
> > typedef struct PublicationObjSpec
> > {
> >     NodeTag     type;
> >     PublicationObjSpecType pubobjtype;  /* type of this publication object */
> >     RangeVar   *rv;         /* if a table */
> >     String     *objname;    /* if a schema */
> >     int         location;   /* token location, or -1 if unknown */
> > } PublicationObjSpec;
> >
> > and only one of them is set, the other is NULL, depending on the object type.
> >
>
> I think the problem here is that with the proposed grammar we won't be
> always able to distinguish names at the gram.y stage.

This is the issue that Amit was talking about:
gram.y: error: shift/reduce conflicts: 2 found, 0 expected
gram.y: warning: shift/reduce conflict on token ',' [-Wcounterexamples]
  First example: CREATE PUBLICATION name FOR TABLE relation_expr_list
    • ',' relation_expr ',' PublicationObjSpec opt_definition $end
  Shift derivation
    $accept
    ↳ parse_toplevel $end
      ↳ stmtmulti
        ↳ toplevel_stmt
          ↳ stmt
            ↳ CreatePublicationStmt
              ↳ CREATE PUBLICATION name FOR pub_obj_list opt_definition
                  ↳ PublicationObjSpec ',' PublicationObjSpec
                      ↳ TABLE relation_expr_list
                          ↳ relation_expr_list • ',' relation_expr
  Second example: CREATE PUBLICATION name FOR TABLE relation_expr_list
    • ',' PublicationObjSpec opt_definition $end
  Reduce derivation
    $accept
    ↳ parse_toplevel $end
      ↳ stmtmulti
        ↳ toplevel_stmt
          ↳ stmt
            ↳ CreatePublicationStmt
              ↳ CREATE PUBLICATION name FOR pub_obj_list opt_definition
                  ↳ pub_obj_list ',' PublicationObjSpec
                      ↳ PublicationObjSpec
                          ↳ TABLE relation_expr_list •

Here the parser is not able to distinguish whether the ',' introduces the
next table name or the next object.
I was able to reproduce this issue with the attached patch.
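
For readers following along, the productions involved are shaped roughly like
this (a simplified sketch, not the actual gram.y rules):

pub_obj_list:
			PublicationObjSpec
			| pub_obj_list ',' PublicationObjSpec
		;

PublicationObjSpec:
			TABLE relation_expr_list
			| ALL TABLES IN_P SCHEMA name_list
		;

After TABLE relation_expr_list, when the parser sees ',' it has to decide
with one token of lookahead whether to shift (the comma continues
relation_expr_list) or reduce (the comma starts the next PublicationObjSpec),
which is exactly the conflict reported above.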

Regards,
Vignesh
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index d6fddd6efe..2a2fe03c13 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -141,14 +141,14 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
  * Insert new publication / relation mapping.
  */
 ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, Relation targetrel,
 		 bool if_not_exists)
 {
 	Relation	rel;
 	HeapTuple	tup;
 	Datum		values[Natts_pg_publication_rel];
 	bool		nulls[Natts_pg_publication_rel];
-	Oid			relid = RelationGetRelid(targetrel->relation);
+	Oid			relid = RelationGetRelid(targetrel);
 	Oid			prrelid;
 	Publication *pub = GetPublication(pubid);
 	ObjectAddress myself,
@@ -172,10 +172,10 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
 		ereport(ERROR,
 (errcode(ERRCODE_DUPLICATE_OBJECT),
  errmsg("relation \"%s\" is already member of publication \"%s\"",
-		RelationGetRelationName(targetrel->relation), pub->name)));
+		RelationGetRelationName(targetrel), pub->name)));
 	}
 
-	check_publication_add_relation(targetrel->relation);
+	check_publication_add_relation(targetrel);
 
 	/* Form a tuple. */
 	memset(values, 0, sizeof(values));
@@ -209,7 +209,7 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
 	table_close(rel, RowExclusiveLock);
 
 	/* Invalidate relcache so that publication info is rebuilt. */
-	CacheInvalidateRelcache(targetrel->relation);
+	CacheInvalidateRelcache(targetrel);
 
 	return myself;
 }
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 30929da1f5..4e4e02ba70 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -34,6 +34,7 @@
 #include "commands/publicationcmds.h"
 

Re: proposal: possibility to read dumped table's name from file

2021-09-15 Thread Pavel Stehule
Hi

In yesterday's patch I used strndup, which is not available on Windows. I am
sending an update where I used pnstrdup instead.
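
For reference, the substitution looks like this (a sketch, assuming the goal
is a length-bounded copy; pnstrdup() comes from PostgreSQL's own memory
utilities and is available to both frontend and backend code):

	/* strndup() is POSIX-only, so unavailable on Windows: */
	/*     copy = strndup(start, len);                      */

	/* portable equivalent: NUL-terminated copy of the first len bytes */
	copy = pnstrdup(start, len);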

Regards

Pavel
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 7682226b99..1b74c0eadd 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -789,6 +789,55 @@ PostgreSQL documentation
   
  
 
+ 
+  --filter=filename
+  
+   
+Read objects filters from the specified file. Specify "-" to read from
+stdin. Lines of this file must have the following format:
+
+{ include | exclude } { table | schema | foreign_data | data } PATTERN
+
+   
+
+   
+The first keyword specifies whether the object is to be included
+or excluded, and the second keyword specifies the type of object
+to be filtered:
+table (table),
+schema (schema),
+foreign_data (foreign server),
+data (table data).
+   
+
+   
+With the following filter file, the dump would include table
+mytable1 and data from foreign tables of
+some_foreign_server foreign server, but exclude data
+from table mytable2.
+
+include table mytable1
+include foreign_data some_foreign_server
+exclude table mytable2
+
+   
+
+   
+Lines starting with # are ignored. The comment
+(started by #) can be placed after filter too.
+Empty lines are ignored too.
+   
+
+   
+The --filter option works just like the other
+options to include or exclude tables, schemas, table data, or foreign
+tables, and both forms may be combined.  Note that there are no options
+to exclude a specific foreign table or to include a specific table's
+data.
+   
+  
+ 
+
  
   --if-exists
   
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index a485fb2d07..81533eb144 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -55,10 +55,12 @@
 #include "catalog/pg_trigger_d.h"
 #include "catalog/pg_type_d.h"
 #include "common/connect.h"
+#include "common/string.h"
 #include "dumputils.h"
 #include "fe_utils/option_utils.h"
 #include "fe_utils/string_utils.h"
 #include "getopt_long.h"
+#include "lib/stringinfo.h"
 #include "libpq/libpq-fs.h"
 #include "parallel.h"
 #include "pg_backup_db.h"
@@ -308,7 +310,7 @@ static void appendReloptionsArrayAH(PQExpBuffer buffer, const char *reloptions,
 static char *get_synchronized_snapshot(Archive *fout);
 static void setupDumpWorker(Archive *AHX);
 static TableInfo *getRootTableInfo(const TableInfo *tbinfo);
-
+static void read_filters_from_file(char *filename, DumpOptions *dopt);
 
 int
 main(int argc, char **argv)
@@ -380,6 +382,7 @@ main(int argc, char **argv)
 		{"enable-row-security", no_argument, _row_security, 1},
 		{"exclude-table-data", required_argument, NULL, 4},
 		{"extra-float-digits", required_argument, NULL, 8},
+		{"filter", required_argument, NULL, 12},
 		{"if-exists", no_argument, _exists, 1},
 		{"inserts", no_argument, NULL, 9},
 		{"lock-wait-timeout", required_argument, NULL, 2},
@@ -613,6 +616,10 @@ main(int argc, char **argv)
 		  optarg);
 break;
 
+			case 12:			/* filter implementation */
+				read_filters_from_file(optarg, &dopt);
+				break;
+
 			default:
 fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 exit_nicely(1);
@@ -1038,6 +1045,8 @@ help(const char *progname)
 			 "   access to)\n"));
 	printf(_("  --exclude-table-data=PATTERN do NOT dump data for the specified table(s)\n"));
 	printf(_("  --extra-float-digits=NUM override default setting for extra_float_digits\n"));
+	printf(_("  --filter=FILENAMEdump objects and data based on the filter expressions\n"
+			 "   from the filter file\n"));
 	printf(_("  --if-exists  use IF EXISTS when dropping objects\n"));
 	printf(_("  --include-foreign-data=PATTERN\n"
 			 "   include data of foreign tables on foreign\n"
@@ -18979,3 +18988,344 @@ appendReloptionsArrayAH(PQExpBuffer buffer, const char *reloptions,
 	if (!res)
 		pg_log_warning("could not parse reloptions array");
 }
+
+typedef struct
+{
+	FILE	   *fp;
+	char	   *filename;
+	int			lineno;
+} FilterStateData;
+
+typedef enum
+{
+	FILTER_OBJECT_TYPE_NONE,
+	FILTER_OBJECT_TYPE_TABLE,
+	FILTER_OBJECT_TYPE_SCHEMA,
+	FILTER_OBJECT_TYPE_FOREIGN_DATA,
+	FILTER_OBJECT_TYPE_DATA
+} FilterObjectType;
+
+/*
+ * Print error message and exit.
+ */
+static void
+exit_invalid_filter_format(FilterStateData *fstate, char *message)
+{
+	if (fstate->fp != stdin)
+	{
+		pg_log_error("invalid format of filter file \"%s\" on line %d: %s",
+	 fstate->filename,
+	 fstate->lineno,
+	 message);
+
+		fclose(fstate->fp);
+	}
+	else
+		pg_log_error("invalid format of filter on line %d: %s",
+	 fstate->lineno,
+	 

Re: right join with partitioned table crash

2021-09-15 Thread Justin Pryzby
On Wed, Sep 15, 2021 at 07:53:49PM -0400, Tom Lane wrote:
> Jaime Casanova  writes:
> > Here's another crash caught by sqlsmith.
> 
> Fun.  Looks like it fails back to v12, but not in v11,
> so it's some optimization we added in v12 that's at fault.

It seems to be a regression (?) in 12.6 (2021-02-11), from
| 1cce024fd2 Fix pull_varnos' miscomputation of relids set for a PlaceHolderVar.

-- 
Justin




Re: Hook for extensible parsing.

2021-09-15 Thread Julien Rouhaud
On Thu, Sep 16, 2021 at 5:40 AM Andres Freund  wrote:
>
> > I don't have any better ideas to offer :-( ... but I very much fear
> > that the approach proposed here is a dead end.
>
> I unfortunately don't see a good way forward without changing the way we do
> parsing on a more fundamental level :(.

Using the ExtensibleNode infrastructure, I can see two ways to try to
leverage that.

First one would be to require modules to wrap their RawStmt->stmt in
an ExtensibleNode if they want to do anything that requires semantic
analysis, and handle such a node in transformStmt() with another hook.
I think it would allow modules to do pretty much anything, at the cost
of walking the stmt twice and duplicating a possibly huge amount of
analyze.c and friends.
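
As a purely hypothetical sketch of that first idea (none of these names
exist today, they only illustrate the shape):

/* Module-private statement carried through core opaquely */
typedef struct MyModuleRawStmt
{
	ExtensibleNode xnode;		/* extnodename identifies the module */
	Node	   *private_stmt;	/* module-private parse tree */
} MyModuleRawStmt;

/* Hook that transformStmt() would call when it sees such a node */
typedef Query *(*transform_extensible_stmt_hook_type) (ParseState *pstate,
														ExtensibleNode *node);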

The other one would be to allow the parser to leave ExtensibleNodes in
the middle of the RawStmt and catch them in the transform* functions,
with e.g. some generic transformExtensibleNode(pstate, node,
some_identifier...) (the identifier giving both the general transform
action and some secondary info, like ParseExprKind for expressions).
This would avoid the downsides of the first approach, but would
require calling this new hook in a bunch of places.

Or we could combine both approaches so that the most common use cases,
like transformExprRecurse(), would be easily handled while more exotic
cases would have to go the hard way.  Parser authors could still ask
to add a new call to this new hook to ease their work in the next
major version.

Would any of that be a reasonable approach?




Re: Added schema level support for publication.

2021-09-15 Thread Amit Kapila
On Wed, Sep 15, 2021 at 12:30 PM Amit Kapila  wrote:
>
> On Tue, Sep 14, 2021 at 2:08 PM vignesh C  wrote:
> >
> > I have handled this in the patch attached.
> >
>
> 4.
> AlterPublicationSchemas()
> {
> ..
> + /*
> + * If the table option was not specified remove the existing tables
> + * from the publication.
> + */
> + if (!tables)
> + {
> + rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT);
> + PublicationDropTables(pubform->oid, rels, false, true);
> + }
> +
> + /* Identify which schemas should be dropped */
> + delschemas = list_difference_oid(oldschemaids, schemaidlist);
> +
> + /* And drop them */
> + PublicationDropSchemas(pubform->oid, delschemas, true);
>
> Here, you have neither locked tables to be dropped nor schemas. I
> think both need to be locked as we do for tables in similar code in
> AlterPublicationTables(). Can you please test via debugger what
> happens if we try to drop without taking lock here and concurrently
> try to drop the actual object? It should give some error. If we decide
> to lock here then we should be able to pass the list of relations to
> PublicationDropTables() instead of Oids which would then obviate the
> need for any change to that function.
>
> Similarly don't we need to lock schemas before dropping them in
> AlterPublicationTables()?
>

I think there is one more similar locking problem.
AlterPublicationSchemas()
{
..
+ if (stmt->action == DEFELEM_ADD)
+ {
+ List *rels;
+
+ rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT);
+ RelSchemaIsMemberOfSchemaList(rels, schemaidlist, true);
...
...
}

Here, we don't have a lock on the relation. So, what if the relation
is concurrently dropped after you get the rel list by
GetPublicationRelations?
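
For reference, the usual way to close such a window is to lock each relation
and then re-verify that it still exists, along these lines (an illustrative
sketch with an arbitrary lock mode, not the patch's actual code):

foreach(lc, rels)
{
	Oid			relid = lfirst_oid(lc);

	/* block a concurrent DROP for the duration of the command */
	LockRelationOid(relid, ShareUpdateExclusiveLock);

	/* recheck: the relation may be gone before we got the lock */
	if (!SearchSysCacheExists1(RELOID, ObjectIdGetDatum(relid)))
		ereport(ERROR,
				(errcode(ERRCODE_UNDEFINED_TABLE),
				 errmsg("relation %u does not exist", relid)));
}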

-- 
With Regards,
Amit Kapila.




Re: Signed vs Unsigned (take 2) (src/backend/storage/ipc/procarray.c)

2021-09-15 Thread Fujii Masao




On 2021/09/15 21:27, Ranier Vilela wrote:
>> I found this in the commit log in the patch. I agree that these patches
>> are refactoring ones. But I'm thinking that it's worth doing back-patch,
>> to make future back-patching easy. Thought?
>
> Thanks for picking this.

Pushed. Thanks!

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: improve pg_receivewal code

2021-09-15 Thread Michael Paquier
> Here's the v2 with above modifications.

I was looking at this patch, and I agree that checking for the system
ID and the timeline by setting sysidentifier beforehand looks like an
improvement.

The extra IDENTIFY_SYSTEM done at the beginning of StreamLog() is not
a performance bottleneck, as we run it only once per loop iteration.  I
don't really get the argument that a server replacing another one on the
same port requires relying only on the first system ID fetched before
starting the loop of StreamLog() calls.  So I would leave main()
alone, but fill in the system ID from RunIdentifySystem() in
StreamLog().
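
A sketch of what that could look like, relying on RunIdentifySystem()'s
existing signature in streamutil.c (the surrounding variable names are
assumptions):

	/* At the top of StreamLog(): fill in the system ID if not yet known */
	if (sysidentifier == NULL)
	{
		TimeLineID	servertli;

		/* runs IDENTIFY_SYSTEM and fills sysidentifier from the result */
		if (!RunIdentifySystem(conn, &sysidentifier, &servertli, NULL, NULL))
			exit(1);
	}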
--
Michael




RE: Allow escape in application_name

2021-09-15 Thread kuroda.hay...@fujitsu.com
Dear Fujii-san,

Thanks for comments!

> >> Thanks for the new version.  I don't object for reusing
> >> process_log_prefix_padding, but I still find it strange that the
> >> function with the name 'process_padding' is in string.c.  If we move
> >> it to string.c, I'd like to name it "pg_fast_strtoi" or something and
> >> change the interface to int pg_fast_strtoi(const char *p, char
> >> **endptr) that is (roughly) compatible to strtol.  What do (both) you
> >> think about this?
> >
> > I agree that this interface might be confusing.
> > I changed its name and interface. What do you think?
> > Actually I cannot tell whether the name is good or not,
> > but I could not think of anything else...
> 
> The name using the word "strtoint" sounds confusing to me
> because the behavior of the function is different from strtoint() or
> pg_strtoint32(), etc. Otherwise we can easily misunderstand that
> pg_fast_strtoint() can be used as an alternative to strtoint() or
> pg_strtoint32(). I have no better idea for the name, though..

you mean that this is not strtoXXX, right? If so we should
go back to process_padding(). Horiguchi-san, do you have any idea?

And I added pg_restrict keywords for compiler optimizations.

> >> I didn't fully checked in what case parse_pgfdw_appname gives "" as
> >> result, I feel that we should use the original value in that
> >> case. That is,
> >>
> >>>   parse_pgfdw_appname(&buf, values[i]);
> >>>
> >>>   /* use the result if any, otherwise use the original string */
> >>>   if (buf.data[0] != 0)
> >>>       values[i] = buf.data;
> >>>
> >>>   /* break if it results in non-empty string */
> >>>   if (values[0][0] != 0)
> >>>       break;
> 
> Agreed. It's strange to use the application_name of the server
> object in that case. There seems to be four options:
> 
> (1) Use the original postgres_fdw.application_name like "%b"
> (2) Use the application_name of the server object (if set)
> (3) Use fallback_application_name
> (4) Use empty string as application_name
> 
> (2) and (3) look strange to me because we expect that
> postgres_fdw.application_name should override application_name
> of the server object and fallback_application_name.
> 
> Also reporting invalid escape sequence in application_name as it is,
> i.e., (1), looks strange to me.
> 
> So (4) looks most intuitive and similar behavior to log_line_prefix.
> Thought?

I agreed that (2) and (3) break the rule that postgres_fdw.application_name
should override the server option. Hence I chose (4): values[i] is
substituted with buf.data in any case.

Attached is the latest version.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



v16_0002_allow_escapes.patch
Description: v16_0002_allow_escapes.patch


RE: Added schema level support for publication.

2021-09-15 Thread houzj.f...@fujitsu.com
On Tuesday, September 14, 2021 4:39 PM vignesh C  wrote:
> 
> I have handled this in the patch attached.

Thanks for updating the patch.
Here are some comments.

1)
+static void
+AlterPublicationSchemas(AlterPublicationStmt *stmt, Relation rel,
...
+	/*
+	 * If the table option was not specified remove the existing tables
+	 * from the publication.
+	 */
+	if (!tables)
+	{
+		rels = GetPublicationRelations(pubform->oid, PUBLICATION_PART_ROOT);
+		PublicationDropTables(pubform->oid, rels, false, true);
+	}


It seems unnatural to drop tables in AlterPublication*Schemas*;
I think we'd better do it in AlterPublicationTables.

2)
 static void
 AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
 ...
+	/*
+	 * If ALL TABLES IN SCHEMA option was not specified remove the
+	 * existing schemas from the publication.
+	 */
+	List *pubschemas = GetPublicationSchemas(pubid);
+	PublicationDropSchemas(pubform->oid, pubschemas, false);

Same as 1). Is it better to modify the schema list in AlterPublicationSchemas?


3)
 static void
 AlterPublicationTables(AlterPublicationStmt *stmt, Relation rel,
...
	/* check if the relation is member of the schema list specified */
	RelSchemaIsMemberOfSchemaList(rels, schemaidlist, false);

IIRC, the check here is to check the specified tables and schemas in the
command. Personally, this seems like a common operation which could be placed
in AlterPublication(). If we move this check to AlterPublication(), and if
comments 1) and 2) make sense to you, then we don't need the new function
parameters in AlterPublicationTables() and AlterPublicationSchemas().


Best regards,
Hou zj




Re: Column Filtering in Logical Replication

2021-09-15 Thread Amit Kapila
On Wed, Sep 15, 2021 at 8:19 PM Euler Taveira  wrote:
>
> On Wed, Sep 15, 2021, at 9:19 AM, vignesh C wrote:
>
> I have extracted the parser code and attached it here, so that it will
> be easy to go through. We wanted to support the following syntax as in
> [1]:
> CREATE PUBLICATION pub1 FOR
> TABLE t1,t2,t3, ALL TABLES IN SCHEMA s1,s2,
> SEQUENCE seq1,seq2, ALL SEQUENCES IN SCHEMA s3,s4;
>
> I don't like this syntax. It seems too much syntax for the same purpose in a
> single command. If you look at GRANT command whose ALL TABLES IN SCHEMA syntax
> was extracted, you can use ON TABLE or ON ALL TABLES IN SCHEMA; you cannot use
> both. This proposal allows duplicate objects (of course, you can ignore it but
> the current code prevents duplicates -- see publication_add_relation).
>
> IMO you should mimic the GRANT grammar and have multiple commands for row
> filtering, column filtering, and ALL FOO IN SCHEMA. The filtering patches only
> use the FOR TABLE syntax. The latter won't have filtering syntax.
>

Sure, but we don't prevent it if the user uses only the FOR TABLE variant.
OTOH, it is better to provide the flexibility of allowing multiple objects in
one command where that is feasible. It saves the effort of users
in many cases. In short, +1 for the syntax where multiple objects can
be allowed.

-- 
With Regards,
Amit Kapila.




Re: Column Filtering in Logical Replication

2021-09-15 Thread Amit Kapila
On Wed, Sep 15, 2021 at 6:06 PM Alvaro Herrera  wrote:
>
> On 2021-Sep-15, vignesh C wrote:
> > The patch
> > Generic_object_type_parser_002_table_schema_publication.patch has the
> > changes that were used to handle the parsing. Schema and Relation both
> > are different objects, schema is of string type and relation is of
> > RangeVar type. While parsing, schema name is parsed in string format
> > and relation is parsed and converted to rangevar type, these objects
> > will be then handled accordingly during post processing.
>
> Yeah, I think it'd be cleaner if the node type has two members, something like
> this
>
> typedef struct PublicationObjSpec
> {
>     NodeTag     type;
>     PublicationObjSpecType pubobjtype;  /* type of this publication object */
>     RangeVar   *rv;         /* if a table */
>     String     *objname;    /* if a schema */
>     int         location;   /* token location, or -1 if unknown */
> } PublicationObjSpec;
>
> and only one of them is set, the other is NULL, depending on the object type.
>

I think the problem here is that with the proposed grammar we won't
always be able to distinguish names at the gram.y stage. Some post-parse
analysis is required to attribute the right type to each name, as is
done in the patch. The same seems to be indicated by Tom in his email
as well, where he has proposed this syntax [1]. Also, something similar
is done for privilege_target (GRANT syntax) where we have a list of
objects, but here the story is slightly more advanced because we are
planning to allow specifying multiple objects in one command. One
might think that we can identify each type of object list separately,
but that gives grammar conflicts, as the parser is not able to identify
whether the comma ',' is used for the same type of object or for the
next type. Because of this we need to come up with a generic object for
names, to which we attribute the right type in post-parse analysis. Now,
I think instead of void *, it might be better to use Node * for generic
objects unless we have some problem.
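
Concretely, the generic shape being suggested might look something like this
(a sketch of the idea only, not settled code):

typedef struct PublicationObjSpec
{
	NodeTag		type;
	PublicationObjSpecType pubobjtype;	/* type of this publication object */
	Node	   *object;		/* name as a generic Node; resolved to a RangeVar
							 * or String during post-parse analysis */
	int			location;	/* token location, or -1 if unknown */
} PublicationObjSpec;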

[1] - https://www.postgresql.org/message-id/877603.1629120678%40sss.pgh.pa.us

-- 
With Regards,
Amit Kapila.




Re: Column Filtering in Logical Replication

2021-09-15 Thread Peter Smith
On Tue, Sep 7, 2021 at 3:51 AM Alvaro Herrera  wrote:
>
...
> I pushed the clerical part of this -- namely the addition of
> PublicationTable node and PublicationRelInfo struct.  I attach the part
> of your v4 patch that I didn't include.  It contains a couple of small
> corrections, but I didn't do anything invasive (such as pgindent)
> because that would perhaps cause you too much merge pain.

I noticed that the latest v5 no longer includes the TAP test which was
in the v4 patch.

(src/test/subscription/t/021_column_filter.pl)

Was that omission deliberate?

--
Kind Regards,
Peter Smith.
Fujitsu Australia.




Re: Estimating HugePages Requirements?

2021-09-15 Thread Michael Paquier
On Wed, Sep 15, 2021 at 10:31:20PM +, Bossart, Nathan wrote:
> +This can be used on a running server for most parameters.  However,
> +the server must be shut down for some runtime-computed parameters
> +(e.g., , and
> +).
> 
> nitpick: I think you can remove the comma before the "and" in the list
> of examples.

Fixed that, and applied.  Could you rebase the last patch with the
name suggested for the last GUC, including the docs?  It looks like we
are heading for shared_memory_size_in_huge_pages.
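
For reference, once that is in place the value should be queryable on a
stopped cluster through the existing -C switch, along these lines (GUC name
per the suggestion above, output purely illustrative):

    postgres -D $PGDATA -C shared_memory_size_in_huge_pages
    1339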
--
Michael




RE: Column Filtering in Logical Replication

2021-09-15 Thread houzj.f...@fujitsu.com
On Wednesday, September 15, 2021 8:19 PM vignesh C  wrote:
> I have extracted the parser code and attached it here, so that it will be 
> easy to
> go through. We wanted to support the following syntax as in
> [1]:
> CREATE PUBLICATION pub1 FOR
> TABLE t1,t2,t3, ALL TABLES IN SCHEMA s1,s2, SEQUENCE seq1,seq2, ALL
> SEQUENCES IN SCHEMA s3,s4;

I am +1 for this syntax.

This syntax is more flexible than adding or dropping different types of
objects in separate commands. The user can either use one single command to
add several different objects or use several commands to add each type of
object.

Best regards,
Hou zj


RE: [BUG] Unexpected action when publishing partition tables

2021-09-15 Thread houzj.f...@fujitsu.com
On Tuesday, September 14, 2021 10:41 PM vignesh C  wrote:
> On Tue, Sep 7, 2021 at 11:38 AM houzj.f...@fujitsu.com 
>  wrote:
> >
> >
> > Attach new version patches which addressed the comment.
> 
> Thanks for fixing this issue. The bug gets fixed by the patch, I did not find 
> any
> issues in my testing.
> I just had one minor comment:
> 
> We could clean the table at the end by calling DROP TABLE testpub_parted2:
> +-- still fail, because parent's publication replicates updates UPDATE
> +testpub_parted2 SET a = 2;
> +ERROR:  cannot update table "testpub_parted2" because it does not
> have a replica identity and publishes updates
> +HINT:  To enable updating the table, set REPLICA IDENTITY using ALTER
> TABLE.
> +ALTER PUBLICATION testpub_forparted DROP TABLE testpub_parted;
> +-- works again, because update is no longer replicated UPDATE
> +testpub_parted2 SET a = 2;

Thanks for the comment.
Attach new version patches which clean the table at the end.

Best regards,
Hou zj


v3-0001-Made-the-existing-relation-cache-invalidation-an.patch
Description:  v3-0001-Made-the-existing-relation-cache-invalidation-an.patch


v3-0002-Invalidate-all-partitions.patch
Description: v3-0002-Invalidate-all-partitions.patch


RE: Logical replication keepalive flood

2021-09-15 Thread houzj.f...@fujitsu.com
From Tuesday, September 14, 2021 1:39 PM Greg Nancarrow  
wrote:
> However, the problem I found is that, with the patch applied, there is
> a test failure when running “make check-world”:
> 
>  t/006_logical_decoding.pl  4/14
> #   Failed test 'pg_recvlogical acknowledged changes'
> #   at t/006_logical_decoding.pl line 117.
> #  got: 'BEGIN
> # table public.decoding_test: INSERT: x[integer]:5 y[text]:'5''
> # expected: ''
> # Looks like you failed 1 test of 14.
> t/006_logical_decoding.pl  Dubious, test returned 1 (wstat
> 256, 0x100) Failed 1/14 subtests
> 
> 

After applying the patch,
I saw the same problem and can reproduce it by the following steps:

1) execute the SQLs.
---SQL---
CREATE TABLE decoding_test(x integer, y text);
SELECT pg_create_logical_replication_slot('test_slot', 'test_decoding');
INSERT INTO decoding_test(x,y) SELECT s, s::text FROM generate_series(1,4) s;

-- use the lsn here to execute pg_recvlogical later
SELECT lsn FROM pg_logical_slot_peek_changes('test_slot', NULL, NULL) ORDER BY 
lsn DESC LIMIT 1;
INSERT INTO decoding_test(x,y) SELECT s, s::text FROM generate_series(5,50) s;
--

2) Then, if I execute the following command twice:
# pg_recvlogical -E lsn -d postgres -S 'test_slot' --start --no-loop -f -
BEGIN 708
table public.decoding_test: INSERT: x[integer]:1 y[text]:'1'
table public.decoding_test: INSERT: x[integer]:2 y[text]:'2'
table public.decoding_test: INSERT: x[integer]:3 y[text]:'3'
table public.decoding_test: INSERT: x[integer]:4 y[text]:'4'
COMMIT 708

# pg_recvlogical -E lsn -d postgres -S 'test_slot' --start --no-loop -f -
BEGIN 709

It still generated 'BEGIN 709' on the second execution.
Without the patch, the second execution outputs nothing.

Best regards,
Hou zj





Re: prevent immature WAL streaming

2021-09-15 Thread Alvaro Herrera
On 2021-Sep-15, Kyotaro Horiguchi wrote:

> + CopyXLogRecordToWAL(rechdr->xl_tot_len, isLogSwitch,
> +                     flags & XLOG_SET_ABORTED_PARTIAL,
> +                     rdata, StartPos, EndPos);
> 
> The new xlog flag XLOG_SET_ABORTED_PARTIAL is used only by
> RM_XLOG_ID/XLOG_OVERWRITE_CONTRECORD records, so the flag value is the
> equivalent of the record type.

In the new version I removed all this; it was wrong.

> + if (set_aborted_partial)
> +     pagehdr->xlp_info |= XLP_FIRST_IS_ABORTED_PARTIAL;
> +
> 
> I'm not sure about the reason for the change from the previous patch
> (I might be missing something), this sets the flag on the *next* page
> of the page where the record starts.  So in the first place we
> shouldn't set the flag there.

You're right, this code is wrong.  And in fact I had already noticed it
yesterday, but much to my embarrassment I forgot to fix it.  Here is a
fixed version, where I moved the flag set back to AdvanceXLInsertBuffer.
I think doing it anywhere else is going to be very painful.  AFAICT we
do the right thing now, but amusingly we don't have any tooling to
verify that the XLP flag is set in the page where we want it.

With this patch we now have two recptrs: the LSN of the broken record,
and the LSN of the missing contrecord.  The latter is where to start
writing WAL after recovery is done, and the former is currently unused
but we could use it to double-check that we're aborting (forgetting) the
correct record.  I didn't try to implement that, but IIUC it is
xlogreader that would have to do that.

-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
"Si quieres ser creativo, aprende el arte de perder el tiempo"
>From f468f058d5b480a9f27f9d1830f3959297accd4d Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Thu, 2 Sep 2021 17:21:46 -0400
Subject: [PATCH v6] Implement FIRST_IS_ABORTED_CONTRECORD

---
 src/backend/access/rmgrdesc/xlogdesc.c  |  11 +++
 src/backend/access/transam/xlog.c   | 112 ++--
 src/backend/access/transam/xloginsert.c |   1 +
 src/backend/access/transam/xlogreader.c |  38 +++-
 src/include/access/xlog_internal.h  |  20 -
 src/include/access/xlogreader.h |   7 ++
 src/include/catalog/pg_control.h|   1 +
 7 files changed, 180 insertions(+), 10 deletions(-)

diff --git a/src/backend/access/rmgrdesc/xlogdesc.c b/src/backend/access/rmgrdesc/xlogdesc.c
index e6090a9dad..0be382f8a5 100644
--- a/src/backend/access/rmgrdesc/xlogdesc.c
+++ b/src/backend/access/rmgrdesc/xlogdesc.c
@@ -139,6 +139,14 @@ xlog_desc(StringInfo buf, XLogReaderState *record)
 		 xlrec.ThisTimeLineID, xlrec.PrevTimeLineID,
 		 timestamptz_to_str(xlrec.end_time));
 	}
+	else if (info == XLOG_OVERWRITE_CONTRECORD)
+	{
+		xl_overwrite_contrecord xlrec;
+
+		memcpy(, rec, sizeof(xl_overwrite_contrecord));
+		appendStringInfo(buf, "lsn %X/%X",
+		 LSN_FORMAT_ARGS(xlrec.overwritten_lsn));
+	}
 }
 
 const char *
@@ -178,6 +186,9 @@ xlog_identify(uint8 info)
 		case XLOG_END_OF_RECOVERY:
 			id = "END_OF_RECOVERY";
 			break;
+		case XLOG_OVERWRITE_CONTRECORD:
+			id = "OVERWRITE_CONTRECORD";
+			break;
 		case XLOG_FPI:
 			id = "FPI";
 			break;
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index e51a7a749d..a8e587b5c6 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -199,6 +199,15 @@ static XLogRecPtr LastRec;
 static XLogRecPtr flushedUpto = 0;
 static TimeLineID receiveTLI = 0;
 
+/*
+ * abortedRecPtr is the start pointer of a broken record at end of WAL when
+ * recovery completes; missingContrecPtr is the location of the first
+ * contrecord that went missing.  See CreateOverwriteContrecordRecord for
+ * details.
+ */
+static XLogRecPtr abortedRecPtr;
+static XLogRecPtr missingContrecPtr;
+
 /*
  * During recovery, lastFullPageWrites keeps track of full_page_writes that
  * the replayed WAL records indicate. It's initialized with full_page_writes
@@ -894,6 +903,7 @@ static void checkTimeLineSwitch(XLogRecPtr lsn, TimeLineID newTLI,
 TimeLineID prevTLI);
 static void LocalSetXLogInsertAllowed(void);
 static void CreateEndOfRecoveryRecord(void);
+static XLogRecPtr CreateOverwriteContrecordRecord(XLogRecPtr aborted_lsn);
 static void CheckPointGuts(XLogRecPtr checkPointRedo, int flags);
 static void KeepLogSeg(XLogRecPtr recptr, XLogSegNo *logSegNo);
 static XLogRecPtr XLogGetReplicationSlotMinimumLSN(void);
@@ -1120,8 +1130,8 @@ XLogInsertRecord(XLogRecData *rdata,
 		 * All the record data, including the header, is now ready to be
 		 * inserted. Copy the record in the space reserved.
 		 */
-		CopyXLogRecordToWAL(rechdr->xl_tot_len, isLogSwitch, rdata,
-			StartPos, EndPos);
+		CopyXLogRecordToWAL(rechdr->xl_tot_len, isLogSwitch,
+			rdata, StartPos, EndPos);
 
 	

Re: right join with partitioned table crash

2021-09-15 Thread Tom Lane
Jaime Casanova  writes:
> Here's another crash caught by sqlsmith.

Fun.  Looks like it fails back to v12, but not in v11,
so it's some optimization we added in v12 that's at fault.

(That being the case, this isn't a blocker for 14rc1,
though of course it'd be nice if we fix it in time for that.)

regards, tom lane




Re: Deduplicate code updating ControleFile's DBState.

2021-09-15 Thread Michael Paquier
On Wed, Sep 15, 2021 at 10:49:39PM +, Bossart, Nathan wrote:
> Ah, I was missing this context.  Perhaps this should be included in
> the patch set for the other thread, especially if it will need to be
> exported.

This part of the patch is mentioned at the top of the thread:
-   LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
-   ControlFile->state = DB_IN_PRODUCTION;
-   ControlFile->time = (pg_time_t) time(NULL);
-
+   SetControlFileDBState(DB_IN_PRODUCTION);
    SpinLockAcquire(&XLogCtl->info_lck);
    XLogCtl->SharedRecoveryState = RECOVERY_STATE_DONE;
    SpinLockRelease(&XLogCtl->info_lck);

There is an assumption in this code to update SharedRecoveryState
*while* holding ControlFileLock.  For example, see the following
comment in xlog.c, ReadRecord():
	/*
	 * We update SharedRecoveryState while holding the lock on
	 * ControlFileLock so both states are consistent in shared memory.
	 */
--
Michael




Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-09-15 Thread Michael Paquier
On Wed, Sep 15, 2021 at 01:48:09PM -0700, Andres Freund wrote:
> Will do. Although I do wish the original committer would have chimed in at
> some point...

Thanks, Andres.
--
Michael




Re: EXPLAIN(VERBOSE) to CTE with SEARCH BREADTH FIRST fails

2021-09-15 Thread Tom Lane
I wrote:
> I do not think that patch is a proper solution, but we do need to do
> something about this.

I poked into this and decided it's an ancient omission within ruleutils.c.
The reason we've not seen it before is probably that you can't get to the
case through the parser.  The SEARCH stuff is generating a query structure
basically equivalent to

regression=# with recursive cte (x,r) as (
  select 42 as x, row(i, 2.3) as r from generate_series(1,3) i
  union all
  select x, row((c.r).f1, 4.5) from cte c
)
select * from cte;
ERROR:  record type has not been registered

and as you can see, expandRecordVariable fails to figure out what
the referent of "c.r" is.  I think that could be fixed (by looking
into the non-recursive term), but given the lack of field demand,
I'm not feeling that it's urgent.

So the omission is pretty obvious from the misleading comment:
actually, Vars referencing RTE_CTE RTEs can also appear in WorkTableScan
nodes, and we're not doing anything to support that.  But we only reach
this code when trying to resolve a field of a Var of RECORD type, which
is a case that it seems like the parser can't produce.

It doesn't look too hard to fix: we just have to find the RecursiveUnion
that goes with the WorkTableScan, and drill down into that, much as we
would do in the CteScan case.  See attached draft patch.  I'm too tired
to beat on this heavily or add a test case, but I have verified that it
passes check-world and handles the example presented in this thread.

regards, tom lane

diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index b932a83827..b15bd81b9c 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -375,6 +375,8 @@ static void identify_join_columns(JoinExpr *j, RangeTblEntry *jrte,
   deparse_columns *colinfo);
 static char *get_rtable_name(int rtindex, deparse_context *context);
 static void set_deparse_plan(deparse_namespace *dpns, Plan *plan);
+static Plan *find_recursive_union(deparse_namespace *dpns,
+  WorkTableScan *wtscan);
 static void push_child_plan(deparse_namespace *dpns, Plan *plan,
 			deparse_namespace *save_dpns);
 static void pop_child_plan(deparse_namespace *dpns,
@@ -4866,6 +4868,9 @@ set_deparse_plan(deparse_namespace *dpns, Plan *plan)
 	 * For a SubqueryScan, pretend the subplan is INNER referent.  (We don't
 	 * use OUTER because that could someday conflict with the normal meaning.)
 	 * Likewise, for a CteScan, pretend the subquery's plan is INNER referent.
+	 * For a WorkTableScan, locate the parent RecursiveUnion plan node and use
+	 * that as INNER referent.
+	 *
 	 * For ON CONFLICT .. UPDATE we just need the inner tlist to point to the
 	 * excluded expression's tlist. (Similar to the SubqueryScan we don't want
 	 * to reuse OUTER, it's used for RETURNING in some modify table cases,
@@ -4876,6 +4881,9 @@ set_deparse_plan(deparse_namespace *dpns, Plan *plan)
 	else if (IsA(plan, CteScan))
 		dpns->inner_plan = list_nth(dpns->subplans,
 	((CteScan *) plan)->ctePlanId - 1);
+	else if (IsA(plan, WorkTableScan))
+		dpns->inner_plan = find_recursive_union(dpns,
+(WorkTableScan *) plan);
 	else if (IsA(plan, ModifyTable))
 		dpns->inner_plan = plan;
 	else
@@ -4899,6 +4907,29 @@ set_deparse_plan(deparse_namespace *dpns, Plan *plan)
 		dpns->index_tlist = NIL;
 }
 
+/*
+ * Locate the ancestor plan node that is the RecursiveUnion generating
+ * the WorkTableScan's work table.  We can match on wtParam, since that
+ * should be unique within the plan tree.
+ */
+static Plan *
+find_recursive_union(deparse_namespace *dpns, WorkTableScan *wtscan)
+{
+	ListCell   *lc;
+
+	foreach(lc, dpns->ancestors)
+	{
+		Plan	   *ancestor = (Plan *) lfirst(lc);
+
+		if (IsA(ancestor, RecursiveUnion) &&
+			((RecursiveUnion *) ancestor)->wtParam == wtscan->wtParam)
+			return ancestor;
+	}
+	elog(ERROR, "could not find RecursiveUnion for WorkTableScan with wtParam %d",
+		 wtscan->wtParam);
+	return NULL;
+}
+
 /*
  * push_child_plan: temporarily transfer deparsing attention to a child plan
  *
@@ -7646,9 +7677,12 @@ get_name_for_var_field(Var *var, int fieldno,
 {
 	/*
 	 * We're deparsing a Plan tree so we don't have a CTE
-	 * list.  But the only place we'd see a Var directly
-	 * referencing a CTE RTE is in a CteScan plan node, and we
-	 * can look into the subplan's tlist instead.
+	 * list.  But the only places we'd see a Var directly
+	 * referencing a CTE RTE are in CteScan or WorkTableScan
+	 * plan nodes.  For those cases, set_deparse_plan arranged
+	 * for dpns->inner_plan to be the plan node that emits the
+	 * CTE or RecursiveUnion result, and we can look at its
+	 * tlist instead.
 	 */
 	TargetEntry *tle;
 	deparse_namespace save_dpns;


right join with partitioned table crash

2021-09-15 Thread Jaime Casanova
Hi everyone,

Here's another crash caught by sqlsmith.

"""
drop table if exists fkpart3_pk5 cascade;
drop table if exists inet_tbl;

create table fkpart3_pk5 (
a integer not null primary key
)
partition by range (a);

create table fkpart3_pk51 partition of fkpart3_pk5
for values from (4000) to (4500);

create table inet_tbl (
c cidr,
i inet
);

select
1 as c0
from
(select null::integer as c9,
ref_0.a as c24
   from fkpart3_pk5 as ref_0
) as subq_0
right join public.inet_tbl as sample_0 on (cast(null as cidr) = c)
where subq_0.c9 <= subq_0.c24
"""


Attached the backtrace.

-- 
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
set = {__val = {4194304, 140736805622640, 2, 6, 556, 
94745029079040, 
4611686018427388799, 139894076431014, 0, 281470681751456, 0, 0, 0, 
0, 0, 0}}
pid = 
tid = 
ret = 
#1  0x7f3ba0830535 in __GI_abort () at abort.c:79
save_stage = 1
act = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, 
sa_mask = {__val = {
  0, 0, 0, 0, 0, 139894074187765, 2, 7075777569425871984, 
7003713384980111928, 
  94745029079040, 7003719963994515424, 0, 9449567279706640128, 
140736805622880, 0, 
  140736805623744}}, sa_flags = -1949470720, sa_restorer = 0x0}
sigs = {__val = {32, 0 }}
#2  0x562b8c3818a0 in ExceptionalCondition (
conditionName=0x562b8c527e38 "!bms_overlap(baserel->relids, 
required_outer)", 
errorType=0x562b8c527a26 "FailedAssertion", fileName=0x562b8c527abf 
"relnode.c", 
lineNumber=1313) at assert.c:69
No locals.
#3  0x562b8c0ef355 in get_baserel_parampathinfo (root=0x562b8d3ebd00, 
baserel=0x562b8d3e8ec0, required_outer=0x562b8d40e350) at relnode.c:1313
ppi = 0x562b8d3e8ec0
joinrelids = 0x562b8d40f468
pclauses = 0x562b8d40e490
rows = 6.9533220764192298e-310
lc = 0x562b8d40f4c0
#4  0x562b8c0debbd in create_append_path (root=0x562b8d3ebd00, 
rel=0x562b8d3e8ec0, 
subpaths=0x562b8d40f468, partial_subpaths=0x0, pathkeys=0x0, 
required_outer=0x562b8d40e350, 
parallel_workers=0, parallel_aware=false, rows=-1) at pathnode.c:1270
pathnode = 0x562b8d40f4c0
l = 0x0
#5  0x562b8c06ff7c in add_paths_to_append_rel (root=0x562b8d3ebd00, 
rel=0x562b8d3e8ec0, 
live_childrels=0x562b8d40f060) at allpaths.c:1610
required_outer = 0x562b8d40e350
lcr = 0x0
l__state = {l = 0x562b8d40f1c0, i = 0}
subpaths = 0x562b8d40f468
subpaths_valid = true
partial_subpaths = 0x562b8d40f110
pa_partial_subpaths = 0x562b8d40f168
pa_nonpartial_subpaths = 0x0
partial_subpaths_valid = true
pa_subpaths_valid = true
all_child_pathkeys = 0x0
all_child_outers = 0x562b8d40f1c0
l = 0x562b8d40f1d8
partial_rows = 500
#6  0x562b8c06f47b in set_append_rel_pathlist (root=0x562b8d3ebd00, 
rel=0x562b8d3e8ec0, 
rti=4, rte=0x562b8d3e7df0) at allpaths.c:1269
parentRTindex = 4
live_childrels = 0x562b8d40f060
l = 0x0
#7  0x562b8c06e33a in set_rel_pathlist (root=0x562b8d3ebd00, 
rel=0x562b8d3e8ec0, rti=4, 
rte=0x562b8d3e7df0) at allpaths.c:481
__func__ = "set_rel_pathlist"
#8  0x562b8c06e023 in set_base_rel_pathlists (root=0x562b8d3ebd00) at 
allpaths.c:353
rel = 0x562b8d3e8ec0
rti = 4
#9  0x562b8c06dd72 in make_one_rel (root=0x562b8d3ebd00, 
joinlist=0x562b8d40c8a8)
at allpaths.c:223
rel = 0x8d3ebd00
rti = 6
total_pages = 20
#10 0x562b8c0aab06 in query_planner (root=0x562b8d3ebd00, 
qp_callback=0x562b8c0b07c1 , qp_extra=0x7fffd74e5cc0) 
at planmain.c:276
parse = 0x562b8d3e72f0
joinlist = 0x562b8d40c8a8
final_rel = 0x562b8d3ed520
__func__ = "query_planner"
#11 0x562b8c0ad2bf in grouping_planner (root=0x562b8d3ebd00, 
tuple_fraction=0)
at planner.c:1447
sort_input_targets = 0x562b8d3edec8
sort_input_target_parallel_safe = false
grouping_target = 0x562b8d3ebc70
scanjoin_target = 0x562b8d3edc40
activeWindows = 0x0
qp_extra = {activeWindows = 0x0, groupClause = 0x0}
sort_input_targets_contain_srfs = 0x0
have_grouping = false
wflists = 0x0
gset_data = 0x0
sort_input_target = 0x562b8d3edc98
grouping_targets = 0x562b8c017eee 
grouping_target_parallel_safe = false
scanjoin_targets = 0x562b8d315790
scanjoin_target_parallel_safe = false
grouping_targets_contain_srfs = 0x7fffd74e5d00
scanjoin_targets_contain_srfs = 0x0
scanjoin_target_same_exprs = false
parse = 0x562b8d3e72f0
offset_est = 0
count_est = 0
limit_tuples = -1
  

Re: Deduplicate code updating ControleFile's DBState.

2021-09-15 Thread Bossart, Nathan
On 9/15/21, 4:47 AM, "Amul Sul"  wrote:
> On Wed, Sep 15, 2021 at 12:52 AM Bossart, Nathan  wrote:
>> It looks like ebdf5bf intentionally made sure that we hold
>> ControlFileLock while updating SharedRecoveryInProgress (now
>> SharedRecoveryState after 4e87c48).  The thread for this change [0]
>> has some additional details.
>>
>
> Yeah, I saw that, and ebdf5bf's main intention was to minimize the gap
> between both of them, which was quite big previously.  The comments
> added by the same commit also describe the case where backends can
> write WAL while the control file still says the state is not
> DB_IN_PRODUCTION, which IIUC seems to be acceptable.
> Then the question is what would be wrong if a process can see an
> inconsistent shared memory view for a small window?  Maybe
> wait-promoting might behave unexpectedly; that I have to test.

For your proposed change, I would either leave out this particular
call site or add a "WithLock" version of the function.

void
SetControlFileDBState(DBState state)
{
LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
SetControlFileDBStateWithLock(state);
LWLockRelease(ControlFileLock);
}

void
SetControlFileDBStateWithLock(DBState state)
{
Assert(LWLockHeldByMeInMode(ControlFileLock, LW_EXCLUSIVE));

ControlFile->state = state;
ControlFile->time = (pg_time_t) time(NULL);
UpdateControlFile();
}

>> As far as the patch goes, I'm not sure why SetControlFileDBState()
>> needs to be exported, and TBH I don't know if this change is really a
>> worthwhile improvement.  ISTM the main benefit is that it could help
>> avoid cases where we update the state but not the time.  However,
>> there's still nothing preventing that, and I don't get the idea that
>> it was really a big problem to begin with.
>>
>
> Oh ok, I was working on a different patch[1] where I want to call this
> function from the checkpointer, but I agree exporting the function is not in
> the scope of this patch.

Ah, I was missing this context.  Perhaps this should be included in
the patch set for the other thread, especially if it will need to be
exported.

Nathan



Re: Estimating HugePages Requirements?

2021-09-15 Thread Bossart, Nathan
On 9/14/21, 8:06 PM, "Michael Paquier"  wrote:
> Attached is a refreshed patch (commit message is the same as v9 for
> now), with some minor tweaks and the tests.
>
> Thoughts?

LGTM

+This can be used on a running server for most parameters.  However,
+the server must be shut down for some runtime-computed parameters
+(e.g., , and
+).

nitpick: I think you can remove the comma before the "and" in the list
of examples.

Nathan



Re: decoupling table and index vacuum

2021-09-15 Thread Peter Geoghegan
On Wed, Apr 21, 2021 at 8:21 AM Robert Haas  wrote:
> Now, the reason for this is that when we discover dead TIDs, we only
> record them in memory, not on disk. So, as soon as VACUUM ends, we
> lose all knowledge of what those TIDs were and must rediscover
> them. Suppose we didn't do this, and instead had a "dead TID" fork for
> each table. Suppose further that this worked like a conveyor belt,
> similar to WAL, where every dead TID we store into the fork is
> assigned an identifying 64-bit number that is never reused.
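
A minimal sketch of what one entry on such a conveyor belt could hold,
purely as an illustration of the quoted idea (no such structure exists in
the tree):

typedef struct DeadTidEntry
{
	uint64		entry_id;		/* monotonically increasing, never reused */
	ItemPointerData tid;		/* the dead heap TID itself */
} DeadTidEntry;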

Have you started any work on this project? I think that it's a very good idea.

Enabling index-only scans is a good enough reason to pursue this
project, even on its own. The flexibility that this design offers
allows VACUUM to run far more aggressively, with little possible
downside. It makes it possible for VACUUM to run so frequently that it
rarely dirties pages most of the time -- at least in many important
cases. Imagine if VACUUM almost kept in lockstep with inserters into
an append-mostly table -- that would be great. The main blocker to
making VACUUM behave like that is of course indexes.

Setting visibility map bits during VACUUM can make future vacuuming
cheaper (for the obvious reason), which *also* makes it cheaper to set
*most* visibility map bits as the table is further extended, which in
turn makes future vacuuming cheaper...and so on. This virtuous circle
seems like it might be really important. Especially once you factor in
the cost of dirtying pages a second or a third time. I think that we
can really keep the number of times VACUUM dirties pages under
control, simply by decoupling. Decoupling is key to keeping the costs
to a minimum.

I attached a POC autovacuum logging instrumentation patch that shows
how VACUUM uses *and* sets VM bits. I wrote this for my TPC-C + FSM
work. Seeing both things together, and seeing how both things *change*
over time was a real eye opener for me: it turns out that the master
branch keeps setting and resetting VM bit pages in the two big
append-mostly tables that are causing so much trouble for Postgres
today. What we see right now is pretty disorderly -- the numbers don't
trend in the right direction when they should. But it could be a lot
more orderly, with a little work.

This instrumentation helped me to discover a better approach to
indexing within TPC-C, based on index-only scans [1]. It also made me
realize that it's possible for a table to have real problems with dead
tuple cleanup in indexes, while nevertheless being an effective target
for index-only scans. There is actually no good reason to think that
one condition should preclude the other -- they may very well go
together. You did say this yourself when talking about global indexes,
but there is no reason to think that it's limited to partitioning
cases. The current "ANALYZE dead_tuples statistics" paradigm cannot
recognize when both conditions go together, even though I now think
that it's fairly common. I also like your idea here because it enables
a more qualitative approach, based on recent information for recently
modified blocks -- not whole-table statistics. Averages are
notoriously misleading.

[1] https://github.com/pgsql-io/benchmarksql/pull/16
-- 
Peter Geoghegan


0001-Instrument-pages-skipped-by-VACUUM.patch
Description: Binary data


Re: Hook for extensible parsing.

2021-09-15 Thread Andres Freund
Hi,

On 2021-09-15 16:51:37 -0400, Tom Lane wrote:
> The other problem here is that a simple call-this-instead-of-that
> top-level hook doesn't seem all that useful anyway, because it leaves
> you with the task of duplicating a huge amount of functionality that
> you're then going to make some tweaks within.  That's already an issue
> when you're just thinking about the grammar, and if you have to buy
> into it for parse analysis too, I doubt that it's going to be very
> practical.  If, say, you'd like to support some weird function that
> requires special parsing and analysis rules, I don't see how you get
> that out of this without first duplicating a very large fraction of
> src/backend/parser/.

We do have a small amount of infrastructure around this - the hackery that
plpgsql uses. That's not going to help you with everything, but I think it
should be enough to recognize e.g. additional top-level
statements. Obviously not enough to intercept parsing deeper into a statement,
but at least something.

And parse-analysis for some types of things will be doable with the current
infrastructure, by e.g. handling the new top-level statement in the hook, and
then passing the buck to the normal parse analysis for e.g. expressions in
that.

Obviously not going to get you that far...


> (As a comparison point, we do have a top-level hook for replacing
> the planner; but I have never heard of anyone actually doing so.
> There are people using that hook to *wrap* the planner with some
> before-and-after processing, which is quite a different thing.)

Citus IIRC has some paths that do not end up calling into the standard
planner, but only for a few simplistic cases.


> I don't have any better ideas to offer :-( ... but I very much fear
> that the approach proposed here is a dead end.

I unfortunately don't see a good way forward without changing the way we do
parsing on a more fundamental level :(.

Greetings,

Andres Freund




Re: Polyphase merge is obsolete

2021-09-15 Thread Heikki Linnakangas

On 16/09/2021 00:12, Jaime Casanova wrote:
> On Sat, Sep 11, 2021 at 01:35:27AM -0500, Jaime Casanova wrote:
>> On Wed, Jul 14, 2021 at 06:04:14PM +0300, Heikki Linnakangas wrote:
>>> On 14/07/2021 15:12, vignesh C wrote:
>>>> On Sat, Jan 23, 2021 at 3:49 AM Heikki Linnakangas  wrote:
>>>>> Here's an updated version that fixes one bug:
>>>>>
>>>>> The CFBot was reporting a failure on the FreeBSD system [1]. It turned
>>>>> out to be an out-of-memory issue caused by an underflow bug in the
>>>>> calculation of the size of the tape read buffer size. With a small
>>>>> work_mem size, the memory left for tape buffers was negative, and that
>>>>> wrapped around to a very large number. I believe that was not caught by
>>>>> the other systems, because the other ones had enough memory for the
>>>>> incorrectly-sized buffers anyway. That was the case on my laptop at
>>>>> least. It did cause a big slowdown in the 'tuplesort' regression test
>>>>> though, which I hadn't noticed.
>>>>>
>>>>> The fix for that bug is here as a separate patch for easier review, but
>>>>> I'll squash it before committing.
>>>>
>>>> The patch does not apply on Head anymore, could you rebase and post a
>>>> patch. I'm changing the status to "Waiting for Author".
>>>
>>> Here's a rebased version. I also squashed that little bug fix from previous
>>> patch set.
>>
>> Hi,
>>
>> This patch does not apply, can you submit a rebased version?
>
> BTW, I'm marking this one as "waiting on author"

Thanks, here's another rebase.

- Heikki
>From 501d73488159af3e172e117439054bc06dd09edf Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas 
Date: Wed, 21 Oct 2020 21:57:42 +0300
Subject: [PATCH v5 1/2] Refactor LogicalTapeSet/LogicalTape interface.

All the tape functions, like LogicalTapeRead and LogicalTapeWrite, now
take a LogicalTape as argument, instead of LogicalTapeSet+tape number.
You can create any number of LogicalTapes in a single LogicalTapeSet, and
you don't need to decide the number upfront, when you create the tape set.

This makes the tape management in hash agg spilling in nodeAgg.c simpler.
---
 src/backend/executor/nodeAgg.c | 187 
 src/backend/utils/sort/logtape.c   | 456 -
 src/backend/utils/sort/tuplesort.c | 229 +++
 src/include/nodes/execnodes.h  |   3 +-
 src/include/utils/logtape.h|  37 ++-
 5 files changed, 359 insertions(+), 553 deletions(-)

diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index 39bea204d16..c99a0de4ddb 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -208,7 +208,16 @@
  *
  *	  Spilled data is written to logical tapes. These provide better control
  *	  over memory usage, disk space, and the number of files than if we were
- *	  to use a BufFile for each spill.
+ *	  to use a BufFile for each spill.  We don't know the number of tapes needed
+ *	  at the start of the algorithm (because it can recurse), so a tape set is
+ *	  allocated at the beginning, and individual tapes are created as needed.
+ *	  As a particular tape is read, logtape.c recycles its disk space. When a
+ *	  tape is read to completion, it is destroyed entirely.
+ *
+ *	  Tapes' buffers can take up substantial memory when many tapes are open at
+ *	  once. We only need one tape open at a time in read mode (using a buffer
+ *	  that's a multiple of BLCKSZ); but we need one tape open in write mode (each
+ *	  requiring a buffer of size BLCKSZ) for each partition.
  *
  *	  Note that it's possible for transition states to start small but then
  *	  grow very large; for instance in the case of ARRAY_AGG. In such cases,
@@ -311,27 +320,6 @@
  */
 #define CHUNKHDRSZ 16
 
-/*
- * Track all tapes needed for a HashAgg that spills. We don't know the maximum
- * number of tapes needed at the start of the algorithm (because it can
- * recurse), so one tape set is allocated and extended as needed for new
- * tapes. When a particular tape is already read, rewind it for write mode and
- * put it in the free list.
- *
- * Tapes' buffers can take up substantial memory when many tapes are open at
- * once. We only need one tape open at a time in read mode (using a buffer
- * that's a multiple of BLCKSZ); but we need one tape open in write mode (each
- * requiring a buffer of size BLCKSZ) for each partition.
- */
-typedef struct HashTapeInfo
-{
-	LogicalTapeSet *tapeset;
-	int			ntapes;
-	int		   *freetapes;
-	int			nfreetapes;
-	int			freetapes_alloc;
-} HashTapeInfo;
-
 /*
  * Represents partitioned spill data for a single hashtable. Contains the
  * necessary information to route tuples to the correct partition, and to
@@ -343,9 +331,8 @@ typedef struct HashTapeInfo
  */
 typedef struct HashAggSpill
 {
-	LogicalTapeSet *tapeset;	/* borrowed reference to tape set */
 	int			npartitions;	/* number of partitions */
-	int		   *partitions;		/* spill partition tape numbers */
+	LogicalTape **partitions;	/* spill partition tapes */
 	int64	   *ntuples;		/* number of tuples in each partition */
 	uint32		mask;			/* mask to find partition from hash value */
 

Re: Polyphase merge is obsolete

2021-09-15 Thread Jaime Casanova
On Sat, Sep 11, 2021 at 01:35:27AM -0500, Jaime Casanova wrote:
> On Wed, Jul 14, 2021 at 06:04:14PM +0300, Heikki Linnakangas wrote:
> > On 14/07/2021 15:12, vignesh C wrote:
> > > On Sat, Jan 23, 2021 at 3:49 AM Heikki Linnakangas  
> > > wrote:
> > > > Here's an updated version that fixes one bug:
> > > > 
> > > > The CFBot was reporting a failure on the FreeBSD system [1]. It turned
> > > > out to be an out-of-memory issue caused by an underflow bug in the
> > > > calculation of the size of the tape read buffer size. With a small
> > > > work_mem size, the memory left for tape buffers was negative, and that
> > > > wrapped around to a very large number. I believe that was not caught by
> > > > the other systems, because the other ones had enough memory for the
> > > > incorrectly-sized buffers anyway. That was the case on my laptop at
> > > > least. It did cause a big slowdown in the 'tuplesort' regression test
> > > > though, which I hadn't noticed.
> > > > 
> > > > The fix for that bug is here as a separate patch for easier review, but
> > > > I'll squash it before committing.
> > > 
> > > The patch does not apply on Head anymore, could you rebase and post a
> > > patch. I'm changing the status to "Waiting for Author".
> > 
> > Here's a rebased version. I also squashed that little bug fix from the
> > previous patch set.
> > 
> 
> Hi,
> 
> This patch does not apply, can you submit a rebased version?
> 

BTW, I'm marking this one as "waiting on author"

-- 
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL




Re: Hook for extensible parsing.

2021-09-15 Thread Tom Lane
Andres Freund  writes:
> Agreed - it doesn't make sense to me to have a hook that only replaces raw
> parsing, without also hooking into parse-analysis. ISTM that the least a
> patchset going for a parser hook would have to do is to do sufficient
> restructuring so that one could hook together into both raw parsing and
> analysis. It could still be two callbacks, but perhaps we'd ensure that
> they're both set.

The other problem here is that a simple call-this-instead-of-that
top-level hook doesn't seem all that useful anyway, because it leaves
you with the task of duplicating a huge amount of functionality that
you're then going to make some tweaks within.  That's already an issue
when you're just thinking about the grammar, and if you have to buy
into it for parse analysis too, I doubt that it's going to be very
practical.  If, say, you'd like to support some weird function that
requires special parsing and analysis rules, I don't see how you get
that out of this without first duplicating a very large fraction of
src/backend/parser/.

(As a comparison point, we do have a top-level hook for replacing
the planner; but I have never heard of anyone actually doing so.
There are people using that hook to *wrap* the planner with some
before-and-after processing, which is quite a different thing.)

I don't have any better ideas to offer :-( ... but I very much fear
that the approach proposed here is a dead end.

regards, tom lane




Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-09-15 Thread Andres Freund
On 2021-09-15 10:47:33 -0400, Andrew Dunstan wrote:
> this is an open item for release 14. Is someone going to commit?

Will do. Although I do wish the original committer would have chimed in at
some point...




Re: Hook for extensible parsing.

2021-09-15 Thread Andres Freund
Hi,

On 2021-09-15 16:35:53 -0400, Jonah H. Harris wrote:
> On Wed, Sep 15, 2021 at 3:55 PM Andres Freund  wrote:
> > On 2021-09-15 12:57:00 -0400, Tom Lane wrote:
> > Agreed - it doesn't make sense to me to have a hook that only replaces raw
> > parsing, without also hooking into parse-analysis. ISTM that the least a
> > patchset going for a parser hook would have to do is to do sufficient
> > restructuring so that one could hook together into both raw parsing and
> > analysis. It could still be two callbacks, but perhaps we'd ensure that
> > they're both set.
> >
> 
> This is a bad example as it doesn't require semantic analysis from
> Postgres.

"it"? I assume you mean a different type of join? If so, I'm highly doubtful -
without semantic analysis you can't really handle column references.


> While most of the tools out there tend to do simple replacement,
> this can be done within a custom parser by simply walking its own AST,
> evaluating join conditions against the expression, and rewriting the join
> accordingly. Or, do you have an example that couldn't be done this way
> within a custom parser?

You cannot just "evaluate conditions" in a raw parse tree... You don't even
know which things are functions, columns, etc., nor to what relation a column
belongs.
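
(As a concrete illustration, with simplified node contents rather than the
exact PostgreSQL structures: in the raw tree a reference like "t.c" is
nothing but strings, and only parse analysis, using the catalogs, turns it
into something evaluable.

	/* raw parse tree */
	ColumnRef { fields = ["t", "c"] }

	/* after parse analysis and catalog lookups */
	Var { varno = 1,		/* range-table index of t */
		  varattno = 3,		/* attribute number of c within t */
		  vartype = ... })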

Greetings,

Andres Freund




Re: Hook for extensible parsing.

2021-09-15 Thread Jonah H. Harris
On Wed, Sep 15, 2021 at 3:55 PM Andres Freund  wrote:

> On 2021-09-15 12:57:00 -0400, Tom Lane wrote:
> > That's not what the patch actually does, though.  It only replaces
> > the grammar, not semantic analysis.  So you couldn't associate the
> > (+)-decorated WHERE clause with the appropriate join.  (And no,
> > I will not accept that it's okay to perform catalog lookups in
> > the grammar to get around that.  See comment at the head of gram.y.)
>
> > In general, I'm having a hard time believing that anything very
> > interesting can be done at only the grammar level without changing
> > the parse analysis phase.  That's not unrelated to the restriction
> > that the grammar can't do catalog accesses.  Maybe with some fundamental
> > restructuring, we could get around that issue ... but this patch isn't
> > doing any fundamental restructuring, it's just putting a hook where it's
> > easy to do so.  We've often found that such hooks aren't as useful as
> > they initially seem.
>
> Agreed - it doesn't make sense to me to have a hook that only replaces raw
> parsing, without also hooking into parse-analysis. ISTM that the least a
> patchset going for a parser hook would have to do is to do sufficient
> restructuring so that one could hook together into both raw parsing and
> analysis. It could still be two callbacks, but perhaps we'd ensure that
> they're both set.
>

This is a bad example as it doesn't require semantic analysis from
Postgres. While most of the tools out there tend to do simple replacement,
this can be done within a custom parser by simply walking its own AST,
evaluating join conditions against the expression, and rewriting the join
accordingly. Or, do you have an example that couldn't be done this way
within a custom parser?

-- 
Jonah H. Harris


mem context is not reset between extended stats

2021-09-15 Thread Justin Pryzby
Memory allocation appeared not to be O(1) WRT the number of statistics
objects, which I did not expect.  This is true in v13 (and probably back to v10).

It seems to work fine to reset the memory context within the loop, so long as
the statslist is allocated in the parent context.

|DROP TABLE t; CREATE TABLE t AS SELECT i, i+1 AS a, i+2 AS b, i+3 AS c, i+4 AS d, i+5 AS e FROM generate_series(1,9)i;

|SELECT format('CREATE STATISTICS sta%s (ndistinct) ON a,(1+b),(2+c),(3+d),(4+e) FROM t', a) FROM generate_series(1,9)a\gexec
|SET log_statement_stats=on; SET client_min_messages=debug; ANALYZE t;
|=> 369432 kB max resident size

|SELECT format('CREATE STATISTICS sta%s (ndistinct) ON a,b,c,d,e FROM t', a) FROM generate_series(1,33)a\gexec
|SET log_statement_stats=on; SET client_min_messages=debug; ANALYZE t;
|=> 1284368 kB max resident size
>From 6ae4d059f2ed8baf2af92ec0847458055147383c Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Wed, 15 Sep 2021 14:38:49 -0500
Subject: [PATCH] stx: do not leak memory for each stats obj

---
 src/backend/statistics/extended_stats.c | 12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 63549757ec..d9387ceeb6 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -136,14 +136,14 @@ BuildRelationExtStatistics(Relation onerel, double totalrows,
 	if (!natts)
 		return;
 
+	pg_stext = table_open(StatisticExtRelationId, RowExclusiveLock);
+	statslist = fetch_statentries_for_relation(pg_stext, RelationGetRelid(onerel));
+
 	cxt = AllocSetContextCreate(CurrentMemoryContext,
 "BuildRelationExtStatistics",
 ALLOCSET_DEFAULT_SIZES);
 	oldcxt = MemoryContextSwitchTo(cxt);
 
-	pg_stext = table_open(StatisticExtRelationId, RowExclusiveLock);
-	statslist = fetch_statentries_for_relation(pg_stext, RelationGetRelid(onerel));
-
 	/* report this phase */
 	if (statslist != NIL)
 	{
@@ -245,14 +245,12 @@ BuildRelationExtStatistics(Relation onerel, double totalrows,
 		pgstat_progress_update_param(PROGRESS_ANALYZE_EXT_STATS_COMPUTED,
 	 ++ext_cnt);
 
-		/* free the build data (allocated as a single chunk) */
-		pfree(data);
+		MemoryContextReset(cxt);
 	}
 
-	table_close(pg_stext, RowExclusiveLock);
-
 	MemoryContextSwitchTo(oldcxt);
 	MemoryContextDelete(cxt);
+	table_close(pg_stext, RowExclusiveLock);
 }
 
 /*
-- 
2.17.0
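
For reference, a minimal sketch of the pattern the patch moves to: create a
scratch context once, reset it at the end of each loop iteration, and delete
it when done.  This is illustrative only (a PG-style caller and a statslist
variable are assumed), not the patched function itself:

	MemoryContext cxt;
	MemoryContext oldcxt;
	ListCell   *lc;

	cxt = AllocSetContextCreate(CurrentMemoryContext,
								"per-object scratch",
								ALLOCSET_DEFAULT_SIZES);
	foreach(lc, statslist)
	{
		oldcxt = MemoryContextSwitchTo(cxt);
		/* ... build this object's stats; all allocations land in cxt ... */
		MemoryContextSwitchTo(oldcxt);
		MemoryContextReset(cxt);	/* frees everything from this iteration */
	}
	MemoryContextDelete(cxt);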



Re: Hook for extensible parsing.

2021-09-15 Thread Andres Freund
Hi,

On 2021-09-15 12:57:00 -0400, Tom Lane wrote:
> That's not what the patch actually does, though.  It only replaces
> the grammar, not semantic analysis.  So you couldn't associate the
> (+)-decorated WHERE clause with the appropriate join.  (And no,
> I will not accept that it's okay to perform catalog lookups in
> the grammar to get around that.  See comment at the head of gram.y.)

> In general, I'm having a hard time believing that anything very
> interesting can be done at only the grammar level without changing
> the parse analysis phase.  That's not unrelated to the restriction
> that the grammar can't do catalog accesses.  Maybe with some fundamental
> restructuring, we could get around that issue ... but this patch isn't
> doing any fundamental restructuring, it's just putting a hook where it's
> easy to do so.  We've often found that such hooks aren't as useful as
> they initially seem.

Agreed - it doesn't make sense to me to have a hook that only replaces raw
parsing, without also hooking into parse-analysis. ISTM that the least a
patchset going for a parser hook would have to do is to do sufficient
restructuring so that one could hook together into both raw parsing and
analysis. It could still be two callbacks, but perhaps we'd ensure that
they're both set.

Greetings,

Andres Freund




Re: pg_upgrade test for binary compatibility of core data types

2021-09-15 Thread Andrew Dunstan


On 9/13/21 9:20 AM, Andrew Dunstan wrote:
> On 9/12/21 2:41 PM, Andrew Dunstan wrote:
>> On 9/11/21 8:51 PM, Justin Pryzby wrote:
>>> @Andrew: did you have any comment on this part ?
>>>
>>> |Subject: buildfarm xversion diff
>>> |Forking https://www.postgresql.org/message-id/20210328231433.gi15...@telsasoft.com
>>> |
>>> |I gave a suggestion on how to reduce the "lines of diff" metric almost to
>>> |nothing, allowing a very small "fudge factor", which I think makes this a
>>> |pretty good metric rather than a passable one.
>>>
>> Somehow I missed that. Looks like some good suggestions. I'll
>> experiment. (Note: we can't assume the presence of sed, especially on
>> Windows).
>>
>>
> I tried with the attached patch on crake, which tests back as far as
> 9.2. Here are the diff counts from HEAD:
>
>
> andrew@emma:HEAD $ grep -c '^[+-]' dumpdiff-REL9_* dumpdiff-REL_1*
> dumpdiff-HEAD
> dumpdiff-REL9_2_STABLE:514
> dumpdiff-REL9_3_STABLE:169
> dumpdiff-REL9_4_STABLE:185
> dumpdiff-REL9_5_STABLE:221
> dumpdiff-REL9_6_STABLE:11
> dumpdiff-REL_10_STABLE:11
> dumpdiff-REL_11_STABLE:73
> dumpdiff-REL_12_STABLE:73
> dumpdiff-REL_13_STABLE:73
> dumpdiff-REL_14_STABLE:0
> dumpdiff-HEAD:0
>
>
> I've also attached those non-empty dumpdiff files for information, since
> they are quite small.
>
>
> There is still work to do, but this is promising. Next step: try it on
> Windows.
>
>

It appears to do the right thing on Windows. yay!


We probably need to get smarter about the heuristics, though, e.g. by
taking into account the buildfarm options and the platform. It would
also help a lot if we could make vcregress.pl honor USE_MODULE_DB.
That's on my TODO list, but it just got a lot higher priority.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Getting ERROR "subplan "SubPlan 1" was not initialized" in EXISTS subplan when using for list partition.

2021-09-15 Thread Ranier Vilela
On Wed, Sep 15, 2021 at 16:16, Ranier Vilela  wrote:

> On Wed, Sep 15, 2021 at 15:35, Tom Lane  wrote:
>
>> Ranier Vilela  writes:
>> > On Wed, Sep 15, 2021 at 12:00, Tom Lane  wrote:
>> >> We could, in fact, not bother with removing the no-longer-referenced
>> >> subplans, and it probably wouldn't be all that awful.  But the intent
>> >> of the original patch was to save the executor startup time for such
>> >> subplans, so I wanted to preserve that goal if I could.
>>
>> > I'm sorry if I'm being persistent with this issue, but I'd like to give
>> > it one last try before I let it go.
>> > I modified the way the subplan deletion is done, and it seems to me that
>> > the deletion really happens.
>>
>> It looks like what this fragment is doing is clobbering the List
>> substructure of the AlternativeSubPlan node itself.  That's not
>> going to make any difference, since the whole point of the exercise
>> is that the AlternativeSubPlan gets cut out of the finished tree.
>> But the list that we want to modify, in order to save the
>> executor time, is the root->glob->subplans list (which ends
>> up being PlannedStmt.subplans).  And that's global to the
>> query, so we can't fix it correctly on the basis of a single
>> AlternativeSubPlan.
>>
> Ok, I can see it now.
> But this leads me to conclude that AlternativeSubPlan *asplan is not
> a good approach for this function; it is better to deal directly with
> "root->glob->subplans", which, it seems, works too.
>
Hmm, I was too fast and wrong; it does not work.

postgres=# explain (costs off)
postgres-# select * from exists_tbl t1
postgres-#   where (exists(select 1 from exists_tbl t2 where t1.c1 = t2.c2)
or c3 < 0);
ERROR:  unrecognized node type: 13
postgres=# select * from exists_tbl t1
postgres-#   where (exists(select 1 from exists_tbl t2 where t1.c1 = t2.c2)
or c3 < 0);
ERROR:  unrecognized node type: 13

regards,
Ranier Vilela


Re: Getting ERROR "subplan "SubPlan 1" was not initialized" in EXISTS subplan when using for list partition.

2021-09-15 Thread Ranier Vilela
On Wed, Sep 15, 2021 at 15:35, Tom Lane  wrote:

> Ranier Vilela  writes:
> > Em qua., 15 de set. de 2021 às 12:00, Tom Lane 
> escreveu:
> >> We could, in fact, not bother with removing the no-longer-referenced
> >> subplans, and it probably wouldn't be all that awful.  But the intent
> >> of the original patch was to save the executor startup time for such
> >> subplans, so I wanted to preserve that goal if I could.
>
> > I'm sorry if I'm being persistent with this issue, but I'd like to give
> > it one last try before I let it go.
> > I modified the way the subplan deletion is done, and it seems to me that
> > the deletion really happens.
>
> It looks like what this fragment is doing is clobbering the List
> substructure of the AlternativeSubPlan node itself.  That's not
> going to make any difference, since the whole point of the exercise
> is that the AlternativeSubPlan gets cut out of the finished tree.
> But the list that we want to modify, in order to save the
> executor time, is the root->glob->subplans list (which ends
> up being PlannedStmt.subplans).  And that's global to the
> query, so we can't fix it correctly on the basis of a single
> AlternativeSubPlan.
>
Ok, I can see it now.
But this leads me to conclude that AlternativeSubPlan *asplan is not
a good approach for this function; it is better to deal directly with
"root->glob->subplans", which, it seems, works too.

i = 0;
foreach(lc, root->glob->subplans)
{
	SubPlan    *curplan = (SubPlan *) lfirst(lc);
	Cost		curcost;

	curcost = curplan->startup_cost + num_exec * curplan->per_call_cost;
	if (bestplan == NULL || curcost <= bestcost)
	{
		bestplan = curplan;
		bestcost = curcost;
	}
	i++;
}

if (bestplan != NULL)
{
	foreach(lc, root->glob->subplans)
	{
		SubPlan    *curplan = (SubPlan *) lfirst(lc);

		if (curplan != bestplan)
			lfirst(lc) = NULL;
	}

	j = 0;
	foreach(lc, root->glob->subplans)
	{
		SubPlan    *curplan = (SubPlan *) lfirst(lc);

		if (curplan != NULL)
			j++;
	}

	if (j != i)
	{
		ereport(ERROR,
				(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
				 errmsg("too many subplans: total_plans=%d, remain_plans=%d",
						i, j)));
	}
}

postgres=# explain (costs off)
postgres-# select * from exists_tbl t1
postgres-#   where (exists(select 1 from exists_tbl t2 where t1.c1 = t2.c2)
or c3 < 0);
ERROR:  too many subplans: total_plans=2, remain_plans=1
postgres=# select * from exists_tbl t1
postgres-#   where (exists(select 1 from exists_tbl t2 where t1.c1 = t2.c2)
or c3 < 0);
ERROR:  too many subplans: total_plans=2, remain_plans=1

Anyway, thank you for the explanations.

regards,
Ranier Vilela


Re: automatically generating node support functions

2021-09-15 Thread Peter Eisentraut

On 17.08.21 16:36, Peter Eisentraut wrote:
Here is another set of preparatory patches that clean up various special 
cases and similar in the node support.


This set of patches has been committed.  I'll close this commit fest 
entry and come back with the main patch series in the future.





Re: Getting ERROR "subplan "SubPlan 1" was not initialized" in EXISTS subplan when using for list partition.

2021-09-15 Thread Tom Lane
Ranier Vilela  writes:
> On Wed, Sep 15, 2021 at 12:00, Tom Lane  wrote:
>> We could, in fact, not bother with removing the no-longer-referenced
>> subplans, and it probably wouldn't be all that awful.  But the intent
>> of the original patch was to save the executor startup time for such
>> subplans, so I wanted to preserve that goal if I could.

> I'm sorry if I'm being persistent with this issue, but I'd like to give it
> one last try before I let it go.
> I modified the way the subplan deletion is done, and it seems to me that
> the deletion really happens.

It looks like what this fragment is doing is clobbering the List
substructure of the AlternativeSubPlan node itself.  That's not
going to make any difference, since the whole point of the exercise
is that the AlternativeSubPlan gets cut out of the finished tree.
But the list that we want to modify, in order to save the 
executor time, is the root->glob->subplans list (which ends
up being PlannedStmt.subplans).  And that's global to the
query, so we can't fix it correctly on the basis of a single
AlternativeSubPlan.

regards, tom lane




Re: proposal: possibility to read dumped table's name from file

2021-09-15 Thread Pavel Stehule
On Mon, Sep 13, 2021 at 15:11, Justin Pryzby  wrote:

> On Wed, Jul 28, 2021 at 09:28:17AM +0200, Pavel Stehule wrote:
> > Here is an updated implementation of the filter file, which implements
> > the syntax proposed by you.
>
> Thanks.
>
> If there's any traction for this approach, I have some comments for the
> next revision,
>
> > +++ b/doc/src/sgml/ref/pg_dump.sgml
> > @@ -789,6 +789,56 @@ PostgreSQL documentation
> >
> >   
> >
> > + 
> > +  --filter=<replaceable class="parameter">filename</replaceable>
> > +  
> > +   
> > +Read objects filters from the specified file.
> > +If you use "-" as a filename, the filters are read from stdin.
>
> Say 'Specify "-" to read from stdin'
>
> > +The lines starting with symbol # are ignored.
>
> Remove "The" and "symbol"
>
> > +Previous white chars (spaces, tabs) are not allowed. These
>
> Preceding whitespace characters...
>
> But actually, they are allowed?  But if it needs to be explained, maybe
> they shouldn't be - I don't see the utility of it.
>
> > +static bool
> > +isblank_line(const char *line)
> > +{
> > + while (*line)
> > + {
> > + if (!isblank(*line++))
> > + return false;
> > + }
> > +
> > + return true;
> > +}
>
> I don't think this requires nor justifies having a separate function.
> Either don't support blank lines, or use get_keyword() with size==0 for
> that ?
>
> > + /* Now we expect sequence of two keywords */
> > + if (keyword && is_keyword(keyword, size, "include"))
> > + is_include = true;
> > + else if (keyword && is_keyword(keyword, size, "exclude"))
> > + is_include = false;
> > + else
>
> I think this should first check "if keyword == NULL".
> That could give a more specific error message like "no keyword found",
>
> + exit_invalid_filter_format(fp,
> +                            filename,
> +                            "expected keyword \"include\" or \"exclude\"",
> +                            line.data,
> +                            lineno);
>
> ..and then this one can say "invalid keyword".
>

I fixed (I hope) the mentioned issues. Please check the last patch.

Regards

Pavel


> --
> Justin
>


Re: Getting ERROR "subplan "SubPlan 1" was not initialized" in EXISTS subplan when using for list partition.

2021-09-15 Thread Ranier Vilela
On Wed, Sep 15, 2021 at 12:00, Tom Lane  wrote:

> We could, in fact, not bother with removing the no-longer-referenced
> subplans, and it probably wouldn't be all that awful.  But the intent
> of the original patch was to save the executor startup time for such
> subplans, so I wanted to preserve that goal if I could.
>

I'm sorry if I'm being persistent with this issue, but I'd like to give it
one last try before I let it go.
I modified the way the subplan deletion is done, and it seems to me that
the deletion really happens.

I ran a quick and dirty test to count the remaining subplans.

i = 0;
foreach(lc, asplan->subplans)
{
	SubPlan    *curplan = (SubPlan *) lfirst(lc);
	Cost		curcost;

	curcost = curplan->startup_cost + num_exec * curplan->per_call_cost;
	if (bestplan == NULL || curcost <= bestcost)
	{
		bestplan = curplan;
		bestcost = curcost;
	}
	i++;
}

if (bestplan != NULL)
{
	foreach(lc, asplan->subplans)
	{
		SubPlan    *curplan = (SubPlan *) lfirst(lc);

		if (curplan != bestplan)
			lfirst(lc) = NULL;
	}

	j = 0;
	foreach(lc, asplan->subplans)
	{
		SubPlan    *curplan = (SubPlan *) lfirst(lc);

		if (curplan != NULL)
			j++;
	}

	if (j != i)
	{
		ereport(ERROR,
				(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
				 errmsg("too many subplans: total_plans=%d, remain_plans=%d",
						i, j)));
	}
}

explain (costs off)
postgres-# select * from exists_tbl t1
postgres-#   where (exists(select 1 from exists_tbl t2 where t1.c1 = t2.c2)
or c3 < 0);
ERROR:  too many subplans: total_plans=2, remain_plans=1
postgres=# select * from exists_tbl t1
postgres-#   where (exists(select 1 from exists_tbl t2 where t1.c1 = t2.c2)
or c3 < 0);
ERROR:  too many subplans: total_plans=2, remain_plans=1

I think that works:
   lfirst(lc) = NULL;

regards,
Ranier Vilela


Re: Hook for extensible parsing.

2021-09-15 Thread Julien Rouhaud
On Thu, Sep 16, 2021 at 1:23 AM Julien Rouhaud  wrote:
>
> On Thu, Sep 16, 2021 at 12:57 AM Tom Lane  wrote:
> >
> > > The requirement is that the parser can't leak any
> > > node that the rest of the system doesn't know about, but you can do
> > > what you want inside the parser.
> >
> > That's not what the patch actually does, though.  It only replaces
> > the grammar, not semantic analysis.  So you couldn't associate the
> > (+)-decorated WHERE clause with the appropriate join.  (And no,
> > I will not accept that it's okay to perform catalog lookups in
> > the grammar to get around that.  See comment at the head of gram.y.)
>
> I never said that one should do catalog lookup for that?  What I said
> is that you can do a specific semantic analysis pass in the hook if
> you know that you can have extensible nodes in your parsetree, and you
> can do that with that hook unless I'm missing something?

Ah, now that I think more about it, I think you're talking about
unqualified fields?  I was naively assuming that those wouldn't be
allowed by Oracle, but I guess that was wishful thinking.




Re: proposal: possibility to read dumped table's name from file

2021-09-15 Thread Pavel Stehule
Hi

On Mon, Sep 13, 2021 at 15:01, Daniel Gustafsson  wrote:

> > On 28 Jul 2021, at 09:28, Pavel Stehule  wrote:
> > On Tue, Jul 13, 2021 at 1:16, Tom Lane <t...@sss.pgh.pa.us> wrote:
>
> > Hence I suggest
> >
> > include table PATTERN
> > exclude table PATTERN
> >
> > which ends up being the above but with words not [+-].
>
> One issue with this syntax is that the include keyword can be quite
> misleading, as its semantic interpretation of "include table t" can be
> different from "--table=t".  The former is less clear than the latter
> about the fact that it means "exclude all other tables than ".  It can
> be solved with documentation, but I think that needs to be made clearer.
>

I would welcome any documentation enhancements and fixes.

>
> > Here is an updated implementation of the filter file, which implements
> > the syntax proposed by you.
>
> While it's not the format I would prefer, it does allow for most (all?) use
> cases expressed in this thread with ample arm-twisting applied, so let's go
> ahead from this point and see if we can agree on it (or a version of it).
>
> A few notes on the patch after a first pass over it:
>
> +(include|exclude)[table|schema|foreign_data|data] <replaceable class="parameter">objectname</replaceable>
> Lacks whitespace between keyword and object type.  Also, since these are
> mandatory parameters, shouldn't they be within '{..}' ?
>
> yes, fixed



>
> +   /* skip initial white spaces */
> +   while (isblank(*ptr))
> +   ptr += 1;
> We don't trust isblank() as of 3fd5faed5 due to portability concerns; this
> should probably use a version of the pg_isblank() we already have (and
> possibly move that to src/common/string.c as there now are more consumers).
>
>
I rewrote this part, and I no longer use the isblank() function at all.


>
> +static bool
> +isblank_line(const char *line)
> This could be replaced with a single call to strspn() as we already do for
> parsing the TOC file.
>
>
> +   /* when first char is hash, ignore whole line */
> +   if (*str == '#')
> +   continue;
> I think we should strip leading whitespace before this to allow comment
> lines to start with whitespace; it's easy enough and will make life
> easier for users.
>

Now a comment can start at the first non-blank character, or appear after a filter.

>
>
> +   pg_log_error("invalid format of filter file \"%s\": %s",
> +filename,
> +message);
> +
> +   fprintf(stderr, "%d: %s\n", lineno, line);
> Can't we just include the lineno in the error logging and skip dumping the
> offending line?  Fast-forwarding the pointer to print the offending part is
> less useful than a context marker, and in some cases suboptimal.  With this
> coding, if a pattern is omitted for example the below error message is
> given:
>
>
> pg_dump: error: invalid format of filter file "filter.txt": missing
> object name
> 1:
>
> The error message and the line number in the file should be enough for
> the user to figure out what to fix.
>

I did it as you proposed, but I still think the content can be useful.
Often you are reading dynamically generated files or data from stdin, and
in complex environments it can be hard to regenerate the content for
debugging.


>
> +   if (keyword && is_keyword(keyword, size, "table"))
> +   objecttype = 't';
> Should this use an enum, or at least a struct translating the literal
> keyword to the internal representation?  Magic constants without explicit
> connection to their token counterparts can easily be a cause of bugs.
>
>
fixed


> If I create a table called "a\nb" and try to dump it I get an error in
> parsing the file.  Isn't this supposed to work?
> $ cat filter.txt
> include table "a
> b"
> $ ./bin/pg_dump --filter=filter.txt
> pg_dump: error: invalid format of filter file "filter.txt": unexpected
> chars after object name
> 2:


There was probably some issue, because it should work. I tested a new
version, and this case is now covered by new regression tests. Please check.


>
> Did you consider implementing this in Bison to abstract some of the messier
> parsing logic?
>

Initially no, but now that I think about it, I don't think Bison would
help. The syntax of the filter file is nicely linear. The parser code is
a little larger than minimal, but that is due to nicer error messages; a
raw Bison implementation would raise just "syntax error" plus a position.
I refactored the code, and the scanning, parsing and processing are now
divided into separate routines. The parsing-related code is about 90
lines. In this case I don't think a parser grammar file would carry any
benefit. A grammar is more readable, sure, but we would need to include
Bison, we would need to handle errors, and if we want to raise more
helpful errors than just "syntax error", the code will be longer.

Please check the attached patch.

Regards

Pavel



> --
> Daniel Gustafsson 

Re: Hook for extensible parsing.

2021-09-15 Thread Julien Rouhaud
On Thu, Sep 16, 2021 at 12:57 AM Tom Lane  wrote:
>
> > The requirement is that the parser can't leak any
> > node that the rest of the system doesn't know about, but you can do
> > what you want inside the parser.
>
> That's not what the patch actually does, though.  It only replaces
> the grammar, not semantic analysis.  So you couldn't associate the
> (+)-decorated WHERE clause with the appropriate join.  (And no,
> I will not accept that it's okay to perform catalog lookups in
> the grammar to get around that.  See comment at the head of gram.y.)

I never said that one should do catalog lookup for that?  What I said
is that you can do a specific semantic analysis pass in the hook if
you know that you can have extensible nodes in your parsetree, and you
can do that with that hook unless I'm missing something?

Yes, that's not ideal, but I don't see how it can be worse than writing
some middleware that parses the query and rewrites it to Postgres-style
SQL on the fly so that Postgres can parse it again.  I'm also not sure
how the semantic analysis could be made generally extensible, if
possible at all, so that's the best I can propose.

If that approach is a deal breaker then fine I can accept it.




Re: Hook for extensible parsing.

2021-09-15 Thread Tom Lane
Julien Rouhaud  writes:
> I'm not sure why you couldn't implement an Oracle-style outer join
> with such a hook?

Try it.

> The requirement is that the parser can't leak any
> node that the rest of the system doesn't know about, but you can do
> what you want inside the parser.

That's not what the patch actually does, though.  It only replaces
the grammar, not semantic analysis.  So you couldn't associate the
(+)-decorated WHERE clause with the appropriate join.  (And no,
I will not accept that it's okay to perform catalog lookups in
the grammar to get around that.  See comment at the head of gram.y.)

In general, I'm having a hard time believing that anything very
interesting can be done at only the grammar level without changing
the parse analysis phase.  That's not unrelated to the restriction
that the grammar can't do catalog accesses.  Maybe with some fundamental
restructuring, we could get around that issue ... but this patch isn't
doing any fundamental restructuring, it's just putting a hook where it's
easy to do so.  We've often found that such hooks aren't as useful as
they initially seem.

regards, tom lane




Re: Partial index "microvacuum"

2021-09-15 Thread Peter Geoghegan
On Wed, Sep 15, 2021 at 7:18 AM Marko Tiikkaja  wrote:
> So I've been looking at issues we used to have in production some time
> ago which eventually lead us to migrating away from partial indexes in
> some cases.  In the end, I'm surprised how easy this (or at least a
> similar case) was to reproduce.

> (I've tested this on 9.6, v11 and v13.  13 seems to be a bit better
> here, but not "fixed", I think.)

What about v14? There were significant changes to the
microvacuum/index deletion stuff in that release:

https://www.postgresql.org/docs/14/btree-implementation.html#BTREE-DELETION

-- 
Peter Geoghegan




Re: Getting ERROR "subplan "SubPlan 1" was not initialized" in EXISTS subplan when using for list partition.

2021-09-15 Thread Ranier Vilela
On Wed, Sep 15, 2021 at 12:00, Tom Lane  wrote:

> Ranier Vilela  writes:
> > I would like to ask if this alternative fix (attached) would also solve
> the
> > problem or not.
>
> If I'm reading the patch correctly, that fixes it by failing to drop
> unused subplans at all --- the second loop you have has no external
> effect.
>
> We could, in fact, not bother with removing the no-longer-referenced
> subplans, and it probably wouldn't be all that awful.  But the intent
> of the original patch was to save the executor startup time for such
> subplans, so I wanted to preserve that goal if I could.  The committed
> patch seems small enough and cheap enough to be worthwhile.
>
 Understood, thanks for replying.

regards,
Ranier Vilela


Re: Hook for extensible parsing.

2021-09-15 Thread Julien Rouhaud
On Wed, Sep 15, 2021 at 11:26 PM Tom Lane  wrote:
>
> In the case at hand, what's troubling me is that I don't see any
> particular use in merely substituting a new bison grammar, if it
> still has to produce parse trees that the rest of the system will
> understand.  Yeah, you could make some very simple surface-syntax
> changes that way, but it doesn't seem like you could do anything
> interesting (like, say, support Oracle-style outer join syntax).
> AFAICS, to get to a useful feature, you'd then need to invent an
> extensible Node system (which'd be hugely invasive if it's feasible
> at all), and then probably more things on top of that.  So I'm not
> convinced that you've demonstrated any real-world usefulness.

I agree that this patchset can only implement syntactic sugar,
nothing more (although for utility commands you can do a bit more than
that).  But that's already something people can use, probably mostly
for migration-to-Postgres use cases.

I'm not sure why you couldn't implement an Oracle-style outer join
with such a hook?  The requirement is that the parser can't leak any
node that the rest of the system doesn't know about, but you can do
what you want inside the parser.  And as far as I can see we already
have an extensible node since bcac23de73b, so it seems to me that
there's enough infrastructure to handle this kind of use case.

The main downside is that you'll have to make a first pass to
transform your "custom raw statement" into a valid RawStmt in your
parser, and the system will do another one to transform it into a Query.
But apart from that it should work.  Am I missing something?
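
To make that concrete, a minimal sketch of the shape such a hook could take;
the hook and helper names here are illustrative assumptions, not necessarily
what the patchset actually defines:

	static List *
	my_parser_hook(const char *str, RawParseMode mode)
	{
		List	   *stmts;

		/* hypothetical custom grammar pass emitting ordinary RawStmt nodes */
		if (my_grammar_parse(str, mode, &stmts))
			return stmts;

		/* anything unrecognized falls back to the core grammar */
		return raw_parser(str, mode);
	}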




Re: [Patch] ALTER SYSTEM READ ONLY

2021-09-15 Thread Robert Haas
On Wed, Sep 15, 2021 at 10:32 AM Robert Haas  wrote:
> Putting these changes into 0001 seems to make no sense. It seems like
> they should be part of 0003, or maybe a new 0004 patch.

After looking at this a little bit more, I think it's really necessary
to separate out all of your changes into separate patches at least for
initial review. It's particularly important to separate code movement
changes from other kinds of changes. 0001 was just moving code before,
and so was 0002, but now both are making other changes, which is not
easy to see from looking at the 'git diff' output. For that reason
it's not so easy to understand exactly what you've changed here and
analyze it.

I poked around a little bit at these patches, looking for
perhaps-interesting global variables upon which the code called from
XLogAcceptWrites() would depend with your patches applied. The most
interesting ones seem to be (1) ThisTimeLineID, which you mentioned
and which may be fine but I'm not totally convinced yet, (2)
LocalXLogInsertAllowed, which is probably not broken but I'm thinking
we may want to redesign that mechanism somehow to make it cleaner, and
(3) CheckpointStats, which is called from RemoveXlogFile which is
called from RemoveNonParentXlogFiles which is called from
CleanupAfterArchiveRecovery which is called from XLogAcceptWrites.
This last one is actually pretty weird already in the existing code.
It sort of looks like RemoveXlogFile() only expects to be called from
the checkpointer (or a standalone backend) so that it can update
CheckpointStats and have that just work, but actually it's also called
from the startup process when a timeline switch happens. I don't know
whether the fact that the increments to ckpt_segs_recycled get lost in
that case should be considered an intentional behavior that should be
preserved or an inadvertent mistake.

So I think you've covered most of the necessary things here, with
probably some more discussion needed on whether you've done the right
things...

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: mark the timestamptz variant of date_bin() as stable

2021-09-15 Thread John Naylor
On Wed, Sep 1, 2021 at 3:25 PM Tom Lane  wrote:
>
> John Naylor  writes:
> > On Wed, Sep 1, 2021 at 2:44 PM Tom Lane  wrote:
> >> I see that these two answers are both exactly multiples of 24 hours away
> >> from the given origin.  But if I'm binning on the basis of "days" or
> >> larger units, I would sort of expect to get local midnight, and I'm not
> >> getting that once I cross a DST boundary.
>
> > Hmm, that seems like a reasonable expectation. I can get local midnight
> > if I recast to timestamp:
>
> > # select date_bin('1 day', '2021-11-10 00:00 +00'::timestamptz::timestamp,
> >                   '2021-09-01 00:00 -04'::timestamptz::timestamp);
> >       date_bin
> > ---------------------
> >  2021-11-09 00:00:00
> > (1 row)
>
> Yeah, and then back to timestamptz if that's what you really need :-(
>
> > It's a bit unintuitive, though.
>
> Agreed.  If we keep it like this, adding some documentation around
> the point would be a good idea I think.

Attached is a draft doc patch using the above examples. Is there anything
else that would be useful to mention?
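
One candidate: the full round trip, binning in local time and then casting
the result back to timestamptz, as suggested upthread.  A sketch (with a
DST-observing TimeZone setting such as America/New_York, this should yield
local midnight, 2021-11-09 00:00:00-05):

# select date_bin('1 day', '2021-11-10 00:00 +00'::timestamptz::timestamp,
                  '2021-09-01 00:00 -04'::timestamptz::timestamp)::timestamptz;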

--
John Naylor
EDB: http://www.enterprisedb.com


v1-doc-date-bin-dst.patch
Description: Binary data


Re: Trigger position

2021-09-15 Thread Marcos Pegoraro
We could run triggers using position only; that way we don't spend those few
cycles determining the ordering.
At creation time we populate the position, even if it's not set, so at first
the positions will match the trigger-name ordering. When a user changes a
trigger's position, we add 1 to the positions of the triggers that follow.
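
(For reference, the Firebird-style syntax being used as a model looks
roughly like this; an illustrative sketch only, not PostgreSQL syntax:

CREATE TRIGGER trg_accounts_audit FOR accounts
ACTIVE BEFORE UPDATE POSITION 10
AS
BEGIN
  /* trigger body */
END)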

regards,
Marcos

On Wed, Sep 15, 2021 at 12:13, Euler Taveira  wrote:

> On Wed, Sep 15, 2021, at 10:51 AM, Alvaro Herrera wrote:
>
> In a computer system, alphabet letters are just a different way to
> present numbers, so you just choose ASCII letters that match what you
> want.  You can use "AA_first_trigger", "BB_second_trigger",
> "AB_nope_this_is_second" and you'll be fine; you can do
> "AAB_oops_really_second" afterwards, and so on.  The integer numbering
> system doesn't seem very useful/flexible when seen in this light.
>
> ... or renumber all trigger positions in a single transaction. I agree that
> letters are more flexible than numbers but some users are number-oriented.
>
> I'm afraid an extra mechanism to determine the order to fire triggers will
> confuse programmers if someone decides to use both. Besides that, we have
> to
> expend a few cycles to determine the exact trigger execution order.
>
>
> --
> Euler Taveira
> EDB   https://www.enterprisedb.com/
>
>


Re: Trigger position

2021-09-15 Thread Chapman Flack
On 09/15/21 06:28, Marcos Pegoraro wrote:

> Oracle and SQL Server have FOLLOWS and PRECEDES when defining trigger
> execution order. Firebird has POSITION, which I like more.

Between those two, I think my vote would come down the other way,
assuming FOLLOWS and PRECEDES work the way I am guessing they do:
you would be specifying the firing order between triggers whose
relative order you care about, and leaving it unspecified between
triggers whose relative order doesn't matter.
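
(For concreteness, assuming they work as in Oracle, a rough illustration;
the syntax and object names are examples, not PostgreSQL syntax:

CREATE TRIGGER audit_trg
  BEFORE UPDATE ON accounts
  FOR EACH ROW
  FOLLOWS validate_trg  -- fire after validate_trg; order vs. others unspecified
BEGIN
  NULL;  -- trigger body
END;)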

I find that an appealing general solution that allows the machine
to find a satisfactory order, and is less fussy than trying to manually
create a total order for all of the triggers (even those whose relative
order may not matter) by arbitrarily fussing with names or integers.

It resembles similar constructs in lots of other things, like the way
grammar precedences are specified [0] in SDF.

It may be objected that this makes a trigger order that is less
fully determined in advance, and can lead to issues that are harder
to reason out if you forgot to specify a relative order that matters.

But balancing that is that it may be easier in general to reason about
just the relative orders that matter, undistracted by any that don't.
In some settings, leaving unspecified the ones that don't may increase
opportunities for optimization. (Not that I have any specific optimizations
in mind for this setting.)

One could even think about a test mode that would deliberately randomize
the relative order between triggers where it hasn't been specified.

Regards,
-Chap


[0]
https://www.metaborg.org/en/latest/source/langdev/meta/lang/sdf3/reference.html#priorities




Re: Trigger position

2021-09-15 Thread Alvaro Herrera
On 2021-Sep-15, Pavel Stehule wrote:

> Triggers that depend on execution order are pretty much hell. They are a
> clear signal of crazy design and of overusing triggers.

Yeah.  The only case I've seen where order of triggers was important
(beyond the "before" / "after" classification) is where you have
something that you need to ensure runs before the FK triggers.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"XML!" Exclaimed C++.  "What are you doing here? You're not a programming
language."
"Tell that to the people who use me," said XML.
https://burningbird.net/the-parable-of-the-languages/




Re: EXPLAIN(VERBOSE) to CTE with SEARCH BREADTH FIRST fails

2021-09-15 Thread Tom Lane
Andrew Dunstan  writes:
> On 9/10/21 10:10 AM, torikoshia wrote:
>> I've attached the patch for the changes made for this test for your
>> reference, but I'm not sure it's appropriate for creating a new
>> CoercionForm to fix the issue..

> This is listed as an open item for release 14. Is it planned to commit
> the patch? If not, we should close the item.

I do not think that patch is a proper solution, but we do need to do
something about this.

regards, tom lane




Re: Hook for extensible parsing.

2021-09-15 Thread Tom Lane
Jim Mlodgenski  writes:
> On Wed, Sep 15, 2021 at 9:25 AM Simon Riggs
>  wrote:
>> The general rule has always been that we don't just put hooks in, we
>> always require an in-core use for those hooks. I was reminded of that
>> myself recently.

> That's not historically what has happened. There are several hooks with
> no in core use such as emit_log_hook and ExplainOneQuery_hook.

Yeah.  I think the proper expectation is that there be a sufficiently
worked-out example to convince us that the proposed hooks have real-world
usefulness, and are not missing any basic requirements to make them do
something useful.  Whether the example ends up in our tree is a
case-by-case decision.

In the case at hand, what's troubling me is that I don't see any
particular use in merely substituting a new bison grammar, if it
still has to produce parse trees that the rest of the system will
understand.  Yeah, you could make some very simple surface-syntax
changes that way, but it doesn't seem like you could do anything
interesting (like, say, support Oracle-style outer join syntax).
AFAICS, to get to a useful feature, you'd then need to invent an
extensible Node system (which'd be hugely invasive if it's feasible
at all), and then probably more things on top of that.  So I'm not
convinced that you've demonstrated any real-world usefulness.

regards, tom lane




Re: Trigger position

2021-09-15 Thread Pavel Stehule
On Wed, Sep 15, 2021 at 17:14, Euler Taveira  wrote:

> On Wed, Sep 15, 2021, at 10:51 AM, Alvaro Herrera wrote:
>
> In a computer system, alphabet letters are just a different way to
> present numbers, so you just choose ASCII letters that match what you
> want.  You can use "AA_first_trigger", "BB_second_trigger",
> "AB_nope_this_is_second" and you'll be fine; you can do
> "AAB_oops_really_second" afterwards, and so on.  The integer numbering
> system doesn't seem very useful/flexible when seen in this light.
>
> ... or renumber all trigger positions in a single transaction. I agree that
> letters are more flexible than numbers but some users are number-oriented.
>
> I'm afraid an extra mechanism to determine the order to fire triggers will
> confuse programmers if someone decides to use both. Besides that, we have
> to
> expend a few cycles to determine the exact trigger execution order.
>

Triggers that depend on execution order are pretty much hell. They are a
clear signal of crazy design and of overusing triggers.

Personally I would prefer not to have any such feature, simply as a strong
signal to developers: don't do this. Unfortunately (but good for business),
a lot of applications migrated from Oracle use this terrible style. I like
PL/SQL, but the ugliest code I have seen was in PL/SQL. So this feature may
be necessary for migrations from Oracle, but I don't see a reason to make
it more visible.

Regards

Pavel


>
> --
> Euler Taveira
> EDB   https://www.enterprisedb.com/
>
>


Re: Trigger position

2021-09-15 Thread Euler Taveira
On Wed, Sep 15, 2021, at 10:51 AM, Alvaro Herrera wrote:
> In a computer system, alphabet letters are just a different way to
> present numbers, so you just choose ASCII letters that match what you
> want.  You can use "AA_first_trigger", "BB_second_trigger",
> "AB_nope_this_is_second" and you'll be fine; you can do
> "AAB_oops_really_second" afterwards, and so on.  The integer numbering
> system doesn't seem very useful/flexible when seen in this light.
... or renumber all trigger positions in a single transaction. I agree that
letters are more flexible than numbers but some users are number-oriented.

I'm afraid an extra mechanism to determine the order to fire triggers will
confuse programmers if someone decides to use both. Besides that, we have to
expend a few cycles to determine the exact trigger execution order.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Trigger position

2021-09-15 Thread Marcos Pegoraro
When I was writing my initial email I was remembering exactly this: my
first BASIC programs.
I would like this feature because I sometimes end up with a mess of triggers
when one trigger function is fired on several tables and needs to fire first
on this table but not on that one. And since triggers usually have the same
names as their functions, for that one table I have to use a different name
just to make it fire first.


On Wed, Sep 15, 2021 at 10:51, Alvaro Herrera <alvhe...@alvh.no-ip.org> wrote:

> On 2021-Sep-15, Marcos Pegoraro wrote:
>
> > This problem can raise ... there is a trigger foo using position 1,
> please
> > choose another
>
> This is reminiscent of the old BASIC programming language, where you
> eventually learn to choose line numbers that aren't consecutive, so that
> if you later have to add lines in between you have some room to do so.
> (This happens when modifying a program sufficient times you are forced
> to renumber old lines where you want to add new lines that no longer fit
> in the sequence.)  It's a pretty bad system.
>
> In a computer system, alphabet letters are just a different way to
> present numbers, so you just choose ASCII letters that match what you
> want.  You can use "AA_first_trigger", "BB_second_trigger",
> "AB_nope_this_is_second" and you'll be fine; you can do
> "AAB_oops_really_second" afterwards, and so on.  The integer numbering
> system doesn't seem very useful/flexible when seen in this light.
>
> --
> Álvaro Herrera  Valdivia, Chile  —
> https://www.EnterpriseDB.com/
> "Nunca confiaré en un traidor.  Ni siquiera si el traidor lo he creado yo"
> (Barón Vladimir Harkonnen)
>


Re: Hook for extensible parsing.

2021-09-15 Thread Pavel Stehule
On Wed, Sep 15, 2021 at 16:55, Julien Rouhaud  wrote:

> On Wed, Sep 15, 2021 at 10:14 PM Jim Mlodgenski  wrote:
> >
> > On Wed, Sep 15, 2021 at 9:25 AM Simon Riggs
> >  wrote:
> > >
> > > The general rule has always been that we don't just put hooks in, we
> > > always require an in-core use for those hooks. I was reminded of that
> > > myself recently.
> > >
> > That's not historically what has happened. There are several hooks with
> > no in core use such as emit_log_hook and ExplainOneQuery_hook. The recent
> > openssl_tls_init_hook only has a usage in src/test/modules
>
> Yes, I also think that it's not a strict requirement that all hooks
> have a caller in the core, even if it's obviously better if that's the
> case.
>
> > > What we need is something in core that actually makes use of this. The
> > > reason for that is not politics, but a simple test of whether the
> > > feature makes sense AND includes all required bells and whistles to be
> > > useful in the real world.
> > >
> > Agreed. There should be something in src/test/modules to exercise this
> > but probably more to flush things out. Maybe extending adminpack to use
> > this so if enabled, it can use syntax like:
> > FILE READ 'foo.txt'
>
> For this hook, maintaining a real alternative parser seems like way
> too much trouble to justify an in-core user.  The fact that many
> people have asked for such a feature over the year should be enough to
> justify the use case.  We could try to invent some artificial need
> like the one you suggest for adminpack, but it also feels like a waste
> of resources.
>
> As far as I'm concerned a naive strcmp-based parser in
> src/test/modules should be enough to validate that the hook is
> working, there's no need for more.  In any case if the only
> requirement for it to be committed is to write a real parser, whether
> in contrib or src/test/modules, I'll be happy to do it.
>
> > > Core doesn't need a LOL parser and I don't think we should commit that.
> > >
> > +1
>
> I entirely agree, and I repeatedly mentioned in that thread that I did
> *not* want to add this parser in core.  The only purpose of patches
> 0002 and 0004 is to make the third-party bison-based parser
> requirements less abstract, and demonstrate that this approach can
> successfully make two *radically different* parsers cohabit.
>
> > > If we do this, I think it should have CREATE LANGUAGE support, so that
> > > each plugin can be seen as an in-core object and allow security around
> > > which users can execute which language types, allow us to switch
> > > between languages and have default languages for specific users or
> > > databases.
> > >
> > This hook allows extension developers to supplement syntax in addition
> > to adding a whole new language allowing the extension to appear more
> > native to the end user. For example, pglogical could use this to add
> > syntax to do a CREATE NODE instead of calling the function create_node.
> > Adding CREATE LANGUAGE support around this would just be for a narrow
> > set of use cases where a new language is added.
>
> Yes, this hook can be used to implement multiple things as I mentioned
> my initial email.  Additionally, if this is eventually committed I'd
> like to add support for CREATE HYPOTHETICAL INDEX grammar in hypopg.
> Such a parser would only support one command (that extends an existing
> one), so it can't really be called a language.  Of course if would be
> better to have the core parser accept a CREATE [ HYPOTHETICAL ] INDEX
> and setup a flag so that third-parrty module can intercept this
> utility command, but until that happens I could provide that syntactic
> sugar for my users as long as I'm motivated enough to write this
> parser.
>
>
There were nice streaming databases, but they ended because maintaining a
fork is too expensive.  And without direct SQL (without the possibility of
enhancing the parser), commands based on a function-call API were not as
readable or flexible as SQL. Sometimes we really don't want to replace
PostgreSQL, just to enhance its main interface for extensions.


Also, a hook based approach is still compatible with per database /
> role configuration.  It can be done either via specific
> session_preload_libraries, or via a custom GUC if for some reason the
> module requires to be in shared_preload_libraries.
>
>
>


Re: EXPLAIN(VERBOSE) to CTE with SEARCH BREADTH FIRST fails

2021-09-15 Thread Andrew Dunstan


On 9/10/21 10:10 AM, torikoshia wrote:
> On 2021-09-09 19:03, Peter Eisentraut wrote:
>> On 07.09.21 20:31, Tom Lane wrote:
>>> torikoshia  writes:
 While working on [1], we found that EXPLAIN(VERBOSE) to a CTE with
 SEARCH BREADTH FIRST ends up in ERROR.
>>>
>>> Yeah.  It's failing here:
>>>
>>>   * We're deparsing a Plan tree so we don't have a CTE
>>>   * list.  But the only place we'd see a Var directly
>>>   * referencing a CTE RTE is in a CteScan plan node, and we
>>>   * can look into the subplan's tlist instead.
>>>
>>>  if (!dpns->inner_plan)
>>>      elog(ERROR, "failed to find plan for CTE %s",
>>>           rte->eref->aliasname);
>>>
>>> The problematic Var is *not* in a CteScan plan node; it's in a
>>> WorkTableScan node.  It's not clear to me whether this is a bug
>>> in the planner's handling of SEARCH BREADTH FIRST, or if the plan
>>> is as-intended and ruleutils.c is failing to cope.
>>
>> The search clause is resolved by the rewriter, so it's unlikely that
>> the planner is doing something wrong.  Either the rewriting produces
>> something incorrect (but then one might expect that the query results
>> would be wrong), or the structures constructed by rewriting are not
>> easily handled by ruleutils.c.
>>
>> If we start from the example in the documentation
>> :
>>
>>
>> """
>> WITH RECURSIVE search_tree(id, link, data, depth) AS (
>>     SELECT t.id, t.link, t.data, 0
>>     FROM tree t
>>   UNION ALL
>>     SELECT t.id, t.link, t.data, depth + 1
>>     FROM tree t, search_tree st
>>     WHERE t.id = st.link
>> )
>> SELECT * FROM search_tree ORDER BY depth;
>>
>> To get a stable sort, add data columns as secondary sorting columns.
>> """
>>
>> In order to handle that part about the stable sort, the query
>> constructed internally is something like
>>
>> WITH RECURSIVE search_tree(id, link, data, seq) AS (
>>     SELECT t.id, t.link, t.data, ROW(0, id, link)
>>     FROM tree t
>>   UNION ALL
>>     SELECT t.id, t.link, t.data, ROW(seq.depth + 1, id, link)
>>     FROM tree t, search_tree st
>>     WHERE t.id = st.link
>> )
>> SELECT * FROM search_tree ORDER BY seq;
>>
>> The bit "seq.depth" isn't really valid when typed in like that, I
>> think, but of course internally this is all wired together with
>> numbers rather than identifiers.  I suspect that that is what
>> ruleutils.c trips over.
>
> Thanks for your advice, it seems right.
>
> EXPLAIN VERBOSE completes without error when I assigned a testing-purpose
> CoercionForm to 'seq.depth + 1'.
>
> I've attached the patch with the changes made for this test for your
> reference, but I'm not sure it's appropriate to create a new
> CoercionForm to fix the issue...



This is listed as an open item for release 14. Is it planned to commit
the patch? If not, we should close the item.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Getting ERROR "subplan "SubPlan 1" was not initialized" in EXISTS subplan when using for list partition.

2021-09-15 Thread Tom Lane
Ranier Vilela  writes:
> I would like to ask if this alternative fix (attached) would also solve the
> problem or not.

If I'm reading the patch correctly, that fixes it by failing to drop
unused subplans at all --- the second loop you have has no external
effect.

We could, in fact, not bother with removing the no-longer-referenced
subplans, and it probably wouldn't be all that awful.  But the intent
of the original patch was to save the executor startup time for such
subplans, so I wanted to preserve that goal if I could.  The committed
patch seems small enough and cheap enough to be worthwhile.

regards, tom lane




Re: Hook for extensible parsing.

2021-09-15 Thread David Fetter
On Wed, Sep 15, 2021 at 02:25:17PM +0100, Simon Riggs wrote:
> On Sat, 1 May 2021 at 08:24, Julien Rouhaud  wrote:
> 
> > Being able to extend core parser has been requested multiple times, and 
> > AFAICT
> > all previous attempts were rejected not because this isn't wanted but 
> > because
> > the proposed implementations required plugins to reimplement all of the core
> > grammar with their own changes, as bison generated parsers aren't 
> > extensible.
> >
> > I'd like to propose an alternative approach, which is to allow multiple 
> > parsers
> > to coexist, and let third-party parsers optionally fallback on the core
> > parsers.
> 
> Yes, that approach has been discussed by many people, most recently
> around the idea to create a fast-path grammar to make the most
> frequently used SQL parse faster.
> 
> > 0002 implements a lame "sqlol" parser, based on LOLCODE syntax, with only the
> > ability to produce "select [col, ] col FROM table" parsetree, for testing
> > purpose.  I chose it to ensure that everything works properly even with a
> > totally different grammar that has different keywords, which doesn't even
> > end statements with a semicolon but with a plain keyword.
> 
> The general rule has always been that we don't just put hooks in, we
> always require an in-core use for those hooks. I was reminded of that
> myself recently.
> 
> What we need is something in core that actually makes use of this. The
> reason for that is not politics, but a simple test of whether the
> feature makes sense AND includes all required bells and whistles to be
> useful in the real world.
> 
> Core doesn't need a LOL parser and I don't think we should commit that.

It doesn't, but it very likely needs something people can use when
they create a new table AM, and we should use the hook in core to
implement the heap* table AM to make sure the thing works at DDL
time.

> If we do this, I think it should have CREATE LANGUAGE support, so
> that each plugin can be seen as an in-core object and allow security
> around which users can execute which language types, allow us to
> switch between languages and have default languages for specific
> users or databases.

That's a great idea, but I must be missing something important as it
relates to parser hooks. Could you connect those a little more
explicitly?

Best,
David.

* It's not actually a heap in the sense that the term is normally used
in computing. I'd love to find out how it got to have this name and
document same so others aren't also left wondering.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: Hook for extensible parsing.

2021-09-15 Thread Julien Rouhaud
On Wed, Sep 15, 2021 at 10:14 PM Jim Mlodgenski  wrote:
>
> On Wed, Sep 15, 2021 at 9:25 AM Simon Riggs
>  wrote:
> >
> > The general rule has always been that we don't just put hooks in, we
> > always require an in-core use for those hooks. I was reminded of that
> > myself recently.
> >
> That's not historically what has happened. There are several hooks with
> no in core use such as emit_log_hook and ExplainOneQuery_hook. The recent
> openssl_tls_init_hook only has a usage in src/test/modules

Yes, I also think that it's not a strict requirement that all hooks
have a caller in the core, even if it's obviously better if that's the
case.

> > What we need is something in core that actually makes use of this. The
> > reason for that is not politics, but a simple test of whether the
> > feature makes sense AND includes all required bells and whistles to be
> > useful in the real world.
> >
> Agreed. There should be something in src/test/modules to exercise this
> but probably more to flesh things out. Maybe extending adminpack to use
> this so if enabled, it can use syntax like:
> FILE READ 'foo.txt'

For this hook, maintaining a real alternative parser seems like way
too much trouble to justify an in-core user.  The fact that many
people have asked for such a feature over the years should be enough to
justify the use case.  We could try to invent some artificial need
like the one you suggest for adminpack, but it also feels like a waste
of resources.

As far as I'm concerned a naive strcmp-based parser in
src/test/modules should be enough to validate that the hook is
working, there's no need for more.  In any case if the only
requirement for it to be committed is to write a real parser, whether
in contrib or src/test/modules, I'll be happy to do it.

> > Core doesn't need a LOL parser and I don't think we should commit that.
> >
> +1

I entirely agree, and I repeatedly mentioned in that thread that I did
*not* want to add this parser in core.  The only purpose of patches
0002 and 0004 is to make the third-party bison based parser
requirements less abstract, and demonstrate that this approach can
successfully make two *radically different* parsers cohabit.

> > If we do this, I think it should have CREATE LANGUAGE support, so that
> > each plugin can be seen as an in-core object and allow security around
> > which users can execute which language types, allow us to switch
> > between languages and have default languages for specific users or
> > databases.
> >
> This hook allows extension developers to supplement syntax in addition
> to adding a whole new language allowing the extension to appear more
> native to the end user. For example, pglogical could use this to add
> syntax to do a CREATE NODE instead of calling the function create_node.
> Adding CREATE LANGUAGE support around this would just be for a narrow
> set of use cases where a new language is added.

Yes, this hook can be used to implement multiple things as I mentioned
in my initial email.  Additionally, if this is eventually committed I'd
like to add support for CREATE HYPOTHETICAL INDEX grammar in hypopg.
Such a parser would only support one command (that extends an existing
one), so it can't really be called a language.  Of course it would be
better to have the core parser accept a CREATE [ HYPOTHETICAL ] INDEX
and set up a flag so that third-party modules can intercept this
utility command, but until that happens I could provide that syntactic
sugar for my users as long as I'm motivated enough to write this
parser.
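
To make that concrete, the sugar would let hypopg users write something
like this (hypothetical syntax -- it exists nowhere today; the function
call in the comment is hypopg's current interface):

-- hypothetical syntax; roughly equivalent to today's
-- SELECT hypopg_create_index('CREATE INDEX ON t (col)');
CREATE HYPOTHETICAL INDEX ON t (col);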

Also, a hook based approach is still compatible with per database /
role configuration.  It can be done either via specific
session_preload_libraries, or via a custom GUC if for some reason the
module requires to be in shared_preload_libraries.




Re: [PATCH] Proof of concept for GUC improvements

2021-09-15 Thread David Christensen
Updated version attached with comment fixes and updated for new GUC.


special-guc-values-v2.patch
Description: Binary data


Re: Column Filtering in Logical Replication

2021-09-15 Thread Euler Taveira
On Wed, Sep 15, 2021, at 9:19 AM, vignesh C wrote:
> I have extracted the parser code and attached it here, so that it will
> be easy to go through. We wanted to support the following syntax as in
> [1]:
> CREATE PUBLICATION pub1 FOR
> TABLE t1,t2,t3, ALL TABLES IN SCHEMA s1,s2,
> SEQUENCE seq1,seq2, ALL SEQUENCES IN SCHEMA s3,s4;
I don't like this syntax. It seems like too much syntax for the same purpose in a
single command. If you look at the GRANT command, whose ALL TABLES IN SCHEMA syntax
was extracted, you can use ON TABLE or ON ALL TABLES IN SCHEMA; you cannot use
both. This proposal allows duplicate objects (of course, you can ignore them but
the current code prevents duplicates -- see publication_add_relation).
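
For comparison, GRANT keeps the two forms as separate statements (role and
object names here are illustrative):

GRANT SELECT ON TABLE t1 TO some_role;
GRANT SELECT ON ALL TABLES IN SCHEMA s1 TO some_role;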

IMO you should mimic the GRANT grammar and have multiple commands for row
filtering, column filtering, and ALL FOO IN SCHEMA. The filtering patches only
use the FOR TABLE syntax. The latter won't have filtering syntax. Having said
that, the grammar should be:

CREATE PUBLICATION name
  [ FOR TABLE [ ONLY ] table_name [ * ] [ (column_name [, ...] ) ] [ WHERE (expression) ] [, ...]
    | FOR ALL TABLES
    | FOR ALL TABLES IN SCHEMA schema_name [, ...]
    | FOR ALL SEQUENCES IN SCHEMA schema_name [, ...] ]
  [ WITH ( publication_parameter [= value] [, ... ] ) ]

ALTER PUBLICATION name ADD TABLE [ ONLY ] table_name [ * ] [ (column_name [, ...] ) ] [ WHERE (expression) ]
ALTER PUBLICATION name ADD ALL TABLES IN SCHEMA schema_name [, ...]
ALTER PUBLICATION name ADD ALL SEQUENCES IN SCHEMA schema_name [, ...]

ALTER PUBLICATION name SET TABLE [ ONLY ] table_name [ * ] [ (column_name [, ...] ) ] [ WHERE (expression) ]
ALTER PUBLICATION name SET ALL TABLES IN SCHEMA schema_name [, ...]
ALTER PUBLICATION name SET ALL SEQUENCES IN SCHEMA schema_name [, ...]

ALTER PUBLICATION name DROP TABLE [ ONLY ] table_name [ * ]
ALTER PUBLICATION name DROP ALL TABLES IN SCHEMA schema_name [, ...]
ALTER PUBLICATION name DROP ALL SEQUENCES IN SCHEMA schema_name [, ...]
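
As an illustration, a session using the proposed grammar (none of this
syntax exists yet) would look like:

-- proposed syntax only, mirroring the synopsis above
CREATE PUBLICATION pub1 FOR ALL TABLES IN SCHEMA s1, s2;
ALTER PUBLICATION pub1 ADD TABLE t1 (col_a, col_b) WHERE (col_a > 0);
ALTER PUBLICATION pub1 ADD ALL SEQUENCES IN SCHEMA s3;
ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA s2;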

Opinions?


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-09-15 Thread Andrew Dunstan


On 9/8/21 12:11 AM, Laurenz Albe wrote:
> On Fri, 2021-09-03 at 17:04 -0700, Andres Freund wrote:
>> Here's how I think that would look. While writing up this draft, I found
>> two more issues:
>>
>> - On windows / 32 bit systems, the session time would overflow if idle for
>>   longer than ~4300s. long is only 32 bit. Easy to fix obviously.
>> - Right now walsenders, including database connected walsenders, are not
>>   reported in connection stats. That doesn't seem quite right to me.
>>
>> In the patch I made the message for connecting an explicitly reported 
>> message,
>> that seems cleaner, because it then happens at a clearly defined point. I
>> didn't do the same for disconnecting, but perhaps that would be better? Then
>> we could get rid of the whole pgStatSessionEndCause variable.
> I have gone over your patch and made the following changes:
>
> - cache the last report time in a static variable pgLastSessionReportTime
> - add a comment to explain why we only track normal backends
> - pgindent
> - an attempt at a commit message
>

Hi,


this is an open item for release 14. Is someone going to commit?


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Getting ERROR "subplan "SubPlan 1" was not initialized" in EXISTS subplan when using for list partition.

2021-09-15 Thread Ranier Vilela
On Tue, Sep 14, 2021 at 17:11, Tom Lane  wrote:

> Zhihong Yu  writes:
> > In the fix, isUsedSubplan is used to tell whether any given subplan is
> used.
> > Since only one subplan is used, I wonder if the array can be replaced by
> > specifying the subplan is used.
>
> That doesn't seem particularly more convenient.  The point of the bool
> array is to merge the results from examination of (possibly) many
> AlternativeSubPlans.
>
Impressively quick fix, but IMHO it's a bit excessive.

I would like to ask if this alternative fix (attached) would also solve the
problem or not.
Apparently, it passes the proposed test and in regress.

postgres=# create temp table exists_tbl (c1 int, c2 int, c3 int) partition
by list (c1);
CREATE TABLE
postgres=# create temp table exists_tbl_null partition of exists_tbl for
values in (null);
CREATE TABLE
postgres=# create temp table exists_tbl_def partition of exists_tbl default;
CREATE TABLE
postgres=# insert into exists_tbl select x, x/2, x+1 from
generate_series(0,10) x;
INSERT 0 11
postgres=# analyze exists_tbl;
ANALYZE
postgres=# explain (costs off)
postgres-# explain (costs off);
ERROR:  syntax error at or near "explain"
LINE 2: explain (costs off);
^
postgres=# explain (costs off)
postgres-# select * from exists_tbl t1
postgres-#   where (exists(select 1 from exists_tbl t2 where t1.c1 = t2.c2)
or c3 < 0);
  QUERY PLAN
-----------------------------------------------
 Append
   ->  Seq Scan on exists_tbl_null t1_1
 Filter: ((SubPlan 1) OR (c3 < 0))
 SubPlan 1
   ->  Append
 ->  Seq Scan on exists_tbl_null t2_1
   Filter: (t1_1.c1 = c2)
 ->  Seq Scan on exists_tbl_def t2_2
   Filter: (t1_1.c1 = c2)
   ->  Seq Scan on exists_tbl_def t1_2
 Filter: ((hashed SubPlan 2) OR (c3 < 0))
 SubPlan 2
   ->  Append
 ->  Seq Scan on exists_tbl_null t2_4
 ->  Seq Scan on exists_tbl_def t2_5
(15 rows)


postgres=# select * from exists_tbl t1
postgres-#   where (exists(select 1 from exists_tbl t2 where t1.c1 = t2.c2)
or c3 < 0);
 c1 | c2 | c3
----+----+----
  0 |  0 |  1
  1 |  0 |  2
  2 |  1 |  3
  3 |  1 |  4
  4 |  2 |  5
  5 |  2 |  6
(6 rows)

regards,
Ranier Vilela


fix_subplans_selection.patch
Description: Binary data


RE: [PATCH] support tab-completion for single quote input with equal sign

2021-09-15 Thread tanghy.f...@fujitsu.com
On Saturday, September 4, 2021 11:58 PM, Tom Lane  wrote:
>Actually ... those are just implementation details, and now that
>I've thought about it a little more, I question the entire concept
>of making single-quoted strings be single words in tab-complete's
>view.  I think it's quite intentional that we don't do that;
>if we did, it'd forever foreclose the possibility of tab-completing
>*within* strings.  You don't have to look any further than CREATE
>SUBSCRIPTION itself to see possible applications of that: someone
>could wish that
>
>CREATE SUBSCRIPTION my_sub CONNECTION 'db
>
>would complete with "name=", or that  right after the quote
>would offer a list of connection keywords.
>
>(More generally, I'm afraid that people are already relying on this
>behavior in other contexts, and thus that the proposed patch could
>break more use-cases than it fixes.)

Agreed. Thanks for your comments.

>So now I think that this approach should be rejected, and that the
>right thing is to fix the CREATE SUBSCRIPTION completion rules
>to allow more than one "word" between CONNECTION and PUBLICATION.
>
>Another idea that might be useful is to treat the opening and
>closing quotes themselves as separate "words", which'd give
>the CREATE SUBSCRIPTION rules a bit more to go on about when to
>offer PUBLICATION.

Treating the opening and closing quotes themselves as separate "words" may
affect some current tab completion.
So I tried to fix the CREATE SUBSCRIPTION completion rules in the V3 patch.
The basic idea is to check that the head words of the input text look like
"CREATE SUBSCRIPTION subname CONNECTION anystring", and then check whether
anystring ends with a single quote. If all checks pass,
PUBLICATION will be auto-completed.
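
Concretely, the case being targeted is completion right after the closing
quote of the connection string (the connection string contents here are
illustrative):

CREATE SUBSCRIPTION my_sub CONNECTION 'host=localhost dbname=db' <TAB>
-- pressing TAB at this point should now offer PUBLICATION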

TAP tests (including the one added in V3) have passed.

Regards,
Tang



v3-0001-support-tab-completion-for-CONNECTION-string-with.patch
Description:  v3-0001-support-tab-completion-for-CONNECTION-string-with.patch


Re: Release 14 Schedule

2021-09-15 Thread Andrew Dunstan


On 9/15/21 10:20 AM, Robert Haas wrote:
> On Wed, Sep 15, 2021 at 8:56 AM Andrew Dunstan  wrote:
>> The Release Management Team (Michael Paquier, Peter Geoghegan and
>> myself) in consultation with the release team proposes the following
>> release schedule:
>>
>> * PostgreSQL 14 Release Candidate 1 (RC1) will be released on September 23, 
>> 2021.
>>
>> * In the absence of any critical issues, PostgreSQL 14 will become generally 
>> available on September 30, 2021.
>>
>> All commits and fixes intended for this release should be made before 
>> September 23, 2021 AoE.
> Presumably this needs to be a couple days earlier, right? Tom would
> probably stamp on Monday, so I guess fixes should be in by Sunday at
> the very latest to allow for a full buildfarm cycle.


Good point. Let's say Sunday 19th. There are in fact very few open
items, so the release 14 branch should already be fairly quiet.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: [Patch] ALTER SYSTEM READ ONLY

2021-09-15 Thread Robert Haas
On Wed, Sep 15, 2021 at 6:49 AM Amul Sul  wrote:
>  Initially, I thought to
> use SharedRecoveryState which is always set to RECOVERY_STATE_ARCHIVE,
> if archive recovery is requested. But there is another case where
> SharedRecoveryState could be RECOVERY_STATE_ARCHIVE irrespective of
> ArchiveRecoveryRequested value, that is the presence of a backup label
> file.

Right, there's a difference between whether archive recovery has been
*requested* and whether it is actually *happening*.

> If we want to use SharedRecoveryState, we need one more state
> which could differentiate between ArchiveRecoveryRequested and the
> backup label file presence case.  To move ahead, I have copied
> ArchiveRecoveryRequested into shared memory and it will be cleared
> once archive cleanup is finished. With all these changes, we could get
> rid of the xlogaction argument and the DetermineRecoveryXlogAction() function.
> We could move its logic into PerformRecoveryXLogAction() directly.

Putting these changes into 0001 seems to make no sense. It seems like
they should be part of 0003, or maybe a new 0004 patch.

> And for ThisTimeLineID, I don't think we need to do anything since this
> value is available with all the backend as per the following comment:
> "
> /*
>  * ThisTimeLineID will be same in all backends --- it identifies current
>  * WAL timeline for the database system.
>  */
> TimeLineID  ThisTimeLineID = 0;

I'm not sure I find that argument totally convincing. The two
variables aren't assigned at exactly the same places in the code,
notwithstanding the comment. I'm not saying you're wrong. I'm just
saying I don't believe it just because the comment says so.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Release 14 Schedule

2021-09-15 Thread Robert Haas
On Wed, Sep 15, 2021 at 8:56 AM Andrew Dunstan  wrote:
> The Release Management Team (Michael Paquier, Peter Geoghegan and
> myself) in consultation with the release team proposes the following
> release schedule:
>
> * PostgreSQL 14 Release Candidate 1 (RC1) will be released on September 23, 
> 2021.
>
> * In the absence of any critical issues, PostgreSQL 14 will become generally 
> available on September 30, 2021.
>
> All commits and fixes intended for this release should be made before 
> September 23, 2021 AoE.

Presumably this needs to be a couple days earlier, right? Tom would
probably stamp on Monday, so I guess fixes should be in by Sunday at
the very latest to allow for a full buildfarm cycle.

Also, I really like the fact that we're looking to release in
September! I think that's nicer than when it slips into October.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Partial index "microvacuum"

2021-09-15 Thread Marko Tiikkaja
So I've been looking at issues we used to have in production some time
ago which eventually led us to migrate away from partial indexes in
some cases.  In the end, I'm surprised how easy this (or at least a
similar case) was to reproduce.  The attached program does some
UPDATEs where around every third update deletes the row from the
partial index since it doesn't match indpred anymore.  In that case
the row is immediately UPDATEd back to match the index WHERE clause
again.  This roughly emulates what some of our processes do in
production.

Today, running the program for a few minutes (until the built-in
262144 iteration limit), I usually end up with a partial index through
which producing the only row takes milliseconds on a cold cache, and
over a millisecond on a hot one.  Finding the row through the primary
key is still fast, because the bloat there gets cleaned up.  As far as
I can tell, after the index has gotten into this state, there's no way
to clean it up except VACUUMing the entire table or a REINDEX.  Both
solutions are pretty bad.
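
Put concretely (using the names from the attached script), the only cleanup
options are the heavy-handed:

VACUUM t1;                             -- vacuums the entire table
REINDEX INDEX i_love_partial_indexes;  -- rebuilds the index under a strong lock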

My working theory was that this has to do with the fact that
HeapTupleSatisfiesMVCC doesn't set the HEAP_XMAX_COMMITTED bit here,
but I'm not so sure anymore.  Has anyone seen something like this?  If
that really is what's happening here, then I can see why we wouldn't
want to slow down SELECTs with expensive visibility checks.  But that
really leaves me wishing for something like VACUUM INDEX partial_idx.
Otherwise your elephant just keeps getting slower and slower until
you get called at 2 AM to play REINDEX.

(I've tested this on 9.6, v11 and v13.  13 seems to be a bit better
here, but not "fixed", I think.)


.m
#!perl

use strict;
use warnings;

use DBI;
use DBD::Pg;

my @connect = ("dbi:Pg:", '', '', {pg_enable_utf8 => 1, RaiseError => 1, PrintError => 0, AutoCommit => 1});
my $dbh = DBI->connect(@connect) or die;

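# Set up the test table: autovacuum disabled, one partial index, and a
# single row matching the index predicate.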
$dbh->do(q{
DROP TABLE IF EXISTS t1;

CREATE TABLE t1 (
	id serial PRIMARY KEY,
	partial boolean NOT NULL,
	data bigint NOT NULL
);
ALTER TABLE t1 SET (autovacuum_enabled = false);

INSERT INTO t1(partial, data) SELECT FALSE, -gs.i FROM generate_series(1, 100) gs(i);

CREATE INDEX i_love_partial_indexes ON t1 (id) WHERE partial;
CREATE INDEX pkish ON t1 ((id::text));

INSERT INTO t1 (partial, data) VALUES (TRUE, 0);
});

$dbh->do(q{VACUUM ANALYZE t1});

my $rows = $dbh->selectall_arrayref(q{SELECT id FROM t1 WHERE partial});
die unless (scalar(@$rows) == 1 && scalar(@{$rows->[0]}) == 1);
my $partial_id = $rows->[0]->[0];

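# Churn the single indexed row: each UPDATE re-randomizes "partial", and a
# follow-up UPDATE rewrites the row again whenever it still matches the
# index predicate.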
for (my $i = 0; $i < 256 * 1024; ++$i) {
my $rows = $dbh->selectall_arrayref(
q{UPDATE t1 SET data = data + 1, partial = random() < 0.7 WHERE id = $1 RETURNING partial::int},
undef,
$partial_id,
);
die unless (scalar(@$rows) == 1 && scalar(@{$rows->[0]}) == 1);
my $partial = $rows->[0]->[0];
if ($partial == 1) {
$dbh->do(
q{UPDATE t1 SET partial = TRUE WHERE id = $1},
undef,
$partial_id,
);
}
}



Re: Hook for extensible parsing.

2021-09-15 Thread Jim Mlodgenski
On Wed, Sep 15, 2021 at 9:25 AM Simon Riggs
 wrote:
>
> The general rule has always been that we don't just put hooks in, we
> always require an in-core use for those hooks. I was reminded of that
> myself recently.
>
That's not historically what has happened. There are several hooks with
no in core use such as emit_log_hook and ExplainOneQuery_hook. The recent
openssl_tls_init_hook only has a usage in src/test/modules

> What we need is something in core that actually makes use of this. The
> reason for that is not politics, but a simple test of whether the
> feature makes sense AND includes all required bells and whistles to be
> useful in the real world.
>
Agreed. There should be something in src/test/modules to exercise this
but probably more to flesh things out. Maybe extending adminpack to use
this so if enabled, it can use syntax like:
FILE READ 'foo.txt'

> Core doesn't need a LOL parser and I don't think we should commit that.
>
+1

> If we do this, I think it should have CREATE LANGUAGE support, so that
> each plugin can be seen as an in-core object and allow security around
> which users can execute which language types, allow us to switch
> between languages and have default languages for specific users or
> databases.
>
This hook allows extension developers to supplement syntax in addition
to adding a whole new language allowing the extension to appear more
native to the end user. For example, pglogical could use this to add
syntax to do a CREATE NODE instead of calling the function create_node.
Adding CREATE LANGUAGE support around this would just be for a narrow
set of use cases where a new language is added.




Re: Trigger position

2021-09-15 Thread Alvaro Herrera
On 2021-Sep-15, Marcos Pegoraro wrote:

> That problem could be handled by raising an error: "there is a trigger foo
> using position 1, please choose another"

This is reminiscent of the old BASIC programming language, where you
eventually learn to choose line numbers that aren't consecutive, so that
if you later have to add lines in between you have some room to do so.
(This happens when, after modifying a program enough times, you are forced
to renumber old lines because the new lines you want to add no longer fit
in the sequence.)  It's a pretty bad system.

In a computer system, alphabet letters are just a different way to
present numbers, so you just choose ASCII letters that match what you
want.  You can use "AA_first_trigger", "BB_second_trigger",
"AB_nope_this_is_second" and you'll be fine; you can do
"AAB_oops_really_second" afterwards, and so on.  The integer numbering
system doesn't seem very useful/flexible when seen in this light.
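
For example (PostgreSQL fires same-event triggers on a table in name order;
the table and function names here are illustrative):

CREATE TRIGGER aa_recalc_first BEFORE UPDATE ON orders
  FOR EACH ROW EXECUTE FUNCTION recalc();
CREATE TRIGGER bb_calc_second BEFORE UPDATE ON orders
  FOR EACH ROW EXECUTE FUNCTION do_other_calc();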

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/
"Nunca confiaré en un traidor.  Ni siquiera si el traidor lo he creado yo"
(Barón Vladimir Harkonnen)




Re: On login trigger: take three

2021-09-15 Thread Greg Nancarrow
On Wed, Sep 8, 2021 at 10:56 PM Daniel Gustafsson  wrote:
>
> I took a look at this, and while I like the proposed feature I think the patch
> has a bit more work required.
>

Thanks for reviewing the patch.
I am not the original patch author (who no longer seems active) but
I've been contributing a bit and keeping the patch alive because I
think it's a worthwhile feature.

>
> +
> +-- 2) Initialize some user session data
> +   CREATE TEMP TABLE session_storage (x float, y integer);
> The example in the documentation uses a mix of indentations, neither of which
> is the 2-space indentation used elsewhere in the docs.
>

Fixed, using 2-space indentation.
(to be honest, the indentation seems inconsistent elsewhere in the
docs e.g. I'm seeing a nearby case of 2-space on 1st indent, 6-space
on 2nd indent)
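
For reference, the docs example boils down to a sketch like this (assuming
the patch's proposed "login" event; the function and trigger names are
illustrative only):

CREATE FUNCTION init_session() RETURNS event_trigger AS $$
BEGIN
  -- 2) Initialize some user session data
  CREATE TEMP TABLE session_storage (x float, y integer);
END;
$$ LANGUAGE plpgsql;

CREATE EVENT TRIGGER on_login ON login
  EXECUTE FUNCTION init_session();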

>
> +   /* Get the command tag. */
> +   tag = parsetree ? CreateCommandTag(parsetree) : CMDTAG_CONNECT;
> This is hardcoding knowledge that currently only this trigger won't have a
> parsetree, and setting the tag accordingly.  This should instead check the
> event and set CMDTAG_UNKNOWN if it isn't the expected one.
>

I updated that, but maybe not exactly how you expected?

>
> +   /* database has on-login triggers */
> +   booldathaslogontriggers;
> This patch uses three different names for the same thing: client connection,
> logontrigger and login trigger.  Since this trigger only fires after 
> successful
> authentication it’s not accurate to name it client connection, as that implies
> it running on connections rather than logins.  The nomenclature used in the
> tree is "login" so the patch should be adjusted everywhere to use that.
>

Fixed.

>
> +/* Hook for plugins to get control at start of session */
> +client_connection_hook_type client_connection_hook = EventTriggerOnConnect;
> I don't see the reason for adding core functionality by hooks.  Having a hook
> might be a useful thing on its own (to be discussed in a new thread, not 
> hidden
> here), but core functionality should not depend on it IMO.
>

Fair enough, I removed the hook.

>
> +   {"enable_client_connection_trigger", PGC_SU_BACKEND, 
> DEVELOPER_OPTIONS,
> +   gettext_noop("Enables the client_connection event trigger."),
> +   gettext_noop("In case of errors in the ON client_connection 
> EVENT TRIGGER procedure, "
>  ..and..
> +   /*
> +* Try to ignore error for superuser to make it possible to login 
> even in case of errors
> +* during trigger execution
> +*/
> +   if (!is_superuser)
> +   PG_RE_THROW();
> This patch adds two ways for superusers to bypass this event trigger in case 
> of
> it being faulty, but for every other event trigger we've documented to restart
> in single-user mode and fixing it there.  Why does this need to be different?
> This clearly has a bigger chance of being a footgun but I don't see that as a
> reason to add a GUC and a bypass that other footguns lack.
>

OK, I removed those bypasses. We'll see what others think.

>
> +   elog(NOTICE, "client_connection trigger failed with message: %s", 
> error->message);
> Calling elog() in a PG_CATCH() block isn't allowed is it?
>

I believe it is allowed (e.g. there's a case in libpq), but I removed
this anyway as part of the superuser bypass.

>
> +   /* Runtlist is empty: clear dathaslogontriggers flag
> +*/
> s/Runtlist/Runlist/ and also commenting style.
>

Fixed.

>
> The logic for resetting the pg_database flag when firing event trigger in case
> the trigger has gone away is messy and a problem waiting to happen IMO.  For
> normal triggers we don't bother with that on the grounds of it being racy, and
> instead perform relhastrigger removal during VACUUM.  Another approach is 
> using
> a counter as propose upthread, since updating that can be made safer.  The
> current coding also isn't instructing concurrent backends to perform relcache
> rebuild.
>

There are pros and cons to each possible approach, but I prefer the
current way (which I have tweaked a bit) for similar reasons to those
explained by the original patch author when debating whether to use
reference-counting instead, in the discussion upthread (e.g. it keeps
it all in one place). Also, it seems to be more in line
with the documented reason why that pg_database flag was added in the
first place. I have debugged a few concurrent scenarios with the
current mechanism in place. If you still dislike the logic for
resetting the flag, please elaborate on the issues you foresee and one
of the alternative approaches can be tried.

I've attached the updated patch.

Regards,
Greg Nancarrow
Fujitsu Australia


v18-0001-Add-a-new-login-event-and-login-trigger-support.patch
Description: Binary data


Re: Hook for extensible parsing.

2021-09-15 Thread Simon Riggs
On Sat, 1 May 2021 at 08:24, Julien Rouhaud  wrote:

> Being able to extend core parser has been requested multiple times, and AFAICT
> all previous attempts were rejected not because this isn't wanted but because
> the proposed implementations required plugins to reimplement all of the core
> grammar with their own changes, as bison generated parsers aren't extensible.
>
> I'd like to propose an alternative approach, which is to allow multiple 
> parsers
> to coexist, and let third-party parsers optionally fallback on the core
> parsers.

Yes, that approach has been discussed by many people, most recently
around the idea to create a fast-path grammar to make the most
frequently used SQL parse faster.

> 0002 implements a lame "sqlol" parser, based on LOLCODE syntax, with only the
> ability to produce "select [col, ] col FROM table" parsetree, for testing
> purpose.  I chose it to ensure that everything works properly even with a
> totally different grammar that has different keywords, which doesn't even
> end statements with a semicolon but with a plain keyword.

The general rule has always been that we don't just put hooks in, we
always require an in-core use for those hooks. I was reminded of that
myself recently.

What we need is something in core that actually makes use of this. The
reason for that is not politics, but a simple test of whether the
feature makes sense AND includes all required bells and whistles to be
useful in the real world.

Core doesn't need a LOL parser and I don't think we should commit that.

If we do this, I think it should have CREATE LANGUAGE support, so that
each plugin can be seen as an in-core object and allow security around
which users can execute which language types, allow us to switch
between languages and have default languages for specific users or
databases.

--
Simon Riggs    http://www.EnterpriseDB.com/




Release 14 Schedule

2021-09-15 Thread Andrew Dunstan


Greetings


The Release Management Team (Michael Paquier, Peter Geoghegan and
myself) in consultation with the release team proposes the following
release schedule:

* PostgreSQL 14 Release Candidate 1 (RC1) will be released on September 23, 
2021.

* In the absence of any critical issues, PostgreSQL 14 will become generally 
available on September 30, 2021.

All commits and fixes intended for this release should be made before September 
23, 2021 AoE.

We would like to thank all the contributors, reviewers and committers for their 
work on this release, and for making this a fairly smooth process.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Trigger position

2021-09-15 Thread Marcos Pegoraro
Correct, we need a field tgposition on pg_trigger, and when it's null we
follow the normal ordering:
select * from pg_trigger where tgrelid = X and tgtype = Y order by
tgposition nulls last, tgname

regards,
Marcos

On Wed, Sep 15, 2021 at 09:35, Andreas Karlsson  wrote:

> On 9/15/21 1:40 PM, Tom Lane wrote:
> > Marcos Pegoraro  writes:
> >> Alphabetical order of triggers sometimes makes me write a_Recalc or z_Calc
> >> to be sure it'll be the first or the last trigger with same event of that
> >> table
> >
> >> Oracle and SQL Server have FOLLOWS and PRECEDES when defining trigger
> >> execution order. Firebird has POSITION, which I like more.
> >
> > Color me skeptical: doesn't that introduce more complication without
> > fundamentally solving anything?  You still don't know which position
> > numbers other triggers have used, so it seems like this is just a
> > different way to spell the same problem.
>
> I guess one advantage is that it would make the intent of the DDL author
> more clear to a reader and that it also makes it more clear to people
> new to PostgreSQL that trigger order is something that is important to
> reason about.
>
> Whether those small advantages are worth the complication is another question
> (I am skeptical), but if we were to implement this I would prefer the Firebird
> solution over the Oracle/MSSQL solution, since the Firebird solution is
> simpler while achieving the same thing, plus the Firebird solution
> seems like it would be obviously backwards compatible with our current
> solution.
>
> Andreas
>


Re: SSL/TLS instead of SSL in docs

2021-09-15 Thread Daniel Gustafsson
Since the approach taken wasn't to anyones liking, attached is a v4 (partly
extracted from the previous patch) which only adds notes that SSL is used
interchangeably with TLS in our documentation and configuration.

--
Daniel Gustafsson   https://vmware.com/



v4-0001-doc-Clarify-that-we-by-SSL-actually-mean-TLS.patch
Description: Binary data


Re: Column Filtering in Logical Replication

2021-09-15 Thread Alvaro Herrera
On 2021-Sep-15, vignesh C wrote:

> I have extracted the parser code and attached it here, so that it will
> be easy to go through. We wanted to support the following syntax as in
> [1]:
> CREATE PUBLICATION pub1 FOR
> TABLE t1,t2,t3, ALL TABLES IN SCHEMA s1,s2,
> SEQUENCE seq1,seq2, ALL SEQUENCES IN SCHEMA s3,s4;

Oh, thanks, it looks like this can be useful.  We can get the common
grammar done and then rebase all the other patches (I was also just told
about support for sequences in [1]) on top.

[1] https://postgr.es/m/3d6df331-5532-6848-eb45-344b265e0...@enterprisedb.com

> Columns can be added to PublicationObjSpec data structure.

Right.  (As a List of String, I imagine.)

> The patch
> Generic_object_type_parser_002_table_schema_publication.patch has the
> changes that were used to handle the parsing. Schema and Relation both
> are different objects, schema is of string type and relation is of
> RangeVar type. While parsing, schema name is parsed in string format
> and relation is parsed and converted to rangevar type, these objects
> will be then handled accordingly during post processing.

Yeah, I think it'd be cleaner if the node type has two members, something like
this

typedef struct PublicationObjSpec
{
	NodeTag		type;
	PublicationObjSpecType pubobjtype;	/* type of this publication object */
	RangeVar   *rv;			/* if a table */
	String	   *objname;	/* if a schema */
	int			location;	/* token location, or -1 if unknown */
} PublicationObjSpec;

and only one of them is set, the other is NULL, depending on the object type.

-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/




Re: Trigger position

2021-09-15 Thread Andreas Karlsson

On 9/15/21 1:40 PM, Tom Lane wrote:
> Marcos Pegoraro  writes:
>> Alphabetical order of triggers sometimes makes me write a_Recalc or z_Calc
>> to be sure it'll be the first or the last trigger with same event of that
>> table
>
>> Oracle and SQL Server have FOLLOWS and PRECEDES when defining trigger
>> execution order. Firebird has POSITION, which I like more.
>
> Color me skeptical: doesn't that introduce more complication without
> fundamentally solving anything?  You still don't know which position
> numbers other triggers have used, so it seems like this is just a
> different way to spell the same problem.

I guess one advantage is that it would make the intent of the DDL author
more clear to a reader, and that it also makes it more clear to people
new to PostgreSQL that trigger order is something that is important to
reason about.

Whether those small advantages are worth the complication is another
question (I am skeptical), but if we were to implement this I would prefer
the Firebird solution over the Oracle/MSSQL solution, since the Firebird
solution is simpler while achieving the same thing, plus the Firebird
solution seems like it would be obviously backwards compatible with our
current solution.


Andreas




Re: Signed vs Unsigned (take 2) (src/backend/storage/ipc/procarray.c)

2021-09-15 Thread Ranier Vilela
Em qua., 15 de set. de 2021 às 01:08, Fujii Masao <
masao.fu...@oss.nttdata.com> escreveu:

>
>
> On 2021/09/11 12:21, Fujii Masao wrote:
> >
> >
> > On 2021/07/23 20:07, Ranier Vilela wrote:
> >> On Fri, Jul 23, 2021 at 07:02, Aleksander Alekseev <aleksan...@timescale.com> wrote:
> >>
> >> Hi hackers,
> >>
> >> The following review has been posted through the commitfest
> application:
> >> make installcheck-world:  tested, passed
> >> Implements feature:   tested, passed
> >> Spec compliant:   tested, passed
> >> Documentation:tested, passed
> >>
> >> The patch was tested on MacOS against master `80ba4bb3`.
> >>
> >> The new status of this patch is: Ready for Committer
> >>
> >>
> >> The second patch seems fine too. I'm attaching both patches to
> trigger cfbot and to double-check them.
> >>
> >> Thanks Aleksander, for reviewing this.
> >
> > I looked at these patches because they are marked as ready for committer.
> > They don't change any actual behavior, but look valid to me in term of
> coding.
> > Barring any objection, I will commit them.
>
> > No need to backpatch, why this patch is classified as
> > refactoring only.
>
> I found this in the commit log in the patch. I agree that these patches
> are refactoring ones. But I'm thinking that it's worth back-patching them,
> to make future back-patching easy. Thoughts?
>
Thanks for picking this up.

I see nothing against it, other than it being more work for the committer.

regards,
Ranier Vilela


Re: Column Filtering in Logical Replication

2021-09-15 Thread vignesh C
On Wed, Sep 15, 2021 at 5:20 PM Alvaro Herrera  wrote:
>
> On 2021-Sep-15, Amit Kapila wrote:
>
> > On Mon, Sep 6, 2021 at 11:21 PM Alvaro Herrera  
> > wrote:
> > >
> > > I pushed the clerical part of this -- namely the addition of
> > > PublicationTable node and PublicationRelInfo struct.
> >
> > One point to note here is that we are developing a generic grammar for
> > publications where not only tables but other objects like schema,
> > sequences, etc. can be specified, see [1]. So, there is some overlap
> > in the grammar modifications being made by this patch and the work
> > being done in that other thread.
>
> Oh rats.  I was not aware of that thread, or indeed of the fact that
> adding multiple object types to publications was being considered.
>
> I do see that 0002 there contains gram.y changes, but AFAICS those
> changes don't allow specifying a column list for a table, so there are
> some changes needed in that patch for that either way.
>
> I agree that it's better to move forward in unison.
>
> I noticed that 0002 in that other patch uses a void * pointer in
> PublicationObjSpec that "could be either RangeVar or String", which
> strikes me as a really bad idea.  (Already discussed in some other
> thread recently, maybe this one or the row filtering one.)

I have extracted the parser code and attached it here, so that it will
be easy to go through. We wanted to support the following syntax as in
[1]:
CREATE PUBLICATION pub1 FOR
TABLE t1,t2,t3, ALL TABLES IN SCHEMA s1,s2,
SEQUENCE seq1,seq2, ALL SEQUENCES IN SCHEMA s3,s4;

Columns can be added to PublicationObjSpec data structure. The patch
Generic_object_type_parser_002_table_schema_publication.patch has the
changes that were used to handle the parsing. Schema and Relation both
are different objects, schema is of string type and relation is of
RangeVar type. While parsing, schema name is parsed in string format
and relation is parsed and converted to rangevar type, these objects
will then be handled accordingly during post-processing. That is the
reason it used a void * type, which can hold both RangeVar and String.
Thoughts?

[1] - https://www.postgresql.org/message-id/877603.1629120678%40sss.pgh.pa.us

Regards,
Vignesh
diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index d6fddd6efe..2a2fe03c13 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -141,14 +141,14 @@ pg_relation_is_publishable(PG_FUNCTION_ARGS)
  * Insert new publication / relation mapping.
  */
 ObjectAddress
-publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
+publication_add_relation(Oid pubid, Relation targetrel,
 		 bool if_not_exists)
 {
 	Relation	rel;
 	HeapTuple	tup;
 	Datum		values[Natts_pg_publication_rel];
 	bool		nulls[Natts_pg_publication_rel];
-	Oid			relid = RelationGetRelid(targetrel->relation);
+	Oid			relid = RelationGetRelid(targetrel);
 	Oid			prrelid;
 	Publication *pub = GetPublication(pubid);
 	ObjectAddress myself,
@@ -172,10 +172,10 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
 		ereport(ERROR,
 (errcode(ERRCODE_DUPLICATE_OBJECT),
  errmsg("relation \"%s\" is already member of publication \"%s\"",
-		RelationGetRelationName(targetrel->relation), pub->name)));
+		RelationGetRelationName(targetrel), pub->name)));
 	}
 
-	check_publication_add_relation(targetrel->relation);
+	check_publication_add_relation(targetrel);
 
 	/* Form a tuple. */
 	memset(values, 0, sizeof(values));
@@ -209,7 +209,7 @@ publication_add_relation(Oid pubid, PublicationRelInfo *targetrel,
 	table_close(rel, RowExclusiveLock);
 
 	/* Invalidate relcache so that publication info is rebuilt. */
-	CacheInvalidateRelcache(targetrel->relation);
+	CacheInvalidateRelcache(targetrel);
 
 	return myself;
 }
diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index 30929da1f5..f5fd463c15 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -34,6 +34,7 @@
 #include "commands/publicationcmds.h"
 #include "funcapi.h"
 #include "miscadmin.h"
+#include "nodes/makefuncs.h"
 #include "utils/acl.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
@@ -138,6 +139,34 @@ parse_publication_options(ParseState *pstate,
 	}
 }
 
+/*
+ * Convert the PublicationObjSpecType list into rangevar list.
+ */
+static void
+ObjectsInPublicationToOids(List *pubobjspec_list, ParseState *pstate,
+		   List **rels)
+{
+	ListCell   *cell;
+	PublicationObjSpec *pubobj;
+
+	if (!pubobjspec_list)
+		return;
+
+	foreach(cell, pubobjspec_list)
+	{
+		Node *node;
+
+		pubobj = (PublicationObjSpec *) lfirst(cell);
+		node = (Node *) pubobj->object;
+
+		if (pubobj->pubobjtype == PUBLICATIONOBJ_TABLE)
+		{
+			if (IsA(node, RangeVar))
+*rels = lappend(*rels, (RangeVar *) node);
+		}		
+	}
+}
+ 
 /*
  * Create new publication.
  */
@@ -155,6 +184,7 @@ CreatePublication(ParseState 

Re: Trigger position

2021-09-15 Thread Marcos Pegoraro
This would be interesting for those who are migrating from these databases
too. But OK, I'll forget it.

On Wed, Sep 15, 2021 at 08:40, Tom Lane  wrote:

> Marcos Pegoraro  writes:
> > Alphabetical order of triggers sometimes makes me write a_Recalc or z_Calc
> > to be sure it'll be the first or the last trigger with same event of that
> > table
>
> > Oracle and SQL Server have FOLLOWS and PRECEDES when defining trigger
> > execution order. Firebird has POSITION, which I like more.
>
> Color me skeptical: doesn't that introduce more complication without
> fundamentally solving anything?  You still don't know which position
> numbers other triggers have used, so it seems like this is just a
> different way to spell the same problem.
>
> regards, tom lane
>


Re: Trigger position

2021-09-15 Thread Marcos Pegoraro
That problem could be handled by raising an error: "there is a trigger foo
using position 1, please choose another".

Regards,




On Wed, Sep 15, 2021 at 07:59, Daniel Gustafsson  wrote:

> > On 15 Sep 2021, at 12:28, Marcos Pegoraro  wrote:
>
> > CREATE TRIGGER RECALC_THAT BEFORE UPDATE POSITION 1 ON ORDERS...
> > CREATE TRIGGER DO_OTHER_CALC BEFORE UPDATE POSITION 2 ON ORDERS...
>
> For those not familiar with Firebird: triggers are executed in alphabetical
> order within a position number, so if multiple triggers are defined for
> POSITION 1 then they are individually executed alphabetically.
>
> --
> Daniel Gustafsson   https://vmware.com/
>
>


Re: BUG #15293: Stored Procedure Triggered by Logical Replication is Unable to use Notification Events

2021-09-15 Thread Artur Zakirov
On Wed, Sep 15, 2021 at 2:57 AM Tom Lane  wrote:
> Hearing no comments, I pushed that.

Thank you!

> > I'm inclined to think we should flat-out reject LISTEN in any process
> > that is not attached to a frontend, at least until somebody takes the
> > trouble to add infrastructure that would let it be useful.  I've not
> > done that here though; I'm not quite sure what we should test for.
>
> After a bit of looking around, it seems that MyBackendType addresses
> that problem nicely, so I propose the attached.

Indeed, it seems only regular backends can send out notifies to
frontends. In that case there is no point in supporting LISTEN in
other processes. I guess a background worker can try it with SPI, but
with no luck.
+1 from my side.

-- 
Artur




Re: Column Filtering in Logical Replication

2021-09-15 Thread Alvaro Herrera
On 2021-Sep-15, Amit Kapila wrote:

> On Mon, Sep 6, 2021 at 11:21 PM Alvaro Herrera  
> wrote:
> >
> > I pushed the clerical part of this -- namely the addition of
> > PublicationTable node and PublicationRelInfo struct.
> 
> One point to note here is that we are developing a generic grammar for
> publications where not only tables but other objects like schema,
> sequences, etc. can be specified, see [1]. So, there is some overlap
> in the grammar modifications being made by this patch and the work
> being done in that other thread.

Oh rats.  I was not aware of that thread, or indeed of the fact that
adding multiple object types to publications was being considered.

I do see that 0002 there contains gram.y changes, but AFAICS those
changes don't allow specifying a column list for a table, so there are
some changes needed in that patch for that either way.

I agree that it's better to move forward in unison.

I noticed that 0002 in that other patch uses a void * pointer in
PublicationObjSpec that "could be either RangeVar or String", which
strikes me as a really bad idea.  (Already discussed in some other
thread recently, maybe this one or the row filtering one.)

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: Deduplicate code updating ControleFile's DBState.

2021-09-15 Thread Amul Sul
On Wed, Sep 15, 2021 at 12:52 AM Bossart, Nathan  wrote:
>
> On 9/13/21, 11:06 PM, "Amul Sul"  wrote:
> > The patch is straightforward but the only concern is that in
> > StartupXLOG(), SharedRecoveryState now gets updated only with spin
> > lock; earlier it also had ControlFileLock in addition to that. AFAICU,
> > I don't see any problem there, since until the startup process exits
> > other backends could not connect and write a WAL record.
>
> It looks like ebdf5bf intentionally made sure that we hold
> ControlFileLock while updating SharedRecoveryInProgress (now
> SharedRecoveryState after 4e87c48).  The thread for this change [0]
> has some additional details.
>

Yeah, I saw that, and ebdf5bf's main intention was to minimize the gap
between the two, which was quite big previously.  The comments
added by the same commit also describe the case where backends can
write WAL while the control file still says the state is not
DB_IN_PRODUCTION, and IIUC that seems to be acceptable.
Then the question is what would be wrong if a process can see an
inconsistent shared memory view for a small window?  Maybe
wait-promoting would behave unexpectedly; that I have to test.

> As far as the patch goes, I'm not sure why SetControlFileDBState()
> needs to be exported, and TBH I don't know if this change is really a
> worthwhile improvement.  ISTM the main benefit is that it could help
> avoid cases where we update the state but not the time.  However,
> there's still nothing preventing that, and I don't get the idea that
> it was really a big problem to begin with.
>

Oh ok, I was working on a different patch[1] where I want to call this
function from the checkpointer, but I agree exporting the function is not in
the scope of this patch.

Regards,
Amul

[1] https://postgr.es/m/caaj_b97kzzdjsffwrk7w0xu5hnxkcgkgtr69t8cozztsyxj...@mail.gmail.com




Re: Trigger position

2021-09-15 Thread Tom Lane
Marcos Pegoraro  writes:
> Alphabetical order of triggers sometimes makes me write a_Recalc or z_Calc
> to be sure it´ll be the first or the last trigger with same event of that
> table

> Oracle and SQL Server have FOLLOWS and PRECEDES when defining trigger
> execution order. Firebird has POSITION, which I like it more.

Color me skeptical: doesn't that introduce more complication without
fundamentally solving anything?  You still don't know which position
numbers other triggers have used, so it seems like this is just a
different way to spell the same problem.

regards, tom lane




Re: .ready and .done files considered harmful

2021-09-15 Thread Dipesh Pandit
Hi,

Thanks for the feedback.

> I wonder if this can be simplified even further.  If we don't bother
> trying to catch out-of-order .ready files in XLogArchiveNotify() and
> just depend on the per-checkpoint/restartpoint directory scans, we can
> probably remove lastReadySegNo from archiver state completely.

If we agree that some extra delay in archiving these files is acceptable,
then we don't require any special handling for this scenario; otherwise
we may need to handle it separately.

> +   /* Initialize the current state of archiver */
> +   xlogState.lastSegNo = MaxXLogSegNo;
> +   xlogState.lastTli = MaxTimeLineID;
>
> It looks like we have two ways to force a directory scan.  We can
> either set force_dir_scan to true, or lastSegNo can be set to
> MaxXLogSegNo.  Why not just set force_dir_scan to true here so that we
> only have one way to force a directory scan?

Makes sense, I have updated it.

> Don't we also need to force a directory scan in the other cases we
> return early from pgarch_ArchiverCopyLoop()?  We will have already
> advanced the archiver state in pgarch_readyXlog(), so I think we'd end
> up skipping files if we didn't.  For example, if archive_command isn't
> set, we'll just return, and the next call to pgarch_readyXlog() might
> return the next file.

I agree, we should do it for all early return paths.

> nitpick: I think we should just call PgArchForceDirScan() here.

Yes, that's right.

> > > This is an interesting idea, but the "else" block here seems prone to
> > > race conditions.  I think we'd have to hold arch_lck to prevent that.
> > > But as I mentioned above, if we are okay with depending on the
> > > fallback directory scans, I think we can remove the "else" block
> > > completely.

Ohh I didn't realize the race condition here. The competing processes
can read the same value of lastReadySegNo.

> > Thinking further, we probably need to hold a lock even when we are
> > creating the .ready file to avoid race conditions.
>
> The race condition surely happens, but even if that happens, all
> competing processes except one of them detect out-of-order and will
> enforce directory scan.  But I'm not sure how it behaves under more
> complex situation so I'm not sure I like that behavior.
>
> We could just use another lock for the logic there, but instead
> couldn't we merge PgArchGetLastReadySegNo and PgArchSetLastReadySegNo
> into one atomic test-and-(check-and-)set function?  Like this.

I agree that we can merge the existing "Get" and "Set" functions into
an atomic test-and-check-and-set function to avoid a race condition.

I have incorporated these changes and attached a new patch. PFA.

Thanks,
Dipesh
From f8983c7b80ff0f8cbc415d4d9a3ad4992925e775 Mon Sep 17 00:00:00 2001
From: Dipesh Pandit 
Date: Wed, 8 Sep 2021 21:47:16 +0530
Subject: [PATCH] keep trying the next file approach

---
 src/backend/access/transam/xlog.c|  20 +++
 src/backend/access/transam/xlogarchive.c |  16 ++
 src/backend/postmaster/pgarch.c  | 249 +++
 src/include/access/xlogdefs.h|   2 +
 src/include/postmaster/pgarch.h  |   4 +
 5 files changed, 259 insertions(+), 32 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index e51a7a7..7dd4b96 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -9285,6 +9285,16 @@ CreateCheckPoint(int flags)
 	RemoveOldXlogFiles(_logSegNo, RedoRecPtr, recptr);
 
 	/*
+	 * Force the archiver to perform a directory scan.
+	 *
+	 * Ordinarily, this should not be needed, but it seems like a good idea
+	 * to make sure we scan the archive_status directory every once in a
+	 * while to make sure we haven't left anything behind.  Calling it here
+	 * ensures we do a directory scan at least once per checkpoint.
+	 */
+	PgArchForceDirScan();
+
+	/*
 	 * Make more log segments if needed.  (Do this after recycling old log
 	 * segments, since that may supply some of the needed files.)
 	 */
@@ -9650,6 +9660,16 @@ CreateRestartPoint(int flags)
 	RemoveOldXlogFiles(_logSegNo, RedoRecPtr, endptr);
 
 	/*
+	 * Force the archiver to perform a directory scan.
+	 *
+	 * Ordinarily, this should not be needed, but it seems like a good idea
+	 * to make sure we scan the archive_status directory every once in a
+	 * while to make sure we haven't left anything behind.  Calling it here
+	 * ensures we do a directory scan at least once per restartpoint.
+	 */
+	PgArchForceDirScan();
+
+	/*
 	 * Make more log segments if needed.  (Do this after recycling old log
 	 * segments, since that may supply some of the needed files.)
 	 */
diff --git a/src/backend/access/transam/xlogarchive.c b/src/backend/access/transam/xlogarchive.c
index 26b023e..110bee4 100644
--- a/src/backend/access/transam/xlogarchive.c
+++ b/src/backend/access/transam/xlogarchive.c
@@ -489,6 +489,22 @@ XLogArchiveNotify(const char *xlog)
 		return;
 	}
 
+	/* Force a 

Re: Added schema level support for publication.

2021-09-15 Thread Greg Nancarrow
On Tue, Sep 14, 2021 at 6:38 PM vignesh C  wrote:
>
> I have handled this in the patch attached.
>

Regarding the following function in the v28-0002 patch:

+/*
+ * Check if the relation schema is member of the schema list.
+ */
+static void
+RelSchemaIsMemberOfSchemaList(List *rels, List *schemaidlist, bool schemacheck)

I think this function is not well named or commented, and I don't like
how the "schemacheck" bool parameter determines the type of objects in
the "rels" list.
I would suggest you simply split this function into two separate
functions, corresponding to each of the blocks of the "if-else" within
the for-loop of the existing RelSchemaIsMemberOfSchemaList function.
The "Is" part of the existing "RelSchemaIsMemberOfSchemaList" function
name implies a boolean return value, so seems misleading.
So I think the names of the two functions that I am suggesting should
be "CheckNotAlreadyInPublication" or something similar.


Regards,
Greg Nancarrow
Fujitsu Australia




Re: Trigger position

2021-09-15 Thread Daniel Gustafsson
> On 15 Sep 2021, at 12:28, Marcos Pegoraro  wrote:

> CREATE TRIGGER RECALC_THAT BEFORE UPDATE POSITION 1 ON ORDERS...
> CREATE TRIGGER DO_OTHER_CALC BEFORE UPDATE POSITION 2 ON ORDERS...

For those not familiar with Firebird: triggers are executed in alphabetical
order within a position number, so if multiple triggers are defined for
POSITION 1 then they are individually executed alphabetically.

--
Daniel Gustafsson   https://vmware.com/





Re: [Patch] ALTER SYSTEM READ ONLY

2021-09-15 Thread Amul Sul
On Sat, Jul 24, 2021 at 1:33 AM Robert Haas  wrote:
>
> On Thu, Jun 17, 2021 at 1:23 AM Amul Sul  wrote:
> > Attached is a rebase on the latest master head.  Also, I added one more
> > refactoring patch that deduplicates the code setting the database state
> > in the control file. The same code that sets the database state is also
> > needed for this feature.
>
> I started studying 0001 today and found that it rearranged the order
> of operations in StartupXLOG() more than I was expecting. It does, as
> per previous discussions, move a bunch of things to the place where we
> now call XLogParamters(). But, unsatisfyingly, InRecovery = false and
> XLogReaderFree() then have to move down even further. Since the goal
> here is to get to a situation where we sometimes XLogAcceptWrites()
> after InRecovery = false, it didn't seem nice for this refactoring
> patch to still end up with a situation where this stuff happens while
> InRecovery = true. In fact, with the patch, the amount of code that
> runs with InRecovery = true actually *increases*, which is not what I
> think should be happening here. That's why the patch ends up having to
> adjust SetMultiXactIdLimit to not Assert(!InRecovery).
>
> And then I started to wonder how this was ever going to work as part
> of the larger patch set, because as you have it here,
> XLogAcceptWrites() takes arguments XLogReaderState *xlogreader,
> XLogRecPtr EndOfLog, and TimeLineID EndOfLogTLI and if the
> checkpointer is calling that at a later time after the user issues
> pg_prohibit_wal(false), it's going to have none of those things. So I
> had a quick look at that part of the code and found this in
> checkpointer.c:
>
> XLogAcceptWrites(true, NULL, InvalidXLogRecPtr, 0);
>
> For those following along from home, the additional "true" is a bool
> needChkpt argument added to XLogAcceptWrites() by 0003. Well, none of
> this is very satisfying. The whole purpose of passing the xlogreader
> is so we can figure out whether we need a checkpoint (never mind the
> question of whether the existing algorithm for determining that is
> really sensible) but now we need a second argument that basically
> serves the same purpose since one of the two callers to this function
> won't have an xlogreader. And then we're passing the EndOfLog and
> EndOfLogTLI as dummy values which seems like it's probably just
> totally wrong, but if for some reason it works correctly there sure
> don't seem to be any comments explaining why.
>
> So I started doing a bit of hacking myself and ended up with the
> attached, which I think is not completely the right thing yet but I
> think it's better than your version. I split this into three parts.
> 0001 splits up the logic that currently decides whether to write an
> end-of-recovery record or a checkpoint record and if the latter how
> the checkpoint ought to be performed into two functions.
> DetermineRecoveryXlogAction() figures out what we want to do, and
> PerformRecoveryXlogAction() does it. It also moves the code to run
> recovery_end_command and related stuff into a new function
> CleanupAfterArchiveRecovery(). 0002 then builds on this by postponing
> UpdateFullPageWrites(), PerformRecoveryXLogAction(), and
> CleanupAfterArchiveRecovery() to just before we
> XLogReportParameters(). Because of the refactoring done by 0001, this
> is only a small amount of code movement. Because of the separation
> between DetermineRecoveryXlogAction() and PerformRecoveryXlogAction(),
> the latter doesn't need the xlogreader. So we can do
> DetermineRecoveryXlogAction() at the same time as now, while the
> xlogreader is available, and then we don't need it later when we
> PerformRecoveryXlogAction(), because we already know what we need to
> know. I think this is all fine as far as it goes.
>
> My 0003 is where I see some lingering problems. It creates
> XLogAcceptWrites(), moves the appropriate stuff there, and doesn't
> need the xlogreader. But it doesn't really solve the problem of how
> checkpointer.c would be able to call this function with proper
> arguments. It is at least better in not needing two arguments to
> decide what to do, but how is checkpointer.c supposed to know what to
> pass for xlogaction? Worse yet, how is checkpointer.c supposed to know
> what to pass for EndOfLogTLI and EndOfLog? Actually, EndOfLog doesn't
> seem too problematic, because that value has been stored in four (!)
> places inside XLogCtl by this code:
>
> LogwrtResult.Write = LogwrtResult.Flush = EndOfLog;
>
> XLogCtl->LogwrtResult = LogwrtResult;
>
> XLogCtl->LogwrtRqst.Write = EndOfLog;
> XLogCtl->LogwrtRqst.Flush = EndOfLog;
>
> Presumably we could relatively easily change things around so that we
> fish one of those values ... probably one of the "write" values ...
> back out of XLogCtl instead of passing it as a parameter. That would
> work just as well from the checkpointer as from the startup process,
> and there seems to be no way for the value to change until 
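
To make the "fish it back out of XLogCtl" idea concrete, a rough sketch of
a helper could look like this (the helper itself is hypothetical; the
spinlock and field names follow the existing xlog.c conventions):

static XLogRecPtr
GetStoredEndOfLog(void)
{
	XLogRecPtr	endOfLog;

	/*
	 * Read the stored write-request position back out of shared memory
	 * instead of passing EndOfLog down as a parameter.  This works the
	 * same whether the caller is the startup process or the
	 * checkpointer.
	 */
	SpinLockAcquire(&XLogCtl->info_lck);
	endOfLog = XLogCtl->LogwrtRqst.Write;
	SpinLockRelease(&XLogCtl->info_lck);

	return endOfLog;
}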

Trigger position

2021-09-15 Thread Marcos Pegoraro
Hi Hackers,

Alphabetical order of triggers sometimes makes me write a_Recalc or z_Calc
to be sure it'll be the first or the last trigger with the same event on
that table.

Oracle and SQL Server have FOLLOWS and PRECEDES when defining trigger
execution order. Firebird has POSITION, which I like it more.

What do you think about it? Do you know of an old/abandoned patch that was
not committed?

CREATE TRIGGER RECALC_THAT BEFORE UPDATE POSITION 1 ON ORDERS...
CREATE TRIGGER DO_OTHER_CALC BEFORE UPDATE POSITION 2 ON ORDERS...

Regards,
Marcos


RE: Remove double check when field_name is not NULL in be-secure-openssl.c

2021-09-15 Thread tanghy.f...@fujitsu.com
On Wednesday, September 15, 2021 6:54 PM, Daniel Gustafsson  
wrote:
>The proposal removes a second == NULL check on field_name in the case where
>OBJ_nid2sn() returns an ASN1_OBJECT.  This is not in a hot path, and the ASM
>generated is equal under optimization levels so I don't see the value in the
>code churn and the potential for collisions during backpatching around here.

Thanks for your kindly explanation.
Got it.

Regards,
Tang




Re: Remove double check when field_name is not NULL in be-secure-openssl.c

2021-09-15 Thread Daniel Gustafsson
> On 15 Sep 2021, at 10:06, tanghy.f...@fujitsu.com wrote:

> Attached is a small fix to remove a double check when field_name is not NULL
> in be-secure-openssl.c.
> The double check was introduced in 13cfa02f7 for "Improve error handling in
> backend OpenSSL implementation".

The proposal removes a second == NULL check on field_name in the case where
OBJ_nid2sn() returns an ASN1_OBJECT.  This is not in a hot path, and the ASM
generated is equal under optimization levels so I don't see the value in the
code churn and the potential for collisions during backpatching around here.

--
Daniel Gustafsson   https://vmware.com/



Re: Physical replication from x86_64 to ARM64

2021-09-15 Thread Dmitry Dolgov
> On Tue, Sep 14, 2021 at 08:07:19AM -0700, Andres Freund wrote:
>
> >Yeah.  As far as the hardware goes, if you have the same endianness,
> >struct alignment rules, and floating-point format [1], then physical
> >replication ought to work.  Where things get far stickier is if the
> >operating systems aren't identical, because then you have very great
> >risk of text sorting rules not being the same, leading to index
> >corruption [2].  In modern practice that tends to be a bigger issue
> >than the hardware, and we don't have any good way to check for it.
>
> I'd also be worried about subtle changes in floating point math results, and
> those subsequently leading to index mismatches. Be that because the hardware
> gives differing results, or because of libc differences.

I find the hardware side of the question interesting, as at least in
the Armv8 case there are claims of full IEEE 754 compliance [1]. From
what I can see, some parts that are not specified in this standard are
also implemented similarly on Arm and x86 ([2], [3]). On top of that,
many compilers implement at least a partial level of IEEE 754 compliance
by default (e.g. for gcc [4]). The only strange difference I found is
the x87 FPU unit (without SSE2, see [5]), but I'm not sure what the
consequences of its extra precision could be. All in all, it sounds like,
at least from the hardware perspective, the chances of subtle differences
in floating-point math on Arm are small -- am I missing anything?

[1]: https://developer.arm.com/architectures/instruction-sets/floating-point
[2]: 
https://en.wikipedia.org/wiki/Single-precision_floating-point_format#Single-precision_examples
[3]: https://en.wikipedia.org/wiki/Double-precision_floating-point_format
[4]: https://gcc.gnu.org/wiki/FloatingPointMath
[5]: https://gcc.gnu.org/wiki/x87note
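
To make the x87 caveat concrete, here is a small self-contained C probe of
the kind of divergence in question (illustrative only; whether the two
results actually differ depends on compiler flags and the FPU control word):

#include <stdio.h>

int
main(void)
{
	volatile double a = 1.0;
	volatile double b = 0x1p-53 + 0x1p-78;
	double		r = a + b;

	/*
	 * Strict IEEE 754 binary64 arithmetic (Arm, x86-64/SSE2) rounds once
	 * and yields 0x1.0000000000001p+0.  Legacy x87 code (e.g. gcc -m32
	 * -mfpmath=387) may round the sum to 80-bit extended precision first
	 * and again when storing to r, yielding 0x1p+0 instead -- the
	 * classic double-rounding effect.
	 */
	printf("%a\n", r);
	return 0;
}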




Re: Allow escape in application_name

2021-09-15 Thread Fujii Masao




On 2021/09/14 13:42, kuroda.hay...@fujitsu.com wrote:

Dear Horiguchi-san,

Thank you for giving comments!


Thanks for the new version.  I don't object for reusing
process_log_prefix_padding, but I still find it strange that the
function with the name 'process_padding' is in string.c.  If we move
it to string.c, I'd like to name it "pg_fast_strtoi" or something and
change the interface to int pg_fast_strtoi(const char *p, char
**endptr) that is (roughly) compatible to strtol.  What do (both) you
think about this?


I agree that this interface might be confusing.
I changed its name and interface. What do you think?
Actually I cannot tell whether the name is good or not,
but I could not think of anything else...


A name using the word "strtoint" sounds confusing to me
because the behavior of the function is different from strtoint(),
pg_strtoint32(), etc. Otherwise we can easily misunderstand that
pg_fast_strtoint() can be used as an alternative to strtoint() or
pg_strtoint32(). I have no better idea for the name, though...





I haven't fully checked in which cases parse_pgfdw_appname gives "" as
a result, but I feel that we should use the original value in that
case. That is,


  parse_pgfdw_appname(&buf, values[i]);

  /* use the result if any, otherwise use the original string */
  if (buf.data[0] != 0)
 values[i] = buf.data;

  /* break if it results in non-empty string */
  if (values[0][0] != 0)
 break;


Agreed. It's strange to use the application_name of the server
object in that case. There seems to be four options:

(1) Use the original postgres_fdw.application_name like "%b"
(2) Use the application_name of the server object (if set)
(3) Use fallback_application_name
(4) Use empty string as application_name

(2) and (3) look strange to me because we expect that
postgres_fdw.application_name should override the application_name
of the server object and fallback_application_name.

Also, reporting an invalid escape sequence in application_name as-is,
i.e., (1), looks strange to me.

So (4) looks most intuitive, and is similar to the behavior of
log_line_prefix. Thoughts?
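
A minimal sketch of option (4), reusing the buf/values[i] variables from
the snippet quoted above (illustrative only, not a final patch):

  parse_pgfdw_appname(&buf, values[i]);
  /* Always substitute the expanded result, even when every escape
   * expanded to nothing; this mirrors how log_line_prefix treats its
   * format string. */
  values[i] = buf.data;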

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



