Re: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-05-09 Thread Peter Smith
Hi, here are some review comments for patch v7-0004.

==
Commit message

1.
A detailed commit message is needed to describe the purpose and
details of this patch.

==
doc/src/sgml/ref/alter_subscription.sgml

2. CREATE SUBSCRIPTION

Shouldn't there be an entry for "force_alter" parameter in the CREATE
SUBSCRIPTION "parameters" section, instead of just vaguely mentioning
it in passing when describing the "two_phase" in ALTER SUBSCRIPTION?

~

3. ALTER SUBSCRIPTION - alterable parameters

And shouldn't this new option also be named in the ALTER SUBSCRIPTION
list: "The parameters that can be altered are..."

==
src/backend/commands/subscriptioncmds.c

4.
  XLogRecPtr lsn;
+ bool twophase_force;
 } SubOpts;

IMO this field ought to be called 'force_alter' to be the same as the
option name. Sure, now it is only relevant for 'two_phase', but that
might not always be the case in the future.

~~~

5. AlterSubscription

+ /*
+ * Abort prepared transactions if force option is also
+ * specified. Otherwise raise an ERROR.
+ */
+ if (!opts.twophase_force)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot alter %s when there are prepared transactions",
+ "two_phase = false")));
+

5a.
/if force option is also specified/only if the 'force_alter' option is true/

~

5b.
"two_phase = false" -- IMO that should say "two_phase = off"

~

5c.
IMO this ereport should include a errhint to tell the user they can
use 'force_alter = true' to avoid getting this error.
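For illustration, combining 5b and 5c might give something like this (a pseudocode sketch of the suggested shape, server-internal code, not the actual patch):

```c
if (!opts.twophase_force)
    ereport(ERROR,
            (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
             errmsg("cannot alter %s when there are prepared transactions",
                    "two_phase = off"),
             errhint("Use %s to force this operation.",
                     "force_alter = true")));
```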

~~~

6.

+ /* force_alter cannot be used standalone */
+ if (IsSet(opts.specified_opts, SUBOPT_FORCE_ALTER) &&
+ !IsSet(opts.specified_opts, SUBOPT_TWOPHASE_COMMIT))
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("%s must be specified with %s",
+ "force_alter", "two_phase")));
+

IMO this rule is not necessary so the code should be removed. I think
using 'force_alter' standalone doesn't do anything at all (certainly,
it does no harm) so why add more complications (more rules, more code,
more tests) just for the sake of it?

==
src/test/subscription/t/099_twophase_added.pl

7.
+$node_subscriber->safe_psql('postgres',
+"ALTER SUBSCRIPTION sub SET (two_phase = off, force_alter = on);");

"force" is a verb, so it is better to say 'force_alter = true' instead
of 'force_alter = on'.

~~~

8.
 $result = $node_subscriber->safe_psql('postgres',
 "SELECT count(*) FROM pg_prepared_xacts;");
 is($result, q(0), "prepared transaction done by worker is aborted");

+$node_subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION sub ENABLE;");
+

I felt the ENABLE statement should be above the SELECT statement so
that the code is more like it was before applying the patch.

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Weird test mixup

2024-05-09 Thread Michael Paquier
On Thu, May 09, 2024 at 01:47:45PM +0900, Michael Paquier wrote:
> That sounds fine to do that at the end..  I'm not sure how large this
> chunk area added to each InjectionPointEntry should be, though.  128B
> stored in each InjectionPointEntry would be more than enough I guess?
> Or more like 1024?  The in-core module does not need much currently,
> but larger is likely better for pluggability.

Actually, this is leading to simplifications in the module, giving me
the attached:
4 files changed, 117 insertions(+), 134 deletions(-) 

So I like your suggestion.  This version closes the race window
between the shmem exit detach in backend A and a concurrent backend B
running a point local to A, so that B will never run A's local point.
However, it can cause a failure in the shmem exit callback of backend
A if backend B detaches a point local to A, because A tracks its local
points with a static list in its TopMemoryContext, at least in the
attached.  The second case could be solved by tracking the list of
local points in the module's InjectionPointSharedState, but is that
case really worth the complications it adds to the module, knowing
that the first case would be solid enough?  Perhaps not.

Another idea I have for the second case is to make
InjectionPointDetach() a bit "softer", returning a boolean status
rather than failing if the detach cannot be done, so that the shmem
exit callback can still loop through the entries it has in store.  It
would always be possible for a concurrent backend to do a detach
followed by an attach with the same name, causing the shmem exit
callback to drop a point it should not, but that's not really a
plausible case IMO :)
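A generic sketch of that "softer" detach pattern, with a toy registry standing in for the shared hash (all names here are illustrative, not the actual module API):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy registry standing in for the shared injection-point hash. */
#define MAX_POINTS 8
static char registry[MAX_POINTS][64];

static bool
point_attach(const char *name)
{
	for (int i = 0; i < MAX_POINTS; i++)
	{
		if (registry[i][0] == '\0')
		{
			strncpy(registry[i], name, sizeof(registry[i]) - 1);
			return true;
		}
	}
	return false;
}

/* "Softer" detach: report failure instead of raising an error. */
static bool
point_detach(const char *name)
{
	for (int i = 0; i < MAX_POINTS; i++)
	{
		if (strcmp(registry[i], name) == 0)
		{
			registry[i][0] = '\0';
			return true;
		}
	}
	return false;				/* already gone; caller can keep going */
}

/* Exit callback: drop local points, tolerating concurrent detaches. */
static void
cleanup_local_points(const char **local, int n)
{
	for (int i = 0; i < n; i++)
		(void) point_detach(local[i]);
}
```

The shmem exit callback can then loop over all of its tracked points without erroring out when another backend got there first.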

This stuff can be adjusted in subtle ways depending on the cases you
are most interested in.  What do you think?
--
Michael
From 27b0ba88d3530c50cb87d1495439372885f0773b Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Thu, 9 May 2024 16:29:16 +0900
Subject: [PATCH v3] Add chunk area for injection points

---
 src/include/utils/injection_point.h   |   7 +-
 src/backend/utils/misc/injection_point.c  |  45 -
 .../injection_points/injection_points.c   | 189 +++---
 doc/src/sgml/xfunc.sgml   |  10 +-
 4 files changed, 117 insertions(+), 134 deletions(-)

diff --git a/src/include/utils/injection_point.h b/src/include/utils/injection_point.h
index 55524b568f..a8a11de5f2 100644
--- a/src/include/utils/injection_point.h
+++ b/src/include/utils/injection_point.h
@@ -23,14 +23,17 @@
 /*
  * Typedef for callback function launched by an injection point.
  */
-typedef void (*InjectionPointCallback) (const char *name);
+typedef void (*InjectionPointCallback) (const char *name,
+		const void *private_data);
 
 extern Size InjectionPointShmemSize(void);
 extern void InjectionPointShmemInit(void);
 
 extern void InjectionPointAttach(const char *name,
  const char *library,
- const char *function);
+ const char *function,
+ const void *private_data,
+ int private_data_size);
 extern void InjectionPointRun(const char *name);
 extern void InjectionPointDetach(const char *name);
 
diff --git a/src/backend/utils/misc/injection_point.c b/src/backend/utils/misc/injection_point.c
index 0cf4d51cac..d46ad8b48f 100644
--- a/src/backend/utils/misc/injection_point.c
+++ b/src/backend/utils/misc/injection_point.c
@@ -42,6 +42,7 @@ static HTAB *InjectionPointHash;	/* find points from names */
 #define INJ_NAME_MAXLEN		64
 #define INJ_LIB_MAXLEN		128
 #define INJ_FUNC_MAXLEN		128
+#define INJ_PRIVATE_MAXLEN	1024
 
 /* Single injection point stored in InjectionPointHash */
 typedef struct InjectionPointEntry
@@ -49,6 +50,12 @@ typedef struct InjectionPointEntry
 	char		name[INJ_NAME_MAXLEN];	/* hash key */
 	char		library[INJ_LIB_MAXLEN];	/* library */
 	char		function[INJ_FUNC_MAXLEN];	/* function */
+
+	/*
+	 * Opaque data area that modules can use to pass some custom data to
+	 * callbacks, registered when attached.
+	 */
+	char		private_data[INJ_PRIVATE_MAXLEN];
 } InjectionPointEntry;
 
 #define INJECTION_POINT_HASH_INIT_SIZE	16
@@ -61,6 +68,7 @@ typedef struct InjectionPointEntry
 typedef struct InjectionPointCacheEntry
 {
 	char		name[INJ_NAME_MAXLEN];
+	char		private_data[INJ_PRIVATE_MAXLEN];
 	InjectionPointCallback callback;
 } InjectionPointCacheEntry;
 
@@ -73,7 +81,8 @@ static HTAB *InjectionPointCache = NULL;
  */
 static void
 injection_point_cache_add(const char *name,
-		  InjectionPointCallback callback)
+		  InjectionPointCallback callback,
+		  const void *private_data)
 {
 	InjectionPointCacheEntry *entry;
 	bool		found;
@@ -99,6 +108,7 @@ injection_point_cache_add(const char *name,
 	Assert(!found);
 	strlcpy(entry->name, name, sizeof(entry->name));
 	entry->callback = callback;
+	memcpy(entry->private_data, private_data, INJ_PRIVATE_MAXLEN);
 }
 
 /*
@@ -124,11 +134,14 @@ injection_point_cache_remove(const char *name)
  * Retrieve an injection

RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-05-09 Thread Hayato Kuroda (Fujitsu)
Dear Peter,

Thanks for reviewing! The patch will be posted in an upcoming message.

> ==
> src/backend/access/transam/twophase.c
> 
> 1. IsTwoPhaseTransactionGidForSubid
> 
> +/*
> + * IsTwoPhaseTransactionGidForSubid
> + * Check whether the given GID is formed by TwoPhaseTransactionGid.
> + */
> +static bool
> +IsTwoPhaseTransactionGidForSubid(Oid subid, char *gid)
> 
> I think the function comment should mention something about 'subid'.
> 
> SUGGESTION
> Check whether the given GID (as formed by TwoPhaseTransactionGid) is
> for the specified 'subid'.

Fixed.
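For reference, assuming the "pg_gid_<subid>_<xid>" format produced by TwoPhaseTransactionGid, the check can be sketched generically (sscanf-based toy version; the actual patch code differs):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

typedef unsigned int Oid;

/*
 * Check whether the given GID (as formed by TwoPhaseTransactionGid,
 * i.e. "pg_gid_<subid>_<xid>") is for the specified subid.
 */
static bool
gid_is_for_subid(Oid subid, const char *gid)
{
	Oid			gid_subid;
	unsigned int xid;
	int			nchars = 0;

	if (sscanf(gid, "pg_gid_%u_%u%n", &gid_subid, &xid, &nchars) != 2)
		return false;
	/* reject trailing garbage and other subscriptions' GIDs */
	return gid[nchars] == '\0' && gid_subid == subid;
}
```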

> src/backend/commands/subscriptioncmds.c
> 
> 2. AlterSubscription
> 
> + if (!opts.twophase &&
> + form->subtwophasestate == LOGICALREP_TWOPHASE_STATE_ENABLED
> &&
> + LookupGXactBySubid(subid))
> + /* Add error message */
> + ereport(ERROR,
> + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("cannot disable two_phase when uncommitted prepared
> transactions present"),
> + errhint("Resolve these transactions and try again")));
> 
> The comment "/* Add error message */" seems unnecessary.

Yeah, this was an internal flag. Removed.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-05-09 Thread Hayato Kuroda (Fujitsu)
Dear Peter,

> ==
> Commit message
> 
> 1.
> IIUC there is quite a lot of subtlety and details about why the slot
> option needs to be changed only when altering "true" to "false", but
> not when altering "false" to "true".
> 
> It also should explain why PreventInTransactionBlock is only needed
> when altering two_phase "true" to "false".
> 
> Please include a commit message to describe all those tricky details.

Added.

> ==
> src/backend/commands/subscriptioncmds.c
> 
> 2. AlterSubscription
> 
> - PreventInTransactionBlock(isTopLevel, "ALTER SUBSCRIPTION ... SET
> (two_phase)");
> + if (!opts.twophase)
> + PreventInTransactionBlock(isTopLevel,
> +   "ALTER SUBSCRIPTION ... SET (two_phase = off)");
> 
> IMO this needs a comment to explain why PreventInTransactionBlock is
> only needed when changing the 'two_phase' option from on to off.

Added. Thoughts?

> 3. AlterSubscription
> 
> /*
> * Try to acquire the connection necessary for altering slot.
> *
> * This has to be at the end because otherwise if there is an error while
> * doing the database operations we won't be able to rollback altered
> * slot.
> */
> if (replaces[Anum_pg_subscription_subfailover - 1] ||
> replaces[Anum_pg_subscription_subtwophasestate - 1])
> {
> bool must_use_password;
> char*err;
> WalReceiverConn *wrconn;
> bool failover_needs_to_be_updated;
> bool two_phase_needs_to_be_updated;
> 
> /* Load the library providing us libpq calls. */
> load_file("libpqwalreceiver", false);
> 
> /* Try to connect to the publisher. */
> must_use_password = sub->passwordrequired && !sub->ownersuperuser;
> wrconn = walrcv_connect(sub->conninfo, true, true, must_use_password,
> sub->name, &err);
> if (!wrconn)
> ereport(ERROR,
> (errcode(ERRCODE_CONNECTION_FAILURE),
> errmsg("could not connect to the publisher: %s", err)));
> 
> /*
> * Consider which slot option must be altered.
> *
> * We must alter the failover option whenever subfailover is updated.
> * Two_phase, however, is altered only when changing true to false.
> */
> failover_needs_to_be_updated =
> replaces[Anum_pg_subscription_subfailover - 1];
> two_phase_needs_to_be_updated =
> (replaces[Anum_pg_subscription_subtwophasestate - 1] &&
> !opts.twophase);
> 
> PG_TRY();
> {
> if (two_phase_needs_to_be_updated || failover_needs_to_be_updated)
> walrcv_alter_slot(wrconn, sub->slotname,
>   failover_needs_to_be_updated ? &opts.failover : NULL,
>   two_phase_needs_to_be_updated ? &opts.twophase : NULL);
> }
> PG_FINALLY();
> {
> walrcv_disconnect(wrconn);
> }
> PG_END_TRY();
> }
> 
> 3a.
> The block comment "Consider which slot option must be altered..." says
> WHEN those options need to be updated, but it doesn't say WHY. e.g.
> why only update the 'two_phase' when it is being disabled but not when
> it is being enabled? In other words, I think there needs to be more
> background/reason details given in this comment.
> 
> ~~~
> 
> 3b.
> Can't those 2 new variable assignments be done up-front and guard this
> entire "if-block" instead of the current replaces[] guarding it? Then
> the code is somewhat simplified.
> 
> SUGGESTION:
> /*
>  * 
>  */
> update_failover = replaces[Anum_pg_subscription_subfailover - 1];
> update_two_phase = (replaces[Anum_pg_subscription_subtwophasestate -
> 1] && !opts.twophase);
> 
> /*
>  * Try to acquire the connection necessary for altering slot.
>  *
>  * This has to be at the end because otherwise if there is an error while
>  * doing the database operations we won't be able to rollback altered
>  * slot.
>  */
> if (update_failover || update_two_phase)
> {
>   ...
> 
>   /* Load the library providing us libpq calls. */
>   load_file("libpqwalreceiver", false);
> 
>   /* Try to connect to the publisher. */
>   must_use_password = sub->passwordrequired && !sub->ownersuperuser;
>   wrconn = walrcv_connect(sub->conninfo, true, true,
> must_use_password, sub->name, &err);
>   if (!wrconn)
> ereport(ERROR, ...);
> 
>   PG_TRY();
>   {
> walrcv_alter_slot(wrconn, sub->slotname,
>   update_failover ? &opts.failover : NULL,
>   update_two_phase ? &opts.twophase : NULL);
>   }
>   PG_FINALLY();
>   {
> walrcv_disconnect(wrconn);
>   }
>   PG_END_TRY();
> }

Two variables were added and comments were updated.

> .../libpqwalreceiver/libpqwalreceiver.c
> 
> 4.
> + appendStringInfo(&cmd, "ALTER_REPLICATION_SLOT %s ( ",
> + quote_identifier(slotname));
> +
> + if (failover)
> + appendStringInfo(&cmd, "FAILOVER %s ",
> + (*failover) ? "true" : "false");
> +
> + if (two_phase)
> + appendStringInfo(&cmd, "TWO_PHASE %s%s ",
> + (*two_phase) ? "true" : "false",
> + failover ? ", " : "");
> +
> + appendStringInfoString(&cmd, ");");
> 
> 4a.
> IIUC the comma logic here was broken in v7 when you swapped the order.
> Anyway, IMO it will be better NOT to try combining that comma logic
> with the existing appendStringInfo. Doing it separately is both easier
> and less error-prone.
> 
> Furthermore, the parentheses like "(*two_phase)" instead of just
> 

RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-05-09 Thread Hayato Kuroda (Fujitsu)
Dear Peter,

> Commit Message
> 
> 1.
> The patch needs a commit message to describe the purpose and highlight
> any limitations and other details.

Added.

> ==
> doc/src/sgml/ref/alter_subscription.sgml
> 
> 2.
> +
> + 
> +  The two_phase parameter can only be altered when
> the
> +  subscription is disabled. When altering the parameter from
> true
> +  to false, the backend process checks prepared
> +  transactions done by the logical replication worker and aborts them.
> + 
> 
> Here, the para is referring to "true" and "false" but earlier on this
> page it talks about "twophase = off". IMO it is better to use a
> consistent terminology like "on|off" everywhere instead of randomly
> changing the way it is described each time.

I checked the contents and changed the text to use "on|off".

> ==
> src/backend/commands/subscriptioncmds.c
> 
> 3. AlterSubscription
> 
>   if (IsSet(opts.specified_opts, SUBOPT_TWOPHASE_COMMIT))
>   {
> + List *prepared_xacts = NIL;
> 
> This 'prepared_xacts' can be declared at a lower scope because it is
> only used if (!opts.twophase).
> 
> Furthermore, IIUC you don't need to assign NIL in the declaration
> because there is no chance for it to be unassigned anyway.

Made the scope narrower and removed the initialization.

> ~~~
> 
> 4. AlterSubscription
> 
> + /*
> + * The changed two_phase option (true->false) of the
> + * slot can't be rolled back.
> + */
>   PreventInTransactionBlock(isTopLevel,
> "ALTER SUBSCRIPTION ... SET (two_phase = off)");
> 
> Here is another example of inconsistent mixing of the terminology
> where the comment says "true"/"false" but the message says "off".
> Let's keep everything consistent. (I prefer on|off).

Modified.

> ~~~
> 
> 5.
> + if (sub->twophasestate == LOGICALREP_TWOPHASE_STATE_ENABLED &&
> + (prepared_xacts = GetGidListBySubid(subid)) != NIL)
> + {
> + ListCell *cell;
> +
> + /* Abort all listed transactions */
> + foreach(cell, prepared_xacts)
> + FinishPreparedTransaction((char *) lfirst(cell),
> +   false);
> +
> + list_free(prepared_xacts);
> + }
> 
> 5A.
> IIRC there is a cleaner way to write this loop without needing a
> ListCell variable -- e.g. the foreach_ptr() macro?

Changed.

> 5B.
> Shouldn't this be using list_free_deep() so the pstrdup gid gets freed too?

Yeah, fixed.
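For illustration, with foreach_ptr() and list_free_deep() the loop might end up looking like this (a pseudocode sketch of server-internal code, not the actual patch):

```c
/* Abort all listed transactions; list_free_deep() also frees the gids */
foreach_ptr(char, gid, prepared_xacts)
    FinishPreparedTransaction(gid, false);

list_free_deep(prepared_xacts);
```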

> ==
> src/test/subscription/t/099_twophase_added.pl
> 
> 6.
> +##
> +# Check the case that prepared transactions exist on subscriber node
> +##
> +
> 
> Give some more detailed comments here similar to the review comment of
> patch v7-0002 for the other part of this TAP test.
> 
> ~~~
> 
> 7. TAP test - comments
> 
> Same as for my v7-0002 review comments, I think this test case also
> needs a few more one-line comments to describe the sub-steps. e.g.:
> 
> # prepare a transaction to insert some rows to the table
> 
> # verify the prepared tx is replicated to the subscriber (because
> 'two_phase = on')
> 
> # toggle the two_phase to 'off' *before* the COMMIT PREPARED
> 
> # verify the prepared tx got aborted
> 
> # do the COMMIT PREPARED (note that now two_phase is 'off')
> 
> # verify the inserted rows got replicated ok

They were fixed based on your previous comments.

> 
> 8. TAP test - subscription name
> 
> It's better to rename the SUBSCRIPTION in this TAP test so you can
> avoid getting log warnings like:
> 
> psql::4: WARNING:  subscriptions created by regression test
> cases should have names starting with "regress_"
> psql::4: NOTICE:  created replication slot "sub" on publisher

Modified, but it was included in 0001.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 




RE: Slow catchup of 2PC (twophase) transactions on replica in LR

2024-05-09 Thread Hayato Kuroda (Fujitsu)
Dear Peter,

> 
> ==
> Commit message
> 
> 1.
> A detailed commit message is needed to describe the purpose and
> details of this patch.

Added.

> ==
> doc/src/sgml/ref/alter_subscription.sgml
> 
> 2. CREATE SUBSCRIPTION
> 
> Shouldn't there be an entry for "force_alter" parameter in the CREATE
> SUBSCRIPTION "parameters" section, instead of just vaguely mentioning
> it in passing when describing the "two_phase" in ALTER SUBSCRIPTION?
>
> 3. ALTER SUBSCRIPTION - alterable parameters
> 
> And shouldn't this new option also be named in the ALTER SUBSCRIPTION
> list: "The parameters that can be altered are..."

Hmm, but the parameter cannot be used with CREATE SUBSCRIPTION. Should
we modify it to accept the option and add the description to the doc?
This has not been done.

> ==
> src/backend/commands/subscriptioncmds.c
> 
> 4.
>   XLogRecPtr lsn;
> + bool twophase_force;
>  } SubOpts;
> 
> IMO this field ought to be called 'force_alter' to be the same as the
> option name. Sure, now it is only relevant for 'two_phase', but that
> might not always be the case in the future.

Modified.

> 5. AlterSubscription
> 
> + /*
> + * Abort prepared transactions if force option is also
> + * specified. Otherwise raise an ERROR.
> + */
> + if (!opts.twophase_force)
> + ereport(ERROR,
> + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("cannot alter %s when there are prepared transactions",
> + "two_phase = false")));
> +
> 
> 5a.
> /if force option is also specified/only if the 'force_alter' option is true/

Modified.

> 
> 5b.
> "two_phase = false" -- IMO that should say "two_phase = off"

Modified.

> 5c.
> IMO this ereport should include a errhint to tell the user they can
> use 'force_alter = true' to avoid getting this error.

Hint was added.

> 6.
> 
> + /* force_alter cannot be used standalone */
> + if (IsSet(opts.specified_opts, SUBOPT_FORCE_ALTER) &&
> + !IsSet(opts.specified_opts, SUBOPT_TWOPHASE_COMMIT))
> + ereport(ERROR,
> + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("%s must be specified with %s",
> + "force_alter", "two_phase")));
> +
> 
> IMO this rule is not necessary so the code should be removed. I think
> using 'force_alter' standalone doesn't do anything at all (certainly,
> it does no harm) so why add more complications (more rules, more code,
> more tests) just for the sake of it?

Removed. So standalone 'force_alter' is now a no-op.

> src/test/subscription/t/099_twophase_added.pl
> 
> 7.
> +$node_subscriber->safe_psql('postgres',
> +"ALTER SUBSCRIPTION sub SET (two_phase = off, force_alter = on);");
> 
> "force" is a verb, so it is better to say 'force_alter = true' instead
> of 'force_alter = on'.

Fixed, though as a non-native speaker I'm not sure which is better.

> 8.
>  $result = $node_subscriber->safe_psql('postgres',
>  "SELECT count(*) FROM pg_prepared_xacts;");
>  is($result, q(0), "prepared transaction done by worker is aborted");
> 
> +$node_subscriber->safe_psql('postgres', "ALTER SUBSCRIPTION sub
> ENABLE;");
> +
> 
> I felt the ENABLE statement should be above the SELECT statement so
> that the code is more like it was before applying the patch.

Fixed.

Please see attached patch set.


Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



v8-0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIPT.patch
Description:  v8-0001-Allow-altering-of-two_phase-option-of-a-SUBSCRIPT.patch


v8-0002-Alter-slot-option-two_phase-only-when-altering-on.patch
Description:  v8-0002-Alter-slot-option-two_phase-only-when-altering-on.patch


v8-0003-Abort-prepared-transactions-while-altering-two_ph.patch
Description:  v8-0003-Abort-prepared-transactions-while-altering-two_ph.patch


v8-0004-Add-force_alter-option-for-ALTER-SUBSCIRPTION-.-S.patch
Description:  v8-0004-Add-force_alter-option-for-ALTER-SUBSCIRPTION-.-S.patch


Re: First draft of PG 17 release notes

2024-05-09 Thread Aleksander Alekseev
Hi,

> I have committed the first draft of the PG 17 release notes;  you can
> see the results here:
>
> https://momjian.us/pgsql_docs/release-17.html
>
> It will be improved until the final release.  The item count is 188,
> which is similar to recent releases:

Thanks for working on this.

I believe the part of the 64-bit XIDs patchset that was delivered in
PG17 is worth highlighting in "E.1.3.10. Source Code" section:

4ed8f0913bfd
2cdf131c46e6
5a1dfde8334b
a60b8a58f435

All this can probably be summarized as one bullet "Index SLRUs by
64-bit integers rather than by 32-bit ones" where the authors are:
Maxim Orlov, Aleksander Alekseev, Alexander Korotkov, Teodor Sigaev,
Nikita Glukhov, Pavel Borisov, Yura Sokolov.

-- 
Best regards,
Aleksander Alekseev




consider -Wmissing-variable-declarations

2024-05-09 Thread Peter Eisentraut
In [0] I had noticed that we have no automated verification that global 
variables are declared in header files.  (For global functions, we have 
this through -Wmissing-prototypes.)  As I mentioned there, I discovered 
the Clang compiler option -Wmissing-variable-declarations, which does 
exactly that.  Clang has supported this for quite some time, and GCC 14, 
which was released a few days ago, now also supports it.  I went and 
installed this option into the standard build flags and cleaned up the 
warnings it found, which revealed a number of interesting things.


I think these checks are important.  We have been trying to mark global 
variables as PGDLLIMPORT consistently, but that only catches variables 
declared in header files.  Also, a threading project would surely 
benefit from global variables (thread-local variables?) having 
consistent declarations.


Attached are patches organized by sub-topic.  The most dubious stuff is 
in patches 0006 and 0007.  A bunch of GUC-related variables are not in 
header files but are pulled in via ad-hoc extern declarations.  I can't 
recognize an intentional scheme there, probably just done for 
convenience or copied from previous practice.  These should be organized 
into appropriate header files.
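A minimal illustration of what the warning catches and how a header declaration silences it (variable names are invented for the example):

```c
#include <assert.h>

/*
 * -Wmissing-variable-declarations warns when a non-static global is
 * defined without a prior declaration.  The fix mirrors what a header
 * provides: an extern declaration visible at the definition site.
 */

/* what a header (say, demo.h) would declare */
extern int demo_max_connections;

/* the definition in the .c file: no warning, a declaration was seen */
int			demo_max_connections = 100;

/* purely file-local state needs no header entry; just mark it static */
static int demo_counter = 0;

static int
demo_bump(void)
{
	return ++demo_counter;
}
```

Globals that are genuinely private to one file get `static` instead, which also keeps them out of the PGDLLIMPORT discussion entirely.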



[0]: 
https://www.postgresql.org/message-id/c4ac402f-9d7b-45fa-bbc1-4a5bf0a9f...@eisentraut.org

From 296a1c465de6cfea333bf7bb7950f02b3ef80b12 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Wed, 8 May 2024 13:49:37 +0200
Subject: [PATCH v1 1/7] Add -Wmissing-variable-declarations to the standard
 compilation flags

This warning flag detects global variables not declared in header
files.  This is similar to what -Wmissing-prototypes does for
functions.  (More correctly, it is similar to what
-Wmissing-declarations does for functions, but -Wmissing-prototypes is
a superset of that in C.)

This flag is new in GCC 14.  Clang has supported it for a while.

XXX many warnings to fix first
---
 configure | 99 +++
 configure.ac  |  9 +++
 meson.build   | 10 +++
 src/Makefile.global.in|  1 +
 src/interfaces/ecpg/test/Makefile.regress |  2 +-
 src/interfaces/ecpg/test/meson.build  |  1 +
 src/makefiles/meson.build |  2 +
 src/tools/pg_bsd_indent/Makefile  |  2 +
 src/tools/pg_bsd_indent/meson.build   |  1 +
 9 files changed, 126 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index 89644f2249e..70fd1de88b5 100755
--- a/configure
+++ b/configure
@@ -741,6 +741,7 @@ CXXFLAGS_SL_MODULE
 CFLAGS_SL_MODULE
 CFLAGS_VECTORIZE
 CFLAGS_UNROLL_LOOPS
+PERMIT_MISSING_VARIABLE_DECLARATIONS
 PERMIT_DECLARATION_AFTER_STATEMENT
 LLVM_BINPATH
 LLVM_CXXFLAGS
@@ -6080,6 +6081,104 @@ if test x"$pgac_cv_prog_CXX_cxxflags__Wformat_security" 
= x"yes"; then
 fi
 
 
+  # gcc 14+, clang for a while
+  save_CFLAGS=$CFLAGS
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether ${CC} supports 
-Wmissing-variable-declarations, for CFLAGS" >&5
+$as_echo_n "checking whether ${CC} supports -Wmissing-variable-declarations, 
for CFLAGS... " >&6; }
+if ${pgac_cv_prog_CC_cflags__Wmissing_variable_declarations+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  pgac_save_CFLAGS=$CFLAGS
+pgac_save_CC=$CC
+CC=${CC}
+CFLAGS="${CFLAGS} -Wmissing-variable-declarations"
+ac_save_c_werror_flag=$ac_c_werror_flag
+ac_c_werror_flag=yes
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+  pgac_cv_prog_CC_cflags__Wmissing_variable_declarations=yes
+else
+  pgac_cv_prog_CC_cflags__Wmissing_variable_declarations=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+ac_c_werror_flag=$ac_save_c_werror_flag
+CFLAGS="$pgac_save_CFLAGS"
+CC="$pgac_save_CC"
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: 
$pgac_cv_prog_CC_cflags__Wmissing_variable_declarations" >&5
+$as_echo "$pgac_cv_prog_CC_cflags__Wmissing_variable_declarations" >&6; }
+if test x"$pgac_cv_prog_CC_cflags__Wmissing_variable_declarations" = x"yes"; 
then
+  CFLAGS="${CFLAGS} -Wmissing-variable-declarations"
+fi
+
+
+  PERMIT_MISSING_VARIABLE_DECLARATIONS=
+  if test x"$save_CFLAGS" != x"$CFLAGS"; then
+PERMIT_MISSING_VARIABLE_DECLARATIONS=-Wno-missing-variable-declarations
+  fi
+
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether ${CXX} supports 
-Wmissing-variable-declarations, for CXXFLAGS" >&5
+$as_echo_n "checking whether ${CXX} supports -Wmissing-variable-declarations, 
for CXXFLAGS... " >&6; }
+if ${pgac_cv_prog_CXX_cxxflags__Wmissing_variable_declarations+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  pgac_save_CXXFLAGS=$CXXFLAGS
+pgac_save_CXX=$CXX
+CXX=${CXX}
+CXXFLAGS="${CXXFLAGS} -Wmissing-variable-declarations"
+ac_save_cxx_werror_flag=$ac_cxx_werror_flag
+ac_cxx_werror_flag=yes
+ac_ext=cpp
+ac_cpp='$CXXCPP $CP

[no subject]

2024-05-09 Thread Rajan Pandey
Hi everyone, I just installed Postgres and pg_tle extension as I was
looking to contribute to pg_tle.

Somehow, I am unable to update the shared_preload_libraries. It feels like
ALTER has happened but the SPL value is not updated:

> test=# show shared_preload_libraries;
>  shared_preload_libraries
> --
>
> (1 row)
>
> test=# ALTER SYSTEM SET shared_preload_libraries TO 'pg_tle';
> ALTER SYSTEM
> test=# SELECT pg_reload_conf();
>  pg_reload_conf
> 
>  t
> (1 row)
>
> test=# show shared_preload_libraries;
>  shared_preload_libraries
> --
>
> (1 row)
>
> test=#
>

I'm unable to open the postgresql.conf file to update it either. I provided
the correct macbook password above. But it is not accepted! :/

> rajanx@b0be835adb74 postgresql % cat
> /usr/local/pgsql/data/postgresql.auto.conf
> cat: /usr/local/pgsql/data/postgresql.auto.conf: Permission denied

rajanx@b0be835adb74 postgresql % su cat
> /usr/local/pgsql/data/postgresql.auto.conf
> Password:
> su: Sorry
>

Please help! Thank you. :)
-- 
Regards
Rajan Pandey


Re:

2024-05-09 Thread Kashif Zeeshan
Hi


On Thu, May 9, 2024 at 2:50 PM Rajan Pandey 
wrote:

> Hi everyone, I just installed Postgres and pg_tle extension as I was
> looking to contribute to pg_tle.
>
> Somehow, I am unable to update the shared_preload_libraries. It feels like
> ALTER has happened but the SPL value is not updated:
>
>> test=# show shared_preload_libraries;
>>  shared_preload_libraries
>> --
>>
>> (1 row)
>>
>> test=# ALTER SYSTEM SET shared_preload_libraries TO 'pg_tle';
>> ALTER SYSTEM
>> test=# SELECT pg_reload_conf();
>>  pg_reload_conf
>> 
>>  t
>> (1 row)
>>
>> test=# show shared_preload_libraries;
>>  shared_preload_libraries
>> --
>>
>> (1 row)
>>
>> test=#
>>
>
>
I'm unable to open the postgresql.conf file to update it either. I provided
> the correct macbook password above. But it is not accepted! :/
>
>> rajanx@b0be835adb74 postgresql % cat
>> /usr/local/pgsql/data/postgresql.auto.conf
>> cat: /usr/local/pgsql/data/postgresql.auto.conf: Permission denied
>
> rajanx@b0be835adb74 postgresql % su cat
>> /usr/local/pgsql/data/postgresql.auto.conf
>> Password:
>> su: Sorry
>>
>
The issue is related to permissions. Please check which user performed
the installation, and update the postgresql.auto.conf file as that
user.

Regards
Kashif Zeeshan
Bitnine Global

>
> Please help! Thank you. :)
> --
> Regards
> Rajan Pandey
>
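Worth noting as well: shared_preload_libraries can only take effect at server start; pg_reload_conf() cannot apply it, and SHOW reports the value the running postmaster actually started with. A sketch of the sequence that would take effect (the data directory path is an assumption):

```sql
ALTER SYSTEM SET shared_preload_libraries TO 'pg_tle';
-- then restart the server (a reload is not enough), e.g.:
--   pg_ctl -D /usr/local/pgsql/data restart
SHOW shared_preload_libraries;   -- now reports 'pg_tle'
```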


Re: First draft of PG 17 release notes

2024-05-09 Thread jian he
On Thu, May 9, 2024 at 12:04 PM Bruce Momjian  wrote:
>
> I have committed the first draft of the PG 17 release notes;  you can
> see the results here:
>
> https://momjian.us/pgsql_docs/release-17.html
>

another potential incompatibility issue:
ALTER TABLE DROP PRIMARY KEY

see:
https://www.postgresql.org/message-id/202404181849.6frtmajobe27%40alvherre.pgsql




Re: Parallel CREATE INDEX for GIN indexes

2024-05-09 Thread Andy Fan


Hello Tomas,

>>> 2) v20240502-0002-Use-mergesort-in-the-leader-process.patch
>>>
>>> The approach implemented by 0001 works, but there's a little bit of
>>> issue - if there are many distinct keys (e.g. for trigrams that can
>>> happen very easily), the workers will hit the memory limit with only
>>> very short TID lists for most keys. For serial build that means merging
>>> the data into a lot of random places, and in parallel build it means the
>>> leader will have to merge a lot of tiny lists from many sorted rows.
>>>
>>> Which can be quite annoying and expensive, because the leader does so
>>> using qsort() in the serial part. It'd be better to ensure most of the
>>> sorting happens in the workers, and the leader can do a mergesort. But
>>> the mergesort must not happen too often - merging many small lists is
>>> not cheaper than a single qsort (especially when the lists overlap).
>>>
>>> So this patch changes the workers to process the data in two phases. The
>>> first works as before, but the data is flushed into a local tuplesort.
>>> And then each workers sorts the results it produced, and combines them
>>> into results with much larger TID lists, and those results are written
>>> to the shared tuplesort. So the leader only gets very few lists to
>>> combine for a given key - usually just one list per worker.
>> 
>> Hmm, I was hoping we could implement the merging inside the tuplesort
>> itself during its own flush phase, as it could save significantly on
>> IO, and could help other users of tuplesort with deduplication, too.
>> 
>
> Would that happen in the worker or leader process? Because my goal was
> to do the expensive part in the worker, because that's what helps with
> the parallelization.

I guess both of you are talking about the worker process; here is what I
have in mind:

*btbuild* also lets the WORKER dump the tuples into the Sharedsort struct
and lets the LEADER merge them directly. I think the aim of that design
is that it can potentially save a merge run. In the current patch, the
worker dumps to a local tuplesort and merges its runs, and then the
leader runs the merges again. I admit the goal of this patch is
reasonable, but I feel we need to adopt this approach conditionally
somehow, and if we find the way, we can apply it to btbuild as well.

-- 
Best Regards
Andy Fan





Re: First draft of PG 17 release notes

2024-05-09 Thread Dagfinn Ilmari Mannsåker
Bruce Momjian  writes:

> I have committed the first draft of the PG 17 release notes;  you can
> see the results here:
>
>   https://momjian.us/pgsql_docs/release-17.html

My name is listed twice in the "Improve psql tab completion" item.

- ilmari




Re: Parallel CREATE INDEX for GIN indexes

2024-05-09 Thread Andy Fan


Tomas Vondra  writes:

> 3) v20240502-0003-Remove-the-explicit-pg_qsort-in-workers.patch
>
> In 0002 the workers still do an explicit qsort() on the TID list before
> writing the data into the shared tuplesort. But we can do better - the
> workers can do a merge sort too. To help with this, we add the first TID
> to the tuplesort tuple, and sort by that too - it helps the workers to
> process the data in an order that allows simple concatenation instead of
> the full mergesort.
>
> Note: There's a non-obvious issue due to parallel scans always being
> "sync scans", which may lead to very "wide" TID ranges when the scan
> wraps around. More about that later.

This is really amazing.

> 7) v20240502-0007-Detect-wrap-around-in-parallel-callback.patch
>
> There's one more efficiency problem - the parallel scans are required to
> be synchronized, i.e. the scan may start half-way through the table, and
> then wrap around. Which however means the TID list will have a very wide
> range of TID values, essentially the min and max of for the key.
>
> Without 0006 this would cause frequent failures of the index build, with
> the error I already mentioned:
>
>   ERROR: could not split GIN page; all old items didn't fit

I have two questions here, and both of them are general GIN index
questions rather than questions about this patch.

1. What does "wrap around" mean in "the scan may start half-way
through the table, and then wrap around"? Searching for "wrap" in
gin/README turns up nothing.

2. I can't understand the below error.

>   ERROR: could not split GIN page; all old items didn't fit

When the posting list is too long, we have the posting tree strategy, so
in which situation could we get this ERROR?

> issue with efficiency - having such a wide TID list forces the mergesort
> to actually walk the lists, because this wide list overlaps with every
> other list produced by the worker.

If we split the blocks among workers 1 block at a time, we will have a
serious issue like this. If we can do it N blocks at a time, with N
chosen so the blocks roughly fill work_mem (which produces the dedicated
temp file), we can make things much better, can't we?

-- 
Best Regards
Andy Fan





Re: First draft of PG 17 release notes

2024-05-09 Thread Dagfinn Ilmari Mannsåker
Dagfinn Ilmari Mannsåker  writes:

> Bruce Momjian  writes:
>
>> I have committed the first draft of the PG 17 release notes;  you can
>> see the results here:
>>
>>  https://momjian.us/pgsql_docs/release-17.html
>
> My name is listed twice in the "Improve psql tab completion" item.

You can move one of them to "Track DEALLOCATE in pg_stat_statements",
which Michael and I co-authored.

- ilmari




Re: First draft of PG 17 release notes

2024-05-09 Thread jian he
On Thu, May 9, 2024 at 12:04 PM Bruce Momjian  wrote:
> I have committed the first draft of the PG 17 release notes;  you can
> see the results here:
>
> https://momjian.us/pgsql_docs/release-17.html
>

* Add function pg_buffercache_evict() to allow shared buffer eviction
(Palak Chaturvedi, Thomas Munro)
* This is useful for testing.

Should this be put in the section
< E.1.3.11. Additional Modules
?

Then I found out the official release notes don't have these attributes,
so maybe it doesn't matter?



<<
Allow ALTER OPERATOR to set more optimization attributes (Tommy Pavlicek)
This is useful for extensions.
<<
I think the commit title "Add hash support functions and hash opclass
for contrib/ltree" from [1] is more descriptive.
I am not 100% sure of the meaning of "This is useful for extensions."



[1] 
https://git.postgresql.org/cgit/postgresql.git/commit/?id=485f0aa85995340fb62113448c992ee48dc6fff1




Re: First draft of PG 17 release notes

2024-05-09 Thread jian he
> <<
> Allow ALTER OPERATOR to set more optimization attributes (Tommy Pavlicek)
> This is useful for extensions.
> <<

sorry,  I mean
<<
Allow the creation of hash indexes on ltree columns (Tommy Pavlicek)
This also enables hash join and hash aggregation on ltree columns.
<<

better description would be:
<<
Add hash support functions and hash opclass for contrib/ltree (Tommy Pavlicek)
This also enables hash join and hash aggregation on ltree columns.
<<




Re: First draft of PG 17 release notes

2024-05-09 Thread Hans Buschmann
Some findings


1.

>>Remove adminpack contrib extension (Daniel Gustafsson)

>>This was used by non-end-of-life pgAdmin III.

Perhaps you mean now-end-of-life (s/non/now/)

2.
>>All specification of partitioned table access methods (Justin Pryzby, 
Soumyadeep Chakraborty, Michael Paquier)

Perhaps you mean "Allow"; otherwise the meaning is not clear.

3.
>> Add some long options to pg_archivecleanup (Atsushi Torikoshi)
>>The long options are --debug, --dry-run, and /--strip-extension.

The slash should be omitted.

Hans Buschmann


Re: First draft of PG 17 release notes

2024-05-09 Thread jian he
On Thu, May 9, 2024 at 6:53 PM jian he  wrote:
>
> On Thu, May 9, 2024 at 12:04 PM Bruce Momjian  wrote:
> > I have committed the first draft of the PG 17 release notes;  you can
> > see the results here:
> >
> > https://momjian.us/pgsql_docs/release-17.html

< Add columns to pg_stats to report range histogram information (Egor
Rogov, Soumyadeep Chakraborty)
I think this applies to range types and multirange types; "range
histogram information" seems not very clear to me.
So maybe:
< Add columns to pg_stats to report range-type histogram information
(Egor Rogov, Soumyadeep Chakraborty)



Display length and bounds histograms in pg_stats
< Add new COPY option "ON_ERROR ignore" to discard error rows (Damir
Belyalov, Atsushi Torikoshi, Alex Shulgin, Jian He, Jian He, Yugo
Nagata)
duplicate name.




request for database identifier in the startup packet

2024-05-09 Thread Dave Cramer
Greetings,

The JDBC driver is currently keeping a per-connection cache of types in
the driver. We are seeing cases where the number of columns is quite
high. In one case ("Prevent fetchFieldMetaData() from being run when
unnecessary", issue #3241 in pgjdbc/pgjdbc on github.com) there were
2.6 million columns.

If we knew that we were connecting to the same database we could use a
single cache across connections.

I think we would require a server/database identifier in the startup
message.

Dave Cramer


Re: Parallel CREATE INDEX for GIN indexes

2024-05-09 Thread Tomas Vondra
On 5/9/24 12:14, Andy Fan wrote:
> 
> Tomas Vondra  writes:
> 
>> 3) v20240502-0003-Remove-the-explicit-pg_qsort-in-workers.patch
>>
>> In 0002 the workers still do an explicit qsort() on the TID list before
>> writing the data into the shared tuplesort. But we can do better - the
>> workers can do a merge sort too. To help with this, we add the first TID
>> to the tuplesort tuple, and sort by that too - it helps the workers to
>> process the data in an order that allows simple concatenation instead of
>> the full mergesort.
>>
>> Note: There's a non-obvious issue due to parallel scans always being
>> "sync scans", which may lead to very "wide" TID ranges when the scan
>> wraps around. More about that later.
> 
> This is really amazing.
> 
>> 7) v20240502-0007-Detect-wrap-around-in-parallel-callback.patch
>>
>> There's one more efficiency problem - the parallel scans are required to
>> be synchronized, i.e. the scan may start half-way through the table, and
>> then wrap around. Which however means the TID list will have a very wide
>> range of TID values, essentially the min and max of for the key.
>>
>> Without 0006 this would cause frequent failures of the index build, with
>> the error I already mentioned:
>>
>>   ERROR: could not split GIN page; all old items didn't fit
> 
> I have two questions here, and both of them are general GIN index
> questions rather than questions about this patch.
> 
> 1. What does "wrap around" mean in "the scan may start half-way
> through the table, and then wrap around"? Searching for "wrap" in
> gin/README turns up nothing.
> 

The "wrap around" is about the scan used to read data from the table
when building the index. A "sync scan" may start e.g. at TID (1000,0)
and read till the end of the table, and then wraps and returns the
remaining part at the beginning of the table for blocks 0-999.

This means the callback would not see a monotonically increasing
sequence of TIDs.

Which is why the serial build disables sync scans, allowing simply
appending values to the sorted list, and even with regular flushes of
data into the index we can simply append data to the posting lists.
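
The wrap-around behaviour can be sketched minimally (plain Python, not
PostgreSQL code; block numbers stand in for TIDs here):

```python
# Minimal sketch of a synchronized scan: it may start part-way through the
# table, read to the end, then wrap around to the beginning.

def sync_scan_blocks(nblocks, start_block):
    """Yield block numbers in the order a sync scan visits them."""
    for i in range(nblocks):
        yield (start_block + i) % nblocks

blocks = list(sync_scan_blocks(nblocks=8, start_block=5))
print(blocks)  # [5, 6, 7, 0, 1, 2, 3, 4] -- not monotonically increasing
```

So a build callback fed by such a scan cannot assume append-only TID
order, which is why the serial build disables sync scans.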

> 2. I can't understand the below error.
> 
>>   ERROR: could not split GIN page; all old items didn't fit
> 
> When the posting list is too long, we have the posting tree strategy,
> so in which situation could we get this ERROR?
> 

AFAICS the index build relies on the assumption that we only append data
to the TID list on a leaf page, and when the page gets split, the "old"
part will always fit. Which may not be true, if there was a wrap around
and we're adding low TID values to the list on the leaf page.

FWIW the error in dataBeginPlaceToPageLeaf looks like this:

  if (!append || ItemPointerCompare(&maxOldItem, &remaining) >= 0)
elog(ERROR, "could not split GIN page; all old items didn't fit");

It can fail simply because of the !append part.

I'm not sure why dataBeginPlaceToPageLeaf() relies on this assumption,
or with GIN details in general, and I haven't found any explanation. But
AFAIK this is why the serial build disables sync scans.

>> issue with efficiency - having such a wide TID list forces the mergesort
>> to actually walk the lists, because this wide list overlaps with every
>> other list produced by the worker.
> 
> If we split the blocks among workers 1 block at a time, we will have a
> serious issue like this. If we can do it N blocks at a time, with N
> chosen so the blocks roughly fill work_mem (which produces the
> dedicated temp file), we can make things much better, can't we?
> 

I don't understand the question. The blocks are distributed to workers
by the parallel table scan, and it certainly does not do that block by
block. But even if it did, that's not a problem for this code.

The problem is that if the scan wraps around, then one of the TID lists
for a given worker will have the min TID and max TID, so it will overlap
with every other TID list for the same key in that worker. And when the
worker does the merging, this list will force a "full" merge sort for
all TID lists (for that key), which is very expensive.
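
As a toy illustration (my own sketch, with TID lists reduced to
(min, max) ranges and made-up values): disjoint lists can simply be
concatenated, but a wrapped list spanning the whole key range overlaps
every other list.

```python
# Sketch: why a "wrapped" TID list forces a full merge. Each TID list is
# reduced to its (first, last) range.

def overlaps(a, b):
    return a[0] <= b[1] and b[0] <= a[1]

disjoint_lists = [(0, 99), (100, 199), (200, 299)]  # sorted, non-overlapping
wrapped_list = (0, 299)                             # min and max TID for the key

# Disjoint sorted lists can be appended one after another...
assert not any(overlaps(disjoint_lists[i], disjoint_lists[i + 1])
               for i in range(len(disjoint_lists) - 1))

# ...but the wrapped list overlaps all of them, so the merge must actually
# walk and interleave the TID values.
assert all(overlaps(wrapped_list, r) for r in disjoint_lists)
print("wrapped list overlaps all", len(disjoint_lists), "other lists")
```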


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Avoid orphaned objects dependencies, take 3

2024-05-09 Thread Bertrand Drouvot
Hi,

On Tue, Apr 30, 2024 at 08:00:00PM +0300, Alexander Lakhin wrote:
> Hi Bertrand,
> 
> But I've discovered yet another possibility to get a broken dependency.

Thanks for the testing and the report!

> Please try this script:
> res=0
> numclients=20
> for ((i=1;i<=100;i++)); do
> for ((c=1;c<=numclients;c++)); do
>   echo "
> CREATE SCHEMA s_$c;
> CREATE CONVERSION myconv_$c FOR 'LATIN1' TO 'UTF8' FROM iso8859_1_to_utf8;
> ALTER CONVERSION myconv_$c SET SCHEMA s_$c;
>   " | psql >psql1-$c.log 2>&1 &
>   echo "DROP SCHEMA s_$c RESTRICT;" | psql >psql2-$c.log 2>&1 &
> done
> wait
> pg_dump -f db.dump || { echo "on iteration $i"; res=1; break; }
> for ((c=1;c<=numclients;c++)); do
>   echo "DROP SCHEMA s_$c CASCADE;" | psql >psql3-$c.log 2>&1
> done
> done
> psql -c "SELECT * FROM pg_conversion WHERE connamespace NOT IN (SELECT oid 
> FROM pg_namespace);"
> 
> It fails for me (with the v4 patch applied) as follows:
> pg_dump: error: schema with OID 16392 does not exist
> on iteration 1
>   oid  | conname  | connamespace | conowner | conforencoding | contoencoding 
> |  conproc  | condefault
> ---+--+--+--++---+---+
>  16396 | myconv_6 |    16392 |   10 |  8 | 6 
> | iso8859_1_to_utf8 | f
> 

Thanks for sharing the test, I'm able to reproduce the issue with v4.

Oh I see, your test updates an existing dependency. v4 took care of
brand-new dependency creation (recordMultipleDependencies()) but forgot
to take care of changing an existing dependency (which is done in
another code path: changeDependencyFor()).

Please find attached v5 that adds:

- a call to the new depLockAndCheckObject() function in changeDependencyFor().
- a test when altering an existing dependency.

With v5 applied, I don't see the issue anymore.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
>From 6fc24a798210f730ab04833fa58074f142be968e Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot 
Date: Fri, 29 Mar 2024 15:43:26 +
Subject: [PATCH v5] Avoid orphaned objects dependencies

It's currently possible to create orphaned objects dependencies, for example:

Scenario 1:

session 1: begin; drop schema schem;
session 2: create a function in the schema schem
session 1: commit;

With the above, the function created in session 2 would be linked to a non
existing schema.

Scenario 2:

session 1: begin; create a function in the schema schem
session 2: drop schema schem;
session 1: commit;

With the above, the function created in session 1 would be linked to a non
existing schema.

To avoid those scenarios, a new lock (that conflicts with a lock taken by DROP)
has been put in place when the dependencies are being recorded. With this in
place, the drop schema in scenario 2 would be locked.

Also, after the new lock attempt, the patch checks that the object still exists:
with this in place session 2 in scenario 1 would be locked and would report an
error once session 1 commits (that would not be the case should session 1 abort
the transaction).

If the object is dropped before the new lock attempt is triggered then the patch
would also report an error (but with less details).
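
The lock-then-recheck idea can be sketched outside PostgreSQL like this
(plain Python; the catalog dict and lock table are made-up stand-ins,
not the real C implementation):

```python
# Toy sketch of the "lock, then re-check existence" pattern: recording a
# dependency first locks the referenced object with a lock that conflicts
# with DROP, then verifies the object still exists.

import threading

catalog = {"schem": {"oid": 16392}}   # made-up stand-in for a system catalog
locks = {"schem": threading.Lock()}   # made-up stand-in for the lock manager

def record_dependency(referenced):
    with locks[referenced]:           # blocks while a concurrent DROP holds it
        # Re-check after the lock is granted: a DROP that committed while we
        # waited is reported here instead of leaving an orphaned dependency.
        if referenced not in catalog:
            raise LookupError(f"object {referenced!r} was concurrently dropped")
        return catalog[referenced]["oid"]

print(record_dependency("schem"))     # 16392
del catalog["schem"]                  # simulate a concurrent DROP committing
try:
    record_dependency("schem")
except LookupError as e:
    print(e)                          # object 'schem' was concurrently dropped
```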

The patch adds a few tests for some dependency cases (that would currently produce
orphaned objects):

- schema and function (as the above scenarios)
- alter a dependency (function and schema)
- function and arg type
- function and return type
- function and function
- domain and domain
- table and type
- server and foreign data wrapper
---
 src/backend/catalog/dependency.c  |  54 
 src/backend/catalog/objectaddress.c   |  70 ++
 src/backend/catalog/pg_depend.c   |  12 ++
 src/include/catalog/dependency.h  |   1 +
 src/include/catalog/objectaddress.h   |   1 +
 src/test/modules/Makefile |   1 +
 src/test/modules/meson.build  |   1 +
 .../test_dependencies_locks/.gitignore|   3 +
 .../modules/test_dependencies_locks/Makefile  |  14 ++
 .../expected/test_dependencies_locks.out  | 129 ++
 .../test_dependencies_locks/meson.build   |  12 ++
 .../specs/test_dependencies_locks.spec|  89 
 12 files changed, 387 insertions(+)
  22.9% src/backend/catalog/
  43.7% src/test/modules/test_dependencies_locks/expected/
  27.2% src/test/modules/test_dependencies_locks/specs/
   4.5% src/test/modules/test_dependencies_locks/

diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c
index d4b5b2ade1..a49357bbe2 100644
--- a/src/backend/catalog/dependency.c
+++ b/src/backend/catalog/dependency.c
@@ -1517,6 +1517,60 @@ AcquireDeletionLock(const ObjectAddress *object, int flags)
 	}
 }
 
+/*
+ * depLockAndCheckObject
+ *
+ * Lock the object that we are about to record a dependency on.
+ * After it's loc

Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

2024-05-09 Thread Michail Nikolaev
Hello, Matthias and others!

I realized the new horizon was applied only during the validation phase
(once the index is marked as ready).
Now it is applied if the index is not yet marked as valid.

Updated version attached.

--

> I think the best way for this to work would be an index method that
> exclusively stores TIDs, and of which we can quickly determine new
> tuples, too. I was thinking about something like GIN's format, but
> using (generation number, tid) instead of ([colno, colvalue], tid) as
> key data for the internal trees, and would be unlogged (because the
> data wouldn't have to survive a crash). Then we could do something
> like this for the second table scan phase:

Regarding that approach to dealing with validation phase and resetting of
snapshot:

I was thinking about it and realized: once we go for an additional index -
we don't need the second heap scan at all!

We may do it this way:

* create target index, not marked as indisready yet
* create a temporary unlogged index with the same parameters to store tids
(optionally with the indexes columns data, see below), marked as indisready
(but not indisvalid)
* commit them both in a single transaction
* wait for other transaction to know about them and honor in HOT
constraints and new inserts (for temporary index)
* now our temporary index is filled by the tuples inserted to the table
* start building out target index, resetting snapshot every so often (if it
is "safe" index)
* finish target index building phase
* mark target index as indisready
* now, start validation of the index:
* take the reference snapshot
* take a visibility snapshot of the target index, sort it (as it done
currently)
* take a visibility snapshot of our temporary index, sort it
* start merging loop using two synchronized cursors over both
visibility snapshots
* if we encountered tid which is not present in target visibility
snapshot
* insert it to target index
* if a temporary index contains the column's data - we may
even avoid the tuple fetch
* if temporary index is tid-only - we fetch tuple from the
heap, but as plus we are also skipping dead tuples from insertion to the
new index (I think it is better option)
* commit everything, release reference snapshot
* wait for transactions older than reference snapshot (as it done currently)
* mark target index as indisvalid, drop temporary index
* done
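
The merging loop in the validation step above could look roughly like
this (my own sketch; TIDs are modeled as plain integers and both
visibility snapshots are assumed to be sorted):

```python
# Sketch of the proposed validation merge: walk the target index's
# visibility snapshot and the temporary index's snapshot with two
# synchronized cursors, collecting TIDs missing from the target.

def missing_from_target(target_snapshot, temp_snapshot):
    """Both inputs sorted ascending; returns TIDs to insert into the target."""
    to_insert = []
    i = j = 0
    while j < len(temp_snapshot):
        if i < len(target_snapshot) and target_snapshot[i] < temp_snapshot[j]:
            i += 1                              # already indexed, skip
        elif i < len(target_snapshot) and target_snapshot[i] == temp_snapshot[j]:
            i += 1
            j += 1                              # present in both, nothing to do
        else:
            to_insert.append(temp_snapshot[j])  # missing from the target index
            j += 1
    return to_insert

print(missing_from_target([1, 3, 5, 9], [2, 3, 7, 9]))  # [2, 7]
```

Each TID found this way would then either be fetched from the heap or,
if the temporary index also stores the columns' data, inserted directly.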


So, pros:
* just a single heap scan
* snapshot is reset periodically

Cons:
* we need to maintain the additional index during the main building phase
* one more tuplesort

If the temporary index is unlogged and cheap to maintain (just
append-only mechanics), this feels like a perfect tradeoff to me.

This approach will work perfectly with a low number of tuple inserts
during the building phase. And it looks like even in the worst case it
is still better than the current approach.

What do you think? Have I missed something?

Thanks,
Michail.
From 4878cc22c9176e5bf2b7d3d9d8c95cc66c8ac007 Mon Sep 17 00:00:00 2001
From: nkey 
Date: Wed, 8 May 2024 22:31:33 +0200
Subject: [PATCH v4] WIP: fix d9d076222f5b "VACUUM: ignore indexing operations 
 with CONCURRENTLY" which was reverted by e28bb8851969.

The issue was caused by the absence of any snapshot that actually protects the data in the relation required to build the index correctly.

Introduce a new type of visibility horizon to be used for relations with concurrently built indexes (in the case of a "safe" index).

Now `GlobalVisHorizonKindForRel` may dynamically decide which horizon to use based on the data about safe indexes being built concurrently.

To reduce the performance impact, the counter of concurrently built indexes is updated in shared memory.
---
 src/backend/catalog/index.c  |  36 ++
 src/backend/commands/indexcmds.c |  20 +++
 src/backend/storage/ipc/ipci.c   |   2 +
 src/backend/storage/ipc/procarray.c  |  85 -
 src/backend/utils/cache/relcache.c   |  11 ++
 src/bin/pg_amcheck/t/006_concurrently.pl | 155 +++
 src/include/catalog/index.h  |   5 +
 src/include/utils/rel.h  |   1 +
 8 files changed, 309 insertions(+), 6 deletions(-)
 create mode 100644 src/bin/pg_amcheck/t/006_concurrently.pl

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 5a8568c55c..3caa2bab12 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -97,6 +97,11 @@ typedef struct
 	Oid			pendingReindexedIndexes[FLEXIBLE_ARRAY_MEMBER];
 } SerializedReindexState;
 
+typedef struct {
+	pg_atomic_uint32 numSafeConcurrentlyBuiltIndexes;
+} SafeICSharedState;
+static SafeICSharedState *SafeICStateShmem;
+
 /* non-export function prototypes */
 static bool relationHasPrimaryKey(Relation rel);
 static TupleDesc ConstructTupleDescriptor(Relation heapRelation,
@@ -176,6 +181,37 @@ relationHasPrimaryKey(Relation rel)
 	return result;
 }
 
+
+void 

Re: First draft of PG 17 release notes

2024-05-09 Thread Bruce Momjian
On Thu, May  9, 2024 at 04:44:47PM +1200, David Rowley wrote:
> On Thu, 9 May 2024 at 16:04, Bruce Momjian  wrote:
> > I welcome feedback.  For some reason it was an easier job than usual.
> 
> Thanks for working on that.
> 
> > +2023-11-02 [cac169d68] Increase DEFAULT_FDW_TUPLE_COST from 0.01 to 0.2
> 
> > +Double the default foreign data wrapper tuple cost (David Rowley, Umair 
> > Shahid)
> 
> That's 20x rather than 2x.

Oops, changed to:

Increase the default foreign data wrapper tuple cost (David
Rowley, Umair Shahid)

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: First draft of PG 17 release notes

2024-05-09 Thread Bruce Momjian
On Thu, May  9, 2024 at 09:47:34AM +0500, Muhammad Ikram wrote:
> Hi Bruce,
> 
> A minor formatting issue in the start below. Bullet is not required here.
> 
> 
> E.1.1. Overview  
> 
> PostgreSQL 17 contains many new features and enhancements, including:
> 
>   • 
> 
> The above items and other new features of PostgreSQL 17 are explained in more
> detail in the sections below.

That is just a place-holder.  I changed the bullet text to be:

TO BE COMPLETED LATER

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: First draft of PG 17 release notes

2024-05-09 Thread Bruce Momjian
On Thu, May  9, 2024 at 04:52:14PM +1200, David Rowley wrote:
> On Thu, 9 May 2024 at 16:47, Muhammad Ikram  wrote:
> > A minor formatting issue in the start below. Bullet is not required here.
> 
> This is a placeholder for the highlight features of v17 will go.
> Bruce tends not to decide what those are all by himself.

Yes, I already have so much of my opinion in the release notes that I
prefer others to make that list, and to make the Acknowledgments list
at the bottom.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Set appropriate processing mode for auxiliary processes.

2024-05-09 Thread Xing Guo
Hi hackers,

After several refactoring iterations, auxiliary processes are no
longer initialized from the bootstrapper. I think using the
InitProcessing mode for initializing auxiliary processes is more
appropriate.

Best Regards,
Xing
From 50fb1ccc7259e0ce8512dee236cbfe11635c15fc Mon Sep 17 00:00:00 2001
From: Xing Guo 
Date: Thu, 9 May 2024 20:57:48 +0800
Subject: [PATCH v1] Set appropriate processing mode for auxiliary processes.

After several refactoring iterations, auxiliary processes are no longer
initialized from the bootstrapper. Using the InitProcessing mode for
initializing auxiliary processes is more appropriate.
---
 src/backend/postmaster/auxprocess.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/postmaster/auxprocess.c b/src/backend/postmaster/auxprocess.c
index 78f4263eeb..ca4bfa2ed5 100644
--- a/src/backend/postmaster/auxprocess.c
+++ b/src/backend/postmaster/auxprocess.c
@@ -54,7 +54,7 @@ AuxiliaryProcessMainCommon(void)
 
 	init_ps_display(NULL);
 
-	SetProcessingMode(BootstrapProcessing);
+	SetProcessingMode(InitProcessing);
 	IgnoreSystemIndexes = true;
 
 	/*
-- 
2.45.0



Re: Parallel CREATE INDEX for GIN indexes

2024-05-09 Thread Tomas Vondra



On 5/9/24 11:44, Andy Fan wrote:
> 
> Hello Tomas,
> 
>>>> 2) v20240502-0002-Use-mergesort-in-the-leader-process.patch
>>>>
>>>> The approach implemented by 0001 works, but there's a little bit of
>>>> issue - if there are many distinct keys (e.g. for trigrams that can
>>>> happen very easily), the workers will hit the memory limit with only
>>>> very short TID lists for most keys. For serial build that means merging
>>>> the data into a lot of random places, and in parallel build it means the
>>>> leader will have to merge a lot of tiny lists from many sorted rows.
>>>>
>>>> Which can be quite annoying and expensive, because the leader does so
>>>> using qsort() in the serial part. It'd be better to ensure most of the
>>>> sorting happens in the workers, and the leader can do a mergesort. But
>>>> the mergesort must not happen too often - merging many small lists is
>>>> not cheaper than a single qsort (especially when the lists overlap).
>>>>
>>>> So this patch changes the workers to process the data in two phases. The
>>>> first works as before, but the data is flushed into a local tuplesort.
>>>> And then each workers sorts the results it produced, and combines them
>>>> into results with much larger TID lists, and those results are written
>>>> to the shared tuplesort. So the leader only gets very few lists to
>>>> combine for a given key - usually just one list per worker.
>>>
>>> Hmm, I was hoping we could implement the merging inside the tuplesort
>>> itself during its own flush phase, as it could save significantly on
>>> IO, and could help other users of tuplesort with deduplication, too.
>>>
>>
>> Would that happen in the worker or leader process? Because my goal was
>> to do the expensive part in the worker, because that's what helps with
>> the parallelization.
> 
> I guess both of you are talking about the worker process; here is what
> I have in mind:
> 
> *btbuild* also lets the WORKER dump the tuples into the Sharedsort
> struct and lets the LEADER merge them directly. I think the aim of that
> design is that it can potentially save a merge run. In the current
> patch, the worker dumps to a local tuplesort and merges its runs, and
> then the leader runs the merges again. I admit the goal of this patch
> is reasonable, but I feel we need to adopt this approach conditionally
> somehow, and if we find the way, we can apply it to btbuild as well. 
> 

I'm a bit confused about what you're proposing here, or how is that
related to what this patch is doing and/or to the what Matthias
mentioned in his e-mail from last week.

Let me explain the relevant part of the patch, and how I understand the
improvement suggested by Matthias. The patch does the work in three phases:


1) Worker gets data from table, split that into index items and add
those into a "private" tuplesort, and finally sorts that. So a worker
may see a key many times, with different TIDs, so the tuplesort may
contain many items for the same key, with distinct TID lists:

   key1: 1, 2, 3, 4
   key1: 5, 6, 7
   key1: 8, 9, 10
   key2: 1, 2, 3
   ...


2) Worker reads the sorted data, and combines TIDs for the same key into
larger TID lists, depending on work_mem etc. and writes the result into
a shared tuplesort. So the worker may write this:

   key1: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
   key2: 1, 2, 3


3) Leader reads this, combines TID lists from all workers (using a
higher memory limit, probably), and writes the result into the index.


The step (2) is optional - it would work without it. But it helps, as it
moves a potentially expensive sort into the workers (and thus the
parallel part of the build), and it also makes it cheaper, because in a
single worker the lists do not overlap and thus can be simply appended.
Which in the leader is not the case, forcing an expensive mergesort.
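
Step (2) can be sketched like this (plain Python, not the patch's C
code; keys and TIDs are made-up):

```python
# Sketch of step (2): the worker reads its sorted (key, tid_list) stream;
# within one worker the lists for a key are disjoint and ordered, so
# combining them is plain concatenation, not a mergesort.

from itertools import groupby

sorted_stream = [            # what the worker's private tuplesort emits
    ("key1", [1, 2, 3, 4]),
    ("key1", [5, 6, 7]),
    ("key1", [8, 9, 10]),
    ("key2", [1, 2, 3]),
]

combined = []
for key, group in groupby(sorted_stream, key=lambda t: t[0]):
    tids = []
    for _, tid_list in group:
        tids.extend(tid_list)    # append-only, no interleaving needed
    combined.append((key, tids))

print(combined)
# [('key1', [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]), ('key2', [1, 2, 3])]
```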

The trouble with (2) is that it "just copies" data from one tuplesort
into another, increasing the disk space requirements. In an extreme
case, when nothing can be combined, it pretty much doubles the amount of
disk space, and makes the build longer.

What I think Matthias is suggesting, is that this "TID list merging"
could be done directly as part of the tuplesort in step (1). So instead
of just moving the "sort tuples" from the appropriate runs, it could
also do an optional step of combining the tuples and writing this
combined tuple into the tuplesort result (for that worker).

Matthias also mentioned this might be useful when building btree indexes
with key deduplication.

AFAICS this might work, although it probably requires for the "combined"
tuple to be smaller than the sum of the combined tuples (in order to fit
into the space). But at least in the GIN build in the workers this is
likely true, because the TID lists do not overlap (and thus not hurting
the compressibility).

That being said, I still see this more as an optimization than something
required for the patch, and I don't think I'll have time to work on this
anytime soon. The patch is not extremely complex, b

Re: First draft of PG 17 release notes

2024-05-09 Thread Bruce Momjian
On Thu, May  9, 2024 at 04:53:38AM +, Bertrand Drouvot wrote:
> Hi,
> 
> On Thu, May 09, 2024 at 12:03:50AM -0400, Bruce Momjian wrote:
> > I have committed the first draft of the PG 17 release notes;  you can
> > see the results here:
> > 
> > https://momjian.us/pgsql_docs/release-17.html
> 
> Thanks for working on that!
>  
> > I welcome feedback.
> 
> > Add system view pg_wait_events that reports wait event types (Michael 
> > Paquier) 
> 
> Michael is the committer for 1e68e43d3f, the author is me.

Wow, thank you for finding that.  The commit message is very clear so I
don't know how I made that mistake.  Fixed.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re:

2024-05-09 Thread Tom Lane
Rajan Pandey  writes:
>> test=# ALTER SYSTEM SET shared_preload_libraries TO 'pg_tle';
>> ALTER SYSTEM
>> test=# SELECT pg_reload_conf();
>> pg_reload_conf
>> 
>> t
>> (1 row)

Changing shared_preload_libraries requires a postmaster restart,
not just config reload.  The postmaster log would have told you
that, but pg_reload_conf() can't really see the effects of its
signal.

regards, tom lane




Re: First draft of PG 17 release notes

2024-05-09 Thread Bruce Momjian
On Thu, May  9, 2024 at 02:17:12PM +0900, Masahiko Sawada wrote:
> Hi,
> 
> On Thu, May 9, 2024 at 1:03 PM Bruce Momjian  wrote:
> >
> > I have committed the first draft of the PG 17 release notes;  you can
> > see the results here:
> >
> > https://momjian.us/pgsql_docs/release-17.html
> 
> Thank you for working on that!
> 
> I'd like to mention some of my works. I think we can add the vacuum
> performance improvements by the following commits:
> 
> - Add template for adaptive radix tree (ee1b30f1)
> - Add TIDStore, to store sets of TIDs (ItemPointerData) efficiently 
> (30e144287)
> - Use TidStore for dead tuple TIDs storage during lazy vacuum (667e65aac)

Okay, I reworded the item, added authors, and added the commits:





Allow vacuum to more efficiently remove and freeze tuples (John Naylor, 
Masahiko Sawada, Melanie Plageman)



> Also, please consider the following item:
> 
> - Improve eviction algorithm in ReorderBuffer using max-heap for many
> subtransactions (5bec1d6bc)

I looked at that item and I don't have a generic "make logical
replication apply faster" item to merge it into, and many
subtransactions seemed like enough of an edge-case that I didn't think
mentioning it makes sense.  Can you see a good place to add it?

> Finally, should we mention the following commit in the release note?
> It's not a user-visible change but added a new regression test module.
> 
> - Add tests for XID wraparound (e255b646a)

I don't normally add testing infrastructure changes unless they are
major.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: request for database identifier in the startup packet

2024-05-09 Thread David G. Johnston
On Thursday, May 9, 2024, Dave Cramer  wrote:

> Greetings,
>
> The JDBC driver is currently keeping a per connection cache of types in
> the driver. We are seeing cases where the number of columns is quite high.
> the driver. We are seeing cases where the number of columns is quite high.
> In one case ("Prevent fetchFieldMetaData() from being run when
> unnecessary", pgjdbc/pgjdbc issue #3241 on github.com) there were
> 2.6 million columns.
>
> If we knew that we were connecting to the same database we could use a
> single cache across connections.
>
> I think we would require a server/database identifier in the startup
> message.
>

I feel like pgbouncer ruins this plan.

But maybe you can construct a lookup key from some combination of data
provided by these functions:
https://www.postgresql.org/docs/current/functions-info.html#FUNCTIONS-INFO-SESSION
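
A lookup key along those lines could be assembled client-side, for
instance from pg_control_system()'s system_identifier plus
current_database(). A rough sketch (the identifier value and key format
are made up for illustration, and, as noted above, a pooler may still
break the uniqueness assumption):

```python
import hashlib

def type_cache_key(system_identifier: int, database: str) -> str:
    """Build a cross-connection cache key for a (cluster, database) pair.

    system_identifier would come from pg_control_system() and database
    from current_database(); both are illustrative inputs here, not a
    guarantee that the combination is unique behind a pooler.
    """
    raw = f"{system_identifier}/{database}"
    return hashlib.sha256(raw.encode()).hexdigest()

# Two connections reporting the same identifiers can share one type cache.
key_a = type_cache_key(7368495921185618471, "appdb")
key_b = type_cache_key(7368495921185618471, "appdb")
key_c = type_cache_key(7368495921185618471, "otherdb")
print(key_a == key_b, key_a == key_c)  # True False
```

Connections that compute the same key could then share a single driver-side
type cache instead of one per connection.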

David J.


Re: Set appropriate processing mode for auxiliary processes.

2024-05-09 Thread Heikki Linnakangas

On 09/05/2024 16:12, Xing Guo wrote:

Hi hackers,

After several refactoring iterations, auxiliary processes are no
longer initialized from the bootstrapper. I think using the
InitProcessing mode for initializing auxiliary processes is more
appropriate.


At first I was sure this was introduced by my refactorings in v17, but 
in fact it's been like this forever. I agree that InitProcessing makes 
much more sense. The ProcessingMode variable is initialized to 
InitProcessing, so I think we can simply remove that line from 
AuxiliaryProcessMainCommon(). There are existing 
"SetProcessingMode(InitProcessing)" calls in other Main functions too 
(AutoVacLauncherMain, BackgroundWorkerMain, etc.), and I think those can 
also be removed.


--
Heikki Linnakangas
Neon (https://neon.tech)





Re: First draft of PG 17 release notes

2024-05-09 Thread Bruce Momjian
On Thu, May  9, 2024 at 02:37:57PM +0800, Richard Guo wrote:
> 
> On Thu, May 9, 2024 at 12:04 PM Bruce Momjian  wrote:
> 
> I have committed the first draft of the PG 17 release notes;  you can
> see the results here:
> 
>         https://momjian.us/pgsql_docs/release-17.html
> 
> 
> Thanks for working on that.
> 
> For this item:
>  
> 
> Allow the optimizer to improve CTE plans by using the sort order of
> columns referenced in earlier CTE clauses (Jian Guo)
> 
> 
> I think you mean a65724dfa.  The author should be 'Richard Guo'.

Wow, the CTE item above it was done by Jian Guo.  I probably copied the
text from the line above and modified the description, but assumed the
author's name was the same, and it was not.  Fixed.

> And I'm wondering if it is more accurate to state it as "Allow the
> optimizer to improve plans for the outer query by leveraging the sort
> order of a CTE's output."
>
> I think maybe a similar revision can be applied to the item just above
> this one.

Okay, I went with this text:





Allow the optimizer to improve CTE plans by considering the statistics 
of columns referenced in earlier row output clauses (Jian Guo, Tom Lane)







Allow the optimizer to improve CTE plans by considering the sort order 
of columns referenced in earlier row output clauses (Richard Guo)



I did not use "leveraging" because I am concerned non-native English
speakers might find the term confusing.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: First draft of PG 17 release notes

2024-05-09 Thread Bruce Momjian
On Thu, May  9, 2024 at 12:18:44PM +0300, Aleksander Alekseev wrote:
> Hi,
> 
> > I have committed the first draft of the PG 17 release notes;  you can
> > see the results here:
> >
> > https://momjian.us/pgsql_docs/release-17.html
> >
> > It will be improved until the final release.  The item count is 188,
> > which is similar to recent releases:
> 
> Thanks for working on this.
> 
> I believe the part of the 64-bit XIDs patchset that was delivered in
> PG17 is worth highlighting in "E.1.3.10. Source Code" section:
> 
> 4ed8f0913bfd
> 2cdf131c46e6
> 5a1dfde8334b
> a60b8a58f435
> 
> All this can probably be summarized as one bullet "Index SLRUs by
> 64-bit integers rather than by 32-bit ones" where the authors are:
> Maxim Orlov, Aleksander Alekseev, Alexander Korotkov, Teodor Sigaev,
> Nikita Glukhov, Pavel Borisov, Yura Sokolov.

Wow, I try to only list source code items that have some user-facing
impact, and I don't think these do.  I do realize how important they are
though.  This gets into the balance of mentioning items _users_ need to
know about, vs. important improvements that _we_ know about.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Parallel CREATE INDEX for GIN indexes

2024-05-09 Thread Tomas Vondra
On 5/2/24 20:22, Tomas Vondra wrote:
>> 
>>> For some of the opclasses it can regress (like the jsonb_path_ops). I
>>> don't think that's a major issue. Or more precisely, I'm not surprised
>>> by it. It'd be nice to be able to disable the parallel builds in these
>>> cases somehow, but I haven't thought about that.
>>
>> Do you know why it regresses?
>>
> 
> No, but one thing that stands out is that the index is much smaller than
> the other columns/opclasses, and the compression does not save much
> (only about 5% for both phases). So I assume it's the overhead of
> writing and reading a bunch of GB of data without really gaining
> much from doing that.
> 

I finally got to look into this regression, but I think I must have done
something wrong before because I can't reproduce it. This is the timings
I get now, if I rerun the benchmark:

 workers   trgm   tsvector   jsonb   jsonb (hash)
 ------------------------------------------------
    0      1225      404      104        56
    1       772      180       57        60
    2       549      143       47        52
    3       426      127       43        50
    4       364      116       40        48
    5       323      111       38        46
    6       292      111       37        45

and the speedup, relative to serial build:


 workers   trgm   tsvector   jsonb   jsonb (hash)
 ------------------------------------------------
    1       63%      45%       54%      108%
    2       45%      35%       45%       94%
    3       35%      31%       41%       89%
    4       30%      29%       38%       86%
    5       26%      28%       37%       83%
    6       24%      28%       35%       81%

So there's a small regression for the jsonb_path_ops opclass, but only
with one worker. After that, it gets a bit faster than serial build.
While not a great speedup, it's far better than the earlier results that
showed maybe 40% regression.
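
For what it's worth, the speedup table is just each parallel timing
divided by the serial one. A quick sketch of that calculation, with the
timings hard-coded from the table above (the occasional ±1% difference
from the quoted percentages suggests the underlying measurements had more
precision than shown):

```python
timings = {  # per worker count 0..6, units as in the table above
    "trgm":         [1225, 772, 549, 426, 364, 323, 292],
    "tsvector":     [404, 180, 143, 127, 116, 111, 111],
    "jsonb":        [104, 57, 47, 43, 40, 38, 37],
    "jsonb (hash)": [56, 60, 52, 50, 48, 46, 45],
}

def relative_time(series):
    """Percentage of serial build time for 1..N workers (lower is better)."""
    serial = series[0]
    return [round(100 * t / serial) for t in series[1:]]

for opclass, series in timings.items():
    print(opclass, relative_time(series))
```

The jsonb_path_ops ("jsonb (hash)") column is the only one above 100% at
one worker, matching the small regression described above.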

I don't know what I did wrong before - maybe I had a build with an extra
debug info or something like that? No idea why would that affect only
one of the opclasses. But this time I made doubly sure the results are
correct etc.

Anyway, I'm fairly happy with these results. I don't think it's
surprising there are cases where parallel build does not help much.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: First draft of PG 17 release notes

2024-05-09 Thread Bruce Momjian
On Thu, May  9, 2024 at 06:00:24PM +0800, jian he wrote:
> On Thu, May 9, 2024 at 12:04 PM Bruce Momjian  wrote:
> >
> > I have committed the first draft of the PG 17 release notes;  you can
> > see the results here:
> >
> > https://momjian.us/pgsql_docs/release-17.html
> >
> 
> another potential incompatibilities issue:
> ALTER TABLE DROP PRIMARY KEY
> 
> see:
> https://www.postgresql.org/message-id/202404181849.6frtmajobe27%40alvherre.pgsql

I see it now, and I see Alvaro Herrera saying:


https://www.postgresql.org/message-id/202404181849.6frtmajobe27%40alvherre.pgsql

> I wonder is there any incompatibility issue, or do we need to say 
something
> about the new behavior when dropping a key column?

--> Umm, yeah, maybe we should document it in ALTER TABLE DROP PRIMARY KEY
--> and in the release notes to note the different behavior.

However, I don't see it mentioned as a release note item in the commit
message or mentioned in our docs. I suppose the release note text would
be:

Removing a PRIMARY KEY will remove the NOT NULL column specification

Previously the NOT NULL specification would be retained.
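
To make the cross-version comparison concrete, here is a small sketch of
the two behaviors as described in that thread. This models the catalog
state in plain Python rather than SQL, and the "derived" flag is my own
illustrative rendering of the idea that the NOT NULL came from the
primary key rather than being declared separately:

```python
def drop_primary_key(columns, pk_cols, retain_not_null):
    """columns: dict of name -> {"not_null": bool, "derived": bool}.

    retain_not_null=True models the old behavior (NOT NULL kept after
    DROP PRIMARY KEY); False models the new behavior, where a NOT NULL
    that existed only because of the PK goes away with it.
    """
    for name in pk_cols:
        col = columns[name]
        if not retain_not_null and col["derived"]:
            col["not_null"] = False
        col["derived"] = False
    return columns

old = drop_primary_key({"id": {"not_null": True, "derived": True}},
                       ["id"], retain_not_null=True)
new = drop_primary_key({"id": {"not_null": True, "derived": True}},
                       ["id"], retain_not_null=False)
print(old["id"]["not_null"], new["id"]["not_null"])  # True False
```

A NOT NULL that the user declared explicitly (derived=False) survives in
both models; only the PK-derived one differs between releases.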

Do we have agreement that we want this release note item?

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: First draft of PG 17 release notes

2024-05-09 Thread Bruce Momjian
On Thu, May  9, 2024 at 11:22:06AM +0100, Dagfinn Ilmari Mannsåker wrote:
> Bruce Momjian  writes:
> 
> > I have committed the first draft of the PG 17 release notes;  you can
> > see the results here:
> >
> > https://momjian.us/pgsql_docs/release-17.html
> 
> My name is listed twice in the "Improve psql tab completion" item.

You did such a great job I wanted to list you twice.  :-)  Actually, the
author list was so long I just didn't notice, fixed.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: First draft of PG 17 release notes

2024-05-09 Thread Bruce Momjian
On Thu, May  9, 2024 at 11:31:17AM +0100, Dagfinn Ilmari Mannsåker wrote:
> Dagfinn Ilmari Mannsåker  writes:
> 
> > Bruce Momjian  writes:
> >
> >> I have committed the first draft of the PG 17 release notes;  you can
> >> see the results here:
> >>
> >>https://momjian.us/pgsql_docs/release-17.html
> >
> > My name is listed twice in the "Improve psql tab completion" item.
> 
> You can move one of them to "Track DEALLOCATE in pg_stat_statements",
> which Michael and I co-authored.

Yep, also my mistake, fixed.  My apologies.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Set appropriate processing mode for auxiliary processes.

2024-05-09 Thread Xing Guo
On Thu, May 9, 2024 at 10:13 PM Heikki Linnakangas  wrote:
>
> On 09/05/2024 16:12, Xing Guo wrote:
> > Hi hackers,
> >
> > After several refactoring iterations, auxiliary processes are no
> > longer initialized from the bootstrapper. I think using the
> > InitProcessing mode for initializing auxiliary processes is more
> > appropriate.
>
> At first I was sure this was introduced by my refactorings in v17, but
> in fact it's been like this forever. I agree that InitProcessing makes
> much more sense. The ProcessingMode variable is initialized to
> InitProcessing, so I think we can simply remove that line from
> AuxiliaryProcessMainCommon(). There are existing
> "SetProcessingMode(InitProcessing)" calls in other Main functions too
> (AutoVacLauncherMain, BackgroundWorkerMain, etc.), and I think those can
> also be removed.

Good catch! I agree with you.

Best Regards,
Xing.
From 210e97c88fd49853d2a6304763ef7e8ad65e1b79 Mon Sep 17 00:00:00 2001
From: Xing Guo 
Date: Thu, 9 May 2024 20:57:48 +0800
Subject: [PATCH v2] Remove redundant SetProcessingMode(InitProcessing) calls.

After several refactoring iterations, auxiliary processes are no longer
initialized from the bootstrapper. Using the InitProcessing mode for
initializing auxiliary processes is more appropriate. Since the global
variable Mode is initialized to InitProcessing, we can remove redundant
calls of SetProcessingMode(InitProcessing).
---
 src/backend/postmaster/autovacuum.c| 4 
 src/backend/postmaster/auxprocess.c| 1 -
 src/backend/postmaster/bgworker.c  | 2 --
 src/backend/replication/logical/slotsync.c | 2 --
 src/backend/tcop/postgres.c| 2 --
 5 files changed, 11 deletions(-)

diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 9a925a10cd..5a1cfabf19 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -380,8 +380,6 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
 	if (PostAuthDelay)
  		pg_usleep(PostAuthDelay * 1000000L);
 
-	SetProcessingMode(InitProcessing);
-
 	/*
 	 * Set up signal handlers.  We operate on databases much like a regular
 	 * backend, so we use the same signal handling.  See equivalent code in
@@ -1373,8 +1371,6 @@ AutoVacWorkerMain(char *startup_data, size_t startup_data_len)
 	MyBackendType = B_AUTOVAC_WORKER;
 	init_ps_display(NULL);
 
-	SetProcessingMode(InitProcessing);
-
 	/*
 	 * Set up signal handlers.  We operate on databases much like a regular
 	 * backend, so we use the same signal handling.  See equivalent code in
diff --git a/src/backend/postmaster/auxprocess.c b/src/backend/postmaster/auxprocess.c
index 78f4263eeb..0faf7df0b6 100644
--- a/src/backend/postmaster/auxprocess.c
+++ b/src/backend/postmaster/auxprocess.c
@@ -54,7 +54,6 @@ AuxiliaryProcessMainCommon(void)
 
 	init_ps_display(NULL);
 
-	SetProcessingMode(BootstrapProcessing);
 	IgnoreSystemIndexes = true;
 
 	/*
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index cf64a4beb2..5bbe02bbc3 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -746,8 +746,6 @@ BackgroundWorkerMain(char *startup_data, size_t startup_data_len)
 	MyBackendType = B_BG_WORKER;
 	init_ps_display(worker->bgw_name);
 
-	SetProcessingMode(InitProcessing);
-
 	/* Apply PostAuthDelay */
 	if (PostAuthDelay > 0)
  		pg_usleep(PostAuthDelay * 1000000L);
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index f1f44d41ef..72857694a5 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -1342,8 +1342,6 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
 
 	init_ps_display(NULL);
 
-	SetProcessingMode(InitProcessing);
-
 	/*
 	 * Create a per-backend PGPROC struct in shared memory.  We must do this
 	 * before we access any shared memory.
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 2dff28afce..4b2de1dc70 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4161,8 +4161,6 @@ PostgresMain(const char *dbname, const char *username)
 	Assert(dbname != NULL);
 	Assert(username != NULL);
 
-	SetProcessingMode(InitProcessing);
-
 	/*
 	 * Set up signal handlers.  (InitPostmasterChild or InitStandaloneProcess
 	 * has already set up BlockSig and made that the active signal mask.)
-- 
2.45.0



Re: New GUC autovacuum_max_threshold ?

2024-05-09 Thread Robert Haas
On Wed, May 8, 2024 at 1:30 PM Imseih (AWS), Sami  wrote:
> There is also an alternative of making this GUC -1 by default, which
> means it has no effect and any value larger will be used in the threshold
> calculation of autovacuunm. A user will have to be careful not to set it too 
> low,
> but that is going to be a concern either way.

Personally, I'd much rather ship it with a reasonable default. If we
ship it disabled, most people won't end up using it at all, which
sucks, and those who do will be more likely to set it to a ridiculous
value, which also sucks. If we ship it with a value that has a good
chance of being within 2x or 3x of the optimal value on a given user's
system, then a lot more people will benefit from it.

> Also, I think coming up with a good default will be challenging,
> and perhaps this idea is a good middle ground.

Maybe. I freely admit that I don't know exactly what the optimal value
is here, and I think there is some experimentation that is needed to
try to get some better intuition there. At what table size does the
current system actually result in too little vacuuming, and how can we
demonstrate that? Does the point at which that happens depend more on
the table size in gigabytes, or more on the number of rows? These are
things that someone can research and about which they can present
data.

As I see it, a lot of the lack of agreement up until now is people
just not understanding the math. Since I think I've got the right idea
about the math, I attribute this to other people being confused about
what is going to happen and would tend to phrase it as: some people
don't understand how catastrophically bad it will be if you set this
value too low. However, another possibility is that it is I who am
misunderstanding the math. In that case, the correct phrasing is
probably something like: Robert wants a completely useless and
worthless value for this parameter that will be of no help to anyone.
Regardless, at least some of us are confused. If we can reduce that
confusion, then people's ideas about what values for this parameter
might be suitable should start to come closer together.
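
To make the math concrete, here is a rough model of the proposal as I
read it: cap the usual base + scale_factor * reltuples trigger at the new
GUC's value. The base and scale-factor defaults below mirror
autovacuum_vacuum_threshold (50) and autovacuum_vacuum_scale_factor
(0.2); the cap itself is a hypothetical rendering of the patch under
discussion, not its actual code:

```python
def vacuum_threshold(reltuples, base=50, scale_factor=0.2,
                     max_threshold=None):
    """Dead-tuple count that triggers autovacuum for a table.

    max_threshold=None stands in for the proposed "disabled" (-1)
    setting; any other value caps the scale-factor term's growth.
    """
    threshold = base + scale_factor * reltuples
    if max_threshold is not None:
        threshold = min(threshold, max_threshold)
    return threshold

# Uncapped, a billion-row table waits for ~200M dead tuples before
# autovacuum triggers; with a 500k cap it is vacuumed ~400x more often.
print(vacuum_threshold(1_000_000_000))                         # 200000050.0
print(vacuum_threshold(1_000_000_000, max_threshold=500_000))  # 500000
```

This is where "set it too low" bites: a cap of, say, 1000 would trigger
vacuuming of the same billion-row table roughly 200,000 times more often
than today.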

I tend to feel like the disagreement here is not really about whether
it's a good idea to increase the frequency of vacuuming on large
tables by three orders of magnitude compared to what we do now, but
rather about whether that's actually going to happen.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Revert: Remove useless self-joins *and* -DREALLOCATE_BITMAPSETS make server crash, regress test fail.

2024-05-09 Thread Tom Lane
Richard Guo  writes:
> I also think it seems better to do bms_copy in separate steps, not only
> because this keeps consistent with the existing code in
> remove_rel_from_restrictinfo, but also because we need to do
> bms_del_member twice for each lefthand/righthand relid set.

Yeah.  Of course, we don't need a bms_copy() in the second one,
but that'd just add even more asymmetry and chance for confusion.

> Speaking of consistency, do you think it would improve the code's
> readability if we rearrange the code in remove_rel_from_query so that
> the modifications of the same relid set are grouped together, just like
> what we do in remove_rel_from_restrictinfo?

I left it alone, just because it didn't seem worth cluttering "git
blame" here.

regards, tom lane




Re: First draft of PG 17 release notes

2024-05-09 Thread Bruce Momjian
On Thu, May  9, 2024 at 06:53:30PM +0800, jian he wrote:
> On Thu, May 9, 2024 at 12:04 PM Bruce Momjian  wrote:
> > I have committed the first draft of the PG 17 release notes;  you can
> > see the results here:
> >
> > https://momjian.us/pgsql_docs/release-17.html
> >
> 
> * Add function pg_buffercache_evict() to allow shared buffer eviction
> (Palak Chaturvedi, Thomas Munro)
> * This is useful for testing.
> 
> this should put it on the section
> < E.1.3.11. Additional Modules
> ?

Oh, it is in the pg_buffercache module --- I should have realized that
from the name, fixed.

> Then I found out official release notes don't have  attributes,
> so it doesn't matter?

Uh, what are sections?  Did previous release notes have it?

> I think this commit title "Add hash support functions and hash opclass
> for contrib/ltree."
>  from [1] is more descriptive.

Uh, I don't think people know what hash support functions are, but they
know what hash indexes are, and maybe hash joins and hash aggregates. 
Why do you consider the commit text better?

> i am not 100% sure of the meaning of "This is useful for extensions."

The commit says:

commit 2b5154beab7
Author: Tom Lane 
Date:   Fri Oct 20 12:28:38 2023 -0400

Extend ALTER OPERATOR to allow setting more optimization attributes.

Allow the COMMUTATOR, NEGATOR, MERGES, and HASHES attributes to be 
set
by ALTER OPERATOR.  However, we don't allow COMMUTATOR/NEGATOR to be
changed once set, nor allow the MERGES/HASHES flags to be unset once
set.  Changes like that might invalidate plans already made, and
dealing with the consequences seems like more trouble than it's 
worth.
--> The main use-case we foresee for this is to allow addition of missed
--> properties in extension update scripts, such as extending an 
existing
--> operator to support hashing.  So only transitions from not-set to 
set
states seem very useful.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: First draft of PG 17 release notes

2024-05-09 Thread Bruce Momjian
On Thu, May  9, 2024 at 06:57:01PM +0800, jian he wrote:
> > <<
> > Allow ALTER OPERATOR to set more optimization attributes (Tommy Pavlicek)
> > This is useful for extensions.
> > <<
> 
> sorry,  I mean
> <<
> Allow the creation of hash indexes on ltree columns (Tommy Pavlicek)
> This also enables hash join and hash aggregation on ltree columns.
> <<
> 
> better description would be:
> <<
> Add hash support functions and hash opclass for contrib/ltree (Tommy Pavlicek)
> This also enables hash join and hash aggregation on ltree columns.
> <<

Yes, please see my previous email where I am asking why being more
specific is worse.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Set appropriate processing mode for auxiliary processes.

2024-05-09 Thread Xing Guo
Sorry, forget to add an assertion to guard our codes in my previous patch.
From 4f2d70fb27e3ff23b8572c5e679c791daa1f6665 Mon Sep 17 00:00:00 2001
From: Xing Guo 
Date: Thu, 9 May 2024 20:57:48 +0800
Subject: [PATCH v3] Remove redundant SetProcessingMode(InitProcessing) calls.

After several refactoring iterations, auxiliary processes are no longer
initialized from the bootstrapper. Using the InitProcessing mode for
initializing auxiliary processes is more appropriate. Since the global
variable Mode is initialized to InitProcessing, we can remove redundant
calls of SetProcessingMode(InitProcessing).
---
 src/backend/postmaster/autovacuum.c| 4 ++--
 src/backend/postmaster/auxprocess.c| 3 ++-
 src/backend/postmaster/bgworker.c  | 2 +-
 src/backend/replication/logical/slotsync.c | 2 +-
 src/backend/tcop/postgres.c| 2 +-
 5 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 9a925a10cd..928754b51c 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -380,7 +380,7 @@ AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
 	if (PostAuthDelay)
  		pg_usleep(PostAuthDelay * 1000000L);
 
-	SetProcessingMode(InitProcessing);
+	Assert(GetProcessingMode() == InitProcessing);
 
 	/*
 	 * Set up signal handlers.  We operate on databases much like a regular
@@ -1373,7 +1373,7 @@ AutoVacWorkerMain(char *startup_data, size_t startup_data_len)
 	MyBackendType = B_AUTOVAC_WORKER;
 	init_ps_display(NULL);
 
-	SetProcessingMode(InitProcessing);
+	Assert(GetProcessingMode() == InitProcessing);
 
 	/*
 	 * Set up signal handlers.  We operate on databases much like a regular
diff --git a/src/backend/postmaster/auxprocess.c b/src/backend/postmaster/auxprocess.c
index 78f4263eeb..b21eae7136 100644
--- a/src/backend/postmaster/auxprocess.c
+++ b/src/backend/postmaster/auxprocess.c
@@ -54,7 +54,8 @@ AuxiliaryProcessMainCommon(void)
 
 	init_ps_display(NULL);
 
-	SetProcessingMode(BootstrapProcessing);
+	Assert(GetProcessingMode() == InitProcessing);
+
 	IgnoreSystemIndexes = true;
 
 	/*
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index cf64a4beb2..dc20a25345 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -746,7 +746,7 @@ BackgroundWorkerMain(char *startup_data, size_t startup_data_len)
 	MyBackendType = B_BG_WORKER;
 	init_ps_display(worker->bgw_name);
 
-	SetProcessingMode(InitProcessing);
+	Assert(GetProcessingMode() == InitProcessing);
 
 	/* Apply PostAuthDelay */
 	if (PostAuthDelay > 0)
diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
index f1f44d41ef..2653ce0342 100644
--- a/src/backend/replication/logical/slotsync.c
+++ b/src/backend/replication/logical/slotsync.c
@@ -1342,7 +1342,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
 
 	init_ps_display(NULL);
 
-	SetProcessingMode(InitProcessing);
+	Assert(GetProcessingMode() == InitProcessing);
 
 	/*
 	 * Create a per-backend PGPROC struct in shared memory.  We must do this
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 2dff28afce..b9fee13612 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4161,7 +4161,7 @@ PostgresMain(const char *dbname, const char *username)
 	Assert(dbname != NULL);
 	Assert(username != NULL);
 
-	SetProcessingMode(InitProcessing);
+	Assert(GetProcessingMode() == InitProcessing);
 
 	/*
 	 * Set up signal handlers.  (InitPostmasterChild or InitStandaloneProcess
-- 
2.45.0



Re: First draft of PG 17 release notes

2024-05-09 Thread Bruce Momjian
On Thu, May  9, 2024 at 07:49:55PM +0800, jian he wrote:
> On Thu, May 9, 2024 at 6:53 PM jian he  wrote:
> >
> > On Thu, May 9, 2024 at 12:04 PM Bruce Momjian  wrote:
> > > I have committed the first draft of the PG 17 release notes;  you can
> > > see the results here:
> > >
> > > https://momjian.us/pgsql_docs/release-17.html
> 
> < Add columns to pg_stats to report range histogram information (Egor
> Rogov, Soumyadeep Chakraborty)
> I think this applies to range type and multi range type, "range
> histogram information" seems not very clear to me.
> So maybe:
> < Add columns to pg_stats to report range-type histogram information
> (Egor Rogov, Soumyadeep Chakraborty)

Yes, good point, done.

> Display length and bounds histograms in pg_stats

Uh, isn't that assumed?  Is this a detail worth mentioning?

> < Add new COPY option "ON_ERROR ignore" to discard error rows (Damir
> Belyalov, Atsushi Torikoshi, Alex Shulgin, Jian He, Jian He, Yugo
> Nagata)
> duplicate name.

Fixed.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Set appropriate processing mode for auxiliary processes.

2024-05-09 Thread Tom Lane
Heikki Linnakangas  writes:
> At first I was sure this was introduced by my refactorings in v17, but 
> in fact it's been like this forever. I agree that InitProcessing makes 
> much more sense. The ProcessingMode variable is initialized to 
> InitProcessing, so I think we can simply remove that line from 
> AuxiliaryProcessMainCommon(). There are existing 
> "SetProcessingMode(InitProcessing)" calls in other Main functions too 
> (AutoVacLauncherMain, BackgroundWorkerMain, etc.), and I think those can 
> also be removed.

This only works if the postmaster can be trusted never to change the
variable; else children could inherit some other value via fork().
In that connection, it seems a bit scary that postmaster.c contains a
couple of calls "SetProcessingMode(NormalProcessing)".  It looks like
they are in functions that should only be executed by child processes,
but should we try to move them somewhere else?  Another idea could be
to add an Assert to SetProcessingMode that insists that it can't be
executed by the postmaster.
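
To illustrate the inheritance hazard in isolation (a POSIX-only toy in
Python, not PostgreSQL code): whatever value the parent holds at fork()
time is what every child starts with, so children can only assume
InitProcessing if the postmaster itself never changes the variable:

```python
import os

processing_mode = "InitProcessing"

def set_mode(mode):
    global processing_mode
    processing_mode = mode

# The parent (standing in for the postmaster) changes the variable
# before forking -- the child inherits the changed value.
set_mode("NormalProcessing")

r, w = os.pipe()
pid = os.fork()
if pid == 0:  # child: report the inherited value back to the parent
    os.write(w, processing_mode.encode())
    os._exit(0)
os.close(w)
inherited = os.read(r, 64).decode()
os.close(r)
os.waitpid(pid, 0)
print(inherited)  # NormalProcessing, not InitProcessing
```

An Assert in SetProcessingMode() that it is never run in the postmaster
would close off exactly this path.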

regards, tom lane




Re: First draft of PG 17 release notes

2024-05-09 Thread jian he
On Thu, May 9, 2024 at 11:12 PM Bruce Momjian  wrote:
>
> On Thu, May  9, 2024 at 07:49:55PM +0800, jian he wrote:
> > On Thu, May 9, 2024 at 6:53 PM jian he  wrote:
> > >
> > > On Thu, May 9, 2024 at 12:04 PM Bruce Momjian  wrote:
> > > > I have committed the first draft of the PG 17 release notes;  you can
> > > > see the results here:
> > > >
> > > > https://momjian.us/pgsql_docs/release-17.html
> >

E.1.3.1.5. Privileges
Add per-table GRANT permission MAINTAIN to control maintenance
operations (Nathan Bossart)

The operations are VACUUM, ANALYZE, REINDEX, REFRESH MATERIALIZE VIEW,
CLUSTER, and LOCK TABLE.

Add user-grantable role pg_maintain to control maintenance operations
(Nathan Bossart)

The operations are VACUUM, ANALYZE, REINDEX, REFRESH MATERIALIZE VIEW,
CLUSTER, and LOCK TABLE.

Allow roles with pg_monitor privileges to execute pg_current_logfile()
(Pavlo Golub, Nathan Bossart)
---
should be "REFRESH MATERIALIZED VIEW"?

also
"Allow roles with pg_monitor privileges to execute
pg_current_logfile() (Pavlo Golub, Nathan Bossart)"
"pg_monitor" is a predefined role, so technically, "with pg_monitor
privileges" is not correct?
--
Add function XMLText() to convert text to a single XML text node (Jim Jones)

XMLText()
should be
xmltext()
--
Add function to_regtypemod() to return the typemod of a string (David
Wheeler, Erik Wienhold)
I think this description does not mean the same thing as the doc[1]

[1] 
https://www.postgresql.org/docs/devel/functions-info.html#FUNCTIONS-INFO-CATALOG
--

Allow GROUP BY columns to be internally ordered to match ORDER BY
(Andrei Lepikhov, Teodor Sigaev)
This can be disabled using server variable enable_group_by_reordering.

Probably
`This can be disabled by setting the server variable
enable_group_by_reordering to false`.




Re: First draft of PG 17 release notes

2024-05-09 Thread br...@momjian.us
On Thu, May  9, 2024 at 11:03:39AM +, Hans Buschmann wrote:
> Some findings
>
>
> 1.
>
> >>Remove adminpack contrib extension (Daniel Gustafsson)
>
> >>This was used by non-end-of-life pgAdmin III.
>
>
> Perhaps you mean now-end-of-life (s/non/now/)

Yes, fixed to "now end-of-life"

> 2.
> >>All specification of partitioned table access methods (Justin Pryzby, >>
> Soumyadeep Chakraborty, Michael Paquier)
>
> perhaps you mean Allow, otherwise meaning not clear.

Yep, fixed.

> 3.
> >> Add some long options to pg_archivecleanup (Atsushi Torikoshi)
> >>The long options are --debug, --dry-run, and /--strip-extension.
>
> The slash should be omitted.

Fixed, not sure how that got in there.

I have committed all outstanding fixes and updated the doc build:

https://momjian.us/pgsql_docs/release-17.html

Thank you for all the valuable feedback.

Incidentally, the big surprise for me was the large number of optimizer
improvements.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: First draft of PG 17 release notes

2024-05-09 Thread Bruce Momjian
On Thu, May  9, 2024 at 11:26:44PM +0800, jian he wrote:
> On Thu, May 9, 2024 at 11:12 PM Bruce Momjian  wrote:
> >
> > On Thu, May  9, 2024 at 07:49:55PM +0800, jian he wrote:
> > > On Thu, May 9, 2024 at 6:53 PM jian he  
> > > wrote:
> > > >
> > > > On Thu, May 9, 2024 at 12:04 PM Bruce Momjian  wrote:
> > > > > I have committed the first draft of the PG 17 release notes;  you can
> > > > > see the results here:
> > > > >
> > > > > https://momjian.us/pgsql_docs/release-17.html
> > >
> 
> E.1.3.1.5. Privileges
> Add per-table GRANT permission MAINTAIN to control maintenance
> operations (Nathan Bossart)
> 
> The operations are VACUUM, ANALYZE, REINDEX, REFRESH MATERIALIZE VIEW,
> CLUSTER, and LOCK TABLE.
> 
> Add user-grantable role pg_maintain to control maintenance operations
> (Nathan Bossart)
> 
> The operations are VACUUM, ANALYZE, REINDEX, REFRESH MATERIALIZE VIEW,
> CLUSTER, and LOCK TABLE.
> 
> Allow roles with pg_monitor privileges to execute pg_current_logfile()
> (Pavlo Golub, Nathan Bossart)
> ---
> should be "REFRESH MATERIALIZED VIEW"?

Yes, fixed.

> also
> "Allow roles with pg_monitor privileges to execute
> pg_current_logfile() (Pavlo Golub, Nathan Bossart)"
> "pg_monitor" is a predefined role, so technically, "with pg_monitor
> privileges" is not correct?

Good point, new text:

Allow roles with pg_monitor membership to execute pg_current_logfile() 
(Pavlo Golub, Nathan Bossart)

> --
> Add function XMLText() to convert text to a single XML text node (Jim Jones)
> 
> XMLText()
> should be
> xmltext()

Right, fixed.

> --
> Add function to_regtypemod() to return the typemod of a string (David
> Wheeler, Erik Wienhold)
> I think this description does not mean the same thing as the doc[1]

Yes, I see your point.  I changed the text to:

Add function to_regtypemod() to return the type modifier of a
type specification (David Wheeler, Erik Wienhold)


> [1] 
> https://www.postgresql.org/docs/devel/functions-info.html#FUNCTIONS-INFO-CATALOG
> --
> 
> Allow GROUP BY columns to be internally ordered to match ORDER BY
> (Andrei Lepikhov, Teodor Sigaev)
> This can be disabled using server variable enable_group_by_reordering.
> 
> Probably
> `This can be disabled by setting the server variable
> enable_group_by_reordering to false`.

Uh, I usually don't go into that detail.  There will be a link to the
variable in about a month so users can look up its behavior.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Parallel CREATE INDEX for GIN indexes

2024-05-09 Thread Matthias van de Meent
On Thu, 9 May 2024 at 15:13, Tomas Vondra  wrote:
> Let me explain the relevant part of the patch, and how I understand the
> improvement suggested by Matthias. The patch does the work in three phases:
>
>
> 1) Worker gets data from table, split that into index items and add
> those into a "private" tuplesort, and finally sorts that. So a worker
> may see a key many times, with different TIDs, so the tuplesort may
> contain many items for the same key, with distinct TID lists:
>
>key1: 1, 2, 3, 4
>key1: 5, 6, 7
>key1: 8, 9, 10
>key2: 1, 2, 3
>...

This step is actually split in several components/phases, too.
As opposed to btree, which directly puts each tuple's data into the
tuplesort, this GIN approach actually buffers the tuples in memory to
generate these TID lists for data keys, and flushes these pairs of Key
+ TID list into the tuplesort when its own memory limit is exceeded.
That means we essentially double the memory used for this data: One
GIN deform buffer, and one in-memory sort buffer in the tuplesort.
This is fine for now, but feels duplicative, hence my "let's allow
tuplesort to merge the key+TID pairs into pairs of key+TID list"
comment.
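
A minimal sketch of that "merge key+TID pairs into key+TID lists" idea, in Python (illustrative only — in the actual patch this would happen inside tuplesort's C merge code, and names here are invented for the sketch):

```python
import heapq
from itertools import groupby
from operator import itemgetter

def merge_runs_combining_keys(runs):
    """Merge sorted runs of (key, tid) pairs and, while merging, combine
    all pairs with equal keys into a single (key, tid_list) tuple."""
    merged = heapq.merge(*runs)          # runs are already sorted by (key, tid)
    return [(key, [tid for _, tid in grp])
            for key, grp in groupby(merged, key=itemgetter(0))]

# Example: two sorted worker runs containing overlapping keys
runs = [
    [("key1", 1), ("key1", 2), ("key2", 1)],
    [("key1", 5), ("key1", 6), ("key2", 3)],
]
print(merge_runs_combining_keys(runs))
# [('key1', [1, 2, 5, 6]), ('key2', [1, 3])]
```

The point is that the combining is free during the merge pass: equal keys are adjacent in the merged stream, so no extra sort or extra spill to disk is needed.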

> The trouble with (2) is that it "just copies" data from one tuplesort
> into another, increasing the disk space requirements. In an extreme
> case, when nothing can be combined, it pretty much doubles the amount of
> disk space, and makes the build longer.
>
> What I think Matthias is suggesting, is that this "TID list merging"
> could be done directly as part of the tuplesort in step (1). So instead
> of just moving the "sort tuples" from the appropriate runs, it could
> also do an optional step of combining the tuples and writing this
> combined tuple into the tuplesort result (for that worker).

Yes, but with a slightly more extensive approach than that even, see above.

> Matthias also mentioned this might be useful when building btree indexes
> with key deduplication.
>
> AFAICS this might work, although it probably requires for the "combined"
> tuple to be smaller than the sum of the combined tuples (in order to fit
> into the space).

*must not be larger than the sum; not "must be smaller than the sum" [^0].
For btree tuples with posting lists this is guaranteed to be true: The
added size of a btree tuple with a posting list (containing at least 2
values) vs one without is the maxaligned size of 2 TIDs, or 16 bytes
(12 on 32-bit systems). The smallest btree tuple with data is also 16
bytes (or 12 bytes on 32-bit systems), so this works out nicely.
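
The size arithmetic above can be checked directly (a sketch; the 6-byte TID and maxalign values of 8/4 for 64-bit/32-bit are the usual PostgreSQL values):

```python
def maxalign(size, align):
    # round size up to the next multiple of the platform's maximum alignment
    return -(-size // align) * align

TID_BYTES = 6  # sizeof(ItemPointerData)

# added size of a posting list containing 2 TIDs
print(maxalign(2 * TID_BYTES, 8))  # 64-bit: 16
print(maxalign(2 * TID_BYTES, 4))  # 32-bit: 12
```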

> But at least in the GIN build in the workers this is
> likely true, because the TID lists do not overlap (and thus not hurting
> the compressibility).
>
> That being said, I still see this more as an optimization than something
> required for the patch,

Agreed.

> and I don't think I'll have time to work on this
> anytime soon. The patch is not extremely complex, but it's not trivial
> either. But if someone wants to take a stab at extending tuplesort to
> allow this, I won't object ...

Same here: While I do have some ideas on where and how to implement
this, I'm not planning on working on that soon.


Kind regards,

Matthias van de Meent

[^0] There's some overhead in the tuplesort serialization too, so
there is some leeway there, too.




Re: First draft of PG 17 release notes

2024-05-09 Thread Andrew Dunstan


On 2024-05-09 Th 00:03, Bruce Momjian wrote:

I have committed the first draft of the PG 17 release notes;  you can
see the results here:

https://momjian.us/pgsql_docs/release-17.html

It will be improved until the final release.  The item count is 188,
which is similar to recent releases:

release-10:  189
release-11:  170
release-12:  180
release-13:  178
release-14:  220
release-15:  184
release-16:  206
release-17:  188

I welcome feedback.  For some reason it was an easier job than usual.



 *

   Remove the ability to build Postgres with Visual Studio (Michael
   Paquier)

   Meson is now the only available Windows build method.


This is a category mistake. What was removed was the special code we had 
for building with VS, but not the ability to build with VS. You can 
build with VS using meson (see for example drongo on the buildfarm)



cheers


andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: request for database identifier in the startup packet

2024-05-09 Thread Robert Haas
On Thu, May 9, 2024 at 8:06 AM Dave Cramer  wrote:
> The JDBC driver is currently keeping a per connection cache of types in the 
> driver. We are seeing cases where the number of columns is quite high. In one 
> case ("Prevent fetchFieldMetaData() from being run when unnecessary", issue 
> #3241 at pgjdbc/pgjdbc on GitHub), 2.6 million columns.
>
> If we knew that we were connecting to the same database we could use a single 
> cache across connections.
>
> I think we would require a server/database identifier in the startup message.

I understand the desire to share the cache, but not why that would
require any kind of change to the wire protocol.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Is there an undocumented Syntax Check in Meson?

2024-05-09 Thread David G. Johnston
$subject

Make has one:
https://www.postgresql.org/docs/current/docguide-build.html#DOCGUIDE-BUILD-SYNTAX-CHECK

This needs updating:
https://www.postgresql.org/docs/current/docguide-build-meson.html

I've been using "ninja html" which isn't shown here.  Also, as a sanity
check, running that command takes my system 1 minute.  Any idea what
percentile that falls into?

David J.


Re: CREATE DATABASE with filesystem cloning

2024-05-09 Thread Robert Haas
On Wed, May 8, 2024 at 10:34 AM Nazir Bilal Yavuz  wrote:
> > I'm not so sure about the GUC name. On the one hand, it feels like
> > createdb should be spelled out as create_database, but on the other
> > hand, the GUC name is quite long already. Then again, why would we
> > make this specific to CREATE DATABASE in the first place? Would we
> > also want alter_tablespace_file_copy_method and
> > basic_archive.file_copy_method?
>
> I agree that it is already quite long, because of that I chose the
> createdb as a prefix. I did not think that file cloning was planned to
> be used in other places. If that is the case, does something like
> 'preferred_copy_method' work? Then, we would mention which places will
> be affected with this GUC in the docs.

I would go with file_copy_method rather than preferred_copy_method,
because (1) there's nothing "preferred" about it if we're using it
unconditionally and (2) it's nice to clarify that we're talking about
copying files rather than anything else.

My personal enthusiasm for making platform-specific file copy methods
usable all over PostgreSQL is quite limited. However, it is my
observation that other people seem far more enthusiastic about it than
I am. For example, consider how quickly it got added to
pg_combinebackup. So, I suspect it's smart to plan on anything we add
in this area getting used in a bunch of places. And perhaps it is even
best to think about making it work in all of those places right from
the start. If we build support into copydir and copy_file, then we
just get everything that uses those, and all that remains is to
document what is covered (and add comments so that future patch authors
know they should further update the documentation).

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: First draft of PG 17 release notes

2024-05-09 Thread Bruce Momjian
On Thu, May  9, 2024 at 12:10:11PM -0400, Andrew Dunstan wrote:
> 
> On 2024-05-09 Th 00:03, Bruce Momjian wrote:
> 
> I have committed the first draft of the PG 17 release notes;  you can
> see the results here:
> 
> https://momjian.us/pgsql_docs/release-17.html
> 
> It will be improved until the final release.  The item count is 188,
> which is similar to recent releases:
> 
> release-10:  189
> release-11:  170
> release-12:  180
> release-13:  178
> release-14:  220
> release-15:  184
> release-16:  206
> release-17:  188
> 
> I welcome feedback.  For some reason it was an easier job than usual.
> 
> 
>   • Remove the ability to build Postgres with Visual Studio (Michael Paquier)
> 
> Meson is now the only available Windows build method.
> 
> 
> This is a category mistake. What was removed was the special code we had for
> building with VS, but not the ability to build with VS. You can build with VS
> using meson (see for example drongo on the buildfarm)

Wow, okay, I am not surprised I was confused.  New text is:





Remove the Microsoft Visual Studio-specific Postgres build
option (Michael Paquier)



Meson is now the only method for Visual Studio builds.









-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Is there an undocumented Syntax Check in Meson?

2024-05-09 Thread Tom Lane
"David G. Johnston"  writes:
> I've been using "ninja html" which isn't shown here.  Also, as a sanity
> check, running that command takes my system 1 minute.  Any idea what
> percentile that falls into?

On my no-longer-shiny-new workstation, "make" in doc/src/sgml
(which builds just the HTML docs) takes right about 30s in HEAD.
Can't believe that the overhead would be noticeably different
between make and meson, since it's a simple command sequence
with no opportunity for parallelism.

regards, tom lane




Re: Is there an undocumented Syntax Check in Meson?

2024-05-09 Thread Dagfinn Ilmari Mannsåker
"David G. Johnston"  writes:

> $subject
>
> Make has one:
> https://www.postgresql.org/docs/current/docguide-build.html#DOCGUIDE-BUILD-SYNTAX-CHECK
>
> This needs updating:
> https://www.postgresql.org/docs/current/docguide-build-meson.html
>
> I've been using "ninja html" which isn't shown here.  Also, as a sanity
> check, running that command takes my system 1 minute.  Any idea what
> percentile that falls into?

My laptop (8-core i7-11800H @ 2.30GHz) takes 22s to do `ninja html`
after `ninja clean`.

> David J.

- ilmari




Re: Support tid range scan in parallel?

2024-05-09 Thread Cary Huang
 > I've not looked at the patch, but please add it to the July CF.  I'll
 > try and look in more detail then.

Thanks David, I have added this patch on July commitfest under the
server feature category. 

I understand that the regression tests for parallel ctid range scan are a
bit lacking now. It only has a few EXPLAIN clauses to ensure parallel 
workers are used when tid ranges are specified. They are added as
part of select_parallel.sql test. I am not sure if it is more appropriate
to have them as part of tidrangescan.sql test instead. So basically
re-run the same test cases in tidrangescan.sql but in parallel? 

thank you

Cary









Re: [PATCH] json_lex_string: don't overread on bad UTF8

2024-05-09 Thread Jacob Champion
On Wed, May 8, 2024 at 9:27 PM Michael Paquier  wrote:
> This is a bit mitigated by the fact that d6607016c738 is recent, but
> this is incorrect since v13 so backpatched down to that.

Thank you!

--Jacob




Re: request for database identifier in the startup packet

2024-05-09 Thread Dave Cramer
Dave Cramer


On Thu, 9 May 2024 at 12:22, Robert Haas  wrote:

> On Thu, May 9, 2024 at 8:06 AM Dave Cramer  wrote:
> > The JDBC driver is currently keeping a per connection cache of types in
> the driver. We are seeing cases where the number of columns is quite high.
> In one case Prevent fetchFieldMetaData() from being run when unnecessary. ·
> Issue #3241 · pgjdbc/pgjdbc (github.com) 2.6 Million columns.
> >
> > If we knew that we were connecting to the same database we could use a
> single cache across connections.
> >
> > I think we would require a server/database identifier in the startup
> message.
>
> I understand the desire to share the cache, but not why that would
> require any kind of change to the wire protocol.
>
The server identity is actually useful for many things such as knowing
which instance of a cluster you are connected to.
For the cache however we can't use the IP address to determine which server
we are connected to as we could be connected to a pooler.
Knowing exactly which server/database makes it relatively easy to have a
common cache across connections. Getting that in the startup message seems
like a good place to provide it.

Dave


Re: First draft of PG 17 release notes

2024-05-09 Thread Álvaro Herrera
On 2024-May-09, Bruce Momjian wrote:

> However, I don't see it mentioned as a release note item in the commit
> message or mentioned in our docs. I suppose the release note text would
> be:
> 
>   Removing a PRIMARY KEY will remove the NOT NULL column specification
> 
>   Previously the NOT NULL specification would be retained.
> 
> Do we have agreement that we want this release note item?

Yes.  Maybe we want some others too (especially regarding inheritance,
but also regarding the way we handle the constraints internally), and
maybe in this one we want different wording.  How about something like
this:

  Removing a primary key constraint may change the nullability
  characteristic of the columns that the primary key covered.

  If explicit not-null constraints exist on the same column, then they
  continue to be /known not nullable/; otherwise they become /possibly
  nullable/.

This is largely based on the SQL standard's language of a column
descriptor having a "nullability characteristic", which for columns with
not-null or primary key constraints is "known not null".  I don't think
we use those terms anywhere.  I hope this isn't too confusing.

The standard's text on this, in section "4.13 Columns, fields, and
attributes", is

  Every column has a nullability characteristic that indicates whether
  the value from that column can be the null value. A nullability
  characteristic is either known not nullable or possibly nullable.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/




Bug: PGTYPEStimestamp_from_asc() in ECPG pgtypelib

2024-05-09 Thread Ryo Matsumura (Fujitsu)
Hi hackers,

I detected two problems about ECPG.
I show my opinion. Please comment.
If it's correct, I will prepare a patch.
Thank you.

1.
It is unspecified what PGTYPEStimestamp_from_asc() returns on error.
The following is written in document(36.6.8. Special Constants of pgtypeslib):
  A value of type timestamp representing an invalid time stamp.
  This is returned by the function PGTYPEStimestamp_from_asc on parse error.
  Note that due to the internal representation of the timestamp data type,
  PGTYPESInvalidTimestamp is also a valid timestamp at the same time.
  It is set to 1899-12-31 23:59:59. In order to detect errors,
  make sure that your application does not only test for PGTYPESInvalidTimestamp
  but also for errno != 0 after each call to PGTYPEStimestamp_from_asc.

However, PGTYPESInvalidTimestamp is not defined anywhere.
It no longer exists even at REL6_2, which is the oldest branch.
At current implementation, PGTYPEStimestamp_from_asc returns -1.

So we must fix the document so that users write as follows:

  r = PGTYPEStimestamp_from_asc(...);
  if (r < 0 || errno != 0)
goto error;


2.
Regression test of pgtypelib is not robust (maybe incorrect).
Our test failed although there is no bug actually.

I think block2 and block3 should be swapped.

---[src/interfaces/ecpg/test/pgtypeslib/dt_test.pgc]---

  // block1 (my comment)
  ts1 = PGTYPEStimestamp_from_asc("96-02-29", NULL);
  text = PGTYPEStimestamp_to_asc(ts1);
  printf("timestamp_to_asc1: %s\n", text);
  PGTYPESchar_free(text);

  // block2
  ts1 = PGTYPEStimestamp_from_asc("1994-02-11 26:10:35", NULL);
  text = PGTYPEStimestamp_to_asc(ts1);
  printf("timestamp_to_asc3: %s\n", text);
  PGTYPESchar_free(text);

  // The following comment is for block1 clearly.
/* abc-03:10:35-def-02/11/94-gh  */
/*  12345678901234567890123456789 */

  // block3
  // Maybe the following is for 'ts1' returned in block1.
  // In our environment, 'out' is indeterminate because PGTYPEStimestamp_fmt_asc()
  // didn't complete and the buffer is not initialized.
  out = (char*) malloc(32);
  i = PGTYPEStimestamp_fmt_asc(&ts1, out, 31, "abc-%X-def-%x-ghi%%");
  printf("timestamp_fmt_asc: %d: %s\n", i, out);
  free(out);



Best Regards
Ryo Matsumura


Best Regards
Ryo Matsumura



Re: Is there an undocumented Syntax Check in Meson?

2024-05-09 Thread Dagfinn Ilmari Mannsåker
"David G. Johnston"  writes:

> $subject
>
> Make has one:
> https://www.postgresql.org/docs/current/docguide-build.html#DOCGUIDE-BUILD-SYNTAX-CHECK
>
> This needs updating:
> https://www.postgresql.org/docs/current/docguide-build-meson.html
>
> I've been using "ninja html" which isn't shown here.

The /devel/ version has a link to the full list of doc targets:

https://www.postgresql.org/docs/devel/install-meson.html#TARGETS-MESON-DOCUMENTATION

Attached is a patch which adds a check-docs target for meson, which
takes 0.3s on my laptop.

- ilmari

>From d54104493b9d97b95a890e47b395723d9b152447 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Thu, 9 May 2024 19:59:46 +0100
Subject: [PATCH] Add a check-docs target for meson

This allows checking the syntax of the docs much faster than
rebuilding them all, similar to `make -C doc/src/sgml check`.
---
 doc/src/sgml/meson.build   | 10 ++
 doc/src/sgml/targets-meson.txt |  1 +
 2 files changed, 11 insertions(+)

diff --git a/doc/src/sgml/meson.build b/doc/src/sgml/meson.build
index e418de83a7..51eed7b31d 100644
--- a/doc/src/sgml/meson.build
+++ b/doc/src/sgml/meson.build
@@ -112,6 +112,16 @@ postgres_full_xml = custom_target('postgres-full.xml',
 docs += postgres_full_xml
 alldocs += postgres_full_xml
 
+checkdocs = custom_target('check-docs',
+  input: 'postgres.sgml',
+  output: 'check-docs',
+  depfile: 'postgres-full.xml.d',
+  command: [xmllint,  '--nonet', '--valid', '--noout',
+'--path', '@OUTDIR@', '@INPUT@'],
+  depends: doc_generated,
+  build_by_default: false,
+)
+alias_target('check-docs', checkdocs)
 
 if xsltproc_bin.found()
   xsltproc_flags = [
diff --git a/doc/src/sgml/targets-meson.txt b/doc/src/sgml/targets-meson.txt
index bd470c87a7..ba426707be 100644
--- a/doc/src/sgml/targets-meson.txt
+++ b/doc/src/sgml/targets-meson.txt
@@ -26,6 +26,7 @@ Documentation Targets:
   doc/src/sgml/postgres-US.pdf  Build documentation in PDF format, with US letter pages
   doc/src/sgml/postgres.htmlBuild documentation in single-page HTML format
   alldocs   Build documentation in all supported formats
+  check-docsCheck the syntax of the documentation
 
 Installation Targets:
   install   Install postgres, excluding documentation
-- 
2.39.2



Re: request for database identifier in the startup packet

2024-05-09 Thread Andres Freund
Hi,

On 2024-05-09 14:20:49 -0400, Dave Cramer wrote:
> On Thu, 9 May 2024 at 12:22, Robert Haas  wrote:
> > On Thu, May 9, 2024 at 8:06 AM Dave Cramer  wrote:
> > > The JDBC driver is currently keeping a per connection cache of types in
> > the driver. We are seeing cases where the number of columns is quite high.
> > In one case Prevent fetchFieldMetaData() from being run when unnecessary. ·
> > Issue #3241 · pgjdbc/pgjdbc (github.com) 2.6 Million columns.
> > >
> > > If we knew that we were connecting to the same database we could use a
> > single cache across connections.
> > >
> > > I think we would require a server/database identifier in the startup
> > message.
> >
> > I understand the desire to share the cache, but not why that would
> > require any kind of change to the wire protocol.
> >
> > The server identity is actually useful for many things such as knowing
> which instance of a cluster you are connected to.
> For the cache however we can't use the IP address to determine which server
> we are connected to as we could be connected to a pooler.
> Knowing exactly which server/database makes it relatively easy to have a
> common cache across connections. Getting that in the startup message seems
> like a good place

ISTM that you could just as well query the information you'd like after
connecting. And that's going to be a lot more flexible than having to have
precisely the right information in the startup message, and most clients not
needing it.

Greetings,

Andres Freund




Re: request for database identifier in the startup packet

2024-05-09 Thread Robert Haas
On Thu, May 9, 2024 at 3:14 PM Andres Freund  wrote:
> ISTM that you could just as well query the information you'd like after
> connecting. And that's going to be a lot more flexible than having to have
> precisely the right information in the startup message, and most clients not
> needing it.

I agree with this.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Parallel CREATE INDEX for GIN indexes

2024-05-09 Thread Tomas Vondra



On 5/9/24 17:51, Matthias van de Meent wrote:
> On Thu, 9 May 2024 at 15:13, Tomas Vondra  
> wrote:
>> Let me explain the relevant part of the patch, and how I understand the
>> improvement suggested by Matthias. The patch does the work in three phases:
>>
>>
>> 1) Worker gets data from table, split that into index items and add
>> those into a "private" tuplesort, and finally sorts that. So a worker
>> may see a key many times, with different TIDs, so the tuplesort may
>> contain many items for the same key, with distinct TID lists:
>>
>>key1: 1, 2, 3, 4
>>key1: 5, 6, 7
>>key1: 8, 9, 10
>>key2: 1, 2, 3
>>...
> 
> This step is actually split in several components/phases, too.
> As opposed to btree, which directly puts each tuple's data into the
> tuplesort, this GIN approach actually buffers the tuples in memory to
> generate these TID lists for data keys, and flushes these pairs of Key
> + TID list into the tuplesort when its own memory limit is exceeded.
> That means we essentially double the memory used for this data: One
> GIN deform buffer, and one in-memory sort buffer in the tuplesort.
> This is fine for now, but feels duplicative, hence my "let's allow
> tuplesort to merge the key+TID pairs into pairs of key+TID list"
> comment.
> 

True, although the "GIN deform buffer" (flushed by the callback if using
too much memory) likely does most of the merging already. If it only
happened in the tuplesort merge, we'd likely have far more tuples and
overhead associated with that. So we certainly won't get rid of either
of these things.

You're right the memory limits are a bit unclear, and need more thought.
I certainly have not thought very much about not using more than the
specified maintenance_work_mem amount. This includes the planner code
determining the number of workers to use - right now it simply does the
same thing as for btree/brin, i.e. assumes each workers uses 32MB of
memory and checks how many workers fit into maintenance_work_mem.

That was a bit bogus even for BRIN, because BRIN sorts only summaries,
which is typically tiny - perhaps a couple kB, much less than 32MB. But
it's still just one sort, and some opclasses may be much larger (like
bloom, for example). So I just went with it.

But for GIN it's more complicated, because we have two tuplesorts (not
sure if both can use the memory at the same time) and the GIN deform
buffer. Which probably means we need to have a per-worker allowance
considering all these buffers.
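
The planner heuristic described above amounts to something like the following (an illustrative sketch with invented names, not the actual planner code):

```python
def plan_sort_workers(maintenance_work_mem_kb, requested_workers,
                      per_worker_kb=32 * 1024):
    """Assume each parallel sort worker needs ~32MB of sort memory and
    cap the worker count so that all workers together fit within
    maintenance_work_mem."""
    fitting = maintenance_work_mem_kb // per_worker_kb
    return min(requested_workers, fitting)

print(plan_sort_workers(64 * 1024, 8))   # 64MB budget -> 2 workers
print(plan_sort_workers(256 * 1024, 4))  # plenty of budget -> 4 workers
```

For GIN, a per-worker allowance would instead have to account for both tuplesorts plus the deform buffer, rather than a single 32MB sort.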

>> The trouble with (2) is that it "just copies" data from one tuplesort
>> into another, increasing the disk space requirements. In an extreme
>> case, when nothing can be combined, it pretty much doubles the amount of
>> disk space, and makes the build longer.
>>
>> What I think Matthias is suggesting, is that this "TID list merging"
>> could be done directly as part of the tuplesort in step (1). So instead
>> of just moving the "sort tuples" from the appropriate runs, it could
>> also do an optional step of combining the tuples and writing this
>> combined tuple into the tuplesort result (for that worker).
> 
> Yes, but with a slightly more extensive approach than that even, see above.
> 
>> Matthias also mentioned this might be useful when building btree indexes
>> with key deduplication.
>>
>> AFAICS this might work, although it probably requires for the "combined"
>> tuple to be smaller than the sum of the combined tuples (in order to fit
>> into the space).
> 
> *must not be larger than the sum; not "must be smaller than the sum" [^0].

Yeah, I wrote that wrong.

> For btree tuples with posting lists this is guaranteed to be true: The
> added size of a btree tuple with a posting list (containing at least 2
> values) vs one without is the maxaligned size of 2 TIDs, or 16 bytes
> (12 on 32-bit systems). The smallest btree tuple with data is also 16
> bytes (or 12 bytes on 32-bit systems), so this works out nicely.
> 
>> But at least in the GIN build in the workers this is
>> likely true, because the TID lists do not overlap (and thus not hurting
>> the compressibility).
>>
>> That being said, I still see this more as an optimization than something
>> required for the patch,
> 
> Agreed.
> 

OK

>> and I don't think I'll have time to work on this
>> anytime soon. The patch is not extremely complex, but it's not trivial
>> either. But if someone wants to take a stab at extending tuplesort to
>> allow this, I won't object ...
> 
> Same here: While I do have some ideas on where and how to implement
> this, I'm not planning on working on that soon.
> 

Understood. I don't have a very good intuition on how significant the
benefit could be, which is one of the reasons why I have not prioritized
this very much.

I did a quick experiment, to measure how expensive it is to build the
second worker tuplesort - for the pg_trgm index build with 2 workers, it
takes ~30seconds. The index build takes ~550s in total, so 30s is ~5%.
If we eliminated all of this wo

open items

2024-05-09 Thread Robert Haas
Hi,

Just a few reminders about the open items list at
https://wiki.postgresql.org/wiki/PostgreSQL_17_Open_Items --

- Please don't add issues to this list unless they are the result of
development done during this release cycle. This is not a
general-purpose bug tracker.

- The owner of an item is the person who committed the patch that
caused the problem, because that committer is responsible for cleaning
up the mess. Of course, the patch author is warmly invited to help,
especially if they have aspirations of being a committer some day
themselves. Other help is welcome, too.

- Fixing the stuff on this list is a time-boxed activity. We want to
put out a release on time. If the stuff listed here doesn't get fixed,
the release management team will have to do something about it, like
start yelling at people, or forcing patches to be reverted, which will
be no fun for anyone involved, including but not limited to the
release management team.

A great number of things that were added as open items have already
been resolved, but some of the remaining items have been there for a
while. Here's a quick review of what's on the list as of this moment:

* Incorrect Assert in heap_end/rescan for BHS. Either the description
of this item is inaccurate, or we've been unable to fix an incorrect
assert after more than a month. I interpret
https://www.postgresql.org/message-id/54858BA1-084E-4F7D-B2D1-D15505E512FF%40yesql.se
as a vote in favor of committing some patch by Melanie to fix this.
Either Tomas should commit that patch, or Melanie should commit that
patch, or somebody should say why that patch shouldn't be committed,
or someone should request more help determining whether that patch is
indeed the correct fix, or something. But let's not just sit on this.

* Register ALPN protocol id with IANA. From the mailing list thread,
it is abundantly clear that IANA is in no hurry to finish dealing with
what seems to be a completely pro forma request from our end. I think
we just have to be patient.

* not null constraints break dump/restore. I asked whether all of the
issues had been addressed here and Justin Pryzby opined that the only
thing that was still relevant for this release was a possible test
case change, which I would personally consider a good enough reason to
at least consider calling this done. But it's not clear to me whether
Justin's opinion is the consensus position, or perhaps more
relevantly, whether it's Álvaro's position.

* Temporal PKs allow duplicates with empty ranges. Peter Eisentraut
has started working with Paul Jungwirth on this. Looks good so far.

* Rename sslnegotiation "requiredirect." option to "directonly". I
still think Heikki has implemented the wrong behavior here, and I
don't think this renaming is going to make any difference one way or
the other in how understandable it is. But if we're going to leave the
behavior as-is and do the renaming, then let's get that done.

* Race condition with local injection point detach. Discussion is ongoing.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: request for database identifier in the startup packet

2024-05-09 Thread Dave Cramer
On Thu, 9 May 2024 at 15:19, Robert Haas  wrote:

> On Thu, May 9, 2024 at 3:14 PM Andres Freund  wrote:
> > ISTM that you could just as well query the information you'd like after
> > connecting. And that's going to be a lot more flexible than having to
> have
> > precisely the right information in the startup message, and most clients
> not
> > needing it.
>
> I agree with this.
>

Well, other than the extra round trip.

Thanks,
Dave


Re: add --no-sync to pg_upgrade's calls to pg_dump and pg_dumpall

2024-05-09 Thread Nathan Bossart
On Thu, May 09, 2024 at 09:03:56AM +0900, Michael Paquier wrote:
> +1.  Could there be an argument in favor of a backpatch?  This is a
> performance improvement, but one could also side that the addition of
> sync support in pg_dump[all] has made that a regression that we'd
> better fix because the flushes don't matter in this context.  They
> also bring costs for no gain.

I don't see a strong need to back-patch this, if for no other reason than
it seems to have gone unnoticed for 7 major versions.  Plus, based on my
admittedly limited testing, this is unlikely to provide significant
improvements.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Is there an undocumented Syntax Check in Meson?

2024-05-09 Thread Andres Freund
Hi,

On 2024-05-09 20:12:38 +0100, Dagfinn Ilmari Mannsåker wrote:
> Attached is a patch which adds a check-docs target for meson, which
> takes 0.3s on my laptop.

Nice.


> +checkdocs = custom_target('check-docs',
> +  input: 'postgres.sgml',
> +  output: 'check-docs',
> +  depfile: 'postgres-full.xml.d',
> +  command: [xmllint,  '--nonet', '--valid', '--noout',
> +'--path', '@OUTDIR@', '@INPUT@'],
> +  depends: doc_generated,
> +  build_by_default: false,
> +)
> +alias_target('check-docs', checkdocs)

Isn't the custom target redundant with postgres_full_xml? I.e. you could just
have the alias_target depend on postgres_full_xml?

Greetings,

Andres Freund




Re: open items

2024-05-09 Thread Dagfinn Ilmari Mannsåker
Robert Haas  writes:

> * Register ALPN protocol id with IANA. From the mailing list thread,
> it is abundantly clear that IANA is in no hurry to finish dealing with
> what seems to be a completely pro forma request from our end. I think
> we just have to be patient.

This appears to have been approved without anyone mentioning it on the
list, and the registry now lists "postgresql" at the bottom:

https://www.iana.org/assignments/tls-extensiontype-values/tls-extensiontype-values.xhtml#alpn-protocol-ids

- ilmari




Re: Bug: PGTYPEStimestamp_from_asc() in ECPG pgtypelib

2024-05-09 Thread Tom Lane
"Ryo Matsumura (Fujitsu)"  writes:
> It is unclear what PGTYPEStimestamp_from_asc() returns on error.
> The following is written in the documentation (36.6.8. Special Constants of pgtypeslib):
>   A value of type timestamp representing an invalid time stamp.
>   This is returned by the function PGTYPEStimestamp_from_asc on parse error.

> However, PGTYPESInvalidTimestamp is not defined anywhere.

Ugh.

> At current implementation, PGTYPEStimestamp_from_asc returns -1.

It looks to me like it returns 0 ("noresult").  Where are you seeing
-1?

That documentation has more problems too: it claims that "endptr"
is unimplemented, which looks like a lie to me: the code is there
to do it, and there are several callers that depend on it.

> Regression test of pgtypelib is not robust (maybe incorrect).
> Our test failed although there is no bug actually.

> I think block2 and block3 should be swapped.

Hm, block3 seems to be completely nuts.  It looks like the code is
rejecting the input (probably because "26" is out of range) and
returning zero, because what we see in the expected-stdout file is:

timestamp_to_asc3: 2000-01-01 00:00:00

We also see that the expected output from the PGTYPEStimestamp_fmt_asc
step is:

timestamp_fmt_asc: 0: abc-00:00:00-def-01/01/00-ghi%

which is consistent with that, but not very much like what the
comment is expecting.  I'm a bit inclined to just drop "block 3".
If we want to keep some kind of test of the error behavior,
it doesn't belong right there, and it should be showing what errno
gets set to.

As for the "lack of robustness", I'll bet the problem you are
seeing is that the test uses the %X/%x format specifiers which
are locale-dependent.  But how come we haven't noticed that
before?  Have you added a setlocale() call somewhere?

regards, tom lane




Re: request for database identifier in the startup packet

2024-05-09 Thread Robert Haas
On Thu, May 9, 2024 at 3:33 PM Dave Cramer  wrote:
> On Thu, 9 May 2024 at 15:19, Robert Haas  wrote:
>> On Thu, May 9, 2024 at 3:14 PM Andres Freund  wrote:
>> > ISTM that you could just as well query the information you'd like after
>> > connecting. And that's going to be a lot more flexible than having to have
>> > precisely the right information in the startup message, and most clients 
>> > not
>> > needing it.
>>
>> I agree with this.
>>
> Well other than the extra round trip.

I mean, sure, but we can't avoid that for everyone for everything.
There might be some way of doing something like this with, for
example, the infrastructure that was proposed to dynamically add stuff
to the list of PGC_REPORT GUCs, if the values you need are GUCs
already, or were made so. But I think it's just not workable to
unconditionally add a bunch of things to the startup packet. It'll
just grow and grow.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: open items

2024-05-09 Thread Robert Haas
On Thu, May 9, 2024 at 3:38 PM Dagfinn Ilmari Mannsåker
 wrote:
> Robert Haas  writes:
> > * Register ALPN protocol id with IANA. From the mailing list thread,
> > it is abundantly clear that IANA is in no hurry to finish dealing with
> > what seems to be a completely pro forma request from our end. I think
> > we just have to be patient.
>
> This appears to have been approved without anyone mentioning it on the
> list, and the registry now lists "postgresql" at the bottom:
>
> https://www.iana.org/assignments/tls-extensiontype-values/tls-extensiontype-values.xhtml#alpn-protocol-ids

Nice, thanks!

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: request for database identifier in the startup packet

2024-05-09 Thread Dave Cramer
On Thu, 9 May 2024 at 15:39, Robert Haas  wrote:

> On Thu, May 9, 2024 at 3:33 PM Dave Cramer  wrote:
> > On Thu, 9 May 2024 at 15:19, Robert Haas  wrote:
> >> On Thu, May 9, 2024 at 3:14 PM Andres Freund 
> wrote:
> >> > ISTM that you could just as well query the information you'd like
> after
> >> > connecting. And that's going to be a lot more flexible than having to
> have
> >> > precisely the right information in the startup message, and most
> clients not
> >> > needing it.
> >>
> >> I agree with this.
> >>
> > Well other than the extra round trip.
>
> I mean, sure, but we can't avoid that for everyone for everything.
> There might be some way of doing something like this with, for
> example, the infrastructure that was proposed to dynamically add stuff
> to the list of PGC_REPORT GUCs, if the values you need are GUCs
> already, or were made so. But I think it's just not workable to
> unconditionally add a bunch of things to the startup packet. It'll
> just grow and grow.
>

I don't think this is unconditional. These are real world situations where
having this information is useful.
That said, adding them every time I ask for them would end up growing
uncontrollably. This seems like a decent discussion to have with others.

Dave


Re: Is there an undocumented Syntax Check in Meson?

2024-05-09 Thread David G. Johnston
On Thu, May 9, 2024 at 12:12 PM Dagfinn Ilmari Mannsåker 
wrote:

> "David G. Johnston"  writes:
>
> > I've been using "ninja html" which isn't shown here.
>
> The /devel/ version has a link to the full list of doc targets:
>
>
> https://www.postgresql.org/docs/devel/install-meson.html#TARGETS-MESON-DOCUMENTATION


I knew I learned about the html target from the docs.  Forgot to use the
devel ones this time around.


> Attached is a patch which adds a check-docs target for meson, which
> takes 0.3s on my laptop.
>

Thanks.

David J.


Re: Is there an undocumented Syntax Check in Meson?

2024-05-09 Thread Dagfinn Ilmari Mannsåker
Andres Freund  writes:

> Hi,
>
> On 2024-05-09 20:12:38 +0100, Dagfinn Ilmari Mannsåker wrote:
>> Attached is a patch which adds a check-docs target for meson, which
>> takes 0.3s on my laptop.
>
> Nice.
>
>
>> +checkdocs = custom_target('check-docs',
>> +  input: 'postgres.sgml',
>> +  output: 'check-docs',
>> +  depfile: 'postgres-full.xml.d',
>> +  command: [xmllint,  '--nonet', '--valid', '--noout',
>> +'--path', '@OUTDIR@', '@INPUT@'],
>> +  depends: doc_generated,
>> +  build_by_default: false,
>> +)
>> +alias_target('check-docs', checkdocs)
>
> Isn't the custom target redundant with postgres_full_xml? I.e. you could just
> have the alias_target depend on postgres_full_xml?

We could, but that would actually rebuild postgres-full.xml, not just
check the syntax (but that only takes 0.1-0.2s longer), and only run if
the docs have been modified since it was last built (which I guess is
fine, since if you haven't modified the docs you can't have introduced
any syntax errors).

It's already possible to run that target directly, i.e.

ninja doc/src/sgml/postgres-full.xml

We could just document that in the list of meson doc targets, but a
shortcut alias would roll off the fingers more easily and be more
discoverable.

> Greetings,
>
> Andres Freund

- ilmari




Re: add --no-sync to pg_upgrade's calls to pg_dump and pg_dumpall

2024-05-09 Thread Daniel Gustafsson
> On 9 May 2024, at 21:34, Nathan Bossart  wrote:
> 
> On Thu, May 09, 2024 at 09:03:56AM +0900, Michael Paquier wrote:
>> +1.  Could there be an argument in favor of a backpatch?  This is a
>> performance improvement, but one could also argue that the addition of
>> sync support in pg_dump[all] has made that a regression that we'd
>> better fix because the flushes don't matter in this context.  They
>> also bring costs for no gain.
> 
> I don't see a strong need to back-patch this, if for no other reason than
> it seems to have gone unnoticed for 7 major versions.  Plus, based on my
> admittedly limited testing, this is unlikely to provide significant
> improvements.

Agreed, this is a nice little improvement but it's unlikely to be enough of a
speedup to warrant changing the backbranches.

--
Daniel Gustafsson





Re: cataloguing NOT NULL constraints

2024-05-09 Thread Robert Haas
On Wed, May 8, 2024 at 4:42 PM Alvaro Herrera  wrote:
> I spent a long time trying to think how to fix this, and I had despaired
> wanting to write that I would need to revert the whole NOT NULL business
> for pg17 -- but that was until I realized that we don't actually need
> this NOT NULL NO INHERIT business except during pg_upgrade, and that
> simplifies things enough to give me confidence that the whole feature
> can be kept.

Yeah, I have to admit that the ongoing bug fixing here has started to
make me a bit nervous, but I also can't totally follow everything
that's under discussion, so I don't want to rush to judgement. I feel
like we might need some documentation or a README or something that
explains the takeaway from the recent commits dealing with no-inherit
constraints. None of those commits updated the documentation, which
may be fine, but neither the resulting behavior nor the reasoning
behind it is obvious. It's not enough for it to be correct -- it has
to be understandable enough to the hive mind that we can maintain it
going forward.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: First draft of PG 17 release notes

2024-05-09 Thread Thomas Munro
On Thu, May 9, 2024 at 4:04 PM Bruce Momjian  wrote:
> I welcome feedback.  For some reason it was an easier job than usual.

> 2024-01-25 [820b5af73] jit: Require at least LLVM 10.

> Require LLVM version 10 or later (Peter Eisentraut)

Peter reviewed, I authored, and I think you intend to list authors in
parentheses.




Re: Is there an undocumented Syntax Check in Meson?

2024-05-09 Thread Andres Freund
Hi,

On 2024-05-09 09:23:37 -0700, David G. Johnston wrote:
> This needs updating:
> https://www.postgresql.org/docs/current/docguide-build-meson.html

You mean it should have a syntax target? Or that something else is out of
date?


> Also, as a sanity check, running that command takes my system 1 minute.  Any
> idea what percentile that falls into?

I think that's on the longer end - what OS/environment is this on? Even on
~5yo CPU with turbo boost disabled it's 48s for me.  FWIW, the single-page
html is a good bit faster, 29s on the same system.

I remember the build being a lot slower on windows, fwiw, due to the number of
files being opened/created. I guess that might also be the case on slow
storage, due to filesystem journaling.

Greetings,

Andres Freund




Re: Is there an undocumented Syntax Check in Meson?

2024-05-09 Thread Andres Freund
Hi,

On 2024-05-09 20:53:27 +0100, Dagfinn Ilmari Mannsåker wrote:
> Andres Freund  writes:
> > On 2024-05-09 20:12:38 +0100, Dagfinn Ilmari Mannsåker wrote:
> >> Attached is a patch which adds a check-docs target for meson, which
> >> takes 0.3s on my laptop.
> >> +checkdocs = custom_target('check-docs',
> >> +  input: 'postgres.sgml',
> >> +  output: 'check-docs',
> >> +  depfile: 'postgres-full.xml.d',
> >> +  command: [xmllint,  '--nonet', '--valid', '--noout',
> >> +'--path', '@OUTDIR@', '@INPUT@'],
> >> +  depends: doc_generated,
> >> +  build_by_default: false,
> >> +)
> >> +alias_target('check-docs', checkdocs)
> >
> > Isn't the custom target redundant with postgres_full_xml? I.e. you could 
> > just
> > have the alias_target depend on postgres_full_xml?
> 
> We could, but that would actually rebuild postgres-full.xml, not just
> check the syntax (but that only takes 0.1-0.2s longer),

I don't think this is performance critical enough to worry about 0.1s. If
anything I think the performance argument goes the other way round - doing the
validation work multiple times is a waste of time...


> and only run if the docs have been modified since it was last built (which I
> guess is fine, since if you haven't modified the docs you can't have
> introduced any syntax errors).

That actually seems good to me.


> It's already possible to run that target directly, i.e.
> 
> ninja doc/src/sgml/postgres-full.xml
> 
> We could just document that in the list of meson doc targets, but a
> shortcut alias would roll off the fingers more easily and be more
> discoverable.

Agreed.

Greetings,

Andres Freund




Re: Is there an undocumented Syntax Check in Meson?

2024-05-09 Thread David G. Johnston
On Thu, May 9, 2024 at 1:16 PM Andres Freund  wrote:

> Hi,
>
> On 2024-05-09 09:23:37 -0700, David G. Johnston wrote:
> > This needs updating:
> > https://www.postgresql.org/docs/current/docguide-build-meson.html
>
> You mean it should have a syntax target? Or that something else is out of
> date?
>
>
v17 looks good, I like the auto-generation.  I failed to notice I was
looking at v16 when searching for a check docs target.


> > Also, as a sanity check, running that command takes my system 1 minute.
> > Any idea what percentile that falls into?
>
>
> I think that's on the longer end - what OS/environment is this on? Even on
> ~5yo CPU with turbo boost disabled it's 48s for me.  FWIW, the single-page
> html is a good bit faster, 29s on the same system.
>

Ubuntu 22.04 running in AWS Workspaces Power.

Amazon EC2 t3.xlarge
Intel® Xeon(R) Platinum 8259CL CPU @ 2.50GHz × 4
16GiB Ram

David J.


Re: partitioning and identity column

2024-05-09 Thread Ashutosh Bapat
Thanks a lot Peter.

On Wed, May 8, 2024 at 2:34 AM Peter Eisentraut 
wrote:

> On 30.04.24 12:59, Ashutosh Bapat wrote:
> > PFA patch which fixes all the three problems.
>
> I have committed this patch.  Thanks all.
>


-- 
Best Wishes,
Ashutosh Bapat


Re: PERIOD foreign key feature

2024-05-09 Thread Bruce Momjian
On Wed, May  8, 2024 at 08:47:45PM -0700, Paul Jungwirth wrote:
> On 5/8/24 07:44, Bruce Momjian wrote:
> > On Wed, May  8, 2024 at 02:29:34PM +0200, Peter Eisentraut wrote:
> > > > Yes, David is correct here on all points. I like his suggestion to
> > > > clarify the language here also. If you need a patch from me let me know,
> > > > but I assume it's something a committer can just make happen?
> > > 
> > > In principle yes, but it's also very helpful if someone produces an actual
> > > patch file, with complete commit message, credits, mailing list link, etc.
> > 
> > I am ready to do the work, but waited a day for Peter to reply, since he
> > was the author of the text.
> 
> Here is a patch for this.

Thanks, patch applied.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: First draft of PG 17 release notes

2024-05-09 Thread Bruce Momjian
On Fri, May 10, 2024 at 08:05:43AM +1200, Thomas Munro wrote:
> On Thu, May 9, 2024 at 4:04 PM Bruce Momjian  wrote:
> > I welcome feedback.  For some reason it was an easier job than usual.
> 
> > 2024-01-25 [820b5af73] jit: Require at least LLVM 10.
> 
> > Require LLVM version 10 or later (Peter Eisentraut)
> 
> Peter reviewed, I authored, and I think you intend to list authors in
> parentheses.

Yes, my mistake, fixed.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Is there an undocumented Syntax Check in Meson?

2024-05-09 Thread Dagfinn Ilmari Mannsåker
Andres Freund  writes:

> Hi,
>
> On 2024-05-09 20:53:27 +0100, Dagfinn Ilmari Mannsåker wrote:
>> Andres Freund  writes:
>> > On 2024-05-09 20:12:38 +0100, Dagfinn Ilmari Mannsåker wrote:
>> >> Attached is a patch which adds a check-docs target for meson, which
>> >> takes 0.3s on my laptop.
>> >> +checkdocs = custom_target('check-docs',
>> >> +  input: 'postgres.sgml',
>> >> +  output: 'check-docs',
>> >> +  depfile: 'postgres-full.xml.d',
>> >> +  command: [xmllint,  '--nonet', '--valid', '--noout',
>> >> +'--path', '@OUTDIR@', '@INPUT@'],
>> >> +  depends: doc_generated,
>> >> +  build_by_default: false,
>> >> +)
>> >> +alias_target('check-docs', checkdocs)
>> >
>> > Isn't the custom target redundant with postgres_full_xml? I.e. you could 
>> > just
>> > have the alias_target depend on postgres_full_xml?
>> 
>> We could, but that would actually rebuild postgres-full.xml, not just
>> check the syntax (but that only takes 0.1-0.2s longer),
>
> I don't think this is performance critical enough to worry about 0.1s. If
> anything I think the performance argument goes the other way round - doing the
> validation work multiple times is a waste of time...

These targets are both build_by_default: false, so the checkdocs target
won't also be built when building the docs.  But reusing the
postgres_full_xml target will save half a second of the half-minute
build time if you then go on to actually build the docs without changing
anything after the syntax check passes.

>> and only run if the docs have been modified since it was last built (which I
>> guess is fine, since if you haven't modified the docs you can't have
>> introduced any syntax errors).
>
> That actually seems good to me.
>
>
>> It's already possible to run that target directly, i.e.
>> 
>> ninja doc/src/sgml/postgres-full.xml
>> 
>> We could just document that in the list of meson doc targets, but a
>> shortcut alias would roll off the fingers more easily and be more
>> discoverable.
>
> Agreed.

Here's a v2 patch that does it that way.  Should we document that
check-docs actually builds postgres-full.xml, and if so, where?

> Greetings,
>
> Andres Freund

- ilmari


>From 20b5a8e96ec88022b63062f9e4d74d46f09cedd6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Thu, 9 May 2024 19:59:46 +0100
Subject: [PATCH v2] Add a check-docs target for meson

This is just an alias for the `postgres-full.xml` target, but is
more discoverable and quicker to type.  This allows checking the
syntax of the docs much faster than actually building them, similar to
`make -C doc/src/sgml check`.
---
 doc/src/sgml/meson.build   | 2 ++
 doc/src/sgml/targets-meson.txt | 1 +
 2 files changed, 3 insertions(+)

diff --git a/doc/src/sgml/meson.build b/doc/src/sgml/meson.build
index e418de83a7..3eb0b99462 100644
--- a/doc/src/sgml/meson.build
+++ b/doc/src/sgml/meson.build
@@ -112,6 +112,8 @@ postgres_full_xml = custom_target('postgres-full.xml',
 docs += postgres_full_xml
 alldocs += postgres_full_xml
 
+# Shortcut to the above for checking doc syntax
+alias_target('check-docs', postgres_full_xml)
 
 if xsltproc_bin.found()
   xsltproc_flags = [
diff --git a/doc/src/sgml/targets-meson.txt b/doc/src/sgml/targets-meson.txt
index bd470c87a7..ba426707be 100644
--- a/doc/src/sgml/targets-meson.txt
+++ b/doc/src/sgml/targets-meson.txt
@@ -26,6 +26,7 @@ Documentation Targets:
   doc/src/sgml/postgres-US.pdf  Build documentation in PDF format, with US letter pages
   doc/src/sgml/postgres.htmlBuild documentation in single-page HTML format
   alldocs   Build documentation in all supported formats
+  check-docsCheck the syntax of the documentation
 
 Installation Targets:
   install   Install postgres, excluding documentation
-- 
2.39.2



Re: open items

2024-05-09 Thread Michael Paquier
On Thu, May 09, 2024 at 03:28:13PM -0400, Robert Haas wrote:
> Just a few reminders about the open items list at
> https://wiki.postgresql.org/wiki/PostgreSQL_17_Open_Items --

Thanks for summarizing the situation.

> * Race condition with local injection point detach. Discussion is ongoing.

I have sent a patch for that yesterday, which I assume is going in the
right direction to close entirely the loop:
https://www.postgresql.org/message-id/Zjx9-2swyNg6E1y1%40paquier.xyz

There is still one point of detail related to the amount of
flexibility we'd want for detachs (concurrent detach happening in
parallel of an automated one in the shmem callback) that I'm not
entirely sure about yet but I've proposed an idea to solve that as
well.  I'm hopeful in getting that wrapped at the beginning of next
week.
--
Michael




Re: Weird test mixup

2024-05-09 Thread Noah Misch
On Thu, May 09, 2024 at 04:40:43PM +0900, Michael Paquier wrote:
> So I like your suggestion.  This version closes the race window
> between the shmem exit detach in backend A and a concurrent backend B
> running a point local to A so as B will never run the local point of
> A.  However, it can cause a failure in the shmem exit callback of
> backend A if backend B does a detach of a point local to A because A
> tracks its local points with a static list in its TopMemoryContext, at
> least in the attached.  The second case could be solved by tracking
> the list of local points in the module's InjectionPointSharedState,
> but is this case really worth the complications it adds in the module
> knowing that the first case would be solid enough?  Perhaps not.
> 
> Another idea I have for the second case is to make
> InjectionPointDetach() a bit "softer", by returning a boolean status 
> rather than fail if the detach cannot be done, so as the shmem exit
> callback can still loop through the entries it has in store.

The return-bool approach sounds fine.  Up to you whether to do in this patch,
else I'll do it when I add the test.

> It could
> always be possible that a concurrent backend does a detach followed by
> an attach with the same name, causing the shmem exit callback to drop
> a point it should not, but that's not really a plausible case IMO :)

Agreed.  It's reasonable to expect test cases to serialize backend exits,
attach calls, and detach calls.  If we need to fix that later, we can use
attachment serial numbers.

> --- a/src/test/modules/injection_points/injection_points.c
> +++ b/src/test/modules/injection_points/injection_points.c

> +typedef enum InjectionPointConditionType
> +{
> + INJ_CONDITION_INVALID = 0,

I'd name this INJ_CONDITION_UNCONDITIONAL or INJ_CONDITION_ALWAYS.  INVALID
sounds like a can't-happen event or an injection point that never runs.
Otherwise, the patch looks good and makes src/test/modules/gin safe for
installcheck.  Thanks.




Re: First draft of PG 17 release notes

2024-05-09 Thread Bruce Momjian
On Thu, May  9, 2024 at 08:40:00PM +0200, Álvaro Herrera wrote:
> On 2024-May-09, Bruce Momjian wrote:
> 
> > However, I don't see it mentioned as a release note item in the commit
> > message or mentioned in our docs. I suppose the release note text would
> > be:
> > 
> > Removing a PRIMARY KEY will remove the NOT NULL column specification
> > 
> > Previously the NOT NULL specification would be retained.
> > 
> > Do we have agreement that we want this release note item?
> 
> Yes.  Maybe we want some others too (especially regarding inheritance,
> but also regarding the way we handle the constraints internally), and
> maybe in this one we want different wording.  How about something like
> this:
> 
>   Removing a primary key constraint may change the nullability
>   characteristic of the columns that the primary key covered.
> 
>   If explicit not-null constraints exist on the same column, then they
>   continue to be /known not nullable/; otherwise they become /possibly
>   nullable/.
> 
> This is largely based on the SQL standard's language of a column
> descriptor having a "nullability characteristic", which for columns with
> not-null or primary key constraints is "known not null".  I don't think
> we use those terms anywhere.  I hope this isn't too confusing.

Yes, it was confusing, partly because it is using wording we don't use,
and partly because it is talking about what can go into the column,
rather than the visible column restriction NOT NULL.  I also think "may"
is too imprecise.

How about:

Removing a primary key will remove a column's NOT NULL constraint
if the constraint was added by the primary key

Previously such NOT NULL constraints would remain after a primary
key was removed.  A column-level NOT NULL constraint would not be
removed.

Here is the PG 16 output:

CREATE TABLE test ( x INT CONSTRAINT test_pkey PRIMARY KEY );
                Table "public.test"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 x      | integer |           | not null |
Indexes:
    "test_pkey" PRIMARY KEY, btree (x)

CREATE TABLE test_with_not_null (x INT NOT NULL CONSTRAINT
test_pkey_with_not_null PRIMARY KEY);
         Table "public.test_with_not_null"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 x      | integer |           | not null |
Indexes:
    "test_pkey_with_not_null" PRIMARY KEY, btree (x)

ALTER TABLE test DROP CONSTRAINT test_pkey;
                Table "public.test"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
--> x   | integer |           | not null |

ALTER TABLE test_with_not_null DROP CONSTRAINT test_pkey_with_not_null;
         Table "public.test_with_not_null"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
--> x   | integer |           | not null |

Here is the output in PG 17:

CREATE TABLE test ( x INT CONSTRAINT test_pkey PRIMARY KEY );
                Table "public.test"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 x      | integer |           | not null |
Indexes:
    "test_pkey" PRIMARY KEY, btree (x)

CREATE TABLE test_with_not_null (x INT NOT NULL CONSTRAINT
test_pkey_with_not_null PRIMARY KEY);
         Table "public.test_with_not_null"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 x      | integer |           | not null |
Indexes:
    "test_pkey_with_not_null" PRIMARY KEY, btree (x)

ALTER TABLE test DROP CONSTRAINT test_pkey;
                Table "public.test"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
--> x   | integer |           |          |

ALTER TABLE test_with_not_null DROP CONSTRAINT test_pkey_with_not_null;
         Table "public.test_with_not_null"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
--> x   | integer |           | not null |

Notice that the table without a _column_ NOT NULL removes the NOT NULL
designation after removing the primary key only in PG 17.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

2024-05-09 Thread Alexander Korotkov
On Wed, May 1, 2024 at 5:26 AM Alexander Korotkov  wrote:
> On Wed, May 1, 2024 at 5:24 AM Noah Misch  wrote:
> > On Thu, Apr 25, 2024 at 04:59:54PM +0400, Pavel Borisov wrote:
> > > 0001: Optimize speed by avoiding heap visibility checking for different
> > > non-deduplicated index tuples as proposed by Noah Misch
> > >
> > > Speed measurements on my laptop using the exact method recommended by Noah
> > > upthread:
> > > Current master branch: checkunique off: 144s, checkunique on: 419s
> > > With patch 0001: checkunique off: 141s, checkunique on: 171s
> >
> > Where is the CPU time going to make it still be 21% slower w/ checkunique 
> > on?
> > It's a great improvement vs. current master, but I don't have an obvious
> > explanation for the remaining +21%.
>
> I think there is at least extra index tuples comparison.

The revised patchset is attached.  I applied cosmetic changes.  I'm
going to push it if no objections.

I haven't posted the patch renaming the new option, since there
doesn't seem to be a consensus.  I must admit that keeping all the
options in the same naming convention makes sense.

--
Regards,
Alexander Korotkov
Supabase


v2-0003-Amcheck-Don-t-load-rightpage-into-BtreeCheckState.patch
Description: Binary data


v2-0002-amcheck-Refactoring-the-storage-of-the-last-visib.patch
Description: Binary data


v2-0001-amcheck-Optimize-speed-of-checking-for-unique-con.patch
Description: Binary data


v2-0004-amcheck-Report-an-error-when-the-next-page-to-a-l.patch
Description: Binary data


Re: [PATCH] Improve amcheck to also check UNIQUE constraint in btree index.

2024-05-09 Thread Tom Lane
Alexander Korotkov  writes:
> The revised patchset is attached.  I applied cosmetic changes.  I'm
> going to push it if no objections.

Is this really suitable material to be pushing post-feature-freeze?
It doesn't look like it's fixing any new-in-v17 issues.

regards, tom lane




Re: SQL:2011 application time

2024-05-09 Thread Matthias van de Meent
Hi,

I haven't really been following this thread, but after playing around
a bit with the feature I feel there are new gaps in error messages. I
also think there are gaps in the functionality regarding the (lack of)
support for CREATE UNIQUE INDEX, and attaching these indexes to
constraints.

pg=# CREATE TABLE temporal_testing (
pg(#  id bigint NOT NULL
pg(#generated always as identity,
pg(#  valid_during tstzrange
pg(# );
CREATE TABLE
pg=# ALTER TABLE temporal_testing
pg-#  ADD CONSTRAINT temp_unique UNIQUE (id, valid_during WITHOUT OVERLAPS);
ALTER TABLE
pg=# \d+ temp_unique
 Index "public.temp_unique"
Column|Type | Key? |  Definition  | Storage  | Stats target
--+-+--+--+--+--
 id   | gbtreekey16 | yes  | id   | plain|
 valid_during | tstzrange   | yes  | valid_during | extended |
unique, gist, for table "public.temporal_testing"
-- ^^ note the "unique, gist"
pg=# CREATE UNIQUE INDEX ON temporal_testing USING gist (id, valid_during);
ERROR:  access method "gist" does not support unique indexes

Here we obviously have a unique GIST index in the catalogs, but
they're "not supported" by GIST when we try to create such an index
ourselves (!). Either the error message needs updating, or we need to
have a facility to actually support creating these unique indexes
outside constraints.

Additionally, because I can't create my own non-constraint-backing
unique GIST indexes, I can't pre-create my unique constraints
CONCURRENTLY as one could do for the non-temporal case: UNIQUE
constraints hold ownership of the index and would drop the index if
the constraint is dropped, too, and don't support a CONCURRENTLY
modifier, nor an INVALID modifier. This means temporal unique
constraints have much less administrative wiggle room than normal
unique constraints, and I think that's not great.

Kind regards,

Matthias van de Meent.




Re: Weird test mixup

2024-05-09 Thread Michael Paquier
On Thu, May 09, 2024 at 04:39:00PM -0700, Noah Misch wrote:

Thanks for the feedback.

> The return-bool approach sounds fine.  Up to you whether to do in this patch,
> else I'll do it when I add the test.

I see no reason to not change the signature of the routine now if we
know that we're going to do it anyway in the future.  I was shortly
wondering if doing the same for InjectionPointAttach() would make
sense, but it has more error states, so I'm not really tempted without
an actual reason (cannot think of a case where I'd want to put more
control into a module after a failed attach).

>> It could
>> always be possible that a concurrent backend does a detach followed by
>> an attach with the same name, causing the shmem exit callback to drop
>> a point it should not, but that's not really a plausible case IMO :)
> 
> Agreed.  It's reasonable to expect test cases to serialize backend exits,
> attach calls, and detach calls.  If we need to fix that later, we can use
> attachment serial numbers.

Okay by me.

> I'd name this INJ_CONDITION_UNCONDITIONAL or INJ_CONDITION_ALWAYS.  INVALID
> sounds like a can't-happen event or an injection point that never runs.
> Otherwise, the patch looks good and makes src/test/modules/gin safe for
> installcheck.  Thanks.

INJ_CONDITION_ALWAYS sounds like a good compromise here.

Attached is an updated patch for now, indented with a happy CI.  I am
still planning to look at that a second time on Monday with a fresher
mind, in case I'm missing something now.
--
Michael
From 233e710c8fc2242bdd4e40d5a20f8de2828765b7 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Fri, 10 May 2024 09:40:42 +0900
Subject: [PATCH v4] Add chunk area for injection points

---
 src/include/utils/injection_point.h   |   9 +-
 src/backend/utils/misc/injection_point.c  |  53 -
 .../expected/injection_points.out |   2 +-
 .../injection_points/injection_points.c   | 191 +++---
 doc/src/sgml/xfunc.sgml   |  14 +-
 src/tools/pgindent/typedefs.list  |   1 +
 6 files changed, 131 insertions(+), 139 deletions(-)

diff --git a/src/include/utils/injection_point.h b/src/include/utils/injection_point.h
index 55524b568f..a61d5d4439 100644
--- a/src/include/utils/injection_point.h
+++ b/src/include/utils/injection_point.h
@@ -23,15 +23,18 @@
 /*
  * Typedef for callback function launched by an injection point.
  */
-typedef void (*InjectionPointCallback) (const char *name);
+typedef void (*InjectionPointCallback) (const char *name,
+		const void *private_data);
 
 extern Size InjectionPointShmemSize(void);
 extern void InjectionPointShmemInit(void);
 
 extern void InjectionPointAttach(const char *name,
  const char *library,
- const char *function);
+ const char *function,
+ const void *private_data,
+ int private_data_size);
 extern void InjectionPointRun(const char *name);
-extern void InjectionPointDetach(const char *name);
+extern bool InjectionPointDetach(const char *name);
 
 #endif			/* INJECTION_POINT_H */
diff --git a/src/backend/utils/misc/injection_point.c b/src/backend/utils/misc/injection_point.c
index 0cf4d51cac..d5a8726644 100644
--- a/src/backend/utils/misc/injection_point.c
+++ b/src/backend/utils/misc/injection_point.c
@@ -42,6 +42,7 @@ static HTAB *InjectionPointHash;	/* find points from names */
 #define INJ_NAME_MAXLEN		64
 #define INJ_LIB_MAXLEN		128
 #define INJ_FUNC_MAXLEN		128
+#define INJ_PRIVATE_MAXLEN	1024
 
 /* Single injection point stored in InjectionPointHash */
 typedef struct InjectionPointEntry
@@ -49,6 +50,12 @@ typedef struct InjectionPointEntry
 	char		name[INJ_NAME_MAXLEN];	/* hash key */
 	char		library[INJ_LIB_MAXLEN];	/* library */
 	char		function[INJ_FUNC_MAXLEN];	/* function */
+
+	/*
+	 * Opaque data area that modules can use to pass some custom data to
+	 * callbacks, registered when attached.
+	 */
+	char		private_data[INJ_PRIVATE_MAXLEN];
 } InjectionPointEntry;
 
 #define INJECTION_POINT_HASH_INIT_SIZE	16
@@ -61,6 +68,7 @@ typedef struct InjectionPointEntry
 typedef struct InjectionPointCacheEntry
 {
 	char		name[INJ_NAME_MAXLEN];
+	char		private_data[INJ_PRIVATE_MAXLEN];
 	InjectionPointCallback callback;
 } InjectionPointCacheEntry;
 
@@ -73,7 +81,8 @@ static HTAB *InjectionPointCache = NULL;
  */
 static void
 injection_point_cache_add(const char *name,
-		  InjectionPointCallback callback)
+		  InjectionPointCallback callback,
+		  const void *private_data)
 {
 	InjectionPointCacheEntry *entry;
 	bool		found;
@@ -99,6 +108,7 @@ injection_point_cache_add(const char *name,
 	Assert(!found);
 	strlcpy(entry->name, name, sizeof(entry->name));
 	entry->callback = callback;
+	memcpy(entry->private_data, private_data, INJ_PRIVATE_MAXLEN);
 }
 
 /*
@@ -124,11 +134,14 @@ injection_point_cache_remove(const char *name)
  * Retrieve an injection point from the local cache, if any.
  */
 static InjectionPointCallback
-injection_po
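As a side note on the new API shape, the following standalone sketch
(simplified and with assumed names; the real code in injection_point.c
uses a shmem hash table and locking, both omitted here) illustrates how
a module-side payload travels through the new private_data argument
into the two-argument callback:

```c
#include <assert.h>
#include <string.h>

/* Hedged sketch of the patch's private_data plumbing, not PostgreSQL code. */
#define INJ_NAME_MAXLEN		64
#define INJ_PRIVATE_MAXLEN	1024

typedef void (*InjectionPointCallback) (const char *name,
										const void *private_data);

typedef struct InjectionPointCacheEntry
{
	char		name[INJ_NAME_MAXLEN];
	char		private_data[INJ_PRIVATE_MAXLEN];
	InjectionPointCallback callback;
} InjectionPointCacheEntry;

static InjectionPointCacheEntry cache_entry;

/* Mimics injection_point_cache_add(): copy the opaque blob into the entry. */
static void
cache_add(const char *name, InjectionPointCallback callback,
		  const void *private_data, int private_data_size)
{
	assert(private_data_size <= INJ_PRIVATE_MAXLEN);
	strncpy(cache_entry.name, name, sizeof(cache_entry.name) - 1);
	cache_entry.callback = callback;
	memcpy(cache_entry.private_data, private_data, private_data_size);
}

/* Example module-side payload and callback (hypothetical names). */
typedef struct DemoCondition
{
	int			kind;			/* e.g. a condition type */
	int			pid;
} DemoCondition;

static int	seen_kind = -1;

static void
demo_callback(const char *name, const void *private_data)
{
	const DemoCondition *cond = (const DemoCondition *) private_data;

	(void) name;
	seen_kind = cond->kind;
}

/* Attach, run, and report what the callback decoded from private_data. */
int
demo_run(void)
{
	DemoCondition cond = {.kind = 1, .pid = 0};

	cache_add("demo-point", demo_callback, &cond, sizeof(cond));
	cache_entry.callback(cache_entry.name, cache_entry.private_data);
	return seen_kind;
}
```

The key property is that the callback receives a stable copy of the
caller's data, so the module need not keep its own struct alive after
attaching.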

RE: Improving the latch handling between logical replication launcher and worker processes.

2024-05-09 Thread Hayato Kuroda (Fujitsu)
Dear Vignesh,

Thanks for raising these ideas!

> a) Introduce a new latch to handle worker attach and exit.

Just to confirm: there are three wait events for the launcher, so I
feel we may be able to create one latch per wait event. Is there a
reason to introduce only "a" (single) latch?

> b) Add a new GUC launcher_retry_time which gives more flexibility to
> users as suggested by Amit at [1]. Before 5a3a953, the
> wal_retrieve_retry_interval plays a similar role as the suggested new
> GUC launcher_retry_time, e.g. even if a worker is launched, the
> launcher only wait wal_retrieve_retry_interval time before next round.

Hmm. My concern is how users would estimate the value. Maybe the
default will be 3min, but should users change it? If so, to what? I
think even if it becomes tunable, users cannot control it well.

> c) Don't reset the latch at worker attach and allow launcher main to
> identify and handle it. For this there is a patch v6-0002 available at
> [2].

Do you mean that you want to remove ResetLatch() from
WaitForReplicationWorkerAttach()? If so, what about this scenario?

1) The launcher is waiting for the worker to attach in
WaitForReplicationWorkerAttach(), and 2) a subscription is created
before the worker attaches. In this case, the launcher becomes
un-sleepable because the latch is set but never reset. It may waste
CPU time.
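To make the concern concrete, here is a minimal, hedged simulation
(plain C, not the real latch code; WaitLatch is reduced to a check of a
boolean flag) of why a wait loop that never calls ResetLatch() spins:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hedged sketch of the latch protocol: WaitLatch returns immediately
 * while the latch stays set; only ResetLatch() lets the process sleep
 * again on the next wait.
 */
static bool latch_is_set = false;

static void SetLatch(void)   { latch_is_set = true; }
static void ResetLatch(void) { latch_is_set = false; }

/* Returns true if a WaitLatch() call would actually block (sleep). */
static bool WaitLatchWouldSleep(void) { return !latch_is_set; }

/*
 * Scenario from the mail: a subscription is created (latch set) before
 * the worker attaches.  If the wait loop never resets the latch, every
 * subsequent wait returns instantly; count those spurious wakeups.
 */
int
demo_spin_iterations(bool reset_after_wait)
{
	int			spins = 0;

	SetLatch();					/* CREATE SUBSCRIPTION signals the launcher */
	for (int i = 0; i < 5; i++)
	{
		if (!WaitLatchWouldSleep())
			spins++;			/* the wait returned immediately */
		if (reset_after_wait)
			ResetLatch();
	}
	return spins;
}
```

Without the reset, every one of the five iterations "wakes" even though
nothing new happened, which is the CPU-wasting busy loop described
above; with the reset, only the first iteration does.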

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 


Re: Set appropriate processing mode for auxiliary processes.

2024-05-09 Thread Xing Guo
On Thu, May 9, 2024 at 11:19 PM Tom Lane  wrote:
>
> Heikki Linnakangas  writes:
> > At first I was sure this was introduced by my refactorings in v17, but
> > in fact it's been like this forever. I agree that InitProcessing makes
> > much more sense. The ProcessingMode variable is initialized to
> > InitProcessing, so I think we can simply remove that line from
> > AuxiliaryProcessMainCommon(). There are existing
> > "SetProcessingMode(InitProcessing)" calls in other Main functions too
> > (AutoVacLauncherMain, BackgroundWorkerMain, etc.), and I think those can
> > also be removed.
>
> This only works if the postmaster can be trusted never to change the
> variable; else children could inherit some other value via fork().
> In that connection, it seems a bit scary that postmaster.c contains a
> couple of calls "SetProcessingMode(NormalProcessing)".  It looks like
> they are in functions that should only be executed by child processes,
> but should we try to move them somewhere else?

After checking calls to "SetProcessingMode(NormalProcessing)" in the
postmaster.c, they are used in background worker specific functions
(BackgroundWorkerInitializeConnectionByOid and
BackgroundWorkerInitializeConnection). So I think it's a good idea to
move these functions to bgworker.c. Then, we can get rid of calling
"SetProcessingMode(NormalProcessing)" in postmaster.c.

I also noticed that there's an unnecessary call to
"BackgroundWorkerInitializeConnection" in worker_spi.c (the worker_spi
launcher already sets the dboid correctly).

Best Regards,
Xing.
From 4dbe54ef746203740202e181f731b7e08bb6e4ea Mon Sep 17 00:00:00 2001
From: Xing Guo 
Date: Fri, 10 May 2024 10:34:09 +0800
Subject: [PATCH v4 1/3] Move bgworker specific logic to bgworker.c.

---
 src/backend/postmaster/bgworker.c   | 83 +
 src/backend/postmaster/postmaster.c | 83 -
 2 files changed, 83 insertions(+), 83 deletions(-)

diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index cf64a4beb2..0ab64ae3ec 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -1240,6 +1240,89 @@ TerminateBackgroundWorker(BackgroundWorkerHandle *handle)
 		SendPostmasterSignal(PMSIGNAL_BACKGROUND_WORKER_CHANGE);
 }
 
+/*
+ * Connect background worker to a database.
+ */
+void
+BackgroundWorkerInitializeConnection(const char *dbname, const char *username, uint32 flags)
+{
+	BackgroundWorker *worker = MyBgworkerEntry;
+	bits32		init_flags = 0; /* never honor session_preload_libraries */
+
+	/* ignore datallowconn? */
+	if (flags & BGWORKER_BYPASS_ALLOWCONN)
+		init_flags |= INIT_PG_OVERRIDE_ALLOW_CONNS;
+	/* ignore rolcanlogin? */
+	if (flags & BGWORKER_BYPASS_ROLELOGINCHECK)
+		init_flags |= INIT_PG_OVERRIDE_ROLE_LOGIN;
+
+	/* XXX is this the right errcode? */
+	if (!(worker->bgw_flags & BGWORKER_BACKEND_DATABASE_CONNECTION))
+		ereport(FATAL,
+(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+ errmsg("database connection requirement not indicated during registration")));
+
+	InitPostgres(dbname, InvalidOid,	/* database to connect to */
+ username, InvalidOid,	/* role to connect as */
+ init_flags,
+ NULL);			/* no out_dbname */
+
+	/* it had better not gotten out of "init" mode yet */
+	if (!IsInitProcessingMode())
+		ereport(ERROR,
+(errmsg("invalid processing mode in background worker")));
+	SetProcessingMode(NormalProcessing);
+}
+
+/*
+ * Connect background worker to a database using OIDs.
+ */
+void
+BackgroundWorkerInitializeConnectionByOid(Oid dboid, Oid useroid, uint32 flags)
+{
+	BackgroundWorker *worker = MyBgworkerEntry;
+	bits32		init_flags = 0; /* never honor session_preload_libraries */
+
+	/* ignore datallowconn? */
+	if (flags & BGWORKER_BYPASS_ALLOWCONN)
+		init_flags |= INIT_PG_OVERRIDE_ALLOW_CONNS;
+	/* ignore rolcanlogin? */
+	if (flags & BGWORKER_BYPASS_ROLELOGINCHECK)
+		init_flags |= INIT_PG_OVERRIDE_ROLE_LOGIN;
+
+	/* XXX is this the right errcode? */
+	if (!(worker->bgw_flags & BGWORKER_BACKEND_DATABASE_CONNECTION))
+		ereport(FATAL,
+(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+ errmsg("database connection requirement not indicated during registration")));
+
+	InitPostgres(NULL, dboid,	/* database to connect to */
+ NULL, useroid, /* role to connect as */
+ init_flags,
+ NULL);			/* no out_dbname */
+
+	/* it had better not gotten out of "init" mode yet */
+	if (!IsInitProcessingMode())
+		ereport(ERROR,
+(errmsg("invalid processing mode in background worker")));
+	SetProcessingMode(NormalProcessing);
+}
+
+/*
+ * Block/unblock signals in a background worker
+ */
+void
+BackgroundWorkerBlockSignals(void)
+{
+	sigprocmask(SIG_SETMASK, &BlockSig, NULL);
+}
+
+void
+BackgroundWorkerUnblockSignals(void)
+{
+	sigprocmask(SIG_SETMASK, &UnBlockSig, NULL);
+}
+
 /*
  * Look up (and possibly load) a bgworker entry point function.
  *
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/pos

Re:

2024-05-09 Thread zaidagilist
On Thu, May 9, 2024 at 2:50 PM Rajan Pandey  wrote:
>
> Hi everyone, I just installed Postgres and pg_tle extension as I was looking 
> to contribute to pg_tle.
>
> Somehow, I am unable to update the shared_preload_libraries. It feels like 
> ALTER has happened but the SPL value is not updated:
>>
>> test=# show shared_preload_libraries;
>>  shared_preload_libraries
>> --
>>
>> (1 row)
>>
>> test=# ALTER SYSTEM SET shared_preload_libraries TO 'pg_tle';
>> ALTER SYSTEM
>> test=# SELECT pg_reload_conf();
>>  pg_reload_conf
>> 
>>  t
>> (1 row)
>>
>> test=# show shared_preload_libraries;
>>  shared_preload_libraries
>> --
>>
>> (1 row)
>>
>> test=#
>
>
> I'm unable to open the postgresql.conf file to update it either. I provided 
> the correct macbook password above. But it is not accepted! :/
>>
>> rajanx@b0be835adb74 postgresql % cat 
>> /usr/local/pgsql/data/postgresql.auto.conf
>> cat: /usr/local/pgsql/data/postgresql.auto.conf: Permission denied
>>
>> rajanx@b0be835adb74 postgresql % su cat 
>> /usr/local/pgsql/data/postgresql.auto.conf
>> Password:
>> su: Sorry




It seems you're trying to use the su command to switch to another user
to read a file, but su is not the right tool here: the first argument
after su is interpreted as a login name, so "su cat ..." tries to
switch to a user named "cat", which is why it answers "Sorry".

If you have the necessary permissions to access the file, you can
simply use the cat command without su. If not (as the "Permission
denied" error suggests), run the command with elevated privileges via
sudo instead:



>
>
> Please help! Thank you. :)
> --
> Regards
> Rajan Pandey