Re: Potential memory leak in contrib/intarray's g_intbig_compress

2023-07-13 Thread Michael Paquier
On Thu, Jul 13, 2023 at 06:28:39PM +0200, Matthias van de Meent wrote:
> There are similar pfree calls in the _int_gist.c file's g_int_compress
> function, which made me think we do need to clean up after use, but
> indeed these pfrees are useless (or even harmful if bug #17888 can be
> trusted)

Indeed, all these are in a GiST temporary context.  So you'd mean
something like the attached perhaps, for both the decompress and
compress paths?
--
Michael
diff --git a/contrib/intarray/_int_gist.c b/contrib/intarray/_int_gist.c
index 98145fe370..9b71cb6c91 100644
--- a/contrib/intarray/_int_gist.c
+++ b/contrib/intarray/_int_gist.c
@@ -201,11 +201,7 @@ g_int_compress(PG_FUNCTION_ARGS)
 	r = DatumGetArrayTypeP(entry->key);
 	CHECKARRVALID(r);
 	if (ARRISEMPTY(r))
-	{
-		if (r != (ArrayType *) DatumGetPointer(entry->key))
-			pfree(r);
 		PG_RETURN_POINTER(entry);
-	}
 
 	if ((len = ARRNELEMS(r)) >= 2 * num_ranges)
 	{			/* compress */
@@ -346,8 +342,6 @@ g_int_decompress(PG_FUNCTION_ARGS)
 			if ((!i) || *(dr - 1) != j)
 				*dr++ = j;
 
-	if (in != (ArrayType *) DatumGetPointer(entry->key))
-		pfree(in);
 	retval = palloc(sizeof(GISTENTRY));
 	gistentryinit(*retval, PointerGetDatum(r),
   entry->rel, entry->page, entry->offset, false);
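The reasoning above — that per-allocation pfree() is pointless inside a context that gets reset wholesale — can be sketched with a toy arena (an assumption-laden stand-in, not PostgreSQL's actual MemoryContext implementation):

```c
#include <stdlib.h>

/*
 * Toy arena illustrating the GiST temporary-context behaviour:
 * every allocation is tracked, and arena_reset() frees them all,
 * so callers need not free individual chunks.
 */
typedef struct
{
	void   *chunks[64];
	int		nchunks;
} Arena;

static void *
arena_alloc(Arena *a, size_t size)
{
	void   *p;

	if (a->nchunks >= 64)
		return NULL;			/* toy limit; real contexts grow */
	p = malloc(size);
	a->chunks[a->nchunks++] = p;
	return p;
}

static void
arena_reset(Arena *a)
{
	/* release everything at once, like a MemoryContext reset */
	for (int i = 0; i < a->nchunks; i++)
		free(a->chunks[i]);
	a->nchunks = 0;
}
```

Under this model, an individual free inside a support function is at best redundant and at worst (as bug #17888 suggests) a double-free once the context is reset.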


signature.asc
Description: PGP signature


16beta2 SQL parser: different defaults on absent_on_null

2023-07-13 Thread Martin Butter
While adapting a Java implementation of the SQL parser, I noticed that
in the structures JsonArrayAgg, JsonArrayConstructor,
JsonArrayQueryConstructor and JsonObjectConstructor, the absent_on_null
field defaults to TRUE.

But in JsonObjectAgg, absent_on_null defaults to FALSE.
Is that intentional?

Regards,
Martin.





Re: add non-option reordering to in-tree getopt_long

2023-07-13 Thread Michael Paquier
On Thu, Jul 13, 2023 at 09:38:42PM -0700, Nathan Bossart wrote:
> I did notice this, but I had the opposite reaction.

Ahah, well ;)

> Take the following examples of client programs that accept one non-option:
> 
>   ~$ pg_resetwal a b c
>   pg_resetwal: error: too many command-line arguments (first is "b")
>   pg_resetwal: hint: Try "pg_resetwal --help" for more information.
> 
> Yet pg_ctl gives:
> 
>   ~$ pg_ctl start a b c
>   pg_ctl: too many command-line arguments (first is "start")
>   Try "pg_ctl --help" for more information.
> 
> In this example, isn't "a" the first extra non-option that should be
> reported?

Good point.  This is interpreting "first" as being the first option
that's invalid.  Here my first impression was that pg_ctl got that
right, where "first" refers to the first subcommand that would be
valid.  Objection withdrawn.
--
Michael


signature.asc
Description: PGP signature


Re: Autogenerate some wait events code and documentation

2023-07-13 Thread Michael Paquier
On Thu, Jul 13, 2023 at 10:26:54AM +0900, Michael Paquier wrote:
> I would like to apply 0001 and 0002 to improve the format if there are
> no objections.  0003 and 0004 are still here for discussion, as it
> does not seem like a consensus has been reached for that yet.  Getting
> more opinions would be a good next step for the last two patches, I
> assume.

I have looked again at 0001 and 0002 and applied them to get them out
of the way.  0003 and 0004 are rebased and attached.  I'll add them to
the CF for later consideration.  More opinions are welcome.
--
Michael
From e4c2ae8d0879c1839d1873522155162e49f50788 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Sun, 9 Jul 2023 13:24:54 +0900
Subject: [PATCH v3 3/4] Rename wait events with more consistent camelcase
 style

This will help in the introduction of more simplifications with the
generation of wait event data using wait_event_names.txt.  The event
names use the same characters case-insensitively, so applying lower()
or upper() in monitoring queries still detects the same events as
before this change.
---
 src/backend/libpq/pqmq.c  |  2 +-
 src/backend/storage/ipc/shm_mq.c  |  6 +-
 .../utils/activity/wait_event_names.txt   | 90 +--
 3 files changed, 49 insertions(+), 49 deletions(-)

diff --git a/src/backend/libpq/pqmq.c b/src/backend/libpq/pqmq.c
index 39e77ef945..87b8570a42 100644
--- a/src/backend/libpq/pqmq.c
+++ b/src/backend/libpq/pqmq.c
@@ -182,7 +182,7 @@ mq_putmessage(char msgtype, const char *s, size_t len)
 			break;
 
 		(void) WaitLatch(MyLatch, WL_LATCH_SET | WL_EXIT_ON_PM_DEATH, 0,
-		 WAIT_EVENT_MQ_PUT_MESSAGE);
+		 WAIT_EVENT_MESSAGE_QUEUE_PUT_MESSAGE);
 		ResetLatch(MyLatch);
 		CHECK_FOR_INTERRUPTS();
 	}
diff --git a/src/backend/storage/ipc/shm_mq.c b/src/backend/storage/ipc/shm_mq.c
index d134a09a19..06d6b73527 100644
--- a/src/backend/storage/ipc/shm_mq.c
+++ b/src/backend/storage/ipc/shm_mq.c
@@ -1017,7 +1017,7 @@ shm_mq_send_bytes(shm_mq_handle *mqh, Size nbytes, const void *data,
 			 * cheaper than setting one that has been reset.
 			 */
 			(void) WaitLatch(MyLatch, WL_LATCH_SET | WL_EXIT_ON_PM_DEATH, 0,
-			 WAIT_EVENT_MQ_SEND);
+			 WAIT_EVENT_MESSAGE_QUEUE_SEND);
 
 			/* Reset the latch so we don't spin. */
 			ResetLatch(MyLatch);
@@ -1163,7 +1163,7 @@ shm_mq_receive_bytes(shm_mq_handle *mqh, Size bytes_needed, bool nowait,
 		 * setting one that has been reset.
 		 */
 		(void) WaitLatch(MyLatch, WL_LATCH_SET | WL_EXIT_ON_PM_DEATH, 0,
-		 WAIT_EVENT_MQ_RECEIVE);
+		 WAIT_EVENT_MESSAGE_QUEUE_RECEIVE);
 
 		/* Reset the latch so we don't spin. */
 		ResetLatch(MyLatch);
@@ -1252,7 +1252,7 @@ shm_mq_wait_internal(shm_mq *mq, PGPROC **ptr, BackgroundWorkerHandle *handle)
 
 		/* Wait to be signaled. */
 		(void) WaitLatch(MyLatch, WL_LATCH_SET | WL_EXIT_ON_PM_DEATH, 0,
-		 WAIT_EVENT_MQ_INTERNAL);
+		 WAIT_EVENT_MESSAGE_QUEUE_INTERNAL);
 
 		/* Reset the latch so we don't spin. */
 		ResetLatch(MyLatch);
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index 3fabad96d9..c4967e274e 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -45,15 +45,15 @@
 Section: ClassName - WaitEventActivity
 
 WAIT_EVENT_ARCHIVER_MAIN	ArchiverMain	"Waiting in main loop of archiver process."
-WAIT_EVENT_AUTOVACUUM_MAIN	AutoVacuumMain	"Waiting in main loop of autovacuum launcher process."
-WAIT_EVENT_BGWRITER_HIBERNATE	BgWriterHibernate	"Waiting in background writer process, hibernating."
-WAIT_EVENT_BGWRITER_MAIN	BgWriterMain	"Waiting in main loop of background writer process."
+WAIT_EVENT_AUTOVACUUM_MAIN	AutovacuumMain	"Waiting in main loop of autovacuum launcher process."
+WAIT_EVENT_BGWRITER_HIBERNATE	BgwriterHibernate	"Waiting in background writer process, hibernating."
+WAIT_EVENT_BGWRITER_MAIN	BgwriterMain	"Waiting in main loop of background writer process."
 WAIT_EVENT_CHECKPOINTER_MAIN	CheckpointerMain	"Waiting in main loop of checkpointer process."
 WAIT_EVENT_LOGICAL_APPLY_MAIN	LogicalApplyMain	"Waiting in main loop of logical replication apply process."
 WAIT_EVENT_LOGICAL_LAUNCHER_MAIN	LogicalLauncherMain	"Waiting in main loop of logical replication launcher process."
 WAIT_EVENT_LOGICAL_PARALLEL_APPLY_MAIN	LogicalParallelApplyMain	"Waiting in main loop of logical replication parallel apply process."
 WAIT_EVENT_RECOVERY_WAL_STREAM	RecoveryWalStream	"Waiting in main loop of startup process for WAL to arrive, during streaming recovery."
-WAIT_EVENT_SYSLOGGER_MAIN	SysLoggerMain	"Waiting in main loop of syslogger process."
+WAIT_EVENT_SYSLOGGER_MAIN	SysloggerMain	"Waiting in main loop of syslogger process."
 WAIT_EVENT_WAL_RECEIVER_MAIN	WalReceiverMain	"Waiting in main loop of WAL receiver process."
 WAIT_EVENT_WAL_SENDER_MAIN	WalSenderMain	"Waiting in main loop of WAL sender process."
 

Re: add non-option reordering to in-tree getopt_long

2023-07-13 Thread Nathan Bossart
On Fri, Jul 14, 2023 at 01:27:26PM +0900, Michael Paquier wrote:
> Indeed, it looks like I've fat-fingered a rebase here.  I am able to
> get a clean CI run when running this patch, sorry for the noise.
> 
> Anyway, this introduces a surprising behavior when specifying too many
> subcommands.  On HEAD:
> $ pg_ctl stop -D $PGDATA kill -t 20 start
> pg_ctl: too many command-line arguments (first is "stop")
> Try "pg_ctl --help" for more information.
> $ pg_ctl stop -D $PGDATA -t 20 start
> pg_ctl: too many command-line arguments (first is "stop")
> Try "pg_ctl --help" for more information.
> 
> With the patch:
> $ pg_ctl stop -D $PGDATA -t 20 start
> pg_ctl: too many command-line arguments (first is "start")
> Try "pg_ctl --help" for more information.
> $ pg_ctl stop -D $PGDATA kill -t 20 start
> pg_ctl: too many command-line arguments (first is "kill")
> Try "pg_ctl --help" for more information.
> 
> So the error message reported is incorrect now, referring to an
> incorrect first subcommand.

I did notice this, but I had the opposite reaction.  Take the following
examples of client programs that accept one non-option:

~$ pg_resetwal a b c
pg_resetwal: error: too many command-line arguments (first is "b")
pg_resetwal: hint: Try "pg_resetwal --help" for more information.

~$ createuser a b c
createuser: error: too many command-line arguments (first is "b")
createuser: hint: Try "createuser --help" for more information.

~$ pgbench a b c
pgbench: error: too many command-line arguments (first is "b")
pgbench: hint: Try "pgbench --help" for more information.

~$ pg_restore a b c
pg_restore: error: too many command-line arguments (first is "b")
pg_restore: hint: Try "pg_restore --help" for more information.

Yet pg_ctl gives:

~$ pg_ctl start a b c
pg_ctl: too many command-line arguments (first is "start")
Try "pg_ctl --help" for more information.

In this example, isn't "a" the first extra non-option that should be
reported?
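The convention being argued for can be sketched in C (a simplified stand-in for the check PG client programs run after getopt_long(); the function name, parameters, and example values are illustrative assumptions, not the actual pg_resetwal code):

```c
#include <stddef.h>

/*
 * After getopt_long() has permuted argv so that options come first,
 * optind points at the first non-option.  A program expecting
 * `expected` non-option arguments should report the argument just
 * past those as the first surplus one -- "b" in `pg_resetwal a b c`,
 * not "a".
 */
static const char *
first_surplus_arg(int argc, char **argv, int optind_after, int expected)
{
	int		first_extra = optind_after + expected;

	if (argc > first_extra)
		return argv[first_extra];
	return NULL;				/* argument count is acceptable */
}
```

By this rule, `pg_ctl start a b c` should report "a", the first argument beyond the one subcommand it consumes.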

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: add non-option reordering to in-tree getopt_long

2023-07-13 Thread Michael Paquier
On Thu, Jul 13, 2023 at 07:57:12AM -0700, Nathan Bossart wrote:
> Assuming you are referring to [0], it looks like you are missing 411b720.
> 
> [0] https://github.com/michaelpq/postgres/commits/getopt_test

Indeed, it looks like I've fat-fingered a rebase here.  I am able to
get a clean CI run when running this patch, sorry for the noise.

Anyway, this introduces a surprising behavior when specifying too many
subcommands.  On HEAD:
$ pg_ctl stop -D $PGDATA kill -t 20 start
pg_ctl: too many command-line arguments (first is "stop")
Try "pg_ctl --help" for more information.
$ pg_ctl stop -D $PGDATA -t 20 start
pg_ctl: too many command-line arguments (first is "stop")
Try "pg_ctl --help" for more information.

With the patch:
$ pg_ctl stop -D $PGDATA -t 20 start
pg_ctl: too many command-line arguments (first is "start")
Try "pg_ctl --help" for more information.
$ pg_ctl stop -D $PGDATA kill -t 20 start
pg_ctl: too many command-line arguments (first is "kill")
Try "pg_ctl --help" for more information.

So the error message reported is incorrect now, referring to an
incorrect first subcommand.
--
Michael


signature.asc
Description: PGP signature


Re: Preventing non-superusers from altering session authorization

2023-07-13 Thread Nathan Bossart
On Wed, Jul 12, 2023 at 09:37:57PM -0700, Nathan Bossart wrote:
> On Mon, Jul 10, 2023 at 01:49:55PM -0700, Nathan Bossart wrote:
>> Great.  I'm going to wait a few more days in case anyone has additional
>> feedback, but otherwise I intend to commit this shortly.
> 
> I've committed 0001 for now.  I'm hoping to commit the other two patches
> within the next couple of days.

Committed.  I dwelled on whether to proceed with this change because it
doesn't completely solve the originally-stated problem; i.e., a role that
has changed its session authorization before losing superuser can still
take advantage of the privileges of the target role, which might include
reacquiring superuser.  However, I think SET ROLE is subject to basically
the same problem, and I'd argue that this change is strictly an
improvement, if for no other reason than it makes SET SESSION AUTHORIZATION
more consistent with SET ROLE.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: In Postgres 16 BETA, should the ParseNamespaceItem have the same index as it's RangeTableEntry?

2023-07-13 Thread Amit Langote
On Fri, Jul 14, 2023 at 7:12 AM Tom Lane  wrote:
> Farias de Oliveira  writes:
> > With further inspection in AGE's code, after executing the SET query,
> > it goes inside transform_cypher_clause_as_subquery() function and the
> > ParseNamespaceItem has the following values:
>
> >  {p_names = 0x1205638, p_rte = 0x11edb70, p_rtindex = 1, p_perminfo =
> > 0x7f7f7f7f7f7f7f7f,
> >   p_nscolumns = 0x1205848, p_rel_visible = true, p_cols_visible =
> > true, p_lateral_only = false,
> >   p_lateral_ok = true}
>
> Hmm, that uninitialized value for p_perminfo is pretty suspicious.
> I see that transformFromClauseItem and buildNSItemFromLists both
> create ParseNamespaceItems without bothering to fill p_perminfo,
> while buildNSItemFromTupleDesc fills it per the caller and
> addRangeTableEntryForJoin always sets it to NULL.  I think we
> ought to make the first two set it to NULL as well, because
> uninitialized fields are invariably a bad idea (even though the
> lack of valgrind complaints says that the core code is managing
> to avoid touching those fields).

Agreed, I'll go ahead and fix that.

> If we do that, is it sufficient to resolve your problem?

Hmm, I'm afraid maybe not, because if the above were the root issue,
we'd have seen a segfault and not the error the OP mentioned?  I'm
thinking the issue is that their code appears to be consing up an RTE
that the core code (getRTEPermissionInfo() most likely via
markRTEForSelectPriv()) is not expecting to be called with?  It would
be helpful to see a backtrace when the error occurs to be sure.

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com




Re: 'ERROR: attempted to update invisible tuple' from 'ALTER INDEX ... ATTACH PARTITION' on parent index

2023-07-13 Thread Michael Paquier
On Thu, Jul 13, 2023 at 02:01:49PM +0530, Shruthi Gowda wrote:
> While analyzing the issue I did notice that validatePartitionedIndex() is
> the only place where the index tuple was copied from rel->rd_indextuple,
> however I was not clear about the motive behind it.

No idea either.  Anyway, I've split this stuff into two parts and
applied it across the whole set of stable branches.  Thanks!
--
Michael


signature.asc
Description: PGP signature


Re: Add hint message for check_log_destination()

2023-07-13 Thread Japin Li

On Thu, 13 Jul 2023 at 16:19, Masahiko Sawada  wrote:
> On Tue, Jul 11, 2023 at 10:24 AM Japin Li  wrote:
>>
>>
>> On Mon, 10 Jul 2023 at 14:23, Masahiko Sawada  wrote:
>> > On Mon, Jul 10, 2023 at 2:07 PM Kyotaro Horiguchi
>> >  wrote:
>> >>
>> >> At Mon, 10 Jul 2023 09:04:42 +0800, Japin Li  wrote 
>> >> in
>> >> >
>> >> > On Sat, 08 Jul 2023 at 12:48, Michael Paquier  
>> >> > wrote:
>> >> > > On Fri, Jul 07, 2023 at 07:23:47PM +0800, Japin Li wrote:
>> >> > >> +  appendStringInfoString(, "\"stderr\"");
>> >> > >> +#ifdef HAVE_SYSLOG
>> >> > >> +  appendStringInfoString(, ", \"syslog\"");
>> >> > >> +#endif
>> >> > >> +#ifdef WIN32
>> >> > >> +  appendStringInfoString(, ", 
>> >> > >> \"eventlog\"");
>> >> > >> +#endif
>> >> > >> +  appendStringInfoString(, ", \"csvlog\", 
>> >> > >> and \"jsonlog\"");
>> >> > >
>> >> > > Hmm.  Is that OK as a translatable string?
>> >
>> > It seems okay to me but needs to be checked.
>> >
>> >> >
>> >> >
>> >> > Sorry for the late reply!  I'm not sure.  How can I know whether it is 
>> >> > translatable?
>> >>
>> >> At the very least, we can't generate comma-separated lists
>> >> programatically because punctuation marks vary across languages.
>> >>
>> >> One potential approach could involve defining the message for every
>> >> potential combination, in full length.
>> >
>> > Don't we generate a comma-separated list for an error hint of an enum
>> > parameter? For example, to generate the following error hint:
>> >
>> > =# alter system set client_min_messages = 'aaa';
>> > ERROR:  invalid value for parameter "client_min_messages": "aaa"
>> > HINT:  Available values: debug5, debug4, debug3, debug2, debug1, log,
>> > notice, warning, error.
>> >
>> > we use the comma-separated generated by config_enum_get_options() and
>> > do ereport() like:
>> >
>> >   ereport(elevel,
>> >   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>> >errmsg("invalid value for parameter \"%s\": \"%s\"",
>> >   name, value),
>> >hintmsg ? errhint("%s", _(hintmsg)) : 0));
>>
>> > IMO log_destination is a string GUC parameter but its value is the
>> > list of enums. So it makes sense to me to add a hint message like what
>> > we do for enum parameters in case where the user mistypes a wrong
>> > value. I'm not sure why the proposed patch needs to quote the usable
>> > values, though.
>>
>> I borrowed the description from pg_settings extra_desc.  In my first patch,
>> I used the hint message line enum parameter, however, since it might be a
>> combination of multiple log destinations, so, I update the patch using
>> extra_desc.
>
> I agree to use description from pg_settings extra_desc, but it seems
> to be better not to quote each available value like we do for enum
> parameter cases. That is, the hint message would be like:
>
> =# alter system set log_destination to 'xxx';
> ERROR:  invalid value for parameter "log_destination": "xxx"
> DETAIL:  Unrecognized key word: "xxx".
> HINT:  Valid values are combinations of stderr, syslog, csvlog, and jsonlog.
>

Agreed.  Fixed as per your suggestions.

-- 
Regards,
Japin Li

>From 3f709effe14de81b84bbc441aa5884ece0a757da Mon Sep 17 00:00:00 2001
From: Japin Li 
Date: Fri, 14 Jul 2023 09:26:01 +0800
Subject: [PATCH v4 1/1] Add hint message for check_log_destination

---
 src/backend/utils/error/elog.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 5898100acb..dccbabf40a 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -2228,8 +2228,22 @@ check_log_destination(char **newval, void **extra, GucSource source)
 #endif
 		else
 		{
+			StringInfoData errhint;
+
+			initStringInfo(&errhint);
+			appendStringInfoString(&errhint, "stderr");
+#ifdef HAVE_SYSLOG
+			appendStringInfoString(&errhint, ", syslog");
+#endif
+#ifdef WIN32
+			appendStringInfoString(&errhint, ", eventlog");
+#endif
+			appendStringInfoString(&errhint, ", csvlog, and jsonlog");
+
 			GUC_check_errdetail("Unrecognized key word: \"%s\".", tok);
+			GUC_check_errhint("Valid values are combinations of %s.", errhint.data);
 			pfree(rawstring);
+			pfree(errhint.data);
 			list_free(elemlist);
 			return false;
 		}
-- 
2.41.0



Re: CommandStatus from insert returning when using a portal.

2023-07-13 Thread Dave Cramer
On Thu, 13 Jul 2023 at 10:24, David G. Johnston 
wrote:

> On Thursday, July 13, 2023, Dave Cramer  wrote:
>
>>
>> Any comment on why the CommandComplete is incorrect ?
>> It returns INSERT 0 0  if a cursor is used
>>
>
>  Looking at DECLARE it is surprising that what you describe is even
> possible.  Can you share a psql reproducer?
>

apologies, we are using a portal, not a cursor.
Dave Cramer

>


Re: WAL Insertion Lock Improvements

2023-07-13 Thread Andres Freund
Hi,

On 2023-07-13 14:04:31 -0700, Andres Freund wrote:
> From b74b6e953cb5a7e7ea1a89719893f6ce9e231bba Mon Sep 17 00:00:00 2001
> From: Bharath Rupireddy 
> Date: Fri, 19 May 2023 15:00:21 +
> Subject: [PATCH v8] Optimize WAL insertion lock acquisition and release
>
> This commit optimizes WAL insertion lock acquisition and release
> in the following way:

I think this commit does too many things at once.


> 1. WAL insertion lock's variable insertingAt is currently read and
> written with the help of lwlock's wait list lock to avoid
> torn-free reads/writes. This wait list lock can become a point of
> contention on a highly concurrent write workloads. Therefore, make
> insertingAt a 64-bit atomic which inherently provides torn-free
> reads/writes.

"inherently" is a bit strong, given that we emulate 64bit atomics where not
available...


> 2. LWLockUpdateVar currently acquires lwlock's wait list lock even when
> there are no waiters at all. Add a fastpath exit to LWLockUpdateVar when
> there are no waiters to avoid unnecessary locking.

I don't think there's enough of an explanation for why this isn't racy.

The reason it's, I think, safe, is that anyone using LWLockConflictsWithVar()
will do so twice in a row, with a barrier in between. But that really relies on
what I explain in the paragraph below:



> It also adds notes on why LWLockConflictsWithVar doesn't need a
> memory barrier as far as its current usage is concerned.

I don't think:
 * NB: LWLockConflictsWithVar (which is called from
 * LWLockWaitForVar) relies on the spinlock used above in this
 * function and doesn't use a memory barrier.

helps to understand why any of this is safe to a meaningful degree.


The existing comments obviously aren't sufficient to explain this, but
the reason it's, I think, safe today, is that we are only waiting for
insertions that started before WaitXLogInsertionsToFinish() was called.  The
lack of memory barriers in the loop means that we might see locks as "unused"
that have *since* become used, which is fine, because they only can be for
later insertions that we wouldn't want to wait on anyway.

Not taking a lock to acquire the current insertingAt value means that we might
see an older insertingAt value. Which should also be fine, because if we read a
too old value, we'll add ourselves to the queue, which contains atomic
operations.
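The distinction being drawn here can be illustrated with a C11 sketch (standard `_Atomic`, not PostgreSQL's pg_atomic_* API; the struct and function names are illustrative): an atomic 64-bit load can never return a torn value, and staleness is tolerated because a too-old insertingAt only makes the waiter enqueue itself, and the queue operations carry their own barriers.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Simplified stand-in for the WAL insertion lock's insertingAt. */
typedef struct
{
	_Atomic uint64_t insertingAt;
} InsertionLockSketch;

static uint64_t
read_inserting_at(InsertionLockSketch *lock)
{
	/* relaxed load: no ordering guarantee, but never a torn value */
	return atomic_load_explicit(&lock->insertingAt, memory_order_relaxed);
}

static void
update_inserting_at(InsertionLockSketch *lock, uint64_t pos)
{
	/* relaxed store: readers may briefly see the old position */
	atomic_store_explicit(&lock->insertingAt, pos, memory_order_relaxed);
}
```

The point of the critique stands in this model too: atomics remove the need for a lock around the read itself, but any required ordering between the read and other state still needs explicit barriers.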



> diff --git a/src/backend/storage/lmgr/lwlock.c 
> b/src/backend/storage/lmgr/lwlock.c
> index 59347ab951..82266e6897 100644
> --- a/src/backend/storage/lmgr/lwlock.c
> +++ b/src/backend/storage/lmgr/lwlock.c
> @@ -1547,9 +1547,8 @@ LWLockAcquireOrWait(LWLock *lock, LWLockMode mode)
>   * *result is set to true if the lock was free, and false otherwise.
>   */
>  static bool
> -LWLockConflictsWithVar(LWLock *lock,
> -uint64 *valptr, uint64 oldval, 
> uint64 *newval,
> -bool *result)
> +LWLockConflictsWithVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 oldval,
> +uint64 *newval, bool *result)
>  {
>   boolmustwait;
>   uint64  value;
> @@ -1572,13 +1571,11 @@ LWLockConflictsWithVar(LWLock *lock,
>   *result = false;
>
>   /*
> -  * Read value using the lwlock's wait list lock, as we can't generally
> -  * rely on atomic 64 bit reads/stores.  TODO: On platforms with a way to
> -  * do atomic 64 bit reads/writes the spinlock should be optimized away.
> +  * Reading the value atomically ensures that we don't need any explicit
> +  * locking. Note that in general, 64 bit atomic APIs in postgres 
> inherently
> +  * provide explicit locking for the platforms without atomics support.
>*/

This comment seems off to me. Using atomics doesn't guarantee not needing
locking. It just guarantees that we are reading a non-torn value.



> @@ -1605,9 +1602,14 @@ LWLockConflictsWithVar(LWLock *lock,
>   *
>   * Note: this function ignores shared lock holders; if the lock is held
>   * in shared mode, returns 'true'.
> + *
> + * Be careful that LWLockConflictsWithVar() does not include a memory 
> barrier,
> + * hence the caller of this function may want to rely on an explicit barrier 
> or
> + * a spinlock to avoid memory ordering issues.
>   */

s/careful/aware/?

s/spinlock/implied barrier via spinlock or lwlock/?


Greetings,

Andres




Re: Fix search_path for all maintenance commands

2023-07-13 Thread Jeff Davis
On Thu, 2023-07-13 at 13:37 -0700, David G. Johnston wrote:
> If this is limited to MAINT, which I'm in support of, there is no
> need for an "escape hatch".  A prerequisite for leveraging the new
> feature is that you fix the code so it conforms to the new way of
> doing things.

The current patch is not limited to exercise of the MAINTAIN privilege.

> Tom's opinion was a general dislike for differing behavior in
> different situations.  I dislike it too, on purist grounds, but would
> rather do this than not make any forward progress because we made a
> poor decision in the past.

I believe the opinion you're referring to is here:

https://www.postgresql.org/message-id/1659699.1688086...@sss.pgh.pa.us

Which was a reaction to another version of my patch which implemented
your idea to limit the changes to the MAINTAIN privilege.

I agree with you that we should be practical here. The end goal is to
move users away from using functions that both (a) implicitly depend on
the search_path; and (b) are called implicitly as a side-effect of
accessing a table. Whatever is the fastest and smoothest way to get
there is fine with me.

> And I'm against simply breaking the past without any recourse as what
> we did for pg_dump/pg_restore still bothers me.

Is a GUC the solution here? Gurjeet called it heavy-handed, and I see
the point that carrying such a GUC forever would be unpleasant. But if
it reduces the risk of breakage (or at least offers an escape hatch)
then it may be wise, and hopefully we can remove it later.

Regards,
Jeff Davis





Re: In Postgres 16 BETA, should the ParseNamespaceItem have the same index as it's RangeTableEntry?

2023-07-13 Thread Tom Lane
Farias de Oliveira  writes:
> With further inspection in AGE's code, after executing the SET query,
> it goes inside transform_cypher_clause_as_subquery() function and the
> ParseNamespaceItem has the following values:

>  {p_names = 0x1205638, p_rte = 0x11edb70, p_rtindex = 1, p_perminfo =
> 0x7f7f7f7f7f7f7f7f,
>   p_nscolumns = 0x1205848, p_rel_visible = true, p_cols_visible =
> true, p_lateral_only = false,
>   p_lateral_ok = true}

Hmm, that uninitialized value for p_perminfo is pretty suspicious.
I see that transformFromClauseItem and buildNSItemFromLists both
create ParseNamespaceItems without bothering to fill p_perminfo,
while buildNSItemFromTupleDesc fills it per the caller and
addRangeTableEntryForJoin always sets it to NULL.  I think we
ought to make the first two set it to NULL as well, because
uninitialized fields are invariably a bad idea (even though the
lack of valgrind complaints says that the core code is managing
to avoid touching those fields).

If we do that, is it sufficient to resolve your problem?

regards, tom lane
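The fix being proposed can be sketched with a toy struct (an assumption-laden model, not the real ParseNamespaceItem or palloc0): zero-initializing the allocation up front guarantees pointer members like p_perminfo come out NULL instead of whatever fill pattern a debugging allocator left behind.

```c
#include <stdlib.h>

/* Toy stand-in for ParseNamespaceItem. */
typedef struct
{
	void   *p_perminfo;
	int		p_rtindex;
} NSItemSketch;

static NSItemSketch *
make_nsitem(int rtindex)
{
	/* calloc zeroes the struct, analogous to palloc0 */
	NSItemSketch *n = calloc(1, sizeof(NSItemSketch));

	n->p_rtindex = rtindex;
	/* p_perminfo is already NULL thanks to the zeroing */
	return n;
}
```

Building the struct field-by-field, as transformFromClauseItem and buildNSItemFromLists do, makes it easy to forget one member; zeroing (or explicitly assigning NULL) removes that class of bug.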




Re: Allowing parallel-safe initplans

2023-07-13 Thread Tom Lane
I wrote:
> Eventually I noticed that all the failing cases were instances of
> optimizing MIN()/MAX() aggregates into indexscans, and then I figured
> out what the problem is: we substitute Params for the optimized-away
> Aggref nodes in setrefs.c, *after* SS_finalize_plan has been run.
> That means we fail to account for those Params in extParam/allParam
> sets.  We've gotten away with that up to now because such Params
> could only appear where Aggrefs can appear, which is only in top-level
> (above scans and joins) nodes, which generally don't have any of the
> sorts of rescan optimizations that extParam/allParam bits control.
> But this patch results in needing to have a correct extParam set for
> the node just below Gather, and we don't.  I am not sure whether there
> are any reachable bugs without this patch; but there might be, or some
> future optimization might introduce one.

> It seems like the cleanest fix for this is to replace such optimized
> Aggrefs in a separate tree scan before running SS_finalize_plan.
> That's fairly annoying from a planner-runtime standpoint, although
> we could skip the extra pass in the typical case where no minmax aggs
> have been optimized.
> I also thought about swapping the order of operations so that we
> run SS_finalize_plan after setrefs.c.  That falls down because of
> set_param_references itself, which requires those bits to be
> calculated already.  But maybe we could integrate that computation
> into SS_finalize_plan instead?  There's certainly nothing very
> pretty about the way it's done now.

I tried both of those and concluded they'd be too messy for a patch
that we might find ourselves having to back-patch.  So 0001 attached
fixes it by teaching SS_finalize_plan to treat optimized MIN()/MAX()
aggregates as if they were already Params.  It's slightly annoying
to have knowledge of that optimization metastasizing into another
place, but the alternatives are even less palatable.

Having done that, if you adjust 0002 to inject Gathers even when
debug_parallel_query = regress, the only diffs in the core regression
tests are that some initPlans disappear from EXPLAIN output.  The
outputs of the actual queries are still correct, demonstrating that
e89a71fb4 does indeed make it work as long as the param bitmapsets
are correct.

I'm still resistant to the idea of kluging EXPLAIN to the extent
of hiding the EXPLAIN output changes.  It wouldn't be that hard
to do really, but I worry that such a kluge might hide real problems
in future.  So what I did in 0002 was to allow initPlans for an
injected Gather only if debug_parallel_query = on, so that there
will be a place for EXPLAIN to show them.  Other than the changes
in that area, 0002 is the same as the previous patch.

regards, tom lane
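The bookkeeping at issue can be illustrated with a toy model (an assumption-laden sketch: PlanSketch and a plain uint64 bitmask stand in for PostgreSQL's Plan nodes and Bitmapsets): computing allParam bottom-up only gives correct sets if every Param substitution has already happened when the walk runs, which is exactly what the late setrefs.c substitution violated.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct PlanSketch
{
	uint64_t	own_params;		/* param IDs this node's expressions use */
	struct PlanSketch *left;
	struct PlanSketch *right;
	uint64_t	all_param;		/* filled in by finalize_params() */
} PlanSketch;

/* Bottom-up pass: a node's allParam is its own params plus its
 * children's.  A Param substituted in after this pass runs is
 * simply missing from every set. */
static uint64_t
finalize_params(struct PlanSketch *node)
{
	if (node == NULL)
		return 0;
	node->all_param = node->own_params
		| finalize_params(node->left)
		| finalize_params(node->right);
	return node->all_param;
}
```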

From c0ae4618308f89de57b4a7d744cf2f734555d4a2 Mon Sep 17 00:00:00 2001
From: Tom Lane 
Date: Thu, 13 Jul 2023 16:50:13 -0400
Subject: [PATCH v2 1/2] Account for optimized MinMax aggregates during
 SS_finalize_plan.

We are capable of optimizing MIN() and MAX() aggregates on indexed
columns into subqueries that exploit the index, rather than the normal
thing of scanning the whole table.  When we do this, we replace the
Aggref node(s) with Params referencing subquery outputs.  Such Params
really ought to be included in the per-plan-node extParam/allParam
sets computed by SS_finalize_plan.  However, we've never done so
up to now because of an ancient implementation choice to perform
that substitution during set_plan_references, which runs after
SS_finalize_plan, so that SS_finalize_plan never sees these Params.

This seems like clearly a bug, yet there have been no field reports
of problems that could trace to it.  This may be because the types
of Plan nodes that could contain Aggrefs do not have any of the
rescan optimizations that are controlled by extParam/allParam.
Nonetheless it seems certain to bite us someday, so let's fix it
in a self-contained patch that can be back-patched if we find a
case in which there's a live bug pre-v17.

The cleanest fix would be to perform a separate tree walk to do
these substitutions before SS_finalize_plan runs.  That seems
unattractive, first because a whole-tree mutation pass is expensive,
and second because we lack infrastructure for visiting expression
subtrees in a Plan tree, so that we'd need a new function knowing
as much as SS_finalize_plan knows about that.  I also considered
swapping the order of SS_finalize_plan and set_plan_references,
but that fell foul of various assumptions that seem tricky to fix.
So the approach adopted here is to teach SS_finalize_plan itself
to check for such Aggrefs.  I refactored things a bit in setrefs.c
to avoid having three copies of the code that does that.

Given the lack of any currently-known bug, no test case here.

Discussion: https://postgr.es/m/2391880.1689025...@sss.pgh.pa.us
---
 src/backend/optimizer/plan/setrefs.c   | 68 --
 

Re: psql: Add role's membership options to the \du+ command

2023-07-13 Thread Pavel Luzanov

On 13.07.2023 11:26, Pavel Luzanov wrote:
And for versions <16 I forgot to simplify the expression for the
'Options' column after removing the LEFT JOIN on pam:


SELECT m.rolname AS "Role name", r.rolname AS "Member of",
  pg_catalog.concat_ws(', ',
    CASE WHEN pam.admin_option THEN 'ADMIN' END,
    CASE WHEN pam.roleid IS NOT NULL AND m.rolinherit THEN 'INHERIT' END,
    CASE WHEN pam.roleid IS NOT NULL THEN 'SET' END
  ) AS "Options",
  g.rolname AS "Grantor"
FROM pg_catalog.pg_roles m
 JOIN pg_catalog.pg_auth_members pam ON (pam.member = m.oid)
 LEFT JOIN pg_catalog.pg_roles r ON (pam.roleid = r.oid)
 LEFT JOIN pg_catalog.pg_roles g ON (pam.grantor = g.oid)
WHERE m.rolname !~ '^pg_'
ORDER BY 1, 2, 4;

I plan to replace it with:

  pg_catalog.concat_ws(', ',
    CASE WHEN pam.admin_option THEN 'ADMIN' END,
    CASE WHEN m.rolinherit THEN 'INHERIT' END,
    'SET'
  ) AS "Options",



The new version contains only this change.

--
Pavel Luzanov
From d25743f87462ef89f9c098f0b907ac1fff473a99 Mon Sep 17 00:00:00 2001
From: Pavel Luzanov 
Date: Fri, 14 Jul 2023 00:08:03 +0300
Subject: [PATCH v10] psql: Add \drg command to show role memberships

The new command shows the options assigned for each membership:
ADMIN, INHERIT (added by e3ce2de0) or SET (3d14e171), and
who the grantor is. This is important for knowing
how privileges can be used and who can revoke membership.
Without this command you need to query pg_auth_members directly.

This patch also removes the "Member of" column from the \du & \dg
commands, as it does not provide enough information about membership.
---
 doc/src/sgml/ref/psql-ref.sgml | 21 
 src/bin/psql/command.c |  2 +
 src/bin/psql/describe.c| 87 +-
 src/bin/psql/describe.h|  3 ++
 src/bin/psql/help.c|  1 +
 src/bin/psql/tab-complete.c|  6 ++-
 src/test/regress/expected/psql.out | 46 ++--
 src/test/regress/sql/psql.sql  | 26 +
 8 files changed, 174 insertions(+), 18 deletions(-)

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 35aec6d3ce..e09fd06105 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1883,6 +1883,27 @@ INSERT INTO tbl1 VALUES ($1, $2) \bind 'first value' 'second value' \g
 
   
 
+
+  
+\drg[S] [ pattern ]
+
+
+Lists detailed information about each role membership, including
+assigned options (ADMIN,
+INHERIT or SET) and grantor.
+See the GRANT
+command for their meaning.
+
+
+By default, only user-created roles are shown; supply the
+S modifier to include system roles.
+If pattern is specified,
+only those roles whose names match the pattern are listed.
+
+
+  
+
+
   
 \drds [ role-pattern [ database-pattern ] ]
 
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 511debbe81..88a653a709 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -918,6 +918,8 @@ exec_command_d(PsqlScanState scan_state, bool active_branch, const char *cmd)
 
 	free(pattern2);
 }
+else if (strncmp(cmd, "drg", 3) == 0)
+	success = describeRoleGrants(pattern, show_system);
 else
 	status = PSQL_CMD_UNKNOWN;
 break;
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 9325a46b8f..722ba95420 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -3617,7 +3617,7 @@ describeRoles(const char *pattern, bool verbose, bool showSystem)
 	PGresult   *res;
 	printTableContent cont;
 	printTableOpt myopt = pset.popt.topt;
-	int			ncols = 3;
+	int			ncols = 2;
 	int			nrows = 0;
 	int			i;
 	int			conns;
@@ -3631,11 +3631,7 @@ describeRoles(const char *pattern, bool verbose, bool showSystem)
 	printfPQExpBuffer(&buf,
 	  "SELECT r.rolname, r.rolsuper, r.rolinherit,\n"
 	  "  r.rolcreaterole, r.rolcreatedb, r.rolcanlogin,\n"
-	  "  r.rolconnlimit, r.rolvaliduntil,\n"
-	  "  ARRAY(SELECT b.rolname\n"
-	  "FROM pg_catalog.pg_auth_members m\n"
-	  "JOIN pg_catalog.pg_roles b ON (m.roleid = b.oid)\n"
-	  "WHERE m.member = r.oid) as memberof");
+	  "  r.rolconnlimit, r.rolvaliduntil\n");
 
 	if (verbose)
 	{
@@ -3675,8 +3671,6 @@ describeRoles(const char *pattern, bool verbose, bool showSystem)
 
 	printTableAddHeader(&cont, gettext_noop("Role name"), true, align);
 	printTableAddHeader(&cont, gettext_noop("Attributes"), true, align);
-	/* ignores implicit memberships from superuser & pg_database_owner */
-	printTableAddHeader(&cont, gettext_noop("Member of"), true, align);
 
 	if (verbose)
 		printTableAddHeader(&cont, gettext_noop("Description"), true, align);
@@ -3701,11 +3695,11 @@ describeRoles(const char *pattern, bool verbose, bool showSystem)
 		if (strcmp(PQgetvalue(res, i, 5), "t") != 0)
 			add_role_attribute(&buf, _("Cannot login"));
 
-		

Re: Fix search_path for all maintenance commands

2023-07-13 Thread David G. Johnston
On Thu, Jul 13, 2023 at 2:00 PM Gurjeet Singh  wrote:

> On Thu, Jul 13, 2023 at 1:37 PM David G. Johnston
>  wrote:
> >
> >  I'm against simply breaking the past without any recourse as what we
> did for pg_dump/pg_restore still bothers me.
>
> I'm sure this is tangential, but can you please provide some
> context/links to the change you're referring to here.
>
>
Here is the instigating issue and a discussion thread on the aftermath:

https://wiki.postgresql.org/wiki/A_Guide_to_CVE-2018-1058%3A_Protect_Your_Search_Path

https://www.postgresql.org/message-id/flat/13033.1531517020%40sss.pgh.pa.us#2aa2e25816d899d62f168926e3ff17b1

David J.


Re: WAL Insertion Lock Improvements

2023-07-13 Thread Andres Freund
On 2023-07-11 09:20:45 +0900, Michael Paquier wrote:
> On Mon, Jun 05, 2023 at 08:00:00AM +0530, Bharath Rupireddy wrote:
> > On Wed, May 31, 2023 at 5:05 PM Michael Paquier  wrote:
> >> @Andres: Were there any extra tests you wanted to be run for more
> >> input?
> > 
> > @Andres Freund please let us know your thoughts.
> 
> Err, ping.  It seems like this thread is waiting on input from you,
> Andres?

Looking. Sorry for not getting to this earlier.

- Andres




Re: Fix search_path for all maintenance commands

2023-07-13 Thread Gurjeet Singh
On Thu, Jul 13, 2023 at 1:37 PM David G. Johnston
 wrote:
>
>  I'm against simply breaking the past without any recourse as what we did for 
> pg_dump/pg_restore still bothers me.

I'm sure this is tangential, but can you please provide some
context/links to the change you're referring to here.

Best regards,
Gurjeet
http://Gurje.et




Re: DROP DATABASE is interruptible

2023-07-13 Thread Andres Freund
Hi,

On 2023-07-12 11:54:18 +0200, Daniel Gustafsson wrote:
> > On 12 Jul 2023, at 03:59, Andres Freund  wrote:
> > On 2023-07-07 14:09:08 +0200, Daniel Gustafsson wrote:
> >>> On 25 Jun 2023, at 19:03, Andres Freund  wrote:
> >>> On 2023-06-21 12:02:04 -0700, Andres Freund wrote:
> 
> >>> There don't need to be explict checks, because pg_upgrade will fail, 
> >>> because
> >>> it connects to every database. Obviously the error could be nicer, but it
> >>> seems ok for something hopefully very rare. I did add a test ensuring 
> >>> that the
> >>> behaviour is caught.
> >> 
> >> I don't see any pg_upgrade test in the patch?
> > 
> > Oops, I stashed them alongside some unrelated changes... Included this time.
> 
> Looking more at this I wonder if we in HEAD should make this a bit nicer by
> extending the --check phase to catch this?  I did a quick hack along these
> lines in the 0003 commit attached here (0001 and 0002 are your unchanged
> patches, just added for consistency and to be CFBot compatible).  If done it
> could be a separate commit to make the 0002 patch backport cleaner of course.

I don't really have an opinion on that, tbh...

> >> +  errhint("Use DROP DATABASE to drop invalid databases"));
> >> Should end with a period as a complete sentence?
> > 
> > I get confused about this every time. It's not helped by this example in
> > sources.sgml:
> > 
> > 
> > Primary:could not create shared memory segment: %m
> > Detail: Failed syscall was shmget(key=%d, size=%u, 0%o).
> > Hint:   the addendum
> > 
> > 
> > Which notably does not use punctuation for the hint. But indeed, later we 
> > say:
> >   
> >Detail and hint messages: Use complete sentences, and end each with
> >a period.  Capitalize the first word of sentences.  Put two spaces after
> >the period if another sentence follows (for English text; might be
> >inappropriate in other languages).
> >   
> 
> That's not a very helpful example, and one which may give the wrong impression
> unless the entire page is read.  I've raised this with a small diff to improve
> it on -docs.

Thanks for doing that!


> > Updated patches attached.
> 
> This version of the patchset LGTM.

Backpatching indeed was no fun. Not having BackgroundPsql.pm was the worst
part. But also a lot of other conflicts in tests... Took me 5-6 hours or
so.
But I now finally pushed the fixes. Hope the buildfarm agrees with it...

Thanks for the review!

Greetings,

Andres Freund




Re: vac_truncate_clog()'s bogus check leads to bogusness

2023-07-13 Thread Andres Freund
On 2023-06-22 22:29:12 -0700, Noah Misch wrote:
> On Thu, Jun 22, 2023 at 09:45:18AM -0700, Andres Freund wrote:
> > On 2023-06-21 21:50:39 -0700, Noah Misch wrote:
> > > On Wed, Jun 21, 2023 at 03:12:08PM -0700, Andres Freund wrote:
> > > > When vac_truncate_clog() returns early
> > > ...
> > > > we haven't released the lwlock that we acquired earlier
> > > 
> > > > Until there's some cause for the session to call LWLockReleaseAll(), 
> > > > the lock
> > > > is held. Until then neither the process holding the lock, nor any other
> > > > process, can finish vacuuming.  We don't even have an assert against a
> > > > self-deadlock with an already held lock, oddly enough.
> > > 
> > > I agree with this finding.  Would you like to add the lwlock releases, or
> > > would you like me to?
> > 
> > Happy with either.  I do have code and testcase, so I guess it would make
> > sense for me to do it?
> 
> Sounds good.  Thanks.

Done.




Re: Fix search_path for all maintenance commands

2023-07-13 Thread David G. Johnston
On Thu, Jul 13, 2023 at 12:54 PM Gurjeet Singh  wrote:

>
> The approach seems good to me. My concern is with this change's
> potential to cause an extended database outage. Hence sending it out
> as part of v16, without any escape hatch for the DBA, is my objection.
>
>
If this is limited to MAINT, which I'm in support of, there is no need for
an "escape hatch".  A prerequisite for leveraging the new feature is that
you fix the code so it conforms to the new way of doing things.

Tom's opinion was a general dislike for differing behavior in different
situations.  I dislike it too, on purist grounds, but would rather do this
than not make any forward progress because we made a poor decision in the
past. And I'm against simply breaking the past without any recourse as what
we did for pg_dump/pg_restore still bothers me.

David J.


Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication

2023-07-13 Thread Melih Mutlu
Hi Peter,

Peter Smith  wrote on Tue, 11 Jul 2023 at 05:59:
> Even if patches 0003 and 0002 are to be combined, I think that should
> not happen until after the "reuse" design is confirmed which way is
> best.
>
> e.g. IMO it might be easier to compare the different PoC designs for
> patch 0002 if there is no extra logic involved.
>
> PoC design#1 -- each tablesync decides for itself what to do next
> after it finishes
> PoC design#2 -- reuse tablesync using a "pool" of available workers

Right. I made a patch 0003 to change 0002 so that tables will be assigned
to sync workers by the apply worker.
It's a rough POC and ignores some edge cases, but this is how I think the
apply worker would take responsibility for table assignments. I hope the
implementation makes sense and I'm not missing anything that may cause
degraded performance.

PoC design#1 --> apply only patch 0001 and 0002
PoC design#2 --> apply all patches, 0001, 0002 and 0003

Here are some quick numbers with 100 empty tables.

+--------------+----------------+----------------+----------------+
|              | 2 sync workers | 4 sync workers | 8 sync workers |
+--------------+----------------+----------------+----------------+
| POC design#1 | 1909.873 ms    | 986.261 ms     | 552.404 ms     |
+--------------+----------------+----------------+----------------+
| POC design#2 | 4962.208 ms    | 1240.503 ms    | 1165.405 ms    |
+--------------+----------------+----------------+----------------+
| master       | 2666.008 ms    | 1462.012 ms    | 986.848 ms     |
+--------------+----------------+----------------+----------------+

Seems like design#1 is better than both design#2 and master overall. It's
surprising to see that even master beats design#2 in some cases though. Not
sure if that is expected or there are some places to improve design#2 even
more.

What do you think?

PS: I only attached the related patches and not the whole patch set. 0001
and 0002 may contain some of your earlier reviews, but I'll send a proper
updated set soon.

Thanks,
-- 
Melih Mutlu
Microsoft


v18-0002-Reuse-Tablesync-Workers.patch
Description: Binary data


v18-0001-Refactor-to-split-Apply-and-Tablesync-Workers.patch
Description: Binary data


v18-0003-apply-worker-assigns-tables.patch
Description: Binary data


Re: Fix search_path for all maintenance commands

2023-07-13 Thread Gurjeet Singh
On Thu, Jul 13, 2023 at 11:56 AM Jeff Davis  wrote:
>
> The current approach seemed to get support from Noah, Nathan, and Greg
> Stark.
>
> Concerns were raised by Gurjeet, Tom, and Robert in the 16 cycle; but

I didn't see Tom's or Robert's concerns raised in this thread. I see
now that for some reason there are two threads with slightly different
subjects. I'll catch up on that as well.

The other thread's subject: "pgsql: Fix search_path to a safe value
during maintenance operations"

> I'm not sure whether those objections were specific to the 16 cycle or
> whether they are objections to the approach entirely. Comments?

The approach seems good to me. My concern is with this change's
potential to cause an extended database outage. Hence sending it out
as part of v16, without any escape hatch for the DBA, is my objection.

If it were some commercial database, we would have simply introduced a
hidden event, or a GUC, defaulting to off. But a GUC for this feels too
heavy-handed.

Perhaps we can provide an escape hatch as follows (warning, pseudocode ahead).

> if (first_element(search_path) != "dont_override_search_path_for_maintenance")
>  SetConfigOption("search_path", GUC_SAFE_SEARCH_PATH, ...);

So, if someone desperately wants to just plow through the maintenance
command, and is not ready or able to fix their dependence on their
search_path, the community can show them this escape hatch.

Best regards,
Gurjeet
http://Gurje.et




Re: MERGE ... RETURNING

2023-07-13 Thread Jeff Davis
On Thu, 2023-07-13 at 18:01 +0100, Dean Rasheed wrote:
> For some use cases, I can imagine allowing OLD/NEW.colname would mean
> you wouldn't need pg_merge_action() (if the column was NOT NULL), so
> I
> think the features should work well together.

For use cases where a user could do it either way, which would you
expect to be the "typical" way (assuming we supported the new/old)?

  MERGE ... RETURNING pg_merge_action(), id, val;

or

  MERGE ... RETURNING id, OLD.val, NEW.val;

?

I am still bothered that pg_merge_action() is so context-sensitive.
"SELECT pg_merge_action()" by itself doesn't make any sense, but it's
allowed in the v8 patch. We could make that a runtime error, which
would be better, but it feels like it's structurally wrong. This is not
an objection, but it's just making me think harder about alternatives.

Maybe instead of a function it could be a special table reference like:

  MERGE ... RETURNING MERGE.action, MERGE.action_number, id, val?

Regards,
Jeff Davis





In Postgres 16 BETA, should the ParseNamespaceItem have the same index as it's RangeTableEntry?

2023-07-13 Thread Farias de Oliveira
Hello, my name is Matheus Farias and this is the first time that I'm
sending an email to the pgsql-hackers list. I'm a software developer intern
at Bitnine Global Inc. and, along with other interns, we've been working on
updating Apache AGE with the latest version of Postgres, the REL_16_BETA
version. One of the main problems that we are facing is that the code was
reworked to update the permission checking and now some of the queries
return "ERROR: invalid perminfoindex <perminfoindex> in RTE with relid
<relid>". This occurs due to one of the RTEs having perminfoindex = 0
while the relid contains a value.

AGE is a Postgres extension which allows us to execute openCypher commands
to create a graph with nodes and edges. There are two main tables that are
created: _ag_label_vertex and _ag_label_edge. Both of them will be the
parent label tables of every other vertex/edge label we create.

When we do a simple MATCH query to find all nodes with the v label:

SELECT * FROM cypher('cypher_set', $$ MATCH (n:v) RETURN n $$) AS (node agtype);


inside the add_rtes_to_flat_rtable() function, it goes inside a loop
where we can see the stored RTEs in root->parse->rtable:

// I've simplified what every RTE shows.

root->parse->rtable
[
(rtekind = RTE_SUBQUERY, relid = 0, perminfoindex = 0),
(rtekind = RTE_SUBQUERY, relid = 0, perminfoindex = 0),
(rtekind = RTE_SUBQUERY, relid = 0, perminfoindex = 0),
(rtekind = RTE_RELATION, relid = 16991, perminfoindex = 1)
]

But executing the query with a simple SET clause:

SELECT * FROM cypher('cypher_set', $$ MATCH (n) SET n.i = 3 $$) AS (a agtype);

One of the RTEs of the RTE_RELATION type and relid with a not null
value has perminfoindex = 0

root->parse->rtable
[
(rtekind = RTE_SUBQUERY, relid = 0, perminfoindex = 0),
(rtekind = RTE_RELATION, relid = 16971, perminfoindex = 1),
(rtekind = RTE_RELATION, relid = 16971, perminfoindex = 1),
(rtekind = RTE_RELATION, relid = 16991, perminfoindex = 0)
]

the relid = 16991 is related to the child vertex label and the relid =
16971 related to the parent vertex label:

SELECT to_regclass('cypher_set._ag_label_vertex')::oid;
 to_regclass
-------------
       16971

SELECT to_regclass('cypher_set.v')::oid;
 to_regclass
-------------
       16991

With further inspection in AGE's code, after executing the SET query,
it goes inside transform_cypher_clause_as_subquery() function and the
ParseNamespaceItem has the following values:

 {p_names = 0x1205638, p_rte = 0x11edb70, p_rtindex = 1, p_perminfo =
0x7f7f7f7f7f7f7f7f,
  p_nscolumns = 0x1205848, p_rel_visible = true, p_cols_visible =
true, p_lateral_only = false,
  p_lateral_ok = true}

And the pnsi->p_rte has:

{type = T_RangeTblEntry, rtekind = RTE_SUBQUERY, relid = 0, relkind =
0 '\000', rellockmode = 0,
  tablesample = 0x0, perminfoindex = 0, subquery = 0x11ed710,
security_barrier = false,
  jointype = JOIN_INNER, joinmergedcols = 0, joinaliasvars = 0x0,
joinleftcols = 0x0, joinrightcols = 0x0,
  join_using_alias = 0x0, functions = 0x0, funcordinality = false,
tablefunc = 0x0, values_lists = 0x0,
  ctename = 0x0, ctelevelsup = 0, self_reference = false, coltypes =
0x0, coltypmods = 0x0,
  colcollations = 0x0, enrname = 0x0, enrtuples = 0, alias =
0x12055f0, eref = 0x1205638, lateral = false,
  inh = false, inFromCl = true, securityQuals = 0x0}

Then it calls addNSItemToQuery(pstate, pnsi, true, false, true);. This
function adds the given nsitem/RTE as a top-level entry in the pstate's
join list and/or namespace list. I've been wondering whether adding the
nsitem/RTE like this could be what causes this error.

Also in handle_prev_clause there is the following line, which is going to
add all the RTE's attributes to the current query's targetlist; again,
I'm not sure whether that's what is causing the problem, because the
relid of the RTE is 0:

query->targetList = expandNSItemAttrs(pstate, pnsi, 0, true, -1);

If someone knows more about it, I would be grateful for any kind of
answer or help. AGE's source code can be found here:
https://github.com/apache/age


Re: Fix search_path for all maintenance commands

2023-07-13 Thread Jeff Davis
On Thu, 2023-07-06 at 18:39 -0700, Jeff Davis wrote:

> * The fix might not go far enough or might be in the wrong place. I'm
> open to suggestion here, too. Maybe we can make it part of the
> general
> function call mechanism, and can be overridden by explicitly setting
> the function search path? Or maybe we need new syntax where the
> function can acquire the search path from the session explicitly, but
> uses a safe search path by default?

I'm inclined to proceed with the current approach here, which is to
just fix search_path for maintenance commands. Other approaches may be
possible, but:

 (a) We already special-case the way functions are executed for
maintenance commands in other ways -- we run the functions as the table
owner (even SECURITY INVOKER functions) and run them as a
SECURITY_RESTRICTED_OPERATION. Setting the search_path to a safe value
seems like a natural extension of that; and

 (b) The lack of a safe search path is blocking other useful features,
such as the MAINTAIN privilege; and

 (c) I don't see other proposals, aside from a few ideas I put forward
here[1], which didn't get any responses.

The current approach seemed to get support from Noah, Nathan, and Greg
Stark.

Concerns were raised by Gurjeet, Tom, and Robert in the 16 cycle; but
I'm not sure whether those objections were specific to the 16 cycle or
whether they are objections to the approach entirely. Comments?

Regards,
Jeff Davis


[1]
https://www.postgresql.org/message-id/6781cc79580c464a63fc0a1343637ed2b2b0cf09.camel%40j-davis.com




Re: MERGE ... RETURNING

2023-07-13 Thread Dean Rasheed
On Thu, 13 Jul 2023 at 17:43, Vik Fearing  wrote:
>
> There is also the WITH ORDINALITY and FOR ORDINALITY examples.
>

True. I just think "number" is a friendlier, more familiar word than "ordinal".

> So perhaps pg_merge_when_clause_number() would
> > be a better name. It's still quite long, but it's the best I can think
> > of.
>
> How about pg_merge_match_number() or pg_merge_ordinality()?

I think "match_number" is problematic, because it might be a "matched"
or a "not matched" action. "when_clause" is the term used on the MERGE
doc page.

Regards,
Dean




Re: MERGE ... RETURNING

2023-07-13 Thread Dean Rasheed
On Thu, 13 Jul 2023 at 17:01, Jeff Davis  wrote:
>
> MERGE can end up combining old and new values in a way that doesn't
> happen with INSERT/UPDATE/DELETE. For instance, a "MERGE ... RETURNING
> id" would return a mix of NEW.id (for INSERT/UPDATE actions) and OLD.id
> (for DELETE actions).
>

Right, but allowing OLD/NEW.colname in the RETURNING list would remove
that complication, and it shouldn't change how a bare colname
reference behaves.

> The pg_merge_action() can differentiate the old and new values, but
> it's a bit more awkward.
>

For some use cases, I can imagine allowing OLD/NEW.colname would mean
you wouldn't need pg_merge_action() (if the column was NOT NULL), so I
think the features should work well together.

Regards,
Dean




Re: remaining sql/json patches

2023-07-13 Thread Alvaro Herrera
I looked at your 0001.  My 0001 contains some trivial comment cleanups
on top of it.

I scrolled through all of jsonfuncs.c to see if there was a better place
for the new function than the end of the file.  Man, is that one ugly
file.  There are almost no comments!  I almost wish you would create a
new file so that you don't have to put this new function in such bad
company.  But maybe it'll improve someday, so ... whatever.

In the original code, the functions here being (re)moved do not need to
return a type output function in a few cases.  This works okay when the
functions are each contained in a single file (because each function
knows that the respective datum_to_json/datum_to_jsonb user of the
returned values won't need the function OID in those other cases); but
as an exported function, that strange API doesn't seem great.  (It only
works for 0002 because the only thing that the executor does with these
cached values is call datum_to_json/b).  That seems easy to solve, since
we can return the hardcoded output function OID in those cases anyway.
A possible complaint about this is that the OID so returned would be
untested code, so they might be wrong and we'd never know.  However,
ISTM it's better to make a promise about always returning a function OID
and later fixing any bogus function OID if we ever discover that we
return one, rather than having to document in the function's comment
that "we only return function OIDs in such and such cases".  So I made
this change into my 0002.

A similar complaint can be made about which casts we look for.  Right
now, only an explicit cast to JSON is useful, so that's the only thing
we do.  But maybe one day a cast to JSONB would become useful if there's
no cast to JSON for some datatype (in the is_jsonb case only?); and
maybe another type of cast would be useful.  However, that seems like
going too much into uncharted territory with no useful use case, so
let's just not go there for now.  Maybe in the future we can improve
this aspect of it, if need arises.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
>From 44504a8fe96ad76a375c77e5f53e4f897e3ca2f2 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Thu, 13 Jul 2023 18:06:07 +0200
Subject: [PATCH 1/2] trivial fixups

---
 src/backend/utils/adt/jsonfuncs.c | 5 ++---
 src/include/utils/jsonfuncs.h | 8 ++++----
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c
index 612bbf06a3..764d48505b 100644
--- a/src/backend/utils/adt/jsonfuncs.c
+++ b/src/backend/utils/adt/jsonfuncs.c
@@ -5691,9 +5691,8 @@ json_get_first_token(text *json, bool throw_error)
  * Determine how we want to print values of a given type in datum_to_json(b).
  *
  * Given the datatype OID, return its JsonTypeCategory, as well as the type's
- * output function OID.  If the returned category is JSONTYPE_CAST or
- * JSOBTYPE_CASTJSON, we return the OID of the type->JSON cast function
- * instead.
+ * output function OID.  If the returned category is JSONTYPE_CAST, we return
+ * the OID of the type->JSON cast function instead.
  */
 void
 json_categorize_type(Oid typoid, bool is_jsonb,
diff --git a/src/include/utils/jsonfuncs.h b/src/include/utils/jsonfuncs.h
index 27c2d20610..27a25ba283 100644
--- a/src/include/utils/jsonfuncs.h
+++ b/src/include/utils/jsonfuncs.h
@@ -63,7 +63,7 @@ extern Jsonb *transform_jsonb_string_values(Jsonb *jsonb, void *action_state,
 extern text *transform_json_string_values(text *json, void *action_state,
 										  JsonTransformStringValuesAction transform_action);
 
-/* Type categories for datum_to_json[b] and friends. */
+/* Type categories returned by json_categorize_type */
 typedef enum
 {
 	JSONTYPE_NULL,				/* null, so we didn't bother to identify */
@@ -72,8 +72,8 @@ typedef enum
 	JSONTYPE_DATE,				/* we use special formatting for datetimes */
 	JSONTYPE_TIMESTAMP,
 	JSONTYPE_TIMESTAMPTZ,
-	JSONTYPE_JSON,				/* JSON and JSONB */
-	JSONTYPE_JSONB,				/* JSONB (for datum_to_jsonb) */
+	JSONTYPE_JSON,				/* JSON (and JSONB, if not is_jsonb) */
+	JSONTYPE_JSONB,				/* JSONB (if is_jsonb) */
 	JSONTYPE_ARRAY,				/* array */
 	JSONTYPE_COMPOSITE,			/* composite */
 	JSONTYPE_CAST,				/* something with an explicit cast to JSON */
@@ -81,6 +81,6 @@ typedef enum
 } JsonTypeCategory;
 
 void		json_categorize_type(Oid typoid, bool is_jsonb,
-								 JsonTypeCategory *tcategory, Oid *outfuncoid);
+							 JsonTypeCategory *tcategory, Oid *outfuncoid);
 
 #endif
-- 
2.30.2

>From 

Re: MERGE ... RETURNING

2023-07-13 Thread Vik Fearing

On 7/13/23 17:38, Dean Rasheed wrote:

On Tue, 11 Jul 2023 at 21:43, Gurjeet Singh  wrote:



I think the name of function pg_merge_when_clause() can be improved.
How about pg_merge_when_clause_ordinal().


That's a bit of a mouthful, but I don't have a better idea right now.
I've left the names alone for now, in case something better occurs to
anyone.


+1. How do we make sure we don't forget that it needs to be named
better. Perhaps a TODO item within the patch will help.



Thinking about that some more, I think the word "number" is more
familiar to most people than "ordinal". There's the row_number()
function, for example. 



There is also the WITH ORDINALITY and FOR ORDINALITY examples.


So perhaps pg_merge_when_clause_number() would

be a better name. It's still quite long, but it's the best I can think
of.



How about pg_merge_match_number() or pg_merge_ordinality()?
--
Vik Fearing





Re: MERGE ... RETURNING

2023-07-13 Thread Gurjeet Singh
On Thu, Jul 13, 2023 at 8:38 AM Dean Rasheed  wrote:
>
> On Tue, 11 Jul 2023 at 21:43, Gurjeet Singh  wrote:

> >  { oid => '9499', descr => 'command type of current MERGE action',
> > -  proname => 'pg_merge_action',  provolatile => 'v',
> > +  proname => 'pg_merge_action',  provolatile => 'v', proparallel => 'r',
> > 
> >  { oid => '9500', descr => 'index of current MERGE WHEN clause',
> > -  proname => 'pg_merge_when_clause',  provolatile => 'v',
> > +  proname => 'pg_merge_when_clause',  provolatile => 'v', proparallel => 
> > 'r',
> > 
> >
> > I see that you've now set proparallel of these functions to 'r'. I'd
> > just like to understand how you got to that conclusion.
> >
>
> Now that these functions can appear in subqueries in the RETURNING
> list, there exists the theoretical possibility that the subquery might
> use a parallel plan (actually that can't happen today, for any query
> that modifies data, but maybe someday it may become a possibility),
> and it's possible to use these functions in a SELECT query inside a
> PL/pgSQL function called from the RETURNING list, which might consider
> a parallel plan. Since these functions rely on access to executor
> state that isn't copied to parallel workers, they must be run on the
> leader, hence I think PARALLEL RESTRICTED is the right level to use. A
> similar example is pg_trigger_depth().

Thanks for the explanation. That helps.

Best regards,
Gurjeet
http://Gurje.et




Re: MERGE ... RETURNING

2023-07-13 Thread Jeff Davis
On Mon, 2023-01-09 at 12:29 +, Dean Rasheed wrote:
> > Would it be useful to have just the action? Perhaps "WITH ACTION"?
> > My idea is that this would return an enum of INSERT, UPDATE, DELETE
> > (so is "action" the right word?). It seems to me in many situations
> > I would be more likely to care about which of these 3 happened
> > rather than the exact clause that applied. This isn't necessarily
> > meant to be instead of your suggestion because I can imagine
> > wanting to know the exact clause, just an alternative that might
> > suffice in many situations. Using it would also avoid problems
> > arising from editing the query in a way which changes the numbers
> > of the clauses.
> > 
> 
> Hmm, perhaps that's something that can be added as well. Both use
> cases seem useful.

Can you expand a bit on the use cases for identifying individual WHEN
clauses? I see that it offers a new capability beyond just the action
type, but I'm having trouble thinking of real use cases.

Regards,
Jeff Davis





Re: psql: Add role's membership options to the \du+ command

2023-07-13 Thread Tom Lane
Pavel Luzanov  writes:
> On 13.07.2023 18:01, Tom Lane wrote:
>> That does not seem right.  Is it impossible for pam.set_option
>> to be false?  Even if it is, should this code assume that?

> For versions before 16, including one role to another automatically
> gives possibility to issue SET ROLE.

Right, -ENOCAFFEINE.

> IMO, the only question is whether it is correct to show IMPLICIT and
> SET options in versions where they are not actually present
> in pg_auth_members, but can be determined.

Hmm, that's definitely a judgment call.  You could argue that it's
useful and consistent, but also that it's confusing to somebody
who's not familiar with the new terminology.  On balance I'd lean
to showing them, but I won't fight hard for that viewpoint.

regards, tom lane




Re: Potential memory leak in contrib/intarray's g_intbig_compress

2023-07-13 Thread Matthias van de Meent
On Thu, 13 Jul 2023 at 17:20, Tom Lane  wrote:
>
> Matthias van de Meent  writes:
> > My colleague Konstantin (cc-ed) noticed that the GiST code of intarray
> > may leak memory in certain index operations:
>
> Can you demonstrate an actual problem here, that is a query-lifespan leak?
>
> IMO, the coding rule in the GiST and GIN AMs is that the AM code is
> responsible for running all opclass-supplied functions in suitably
> short-lived memory contexts, so that leaks within those functions
> don't cause problems.  This is different from btree which requires
> comparison functions to not leak.  The rationale for having different
> conventions is that btree comparison functions are typically simple
> enough to be able to deal with such a restriction, whereas GiST
> and GIN opclasses are often complex critters for which it'd be too
> bug-prone to insist on leakproofness.  So it seems worth the cost
> to make the AM itself set up a throwaway memory context.
>
> (I don't recall offhand which rule the other AMs use.
> I'm also not sure where or if this choice is documented.)
>
> In the case at hand, I think the pfree is useless and was installed
> by somebody who had mis-extrapolated from btree rules.  So what
> I'm thinking would be sufficient is to drop it altogether:
>
> -   if (in != DatumGetArrayTypeP(entry->key))
> -   pfree(in);

Looks like it's indeed a useless pfree call here - all paths that I
could find that lead to GiST's compress procedure are encapsulated in
a temporary context that is reset fairly quickly after use (at most
this memory context would live for the duration of the recursive
splitting of pages up the tree, but I haven't verified this
hypothesis).

There are similar pfree calls in the _int_gist.c file's g_int_compress
function, which made me think we do need to clean up after use, but
indeed these pfrees are useless (or even harmful if bug #17888 can be
trusted)

Kind regards,

Matthias van de Meent.




Re: Potential memory leak in contrib/intarray's g_intbig_compress

2023-07-13 Thread Tom Lane
I wrote:
> Matthias van de Meent  writes:
>> My colleague Konstantin (cc-ed) noticed that the GiST code of intarray
>> may leak memory in certain index operations:

> Can you demonstrate an actual problem here, that is a query-lifespan leak?

> IMO, the coding rule in the GiST and GIN AMs is that the AM code is
> responsible for running all opclass-supplied functions in suitably
> short-lived memory contexts, so that leaks within those functions
> don't cause problems.

I tried adding "MemoryContextStats(CurrentMemoryContext);" at the top
of g_intbig_compress() and running the intarray regression tests
(which do reach the pfree in question).  This confirmed that the
compress function is always called in the "GiST temporary context"
made by createTempGistContext.  Also, the amount of memory reported as
consumed didn't seem to vary when I removed the pfree, which indicates
that we do manage to reset that context often enough that leakage here
doesn't matter.  It's hard to make an exact comparison because of
GiST's habit of randomizing page-split decisions, so that the sequence
of calls to the compress function isn't identical from one run to the
next.  But at least in the cases exercised by the regression tests,
we do not need that pfree --- and if you believe the comment for
createTempGistContext, it would be a GiST bug not an intarray bug
if we did.

I'll go remove the pfree.  Perhaps there is a TODO item here to
improve the documentation about these memory management conventions.

regards, tom lane




Re: logical decoding and replication of sequences, take 2

2023-07-13 Thread Tomas Vondra



On 7/13/23 16:24, Ashutosh Bapat wrote:
> Thanks for the updated patches. I haven't looked at the patches yet
> but have some responses below.
> 
> On Thu, Jul 13, 2023 at 12:35 AM Tomas Vondra
>  wrote:
> 
>>
>>
>> 3) simplify the replicated state
>>
>> As suggested by Ashutosh, it may not be a good idea to replicate the
>> (last_value, log_cnt, is_called) tuple, as that's pretty tightly tied to
>> our internal implementation. Which may not be the right thing for other
>> plugins. So this new patch replicates just "value" which is pretty much
>> (last_value + log_cnt), representing the next value that should be safe
>> to generate on the subscriber (in case of a failover).
>>
> 
> Thanks. That will help.
> 
> 
>> 5) minor tweaks in the built-in replication
>>
>> This adopts the relaxed LOCK code to allow locking sequences during the
>> initial sync, and also adopts the replication of a single value (this
>> affects the "apply" side of that change too).
>>
> 
> I think the problem we are trying to solve with LOCK is not actually
> getting solved. See [2]. Instead your earlier idea of using page LSN
> looks better.
> 

Thanks. I think you may be right, and the interlock may not be
necessary. I've responded to the linked threads, that's probably easier
to follow as it keeps the context.

>>
>> 6) simplified protocol versioning
> 
> I had tested the cross-version logical replication with older set of
> patches. Didn't see any unexpected behaviour then. I will test again.
>>

I think the question is what's the expected behavior. What behavior did
you expect/observe?

IIRC with the previous version of the patch, if you connected an old
subscriber (without sequence replication), it just ignored/skipped the
sequence increments and replicated the other changes.

The new patch detects that, and triggers ERROR on the publisher. And I
think that's the correct thing to do.

There was a lengthy discussion about making this more flexible (by not
tying this to "linear" protocol version) and/or permissive. I tried
doing that by doing similar thing to decoding of 2PC, which allows
choosing when creating a subscription.

But ultimately that just chooses where to throw an error - whether on
the publisher (in the output plugin callback) or on apply side (when
trying to apply change to non-existent sequence).

I still think it might be useful to have these "capabilities" orthogonal
to the protocol version, but it's a matter for a separate patch. It's
enough not to fail with "unknown message" on the subscriber.

>> The one thing I'm not really sure about is how it interferes with the
>> replication of DDL. But in principle, if it decodes DDL for ALTER
>> SEQUENCE, I don't see why it would be a problem that we then decode and
>> replicate the WAL for the sequence state. But if it is a problem, we
>> should be able to skip this WAL record with the initial sequence state
>> (which I think should be possible thanks to the "created" flag this
>> patch adds to the WAL record).
> 
> I had suggested a solution in [1] to avoid adding a flag to the WAL
> record. Did you consider it? If you considered it and rejected, I
> would be interested in knowing reasons behind rejecting it. Let me
> repeat here again:
> 
> ```
> We can add a
> decoding routine for RM_SMGR_ID. The decoding routine will add relfilelocator
> in XLOG_SMGR_CREATE record to txn->sequences hash. Rest of the logic will work
> as is. Of course we will add non-sequence relfilelocators as well but that
> should be fine. Creating a new relfilelocator shouldn't be a frequent
> operation. If at all we are worried about that, we can add only the
> relfilenodes associated with sequences to the hash table.
> ```
> 

Thanks for reminding me. In principle I'm not against using the proposed
approach - tracking all relfilenodes created by a transaction, although
I don't think the new flag in xl_seq_rec is a problem, and it's probably
cheaper than having to decode all relfilenode creations.

> If the DDL replication takes care of replicating and applying sequence
> changes, I think we don't need the changes tracking "transactional"
> sequence changes in this patch-set. That also makes a case for not
> adding a new field to WAL which may not be used.
> 

Maybe, but the DDL replication patch is not there yet, and I'm not sure
it's a good idea to make this patch wait for a much larger/complex
patch. If the DDL replication patch gets committed, it may ditch this
part (assuming it happens in the same development cycle).

However, my impression was DDL replication would be optional. In which
case we still need to handle the transactional case, to support sequence
replication without DDL replication enabled.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: sslinfo extension - add notbefore and notafter timestamps

2023-07-13 Thread Daniel Gustafsson
I had another look at this today and I think this patch is in pretty good
shape; below are a few comments on this revision:

-  'sslinfo--1.2.sql',
+  'sslinfo--1.2--1.3.sql',
+  'sslinfo--1.3.sql',

The way we typically ship extensions in contrib/ is to not create a new base
version .sql file for smaller changes like adding a few functions.  For this
patch we should keep --1.2.sql and instead supply a 1.2--1.3.sql with the new
functions.
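
A minimal sketch of that convention (the function names follow this thread,
but the exact signatures and return types are assumptions, not the committed
API):

```sql
-- sslinfo--1.2--1.3.sql: upgrade script shipping only the new functions.
CREATE FUNCTION ssl_client_get_notbefore() RETURNS timestamptz
AS 'MODULE_PATHNAME', 'ssl_client_get_notbefore'
LANGUAGE C STRICT;

CREATE FUNCTION ssl_client_get_notafter() RETURNS timestamptz
AS 'MODULE_PATHNAME', 'ssl_client_get_notafter'
LANGUAGE C STRICT;
```

sslinfo.control would then bump default_version to '1.3', while
sslinfo--1.2.sql stays as the base install script.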


+   not_befoer text

s/not_befoer/not_before/


+   errmsg("failed to convert tm to timestamp")));

I think this error is too obscure for the user to act on; what we use elsewhere
is "timestamp out of range", and I think that's more helpful.  I do wonder if
there is ever a legitimate case where this can fail while still having an
authenticated client connection?


+   ^\d+,t,TLSv[\d.]+,[\w-]+,\d+,/?CN=ssltestuser,$serialno,/?\QCN=Test CA 
for PostgreSQL SSL regression test client certs\E,\Q2021-03-03 
22:12:07\E,\Q2048-07-19 22:12:07\E\r?$}mx,

This test output won't actually work for testing against; it works now because
the dates match the current set of certificates, but the certificates can be
regenerated with `cd src/test/ssl && make -f sslfiles.mk` and that will change
the not_before/not_after dates.  In order to have stable test data we need to
set fixed end/start dates and reissue all the client certs.


+   "SELECT ssl_client_get_notbefore() = not_before FROM pg_stat_ssl WHERE 
pid = pg_backend_pid();",
+   connstr => $common_connstr);
+is($result, 't', "ssl_client_get_notbefore() for not_before timestamp");

While this works, it will fail to catch if we have the same bug in both sslinfo
and the backend.  With stable test data we can add the actual date in the mix
and verify that both timestamps are equal and that they match the expected
date.

I have addressed the issues above in a new v5 patchset which includes a new
patch for setting stable validity on the test certificates (the notBefore time
was arbitrarily chosen to match the date of opening up the tree for v17 - we
just need a date in the past).  Your two patches are rolled into a single one
with a commit message added to get started on that part as well.

--
Daniel Gustafsson



v5-0001-Set-fixed-dates-for-test-certificates-validity.patch
Description: Binary data


v5-0002-Add-notBefore-and-notAfter-to-SSL-cert-info-displ.patch
Description: Binary data


Re: MERGE ... RETURNING

2023-07-13 Thread Jeff Davis
On Mon, 2023-01-09 at 12:29 +, Dean Rasheed wrote:
> > Would it be feasible to allow specifying old.column or new.column?
> > These would always be NULL for INSERT and DELETE respectively but
> > more useful with UPDATE. Actually I've been meaning to ask this
> > question about UPDATE … RETURNING.
> > 
> 
> I too have wished for the ability to do that with UPDATE ...
> RETURNING, though I'm not sure how feasible it is.
> 
> I think it's something best considered separately though. I haven't
> given any thought as to how to make it work, so there might be
> technical difficulties. But if it could be made to work for UPDATE,
> it
> shouldn't be much more effort to make it work for MERGE.

MERGE can end up combining old and new values in a way that doesn't
happen with INSERT/UPDATE/DELETE. For instance, a "MERGE ... RETURNING
id" would return a mix of NEW.id (for INSERT/UPDATE actions) and OLD.id
(for DELETE actions).

The pg_merge_action() can differentiate the old and new values, but
it's a bit more awkward.
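
For illustration, using the syntax from this patch (tgt/src are hypothetical
tables, and pg_merge_action() is not in any released server):

```sql
-- One statement, three possible actions, one RETURNING list.
MERGE INTO tgt t
USING src s ON t.id = s.id
WHEN MATCHED AND s.qty = 0 THEN DELETE
WHEN MATCHED THEN UPDATE SET qty = s.qty
WHEN NOT MATCHED THEN INSERT (id, qty) VALUES (s.id, s.qty)
RETURNING pg_merge_action(), id;
-- "id" is NEW.id for the INSERT/UPDATE actions, but OLD.id for DELETE.
```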

I'm fine considering that as a separate patch, but it does seem worth
discussing briefly here.

Regards,
Jeff Davis





Re: psql: Add role's membership options to the \du+ command

2023-07-13 Thread Pavel Luzanov

On 13.07.2023 18:01, Tom Lane wrote:

The idea with that, IMO, is to do something at least minimally sane
if there's a bogus role OID in pg_auth_members.  With plain joins,
the output row would disappear and you'd have no clue that anything
is wrong.  With left joins, you get a row with a null column and
there's reason to investigate why.

Since such a case should not happen in normal use, I don't think it
counts for discussions about compactness of output.  However, this
is also an argument for using a plain not left join between pg_roles
and pg_auth_members: if we do it as per the earlier patch, then
nulls in the output are common and wouldn't draw your attention.
(Indeed, I think broken and not-broken pg_auth_members contents
would be indistinguishable.)


It reminds me of defensive programming practices.
That's great, thanks for the explanation.


That does not seem right.  Is it impossible for pam.set_option
to be false?  Even if it is, should this code assume that?


For versions before 16, including one role in another automatically
gives the ability to issue SET ROLE.

IMO, the only question is whether it is correct to show IMPLICIT and
SET options in versions where they are not actually present
in pg_auth_members, but can be determined.
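
A minimal illustration of that pre-16 behavior (role names are made up):

```sql
-- On a server older than v16:
CREATE ROLE alice;
CREATE ROLE bob LOGIN;
GRANT alice TO bob;   -- membership alone implies the capabilities below
-- As bob:
SET ROLE alice;       -- always allowed for a member before v16
-- and, if bob has rolinherit (the default), alice's privileges are
-- inherited automatically -- hence SET and INHERIT can be derived even
-- though pg_auth_members stores no such columns in those versions.
```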

--
Pavel Luzanov
Postgres Professional: https://postgrespro.com





Re: logical decoding and replication of sequences, take 2

2023-07-13 Thread Tomas Vondra
On 7/5/23 16:51, Ashutosh Bapat wrote:
> 0005, 0006 and 0007 are all related to the initial sequence sync. [3]
> resulted in 0007 and I think we need it. That leaves 0005 and 0006 to
> be reviewed in this response.
> 
> I followed the discussion starting [1] till [2]. The second one
> mentions the interlock mechanism which has been implemented in 0005
> and 0006. While I don't have an objection to allowing LOCKing a
> sequence using the LOCK command, I am not sure whether it will
> actually work or is even needed.
> 
> The problem described in [1] seems to be the same as the problem
> described in [2]. In both cases we see the sequence moving backwards
> during CATCHUP. At the end of catchup the sequence is in the right
> state in both the cases. [2] actually deems this behaviour OK. I also
> agree that the behaviour is ok. I am confused whether we have solved
> anything using interlocking and it's really needed.
> 
> I see that the idea of using an LSN to decide whether or not to apply
> a change to sequence started in [4]. In [5] Tomas proposed to use page
> LSN. Looking at [6], it actually seems like a good idea. In [7] Tomas
> agreed that LSN won't be sufficient. But I don't understand why. There
> are three LSNs in the picture - restart LSN of sync slot,
> confirmed_flush LSN of sync slot and page LSN of the sequence page
> from where we read the initial state of the sequence. I think they can
> be used with the following rules:
> 1. The publisher will not send any changes with LSN less than
> confirmed_flush so we are good there.
> 2. Any non-transactional changes that happened between confirmed_flush
> and page LSN should be discarded while syncing. They are already
> visible to SELECT.
> 3. Any transactional changes with commit LSN between confirmed_flush
> and page LSN should be discarded while syncing. They are already
> visible to SELECT.
> 4. A DDL acquires a lock on sequence. Thus no other change to that
> sequence can have an LSN between the LSN of the change made by DDL and
> the commit LSN of that transaction. Only DDL changes to sequence are
> transactional. Hence any transactional changes with commit LSN beyond
> page LSN would not have been seen by the SELECT otherwise SELECT would
> see the page LSN committed by that transaction. so they need to be
> applied while syncing.
> 5. Any non-transactional changes beyond page LSN should be applied.
> They are not seen by SELECT.
> 
> Am I missing something?
> 
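
For what it's worth, the page LSN in question can be observed from SQL with
pageinspect (a sketch; it assumes get_raw_page() accepts sequence relations,
and relies on a sequence's state fitting in block 0):

```sql
CREATE EXTENSION IF NOT EXISTS pageinspect;
CREATE SEQUENCE s;
SELECT nextval('s');
-- The sequence's page LSN:
SELECT lsn FROM page_header(get_raw_page('s', 0));
-- For comparison against the slot's restart/confirmed_flush LSNs:
SELECT pg_current_wal_lsn();
```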

Hmmm, I think you're onto something and the interlock may not be
actually necessary ...

IIRC there were two examples of the non-MVCC sequence behavior, leading
me to add the interlock.


1) going "backwards" during catchup

Sequences are not MVCC, and if there are increments between the slot
creation and the SELECT, the sequence will go backwards. But it will
ultimately end with the correct value. The LSN checks were an attempt to
prevent this.

I don't recall why I concluded this would not be sufficient (there's no
link for [7] in your message), but maybe it was related to the sequence
increments not being WAL-logged and thus not guaranteed to update the
page LSN, or something like that.

But if we agree we only guarantee consistency at the end of the catchup,
this does not matter - it's OK to go backwards as long as the sequence
ends with the correct value.


2) missing an increment because of ALTER SEQUENCE

My concern here was that we might have a transaction that does ALTER
SEQUENCE before the tablesync slot gets created, and the SELECT still
sees the old sequence state because we start decoding after the ALTER.

But now that I think about it again, this probably can't happen, because
the slot won't be created until the ALTER commits. So we shouldn't miss
anything.

I suspect I got confused by some other bug in the patch at that time,
leading me to a faulty conclusion.


I'll try removing the interlock, and make sure it actually works OK.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: psql: Add role's membership options to the \du+ command

2023-07-13 Thread David G. Johnston
On Thu, Jul 13, 2023 at 8:01 AM Tom Lane  wrote:

> > I plan to replace it to:
>
> >pg_catalog.concat_ws(', ',
> >  CASE WHEN pam.admin_option THEN 'ADMIN' END,
> >  CASE WHEN m.rolinherit THEN 'INHERIT' END,
> >  'SET'
> >) AS "Options",
>
> That does not seem right.  Is it impossible for pam.set_option
> to be false?  Even if it is, should this code assume that?
>
>
That replacement is for version 15 and earlier, where pam.set_option doesn't
exist at all and the presence of a row here means that SET has been granted.

David J.


Re: MERGE ... RETURNING

2023-07-13 Thread Dean Rasheed
On Tue, 11 Jul 2023 at 21:43, Gurjeet Singh  wrote:
>
> > > I think the name of function pg_merge_when_clause() can be improved.
> > > How about pg_merge_when_clause_ordinal().
> >
> > That's a bit of a mouthful, but I don't have a better idea right now.
> > I've left the names alone for now, in case something better occurs to
> > anyone.
>
> +1. How do we make sure we don't forget that it needs to be named
> better? Perhaps a TODO item within the patch will help.
>

Thinking about that some more, I think the word "number" is more
familiar to most people than "ordinal". There's the row_number()
function, for example. So perhaps pg_merge_when_clause_number() would
be a better name. It's still quite long, but it's the best I can think
of.

> I believe this elog can be safely turned into an Assert.
>
> +switch (mergeActionCmdType)
>  {
> -CmdType commandType = relaction->mas_action->commandType;
> 
> +case CMD_INSERT:
> 
> +default:
> +elog(ERROR, "unrecognized commandType: %d", (int)
> mergeActionCmdType);
>
> The same treatment can be applied to the elog(ERROR) in pg_merge_action().
>

Hmm, that's a very common code pattern used in dozens, if not hundreds
of places throughout the backend code, so I don't think this should be
different.

> > > +CREATE FUNCTION merge_into_sq_target(sid int, balance int, delta int,
> > > + OUT action text, OUT tid int,
> > > OUT new_balance int)
> > > +LANGUAGE sql AS
> > > +$$
> > > +MERGE INTO sq_target t
> > > +USING (VALUES ($1, $2, $3)) AS v(sid, balance, delta)
> > > +ON tid = v.sid
> > > +WHEN MATCHED AND tid > 2 THEN
> > >
> > > Again, for consistency, the comparison operator should be >=. There
> > > are a few more occurrences of this comparison in the rest of the file,
> > >  that need the same treatment.
> > >
> >
> > I changed the new tests to use ">= 2" (and the COPY test now returns 3
> > rows, with an action of each type, which is easier to read), but I
> > don't think it's really necessary to change all the existing tests
> > from "> 2". There's nothing wrong with the "= 2" case having no
> > action, as long as the tests give decent coverage.
>
> I was just trying to drive these tests towards a consistent pattern.
> As a reader, if I see these differences, the first and the
> conservative thought I have is that these differences must be there
> for a reason, and then I have to work to find out what those reasons
> might be. And that's a lot of wasted effort, just in case someone
> intends to change something in these tests.
>

OK, I see what you're saying. I think it should be a follow-on patch
though, because I don't like the idea of this patch making
changes to tests unrelated to the feature being added.

>  { oid => '9499', descr => 'command type of current MERGE action',
> -  proname => 'pg_merge_action',  provolatile => 'v',
> +  proname => 'pg_merge_action',  provolatile => 'v', proparallel => 'r',
> 
>  { oid => '9500', descr => 'index of current MERGE WHEN clause',
> -  proname => 'pg_merge_when_clause',  provolatile => 'v',
> +  proname => 'pg_merge_when_clause',  provolatile => 'v', proparallel => 'r',
> 
>
> I see that you've now set proparallel of these functions to 'r'. I'd
> just like to understand how you got to that conclusion.
>

Now that these functions can appear in subqueries in the RETURNING
list, there exists the theoretical possibility that the subquery might
use a parallel plan (actually that can't happen today, for any query
that modifies data, but it may become possible someday),
and it's possible to use these functions in a SELECT query inside a
PL/pgSQL function called from the RETURNING list, which might consider
a parallel plan. Since these functions rely on access to executor
state that isn't copied to parallel workers, they must be run on the
leader, hence I think PARALLEL RESTRICTED is the right level to use. A
similar example is pg_trigger_depth().
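
The existing catalog marking for pg_trigger_depth() can be checked directly:

```sql
-- proparallel = 'r' means parallel restricted: must run on the leader.
SELECT proname, proparallel
FROM pg_proc
WHERE proname = 'pg_trigger_depth';
--      proname      | proparallel
-- ------------------+-------------
--  pg_trigger_depth | r
```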

> --- error when using MERGE support functions outside MERGE
> -SELECT pg_merge_action() FROM sq_target;
>
> I believe it would be worthwhile to keep a record of the expected
> outputs of these invocations in the tests, just in case they change
> over time.
>

Yeah, that makes sense. I'll post an update soon.

Regards,
Dean




Re: MERGE ... RETURNING

2023-07-13 Thread Jeff Davis
On Sun, 2023-01-08 at 12:28 +, Dean Rasheed wrote:
> I considered allowing a separate RETURNING list at the end of each
> action, but rapidly dismissed that idea.

One potential benefit of that approach is that it would be more natural
to specify output specific to the action, e.g. 

   WHEN MATCHED THEN UPDATE ... RETURNING 'UPDATE', ...

which would be an alternative to the special function pg_merge_action()
or "WITH WHEN".

I agree that it can be awkward to specify multiple RETURNING clauses
and get the columns to match up, but it's hard for me to say whether
it's better or worse without knowing more about the use cases.

Regards,
Jeff Davis





Re: Potential memory leak in contrib/intarray's g_intbig_compress

2023-07-13 Thread Tom Lane
Matthias van de Meent  writes:
> My colleague Konstantin (cc-ed) noticed that the GiST code of intarray
> may leak memory in certain index operations:

Can you demonstrate an actual problem here, that is a query-lifespan leak?

IMO, the coding rule in the GiST and GIN AMs is that the AM code is
responsible for running all opclass-supplied functions in suitably
short-lived memory contexts, so that leaks within those functions
don't cause problems.  This is different from btree which requires
comparison functions to not leak.  The rationale for having different
conventions is that btree comparison functions are typically simple
enough to be able to deal with such a restriction, whereas GiST
and GIN opclasses are often complex critters for which it'd be too
bug-prone to insist on leakproofness.  So it seems worth the cost
to make the AM itself set up a throwaway memory context.

(I don't recall offhand which rule the other AMs use.
I'm also not sure where or if this choice is documented.)

In the case at hand, I think the pfree is useless and was installed
by somebody who had mis-extrapolated from btree rules.  So what
I'm thinking would be sufficient is to drop it altogether:

-   if (in != DatumGetArrayTypeP(entry->key))
-   pfree(in);

regards, tom lane




Re: psql: Add role's membership options to the \du+ command

2023-07-13 Thread Tom Lane
Pavel Luzanov  writes:
> On 08.07.2023 20:07, Tom Lane wrote
>> 3. Not sure about use of LEFT JOIN in the query.  That will mean we
>> get a row out even for roles that have no grants, which seems like
>> clutter.  The LEFT JOINs to r and g are fine, but I suggest changing
>> the first join to a plain join.

> Can you explain why LEFT JOIN to r and g are fine after removing LEFT 
> JOIN to pam?

The idea with that, IMO, is to do something at least minimally sane
if there's a bogus role OID in pg_auth_members.  With plain joins,
the output row would disappear and you'd have no clue that anything
is wrong.  With left joins, you get a row with a null column and
there's reason to investigate why.

Since such a case should not happen in normal use, I don't think it
counts for discussions about compactness of output.  However, this
is also an argument for using a plain not left join between pg_roles
and pg_auth_members: if we do it as per the earlier patch, then
nulls in the output are common and wouldn't draw your attention.
(Indeed, I think broken and not-broken pg_auth_members contents
would be indistinguishable.)
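
A toy illustration of the diagnostic difference (stand-in tables, not the
real catalogs):

```sql
CREATE TEMP TABLE roles (oid int, rolname text);
CREATE TEMP TABLE members (roleid int, member int);
INSERT INTO roles VALUES (1, 'alice');
INSERT INTO members VALUES (1, 42);          -- 42 is a dangling "role OID"

-- Plain join: the broken row vanishes without a trace.
SELECT r.rolname, g.rolname AS member
FROM members m
     JOIN roles r ON r.oid = m.roleid
     JOIN roles g ON g.oid = m.member;       -- 0 rows

-- LEFT JOIN on the member side: the breakage shows up as a NULL.
SELECT r.rolname, g.rolname AS member
FROM members m
     JOIN roles r ON r.oid = m.roleid
     LEFT JOIN roles g ON g.oid = m.member;  -- alice | NULL
```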

> I plan to replace it to:

>    pg_catalog.concat_ws(', ',
>      CASE WHEN pam.admin_option THEN 'ADMIN' END,
>      CASE WHEN m.rolinherit THEN 'INHERIT' END,
>      'SET'
>    ) AS "Options",

That does not seem right.  Is it impossible for pam.set_option
to be false?  Even if it is, should this code assume that?

regards, tom lane




Re: logical decoding and replication of sequences, take 2

2023-07-13 Thread Tomas Vondra
On 6/23/23 15:18, Ashutosh Bapat wrote:
> ...
>
> I reviewed 0001 and related parts of 0004 and 0008 in detail.
> 
> I have only one major change request, about
> typedef struct xl_seq_rec
> {
> RelFileLocator locator;
> + bool created; /* creates a new relfilenode (CREATE/ALTER) */
> 
> I am not sure what are the repercussions of adding a member to an existing WAL
> record. I didn't see any code which handles the old WAL format which doesn't
> contain the "created" flag. IIUC, the logical decoding may come across
> a WAL record written in the old format after upgrade and restart. Is
> that not possible?
> 

I don't understand why adding a new field to xl_seq_rec would be an
issue, considering it's done in a new major version. Sure, if you
generate WAL with old build, and start with a patched version, that
would break things. But that's true for many other patches, and it's
irrelevant for releases.

> But I don't think it's necessary. We can add a
> decoding routine for RM_SMGR_ID. The decoding routine will add relfilelocator
> in XLOG_SMGR_CREATE record to txn->sequences hash. Rest of the logic will work
> as is. Of course we will add non-sequence relfilelocators as well but that
> should be fine. Creating a new relfilelocator shouldn't be a frequent
> operation. If at all we are worried about that, we can add only the
> relfilenodes associated with sequences to the hash table.
> 

Hmm, that might work. I feel a bit uneasy about having to keep all
relfilenodes, not just sequences ...


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: add non-option reordering to in-tree getopt_long

2023-07-13 Thread Nathan Bossart
On Thu, Jul 13, 2023 at 02:39:32PM +0900, Michael Paquier wrote:
> Something does not seem to be right seen from here, a CI run with
> Windows 2019 fails when using pg_ctl at the beginning of most of the
> tests, like:
> [04:56:07.404] - 8< 
> -
> [04:56:07.404] stderr:
> [04:56:07.404] pg_ctl: too many command-line arguments (first is "-D")

Assuming you are referring to [0], it looks like you are missing 411b720.

[0] https://github.com/michaelpq/postgres/commits/getopt_test

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: [Patch] Use *other* indexes on the subscriber when REPLICA IDENTITY is FULL

2023-07-13 Thread Önder Kalacı
Hi Amit, Hayato, all

> +1 for the readability. I think we can as you suggest or I feel it
> > would be better if we return false as soon as we found any condition
> > is false. The current patch has a mixed style such that for
> > InvalidStrategy, it returns immediately but for others, it does a
> > combined check.
> >
>
> I have followed the later style in the attached.
>

Looks good to me!

I agree with your note that the confusion was mostly
around two different styles (e.g., some checks return early, others wait
until the final return). Now, the code is pretty easy to follow.


> > The other point we should consider in this regard is
> > the below assert check:
> >
> > +#ifdef USE_ASSERT_CHECKING
> > + {
> > + /* Check that the given index access method has amgettuple routine */
> > + IndexAmRoutine *amroutine = GetIndexAmRoutineByAmId(indexInfo->ii_Am,
> > + false);
> > + Assert(amroutine->amgettuple != NULL);
> > + }
> > +#endif
> >
> > Apart from having an assert, we have the following two options (a)
> > check this condition as well and return false if am doesn't support
> > amgettuple (b) report elog(ERROR, ..) in this case.
>

I think with the current state of the patch (i.e., we only support btree
and hash), the Assert looks reasonable.

In the future, if a hypothetical index type fulfills the "if" checks but
fails on amgettuple, we could change the code to "return false" so that
other indexes get a chance to satisfy the condition.


Let me know what you think of the attached.
>
>
Looks good to me. I have also done some testing, which works as
documented/expected.

Thanks,
Onder


Re: CommandStatus from insert returning when using a portal.

2023-07-13 Thread David G. Johnston
On Thursday, July 13, 2023, Dave Cramer  wrote:

>
> Any comment on why the CommandComplete is incorrect ?
> It returns INSERT 0 0  if a cursor is used
>

Looking at DECLARE, it is surprising that what you describe is even
possible.  Can you share a psql reproducer?

David J.


Re: logical decoding and replication of sequences, take 2

2023-07-13 Thread Ashutosh Bapat
Thanks for the updated patches. I haven't looked at the patches yet
but have some responses below.

On Thu, Jul 13, 2023 at 12:35 AM Tomas Vondra
 wrote:

>
>
> 3) simplify the replicated state
>
> As suggested by Ashutosh, it may not be a good idea to replicate the
> (last_value, log_cnt, is_called) tuple, as that's pretty tightly tied to
> our internal implementation. Which may not be the right thing for other
> plugins. So this new patch replicates just "value" which is pretty much
> (last_value + log_cnt), representing the next value that should be safe
> to generate on the subscriber (in case of a failover).
>

Thanks. That will help.
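
The pieces being collapsed into one "value" are visible in the sequence
relation itself (a sketch; it assumes the default increment of 1):

```sql
CREATE SEQUENCE s;
SELECT nextval('s');
-- log_cnt is the number of further nextval() calls already covered by
-- WAL, so last_value + log_cnt approximates the next value that is safe
-- to hand out on the subscriber after a failover.
SELECT last_value, log_cnt, is_called,
       last_value + log_cnt AS next_safe_value
FROM s;
```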


> 5) minor tweaks in the built-in replication
>
> This adopts the relaxed LOCK code to allow locking sequences during the
> initial sync, and also adopts the replication of a single value (this
> affects the "apply" side of that change too).
>

I think the problem we are trying to solve with LOCK is not actually
getting solved. See [2]. Instead your earlier idea of using page LSN
looks better.

>
> 6) simplified protocol versioning

I had tested the cross-version logical replication with older set of
patches. Didn't see any unexpected behaviour then. I will test again.
>
> The one thing I'm not really sure about is how it interferes with the
> replication of DDL. But in principle, if it decodes DDL for ALTER
> SEQUENCE, I don't see why it would be a problem that we then decode and
> replicate the WAL for the sequence state. But if it is a problem, we
> should be able to skip this WAL record with the initial sequence state
> (which I think should be possible thanks to the "created" flag this
> patch adds to the WAL record).

I had suggested a solution in [1] to avoid adding a flag to the WAL
record. Did you consider it? If you considered it and rejected, I
would be interested in knowing reasons behind rejecting it. Let me
repeat here again:

```
We can add a
decoding routine for RM_SMGR_ID. The decoding routine will add relfilelocator
in XLOG_SMGR_CREATE record to txn->sequences hash. Rest of the logic will work
as is. Of course we will add non-sequence relfilelocators as well but that
should be fine. Creating a new relfilelocator shouldn't be a frequent
operation. If at all we are worried about that, we can add only the
relfilenodes associated with sequences to the hash table.
```

If the DDL replication takes care of replicating and applying sequence
changes, I think we don't need the changes tracking "transactional"
sequence changes in this patch-set. That also makes a case for not
adding a new field to WAL which may not be used.

[1] 
https://www.postgresql.org/message-id/CAExHW5v_vVqkhF4ehST9EzpX1L3bemD1S%2BkTk_-ZVu_ir-nKDw%40mail.gmail.com
[2] 
https://www.postgresql.org/message-id/CAExHW5vHRgjWzi6zZbgCs97eW9U7xMtzXEQK%2BaepuzoGDsDNtg%40mail.gmail.com
-- 
Best Wishes,
Ashutosh Bapat




Re: cataloguing NOT NULL constraints

2023-07-13 Thread Dean Rasheed
On Wed, 12 Jul 2023 at 18:11, Alvaro Herrera  wrote:
>
> v13, because a conflict was just committed to alter_table.sql.
>
> Here I also call out the relcache.c change by making it a separate
> commit.  I'm likely to commit it that way, too.  To recap: the change is
> to have a partitioned table's index list include the primary key, even
> when said primary key is marked invalid.  This turns out to be necessary
> for the currently proposed pg_dump strategy to work; if this is not in
> place, attaching the per-partition PK indexes to the parent index fails
> because it sees that the columns are not marked NOT NULL.
>

Hmm, looking at that change, it looks a little ugly. I think someone
reading that code in the future will have no idea why it's including
some invalid indexes, and not others.

> There are no other changes from v12.  One thing I should probably get
> to, is fixing the constraint name comparison code in pg_dump.  Right now
> it's a bit dumb and will get in silly trouble with overlength
> table/column names (nothing that would actually break, just that it will
> emit constraint names when there's no need to.)
>

Yeah, that seems a bit ugly. Presumably, also, if something like a
column rename happens, the constraint name will no longer match.

I see that it's already been discussed, but I don't like the fact that
there is no way to get hold of the new constraint names in psql. I
think for the purposes of dropping named constraints, and also
possible future stuff like NOT VALID / DEFERRABLE constraints, having
some way to get their names will be important.

Something else I noticed is that the result from "ALTER TABLE ...
ALTER COLUMN ... DROP NOT NULL" is no longer easily predictable -- if
there are multiple NOT NULL constraints on the column, it just drops
one (chosen at random?) and leaves the others. I think that it should
either drop all the constraints, or throw an error. Either way, I
would expect that if DROP NOT NULL succeeds, the result is that the
column is nullable.

Regards,
Dean




Re: CommandStatus from insert returning when using a portal.

2023-07-13 Thread Dave Cramer
On Wed, 12 Jul 2023 at 21:31, David G. Johnston 
wrote:

> On Wed, Jul 12, 2023 at 5:57 PM Dave Cramer  wrote:
>
>> On Wed, 12 Jul 2023 at 20:00,  wrote:
>>
>>> Dave Cramer  writes:
>>> > Obviously I am biased by the JDBC API which would like to have
>>> > PreparedStatement.execute() return the number of rows inserted
>>> > without having to wait to read all of the rows returned
>>>
>>> Huh ... just how *is* PreparedStatement.execute() supposed
>>> to behave when the statement is an INSERT RETURNING?
>>>
>>
>> It's really executeUpdate which is supposed to return the number of rows
>> updated.
>>
>
> Right, and executeUpdate is the wrong API method to use, in the PostgreSQL
> world, when executing insert/update/delete with the non-SQL-standard
> returning clause.  executeQuery is the method to use.  And execute() should
> behave as if executeQuery was called, i.e., return true, which it is
> capable of doing since it has resultSet data that it needs to handle.
>
> The addition of returning turns the insert/update/delete into a select in
> terms of effective client-seen behavior.
>
> ISTM that you are trying to make user-error less painful.  While that is
> laudable it apparently isn't practical.  They can either discard the
> results and get a count by omitting returning or obtain the result and
> derive the count by counting rows alongside whatever else they needed the
> returned data for.
>
Any comment on why the CommandComplete is incorrect?
It returns INSERT 0 0 if a cursor is used.

Dave



Re: Incremental sort for access method with ordered scan support (amcanorderbyop)

2023-07-13 Thread Jonathan S. Katz

On 7/5/23 2:15 AM, Richard Guo wrote:


On Tue, Jul 4, 2023 at 7:15 PM David Rowley wrote:


On Tue, 4 Jul 2023 at 20:12, Richard Guo wrote:
 > The v4 patch looks good to me (maybe some cosmetic tweaks are still
 > needed for the comments).  I think it's now 'Ready for Committer'.

I agree. I went and hit the comments with a large hammer and while
there also adjusted the regression tests. I didn't think having "t" as
a table name was a good idea as it seems like a name with a high risk
of conflicting with a concurrently running test. Also, there didn't
seem to be much need to insert data into that table as the tests
didn't query any of it.

The only other small tweak I made was to not call list_copy_head()
when the list does not need to be shortened. It's likely not that
important, but if the majority of cases are not partial matches, then
we'd otherwise be needlessly making copies of the list.

I pushed the adjusted patch.


The adjustments improve the patch a lot.  Thanks for adjusting and
pushing the patch.


Thanks for working on this! While it allows the planner to consider 
choosing an incremental sort for indexes that implement 
"amcanorderbyop", it also has the positive side-effect that the planner 
will now consider plans that spawn parallel workers!


Because of that, I'd like to open the discussion that we consider 
backpatching this. Currently, extensions that implement index access 
methods (e.g. pgvector[1]) that are built primarily around 
"amcanorderbyop" are unable to get the planner to consider choosing a 
parallel scan, i.e. at this point in "create_order_paths"[2]:


/*
* If cheapest partial path doesn't need a sort, this is redundant
* with what's already been tried.
*/
if (!pathkeys_contained_in(root->sort_pathkeys,
   cheapest_partial_path->pathkeys))

However, 625d5b3c does unlock this path for these types of indexes to 
allow for a parallel index scan to be chosen, which would allow 
extensions that implement an "amcanorderbyop" scan to use it. I would 
argue that this is a bug, given we offer the ability for index access 
methods to implement parallel index scans.


That said, I do think there may still need to be one planner tweak to 
properly support parallel index scan in this case, as I have yet to see 
costs generated where the parallel index scan is cheaper. However, I 
have not yet narrowed down what/where that is.


Thanks,

Jonathan

[1] https://github.com/pgvector/pgvector
[2] 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/optimizer/plan/planner.c;#l5188


OpenPGP_signature
Description: OpenPGP digital signature


Re: doc: clarify the limitation for logical replication when REPLICA IDENTITY is FULL

2023-07-13 Thread Amit Kapila
On Tue, Jul 11, 2023 at 2:17 PM Sergei Kornilov  wrote:
>
> I think it's appropriate to add on the restrictions page. (But mentioning 
> that this restriction is only for subscriber)
>
> If the list were larger, then the restrictions page could be divided into 
> publisher and subscriber restrictions. But not for one very specific 
> restriction.
>

Okay, how about something like: "The UPDATE and DELETE operations
cannot be applied on the subscriber for published tables that
specify REPLICA IDENTITY FULL when the table has attributes with
datatypes (e.g., point or box) that don't have a default operator class
for Btree or Hash. This won't be a problem if the table has a primary
key or replica identity defined for it."?

-- 
With Regards,
Amit Kapila.




Potential memory leak in contrib/intarray's g_intbig_compress

2023-07-13 Thread Matthias van de Meent
Hi,

My colleague Konstantin (cc-ed) noticed that the GiST code of intarray
may leak memory in certain index operations:

> g_intbig_compress(...):
> [...]
> ArrayType  *in = DatumGetArrayTypeP(entry->key);
> [...]
> if (in != DatumGetArrayTypeP(entry->key))
> pfree(in);

DatumGetArrayTypeP will allocate a new, uncompressed copy if the given
Datum is compressed. So in this code, if entry->key is compressed we'd
allocate two decompressed copies, while we only deallocate the
first of these two. I believe the attached patch fixes the issue.

It looks like this bug has existed since the code was first committed,
so backpatching would go back to 11 if this is an active issue.

Kind regards,

Matthias van de Meent


0001-Fix-memory-leak-in-g_intbig_compress.patch
Description: Binary data


Re: Better help output for pgbench -I init_steps

2023-07-13 Thread Peter Eisentraut

On 12.07.23 19:47, Gurjeet Singh wrote:

If you end up with variant 3 or 4, please use double quotes instead of
single quotes.


Will do.

I believe you're suggesting this because in the neighboring help text
the string literals use double quotes. I see that other utilities,
such as psql, also use double quotes in help text.

If there's a convention, documented somewhere in our sources, I'd love
to know and learn other conventions.


https://www.postgresql.org/docs/devel/error-style-guide.html#ERROR-STYLE-GUIDE-QUOTATION-MARKS





Re: Consistent coding for the naming of LR workers

2023-07-13 Thread Peter Eisentraut

On 13.07.23 11:29, Alvaro Herrera wrote:

On 2023-Jul-13, Peter Eisentraut wrote:


I suppose we could just say "logical replication worker" in all cases. That
should be enough precision for the purpose of these messages.


Agreed.  IMO the user doesn't care about specifics.


Ok, committed.





Re: Configurable FP_LOCK_SLOTS_PER_BACKEND

2023-07-13 Thread Tomas Vondra



On 7/13/23 07:02, Nikolay Samokhvalov wrote:
> We're observing a few cases with lockmanager spikes in a few quite
> loaded systems.
> 
> These cases are different; queries are different, Postgres versions are
> 12, 13, and 14.
> 
> But in all cases, servers are quite beefy (96-128 vCPUs, ~600-800 GiB)
> receiving a lot of TPS (a few dozen thousand). Most queries that
> suffer from wait_event=lockmanager involve a substantial number of
> tables/indexes, often with partitioning.
> 
> FP_LOCK_SLOTS_PER_BACKEND is now hard-coded 16 in storage/proc.h – and
> it is now very easy to hit this threshold in a loaded system,
> especially, for example, if a table with a dozen of indexes was
> partitioned. It seems any system with good growth hits it sooner or later.
> 
> I wonder, would it make sense to:
> 1) either consider increasing this hard-coded value, taking into account
> that 16 seems to be very low for modern workloads, schemas, and hardware
> – say, it could be 64,

Well, that has a cost too, as it makes PGPROC larger, right? At the
moment that struct is already ~880B / 14 cachelines, adding 48 XIDs
would make it +192B / +3 cachelines. I suspect that would impact other
common workloads ...

However, the lmgr/README says this is meant to alleviate contention on
the lmgr partition locks. Wouldn't it be better to increase the number
of those locks, without touching the PGPROC stuff? Could this be tuned
using some heuristics based on number of connections?

> 2) or even make it configurable – a new GUC.
> 

I'm rather strongly against adding yet another GUC for this low-level
stuff. We already have enough of such knobs it's almost impossible for
regular users to actually tune the system without doing something wrong.
I'd even say it's actively harmful, especially if it's aimed at very
common setups/workloads (like here).

regards
-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Check lateral references within PHVs for memoize cache keys

2023-07-13 Thread Richard Guo
On Sun, Jul 9, 2023 at 12:17 PM David Rowley  wrote:

> I just pushed a fix for this.  Thanks for reporting it.


BTW, I noticed a typo in the comment inside paraminfo_get_equal_hashops.

foreach(lc, innerrel->lateral_vars)
{
Node   *expr = (Node *) lfirst(lc);
TypeCacheEntry *typentry;

/* Reject if there are any volatile functions in PHVs */
if (contain_volatile_functions(expr))
{
list_free(*operators);
list_free(*param_exprs);
return false;
}

The expressions in RelOptInfo.lateral_vars are not necessarily from
PHVs.  For instance

explain (costs off)
select * from t t1 join
lateral (select * from t t2 where t1.a = t2.a offset 0) on true;
QUERY PLAN
--
 Nested Loop
   ->  Seq Scan on t t1
   ->  Memoize
 Cache Key: t1.a
 Cache Mode: binary
 ->  Seq Scan on t t2
   Filter: (t1.a = a)
(7 rows)

The lateral Var 't1.a' comes from the lateral subquery, not PHV.

This seems to be a typo from 63e4f13d.  How about we change it to the below?

- /* Reject if there are any volatile functions in PHVs */
+ /* Reject if there are any volatile functions in lateral vars */

Thanks
Richard


Re: Consistent coding for the naming of LR workers

2023-07-13 Thread Alvaro Herrera
On 2023-Jul-13, Peter Eisentraut wrote:

> I suppose we could just say "logical replication worker" in all cases. That
> should be enough precision for the purpose of these messages.

Agreed.  IMO the user doesn't care about specifics.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/




Re: Check lateral references within PHVs for memoize cache keys

2023-07-13 Thread Richard Guo
On Sun, Jul 9, 2023 at 1:28 AM Tom Lane  wrote:

> More generally, it's not clear to me why we should need to look inside
> lateral PHVs in the first place.  Wouldn't the lateral PHV itself
> serve fine as a cache key?


Do you mean we use the lateral PHV directly as a cache key?  Hmm, it
seems to me that we'd have problem if the PHV references rels that are
inside the PHV's syntactic scope.  For instance

select * from t t1 left join
lateral (select t1.a+t2.a as t1a, t2.a as t2a from t t2) s on true
where s.t1a = s.t2a;

The PHV references t1.a so it's lateral.  But it also references t2.a,
so if we use the PHV itself as cache key, the plan would look like

   QUERY PLAN

 Nested Loop
   ->  Seq Scan on t t1
   ->  Memoize
 Cache Key: (t1.a + t2.a)
 Cache Mode: binary
 ->  Seq Scan on t t2
   Filter: ((t1.a + a) = a)
(7 rows)

which is an invalid plan as the cache key contains t2.a.

Thanks
Richard


Re: [Patch] Use *other* indexes on the subscriber when REPLICA IDENTITY is FULL

2023-07-13 Thread Amit Kapila
On Thu, Jul 13, 2023 at 8:51 AM Amit Kapila  wrote:
>
> On Wed, Jul 12, 2023 at 8:14 PM Önder Kalacı  wrote:
> >
> >>
> >> > - return is_btree && !is_partial && !is_only_on_expression;
> >> > + /* Check whether the index is supported or not */
> >> > + is_suitable_type = ((get_equal_strategy_number_for_am(indexInfo->ii_Am)
> >> > + != InvalidStrategy));
> >> > +
> >> > + is_partial = (indexInfo->ii_Predicate != NIL);
> >> > + is_only_on_expression = IsIndexOnlyOnExpression(indexInfo);
> >> > +
> >> > + return is_suitable_type && !is_partial && !is_only_on_expression;
> >> >
> >
> >
> > I don't want to repeat this too much, as it is a minor note. Just
> > sharing my perspective here.
> >
> > As discussed in the other email [1], I feel like keeping
> > IsIndexUsableForReplicaIdentityFull()  function readable is useful
> > for documentation purposes as well.
> >
> > So, I'm more inclined to see something like your old code, maybe with
> > a different variable name.
> >
> >> bool is_btree = (indexInfo->ii_Am == BTREE_AM_OID);
> >
> >
> > to
> >>
> >> bool has_equal_strategy = get_equal_strategy_number_for_am...
> >> 
> >>  return  has_equal_strategy   && !is_partial && !is_only_on_expression;
> >
>
> +1 for the readability. I think we can do as you suggest, or I feel it
> would be better if we return false as soon as we find any condition that
> is false. The current patch has a mixed style such that for
> InvalidStrategy, it returns immediately but for others, it does a
> combined check.
>

I have followed the later style in the attached.

> The other point we should consider in this regard is
> the below assert check:
>
> +#ifdef USE_ASSERT_CHECKING
> + {
> + /* Check that the given index access method has amgettuple routine */
> + IndexAmRoutine *amroutine = GetIndexAmRoutineByAmId(indexInfo->ii_Am,
> + false);
> + Assert(amroutine->amgettuple != NULL);
> + }
> +#endif
>
> Apart from having an assert, we have the following two options (a)
> check this condition as well and return false if am doesn't support
> amgettuple (b) report elog(ERROR, ..) in this case.
>
> I am of the opinion that we should either have an assert for this or
> do (b) because if do (a) currently there is no case where it can
> return false. What do you think?
>

For now, I have kept the assert but moved it to the end of the function.

Apart from the above, I have made a number of minor changes (a)
changed the datatype for the strategy to StrategyNumber at various
places in the patch; (b) made a number of changes in comments based on
Peter's comments and otherwise; (c) ran pgindent and changed the
commit message; (d) few other cosmetic changes.

Let me know what you think of the attached.

-- 
With Regards,
Amit Kapila.


v6-0001-Allow-the-use-of-a-hash-index-on-the-subscriber-d.patch
Description: Binary data


About `GetNonHistoricCatalogSnapshot`: does it allow developers to use catalog snapshot to scan non-catalog tables?

2023-07-13 Thread Xiaoran Wang
Hi,
I have a question about the routine "GetNonHistoricCatalogSnapshot".
It has a param "Oid relid". It first
checks whether the relation has a syscache or is one of the relations
covered by "RelationInvalidatesSnapshotsOnly".
If yes, it will invalidate the CatalogSnapshot.

I just wonder in which situation a developer would try to scan a non-catalog
table with the CatalogSnapshot.

By the way, in the routine "SnapshotSetCommandId", there is a comment:

 /* Should we do the same with CatalogSnapshot? */

From my point of view, there is no need to update the curcid of the
CatalogSnapshot, as the CatalogSnapshot
will be invalidated if there are any updates on the catalog tables in the
current transaction.

If I misunderstood, please correct me!

Best regards, xiaoran


Re: Consistent coding for the naming of LR workers

2023-07-13 Thread Masahiko Sawada
On Thu, Jul 13, 2023 at 4:07 PM Peter Eisentraut  wrote:
>
> On 13.07.23 06:59, Peter Smith wrote:
> > On Wed, Jul 12, 2023 at 9:35 PM Peter Eisentraut  
> > wrote:
> >>
> >> On 21.06.23 09:18, Alvaro Herrera wrote:
> >>> That is a terrible pattern in relatively new code.  Let's get rid of it
> >>> entirely rather than continue to propagate it.
> >>>
>  So, I don't think it is fair to say that these format strings are OK
>  for the existing HEAD code, but not OK for the patch code, when they
>  are both the same.
> >>>
> >>> Agreed.  Let's remove them all.
> >>
> >> This is an open issue for PG16 translation.  I propose the attached
> >> patch to fix this.  Mostly, this just reverts to the previous wordings.
> >> (I don't think for these messages the difference between "apply worker"
> >> and "parallel apply worker" is all that interesting to explode the
> >> number of messages.  AFAICT, the table sync worker case wasn't even
> >> used, since callers always handled it separately.)
> >
> > I thought the get_worker_name function was reachable by tablesync workers 
> > also.
> >
> > Since ApplyWorkerMain is a common entry point for both leader apply
> > workers and tablesync workers, any logs in that code path could
> > potentially be from either kind of worker. At least, when the function
> > was first introduced (around patch v43-0001? [1]) it was also
> > replacing some tablesync logs.
>
> I suppose we could just say "logical replication worker" in all cases.
> That should be enough precision for the purpose of these messages.

+1

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: 'ERROR: attempted to update invisible tuple' from 'ALTER INDEX ... ATTACH PARTITION' on parent index

2023-07-13 Thread Shruthi Gowda
On Thu, Jul 13, 2023 at 1:40 PM Michael Paquier  wrote:

> On Thu, Jul 13, 2023 at 02:26:42PM +0900, Michael Paquier wrote:
> > On Thu, Jul 13, 2023 at 09:35:17AM +0530, Dilip Kumar wrote:
> >> Or do we actually need to update all the tuple header information as
> >> well in RelationReloadIndexInfo() in order to fix that invariant so
> >> that it can be used for catalog tuple updates as well?
> >
> > RelationReloadIndexInfo() is designed to be minimal, so I am not
> > really excited about extending it more than necessary without a case
> > in favor of it.  indisreplident is clearly on the list of things to
> > update in this concept.  The others would need a more careful
> > evaluation, though we don't really have a case for doing more, IMO,
> > particularly in the score of stable branches.
>
> FYI, I was planning to do something about this thread in the shape of
> two different patches: one for the indisreplident missing from the
> RelationReloadIndexInfo() and one for the syscache issue in the
> partitioned index validation.  indisreplident use in the backend code
> is interesting, as, while double-checking the code, I did not find a
> code path involving a command where indisreplident would be checked
> from the pg_index tuple in the relcache: all the values are from
> tuples retrieved from the syscache.
>

I agree with the idea of splitting the patch.
While analyzing the issue I did notice that validatePartitionedIndex() is
the only place where the index tuple is copied from rel->rd_indextuple;
however, I was not clear about the motive behind it.

Regards,
Shruthi KC
EnterpriseDB: http://www.enterprisedb.com


Re: psql: Add role's membership options to the \du+ command

2023-07-13 Thread Pavel Luzanov

On 08.07.2023 20:07, Tom Lane wrote:

3. Not sure about use of LEFT JOIN in the query.  That will mean we
get a row out even for roles that have no grants, which seems like
clutter.  The LEFT JOINs to r and g are fine, but I suggest changing
the first join to a plain join.


Hm.
Can you explain why the LEFT JOINs to r and g are fine after removing the
LEFT JOIN to pam?

Why not change all three left joins to plain joins?

The query for v16+ now looks like:

SELECT m.rolname AS "Role name", r.rolname AS "Member of",
  pg_catalog.concat_ws(', ',
    CASE WHEN pam.admin_option THEN 'ADMIN' END,
    CASE WHEN pam.inherit_option THEN 'INHERIT' END,
    CASE WHEN pam.set_option THEN 'SET' END
  ) AS "Options",
  g.rolname AS "Grantor"
FROM pg_catalog.pg_roles m
 JOIN pg_catalog.pg_auth_members pam ON (pam.member = m.oid)
 LEFT JOIN pg_catalog.pg_roles r ON (pam.roleid = r.oid)
 LEFT JOIN pg_catalog.pg_roles g ON (pam.grantor = g.oid)
WHERE m.rolname !~ '^pg_'
ORDER BY 1, 2, 4;


And for versions <16, I forgot to simplify the expression for the 'Options' 
column after removing the LEFT JOIN on pam:


SELECT m.rolname AS "Role name", r.rolname AS "Member of",
  pg_catalog.concat_ws(', ',
    CASE WHEN pam.admin_option THEN 'ADMIN' END,
    CASE WHEN pam.roleid IS NOT NULL AND m.rolinherit THEN 'INHERIT' END,
    CASE WHEN pam.roleid IS NOT NULL THEN 'SET' END
  ) AS "Options",
  g.rolname AS "Grantor"
FROM pg_catalog.pg_roles m
 JOIN pg_catalog.pg_auth_members pam ON (pam.member = m.oid)
 LEFT JOIN pg_catalog.pg_roles r ON (pam.roleid = r.oid)
 LEFT JOIN pg_catalog.pg_roles g ON (pam.grantor = g.oid)
WHERE m.rolname !~ '^pg_'
ORDER BY 1, 2, 4;

I plan to replace it with:

  pg_catalog.concat_ws(', ',
    CASE WHEN pam.admin_option THEN 'ADMIN' END,
    CASE WHEN m.rolinherit THEN 'INHERIT' END,
    'SET'
  ) AS "Options",

--
Pavel Luzanov
Postgres Professional: https://postgrespro.com





Re: add non-option reordering to in-tree getopt_long

2023-07-13 Thread Kyotaro Horiguchi
At Thu, 13 Jul 2023 14:39:32 +0900, Michael Paquier  wrote 
in 
> [04:56:07.404] pg_ctl: too many command-line arguments (first is "-D")

Mmm. It checks, for example, for "pg_ctl initdb -D $tempdir/data -o
-N".  This version of getopt_long() returns -1 as soon as it meets the
first non-option "initdb". This is somewhat different from what I saw the
last time I looked at the patch, and looks strange at a glance.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Add hint message for check_log_destination()

2023-07-13 Thread Masahiko Sawada
On Tue, Jul 11, 2023 at 10:24 AM Japin Li  wrote:
>
>
> On Mon, 10 Jul 2023 at 14:23, Masahiko Sawada  wrote:
> > On Mon, Jul 10, 2023 at 2:07 PM Kyotaro Horiguchi
> >  wrote:
> >>
> >> At Mon, 10 Jul 2023 09:04:42 +0800, Japin Li  wrote in
> >> >
> >> > On Sat, 08 Jul 2023 at 12:48, Michael Paquier  
> >> > wrote:
> >> > > On Fri, Jul 07, 2023 at 07:23:47PM +0800, Japin Li wrote:
> >> > >> +  appendStringInfoString(, "\"stderr\"");
> >> > >> +#ifdef HAVE_SYSLOG
> >> > >> +  appendStringInfoString(, ", \"syslog\"");
> >> > >> +#endif
> >> > >> +#ifdef WIN32
> >> > >> +  appendStringInfoString(, ", \"eventlog\"");
> >> > >> +#endif
> >> > >> +  appendStringInfoString(, ", \"csvlog\", 
> >> > >> and \"jsonlog\"");
> >> > >
> >> > > Hmm.  Is that OK as a translatable string?
> >
> > It seems okay to me but needs to be checked.
> >
> >> >
> >> >
> >> > Sorry for the late reply!  I'm not sure.  How can I know whether it is 
> >> > translatable?
> >>
> >> At the very least, we can't generate comma-separated lists
> >> programmatically because punctuation marks vary across languages.
> >>
> >> One potential approach could involve defining the message for every
> >> potential combination, in full length.
> >
> > Don't we generate a comma-separated list for an error hint of an enum
> > parameter? For example, to generate the following error hint:
> >
> > =# alter system set client_min_messages = 'aaa';
> > ERROR:  invalid value for parameter "client_min_messages": "aaa"
> > HINT:  Available values: debug5, debug4, debug3, debug2, debug1, log,
> > notice, warning, error.
> >
> > we use the comma-separated generated by config_enum_get_options() and
> > do ereport() like:
> >
> >   ereport(elevel,
> >   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> >errmsg("invalid value for parameter \"%s\": \"%s\"",
> >   name, value),
> >hintmsg ? errhint("%s", _(hintmsg)) : 0));
>
> > IMO log_destination is a string GUC parameter but its value is the
> > list of enums. So it makes sense to me to add a hint message like what
> > we do for enum parameters in case where the user mistypes a wrong
> > value. I'm not sure why the proposed patch needs to quote the usable
> > values, though.
>
> I borrowed the description from pg_settings extra_desc.  In my first patch,
> I used a hint message like the one for enum parameters; however, since the
> value might be a combination of multiple log destinations, I updated the
> patch to use extra_desc.

I agree to use the description from pg_settings extra_desc, but it seems
better not to quote each available value like we do for enum
parameters. That is, the hint message would be like:

=# alter system set log_destination to 'xxx';
ERROR:  invalid value for parameter "log_destination": "xxx"
DETAIL:  Unrecognized key word: "xxx".
HINT:  Valid values are combinations of stderr, syslog, csvlog, and jsonlog.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: 'ERROR: attempted to update invisible tuple' from 'ALTER INDEX ... ATTACH PARTITION' on parent index

2023-07-13 Thread Michael Paquier
On Thu, Jul 13, 2023 at 02:26:42PM +0900, Michael Paquier wrote:
> On Thu, Jul 13, 2023 at 09:35:17AM +0530, Dilip Kumar wrote:
>> Or do we actually need to update all the tuple header information as
>> well in RelationReloadIndexInfo() in order to fix that invariant so
>> that it can be used for catalog tuple updates as well?
> 
> RelationReloadIndexInfo() is designed to be minimal, so I am not
> really excited about extending it more than necessary without a case
> in favor of it.  indisreplident is clearly on the list of things to
> update in this concept.  The others would need a more careful
> evaluation, though we don't really have a case for doing more, IMO,
> particularly in the score of stable branches.

FYI, I was planning to do something about this thread in the shape of
two different patches: one for the indisreplident missing from the
RelationReloadIndexInfo() and one for the syscache issue in the
partitioned index validation.  indisreplident use in the backend code
is interesting, as, while double-checking the code, I did not find a
code path involving a command where indisreplident would be checked
from the pg_index tuple in the relcache: all the values are from
tuples retrieved from the syscache.
--
Michael


signature.asc
Description: PGP signature


Re: [PoC] Improve dead tuple storage for lazy vacuum

2023-07-13 Thread Masahiko Sawada
On Sat, Jul 8, 2023 at 11:54 AM John Naylor
 wrote:
>
>
> On Fri, Jul 7, 2023 at 2:19 PM Masahiko Sawada  wrote:
> >
> > On Wed, Jul 5, 2023 at 8:21 PM John Naylor  
> > wrote:
> > > Well, it's going to be a bit of a mess until I can demonstrate it working 
> > > (and working well) with bitmap heap scan. Fixing that now is just going 
> > > to create conflicts. I do have a couple small older patches laying around 
> > > that were quick experiments -- I think at least some of them should give 
> > > a performance boost in loading speed, but haven't had time to test. Would 
> > > you like to take a look?
> >
> > Yes, I can experiment with these patches in the meantime.
>
> Okay, here it is in v36. 0001-6 are same as v35.
>
> 0007 removes a wasted extra computation newly introduced by refactoring 
> growing nodes. 0008 just makes 0011 nicer. Not worth testing by themselves, 
> but better to be tidy.
> 0009 is an experiment to get rid of slow memmoves in node4, addressing a 
> long-standing inefficiency. It looks a bit tricky, but I think it's actually 
> straightforward after drawing out the cases with pen and paper. It works if 
> the fanout is either 4 or 5, so we have some wiggle room. This may give a 
> noticeable boost if the input is reversed or random.
> 0010 allows RT_EXTEND_DOWN to reduce function calls, so should help with 
> sparse trees.
> 0011 reduces function calls when growing the smaller nodes. Not sure about 
> this one -- possibly worth it for node4 only?
>
> If these help, it'll show up more easily in smaller inputs. Large inputs tend 
> to be more dominated by RAM latency.

Thanks for sharing the patches!

0007, 0008, 0010, and 0011 are straightforward, and I agree to merge them.

I have some questions on 0009 patch:

+   /* shift chunks and children
+
+   Unfortunately, gcc has gotten too aggressive in
turning simple loops
+   into slow memmove's, so we have to be a bit more clever.
+   See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101481
+
+   We take advantage of the fact that a good
+   compiler can turn a memmove of a small constant power-of-two
+   number of bytes into a single load/store.
+   */

According to the comment, this optimization is only for gcc? And is there
no negative impact when building with other compilers such as clang?

I'm not sure that it's a good approach to hand-optimize the code much
to generate better instructions on gcc. I think this change reduces
readability and maintainability. According to the bugzilla ticket
referred to in the comment, it's recognized as a bug by the community,
so once the gcc bug is fixed, we might no longer need this trick, no?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: \di+ cannot show the same name indexes

2023-07-13 Thread Julien Rouhaud
Hi,

On Thu, Jul 13, 2023 at 03:17:17PM +0800, フブキダイスキ wrote:
> After I create the same name index on the heap table and the temporary
> table, I can only get the temporary table's index by \di+.
>
> create table t1(c1 int);
> create temp table t2(c1 int);
>
> create index idx1 on t1(c1);
> \di+
        List of relations
 Schema | Name | Type  | Owner | Table |  Size  | Description
--------+------+-------+-------+-------+--------+-------------
 public | idx1 | index | zhrt  | t1    | 128 kB |
(1 row)
>
> create index idx1 on t2(c1);
> \di+
           List of relations
   Schema    | Name | Type  | Owner | Table |  Size  | Description
-------------+------+-------+-------+-------+--------+-------------
 pg_temp_298 | idx1 | index | zhrt  | t2    | 128 kB |
(1 row)
>
> Is it the expected behavior?

Yes, since the pg_temp schema has higher priority and those commands will not
show multiple objects for the same non-qualified name.  You can either change
the priority with something like

SET search_path TO public, pg_temp;

to look at public (or any other schema) first, or explicitly ask for the schema
you want, e.g. \di+ public.*
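For example, continuing the session from the report, both workarounds look like this (a sketch, not verbatim output):

```sql
-- Look up public objects before temporary ones:
SET search_path TO public, pg_temp;
\di+            -- now lists public.idx1

-- Or bypass search_path entirely with a schema-qualified pattern:
\di+ *.idx1     -- lists idx1 in every schema
```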




\di+ cannot show the same name indexes

2023-07-13 Thread フブキダイスキ
Hi,
After I create the same name index on the heap table and the temporary
table, I can only get the temporary table's index by \di+.

create table t1(c1 int);
create temp table t2(c1 int);

create index idx1 on t1(c1);
\di+
        List of relations
 Schema | Name | Type  | Owner | Table |  Size  | Description
--------+------+-------+-------+-------+--------+-------------
 public | idx1 | index | zhrt  | t1    | 128 kB |
(1 row)

create index idx1 on t2(c1);
\di+
           List of relations
   Schema    | Name | Type  | Owner | Table |  Size  | Description
-------------+------+-------+-------+-------+--------+-------------
 pg_temp_298 | idx1 | index | zhrt  | t2    | 128 kB |
(1 row)

Is it the expected behavior?


Re: Check lateral references within PHVs for memoize cache keys

2023-07-13 Thread Richard Guo
On Sun, Jul 9, 2023 at 1:28 AM Tom Lane  wrote:

> Richard Guo  writes:
> > Rebase the patch on HEAD as cfbot reminds.
>
> This fix seems like a mess.  The function that is in charge of filling
> RelOptInfo.lateral_vars is extract_lateral_references; or at least
> that was how it was done up to now.  Can't we compute these additional
> references there?  If not, maybe we ought to just merge
> extract_lateral_references into create_lateral_join_info, rather than
> having the responsibility split.  I also wonder whether this change
> isn't creating hidden dependencies on RTE order (which would likely be
> bugs), since create_lateral_join_info itself examines the lateral_vars
> lists as it walks the rtable.


Yeah, you're right.  Currently all RelOptInfo.lateral_vars are filled in
extract_lateral_references.  And then in create_lateral_join_info these
lateral_vars, together with all PlaceHolderInfos, are examined to
compute the per-base-relation lateral refs relids.  However, in the
process of extract_lateral_references, it's possible that we create new
PlaceHolderInfos.  So I think it may not be a good idea to extract the
lateral references within PHVs there.  But I agree with you that it's
also not a good idea to compute these additional lateral Vars within
PHVs in create_lateral_join_info as the patch does.  Actually with the
patch I find that with PHVs that are due to be evaluated at a join we
may get a problematic plan.  For instance

explain (costs off)
select * from t t1 left join lateral
(select t1.a as t1a, t2.a as t2a from t t2 join t t3 on true) s on true
where s.t1a = s.t2a;
 QUERY PLAN
-----------------------------
 Nested Loop
   ->  Seq Scan on t t1
   ->  Nested Loop
 Join Filter: (t1.a = t2.a)
 ->  Seq Scan on t t2
 ->  Memoize
   Cache Key: t1.a
   Cache Mode: binary
   ->  Seq Scan on t t3
(9 rows)

There are no lateral refs in the subtree of the Memoize node, so it
should be a Materialize node rather than a Memoize node.  This is caused
by the fact that, for a PHV that is due to be evaluated at a join, we
fill its lateral refs into each baserel in the join, which is wrong.

So I'm wondering if it'd be better that we move all this logic of
computing additional lateral references within PHVs to get_memoize_path,
where we can examine only PHVs that are evaluated at innerrel.  And
considering that these lateral refs are only used by Memoize, it seems
more sensible to compute them there.  But I'm a little worried that
doing this would make get_memoize_path too expensive.

Please see v4 patch for this change.

Thanks
Richard


v4-0001-Check-lateral-refs-within-PHVs-for-memoize-cache-keys.patch
Description: Binary data


Re: Consistent coding for the naming of LR workers

2023-07-13 Thread Peter Eisentraut

On 13.07.23 06:59, Peter Smith wrote:

On Wed, Jul 12, 2023 at 9:35 PM Peter Eisentraut  wrote:


On 21.06.23 09:18, Alvaro Herrera wrote:

That is a terrible pattern in relatively new code.  Let's get rid of it
entirely rather than continue to propagate it.


So, I don't think it is fair to say that these format strings are OK
for the existing HEAD code, but not OK for the patch code, when they
are both the same.


Agreed.  Let's remove them all.


This is an open issue for PG16 translation.  I propose the attached
patch to fix this.  Mostly, this just reverts to the previous wordings.
(I don't think for these messages the difference between "apply worker"
and "parallel apply worker" is all that interesting to explode the
number of messages.  AFAICT, the table sync worker case wasn't even
used, since callers always handled it separately.)


I thought the get_worker_name function was reachable by tablesync workers also.

Since ApplyWorkerMain is a common entry point for both leader apply
workers and tablesync workers, any logs in that code path could
potentially be from either kind of worker. At least, when the function
was first introduced (around patch v43-0001? [1]) it was also
replacing some tablesync logs.


I suppose we could just say "logical replication worker" in all cases. 
That should be enough precision for the purpose of these messages.






Re: CREATE INDEX CONCURRENTLY on partitioned index

2023-07-13 Thread Alexander Pyhalov

Justin Pryzby wrote on 2023-07-13 05:27:

On Mon, Mar 27, 2023 at 01:28:24PM +0300, Alexander Pyhalov wrote:

Justin Pryzby wrote on 2023-03-26 17:51:
> On Sun, Dec 04, 2022 at 01:09:35PM -0600, Justin Pryzby wrote:
> > This currently handles partitions with a loop around the whole CIC
> > implementation, which means that things like WaitForLockers() happen
> > once for each index, the same as REINDEX CONCURRENTLY on a partitioned
> > table.  Contrast that with ReindexRelationConcurrently(), which handles
> > all the indexes on a table in one pass by looping around indexes within
> > each phase.
>
> Rebased over the progress reporting fix (27f5c712b).
>
> I added a list of (intermediate) partitioned tables, rather than looping
> over the list of inheritors again, to save calling rel_get_relkind().
>
> I think this patch is done.

Overall looks good to me. However, I think that using 'partitioned' as a
list of partitioned index oids in DefineIndex() is a bit misleading -
we've just used it as a boolean, specifying if we are dealing with a
partitioned relation.


Right.  This is also rebased on 8c852ba9a4 (Allow some exclusion
constraints on partitions).


Hi.
I have some more questions.
In the following code (indexcmds.c:1640 and later):

1640     rel = table_open(relationId, ShareUpdateExclusiveLock);
1641     heaprelid = rel->rd_lockInfo.lockRelId;
1642     table_close(rel, ShareUpdateExclusiveLock);
1643     SET_LOCKTAG_RELATION(heaplocktag, heaprelid.dbId, heaprelid.relId);


should we release ShareUpdateExclusiveLock before getting the session
lock in DefineIndexConcurrentInternal()?
Also, we unlock the parent table there between reindexing children, at
the end of DefineIndexConcurrentInternal():


1875     /*
1876      * Last thing to do is release the session-level lock on the parent table.
1877      */
1878     UnlockRelationIdForSession(&heaprelid, ShareUpdateExclusiveLock);
1879 }


Is it safe? Shouldn't we hold a session lock on the parent table while
rebuilding child indexes?
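For clarity, the flow being questioned looks roughly like this. This is a paraphrase assembled from the snippets above, with a presumed matching LockRelationIdForSession() call added for symmetry; it is not the patch's verbatim code:

```
rel = table_open(relationId, ShareUpdateExclusiveLock);
heaprelid = rel->rd_lockInfo.lockRelId;
table_close(rel, ShareUpdateExclusiveLock);    -- transaction-level lock dropped

-- window with no lock held on the parent?
LockRelationIdForSession(&heaprelid, ShareUpdateExclusiveLock);

... build one child index concurrently ...

UnlockRelationIdForSession(&heaprelid, ShareUpdateExclusiveLock);
-- the next child is then rebuilt with the parent unlocked again
```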




--
Best regards,
Alexander Pyhalov,
Postgres Professional