introduce optimized linear search functions that return index of matching element

2022-09-16 Thread Nathan Bossart
On Fri, Sep 16, 2022 at 02:54:14PM +0700, John Naylor wrote:
> v6 demonstrates why this should have been put off towards the end. (more 
> below)

Since the SIMD code is fresh in my mind, I wanted to offer my review for
0001 in the "Improve dead tuple storage for lazy vacuum" thread [0].
However, I agree with John that the SIMD part of that work should be left
for the end, and I didn't want to distract from the radix tree part too
much.  So, here is a new thread for just the SIMD part.

>> I've updated the radix tree patch. It's now separated into two patches.
>>
>> 0001 patch introduces pg_lsearch8() and pg_lsearch8_ge() (we may find
>> better names) that are similar to the pg_lfind8() family but they
>> return the index of the key in the vector instead of true/false. The
>> patch includes regression tests.

I don't think it's clear that the "lfind" functions return whether there is
a match while the "lsearch" functions return the index of the first match.
It might be better to call these something like "pg_lfind8_idx" and
"pg_lfind8_ge_idx" instead.

> +/*
> + * Return the index of the first element in the vector that is greater than
> + * or equal to the given scalar. Return sizeof(Vector8) if there is no such
> + * element.
>
> That's a bizarre API to indicate non-existence.

+1.  It should probably just return -1 in that case.

> + *
> + * Note that this function assumes the elements in the vector are sorted.
> + */
>
> That is *completely* unacceptable for a general-purpose function.

+1

> +#else /* USE_NO_SIMD */
> + Vector8 r = 0;
> + uint8 *rp = (uint8 *) 
> +
> + for (Size i = 0; i < sizeof(Vector8); i++)
> + rp[i] = (((const uint8 *) )[i] == ((const uint8 *) )[i]) ? 0xFF : 0;
>
> I don't think we should try to force the non-simd case to adopt the
> special semantics of vector comparisons. It's much easier to just use
> the same logic as the assert builds.

+1

> +#ifdef USE_SSE2
> + return (uint32) _mm_movemask_epi8(v);
> +#elif defined(USE_NEON)
> + static const uint8 mask[16] = {
> +1 << 0, 1 << 1, 1 << 2, 1 << 3,
> +1 << 4, 1 << 5, 1 << 6, 1 << 7,
> +1 << 0, 1 << 1, 1 << 2, 1 << 3,
> +1 << 4, 1 << 5, 1 << 6, 1 << 7,
> +  };
> +
> +uint8x16_t masked = vandq_u8(vld1q_u8(mask), (uint8x16_t)
> vshrq_n_s8(v, 7));
> +uint8x16_t maskedhi = vextq_u8(masked, masked, 8);
> +
> +return (uint32) vaddvq_u16((uint16x8_t) vzip1q_u8(masked, maskedhi));
>
> For Arm, we need to be careful here. This article goes into a lot of
> detail for this situation:
>
> https://community.arm.com/arm-community-blogs/b/infrastructure-solutions-blog/posts/porting-x86-vector-bitmask-optimizations-to-arm-neon

The technique demonstrated in this article seems to work nicely.

For these kinds of patches, I find the best way to review them is to try
out my proposed changes as I'm reading through the patch.  I hope you don't
mind that I've done so here and attached a new version of the patch.  In
addition to addressing the aforementioned feedback, I made the following
changes:

* I renamed the vector8_search_* functions to vector8_find() and
vector8_find_ge().  IMO this is more in the spirit of existing function
names like vector8_has().

* I simplified vector8_find_ge() by essentially making it do the opposite
of what vector8_has_le() does (i.e., using saturating subtraction to find
matching bytes).  This removes the need for vector8_min(), and since
vector8_find_ge() can just call vector8_search() to find any 0 bytes,
vector8_highbit_mask() can be removed as well.

* I simplified the test for pg_lfind8_ge_idx() by making it look a little
more like the test for pg_lfind32().  I wasn't sure about the use of rand()
and qsort(), and overall it just felt a little too complicated to me.

I've tested all three code paths (i.e., SSE2, Neon, and USE_NO_SIMD), but I
haven't done any performance analysis yet.

[0] 
https://postgr.es/m/CAD21AoD3w76wERs_Lq7_uA6%2BgTaoOERPji%2BYz8Ac6aui4JwvTg%40mail.gmail.com

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
From fcbb68b7d2bb9df63c92bc773240873e1e27a5a8 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Fri, 16 Sep 2022 20:44:03 -0700
Subject: [PATCH v1 1/1] introduce pg_lfind8_idx and pg_lfind8_ge_idx

---
 src/include/port/pg_lfind.h   |  72 +
 src/include/port/simd.h   | 100 ++
 .../test_lfind/expected/test_lfind.out|  12 +++
 .../modules/test_lfind/sql/test_lfind.sql |   2 +
 .../modules/test_lfind/test_lfind--1.0.sql|   8 ++
 src/test/modules/test_lfind/test_lfind.c  |  81 +-
 6 files changed, 274 insertions(+), 1 deletion(-)

diff --git a/src/include/port/pg_lfind.h b/src/include/port/pg_lfind.h
index 0625cac6b5..34cf30e591 100644
--- a/src/include/port/pg_lfind.h
+++ b/src/include/port/pg_lfind.h
@@ -48,6 +48,42 @@ pg_lfind8(uint8 key, uint8 *base, uint32 nelem)
 	return false;
 }
 
+/*
+ * pg_lfind8_idx
+ *
+ * Return index 

Re: ICU for global collation

2022-09-16 Thread Peter Eisentraut

On 16.09.22 08:49, Marina Polyakova wrote:
But perhaps the check that --icu-locale cannot be specified unless 
locale provider icu is chosen should also be moved here? So all these 
checks will be in one place and it will use the provider from the 
template database (which could be icu):


$ initdb --locale-provider icu --icu-locale en-US -D data &&
pg_ctl -D data -l logfile start &&
createdb --icu-locale ru-RU --template template0 mydb
...
createdb: error: database creation failed: ERROR:  ICU locale cannot be 
specified unless locale provider is ICU


Can you be more specific about what you are proposing here?  I'm not 
following.






RE: why can't a table be part of the same publication as its schema

2022-09-16 Thread houzj.f...@fujitsu.com
On Saturday, September 17, 2022 11:22 AM Amit Kapila  
wrote:
> 
> On Fri, Sep 16, 2022 at 1:09 PM houzj.f...@fujitsu.com 
> 
> wrote:
> >
> > Attach the new version patch which addressed above comments and ran
> > pgindent.
> > I also improved some codes and documents based on some comments given
> > by Vignesh and Peter offlist.
> >
> 
> Thanks, the patch looks mostly good to me. I have made a few cosmetic changes
> and edited a few comments. I would like to push this to HEAD and backpatch it
> to 15 by Tuesday unless there are any comments. I think we should back patch
> this because otherwise, users will see a change in behavior in 16 but if 
> others
> don't think the same way then we can consider pushing this to HEAD only.

Thanks for the patch.
I rebased it based on PG15 and here is the patch.

Best regards,
Hou zj


v5-0001-Allow-publications-with-schema-and-table-of-the-s.patch
Description:  v5-0001-Allow-publications-with-schema-and-table-of-the-s.patch


v5-PG15-0001-Allow-publications-with-schema-and-table-of-the-s.patch
Description:  v5-PG15-0001-Allow-publications-with-schema-and-table-of-the-s.patch


Re: Make mesage at end-of-recovery less scary.

2022-09-16 Thread Justin Pryzby
@cfbot: rebased over adb466150, which did the same thing as one of the
hunks in xlogreader.c.
From c4069bb7181b68d742d2025567f859e69d24f513 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Thu, 7 Jul 2022 11:51:45 +0900
Subject: [PATCH] Make End-Of-Recovery error less scary

When recovery of any type ends, we see a rather scary error message
like "invalid record length" that suggests something serious is
happening. Actually, if recovery meets a record with length = 0, that
usually means it has finished applying all available WAL records.

Make this message less scary, as "reached end of WAL". Instead, raise
the error level for other kinds of WAL failures to WARNING.
---
 src/backend/access/transam/xlogreader.c   | 134 +-
 src/backend/access/transam/xlogrecovery.c |  95 +++
 src/backend/replication/walreceiver.c |   7 +-
 src/bin/pg_waldump/pg_waldump.c   |  13 ++-
 src/include/access/xlogreader.h   |   1 +
 src/test/recovery/t/011_crash_recovery.pl | 106 +
 6 files changed, 298 insertions(+), 58 deletions(-)

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 050d2f424e4..9b8f29d0ad0 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -48,6 +48,8 @@ static int	ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr,
 			 int reqLen);
 static void XLogReaderInvalReadState(XLogReaderState *state);
 static XLogPageReadResult XLogDecodeNextRecord(XLogReaderState *state, bool non_blocking);
+static bool ValidXLogRecordLength(XLogReaderState *state, XLogRecPtr RecPtr,
+  XLogRecord *record);
 static bool ValidXLogRecordHeader(XLogReaderState *state, XLogRecPtr RecPtr,
   XLogRecPtr PrevRecPtr, XLogRecord *record, bool randAccess);
 static bool ValidXLogRecord(XLogReaderState *state, XLogRecord *record,
@@ -149,6 +151,7 @@ XLogReaderAllocate(int wal_segment_size, const char *waldir,
 		pfree(state);
 		return NULL;
 	}
+	state->EndOfWAL = false;
 	state->errormsg_buf[0] = '\0';
 
 	/*
@@ -558,6 +561,7 @@ XLogDecodeNextRecord(XLogReaderState *state, bool nonblocking)
 	/* reset error state */
 	state->errormsg_buf[0] = '\0';
 	decoded = NULL;
+	state->EndOfWAL = false;
 
 	state->abortedRecPtr = InvalidXLogRecPtr;
 	state->missingContrecPtr = InvalidXLogRecPtr;
@@ -640,25 +644,21 @@ restart:
 	Assert(pageHeaderSize <= readOff);
 
 	/*
-	 * Read the record length.
+	 * Validate the record header.
 	 *
-	 * NB: Even though we use an XLogRecord pointer here, the whole record
-	 * header might not fit on this page. xl_tot_len is the first field of the
-	 * struct, so it must be on this page (the records are MAXALIGNed), but we
-	 * cannot access any other fields until we've verified that we got the
-	 * whole header.
-	 */
-	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
-	total_len = record->xl_tot_len;
-
-	/*
-	 * If the whole record header is on this page, validate it immediately.
-	 * Otherwise do just a basic sanity check on xl_tot_len, and validate the
-	 * rest of the header after reading it from the next page.  The xl_tot_len
+	 * Even though we use an XLogRecord pointer here, the whole record header
+	 * might not fit on this page.  If the whole record header is on this page,
+	 * validate it immediately.  Even otherwise xl_tot_len must be on this page
+	 * (it is the first field of MAXALIGNed records), but we still cannot
+	 * access any further fields until we've verified that we got the whole
+	 * header, so do just a basic sanity check on record length, and validate
+	 * the rest of the header after reading it from the next page.  The length
 	 * check is necessary here to ensure that we enter the "Need to reassemble
 	 * record" code path below; otherwise we might fail to apply
 	 * ValidXLogRecordHeader at all.
 	 */
+	record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
+
 	if (targetRecOff <= XLOG_BLCKSZ - SizeOfXLogRecord)
 	{
 		if (!ValidXLogRecordHeader(state, RecPtr, state->DecodeRecPtr, record,
@@ -668,18 +668,14 @@ restart:
 	}
 	else
 	{
-		/* XXX: more validation should be done here */
-		if (total_len < SizeOfXLogRecord)
-		{
-			report_invalid_record(state,
-  "invalid record length at %X/%X: wanted %u, got %u",
-  LSN_FORMAT_ARGS(RecPtr),
-  (uint32) SizeOfXLogRecord, total_len);
+		if (!ValidXLogRecordLength(state, RecPtr, record))
 			goto err;
-		}
+
 		gotheader = false;
 	}
 
+	total_len = record->xl_tot_len;
+
 	/*
 	 * Find space to decode this record.  Don't allow oversized allocation if
 	 * the caller requested nonblocking.  Otherwise, we *have* to try to
@@ -1106,16 +1102,47 @@ XLogReaderInvalReadState(XLogReaderState *state)
 }
 
 /*
- * Validate an XLOG record header.
+ * Validate record length of an XLOG record header.
  *
- * This is just a convenience subroutine to avoid duplicated code in
- * XLogReadRecord.  It's not intended for 

Re: why can't a table be part of the same publication as its schema

2022-09-16 Thread Amit Kapila
On Fri, Sep 16, 2022 at 1:09 PM houzj.f...@fujitsu.com
 wrote:
>
> Attach the new version patch which addressed above comments and ran pgindent.
> I also improved some codes and documents based on some comments
> given by Vignesh and Peter offlist.
>

Thanks, the patch looks mostly good to me. I have made a few cosmetic
changes and edited a few comments. I would like to push this to HEAD
and backpatch it to 15 by Tuesday unless there are any comments. I
think we should back patch this because otherwise, users will see a
change in behavior in 16 but if others don't think the same way then
we can consider pushing this to HEAD only.

-- 
With Regards,
Amit Kapila.


v5-0001-Allow-publications-with-schema-and-table-of-the-s.patch
Description: Binary data


Re: Making C function declaration parameter names consistent with corresponding definition names

2022-09-16 Thread Peter Geoghegan
On Fri, Sep 16, 2022 at 6:20 PM Tom Lane  wrote:
> I think they're easily Stroustrup's worst idea ever.  You're basically
> throwing away an opportunity for documentation, and that documentation
> is often sorely needed.

He could at least point to C++ pure virtual functions, where omitting
a parameter name in the base class supposedly conveys useful
information. I don't find that argument particularly convincing
myself, even in a C++ context, but at least it's an argument. Doesn't
apply here in any case.

> I'd view the current state of reorderbuffer.h as pretty unacceptable on
> stylistic grounds no matter which position you take.  Having successive
> declarations randomly using named or unnamed parameters is seriously
> ugly and distracting, at least to my eye.  We don't need such blatant
> reminders of how many cooks have stirred this broth.

I'll come up with a revision that deals with that too, then. Shouldn't
be too much more work.

I suppose that I ought to backpatch a fix for the really egregious
issue in hba.h, and leave it at that on stable branches.

-- 
Peter Geoghegan




Re: Making C function declaration parameter names consistent with corresponding definition names

2022-09-16 Thread Tom Lane
Peter Geoghegan  writes:
> The check that I used to write the patches doesn't treat unnamed
> parameters in a function declaration as an inconsistency, even when
> "strict" is used. Another nearby check *could* be used to catch
> unnamed parameters [1] if that was deemed useful, though. How do you
> feel about unnamed parameters?

I think they're easily Stroustrup's worst idea ever.  You're basically
throwing away an opportunity for documentation, and that documentation
is often sorely needed.  Handy example:

extern void ReorderBufferCommitChild(ReorderBuffer *, TransactionId, 
TransactionId,
 XLogRecPtr commit_lsn, XLogRecPtr end_lsn);

Which TransactionId parameter is which?  You might be tempted to guess,
if you think you remember how the function works, and that is a recipe
for bugs.

I'd view the current state of reorderbuffer.h as pretty unacceptable on
stylistic grounds no matter which position you take.  Having successive
declarations randomly using named or unnamed parameters is seriously
ugly and distracting, at least to my eye.  We don't need such blatant
reminders of how many cooks have stirred this broth.

regards, tom lane




Re: Tree-walker callbacks vs -Wdeprecated-non-prototype

2022-09-16 Thread Tom Lane
I wrote:
> Ugh.  I wonder if we can get away with declaring the walker arguments
> as something like "bool (*walker) (Node *, void *)" without having
> to change all the actual walkers to be exactly that signature.
> Having to insert casts in the walkers would be a major pain-in-the-butt.

No joy on that: both gcc and clang want the walkers to be declared
as taking exactly "void *".

Attached is an incomplete POC patch that suppresses these warnings
in nodeFuncs.c itself and in costsize.c, which I selected at random
as a typical caller.  I'll push forward with converting the other
call sites if this way seems good to people.

In nodeFuncs.c, we can hide the newly-required casts inside macros;
indeed, the mutators barely need any changes because they already
had MUTATE() macros that contained casts.  So on that side, it feels
to me that this is actually a bit nicer than before.

For the callers, we can either do it as I did below:

 static bool
-cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
+cost_qual_eval_walker(Node *node, void *ctx)
 {
+   cost_qual_eval_context *context = (cost_qual_eval_context *) ctx;
+
if (node == NULL)
return false;

or perhaps like this:

 static bool
-cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
+cost_qual_eval_walker(Node *node, void *context)
 {
+   cost_qual_eval_context *cqctx = (cost_qual_eval_context *) context;
+
if (node == NULL)
return false;

but the latter would require changing references further down in the
function, so I felt it more invasive.

It's sad to note that this exercise in hoop-jumping actually leaves
us with net LESS type safety, because the outside callers of
cost_qual_eval_walker are no longer constrained to call it with
the appropriate kind of context struct.  Thanks, C committee.

regards, tom lane

diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 3bac350bf5..1e2ae3a5a4 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -27,10 +27,10 @@
 static bool expression_returns_set_walker(Node *node, void *context);
 static int	leftmostLoc(int loc1, int loc2);
 static bool fix_opfuncids_walker(Node *node, void *context);
-static bool planstate_walk_subplans(List *plans, bool (*walker) (),
+static bool planstate_walk_subplans(List *plans, tree_walker_callback walker,
 	void *context);
 static bool planstate_walk_members(PlanState **planstates, int nplans,
-   bool (*walker) (), void *context);
+   tree_walker_callback walker, void *context);
 
 
 /*
@@ -1836,7 +1836,7 @@ check_functions_in_node(Node *node, check_function_callback checker,
  * that modify nodes in-place but never add/delete/replace nodes).
  * A walker routine should look like this:
  *
- * bool my_walker (Node *node, my_struct *context)
+ * bool my_walker (Node *node, void *context)
  * {
  *		if (node == NULL)
  *			return false;
@@ -1850,7 +1850,7 @@ check_functions_in_node(Node *node, check_function_callback checker,
  *			... do special actions for other node types
  *		}
  *		// for any node type not specially processed, do:
- *		return expression_tree_walker(node, my_walker, (void *) context);
+ *		return expression_tree_walker(node, my_walker, context);
  * }
  *
  * The "context" argument points to a struct that holds whatever context
@@ -1910,7 +1910,7 @@ check_functions_in_node(Node *node, check_function_callback checker,
 
 bool
 expression_tree_walker(Node *node,
-	   bool (*walker) (),
+	   tree_walker_callback walker,
 	   void *context)
 {
 	ListCell   *temp;
@@ -1923,6 +1923,10 @@ expression_tree_walker(Node *node,
 	 * when we expect a List we just recurse directly to self without
 	 * bothering to call the walker.
 	 */
+#define WALK(n) walker((Node *) (n), context)
+
+#define LIST_WALK(l) expression_tree_walker((Node *) (l), walker, context)
+
 	if (node == NULL)
 		return false;
 
@@ -1946,25 +1950,21 @@ expression_tree_walker(Node *node,
 			/* primitive node types with no expression subnodes */
 			break;
 		case T_WithCheckOption:
-			return walker(((WithCheckOption *) node)->qual, context);
+			return WALK(((WithCheckOption *) node)->qual);
 		case T_Aggref:
 			{
 Aggref	   *expr = (Aggref *) node;
 
-/* recurse directly on List */
-if (expression_tree_walker((Node *) expr->aggdirectargs,
-		   walker, context))
+/* recurse directly on Lists */
+if (LIST_WALK(expr->aggdirectargs))
 	return true;
-if (expression_tree_walker((Node *) expr->args,
-		   walker, context))
+if (LIST_WALK(expr->args))
 	return true;
-if (expression_tree_walker((Node *) expr->aggorder,
-		   walker, context))
+if (LIST_WALK(expr->aggorder))
 	return true;
-if (expression_tree_walker((Node *) expr->aggdistinct,
-		   walker, context))
+if (LIST_WALK(expr->aggdistinct))
 	return true;
-

Re: Make ON_ERROR_STOP stop on shell script failure

2022-09-16 Thread bt22nakamorit

On 2022-09-16 17:30, Kyotaro Horiguchi wrote:

At Fri, 16 Sep 2022 15:55:33 +0900, bt22nakamorit
 wrote in

Hi,

"""\set ON_ERROR_STOP on""" stops any subsequent incoming query that
comes after an SQL error, but does not stop after a shell script
run by """\! """ returning a value other than 0, -1, or
127, which suggests a failure in the result of the shell script.

For example, suppose that below is an SQL file.
\set ON_ERROR_STOP on
SELECT 1;
\! false
SELECT 2;

The current design allows SELECT 2 to run even though the shell script


Since the "false" command did not "error out"?


I find this behavior rather unexpected since, based on the name
"""ON_ERROR_STOP""", one may expect that shell script failures
would halt subsequent instructions as well.
One clear solution is to let shell script failures stop subsequent
queries just as SQL errors do currently. Thoughts?


I'm not sure we want to regard any exit status from a successful run as
a failure.

On the other hand, the proposed behavior seems useful to me.

So +1 from me to the proposal, assuming the corresponding edit of the
documentation happens.

regards.



Since the "false" command did not "error out"?
The "false" command returns 1, which is an exit status code that
indicates failure, but not an error.

I think it does not "error out" if that is what you mean.
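For clarity about the terminology, a shell-level illustration (independent of psql): "false" executes fine in the sense that it runs to completion, but its exit status signals failure.

```shell
# "false" runs without erroring out, yet exits with status 1,
# the conventional failure code that psql currently ignores for \!
false
status=$?
echo "exit status: $status"
```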


So +1 from me to the proposal, assuming the corresponding edit of the
documentation happens.

I will work on editing the documentation and share further updates.

Thank you!
Tatsu




Re: SLRUs in the main buffer pool, redux

2022-09-16 Thread Bagga, Rishu
Hi Thomas,

While I was working on adding the page headers to SLRU pages on your patch, I 
came across this code where it seems like "MultiXactIdToMemberPage" is 
mistakenly being used instead of MultiXactIdToOffsetPage in the TrimMultiXact 
function.

Below is the area of concern in the patch:

@@ -2045,14 +1977,7 @@ TrimMultiXact(void)
oldestMXactDB = MultiXactState->oldestMultiXactDB;
LWLockRelease(MultiXactGenLock);
 
-   /* Clean up offsets state */
-   LWLockAcquire(MultiXactOffsetSLRULock, LW_EXCLUSIVE);
-
-   /*
-* (Re-)Initialize our idea of the latest page number for offsets.
-*/
-   pageno = MultiXactIdToOffsetPage(nextMXact);
-   MultiXactOffsetCtl->shared->latest_page_number = pageno;
+   pageno = MXOffsetToMemberPage(offset);


Let us know if I am missing something here or if it is an error.

Sincerely,

Rishu Bagga (Amazon Web Services)

On 9/16/22, 5:37 PM, "Thomas Munro"  wrote:




Rebased, debugged and fleshed out a tiny bit more, but still with
plenty of TODO notes and questions.  I will talk about this idea at
PGCon, so I figured it'd help to have a patch that actually applies,
even if it doesn't work quite right yet.  It's quite a large patch but
that's partly because it removes a lot of lines...



Re: Making C function declaration parameter names consistent with corresponding definition names

2022-09-16 Thread Peter Geoghegan
On Fri, Sep 16, 2022 at 4:49 PM Tom Lane  wrote:
> Agreed; I see no need to tolerate any inconsistency.

The check that I used to write the patches doesn't treat unnamed
parameters in a function declaration as an inconsistency, even when
"strict" is used. Another nearby check *could* be used to catch
unnamed parameters [1] if that was deemed useful, though. How do you
feel about unnamed parameters?

Many of the function declarations from reorderbuffer.h will be
affected if we decide that we don't want to allow unnamed parameters
-- it's quite noticeable there. I myself lean towards not allowing
unnamed parameters. (Though perhaps I should reserve judgement until
after I've measured just how prevalent unnamed parameters are.)

> Yeah.  I'd be inclined to handle it about like cpluspluscheck:
> provide a script that people can run from time to time, but
> don't insist that it's a commit-blocker.

My thoughts exactly.

[1] 
https://releases.llvm.org/14.0.0/tools/clang/tools/extra/docs/clang-tidy/checks/readability-named-parameter.html
-- 
Peter Geoghegan




Re: Making C function declaration parameter names consistent with corresponding definition names

2022-09-16 Thread Tom Lane
Peter Geoghegan  writes:
> It's possible to configure the clang-tidy tooling to tolerate various
> inconsistencies, below some kind of threshold -- it is totally
> customizable. But I think that a strict, simple rule is the way to go
> here.

Agreed; I see no need to tolerate any inconsistency.

> (Though without creating busy work for committers that don't
> want to use clang-tidy all the time.)

Yeah.  I'd be inclined to handle it about like cpluspluscheck:
provide a script that people can run from time to time, but
don't insist that it's a commit-blocker.  (I wouldn't be unhappy
to see the cfbot include this in its compiler warnings suite,
though, once we get rid of the existing instances.)

regards, tom lane




Re: Making C function declaration parameter names consistent with corresponding definition names

2022-09-16 Thread Peter Geoghegan
On Fri, Sep 16, 2022 at 4:19 PM Tom Lane  wrote:
> I agree, this has always been a pet peeve of mine as well.  I would
> have guessed there were fewer examples than you found, because I've
> generally fixed any such cases I happened to notice.

If you actually go through them all one by one you'll see that the
vast majority of individual cases involve an inconsistency that
follows some kind of recognizable pattern. For example, a Relation
parameter might be spelled "relation" in one place and "rel" in
another. I find these more common cases much less noticeable --
perhaps that's why there are more than you thought there'd be?

It's possible to configure the clang-tidy tooling to tolerate various
inconsistencies, below some kind of threshold -- it is totally
customizable. But I think that a strict, simple rule is the way to go
here. (Though without creating busy work for committers that don't
want to use clang-tidy all the time.)
-- 
Peter Geoghegan




Re: Making C function declaration parameter names consistent with corresponding definition names

2022-09-16 Thread Tom Lane
Peter Geoghegan  writes:
> I have to admit that these inconsistencies are a pet peeve of mine. I
> find them distracting, and have a history of fixing them on an ad-hoc
> basis. But there are real practical arguments in favor of being strict
> about it as a matter of policy -- it's not *just* neatnikism.

I agree, this has always been a pet peeve of mine as well.  I would
have guessed there were fewer examples than you found, because I've
generally fixed any such cases I happened to notice.

regards, tom lane




Re: clang 15 doesn't like our JIT code

2022-09-16 Thread Tom Lane
Thomas Munro  writes:
> there's also the walker stuff[1] to address.

Yeah.  I just did some experimentation with that, and it looks like
neither gcc nor clang will cut you any slack at all for declaring
an argument as "void *": given say

typedef bool (*tree_walker_callback) (Node *node, void *context);

the walker functions also have to be declared with exactly "void *"
as their second argument.  So it's going to be just as messy and
full-of-casts as we feared.  Still, I'm not sure we have any
alternative.

regards, tom lane




RFC: Logging plan of the running query

2022-09-16 Thread a.rybakina

Hi,

I like this idea, and after reviewing the code I noticed a few points
and would like to ask you some questions.



Firstly, I suggest some edits to the commit message. I think this
version turns out more concise while staying just as clear. I wrote it
below since I can't think of any other way to include it.


```
Currently, we have to wait for the query execution to finish in order
to check its plan. This is inconvenient when investigating long-running
queries on production environments where we cannot use debuggers.

To improve this situation, this patch adds the pg_log_query_plan()
function, which requests logging the plan of the specified backend
process.

By default, only superusers are allowed to request the plan, because
allowing any user to issue this request could produce lots of log
messages and could lead to denial of service.

At the next CHECK_FOR_INTERRUPTS(), the target backend logs its plan at
LOG_SERVER_ONLY level, so the plan appears in the server log only and
is not sent to the client.
```

Secondly, I have a question about deleting USE_ASSERT_CHECKING in
lock.h. The value is supposed to be checked elsewhere in the code by
matching values. I am worried about missing errors when building
without assertions in the places where GetLockMethodLocalHash
participates, and that we won't be able to get a core file in segfault
cases. I might not understand something; could you please explain it
to me?


Thirdly, I don't understand why save_ActiveQueryDesc is declared in
pquery.h. It seems to be used only once, in the ExecutorRun function,
so its declaration there looks superfluous. I made this change in the
attached patch.


Fourthly, it seems to me there are not enough explanatory comments in
the code. I also added some in the attached patch.


Lastly, I'm unclear about the signal handling, since I haven't worked
with it before. Could another signal prevent this one from logging the
query plan? I noticed this signal is checked last in the
procsignal_sigusr1_handler function.


Regards,

--
Alena Rybakina
Postgres Professional
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index a82ac87457e..65b692b0ddf 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -306,6 +306,12 @@ ExecutorRun(QueryDesc *queryDesc,
 {
 	QueryDesc *save_ActiveQueryDesc;
 
+	/*
+	 * Save the value of ActiveQueryDesc before calling the
+	 * ExecutorRun_hook function, because it may be reset by
+	 * AbortSubTransaction.
+	 */
+
 	save_ActiveQueryDesc = ActiveQueryDesc;
 	ActiveQueryDesc = queryDesc;
 
@@ -314,6 +320,7 @@ ExecutorRun(QueryDesc *queryDesc,
 	else
 		standard_ExecutorRun(queryDesc, direction, count, execute_once);
 
+	/* Restore the saved value of ActiveQueryDesc */
 	ActiveQueryDesc = save_ActiveQueryDesc;
 }
 
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index fc9f9f8e3f0..8e7ce3c976f 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -126,6 +126,7 @@ extern void ExplainOpenGroup(const char *objtype, const char *labelname,
 extern void ExplainCloseGroup(const char *objtype, const char *labelname,
 			  bool labeled, ExplainState *es);
 
+/* Function to handle the signal to output the query plan. */
 extern void HandleLogQueryPlanInterrupt(void);
 extern void ProcessLogQueryPlanInterrupt(void);
 #endif			/* EXPLAIN_H */
diff --git a/src/include/tcop/pquery.h b/src/include/tcop/pquery.h
index 227d24b9d60..d04de1f5566 100644
--- a/src/include/tcop/pquery.h
+++ b/src/include/tcop/pquery.h
@@ -22,7 +22,6 @@ struct PlannedStmt;/* avoid including plannodes.h here */
 
 extern PGDLLIMPORT Portal ActivePortal;
 extern PGDLLIMPORT QueryDesc *ActiveQueryDesc;
-extern PGDLLIMPORT QueryDesc *save_ActiveQueryDesc;
 
 
 extern PortalStrategy ChoosePortalStrategy(List *stmts);


Re: clang 15 doesn't like our JIT code

2022-09-16 Thread Thomas Munro
On Sat, Sep 17, 2022 at 6:45 AM Andres Freund  wrote:
> On 2022-09-16 11:40:46 -0400, Tom Lane wrote:
> > According to
> >
> > https://bugzilla.redhat.com/show_bug.cgi?id=2127503
> >
> > bleeding-edge clang complains thusly:
> >
> > llvmjit_inline.cpp: In function 'std::unique_ptr 
> > llvm_load_summary(llvm::StringRef)':
> > llvmjit_inline.cpp:771:37: error: incomplete type 'llvm::MemoryBuffer' used 
> > in nested name specifier
> >   771 | llvm::MemoryBuffer::getFile(path);
> >   | ^~~
> > In file included from /usr/include/c++/12/memory:76,
> >  from /usr/include/llvm/ADT/SmallVector.h:28,
> >  from /usr/include/llvm/ADT/ArrayRef.h:14,
> >  from /usr/include/llvm/ADT/SetVector.h:23,
> >  from llvmjit_inline.cpp:48:
> > /usr/include/c++/12/bits/unique_ptr.h: In instantiation of 'void 
> > std::default_delete<_Tp>::operator()(_Tp*) const [with _Tp = 
> > llvm::MemoryBuffer]':
> > /usr/include/c++/12/bits/unique_ptr.h:396:17:   required from 
> > 'std::unique_ptr<_Tp, _Dp>::~unique_ptr() [with _Tp = llvm::MemoryBuffer; 
> > _Dp = std::default_delete]'
> > /usr/include/llvm/Support/ErrorOr.h:142:34:   required from 
> > 'llvm::ErrorOr::~ErrorOr() [with T = 
> > std::unique_ptr]'
> > llvmjit_inline.cpp:771:35:   required from here
> > /usr/include/c++/12/bits/unique_ptr.h:93:23: error: invalid application of 
> > 'sizeof' to incomplete type 'llvm::MemoryBuffer'
> >93 | static_assert(sizeof(_Tp)>0,
> >   |   ^~~
> >
> > I suspect this is less about clang and more about LLVM APIs,
> > but anyway it seems like we gotta fix something.
>
> Yea, there's definitely a bunch of llvm 15 issues that need to be fixed - this
> particular failure is pretty easy to fix, but there's some others that are
> harder. They redesigned a fairly core part of the IR representation. Thomas
> has a WIP fix, I think.

Yes, I've been working on this and will try to have a patch on the
list in a few days.  There are also a few superficial changes to
names, arguments, headers etc like the one reported there, but the
real problem is that it aborts at runtime when JIT stuff happens, so I
didn't want to push changes for the superficial things without
addressing that or someone might get a nasty surprise.  Separately,
there's also the walker stuff[1] to address.

[1] 
https://www.postgresql.org/message-id/flat/CA%2BhUKGKpHPDTv67Y%2Bs6yiC8KH5OXeDg6a-twWo_xznKTcG0kSA%40mail.gmail.com




Re: [PoC] Let libpq reject unexpected authentication requests

2022-09-16 Thread Jacob Champion
On Fri, Sep 16, 2022 at 7:56 AM Peter Eisentraut
 wrote:
> On 08.09.22 20:18, Jacob Champion wrote:
> After thinking about this a bit more, I think it would be best if the
> words used here match exactly with what is used in pg_hba.conf.  That's
> the only thing the user cares about: reject "password", reject "trust",
> require "scram-sha-256", etc.  How this maps to the protocol and that
> some things are SASL or not is not something they have needed to care
> about and don't really need to know for this.  So I would suggest to
> organize it that way.

I tried that in v1, if you'd like to see what that ends up looking
like. As a counterexample, I believe `cert` auth looks identical to
`trust` on the client side. (The server always asks for a client
certificate even if it doesn't use it. Otherwise, this proposal would
probably have looked different.) And `ldap` auth is indistinguishable
from `password`, etc. In my opinion, how it maps to the protocol is
more honest to the user than how it maps to HBA, because the auth
request sent by the protocol determines your level of risk.

I also like `none` over `trust` because you don't have to administer a
server to understand what it means. That's why I was on board with
your proposal to change the name of `password`. And you don't have to
ignore the natural meaning of client-side "trust", which IMO means
"trust the server." There's opportunity for confusion either way,
unfortunately, but naming them differently may help make it clear that
they _are_ different.

This problem overlaps a bit with the last remaining TODO in the code.
I treat gssenc tunnels as satisfying require_auth=gss. Maybe that's
useful, because it kind of maps to how HBA treats it? But it's not
consistent with the TLS side of things, and it overlaps with
gssencmode=require, complicating the relationship with the new
sslcertmode.

> Another idea:  Maybe instead of the "!" syntax, use two settings,
> require_auth and reject_auth?  Might be simpler?

Might be. If that means we have to handle the case where both are set
to something, though, it might make things harder.

We can error out if they conflict, which adds a decent but not huge
amount of complication. Or we can require that only one is set, which
is both easy and overly restrictive. But either choice makes it harder
to adopt a `reject password` default, as many people seem to be
interested in doing. Because if you want to override that default,
then you have to first unset reject_auth and then set require_auth, as
opposed to just saying require_auth=something and being done with it.
I'm not sure that's worth it. Thoughts?

I'm happy to implement proofs of concept for that, or any other ideas,
given the importance of getting this "right enough" the first time.
Just let me know.

Thanks,
--Jacob




Re: [RFC] building postgres with meson - v13

2022-09-16 Thread Tom Lane
Andres Freund  writes:
> On 2022-09-16 09:14:20 +1200, Thomas Munro wrote:
>> On Thu, Sep 15, 2022 at 2:26 PM Andres Freund  wrote:
>>> - noticed that libpgport.a had and needed a dependency on errcodes.h - that
>>> seemed wrong. The dependency is due to src/port/*p{read,write}v?.c including
>>> postgres.h - which seems wrong. So I added a patch changing them to include
>>> c.h.

>> Oops.  +1

> Looks like this has been the case since
> 0d56acfbaa799553c0c6ea350fd6e68d81025994 in 14. Any opinions on whether we
> should backpatch the "fix"?

+1, those files have no business including all of postgres.h

regards, tom lane




Re: clang 15 doesn't like our JIT code

2022-09-16 Thread Andres Freund
Hi,

On 2022-09-16 16:07:51 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2022-09-16 11:40:46 -0400, Tom Lane wrote:
> >> I suspect this is less about clang and more about LLVM APIs,
> >> but anyway it seems like we gotta fix something.
>
> > Yea, there's definitely a bunch of llvm 15 issues that need to be fixed - 
> > this
> > particular failure is pretty easy to fix, but there's some others that are
> > harder. They redesigned a fairly core part of the IR representation. Thomas
> > has a WIP fix, I think.
>
> I'm more and more getting the feeling that we're interfacing with LLVM
> at too low a level, because it seems like our code is constantly breaking.
> Do they just not have any stable API at all?

I don't think it's the wrong level. While LLVM has a subset of the API that's
supposed to be stable, and we mostly use only that subset, they've definitely
been breaking it more and more frequently. From my observation, that's
because more and more of the development is done by Google and Facebook, which
internally use monorepos and vendor LLVM - that kind of model makes API
changes much less of an issue.  OTOH, the IR breakage (and a few prior related
ones) is about fixing a design issue they've been talking about fixing for 10+
years...

Greetings,

Andres Freund




Re: clang 15 doesn't like our JIT code

2022-09-16 Thread Tom Lane
Andres Freund  writes:
> On 2022-09-16 11:40:46 -0400, Tom Lane wrote:
>> I suspect this is less about clang and more about LLVM APIs,
>> but anyway it seems like we gotta fix something.

> Yea, there's definitely a bunch of llvm 15 issues that need to be fixed - this
> particular failure is pretty easy to fix, but there's some others that are
> harder. They redesigned a fairly core part of the IR representation. Thomas
> has a WIP fix, I think.

I'm more and more getting the feeling that we're interfacing with LLVM
at too low a level, because it seems like our code is constantly breaking.
Do they just not have any stable API at all?

regards, tom lane




Re: Pruning never visible changes

2022-09-16 Thread Laurenz Albe
On Fri, 2022-09-16 at 10:26 -0400, Tom Lane wrote:
> Simon Riggs  writes:
> > A user asked me whether we prune never visible changes, such as
> > BEGIN;
> > INSERT...
> > UPDATE.. (same row)
> > COMMIT;
> 
> Didn't we just have this discussion in another thread?

For reference: that was
https://postgr.es/m/f6a491b32cb44bb5daaafec835364f7149348fa1.ca...@cybertec.at

Yours,
Laurenz Albe




Re: Fix gin index cost estimation

2022-09-16 Thread Tom Lane
Ronan Dunklau  writes:
> The attached does that and is much simpler. I only took into account 
> entryPagesFetched, not sure if we should also charge something for data pages.

I think this is wrong, because there is already a CPU charge based on
the number of tuples visited, down at the very end of the routine:

*indexTotalCost += (numTuples * *indexSelectivity) * 
(cpu_index_tuple_cost + qual_op_cost);

It's certainly possible to argue that that's incorrectly computed,
but just adding another cost charge for the same topic can't be right.

I do suspect that that calculation is bogus, because it looks like it's
based on the concept of "apply the quals at each index entry", which we
know is not how GIN operates.  So maybe we should drop that bit in favor
of a per-page-ish cost like you have here.  Not sure.  In any case it
seems orthogonal to the question of startup/descent costs.  Maybe we'd
better tackle just one thing at a time.

(BTW, given that that charge does exist and is not affected by
repeated-scan amortization, why do we have a problem in the first place?
Is it just too small?  I guess that when we're only expecting one tuple
to be retrieved, it would only add about cpu_index_tuple_cost.)

> Is the power(0.15) used as an approximation for a log? If so, why? Also,
> shouldn't we round that up?

No idea, but I'm pretty hesitant to just randomly fool with that equation
when (a) neither of us know where it came from and (b) exactly no evidence
has been provided that it's wrong.

I note for instance that the existing logic switches from charging 1 page
per search to 2 pages per search at numEntryPages = 15 (1.5 ^ (1/0.15)).
Your version would switch at 2 pages, as soon as the pow() result is even
fractionally above 1.0.  Maybe the existing logic is too optimistic there,
but ceil() makes it too pessimistic I think.  I'd sooner tweak the power
constant than change from rint() to ceil().

regards, tom lane




Re: Reducing the WAL overhead of freezing in VACUUM by deduplicating per-tuple freeze plans

2022-09-16 Thread Peter Geoghegan
On Fri, Sep 16, 2022 at 12:30 AM Masahiko Sawada  wrote:
> After a quick benchmark, I've confirmed that the amount of WAL records
> for freezing 1 million tuples reduced to about one-fifth (1.2GB vs
> 250MB). Great.

I think that the really interesting thing about the patch is how this
changes the way we should think about freezing costs. It makes
page-level batching seem very natural.

The minimum possible size of a Heap2/FREEZE_PAGE record is 64 bytes,
once alignment and so on is taken into account (without the patch).
Once we already know that we have to freeze *some* tuples on a given
heap page, it becomes very reasonable to freeze as many as possible,
in batch, just because we know that it'll be much cheaper if we do it
now versus doing it later on instead. Even if this extra freezing ends
up "going to waste" due to updates against the same tuples that happen
a little later on, the *added* cost of freezing "extra" tuples will
have been so small that it's unlikely to matter. On the other hand, if
it's not wasted we'll be *much* better off.

It's very hard to predict the future, which is kinda what the current
FreezeLimit-based approach to freezing does. It's actually quite easy
to understand the cost of freezing now versus freezing later, though.
At a high level, it makes sense for VACUUM to focus on freezing costs
(including the risk that comes with *not* freezing with larger
tables), and not worry so much about making accurate predictions.
Making accurate predictions about freezing/workload characteristics is
overrated.

> True. I've not looked at the patch in depth yet but I think we need
> regression tests for this.

What did you have in mind?

I think that the best way to test something like this is with
wal_consistency_checking. That mostly works fine. However, note that
heap_mask() won't always be able to preserve the state of a tuple's
xmax when modified by freezing. We sometimes need "hint bits" to
actually reliably be set in REDO, when replaying the records for
freezing. At other times they really are just hints. We have to
conservatively assume that it's just a hint when masking. Not sure if
I can do much about that.

Note that this optimization is one level below lazy_scan_prune(), and
one level above heap_execute_freeze_tuple(). Neither function really
changes at all. This seems useful because there are rare
pg_upgrade-only paths where xvac fields need to be frozen. That's not
tested either.

-- 
Peter Geoghegan




Re: [RFC] building postgres with meson - v13

2022-09-16 Thread Andres Freund
Hi,

On 2022-09-16 09:14:20 +1200, Thomas Munro wrote:
> On Thu, Sep 15, 2022 at 2:26 PM Andres Freund  wrote:
> > - noticed that libpgport.a had and needed a dependency on errcodes.h - that
> >   seemed wrong. The dependency is due to src/port/*p{read,write}v?.c 
> > including
> >   postgres.h - which seems wrong. So I added a patch changing them to 
> > include
> >   c.h.
> 
> Oops.  +1

Looks like this has been the case since
0d56acfbaa799553c0c6ea350fd6e68d81025994 in 14. Any opinions on whether we
should backpatch the "fix"?

Greetings,

Andres Freund




Re: clang 15 doesn't like our JIT code

2022-09-16 Thread Andres Freund
Hi,


On 2022-09-16 11:40:46 -0400, Tom Lane wrote:
> According to
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=2127503
> 
> bleeding-edge clang complains thusly:
> 
> llvmjit_inline.cpp: In function 'std::unique_ptr 
> llvm_load_summary(llvm::StringRef)':
> llvmjit_inline.cpp:771:37: error: incomplete type 'llvm::MemoryBuffer' used 
> in nested name specifier
>   771 | llvm::MemoryBuffer::getFile(path);
>   | ^~~
> In file included from /usr/include/c++/12/memory:76,
>  from /usr/include/llvm/ADT/SmallVector.h:28,
>  from /usr/include/llvm/ADT/ArrayRef.h:14,
>  from /usr/include/llvm/ADT/SetVector.h:23,
>  from llvmjit_inline.cpp:48:
> /usr/include/c++/12/bits/unique_ptr.h: In instantiation of 'void 
> std::default_delete<_Tp>::operator()(_Tp*) const [with _Tp = 
> llvm::MemoryBuffer]':
> /usr/include/c++/12/bits/unique_ptr.h:396:17:   required from 
> 'std::unique_ptr<_Tp, _Dp>::~unique_ptr() [with _Tp = llvm::MemoryBuffer; _Dp 
> = std::default_delete]'
> /usr/include/llvm/Support/ErrorOr.h:142:34:   required from 
> 'llvm::ErrorOr::~ErrorOr() [with T = std::unique_ptr]'
> llvmjit_inline.cpp:771:35:   required from here
> /usr/include/c++/12/bits/unique_ptr.h:93:23: error: invalid application of 
> 'sizeof' to incomplete type 'llvm::MemoryBuffer'
>93 | static_assert(sizeof(_Tp)>0,
>   |   ^~~
> 
> I suspect this is less about clang and more about LLVM APIs,
> but anyway it seems like we gotta fix something.

Yea, there's definitely a bunch of llvm 15 issues that need to be fixed - this
particular failure is pretty easy to fix, but there's some others that are
harder. They redesigned a fairly core part of the IR representation. Thomas
has a WIP fix, I think.

Greetings,

Andres Freund




Re:pg_stat_statements and "IN" conditions

2022-09-16 Thread Sergei Kornilov
Hello!

Unfortunately the patch needs another rebase due to the recent split of guc.c 
(0a20ff54f5e66158930d5328f89f087d4e9ab400)

I'm reviewing a patch on top of a previous commit and noticed a failed test:

#   Failed test 'no parameters missing from postgresql.conf.sample'
#   at t/003_check_guc.pl line 82.
#  got: '1'
# expected: '0'
# Looks like you failed 1 test of 3.
t/003_check_guc.pl .. 

The new option has not been added to the postgresql.conf.sample

PS: I would also like to have such a feature. It's hard to increase 
pg_stat_statements.max or lose some entries just because some ORM sends 
requests with a different number of parameters.

regards, Sergei




Re: Pruning never visible changes

2022-09-16 Thread Tom Lane
Simon Riggs  writes:
> On Fri, 16 Sept 2022 at 15:26, Tom Lane  wrote:
>> You cannot
>> do that, at least not without checking that the originating
>> transaction has no snapshots that could see the older row version.

> I'm saying this is possible only AFTER the end of the originating
> xact, so there are no issues with additional snapshots.

I see, so the point is just that we can prune even if the originating
xact hasn't yet passed the global xmin horizon.  I agree that's safe,
but will it fire often enough to be worth the trouble?  Also, why
does it need to be restricted to certain shapes of HOT chains ---
that is, why can't we do exactly what we'd do if the xact *were*
past the xmin horizon?

regards, tom lane




Re: Pruning never visible changes

2022-09-16 Thread Simon Riggs
On Fri, 16 Sept 2022 at 15:26, Tom Lane  wrote:
>
> Simon Riggs  writes:
> > A user asked me whether we prune never visible changes, such as
> > BEGIN;
> > INSERT...
> > UPDATE.. (same row)
> > COMMIT;
>
> Didn't we just have this discussion in another thread?

Not that I was aware of, but it sounds like a different case anyway.

> You cannot
> do that, at least not without checking that the originating
> transaction has no snapshots that could see the older row version.

I'm saying this is possible only AFTER the end of the originating
xact, so there are no issues with additional snapshots.

i.e. the never visible row has to be judged RECENTLY_DEAD before we even check.

-- 
Simon Riggs    http://www.EnterpriseDB.com/




Re: Patch proposal: make use of regular expressions for the username in pg_hba.conf

2022-09-16 Thread Drouvot, Bertrand

Hi,

On 9/12/22 9:55 AM, Drouvot, Bertrand wrote:

Hi,

On 9/10/22 1:21 AM, Jacob Champion wrote:

On 8/19/22 01:12, Drouvot, Bertrand wrote:

+   wstr = palloc((strlen(tok->string + 1) + 1) * sizeof(pg_wchar));
+   wlen = pg_mb2wchar_with_len(tok->string + 1,
+   wstr, strlen(tok->string + 1));

The (tok->string + 1) construction comes up often enough that I think it
should be put in a `regex` variable or similar. That would help my eyes
with the (strlen(tok->string + 1) + 1) construction, especially.

I noticed that for pg_ident, we precompile the regexes per-line and
reuse those in child processes. Whereas here we're compiling, using, and
then discarding the regex for each check. I think the example set by the
pg_ident code is probably the one to follow, unless you have a good
reason not to.


Thanks for the feedback.

Yeah, fully agree. I'll provide a new version that follows the same logic 
as the pg_ident code.



+# Testing with regular expression for username
+reset_pg_hba($node, '/^.*md.*$', 'password');
+test_role($node, 'md5_role', 'password from pgpass and regular expression for 
username', 0);
+

IMO the coverage for this patch needs to be filled out. Negative test
cases are more important than positive ones for security-related code.


Agree, will do.


Other than that, and Tom's note on potentially expanding this to other
areas,


I'll add regexp usage for the database column and also the for the 
address one when non CIDR is provided (so host name(s)) (I think it also 
makes sense specially as we don't allow multiple values for this column).





Please find attached v2 addressing the comments mentioned above.

v2 also provides regular expression usage for the database and the 
address columns (when a host name is being used).


Remark:

The CF bot is failing for Windows (all other tests are green) and only 
for the new tap test related to the regular expression on the host name 
(the ones on database and role are fine).


The issue is not related to the patch. The issue is that the Windows 
Cirrus test does not like when a host name is provided for a "host" 
entry in pg_hba.conf (while it works fine when a CIDR is provided).


You can see an example in [1] where the only change is to replace the 
CIDR by "localhost" in 002_scram.pl. As you can see the Cirrus tests are 
failing on Windows only (its log file is here [2]).


I'll look at this Windows-related issue, but I would appreciate any 
guidance/help from anyone with experience in this area on Windows.





[1]: https://github.com/bdrouvot/postgres/branches on branch “host_non_cidr”

[2]: 
https://api.cirrus-ci.com/v1/artifact/task/6507279833890816/log/src/test/ssl/tmp_check/log/002_scram_primary.log


Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

diff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml
index c6f1b70fd3..28343445be 100644
--- a/doc/src/sgml/client-auth.sgml
+++ b/doc/src/sgml/client-auth.sgml
@@ -235,8 +235,9 @@ hostnogssenc  database  
userPostgreSQL database.
-   Multiple database names can be supplied by separating them with
-   commas.  A separate file containing database names can be specified by
+   Multiple database names and/or regular expressions preceded by 
/
+   can be supplied by separating them with commas.
+   A separate file containing database names can be specified by
preceding the file name with @.
   
  
@@ -249,7 +250,8 @@ hostnogssenc  database  
userall specifies that it
matches all users.  Otherwise, this is either the name of a specific
-   database user, or a group name preceded by +.
+   database user, a regular expression preceded by /
+   or a group name preceded by +.
(Recall that there is no real distinction between users and groups
in PostgreSQL; a + mark 
really means
match any of the roles that are directly or indirectly members
@@ -258,7 +260,8 @@ hostnogssenc  database  
user/
+   can be supplied by separating them with commas.
A separate file containing user names can be specified by preceding the
file name with @.
   
@@ -270,8 +273,9 @@ hostnogssenc  database  
user
   
Specifies the client machine address(es) that this record
-   matches.  This field can contain either a host name, an IP
-   address range, or one of the special key words mentioned below.
+   matches.  This field can contain either a host name, a regular 
expression
+   preceded by / representing host names, an IP address 
range,
+   or one of the special key words mentioned below.
   
 
   
@@ -785,16 +789,18 @@ host    all             all             192.168.12.10/32        gss
 # TYPE  DATABASE        USER            ADDRESS                 METHOD
 host    all             all             192.168.0.0/16          ident

Re: postgres_fdw hint messages

2022-09-16 Thread Nathan Bossart
On Fri, Sep 16, 2022 at 03:54:53PM +0200, Peter Eisentraut wrote:
> I don't think we need a verb there at all.  I committed it without a verb.

Thanks!

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Query Jumbling for CALL and SET utility statements

2022-09-16 Thread Drouvot, Bertrand

Hi,

On 9/16/22 2:53 PM, Fujii Masao wrote:


 

Attached v5 to normalize 2PC commands too, so that we get things like:


+    case T_VariableSetStmt:
+    {
+    VariableSetStmt *stmt = (VariableSetStmt *) node;
+
+    /* stmt->name is NULL for RESET ALL */
+    if (stmt->name)
+    {
+    APP_JUMB_STRING(stmt->name);
+    JumbleExpr(jstate, (Node *) stmt->args);

With the patch, "SET ... TO DEFAULT" and "RESET ..." are counted as the 
same query.
Is this intentional? 


Thanks for looking at the patch!
No, it is not intentional, good catch!

Which might be OK, because their behavior is basically the same.
But I'm afraid this may cause users to be confused. For example, they may
fail to find the pgss entry for a RESET command they ran and wonder why
the command was not recorded. To avoid such confusion, how about
appending stmt->kind to the jumble?

Thought?


I think that's a good idea and will provide a new version taking care of 
it (and also Sami's comments up-thread).


Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




clang 15 doesn't like our JIT code

2022-09-16 Thread Tom Lane
According to

https://bugzilla.redhat.com/show_bug.cgi?id=2127503

bleeding-edge clang complains thusly:

llvmjit_inline.cpp: In function 'std::unique_ptr 
llvm_load_summary(llvm::StringRef)':
llvmjit_inline.cpp:771:37: error: incomplete type 'llvm::MemoryBuffer' used in 
nested name specifier
  771 | llvm::MemoryBuffer::getFile(path);
  | ^~~
In file included from /usr/include/c++/12/memory:76,
 from /usr/include/llvm/ADT/SmallVector.h:28,
 from /usr/include/llvm/ADT/ArrayRef.h:14,
 from /usr/include/llvm/ADT/SetVector.h:23,
 from llvmjit_inline.cpp:48:
/usr/include/c++/12/bits/unique_ptr.h: In instantiation of 'void 
std::default_delete<_Tp>::operator()(_Tp*) const [with _Tp = 
llvm::MemoryBuffer]':
/usr/include/c++/12/bits/unique_ptr.h:396:17:   required from 
'std::unique_ptr<_Tp, _Dp>::~unique_ptr() [with _Tp = llvm::MemoryBuffer; _Dp = 
std::default_delete]'
/usr/include/llvm/Support/ErrorOr.h:142:34:   required from 
'llvm::ErrorOr::~ErrorOr() [with T = std::unique_ptr]'
llvmjit_inline.cpp:771:35:   required from here
/usr/include/c++/12/bits/unique_ptr.h:93:23: error: invalid application of 
'sizeof' to incomplete type 'llvm::MemoryBuffer'
   93 | static_assert(sizeof(_Tp)>0,
  |   ^~~

I suspect this is less about clang and more about LLVM APIs,
but anyway it seems like we gotta fix something.

regards, tom lane




Re: Query Jumbling for CALL and SET utility statements

2022-09-16 Thread Imseih (AWS), Sami
>The utility commands for cursor like DECLARE CURSOR seem to have the same 
> issue
>and can cause lots of pgss entries. For example, when we use postgres_fdw 
> and
>execute "SELECT * FROM  WHERE id = 10" five times in the 
> same
>transaction, the following commands are executed in the remote PostgreSQL 
> server
>and recorded as pgss entries there.

>DECLARE c1 CURSOR FOR ...
>DECLARE c2 CURSOR FOR ...
>DECLARE c3 CURSOR FOR ...

+1

I also made this observation recently and have a patch to suggest
to improve this situation. I will start a separate thread for this.

Regards,

--
Sami Imseih
Amazon Web Services (AWS)




Re: [PoC] Let libpq reject unexpected authentication requests

2022-09-16 Thread Peter Eisentraut

On 08.09.22 20:18, Jacob Champion wrote:

Sounds fair. "cleartext"? "plaintext"? "plain" (like SASL's PLAIN)?



On the SASL front: In the back of my head I'd been considering adding
a "sasl:" prefix to "scram-sha-256", so that we have a namespace for
new SASL methods. That would also give us a jumping-off point in the
future if we decide to add SASL method negotiation to the protocol.
What do you think about that?


After thinking about this a bit more, I think it would be best if the 
words used here match exactly with what is used in pg_hba.conf.  That's 
the only thing the user cares about: reject "password", reject "trust", 
require "scram-sha-256", etc.  How this maps to the protocol and that 
some things are SASL or not is not something they have needed to care 
about and don't really need to know for this.  So I would suggest to 
organize it that way.


Another idea:  Maybe instead of the "!" syntax, use two settings, 
require_auth and reject_auth?  Might be simpler?






Re: Pruning never visible changes

2022-09-16 Thread Tom Lane
Simon Riggs  writes:
> A user asked me whether we prune never visible changes, such as
> BEGIN;
> INSERT...
> UPDATE.. (same row)
> COMMIT;

Didn't we just have this discussion in another thread?  You cannot
do that, at least not without checking that the originating
transaction has no snapshots that could see the older row version.
I'm not sure whether or not snapmgr.c has enough information to
determine that, but in any case this formulation is surely
unsafe, because it isn't even checking whether that transaction is
our own, let alone asking snapmgr.c.

I'm dubious that a safe version would fire often enough to be
worth the cycles spent.

regards, tom lane




Re: Add 64-bit XIDs into PostgreSQL 15

2022-09-16 Thread Maxim Orlov
I have to say, to my embarrassment, that after sending the previous email I
noticed minor imperfections in the patch set caused by the last rebase.
These imperfections led to a cf bot failure. I'll address this in the next
iteration, in order not to generate excessive mail traffic.


-- 
Best regards,
Maxim Orlov.


Re: postgres_fdw hint messages

2022-09-16 Thread Peter Eisentraut

On 13.09.22 21:02, Nathan Bossart wrote:

On Tue, Sep 13, 2022 at 08:32:43AM +0200, Peter Eisentraut wrote:

I notice that for column misspellings, the hint is phrased "Perhaps you
meant X." whereas here we have "Did you mean X?".  Let's make that uniform.


Good point.  I attached a new version of the patch that uses the column
phrasing.  I wasn't sure whether "reference" was the right word to use in
this context, but I used it for now for consistency with the column hints.
I think "specify" or "use" would work as well.


I don't think we need a verb there at all.  I committed it without a verb.





RE: Allow logical replication to copy tables in binary format

2022-09-16 Thread osumi.takami...@fujitsu.com
On Tuesday, August 16, 2022 2:04 AM Melih Mutlu  wrote:
> Attached patch also includes some additions to the doc along with the tests.

Hi, thank you for updating the patch. Minor review comments for the v2.


(1) whitespace issues

Please fix below whitespace errors.

$ git apply v2-0001-Allow-logical-replication-to-copy-table-in-binary.patch
v2-0001-Allow-logical-replication-to-copy-table-in-binary.patch:39: trailing 
whitespace.
  binary format.(See .)
v2-0001-Allow-logical-replication-to-copy-table-in-binary.patch:120: trailing 
whitespace.

v2-0001-Allow-logical-replication-to-copy-table-in-binary.patch:460: trailing 
whitespace.
);
warning: 3 lines add whitespace errors.


(2) Suggestion to update another general description about the subscription

Kindly have a look at doc/src/sgml/logical-replication.sgml.

"The data types of the columns do not need to match,
as long as the text representation of the data can be converted to the target 
type.
For example, you can replicate from a column of type integer to a column of 
type bigint."

With the patch, I think those descriptions are affected, since enabling
the binary option for a subscription means that the initial
synchronization requires matching data types for the binary format.

I suggest that we update those descriptions as well.


(3) Shouldn't we test that we fail as expected with binary copy for different
types?

How about having a test that we correctly fail with different data types
between the publisher and the subscriber, for instance ?


(4) Error message of the binary format copy

I've gotten the message below from a data type mismatch (between integer and
bigint). Probably this is too unclear for users to understand the direct
cause, and needs to be adjusted?
This might be a similar comment Euler mentioned in [1].

2022-09-16 11:54:54.835 UTC [4570] ERROR:  insufficient data left in message
2022-09-16 11:54:54.835 UTC [4570] CONTEXT:  COPY tab, line 1, column id


(5) Minor adjustment of the test comment in 002_types.pl.

+is( $result, $sync_result, 'check initial sync on subscriber');
+is( $result_binary, $sync_result, 'check initial sync on subscriber in 
binary');

 # Insert initial test data

There are two identical comments saying "Insert initial test data" in this file.
We need to update them: one for the initial table sync and
the other for the application of changes.

[1] - 
https://www.postgresql.org/message-id/f1d58324-8df4-4bb5-a546-8c741c2e6fa8%40www.fastmail.com



Best Regards,
Takamichi Osumi



PostgreSQL 15 RC1 + GA dates

2022-09-16 Thread Jonathan S. Katz

Hi,

The release team is planning to release PostgreSQL 15 Release Candidate 
1 (RC1) on 2022-09-29. Please ensure all open items[1] are resolved no 
later than 2022-09-24 0:00 AoE.


Following recent release patterns, we are planning for 2022-10-06 to be
the GA date. This may change based on what reports we receive from
testing over the next few weeks.


Please let us know if you have any questions.

Thanks,

Jonathan

[1] https://wiki.postgresql.org/wiki/PostgreSQL_15_Open_Items


OpenPGP_signature
Description: OpenPGP digital signature


Re: Error for WITH options on partitioned tables

2022-09-16 Thread Japin Li


I wrote:
> On Fri, 16 Sep 2022 at 20:13, Simon Riggs  
> wrote:
>> Patch 002 replaces this with a more meaningful error message, which
>> matches our fine manual.
>> https://www.postgresql.org/docs/current/sql-createtable.html
>>
>>  ERROR:  cannot specify storage options for a partitioned table
>>  HINT:  specify storage options on leaf partitions instead
>
> Looks good.  Does this mean we don't need the partitioned_table_reloptions()
> function and can remove the reloptions validation in DefineRelation() for
> partitioned tables?  Or we can ereport() in partitioned_table_reloptions().

I want to know why we should do validation for partitioned tables even though
they don't support storage parameters.

/*
 * There are no options for partitioned tables yet, but this is able to do
 * some validation.
 */
return (bytea *) build_reloptions(reloptions, validate,
  RELOPT_KIND_PARTITIONED,
  0, NULL, 0);

-- 
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Re: Query Jumbling for CALL and SET utility statements

2022-09-16 Thread Fujii Masao




On 2022/09/09 19:11, Drouvot, Bertrand wrote:

IME if your application relies on 2PC it's very likely that you will hit the
exact same problems described in your original email.


The utility commands for cursor like DECLARE CURSOR seem to have the same issue
and can cause lots of pgss entries. For example, when we use postgres_fdw and
execute "SELECT * FROM  WHERE id = 10" five times in the same
transaction, the following commands are executed in the remote PostgreSQL server
and recorded as pgss entries there.

DECLARE c1 CURSOR FOR ...
DECLARE c2 CURSOR FOR ...
DECLARE c3 CURSOR FOR ...
DECLARE c4 CURSOR FOR ...
DECLARE c5 CURSOR FOR ...
FETCH 100 FROM c1
FETCH 100 FROM c2
FETCH 100 FROM c3
FETCH 100 FROM c4
FETCH 100 FROM c5
CLOSE c1
CLOSE c2
CLOSE c3
CLOSE c4
CLOSE c5

Furthermore, if the different query on foreign table is executed in the next
transaction, it may reuse the same cursor name previously used by another query.
That is, different queries can cause the same FETCH command like
"FETCH 100 FROM c1". This would be also an issue.

I'm not sure if the patch should also handle cursor cases. We can implement
that separately later if necessary.

I don't think that the patch should include the fix for cursor cases. It can be 
implemented separately later if necessary.



Attached v5 to normalize 2PC commands too, so that we get things like:


+   case T_VariableSetStmt:
+   {
+   VariableSetStmt *stmt = (VariableSetStmt *) node;
+
+   /* stmt->name is NULL for RESET ALL */
+   if (stmt->name)
+   {
+   APP_JUMB_STRING(stmt->name);
+   JumbleExpr(jstate, (Node *) stmt->args);

With the patch, "SET ... TO DEFAULT" and "RESET ..." are counted as the same
query. Is this intentional? It might be OK because their behavior is
basically the same, but I'm afraid it may confuse users. For example, they
may fail to find the pgss entry for the RESET command they ran and wonder
why the command was not recorded. To avoid such confusion, how about
appending stmt->kind to the jumble?
Thoughts?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Error for WITH options on partitioned tables

2022-09-16 Thread Japin Li


On Fri, 16 Sep 2022 at 20:13, Simon Riggs  wrote:
> Someone on general list recently complained that the error message
> from trying to use options on a partitioned table was misleading,
> which it definitely is:
>
> CREATE TABLE parted_col_comment (a int, b text) PARTITION BY LIST (a)
> WITH (fillfactor=100);
> ERROR:  unrecognized parameter "fillfactor"
>
> Which is verified by patch 001.
>
> Patch 002 replaces this with a more meaningful error message, which
> matches our fine manual.
> https://www.postgresql.org/docs/current/sql-createtable.html
>
>  ERROR:  cannot specify storage options for a partitioned table
>  HINT:  specify storage options on leaf partitions instead

Looks good.  Does this mean we don't need the partitioned_table_reloptions()
function and can remove the reloptions validation in DefineRelation() for
partitioned tables?  Or we can ereport() in partitioned_table_reloptions().

-- 
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Pruning never visible changes

2022-09-16 Thread Simon Riggs
A user asked me whether we prune never visible changes, such as

BEGIN;
INSERT...
UPDATE.. (same row)
COMMIT;

Once committed, the original insert is no longer visible to anyone, so
"ought to be able to be pruned", sayeth the user. And they also say
that changing the app is much harder, as ever.

After some thought, Yes, we can prune, but not in all cases - only if
the never visible tuple is at the root end of the update chain. The
only question is can that be done cheaply enough to bother with. The
answer in one specific case is Yes, in other cases No.

This patch adds a new test for this use case, and code to remove the
never visible row when the changes are made by the same xid.

(I'm pretty sure there used to be a test for this some years back and
I'm guessing it was removed because it isn't always possible to remove
the tuple, which this new patch honours.)

Please let me know what you think.

--
Simon Riggs    http://www.EnterpriseDB.com/


never_visible.v1.patch
Description: Binary data


Re: SUBTRANS: Minimizing calls to SubTransSetParent()

2022-09-16 Thread Simon Riggs
On Thu, 15 Sept 2022 at 12:36, Japin Li  wrote:
>
>
> On Thu, 15 Sep 2022 at 18:04, Simon Riggs  
> wrote:
> > On Wed, 14 Sept 2022 at 15:21, Alvaro Herrera  
> > wrote:
> >>
> >> On 2022-Aug-30, Simon Riggs wrote:
> >>
> >> > 001_new_isolation_tests_for_subxids.v3.patch
> >> > Adds new test cases to master without adding any new code, specifically
> >> > addressing the two areas of code that are not tested by existing tests.
> >> > This gives us a baseline from which we can do test driven development.
> >> > I'm hoping this can be reviewed and committed fairly smoothly.
> >>
> >> I gave this a quick run to confirm the claimed increase of coverage.  It
> >> checks out, so pushed.
> >
> > Thank you.
> >
> > So now we just have the main part of the patch, reattached here for
> > the auto patch tester's benefit.
>
> Hi Simon,
>
> Thanks for the updated patch, here are some comments.

Thanks for your comments.

> There is a typo, 
> s/SubTransGetTopMostTransaction/SubTransGetTopmostTransaction/g.
>
> +*call SubTransGetTopMostTransaction() if that xact 
> overflowed;
>
>
> Is there a punctuation mark missing on the following first line?
>
> +* 2. When IsolationIsSerializable() we sometimes need to 
> access topxid
> +*This occurs only when SERIALIZABLE is requested by app 
> user.
>
>
> When we use function name in comments, some places we use parentheses,
> but others do not use it. Why? I think, we should keep them consistent,
> at least in the same commit.
>
> +* 3. When TransactionIdSetTreeStatus will use a status of 
> SUB_COMMITTED,
> +*which then requires us to consult subtrans to find 
> parent, which
> +*is needed to avoid race condition. In this case we ask 
> Clog/Xact
> +*module if TransactionIdsAreOnSameXactPage(). Since we 
> start a new
> +*clog page every 32000 xids, this is usually <<1% of 
> subxids.

I've reworded those comments, hoping to address all of your above points.

> Maybe we declaration a topxid to avoid calling GetTopTransactionId()
> twice when we should set subtrans parent?
>
> +   TransactionId subxid = 
> XidFromFullTransactionId(s->fullTransactionId);
> +   TransactionId topxid = GetTopTransactionId();
>   ...
> +   if (MyProc->subxidStatus.overflowed ||
> +   IsolationIsSerializable() ||
> +   !TransactionIdsAreOnSameXactPage(topxid, subxid))
> +   {
>   ...
> +   SubTransSetParent(subxid, topxid);
> +   }

Seems a minor point, but I've done this anyway.

Thanks for the review.

v10 attached

--
Simon Riggs    http://www.EnterpriseDB.com/


002_minimize_calls_to_SubTransSetParent.v10.patch
Description: Binary data


Error for WITH options on partitioned tables

2022-09-16 Thread Simon Riggs
Someone on general list recently complained that the error message
from trying to use options on a partitioned table was misleading,
which it definitely is:

CREATE TABLE parted_col_comment (a int, b text) PARTITION BY LIST (a)
WITH (fillfactor=100);
ERROR:  unrecognized parameter "fillfactor"

Which is verified by patch 001.

Patch 002 replaces this with a more meaningful error message, which
matches our fine manual.
https://www.postgresql.org/docs/current/sql-createtable.html

 ERROR:  cannot specify storage options for a partitioned table
 HINT:  specify storage options on leaf partitions instead

-- 
Simon Riggs    http://www.EnterpriseDB.com/


001_test_options_with_partitioned_table.v1.patch
Description: Binary data


002_better_error_for_options_on_partitioned.v1.patch
Description: Binary data


Re: Switching XLog source from archive to streaming when primary available

2022-09-16 Thread Bharath Rupireddy
On Fri, Sep 16, 2022 at 12:06 PM Kyotaro Horiguchi
 wrote:
>
> In other words, it seems to me that the macro name doesn't manifest
> the condition correctly.
>
> I don't think we particularly want to do that unconditionally.
> I wanted just to get rid of the macro from the usage site.  Even if
> the same condition is used elsewhere, I see it better to write out the
> condition directly there.

I wanted to avoid a bit of duplicate code there. How about naming that
macro IsXLOGSourceSwitchToStreamEnabled() or
SwitchFromArchiveToStreamEnabled() or just SwitchFromArchiveToStream()
or any other better name?

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




About displaying NestLoopParam

2022-09-16 Thread Richard Guo
I have a question about displaying NestLoopParam. In the plan below,

# explain (costs off)
select * from a, lateral (select sum(i) as i from b where exists (select
sum(c.i) from c where c.j = a.j and c.i = b.i) ) ss where a.i = ss.i;
 QUERY PLAN

 Nested Loop
   ->  Seq Scan on a
   ->  Subquery Scan on ss
 Filter: (a.i = ss.i)
 ->  Aggregate
   ->  Seq Scan on b
 Filter: (SubPlan 1)
 SubPlan 1
   ->  Aggregate
 ->  Seq Scan on c
   Filter: ((j = $0) AND (i = b.i))
(11 rows)

There are three Params. Param 0 (a.j) and param 2 (a.i) are from
nestParams of the NestLoop. Param 1 (b.i) is from parParam of the
SubPlan. As we can see, param 1 and param 2 are displayed as the
corresponding expressions, while param 0 is displayed as $0.

I'm not saying this is a bug, but I am just curious why param 0 cannot be
displayed as the referenced expression. The reason I find is that in the
function find_param_referent(), we have the 'in_same_plan_level' flag
controlling that, once we have emerged from a subplan (i.e. we are no
longer at the same plan level), we do not look further for the matching
NestLoopParam. Param 0 falls into this situation.

And there is a comment there also saying,

/*
 * NestLoops transmit params to their inner child only; also, once
 * we've crawled up out of a subplan, this couldn't possibly be
 * the right match.
 */

My question is why is that?

Thanks
Richard


Re: A question about wording in messages

2022-09-16 Thread Amit Kapila
On Fri, Sep 16, 2022 at 2:07 PM Kyotaro Horiguchi
 wrote:
>
> At Fri, 16 Sep 2022 12:29:52 +0530, Amit Kapila  
> wrote in
> > So, the change related to wal_decode_buffer_size needs to be
> > backpatched to 15 whereas other message changes will be HEAD only, am
> > I correct?
>
> Has 15 closed the entry?  I had supposed that all the changes would be
> applied back to 15.
>

We only want to commit changes to 15 (a) if they fix a problem introduced
in 15, or (b) if they are a bug fix. I think the error message
improvements fall into neither of those categories; we could map them to
(b), but I feel they are an improvement to the current messages and don't
seem critical to me.

-- 
With Regards,
Amit Kapila.




Re: A question about wording in messages

2022-09-16 Thread Kyotaro Horiguchi
At Fri, 16 Sep 2022 12:29:52 +0530, Amit Kapila  wrote 
in 
> So, the change related to wal_decode_buffer_size needs to be
> backpatched to 15 whereas other message changes will be HEAD only, am
> I correct?

Has 15 closed the entry?  I had supposed that all the changes would be
applied back to 15.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Make ON_ERROR_STOP stop on shell script failure

2022-09-16 Thread Kyotaro Horiguchi
At Fri, 16 Sep 2022 15:55:33 +0900, bt22nakamorit 
 wrote in 
> Hi,
> 
> "\set ON_ERROR_STOP on" stops any subsequent incoming query that comes
> after an error of an SQL statement, but does not stop after a shell
> command run by "\!" returns a value other than 0, -1, or 127, even
> though such a value suggests a failure of the shell command.
> 
> For example, suppose that below is an SQL file.
> \set ON_ERROR_STOP on
> SELECT 1;
> \! false
> SELECT 2;
> 
> The current design allows SELECT 2 even though the shell script
> returns a value indicating a failure.

Since the "false" command did not "error out"?

> I thought that this behavior is rather unexpected since, based on the
> name "ON_ERROR_STOP", one may expect that failures of shell
> commands should halt the incoming instructions as well.
> One clear solution is to let failures of shell commands stop incoming
> queries just like errors of SQL statements do currently. Thoughts?

I'm not sure we want to regard every nonzero exit status from a successful
run as a failure.

On the other hand, the proposed behavior seems useful to me.

So +1 from me to the proposal, assuming the corresponding edit of the
documentation happens.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: ICU for global collation

2022-09-16 Thread Kyotaro Horiguchi
At Fri, 16 Sep 2022 09:56:31 +0200, Peter Eisentraut 
 wrote in 
> On 15.09.22 17:41, Marina Polyakova wrote:
> > I agree with you. Here's another version of the patch. The
> > locale/encoding checks and reports in initdb have been reordered,
> > because now the encoding is set first and only then the ICU locale is
> > checked.
> 
> I committed something based on the first version of your patch.  This
> reordering of the messages here was a little too much surgery for me
> at this point.  For instance, there are also messages in #ifdef WIN32
> code that would need to be reordered as well.  I kept the overall
> structure of the code the same and just inserted the additional
> proposed checks.

Yeah, as I sent just before, I reached the same conclusion.

> If you want to pursue the reordering of the checks and messages
> overall, a patch for the master branch could be considered.


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: ICU for global collation

2022-09-16 Thread Kyotaro Horiguchi
At Fri, 16 Sep 2022 09:49:28 +0300, Marina Polyakova 
 wrote in 
> On 2022-09-15 09:52, Kyotaro Horiguchi wrote:
> > However, when I reran the command, it complains about incompatible
> > encoding this time.  I think it's more user-friendly to check for the
> > encoding compatibility before the check for missing --icu-locale
> > option.
> > regards.
> 
> In continuation of options check: AFAICS the following checks in
> initdb
> 
>   if (locale_provider == COLLPROVIDER_ICU)
>   {
>   if (!icu_locale)
>   pg_fatal("ICU locale must be specified");
> 
>   /*
>* In supported builds, the ICU locale ID will be checked by the
>* backend during post-bootstrap initialization.
>*/
> #ifndef USE_ICU
>   pg_fatal("ICU is not supported in this build");
> #endif
>   }
> 
> are executed approximately when they are executed in create database
> after getting all the necessary data from the template database:

initdb doesn't work that way, but anyway, I realized that I am effectively
proposing to move that code in setlocales() to its caller. I don't think
setlocales() is the right place for that code, because the ICU locale has
nothing to do with what the function does.  That being said, there's no
obvious reason we *need* to move the code out to its caller.

> if (dblocprovider == COLLPROVIDER_ICU)
> {
>   /*
>* This would happen if template0 uses the libc provider but the new
>* database uses icu.
>*/
>   if (!dbiculocale)
>   ereport(ERROR,
>   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>errmsg("ICU locale must be specified")));
> }
> 
> if (dblocprovider == COLLPROVIDER_ICU)
>   check_icu_locale(dbiculocale);
> 
> But perhaps the check that --icu-locale cannot be specified unless
> locale provider icu is chosen should also be moved here? So all these
> checks will be in one place and it will use the provider from the
> template database (which could be icu):
> 
> $ initdb --locale-provider icu --icu-locale en-US -D data &&
> pg_ctl -D data -l logfile start &&
> createdb --icu-locale ru-RU --template template0 mydb
> ...
> createdb: error: database creation failed: ERROR: ICU locale cannot be
> specified unless locale provider is ICU

And, I realized that this causes bigger churn than I thought. So, I'm
sorry but I withdraw the comment.

Thus the first proposed patch will be more or less the direction we
would go. And the patch looks good to me as a whole.

+                errmsg("encoding \"%s\" is not supported with ICU provider",

+       pg_log_error("encoding \"%s\" is not supported with ICU provider",
+                    pg_encoding_to_char(encodingid));

I might be wrong, but the messages look wrong to me.  The alternatives
below might work.

"encoding \"%s\" is not supported by ICU"
"encoding \"%s\" cannot be used for/with ICU locales"

+   pg_log_error_hint("Rerun %s and choose a matching combination.",
+ progname);

This doesn't seem to provide users with useful information.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: ICU for global collation

2022-09-16 Thread Peter Eisentraut

On 16.09.22 09:31, Marina Polyakova wrote:
IMO it is hardly understandable from the program output either - it
looks as if I manually chose the encoding UTF8. Maybe first inform the
user about the selected encoding?


Yes, I included something like that in the patch just committed.


diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 6aeec8d426c52414b827686781c245291f27ed1f..348bbbeba0f5bc7ff601912bf883510d580b814c 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -2310,7 +2310,11 @@ setup_locale_encoding(void)
 	}
 
 	if (!encoding && locale_provider == COLLPROVIDER_ICU)
+	{
 		encodingid = PG_UTF8;
+		printf(_("The default database encoding has been set to \"%s\" for a better experience with the ICU provider.\n"),
+			   pg_encoding_to_char(encodingid));
+	}
 	else if (!encoding)
 	{
 		int			ctype_enc;






Re: ICU for global collation

2022-09-16 Thread Peter Eisentraut

On 15.09.22 17:41, Marina Polyakova wrote:
I agree with you. Here's another version of the patch. The 
locale/encoding checks and reports in initdb have been reordered, 
because now the encoding is set first and only then the ICU locale is 
checked.


I committed something based on the first version of your patch.  This 
reordering of the messages here was a little too much surgery for me at 
this point.  For instance, there are also messages in #ifdef WIN32 code 
that would need to be reordered as well.  I kept the overall structure 
of the code the same and just inserted the additional proposed checks.


If you want to pursue the reordering of the checks and messages overall, 
a patch for the master branch could be considered.





Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-09-16 Thread John Naylor
On Fri, Sep 16, 2022 at 1:01 PM Masahiko Sawada  wrote:
>
> On Mon, Aug 15, 2022 at 10:39 PM John Naylor
>  wrote:
> >
> > bool, both = and <=. Should be pretty close. Also, I believe if you
> > left this for last as a possible refactoring, it might save some work.

v6 demonstrates why this should have been put off towards the end. (more below)

> > In any case, I'll take a look at the latest patch next month.

Since the CF entry said "Needs Review", I began looking at v5 again
this week. Hopefully not too much has changed, but in the future I
strongly recommend setting to "Waiting on Author" if a new version is
forthcoming. I realize many here share updated patches at any time,
but I'd like to discourage the practice especially for large patches.

> I've updated the radix tree patch. It's now separated into two patches.
>
> 0001 patch introduces pg_lsearch8() and pg_lsearch8_ge() (we may find
> better names) that are similar to the pg_lfind8() family but they
> return the index of the key in the vector instead of true/false. The
> patch includes regression tests.

I don't want to do a full review of this just yet, but I'll just point
out some problems from a quick glance.

+/*
+ * Return the index of the first element in the vector that is greater than
+ * or equal to the given scalar. Return sizeof(Vector8) if there is no such
+ * element.

That's a bizarre API to indicate non-existence.

+ *
+ * Note that this function assumes the elements in the vector are sorted.
+ */

That is *completely* unacceptable for a general-purpose function.

+#else /* USE_NO_SIMD */
+ Vector8 r = 0;
+ uint8 *rp = (uint8 *) &r;
+
+ for (Size i = 0; i < sizeof(Vector8); i++)
+ rp[i] = (((const uint8 *) &v1)[i] == ((const uint8 *) &v2)[i]) ? 0xFF : 0;

I don't think we should try to force the non-simd case to adopt the
special semantics of vector comparisons. It's much easier to just use
the same logic as the assert builds.

+#ifdef USE_SSE2
+ return (uint32) _mm_movemask_epi8(v);
+#elif defined(USE_NEON)
+ static const uint8 mask[16] = {
+1 << 0, 1 << 1, 1 << 2, 1 << 3,
+1 << 4, 1 << 5, 1 << 6, 1 << 7,
+1 << 0, 1 << 1, 1 << 2, 1 << 3,
+1 << 4, 1 << 5, 1 << 6, 1 << 7,
+  };
+
+ uint8x16_t masked = vandq_u8(vld1q_u8(mask), (uint8x16_t) vshrq_n_s8(v, 7));
+ uint8x16_t maskedhi = vextq_u8(masked, masked, 8);
+
+ return (uint32) vaddvq_u16((uint16x8_t) vzip1q_u8(masked, maskedhi));

For Arm, we need to be careful here. This article goes into a lot of
detail for this situation:

https://community.arm.com/arm-community-blogs/b/infrastructure-solutions-blog/posts/porting-x86-vector-bitmask-optimizations-to-arm-neon

Here again, I'd rather put this off and focus on getting the "large
details" in good enough shape so we can go towards integrating with
vacuum.

> In addition to two patches, I've attached the third patch. It's not
> part of radix tree implementation but introduces a contrib module
> bench_radix_tree, a tool for radix tree performance benchmarking. It
> measures loading and lookup performance of both the radix tree and a
> flat array.

Excellent! This was high on my wish list.

-- 
John Naylor
EDB: http://www.enterprisedb.com




RE: why can't a table be part of the same publication as its schema

2022-09-16 Thread houzj.f...@fujitsu.com
On Friday, September 16, 2022 1:42 PM Amit Kapila  
wrote:
> 
> On Thu, Sep 15, 2022 at 6:27 PM houzj.f...@fujitsu.com
>  wrote:
> >
> > Attach the new version patch which added suggested restriction for
> > column list and merged Vignesh's patch.
> >
> 
> Few comments:
> 
> 1.
>  static void
> -CheckPubRelationColumnList(List *tables, const char *queryString,
> +CheckPubRelationColumnList(List *tables, bool publish_schema,
> +                           const char *queryString,
>                             bool pubviaroot)
> 
> It is better to keep bool parameters together at the end.
> 
> 2.
>   /*
> + * Disallow using column list if any schema is in the publication.
> + */
> + if (publish_schema)
> + ereport(ERROR,
> + errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("cannot use publication column list for relation \"%s.%s\"",
> +        get_namespace_name(RelationGetNamespace(pri->relation)),
> +        RelationGetRelationName(pri->relation)),
> + errdetail("Column list cannot be specified if any schema is part of the publication or specified in the list."));
> 
> I think it would be better to explain why we disallow this case.
> 
> 3.
> + if (!heap_attisnull(coltuple, Anum_pg_publication_rel_prattrs, NULL))
> + ereport(ERROR, errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("cannot add schema to the publication"),
> + errdetail("Schema cannot be added if any table that specifies column list is already part of the publication"));
> 
> A full stop is missing at the end in the errdetail message.
> 
> 4. I have modified a few comments in the attached. Please check and if you 
> like
> the changes then please include those in the next version.

Thanks for the comments.
Attached is the new version patch, which addresses the above comments and
has been run through pgindent.
I also improved some code and documentation based on comments given by
Vignesh and Peter offlist.

Best regards,
Hou zj


v4-0001-Allow-publications-with-schema-and-table-of-the-s.patch
Description:  v4-0001-Allow-publications-with-schema-and-table-of-the-s.patch


Re: ICU for global collation

2022-09-16 Thread Marina Polyakova

On 2022-09-16 07:55, Kyotaro Horiguchi wrote:

At Thu, 15 Sep 2022 18:41:31 +0300, Marina Polyakova
 wrote in

P.S. While working on the patch, I discovered that UTF8 encoding is
always used for the ICU provider in initdb unless it is explicitly
specified by the user:

if (!encoding && locale_provider == COLLPROVIDER_ICU)
encodingid = PG_UTF8;

IMO this creates additional errors for locales with other encodings:

$ initdb --locale de_DE.iso885915@euro --locale-provider icu
--icu-locale de-DE
...
initdb: error: encoding mismatch
initdb: detail: The encoding you selected (UTF8) and the encoding that
the selected locale uses (LATIN9) do not match. This would lead to
misbehavior in various character string processing functions.
initdb: hint: Rerun initdb and either do not specify an encoding
explicitly, or choose a matching combination.

And ICU supports many encodings, see the contents of pg_enc2icu_tbl in
encnames.c...


It seems to me the best default, fitting almost all cases that use ICU
locales.

So, we need to specify encoding explicitly in that case.

$ initdb --encoding iso-8859-15 --locale de_DE.iso885915@euro
--locale-provider icu --icu-locale de-DE

However, I think it is hardly understandable from the documentation.

(I checked this using euc-jp [1] so it might be wrong..)

[1] initdb --encoding euc-jp --locale ja_JP.eucjp --locale-provider
icu --icu-locale ja-x-icu

regards.


Thank you!

IMO it is hardly understandable from the program output either - it
looks as if I manually chose the encoding UTF8. Maybe first inform the
user about the selected encoding?


diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 6aeec8d426c52414b827686781c245291f27ed1f..348bbbeba0f5bc7ff601912bf883510d580b814c 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -2310,7 +2310,11 @@ setup_locale_encoding(void)
 	}
 
 	if (!encoding && locale_provider == COLLPROVIDER_ICU)
+	{
 		encodingid = PG_UTF8;
+		printf(_("The default database encoding has been set to \"%s\" for a better experience with the ICU provider.\n"),
+			   pg_encoding_to_char(encodingid));
+	}
 	else if (!encoding)
 	{
 		int			ctype_enc;

ISTM that such choices (e.g. UTF8 for Windows in some cases) are 
described in the documentation [1] as


By default, initdb uses the locale provider libc, takes the locale 
settings from the environment, and determines the encoding from the 
locale settings. This is almost always sufficient, unless there are 
special requirements.


[1] https://www.postgresql.org/docs/devel/app-initdb.html

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Reducing the WAL overhead of freezing in VACUUM by deduplicating per-tuple freeze plans

2022-09-16 Thread Masahiko Sawada
On Tue, Sep 13, 2022 at 6:02 AM Peter Geoghegan  wrote:
>
> My ongoing project to make VACUUM more predictable over time by
> proactive freezing [1] will increase the overall number of tuples
> frozen by VACUUM significantly (at least in larger tables). It's
> important that we avoid any new user-visible impact from extra
> freezing, though. I recently spent a lot of time on adding high-level
> techniques that aim to avoid extra freezing (e.g. by being lazy about
> freezing) when that makes sense. Low level techniques aimed at making
> the mechanical process of freezing cheaper might also help. (In any
> case it's well worth optimizing.)
>
> I'd like to talk about one such technique on this thread. The attached
> WIP patch reduces the size of xl_heap_freeze_page records by applying
> a simple deduplication process. This can be treated as independent
> work (I think it can, at least).

+1

> The patch doesn't change anything
> about the conceptual model used by VACUUM's lazy_scan_prune function
> to build xl_heap_freeze_page records for a page, and yet still manages
> to make the WAL records for freeze records over 5x smaller in many
> important cases. They'll be ~4x-5x smaller with *most* workloads,
> even.

After a quick benchmark, I've confirmed that the amount of WAL generated
for freezing 1 million tuples was reduced to about one-fifth (1.2GB vs.
250MB). Great.

>
> Each individual tuple entry (each xl_heap_freeze_tuple) adds a full 12
> bytes to the WAL record right now -- no matter what. So the existing
> approach is rather space inefficient by any standard (perhaps because
> it was developed under time pressure while fixing bugs in Postgres
> 9.3). More importantly, there is a lot of natural redundancy among
> each xl_heap_freeze_tuple entry -- each tuple's xl_heap_freeze_tuple
> details tend to match. We can usually get away with storing each
> unique combination of values from xl_heap_freeze_tuple once per
> xl_heap_freeze_page record, while storing associated page offset
> numbers in a separate area, grouped by their canonical freeze plan
> (which is a normalized version of the information currently stored in
> xl_heap_freeze_tuple).

True. I've not looked at the patch in depth yet but I think we need
regression tests for this.

Regards,

-- 
Masahiko Sawada
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: missing PG_FREE_IF_COPY in textlike() and textnlike() ?

2022-09-16 Thread CK Tan
Got it. It is a leak-by-design for efficiency.

Thanks,
-cktan

On Fri, Sep 16, 2022 at 12:03 AM Tom Lane  wrote:
>
> CK Tan  writes:
> > I see in the texteq() function calls to DatumGetTextPP() are followed
> > by conditional calls to PG_FREE_IF_COPY. e.g.
>
> That's because texteq() is potentially usable as a btree index
> function, and btree isn't too forgiving about leaks.
>
> > However, in textlike(), PG_FREE_IF_COPY calls are missing.
> > https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/like.c#L283
>
> textlike() isn't a member of any btree opclass.
>
> > Is this a memory leak bug?
>
> Not unless you can demonstrate a case where it causes problems.
> For the most part, such functions run in short-lived contexts.
>
> regards, tom lane




Re: Remove dead macro exec_subplan_get_plan

2022-09-16 Thread Zhang Mingli


On Sep 16, 2022, 14:47 +0800, Richard Guo , wrote:
>
> On Tue, Sep 6, 2022 at 12:39 AM Zhang Mingli  wrote:
> > Macro exec_subplan_get_plan is not used anymore.
> > Attach a patch to remove it.
>
> How about adding it to the CF so we don't lose track of it?
Will add it, thanks~

Regards,
Zhang Mingli


Re: missing PG_FREE_IF_COPY in textlike() and textnlike() ?

2022-09-16 Thread Tom Lane
CK Tan  writes:
> I see in the texteq() function calls to DatumGetTextPP() are followed
> by conditional calls to PG_FREE_IF_COPY. e.g.

That's because texteq() is potentially usable as a btree index
function, and btree isn't too forgiving about leaks.

> However, in textlike(), PG_FREE_IF_COPY calls are missing.
> https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/like.c#L283

textlike() isn't a member of any btree opclass.

> Is this a memory leak bug?

Not unless you can demonstrate a case where it causes problems.
For the most part, such functions run in short-lived contexts.

regards, tom lane
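[Editor's note: the free-only-if-copied convention behind PG_FREE_IF_COPY can be modeled outside PostgreSQL as a rough sketch. The names below are invented for illustration; the real macro compares against the original argument Datum rather than a plain pointer.]

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Invented stand-in for a DatumGetTextPP()-style accessor: it returns
 * the original buffer when no conversion is needed, or a freshly
 * allocated copy when one is (modeled here by the need_copy flag). */
static char *
get_possibly_copied(char *orig, int need_copy)
{
	char	   *copy;

	if (!need_copy)
		return orig;
	copy = malloc(strlen(orig) + 1);
	strcpy(copy, orig);
	return copy;
}

/* Modeled after PG_FREE_IF_COPY: free only when a copy was made,
 * detected by comparing against the original pointer. */
static void
free_if_copy(char *ptr, char *orig)
{
	if (ptr != orig)
		free(ptr);
}
```

The point of the exchange above is that skipping the free is acceptable when the allocation lives in a short-lived memory context, but not in leak-sensitive code paths such as btree support functions.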




Re: A question about wording in messages

2022-09-16 Thread Amit Kapila
On Fri, Sep 16, 2022 at 11:46 AM Kyotaro Horiguchi
 wrote:
>
> At Fri, 16 Sep 2022 12:10:05 +1200, Thomas Munro  
> wrote in
> > On Wed, Sep 14, 2022 at 2:38 PM Tom Lane  wrote:
> > > Kyotaro Horiguchi  writes:
> > > > I saw the following message recently modified.
> > > >> This controls the maximum distance we can read ahead in the WAL to 
> > > >> prefetch referenced data blocks.
> > > > Maybe the "we" means "PostgreSQL program and you" but I see it
> > > > somewhat out of place.
> > >
> > > +1, I saw that today and thought it was outside our usual style.
> > > The whole thing is awfully verbose for a GUC description, too.
> > > Maybe
> > >
> > > "Maximum distance to read ahead in WAL to prefetch data blocks."
> >
> > +1
> >
> > For "we", I must have been distracted by code comment style.  For the
> > extra useless verbiage, it's common for GUC description to begin "This
> > control/affects/blah" like that, but I agree it's useless noise.
> >
> > For the other cases, Amit's suggestion of 'server' seems sensible to me.
>
> Thanks for the opinion. I'm fine with that, too.
>

So, the change related to wal_decode_buffer_size needs to be
backpatched to 15 whereas other message changes will be HEAD only, am
I correct?

-- 
With Regards,
Amit Kapila.




missing PG_FREE_IF_COPY in textlike() and textnlike() ?

2022-09-16 Thread CK Tan
Hi Hackers,

I see in the texteq() function calls to DatumGetTextPP() are followed
by conditional calls to PG_FREE_IF_COPY. e.g.

https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/varlena.c#L1792

   text *targ1 = DatumGetTextPP(arg1);
   text *targ2 = DatumGetTextPP(arg2);
   result = (memcmp(VARDATA_ANY(targ1), VARDATA_ANY(targ2), len1 -
VARHDRSZ) == 0);
   PG_FREE_IF_COPY(targ1, 0);
   PG_FREE_IF_COPY(targ2, 1);

However, in textlike(), PG_FREE_IF_COPY calls are missing.

https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/like.c#L283

Is this a memory leak bug?

Regards,
-cktan




Make ON_ERROR_STOP stop on shell script failure

2022-09-16 Thread bt22nakamorit

Hi,

"""\set ON_ERROR_STOP on""" stops any subsequent query that comes 
after an SQL error, but it does not stop when a shell command run by 
"""\! """ returns a value other than 0, -1, or 127, that is, a 
value suggesting that the shell command failed.


For example, suppose that below is an SQL file.
\set ON_ERROR_STOP on
SELECT 1;
\! false
SELECT 2;

The current design runs SELECT 2 even though the shell command 
returns a value indicating failure.


I think this behavior is rather unexpected since, based on the name 
"""ON_ERROR_STOP""", one may expect that failures of shell commands 
should halt subsequent instructions as well.
One clear solution is to let shell command failures stop subsequent 
queries just as SQL errors currently do. Thoughts?


Tatsu

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index a141146e70..7445ca04ff 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -4989,6 +4989,11 @@ do_shell(const char *command)
 		pg_log_error("\\!: failed");
 		return false;
 	}
+	else if (result != 0)
+	{
+		pg_log_error("command failed");
+		return false;
+	}
 	return true;
 }
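[Editor's note: the return-value handling the patch ends up with can be sketched as a standalone function. Names here are invented; psql's do_shell() works on the raw result of system(), where -1 and 127 mean the shell itself could not be run, and the proposed change additionally treats any other nonzero status as a command failure.]

```c
#include <assert.h>

/* Illustrative classification of a system()-style return value,
 * mirroring do_shell() with the proposed change applied. */
typedef enum
{
	SHELL_OK,
	SHELL_EXEC_FAILED,			/* existing check: "\!: failed" */
	SHELL_COMMAND_FAILED		/* new check added by the patch */
} ShellResult;

static ShellResult
classify_shell_result(int result)
{
	if (result == 127 || result == -1)
		return SHELL_EXEC_FAILED;
	else if (result != 0)
		return SHELL_COMMAND_FAILED;
	return SHELL_OK;
}
```

With ON_ERROR_STOP on, psql would then stop in both failure cases rather than only when the shell itself could not be run.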
 


Re: ICU for global collation

2022-09-16 Thread Marina Polyakova

On 2022-09-15 09:52, Kyotaro Horiguchi wrote:

If I executed initdb as follows, I would be told to specify the
--icu-locale option.


$ initdb --encoding sql-ascii --locale-provider icu hoge
...
initdb: error: ICU locale must be specified


However, when I reran the command, it complained about an incompatible
encoding this time.  I think it's more user-friendly to check the
encoding compatibility before checking for the missing --icu-locale
option.

regards.


Continuing on the topic of option checks: AFAICS the following checks in initdb

if (locale_provider == COLLPROVIDER_ICU)
{
if (!icu_locale)
pg_fatal("ICU locale must be specified");

/*
 * In supported builds, the ICU locale ID will be checked by the
 * backend during post-bootstrap initialization.
 */
#ifndef USE_ICU
pg_fatal("ICU is not supported in this build");
#endif
}

are executed at approximately the same point as the corresponding 
checks in CREATE DATABASE, after getting all the necessary data from 
the template database:


if (dblocprovider == COLLPROVIDER_ICU)
{
/*
 * This would happen if template0 uses the libc provider but the new
 * database uses icu.
 */
if (!dbiculocale)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 errmsg("ICU locale must be specified")));
}

if (dblocprovider == COLLPROVIDER_ICU)
check_icu_locale(dbiculocale);

But perhaps the check that --icu-locale cannot be specified unless the 
icu locale provider is chosen should also be moved here?  Then all 
these checks would be in one place, and the check would use the 
provider from the template database (which could be icu):


$ initdb --locale-provider icu --icu-locale en-US -D data &&
pg_ctl -D data -l logfile start &&
createdb --icu-locale ru-RU --template template0 mydb
...
createdb: error: database creation failed: ERROR:  ICU locale cannot be 
specified unless locale provider is ICU


--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Remove dead macro exec_subplan_get_plan

2022-09-16 Thread Richard Guo
On Tue, Sep 6, 2022 at 12:39 AM Zhang Mingli  wrote:

> Macro exec_subplan_get_plan is not used anymore.
> Attach a patch to remove it.
>

How about adding it to the CF so we don't lose track of it?

Thanks
Richard


Re: [PATCH] Simple code cleanup in tuplesort.c.

2022-09-16 Thread Richard Guo
On Wed, Jul 27, 2022 at 5:10 PM Xing Guo  wrote:

> The bounded heap sorting status flag is set twice in sort_bounded_heap()
> and tuplesort_performsort(). This patch helps remove one of them.
>

Revisiting this patch I think maybe it's better to remove the setting of
Tuplesort status from tuplesort_performsort() for the TSS_BOUNDED case.
Thus we keep the heap manipulation routines, make_bounded_heap and
sort_bounded_heap, consistent in that they update their status
accordingly inside the function.

Also, would you please add it to the CF so we don't lose track of it?

Thanks
Richard


Re: Switching XLog source from archive to streaming when primary available

2022-09-16 Thread Kyotaro Horiguchi
At Fri, 16 Sep 2022 09:15:58 +0530, Bharath Rupireddy 
 wrote in 
> On Thu, Sep 15, 2022 at 1:52 PM Kyotaro Horiguchi
>  wrote:
> >
> > At Thu, 15 Sep 2022 10:28:12 +0530, Bharath Rupireddy 
> >  wrote in
> > > I'm attaching the v6 patch that's rebased on to the latest HEAD.
> > > Please consider this for review.
> >
> > Thanks for the new version!
> >
> > +#define StreamingReplRetryEnabled() \
> > +   (streaming_replication_retry_interval > 0 && \
> > +StandbyMode && \
> > +currentSource == XLOG_FROM_ARCHIVE)
> >
> > It seems to me a bit too complex.

In other words, it seems to me that the macro name doesn't describe
the condition correctly.

> I don't think so; it just tells whether a standby is allowed to
> switch its source from archive to streaming.
> 
> > +   /* Save the timestamp at which we're switching to 
> > archive. */
> > +   if (StreamingReplRetryEnabled())
> > +   switched_to_archive_at = 
> > GetCurrentTimestamp();
> >
> > Anyway we are going to open a file just after this so
> > GetCurrentTimestamp() cannot cause a perceptible degradation.
> > Couldn't we do that unconditionally, to get rid of the macro?
> 
> Do we really need to do it unconditionally? I don't think so. And, we
> can't get rid of the macro, as we need to check for the current
> source, GUC and standby mode. When this feature is disabled, it
> mustn't execute any extra code IMO.

I don't particularly want to do that unconditionally.  I just wanted
to get rid of the macro from the usage site.  Even if the same
condition is used elsewhere, I think it is better to write out the
condition directly there.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Refactor backup related code (was: Is it correct to say, "invalid data in file \"%s\"", BACKUP_LABEL_FILE in do_pg_backup_stop?)

2022-09-16 Thread Michael Paquier
On Wed, Sep 14, 2022 at 02:24:12PM +0530, Bharath Rupireddy wrote:
> I'm attaching the v4 patch that's rebased on to the latest HEAD.
> Please consider this for review.

I have been looking at this patch.

-   StringInfo  labelfile;
-   StringInfo  tblspc_map_file;
backup_manifest_info manifest;
+   BackupState backup_state;
You could initialize the state here with {0}.  That's simpler.
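[Editor's note: for reference, a minimal illustration of the {0} initializer being suggested, using a made-up struct rather than the real BackupState; it zero-initializes every member without a separate memset() call.]

```c
#include <assert.h>

/* Made-up struct standing in for a backup-state struct. */
typedef struct DemoState
{
	int			counter;
	char	   *name;
	double		elapsed;
} DemoState;

static DemoState
make_zeroed_state(void)
{
	/* {0} zero-initializes all members, including pointers. */
	DemoState	state = {0};

	return state;
}
```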

--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -380,6 +380,31 @@ GetRmgr(RmgrId rmid)
 }
 #endif
 
+/* Structure to hold backup state. */
+typedef struct BackupStateData
+{
Why is that in xlog_internal.h?  This header includes a lot of
declarations about the internals of WAL, but the backup state is not
that internal.  I'd like to think that we should begin separating the
backup-related routines into their own file, as of a set of
xlogbackup.c/h in this case.  That's a split I have been wondering
about for some time now.  The internals of xlog.c for the start/stop
backups are tied to XLogCtlData, which makes such a switch more
complicated than it looks, so we can begin small and have the routines
to create, free and build the label file and the tablespace map in
this new file.

+   state->name = (char *) palloc0(len + 1);
+   memcpy(state->name, name, len);
Or just pstrdup()?

+BackupState
+get_backup_state(const char *name)
+{
I would name this one create_backup_state() instead.

+void
+create_backup_content_str(BackupState state, bool forhistoryfile)
+{
This could be a build_backup_content().

It seems to me that there is no point in having the list of
tablespaces in BackupStateData?  This actually makes the code harder
to follow; see for example the changes in do_pg_backup_start(), where
the list of tablespaces may or may not be passed down in
BackupStateData while BackupStateData is itself the first argument of
this routine.  These are independent from the label and backup history
file, as well.

We need to be careful about the file format (see read_backup_label()),
and create_backup_content_str() is careful about that, which is good.
Core does not care about the format of the backup history file, though
some community tools may.  I agree that what you are proposing here
makes the generation of these files easier to follow, but let's
document what forhistoryfile is here for, at least.  Saving the
backup label and history file strings in BackupState is a
confusing interface IMO.  It would be more intuitive to take the
backup state as input and return the generated string as output,
depending on what we want from the backup state.

-   backup_started_in_recovery = RecoveryInProgress();
+   Assert(state != NULL);
+
+   in_recovery = RecoveryInProgress();
[...]
-   if (strcmp(backupfrom, "standby") == 0 && !backup_started_in_recovery)
+   if (state->started_in_recovery == true && in_recovery == false)

I would have kept the naming to backup_started_in_recovery here.  What
you are doing is logically right by relying on started_in_recovery to
check if recovery was running when the backup started, but this just
creates useless noise in the refactoring.

Something unrelated to your patch that I am noticing while scanning
the area is that we have been always lazy in freeing the label file
data allocated in TopMemoryContext when using the SQL functions if the
backup is aborted.  We are not talking about a large amount of
memory each time a backup is performed, but that could be a cause of
memory bloat in a server if the same session is used and backups keep
failing, as the data is freed only on a successful pg_backup_stop().
Base backups through the replication protocol don't care about that as
the code keeps around the same pointer for the whole duration of
perform_base_backup().  Trying to tie the cleanup of the label file
with the abort phase would cause more complications with
do_pg_backup_stop(), and xlog.c has no need to know about that now.
Just a remark for later.
--
Michael




Re: A question about wording in messages

2022-09-16 Thread Kyotaro Horiguchi
At Fri, 16 Sep 2022 12:10:05 +1200, Thomas Munro  wrote 
in 
> On Wed, Sep 14, 2022 at 2:38 PM Tom Lane  wrote:
> > Kyotaro Horiguchi  writes:
> > > I saw the following message recently modified.
> > >> This controls the maximum distance we can read ahead in the WAL to 
> > >> prefetch referenced data blocks.
> > > Maybe the "we" means "PostgreSQL program and you" but I see it
> > > somewhat out of place.
> >
> > +1, I saw that today and thought it was outside our usual style.
> > The whole thing is awfully verbose for a GUC description, too.
> > Maybe
> >
> > "Maximum distance to read ahead in WAL to prefetch data blocks."
> 
> +1
> 
> For "we", I must have been distracted by code comment style.  For the
> extra useless verbiage, it's common for GUC description to begin "This
> control/affects/blah" like that, but I agree it's useless noise.
> 
> For the other cases, Amit's suggestion of 'server' seems sensible to me.

Thanks for the opinion. I'm fine with that, too.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 31df0a224d7d4d6cb619534f24940c4c3571cfd9 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Fri, 16 Sep 2022 15:07:21 +0900
Subject: [PATCH v2] Get rid of "we" from messages

It's not our usual style to use "we" in error messages.  Get rid of
such usages so that the messages sound more usual.  While fixing them,
also simplify a verbose message.
---
 src/backend/replication/pgoutput/pgoutput.c | 4 ++--
 src/backend/storage/file/fd.c   | 2 +-
 src/backend/utils/misc/guc_tables.c | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 62e0ffecd8..9c8faece10 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -441,13 +441,13 @@ pgoutput_startup(LogicalDecodingContext *ctx, OutputPluginOptions *opt,
 		if (data->protocol_version > LOGICALREP_PROTO_MAX_VERSION_NUM)
 			ereport(ERROR,
 	(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-	 errmsg("client sent proto_version=%d but we only support protocol %d or lower",
+	 errmsg("client sent proto_version=%d but server only supports protocol %d or lower",
 			data->protocol_version, LOGICALREP_PROTO_MAX_VERSION_NUM)));
 
 		if (data->protocol_version < LOGICALREP_PROTO_MIN_VERSION_NUM)
 			ereport(ERROR,
 	(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-	 errmsg("client sent proto_version=%d but we only support protocol %d or higher",
+	 errmsg("client sent proto_version=%d but server only supports protocol %d or higher",
 			data->protocol_version, LOGICALREP_PROTO_MIN_VERSION_NUM)));
 
 		if (data->publication_names == NIL)
diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c
index 20c3741aa1..073dab2be5 100644
--- a/src/backend/storage/file/fd.c
+++ b/src/backend/storage/file/fd.c
@@ -978,7 +978,7 @@ set_max_safe_fds(void)
 		ereport(FATAL,
 (errcode(ERRCODE_INSUFFICIENT_RESOURCES),
  errmsg("insufficient file descriptors available to start server process"),
- errdetail("System allows %d, we need at least %d.",
+ errdetail("System allows %d, server needs at least %d.",
 		   max_safe_fds + NUM_RESERVED_FDS,
 		   FD_MINFREE + NUM_RESERVED_FDS)));
 
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 550e95056c..5c340d471d 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2591,7 +2591,7 @@ struct config_int ConfigureNamesInt[] =
 	{
 		{"wal_decode_buffer_size", PGC_POSTMASTER, WAL_RECOVERY,
 			gettext_noop("Buffer size for reading ahead in the WAL during recovery."),
-			gettext_noop("This controls the maximum distance we can read ahead in the WAL to prefetch referenced data blocks."),
+			gettext_noop("Maximum distance to read ahead in WAL to prefetch data blocks."),
 			GUC_UNIT_BYTE
 		},
		&wal_decode_buffer_size,
-- 
2.31.1



Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-09-16 Thread Masahiko Sawada
On Mon, Aug 15, 2022 at 10:39 PM John Naylor
 wrote:
>
> On Mon, Aug 15, 2022 at 12:39 PM Masahiko Sawada  
> wrote:
> >
> > On Fri, Jul 22, 2022 at 10:43 AM Masahiko Sawada  
> > wrote:
> > >
> > > On Tue, Jul 19, 2022 at 1:30 PM John Naylor
> > >  wrote:
> > > >
> > > >
> > > >
> > > > On Tue, Jul 19, 2022 at 9:11 AM Masahiko Sawada  
> > > > wrote:
> > > >
> > > > > I’d like to keep the first version simple. We can improve it and add
> > > > > more optimizations later. Using radix tree for vacuum TID storage
> > > > > would still be a big win comparing to using a flat array, even without
> > > > > all these optimizations. In terms of single-value leaves method, I'm
> > > > > also concerned about an extra pointer traversal and extra memory
> > > > > allocation. It's most flexible but multi-value leaves method is also
> > > > > flexible enough for many use cases. Using the single-value method
> > > > > seems to be too much as the first step for me.
> > > > >
> > > > > Overall, using 64-bit keys and 64-bit values would be a reasonable
> > > > > choice for me as the first step . It can cover wider use cases
> > > > > including vacuum TID use cases. And possibly it can cover use cases by
> > > > > combining a hash table or using tree of tree, for example.
> > > >
> > > > These two aspects would also bring it closer to Andres' prototype, 
> > > > which 1) makes review easier and 2) easier to preserve optimization 
> > > > work already done, so +1 from me.
> > >
> > > Thanks.
> > >
> > > I've updated the patch. It now implements 64-bit keys, 64-bit values,
> > > and the multi-value leaves method. I've tried to remove duplicated
> > > codes but we might find a better way to do that.
> > >
> >
> > With the recent changes related to simd, I'm going to split the patch
> > into at least two parts: introduce other simd optimized functions used
> > by the radix tree and the radix tree implementation. Particularly we
> > need two functions for radix tree: a function like pg_lfind32 but for
> > 8 bits integers and return the index, and a function that returns the
> > index of the first element that is >= key.
>
> I recommend looking at
>
> https://www.postgresql.org/message-id/CAFBsxsESLUyJ5spfOSyPrOvKUEYYNqsBosue9SV1j8ecgNXSKA%40mail.gmail.com
>
> since I did the work just now for searching bytes and returning a
> bool, both = and <=. Should be pretty close. Also, I believe if you
> left this for last as a possible refactoring, it might save some work.
> In any case, I'll take a look at the latest patch next month.

I've updated the radix tree patch. It's now separated into two patches.

0001 patch introduces pg_lsearch8() and pg_lsearch8_ge() (we may find
better names) that are similar to the pg_lfind8() family but they
return the index of the key in the vector instead of true/false. The
patch includes regression tests.
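[Editor's note: the intended semantics (return the index of the first match, or -1 when there is none) can be sketched scalar-only; the actual patch adds a SIMD fast path on top of loops like these, and the function names are still under discussion.]

```c
#include <assert.h>

typedef unsigned char uint8;
typedef unsigned int uint32;

/* Scalar model of pg_lsearch8(): index of the first element equal to
 * key, or -1 if there is no match. */
static int
lsearch8_scalar(uint8 key, const uint8 *base, uint32 nelem)
{
	for (uint32 i = 0; i < nelem; i++)
	{
		if (base[i] == key)
			return (int) i;
	}
	return -1;
}

/* Scalar model of pg_lsearch8_ge(): index of the first element greater
 * than or equal to key, or -1 if none.  This loop assumes nothing
 * about input ordering, though the patch under discussion assumes a
 * sorted array for its SIMD version. */
static int
lsearch8_ge_scalar(uint8 key, const uint8 *base, uint32 nelem)
{
	for (uint32 i = 0; i < nelem; i++)
	{
		if (base[i] >= key)
			return (int) i;
	}
	return -1;
}
```

Reviewers objected to a sorted-input assumption for a general-purpose header, as well as to returning sizeof(Vector8) for "no match"; the -1 convention above matches the semantics proposed instead.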

0002 patch is the main radix tree implementation. I've removed some
of the duplicated node manipulation code. For instance, since node-4,
node-16, and node-32 have a similar structure with different fanouts,
I introduced a common function for them.

In addition to these two patches, I've attached a third patch. It's not
part of the radix tree implementation but introduces a contrib module
bench_radix_tree, a tool for radix tree performance benchmarking. It
measures loading and lookup performance of both the radix tree and a
flat array.

Regards,

--
Masahiko Sawada
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 5d0115b068ecb01d791eab5f8a78a6d25b9cf45c Mon Sep 17 00:00:00 2001
From: Masahiko Sawada 
Date: Wed, 14 Sep 2022 12:38:01 +
Subject: [PATCH v6 1/3] Support pg_lsearch8_eq and pg_lsearch8_ge

---
 src/include/port/pg_lfind.h   |  71 
 src/include/port/simd.h   | 155 +-
 .../test_lfind/expected/test_lfind.out|  12 ++
 .../modules/test_lfind/sql/test_lfind.sql |   2 +
 .../modules/test_lfind/test_lfind--1.0.sql|   8 +
 src/test/modules/test_lfind/test_lfind.c  | 139 
 6 files changed, 378 insertions(+), 9 deletions(-)

diff --git a/src/include/port/pg_lfind.h b/src/include/port/pg_lfind.h
index 0625cac6b5..583f204763 100644
--- a/src/include/port/pg_lfind.h
+++ b/src/include/port/pg_lfind.h
@@ -80,6 +80,77 @@ pg_lfind8_le(uint8 key, uint8 *base, uint32 nelem)
 	return false;
 }
 
+/*
+ * pg_lsearch8
+ *
+ * Return the index of the first element in 'base' that equals 'key', or -1 if
+ * there is no such element.
+ */
+static inline int
+pg_lsearch8(uint8 key, uint8 *base, uint32 nelem)
+{
+	uint32		i;
+
+	/* round down to multiple of vector length */
+	uint32		tail_idx = nelem & ~(sizeof(Vector8) - 1);
+	Vector8		chunk;
+
+	for (i = 0; i < tail_idx; i += sizeof(Vector8))
+	{
+		int idx;
+
+		vector8_load(&chunk, &base[i]);
+		if ((idx = vector8_search_eq(chunk, key)) != -1)
+			return i + idx;
+	}
+
+	/* Process the remaining elements one at a time. */
+	for (; i < nelem; i++)
+