Re: meson PGXS compatibility

2022-10-12 Thread Andres Freund
Hi,

On 2022-10-12 07:50:07 +0200, Peter Eisentraut wrote:
> On 05.10.22 22:07, Andres Freund wrote:
> > 0001: autoconf: Unify CFLAGS_SSE42 and CFLAGS_ARMV8_CRC32C
> 
> I wonder, there has been some work lately to use SIMD and such in other
> places.  Would that affect what kinds of extra CPU-specific compiler flags
> and combinations we might need?

The current infrastructure is very CRC specific, so I don't think this change
would stand in the way of using SSE-specific flags in other places.


> > 0005: meson: Add PGXS compatibility
> > 
> >The actual meson PGXS compatibility. Plenty more replacements to do, but
> >suffices to build common extensions on a few platforms.
> > 
> >What level of completeness do we want to require here?
> 
> I have tried this with a few extensions.  Seems to work alright.  I don't
> think we need to overthink this.  The way it's set up, if someone needs
> additional variables set, they can easily be added.

Yea, I am happy enough with it now that the bulk is out of src/meson.build and
strip isn't set to an outright wrong value.

Thanks!

Andres




Re: meson PGXS compatibility

2022-10-12 Thread Andres Freund
Hi,

On 2022-10-10 14:35:14 -0700, Andres Freund wrote:
> On 2022-10-05 13:07:10 -0700, Andres Freund wrote:
> > 0004: solaris: Check for -Wl,-E directly instead of checking for gnu LD
> > 
> >   This allows us to get rid of the nontrivial detection of with_gnu_ld,
> >   simplifying meson PGXS compatibility. It's also nice to delete libtool.m4.
> > 
> >   I don't like the SOLARIS_EXPORT_DYNAMIC variable I invented. If somebody 
> > has
> >   a better idea...
> 
> A cleaner approach could be to add a LDFLAGS_BE and emit the -Wl,-E into it
> from both configure and meson, solely based on whether -Wl,-E is supported.  I
> haven't verified cygwin, but on our other platforms that seems to do the right
> thing.

I think that does look better. See the attached 0003.


The attached v3 of this patch has an unchanged 0001 (CRC cflags).

For 0002, I still removed LD from Makefile.global, but instead just hardcoded
ld in the export file generation portion of the backend build - there are no
working alternative linkers, and we already hardcode a bunch of other paths in
mkldexport.

0003 is changed significantly - as proposed in the message quoted above, I
introduced LDFLAGS_EX_BE and moved the detection of -Wl,-E (I used
-Wl,--export-dynamic, which we previously only used on FreeBSD) into
configure, getting rid of export_dynamic.

0004, the patch introducing PGXS compat, saw a few changes too:
- I implemented one of the FIXMEs, the correct determination of strip flags
- I moved the bulk of the pgxs compat code to src/makefiles/meson.build - imo
  src/meson.build got bulked up too much with pgxs-emulation code
- some minor cleanups

Greetings,

Andres Freund
>From 6feb84e2f87320f936764aa3d906a8828e080390 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Wed, 5 Oct 2022 12:24:14 -0700
Subject: [PATCH v3 1/4] autoconf: Unify CFLAGS_SSE42 and CFLAGS_ARMV8_CRC32C

Until now we emitted the cflags to build the CRC objects into architecture
specific variables. That doesn't make a whole lot of sense to me - we're never
going to target x86 and arm at the same time, so they don't need to be
separate variables.

It might be better to instead continue to have CFLAGS_SSE42 /
CFLAGS_ARMV8_CRC32C be computed by PGAC_ARMV8_CRC32C_INTRINSICS /
PGAC_SSE42_CRC32_INTRINSICS and then set CFLAGS_CRC based on those. But it
seems unlikely that we'd need other sets of CRC specific flags for those two
architectures at the same time.

Discussion: https://postgr.es/m/20221005200710.luvw5evhwf6cl...@awork3.anarazel.de
---
 src/port/Makefile  | 16 
 config/c-compiler.m4   |  8 
 configure  | 19 +--
 configure.ac   | 10 +-
 src/Makefile.global.in |  3 +--
 5 files changed, 27 insertions(+), 29 deletions(-)

diff --git a/src/port/Makefile b/src/port/Makefile
index b3754d8940a..711f59e32bd 100644
--- a/src/port/Makefile
+++ b/src/port/Makefile
@@ -88,15 +88,15 @@ libpgport.a: $(OBJS)
 thread.o: CFLAGS+=$(PTHREAD_CFLAGS)
 thread_shlib.o: CFLAGS+=$(PTHREAD_CFLAGS)
 
-# all versions of pg_crc32c_sse42.o need CFLAGS_SSE42
-pg_crc32c_sse42.o: CFLAGS+=$(CFLAGS_SSE42)
-pg_crc32c_sse42_shlib.o: CFLAGS+=$(CFLAGS_SSE42)
-pg_crc32c_sse42_srv.o: CFLAGS+=$(CFLAGS_SSE42)
+# all versions of pg_crc32c_sse42.o need CFLAGS_CRC
+pg_crc32c_sse42.o: CFLAGS+=$(CFLAGS_CRC)
+pg_crc32c_sse42_shlib.o: CFLAGS+=$(CFLAGS_CRC)
+pg_crc32c_sse42_srv.o: CFLAGS+=$(CFLAGS_CRC)
 
-# all versions of pg_crc32c_armv8.o need CFLAGS_ARMV8_CRC32C
-pg_crc32c_armv8.o: CFLAGS+=$(CFLAGS_ARMV8_CRC32C)
-pg_crc32c_armv8_shlib.o: CFLAGS+=$(CFLAGS_ARMV8_CRC32C)
-pg_crc32c_armv8_srv.o: CFLAGS+=$(CFLAGS_ARMV8_CRC32C)
+# all versions of pg_crc32c_armv8.o need CFLAGS_CRC
+pg_crc32c_armv8.o: CFLAGS+=$(CFLAGS_CRC)
+pg_crc32c_armv8_shlib.o: CFLAGS+=$(CFLAGS_CRC)
+pg_crc32c_armv8_srv.o: CFLAGS+=$(CFLAGS_CRC)
 
 #
 # Shared library versions of object files
diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index 000b075312e..eb8cc8ce170 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -597,7 +597,7 @@ fi])# PGAC_HAVE_GCC__ATOMIC_INT64_CAS
 # the other ones are, on x86-64 platforms)
 #
 # An optional compiler flag can be passed as argument (e.g. -msse4.2). If the
-# intrinsics are supported, sets pgac_sse42_crc32_intrinsics, and CFLAGS_SSE42.
+# intrinsics are supported, sets pgac_sse42_crc32_intrinsics, and CFLAGS_CRC.
 AC_DEFUN([PGAC_SSE42_CRC32_INTRINSICS],
 [define([Ac_cachevar], [AS_TR_SH([pgac_cv_sse42_crc32_intrinsics_$1])])dnl
 AC_CACHE_CHECK([for _mm_crc32_u8 and _mm_crc32_u32 with CFLAGS=$1], [Ac_cachevar],
@@ -613,7 +613,7 @@ AC_LINK_IFELSE([AC_LANG_PROGRAM([#include ],
   [Ac_cachevar=no])
 CFLAGS="$pgac_save_CFLAGS"])
 if test x"$Ac_cachevar" = x"yes"; then
-  CFLAGS_SSE42="$1"
+  CFLAGS_CRC="$1"
   pgac_sse42_crc32_intrinsics=yes
 fi
 undefine([Ac_cachevar])dnl
@@ -629,7 +629,7 @@ undefine([Ac_cachevar])dnl
 #
 # An optional compiler flag can be passed as argument (e.g.
 # -march=armv8-a+crc). If the 

Re: create subscription - improved warning message

2022-10-12 Thread Amit Kapila
On Wed, Oct 12, 2022 at 8:31 PM Tom Lane  wrote:
>
> Alvaro Herrera  writes:
> > On 2022-Oct-12, Amit Kapila wrote:
> >> Okay, then I think we can commit the last error message patch of Peter
> >> [1] as we have an agreement on the same, and then work on this as a
> >> separate patch. What do you think?
>
> > No objection.
>
> Yeah, the v3-0001 patch is fine.
>

Pushed this one.

-- 
With Regards,
Amit Kapila.




Re: Perform streaming logical transactions by background workers and parallel apply

2022-10-12 Thread Amit Kapila
On Wed, Oct 12, 2022 at 3:41 PM Peter Smith  wrote:
>
> Here are some review comments for v36-0001.
>
>
> 6. LogicalParallelApplyLoop
>
> + for (;;)
> + {
> + void*data;
> + Size len;
> + int c;
> + StringInfoData s;
> + MemoryContext oldctx;
> +
> + CHECK_FOR_INTERRUPTS();
> +
> + /* Ensure we are reading the data into our memory context. */
> + oldctx = MemoryContextSwitchTo(ApplyMessageContext);
> +
> ...
> +
> + MemoryContextSwitchTo(oldctx);
> + MemoryContextReset(ApplyMessageContext);
> + }
>
> Do those memory context switches need to happen inside the for(;;)
> loop like that? I thought perhaps those can be done *outside* of the
> loop instead of always switching and switching back on the next
> iteration.
>

I think we need to reset the ApplyMessageContext each time after
processing a message, and we also don't want to process the config file in
the ApplyMessageContext.
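
For illustration, here is a minimal sketch of the per-message pattern being
discussed (a hypothetical fragment, not the actual worker.c loop;
apply_one_message() is an assumed placeholder):

    for (;;)
    {
        MemoryContext oldctx;

        CHECK_FOR_INTERRUPTS();

        /* allocations made while applying this message go into ApplyMessageContext */
        oldctx = MemoryContextSwitchTo(ApplyMessageContext);

        apply_one_message();    /* hypothetical helper */

        /* switch back so config-file processing etc. is not done in this context */
        MemoryContextSwitchTo(oldctx);

        /* release everything allocated for the message just processed */
        MemoryContextReset(ApplyMessageContext);
    }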

> ==
>
> src/backend/replication/logical/launcher.c
>
> 12. logicalrep_worker_launch
>
> Previously I suggested may the apply process name should change
>
> FROM
> "logical replication worker for subscription %u"
> TO
> "logical replication apply worker for subscription %u"
>
> and Houzj wrote ([1] #13)
> I am not sure if it's a good idea to change existing process description.
>
> ~
>
> But that seems inconsistent to me because elsewhere this patch is
> already exposing the name to the user (like when it says "logical
> replication apply worker for subscription \"%s\" has started".
> Shouldn’t the process name match these logs?
>

I think it is okay to change the name here for the sake of consistency.

>
> 19. ApplyWorkerMain
>
> +
> +/* Logical Replication Apply worker entry point */
> +void
> +ApplyWorkerMain(Datum main_arg)
>
> Previously I suggested changing "Apply worker" to "apply worker", and
> Houzj ([1] #48) replied:
> Since it's the existing comment, I feel we can leave this.
>
> ~
>
> Normally I agree we shouldn't change original code unrelated to the
> patch, but in practice, I think no patch would be accepted that just
> changes "A" to "a", so if you don't change it here in this patch to be
> consistent then it will never happen. That's why I think it should be
> part of this patch.
>

Hmm, I think one might then extend this to many other similar cosmetic
changes in nearby areas. It sometimes distracts the reviewer if
there are unrelated changes, so it's better to avoid them.

>
> 22. get_transaction_apply_action
>
> +static TransApplyAction
> +get_transaction_apply_action(TransactionId xid,
> ParallelApplyWorkerInfo **winfo)
> +{
> + *winfo = NULL;
> +
> + if (am_parallel_apply_worker())
> + {
> + return TRANS_PARALLEL_APPLY;
> + }
> + else if (in_remote_transaction)
> + {
> + return TRANS_LEADER_APPLY;
> + }
> +
> + /*
> + * Check if we are processing this transaction using a parallel apply
> + * worker and if so, send the changes to that worker.
> + */
> + else if ((*winfo = parallel_apply_find_worker(xid)))
> + {
> + return TRANS_LEADER_SEND_TO_PARALLEL;
> + }
> + else
> + {
> + return TRANS_LEADER_SERIALIZE;
> + }
> +}
>
> 22a.
>
> Previously I suggested the statement blocks are overkill and all the
> {} should be removed, and Houzj ([1] #52a) wrote:
> I feel this style is fine.
>
> ~
>
> Sure, it is fine, but FWIW I thought it is not the normal PG coding
> convention to use unnecessary {} unless it would seem strange to omit
> them.
>

Yeah, but here we are using comments in between the else if construct
due to which using {} makes it look better. I agree that this is
mostly a question of personal preference and we can go either way but
my preference would be to use the style patch has currently used.

>
> 23. src/test/regress/sql/subscription.sql
>
> Previously I mentioned testing the 'streaming' option with no value.
> Houzj replied ([1]
> I didn't find similar tests for no value explicitly specified cases,
> so I didn't add this for now.
>
> But as I also responded ([4] #58) already to Amit:
> IMO this one is a bit different because it's not really a boolean
> option anymore - it's a kind of a hybrid boolean/enum. That's why I
> thought this ought to be tested regardless if there are existing tests
> for the (normal) boolean options.
>

I still feel this is not required. I think we have to be cautious
about not adding too many tests in this area that are of less or no
value.

-- 
With Regards,
Amit Kapila.




Re: Fast COPY FROM based on batch insert

2022-10-12 Thread Andrey Lepikhov

On 10/12/22 07:56, Etsuro Fujita wrote:
> On Tue, Oct 11, 2022 at 3:06 PM Andrey Lepikhov
>  wrote:
>> I reviewed the patch one more time. Only one question: bistate and
>> ri_FdwRoutine are strongly bound together. Maybe add some assertion on
>> (ri_FdwRoutine XOR bistate)? Just to prevent possible errors in the future.
>
> You mean the bistate member of CopyMultiInsertBuffer?

Yes

> We do not use that member at all for foreign tables, so the patch
> avoids initializing that member in CopyMultiInsertBufferInit() when
> called for a foreign table.  So we have bistate = NULL for foreign
> tables (and bistate != NULL for plain tables), as you mentioned above.
> I think it is a good idea to add such assertions.  How about adding
> them to CopyMultiInsertBufferFlush() and
> CopyMultiInsertBufferCleanup() like the attached?  In the attached I
> updated comments a bit further as well.

Yes, quite enough.

--
Regards
Andrey Lepikhov
Postgres Professional





Add missing test to test_decoding/meson.build

2022-10-12 Thread kuroda.hay...@fujitsu.com
Hi hackers,

I found that the test catalog_change_snapshot was missing from the
test_decoding/meson.build file.
PSA the patch to fix it.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



0001-add-missing-test-to-test_decoding-meson.build.patch
Description: 0001-add-missing-test-to-test_decoding-meson.build.patch


Re: how to correctly react on exception in pfree function?

2022-10-12 Thread Julien Rouhaud
On Wed, Oct 12, 2022 at 11:34:32PM -0400, Tom Lane wrote:
> Julien Rouhaud  writes:
> > On Wed, Oct 12, 2022 at 11:08:25PM -0400, Tom Lane wrote:
> >> It may be worth looking at the GUC code, which has been dealing
> >> with the same sorts of issues pretty successfully for many years.
> 
> > The GUC code relies on malloc/free,
> 
> Not for much longer [1].  And no, I don't believe that that patch
> makes any noticeable difference in the code's robustness.

Ok, so the new code still assumes that guc_free can't/shouldn't fail:

static void
set_string_field(struct config_string *conf, char **field, char *newval)
{
char   *oldval = *field;

/* Do the assignment */
*field = newval;

/* Free old value if it's not NULL and isn't referenced anymore */
if (oldval && !string_field_used(conf, oldval))
guc_free(oldval);
}


[...]

set_string_field(conf, &conf->reset_val, newval);
set_extra_field(&conf->gen, &conf->reset_extra,
                newextra);
conf->gen.reset_source = source;
conf->gen.reset_scontext = context;
conf->gen.reset_srole = srole;

Any error in guc_free will leave the struct in some inconsistent state and
possibly leak some data.  We can use the same approach for session variables.




Re: Perform streaming logical transactions by background workers and parallel apply

2022-10-12 Thread Amit Kapila
On Thu, Oct 6, 2022 at 6:09 PM kuroda.hay...@fujitsu.com
 wrote:
>
> ~~~
> 10. worker.c - apply_handle_stream_start
>
> ```
> + *
> + * XXX We can avoid sending pair of the START/STOP messages to the parallel
> + * worker because unlike apply worker it will process only one
> + * transaction-at-a-time. However, it is not clear whether that is worth the
> + * effort because it is sent after logical_decoding_work_mem changes.
> ```
>
> I can understand that START message is not needed, but is STOP really 
> removable? If leader does not send STOP to its child, does it lose a chance 
> to change the worker-state to IDLE_IN_TRANSACTION?
>

I think if we want, we can set that state before we go to sleep in the
parallel apply worker. So, I guess ideally we don't need both of these
messages, but for now it is fine, as mentioned in the comments.

-- 
With Regards,
Amit Kapila.




Re: Use LIMIT instead of Unique for DISTINCT when all distinct pathkeys are redundant

2022-10-12 Thread Richard Guo
On Thu, Oct 13, 2022 at 4:50 AM David Rowley  wrote:

> The problem is that I'm only adding the LimitPath to the
> cheapest_total_path.  I think to make your case work we'd need to add
> the LimitPath only in cases where the distinct_pathkeys are empty but
> the sort_pathkeys are not and hasDistinctOn is true and the path has
> pathkeys_contained_in(root->sort_pathkeys, path->pathkeys).  I think
> that's doable, but it's become quite a bit more complex than the patch
> I proposed. Maybe it's worth a 2nd effort for that part?


Currently in the patch the optimization is done before we check for
presorted paths or do the explicit sort of the cheapest path. How about
we move this optimization into the branch where we've found any
presorted paths?  Maybe something like:

--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -4780,11 +4780,24 @@ create_final_distinct_paths(PlannerInfo *root, RelOptInfo *input_rel,
 
 		if (pathkeys_contained_in(needed_pathkeys, path->pathkeys))
 		{
-			add_path(distinct_rel, (Path *)
-					 create_upper_unique_path(root, distinct_rel,
-											  path,
-											  list_length(root->distinct_pathkeys),
-											  numDistinctRows));
+			if (root->distinct_pathkeys == NIL)
+			{
+				Node	   *limitCount = (Node *) makeConst(INT8OID, -1, InvalidOid,
+															sizeof(int64),
+															Int64GetDatum(1), false,
+															FLOAT8PASSBYVAL);
+
+				add_path(distinct_rel, (Path *)
+						 create_limit_path(root, distinct_rel,
+										   path, NULL, limitCount,
+										   LIMIT_OPTION_COUNT, 0, 1));
+			}
+			else
+				add_path(distinct_rel, (Path *)
+						 create_upper_unique_path(root, distinct_rel,
+												  path,
+												  list_length(root->distinct_pathkeys),
+												  numDistinctRows));

This again makes the code less 'straightforward', just to cover a more
uncommon case. I'm also not sure if it's worth doing.

Thanks
Richard


Re: Use LIMIT instead of Unique for DISTINCT when all distinct pathkeys are redundant

2022-10-12 Thread Richard Guo
On Thu, Oct 13, 2022 at 4:41 AM David Rowley  wrote:

> We could do something like set some bool flag in PlannerInfo to tell
> the planner not to bother adding the final LimitPath as we've already
> added another which does the job, but is it really worth adding that
> complexity for this patch? You already mentioned that "this patch is
> very straightforward". I don't think it would be if we added code to
> avoid the LimitPath duplication.


Yeah, maybe this is the right way to do it. I agree that this would
complicate the code. Not sure if it's worth doing.

Thanks
Richard


Re: how to correctly react on exception in pfree function?

2022-10-12 Thread Tom Lane
Julien Rouhaud  writes:
> On Wed, Oct 12, 2022 at 11:08:25PM -0400, Tom Lane wrote:
>> It may be worth looking at the GUC code, which has been dealing
>> with the same sorts of issues pretty successfully for many years.

> The GUC code relies on malloc/free,

Not for much longer [1].  And no, I don't believe that that patch
makes any noticeable difference in the code's robustness.

In the end, a bug affecting these considerations is a bug to be fixed
once it's found.  Building potentially-themselves-buggy defenses against
hypothetical bugs is an exercise with rapidly diminishing returns.

regards, tom lane

[1] https://www.postgresql.org/message-id/flat/2982579.1662416...@sss.pgh.pa.us




Re: how to correctly react on exception in pfree function?

2022-10-12 Thread Julien Rouhaud
On Wed, Oct 12, 2022 at 11:08:25PM -0400, Tom Lane wrote:
> Julien Rouhaud  writes:
> > We can change the API to accept an optional new value (and the few other
> > pieces of needed information) when cleaning up the old one, but that's
> > adding some complication just to deal with a possible error in pfree.  So
> > it's still unclear to me what to do here.
> 
> I think it's worth investing some effort in ensuring consistency
> of persistent data structures in the face of errors.  I doubt it's
> worth investing effort in avoiding leaks in the face of errors.

So if, e.g.,

LET myvar = somebigstring;

errors out because of a hypothetical pfree() error, it would be ok to leak that
memory as long as everything is consistent, meaning here that myvar is in a
normal "reset" state?

> In any case, thinking of it in terms of "trapping" errors is the
> wrong approach.  We don't have a cheap or complication-free way
> to do that, mainly because you can't trap just one error cause.
>
> It may be worth looking at the GUC code, which has been dealing
> with the same sorts of issues pretty successfully for many years.

The GUC code relies on malloc/free, so any hypothetical error during free
should abort and force an emergency shutdown.  I don't think it's ok for
session variables to bypass memory contexts, and forcing a panic wouldn't be
possible without some PG_TRY block.




Re: how to correctly react on exception in pfree function?

2022-10-12 Thread Tom Lane
Julien Rouhaud  writes:
> We can change the API to accept an optional new value (and the few other
> pieces of needed information) when cleaning up the old one, but that's adding
> some complication just to deal with a possible error in pfree.  So it's still
> unclear to me what to do here.

I think it's worth investing some effort in ensuring consistency
of persistent data structures in the face of errors.  I doubt it's
worth investing effort in avoiding leaks in the face of errors.
In any case, thinking of it in terms of "trapping" errors is the
wrong approach.  We don't have a cheap or complication-free way
to do that, mainly because you can't trap just one error cause.

It may be worth looking at the GUC code, which has been dealing
with the same sorts of issues pretty successfully for many years.

regards, tom lane




Re: how to correctly react on exception in pfree function?

2022-10-12 Thread Julien Rouhaud
On Wed, Oct 12, 2022 at 10:34:29PM -0400, Tom Lane wrote:
> Julien Rouhaud  writes:
> > On Wed, Oct 12, 2022 at 07:24:53PM -0400, Tom Lane wrote:
> >> There are hundreds, if not thousands, of "shouldn't ever happen" elogs
> >> in Postgres.  We don't make any attempt to trap any of them.  Why do
> >> you think this one should be different?
> 
> > Because session variables are allocated in a persistent memory context, so
> > there's a code doing something like this to implement LET variable:
> 
> > [...]
> > oldctxt = MemoryContextSwitchTo(SomePersistentContext);
> > newval = palloc(...);
> > MemoryContextSwitchTo(oldctxt);
> > /* No error should happen after that point or we leak memory */
> > pfree(var->val);
> > var->val = newval;
> > return;
> 
> > Any error thrown in pfree would mean leaking memory forever in that backend.
> 
> I've got little sympathy for that complaint, because an error
> in pfree likely means you have worse problems than a leak.

I agree, thus the question of what should be done in such case.

> Moreover, what it most likely means is that you messed up your
> memory allocations sometime earlier; making your code more
> complicated makes that risk higher not lower.
> 
> Having said that, the above is really not very good code.
> Better would look like
> 
>   newval = MemoryContextAlloc(SomePersistentContext, ...);
>   ... fill newval ...
>   oldval = var->val;
>   var->val = newval;
>   pfree(oldval);
> 
> which leaves your data structure in a consistent state whether
> pfree fails or not (and regardless of whether it fails before or
> after recycling your chunk).  That property is way more important
> than the possibility of leaking some memory.

Well, it was an oversimplification to outline the reason for the question.

The real code has a dedicated function to clean up the variable (used for other
cases, like ON TRANSACTION END RESET), and one of the callers will use that
function to implement LET, so that approach isn't possible as-is.  Note that for
the other callers the code is written like that so the structure is consistent
whether pfree errors out or not.

We can change the API to accept an optional new value (and the few other pieces
of needed information) when cleaning up the old one, but that's adding some
complication just to deal with a possible error in pfree.  So it's still unclear
to me what to do here.




Re: how to correctly react on exception in pfree function?

2022-10-12 Thread Tom Lane
Julien Rouhaud  writes:
> On Wed, Oct 12, 2022 at 07:24:53PM -0400, Tom Lane wrote:
>> There are hundreds, if not thousands, of "shouldn't ever happen" elogs
>> in Postgres.  We don't make any attempt to trap any of them.  Why do
>> you think this one should be different?

> Because session variables are allocated in a persistent memory context, so
> there's a code doing something like this to implement LET variable:

> [...]
> oldctxt = MemoryContextSwitchTo(SomePersistentContext);
> newval = palloc(...);
> MemoryContextSwitchTo(oldctxt);
> /* No error should happen after that point or we leak memory */
> pfree(var->val);
> var->val = newval;
> return;

> Any error thrown in pfree would mean leaking memory forever in that backend.

I've got little sympathy for that complaint, because an error
in pfree likely means you have worse problems than a leak.
Moreover, what it most likely means is that you messed up your
memory allocations sometime earlier; making your code more
complicated makes that risk higher not lower.

Having said that, the above is really not very good code.
Better would look like

newval = MemoryContextAlloc(SomePersistentContext, ...);
... fill newval ...
oldval = var->val;
var->val = newval;
pfree(oldval);

which leaves your data structure in a consistent state whether
pfree fails or not (and regardless of whether it fails before or
after recycling your chunk).  That property is way more important
than the possibility of leaking some memory.

regards, tom lane




Re: how to correctly react on exception in pfree function?

2022-10-12 Thread Julien Rouhaud
On Wed, Oct 12, 2022 at 07:24:53PM -0400, Tom Lane wrote:
> Pavel Stehule  writes:
> > I had a talk with Julien about the correct handling of an exception raised
> > by pfree function.
> 
> Currently, this exception (elog(ERROR, "could not find block containing
> chunk %p", chunk);) is never specially handled.
> 
> There are hundreds, if not thousands, of "shouldn't ever happen" elogs
> in Postgres.  We don't make any attempt to trap any of them.  Why do
> you think this one should be different?

Because session variables are allocated in a persistent memory context, so
there's a code doing something like this to implement LET variable:

[...]
oldctxt = MemoryContextSwitchTo(SomePersistentContext);
newval = palloc(...);
MemoryContextSwitchTo(oldctxt);
/* No error should happen after that point or we leak memory */
pfree(var->val);
var->val = newval;
return;

Any error thrown in pfree would mean leaking memory forever in that backend.

Is it ok to leak memory in such a should-not-happen case, or should there be
some safeguard?




Re: Issue in GIN fast-insert: XLogBeginInsert + Read/LockBuffer ordering

2022-10-12 Thread Michael Paquier
On Wed, Oct 12, 2022 at 08:54:34PM -0400, Tom Lane wrote:
> Don't we need to back-patch these fixes?

I guess I should, though I ended up not doing so due to the
unlikeliness of the problem, where we would need the combination of a
page eviction with an error in the critical section to force a PANIC,
or a crash before the WAL gets inserted.  Other opinions?
--
Michael




Re: Issue in GIN fast-insert: XLogBeginInsert + Read/LockBuffer ordering

2022-10-12 Thread Tom Lane
Michael Paquier  writes:
> Thanks.  This looks fine on a second look, so applied.

Don't we need to back-patch these fixes?

regards, tom lane




RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher

2022-10-12 Thread kuroda.hay...@fujitsu.com
Dear Önder,

Thanks for updating the patch!

I think what you are saying seems reasonable.
I have no further comments now. Thanks for updating so quickly.


Best Regards,
Hayato Kuroda
FUJITSU LIMITED



Re: GUC values - recommended way to declare the C variables?

2022-10-12 Thread Michael Paquier
On Wed, Oct 12, 2022 at 12:12:15PM -0700, Nathan Bossart wrote:
> Looks reasonable to me.  I've marked this as ready-for-committer.

So, the initial values of max_wal_senders and max_replication_slots
became out of sync with their defaults in guc_tables.c.  FWIW, I would
argue the opposite way: rather than removing the initializations, I
would fix and keep them, as these references can be useful when
browsing the area of the code related to such GUCs, without having to
look at guc_tables.c for this information.
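
As a hypothetical illustration of that style (the GUC name, values and hooks
below are made up, not the ones from the patch), the C-level initializer would
simply mirror the boot value in guc_tables.c:

    /* In the module that owns the variable: */
    int         my_guc_setting = 10;   /* keep in sync with boot_val in guc_tables.c */

    /* Corresponding (hypothetical) entry in the ConfigureNamesInt array in guc_tables.c: */
    {
        {"my_guc_setting", PGC_SIGHUP, RESOURCES,
            gettext_noop("An example setting."),
            NULL
        },
        &my_guc_setting,
        10, 0, 100,                     /* boot_val, min, max */
        NULL, NULL, NULL                /* check, assign, show hooks */
    },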
--
Michael




Re: Allow WindowFuncs prosupport function to use more optimal WindowClause options

2022-10-12 Thread David Rowley
On Wed, 12 Oct 2022 at 16:33, Vik Fearing  wrote:
> Per spec, the ROW_NUMBER() window function is not even allowed to have a
> frame specified.
>
>  b) The window framing clause of WDX shall not be present.
>
> Also, the specification for ROW_NUMBER() is:
>
>  f) ROW_NUMBER() OVER WNS is equivalent to the <window function>:
>
>  COUNT (*) OVER (WNS1 ROWS UNBOUNDED PRECEDING)
>
>
> So I don't think we need to test for anything at all and can
> indiscriminately add or replace the frame with ROWS UNBOUNDED PRECEDING.

Thanks for digging that out.

Just above that I see:

RANK() OVER WNS is equivalent to:
( COUNT (*) OVER (WNS1 RANGE UNBOUNDED PRECEDING)
- COUNT (*) OVER (WNS1 RANGE CURRENT ROW) + 1 )

and

DENSE_RANK() OVER WNS is equivalent to the <window function>:
COUNT (DISTINCT ROW ( VE1, ..., VEN ) )
OVER (WNS1 RANGE UNBOUNDED PRECEDING)

So it looks like the same can be done for rank() and dense_rank() too.
I've added support for those in the attached.

This also got me thinking that maybe we should be a bit more generic
with the support function node tag name. After looking at the
nodeWindowAgg.c code for a while, I wondered if we might want to add
some optimisations in the future that make WindowAgg not bother
storing tuples for row_number(), rank() and dense_rank().  That might
save a bit of overhead from the tuple store.  I imagined that we'd
want to allow the expansion of this support request so that the
support function could let the planner know if any tuples will be
accessed by the window function or not.  The
SupportRequestWFuncOptimizeFrameOpts name didn't seem very fitting for
that so I adjusted it to become SupportRequestOptimizeWindowClause
instead.

The updated patch is attached.

David
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 5d0fd6e072..74a8bafd8b 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -38,6 +38,7 @@
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
+#include "nodes/supportnodes.h"
 #ifdef OPTIMIZER_DEBUG
 #include "nodes/print.h"
 #endif
@@ -207,6 +208,8 @@ static PathTarget *make_partial_grouping_target(PlannerInfo *root,
 												 PathTarget *grouping_target,
 												 Node *havingQual);
 static List *postprocess_setop_tlist(List *new_tlist, List *orig_tlist);
+static void optimize_window_clauses(PlannerInfo *root,
+									WindowFuncLists *wflists);
 static List *select_active_windows(PlannerInfo *root, WindowFuncLists *wflists);
 static PathTarget *make_window_input_target(PlannerInfo *root,
 											PathTarget *final_target,
@@ -1422,7 +1425,16 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
 		wflists = find_window_functions((Node *) root->processed_tlist,
 										list_length(parse->windowClause));
 		if (wflists->numWindowFuncs > 0)
+		{
+			/*
+			 * See if any modifications can be made to each WindowClause
+			 * to allow the executor to execute the WindowFuncs more
+			 * quickly.
+			 */
+			optimize_window_clauses(root, wflists);
+
 			activeWindows = select_active_windows(root, wflists);
+		}
 		else
 			parse->hasWindowFuncs = false;
 	}
@@ -5391,6 +5403,85 @@ postprocess_setop_tlist(List *new_tlist, List *orig_tlist)
 	return new_tlist;
 }
 
+/*
+ * optimize_window_clauses
+ *		Call each WindowFunc's prosupport function to see if we're able to
+ *		make any adjustments to any of the WindowClause's so that the executor
+ *		can execute the window functions in a more optimal way.
+ *
+ * Currently we only allow adjustments to the WindowClause's frameOptions.  We
+ * may allow more things to be done here in the future.
+ */
+static void
+optimize_window_clauses(PlannerInfo *root, WindowFuncLists *wflists)
+{
+	List	   *windowClause = root->parse->windowClause;
+	ListCell   *lc;
+
+	foreach(lc, windowClause)
+	{
+		WindowClause *wc = lfirst_node(WindowClause, lc);
+		ListCell   *lc2;
+		int			optimizedFrameOptions = 0;
+
+		Assert(wc->winref <= wflists->maxWinRef);
+
+		/* skip any WindowClauses that have no WindowFuncs */
+		if

Re: Issue in GIN fast-insert: XLogBeginInsert + Read/LockBuffer ordering

2022-10-12 Thread Michael Paquier
On Wed, Oct 12, 2022 at 09:39:11PM +0800, Zhang Mingli wrote:
> Patch added.

Thanks.  This looks fine on a second look, so applied.
--
Michael




Re: Bloom filter Pushdown Optimization for Merge Join

2022-10-12 Thread Zhihong Yu
On Wed, Oct 12, 2022 at 3:36 PM Lyu Pan  wrote:

> Hello Zhihong Yu & Tomas Vondra,
>
> Thank you so much for your review and feedback!
>
> We made some updates based on previous feedback and attached the new
> patch set. Due to time constraints, we didn't get to resolve all the
> comments, and we'll continue to improve this patch.
>
> > In this prototype, the cost model is based on an assumption that there is
> > a linear relationship between the performance gain from using a semijoin
> > filter and the estimated filtering rate:
> > % improvement to Merge Join cost = 0.83 * estimated filtering rate -
> 0.137.
> >
> > How were the coefficients (0.83 and 0.137) determined ?
> > I guess they were based on the results of running certain workload.
>
> Right, the coefficients (0.83 and 0.137) were determined based on some
> preliminary testing. The current costing model is pretty naive and
> we'll work on a more robust costing model in future work.
>
>
> > I agree, in principle, although I think the current logic / formula is a
> > bit too crude and fitted to the simple data used in the test. I think
> > this needs to be formulated as a regular costing issue, considering
> > stuff like cost of the hash functions, and so on.
> >
> > I think this needs to do two things:
> >
> > 1) estimate the cost of building the bloom filter - This shall depend on
> > the number of rows in the inner relation, number/cost of the hash
> > functions (which may be higher for some data types), etc.
> >
> > 2) estimate improvement for the probing branch - Essentially, we need to
> > estimate how much we save by filtering some of the rows, but this also
> > needs to include the cost of probing the bloom filter.
> >
> > This will probably require some improvements to the lib/bloomfilter, in
> > order to estimate the false positive rate - this may matter a lot for
> > large data sets and small work_mem values. The bloomfilter library
> > simply reduces the size of the bloom filter, which increases the false
> > positive rate. At some point it'll start reducing the benefit.
> >
>
> These suggestions make a lot of sense. The current costing model is
> definitely not good enough, and we plan to work on a more robust
> costing model as we continue to improve the patch.
>
>
> > OK. Could also build the bloom filter in shared memory?
> >
>
> We thought about this approach but didn't prefer this one because if
> all worker processes share the same bloom filter in shared memory, we
> need to frequently lock and unlock the bloom filter to avoid race
> conditions. So we decided to have each worker process create its own
> bloom filter.
>
>
> > IMHO we shouldn't make too many conclusions from these examples. Yes, it
> > shows merge join can be improved, but for cases where a hashjoin works
> > better so we wouldn't use merge join anyway.
> >
> > I think we should try constructing examples where either merge join wins
> > already (and gets further improved by the bloom filter), or would lose
> > to hash join and the bloom filter improves it enough to win.
> >
> > AFAICS that requires a join of two large tables - large enough that hash
> > join would need to be batched, or pre-sorted inputs (which eliminates
> > the explicit Sort, which is the main cost in most cases).
> >
> > The current patch only works with sequential scans, which eliminates the
> > second (pre-sorted) option. So let's try the first one - can we invent
> > an example with a join of two large tables where a merge join would win?
> >
> > Can we find such example in existing benchmarks like TPC-H/TPC-DS.
> >
>
> Agreed. The current examples are only intended to show us that using
> bloom filters in merge join could improve the merge join performance
> in some cases. We are working on testing more examples where merge join
> with a bloom filter could out-perform hash join, which should be more
> persuasive.
>
>
> > The bloom filter is built by the first seqscan (on t0), and then used by
> > the second seqscan (on t1). But this only works because we always run
> > the t0 scan to completion (because we're feeding it into Sort) before we
> > start scanning t1.
> >
> > But when the scan on t1 switches to an index scan, it's over - we'd be
> > building the filter without being able to probe it, and when we finish
> > building it we no longer need it. So this seems pretty futile.
> >
> > It might still improve plans like
> >
> >->  Merge Join
> >  Merge Cond: (t0.c1 = t1.c1)
> >  SemiJoin Filter Created Based on: (t0.c1 = t1.c1)
> >  SemiJoin Estimated Filtering Rate: 1.
> > ->  Sort
> >Sort Key: t0.c1
> >->  Seq Scan on t0
> >  ->  Index Scan on t1
> >
> > But I don't know how common/likely that actually is. I'd expect to have
> > an index on both sides, but perhaps I'm wrong.
> >
> > This is why hashjoin seems like a more natural fit for the bloom filter,
> > BTW, because there we have a guarantee the inner 

Re: how to correctly react on exception in pfree function?

2022-10-12 Thread Tom Lane
Pavel Stehule  writes:
> I had a talk with Julien about the correct handling of an exception raised
> by pfree function.

> Currently, this exception (elog(ERROR, "could not find block containing
> chunk %p", chunk);) is never specially handled.

There are hundreds, if not thousands, of "shouldn't ever happen" elogs
in Postgres.  We don't make any attempt to trap any of them.  Why do
you think this one should be different?

regards, tom lane




Re: Pluggable toaster

2022-10-12 Thread Nikita Malakhov
Hi hackers!

Due to recent changes in postgres.h cfbot is failing again.
Here's rebased patch:
v22-0001-toaster-interface.patch - Pluggable TOAST API interface with
reference (original) TOAST mechanics - new API is introduced but
reference TOAST is still left unchanged;
v22-0002-toaster-default.patch - Default TOAST mechanics is re-implemented
using TOAST API and is plugged in as Default Toaster;
v22-0003-toaster-docs.patch - Pluggable TOAST API documentation package

Actual GitHub branch resides at
https://github.com/postgrespro/postgres/tree/toasterapi_clean
Please note that because development is ongoing, the actual branch
contains much more than is provided here.


-- 
Regards,
Nikita Malakhov
Postgres Professional
https://postgrespro.ru/


v22-0002-toaster-default.patch.gz
Description: GNU Zip compressed data


v22-0003-toaster-docs.patch.gz
Description: GNU Zip compressed data


v22-0001-toaster-interface.patch.gz
Description: GNU Zip compressed data


Re: Tracking last scan time

2022-10-12 Thread Andres Freund
Hi,

On 2022-10-12 12:50:31 -0700, Andres Freund wrote:
> I think this should have at least a basic test in src/test/regress/sql/stats.sql. If
> I can write one in a few minutes I'll go for that, otherwise will reply
> detailing difficulties.

Took a bit longer (+lunch). Attached.


In the attached 0001, the patch to make GetCurrentTransactionStopTimestamp()
set xactStopTimestamp, I added a few comment updates and an Assert() to ensure
that CurrentTransactionState->state is TRANS_(DEFAULT|COMMIT|ABORT|PREPARE). I
am worried that otherwise we might end up with someone ending up using it in a
place before the end of the transaction, which'd then end up recording the
wrong timestamp in the commit/abort record.


For 0002, the commit adding lastscan, I added catversion/stats version bumps
(because I was planning to commit it already...), a commit message, and that
minor docs change mentioned earlier.


0003 adds the tests mentioned above. I plan to merge them with 0002, but left
them separate for easier review for now.

To be able to compare timestamps for > not just >= we need to make sure that
two subsequent timestamps differ. The attached achieves this by sleeping for
100ms between those points - we do that in other places already. I'd started
out with 10ms, which I am fairly sure would suffice, but then deciced to copy
the existing 100ms sleeps.

I verified tests pass under valgrind, debug_discard_caches and after I make
pgstat_report_stat() only flush when force is passed in.

Greetings,

Andres Freund
>From 976679bc590560d3c051e30e7576e7d3d03070e6 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Wed, 12 Oct 2022 14:43:41 -0700
Subject: [PATCH v5 1/3] Have GetCurrentTransactionStopTimestamp() set
 xactStopTimestamp if unset

Previously GetCurrentTransactionStopTimestamp() computed a new timestamp
whenever xactStopTimestamp was unset and xactStopTimestamp was only set when a
commit or abort record was written.

An upcoming patch will add additional calls to
GetCurrentTransactionStopTimestamp() from pgstats. To avoid computing
timestamps multiple times, set xactStopTimestamp in
GetCurrentTransactionStopTimestamp() if not already set.

Author: Dave Page 
Reviewed-by: Andres Freund 
Reviewed-by: Vik Fearing 
Discussion: https://postgr.es/m/20220906155325.an3xesq5o3fq36gt%40awork3.anarazel.de
---
 src/backend/access/transam/xact.c | 42 +++
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index c1ffbd89b88..fd5103a78e2 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -263,7 +263,10 @@ static bool currentCommandIdUsed;
 /*
  * xactStartTimestamp is the value of transaction_timestamp().
  * stmtStartTimestamp is the value of statement_timestamp().
- * xactStopTimestamp is the time at which we log a commit or abort WAL record.
+ * xactStopTimestamp is the time at which we log a commit / abort WAL record,
+ * or if that was skipped, the time of the first subsequent
+ * GetCurrentTransactionStopTimestamp() call.
+ *
  * These do not change as we enter and exit subtransactions, so we don't
  * keep them inside the TransactionState stack.
  */
@@ -865,15 +868,24 @@ GetCurrentStatementStartTimestamp(void)
 /*
  *	GetCurrentTransactionStopTimestamp
  *
- * We return current time if the transaction stop time hasn't been set
- * (which can happen if we decide we don't need to log an XLOG record).
+ * If the transaction stop time hasn't already been set, which can happen if
+ * we decided we don't need to log an XLOG record, set xactStopTimestamp.
  */
 TimestampTz
 GetCurrentTransactionStopTimestamp(void)
 {
-	if (xactStopTimestamp != 0)
-		return xactStopTimestamp;
-	return GetCurrentTimestamp();
+	TransactionState s PG_USED_FOR_ASSERTS_ONLY = CurrentTransactionState;
+
+	/* should only be called after commit / abort processing */
+	Assert(s->state == TRANS_DEFAULT ||
+		   s->state == TRANS_COMMIT ||
+		   s->state == TRANS_ABORT ||
+		   s->state == TRANS_PREPARE);
+
+	if (xactStopTimestamp == 0)
+		xactStopTimestamp = GetCurrentTimestamp();
+
+	return xactStopTimestamp;
 }
 
 /*
@@ -891,15 +903,6 @@ SetCurrentStatementStartTimestamp(void)
 		Assert(stmtStartTimestamp != 0);
 }
 
-/*
- *	SetCurrentTransactionStopTimestamp
- */
-static inline void
-SetCurrentTransactionStopTimestamp(void)
-{
-	xactStopTimestamp = GetCurrentTimestamp();
-}
-
 /*
  *	GetCurrentTransactionNestLevel
  *
@@ -1396,9 +1399,7 @@ RecordTransactionCommit(void)
 		START_CRIT_SECTION();
 		MyProc->delayChkptFlags |= DELAY_CHKPT_START;
 
-		SetCurrentTransactionStopTimestamp();
-
-		XactLogCommitRecord(xactStopTimestamp,
+		XactLogCommitRecord(GetCurrentTransactionStopTimestamp(),
 			nchildren, children, nrels, rels,
 			ndroppedstats, droppedstats,
 			nmsgs, invalMessages,
@@ -1422,7 +1423,7 @@ RecordTransactionCommit(void)
 		 */
 
 		if (!replorigin || 

Re: create subscription - improved warning message

2022-10-12 Thread Peter Smith
On Thu, Oct 13, 2022 at 2:01 AM Tom Lane  wrote:
>
> Alvaro Herrera  writes:
> > On 2022-Oct-12, Amit Kapila wrote:
> >> Okay, then I think we can commit the last error message patch of Peter
> >> [1] as we have an agreement on the same, and then work on this as a
> >> separate patch. What do you think?
>
> > No objection.
>
> Yeah, the v3-0001 patch is fine.  I agree that the docs change needs
> more work.

Thanks to everybody for the feedback/suggestions. I will work on
improving the documentation for this and post something in a day or
so.

--
Kind Regards,
Peter Smith.
Fujitsu Australia.




Re: problems with making relfilenodes 56-bits

2022-10-12 Thread Andres Freund
Hi,

On 2022-10-12 22:05:30 +0200, Matthias van de Meent wrote:
> On Wed, 5 Oct 2022 at 01:50, Andres Freund  wrote:
> > On 2022-10-03 10:01:25 -0700, Andres Freund wrote:
> > > On 2022-10-03 08:12:39 -0400, Robert Haas wrote:
> > > > On Fri, Sep 30, 2022 at 8:20 PM Andres Freund  
> > > > wrote:
> > > > I thought about trying to buy back some space elsewhere, and I think
> > > > that would be a reasonable approach to getting this committed if we
> > > > could find a way to do it. However, I don't see a terribly obvious way
> > > > of making it happen.
> > >
> > > I think there's plenty potential...
> >
> > I light dusted off my old varint implementation from [1] and converted the
> > RelFileLocator and BlockNumber from fixed width integers to varint ones. 
> > This
> > isn't meant as a serious patch, but an experiment to see if this is a path
> > worth pursuing.
> >
> > A run of installcheck in a cluster with autovacuum=off, full_page_writes=off
> > (for increased reproducability) shows a decent saving:
> >
> > master: 241106544 - 230 MB
> > varint: 227858640 - 217 MB
> 
> I think a significant part of this improvement comes from the premise
> of starting with a fresh database. tablespace OID will indeed most
> likely be low, but database OID may very well be linearly distributed
> if concurrent workloads in the cluster include updating (potentially
> unlogged) TOASTed columns and the databases are not created in one
> "big bang" but over the lifetime of the cluster. In that case DBOID
> will consume 5B for a significant fraction of databases (anything with
> OID >=2^28).
> 
> My point being: I don't think that we should have different WAL
> performance in databases which is dependent on which OID was assigned
> to that database.

To me this is raising the bar to an absurd level. Some minor space usage
increase after oid wraparound and for very large block numbers isn't a huge
issue - if you're in that situation you already have a huge amount of wal.


> 0002 - Rework XLogRecord
> This makes many fields in the xlog header optional, reducing the size
> of many xlog records by several bytes. This implements the design I
> shared in my earlier message [1].
> 
> 0003 - Rework XLogRecordBlockHeader.
> This patch could be applied on current head, and saves some bytes in
> per-block data. It potentially saves some bytes per registered
> block/buffer in the WAL record (max 2 bytes for the first block, after
> that up to 3). See the patch's commit message in the patch for
> detailed information.

The amount of complexity these two introduce seems quite substantial to
me, both from a maintenance and a runtime perspective. I think we'd be
better off using building blocks like variable-length encoded values than
open coding it in many places.
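
For illustration, here is a self-contained LEB128-style variable-length integer
encoder/decoder of the kind meant by "building blocks" (only a sketch; it is
not the pg_varint code from the earlier message, and the actual encoding used
in the experiment differs):

    #include <stddef.h>
    #include <stdint.h>

    /*
     * Encode an unsigned 64-bit value 7 bits at a time, least significant
     * group first; the high bit of each byte marks "more bytes follow".
     * Small values (low OIDs, low block numbers) need 1-2 bytes, while
     * values >= 2^28 need 5 bytes or more.  Returns the number of bytes
     * written to buf (at most 10).
     */
    static size_t
    varint_encode_uint64(uint64_t value, uint8_t *buf)
    {
        size_t      len = 0;

        do
        {
            uint8_t     b = value & 0x7F;

            value >>= 7;
            if (value != 0)
                b |= 0x80;      /* continuation bit */
            buf[len++] = b;
        } while (value != 0);

        return len;
    }

    /* Decode a value written by varint_encode_uint64(); returns bytes consumed. */
    static size_t
    varint_decode_uint64(const uint8_t *buf, uint64_t *value)
    {
        uint64_t    result = 0;
        size_t      len = 0;
        int         shift = 0;

        for (;;)
        {
            uint8_t     b = buf[len++];

            result |= (uint64_t) (b & 0x7F) << shift;
            if ((b & 0x80) == 0)
                break;
            shift += 7;
        }

        *value = result;
        return len;
    }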

Greetings,

Andres Freund




Re: allowing for control over SET ROLE

2022-10-12 Thread Nathan Bossart
On Fri, Sep 30, 2022 at 04:34:32PM -0400, Robert Haas wrote:
> That thread has not reached an entirely satisfying conclusion.
> However, the behavior that was deemed outright buggy over there has
> been fixed. The remaining question is what to do about commands that
> allow you to give objects to other users (like ALTER ... OWNER TO)
> or commands that allow you to create objects owned by other
> users (like CREATE DATABASE ... OWNER). I have, in this version,
> adopted the proposal by Wolfgang Walther on the other thread to make
> this controlled by the new SET option. This essentially takes the view
> that the ability to create objects owned by another user is not
> precisely a privilege, and is thus not inherited just because the
> INHERIT option is set on the GRANT, but it is something you can do if
> you could SET ROLE to that role, so we make it dependent on the SET
> option. This logic is certainly debatable, but it does have the
> practical advantage of making INHERIT TRUE, SET FALSE a useful
> combination of settings for predefined roles. It's also 100%
> backward-compatible, whereas if we made the behavior dependent on the
> INHERIT option, users could potentially notice behavior changes after
> upgrading to v16.

I'm not sure about tying the ownership stuff to this new SET privilege.
While you noted some practical advantages, I'd expect users to find it kind
of surprising.  Also, for predefined roles, I think you need to be careful
about distributing ADMIN, as anyone with ADMIN on a predefined role can
just GRANT SET to work around the restrictions.  I don't have a better
idea, though, so perhaps neither of these things is a deal-breaker.  I was
tempted to suggest using ADMIN instead of SET for the ownership stuff, but
that wouldn't be backward-compatible, and you'd still be able to work
around it to some extent with SET (e.g., SET ROLE followed by CREATE
DATABASE).

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Use LIMIT instead of Unique for DISTINCT when all distinct pathkeys are redundant

2022-10-12 Thread David Rowley
On Thu, 13 Oct 2022 at 01:13, Richard Guo  wrote:
> For DISTINCT ON, if all the distinct pathkeys are redundant but there
> are available sort pathkeys, then for adequately-presorted paths I think
> we can also apply this optimization, using a Limit 1 rather than Unique.
>
> regression=# explain (analyze, costs off, timing off) select distinct on 
> (four) * from tenk1 where four = 0 order by four, hundred desc;
>QUERY PLAN
> 
>  Limit (actual rows=1 loops=1)
>->  Index Scan Backward using tenk1_hundred on tenk1 (actual rows=1 
> loops=1)
>  Filter: (four = 0)
>  Rows Removed by Filter: 300

I don't think we can optimise this case, at least not the same way I'm
doing it in the patch I attached.

The problem is that I'm only adding the LimitPath to the
cheapest_total_path.  I think to make your case work we'd need to add
the LimitPath only in cases where the distinct_pathkeys are empty but
the sort_pathkeys are not and hasDistinctOn is true and the path has
pathkeys_contained_in(root->sort_pathkeys, path->pathkeys).  I think
that's doable, but it's become quite a bit more complex than the patch
I proposed. Maybe it's worth a 2nd effort for that part?
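
For what it's worth, a rough sketch of the extra condition described above
(a hypothetical fragment; distinct_rel, path and limitCount are assumed to come
from the surrounding create_final_distinct_paths() context, and this is not
code from the posted patch):

    /*
     * DISTINCT ON with all distinct pathkeys redundant: a suitably presorted
     * path only needs its first row, so a LIMIT 1 path would do the job.
     */
    if (root->distinct_pathkeys == NIL &&
        root->sort_pathkeys != NIL &&
        root->parse->hasDistinctOn &&
        pathkeys_contained_in(root->sort_pathkeys, path->pathkeys))
    {
        add_path(distinct_rel, (Path *)
                 create_limit_path(root, distinct_rel, path, NULL, limitCount,
                                   LIMIT_OPTION_COUNT, 0, 1));
    }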

David




Re: Use LIMIT instead of Unique for DISTINCT when all distinct pathkeys are redundant

2022-10-12 Thread David Rowley
On Thu, 13 Oct 2022 at 00:30, Richard Guo  wrote:
> I also have concerns about the 2 Limit nodes pointed by the comment
> inside the patch. Maybe we can check with limit_needed() and manually
> add the limit node only if there is no LIMIT clause in the origin query?

I wasn't hugely concerned about this. I think we're a little limited
in what we can actually do about it too.

It seems easy enough to skip adding the LimitPath in
create_final_distinct_paths() if the existing query already has
exactly LIMIT 1. However, if they've written LIMIT 10 or LIMIT
random()*1234 then we must add the LimitPath to ensure we only get 1
row rather than 10 or some random number.
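
A minimal sketch of the kind of check that would be involved (a hypothetical
helper, not part of the patch), testing whether the query's own LIMIT clause
is already a plain constant 1 with no OFFSET:

    /*
     * Hypothetical helper: does the query already have exactly "LIMIT 1"
     * (and no OFFSET), making an additional LimitPath redundant?
     */
    static bool
    query_has_constant_limit_one(Query *parse)
    {
        Const      *limit;

        if (parse->limitOffset != NULL || parse->limitCount == NULL)
            return false;

        if (parse->limitOption != LIMIT_OPTION_COUNT)
            return false;       /* e.g. FETCH FIRST ... WITH TIES */

        if (!IsA(parse->limitCount, Const))
            return false;       /* e.g. LIMIT random()*1234 */

        limit = (Const *) parse->limitCount;
        if (limit->constisnull)
            return false;       /* LIMIT NULL means no limit */

        return DatumGetInt64(limit->constvalue) == 1;
    }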

As for getting rid of the LIMIT 10 / LIMIT random()*1234, we store the
LIMIT clause information in the parse and currently that's what the
planner uses when creating the LimitPath for the LIMIT clause.  I'd
quite like to avoid making any adjustments to the parse fields here.
(There's a general project desire to move away from the planner
modifying the parse. If we didn't do that we could do things like
re-plan queries with stronger optimization levels when they come out
too costly.)

We could do something like set some bool flag in PlannerInfo to tell
the planner not to bother adding the final LimitPath as we've already
added another which does the job, but is it really worth adding that
complexity for this patch? You already mentioned that "this patch is
very straightforward". I don't think it would be if we added code to
avoid the LimitPath duplication.

David




Re: Tracking last scan time

2022-10-12 Thread Andres Freund
Hi,

On 2022-10-12 15:40:21 +0900, Michael Paquier wrote:
> On Mon, Oct 03, 2022 at 12:55:40PM +0100, Dave Page wrote:
> > Thanks. It's just the changes in xact.c, so it doesn't seem like it would
> > cause you any more work either way, in which case, I'll leave it to you :-)
> 
> Okay, I have just moved the patch to the next CF then, still marked as
> ready for committer.  Are you planning to look at that?

Yep, doing so right now.

I think this should have at least a basic test in src/test/regress/sql/stats.sql. If
I can write one in a few minutes I'll go for that, otherwise will reply
detailing difficulties.


> +  
> +   The time of the last sequential scan of this table, based on the
> +   most recent transaction stop time
> +  

Related rows seem to say "on this table".

Greetings,

Andres Freund




how to correctly react on exception in pfree function?

2022-10-12 Thread Pavel Stehule
Hi

I had a talk with Julien about the correct handling of an exception raised
by pfree function.

Currently, this exception (elog(ERROR, "could not find block containing
chunk %p", chunk);) is never specially handled. Because the pointer sanity
check is executed first (before any memory modification), it is safe to
call pfree repeatedly (but if I read the code correctly, this behavior is
not asserted or tested).

The question is: what is the correct action on this error? In the end,
this exception means that memory corruption has been detected. One, and
probably safe, way is raising a FATAL error.  But that looks like too harsh
a solution and not too friendly. Moreover, this way is not used in the
current code base.

The traditional solution is just raising the exception and doing nothing
more. I didn't find any code where the exception from pfree is explicitly
handled. Similar issues with a possible exception from pfree can occur in
the plan cache, the plpgsql code cache, and partially in the implementation
of updates of plpgsql variables. Everywhere the implementation is not too
strict - the exception is just raised, but the session continues (although
at this moment we know that some memory is corrupted).

Is it a common strategy in Postgres?

Regards

Pavel


Re: Summary function for pg_buffercache

2022-10-12 Thread Andres Freund
Hi,

On 2022-09-28 18:19:57 +0300, Melih Mutlu wrote:
> diff --git a/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql 
> b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
> new file mode 100644
> index 00..77e250b430
> --- /dev/null
> +++ b/contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql
> @@ -0,0 +1,13 @@
> +/* contrib/pg_buffercache/pg_buffercache--1.3--1.4.sql */
> +
> +-- complain if script is sourced in psql, rather than via ALTER EXTENSION
> +\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.4'" to load this 
> file. \quit
> +
> +CREATE FUNCTION pg_buffercache_summary()
> +RETURNS TABLE (buffers_used int4, buffers_unused int4, buffers_dirty int4,
> + buffers_pinned int4, usagecount_avg real)
> +AS 'MODULE_PATHNAME', 'pg_buffercache_summary'
> +LANGUAGE C PARALLEL SAFE;

I think using RETURNS TABLE isn't quite right here, as it implies 'SETOF'. But
the function doesn't return a set of rows. I changed this to use OUT
parameters.


> +-- Don't want these to be available to public.
> +REVOKE ALL ON FUNCTION pg_buffercache_summary() FROM PUBLIC;

I think this needs to grant to pg_monitor too. See
pg_buffercache--1.2--1.3.sql

I added a test verifying the permissions are right, with the hope that it'll
make future contributors try to add a parallel test and notice the permissions
aren't right.


> + /* Construct a tuple descriptor for the result rows. */
> + tupledesc = CreateTemplateTupleDesc(NUM_BUFFERCACHE_SUMMARY_ELEM);

Given that we define the return type on the SQL level, it imo is nicer to use
get_call_result_type() here.


> + TupleDescInitEntry(tupledesc, (AttrNumber) 5, "usagecount_avg",
> +FLOAT4OID, -1, 0);

I changed this to FLOAT8. Not that the precision will commonly be useful, but
it doesn't seem worth having to even think about whether there are cases where
it'd matter.

I also changed it so that the accumulation happens in an int64 variable named
usagecount_total, which gets converted to a double only when actually
computing the result.
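
To illustrate the two points above, here is a heavily simplified sketch of
that shape (hypothetical code, not the actual pg_buffercache_summary(); the
buffer-scanning part is elided):

    #include "postgres.h"
    #include "access/htup_details.h"
    #include "funcapi.h"

    PG_FUNCTION_INFO_V1(pg_buffercache_summary_sketch);

    Datum
    pg_buffercache_summary_sketch(PG_FUNCTION_ARGS)
    {
        TupleDesc   tupledesc;
        Datum       values[5];
        bool        nulls[5] = {false, false, false, false, false};
        int32       buffers_used = 0;
        int64       usagecount_total = 0;   /* accumulate in integer arithmetic */

        /* The SQL-level OUT parameters define the composite result type. */
        if (get_call_result_type(fcinfo, NULL, &tupledesc) != TYPEFUNC_COMPOSITE)
            elog(ERROR, "return type must be a row type");

        /*
         * ... scan the buffer headers here, updating buffers_used,
         * usagecount_total and the other counters (elided) ...
         */

        values[0] = Int32GetDatum(buffers_used);
        /* values[1..3]: unused, dirty, pinned counters, filled the same way */
        values[1] = values[2] = values[3] = Int32GetDatum(0);
        /* convert to double only when computing the final average */
        values[4] = Float8GetDatum(buffers_used > 0 ?
                                   (double) usagecount_total / buffers_used : 0.0);

        PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupledesc, values, nulls)));
    }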


>   
>The pg_buffercache module provides a means for
> -  examining what's happening in the shared buffer cache in real time.
> +  examining what's happening in the shared buffer in real time.
>   

This seems to be an unnecessary / unrelated change. I suspect you made it in
response to
https://postgr.es/m/20220922161014.copbzwdl3ja4nt6z%40awork3.anarazel.de
but that was about a different sentence, where you said 'shared buffer caches'
(even though there is only a single shared buffer cache).


>   
> @@ -17,10 +17,19 @@
>   
>  
>   
> -  The module provides a C function pg_buffercache_pages
> -  that returns a set of records, plus a view
> -  pg_buffercache that wraps the function for
> -  convenient use.
> +  The module provides C functions pg_buffercache_pages
> +  and pg_buffercache_summary.
> + 
> +
> + 
> +  The pg_buffercache_pages function
> +  returns a set of records, plus a view 
> pg_buffercache that wraps the function for
> +  convenient use is provided.
> + 

I rephrased this, because it sounds like the function returns a set of records
and a view.


> + 
> +  The pg_buffercache_summary function returns a table 
> with a single row
> +  that contains summarized and aggregated information about shared buffer.
>   

"summarized and aggregated" is quite redundant.


> +  
> +   pg_buffercachesummary Columns

Missing underscore.


> + 
> +  
> +   buffers_unused int4
> +  
> +  
> +Number of shared buffers that not currently being used
> +  
> + 

There's a missing 'are' in here, I think. I rephrased all of these to
"Number of (used|unused|dirty|pinned) shared buffers"


> + 
> +  
> +   buffers_dirty int4
> +  
> +  
> +   Number of dirty shared buffers
> +  
> + 
> +
> + 
> +  
> +   buffers_pinned int4
> +  
> +  
> +   Number of shared buffers that has a pinned backend
> +  
> + 

Backends pin buffers, not the other way round...


> +  
> +   There is a single row to show summarized information of all shared 
> buffers.
> +   pg_buffercache_summary is not interested
> +   in the state of each shared buffer, only shows aggregated information.
> +  
> +
> +  
> +   The pg_buffercache_summary doesn't provide a result
> +   that is consistent across all buffers. This is intentional. The purpose
> +   of this function is to provide a general idea about the state of shared
> +   buffers as fast as possible. Additionally, 
> pg_buffercache_summary
> +   allocates much less memory.
> +  

I still didn't like this comment. Please see the attached.


I intentionally put my changes into a fixup commit, in case you want to look
at the differences.

Greetings,

Andres Freund
>From 9caf0911bca9be5b630f8086befc861331e91c5a Mon Sep 17 00:00:00 2001
From: Melih Mutlu 
Date: Sat, 10 Sep 2022 02:08:24 +0300
Subject: [PATCH v15 1/2] pg_buffercache: Add 

Re: Simplifying our Trap/Assert infrastructure

2022-10-12 Thread Peter Eisentraut

On 12.10.22 20:36, Nathan Bossart wrote:

On Sun, Oct 09, 2022 at 02:01:48PM -0700, Nathan Bossart wrote:

The patch LGTM.  It might be worth removing usages of AssertArg and
AssertState, too, but that can always be done separately.


If you are so inclined...


I'm in favor of this.  These variants are a distraction.




Re: GUC values - recommended way to declare the C variables?

2022-10-12 Thread Nathan Bossart
On Wed, Sep 28, 2022 at 10:13:22AM +1000, Peter Smith wrote:
> PSA a small patch to tidy a few of the GUC C variables - adding
> comments and removing unnecessary declaration assignments.
> 
> make check-world passed OK.

Looks reasonable to me.  I've marked this as ready-for-committer.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher

2022-10-12 Thread Önder Kalacı
Hi,




> ~~~
> 01. 032_subscribe_use_index.pl - SUBSCRIPTION CAN UPDATE THE INDEX IT
> USES AFTER ANALYZE
>
> ```
> # show that index_b is not used
> $node_subscriber->poll_query_until(
> 'postgres', q{select idx_scan=0 from pg_stat_all_indexes where
> indexrelname = 'index_b';}
> ) or die "Timed out while waiting for check subscriber tap_sub_rep_full
> updates two rows via index scan with index on high cardinality column-2";
> ```
>
> poll_query_until() is still remained here, it should be replaced to is().
>
>
>
Updated

02. 032_subscribe_use_index.pl - SUBSCRIPTION BEHAVIOR WITH ENABLE_INDEXSCAN
>
> ```
> # show that the unique index on replica identity is used even when
> enable_indexscan=false
> $result = $node_subscriber->safe_psql('postgres',
> "select idx_scan from pg_stat_all_indexes where indexrelname =
> 'test_replica_id_full_idx'");
> is($result, qq(0), 'ensure subscriber has not used index with
> enable_indexscan=false');
> ```
>
> Is the comment wrong? The index test_replica_id_full_idx is not used here.
>
>
Yeah, the comment is wrong. It is a copy & paste error from the other test.
Fixed now


>
>
> 03. 032_subscribe_use_index.pl - SUBSCRIPTION BEHAVIOR WITH
> ENABLE_INDEXSCAN
>
> ```
> $node_publisher->safe_psql('postgres',
> "ALTER TABLE test_replica_id_full REPLICA IDENTITY USING INDEX
> test_replica_id_full_unique;");
> ```
>
> I was not sure why ALTER TABLE REPLICA IDENTITY USING INDEX was done on
> the publisher side.
> IIUC this feature works when REPLICA IDENTITY FULL is specified on a
> publisher,
> so it might not be altered here. If so, an index does not have to define
> on the publisher too.
>
>
Yes, it is not strictly necessary, but it is often the case that both the
subscriber and the publisher have similar schemas when a unique index/pkey
is used. For example, see t/028_row_filter.pl where we follow this pattern.

Still, I manually tried it without the index on the publisher (e.g.,
replica identity full), and it works as expected. But given that the
majority of the tests already follow this approach and this test focuses on
enable_indexscan, I think I'll keep it as is - unless it is confusing?


> 04. 032_subscribe_use_index.pl - SUBSCRIPTION BEHAVIOR WITH
> ENABLE_INDEXSCAN
>
> ```
> $node_subscriber->poll_query_until(
> 'postgres', q{select (idx_scan=1) from pg_stat_all_indexes where
> indexrelname = 'test_replica_id_full_unique'}
> ) or die "Timed out while waiting ensuring subscriber used unique index as
> replica identity even with enable_indexscan=false'";
> ```
>
> 03 comment should be added here.
>
> Yes, done that as well.


Attached v17 now. Thanks for the review!


v17_0001_use_index_on_subs_when_pub_rep_ident_full.patch
Description: Binary data


Re: Simplifying our Trap/Assert infrastructure

2022-10-12 Thread Nathan Bossart
On Sun, Oct 09, 2022 at 02:01:48PM -0700, Nathan Bossart wrote:
> The patch LGTM.  It might be worth removing usages of AssertArg and
> AssertState, too, but that can always be done separately.

If you are so inclined...

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From 90e2166a64b9b6b9362a21b046e52a6304f85fc9 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Wed, 12 Oct 2022 11:29:08 -0700
Subject: [PATCH v1 1/1] remove AssertArg and AssertState

---
 contrib/fuzzystrmatch/fuzzystrmatch.c |  4 +-
 src/backend/access/common/tupdesc.c   | 32 ++--
 src/backend/access/heap/heapam.c  |  2 +-
 src/backend/access/nbtree/nbtsort.c   |  2 +-
 src/backend/access/transam/multixact.c|  8 +--
 src/backend/access/transam/xact.c |  6 +--
 src/backend/bootstrap/bootstrap.c |  2 +-
 src/backend/catalog/namespace.c   |  2 +-
 src/backend/catalog/pg_collation.c|  8 +--
 src/backend/commands/dbcommands.c |  2 +-
 src/backend/commands/indexcmds.c  |  2 +-
 src/backend/commands/portalcmds.c |  4 +-
 src/backend/commands/tablecmds.c  |  6 +--
 src/backend/jit/jit.c |  2 +-
 src/backend/parser/parse_utilcmd.c|  2 +-
 src/backend/postmaster/autovacuum.c   |  4 +-
 .../replication/logical/reorderbuffer.c   |  2 +-
 src/backend/replication/slot.c|  2 +-
 src/backend/storage/lmgr/lwlock.c |  6 +--
 src/backend/tcop/postgres.c   |  4 +-
 src/backend/tcop/pquery.c |  8 +--
 src/backend/utils/activity/pgstat.c   |  8 +--
 src/backend/utils/activity/pgstat_replslot.c  |  4 +-
 src/backend/utils/activity/pgstat_shmem.c |  2 +-
 src/backend/utils/activity/pgstat_slru.c  |  2 +-
 src/backend/utils/adt/mcxtfuncs.c |  2 +-
 src/backend/utils/adt/xml.c   |  2 +-
 src/backend/utils/cache/relcache.c|  2 +-
 src/backend/utils/fmgr/dfmgr.c| 12 ++---
 src/backend/utils/init/miscinit.c | 30 +--
 src/backend/utils/misc/guc-file.l |  2 +-
 src/backend/utils/misc/guc.c  |  6 +--
 src/backend/utils/mmgr/aset.c | 18 +++
 src/backend/utils/mmgr/generation.c   | 16 +++---
 src/backend/utils/mmgr/mcxt.c | 50 +--
 src/backend/utils/mmgr/portalmem.c| 12 ++---
 src/backend/utils/mmgr/slab.c | 16 +++---
 src/backend/utils/sort/tuplesortvariants.c| 12 ++---
 src/common/controldata_utils.c|  2 +-
 src/common/protocol_openssl.c |  2 +-
 src/include/c.h   | 12 -
 src/include/executor/tuptable.h   |  8 +--
 src/include/miscadmin.h   |  2 +-
 src/include/utils/pgstat_internal.h   |  4 +-
 44 files changed, 162 insertions(+), 174 deletions(-)

diff --git a/contrib/fuzzystrmatch/fuzzystrmatch.c b/contrib/fuzzystrmatch/fuzzystrmatch.c
index a04251ace6..5659cc71ac 100644
--- a/contrib/fuzzystrmatch/fuzzystrmatch.c
+++ b/contrib/fuzzystrmatch/fuzzystrmatch.c
@@ -724,8 +724,8 @@ _soundex(const char *instr, char *outstr)
 {
 	int			count;
 
-	AssertArg(instr);
-	AssertArg(outstr);
+	Assert(instr);
+	Assert(outstr);
 
 	outstr[SOUNDEX_LEN] = '\0';
 
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index d6fb261e20..b7f918c877 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -49,7 +49,7 @@ CreateTemplateTupleDesc(int natts)
 	/*
 	 * sanity checks
 	 */
-	AssertArg(natts >= 0);
+	Assert(natts >= 0);
 
 	/*
 	 * Allocate enough memory for the tuple descriptor, including the
@@ -273,12 +273,12 @@ TupleDescCopyEntry(TupleDesc dst, AttrNumber dstAttno,
 	/*
 	 * sanity checks
 	 */
-	AssertArg(PointerIsValid(src));
-	AssertArg(PointerIsValid(dst));
-	AssertArg(srcAttno >= 1);
-	AssertArg(srcAttno <= src->natts);
-	AssertArg(dstAttno >= 1);
-	AssertArg(dstAttno <= dst->natts);
+	Assert(PointerIsValid(src));
+	Assert(PointerIsValid(dst));
+	Assert(srcAttno >= 1);
+	Assert(srcAttno <= src->natts);
+	Assert(dstAttno >= 1);
+	Assert(dstAttno <= dst->natts);
 
 	memcpy(dstAtt, srcAtt, ATTRIBUTE_FIXED_PART_SIZE);
 
@@ -594,9 +594,9 @@ TupleDescInitEntry(TupleDesc desc,
 	/*
 	 * sanity checks
 	 */
-	AssertArg(PointerIsValid(desc));
-	AssertArg(attributeNumber >= 1);
-	AssertArg(attributeNumber <= desc->natts);
+	Assert(PointerIsValid(desc));
+	Assert(attributeNumber >= 1);
+	Assert(attributeNumber <= desc->natts);
 
 	/*
 	 * initialize the attribute fields
@@ -664,9 +664,9 @@ TupleDescInitBuiltinEntry(TupleDesc desc,
 	Form_pg_attribute att;
 
 	/* sanity checks */
-	AssertArg(PointerIsValid(desc));
-	AssertArg(attributeNumber >= 1);
-	AssertArg(attributeNumber <= desc->natts);
+	Assert(PointerIsValid(desc));
+	

Re: Git tag for v15

2022-10-12 Thread Matthias van de Meent
On Wed, 12 Oct 2022 at 20:08, Nemo  wrote:
> Since v15 doesn't seem to be announced yet - this is confusing me. It
> doesn't impact us anywhere (yet), since we haven't added the v15 release
> cycle to our page yet, but I'd like to check if this is an incorrect tag
> or? If not, what's the correct source for us to use for automation purposes?

Tags are usually published a few days in advance of the official
release, so that all packagers have the time to bundle and prepare the
release in their repositories. The official release is planned for
tomorrow the 13th, as mentioned in [0].

The same pattern can be seen for minor releases, where the release is
generally stamped well in advance of the release announcement going
live.

Kind regards,

Matthias van de Meent

[0] https://postgr.es/m/2a88ff2e-ffcc-bb39-379c-37244b4114a5%40postgresql.org




Git tag for v15

2022-10-12 Thread Nemo

Hi,

Writing this on behalf of endoflife.date, where we track postgres 
releases for the endoflife.date/postgres page.


We have our automation linked to git tags published on the postgres repo 
mirror on GitHub[0].


It recently picked up the REL_15_0 tag[2], and compiled it here: 
https://github.com/endoflife-date/release-data/blob/main/releases/postgresql.json#L452


Since v15 doesn't seem to be announced yet - this is confusing me. It 
doesn't impact us anywhere (yet), since we haven't added the v15 release 
cycle to our page yet, but I'd like to check if this is an incorrect tag 
or? If not, what's the correct source for us to use for automation purposes?


I went through the release process on the Wiki[1], and it mentions final 
version tagging as the final irreversible step, so that added to my 
confusion.


Thanks,
Nemo

(Please keep me in cc for any replies)

[0]: https://github.com/postgres/postgres

[1]: https://wiki.postgresql.org/wiki/Release_process#Final_version_tagging

[2]: https://github.com/postgres/postgres/releases/tag/REL_15_0




Re: possibility of partial data dumps with pg_dump

2022-10-12 Thread Никита Старовойтов
Good afternoon,

Indeed, the functionality that I started to implement in the patch is very
similar to what is included in the program you proposed. Many of the use
cases are the same. Thanks for giving me a hint about it.

I have been working on implementing referential integrity, but have not
been able to find a simple solution for a complex structure, and I am not
sure it can be done in the dump process. Although it is obvious that
without this functionality the usefulness of the feature is insignificant.

When I worked with another database management system, a partial dump
feature was available from the dump program, and it was useful for me. But
I understand why it might not be worth extending pg_dump with a
non-essential feature. However, I will try again to solve the problem of
guaranteed recovery of the database.

Thanks for the comments, they were really helpful to me.

вт, 4 окт. 2022 г. в 19:24, Julien Rouhaud :

> Hi,
>
> On Tue, Oct 04, 2022 at 02:15:16PM +0200, Pavel Stehule wrote:
> >
> > út 4. 10. 2022 v 12:48 odesílatel Никита Старовойтов <
> nikstar...@gmail.com>
> > napsal:
> >
> > > Hello,
> > > with a view to meeting with postgres code and to get some practice with
> > > it, I am making a small patch that adds the possibility of partial
> tables
> > > dump.
> > > A rule of filtering is specified with standard SQL where clause
> (without
> > > "where" keyword)
> >
> > What is benefit and use case? For this case I don't see any benefit
> against
> > simple
> >
> > \copy (select * from xx where ...) to file CSV
> >
> > or how hard is it to write trivial application that does export of what
> you
> > want in the format that you want?
>
> Also, such approach probably requires a lot of effort to get a valid backup
> (with regards to foreign keys and such).
>
> There's already a project dedicated to generate such partial (and
> consistent)
> backups: https://github.com/mla/pg_sample.  Maybe that would address your
> needs?
>


Re: Support for Rust

2022-10-12 Thread Jeff Davis
On Mon, 2022-09-12 at 11:29 -0400, Tom Lane wrote:
> > Rust gives many things we wanted for decades:
> 
> > 1. No undefined behavior
> > 2. No memory leaks, guaranteed at compile time
> 
> Really?  It seems impossible to me that a language that even thinks
> it can guarantee that could interoperate with the backend's memory
> management.  And that's not something we are interested in replacing.

It's a distraction to talk about rust's safety "guarantees" in the
context of this thread. #1 is partially true, and #2 is outright
false[1].

C interoperability is the most compelling rust feature, in my opinion.
C memory representations are explicitly supported, and high-level
language features don't impose on your struct layouts. For instance,
rust does dynamic dispatch using trait objects[2], which hold the
vtable along with the reference, rather than in the struct itself. And
a "Foo *" from C has the same memory representation as an Option<>
in rust, so that you get the type safety.

Of course, rewriting Postgres would be a terrible idea regardless of the
merits of rust for all kinds of reasons. But writing *extensions* in
rust is very promising because of this C interoperability.

> 
> Yeah, that's what I thought.  "Allow some parts to be written in
> language X" soon turns into "Rewrite the entire system in language X,
> including fundamental rethinking of memory management, error
> handling,
> and some other things".  That's pretty much a non-starter.

You may be surprised how much you can do with rust extensions without
changing any of those things[3].

Regards,
Jeff Davis

[1] https://doc.rust-lang.org/std/mem/fn.forget.html
[2] https://doc.rust-lang.org/book/ch17-02-trait-objects.html
[3] https://www.pgcon.org/2019/schedule/attachments/532_RustTalk.pdf





Re: make_ctags: use -I option to ignore pg_node_attr macro

2022-10-12 Thread Peter Eisentraut

On 10.10.22 12:04, Alvaro Herrera wrote:

In my own tags script I just call "ctags -R", and I feed cscope with
these find lines

(find $SRCDIR \( -name tmp_install -prune -o -name tmp_check -prune \) -o \( -name "*.[chly]" -o -iname "*makefile*" -o -name 
"*.mk" -o -name "*.in" -o -name "*.sh" -o -name "*.sgml" -o -name "*.sql" -o -name "*.p[lm]" \) 
-type f -print ; \
find $BUILDDIR \( -name tmp_install -prune \) -o \( -name \*.h -a -type f \) 
-print )

which seems to give decent results.  (Nowadays I wonder if it'd be
better to exclude the "*_d.h" files from the builddir.)
(I wonder why don't I have a prune for .git ...)


Or use git ls-files.




Re: [PoC] Let libpq reject unexpected authentication requests

2022-10-12 Thread Jacob Champion
On 10/5/22 06:33, Peter Eisentraut wrote:
> I think it would be good to put some provisions in place here, even if 
> they are elementary.  Otherwise, there will be a significant burden on 
> the person who implements the next SASL method (i.e., you ;-) ) to 
> figure that out then.

Sounds good, I'll work on that. v10 does not yet make changes in this area.

> I think you could just stick a string list of allowed SASL methods into 
> PGconn.
> 
> By the way, I'm not sure all the bit fiddling is really worth it.  An 
> array of integers (or unsigned char or whatever) would work just as 
> well.  Especially if you are going to have a string list for SASL 
> anyway.  You're not really saving any bits or bytes either way in the 
> normal case.

Yeah, with the SASL case added in, the bitmasks might not be long for
this world. It is nice to be able to invert the whole thing, but a
separate boolean saying "invert the list" could accomplish the same goal
and I think we'll need to have that for the SASL mechanism list anyway.
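For illustration, the sort of check that falls out of a plain list plus an
"invert" flag; all names here are hypothetical, not the actual patch:

#include "postgres_fe.h"

typedef struct
{
    int         allowed_auth_methods[16];  /* AUTH_REQ_* codes from require_auth */
    int         n_allowed;
    bool        invert;                    /* true if the list was negated ("!...") */
} AuthRequirement;

static bool
auth_method_allowed(const AuthRequirement *req, int areq)
{
    bool        found = false;

    for (int i = 0; i < req->n_allowed; i++)
    {
        if (req->allowed_auth_methods[i] == areq)
        {
            found = true;
            break;
        }
    }

    /* With a negated list, a match means the method is forbidden. */
    return req->invert ? !found : found;
}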

> Minor comments:
> 
> Pasting together error messages like with auth_description() isn't going 
> to work.  You either need to expand the whole message in 
> check_expected_areq(), or perhaps rephrase the message like
> 
> libpq_gettext("auth method \"%s\" required, but server requested \"%s\"\n"),
>   conn->require_auth,
>   auth_description(areq)
> 
> and make auth_description() just return a single word not subject to 
> translation.

Right. Michael tried to warn me about that upthread, but I only ended up
fixing one of the two error cases for some reason. I've merged the two
into one code path for v10.

Quick error messaging bikeshed: do you prefer

auth method "!password,!md5" requirement failed: ...

or

auth method requirement "!password,!md5" failed: ...

?

> spurious whitespace change in fe-secure-openssl.c

Fixed.

> whitespace error in patch:
> 
> .git/rebase-apply/patch:109: tab in indent.
>  via TLS, nor GSS authentication via its 
> encrypted transport.)

Fixed.

> In the 0002 patch, the configure test needs to be added to meson.build.

Added.

Thanks,
--Jacobdiff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 45c5228cfe..c06b0718cf 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1306,7 +1306,7 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname

 The server must not prompt the client for an authentication
 exchange. (This does not prohibit client certificate authentication
-   via TLS, nor GSS authentication via its encrypted 
transport.)
+via TLS, nor GSS authentication via its encrypted transport.)

   
  
diff --git a/meson.build b/meson.build
index 925db70c9d..bad035c8e3 100644
--- a/meson.build
+++ b/meson.build
@@ -1173,8 +1173,9 @@ if get_option('ssl') == 'openssl'
 ['CRYPTO_new_ex_data', {'required': true}],
 ['SSL_new', {'required': true}],
 
-# Function introduced in OpenSSL 1.0.2.
+# Functions introduced in OpenSSL 1.0.2. LibreSSL doesn't have all of 
these.
 ['X509_get_signature_nid'],
+['SSL_CTX_set_cert_cb'],
 
 # Functions introduced in OpenSSL 1.1.0. We used to check for
 # OPENSSL_VERSION_NUMBER, but that didn't work with 1.1.0, because LibreSSL
diff --git a/src/interfaces/libpq/fe-auth.c b/src/interfaces/libpq/fe-auth.c
index 793888d30f..295b978525 100644
--- a/src/interfaces/libpq/fe-auth.c
+++ b/src/interfaces/libpq/fe-auth.c
@@ -837,7 +837,7 @@ pg_password_sendauth(PGconn *conn, const char *password, 
AuthRequest areq)
 }
 
 /*
- * Translate an AuthRequest into a human-readable description.
+ * Translate a disallowed AuthRequest code into an error message.
  */
 static const char *
 auth_description(AuthRequest areq)
@@ -845,23 +845,23 @@ auth_description(AuthRequest areq)
switch (areq)
{
case AUTH_REQ_PASSWORD:
-   return libpq_gettext("a cleartext password");
+   return libpq_gettext("server requested a cleartext 
password");
case AUTH_REQ_MD5:
-   return libpq_gettext("a hashed password");
+   return libpq_gettext("server requested a hashed 
password");
case AUTH_REQ_GSS:
case AUTH_REQ_GSS_CONT:
-   return libpq_gettext("GSSAPI authentication");
+   return libpq_gettext("server requested GSSAPI 
authentication");
case AUTH_REQ_SSPI:
-   return libpq_gettext("SSPI authentication");
+   return libpq_gettext("server requested SSPI 
authentication");
case AUTH_REQ_SCM_CREDS:
-   return libpq_gettext("UNIX socket credentials");
+   return libpq_gettext("server requested UNIX socket 
credentials");

Re: START_REPLICATION SLOT causing a crash in an assert build

2022-10-12 Thread Andres Freund
Hi,

On 2022-10-11 17:10:52 +0900, Masahiko Sawada wrote:
> +# Reset the replication slot statistics.
> +$node->safe_psql('postgres',
> + "SELECT pg_stat_reset_replication_slot('regression_slot');");
> +my $result = $node->safe_psql('postgres',
> + "SELECT * FROM pg_stat_replication_slots WHERE slot_name = 
> 'regrssion_slot'"
> +);

Typo in the slot name "regrssion_slot" instead of "regression_slot". We can't
use * here, because that'll include the reset timestamp.


> +# Teardown the node so the statistics is removed.
> +$pg_recvlogical->kill_kill;
> +$node->teardown_node;
> +$node->start;

ISTM that removing the file instead of shutting down the cluster with force
would make it a more targeted test.


> +# Check if the replication slot statistics have been removed.
> +$result = $node->safe_psql('postgres',
> + "SELECT * FROM pg_stat_replication_slots WHERE slot_name = 
> 'regrssion_slot'"
> +);
> +is($result, "", "replication slot statistics are removed");

Same typo as above. We can't assert a specific result here either, because
recvlogical will have processed a bunch of changes. Perhaps we could check at
least that the reset time is NULL? 


> +# Test if the replication slot staistics continue to be accumulated even 
> after

s/staistics/statistics/

Greetings,

Andres Freund




Re: make_ctags: use -I option to ignore pg_node_attr macro

2022-10-12 Thread Bruce Momjian
On Wed, Oct 12, 2022 at 10:26:03PM +0900, Tatsuo Ishii wrote:
> > However ... hmm ... 
> > 
> >>  find . \( -name 'CVS' -prune \) -o \( -name .git -prune \) -o -type d 
> >> -print |
> >>  while read DIR
> >> -do[ "$DIR" != "." ] && ln -f -s `echo "$DIR" | sed 
> >> 's;/[^/]*;/..;g'`/tags "$DIR"/tags
> >> +do[ "$DIR" != "." ] && ln -f -s `echo "$DIR" | sed 
> >> 's;/[^/]*;/..;g'`/$tags_file "$DIR"/$tags_file
> >>  done
> > 
> > ... does this create a tags symlink on each directory?  This seems
> > strange to me,
> 
> I don't know the original author's intention for this but I think it
> makes use of the tag file in emacs a little bit easier.  Emacs
> confirms for the first time the default location of tags file under
> the same directory where the source file resides. I can just hit
> return key if there's a symlink of tags. If we do not create the
> symlink, we have to specify the directory where the tags file was
> originally created, which is a little bit annoying.

Yes, that is exactly the intent of why it uses symlinks in every
directory.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson





Re: Warning about using pg_stat_reset() and pg_stat_reset_shared()

2022-10-12 Thread Bruce Momjian
On Wed, Oct 12, 2022 at 08:50:19AM +1300, David Rowley wrote:
> On Wed, 12 Oct 2022 at 04:11, Bruce Momjian  wrote:
> > As far as I can tell, analyze updates pg_statistics values, but not
> > pg_stat_all_tables.n_dead_tup and n_live_tup, which are used by
> > autovacuum to trigger vacuum operations.  I am afraid we have to
> > recommand VACUUM ANALYZE after pg_stat_reset(), no?
> 
> As far as I can see ANALYZE will update these fields.  I'm looking at
> pgstat_report_analyze() called from do_analyze_rel().
> 
> It does:
> 
> tabentry->n_live_tuples = livetuples;
> tabentry->n_dead_tuples = deadtuples;
> 
> I also see it working from testing:
> 
> create table t as select x from generate_Series(1,10)x;
> delete from t where x > 9;
> select pg_sleep(1);
> select n_live_tup,n_dead_tup from pg_stat_user_tables where relname = 't';
> select pg_stat_reset();
> select n_live_tup,n_dead_tup from pg_stat_user_tables where relname = 't';
> analyze t;
> select n_live_tup,n_dead_tup from pg_stat_user_tables where relname = 't';
> 
> The result of the final query is:
> 
>  n_live_tup | n_dead_tup
> +
>   9 |  1
> 
> Maybe the random sample taken by ANALYZE for your case didn't happen
> to land on any pages with dead tuples?

Ah, good point, I missed that in pgstat_report_analyze().  I will apply
the patch then in a few days, thanks.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson





Re: create subscription - improved warning message

2022-10-12 Thread Tom Lane
Alvaro Herrera  writes:
> On 2022-Oct-12, Amit Kapila wrote:
>> Okay, then I think we can commit the last error message patch of Peter
>> [1] as we have an agreement on the same, and then work on this as a
>> separate patch. What do you think?

> No objection.

Yeah, the v3-0001 patch is fine.  I agree that the docs change needs
more work.

regards, tom lane




Re: make_ctags: use -I option to ignore pg_node_attr macro

2022-10-12 Thread Alvaro Herrera
On 2022-Oct-12, Tatsuo Ishii wrote:

> >>  find . \( -name 'CVS' -prune \) -o \( -name .git -prune \) -o -type d 
> >> -print |
> >>  while read DIR
> >> -do[ "$DIR" != "." ] && ln -f -s `echo "$DIR" | sed 
> >> 's;/[^/]*;/..;g'`/tags "$DIR"/tags
> >> +do[ "$DIR" != "." ] && ln -f -s `echo "$DIR" | sed 
> >> 's;/[^/]*;/..;g'`/$tags_file "$DIR"/$tags_file
> >>  done
> > 
> > ... does this create a tags symlink on each directory?  This seems
> > strange to me,
> 
> I don't know the original author's intention for this but I think it
> makes use of the tag file in emacs a little bit easier.  Emacs
> confirms for the first time the default location of tags file under
> the same directory where the source file resides. I can just hit
> return key if there's a symlink of tags. If we do not create the
> symlink, we have to specify the directory where the tags file was
> originally created, which is a little bit annoying.

OK, that sounds good then.  I would make a feature request to have a
switch that supresses creation of these links, then.

> Well, I often visit different tags file for different development
> projects (for example Pgpool-II) and I don't want to have fixed
> location of tags file in emacs setting. For this reason make_etags's
> approach is acceptable for me.

Makes sense.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
Thou shalt study thy libraries and strive not to reinvent them without
cause, that thy code may be short and readable and thy days pleasant
and productive. (7th Commandment for C Programmers)




Re: make_ctags: use -I option to ignore pg_node_attr macro

2022-10-12 Thread Alvaro Herrera
On 2022-Oct-12, Tom Lane wrote:

> Tatsuo Ishii  writes:
> >> If we're going to do this, then I suggest that make_etags should become
> >> a symlink to make_ctags, and behave as if -e is given when called under
> >> that name.
> 
> > What I had in my mind was making make_etags a script just exec
> > make_ctags (with -e option). But I don't have strong
> > preference. Symlink is ok for me too.
> 
> I don't think it's possible to store a symlink in git, so
> having one file exec the other sounds saner to me too.

I tried before my reply and it seems to work, but perhaps it requires
too new a git version.  It may also be a problem under Windows.  Having
one exec the other sounds perfect.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: make_ctags: use -I option to ignore pg_node_attr macro

2022-10-12 Thread Julien Rouhaud
On Wed, Oct 12, 2022 at 10:09:06AM -0400, Tom Lane wrote:
>
> I don't think it's possible to store a symlink in git, so
> having one file exec the other sounds saner to me too.

Git handles symlinks just fine, but those will obviously create an extra
burden for Windows users (git will just create a text file containing the
target path I think), so agreed (even if I doubt that any Windows user will
run those scripts anyway).




[PATCH] Improve tab completion for ALTER TABLE on identity columns

2022-10-12 Thread Dagfinn Ilmari Mannsåker
Hi Hackers,

I noticed that psql has no tab completion around identity columns in
ALTER TABLE, so here's some patches for that.

In passing, I also added completion for ALTER SEQUENCE … START, which was
missing for some reason.

- ilmari

>From 4dad38d17e82b2efafe3666a0a92aa248cfc5e60 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Wed, 12 Oct 2022 14:17:12 +0100
Subject: [PATCH 1/4] =?UTF-8?q?psql:=20Add=20tab=20completion=20for=20ALTE?=
 =?UTF-8?q?R=20SEQUENCE=20=E2=80=A6=20START=20=E2=80=A6?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 src/bin/psql/tab-complete.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 584d9d5ae6..24a5f1f62b 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2104,7 +2104,7 @@ psql_completion(const char *text, int start, int end)
 	/* ALTER SEQUENCE  */
 	else if (Matches("ALTER", "SEQUENCE", MatchAny))
 		COMPLETE_WITH("AS", "INCREMENT", "MINVALUE", "MAXVALUE", "RESTART",
-	  "NO", "CACHE", "CYCLE", "SET", "OWNED BY",
+	  "START", "NO", "CACHE", "CYCLE", "SET", "OWNED BY",
 	  "OWNER TO", "RENAME TO");
 	/* ALTER SEQUENCE  AS */
 	else if (TailMatches("ALTER", "SEQUENCE", MatchAny, "AS"))
-- 
2.34.1

>From 59b0ab0159ad941dbc39b22aabb07e82c814612d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Wed, 12 Oct 2022 14:03:52 +0100
Subject: [PATCH 2/4] =?UTF-8?q?psql:=20Add=20tab=20completion=20for=20ALTE?=
 =?UTF-8?q?R=20COLUMN=20=E2=80=A6=20SET=20GENERATED=20=E2=80=A6?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 src/bin/psql/tab-complete.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 24a5f1f62b..1d4f1fd794 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2381,7 +2381,7 @@ psql_completion(const char *text, int start, int end)
 	/* ALTER TABLE ALTER [COLUMN]  SET */
 	else if (Matches("ALTER", "TABLE", MatchAny, "ALTER", "COLUMN", MatchAny, "SET") ||
 			 Matches("ALTER", "TABLE", MatchAny, "ALTER", MatchAny, "SET"))
-		COMPLETE_WITH("(", "COMPRESSION", "DEFAULT", "NOT NULL", "STATISTICS", "STORAGE");
+		COMPLETE_WITH("(", "COMPRESSION", "DEFAULT", "GENERATED", "NOT NULL", "STATISTICS", "STORAGE");
 	/* ALTER TABLE ALTER [COLUMN]  SET ( */
 	else if (Matches("ALTER", "TABLE", MatchAny, "ALTER", "COLUMN", MatchAny, "SET", "(") ||
 			 Matches("ALTER", "TABLE", MatchAny, "ALTER", MatchAny, "SET", "("))
@@ -2390,6 +2390,10 @@ psql_completion(const char *text, int start, int end)
 	else if (Matches("ALTER", "TABLE", MatchAny, "ALTER", "COLUMN", MatchAny, "SET", "COMPRESSION") ||
 			 Matches("ALTER", "TABLE", MatchAny, "ALTER", MatchAny, "SET", "COMPRESSION"))
 		COMPLETE_WITH("DEFAULT", "PGLZ", "LZ4");
+	/* ALTER TABLE ALTER [COLUMN]  SET GENERATED */
+	else if (Matches("ALTER", "TABLE", MatchAny, "ALTER", "COLUMN", MatchAny, "SET", "GENERATED") ||
+			 Matches("ALTER", "TABLE", MatchAny, "ALTER", MatchAny, "SET", "GENERATED"))
+		COMPLETE_WITH("ALWAYS", "BY DEFAULT");
 	/* ALTER TABLE ALTER [COLUMN]  SET STORAGE */
 	else if (Matches("ALTER", "TABLE", MatchAny, "ALTER", "COLUMN", MatchAny, "SET", "STORAGE") ||
 			 Matches("ALTER", "TABLE", MatchAny, "ALTER", MatchAny, "SET", "STORAGE"))
-- 
2.34.1

>From 45c3d606f7d19fcee8223618ace7907b197401e2 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Wed, 12 Oct 2022 14:21:24 +0100
Subject: [PATCH 3/4] =?UTF-8?q?psql:=20Add=20tab=20completion=20for=20ALTE?=
 =?UTF-8?q?R=20COLUMN=20=E2=80=A6=20SET=20?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 src/bin/psql/tab-complete.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 1d4f1fd794..009b6a7265 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2381,7 +2381,9 @@ psql_completion(const char *text, int start, int end)
 	/* ALTER TABLE ALTER [COLUMN]  SET */
 	else if (Matches("ALTER", "TABLE", MatchAny, "ALTER", "COLUMN", MatchAny, "SET") ||
 			 Matches("ALTER", "TABLE", MatchAny, "ALTER", MatchAny, "SET"))
-		COMPLETE_WITH("(", "COMPRESSION", "DEFAULT", "GENERATED", "NOT NULL", "STATISTICS", "STORAGE");
+		COMPLETE_WITH("(", "COMPRESSION", "DEFAULT", "GENERATED", "NOT NULL", "STATISTICS", "STORAGE",
+	  /* a subset of ALTER SEQUENCE options */
+	  "INCREMENT", "MINVALUE", "MAXVALUE", "START", "NO", "CACHE", "CYCLE");
 	/* ALTER TABLE ALTER [COLUMN]  SET ( */
 	else if (Matches("ALTER", "TABLE", MatchAny, "ALTER", "COLUMN", MatchAny, "SET", "(") ||
 			 Matches("ALTER", "TABLE", MatchAny, "ALTER", MatchAny, "SET", "("))
@@ -2394,6 +2396,10 @@ psql_completion(const char 

Re: make_ctags: use -I option to ignore pg_node_attr macro

2022-10-12 Thread Tom Lane
Tatsuo Ishii  writes:
>> If we're going to do this, then I suggest that make_etags should become
>> a symlink to make_ctags, and behave as if -e is given when called under
>> that name.

> What I had in my mind was making make_etags a script just exec
> make_ctags (with -e option). But I don't have strong
> preference. Symlink is ok for me too.

I don't think it's possible to store a symlink in git, so
having one file exec the other sounds saner to me too.

regards, tom lane




Re: ExecRTCheckPerms() and many prunable partitions

2022-10-12 Thread Peter Eisentraut

On 06.10.22 15:29, Amit Langote wrote:

I tried in the attached 0004.  ModifyTable gets a new member
extraUpdatedColsBitmaps, which is List of Bitmapset "nodes".

Actually, List of Bitmapsets turned out to be something that doesn't
just-work with our Node infrastructure, which I found out thanks to
-DWRITE_READ_PARSE_PLAN_TREES.  So, I had to go ahead and add
first-class support for copy/equal/write/read support for Bitmapsets,
such that writeNode() can write appropriately labeled versions of them
and nodeRead() can read them as Bitmapsets.  That's done in 0003.  I
didn't actually go ahead and make*all*  Bitmapsets in the plan trees
to be Nodes, but maybe 0003 can be expanded to do that.  We won't need
to make gen_node_support.pl emit *_BITMAPSET_FIELD() blurbs then; can
just use *_NODE_FIELD().


Seeing that on 64-bit platforms we have a 4-byte padding gap in the 
Bitmapset struct, sticking a node tag in there seems pretty sensible. 
So turning Bitmapset into a proper Node and then making the other 
adjustments you describe makes sense to me.
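For reference, roughly what that would look like (a simplified sketch based
on nodes/bitmapset.h; the eventual patch may differ):

/* Today (simplified): */
typedef struct Bitmapset
{
    int         nwords;         /* number of words in array */
    bitmapword  words[FLEXIBLE_ARRAY_MEMBER];   /* really [nwords] */
} Bitmapset;

/* With a node tag occupying the 4-byte padding gap that already exists on
 * 64-bit builds, so the struct does not grow there: */
typedef struct Bitmapset
{
    NodeTag     type;
    int         nwords;
    bitmapword  words[FLEXIBLE_ARRAY_MEMBER];
} Bitmapset;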


Making a new thread about this might be best.

(I can't currently comment on the rest of the patch set.  So I don't 
know if you'll really end up needing lists of bitmapsets.  But from here 
it looks like turning bitmapsets into nodes might be a worthwhile change 
just by itself.)






Re: Issue in GIN fast-insert: XLogBeginInsert + Read/LockBuffer ordering

2022-10-12 Thread Zhang Mingli
HI,

On Oct 12, 2022, 10:09 +0800, Michael Paquier , wrote:
>
> Nice catches, both of you. Let's adjust everything spotted in one
> shot. Matthias' patch makes the ordering right, but the
> initialization path a bit harder to follow when using a separate list.
> The code is short so it does not strike me as a big deal, and it comes
> from the fact that we need to lock an existing buffer when merging two
> lists. For the branch where we insert into a tail page, the pages are
> already locked but it looks to be enough to move XLogBeginInsert()
> before the first XLogRegisterBuffer() call.
>
> Would any of you like to write a patch?
> --
> Michael
Patch added.

Regards,
Zhang Mingli


v2-0001-Fix-GIN-fast-path-XLogBeginInsert-and-Read-LockBu.patch
Description: Binary data


Re: [Proposal] Add foreign-server health checks infrastructure

2022-10-12 Thread Önder Kalacı
Hi,

Sounds reasonable. Do you mean that we can add additional GUC like
> "postgres_fdw.initial_check",
> wait WL_SOCKET_CLOSED if the conneciton is found in the hash table, and do
> reconnection if it might be closed, right?
>
>
Alright, it took me some time to realize that postgres_fdw already has a
retry mechanism if the first command fails: "postgres_fdw: reestablish new
connection if cached one is detected as…" (postgres/postgres@32a9c0b on
github.com).



Still, the reestablish mechanism could be further simplified with the
WL_SOCKET_CLOSED event, such as the following (where we should probably
rename pgfdw_connection_check_internal):

    /*
     * If the connection needs to be remade due to invalidation, disconnect as
     * soon as we're out of all transactions.
     */
  + bool remoteSocketIsClosed = entry->conn != NULL ?
  +     pgfdw_connection_check_internal(entry->conn) : false;

    if (entry->conn != NULL && (entry->invalidated || remoteSocketIsClosed) &&
        entry->xact_depth == 0)
    {
        elog(DEBUG3, "closing connection %p for option changes to take effect",
             entry->conn);
        disconnect_pg_server(entry);
    }
  + else if (remoteSocketIsClosed && entry->xact_depth > 0)
  +     error("Remote Server is down ...");


In other words, a variation of pgfdw_connection_check_internal()
could potentially go into interfaces/libpq/libpq-fe.h
(backend/libpq/pqcomm.c or src/interfaces/libpq/fe-connect.c).
Then GetConnection() in postgres_fdw can force a reconnect, as it already
does in some cases, or raise a proper error:
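For what it's worth, a rough sketch of such a probe on top of the backend's
WaitEventSet API with WL_SOCKET_CLOSED (PG 15+). Names and placement are
assumptions, not the proposed patch; a libpq-side variant would need a
different mechanism, since WaitEventSet is backend-only:

#include "postgres.h"
#include "storage/latch.h"

static bool
connection_socket_is_closed(pgsocket sock)
{
    WaitEventSet *ws = CreateWaitEventSet(CurrentMemoryContext, 1);
    WaitEvent   ev;
    bool        closed;

    AddWaitEventToSet(ws, WL_SOCKET_CLOSED, sock, NULL, NULL);

    /* Zero timeout: just poll whether the peer has already closed. */
    closed = (WaitEventSetWait(ws, 0, &ev, 1, 0) > 0 &&
              (ev.events & WL_SOCKET_CLOSED) != 0);
    FreeWaitEventSet(ws);

    return closed;
}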


>
>  Based on off and on discussions, I modified my patch.
>
>
I still think that it is probably too much work/code to detect the
use-case you described in [1]. Each backend constantly calling
CallCheckingRemoteServersCallbacks() for this purpose doesn't sound like
the optimal way to approach the "check whether the server is down" problem.
You typically decide whether a server is down by establishing a connection
(or a ping etc.), not by going over all the existing connections.

As far as I can think of, it should probably be a single background task
checking whether the server is down. If so, it would send an invalidation
message to all the backends so that the related backends could act on the
invalidation and throw an error. This is to cover the use-case you
described in [1].

Also, maybe we could have a new catalog table like pg_foreign_server_health
or such, where we can keep the last time the check succeeded (and/or
failed), and how many times the check  succeeded (and/or failed).

This is of course how I would approach this problem. I think some other
perspectives on this would be very useful to hear.

Thanks,
Onder KALACI


  [1]
https://www.postgresql.org/message-id/TYAPR01MB58662809E678253B90E82CE5F5889%40TYAPR01MB5866.jpnprd01.prod.outlook.com


Re: shadow variables - pg15 edition

2022-10-12 Thread Alvaro Herrera
On 2022-Oct-11, David Rowley wrote:

> I'm also keen to wait for complaints and only if we really have to,
> remove the shadow flag from being used only in the places where we
> need to.

+1

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"The problem with the future is that it keeps turning into the present"
(Hobbes)




Re: make_ctags: use -I option to ignore pg_node_attr macro

2022-10-12 Thread Tatsuo Ishii
>> I tried to apply the v2 patch approach to make_etags but noticed that
>> make_ctags and make_etags have quite a few duplicate codes, that would
>> bring maintenance headache. I think we could merge make_etags into
>> make_ctags, then add "-e" option (or whatever) to make_ctags so that
>> it generates tags files for emacs if the option is specified.
> 
> If we're going to do this, then I suggest that make_etags should become
> a symlink to make_ctags, and behave as if -e is given when called under
> that name.

What I had in my mind was making make_etags a script just exec
make_ctags (with -e option). But I don't have strong
preference. Symlink is ok for me too.

>> +tags_file=tags
> 
>> +rm -f ./$tags_file
> 
> I think $tags_file should include the leading ./ bit, to reduce
> confusion.

Ok.

> However ... hmm ... 
> 
>>  find . \( -name 'CVS' -prune \) -o \( -name .git -prune \) -o -type d 
>> -print |
>>  while read DIR
>> -do  [ "$DIR" != "." ] && ln -f -s `echo "$DIR" | sed 's;/[^/]*;/..;g'`/tags 
>> "$DIR"/tags
>> +do  [ "$DIR" != "." ] && ln -f -s `echo "$DIR" | sed 
>> 's;/[^/]*;/..;g'`/$tags_file "$DIR"/$tags_file
>>  done
> 
> ... does this create a tags symlink on each directory?  This seems
> strange to me,

I don't know the original author's intention for this but I think it
makes use of the tag file in emacs a little bit easier.  Emacs
confirms for the first time the default location of tags file under
the same directory where the source file resides. I can just hit
return key if there's a symlink of tags. If we do not create the
symlink, we have to specify the directory where the tags file was
originally created, which is a little bit annoying.

> but I admit the hack I keep in my .vim/vimrc looks even
> more strange:
> 
> :  let $CSCOPE_DB=substitute(getcwd(), "^\\(.*/pgsql/source/[^/]*\\)/.*", "\\1", "")
> :  let &tags=substitute(getcwd(), "^\\(.*/pgsql/source/[^/]*\\)/.*", "\\1", "") . "/tags"
> 
> Not sure which is worse.  Having dozens of identically named symlinks
> doesn't strike me as a great arrangement though.  I would definitely not
> use make_ctags if this is unavoidable.  I see both scripts do likewise
> currently.

Well, I often visit different tags file for different development
projects (for example Pgpool-II) and I don't want to have fixed
location of tags file in emacs setting. For this reason make_etags's
approach is acceptable for me.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp




Re: doc: add entry for pg_get_partkeydef()

2022-10-12 Thread Ian Lawrence Barwick
2022年10月12日(水) 3:29 Tom Lane :

> Ian Lawrence Barwick  writes:
> > Small patch for $subject, as the other pg_get_XXXdef() functions are
> > documented
> > and I was looking for this one but couldn't remember what it was called.
>
> Seems reasonable.  Pushed with minor wording adjustment.
>

Many thanks!

Regards

Ian Barwick


Re: create subscription - improved warning message

2022-10-12 Thread Alvaro Herrera
On 2022-Oct-12, Amit Kapila wrote:

> Okay, then I think we can commit the last error message patch of Peter
> [1] as we have an agreement on the same, and then work on this as a
> separate patch. What do you think?

No objection.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/




Re: make_ctags: use -I option to ignore pg_node_attr macro

2022-10-12 Thread Alvaro Herrera
On 2022-Oct-12, Tatsuo Ishii wrote:

> I tried to apply the v2 patch approach to make_etags but noticed that
> make_ctags and make_etags have quite a few duplicate codes, that would
> bring maintenance headache. I think we could merge make_etags into
> make_ctags, then add "-e" option (or whatever) to make_ctags so that
> it generates tags files for emacs if the option is specified.

If we're going to do this, then I suggest that make_etags should become
a symlink to make_ctags, and behave as if -e is given when called under
that name.

> +tags_file=tags

> +rm -f ./$tags_file

I think $tags_file should include the leading ./ bit, to reduce
confusion.


However ... hmm ... 

>  find . \( -name 'CVS' -prune \) -o \( -name .git -prune \) -o -type d -print 
> |
>  while read DIR
> -do   [ "$DIR" != "." ] && ln -f -s `echo "$DIR" | sed 's;/[^/]*;/..;g'`/tags 
> "$DIR"/tags
> +do   [ "$DIR" != "." ] && ln -f -s `echo "$DIR" | sed 
> 's;/[^/]*;/..;g'`/$tags_file "$DIR"/$tags_file
>  done

... does this create a tags symlink on each directory?  This seems
strange to me, but I admit the hack I keep in my .vim/vimrc looks even
more strange:

:  let $CSCOPE_DB=substitute(getcwd(), "^\\(.*/pgsql/source/[^/]*\\)/.*", "\\1", "")
:  let &tags=substitute(getcwd(), "^\\(.*/pgsql/source/[^/]*\\)/.*", "\\1", "") . "/tags"

Not sure which is worse.  Having dozens of identically named symlinks
doesn't strike me as a great arrangement though.  I would definitely not
use make_ctags if this is unavoidable.  I see both scripts do likewise
currently.

(I keep all my build trees under /pgsql/build [a symlink to
~/Code/pgsql/source], and all source trees under /pgsql/source, so this
is an easy conversion to make most of the time.)

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"World domination is proceeding according to plan"(Andrew Morton)




Re: Use LIMIT instead of Unique for DISTINCT when all distinct pathkeys are redundant

2022-10-12 Thread Richard Guo
On Wed, Oct 12, 2022 at 5:19 PM David Rowley  wrote:

> When all the distinct pathkeys are redundant then there can only be,
> at most, 1 single distinct value. There may be many rows with that
> value, but we can remove those extra ones with a LIMIT 1 rather than
> troubling over needlessly uniquifing them.
>
> This might not be a hugely common case, but; 1) it is very cheap to
> detect and 2) the speedups are likely to be *very* good.
>
> With the attached we get:
>
> regression=# explain (analyze, costs off, timing off) SELECT DISTINCT
> four,1,2,3 FROM tenk1 WHERE four = 0;
>QUERY PLAN
> -
>  Limit (actual rows=1 loops=1)
>->  Seq Scan on tenk1 (actual rows=1 loops=1)
>  Filter: (four = 0)
>  Planning Time: 0.215 ms
>  Execution Time: 0.071 ms
>
> naturally, if we removed the WHERE four = 0, we can't optimise this
> plan using this method.
>
> I see no reason why this also can't work for DISTINCT ON too.


For DISTINCT ON, if all the distinct pathkeys are redundant but there
are available sort pathkeys, then for adequately-presorted paths I think
we can also apply this optimization, using a Limit 1 rather than Unique.

regression=# explain (analyze, costs off, timing off) select distinct on
(four) * from tenk1 where four = 0 order by four, hundred desc;
   QUERY PLAN

 Limit (actual rows=1 loops=1)
   ->  Index Scan Backward using tenk1_hundred on tenk1 (actual rows=1
loops=1)
 Filter: (four = 0)
 Rows Removed by Filter: 300
 Planning Time: 0.165 ms
 Execution Time: 0.458 ms
(6 rows)

Thanks
Richard


Re: hash_xlog_split_allocate_page: failed to acquire cleanup lock

2022-10-12 Thread vignesh C
On Wed, 12 Oct 2022 at 16:16, vignesh C  wrote:
>
> On Thu, 6 Oct 2022 at 12:44, Amit Kapila  wrote:
> >
> > On Sat, Oct 1, 2022 at 12:35 AM Andres Freund  wrote:
> > >
> > > This issue does occasionally happen in CI, as e.g. noted in this thread:
> > > https://www.postgresql.org/message-id/20220930185345.GD6256%40telsasoft.com
> > >
> > > On 2022-08-18 15:17:47 +0530, Amit Kapila wrote:
> > > > I agree with you that getting rid of the clean-up lock on the new
> > > > bucket is a more invasive patch and should be done separately if
> > > > required. Yesterday, I have done a brief analysis and I think that is
> > > > possible but it doesn't seem to be a good idea to backpatch it.
> > >
> > > My problem with this approach is that the whole cleanup lock is hugely
> > > misleading as-is. As I noted in
> > > https://www.postgresql.org/message-id/20220817193032.z35vdjhpzkgldrd3%40awork3.anarazel.de
> > > we take the cleanup lock *after* re-initializing the page. Thereby
> > > completely breaking the properties that a cleanup lock normally tries to
> > > guarantee.
> > >
> > > Even if that were to achieve something useful (doubtful in this case),
> > > it'd need a huge comment explaining what's going on.
> > >
> >
> > Attached are two patches. The first patch is what Robert has proposed
> > with some changes in comments to emphasize the fact that cleanup lock
> > on the new bucket is just to be consistent with the old bucket page
> > locking as we are initializing it just before checking for cleanup
> > lock. In the second patch, I removed the acquisition of cleanup lock
> > on the new bucket page and changed the comments/README accordingly.
> >
> > I think we can backpatch the first patch and the second patch can be
> > just a HEAD-only patch. Does that sound reasonable to you?
>
> Thanks for the patches.
> I have verified that the issue is fixed using a manual test upto
> REL_10_STABLE version and found it to be working fine.

Just to clarify, I have verified that the first patch fixes the issue on
the Head, REL_15_STABLE, REL_14_STABLE, REL_13_STABLE, REL_12_STABLE,
REL_11_STABLE and REL_10_STABLE branches. I have also verified that the
first and second patches together fix the issue on the Head branch.

Regards,
Vignesh




Re: Proposal to provide the facility to set binary format output for specific OID's per session

2022-10-12 Thread Dave Cramer
Waiting on the author to do what? I'm waiting for a review.

Re: make_ctags: use -I option to ignore pg_node_attr macro

2022-10-12 Thread Tatsuo Ishii
Hi,

>> > I found that tag files generated by src/tools/make_ctags
>> > doesn't include some keywords, that were field names of node
>> > structs, for example norm_select in RestrictInfo. Such fields
>> > are defined with pg_node_attr macro that was introduced
>> > recently, like:
>> > 
>> > Selectivity norm_selec pg_node_attr(equal_ignore);
>> > 
>> > In this case, pg_node_attr is mistakenly interpreted to be
>> > the name of the field. So, I propose to use -I option of ctags
>> > to ignore the marco name. Attached is a patch to do it.

I found the same issue with make_etags too.

> I updated the patch to ignore the code under tmp_install and add
> some file types like sql, p[lm], and so on. .sgml or .sh is not
> included because they don't seem to be beneficial for ctags.

I tried to apply the v2 patch approach to make_etags but noticed that
make_ctags and make_etags have quite a few duplicate codes, that would
bring maintenance headache. I think we could merge make_etags into
make_ctags, then add "-e" option (or whatever) to make_ctags so that
it generates tags files for emacs if the option is specified.

Patch attached.

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
diff --git a/src/tools/make_ctags b/src/tools/make_ctags
index 912b6fafac..3a65139d94 100755
--- a/src/tools/make_ctags
+++ b/src/tools/make_ctags
@@ -1,12 +1,30 @@
 #!/bin/sh
 
-# src/tools/make_ctags
+# src/tools/make_ctags [-e]
+# If -e is specified, generate tags files for emacs.
+usage="$0 [-e]"
+if [ $# -gt 1 ];then
+echo $usage
+exit 1
+fi
+
+mode=
+tags_file=tags
+
+if [ $# = 1 ];then
+if [ $1 != "-e" ];then
+	echo $usage
+	exit 1
+fi
+mode="-e"
+tags_file=TAGS
+fi
 
 command -v ctags >/dev/null || \
 	{ echo "'ctags' program not found" 1>&2; exit 1; }
 
 trap "ret=$?; rm -rf /tmp/$$; exit $ret" 0 1 2 3 15
-rm -f ./tags
+rm -f ./$tags_file
 
 IS_EXUBERANT=""
 ctags --version 2>&1 | grep Exuberant && IS_EXUBERANT="Y"
@@ -34,9 +52,17 @@ then	FLAGS="--c-kinds=+dfmstuv"
 else	FLAGS="-dt"
 fi
 
+# Use -I option to ignore a macro
+if [ "$IS_EXUBERANT" ]
+then	IGNORE_IDENTIFIES="-I pg_node_attr+"
+else	IGNORE_IDENTIFIES=
+fi
+
 # this is outputting the tags into the file 'tags', and appending
-find `pwd`/ -type f -name '*.[chyl]' -print |
-	xargs ctags -a -f tags "$FLAGS"
+find `pwd`/ \( -name tmp_install -prune -o -name tmp_check -prune \) \
+	-o \( -name "*.[chly]" -o -iname "*makefile*" -o -name "*.mk" -o -name "*.in" \
+	-o -name "*.sql" -o -name "*.p[lm]" \) -type f -print |
+	xargs ctags $mode -a -f $tags_file "$FLAGS" "$IGNORE_IDENTIFIES"
 
 # Exuberant tags has a header that we cannot sort in with the other entries
 # so we skip the sort step
@@ -45,10 +71,10 @@ find `pwd`/ -type f -name '*.[chyl]' -print |
 if [ ! "$IS_EXUBERANT" ]
 then	LC_ALL=C
 	export LC_ALL
-	sort tags >/tmp/$$ && mv /tmp/$$ tags
+	sort $tags_file >/tmp/$$ && mv /tmp/$$ $tags_file
 fi
 
 find . \( -name 'CVS' -prune \) -o \( -name .git -prune \) -o -type d -print |
 while read DIR
-do	[ "$DIR" != "." ] && ln -f -s `echo "$DIR" | sed 's;/[^/]*;/..;g'`/tags "$DIR"/tags
+do	[ "$DIR" != "." ] && ln -f -s `echo "$DIR" | sed 's;/[^/]*;/..;g'`/$tags_file "$DIR"/$tags_file
 done


Re: Use LIMIT instead of Unique for DISTINCT when all distinct pathkeys are redundant

2022-10-12 Thread Richard Guo
On Wed, Oct 12, 2022 at 5:19 PM David Rowley  wrote:

> When all the distinct pathkeys are redundant then there can only be,
> at most, 1 single distinct value. There may be many rows with that
> value, but we can remove those extra ones with a LIMIT 1 rather than
> troubling over needlessly uniquifing them.


I'm not sure if this case is common enough in practice, but since this
patch is very straightforward and adds essentially no cost, I think it's
worth doing.

I also have a concern about the 2 Limit nodes pointed out by the comment
inside the patch. Maybe we can check with limit_needed() and manually
add the Limit node only if there is no LIMIT clause in the original query?
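For illustration, roughly how that could look; variable names and the exact
placement (inside create_final_distinct_paths()) are assumptions, not the
actual patch:

/*
 * When all distinct pathkeys are redundant, a LIMIT 1 path is enough --
 * but only add our own Limit if the query does not already carry one.
 */
if (root->distinct_pathkeys == NIL && !limit_needed(root->parse))
{
    Node       *limitCount = (Node *) makeConst(INT8OID, -1, InvalidOid,
                                                sizeof(int64),
                                                Int64GetDatum(1),
                                                false, FLOAT8PASSBYVAL);

    add_path(distinct_rel, (Path *)
             create_limit_path(root, distinct_rel, cheapest_input_path,
                               NULL, limitCount,
                               LIMIT_OPTION_COUNT, 0, 1));
}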

Thanks
Richard


Re: create subscription - improved warning message

2022-10-12 Thread Amit Kapila
On Wed, Oct 12, 2022 at 2:08 PM Alvaro Herrera  wrote:
>
> On 2022-Oct-12, Amit Kapila wrote:
>
> > I think it is a good idea to expand the docs for this but note that
> > there are multiple places that use a similar description. For example,
> > see the description slot_name option: "When slot_name is set to NONE,
> > there will be no replication slot associated with the subscription.
> > This can be used if the replication slot will be created later
> > manually. Such subscriptions must also have both enabled and
> > create_slot set to false.".  Then, we have a few places in the logical
> > replication docs [1] that talk about creating the slot manually but
> > didn't explain in detail the name or options to use. We might want to
> > write a slightly bigger doc patch so that we can write the description
> > in one place and give reference to the same at other places.
>
> +1
>

Okay, then I think we can commit the last error message patch of Peter
[1] as we have an agreement on the same, and then work on this as a
separate patch. What do you think?

[1] - 
https://www.postgresql.org/message-id/CAHut%2BPtgkebavGYsGnROkY1%3DULhJ5%2Byn4_i3Y9E9%2ByDeksqpwQ%40mail.gmail.com

-- 
With Regards,
Amit Kapila.




Re: hash_xlog_split_allocate_page: failed to acquire cleanup lock

2022-10-12 Thread vignesh C
On Thu, 6 Oct 2022 at 12:44, Amit Kapila  wrote:
>
> On Sat, Oct 1, 2022 at 12:35 AM Andres Freund  wrote:
> >
> > This issue does occasionally happen in CI, as e.g. noted in this thread:
> > https://www.postgresql.org/message-id/20220930185345.GD6256%40telsasoft.com
> >
> > On 2022-08-18 15:17:47 +0530, Amit Kapila wrote:
> > > I agree with you that getting rid of the clean-up lock on the new
> > > bucket is a more invasive patch and should be done separately if
> > > required. Yesterday, I have done a brief analysis and I think that is
> > > possible but it doesn't seem to be a good idea to backpatch it.
> >
> > My problem with this approach is that the whole cleanup lock is hugely
> > misleading as-is. As I noted in
> > https://www.postgresql.org/message-id/20220817193032.z35vdjhpzkgldrd3%40awork3.anarazel.de
> > we take the cleanup lock *after* re-initializing the page. Thereby
> > completely breaking the properties that a cleanup lock normally tries to
> > guarantee.
> >
> > Even if that were to achieve something useful (doubtful in this case),
> > it'd need a huge comment explaining what's going on.
> >
>
> Attached are two patches. The first patch is what Robert has proposed
> with some changes in comments to emphasize the fact that cleanup lock
> on the new bucket is just to be consistent with the old bucket page
> locking as we are initializing it just before checking for cleanup
> lock. In the second patch, I removed the acquisition of cleanup lock
> on the new bucket page and changed the comments/README accordingly.
>
> I think we can backpatch the first patch and the second patch can be
> just a HEAD-only patch. Does that sound reasonable to you?

Thanks for the patches.
I have verified that the issue is fixed using a manual test up to the
REL_10_STABLE version and found it to be working fine.

I have added code to print the old buffer and new buffer values when
both the old buffer and the new buffer get dirtied. Then I executed
the following test and noted down the old buffer and new buffer values
from the log file:
CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
CREATE INDEX hash_pvactst ON pvactst USING hash (i);
create table t1(c1 int);
INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM
generate_series(1,1000) i;
INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM
generate_series(1,1000) i;
INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM
generate_series(1,1000) i;

In my environment, the issue will occur when oldbuf is 38 and newbuf is 60.

Once we know the old buffer and new buffer values, we will have to
debug the checkpointer and recovery process to simulate the scenario.
I used the following steps to simulate the issue in my environment:
1) Create streaming replication setup with the following configurations:
wal_consistency_checking = all
shared_buffers = 128MB  # min 128kB
bgwriter_lru_maxpages = 0   # max buffers written/round, 0 disables
checkpoint_timeout = 30s# range 30s-1d
2) Execute the following in master node:
CREATE TABLE pvactst (i INT, a INT[], p POINT) with (autovacuum_enabled = off);
CREATE INDEX hash_pvactst ON pvactst USING hash (i);
3) Hold checkpointer process of standby instance at BufferSync while debugging.
4) Execute the following in master node:
create table t1(c1 int); -- This is required so that the old buffer
value is not dirty in the checkpointer process. (If the old buffer is dirty
then we will not be able to sync the new buffer, as the checkpointer will
wait while trying to acquire the lock on the old buffer.)
5) Make the checkpointer process check the buffers up to old buffer + 1.
In our case it should cross 38.
6) Hold recovery process at
hash_xlog_split_allocate_page->IsBufferCleanupOK (approximately line
hash_xlog.c:357) while executing the following for the last insert in
the master node:
INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM
generate_series(1,1000) i;
INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM
generate_series(1,1000) i;
INSERT INTO pvactst SELECT i, array[1,2,3], point(i, i+1) FROM
generate_series(1,1000) i;
7) Continue the checkpointer process and make it proceed to
SyncOneBuffer with buf_id = 60 (the newbuf value noted from the
earlier execution) and let it proceed up to PinBuffer_Locked(bufHdr);
8) Continuing the recovery process will reproduce the PANIC scenario.

Regards,
Vignesh




Re: Perform streaming logical transactions by background workers and parallel apply

2022-10-12 Thread Peter Smith
Here are some review comments for v36-0001.

==

1. GENERAL

Houzj wrote ([1] #3a):
The word 'streaming' is the same as the actual option name, so
personally I think it's fine. But if others also agreed that the name
can be improved, I can change it.

~

Sure, I was not really complaining that the name is "wrong". Only I
did not think it was a good idea to have multiple struct members
called 'streaming' when they don't have the same meaning. e.g. one is
the internal character mode equivalent of the parameter, and one is
the parameter value as a string. That's why I thought they should be
different names. e.g. Make the 2nd one 'streaming_valstr' or
something.

==

2. doc/src/sgml/config.sgml

Previously I suggested there should be xrefs to the "Configuration
Settings" page but Houzj wrote ([1] #4):
Not sure about this as we don't have similar thing in the document of
max_logical_replication_workers and max_sync_workers_per_subscription.

~

Fair enough, but IMO perhaps all those others should also xref to the
"Configuration Settings" chapter. So if such a change does not belong
in this patch, then how about if I make another independent thread to
post this suggestion?

==

.../replication/logical/applyparallelworker.c


3. parallel_apply_find_worker

+parallel_apply_find_worker(TransactionId xid)
+{
+ bool found;
+ ParallelApplyWorkerEntry *entry = NULL;
+
+ if (!TransactionIdIsValid(xid))
+ return NULL;
+
+ if (ParallelApplyWorkersHash == NULL)
+ return NULL;
+
+ /* Return the cached parallel apply worker if valid. */
+ if (stream_apply_worker != NULL)
+ return stream_apply_worker;
+
+ /*
+ * Find entry for requested transaction.
+ */
+ entry = hash_search(ParallelApplyWorkersHash, &xid, HASH_FIND, &found);

In function parallel_apply_start_worker() you removed the entry
assignment to NULL because it is never needed. Can do the same here
too.

~~~

4. parallel_apply_free_worker

+/*
+ * Remove the parallel apply worker entry from the hash table. And stop the
+ * worker if there are enough workers in the pool. For more information about
+ * the worker pool, see comments atop worker.c.
+ */
+void
+parallel_apply_free_worker(ParallelApplyWorkerInfo *winfo, TransactionId xid)

"And stop" -> "Stop"

~~~

5. parallel_apply_free_worker

+ * Although some error messages may be lost in rare scenarios, but
+ * since the parallel apply worker has finished processing the
+ * transaction, and error messages may be lost even if we detach the
+ * error queue after terminating the process. So it should be ok.
+ */

SUGGESTION (minor rewording)
Some error messages may be lost in rare scenarios, but it should be OK
because the parallel apply worker has finished processing the
transaction, and error messages may be lost even if we detached the
error queue after terminating the process.

~~~

6. LogicalParallelApplyLoop

+ for (;;)
+ {
+ void*data;
+ Size len;
+ int c;
+ StringInfoData s;
+ MemoryContext oldctx;
+
+ CHECK_FOR_INTERRUPTS();
+
+ /* Ensure we are reading the data into our memory context. */
+ oldctx = MemoryContextSwitchTo(ApplyMessageContext);
+
...
+
+ MemoryContextSwitchTo(oldctx);
+ MemoryContextReset(ApplyMessageContext);
+ }

Do those memory context switches need to happen inside the for(;;)
loop like that? I thought perhaps those can be done *outside* of the
loop instead of always switching and switching back on the next
iteration.
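
For illustration only, here is an untested sketch of that restructuring,
using the same names as the quoted code (MemoryContextReset() keeps the
context itself valid, so resetting it once per iteration should be enough):

    oldctx = MemoryContextSwitchTo(ApplyMessageContext);
    for (;;)
    {
        CHECK_FOR_INTERRUPTS();

        /* ... read and apply the next message ... */

        MemoryContextReset(ApplyMessageContext);
    }
    MemoryContextSwitchTo(oldctx);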

~~~

7. LogicalParallelApplyLoop

Previously I suggested maybe the name (e.g. the 2nd param) should be
changed to "ParallelApplyMessageContext"? Houzj wrote ([1] #13): Not
sure about this, because ApplyMessageContext is used in both worker.c
and applyparallelworker.c.

~

But I thought those are completely independent ApplyMessageContext's
in different processes that happen to have the same name. Shouldn't
they have a name appropriate to who owns them?

~~~

8. ParallelApplyWorkerMain

+ /*
+ * Allocate the origin name in a long-lived context for error context
+ * message.
+ */
+ snprintf(originname, sizeof(originname), "pg_%u", MySubscription->oid);

Now that ReplicationOriginNameForLogicalRep patch is pushed [2] please
make use of this common function.
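
If I'm reading the new helper correctly, that would be roughly the following
(please double-check the exact signature against the committed function):

    ReplicationOriginNameForLogicalRep(MySubscription->oid, InvalidOid,
                                       originname, sizeof(originname));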

~~~

9. HandleParallelApplyMessage

+ case 'X': /* Terminate, indicating clean exit */
+ {
+ shm_mq_detach(winfo->error_mq_handle);
+ winfo->error_mq_handle = NULL;
+ break;
+ }
+
+ /*
+ * Don't need to do anything about NoticeResponse and
+ * NotifyResponse as the logical replication worker doesn't need
+ * to send messages to the client.
+ */
+ case 'N':
+ case 'A':
+ break;
+ default:
+ {
+ elog(ERROR, "unrecognized message type received from parallel apply
worker: %c (message length %d bytes)",
+ msgtype, msg->len);
+ }

9a. case 'X':
There are no variable declarations here so the statement block {} is not needed

~

9b. default:
There are no variable declarations here so the statement block {} is not needed

~~~

10. parallel_apply_stream_abort

+ int i;
+ bool found = false;
+ char spname[MAXPGPATH];
+
+ 

Re: slab allocator performance issues

2022-10-12 Thread David Rowley
On Sat, 11 Sept 2021 at 09:07, Tomas Vondra
 wrote:
> I've been investigating the regressions in some of the benchmark
> results, together with the generation context benchmarks [1].

I've not looked into the regression you found with this yet, but I did
rebase the patch.  slab.c has seen quite a number of changes recently.

I didn't spend a lot of time checking over the patch. I mainly wanted
to see what the performance was like before reviewing in too much
detail.

To test the performance, I used [1] and ran:

select pg_allocate_memory_test(<chunk size>, 1024*1024,
10::bigint*1024*1024*1024, 'slab');

that basically allocates chunks of <chunk size> and keeps around 1MB of
them at a time and allocates a total of 10GB of them.

I saw:

Master:
16 byte chunk = 8754.678 ms
32 byte chunk = 4511.725 ms
64 byte chunk = 2244.885 ms
128 byte chunk = 1135.349 ms
256 byte chunk = 548.030 ms
512 byte chunk = 272.017 ms
1024 byte chunk = 144.618 ms

Master + attached patch:
16 byte chunk = 5255.974 ms
32 byte chunk = 2640.807 ms
64 byte chunk = 1328.949 ms
128 byte chunk = 668.078 ms
256 byte chunk = 330.564 ms
512 byte chunk = 166.844 ms
1024 byte chunk = 85.399 ms

So patched runs in about 60% of the time that master runs in.

I plan to look at the patch in a bit more detail and see if I can
recreate and figure out the regression that Tomas reported. For now, I
just want to share the rebased patch.

The only thing I really adjusted from Andres' version is that, instead of
using pointers for the linked-list block freelist, I made it store the
number of bytes into the block at which the chunk sits.  This means we can
use 4 bytes instead of 8 bytes for these pointers.  The block size is
limited to 1GB now anyway, so 32-bit is large enough for these
offsets.
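
Just to illustrate the idea (invented names, not the patch's actual code):
the link stored in a free chunk becomes a 32-bit byte offset from the start
of the block instead of an 8-byte pointer, roughly:

    #define ChunkToOffset(block, chunk) ((uint32) ((char *) (chunk) - (char *) (block)))
    #define OffsetToChunk(block, off)   ((void *) ((char *) (block) + (off)))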

David

[1] 
https://www.postgresql.org/message-id/attachment/137056/allocate_performance_functions.patch.txt
From 2b89d993b0294d5c0fe0a8333bccc555337ac979 Mon Sep 17 00:00:00 2001
From: David Rowley 
Date: Wed, 12 Oct 2022 09:30:24 +1300
Subject: [PATCH v3] WIP: slab performance.

Author: Andres Freund
Discussion: 
https://postgr.es/m/20210717194333.mr5io3zup3kxa...@alap3.anarazel.de
---
 src/backend/utils/mmgr/slab.c | 449 ++
 1 file changed, 294 insertions(+), 155 deletions(-)

diff --git a/src/backend/utils/mmgr/slab.c b/src/backend/utils/mmgr/slab.c
index 1a0b28f9ea..58b8e2d67c 100644
--- a/src/backend/utils/mmgr/slab.c
+++ b/src/backend/utils/mmgr/slab.c
@@ -23,15 +23,18 @@
  * global (context) level. This is possible as the chunk size (and thus 
also
  * the number of chunks per block) is fixed.
  *
- * On each block, free chunks are tracked in a simple linked list. Contents
- * of free chunks is replaced with an index of the next free chunk, forming
- * a very simple linked list. Each block also contains a counter of free
- * chunks. Combined with the local block-level freelist, it makes it 
trivial
- * to eventually free the whole block.
+ * On each block, never allocated chunks are tracked by a simple offset, 
and
+ * free chunks are tracked in a simple linked list. The offset approach
+ * avoids needing to iterate over all chunks when allocating a new block,
+ * which would cause page faults and cache pollution. Contents of free 
chunks
+ * is replaced with a the offset (in bytes) from the block pointer to the
+ * next free chunk, forming a very simple linked list. Each block also
+ * contains a counter of free chunks. Combined with the local block-level
+ * freelist, it makes it trivial to eventually free the whole block.
  *
  * At the context level, we use 'freelist' to track blocks ordered by 
number
  * of free chunks, starting with blocks having a single allocated chunk, 
and
- * with completely full blocks on the tail.
+ * with completely full blocks on the tail. XXX
  *
  * This also allows various optimizations - for example when searching for
  * free chunk, the allocator reuses space from the fullest blocks first, in
@@ -44,7 +47,7 @@
  * case this performs as if the pointer was not maintained.
  *
  * We cache the freelist index for the blocks with the fewest free chunks
- * (minFreeChunks), so that we don't have to search the freelist on every
+ * (minFreeChunkIndex), so that we don't have to search the freelist on 
every
  * SlabAlloc() call, which is quite expensive.
  *
  *-
@@ -60,6 +63,17 @@
 
 #define Slab_BLOCKHDRSZ	MAXALIGN(sizeof(SlabBlock))
 
+struct SlabBlock;
+struct SlabChunk;
+
+/*
+ * Number of actual freelists + 1, for full blocks. Full blocks are always at
+ * offset 0.
+ */
+#define SLAB_FREELIST_COUNT 9
+
+#define SLAB_RETAIN_EMPTY_BLOCK_COUNT 10
+
 /*
  * SlabContext is a specialized implementation of MemoryContext.
  */
@@ -72,13 +86,16 @@ typedef struct SlabContext
	Size		blockSize;		/* block size */

Re: make_ctags: use -I option to ignore pg_node_attr macro

2022-10-12 Thread Yugo NAGATA
Hello

On Mon, 10 Oct 2022 12:04:22 +0200
Alvaro Herrera  wrote:

> Hello
> 
> On 2022-Oct-07, Yugo NAGATA wrote:
> 
> > I found that tag files generated by src/tools/make_ctags
> > don't include some keywords that are field names of node
> > structs, for example norm_selec in RestrictInfo. Such fields
> > are defined with the pg_node_attr macro that was introduced
> > recently, like:
> > 
> > Selectivity norm_selec pg_node_attr(equal_ignore);
> > 
> > In this case, pg_node_attr is mistakenly interpreted to be
> > the name of the field. So, I propose to use the -I option of ctags
> > to ignore the macro name. Attached is a patch to do it.
> 
> I've wondered if there's anybody that uses this script.  I suppose if
> you're reporting this problem then it has at least one user, and
> therefore worth fixing.

Yeah, I am a make_ctags user; there may be few users, though.

> If we do patch it, how about doing some more invasive surgery and
> bringing it to the XXI century?  I think the `find` line is a bit lame:
> 
> >  # this is outputting the tags into the file 'tags', and appending
> >  find `pwd`/ -type f -name '*.[chyl]' -print |
> > -   xargs ctags -a -f tags "$FLAGS"
> > +   xargs ctags -a -f tags "$FLAGS" -I "$IGNORE_LIST"
> 
> especially because it includes everything in tmp_install, which pollutes
> the tag list.
> 
> In my own tags script I just call "ctags -R", and I feed cscope with
> these find lines
> 
> (find $SRCDIR \( -name tmp_install -prune -o -name tmp_check -prune \) -o \( 
> -name "*.[chly]" -o -iname "*makefile*" -o -name "*.mk" -o -name "*.in" -o 
> -name "*.sh" -o -name "*.sgml" -o -name "*.sql" -o -name "*.p[lm]" \) -type f 
> -print ; \
> find $BUILDDIR \( -name tmp_install -prune \) -o \( -name \*.h -a -type f \) 
> -print )

Thank you for your comments. 

I updated the patch to ignore the code under tmp_install and to add
some file types like .sql, .p[lm], and so on. .sgml and .sh are not
included because they don't seem to be beneficial for ctags.

Regards,
Yugo Nagata

-- 
Yugo NAGATA 
diff --git a/src/tools/make_ctags b/src/tools/make_ctags
index 912b6fafac..b1070dcea9 100755
--- a/src/tools/make_ctags
+++ b/src/tools/make_ctags
@@ -34,9 +34,17 @@ then	FLAGS="--c-kinds=+dfmstuv"
 else	FLAGS="-dt"
 fi
 
+# Use -I option to ignore a macro
+if [ "$IS_EXUBERANT" ]
+then	IGNORE_IDENTIFIES="-I pg_node_attr+"
+else	IGNORE_IDENTIFIES=
+fi
+
 # this is outputting the tags into the file 'tags', and appending
-find `pwd`/ -type f -name '*.[chyl]' -print |
-	xargs ctags -a -f tags "$FLAGS"
+find `pwd`/ \( -name tmp_install -prune -o -name tmp_check -prune \) \
+	-o \( -name "*.[chly]" -o -iname "*makefile*" -o -name "*.mk" -o -name "*.in" \
+	-o -name "*.sql" -o -name "*.p[lm]" \) -type f -print |
+	xargs ctags -a -f tags "$FLAGS" "$IGNORE_IDENTIFIES"
 
 # Exuberant tags has a header that we cannot sort in with the other entries
 # so we skip the sort step


Use LIMIT instead of Unique for DISTINCT when all distinct pathkeys are redundant

2022-10-12 Thread David Rowley
Over on [1], Klint highlights a query with a DISTINCT which is a
little sub-optimal in PostgreSQL.  ISTM that in cases where all
DISTINCT pathkeys have been marked as redundant due to constants
existing in all of the EquivalenceClasses of the DISTINCT columns,
it should be okay not to bother using a Unique node
to remove duplicate results.

When all the distinct pathkeys are redundant then there can only be,
at most, 1 single distinct value. There may be many rows with that
value, but we can remove those extra ones with a LIMIT 1 rather than
troubling over needlessly uniquifying them.

This might not be a hugely common case, but: 1) it is very cheap to
detect, and 2) the speedups are likely to be *very* good.

With the attached we get:

regression=# explain (analyze, costs off, timing off) SELECT DISTINCT
four,1,2,3 FROM tenk1 WHERE four = 0;
   QUERY PLAN
-
 Limit (actual rows=1 loops=1)
   ->  Seq Scan on tenk1 (actual rows=1 loops=1)
 Filter: (four = 0)
 Planning Time: 0.215 ms
 Execution Time: 0.071 ms

naturally, if we removed the WHERE four = 0, we can't optimise this
plan using this method.

I see no reason why this also can't work for DISTINCT ON too.

regression=# explain (analyze, costs off, timing off) SELECT DISTINCT
ON (four,two) four,two FROM tenk1 WHERE four = 0 order by 1,2;
QUERY PLAN
--
 Unique (actual rows=1 loops=1)
   ->  Sort (actual rows=2500 loops=1)
 Sort Key: two
 Sort Method: quicksort  Memory: 175kB
 ->  Seq Scan on tenk1 (actual rows=2500 loops=1)
   Filter: (four = 0)
   Rows Removed by Filter: 7500
 Planning Time: 0.123 ms
 Execution Time: 4.251 ms
(9 rows)

Then, of course, if we introduce some column for which the pathkey is not
redundant, we must do the distinct operation as normal.

regression=# explain (analyze, costs off, timing off) SELECT DISTINCT
four,two FROM tenk1 WHERE four = 0 order by 1,2;
QUERY PLAN
--
 Sort (actual rows=1 loops=1)
   Sort Key: two
   Sort Method: quicksort  Memory: 25kB
   ->  HashAggregate (actual rows=1 loops=1)
 Group Key: four, two
 Batches: 1  Memory Usage: 24kB
 ->  Seq Scan on tenk1 (actual rows=2500 loops=1)
   Filter: (four = 0)
   Rows Removed by Filter: 7500
 Planning Time: 0.137 ms
 Execution Time: 4.274 ms
(11 rows)

Does this seem like something we'd want to do?

Patch attached.

David

[1] 
https://postgr.es/m/meypr01mb7101cd5da0a07c9de2b74850a4...@meypr01mb7101.ausprd01.prod.outlook.com
diff --git a/src/backend/optimizer/plan/planner.c 
b/src/backend/optimizer/plan/planner.c
index 5d0fd6e072..84856721f3 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -4764,8 +4764,6 @@ create_final_distinct_paths(PlannerInfo *root, RelOptInfo 
*input_rel,
 * the other.)
 */
List   *needed_pathkeys;
-   Path   *path;
-   ListCell   *lc;
 
if (parse->hasDistinctOn &&
list_length(root->distinct_pathkeys) <
@@ -4774,44 +4772,73 @@ create_final_distinct_paths(PlannerInfo *root, 
RelOptInfo *input_rel,
else
needed_pathkeys = root->distinct_pathkeys;
 
-   foreach(lc, input_rel->pathlist)
+   /*
+* needed_pathkeys may have become empty if all of the pathkeys 
were
+* determined to be redundant.  If all of the pathkeys are 
redundant
+* then each DISTINCT target must only allow a single value, 
therefore
+* all resulting tuples must be identical.  We can uniquify 
these
+* tuples simply by just taking the first tuple.  All we do 
here is
+* add a path to do LIMIT 1 on the cheapest input path.
+*
+* XXX we could end up with 2 Limit nodes if the query already 
has a
+* LIMIT clause. Does that matter?
+*/
+   if (needed_pathkeys == NIL)
+   {
+   Node   *limitCount = (Node *) makeConst(INT8OID, 
-1, InvalidOid,
+   
sizeof(int64),
+   
Int64GetDatum(1), false,
+   
FLOAT8PASSBYVAL);
+
+   add_path(distinct_rel, (Path *)
+create_limit_path(root, distinct_rel,
+   

Re: Make ON_ERROR_STOP stop on shell script failure

2022-10-12 Thread bt22nakamorit

There was a mistake in the error message for \! so I updated the patch.

Best,
Tatsuhiro Nakamori

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 9494f28063..82febf0ace 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -4143,7 +4143,12 @@ bar
 
 By default, command processing continues after an error.  When this
 variable is set to on, processing will instead stop
-immediately.  In interactive mode,
+immediately upon an error in SQL query. When this variable is set to
+shell, a nonzero exit status of a shell command,
+which indicates failure, is interpreted as an error that stops the processing.
+When this variable is set to all, errors from both
+SQL queries and shell commands can stop the processing.
+In interactive mode,
 psql will return to the command prompt;
 otherwise, psql will exit, returning
 error code 3 to distinguish this case from fatal error
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index ab613dd49e..2a4086893e 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -34,6 +34,7 @@
 #include "describe.h"
 #include "fe_utils/cancel.h"
 #include "fe_utils/print.h"
+#include "fe_utils/psqlscan_int.h"
 #include "fe_utils/string_utils.h"
 #include "help.h"
 #include "input.h"
@@ -2355,9 +2356,13 @@ exec_command_set(PsqlScanState scan_state, bool active_branch)
 			 */
 			char	   *newval;
 			char	   *opt;
+			PQExpBuffer output_buf = scan_state->output_buf;
 
 			opt = psql_scan_slash_option(scan_state,
 		 OT_NORMAL, NULL, false);
+			if (output_buf->len >= output_buf->maxlen
+&& (pset.on_error_stop == PSQL_ERROR_STOP_SHELL || pset.on_error_stop == PSQL_ERROR_STOP_ALL))
+success = false;
 			newval = pg_strdup(opt ? opt : "");
 			free(opt);
 
@@ -2693,8 +2698,25 @@ exec_command_write(PsqlScanState scan_state, bool active_branch,
 			else if (previous_buf && previous_buf->len > 0)
 fprintf(fd, "%s\n", previous_buf->data);
 
-			if (is_pipe)
+			if (is_pipe) {
 result = pclose(fd);
+
+if (result != 0)
+{
+	if (result < 0)
+		pg_log_error("could not close pipe to external command: %m");
+	else
+	{
+		char *reason = wait_result_to_str(result);
+
+		pg_log_error("%s: %s", fname + 1,
+	reason ? reason : "");
+		free(reason);
+	}
+}
+if (pset.on_error_stop == PSQL_ERROR_STOP_SHELL || pset.on_error_stop == PSQL_ERROR_STOP_ALL)
+	status = PSQL_CMD_ERROR;
+			}
 			else
 result = fclose(fd);
 
@@ -4984,10 +5006,19 @@ do_shell(const char *command)
 	else
 		result = system(command);
 
-	if (result == 127 || result == -1)
+	if (result != 0)
 	{
-		pg_log_error("\\!: failed");
-		return false;
+		if (result < 0)
+			pg_log_error("could not execute command: %m");
+		else
+		{
+			char *reason = wait_result_to_str(result);
+
+			pg_log_error("%s: %s", command, reason ? reason : "");
+			free(reason);
+		}
+		if (pset.on_error_stop == PSQL_ERROR_STOP_SHELL || pset.on_error_stop == PSQL_ERROR_STOP_ALL)
+			return false;
 	}
 	return true;
 }
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 4f310a8019..966cd34d23 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -94,6 +94,7 @@ setQFout(const char *fname)
 {
 	FILE	   *fout;
 	bool		is_pipe;
+	bool		status = true;
 
 	/* First make sure we can open the new output file/pipe */
	if (!openQueryOutputFile(fname, &fout, &is_pipe))
@@ -103,7 +104,24 @@ setQFout(const char *fname)
 	if (pset.queryFout && pset.queryFout != stdout && pset.queryFout != stderr)
 	{
 		if (pset.queryFoutPipe)
-			pclose(pset.queryFout);
+		{
+			int pclose_rc = pclose(pset.queryFout);
+			if (pclose_rc != 0)
+			{
+if (pclose_rc < 0)
+	pg_log_error("could not close pipe to external command: %m");
+else
+{
+	char *reason = wait_result_to_str(pclose_rc);
+
+	pg_log_error("command failure: %s",
+ reason ? reason : "");
+	free(reason);
+}
+if ((pset.on_error_stop == PSQL_ERROR_STOP_SHELL || pset.on_error_stop == PSQL_ERROR_STOP_ALL))
+	status = false;
+			}
+		}
 		else
 			fclose(pset.queryFout);
 	}
@@ -115,7 +133,7 @@ setQFout(const char *fname)
 	set_sigpipe_trap_state(is_pipe);
 	restore_sigpipe_trap();
 
-	return true;
+	return status;
 }
 
 
@@ -1373,7 +1391,7 @@ DescribeQuery(const char *query, double *elapsed_msec)
  * For other commands, the results are processed normally, depending on their
  * status.
  *
- * Returns 1 on complete success, 0 on interrupt and -1 or errors.  Possible
+ * Returns 1 on complete success, 0 on interrupt and -1 on errors.  Possible
  * failure modes include purely client-side problems; check the transaction
  * status for the server-side opinion.
  *
@@ -1635,7 +1653,22 @@ ExecQueryAndProcessResults(const char *query,
 	{
 		if (gfile_is_pipe)
 		{
-			pclose(gfile_fout);
+			int pclose_rc = 

Re: Fix wrong syntax about CREATE_REPLICATION_SLOT

2022-10-12 Thread Michael Paquier
On Wed, Oct 12, 2022 at 08:33:43AM +, tachikake.ay...@fujitsu.com wrote:
> Hi hackers,
> 
> A minor bug was found in the "CREATE_REPLICATION_SLOT" syntax.
> It is in protocol.sgml at line 1990.

Indeed, right.  The command is described two times, in its old and new
fashions and the new flavor is incorrect.
--
Michael




Re: [Commitfest 2022-09] Date is Over.

2022-10-12 Thread Michael Paquier
On Wed, Oct 05, 2022 at 06:01:01PM +0800, Julien Rouhaud wrote:
> The CF should be marked as closed, and its entries fully processed within a 
> few
> days after its final day.
> 
> The general rule is that patches that have been waiting on authors for 2 weeks
> or more without any answer from the author(s) should get returned with
> feedback with some message to the author, and the rest is moved to the next
> commitfest.  If you have the time to look at some patches and see if they need
> something else, that's always better.

One week after this message, there was a total of 170-ish entries
still in the commit fest, so I have gone through each one of them and
updated the ones in need of a refresh.  Please note that there was
something like 50~60 entries where the CF bot was failing.  In some
cases, Ibrar has mentioned that on the thread near the beginning of
September, and based on the lack of updates such entries have been
switched as RwF.  Other entries where the CF bot is failing have been
marked as waiting on author for now.  A couple of entries have been
committed but not marked as such, and there was the usual set of
entries with an incorrect status.

Of course, I may have done some mistakes while classifying all that,
so feel free to scream back at me if you feel that something has been
handled incorrectly.

Here is the final score:
Committed: 65.
Moved to next CF: 177.
Withdrawn: 11.
Rejected: 3.
Returned with Feedback: 40.
Total: 296. 

Thanks,
--
Michael




Re: Summary Sort workers Stats in EXPLAIN ANALYZE

2022-10-12 Thread Michael Paquier
On Tue, Sep 06, 2022 at 11:37:32AM +0500, Ibrar Ahmed wrote:
> The patch failed different regression tests on all platforms. Please
> correct that and send an updated patch.
> 
> [06:40:02.370] Test Summary Report
> [06:40:02.370] ---
> [06:40:02.370] t/002_pg_upgrade.pl (Wstat: 256 Tests: 13 Failed: 1)
> [06:40:02.370] Failed test: 4
> [06:40:02.370] Non-zero exit status: 1
> [06:40:02.370] Files=2, Tests=21, 45 wallclock secs ( 0.02 usr 0.00 sys +
> 3.52 cusr 2.06 csys = 5.60 CPU)

This has been marked as RwF based on the lack of an update.
--
Michael




Re: create subscription - improved warning message

2022-10-12 Thread Alvaro Herrera
On 2022-Oct-12, Amit Kapila wrote:

> I think it is a good idea to expand the docs for this but note that
> there are multiple places that use a similar description. For example,
> see the description slot_name option: "When slot_name is set to NONE,
> there will be no replication slot associated with the subscription.
> This can be used if the replication slot will be created later
> manually. Such subscriptions must also have both enabled and
> create_slot set to false.".  Then, we have a few places in the logical
> replication docs [1] that talk about creating the slot manually but
> didn't explain in detail the name or options to use. We might want to
> write a slightly bigger doc patch so that we can write the description
> in one place and give reference to the same at other places.

+1

-- 
Álvaro Herrera         Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Para tener más hay que desear menos"




Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc?

2022-10-12 Thread Michael Paquier
On Tue, Sep 06, 2022 at 11:32:18AM +0500, Ibrar Ahmed wrote:
> Hunk #5 succeeded at 147 with fuzz 2 (offset -3 lines).
> Hunk #6 FAILED at 170.
> Hunk #7 succeeded at 165 (offset -69 lines).
> 2 out of 7 hunks FAILED -- saving rejects to file
> src/include/portability/instr_time.h.rej
> patching file src/tools/msvc/Mkvcbuild.pm

No rebased version has been sent since this update, so this patch has
been marked as RwF.
--
Michael




Fix wrong syntax about CREATE_REPLICATION_SLOT

2022-10-12 Thread tachikake.ay...@fujitsu.com
Hi hackers,

A minor bug was found in the "CREATE_REPLICATION_SLOT" syntax.
It is in protocol.sgml at line 1990.

The current syntax written in it is as follows,

CREATE_REPLICATION_SLOT slot_name [ TEMPORARY ] { PHYSICAL | LOGICAL } [ ( 
option [, ...] ) ]

However, when I executed the following command, it resulted in a syntax error.

CREATE_REPLICATION_SLOT tachi LOGICAL;
ERROR:  syntax error

To use LOGICAL, output_plugin is required.
The correct syntax is as follows.

CREATE_REPLICATION_SLOT slot_name [ TEMPORARY ] { PHYSICAL | LOGICAL 
output_plugin } [ ( option [, ...] ) ]
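
For example, with the built-in pgoutput plugin (shown only to illustrate the
corrected form; any output plugin name goes in that position):

CREATE_REPLICATION_SLOT tachi LOGICAL pgoutput;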

PSA patch to fix it.

Note that version 15 must also be fixed.

Best Regards,
Ayaki Tachikake
FUJITSU LIMITED



doc.patch
Description: doc.patch


Re: create subscription - improved warning message

2022-10-12 Thread Amit Kapila
On Wed, Oct 12, 2022 at 12:34 PM Alvaro Herrera  wrote:
>
> On 2022-Oct-11, Peter Smith wrote:
>
> > > BTW, do we want to slightly adjust the documentation for the
> > > connect option on CREATE SUBSCRIPTION page [1]? It says: "Since no
> > > connection is made when this option is false, no tables are
> > > subscribed, and so after you enable the subscription nothing will be
> > > replicated. You will need to then run ALTER SUBSCRIPTION ... REFRESH
> > > PUBLICATION for tables to be subscribed."
> > >
> > > It doesn't say anything about manually creating the slot and probably
> > > the wording can be made similar.
> >
> > PSA patch v3-0002 which changes CREATE SUBSCRIPTION pgdocs to use the
> > same wording as in the HINT message.
>
> I think we want the documentation to explain in much more detail what is
> meant.  Users are going to need some help in determining which commands
> to run for each of the steps mentioned in the hint, so I don't think we
> want the docs to say the same thing as the hint.  How does the user know
> the name of the slot, what options to use, and what commands to run
> afterwards?
>

I think it is a good idea to expand the docs for this but note that
there are multiple places that use a similar description. For example,
see the description slot_name option: "When slot_name is set to NONE,
there will be no replication slot associated with the subscription.
This can be used if the replication slot will be created later
manually. Such subscriptions must also have both enabled and
create_slot set to false.".  Then, we have a few places in the logical
replication docs [1] that talk about creating the slot manually but
didn't explain in detail the name or options to use. We might want to
write a slightly bigger doc patch so that we can write the description
in one place and give reference to the same at other places.

[1] - 
https://www.postgresql.org/docs/devel/logical-replication-subscription.html

-- 
With Regards,
Amit Kapila.




Re: Tracking last scan time

2022-10-12 Thread Michael Paquier
On Wed, Oct 12, 2022 at 09:09:46AM +0100, Dave Page wrote:
> On Wed, 12 Oct 2022 at 07:40, Michael Paquier  wrote:
>> Okay, I have just moved the patch to the next CF then, still marked as
>> ready for committer.  Are you planning to look at that?
>
> Thanks. Was the question directed at me or Andres?

Apologies for the confusion.  This question is addressed to Andres.
--
Michael




Re: Tracking last scan time

2022-10-12 Thread Dave Page
On Wed, 12 Oct 2022 at 07:40, Michael Paquier  wrote:

> On Mon, Oct 03, 2022 at 12:55:40PM +0100, Dave Page wrote:
> > Thanks. It's just the changes in xact.c, so it doesn't seem like it would
> > cause you any more work either way, in which case, I'll leave it to you
> :-)
>
> Okay, I have just moved the patch to the next CF then, still marked as
> ready for committer.  Are you planning to look at that?
>

Thanks. Was the question directed at me or Andres?

-- 
Dave Page
Blog: https://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: https://www.enterprisedb.com


Re: future of serial and identity columns

2022-10-12 Thread Alvaro Herrera
On 2022-Oct-11, Peter Eisentraut wrote:

> diff --git a/src/test/modules/test_ddl_deparse/expected/alter_table.out 
> b/src/test/modules/test_ddl_deparse/expected/alter_table.out
> index 87a1ab7aabce..30e3dbb8d08a 100644
> --- a/src/test/modules/test_ddl_deparse/expected/alter_table.out
> +++ b/src/test/modules/test_ddl_deparse/expected/alter_table.out
> @@ -25,12 +25,9 @@ NOTICE:  DDL test: type simple, tag CREATE TABLE
>  CREATE TABLE grandchild () INHERITS (child);
>  NOTICE:  DDL test: type simple, tag CREATE TABLE
>  ALTER TABLE parent ADD COLUMN b serial;
> -NOTICE:  DDL test: type simple, tag CREATE SEQUENCE
> -NOTICE:  DDL test: type alter table, tag ALTER TABLE
> -NOTICE:subcommand: type ADD COLUMN (and recurse) desc column b of table 
> parent
> -NOTICE:  DDL test: type simple, tag ALTER SEQUENCE
> +ERROR:  cannot recursively add identity column to table that has child tables

I think this change merits some discussion.  Surely we cannot simply
disallow SERIAL from being used with inheritance.  Do we need to have
a way for identity columns to be used by child tables?

(My first thought was "let's keep SERIAL as the old code when used for
inheritance", but then I realized that the parent table starts as a
normal-looking table that only later acquires inheritors, so we wouldn't
know ahead of time that we need to treat that SERIAL column in a special
way.)

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"La vida es para el que se aventura"




Re: Use extended statistics to estimate (Var op Var) clauses

2022-10-12 Thread Michael Paquier
On Fri, Jul 22, 2022 at 02:17:47PM +0100, Dean Rasheed wrote:
> It's a worry that none of the existing regression tests picked up on
> that. Perhaps a similar test could be added using the existing test
> data. Otherwise, I think it'd be worth adding a new test with similar
> data to the above.

This feedback has not been answered for two months, so marked this
entry as RwF for now.
--
Michael




Re: Fix obsolete reference to ExecCreatePartitionPruneState

2022-10-12 Thread Alvaro Herrera
On 2022-Oct-12, Amit Langote wrote:

> Robert pointed out in [1] that ExecCreatePartitionPruneState() that
> was renamed to CreatePartitionPruneState() in 297daa9d4353 is still
> referenced in src/test/modules/delay_execution/specs/partition-addition.spec.
> 
> Please find attached a patch to fix that.

Thanks, pushed.

-- 
Álvaro Herrera         Breisgau, Deutschland  —  https://www.EnterpriseDB.com/




Re: create subscription - improved warning message

2022-10-12 Thread Amit Kapila
On Tue, Oct 11, 2022 at 4:27 PM Alvaro Herrera  wrote:
>
> On 2022-Oct-10, Peter Smith wrote:
>
> > On Mon, Oct 10, 2022 at 4:04 PM Amit Kapila  wrote:
>
> > > I think the below gives accurate information:
> > > WARNING: subscription was created, but is not connected
> > > HINT: You should create the slot manually, enable the subscription,
> > > and run %s to initiate replication.
>
> I guess this is reasonable, but how do I know what slot name do I have
> to create?  Maybe it'd be better to be explicit about that:
>
> HINT: You should create slot \"%s\" manually, enable the subscription, and 
> run %s to initiate replication.
>

I am not so sure about including a slot name because users can create
a slot with a name of their choice and set it via Alter Subscription.

-- 
With Regards,
Amit Kapila.




Re: Implementing Incremental View Maintenance

2022-10-12 Thread Michael Paquier
On Fri, Sep 09, 2022 at 08:10:32PM +0900, Yugo NAGATA wrote:
> I am working on above issues (#1-#4) now, and I'll respond on each later.

Okay, well.  There has been some feedback sent lately and no update
for one month, so I am marking it as RwF for now.  As a whole the
patch has been around for three years and it does not seem that a lot
has happened in terms of design discussion (now the thread is long so
I could easily have missed something).
--
Michael




Re: BufferAlloc: don't take two simultaneous locks

2022-10-12 Thread Michael Paquier
On Wed, Sep 07, 2022 at 12:53:07PM +0500, Ibrar Ahmed wrote:
> Hunk #1 FAILED at 231.
> Hunk #2 succeeded at 409 (offset 82 lines).
> 
> 1 out of 2 hunks FAILED -- saving rejects to file
> src/include/storage/buf_internals.h.rej

With no rebase done since this notice, I have marked this entry as
RwF.
--
Michael




Re: libpq error message refactoring

2022-10-12 Thread Peter Eisentraut

On 23.09.22 04:37, Tom Lane wrote:

Separately from that: is it really okay to delegate use of a va_list
argument like that?  The other call paths of
appendPQExpBufferVA[_internal] write va_start/va_end directly around it,
not somewhere else in the call chain.  I'm too tired to language-lawyer
out what happens when you do it like this, but I'm suspecting that it's
not well-defined portable behavior.

I think what you probably need to do is export appendPQExpBufferVA
as-is and require libpq_append_error to provide the error loop.


There was actually a live problem here, maybe not the exact one you had 
in mind:  When you actually need the "need more space" loop, you must do 
va_end() and va_start() before calling down again.  Otherwise, the next 
va_arg() gets garbage.
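
As an untested sketch of that pattern (not the exact committed code), the
retry loop has to restart the va_list each time around:

    void
    libpq_append_error(PQExpBuffer errorMessage, const char *fmt, ...)
    {
        for (;;)
        {
            va_list     args;
            bool        done;

            va_start(args, fmt);
            done = appendPQExpBufferVA(errorMessage, fmt, args);
            va_end(args);

            if (done)
                break;          /* done: success or hard failure, no retry */
        }
    }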


It so happens that the error message

"private key file \"%s\" has group or world access; file must have 
permissions u=rw (0600) or less if owned by the current user, or 
permissions u=rw,g=r (0640) or less if owned by root"


together with an in-tree test location for the file in question just 
barely exceeds INITIAL_EXPBUFFER_SIZE (256), and so my previous patch 
would fail the "ssl" test suite.  Good test coverage. :)


Anyway, I have updated my patch with your suggestion, which should fix 
these kinds of issues.






Re: Proposal: allow database-specific role memberships

2022-10-12 Thread Michael Paquier
On Wed, Sep 07, 2022 at 12:50:32PM +0500, Ibrar Ahmed wrote:
> The patch requires a rebase, please do that.
> 
> Hunk #5 succeeded at 454 (offset 28 lines). 1 out of 5 hunks FAILED
> -- saving rejects to file doc/src/sgml/ref/grant.sgml.rej

There have been no updates on this thread for one month, so this has
been switched as RwF.
--
Michael




Re: Table AM and DROP TABLE [ Was: Table AM and DDLs]

2022-10-12 Thread Michael Paquier
On Sun, Oct 02, 2022 at 09:53:01AM -0700, Andres Freund wrote:
> The CF entry for this patch doesn't currently apply and there has been a bunch
> of feedback on the approach. Mats, are you actually waiting for further
> feedback right now?

Okay, for now this has been marked as RwF.
--
Michael




Re: Add 64-bit XIDs into PostgreSQL 15

2022-10-12 Thread Michael Paquier
On Fri, Oct 07, 2022 at 02:04:09PM +0300, Maxim Orlov wrote:
> As always, reviews are very welcome!

This patch set needs a rebase, as far as I can see.
--
Michael




Re: Move backup-related code to xlogbackup.c/.h

2022-10-12 Thread Alvaro Herrera
On 2022-Oct-06, Bharath Rupireddy wrote:

> On Thu, Oct 6, 2022 at 4:50 AM Andres Freund  wrote:
> >
> > I'm doubtful it's a good idea to expose these to outside of xlog.c - they 
> > are
> > very low level, and it's very easy to break stuff by using them wrongly.
> 
> Hm. Here's the v3 patch set without exposing WAL insert lock related
> functions. Please have a look.

Hmm, I don't like your 0001 very much.  This sort of thing:

+/*
+ * Get the ControlFile.
+ */
+ControlFileData *
+GetControlFile(void)
+{
+   return ControlFile;
+}

looks too easy to misuse; what about locking?  Also, isn't the addition
of ControlFile as a variable in do_pg_backup_start going to cause shadow
variable warnings?  Given the locking requirements, I think it would be
feasible to copy stuff out of ControlFile under lock, then return the
copies.
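
Just to sketch what that might look like (invented name, untested; it would
have to live inside xlog.c, where ControlFile and ControlFileLock are visible):

    void
    GetControlFileCopy(ControlFileData *copy)
    {
        LWLockAcquire(ControlFileLock, LW_SHARED);
        memcpy(copy, ControlFile, sizeof(ControlFileData));
        LWLockRelease(ControlFileLock);
    }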


+/*
+ * Increment runningBackups and forcePageWrites.
+ *
+ * NOTE: This function is tailor-made for use in xlogbackup.c. It doesn't set
+ * the respective XLogCtl members directly, and acquires and releases locks.
+ * Hence be careful when using it elsewhere.
+ */
+void
+SetXLogBackupRelatedInfo(void)

I understand that naming is difficult, but I think "Set foo Related
Info" seems way too vague.  And the comment says "it doesn't set stuff
directly", and then it goes and sets stuff directly.  What gives?

You added some commentary that these functions are tailor-made for
internal operations, and then declared them in the most public header
function that xlog has?  I think at the bare minimum, these prototypes
should be in xlog_internal.h, not xlog.h.


I didn't look at 0002 and 0003 other than to notice that xlogbackup.h is
no longer removed from xlog.h.  So what is the point of all this?

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/




Re: test_decoding assertion failure for the loss of top-sub transaction relationship

2022-10-12 Thread Masahiko Sawada
On Wed, Oct 12, 2022 at 3:35 PM kuroda.hay...@fujitsu.com
 wrote:
>
> Dear Sawada-san,
>
> > FYI, as I just replied to the related thread[1], the assertion failure
> > in REL14 and REL15 can be fixed by the patch proposed there. So I'd
> > like to see how the discussion goes. Regardless of this proposed fix,
> > the patch proposed by Kuroda-san is required for HEAD, REL14, and
> > REL15, in order to fix the assertion failure in SnapBuildCommitTxn().
>
> I understood that my patches for REL14 and REL15 might be not needed.

No, sorry for confusing you. I meant that even if we agreed with the
patch I proposed there, your patch is still required to fix the issue.

Regards,

-- 
Masahiko Sawada
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Re[2]: Possible solution for masking chosen columns when using pg_dump

2022-10-12 Thread Виктория Шепард
Hi,

Here is an idea of how to read masking options from a file. Please take a
look.

Mon, 10 Oct 2022 at 14:54, Олег Целебровский :

> Hi,
>
> I applied most of suggestions: used separate files for most of added code,
> fixed typos/mistakes, got rid of that pesky TODO that was already
> implemented, just not deleted.
>
> Added tests (and functionality) for cases when you need to mask columns in
> tables with the same name in different schemas. If schema is not specified,
> then columns in all tables with specified name are masked (Example -
> pg_dump -t ‘t0’ --mask-columns id --mask-function mask_int will mask all
> ids in all tables with names ‘t0’ in all existing schemas).
>
> Wrote comments for all ‘magic numbers’
>
> About that
>
> >- Also it can be hard to use a lot of different functions for different
> fields, maybe it would be better to set up functions in a file.
>
> I agree with that, but I know about at least 2 other patches (both are
> WIP, but still) that are interacting with reading command-line options from
> a file. And if everyone writes their own version of reading command-line
> options from a file, it will quickly get confusing.
>
> A solution to that problem is another patch that will put all options from
> a file (one file for any possible options, from existing to future ones) into
> **argv in main, so that pg_dump can process them as if they came from
> the command line.
>
>
> Friday, 7 October 2022, 8:01 +07:00 from Виктория Шепард <
> we.vikt...@gmail.com>:
>
> Hi,
> I took a look, here are several suggestions for improvement:
>
> - Masking is not a core piece of pg_dump functionality, so it is better to write
> most of the related code in a separate file like parallel.c or
> dumputils.c. This will help slow down the growth of an already huge pg_dump
> file.
>
> - Also it can be hard to use a lot of different functions for different
> fields, maybe it would be better to set up functions in a file.
>
> - How will it work for the same field and tables in the different schemas?
> Can we set up the exact schema for the field?
>
> - misspelling in a word
> >/*
> >* Add all columns and funcions to list of MaskColumnInfo structures,
> >*/
>
> - Why did you use 256 here?
> > char* table = (char*) pg_malloc(256 * sizeof(char));
> Also, the malloc needs one more byte, because you have to store the
> '\0' terminator.
>
> - Instead of addFuncToDatabase you can run your query using something
> already defined in fe_utils/query_utils.c. And it will be better to set
> up a connection only once and create all functions. Establishing a
> connection is a resource-intensive procedure. There are a lot of magic
> numbers, better to leave some comments explaining why there are 64 or 512.
>
> - It seems that you are not using temp_string
> > char   *temp_string = (char*)malloc(256 * sizeof(char));
>
> - Grammar issues
> >/*
> >* mask_column_info_list contains info about every to-be-masked column:
> >* its name, a name its table (if nothing is specified - mask all columns
> with this name),
> >* name of masking function and name of schema containing this function
> (public if not specified)
> >*/
> the name of its table
>
>
> Mon, 3 Oct 2022 at 20:45, Julien Rouhaud  >:
>
> Hi,
>
> On Mon, Oct 03, 2022 at 06:30:17PM +0300, Олег Целебровский wrote:
> >
> > Hello, here's my take on masking data when using pg_dump
> >
> > The main idea is using PostgreSQL functions to replace data during a
> SELECT.
> > When table data is dumped SELECT a,b,c,d ... from ... query is
> generated, the columns that are marked for masking are replaced with result
> of functions on those columns
> > Example: columns name, count are to be masked, so the query will look as
> such: SELECT id, mask_text(name), mask_int(count), date from ...
> >
> > So about the interface: I added 2 more command-line options:
> >
> > --mask-columns, which specifies what columns from what tables will be
> masked
> > usage example:
> > --mask-columns "t1.name, t2.description" - both columns
> will be masked with the same corresponding function
> > or --mask-columns name - ALL columns with name "name" from
> all dumped tables will be masked with correspoding function
> >
> > --mask-function, which specifies what functions will mask data
> > usage example:
> > --mask-function mask_int - corresponding columns will be
> masked with function named "mask_int" from default schema (public)
> > or --mask-function my_schema.mask_varchar - same as above
> but with specified schema where the function is stored
> > or --mask-function somedir/filename - the function is
> "defined" here - more on the structure below
>
> FTR I wrote an extension POC [1] last weekend that does that but on the
> backend
> side.  The main advantage is that it's working with any existing versions
> of
> pg_dump (or any client relying on COPY or even plain interactive SQL
> statements), and that the DBA can force a 

Re: Add index scan progress to pg_stat_progress_vacuum

2022-10-12 Thread Masahiko Sawada
On Tue, Oct 11, 2022 at 10:50 PM Imseih (AWS), Sami  wrote:
>
> >One idea would be to add a flag, say report_parallel_vacuum_progress,
> >to IndexVacuumInfo struct and expect index AM to check and update the
> >parallel index vacuum progress, say every 1GB blocks processed. The
> >flag is true only when the leader process is vacuuming an index.
>
> >Regards,
>
> Sorry for the long delay on this. I have taken the approach as suggested
> by Sawada-san and Robert and attached is v12.

Thank you for updating the patch!

>
> 1. The patch introduces a new counter in the same shared memory already
> used by the parallel leader and workers to keep track of the number
> of indexes completed. This way there is no reason to loop through
> the index status every time we want to get the status of indexes completed.

While it seems to be a good idea to have the atomic counter for the
number of indexes completed, I think we should not use the global
variable referencing the counter as follow:

+static pg_atomic_uint32 *index_vacuum_completed = NULL;
:
+void
+parallel_vacuum_progress_report(void)
+{
+   if (IsParallelWorker() || !report_parallel_vacuum_progress)
+   return;
+
+   pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED,
+pg_atomic_read_u32(index_vacuum_completed));
+}

I think we can pass ParallelVacuumState (or PVIndStats) to the
reporting function so that it can check the counter and report the
progress.
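
A rough sketch of that, with invented member names purely for illustration:

    static void
    parallel_vacuum_progress_report(ParallelVacuumState *pvs)
    {
        if (IsParallelWorker() || !pvs->report_progress)
            return;

        pgstat_progress_update_param(PROGRESS_VACUUM_INDEX_COMPLETED,
                                     pg_atomic_read_u32(&pvs->shared->nindexes_completed));
    }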

> 2. A new function in vacuumparallel.c will be used to update
> the progress of indexes completed by reading from the
> counter created in point #1.
>
> 3. The function is called during the vacuum_delay_point as a
> matter of convenience, since this is called in all major vacuum
> loops. The function will only do something if the caller
> sets a boolean to report progress. Doing so will also ensure
> progress is being reported in case the parallel workers completed
> before the leader.

Robert pointed out:

---
I am not too sure that the idea of using the vacuum delay points is the best
plan. I think we should try to avoid piggybacking on such general
infrastructure if we can, and instead look for a way to tie this to
something that is specific to parallel vacuum.
---

I agree with this part.

Instead, I think we can add a boolean and a ParallelVacuumState pointer
to IndexVacuumInfo. If the boolean is true, an
index AM can call the reporting function with the pointer to
ParallelVacuumState while scanning index blocks, for example, for
every 1GB. The boolean can be true only for the leader process.
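
For example (field names invented, just to illustrate how an index AM might
use it from its block-scanning loop, in the leader only):

    if (info->report_parallel_progress &&
        (blocks_scanned % (1024 * 1024 * 1024 / BLCKSZ)) == 0)
        parallel_vacuum_progress_report(info->pvs);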

>
> 4. Rather than adding any complexity to WaitForParallelWorkersToFinish
> and introducing a new callback, vacuumparallel.c will wait until
> the number of vacuum workers is 0 and then call
> WaitForParallelWorkersToFinish as it does currently.

Agreed, but with the following change, the leader process waits in a
busy loop, which should not be acceptable:

+   if (VacuumActiveNWorkers)
+   {
+   while (pg_atomic_read_u32(VacuumActiveNWorkers) > 0)
+   {
+   parallel_vacuum_progress_report();
+   }
+   }
+

Also, I think it's better to check whether idx_completed_progress
equals the number of indexes instead.
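
One way to do both (avoid spinning and wait on the completed-indexes counter)
could be a latch wait, sketched here with invented member names:

    while (pg_atomic_read_u32(&pvs->shared->idx_completed_progress) < nindexes)
    {
        parallel_vacuum_progress_report(pvs);
        (void) WaitLatch(MyLatch, WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
                         100L, WAIT_EVENT_PARALLEL_FINISH);
        ResetLatch(MyLatch);
    }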

> 5. Went back to the idea of adding a new view called 
> pg_stat_progress_vacuum_index
> which is accomplished by adding a new type called VACUUM_PARALLEL in 
> progress.h

Probably, we can divide the patch into two patches: one for adding the
new statistic of the number of vacuumed indexes to
pg_stat_progress_vacuum, and another one for adding the new statistics view
pg_stat_progress_vacuum_index.

Regards,

-- 
Masahiko Sawada
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Fix gin index cost estimation

2022-10-12 Thread Ronan Dunklau
> > You're right, I was too eager to try to raise the CPU cost proportionally
> > to the number of array scans (scalararrayop). I'd really like to understand
> > where this equation comes from though...
> 
> So, what's the latest update here?

Thanks Michael for reviving this thread.

Before proceeding any further with this, I'd like to understand where we 
stand. Tom argued that my way of charging cost per entry page visited boils down
to charging per tuple, which I expressed disagreement with.

If we can come to a consensus on whether that's a bogus way of thinking about it,
I'll proceed with what we agree on.

-- 
Ronan Dunklau






Re: future of serial and identity columns

2022-10-12 Thread Peter Eisentraut

On 12.10.22 08:22, Corey Huinker wrote:
Question: the xref refers the reader to sql-createtable, which is a
pretty big page and could leave the reader lost. Would it make sense
to create a SQL-CREATETABLE-IDENTITY anchor and link to that instead?


Yes, I think that would be good.




Fix obsolete reference to ExecCreatePartitionPruneState

2022-10-12 Thread Amit Langote
Robert pointed out in [1] that ExecCreatePartitionPruneState() that
was renamed to CreatePartitionPruneState() in 297daa9d4353 is still
referenced in src/test/modules/delay_execution/specs/partition-addition.spec.

Please find attached a patch to fix that.

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com

[1] 
https://www.postgresql.org/message-id/ca+tgmoa7wnsmnan7n2826tkmauzm79jtjdmtpmmayjqh0hz...@mail.gmail.com


ExecCreatePartitionPruneState-no-longer-exists.patch
Description: Binary data


Re: create subscription - improved warning message

2022-10-12 Thread Alvaro Herrera
On 2022-Oct-11, Peter Smith wrote:

> On Tue, Oct 11, 2022 at 2:46 PM Amit Kapila  wrote:
> >
> > On Mon, Oct 10, 2022 at 8:14 PM Tom Lane  wrote:

> > > It feels a little strange to me that we describe two steps rather
> > > vaguely and then give exact SQL for the third step.  How about,
> > > say,
> > >
> > > HINT: To initiate replication, you must manually create the
> > > replication slot, enable the subscription, and refresh the
> > > subscription.
> >
> > LGTM.
> 
> PSA patch v3-0001 where the message/hint is worded as suggested above

LGTM.

> > BTW, do we want to slightly adjust the documentation for the
> > connect option on CREATE SUBSCRIPTION page [1]? It says: "Since no
> > connection is made when this option is false, no tables are
> > subscribed, and so after you enable the subscription nothing will be
> > replicated. You will need to then run ALTER SUBSCRIPTION ... REFRESH
> > PUBLICATION for tables to be subscribed."
> >
> > It doesn't say anything about manually creating the slot and probably
> > the wording can be made similar.
> 
> PSA patch v3-0002 which changes CREATE SUBSCRIPTION pgdocs to use the
> same wording as in the HINT message.

I think we want the documentation to explain in much more detail what is
meant.  Users are going to need some help in determining which commands
to run for each of the steps mentioned in the hint, so I don't think we
want the docs to say the same thing as the hint.  How does the user know
the name of the slot, what options to use, and what commands to run
afterwards?  So I think we should aim to *expand* that text, not reduce
it.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/



