RE: speed up a logical replica setup

2024-01-25 Thread Hayato Kuroda (Fujitsu)
Dear Euler,

Again, thanks for updating the patch! Here are my random comments for v9.

01.
I cannot find your replies to my comment #7 [1], but you reverted the related
changes.
I'm not sure whether you are still considering it or decided not to include the changes.
Can you clarify your opinion?
(This matters because the changes are large, so they affect other development
quite a bit...)

02.
```
+   -t seconds
+   --timeout=seconds
```

But the source code requires `--recovery-timeout`. Please update one of them.

03.
```
+ *Create a new logical replica from a standby server
```

Junwang pointed out that this should be changed, but the change was reverted [2].
Can you clarify your opinion here as well?

04.
```
+/*
+ * Is the source server ready for logical replication? If so, create the
+ * publications and replication slots in preparation for logical replication.
+ */
+static bool
+setup_publisher(LogicalRepInfo *dbinfo)
```

But this function also verifies the source server. I feel the verification should
be done in a separate function.

05.
```
+/*
+ * Is the target server ready for logical replication?
+ */
+static bool
+setup_subscriber(LogicalRepInfo *dbinfo)
```

Actually, this function does not set up the subscriber. It just verifies whether the
target can become a subscriber, right? It should be renamed.

06.
```
+atexit(cleanup_objects_atexit);
```

The registration of the cleanup function is too early. This sometimes triggers
a core-dump. E.g., 

```
$ pg_subscriber --publisher-conninfo --subscriber-conninfo 'user=postgres 
port=5432' --verbose --database 'postgres' --pgdata data_N2/
pg_subscriber: error: too many command-line arguments (first is "user=postgres 
port=5432")
pg_subscriber: hint: Try "pg_subscriber --help" for more information.
Segmentation fault (core dumped)

$ gdb ...
(gdb) bt
#0  cleanup_objects_atexit () at pg_subscriber.c:131
#1  0x7fb982cffce9 in __run_exit_handlers () from /lib64/libc.so.6
#2  0x7fb982cffd37 in exit () from /lib64/libc.so.6
#3  0x004054e6 in main (argc=9, argv=0x7ffc59074158) at 
pg_subscriber.c:1500
(gdb) f 3
#3  0x004054e6 in main (argc=9, argv=0x7ffc59074158) at 
pg_subscriber.c:1500
1500exit(1);
(gdb) list
1495if (optind < argc)
1496{
1497pg_log_error("too many command-line arguments (first is 
\"%s\")",
1498 argv[optind]);
1499pg_log_error_hint("Try \"%s --help\" for more 
information.", progname);
1500exit(1);
1501}
1502 
1503/*
1504 * Required arguments
```

I still think it should be done just before the creation of objects [3].
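
For concreteness, here is a standalone toy sketch of the ordering I have in mind
(this is not the patch's code; the names are only illustrative):

```
#include <stdio.h>
#include <stdlib.h>

static int objects_created = 0;

static void
cleanup_objects_atexit(void)
{
	/* only inspects state that is guaranteed to be initialized */
	if (objects_created)
		fprintf(stderr, "dropping objects created so far\n");
}

int
main(int argc, char **argv)
{
	/* option parsing and validation: a bare exit(1) must be safe here */
	if (argc < 2)
	{
		fprintf(stderr, "missing command-line arguments\n");
		exit(1);
	}

	/*
	 * Register the cleanup callback only after validation, just before we
	 * start creating objects (publications, replication slots) that may
	 * have to be removed on failure.
	 */
	atexit(cleanup_objects_atexit);

	objects_created = 1;		/* e.g. create publications and slots here */
	return 0;
}
```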

07.
The removal of publications on the standby is missing.

08.
The registration of LogicalRepInfo in typedefs.list is missing.

09.
```
+ 
+  
+   pg_subscriber
+   option
+  
+ 
```

Can you reply to my comment #2 [4]? I think the mandatory options should be listed there.

10.
Just to confirm - will you implement the start_standby/stop_standby functions in
the next version?

[1]: 
https://www.postgresql.org/message-id/TY3PR01MB9889C362FF76102C88FA1C29F56F2%40TY3PR01MB9889.jpnprd01.prod.outlook.com
[2]: 
https://www.postgresql.org/message-id/CAEG8a3%2BwL_2R8n12BmRz7yBP3EBNdHDhmdgxQFA9vS%2BzPR%2B3Kw%40mail.gmail.com
[3]: 
https://www.postgresql.org/message-id/TY3PR01MB9889678E47B918F4D83A6FD8F57B2%40TY3PR01MB9889.jpnprd01.prod.outlook.com
[4]: 
https://www.postgresql.org/message-id/TY3PR01MB9889C362FF76102C88FA1C29F56F2%40TY3PR01MB9889.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/global/ 





Re: A performance issue with Memoize

2024-01-25 Thread David Rowley
On Fri, 26 Jan 2024 at 19:03, Richard Guo  wrote:
> At first I wondered if we should assume that the same param expr must
> have the same equality operator. If not, we should also check the
> operator to tell if the cache key is a duplicate, like
>
> -   if (!list_member(*param_exprs, expr))
> +   if (!list_member(*param_exprs, expr) ||
> +   !list_member_oid(*operators, hasheqoperator))

hmm, if that were the case you wouldn't do it that way. You'd need to
forboth() and look for an item in both lists matching the search.
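
Roughly this shape, i.e. walking the two lists in lockstep and testing the pair
(the helper name is made up, just to sketch it):

static bool
cache_key_exists(List *param_exprs, List *operators,
				 Expr *expr, Oid hasheqoperator)
{
	ListCell   *lc1;
	ListCell   *lc2;

	forboth(lc1, param_exprs, lc2, operators)
	{
		if (equal(lfirst(lc1), expr) &&
			lfirst_oid(lc2) == hasheqoperator)
			return true;
	}

	return false;
}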

> But after looking at how rinfo->left_hasheqoperator/right_hasheqoperator
> is set, it seems we can assume that: the operator is from the type cache
> entry which is fetched according to the expr datatype.

Yip.

> So I think the patch makes sense.  +1.

Thanks for reviewing. I've pushed the patch.

David




Apply the "LIMIT 1" optimization to partial DISTINCT

2024-01-25 Thread Richard Guo
In 5543677ec9 we introduced an optimization that uses Limit instead of
Unique to implement DISTINCT when all the DISTINCT pathkeys have been
marked as redundant.  I happened to notice that this optimization was
not applied to partial DISTINCT, which I think should be.  This can
improve plans in some cases, such as

-- on master
explain (costs off) select distinct four from tenk1 where four = 4;
  QUERY PLAN
--
 Limit
   ->  Gather
 Workers Planned: 4
 ->  Unique
   ->  Parallel Seq Scan on tenk1
 Filter: (four = 4)
(6 rows)

-- patched
explain (costs off) select distinct four from tenk1 where four = 4;
  QUERY PLAN
--
 Limit
   ->  Gather
 Workers Planned: 4
 ->  Limit
   ->  Parallel Seq Scan on tenk1
 Filter: (four = 4)
(6 rows)

Such queries might not be that common, but it's very cheap to apply this
optimization.

Attached is a patch for that.

Thanks
Richard


v1-0001-Apply-the-LIMIT-1-optimization-to-partial-DISTINCT.patch
Description: Binary data


Re: Remove unused fields in ReorderBufferTupleBuf

2024-01-25 Thread Masahiko Sawada
On Thu, Jan 25, 2024 at 10:17 PM Aleksander Alekseev
 wrote:
>
> Hi,
>
> > > Here is the corrected patch.
> >
> > Thank you for updating the patch! I have some comments:
>
> Thanks for the review.
>
> > -tuple = (ReorderBufferTupleBuf *)
> > +tuple = (HeapTuple)
> >  MemoryContextAlloc(rb->tup_context,
> > -
> > sizeof(ReorderBufferTupleBuf) +
> > +   HEAPTUPLESIZE +
> > MAXIMUM_ALIGNOF +
> > alloc_len);
> > -tuple->alloc_tuple_size = alloc_len;
> > -tuple->tuple.t_data = ReorderBufferTupleBufData(tuple);
> > +tuple->t_data = (HeapTupleHeader)((char *)tuple + HEAPTUPLESIZE);
> >
> > Why do we need to add MAXIMUM_ALIGNOF? since HEAPTUPLESIZE is the
> > MAXALIGN'd size, it seems we don't need it. heap_form_tuple() does a
> > similar thing but it doesn't add it.
>
> Indeed. I gave it a try and nothing crashed, so it appears that
> MAXIMUM_ALIGNOF is not needed.
>
> > ---
> >  xl_parameter_change *xlrec =
> > -(xl_parameter_change *)
> > XLogRecGetData(buf->record);
> > +(xl_parameter_change *)
> > XLogRecGetData(buf->record);
> >
> > Unnecessary change.
>
> That's pgindent. Fixed.
>
> > ---
> > -/*
> > - * Free a ReorderBufferTupleBuf.
> > - */
> > -void
> > -ReorderBufferReturnTupleBuf(ReorderBuffer *rb, ReorderBufferTupleBuf 
> > *tuple)
> > -{
> > -pfree(tuple);
> > -}
> > -
> >
> > Why does ReorderBufferReturnTupleBuf need to be moved from
> > reorderbuffer.c to reorderbuffer.h? It seems not related to this
> > refactoring patch so I think we should do it in a separate patch if we
> > really want it. I'm not sure it's necessary, though.
>
> OK, fixed.

Thank you for updating the patch. It looks good to me. I'm going to
push it next week, barring any objections.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Small fix on COPY ON_ERROR document

2024-01-25 Thread David G. Johnston
On Thursday, January 25, 2024, Yugo NAGATA  wrote:

>
> Maybe we can split the sentence in two, for example:
>
>   COPY stops operation at the first error. (The exception is if the error
>   is due to data type incompatibility and a value other than stop is
> specified
>   to the ON_ERROR option.)
>

I’d lean more toward saying:

COPY FROM can be instructed to ignore errors that arise from casting input
data to the data types of the target columns by setting the on_error option
to ignore, skipping the entire input row in the process.

The last part is because another way to ignore would be to set null values
for those columns.

That a command stops on an error is the assumed behavior throughout the
system, so writing "copy stops operation at the first error." just seems
really unnecessary.

I will need to make this tweak and probably a couple others to my own
suggestions in 12 hours or so.

David J.


Delay Memoize hashtable build until executor run

2024-01-25 Thread David Rowley
Currently, nodeMemoize.c builds the hashtable for the cache during
executor startup.  This is not what is done in hash joins. I think we
should make the two behave the same way.

Per [1] and the corresponding discussion leading to that, making a
possibly large allocation at executor startup can lead to excessively
long EXPLAIN (not EXPLAIN ANALYZE) times.  This can confuse users as
we don't mention in EXPLAIN where the time is being spent.

Although there's not yet any conclusion that Memoize is to blame,
there's another report from someone confused about where this time is
being spent in [2].

Working on the Memoize code, I originally created the hash table
during executor startup to save on having to check we have a table
each time the node is executed.  However, the branch for this should
be quite predictable and I doubt it'll add any overhead that we would
notice.

The patch to do this is attached.

David

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=1e731ed12aa
[2] https://postgr.es/m/61e642df-5f48-4e4e-b4c3-58936f90d...@thefreecat.org
diff --git a/src/backend/executor/nodeMemoize.c 
b/src/backend/executor/nodeMemoize.c
index 0722e4..1075178340 100644
--- a/src/backend/executor/nodeMemoize.c
+++ b/src/backend/executor/nodeMemoize.c
@@ -278,11 +278,14 @@ MemoizeHash_equal(struct memoize_hash *tb, const 
MemoizeKey *key1,
 }
 
 /*
- * Initialize the hash table to empty.
+ * Initialize the hash table to empty.  The MemoizeState's hashtable field
+ * must point to NULL.
  */
 static void
 build_hash_table(MemoizeState *mstate, uint32 size)
 {
+   Assert(mstate->hashtable == NULL);
+
/* Make a guess at a good size when we're not given a valid size. */
if (size == 0)
size = 1024;
@@ -400,9 +403,12 @@ remove_cache_entry(MemoizeState *mstate, MemoizeEntry 
*entry)
 static void
 cache_purge_all(MemoizeState *mstate)
 {
-   uint64  evictions = mstate->hashtable->members;
+   uint64 evictions = 0;
PlanState  *pstate = (PlanState *) mstate;
 
+   if (mstate->hashtable != NULL)
+   evictions = mstate->hashtable->members;
+
/*
 * Likely the most efficient way to remove all items is to just reset 
the
 * memory context for the cache and then rebuild a fresh hash table.  
This
@@ -410,8 +416,8 @@ cache_purge_all(MemoizeState *mstate)
 */
MemoryContextReset(mstate->tableContext);
 
-   /* Make the hash table the same size as the original size */
-   build_hash_table(mstate, ((Memoize *) pstate->plan)->est_entries);
+   /* NULLify so we recreate the table on the next call */
+   mstate->hashtable = NULL;
 
/* reset the LRU list */
	dlist_init(&mstate->lru_list);
@@ -707,6 +713,10 @@ ExecMemoize(PlanState *pstate)
 
Assert(node->entry == NULL);
 
+   /* first call? we'll need a hash table. */
+   if (unlikely(node->hashtable == NULL))
+   build_hash_table(node, ((Memoize *) 
pstate->plan)->est_entries);
+
/*
 * We're only ever in this state for the first 
call of the
 * scan.  Here we have a look to see if we've 
already seen the
@@ -1051,8 +1061,11 @@ ExecInitMemoize(Memoize *node, EState *estate, int 
eflags)
/* Zero the statistics counters */
	memset(&mstate->stats, 0, sizeof(MemoizeInstrumentation));
 
-   /* Allocate and set up the actual cache */
-   build_hash_table(mstate, node->est_entries);
+   /*
+* Because it may require a large allocation we delay building of the 
hash
+* table until executor run.
+*/
+   mstate->hashtable = NULL;
 
return mstate;
 }
@@ -1062,6 +1075,7 @@ ExecEndMemoize(MemoizeState *node)
 {
 #ifdef USE_ASSERT_CHECKING
/* Validate the memory accounting code is correct in assert builds. */
+   if (node->hashtable != NULL)
{
int count;
uint64  mem = 0;


Re: Small fix on COPY ON_ERROR document

2024-01-25 Thread Yugo NAGATA
On Fri, 26 Jan 2024 15:26:55 +0900
Masahiko Sawada  wrote:

> On Fri, Jan 26, 2024 at 2:40 PM Yugo NAGATA  wrote:
> >
> > On Fri, 26 Jan 2024 13:59:09 +0900
> > Masahiko Sawada  wrote:
> >
> > > On Fri, Jan 26, 2024 at 11:28 AM Yugo NAGATA  wrote:
> > > >
> > > > Hi,
> > > >
> > > > I found that the documentation of COPY ON_ERROR said
> > > > COPY stops operation at the first error when
> > > > "ON_ERROR is not specified.", but it also stop when
> > > > ON_ERROR is specified to the default value, "stop".
> > > >
> > > > I attached a very small patch to fix this just for
> > > > making the description more accurate.
> > >
> > > Thank you for the patch!
> > >
> > > +1 to fix it.
> > >
> > > -ON_ERROR is not specified. This
> > > -should not lead to problems in the event of a COPY
> > > +ON_ERROR is not specified or 
> > > stop.
> > > +This should not lead to problems in the event of a COPY
> > >
> > > How about the followings for consistency with the description of the
> > > ON_ERROR option?
> > >
> > > COPY stops operation at the first error if the stop value is specified
> > > to the ON_ERROR option. This should not lead to ...
> >
> > Thank you for your review. However, after posting the previous patch,
> > I noticed that I had overlooked that ON_ERROR works only for errors due to data
> > type incompatibility (which is referred to as "malformed data" in the
> > ON_ERROR description, though).
> 
> Right.
> 
> >
> > So, how about rewriting this like:
> >
> >  COPY stops operation at the first error unless the error is due to data
> >  type incompatibility and a value other than stop is specified to the
> >  ON_ERROR option.
> 
> Hmm, this sentence seems not very readable to me, especially the "COPY
> stops ... unless ... a value other than stop is specified ..." part. I
> think we can simplify it:

I can agree with your opinion on the readability, but...

> COPY stops operation at the first data type incompatibility error if
> the stop value is specified to the ON_ERROR option.

This statement doesn't explain that COPY also stops when an error other than
a data type incompatibility (e.g. a constraint violation) occurs.

Maybe we can split the sentence in two, for example:

  COPY stops operation at the first error. (The exception is if the error
  is due to data type incompatibility and a value other than stop is specified
  to the ON_ERROR option.)
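
To illustrate the two cases with a concrete example (the table and data are made up):

CREATE TABLE t (a int CHECK (a > 0));

-- 'abc' cannot be converted to int, so with ON_ERROR 'ignore' that row is
-- skipped and the row containing 2 is still copied.
COPY t FROM STDIN WITH (ON_ERROR 'ignore');
abc
2
\.

-- -1 converts fine but violates the CHECK constraint, which is not a data
-- type incompatibility, so the whole COPY still fails.
COPY t FROM STDIN WITH (ON_ERROR 'ignore');
-1
\.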

Regards,
Yugo Nagata

> Regards,
> 
> -- 
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com


-- 
Yugo NAGATA 




Re: Transaction timeout

2024-01-25 Thread Andrey M. Borodin


> On 26 Jan 2024, at 11:44, Andrey M. Borodin  wrote:
> 
> 
> 1. It’s unsafe for the isolation tester to await transaction_timeout within a
> query. Usually it gets
> FATAL: terminating connection due to transaction timeout
> But if the VM is a bit slow it can occasionally get
> PQconsumeInput failed: server closed the connection unexpectedly
> So, currently all tests use “passive waiting”, in a session that will not
> time out.


Oops, sorry, I’ve accidentally sent a version without this fix.
Here it is.


Best regards, Andrey Borodin.


v24-0001-Introduce-transaction_timeout.patch
Description: Binary data


Re: Transaction timeout

2024-01-25 Thread Andrey M. Borodin


> On 22 Jan 2024, at 11:23, Peter Smith  wrote:
> 
> Hi, This patch has a CF status of "Needs Review" [1], but it seems
> there was a CFbot test failure last time it was run [2]. Please have a
> look and post an updated version if necessary.
Thanks Peter!

I’ve inspected the CI failures and they were caused by two different problems:
1. It’s unsafe for the isolation tester to await transaction_timeout within a
query. Usually it gets
FATAL: terminating connection due to transaction timeout
But if the VM is a bit slow it can occasionally get
PQconsumeInput failed: server closed the connection unexpectedly
So, currently all tests use “passive waiting”, in a session that will not
time out.

2. In some cases pg_sleep(0.1) was sleeping for up to 200 ms. That made s7
and s8 fail, because they rely on this margin.
I’ve separated these tests into a different test, timeouts-long, and increased the
margin to 300 ms. The tests now take a horrible 2431 ms. Moreover, I’m afraid that
the buildfarm may have machines that are randomly much slower, so this test might
have to be excluded.
This test checks COMMIT AND CHAIN and the flow of small queries (Nik’s case).
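
For reference, the flow exercised there is roughly the following (just a sketch;
the values are illustrative and use the transaction_timeout GUC added by this patch):

SET transaction_timeout = '300ms';

BEGIN;
SELECT pg_sleep(0.05);   -- a stream of short statements inside one transaction
SELECT pg_sleep(0.05);
COMMIT AND CHAIN;        -- the timeout must be rescheduled for the chained transaction
SELECT pg_sleep(0.05);
COMMIT;

Each statement is well below the timeout, so only the accumulated transaction time
can trigger it, and COMMIT AND CHAIN has to restart the timer for the new transaction.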

Also, I’ve verified that every “enable_timeout_after(TRANSACTION_TIMEOUT)” and
“disable_timeout(TRANSACTION_TIMEOUT)” call is necessary, and found that the case of
aborting an “idle in transaction (aborted)” session is not covered by tests. I’m not
sure we need a test for this.
Japin, Junwang, what do you think?

Thanks!


Best regards, Andrey Borodin.



v23-0001-Introduce-transaction_timeout.patch
Description: Binary data


Re: Popcount optimization using AVX512

2024-01-25 Thread Alvaro Herrera
On 2024-Jan-25, Alvaro Herrera wrote:

> Finally, the matter of using ifunc as proposed by Noah seems to be still
> in the air, with no patches offered for the popcount family.

Oh, I just realized that the patch as currently proposed is placing the
optimized popcount code in the path that does not require going through
a function pointer.  So the performance increase is probably coming from
both avoiding jumping through the pointer as well as from the improved
instruction.

This suggests that finding a way to make the ifunc stuff work (with good
performance) is critical to this work.
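
For anyone following along, this is the sort of arrangement the ifunc proposal refers
to: the implementation is chosen once, when the dynamic linker binds the symbol,
instead of jumping through a function pointer on every call. A rough sketch, assuming
GCC on an ELF platform; the function names and the CPU check are made up:

#include <stdint.h>

static int
pg_popcount64_portable(uint64_t word)
{
	return __builtin_popcountll(word);
}

static int
pg_popcount64_avx512(uint64_t word)
{
	/* stand-in for the real AVX-512 implementation */
	return __builtin_popcountll(word);
}

static int
cpu_has_avx512_popcount(void)
{
	/* stand-in for a real CPUID-based check */
	return 0;
}

/* resolver: runs once, when the dynamic linker binds pg_popcount64 */
static int (*resolve_pg_popcount64(void)) (uint64_t)
{
	return cpu_has_avx512_popcount()
		? pg_popcount64_avx512
		: pg_popcount64_portable;
}

int pg_popcount64(uint64_t word)
	__attribute__((ifunc("resolve_pg_popcount64")));

int
main(void)
{
	return pg_popcount64(0xFF) == 8 ? 0 : 1;
}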

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"The ability of users to misuse tools is, of course, legendary" (David Steele)
https://postgr.es/m/11b38a96-6ded-4668-b772-40f992132...@pgmasters.net




Re: Small fix on COPY ON_ERROR document

2024-01-25 Thread David G. Johnston
On Thu, Jan 25, 2024 at 10:40 PM Yugo NAGATA  wrote:

> On Fri, 26 Jan 2024 13:59:09 +0900
> Masahiko Sawada  wrote:
>
> > On Fri, Jan 26, 2024 at 11:28 AM Yugo NAGATA 
> wrote:
> > >
> > > Hi,
> > >
> > > I found that the documentation of COPY ON_ERROR said
> > > COPY stops operation at the first error when
> > > "ON_ERROR is not specified.", but it also stop when
> > > ON_ERROR is specified to the default value, "stop".
> > >
>
>
I'm finding the current wording surrounding ON_ERROR to not fit in with the
rest of the page.  I've tried to improve things by being a bit more
invasive.

This first hunk updates the description of the COPY command to include a
description of the purpose of the ON_ERROR clause.

I too am concerned about the word "parsed" here, and "malformed" in the
more detailed descriptions; this would need to be modified to better
reflect the circumstances under which ignore happens.

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 21a5c4a052..5fe4c9f747 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -90,6 +90,14 @@ COPY { table_name [ ( pg_stat_progress_copy view. See
  for details.
   
+
+  
+By default, COPY will abort if it encounters errors.
+For non-binary format COPY FROM the user can specify
+to instead ignore input rows that cannot be parsed by specifying
+the ignore option to the ON_ERROR
+clause.
+  
  

  

The use of the word "Currently" stood out to me when reading the next
hunk.  We always document current reality and don't call out that fact.  We
do not imply some future feature may change how this all works.  On a
related note, right now we have stop and ignore, which are basically enums
just like we have for format (csv, text, binary).  Unlike format, though,
this option requires single quotes.  Why is that?  I did keep with not
specifying the enum in the main syntax block since format doesn't as well;
though writing { stop | ignore } seemed quite appealing.

The rest of the page indicates what the default is in a sentence by itself,
not as a parenthetical next to the option.

The command limitations seemed worthy of a separate paragraph, though it
repeats the description which isn't great.  I'm going to sleep on this one.

@@ -379,17 +387,16 @@ COPY { table_name [ ( 
  
   Specifies which 
-  error_action to perform when there is malformed data
in the input.
-  Currently, only stop (default) and
ignore
-  values are supported.
-  If the stop value is specified,
-  COPY stops operation at the first error.
-  If the ignore value is specified,
-  COPY skips malformed data and continues copying
data.
-  The option is allowed only in COPY FROM.
-  Only stop value is allowed when
-  using binary format.
+  error_action to perform when encountering a malformed
input row:
+  stop or ignore.
+  A value of stop means fail the command, while
+  ignore means discard the input row and continue
with the next one.
+  The default is stop
  
+ 
+  The ignore option is applicable only for COPY FROM
+  when the FORMAT is text
+  or csv.
+
 


The following was, IMO, too much text about something not to worry about,
COPY TO and ignore mode.  The point is dead tuples on error and running
vacuum.  It doesn't really even matter what caused the error, though if the
error was even before rows started to be imported then obviously there
would be no dead tuples.

Oh, and the word "first" really isn't adding anything here that we cannot
reasonably leave to common sense, IMO.  We either ignore all errors or stop
on the first one.  There isn't a stop mode that is capable of bypassing
errors and the ignore mode doesn't have a "first n" option so one assumes
all errors are ignored.

@@ -576,15 +583,12 @@ COPY count



-COPY stops operation at the first error when
-ON_ERROR is not specified. This
-should not lead to problems in the event of a COPY
-TO, but the target table will already have received
-earlier rows in a COPY FROM. These rows will not
-be visible or accessible, but they still occupy disk space. This might
-amount to a considerable amount of wasted disk space if the failure
-happened well into a large copy operation. You might wish to invoke
-VACUUM to recover the wasted space.
+A failed COPY FROM command will have performed
+physical insertion of all rows prior to the malformed one.
+While these rows will not be visible or accessible, they still occupy
+disk space. This might amount to a considerable amount of wasted disk
+space if the failure happened well into a large copy operation.
+VACUUM should be used to recover the wasted space.




David J.


Re: make dist using git archive

2024-01-25 Thread Peter Eisentraut

On 25.01.24 17:25, Tristan Partin wrote:
The way that this currently works is that you will fail at configure 
time if bz2 doesn't exist on the system. Meson will try to resolve a 
.path() method on a NotFoundProgram. You might want to define the bz2 
target to just call `exit 1` in this case.


if bzip2.found()
  # do your current target
else
  bz2 = run_target('tar.bz2', command: ['exit', 1])
endif

This should cause pgdist to appropriately fail at run time when 
generating the bz2 tarball.


Ok, done that way.

For what it's worth, I run Meson 1.3, and the behavior of generating the 
tarballs even though it is a dirty tree still occurred. In the new patch 
you seem to say it was fixed in 0.60.


The problem I'm referring to is that before 0.60, alias_target cannot 
depend on run_target (only "build target").  This is AFAICT not 
documented and might not have been an intentional change, but you can 
trace it in the meson source code, and it shows in the PostgreSQL CI. 
That's also why for the above bzip2 issue I have to use custom_target in 
place of your run_target.
From 48ddd079c1530baaabf92a5650ff9a7bfa7ac3d6 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Thu, 25 Jan 2024 12:28:28 +0100
Subject: [PATCH v2 1/2] make dist uses git archive

Discussion: 
https://www.postgresql.org/message-id/flat/40e80f77-a294-4f29-a16f-e21bc7bc75fc%40eisentraut.org
---
 GNUmakefile.in | 34 --
 meson.build| 47 +++
 2 files changed, 59 insertions(+), 22 deletions(-)

diff --git a/GNUmakefile.in b/GNUmakefile.in
index eba569e930e..680c765dd73 100644
--- a/GNUmakefile.in
+++ b/GNUmakefile.in
@@ -87,29 +87,19 @@ update-unicode: | submake-generated-headers 
submake-libpgport
 distdir= postgresql-$(VERSION)
 dummy  = =install=
 
+GIT = git
+
 dist: $(distdir).tar.gz $(distdir).tar.bz2
-   rm -rf $(distdir)
-
-$(distdir).tar: distdir
-   $(TAR) chf $@ $(distdir)
-
-.INTERMEDIATE: $(distdir).tar
-
-distdir-location:
-   @echo $(distdir)
-
-distdir:
-   rm -rf $(distdir)* $(dummy)
-   for x in `cd $(top_srcdir) && find . \( -name CVS -prune \) -o \( -name 
.git -prune \) -o -print`; do \
- file=`expr X$$x : 'X\./\(.*\)'`; \
- if test -d "$(top_srcdir)/$$file" ; then \
-   mkdir "$(distdir)/$$file" && chmod 777 "$(distdir)/$$file"; \
- else \
-   ln "$(top_srcdir)/$$file" "$(distdir)/$$file" >/dev/null 2>&1 \
- || cp "$(top_srcdir)/$$file" "$(distdir)/$$file"; \
- fi || exit; \
-   done
-   $(MAKE) -C $(distdir) distclean
+
+.PHONY: check-dirty-index
+check-dirty-index:
+   $(GIT) -C $(srcdir) diff-index --quiet HEAD
+
+$(distdir).tar.gz: check-dirty-index
+   $(GIT) -C $(srcdir) archive --format tar.gz --prefix $(distdir)/ HEAD 
-o $(abs_top_builddir)/$@
+
+$(distdir).tar.bz2: check-dirty-index
+   $(GIT) -C $(srcdir) -c tar.tar.bz2.command='$(BZIP2) -c' archive 
--format tar.bz2 --prefix $(distdir)/ HEAD -o $(abs_top_builddir)/$@
 
 distcheck: dist
rm -rf $(dummy)
diff --git a/meson.build b/meson.build
index 55184db2488..c245a1818e0 100644
--- a/meson.build
+++ b/meson.build
@@ -3348,6 +3348,53 @@ run_target('help',
 
 
 
+###
+# Distribution archive
+###
+
+git = find_program('git', required: false, native: true)
+bzip2 = find_program('bzip2', required: false, native: true)
+
+distdir = meson.project_name() + '-' + meson.project_version()
+
+check_dirty_index = run_target('check-dirty-index',
+  command: [git, '-C', '@SOURCE_ROOT@', 'diff-index', '--quiet', 'HEAD'])
+
+tar_gz = custom_target('tar.gz',
+  build_always_stale: true,
+  command: [git, '-C', '@SOURCE_ROOT@', 'archive',
+'--format', 'tar.gz',
+'--prefix', distdir + '/',
+'-o', join_paths(meson.build_root(), '@OUTPUT@'),
+'HEAD', '.'],
+  install: false,
+  output: distdir + '.tar.gz',
+)
+
+if bzip2.found()
+  tar_bz2 = custom_target('tar.bz2',
+build_always_stale: true,
+command: [git, '-C', '@SOURCE_ROOT@', '-c', 'tar.tar.bz2.command="' + 
bzip2.path() + '" -c', 'archive',
+  '--format', 'tar.bz2',
+  '--prefix', distdir + '/',
+  '-o', join_paths(meson.build_root(), '@OUTPUT@'),
+  'HEAD', '.'],
+install: false,
+output: distdir + '.tar.bz2',
+  )
+else
+  tar_bz2 = custom_target('tar.bz2',
+command: ['sh', '-c', 'exit 1'],
+output: distdir + '.tar.bz2',
+  )
+endif
+
+# FIXME: would like to add check_dirty_index here, broken(?) before
+# meson 0.60.0
+alias_target('pgdist', [tar_gz, tar_bz2])
+
+
+
 ###
 # The End, The End, My Friend
 ###

base-commit: 46d8587b504170ca14f064890bc7ccbbd7523f81
-- 
2.43.0


Re: Small fix on COPY ON_ERROR document

2024-01-25 Thread Masahiko Sawada
On Fri, Jan 26, 2024 at 2:40 PM Yugo NAGATA  wrote:
>
> On Fri, 26 Jan 2024 13:59:09 +0900
> Masahiko Sawada  wrote:
>
> > On Fri, Jan 26, 2024 at 11:28 AM Yugo NAGATA  wrote:
> > >
> > > Hi,
> > >
> > > I found that the documentation of COPY ON_ERROR said
> > > COPY stops operation at the first error when
> > > "ON_ERROR is not specified.", but it also stop when
> > > ON_ERROR is specified to the default value, "stop".
> > >
> > > I attached a very small patch to fix this just for
> > > making the description more accurate.
> >
> > Thank you for the patch!
> >
> > +1 to fix it.
> >
> > -ON_ERROR is not specified. This
> > -should not lead to problems in the event of a COPY
> > +ON_ERROR is not specified or 
> > stop.
> > +This should not lead to problems in the event of a COPY
> >
> > How about the followings for consistency with the description of the
> > ON_ERROR option?
> >
> > COPY stops operation at the first error if the stop value is specified
> > to the ON_ERROR option. This should not lead to ...
>
> Thank you for your review. However, after posting the previous patch,
> I noticed that I had overlooked that ON_ERROR works only for errors due to data
> type incompatibility (which is referred to as "malformed data" in the
> ON_ERROR description, though).

Right.

>
> So, how about rewriting this like:
>
>  COPY stops operation at the first error unless the error is due to data
>  type incompatibility and a value other than stop is specified to the
>  ON_ERROR option.

Hmm, this sentence seems not very readable to me, especially the "COPY
stops ... unless ... a value other than stop is specified ..." part. I
think we can simplify it:

COPY stops operation at the first data type incompatibility error if
the stop value is specified to the ON_ERROR option.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: A performance issue with Memoize

2024-01-25 Thread Richard Guo
On Fri, Jan 26, 2024 at 12:18 PM David Rowley  wrote:

> On Fri, 26 Jan 2024 at 16:51, Tom Lane  wrote:
> > >> However ... it seems like we're not out of the woods yet.  Why
> > >> is Richard's proposed test case still showing
> > >> + ->  Memoize (actual rows=5000 loops=N)
> > >> +   Cache Key: t1.two, t1.two
> > >> Seems like there is missing de-duplication logic, or something.
> >
> > > This seems separate and isn't quite causing the same problems as what
> > > Richard wants to fix so I didn't touch this for now.
> >
> > Fair enough, but I think it might be worth pursuing later.
>
> Here's a patch for that.


At first I wondered if we should assume that the same param expr must
have the same equality operator. If not, we should also check the
operator to tell if the cache key is a duplicate, like

-   if (!list_member(*param_exprs, expr))
+   if (!list_member(*param_exprs, expr) ||
+   !list_member_oid(*operators, hasheqoperator))

But after looking at how rinfo->left_hasheqoperator/right_hasheqoperator
is set, it seems we can assume that: the operator is from the type cache
entry which is fetched according to the expr datatype.

So I think the patch makes sense.  +1.

Thanks
Richard


Race condition in FetchTableStates() breaks synchronization of subscription tables

2024-01-25 Thread Alexander Lakhin

Hello hackers,

After determining a possible cause for intermittent failures of the test
subscription/031_column_list [1], I was wondering what makes another
subscription test (014_binary) fail on the buildfarm:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=snakefly=2024-01-22%2001%3A19%3A03
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mylodon=2024-01-14%2018%3A19%3A20
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=piculet=2023-12-21%2001%3A11%3A52
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mylodon=2023-11-27%2001%3A42%3A39

All those failures were caused by a timeout while waiting for a message expected
in _subscriber.log. For example, in snakefly's case:
[01:14:46.158](1.937s) ok 7 - check synced data on subscriber with custom type
timed out waiting for match: (?^:ERROR: ( [A-Z0-9]+:)? incorrect binary data format) at 
/home/bf/bf-build/piculet/HEAD/pgsql/src/test/subscription/t/014_binary.pl line 269.


_subscriber.log contains:
2023-12-21 01:14:46.215 UTC [410039] 014_binary.pl LOG:  statement: ALTER 
SUBSCRIPTION tsub REFRESH PUBLICATION;
2023-12-21 01:17:46.756 UTC [40] ERROR:  could not receive data from WAL stream: server closed the connection 
unexpectedly

    This probably means the server terminated abnormally
    before or while processing the request.
2023-12-21 01:17:46.760 UTC [405057] LOG:  background worker "logical replication apply worker" (PID 40) exited with 
exit code 1

2023-12-21 01:17:46.779 UTC [532857] LOG:  logical replication apply worker for 
subscription "tsub" has started
...

While _subscriber.log from a successful test run contains:
2024-01-26 03:49:07.065 UTC [9726:5] 014_binary.pl LOG:  statement: ALTER 
SUBSCRIPTION tsub REFRESH PUBLICATION;
2024-01-26 03:49:07.075 UTC [9726:6] 014_binary.pl LOG: disconnection: session time: 0:00:00.014 user=postgres 
database=postgres host=[local]

2024-01-26 03:49:07.558 UTC [9729:1] LOG:  logical replication apply worker for 
subscription "tsub" has started
2024-01-26 03:49:07.563 UTC [9731:1] LOG:  logical replication table synchronization worker for subscription "tsub", 
table "test_mismatching_types" has started

2024-01-26 03:49:07.585 UTC [9731:2] ERROR:  incorrect binary data format
2024-01-26 03:49:07.585 UTC [9731:3] CONTEXT:  COPY test_mismatching_types, 
line 1, column a

In this case, "logical replication apply worker for subscription "tsub" has
started" appears just after "ALTER SUBSCRIPTION", not 3 minutes later.

I've managed to reproduce this failure locally by running multiple tests in
parallel, and my analysis shows that it is caused by a race condition when
accessing the variable table_states_valid inside tablesync.c.

tablesync.c does the following with table_states_valid:
/*
 * Callback from syscache invalidation.
 */
void
invalidate_syncing_table_states(Datum arg, int cacheid, uint32 hashvalue)
{
    table_states_valid = false;
}
...
static bool
FetchTableStates(bool *started_tx)
{
...
    if (!table_states_valid)
    {
...
    /* Fetch all non-ready tables. */
    rstates = GetSubscriptionRelations(MySubscription->oid, true);
...
    table_states_valid = true;
    }

So, when syscache invalidation occurs inside the code block
"if (!table_states_valid)", that invalidation is effectively ignored.

In a failed case I observe the following events:
1. logical replication apply worker performs
 LogicalRepApplyLoop() -> process_syncing_tables() ->
 process_syncing_tables_for_apply() -> FetchTableStates() periodically.

2. ALTER SUBSCRIPTION tsub REFRESH PUBLICATION invalidates syscache
 for SUBSCRIPTIONRELMAP, and that leads to calling
 invalidate_syncing_table_states().

3. If the apply worker manages to fetch no non-ready tables in
 FetchTableStates() and ignore "table_states_valid = false" from
 invalidate_syncing_table_states(), then it just misses the invalidation
 event, so it keeps working without noticing non-ready tables appeared as
 the result of ALTER SUBSCRIPTION (process_syncing_tables_for_apply() skips
 a loop "foreach(lc, table_states_not_ready) ..." until some other event
 occurs).

A pg_usleep(10) added just below GetSubscriptionRelations(...) proves my
analysis -- without it, I need tens of iterations (with 50 tests running in
parallel) to catch the failure, but with it, I get the failure on the first
iteration.

(Reproduced on REL_16_STABLE..master, where the test 014_binary tries to
"Refresh the publication to trigger the tablesync", but I think the race
condition exists in the previous versions as well.)
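
One common way to close such a window (sketched below only to illustrate the idea,
not as a concrete patch) is to mark the cache valid before rebuilding it, so that an
invalidation arriving during the rebuild flips the flag back and is handled on the
next call instead of being overwritten:

static bool
FetchTableStates(bool *started_tx)
{
...
    if (!table_states_valid)
    {
        /*
         * Mark the cache valid before fetching; an invalidation that arrives
         * while GetSubscriptionRelations() runs will reset the flag and force
         * another rebuild on the next call, instead of being lost when the
         * flag is set afterwards.
         */
        table_states_valid = true;
...
        /* Fetch all non-ready tables. */
        rstates = GetSubscriptionRelations(MySubscription->oid, true);
...
    }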

[1] 
https://www.postgresql.org/message-id/16d6d9cc-f97d-0b34-be65-425183ed3...@gmail.com

Best regards,
Alexander




RE: speed up a logical replica setup

2024-01-25 Thread Hayato Kuroda (Fujitsu)
Dear Euler,

Thanks for updating the patch! Before reading yours, I wanted to reply to some of
the comments.

>
I'm still thinking about replacing --subscriber-conninfo with separate items
(username, port, password?, host = socket dir). Maybe it is an overengineering.
The user can always prepare the environment to avoid unwanted and/or external
connections.
>

For me, the required amount of fixes is not so different from the current one. What
do others think?

>
It is extremely useful because (a) you have a physical replication setup and
don't know if it is prepared for logical replication, (b) check GUCs (is
max_wal_senders sufficient for this pg_subscriber command? Or is
max_replication_slots sufficient to setup the logical replication even though I
already have some used replication slots?), (c) connectivity and (d)
credentials.
>

Yeah, it is useful for verification purposes, so let's keep this option.
But I still think the name should be "--check". Also, there are many
`if (!dry_run)` checks, but most of them can be removed if the process exits earlier.
Thoughts?


>
> 05.
> I felt we should accept some settings from environment variables, like
> pg_upgrade.
> Currently, the items below should be accepted.
> 
> - data directory
> - username
> - port
> - timeout

Maybe PGDATA.
>

Sorry, I cannot follow this. Did you mean that the target data directory should
be specifiable via PGDATA? If so, +1.

>
> 06.
> ```
> pg_logging_set_level(PG_LOG_WARNING);
> ```
> 
> If the default log level is warning, there is no way to output debug logs.
> (The -v option only raises the level by one, so INFO would be output.)
> I think it should be PG_LOG_INFO.

You need to specify multiple -v options.
>

Hmm. I felt the specification was a bit strange... but at least it must be
described in the documentation. pg_dump.sgml has similar lines.

>
> 08.
> Not sure, but if we want to record outputs by pg_subscriber, the sub-directory
> should be created. The name should contain the timestamp.

The log file already contains the timestamp. Why?
>

This comment assumed that pg_subscriber's output was also recorded to a file.
In that case, and if the file has the same timestamp, I think they could be
gathered in the same place. There is no need if the output is not recorded.


>
> 09.
> Not sure, but should we check max_slot_wal_keep_size on the primary server? It could
> avoid a failure when starting logical replication.

A broken physical replication *before* running this tool is its responsibility?
Hmm. We might add another check that can be noticed during dry run mode.
>

I thought that we should not generate any broken objects, but indeed, I am not sure
it is in our scope. What do others think?


>
> 11.
> ```
> /*
> * Stop the subscriber if it is a standby server. Before executing the
> * transformation steps, make sure the subscriber is not running because
> * one of the steps is to modify some recovery parameters that require a
> * restart.
> */
> if (stat(pidfile, ) == 0)
> ```
> 
> I kept it just in case, but I'm not sure it is still needed. What do you think?
> Removing it would avoid the inclusion of pidfile.h.

Are you suggesting another way to check if the standby is up and running?
>

Running `pg_ctl stop` itself can detect whether the process is still alive;
it exits with 1 when the process is not there.

>
> I didn't notice the comment, but I'm still not sure of the reason. Why must we
> reserve the slot until pg_subscriber finishes? IIUC, the slot would never be used;
> it is created only for getting a consistent_lsn, so we do not have to keep it.
> Also, just before that, logical replication slots for each database are created,
> so WAL records are surely reserved.
>

I want to confirm the conclusion - will you remove the creation of a transient 
slot?

Also (not tested), I'm now considering that we could reuse the primary_conninfo
value. We are assuming that the target server is a standby and that its current
upstream will be converted to a publisher. In this case, the connection string is
already specified as primary_conninfo, so --publisher-conninfo may not be needed.
The parameter does not contain a database name, so --databases is still needed.
I imagine something like:

1. Parse options
2. Turn on standby
3. Verify the standby
4. Turn off standby
5. Get primary_conninfo from standby
6. Connect to primary (concat of primary_conninfo and an option is used)
7. Verify the primary
...

What do you think?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/global/ 





Re: Small fix on COPY ON_ERROR document

2024-01-25 Thread Yugo NAGATA
On Fri, 26 Jan 2024 13:59:09 +0900
Masahiko Sawada  wrote:

> On Fri, Jan 26, 2024 at 11:28 AM Yugo NAGATA  wrote:
> >
> > Hi,
> >
> > I found that the documentation of COPY ON_ERROR said
> > COPY stops operation at the first error when
> > "ON_ERROR is not specified.", but it also stop when
> > ON_ERROR is specified to the default value, "stop".
> >
> > I attached a very small patch to fix this just for
> > making the description more accurate.
> 
> Thank you for the patch!
> 
> +1 to fix it.
> 
> -ON_ERROR is not specified. This
> -should not lead to problems in the event of a COPY
> +ON_ERROR is not specified or stop.
> +This should not lead to problems in the event of a COPY
> 
> How about the followings for consistency with the description of the
> ON_ERROR option?
> 
> COPY stops operation at the first error if the stop value is specified
> to the ON_ERROR option. This should not lead to ...

Thank you for your review. However, after posting the previous patch,
I noticed that I had overlooked that ON_ERROR works only for errors due to data
type incompatibility (which is referred to as "malformed data" in the
ON_ERROR description, though).

So, how about rewriting this like:

 COPY stops operation at the first error unless the error is due to data
 type incompatibility and a value other than stop is specified to the
 ON_ERROR option.

I attached the updated patch.

Regards,
Yugo Nagata

> 
> Regards,
> 
> -- 
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com


-- 
Yugo NAGATA 
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 21a5c4a052..3676bf0e87 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -576,9 +576,10 @@ COPY count

 

-COPY stops operation at the first error when
-ON_ERROR is not specified. This
-should not lead to problems in the event of a COPY
+COPY stops operation at the first error unless the error
+is due to data type incompatibility and a value other than
+stop is specified to the ON_ERROR option.
+This should not lead to problems in the event of a COPY
 TO, but the target table will already have received
 earlier rows in a COPY FROM. These rows will not
 be visible or accessible, but they still occupy disk space. This might


Re: A performance issue with Memoize

2024-01-25 Thread Richard Guo
On Fri, Jan 26, 2024 at 1:22 AM Tom Lane  wrote:

> Apologies for not having noticed this thread before.  I'm taking
> a look at it now.  However, while sniffing around this I found
> what seems like an oversight in paramassign.c's
> assign_param_for_var(): it says it should compare all the same
> fields as _equalVar except for varlevelsup, but it's failing to
> compare varnullingrels.  Is that a bug?  It's conceivable that
> it's not possible to get here with varnullingrels different and
> all else the same, but I don't feel good about that proposition.
>
> I tried adding
>
> @@ -91,7 +91,10 @@ assign_param_for_var(PlannerInfo *root, Var *var)
>  pvar->vartype == var->vartype &&
>  pvar->vartypmod == var->vartypmod &&
>  pvar->varcollid == var->varcollid)
> +{
> +Assert(bms_equal(pvar->varnullingrels,
> var->varnullingrels));
>  return pitem->paramId;
> +}
>  }
>  }


Yeah, I think it should be safe to assert that the varnullingrels is
equal here.  The Var is supposed to be an upper-level Var, and two same
such Vars should not have different varnullingrels at this point,
although the varnullingrels might be adjusted later in
identify_current_nestloop_params according to which form of identity 3
we end up applying.

Thanks
Richard


Re: Small fix on COPY ON_ERROR document

2024-01-25 Thread Masahiko Sawada
On Fri, Jan 26, 2024 at 11:28 AM Yugo NAGATA  wrote:
>
> Hi,
>
> I found that the documentation of COPY ON_ERROR said
> COPY stops operation at the first error when
> "ON_ERROR is not specified.", but it also stop when
> ON_ERROR is specified to the default value, "stop".
>
> I attached a very small patch to fix this just for
> making the description more accurate.

Thank you for the patch!

+1 to fix it.

-ON_ERROR is not specified. This
-should not lead to problems in the event of a COPY
+ON_ERROR is not specified or stop.
+This should not lead to problems in the event of a COPY

How about the followings for consistency with the description of the
ON_ERROR option?

COPY stops operation at the first error if the stop value is specified
to the ON_ERROR option. This should not lead to ...

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Remove unused fields in ReorderBufferTupleBuf

2024-01-25 Thread Masahiko Sawada
On Fri, Jan 26, 2024 at 1:24 PM  wrote:
>
> On Thu, 2024-01-25 at 16:11 -0500, reid.thomp...@crunchydata.com wrote:
> >
> > I walked through v6 and didn't note any issues.

Thank you for reviewing the patch!

> >
> > I do want to ask, the patch alters ReorderBufferReturnTupleBuf() and
> > drops the unused parameter ReorderBuffer *rb. It seems that
> > ReorderBufferFreeSnap(), ReorderBufferReturnRelids(),
> > ReorderBufferImmediateInvalidation(), and ReorderBufferRestoreCleanup()
> > also pass ReorderBuffer *rb but do not use it. Would it be beneficial
> > to implement a separate patch to remove this parameter from these
> > functions also?
> >
> >
>
> It also appears that ReorderBufferSerializeChange() has 5 places
> where it increments the local variable "data" but the result is never
> read.
> Lines 3806, 3836, 3854, 3889, 3910
>
> I can create a patch and post it to this thread or a new one if deemed
> worthwhile.

I'm not sure these changes are really beneficial. IMO they improve neither
readability nor performance.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Remove unused fields in ReorderBufferTupleBuf

2024-01-25 Thread reid . thompson
On Thu, 2024-01-25 at 16:11 -0500, reid.thomp...@crunchydata.com wrote:
> 
> I walked through v6 and didn't note any issues.
> 
> I do want to ask, the patch alters ReorderBufferReturnTupleBuf() and
> drops the unused parameter ReorderBuffer *rb. It seems that
> ReorderBufferFreeSnap(), ReorderBufferReturnRelids(),
> ReorderBufferImmediateInvalidation(), and ReorderBufferRestoreCleanup()
> also pass ReorderBuffer *rb but do not use it. Would it be beneficial
> to implement a separate patch to remove this parameter from these
> functions also?
> 
> Thanks,
> Reid
> 

It also appears that ReorderBufferSerializeChange() has 5 places
where it increments the local variable "data" but the result is never
read.
Lines 3806, 3836, 3854, 3889, 3910

I can create a patch and post it to this thread or a new one if deemed
worthwhile.

Thanks,
Reid




Re: A performance issue with Memoize

2024-01-25 Thread David Rowley
On Fri, 26 Jan 2024 at 16:51, Tom Lane  wrote:
> >> However ... it seems like we're not out of the woods yet.  Why
> >> is Richard's proposed test case still showing
> >> + ->  Memoize (actual rows=5000 loops=N)
> >> +   Cache Key: t1.two, t1.two
> >> Seems like there is missing de-duplication logic, or something.
>
> > This seems separate and isn't quite causing the same problems as what
> > Richard wants to fix so I didn't touch this for now.
>
> Fair enough, but I think it might be worth pursuing later.

Here's a patch for that.

David
diff --git a/src/backend/optimizer/path/joinpath.c 
b/src/backend/optimizer/path/joinpath.c
index c0ba087b40..8ecf9fe8dc 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -492,8 +492,16 @@ paraminfo_get_equal_hashops(PlannerInfo *root, 
ParamPathInfo *param_info,
return false;
}
 
-   *operators = lappend_oid(*operators, hasheqoperator);
-   *param_exprs = lappend(*param_exprs, expr);
+   /*
+* 'expr' may already exist as a parameter from a 
previous ppi_clauses.  No
+* need to include it again, however we'd better ensure 
we do switch into
+* binary mode if required.  See below.
+*/
+   if (!list_member(*param_exprs, expr))
+   {
+   *operators = lappend_oid(*operators, 
hasheqoperator);
+   *param_exprs = lappend(*param_exprs, expr);
+   }
 
/*
 * When the join operator is not hashable then it's 
possible that
@@ -536,8 +544,16 @@ paraminfo_get_equal_hashops(PlannerInfo *root, 
ParamPathInfo *param_info,
return false;
}
 
-   *operators = lappend_oid(*operators, typentry->eq_opr);
-   *param_exprs = lappend(*param_exprs, expr);
+   /*
+* 'expr' may already exist as a parameter from the 
ppi_clauses.  No need to
+* include it again, however we'd better ensure we do switch 
into binary
+* mode.
+*/
+   if (!list_member(*param_exprs, expr))
+   {
+   *operators = lappend_oid(*operators, typentry->eq_opr);
+   *param_exprs = lappend(*param_exprs, expr);
+   }
 
/*
 * We must go into binary mode as we don't have too much of an 
idea of
diff --git a/src/test/regress/expected/memoize.out 
b/src/test/regress/expected/memoize.out
index ca198ec3b8..17bb3c8661 100644
--- a/src/test/regress/expected/memoize.out
+++ b/src/test/regress/expected/memoize.out
@@ -107,7 +107,7 @@ WHERE t1.unique1 < 10;', false);
  ->  Index Scan using tenk1_unique1 on tenk1 t1 (actual rows=10 
loops=N)
Index Cond: (unique1 < 10)
  ->  Memoize (actual rows=2 loops=N)
-   Cache Key: t1.two, t1.two
+   Cache Key: t1.two
Cache Mode: binary
Hits: 8  Misses: 2  Evictions: Zero  Overflows: 0  Memory 
Usage: NkB
->  Subquery Scan on t2 (actual rows=2 loops=N)


Re: A performance issue with Memoize

2024-01-25 Thread Tom Lane
David Rowley  writes:
> I've adjusted the comments to what you mentioned and also leaned out
> the pretty expensive test case to something that'll run much faster
> and pushed the result.

+1, I was wondering if the test could be cheaper.  It wasn't horrid
as Richard had it, but core regression tests add up over time.

>> However ... it seems like we're not out of the woods yet.  Why
>> is Richard's proposed test case still showing
>> + ->  Memoize (actual rows=5000 loops=N)
>> +   Cache Key: t1.two, t1.two
>> Seems like there is missing de-duplication logic, or something.

> This seems separate and isn't quite causing the same problems as what
> Richard wants to fix so I didn't touch this for now.

Fair enough, but I think it might be worth pursuing later.

regards, tom lane




Re: A performance issue with Memoize

2024-01-25 Thread David Rowley
On Fri, 26 Jan 2024 at 07:32, Tom Lane  wrote:
>
> David Rowley  writes:
> > I'd feel better about doing it your way if Tom could comment on if
> > there was a reason he put the function calls that way around in
> > 5ebaaa494.
>
> I'm fairly sure I thought it wouldn't matter because of the Param
> de-duplication done in paramassign.c.  However, Richard's example
> shows that's not so, because process_subquery_nestloop_params is
> picky about the param ID assigned to a particular Var while
> replace_nestloop_params is not.  So flipping the order makes sense.

Makes sense.

I've adjusted the comments to what you mentioned and also leaned out
the pretty expensive test case to something that'll run much faster
and pushed the result.

> However ... it seems like we're not out of the woods yet.  Why
> is Richard's proposed test case still showing
>
> + ->  Memoize (actual rows=5000 loops=N)
> +   Cache Key: t1.two, t1.two
>
> Seems like there is missing de-duplication logic, or something.

This seems separate and isn't quite causing the same problems as what
Richard wants to fix so I didn't touch this for now.

David




Re: Use of backup_label not noted in log

2024-01-25 Thread Michael Paquier
On Thu, Jan 25, 2024 at 05:44:52PM -0400, David Steele wrote:
> On 1/25/24 17:42, Tom Lane wrote:
>> We're talking about 1d35f705e, right?  That certainly looks harmless
>> and potentially useful.  I'm +1 for back-patching.
> 
> That's the one. If we were modifying existing messages I would be against
> it, but new, infrequent (but oh so helpful) messages seem fine.

Well, I'm OK with this consensus on 1d35f705e if folks think this is
useful enough for all the stable branches.
--
Michael


signature.asc
Description: PGP signature


Re: Improve WALRead() to suck data directly from WAL buffers when possible

2024-01-25 Thread Jeff Davis
On Thu, 2024-01-25 at 14:35 +0530, Bharath Rupireddy wrote:
> 
> "expecting zeros at the end" - this can't always be true as the WAL
> 
...

> I think this needs to be discussed separately. If okay, I'll start a
> new thread.

Thank you for investigating. When the above issue is handled, I'll be
more comfortable expanding the call sites for XLogReadFromBuffers().

> > Also, if we've detected that the first requested buffer has been
> > evicted, is there any value in continuing the loop to see if more
> > recent buffers are available? For example, if the requested LSNs
> > range
> > over buffers 4, 5, and 6, and 4 has already been evicted, should we
> > try
> > to return LSN data from 5 and 6 at the proper offset in the dest
> > buffer? If so, we'd need to adjust the API so the caller knows what
> > parts of the dest buffer were filled in.
> 
> I'd second this capability for now to keep the API simple and clear,
> but we can consider expanding it as needed.

Agreed. This case doesn't seem important; I just thought I'd ask about
it.

> If the callers use GetFlushRecPtr() to determine how far to read,
> LogwrtResult will be *reasonably* latest

It will be up-to-date enough that we'd never go through
WaitXLogInsertionsToFinish(), which is all we care about.

> As far as the current WAL readers are concerned, we don't need an
> explicit spinlock to determine LogwrtResult because all of them use
> GetFlushRecPtr() to determine how far to read. If there's any caller
> that's not updating LogwrtResult at all, we can consider reading
> LogwrtResult ourselves in the future.

So we don't actually need that path yet, right?

> 5. I think the two requirements specified at
> https://www.postgresql.org/message-id/20231109205836.zjoawdrn4q77yemv%40awork3.anarazel.de
> still hold with the v19j.

Agreed.

> PSA v20 patch set.

0001 is very close. I have the following suggestions:

  * Don't just return zero. If the caller is doing something we don't
expect, we want to fix the caller. I understand you'd like this to be
more like a transparent optimization, and we may do that later, but I
don't think it's a good idea to do that now.

  * There's currently no use for reading LSNs between Write and Insert,
so remove the WaitXLogInsertionsToFinish() code path. That also means
we don't need the extra emitLog parameter, so we can remove that. When
we have a use case, we can bring it all back.

If you agree, I can just make those adjustments (and do some final
checking) and commit 0001. Otherwise let me know what you think.

0002: How does the test control whether the data requested is before
the Flush pointer, the Write pointer, or the Insert pointer? What if
the walwriter comes in and moves one of those pointers before the next
statement is executed? Also, do you think a test module is required for
the basic functionality in 0001, or only when we start doing more
complex things like reading past the Flush pointer?

0003: can you explain why this is useful for wal summarizer to read
from the buffers?

Regards,
Jeff Davis





Re: A performance issue with Memoize

2024-01-25 Thread Richard Guo
On Fri, Jan 26, 2024 at 2:32 AM Tom Lane  wrote:

> I'm fairly sure I thought it wouldn't matter because of the Param
> de-duplication done in paramassign.c.  However, Richard's example
> shows that's not so, because process_subquery_nestloop_params is
> picky about the param ID assigned to a particular Var while
> replace_nestloop_params is not.  So flipping the order makes sense.
> I'd change the comment though, maybe to
>
> /*
>  * Replace any outer-relation variables with nestloop params.
>  *
>  * We must provide nestloop params for both lateral references of
>  * the subquery and outer vars in the scan_clauses.  It's better
>  * to assign the former first, because that code path requires
>  * specific param IDs, while replace_nestloop_params can adapt
>  * to the IDs assigned by process_subquery_nestloop_params.
>  * This avoids possibly duplicating nestloop params when the
>  * same Var is needed for both reasons.
>  */


+1.  It's much better.


> However ... it seems like we're not out of the woods yet.  Why
> is Richard's proposed test case still showing
>
> + ->  Memoize (actual rows=5000 loops=N)
> +   Cache Key: t1.two, t1.two
>
> Seems like there is missing de-duplication logic, or something.


When we collect the cache keys in paraminfo_get_equal_hashops() we
search param_info's ppi_clauses as well as innerrel's lateral_vars for
outer expressions.  We do not perform de-duplication on the collected
outer expressions there.  In my proposed test case, the same Var
't1.two' appears both in the param_info->ppi_clauses and in the
innerrel->lateral_vars, so we see two identical cache keys in the plan.
I noticed this before and wondered if we should do de-duplication on the
cache keys, but somehow I did not chase this to the ground.

Thanks
Richard


Re: gai_strerror() is not thread-safe on Windows

2024-01-25 Thread vignesh C
On Tue, 5 Dec 2023 at 00:57, Thomas Munro  wrote:
>
> On second thoughts, I guess it would make more sense to use the exact
> messages Windows' own implementation would return instead of whatever
> we had in the past (probably cribbed from some other OS or just made
> up?).  I asked CI to spit those out[1].  Updated patch attached.  Will
> add to CF.

CFBot shows that the patch does not apply anymore as in [1]:

=== Applying patches on top of PostgreSQL commit ID
376c216138c75e161d39767650ea30536f23b482 ===
=== applying patch ./v2-0001-Fix-gai_strerror-thread-safety-on-Windows.patch
patching file configure
Hunk #1 succeeded at 16388 (offset 34 lines).
patching file configure.ac
Hunk #1 succeeded at 1885 (offset 7 lines).
patching file src/include/port/win32/sys/socket.h
patching file src/port/meson.build
patching file src/port/win32gai_strerror.c
can't find file to patch at input line 134
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--
|diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
|index 46df01cc8d..c51296bdb6 100644
|--- a/src/tools/msvc/Mkvcbuild.pm
|+++ b/src/tools/msvc/Mkvcbuild.pm
--
No file to patch.  Skipping patch.
1 out of 1 hunk ignored

Please have a look and post an updated version.

[1] - http://cfbot.cputube.org/patch_46_4682.log

Regards,
Vignesh




Re: Improve tab completion for ALTER DEFAULT PRIVILEGE and ALTER TABLE

2024-01-25 Thread vignesh C
On Mon, 27 Nov 2023 at 21:58, vignesh C  wrote:
>
> On Fri, 24 Nov 2023 at 18:37, Shubham Khanna
>  wrote:
> >
> > > On Fri, Nov 24, 2023 at 6:33 PM vignesh C  wrote:
> > >
> > > Hi,
> > >
> > > Improved tab completion for "ALTER DEFAULT PRIVILEGE" and "ALTER TABLE":
> > > 1) GRANT, REVOKE and FOR USER keyword was not displayed in tab
> > > completion of alter default privileges like the below statement:
> > > ALTER DEFAULT PRIVILEGES GRANT INSERT ON tables TO PUBLIC;
> > > ALTER DEFAULT PRIVILEGES REVOKE INSERT ON tables FROM PUBLIC;
> > > ALTER DEFAULT PRIVILEGES FOR USER vignesh revoke INSERT ON tables FROM 
> > > dba1;
> > >
> > > 2) USER was not displayed for "ALTER DEFAULT PRIVILEGES IN SCHEMA
> > > public FOR " like in below statement:
> > > ALTER DEFAULT PRIVILEGES IN SCHEMA public FOR USER dba1 GRANT INSERT
> > > ON TABLES TO PUBLIC;
> > >
> > > 3) "FOR GRANT OPTION" was not display for "ALTER DEFAULT PRIVILEGES
> > > REVOKE " like in below statement:
> > > alter default privileges revoke grant option for select ON tables FROM 
> > > dba1;
> > >
> > > 4) "DATA TYPE" was missing in "ALTER TABLE table-name ALTER COLUMN
> > > column-name SET" like in:
> > > ALTER TABLE t1 ALTER COLUMN c1 SET DATA TYPE text;
> > >
> > > Attached patch has the changes for the same.
> >
> > +COMPLETE_WITH("ROLE", "USER");
> > +  /* ALTER DEFAULT PRIVILEGES REVOKE */
> > +  else if (Matches("ALTER", "DEFAULT", "PRIVILEGES", "REVOKE"))
> > +COMPLETE_WITH("SELECT", "INSERT", "UPDATE", "DELETE", "TRUNCATE",
> > +    "REFERENCES", "TRIGGER", "CREATE", "EXECUTE", "USAGE",
> > +    "MAINTAIN", "ALL", "GRANT OPTION FOR");
> >
> > I could not find "alter default privileges revoke maintain", should
> > this be removed?
>
> This was reverted as part of:
> 151c22deee66a3390ca9a1c3675e29de54ae73fc.
> Revert MAINTAIN privilege and pg_maintain predefined role.
>
> This reverts the following commits: 4dbdb82513, c2122aae63,
> 5b1a879943, 9e1e9d6560, ff9618e82a, 60684dd834, 4441fc704d,
> and b5d6382496.  A role with the MAINTAIN privilege may be able to
> use search_path tricks to escalate privileges to the table owner.
> Unfortunately, it is too late in the v16 development cycle to apply
> the proposed fix, i.e., restricting search_path when running
> maintenance commands.
>
> The attached v2 version has the changes for the same.

The patch was not applying because of a recent commit in tab
completion, PSA new patch set.

Regards,
Vignesh
From e3695181cf7d5f31dda520a01def544dc576d6c1 Mon Sep 17 00:00:00 2001
From: Vignesh C 
Date: Sun, 2 Jul 2023 19:35:46 +0530
Subject: [PATCH v3 1/2] Fix missing tab completion in "ALTER DEFAULT
 PRIVILEGES"

The GRANT, REVOKE and FOR USER keywords were not displayed in tab completion of ALTER DEFAULT PRIVILEGES, as in the statements below:
ALTER DEFAULT PRIVILEGES GRANT INSERT ON tables TO PUBLIC;
ALTER DEFAULT PRIVILEGES REVOKE INSERT ON tables FROM PUBLIC;
ALTER DEFAULT PRIVILEGES FOR USER vignesh revoke INSERT ON tables FROM vignesh;

USER was not displayed for "ALTER DEFAULT PRIVILEGES IN SCHEMA public FOR " as in the statement below:
ALTER DEFAULT PRIVILEGES IN SCHEMA public FOR USER vignesh GRANT INSERT ON TABLES TO PUBLIC;

"FOR GRANT OPTION" was not display for "ALTER DEFAULT PRIVILEGES REVOKE " like in below statement:
alter default privileges revoke grant option for select ON tables FROM testdba;
---
 src/bin/psql/tab-complete.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index ada711d02f..d5b3794fd6 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -2144,10 +2144,15 @@ psql_completion(const char *text, int start, int end)
 
 	/* ALTER DEFAULT PRIVILEGES */
 	else if (Matches("ALTER", "DEFAULT", "PRIVILEGES"))
-		COMPLETE_WITH("FOR ROLE", "IN SCHEMA");
+		COMPLETE_WITH("FOR", "GRANT", "IN SCHEMA", "REVOKE");
 	/* ALTER DEFAULT PRIVILEGES FOR */
 	else if (Matches("ALTER", "DEFAULT", "PRIVILEGES", "FOR"))
-		COMPLETE_WITH("ROLE");
+		COMPLETE_WITH("ROLE", "USER");
+	/* ALTER DEFAULT PRIVILEGES REVOKE */
+	else if (Matches("ALTER", "DEFAULT", "PRIVILEGES", "REVOKE"))
+		COMPLETE_WITH("SELECT", "INSERT", "UPDATE", "DELETE", "TRUNCATE",
+	  "REFERENCES", "TRIGGER", "CREATE", "EXECUTE", "USAGE",
+	  "ALL", "GRANT OPTION FOR");
 	/* ALTER DEFAULT PRIVILEGES IN */
 	else if (Matches("ALTER", "DEFAULT", "PRIVILEGES", "IN"))
 		COMPLETE_WITH("SCHEMA");
@@ -2158,11 +2163,11 @@ psql_completion(const char *text, int start, int end)
 	/* ALTER DEFAULT PRIVILEGES IN SCHEMA ... */
 	else if (Matches("ALTER", "DEFAULT", "PRIVILEGES", "IN", "SCHEMA",
 	 MatchAny))
-		COMPLETE_WITH("GRANT", "REVOKE", "FOR ROLE");
+		COMPLETE_WITH("GRANT", "REVOKE", "FOR");
 	/* ALTER DEFAULT PRIVILEGES IN SCHEMA ... FOR */
 	else if (Matches("ALTER", "DEFAULT", "PRIVILEGES", "IN", "SCHEMA",
 	 MatchAny, "FOR"))
-		COMPLETE_WITH("ROLE");
+		COMPLETE_WITH("ROLE", "USER");
 	/* 

Re: Evaluate arguments of correlated SubPlans in the referencing ExprState

2024-01-25 Thread vignesh C
On Tue, 24 Oct 2023 at 01:47, Alena Rybakina  wrote:
>
> Hi!
>
> I looked through your patch and noticed that it was not applied to the 
> current version of the master. I rebased it and attached a version. I didn't 
> see any problems and, honestly, no big changes were needed, all regression 
> tests were passed.
>
> I think it's better to add a test, but to be honest, I haven't been able to 
> come up with something yet.

The patch does not apply anymore as in CFBot at [1]:

=== Applying patches on top of PostgreSQL commit ID
7014c9a4bba2d1b67d60687afb5b2091c1d07f73 ===
=== applying patch
./v2-0001-WIP-Evaluate-arguments-of-correlated-SubPlans-in-the.patch

patching file src/include/executor/execExpr.h
Hunk #1 succeeded at 160 (offset 1 line).
Hunk #2 succeeded at 382 (offset 2 lines).
Hunk #3 FAILED at 778.
1 out of 3 hunks FAILED -- saving rejects to file
src/include/executor/execExpr.h.rej
patching file src/include/nodes/execnodes.h
Hunk #1 succeeded at 959 (offset 7 lines).

Please have a look and post an updated version.

[1] - http://cfbot.cputube.org/patch_46_4209.log

Regards,
Vignesh




Small fix on COPY ON_ERROR document

2024-01-25 Thread Yugo NAGATA
Hi,

I found that the documentation of COPY ON_ERROR says
COPY stops operation at the first error when
"ON_ERROR is not specified", but it also stops when
ON_ERROR is explicitly set to its default value, "stop".

I attached a very small patch to fix this just for
making the description more accurate.
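
For illustration (a sketch only, not text proposed for the docs; the table
name is a placeholder), both of the following forms stop at the first
malformed input row, since "stop" is the default action:

    -- ON_ERROR not specified: stops at the first bad row
    COPY mytable FROM STDIN (FORMAT csv);

    -- ON_ERROR set explicitly to its default value: identical behavior
    COPY mytable FROM STDIN (FORMAT csv, ON_ERROR stop);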

Regards,
Yugo Nagata 

-- 
Yugo NAGATA 
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 21a5c4a052..14c8ceab06 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -577,8 +577,8 @@ COPY count
 

 COPY stops operation at the first error when
-ON_ERROR is not specified. This
-should not lead to problems in the event of a COPY
+ON_ERROR is not specified or stop.
+This should not lead to problems in the event of a COPY
 TO, but the target table will already have received
 earlier rows in a COPY FROM. These rows will not
 be visible or accessible, but they still occupy disk space. This might


Re: Printing backtrace of postgres processes

2024-01-25 Thread vignesh C
On Sun, 5 Nov 2023 at 01:49, Bharath Rupireddy
 wrote:
>
> On Thu, Jul 20, 2023 at 8:22 PM Daniel Gustafsson  wrote:
> >
> > > On 11 Jan 2023, at 15:44, Bharath Rupireddy 
> > >  wrote:
> > >
> > > On Wed, Nov 30, 2022 at 11:43 AM Bharath Rupireddy
> > >  wrote:
> > >>
> > >> I'm attaching the v22 patch set for further review.
> > >
> > > Needed a rebase due to 216a784829c2c5f03ab0c43e009126cbb819e9b2.
> > > Attaching v23 patch set for further review.
> >
> > This thread has stalled for well over 6 months with the patch going from CF 
> > to
> > CF.  From skimming the thread it seems that a lot of the details have been
> > ironed out with most (all?) objections addressed.  Is any committer 
> > interested
> > in picking this up?  If not we should probably mark it returned with 
> > feedback.
>
> Rebase needed due to function oid clash. Picked the new OID with the
> help of src/include/catalog/unused_oids. PSA v24 patch set.

Rebase needed due to changes in parallel_schedule. PSA v25 patch set.

Regards,
Vignesh
From 32fa1d898b2599b836a2265f746acf230ffb358e Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Sat, 4 Nov 2023 18:06:36 +
Subject: [PATCH v25 1/2] Move sending multiplexed-SIGUSR1 signal code to a
 function

Add a new function hosting the common code for sending
multiplexed-SIGUSR1 signal to a backend process. This function
will also be used as-is by an upcoming commit reducing the code
duplication.
---
 src/backend/storage/ipc/procarray.c | 60 +
 src/backend/utils/adt/mcxtfuncs.c   | 49 ++-
 src/include/storage/procarray.h |  1 +
 3 files changed, 64 insertions(+), 46 deletions(-)

diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index ee2d7f8585..f6465529e7 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -3097,6 +3097,66 @@ HaveVirtualXIDsDelayingChkpt(VirtualTransactionId *vxids, int nvxids, int type)
 	return result;
 }
 
+/*
+ * SendProcSignalBackendOrAuxproc -- check if the process with given pid is a
+ * backend or an auxiliary process and send it the SIGUSR1 signal for a given
+ * reason.
+ *
+ * Returns true if sending the signal was successful, false otherwise.
+ */
+bool
+SendProcSignalBackendOrAuxproc(int pid, ProcSignalReason reason)
+{
+	PGPROC	   *proc;
+	BackendId	backendId = InvalidBackendId;
+
+	proc = BackendPidGetProc(pid);
+
+	/*
+	 * See if the process with given pid is a backend or an auxiliary process.
+	 *
+	 * If the given process is a backend, use its backend id in
+	 * SendProcSignal() later to speed up the operation. Otherwise, don't do
+	 * that because auxiliary processes (except the startup process) don't
+	 * have a valid backend id.
+	 */
+	if (proc != NULL)
+		backendId = proc->backendId;
+	else
+		proc = AuxiliaryPidGetProc(pid);
+
+	/*
+	 * BackendPidGetProc() and AuxiliaryPidGetProc() return NULL if the pid
+	 * isn't valid; but by the time we reach kill(), a process for which we
+	 * get a valid proc here might have terminated on its own.  There's no way
+	 * to acquire a lock on an arbitrary process to prevent that. But since
+	 * this mechanism is usually used to debug a backend or an auxiliary
+	 * process running and consuming lots of memory or a long running process,
+	 * that it might end on its own first and its memory contexts are not
+	 * logged or backtrace not logged is not a problem.
+	 */
+	if (proc == NULL)
+	{
+		/*
+		 * This is just a warning so a loop-through-resultset will not abort
+		 * if one backend terminated on its own during the run.
+		 */
+		ereport(WARNING,
+(errmsg("PID %d is not a PostgreSQL server process", pid)));
+		return false;
+	}
+
+	if (SendProcSignal(pid, reason, backendId) < 0)
+	{
+		/* Again, just a warning to allow loops */
+		ereport(WARNING,
+(errmsg("could not send signal to process %d: %m", pid)));
+		return false;
+	}
+
+	return true;
+}
+
 /*
  * BackendPidGetProc -- get a backend's PGPROC given its PID
  *
diff --git a/src/backend/utils/adt/mcxtfuncs.c b/src/backend/utils/adt/mcxtfuncs.c
index 4708d73f5f..f7b4f8dac1 100644
--- a/src/backend/utils/adt/mcxtfuncs.c
+++ b/src/backend/utils/adt/mcxtfuncs.c
@@ -145,51 +145,8 @@ Datum
 pg_log_backend_memory_contexts(PG_FUNCTION_ARGS)
 {
 	int			pid = PG_GETARG_INT32(0);
-	PGPROC	   *proc;
-	BackendId	backendId = InvalidBackendId;
+	bool		result;
 
-	proc = BackendPidGetProc(pid);
-
-	/*
-	 * See if the process with given pid is a backend or an auxiliary process.
-	 *
-	 * If the given process is a backend, use its backend id in
-	 * SendProcSignal() later to speed up the operation. Otherwise, don't do
-	 * that because auxiliary processes (except the startup process) don't
-	 * have a valid backend id.
-	 */
-	if (proc != NULL)
-		backendId = proc->backendId;
-	else
-		proc = AuxiliaryPidGetProc(pid);
-
-	/*
-	 * BackendPidGetProc() and AuxiliaryPidGetProc() return NULL if the pid
-	 * isn't valid; but by the time 

Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-01-25 Thread vignesh C
On Wed, 20 Dec 2023 at 19:17, Jelte Fennema-Nio  wrote:
>
> On Thu, 14 Dec 2023 at 13:57, Jelte Fennema-Nio  wrote:
> > I changed all the places that were not adhering to those spellings.
>
> It seems I forgot a /g on my sed command to do this so it turned out I
> missed one that caused the test to fail to compile... Attached is a
> fixed version.
>
> I also updated the patchset to use the EOF detection provided by
> 0a5c46a7a488f2f4260a90843bb9de6c584c7f4e instead of introducing a new
> way of EOF detection using a -2 return value.

CFBot shows that the patch does not apply anymore as in [1]:
patching file doc/src/sgml/libpq.sgml
...
patching file src/interfaces/libpq/exports.txt
Hunk #1 FAILED at 191.
1 out of 1 hunk FAILED -- saving rejects to file
src/interfaces/libpq/exports.txt.rej
patching file src/interfaces/libpq/fe-connect.c

Please post an updated version for the same.

[1] - http://cfbot.cputube.org/patch_46_3511.log

Regards,
Vignesh




Re: [PATCH] Add native windows on arm64 support

2024-01-25 Thread Michael Paquier
On Thu, Jan 25, 2024 at 04:52:30PM -0500, Dave Cramer wrote:
> On Thu, 25 Jan 2024 at 16:32, Andrew Dunstan  wrote:
>> On 2024-01-25 Th 16:17, Dave Cramer wrote:
>> Yeah, I think the default Developer Command Prompt for VS2022 is set up
>> for x86 builds. AIUI you should start by executing "vcvarsall x64_arm64".
> 
> Yup, now I'm in the same state you are

Wait a minute here.  Based on [1], x64_arm64 means you can use an x64
host and you'll be able to produce ARM64 builds, but these will not
be able to run on the host where they were built.  How much of the
patch posted upthread is required to produce such builds?  Basically
everything from it, I guess, so that build dependencies can be
satisfied?

[1]: 
https://learn.microsoft.com/en-us/cpp/build/building-on-the-command-line?view=msvc-170
--
Michael


signature.asc
Description: PGP signature


RE: Fix inappropriate comments in function ReplicationSlotAcquire

2024-01-25 Thread Wei Wang (Fujitsu)
On Thu, Jan 25, 2024 at 20:33 Amit Kapila  wrote:
> On Thu, Jan 25, 2024 at 4:01 PM Wei Wang (Fujitsu)
>  wrote:
> >
> >
> > Yes, agree. I think these two parts have become slightly outdated after the
> > commit 1632ea4. So also tried to fix the first part of the comment.
> > Attach the new patch.
> >
> 
> How about changing it to something simple like:
> diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
> index f2781d0455..84c257a7aa 100644
> --- a/src/backend/replication/slot.c
> +++ b/src/backend/replication/slot.c
> @@ -465,10 +465,7 @@ retry:
> 
> LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
> 
> -   /*
> -* Search for the slot with the specified name if the slot to acquire 
> is
> -* not given. If the slot is not found, we either return -1 or
> error out.
> -*/
> +/* Check if the slot exists with the given name. */
> s = SearchNamedReplicationSlot(name, false);
> if (s == NULL || !s->in_use)
> {

It looks good to me. So, I updated the patch as suggested.

Regards,
Wang Wei


v3-0001-Fix-inappropriate-comments-in-function-Replicatio.patch
Description:  v3-0001-Fix-inappropriate-comments-in-function-Replicatio.patch


Re: Use of backup_label not noted in log

2024-01-25 Thread Michael Paquier
On Thu, Jan 25, 2024 at 08:56:52AM -0400, David Steele wrote:
> I would still advocate for a back patch here. It is frustrating to get logs
> from users that just say:
> 
> LOG:  invalid checkpoint record
> PANIC:  could not locate a valid checkpoint record
> 
> It would be very helpful to know what the checkpoint record LSN was in this
> case.

Yes, I've been pestered by this one in the past when debugging corruption
issues.  To me, this would just mean appending to the PANIC an "at
%X/%X", but perhaps you have more in mind for these code paths?
--
Michael


signature.asc
Description: PGP signature


Rename setup_cancel_handler in pg_dump

2024-01-25 Thread Yugo NAGATA
Hi,

Attached is a simple patch to rename setup_cancel_handler()
in pg_dump/parallel.c. 

I am proposing it because there is a public function with
the same name in fe_utils/cancel.c. I know pg_dump/parallel.c
does not include fe_utils/cancel.h, so there is no conflict,
but I think it is better to use different names to reduce
possible confusion. 

I guess there were no concerns when setup_cancel_handler in
pg_dump/parallel.c was introduced, because at that time fe_utils
had no function of the same name that could be shared
between client tools. The public setup_cancel_handler in
fe_utils was introduced in a4fd3aa719e, where this function
was moved from psql.

Regards,
Yugo Nagata

-- 
Yugo NAGATA 
diff --git a/src/bin/pg_dump/parallel.c b/src/bin/pg_dump/parallel.c
index 188186829c..261b23cb3f 100644
--- a/src/bin/pg_dump/parallel.c
+++ b/src/bin/pg_dump/parallel.c
@@ -204,7 +204,7 @@ static ParallelSlot *GetMyPSlot(ParallelState *pstate);
 static void archive_close_connection(int code, void *arg);
 static void ShutdownWorkersHard(ParallelState *pstate);
 static void WaitForTerminatingWorkers(ParallelState *pstate);
-static void setup_cancel_handler(void);
+static void pg_dump_setup_cancel_handler(void);
 static void set_cancel_pstate(ParallelState *pstate);
 static void set_cancel_slot_archive(ParallelSlot *slot, ArchiveHandle *AH);
 static void RunWorker(ArchiveHandle *AH, ParallelSlot *slot);
@@ -550,7 +550,7 @@ sigTermHandler(SIGNAL_ARGS)
 	/*
 	 * Some platforms allow delivery of new signals to interrupt an active
 	 * signal handler.  That could muck up our attempt to send PQcancel, so
-	 * disable the signals that setup_cancel_handler enabled.
+	 * disable the signals that pg_dump_setup_cancel_handler enabled.
 	 */
 	pqsignal(SIGINT, SIG_IGN);
 	pqsignal(SIGTERM, SIG_IGN);
@@ -605,7 +605,7 @@ sigTermHandler(SIGNAL_ARGS)
  * Enable cancel interrupt handler, if not already done.
  */
 static void
-setup_cancel_handler(void)
+pg_dump_setup_cancel_handler(void)
 {
 	/*
 	 * When forking, signal_info.handler_set will propagate into the new
@@ -705,7 +705,7 @@ consoleHandler(DWORD dwCtrlType)
  * Enable cancel interrupt handler, if not already done.
  */
 static void
-setup_cancel_handler(void)
+pg_dump_setup_cancel_handler(void)
 {
 	if (!signal_info.handler_set)
 	{
@@ -737,7 +737,7 @@ set_archive_cancel_info(ArchiveHandle *AH, PGconn *conn)
 	 * important that this happen at least once before we fork off any
 	 * threads.
 	 */
-	setup_cancel_handler();
+	pg_dump_setup_cancel_handler();
 
 	/*
 	 * On Unix, we assume that storing a pointer value is atomic with respect


Re: pg_column_toast_chunk_id: a function to get a chunk ID of a TOASTed value

2024-01-25 Thread Yugo NAGATA
On Tue, 2 Jan 2024 08:00:00 +0800
jian he  wrote:

> On Mon, Nov 6, 2023 at 8:00 AM jian he  wrote:
> >
> > minor doc issues.
> > Returns the chunk id of the TOASTed value, or NULL if the value is not 
> > TOASTed.
> > Should it be "chunk_id"?

Thank you for your suggestion. As you pointed out, it is called "chunk_id" 
in the documentation, so I rewrote it and also added a link to the section
where the TOAST table structure is explained.

> > you may place it after pg_create_logical_replication_slot entry to
> > make it look like alphabetical order.

I've been thinking about where we should place the function in the docs,
and I decided to place it in the table of Database Object Size Functions,
because I think pg_column_toast_chunk_id would also help users understand
the results of the size functions, similarly to pg_column_compression;
that is, these functions can explain why a large value could be stored in
a column.

> > There is no test. maybe we can add following to 
> > src/test/regress/sql/misc.sql
> > create table val(t text);
> > INSERT into val(t) SELECT string_agg(
> >   chr((ascii('B') + round(random() * 25)) :: integer),'')
> > FROM generate_series(1,2500);
> > select pg_column_toast_chunk_id(t) is  not null from val;
> > drop table val;

Thank you for the test proposal. However, if we add a test, I want
to check that the chunk_id returned by the function exists in the
TOAST table, and that the function returns NULL if the value is not
TOASTed. For that purpose, I wrote a test using dynamic SQL, since the
name of the TOAST table has to be generated from the main table's OID.
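
Roughly, the check has the following shape (a sketch only, with made-up
object names; it assumes the patched pg_column_toast_chunk_id() function is
available, and the real regression test may differ):

    CREATE TABLE toast_test (t text);
    -- large enough that the value is stored out of line in the TOAST table
    INSERT INTO toast_test
        SELECT string_agg(chr((ascii('B') + round(random() * 25))::int), '')
        FROM generate_series(1, 10000);

    DO $$
    DECLARE
        toast_rel regclass;
        found bool;
    BEGIN
        -- the TOAST table's name is derived from the main table's OID,
        -- hence the dynamic SQL
        SELECT reltoastrelid::regclass INTO toast_rel
        FROM pg_class WHERE oid = 'toast_test'::regclass;

        EXECUTE format(
            'SELECT EXISTS (SELECT 1 FROM %s WHERE chunk_id = '
            '(SELECT pg_column_toast_chunk_id(t) FROM toast_test LIMIT 1))',
            toast_rel)
        INTO found;

        RAISE NOTICE 'chunk_id found in TOAST table: %', found;
    END
    $$;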

> Hi
> the main C function (pg_column_toast_chunk_id)  I didn't change.
> I added tests as mentioned above.
> tests put it on src/test/regress/sql/misc.sql, i hope that's fine.
> I placed pg_column_toast_chunk_id in "Table 9.99. Database Object
> Location Functions" (below Table 9.98. Database Object Size
> Functions).

I could not find any change in your patch compared to my previous one;
maybe you attached the wrong file. I attached an updated patch based
on your review, including the documentation fixes and a test.
What do you think about it?

Regards,
Yugo Nagata

-- 
Yugo NAGATA 
>From 97af1b2300ecf07a34923da87a9d84e6aa963956 Mon Sep 17 00:00:00 2001
From: Yugo Nagata 
Date: Wed, 29 Mar 2023 09:59:25 +0900
Subject: [PATCH v3] Add pg_column_toast_chunk_id function

This function returns the chunk_id of an on-disk TOASTed value, or
NULL if the value is un-TOASTed or not on disk. This enables users to
know which columns are actually TOASTed. This function is also useful
to identify a problematic row when an error like
  "ERROR:  unexpected chunk number ... (expected ...) for toast value"
occurs.
---
 doc/src/sgml/func.sgml  | 17 ++
 src/backend/utils/adt/varlena.c | 41 +
 src/include/catalog/pg_proc.dat |  3 +++
 3 files changed, 61 insertions(+)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 210c7c0b02..2d82331323 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -28078,6 +28078,23 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset

   
 
+  
+   
+
+ pg_column_toast_chunk_id
+
+pg_column_toast_chunk_id ( "any" )
+oid
+   
+   
+Shows the chunk_id of an on-disk
+TOASTed value. Returns NULL
+if the value is un-TOASTed or not on-disk.
+See  for details about
+TOAST.
+   
+  
+
   

 
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 543afb66e5..84d36781a4 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -5105,6 +5105,47 @@ pg_column_compression(PG_FUNCTION_ARGS)
 	PG_RETURN_TEXT_P(cstring_to_text(result));
 }
 
+/*
+ * Return the chunk_id of the on-disk TOASTed value.
+ * Return NULL if the value is unTOASTed or not on disk.
+ */
+Datum
+pg_column_toast_chunk_id(PG_FUNCTION_ARGS)
+{
+	int			typlen;
+	struct varlena *attr;
+	struct varatt_external toast_pointer;
+
+	/* On first call, get the input type's typlen, and save at *fn_extra */
+	if (fcinfo->flinfo->fn_extra == NULL)
+	{
+		/* Lookup the datatype of the supplied argument */
+		Oid			argtypeid = get_fn_expr_argtype(fcinfo->flinfo, 0);
+
+		typlen = get_typlen(argtypeid);
+		if (typlen == 0)		/* should not happen */
+			elog(ERROR, "cache lookup failed for type %u", argtypeid);
+
+		fcinfo->flinfo->fn_extra = MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,
+	  sizeof(int));
+		*((int *) fcinfo->flinfo->fn_extra) = typlen;
+	}
+	else
+		typlen = *((int *) fcinfo->flinfo->fn_extra);
+
+	if (typlen != -1)
+		PG_RETURN_NULL();
+
+	attr = (struct varlena *) DatumGetPointer(PG_GETARG_DATUM(0));
+
+	if (!VARATT_IS_EXTERNAL_ONDISK(attr))
+		PG_RETURN_NULL();
+
+	VARATT_EXTERNAL_GET_POINTER(toast_pointer, attr);
+
+	

Re: speed up a logical replica setup

2024-01-25 Thread Euler Taveira
On Thu, Jan 25, 2024, at 6:05 AM, Hayato Kuroda (Fujitsu) wrote:
> 01.
> ```
> /* Options */
> static char *pub_conninfo_str = NULL;
> static SimpleStringList database_names = {NULL, NULL};
> static int wait_seconds = DEFAULT_WAIT;
> static bool retain = false;
> static bool dry_run = false;
> ```
> 
> Just to confirm - is there a policy why we store the specified options? If you
> want to store as global ones, username and port should follow (my fault...).
> Or, should we have a structure to store them?

It is a matter of style, I would say. Check other client applications: some of
them also use global variables, while others group options into a
struct. Since these variables have a short lifetime, I don't think the
current style is harmful.

> 04.
> ```
> {"dry-run", no_argument, NULL, 'n'},
> ```
> 
> I'm not sure why the dry_run mode exists. In terms pg_resetwal, it shows the
> which value would be changed based on the input. As for the pg_upgrade, it 
> checks
> whether the node can be upgraded for now. I think, we should have the checking
> feature, so it should be renamed to --check. Also, the process should exit 
> earlier
> at that time.

It is extremely useful because it lets you (a) verify that an existing physical
replication setup is prepared for logical replication, (b) check GUCs (is
max_wal_senders sufficient for this pg_subscriber command? Is
max_replication_slots sufficient to set up logical replication even though
some replication slots are already in use?), (c) check connectivity, and (d)
check credentials.
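
As a rough illustration of item (b), and only a sketch of what such a check
amounts to (not code from the patch), the settings to inspect on the source
server would be along the lines of:

    -- the source server needs wal_level = logical plus enough free
    -- replication slots and WAL senders for the databases involved
    SELECT name, setting
      FROM pg_settings
     WHERE name IN ('wal_level', 'max_replication_slots', 'max_wal_senders');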

> 05.
> I felt we should accept some settings from environment variables, like 
> pg_upgrade.
> Currently, the below items should be accepted.
> 
> - data directory
> - username
> - port
> - timeout

Maybe PGDATA.

> 06.
> ```
> pg_logging_set_level(PG_LOG_WARNING);
> ```
> 
> If the default log level is warning, there are no ways to output debug logs.
> (-v option only raises one, so INFO would be output)
> I think it should be PG_LOG_INFO.

You need to specify multiple -v options.

> 07.
> Can we combine verifications into two functions, e.g., check_primary() and 
> check_standby/check_subscriber()?

I think v9 does it.

> 08.
> Not sure, but if we want to record outputs by pg_subscriber, the sub-directory
> should be created. The name should contain the timestamp.

The log file already contains the timestamp. Why?

> 09.
> Not sure, but should we check max_slot_wal_keep_size of primary server? It can
> avoid to fail starting of logical replicaiton.

Is a physical replication setup that was already broken *before* running this
tool really its responsibility? Hmm. We might add another check that can be
reported during dry run mode.

> 10.
> ```
> nslots_new = nslots_old + dbarr.ndbs;
> 
> if (nslots_new > max_replication_slots)
> {
> pg_log_error("max_replication_slots (%d) must be greater than or equal to "
> "the number of replication slots required (%d)", max_replication_slots, 
> nslots_new);
> exit(1);
> }
> ```
> 
> I think standby server must not have replication slots. Because subsequent
> pg_resetwal command discards all the WAL file, so WAL records pointed by them
> are removed. Currently pg_resetwal does not raise ERROR at that time.

Again, dry run mode might provide a message for it.

> 11.
> ```
> /*
> * Stop the subscriber if it is a standby server. Before executing the
> * transformation steps, make sure the subscriber is not running because
> * one of the steps is to modify some recovery parameters that require a
> * restart.
> */
> if (stat(pidfile, ) == 0)
> ```
> 
> I kept just in case, but I'm not sure it is still needed. How do you think?
> Removing it can reduce an inclusion of pidfile.h.

Are you suggesting another way to check if the standby is up and running?

> 12.
> ```
> pg_ctl_cmd = psprintf("\"%s/pg_ctl\" stop -D \"%s\" -s",
>   standby.bindir, standby.pgdata);
> rc = system(pg_ctl_cmd);
> pg_ctl_status(pg_ctl_cmd, rc, 0);
> ```
> 
> 
> There are two places to stop the instance. Can you divide it into a function?

Yes.

> 13.
> ```
> * A temporary replication slot is not used here to avoid keeping a
> * replication connection open (depending when base backup was taken, the
> * connection should be open for a few hours).
> */
> conn = connect_database(primary.base_conninfo, dbarr.perdb[0].dbname);
> if (conn == NULL)
> exit(1);
> consistent_lsn = create_logical_replication_slot(conn, true,
> [0]);
> ```
> 
> I didn't notice the comment, but still not sure the reason. Why we must 
> reserve
> the slot until pg_subscriber finishes? IIUC, the slot would be never used, it
> is created only for getting a consistent_lsn. So we do not have to keep.
> Also, just before, logical replication slots for each databases are created, 
> so
> WAL records are surely reserved.

This comment needs to be updated. It was written at the time I was pursuing
base backup support too. It doesn't matter if you remove this transient
replication slot earlier because all of the replication slots created to the

Re: Make COPY format extendable: Extract COPY TO format implementations

2024-01-25 Thread Michael Paquier
On Thu, Jan 25, 2024 at 05:45:43PM +0900, Sutou Kouhei wrote:
> In 
>   "Re: Make COPY format extendable: Extract COPY TO format implementations" 
> on Thu, 25 Jan 2024 12:17:55 +0900,
>   Michael Paquier  wrote:
>> +extern CopyToRoutine CopyToRoutineText;
>> +extern CopyToRoutine CopyToRoutineCSV;
>> +extern CopyToRoutine CopyToRoutineBinary;
>> 
>> All that should IMO remain in copyto.c and copyfrom.c in the initial
>> patch doing the refactoring.  Why not using a fetch function instead
>> that uses a string in input?  Then you can call that once after
>> parsing the List of options in ProcessCopyOptions().
> 
> OK. How about the following for the fetch function
> signature?
> 
> extern CopyToRoutine *GetBuiltinCopyToRoutine(const char *format);

Or CopyToRoutineGet()?  I am not wedded to my suggestion, got a bad
history with naming things around here.

> We may introduce an enum and use it:
> 
> typedef enum CopyBuiltinFormat
> {
>   COPY_BUILTIN_FORMAT_TEXT = 0,
>   COPY_BUILTIN_FORMAT_CSV,
>   COPY_BUILTIN_FORMAT_BINARY,
> } CopyBuiltinFormat;
> 
> extern CopyToRoutine *GetBuiltinCopyToRoutine(CopyBuiltinFormat format);

I am not sure that this is necessary as the option value is a string.
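
To make the string-based variant concrete, a minimal sketch (using the names
mentioned upthread; not committed code, and the routines could just as well
stay private to copyto.c) might look like this:

    CopyToRoutine *
    GetBuiltinCopyToRoutine(const char *format)
    {
        /* "format" is the FORMAT option string parsed in ProcessCopyOptions() */
        if (strcmp(format, "text") == 0)
            return &CopyToRoutineText;
        if (strcmp(format, "csv") == 0)
            return &CopyToRoutineCSV;
        if (strcmp(format, "binary") == 0)
            return &CopyToRoutineBinary;

        return NULL;            /* caller reports the unrecognized format */
    }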

> Oh, sorry. I assumed that the comment style was adjusted by
> pgindent.

No worries, that's just something we get used to.  I tend to fix a lot
of these things by myself when editing patches.

>> +getTypeBinaryOutputInfo(attr->atttypid, _func_oid, );
>> +fmgr_info(out_func_oid, >out_functions[attnum - 1]);
>> 
>> Actually, this split is interesting.  It is possible for a custom
>> format to plug in a custom set of out functions.  Did you make use of
>> something custom for your own stuff?
> 
> I didn't. My PoC custom COPY format handler for Apache Arrow
> just handles integer and text for now. It doesn't use
> cstate->out_functions because cstate->out_functions may not
> return a valid binary format value for Apache Arrow. So it
> formats each value by itself.

I mean, if you use a custom output function, you could tweak things
even more with byteas or such.  If a callback is expected to do
something, like setting the output function OIDs in the start
callback, we'd better document it rather than letting that be implied.

>>   Actually, could it make sense to
>> split the assignment of cstate->out_functions into its own callback?
> 
> Yes. Because we need to use getTypeBinaryOutputInfo() for
> "binary" and use getTypeOutputInfo() for "text" and "csv".

Okay.  After sleeping on it, a split makes sense here, because it also
reduces the presence of TupleDesc in the start callback.

>> Sure, that's part of the start phase, but at least it would make clear
>> that a custom method *has* to assign these OIDs to work.  The patch
>> implies that as a rule, without a comment that CopyToStart *must* set
>> up these OIDs.
> 
> CopyToStart doesn't need to set up them if the handler
> doesn't use cstate->out_functions.

Noted.

>> I think that 0001 and 0005 should be handled first, as pieces
>> independent of the rest.  Then we could move on with 0002~0004 and
>> 0006~0008.
> 
> OK. I'll focus on 0001 and 0005 for now. I'll restart
> 0002-0004/0006-0008 after 0001 and 0005 are accepted.

Once you get these, I'd be interested in re-doing an evaluation of
COPY TO and more tests with COPY FROM while running Postgres on
scissors.  One thing I was thinking to use here is my blackhole_am for
COPY FROM:
https://github.com/michaelpq/pg_plugins/tree/main/blackhole_am

As per its name, it does nothing on INSERT, so you could create a
table using it as access method, and stress the COPY FROM execution
paths without having to mount Postgres on a tmpfs because the data is
sent to the void.  Perhaps it does not matter, but that moves the
tests to the bottlenecks we want to stress (aka the per-row callback
for large data sets).
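
For reference, using it for that purpose looks something like the following
(paths and table names are placeholders; this assumes the extension from the
repository above is installed):

    CREATE EXTENSION blackhole_am;
    -- every inserted row is discarded, so COPY FROM only exercises the
    -- input parsing and per-row callback paths, not storage
    CREATE TABLE copy_stress (a int, b text) USING blackhole_am;
    COPY copy_stress FROM '/tmp/large_input.csv' WITH (FORMAT csv);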

I've switched the patch to waiting on author for now.  Thanks for your
perseverance here.  I understand that it's not easy to keep following up with
patches and reviews (^_^;)
--
Michael


signature.asc
Description: PGP signature


Re: speed up a logical replica setup

2024-01-25 Thread Euler Taveira
On Tue, Jan 23, 2024, at 10:29 PM, Euler Taveira wrote:
> I'll post a new one soon.

I'm attaching another patch that fixes some of the issues pointed out by
Hayato, Shlok, and Junwang.

* publication doesn't exist. The analysis [1] was done by Hayato but I didn't
  use the proposed patch. Instead I refactored the code a bit [2] and call it
  from a new function (setup_publisher) that is called before the promotion.
* fix wrong path name in the initial comment [3]
* change terminology: logical replica -> physical replica [3]
* primary / standby is ready for logical replication? setup_publisher() and
  setup_subscriber() check if the required GUCs are set accordingly. For the
  primary, it checks that wal_level = logical, that max_replication_slots has
  enough remaining replication slots for the proposed setup, and that
  max_wal_senders has senders available. For the standby, it checks
  max_replication_slots for the replication origin and also the remaining
  number of background workers available to start the subscriber.
* retain option: I extracted this one from Hayato's patch [4]
* target server must be a standby. It seems we agree that this restriction
  simplifies the code a bit but can be relaxed in the future (if/when base
  backup support is added.)
* recovery timeout option: I decided to include it but I think the use case is
  too narrow. It helps in broken setups. However, it can be an issue in some
  scenarios like a time-delayed replica, large replication lag, slow hardware,
  or a slow network. I didn't use the proposed patch [5]. Instead, I came up
  with a simple one that defaults to forever. The proposed one defaults to 60
  seconds, but I'm afraid that in one of the scenarios mentioned above we would
  cancel a legitimate case. Maybe we should add a message during dry run saying
  that, due to replication lag, it will take longer to run.
* refactor primary_slot_name code. With the new setup_publisher and
  setup_subscriber functions, I split the function that detects
  primary_slot_name usage into two pieces, just to avoid extra connections to
  get the job done.
* remove fallback_application_name as suggested by Hayato [5] because logical
  replication already includes one.

I'm still thinking about replacing --subscriber-conninfo with separate items
(username, port, password?, host = socket dir). Maybe that is overengineering.
The user can always prepare the environment to avoid unwanted and/or external
connections.

I haven't changed the name from pg_subscriber to pg_createsubscriber yet, but
if I don't hear objections about it, I'll do it in the next patch.


[1] 
https://www.postgresql.org/message-id/TY3PR01MB9889C5D55206DDD978627D07F5752%40TY3PR01MB9889.jpnprd01.prod.outlook.com
[2] 
https://www.postgresql.org/message-id/73ab86ca-3fd5-49b3-9c80-73d1525202f1%40app.fastmail.com
[3] 
https://www.postgresql.org/message-id/TY3PR01MB9889678E47B918F4D83A6FD8F57B2%40TY3PR01MB9889.jpnprd01.prod.outlook.com
[4] 
https://www.postgresql.org/message-id/TY3PR01MB9889678E47B918F4D83A6FD8F57B2%40TY3PR01MB9889.jpnprd01.prod.outlook.com
[5] 
https://www.postgresql.org/message-id/TY3PR01MB9889593399165B9A04106741F5662%40TY3PR01MB9889.jpnprd01.prod.outlook.com


--
Euler Taveira
EDB   https://www.enterprisedb.com/
From d9ef01a806c3d8697faa444283f19c2deaa58850 Mon Sep 17 00:00:00 2001
From: Euler Taveira 
Date: Mon, 5 Jun 2023 14:39:40 -0400
Subject: [PATCH v9] Creates a new logical replica from a standby server

A new tool called pg_subscriber can convert a physical replica into a
logical replica. It runs on the target server and should be able to
connect to the source server (publisher) and the target server
(subscriber).

The conversion requires a few steps. Check if the target data directory
has the same system identifier than the source data directory. Stop the
target server if it is running as a standby server. Create one
replication slot per specified database on the source server. One
additional replication slot is created at the end to get the consistent
LSN (This consistent LSN will be used as (a) a stopping point for the
recovery process and (b) a starting point for the subscriptions). Write
recovery parameters into the target data directory and start the target
server (Wait until the target server is promoted). Create one
publication (FOR ALL TABLES) per specified database on the source
server. Create one subscription per specified database on the target
server (Use replication slot and publication created in a previous step.
Don't enable the subscriptions yet). Sets the replication progress to
the consistent LSN that was got in a previous step. Enable the
subscription for each specified database on the target server. Remove
the additional replication slot that was used to get the consistent LSN.
Stop the target server. Change the system identifier from the target
server.

Depending on your workload and database size, creating a logical replica
couldn't be an option due to resource constraints (WAL backlog should be
available until all table data 

Re: Guiding principle for dropping LLVM versions?

2024-01-25 Thread Mark Wong

On 1/25/24 13:41, Thomas Munro wrote:

On Thu, Jan 25, 2024 at 4:44 PM Thomas Munro  wrote:

... A few build farm animals will
now fail in the configure step as discussed, and need some adjustment
(ie disable LLVM or upgrade to LLVM 10+ for the master branch).


Owners pinged.


I think I fixed up the 4 or 6 under my name...

Regards,
Mark





Re: [PATCH] Add native windows on arm64 support

2024-01-25 Thread Dave Cramer
On Thu, 25 Jan 2024 at 16:32, Andrew Dunstan  wrote:

>
> On 2024-01-25 Th 16:17, Dave Cramer wrote:
>
>
>
> On Thu, 25 Jan 2024 at 16:04, Anthony Roberts 
> wrote:
>
>> Hi David,
>>
>> Unix "file" or "dumpbin /headers" in vcvarsall are your best bets.
>>
>> Thanks,
>> Anthony
>>
>
>
> So there is another way, select the file in Windows Explorer and right
> click, in the compatibility tab if the "Windows on ARM" is greyed out it is
> an arm binary.
>
> So far mine are not :(
>
>
> Yeah, I think the default Developer Command Prompt for VS2022 is set up
> for x86 builds. AIUI you should start by executing "vcvarsall x64_arm64".
>

Yup, now I'm in the same state you are

Dave


Re: Use of backup_label not noted in log

2024-01-25 Thread David Steele

On 1/25/24 17:42, Tom Lane wrote:

David Steele  writes:

Another thing to note here -- knowing the LSN is important but also
knowing that backup recovery was attempted (i.e. backup_label exists) is
really crucial. Knowing both just saves so much time in back and forth
debugging.



It appears the tally for back patching is:



For: Andres, David, Michael B
Not Sure: Robert, Laurenz, Michael P



It seems at least nobody is dead set against it.


We're talking about 1d35f705e, right?  That certainly looks harmless
and potentially useful.  I'm +1 for back-patching.


That's the one. If we were modifying existing messages I would be 
against it, but new, infrequent (but oh so helpful) messages seem fine.


Regards,
-David




Re: Use of backup_label not noted in log

2024-01-25 Thread Tom Lane
David Steele  writes:
> Another thing to note here -- knowing the LSN is important but also 
> knowing that backup recovery was attempted (i.e. backup_label exists) is 
> really crucial. Knowing both just saves so much time in back and forth 
> debugging.

> It appears the tally for back patching is:

> For: Andres, David, Michael B
> Not Sure: Robert, Laurenz, Michael P

> It seems at least nobody is dead set against it.

We're talking about 1d35f705e, right?  That certainly looks harmless
and potentially useful.  I'm +1 for back-patching.

regards, tom lane




Re: Guiding principle for dropping LLVM versions?

2024-01-25 Thread Thomas Munro
On Thu, Jan 25, 2024 at 4:44 PM Thomas Munro  wrote:
> ... A few build farm animals will
> now fail in the configure step as discussed, and need some adjustment
> (ie disable LLVM or upgrade to LLVM 10+ for the master branch).

Owners pinged.




Re: Use of backup_label not noted in log

2024-01-25 Thread David Steele

On 1/25/24 09:29, Michael Banck wrote:

Hi,

On Thu, Jan 25, 2024 at 08:56:52AM -0400, David Steele wrote:

I would still advocate for a back patch here. It is frustrating to get logs
from users that just say:

LOG:  invalid checkpoint record
PANIC:  could not locate a valid checkpoint record

It would be very helpful to know what the checkpoint record LSN was in this
case.


I agree.


Another thing to note here -- knowing the LSN is important but also 
knowing that backup recovery was attempted (i.e. backup_label exists) is 
really crucial. Knowing both just saves so much time in back and forth 
debugging.


It appears the tally for back patching is:

For: Andres, David, Michael B
Not Sure: Robert, Laurenz, Michael P

It seems at least nobody is dead set against it.

Regards,
-David




Re: [PATCH] Add native windows on arm64 support

2024-01-25 Thread Andrew Dunstan


On 2024-01-25 Th 16:17, Dave Cramer wrote:



On Thu, 25 Jan 2024 at 16:04, Anthony Roberts 
 wrote:


Hi David,

Unix "file" or "dumpbin /headers" in vcvarsall are your best bets.

Thanks,
Anthony



So there is another way, select the file in Windows Explorer and right 
click, in the compatibility tab if the "Windows on ARM" is greyed out 
it is an arm binary.


So far mine are not :(



Yeah, I think the default Developer Command Prompt for VS2022 is set up 
for x86 builds. AIUI you should start by executing "vcvarsall x64_arm64".



cheers


andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: More new SQL/JSON item methods

2024-01-25 Thread Andrew Dunstan



On 2024-01-25 Th 15:58, Tom Lane wrote:

I wrote:

There's something else going on, because I'm still getting the
assertion failure on my Mac with this fix in place.  Annoyingly,
it goes away if I compile with -O0, so it's kind of hard to
identify what's going wrong.

No, belay that: I must've got confused about which version I was
testing.  It's very unclear to me why the undefined reference
causes the preceding Assert to misbehave, but that is clearly
what's happening.  Compiler bug maybe?  My Mac has clang 15.0.0,
and the unhappy buildfarm members are also late-model clang.

Anyway, I did note that the preceding line

res = jperOk;

is dead code and might as well get removed while you're at it.





OK, pushed those. Thanks.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: [PATCH] Add native windows on arm64 support

2024-01-25 Thread Dave Cramer
On Thu, 25 Jan 2024 at 16:04, Anthony Roberts 
wrote:

> Hi David,
>
> Unix "file" or "dumpbin /headers" in vcvarsall are your best bets.
>
> Thanks,
> Anthony
>


So there is another way, select the file in Windows Explorer and right
click, in the compatibility tab if the "Windows on ARM" is greyed out it is
an arm binary.

So far mine are not :(

Thanks,

Dave

>


Re: [PATCH] Add native windows on arm64 support

2024-01-25 Thread Andrew Dunstan


On 2024-01-25 Th 15:56, Dave Cramer wrote:



On Thu, 25 Jan 2024 at 14:31, Andrew Dunstan  wrote:


On 2024-01-25 Th 08:45, Dave Cramer wrote:





I tried running my buildfarm using my git repo and my branch, but
get the following error
Status Line: 492 bad branch parameter
Content:
bad branch parameter fix_arm

Web txn failed with status: 1




You can't use your own branch with the buildfarm, you need to let
it set up its own branches.


So I guess I have to wait until this patch is merged in ?




Not quite.

There's a switch that lets you test your own code. If you say 
--from-source /path/to/repo  it will run in whatever state the repo is 
in. It won't care what branch you are on, and it won't try to update the 
repo. It won't report the results to the server, but it will build and 
test everything just like in a regular run. (On the "eat your own dog 
food" principle I use this mode a lot.) If you use that mode you 
probably also want to specify --verbose as well.



cheers


andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: Remove unused fields in ReorderBufferTupleBuf

2024-01-25 Thread reid . thompson
On Thu, 2024-01-25 at 16:16 +0300, Aleksander Alekseev wrote:
> Hi,
> 
> > > Here is the corrected patch.
> > 
> > Thank you for updating the patch! I have some comments:
> 
> Thanks for the review.
> 
> > -tuple = (ReorderBufferTupleBuf *)
> > +tuple = (HeapTuple)
> >  MemoryContextAlloc(rb->tup_context,
> > -
> > sizeof(ReorderBufferTupleBuf) +
> > +   HEAPTUPLESIZE +
> > MAXIMUM_ALIGNOF +
> > alloc_len);
> > -tuple->alloc_tuple_size = alloc_len;
> > -tuple->tuple.t_data = ReorderBufferTupleBufData(tuple);
> > +tuple->t_data = (HeapTupleHeader)((char *)tuple + HEAPTUPLESIZE);
> > 
> > Why do we need to add MAXIMUM_ALIGNOF? since HEAPTUPLESIZE is the
> > MAXALIGN'd size, it seems we don't need it. heap_form_tuple() does a
> > similar thing but it doesn't add it.
> 
> Indeed. I gave it a try and nothing crashed, so it appears that
> MAXIMUM_ALIGNOF is not needed.
> 
> > ---
> >  xl_parameter_change *xlrec =
> > -(xl_parameter_change *)
> > XLogRecGetData(buf->record);
> > +(xl_parameter_change *)
> > XLogRecGetData(buf->record);
> > 
> > Unnecessary change.
> 
> That's pgindent. Fixed.
> 
> > ---
> > -/*
> > - * Free a ReorderBufferTupleBuf.
> > - */
> > -void
> > -ReorderBufferReturnTupleBuf(ReorderBuffer *rb, ReorderBufferTupleBuf 
> > *tuple)
> > -{
> > -pfree(tuple);
> > -}
> > -
> > 
> > Why does ReorderBufferReturnTupleBuf need to be moved from
> > reorderbuffer.c to reorderbuffer.h? It seems not related to this
> > refactoring patch so I think we should do it in a separate patch if we
> > really want it. I'm not sure it's necessary, though.
> 
> OK, fixed.
> 

I walked through v6 and didn't note any issues.

I do want to ask, the patch alters ReorderBufferReturnTupleBuf() and
drops the unused parameter ReorderBuffer *rb. It seems that
ReorderBufferFreeSnap(), ReorderBufferReturnRelids(),
ReorderBufferImmediateInvalidation(), and ReorderBufferRestoreCleanup()
also pass ReorderBuffer *rb but do not use it. Would it be beneficial
to implement a separate patch to remove this parameter from these
functions also?

Thanks,
Reid





Re: [PATCH] Add native windows on arm64 support

2024-01-25 Thread Anthony Roberts
Hi David,

Unix "file" or "dumpbin /headers" in vcvarsall are your best bets.

Thanks,
Anthony

On Thu, 25 Jan 2024, 21:01 Dave Cramer,  wrote:

>
>
> On Thu, 25 Jan 2024 at 12:30, Andrew Dunstan  wrote:
>
>>
>> On 2024-01-24 We 19:02, Michael Paquier wrote:
>>
>> On Wed, Jan 24, 2024 at 06:45:21AM -0500, Dave Cramer wrote:
>>
>> I managed to get it to build the vcvarsall arch needs to be x64. I need to
>> add some options, but the patch above needs to be applied to build it.
>>
>> Nice.  If I may ask, what kind of host and/or configuration have you
>> used to reach a state where the code can be compiled and run tests
>> with meson?  If you have found specific steps, it may be a good thing
>> to document that on the wiki, say around [1].
>>
>> Perhaps you have not included TAP?  It may be fine in terms of runtime
>> checks and coverage.
>>
>> [1]: 
>> https://wiki.postgresql.org/wiki/PostgreSQL_Buildfarm_Howto#Running_on_Windows
>>
>>
>>
>> I now have an ARM64 machine, so I set up a W11 ARM64 VM. I think we
>> really want to build with x64_arm64, i.e. to generate native arm64
>> binaries. Setting just x64 will not do that, AIUI.
>>
>> I tried that with the buidfarm, setting that in the config file's call to
>> PGBuild::VSenv::getenv().
>>
>> That upset msvc_gendef.pl, so I added this there to keep it happy:
>>
>> $arch = 'x86_64' if $arch eq 'aarch64';
>>
>> After that things went ok until I got this:
>>
>> [1453/2088] "link" @src/backend/postgres.exe.rsp
>> FAILED: src/backend/postgres.exe src/backend/postgres.pdb
>> "link" @src/backend/postgres.exe.rsp
>>Creating library src\backend\postgres.exe.lib
>> storage_lmgr_s_lock.c.obj : error LNK2019: unresolved external symbol
>> _mm_pause referenced in function perform_spin_delay
>> src\backend\postgres.exe : fatal error LNK1120: 1 unresolved externals
>>
>>
>> I haven't made further progress, but I will return to it in the next day
>> or so.
>>
>> While this will be nice to have, I think it won't really matter until
>> there is ARM64 support in released versions of Windows Server. AFAICT they
>> still only sell versions for x86_64
>>
>
> I've tried it with my patch attached previously and x64_arm64 and it works
> fine. builds using the buildfarm as well.
>
> Is there a definitive way to figure out if the binaries are x64_arm64
>
> Dave
>


Re: More new SQL/JSON item methods

2024-01-25 Thread Andrew Dunstan



On 2024-01-25 Th 15:33, Tom Lane wrote:

Andrew Dunstan  writes:

On 2024-01-25 Th 14:31, Tom Lane wrote:

(The reported crashes seem to be happening later during a
recursive invocation, seemingly because JsonbType(jb) is
returning garbage.  So there may be another bug after this one.)

I don't think so. AIUI The first call deals with the '$' and the second
one deals with the '.string()', which is why we see the error on the
second call.

There's something else going on, because I'm still getting the
assertion failure on my Mac with this fix in place.  Annoyingly,
it goes away if I compile with -O0, so it's kind of hard to
identify what's going wrong.





Curiouser and curiouser. On my Mac the error is manifest but the fix you 
suggested cures it. Built with -O2 -g, clang 15.0.0, Apple Silicon.



cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: [PATCH] Add native windows on arm64 support

2024-01-25 Thread Dave Cramer
On Thu, 25 Jan 2024 at 12:30, Andrew Dunstan  wrote:

>
> On 2024-01-24 We 19:02, Michael Paquier wrote:
>
> On Wed, Jan 24, 2024 at 06:45:21AM -0500, Dave Cramer wrote:
>
> I managed to get it to build the vcvarsall arch needs to be x64. I need to
> add some options, but the patch above needs to be applied to build it.
>
> Nice.  If I may ask, what kind of host and/or configuration have you
> used to reach a state where the code can be compiled and run tests
> with meson?  If you have found specific steps, it may be a good thing
> to document that on the wiki, say around [1].
>
> Perhaps you have not included TAP?  It may be fine in terms of runtime
> checks and coverage.
>
> [1]: 
> https://wiki.postgresql.org/wiki/PostgreSQL_Buildfarm_Howto#Running_on_Windows
>
>
>
> I now have an ARM64 machine, so I set up a W11 ARM64 VM. I think we really
> want to build with x64_arm64, i.e. to generate native arm64 binaries.
> Setting just x64 will not do that, AIUI.
>
> I tried that with the buidfarm, setting that in the config file's call to
> PGBuild::VSenv::getenv().
>
> That upset msvc_gendef.pl, so I added this there to keep it happy:
>
> $arch = 'x86_64' if $arch eq 'aarch64';
>
> After that things went ok until I got this:
>
> [1453/2088] "link" @src/backend/postgres.exe.rsp
> FAILED: src/backend/postgres.exe src/backend/postgres.pdb
> "link" @src/backend/postgres.exe.rsp
>Creating library src\backend\postgres.exe.lib
> storage_lmgr_s_lock.c.obj : error LNK2019: unresolved external symbol
> _mm_pause referenced in function perform_spin_delay
> src\backend\postgres.exe : fatal error LNK1120: 1 unresolved externals
>
>
> I haven't made further progress, but I will return to it in the next day
> or so.
>
> While this will be nice to have, I think it won't really matter until
> there is ARM64 support in released versions of Windows Server. AFAICT they
> still only sell versions for x86_64
>

I've tried it with my previously attached patch and x64_arm64, and it works
fine. It builds using the buildfarm as well.

Is there a definitive way to figure out if the binaries are x64_arm64?

Dave


[Doc] Improvements to ddl.sgl Privileges Section and Glossary

2024-01-25 Thread David G. Johnston
Hey,

In a nearby user complaint email [1], some missing information regarding
ownership reassignment came to light.  I took that and went a bit further,
adding what I felt was other missing information and context for how the
privilege system is designed.  I've tried to formalize and label existing
concepts a bit and updated the glossary accordingly.

The attached is a partial rewrite of the patch on the linked post after
those comments were taken into consideration.

The new glossary entry for privileges defines various qualifications of the
term that are used in the new prose.  I've marked up each of the variants
and point them all back to the main entry.  I didn't try to incorporate
those terms, or even really look, anywhere else in the documentation.  If
the general idea is accepted that kind of work can be done as a follow-up.

David J.

[1]
https://www.postgresql.org/message-id/d294818d12280f6223ddf169ab5454927f5186b6.camel%40cybertec.at
From a4d6a599a0b5d6b8f280e3d8489e7f4a4a555383 Mon Sep 17 00:00:00 2001
From: "David G. Johnston" 
Date: Thu, 25 Jan 2024 13:41:48 -0700
Subject: [PATCH] v1-improvements-to-ddl-priv Section

---
 doc/src/sgml/ddl.sgml  | 109 ++---
 doc/src/sgml/glossary.sgml |  40 ++
 2 files changed, 105 insertions(+), 44 deletions(-)

diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index fc03a349f0..7c9c9d0dd1 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -1855,12 +1855,33 @@ ALTER TABLE products RENAME TO items;
   
 
   
-   When an object is created, it is assigned an owner. The
-   owner is normally the role that executed the creation statement.
-   For most kinds of objects, the initial state is that only the owner
-   (or a superuser) can do anything with the object. To allow
-   other roles to use it, privileges must be
-   granted.
+   The permissions needed to interact with an object are split between
+   privileges and
+   owner rights.
+   The owner has the right to modify the object, including dropping it, as well as
+   the right to grant and revoke privileges on the object to any role in the system.
+   Aside from a brief description, at the end, of what happens when you reassign the
+   owner of an object to some other role, this section describes privileges.
+  
+
+  
+   For most databases there are many roles and many objects, and thus a large amount
+   of privileges that need to be defined.  Two features exist to make management of
+   privileges easier: role membership (i.e., group roles, see )
+   including the PUBLIC  pseudo-role (which is a group role consisting
+   of all roles in the system) is one, while
+   default privileges
+   (see ) is the other.
+  
+
+  
+   The fundamental design of the privilege system is to disallow by default.  A role
+   must, directly or indirectly (via group role inheritance), hold the correct privilege
+   on the object to perform the corresponding action.
+   Furthermore, inheritance is strictly additive; there is no mechanism to block an
+   indirect privilege.
+   Revoking a privilege only removes
+   direct privileges.
   
 
   
@@ -1878,21 +1899,14 @@ ALTER TABLE products RENAME TO items;
   
 
   
-   The right to modify or destroy an object is inherent in being the
-   object's owner, and cannot be granted or revoked in itself.
-   (However, like all privileges, that right can be inherited by
-   members of the owning role; see .)
-  
-
-  
-   An object can be assigned to a new owner with an ALTER
-   command of the appropriate kind for the object, for example
-
-ALTER TABLE table_name OWNER TO new_owner;
-
-   Superusers can always do this; ordinary roles can only do it if they are
-   both the current owner of the object (or inherit the privileges of the
-   owning role) and able to SET ROLE to the new owning role.
+   Upon object creation, two sets of
+   implicit privileges
+   are established.
+   The owner is granted all applicable privileges on the object. All other roles,
+   including the PUBLIC  pseudo-role, get their default privileges.
+   From this point onward only
+   explicit privilege
+   modification is possible, and is described next.
   
 
   
@@ -1904,15 +1918,9 @@ ALTER TABLE table_name OWNER TO new_owne
 GRANT UPDATE ON accounts TO joe;
 
Writing ALL in place of a specific privilege grants all
-   privileges that are relevant for the object type.
-  
-
-  
-   The special role name PUBLIC can
-   be used to grant a privilege to every role on the system.  Also,
-   group roles can be set up to help manage privileges when
-   there are many users of a database  for details see
-   .
+   privileges that are relevant for the object type.  The pseudo-role 
+   PUBLIC is a valid role for this purpose.  All roles in the system
+   will immediately inherit the indirect privilege(s) named in the command.
   
 
   
@@ -1936,9 +1944,11 @@ REVOKE ALL ON accounts FROM PUBLIC;
 
   
An object's owner can choose to revoke 

Re: More new SQL/JSON item methods

2024-01-25 Thread Tom Lane
I wrote:
> There's something else going on, because I'm still getting the
> assertion failure on my Mac with this fix in place.  Annoyingly,
> it goes away if I compile with -O0, so it's kind of hard to
> identify what's going wrong.

No, belay that: I must've got confused about which version I was
testing.  It's very unclear to me why the undefined reference
causes the preceding Assert to misbehave, but that is clearly
what's happening.  Compiler bug maybe?  My Mac has clang 15.0.0,
and the unhappy buildfarm members are also late-model clang.

Anyway, I did note that the preceding line

res = jperOk;

is dead code and might as well get removed while you're at it.

regards, tom lane




Re: [PATCH] Add native windows on arm64 support

2024-01-25 Thread Dave Cramer
On Thu, 25 Jan 2024 at 14:31, Andrew Dunstan  wrote:

>
> On 2024-01-25 Th 08:45, Dave Cramer wrote:
>
>
>
>
>
> I tried running my buildfarm using my git repo and my branch, but get the
> following error
> Status Line: 492 bad branch parameter
> Content:
> bad branch parameter fix_arm
>
> Web txn failed with status: 1
>
>
>
> You can't use your own branch with the buildfarm, you need to let it set
> up its own branches.
>

So I guess I have to wait until this patch is merged in ?

Dave


Re: proposal: psql: show current user in prompt

2024-01-25 Thread Pavel Stehule
Hi

čt 11. 1. 2024 v 12:12 odesílatel Jelte Fennema-Nio 
napsal:

> On Tue, 12 Sept 2023 at 09:46, Peter Eisentraut 
> wrote:
> > ISTM that for a purpose like pgbouncer, it would be simpler to add a new
> > GUC "report these variables" and send that in the startup message?  That
> > might not help with the psql use case, but it would be much simpler.
>
> FYI I implemented it that way yesterday on this other thread (patch
> 0010 of that patchset):
>
> https://www.postgresql.org/message-id/flat/CAGECzQScQ3N-Ykv2j4NDyDtrPPc3FpRoa%3DLZ-2Uj2ocA4zr%3D4Q%40mail.gmail.com#cd9e8407820d492e8f677ee6a67c21ce


I read your patch, and I see some advantages and some disadvantages.

1. this doesn't need a new protocol API just for this feature, which is nice

2. using one GUC to list all reported GUCs doesn't look very readable. Maybe it
should be used just for customized reporting, not for the default

3. Another issue with your proposal is that enabling or disabling reporting of
a specific GUC is less friendly. Maintaining a list needs more work than just
enabling or disabling one specific GUC.
I think this is the main disadvantage of your proposal. In my proposal I
don't need to know the state of any GUC; I just send PQlinkParameterStatus
or PQunlinkParameterStatus. With your proposal, I need to read
_pq_.report_parameters, parse it, modify it, and send it back. This
doesn't look very practical.

Personally I prefer a more generic API than my
PQlinkParameterStatus and PQunlinkParameterStatus. You talked about it with
Robert, if I remember correctly.

It would be nice if I could write just

/* similar princip is used inside ncurses */
set_report_guc_message_no = PQgetMessageNo("set_report_guc");
/* the result can be processed by server and by all proxies on the line */

if (set_report_guc_message_no == -1)
  fprintf(stderr, "feature is not supported");
result = PQsendMessage(set_report_guc_message, "user");
if (result == -1)
  fprintf(stderr, "some error ...");

With an API like this it would be easier to make small protocol
enhancements. Maybe this is overengineering. Enhancing the protocol is not too
common, and with PQsendTypedCommand enhancing the protocol is less work
too.

Regards

Pavel


Re: More new SQL/JSON item methods

2024-01-25 Thread Tom Lane
Andrew Dunstan  writes:
> On 2024-01-25 Th 14:31, Tom Lane wrote:
>> (The reported crashes seem to be happening later during a
>> recursive invocation, seemingly because JsonbType(jb) is
>> returning garbage.  So there may be another bug after this one.)

> I don't think so. AIUI The first call deals with the '$' and the second 
> one deals with the '.string()', which is why we see the error on the 
> second call.

There's something else going on, because I'm still getting the
assertion failure on my Mac with this fix in place.  Annoyingly,
it goes away if I compile with -O0, so it's kind of hard to
identify what's going wrong.

regards, tom lane




Re: More new SQL/JSON item methods

2024-01-25 Thread Andrew Dunstan



On 2024-01-25 Th 14:31, Tom Lane wrote:

Andrew Dunstan  writes:

Thanks, I have pushed this.

The buildfarm is pretty widely unhappy, mostly failing on

select jsonb_path_query('1.23', '$.string()');

On a guess, I tried running that under valgrind, and behold it said

==00:00:00:05.637 435530== Conditional jump or move depends on uninitialised 
value(s)
==00:00:00:05.637 435530==at 0x8FD131: executeItemOptUnwrapTarget 
(jsonpath_exec.c:1547)
==00:00:00:05.637 435530==by 0x8FED03: executeItem (jsonpath_exec.c:626)
==00:00:00:05.637 435530==by 0x8FED03: executeNextItem 
(jsonpath_exec.c:1604)
==00:00:00:05.637 435530==by 0x8FCA58: executeItemOptUnwrapTarget 
(jsonpath_exec.c:956)
==00:00:00:05.637 435530==by 0x8FFDE4: executeItem (jsonpath_exec.c:626)
==00:00:00:05.637 435530==by 0x8FFDE4: executeJsonPath.constprop.30 
(jsonpath_exec.c:612)
==00:00:00:05.637 435530==by 0x8FFF8C: jsonb_path_query_internal 
(jsonpath_exec.c:438)

It's fairly obviously right about that:

        JsonbValue  jbv;
        ...
        jb = &jbv;
 Assert(tmp != NULL);/* We must have set tmp above */
 jb->val.string.val = (jb->type == jbvString) ? tmp : 
pstrdup(tmp);
   ^

Presumably, this is a mistaken attempt to test the type
of the thing previously pointed to by "jb".

On the whole, what I'd be inclined to do here is get rid
of this test altogether and demand that every path through
the preceding "switch" deliver a value that doesn't need
pstrdup().  The only path that doesn't do that already is

 case jbvBool:
 tmp = (jb->val.boolean) ? "true" : "false";
 break;

and TBH I'm not sure that we really need a pstrdup there
either.  The constants are immutable enough.  Is something
likely to try to pfree the pointer later?  I tried

@@ -1544,7 +1544,7 @@ executeItemOptUnwrapTarget(JsonPathExecContext *cxt, 
JsonPathItem *jsp,
  
        jb = &jbv;

 Assert(tmp != NULL);/* We must have set tmp above */
-   jb->val.string.val = (jb->type == jbvString) ? tmp : 
pstrdup(tmp);
+   jb->val.string.val = tmp;
 jb->val.string.len = strlen(jb->val.string.val);
 jb->type = jbvString;
  
and that quieted valgrind for this particular query and still

passes regression.




Your fix looks sane. I also don't see why we need the pstrdup.




(The reported crashes seem to be happening later during a
recursive invocation, seemingly because JsonbType(jb) is
returning garbage.  So there may be another bug after this one.)





I don't think so. AIUI The first call deals with the '$' and the second 
one deals with the '.string()', which is why we see the error on the 
second call.



cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Relation bulk write facility

2024-01-25 Thread Heikki Linnakangas

On 10/01/2024 18:17, Robert Haas wrote:

I think we should try to pick prefixes that are one or more words
rather than using word fragments. bulkw is an awkward prefix even for
people whose first language is English, and probably more awkward for
others.


Renamed the 'bulkw' variables to 'bulkstate', and the functions to have 
smgr_bulk_* prefix.


I was tempted to use just bulk_* as the prefix, but I'm afraid e.g. 
bulk_write() is too generic.


On 22/01/2024 07:50, Peter Smith wrote:

Hi, This patch has a CF status of "Needs Review" [1], but it seems
there was a CFbot test failure last time it was run [2]. Please have a
look and post an updated version if necessary.


Fixed the headerscheck failure by adding appropriate #includes.

--
Heikki Linnakangas
Neon (https://neon.tech)
From c2e8cff9326fb874b2e1643f5c3c8a4952eaa3ac Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas 
Date: Thu, 25 Jan 2024 21:40:46 +0200
Subject: [PATCH v5 1/1] Introduce a new bulk loading facility.

The new facility makes it easier to optimize bulk loading, as the
logic for buffering, WAL-logging, and syncing the relation only needs
to be implemented once. It's also less error-prone: We have had a
number of bugs in how a relation is fsync'd - or not - at the end of a
bulk loading operation. By centralizing that logic to one place, we
only need to write it correctly once.

The new facility is faster for small relations: Instead of calling
smgrimmedsync(), we register the fsync to happen at next checkpoint,
which avoids the fsync latency. That can make a big difference if you
are e.g. restoring a schema-only dump with lots of relations.

It is also slightly more efficient with large relations, as the WAL
logging is performed multiple pages at a time. That avoids some WAL
header overhead. The sorted GiST index build did that already, this
moves the buffering to the new facility.

The changes to the pageinspect GiST test need an explanation: Before this
patch, the sorted GiST index build set the LSN on every page to the
special GistBuildLSN value, not the LSN of the WAL record, even though
they were WAL-logged. There was no particular need for it, it just
happened naturally when we wrote out the pages before WAL-logging
them. Now we WAL-log the pages first, like in B-tree build, so the
pages are stamped with the record's real LSN. When the build is not
WAL-logged, we still use GistBuildLSN. To make the test output
predictable, use an unlogged index.

Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/30e8f366-58b3-b239-c521-422122dd5150%40iki.fi
---
 contrib/pageinspect/expected/gist.out |  14 +-
 contrib/pageinspect/sql/gist.sql  |  16 +-
 src/backend/access/gist/gistbuild.c   | 122 +++
 src/backend/access/heap/rewriteheap.c |  72 ++
 src/backend/access/nbtree/nbtree.c|  33 +--
 src/backend/access/nbtree/nbtsort.c   | 135 
 src/backend/access/spgist/spginsert.c |  49 ++---
 src/backend/catalog/storage.c |  46 ++--
 src/backend/storage/smgr/Makefile |   1 +
 src/backend/storage/smgr/bulk_write.c | 303 ++
 src/backend/storage/smgr/md.c |  45 +++-
 src/backend/storage/smgr/meson.build  |   1 +
 src/backend/storage/smgr/smgr.c   |  31 +++
 src/include/storage/bulk_write.h  |  40 
 src/include/storage/md.h  |   1 +
 src/include/storage/smgr.h|   1 +
 src/tools/pgindent/typedefs.list  |   3 +
 17 files changed, 558 insertions(+), 355 deletions(-)
 create mode 100644 src/backend/storage/smgr/bulk_write.c
 create mode 100644 src/include/storage/bulk_write.h

diff --git a/contrib/pageinspect/expected/gist.out b/contrib/pageinspect/expected/gist.out
index d1adbab8ae2..2b1d54a6279 100644
--- a/contrib/pageinspect/expected/gist.out
+++ b/contrib/pageinspect/expected/gist.out
@@ -1,13 +1,6 @@
--- The gist_page_opaque_info() function prints the page's LSN. Normally,
--- that's constant 1 (GistBuildLSN) on every page of a freshly built GiST
--- index. But with wal_level=minimal, the whole relation is dumped to WAL at
--- the end of the transaction if it's smaller than wal_skip_threshold, which
--- updates the LSNs. Wrap the tests on gist_page_opaque_info() in the
--- same transaction with the CREATE INDEX so that we see the LSNs before
--- they are possibly overwritten at end of transaction.
-BEGIN;
--- Create a test table and GiST index.
-CREATE TABLE test_gist AS SELECT point(i,i) p, i::text t FROM
+-- The gist_page_opaque_info() function prints the page's LSN.
+-- Use an unlogged index, so that the LSN is predictable.
+CREATE UNLOGGED TABLE test_gist AS SELECT point(i,i) p, i::text t FROM
 generate_series(1,1000) i;
 CREATE INDEX test_gist_idx ON test_gist USING gist (p);
 -- Page 0 is the root, the rest are leaf pages
@@ -29,7 +22,6 @@ SELECT * FROM gist_page_opaque_info(get_raw_page('test_gist_idx', 2));
  0/1 | 0/0 | 1 | {leaf}
 (1 row)
 
-COMMIT;
 SELECT * FROM 

Re: More new SQL/JSON item methods

2024-01-25 Thread Andrew Dunstan



On 2024-01-25 Th 14:31, Tom Lane wrote:

Andrew Dunstan  writes:

Thanks, I have pushed this.

The buildfarm is pretty widely unhappy, mostly failing on

select jsonb_path_query('1.23', '$.string()');

On a guess, I tried running that under valgrind, and behold it said

==00:00:00:05.637 435530== Conditional jump or move depends on uninitialised 
value(s)
==00:00:00:05.637 435530==at 0x8FD131: executeItemOptUnwrapTarget 
(jsonpath_exec.c:1547)
==00:00:00:05.637 435530==by 0x8FED03: executeItem (jsonpath_exec.c:626)
==00:00:00:05.637 435530==by 0x8FED03: executeNextItem 
(jsonpath_exec.c:1604)
==00:00:00:05.637 435530==by 0x8FCA58: executeItemOptUnwrapTarget 
(jsonpath_exec.c:956)
==00:00:00:05.637 435530==by 0x8FFDE4: executeItem (jsonpath_exec.c:626)
==00:00:00:05.637 435530==by 0x8FFDE4: executeJsonPath.constprop.30 
(jsonpath_exec.c:612)
==00:00:00:05.637 435530==by 0x8FFF8C: jsonb_path_query_internal 
(jsonpath_exec.c:438)

It's fairly obviously right about that:

        JsonbValue  jbv;
        ...
        jb = &jbv;
 Assert(tmp != NULL);/* We must have set tmp above */
 jb->val.string.val = (jb->type == jbvString) ? tmp : 
pstrdup(tmp);
   ^

Presumably, this is a mistaken attempt to test the type
of the thing previously pointed to by "jb".

On the whole, what I'd be inclined to do here is get rid
of this test altogether and demand that every path through
the preceding "switch" deliver a value that doesn't need
pstrdup().  The only path that doesn't do that already is

 case jbvBool:
 tmp = (jb->val.boolean) ? "true" : "false";
 break;

and TBH I'm not sure that we really need a pstrdup there
either.  The constants are immutable enough.  Is something
likely to try to pfree the pointer later?  I tried

@@ -1544,7 +1544,7 @@ executeItemOptUnwrapTarget(JsonPathExecContext *cxt, 
JsonPathItem *jsp,
  
        jb = &jbv;

 Assert(tmp != NULL);/* We must have set tmp above */
-   jb->val.string.val = (jb->type == jbvString) ? tmp : 
pstrdup(tmp);
+   jb->val.string.val = tmp;
 jb->val.string.len = strlen(jb->val.string.val);
 jb->type = jbvString;
  
and that quieted valgrind for this particular query and still

passes regression.

(The reported crashes seem to be happening later during a
recursive invocation, seemingly because JsonbType(jb) is
returning garbage.  So there may be another bug after this one.)





Argh! Will look, thanks.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: [PATCH] Add native windows on arm64 support

2024-01-25 Thread Andrew Dunstan


On 2024-01-25 Th 08:45, Dave Cramer wrote:





I tried running my buildfarm using my git repo and my branch, but get 
the following error

Status Line: 492 bad branch parameter
Content:
bad branch parameter fix_arm

Web txn failed with status: 1




You can't use your own branch with the buildfarm, you need to let it set 
up its own branches.



cheers


andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: More new SQL/JSON item methods

2024-01-25 Thread Tom Lane
Andrew Dunstan  writes:
> Thanks, I have pushed this.

The buildfarm is pretty widely unhappy, mostly failing on

select jsonb_path_query('1.23', '$.string()');

On a guess, I tried running that under valgrind, and behold it said

==00:00:00:05.637 435530== Conditional jump or move depends on uninitialised 
value(s)
==00:00:00:05.637 435530==at 0x8FD131: executeItemOptUnwrapTarget 
(jsonpath_exec.c:1547)
==00:00:00:05.637 435530==by 0x8FED03: executeItem (jsonpath_exec.c:626)
==00:00:00:05.637 435530==by 0x8FED03: executeNextItem 
(jsonpath_exec.c:1604)
==00:00:00:05.637 435530==by 0x8FCA58: executeItemOptUnwrapTarget 
(jsonpath_exec.c:956)
==00:00:00:05.637 435530==by 0x8FFDE4: executeItem (jsonpath_exec.c:626)
==00:00:00:05.637 435530==by 0x8FFDE4: executeJsonPath.constprop.30 
(jsonpath_exec.c:612)
==00:00:00:05.637 435530==by 0x8FFF8C: jsonb_path_query_internal 
(jsonpath_exec.c:438)

It's fairly obviously right about that:

        JsonbValue  jbv;
        ...
        jb = &jbv;
Assert(tmp != NULL);/* We must have set tmp above */
jb->val.string.val = (jb->type == jbvString) ? tmp : 
pstrdup(tmp);
  ^

Presumably, this is a mistaken attempt to test the type
of the thing previously pointed to by "jb".

On the whole, what I'd be inclined to do here is get rid
of this test altogether and demand that every path through
the preceding "switch" deliver a value that doesn't need
pstrdup().  The only path that doesn't do that already is

case jbvBool:
tmp = (jb->val.boolean) ? "true" : "false";
break;

and TBH I'm not sure that we really need a pstrdup there
either.  The constants are immutable enough.  Is something
likely to try to pfree the pointer later?  I tried

@@ -1544,7 +1544,7 @@ executeItemOptUnwrapTarget(JsonPathExecContext *cxt, 
JsonPathItem *jsp,
 
        jb = &jbv;
Assert(tmp != NULL);/* We must have set tmp above */
-   jb->val.string.val = (jb->type == jbvString) ? tmp : 
pstrdup(tmp);
+   jb->val.string.val = tmp;
jb->val.string.len = strlen(jb->val.string.val);
jb->type = jbvString;
 
and that quieted valgrind for this particular query and still
passes regression.

(The reported crashes seem to be happening later during a
recursive invocation, seemingly because JsonbType(jb) is
returning garbage.  So there may be another bug after this one.)

regards, tom lane




Re: [PATCH] Add native windows on arm64 support

2024-01-25 Thread Dave Cramer
On Thu, 25 Jan 2024 at 12:30, Andrew Dunstan  wrote:

>
> On 2024-01-24 We 19:02, Michael Paquier wrote:
>
> On Wed, Jan 24, 2024 at 06:45:21AM -0500, Dave Cramer wrote:
>
> I managed to get it to build; the vcvarsall arch needs to be x64. I need to
> add some options, but the patch above needs to be applied to build it.
>
> Nice.  If I may ask, what kind of host and/or configuration have you
> used to reach a state where the code can be compiled and run tests
> with meson?  If you have found specific steps, it may be a good thing
> to document that on the wiki, say around [1].
>
> Perhaps you have not included TAP?  It may be fine in terms of runtime
> checks and coverage.
>
> [1]: 
> https://wiki.postgresql.org/wiki/PostgreSQL_Buildfarm_Howto#Running_on_Windows
>
>
>
> I now have an ARM64 machine, so I set up a W11 ARM64 VM. I think we really
> want to build with x64_arm64, i.e. to generate native arm64 binaries.
> Setting just x64 will not do that, AIUI.
>
> I tried that with the buildfarm, setting that in the config file's call to
> PGBuild::VSenv::getenv().
>
> That upset msvc_gendef.pl, so I added this there to keep it happy:
>
> $arch = 'x86_64' if $arch eq 'aarch64';
>
> After that things went ok until I got this:
>
> [1453/2088] "link" @src/backend/postgres.exe.rsp
> FAILED: src/backend/postgres.exe src/backend/postgres.pdb
> "link" @src/backend/postgres.exe.rsp
>Creating library src\backend\postgres.exe.lib
> storage_lmgr_s_lock.c.obj : error LNK2019: unresolved external symbol
> _mm_pause referenced in function perform_spin_delay
> src\backend\postgres.exe : fatal error LNK1120: 1 unresolved externals
>
>
> I haven't made further progress, but I will return to it in the next day
> or so.
>
> While this will be nice to have, I think it won't really matter until
> there is ARM64 support in released versions of Windows Server. AFAICT they
> still only sell versions for x86_64
>

I need it to build clients. The clients need arm64 libraries to link against

Dave


Re: A performance issue with Memoize

2024-01-25 Thread Tom Lane
David Rowley  writes:
> I think fixing it your way makes sense.  I don't really see any reason
> why we should have two. However...

> Another way it *could* be fixed would be to get rid of pull_paramids()
> and change create_memoize_plan() to set keyparamids to all the param
> IDs that match are equal() to each param_exprs.  That way nodeMemoize
> wouldn't purge the cache as we'd know the changing param is accounted
> for in the cache.  For the record, I don't think that's better, but it
> scares me a bit less as I don't know what other repercussions there
> are of applying your patch to get rid of the duplicate
> NestLoopParam.paramval.

> I'd feel better about doing it your way if Tom could comment on if
> there was a reason he put the function calls that way around in
> 5ebaaa494.

I'm fairly sure I thought it wouldn't matter because of the Param
de-duplication done in paramassign.c.  However, Richard's example
shows that's not so, because process_subquery_nestloop_params is
picky about the param ID assigned to a particular Var while
replace_nestloop_params is not.  So flipping the order makes sense.
I'd change the comment though, maybe to

/*
 * Replace any outer-relation variables with nestloop params.
 *
 * We must provide nestloop params for both lateral references of
 * the subquery and outer vars in the scan_clauses.  It's better
 * to assign the former first, because that code path requires
 * specific param IDs, while replace_nestloop_params can adapt
 * to the IDs assigned by process_subquery_nestloop_params.
 * This avoids possibly duplicating nestloop params when the
 * same Var is needed for both reasons.
 */

However ... it seems like we're not out of the woods yet.  Why
is Richard's proposed test case still showing

+ ->  Memoize (actual rows=5000 loops=N)
+   Cache Key: t1.two, t1.two

Seems like there is missing de-duplication logic, or something.

> I also feel like we might be getting a bit close to the minor version
> releases to be adjusting this stuff in back branches.

Yeah, I'm not sure I would change this in the back branches.

regards, tom lane




Re: [PATCH] Add native windows on arm64 support

2024-01-25 Thread Andrew Dunstan


On 2024-01-24 We 19:02, Michael Paquier wrote:

On Wed, Jan 24, 2024 at 06:45:21AM -0500, Dave Cramer wrote:

I managed to get it to build; the vcvarsall arch needs to be x64. I need to
add some options, but the patch above needs to be applied to build it.

Nice.  If I may ask, what kind of host and/or configuration have you
used to reach a state where the code can be compiled and run tests
with meson?  If you have found specific steps, it may be a good thing
to document that on the wiki, say around [1].

Perhaps you have not included TAP?  It may be fine in terms of runtime
checks and coverage.

[1]:https://wiki.postgresql.org/wiki/PostgreSQL_Buildfarm_Howto#Running_on_Windows




I now have an ARM64 machine, so I set up a W11 ARM64 VM. I think we 
really want to build with x64_arm64, i.e. to generate native arm64 
binaries. Setting just x64 will not do that, AIUI.


I tried that with the buildfarm, setting that in the config file's call 
to PGBuild::VSenv::getenv().


That upset msvc_gendef.pl, so I added this there to keep it happy:

   $arch = 'x86_64' if $arch eq 'aarch64';

After that things went ok until I got this:

[1453/2088] "link" @src/backend/postgres.exe.rsp
FAILED: src/backend/postgres.exe src/backend/postgres.pdb
"link" @src/backend/postgres.exe.rsp
   Creating library src\backend\postgres.exe.lib
storage_lmgr_s_lock.c.obj : error LNK2019: unresolved external symbol 
_mm_pause referenced in function perform_spin_delay

src\backend\postgres.exe : fatal error LNK1120: 1 unresolved externals


I haven't made further progress, but I will return to it in the next day 
or so.


While this will be nice to have, I think it won't really matter until 
there is ARM64 support in released versions of Windows Server. AFAICT 
they still only sell versions for x86_64




cheers


andrew


--
Andrew Dunstan
EDB:https://www.enterprisedb.com


Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock

2024-01-25 Thread Robert Haas
On Thu, Jan 25, 2024 at 11:22 AM Alvaro Herrera  wrote:
> Still with these auto-tuning GUCs, I noticed that the auto-tuning code
> would continue to grow the buffer sizes with shared_buffers to
> arbitrarily large values.  I added an arbitrary maximum of 1024 (8 MB),
> which is much higher than the current value of 128; but if you have
> (say) 30 GB of shared_buffers (not uncommon these days), do you really
> need 30MB of pg_clog cache?  It seems mostly unnecessary ... and you can
> still set it manually that way if you need it.  So, largely I just
> rewrote those small functions completely.

Yeah, I think that if we're going to scale with shared_buffers, it
should be capped.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Emit fewer vacuum records by reaping removable tuples during pruning

2024-01-25 Thread Robert Haas
On Thu, Jan 25, 2024 at 11:19 AM Melanie Plageman
 wrote:
> Cool. I might add "successfully" or "fully" to "Either way, the page
> hasn't been processed yet"

I'm OK with that.

> > > I think it would be nice to clarify this comment. I think the "when
> > > there is little free space anyway" is referring to the case in
> > > lazy_vacuum() where we set do_index_vacuuming to false because "there
> > > are almost zero TIDs". I initially thought it was saying that in the
> > > failsafe vacuum case the pages whose free space we wouldn't record in
> > > the FSM have little free space anyway -- which I didn't get. But then
> > > I looked at where we set do_index_vacuuming to false.
> >
> > Oh... wow. That's kind of confusing; somehow I was thinking we were
> > talking about free space on the disk, rather than newly free space in
> > pages that could be added to the FSM.
>
> Perhaps I misunderstood. I interpreted it to refer to the bypass optimization.

I think you're probably correct. I just didn't realize what was meant.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: A performance issue with Memoize

2024-01-25 Thread Tom Lane
David Rowley  writes:
> I'd feel better about doing it your way if Tom could comment on if
> there was a reason he put the function calls that way around in
> 5ebaaa494.

Apologies for not having noticed this thread before.  I'm taking
a look at it now.  However, while sniffing around this I found
what seems like an oversight in paramassign.c's
assign_param_for_var(): it says it should compare all the same
fields as _equalVar except for varlevelsup, but it's failing to
compare varnullingrels.  Is that a bug?  It's conceivable that
it's not possible to get here with varnullingrels different and
all else the same, but I don't feel good about that proposition.

I tried adding

@@ -91,7 +91,10 @@ assign_param_for_var(PlannerInfo *root, Var *var)
 pvar->vartype == var->vartype &&
 pvar->vartypmod == var->vartypmod &&
 pvar->varcollid == var->varcollid)
+{
+Assert(bms_equal(pvar->varnullingrels, var->varnullingrels));
 return pitem->paramId;
+}
 }
 }

This triggers no failures in the regression tests, but we know
how little that proves.

Anyway, that's just a side observation unrelated to the problem
at hand.  More later.

regards, tom lane




Re: UUID v7

2024-01-25 Thread Sergey Prokhorenko
Aleksander,

In this case the documentation must state that the functions 
uuid_extract_time() and uuidv7(T) are against the RFC requirements, that 
developers may use these functions with caution at their own risk, and that 
these functions are not recommended for production environments.

The function uuidv7(T) is not better than uuid_extract_time(). Careless 
developers may well pass any business date into this function: document date, 
registration date, payment date, reporting date, start date of the current 
month, data download date, and even a constant. This would be a profanation of 
UUIDv7 with very negative consequences.

Sergey Prokhorenko
sergeyprokhore...@yahoo.com.au

On Thursday, 25 January 2024 at 03:06:50 pm GMT+3, Aleksander Alekseev 
 wrote:  
 
 Hi,

> Postgres always was a bit hackerish, allowing slightly more than is safe. 
> I.e. you can define immutable function that is not really immutable, turn off 
> autovacuum or fsync. Why bother with safety guards here?
> My opinion is that we should have this function to extract timestamp. Even if 
> it can return strange values for imprecise RFC implementation.

Completely agree.

Users that don't like or don't need it can pretend there are no
uuid_extract_time() and uuidv7(T) in Postgres. If we don't provide
them however, users that need them will end up writing their own
probably buggy and not compatible implementations. That would be much
worse.

> So +1 for erroring when you provide a timestamp outside of that range
> (either too far in the past or too far in the future).
>
> OK, it seems like we have some consensus on ERRORing..
>
> Do we have any other open items? Does v13 address all open items? Maybe let’s 
> compose better error message?
>
> +1 for erroring when ts is outside range.
>
> v13 looks good for me. I think we have reached an optimal compromise.

Andrey, many thanks for the updated patch.

LGTM, cfbot is happy and I don't think we have any open items left. So
changing CF entry status back to RfC.

-- 
Best regards,
Aleksander Alekseev
  

Re: index prefetching

2024-01-25 Thread Tomas Vondra



On 1/25/24 11:45, Dilip Kumar wrote:
> On Wed, Jan 24, 2024 at 11:43 PM Tomas Vondra
>  wrote:
> 
>> On 1/22/24 07:35, Konstantin Knizhnik wrote:
>>>
>>> On 22/01/2024 1:47 am, Tomas Vondra wrote:
 h, right. Well, you're right in this case we perhaps could set just one
 of those flags, but the "purpose" of the two places is quite different.

 The "prefetch" flag is fully controlled by the prefetcher, and it's up
 to it to change it (e.g. I can easily imagine some new logic touching
 setting it to "false" for some reason).

 The "data" flag is fully controlled by the custom callbacks, so whatever
 the callback stores, will be there.

 I don't think it's worth simplifying this. In particular, I don't think
 the callback can assume it can rely on the "prefetch" flag.

>>> Why not to add "all_visible" flag to IndexPrefetchEntry ? If will not
>>> cause any extra space overhead (because of alignment), but allows to
>>> avoid dynamic memory allocation (not sure if it is critical, but nice to
>>> avoid if possible).
>>>
>>
> While reading through the first patch I got some questions, I haven't
> read it complete yet but this is what I got so far.
> 
> 1.
> +static bool
> +IndexPrefetchBlockIsSequential(IndexPrefetch *prefetch, BlockNumber block)
> +{
> + int idx;
> ...
> + if (prefetch->blockItems[idx] != (block - i))
> + return false;
> +
> + /* Don't prefetch if the block happens to be the same. */
> + if (prefetch->blockItems[idx] == block)
> + return false;
> + }
> +
> + /* not sequential, not recently prefetched */
> + return true;
> +}
> 
> The above function name is BlockIsSequential but at the end, it
> returns true if it is not sequential, seem like a problem?

Actually, I think it's the comment that's wrong - the last return is
reached only for a sequential pattern (and when the block was not
accessed recently).

> Also other 2 checks right above the end of the function are returning
> false if the block is the same or the pattern is sequential I think
> those are wrong too.
> 

Hmmm. You're right this is partially wrong. There are two checks:

/*
 * For a sequential pattern, blocks "k" step ago needs to have block
 * number by "k" smaller compared to the current block.
 */
if (prefetch->blockItems[idx] != (block - i))
return false;

/* Don't prefetch if the block happens to be the same. */
if (prefetch->blockItems[idx] == block)
return false;

The first condition is correct - we want to return "false" when the
pattern is not sequential.

But the second condition is wrong - we want to skip prefetching when the
block was already prefetched recently, so this should return true (which
is a bit misleading, as it seems to imply the pattern is sequential,
when it's not).

However, this is harmless, because we then identify this block as
recently prefetched in the "full" cache check, so we won't prefetch it
anyway. So it's harmless, although a bit more expensive.
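
To make that concrete, here is a small self-contained sketch (not the patch
code; the struct, constant, and function names are simplified stand-ins) of
the decision rule as described above. Returning true means "looks sequential
or was requested very recently, so skip prefetching"; in this sketch the
recently-prefetched check is done before the pattern check, so a broken
pattern cannot mask it:

#define PREFETCH_HISTORY 8		/* hypothetical history size */

typedef struct
{
	unsigned int blockItems[PREFETCH_HISTORY];	/* recently requested blocks */
	int			blockIndex;						/* next slot to overwrite */
} PrefetchHistory;

static int
prefetch_block_is_sequential(PrefetchHistory *prefetch, unsigned int block)
{
	for (int i = 1; i < PREFETCH_HISTORY; i++)
	{
		int			idx = (prefetch->blockIndex - i + PREFETCH_HISTORY) % PREFETCH_HISTORY;

		/* Requested very recently: no point prefetching it again. */
		if (prefetch->blockItems[idx] == block)
			return 1;

		/* Pattern broken: not sequential, prefetching may be worthwhile. */
		if (prefetch->blockItems[idx] != block - i)
			return 0;
	}

	/* Sequential run: OS readahead should already cover it. */
	return 1;
}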

There's another inefficiency - we stop looking for the same block once
we find the first block breaking the non-sequential pattern. Imagine a
sequence of blocks 1, 2, 3, 1, 2, 3, ... in which case we never notice
the block was recently prefetched, because we always find the break of
the sequential pattern. But again, it's harmless, thanks to the full
cache of recently prefetched blocks.

>  2.
>  I have noticed that the prefetch history is maintained at the backend
> level, but what if multiple backends are trying to fetch the same heap
> blocks maybe scanning the same index, so should that be in some shared
> structure?  I haven't thought much deeper about this from the
> implementation POV, but should we think about it, or it doesn't
> matter?

Yes, the cache is at the backend level - it's a known limitation, but I
see it more as a conscious tradeoff.

Firstly, while the LRU cache is at backend level, PrefetchBuffer also
checks shared buffers for each prefetch request. So with sufficiently
large shared buffers we're likely to find it there (and for direct I/O
there won't be any other place to check).

Secondly, the only other place to check is page cache, but there's no
good (sufficiently cheap) way to check that. See the preadv2/nowait
experiment earlier in this thread.

I suppose we could implement a similar LRU cache for shared memory (and
I don't think it'd be very complicated), but I did not plan to do that
in this patch unless absolutely necessary.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Add new error_action COPY ON_ERROR "log"

2024-01-25 Thread torikoshia

Hi,

As described in 9e2d870119, COPY ON_ERROR is expected to have more 
"error_action" values.

(Note that option name was changed by b725b7eec)

I'd like to have a new option "log", which skips soft errors and logs 
information that should have resulted in errors to the PostgreSQL log.


I think this option has some advantages like below:

1) We can know which line of the input data was not loaded and the 
reason.


  Example:

  =# copy t1 from stdin with (on_error log);
  Enter data to be copied followed by a newline.
  End with a backslash and a period on a line by itself, or an EOF 
signal.

  >> 1
  >> 2
  >> 3
  >> z
  >> \.
  LOG:  invalid input syntax for type integer: "z"
  NOTICE:  1 row was skipped due to data type incompatibility
  COPY 3

  =# \! tail data/log/postgresql*.log
  LOG:  22P02: invalid input syntax for type integer: "z"
  CONTEXT:  COPY t1, line 4, column i: "z"
  LOCATION:  pg_strtoint32_safe, numutils.c:620
  STATEMENT:  copy t1 from stdin with (on_error log);


2) Easier maintenance than storing error information in tables or 
proprietary log files.
For example, in case a large number of soft errors occur, some 
mechanism is needed to prevent an unbounded increase in the size of the 
destination data, but we can leave that to PostgreSQL's log rotation.



Attached a patch.
This basically comes from the previous discussion [1], which handled soft 
errors with both "ignore" and "log".


As shown in the example above, the log output to the client does not 
contain CONTEXT, so I'm a little concerned that the client cannot see which 
line of the input data had a problem without looking at the server log.



What do you think?

[1] 
https://www.postgresql.org/message-id/c0fb57b82b150953f26a5c7e340412e8%40oss.nttdata.com


--
Regards,

--
Atsushi Torikoshi
NTT DATA Group CorporationFrom 04e643facfea4b4e8dd174d22fbe5e008747a91a Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi 
Date: Fri, 26 Jan 2024 01:17:59 +0900
Subject: [PATCH v1] Add new error_action "log" to ON_ERROR option

Currently the ON_ERROR option only has "ignore" to skip malformed data, and
there is no way to know where and why COPY skipped it.

"log" skips malformed data as well as "ignore", but it logs information that
should have resulted in errors to the PostgreSQL log.


---
 doc/src/sgml/ref/copy.sgml  |  8 ++--
 src/backend/commands/copy.c |  4 +++-
 src/backend/commands/copyfrom.c | 24 
 src/include/commands/copy.h |  1 +
 src/test/regress/expected/copy2.out | 14 +-
 src/test/regress/sql/copy2.sql  |  9 +
 6 files changed, 48 insertions(+), 12 deletions(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 21a5c4a052..9662c90a8b 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -380,12 +380,16 @@ COPY { table_name [ ( 
   Specifies which 
   error_action to perform when there is malformed data in the input.
-  Currently, only stop (default) and ignore
-  values are supported.
+  Currently, only stop (default), ignore
+  and log values are supported.
   If the stop value is specified,
   COPY stops operation at the first error.
   If the ignore value is specified,
   COPY skips malformed data and continues copying data.
+  If the log value is specified,
+  COPY behaves the same as ignore, except that
+  it logs information that should have resulted in errors to PostgreSQL log at
+  INFO level.
   The option is allowed only in COPY FROM.
   Only stop value is allowed when
   using binary format.
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index cc0786c6f4..812ca63350 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -415,13 +415,15 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
 		return COPY_ON_ERROR_STOP;
 
 	/*
-	 * Allow "stop", or "ignore" values.
+	 * Allow "stop", "ignore" or "log" values.
 	 */
 	sval = defGetString(def);
 	if (pg_strcasecmp(sval, "stop") == 0)
 		return COPY_ON_ERROR_STOP;
 	if (pg_strcasecmp(sval, "ignore") == 0)
 		return COPY_ON_ERROR_IGNORE;
+	if (pg_strcasecmp(sval, "log") == 0)
+		return COPY_ON_ERROR_LOG;
 
 	ereport(ERROR,
 			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 1fe70b9133..7886bd5353 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1013,6 +1013,23 @@ CopyFrom(CopyFromState cstate)
  */
 cstate->escontext->error_occurred = false;
 
+			else if (cstate->opts.on_error == COPY_ON_ERROR_LOG)
+			{
+/* Adjust elevel so we don't jump out */
+cstate->escontext->error_data->elevel = LOG;
+
+/*
+ * Despite the name, this won't raise an error since elevel is
+ * LOG now.
+ */
+ThrowErrorData(cstate->escontext->error_data);
+
+/* Initialize escontext in preparation for next soft error */

Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock

2024-01-25 Thread Alvaro Herrera
On 2024-Jan-25, Alvaro Herrera wrote:

> Here's a touched-up version of this patch.

> diff --git a/src/backend/storage/lmgr/lwlock.c 
> b/src/backend/storage/lmgr/lwlock.c
> index 98fa6035cc..4a5e05d5e4 100644
> --- a/src/backend/storage/lmgr/lwlock.c
> +++ b/src/backend/storage/lmgr/lwlock.c
> @@ -163,6 +163,13 @@ static const char *const BuiltinTrancheNames[] = {
>   [LWTRANCHE_LAUNCHER_HASH] = "LogicalRepLauncherHash",
>   [LWTRANCHE_DSM_REGISTRY_DSA] = "DSMRegistryDSA",
>   [LWTRANCHE_DSM_REGISTRY_HASH] = "DSMRegistryHash",
> + [LWTRANCHE_COMMITTS_SLRU] = "CommitTSSLRU",
> + [LWTRANCHE_MULTIXACTOFFSET_SLRU] = "MultixactOffsetSLRU",
> + [LWTRANCHE_MULTIXACTMEMBER_SLRU] = "MultixactMemberSLRU",
> + [LWTRANCHE_NOTIFY_SLRU] = "NotifySLRU",
> + [LWTRANCHE_SERIAL_SLRU] = "SerialSLRU"
> + [LWTRANCHE_SUBTRANS_SLRU] = "SubtransSLRU",
> + [LWTRANCHE_XACT_SLRU] = "XactSLRU",
>  };

Eeek.  Last minute changes ...  Fixed here.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"La primera ley de las demostraciones en vivo es: no trate de usar el sistema.
Escriba un guión que no toque nada para no causar daños." (Jakob Nielsen)
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 61038472c5..3e3119865a 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2006,6 +2006,145 @@ include_dir 'conf.d'
   
  
 
+ 
+  commit_timestamp_buffers (integer)
+  
+   commit_timestamp_buffers configuration parameter
+  
+  
+  
+   
+Specifies the amount of memory to use to cache the contents of
+pg_commit_ts (see
+).
+If this value is specified without units, it is taken as blocks,
+that is BLCKSZ bytes, typically 8kB.
+The default value is 0, which requests
+shared_buffers/512 up to 1024 blocks,
+but not fewer than 16 blocks.
+This parameter can only be set at server start.
+   
+  
+ 
+
+ 
+  multixact_members_buffers (integer)
+  
+   multixact_members_buffers configuration parameter
+  
+  
+  
+   
+Specifies the amount of shared memory to use to cache the contents
+of pg_multixact/members (see
+).
+If this value is specified without units, it is taken as blocks,
+that is BLCKSZ bytes, typically 8kB.
+The default value is 32.
+This parameter can only be set at server start.
+   
+  
+ 
+
+ 
+  multixact_offsets_buffers (integer)
+  
+   multixact_offsets_buffers configuration parameter
+  
+  
+  
+   
+Specifies the amount of shared memory to use to cache the contents
+of pg_multixact/offsets (see
+).
+If this value is specified without units, it is taken as blocks,
+that is BLCKSZ bytes, typically 8kB.
+The default value is 16.
+This parameter can only be set at server start.
+   
+  
+ 
+
+ 
+  notify_buffers (integer)
+  
+   notify_buffers configuration parameter
+  
+  
+  
+   
+Specifies the amount of shared memory to use to cache the contents
+of pg_notify (see
+).
+If this value is specified without units, it is taken as blocks,
+that is BLCKSZ bytes, typically 8kB.
+The default value is 16.
+This parameter can only be set at server start.
+   
+  
+ 
+
+ 
+  serializable_buffers (integer)
+  
+   serializable_buffers configuration parameter
+  
+  
+  
+   
+Specifies the amount of shared memory to use to cache the contents
+of pg_serial (see
+).
+If this value is specified without units, it is taken as blocks,
+that is BLCKSZ bytes, typically 8kB.
+The default value is 32.
+This parameter can only be set at server start.
+   
+  
+ 
+
+ 
+  subtransaction_buffers (integer)
+  
+   subtransaction_buffers configuration parameter
+  
+  
+  
+   
+Specifies the amount of shared memory to use to cache the contents
+of pg_subtrans (see
+).
+If this value is specified without units, it is taken as blocks,
+that is BLCKSZ bytes, typically 8kB.
+The default value is 0, which requests
+shared_buffers/512 up to 1024 blocks,
+but not fewer than 16 blocks.
+This parameter can only be set at server start.
+   
+  
+ 
+
+ 
+  transaction_buffers (integer)
+  
+   transaction_buffers configuration parameter
+  
+  
+  
+   
+Specifies the amount of shared memory to use to cache the contents
+of pg_xact (see
+).
+If this value is specified without units, it is taken as blocks,
+that is BLCKSZ bytes, typically 8kB.
+

Re: make dist using git archive

2024-01-25 Thread Tristan Partin

On Thu Jan 25, 2024 at 10:04 AM CST, Peter Eisentraut wrote:

On 22.01.24 21:04, Tristan Partin wrote:
>> +    'HEAD', '.'],
>> +  install: false,
>> +  output: distdir + '.tar.bz2',
>> +)
> 
> The bz2 target should be wrapped in an `if bzip2.found()`.


The way that this currently works is that you will fail at configure 
time if bz2 doesn't exist on the system. Meson will try to resolve 
a .path() method on a NotFoundProgram. You might want to define the bz2 
target to just call `exit 1` in this case.


if bzip2.found()
 # do your current target
else
 bz2 = run_target('tar.bz2', command: ['exit', 1])
endif

This should cause pgdist to appropriately fail at run time when 
generating the bz2 tarball.


Well, I think we want the dist step to fail if bzip2 is not there.  At 
least that is the current expectation.


>> +alias_target('pgdist', [check_dirty_index, tar_gz, tar_bz2])
> 
> Are you intending for the check_dirty_index target to prohibit the other 
> two targets from running? Currently that is not the case.


Yes, that was the hope, and that's how the make dist variant works.  But 
I couldn't figure this out with meson.  Also, the above actually also 
doesn't work with older meson versions, so I had to comment this out to 
get CI to work.


> If it is what 
> you intend, use a stamp file or something to indicate a relationship. 
> Alternatively, inline the git diff-index into the other commands. These 
> might also do better as external scripts. It would reduce duplication 
> between the autotools and Meson builds.


Yeah, maybe that's a direction.


For what it's worth, I run Meson 1.3, and the tarballs were still 
generated even though the tree was dirty. In the new patch 
you seem to say it was fixed in 0.60.


--
Tristan Partin
Neon (https://neon.tech)




Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock

2024-01-25 Thread Alvaro Herrera
Here's a touched-up version of this patch.

First, PROC_GLOBAL->clogGroupFirst and SlruCtl->latest_page_number
change from being protected by locks to being atomics, but there's no
mention of considering memory barriers that they should have.  Looking
at the code, I think the former doesn't need any additional barriers,
but latest_page_number is missing some, which I have added.  This
deserves another careful look.

Second and most user visibly, the GUC names seem to have been chosen
based on the source-code variables, which have never meant to be user-
visible.  So I renamed a few:

xact_buffers -> transaction_buffers
subtrans_buffers -> subtransaction_buffers
serial_buffers -> serializable_buffers
commit_ts_buffers -> commit_timestamp_buffers

(unchanged: multixact_offsets_buffers, multixact_members_buffers,
notify_buffers)

I did this explicitly trying to avoid using the term SLRU in a
user-visible manner, because what do users care?  But immediately after
doing this I realized that we already have pg_stat_slru, so maybe the
cat is already out of the bag, and so perhaps we should name these GUCs
as, say, slru_transaction_buffers?  That may make the connection between
these things a little more explicit.  (I do think we need to add
cross-links in the documentation of those GUCs to the pg_stat_slru
docs and vice-versa.)


Another thing that bothered me a bit is that we have auto-tuning for
transaction_buffers and commit_timestamp_buffers, but not for
subtransaction_buffers.  (Autotuning means you set the GUC to 0 and it
scales with shared_buffers).  I don't quite understand what's the reason
for the ommision, so I added it for subtrans too.  I think it may make
sense to do likewise for the multixact ones too, not sure.  It doesn't
seem worth having that for pg_serial and notify.

While messing about with these GUCs I realized that the usage of the
show_hook to print what the current number is, when autotuning is used,
was bogus: SHOW would print the number of blocks for (say)
transaction_buffers, but if you asked it to print (say)
multixact_offsets_buffers, it would give a size in MB or kB.  I'm sure
such an inconsistency would bite us.  So, digging around I found that a
good way to handle this is to remove the show_hook, and instead call
SetConfigOption() at the time when the ShmemInit function is called,
with the correct number of buffers determined.  This is pretty much what
is used for XLOGbuffers, and it works correctly as far as my testing
shows.
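
As an illustration of that approach (a minimal sketch; the helper name is made
up, not taken from the patch), the ShmemInit code pushes the computed block
count back into the GUC, so SHOW always reports a block count:

#include "postgres.h"
#include "utils/guc.h"

/*
 * Sketch: report the autotuned buffer count back to the GUC machinery at
 * shared-memory initialization time, the same trick used for wal_buffers
 * (XLOGbuffers).  Helper name is illustrative.
 */
static void
report_autotuned_slru_buffers(const char *guc_name, int nbuffers)
{
	char		buf[32];

	snprintf(buf, sizeof(buf), "%d", nbuffers);
	SetConfigOption(guc_name, buf, PGC_POSTMASTER, PGC_S_DYNAMIC_DEFAULT);
}

For example, CommitTsShmemInit() would call this with the number of buffers it
actually allocated, and SHOW commit_timestamp_buffers then returns that block
count rather than a value in kB or MB.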

Still with these auto-tuning GUCs, I noticed that the auto-tuning code
would continue to grow the buffer sizes with shared_buffers to
arbitrarily large values.  I added an arbitrary maximum of 1024 (8 MB),
which is much higher than the current value of 128; but if you have
(say) 30 GB of shared_buffers (not uncommon these days), do you really
need 30MB of pg_clog cache?  It seems mostly unnecessary ... and you can
still set it manually that way if you need it.  So, largely I just
rewrote those small functions completely.
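
A sketch of the autotuning rule described above (assuming PostgreSQL's
NBuffers, Min, and Max; the function name is made up):

#include "postgres.h"
#include "storage/bufmgr.h"		/* for NBuffers */

/*
 * A setting of 0 means "autotune": scale with shared_buffers but clamp to
 * the range [16, 1024] blocks (128 kB to 8 MB with 8 kB blocks).
 */
static int
autotune_slru_buffers(int configured_blocks)
{
	if (configured_blocks == 0)
		return Min(Max(NBuffers / 512, 16), 1024);

	return configured_blocks;
}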

I also made the SGML documentation and postgresql.sample.conf all match
what the code actually does.  The whole thing wasn't particularly
consistent.

I rewrote a bunch of code comments and moved stuff around to appear in
alphabetical order, etc.

More comment rewriting still pending.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 61038472c5..3e3119865a 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2006,6 +2006,145 @@ include_dir 'conf.d'
   
  
 
+ 
+  commit_timestamp_buffers (integer)
+  
+   commit_timestamp_buffers configuration parameter
+  
+  
+  
+   
+Specifies the amount of memory to use to cache the contents of
+pg_commit_ts (see
+).
+If this value is specified without units, it is taken as blocks,
+that is BLCKSZ bytes, typically 8kB.
+The default value is 0, which requests
+shared_buffers/512 up to 1024 blocks,
+but not fewer than 16 blocks.
+This parameter can only be set at server start.
+   
+  
+ 
+
+ 
+  multixact_members_buffers (integer)
+  
+   multixact_members_buffers configuration parameter
+  
+  
+  
+   
+Specifies the amount of shared memory to use to cache the contents
+of pg_multixact/members (see
+).
+If this value is specified without units, it is taken as blocks,
+that is BLCKSZ bytes, typically 8kB.
+The default value is 32.
+This parameter can only be set at server start.
+   
+  
+ 
+
+ 
+  multixact_offsets_buffers (integer)
+  
+   multixact_offsets_buffers configuration parameter
+  
+  
+  
+   
+Specifies the amount of shared memory to use to cache the contents
+of pg_multixact/offsets (see
+   

Re: Emit fewer vacuum records by reaping removable tuples during pruning

2024-01-25 Thread Melanie Plageman
On Thu, Jan 25, 2024 at 10:19 AM Robert Haas  wrote:
>
> On Thu, Jan 25, 2024 at 9:18 AM Melanie Plageman
>  wrote:
> > > To me, the first paragraph of this one misses the mark. What I thought
> > > we should be saying here was something like "If we don't have a
> > > cleanup lock, the code above has already processed this page to the
> > > extent that is possible. Otherwise, we either got the cleanup lock
> > > initially and have not processed the page yet, or we didn't get it
> > > initially, attempted to process it without the cleanup lock, and
> > > decided we needed one after all. Either way, if we now have the lock,
> > > we must prune, freeze, and count tuples."
> >
> > I see. Your suggestion makes sense. The sentence starting with
> > "Otherwise" is a bit long. I started to lose the thread at "decided we
> > needed one after all". You previously referred to the cleanup lock as
> > "it" -- once you start referring to it as "one", I as the future
> > developer am no longer sure we are talking about the cleanup lock (as
> > opposed to the page or something else).
>
> Ok... trying again:
>
> If we have a cleanup lock, we must now prune, freeze, and count
> tuples. We may have acquired the cleanup lock originally, or we may
> have gone back and acquired it after lazy_scan_noprune() returned
> false. Either way, the page hasn't been processed yet.

Cool. I might add "successfully" or "fully" to "Either way, the page
hasn't been processed yet"

> > I think it would be nice to clarify this comment. I think the "when
> > there is little free space anyway" is referring to the case in
> > lazy_vacuum() where we set do_index_vacuuming to false because "there
> > are almost zero TIDs". I initially thought it was saying that in the
> > failsafe vacuum case the pages whose free space we wouldn't record in
> > the FSM have little free space anyway -- which I didn't get. But then
> > I looked at where we set do_index_vacuuming to false.
>
> Oh... wow. That's kind of confusing; somehow I was thinking we were
> talking about free space on the disk, rather than newly free space in
> pages that could be added to the FSM.

Perhaps I misunderstood. I interpreted it to refer to the bypass optimization.

> And it seems really questionable
> whether that case is OK. I mean, in the emergency case, fine,
> whatever, we should do whatever it takes to get the system back up,
> and it should barely ever happen on a well-configured system. But this
> case could happen regularly, and losing track of free space could
> easily cause bloat.
>
> This might be another argument for moving FSM updates to the first
> heap pass, but that's a separate task from fixing the comment.

Yes, it seems we could miss recording space freed in the first pass if
we never end up doing a second pass. consider_bypass_optimization is
set to false only if index cleanup is explicitly enabled or there are
dead items accumulated for vacuum's second pass at some point.

- Melanie




Re: cleanup patches for incremental backup

2024-01-25 Thread Nathan Bossart
On Thu, Jan 25, 2024 at 10:06:41AM -0500, Robert Haas wrote:
> On Wed, Jan 24, 2024 at 2:39 PM Nathan Bossart  
> wrote:
>> That seems like a reasonable starting point.  Even if it doesn't help
>> determine the root cause, it should at least help rule out concurrent
>> summary removal.
> 
> Here is a patch for that.

LGTM.  The only thing I might add is the cutoff_time in that LOG.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: make dist using git archive

2024-01-25 Thread Peter Eisentraut

On 22.01.24 21:04, Tristan Partin wrote:

+git = find_program('git', required: false, native: true, disabler: true)
+bzip2 = find_program('bzip2', required: false, native: true, 
disabler: true)


This doesn't need to be a disabler. git is fine as-is. See later 
comment. Disablers only work like you are expecting when they are used 
like how git is used. Once you call a method like .path(), all bets are 
off.


ok, fixed


+distdir = meson.project_name() + '-' + meson.project_version()
+
+check_dirty_index = run_target('check-dirty-index',
+   command: [git, 'diff-index', 
'--quiet', 'HEAD'])


Seems like you might want to add -C here too?


done


+tar_bz2 = custom_target('tar.bz2',
+  build_always_stale: true,
+  command: [git, '-C', '@SOURCE_ROOT@', '-c', 'tar.tar.bz2.command=' 
+ bzip2.path() + ' -c', 'archive',

+    '--format', 'tar.bz2',
+    '--prefix', distdir + '/',


-    '-o', '@BUILD_ROOT@/@OUTPUT@',
+    '-o', join_paths(meson.build_root(), '@OUTPUT@'),

This will generate the tarballs in the build directory. Do the same for 
the previous target. Tested locally.


Fixed, thanks.  I had struggled with this one.


+    'HEAD', '.'],
+  install: false,
+  output: distdir + '.tar.bz2',
+)


The bz2 target should be wrapped in an `if bzip2.found()`.


Well, I think we want the dist step to fail if bzip2 is not there.  At 
least that is the current expectation.



+alias_target('pgdist', [check_dirty_index, tar_gz, tar_bz2])


Are you intending for the check_dirty_index target to prohibit the other 
two targets from running? Currently that is not the case.


Yes, that was the hope, and that's how the make dist variant works.  But 
I couldn't figure this out with meson.  Also, the above actually also 
doesn't work with older meson versions, so I had to comment this out to 
get CI to work.


If it is what 
you intend, use a stamp file or something to indicate a relationship. 
Alternatively, inline the git diff-index into the other commands. These 
might also do better as external scripts. It would reduce duplication 
between the autotools and Meson builds.


Yeah, maybe that's a direction.
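
For what it's worth, such a script would mostly just wrap the two
commands the targets already run, something like this (only a sketch;
the argument handling is invented):

```sh
#!/bin/sh
# Sketch of a shared dist-tarball script that both build systems could call.
# srcdir, builddir and the version string would be passed in by make/meson.
set -e

srcdir=$1
builddir=$2
distdir=postgresql-$3

# refuse to build tarballs from a dirty checkout
git -C "$srcdir" diff-index --quiet HEAD

git -C "$srcdir" archive --format tar.gz --prefix "$distdir/" HEAD \
    -o "$builddir/$distdir.tar.gz"

git -C "$srcdir" -c tar.tar.bz2.command='bzip2 -c' archive --format tar.bz2 \
    --prefix "$distdir/" HEAD -o "$builddir/$distdir.tar.bz2"
```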

The updated patch also supports vpath builds with make now.

I have also added a CI patch, for amusement.  Maybe we'll want to keep 
it, though.

From 13612f9a1c486e8acfe4156a7bc069a66fd30e77 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Thu, 25 Jan 2024 12:28:28 +0100
Subject: [PATCH v1 1/2] make dist uses git archive

Discussion: 
https://www.postgresql.org/message-id/flat/40e80f77-a294-4f29-a16f-e21bc7bc75fc%40eisentraut.org
---
 GNUmakefile.in | 34 --
 meson.build| 41 +
 2 files changed, 53 insertions(+), 22 deletions(-)

diff --git a/GNUmakefile.in b/GNUmakefile.in
index eba569e930e..680c765dd73 100644
--- a/GNUmakefile.in
+++ b/GNUmakefile.in
@@ -87,29 +87,19 @@ update-unicode: | submake-generated-headers 
submake-libpgport
 distdir= postgresql-$(VERSION)
 dummy  = =install=
 
+GIT = git
+
 dist: $(distdir).tar.gz $(distdir).tar.bz2
-   rm -rf $(distdir)
-
-$(distdir).tar: distdir
-   $(TAR) chf $@ $(distdir)
-
-.INTERMEDIATE: $(distdir).tar
-
-distdir-location:
-   @echo $(distdir)
-
-distdir:
-   rm -rf $(distdir)* $(dummy)
-   for x in `cd $(top_srcdir) && find . \( -name CVS -prune \) -o \( -name 
.git -prune \) -o -print`; do \
- file=`expr X$$x : 'X\./\(.*\)'`; \
- if test -d "$(top_srcdir)/$$file" ; then \
-   mkdir "$(distdir)/$$file" && chmod 777 "$(distdir)/$$file"; \
- else \
-   ln "$(top_srcdir)/$$file" "$(distdir)/$$file" >/dev/null 2>&1 \
- || cp "$(top_srcdir)/$$file" "$(distdir)/$$file"; \
- fi || exit; \
-   done
-   $(MAKE) -C $(distdir) distclean
+
+.PHONY: check-dirty-index
+check-dirty-index:
+   $(GIT) -C $(srcdir) diff-index --quiet HEAD
+
+$(distdir).tar.gz: check-dirty-index
+   $(GIT) -C $(srcdir) archive --format tar.gz --prefix $(distdir)/ HEAD 
-o $(abs_top_builddir)/$@
+
+$(distdir).tar.bz2: check-dirty-index
+   $(GIT) -C $(srcdir) -c tar.tar.bz2.command='$(BZIP2) -c' archive 
--format tar.bz2 --prefix $(distdir)/ HEAD -o $(abs_top_builddir)/$@
 
 distcheck: dist
rm -rf $(dummy)
diff --git a/meson.build b/meson.build
index 55184db2488..43884e7cfdd 100644
--- a/meson.build
+++ b/meson.build
@@ -3348,6 +3348,47 @@ run_target('help',
 
 
 
+###
+# Distribution archive
+###
+
+git = find_program('git', required: false, native: true)
+bzip2 = find_program('bzip2', required: false, native: true)
+
+distdir = meson.project_name() + '-' + meson.project_version()
+
+check_dirty_index = run_target('check-dirty-index',
+   command: [git, '-C', '@SOURCE_ROOT@',
+  

Re: Patch: Improve Boolean Predicate JSON Path Docs

2024-01-25 Thread Tom Lane
"David E. Wheeler"  writes:
> On Jan 24, 2024, at 16:32, Tom Lane  wrote:
>> +  
>> +   Predicate check expressions are required in the
>> +   @@ operator (and the
>> +   jsonb_path_match function), and should not be 
>> used
>> +   with the @? operator (or the
>> +   jsonb_path_exists function).
>> +  
>> + 
>> +

> I had this bit here:

>   
>Conversely, non-predicate jsonpath expressions should not 
> be
>used with the @@ operator (or the
>jsonb_path_match function).
>   

I changed the preceding para to say "... check expressions are
required in ...", which I thought was sufficient to cover that.
Also, the tabular description of the operator tells you not to do it.

> What do you think of also dropping the article from all the references to 
> “the strict mode” or “the lax mode”, to make them “strict mode” and “lax 
> mode”, respectively?

Certainly most of 'em don't need it.  I'll make it so.

regards, tom lane




Re: A compiling warning in jsonb_populate_record_valid

2024-01-25 Thread Tom Lane
Amit Langote  writes:
> On Fri, Jan 26, 2024 at 0:15 Tom Lane  wrote:
>> Amit Langote  writes:
>>> Ignoring the warning was my 1st thought too, because an old discussion I
>>> found about the warning was too old (2011).  The style I adopted in my
>>> “fix” is used in a few other places too, so I thought I might as well
>>> go for it.

>> Oh, well maybe I'm missing some context.  What comparable places did
>> you see?

> Sorry, not on my computer right now so can’t paste any code, but I was able
> to find a couple of functions (using Google ;) that refer to error_occurred
> directly for one reason or another:

OK, looking around, it seems like our pattern is that direct access
to escontext.error_occurred is okay in the same function that sets up
the escontext (so that there's no possibility of a NULL pointer).
So this change is fine and I withdraw my objection.  Sorry for
the noise.
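
That is, the pattern in question is roughly the following (a sketch,
not the exact jsonb_populate_record_valid() code; the function and
helper names here are made up):

```c
Datum
jsonb_something_valid(PG_FUNCTION_ARGS)
{
	/* escontext is a local, so there is no NULL pointer to worry about */
	ErrorSaveContext escontext = {T_ErrorSaveContext};

	/* hypothetical worker that reports failures softly via escontext */
	(void) do_the_work(fcinfo, (Node *) &escontext);

	PG_RETURN_BOOL(!escontext.error_occurred);
}
```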

regards, tom lane




Re: More new SQL/JSON item methods

2024-01-25 Thread Andrew Dunstan


On 2024-01-18 Th 09:25, Jeevan Chalke wrote:



On Thu, Jan 18, 2024 at 1:03 AM Peter Eisentraut 
 wrote:


On 17.01.24 10:03, Jeevan Chalke wrote:
> I added unary '+' and '-' support as well and thus thought of
having
> separate rules altogether rather than folding those in.
>
>     Per SQL standard, the precision and scale arguments are unsigned
>     integers, so unary plus and minus signs are not supported. 
So my patch
>     removes that support, but I didn't adjust the regression
tests for that.
>
>
> However, PostgreSQL numeric casting does support a negative
scale. Here
> is an example:
>
> # select '12345'::numeric(4,-2);
>   numeric
> ---------
>     12300
> (1 row)
>
> And thus thought of supporting those.
> Do we want this JSON item method to behave differently here?

Ok, it would make sense to support this in SQL/JSON as well.


OK. So with this, we don't need changes done in your 0001 patches.
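
For the archives, that means something like the following should work
with the merged patch (a sketch; the result shown assumes the same
semantics as the numeric cast above):

```sql
-- .decimal() with a negative scale, mirroring '12345'::numeric(4,-2)
SELECT jsonb_path_query('12345', '$.decimal(4, -2)');
-- expected: 12300
```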


> I will merge them all into one and will try to keep them in the
order
> specified in sql_features.txt.
> However, for documentation, it makes more sense to keep them in
logical
> order than the alphabetical one. What are your views on this?

The documentation can be in a different order.


Thanks, Andrew and Peter for the confirmation.

Attached merged single patch along these lines.



Thanks, I have pushed this.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com


Re: A compiling warning in jsonb_populate_record_valid

2024-01-25 Thread Amit Langote
On Fri, Jan 26, 2024 at 0:15 Tom Lane  wrote:

> Amit Langote  writes:
> >  On Thu, Jan 25, 2024 at 23:57 Tom Lane  wrote:
> >> -1 please.  We should not break that abstraction for the sake
> >> of ignorable warnings on ancient compilers.
>
> > Ignoring the warning was my 1st thought too, because an old discussion I
> > found about the warning was too old (2011).  The style I adopted in my
> > “fix” is used in a few other places too, so I thought I might as well
> > go for it.
>
> Oh, well maybe I'm missing some context.  What comparable places did
> you see?


Sorry, not on my computer right now so can’t paste any code, but I was able
to find a couple of functions (using Google ;) that refer to error_occurred
directly for one reason or another:

process_integer_literal(),
xml_is_document()


Re: Unnecessary smgropen in {heapam_relation,index}_copy_data?

2024-01-25 Thread Japin Li


On Thu, 25 Jan 2024 at 21:43, Aleksander Alekseev  
wrote:
> Hi,
>
>> I find heapam_relation_copy_data() and index_copy_data() have the following 
>> code:
>>
>> dstrel = smgropen(*newrlocator, rel->rd_backend);
>>
>> ...
>>
>> RelationCreateStorage(*newrlocator, rel->rd_rel->relpersistence, 
>> true);
>>
>> The smgropen() is also called by RelationCreateStorage(), why should we call
>> smgropen() explicitly here?
>>
>> I try to remove the smgropen(), and all tests passed.
>
> That's a very good question. Note that the second argument of
> smgropen() used to create dstrel changes after applying your patch.
> I'm not 100% sure whether this is significant or not.
>

Thanks for the review.

According to the comments on RelationData->rd_backend, it is the backend id if
the relation is temporary.  The difference is that RelationCreateStorage() uses
relpersistence to determine the backend id.
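
For reference, the relevant part of RelationCreateStorage() looks
roughly like this (paraphrased from memory, not the literal source):

```c
/* Paraphrase of how RelationCreateStorage() derives the backend id
 * (and WAL-logging) from relpersistence. */
BackendId	backend;
bool		needs_wal;

switch (relpersistence)
{
	case RELPERSISTENCE_TEMP:
		backend = BackendIdForTempRelations();
		needs_wal = false;
		break;
	case RELPERSISTENCE_UNLOGGED:
		backend = InvalidBackendId;
		needs_wal = false;
		break;
	case RELPERSISTENCE_PERMANENT:
		backend = InvalidBackendId;
		needs_wal = true;
		break;
	default:
		elog(ERROR, "invalid relpersistence: %c", relpersistence);
}
```

So for a temporary relation it should come out the same as
rel->rd_backend anyway, and InvalidBackendId otherwise.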

> I added your patch to the nearest open commitfest so that we will not lose it:
>
> https://commitfest.postgresql.org/47/4794/

Thank you.




Re: Emit fewer vacuum records by reaping removable tuples during pruning

2024-01-25 Thread Robert Haas
On Thu, Jan 25, 2024 at 9:18 AM Melanie Plageman
 wrote:
> > To me, the first paragraph of this one misses the mark. What I thought
> > we should be saying here was something like "If we don't have a
> > cleanup lock, the code above has already processed this page to the
> > extent that is possible. Otherwise, we either got the cleanup lock
> > initially and have not processed the page yet, or we didn't get it
> > initially, attempted to process it without the cleanup lock, and
> > decided we needed one after all. Either way, if we now have the lock,
> > we must prune, freeze, and count tuples."
>
> I see. Your suggestion makes sense. The sentence starting with
> "Otherwise" is a bit long. I started to lose the thread at "decided we
> needed one after all". You previously referred to the cleanup lock as
> "it" -- once you start referring to it as "one", I as the future
> developer am no longer sure we are talking about the cleanup lock (as
> opposed to the page or something else).

Ok... trying again:

If we have a cleanup lock, we must now prune, freeze, and count
tuples. We may have acquired the cleanup lock originally, or we may
have gone back and acquired it after lazy_scan_noprune() returned
false. Either way, the page hasn't been processed yet.

> I think it would be nice to clarify this comment. I think the "when
> there is little free space anyway" is referring to the case in
> lazy_vacuum() where we set do_index_vacuuming to false because "there
> are almost zero TIDs". I initially thought it was saying that in the
> failsafe vacuum case the pages whose free space we wouldn't record in
> the FSM have little free space anyway -- which I didn't get. But then
> I looked at where we set do_index_vacuuming to false.

Oh... wow. That's kind of confusing; somehow I was thinking we were
talking about free space on the disk, rather than newly free space in
pages that could be added to the FSM. And it seems really questionable
whether that case is OK. I mean, in the emergency case, fine,
whatever, we should do whatever it takes to get the system back up,
and it should barely ever happen on a well-configured system. But this
case could happen regularly, and losing track of free space could
easily cause bloat.

This might be another argument for moving FSM updates to the first
heap pass, but that's a separate task from fixing the comment.

> As for the last sentence starting with "Besides", even with Peter's
> explanation I still am not sure what it should say. There are blocks
> whose free space we don't record in the first heap pass. Then, due to
> skipping index vacuuming and the second heap pass, we also don't
> record their free space in the second heap pass. I think he is saying
> that once we set do_index_vacuuming to false, we will stop skipping
> updating the FSM after the first pass for future blocks. So, future
> blocks will have their free space recorded in the FSM. But that feels
> self-evident.

Yes, I don't think that necessarily needs to be mentioned here.

> The more salient point is that there are some blocks
> whose free space is not recorded (those whose first pass happened
> before unsetting do_index_vacuuming and whose second pass did not
> happen before do_index_vacuuming is unset). The extra sentence made me
> think there was some way we might go back and record free space for
> those blocks, but that is not true.

I don't really see why that sentence made you think that, but it's not
important. I agree with you about what point we need to emphasize
here.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: A compiling warning in jsonb_populate_record_valid

2024-01-25 Thread Tom Lane
Amit Langote  writes:
>  On Thu, Jan 25, 2024 at 23:57 Tom Lane  wrote:
>> -1 please.  We should not break that abstraction for the sake
>> of ignorable warnings on ancient compilers.

> Ignoring the warning was my 1st thought too, because an old discussion I
> found about the warning was too old (2011).  The style I adopted in my
> “fix” is used in a few other places too, so I thought I might as well
> go for it.

Oh, well maybe I'm missing some context.  What comparable places did
you see?

regards, tom lane




Re: A compiling warning in jsonb_populate_record_valid

2024-01-25 Thread Amit Langote
 On Thu, Jan 25, 2024 at 23:57 Tom Lane  wrote:

> Amit Langote  writes:
> > On Thu, Jan 25, 2024 at 2:59 PM Richard Guo 
> wrote:
> >> I came across a warning when building master (a044e61f1b) on old GCC
> >> (4.8.5).
>
> > Will apply the attached, which does this:
>
> > -   return BoolGetDatum(!SOFT_ERROR_OCCURRED());
> > +   return BoolGetDatum(!escontext.error_occurred);
>
> -1 please.  We should not break that abstraction for the sake
> of ignorable warnings on ancient compilers.


Ignoring the warning was my 1st thought too, because an old discussion I
found about the warning was too old (2011).  The style I adopted in my
“fix” is used in a few other places too, so I thought I might as well
go for it.

Anyway, I’m fine with reverting.

>


Re: cleanup patches for incremental backup

2024-01-25 Thread Robert Haas
On Wed, Jan 24, 2024 at 2:39 PM Nathan Bossart  wrote:
> That seems like a reasonable starting point.  Even if it doesn't help
> determine the root cause, it should at least help rule out concurrent
> summary removal.

Here is a patch for that.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


v1-0001-Temporary-patch-to-help-debug-pg_walsummary-test-.patch
Description: Binary data


Re: A compiling warning in jsonb_populate_record_valid

2024-01-25 Thread Tom Lane
Amit Langote  writes:
> On Thu, Jan 25, 2024 at 2:59 PM Richard Guo  wrote:
>> I came across a warning when building master (a044e61f1b) on old GCC
>> (4.8.5).

> Will apply the attached, which does this:

> -   return BoolGetDatum(!SOFT_ERROR_OCCURRED());
> +   return BoolGetDatum(!escontext.error_occurred);

-1 please.  We should not break that abstraction for the sake
of ignorable warnings on ancient compilers.

regards, tom lane




Re: remaining sql/json patches

2024-01-25 Thread jian he
On Thu, Jan 25, 2024 at 7:54 PM Amit Langote  wrote:
>
> >
> > The problem with returning comp_domain_with_typmod from json_value()
> > seems to be that it's using a text-to-record CoerceViaIO expression
> > picked from JsonExpr.item_coercions, which behaves differently than
> > the expression tree that the following uses:
> >
> > select ('abcd', 42)::comp_domain_with_typmod;
> >row
> > --
> >  (abc,42)
> > (1 row)
>
> Oh, it hadn't occurred to me to check what trying to coerce a "string"
> containing the record literal would do:
>
> select '(''abcd'', 42)'::comp_domain_with_typmod;
> ERROR:  value too long for type character(3)
> LINE 1: select '(''abcd'', 42)'::comp_domain_with_typmod;
>
> which is the same thing as what the JSON_QUERY() and JSON_VALUE() are
> running into.  So, it might be fair to think that the error is not a
> limitation of the SQL/JSON patch but an underlying behavior that it
> has to accept as is.
>

Hi, I have come to terms with these cases.
What bugs me now is the first query of the following 4 cases (for comparison).
SELECT JSON_QUERY(jsonb '[1,2]', '$' RETURNING char(3) omit quotes);
SELECT JSON_QUERY(jsonb '[1,2]', '$' RETURNING char(3) keep quotes);
SELECT JSON_QUERY(jsonb '[1,2]', '$' RETURNING text omit quotes);
SELECT JSON_QUERY(jsonb '[1,2]', '$' RETURNING text keep quotes);

I did some minor refactoring of the function coerceJsonFuncExprOutput.
It makes the following queries return NULL instead of an error, which
matches what json_value returns.

SELECT JSON_QUERY(jsonb '"123"', '$' RETURNING int2);
SELECT JSON_QUERY(jsonb '"123"', '$' RETURNING int4);
SELECT JSON_QUERY(jsonb '"123"', '$' RETURNING int8);
SELECT JSON_QUERY(jsonb '"123"', '$' RETURNING bool);
SELECT JSON_QUERY(jsonb '"123"', '$' RETURNING numeric);
SELECT JSON_QUERY(jsonb '"123"', '$' RETURNING real);
SELECT JSON_QUERY(jsonb '"123"', '$' RETURNING float8);


v1-0001-minor-refactor-coerceJsonFuncExprOutput.no-cfbot
Description: Binary data


Re: Emit fewer vacuum records by reaping removable tuples during pruning

2024-01-25 Thread Melanie Plageman
On Thu, Jan 25, 2024 at 8:57 AM Robert Haas  wrote:
>
> On Wed, Jan 24, 2024 at 9:13 PM Melanie Plageman
>  wrote:
> > v12 attached has my attempt at writing better comments for this
> > section of lazy_scan_heap().
>
> + /*
> + * If we didn't get the cleanup lock and the page is not new or empty,
> + * we can still collect LP_DEAD items in the dead_items array for
> + * later vacuuming, count live and recently dead tuples for vacuum
> + * logging, and determine if this block could later be truncated. If
> + * we encounter any xid/mxids that require advancing the
> + * relfrozenxid/relminxid, we'll have to wait for a cleanup lock and
> + * call lazy_scan_prune().
> + */
>
> I like this comment. I would probably drop "and the page is not new or
> empty" from it since that's really speaking to the previous bit of
> code, but it wouldn't be the end of the world to keep it, either.

Yes, probably best to get rid of the part about new or empty.

>   /*
> - * Prune, freeze, and count tuples.
> + * If we got a cleanup lock, we can prune and freeze tuples and
> + * defragment the page. If we didn't get a cleanup lock, we will still
> + * consider whether or not to update the FSM.
>   *
> - * Accumulates details of remaining LP_DEAD line pointers on page in
> - * dead_items array.  This includes LP_DEAD line pointers that we
> - * pruned ourselves, as well as existing LP_DEAD line pointers that
> - * were pruned some time earlier.  Also considers freezing XIDs in the
> - * tuple headers of remaining items with storage. It also determines
> - * if truncating this block is safe.
> + * Like lazy_scan_noprune(), lazy_scan_prune() will count
> + * recently_dead_tuples and live tuples for vacuum logging, determine
> + * if the block can later be truncated, and accumulate the details of
> + * remaining LP_DEAD line pointers on the page in the dead_items
> + * array. These dead items include those pruned by lazy_scan_prune()
> + * as well as line pointers previously marked LP_DEAD.
>   */
>
> To me, the first paragraph of this one misses the mark. What I thought
> we should be saying here was something like "If we don't have a
> cleanup lock, the code above has already processed this page to the
> extent that is possible. Otherwise, we either got the cleanup lock
> initially and have not processed the page yet, or we didn't get it
> initially, attempted to process it without the cleanup lock, and
> decided we needed one after all. Either way, if we now have the lock,
> we must prune, freeze, and count tuples."

I see. Your suggestion makes sense. The sentence starting with
"Otherwise" is a bit long. I started to lose the thread at "decided we
needed one after all". You previously referred to the cleanup lock as
"it" -- once you start referring to it as "one", I as the future
developer am no longer sure we are talking about the cleanup lock (as
opposed to the page or something else).

> - * Note: It's not in fact 100% certain that we really will call
> - * lazy_vacuum_heap_rel() -- lazy_vacuum() might yet opt to skip index
> - * vacuuming (and so must skip heap vacuuming).  This is deemed okay
> - * because it only happens in emergencies, or when there is very
> - * little free space anyway. (Besides, we start recording free space
> - * in the FSM once index vacuuming has been abandoned.)
>
> Here's a suggestion from me:
>
> Note: In corner cases, it's possible to miss updating the FSM
> entirely. If index vacuuming is currently enabled, we'll skip the FSM
> update now. But if failsafe mode is later activated, disabling index
> vacuuming, there will also be no opportunity to update the FSM later,
> because we'll never revisit this page. Since updating the FSM is
> desirable but not absolutely required, that's OK.
>
> I think this expresses the same sentiment as the current comment, but
> IMHO more clearly. The one part of the current comment that I don't
> understand at all is the remark about "when there is very little
> freespace anyway". I get that if the failsafe activates we won't come
> back to the page, which is the "only happens in emergencies" part of
> the existing comment. But the current phrasing makes it sound like
> there is a second case where it can happen -- "when there is very
> little free space anyway" -- and I don't know what that is talking
> about. If it's important, we should try to make it clearer.
>
> We could also just decide to keep this entire paragraph as it is for
> purposes of the present patch. The part I really thought needed
> adjusting was "Prune, freeze, and count tuples."

I think it would be nice to clarify this comment. I think the "when
there is little free space anyway" is referring to the case in
lazy_vacuum() where we set do_index_vacuuming to false because "there
are almost zero TIDs". I initially thought it was saying that in the
failsafe vacuum case the pages whose free space we wouldn't record in
the FSM have little free space anyway -- which I didn't get. But then
I looked 

Re: Emit fewer vacuum records by reaping removable tuples during pruning

2024-01-25 Thread Robert Haas
On Wed, Jan 24, 2024 at 9:13 PM Melanie Plageman
 wrote:
> v12 attached has my attempt at writing better comments for this
> section of lazy_scan_heap().

+ /*
+ * If we didn't get the cleanup lock and the page is not new or empty,
+ * we can still collect LP_DEAD items in the dead_items array for
+ * later vacuuming, count live and recently dead tuples for vacuum
+ * logging, and determine if this block could later be truncated. If
+ * we encounter any xid/mxids that require advancing the
+ * relfrozenxid/relminxid, we'll have to wait for a cleanup lock and
+ * call lazy_scan_prune().
+ */

I like this comment. I would probably drop "and the page is not new or
empty" from it since that's really speaking to the previous bit of
code, but it wouldn't be the end of the world to keep it, either.

  /*
- * Prune, freeze, and count tuples.
+ * If we got a cleanup lock, we can prune and freeze tuples and
+ * defragment the page. If we didn't get a cleanup lock, we will still
+ * consider whether or not to update the FSM.
  *
- * Accumulates details of remaining LP_DEAD line pointers on page in
- * dead_items array.  This includes LP_DEAD line pointers that we
- * pruned ourselves, as well as existing LP_DEAD line pointers that
- * were pruned some time earlier.  Also considers freezing XIDs in the
- * tuple headers of remaining items with storage. It also determines
- * if truncating this block is safe.
+ * Like lazy_scan_noprune(), lazy_scan_prune() will count
+ * recently_dead_tuples and live tuples for vacuum logging, determine
+ * if the block can later be truncated, and accumulate the details of
+ * remaining LP_DEAD line pointers on the page in the dead_items
+ * array. These dead items include those pruned by lazy_scan_prune()
+ * as well as line pointers previously marked LP_DEAD.
  */

To me, the first paragraph of this one misses the mark. What I thought
we should be saying here was something like "If we don't have a
cleanup lock, the code above has already processed this page to the
extent that is possible. Otherwise, we either got the cleanup lock
initially and have not processed the page yet, or we didn't get it
initially, attempted to process it without the cleanup lock, and
decided we needed one after all. Either way, if we now have the lock,
we must prune, freeze, and count tuples."

The second paragraph seems fine.

- * Note: It's not in fact 100% certain that we really will call
- * lazy_vacuum_heap_rel() -- lazy_vacuum() might yet opt to skip index
- * vacuuming (and so must skip heap vacuuming).  This is deemed okay
- * because it only happens in emergencies, or when there is very
- * little free space anyway. (Besides, we start recording free space
- * in the FSM once index vacuuming has been abandoned.)

Here's a suggestion from me:

Note: In corner cases, it's possible to miss updating the FSM
entirely. If index vacuuming is currently enabled, we'll skip the FSM
update now. But if failsafe mode is later activated, disabling index
vacuuming, there will also be no opportunity to update the FSM later,
because we'll never revisit this page. Since updating the FSM is
desirable but not absolutely required, that's OK.

I think this expresses the same sentiment as the current comment, but
IMHO more clearly. The one part of the current comment that I don't
understand at all is the remark about "when there is very little
freespace anyway". I get that if the failsafe activates we won't come
back to the page, which is the "only happens in emergencies" part of
the existing comment. But the current phrasing makes it sound like
there is a second case where it can happen -- "when there is very
little free space anyway" -- and I don't know what that is talking
about. If it's important, we should try to make it clearer.

We could also just decide to keep this entire paragraph as it is for
purposes of the present patch. The part I really thought needed
adjusting was "Prune, freeze, and count tuples."

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [PATCH] Add native windows on arm64 support

2024-01-25 Thread Dave Cramer
On Thu, 25 Jan 2024 at 08:31, Dave Cramer  wrote:

>
>
> On Wed, 24 Jan 2024 at 19:03, Michael Paquier  wrote:
>
>> On Wed, Jan 24, 2024 at 06:45:21AM -0500, Dave Cramer wrote:
>> > I managed to get it to build the vcvarsall arch needs to be x64. I need
>> to
>> > add some options, but the patch above needs to be applied to build it.
>>
>> Nice.  If I may ask, what kind of host and/or configuration have you
>> used to reach a state where the code can be compiled and run tests
>> with meson?  If you have found specific steps, it may be a good thing
>> to document that on the wiki, say around [1].
>>
>
> I am running a Mac mini M2 with Parallels running Windows 11 Pro
> The only thing required is this patch. Running in a native developer
> prompt
>
> meson setup build --prefix=c:\postgres
> cd build
> ninja
> ninja install
> ninja test
>
> all run without errors
> I also have buildfarm running and will run that today
>
> Dave
>
>>
>> Perhaps you have not included TAP?  It may be fine in terms of runtime
>> checks and coverage.
>>
>
> Does TAP have to be explicitly added to the meson build ?
>
> Dave
>


I tried running my buildfarm using my git repo and my branch, but get the
following error
Status Line: 492 bad branch parameter
Content:
bad branch parameter fix_arm

Web txn failed with status: 1


Re: Unnecessary smgropen in {heapam_relation,index}_copy_data?

2024-01-25 Thread Aleksander Alekseev
Hi,

> I find heapam_relation_copy_data() and index_copy_data() have the following 
> code:
>
> dstrel = smgropen(*newrlocator, rel->rd_backend);
>
> ...
>
> RelationCreateStorage(*newrlocator, rel->rd_rel->relpersistence, 
> true);
>
> The smgropen() is also called by RelationCreateStorage(), why should we call
> smgropen() explicitly here?
>
> I try to remove the smgropen(), and all tests passed.

That's a very good question. Note that the second argument of
smgropen() used to create dstrel changes after applying your patch.
I'm not 100% sure whether this is significant or not.

I added your patch to the nearest open commitfest so that we will not lose it:

https://commitfest.postgresql.org/47/4794/


-- 
Best regards,
Aleksander Alekseev




Re: [PATCH] Add native windows on arm64 support

2024-01-25 Thread Dave Cramer
On Wed, 24 Jan 2024 at 19:03, Michael Paquier  wrote:

> On Wed, Jan 24, 2024 at 06:45:21AM -0500, Dave Cramer wrote:
> > I managed to get it to build the vcvarsall arch needs to be x64. I need
> to
> > add some options, but the patch above needs to be applied to build it.
>
> Nice.  If I may ask, what kind of host and/or configuration have you
> used to reach a state where the code can be compiled and run tests
> with meson?  If you have found specific steps, it may be a good thing
> to document that on the wiki, say around [1].
>

I am running a Mac mini M2 with Parallels running Windows 11 Pro
The only thing required is this patch. Running in a native developer
prompt

meson setup build --prefix=c:\postgres
cd build
ninja
ninja install
ninja test

all run without errors
I also have buildfarm running and will run that today

Dave

>
> Perhaps you have not included TAP?  It may be fine in terms of runtime
> checks and coverage.
>

Does TAP have to be explicitly added to the meson build ?

Dave


Re: Use of backup_label not noted in log

2024-01-25 Thread Michael Banck
Hi,

On Thu, Jan 25, 2024 at 08:56:52AM -0400, David Steele wrote:
> I would still advocate for a back patch here. It is frustrating to get logs
> from users that just say:
> 
> LOG:  invalid checkpoint record
> PANIC:  could not locate a valid checkpoint record
> 
> It would be very helpful to know what the checkpoint record LSN was in this
> case.

I agree.


Michael




Re: Remove unused fields in ReorderBufferTupleBuf

2024-01-25 Thread Aleksander Alekseev
Hi,

> > Here is the corrected patch.
>
> Thank you for updating the patch! I have some comments:

Thanks for the review.

> -tuple = (ReorderBufferTupleBuf *)
> +tuple = (HeapTuple)
>  MemoryContextAlloc(rb->tup_context,
> -
> sizeof(ReorderBufferTupleBuf) +
> +   HEAPTUPLESIZE +
> MAXIMUM_ALIGNOF +
> alloc_len);
> -tuple->alloc_tuple_size = alloc_len;
> -tuple->tuple.t_data = ReorderBufferTupleBufData(tuple);
> +tuple->t_data = (HeapTupleHeader)((char *)tuple + HEAPTUPLESIZE);
>
> Why do we need to add MAXIMUM_ALIGNOF? since HEAPTUPLESIZE is the
> MAXALIGN'd size, it seems we don't need it. heap_form_tuple() does a
> similar thing but it doesn't add it.

Indeed. I gave it a try and nothing crashed, so it appears that
MAXIMUM_ALIGNOF is not needed.
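
So the allocation ends up looking like this (sketch of the resulting
code; HEAPTUPLESIZE is already MAXALIGN'd, so t_data stays suitably
aligned):

```c
tuple = (HeapTuple) MemoryContextAlloc(rb->tup_context,
                                       HEAPTUPLESIZE + alloc_len);
tuple->t_data = (HeapTupleHeader) ((char *) tuple + HEAPTUPLESIZE);
```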

> ---
>  xl_parameter_change *xlrec =
> -(xl_parameter_change *)
> XLogRecGetData(buf->record);
> +(xl_parameter_change *)
> XLogRecGetData(buf->record);
>
> Unnecessary change.

That's pgindent. Fixed.

> ---
> -/*
> - * Free a ReorderBufferTupleBuf.
> - */
> -void
> -ReorderBufferReturnTupleBuf(ReorderBuffer *rb, ReorderBufferTupleBuf *tuple)
> -{
> -pfree(tuple);
> -}
> -
>
> Why does ReorderBufferReturnTupleBuf need to be moved from
> reorderbuffer.c to reorderbuffer.h? It seems not related to this
> refactoring patch so I think we should do it in a separate patch if we
> really want it. I'm not sure it's necessary, though.

OK, fixed.

-- 
Best regards,
Aleksander Alekseev


v6-0001-Remove-ReorderBufferTupleBuf-structure.patch
Description: Binary data


Re: Use of backup_label not noted in log

2024-01-25 Thread David Steele

On 1/25/24 04:12, Michael Paquier wrote:

On Mon, Jan 22, 2024 at 04:36:27PM +0900, Michael Paquier wrote:

+   if (ControlFile->backupStartPoint != InvalidXLogRecPtr)

Nit 1: I would use XLogRecPtrIsInvalid here.

+   ereport(LOG,
+   (errmsg("completed backup recovery with redo LSN %X/%X",
+   LSN_FORMAT_ARGS(oldBackupStartPoint)));

Nit 2: How about adding backupEndPoint in this LOG?  That would give:
"completed backup recovery with redo LSN %X/%X and end LSN %X/%X".


Hearing nothing, I've just applied a version of the patch with these
two modifications on HEAD.  If this needs tweaks, just let me know.


I had planned to update the patch this morning -- so thanks for doing 
that. I think having the end point in the message makes perfect sense.
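
Concretely, with both nits folded in I'd expect the code to end up
looking something like this (just a sketch; the end-point variable name
is my guess):

```c
if (!XLogRecPtrIsInvalid(ControlFile->backupStartPoint))
	ereport(LOG,
			(errmsg("completed backup recovery with redo LSN %X/%X and end LSN %X/%X",
					LSN_FORMAT_ARGS(oldBackupStartPoint),
					LSN_FORMAT_ARGS(oldBackupEndPoint))));
```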


I would still advocate for a back patch here. It is frustrating to get 
logs from users that just say:


LOG:  invalid checkpoint record
PANIC:  could not locate a valid checkpoint record

It would be very helpful to know what the checkpoint record LSN was in 
this case.


Regards,
-David




Re: UUID v7

2024-01-25 Thread Aleksander Alekseev
Hi,

> Andrey, many thanks for the updated patch.
>
> LGTM, cfbot is happy and I don't think we have any open items left. So
> changing CF entry status back to RfC.

PFA v14. I changed:

```
elog(ERROR, "Time argument of UUID v7 cannot exceed 6 bytes");
```

... to:

```
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("Time argument of UUID v7 is outside of the valid range")));
```

Which IMO tells a bit more to the average user and is translatable.

> At a quick glance, the patch needs improving English, IMO.

Agree. We could use some help from a native English speaker for this.

-- 
Best regards,
Aleksander Alekseev


v14-0001-Implement-UUID-v7.patch
Description: Binary data

