is pg_log_standby_snapshot() really needed?

2023-06-06 Thread Jaime Casanova
Hi,

I'm testing the ability to have a logical replica subscribed from a standby.

Of course, I'm doing this in a laboratory with no activity, so
everything gets stuck after creating the subscription (the main slot).
This is clearly because every time it creates a temp slot to copy
a table, it needs a running-xacts record from the primary.

Now, I was solving this by executing CHECKPOINT on the primary, and
also noted that pg_switch_wal() works too. After that, I read about
pg_log_standby_snapshot().

So, I wonder if that function is really needed, because as I said I
solved it with already existing functionality. Or, if it is really
needed, maybe it is a bug that CHECKPOINT and pg_switch_wal() have
the same effect?
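For reference, the three workarounds look like this when run on the primary (a sketch based on the above; pg_log_standby_snapshot() is new in PG 16, and the comments are my summary rather than anything authoritative):

```sql
-- Each of these gets a running-xacts snapshot into the WAL stream that the
-- standby can replay, unblocking the snapshot build:
CHECKPOINT;                        -- logs running xacts as a side effect of checkpointing
SELECT pg_switch_wal();            -- also worked in the test above, with segment-switch overhead
SELECT pg_log_standby_snapshot();  -- PG 16+: logs just the snapshot, without the extra work
```

Presumably the point of the dedicated function is that it avoids the checkpoint or segment-switch overhead of the other two.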

-- 
Jaime Casanova
Director de Servicios Profesionales
SYSTEMGUARDS - Consultores de PostgreSQL




Re: TRAP: FailedAssertion("prev_first_lsn < cur_txn->first_lsn", File: "reorderbuffer.c", Line: 927, PID: 568639)

2023-06-06 Thread Amit Kapila
On Wed, Jun 7, 2023 at 6:18 AM Tomas Vondra
 wrote:
>
> On 6/6/23 17:42, Tomas Vondra wrote:
> >
>
> I investigated this a bit more, and the problem actually seems to be
> more like this:
>
> 1) we create a new logical replication slot
>
> 2) while building the initial snapshot, we start with current insert
> location, and then process records
>
> 3) for RUNNING_XACTS we call SnapBuildProcessRunningXacts, which calls
> SnapBuildFindSnapshot
>
> 4) SnapBuildFindSnapshot does this:
>
> else if (!builder->building_full_snapshot &&
>  SnapBuildRestore(builder, lsn))
> {
> /* there won't be any state to cleanup */
> return false;
> }
>
> 5) because create_logical_replication_slot and get_changes both call
> CreateInitDecodingContext with needs_full_snapshot=false, we end up
> calling SnapBuildRestore()
>
> 6) once in a while this likely hits a snapshot created by a concurrent
> session (for another logical slot) with SNAPBUILD_CONSISTENT state
>

I think this analysis doesn't match what you mentioned in the
previous email, which is as follows:
> > However, we skip the assignment, because the log for this call of
> > get_changes says this:
> >
> >LOG:  logical decoding found consistent point at 0/6462E5D8
> >
> > so we fail to realize the 26662 is a subxact.

This is because the above LOG is printed when
"running->oldestRunningXid == running->nextXid", not when we restore
the snapshot. Am I missing something?

>
> I don't know what's the correct fix for this. Maybe we should set
> needs_full_snapshot=true in create_logical_replication_slot when
> creating the initial snapshot? Maybe we should use true even in
> pg_logical_slot_get_changes_guts? This seems to fix the crashes ...
>

I don't think that is advisable, because setting "needs_full_snapshot"
to true for decoding means the snapshot will start tracking
non-catalog committed xacts as well, which is costly and is not
required for this case.

> That'll prevent reading the serialized snapshots like this, but how
> could that ever work? It seems pretty much guaranteed to ignore any
> assignments that happened right before the snapshot?
>

This part needs some analysis/thoughts. BTW, do you mean that it skips
the assignment (a) because the assignment record is before we reach a
consistent point, or (b) because we start reading WAL after the
assignment, or (c) something else? If you intend to say (a) then can
you please point me to the code you are referring to for the same?

-- 
With Regards,
Amit Kapila.




Re: Let's make PostgreSQL multi-threaded

2023-06-06 Thread Dilip Kumar
On Wed, Jun 7, 2023 at 7:32 AM Tom Lane  wrote:
>
> Thomas Munro  writes:
> > ... My point is
> > that we’re doing pretty unreasonable and inefficient contortions to
> > develop new features -- we're not just happily chugging along without
> > threads at no cost.
>
> Sure, but it's not like chugging along *with* threads would be no-cost.
> Others have already pointed out the permanent downsides of that, such
> as loss of isolation between sessions leading to debugging headaches
> (and, I predict, more than one security-grade bug).

I agree that in some cases debugging would be harder, but I feel there
are cases where the thread model will make the debugging experience
better. E.g., breaking at the entry point of a new parallel worker or
other worker is hard with the process model, but in my experience that
would be very smooth with the thread model.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Let's make PostgreSQL multi-threaded

2023-06-06 Thread Dilip Kumar
On Tue, Jun 6, 2023 at 11:30 PM Robert Haas  wrote:
>
> On Tue, Jun 6, 2023 at 11:46 AM Heikki Linnakangas  wrote:
> > Bruce was worried about the loss of isolation that the separate address
> > spaces gives, and Jeremy shared an anecdote on that. That is an
> > objection to the idea itself, i.e. even if transition was smooth,
> > bug-free and effortless, that point remains. I personally think the
> > isolation we get from separate address spaces is overrated. Yes, it
> > gives you some protection, but given how much shared memory there is,
> > the blast radius is large even with separate backend processes.
>
> An interesting idea might be to look at the places where we ereport or
> elog FATAL due to some kind of backend data structure corruption and
> ask whether there would be an argument for elevating the level to
> PANIC if we changed this. There are definitely some places where we
> argue that the only corrupted state is backend-local and thus we don't
> need to PANIC if it's corrupted. I wonder to what extent this change
> would undermine that argument.

With the threaded model, that shouldn't change, right? Even though all
memory space is now shared across threads, we can maintain the same
rules for modifying critical shared data structures, i.e. modifying
such memory should still fall under a critical section, so I guess the
rules for promoting the error level to PANIC will remain the same.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Improve join_search_one_level readibilty (one line change)

2023-06-06 Thread 謝東霖
Thank you, Julien, for letting me know that cfbot doesn't test txt files.
Much appreciated!


v2-0001-Improve-left-deep-tree-dp-algorithm-s-readability.patch
Description: Binary data


Re: Improve join_search_one_level readibilty (one line change)

2023-06-06 Thread Tom Lane
Julien Rouhaud  writes:
> I'm glad I could help! Thanks for creating the cf entry. Note however that
> the cfbot ignores files with a .txt extension (I don't think it's
> documented but it will mostly handle files with diff, patch, gz(ip), tar
> extensions IIRC, processing them as needed depending on the extension),

Documented in the cfbot FAQ:

https://wiki.postgresql.org/wiki/Cfbot#Which_attachments_are_considered_to_be_patches.3F

which admittedly is a page a lot of people don't know about.

regards, tom lane




Re: Improve join_search_one_level readibilty (one line change)

2023-06-06 Thread Julien Rouhaud
On Tue, 6 Jun 2023, 16:18 謝東霖,  wrote:

> Thank you to Julien Rouhaud and Tender Wang for the reviews.
>
> Julien's detailed guide has proven to be incredibly helpful, and I am
> truly grateful for it.
> Thank you so much for providing such valuable guidance!
>
> I have initiated a new commitfest:
> https://commitfest.postgresql.org/43/4346/
>
> Furthermore, I have attached a patch that improves the code by moving
> the initialization of "other_rels_list" outside the if branching.
>

I'm glad I could help! Thanks for creating the cf entry. Note however that
the cfbot ignores files with a .txt extension (I don't think it's
documented but it will mostly handle files with diff, patch, gz(ip), tar
extensions IIRC, processing them as needed depending on the extension), so
you should send v2 again with a supported extension, otherwise the cfbot
will keep testing your original patch.



Re: Assert failure of the cross-check for nullingrels

2023-06-06 Thread Richard Guo
On Wed, Jun 7, 2023 at 4:22 AM Tom Lane  wrote:

> Alvaro Herrera  writes:
> > So, is this done?  I see that you made other commits fixing related code
> > several days after this email, but none seems to match the changes you
> > posted in this patch; and also it's not clear to me that there's any
> > test case where this patch is expected to change behavior.  (So there's
> > also a question of whether this is a bug fix or rather some icing on
> > the cake.)


This issue is fixed at 991a3df22.


> Well, the bugs I was aware of ahead of PGCon are all fixed, but there
> are some new reports I still have to deal with.  I left the existing
> open issue open, but maybe it'd be better to close it and start a new
> one?


I went ahead and closed it, and then started two new open items for the
two new issues --- one is about assert failure and wrong query results
due to incorrectly removing PHVs, the other is about inconsistent
nulling bitmap in nestloop parameters.

Thanks
Richard


Re: Let's make PostgreSQL multi-threaded

2023-06-06 Thread Tom Lane
Thomas Munro  writes:
> ... My point is
> that we’re doing pretty unreasonable and inefficient contortions to
> develop new features -- we're not just happily chugging along without
> threads at no cost.

Sure, but it's not like chugging along *with* threads would be no-cost.
Others have already pointed out the permanent downsides of that, such
as loss of isolation between sessions leading to debugging headaches
(and, I predict, more than one security-grade bug).

I agree that if we were building this system from scratch today,
we'd probably choose thread-per-session not process-per-session.
But the costs of getting to that from where we are will be enormous.
I seriously doubt that the net benefits could justify that work,
no matter how long you want to look forward.  It's not really
significantly different from "let's rewrite the server in
C++/Rust/$latest_hotness".

regards, tom lane




Re: [PATCH] pg_regress.c: Fix "make check" on Mac OS X: Pass DYLD_LIBRARY_PATH

2023-06-06 Thread Julien Rouhaud
On Wed, Jun 7, 2023 at 5:44 AM Evan Jones  wrote:
>
> On Tue, Jun 6, 2023 at 5:23 PM Peter Eisentraut  wrote:
>>
>> This addresses only pg_regress.  What about all the other test suites?
>> Per the previous discussions, you'd need to patch up other places in a
>> similar way, potentially everywhere system() is called.
>
>
> Are there instructions for how I can run these other test suites? The 
> installation notes that Julien linked, and the Postgres wiki Developer FAQ 
> [1] only seem to mention "make check". I would be happy to try to fix other 
> tests on Mac OS X.

AFAIK there's no make rule that can really run everything.  You can
get most of it using make check-world (see
https://www.postgresql.org/docs/current/regress-run.html#id-1.6.20.5.5)
and making sure you added support for TAP tests (and probably also a
lot of optional dependencies) when running configure.  This won't run
everything but hopefully will hit most of the relevant infrastructure.




Re: [PATCH] pg_regress.c: Fix "make check" on Mac OS X: Pass DYLD_LIBRARY_PATH

2023-06-06 Thread Tom Lane
Evan Jones  writes:
> On Tue, Jun 6, 2023 at 5:23 PM Peter Eisentraut 
> wrote:
>> This addresses only pg_regress.  What about all the other test suites?

> Are there instructions for how I can run these other test suites?

configure with --enable-tap-tests, then do "make check-world".

Also, adding certain additional feature arguments such as
--with-python enables more test cases.  We aren't going to be
super excited about a patch that doesn't handle all of them.
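Concretely, the fuller run described here looks something like this (a sketch; the exact set of --with-* options depends on what you have installed):

```shell
# From a source checkout; additional --with-* feature flags enable more test cases
./configure --enable-tap-tests --with-python
make
make check-world    # runs most test suites, though not quite everything
```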

regards, tom lane




Re: TRAP: FailedAssertion("prev_first_lsn < cur_txn->first_lsn", File: "reorderbuffer.c", Line: 927, PID: 568639)

2023-06-06 Thread Tomas Vondra



On 6/6/23 17:42, Tomas Vondra wrote:
> 
> 
> On 6/6/23 14:00, Alexander Lakhin wrote:
>> Hello Tomas,
>>
>> 06.06.2023 12:56, Tomas Vondra wrote:
>>> On 6/6/23 11:00, Alexander Lakhin wrote:
 Hello,
 ...> With the debug logging added inside AssertTXNLsnOrder() I see:
 ctx->snapshot_builder->start_decoding_at: 209807224,
 ctx->reader->EndRecPtr: 210043072,
 SnapBuildXactNeedsSkip(ctx->snapshot_builder, ctx->reader->EndRecPtr): 0
 and inside the loop:
 cur_txn->first_lsn: 209792872
 cur_txn->first_lsn: 209975744
 cur_txn->first_lsn: 210043008
 cur_txn->first_lsn: 210043008
 and it triggers the Assert.

>>> So what's the prev_first_lsn value for these first_lsn values? How does
>>> it change over time? Did you try looking at the pg_waldump for these
>>> positions?
>>
>> With more logging I've got (for another run):
>> ReorderBufferTXNByXid| xid: 3397, lsn: c1fbc80
>>
>> ctx->snapshot_builder->start_decoding_at: c1f2cc0,
>> ctx->reader->EndRecPtr: c1fbcc0,
>> SnapBuildXactNeedsSkip(ctx->snapshot_builder, ctx->reader->EndRecPtr): 0
>> prev_first_lsn: 0, cur_txn->first_lsn: c1fbc80
>> prev_first_lsn: c1fbc80, cur_txn->first_lsn: c1fbc80
>> TRAP: failed Assert("prev_first_lsn < cur_txn->first_lsn") ...
>>
>> waldump for 0001000C shows:
>> grep c1fbc80:
>> rmgr: Heap2   len (rec/tot): 60/    60, tx:   3398, lsn:
>> 0/0C1FBC80, prev 0/0C1FBC50, desc: NEW_CID rel: 1663/18763/19987, tid:
>> 0/1, cmin: 1, cmax: 4294967295, combo: 4294967295
>> rmgr: Heap    len (rec/tot): 59/    59, tx:   3398, lsn:
>> 0/0C1FBCC0, prev 0/0C1FBC80, desc: INSERT+INIT off: 1, flags: 0x08,
>> blkref #0: rel 1663/18763/19987 blk 0
>>
>> grep '( 3397| 3398)'
> 
> I've been able to reproduce this, after messing with the script a little
> bit (I had to skip the test_decoding regression tests, because that was
> complaining about slots already existing etc).
> 
> Anyway, AssertTXNLsnOrder sees these two transactions (before aborting):
> 
>   26662 0/6462E6F0 (first 0/0)
>   26661 0/6462E6F0 (first 0/6462E6F0)
> 
> 
> where 26661 is the top xact, 26662 is a subxact of 26661. This is
> clearly a problem, because we really should not have subxact in this
> list once the assignment gets applied.
> 
> And the relevant WAL looks like this:
> 
> -
> 26662, lsn: 0/645EDAA0, prev 0/645EDA60, desc: ASSIGNMENT xtop 26661:
> subxacts: 26662
> 26662, lsn: 0/645EDAD0, prev 0/645EDAA0, desc: INSERT+INIT off: 1,
> flags: 0x08, blkref #0: rel 1663/125447/126835 blk 0
> ...
> 0, lsn: 0/6462E5D8, prev 0/6462E2A0, desc: RUNNING_XACTS nextXid
> 26673 latestCompletedXid 26672 oldestRunningXid 26661; 3 xacts: 26667
> 26661 26664; 3 subxacts: 26668 26662 26665
> ...
> 26662, lsn: 0/6462E6F0, prev 0/6462E678, desc: NEW_CID rel:
> 1663/125447/126841, tid: 0/1, cmin: 1, cmax: 4294967295, combo: 4294967295
> 26662, lsn: 0/6462E730, prev 0/6462E6F0, desc: INSERT+INIT off: 1,
> flags: 0x08, blkref #0: rel 1663/125447/126841 blk 0
> 26661, lsn: 0/6462E770, prev 0/6462E730, desc: COMMIT 2023-06-06
> 16:41:24.442870 CEST; subxacts: 26662
> -
> 
> so the assignment is the *first* thing that happens for these xacts.
> 
> However, we skip the assignment, because the log for this call of
> get_changes says this:
> 
>LOG:  logical decoding found consistent point at 0/6462E5D8
> 
> so we fail to realize the 26662 is a subxact.
> 
> Then when processing the NEW_CID, SnapBuildProcessNewCid chimes in and
> does this:
> 
>   ReorderBufferXidSetCatalogChanges(builder->reorder, xid, lsn);
> 
>   ReorderBufferAddNewTupleCids(builder->reorder, xlrec->top_xid, lsn,
>  xlrec->target_locator, xlrec->target_tid,
>  xlrec->cmin, xlrec->cmax,
>  xlrec->combocid);
> 
> and ReorderBufferAddNewTupleCids() proceeds to enter an entry for the
> passed XID (which is xlrec->top_xid, 26661), but with LSN of the WAL
> record. But ReorderBufferXidSetCatalogChanges() already did the same
> thing for the subxact 26662, as it has no idea it's a subxact (due to
> the skipped assignment).
> 
> I haven't figured out what exactly is happening / what it should be
> doing instead. But it seems wrong to skip the assignment - I wonder if
> SnapBuildProcessRunningXacts might be doing that too eagerly.
> 
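(To make the ordering problem concrete, here is a toy model of the above — not the real reorderbuffer code; the xids and LSNs are from the waldump output, everything else is invented for illustration:)

```python
# Simplified WAL stream from the waldump output above: the ASSIGNMENT for
# subxact 26662 lies before the point where decoding becomes consistent.
wal = [
    ("ASSIGNMENT",    0x645EDAA0, {"top": 26661, "sub": 26662}),
    ("RUNNING_XACTS", 0x6462E5D8, {}),   # consistent point found here
    ("NEW_CID",       0x6462E6F0, {"xid": 26662, "top_xid": 26661}),
]

def decode(start_lsn):
    """Return True if the AssertTXNLsnOrder-style check would pass."""
    subxact_of = {}            # sub xid -> top xid, filled by ASSIGNMENT
    toplevel_first_lsn = {}    # xid -> first_lsn of txns treated as top-level
    for kind, lsn, rec in wal:
        if lsn < start_lsn:
            continue           # records before the start point are never seen
        if kind == "ASSIGNMENT":
            subxact_of[rec["sub"]] = rec["top"]
        elif kind == "NEW_CID":
            if rec["xid"] not in subxact_of:
                # assignment was missed: the subxact gets its own entry
                toplevel_first_lsn.setdefault(rec["xid"], lsn)
            toplevel_first_lsn.setdefault(rec["top_xid"], lsn)
    # the assertion requires strictly increasing first_lsn across top-level txns
    lsns = sorted(toplevel_first_lsn.values())
    return all(a < b for a, b in zip(lsns, lsns[1:]))
```

Starting from LSN 0 the assignment is seen and the check passes; starting at the consistent point 0/6462E5D8 both 26661 and 26662 end up "top-level" with the same first_lsn, which is exactly the prev_first_lsn < cur_txn->first_lsn failure.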

I investigated this a bit more, and the problem actually seems to be
more like this:

1) we create a new logical replication slot

2) while building the initial snapshot, we start with current insert
location, and then process records

3) for RUNNING_XACTS we call SnapBuildProcessRunningXacts, which calls
SnapBuildFindSnapshot

4) SnapBuildFindSnapshot does this:

else if (!builder->building_full_snapshot &&
 SnapBuildRestore(builder, lsn))
{
/* there won't be any state to cleanup */
return false;

doc patch: note AttributeRelationId passed to FDW validator function

2023-06-06 Thread Ian Lawrence Barwick
Hi

Here:

  https://www.postgresql.org/docs/current/fdw-functions.html

the enumeration of OIDs which might be passed as the FDW validator function's
second argument omits "AttributeRelationId" (as passed when altering
a foreign table's column options).

The attached v1 patch adds this to the list of OIDs.

The alternative v2 patch adds this to the list of OIDs, and also
formats it as an SGML list, which IMHO is easier to read.

Looks like this has been missing since 9.3.


Regards

Ian Barwick
commit 8a9a0c6e087c9a83beb7e1af0d7dcb7f494eed36
Author: Ian Barwick 
Date:   Wed Jun 7 08:51:29 2023 +0900

doc: update FDW validator function description

diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml
index ac1717bc3c..6dc3755dd0 100644
--- a/doc/src/sgml/fdwhandler.sgml
+++ b/doc/src/sgml/fdwhandler.sgml
@@ -67,13 +67,16 @@
  foreign tables using the wrapper.
  The validator function must be registered as taking two arguments, a
  text array containing the options to be validated, and an OID
- representing the type of object the options are associated with (in
- the form of the OID of the system catalog the object would be stored
- in, either
- ForeignDataWrapperRelationId,
- ForeignServerRelationId,
- UserMappingRelationId,
- or ForeignTableRelationId).
+ representing the type of object the options are associated with. The
+ latter corresponds to the OID of the system catalog the object
+ would be stored in, one of:
+ 
+  ForeignDataWrapperRelationId
+  ForeignServerRelationId
+  UserMappingRelationId
+  ForeignTableRelationId
+  AttributeRelationId
+ 
  If no validator function is supplied, options are not checked at object
  creation time or object alteration time.
 
commit 4a789bb5d5f9b9cd87a5d9b3569e13c2ebc95f0d
Author: Ian Barwick 
Date:   Wed Jun 7 08:12:57 2023 +0900

doc: note AttributeRelationId for FDW validator function

diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml
index ac1717bc3c..d33f5c4fc8 100644
--- a/doc/src/sgml/fdwhandler.sgml
+++ b/doc/src/sgml/fdwhandler.sgml
@@ -73,7 +73,8 @@
  ForeignDataWrapperRelationId,
  ForeignServerRelationId,
  UserMappingRelationId,
- or ForeignTableRelationId).
+ ForeignTableRelationId,
+ or AttributeRelationId).
  If no validator function is supplied, options are not checked at object
  creation time or object alteration time.
 


Re: Improving FTS for Greek

2023-06-06 Thread Florents Tselai


> On 7 Jun 2023, at 12:13 AM, Peter Eisentraut  wrote:
> 
> On 03.06.23 19:47, Florents Tselai wrote:
>> There’s another previous relevant patch [0] but was never merged. I’ve 
>> included these stop words and added some more (info in README.md).
>> For my personal projects, it looks like it yields much better results.
>> I’d like some feedback on the extension ; particularly on the installation 
>> infra (I’m not sure I’ve handled properly the permissions in the .sql files)
>> I’ll then try to make a .patch for this.
> 
> The open question at the previous attempt was that it wasn't clear what the 
> upstream source or long-term maintenance of the stop words list would be.  If 
> it's just a personally composed list, then it's okay if you use it yourself, 
> but for including it into PostgreSQL it ought to come from a reputable 
> non-individual source like snowball.

I’ve used the NLTK list [0] as my base of stopwords; wouldn’t this be
considered reputable enough?

0 
https://github.com/nltk/nltk_data/blob/gh-pages/packages/corpora/stopwords.zip 
(see greek.stop file in the archive)

> 



Re: Let's make PostgreSQL multi-threaded

2023-06-06 Thread Thomas Munro
On Tue, Jun 6, 2023 at 6:52 AM Andrew Dunstan  wrote:
> If we were starting out today we would probably choose a threaded 
> implementation. But moving to threaded now seems to me like a 
> multi-year-multi-person project with the prospect of years to come chasing 
> bugs and the prospect of fairly modest advantages. The risk to reward doesn't 
> look great.
>
> That's my initial reaction. I could be convinced otherwise.

Here is one thing I often think about when contemplating threads.
Take a look at dsa.c.  It calls itself a shared memory allocator, but
really it has two jobs, the second being to provide software emulation
of virtual memory.  That’s behind dshash.c and now the stats system,
and various parts of the parallel executor code.  It’s slow and
complicated, and far from the state of the art.  I wrote that code
(building on allocator code from Robert) with the expectation that it
was a transitional solution to unblock a bunch of projects.  I always
expected that we'd eventually be deleting it.  When I explain that
subsystem to people who are not steeped in the lore of PostgreSQL, it
sounds completely absurd.  I mean, ... it is, right?  My point is
that we’re doing pretty unreasonable and inefficient contortions to
develop new features -- we're not just happily chugging along without
threads at no cost.
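For readers unfamiliar with it, the "software emulation of virtual memory" amounts to roughly this kind of indirection (a deliberately naive sketch with invented names — nothing like dsa.c's real API, layout, or performance):

```python
SEG_BITS = 32
OFF_MASK = (1 << SEG_BITS) - 1

class ToyDSA:
    """Hand out (segment, offset) handles instead of raw pointers; every
    access pays a translation step that native pointers get for free."""

    def __init__(self):
        self.segments = []                 # stand-ins for shared memory segments

    def create_segment(self, size):
        self.segments.append(bytearray(size))
        return len(self.segments) - 1

    def handle(self, seg, off):
        return (seg << SEG_BITS) | off     # a relative, pointer-like address

    def write(self, h, data):
        seg, off = h >> SEG_BITS, h & OFF_MASK
        self.segments[seg][off:off + len(data)] = data

    def read(self, h, n):
        seg, off = h >> SEG_BITS, h & OFF_MASK
        return bytes(self.segments[seg][off:off + n])
```

In the real thing the translation also has to cope with segments being mapped at different addresses in different processes, which is where much of the complexity comes from.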




Re: [PATCH] pg_regress.c: Fix "make check" on Mac OS X: Pass DYLD_LIBRARY_PATH

2023-06-06 Thread Evan Jones
On Tue, Jun 6, 2023 at 5:23 PM Peter Eisentraut 
wrote:

> This addresses only pg_regress.  What about all the other test suites?
> Per the previous discussions, you'd need to patch up other places in a
> similar way, potentially everywhere system() is called.
>

Are there instructions for how I can run these other test suites? The
installation notes that Julien linked, and the Postgres wiki Developer FAQ
[1] only seem to mention "make check". I would be happy to try to fix other
tests on Mac OS X.

Thanks!

[1]
https://wiki.postgresql.org/wiki/Developer_FAQ#How_do_I_test_my_changes.3F


Re: [PATCH] pg_regress.c: Fix "make check" on Mac OS X: Pass DYLD_LIBRARY_PATH

2023-06-06 Thread Peter Eisentraut

On 06.06.23 16:24, Evan Jones wrote:
On Mon, Jun 5, 2023 at 10:33 PM Tom Lane wrote:


 > Note that this is a known issue
Yeah.  We have attempted to work around this before, but failed to find
a solution without more downsides than upsides.  I will be interested
to look at this patch, but lack time for it right now.  Anybody else?


Ah, I didn't find that mention in the documentation when I was trying to 
get this working. Sorry about that!


My argument in favour of considering this patch is that making 
"./configure; make; make check" work on current major operating systems 
makes it easier for others to contribute in the future. I think the 
disadvantage of this patch is it makes pg_regress harder to understand, 
because it requires an #ifdef for this OS specific behaviour, and 
obscures the command lines of the child processes it spawns.


This addresses only pg_regress.  What about all the other test suites? 
Per the previous discussions, you'd need to patch up other places in a 
similar way, potentially everywhere system() is called.






Re: Improving FTS for Greek

2023-06-06 Thread Peter Eisentraut

On 03.06.23 19:47, Florents Tselai wrote:
There’s another previous relevant patch [0] but was never merged. I’ve 
included these stop words and added some more (info in README.md).


For my personal projects, it looks like it yields much better results.

I’d like some feedback on the extension ; particularly on the 
installation infra (I’m not sure I’ve handled properly the permissions 
in the .sql files)


I’ll then try to make a .patch for this.


The open question at the previous attempt was that it wasn't clear what 
the upstream source or long-term maintenance of the stop words list 
would be.  If it's just a personally composed list, then it's okay if 
you use it yourself, but for including it into PostgreSQL it ought to 
come from a reputable non-individual source like snowball.






Re: Order changes in PG16 since ICU introduction

2023-06-06 Thread Andrew Gierth
> "Joe" == Joe Conway  writes:

 > On 6/6/23 15:55, Tom Lane wrote:
 >> Robert Haas  writes:
 >>> On Tue, Jun 6, 2023 at 3:25 PM Tom Lane  wrote:
  Also +1, except that I find "none" a rather confusing choice of name.
  There *is* a provider, it's just PG itself not either libc or ICU.
  I thought Joe's suggestion of "internal" made more sense.
 >> 
 >>> Or perhaps "builtin" or "postgresql".
 >> Either OK by me

 Joe> Same here

I like either "internal" or "builtin" because they correctly identify
that no external resources are used. I'm not keen on "postgresql".

-- 
Andrew.




Re: Assert failure of the cross-check for nullingrels

2023-06-06 Thread Tom Lane
Alvaro Herrera  writes:
> So, is this done?  I see that you made other commits fixing related code
> several days after this email, but none seems to match the changes you
> posted in this patch; and also it's not clear to me that there's any
> test case where this patch is expected to change behavior.  (So there's
> also a question of whether this is a bug fix or rather some icing on
> the cake.)

Well, the bugs I was aware of ahead of PGCon are all fixed, but there
are some new reports I still have to deal with.  I left the existing
open issue open, but maybe it'd be better to close it and start a new
one?

regards, tom lane




Re: Order changes in PG16 since ICU introduction

2023-06-06 Thread Tom Lane
"Jonathan S. Katz"  writes:
> Since we're bikeshedding, "postgresql" or "builtin" could make it seem 
> to a (app) developer that these may be recommended options, as we're 
> trusting PostgreSQL to make the best choices for us. Granted, v16 is 
> (theoretically) defaulting to ICU, so that choice is made, but the 
> unsuspecting developer could make a switch based on that naming.

I don't think this is a problem, as long as any locale name other
than C/POSIX fails when combined with that provider name.

regards, tom lane




Re: Let's make PostgreSQL multi-threaded

2023-06-06 Thread Greg Stark
On Mon, 5 Jun 2023 at 10:52, Heikki Linnakangas  wrote:
>
> I spoke with some folks at PGCon about making PostgreSQL multi-threaded,
> so that the whole server runs in a single process, with multiple
> threads. It has been discussed many times in the past, last thread on
> pgsql-hackers was back in 2017 when Konstantin made some experiments [0].
>
> I feel that there is now pretty strong consensus that it would be a good
> thing, more so than before. Lots of work to get there, and lots of
> details to be hashed out, but no objections to the idea at a high level.
>
> The purpose of this email is to make that silent consensus explicit. If
> you have objections to switching from the current multi-process
> architecture to a single-process, multi-threaded architecture, please
> speak up.

I suppose I should reiterate my comments that I gave at the time. I'm
not sure they qualify as "objections" but they're some kind of general
concern.

I think of processes and threads as fundamentally the same things,
just a slightly different API -- namely that in one memory is by
default unshared and needs to be explicitly shared and in the other
it's default shared and needs to be explicitly unshared. There are
obvious practical API differences too like how signals are handled but
those are just implementation details.

So the question is whether defaulting to shared memory or defaulting
to unshared memory is better -- and whether the implementation details
are significant enough to override that.

And my general concern was that in my experience default shared memory
leads to hugely complex and chaotic shared data structures with often
very loose rules for ownership of shared data and who is responsible
for making updates, handling errors, or releasing resources.

So all else equal I feel like having a good infrastructure for
explicitly allocating shared memory segments and managing them is
superior.
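(For illustration, the default-unshared vs. default-shared distinction can be shown in a few lines — a Unix-only Python sketch, not anything PostgreSQL-specific:)

```python
import mmap
import os
import struct

def demo():
    """With processes, ordinary memory is private after fork(); sharing
    must be set up explicitly (here via an anonymous shared mapping)."""
    shared = mmap.mmap(-1, 4)      # anonymous mapping, shared across fork on Unix
    private = {"counter": 0}       # plain memory: copy-on-write, not shared

    pid = os.fork()
    if pid == 0:                   # child process
        private["counter"] += 1            # parent never sees this
        shared[:4] = struct.pack("i", 42)  # parent does see this
        os._exit(0)
    os.waitpid(pid, 0)
    return private["counter"], struct.unpack("i", shared[:4])[0]
```

With threads the defaults flip: every plain variable would behave like the mapping, and privacy is what has to be arranged explicitly.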

However all else is not equal. The discussion in the hallway turned to
whether we could just use pthread primitives like mutexes and
condition variables instead of our own locks -- and the point was
raised that those libraries assume these objects will be in threads of
one process not shared across completely different processes.

And that's probably not the only library we're stuck reimplementing
because of this. So the question is are these things worth taking the
risk of having data structures shared implicitly and having unclear
ownership rules?

I was going to say supporting both modes relieves that fear since it
would force that extra discipline and allow testing under the more
restrictive rule. However I don't think that will actually work. As
long as we support both modes we lose all the advantages of threads.
We still wouldn't be able to use pthreads and would still need to
provide and maintain our homegrown replacement infrastructure.




-- 
greg




Re: Assert failure of the cross-check for nullingrels

2023-06-06 Thread Alvaro Herrera
On 2023-May-21, Tom Lane wrote:

> Since we're hard up against the beta1 wrap deadline, I went ahead
> and pushed the v5 patch.  I doubt that it's perfect yet, but it's
> a small change and demonstrably fixes the cases we know about.
> 
> As I said in the commit message, the main knock I'd lay on v5
> is "why not use required_relids all the time?".

So, is this done?  I see that you made other commits fixing related code
several days after this email, but none seems to match the changes you
posted in this patch; and also it's not clear to me that there's any
test case where this patch is expected to change behavior.  (So there's
also a question of whether this is a bug fix or rather some icing on
the cake.)

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/




Re: Order changes in PG16 since ICU introduction

2023-06-06 Thread Jonathan S. Katz

On 6/6/23 3:56 PM, Joe Conway wrote:

On 6/6/23 15:55, Tom Lane wrote:

Robert Haas  writes:

On Tue, Jun 6, 2023 at 3:25 PM Tom Lane  wrote:

Also +1, except that I find "none" a rather confusing choice of name.
There *is* a provider, it's just PG itself not either libc or ICU.
I thought Joe's suggestion of "internal" made more sense.



Or perhaps "builtin" or "postgresql".


Either OK by me


Same here


Since we're bikeshedding, "postgresql" or "builtin" could make it seem 
to a (app) developer that these may be recommended options, as we're 
trusting PostgreSQL to make the best choices for us. Granted, v16 is 
(theoretically) defaulting to ICU, so that choice is made, but the 
unsuspecting developer could make a switch based on that naming.


However, I don't have a strong alternative -- I understand the concern 
about "internal", so I'd be OK with "postgresql" unless a better name 
appears.


Jonathan





Re: Cleaning up nbtree after logical decoding on standby work

2023-06-06 Thread Alvaro Herrera
On 2023-Jun-05, Peter Geoghegan wrote:

> My current plan is to commit this later in the week, unless there are
> further objections. Wednesday or possibly Thursday.

I've added this as an open item for 16, with Peter and Andres as owners.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Las cosas son buenas o malas segun las hace nuestra opinión" (Lisias)




Re: Order changes in PG16 since ICU introduction

2023-06-06 Thread Joe Conway

On 6/6/23 15:55, Tom Lane wrote:

Robert Haas  writes:

On Tue, Jun 6, 2023 at 3:25 PM Tom Lane  wrote:

Also +1, except that I find "none" a rather confusing choice of name.
There *is* a provider, it's just PG itself not either libc or ICU.
I thought Joe's suggestion of "internal" made more sense.



Or perhaps "builtin" or "postgresql".


Either OK by me


Same here

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Order changes in PG16 since ICU introduction

2023-06-06 Thread Tom Lane
Robert Haas  writes:
> On Tue, Jun 6, 2023 at 3:25 PM Tom Lane  wrote:
>> Also +1, except that I find "none" a rather confusing choice of name.
>> There *is* a provider, it's just PG itself not either libc or ICU.
>> I thought Joe's suggestion of "internal" made more sense.

> Or perhaps "builtin" or "postgresql".

Either OK by me

regards, tom lane




Re: could not extend file "base/5/3501" with FileFallocate(): Interrupted system call

2023-06-06 Thread Alvaro Herrera
On 2023-Apr-24, Andres Freund wrote:

> A prototype of that approach is attached. I pushed the retry handling into the
> pg_* routines where applicable.  I guess we could add pg_* routines for
> FileFallocate(), FilePrewarm() etc as well, but I didn't do that here.
> 
> Christoph, could you verify this fixes your issue?

So, is anyone making progress on this?  I don't see anything in the
thread.

On adding the missing pg_* wrappers: I think if we don't (and we leave
the retry loops at the File* layer), then the risk is that some external
code would add calls to the underlying File* routines trusting them to
do the retrying, which would then become broken when we move the retry
loops to the pg_* wrappers when we add them.  That doesn't seem terribly
serious to me.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"El Maquinismo fue proscrito so pena de cosquilleo hasta la muerte"
(Ijon Tichy en Viajes, Stanislaw Lem)




Re: Order changes in PG16 since ICU introduction

2023-06-06 Thread Robert Haas
On Tue, Jun 6, 2023 at 3:25 PM Tom Lane  wrote:
> Joe Conway  writes:
> > On 6/6/23 15:18, Jeff Davis wrote:
> >> The locale "C" is a special case, documented as a non-locale. So, if
> >> LOCALE/--locale apply to ICU, then either ICU needs to handle locale
> >> "C" in the expected way (v8 patch series); or when we see locale "C" we
> >> need to somehow change the provider into something that can handle it
> >> (v6 patch series changes it to the "none" provider).
>
> > +1 to the latter approach
>
> Also +1, except that I find "none" a rather confusing choice of name.
> There *is* a provider, it's just PG itself not either libc or ICU.
> I thought Joe's suggestion of "internal" made more sense.

Or perhaps "builtin" or "postgresql".

I'm just thinking that "internal" as a type name kind of means "you
shouldn't be touching this from SQL" and we don't want to give people
the idea that the "C" locale isn't something you should use.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: confusion about this commit "Revert "Skip redundant anti-wraparound vacuums""

2023-06-06 Thread Robert Haas
On Mon, Jun 5, 2023 at 1:50 AM jiye  wrote:

> we cannot get a determinate test case, as this issue reproduced only once,
> and currently autovacuum works, since we ran vacuum freeze on each
> table of each database.
>
> our client's application is a real online banking business with several
> customer databases that do mostly update operations, which as a result drove
> some tables' dead_tup_ratio to nearly 100%, but we could not find any autovacuum
> process working for a very long time before we ran vacuum freeze manually.
>

I tend to doubt that this is caused by the commit you're blaming, because
that commit purports to skip autovacuum operations only if some other
vacuum has already done the work. Here you are saying that you see no
autovacuum tasks at all.

The screenshot that you posted of XID ages exceeding 200 million is not
evidence of a problem. It's pretty normal for some table XID ages
to temporarily exceed autovacuum_freeze_max_age, especially if you have a
lot of tables with about the same XID age, as seems to be the case here.
When a table's XID age reaches autovacuum_freeze_max_age, the system will
start trying harder to reduce the XID age, but that process isn't
instantaneous.

On the other hand, your statement that you have very high numbers of dead
tuples *is* evidence of a problem. It's very likely caused by vacuum not
running aggressively enough. Remember that autovacuum is limited by the
number of workers (autovacuum_max_workers) but even more importantly by the
cost delay system. It's *extremely* common to need to raise
vacuum_cost_limit on large or busy database systems, often by large
multiples (e.g. 10x or more).

I'd strongly suggest that you carefully monitor how many autovacuum
processes are running and what they are doing. If I were a betting man, I'd
bet that you'd find that in the situation where you had this problem, the
number of running processes was always 3 -- which is the configured maximum
-- and if you looked at the wait event in pg_stat_activity I bet you would
see VacuumDelay showing up a lot. If so, raise vacuum_cost_limit
considerably and over time the problem should get better. It won't be
instantaneous.
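The monitoring and tuning described above boils down to a couple of statements; a minimal sketch (the value 2000 is only an illustrative starting point, not a recommendation for any particular system):

```sql
-- How many autovacuum workers are running, and what are they waiting on?
SELECT pid, wait_event_type, wait_event, query
FROM pg_stat_activity
WHERE backend_type = 'autovacuum worker';

-- If workers are pinned at autovacuum_max_workers and the wait event is
-- frequently VacuumDelay, raise the cost limit and reload:
ALTER SYSTEM SET vacuum_cost_limit = 2000;
SELECT pg_reload_conf();
```

Running the first query repeatedly (or via a monitoring system) over a busy period gives a much clearer picture than a single snapshot.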

Or maybe I'm wrong and you'd see something else, but whatever you did see
would probably give a hint as to what the problem here is.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


Re: pg_collation.collversion for C.UTF-8

2023-06-06 Thread Joe Conway

On 6/6/23 15:23, Jeff Davis wrote:

On Mon, 2023-06-05 at 19:43 +0200, Daniel Verite wrote:

But in the meantime, personally I don't quite see why Postgres should
start forcing C.UTF-8 to sort differently in the database than in the
OS.


I can see both points of view. It could be surprising to users if
C.UTF-8 does not sort like C/memcmp, or surprising if it changes out
from under them. It could also be surprising that it wouldn't sort like
the current OS's libc interpretation of C.UTF-8.

What about ICU? How should provider=icu locale=C.UTF-8 behave? We
could:

a. Just pass it to the provider and see what happens (older versions of
ICU would interpret it as en-US-u-va-posix; newer versions would give
the root locale).

b. Consistently interpret it as en-US-u-va-posix.

c. Don't pass it to the provider at all and treat it with memcmp
semantics.


Personally I think this should be (a). However we should also clearly 
document that the semantics of such is provider/OS dependent and 
therefore may not be what is expected/desired.
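For anyone wanting to see which interpretation their own build gives, a quick probe (this assumes an ICU-enabled build and that the OS libc actually provides a C.UTF-8 locale; the divergence it exposes is exactly the provider/OS-dependent part discussed above):

```sql
-- libc interpretation of C.UTF-8 (whatever the OS provides):
CREATE COLLATION c_utf8_libc (provider = libc, locale = 'C.UTF-8');
-- ICU interpretation (en-US-u-va-posix or the root locale, depending on ICU version):
CREATE COLLATION c_utf8_icu (provider = icu, locale = 'C.UTF-8');

-- Under memcmp/POSIX semantics 'B' < 'a'; under linguistic root-locale
-- semantics 'a' < 'B'.
SELECT 'B' < 'a' COLLATE c_utf8_libc AS libc_says,
       'B' < 'a' COLLATE c_utf8_icu  AS icu_says;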


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Order changes in PG16 since ICU introduction

2023-06-06 Thread Tom Lane
Joe Conway  writes:
> On 6/6/23 15:18, Jeff Davis wrote:
>> The locale "C" is a special case, documented as a non-locale. So, if
>> LOCALE/--locale apply to ICU, then either ICU needs to handle locale
>> "C" in the expected way (v8 patch series); or when we see locale "C" we
>> need to somehow change the provider into something that can handle it
>> (v6 patch series changes it to the "none" provider).

> +1 to the latter approach

Also +1, except that I find "none" a rather confusing choice of name.
There *is* a provider, it's just PG itself not either libc or ICU.
I thought Joe's suggestion of "internal" made more sense.

regards, tom lane




Re: pg_collation.collversion for C.UTF-8

2023-06-06 Thread Jeff Davis
On Mon, 2023-06-05 at 19:43 +0200, Daniel Verite wrote:
> But in the meantime, personally I don't quite see why Postgres should
> start forcing C.UTF-8 to sort differently in the database than in the
> OS.

I can see both points of view. It could be surprising to users if
C.UTF-8 does not sort like C/memcmp, or surprising if it changes out
from under them. It could also be surprising that it wouldn't sort like
the current OS's libc interpretation of C.UTF-8.

What about ICU? How should provider=icu locale=C.UTF-8 behave? We
could:

a. Just pass it to the provider and see what happens (older versions of
ICU would interpret it as en-US-u-va-posix; newer versions would give
the root locale).

b. Consistently interpret it as en-US-u-va-posix.

c. Don't pass it to the provider at all and treat it with memcmp
semantics.

Regards,
Jeff Davis





Re: Order changes in PG16 since ICU introduction

2023-06-06 Thread Joe Conway

On 6/6/23 15:18, Jeff Davis wrote:

On Tue, 2023-06-06 at 15:09 +0200, Daniel Verite wrote:

FWIW I don't quite see how 0001 improve things or what problem it's
trying to solve.


The word "locale" is generic, so we need to make LOCALE/--locale apply
to whatever provider is being used. If "locale" only applies to libc,
using ICU will always be confusing and never be on the same level as
libc, let alone the preferred provider.



Agree 100%


The locale "C" is a special case, documented as a non-locale. So, if
LOCALE/--locale apply to ICU, then either ICU needs to handle locale
"C" in the expected way (v8 patch series); or when we see locale "C" we
need to somehow change the provider into something that can handle it
(v6 patch series changes it to the "none" provider).


+1 to the latter approach


Please let me know if you disagree with the goal or the reasoning here.
If so, please explain where you think we should end up, because the
status quo does not seem great to me.


also +1


0001 creates exceptions throughout the code so that when an ICU
collation has a locale name "C" or "POSIX" then it does not behave
like an ICU collation, even though pg_collation.collprovider='i'
To me it's neither desirable nor necessary that a collation that
has collprovider='i' is diverted to non-ICU semantics.


It's not very principled, but it matches what libc does.


Makes sense to me

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Order changes in PG16 since ICU introduction

2023-06-06 Thread Joe Conway

On 6/6/23 15:15, Jeff Davis wrote:

On Tue, 2023-06-06 at 14:11 -0400, Joe Conway wrote:

This discussion makes me wonder (though probably too late for the v16
cycle) if we shouldn't treat "C" and "POSIX" locales as a third 
provider, something like "internal".


That's exactly what I did in v6 of this series: I created a "none"
provider, and when someone specified provider=icu iculocale=C, it would
change the provider to "none":

https://www.postgresql.org/message-id/5f9bf4a0b040428c5db2dc1f23cc3ad96acb5672.camel%40j-davis.com

I'm fine with either approach.


Ha!

Well it seems like I am +1 on that then ;-)

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Order changes in PG16 since ICU introduction

2023-06-06 Thread Jeff Davis
On Tue, 2023-06-06 at 14:11 -0400, Joe Conway wrote:
> This discussion makes me wonder (though probably too late for the v16
> cycle) if we shouldn't treat "C" and "POSIX" locales as a third 
> provider, something like "internal".

That's exactly what I did in v6 of this series: I created a "none"
provider, and when someone specified provider=icu iculocale=C, it would
change the provider to "none":

https://www.postgresql.org/message-id/5f9bf4a0b040428c5db2dc1f23cc3ad96acb5672.camel%40j-davis.com

I'm fine with either approach.

Regards,
Jeff Davis





Re: Mark a transaction uncommittable

2023-06-06 Thread Cary Huang
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:not tested

Hello

It is one of those features that a handful of people would find useful in 
specific use cases. I think it is a nice-to-have feature that safeguards your 
production database against unwanted commits when troubleshooting production 
problems. Your patch applies fine on master; I was able to run a couple of tests 
on it, and it behaves as described. I noticed that the patch adds a 
per-session variable "default_transaction_committable" that can make all 
transactions committable or uncommittable in the session even without specifying 
"begin transaction not committable;". I wonder whether we should have a 
configurable default at all; I think it should always default to true and be 
unchangeable. If a user wants an uncommittable transaction, he/she should need to 
explicitly request it during "begin". Having another option that changes the default 
behavior of all transactions seems a little unsafe: it is possible someone 
could purposely set this default to false in a production session that needs 
transactions to absolutely commit, causing damage there. 
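For reference, the behavior I tested looks roughly like this (the syntax is from the patch under review, not from any released PostgreSQL, so names may change; the accounts table is just an example):

```sql
BEGIN TRANSACTION NOT COMMITTABLE;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;  -- experiment freely
COMMIT;  -- per the patch, this cannot actually commit; the effects are rolled back

-- The per-session GUC I am questioning above:
SET default_transaction_committable = off;  -- now every BEGIN is uncommittable
```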

thank you
Cary Huang
--
Highgo Software Canada
www.highgo.ca

Re: Let's make PostgreSQL multi-threaded

2023-06-06 Thread Kirk Wolak
On Tue, Jun 6, 2023 at 2:00 PM Robert Haas  wrote:

>
> I'm also not quite convinced that there's no long-term use case for
> multi-process mode. Maybe you're right and there isn't, but that
> amounts to arguing that every extension in the world will be happy to
> run in a multi-threaded world rather than not. I don't know if I quite
> believe that. It also amounts to arguing that performance is going to
> be better for everyone in this new multi-threaded mode, and that it
> won't cause unforeseen problems for any significant numbers of users,
> and maybe those things are true, but I think we need to get this new
> system in place and get some real-world experience before we can judge
> these kinds of things. I agree that, in theory, it would be nice to
> get to a place where the multi-process mode is a dinosaur and that we
> can just rip it out ... but I don't share your confidence that we can
> get there in any short time period.
>

First, I am enjoying the activity of this thread.  But my first question is
"to what end"?
Do I consider threads better? (yes... and no)

I do wonder if we could add better threading within any given
session/process to get a hybrid?
[maybe this gets us closer to solving some of the problems incrementally?]

If I could have anything (today)... I would prefer a Master-Master
Implementation leveraging some
of the ultra-fast server-server communication protocols to help sync
things.  Then I wouldn't care.
I could avoid the O/S  Overwhelm caused by excessive processes, via
spinning up machines.
[Unfortunately I know that PG leverages the filesystem cache, etc to such a
degree that communicating
from one master to another would require a really special architecture
there.  And the N! communication lines].

Kirk...


Re: Let's make PostgreSQL multi-threaded

2023-06-06 Thread Robert Haas
On Tue, Jun 6, 2023 at 2:51 PM Kirk Wolak  wrote:
> I do wonder if we could add better threading within any given session/process 
> to get a hybrid?
> [maybe this gets us closer to solving some of the problems incrementally?]

I don't think it helps much -- if anything, I think that would be more
complicated.

> If I could have anything (today)... I would prefer a Master-Master 
> Implementation leveraging some
> of the ultra-fast server-server communication protocols to help sync things.  
> Then I wouldn't care.
> I could avoid the O/S  Overwhelm caused by excessive processes, via spinning 
> up machines.
> [Unfortunately I know that PG leverages the filesystem cache, etc to such a 
> degree that communicating
> from one master to another would require a really special architecture there. 
>  And the N! communication lines].

I think there's plenty of interesting things to improve in this area,
but they're different things than what this thread is about.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Order changes in PG16 since ICU introduction

2023-06-06 Thread Joe Conway

On 6/6/23 09:09, Daniel Verite wrote:

Jeff Davis wrote:

New patch series attached. I plan to commit 0001 and 0002 soon, unless
there are objections.

0001 causes the "C" and "POSIX" locales to be treated with
memcmp/pg_ascii semantics in ICU, just like in libc. We also
considered a new "none" provider, but it's more invasive, and we can
always reconsider that in the v17 cycle.



0001 creates exceptions throughout the code so that when an ICU
collation has a locale name "C" or "POSIX" then it does not behave
like an ICU collation, even though pg_collation.collprovider='i'
To me it's neither desirable nor necessary that a collation that
has collprovider='i' is diverted to non-ICU semantics.


This discussion makes me wonder (though probably too late for the v16 
cycle) if we shouldn't treat "C" and "POSIX" locales as a third 
provider, something like "internal".


--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com





Re: Let's make PostgreSQL multi-threaded

2023-06-06 Thread Robert Haas
On Tue, Jun 6, 2023 at 11:46 AM Heikki Linnakangas  wrote:
> Bruce was worried about the loss of isolation that the separate address
> spaces gives, and Jeremy shared an anecdote on that. That is an
> objection to the idea itself, i.e. even if transition was smooth,
> bug-free and effortless, that point remains. I personally think the
> isolation we get from separate address spaces is overrated. Yes, it
> gives you some protection, but given how much shared memory there is,
> the blast radius is large even with separate backend processes.

An interesting idea might be to look at the places where we ereport or
elog FATAL due to some kind of backend data structure corruption and
ask whether there would be an argument for elevating the level to
PANIC if we changed this. There are definitely some places where we
argue that the only corrupted state is backend-local and thus we don't
need to PANIC if it's corrupted. I wonder to what extent this change
would undermine that argument.

Even if it does, I think it's worth it. Corrupted backend-local data
structures aren't that common, thankfully.

> I don't think this is worth it, unless we plan to eventually remove the
> multi-process mode. We could e.g. make lock table expandable in threaded
> mode, and fixed-size in process mode, but the big gains would come from
> being able to share things between threads and have variable-length
> shared data structures more easily. As long as you need to also support
> processes, you need to code to the lowest common denominator and don't
> really get the benefits.
>
> I don't know how long a transition period we need. Maybe 1 release, maybe 5.

I think 1 release is wildly optimistic. Even if someone wrote a patch
for this and got it committed this release cycle, it's likely that
there would be follow-up commits needed over a period of several years
before it really worked as well as we'd like. Only after that could we
consider deprecating the per-process way. But I don't think that's
necessarily a huge problem. I originally intended DSM as an optional
feature: if you didn't have it, then you couldn't use features that
depended on it, but the rest of the system still worked. Eventually,
other people liked it enough that we decided to introduce hard
dependencies on it. I think that's a good model for a change like
this. When the inventor of a new system thinks that we should have a
hard dependency on it, MEH. When there's a groundswell of other,
unaffiliated hackers making that argument, COOL.

I'm also not quite convinced that there's no long-term use case for
multi-process mode. Maybe you're right and there isn't, but that
amounts to arguing that every extension in the world will be happy to
run in a multi-threaded world rather than not. I don't know if I quite
believe that. It also amounts to arguing that performance is going to
be better for everyone in this new multi-threaded mode, and that it
won't cause unforeseen problems for any significant numbers of users,
and maybe those things are true, but I think we need to get this new
system in place and get some real-world experience before we can judge
these kinds of things. I agree that, in theory, it would be nice to
get to a place where the multi-process mode is a dinosaur and that we
can just rip it out ... but I don't share your confidence that we can
get there in any short time period.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Let's make PostgreSQL multi-threaded

2023-06-06 Thread Konstantin Knizhnik




On 06.06.2023 5:13 PM, Robert Haas wrote:

On Tue, Jun 6, 2023 at 9:40 AM Robert Haas  wrote:

I'm not sure that there's a strong consensus, but I do think it's a good idea.

Let me elaborate on this a bit.



Not all databases have this problem, and PostgreSQL isn't going to be
able to stop having it without some kind of major architectural
change. Changing from a process model to a threaded model might be
insufficient, because while I think that threads consume fewer OS
resources than processes, what is really needed, in all likelihood, is
the ability to have idle connections have neither a process nor a
thread associated with them until they cease being idle. That's a huge
project and I'm not volunteering to do it, but if we want to have the
same kind of scalability as some competing products, that is probably
a place to which we ultimately need to go. Getting out of the current
model where every backend has an arbitrarily large amount of state
hanging off of random global variables, not all of which are even
known to any central system, is a critical step in that journey.


It looks like a built-in connection pooler, doesn't it?
Actually, a built-in connection pooler has a lot of things in common with 
multithreaded Postgres.

It also needs to keep session context.
The main difference is that there is no need to place all Postgres 
global/static variables there, because the lifetime of most of them is shorter 
than a transaction. So it is really enough to place all such variables in a 
single struct.

This is how the built-in connection pooler was implemented in PgPro.

Reading all the concerns against multithreading Postgres makes me think 
that it may be reasonable to combine the two approaches:
still have processes (backends) but be able to spawn multiple threads 
inside a process (for example for parallel query execution).
It could be argued that such an approach only increases the complexity of the 
implementation and combines the drawbacks of both approaches.

But actually such an approach allows us to:
1. Support old (external, non-reentrant) extensions - they will be 
executed by dedicated backends.

2. Simplify parallel query execution and make it more efficient.
3. Most efficiently use multithreaded PLs (like JVM-based ones). As 
long as there is no single VM for all connections, but only one per some 
group of them (for example, those belonging to one user), most complaints 
about sharing a VM between different connections can be avoided.

4. Avoid or minimize problems with OOM and memory fragmentation.
5. Combine with a connection pooler (save inactive connection state 
without having a process or thread for it).










Re: Add support for AT LOCAL

2023-06-06 Thread Laurenz Albe
On Tue, 2023-06-06 at 04:24 -0400, Vik Fearing wrote:
> > At a quick glance, it looks like you resolve "timezone" at the time
> > the query is parsed.  Shouldn't the resolution happen at query
> > execution time?
> 
> current_setting(text) is stable, and my tests show that it is calculated 
> at execution time.

Ah, ok, then sorry for the noise.  I misread the code then.

Yours,
Laurenz Albe




Re: Return value of pg_promote()

2023-06-06 Thread Laurenz Albe
On Tue, 2023-06-06 at 16:35 +0530, Ashutosh Sharma wrote:
> At present, pg_promote() returns true to the caller on successful
> promotion of standby, however it returns false in multiple scenarios
> which includes:
> 
> 1) The SIGUSR1 signal could not be sent to the postmaster process.
> 2) The postmaster died during standby promotion.
> 3) Standby couldn't be promoted within the specified wait time.
> 
> For an application calling this function, if pg_promote returns false,
> it is hard to interpret the reason behind it. So I think we should
> *only* allow pg_promote to return false when the server could not be
> promoted in the given wait time and in other scenarios it should just
> throw an error (FATAL, ERROR ... depending on the type of failure that
> occurred). Please let me know your thoughts on this change. thanks.!

As the original author, I'd say that that sounds reasonable, particularly
in case #1.  If the postmaster dies, we are going to die too, so it
probably doesn't matter much.  But I think an error is certainly also
correct in that case.
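For context, the calling convention in question (the signature as documented in current releases):

```sql
-- Wait up to 30 seconds for promotion to complete:
SELECT pg_promote(wait => true, wait_seconds => 30);
-- Today, a false return can mean any of the three failure modes listed
-- above; under the proposal, false would mean only "timed out".
```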

Yours,
Laurenz Albe




Re: Let's make PostgreSQL multi-threaded

2023-06-06 Thread chap

On 2023-06-06 12:24, Heikki Linnakangas wrote:

I'm afraid having multiple processes and JVMs doesn't help that.
If you can escape the one JVM in one backend process, it's game over.


So there's escape and there's escape, right? Java still prioritizes
(and has, in fact, strengthened) barriers against breaking module
encapsulation, or getting access to arbitrary native memory or code.

The features that have been deprecated, to eventually go away, are
the ones that offer fine-grained control over operations that there
are Java APIs for. Eventually it won't be as easy as it is now to say
"ok, your function gets to open these files or these sockets but
not those ones."

Even for those things, there may yet be solutions. There are Java
APIs for virtualizing the view of the file system, for example. It's
yet to be seen how things will shake out. Configuration may get
trickier, and there may be some incentive to include, say,
sepgsql in the picture.

Sure, even access to a file API can be game over, depending on
what file you open, but that's already the risk for every PL
with an 'untrusted' flavor.

Regards,
-Chap




Re: abi-compliance-checker

2023-06-06 Thread Tristan Partin
+abidiff = find_program('abidiff', native: false, required: false)
+abidw = find_program('abidw', native: false, required: false)
+
+abidw_flags = [
+  '--drop-undefined-syms',
+  '--no-architecture',
+  '--no-comp-dir-path',
+  '--no-elf-needed',
+  '--no-show-locs',
+  '--type-id-style', 'hash',
+]
+abidw_cmd = [abidw, abidw_flags, '--out-file', '@OUTPUT@', '@INPUT@']

It would make sense to me to mark abidiff and abidw as disabler: true.

+if abidw.found()
+  libpq_abi = custom_target('libpq.abi.xml',
+input: libpq_so,
+output: 'libpq.abi.xml',
+command: abidw_cmd,
+build_by_default: true)
+endif
+
+if abidiff.found()
+  test('libpq.abidiff',
+   abidiff,
+   args: [files('libpq.base.abi.xml'), libpq_abi],
+   suite: 'abidiff',
+   verbose: true)
+endif

With disabler: true, you can drop the conditionals. Disablers tell Meson
to disable parts of the build[0].

I also don't think it makes sense to mark the custom_targets as
build_by_default: true, unless you see value in that. I would just have
them built when the test is run.

However, it might make sense to create an alias_target of all the ABI
XML files for people that want to interact with the files outside of the
tests for whatever reason.

[0]: https://mesonbuild.com/Reference-manual_returned_disabler.html
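Concretely, the version with disablers might look like this (a sketch based on the Meson docs, untested here):

```meson
abidiff = find_program('abidiff', native: false, required: false, disabler: true)
abidw = find_program('abidw', native: false, required: false, disabler: true)

# With disablers, a missing program disables everything that consumes it,
# so the custom_target and test need no explicit found() guards.
libpq_abi = custom_target('libpq.abi.xml',
  input: libpq_so,
  output: 'libpq.abi.xml',
  command: [abidw, abidw_flags, '--out-file', '@OUTPUT@', '@INPUT@'])

test('libpq.abidiff',
     abidiff,
     args: [files('libpq.base.abi.xml'), libpq_abi],
     suite: 'abidiff',
     verbose: true)
```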

-- 
Tristan Partin
Neon (https://neon.tech)




Re: ERROR: could not determine which collation to use for string comparison

2023-06-06 Thread Laurenz Albe
On Tue, 2023-06-06 at 11:42 -0300, Rômulo Coutinho wrote:
> We recently upgraded to Postgres 15.3. When running ANALYZE, we get the 
> following message:
> ERROR:  could not determine which collation to use for string comparison
> HINT:  Use the COLLATE clause to set the collation explicitly.
> 
> We have never seen this before. Could this be a bug?

Impossible to say without a way to reproduce.

Yours,
Laurenz Albe




ERROR: could not determine which collation to use for string comparison

2023-06-06 Thread Rômulo Coutinho
Hi,

We recently upgraded to Postgres 15.3. When running ANALYZE, we get the
following message:
ERROR:  could not determine which collation to use for string comparison
HINT:  Use the COLLATE clause to set the collation explicitly.

We have never seen this before. Could this be a bug?

Regards,

Rômulo Coutinho.


Re: Potential us of initialized memory in xlogrecovery.c

2023-06-06 Thread Heikki Linnakangas

On 06/06/2023 10:24, Tristan Partin wrote:

Hello,

Today, I compiled the master branch of Postgres with the following GCC
version:

gcc (GCC) 13.1.1 20230511 (Red Hat 13.1.1-2)

I got the following warning:

[701/2058] Compiling C object src/backend/postgres_lib.a.p/access_transam_xlogrecovery.c.o
In function ‘recoveryStopsAfter’,
    inlined from ‘PerformWalRecovery’ at ../src/backend/access/transam/xlogrecovery.c:1749:8:
../src/backend/access/transam/xlogrecovery.c:2756:42: warning: ‘recordXtime’ may be used uninitialized [-Wmaybe-uninitialized]
 2756 |                 recoveryStopTime = recordXtime;
      |                 ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~
../src/backend/access/transam/xlogrecovery.c: In function ‘PerformWalRecovery’:
../src/backend/access/transam/xlogrecovery.c:2647:21: note: ‘recordXtime’ was declared here
 2647 |         TimestampTz recordXtime;
      |                     ^~~~~~~~~~~

Investigating this issue I see a potential assignment in
xlogrecovery.c:2715. Best I can tell the warning looks real. Similar
functions in this file seem to initialize recordXtime to 0. Attached is
a patch which does just that.


Thank you! My refactoring in commit c945af80cf introduced this. Looking 
at getRecordTimestamp(), it will always return true and set recordXtime 
for the commit and abort records, and some compilers can deduce that.


Initializing to 0 makes sense, I'll commit that fix later tonight.
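The pattern that trips the warning, and the fix, boil down to something like this (a standalone sketch, not the actual xlogrecovery.c code):

```c
#include <assert.h>
#include <stdbool.h>

/* Mimics getRecordTimestamp(): writes *ts only for certain record types.
 * A caller that checks the return value is correct, but the compiler
 * cannot always prove the "written" path is taken, so it warns that the
 * caller's variable may be used uninitialized. */
static bool
get_record_timestamp(int rectype, long *ts)
{
	if (rectype == 1 || rectype == 2)	/* commit/abort-like records */
	{
		*ts = 12345;
		return true;
	}
	return false;
}

/* Caller pattern analogous to recoveryStopsAfter(), with the fix applied */
static long
stop_time_for(int rectype)
{
	long		record_xtime = 0;	/* the fix: initialize, so every path is defined */

	if (get_record_timestamp(rectype, &record_xtime))
		return record_xtime;
	return 0;
}
```

Initializing at the declaration changes no observable behavior on the paths where the value is actually set; it only gives the other paths a well-defined value.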

--
Heikki Linnakangas
Neon (https://neon.tech)





Re: Let's make PostgreSQL multi-threaded

2023-06-06 Thread Heikki Linnakangas

On 06/06/2023 11:48, c...@anastigmatix.net wrote:

And the devs of Java, in their immeasurable wisdom, have announced
a "JDK Enhancement Proposal" (that's just what these things are
called, don't blame Orwell), JEP 411[2][3], in which all of the
Security Manager features that PL/Java relies on for bounds on
'trusted' behavior are deprecated for eventual removal with no
functional replacement. I'd be even more leery of using one big
shared JVM for everybody's work after that happens.


Ouch.


Might the work toward allowing a run-time choice between a
process or threaded model also make possible some
intermediate models as well? A backend process for
connections to a particular database, or with particular
authentication credentials? Go through the authentication
handshake and then sendfd the connected socket to the
appropriate process. (Has every supported platform got
something like sendfd?)


I'm afraid having multiple processes and JVMs doesn't help that. If you 
can escape the one JVM in one backend process, it's game over. Backend 
processes are not a security barrier, and you have the same problems 
with the current multi-process architecture, too.


https://github.com/greenplum-db/plcontainer is one approach. It launches 
a separate process for the PL, separate from the backend process, and 
sandboxes that.


--
Heikki Linnakangas
Neon (https://neon.tech)





Re: RFC: logical publication via inheritance root?

2023-06-06 Thread Jacob Champion
On 4/4/23 08:14, Jacob Champion wrote:
> Yes, sorry -- after 062a84442, the architecture needs to change in a
> way that I'm still working through. I've moved the patch to Waiting on
> Author while I figure out the rebase.

Okay -- that took longer than I wanted, but here's a rebased patchset
that I'll call v2.

Commit 062a84442 necessitated some rework of the new
pg_get_publication_rels_to_sync() helper. It now takes a list of
publications so that we can handle conflicts in the pubviaroot settings.
This is more complicated than before -- unlike partitions, standard
inheritance trees can selectively publish tables that aren't leaves. But
I think I've finally settled on some semantics for it which are
unsurprising.

As part of that, I've pulled out a patch in 0001 which I hope is
independently useful. Today, there appears to be no way to check which
relid a table will be published through, short of creating a
subscription just to see what happens. 0001 introduces
pg_get_relation_publishing_info() to surface this information, which
makes testing it easier and also makes it possible to inspect what's
happening with more complicated publication setups.

0001 also moves the determination of publish_as_relid out of the
pgoutput plugin and into a pg_publication helper function, because
unless I've missed something crucial, it doesn't seem like an output
plugin is really free to make that decision independently of the
publication settings. The subscriber is not going to ask a plugin for
the right tables to COPY during initial sync, so the plugin had better
be using the same logic as the core.

Many TODOs and upthread points of feedback are still pending, and I
think that several of them are actually symptoms of one architectural
problem with my patch:

- The volatility classifications of pg_set_logical_root() and
pg_get_publication_rels_to_sync() appear to conflict
- A dump/restore cycle loses the new marker
- Inheritance can be tampered with after the logical root has been set
- There's currently no way to clear a logical root after setting it

I wonder if pg_set_logical_root() might be better implemented as part of
ALTER TABLE. Maybe with a relation option? If it all went through ALTER
TABLE ONLY ... SET, then we wouldn't have to worry about a user
modifying roots while reading pg_get_publication_rels_to_sync() in the
same query. The permissions checks should be more consistent with less
effort, and there's an existing way to set/clear the option that already
plays well with pg_dump and pg_upgrade. The downsides I can see are the
need to handle simultaneous changes to INHERIT and SET (since we'd be
manipulating pg_inherits in both), as well as the fact that ALTER TABLE
... SET defaults to altering the entire table hierarchy, which may be
bad UX for this case.

--Jacob

From 761b91b3f173adffac9f73132c2f1c9fc8bcbec3 Mon Sep 17 00:00:00 2001
From: Jacob Champion 
Date: Fri, 7 Apr 2023 09:55:27 -0700
Subject: [PATCH v2 1/2] pgoutput: refactor publication cache construction

Breaking this logic into a standalone helper should help expose the
behavior for testing.  Additionally, move the implementation under the
pg_publication catalog helpers, since it seems like it's inherent to the
publication settings (and it must match the behavior of the initial
sync, which doesn't seem to be controlled by the output plugin at all).
---
 src/backend/catalog/pg_publication.c| 215 
 src/backend/replication/pgoutput/pgoutput.c | 137 +
 src/include/catalog/pg_proc.dat |   8 +
 src/include/catalog/pg_publication.h|   4 +
 src/test/regress/expected/publication.out   |  26 +++
 src/test/regress/sql/publication.sql|  15 ++
 6 files changed, 272 insertions(+), 133 deletions(-)

diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index c488b6370b..df9abf09c3 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -1259,3 +1259,218 @@ pg_get_publication_tables(PG_FUNCTION_ARGS)
 
 	SRF_RETURN_DONE(funcctx);
 }
+
+List *
+process_relation_publications(Oid relid, const List *publications,
+			  PublicationActions *pubactions,
+			  Oid *publish_as_relid)
+{
+	Oid			schemaId = get_rel_namespace(relid);
+	List	   *pubids = GetRelationPublications(relid);
+
+	/*
+	 * We don't acquire a lock on the namespace system table as we build
+	 * the cache entry using a historic snapshot and all the later changes
+	 * are absorbed while decoding WAL.
+	 */
+	List	   *schemaPubids = GetSchemaPublications(schemaId);
+	ListCell   *lc;
+	int			publish_ancestor_level = 0;
+	bool		am_partition = get_rel_relispartition(relid);
+	char		relkind = get_rel_relkind(relid);
+	List	   *rel_publications = NIL;
+
+	*publish_as_relid = relid;
+
+	foreach(lc, publications)
+	{
+		Publication *pub = lfirst(lc);
+		bool		publish = false;
+
+		/*
+		 * Under what relid should we publish changes in this publication?
+		 * We'll 

Re: Let's make PostgreSQL multi-threaded

2023-06-06 Thread chap

On 2023-06-06 08:06, Konstantin Knizhnik wrote:
7. It will be hard to support non-multithreaded PL languages (like 
python), but for example support of Java will be more natural and 
efficient.


To this I say ...

Hmm.

Surely, the current situation with a JVM in each backend process
(that calls for one) has been often seen as heavier than desirable.

At the same time, I am not sure how manageable one giant process
with one giant JVM instance would prove to be, either.

It is somewhat nice to be able to tweak JVM settings in a session
and see what happens, without disrupting other sessions. There may
also exist cases for different JVM settings in per-user or per-
database GUCs.

Like Python with the GIL, it is documented for JNI_CreateJavaVM
that "Creation of multiple VMs in a single process is not
supported."[1]

And the devs of Java, in their immeasurable wisdom, have announced
a "JDK Enhancement Proposal" (that's just what these things are
called, don't blame Orwell), JEP 411[2][3], in which all of the
Security Manager features that PL/Java relies on for bounds on
'trusted' behavior are deprecated for eventual removal with no
functional replacement. I'd be even more leery of using one big
shared JVM for everybody's work after that happens.

Might the work toward allowing a run-time choice between a
process or threaded model also make possible some
intermediate models as well? A backend process for
connections to a particular database, or with particular
authentication credentials? Go through the authentication
handshake and then sendfd the connected socket to the
appropriate process. (Has every supported platform got
something like sendfd?)

That way, there could be some flexibility to arrange how many
distinct backends (and, for Java purposes, how many JVMs) get
fired up, and have each sharing sessions that have something in
common.

Or, would that just require all the complexity of both
approaches to synchronization, with no sufficient benefit?

Regards,
-Chap

[1] 
https://docs.oracle.com/en/java/javase/17/docs/specs/jni/invocation.html#jni_createjavavm
[2] 
https://blogs.apache.org/netbeans/entry/jep-411-deprecate-the-security1

[3] https://github.com/tada/pljava/wiki/JEP-411




Re: Let's make PostgreSQL multi-threaded

2023-06-06 Thread Heikki Linnakangas

On 06/06/2023 09:40, Robert Haas wrote:

On Mon, Jun 5, 2023 at 10:52 AM Heikki Linnakangas  wrote:

I spoke with some folks at PGCon about making PostgreSQL multi-threaded,
so that the whole server runs in a single process, with multiple
threads. It has been discussed many times in the past, last thread on
pgsql-hackers was back in 2017 when Konstantin made some experiments [0].

I feel that there is now pretty strong consensus that it would be a good
thing, more so than before. Lots of work to get there, and lots of
details to be hashed out, but no objections to the idea at a high level.


I'm not sure that there's a strong consensus, but I do think it's a good idea.


The consensus is not as strong as I hoped for... To summarize:

Tom, Andrew, Joe are worried that it will break a lot of stuff. That's a 
valid point. The transition needs to be done well and not break things, 
I agree with that. But if we can make the transition smooth, that's not 
an objection to the idea itself.


Many comments have been along the lines of "it's hard, not worth the 
effort". That's fair, but also not an objection to the idea itself, if 
someone decides to spend the time on it.


Bruce was worried about the loss of isolation that the separate address 
spaces gives, and Jeremy shared an anecdote on that. That is an 
objection to the idea itself, i.e. even if transition was smooth, 
bug-free and effortless, that point remains. I personally think the 
isolation we get from separate address spaces is overrated. Yes, it 
gives you some protection, but given how much shared memory there is, 
the blast radius is large even with separate backend processes.


So I think there's hope. I didn't hear any _strong_ objections to the 
idea itself, assuming the transition can be done smoothly.



# Transition period

The transition surely cannot be done fully in one release. Even if we
could pull it off in core, extensions will need more time to adapt.
There will be a transition period of at least one release, probably
more, where you can choose multi-process or multi-thread model using a
GUC. Depending on how it goes, we can document it as experimental at first.


I think the transition period should probably be effectively infinite.
There might be some distant future day when we'd remove the process
support, if things go incredibly well with threads, but I don't think
it would be any time soon.


I don't think this is worth it, unless we plan to eventually remove the 
multi-process mode. We could e.g. make lock table expandable in threaded 
mode, and fixed-size in process mode, but the big gains would come from 
being able to share things between threads and have variable-length 
shared data structures more easily. As long as you need to also support 
processes, you need to code to the lowest common denominator and don't 
really get the benefits.


I don't know how long a transition period we need. Maybe 1 release, maybe 5.


If nothing else, considering that we don't want to force a hard
compatibility break for extensions.

Extensions regularly need small tweaks to adapt to new major Postgres
versions; I don't think this would be too different.


--
Heikki Linnakangas
Neon (https://neon.tech)





Re: TRAP: FailedAssertion("prev_first_lsn < cur_txn->first_lsn", File: "reorderbuffer.c", Line: 927, PID: 568639)

2023-06-06 Thread Tomas Vondra



On 6/6/23 14:00, Alexander Lakhin wrote:
> Hello Tomas,
> 
> 06.06.2023 12:56, Tomas Vondra wrote:
>> On 6/6/23 11:00, Alexander Lakhin wrote:
>>> Hello,
>>> ...> With the debug logging added inside AssertTXNLsnOrder() I see:
>>> ctx->snapshot_builder->start_decoding_at: 209807224,
>>> ctx->reader->EndRecPtr: 210043072,
>>> SnapBuildXactNeedsSkip(ctx->snapshot_builder, ctx->reader->EndRecPtr): 0
>>> and inside the loop:
>>> cur_txn->first_lsn: 209792872
>>> cur_txn->first_lsn: 209975744
>>> cur_txn->first_lsn: 210043008
>>> cur_txn->first_lsn: 210043008
>>> and it triggers the Assert.
>>>
>> So what's the prev_first_lsn value for these first_lsn values? How does
>> it change over time? Did you try looking at the pg_waldump for these
>> positions?
> 
> With more logging I've got (for another run):
> ReorderBufferTXNByXid| xid: 3397, lsn: c1fbc80
> 
> ctx->snapshot_builder->start_decoding_at: c1f2cc0,
> ctx->reader->EndRecPtr: c1fbcc0,
> SnapBuildXactNeedsSkip(ctx->snapshot_builder, ctx->reader->EndRecPtr): 0
> prev_first_lsn: 0, cur_txn->first_lsn: c1fbc80
> prev_first_lsn: c1fbc80, cur_txn->first_lsn: c1fbc80
> TRAP: failed Assert("prev_first_lsn < cur_txn->first_lsn") ...
> 
> waldump for 0001000C shows:
> grep c1fbc80:
> rmgr: Heap2   len (rec/tot): 60/    60, tx:   3398, lsn:
> 0/0C1FBC80, prev 0/0C1FBC50, desc: NEW_CID rel: 1663/18763/19987, tid:
> 0/1, cmin: 1, cmax: 4294967295, combo: 4294967295
> rmgr: Heap    len (rec/tot): 59/    59, tx:   3398, lsn:
> 0/0C1FBCC0, prev 0/0C1FBC80, desc: INSERT+INIT off: 1, flags: 0x08,
> blkref #0: rel 1663/18763/19987 blk 0
> 
> grep '( 3397| 3398)'

I've been able to reproduce this, after messing with the script a little
bit (I had to skip the test_decoding regression tests, because that was
complaining about slots already existing etc).

Anyway, AssertTXNLsnOrder sees these two transactions (before aborting):

  26662 0/6462E6F0 (first 0/0)
  26661 0/6462E6F0 (first 0/6462E6F0)


where 26661 is the top xact, 26662 is a subxact of 26661. This is
clearly a problem, because we really should not have subxact in this
list once the assignment gets applied.

And the relevant WAL looks like this:

-
26662, lsn: 0/645EDAA0, prev 0/645EDA60, desc: ASSIGNMENT xtop 26661:
subxacts: 26662
26662, lsn: 0/645EDAD0, prev 0/645EDAA0, desc: INSERT+INIT off: 1,
flags: 0x08, blkref #0: rel 1663/125447/126835 blk 0
...
0, lsn: 0/6462E5D8, prev 0/6462E2A0, desc: RUNNING_XACTS nextXid
26673 latestCompletedXid 26672 oldestRunningXid 26661; 3 xacts: 26667
26661 26664; 3 subxacts: 26668 26662 26665
...
26662, lsn: 0/6462E6F0, prev 0/6462E678, desc: NEW_CID rel:
1663/125447/126841, tid: 0/1, cmin: 1, cmax: 4294967295, combo: 4294967295
26662, lsn: 0/6462E730, prev 0/6462E6F0, desc: INSERT+INIT off: 1,
flags: 0x08, blkref #0: rel 1663/125447/126841 blk 0
26661, lsn: 0/6462E770, prev 0/6462E730, desc: COMMIT 2023-06-06
16:41:24.442870 CEST; subxacts: 26662
-

so the assignment is the *first* thing that happens for these xacts.

However, we skip the assignment, because the log for this call of
get_changes says this:

   LOG:  logical decoding found consistent point at 0/6462E5D8

so we fail to realize that 26662 is a subxact.

Then when processing the NEW_CID, SnapBuildProcessNewCid chimes in and
does this:

  ReorderBufferXidSetCatalogChanges(builder->reorder, xid, lsn);

  ReorderBufferAddNewTupleCids(builder->reorder, xlrec->top_xid, lsn,
 xlrec->target_locator, xlrec->target_tid,
 xlrec->cmin, xlrec->cmax,
 xlrec->combocid);

and ReorderBufferAddNewTupleCids() proceeds to enter an entry for the
passed XID (which is xlrec->top_xid, 26661), but with LSN of the WAL
record. But ReorderBufferXidSetCatalogChanges() already did the same
thing for the subxact 26662, as it has no idea it's a subxact (due to
the skipped assignment).

I haven't figured out what exactly is happening / what it should be
doing instead. But it seems wrong to skip the assignment - I wonder if
SnapBuildProcessRunningXacts might be doing that too eagerly.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Potential us of initialized memory in xlogrecovery.c

2023-06-06 Thread Tristan Partin
Hello,

Today, I compiled the master branch of Postgres with the following GCC
version:

gcc (GCC) 13.1.1 20230511 (Red Hat 13.1.1-2)

I got the following warning:

[701/2058] Compiling C object 
src/backend/postgres_lib.a.p/access_transam_xlogrecovery.c.o
In function ‘recoveryStopsAfter’,
inlined from ‘PerformWalRecovery’ at 
../src/backend/access/transam/xlogrecovery.c:1749:8:
../src/backend/access/transam/xlogrecovery.c:2756:42: warning: ‘recordXtime’ 
may be used uninitialized [-Wmaybe-uninitialized]
 2756 | recoveryStopTime = recordXtime;
  | ~^
../src/backend/access/transam/xlogrecovery.c: In function ‘PerformWalRecovery’:
../src/backend/access/transam/xlogrecovery.c:2647:21: note: ‘recordXtime’ was 
declared here
 2647 | TimestampTz recordXtime;
  | ^~~

Investigating this issue I see a potential assignment in
xlogrecovery.c:2715. Best I can tell the warning looks real. Similar
functions in this file seem to initialize recordXtime to 0. Attached is
a patch which does just that.

-- 
Tristan Partin
Neon (https://neon.tech)
From 74cc0fcd0570383d273f49dbc7d669caf7474453 Mon Sep 17 00:00:00 2001
From: Tristan Partin 
Date: Tue, 6 Jun 2023 09:22:56 -0500
Subject: [PATCH v1] Initialize potentially uninitialized memory in
 xlogrecovery.c

Seems that in certain cases recordXtime might be used prior to
assignment.
---
 src/backend/access/transam/xlogrecovery.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 4883fcb512..becc2bda62 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -2644,7 +2644,7 @@ recoveryStopsAfter(XLogReaderState *record)
 	uint8		info;
 	uint8		xact_info;
 	uint8		rmid;
-	TimestampTz recordXtime;
+	TimestampTz recordXtime = 0;
 
 	/*
 	 * Ignore recovery target settings when not in archive recovery (meaning
-- 
-- 
Tristan Partin
Neon (https://neon.tech)



Re: [PATCH] pg_regress.c: Fix "make check" on Mac OS X: Pass DYLD_LIBRARY_PATH

2023-06-06 Thread Evan Jones
On Mon, Jun 5, 2023 at 10:33 PM Tom Lane  wrote:

> > Note that this is a known issue
> Yeah.  We have attempted to work around this before, but failed to find
> a solution without more downsides than upsides.  I will be interested
> to look at this patch, but lack time for it right now.  Anybody else?
>

Ah, I didn't find that mention in the documentation when I was trying to
get this working. Sorry about that!

My argument in favour of considering this patch is that making
"./configure; make; make check" work on current major operating systems
makes it easier for others to contribute in the future. I think the
disadvantage of this patch is it makes pg_regress harder to understand,
because it requires an #ifdef for this OS specific behaviour, and obscures
the command lines of the child processes it spawns.

Thanks for considering it!

Evan


Re: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific

2023-06-06 Thread Evan Jones
On Tue, Jun 6, 2023 at 7:37 AM Michael Paquier  wrote:

> Indeed.  It looks like 9ae2661 missed this spot.
>

I didn't think to look for a previous fix, thanks for finding this commit
id!

I did a quick look at the places found with "git grep isspace" yesterday. I
agree with the comment from commit 9ae2661: "I've left alone isspace()
calls in places that aren't really expecting any non-ASCII input
characters, such as float8in()." There are a number of other calls where I
think it would likely be safe, and possibly even a good idea, to replace
isspace() with scanner_isspace(). However, I couldn't find any where I
could cause a bug like the one I hit in hstore parsing.

Original mailing list post for commit 9ae2661 in case it is helpful for
others: https://www.postgresql.org/message-id/10129.1495302...@sss.pgh.pa.us


Re: Let's make PostgreSQL multi-threaded

2023-06-06 Thread Robert Haas
On Tue, Jun 6, 2023 at 9:40 AM Robert Haas  wrote:
> I'm not sure that there's a strong consensus, but I do think it's a good idea.

Let me elaborate on this a bit.

I think one of PostgreSQL's bigger problems right now is that it
doesn't scale as far as users would like. Beyond a couple of hundred
connections, everything goes to heck. Back in the day, the big
scalability problems were around locking, but we've done a pretty good
job cleaning that stuff up over the years. Now, the problem when you
run a ton of PostgreSQL connections isn't so much that PostgreSQL
stops working as it is that the OS stops working. PostgreSQL backends
use a lot of memory, even if they're idle. Some of that is for stuff
that we could optimize but haven't, like catcache and relcache
entries, and some of it is for stuff that we can't do anything about,
like per-process page tables. But the problem isn't just RAM, either.
I've seen machines running >1000 PostgreSQL backends where kill -9
took many *minutes* to work because the OS was overwhelmed. I don't
know exactly what goes wrong inside the kernel, but clearly something
does.

Not all databases have this problem, and PostgreSQL isn't going to be
able to stop having it without some kind of major architectural
change. Changing from a process model to a threaded model might be
insufficient, because while I think that threads consume fewer OS
resources than processes, what is really needed, in all likelihood, is
the ability to have idle connections have neither a process nor a
thread associated with them until they cease being idle. That's a huge
project and I'm not volunteering to do it, but if we want to have the
same kind of scalability as some competing products, that is probably
a place to which we ultimately need to go. Getting out of the current
model where every backend has an arbitrarily large amount of state
hanging off of random global variables, not all of which are even
known to any central system, is a critical step in that journey.

Also, programming with DSA and shm_mq sucks. It's doable (proof by
example) but it's awkward and it takes a long time and the performance
isn't great. Here again, threads instead of processes is no panacea.
For as long as we support a process model - and my guess is that we're
talking about a very long time - new features are going to have to
work with those systems or else be optional. But the amount of sheer
mental energy that is required to deal with DSA means we're unlikely
to ever have a rich library of parallel primitives. Maybe we wouldn't
anyway, volunteer efforts are hard to predict, but this is certainly
not helping. I do think that there's some danger that if sharing
memory becomes as easy as calling palloc(), we'll end up with memory
leaks that could eventually take the whole system down. We need to
give some thought to how to avoid or manage that danger.

Even think about something like the main lock table. That's a fixed
size hash table, so lock exhaustion is a real possibility. If we
weren't limited to a fixed-size shared memory segment, we could let
that thing grow without a server restart. We might not want to let it
grow infinitely, but we could raise the maximum size by 100x and
allocate as required and I think we'd just be better off. Doing that
as things stand would require nailing down that amount of memory
forever whether it's ever needed or not, which doesn't seem like a
good idea. But doing something where the memory can be allocated only
if it's needed would avoid user-facing errors with relatively little
cost.

I think doing something like this is going to be a huge effort, and
frankly, there's probably no point in anybody other than a handful of
people (Heikki, Andres, a handful of others) even trying. There's too
many ways to go wrong, and this has to be done really well to be worth
doing at all. But if somebody with the requisite expertise wants to
have a go at it, I don't think we should tell them "no, we don't want
that" on principle. Let's talk about whether a specific proposal is
good or bad, and why it's good or bad, rather than falling back on an
essentially religious argument. It's not an article of faith that
PostgreSQL should not use threads: it's a technology decision. The
difficulty of reversing the decision made long ago should weigh
heavily in evaluating any proposal to do so, but the potential
benefits of such a change should be considered, too.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Let's make PostgreSQL multi-threaded

2023-06-06 Thread Robert Haas
On Mon, Jun 5, 2023 at 10:52 AM Heikki Linnakangas  wrote:
> I spoke with some folks at PGCon about making PostgreSQL multi-threaded,
> so that the whole server runs in a single process, with multiple
> threads. It has been discussed many times in the past, last thread on
> pgsql-hackers was back in 2017 when Konstantin made some experiments [0].
>
> I feel that there is now pretty strong consensus that it would be a good
> thing, more so than before. Lots of work to get there, and lots of
> details to be hashed out, but no objections to the idea at a high level.

I'm not sure that there's a strong consensus, but I do think it's a good idea.

> # Transition period
>
> The transition surely cannot be done fully in one release. Even if we
> could pull it off in core, extensions will need more time to adapt.
> There will be a transition period of at least one release, probably
> more, where you can choose multi-process or multi-thread model using a
> GUC. Depending on how it goes, we can document it as experimental at first.

I think the transition period should probably be effectively infinite.
There might be some distant future day when we'd remove the process
support, if things go incredibly well with threads, but I don't think
it would be any time soon. If nothing else, considering that we don't
want to force a hard compatibility break for extensions.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Implement generalized sub routine find_in_log for tap test

2023-06-06 Thread vignesh C
On Tue, 6 Jun 2023 at 09:36, Michael Paquier  wrote:
>
> On Tue, Jun 06, 2023 at 08:05:49AM +0530, Amit Kapila wrote:
> > Personally, I don't see any problem to do this refactoring for v16.
> > However, sometimes, we do decide to backpatch refactoring in tests to
> > avoid backpatch effort. I am not completely sure if that is the case
> > here.
>
> 033_replay_tsp_drops.pl has one find_in_log() down to 11, and
> 019_replslot_limit.pl has four calls down to 14.  Making things
> consistent everywhere is a rather appealing argument to ease future
> backpatching.  So I am OK to spend a few extra cycles in adjusting
> these routines all the way down where needed.  I'll do that tomorrow
> once I get back in front of my laptop.
>
> Note that connect_ok() and connect_fails() are new to 14, so this
> part has no need to go further down than that.

Please find the attached patches that can be applied on back branches
too. v5*master.patch can be applied on master, v5*PG15.patch can be
applied on PG15, v5*PG14.patch can be applied on PG14, v5*PG13.patch
can be applied on PG13, v5*PG12.patch can be applied on PG12, PG11 and
PG10.

Regards,
Vignesh
From d582676214f3fd5068c343f42116fa256be18958 Mon Sep 17 00:00:00 2001
From: Vignesh C 
Date: Tue, 6 Jun 2023 17:07:37 +0530
Subject: [PATCH v5] Remove duplicate find_in_log sub routines from tap tests.

Remove duplicate find_in_log sub routines from tap tests.
---
 src/test/perl/PostgresNode.pm   | 15 +
 src/test/recovery/t/019_replslot_limit.pl   | 25 -
 src/test/recovery/t/033_replay_tsp_drops.pl | 14 ++--
 3 files changed, 21 insertions(+), 33 deletions(-)

diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index b7e4dbd0e8..ce924de0b2 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -2081,6 +2081,21 @@ sub issues_sql_like
 
 =pod
 
+=item $node->log_contains(pattern, offset)
+
+Find pattern in logfile of node after offset byte.
+
+=cut
+
+sub log_contains
+{
+	my ($self, $pattern, $offset) = @_;
+
+	return TestLib::slurp_file($self->logfile, $offset) =~ m/$pattern/;
+}
+
+=pod
+
 =item $node->run_log(...)
 
 Runs a shell command like TestLib::run_log, but with connection parameters set
diff --git a/src/test/recovery/t/019_replslot_limit.pl b/src/test/recovery/t/019_replslot_limit.pl
index 7a8954dad2..d36c954588 100644
--- a/src/test/recovery/t/019_replslot_limit.pl
+++ b/src/test/recovery/t/019_replslot_limit.pl
@@ -165,8 +165,7 @@ $node_master->wait_for_catchup($node_standby, 'replay', $start_lsn);
 
 $node_standby->stop;
 
-ok( !find_in_log(
-		$node_standby,
+ok( !$node_standby->log_contains(
 		"requested WAL segment [0-9A-F]+ has already been removed"),
 	'check that required WAL segments are still available');
 
@@ -188,8 +187,7 @@ $node_master->safe_psql('postgres', "CHECKPOINT;");
 my $invalidated = 0;
 for (my $i = 0; $i < 1; $i++)
 {
-	if (find_in_log(
-			$node_master,
+	if ($node_master->log_contains(
 			"invalidating slot \"rep1\" because its restart_lsn [0-9A-F/]+ exceeds max_slot_wal_keep_size",
 			$logstart))
 	{
@@ -212,7 +210,7 @@ is($result, "rep1|f|t|lost|",
 my $checkpoint_ended = 0;
 for (my $i = 0; $i < 1; $i++)
 {
-	if (find_in_log($node_master, "checkpoint complete: ", $logstart))
+	if ($node_master->log_contains("checkpoint complete: ", $logstart))
 	{
 		$checkpoint_ended = 1;
 		last;
@@ -242,8 +240,7 @@ $node_standby->start;
 my $failed = 0;
 for (my $i = 0; $i < 1; $i++)
 {
-	if (find_in_log(
-			$node_standby,
+	if ($node_standby->log_contains(
 			"requested WAL segment [0-9A-F]+ has already been removed",
 			$logstart))
 	{
@@ -318,17 +315,3 @@ sub get_log_size
 
 	return (stat $node->logfile)[7];
 }
-
-# find $pat in logfile of $node after $off-th byte
-sub find_in_log
-{
-	my ($node, $pat, $off) = @_;
-
-	$off = 0 unless defined $off;
-	my $log = TestLib::slurp_file($node->logfile);
-	return 0 if (length($log) <= $off);
-
-	$log = substr($log, $off);
-
-	return $log =~ m/$pat/;
-}
diff --git a/src/test/recovery/t/033_replay_tsp_drops.pl b/src/test/recovery/t/033_replay_tsp_drops.pl
index 140d94db31..4c8f4e7970 100644
--- a/src/test/recovery/t/033_replay_tsp_drops.pl
+++ b/src/test/recovery/t/033_replay_tsp_drops.pl
@@ -132,21 +132,11 @@ while ($max_attempts-- >= 0)
 {
 	last
 	  if (
-		find_in_log(
-			$node_standby, qr!WARNING: ( [A-Z0-9]+:)? creating missing directory: pg_tblspc/!,
+		$node_standby->log_contains(
+			qr!WARNING: ( [A-Z0-9]+:)? creating missing directory: pg_tblspc/!,
 			$logstart));
 	usleep(100_000);
 }
 ok($max_attempts > 0, "invalid directory creation is detected");
 
 done_testing();
-
-# find $pat in logfile of $node after $off-th byte
-sub find_in_log
-{
-	my ($node, $pat, $off) = @_;
-
-	my $log = PostgreSQL::Test::Utils::slurp_file($node->logfile, $off);
-
-	return $log =~ m/$pat/;
-}
-- 
2.34.1

From 7570e55a9f0121a9e2e0a12f4e8718cc531811c9 Mon Sep 17 00:00:00 2001
From: Vignesh C 

Re: Order changes in PG16 since ICU introduction

2023-06-06 Thread Daniel Verite
Jeff Davis wrote:

> New patch series attached. I plan to commit 0001 and 0002 soon, unless
> there are objections.
> 
> 0001 causes the "C" and "POSIX" locales to be treated with
> memcmp/pg_ascii semantics in ICU, just like in libc. We also
> considered a new "none" provider, but it's more invasive, and we can
> always reconsider that in the v17 cycle.

FWIW I don't quite see how 0001 improves things or what problem it's
trying to solve.

0001 creates exceptions throughout the code so that when an ICU
collation has a locale name "C" or "POSIX" then it does not behave
like an ICU collation, even though pg_collation.collprovider='i'
To me it's neither desirable nor necessary that a collation that
has collprovider='i' is diverted to non-ICU semantics.

Also in the current state, this diversion does not apply to initdb.

"initdb --icu-locale=C" with 0001 applied reports this:

   Using language tag "en-US-u-va-posix" for ICU locale "C".
   The database cluster will be initialized with this locale configuration:
 provider:    icu
 ICU locale:  en-US-u-va-posix
 LC_COLLATE:  fr_FR.UTF-8
 [...]

and "initdb --locale=C" reports this:

   Using default ICU locale "fr_FR".
   Using language tag "fr-FR" for ICU locale "fr_FR".
   The database cluster will be initialized with this locale configuration:
 provider:    icu
 ICU locale:  fr-FR
 LC_COLLATE:  C
 [...]

Could you elaborate a bit more on what 0001 is meant to achieve, from
the point of view of the user?


Best regards,
-- 
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite




Re: Implement generalized sub routine find_in_log for tap test

2023-06-06 Thread Bharath Rupireddy
On Tue, Jun 6, 2023 at 4:11 PM Michael Paquier  wrote:
>
> On Tue, Jun 06, 2023 at 10:00:00AM +0530, Bharath Rupireddy wrote:
> > +1 for deduplicating find_in_log. How about deduplicating advance_wal
> > too so that 019_replslot_limit.pl, 033_replay_tsp_drops.pl,
> > 035_standby_logical_decoding.pl and 001_stream_rep.pl can benefit
> > immediately?
>
> As in a small wrapper for pg_switch_wal() that generates N segments at
> will?

Yes. A simpler way of doing it would be to move advance_wal() in
019_replslot_limit.pl to Cluster.pm, something like the attached. The CI
runs don't complain about it, see
https://github.com/BRupireddy/postgres/tree/add_a_function_in_Cluster.pm_to_generate_WAL.
Perhaps, we can name it better instead of advance_wal, say
generate_wal or some other?

> I don't see why we could not do that if it proves useful in the
> long run.

Besides the beneficiaries listed above, the test case added by
https://commitfest.postgresql.org/43/3663/ can use it. And, the
test_table bits in 020_pg_receivewal.pl can use it (?).

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v1-0001-Add-a-function-in-Cluster.pm-to-generate-WAL
Description: Binary data


Re: Let's make PostgreSQL multi-threaded

2023-06-06 Thread Konstantin Knizhnik



On 06.06.2023 12:07 AM, Jonah H. Harris wrote:

On Mon, Jun 5, 2023 at 8:18 AM Tom Lane  wrote:

For the record, I think this will be a disaster.  There is far too
much
code that will get broken, largely silently, and much of it is not
under our control.


While I've long been in favor of a multi-threaded implementation, now 
in my old age, I tend to agree with Tom. I'd be interested in 
Konstantin's thoughts (and PostgresPro's experience) of multi-threaded 
vs. internal pooling with the current process-based model. I recall 
looking at and playing with Konstantin's implementations of both, 
which were impressive. Yes, the latter doesn't solve the same issues, 
but many real-world ones where multi-threaded is argued. Personally, I 
think there would be not only a significant amount of time spent 
dealing with in-the-field stability regressions before a 
multi-threaded implementation matures, but it would also increase the 
learning curve for anyone trying to start with internals development.


--
Jonah H. Harris




Let me share my experience with porting Postgres to threads (by the way 
- the repository is still alive - 
https://github.com/postgrespro/postgresql.pthreads 
- but I have not kept it in sync with recent versions of Postgres).

1. Solving the problem with static variables was not as difficult as I 
expected - thanks to TLS and its support in modern compilers.
So the only thing we need to do is add a special modifier to each 
variable declaration:


-static int MyLockNo = 0;
-static bool holdingAllLocks = false;
+static session_local int   MyLockNo = 0;
+static session_local bool holdingAllLocks = false;

But there are about 2k such variables whose storage class has to be changed.
This is one of the reasons why I do not agree with the proposal to 
define some session context, place all session-specific variables in 
such a context and pass it everywhere. It would be very inconvenient to 
maintain a structure with 2k fields, adding a new field to this struct 
each time you need some non-local variable, even if it can be hidden 
behind a macro like DEF_SESSION_VAR(type, name).
It would also require changing all Postgres code working with these 
variables, not just the declarations.
So the patch would be 100x larger, and almost every line of Postgres 
code would have to be changed.
And I do not see any reason for it except portability and avoiding a 
dependency on the compiler.
The implementation of TLS is quite efficient (at least on x86) - there 
is a special register pointing to the TLS area, so accessing a TLS 
variable is no more expensive than accessing a static variable.
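The effect of such a session_local (thread-local) modifier can be shown with a small sketch. Python's threading.local plays the role here that __thread/_Thread_local storage plays in C, so this is only an analogy of the mechanism, with illustrative names, not code from the actual patch:

```python
import threading

# Analogy for a "session_local" modifier: threading.local gives each thread
# its own copy of every attribute, much as a __thread/_Thread_local C
# variable gives each thread (= session, in a thread-per-session server)
# its own slot.
session = threading.local()

def backend(lock_no, results):
    # Each "backend" thread initializes and mutates only its own copy.
    session.my_lock_no = lock_no
    session.holding_all_locks = False
    results[lock_no] = session.my_lock_no

results = {}
threads = [threading.Thread(target=backend, args=(n, results)) for n in (1, 2, 3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results.items()))           # → [(1, 1), (2, 2), (3, 3)]
# The main thread never set my_lock_no, so it has no copy at all:
print(hasattr(session, "my_lock_no"))    # → False
```

No thread ever observes another thread's value, which is why only the declarations (not the code using the variables) need to change.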


2. The performance improvement from switching to threads was not very 
large (~10%). But please note that I did not change any Postgres 
synchronization primitives.
(I am still not sure that using, for example, pthread_rwlock instead of 
our own LWLock would yield any performance gains.)


3. Multithreading significantly simplifies concurrent query execution 
and interaction between workers.
Right now, with the dynamic shared memory infrastructure, we can work 
with variable-size data in shared memory, but in a multithreaded 
program it can be done much more easily.

4. The multithreaded model opens a way to fix many existing Postgres 
problems: the lack of a shared catalog and prepared statement cache, 
changing the page pool size (shared buffers) at runtime, ...


5. During this porting I had the most trouble with the following 
components: GUCs, signals, error handling, and the file descriptor 
cache. The file descriptor cache really becomes a bottleneck because 
all backends are now competing for file descriptors, whose number is 
usually limited to 1024 (without editing the system configuration). 
Protecting it with a mutex causes significant performance degradation, 
so I had to maintain a thread-local cache.


6. It is not clear how to support external extensions.

7. It will be hard to support non-multithreaded PL languages (like 
Python), but support of Java, for example, will be more natural and efficient.


I do not think that developing a multithreaded application is more 
complex or requires a steeper learning curve.

When you deal with parallel execution you should be careful in any case.
The advantage of the process model is that there is a much clearer 
distinction between shared and private variables.
Concerning debugging and profiling - it is more convenient with 
multithreading in some cases and less convenient in others.
But programmers have been living with threads for more than 30 years, 
so most tools now support threads at least as well as processes.
And for many developers working with threads is now more natural and 
convenient.



OOM and per-backend memory consumption seem to be one of the main 
challenges for the multithreaded model: right now some queries can 
cause high memory consumption. work_mem is just a hint, and real memory 
consumption can be much higher.
Even if it doesn't cause OOM, still not all of the allocated memory is 
returned 

Re: TRAP: FailedAssertion("prev_first_lsn < cur_txn->first_lsn", File: "reorderbuffer.c", Line: 927, PID: 568639)

2023-06-06 Thread Alexander Lakhin

Hello Tomas,

06.06.2023 12:56, Tomas Vondra wrote:

On 6/6/23 11:00, Alexander Lakhin wrote:

Hello,
...> With the debug logging added inside AssertTXNLsnOrder() I see:
ctx->snapshot_builder->start_decoding_at: 209807224,
ctx->reader->EndRecPtr: 210043072,
SnapBuildXactNeedsSkip(ctx->snapshot_builder, ctx->reader->EndRecPtr): 0
and inside the loop:
cur_txn->first_lsn: 209792872
cur_txn->first_lsn: 209975744
cur_txn->first_lsn: 210043008
cur_txn->first_lsn: 210043008
and it triggers the Assert.


So what's the prev_first_lsn value for these first_lsn values? How does
it change over time? Did you try looking at the pg_waldump for these
positions?


With more logging I've got (for another run):
ReorderBufferTXNByXid| xid: 3397, lsn: c1fbc80

ctx->snapshot_builder->start_decoding_at: c1f2cc0, ctx->reader->EndRecPtr: c1fbcc0, 
SnapBuildXactNeedsSkip(ctx->snapshot_builder, ctx->reader->EndRecPtr): 0

prev_first_lsn: 0, cur_txn->first_lsn: c1fbc80
prev_first_lsn: c1fbc80, cur_txn->first_lsn: c1fbc80
TRAP: failed Assert("prev_first_lsn < cur_txn->first_lsn") ...

waldump for 0001000C shows:
grep c1fbc80:
rmgr: Heap2   len (rec/tot): 60/    60, tx:   3398, lsn: 0/0C1FBC80, prev 0/0C1FBC50, desc: NEW_CID rel: 
1663/18763/19987, tid: 0/1, cmin: 1, cmax: 4294967295, combo: 4294967295
rmgr: Heap    len (rec/tot): 59/    59, tx:   3398, lsn: 0/0C1FBCC0, prev 0/0C1FBC80, desc: INSERT+INIT off: 
1, flags: 0x08, blkref #0: rel 1663/18763/19987 blk 0


grep '( 3397| 3398)'
rmgr: Transaction len (rec/tot): 43/    43, tx:   3398, lsn: 0/0C1F2B20, prev 0/0C1F2688, desc: ASSIGNMENT xtop 
3397: subxacts: 3398
rmgr: Heap    len (rec/tot): 59/    59, tx:   3398, lsn: 0/0C1F2B50, prev 0/0C1F2B20, desc: INSERT+INIT off: 
1, flags: 0x08, blkref #0: rel 1663/18763/19981 blk 0
rmgr: Standby len (rec/tot): 62/    62, tx:  0, lsn: 0/0C1F2BD0, prev 0/0C1F2B90, desc: RUNNING_XACTS 
nextXid 3400 latestCompletedXid 3396 oldestRunningXid 3397; 2 xacts: 3399 3397; 1 subxacts: 3398
rmgr: Standby len (rec/tot): 58/    58, tx:  0, lsn: 0/0C1F2C80, prev 0/0C1F2C50, desc: RUNNING_XACTS 
nextXid 3400 latestCompletedXid 3399 oldestRunningXid 3397; 1 xacts: 3397; 1 subxacts: 3398
rmgr: XLOG    len (rec/tot):    114/   114, tx:  0, lsn: 0/0C1F2CC0, prev 0/0C1F2C80, desc: 
CHECKPOINT_ONLINE redo 0/C1F2C10; tli 1; prev tli 1; fpw true; xid 0:3400; oid 24576; multi 13; offset 29; oldest xid 
722 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 3397; online
rmgr: Standby len (rec/tot): 62/    62, tx:  0, lsn: 0/0C1FBAD0, prev 0/0C1FBAA0, desc: RUNNING_XACTS 
nextXid 3401 latestCompletedXid 3399 oldestRunningXid 3397; 2 xacts: 3400 3397; 1 subxacts: 3398
rmgr: Heap2   len (rec/tot): 60/    60, tx:   3398, lsn: 0/0C1FBC80, prev 0/0C1FBC50, desc: NEW_CID rel: 
1663/18763/19987, tid: 0/1, cmin: 1, cmax: 4294967295, combo: 4294967295
rmgr: Heap    len (rec/tot): 59/    59, tx:   3398, lsn: 0/0C1FBCC0, prev 0/0C1FBC80, desc: INSERT+INIT off: 
1, flags: 0x08, blkref #0: rel 1663/18763/19987 blk 0
rmgr: Transaction len (rec/tot): 54/    54, tx:   3397, lsn: 0/0C1FBD00, prev 0/0C1FBCC0, desc: COMMIT 
2023-06-06 13:55:26.955268 MSK; subxacts: 3398


Best regards,
Alexander




Re: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific

2023-06-06 Thread Michael Paquier
On Mon, Jun 05, 2023 at 11:26:56AM -0400, Evan Jones wrote:
> This patch fixes a rare parsing bug with unicode characters on Mac OS X.
> The problem is that isspace() on Mac OS X changes its behaviour with the
> locale. Use scanner_isspace instead, which only returns true for ASCII
> whitespace. It appears other places in the Postgres code have already run
> into this, since a number of places use scanner_isspace instead. However,
> there are still a lot of other calls to isspace(). I'll try to take a quick
> look to see if there might be other instances of this bug.

Indeed.  It looks like 9ae2661 missed this spot.
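The difference between a locale/Unicode-aware whitespace test and an ASCII-only check in the spirit of scanner_isspace() can be sketched as follows. This is a Python illustration of the bug class, not the hstore code itself; Python's str.isspace() is Unicode-aware, much like isspace() under a UTF-8 locale on macOS:

```python
def ascii_isspace(ch):
    # ASCII-only whitespace, modeled on PostgreSQL's scanner_isspace():
    # only these characters count, regardless of the process locale.
    return ch in " \t\n\r\f"

nbsp = "\u00a0"              # no-break space, a common multibyte "space" in UTF-8

# A Unicode/locale-aware test considers NBSP to be whitespace...
print(nbsp.isspace())        # → True
# ...while the ASCII-only test does not, so keys/values containing such
# bytes are not silently trimmed or split during parsing.
print(ascii_isspace(nbsp))   # → False
print(ascii_isspace(" "))    # → True
```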
--
Michael


signature.asc
Description: PGP signature


Re: Do we want a hashset type?

2023-06-06 Thread Tomas Vondra


On 6/5/23 21:52, Joel Jacobson wrote:
> On Mon, Jun 5, 2023, at 01:44, Tomas Vondra wrote:
>> Anyway, I hacked together a trivial set backed by an open addressing
>> hash table:
>>
>>   https://github.com/tvondra/hashset
>>
>> It's super-basic, providing just some bare minimum of functions, but
>> hopefully good enough for experiments.
>>
>> - hashset data type - hash table in varlena
>> - hashset_init - create new hashset
>> - hashset_add / hashset_contains - add value / check membership
>> - hashset_merge - merge two hashsets
>> - hashset_count - count elements
>> - hashset_to_array - return
>> - hashset(int) aggregate
>>
>> This allows rewriting the recursive query like this:
>>
>> WITH RECURSIVE friends_of_friends AS (
> ...
> 
> Nice! I get similar results, 737 ms (hashset) vs 1809 ms (array_agg).
> 
> I think if you just add one more hashset function, it will be a win against 
> roaringbitmap, which is 400 ms.
> 
> The missing function is an agg that takes hashset as input and returns 
> hashset, similar to roaringbitmap's rb_or_agg().
> 
> With such a function, we could add an adjacent nodes hashset column to the 
> `nodes` table, which would eliminate the need to scan the edges table for 
> graph traversal:
> 

I added a trivial version of such aggregate hashset(hashset), and if I
rewrite the CTE like this:

WITH RECURSIVE friends_of_friends AS (
SELECT
(select hashset(v) from (values (5867)) as s(v)) AS current,
0 AS depth
UNION ALL
SELECT
new_current,
friends_of_friends.depth + 1
FROM
friends_of_friends
CROSS JOIN LATERAL (
SELECT
hashset(edges.to_node) AS new_current
FROM
edges
WHERE
from_node =
ANY(hashset_to_array(friends_of_friends.current))
) q
WHERE
friends_of_friends.depth < 3
)
SELECT
depth,
hashset_count(current)
FROM
friends_of_friends
WHERE
depth = 3;

it cuts the timing to about 50% on my laptop, so maybe it'll be ~300ms
on your system. There's a bunch of opportunities for more improvements,
as the hash table implementation is pretty naive/silly, the on-disk
format is wasteful and so on.
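For readers unfamiliar with the approach, the open-addressing core behind such a set can be sketched in a few lines (linear probing, fixed capacity, integers only; the real extension stores the table in a varlena and handles growth, so this is only the idea, not its code):

```python
class IntHashSet:
    """Minimal open-addressing hash set with linear probing (no resizing)."""

    EMPTY = None

    def __init__(self, capacity=16):
        self.slots = [self.EMPTY] * capacity

    def _probe(self, value):
        # Start at hash(value) mod capacity and walk forward until we find
        # the value or an empty slot (assumes the table never fills up).
        i = hash(value) % len(self.slots)
        while self.slots[i] is not self.EMPTY and self.slots[i] != value:
            i = (i + 1) % len(self.slots)
        return i

    def add(self, value):
        self.slots[self._probe(value)] = value  # growth/resize omitted

    def contains(self, value):
        return self.slots[self._probe(value)] == value

s = IntHashSet()
for v in (5867, 42, 42):                        # duplicate adds are no-ops
    s.add(v)
print(s.contains(5867), s.contains(7))          # → True False
```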

But before spending more time on that, it'd be interesting to know what
would be a competitive timing. I mean, what would be "good enough"? What
timings are achievable with graph databases?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: 回复:Fix missing initialization of delayChkptEnd

2023-06-06 Thread Michael Paquier
On Tue, Jun 06, 2023 at 03:13:14PM +0900, Kyotaro Horiguchi wrote:
> After a quick check through the 14 tree, then compaing with the
> corresponding parts of 15, it hit me that ProcArrayClearTransaction()
> needs an assertion on the variable.  Other than that, the patch looks
> good to me.

Yeah, it feels wrong to check only after delayChkpt in this code
path.  I'll look at that tomorrow.
--
Michael


signature.asc
Description: PGP signature


Re: Simplify pg_collation.collversion for Windows libc

2023-06-06 Thread Peter Eisentraut

On 06.06.23 03:21, Thomas Munro wrote:

On Mon, Jun 5, 2023 at 12:56 PM Daniel Verite  wrote:

postgres=# select collversion,count(*) from pg_collation group by
collversion;
   collversion  | count
---+---
| 5
  1539.5,1539.5 |  1457
(2 rows)

According to [1] the second number is obsolete, and AFAICS we should
expose only the first.


Would it be a good idea to remove or ignore the trailing /,*$/
somewhere, perhaps during pg_upgrade, to avoid bogus version mismatch
warnings?


I wonder whether it's worth dealing with this, versus just leaving it 
all alone.





Return value of pg_promote()

2023-06-06 Thread Ashutosh Sharma
Hi All,

At present, pg_promote() returns true to the caller on successful
promotion of standby, however it returns false in multiple scenarios
which includes:

1) The SIGUSR1 signal could not be sent to the postmaster process.
2) The postmaster died during standby promotion.
3) Standby couldn't be promoted within the specified wait time.

For an application calling this function, if pg_promote returns false,
it is hard to interpret the reason behind it. So I think we should
*only* allow pg_promote to return false when the server could not be
promoted in the given wait time and in other scenarios it should just
throw an error (FATAL, ERROR ... depending on the type of failure that
occurred). Please let me know your thoughts on this change. Thanks!

--
With Regards,
Ashutosh Sharma.




Re: Support logical replication of DDLs

2023-06-06 Thread Amit Kapila
On Mon, Jun 5, 2023 at 3:00 PM shveta malik  wrote:
>

Few assorted comments:
===
1. I see the following text in 0005 patch: "It supports most of ALTER
TABLE command except some commands(DDL related to PARTITIONED TABLE
...) that are recently introduced but are not yet supported by the
current ddl_deparser, we will support that later.". Is this still
correct?

2. I think the commit message of 0008
(0008-ObjTree-Removal-for-multiple-commands-2023_05_22) should be
expanded to explain why ObjTree is not required and or how you have
accomplished the same with jsonb functions.

3. 0005* patch has the following changes in docs:
+
+  DDL Replication Support by Command Tag
+  
+
+
+
+  
+   
+Command Tag
+For Replication
+Notes
+   
+  
+  
+   
+ALTER AGGREGATE
+-
+
+   
+   
+ALTER CAST
+-
+
...
...

If the patch's scope is to only support replication of table DDLs, why
such other commands are mentioned?

4. Can you write some comments about the deparsing format in one of
the file headers or in README?

5.
+   
+The table_init_write event occurs just after
the creation of
+table while execution of CREATE TABLE AS and
+SELECT INTO commands.

Patch 0001 has multiple references to table_init_write trigger but it
is introduced in the 0002 patch, so those changes either belong to
0002 or one of the later patches. I think that may reduce most of the
changes in event-trigger.sgml.

6.
+ if (relpersist == RELPERSISTENCE_PERMANENT)
+ {
+ ddl_deparse_context context;
+ char*json_string;
+
+ context.verbose_mode = false;
+ context.func_volatile = PROVOLATILE_IMMUTABLE;

Can we write some comments as to why PROVOLATILE_IMMUTABLE is chosen here?

7.
diff --git 
a/src/test/modules/test_ddl_deparse_regress/t/001_compare_dumped_results.pl.orig
b/src/test/modules/test_ddl_deparse_regress/t/001_compare_dumped_results.pl.orig
new file mode 100644
index 00..58cf7cdf33
--- /dev/null
+++ 
b/src/test/modules/test_ddl_deparse_regress/t/001_compare_dumped_results.pl.orig

Will this file require for the 0008 patch? Or is this just a leftover?

-- 
With Regards,
Amit Kapila.




Re: Improve join_search_one_level readibilty (one line change)

2023-06-06 Thread 謝東霖
Thank you to Julien Rouhaud and Tender Wang for the reviews.

Julien's detailed guide has proven to be incredibly helpful, and I am
truly grateful for it.
Thank you so much for providing such valuable guidance!

I have initiated a new commitfest:
https://commitfest.postgresql.org/43/4346/

Furthermore, I have attached a patch that improves the code by moving
the initialization of "other_rels_list" outside the if branching.

Perhaps Tom Lane would be interested in reviewing this minor change as well?
From 5b038577cf4f8acdc094d6e41d3891b559016cae Mon Sep 17 00:00:00 2001
From: DouEnergy 
Date: Tue, 6 Jun 2023 15:30:14 +0800
Subject: [PATCH v2] Improve left deep tree dp algorithm's readability

---
 src/backend/optimizer/path/joinrels.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/backend/optimizer/path/joinrels.c 
b/src/backend/optimizer/path/joinrels.c
index 2feab2184f..a283e574f3 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -109,14 +109,14 @@ join_search_one_level(PlannerInfo *root, int level)
List   *other_rels_list;
ListCell   *other_rels;
 
+   other_rels_list = joinrels[1];
+
if (level == 2) /* consider remaining initial 
rels */
{
-   other_rels_list = joinrels[level - 1];
other_rels = lnext(other_rels_list, r);
}
else/* consider all initial 
rels */
{
-   other_rels_list = joinrels[1];
other_rels = list_head(other_rels_list);
}
 
-- 
2.37.1 (Apple Git-137.1)



Re: Implement generalized sub routine find_in_log for tap test

2023-06-06 Thread Michael Paquier
On Tue, Jun 06, 2023 at 10:00:00AM +0530, Bharath Rupireddy wrote:
> +1 for deduplicating find_in_log. How about deduplicating advance_wal
> too so that 019_replslot_limit.pl, 033_replay_tsp_drops.pl,
> 035_standby_logical_decoding.pl and 001_stream_rep.pl can benefit
> immediately?

As in a small wrapper for pg_switch_wal() that generates N segments at
will?  I don't see why we could not do that if it proves useful in the
long run.
--
Michael


signature.asc
Description: PGP signature


Re: TRAP: FailedAssertion("prev_first_lsn < cur_txn->first_lsn", File: "reorderbuffer.c", Line: 927, PID: 568639)

2023-06-06 Thread Tomas Vondra
On 6/6/23 11:00, Alexander Lakhin wrote:
> Hello,
> ...> With the debug logging added inside AssertTXNLsnOrder() I see:
> ctx->snapshot_builder->start_decoding_at: 209807224,
> ctx->reader->EndRecPtr: 210043072,
> SnapBuildXactNeedsSkip(ctx->snapshot_builder, ctx->reader->EndRecPtr): 0
> and inside the loop:
> cur_txn->first_lsn: 209792872
> cur_txn->first_lsn: 209975744
> cur_txn->first_lsn: 210043008
> cur_txn->first_lsn: 210043008
> and it triggers the Assert.
> 

So what's the prev_first_lsn value for these first_lsn values? How does
it change over time? Did you try looking at the pg_waldump for these
positions?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Parallelize correlated subqueries that execute within each worker

2023-06-06 Thread Richard Guo
On Mon, Jan 23, 2023 at 10:00 PM James Coleman  wrote:

> With this patch we do in fact now see (as expected) rels with
> non-empty lateral_relids showing up in generate_[useful_]gather_paths.
> And the partial paths can now have non-empty required outer rels. I'm
> not able to come up with a plan that would actually be caught by those
> checks; I theorize that because of the few places we actually call
> generate_[useful_]gather_paths we are in practice already excluding
> those, but for now I've left these as a conditional rather than an
> assertion because it seems like the kind of guard we'd want to ensure
> those methods are safe.


I'm trying to understand this part.  AFAICS we will not create partial
paths for a rel, base or join, if it has lateral references.  So it
seems to me that in generate_[useful_]gather_paths after we've checked
that there are partial paths, the checks for lateral_relids are not
necessary because lateral_relids should always be empty in this case.
Maybe I'm missing something.

And while trying the v9 patch I came across a crash with the query
below.

set min_parallel_table_scan_size to 0;
set parallel_setup_cost to 0;
set parallel_tuple_cost to 0;

explain (costs off)
select * from pg_description t1 where objoid in
(select objoid from pg_description t2 where t2.description =
t1.description);
   QUERY PLAN

 Seq Scan on pg_description t1
   Filter: (SubPlan 1)
   SubPlan 1
 ->  Gather
   Workers Planned: 2
   ->  Parallel Seq Scan on pg_description t2
 Filter: (description = t1.description)
(7 rows)

select * from pg_description t1 where objoid in
(select objoid from pg_description t2 where t2.description =
t1.description);
WARNING:  terminating connection because of crash of another server process

Seems something is wrong when extracting the argument from the Param in
parallel worker.

BTW another rebase is needed as it no longer applies to HEAD.

Thanks
Richard


Re: Support logical replication of DDLs

2023-06-06 Thread vignesh C
On Thu, 1 Jun 2023 at 07:42, Yu Shi (Fujitsu)  wrote:
>
> On Wed, May 31, 2023 5:41 PM shveta malik  wrote:
> >
> > PFA the set of patches consisting above changes. All the changes are
> > made in 0008 patch.
> >
> > Apart from above changes, many partition attach/detach related tests
> > are uncommented in alter_table.sql in patch 0008.
> >
>
> Thanks for updating the patch, here are some comments.
>
> 1.
> I think some code can be removed because it is not for supporting table
> commands.

> 1.2
> 0001 patch, in deparse_AlterRelation().
>
> +   case AT_AddColumnToView:
> +   /* CREATE OR REPLACE VIEW -- nothing to do 
> here */
> +   break;

Modified

> 1.3
> 0001 patch.
> ("alter table ... owner to ... " is deparsed in deparse_AlterRelation(), 
> instead
> of this funciton.)
>
> +static ObjTree *
> +deparse_AlterOwnerStmt(ObjectAddress address, Node *parsetree)
> +{
> +   AlterOwnerStmt *node = (AlterOwnerStmt *) parsetree;
> +
> +   return new_objtree_VA("ALTER %{objtype}s %{identity}s OWNER TO 
> %{newowner}I", 3,
> + "objtype", ObjTypeString,
> + 
> stringify_objtype(node->objectType),
> + "identity", ObjTypeString,
> + getObjectIdentity(, 
> false),
> + "newowner", ObjTypeString,
> + 
> get_rolespec_name(node->newowner));
> +}

Modified

> 2. 0001 patch, in deparse_AlterTableStmt(),
>
> +   case AT_CookedColumnDefault:
> +   {
> +   Relationattrrel;
> +   HeapTuple   atttup;
> +   Form_pg_attribute attStruct;
> ...
>
> I think this case can be removed because it is for "create table like", and in
> such case the function has returned before reaching here, see below.
>
> +   /*
> +* ALTER TABLE subcommands generated for TableLikeClause is processed 
> in
> +* the top level CREATE TABLE command; return empty here.
> +*/
> +   if (stmt->table_like)
> +   return NULL;

Modified

> 3. 0001 patch, in deparse_AlterRelation().
>
> Is there any case where an "ALTER TABLE" command adds an index that is not a
> constraint? If not, maybe we can remove the check or replace it with an
> assert.
>
> +   case AT_AddIndex:
> +   {
> ...
> +
> +   if (!istmt->isconstraint)
> +   break;

Modified to Assert

> 4. To run this test when building with meson, it seems we should add
> "test_ddl_deparse_regress" to src/test/modules/meson.build.

Modified

> 5.
> create table t1 (a int);
> create table t2 (a int);
> SET allow_in_place_tablespaces = true;
> CREATE TABLESPACE ddl_tblspace LOCATION '';
> RESET allow_in_place_tablespaces;
> ALTER TABLE ALL IN TABLESPACE pg_default SET TABLESPACE ddl_tblspace;
>
> In the last command, if multiple tables are altered, the command will be
> deparsed multiple times and there will be multiple re-formed commands. Is it 
> OK?

Modified to "ALTER TABLE ... SET TABLESPACE" for each of the tables
that are getting altered. We have to generate subcommands for each
table because of the existing alter table trigger callbacks.

> 6.
> Cfbot failed because of compiler warnings. [1]
>
> [15:06:48.065] ddldeparse.c: In function ‘deparse_Seq_OwnedBy_toJsonb’:
> [15:06:48.065] ddldeparse.c:586:14: error: variable ‘value’ set but not used 
> [-Werror=unused-but-set-variable]
> [15:06:48.065]   586 |  JsonbValue *value = NULL;
> [15:06:48.065]   |  ^
> [15:06:48.065] ddldeparse.c: In function ‘deparse_utility_command’:
> [15:06:48.065] ddldeparse.c:2729:18: error: ‘rel’ may be used uninitialized 
> in this function [-Werror=maybe-uninitialized]
> [15:06:48.065]  2729 |  seq_relid = 
> getIdentitySequence(RelationGetRelid(rel), attnum, true);
> [15:06:48.065]   |  
> ^~~~
> [15:06:48.065] ddldeparse.c:1935:11: note: ‘rel’ was declared here
> [15:06:48.065]  1935 |  Relation rel;
> [15:06:48.065]   |   ^~~
> [15:06:48.065] ddldeparse.c:2071:5: error: ‘dpcontext’ may be used 
> uninitialized in this function [-Werror=maybe-uninitialized]
> [15:06:48.065]  2071 | deparse_ColumnDef_toJsonb(state, rel, dpcontext,
> [15:06:48.065]   | ^~~~
> [15:06:48.065]  2072 | false, (ColumnDef *) subcmd->def,
> [15:06:48.065]   | ~
> [15:06:48.065]  2073 | true, NULL);
> [15:06:48.065]   

Re: TRAP: FailedAssertion("prev_first_lsn < cur_txn->first_lsn", File: "reorderbuffer.c", Line: 927, PID: 568639)

2023-06-06 Thread Alexander Lakhin

Hello,

21.10.2022 08:49, Amit Kapila wrote:

On Fri, Oct 21, 2022 at 8:01 AM Masahiko Sawada  wrote:

Thank you for the comment! I agreed with all comments and I've updated
patches accordingly.


Pushed after removing the test case from v11-13 branches as it is not
relevant to those branches and the test-1 in
catalog_change_snapshot.spec already tests the same case for those
branches.


I've managed to get that assertion failure again (on master) while playing
with the concurrent installcheck. This can be easily reproduced with the
following script:
numclients=5
for ((c=1;c<=numclients;c++)); do
  cp -r contrib/test_decoding contrib/test_decoding_$c
  sed "s/isolation_slot/isolation_slot_$c/" -i contrib/test_decoding_$c/specs/catalog_change_snapshot.spec # Use 
independent slots
  sed "$(printf '$p; %.0s' `seq 50`)" -i contrib/test_decoding_$c/specs/catalog_change_snapshot.spec # Repeat the last 
permutation 50 times

done
for ((c=1;c<=numclients;c++)); do
  EXTRA_REGRESS_OPTS="--dbname=regress_$c" make -s installcheck-force -C contrib/test_decoding_$c USE_MODULE_DB=1 
>"installcheck-$c.log" 2>&1 &

done
wait
grep 'TRAP:' server.log

Produces for me:
TRAP: failed Assert("prev_first_lsn < cur_txn->first_lsn"), File: 
"reorderbuffer.c", Line: 942, PID: 3794105
TRAP: failed Assert("prev_first_lsn < cur_txn->first_lsn"), File: 
"reorderbuffer.c", Line: 942, PID: 3794104
TRAP: failed Assert("prev_first_lsn < cur_txn->first_lsn"), File: 
"reorderbuffer.c", Line: 942, PID: 3794099
TRAP: failed Assert("prev_first_lsn < cur_txn->first_lsn"), File: 
"reorderbuffer.c", Line: 942, PID: 3794105
TRAP: failed Assert("prev_first_lsn < cur_txn->first_lsn"), File: 
"reorderbuffer.c", Line: 942, PID: 3794104
TRAP: failed Assert("prev_first_lsn < cur_txn->first_lsn"), File: 
"reorderbuffer.c", Line: 942, PID: 3794099

With the debug logging added inside AssertTXNLsnOrder() I see:
ctx->snapshot_builder->start_decoding_at: 209807224, ctx->reader->EndRecPtr: 
210043072,
SnapBuildXactNeedsSkip(ctx->snapshot_builder, ctx->reader->EndRecPtr): 0
and inside the loop:
cur_txn->first_lsn: 209792872
cur_txn->first_lsn: 209975744
cur_txn->first_lsn: 210043008
cur_txn->first_lsn: 210043008
and it triggers the Assert.

Best regards,
Alexander




Re: Add support for AT LOCAL

2023-06-06 Thread Vik Fearing

On 6/6/23 03:56, Laurenz Albe wrote:

On Mon, 2023-06-05 at 23:13 -0400, Vik Fearing wrote:

The Standard defines time zone conversion as follows:

<datetime factor> ::=
    <datetime primary> [ <time zone> ]

<time zone> ::=
    AT <time zone specifier>

<time zone specifier> ::=
    LOCAL
  | TIME ZONE <interval primary>


While looking at something else, I noticed we do not support AT LOCAL.
The local time zone is defined as that of *the session*, not the server,
which can make this quite interesting in views where the view will
automatically adjust to the session's time zone.

Patch against 3f1180 attached.


+1 on the idea; it should be fairly trivial, if not very useful.


Thanks.


At a quick glance, it looks like you resolve "timezone" at the time
the query is parsed.  Shouldn't the resolution happen at query
execution time?


current_setting(text) is stable, and my tests show that it is calculated 
at execution time.



postgres=# prepare x as values (now() at local);
PREPARE
postgres=# set timezone to 'UTC';
SET
postgres=# execute x;
  column1

 2023-06-06 08:23:02.088634
(1 row)

postgres=# set timezone to 'Asia/Pyongyang';
SET
postgres=# execute x;
  column1

 2023-06-06 17:23:14.837219
(1 row)


Am I missing something?
--
Vik Fearing





Re: Add support for AT LOCAL

2023-06-06 Thread Laurenz Albe
On Mon, 2023-06-05 at 23:13 -0400, Vik Fearing wrote:
> The Standard defines time zone conversion as follows:
> 
> <datetime factor> ::=
>     <datetime primary> [ <time zone> ]
> 
> <time zone> ::=
>     AT <time zone specifier>
> 
> <time zone specifier> ::=
>     LOCAL
>   | TIME ZONE <interval primary>
> 
> 
> While looking at something else, I noticed we do not support AT LOCAL. 
> The local time zone is defined as that of *the session*, not the server, 
> which can make this quite interesting in views where the view will 
> automatically adjust to the session's time zone.
> 
> Patch against 3f1180 attached.

+1 on the idea; it should be fairly trivial, if not very useful.

At a quick glance, it looks like you resolve "timezone" at the time
the query is parsed.  Shouldn't the resolution happen at query
execution time?

Yours,
Laurenz Albe




Re: 回复:Fix missing initialization of delayChkptEnd

2023-06-06 Thread Kyotaro Horiguchi
Good catch!

At Tue, 06 Jun 2023 00:39:47 +0800, "蔡梦娟(玊于)"  
wrote in 
> Hi, all. I updated the patch for this bugfix, the previous one
> missed the modification of function InitAuxiliaryProcess, please
> check the new patch.

After a quick check through the 14 tree, then comparing with the
corresponding parts of 15, it hit me that ProcArrayClearTransaction()
needs an assertion on the variable.  Other than that, the patch looks
good to me.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


RE: Support logical replication of DDLs

2023-06-06 Thread Wei Wang (Fujitsu)
On Thur, June 1, 2023 at 23:42 vignesh C  wrote:
> On Wed, 31 May 2023 at 14:32, Wei Wang (Fujitsu) 
> wrote:
> > ~~~
> >
> > 2. Deparsed results of the partition table.
> > When I run the following SQLs:
> > ```
> > create table parent (a int primary key) partition by range (a);
> > create table child partition of parent default;
> > ```
> >
> > I got the following two deparsed results:
> > ```
> > CREATE  TABLE  public.parent (a pg_catalog.int4 STORAGE PLAIN  ,
> CONSTRAINT parent_pkey PRIMARY KEY (a))  PARTITION BY RANGE (a)
> > CREATE  TABLE  public.child PARTITION OF public.parent (CONSTRAINT
> child_pkey PRIMARY KEY (a)) DEFAULT
> > ```
> >
> > When I run these two deparsed results on another instance, I got the 
> > following
> error:
> > ```
> > postgres=# CREATE  TABLE  public.parent (a pg_catalog.int4 STORAGE PLAIN
> >   ,
> CONSTRAINT parent_pkey PRIMARY KEY (a))  PARTITION BY RANGE (a);
> > CREATE TABLE
> > postgres=# CREATE  TABLE  public.child PARTITION OF public.parent
> (CONSTRAINT child_pkey PRIMARY KEY (a)) DEFAULT;
> > ERROR:  multiple primary keys for table "child" are not allowed
> > ```
> >
> > I think that we could skip deparsing the primary key related constraint for
> > partition (child) table in the function obtainConstraints for this case.
> 
> Not applicable for 0008 patch

I think this issue still exists after applying the 0008 patch. Is this error the
result we expected?
If no, I think we could try to address this issue in the function
deparse_Constraints_ToJsonb in 0008 patch like the attachment. What do you
think? BTW, we also need to skip the parentheses in the above case if you think
this approach is OK.

Regards,
Wang wei


tmp_fix.patch.bak
Description: tmp_fix.patch.bak