Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
> Pushed.
>
> --
> With Regards,
> Amit Kapila.

Hi! While working on the 64-bit XIDs patch set, I stumbled into problems with the contrib/test_decoding/catalog_change_snapshot test [0]. AFAICS, the problem is not related to the 64-bit XIDs patch set; rather, it is in the InitialRunningXacts array, allocated in builder->context. Do we really need it to be allocated that way?

[0] https://www.postgresql.org/message-id/CACG%3DezZoz_KG%2BRyh9MrU_g5e0HiVoHocEvqFF%3DNRrhrwKmEQJQ%40mail.gmail.com

--
Best regards,
Maxim Orlov.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Sat, Aug 27, 2022 at 7:06 PM Masahiko Sawada wrote: > > On Sat, Aug 27, 2022 at 7:24 PM Amit Kapila wrote: > > > > On Sat, Aug 27, 2022 at 1:06 PM Masahiko Sawada > > wrote: > > > > > > > > > > > > > I think then we should change this code in the master branch patch > > > > > with an additional comment on the lines of: "Either all the xacts got > > > > > purged or none. It is only possible to partially remove the xids from > > > > > this array if one or more of the xids are still running but not all. > > > > > That can happen if we start decoding from a point (LSN where the > > > > > snapshot state became consistent) where all the xacts in this were > > > > > running and then at least one of those got committed and a few are > > > > > still running. We will never start from such a point because we won't > > > > > move the slot's restart_lsn past the point where the oldest running > > > > > transaction's restart_decoding_lsn is." > > > > > > > > > > > > > Unfortunately, this theory doesn't turn out to be true. While > > > > investigating the latest buildfarm failure [1], I see that it is > > > > possible that only part of the xacts in the restored catalog modifying > > > > xacts list needs to be purged. See the attached where I have > > > > demonstrated it via a reproducible test. It seems the point we were > > > > missing was that to start from a point where two or more catalog > > > > modifying were serialized, it requires another open transaction before > > > > both get committed, and then we need the checkpoint or other way to > > > > force running_xacts record in-between the commit of initial two > > > > catalog modifying xacts. There could possibly be other ways as well > > > > but the theory above wasn't correct. > > > > > > > > > > Thank you for the analysis and the patch. I have the same conclusion. > > > Since we took this approach only on the master the back branches are > > > not affected. 
> > > > > > The new test scenario makes sense to me and looks better than the one > > > I have. Regarding the fix, I think we should use > > > TransactionIdFollowsOrEquals() instead of > > > NormalTransactionIdPrecedes(): > > > > > > + for (off = 0; off < builder->catchange.xcnt; off++) > > > + { > > > + if (NormalTransactionIdPrecedes(builder->catchange.xip[off], > > > + builder->xmin)) > > > + break; > > > + } > > > > > > > Right, fixed. > > Thank you for updating the patch! It looks good to me. > Pushed. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Sat, Aug 27, 2022 at 7:24 PM Amit Kapila wrote: > > On Sat, Aug 27, 2022 at 1:06 PM Masahiko Sawada wrote: > > > > On Sat, Aug 27, 2022 at 3:56 PM Amit Kapila wrote: > > > > > > On Fri, Jul 29, 2022 at 12:15 PM Amit Kapila > > > wrote: > > > > > > > > > > > > > > Yeah, your description makes sense to me. I've also considered how to > > > > > hit this path but I guess it is never hit. Thinking of it in another > > > > > way, first of all, at least 2 catalog modifying transactions have to > > > > > be running while writing a xl_running_xacts. The serialized snapshot > > > > > that is written when we decode the first xl_running_xact has two > > > > > transactions. Then, one of them is committed before the second > > > > > xl_running_xacts. The second serialized snapshot has only one > > > > > transaction. Then, the transaction is also committed after that. Now, > > > > > in order to execute the path, we need to start decoding from the first > > > > > serialized snapshot. However, if we start from there, we cannot decode > > > > > the full contents of the transaction that was committed later. > > > > > > > > > > > > > I think then we should change this code in the master branch patch > > > > with an additional comment on the lines of: "Either all the xacts got > > > > purged or none. It is only possible to partially remove the xids from > > > > this array if one or more of the xids are still running but not all. > > > > That can happen if we start decoding from a point (LSN where the > > > > snapshot state became consistent) where all the xacts in this were > > > > running and then at least one of those got committed and a few are > > > > still running. We will never start from such a point because we won't > > > > move the slot's restart_lsn past the point where the oldest running > > > > transaction's restart_decoding_lsn is." > > > > > > > > > > Unfortunately, this theory doesn't turn out to be true. 
While > > > investigating the latest buildfarm failure [1], I see that it is > > > possible that only part of the xacts in the restored catalog modifying > > > xacts list needs to be purged. See the attached where I have > > > demonstrated it via a reproducible test. It seems the point we were > > > missing was that to start from a point where two or more catalog > > > modifying were serialized, it requires another open transaction before > > > both get committed, and then we need the checkpoint or other way to > > > force running_xacts record in-between the commit of initial two > > > catalog modifying xacts. There could possibly be other ways as well > > > but the theory above wasn't correct. > > > > > > > Thank you for the analysis and the patch. I have the same conclusion. > > Since we took this approach only on the master the back branches are > > not affected. > > > > The new test scenario makes sense to me and looks better than the one > > I have. Regarding the fix, I think we should use > > TransactionIdFollowsOrEquals() instead of > > NormalTransactionIdPrecedes(): > > > > + for (off = 0; off < builder->catchange.xcnt; off++) > > + { > > + if (NormalTransactionIdPrecedes(builder->catchange.xip[off], > > + builder->xmin)) > > + break; > > + } > > > > Right, fixed. Thank you for updating the patch! It looks good to me. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Sat, Aug 27, 2022 at 1:06 PM Masahiko Sawada wrote: > > On Sat, Aug 27, 2022 at 3:56 PM Amit Kapila wrote: > > > > On Fri, Jul 29, 2022 at 12:15 PM Amit Kapila > > wrote: > > > > > > > > > > > Yeah, your description makes sense to me. I've also considered how to > > > > hit this path but I guess it is never hit. Thinking of it in another > > > > way, first of all, at least 2 catalog modifying transactions have to > > > > be running while writing a xl_running_xacts. The serialized snapshot > > > > that is written when we decode the first xl_running_xact has two > > > > transactions. Then, one of them is committed before the second > > > > xl_running_xacts. The second serialized snapshot has only one > > > > transaction. Then, the transaction is also committed after that. Now, > > > > in order to execute the path, we need to start decoding from the first > > > > serialized snapshot. However, if we start from there, we cannot decode > > > > the full contents of the transaction that was committed later. > > > > > > > > > > I think then we should change this code in the master branch patch > > > with an additional comment on the lines of: "Either all the xacts got > > > purged or none. It is only possible to partially remove the xids from > > > this array if one or more of the xids are still running but not all. > > > That can happen if we start decoding from a point (LSN where the > > > snapshot state became consistent) where all the xacts in this were > > > running and then at least one of those got committed and a few are > > > still running. We will never start from such a point because we won't > > > move the slot's restart_lsn past the point where the oldest running > > > transaction's restart_decoding_lsn is." > > > > > > > Unfortunately, this theory doesn't turn out to be true. While > > investigating the latest buildfarm failure [1], I see that it is > > possible that only part of the xacts in the restored catalog modifying > > xacts list needs to be purged. 
See the attached where I have > > demonstrated it via a reproducible test. It seems the point we were > > missing was that to start from a point where two or more catalog > > modifying were serialized, it requires another open transaction before > > both get committed, and then we need the checkpoint or other way to > > force running_xacts record in-between the commit of initial two > > catalog modifying xacts. There could possibly be other ways as well > > but the theory above wasn't correct. > > > > Thank you for the analysis and the patch. I have the same conclusion. > Since we took this approach only on the master the back branches are > not affected. > > The new test scenario makes sense to me and looks better than the one > I have. Regarding the fix, I think we should use > TransactionIdFollowsOrEquals() instead of > NormalTransactionIdPrecedes(): > > + for (off = 0; off < builder->catchange.xcnt; off++) > + { > + if (NormalTransactionIdPrecedes(builder->catchange.xip[off], > + builder->xmin)) > + break; > + } > Right, fixed. -- With Regards, Amit Kapila. v2-0001-Fix-the-incorrect-assertion-introduced-in-commit-.patch Description: Binary data
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Sat, Aug 27, 2022 at 3:56 PM Amit Kapila wrote: > > On Fri, Jul 29, 2022 at 12:15 PM Amit Kapila wrote: > > > > > > > > Yeah, your description makes sense to me. I've also considered how to > > > hit this path but I guess it is never hit. Thinking of it in another > > > way, first of all, at least 2 catalog modifying transactions have to > > > be running while writing a xl_running_xacts. The serialized snapshot > > > that is written when we decode the first xl_running_xact has two > > > transactions. Then, one of them is committed before the second > > > xl_running_xacts. The second serialized snapshot has only one > > > transaction. Then, the transaction is also committed after that. Now, > > > in order to execute the path, we need to start decoding from the first > > > serialized snapshot. However, if we start from there, we cannot decode > > > the full contents of the transaction that was committed later. > > > > > > > I think then we should change this code in the master branch patch > > with an additional comment on the lines of: "Either all the xacts got > > purged or none. It is only possible to partially remove the xids from > > this array if one or more of the xids are still running but not all. > > That can happen if we start decoding from a point (LSN where the > > snapshot state became consistent) where all the xacts in this were > > running and then at least one of those got committed and a few are > > still running. We will never start from such a point because we won't > > move the slot's restart_lsn past the point where the oldest running > > transaction's restart_decoding_lsn is." > > > > Unfortunately, this theory doesn't turn out to be true. While > investigating the latest buildfarm failure [1], I see that it is > possible that only part of the xacts in the restored catalog modifying > xacts list needs to be purged. See the attached where I have > demonstrated it via a reproducible test. 
It seems the point we were > missing was that to start from a point where two or more catalog > modifying were serialized, it requires another open transaction before > both get committed, and then we need the checkpoint or other way to > force running_xacts record in-between the commit of initial two > catalog modifying xacts. There could possibly be other ways as well > but the theory above wasn't correct. > Thank you for the analysis and the patch. I have the same conclusion. Since we took this approach only on the master the back branches are not affected. The new test scenario makes sense to me and looks better than the one I have. Regarding the fix, I think we should use TransactionIdFollowsOrEquals() instead of NormalTransactionIdPrecedes(): + for (off = 0; off < builder->catchange.xcnt; off++) + { + if (NormalTransactionIdPrecedes(builder->catchange.xip[off], + builder->xmin)) + break; + } Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Fri, Jul 29, 2022 at 12:15 PM Amit Kapila wrote: > > > > > Yeah, your description makes sense to me. I've also considered how to > > hit this path but I guess it is never hit. Thinking of it in another > > way, first of all, at least 2 catalog modifying transactions have to > > be running while writing a xl_running_xacts. The serialized snapshot > > that is written when we decode the first xl_running_xact has two > > transactions. Then, one of them is committed before the second > > xl_running_xacts. The second serialized snapshot has only one > > transaction. Then, the transaction is also committed after that. Now, > > in order to execute the path, we need to start decoding from the first > > serialized snapshot. However, if we start from there, we cannot decode > > the full contents of the transaction that was committed later. > > > > I think then we should change this code in the master branch patch > with an additional comment on the lines of: "Either all the xacts got > purged or none. It is only possible to partially remove the xids from > this array if one or more of the xids are still running but not all. > That can happen if we start decoding from a point (LSN where the > snapshot state became consistent) where all the xacts in this were > running and then at least one of those got committed and a few are > still running. We will never start from such a point because we won't > move the slot's restart_lsn past the point where the oldest running > transaction's restart_decoding_lsn is." > Unfortunately, this theory doesn't turn out to be true. While investigating the latest buildfarm failure [1], I see that it is possible that only part of the xacts in the restored catalog modifying xacts list needs to be purged. See the attached where I have demonstrated it via a reproducible test. 
It seems the point we were missing was that to start from a point where two or more catalog modifying transactions were serialized, it requires another open transaction before both get committed, and then we need a checkpoint or some other way to force a running_xacts record in between the commits of the initial two catalog modifying xacts. There could possibly be other ways as well, but the theory above wasn't correct.

[1] - https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=curculio&dt=2022-08-25%2004%3A15%3A34

--
With Regards,
Amit Kapila.

diff --git a/contrib/test_decoding/expected/catalog_change_snapshot.out b/contrib/test_decoding/expected/catalog_change_snapshot.out
index dc4f9b7..d2a4bdf 100644
--- a/contrib/test_decoding/expected/catalog_change_snapshot.out
+++ b/contrib/test_decoding/expected/catalog_change_snapshot.out
@@ -1,4 +1,4 @@
-Parsed test spec with 2 sessions
+Parsed test spec with 3 sessions
 
 starting permutation: s0_init s0_begin s0_savepoint s0_truncate s1_checkpoint s1_get_changes s0_commit s0_begin s0_insert s1_checkpoint s1_get_changes s0_commit s1_get_changes
 step s0_init: SELECT 'init' FROM pg_create_logical_replication_slot('isolation_slot', 'test_decoding');
@@ -42,3 +42,57 @@ COMMIT
 stop
 (1 row)
 
+
+starting permutation: s0_init s0_begin s0_truncate s2_begin s2_truncate s1_checkpoint s1_get_changes s0_commit s0_begin s0_insert s1_checkpoint s1_get_changes s2_commit s1_checkpoint s1_get_changes s0_commit s1_get_changes
+step s0_init: SELECT 'init' FROM pg_create_logical_replication_slot('isolation_slot', 'test_decoding');
+?column?
+--------
+init
+(1 row)
+
+step s0_begin: BEGIN;
+step s0_truncate: TRUNCATE tbl1;
+step s2_begin: BEGIN;
+step s2_truncate: TRUNCATE tbl2;
+step s1_checkpoint: CHECKPOINT;
+step s1_get_changes: SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'skip-empty-xacts', '1', 'include-xids', '0');
+data
+----
+(0 rows)
+
+step s0_commit: COMMIT;
+step s0_begin: BEGIN;
+step s0_insert: INSERT INTO tbl1 VALUES (1);
+step s1_checkpoint: CHECKPOINT;
+step s1_get_changes: SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'skip-empty-xacts', '1', 'include-xids', '0');
+data
+----
+BEGIN
+table public.tbl1: TRUNCATE: (no-flags)
+COMMIT
+(3 rows)
+
+step s2_commit: COMMIT;
+step s1_checkpoint: CHECKPOINT;
+step s1_get_changes: SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'skip-empty-xacts', '1', 'include-xids', '0');
+data
+----
+BEGIN
+table public.tbl2: TRUNCATE: (no-flags)
+COMMIT
+(3 rows)
+
+step s0_commit: COMMIT;
+step s1_get_changes: SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'skip-empty-xacts', '1', 'include-xids', '0');
+data
+----
+BEGIN
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Thu, Aug 11, 2022 at 3:10 PM Amit Kapila wrote: > > On Mon, Aug 8, 2022 at 9:34 AM Amit Kapila wrote: > > > > On Wed, Aug 3, 2022 at 1:20 PM Masahiko Sawada > > wrote: > > > > > > > > > Oops, thanks for pointing it out. I've fixed it and attached updated > > > patches for all branches so as not to confuse the patch version. There > > > is no update from v12 patch on REL12 - master patches. > > > > > > > Thanks for the updated patches, the changes look good to me. > > Horiguchi-San, and others, do you have any further comments on this or > > do you want to spend time in review of it? If not, I would like to > > push this after the current minor version release. > > > > Pushed. Thank you! Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Mon, Aug 8, 2022 at 9:34 AM Amit Kapila wrote: > > On Wed, Aug 3, 2022 at 1:20 PM Masahiko Sawada wrote: > > > > > > Oops, thanks for pointing it out. I've fixed it and attached updated > > patches for all branches so as not to confuse the patch version. There > > is no update from v12 patch on REL12 - master patches. > > > > Thanks for the updated patches, the changes look good to me. > Horiguchi-San, and others, do you have any further comments on this or > do you want to spend time in review of it? If not, I would like to > push this after the current minor version release. > Pushed. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Aug 3, 2022 at 1:20 PM Masahiko Sawada wrote: > > On Wed, Aug 3, 2022 at 3:52 PM shiy.f...@fujitsu.com > wrote: > > > > On Wed, Aug 3, 2022 12:06 PM Masahiko Sawada wrote: > > > > > > I've attached updated patches that incorporated the above comments as > > > well as the comments from Shi yu. Please review them. > > > > > > > Thanks for updating the patch. > > > > I noticed that in SnapBuildXidSetCatalogChanges(), "i" is initialized in > > the if > > branch in REL10 patch, which is different from REL11 patch. Maybe we can > > modify > > REL11 patch to be consistent with REL10 patch. > > > > The rest of the patch looks good to me. > > Oops, thanks for pointing it out. I've fixed it and attached updated > patches for all branches so as not to confuse the patch version. There > is no update from v12 patch on REL12 - master patches. > Thanks for the updated patches, the changes look good to me. Horiguchi-San, and others, do you have any further comments on this or do you want to spend time in review of it? If not, I would like to push this after the current minor version release. -- With Regards, Amit Kapila.
RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Aug 3, 2022 12:06 PM Masahiko Sawada wrote: > > I've attached updated patches that incorporated the above comments as > well as the comments from Shi yu. Please review them. > Thanks for updating the patch. I noticed that in SnapBuildXidSetCatalogChanges(), "i" is initialized in the if branch in REL10 patch, which is different from REL11 patch. Maybe we can modify REL11 patch to be consistent with REL10 patch. The rest of the patch looks good to me. Regards, Shi yu
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
At Wed, 3 Aug 2022 08:51:40 +0530, Amit Kapila wrote in > On Wed, Aug 3, 2022 at 7:05 AM Masahiko Sawada wrote: > > Looking at other codes in snapbuild.c, we call CloseTransientFile() > > before erroring out in SnapBuildSerialize(). I think it's better to > > keep it consistent with nearby codes in this patch. I think if we > > prefer the style of closing the file by ereport(ERROR), it should be > > done for all of them in a separate patch. > > > > +1. I also feel it is better to change it in a separate patch as this > is not a pattern introduced by this patch. Agreed. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Aug 2, 2022 at 5:31 PM shiy.f...@fujitsu.com wrote: > > On Mon, Aug 1, 2022 10:31 PM Amit Kapila wrote: > > > > On Mon, Aug 1, 2022 at 7:46 AM Masahiko Sawada > > wrote: > > > > > > On Fri, Jul 29, 2022 at 3:45 PM Amit Kapila > > wrote: > > > > > > > > > > I've attached updated patches for all branches. Please review them. > > > > > > > Thanks, the patches look mostly good to me. I have made minor edits by > > removing 'likely' from a few places as those don't seem to be adding > > much value, changed comments at a few places, and was getting > > compilation in error in v11/10 (snapbuild.c:2111:3: error: ‘for’ loop > > initial declarations are only allowed in C99 mode) which I have fixed. > > See attached, unless there are major comments/suggestions, I am > > planning to push this day after tomorrow (by Wednesday) after another > > pass. > > > > Thanks for updating the patch. > > Here are some minor comments: > > 1. > patches for REL10 ~ REL13: > + * Mark the transaction as containing catalog changes. In addition, if the > + * given xid is in the list of the initial running xacts, we mark the > + * its subtransactions as well. See comments for NInitialRunningXacts and > + * InitialRunningXacts for additional info. > > "mark the its subtransactions" > -> > "mark its subtransactions" > > 2. > patches for REL10 ~ REL15: > In the comment in catalog_change_snapshot.spec, maybe we can use > "RUNNING_XACTS" > instead of "RUNNING_XACT" "XACT_RUNNING", same as the patch for master branch. > Thank you for the comments! These have been incorporated in the latest version v12 patch I just submitted. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Aug 3, 2022 at 7:05 AM Masahiko Sawada wrote: > > On Wed, Aug 3, 2022 at 10:20 AM Kyotaro Horiguchi > wrote: > > > > At Tue, 2 Aug 2022 13:54:43 +0530, Amit Kapila > > wrote in > > > On Tue, Aug 2, 2022 at 12:00 PM Kyotaro Horiguchi > > > wrote: > > > > > > > > + { > > > > + int save_errno = errno; > > > > + > > > > + CloseTransientFile(fd); > > > > + > > > > + if (readBytes < 0) > > > > + { > > > > + errno = save_errno; > > > > + ereport(ERROR, > > > > > > > > Do we need the CloseTransientFile(fd) there? This call requires errno > > > > to be remembered but anyway OpenTransientFile'd files are to be close > > > > at transaction end. Actually CloseTransientFile() is not called > > > > before error'ing-out at error in other places. > > .. > > > We just moved it to a separate function as the same code is being > > > duplicated to multiple places. > > > > There are code paths that doesn't CloseTransientFile() explicitly, > > too. If there were no need of save_errno there, that'd be fine. But > > otherwise I guess we prefer to let the orphan fds closed by ERROR and > > I don't think we need to preserve the less-preferred code pattern (if > > we actually prefer not to have the explicit call). > > Looking at other codes in snapbuild.c, we call CloseTransientFile() > before erroring out in SnapBuildSerialize(). I think it's better to > keep it consistent with nearby codes in this patch. I think if we > prefer the style of closing the file by ereport(ERROR), it should be > done for all of them in a separate patch. > +1. I also feel it is better to change it in a separate patch as this is not a pattern introduced by this patch. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Aug 3, 2022 at 10:20 AM Kyotaro Horiguchi wrote: > > At Tue, 2 Aug 2022 13:54:43 +0530, Amit Kapila > wrote in > > On Tue, Aug 2, 2022 at 12:00 PM Kyotaro Horiguchi > > wrote: > > > > > > + { > > > + int save_errno = errno; > > > + > > > + CloseTransientFile(fd); > > > + > > > + if (readBytes < 0) > > > + { > > > + errno = save_errno; > > > + ereport(ERROR, > > > > > > Do we need the CloseTransientFile(fd) there? This call requires errno > > > to be remembered but anyway OpenTransientFile'd files are to be close > > > at transaction end. Actually CloseTransientFile() is not called > > > before error'ing-out at error in other places. > .. > > We just moved it to a separate function as the same code is being > > duplicated to multiple places. > > There are code paths that doesn't CloseTransientFile() explicitly, > too. If there were no need of save_errno there, that'd be fine. But > otherwise I guess we prefer to let the orphan fds closed by ERROR and > I don't think we need to preserve the less-preferred code pattern (if > we actually prefer not to have the explicit call). Looking at other codes in snapbuild.c, we call CloseTransientFile() before erroring out in SnapBuildSerialize(). I think it's better to keep it consistent with nearby codes in this patch. I think if we prefer the style of closing the file by ereport(ERROR), it should be done for all of them in a separate patch. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
At Tue, 2 Aug 2022 13:54:43 +0530, Amit Kapila wrote in > On Tue, Aug 2, 2022 at 12:00 PM Kyotaro Horiguchi > wrote: > > > > + { > > + int save_errno = errno; > > + > > + CloseTransientFile(fd); > > + > > + if (readBytes < 0) > > + { > > + errno = save_errno; > > + ereport(ERROR, > > > > Do we need the CloseTransientFile(fd) there? This call requires errno > > to be remembered but anyway OpenTransientFile'd files are to be close > > at transaction end. Actually CloseTransientFile() is not called > > before error'ing-out at error in other places. .. > We just moved it to a separate function as the same code is being > duplicated to multiple places. There are code paths that doesn't CloseTransientFile() explicitly, too. If there were no need of save_errno there, that'd be fine. But otherwise I guess we prefer to let the orphan fds closed by ERROR and I don't think we need to preserve the less-preferred code pattern (if we actually prefer not to have the explicit call). regards. -- Kyotaro Horiguchi NTT Open Source Software Center
RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Mon, Aug 1, 2022 10:31 PM Amit Kapila wrote: > > On Mon, Aug 1, 2022 at 7:46 AM Masahiko Sawada > wrote: > > > > On Fri, Jul 29, 2022 at 3:45 PM Amit Kapila > wrote: > > > > > > > I've attached updated patches for all branches. Please review them. > > > > Thanks, the patches look mostly good to me. I have made minor edits by > removing 'likely' from a few places as those don't seem to be adding > much value, changed comments at a few places, and was getting > compilation in error in v11/10 (snapbuild.c:2111:3: error: ‘for’ loop > initial declarations are only allowed in C99 mode) which I have fixed. > See attached, unless there are major comments/suggestions, I am > planning to push this day after tomorrow (by Wednesday) after another > pass. > Thanks for updating the patch. Here are some minor comments: 1. patches for REL10 ~ REL13: + * Mark the transaction as containing catalog changes. In addition, if the + * given xid is in the list of the initial running xacts, we mark the + * its subtransactions as well. See comments for NInitialRunningXacts and + * InitialRunningXacts for additional info. "mark the its subtransactions" -> "mark its subtransactions" 2. patches for REL10 ~ REL15: In the comment in catalog_change_snapshot.spec, maybe we can use "RUNNING_XACTS" instead of "RUNNING_XACT" "XACT_RUNNING", same as the patch for master branch. Regards, Shi yu
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Aug 2, 2022 at 12:00 PM Kyotaro Horiguchi wrote:
>
> At Mon, 1 Aug 2022 20:01:00 +0530, Amit Kapila wrote in
> > On Mon, Aug 1, 2022 at 7:46 AM Masahiko Sawada wrote:
> > >
> > > On Fri, Jul 29, 2022 at 3:45 PM Amit Kapila wrote:
> > > >
> > > > I've attached updated patches for all branches. Please review them.
> > >
> > Thanks, the patches look mostly good to me. I have made minor edits by
> > removing 'likely' from a few places as those don't seem to be adding
> > much value, changed comments at a few places, and was getting
> > compilation in error in v11/10 (snapbuild.c:2111:3: error: ‘for’ loop
> > initial declarations are only allowed in C99 mode) which I have fixed.
> > See attached, unless there are major comments/suggestions, I am
> > planning to push this day after tomorrow (by Wednesday) after another
> > pass.
>
> + {
> + int save_errno = errno;
> +
> + CloseTransientFile(fd);
> +
> + if (readBytes < 0)
> + {
> + errno = save_errno;
> + ereport(ERROR,
>
> Do we need the CloseTransientFile(fd) there? This call requires errno
> to be remembered but anyway OpenTransientFile'd files are to be close
> at transaction end. Actually CloseTransientFile() is not called
> before error'ing-out at error in other places.
>

But this part of the code is just a copy of the existing code. See:

- if (readBytes != sizeof(SnapBuild))
- {
- int save_errno = errno;
-
- CloseTransientFile(fd);
-
- if (readBytes < 0)
- {
- errno = save_errno;
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not read file \"%s\": %m", path)));
- }
- else
- ereport(ERROR,
- (errcode(ERRCODE_DATA_CORRUPTED),
- errmsg("could not read file \"%s\": read %d of %zu",
- path, readBytes, sizeof(SnapBuild))));
- }

We just moved it to a separate function as the same code is being duplicated to multiple places.

--
With Regards,
Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
At Mon, 1 Aug 2022 20:01:00 +0530, Amit Kapila wrote in > On Mon, Aug 1, 2022 at 7:46 AM Masahiko Sawada wrote: > > > > On Fri, Jul 29, 2022 at 3:45 PM Amit Kapila wrote: > > > > > > > I've attached updated patches for all branches. Please review them. > > > > Thanks, the patches look mostly good to me. I have made minor edits by > removing 'likely' from a few places as those don't seem to be adding > much value, changed comments at a few places, and was getting > a compilation error in v11/10 (snapbuild.c:2111:3: error: ‘for’ loop > initial declarations are only allowed in C99 mode) which I have fixed. > See attached, unless there are major comments/suggestions, I am > planning to push this day after tomorrow (by Wednesday) after another > pass. master: + * Read the contents of the serialized snapshot to the dest. Do we need the "the" before the "dest"? + { + int save_errno = errno; + + CloseTransientFile(fd); + + if (readBytes < 0) + { + errno = save_errno; + ereport(ERROR, Do we need the CloseTransientFile(fd) there? This call requires errno to be remembered but anyway OpenTransientFile'd files are to be closed at transaction end. Actually CloseTransientFile() is not called before erroring out in other places. +* from the LSN-ordered list of toplevel TXNs. We remove TXN from the list We remove "the" TXN? + if (dlist_is_empty(&rb->catchange_txns)) + { + Assert(rb->catchange_ntxns == 0); + return NULL; + } It seems that the assert is far simpler than dlist_is_empty(). Why don't we swap the conditions for if() and Assert() in the above? +* the oldest running transaction窶冱 restart_decoding_lsn is. The line contains broken characters. +* Either all the xacts got purged or none. It is only possible to +* partially remove the xids from this array if one or more of the xids +* are still running but not all. That can happen if we start decoding Assuming this, the comment below seems to be getting stale. +* catalog. 
We remove xids from this array when they become old enough to +* matter, and then it eventually becomes empty. "We discard this array when all the contained xids are gone. See SnapBuildPurgeOlderTxn for details." or something like that? regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Fri, Jul 29, 2022 at 5:36 AM Masahiko Sawada wrote: > > On Thu, Jul 28, 2022 at 8:57 PM Amit Kapila wrote: > > > > On Thu, Jul 28, 2022 at 3:23 PM Amit Kapila wrote: > > > > I have another comment on this patch: > > SnapBuildPurgeOlderTxn() > > { > > ... > > + if (surviving_xids > 0) > > + memmove(builder->catchange.xip, &(builder->catchange.xip[off]), > > + surviving_xids * sizeof(TransactionId)) > > ... > > > > For this code to hit, we must have a situation where one or more of > > the xacts in this array must be still running. And, if that is true, > > we would not have started from the restart point where the > > corresponding snapshot (that contains the still running xacts) has > > been serialized because we advance the restart point to not before the > > oldest running xacts restart_decoding_lsn. This may not be easy to > > understand so let me take an example to explain. Say we have two > > transactions t1 and t2, and both have made catalog changes. We want a > > situation where one of those gets purged and the other remains in > > builder->catchange.xip array. I have tried variants of the below > > sequence to see if I can get into the required situation but am not > > able to make it. > > > > Session-1 > > Checkpoint -1; > > T1 > > DDL > > > > Session-2 > > T2 > > DDL > > > > Session-3 > > Checkpoint-2; > > pg_logical_slot_get_changes() > > -- Here when we serialize the snapshot corresponding to > > CHECKPOINT-2's running_xact record, we will serialize both t1 and t2 > > as catalog-changing xacts. > > > > Session-1 > > T1 > > Commit; > > > > Checkpoint; > > pg_logical_slot_get_changes() > > -- Here we will restore from Checkpoint-1's serialized snapshot and > > won't be able to move restart_point to Checkpoint-2 because T2 is > > still open. 
> > > > Now, as per my understanding, it is only possible to move > > restart_point to Checkpoint-2 if T2 gets committed/rolled-back in > > which case we will never have that in surviving_xids array after the > > purge. > > > > It is possible I am missing something here. Do let me know your thoughts. > > Yeah, your description makes sense to me. I've also considered how to > hit this path but I guess it is never hit. Thinking of it in another > way, first of all, at least 2 catalog modifying transactions have to > be running while writing a xl_running_xacts. The serialized snapshot > that is written when we decode the first xl_running_xact has two > transactions. Then, one of them is committed before the second > xl_running_xacts. The second serialized snapshot has only one > transaction. Then, the transaction is also committed after that. Now, > in order to execute the path, we need to start decoding from the first > serialized snapshot. However, if we start from there, we cannot decode > the full contents of the transaction that was committed later. > I think then we should change this code in the master branch patch with an additional comment on the lines of: "Either all the xacts got purged or none. It is only possible to partially remove the xids from this array if one or more of the xids are still running but not all. That can happen if we start decoding from a point (LSN where the snapshot state became consistent) where all the xacts in this were running and then at least one of those got committed and a few are still running. We will never start from such a point because we won't move the slot's restart_lsn past the point where the oldest running transaction's restart_decoding_lsn is." I suggest keeping the back branch as it is w.r.t this change as if this logic proves to be faulty it won't affect the stable branches. We can always back-patch this small change if required. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Thu, Jul 28, 2022 at 8:57 PM Amit Kapila wrote: > > On Thu, Jul 28, 2022 at 3:23 PM Amit Kapila wrote: > > > > On Tue, Jul 26, 2022 at 1:22 PM Masahiko Sawada > > wrote: > > > > > > Okay, I've attached an updated patch that does the above idea. Could > > > you please do the performance tests again to see if the idea can help > > > reduce the overhead, Shi yu? > > > > > > > While reviewing the patch for HEAD, I have changed a few comments. See > > attached, if you agree with these changes then include them in the > > next version. > > > > I have another comment on this patch: > SnapBuildPurgeOlderTxn() > { > ... > + if (surviving_xids > 0) > + memmove(builder->catchange.xip, &(builder->catchange.xip[off]), > + surviving_xids * sizeof(TransactionId)) > ... > > For this code to hit, we must have a situation where one or more of > the xacts in this array must be still running. And, if that is true, > we would not have started from the restart point where the > corresponding snapshot (that contains the still running xacts) has > been serialized because we advance the restart point to not before the > oldest running xacts restart_decoding_lsn. This may not be easy to > understand so let me take an example to explain. Say we have two > transactions t1 and t2, and both have made catalog changes. We want a > situation where one of those gets purged and the other remains in > builder->catchange.xip array. I have tried variants of the below > sequence to see if I can get into the required situation but am not > able to make it. > > Session-1 > Checkpoint -1; > T1 > DDL > > Session-2 > T2 > DDL > > Session-3 > Checkpoint-2; > pg_logical_slot_get_changes() > -- Here when we serialize the snapshot corresponding to > CHECKPOINT-2's running_xact record, we will serialize both t1 and t2 > as catalog-changing xacts. 
> > Session-1 > T1 > Commit; > > Checkpoint; > pg_logical_slot_get_changes() > -- Here we will restore from Checkpoint-1's serialized snapshot and > won't be able to move restart_point to Checkpoint-2 because T2 is > still open. > > Now, as per my understanding, it is only possible to move > restart_point to Checkpoint-2 if T2 gets committed/rolled-back in > which case we will never have that in surviving_xids array after the > purge. > > It is possible I am missing something here. Do let me know your thoughts. Yeah, your description makes sense to me. I've also considered how to hit this path but I guess it is never hit. Thinking of it in another way, first of all, at least 2 catalog modifying transactions have to be running while writing a xl_running_xacts. The serialized snapshot that is written when we decode the first xl_running_xact has two transactions. Then, one of them is committed before the second xl_running_xacts. The second serialized snapshot has only one transaction. Then, the transaction is also committed after that. Now, in order to execute the path, we need to start decoding from the first serialized snapshot. However, if we start from there, we cannot decode the full contents of the transaction that was committed later. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Thu, Jul 28, 2022 at 3:23 PM Amit Kapila wrote: > > On Tue, Jul 26, 2022 at 1:22 PM Masahiko Sawada wrote: > > > > Okay, I've attached an updated patch that does the above idea. Could > > you please do the performance tests again to see if the idea can help > > reduce the overhead, Shi yu? > > > > While reviewing the patch for HEAD, I have changed a few comments. See > attached, if you agree with these changes then include them in the > next version. > I have another comment on this patch: SnapBuildPurgeOlderTxn() { ... + if (surviving_xids > 0) + memmove(builder->catchange.xip, &(builder->catchange.xip[off]), + surviving_xids * sizeof(TransactionId)) ... For this code to hit, we must have a situation where one or more of the xacts in this array must be still running. And, if that is true, we would not have started from the restart point where the corresponding snapshot (that contains the still running xacts) has been serialized because we advance the restart point to not before the oldest running xacts restart_decoding_lsn. This may not be easy to understand so let me take an example to explain. Say we have two transactions t1 and t2, and both have made catalog changes. We want a situation where one of those gets purged and the other remains in builder->catchange.xip array. I have tried variants of the below sequence to see if I can get into the required situation but am not able to make it. Session-1 Checkpoint -1; T1 DDL Session-2 T2 DDL Session-3 Checkpoint-2; pg_logical_slot_get_changes() -- Here when we serialize the snapshot corresponding to CHECKPOINT-2's running_xact record, we will serialize both t1 and t2 as catalog-changing xacts. Session-1 T1 Commit; Checkpoint; pg_logical_slot_get_changes() -- Here we will restore from Checkpoint-1's serialized snapshot and won't be able to move restart_point to Checkpoint-2 because T2 is still open. 
Now, as per my understanding, it is only possible to move restart_point to Checkpoint-2 if T2 gets committed/rolled-back in which case we will never have that in surviving_xids array after the purge. It is possible I am missing something here. Do let me know your thoughts. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 26, 2022 at 1:22 PM Masahiko Sawada wrote: > > Okay, I've attached an updated patch that does the above idea. Could > you please do the performance tests again to see if the idea can help > reduce the overhead, Shi yu? > While reviewing the patch for HEAD, I have changed a few comments. See attached, if you agree with these changes then include them in the next version. -- With Regards, Amit Kapila. master_v9_amit.diff.patch Description: Binary data
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Thu, Jul 28, 2022 at 12:56 PM Masahiko Sawada wrote: > > On Thu, Jul 28, 2022 at 4:13 PM Amit Kapila wrote: > > > > > > > > While editing back branch patches, I realized that the following > > > (parsed->xinfo & XACT_XINFO_HAS_INVALS) and (parsed->nmsgs > 0) are > > > equivalent: > > > > > > + /* > > > +* If the COMMIT record has invalidation messages, it could have > > > catalog > > > +* changes. It is possible that we didn't mark this transaction as > > > +* containing catalog changes when the decoding starts from a commit > > > +* record without decoding the transaction's other changes. So, we > > > ensure > > > +* to mark such transactions as containing catalog change. > > > +* > > > +* This must be done before SnapBuildCommitTxn() so that we can > > > include > > > +* these transactions in the historic snapshot. > > > +*/ > > > + if (parsed->xinfo & XACT_XINFO_HAS_INVALS) > > > + SnapBuildXidSetCatalogChanges(ctx->snapshot_builder, xid, > > > + parsed->nsubxacts, parsed->subxacts, > > > + buf->origptr); > > > + > > > /* > > > * Process invalidation messages, even if we're not interested in the > > > * transaction's contents, since the various caches need to always be > > > * consistent. > > > */ > > > if (parsed->nmsgs > 0) > > > { > > > if (!ctx->fast_forward) > > > ReorderBufferAddInvalidations(ctx->reorder, xid, buf->origptr, > > > parsed->nmsgs, parsed->msgs); > > > ReorderBufferXidSetCatalogChanges(ctx->reorder, xid, > > > buf->origptr); > > > } > > > > > > If that's right, I think we can merge these if branches. We can call > > > ReorderBufferXidSetCatalogChanges() for top-txn and in > > > SnapBuildXidSetCatalogChanges() we mark its subtransactions if top-txn > > > is in the list. What do you think? > > > > > > > Note that this code doesn't exist in 14 and 15, so we need to create > > different patches for those. > > Right. > Okay, then this sounds reasonable to me. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Thu, Jul 28, 2022 at 4:13 PM Amit Kapila wrote: > > On Thu, Jul 28, 2022 at 11:56 AM Masahiko Sawada > wrote: > > > > On Thu, Jul 28, 2022 at 12:21 PM Amit Kapila > > wrote: > > > > > > On Thu, Jul 28, 2022 at 7:18 AM Masahiko Sawada > > > wrote: > > > > > > > > While editing back branch patches, I realized that the following > > (parsed->xinfo & XACT_XINFO_HAS_INVALS) and (parsed->nmsgs > 0) are > > equivalent: > > > > + /* > > +* If the COMMIT record has invalidation messages, it could have catalog > > +* changes. It is possible that we didn't mark this transaction as > > +* containing catalog changes when the decoding starts from a commit > > +* record without decoding the transaction's other changes. So, we > > ensure > > +* to mark such transactions as containing catalog change. > > +* > > +* This must be done before SnapBuildCommitTxn() so that we can include > > +* these transactions in the historic snapshot. > > +*/ > > + if (parsed->xinfo & XACT_XINFO_HAS_INVALS) > > + SnapBuildXidSetCatalogChanges(ctx->snapshot_builder, xid, > > + parsed->nsubxacts, parsed->subxacts, > > + buf->origptr); > > + > > /* > > * Process invalidation messages, even if we're not interested in the > > * transaction's contents, since the various caches need to always be > > * consistent. > > */ > > if (parsed->nmsgs > 0) > > { > > if (!ctx->fast_forward) > > ReorderBufferAddInvalidations(ctx->reorder, xid, buf->origptr, > > parsed->nmsgs, parsed->msgs); > > ReorderBufferXidSetCatalogChanges(ctx->reorder, xid, buf->origptr); > > } > > > > If that's right, I think we can merge these if branches. We can call > > ReorderBufferXidSetCatalogChanges() for top-txn and in > > SnapBuildXidSetCatalogChanges() we mark its subtransactions if top-txn > > is in the list. What do you think? > > > > Note that this code doesn't exist in 14 and 15, so we need to create > different patches for those. Right. 
> BTW, how in 13 and lower versions did we > identify and mark subxacts as having catalog changes without our > patch? I think we use HEAP_INPLACE and XLOG_HEAP2_NEW_CID to mark subxacts as well. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Thu, Jul 28, 2022 at 11:56 AM Masahiko Sawada wrote: > > On Thu, Jul 28, 2022 at 12:21 PM Amit Kapila wrote: > > > > On Thu, Jul 28, 2022 at 7:18 AM Masahiko Sawada > > wrote: > > > > > While editing back branch patches, I realized that the following > (parsed->xinfo & XACT_XINFO_HAS_INVALS) and (parsed->nmsgs > 0) are > equivalent: > > + /* > +* If the COMMIT record has invalidation messages, it could have catalog > +* changes. It is possible that we didn't mark this transaction as > +* containing catalog changes when the decoding starts from a commit > +* record without decoding the transaction's other changes. So, we ensure > +* to mark such transactions as containing catalog change. > +* > +* This must be done before SnapBuildCommitTxn() so that we can include > +* these transactions in the historic snapshot. > +*/ > + if (parsed->xinfo & XACT_XINFO_HAS_INVALS) > + SnapBuildXidSetCatalogChanges(ctx->snapshot_builder, xid, > + parsed->nsubxacts, parsed->subxacts, > + buf->origptr); > + > /* > * Process invalidation messages, even if we're not interested in the > * transaction's contents, since the various caches need to always be > * consistent. > */ > if (parsed->nmsgs > 0) > { > if (!ctx->fast_forward) > ReorderBufferAddInvalidations(ctx->reorder, xid, buf->origptr, > parsed->nmsgs, parsed->msgs); > ReorderBufferXidSetCatalogChanges(ctx->reorder, xid, buf->origptr); > } > > If that's right, I think we can merge these if branches. We can call > ReorderBufferXidSetCatalogChanges() for top-txn and in > SnapBuildXidSetCatalogChanges() we mark its subtransactions if top-txn > is in the list. What do you think? > Note that this code doesn't exist in 14 and 15, so we need to create different patches for those. BTW, how in 13 and lower versions did we identify and mark subxacts as having catalog changes without our patch? -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Thu, Jul 28, 2022 at 12:21 PM Amit Kapila wrote: > > On Thu, Jul 28, 2022 at 7:18 AM Masahiko Sawada wrote: > > > > On Wed, Jul 27, 2022 at 8:33 PM Amit Kapila wrote: > > > > > > > > I have changed accordingly in the attached > > > and apart from that slightly modified the comments and commit message. > > > Do let me know what you think of the attached? > > > > It would be better to remember the initial running xacts after > > SnapBuildRestore() returns true? Because otherwise, we could end up > > allocating InitialRunningXacts multiple times while leaking the old > > ones if there are no serialized snapshots that we are interested in. > > > > Right, this makes sense. But note that you can no longer have a check > (builder->state == SNAPBUILD_START) which I believe is not required. > We need to do this after restore, in whichever state snapshot was as > any state other than SNAPBUILD_CONSISTENT can have commits without all > their changes. Right. > > Accordingly, I think the comment: "Remember the transactions and > subtransactions that were running when xl_running_xacts record that we > decoded first was written." needs to be slightly modified to something > like: "Remember the transactions and > subtransactions that were running > when xl_running_xacts record that we decoded was written.". Change > this if it is used at any other place in the patch. Agreed. > > > --- > > + if (builder->state == SNAPBUILD_START) > > + { > > + int nxacts = > > running->subxcnt + running->xcnt; > > + Sizesz = sizeof(TransactionId) * nxacts; > > + > > + NInitialRunningXacts = nxacts; > > + InitialRunningXacts = > > MemoryContextAlloc(builder->context, sz); > > + memcpy(InitialRunningXacts, running->xids, sz); > > + qsort(InitialRunningXacts, nxacts, > > sizeof(TransactionId), xidComparator); > > + } > > > > We should allocate the memory for InitialRunningXacts only when > > (running->subxcnt + running->xcnt) > 0. 
> > > There is no harm in doing that but ideally, that case would have been > covered by an earlier check "if (running->oldestRunningXid == > running->nextXid)" which suggests "No transactions were running, so we > can jump to consistent." You're right. While editing back branch patches, I realized that the following (parsed->xinfo & XACT_XINFO_HAS_INVALS) and (parsed->nmsgs > 0) are equivalent: + /* +* If the COMMIT record has invalidation messages, it could have catalog +* changes. It is possible that we didn't mark this transaction as +* containing catalog changes when the decoding starts from a commit +* record without decoding the transaction's other changes. So, we ensure +* to mark such transactions as containing catalog change. +* +* This must be done before SnapBuildCommitTxn() so that we can include +* these transactions in the historic snapshot. +*/ + if (parsed->xinfo & XACT_XINFO_HAS_INVALS) + SnapBuildXidSetCatalogChanges(ctx->snapshot_builder, xid, + parsed->nsubxacts, parsed->subxacts, + buf->origptr); + /* * Process invalidation messages, even if we're not interested in the * transaction's contents, since the various caches need to always be * consistent. */ if (parsed->nmsgs > 0) { if (!ctx->fast_forward) ReorderBufferAddInvalidations(ctx->reorder, xid, buf->origptr, parsed->nmsgs, parsed->msgs); ReorderBufferXidSetCatalogChanges(ctx->reorder, xid, buf->origptr); } If that's right, I think we can merge these if branches. We can call ReorderBufferXidSetCatalogChanges() for top-txn and in SnapBuildXidSetCatalogChanges() we mark its subtransactions if top-txn is in the list. What do you think? Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Thu, Jul 28, 2022 at 7:18 AM Masahiko Sawada wrote: > > On Wed, Jul 27, 2022 at 8:33 PM Amit Kapila wrote: > > > > > I have changed accordingly in the attached > > and apart from that slightly modified the comments and commit message. > > Do let me know what you think of the attached? > > It would be better to remember the initial running xacts after > SnapBuildRestore() returns true? Because otherwise, we could end up > allocating InitialRunningXacts multiple times while leaking the old > ones if there are no serialized snapshots that we are interested in. > Right, this makes sense. But note that you can no longer have a check (builder->state == SNAPBUILD_START) which I believe is not required. We need to do this after restore, in whichever state snapshot was as any state other than SNAPBUILD_CONSISTENT can have commits without all their changes. Accordingly, I think the comment: "Remember the transactions and subtransactions that were running when xl_running_xacts record that we decoded first was written." needs to be slightly modified to something like: "Remember the transactions and subtransactions that were running when xl_running_xacts record that we decoded was written.". Change this if it is used at any other place in the patch. > --- > + if (builder->state == SNAPBUILD_START) > + { > + int nxacts = > running->subxcnt + running->xcnt; > + Sizesz = sizeof(TransactionId) * nxacts; > + > + NInitialRunningXacts = nxacts; > + InitialRunningXacts = > MemoryContextAlloc(builder->context, sz); > + memcpy(InitialRunningXacts, running->xids, sz); > + qsort(InitialRunningXacts, nxacts, > sizeof(TransactionId), xidComparator); > + } > > We should allocate the memory for InitialRunningXacts only when > (running->subxcnt + running->xcnt) > 0. > There is no harm in doing that but ideally, that case would have been covered by an earlier check "if (running->oldestRunningXid == running->nextXid)" which suggests "No transactions were running, so we can jump to consistent." 
Kindly make the required changes and submit the back branch patches again. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Jul 27, 2022 at 8:33 PM Amit Kapila wrote: > > On Mon, Jul 25, 2022 at 11:26 AM Masahiko Sawada > wrote: > > > > On Mon, Jul 25, 2022 at 10:45 AM Masahiko Sawada > > wrote: > > > > I've attached the patch for REl15 that I forgot. > > > > I feel the place to remember running xacts information in > SnapBuildProcessRunningXacts is not appropriate. Because in cases > where there are no running xacts or when xl_running_xact is old enough > that we can't use it, we don't need that information. I feel we need > it only when we have to reuse the already serialized snapshot, so, > won't it be better to initialize at that place in > SnapBuildFindSnapshot()? Good point, agreed. > I have changed accordingly in the attached > and apart from that slightly modified the comments and commit message. > Do let me know what you think of the attached? It would be better to remember the initial running xacts after SnapBuildRestore() returns true? Because otherwise, we could end up allocating InitialRunningXacts multiple times while leaking the old ones if there are no serialized snapshots that we are interested in. --- + if (builder->state == SNAPBUILD_START) + { + int nxacts = running->subxcnt + running->xcnt; + Sizesz = sizeof(TransactionId) * nxacts; + + NInitialRunningXacts = nxacts; + InitialRunningXacts = MemoryContextAlloc(builder->context, sz); + memcpy(InitialRunningXacts, running->xids, sz); + qsort(InitialRunningXacts, nxacts, sizeof(TransactionId), xidComparator); + } We should allocate the memory for InitialRunningXacts only when (running->subxcnt + running->xcnt) > 0. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Mon, Jul 25, 2022 at 11:26 AM Masahiko Sawada wrote: > > On Mon, Jul 25, 2022 at 10:45 AM Masahiko Sawada > wrote: > > I've attached the patch for REl15 that I forgot. > I feel the place to remember running xacts information in SnapBuildProcessRunningXacts is not appropriate. Because in cases where there are no running xacts or when xl_running_xact is old enough that we can't use it, we don't need that information. I feel we need it only when we have to reuse the already serialized snapshot, so, won't it be better to initialize at that place in SnapBuildFindSnapshot()? I have changed accordingly in the attached and apart from that slightly modified the comments and commit message. Do let me know what you think of the attached? -- With Regards, Amit Kapila. REL15_v9-0001-Fix-catalog-lookup-with-the-wrong-snapshot-during.patch Description: Binary data
RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 26, 2022 3:52 PM Masahiko Sawada wrote: > > On Tue, Jul 26, 2022 at 2:18 PM Amit Kapila > wrote: > > > > On Tue, Jul 26, 2022 at 7:00 AM Masahiko Sawada > wrote: > > > > > > On Mon, Jul 25, 2022 at 7:57 PM shiy.f...@fujitsu.com > > > wrote: > > > > > > > > > > > > case 3 > > > > - > > > > There are 64 catalog modifying transactions. > > > > Decode 100k/500k/1M transactions. > > > > > > > > 100k 500k 1M > > > > master 0.0600 0.1666 0.4876 > > > > patched 0.0620 0.1653 0.4795 > > > > > > > > > > > > Summary of the tests: > > > > After applying this patch, there is an overhead of about 3% in the case > decoding > > > > 100k transactions with 64 catalog modifying transactions. This is an > extreme > > > > case, so maybe it's okay. > > > > > > Yes. If we're worried about the overhead and doing bsearch() is the > > > cause, probably we can try simplehash instead of the array. > > > > > > > I am not sure if we need to go that far for this extreme corner > > case. Let's first try your below idea. > > > > > An improvement idea is that we pass the parsed->xinfo down to > > > SnapBuildXidHasCatalogChanges(), and then return from that function > > > before doing bsearch() if the parsed->xinfo doesn't have > > > XACT_XINFO_HAS_INVALS. That would save calling bsearch() for > > > non-catalog-modifying transactions. Is it worth trying? > > > > > > > I think this is worth trying and this might reduce some of the > > overhead as well in the case presented by Shi-San. > > Okay, I've attached an updated patch that does the above idea. Could > you please do the performance tests again to see if the idea can help > reduce the overhead, Shi yu? > Thanks for your improvement. I have tested the case which shows overhead before (decoding 100k transactions with 64 catalog modifying transactions) for the v9 patch, the result is as follows. master 0.0607 patched 0.0613 There's almost no difference compared with master (less than 1%), which looks good to me. Regards, Shi yu
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 26, 2022 at 2:18 PM Amit Kapila wrote: > > On Tue, Jul 26, 2022 at 7:00 AM Masahiko Sawada wrote: > > > > On Mon, Jul 25, 2022 at 7:57 PM shiy.f...@fujitsu.com > > wrote: > > > > > > Hi, > > > > > > I did some performance test for the master branch patch (based on v6 > > > patch) to > > > see if the bsearch() added by this patch will cause any overhead. > > > > Thank you for doing performance tests! > > > > > > > > I tested them three times and took the average. > > > > > > The results are as follows, and attach the bar chart. > > > > > > case 1 > > > - > > > No catalog modifying transaction. > > > Decode 800k pgbench transactions. (8 clients, 100k transactions per > > > client) > > > > > > master 7.5417 > > > patched 7.4107 > > > > > > case 2 > > > - > > > There's one catalog modifying transaction. > > > Decode 100k/500k/1M transactions. > > > > > > 100k500k1M > > > master 0.0576 0.1491 0.4346 > > > patched 0.0586 0.1500 0.4344 > > > > > > case 3 > > > - > > > There are 64 catalog modifying transactions. > > > Decode 100k/500k/1M transactions. > > > > > > 100k500k1M > > > master 0.0600 0.1666 0.4876 > > > patched 0.0620 0.1653 0.4795 > > > > > > (Because the result of case 3 shows that there is a overhead of about 3% > > > in the > > > case decoding 100k transactions with 64 catalog modifying transactions, I > > > tested the next run of 100k xacts with or without catalog modifying > > > transactions, to see if it affects subsequent decoding.) > > > > > > case 4.1 > > > - > > > After the test steps in case 3 (64 catalog modifying transactions, decode > > > 100k > > > transactions), run 100k xacts and then decode. > > > > > > master 0.3699 > > > patched 0.3701 > > > > > > case 4.2 > > > - > > > After the test steps in case 3 (64 catalog modifying transactions, decode > > > 100k > > > transactions), run 64 DDLs(without checkpoint) and 100k xacts, then > > > decode. 
> > > > > > master 0.3687 > > patched 0.3696 > > > > Summary of the tests: > > After applying this patch, there is an overhead of about 3% in the case > > decoding > > 100k transactions with 64 catalog modifying transactions. This is an extreme > > case, so maybe it's okay. > > Yes. If we're worried about the overhead and doing bsearch() is the > > cause, probably we can try simplehash instead of the array. > > > > I am not sure if we need to go that far for this extreme corner > case. Let's first try your below idea. > > > An improvement idea is that we pass the parsed->xinfo down to > > SnapBuildXidHasCatalogChanges(), and then return from that function > > before doing bsearch() if the parsed->xinfo doesn't have > > XACT_XINFO_HAS_INVALS. That would save calling bsearch() for > > non-catalog-modifying transactions. Is it worth trying? > > > > I think this is worth trying and this might reduce some of the > overhead as well in the case presented by Shi-San. Okay, I've attached an updated patch that does the above idea. Could you please do the performance tests again to see if the idea can help reduce the overhead, Shi yu? Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ master-v9-0001-Add-catalog-modifying-transactions-to-logical-dec.patch Description: Binary data
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 26, 2022 at 7:00 AM Masahiko Sawada wrote: > > On Mon, Jul 25, 2022 at 7:57 PM shiy.f...@fujitsu.com > wrote: > > > > Hi, > > > > I did some performance test for the master branch patch (based on v6 patch) > > to > > see if the bsearch() added by this patch will cause any overhead. > > Thank you for doing performance tests! > > > > > I tested them three times and took the average. > > > > The results are as follows, and attach the bar chart. > > > > case 1 > > - > > No catalog modifying transaction. > > Decode 800k pgbench transactions. (8 clients, 100k transactions per client) > > > > master 7.5417 > > patched 7.4107 > > > > case 2 > > - > > There's one catalog modifying transaction. > > Decode 100k/500k/1M transactions. > > > > 100k500k1M > > master 0.0576 0.1491 0.4346 > > patched 0.0586 0.1500 0.4344 > > > > case 3 > > - > > There are 64 catalog modifying transactions. > > Decode 100k/500k/1M transactions. > > > > 100k500k1M > > master 0.0600 0.1666 0.4876 > > patched 0.0620 0.1653 0.4795 > > > > (Because the result of case 3 shows that there is a overhead of about 3% in > > the > > case decoding 100k transactions with 64 catalog modifying transactions, I > > tested the next run of 100k xacts with or without catalog modifying > > transactions, to see if it affects subsequent decoding.) > > > > case 4.1 > > - > > After the test steps in case 3 (64 catalog modifying transactions, decode > > 100k > > transactions), run 100k xacts and then decode. > > > > master 0.3699 > > patched 0.3701 > > > > case 4.2 > > - > > After the test steps in case 3 (64 catalog modifying transactions, decode > > 100k > > transactions), run 64 DDLs(without checkpoint) and 100k xacts, then decode. > > > > master 0.3687 > > patched 0.3696 > > > > Summary of the tests: > > After applying this patch, there is a overhead of about 3% in the case > > decoding > > 100k transactions with 64 catalog modifying transactions. This is an extreme > > case, so maybe it's okay. > > Yes. 
If we're worried about the overhead and doing bsearch() is the
> cause, probably we can try simplehash instead of the array.
>

I am not sure if we need to go that far for this extreme corner case. Let's first try your idea below.

> An improvement idea is that we pass the parsed->xinfo down to
> SnapBuildXidHasCatalogChanges(), and then return from that function
> before doing bsearch() if the parsed->xinfo doesn't have
> XACT_XINFO_HAS_INVALS. That would save calling bsearch() for
> non-catalog-modifying transactions. Is it worth trying?
>

I think this is worth trying and this might reduce some of the overhead as well in the case presented by Shi-San.

--
With Regards,
Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Mon, Jul 25, 2022 at 7:57 PM shiy.f...@fujitsu.com wrote: > > Hi, > > I did some performance test for the master branch patch (based on v6 patch) to > see if the bsearch() added by this patch will cause any overhead. Thank you for doing performance tests! > > I tested them three times and took the average. > > The results are as follows, and attach the bar chart. > > case 1 > - > No catalog modifying transaction. > Decode 800k pgbench transactions. (8 clients, 100k transactions per client) > > master 7.5417 > patched 7.4107 > > case 2 > - > There's one catalog modifying transaction. > Decode 100k/500k/1M transactions. > > 100k500k1M > master 0.0576 0.1491 0.4346 > patched 0.0586 0.1500 0.4344 > > case 3 > - > There are 64 catalog modifying transactions. > Decode 100k/500k/1M transactions. > > 100k500k1M > master 0.0600 0.1666 0.4876 > patched 0.0620 0.1653 0.4795 > > (Because the result of case 3 shows that there is a overhead of about 3% in > the > case decoding 100k transactions with 64 catalog modifying transactions, I > tested the next run of 100k xacts with or without catalog modifying > transactions, to see if it affects subsequent decoding.) > > case 4.1 > - > After the test steps in case 3 (64 catalog modifying transactions, decode 100k > transactions), run 100k xacts and then decode. > > master 0.3699 > patched 0.3701 > > case 4.2 > - > After the test steps in case 3 (64 catalog modifying transactions, decode 100k > transactions), run 64 DDLs(without checkpoint) and 100k xacts, then decode. > > master 0.3687 > patched 0.3696 > > Summary of the tests: > After applying this patch, there is a overhead of about 3% in the case > decoding > 100k transactions with 64 catalog modifying transactions. This is an extreme > case, so maybe it's okay. Yes. If we're worried about the overhead and doing bsearch() is the cause, probably we can try simplehash instead of the array. 
An improvement idea is that we pass the parsed->xinfo down to SnapBuildXidHasCatalogChanges(), and then return from that function before doing bsearch() if the parsed->xinfo doesn't have XACT_XINFO_HAS_INVALS. That would save calling bsearch() for non-catalog-modifying transactions. Is it worth trying?

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
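The fast path proposed above can be sketched in miniature as follows. This is a standalone illustration, not PostgreSQL code: the function shape, the flag bit value, and the helper names are placeholders (the real `XACT_XINFO_HAS_INVALS` value lives in `access/xact.h`). The point is simply that a commit record without invalidation messages can never belong to a catalog-modifying transaction, so the bsearch() over the restored XID array can be skipped entirely:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t TransactionId;

#define XACT_XINFO_HAS_INVALS (1U << 3)  /* placeholder bit for this sketch */

/* comparator for a sorted TransactionId array, usable by bsearch() */
static int
xid_comparator(const void *a, const void *b)
{
    TransactionId xa = *(const TransactionId *) a;
    TransactionId xb = *(const TransactionId *) b;

    return (xa < xb) ? -1 : (xa > xb) ? 1 : 0;
}

/*
 * Sketch of the proposed check: bail out before the bsearch() when the
 * commit record's xinfo carries no invalidation messages, since such a
 * transaction cannot have modified the catalog.
 */
static bool
xid_has_catalog_changes(TransactionId xid, uint32_t xinfo,
                        const TransactionId *catchange_xip,
                        size_t catchange_xcnt)
{
    if ((xinfo & XACT_XINFO_HAS_INVALS) == 0)
        return false;           /* fast path: skip the binary search */

    return bsearch(&xid, catchange_xip, catchange_xcnt,
                   sizeof(TransactionId), xid_comparator) != NULL;
}
```

With this shape, the common case of a plain DML transaction costs one bit test instead of a search over the catalog-modifying XID array.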
RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Hi,

I did some performance tests for the master branch patch (based on the v6 patch) to see if the bsearch() added by this patch will cause any overhead. I ran each test three times and took the average. The results are as follows; the bar chart is attached.

case 1
------
No catalog modifying transaction.
Decode 800k pgbench transactions. (8 clients, 100k transactions per client)

master  7.5417
patched 7.4107

case 2
------
There's one catalog modifying transaction.
Decode 100k/500k/1M transactions.

        100k    500k    1M
master  0.0576  0.1491  0.4346
patched 0.0586  0.1500  0.4344

case 3
------
There are 64 catalog modifying transactions.
Decode 100k/500k/1M transactions.

        100k    500k    1M
master  0.0600  0.1666  0.4876
patched 0.0620  0.1653  0.4795

(Because the result of case 3 shows an overhead of about 3% when decoding 100k transactions with 64 catalog modifying transactions, I tested the next run of 100k xacts with and without catalog modifying transactions, to see if it affects subsequent decoding.)

case 4.1
--------
After the test steps in case 3 (64 catalog modifying transactions, decode 100k transactions), run 100k xacts and then decode.

master  0.3699
patched 0.3701

case 4.2
--------
After the test steps in case 3 (64 catalog modifying transactions, decode 100k transactions), run 64 DDLs (without checkpoint) and 100k xacts, then decode.

master  0.3687
patched 0.3696

Summary of the tests:
After applying this patch, there is an overhead of about 3% when decoding 100k transactions with 64 catalog modifying transactions. This is an extreme case, so maybe it's okay. Cases 4.1 and 4.2 show that the patch has no effect on subsequent decoding. In the other cases, there are no significant differences.

For your information, here are the parameters specified in postgresql.conf in the test:

shared_buffers = 8GB
checkpoint_timeout = 30min
max_wal_size = 20GB
min_wal_size = 10GB
autovacuum = off

Regards,
Shi yu
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Mon, Jul 25, 2022 at 10:45 AM Masahiko Sawada wrote: > > On Sat, Jul 23, 2022 at 8:32 PM Amit Kapila wrote: > > > > On Fri, Jul 22, 2022 at 11:48 AM Masahiko Sawada > > wrote: > > > > > > On Wed, Jul 20, 2022 at 5:50 PM Amit Kapila > > > wrote: > > > > > > > > On Wed, Jul 20, 2022 at 1:28 PM Masahiko Sawada > > > > wrote: > > > > > > > > > > > > > This is required if we don't want to introduce a new set of functions > > > > as you proposed above. I am not sure which one is better w.r.t back > > > > patching effort later but it seems to me using flag stuff would make > > > > future back patches easier if we make any changes in > > > > SnapBuildCommitTxn. > > > > > > Understood. > > > > > > I've implemented this idea as well for discussion. Both patches have > > > the common change to remember the initial running transactions and to > > > purge them when decoding xl_running_xacts records. The difference is > > > how to mark the transactions as needing to be added to the snapshot. > > > > > > In v7-0001-Fix-catalog-lookup-with-the-wrong-snapshot-during.patch, > > > when the transaction is in the initial running xact list and its > > > commit record has XINFO_HAS_INVAL flag, we mark both the top > > > transaction and its all subtransactions as containing catalog changes > > > (which also means to create ReorderBufferTXN entries for them). These > > > transactions are added to the snapshot in SnapBuildCommitTxn() since > > > ReorderBufferXidHasCatalogChanges () for them returns true. > > > > > > In poc_mark_top_txn_has_inval.patch, when the transaction is in the > > > initial running xacts list and its commit record has XINFO_HAS_INVALS > > > flag, we set a new flag, say RBTXN_COMMIT_HAS_INVALS, only to the top > > > transaction. > > > > > > > It seems that the patch has missed the part to check if the xid is in > > the initial running xacts list? > > Oops, right. 
> > > > > > In SnapBuildCommitTxn(), we add all subtransactions to > > > the snapshot without checking ReorderBufferXidHasCatalogChanges() for > > > subtransactions if its top transaction has the RBTXN_COMMIT_HAS_INVALS > > > flag. > > > > > > A difference between the two ideas is the scope of changes: the former > > > changes only snapbuild.c but the latter changes both snapbuild.c and > > > reorderbuffer.c. Moreover, while the former uses the existing flag, > > > the latter adds a new flag to the reorder buffer for dealing with only > > > this case. I think the former idea is simpler in terms of that. But, > > > an advantage of the latter idea is that the latter idea can save to > > > create ReorderBufferTXN entries for subtransactions. > > > > > > Overall I prefer the former for now but I'd like to hear what others > > > think. > > > > > > > I agree that the latter idea can have better performance in extremely > > special scenarios but introducing a new flag for the same sounds a bit > > ugly to me. So, I would also prefer to go with the former idea, > > however, I would also like to hear what Horiguchi-San and others have > > to say. > > Agreed. > > > > > Few comments on v7-0001-Fix-catalog-lookup-with-the-wrong-snapshot-during: > > 1. > > +void > > +SnapBuildInitialXactSetCatalogChanges(SnapBuild *builder, TransactionId > > xid, > > + int subxcnt, TransactionId *subxacts, > > + XLogRecPtr lsn) > > +{ > > > > I think it is better to name this function as > > SnapBuildXIDSetCatalogChanges as we use this to mark a particular > > transaction as having catalog changes. > > > > 2. Changed/added a few comments in the attached. > > Thank you for the comments. > > I've attached updated version patches for the master and back branches. I've attached the patch for REl15 that I forgot. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ REL15-v8-0001-Fix-catalog-lookup-with-the-wrong-snapshot-during.patch Description: Binary data
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Fri, Jul 22, 2022 at 11:48 AM Masahiko Sawada wrote: > > On Wed, Jul 20, 2022 at 5:50 PM Amit Kapila wrote: > > > > On Wed, Jul 20, 2022 at 1:28 PM Masahiko Sawada > > wrote: > > > > > > > This is required if we don't want to introduce a new set of functions > > as you proposed above. I am not sure which one is better w.r.t back > > patching effort later but it seems to me using flag stuff would make > > future back patches easier if we make any changes in > > SnapBuildCommitTxn. > > Understood. > > I've implemented this idea as well for discussion. Both patches have > the common change to remember the initial running transactions and to > purge them when decoding xl_running_xacts records. The difference is > how to mark the transactions as needing to be added to the snapshot. > > In v7-0001-Fix-catalog-lookup-with-the-wrong-snapshot-during.patch, > when the transaction is in the initial running xact list and its > commit record has XINFO_HAS_INVAL flag, we mark both the top > transaction and its all subtransactions as containing catalog changes > (which also means to create ReorderBufferTXN entries for them). These > transactions are added to the snapshot in SnapBuildCommitTxn() since > ReorderBufferXidHasCatalogChanges () for them returns true. > > In poc_mark_top_txn_has_inval.patch, when the transaction is in the > initial running xacts list and its commit record has XINFO_HAS_INVALS > flag, we set a new flag, say RBTXN_COMMIT_HAS_INVALS, only to the top > transaction. > It seems that the patch has missed the part to check if the xid is in the initial running xacts list? > In SnapBuildCommitTxn(), we add all subtransactions to > the snapshot without checking ReorderBufferXidHasCatalogChanges() for > subtransactions if its top transaction has the RBTXN_COMMIT_HAS_INVALS > flag. > > A difference between the two ideas is the scope of changes: the former > changes only snapbuild.c but the latter changes both snapbuild.c and > reorderbuffer.c. 
Moreover, while the former uses the existing flag, > the latter adds a new flag to the reorder buffer for dealing with only > this case. I think the former idea is simpler in terms of that. But, > an advantage of the latter idea is that the latter idea can save to > create ReorderBufferTXN entries for subtransactions. > > Overall I prefer the former for now but I'd like to hear what others think. > I agree that the latter idea can have better performance in extremely special scenarios but introducing a new flag for the same sounds a bit ugly to me. So, I would also prefer to go with the former idea, however, I would also like to hear what Horiguchi-San and others have to say. Few comments on v7-0001-Fix-catalog-lookup-with-the-wrong-snapshot-during: 1. +void +SnapBuildInitialXactSetCatalogChanges(SnapBuild *builder, TransactionId xid, + int subxcnt, TransactionId *subxacts, + XLogRecPtr lsn) +{ I think it is better to name this function as SnapBuildXIDSetCatalogChanges as we use this to mark a particular transaction as having catalog changes. 2. Changed/added a few comments in the attached. -- With Regards, Amit Kapila. v7-0001-diff-amit.patch Description: Binary data
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Jul 20, 2022 at 5:50 PM Amit Kapila wrote: > > On Wed, Jul 20, 2022 at 1:28 PM Masahiko Sawada wrote: > > > > On Wed, Jul 20, 2022 at 2:19 PM Amit Kapila wrote: > > > > > > On Wed, Jul 20, 2022 at 9:01 AM Masahiko Sawada > > > wrote: > > > > > > > Another idea would be to have functions, say > > > > SnapBuildCommitTxnWithXInfo() and SnapBuildCommitTxn_ext(). The latter > > > > does actual work of handling transaction commits and both > > > > SnapBuildCommitTxn() and SnapBuildCommit() call > > > > SnapBuildCommitTxnWithXInfo() with different arguments. > > > > > > > > > > Do you want to say DecodeCommit() instead of SnapBuildCommit() in > > > above para? > > > > I meant that we will call like DecodeCommit() -> > > SnapBuildCommitTxnWithXInfo() -> SnapBuildCommitTxn_ext(has_invals = > > true) -> SnapBuildXidHasCatalogChanges(has_invals = true) -> ... If > > SnapBuildCommitTxn() gets called, it calls SnapBuildCommitTxn_ext() > > with has_invals = false and behaves the same as before. > > > > Okay, understood. This will work. > > > > Yet another idea could be to have another flag > > > RBTXN_HAS_INVALS which will be set by DecodeCommit for top-level TXN. > > > Then, we can retrieve it even for each of the subtxn's if and when > > > required. > > > > Do you mean that when checking if the subtransaction has catalog > > changes, we check if its top-level XID has this new flag? > > > > Yes. > > > Why do we > > need the new flag? > > > > This is required if we don't want to introduce a new set of functions > as you proposed above. I am not sure which one is better w.r.t back > patching effort later but it seems to me using flag stuff would make > future back patches easier if we make any changes in > SnapBuildCommitTxn. Understood. I've implemented this idea as well for discussion. Both patches have the common change to remember the initial running transactions and to purge them when decoding xl_running_xacts records. 
The difference is how to mark the transactions as needing to be added to the snapshot.

In v7-0001-Fix-catalog-lookup-with-the-wrong-snapshot-during.patch, when the transaction is in the initial running xacts list and its commit record has the XINFO_HAS_INVALS flag, we mark both the top transaction and all of its subtransactions as containing catalog changes (which also means creating ReorderBufferTXN entries for them). These transactions are added to the snapshot in SnapBuildCommitTxn() since ReorderBufferXidHasCatalogChanges() returns true for them.

In poc_mark_top_txn_has_inval.patch, when the transaction is in the initial running xacts list and its commit record has the XINFO_HAS_INVALS flag, we set a new flag, say RBTXN_COMMIT_HAS_INVALS, only on the top transaction. In SnapBuildCommitTxn(), we then add all subtransactions to the snapshot without checking ReorderBufferXidHasCatalogChanges() for them if the top transaction has the RBTXN_COMMIT_HAS_INVALS flag.

A difference between the two ideas is the scope of changes: the former changes only snapbuild.c, whereas the latter changes both snapbuild.c and reorderbuffer.c. Moreover, while the former uses an existing flag, the latter adds a new flag to the reorder buffer for dealing with only this case. In that respect the former idea is simpler. An advantage of the latter idea, however, is that it avoids creating ReorderBufferTXN entries for subtransactions.

Overall I prefer the former for now, but I'd like to hear what others think. FWIW, I didn't try the idea of adding wrapper functions since it would be costly in terms of back-patching effort in the future.
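The "former" idea above can be illustrated with a minimal sketch. This is not the patch's code: the marking and lookup helpers below (`mark_has_catalog_changes`, `is_marked`, `set_catalog_changes`) are invented stand-ins for the real ReorderBuffer machinery, and the fixed-size array is just for illustration. It only shows the shape of marking the top transaction together with every subtransaction:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define MAX_TRACKED 64

/* minimal stand-in for the set of XIDs marked as catalog-modifying */
static TransactionId marked[MAX_TRACKED];
static size_t n_marked = 0;

static void
mark_has_catalog_changes(TransactionId xid)
{
    assert(n_marked < MAX_TRACKED);
    marked[n_marked++] = xid;
}

static bool
is_marked(TransactionId xid)
{
    for (size_t i = 0; i < n_marked; i++)
        if (marked[i] == xid)
            return true;
    return false;
}

/*
 * Sketch of the "former" idea: when a commit record carries invalidation
 * messages and its XID is in the initial running xacts list, mark the top
 * transaction and every one of its subtransactions as containing catalog
 * changes, so all of them get added to the snapshot at commit.
 */
static void
set_catalog_changes(TransactionId xid, int subxcnt,
                    const TransactionId *subxacts)
{
    mark_has_catalog_changes(xid);
    for (int i = 0; i < subxcnt; i++)
        mark_has_catalog_changes(subxacts[i]);
}
```

The cost noted in the message above is visible here: each marked subtransaction XID gets its own entry, which in the real code means a ReorderBufferTXN per subtransaction.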
Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

diff --git a/contrib/test_decoding/Makefile b/contrib/test_decoding/Makefile
index 9a31e0b879..4553252d75 100644
--- a/contrib/test_decoding/Makefile
+++ b/contrib/test_decoding/Makefile
@@ -8,7 +8,7 @@ REGRESS = ddl xact rewrite toast permissions decoding_in_xact \
 	spill slot truncate stream stats twophase twophase_stream
 ISOLATION = mxact delayed_startup ondisk_startup concurrent_ddl_dml \
 	oldest_xmin snapshot_transfer subxact_without_top concurrent_stream \
-	twophase_snapshot
+	twophase_snapshot catalog_change_snapshot
 
 REGRESS_OPTS = --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf
 ISOLATION_OPTS = --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf
diff --git a/contrib/test_decoding/expected/catalog_change_snapshot.out b/contrib/test_decoding/expected/catalog_change_snapshot.out
new file mode 100644
index 00..dc4f9b7018
--- /dev/null
+++ b/contrib/test_decoding/expected/catalog_change_snapshot.out
@@ -0,0 +1,44 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s0_init s0_begin s0_savepoint s0_truncate s1_checkpoint s1_get_changes s0_commit s0_begin s0_insert s1_checkpoint s1_get_changes s0_commit s1_get_changes
+step s0_init: SELECT 'init' FROM pg_create_logical_replication_slot('isolation_slot', 'test_decoding');
+?column?
+
+init
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Jul 20, 2022 at 1:28 PM Masahiko Sawada wrote: > > On Wed, Jul 20, 2022 at 2:19 PM Amit Kapila wrote: > > > > On Wed, Jul 20, 2022 at 9:01 AM Masahiko Sawada > > wrote: > > > > > Another idea would be to have functions, say > > > SnapBuildCommitTxnWithXInfo() and SnapBuildCommitTxn_ext(). The latter > > > does actual work of handling transaction commits and both > > > SnapBuildCommitTxn() and SnapBuildCommit() call > > > SnapBuildCommitTxnWithXInfo() with different arguments. > > > > > > > Do you want to say DecodeCommit() instead of SnapBuildCommit() in > > above para? > > I meant that we will call like DecodeCommit() -> > SnapBuildCommitTxnWithXInfo() -> SnapBuildCommitTxn_ext(has_invals = > true) -> SnapBuildXidHasCatalogChanges(has_invals = true) -> ... If > SnapBuildCommitTxn() gets called, it calls SnapBuildCommitTxn_ext() > with has_invals = false and behaves the same as before. > Okay, understood. This will work. > > Yet another idea could be to have another flag > > RBTXN_HAS_INVALS which will be set by DecodeCommit for top-level TXN. > > Then, we can retrieve it even for each of the subtxn's if and when > > required. > > Do you mean that when checking if the subtransaction has catalog > changes, we check if its top-level XID has this new flag? > Yes. > Why do we > need the new flag? > This is required if we don't want to introduce a new set of functions as you proposed above. I am not sure which one is better w.r.t back patching effort later but it seems to me using flag stuff would make future back patches easier if we make any changes in SnapBuildCommitTxn. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Jul 20, 2022 at 2:19 PM Amit Kapila wrote: > > On Wed, Jul 20, 2022 at 9:01 AM Masahiko Sawada wrote: > > > > On Wed, Jul 20, 2022 at 12:11 PM Amit Kapila > > wrote: > > > > > > On Tue, Jul 19, 2022 at 7:28 PM Masahiko Sawada > > > wrote: > > > > > > Why do you think we can't remove > > > ReorderBufferInitialXactsSetCatalogChanges() from the back branch > > > patch? I think we don't need to change the existing function > > > ReorderBufferXidHasCatalogChanges() but instead can have a wrapper > > > like SnapBuildXidHasCatalogChanges() similar to master branch patch. > > > > IIUC we need to change SnapBuildCommitTxn() but it's exposed. > > > > Currently, we call like DecodeCommit() -> SnapBuildCommitTxn() -> > > ReorderBufferXidHasCatalogChanges(). If we have a wrapper function, we > > call like DecodeCommit() -> SnapBuildCommitTxn() -> > > SnapBuildXidHasCatalogChanges() -> > > ReorderBufferXidHasCatalogChanges(). In > > SnapBuildXidHasCatalogChanges(), we need to check if the transaction > > has XACT_XINFO_HAS_INVALS, which means DecodeCommit() needs to pass > > either parsed->xinfo or (parsed->xinfo & XACT_XINFO_HAS_INVALS != 0) > > down to SnapBuildXidHasCatalogChanges(). However, since > > SnapBuildCommitTxn(), between DecodeCommit() and > > SnapBuildXidHasCatalogChanges(), is exposed we cannot change it. > > > > Agreed. > > > Another idea would be to have functions, say > > SnapBuildCommitTxnWithXInfo() and SnapBuildCommitTxn_ext(). The latter > > does actual work of handling transaction commits and both > > SnapBuildCommitTxn() and SnapBuildCommit() call > > SnapBuildCommitTxnWithXInfo() with different arguments. > > > > Do you want to say DecodeCommit() instead of SnapBuildCommit() in > above para? I meant that we will call like DecodeCommit() -> SnapBuildCommitTxnWithXInfo() -> SnapBuildCommitTxn_ext(has_invals = true) -> SnapBuildXidHasCatalogChanges(has_invals = true) -> ... 
If SnapBuildCommitTxn() gets called, it calls SnapBuildCommitTxn_ext() with has_invals = false and behaves the same as before. > Yet another idea could be to have another flag > RBTXN_HAS_INVALS which will be set by DecodeCommit for top-level TXN. > Then, we can retrieve it even for each of the subtxn's if and when > required. Do you mean that when checking if the subtransaction has catalog changes, we check if its top-level XID has this new flag? Why do we need the new flag? Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
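The wrapper arrangement discussed in this exchange can be sketched as follows. This is a hedged, standalone illustration of the back-branch ABI-preservation pattern only: the argument list mirrors what the thread describes, but the real signatures live in snapbuild.h, the flag bit is a placeholder, and the worker body is stubbed out so the routing is observable:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
typedef uint64_t XLogRecPtr;
typedef struct SnapBuild SnapBuild;     /* opaque stand-in */

#define XACT_XINFO_HAS_INVALS (1U << 3) /* placeholder bit for this sketch */

/* recorded by the worker so the sketch's routing is observable */
static bool last_has_invals = false;

/*
 * Shared worker: in the real proposal this would hold the commit handling
 * currently in SnapBuildCommitTxn(). Being new and static, it does not
 * disturb the existing ABI.
 */
static void
SnapBuildCommitTxn_ext(SnapBuild *builder, XLogRecPtr lsn, TransactionId xid,
                       int nsubxacts, TransactionId *subxacts, bool has_invals)
{
    (void) builder; (void) lsn; (void) xid;
    (void) nsubxacts; (void) subxacts;
    last_has_invals = has_invals;
}

/* existing exported function keeps its old signature, preserving the ABI */
void
SnapBuildCommitTxn(SnapBuild *builder, XLogRecPtr lsn, TransactionId xid,
                   int nsubxacts, TransactionId *subxacts)
{
    SnapBuildCommitTxn_ext(builder, lsn, xid, nsubxacts, subxacts, false);
}

/* new entry point for DecodeCommit(), which has parsed->xinfo at hand */
void
SnapBuildCommitTxnWithXInfo(SnapBuild *builder, XLogRecPtr lsn,
                            TransactionId xid, int nsubxacts,
                            TransactionId *subxacts, uint32_t xinfo)
{
    SnapBuildCommitTxn_ext(builder, lsn, xid, nsubxacts, subxacts,
                           (xinfo & XACT_XINFO_HAS_INVALS) != 0);
}
```

Old callers of SnapBuildCommitTxn() keep working unchanged, while DecodeCommit() can switch to the new entry point to pass the invalidation information down.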
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Jul 20, 2022 at 4:16 PM Kyotaro Horiguchi wrote: > > At Wed, 20 Jul 2022 10:58:16 +0900, Masahiko Sawada > wrote in > > On Wed, Jul 20, 2022 at 9:58 AM Kyotaro Horiguchi > > wrote: > > > Mmm. the patch changed that behavior. AllocateSnapshotBuilder always > > > allocate the array with a fixed size. SnapBuildAddCommittedTxn still > > > assumes builder->committed.xip is non-NULL. SnapBuildRestore *kept* > > > ondisk.builder.commited.xip populated with a valid array pointer. But > > > the patch allows committed.xip be NULL, thus in that case, > > > SnapBuildAddCommitedTxn calls repalloc(NULL) which triggers assertion > > > failure. > > > > IIUC the patch doesn't allow committed.xip to be NULL since we don't > > overwrite it if builder->committed.xcnt is 0 (i.e., > > ondisk.builder.committed.xip is NULL): > > I meant that ondisk.builder.committed.xip can be NULL.. But looking > again that cannot be. I don't understand what I was looking at at > that time. > > So, sorry for the noise. No problem. Thank you for your review and comments! Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
At Wed, 20 Jul 2022 10:58:16 +0900, Masahiko Sawada wrote in > On Wed, Jul 20, 2022 at 9:58 AM Kyotaro Horiguchi > wrote: > > Mmm. the patch changed that behavior. AllocateSnapshotBuilder always > > allocate the array with a fixed size. SnapBuildAddCommittedTxn still > > assumes builder->committed.xip is non-NULL. SnapBuildRestore *kept* > > ondisk.builder.commited.xip populated with a valid array pointer. But > > the patch allows committed.xip be NULL, thus in that case, > > SnapBuildAddCommitedTxn calls repalloc(NULL) which triggers assertion > > failure. > > IIUC the patch doesn't allow committed.xip to be NULL since we don't > overwrite it if builder->committed.xcnt is 0 (i.e., > ondisk.builder.committed.xip is NULL): I meant that ondisk.builder.committed.xip can be NULL.. But looking again that cannot be. I don't understand what I was looking at at that time. So, sorry for the noise. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Jul 20, 2022 at 9:01 AM Masahiko Sawada wrote: > > On Wed, Jul 20, 2022 at 12:11 PM Amit Kapila wrote: > > > > On Tue, Jul 19, 2022 at 7:28 PM Masahiko Sawada > > wrote: > > > > Why do you think we can't remove > > ReorderBufferInitialXactsSetCatalogChanges() from the back branch > > patch? I think we don't need to change the existing function > > ReorderBufferXidHasCatalogChanges() but instead can have a wrapper > > like SnapBuildXidHasCatalogChanges() similar to master branch patch. > > IIUC we need to change SnapBuildCommitTxn() but it's exposed. > > Currently, we call like DecodeCommit() -> SnapBuildCommitTxn() -> > ReorderBufferXidHasCatalogChanges(). If we have a wrapper function, we > call like DecodeCommit() -> SnapBuildCommitTxn() -> > SnapBuildXidHasCatalogChanges() -> > ReorderBufferXidHasCatalogChanges(). In > SnapBuildXidHasCatalogChanges(), we need to check if the transaction > has XACT_XINFO_HAS_INVALS, which means DecodeCommit() needs to pass > either parsed->xinfo or (parsed->xinfo & XACT_XINFO_HAS_INVALS != 0) > down to SnapBuildXidHasCatalogChanges(). However, since > SnapBuildCommitTxn(), between DecodeCommit() and > SnapBuildXidHasCatalogChanges(), is exposed we cannot change it. > Agreed. > Another idea would be to have functions, say > SnapBuildCommitTxnWithXInfo() and SnapBuildCommitTxn_ext(). The latter > does actual work of handling transaction commits and both > SnapBuildCommitTxn() and SnapBuildCommit() call > SnapBuildCommitTxnWithXInfo() with different arguments. > Do you want to say DecodeCommit() instead of SnapBuildCommit() in above para? Yet another idea could be to have another flag RBTXN_HAS_INVALS which will be set by DecodeCommit for top-level TXN. Then, we can retrieve it even for each of the subtxn's if and when required. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Jul 20, 2022 at 12:11 PM Amit Kapila wrote: > > On Tue, Jul 19, 2022 at 7:28 PM Masahiko Sawada wrote: > > > > On Tue, Jul 19, 2022 at 9:25 PM Amit Kapila wrote: > > > > > > On Tue, Jul 19, 2022 at 1:10 PM Masahiko Sawada > > > wrote: > > > > > > > > > > > BTW on backbranches, I think that the reason why we add > > > > initial_running_xacts stuff to ReorderBuffer is that we cannot modify > > > > SnapBuild that could be serialized. Can we add a (private) array for > > > > the initial running xacts in snapbuild.c instead of adding new > > > > variables to ReorderBuffer? > > > > > > > > > > While thinking about this, I wonder if the current patch for back > > > branches can lead to an ABI break as it changes the exposed structure? > > > If so, it may be another reason to change it to some other way > > > probably as you are suggesting. > > > > Yeah, it changes the size of ReorderBuffer, which is not good. > > > > So, are you planning to give a try with your idea of making a private > array for the initial running xacts? Yes. > I am not sure but I guess you are > proposing to add it in SnapBuild structure, if so, that seems safe as > that structure is not exposed. We cannot add it in SnapBuild as it gets serialized to the disk. > > > Changing the function names and arguments would also break ABI. So > > probably we cannot do the above idea of removing > > ReorderBufferInitialXactsSetCatalogChanges() as well. > > > > Why do you think we can't remove > ReorderBufferInitialXactsSetCatalogChanges() from the back branch > patch? I think we don't need to change the existing function > ReorderBufferXidHasCatalogChanges() but instead can have a wrapper > like SnapBuildXidHasCatalogChanges() similar to master branch patch. IIUC we need to change SnapBuildCommitTxn() but it's exposed. Currently, we call like DecodeCommit() -> SnapBuildCommitTxn() -> ReorderBufferXidHasCatalogChanges(). 
If we have a wrapper function, we call like DecodeCommit() -> SnapBuildCommitTxn() -> SnapBuildXidHasCatalogChanges() -> ReorderBufferXidHasCatalogChanges(). In SnapBuildXidHasCatalogChanges(), we need to check if the transaction has XACT_XINFO_HAS_INVALS, which means DecodeCommit() needs to pass either parsed->xinfo or (parsed->xinfo & XACT_XINFO_HAS_INVALS != 0) down to SnapBuildXidHasCatalogChanges(). However, since SnapBuildCommitTxn(), between DecodeCommit() and SnapBuildXidHasCatalogChanges(), is exposed we cannot change it. Another idea would be to have functions, say SnapBuildCommitTxnWithXInfo() and SnapBuildCommitTxn_ext(). The latter does actual work of handling transaction commits and both SnapBuildCommitTxn() and SnapBuildCommit() call SnapBuildCommitTxnWithXInfo() with different arguments. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 19, 2022 at 7:28 PM Masahiko Sawada wrote: > > On Tue, Jul 19, 2022 at 9:25 PM Amit Kapila wrote: > > > > On Tue, Jul 19, 2022 at 1:10 PM Masahiko Sawada > > wrote: > > > > > > > > BTW on backbranches, I think that the reason why we add > > > initial_running_xacts stuff to ReorderBuffer is that we cannot modify > > > SnapBuild that could be serialized. Can we add a (private) array for > > > the initial running xacts in snapbuild.c instead of adding new > > > variables to ReorderBuffer? > > > > > > > While thinking about this, I wonder if the current patch for back > > branches can lead to an ABI break as it changes the exposed structure? > > If so, it may be another reason to change it to some other way > > probably as you are suggesting. > > Yeah, it changes the size of ReorderBuffer, which is not good. > So, are you planning to give a try with your idea of making a private array for the initial running xacts? I am not sure but I guess you are proposing to add it in SnapBuild structure, if so, that seems safe as that structure is not exposed. > Changing the function names and arguments would also break ABI. So > probably we cannot do the above idea of removing > ReorderBufferInitialXactsSetCatalogChanges() as well. > Why do you think we can't remove ReorderBufferInitialXactsSetCatalogChanges() from the back branch patch? I think we don't need to change the existing function ReorderBufferXidHasCatalogChanges() but instead can have a wrapper like SnapBuildXidHasCatalogChanges() similar to master branch patch. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Jul 20, 2022 at 9:58 AM Kyotaro Horiguchi wrote: > > At Tue, 19 Jul 2022 17:31:07 +0900, Masahiko Sawada > wrote in > > On Tue, Jul 19, 2022 at 4:35 PM Kyotaro Horiguchi > > wrote: > > > At Tue, 19 Jul 2022 10:17:15 +0530, Amit Kapila > > > wrote in > > > > Good work. I wonder without comments this may create a problem in the > > > > future. OTOH, I don't see adding a check "catchange.xcnt > 0" before > > > > freeing the memory any less robust. Also, for consistency, we can use > > > > a similar check based on xcnt in the SnapBuildRestore to free the > > > > memory in the below code: > > > > + /* set catalog modifying transactions */ > > > > + if (builder->catchange.xip) > > > > + pfree(builder->catchange.xip); > > > > > > But xip must be positive there. We can add a comment explains that. > > > > > > > Yes, if we add the comment for it, probably we need to explain a gcc's > > optimization but it seems to be too much to me. > > Ah, sorry. I confused with other place in SnapBuildPurgeCommitedTxn. > I agree to you, that we don't need additional comment *there*. > > > > + catchange_xip = > > > ReorderBufferGetCatalogChangesXacts(builder->reorder); > > > > > > catchange_xip is allocated in the current context, but ondisk is > > > allocated in builder->context. I see it kind of inconsistent (even if > > > the current context is same with build->context). > > > > Right. I thought that since the lifetime of catchange_xip is short, > > until the end of SnapBuildSerialize() function we didn't need to > > allocate it in builder->context. But given ondisk, we need to do that > > for catchange_xip as well. Will fix it. > > Thanks. > > > > + if (builder->committed.xcnt > 0) > > > + { > > > > > > It seems to me comitted.xip is always non-null, so we don't need this. > > > I don't strongly object to do that, though. > > > > But committed.xcnt could be 0, right? We don't need to copy anything > > by calling memcpy with size = 0 in this case. 
Also, it looks more > > consistent with what we do for catchange_xcnt. > > Mmm. the patch changed that behavior. AllocateSnapshotBuilder always > allocate the array with a fixed size. SnapBuildAddCommittedTxn still > assumes builder->committed.xip is non-NULL. SnapBuildRestore *kept* > ondisk.builder.commited.xip populated with a valid array pointer. But > the patch allows committed.xip be NULL, thus in that case, > SnapBuildAddCommitedTxn calls repalloc(NULL) which triggers assertion > failure. IIUC the patch doesn't allow committed.xip to be NULL since we don't overwrite it if builder->committed.xcnt is 0 (i.e., ondisk.builder.committed.xip is NULL): builder->committed.xcnt = ondisk.builder.committed.xcnt; /* We only allocated/stored xcnt, not xcnt_space xids ! */ /* don't overwrite preallocated xip, if we don't have anything here */ if (builder->committed.xcnt > 0) { pfree(builder->committed.xip); builder->committed.xcnt_space = ondisk.builder.committed.xcnt; builder->committed.xip = ondisk.builder.committed.xip; } Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
At Tue, 19 Jul 2022 17:31:07 +0900, Masahiko Sawada wrote in > On Tue, Jul 19, 2022 at 4:35 PM Kyotaro Horiguchi > wrote: > > At Tue, 19 Jul 2022 10:17:15 +0530, Amit Kapila > > wrote in > > > Good work. I wonder without comments this may create a problem in the > > > future. OTOH, I don't see adding a check "catchange.xcnt > 0" before > > > freeing the memory any less robust. Also, for consistency, we can use > > > a similar check based on xcnt in the SnapBuildRestore to free the > > > memory in the below code: > > > + /* set catalog modifying transactions */ > > > + if (builder->catchange.xip) > > > + pfree(builder->catchange.xip); > > > > But xip must be positive there. We can add a comment explains that. > > > > Yes, if we add the comment for it, probably we need to explain a gcc's > optimization but it seems to be too much to me. Ah, sorry. I confused it with another place, in SnapBuildPurgeCommittedTxn. I agree with you that we don't need an additional comment *there*. > > + catchange_xip = > > ReorderBufferGetCatalogChangesXacts(builder->reorder); > > > > catchange_xip is allocated in the current context, but ondisk is > > allocated in builder->context. I see it kind of inconsistent (even if > > the current context is same with build->context). > > Right. I thought that since the lifetime of catchange_xip is short, > until the end of SnapBuildSerialize() function we didn't need to > allocate it in builder->context. But given ondisk, we need to do that > for catchange_xip as well. Will fix it. Thanks. > > + if (builder->committed.xcnt > 0) > > + { > > > > It seems to me comitted.xip is always non-null, so we don't need this. > > I don't strongly object to do that, though. > > But committed.xcnt could be 0, right? We don't need to copy anything > by calling memcpy with size = 0 in this case. Also, it looks more > consistent with what we do for catchange_xcnt. Mmm, the patch changed that behavior. AllocateSnapshotBuilder always allocates the array with a fixed size.
SnapBuildAddCommittedTxn still assumes builder->committed.xip is non-NULL. SnapBuildRestore *kept* ondisk.builder.committed.xip populated with a valid array pointer. But the patch allows committed.xip to be NULL, and in that case SnapBuildAddCommittedTxn calls repalloc(NULL), which triggers an assertion failure. > > + Assert((xcnt > 0) && (xcnt == rb->catchange_ntxns)); > > > > (xcnt > 0) is obvious here (otherwise means dlist_foreach is broken..). > > (xcnt == rb->catchange_ntxns) is not what should be checked here. The > > assert just requires that catchange_txns and catchange_ntxns are > > consistent so it should be checked just after dlist_empty.. I think. > > > > If we want to check if catchange_txns and catchange_ntxns are > consistent, should we check (xcnt == rb->catchange_ntxns) as well, no? > This function requires the caller to use rb->catchange_ntxns as the > length of the returned array. I think this assertion ensures that the > actual length of the array is consistent with the length we > pre-calculated. Sorry again. Please forget the comment about xcnt == rb->catchange_ntxns.. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 19, 2022 at 9:25 PM Amit Kapila wrote: > > On Tue, Jul 19, 2022 at 1:10 PM Masahiko Sawada wrote: > > > > On Mon, Jul 18, 2022 at 8:49 PM Amit Kapila wrote: > > > > > > On Sun, Jul 17, 2022 at 6:29 PM Masahiko Sawada > > > wrote: > > > > > > > > On Fri, Jul 15, 2022 at 3:32 PM shiy.f...@fujitsu.com > > > > wrote: > > > > > > > > > > > > > I've attached patches for all supported branches including the master. > > > > > > > > > > For back branch patches, > > > * Wouldn't it be better to move purge logic into the function > > > SnapBuildPurge* function for the sake of consistency? > > > > Agreed. > > > > > * Do we really need ReorderBufferInitialXactsSetCatalogChanges()? > > > Can't we instead have a function similar to > > > SnapBuildXidHasCatalogChanges() as we have for the master branch? That > > > will avoid calling it when the snapshot > > > state is SNAPBUILD_START or SNAPBUILD_BUILDING_SNAPSHOT > > > > Seems a good idea. We would need to pass the information about > > (parsed->xinfo & XACT_XINFO_HAS_INVALS) to the function but probably > > we can change ReorderBufferXidHasCatalogChanges() so that it checks > > the RBTXN_HAS_CATALOG_CHANGES flag and then the initial running xacts > > array. > > > > Let's try to keep this as much similar to the master branch patch as possible. > > > BTW on backbranches, I think that the reason why we add > > initial_running_xacts stuff to ReorderBuffer is that we cannot modify > > SnapBuild that could be serialized. Can we add a (private) array for > > the initial running xacts in snapbuild.c instead of adding new > > variables to ReorderBuffer? > > > > While thinking about this, I wonder if the current patch for back > branches can lead to an ABI break as it changes the exposed structure? > If so, it may be another reason to change it to some other way > probably as you are suggesting. Yeah, it changes the size of ReorderBuffer, which is not good. Changing the function names and arguments would also break ABI. 
So probably we cannot do the above idea of removing ReorderBufferInitialXactsSetCatalogChanges() as well. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 19, 2022 at 2:01 PM Masahiko Sawada wrote: > > On Tue, Jul 19, 2022 at 4:35 PM Kyotaro Horiguchi > wrote: > > > > > > > + Assert((xcnt > 0) && (xcnt == rb->catchange_ntxns)); > > > > (xcnt > 0) is obvious here (otherwise means dlist_foreach is broken..). > > (xcnt == rb->catchange_ntxns) is not what should be checked here. The > > assert just requires that catchange_txns and catchange_ntxns are > > consistent so it should be checked just after dlist_empty.. I think. > > > > If we want to check if catchange_txns and catchange_ntxns are > consistent, should we check (xcnt == rb->catchange_ntxns) as well, no? > This function requires the caller to use rb->catchange_ntxns as the > length of the returned array. I think this assertion ensures that the > actual length of the array is consistent with the length we > pre-calculated. > Right, so, I think it is better to keep this assertion but remove (xcnt > 0) part as pointed out by Horiguchi-San. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 19, 2022 at 1:10 PM Masahiko Sawada wrote: > > On Mon, Jul 18, 2022 at 8:49 PM Amit Kapila wrote: > > > > On Sun, Jul 17, 2022 at 6:29 PM Masahiko Sawada > > wrote: > > > > > > On Fri, Jul 15, 2022 at 3:32 PM shiy.f...@fujitsu.com > > > wrote: > > > > > > > > > > I've attached patches for all supported branches including the master. > > > > > > > For back branch patches, > > * Wouldn't it be better to move purge logic into the function > > SnapBuildPurge* function for the sake of consistency? > > Agreed. > > > * Do we really need ReorderBufferInitialXactsSetCatalogChanges()? > > Can't we instead have a function similar to > > SnapBuildXidHasCatalogChanges() as we have for the master branch? That > > will avoid calling it when the snapshot > > state is SNAPBUILD_START or SNAPBUILD_BUILDING_SNAPSHOT > > Seems a good idea. We would need to pass the information about > (parsed->xinfo & XACT_XINFO_HAS_INVALS) to the function but probably > we can change ReorderBufferXidHasCatalogChanges() so that it checks > the RBTXN_HAS_CATALOG_CHANGES flag and then the initial running xacts > array. > Let's try to keep this as much similar to the master branch patch as possible. > BTW on backbranches, I think that the reason why we add > initial_running_xacts stuff to ReorderBuffer is that we cannot modify > SnapBuild that could be serialized. Can we add a (private) array for > the initial running xacts in snapbuild.c instead of adding new > variables to ReorderBuffer? > While thinking about this, I wonder if the current patch for back branches can lead to an ABI break as it changes the exposed structure? If so, it may be another reason to change it to some other way probably as you are suggesting. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 19, 2022 at 1:43 PM Kyotaro Horiguchi wrote: > > At Tue, 19 Jul 2022 16:57:14 +0900 (JST), Kyotaro Horiguchi > wrote in > > At Tue, 19 Jul 2022 16:02:26 +0900, Masahiko Sawada > > wrote in > > > On Tue, Jul 19, 2022 at 1:47 PM Amit Kapila > > > wrote: > > > > Good work. I wonder without comments this may create a problem in the > > > > future. OTOH, I don't see adding a check "catchange.xcnt > 0" before > > > > freeing the memory any less robust. Also, for consistency, we can use > > > > a similar check based on xcnt in the SnapBuildRestore to free the > > > > memory in the below code: > > > > + /* set catalog modifying transactions */ > > > > + if (builder->catchange.xip) > > > > + pfree(builder->catchange.xip); > > > > > > I would hesitate to add comments about preventing the particular > > > optimization. I think we do null-pointer-check-then-pfree many place. > > > It seems to me that checking the array length before memcpy is more > > > natural than checking both the array length and the array existence > > > before pfree. > > > > Anyway according to commit message of 46ab07ffda, POSIX forbits > > memcpy(NULL, NULL, 0). It seems to me that it is the cause of the > > false (or over) optimization. So if we add some comment, it would be > > for memcpy, not pfree.. > > For clarilty, I meant that I don't think we need that comment. > Fair enough. I think commit 46ab07ffda clearly explains why it is a good idea to add a check as Sawada-San did in his latest version. I also agree that we don't any comment for this change. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 19, 2022 at 4:35 PM Kyotaro Horiguchi wrote: Thank you for the comments! > > At Tue, 19 Jul 2022 10:17:15 +0530, Amit Kapila > wrote in > > Good work. I wonder without comments this may create a problem in the > > future. OTOH, I don't see adding a check "catchange.xcnt > 0" before > > freeing the memory any less robust. Also, for consistency, we can use > > a similar check based on xcnt in the SnapBuildRestore to free the > > memory in the below code: > > + /* set catalog modifying transactions */ > > + if (builder->catchange.xip) > > + pfree(builder->catchange.xip); > > But xip must be positive there. We can add a comment explains that. > Yes, if we add the comment for it, probably we need to explain a gcc's optimization but it seems to be too much to me. > > +* Array of transactions and subtransactions that had modified > catalogs > +* and were running when the snapshot was serialized. > +* > +* We normally rely on HEAP2_NEW_CID and XLOG_XACT_INVALIDATIONS > records to > +* know if the transaction has changed the catalog. But it could > happen that > +* the logical decoding decodes only the commit record of the > transaction. > +* This array keeps track of the transactions that have modified > catalogs > > (Might be only me, but) "track" makes me think that xids are added and > removed by activities. On the other hand the array just remembers > catalog-modifying xids in the last life until the all xids in the list > gone. > > +* and were running when serializing a snapshot, and this array is > used to > +* add such transactions to the snapshot. > +* > +* This array is set once when restoring the snapshot, xids are > removed > > (So I want to add "only" between "are removed"). > > +* from the array when decoding xl_running_xacts record, and then > eventually > +* becomes empty. Agreed. WIll fix. 
> > > + catchange_xip = ReorderBufferGetCatalogChangesXacts(builder->reorder); > > catchange_xip is allocated in the current context, but ondisk is > allocated in builder->context. I see it kind of inconsistent (even if > the current context is same with build->context). Right. I thought that since the lifetime of catchange_xip is short, until the end of SnapBuildSerialize() function we didn't need to allocate it in builder->context. But given ondisk, we need to do that for catchange_xip as well. Will fix it. > > > + if (builder->committed.xcnt > 0) > + { > > It seems to me comitted.xip is always non-null, so we don't need this. > I don't strongly object to do that, though. But committed.xcnt could be 0, right? We don't need to copy anything by calling memcpy with size = 0 in this case. Also, it looks more consistent with what we do for catchange_xcnt. > > -* Remove TXN from its containing list. > +* Remove TXN from its containing lists. > > The comment body only describes abut txn->nodes. I think we need to > add that for catchange_node. Will add. > > > + Assert((xcnt > 0) && (xcnt == rb->catchange_ntxns)); > > (xcnt > 0) is obvious here (otherwise means dlist_foreach is broken..). > (xcnt == rb->catchange_ntxns) is not what should be checked here. The > assert just requires that catchange_txns and catchange_ntxns are > consistent so it should be checked just after dlist_empty.. I think. > If we want to check if catchange_txns and catchange_ntxns are consistent, should we check (xcnt == rb->catchange_ntxns) as well, no? This function requires the caller to use rb->catchange_ntxns as the length of the returned array. I think this assertion ensures that the actual length of the array is consistent with the length we pre-calculated. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
At Tue, 19 Jul 2022 16:57:14 +0900 (JST), Kyotaro Horiguchi wrote in > At Tue, 19 Jul 2022 16:02:26 +0900, Masahiko Sawada > wrote in > > On Tue, Jul 19, 2022 at 1:47 PM Amit Kapila wrote: > > > Good work. I wonder without comments this may create a problem in the > > > future. OTOH, I don't see adding a check "catchange.xcnt > 0" before > > > freeing the memory any less robust. Also, for consistency, we can use > > > a similar check based on xcnt in the SnapBuildRestore to free the > > > memory in the below code: > > > + /* set catalog modifying transactions */ > > > + if (builder->catchange.xip) > > > + pfree(builder->catchange.xip); > > > > I would hesitate to add comments about preventing the particular > > optimization. I think we do null-pointer-check-then-pfree many place. > > It seems to me that checking the array length before memcpy is more > > natural than checking both the array length and the array existence > > before pfree. > > Anyway according to commit message of 46ab07ffda, POSIX forbits > memcpy(NULL, NULL, 0). It seems to me that it is the cause of the > false (or over) optimization. So if we add some comment, it would be > for memcpy, not pfree.. For clarity, I meant that I don't think we need that comment. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
At Tue, 19 Jul 2022 16:02:26 +0900, Masahiko Sawada wrote in > On Tue, Jul 19, 2022 at 1:47 PM Amit Kapila wrote: > > Good work. I wonder without comments this may create a problem in the > > future. OTOH, I don't see adding a check "catchange.xcnt > 0" before > > freeing the memory any less robust. Also, for consistency, we can use > > a similar check based on xcnt in the SnapBuildRestore to free the > > memory in the below code: > > + /* set catalog modifying transactions */ > > + if (builder->catchange.xip) > > + pfree(builder->catchange.xip); > > I would hesitate to add comments about preventing the particular > optimization. I think we do null-pointer-check-then-pfree many place. > It seems to me that checking the array length before memcpy is more > natural than checking both the array length and the array existence > before pfree. Anyway, according to the commit message of 46ab07ffda, POSIX forbids memcpy(NULL, NULL, 0). It seems to me that that is the cause of the false (or over) optimization. So if we add some comment, it would be for memcpy, not pfree.. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 19, 2022 at 4:28 PM osumi.takami...@fujitsu.com wrote: > > On Sunday, July 17, 2022 9:59 PM Masahiko Sawada > wrote: > > I've attached patches for all supported branches including the master. > Hi, > > > Minor comments for REL14. > > (1) There are some foreign characters in the patches (in the commit message) > > When I had a look at your patch for back branches with some editor, > I could see some unfamiliar full-width characters like below two cases, > mainly around "single quotes" in the sentences. > > Could you please check the entire patches, > probably by some tool that helps you to detect this kind of characters ? > > * the 2nd paragraph of the commit message > > ...mark the transaction as containing catalog changes if it窶冱 in the list of > the > initial running transactions ... > > * the 3rd paragraph of the same > > It doesn窶冲 have the information on which (sub) transaction has catalog > changes > > FYI, this comment applies to other patches for REL13, REL12, REL11, REL10. > > > (2) typo in the commit message > > FROM: > To fix this problem, this change the reorder buffer so that... > TO: > To fix this problem, this changes the reorder buffer so that... > > > (3) typo in ReorderBufferProcessInitialXacts > > + /* > +* Remove transactions that would have been processed and we don't > need to > +* keep track off anymore. > > > Kindly change > FROM: > keep track off > TO: > keep track of Thank you for the comments! I'll address these comments in the next version patch. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Mon, Jul 18, 2022 at 8:49 PM Amit Kapila wrote: > > On Sun, Jul 17, 2022 at 6:29 PM Masahiko Sawada wrote: > > > > On Fri, Jul 15, 2022 at 3:32 PM shiy.f...@fujitsu.com > > wrote: > > > > > > > I've attached patches for all supported branches including the master. > > > > For back branch patches, > * Wouldn't it be better to move purge logic into the function > SnapBuildPurge* function for the sake of consistency? Agreed. > * Do we really need ReorderBufferInitialXactsSetCatalogChanges()? > Can't we instead have a function similar to > SnapBuildXidHasCatalogChanges() as we have for the master branch? That > will avoid calling it when the snapshot > state is SNAPBUILD_START or SNAPBUILD_BUILDING_SNAPSHOT Seems a good idea. We would need to pass the information about (parsed->xinfo & XACT_XINFO_HAS_INVALS) to the function but probably we can change ReorderBufferXidHasCatalogChanges() so that it checks the RBTXN_HAS_CATALOG_CHANGES flag and then the initial running xacts array. BTW on backbranches, I think that the reason why we add initial_running_xacts stuff to ReorderBuffer is that we cannot modify SnapBuild that could be serialized. Can we add a (private) array for the initial running xacts in snapbuild.c instead of adding new variables to ReorderBuffer? That way, the code would become more consistent with the changes on the master branch. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
At Tue, 19 Jul 2022 10:17:15 +0530, Amit Kapila wrote in > Good work. I wonder without comments this may create a problem in the > future. OTOH, I don't see adding a check "catchange.xcnt > 0" before > freeing the memory any less robust. Also, for consistency, we can use > a similar check based on xcnt in the SnapBuildRestore to free the > memory in the below code: > + /* set catalog modifying transactions */ > + if (builder->catchange.xip) > + pfree(builder->catchange.xip); But xip must be positive there. We can add a comment that explains that. +* Array of transactions and subtransactions that had modified catalogs +* and were running when the snapshot was serialized. +* +* We normally rely on HEAP2_NEW_CID and XLOG_XACT_INVALIDATIONS records to +* know if the transaction has changed the catalog. But it could happen that +* the logical decoding decodes only the commit record of the transaction. +* This array keeps track of the transactions that have modified catalogs (Might be only me, but) "track" makes me think that xids are added and removed by activities. On the other hand the array just remembers catalog-modifying xids from the last life until all the xids in the list are gone. +* and were running when serializing a snapshot, and this array is used to +* add such transactions to the snapshot. +* +* This array is set once when restoring the snapshot, xids are removed (So I want to add "only" between "are removed"). +* from the array when decoding xl_running_xacts record, and then eventually +* becomes empty. + catchange_xip = ReorderBufferGetCatalogChangesXacts(builder->reorder); catchange_xip is allocated in the current context, but ondisk is allocated in builder->context. I see it as kind of inconsistent (even if the current context is the same as builder->context). + if (builder->committed.xcnt > 0) + { It seems to me committed.xip is always non-null, so we don't need this. I don't strongly object to doing that, though. -* Remove TXN from its containing list.
+* Remove TXN from its containing lists. The comment body only describes txn->nodes. I think we need to add that for catchange_node. + Assert((xcnt > 0) && (xcnt == rb->catchange_ntxns)); (xcnt > 0) is obvious here (otherwise means dlist_foreach is broken..). (xcnt == rb->catchange_ntxns) is not what should be checked here. The assert just requires that catchange_txns and catchange_ntxns are consistent so it should be checked just after dlist_empty.. I think. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Sunday, July 17, 2022 9:59 PM Masahiko Sawada wrote: > I've attached patches for all supported branches including the master. Hi, Minor comments for REL14. (1) There are some foreign characters in the patches (in the commit message) When I had a look at your patch for back branches with some editor, I could see some unfamiliar full-width characters like below two cases, mainly around "single quotes" in the sentences. Could you please check the entire patches, probably by some tool that helps you to detect this kind of characters ? * the 2nd paragraph of the commit message ...mark the transaction as containing catalog changes if it窶冱 in the list of the initial running transactions ... * the 3rd paragraph of the same It doesn窶冲 have the information on which (sub) transaction has catalog changes FYI, this comment applies to other patches for REL13, REL12, REL11, REL10. (2) typo in the commit message FROM: To fix this problem, this change the reorder buffer so that... TO: To fix this problem, this changes the reorder buffer so that... (3) typo in ReorderBufferProcessInitialXacts + /* +* Remove transactions that would have been processed and we don't need to +* keep track off anymore. Kindly change FROM: keep track off TO: keep track of Best Regards, Takamichi Osumi
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 19, 2022 at 1:47 PM Amit Kapila wrote: > > On Tue, Jul 19, 2022 at 6:34 AM Masahiko Sawada wrote: > > > > On Mon, Jul 18, 2022 at 1:12 PM Amit Kapila wrote: > > > > > > On Fri, Jul 15, 2022 at 8:09 PM Masahiko Sawada > > > wrote: > > > > > > > > This patch should have the fix for the issue that Shi yu reported. Shi > > > > yu, could you please test it again with this patch? > > > > > > > > > > Can you explain the cause of the failure and your fix for the same? > > > > @@ -1694,6 +1788,8 @@ out: > > /* be tidy */ > > if (ondisk) > > pfree(ondisk); > > + if (catchange_xip) > > + pfree(catchange_xip); > > > > Regarding the above code in the previous version patch, looking at the > > generated assembler code shared by Shi yu offlist, I realized that the > > “if (catchange_xip)” is removed (folded) by gcc optimization. This is > > because we dereference catchange_xip before null-pointer check as > > follow: > > > > + /* copy catalog modifying xacts */ > > + sz = sizeof(TransactionId) * catchange_xcnt; > > + memcpy(ondisk_c, catchange_xip, sz); > > + COMP_CRC32C(ondisk->checksum, ondisk_c, sz); > > + ondisk_c += sz; > > > > Since sz is 0 in this case, memcpy doesn’t do anything actually. > > > > By checking the assembler code, I’ve confirmed that gcc does the > > optimization for these code and setting > > -fno-delete-null-pointer-checks flag prevents the if statement from > > being folded. Also, I’ve confirmed that adding the check if > > "catchange.xcnt > 0” before the null-pointer check also can prevent > > that. Adding a check if "catchange.xcnt > 0” looks more robust. I’ve > > added a similar check for builder->committed.xcnt as well for > > consistency. builder->committed.xip could have no transactions. > > > > Good work. I wonder without comments this may create a problem in the > future. OTOH, I don't see adding a check "catchange.xcnt > 0" before > freeing the memory any less robust. 
Also, for consistency, we can use > a similar check based on xcnt in the SnapBuildRestore to free the > memory in the below code: > + /* set catalog modifying transactions */ > + if (builder->catchange.xip) > + pfree(builder->catchange.xip); I would hesitate to add comments about preventing the particular optimization. I think we do null-pointer-check-then-pfree in many places. It seems to me that checking the array length before memcpy is more natural than checking both the array length and the array existence before pfree. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 19, 2022 at 6:34 AM Masahiko Sawada wrote: > > On Mon, Jul 18, 2022 at 1:12 PM Amit Kapila wrote: > > > > On Fri, Jul 15, 2022 at 8:09 PM Masahiko Sawada > > wrote: > > > > > > This patch should have the fix for the issue that Shi yu reported. Shi > > > yu, could you please test it again with this patch? > > > > > > > Can you explain the cause of the failure and your fix for the same? > > @@ -1694,6 +1788,8 @@ out: > /* be tidy */ > if (ondisk) > pfree(ondisk); > + if (catchange_xip) > + pfree(catchange_xip); > > Regarding the above code in the previous version patch, looking at the > generated assembler code shared by Shi yu offlist, I realized that the > “if (catchange_xip)” is removed (folded) by gcc optimization. This is > because we dereference catchange_xip before null-pointer check as > follow: > > + /* copy catalog modifying xacts */ > + sz = sizeof(TransactionId) * catchange_xcnt; > + memcpy(ondisk_c, catchange_xip, sz); > + COMP_CRC32C(ondisk->checksum, ondisk_c, sz); > + ondisk_c += sz; > > Since sz is 0 in this case, memcpy doesn’t do anything actually. > > By checking the assembler code, I’ve confirmed that gcc does the > optimization for these code and setting > -fno-delete-null-pointer-checks flag prevents the if statement from > being folded. Also, I’ve confirmed that adding the check if > "catchange.xcnt > 0” before the null-pointer check also can prevent > that. Adding a check if "catchange.xcnt > 0” looks more robust. I’ve > added a similar check for builder->committed.xcnt as well for > consistency. builder->committed.xip could have no transactions. > Good work. I wonder without comments this may create a problem in the future. OTOH, I don't see adding a check "catchange.xcnt > 0" before freeing the memory any less robust. 
Also, for consistency, we can use a similar check based on xcnt in the SnapBuildRestore to free the memory in the below code: + /* set catalog modifying transactions */ + if (builder->catchange.xip) + pfree(builder->catchange.xip); -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Mon, Jul 18, 2022 at 12:28 PM shiy.f...@fujitsu.com wrote: > > On Fri, Jul 15, 2022 10:39 PM Masahiko Sawada wrote: > > > > This patch should have the fix for the issue that Shi yu reported. Shi > > yu, could you please test it again with this patch? > > > > Thanks for updating the patch! > I have tested and confirmed that the problem I found has been fixed. Thank you for testing! Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Mon, Jul 18, 2022 at 1:12 PM Amit Kapila wrote:
>
> On Fri, Jul 15, 2022 at 8:09 PM Masahiko Sawada wrote:
> >
> > This patch should have the fix for the issue that Shi yu reported. Shi
> > yu, could you please test it again with this patch?
> >
>
> Can you explain the cause of the failure and your fix for the same?

@@ -1694,6 +1788,8 @@ out:
 	/* be tidy */
 	if (ondisk)
 		pfree(ondisk);
+	if (catchange_xip)
+		pfree(catchange_xip);

Regarding the above code in the previous version patch, looking at the generated assembler code shared by Shi yu offlist, I realized that the "if (catchange_xip)" is removed (folded) by gcc optimization. This is because we dereference catchange_xip before the null-pointer check, as follows:

+	/* copy catalog modifying xacts */
+	sz = sizeof(TransactionId) * catchange_xcnt;
+	memcpy(ondisk_c, catchange_xip, sz);
+	COMP_CRC32C(ondisk->checksum, ondisk_c, sz);
+	ondisk_c += sz;

Since sz is 0 in this case, memcpy doesn't actually do anything.

By checking the assembler code, I've confirmed that gcc does this optimization for this code, and that setting the -fno-delete-null-pointer-checks flag prevents the if statement from being folded. Also, I've confirmed that adding a check for "catchange.xcnt > 0" before the null-pointer check can also prevent that. Adding a check for "catchange.xcnt > 0" looks more robust. I've added a similar check for builder->committed.xcnt as well, for consistency, since builder->committed.xip could have no transactions.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
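The hazard described in this message — gcc eliding a NULL check after the pointer has already been handed to memcpy() — and the chosen mitigation (testing the count rather than the pointer) can be sketched in a few lines of plain C. This is an illustrative reconstruction, not the patch itself; serialize_xids and its arguments are invented names:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Sketch of the fix discussed above. Passing a NULL pointer to memcpy()
 * is undefined behavior even when the size is 0, which is what allows
 * gcc (via -fdelete-null-pointer-checks, implied at -O2) to fold away a
 * later "if (ptr)" test on the same pointer. Guarding the copy on the
 * element count keeps NULL out of memcpy() entirely. All names here are
 * illustrative, not PostgreSQL's actual code.
 */
size_t
serialize_xids(char *dst, const uint32_t *xip, size_t xcnt)
{
    size_t  sz = sizeof(uint32_t) * xcnt;

    if (xcnt > 0)               /* never pass a possibly-NULL xip to memcpy */
        memcpy(dst, xip, sz);

    return sz;                  /* bytes written; 0 when xcnt == 0 */
}
```

With xcnt == 0 and xip == NULL, the function returns 0 without ever touching the pointer, so the compiler retains no license to delete a later NULL check on xip.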
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Sun, Jul 17, 2022 at 6:29 PM Masahiko Sawada wrote: > > On Fri, Jul 15, 2022 at 3:32 PM shiy.f...@fujitsu.com > wrote: > > > > I've attached patches for all supported branches including the master. > For back branch patches, * Wouldn't it be better to move purge logic into the function SnapBuildPurge* function for the sake of consistency? * Do we really need ReorderBufferInitialXactsSetCatalogChanges()? Can't we instead have a function similar to SnapBuildXidHasCatalogChanges() as we have for the master branch? That will avoid calling it when the snapshot state is SNAPBUILD_START or SNAPBUILD_BUILDING_SNAPSHOT -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Fri, Jul 15, 2022 at 8:09 PM Masahiko Sawada wrote: > > This patch should have the fix for the issue that Shi yu reported. Shi > yu, could you please test it again with this patch? > Can you explain the cause of the failure and your fix for the same? -- With Regards, Amit Kapila.
RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Fri, Jul 15, 2022 10:39 PM Masahiko Sawada wrote: > > This patch should have the fix for the issue that Shi yu reported. Shi > yu, could you please test it again with this patch? > Thanks for updating the patch! I have tested and confirmed that the problem I found has been fixed. Regards, Shi yu
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Fri, Jul 15, 2022 at 3:32 PM shiy.f...@fujitsu.com wrote: > > On Mon, Jul 11, 2022 9:54 PM Masahiko Sawada wrote: > > > > I've attached an updated patch, please review it. > > > > Thanks for your patch. Here are some comments for the REL14-v1 patch. > > 1. > + Sizesz = sizeof(TransactionId) * nxacts;; > > There is a redundant semicolon at the end. > > 2. > + workspace = MemoryContextAlloc(rb->context, > rb->n_initial_running_xacts); > > Should it be: > + workspace = MemoryContextAlloc(rb->context, sizeof(TransactionId) * > rb->n_initial_running_xacts); > > 3. > + /* bound check if there is at least one transaction to be removed */ > + if (NormalTransactionIdPrecedes(rb->initial_running_xacts[0], > + > running->oldestRunningXid)) > + return; > + > > Here, I think it should return if rb->initial_running_xacts[0] is older than > oldestRunningXid, right? Should it be changed to: > > + if (!NormalTransactionIdPrecedes(rb->initial_running_xacts[0], > + > running->oldestRunningXid)) > + return; > > 4. > + if ((parsed->xinfo & XACT_XINFO_HAS_INVALS) != 0) > > Maybe we can change it like the following, to be consistent with other places > in > this file. It's also fine if you don't change it. > > + if (parsed->xinfo & XACT_XINFO_HAS_INVALS) Thank you for the comments! I've attached patches for all supported branches including the master. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ REL13-v6-0001-Fix-catalog-lookup-with-the-wrong-snapshot-during.patch Description: Binary data REL12-v6-0001-Fix-catalog-lookup-with-the-wrong-snapshot-during.patch Description: Binary data REL14-v6-0001-Fix-catalog-lookup-with-the-wrong-snapshot-during.patch Description: Binary data REL11-v6-0001-Fix-catalog-lookup-with-the-wrong-snapshot-during.patch Description: Binary data REL10-v6-0001-Fix-catalog-lookup-with-the-wrong-snapshot-during.patch Description: Binary data master-v6-0001-Add-catalog-modifying-transactions-to-logical-dec.patch Description: Binary data
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Fri, Jul 15, 2022 at 10:43 PM osumi.takami...@fujitsu.com wrote: > > On Thursday, July 14, 2022 10:31 AM Masahiko Sawada > wrote: > > I've attached an updated patch that incorporated comments from Amit and Shi. > Hi, > > > Minor comments for v4. Thank you for the comments! > > (1) typo in the commit message > > "When decoding a COMMIT record, we check both the list and the ReorderBuffer > to see if > if the transaction has modified catalogs." > > There are two 'if's in succession in the last sentence of the second > paragraph. > > (2) The header comment for the spec test > > +# Test that decoding only the commit record of the transaction that have > +# catalog-changed. > > Rewording of this part looks required, because "test that ... " requires a > complete sentence > after that, right ? > > > (3) SnapBuildRestore > > snapshot_not_interesting: > if (ondisk.builder.committed.xip != NULL) > pfree(ondisk.builder.committed.xip); > return false; > } > > Do we need to add pfree for ondisk.builder.catchange.xip after the > 'snapshot_not_interesting' label ? > > > (4) SnapBuildPurgeOlderTxn > > + elog(DEBUG3, "purged catalog modifying transactions from %d > to %d", > +(uint32) builder->catchange.xcnt, surviving_xids); > > To make this part more aligned with existing codes, > probably we can have a look at another elog for debug in the same function. > > We should use %u for casted xcnt & surviving_xids, > while adding a format for xmin if necessary ? I agreed with all the above comments and incorporated them into the updated patch. This patch should have the fix for the issue that Shi yu reported. Shi yu, could you please test it again with this patch? Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ v5-0001-Add-catalog-modifying-transactions-to-logical-dec.patch Description: Binary data
RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Thursday, July 14, 2022 10:31 AM Masahiko Sawada wrote: > I've attached an updated patch that incorporated comments from Amit and Shi. Hi, Minor comments for v4. (1) typo in the commit message "When decoding a COMMIT record, we check both the list and the ReorderBuffer to see if if the transaction has modified catalogs." There are two 'if's in succession in the last sentence of the second paragraph. (2) The header comment for the spec test +# Test that decoding only the commit record of the transaction that have +# catalog-changed. Rewording of this part looks required, because "test that ... " requires a complete sentence after that, right ? (3) SnapBuildRestore snapshot_not_interesting: if (ondisk.builder.committed.xip != NULL) pfree(ondisk.builder.committed.xip); return false; } Do we need to add pfree for ondisk.builder.catchange.xip after the 'snapshot_not_interesting' label ? (4) SnapBuildPurgeOlderTxn + elog(DEBUG3, "purged catalog modifying transactions from %d to %d", +(uint32) builder->catchange.xcnt, surviving_xids); To make this part more aligned with existing codes, probably we can have a look at another elog for debug in the same function. We should use %u for casted xcnt & surviving_xids, while adding a format for xmin if necessary ? Best Regards, Takamichi Osumi
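Comment (4) above is about conversion specifiers: printing a uint32 counter with %d misrenders any value above INT_MAX as a negative number. A small standalone illustration, with snprintf() standing in for elog() and an invented helper name:

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Illustration of why "%u" is the right conversion for uint32 counters
 * such as xcnt: with "%d", any value above INT_MAX wraps to a negative
 * number. snprintf() stands in for elog(); format_xcnt is an invented
 * helper, not PostgreSQL code.
 */
void
format_xcnt(uint32_t xcnt, char *with_d, char *with_u, size_t len)
{
    /* implementation-defined wrap for values > INT_MAX: prints negative */
    snprintf(with_d, len, "%d", (int32_t) xcnt);

    /* correct rendering across the full uint32 range */
    snprintf(with_u, len, "%u", xcnt);
}
```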
RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Mon, Jul 11, 2022 9:54 PM Masahiko Sawada wrote: > > I've attached an updated patch, please review it. > Thanks for your patch. Here are some comments for the REL14-v1 patch. 1. + Sizesz = sizeof(TransactionId) * nxacts;; There is a redundant semicolon at the end. 2. + workspace = MemoryContextAlloc(rb->context, rb->n_initial_running_xacts); Should it be: + workspace = MemoryContextAlloc(rb->context, sizeof(TransactionId) * rb->n_initial_running_xacts); 3. + /* bound check if there is at least one transaction to be removed */ + if (NormalTransactionIdPrecedes(rb->initial_running_xacts[0], + running->oldestRunningXid)) + return; + Here, I think it should return if rb->initial_running_xacts[0] is older than oldestRunningXid, right? Should it be changed to: + if (!NormalTransactionIdPrecedes(rb->initial_running_xacts[0], + running->oldestRunningXid)) + return; 4. + if ((parsed->xinfo & XACT_XINFO_HAS_INVALS) != 0) Maybe we can change it like the following, to be consistent with other places in this file. It's also fine if you don't change it. + if (parsed->xinfo & XACT_XINFO_HAS_INVALS) Regards, Shi yu
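Review comment 3 above concerns the early-return guard in the purge of remembered running transactions: the function should return without purging unless at least the oldest (first) entry precedes oldestRunningXid. A compact sketch of the corrected logic, using a plain unsigned comparison in place of NormalTransactionIdPrecedes (which additionally handles XID wraparound); all names are illustrative:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Sketch of the purge discussed in comment 3: discard entries of the
 * remembered running-xacts array that are older than oldest_running.
 * Assumes the array is sorted in ascending order. Plain "<" stands in
 * for NormalTransactionIdPrecedes; names are illustrative.
 */
size_t
purge_initial_running_xacts(uint32_t *xacts, size_t nxacts,
                            uint32_t oldest_running)
{
    size_t  keep = 0;

    /* bound check: bail out unless at least the oldest entry is removable */
    if (nxacts == 0 || !(xacts[0] < oldest_running))
        return nxacts;

    /* keep only XIDs that do not precede oldest_running */
    for (size_t i = 0; i < nxacts; i++)
        if (!(xacts[i] < oldest_running))
            xacts[keep++] = xacts[i];

    return keep;                /* new element count after the purge */
}
```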
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Thu, Jul 14, 2022 at 12:06 PM Masahiko Sawada wrote: > > On Thu, Jul 14, 2022 at 11:16 AM shiy.f...@fujitsu.com > wrote: > > > > On Tue, Jul 12, 2022 5:23 PM Masahiko Sawada wrote: > > > > > > On Tue, Jul 12, 2022 at 5:58 PM shiy.f...@fujitsu.com > > > wrote: > > > > > > > > It happened when executing the following code because it tried to free a > > > NULL > > > > pointer (catchange_xip). > > > > > > > > /* be tidy */ > > > > if (ondisk) > > > > pfree(ondisk); > > > > + if (catchange_xip) > > > > + pfree(catchange_xip); > > > > } > > > > > > > > It seems to be related to configure option. I could reproduce it when > > > > using > > > > `./configure --enable-debug`. > > > > But I couldn't reproduce with `./configure --enable-debug CFLAGS="-Og - > > > ggdb"`. > > > > > > Hmm, I could not reproduce this problem even if I use ./configure > > > --enable-debug. And it's weird that we checked if catchange_xip is not > > > null but we did pfree for it: > > > > > > #1 pfree (pointer=0x0) at mcxt.c:1177 > > > #2 0x0078186b in SnapBuildSerialize (builder=0x1fd5e78, > > > lsn=25719712) at snapbuild.c:1792 > > > > > > Is it reproducible in your environment? > > > > Thanks for your reply! Yes, it is reproducible. And I also reproduced it on > > the > > v4 patch you posted [1]. > > Thank you for testing! I've found out the exact cause of this problem and how to fix it. I'll submit an updated patch next week with my analysis. Thank you for testing and providing additional information off-list, Shi yu. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Thu, Jul 14, 2022 at 11:16 AM shiy.f...@fujitsu.com wrote: > > On Tue, Jul 12, 2022 5:23 PM Masahiko Sawada wrote: > > > > On Tue, Jul 12, 2022 at 5:58 PM shiy.f...@fujitsu.com > > wrote: > > > > > > It happened when executing the following code because it tried to free a > > NULL > > > pointer (catchange_xip). > > > > > > /* be tidy */ > > > if (ondisk) > > > pfree(ondisk); > > > + if (catchange_xip) > > > + pfree(catchange_xip); > > > } > > > > > > It seems to be related to configure option. I could reproduce it when > > > using > > > `./configure --enable-debug`. > > > But I couldn't reproduce with `./configure --enable-debug CFLAGS="-Og - > > ggdb"`. > > > > Hmm, I could not reproduce this problem even if I use ./configure > > --enable-debug. And it's weird that we checked if catchange_xip is not > > null but we did pfree for it: > > > > #1 pfree (pointer=0x0) at mcxt.c:1177 > > #2 0x0078186b in SnapBuildSerialize (builder=0x1fd5e78, > > lsn=25719712) at snapbuild.c:1792 > > > > Is it reproducible in your environment? > > Thanks for your reply! Yes, it is reproducible. And I also reproduced it on > the > v4 patch you posted [1]. Thank you for testing! > > [1] > https://www.postgresql.org/message-id/CAD21AoAyNPrOFg%2BQGh%2B%3D4205TU0%3DyrE%2BQyMgzStkH85uBZXptQ%40mail.gmail.com > > > If so, could you test it again > > with the following changes? > > > > diff --git a/src/backend/replication/logical/snapbuild.c > > b/src/backend/replication/logical/snapbuild.c > > index d015c06ced..a6e76e3781 100644 > > --- a/src/backend/replication/logical/snapbuild.c > > +++ b/src/backend/replication/logical/snapbuild.c > > @@ -1788,7 +1788,7 @@ out: > > /* be tidy */ > > if (ondisk) > > pfree(ondisk); > > - if (catchange_xip) > > + if (catchange_xip != NULL) > > pfree(catchange_xip); > > } > > > > I tried this and could still reproduce the problem. Does the backtrace still show we attempt to pfree a null-pointer? 
> >

> Besides, I tried the suggestion from Amit [2], it could be fixed by checking
> the value of catchange_xcnt instead of catchange_xip before pfree.

Could you check whether this problem occurred when we reached there via the goto path, i.e., whether we called ReorderBufferGetCatalogChangesXacts() or not?

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 12, 2022 5:23 PM Masahiko Sawada wrote: > > On Tue, Jul 12, 2022 at 5:58 PM shiy.f...@fujitsu.com > wrote: > > > > It happened when executing the following code because it tried to free a > NULL > > pointer (catchange_xip). > > > > /* be tidy */ > > if (ondisk) > > pfree(ondisk); > > + if (catchange_xip) > > + pfree(catchange_xip); > > } > > > > It seems to be related to configure option. I could reproduce it when using > > `./configure --enable-debug`. > > But I couldn't reproduce with `./configure --enable-debug CFLAGS="-Og - > ggdb"`. > > Hmm, I could not reproduce this problem even if I use ./configure > --enable-debug. And it's weird that we checked if catchange_xip is not > null but we did pfree for it: > > #1 pfree (pointer=0x0) at mcxt.c:1177 > #2 0x0078186b in SnapBuildSerialize (builder=0x1fd5e78, > lsn=25719712) at snapbuild.c:1792 > > Is it reproducible in your environment? Thanks for your reply! Yes, it is reproducible. And I also reproduced it on the v4 patch you posted [1]. [1] https://www.postgresql.org/message-id/CAD21AoAyNPrOFg%2BQGh%2B%3D4205TU0%3DyrE%2BQyMgzStkH85uBZXptQ%40mail.gmail.com > If so, could you test it again > with the following changes? > > diff --git a/src/backend/replication/logical/snapbuild.c > b/src/backend/replication/logical/snapbuild.c > index d015c06ced..a6e76e3781 100644 > --- a/src/backend/replication/logical/snapbuild.c > +++ b/src/backend/replication/logical/snapbuild.c > @@ -1788,7 +1788,7 @@ out: > /* be tidy */ > if (ondisk) > pfree(ondisk); > - if (catchange_xip) > + if (catchange_xip != NULL) > pfree(catchange_xip); > } > I tried this and could still reproduce the problem. Besides, I tried the suggestion from Amit [2], it could be fixed by checking the value of catchange_xcnt instead of catchange_xip before pfree. 
[2] https://www.postgresql.org/message-id/CAA4eK1%2BXPdm8G%3DEhUJA12Pi1YvQAfcz2%3DkTd9a4BjVx4%3Dgk-MA%40mail.gmail.com

diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index c482e906b0..68b9c4ef7d 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -1573,7 +1573,7 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn)
 	Size		needed_length;
 	SnapBuildOnDisk *ondisk = NULL;
 	TransactionId *catchange_xip = NULL;
-	size_t		catchange_xcnt;
+	size_t		catchange_xcnt = 0;
 	char	   *ondisk_c;
 	int			fd;
 	char		tmppath[MAXPGPATH];
@@ -1788,7 +1788,7 @@ out:
 	/* be tidy */
 	if (ondisk)
 		pfree(ondisk);
-	if (catchange_xip)
+	if (catchange_xcnt != 0)
 		pfree(catchange_xip);
 }

Regards,
Shi yu
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 12, 2022 at 12:40 PM shiy.f...@fujitsu.com wrote: > > On Tue, Jul 12, 2022 8:49 AM Masahiko Sawada wrote: > > > > I've attached an updated patch. > > > > While trying this idea, I noticed there is no API to get the length of > > dlist, as we discussed offlist. Alternative idea was to use List > > (T_XidList) but I'm not sure it's a great idea since deleting an xid > > from the list is O(N), we need to implement list_delete_xid, and we > > need to make sure allocating list node in the reorder buffer context. > > So in the patch, I added a variable, catchange_ntxns, to keep track of > > the length of the dlist. Please review it. > > > > Thanks for your patch. Here are some comments on the master patch. Thank you for the comments. > > 1. > In catalog_change_snapshot.spec, should we use "RUNNING_XACTS record" instead > of > "RUNNING_XACT record" / "XACT_RUNNING record" in the comment? > > 2. > +* Since catchange.xip is sorted, we find the lower bound of > +* xids that sill are interesting. > > Typo? > "sill" -> "still" > > 3. > +* This array is set once when restoring the snapshot, xids are > removed > +* from the array when decoding xl_running_xacts record, and then > eventually > +* becomes an empty. > > + /* catchange list becomes an empty */ > + pfree(builder->catchange.xip); > + builder->catchange.xip = NULL; > > Should "becomes an empty" be modified to "becomes empty"? > > 4. > + * changes that are smaller than ->xmin. Those won't ever get checked via > + * the ->committed array and ->catchange, respectively. The committed xids > will > > Should we change > "the ->committed array and ->catchange" > to > "the ->committed or ->catchange array" > ? Agreed with all the above comments. These are incorporated in the latest v4 patch I just sent[1]. Regards, [1] https://www.postgresql.org/message-id/CAD21AoAyNPrOFg%2BQGh%2B%3D4205TU0%3DyrE%2BQyMgzStkH85uBZXptQ%40mail.gmail.com -- Masahiko Sawada EDB: https://www.enterprisedb.com/
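The first paragraph of the quoted message touches a small data-structure point: PostgreSQL's dlist offers no O(1) length operation, so the patch keeps a separate counter (catchange_ntxns) in step with the list. The idea in miniature, with invented names rather than the actual dlist API:

```c
#include <stddef.h>

/*
 * Minimal illustration of the bookkeeping described above: a circular
 * doubly linked list (in the spirit of PostgreSQL's dlist) has no O(1)
 * length operation, so a counter is updated on every insert and delete,
 * like the patch's catchange_ntxns. All names are illustrative.
 */
typedef struct list_node
{
    struct list_node *prev;
    struct list_node *next;
} list_node;

typedef struct counted_list
{
    list_node   head;           /* circular sentinel */
    size_t      ntxns;          /* element count, kept in sync manually */
} counted_list;

void
clist_init(counted_list *l)
{
    l->head.prev = l->head.next = &l->head;
    l->ntxns = 0;
}

void
clist_push_tail(counted_list *l, list_node *n)
{
    n->prev = l->head.prev;
    n->next = &l->head;
    l->head.prev->next = n;
    l->head.prev = n;
    l->ntxns++;                 /* the only O(1) way to know the length */
}

void
clist_delete(counted_list *l, list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    l->ntxns--;                 /* must mirror every removal */
}
```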
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 12, 2022 at 5:52 PM Amit Kapila wrote: > > On Tue, Jul 12, 2022 at 1:13 PM Masahiko Sawada wrote: > > > > On Tue, Jul 12, 2022 at 3:25 PM Amit Kapila wrote: > > > > > > On Tue, Jul 12, 2022 at 11:38 AM Masahiko Sawada > > > wrote: > > > > > > > > On Tue, Jul 12, 2022 at 10:28 AM Masahiko Sawada > > > > wrote: > > > > > > > > > > > > > > > I'm doing benchmark tests and will share the results. > > > > > > > > > > > > > I've done benchmark tests to measure the overhead introduced by doing > > > > bsearch() every time when decoding a commit record. I've simulated a > > > > very intensified situation where we decode 1M commit records while > > > > keeping builder->catchange.xip array but the overhead is negilible: > > > > > > > > HEAD: 584 ms > > > > Patched: 614 ms > > > > > > > > I've attached the benchmark script I used. With increasing > > > > LOG_SNAPSHOT_INTERVAL_MS to 9, the last decoding by > > > > pg_logicla_slot_get_changes() decodes 1M commit records while keeping > > > > catalog modifying transactions. > > > > > > > > > > Thanks for the test. We should also see how it performs when (a) we > > > don't change LOG_SNAPSHOT_INTERVAL_MS, > > > > What point do you want to see in this test? I think the performance > > overhead depends on how many times we do bsearch() and how many > > transactions are in the list. > > > > Right, I am not expecting any visible performance difference in this > case. This is to ensure that we are not incurring any overhead in the > more usual scenarios (or default cases). As per my understanding, the > purpose of increasing the value of LOG_SNAPSHOT_INTERVAL_MS is to > simulate a stress case for the changes made by the patch, and keeping > its value default will test the more usual scenarios. Agreed. I've done simple benchmark tests to decode 100k pgbench transactions: HEAD: 10.34 s Patched: 10.29 s I've attached an updated patch that incorporated comments from Amit and Shi. 
Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ From 28ca92c9d95cd05a26a7db6e54704f92b1846943 Mon Sep 17 00:00:00 2001 From: Masahiko Sawada Date: Wed, 6 Jul 2022 12:53:36 +0900 Subject: [PATCH v4] Add catalog modifying transactions to logical decoding serialized snapshot. Previously, we relied on HEAP2_NEW_CID records and XACT_INVALIDATION records to know if the transaction has modified the catalog, and that information is not serialized to snapshot. Therefore, if the logical decoding decodes only the commit record of the transaction that actually has modified a catalog, we missed adding its XID to the snapshot. We ended up looking at catalogs with the wrong snapshot. To fix this problem, this change adds the list of transaction IDs and sub-transaction IDs, that have modified catalogs and are running when snapshot serialization, to the serialized snapshot. When decoding a COMMIT record, we check both the list and the ReorderBuffer to see if if the transaction has modified catalogs. Since this adds additional information to the serialized snapshot, we cannot backpatch it. For back branches, we take another approach; remember the last-running-xacts list of the first decoded RUNNING_XACTS record and check if the transaction whose commit record has XACT_XINFO_HAS_INVALS and whose XID is in the list. This doesn't require any file format changes but the transaction will end up being added to the snapshot even if it has only relcache invalidations. This commit bumps SNAPBUILD_VERSION because of change in SnapBuild. 
---
 contrib/test_decoding/Makefile                  |   2 +-
 .../expected/catalog_change_snapshot.out        |  44
 .../specs/catalog_change_snapshot.spec          |  39 +++
 .../replication/logical/reorderbuffer.c         |  69 -
 src/backend/replication/logical/snapbuild.c     | 235 --
 src/include/replication/reorderbuffer.h         |  12 +
 6 files changed, 317 insertions(+), 84 deletions(-)
 create mode 100644 contrib/test_decoding/expected/catalog_change_snapshot.out
 create mode 100644 contrib/test_decoding/specs/catalog_change_snapshot.spec

diff --git a/contrib/test_decoding/Makefile b/contrib/test_decoding/Makefile
index b220906479..c7ce603706 100644
--- a/contrib/test_decoding/Makefile
+++ b/contrib/test_decoding/Makefile
@@ -8,7 +8,7 @@ REGRESS = ddl xact rewrite toast permissions decoding_in_xact \
 	spill slot truncate stream stats twophase twophase_stream
 ISOLATION = mxact delayed_startup ondisk_startup concurrent_ddl_dml \
 	oldest_xmin snapshot_transfer subxact_without_top concurrent_stream \
-	twophase_snapshot slot_creation_error
+	twophase_snapshot slot_creation_error catalog_change_snapshot

 REGRESS_OPTS = --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf
 ISOLATION_OPTS = --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf
diff --git a/contrib/test_decoding/expected/catalog_change_snapshot.out b/contrib/test_decoding/expected/catalog_change_snapshot.out
new file mode 100644
index
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 12, 2022 at 7:59 PM Amit Kapila wrote: > > On Tue, Jul 12, 2022 at 2:53 PM Masahiko Sawada wrote: > > > > On Tue, Jul 12, 2022 at 5:58 PM shiy.f...@fujitsu.com > > wrote: > > > > > > > > > It happened when executing the following code because it tried to free a > > > NULL > > > pointer (catchange_xip). > > > > > > /* be tidy */ > > > if (ondisk) > > > pfree(ondisk); > > > + if (catchange_xip) > > > + pfree(catchange_xip); > > > } > > > > > > It seems to be related to configure option. I could reproduce it when > > > using > > > `./configure --enable-debug`. > > > But I couldn't reproduce with `./configure --enable-debug CFLAGS="-Og > > > -ggdb"`. > > > > Hmm, I could not reproduce this problem even if I use ./configure > > --enable-debug. And it's weird that we checked if catchange_xip is not > > null but we did pfree for it: > > > > Yeah, this looks weird to me as well but one difference in running > tests could be the timing of WAL LOG for XLOG_RUNNING_XACTS. That may > change the timing of SnapBuildSerialize. The other thing we can try is > by checking the value of catchange_xcnt before pfree. Yeah, we can try that. While reading the code, I realized that we try to pfree both ondisk and catchange_xip also when we jumped to 'out:': out: ReorderBufferSetRestartPoint(builder->reorder, builder->last_serialized_snapshot); /* be tidy */ if (ondisk) pfree(ondisk); if (catchange_xip) pfree(catchange_xip); But we use both ondisk and catchange_xip only if we didn't jump to 'out:'. If this problem is related to compiler optimization with 'goto' statement, moving them before 'out:' might be worth trying. > > BTW, I think ReorderBufferGetCatalogChangesXacts should have an Assert > to ensure rb->catchange_ntxns and xcnt are equal. We can probably then > avoid having xcnt_p as an out parameter as the caller can use > rb->catchange_ntxns instead. > Agreed. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 12, 2022 at 2:53 PM Masahiko Sawada wrote: > > On Tue, Jul 12, 2022 at 5:58 PM shiy.f...@fujitsu.com > wrote: > > > > > > It happened when executing the following code because it tried to free a > > NULL > > pointer (catchange_xip). > > > > /* be tidy */ > > if (ondisk) > > pfree(ondisk); > > + if (catchange_xip) > > + pfree(catchange_xip); > > } > > > > It seems to be related to configure option. I could reproduce it when using > > `./configure --enable-debug`. > > But I couldn't reproduce with `./configure --enable-debug CFLAGS="-Og > > -ggdb"`. > > Hmm, I could not reproduce this problem even if I use ./configure > --enable-debug. And it's weird that we checked if catchange_xip is not > null but we did pfree for it: > Yeah, this looks weird to me as well but one difference in running tests could be the timing of WAL LOG for XLOG_RUNNING_XACTS. That may change the timing of SnapBuildSerialize. The other thing we can try is by checking the value of catchange_xcnt before pfree. BTW, I think ReorderBufferGetCatalogChangesXacts should have an Assert to ensure rb->catchange_ntxns and xcnt are equal. We can probably then avoid having xcnt_p as an out parameter as the caller can use rb->catchange_ntxns instead. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 12, 2022 at 5:58 PM shiy.f...@fujitsu.com wrote: > > On Tue, Jul 12, 2022 8:49 AM Masahiko Sawada wrote: > > > > I've attached an updated patch. > > > > Hi, > > I met a segmentation fault in test_decoding test after applying the patch for > master > branch. Attach the backtrace. Thank you for testing the patch! > > It happened when executing the following code because it tried to free a NULL > pointer (catchange_xip). > > /* be tidy */ > if (ondisk) > pfree(ondisk); > + if (catchange_xip) > + pfree(catchange_xip); > } > > It seems to be related to configure option. I could reproduce it when using > `./configure --enable-debug`. > But I couldn't reproduce with `./configure --enable-debug CFLAGS="-Og -ggdb"`. Hmm, I could not reproduce this problem even if I use ./configure --enable-debug. And it's weird that we checked if catchange_xip is not null but we did pfree for it: #1 pfree (pointer=0x0) at mcxt.c:1177 #2 0x0078186b in SnapBuildSerialize (builder=0x1fd5e78, lsn=25719712) at snapbuild.c:1792 Is it reproducible in your environment? If so, could you test it again with the following changes? diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c index d015c06ced..a6e76e3781 100644 --- a/src/backend/replication/logical/snapbuild.c +++ b/src/backend/replication/logical/snapbuild.c @@ -1788,7 +1788,7 @@ out: /* be tidy */ if (ondisk) pfree(ondisk); - if (catchange_xip) + if (catchange_xip != NULL) pfree(catchange_xip); } Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 12, 2022 8:49 AM Masahiko Sawada wrote:
>
> I've attached an updated patch.
>

Hi,

I met a segmentation fault in test_decoding test after applying the patch for master branch. Attach the backtrace.

It happened when executing the following code because it tried to free a NULL pointer (catchange_xip).

 	/* be tidy */
 	if (ondisk)
 		pfree(ondisk);
+	if (catchange_xip)
+		pfree(catchange_xip);
 }

It seems to be related to configure option. I could reproduce it when using `./configure --enable-debug`.
But I couldn't reproduce with `./configure --enable-debug CFLAGS="-Og -ggdb"`.

Regards,
Shi yu

#0  0x00910333 in GetMemoryChunkContext (pointer=0x0) at ../../../../src/include/utils/memutils.h:129
#1  pfree (pointer=0x0) at mcxt.c:1177
#2  0x0078186b in SnapBuildSerialize (builder=0x1fd5e78, lsn=25719712) at snapbuild.c:1792
#3  0x00782797 in SnapBuildProcessRunningXacts (builder=0x1fd5e78, lsn=25719712, running=0x27dfe50) at snapbuild.c:1199
#4  0x00774273 in standby_decode (ctx=0x1fc3e08, buf=0x7ffd8c95b5d0) at decode.c:346
#5  0x00773ab3 in LogicalDecodingProcessRecord (ctx=ctx@entry=0x1fc3e08, record=0x1fc41a0) at decode.c:119
#6  0x0077815d in pg_logical_slot_get_changes_guts (fcinfo=0x1fb5d88, confirm=, binary=) at logicalfuncs.c:271
#7  0x0067b63d in ExecMakeTableFunctionResult (setexpr=0x1fb42e0, econtext=0x1fb41b0, argContext=, expectedDesc=0x1fd7da8, randomAccess=false) at execSRF.c:234
#8  0x0068b76f in FunctionNext (node=node@entry=0x1fb3fa0) at nodeFunctionscan.c:94
#9  0x0067be37 in ExecScanFetch (recheckMtd=0x68b450 , accessMtd=0x68b470 , node=0x1fb3fa0) at execScan.c:133
#10 ExecScan (node=0x1fb3fa0, accessMtd=0x68b470 , recheckMtd=0x68b450 ) at execScan.c:199
#11 0x0067344b in ExecProcNode (node=0x1fb3fa0) at ../../../src/include/executor/executor.h:259
#12 ExecutePlan (execute_once=, dest=0x1fc02e8, direction=, numberTuples=0, sendTuples=, operation=CMD_SELECT, use_parallel_mode=, planstate=0x1fb3fa0, estate=0x1fb3d78) at execMain.c:1636
#13 standard_ExecutorRun (queryDesc=0x1f9e178, direction=, count=0, execute_once=) at execMain.c:363
#14 0x007dda9e in PortalRunSelect (portal=0x1f51338, forward=, count=0, dest=) at pquery.c:924
#15 0x007decf7 in PortalRun (portal=portal@entry=0x1f51338, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x1fc02e8, altdest=altdest@entry=0x1fc02e8, qc=0x7ffd8c95bbf0) at pquery.c:768
#16 0x007db4ff in exec_simple_query (query_string=0x1ee29a8 "SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'skip-empty-xacts', '1', 'include-xids', '0');") at postgres.c:1243
#17 0x007dc742 in PostgresMain (dbname=, username=) at postgres.c:4482
#18 0x0076132b in BackendRun (port=, port=) at postmaster.c:4503
#19 BackendStartup (port=) at postmaster.c:4231
#20 ServerLoop () at postmaster.c:1805
#21 0x007621fb in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x1edbf20) at postmaster.c:1477
#22 0x004f0f69 in main (argc=3, argv=0x1edbf20) at main.c:202
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 12, 2022 at 1:13 PM Masahiko Sawada wrote: > > On Tue, Jul 12, 2022 at 3:25 PM Amit Kapila wrote: > > > > On Tue, Jul 12, 2022 at 11:38 AM Masahiko Sawada > > wrote: > > > > > > On Tue, Jul 12, 2022 at 10:28 AM Masahiko Sawada > > > wrote: > > > > > > > > > > > > I'm doing benchmark tests and will share the results. > > > > > > > > > > I've done benchmark tests to measure the overhead introduced by doing > > > bsearch() every time when decoding a commit record. I've simulated a > > > very intensified situation where we decode 1M commit records while > > > keeping builder->catchange.xip array but the overhead is negilible: > > > > > > HEAD: 584 ms > > > Patched: 614 ms > > > > > > I've attached the benchmark script I used. With increasing > > > LOG_SNAPSHOT_INTERVAL_MS to 9, the last decoding by > > > pg_logicla_slot_get_changes() decodes 1M commit records while keeping > > > catalog modifying transactions. > > > > > > > Thanks for the test. We should also see how it performs when (a) we > > don't change LOG_SNAPSHOT_INTERVAL_MS, > > What point do you want to see in this test? I think the performance > overhead depends on how many times we do bsearch() and how many > transactions are in the list. > Right, I am not expecting any visible performance difference in this case. This is to ensure that we are not incurring any overhead in the more usual scenarios (or default cases). As per my understanding, the purpose of increasing the value of LOG_SNAPSHOT_INTERVAL_MS is to simulate a stress case for the changes made by the patch, and keeping its value default will test the more usual scenarios. > I increased this value to easily > simulate the situation where we decode many commit records while > keeping catalog modifying transactions. But even if we don't change > this value, the result would not change if we don't change how many > commit records we decode. 
> > > and (b) we have more DDL xacts > > so that the array to search is somewhat bigger > > I've done the same performance tests while creating 64 catalog > modifying transactions. The result is: > > HEAD: 595 ms > Patched: 628 ms > > There was no big overhead. > Yeah, especially considering you have simulated a stress case for the patch. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 12, 2022 at 3:25 PM Amit Kapila wrote: > > On Tue, Jul 12, 2022 at 11:38 AM Masahiko Sawada > wrote: > > > > On Tue, Jul 12, 2022 at 10:28 AM Masahiko Sawada > > wrote: > > > > > > > > > I'm doing benchmark tests and will share the results. > > > > > > > I've done benchmark tests to measure the overhead introduced by doing > > bsearch() every time when decoding a commit record. I've simulated a > > very intensified situation where we decode 1M commit records while > > keeping builder->catchange.xip array but the overhead is negilible: > > > > HEAD: 584 ms > > Patched: 614 ms > > > > I've attached the benchmark script I used. With increasing > > LOG_SNAPSHOT_INTERVAL_MS to 9, the last decoding by > > pg_logicla_slot_get_changes() decodes 1M commit records while keeping > > catalog modifying transactions. > > > > Thanks for the test. We should also see how it performs when (a) we > don't change LOG_SNAPSHOT_INTERVAL_MS, What point do you want to see in this test? I think the performance overhead depends on how many times we do bsearch() and how many transactions are in the list. I increased this value to easily simulate the situation where we decode many commit records while keeping catalog modifying transactions. But even if we don't change this value, the result would not change if we don't change how many commit records we decode. > and (b) we have more DDL xacts > so that the array to search is somewhat bigger I've done the same performance tests while creating 64 catalog modifying transactions. The result is: HEAD: 595 ms Patched: 628 ms There was no big overhead. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 12, 2022 at 11:38 AM Masahiko Sawada wrote: > > On Tue, Jul 12, 2022 at 10:28 AM Masahiko Sawada > wrote: > > > > > > I'm doing benchmark tests and will share the results. > > > > I've done benchmark tests to measure the overhead introduced by doing > bsearch() every time when decoding a commit record. I've simulated a > very intensified situation where we decode 1M commit records while > keeping builder->catchange.xip array but the overhead is negilible: > > HEAD: 584 ms > Patched: 614 ms > > I've attached the benchmark script I used. With increasing > LOG_SNAPSHOT_INTERVAL_MS to 9, the last decoding by > pg_logicla_slot_get_changes() decodes 1M commit records while keeping > catalog modifying transactions. > Thanks for the test. We should also see how it performs when (a) we don't change LOG_SNAPSHOT_INTERVAL_MS, and (b) we have more DDL xacts so that the array to search is somewhat bigger -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 12, 2022 at 10:28 AM Masahiko Sawada wrote: > > On Tue, Jul 12, 2022 at 9:48 AM Masahiko Sawada wrote: > > > > On Fri, Jul 8, 2022 at 8:20 PM Masahiko Sawada > > wrote: > > > > > > On Fri, Jul 8, 2022 at 5:59 PM Amit Kapila > > > wrote: > > > > > > > > On Fri, Jul 8, 2022 at 12:46 PM Masahiko Sawada > > > > wrote: > > > > > > > > > > On Fri, Jul 8, 2022 at 3:27 PM Amit Kapila > > > > > wrote: > > > > > > > > > > > > > > > > > 1. > > > > > > In ReorderBufferGetCatalogChangesXacts(), isn't it better to use the > > > > > > list length of 'catchange_txns' to allocate xids array? If we can do > > > > > > so, then we will save the need to repalloc as well. > > > > > > > > > > Since ReorderBufferGetcatalogChangesXacts() collects all ongoing > > > > > catalog modifying transactions, the length of the array could be > > > > > bigger than the one taken last time. We can start with the previous > > > > > length but I think we cannot remove the need for repalloc. > > > > > > > > > > > > > It is using the list "catchange_txns" to form xid array which > > > > shouldn't change for the duration of > > > > ReorderBufferGetCatalogChangesXacts(). Then the caller frees the xid > > > > array after its use. Next time in > > > > ReorderBufferGetCatalogChangesXacts(), the fresh allocation for xid > > > > array happens, so not sure why repalloc would be required? > > > > > > Oops, I mistook catchange_txns for catchange->xcnt. You're right. > > > Starting with the length of catchange_txns should be sufficient. > > > > > > > I've attached an updated patch. > > > > While trying this idea, I noticed there is no API to get the length of > > dlist, as we discussed offlist. Alternative idea was to use List > > (T_XidList) but I'm not sure it's a great idea since deleting an xid > > from the list is O(N), we need to implement list_delete_xid, and we > > need to make sure allocating list node in the reorder buffer context. 
> > So in the patch, I added a variable, catchange_ntxns, to keep track of
> > the length of the dlist. Please review it.
> >
> > I'm doing benchmark tests and will share the results.
>

I've done benchmark tests to measure the overhead introduced by doing bsearch() every time we decode a commit record. I've simulated a very intensive situation where we decode 1M commit records while keeping the builder->catchange.xip array, but the overhead is negligible:

HEAD: 584 ms
Patched: 614 ms

I've attached the benchmark script I used. With LOG_SNAPSHOT_INTERVAL_MS increased to 9, the last decoding by pg_logical_slot_get_changes() decodes 1M commit records while keeping catalog modifying transactions.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

bench.spec
Description: Binary data
RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 12, 2022 8:49 AM Masahiko Sawada wrote: > > I've attached an updated patch. > > While trying this idea, I noticed there is no API to get the length of > dlist, as we discussed offlist. Alternative idea was to use List > (T_XidList) but I'm not sure it's a great idea since deleting an xid > from the list is O(N), we need to implement list_delete_xid, and we > need to make sure allocating list node in the reorder buffer context. > So in the patch, I added a variable, catchange_ntxns, to keep track of > the length of the dlist. Please review it. > Thanks for your patch. Here are some comments on the master patch. 1. In catalog_change_snapshot.spec, should we use "RUNNING_XACTS record" instead of "RUNNING_XACT record" / "XACT_RUNNING record" in the comment? 2. +* Since catchange.xip is sorted, we find the lower bound of +* xids that sill are interesting. Typo? "sill" -> "still" 3. +* This array is set once when restoring the snapshot, xids are removed +* from the array when decoding xl_running_xacts record, and then eventually +* becomes an empty. + /* catchange list becomes an empty */ + pfree(builder->catchange.xip); + builder->catchange.xip = NULL; Should "becomes an empty" be modified to "becomes empty"? 4. + * changes that are smaller than ->xmin. Those won't ever get checked via + * the ->committed array and ->catchange, respectively. The committed xids will Should we change "the ->committed array and ->catchange" to "the ->committed or ->catchange array" ? Regards, Shi yu
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 12, 2022 at 9:48 AM Masahiko Sawada wrote: > > On Fri, Jul 8, 2022 at 8:20 PM Masahiko Sawada wrote: > > > > On Fri, Jul 8, 2022 at 5:59 PM Amit Kapila wrote: > > > > > > On Fri, Jul 8, 2022 at 12:46 PM Masahiko Sawada > > > wrote: > > > > > > > > On Fri, Jul 8, 2022 at 3:27 PM Amit Kapila > > > > wrote: > > > > > > > > > > > > > > 1. > > > > > In ReorderBufferGetCatalogChangesXacts(), isn't it better to use the > > > > > list length of 'catchange_txns' to allocate xids array? If we can do > > > > > so, then we will save the need to repalloc as well. > > > > > > > > Since ReorderBufferGetcatalogChangesXacts() collects all ongoing > > > > catalog modifying transactions, the length of the array could be > > > > bigger than the one taken last time. We can start with the previous > > > > length but I think we cannot remove the need for repalloc. > > > > > > > > > > It is using the list "catchange_txns" to form xid array which > > > shouldn't change for the duration of > > > ReorderBufferGetCatalogChangesXacts(). Then the caller frees the xid > > > array after its use. Next time in > > > ReorderBufferGetCatalogChangesXacts(), the fresh allocation for xid > > > array happens, so not sure why repalloc would be required? > > > > Oops, I mistook catchange_txns for catchange->xcnt. You're right. > > Starting with the length of catchange_txns should be sufficient. > > > > I've attached an updated patch. > > While trying this idea, I noticed there is no API to get the length of > dlist, as we discussed offlist. Alternative idea was to use List > (T_XidList) but I'm not sure it's a great idea since deleting an xid > from the list is O(N), we need to implement list_delete_xid, and we > need to make sure allocating list node in the reorder buffer context. > So in the patch, I added a variable, catchange_ntxns, to keep track of > the length of the dlist. Please review it. > I'm doing benchmark tests and will share the results. 
Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Fri, Jul 8, 2022 at 8:20 PM Masahiko Sawada wrote: > > On Fri, Jul 8, 2022 at 5:59 PM Amit Kapila wrote: > > > > On Fri, Jul 8, 2022 at 12:46 PM Masahiko Sawada > > wrote: > > > > > > On Fri, Jul 8, 2022 at 3:27 PM Amit Kapila > > > wrote: > > > > > > > > > > > 1. > > > > In ReorderBufferGetCatalogChangesXacts(), isn't it better to use the > > > > list length of 'catchange_txns' to allocate xids array? If we can do > > > > so, then we will save the need to repalloc as well. > > > > > > Since ReorderBufferGetcatalogChangesXacts() collects all ongoing > > > catalog modifying transactions, the length of the array could be > > > bigger than the one taken last time. We can start with the previous > > > length but I think we cannot remove the need for repalloc. > > > > > > > It is using the list "catchange_txns" to form xid array which > > shouldn't change for the duration of > > ReorderBufferGetCatalogChangesXacts(). Then the caller frees the xid > > array after its use. Next time in > > ReorderBufferGetCatalogChangesXacts(), the fresh allocation for xid > > array happens, so not sure why repalloc would be required? > > Oops, I mistook catchange_txns for catchange->xcnt. You're right. > Starting with the length of catchange_txns should be sufficient. > I've attached an updated patch. While trying this idea, I noticed there is no API to get the length of dlist, as we discussed offlist. Alternative idea was to use List (T_XidList) but I'm not sure it's a great idea since deleting an xid from the list is O(N), we need to implement list_delete_xid, and we need to make sure allocating list node in the reorder buffer context. So in the patch, I added a variable, catchange_ntxns, to keep track of the length of the dlist. Please review it. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ master-v3-0001-Add-catalog-modifying-transactions-to-logical-dec.patch Description: Binary data
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Jul 6, 2022 at 3:01 PM Amit Kapila wrote: > > On Wed, Jul 6, 2022 at 7:38 AM Masahiko Sawada wrote: > > > > I'll post a new version patch in the next email with replying to other > > comments. > > > > Okay, thanks for working on this. Few comments/suggestions on > poc_remember_last_running_xacts_v2 patch: > > 1. > +ReorderBufferSetLastRunningXactsCatalogChanges(ReorderBuffer *rb, > TransactionId xid, > +uint32 xinfo, int subxcnt, > +TransactionId *subxacts, XLogRecPtr lsn) > +{ > ... > ... > + > + test = bsearch(, rb->last_running_xacts, rb->n_last_running_xacts, > +sizeof(TransactionId), xidComparator); > + > + if (test == NULL) > + { > + for (int i = 0; i < subxcnt; i++) > + { > + test = bsearch([i], rb->last_running_xacts, > rb->n_last_running_xacts, > +sizeof(TransactionId), xidComparator); > ... > > Is there ever a possibility that the top transaction id is not in the > running_xacts list but one of its subxids is present? If yes, it is > not very obvious at least to me so adding a comment here could be > useful. If not, then why do we need this additional check for each of > the sub-transaction ids? I think there is no possibility. The check for subtransactions is not necessary. > > 2. > @@ -627,6 +647,15 @@ DecodeCommit(LogicalDecodingContext *ctx, > XLogRecordBuffer *buf, > commit_time = parsed->origin_timestamp; > } > > + /* > + * Set the last running xacts as containing catalog change if necessary. > + * This must be done before SnapBuildCommitTxn() so that we include catalog > + * change transactions to the historic snapshot. 
> + */ > + ReorderBufferSetLastRunningXactsCatalogChanges(ctx->reorder, xid, > parsed->xinfo, > +parsed->nsubxacts, parsed->subxacts, > +buf->origptr); > + > SnapBuildCommitTxn(ctx->snapshot_builder, buf->origptr, xid, > parsed->nsubxacts, parsed->subxacts); > > As mentioned previously as well, marking it before SnapBuildCommitTxn > has one disadvantage, we sometimes do this work even if the snapshot > state is SNAPBUILD_START or SNAPBUILD_BUILDING_SNAPSHOT in which case > SnapBuildCommitTxn wouldn't do anything. Can we instead check whether > the particular txn has invalidations and is present in the > last_running_xacts list along with the check > ReorderBufferXidHasCatalogChanges? I think that has the additional > advantage that we don't need this additional marking if the xact is > already marked as containing catalog changes. Agreed. > > 3. > 1. > + /* > + * We rely on HEAP2_NEW_CID records and XACT_INVALIDATIONS to know > + * if the transaction has changed the catalog, and that information > + * is not serialized to SnapBuilder. Therefore, if the logical > + * decoding decodes the commit record of the transaction that actually > + * has done catalog changes without these records, we miss to add > + * the xid to the snapshot so up creating the wrong snapshot. > > The part of the sentence "... snapshot so up creating the wrong > snapshot." is not clear. In this comment, at one place you have used > two spaces after a full stop, and at another place, there is one > space. I think let's follow nearby code practice to use a single space > before a new sentence. Agreed. > > 4. 
> +void > +ReorderBufferProcessLastRunningXacts(ReorderBuffer *rb, > xl_running_xacts *running) > +{ > + /* Quick exit if there is no longer last running xacts */ > + if (likely(rb->n_last_running_xacts == 0)) > + return; > + > + /* First call, build the last running xact list */ > + if (rb->n_last_running_xacts == -1) > + { > + int nxacts = running->subxcnt + running->xcnt; > + Size sz = sizeof(TransactionId) * nxacts;; > + > + rb->last_running_xacts = MemoryContextAlloc(rb->context, sz); > + memcpy(rb->last_running_xacts, running->xids, sz); > + qsort(rb->last_running_xacts, nxacts, sizeof(TransactionId), xidComparator); > + > + rb->n_last_running_xacts = nxacts; > + > + return; > + } > > a. Can we add the function header comments for this function? Updated. > b. We seem to be tracking the running_xact information for the first > running_xact record after start/restart. The name last_running_xacts > doesn't sound appropriate for that, how about initial_running_xacts? Sound good, updated. > > 5. > + /* > + * Purge xids in the last running xacts list if we can do that for at least > + * one xid. > + */ > + if (NormalTransactionIdPrecedes(rb->last_running_xacts[0], > + running->oldestRunningXid)) > > I think it would be a good idea to add a few lines here explaining why > it is safe to purge. IIUC, it is because the commit for those xacts > would have already been processed and we don't need such a xid > anymore. Right, updated. > > 6. As per the discussion above in this thread having > XACT_XINFO_HAS_INVALS in the commit record doesn't indicate that the > xact has catalog changes, so can we add somewhere in comments that for > such a case we can't distinguish whether the txn has catalog change > but we still mark the txn has catalog changes?
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Fri, Jul 8, 2022 at 5:59 PM Amit Kapila wrote: > > On Fri, Jul 8, 2022 at 12:46 PM Masahiko Sawada wrote: > > > > On Fri, Jul 8, 2022 at 3:27 PM Amit Kapila wrote: > > > > > > > > 1. > > > In ReorderBufferGetCatalogChangesXacts(), isn't it better to use the > > > list length of 'catchange_txns' to allocate xids array? If we can do > > > so, then we will save the need to repalloc as well. > > > > Since ReorderBufferGetcatalogChangesXacts() collects all ongoing > > catalog modifying transactions, the length of the array could be > > bigger than the one taken last time. We can start with the previous > > length but I think we cannot remove the need for repalloc. > > > > It is using the list "catchange_txns" to form xid array which > shouldn't change for the duration of > ReorderBufferGetCatalogChangesXacts(). Then the caller frees the xid > array after its use. Next time in > ReorderBufferGetCatalogChangesXacts(), the fresh allocation for xid > array happens, so not sure why repalloc would be required? Oops, I mistook catchange_txns for catchange->xcnt. You're right. Starting with the length of catchange_txns should be sufficient. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Fri, Jul 8, 2022 at 12:46 PM Masahiko Sawada wrote: > > On Fri, Jul 8, 2022 at 3:27 PM Amit Kapila wrote: > > > > > 1. > > In ReorderBufferGetCatalogChangesXacts(), isn't it better to use the > > list length of 'catchange_txns' to allocate xids array? If we can do > > so, then we will save the need to repalloc as well. > > Since ReorderBufferGetcatalogChangesXacts() collects all ongoing > catalog modifying transactions, the length of the array could be > bigger than the one taken last time. We can start with the previous > length but I think we cannot remove the need for repalloc. > It is using the list "catchange_txns" to form xid array which shouldn't change for the duration of ReorderBufferGetCatalogChangesXacts(). Then the caller frees the xid array after its use. Next time in ReorderBufferGetCatalogChangesXacts(), the fresh allocation for xid array happens, so not sure why repalloc would be required? -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Fri, Jul 8, 2022 at 3:27 PM Amit Kapila wrote: > > On Fri, Jul 8, 2022 at 6:45 AM Masahiko Sawada wrote: > > > > On Thu, Jul 7, 2022 at 3:40 PM Amit Kapila wrote: > > > > > > On Thu, Jul 7, 2022 at 8:21 AM Masahiko Sawada > > > wrote: > > > > I've attached the new version patch that incorporates the comments and > > the optimizations discussed above. > > > > Thanks, few minor comments: Thank you for the comments. > 1. > In ReorderBufferGetCatalogChangesXacts(), isn't it better to use the > list length of 'catchange_txns' to allocate xids array? If we can do > so, then we will save the need to repalloc as well. Since ReorderBufferGetCatalogChangesXacts() collects all ongoing catalog modifying transactions, the length of the array could be bigger than the one taken last time. We can start with the previous length but I think we cannot remove the need for repalloc. > 2. > /* ->committed manipulation */ > static void SnapBuildPurgeCommittedTxn(SnapBuild *builder); > > The above comment also needs to be changed. > > 3. As SnapBuildPurgeCommittedTxn() removes xacts both from committed > and catchange arrays, the function name no more remains appropriate. > We can either rename to something like SnapBuildPurgeOlderTxn() or > move the catchange logic to a different function and call it from > SnapBuildProcessRunningXacts. > > 4. > + if (TransactionIdEquals(builder->catchange.xip[off], > + builder->xmin) || > + NormalTransactionIdFollows(builder->catchange.xip[off], > +builder->xmin)) > > Can we use TransactionIdFollowsOrEquals() instead of above? > > 5. Comment change suggestion: > /* > * Remove knowledge about transactions we treat as committed or > containing catalog > * changes that are smaller than ->xmin. Those won't ever get checked via > - * the ->committed array and ->catchange, respectively. The committed xids > will > - * get checked via the clog machinery. > + * the ->committed or ->catchange array, respectively. 
The committed xids > will > + * get checked via the clog machinery. We can ideally remove the transaction > + * from catchange array once it is finished (committed/aborted) but that > could > + * be costly as we need to maintain the xids order in the array. > */ > Agreed with the above comments. > Apart from the above, I think there are pending comments for the > back-branch patch and some performance testing of this work. Right. I'll incorporate all comments I got so far into these patches and submit them. Also, will do some benchmark tests. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Fri, Jul 8, 2022 at 6:45 AM Masahiko Sawada wrote: > > On Thu, Jul 7, 2022 at 3:40 PM Amit Kapila wrote: > > > > On Thu, Jul 7, 2022 at 8:21 AM Masahiko Sawada > > wrote: > > I've attached the new version patch that incorporates the comments and > the optimizations discussed above. > Thanks, few minor comments: 1. In ReorderBufferGetCatalogChangesXacts(), isn't it better to use the list length of 'catchange_txns' to allocate xids array? If we can do so, then we will save the need to repalloc as well. 2. /* ->committed manipulation */ static void SnapBuildPurgeCommittedTxn(SnapBuild *builder); The above comment also needs to be changed. 3. As SnapBuildPurgeCommittedTxn() removes xacts both from committed and catchange arrays, the function name no more remains appropriate. We can either rename to something like SnapBuildPurgeOlderTxn() or move the catchange logic to a different function and call it from SnapBuildProcessRunningXacts. 4. + if (TransactionIdEquals(builder->catchange.xip[off], + builder->xmin) || + NormalTransactionIdFollows(builder->catchange.xip[off], +builder->xmin)) Can we use TransactionIdFollowsOrEquals() instead of above? 5. Comment change suggestion: /* * Remove knowledge about transactions we treat as committed or containing catalog * changes that are smaller than ->xmin. Those won't ever get checked via - * the ->committed array and ->catchange, respectively. The committed xids will - * get checked via the clog machinery. + * the ->committed or ->catchange array, respectively. The committed xids will + * get checked via the clog machinery. We can ideally remove the transaction + * from catchange array once it is finished (committed/aborted) but that could + * be costly as we need to maintain the xids order in the array. */ Apart from the above, I think there are pending comments for the back-branch patch and some performance testing of this work. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Thu, Jul 7, 2022 at 3:40 PM Amit Kapila wrote: > > On Thu, Jul 7, 2022 at 8:21 AM Masahiko Sawada wrote: > > > > On Wed, Jul 6, 2022 at 5:55 PM Amit Kapila wrote: > > > > > > On Wed, Jul 6, 2022 at 12:19 PM Masahiko Sawada > > > wrote: > > > > > > > > On Tue, Jul 5, 2022 at 8:00 PM Amit Kapila > > > > wrote: > > > > > > > > > > 2. Are we anytime removing transaction ids from catchanges->xip array? > > > > > > > > No. > > > > > > > > > If not, is there a reason for the same? I think we can remove it > > > > > either at commit/abort or even immediately after adding the xid/subxid > > > > > to committed->xip array. > > > > > > > > It might be a good idea but I'm concerned that removing XID from the > > > > array at every commit/abort or after adding it to committed->xip array > > > > might be costly as it requires adjustment of the array to keep its > > > > order. Removing XIDs from the array would make bsearch faster but the > > > > array is updated reasonably often (every 15 sec). > > > > > > > > > > Fair point. However, I am slightly worried that we are unnecessarily > > > searching in this new array even when ReorderBufferTxn has the > > > required information. To avoid that, in function > > > SnapBuildXidHasCatalogChange(), we can first check > > > ReorderBufferXidHasCatalogChanges() and then check the array if the > > > first check doesn't return true. Also, by the way, do we need to > > > always keep builder->catchanges.xip updated via SnapBuildRestore()? > > > Isn't it sufficient that we just read and throw away contents from a > > > snapshot if builder->catchanges.xip is non-NULL? > > > > IIUC catchanges.xip is restored only once when restoring a consistent > > snapshot via SnapBuildRestore(). I think it's necessary to set > > catchanges.xip for later use in SnapBuildXidHasCatalogChange(). Or did > > you mean via SnapBuildSerialize()? > > > > Sorry, I got confused about the way restore is used. You are right, it > will be done once. 
My main worry is that we shouldn't look at > builder->catchanges.xip array on an ongoing basis which I think can be > dealt with by one of the ideas you mentioned below. But, I think we > can still follow the other suggestion related to moving > ReorderBufferXidHasCatalogChanges() check prior to checking array. Agreed. I've incorporated this change in the new version patch. > > > > > > > I had additionally thought if can further optimize this solution to > > > just store this additional information when we need to serialize for > > > checkpoint record but I think that won't work because walsender can > > > restart even without resatart of server in which case the same problem > > > can occur. > > > > Yes, probably we need to write catalog modifying transactions for > > every serialized snapshot. > > > > > I am not if sure there is a way to further optimize this > > > solution, let me know if you have any ideas? > > > > I suppose that writing additional information to serialized snapshots > > would not be a noticeable overhead since we need 4 bytes per > > transaction and we would not expect there is a huge number of > > concurrent catalog modifying transactions. But both collecting catalog > > modifying transactions (especially when there are many ongoing > > transactions) and bsearch'ing on the XID list every time decoding the > > COMMIT record could bring overhead. > > > > A solution for the first point would be to keep track of catalog > > modifying transactions by using a linked list so that we can avoid > > checking all ongoing transactions. > > > > This sounds reasonable to me. > > > Regarding the second point, on reflection, I think we need to look up > > the XID list until all XID in the list is committed/aborted. We can > > remove XIDs from the list after adding it to committed.xip as you > > suggested. 
Or when decoding a RUNNING_XACTS record, we can remove XIDs > > older than builder->xmin from the list like we do for committed.xip in > > SnapBuildPurgeCommittedTxn(). > > > > I think doing along with RUNNING_XACTS should be fine. At each > commit/abort, the cost could be high because we need to maintain the > sort order. In general, I feel any one of these should be okay because > once the array becomes empty, it won't be used again and there won't > be any operation related to it during ongoing replication. I've attached the new version patch that incorporates the comments and the optimizations discussed above. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ v2-0001-Add-catalog-modifying-transactions-to-logical-dec.patch Description: Binary data
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Thu, Jul 7, 2022 at 8:21 AM Masahiko Sawada wrote: > > On Wed, Jul 6, 2022 at 5:55 PM Amit Kapila wrote: > > > > On Wed, Jul 6, 2022 at 12:19 PM Masahiko Sawada > > wrote: > > > > > > On Tue, Jul 5, 2022 at 8:00 PM Amit Kapila > > > wrote: > > > > > > > > 2. Are we anytime removing transaction ids from catchanges->xip array? > > > > > > No. > > > > > > > If not, is there a reason for the same? I think we can remove it > > > > either at commit/abort or even immediately after adding the xid/subxid > > > > to committed->xip array. > > > > > > It might be a good idea but I'm concerned that removing XID from the > > > array at every commit/abort or after adding it to committed->xip array > > > might be costly as it requires adjustment of the array to keep its > > > order. Removing XIDs from the array would make bsearch faster but the > > > array is updated reasonably often (every 15 sec). > > > > > > > Fair point. However, I am slightly worried that we are unnecessarily > > searching in this new array even when ReorderBufferTxn has the > > required information. To avoid that, in function > > SnapBuildXidHasCatalogChange(), we can first check > > ReorderBufferXidHasCatalogChanges() and then check the array if the > > first check doesn't return true. Also, by the way, do we need to > > always keep builder->catchanges.xip updated via SnapBuildRestore()? > > Isn't it sufficient that we just read and throw away contents from a > > snapshot if builder->catchanges.xip is non-NULL? > > IIUC catchanges.xip is restored only once when restoring a consistent > snapshot via SnapBuildRestore(). I think it's necessary to set > catchanges.xip for later use in SnapBuildXidHasCatalogChange(). Or did > you mean via SnapBuildSerialize()? > Sorry, I got confused about the way restore is used. You are right, it will be done once. 
My main worry is that we shouldn't look at builder->catchanges.xip array on an ongoing basis which I think can be dealt with by one of the ideas you mentioned below. But, I think we can still follow the other suggestion related to moving ReorderBufferXidHasCatalogChanges() check prior to checking the array. > > > > I had additionally thought if we can further optimize this solution to > > just store this additional information when we need to serialize for > > checkpoint record but I think that won't work because walsender can > > restart even without a restart of the server in which case the same problem > > can occur. > > Yes, probably we need to write catalog modifying transactions for > every serialized snapshot. > > > I am not sure if there is a way to further optimize this > > solution, let me know if you have any ideas? > > I suppose that writing additional information to serialized snapshots > would not be a noticeable overhead since we need 4 bytes per > transaction and we would not expect there is a huge number of > concurrent catalog modifying transactions. But both collecting catalog > modifying transactions (especially when there are many ongoing > transactions) and bsearch'ing on the XID list every time decoding the > COMMIT record could bring overhead. > > A solution for the first point would be to keep track of catalog > modifying transactions by using a linked list so that we can avoid > checking all ongoing transactions. > This sounds reasonable to me. > Regarding the second point, on reflection, I think we need to look up > the XID list until all XIDs in the list are committed/aborted. We can > remove XIDs from the list after adding it to committed.xip as you > suggested. Or when decoding a RUNNING_XACTS record, we can remove XIDs > older than builder->xmin from the list like we do for committed.xip in > SnapBuildPurgeCommittedTxn(). > I think doing it along with RUNNING_XACTS should be fine. 
At each commit/abort, the cost could be high because we need to maintain the sort order. In general, I feel any one of these should be okay because once the array becomes empty, it won't be used again and there won't be any operation related to it during ongoing replication. -- With Regards, Amit Kapila.
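The purge-on-RUNNING_XACTS idea discussed above can be sketched in a few lines. This is a minimal standalone illustration, not the actual PostgreSQL code: purge_old_xids is a hypothetical name, and the wraparound-aware comparison the real code would use (NormalTransactionIdPrecedes) is simplified to plain uint32 ordering, which is also how xidComparator-sorted arrays are laid out.

```c
#include <stdint.h>
#include <string.h>

typedef uint32_t TransactionId;

/*
 * Drop every entry below the oldest running XID from a sorted XID array in
 * one pass, shifting the survivors to the front.  Returns the new count.
 * The array is assumed sorted in plain ascending uint32 order.
 */
static size_t
purge_old_xids(TransactionId *xip, size_t xcnt, TransactionId oldest_running)
{
	size_t		off = 0;

	/* find the first entry we must keep */
	while (off < xcnt && xip[off] < oldest_running)
		off++;

	/* one memmove instead of per-element adjustment at each commit/abort */
	if (off > 0 && off < xcnt)
		memmove(xip, xip + off, (xcnt - off) * sizeof(TransactionId));

	return xcnt - off;
}
```

This is why doing the removal when a RUNNING_XACTS record arrives is cheaper than at every commit/abort: all purgeable entries are contiguous at the front of the sorted array, so one shift suffices.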
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Jul 6, 2022 at 5:55 PM Amit Kapila wrote: > > On Wed, Jul 6, 2022 at 12:19 PM Masahiko Sawada wrote: > > > > On Tue, Jul 5, 2022 at 8:00 PM Amit Kapila wrote: > > > > > > 2. Are we anytime removing transaction ids from catchanges->xip array? > > > > No. > > > > > If not, is there a reason for the same? I think we can remove it > > > either at commit/abort or even immediately after adding the xid/subxid > > > to committed->xip array. > > > > It might be a good idea but I'm concerned that removing XID from the > > array at every commit/abort or after adding it to committed->xip array > > might be costly as it requires adjustment of the array to keep its > > order. Removing XIDs from the array would make bsearch faster but the > > array is updated reasonably often (every 15 sec). > > > > Fair point. However, I am slightly worried that we are unnecessarily > searching in this new array even when ReorderBufferTxn has the > required information. To avoid that, in function > SnapBuildXidHasCatalogChange(), we can first check > ReorderBufferXidHasCatalogChanges() and then check the array if the > first check doesn't return true. Also, by the way, do we need to > always keep builder->catchanges.xip updated via SnapBuildRestore()? > Isn't it sufficient that we just read and throw away contents from a > snapshot if builder->catchanges.xip is non-NULL? IIUC catchanges.xip is restored only once when restoring a consistent snapshot via SnapBuildRestore(). I think it's necessary to set catchanges.xip for later use in SnapBuildXidHasCatalogChange(). Or did you mean via SnapBuildSerialize()? > > I had additionally thought if we can further optimize this solution to > just store this additional information when we need to serialize for > checkpoint record but I think that won't work because walsender can > restart even without a restart of the server in which case the same problem > can occur. 
Yes, probably we need to write catalog modifying transactions for every serialized snapshot. > I am not sure if there is a way to further optimize this > solution, let me know if you have any ideas? I suppose that writing additional information to serialized snapshots would not be a noticeable overhead since we need 4 bytes per transaction and we would not expect there is a huge number of concurrent catalog modifying transactions. But both collecting catalog modifying transactions (especially when there are many ongoing transactions) and bsearch'ing on the XID list every time decoding the COMMIT record could bring overhead. A solution for the first point would be to keep track of catalog modifying transactions by using a linked list so that we can avoid checking all ongoing transactions. Regarding the second point, on reflection, I think we need to look up the XID list until all XIDs in the list are committed/aborted. We can remove XIDs from the list after adding it to committed.xip as you suggested. Or when decoding a RUNNING_XACTS record, we can remove XIDs older than builder->xmin from the list like we do for committed.xip in SnapBuildPurgeCommittedTxn(). Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
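The "bsearch'ing on the XID list" mentioned above can be sketched as follows. The comparator mirrors the behavior of PostgreSQL's xidComparator (plain uint32 ordering over a sorted array); xid_in_sorted_array is a hypothetical wrapper name used only for this illustration.

```c
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t TransactionId;

/*
 * Plain uint32 ordering, matching how xidComparator-sorted XID arrays are
 * kept in the serialized snapshot (no wraparound logic at this layer).
 */
static int
xid_comparator(const void *arg1, const void *arg2)
{
	TransactionId xid1 = *(const TransactionId *) arg1;
	TransactionId xid2 = *(const TransactionId *) arg2;

	if (xid1 < xid2)
		return -1;
	if (xid1 > xid2)
		return 1;
	return 0;
}

/* true if xid is present in the sorted array xip[0..xcnt) */
static int
xid_in_sorted_array(TransactionId xid, const TransactionId *xip, size_t xcnt)
{
	return bsearch(&xid, xip, xcnt,
				   sizeof(TransactionId), xid_comparator) != NULL;
}
```

Each decoded COMMIT record thus pays O(log n) for the membership test, which is the per-record overhead being weighed in this exchange.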
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Jul 6, 2022 at 12:19 PM Masahiko Sawada wrote: > > On Tue, Jul 5, 2022 at 8:00 PM Amit Kapila wrote: > > > > 2. Are we anytime removing transaction ids from catchanges->xip array? > > No. > > > If not, is there a reason for the same? I think we can remove it > > either at commit/abort or even immediately after adding the xid/subxid > > to committed->xip array. > > It might be a good idea but I'm concerned that removing XID from the > array at every commit/abort or after adding it to committed->xip array > might be costly as it requires adjustment of the array to keep its > order. Removing XIDs from the array would make bsearch faster but the > array is updated reasonably often (every 15 sec). > Fair point. However, I am slightly worried that we are unnecessarily searching in this new array even when ReorderBufferTxn has the required information. To avoid that, in function SnapBuildXidHasCatalogChange(), we can first check ReorderBufferXidHasCatalogChanges() and then check the array if the first check doesn't return true. Also, by the way, do we need to always keep builder->catchanges.xip updated via SnapBuildRestore()? Isn't it sufficient that we just read and throw away contents from a snapshot if builder->catchanges.xip is non-NULL? I had additionally thought if we can further optimize this solution to just store this additional information when we need to serialize for checkpoint record but I think that won't work because walsender can restart even without a restart of the server in which case the same problem can occur. I am not sure if there is a way to further optimize this solution, let me know if you have any ideas? -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 5, 2022 at 8:00 PM Amit Kapila wrote: > > On Mon, Jul 4, 2022 at 6:12 PM Amit Kapila wrote: > > > > On Mon, May 30, 2022 at 11:13 AM Masahiko Sawada > > wrote: > > > > > > I've attached three POC patches: > > > > > > > I think it will be a good idea if you can add a short commit message > > at least to say which patch is proposed for HEAD and which one is for > > back branches. Also, it would be good if you can add some description > > of the fix in the commit message. Let's remove poc* from the patch > > name. > > > > Review poc_add_running_catchanges_xacts_to_serialized_snapshot > > = > > Few more comments: Thank you for the comments. > 1. > + > + /* This array must be sorted in xidComparator order */ > + TransactionId *xip; > + } catchanges; > }; > > This array contains the transaction ids for subtransactions as well. I > think it is better mention the same in comments. Updated. > > 2. Are we anytime removing transaction ids from catchanges->xip array? No. > If not, is there a reason for the same? I think we can remove it > either at commit/abort or even immediately after adding the xid/subxid > to committed->xip array. It might be a good idea but I'm concerned that removing XID from the array at every commit/abort or after adding it to committed->xip array might be costly as it requires adjustment of the array to keep its order. Removing XIDs from the array would make bsearch faster but the array is updated reasonably often (every 15 sec). > > 3. > + if (readBytes != sz) > + { > + int save_errno = errno; > + > + CloseTransientFile(fd); > + > + if (readBytes < 0) > + { > + errno = save_errno; > + ereport(ERROR, > + (errcode_for_file_access(), > + errmsg("could not read file \"%s\": %m", path))); > + } > + else > + ereport(ERROR, > + (errcode(ERRCODE_DATA_CORRUPTED), > + errmsg("could not read file \"%s\": read %d of %zu", > + path, readBytes, sz))); > + } > > This is the fourth instance of similar error handling code in > SnapBuildRestore(). 
Isn't it better to extract this into a separate > function? Good idea, updated. > > 4. > +TransactionId * > +ReorderBufferGetCatalogChangesXacts(ReorderBuffer *rb, size_t *xcnt_p) > +{ > + HASH_SEQ_STATUS hash_seq; > + ReorderBufferTXNByIdEnt *ent; > + TransactionId *xids; > + size_t xcnt = 0; > + size_t xcnt_space = 64; /* arbitrary number */ > + > + xids = (TransactionId *) palloc(sizeof(TransactionId) * xcnt_space); > + > + hash_seq_init(&hash_seq, rb->by_txn); > + while ((ent = hash_seq_search(&hash_seq)) != NULL) > + { > + ReorderBufferTXN *txn = ent->txn; > + > + if (!rbtxn_has_catalog_changes(txn)) > + continue; > > It would be better to allocate memory the first time we have to store > xids. There is a good chance that many a time this function will do > just palloc without having to store any xid. Agreed. > > 5. Do you think we should do some performance testing for a mix of > ddl/dml workload to see if it adds any overhead in decoding due to > serialize/restore doing additional work? I don't think it should add > some meaningful overhead but OTOH there is no harm in doing some > testing of the same. Yes, it would be worth trying. I also believe this change doesn't introduce noticeable overhead but let's check just in case. I've attached an updated patch. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ 0001-Add-catalog-modifying-transactions-to-logical-decodi.patch Description: Binary data
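The helper extraction agreed on above (folding the four copies of the read-and-check pattern in SnapBuildRestore() into one function) could look roughly like this. The function name snapbuild_read_data is hypothetical, and ereport() is replaced by fprintf()/exit() so the sketch is self-contained; the real function would raise the same two errors shown in the quoted hunk.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Read exactly sz bytes from fd into dst, or report an error and bail out.
 * Mirrors the quoted error handling: a negative read() reports the errno,
 * a short read reports how many bytes arrived (possible data corruption).
 */
static void
snapbuild_read_data(int fd, void *dst, size_t sz, const char *path)
{
	ssize_t		readBytes = read(fd, dst, sz);

	if (readBytes != (ssize_t) sz)
	{
		int			save_errno = errno;

		close(fd);

		if (readBytes < 0)
			fprintf(stderr, "could not read file \"%s\": %s\n",
					path, strerror(save_errno));
		else
			fprintf(stderr, "could not read file \"%s\": read %zd of %zu\n",
					path, readBytes, sz);
		exit(1);
	}
}
```

With such a helper, each of the four restore sites reduces to a single call, and the error wording stays consistent across them.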
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Jul 6, 2022 at 7:38 AM Masahiko Sawada wrote: > > I'll post a new version patch in the next email with replying to other > comments. > Okay, thanks for working on this. Few comments/suggestions on poc_remember_last_running_xacts_v2 patch: 1. +ReorderBufferSetLastRunningXactsCatalogChanges(ReorderBuffer *rb, TransactionId xid, +uint32 xinfo, int subxcnt, +TransactionId *subxacts, XLogRecPtr lsn) +{ ... ... + + test = bsearch(&xid, rb->last_running_xacts, rb->n_last_running_xacts, +sizeof(TransactionId), xidComparator); + + if (test == NULL) + { + for (int i = 0; i < subxcnt; i++) + { + test = bsearch(&subxacts[i], rb->last_running_xacts, rb->n_last_running_xacts, +sizeof(TransactionId), xidComparator); ... Is there ever a possibility that the top transaction id is not in the running_xacts list but one of its subxids is present? If yes, it is not very obvious at least to me so adding a comment here could be useful. If not, then why do we need this additional check for each of the sub-transaction ids? 2. @@ -627,6 +647,15 @@ DecodeCommit(LogicalDecodingContext *ctx, XLogRecordBuffer *buf, commit_time = parsed->origin_timestamp; } + /* + * Set the last running xacts as containing catalog change if necessary. + * This must be done before SnapBuildCommitTxn() so that we include catalog + * change transactions to the historic snapshot. + */ + ReorderBufferSetLastRunningXactsCatalogChanges(ctx->reorder, xid, parsed->xinfo, +parsed->nsubxacts, parsed->subxacts, +buf->origptr); + SnapBuildCommitTxn(ctx->snapshot_builder, buf->origptr, xid, parsed->nsubxacts, parsed->subxacts); As mentioned previously as well, marking it before SnapBuildCommitTxn has one disadvantage, we sometimes do this work even if the snapshot state is SNAPBUILD_START or SNAPBUILD_BUILDING_SNAPSHOT in which case SnapBuildCommitTxn wouldn't do anything. 
Can we instead check whether the particular txn has invalidations and is present in the last_running_xacts list along with the check ReorderBufferXidHasCatalogChanges? I think that has the additional advantage that we don't need this additional marking if the xact is already marked as containing catalog changes. 3. + /* + * We rely on HEAP2_NEW_CID records and XACT_INVALIDATIONS to know + * if the transaction has changed the catalog, and that information + * is not serialized to SnapBuilder. Therefore, if the logical + * decoding decodes the commit record of the transaction that actually + * has done catalog changes without these records, we miss to add + * the xid to the snapshot so up creating the wrong snapshot. The part of the sentence "... snapshot so up creating the wrong snapshot." is not clear. In this comment, at one place you have used two spaces after a full stop, and at another place, there is one space. I think let's follow nearby code practice to use a single space before a new sentence. 4. +void +ReorderBufferProcessLastRunningXacts(ReorderBuffer *rb, xl_running_xacts *running) +{ + /* Quick exit if there is no longer last running xacts */ + if (likely(rb->n_last_running_xacts == 0)) + return; + + /* First call, build the last running xact list */ + if (rb->n_last_running_xacts == -1) + { + int nxacts = running->subxcnt + running->xcnt; + Size sz = sizeof(TransactionId) * nxacts; + + rb->last_running_xacts = MemoryContextAlloc(rb->context, sz); + memcpy(rb->last_running_xacts, running->xids, sz); + qsort(rb->last_running_xacts, nxacts, sizeof(TransactionId), xidComparator); + + rb->n_last_running_xacts = nxacts; + + return; + } a. Can we add the function header comments for this function? b. We seem to be tracking the running_xact information for the first running_xact record after start/restart. The name last_running_xacts doesn't sound appropriate for that, how about initial_running_xacts? 5. 
+ /* + * Purge xids in the last running xacts list if we can do that for at least + * one xid. + */ + if (NormalTransactionIdPrecedes(rb->last_running_xacts[0], + running->oldestRunningXid)) I think it would be a good idea to add a few lines here explaining why it is safe to purge. IIUC, it is because the commit for those xacts would have already been processed and we don't need such a xid anymore. 6. As per the discussion above in this thread having XACT_XINFO_HAS_INVALS in the commit record doesn't indicate that the xact has catalog changes, so can we add somewhere in comments that for such a case we can't distinguish whether the txn has catalog change but we still mark the txn has catalog changes? Can you please share one example for this case? -- With Regards, Amit Kapila.
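The "first call" branch quoted in comment 4 above can be illustrated standalone: copy the top-level and sub-transaction XIDs of the first RUNNING_XACTS record into one array and sort it, so later membership checks can use bsearch(). In this sketch, build_initial_running_xacts is a hypothetical name, malloc() stands in for MemoryContextAlloc(), and the comparator mimics xidComparator's plain uint32 ordering.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint32_t TransactionId;

/* plain uint32 ordering, as used for xidComparator-sorted arrays */
static int
xid_cmp(const void *a, const void *b)
{
	TransactionId x = *(const TransactionId *) a;
	TransactionId y = *(const TransactionId *) b;

	return (x < y) ? -1 : ((x > y) ? 1 : 0);
}

/*
 * Build a sorted copy of the XIDs (top-level and subxacts together) seen in
 * the first RUNNING_XACTS record after start/restart.  Caller frees.
 */
static TransactionId *
build_initial_running_xacts(const TransactionId *xids, size_t nxacts)
{
	size_t		sz = sizeof(TransactionId) * nxacts;
	TransactionId *arr = malloc(sz);

	memcpy(arr, xids, sz);
	qsort(arr, nxacts, sizeof(TransactionId), xid_cmp);
	return arr;
}
```

Sorting once up front is what makes the later per-commit bsearch() lookups (and the front-of-array purge discussed elsewhere in the thread) cheap.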
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Mon, Jul 4, 2022 at 9:42 PM Amit Kapila wrote: > > On Mon, May 30, 2022 at 11:13 AM Masahiko Sawada > wrote: > > > > I've attached three POC patches: > > > > I think it will be a good idea if you can add a short commit message > at least to say which patch is proposed for HEAD and which one is for > back branches. Also, it would be good if you can add some description > of the fix in the commit message. Let's remove poc* from the patch > name. Updated. > > Review poc_add_running_catchanges_xacts_to_serialized_snapshot > = > 1. > + /* > + * Array of transactions that were running when the snapshot serialization > + * and changed system catalogs, > > The part of the sentence after serialization is not very clear. Updated. > > 2. > - if (ReorderBufferXidHasCatalogChanges(builder->reorder, subxid)) > + if (ReorderBufferXidHasCatalogChanges(builder->reorder, subxid) || > + bsearch(&xid, builder->catchanges.xip, builder->catchanges.xcnt, > + sizeof(TransactionId), xidComparator) != NULL) > > Why are you using xid instead of subxid in bsearch call? Can we add a > comment to say why it is okay to use xid if there is a valid reason? > But note, we are using subxid to add to the committed xact array so > not sure if this is a good idea but I might be missing something. You're right, subxid should be used here. > > Suggestions for improvement in comments: > - /* > -* Update the transactions that are running and changes > catalogs that are > -* not committed. > -*/ > + /* Update the catalog modifying transactions that are yet not > committed. 
*/ > if (builder->catchanges.xip) > pfree(builder->catchanges.xip); > builder->catchanges.xip = > ReorderBufferGetCatalogChangesXacts(builder->reorder, > @@ -1647,7 +1644,7 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn) > COMP_CRC32C(ondisk->checksum, ondisk_c, sz); > ondisk_c += sz; > > - /* copy catalog-changes xacts */ > + /* copy catalog modifying xacts */ > sz = sizeof(TransactionId) * builder->catchanges.xcnt; > memcpy(ondisk_c, builder->catchanges.xip, sz); > COMP_CRC32C(ondisk->checksum, ondisk_c, sz); Updated. I'll post a new version patch in the next email with replying to other comments. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Mon, Jul 4, 2022 at 6:12 PM Amit Kapila wrote: > > On Mon, May 30, 2022 at 11:13 AM Masahiko Sawada > wrote: > > > > I've attached three POC patches: > > > > I think it will be a good idea if you can add a short commit message > at least to say which patch is proposed for HEAD and which one is for > back branches. Also, it would be good if you can add some description > of the fix in the commit message. Let's remove poc* from the patch > name. > > Review poc_add_running_catchanges_xacts_to_serialized_snapshot > = Few more comments: 1. + + /* This array must be sorted in xidComparator order */ + TransactionId *xip; + } catchanges; }; This array contains the transaction ids for subtransactions as well. I think it is better to mention the same in comments. 2. Are we anytime removing transaction ids from catchanges->xip array? If not, is there a reason for the same? I think we can remove it either at commit/abort or even immediately after adding the xid/subxid to committed->xip array. 3. + if (readBytes != sz) + { + int save_errno = errno; + + CloseTransientFile(fd); + + if (readBytes < 0) + { + errno = save_errno; + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not read file \"%s\": %m", path))); + } + else + ereport(ERROR, + (errcode(ERRCODE_DATA_CORRUPTED), + errmsg("could not read file \"%s\": read %d of %zu", + path, readBytes, sz))); + } This is the fourth instance of similar error handling code in SnapBuildRestore(). 
+TransactionId * +ReorderBufferGetCatalogChangesXacts(ReorderBuffer *rb, size_t *xcnt_p) +{ + HASH_SEQ_STATUS hash_seq; + ReorderBufferTXNByIdEnt *ent; + TransactionId *xids; + size_t xcnt = 0; + size_t xcnt_space = 64; /* arbitrary number */ + + xids = (TransactionId *) palloc(sizeof(TransactionId) * xcnt_space); + + hash_seq_init(&hash_seq, rb->by_txn); + while ((ent = hash_seq_search(&hash_seq)) != NULL) + { + ReorderBufferTXN *txn = ent->txn; + + if (!rbtxn_has_catalog_changes(txn)) + continue; It would be better to allocate memory the first time we have to store xids. There is a good chance that many a time this function will do just palloc without having to store any xid. 5. Do you think we should do some performance testing for a mix of ddl/dml workload to see if it adds any overhead in decoding due to serialize/restore doing additional work? I don't think it should add some meaningful overhead but OTOH there is no harm in doing some testing of the same. -- With Regards, Amit Kapila.
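The allocation strategy suggested in comment 4 above (defer the first palloc until a catalog-modifying transaction is actually found) can be sketched as follows. The inputs here (a parallel flag array) are a stand-in for walking rb->by_txn and testing rbtxn_has_catalog_changes(); malloc/realloc stand in for palloc/repalloc, and the function name is hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t TransactionId;

/*
 * Collect the XIDs whose flag is set.  The result array is allocated only
 * once the first match is found, and its capacity doubles when it fills up.
 * Returns NULL (and *xcnt_p == 0) when nothing matched, so the common
 * no-catalog-changes case performs no allocation at all.
 */
static TransactionId *
collect_catalog_change_xids(const TransactionId *xids_in,
							const int *has_catalog_change,
							size_t n, size_t *xcnt_p)
{
	TransactionId *xids = NULL;
	size_t		xcnt = 0;
	size_t		xcnt_space = 0;

	for (size_t i = 0; i < n; i++)
	{
		if (!has_catalog_change[i])
			continue;

		if (xcnt == xcnt_space)
		{
			xcnt_space = (xcnt_space == 0) ? 64 : xcnt_space * 2;
			xids = realloc(xids, sizeof(TransactionId) * xcnt_space);
		}
		xids[xcnt++] = xids_in[i];
	}

	*xcnt_p = xcnt;
	return xids;
}
```

Compared with the quoted version, which always pallocs space for 64 XIDs, this avoids the allocation entirely in the (frequent) case where no transaction in the hash table has catalog changes.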
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Mon, May 30, 2022 at 11:13 AM Masahiko Sawada wrote: > > I've attached three POC patches: > I think it will be a good idea if you can add a short commit message at least to say which patch is proposed for HEAD and which one is for back branches. Also, it would be good if you can add some description of the fix in the commit message. Let's remove poc* from the patch name. Review poc_add_running_catchanges_xacts_to_serialized_snapshot = 1. + /* + * Array of transactions that were running when the snapshot serialization + * and changed system catalogs, The part of the sentence after serialization is not very clear. 2. - if (ReorderBufferXidHasCatalogChanges(builder->reorder, subxid)) + if (ReorderBufferXidHasCatalogChanges(builder->reorder, subxid) || + bsearch(&xid, builder->catchanges.xip, builder->catchanges.xcnt, + sizeof(TransactionId), xidComparator) != NULL) Why are you using xid instead of subxid in bsearch call? Can we add a comment to say why it is okay to use xid if there is a valid reason? But note, we are using subxid to add to the committed xact array so not sure if this is a good idea but I might be missing something. Suggestions for improvement in comments: - /* -* Update the transactions that are running and changes catalogs that are -* not committed. -*/ + /* Update the catalog modifying transactions that are yet not committed. */ if (builder->catchanges.xip) pfree(builder->catchanges.xip); builder->catchanges.xip = ReorderBufferGetCatalogChangesXacts(builder->reorder, @@ -1647,7 +1644,7 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn) COMP_CRC32C(ondisk->checksum, ondisk_c, sz); ondisk_c += sz; - /* copy catalog-changes xacts */ + /* copy catalog modifying xacts */ sz = sizeof(TransactionId) * builder->catchanges.xcnt; memcpy(ondisk_c, builder->catchanges.xip, sz); COMP_CRC32C(ondisk->checksum, ondisk_c, sz); -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jun 14, 2022 at 3:57 PM Amit Kapila wrote: > > On Mon, Jun 13, 2022 at 8:29 AM Masahiko Sawada wrote: > > > > On Tue, Jun 7, 2022 at 9:32 PM Amit Kapila wrote: > > > > > > On Mon, May 30, 2022 at 11:13 AM Masahiko Sawada > > > wrote: > > > > > > > > On Wed, May 25, 2022 at 12:11 PM Masahiko Sawada > > > > wrote: > > > > > > > > > > > > > poc_add_regression_tests.patch adds regression tests for this bug. The > > > > regression tests are required for both HEAD and back-patching but I've > > > > separated this patch for testing the above two patches easily. > > > > > > > > Thank you for the comments. > > > > > > > > Few comments on the test case patch: > > > === > > > 1. > > > +# For the transaction that TRUNCATEd the table tbl1, the last decoding > > > decodes > > > +# only its COMMIT record, because it starts from the RUNNING_XACT > > > record emitted > > > +# during the first checkpoint execution. This transaction must be > > > marked as > > > +# catalog-changes while decoding the COMMIT record and the decoding > > > of the INSERT > > > +# record must read the pg_class with the correct historic snapshot. > > > +permutation "s0_init" "s0_begin" "s0_savepoint" "s0_truncate" > > > "s1_checkpoint" "s1_get_changes" "s0_commit" "s0_begin" "s0_insert" > > > "s1_checkpoint" "s1_get_changes" "s0_commit" "s1_get_changes" > > > > > > Will this test always work? What if we get an additional running_xact > > > record between steps "s0_commit" and "s0_begin" that is logged via > > > bgwriter? You can mimic that by adding an additional checkpoint > > > between those two steps. If we do that, the test will pass even > > > without the patch because I think the last decoding will start > > > decoding from this new running_xact record. > > > > Right. It could pass depending on the timing but doesn't fail > > depending on the timing. I think we need to somehow stop bgwriter to > > make the test case stable but it seems unrealistic. 
> > Agreed, in my local testing for this case, I used to increase > LOG_SNAPSHOT_INTERVAL_MS to avoid such a situation but I understand it > is not practical via test. > > > Do you have any > > better ideas? > > > > No, I don't have any better ideas. I think it is better to add some > information related to this in the comments because it may help to > improve the test in the future if we come up with a better idea. I also don't have any better ideas to make it stable, and agreed. I've attached an updated version patch for adding regression tests. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ poc_add_regression_tests_v2.patch Description: Binary data
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Mon, Jun 13, 2022 at 8:29 AM Masahiko Sawada wrote: > > On Tue, Jun 7, 2022 at 9:32 PM Amit Kapila wrote: > > > > On Mon, May 30, 2022 at 11:13 AM Masahiko Sawada > > wrote: > > > > > > On Wed, May 25, 2022 at 12:11 PM Masahiko Sawada > > > wrote: > > > > > > > > > > poc_add_regression_tests.patch adds regression tests for this bug. The > > > regression tests are required for both HEAD and back-patching but I've > > > separated this patch for testing the above two patches easily. > > > > > Thank you for the comments. > > > > > Few comments on the test case patch: > > === > > 1. > > +# For the transaction that TRUNCATEd the table tbl1, the last decoding > > decodes > > +# only its COMMIT record, because it starts from the RUNNING_XACT > > record emitted > > +# during the first checkpoint execution. This transaction must be marked > > as > > +# catalog-changes while decoding the COMMIT record and the decoding > > of the INSERT > > +# record must read the pg_class with the correct historic snapshot. > > +permutation "s0_init" "s0_begin" "s0_savepoint" "s0_truncate" > > "s1_checkpoint" "s1_get_changes" "s0_commit" "s0_begin" "s0_insert" > > "s1_checkpoint" "s1_get_changes" "s0_commit" "s1_get_changes" > > > > Will this test always work? What if we get an additional running_xact > > record between steps "s0_commit" and "s0_begin" that is logged via > > bgwriter? You can mimic that by adding an additional checkpoint > > between those two steps. If we do that, the test will pass even > > without the patch because I think the last decoding will start > > decoding from this new running_xact record. > > Right. It could pass depending on the timing but doesn't fail > depending on the timing. I think we need to somehow stop bgwriter to > make the test case stable but it seems unrealistic. > Agreed, in my local testing for this case, I used to increase LOG_SNAPSHOT_INTERVAL_MS to avoid such a situation but I understand it is not practical via test. 
> Do you have any > better ideas? > No, I don't have any better ideas. I think it is better to add some information related to this in the comments because it may help to improve the test in the future if we come up with a better idea. -- With Regards, Amit Kapila.