Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
> Pushed.
>
> --
> With Regards,
> Amit Kapila.

Hi! While working on the 64-bit XIDs patch set, I stumbled into problems with the contrib/test_decoding/catalog_change_snapshot test [0]. AFAICS, the problem is not related to the 64-bit XIDs patch set; rather, it is in the InitialRunningXacts array, allocated in builder->context. Do we really need it to be allocated that way?

[0] https://www.postgresql.org/message-id/CACG%3DezZoz_KG%2BRyh9MrU_g5e0HiVoHocEvqFF%3DNRrhrwKmEQJQ%40mail.gmail.com

--
Best regards,
Maxim Orlov.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Sat, Aug 27, 2022 at 7:06 PM Masahiko Sawada wrote: > > On Sat, Aug 27, 2022 at 7:24 PM Amit Kapila wrote: > > > > On Sat, Aug 27, 2022 at 1:06 PM Masahiko Sawada > > wrote: > > > > > > > > > > > > > I think then we should change this code in the master branch patch > > > > > with an additional comment on the lines of: "Either all the xacts got > > > > > purged or none. It is only possible to partially remove the xids from > > > > > this array if one or more of the xids are still running but not all. > > > > > That can happen if we start decoding from a point (LSN where the > > > > > snapshot state became consistent) where all the xacts in this were > > > > > running and then at least one of those got committed and a few are > > > > > still running. We will never start from such a point because we won't > > > > > move the slot's restart_lsn past the point where the oldest running > > > > > transaction's restart_decoding_lsn is." > > > > > > > > > > > > > Unfortunately, this theory doesn't turn out to be true. While > > > > investigating the latest buildfarm failure [1], I see that it is > > > > possible that only part of the xacts in the restored catalog modifying > > > > xacts list needs to be purged. See the attached where I have > > > > demonstrated it via a reproducible test. It seems the point we were > > > > missing was that to start from a point where two or more catalog > > > > modifying were serialized, it requires another open transaction before > > > > both get committed, and then we need the checkpoint or other way to > > > > force running_xacts record in-between the commit of initial two > > > > catalog modifying xacts. There could possibly be other ways as well > > > > but the theory above wasn't correct. > > > > > > > > > > Thank you for the analysis and the patch. I have the same conclusion. > > > Since we took this approach only on the master the back branches are > > > not affected. 
> > > > > > The new test scenario makes sense to me and looks better than the one > > > I have. Regarding the fix, I think we should use > > > TransactionIdFollowsOrEquals() instead of > > > NormalTransactionIdPrecedes(): > > > > > > + for (off = 0; off < builder->catchange.xcnt; off++) > > > + { > > > + if (NormalTransactionIdPrecedes(builder->catchange.xip[off], > > > + builder->xmin)) > > > + break; > > > + } > > > > > > > Right, fixed. > > Thank you for updating the patch! It looks good to me. > Pushed. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Sat, Aug 27, 2022 at 7:24 PM Amit Kapila wrote: > > On Sat, Aug 27, 2022 at 1:06 PM Masahiko Sawada wrote: > > > > On Sat, Aug 27, 2022 at 3:56 PM Amit Kapila wrote: > > > > > > On Fri, Jul 29, 2022 at 12:15 PM Amit Kapila > > > wrote: > > > > > > > > > > > > > > Yeah, your description makes sense to me. I've also considered how to > > > > > hit this path but I guess it is never hit. Thinking of it in another > > > > > way, first of all, at least 2 catalog modifying transactions have to > > > > > be running while writing a xl_running_xacts. The serialized snapshot > > > > > that is written when we decode the first xl_running_xact has two > > > > > transactions. Then, one of them is committed before the second > > > > > xl_running_xacts. The second serialized snapshot has only one > > > > > transaction. Then, the transaction is also committed after that. Now, > > > > > in order to execute the path, we need to start decoding from the first > > > > > serialized snapshot. However, if we start from there, we cannot decode > > > > > the full contents of the transaction that was committed later. > > > > > > > > > > > > > I think then we should change this code in the master branch patch > > > > with an additional comment on the lines of: "Either all the xacts got > > > > purged or none. It is only possible to partially remove the xids from > > > > this array if one or more of the xids are still running but not all. > > > > That can happen if we start decoding from a point (LSN where the > > > > snapshot state became consistent) where all the xacts in this were > > > > running and then at least one of those got committed and a few are > > > > still running. We will never start from such a point because we won't > > > > move the slot's restart_lsn past the point where the oldest running > > > > transaction's restart_decoding_lsn is." > > > > > > > > > > Unfortunately, this theory doesn't turn out to be true. 
While > > > investigating the latest buildfarm failure [1], I see that it is > > > possible that only part of the xacts in the restored catalog modifying > > > xacts list needs to be purged. See the attached where I have > > > demonstrated it via a reproducible test. It seems the point we were > > > missing was that to start from a point where two or more catalog > > > modifying were serialized, it requires another open transaction before > > > both get committed, and then we need the checkpoint or other way to > > > force running_xacts record in-between the commit of initial two > > > catalog modifying xacts. There could possibly be other ways as well > > > but the theory above wasn't correct. > > > > > > > Thank you for the analysis and the patch. I have the same conclusion. > > Since we took this approach only on the master the back branches are > > not affected. > > > > The new test scenario makes sense to me and looks better than the one > > I have. Regarding the fix, I think we should use > > TransactionIdFollowsOrEquals() instead of > > NormalTransactionIdPrecedes(): > > > > + for (off = 0; off < builder->catchange.xcnt; off++) > > + { > > + if (NormalTransactionIdPrecedes(builder->catchange.xip[off], > > + builder->xmin)) > > + break; > > + } > > > > Right, fixed. Thank you for updating the patch! It looks good to me. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Sat, Aug 27, 2022 at 1:06 PM Masahiko Sawada wrote: > > On Sat, Aug 27, 2022 at 3:56 PM Amit Kapila wrote: > > > > On Fri, Jul 29, 2022 at 12:15 PM Amit Kapila > > wrote: > > > > > > > > > > > Yeah, your description makes sense to me. I've also considered how to > > > > hit this path but I guess it is never hit. Thinking of it in another > > > > way, first of all, at least 2 catalog modifying transactions have to > > > > be running while writing a xl_running_xacts. The serialized snapshot > > > > that is written when we decode the first xl_running_xact has two > > > > transactions. Then, one of them is committed before the second > > > > xl_running_xacts. The second serialized snapshot has only one > > > > transaction. Then, the transaction is also committed after that. Now, > > > > in order to execute the path, we need to start decoding from the first > > > > serialized snapshot. However, if we start from there, we cannot decode > > > > the full contents of the transaction that was committed later. > > > > > > > > > > I think then we should change this code in the master branch patch > > > with an additional comment on the lines of: "Either all the xacts got > > > purged or none. It is only possible to partially remove the xids from > > > this array if one or more of the xids are still running but not all. > > > That can happen if we start decoding from a point (LSN where the > > > snapshot state became consistent) where all the xacts in this were > > > running and then at least one of those got committed and a few are > > > still running. We will never start from such a point because we won't > > > move the slot's restart_lsn past the point where the oldest running > > > transaction's restart_decoding_lsn is." > > > > > > > Unfortunately, this theory doesn't turn out to be true. While > > investigating the latest buildfarm failure [1], I see that it is > > possible that only part of the xacts in the restored catalog modifying > > xacts list needs to be purged. 
See the attached where I have > > demonstrated it via a reproducible test. It seems the point we were > > missing was that to start from a point where two or more catalog > > modifying were serialized, it requires another open transaction before > > both get committed, and then we need the checkpoint or other way to > > force running_xacts record in-between the commit of initial two > > catalog modifying xacts. There could possibly be other ways as well > > but the theory above wasn't correct. > > > > Thank you for the analysis and the patch. I have the same conclusion. > Since we took this approach only on the master the back branches are > not affected. > > The new test scenario makes sense to me and looks better than the one > I have. Regarding the fix, I think we should use > TransactionIdFollowsOrEquals() instead of > NormalTransactionIdPrecedes(): > > + for (off = 0; off < builder->catchange.xcnt; off++) > + { > + if (NormalTransactionIdPrecedes(builder->catchange.xip[off], > + builder->xmin)) > + break; > + } > Right, fixed. -- With Regards, Amit Kapila. v2-0001-Fix-the-incorrect-assertion-introduced-in-commit-.patch Description: Binary data
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Sat, Aug 27, 2022 at 3:56 PM Amit Kapila wrote: > > On Fri, Jul 29, 2022 at 12:15 PM Amit Kapila wrote: > > > > > > > > Yeah, your description makes sense to me. I've also considered how to > > > hit this path but I guess it is never hit. Thinking of it in another > > > way, first of all, at least 2 catalog modifying transactions have to > > > be running while writing a xl_running_xacts. The serialized snapshot > > > that is written when we decode the first xl_running_xact has two > > > transactions. Then, one of them is committed before the second > > > xl_running_xacts. The second serialized snapshot has only one > > > transaction. Then, the transaction is also committed after that. Now, > > > in order to execute the path, we need to start decoding from the first > > > serialized snapshot. However, if we start from there, we cannot decode > > > the full contents of the transaction that was committed later. > > > > > > > I think then we should change this code in the master branch patch > > with an additional comment on the lines of: "Either all the xacts got > > purged or none. It is only possible to partially remove the xids from > > this array if one or more of the xids are still running but not all. > > That can happen if we start decoding from a point (LSN where the > > snapshot state became consistent) where all the xacts in this were > > running and then at least one of those got committed and a few are > > still running. We will never start from such a point because we won't > > move the slot's restart_lsn past the point where the oldest running > > transaction's restart_decoding_lsn is." > > > > Unfortunately, this theory doesn't turn out to be true. While > investigating the latest buildfarm failure [1], I see that it is > possible that only part of the xacts in the restored catalog modifying > xacts list needs to be purged. See the attached where I have > demonstrated it via a reproducible test. 
It seems the point we were > missing was that to start from a point where two or more catalog > modifying were serialized, it requires another open transaction before > both get committed, and then we need the checkpoint or other way to > force running_xacts record in-between the commit of initial two > catalog modifying xacts. There could possibly be other ways as well > but the theory above wasn't correct. > Thank you for the analysis and the patch. I have the same conclusion. Since we took this approach only on the master the back branches are not affected. The new test scenario makes sense to me and looks better than the one I have. Regarding the fix, I think we should use TransactionIdFollowsOrEquals() instead of NormalTransactionIdPrecedes(): + for (off = 0; off < builder->catchange.xcnt; off++) + { + if (NormalTransactionIdPrecedes(builder->catchange.xip[off], + builder->xmin)) + break; + } Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Fri, Jul 29, 2022 at 12:15 PM Amit Kapila wrote: > > > > > Yeah, your description makes sense to me. I've also considered how to > > hit this path but I guess it is never hit. Thinking of it in another > > way, first of all, at least 2 catalog modifying transactions have to > > be running while writing a xl_running_xacts. The serialized snapshot > > that is written when we decode the first xl_running_xact has two > > transactions. Then, one of them is committed before the second > > xl_running_xacts. The second serialized snapshot has only one > > transaction. Then, the transaction is also committed after that. Now, > > in order to execute the path, we need to start decoding from the first > > serialized snapshot. However, if we start from there, we cannot decode > > the full contents of the transaction that was committed later. > > > > I think then we should change this code in the master branch patch > with an additional comment on the lines of: "Either all the xacts got > purged or none. It is only possible to partially remove the xids from > this array if one or more of the xids are still running but not all. > That can happen if we start decoding from a point (LSN where the > snapshot state became consistent) where all the xacts in this were > running and then at least one of those got committed and a few are > still running. We will never start from such a point because we won't > move the slot's restart_lsn past the point where the oldest running > transaction's restart_decoding_lsn is." > Unfortunately, this theory doesn't turn out to be true. While investigating the latest buildfarm failure [1], I see that it is possible that only part of the xacts in the restored catalog modifying xacts list needs to be purged. See the attached where I have demonstrated it via a reproducible test. 
It seems the point we were missing was that to start from a point where two or more catalog modifying transactions were serialized, it requires another open transaction before both get committed, and then we need a checkpoint or some other way to force a running_xacts record in between the commits of the initial two catalog modifying xacts. There could possibly be other ways as well, but the theory above wasn't correct.

[1] - https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=curculio&dt=2022-08-25%2004%3A15%3A34

--
With Regards,
Amit Kapila.

diff --git a/contrib/test_decoding/expected/catalog_change_snapshot.out b/contrib/test_decoding/expected/catalog_change_snapshot.out
index dc4f9b7..d2a4bdf 100644
--- a/contrib/test_decoding/expected/catalog_change_snapshot.out
+++ b/contrib/test_decoding/expected/catalog_change_snapshot.out
@@ -1,4 +1,4 @@
-Parsed test spec with 2 sessions
+Parsed test spec with 3 sessions
 
 starting permutation: s0_init s0_begin s0_savepoint s0_truncate s1_checkpoint s1_get_changes s0_commit s0_begin s0_insert s1_checkpoint s1_get_changes s0_commit s1_get_changes
 step s0_init: SELECT 'init' FROM pg_create_logical_replication_slot('isolation_slot', 'test_decoding');
@@ -42,3 +42,57 @@ COMMIT
 stop
 (1 row)
 
+
+starting permutation: s0_init s0_begin s0_truncate s2_begin s2_truncate s1_checkpoint s1_get_changes s0_commit s0_begin s0_insert s1_checkpoint s1_get_changes s2_commit s1_checkpoint s1_get_changes s0_commit s1_get_changes
+step s0_init: SELECT 'init' FROM pg_create_logical_replication_slot('isolation_slot', 'test_decoding');
+?column?
+--------
+init
+(1 row)
+
+step s0_begin: BEGIN;
+step s0_truncate: TRUNCATE tbl1;
+step s2_begin: BEGIN;
+step s2_truncate: TRUNCATE tbl2;
+step s1_checkpoint: CHECKPOINT;
+step s1_get_changes: SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'skip-empty-xacts', '1', 'include-xids', '0');
+data
+----
+(0 rows)
+
+step s0_commit: COMMIT;
+step s0_begin: BEGIN;
+step s0_insert: INSERT INTO tbl1 VALUES (1);
+step s1_checkpoint: CHECKPOINT;
+step s1_get_changes: SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'skip-empty-xacts', '1', 'include-xids', '0');
+data
+----
+BEGIN
+table public.tbl1: TRUNCATE: (no-flags)
+COMMIT
+(3 rows)
+
+step s2_commit: COMMIT;
+step s1_checkpoint: CHECKPOINT;
+step s1_get_changes: SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'skip-empty-xacts', '1', 'include-xids', '0');
+data
+----
+BEGIN
+table public.tbl2: TRUNCATE: (no-flags)
+COMMIT
+(3 rows)
+
+step s0_commit: COMMIT;
+step s1_get_changes: SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'skip-empty-xacts', '1', 'include-xids', '0');
+data
+----
+BEGIN
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Thu, Aug 11, 2022 at 3:10 PM Amit Kapila wrote: > > On Mon, Aug 8, 2022 at 9:34 AM Amit Kapila wrote: > > > > On Wed, Aug 3, 2022 at 1:20 PM Masahiko Sawada > > wrote: > > > > > > > > > Oops, thanks for pointing it out. I've fixed it and attached updated > > > patches for all branches so as not to confuse the patch version. There > > > is no update from v12 patch on REL12 - master patches. > > > > > > > Thanks for the updated patches, the changes look good to me. > > Horiguchi-San, and others, do you have any further comments on this or > > do you want to spend time in review of it? If not, I would like to > > push this after the current minor version release. > > > > Pushed. Thank you! Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Mon, Aug 8, 2022 at 9:34 AM Amit Kapila wrote: > > On Wed, Aug 3, 2022 at 1:20 PM Masahiko Sawada wrote: > > > > > > Oops, thanks for pointing it out. I've fixed it and attached updated > > patches for all branches so as not to confuse the patch version. There > > is no update from v12 patch on REL12 - master patches. > > > > Thanks for the updated patches, the changes look good to me. > Horiguchi-San, and others, do you have any further comments on this or > do you want to spend time in review of it? If not, I would like to > push this after the current minor version release. > Pushed. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Aug 3, 2022 at 1:20 PM Masahiko Sawada wrote: > > On Wed, Aug 3, 2022 at 3:52 PM shiy.f...@fujitsu.com > wrote: > > > > On Wed, Aug 3, 2022 12:06 PM Masahiko Sawada wrote: > > > > > > I've attached updated patches that incorporated the above comments as > > > well as the comments from Shi yu. Please review them. > > > > > > > Thanks for updating the patch. > > > > I noticed that in SnapBuildXidSetCatalogChanges(), "i" is initialized in > > the if > > branch in REL10 patch, which is different from REL11 patch. Maybe we can > > modify > > REL11 patch to be consistent with REL10 patch. > > > > The rest of the patch looks good to me. > > Oops, thanks for pointing it out. I've fixed it and attached updated > patches for all branches so as not to confuse the patch version. There > is no update from v12 patch on REL12 - master patches. > Thanks for the updated patches, the changes look good to me. Horiguchi-San, and others, do you have any further comments on this or do you want to spend time in review of it? If not, I would like to push this after the current minor version release. -- With Regards, Amit Kapila.
RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Aug 3, 2022 12:06 PM Masahiko Sawada wrote: > > I've attached updated patches that incorporated the above comments as > well as the comments from Shi yu. Please review them. > Thanks for updating the patch. I noticed that in SnapBuildXidSetCatalogChanges(), "i" is initialized in the if branch in REL10 patch, which is different from REL11 patch. Maybe we can modify REL11 patch to be consistent with REL10 patch. The rest of the patch looks good to me. Regards, Shi yu
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
At Wed, 3 Aug 2022 08:51:40 +0530, Amit Kapila wrote in > On Wed, Aug 3, 2022 at 7:05 AM Masahiko Sawada wrote: > > Looking at other codes in snapbuild.c, we call CloseTransientFile() > > before erroring out in SnapBuildSerialize(). I think it's better to > > keep it consistent with nearby codes in this patch. I think if we > > prefer the style of closing the file by ereport(ERROR), it should be > > done for all of them in a separate patch. > > > > +1. I also feel it is better to change it in a separate patch as this > is not a pattern introduced by this patch. Agreed. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Aug 2, 2022 at 5:31 PM shiy.f...@fujitsu.com wrote: > > On Mon, Aug 1, 2022 10:31 PM Amit Kapila wrote: > > > > On Mon, Aug 1, 2022 at 7:46 AM Masahiko Sawada > > wrote: > > > > > > On Fri, Jul 29, 2022 at 3:45 PM Amit Kapila > > wrote: > > > > > > > > > > I've attached updated patches for all branches. Please review them. > > > > > > > Thanks, the patches look mostly good to me. I have made minor edits by > > removing 'likely' from a few places as those don't seem to be adding > > much value, changed comments at a few places, and was getting > > compilation in error in v11/10 (snapbuild.c:2111:3: error: ‘for’ loop > > initial declarations are only allowed in C99 mode) which I have fixed. > > See attached, unless there are major comments/suggestions, I am > > planning to push this day after tomorrow (by Wednesday) after another > > pass. > > > > Thanks for updating the patch. > > Here are some minor comments: > > 1. > patches for REL10 ~ REL13: > + * Mark the transaction as containing catalog changes. In addition, if the > + * given xid is in the list of the initial running xacts, we mark the > + * its subtransactions as well. See comments for NInitialRunningXacts and > + * InitialRunningXacts for additional info. > > "mark the its subtransactions" > -> > "mark its subtransactions" > > 2. > patches for REL10 ~ REL15: > In the comment in catalog_change_snapshot.spec, maybe we can use > "RUNNING_XACTS" > instead of "RUNNING_XACT" "XACT_RUNNING", same as the patch for master branch. > Thank you for the comments! These have been incorporated in the latest version v12 patch I just submitted. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Aug 3, 2022 at 7:05 AM Masahiko Sawada wrote: > > On Wed, Aug 3, 2022 at 10:20 AM Kyotaro Horiguchi > wrote: > > > > At Tue, 2 Aug 2022 13:54:43 +0530, Amit Kapila > > wrote in > > > On Tue, Aug 2, 2022 at 12:00 PM Kyotaro Horiguchi > > > wrote: > > > > > > > > + { > > > > + int save_errno = errno; > > > > + > > > > + CloseTransientFile(fd); > > > > + > > > > + if (readBytes < 0) > > > > + { > > > > + errno = save_errno; > > > > + ereport(ERROR, > > > > > > > > Do we need the CloseTransientFile(fd) there? This call requires errno > > > > to be remembered but anyway OpenTransientFile'd files are to be close > > > > at transaction end. Actually CloseTransientFile() is not called > > > > before error'ing-out at error in other places. > > .. > > > We just moved it to a separate function as the same code is being > > > duplicated to multiple places. > > > > There are code paths that doesn't CloseTransientFile() explicitly, > > too. If there were no need of save_errno there, that'd be fine. But > > otherwise I guess we prefer to let the orphan fds closed by ERROR and > > I don't think we need to preserve the less-preferred code pattern (if > > we actually prefer not to have the explicit call). > > Looking at other codes in snapbuild.c, we call CloseTransientFile() > before erroring out in SnapBuildSerialize(). I think it's better to > keep it consistent with nearby codes in this patch. I think if we > prefer the style of closing the file by ereport(ERROR), it should be > done for all of them in a separate patch. > +1. I also feel it is better to change it in a separate patch as this is not a pattern introduced by this patch. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Aug 3, 2022 at 10:20 AM Kyotaro Horiguchi wrote: > > At Tue, 2 Aug 2022 13:54:43 +0530, Amit Kapila > wrote in > > On Tue, Aug 2, 2022 at 12:00 PM Kyotaro Horiguchi > > wrote: > > > > > > + { > > > + int save_errno = errno; > > > + > > > + CloseTransientFile(fd); > > > + > > > + if (readBytes < 0) > > > + { > > > + errno = save_errno; > > > + ereport(ERROR, > > > > > > Do we need the CloseTransientFile(fd) there? This call requires errno > > > to be remembered but anyway OpenTransientFile'd files are to be close > > > at transaction end. Actually CloseTransientFile() is not called > > > before error'ing-out at error in other places. > .. > > We just moved it to a separate function as the same code is being > > duplicated to multiple places. > > There are code paths that doesn't CloseTransientFile() explicitly, > too. If there were no need of save_errno there, that'd be fine. But > otherwise I guess we prefer to let the orphan fds closed by ERROR and > I don't think we need to preserve the less-preferred code pattern (if > we actually prefer not to have the explicit call). Looking at other codes in snapbuild.c, we call CloseTransientFile() before erroring out in SnapBuildSerialize(). I think it's better to keep it consistent with nearby codes in this patch. I think if we prefer the style of closing the file by ereport(ERROR), it should be done for all of them in a separate patch. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
At Tue, 2 Aug 2022 13:54:43 +0530, Amit Kapila wrote in > On Tue, Aug 2, 2022 at 12:00 PM Kyotaro Horiguchi > wrote: > > > > + { > > + int save_errno = errno; > > + > > + CloseTransientFile(fd); > > + > > + if (readBytes < 0) > > + { > > + errno = save_errno; > > + ereport(ERROR, > > > > Do we need the CloseTransientFile(fd) there? This call requires errno > > to be remembered but anyway OpenTransientFile'd files are to be close > > at transaction end. Actually CloseTransientFile() is not called > > before error'ing-out at error in other places. .. > We just moved it to a separate function as the same code is being > duplicated to multiple places. There are code paths that doesn't CloseTransientFile() explicitly, too. If there were no need of save_errno there, that'd be fine. But otherwise I guess we prefer to let the orphan fds closed by ERROR and I don't think we need to preserve the less-preferred code pattern (if we actually prefer not to have the explicit call). regards. -- Kyotaro Horiguchi NTT Open Source Software Center
RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Mon, Aug 1, 2022 10:31 PM Amit Kapila wrote: > > On Mon, Aug 1, 2022 at 7:46 AM Masahiko Sawada > wrote: > > > > On Fri, Jul 29, 2022 at 3:45 PM Amit Kapila > wrote: > > > > > > > I've attached updated patches for all branches. Please review them. > > > > Thanks, the patches look mostly good to me. I have made minor edits by > removing 'likely' from a few places as those don't seem to be adding > much value, changed comments at a few places, and was getting > compilation in error in v11/10 (snapbuild.c:2111:3: error: ‘for’ loop > initial declarations are only allowed in C99 mode) which I have fixed. > See attached, unless there are major comments/suggestions, I am > planning to push this day after tomorrow (by Wednesday) after another > pass. > Thanks for updating the patch. Here are some minor comments: 1. patches for REL10 ~ REL13: + * Mark the transaction as containing catalog changes. In addition, if the + * given xid is in the list of the initial running xacts, we mark the + * its subtransactions as well. See comments for NInitialRunningXacts and + * InitialRunningXacts for additional info. "mark the its subtransactions" -> "mark its subtransactions" 2. patches for REL10 ~ REL15: In the comment in catalog_change_snapshot.spec, maybe we can use "RUNNING_XACTS" instead of "RUNNING_XACT" "XACT_RUNNING", same as the patch for master branch. Regards, Shi yu
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Aug 2, 2022 at 12:00 PM Kyotaro Horiguchi wrote:
>
> At Mon, 1 Aug 2022 20:01:00 +0530, Amit Kapila wrote in
> > On Mon, Aug 1, 2022 at 7:46 AM Masahiko Sawada wrote:
> > >
> > > On Fri, Jul 29, 2022 at 3:45 PM Amit Kapila wrote:
> > > >
> > > > I've attached updated patches for all branches. Please review them.
> > >
> > Thanks, the patches look mostly good to me. I have made minor edits by
> > removing 'likely' from a few places as those don't seem to be adding
> > much value, changed comments at a few places, and was getting
> > compilation in error in v11/10 (snapbuild.c:2111:3: error: ‘for’ loop
> > initial declarations are only allowed in C99 mode) which I have fixed.
> > See attached, unless there are major comments/suggestions, I am
> > planning to push this day after tomorrow (by Wednesday) after another
> > pass.
>
> + {
> + int save_errno = errno;
> +
> + CloseTransientFile(fd);
> +
> + if (readBytes < 0)
> + {
> + errno = save_errno;
> + ereport(ERROR,
>
> Do we need the CloseTransientFile(fd) there? This call requires errno
> to be remembered but anyway OpenTransientFile'd files are to be close
> at transaction end. Actually CloseTransientFile() is not called
> before error'ing-out at error in other places.
>

But this part of the code is just a copy of the existing code. See:

- if (readBytes != sizeof(SnapBuild))
- {
- int save_errno = errno;
-
- CloseTransientFile(fd);
-
- if (readBytes < 0)
- {
- errno = save_errno;
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not read file \"%s\": %m", path)));
- }
- else
- ereport(ERROR,
- (errcode(ERRCODE_DATA_CORRUPTED),
- errmsg("could not read file \"%s\": read %d of %zu",
- path, readBytes, sizeof(SnapBuild))));
- }

We just moved it to a separate function as the same code is being duplicated to multiple places.

--
With Regards,
Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
At Mon, 1 Aug 2022 20:01:00 +0530, Amit Kapila wrote in > On Mon, Aug 1, 2022 at 7:46 AM Masahiko Sawada wrote: > > > > On Fri, Jul 29, 2022 at 3:45 PM Amit Kapila wrote: > > > > > > > I've attached updated patches for all branches. Please review them. > > > > Thanks, the patches look mostly good to me. I have made minor edits by > removing 'likely' from a few places as those don't seem to be adding > much value, changed comments at a few places, and was getting > a compilation error in v11/10 (snapbuild.c:2111:3: error: ‘for’ loop > initial declarations are only allowed in C99 mode) which I have fixed. > See attached, unless there are major comments/suggestions, I am > planning to push this day after tomorrow (by Wednesday) after another > pass. master: + * Read the contents of the serialized snapshot to the dest. Do we need the "the" before the "dest"? + { + int save_errno = errno; + + CloseTransientFile(fd); + + if (readBytes < 0) + { + errno = save_errno; + ereport(ERROR, Do we need the CloseTransientFile(fd) there? This call requires errno to be remembered but anyway OpenTransientFile'd files are to be closed at transaction end. Actually CloseTransientFile() is not called before erroring out in other places. +* from the LSN-ordered list of toplevel TXNs. We remove TXN from the list We remove "the" TXN? + if (dlist_is_empty(&rb->catchange_txns)) + { + Assert(rb->catchange_ntxns == 0); + return NULL; + } It seems that the assert is far simpler than dlist_is_empty(). Why don't we swap the conditions for if() and Assert() in the above? +* the oldest running transaction窶冱 restart_decoding_lsn is. The line contains broken characters. +* Either all the xacts got purged or none. It is only possible to +* partially remove the xids from this array if one or more of the xids +* are still running but not all. That can happen if we start decoding Assuming this, the comment below seems to be getting stale. +* catalog. 
We remove xids from this array when they become old enough to +* matter, and then it eventually becomes empty. "We discard this array when all the contained xids are gone. See SnapBuildPurgeOlderTxn for details." or something like that? regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Fri, Jul 29, 2022 at 5:36 AM Masahiko Sawada wrote: > > On Thu, Jul 28, 2022 at 8:57 PM Amit Kapila wrote: > > > > On Thu, Jul 28, 2022 at 3:23 PM Amit Kapila wrote: > > > > I have another comment on this patch: > > SnapBuildPurgeOlderTxn() > > { > > ... > > + if (surviving_xids > 0) > > + memmove(builder->catchange.xip, &(builder->catchange.xip[off]), > > + surviving_xids * sizeof(TransactionId)) > > ... > > > > For this code to hit, we must have a situation where one or more of > > the xacts in this array must be still running. And, if that is true, > > we would not have started from the restart point where the > > corresponding snapshot (that contains the still running xacts) has > > been serialized because we advance the restart point to not before the > > oldest running xacts restart_decoding_lsn. This may not be easy to > > understand so let me take an example to explain. Say we have two > > transactions t1 and t2, and both have made catalog changes. We want a > > situation where one of those gets purged and the other remains in > > builder->catchange.xip array. I have tried variants of the below > > sequence to see if I can get into the required situation but am not > > able to make it. > > > > Session-1 > > Checkpoint -1; > > T1 > > DDL > > > > Session-2 > > T2 > > DDL > > > > Session-3 > > Checkpoint-2; > > pg_logical_slot_get_changes() > > -- Here when we serialize the snapshot corresponding to > > CHECKPOINT-2's running_xact record, we will serialize both t1 and t2 > > as catalog-changing xacts. > > > > Session-1 > > T1 > > Commit; > > > > Checkpoint; > > pg_logical_slot_get_changes() > > -- Here we will restore from Checkpoint-1's serialized snapshot and > > won't be able to move restart_point to Checkpoint-2 because T2 is > > still open. 
> > > > Now, as per my understanding, it is only possible to move > > restart_point to Checkpoint-2 if T2 gets committed/rolled-back in > > which case we will never have that in surviving_xids array after the > > purge. > > > > It is possible I am missing something here. Do let me know your thoughts. > > Yeah, your description makes sense to me. I've also considered how to > hit this path but I guess it is never hit. Thinking of it in another > way, first of all, at least 2 catalog modifying transactions have to > be running while writing a xl_running_xacts. The serialized snapshot > that is written when we decode the first xl_running_xact has two > transactions. Then, one of them is committed before the second > xl_running_xacts. The second serialized snapshot has only one > transaction. Then, the transaction is also committed after that. Now, > in order to execute the path, we need to start decoding from the first > serialized snapshot. However, if we start from there, we cannot decode > the full contents of the transaction that was committed later. > I think then we should change this code in the master branch patch with an additional comment on the lines of: "Either all the xacts got purged or none. It is only possible to partially remove the xids from this array if one or more of the xids are still running but not all. That can happen if we start decoding from a point (LSN where the snapshot state became consistent) where all the xacts in this were running and then at least one of those got committed and a few are still running. We will never start from such a point because we won't move the slot's restart_lsn past the point where the oldest running transaction's restart_decoding_lsn is." I suggest keeping the back branch as it is w.r.t this change as if this logic proves to be faulty it won't affect the stable branches. We can always back-patch this small change if required. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Thu, Jul 28, 2022 at 8:57 PM Amit Kapila wrote: > > On Thu, Jul 28, 2022 at 3:23 PM Amit Kapila wrote: > > > > On Tue, Jul 26, 2022 at 1:22 PM Masahiko Sawada > > wrote: > > > > > > Okay, I've attached an updated patch that does the above idea. Could > > > you please do the performance tests again to see if the idea can help > > > reduce the overhead, Shi yu? > > > > > > > While reviewing the patch for HEAD, I have changed a few comments. See > > attached, if you agree with these changes then include them in the > > next version. > > > > I have another comment on this patch: > SnapBuildPurgeOlderTxn() > { > ... > + if (surviving_xids > 0) > + memmove(builder->catchange.xip, &(builder->catchange.xip[off]), > + surviving_xids * sizeof(TransactionId)) > ... > > For this code to hit, we must have a situation where one or more of > the xacts in this array must be still running. And, if that is true, > we would not have started from the restart point where the > corresponding snapshot (that contains the still running xacts) has > been serialized because we advance the restart point to not before the > oldest running xacts restart_decoding_lsn. This may not be easy to > understand so let me take an example to explain. Say we have two > transactions t1 and t2, and both have made catalog changes. We want a > situation where one of those gets purged and the other remains in > builder->catchange.xip array. I have tried variants of the below > sequence to see if I can get into the required situation but am not > able to make it. > > Session-1 > Checkpoint -1; > T1 > DDL > > Session-2 > T2 > DDL > > Session-3 > Checkpoint-2; > pg_logical_slot_get_changes() > -- Here when we serialize the snapshot corresponding to > CHECKPOINT-2's running_xact record, we will serialize both t1 and t2 > as catalog-changing xacts. 
> > Session-1 > T1 > Commit; > > Checkpoint; > pg_logical_slot_get_changes() > -- Here we will restore from Checkpoint-1's serialized snapshot and > won't be able to move restart_point to Checkpoint-2 because T2 is > still open. > > Now, as per my understanding, it is only possible to move > restart_point to Checkpoint-2 if T2 gets committed/rolled-back in > which case we will never have that in surviving_xids array after the > purge. > > It is possible I am missing something here. Do let me know your thoughts. Yeah, your description makes sense to me. I've also considered how to hit this path but I guess it is never hit. Thinking of it in another way, first of all, at least 2 catalog modifying transactions have to be running while writing a xl_running_xacts. The serialized snapshot that is written when we decode the first xl_running_xact has two transactions. Then, one of them is committed before the second xl_running_xacts. The second serialized snapshot has only one transaction. Then, the transaction is also committed after that. Now, in order to execute the path, we need to start decoding from the first serialized snapshot. However, if we start from there, we cannot decode the full contents of the transaction that was committed later. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Thu, Jul 28, 2022 at 3:23 PM Amit Kapila wrote: > > On Tue, Jul 26, 2022 at 1:22 PM Masahiko Sawada wrote: > > > > Okay, I've attached an updated patch that does the above idea. Could > > you please do the performance tests again to see if the idea can help > > reduce the overhead, Shi yu? > > > > While reviewing the patch for HEAD, I have changed a few comments. See > attached, if you agree with these changes then include them in the > next version. > I have another comment on this patch: SnapBuildPurgeOlderTxn() { ... + if (surviving_xids > 0) + memmove(builder->catchange.xip, &(builder->catchange.xip[off]), + surviving_xids * sizeof(TransactionId)) ... For this code to hit, we must have a situation where one or more of the xacts in this array must be still running. And, if that is true, we would not have started from the restart point where the corresponding snapshot (that contains the still running xacts) has been serialized because we advance the restart point to not before the oldest running xacts restart_decoding_lsn. This may not be easy to understand so let me take an example to explain. Say we have two transactions t1 and t2, and both have made catalog changes. We want a situation where one of those gets purged and the other remains in builder->catchange.xip array. I have tried variants of the below sequence to see if I can get into the required situation but am not able to make it. Session-1 Checkpoint -1; T1 DDL Session-2 T2 DDL Session-3 Checkpoint-2; pg_logical_slot_get_changes() -- Here when we serialize the snapshot corresponding to CHECKPOINT-2's running_xact record, we will serialize both t1 and t2 as catalog-changing xacts. Session-1 T1 Commit; Checkpoint; pg_logical_slot_get_changes() -- Here we will restore from Checkpoint-1's serialized snapshot and won't be able to move restart_point to Checkpoint-2 because T2 is still open. 
Now, as per my understanding, it is only possible to move restart_point to Checkpoint-2 if T2 gets committed/rolled-back in which case we will never have that in surviving_xids array after the purge. It is possible I am missing something here. Do let me know your thoughts. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 26, 2022 at 1:22 PM Masahiko Sawada wrote: > > Okay, I've attached an updated patch that does the above idea. Could > you please do the performance tests again to see if the idea can help > reduce the overhead, Shi yu? > While reviewing the patch for HEAD, I have changed a few comments. See attached, if you agree with these changes then include them in the next version. -- With Regards, Amit Kapila. master_v9_amit.diff.patch Description: Binary data
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Thu, Jul 28, 2022 at 12:56 PM Masahiko Sawada wrote: > > On Thu, Jul 28, 2022 at 4:13 PM Amit Kapila wrote: > > > > > > > > While editing back branch patches, I realized that the following > > > (parsed->xinfo & XACT_XINFO_HAS_INVALS) and (parsed->nmsgs > 0) are > > > equivalent: > > > > > > + /* > > > +* If the COMMIT record has invalidation messages, it could have > > > catalog > > > +* changes. It is possible that we didn't mark this transaction as > > > +* containing catalog changes when the decoding starts from a commit > > > +* record without decoding the transaction's other changes. So, we > > > ensure > > > +* to mark such transactions as containing catalog change. > > > +* > > > +* This must be done before SnapBuildCommitTxn() so that we can > > > include > > > +* these transactions in the historic snapshot. > > > +*/ > > > + if (parsed->xinfo & XACT_XINFO_HAS_INVALS) > > > + SnapBuildXidSetCatalogChanges(ctx->snapshot_builder, xid, > > > + parsed->nsubxacts, parsed->subxacts, > > > + buf->origptr); > > > + > > > /* > > > * Process invalidation messages, even if we're not interested in the > > > * transaction's contents, since the various caches need to always be > > > * consistent. > > > */ > > > if (parsed->nmsgs > 0) > > > { > > > if (!ctx->fast_forward) > > > ReorderBufferAddInvalidations(ctx->reorder, xid, buf->origptr, > > > parsed->nmsgs, parsed->msgs); > > > ReorderBufferXidSetCatalogChanges(ctx->reorder, xid, > > > buf->origptr); > > > } > > > > > > If that's right, I think we can merge these if branches. We can call > > > ReorderBufferXidSetCatalogChanges() for top-txn and in > > > SnapBuildXidSetCatalogChanges() we mark its subtransactions if top-txn > > > is in the list. What do you think? > > > > > > > Note that this code doesn't exist in 14 and 15, so we need to create > > different patches for those. > > Right. > Okay, then this sounds reasonable to me. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Thu, Jul 28, 2022 at 4:13 PM Amit Kapila wrote: > > On Thu, Jul 28, 2022 at 11:56 AM Masahiko Sawada > wrote: > > > > On Thu, Jul 28, 2022 at 12:21 PM Amit Kapila > > wrote: > > > > > > On Thu, Jul 28, 2022 at 7:18 AM Masahiko Sawada > > > wrote: > > > > > > > > While editing back branch patches, I realized that the following > > (parsed->xinfo & XACT_XINFO_HAS_INVALS) and (parsed->nmsgs > 0) are > > equivalent: > > > > + /* > > +* If the COMMIT record has invalidation messages, it could have catalog > > +* changes. It is possible that we didn't mark this transaction as > > +* containing catalog changes when the decoding starts from a commit > > +* record without decoding the transaction's other changes. So, we > > ensure > > +* to mark such transactions as containing catalog change. > > +* > > +* This must be done before SnapBuildCommitTxn() so that we can include > > +* these transactions in the historic snapshot. > > +*/ > > + if (parsed->xinfo & XACT_XINFO_HAS_INVALS) > > + SnapBuildXidSetCatalogChanges(ctx->snapshot_builder, xid, > > + parsed->nsubxacts, parsed->subxacts, > > + buf->origptr); > > + > > /* > > * Process invalidation messages, even if we're not interested in the > > * transaction's contents, since the various caches need to always be > > * consistent. > > */ > > if (parsed->nmsgs > 0) > > { > > if (!ctx->fast_forward) > > ReorderBufferAddInvalidations(ctx->reorder, xid, buf->origptr, > > parsed->nmsgs, parsed->msgs); > > ReorderBufferXidSetCatalogChanges(ctx->reorder, xid, buf->origptr); > > } > > > > If that's right, I think we can merge these if branches. We can call > > ReorderBufferXidSetCatalogChanges() for top-txn and in > > SnapBuildXidSetCatalogChanges() we mark its subtransactions if top-txn > > is in the list. What do you think? > > > > Note that this code doesn't exist in 14 and 15, so we need to create > different patches for those. Right. 
> BTW, how in 13 and lower versions did we > identify and mark subxacts as having catalog changes without our > patch? I think we use HEAP_INPLACE and XLOG_HEAP2_NEW_CID to mark subxacts as well. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Thu, Jul 28, 2022 at 11:56 AM Masahiko Sawada wrote: > > On Thu, Jul 28, 2022 at 12:21 PM Amit Kapila wrote: > > > > On Thu, Jul 28, 2022 at 7:18 AM Masahiko Sawada > > wrote: > > > > > While editing back branch patches, I realized that the following > (parsed->xinfo & XACT_XINFO_HAS_INVALS) and (parsed->nmsgs > 0) are > equivalent: > > + /* > +* If the COMMIT record has invalidation messages, it could have catalog > +* changes. It is possible that we didn't mark this transaction as > +* containing catalog changes when the decoding starts from a commit > +* record without decoding the transaction's other changes. So, we ensure > +* to mark such transactions as containing catalog change. > +* > +* This must be done before SnapBuildCommitTxn() so that we can include > +* these transactions in the historic snapshot. > +*/ > + if (parsed->xinfo & XACT_XINFO_HAS_INVALS) > + SnapBuildXidSetCatalogChanges(ctx->snapshot_builder, xid, > + parsed->nsubxacts, parsed->subxacts, > + buf->origptr); > + > /* > * Process invalidation messages, even if we're not interested in the > * transaction's contents, since the various caches need to always be > * consistent. > */ > if (parsed->nmsgs > 0) > { > if (!ctx->fast_forward) > ReorderBufferAddInvalidations(ctx->reorder, xid, buf->origptr, > parsed->nmsgs, parsed->msgs); > ReorderBufferXidSetCatalogChanges(ctx->reorder, xid, buf->origptr); > } > > If that's right, I think we can merge these if branches. We can call > ReorderBufferXidSetCatalogChanges() for top-txn and in > SnapBuildXidSetCatalogChanges() we mark its subtransactions if top-txn > is in the list. What do you think? > Note that this code doesn't exist in 14 and 15, so we need to create different patches for those. BTW, how in 13 and lower versions did we identify and mark subxacts as having catalog changes without our patch? -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Thu, Jul 28, 2022 at 12:21 PM Amit Kapila wrote: > > On Thu, Jul 28, 2022 at 7:18 AM Masahiko Sawada wrote: > > > > On Wed, Jul 27, 2022 at 8:33 PM Amit Kapila wrote: > > > > > > > > I have changed accordingly in the attached > > > and apart from that slightly modified the comments and commit message. > > > Do let me know what you think of the attached? > > > > It would be better to remember the initial running xacts after > > SnapBuildRestore() returns true? Because otherwise, we could end up > > allocating InitialRunningXacts multiple times while leaking the old > > ones if there are no serialized snapshots that we are interested in. > > > > Right, this makes sense. But note that you can no longer have a check > (builder->state == SNAPBUILD_START) which I believe is not required. > We need to do this after restore, in whichever state snapshot was as > any state other than SNAPBUILD_CONSISTENT can have commits without all > their changes. Right. > > Accordingly, I think the comment: "Remember the transactions and > subtransactions that were running when xl_running_xacts record that we > decoded first was written." needs to be slightly modified to something > like: "Remember the transactions and > subtransactions that were running > when xl_running_xacts record that we decoded was written.". Change > this if it is used at any other place in the patch. Agreed. > > > --- > > + if (builder->state == SNAPBUILD_START) > > + { > > + int nxacts = > > running->subxcnt + running->xcnt; > > + Sizesz = sizeof(TransactionId) * nxacts; > > + > > + NInitialRunningXacts = nxacts; > > + InitialRunningXacts = > > MemoryContextAlloc(builder->context, sz); > > + memcpy(InitialRunningXacts, running->xids, sz); > > + qsort(InitialRunningXacts, nxacts, > > sizeof(TransactionId), xidComparator); > > + } > > > > We should allocate the memory for InitialRunningXacts only when > > (running->subxcnt + running->xcnt) > 0. 
> > > There is no harm in doing that but ideally, that case would have been > covered by an earlier check "if (running->oldestRunningXid == > running->nextXid)" which suggests "No transactions were running, so we > can jump to consistent." You're right. While editing back branch patches, I realized that the following (parsed->xinfo & XACT_XINFO_HAS_INVALS) and (parsed->nmsgs > 0) are equivalent: + /* +* If the COMMIT record has invalidation messages, it could have catalog +* changes. It is possible that we didn't mark this transaction as +* containing catalog changes when the decoding starts from a commit +* record without decoding the transaction's other changes. So, we ensure +* to mark such transactions as containing catalog change. +* +* This must be done before SnapBuildCommitTxn() so that we can include +* these transactions in the historic snapshot. +*/ + if (parsed->xinfo & XACT_XINFO_HAS_INVALS) + SnapBuildXidSetCatalogChanges(ctx->snapshot_builder, xid, + parsed->nsubxacts, parsed->subxacts, + buf->origptr); + /* * Process invalidation messages, even if we're not interested in the * transaction's contents, since the various caches need to always be * consistent. */ if (parsed->nmsgs > 0) { if (!ctx->fast_forward) ReorderBufferAddInvalidations(ctx->reorder, xid, buf->origptr, parsed->nmsgs, parsed->msgs); ReorderBufferXidSetCatalogChanges(ctx->reorder, xid, buf->origptr); } If that's right, I think we can merge these if branches. We can call ReorderBufferXidSetCatalogChanges() for top-txn and in SnapBuildXidSetCatalogChanges() we mark its subtransactions if top-txn is in the list. What do you think? Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Thu, Jul 28, 2022 at 7:18 AM Masahiko Sawada wrote: > > On Wed, Jul 27, 2022 at 8:33 PM Amit Kapila wrote: > > > > > I have changed accordingly in the attached > > and apart from that slightly modified the comments and commit message. > > Do let me know what you think of the attached? > > It would be better to remember the initial running xacts after > SnapBuildRestore() returns true? Because otherwise, we could end up > allocating InitialRunningXacts multiple times while leaking the old > ones if there are no serialized snapshots that we are interested in. > Right, this makes sense. But note that you can no longer have a check (builder->state == SNAPBUILD_START) which I believe is not required. We need to do this after restore, in whichever state snapshot was as any state other than SNAPBUILD_CONSISTENT can have commits without all their changes. Accordingly, I think the comment: "Remember the transactions and subtransactions that were running when xl_running_xacts record that we decoded first was written." needs to be slightly modified to something like: "Remember the transactions and subtransactions that were running when xl_running_xacts record that we decoded was written.". Change this if it is used at any other place in the patch. > --- > + if (builder->state == SNAPBUILD_START) > + { > + int nxacts = > running->subxcnt + running->xcnt; > + Sizesz = sizeof(TransactionId) * nxacts; > + > + NInitialRunningXacts = nxacts; > + InitialRunningXacts = > MemoryContextAlloc(builder->context, sz); > + memcpy(InitialRunningXacts, running->xids, sz); > + qsort(InitialRunningXacts, nxacts, > sizeof(TransactionId), xidComparator); > + } > > We should allocate the memory for InitialRunningXacts only when > (running->subxcnt + running->xcnt) > 0. > There is no harm in doing that but ideally, that case would have been covered by an earlier check "if (running->oldestRunningXid == running->nextXid)" which suggests "No transactions were running, so we can jump to consistent." 
Kindly make the required changes and submit the back branch patches again. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Jul 27, 2022 at 8:33 PM Amit Kapila wrote: > > On Mon, Jul 25, 2022 at 11:26 AM Masahiko Sawada > wrote: > > > > On Mon, Jul 25, 2022 at 10:45 AM Masahiko Sawada > > wrote: > > > > I've attached the patch for REl15 that I forgot. > > > > I feel the place to remember running xacts information in > SnapBuildProcessRunningXacts is not appropriate. Because in cases > where there are no running xacts or when xl_running_xact is old enough > that we can't use it, we don't need that information. I feel we need > it only when we have to reuse the already serialized snapshot, so, > won't it be better to initialize at that place in > SnapBuildFindSnapshot()? Good point, agreed. > I have changed accordingly in the attached > and apart from that slightly modified the comments and commit message. > Do let me know what you think of the attached? It would be better to remember the initial running xacts after SnapBuildRestore() returns true? Because otherwise, we could end up allocating InitialRunningXacts multiple times while leaking the old ones if there are no serialized snapshots that we are interested in. --- + if (builder->state == SNAPBUILD_START) + { + int nxacts = running->subxcnt + running->xcnt; + Sizesz = sizeof(TransactionId) * nxacts; + + NInitialRunningXacts = nxacts; + InitialRunningXacts = MemoryContextAlloc(builder->context, sz); + memcpy(InitialRunningXacts, running->xids, sz); + qsort(InitialRunningXacts, nxacts, sizeof(TransactionId), xidComparator); + } We should allocate the memory for InitialRunningXacts only when (running->subxcnt + running->xcnt) > 0. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Mon, Jul 25, 2022 at 11:26 AM Masahiko Sawada wrote: > > On Mon, Jul 25, 2022 at 10:45 AM Masahiko Sawada > wrote: > > I've attached the patch for REl15 that I forgot. > I feel the place to remember running xacts information in SnapBuildProcessRunningXacts is not appropriate. Because in cases where there are no running xacts or when xl_running_xact is old enough that we can't use it, we don't need that information. I feel we need it only when we have to reuse the already serialized snapshot, so, won't it be better to initialize at that place in SnapBuildFindSnapshot()? I have changed accordingly in the attached and apart from that slightly modified the comments and commit message. Do let me know what you think of the attached? -- With Regards, Amit Kapila. REL15_v9-0001-Fix-catalog-lookup-with-the-wrong-snapshot-during.patch Description: Binary data
RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 26, 2022 3:52 PM Masahiko Sawada wrote: > > On Tue, Jul 26, 2022 at 2:18 PM Amit Kapila > wrote: > > > > On Tue, Jul 26, 2022 at 7:00 AM Masahiko Sawada > wrote: > > > > > > On Mon, Jul 25, 2022 at 7:57 PM shiy.f...@fujitsu.com > > > wrote: > > > > > > > > > > > > case 3 > > > > - > > > > There are 64 catalog modifying transactions. > > > > Decode 100k/500k/1M transactions. > > > > > > > > 100k 500k 1M > > > > master 0.0600 0.1666 0.4876 > > > > patched 0.0620 0.1653 0.4795 > > > > > > > > > > > > Summary of the tests: > > > > After applying this patch, there is an overhead of about 3% in the case > decoding > > > > 100k transactions with 64 catalog modifying transactions. This is an > extreme > > > > case, so maybe it's okay. > > > > > > Yes. If we're worried about the overhead and doing bsearch() is the > > > cause, probably we can try simplehash instead of the array. > > > > > > > I am not sure if we need to go that far for this extreme corner > > case. Let's first try your below idea. > > > > > An improvement idea is that we pass the parsed->xinfo down to > > > SnapBuildXidHasCatalogChanges(), and then return from that function > > > before doing bsearch() if the parsed->xinfo doesn't have > > > XACT_XINFO_HAS_INVALS. That would save calling bsearch() for > > > non-catalog-modifying transactions. Is it worth trying? > > > > > > > I think this is worth trying and this might reduce some of the > > overhead as well in the case presented by Shi-San. > > Okay, I've attached an updated patch that does the above idea. Could > you please do the performance tests again to see if the idea can help > reduce the overhead, Shi yu? > Thanks for your improvement. I have tested the case which shows overhead before (decoding 100k transactions with 64 catalog modifying transactions) for the v9 patch, the result is as follows. master 0.0607 patched 0.0613 There's almost no difference compared with master (less than 1%), which looks good to me. Regards, Shi yu
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 26, 2022 at 2:18 PM Amit Kapila wrote: > > On Tue, Jul 26, 2022 at 7:00 AM Masahiko Sawada wrote: > > > > On Mon, Jul 25, 2022 at 7:57 PM shiy.f...@fujitsu.com > > wrote: > > > > > > Hi, > > > > > > I did some performance test for the master branch patch (based on v6 > > > patch) to > > > see if the bsearch() added by this patch will cause any overhead. > > > > Thank you for doing performance tests! > > > > > > > > I tested them three times and took the average. > > > > > > The results are as follows, and attach the bar chart. > > > > > > case 1 > > > - > > > No catalog modifying transaction. > > > Decode 800k pgbench transactions. (8 clients, 100k transactions per > > > client) > > > > > > master 7.5417 > > > patched 7.4107 > > > > > > case 2 > > > - > > > There's one catalog modifying transaction. > > > Decode 100k/500k/1M transactions. > > > > > > 100k500k1M > > > master 0.0576 0.1491 0.4346 > > > patched 0.0586 0.1500 0.4344 > > > > > > case 3 > > > - > > > There are 64 catalog modifying transactions. > > > Decode 100k/500k/1M transactions. > > > > > > 100k500k1M > > > master 0.0600 0.1666 0.4876 > > > patched 0.0620 0.1653 0.4795 > > > > > > (Because the result of case 3 shows that there is a overhead of about 3% > > > in the > > > case decoding 100k transactions with 64 catalog modifying transactions, I > > > tested the next run of 100k xacts with or without catalog modifying > > > transactions, to see if it affects subsequent decoding.) > > > > > > case 4.1 > > > - > > > After the test steps in case 3 (64 catalog modifying transactions, decode > > > 100k > > > transactions), run 100k xacts and then decode. > > > > > > master 0.3699 > > > patched 0.3701 > > > > > > case 4.2 > > > - > > > After the test steps in case 3 (64 catalog modifying transactions, decode > > > 100k > > > transactions), run 64 DDLs(without checkpoint) and 100k xacts, then > > > decode. 
> > > > > > master 0.3687 > > patched 0.3696 > > > > Summary of the tests: > > After applying this patch, there is an overhead of about 3% in the case > > decoding > > 100k transactions with 64 catalog modifying transactions. This is an extreme > > case, so maybe it's okay. > > Yes. If we're worried about the overhead and doing bsearch() is the > > cause, probably we can try simplehash instead of the array. > > > > I am not sure if we need to go that far for this extreme corner > case. Let's first try your below idea. > > > An improvement idea is that we pass the parsed->xinfo down to > > SnapBuildXidHasCatalogChanges(), and then return from that function > > before doing bsearch() if the parsed->xinfo doesn't have > > XACT_XINFO_HAS_INVALS. That would save calling bsearch() for > > non-catalog-modifying transactions. Is it worth trying? > > > > I think this is worth trying and this might reduce some of the > overhead as well in the case presented by Shi-San. Okay, I've attached an updated patch that does the above idea. Could you please do the performance tests again to see if the idea can help reduce the overhead, Shi yu? Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ master-v9-0001-Add-catalog-modifying-transactions-to-logical-dec.patch Description: Binary data
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 26, 2022 at 7:00 AM Masahiko Sawada wrote: > > On Mon, Jul 25, 2022 at 7:57 PM shiy.f...@fujitsu.com > wrote: > > > > Hi, > > > > I did some performance test for the master branch patch (based on v6 patch) > > to > > see if the bsearch() added by this patch will cause any overhead. > > Thank you for doing performance tests! > > > > > I tested them three times and took the average. > > > > The results are as follows, and attach the bar chart. > > > > case 1 > > - > > No catalog modifying transaction. > > Decode 800k pgbench transactions. (8 clients, 100k transactions per client) > > > > master 7.5417 > > patched 7.4107 > > > > case 2 > > - > > There's one catalog modifying transaction. > > Decode 100k/500k/1M transactions. > > > > 100k500k1M > > master 0.0576 0.1491 0.4346 > > patched 0.0586 0.1500 0.4344 > > > > case 3 > > - > > There are 64 catalog modifying transactions. > > Decode 100k/500k/1M transactions. > > > > 100k500k1M > > master 0.0600 0.1666 0.4876 > > patched 0.0620 0.1653 0.4795 > > > > (Because the result of case 3 shows that there is a overhead of about 3% in > > the > > case decoding 100k transactions with 64 catalog modifying transactions, I > > tested the next run of 100k xacts with or without catalog modifying > > transactions, to see if it affects subsequent decoding.) > > > > case 4.1 > > - > > After the test steps in case 3 (64 catalog modifying transactions, decode > > 100k > > transactions), run 100k xacts and then decode. > > > > master 0.3699 > > patched 0.3701 > > > > case 4.2 > > - > > After the test steps in case 3 (64 catalog modifying transactions, decode > > 100k > > transactions), run 64 DDLs(without checkpoint) and 100k xacts, then decode. > > > > master 0.3687 > > patched 0.3696 > > > > Summary of the tests: > > After applying this patch, there is a overhead of about 3% in the case > > decoding > > 100k transactions with 64 catalog modifying transactions. This is an extreme > > case, so maybe it's okay. > > Yes. 
If we're worried about the overhead and doing bsearch() is the
> cause, probably we can try simplehash instead of the array.
>

I am not sure if we need to go that far for this extreme corner case. Let's first try your idea below.

> An improvement idea is that we pass the parsed->xinfo down to
> SnapBuildXidHasCatalogChanges(), and then return from that function
> before doing bsearch() if the parsed->xinfo doesn't have
> XACT_XINFO_HAS_INVALS. That would save calling bsearch() for
> non-catalog-modifying transactions. Is it worth trying?
>

I think this is worth trying and this might reduce some of the overhead as well in the case presented by Shi-San.

--
With Regards,
Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Mon, Jul 25, 2022 at 7:57 PM shiy.f...@fujitsu.com wrote: > > Hi, > > I did some performance test for the master branch patch (based on v6 patch) to > see if the bsearch() added by this patch will cause any overhead. Thank you for doing performance tests! > > I tested them three times and took the average. > > The results are as follows, and attach the bar chart. > > case 1 > - > No catalog modifying transaction. > Decode 800k pgbench transactions. (8 clients, 100k transactions per client) > > master 7.5417 > patched 7.4107 > > case 2 > - > There's one catalog modifying transaction. > Decode 100k/500k/1M transactions. > > 100k500k1M > master 0.0576 0.1491 0.4346 > patched 0.0586 0.1500 0.4344 > > case 3 > - > There are 64 catalog modifying transactions. > Decode 100k/500k/1M transactions. > > 100k500k1M > master 0.0600 0.1666 0.4876 > patched 0.0620 0.1653 0.4795 > > (Because the result of case 3 shows that there is a overhead of about 3% in > the > case decoding 100k transactions with 64 catalog modifying transactions, I > tested the next run of 100k xacts with or without catalog modifying > transactions, to see if it affects subsequent decoding.) > > case 4.1 > - > After the test steps in case 3 (64 catalog modifying transactions, decode 100k > transactions), run 100k xacts and then decode. > > master 0.3699 > patched 0.3701 > > case 4.2 > - > After the test steps in case 3 (64 catalog modifying transactions, decode 100k > transactions), run 64 DDLs(without checkpoint) and 100k xacts, then decode. > > master 0.3687 > patched 0.3696 > > Summary of the tests: > After applying this patch, there is a overhead of about 3% in the case > decoding > 100k transactions with 64 catalog modifying transactions. This is an extreme > case, so maybe it's okay. Yes. If we're worried about the overhead and doing bsearch() is the cause, probably we can try simplehash instead of the array. 
An improvement idea is that we pass the parsed->xinfo down to SnapBuildXidHasCatalogChanges(), and then return from that function before doing bsearch() if the parsed->xinfo doesn't have XACT_XINFO_HAS_INVALS. That would save calling bsearch() for non-catalog-modifying transactions. Is it worth trying?

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
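The fast path proposed above can be sketched in miniature as follows. This is a standalone illustration, not PostgreSQL code: the function shape, the flag bit value, and the helper names are placeholders (the real `XACT_XINFO_HAS_INVALS` value lives in `access/xact.h`). The point is simply that a commit record without invalidation messages can never belong to a catalog-modifying transaction, so the bsearch() over the restored XID array can be skipped entirely:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t TransactionId;

#define XACT_XINFO_HAS_INVALS (1U << 3)  /* placeholder bit for this sketch */

/* comparator for a sorted TransactionId array, usable by bsearch() */
static int
xid_comparator(const void *a, const void *b)
{
    TransactionId xa = *(const TransactionId *) a;
    TransactionId xb = *(const TransactionId *) b;

    return (xa < xb) ? -1 : (xa > xb) ? 1 : 0;
}

/*
 * Sketch of the proposed check: bail out before the bsearch() when the
 * commit record's xinfo carries no invalidation messages, since such a
 * transaction cannot have modified the catalog.
 */
static bool
xid_has_catalog_changes(TransactionId xid, uint32_t xinfo,
                        const TransactionId *catchange_xip,
                        size_t catchange_xcnt)
{
    if ((xinfo & XACT_XINFO_HAS_INVALS) == 0)
        return false;           /* fast path: skip the binary search */

    return bsearch(&xid, catchange_xip, catchange_xcnt,
                   sizeof(TransactionId), xid_comparator) != NULL;
}
```

With this shape, the common case of a plain DML transaction costs one bit test instead of a search over the catalog-modifying XID array.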
RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Hi,

I did some performance tests for the master branch patch (based on the v6 patch) to see if the bsearch() added by this patch will cause any overhead. I ran each test three times and took the average. The results are as follows; the bar chart is attached.

case 1
------
No catalog modifying transaction.
Decode 800k pgbench transactions. (8 clients, 100k transactions per client)

master  7.5417
patched 7.4107

case 2
------
There's one catalog modifying transaction.
Decode 100k/500k/1M transactions.

        100k    500k    1M
master  0.0576  0.1491  0.4346
patched 0.0586  0.1500  0.4344

case 3
------
There are 64 catalog modifying transactions.
Decode 100k/500k/1M transactions.

        100k    500k    1M
master  0.0600  0.1666  0.4876
patched 0.0620  0.1653  0.4795

(Because the result of case 3 shows an overhead of about 3% when decoding 100k transactions with 64 catalog modifying transactions, I tested the next run of 100k xacts with and without catalog modifying transactions, to see if it affects subsequent decoding.)

case 4.1
--------
After the test steps in case 3 (64 catalog modifying transactions, decode 100k transactions), run 100k xacts and then decode.

master  0.3699
patched 0.3701

case 4.2
--------
After the test steps in case 3 (64 catalog modifying transactions, decode 100k transactions), run 64 DDLs (without checkpoint) and 100k xacts, then decode.

master  0.3687
patched 0.3696

Summary of the tests:
After applying this patch, there is an overhead of about 3% when decoding 100k transactions with 64 catalog modifying transactions. This is an extreme case, so maybe it's okay. Cases 4.1 and 4.2 show that the patch has no effect on subsequent decoding. In the other cases, there are no significant differences.

For your information, here are the parameters specified in postgresql.conf in the test:

shared_buffers = 8GB
checkpoint_timeout = 30min
max_wal_size = 20GB
min_wal_size = 10GB
autovacuum = off

Regards,
Shi yu
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Mon, Jul 25, 2022 at 10:45 AM Masahiko Sawada wrote: > > On Sat, Jul 23, 2022 at 8:32 PM Amit Kapila wrote: > > > > On Fri, Jul 22, 2022 at 11:48 AM Masahiko Sawada > > wrote: > > > > > > On Wed, Jul 20, 2022 at 5:50 PM Amit Kapila > > > wrote: > > > > > > > > On Wed, Jul 20, 2022 at 1:28 PM Masahiko Sawada > > > > wrote: > > > > > > > > > > > > > This is required if we don't want to introduce a new set of functions > > > > as you proposed above. I am not sure which one is better w.r.t back > > > > patching effort later but it seems to me using flag stuff would make > > > > future back patches easier if we make any changes in > > > > SnapBuildCommitTxn. > > > > > > Understood. > > > > > > I've implemented this idea as well for discussion. Both patches have > > > the common change to remember the initial running transactions and to > > > purge them when decoding xl_running_xacts records. The difference is > > > how to mark the transactions as needing to be added to the snapshot. > > > > > > In v7-0001-Fix-catalog-lookup-with-the-wrong-snapshot-during.patch, > > > when the transaction is in the initial running xact list and its > > > commit record has XINFO_HAS_INVAL flag, we mark both the top > > > transaction and its all subtransactions as containing catalog changes > > > (which also means to create ReorderBufferTXN entries for them). These > > > transactions are added to the snapshot in SnapBuildCommitTxn() since > > > ReorderBufferXidHasCatalogChanges () for them returns true. > > > > > > In poc_mark_top_txn_has_inval.patch, when the transaction is in the > > > initial running xacts list and its commit record has XINFO_HAS_INVALS > > > flag, we set a new flag, say RBTXN_COMMIT_HAS_INVALS, only to the top > > > transaction. > > > > > > > It seems that the patch has missed the part to check if the xid is in > > the initial running xacts list? > > Oops, right. 
> > > > > > In SnapBuildCommitTxn(), we add all subtransactions to > > > the snapshot without checking ReorderBufferXidHasCatalogChanges() for > > > subtransactions if its top transaction has the RBTXN_COMMIT_HAS_INVALS > > > flag. > > > > > > A difference between the two ideas is the scope of changes: the former > > > changes only snapbuild.c but the latter changes both snapbuild.c and > > > reorderbuffer.c. Moreover, while the former uses the existing flag, > > > the latter adds a new flag to the reorder buffer for dealing with only > > > this case. I think the former idea is simpler in terms of that. But, > > > an advantage of the latter idea is that the latter idea can save to > > > create ReorderBufferTXN entries for subtransactions. > > > > > > Overall I prefer the former for now but I'd like to hear what others > > > think. > > > > > > > I agree that the latter idea can have better performance in extremely > > special scenarios but introducing a new flag for the same sounds a bit > > ugly to me. So, I would also prefer to go with the former idea, > > however, I would also like to hear what Horiguchi-San and others have > > to say. > > Agreed. > > > > > Few comments on v7-0001-Fix-catalog-lookup-with-the-wrong-snapshot-during: > > 1. > > +void > > +SnapBuildInitialXactSetCatalogChanges(SnapBuild *builder, TransactionId > > xid, > > + int subxcnt, TransactionId *subxacts, > > + XLogRecPtr lsn) > > +{ > > > > I think it is better to name this function as > > SnapBuildXIDSetCatalogChanges as we use this to mark a particular > > transaction as having catalog changes. > > > > 2. Changed/added a few comments in the attached. > > Thank you for the comments. > > I've attached updated version patches for the master and back branches. I've attached the patch for REl15 that I forgot. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ REL15-v8-0001-Fix-catalog-lookup-with-the-wrong-snapshot-during.patch Description: Binary data
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Fri, Jul 22, 2022 at 11:48 AM Masahiko Sawada wrote: > > On Wed, Jul 20, 2022 at 5:50 PM Amit Kapila wrote: > > > > On Wed, Jul 20, 2022 at 1:28 PM Masahiko Sawada > > wrote: > > > > > > > This is required if we don't want to introduce a new set of functions > > as you proposed above. I am not sure which one is better w.r.t back > > patching effort later but it seems to me using flag stuff would make > > future back patches easier if we make any changes in > > SnapBuildCommitTxn. > > Understood. > > I've implemented this idea as well for discussion. Both patches have > the common change to remember the initial running transactions and to > purge them when decoding xl_running_xacts records. The difference is > how to mark the transactions as needing to be added to the snapshot. > > In v7-0001-Fix-catalog-lookup-with-the-wrong-snapshot-during.patch, > when the transaction is in the initial running xact list and its > commit record has XINFO_HAS_INVAL flag, we mark both the top > transaction and its all subtransactions as containing catalog changes > (which also means to create ReorderBufferTXN entries for them). These > transactions are added to the snapshot in SnapBuildCommitTxn() since > ReorderBufferXidHasCatalogChanges () for them returns true. > > In poc_mark_top_txn_has_inval.patch, when the transaction is in the > initial running xacts list and its commit record has XINFO_HAS_INVALS > flag, we set a new flag, say RBTXN_COMMIT_HAS_INVALS, only to the top > transaction. > It seems that the patch has missed the part to check if the xid is in the initial running xacts list? > In SnapBuildCommitTxn(), we add all subtransactions to > the snapshot without checking ReorderBufferXidHasCatalogChanges() for > subtransactions if its top transaction has the RBTXN_COMMIT_HAS_INVALS > flag. > > A difference between the two ideas is the scope of changes: the former > changes only snapbuild.c but the latter changes both snapbuild.c and > reorderbuffer.c. 
Moreover, while the former uses the existing flag, > the latter adds a new flag to the reorder buffer for dealing with only > this case. I think the former idea is simpler in terms of that. But, > an advantage of the latter idea is that the latter idea can save to > create ReorderBufferTXN entries for subtransactions. > > Overall I prefer the former for now but I'd like to hear what others think. > I agree that the latter idea can have better performance in extremely special scenarios but introducing a new flag for the same sounds a bit ugly to me. So, I would also prefer to go with the former idea, however, I would also like to hear what Horiguchi-San and others have to say. Few comments on v7-0001-Fix-catalog-lookup-with-the-wrong-snapshot-during: 1. +void +SnapBuildInitialXactSetCatalogChanges(SnapBuild *builder, TransactionId xid, + int subxcnt, TransactionId *subxacts, + XLogRecPtr lsn) +{ I think it is better to name this function as SnapBuildXIDSetCatalogChanges as we use this to mark a particular transaction as having catalog changes. 2. Changed/added a few comments in the attached. -- With Regards, Amit Kapila. v7-0001-diff-amit.patch Description: Binary data
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Jul 20, 2022 at 5:50 PM Amit Kapila wrote: > > On Wed, Jul 20, 2022 at 1:28 PM Masahiko Sawada wrote: > > > > On Wed, Jul 20, 2022 at 2:19 PM Amit Kapila wrote: > > > > > > On Wed, Jul 20, 2022 at 9:01 AM Masahiko Sawada > > > wrote: > > > > > > > Another idea would be to have functions, say > > > > SnapBuildCommitTxnWithXInfo() and SnapBuildCommitTxn_ext(). The latter > > > > does actual work of handling transaction commits and both > > > > SnapBuildCommitTxn() and SnapBuildCommit() call > > > > SnapBuildCommitTxnWithXInfo() with different arguments. > > > > > > > > > > Do you want to say DecodeCommit() instead of SnapBuildCommit() in > > > above para? > > > > I meant that we will call like DecodeCommit() -> > > SnapBuildCommitTxnWithXInfo() -> SnapBuildCommitTxn_ext(has_invals = > > true) -> SnapBuildXidHasCatalogChanges(has_invals = true) -> ... If > > SnapBuildCommitTxn() gets called, it calls SnapBuildCommitTxn_ext() > > with has_invals = false and behaves the same as before. > > > > Okay, understood. This will work. > > > > Yet another idea could be to have another flag > > > RBTXN_HAS_INVALS which will be set by DecodeCommit for top-level TXN. > > > Then, we can retrieve it even for each of the subtxn's if and when > > > required. > > > > Do you mean that when checking if the subtransaction has catalog > > changes, we check if its top-level XID has this new flag? > > > > Yes. > > > Why do we > > need the new flag? > > > > This is required if we don't want to introduce a new set of functions > as you proposed above. I am not sure which one is better w.r.t back > patching effort later but it seems to me using flag stuff would make > future back patches easier if we make any changes in > SnapBuildCommitTxn. Understood. I've implemented this idea as well for discussion. Both patches have the common change to remember the initial running transactions and to purge them when decoding xl_running_xacts records. 
The difference is how to mark the transactions as needing to be added to the snapshot.

In v7-0001-Fix-catalog-lookup-with-the-wrong-snapshot-during.patch, when the transaction is in the initial running xacts list and its commit record has the XINFO_HAS_INVALS flag, we mark both the top transaction and all of its subtransactions as containing catalog changes (which also means creating ReorderBufferTXN entries for them). These transactions are added to the snapshot in SnapBuildCommitTxn() since ReorderBufferXidHasCatalogChanges() returns true for them.

In poc_mark_top_txn_has_inval.patch, when the transaction is in the initial running xacts list and its commit record has the XINFO_HAS_INVALS flag, we set a new flag, say RBTXN_COMMIT_HAS_INVALS, only on the top transaction. In SnapBuildCommitTxn(), we then add all subtransactions to the snapshot without checking ReorderBufferXidHasCatalogChanges() for them if the top transaction has the RBTXN_COMMIT_HAS_INVALS flag.

A difference between the two ideas is the scope of changes: the former changes only snapbuild.c, whereas the latter changes both snapbuild.c and reorderbuffer.c. Moreover, while the former uses an existing flag, the latter adds a new flag to the reorder buffer for dealing with only this case. In that respect the former idea is simpler. An advantage of the latter idea, however, is that it avoids creating ReorderBufferTXN entries for subtransactions.

Overall I prefer the former for now, but I'd like to hear what others think. FWIW, I didn't try the idea of adding wrapper functions since it would be costly in terms of back-patching effort in the future.
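The "former" idea above can be illustrated with a minimal sketch. This is not the patch's code: the marking and lookup helpers below (`mark_has_catalog_changes`, `is_marked`, `set_catalog_changes`) are invented stand-ins for the real ReorderBuffer machinery, and the fixed-size array is just for illustration. It only shows the shape of marking the top transaction together with every subtransaction:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define MAX_TRACKED 64

/* minimal stand-in for the set of XIDs marked as catalog-modifying */
static TransactionId marked[MAX_TRACKED];
static size_t n_marked = 0;

static void
mark_has_catalog_changes(TransactionId xid)
{
    assert(n_marked < MAX_TRACKED);
    marked[n_marked++] = xid;
}

static bool
is_marked(TransactionId xid)
{
    for (size_t i = 0; i < n_marked; i++)
        if (marked[i] == xid)
            return true;
    return false;
}

/*
 * Sketch of the "former" idea: when a commit record carries invalidation
 * messages and its XID is in the initial running xacts list, mark the top
 * transaction and every one of its subtransactions as containing catalog
 * changes, so all of them get added to the snapshot at commit.
 */
static void
set_catalog_changes(TransactionId xid, int subxcnt,
                    const TransactionId *subxacts)
{
    mark_has_catalog_changes(xid);
    for (int i = 0; i < subxcnt; i++)
        mark_has_catalog_changes(subxacts[i]);
}
```

The cost noted in the message above is visible here: each marked subtransaction XID gets its own entry, which in the real code means a ReorderBufferTXN per subtransaction.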
Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

diff --git a/contrib/test_decoding/Makefile b/contrib/test_decoding/Makefile
index 9a31e0b879..4553252d75 100644
--- a/contrib/test_decoding/Makefile
+++ b/contrib/test_decoding/Makefile
@@ -8,7 +8,7 @@ REGRESS = ddl xact rewrite toast permissions decoding_in_xact \
 	spill slot truncate stream stats twophase twophase_stream
 ISOLATION = mxact delayed_startup ondisk_startup concurrent_ddl_dml \
 	oldest_xmin snapshot_transfer subxact_without_top concurrent_stream \
-	twophase_snapshot
+	twophase_snapshot catalog_change_snapshot
 
 REGRESS_OPTS = --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf
 ISOLATION_OPTS = --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf
diff --git a/contrib/test_decoding/expected/catalog_change_snapshot.out b/contrib/test_decoding/expected/catalog_change_snapshot.out
new file mode 100644
index 00..dc4f9b7018
--- /dev/null
+++ b/contrib/test_decoding/expected/catalog_change_snapshot.out
@@ -0,0 +1,44 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s0_init s0_begin s0_savepoint s0_truncate s1_checkpoint s1_get_changes s0_commit s0_begin s0_insert s1_checkpoint s1_get_changes s0_commit s1_get_changes
+step s0_init: SELECT 'init' FROM pg_create_logical_replication_slot('isolation_slot', 'test_decoding');
+?column?
+
+init
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Jul 20, 2022 at 1:28 PM Masahiko Sawada wrote: > > On Wed, Jul 20, 2022 at 2:19 PM Amit Kapila wrote: > > > > On Wed, Jul 20, 2022 at 9:01 AM Masahiko Sawada > > wrote: > > > > > Another idea would be to have functions, say > > > SnapBuildCommitTxnWithXInfo() and SnapBuildCommitTxn_ext(). The latter > > > does actual work of handling transaction commits and both > > > SnapBuildCommitTxn() and SnapBuildCommit() call > > > SnapBuildCommitTxnWithXInfo() with different arguments. > > > > > > > Do you want to say DecodeCommit() instead of SnapBuildCommit() in > > above para? > > I meant that we will call like DecodeCommit() -> > SnapBuildCommitTxnWithXInfo() -> SnapBuildCommitTxn_ext(has_invals = > true) -> SnapBuildXidHasCatalogChanges(has_invals = true) -> ... If > SnapBuildCommitTxn() gets called, it calls SnapBuildCommitTxn_ext() > with has_invals = false and behaves the same as before. > Okay, understood. This will work. > > Yet another idea could be to have another flag > > RBTXN_HAS_INVALS which will be set by DecodeCommit for top-level TXN. > > Then, we can retrieve it even for each of the subtxn's if and when > > required. > > Do you mean that when checking if the subtransaction has catalog > changes, we check if its top-level XID has this new flag? > Yes. > Why do we > need the new flag? > This is required if we don't want to introduce a new set of functions as you proposed above. I am not sure which one is better w.r.t back patching effort later but it seems to me using flag stuff would make future back patches easier if we make any changes in SnapBuildCommitTxn. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Jul 20, 2022 at 2:19 PM Amit Kapila wrote: > > On Wed, Jul 20, 2022 at 9:01 AM Masahiko Sawada wrote: > > > > On Wed, Jul 20, 2022 at 12:11 PM Amit Kapila > > wrote: > > > > > > On Tue, Jul 19, 2022 at 7:28 PM Masahiko Sawada > > > wrote: > > > > > > Why do you think we can't remove > > > ReorderBufferInitialXactsSetCatalogChanges() from the back branch > > > patch? I think we don't need to change the existing function > > > ReorderBufferXidHasCatalogChanges() but instead can have a wrapper > > > like SnapBuildXidHasCatalogChanges() similar to master branch patch. > > > > IIUC we need to change SnapBuildCommitTxn() but it's exposed. > > > > Currently, we call like DecodeCommit() -> SnapBuildCommitTxn() -> > > ReorderBufferXidHasCatalogChanges(). If we have a wrapper function, we > > call like DecodeCommit() -> SnapBuildCommitTxn() -> > > SnapBuildXidHasCatalogChanges() -> > > ReorderBufferXidHasCatalogChanges(). In > > SnapBuildXidHasCatalogChanges(), we need to check if the transaction > > has XACT_XINFO_HAS_INVALS, which means DecodeCommit() needs to pass > > either parsed->xinfo or (parsed->xinfo & XACT_XINFO_HAS_INVALS != 0) > > down to SnapBuildXidHasCatalogChanges(). However, since > > SnapBuildCommitTxn(), between DecodeCommit() and > > SnapBuildXidHasCatalogChanges(), is exposed we cannot change it. > > > > Agreed. > > > Another idea would be to have functions, say > > SnapBuildCommitTxnWithXInfo() and SnapBuildCommitTxn_ext(). The latter > > does actual work of handling transaction commits and both > > SnapBuildCommitTxn() and SnapBuildCommit() call > > SnapBuildCommitTxnWithXInfo() with different arguments. > > > > Do you want to say DecodeCommit() instead of SnapBuildCommit() in > above para? I meant that we will call like DecodeCommit() -> SnapBuildCommitTxnWithXInfo() -> SnapBuildCommitTxn_ext(has_invals = true) -> SnapBuildXidHasCatalogChanges(has_invals = true) -> ... 
If SnapBuildCommitTxn() gets called, it calls SnapBuildCommitTxn_ext() with has_invals = false and behaves the same as before. > Yet another idea could be to have another flag > RBTXN_HAS_INVALS which will be set by DecodeCommit for top-level TXN. > Then, we can retrieve it even for each of the subtxn's if and when > required. Do you mean that when checking if the subtransaction has catalog changes, we check if its top-level XID has this new flag? Why do we need the new flag? Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
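The wrapper arrangement discussed in this exchange can be sketched as follows. This is a hedged, standalone illustration of the back-branch ABI-preservation pattern only: the argument list mirrors what the thread describes, but the real signatures live in snapbuild.h, the flag bit is a placeholder, and the worker body is stubbed out so the routing is observable:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
typedef uint64_t XLogRecPtr;
typedef struct SnapBuild SnapBuild;     /* opaque stand-in */

#define XACT_XINFO_HAS_INVALS (1U << 3) /* placeholder bit for this sketch */

/* recorded by the worker so the sketch's routing is observable */
static bool last_has_invals = false;

/*
 * Shared worker: in the real proposal this would hold the commit handling
 * currently in SnapBuildCommitTxn(). Being new and static, it does not
 * disturb the existing ABI.
 */
static void
SnapBuildCommitTxn_ext(SnapBuild *builder, XLogRecPtr lsn, TransactionId xid,
                       int nsubxacts, TransactionId *subxacts, bool has_invals)
{
    (void) builder; (void) lsn; (void) xid;
    (void) nsubxacts; (void) subxacts;
    last_has_invals = has_invals;
}

/* existing exported function keeps its old signature, preserving the ABI */
void
SnapBuildCommitTxn(SnapBuild *builder, XLogRecPtr lsn, TransactionId xid,
                   int nsubxacts, TransactionId *subxacts)
{
    SnapBuildCommitTxn_ext(builder, lsn, xid, nsubxacts, subxacts, false);
}

/* new entry point for DecodeCommit(), which has parsed->xinfo at hand */
void
SnapBuildCommitTxnWithXInfo(SnapBuild *builder, XLogRecPtr lsn,
                            TransactionId xid, int nsubxacts,
                            TransactionId *subxacts, uint32_t xinfo)
{
    SnapBuildCommitTxn_ext(builder, lsn, xid, nsubxacts, subxacts,
                           (xinfo & XACT_XINFO_HAS_INVALS) != 0);
}
```

Old callers of SnapBuildCommitTxn() keep working unchanged, while DecodeCommit() can switch to the new entry point to pass the invalidation information down.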
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Jul 20, 2022 at 4:16 PM Kyotaro Horiguchi wrote: > > At Wed, 20 Jul 2022 10:58:16 +0900, Masahiko Sawada > wrote in > > On Wed, Jul 20, 2022 at 9:58 AM Kyotaro Horiguchi > > wrote: > > > Mmm. the patch changed that behavior. AllocateSnapshotBuilder always > > > allocate the array with a fixed size. SnapBuildAddCommittedTxn still > > > assumes builder->committed.xip is non-NULL. SnapBuildRestore *kept* > > > ondisk.builder.commited.xip populated with a valid array pointer. But > > > the patch allows committed.xip be NULL, thus in that case, > > > SnapBuildAddCommitedTxn calls repalloc(NULL) which triggers assertion > > > failure. > > > > IIUC the patch doesn't allow committed.xip to be NULL since we don't > > overwrite it if builder->committed.xcnt is 0 (i.e., > > ondisk.builder.committed.xip is NULL): > > I meant that ondisk.builder.committed.xip can be NULL.. But looking > again that cannot be. I don't understand what I was looking at at > that time. > > So, sorry for the noise. No problem. Thank you for your review and comments! Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
At Wed, 20 Jul 2022 10:58:16 +0900, Masahiko Sawada wrote in > On Wed, Jul 20, 2022 at 9:58 AM Kyotaro Horiguchi > wrote: > > Mmm. the patch changed that behavior. AllocateSnapshotBuilder always > > allocate the array with a fixed size. SnapBuildAddCommittedTxn still > > assumes builder->committed.xip is non-NULL. SnapBuildRestore *kept* > > ondisk.builder.commited.xip populated with a valid array pointer. But > > the patch allows committed.xip be NULL, thus in that case, > > SnapBuildAddCommitedTxn calls repalloc(NULL) which triggers assertion > > failure. > > IIUC the patch doesn't allow committed.xip to be NULL since we don't > overwrite it if builder->committed.xcnt is 0 (i.e., > ondisk.builder.committed.xip is NULL): I meant that ondisk.builder.committed.xip can be NULL.. But looking again that cannot be. I don't understand what I was looking at at that time. So, sorry for the noise. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Jul 20, 2022 at 9:01 AM Masahiko Sawada wrote: > > On Wed, Jul 20, 2022 at 12:11 PM Amit Kapila wrote: > > > > On Tue, Jul 19, 2022 at 7:28 PM Masahiko Sawada > > wrote: > > > > Why do you think we can't remove > > ReorderBufferInitialXactsSetCatalogChanges() from the back branch > > patch? I think we don't need to change the existing function > > ReorderBufferXidHasCatalogChanges() but instead can have a wrapper > > like SnapBuildXidHasCatalogChanges() similar to master branch patch. > > IIUC we need to change SnapBuildCommitTxn() but it's exposed. > > Currently, we call like DecodeCommit() -> SnapBuildCommitTxn() -> > ReorderBufferXidHasCatalogChanges(). If we have a wrapper function, we > call like DecodeCommit() -> SnapBuildCommitTxn() -> > SnapBuildXidHasCatalogChanges() -> > ReorderBufferXidHasCatalogChanges(). In > SnapBuildXidHasCatalogChanges(), we need to check if the transaction > has XACT_XINFO_HAS_INVALS, which means DecodeCommit() needs to pass > either parsed->xinfo or (parsed->xinfo & XACT_XINFO_HAS_INVALS != 0) > down to SnapBuildXidHasCatalogChanges(). However, since > SnapBuildCommitTxn(), between DecodeCommit() and > SnapBuildXidHasCatalogChanges(), is exposed we cannot change it. > Agreed. > Another idea would be to have functions, say > SnapBuildCommitTxnWithXInfo() and SnapBuildCommitTxn_ext(). The latter > does actual work of handling transaction commits and both > SnapBuildCommitTxn() and SnapBuildCommit() call > SnapBuildCommitTxnWithXInfo() with different arguments. > Do you want to say DecodeCommit() instead of SnapBuildCommit() in above para? Yet another idea could be to have another flag RBTXN_HAS_INVALS which will be set by DecodeCommit for top-level TXN. Then, we can retrieve it even for each of the subtxn's if and when required. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Jul 20, 2022 at 12:11 PM Amit Kapila wrote: > > On Tue, Jul 19, 2022 at 7:28 PM Masahiko Sawada wrote: > > > > On Tue, Jul 19, 2022 at 9:25 PM Amit Kapila wrote: > > > > > > On Tue, Jul 19, 2022 at 1:10 PM Masahiko Sawada > > > wrote: > > > > > > > > > > > BTW on backbranches, I think that the reason why we add > > > > initial_running_xacts stuff to ReorderBuffer is that we cannot modify > > > > SnapBuild that could be serialized. Can we add a (private) array for > > > > the initial running xacts in snapbuild.c instead of adding new > > > > variables to ReorderBuffer? > > > > > > > > > > While thinking about this, I wonder if the current patch for back > > > branches can lead to an ABI break as it changes the exposed structure? > > > If so, it may be another reason to change it to some other way > > > probably as you are suggesting. > > > > Yeah, it changes the size of ReorderBuffer, which is not good. > > > > So, are you planning to give a try with your idea of making a private > array for the initial running xacts? Yes. > I am not sure but I guess you are > proposing to add it in SnapBuild structure, if so, that seems safe as > that structure is not exposed. We cannot add it in SnapBuild as it gets serialized to the disk. > > > Changing the function names and arguments would also break ABI. So > > probably we cannot do the above idea of removing > > ReorderBufferInitialXactsSetCatalogChanges() as well. > > > > Why do you think we can't remove > ReorderBufferInitialXactsSetCatalogChanges() from the back branch > patch? I think we don't need to change the existing function > ReorderBufferXidHasCatalogChanges() but instead can have a wrapper > like SnapBuildXidHasCatalogChanges() similar to master branch patch. IIUC we need to change SnapBuildCommitTxn() but it's exposed. Currently, we call like DecodeCommit() -> SnapBuildCommitTxn() -> ReorderBufferXidHasCatalogChanges(). 
If we have a wrapper function, we call like DecodeCommit() -> SnapBuildCommitTxn() -> SnapBuildXidHasCatalogChanges() -> ReorderBufferXidHasCatalogChanges(). In SnapBuildXidHasCatalogChanges(), we need to check if the transaction has XACT_XINFO_HAS_INVALS, which means DecodeCommit() needs to pass either parsed->xinfo or (parsed->xinfo & XACT_XINFO_HAS_INVALS != 0) down to SnapBuildXidHasCatalogChanges(). However, since SnapBuildCommitTxn(), between DecodeCommit() and SnapBuildXidHasCatalogChanges(), is exposed we cannot change it. Another idea would be to have functions, say SnapBuildCommitTxnWithXInfo() and SnapBuildCommitTxn_ext(). The latter does actual work of handling transaction commits and both SnapBuildCommitTxn() and SnapBuildCommit() call SnapBuildCommitTxnWithXInfo() with different arguments. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 19, 2022 at 7:28 PM Masahiko Sawada wrote: > > On Tue, Jul 19, 2022 at 9:25 PM Amit Kapila wrote: > > > > On Tue, Jul 19, 2022 at 1:10 PM Masahiko Sawada > > wrote: > > > > > > > > BTW on backbranches, I think that the reason why we add > > > initial_running_xacts stuff to ReorderBuffer is that we cannot modify > > > SnapBuild that could be serialized. Can we add a (private) array for > > > the initial running xacts in snapbuild.c instead of adding new > > > variables to ReorderBuffer? > > > > > > > While thinking about this, I wonder if the current patch for back > > branches can lead to an ABI break as it changes the exposed structure? > > If so, it may be another reason to change it to some other way > > probably as you are suggesting. > > Yeah, it changes the size of ReorderBuffer, which is not good. > So, are you planning to give a try with your idea of making a private array for the initial running xacts? I am not sure but I guess you are proposing to add it in SnapBuild structure, if so, that seems safe as that structure is not exposed. > Changing the function names and arguments would also break ABI. So > probably we cannot do the above idea of removing > ReorderBufferInitialXactsSetCatalogChanges() as well. > Why do you think we can't remove ReorderBufferInitialXactsSetCatalogChanges() from the back branch patch? I think we don't need to change the existing function ReorderBufferXidHasCatalogChanges() but instead can have a wrapper like SnapBuildXidHasCatalogChanges() similar to master branch patch. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Jul 20, 2022 at 9:58 AM Kyotaro Horiguchi wrote: > > At Tue, 19 Jul 2022 17:31:07 +0900, Masahiko Sawada > wrote in > > On Tue, Jul 19, 2022 at 4:35 PM Kyotaro Horiguchi > > wrote: > > > At Tue, 19 Jul 2022 10:17:15 +0530, Amit Kapila > > > wrote in > > > > Good work. I wonder without comments this may create a problem in the > > > > future. OTOH, I don't see adding a check "catchange.xcnt > 0" before > > > > freeing the memory any less robust. Also, for consistency, we can use > > > > a similar check based on xcnt in the SnapBuildRestore to free the > > > > memory in the below code: > > > > + /* set catalog modifying transactions */ > > > > + if (builder->catchange.xip) > > > > + pfree(builder->catchange.xip); > > > > > > But xip must be positive there. We can add a comment explains that. > > > > > > > Yes, if we add the comment for it, probably we need to explain a gcc's > > optimization but it seems to be too much to me. > > Ah, sorry. I confused with other place in SnapBuildPurgeCommitedTxn. > I agree to you, that we don't need additional comment *there*. > > > > + catchange_xip = > > > ReorderBufferGetCatalogChangesXacts(builder->reorder); > > > > > > catchange_xip is allocated in the current context, but ondisk is > > > allocated in builder->context. I see it kind of inconsistent (even if > > > the current context is same with build->context). > > > > Right. I thought that since the lifetime of catchange_xip is short, > > until the end of SnapBuildSerialize() function we didn't need to > > allocate it in builder->context. But given ondisk, we need to do that > > for catchange_xip as well. Will fix it. > > Thanks. > > > > + if (builder->committed.xcnt > 0) > > > + { > > > > > > It seems to me comitted.xip is always non-null, so we don't need this. > > > I don't strongly object to do that, though. > > > > But committed.xcnt could be 0, right? We don't need to copy anything > > by calling memcpy with size = 0 in this case. 
Also, it looks more > > consistent with what we do for catchange_xcnt. > > Mmm. the patch changed that behavior. AllocateSnapshotBuilder always > allocate the array with a fixed size. SnapBuildAddCommittedTxn still > assumes builder->committed.xip is non-NULL. SnapBuildRestore *kept* > ondisk.builder.commited.xip populated with a valid array pointer. But > the patch allows committed.xip be NULL, thus in that case, > SnapBuildAddCommitedTxn calls repalloc(NULL) which triggers assertion > failure. IIUC the patch doesn't allow committed.xip to be NULL since we don't overwrite it if builder->committed.xcnt is 0 (i.e., ondisk.builder.committed.xip is NULL): builder->committed.xcnt = ondisk.builder.committed.xcnt; /* We only allocated/stored xcnt, not xcnt_space xids ! */ /* don't overwrite preallocated xip, if we don't have anything here */ if (builder->committed.xcnt > 0) { pfree(builder->committed.xip); builder->committed.xcnt_space = ondisk.builder.committed.xcnt; builder->committed.xip = ondisk.builder.committed.xip; } Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
At Tue, 19 Jul 2022 17:31:07 +0900, Masahiko Sawada wrote in > On Tue, Jul 19, 2022 at 4:35 PM Kyotaro Horiguchi > wrote: > > At Tue, 19 Jul 2022 10:17:15 +0530, Amit Kapila > > wrote in > > > Good work. I wonder without comments this may create a problem in the > > > future. OTOH, I don't see adding a check "catchange.xcnt > 0" before > > > freeing the memory any less robust. Also, for consistency, we can use > > > a similar check based on xcnt in the SnapBuildRestore to free the > > > memory in the below code: > > > + /* set catalog modifying transactions */ > > > + if (builder->catchange.xip) > > > + pfree(builder->catchange.xip); > > > > But xip must be positive there. We can add a comment explains that. > > > > Yes, if we add the comment for it, probably we need to explain a gcc's > optimization but it seems to be too much to me. Ah, sorry. I confused it with another place, in SnapBuildPurgeCommittedTxn. I agree with you that we don't need an additional comment *there*. > > + catchange_xip = > > ReorderBufferGetCatalogChangesXacts(builder->reorder); > > > > catchange_xip is allocated in the current context, but ondisk is > > allocated in builder->context. I see it kind of inconsistent (even if > > the current context is same with build->context). > > Right. I thought that since the lifetime of catchange_xip is short, > until the end of SnapBuildSerialize() function we didn't need to > allocate it in builder->context. But given ondisk, we need to do that > for catchange_xip as well. Will fix it. Thanks. > > + if (builder->committed.xcnt > 0) > > + { > > > > It seems to me comitted.xip is always non-null, so we don't need this. > > I don't strongly object to do that, though. > > But committed.xcnt could be 0, right? We don't need to copy anything > by calling memcpy with size = 0 in this case. Also, it looks more > consistent with what we do for catchange_xcnt. Mmm, the patch changed that behavior. AllocateSnapshotBuilder always allocates the array with a fixed size.
SnapBuildAddCommittedTxn still assumes builder->committed.xip is non-NULL. SnapBuildRestore *kept* ondisk.builder.committed.xip populated with a valid array pointer. But the patch allows committed.xip to be NULL, and in that case SnapBuildAddCommittedTxn calls repalloc(NULL), which triggers an assertion failure. > > + Assert((xcnt > 0) && (xcnt == rb->catchange_ntxns)); > > > > (xcnt > 0) is obvious here (otherwise means dlist_foreach is broken..). > > (xcnt == rb->catchange_ntxns) is not what should be checked here. The > > assert just requires that catchange_txns and catchange_ntxns are > > consistent so it should be checked just after dlist_empty.. I think. > > > > If we want to check if catchange_txns and catchange_ntxns are > consistent, should we check (xcnt == rb->catchange_ntxns) as well, no? > This function requires the caller to use rb->catchange_ntxns as the > length of the returned array. I think this assertion ensures that the > actual length of the array is consistent with the length we > pre-calculated. Sorry again. Please forget the comment about xcnt == rb->catchange_ntxns.. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 19, 2022 at 9:25 PM Amit Kapila wrote: > > On Tue, Jul 19, 2022 at 1:10 PM Masahiko Sawada wrote: > > > > On Mon, Jul 18, 2022 at 8:49 PM Amit Kapila wrote: > > > > > > On Sun, Jul 17, 2022 at 6:29 PM Masahiko Sawada > > > wrote: > > > > > > > > On Fri, Jul 15, 2022 at 3:32 PM shiy.f...@fujitsu.com > > > > wrote: > > > > > > > > > > > > > I've attached patches for all supported branches including the master. > > > > > > > > > > For back branch patches, > > > * Wouldn't it be better to move purge logic into the function > > > SnapBuildPurge* function for the sake of consistency? > > > > Agreed. > > > > > * Do we really need ReorderBufferInitialXactsSetCatalogChanges()? > > > Can't we instead have a function similar to > > > SnapBuildXidHasCatalogChanges() as we have for the master branch? That > > > will avoid calling it when the snapshot > > > state is SNAPBUILD_START or SNAPBUILD_BUILDING_SNAPSHOT > > > > Seems a good idea. We would need to pass the information about > > (parsed->xinfo & XACT_XINFO_HAS_INVALS) to the function but probably > > we can change ReorderBufferXidHasCatalogChanges() so that it checks > > the RBTXN_HAS_CATALOG_CHANGES flag and then the initial running xacts > > array. > > > > Let's try to keep this as much similar to the master branch patch as possible. > > > BTW on backbranches, I think that the reason why we add > > initial_running_xacts stuff to ReorderBuffer is that we cannot modify > > SnapBuild that could be serialized. Can we add a (private) array for > > the initial running xacts in snapbuild.c instead of adding new > > variables to ReorderBuffer? > > > > While thinking about this, I wonder if the current patch for back > branches can lead to an ABI break as it changes the exposed structure? > If so, it may be another reason to change it to some other way > probably as you are suggesting. Yeah, it changes the size of ReorderBuffer, which is not good. Changing the function names and arguments would also break ABI. 
So probably we cannot do the above idea of removing ReorderBufferInitialXactsSetCatalogChanges() as well. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 19, 2022 at 2:01 PM Masahiko Sawada wrote: > > On Tue, Jul 19, 2022 at 4:35 PM Kyotaro Horiguchi > wrote: > > > > > > > + Assert((xcnt > 0) && (xcnt == rb->catchange_ntxns)); > > > > (xcnt > 0) is obvious here (otherwise means dlist_foreach is broken..). > > (xcnt == rb->catchange_ntxns) is not what should be checked here. The > > assert just requires that catchange_txns and catchange_ntxns are > > consistent so it should be checked just after dlist_empty.. I think. > > > > If we want to check if catchange_txns and catchange_ntxns are > consistent, should we check (xcnt == rb->catchange_ntxns) as well, no? > This function requires the caller to use rb->catchange_ntxns as the > length of the returned array. I think this assertion ensures that the > actual length of the array is consistent with the length we > pre-calculated. > Right, so, I think it is better to keep this assertion but remove (xcnt > 0) part as pointed out by Horiguchi-San. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 19, 2022 at 1:10 PM Masahiko Sawada wrote: > > On Mon, Jul 18, 2022 at 8:49 PM Amit Kapila wrote: > > > > On Sun, Jul 17, 2022 at 6:29 PM Masahiko Sawada > > wrote: > > > > > > On Fri, Jul 15, 2022 at 3:32 PM shiy.f...@fujitsu.com > > > wrote: > > > > > > > > > > I've attached patches for all supported branches including the master. > > > > > > > For back branch patches, > > * Wouldn't it be better to move purge logic into the function > > SnapBuildPurge* function for the sake of consistency? > > Agreed. > > > * Do we really need ReorderBufferInitialXactsSetCatalogChanges()? > > Can't we instead have a function similar to > > SnapBuildXidHasCatalogChanges() as we have for the master branch? That > > will avoid calling it when the snapshot > > state is SNAPBUILD_START or SNAPBUILD_BUILDING_SNAPSHOT > > Seems a good idea. We would need to pass the information about > (parsed->xinfo & XACT_XINFO_HAS_INVALS) to the function but probably > we can change ReorderBufferXidHasCatalogChanges() so that it checks > the RBTXN_HAS_CATALOG_CHANGES flag and then the initial running xacts > array. > Let's try to keep this as much similar to the master branch patch as possible. > BTW on backbranches, I think that the reason why we add > initial_running_xacts stuff to ReorderBuffer is that we cannot modify > SnapBuild that could be serialized. Can we add a (private) array for > the initial running xacts in snapbuild.c instead of adding new > variables to ReorderBuffer? > While thinking about this, I wonder if the current patch for back branches can lead to an ABI break as it changes the exposed structure? If so, it may be another reason to change it to some other way probably as you are suggesting. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 19, 2022 at 1:43 PM Kyotaro Horiguchi wrote: > > At Tue, 19 Jul 2022 16:57:14 +0900 (JST), Kyotaro Horiguchi > wrote in > > At Tue, 19 Jul 2022 16:02:26 +0900, Masahiko Sawada > > wrote in > > > On Tue, Jul 19, 2022 at 1:47 PM Amit Kapila > > > wrote: > > > > Good work. I wonder without comments this may create a problem in the > > > > future. OTOH, I don't see adding a check "catchange.xcnt > 0" before > > > > freeing the memory any less robust. Also, for consistency, we can use > > > > a similar check based on xcnt in the SnapBuildRestore to free the > > > > memory in the below code: > > > > + /* set catalog modifying transactions */ > > > > + if (builder->catchange.xip) > > > > + pfree(builder->catchange.xip); > > > > > > I would hesitate to add comments about preventing the particular > > > optimization. I think we do null-pointer-check-then-pfree many place. > > > It seems to me that checking the array length before memcpy is more > > > natural than checking both the array length and the array existence > > > before pfree. > > > > Anyway according to commit message of 46ab07ffda, POSIX forbits > > memcpy(NULL, NULL, 0). It seems to me that it is the cause of the > > false (or over) optimization. So if we add some comment, it would be > > for memcpy, not pfree.. > > For clarilty, I meant that I don't think we need that comment. > Fair enough. I think commit 46ab07ffda clearly explains why it is a good idea to add a check as Sawada-San did in his latest version. I also agree that we don't any comment for this change. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 19, 2022 at 4:35 PM Kyotaro Horiguchi wrote: Thank you for the comments! > > At Tue, 19 Jul 2022 10:17:15 +0530, Amit Kapila > wrote in > > Good work. I wonder without comments this may create a problem in the > > future. OTOH, I don't see adding a check "catchange.xcnt > 0" before > > freeing the memory any less robust. Also, for consistency, we can use > > a similar check based on xcnt in the SnapBuildRestore to free the > > memory in the below code: > > + /* set catalog modifying transactions */ > > + if (builder->catchange.xip) > > + pfree(builder->catchange.xip); > > But xip must be positive there. We can add a comment explains that. > Yes, if we add the comment for it, probably we need to explain a gcc's optimization but it seems to be too much to me. > > +* Array of transactions and subtransactions that had modified > catalogs > +* and were running when the snapshot was serialized. > +* > +* We normally rely on HEAP2_NEW_CID and XLOG_XACT_INVALIDATIONS > records to > +* know if the transaction has changed the catalog. But it could > happen that > +* the logical decoding decodes only the commit record of the > transaction. > +* This array keeps track of the transactions that have modified > catalogs > > (Might be only me, but) "track" makes me think that xids are added and > removed by activities. On the other hand the array just remembers > catalog-modifying xids in the last life until the all xids in the list > gone. > > +* and were running when serializing a snapshot, and this array is > used to > +* add such transactions to the snapshot. > +* > +* This array is set once when restoring the snapshot, xids are > removed > > (So I want to add "only" between "are removed"). > > +* from the array when decoding xl_running_xacts record, and then > eventually > +* becomes empty. Agreed. WIll fix. 
> > > + catchange_xip = ReorderBufferGetCatalogChangesXacts(builder->reorder); > > catchange_xip is allocated in the current context, but ondisk is > allocated in builder->context. I see it kind of inconsistent (even if > the current context is same with build->context). Right. I thought that since the lifetime of catchange_xip is short, until the end of SnapBuildSerialize() function we didn't need to allocate it in builder->context. But given ondisk, we need to do that for catchange_xip as well. Will fix it. > > > + if (builder->committed.xcnt > 0) > + { > > It seems to me comitted.xip is always non-null, so we don't need this. > I don't strongly object to do that, though. But committed.xcnt could be 0, right? We don't need to copy anything by calling memcpy with size = 0 in this case. Also, it looks more consistent with what we do for catchange_xcnt. > > -* Remove TXN from its containing list. > +* Remove TXN from its containing lists. > > The comment body only describes abut txn->nodes. I think we need to > add that for catchange_node. Will add. > > > + Assert((xcnt > 0) && (xcnt == rb->catchange_ntxns)); > > (xcnt > 0) is obvious here (otherwise means dlist_foreach is broken..). > (xcnt == rb->catchange_ntxns) is not what should be checked here. The > assert just requires that catchange_txns and catchange_ntxns are > consistent so it should be checked just after dlist_empty.. I think. > If we want to check if catchange_txns and catchange_ntxns are consistent, should we check (xcnt == rb->catchange_ntxns) as well, no? This function requires the caller to use rb->catchange_ntxns as the length of the returned array. I think this assertion ensures that the actual length of the array is consistent with the length we pre-calculated. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
At Tue, 19 Jul 2022 16:57:14 +0900 (JST), Kyotaro Horiguchi wrote in > At Tue, 19 Jul 2022 16:02:26 +0900, Masahiko Sawada > wrote in > > On Tue, Jul 19, 2022 at 1:47 PM Amit Kapila wrote: > > > Good work. I wonder without comments this may create a problem in the > > > future. OTOH, I don't see adding a check "catchange.xcnt > 0" before > > > freeing the memory any less robust. Also, for consistency, we can use > > > a similar check based on xcnt in the SnapBuildRestore to free the > > > memory in the below code: > > > + /* set catalog modifying transactions */ > > > + if (builder->catchange.xip) > > > + pfree(builder->catchange.xip); > > > > I would hesitate to add comments about preventing the particular > > optimization. I think we do null-pointer-check-then-pfree many place. > > It seems to me that checking the array length before memcpy is more > > natural than checking both the array length and the array existence > > before pfree. > > Anyway according to commit message of 46ab07ffda, POSIX forbits > memcpy(NULL, NULL, 0). It seems to me that it is the cause of the > false (or over) optimization. So if we add some comment, it would be > for memcpy, not pfree.. For clarity, I meant that I don't think we need that comment. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
At Tue, 19 Jul 2022 16:02:26 +0900, Masahiko Sawada wrote in > On Tue, Jul 19, 2022 at 1:47 PM Amit Kapila wrote: > > Good work. I wonder without comments this may create a problem in the > > future. OTOH, I don't see adding a check "catchange.xcnt > 0" before > > freeing the memory any less robust. Also, for consistency, we can use > > a similar check based on xcnt in the SnapBuildRestore to free the > > memory in the below code: > > + /* set catalog modifying transactions */ > > + if (builder->catchange.xip) > > + pfree(builder->catchange.xip); > > I would hesitate to add comments about preventing the particular > optimization. I think we do null-pointer-check-then-pfree many place. > It seems to me that checking the array length before memcpy is more > natural than checking both the array length and the array existence > before pfree. Anyway, according to the commit message of 46ab07ffda, POSIX forbids memcpy(NULL, NULL, 0). It seems to me that that is the cause of the false (or over) optimization. So if we add some comment, it would be for memcpy, not pfree.. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 19, 2022 at 4:28 PM osumi.takami...@fujitsu.com wrote: > > On Sunday, July 17, 2022 9:59 PM Masahiko Sawada > wrote: > > I've attached patches for all supported branches including the master. > Hi, > > > Minor comments for REL14. > > (1) There are some foreign characters in the patches (in the commit message) > > When I had a look at your patch for back branches with some editor, > I could see some unfamiliar full-width characters like below two cases, > mainly around "single quotes" in the sentences. > > Could you please check the entire patches, > probably by some tool that helps you to detect this kind of characters ? > > * the 2nd paragraph of the commit message > > ...mark the transaction as containing catalog changes if it窶冱 in the list of > the > initial running transactions ... > > * the 3rd paragraph of the same > > It doesn窶冲 have the information on which (sub) transaction has catalog > changes > > FYI, this comment applies to other patches for REL13, REL12, REL11, REL10. > > > (2) typo in the commit message > > FROM: > To fix this problem, this change the reorder buffer so that... > TO: > To fix this problem, this changes the reorder buffer so that... > > > (3) typo in ReorderBufferProcessInitialXacts > > + /* > +* Remove transactions that would have been processed and we don't > need to > +* keep track off anymore. > > > Kindly change > FROM: > keep track off > TO: > keep track of Thank you for the comments! I'll address these comments in the next version patch. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Mon, Jul 18, 2022 at 8:49 PM Amit Kapila wrote: > > On Sun, Jul 17, 2022 at 6:29 PM Masahiko Sawada wrote: > > > > On Fri, Jul 15, 2022 at 3:32 PM shiy.f...@fujitsu.com > > wrote: > > > > > > > I've attached patches for all supported branches including the master. > > > > For back branch patches, > * Wouldn't it be better to move purge logic into the function > SnapBuildPurge* function for the sake of consistency? Agreed. > * Do we really need ReorderBufferInitialXactsSetCatalogChanges()? > Can't we instead have a function similar to > SnapBuildXidHasCatalogChanges() as we have for the master branch? That > will avoid calling it when the snapshot > state is SNAPBUILD_START or SNAPBUILD_BUILDING_SNAPSHOT Seems a good idea. We would need to pass the information about (parsed->xinfo & XACT_XINFO_HAS_INVALS) to the function but probably we can change ReorderBufferXidHasCatalogChanges() so that it checks the RBTXN_HAS_CATALOG_CHANGES flag and then the initial running xacts array. BTW on backbranches, I think that the reason why we add initial_running_xacts stuff to ReorderBuffer is that we cannot modify SnapBuild that could be serialized. Can we add a (private) array for the initial running xacts in snapbuild.c instead of adding new variables to ReorderBuffer? That way, the code would become more consistent with the changes on the master branch. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
At Tue, 19 Jul 2022 10:17:15 +0530, Amit Kapila wrote in > Good work. I wonder without comments this may create a problem in the > future. OTOH, I don't see adding a check "catchange.xcnt > 0" before > freeing the memory any less robust. Also, for consistency, we can use > a similar check based on xcnt in the SnapBuildRestore to free the > memory in the below code: > + /* set catalog modifying transactions */ > + if (builder->catchange.xip) > + pfree(builder->catchange.xip); But xip must be positive there. We can add a comment that explains that. +* Array of transactions and subtransactions that had modified catalogs +* and were running when the snapshot was serialized. +* +* We normally rely on HEAP2_NEW_CID and XLOG_XACT_INVALIDATIONS records to +* know if the transaction has changed the catalog. But it could happen that +* the logical decoding decodes only the commit record of the transaction. +* This array keeps track of the transactions that have modified catalogs (Might be only me, but) "track" makes me think that xids are added and removed by activities. On the other hand the array just remembers catalog-modifying xids from the last life until all the xids in the list are gone. +* and were running when serializing a snapshot, and this array is used to +* add such transactions to the snapshot. +* +* This array is set once when restoring the snapshot, xids are removed (So I want to add "only" between "are removed"). +* from the array when decoding xl_running_xacts record, and then eventually +* becomes empty. + catchange_xip = ReorderBufferGetCatalogChangesXacts(builder->reorder); catchange_xip is allocated in the current context, but ondisk is allocated in builder->context. I see it as kind of inconsistent (even if the current context is the same as builder->context). + if (builder->committed.xcnt > 0) + { It seems to me committed.xip is always non-null, so we don't need this. I don't strongly object to doing that, though. -* Remove TXN from its containing list.
+* Remove TXN from its containing lists. The comment body only describes txn->nodes. I think we need to add that for catchange_node. + Assert((xcnt > 0) && (xcnt == rb->catchange_ntxns)); (xcnt > 0) is obvious here (otherwise means dlist_foreach is broken..). (xcnt == rb->catchange_ntxns) is not what should be checked here. The assert just requires that catchange_txns and catchange_ntxns are consistent so it should be checked just after dlist_empty.. I think. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Sunday, July 17, 2022 9:59 PM Masahiko Sawada wrote: > I've attached patches for all supported branches including the master. Hi, Minor comments for REL14. (1) There are some foreign characters in the patches (in the commit message) When I had a look at your patch for back branches with some editor, I could see some unfamiliar full-width characters like below two cases, mainly around "single quotes" in the sentences. Could you please check the entire patches, probably by some tool that helps you to detect this kind of characters ? * the 2nd paragraph of the commit message ...mark the transaction as containing catalog changes if it窶冱 in the list of the initial running transactions ... * the 3rd paragraph of the same It doesn窶冲 have the information on which (sub) transaction has catalog changes FYI, this comment applies to other patches for REL13, REL12, REL11, REL10. (2) typo in the commit message FROM: To fix this problem, this change the reorder buffer so that... TO: To fix this problem, this changes the reorder buffer so that... (3) typo in ReorderBufferProcessInitialXacts + /* +* Remove transactions that would have been processed and we don't need to +* keep track off anymore. Kindly change FROM: keep track off TO: keep track of Best Regards, Takamichi Osumi
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 19, 2022 at 1:47 PM Amit Kapila wrote: > > On Tue, Jul 19, 2022 at 6:34 AM Masahiko Sawada wrote: > > > > On Mon, Jul 18, 2022 at 1:12 PM Amit Kapila wrote: > > > > > > On Fri, Jul 15, 2022 at 8:09 PM Masahiko Sawada > > > wrote: > > > > > > > > This patch should have the fix for the issue that Shi yu reported. Shi > > > > yu, could you please test it again with this patch? > > > > > > > > > > Can you explain the cause of the failure and your fix for the same? > > > > @@ -1694,6 +1788,8 @@ out: > > /* be tidy */ > > if (ondisk) > > pfree(ondisk); > > + if (catchange_xip) > > + pfree(catchange_xip); > > > > Regarding the above code in the previous version patch, looking at the > > generated assembler code shared by Shi yu offlist, I realized that the > > “if (catchange_xip)” is removed (folded) by gcc optimization. This is > > because we dereference catchange_xip before null-pointer check as > > follow: > > > > + /* copy catalog modifying xacts */ > > + sz = sizeof(TransactionId) * catchange_xcnt; > > + memcpy(ondisk_c, catchange_xip, sz); > > + COMP_CRC32C(ondisk->checksum, ondisk_c, sz); > > + ondisk_c += sz; > > > > Since sz is 0 in this case, memcpy doesn’t do anything actually. > > > > By checking the assembler code, I’ve confirmed that gcc does the > > optimization for these code and setting > > -fno-delete-null-pointer-checks flag prevents the if statement from > > being folded. Also, I’ve confirmed that adding the check if > > "catchange.xcnt > 0” before the null-pointer check also can prevent > > that. Adding a check if "catchange.xcnt > 0” looks more robust. I’ve > > added a similar check for builder->committed.xcnt as well for > > consistency. builder->committed.xip could have no transactions. > > > > Good work. I wonder without comments this may create a problem in the > future. OTOH, I don't see adding a check "catchange.xcnt > 0" before > freeing the memory any less robust. 
Also, for consistency, we can use > a similar check based on xcnt in the SnapBuildRestore to free the > memory in the below code: > + /* set catalog modifying transactions */ > + if (builder->catchange.xip) > + pfree(builder->catchange.xip); I would hesitate to add comments about preventing the particular optimization. I think we do null-pointer-check-then-pfree in many places. It seems to me that checking the array length before memcpy is more natural than checking both the array length and the array existence before pfree. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 19, 2022 at 6:34 AM Masahiko Sawada wrote: > > On Mon, Jul 18, 2022 at 1:12 PM Amit Kapila wrote: > > > > On Fri, Jul 15, 2022 at 8:09 PM Masahiko Sawada > > wrote: > > > > > > This patch should have the fix for the issue that Shi yu reported. Shi > > > yu, could you please test it again with this patch? > > > > > > > Can you explain the cause of the failure and your fix for the same? > > @@ -1694,6 +1788,8 @@ out: > /* be tidy */ > if (ondisk) > pfree(ondisk); > + if (catchange_xip) > + pfree(catchange_xip); > > Regarding the above code in the previous version patch, looking at the > generated assembler code shared by Shi yu offlist, I realized that the > “if (catchange_xip)” is removed (folded) by gcc optimization. This is > because we dereference catchange_xip before null-pointer check as > follow: > > + /* copy catalog modifying xacts */ > + sz = sizeof(TransactionId) * catchange_xcnt; > + memcpy(ondisk_c, catchange_xip, sz); > + COMP_CRC32C(ondisk->checksum, ondisk_c, sz); > + ondisk_c += sz; > > Since sz is 0 in this case, memcpy doesn’t do anything actually. > > By checking the assembler code, I’ve confirmed that gcc does the > optimization for these code and setting > -fno-delete-null-pointer-checks flag prevents the if statement from > being folded. Also, I’ve confirmed that adding the check if > "catchange.xcnt > 0” before the null-pointer check also can prevent > that. Adding a check if "catchange.xcnt > 0” looks more robust. I’ve > added a similar check for builder->committed.xcnt as well for > consistency. builder->committed.xip could have no transactions. > Good work. I wonder without comments this may create a problem in the future. OTOH, I don't see adding a check "catchange.xcnt > 0" before freeing the memory any less robust. 
Also, for consistency, we can use a similar check based on xcnt in the SnapBuildRestore to free the memory in the below code: + /* set catalog modifying transactions */ + if (builder->catchange.xip) + pfree(builder->catchange.xip); -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Mon, Jul 18, 2022 at 12:28 PM shiy.f...@fujitsu.com wrote: > > On Fri, Jul 15, 2022 10:39 PM Masahiko Sawada wrote: > > > > This patch should have the fix for the issue that Shi yu reported. Shi > > yu, could you please test it again with this patch? > > > > Thanks for updating the patch! > I have tested and confirmed that the problem I found has been fixed. Thank you for testing! Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Mon, Jul 18, 2022 at 1:12 PM Amit Kapila wrote:
>
> On Fri, Jul 15, 2022 at 8:09 PM Masahiko Sawada wrote:
> >
> > This patch should have the fix for the issue that Shi yu reported. Shi
> > yu, could you please test it again with this patch?
> >
>
> Can you explain the cause of the failure and your fix for the same?

@@ -1694,6 +1788,8 @@ out:
 	/* be tidy */
 	if (ondisk)
 		pfree(ondisk);
+	if (catchange_xip)
+		pfree(catchange_xip);

Regarding the above code in the previous version patch, looking at the generated assembler code shared by Shi yu offlist, I realized that the "if (catchange_xip)" is removed (folded) by gcc optimization. This is because we dereference catchange_xip before the null-pointer check, as follows:

+	/* copy catalog modifying xacts */
+	sz = sizeof(TransactionId) * catchange_xcnt;
+	memcpy(ondisk_c, catchange_xip, sz);
+	COMP_CRC32C(ondisk->checksum, ondisk_c, sz);
+	ondisk_c += sz;

Since sz is 0 in this case, memcpy doesn't actually do anything.

By checking the assembler code, I've confirmed that gcc does this optimization for this code, and that setting the -fno-delete-null-pointer-checks flag prevents the if statement from being folded. Also, I've confirmed that adding a check for "catchange.xcnt > 0" before the null-pointer check can also prevent that. Adding a check for "catchange.xcnt > 0" looks more robust. I've added a similar check for builder->committed.xcnt as well, for consistency, since builder->committed.xip could have no transactions.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
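The hazard described in this message — gcc eliding a NULL check after the pointer has already been handed to memcpy() — and the chosen mitigation (testing the count rather than the pointer) can be sketched in a few lines of plain C. This is an illustrative reconstruction, not the patch itself; serialize_xids and its arguments are invented names:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Sketch of the fix discussed above. Passing a NULL pointer to memcpy()
 * is undefined behavior even when the size is 0, which is what allows
 * gcc (via -fdelete-null-pointer-checks, implied at -O2) to fold away a
 * later "if (ptr)" test on the same pointer. Guarding the copy on the
 * element count keeps NULL out of memcpy() entirely. All names here are
 * illustrative, not PostgreSQL's actual code.
 */
size_t
serialize_xids(char *dst, const uint32_t *xip, size_t xcnt)
{
    size_t  sz = sizeof(uint32_t) * xcnt;

    if (xcnt > 0)               /* never pass a possibly-NULL xip to memcpy */
        memcpy(dst, xip, sz);

    return sz;                  /* bytes written; 0 when xcnt == 0 */
}
```

With xcnt == 0 and xip == NULL, the function returns 0 without ever touching the pointer, so the compiler retains no license to delete a later NULL check on xip.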
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Sun, Jul 17, 2022 at 6:29 PM Masahiko Sawada wrote: > > On Fri, Jul 15, 2022 at 3:32 PM shiy.f...@fujitsu.com > wrote: > > > > I've attached patches for all supported branches including the master. > For back branch patches, * Wouldn't it be better to move purge logic into the function SnapBuildPurge* function for the sake of consistency? * Do we really need ReorderBufferInitialXactsSetCatalogChanges()? Can't we instead have a function similar to SnapBuildXidHasCatalogChanges() as we have for the master branch? That will avoid calling it when the snapshot state is SNAPBUILD_START or SNAPBUILD_BUILDING_SNAPSHOT -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Fri, Jul 15, 2022 at 8:09 PM Masahiko Sawada wrote: > > This patch should have the fix for the issue that Shi yu reported. Shi > yu, could you please test it again with this patch? > Can you explain the cause of the failure and your fix for the same? -- With Regards, Amit Kapila.
RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Fri, Jul 15, 2022 10:39 PM Masahiko Sawada wrote: > > This patch should have the fix for the issue that Shi yu reported. Shi > yu, could you please test it again with this patch? > Thanks for updating the patch! I have tested and confirmed that the problem I found has been fixed. Regards, Shi yu
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Fri, Jul 15, 2022 at 3:32 PM shiy.f...@fujitsu.com wrote: > > On Mon, Jul 11, 2022 9:54 PM Masahiko Sawada wrote: > > > > I've attached an updated patch, please review it. > > > > Thanks for your patch. Here are some comments for the REL14-v1 patch. > > 1. > + Sizesz = sizeof(TransactionId) * nxacts;; > > There is a redundant semicolon at the end. > > 2. > + workspace = MemoryContextAlloc(rb->context, > rb->n_initial_running_xacts); > > Should it be: > + workspace = MemoryContextAlloc(rb->context, sizeof(TransactionId) * > rb->n_initial_running_xacts); > > 3. > + /* bound check if there is at least one transaction to be removed */ > + if (NormalTransactionIdPrecedes(rb->initial_running_xacts[0], > + > running->oldestRunningXid)) > + return; > + > > Here, I think it should return if rb->initial_running_xacts[0] is older than > oldestRunningXid, right? Should it be changed to: > > + if (!NormalTransactionIdPrecedes(rb->initial_running_xacts[0], > + > running->oldestRunningXid)) > + return; > > 4. > + if ((parsed->xinfo & XACT_XINFO_HAS_INVALS) != 0) > > Maybe we can change it like the following, to be consistent with other places > in > this file. It's also fine if you don't change it. > > + if (parsed->xinfo & XACT_XINFO_HAS_INVALS) Thank you for the comments! I've attached patches for all supported branches including the master. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ REL13-v6-0001-Fix-catalog-lookup-with-the-wrong-snapshot-during.patch Description: Binary data REL12-v6-0001-Fix-catalog-lookup-with-the-wrong-snapshot-during.patch Description: Binary data REL14-v6-0001-Fix-catalog-lookup-with-the-wrong-snapshot-during.patch Description: Binary data REL11-v6-0001-Fix-catalog-lookup-with-the-wrong-snapshot-during.patch Description: Binary data REL10-v6-0001-Fix-catalog-lookup-with-the-wrong-snapshot-during.patch Description: Binary data master-v6-0001-Add-catalog-modifying-transactions-to-logical-dec.patch Description: Binary data
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Fri, Jul 15, 2022 at 10:43 PM osumi.takami...@fujitsu.com wrote: > > On Thursday, July 14, 2022 10:31 AM Masahiko Sawada > wrote: > > I've attached an updated patch that incorporated comments from Amit and Shi. > Hi, > > > Minor comments for v4. Thank you for the comments! > > (1) typo in the commit message > > "When decoding a COMMIT record, we check both the list and the ReorderBuffer > to see if > if the transaction has modified catalogs." > > There are two 'if's in succession in the last sentence of the second > paragraph. > > (2) The header comment for the spec test > > +# Test that decoding only the commit record of the transaction that have > +# catalog-changed. > > Rewording of this part looks required, because "test that ... " requires a > complete sentence > after that, right ? > > > (3) SnapBuildRestore > > snapshot_not_interesting: > if (ondisk.builder.committed.xip != NULL) > pfree(ondisk.builder.committed.xip); > return false; > } > > Do we need to add pfree for ondisk.builder.catchange.xip after the > 'snapshot_not_interesting' label ? > > > (4) SnapBuildPurgeOlderTxn > > + elog(DEBUG3, "purged catalog modifying transactions from %d > to %d", > +(uint32) builder->catchange.xcnt, surviving_xids); > > To make this part more aligned with existing codes, > probably we can have a look at another elog for debug in the same function. > > We should use %u for casted xcnt & surviving_xids, > while adding a format for xmin if necessary ? I agreed with all the above comments and incorporated them into the updated patch. This patch should have the fix for the issue that Shi yu reported. Shi yu, could you please test it again with this patch? Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ v5-0001-Add-catalog-modifying-transactions-to-logical-dec.patch Description: Binary data
RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Thursday, July 14, 2022 10:31 AM Masahiko Sawada wrote: > I've attached an updated patch that incorporated comments from Amit and Shi. Hi, Minor comments for v4. (1) typo in the commit message "When decoding a COMMIT record, we check both the list and the ReorderBuffer to see if if the transaction has modified catalogs." There are two 'if's in succession in the last sentence of the second paragraph. (2) The header comment for the spec test +# Test that decoding only the commit record of the transaction that have +# catalog-changed. Rewording of this part looks required, because "test that ... " requires a complete sentence after that, right ? (3) SnapBuildRestore snapshot_not_interesting: if (ondisk.builder.committed.xip != NULL) pfree(ondisk.builder.committed.xip); return false; } Do we need to add pfree for ondisk.builder.catchange.xip after the 'snapshot_not_interesting' label ? (4) SnapBuildPurgeOlderTxn + elog(DEBUG3, "purged catalog modifying transactions from %d to %d", +(uint32) builder->catchange.xcnt, surviving_xids); To make this part more aligned with existing codes, probably we can have a look at another elog for debug in the same function. We should use %u for casted xcnt & surviving_xids, while adding a format for xmin if necessary ? Best Regards, Takamichi Osumi
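Comment (4) above is about conversion specifiers: printing a uint32 counter with %d misrenders any value above INT_MAX as a negative number. A small standalone illustration, with snprintf() standing in for elog() and an invented helper name:

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Illustration of why "%u" is the right conversion for uint32 counters
 * such as xcnt: with "%d", any value above INT_MAX wraps to a negative
 * number. snprintf() stands in for elog(); format_xcnt is an invented
 * helper, not PostgreSQL code.
 */
void
format_xcnt(uint32_t xcnt, char *with_d, char *with_u, size_t len)
{
    /* implementation-defined wrap for values > INT_MAX: prints negative */
    snprintf(with_d, len, "%d", (int32_t) xcnt);

    /* correct rendering across the full uint32 range */
    snprintf(with_u, len, "%u", xcnt);
}
```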
RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Mon, Jul 11, 2022 9:54 PM Masahiko Sawada wrote: > > I've attached an updated patch, please review it. > Thanks for your patch. Here are some comments for the REL14-v1 patch. 1. + Sizesz = sizeof(TransactionId) * nxacts;; There is a redundant semicolon at the end. 2. + workspace = MemoryContextAlloc(rb->context, rb->n_initial_running_xacts); Should it be: + workspace = MemoryContextAlloc(rb->context, sizeof(TransactionId) * rb->n_initial_running_xacts); 3. + /* bound check if there is at least one transaction to be removed */ + if (NormalTransactionIdPrecedes(rb->initial_running_xacts[0], + running->oldestRunningXid)) + return; + Here, I think it should return if rb->initial_running_xacts[0] is older than oldestRunningXid, right? Should it be changed to: + if (!NormalTransactionIdPrecedes(rb->initial_running_xacts[0], + running->oldestRunningXid)) + return; 4. + if ((parsed->xinfo & XACT_XINFO_HAS_INVALS) != 0) Maybe we can change it like the following, to be consistent with other places in this file. It's also fine if you don't change it. + if (parsed->xinfo & XACT_XINFO_HAS_INVALS) Regards, Shi yu
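Review comment 3 above concerns the early-return guard in the purge of remembered running transactions: the function should return without purging unless at least the oldest (first) entry precedes oldestRunningXid. A compact sketch of the corrected logic, using a plain unsigned comparison in place of NormalTransactionIdPrecedes (which additionally handles XID wraparound); all names are illustrative:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Sketch of the purge discussed in comment 3: discard entries of the
 * remembered running-xacts array that are older than oldest_running.
 * Assumes the array is sorted in ascending order. Plain "<" stands in
 * for NormalTransactionIdPrecedes; names are illustrative.
 */
size_t
purge_initial_running_xacts(uint32_t *xacts, size_t nxacts,
                            uint32_t oldest_running)
{
    size_t  keep = 0;

    /* bound check: bail out unless at least the oldest entry is removable */
    if (nxacts == 0 || !(xacts[0] < oldest_running))
        return nxacts;

    /* keep only XIDs that do not precede oldest_running */
    for (size_t i = 0; i < nxacts; i++)
        if (!(xacts[i] < oldest_running))
            xacts[keep++] = xacts[i];

    return keep;                /* new element count after the purge */
}
```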
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Thu, Jul 14, 2022 at 12:06 PM Masahiko Sawada wrote: > > On Thu, Jul 14, 2022 at 11:16 AM shiy.f...@fujitsu.com > wrote: > > > > On Tue, Jul 12, 2022 5:23 PM Masahiko Sawada wrote: > > > > > > On Tue, Jul 12, 2022 at 5:58 PM shiy.f...@fujitsu.com > > > wrote: > > > > > > > > It happened when executing the following code because it tried to free a > > > NULL > > > > pointer (catchange_xip). > > > > > > > > /* be tidy */ > > > > if (ondisk) > > > > pfree(ondisk); > > > > + if (catchange_xip) > > > > + pfree(catchange_xip); > > > > } > > > > > > > > It seems to be related to configure option. I could reproduce it when > > > > using > > > > `./configure --enable-debug`. > > > > But I couldn't reproduce with `./configure --enable-debug CFLAGS="-Og - > > > ggdb"`. > > > > > > Hmm, I could not reproduce this problem even if I use ./configure > > > --enable-debug. And it's weird that we checked if catchange_xip is not > > > null but we did pfree for it: > > > > > > #1 pfree (pointer=0x0) at mcxt.c:1177 > > > #2 0x0078186b in SnapBuildSerialize (builder=0x1fd5e78, > > > lsn=25719712) at snapbuild.c:1792 > > > > > > Is it reproducible in your environment? > > > > Thanks for your reply! Yes, it is reproducible. And I also reproduced it on > > the > > v4 patch you posted [1]. > > Thank you for testing! I've found out the exact cause of this problem and how to fix it. I'll submit an updated patch next week with my analysis. Thank you for testing and providing additional information off-list, Shi yu. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Thu, Jul 14, 2022 at 11:16 AM shiy.f...@fujitsu.com wrote: > > On Tue, Jul 12, 2022 5:23 PM Masahiko Sawada wrote: > > > > On Tue, Jul 12, 2022 at 5:58 PM shiy.f...@fujitsu.com > > wrote: > > > > > > It happened when executing the following code because it tried to free a > > NULL > > > pointer (catchange_xip). > > > > > > /* be tidy */ > > > if (ondisk) > > > pfree(ondisk); > > > + if (catchange_xip) > > > + pfree(catchange_xip); > > > } > > > > > > It seems to be related to configure option. I could reproduce it when > > > using > > > `./configure --enable-debug`. > > > But I couldn't reproduce with `./configure --enable-debug CFLAGS="-Og - > > ggdb"`. > > > > Hmm, I could not reproduce this problem even if I use ./configure > > --enable-debug. And it's weird that we checked if catchange_xip is not > > null but we did pfree for it: > > > > #1 pfree (pointer=0x0) at mcxt.c:1177 > > #2 0x0078186b in SnapBuildSerialize (builder=0x1fd5e78, > > lsn=25719712) at snapbuild.c:1792 > > > > Is it reproducible in your environment? > > Thanks for your reply! Yes, it is reproducible. And I also reproduced it on > the > v4 patch you posted [1]. Thank you for testing! > > [1] > https://www.postgresql.org/message-id/CAD21AoAyNPrOFg%2BQGh%2B%3D4205TU0%3DyrE%2BQyMgzStkH85uBZXptQ%40mail.gmail.com > > > If so, could you test it again > > with the following changes? > > > > diff --git a/src/backend/replication/logical/snapbuild.c > > b/src/backend/replication/logical/snapbuild.c > > index d015c06ced..a6e76e3781 100644 > > --- a/src/backend/replication/logical/snapbuild.c > > +++ b/src/backend/replication/logical/snapbuild.c > > @@ -1788,7 +1788,7 @@ out: > > /* be tidy */ > > if (ondisk) > > pfree(ondisk); > > - if (catchange_xip) > > + if (catchange_xip != NULL) > > pfree(catchange_xip); > > } > > > > I tried this and could still reproduce the problem. Does the backtrace still show we attempt to pfree a null-pointer? 
> >

> Besides, I tried the suggestion from Amit [2], it could be fixed by checking
> the value of catchange_xcnt instead of catchange_xip before pfree.

Could you check whether this problem occurred when we reached there via the goto path, i.e., whether we called ReorderBufferGetCatalogChangesXacts() or not?

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 12, 2022 5:23 PM Masahiko Sawada wrote: > > On Tue, Jul 12, 2022 at 5:58 PM shiy.f...@fujitsu.com > wrote: > > > > It happened when executing the following code because it tried to free a > NULL > > pointer (catchange_xip). > > > > /* be tidy */ > > if (ondisk) > > pfree(ondisk); > > + if (catchange_xip) > > + pfree(catchange_xip); > > } > > > > It seems to be related to configure option. I could reproduce it when using > > `./configure --enable-debug`. > > But I couldn't reproduce with `./configure --enable-debug CFLAGS="-Og - > ggdb"`. > > Hmm, I could not reproduce this problem even if I use ./configure > --enable-debug. And it's weird that we checked if catchange_xip is not > null but we did pfree for it: > > #1 pfree (pointer=0x0) at mcxt.c:1177 > #2 0x0078186b in SnapBuildSerialize (builder=0x1fd5e78, > lsn=25719712) at snapbuild.c:1792 > > Is it reproducible in your environment? Thanks for your reply! Yes, it is reproducible. And I also reproduced it on the v4 patch you posted [1]. [1] https://www.postgresql.org/message-id/CAD21AoAyNPrOFg%2BQGh%2B%3D4205TU0%3DyrE%2BQyMgzStkH85uBZXptQ%40mail.gmail.com > If so, could you test it again > with the following changes? > > diff --git a/src/backend/replication/logical/snapbuild.c > b/src/backend/replication/logical/snapbuild.c > index d015c06ced..a6e76e3781 100644 > --- a/src/backend/replication/logical/snapbuild.c > +++ b/src/backend/replication/logical/snapbuild.c > @@ -1788,7 +1788,7 @@ out: > /* be tidy */ > if (ondisk) > pfree(ondisk); > - if (catchange_xip) > + if (catchange_xip != NULL) > pfree(catchange_xip); > } > I tried this and could still reproduce the problem. Besides, I tried the suggestion from Amit [2], it could be fixed by checking the value of catchange_xcnt instead of catchange_xip before pfree. 
[2] https://www.postgresql.org/message-id/CAA4eK1%2BXPdm8G%3DEhUJA12Pi1YvQAfcz2%3DkTd9a4BjVx4%3Dgk-MA%40mail.gmail.com

diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index c482e906b0..68b9c4ef7d 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -1573,7 +1573,7 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn)
 	Size		needed_length;
 	SnapBuildOnDisk *ondisk = NULL;
 	TransactionId *catchange_xip = NULL;
-	size_t		catchange_xcnt;
+	size_t		catchange_xcnt = 0;
 	char	   *ondisk_c;
 	int			fd;
 	char		tmppath[MAXPGPATH];
@@ -1788,7 +1788,7 @@ out:
 	/* be tidy */
 	if (ondisk)
 		pfree(ondisk);
-	if (catchange_xip)
+	if (catchange_xcnt != 0)
 		pfree(catchange_xip);
 }

Regards,
Shi yu
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 12, 2022 at 12:40 PM shiy.f...@fujitsu.com wrote: > > On Tue, Jul 12, 2022 8:49 AM Masahiko Sawada wrote: > > > > I've attached an updated patch. > > > > While trying this idea, I noticed there is no API to get the length of > > dlist, as we discussed offlist. Alternative idea was to use List > > (T_XidList) but I'm not sure it's a great idea since deleting an xid > > from the list is O(N), we need to implement list_delete_xid, and we > > need to make sure allocating list node in the reorder buffer context. > > So in the patch, I added a variable, catchange_ntxns, to keep track of > > the length of the dlist. Please review it. > > > > Thanks for your patch. Here are some comments on the master patch. Thank you for the comments. > > 1. > In catalog_change_snapshot.spec, should we use "RUNNING_XACTS record" instead > of > "RUNNING_XACT record" / "XACT_RUNNING record" in the comment? > > 2. > +* Since catchange.xip is sorted, we find the lower bound of > +* xids that sill are interesting. > > Typo? > "sill" -> "still" > > 3. > +* This array is set once when restoring the snapshot, xids are > removed > +* from the array when decoding xl_running_xacts record, and then > eventually > +* becomes an empty. > > + /* catchange list becomes an empty */ > + pfree(builder->catchange.xip); > + builder->catchange.xip = NULL; > > Should "becomes an empty" be modified to "becomes empty"? > > 4. > + * changes that are smaller than ->xmin. Those won't ever get checked via > + * the ->committed array and ->catchange, respectively. The committed xids > will > > Should we change > "the ->committed array and ->catchange" > to > "the ->committed or ->catchange array" > ? Agreed with all the above comments. These are incorporated in the latest v4 patch I just sent[1]. Regards, [1] https://www.postgresql.org/message-id/CAD21AoAyNPrOFg%2BQGh%2B%3D4205TU0%3DyrE%2BQyMgzStkH85uBZXptQ%40mail.gmail.com -- Masahiko Sawada EDB: https://www.enterprisedb.com/
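The first paragraph of the quoted message touches a small data-structure point: PostgreSQL's dlist offers no O(1) length operation, so the patch keeps a separate counter (catchange_ntxns) in step with the list. The idea in miniature, with invented names rather than the actual dlist API:

```c
#include <stddef.h>

/*
 * Minimal illustration of the bookkeeping described above: a circular
 * doubly linked list (in the spirit of PostgreSQL's dlist) has no O(1)
 * length operation, so a counter is updated on every insert and delete,
 * like the patch's catchange_ntxns. All names are illustrative.
 */
typedef struct list_node
{
    struct list_node *prev;
    struct list_node *next;
} list_node;

typedef struct counted_list
{
    list_node   head;           /* circular sentinel */
    size_t      ntxns;          /* element count, kept in sync manually */
} counted_list;

void
clist_init(counted_list *l)
{
    l->head.prev = l->head.next = &l->head;
    l->ntxns = 0;
}

void
clist_push_tail(counted_list *l, list_node *n)
{
    n->prev = l->head.prev;
    n->next = &l->head;
    l->head.prev->next = n;
    l->head.prev = n;
    l->ntxns++;                 /* the only O(1) way to know the length */
}

void
clist_delete(counted_list *l, list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    l->ntxns--;                 /* must mirror every removal */
}
```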
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 12, 2022 at 5:52 PM Amit Kapila wrote: > > On Tue, Jul 12, 2022 at 1:13 PM Masahiko Sawada wrote: > > > > On Tue, Jul 12, 2022 at 3:25 PM Amit Kapila wrote: > > > > > > On Tue, Jul 12, 2022 at 11:38 AM Masahiko Sawada > > > wrote: > > > > > > > > On Tue, Jul 12, 2022 at 10:28 AM Masahiko Sawada > > > > wrote: > > > > > > > > > > > > > > > I'm doing benchmark tests and will share the results. > > > > > > > > > > > > > I've done benchmark tests to measure the overhead introduced by doing > > > > bsearch() every time when decoding a commit record. I've simulated a > > > > very intensified situation where we decode 1M commit records while > > > > keeping builder->catchange.xip array but the overhead is negilible: > > > > > > > > HEAD: 584 ms > > > > Patched: 614 ms > > > > > > > > I've attached the benchmark script I used. With increasing > > > > LOG_SNAPSHOT_INTERVAL_MS to 9, the last decoding by > > > > pg_logicla_slot_get_changes() decodes 1M commit records while keeping > > > > catalog modifying transactions. > > > > > > > > > > Thanks for the test. We should also see how it performs when (a) we > > > don't change LOG_SNAPSHOT_INTERVAL_MS, > > > > What point do you want to see in this test? I think the performance > > overhead depends on how many times we do bsearch() and how many > > transactions are in the list. > > > > Right, I am not expecting any visible performance difference in this > case. This is to ensure that we are not incurring any overhead in the > more usual scenarios (or default cases). As per my understanding, the > purpose of increasing the value of LOG_SNAPSHOT_INTERVAL_MS is to > simulate a stress case for the changes made by the patch, and keeping > its value default will test the more usual scenarios. Agreed. I've done simple benchmark tests to decode 100k pgbench transactions: HEAD: 10.34 s Patched: 10.29 s I've attached an updated patch that incorporated comments from Amit and Shi. 
Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ From 28ca92c9d95cd05a26a7db6e54704f92b1846943 Mon Sep 17 00:00:00 2001 From: Masahiko Sawada Date: Wed, 6 Jul 2022 12:53:36 +0900 Subject: [PATCH v4] Add catalog modifying transactions to logical decoding serialized snapshot. Previously, we relied on HEAP2_NEW_CID records and XACT_INVALIDATION records to know if the transaction has modified the catalog, and that information is not serialized to snapshot. Therefore, if the logical decoding decodes only the commit record of the transaction that actually has modified a catalog, we missed adding its XID to the snapshot. We ended up looking at catalogs with the wrong snapshot. To fix this problem, this change adds the list of transaction IDs and sub-transaction IDs, that have modified catalogs and are running when snapshot serialization, to the serialized snapshot. When decoding a COMMIT record, we check both the list and the ReorderBuffer to see if if the transaction has modified catalogs. Since this adds additional information to the serialized snapshot, we cannot backpatch it. For back branches, we take another approach; remember the last-running-xacts list of the first decoded RUNNING_XACTS record and check if the transaction whose commit record has XACT_XINFO_HAS_INVALS and whose XID is in the list. This doesn't require any file format changes but the transaction will end up being added to the snapshot even if it has only relcache invalidations. This commit bumps SNAPBUILD_VERSION because of change in SnapBuild. 
---
 contrib/test_decoding/Makefile                  |   2 +-
 .../expected/catalog_change_snapshot.out        |  44
 .../specs/catalog_change_snapshot.spec          |  39 +++
 .../replication/logical/reorderbuffer.c         |  69 -
 src/backend/replication/logical/snapbuild.c     | 235 --
 src/include/replication/reorderbuffer.h         |  12 +
 6 files changed, 317 insertions(+), 84 deletions(-)
 create mode 100644 contrib/test_decoding/expected/catalog_change_snapshot.out
 create mode 100644 contrib/test_decoding/specs/catalog_change_snapshot.spec

diff --git a/contrib/test_decoding/Makefile b/contrib/test_decoding/Makefile
index b220906479..c7ce603706 100644
--- a/contrib/test_decoding/Makefile
+++ b/contrib/test_decoding/Makefile
@@ -8,7 +8,7 @@ REGRESS = ddl xact rewrite toast permissions decoding_in_xact \
 	spill slot truncate stream stats twophase twophase_stream
 ISOLATION = mxact delayed_startup ondisk_startup concurrent_ddl_dml \
 	oldest_xmin snapshot_transfer subxact_without_top concurrent_stream \
-	twophase_snapshot slot_creation_error
+	twophase_snapshot slot_creation_error catalog_change_snapshot

 REGRESS_OPTS = --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf
 ISOLATION_OPTS = --temp-config $(top_srcdir)/contrib/test_decoding/logical.conf
diff --git a/contrib/test_decoding/expected/catalog_change_snapshot.out b/contrib/test_decoding/expected/catalog_change_snapshot.out
new file mode 100644
index
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 12, 2022 at 7:59 PM Amit Kapila wrote: > > On Tue, Jul 12, 2022 at 2:53 PM Masahiko Sawada wrote: > > > > On Tue, Jul 12, 2022 at 5:58 PM shiy.f...@fujitsu.com > > wrote: > > > > > > > > > It happened when executing the following code because it tried to free a > > > NULL > > > pointer (catchange_xip). > > > > > > /* be tidy */ > > > if (ondisk) > > > pfree(ondisk); > > > + if (catchange_xip) > > > + pfree(catchange_xip); > > > } > > > > > > It seems to be related to configure option. I could reproduce it when > > > using > > > `./configure --enable-debug`. > > > But I couldn't reproduce with `./configure --enable-debug CFLAGS="-Og > > > -ggdb"`. > > > > Hmm, I could not reproduce this problem even if I use ./configure > > --enable-debug. And it's weird that we checked if catchange_xip is not > > null but we did pfree for it: > > > > Yeah, this looks weird to me as well but one difference in running > tests could be the timing of WAL LOG for XLOG_RUNNING_XACTS. That may > change the timing of SnapBuildSerialize. The other thing we can try is > by checking the value of catchange_xcnt before pfree. Yeah, we can try that. While reading the code, I realized that we try to pfree both ondisk and catchange_xip also when we jumped to 'out:': out: ReorderBufferSetRestartPoint(builder->reorder, builder->last_serialized_snapshot); /* be tidy */ if (ondisk) pfree(ondisk); if (catchange_xip) pfree(catchange_xip); But we use both ondisk and catchange_xip only if we didn't jump to 'out:'. If this problem is related to compiler optimization with 'goto' statement, moving them before 'out:' might be worth trying. > > BTW, I think ReorderBufferGetCatalogChangesXacts should have an Assert > to ensure rb->catchange_ntxns and xcnt are equal. We can probably then > avoid having xcnt_p as an out parameter as the caller can use > rb->catchange_ntxns instead. > Agreed. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 12, 2022 at 2:53 PM Masahiko Sawada wrote: > > On Tue, Jul 12, 2022 at 5:58 PM shiy.f...@fujitsu.com > wrote: > > > > > > It happened when executing the following code because it tried to free a > > NULL > > pointer (catchange_xip). > > > > /* be tidy */ > > if (ondisk) > > pfree(ondisk); > > + if (catchange_xip) > > + pfree(catchange_xip); > > } > > > > It seems to be related to configure option. I could reproduce it when using > > `./configure --enable-debug`. > > But I couldn't reproduce with `./configure --enable-debug CFLAGS="-Og > > -ggdb"`. > > Hmm, I could not reproduce this problem even if I use ./configure > --enable-debug. And it's weird that we checked if catchange_xip is not > null but we did pfree for it: > Yeah, this looks weird to me as well but one difference in running tests could be the timing of WAL LOG for XLOG_RUNNING_XACTS. That may change the timing of SnapBuildSerialize. The other thing we can try is by checking the value of catchange_xcnt before pfree. BTW, I think ReorderBufferGetCatalogChangesXacts should have an Assert to ensure rb->catchange_ntxns and xcnt are equal. We can probably then avoid having xcnt_p as an out parameter as the caller can use rb->catchange_ntxns instead. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 12, 2022 at 5:58 PM shiy.f...@fujitsu.com wrote: > > On Tue, Jul 12, 2022 8:49 AM Masahiko Sawada wrote: > > > > I've attached an updated patch. > > > > Hi, > > I met a segmentation fault in test_decoding test after applying the patch for > master > branch. Attach the backtrace. Thank you for testing the patch! > > It happened when executing the following code because it tried to free a NULL > pointer (catchange_xip). > > /* be tidy */ > if (ondisk) > pfree(ondisk); > + if (catchange_xip) > + pfree(catchange_xip); > } > > It seems to be related to configure option. I could reproduce it when using > `./configure --enable-debug`. > But I couldn't reproduce with `./configure --enable-debug CFLAGS="-Og -ggdb"`. Hmm, I could not reproduce this problem even if I use ./configure --enable-debug. And it's weird that we checked if catchange_xip is not null but we did pfree for it: #1 pfree (pointer=0x0) at mcxt.c:1177 #2 0x0078186b in SnapBuildSerialize (builder=0x1fd5e78, lsn=25719712) at snapbuild.c:1792 Is it reproducible in your environment? If so, could you test it again with the following changes? diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c index d015c06ced..a6e76e3781 100644 --- a/src/backend/replication/logical/snapbuild.c +++ b/src/backend/replication/logical/snapbuild.c @@ -1788,7 +1788,7 @@ out: /* be tidy */ if (ondisk) pfree(ondisk); - if (catchange_xip) + if (catchange_xip != NULL) pfree(catchange_xip); } Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 12, 2022 8:49 AM Masahiko Sawada wrote:
>
> I've attached an updated patch.
>

Hi,

I met a segmentation fault in test_decoding test after applying the patch for master branch. Attach the backtrace.

It happened when executing the following code because it tried to free a NULL pointer (catchange_xip).

 	/* be tidy */
 	if (ondisk)
 		pfree(ondisk);
+	if (catchange_xip)
+		pfree(catchange_xip);
 }

It seems to be related to configure option. I could reproduce it when using `./configure --enable-debug`.
But I couldn't reproduce with `./configure --enable-debug CFLAGS="-Og -ggdb"`.

Regards,
Shi yu

#0  0x00910333 in GetMemoryChunkContext (pointer=0x0) at ../../../../src/include/utils/memutils.h:129
#1  pfree (pointer=0x0) at mcxt.c:1177
#2  0x0078186b in SnapBuildSerialize (builder=0x1fd5e78, lsn=25719712) at snapbuild.c:1792
#3  0x00782797 in SnapBuildProcessRunningXacts (builder=0x1fd5e78, lsn=25719712, running=0x27dfe50) at snapbuild.c:1199
#4  0x00774273 in standby_decode (ctx=0x1fc3e08, buf=0x7ffd8c95b5d0) at decode.c:346
#5  0x00773ab3 in LogicalDecodingProcessRecord (ctx=ctx@entry=0x1fc3e08, record=0x1fc41a0) at decode.c:119
#6  0x0077815d in pg_logical_slot_get_changes_guts (fcinfo=0x1fb5d88, confirm=, binary=) at logicalfuncs.c:271
#7  0x0067b63d in ExecMakeTableFunctionResult (setexpr=0x1fb42e0, econtext=0x1fb41b0, argContext=, expectedDesc=0x1fd7da8, randomAccess=false) at execSRF.c:234
#8  0x0068b76f in FunctionNext (node=node@entry=0x1fb3fa0) at nodeFunctionscan.c:94
#9  0x0067be37 in ExecScanFetch (recheckMtd=0x68b450 , accessMtd=0x68b470 , node=0x1fb3fa0) at execScan.c:133
#10 ExecScan (node=0x1fb3fa0, accessMtd=0x68b470 , recheckMtd=0x68b450 ) at execScan.c:199
#11 0x0067344b in ExecProcNode (node=0x1fb3fa0) at ../../../src/include/executor/executor.h:259
#12 ExecutePlan (execute_once=, dest=0x1fc02e8, direction=, numberTuples=0, sendTuples=, operation=CMD_SELECT, use_parallel_mode=, planstate=0x1fb3fa0, estate=0x1fb3d78) at execMain.c:1636
#13 standard_ExecutorRun (queryDesc=0x1f9e178, direction=, count=0, execute_once=) at execMain.c:363
#14 0x007dda9e in PortalRunSelect (portal=0x1f51338, forward=, count=0, dest=) at pquery.c:924
#15 0x007decf7 in PortalRun (portal=portal@entry=0x1f51338, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x1fc02e8, altdest=altdest@entry=0x1fc02e8, qc=0x7ffd8c95bbf0) at pquery.c:768
#16 0x007db4ff in exec_simple_query (query_string=0x1ee29a8 "SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'skip-empty-xacts', '1', 'include-xids', '0');") at postgres.c:1243
#17 0x007dc742 in PostgresMain (dbname=, username=) at postgres.c:4482
#18 0x0076132b in BackendRun (port=, port=) at postmaster.c:4503
#19 BackendStartup (port=) at postmaster.c:4231
#20 ServerLoop () at postmaster.c:1805
#21 0x007621fb in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x1edbf20) at postmaster.c:1477
#22 0x004f0f69 in main (argc=3, argv=0x1edbf20) at main.c:202
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 12, 2022 at 1:13 PM Masahiko Sawada wrote: > > On Tue, Jul 12, 2022 at 3:25 PM Amit Kapila wrote: > > > > On Tue, Jul 12, 2022 at 11:38 AM Masahiko Sawada > > wrote: > > > > > > On Tue, Jul 12, 2022 at 10:28 AM Masahiko Sawada > > > wrote: > > > > > > > > > > > > I'm doing benchmark tests and will share the results. > > > > > > > > > > I've done benchmark tests to measure the overhead introduced by doing > > > bsearch() every time when decoding a commit record. I've simulated a > > > very intensified situation where we decode 1M commit records while > > > keeping builder->catchange.xip array but the overhead is negilible: > > > > > > HEAD: 584 ms > > > Patched: 614 ms > > > > > > I've attached the benchmark script I used. With increasing > > > LOG_SNAPSHOT_INTERVAL_MS to 9, the last decoding by > > > pg_logicla_slot_get_changes() decodes 1M commit records while keeping > > > catalog modifying transactions. > > > > > > > Thanks for the test. We should also see how it performs when (a) we > > don't change LOG_SNAPSHOT_INTERVAL_MS, > > What point do you want to see in this test? I think the performance > overhead depends on how many times we do bsearch() and how many > transactions are in the list. > Right, I am not expecting any visible performance difference in this case. This is to ensure that we are not incurring any overhead in the more usual scenarios (or default cases). As per my understanding, the purpose of increasing the value of LOG_SNAPSHOT_INTERVAL_MS is to simulate a stress case for the changes made by the patch, and keeping its value default will test the more usual scenarios. > I increased this value to easily > simulate the situation where we decode many commit records while > keeping catalog modifying transactions. But even if we don't change > this value, the result would not change if we don't change how many > commit records we decode. 
> > > and (b) we have more DDL xacts > > so that the array to search is somewhat bigger > > I've done the same performance tests while creating 64 catalog > modifying transactions. The result is: > > HEAD: 595 ms > Patched: 628 ms > > There was no big overhead. > Yeah, especially considering you have simulated a stress case for the patch. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 12, 2022 at 3:25 PM Amit Kapila wrote: > > On Tue, Jul 12, 2022 at 11:38 AM Masahiko Sawada > wrote: > > > > On Tue, Jul 12, 2022 at 10:28 AM Masahiko Sawada > > wrote: > > > > > > > > > I'm doing benchmark tests and will share the results. > > > > > > > I've done benchmark tests to measure the overhead introduced by doing > > bsearch() every time when decoding a commit record. I've simulated a > > very intensified situation where we decode 1M commit records while > > keeping builder->catchange.xip array but the overhead is negilible: > > > > HEAD: 584 ms > > Patched: 614 ms > > > > I've attached the benchmark script I used. With increasing > > LOG_SNAPSHOT_INTERVAL_MS to 9, the last decoding by > > pg_logicla_slot_get_changes() decodes 1M commit records while keeping > > catalog modifying transactions. > > > > Thanks for the test. We should also see how it performs when (a) we > don't change LOG_SNAPSHOT_INTERVAL_MS, What point do you want to see in this test? I think the performance overhead depends on how many times we do bsearch() and how many transactions are in the list. I increased this value to easily simulate the situation where we decode many commit records while keeping catalog modifying transactions. But even if we don't change this value, the result would not change if we don't change how many commit records we decode. > and (b) we have more DDL xacts > so that the array to search is somewhat bigger I've done the same performance tests while creating 64 catalog modifying transactions. The result is: HEAD: 595 ms Patched: 628 ms There was no big overhead. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 12, 2022 at 11:38 AM Masahiko Sawada wrote: > > On Tue, Jul 12, 2022 at 10:28 AM Masahiko Sawada > wrote: > > > > > > I'm doing benchmark tests and will share the results. > > > > I've done benchmark tests to measure the overhead introduced by doing > bsearch() every time when decoding a commit record. I've simulated a > very intensified situation where we decode 1M commit records while > keeping builder->catchange.xip array but the overhead is negilible: > > HEAD: 584 ms > Patched: 614 ms > > I've attached the benchmark script I used. With increasing > LOG_SNAPSHOT_INTERVAL_MS to 9, the last decoding by > pg_logicla_slot_get_changes() decodes 1M commit records while keeping > catalog modifying transactions. > Thanks for the test. We should also see how it performs when (a) we don't change LOG_SNAPSHOT_INTERVAL_MS, and (b) we have more DDL xacts so that the array to search is somewhat bigger -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 12, 2022 at 10:28 AM Masahiko Sawada wrote: > > On Tue, Jul 12, 2022 at 9:48 AM Masahiko Sawada wrote: > > > > On Fri, Jul 8, 2022 at 8:20 PM Masahiko Sawada > > wrote: > > > > > > On Fri, Jul 8, 2022 at 5:59 PM Amit Kapila > > > wrote: > > > > > > > > On Fri, Jul 8, 2022 at 12:46 PM Masahiko Sawada > > > > wrote: > > > > > > > > > > On Fri, Jul 8, 2022 at 3:27 PM Amit Kapila > > > > > wrote: > > > > > > > > > > > > > > > > > 1. > > > > > > In ReorderBufferGetCatalogChangesXacts(), isn't it better to use the > > > > > > list length of 'catchange_txns' to allocate xids array? If we can do > > > > > > so, then we will save the need to repalloc as well. > > > > > > > > > > Since ReorderBufferGetcatalogChangesXacts() collects all ongoing > > > > > catalog modifying transactions, the length of the array could be > > > > > bigger than the one taken last time. We can start with the previous > > > > > length but I think we cannot remove the need for repalloc. > > > > > > > > > > > > > It is using the list "catchange_txns" to form xid array which > > > > shouldn't change for the duration of > > > > ReorderBufferGetCatalogChangesXacts(). Then the caller frees the xid > > > > array after its use. Next time in > > > > ReorderBufferGetCatalogChangesXacts(), the fresh allocation for xid > > > > array happens, so not sure why repalloc would be required? > > > > > > Oops, I mistook catchange_txns for catchange->xcnt. You're right. > > > Starting with the length of catchange_txns should be sufficient. > > > > > > > I've attached an updated patch. > > > > While trying this idea, I noticed there is no API to get the length of > > dlist, as we discussed offlist. Alternative idea was to use List > > (T_XidList) but I'm not sure it's a great idea since deleting an xid > > from the list is O(N), we need to implement list_delete_xid, and we > > need to make sure allocating list node in the reorder buffer context. 
> > So in the patch, I added a variable, catchange_ntxns, to keep track of
> > the length of the dlist. Please review it.
> >
> > I'm doing benchmark tests and will share the results.
>

I've done benchmark tests to measure the overhead introduced by doing bsearch() every time we decode a commit record. I've simulated a very intensive situation where we decode 1M commit records while keeping the builder->catchange.xip array, but the overhead is negligible:

HEAD: 584 ms
Patched: 614 ms

I've attached the benchmark script I used. With LOG_SNAPSHOT_INTERVAL_MS increased to 9, the last decoding by pg_logical_slot_get_changes() decodes 1M commit records while keeping catalog modifying transactions.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

bench.spec
Description: Binary data
RE: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 12, 2022 8:49 AM Masahiko Sawada wrote: > > I've attached an updated patch. > > While trying this idea, I noticed there is no API to get the length of > dlist, as we discussed offlist. Alternative idea was to use List > (T_XidList) but I'm not sure it's a great idea since deleting an xid > from the list is O(N), we need to implement list_delete_xid, and we > need to make sure allocating list node in the reorder buffer context. > So in the patch, I added a variable, catchange_ntxns, to keep track of > the length of the dlist. Please review it. > Thanks for your patch. Here are some comments on the master patch. 1. In catalog_change_snapshot.spec, should we use "RUNNING_XACTS record" instead of "RUNNING_XACT record" / "XACT_RUNNING record" in the comment? 2. +* Since catchange.xip is sorted, we find the lower bound of +* xids that sill are interesting. Typo? "sill" -> "still" 3. +* This array is set once when restoring the snapshot, xids are removed +* from the array when decoding xl_running_xacts record, and then eventually +* becomes an empty. + /* catchange list becomes an empty */ + pfree(builder->catchange.xip); + builder->catchange.xip = NULL; Should "becomes an empty" be modified to "becomes empty"? 4. + * changes that are smaller than ->xmin. Those won't ever get checked via + * the ->committed array and ->catchange, respectively. The committed xids will Should we change "the ->committed array and ->catchange" to "the ->committed or ->catchange array" ? Regards, Shi yu
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 12, 2022 at 9:48 AM Masahiko Sawada wrote: > > On Fri, Jul 8, 2022 at 8:20 PM Masahiko Sawada wrote: > > > > On Fri, Jul 8, 2022 at 5:59 PM Amit Kapila wrote: > > > > > > On Fri, Jul 8, 2022 at 12:46 PM Masahiko Sawada > > > wrote: > > > > > > > > On Fri, Jul 8, 2022 at 3:27 PM Amit Kapila > > > > wrote: > > > > > > > > > > > > > > 1. > > > > > In ReorderBufferGetCatalogChangesXacts(), isn't it better to use the > > > > > list length of 'catchange_txns' to allocate xids array? If we can do > > > > > so, then we will save the need to repalloc as well. > > > > > > > > Since ReorderBufferGetcatalogChangesXacts() collects all ongoing > > > > catalog modifying transactions, the length of the array could be > > > > bigger than the one taken last time. We can start with the previous > > > > length but I think we cannot remove the need for repalloc. > > > > > > > > > > It is using the list "catchange_txns" to form xid array which > > > shouldn't change for the duration of > > > ReorderBufferGetCatalogChangesXacts(). Then the caller frees the xid > > > array after its use. Next time in > > > ReorderBufferGetCatalogChangesXacts(), the fresh allocation for xid > > > array happens, so not sure why repalloc would be required? > > > > Oops, I mistook catchange_txns for catchange->xcnt. You're right. > > Starting with the length of catchange_txns should be sufficient. > > > > I've attached an updated patch. > > While trying this idea, I noticed there is no API to get the length of > dlist, as we discussed offlist. Alternative idea was to use List > (T_XidList) but I'm not sure it's a great idea since deleting an xid > from the list is O(N), we need to implement list_delete_xid, and we > need to make sure allocating list node in the reorder buffer context. > So in the patch, I added a variable, catchange_ntxns, to keep track of > the length of the dlist. Please review it. > I'm doing benchmark tests and will share the results. 
Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Fri, Jul 8, 2022 at 8:20 PM Masahiko Sawada wrote: > > On Fri, Jul 8, 2022 at 5:59 PM Amit Kapila wrote: > > > > On Fri, Jul 8, 2022 at 12:46 PM Masahiko Sawada > > wrote: > > > > > > On Fri, Jul 8, 2022 at 3:27 PM Amit Kapila > > > wrote: > > > > > > > > > > > 1. > > > > In ReorderBufferGetCatalogChangesXacts(), isn't it better to use the > > > > list length of 'catchange_txns' to allocate xids array? If we can do > > > > so, then we will save the need to repalloc as well. > > > > > > Since ReorderBufferGetcatalogChangesXacts() collects all ongoing > > > catalog modifying transactions, the length of the array could be > > > bigger than the one taken last time. We can start with the previous > > > length but I think we cannot remove the need for repalloc. > > > > > > > It is using the list "catchange_txns" to form xid array which > > shouldn't change for the duration of > > ReorderBufferGetCatalogChangesXacts(). Then the caller frees the xid > > array after its use. Next time in > > ReorderBufferGetCatalogChangesXacts(), the fresh allocation for xid > > array happens, so not sure why repalloc would be required? > > Oops, I mistook catchange_txns for catchange->xcnt. You're right. > Starting with the length of catchange_txns should be sufficient. > I've attached an updated patch. While trying this idea, I noticed there is no API to get the length of dlist, as we discussed offlist. Alternative idea was to use List (T_XidList) but I'm not sure it's a great idea since deleting an xid from the list is O(N), we need to implement list_delete_xid, and we need to make sure allocating list node in the reorder buffer context. So in the patch, I added a variable, catchange_ntxns, to keep track of the length of the dlist. Please review it. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ master-v3-0001-Add-catalog-modifying-transactions-to-logical-dec.patch Description: Binary data
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Jul 6, 2022 at 3:01 PM Amit Kapila wrote: > > On Wed, Jul 6, 2022 at 7:38 AM Masahiko Sawada wrote: > > > > I'll post a new version patch in the next email with replying to other > > comments. > > > > Okay, thanks for working on this. Few comments/suggestions on > poc_remember_last_running_xacts_v2 patch: > > 1. > +ReorderBufferSetLastRunningXactsCatalogChanges(ReorderBuffer *rb, > TransactionId xid, > +uint32 xinfo, int subxcnt, > +TransactionId *subxacts, XLogRecPtr lsn) > +{ > ... > ... > + > + test = bsearch(, rb->last_running_xacts, rb->n_last_running_xacts, > +sizeof(TransactionId), xidComparator); > + > + if (test == NULL) > + { > + for (int i = 0; i < subxcnt; i++) > + { > + test = bsearch([i], rb->last_running_xacts, > rb->n_last_running_xacts, > +sizeof(TransactionId), xidComparator); > ... > > Is there ever a possibility that the top transaction id is not in the > running_xacts list but one of its subxids is present? If yes, it is > not very obvious at least to me so adding a comment here could be > useful. If not, then why do we need this additional check for each of > the sub-transaction ids? I think there is no possibility. The check for subtransactions is not necessary. > > 2. > @@ -627,6 +647,15 @@ DecodeCommit(LogicalDecodingContext *ctx, > XLogRecordBuffer *buf, > commit_time = parsed->origin_timestamp; > } > > + /* > + * Set the last running xacts as containing catalog change if necessary. > + * This must be done before SnapBuildCommitTxn() so that we include catalog > + * change transactions to the historic snapshot. 
> + */ > + ReorderBufferSetLastRunningXactsCatalogChanges(ctx->reorder, xid, > parsed->xinfo, > +parsed->nsubxacts, parsed->subxacts, > +buf->origptr); > + > SnapBuildCommitTxn(ctx->snapshot_builder, buf->origptr, xid, > parsed->nsubxacts, parsed->subxacts); > > As mentioned previously as well, marking it before SnapBuildCommitTxn > has one disadvantage, we sometimes do this work even if the snapshot > state is SNAPBUILD_START or SNAPBUILD_BUILDING_SNAPSHOT in which case > SnapBuildCommitTxn wouldn't do anything. Can we instead check whether > the particular txn has invalidations and is present in the > last_running_xacts list along with the check > ReorderBufferXidHasCatalogChanges? I think that has the additional > advantage that we don't need this additional marking if the xact is > already marked as containing catalog changes. Agreed. > > 3. > 1. > + /* > + * We rely on HEAP2_NEW_CID records and XACT_INVALIDATIONS to know > + * if the transaction has changed the catalog, and that information > + * is not serialized to SnapBuilder. Therefore, if the logical > + * decoding decodes the commit record of the transaction that actually > + * has done catalog changes without these records, we miss to add > + * the xid to the snapshot so up creating the wrong snapshot. > > The part of the sentence "... snapshot so up creating the wrong > snapshot." is not clear. In this comment, at one place you have used > two spaces after a full stop, and at another place, there is one > space. I think let's follow nearby code practice to use a single space > before a new sentence. Agreed. > > 4. 
> +void > +ReorderBufferProcessLastRunningXacts(ReorderBuffer *rb, > xl_running_xacts *running) > +{ > + /* Quick exit if there is no longer last running xacts */ > + if (likely(rb->n_last_running_xacts == 0)) > + return; > + > + /* First call, build the last running xact list */ > + if (rb->n_last_running_xacts == -1) > + { > + int nxacts = running->subxcnt + running->xcnt; > + Size sz = sizeof(TransactionId) * nxacts;; > + > + rb->last_running_xacts = MemoryContextAlloc(rb->context, sz); > + memcpy(rb->last_running_xacts, running->xids, sz); > + qsort(rb->last_running_xacts, nxacts, sizeof(TransactionId), xidComparator); > + > + rb->n_last_running_xacts = nxacts; > + > + return; > + } > > a. Can we add the function header comments for this function? Updated. > b. We seem to be tracking the running_xact information for the first > running_xact record after start/restart. The name last_running_xacts > doesn't sound appropriate for that, how about initial_running_xacts? Sound good, updated. > > 5. > + /* > + * Purge xids in the last running xacts list if we can do that for at least > + * one xid. > + */ > + if (NormalTransactionIdPrecedes(rb->last_running_xacts[0], > + running->oldestRunningXid)) > > I think it would be a good idea to add a few lines here explaining why > it is safe to purge. IIUC, it is because the commit for those xacts > would have already been processed and we don't need such a xid > anymore. Right, updated. > > 6. As per the discussion above in this thread having > XACT_XINFO_HAS_INVALS in the commit record doesn't indicate that the > xact has catalog changes, so can we add somewhere in comments that for > such a case we can't distinguish whether the txn has catalog change > but we still mark the txn has catalog changes?
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Fri, Jul 8, 2022 at 5:59 PM Amit Kapila wrote: > > On Fri, Jul 8, 2022 at 12:46 PM Masahiko Sawada wrote: > > > > On Fri, Jul 8, 2022 at 3:27 PM Amit Kapila wrote: > > > > > > > > 1. > > > In ReorderBufferGetCatalogChangesXacts(), isn't it better to use the > > > list length of 'catchange_txns' to allocate xids array? If we can do > > > so, then we will save the need to repalloc as well. > > > > Since ReorderBufferGetcatalogChangesXacts() collects all ongoing > > catalog modifying transactions, the length of the array could be > > bigger than the one taken last time. We can start with the previous > > length but I think we cannot remove the need for repalloc. > > > > It is using the list "catchange_txns" to form xid array which > shouldn't change for the duration of > ReorderBufferGetCatalogChangesXacts(). Then the caller frees the xid > array after its use. Next time in > ReorderBufferGetCatalogChangesXacts(), the fresh allocation for xid > array happens, so not sure why repalloc would be required? Oops, I mistook catchange_txns for catchange->xcnt. You're right. Starting with the length of catchange_txns should be sufficient. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Fri, Jul 8, 2022 at 12:46 PM Masahiko Sawada wrote: > > On Fri, Jul 8, 2022 at 3:27 PM Amit Kapila wrote: > > > > > 1. > > In ReorderBufferGetCatalogChangesXacts(), isn't it better to use the > > list length of 'catchange_txns' to allocate xids array? If we can do > > so, then we will save the need to repalloc as well. > > Since ReorderBufferGetcatalogChangesXacts() collects all ongoing > catalog modifying transactions, the length of the array could be > bigger than the one taken last time. We can start with the previous > length but I think we cannot remove the need for repalloc. > It is using the list "catchange_txns" to form xid array which shouldn't change for the duration of ReorderBufferGetCatalogChangesXacts(). Then the caller frees the xid array after its use. Next time in ReorderBufferGetCatalogChangesXacts(), the fresh allocation for xid array happens, so not sure why repalloc would be required? -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Fri, Jul 8, 2022 at 3:27 PM Amit Kapila wrote: > > On Fri, Jul 8, 2022 at 6:45 AM Masahiko Sawada wrote: > > > > On Thu, Jul 7, 2022 at 3:40 PM Amit Kapila wrote: > > > > > > On Thu, Jul 7, 2022 at 8:21 AM Masahiko Sawada > > > wrote: > > > > I've attached the new version patch that incorporates the comments and > > the optimizations discussed above. > > > > Thanks, few minor comments: Thank you for the comments. > 1. > In ReorderBufferGetCatalogChangesXacts(), isn't it better to use the > list length of 'catchange_txns' to allocate xids array? If we can do > so, then we will save the need to repalloc as well. Since ReorderBufferGetCatalogChangesXacts() collects all ongoing catalog modifying transactions, the length of the array could be bigger than the one taken last time. We can start with the previous length but I think we cannot remove the need for repalloc. > 2. > /* ->committed manipulation */ > static void SnapBuildPurgeCommittedTxn(SnapBuild *builder); > > The above comment also needs to be changed. > > 3. As SnapBuildPurgeCommittedTxn() removes xacts both from committed > and catchange arrays, the function name no more remains appropriate. > We can either rename to something like SnapBuildPurgeOlderTxn() or > move the catchange logic to a different function and call it from > SnapBuildProcessRunningXacts. > > 4. > + if (TransactionIdEquals(builder->catchange.xip[off], > + builder->xmin) || > + NormalTransactionIdFollows(builder->catchange.xip[off], > +builder->xmin)) > > Can we use TransactionIdFollowsOrEquals() instead of above? > > 5. Comment change suggestion: > /* > * Remove knowledge about transactions we treat as committed or > containing catalog > * changes that are smaller than ->xmin. Those won't ever get checked via > - * the ->committed array and ->catchange, respectively. The committed xids > will > - * get checked via the clog machinery. > + * the ->committed or ->catchange array, respectively. 
The committed xids > will > + * get checked via the clog machinery. We can ideally remove the transaction > + * from catchange array once it is finished (committed/aborted) but that > could > + * be costly as we need to maintain the xids order in the array. > */ > Agreed with the above comments. > Apart from the above, I think there are pending comments for the > back-branch patch and some performance testing of this work. Right. I'll incorporate all comments I got so far into these patches and submit them. Also, will do some benchmark tests. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Fri, Jul 8, 2022 at 6:45 AM Masahiko Sawada wrote: > > On Thu, Jul 7, 2022 at 3:40 PM Amit Kapila wrote: > > > > On Thu, Jul 7, 2022 at 8:21 AM Masahiko Sawada > > wrote: > > I've attached the new version patch that incorporates the comments and > the optimizations discussed above. > Thanks, few minor comments: 1. In ReorderBufferGetCatalogChangesXacts(), isn't it better to use the list length of 'catchange_txns' to allocate xids array? If we can do so, then we will save the need to repalloc as well. 2. /* ->committed manipulation */ static void SnapBuildPurgeCommittedTxn(SnapBuild *builder); The above comment also needs to be changed. 3. As SnapBuildPurgeCommittedTxn() removes xacts both from committed and catchange arrays, the function name no more remains appropriate. We can either rename to something like SnapBuildPurgeOlderTxn() or move the catchange logic to a different function and call it from SnapBuildProcessRunningXacts. 4. + if (TransactionIdEquals(builder->catchange.xip[off], + builder->xmin) || + NormalTransactionIdFollows(builder->catchange.xip[off], +builder->xmin)) Can we use TransactionIdFollowsOrEquals() instead of above? 5. Comment change suggestion: /* * Remove knowledge about transactions we treat as committed or containing catalog * changes that are smaller than ->xmin. Those won't ever get checked via - * the ->committed array and ->catchange, respectively. The committed xids will - * get checked via the clog machinery. + * the ->committed or ->catchange array, respectively. The committed xids will + * get checked via the clog machinery. We can ideally remove the transaction + * from catchange array once it is finished (committed/aborted) but that could + * be costly as we need to maintain the xids order in the array. */ Apart from the above, I think there are pending comments for the back-branch patch and some performance testing of this work. -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Thu, Jul 7, 2022 at 3:40 PM Amit Kapila wrote: > > On Thu, Jul 7, 2022 at 8:21 AM Masahiko Sawada wrote: > > > > On Wed, Jul 6, 2022 at 5:55 PM Amit Kapila wrote: > > > > > > On Wed, Jul 6, 2022 at 12:19 PM Masahiko Sawada > > > wrote: > > > > > > > > On Tue, Jul 5, 2022 at 8:00 PM Amit Kapila > > > > wrote: > > > > > > > > > > 2. Are we anytime removing transaction ids from catchanges->xip array? > > > > > > > > No. > > > > > > > > > If not, is there a reason for the same? I think we can remove it > > > > > either at commit/abort or even immediately after adding the xid/subxid > > > > > to committed->xip array. > > > > > > > > It might be a good idea but I'm concerned that removing XID from the > > > > array at every commit/abort or after adding it to committed->xip array > > > > might be costly as it requires adjustment of the array to keep its > > > > order. Removing XIDs from the array would make bsearch faster but the > > > > array is updated reasonably often (every 15 sec). > > > > > > > > > > Fair point. However, I am slightly worried that we are unnecessarily > > > searching in this new array even when ReorderBufferTxn has the > > > required information. To avoid that, in function > > > SnapBuildXidHasCatalogChange(), we can first check > > > ReorderBufferXidHasCatalogChanges() and then check the array if the > > > first check doesn't return true. Also, by the way, do we need to > > > always keep builder->catchanges.xip updated via SnapBuildRestore()? > > > Isn't it sufficient that we just read and throw away contents from a > > > snapshot if builder->catchanges.xip is non-NULL? > > > > IIUC catchanges.xip is restored only once when restoring a consistent > > snapshot via SnapBuildRestore(). I think it's necessary to set > > catchanges.xip for later use in SnapBuildXidHasCatalogChange(). Or did > > you mean via SnapBuildSerialize()? > > > > Sorry, I got confused about the way restore is used. You are right, it > will be done once. 
My main worry is that we shouldn't look at > builder->catchanges.xip array on an ongoing basis which I think can be > dealt with by one of the ideas you mentioned below. But, I think we > can still follow the other suggestion related to moving > ReorderBufferXidHasCatalogChanges() check prior to checking array. Agreed. I've incorporated this change in the new version patch. > > > > > > > I had additionally thought if can further optimize this solution to > > > just store this additional information when we need to serialize for > > > checkpoint record but I think that won't work because walsender can > > > restart even without resatart of server in which case the same problem > > > can occur. > > > > Yes, probably we need to write catalog modifying transactions for > > every serialized snapshot. > > > > > I am not if sure there is a way to further optimize this > > > solution, let me know if you have any ideas? > > > > I suppose that writing additional information to serialized snapshots > > would not be a noticeable overhead since we need 4 bytes per > > transaction and we would not expect there is a huge number of > > concurrent catalog modifying transactions. But both collecting catalog > > modifying transactions (especially when there are many ongoing > > transactions) and bsearch'ing on the XID list every time decoding the > > COMMIT record could bring overhead. > > > > A solution for the first point would be to keep track of catalog > > modifying transactions by using a linked list so that we can avoid > > checking all ongoing transactions. > > > > This sounds reasonable to me. > > > Regarding the second point, on reflection, I think we need to look up > > the XID list until all XID in the list is committed/aborted. We can > > remove XIDs from the list after adding it to committed.xip as you > > suggested. 
Or when decoding a RUNNING_XACTS record, we can remove XIDs > > older than builder->xmin from the list like we do for committed.xip in > > SnapBuildPurgeCommittedTxn(). > > > > I think doing along with RUNNING_XACTS should be fine. At each > commit/abort, the cost could be high because we need to maintain the > sort order. In general, I feel any one of these should be okay because > once the array becomes empty, it won't be used again and there won't > be any operation related to it during ongoing replication. I've attached the new version patch that incorporates the comments and the optimizations discussed above. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ v2-0001-Add-catalog-modifying-transactions-to-logical-dec.patch Description: Binary data
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Thu, Jul 7, 2022 at 8:21 AM Masahiko Sawada wrote: > > On Wed, Jul 6, 2022 at 5:55 PM Amit Kapila wrote: > > > > On Wed, Jul 6, 2022 at 12:19 PM Masahiko Sawada > > wrote: > > > > > > On Tue, Jul 5, 2022 at 8:00 PM Amit Kapila > > > wrote: > > > > > > > > 2. Are we anytime removing transaction ids from catchanges->xip array? > > > > > > No. > > > > > > > If not, is there a reason for the same? I think we can remove it > > > > either at commit/abort or even immediately after adding the xid/subxid > > > > to committed->xip array. > > > > > > It might be a good idea but I'm concerned that removing XID from the > > > array at every commit/abort or after adding it to committed->xip array > > > might be costly as it requires adjustment of the array to keep its > > > order. Removing XIDs from the array would make bsearch faster but the > > > array is updated reasonably often (every 15 sec). > > > > > > > Fair point. However, I am slightly worried that we are unnecessarily > > searching in this new array even when ReorderBufferTxn has the > > required information. To avoid that, in function > > SnapBuildXidHasCatalogChange(), we can first check > > ReorderBufferXidHasCatalogChanges() and then check the array if the > > first check doesn't return true. Also, by the way, do we need to > > always keep builder->catchanges.xip updated via SnapBuildRestore()? > > Isn't it sufficient that we just read and throw away contents from a > > snapshot if builder->catchanges.xip is non-NULL? > > IIUC catchanges.xip is restored only once when restoring a consistent > snapshot via SnapBuildRestore(). I think it's necessary to set > catchanges.xip for later use in SnapBuildXidHasCatalogChange(). Or did > you mean via SnapBuildSerialize()? > Sorry, I got confused about the way restore is used. You are right, it will be done once. 
My main worry is that we shouldn't look at builder->catchanges.xip array on an ongoing basis which I think can be dealt with by one of the ideas you mentioned below. But, I think we can still follow the other suggestion related to moving ReorderBufferXidHasCatalogChanges() check prior to checking the array. > > > > I had additionally thought if we can further optimize this solution to > > just store this additional information when we need to serialize for > > checkpoint record but I think that won't work because walsender can > > restart even without a restart of the server in which case the same problem > > can occur. > > Yes, probably we need to write catalog modifying transactions for > every serialized snapshot. > > > I am not sure if there is a way to further optimize this > > solution, let me know if you have any ideas? > > I suppose that writing additional information to serialized snapshots > would not be a noticeable overhead since we need 4 bytes per > transaction and we would not expect there is a huge number of > concurrent catalog modifying transactions. But both collecting catalog > modifying transactions (especially when there are many ongoing > transactions) and bsearch'ing on the XID list every time decoding the > COMMIT record could bring overhead. > > A solution for the first point would be to keep track of catalog > modifying transactions by using a linked list so that we can avoid > checking all ongoing transactions. > This sounds reasonable to me. > Regarding the second point, on reflection, I think we need to look up > the XID list until all XIDs in the list are committed/aborted. We can > remove XIDs from the list after adding it to committed.xip as you > suggested. Or when decoding a RUNNING_XACTS record, we can remove XIDs > older than builder->xmin from the list like we do for committed.xip in > SnapBuildPurgeCommittedTxn(). > I think doing it along with RUNNING_XACTS should be fine. 
At each commit/abort, the cost could be high because we need to maintain the sort order. In general, I feel any one of these should be okay because once the array becomes empty, it won't be used again and there won't be any operation related to it during ongoing replication. -- With Regards, Amit Kapila.
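The purge-on-RUNNING_XACTS idea discussed above can be sketched in a few lines. This is a minimal standalone illustration, not the actual PostgreSQL code: purge_old_xids is a hypothetical name, and the wraparound-aware comparison the real code would use (NormalTransactionIdPrecedes) is simplified to plain uint32 ordering, which is also how xidComparator-sorted arrays are laid out.

```c
#include <stdint.h>
#include <string.h>

typedef uint32_t TransactionId;

/*
 * Drop every entry below the oldest running XID from a sorted XID array in
 * one pass, shifting the survivors to the front.  Returns the new count.
 * The array is assumed sorted in plain ascending uint32 order.
 */
static size_t
purge_old_xids(TransactionId *xip, size_t xcnt, TransactionId oldest_running)
{
	size_t		off = 0;

	/* find the first entry we must keep */
	while (off < xcnt && xip[off] < oldest_running)
		off++;

	/* one memmove instead of per-element adjustment at each commit/abort */
	if (off > 0 && off < xcnt)
		memmove(xip, xip + off, (xcnt - off) * sizeof(TransactionId));

	return xcnt - off;
}
```

This is why doing the removal when a RUNNING_XACTS record arrives is cheaper than at every commit/abort: all purgeable entries are contiguous at the front of the sorted array, so one shift suffices.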
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Jul 6, 2022 at 5:55 PM Amit Kapila wrote: > > On Wed, Jul 6, 2022 at 12:19 PM Masahiko Sawada wrote: > > > > On Tue, Jul 5, 2022 at 8:00 PM Amit Kapila wrote: > > > > > > 2. Are we anytime removing transaction ids from catchanges->xip array? > > > > No. > > > > > If not, is there a reason for the same? I think we can remove it > > > either at commit/abort or even immediately after adding the xid/subxid > > > to committed->xip array. > > > > It might be a good idea but I'm concerned that removing XID from the > > array at every commit/abort or after adding it to committed->xip array > > might be costly as it requires adjustment of the array to keep its > > order. Removing XIDs from the array would make bsearch faster but the > > array is updated reasonably often (every 15 sec). > > > > Fair point. However, I am slightly worried that we are unnecessarily > searching in this new array even when ReorderBufferTxn has the > required information. To avoid that, in function > SnapBuildXidHasCatalogChange(), we can first check > ReorderBufferXidHasCatalogChanges() and then check the array if the > first check doesn't return true. Also, by the way, do we need to > always keep builder->catchanges.xip updated via SnapBuildRestore()? > Isn't it sufficient that we just read and throw away contents from a > snapshot if builder->catchanges.xip is non-NULL? IIUC catchanges.xip is restored only once when restoring a consistent snapshot via SnapBuildRestore(). I think it's necessary to set catchanges.xip for later use in SnapBuildXidHasCatalogChange(). Or did you mean via SnapBuildSerialize()? > > I had additionally thought if we can further optimize this solution to > just store this additional information when we need to serialize for > checkpoint record but I think that won't work because walsender can > restart even without a restart of the server in which case the same problem > can occur. 
Yes, probably we need to write catalog modifying transactions for every serialized snapshot. > I am not sure if there is a way to further optimize this > solution, let me know if you have any ideas? I suppose that writing additional information to serialized snapshots would not be a noticeable overhead since we need 4 bytes per transaction and we would not expect there is a huge number of concurrent catalog modifying transactions. But both collecting catalog modifying transactions (especially when there are many ongoing transactions) and bsearch'ing on the XID list every time decoding the COMMIT record could bring overhead. A solution for the first point would be to keep track of catalog modifying transactions by using a linked list so that we can avoid checking all ongoing transactions. Regarding the second point, on reflection, I think we need to look up the XID list until all XIDs in the list are committed/aborted. We can remove XIDs from the list after adding it to committed.xip as you suggested. Or when decoding a RUNNING_XACTS record, we can remove XIDs older than builder->xmin from the list like we do for committed.xip in SnapBuildPurgeCommittedTxn(). Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
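The "bsearch'ing on the XID list" mentioned above can be sketched as follows. The comparator mirrors the behavior of PostgreSQL's xidComparator (plain uint32 ordering over a sorted array); xid_in_sorted_array is a hypothetical wrapper name used only for this illustration.

```c
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t TransactionId;

/*
 * Plain uint32 ordering, matching how xidComparator-sorted XID arrays are
 * kept in the serialized snapshot (no wraparound logic at this layer).
 */
static int
xid_comparator(const void *arg1, const void *arg2)
{
	TransactionId xid1 = *(const TransactionId *) arg1;
	TransactionId xid2 = *(const TransactionId *) arg2;

	if (xid1 < xid2)
		return -1;
	if (xid1 > xid2)
		return 1;
	return 0;
}

/* true if xid is present in the sorted array xip[0..xcnt) */
static int
xid_in_sorted_array(TransactionId xid, const TransactionId *xip, size_t xcnt)
{
	return bsearch(&xid, xip, xcnt,
				   sizeof(TransactionId), xid_comparator) != NULL;
}
```

Each decoded COMMIT record thus pays O(log n) for the membership test, which is the per-record overhead being weighed in this exchange.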
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Jul 6, 2022 at 12:19 PM Masahiko Sawada wrote: > > On Tue, Jul 5, 2022 at 8:00 PM Amit Kapila wrote: > > > > 2. Are we anytime removing transaction ids from catchanges->xip array? > > No. > > > If not, is there a reason for the same? I think we can remove it > > either at commit/abort or even immediately after adding the xid/subxid > > to committed->xip array. > > It might be a good idea but I'm concerned that removing XID from the > array at every commit/abort or after adding it to committed->xip array > might be costly as it requires adjustment of the array to keep its > order. Removing XIDs from the array would make bsearch faster but the > array is updated reasonably often (every 15 sec). > Fair point. However, I am slightly worried that we are unnecessarily searching in this new array even when ReorderBufferTxn has the required information. To avoid that, in function SnapBuildXidHasCatalogChange(), we can first check ReorderBufferXidHasCatalogChanges() and then check the array if the first check doesn't return true. Also, by the way, do we need to always keep builder->catchanges.xip updated via SnapBuildRestore()? Isn't it sufficient that we just read and throw away contents from a snapshot if builder->catchanges.xip is non-NULL? I had additionally thought if we can further optimize this solution to just store this additional information when we need to serialize for checkpoint record but I think that won't work because walsender can restart even without a restart of the server in which case the same problem can occur. I am not sure if there is a way to further optimize this solution, let me know if you have any ideas? -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jul 5, 2022 at 8:00 PM Amit Kapila wrote: > > On Mon, Jul 4, 2022 at 6:12 PM Amit Kapila wrote: > > > > On Mon, May 30, 2022 at 11:13 AM Masahiko Sawada > > wrote: > > > > > > I've attached three POC patches: > > > > > > > I think it will be a good idea if you can add a short commit message > > at least to say which patch is proposed for HEAD and which one is for > > back branches. Also, it would be good if you can add some description > > of the fix in the commit message. Let's remove poc* from the patch > > name. > > > > Review poc_add_running_catchanges_xacts_to_serialized_snapshot > > = > > Few more comments: Thank you for the comments. > 1. > + > + /* This array must be sorted in xidComparator order */ > + TransactionId *xip; > + } catchanges; > }; > > This array contains the transaction ids for subtransactions as well. I > think it is better mention the same in comments. Updated. > > 2. Are we anytime removing transaction ids from catchanges->xip array? No. > If not, is there a reason for the same? I think we can remove it > either at commit/abort or even immediately after adding the xid/subxid > to committed->xip array. It might be a good idea but I'm concerned that removing XID from the array at every commit/abort or after adding it to committed->xip array might be costly as it requires adjustment of the array to keep its order. Removing XIDs from the array would make bsearch faster but the array is updated reasonably often (every 15 sec). > > 3. > + if (readBytes != sz) > + { > + int save_errno = errno; > + > + CloseTransientFile(fd); > + > + if (readBytes < 0) > + { > + errno = save_errno; > + ereport(ERROR, > + (errcode_for_file_access(), > + errmsg("could not read file \"%s\": %m", path))); > + } > + else > + ereport(ERROR, > + (errcode(ERRCODE_DATA_CORRUPTED), > + errmsg("could not read file \"%s\": read %d of %zu", > + path, readBytes, sz))); > + } > > This is the fourth instance of similar error handling code in > SnapBuildRestore(). 
Isn't it better to extract this into a separate > function? Good idea, updated. > > 4. > +TransactionId * > +ReorderBufferGetCatalogChangesXacts(ReorderBuffer *rb, size_t *xcnt_p) > +{ > + HASH_SEQ_STATUS hash_seq; > + ReorderBufferTXNByIdEnt *ent; > + TransactionId *xids; > + size_t xcnt = 0; > + size_t xcnt_space = 64; /* arbitrary number */ > + > + xids = (TransactionId *) palloc(sizeof(TransactionId) * xcnt_space); > + > + hash_seq_init(&hash_seq, rb->by_txn); > + while ((ent = hash_seq_search(&hash_seq)) != NULL) > + { > + ReorderBufferTXN *txn = ent->txn; > + > + if (!rbtxn_has_catalog_changes(txn)) > + continue; > > It would be better to allocate memory the first time we have to store > xids. There is a good chance that many a time this function will do > just palloc without having to store any xid. Agreed. > > 5. Do you think we should do some performance testing for a mix of > ddl/dml workload to see if it adds any overhead in decoding due to > serialize/restore doing additional work? I don't think it should add > some meaningful overhead but OTOH there is no harm in doing some > testing of the same. Yes, it would be worth trying. I also believe this change doesn't introduce noticeable overhead but let's check just in case. I've attached an updated patch. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ 0001-Add-catalog-modifying-transactions-to-logical-decodi.patch Description: Binary data
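The helper extraction agreed on above (folding the four copies of the read-and-check pattern in SnapBuildRestore() into one function) could look roughly like this. The function name snapbuild_read_data is hypothetical, and ereport() is replaced by fprintf()/exit() so the sketch is self-contained; the real function would raise the same two errors shown in the quoted hunk.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Read exactly sz bytes from fd into dst, or report an error and bail out.
 * Mirrors the quoted error handling: a negative read() reports the errno,
 * a short read reports how many bytes arrived (possible data corruption).
 */
static void
snapbuild_read_data(int fd, void *dst, size_t sz, const char *path)
{
	ssize_t		readBytes = read(fd, dst, sz);

	if (readBytes != (ssize_t) sz)
	{
		int			save_errno = errno;

		close(fd);

		if (readBytes < 0)
			fprintf(stderr, "could not read file \"%s\": %s\n",
					path, strerror(save_errno));
		else
			fprintf(stderr, "could not read file \"%s\": read %zd of %zu\n",
					path, readBytes, sz);
		exit(1);
	}
}
```

With such a helper, each of the four restore sites reduces to a single call, and the error wording stays consistent across them.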
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Wed, Jul 6, 2022 at 7:38 AM Masahiko Sawada wrote: > > I'll post a new version patch in the next email with replying to other > comments. > Okay, thanks for working on this. Few comments/suggestions on poc_remember_last_running_xacts_v2 patch: 1. +ReorderBufferSetLastRunningXactsCatalogChanges(ReorderBuffer *rb, TransactionId xid, +uint32 xinfo, int subxcnt, +TransactionId *subxacts, XLogRecPtr lsn) +{ ... ... + + test = bsearch(&xid, rb->last_running_xacts, rb->n_last_running_xacts, +sizeof(TransactionId), xidComparator); + + if (test == NULL) + { + for (int i = 0; i < subxcnt; i++) + { + test = bsearch(&subxacts[i], rb->last_running_xacts, rb->n_last_running_xacts, +sizeof(TransactionId), xidComparator); ... Is there ever a possibility that the top transaction id is not in the running_xacts list but one of its subxids is present? If yes, it is not very obvious at least to me so adding a comment here could be useful. If not, then why do we need this additional check for each of the sub-transaction ids? 2. @@ -627,6 +647,15 @@ DecodeCommit(LogicalDecodingContext *ctx, XLogRecordBuffer *buf, commit_time = parsed->origin_timestamp; } + /* + * Set the last running xacts as containing catalog change if necessary. + * This must be done before SnapBuildCommitTxn() so that we include catalog + * change transactions to the historic snapshot. + */ + ReorderBufferSetLastRunningXactsCatalogChanges(ctx->reorder, xid, parsed->xinfo, +parsed->nsubxacts, parsed->subxacts, +buf->origptr); + SnapBuildCommitTxn(ctx->snapshot_builder, buf->origptr, xid, parsed->nsubxacts, parsed->subxacts); As mentioned previously as well, marking it before SnapBuildCommitTxn has one disadvantage, we sometimes do this work even if the snapshot state is SNAPBUILD_START or SNAPBUILD_BUILDING_SNAPSHOT in which case SnapBuildCommitTxn wouldn't do anything. 
Can we instead check whether the particular txn has invalidations and is present in the last_running_xacts list along with the check ReorderBufferXidHasCatalogChanges? I think that has the additional advantage that we don't need this additional marking if the xact is already marked as containing catalog changes. 3. + /* + * We rely on HEAP2_NEW_CID records and XACT_INVALIDATIONS to know + * if the transaction has changed the catalog, and that information + * is not serialized to SnapBuilder. Therefore, if the logical + * decoding decodes the commit record of the transaction that actually + * has done catalog changes without these records, we miss to add + * the xid to the snapshot so up creating the wrong snapshot. The part of the sentence "... snapshot so up creating the wrong snapshot." is not clear. In this comment, at one place you have used two spaces after a full stop, and at another place, there is one space. I think let's follow nearby code practice to use a single space before a new sentence. 4. +void +ReorderBufferProcessLastRunningXacts(ReorderBuffer *rb, xl_running_xacts *running) +{ + /* Quick exit if there is no longer last running xacts */ + if (likely(rb->n_last_running_xacts == 0)) + return; + + /* First call, build the last running xact list */ + if (rb->n_last_running_xacts == -1) + { + int nxacts = running->subxcnt + running->xcnt; + Size sz = sizeof(TransactionId) * nxacts; + + rb->last_running_xacts = MemoryContextAlloc(rb->context, sz); + memcpy(rb->last_running_xacts, running->xids, sz); + qsort(rb->last_running_xacts, nxacts, sizeof(TransactionId), xidComparator); + + rb->n_last_running_xacts = nxacts; + + return; + } a. Can we add the function header comments for this function? b. We seem to be tracking the running_xact information for the first running_xact record after start/restart. The name last_running_xacts doesn't sound appropriate for that, how about initial_running_xacts? 5. 
+ /* + * Purge xids in the last running xacts list if we can do that for at least + * one xid. + */ + if (NormalTransactionIdPrecedes(rb->last_running_xacts[0], + running->oldestRunningXid)) I think it would be a good idea to add a few lines here explaining why it is safe to purge. IIUC, it is because the commit for those xacts would have already been processed and we don't need such a xid anymore. 6. As per the discussion above in this thread having XACT_XINFO_HAS_INVALS in the commit record doesn't indicate that the xact has catalog changes, so can we add somewhere in comments that for such a case we can't distinguish whether the txn has catalog change but we still mark the txn has catalog changes? Can you please share one example for this case? -- With Regards, Amit Kapila.
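The "first call" branch quoted in comment 4 above can be illustrated standalone: copy the top-level and sub-transaction XIDs of the first RUNNING_XACTS record into one array and sort it, so later membership checks can use bsearch(). In this sketch, build_initial_running_xacts is a hypothetical name, malloc() stands in for MemoryContextAlloc(), and the comparator mimics xidComparator's plain uint32 ordering.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint32_t TransactionId;

/* plain uint32 ordering, as used for xidComparator-sorted arrays */
static int
xid_cmp(const void *a, const void *b)
{
	TransactionId x = *(const TransactionId *) a;
	TransactionId y = *(const TransactionId *) b;

	return (x < y) ? -1 : ((x > y) ? 1 : 0);
}

/*
 * Build a sorted copy of the XIDs (top-level and subxacts together) seen in
 * the first RUNNING_XACTS record after start/restart.  Caller frees.
 */
static TransactionId *
build_initial_running_xacts(const TransactionId *xids, size_t nxacts)
{
	size_t		sz = sizeof(TransactionId) * nxacts;
	TransactionId *arr = malloc(sz);

	memcpy(arr, xids, sz);
	qsort(arr, nxacts, sizeof(TransactionId), xid_cmp);
	return arr;
}
```

Sorting once up front is what makes the later per-commit bsearch() lookups (and the front-of-array purge discussed elsewhere in the thread) cheap.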
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Mon, Jul 4, 2022 at 9:42 PM Amit Kapila wrote: > > On Mon, May 30, 2022 at 11:13 AM Masahiko Sawada > wrote: > > > > I've attached three POC patches: > > > > I think it will be a good idea if you can add a short commit message > at least to say which patch is proposed for HEAD and which one is for > back branches. Also, it would be good if you can add some description > of the fix in the commit message. Let's remove poc* from the patch > name. Updated. > > Review poc_add_running_catchanges_xacts_to_serialized_snapshot > = > 1. > + /* > + * Array of transactions that were running when the snapshot serialization > + * and changed system catalogs, > > The part of the sentence after serialization is not very clear. Updated. > > 2. > - if (ReorderBufferXidHasCatalogChanges(builder->reorder, subxid)) > + if (ReorderBufferXidHasCatalogChanges(builder->reorder, subxid) || > + bsearch(&xid, builder->catchanges.xip, builder->catchanges.xcnt, > + sizeof(TransactionId), xidComparator) != NULL) > > Why are you using xid instead of subxid in bsearch call? Can we add a > comment to say why it is okay to use xid if there is a valid reason? > But note, we are using subxid to add to the committed xact array so > not sure if this is a good idea but I might be missing something. You're right, subxid should be used here. > > Suggestions for improvement in comments: > - /* > -* Update the transactions that are running and changes > catalogs that are > -* not committed. > -*/ > + /* Update the catalog modifying transactions that are yet not > committed. 
*/ > if (builder->catchanges.xip) > pfree(builder->catchanges.xip); > builder->catchanges.xip = > ReorderBufferGetCatalogChangesXacts(builder->reorder, > @@ -1647,7 +1644,7 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn) > COMP_CRC32C(ondisk->checksum, ondisk_c, sz); > ondisk_c += sz; > > - /* copy catalog-changes xacts */ > + /* copy catalog modifying xacts */ > sz = sizeof(TransactionId) * builder->catchanges.xcnt; > memcpy(ondisk_c, builder->catchanges.xip, sz); > COMP_CRC32C(ondisk->checksum, ondisk_c, sz); Updated. I'll post a new version patch in the next email with replying to other comments. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Mon, Jul 4, 2022 at 6:12 PM Amit Kapila wrote: > > On Mon, May 30, 2022 at 11:13 AM Masahiko Sawada > wrote: > > > > I've attached three POC patches: > > > > I think it will be a good idea if you can add a short commit message > at least to say which patch is proposed for HEAD and which one is for > back branches. Also, it would be good if you can add some description > of the fix in the commit message. Let's remove poc* from the patch > name. > > Review poc_add_running_catchanges_xacts_to_serialized_snapshot > = Few more comments: 1. + + /* This array must be sorted in xidComparator order */ + TransactionId *xip; + } catchanges; }; This array contains the transaction ids for subtransactions as well. I think it is better to mention the same in comments. 2. Are we anytime removing transaction ids from catchanges->xip array? If not, is there a reason for the same? I think we can remove it either at commit/abort or even immediately after adding the xid/subxid to committed->xip array. 3. + if (readBytes != sz) + { + int save_errno = errno; + + CloseTransientFile(fd); + + if (readBytes < 0) + { + errno = save_errno; + ereport(ERROR, + (errcode_for_file_access(), + errmsg("could not read file \"%s\": %m", path))); + } + else + ereport(ERROR, + (errcode(ERRCODE_DATA_CORRUPTED), + errmsg("could not read file \"%s\": read %d of %zu", + path, readBytes, sz))); + } This is the fourth instance of similar error handling code in SnapBuildRestore(). 
+TransactionId * +ReorderBufferGetCatalogChangesXacts(ReorderBuffer *rb, size_t *xcnt_p) +{ + HASH_SEQ_STATUS hash_seq; + ReorderBufferTXNByIdEnt *ent; + TransactionId *xids; + size_t xcnt = 0; + size_t xcnt_space = 64; /* arbitrary number */ + + xids = (TransactionId *) palloc(sizeof(TransactionId) * xcnt_space); + + hash_seq_init(&hash_seq, rb->by_txn); + while ((ent = hash_seq_search(&hash_seq)) != NULL) + { + ReorderBufferTXN *txn = ent->txn; + + if (!rbtxn_has_catalog_changes(txn)) + continue; It would be better to allocate memory the first time we have to store xids. There is a good chance that many a time this function will do just palloc without having to store any xid. 5. Do you think we should do some performance testing for a mix of ddl/dml workload to see if it adds any overhead in decoding due to serialize/restore doing additional work? I don't think it should add some meaningful overhead but OTOH there is no harm in doing some testing of the same. -- With Regards, Amit Kapila.
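The allocation strategy suggested in comment 4 above (defer the first palloc until a catalog-modifying transaction is actually found) can be sketched as follows. The inputs here (a parallel flag array) are a stand-in for walking rb->by_txn and testing rbtxn_has_catalog_changes(); malloc/realloc stand in for palloc/repalloc, and the function name is hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t TransactionId;

/*
 * Collect the XIDs whose flag is set.  The result array is allocated only
 * once the first match is found, and its capacity doubles when it fills up.
 * Returns NULL (and *xcnt_p == 0) when nothing matched, so the common
 * no-catalog-changes case performs no allocation at all.
 */
static TransactionId *
collect_catalog_change_xids(const TransactionId *xids_in,
							const int *has_catalog_change,
							size_t n, size_t *xcnt_p)
{
	TransactionId *xids = NULL;
	size_t		xcnt = 0;
	size_t		xcnt_space = 0;

	for (size_t i = 0; i < n; i++)
	{
		if (!has_catalog_change[i])
			continue;

		if (xcnt == xcnt_space)
		{
			xcnt_space = (xcnt_space == 0) ? 64 : xcnt_space * 2;
			xids = realloc(xids, sizeof(TransactionId) * xcnt_space);
		}
		xids[xcnt++] = xids_in[i];
	}

	*xcnt_p = xcnt;
	return xids;
}
```

Compared with the quoted version, which always pallocs space for 64 XIDs, this avoids the allocation entirely in the (frequent) case where no transaction in the hash table has catalog changes.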
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Mon, May 30, 2022 at 11:13 AM Masahiko Sawada wrote: > > I've attached three POC patches: > I think it will be a good idea if you can add a short commit message at least to say which patch is proposed for HEAD and which one is for back branches. Also, it would be good if you can add some description of the fix in the commit message. Let's remove poc* from the patch name. Review poc_add_running_catchanges_xacts_to_serialized_snapshot = 1. + /* + * Array of transactions that were running when the snapshot serialization + * and changed system catalogs, The part of the sentence after serialization is not very clear. 2. - if (ReorderBufferXidHasCatalogChanges(builder->reorder, subxid)) + if (ReorderBufferXidHasCatalogChanges(builder->reorder, subxid) || + bsearch(&xid, builder->catchanges.xip, builder->catchanges.xcnt, + sizeof(TransactionId), xidComparator) != NULL) Why are you using xid instead of subxid in bsearch call? Can we add a comment to say why it is okay to use xid if there is a valid reason? But note, we are using subxid to add to the committed xact array so not sure if this is a good idea but I might be missing something. Suggestions for improvement in comments: - /* -* Update the transactions that are running and changes catalogs that are -* not committed. -*/ + /* Update the catalog modifying transactions that are yet not committed. */ if (builder->catchanges.xip) pfree(builder->catchanges.xip); builder->catchanges.xip = ReorderBufferGetCatalogChangesXacts(builder->reorder, @@ -1647,7 +1644,7 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn) COMP_CRC32C(ondisk->checksum, ondisk_c, sz); ondisk_c += sz; - /* copy catalog-changes xacts */ + /* copy catalog modifying xacts */ sz = sizeof(TransactionId) * builder->catchanges.xcnt; memcpy(ondisk_c, builder->catchanges.xip, sz); COMP_CRC32C(ondisk->checksum, ondisk_c, sz); -- With Regards, Amit Kapila.
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Tue, Jun 14, 2022 at 3:57 PM Amit Kapila wrote: > > On Mon, Jun 13, 2022 at 8:29 AM Masahiko Sawada wrote: > > > > On Tue, Jun 7, 2022 at 9:32 PM Amit Kapila wrote: > > > > > > On Mon, May 30, 2022 at 11:13 AM Masahiko Sawada > > > wrote: > > > > > > > > On Wed, May 25, 2022 at 12:11 PM Masahiko Sawada > > > > wrote: > > > > > > > > > > > > > poc_add_regression_tests.patch adds regression tests for this bug. The > > > > regression tests are required for both HEAD and back-patching but I've > > > > separated this patch for testing the above two patches easily. > > > > > > > > Thank you for the comments. > > > > > > > > Few comments on the test case patch: > > > === > > > 1. > > > +# For the transaction that TRUNCATEd the table tbl1, the last decoding > > > decodes > > > +# only its COMMIT record, because it starts from the RUNNING_XACT > > > record emitted > > > +# during the first checkpoint execution. This transaction must be > > > marked as > > > +# catalog-changes while decoding the COMMIT record and the decoding > > > of the INSERT > > > +# record must read the pg_class with the correct historic snapshot. > > > +permutation "s0_init" "s0_begin" "s0_savepoint" "s0_truncate" > > > "s1_checkpoint" "s1_get_changes" "s0_commit" "s0_begin" "s0_insert" > > > "s1_checkpoint" "s1_get_changes" "s0_commit" "s1_get_changes" > > > > > > Will this test always work? What if we get an additional running_xact > > > record between steps "s0_commit" and "s0_begin" that is logged via > > > bgwriter? You can mimic that by adding an additional checkpoint > > > between those two steps. If we do that, the test will pass even > > > without the patch because I think the last decoding will start > > > decoding from this new running_xact record. > > > > Right. It could pass depending on the timing but doesn't fail > > depending on the timing. I think we need to somehow stop bgwriter to > > make the test case stable but it seems unrealistic. 
> > Agreed, in my local testing for this case, I used to increase > LOG_SNAPSHOT_INTERVAL_MS to avoid such a situation but I understand it > is not practical via test. > > > Do you have any > > better ideas? > > > > No, I don't have any better ideas. I think it is better to add some > information related to this in the comments because it may help to > improve the test in the future if we come up with a better idea. I also don't have any better ideas to make it stable, and agreed. I've attached an updated version patch for adding regression tests. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/ poc_add_regression_tests_v2.patch Description: Binary data
Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
On Mon, Jun 13, 2022 at 8:29 AM Masahiko Sawada wrote: > > On Tue, Jun 7, 2022 at 9:32 PM Amit Kapila wrote: > > > > On Mon, May 30, 2022 at 11:13 AM Masahiko Sawada > > wrote: > > > > > > On Wed, May 25, 2022 at 12:11 PM Masahiko Sawada > > > wrote: > > > > > > > > > > poc_add_regression_tests.patch adds regression tests for this bug. The > > > regression tests are required for both HEAD and back-patching but I've > > > separated this patch for testing the above two patches easily. > > > > > Thank you for the comments. > > > > > Few comments on the test case patch: > > === > > 1. > > +# For the transaction that TRUNCATEd the table tbl1, the last decoding > > decodes > > +# only its COMMIT record, because it starts from the RUNNING_XACT > > record emitted > > +# during the first checkpoint execution. This transaction must be marked > > as > > +# catalog-changes while decoding the COMMIT record and the decoding > > of the INSERT > > +# record must read the pg_class with the correct historic snapshot. > > +permutation "s0_init" "s0_begin" "s0_savepoint" "s0_truncate" > > "s1_checkpoint" "s1_get_changes" "s0_commit" "s0_begin" "s0_insert" > > "s1_checkpoint" "s1_get_changes" "s0_commit" "s1_get_changes" > > > > Will this test always work? What if we get an additional running_xact > > record between steps "s0_commit" and "s0_begin" that is logged via > > bgwriter? You can mimic that by adding an additional checkpoint > > between those two steps. If we do that, the test will pass even > > without the patch because I think the last decoding will start > > decoding from this new running_xact record. > > Right. It could pass depending on the timing but doesn't fail > depending on the timing. I think we need to somehow stop bgwriter to > make the test case stable but it seems unrealistic. > Agreed, in my local testing for this case, I used to increase LOG_SNAPSHOT_INTERVAL_MS to avoid such a situation but I understand it is not practical via test. 
> Do you have any > better ideas? > No, I don't have any better ideas. I think it is better to add some information related to this in the comments because it may help to improve the test in the future if we come up with a better idea. -- With Regards, Amit Kapila.