Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2
On Tue, 18 Feb 2020 at 00:40, Muhammad Usama wrote: > > Hi Sawada San, > > I have a couple of comments on > "v27-0002-Support-atomic-commit-among-multiple-foreign-ser.patch" > > 1- As part of the XLogReadRecord refactoring commit the signature of > XLogReadRecord was changed, > so the function call to XLogReadRecord() needs a small adjustment. > > i.e. In function XlogReadFdwXactData(XLogRecPtr lsn, char **buf, int *len) > ... > - record = XLogReadRecord(xlogreader, lsn, &errormsg); > + XLogBeginRead(xlogreader, lsn); > + record = XLogReadRecord(xlogreader, &errormsg); > > 2- In register_fdwxact(..) function you are setting the > XACT_FLAGS_FDWNOPREPARE transaction flag > when the register request comes in for foreign server that does not support > two-phase commit regardless > of the value of 'bool modified' argument. And later in the > PreCommit_FdwXacts() you just error out when > "foreign_twophase_commit" is set to 'required' only by looking at the > XACT_FLAGS_FDWNOPREPARE flag. > which I think is not correct. > Since there is a possibility that the transaction might have only read from > the foreign servers (not capable of > handling transactions or two-phase commit) and all other servers where we > require to do atomic commit > are capable enough of doing so. > If I am not missing something obvious here, then IMHO the > XACT_FLAGS_FDWNOPREPARE flag should only > be set when the transaction management/two-phase functionality is not > available and "modified" argument is > true in register_fdwxact() > Thank you for reviewing this patch! Your comments are incorporated in the latest patch set I recently sent[1]. [1] https://www.postgresql.org/message-id/CA%2Bfd4k5ZcDvoiY_5c-mF1oDACS5nUWS7ppoiOwjCOnM%2BgrJO-Q%40mail.gmail.com Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2
Hi Sawada San, I have a couple of comments on "v27-0002-Support-atomic-commit-among-multiple-foreign-ser.patch" 1- As part of the XLogReadRecord refactoring commit, the signature of XLogReadRecord was changed, so the function call to XLogReadRecord() needs a small adjustment. i.e. In function XlogReadFdwXactData(XLogRecPtr lsn, char **buf, int *len) ... - record = XLogReadRecord(xlogreader, lsn, &errormsg); + XLogBeginRead(xlogreader, lsn); + record = XLogReadRecord(xlogreader, &errormsg); 2- In the register_fdwxact(..) function you are setting the XACT_FLAGS_FDWNOPREPARE transaction flag when the register request comes in for a foreign server that does not support two-phase commit, regardless of the value of the 'bool modified' argument. And later, in PreCommit_FdwXacts(), you just error out when "foreign_twophase_commit" is set to 'required', only by looking at the XACT_FLAGS_FDWNOPREPARE flag, which I think is not correct. There is a possibility that the transaction has only read from the foreign servers (not capable of handling transactions or two-phase commit) while all the other servers where we require an atomic commit are capable of doing so. If I am not missing something obvious here, then IMHO the XACT_FLAGS_FDWNOPREPARE flag should only be set when the transaction management/two-phase functionality is not available and the "modified" argument is true in register_fdwxact(). Thanks Best regards Muhammad Usama Highgo Software (Canada/China/Pakistan) The new status of this patch is: Waiting on Author
Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2
On Wed, Sep 04, 2019 at 12:44:20PM +0900, Masahiko Sawada wrote: > I forgot to include some new header files. Attached the updated patches. There have been no reviews since, and the patch does not apply anymore. I am moving it to the next CF, waiting on author. -- Michael
Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2
Hello Sawada-san, On 2019-Jul-02, Masahiko Sawada wrote: > On Mon, Jul 1, 2019 at 8:32 PM Thomas Munro wrote: > > Can we please have a fresh rebase? > > Thank you for the notice. Attached rebased patches. ... and again? -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2
On Wed, Apr 17, 2019 at 10:23 PM Masahiko Sawada wrote: > Sorry for the very late. Attached updated version patches. Hello Sawada-san, Can we please have a fresh rebase? Thanks, -- Thomas Munro https://enterprisedb.com
Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2
On Thu, Jan 31, 2019 at 11:09:09AM +0100, Masahiko Sawada wrote: > Thanks. Actually I'm updating the patch set, changing API interface as > I proposed before and improving the document and README. I'll submit > the latest patch next week. Cool, I have moved the patch to next CF. -- Michael
Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2
On Tue, Jan 29, 2019 at 5:47 PM Ildar Musin wrote: > > Hello, > > The patch needs rebase as it doesn't apply to the current master. I applied it > to the older commit to test it. It worked fine so far. Thank you for testing the patch! > > I found one bug though which would cause resolver to finish by timeout even > though there are unresolved foreign transactions in the list. The > `fdw_xact_exists()` function expects database id as the first argument and xid > as the second. But everywhere it is called arguments specified in the > different > order (xid first, then dbid). Also function declaration in header doesn't > match its definition. Will fix. > > There are some other things I found. > * In `FdwXactResolveAllDanglingTransactions()` variable `n_resolved` is > declared as bool but used as integer. > * In fdwxact.c's module comment there are > `FdwXactRegisterForeignTransaction()` > and `FdwXactMarkForeignTransactionModified()` functions mentioned that are > not there anymore. > * In documentation (storage.sgml) there is no mention of `pg_fdw_xact` > directory. > > Couple of stylistic notes. > * In `FdwXactCtlData struct` there are both camel case and snake case naming > used. > * In `get_fdw_xacts()` `xid != InvalidTransactionId` can be replaced with > `TransactionIdIsValid(xid)`. > * In `generate_fdw_xact_identifier()` the `fx` prefix could be a part of > format > string instead of being processed by `sprintf` as an extra argument. > I'll incorporate them in the next patch set. > I'll continue looking into the patch. Thanks! Thanks. Actually I'm updating the patch set, changing the API interface as I proposed before and improving the document and README. I'll submit the latest patch next week. -- Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2
Hello, The patch needs a rebase as it doesn't apply to the current master. I applied it to an older commit to test it. It worked fine so far. I found one bug though which would cause the resolver to finish by timeout even though there are unresolved foreign transactions in the list. The `fdw_xact_exists()` function expects the database id as the first argument and the xid as the second. But everywhere it is called, the arguments are specified in the opposite order (xid first, then dbid). Also, the function declaration in the header doesn't match its definition. There are some other things I found. * In `FdwXactResolveAllDanglingTransactions()` the variable `n_resolved` is declared as bool but used as an integer. * In fdwxact.c's module comment the functions `FdwXactRegisterForeignTransaction()` and `FdwXactMarkForeignTransactionModified()` are mentioned but are not there anymore. * In the documentation (storage.sgml) there is no mention of the `pg_fdw_xact` directory. A couple of stylistic notes. * In the `FdwXactCtlData` struct both camel case and snake case naming are used. * In `get_fdw_xacts()` `xid != InvalidTransactionId` can be replaced with `TransactionIdIsValid(xid)`. * In `generate_fdw_xact_identifier()` the `fx` prefix could be part of the format string instead of being processed by `sprintf` as an extra argument. I'll continue looking into the patch. Thanks! On Tue, Nov 20, 2018 at 12:18 PM Masahiko Sawada wrote: > On Thu, Nov 15, 2018 at 7:36 PM Masahiko Sawada > wrote: > > > > On Mon, Oct 29, 2018 at 6:03 PM Masahiko Sawada > wrote: > > > > > > On Mon, Oct 29, 2018 at 10:16 AM Masahiko Sawada < > sawada.m...@gmail.com> wrote: > > > > > > > > On Wed, Oct 24, 2018 at 9:06 AM Masahiko Sawada < > sawada.m...@gmail.com> wrote: > > > > > > > > > > On Tue, Oct 23, 2018 at 12:54 PM Kyotaro HORIGUCHI > > > > > wrote: > > > > > > > > > > > > Hello. > > > > > > > > > > > > # It took a long time to come here..
> > > > > > > > > > > > At Fri, 19 Oct 2018 21:38:35 +0900, Masahiko Sawada < > sawada.m...@gmail.com> wrote in > > > > > > > > On Wed, Oct 10, 2018 at 1:34 PM Masahiko Sawada < > sawada.m...@gmail.com> wrote: > > > > > > ... > > > > > > > * Updated docs, added the new section "Distributed > Transaction" at > > > > > > > Chapter 33 to explain the concept to users > > > > > > > > > > > > > > * Moved atomic commit codes into src/backend/access/fdwxact > directory. > > > > > > > > > > > > > > * Some bug fixes. > > > > > > > > > > > > > > Please reivew them. > > > > > > > > > > > > I have some comments, with apologize in advance for possible > > > > > > duplicate or conflict with others' comments so far. > > > > > > > > > > Thank youf so much for reviewing this patch! > > > > > > > > > > > > > > > > > 0001: > > > > > > > > > > > > This sets XACT_FLAG_WROTENONTEMPREL when RELPERSISTENT_PERMANENT > > > > > > relation is modified. Isn't it needed when UNLOGGED tables are > > > > > > modified? It may be better that we have dedicated classification > > > > > > macro or function. > > > > > > > > > > I think even if we do atomic commit for modifying the an UNLOGGED > > > > > table and a remote table the data will get inconsistent if the > local > > > > > server crashes. For example, if the local server crashes after > > > > > prepared the transaction on foreign server but before the local > commit > > > > > and, we will lose the all data of the local UNLOGGED table whereas > the > > > > > modification of remote table is rollbacked. In case of persistent > > > > > tables, the data consistency is left. So I think the keeping data > > > > > consistency between remote data and local unlogged table is > difficult > > > > > and want to leave it as a restriction for now. Am I missing > something? > > > > > > > > > > > > > > > > > The flag is handled in heapam.c. I suppose that it should be done > > > > > > in the upper layer considering coming pluggable storage. 
> > > > > > (X_F_ACCESSEDTEMPREL is set in heapam, but..) > > > > > > > > > > > > > > > > Yeah, or we can set the flag after heap_insert in ExecInsert. > > > > > > > > > > > > > > > > > 0002: > > > > > > > > > > > > The name FdwXactParticipantsForAC doesn't sound good for me. How > > > > > > about FdwXactAtomicCommitPartitcipants? > > > > > > > > > > +1, will fix it. > > > > > > > > > > > > > > > > > Well, as the file comment of fdwxact.c, > > > > > > FdwXactRegisterTransaction is called from FDW driver and > > > > > > F_X_MarkForeignTransactionModified is called from executor. I > > > > > > think that we should clarify who is responsible to the whole > > > > > > sequence. Since the state of local tables affects, I suppose > > > > > > executor is that. Couldn't we do the whole thing within executor > > > > > > side? I'm not sure but I feel that > > > > > > F_X_RegisterForeignTransaction can be a part of > > > > > > F_X_MarkForeignTransactionModified. The callers of > > > > > > MarkForeignTransactionModified can find whether the table is > > > > > > involved in 2pc by
Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2
On Thu, Nov 15, 2018 at 7:36 PM Masahiko Sawada wrote: > > On Mon, Oct 29, 2018 at 6:03 PM Masahiko Sawada wrote: > > > > On Mon, Oct 29, 2018 at 10:16 AM Masahiko Sawada > > wrote: > > > > > > On Wed, Oct 24, 2018 at 9:06 AM Masahiko Sawada > > > wrote: > > > > > > > > On Tue, Oct 23, 2018 at 12:54 PM Kyotaro HORIGUCHI > > > > wrote: > > > > > > > > > > Hello. > > > > > > > > > > # It took a long time to come here.. > > > > > > > > > > At Fri, 19 Oct 2018 21:38:35 +0900, Masahiko Sawada > > > > > wrote in > > > > > > > > > > > On Wed, Oct 10, 2018 at 1:34 PM Masahiko Sawada > > > > > > wrote: > > > > > ... > > > > > > * Updated docs, added the new section "Distributed Transaction" at > > > > > > Chapter 33 to explain the concept to users > > > > > > > > > > > > * Moved atomic commit codes into src/backend/access/fdwxact > > > > > > directory. > > > > > > > > > > > > * Some bug fixes. > > > > > > > > > > > > Please reivew them. > > > > > > > > > > I have some comments, with apologize in advance for possible > > > > > duplicate or conflict with others' comments so far. > > > > > > > > Thank youf so much for reviewing this patch! > > > > > > > > > > > > > > 0001: > > > > > > > > > > This sets XACT_FLAG_WROTENONTEMPREL when RELPERSISTENT_PERMANENT > > > > > relation is modified. Isn't it needed when UNLOGGED tables are > > > > > modified? It may be better that we have dedicated classification > > > > > macro or function. > > > > > > > > I think even if we do atomic commit for modifying the an UNLOGGED > > > > table and a remote table the data will get inconsistent if the local > > > > server crashes. For example, if the local server crashes after > > > > prepared the transaction on foreign server but before the local commit > > > > and, we will lose the all data of the local UNLOGGED table whereas the > > > > modification of remote table is rollbacked. In case of persistent > > > > tables, the data consistency is left. 
So I think the keeping data > > > > consistency between remote data and local unlogged table is difficult > > > > and want to leave it as a restriction for now. Am I missing something? > > > > > > > > > > > > > > The flag is handled in heapam.c. I suppose that it should be done > > > > > in the upper layer considering coming pluggable storage. > > > > > (X_F_ACCESSEDTEMPREL is set in heapam, but..) > > > > > > > > > > > > > Yeah, or we can set the flag after heap_insert in ExecInsert. > > > > > > > > > > > > > > 0002: > > > > > > > > > > The name FdwXactParticipantsForAC doesn't sound good for me. How > > > > > about FdwXactAtomicCommitPartitcipants? > > > > > > > > +1, will fix it. > > > > > > > > > > > > > > Well, as the file comment of fdwxact.c, > > > > > FdwXactRegisterTransaction is called from FDW driver and > > > > > F_X_MarkForeignTransactionModified is called from executor. I > > > > > think that we should clarify who is responsible to the whole > > > > > sequence. Since the state of local tables affects, I suppose > > > > > executor is that. Couldn't we do the whole thing within executor > > > > > side? I'm not sure but I feel that > > > > > F_X_RegisterForeignTransaction can be a part of > > > > > F_X_MarkForeignTransactionModified. The callers of > > > > > MarkForeignTransactionModified can find whether the table is > > > > > involved in 2pc by IsTwoPhaseCommitEnabled interface. > > > > > > > > Indeed. We can register foreign servers by executor while FDWs don't > > > > need to register anything. I will remove the registration function so > > > > that FDW developers don't need to call the register function but only > > > > need to provide atomic commit APIs. 
> > > > > > > > > > > > > > > > > > > > if (foreign_twophase_commit == true && > > > > > > ((MyXactFlags & XACT_FLAGS_FDWNOPREPARE) != 0) ) > > > > > > ereport(ERROR, > > > > > > > > > > > > (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), > > > > > >errmsg("cannot COMMIT a distributed > > > > > > transaction that has operated on foreign server that doesn't > > > > > > support atomic commit"))); > > > > > > > > > > The error is emitted when a the GUC is turned off in the > > > > > trasaction where MarkTransactionModify'ed. I think that the > > > > > number of the variables' possible states should be reduced for > > > > > simplicity. For example in the case, once foreign_twopase_commit > > > > > is checked in a transaction, subsequent changes in the > > > > > transaction should be ignored during the transaction. > > > > > > > > > > > > > I might have not gotten your comment correctly but since the > > > > foreign_twophase_commit is a PGC_USERSET parameter I think we need to > > > > check it at commit time. Also we need to keep participant servers even > > > > when foreign_twophase_commit is off if both max_prepared_foreign_xacts > > > > and max_foreign_xact_resolvers are > 0. > > > > > > > > I will post the updated patch
Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2
On Tue, Oct 23, 2018 at 12:54 PM Kyotaro HORIGUCHI wrote: > > Hello. > > # It took a long time to come here.. > > At Fri, 19 Oct 2018 21:38:35 +0900, Masahiko Sawada > wrote in > > On Wed, Oct 10, 2018 at 1:34 PM Masahiko Sawada > > wrote: > ... > > * Updated docs, added the new section "Distributed Transaction" at > > Chapter 33 to explain the concept to users > > > > * Moved atomic commit codes into src/backend/access/fdwxact directory. > > > > * Some bug fixes. > > > > Please review them. > > I have some comments, with apologies in advance for possible > duplicates or conflicts with others' comments so far. Thank you so much for reviewing this patch! > > 0001: > > This sets XACT_FLAG_WROTENONTEMPREL when a RELPERSISTENT_PERMANENT > relation is modified. Isn't it needed when UNLOGGED tables are > modified? It may be better that we have a dedicated classification > macro or function. I think even if we do an atomic commit when modifying an UNLOGGED table and a remote table, the data will get inconsistent if the local server crashes. For example, if the local server crashes after preparing the transaction on the foreign server but before the local commit, we will lose all the data of the local UNLOGGED table whereas the modification of the remote table is rolled back. In the case of persistent tables, data consistency is preserved. So I think keeping data consistency between remote data and a local unlogged table is difficult, and I want to leave it as a restriction for now. Am I missing something? > > The flag is handled in heapam.c. I suppose that it should be done > in the upper layer considering the coming pluggable storage. > (X_F_ACCESSEDTEMPREL is set in heapam, but..) > Yeah, or we can set the flag after heap_insert in ExecInsert. > > 0002: > > The name FdwXactParticipantsForAC doesn't sound good to me. How > about FdwXactAtomicCommitParticipants? +1, will fix it.
> > Well, as the file comment of fdwxact.c says, > FdwXactRegisterTransaction is called from the FDW driver and > F_X_MarkForeignTransactionModified is called from the executor. I > think that we should clarify who is responsible for the whole > sequence. Since the state of local tables affects it, I suppose > the executor is. Couldn't we do the whole thing within the executor > side? I'm not sure, but I feel that > F_X_RegisterForeignTransaction can be a part of > F_X_MarkForeignTransactionModified. The callers of > MarkForeignTransactionModified can find whether the table is > involved in 2pc by the IsTwoPhaseCommitEnabled interface. Indeed. We can register foreign servers in the executor while FDWs don't need to register anything. I will remove the registration function so that FDW developers don't need to call the register function but only need to provide atomic commit APIs. > > > > if (foreign_twophase_commit == true && > > ((MyXactFlags & XACT_FLAGS_FDWNOPREPARE) != 0) ) > > ereport(ERROR, > > (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), > >errmsg("cannot COMMIT a distributed > > transaction that has operated on foreign server that doesn't support atomic > > commit"))); > > The error is emitted when the GUC is turned off in the > transaction where MarkTransactionModified was called. I think that the > number of the variables' possible states should be reduced for > simplicity. For example, in this case, once foreign_twophase_commit > is checked in a transaction, subsequent changes in the > transaction should be ignored during the transaction. > I might not have gotten your comment correctly, but since foreign_twophase_commit is a PGC_USERSET parameter, I think we need to check it at commit time. Also, we need to keep participant servers even when foreign_twophase_commit is off if both max_prepared_foreign_xacts and max_foreign_xact_resolvers are > 0. I will post the updated patch this week. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2
Hello. # It took a long time to come here.. At Fri, 19 Oct 2018 21:38:35 +0900, Masahiko Sawada wrote in > On Wed, Oct 10, 2018 at 1:34 PM Masahiko Sawada wrote: ... > * Updated docs, added the new section "Distributed Transaction" at > Chapter 33 to explain the concept to users > > * Moved atomic commit codes into src/backend/access/fdwxact directory. > > * Some bug fixes. > > Please review them. I have some comments, with apologies in advance for possible duplicates or conflicts with others' comments so far. 0001: This sets XACT_FLAG_WROTENONTEMPREL when a RELPERSISTENT_PERMANENT relation is modified. Isn't it needed when UNLOGGED tables are modified? It may be better that we have a dedicated classification macro or function. The flag is handled in heapam.c. I suppose that it should be done in the upper layer considering the coming pluggable storage. (X_F_ACCESSEDTEMPREL is set in heapam, but..) 0002: The name FdwXactParticipantsForAC doesn't sound good to me. How about FdwXactAtomicCommitParticipants? Well, as the file comment of fdwxact.c says, FdwXactRegisterTransaction is called from the FDW driver and F_X_MarkForeignTransactionModified is called from the executor. I think that we should clarify who is responsible for the whole sequence. Since the state of local tables affects it, I suppose the executor is. Couldn't we do the whole thing within the executor side? I'm not sure, but I feel that F_X_RegisterForeignTransaction can be a part of F_X_MarkForeignTransactionModified. The callers of MarkForeignTransactionModified can find whether the table is involved in 2pc by the IsTwoPhaseCommitEnabled interface. > if (foreign_twophase_commit == true && > ((MyXactFlags & XACT_FLAGS_FDWNOPREPARE) != 0) ) > ereport(ERROR, > (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), >errmsg("cannot COMMIT a distributed > transaction that has operated on foreign server that doesn't support atomic > commit"))); The error is emitted when the GUC is turned off in the transaction where MarkTransactionModified was called.
I think that the number of the variables' possible states should be reduced for simplicity. For example, in this case, once foreign_twophase_commit is checked in a transaction, subsequent changes to it should be ignored for the rest of the transaction. Regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2
On Wed, Oct 3, 2018 at 6:02 PM Chris Travers wrote: > > > > On Wed, Oct 3, 2018 at 9:41 AM Chris Travers wrote: >> >> The following review has been posted through the commitfest application: >> make installcheck-world: tested, failed >> Implements feature: not tested >> Spec compliant: not tested >> Documentation:tested, failed >> >> I am hoping I am not out of order in writing this before the commitfest >> starts. The patch is big and long and so wanted to start on this while >> traffic is slow. >> >> I find this patch quite welcome and very close to a minimum viable version. >> The few significant limitations can be resolved later. One thing I may have >> missed in the documentation is a discussion of the limits of the current >> approach. I think this would be important to document because the caveats >> of the current approach are significant, but the people who need it will >> have the knowledge to work with issues if they come up. >> >> The major caveat I see in our past discussions and (if I read the patch >> correctly) is that the resolver goes through global transactions >> sequentially and does not move on to the next until the previous one is >> resolved. This means that if I have a global transaction on server A, with >> foreign servers B and C, and I have another one on server A with foreign >> servers C and D, if server B goes down at the wrong moment, the background >> worker does not look like it will detect the failure and move on to try to >> resolve the second, so server D will have a badly set vacuum horizon until >> this is resolved. Also if I read the patch correctly, it looks like one can >> invoke SQL commands to remove the bad transaction to allow processing to >> continue and manual resolution (this is good and necessary because in this >> area there is no ability to have perfect recoverability without occasional >> administrative action). 
I would really like to see more documentation of >> failure cases and appropriate administrative action at present. Otherwise >> this is I think a minimum viable addition and I think we want it. >> >> It is possible i missed that in the documentation. If so, my objection >> stands aside. If it is welcome I am happy to take a first crack at such >> docs. > Thank you for reviewing the patch! > > After further testing I am pretty sure I misread the patch. It looks like > one can have multiple resolvers which can, in fact, work through a queue > together solving this problem. So the objection above is not valid and I > withdraw that objection. I will re-review the docs in light of the > experience. Actually the patch doesn't solve this problem; the foreign transaction resolver processes distributed transactions sequentially. But since one resolver process is responsible for one database, the backend connecting to another database can complete the distributed transaction. I understood your concern and agree to solve this problem. I'll address it in the next patch. > >> >> To my mind thats the only blocker in the code (but see below). I can say >> without a doubt that I would expect we would use this feature once available. >> >> -- >> >> Testing however failed. >> >> make installcheck-world fails with errors like the following: >> >> -- Modify foreign server and raise an error >> BEGIN; >> INSERT INTO ft7_twophase VALUES(8); >> + ERROR: prepread foreign transactions are disabled >> + HINT: Set max_prepared_foreign_transactions to a nonzero value. >> INSERT INTO ft8_twophase VALUES(NULL); -- violation >> ! ERROR: current transaction is aborted, commands ignored until end of >> transaction block >> ROLLBACK; >> SELECT * FROM ft7_twophase; >> ! ERROR: prepread foreign transactions are disabled >> ! HINT: Set max_prepared_foreign_transactions to a nonzero value. >> SELECT * FROM ft8_twophase; >> ! ERROR: prepread foreign transactions are disabled >> !
HINT: Set max_prepared_foreign_transactions to a nonzero value. >> -- Rollback foreign transaction that involves both 2PC-capable >> -- and 2PC-non-capable foreign servers. >> BEGIN; >> INSERT INTO ft8_twophase VALUES(7); >> + ERROR: prepread foreign transactions are disabled >> + HINT: Set max_prepared_foreign_transactions to a nonzero value. >> INSERT INTO ft9_not_twophase VALUES(7); >> + ERROR: current transaction is aborted, commands ignored until end of >> transaction block >> ROLLBACK; >> SELECT * FROM ft8_twophase; >> ! ERROR: prepread foreign transactions are disabled >> ! HINT: Set max_prepared_foreign_transactions to a nonzero value. >> >> make installcheck in the contrib directory shows the same, so that's the >> easiest way of reproducing, at least on a new installation. I think the >> test cases will have to handle that sort of setup. 'make installcheck' is a regression test mode that runs the tests against the existing installation. If the installation
Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2
On Wed, Oct 3, 2018 at 9:41 AM Chris Travers wrote: > The following review has been posted through the commitfest application: > make installcheck-world: tested, failed > Implements feature: not tested > Spec compliant: not tested > Documentation:tested, failed > > I am hoping I am not out of order in writing this before the commitfest > starts. The patch is big and long and so wanted to start on this while > traffic is slow. > > I find this patch quite welcome and very close to a minimum viable > version. The few significant limitations can be resolved later. One thing > I may have missed in the documentation is a discussion of the limits of the > current approach. I think this would be important to document because the > caveats of the current approach are significant, but the people who need it > will have the knowledge to work with issues if they come up. > > The major caveat I see in our past discussions and (if I read the patch > correctly) is that the resolver goes through global transactions > sequentially and does not move on to the next until the previous one is > resolved. This means that if I have a global transaction on server A, with > foreign servers B and C, and I have another one on server A with foreign > servers C and D, if server B goes down at the wrong moment, the background > worker does not look like it will detect the failure and move on to try to > resolve the second, so server D will have a badly set vacuum horizon until > this is resolved. Also if I read the patch correctly, it looks like one > can invoke SQL commands to remove the bad transaction to allow processing > to continue and manual resolution (this is good and necessary because in > this area there is no ability to have perfect recoverability without > occasional administrative action). I would really like to see more > documentation of failure cases and appropriate administrative action at > present. Otherwise this is I think a minimum viable addition and I think > we want it. 
> > It is possible i missed that in the documentation. If so, my objection > stands aside. If it is welcome I am happy to take a first crack at such > docs. > After further testing I am pretty sure I misread the patch. It looks like one can have multiple resolvers which can, in fact, work through a queue together solving this problem. So the objection above is not valid and I withdraw that objection. I will re-review the docs in light of the experience. > > To my mind thats the only blocker in the code (but see below). I can say > without a doubt that I would expect we would use this feature once > available. > > -- > > Testing however failed. > > make installcheck-world fails with errors like the following: > > -- Modify foreign server and raise an error > BEGIN; > INSERT INTO ft7_twophase VALUES(8); > + ERROR: prepread foreign transactions are disabled > + HINT: Set max_prepared_foreign_transactions to a nonzero value. > INSERT INTO ft8_twophase VALUES(NULL); -- violation > ! ERROR: current transaction is aborted, commands ignored until end of > transaction block > ROLLBACK; > SELECT * FROM ft7_twophase; > ! ERROR: prepread foreign transactions are disabled > ! HINT: Set max_prepared_foreign_transactions to a nonzero value. > SELECT * FROM ft8_twophase; > ! ERROR: prepread foreign transactions are disabled > ! HINT: Set max_prepared_foreign_transactions to a nonzero value. > -- Rollback foreign transaction that involves both 2PC-capable > -- and 2PC-non-capable foreign servers. > BEGIN; > INSERT INTO ft8_twophase VALUES(7); > + ERROR: prepread foreign transactions are disabled > + HINT: Set max_prepared_foreign_transactions to a nonzero value. > INSERT INTO ft9_not_twophase VALUES(7); > + ERROR: current transaction is aborted, commands ignored until end of > transaction block > ROLLBACK; > SELECT * FROM ft8_twophase; > ! ERROR: prepread foreign transactions are disabled > ! HINT: Set max_prepared_foreign_transactions to a nonzero value. 
> > make installcheck in the contrib directory shows the same, so that's the > easiest way of reproducing, at least on a new installation. I think the > test cases will have to handle that sort of setup. > > make check in the contrib directory passes. > > For reasons of test failures, I am setting this back to waiting on author. > > -- > I had a few other thoughts that I figure are worth sharing with the > community on this patch with the idea that once it is in place, this may > open up more options for collaboration in the area of federated and > distributed storage generally. I could imagine other foreign data wrappers > using this API, and folks might want to refactor out the atomic handling > part so that extensions that do not use the foreign data wrapper structure > could use it as well (while this looks like a classic SQL/MED issue, I am > not sure that only foreign data wrappers would be interested in the API).
Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2
On Wed, Oct 3, 2018 at 9:56 AM Chris Travers wrote: > > > On Wed, Oct 3, 2018 at 9:41 AM Chris Travers > wrote: > >> >> (errmsg("preparing foreign transactions > (max_prepared_foreign_transactions > 0) requires maX_foreign_xact_resolvers > > 0"))); > Two more critical notes here which I think are small blockers. The error message above references a config variable that does not exist. The correct name of the config parameter is max_foreign_transaction_resolvers Setting that along with the following to 10 caused the tests to pass, but again it fails on default configs: max_prepared_foreign_transactions, max_prepared_transactions > > > > -- > Best Regards, > Chris Travers > Head of Database > > Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com > Saarbrücker Straße 37a, 10405 Berlin > > -- Best Regards, Chris Travers Head of Database Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2
On Wed, Oct 3, 2018 at 9:41 AM Chris Travers wrote: > The following review has been posted through the commitfest application: > make installcheck-world: tested, failed > Implements feature: not tested > Spec compliant: not tested > Documentation:tested, failed > Also one really minor point: I think this is a typo (maX vs max)? (errmsg("preparing foreign transactions (max_prepared_foreign_transactions > 0) requires maX_foreign_xact_resolvers > 0"))); -- Best Regards, Chris Travers Head of Database Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2
The following review has been posted through the commitfest application: make installcheck-world: tested, failed Implements feature: not tested Spec compliant: not tested Documentation: tested, failed I am hoping I am not out of order in writing this before the commitfest starts. The patch is big and long, so I wanted to start on this while traffic is slow. I find this patch quite welcome and very close to a minimum viable version. The few significant limitations can be resolved later. One thing I may have missed in the documentation is a discussion of the limits of the current approach. I think this would be important to document because the caveats of the current approach are significant, but the people who need it will have the knowledge to work with issues if they come up. The major caveat I see in our past discussions and (if I read the patch correctly) is that the resolver goes through global transactions sequentially and does not move on to the next until the previous one is resolved. This means that if I have a global transaction on server A, with foreign servers B and C, and I have another one on server A with foreign servers C and D, if server B goes down at the wrong moment, it does not look like the background worker will detect the failure and move on to try to resolve the second, so server D will have a badly set vacuum horizon until this is resolved. Also, if I read the patch correctly, it looks like one can invoke SQL commands to remove the bad transaction to allow processing to continue and permit manual resolution (this is good and necessary because in this area there is no ability to have perfect recoverability without occasional administrative action). I would really like to see more documentation of failure cases and appropriate administrative action at present. Otherwise this is, I think, a minimum viable addition, and I think we want it. It is possible I missed that in the documentation. If so, my objection stands aside.
If it is welcome I am happy to take a first crack at such docs. To my mind that's the only blocker in the code (but see below). I can say without a doubt that I would expect we would use this feature once available. -- Testing, however, failed. make installcheck-world fails with errors like the following: -- Modify foreign server and raise an error BEGIN; INSERT INTO ft7_twophase VALUES(8); + ERROR: prepread foreign transactions are disabled + HINT: Set max_prepared_foreign_transactions to a nonzero value. INSERT INTO ft8_twophase VALUES(NULL); -- violation ! ERROR: current transaction is aborted, commands ignored until end of transaction block ROLLBACK; SELECT * FROM ft7_twophase; ! ERROR: prepread foreign transactions are disabled ! HINT: Set max_prepared_foreign_transactions to a nonzero value. SELECT * FROM ft8_twophase; ! ERROR: prepread foreign transactions are disabled ! HINT: Set max_prepared_foreign_transactions to a nonzero value. -- Rollback foreign transaction that involves both 2PC-capable -- and 2PC-non-capable foreign servers. BEGIN; INSERT INTO ft8_twophase VALUES(7); + ERROR: prepread foreign transactions are disabled + HINT: Set max_prepared_foreign_transactions to a nonzero value. INSERT INTO ft9_not_twophase VALUES(7); + ERROR: current transaction is aborted, commands ignored until end of transaction block ROLLBACK; SELECT * FROM ft8_twophase; ! ERROR: prepread foreign transactions are disabled ! HINT: Set max_prepared_foreign_transactions to a nonzero value. make installcheck in the contrib directory shows the same, so that's the easiest way of reproducing, at least on a new installation. I think the test cases will have to handle that sort of setup. make check in the contrib directory passes. For reasons of test failures, I am setting this back to waiting on author.
-- I had a few other thoughts that I figure are worth sharing with the community on this patch with the idea that once it is in place, this may open up more options for collaboration in the area of federated and distributed storage generally. I could imagine other foreign data wrappers using this API, and folks might want to refactor out the atomic handling part so that extensions that do not use the foreign data wrapper structure could use it as well (while this looks like a classic SQL/MED issue, I am not sure that only foreign data wrappers would be interested in the API). The new status of this patch is: Waiting on Author
Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2
On Fri, Aug 03, 2018 at 05:52:24PM +0900, Masahiko Sawada wrote: > I attached the updated version patch as the previous versions conflict > with the current HEAD. Please note that the latest patch set does not apply anymore, so this patch is moved to next CF, waiting on author. -- Michael
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Sat, May 26, 2018 at 12:25 AM, Robert Haas wrote: > On Fri, May 18, 2018 at 11:21 AM, Masahiko Sawada > wrote: >> Regarding to API design, should we use 2PC for a distributed >> transaction if both two or more 2PC-capable foreign servers and >> 2PC-non-capable foreign server are involved with it? Or should we end >> up with an error? the 2PC-non-capable server might be either that has >> 2PC functionality but just disables it or that doesn't have it. > > It seems to me that this is functionality that many people will not > want to use. First, doing a PREPARE and then a COMMIT for each FDW > write transaction is bound to be more expensive than just doing a > COMMIT. Second, because the default value of > max_prepared_transactions is 0, this can only work at all if special > configuration has been done on the remote side. Because of the second > point in particular, it seems to me that the default for this new > feature must be "off". It would make to ship a default configuration > of PostgreSQL that doesn't work with the default configuration of > postgres_fdw, and I do not think we want to change the default value > of max_prepared_transactions. It was changed from 5 to 0 a number of > years back for good reason. I'm not sure that many people will not want to use this feature; it seems to me that many people don't want to use a database that lacks transaction atomicity. But I agree that this feature should not be enabled by default, just as we disable 2PC by default. > > So, I think the question could be broadened a bit: how you enable this > feature if you want it, and what happens if you want it but it's not > available for your choice of FDW? One possible enabling method is a > GUC (e.g. foreign_twophase_commit).
It could be true/false, with true > meaning use PREPARE for all FDW writes and fail if that's not > supported, or it could be three-valued, like require/prefer/disable, > with require throwing an error if PREPARE support is not available and > prefer using PREPARE where available but without failing when it isn't > available. Another possibility could be to make it an FDW option, > possibly capable of being set at multiple levels (e.g. server or > foreign table). If any FDW involved in the transaction demands > distributed 2PC semantics then the whole transaction must have those > semantics or it fails. I was previous leaning toward the latter > approach, but I guess now the former approach is sounding better. I'm > not totally certain I know what's best here. > I agree that the former is better. That way, we can also control that parameter at the transaction level. If we allow the 'prefer' behavior, we need to manage not only 2PC-capable foreign servers but also 2PC-non-capable foreign servers. It requires all FDWs to call the registration function. So I think a two-valued parameter would be better. BTW, sorry for the delay in submitting the updated patch. I'll post the updated patch this week, but I'd like to share the new API design beforehand. The APIs that I'd like to add are 4 functions and 1 registration function: PrepareForeignTransaction, CommitForeignTransaction, RollbackForeignTransaction, IsTwoPhaseCommitEnabled and FdwXactRegisterForeignServer. All FDWs that want to support atomic commit have to support all the APIs and to call the registration function when a foreign transaction opens. The transaction processing sequence with atomic commit will be as follows. 1. FDW begins a transaction on a 2PC-capable foreign server. 2. FDW registers the foreign server with/without a foreign transaction identifier by calling FdwXactRegisterForeignServer(). * The foreign transaction identifier passed in can be NULL. If it's NULL, the core code constructs it like 'fx_<4 random chars>__'.
* Providing the foreign transaction identifier at the beginning of the transaction is useful because some DBMSs such as Oracle Database or MySQL require a transaction identifier at the beginning of their XA transactions. * Registering the foreign transaction guarantees that the transaction is controlled by the core and the APIs are called at an appropriate time. 3. Perform 1 and 2 whenever the distributed transaction opens a transaction on 2PC-capable foreign servers. * When the distributed transaction modifies a foreign server, we mark it as 'modified'. * This mark is used at commit to check if it's necessary to use 2PC. * At the same time, we also check if the foreign server enables 2PC by calling IsTwoPhaseCommitEnabled(). * If an FDW disables or doesn't provide that function, we set XACT_FLAGS_FDWNOPREPARE. This is necessary because we need to remember that we wrote to a 2PC-non-capable foreign server. * When the distributed transaction modifies a temp table locally, we set XACT_FLAGS_WROTENONTEMPREL. * This is used at commit to check if it's necessary to use 2PC as well. 4. During pre-commit, we prepare all foreign transactions if 2PC is required by calling
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Fri, May 18, 2018 at 11:21 AM, Masahiko Sawada wrote: > Regarding to API design, should we use 2PC for a distributed > transaction if both two or more 2PC-capable foreign servers and > 2PC-non-capable foreign server are involved with it? Or should we end > up with an error? the 2PC-non-capable server might be either that has > 2PC functionality but just disables it or that doesn't have it. It seems to me that this is functionality that many people will not want to use. First, doing a PREPARE and then a COMMIT for each FDW write transaction is bound to be more expensive than just doing a COMMIT. Second, because the default value of max_prepared_transactions is 0, this can only work at all if special configuration has been done on the remote side. Because of the second point in particular, it seems to me that the default for this new feature must be "off". It would make no sense to ship a default configuration of PostgreSQL that doesn't work with the default configuration of postgres_fdw, and I do not think we want to change the default value of max_prepared_transactions. It was changed from 5 to 0 a number of years back for good reason. So, I think the question could be broadened a bit: how do you enable this feature if you want it, and what happens if you want it but it's not available for your choice of FDW? One possible enabling method is a GUC (e.g. foreign_twophase_commit). It could be true/false, with true meaning use PREPARE for all FDW writes and fail if that's not supported, or it could be three-valued, like require/prefer/disable, with require throwing an error if PREPARE support is not available and prefer using PREPARE where available but without failing when it isn't available. Another possibility could be to make it an FDW option, possibly capable of being set at multiple levels (e.g. server or foreign table). If any FDW involved in the transaction demands distributed 2PC semantics then the whole transaction must have those semantics or it fails.
I was previously leaning toward the latter approach, but I guess now the former approach is sounding better. I'm not totally certain I know what's best here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
RE: [HACKERS] Transactions involving multiple postgres foreign servers
From: Masahiko Sawada [mailto:sawada.m...@gmail.com] > > I'm for the latter. That is, COMMIT or PREPARE TRANSACTION statement > issued from an application reports an error. > > I'm not sure that we should end up with an error in such case, but if > we want then we can raise an error when the transaction tries to > modify 2PC-non-capable server after modified 2PC-capable server. Such early reporting would be better, but I wonder if we can handle the opposite order: update data on a 2PC-capable server after a 2PC-non-capable server. If it's not easy or efficient, I think it's enough to report the error at COMMIT and PREPARE TRANSACTION, just like we report "ERROR: cannot PREPARE a transaction that has operated on temporary tables" at PREPARE TRANSACTION. > Honestly I'm not sure we should use atomic commit by default at this > point. Because it also means to change default behavior though the > existing users use them without 2PC. But I think control of global > transaction atomicity by GUC parameter would be a good idea. For > example, synchronous_commit = 'global' makes backends wait for > transaction to be resolved globally before returning to the user. Regarding the incompatibility of default behavior, Postgres has a history of pursuing correctness and avoiding confusion, such as the handling of backslash characters in strings and the automatic type casts below. Non-character data types are no longer automatically cast to TEXT (Peter, Tom) Previously, if a non-character value was supplied to an operator or function that requires text input, it was automatically cast to text, for most (though not all) built-in data types. This no longer happens: an explicit cast to text is now required for all non-character-string types. ... The reason for the change is that these automatic casts too often caused surprising behavior. > I might not understand your comment correctly but the current patch is > implemented in such way.
The patch introduces new FDW APIs: > PrepareForeignTransaction, EndForeignTransaction, > ResolvePreparedForeignTransaction and GetPreapreId. The postgres core > calls each APIs at appropriate timings while managing each foreign > transactions. FDWs that don't support 2PC set the function pointers of > them to NULL. Ouch, you are right. > Also, regarding the current API design it might not fit to other > databases than PostgreSQL. For example, in MySQL we have to start xa > transaction explicitly using by XA START whereas PostgreSQL can > prepare the transaction that is started by BEGIN TRANSACTION. So in > MySQL global transaction id is required at beginning of xa > transaction. And we have to execute XA END is required before we > prepare or commit it at one phase. So it would be better to define > APIs according to X/Open XA in order to make it more general. I thought of: * Put the functions that implement xa_prepare(), xa_commit() and xa_rollback() in libxa.so or libpq.so. * PrepareForeignTransaction and EndForeignTransaction for postgres_fdw call them. I meant just code reuse for Postgres. But this is my simple intuition, so don't mind. Regards Takayuki Tsunakawa
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Mon, May 21, 2018 at 10:42 AM, Tsunakawa, Takayuki wrote: > From: Masahiko Sawada [mailto:sawada.m...@gmail.com] >> Regarding to API design, should we use 2PC for a distributed >> transaction if both two or more 2PC-capable foreign servers and >> 2PC-non-capable foreign server are involved with it? Or should we end >> up with an error? the 2PC-non-capable server might be either that has >> 2PC functionality but just disables it or that doesn't have it. > >>but I think we also could take >> the latter way because it doesn't make sense for user even if the >> transaction commit atomically among not all participants. > > I'm for the latter. That is, COMMIT or PREPARE TRANSACTION statement issued > from an application reports an error. I'm not sure that we should end up with an error in such a case, but if we want to, we can raise an error when the transaction tries to modify a 2PC-non-capable server after modifying a 2PC-capable server. > DBMS, particularly relational DBMS (, and even more particularly Postgres?) > places high value on data correctness. So I think transaction atomicity > should be preserved, at least by default. If we preferred updatability and > performance to data correctness, why don't we change the default value of > synchronous_commit to off in favor of performance? On the other hand, if we > want to allow 1PC commit when not all FDWs support 2PC, we can add a new GUC > parameter like "allow_nonatomic_commit = on", just like synchronous_commit > and fsync trade-offs data correctness and performance. Honestly, I'm not sure we should use atomic commit by default at this point, because it also means changing the default behavior for existing users who use FDWs without 2PC. But I think controlling global transaction atomicity by a GUC parameter would be a good idea. For example, synchronous_commit = 'global' makes backends wait for the transaction to be resolved globally before returning to the user.
> > >> Also, regardless whether we take either way I think it would be >> better to manage not only 2PC transaction but also non-2PC transaction >> in the core and add two_phase_commit argument. I think we can use it >> without breaking existing FDWs. Currently FDWs manage transactions >> using XactCallback but new APIs being added also manage transactions. >> I think it might be better if users use either way (using XactCallback >> or using new APIs) for transaction management rather than use both >> ways with combination. Otherwise two codes for transaction management >> will be required: the code that manages foreign transactions using >> XactCallback for non-2PC transactions and code that manages them using >> new APIs for 2PC transactions. That would not be easy for FDW >> developers. So what I imagined for new API is that if FDW developers >> use new APIs they can use both 2PC and non-2PC transaction, but if >> they use XactCallback they can use only non-2PC transaction. >> Any thoughts? > > If we add new functions, can't we just add functions whose names are > straightforward like PrepareTransaction() and CommitTransaction()? FDWs > without 2PC support returns NULL for the function pointer of > PrepareTransaction(). > > This is similar to XA: XA requires each RM to provide function pointers for > xa_prepare() and xa_commit(). If we go this way, maybe we could leverage the > artifact of postgres_fdw to create the XA library for C/C++. I mean we put > transaction control functions in the XA library, and postgres_fdw also uses > it. i.e.: > > postgres_fdw.so -> libxa.so -> libpq.so > \-/ I might not understand your comment correctly, but the current patch is implemented in such a way. The patch introduces new FDW APIs: PrepareForeignTransaction, EndForeignTransaction, ResolvePreparedForeignTransaction and GetPrepareId. The postgres core calls each API at the appropriate timing while managing each foreign transaction.
FDWs that don't support 2PC set those function pointers to NULL. Also, the current API design might not fit databases other than PostgreSQL. For example, in MySQL we have to start an xa transaction explicitly using XA START, whereas PostgreSQL can prepare a transaction that is started by BEGIN TRANSACTION. So in MySQL a global transaction id is required at the beginning of the xa transaction. And we have to execute XA END before we prepare or commit it at one phase. So it would be better to define the APIs according to X/Open XA in order to make them more general. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Fri, May 11, 2018 at 12:27 AM, Robert Haas wrote: > On Tue, Feb 27, 2018 at 2:21 AM, Masahiko Sawada > wrote: >> I might be missing your point. As for API breaking, this patch doesn't >> break any existing FDWs. All new APIs I proposed are dedicated to 2PC. >> In other words, FDWs that work today can continue working after this >> patch gets committed, but if FDWs want to support atomic commit then >> they should be updated to use new APIs. The reason why the calling of >> FdwXactRegisterForeignServer is necessary is that the core code >> controls the foreign servers that involved with the transaction but >> the whether each foreign server uses 2PC option (two_phase_commit) is >> known only on FDW code. We can eliminate the necessity of calling >> FdwXactRegisterForeignServer by moving 2PC option from fdw-level to >> server-level in order not to enforce calling the registering function >> on FDWs. If we did that, the user can use the 2PC option as a >> server-level option. > > Well, FdwXactRegisterForeignServer has a "bool two_phase_commit" > argument. If it only needs to be called by FDWs that support 2PC, > then that argument is unnecessary. An FDW may support 2PC but not a foreign server created using that FDW. Without that argument, all the foreign servers created using a given FDW will need to support 2PC, which may not be possible. > If it needs to be called by all > FDWs, then it breaks existing FDWs that don't call it. > That's true. FDWs that do not want to use this facility can just pass false without any need for further change. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
RE: [HACKERS] Transactions involving multiple postgres foreign servers
From: Masahiko Sawada [mailto:sawada.m...@gmail.com] > Regarding to API design, should we use 2PC for a distributed > transaction if both two or more 2PC-capable foreign servers and > 2PC-non-capable foreign server are involved with it? Or should we end > up with an error? the 2PC-non-capable server might be either that has > 2PC functionality but just disables it or that doesn't have it. >but I think we also could take > the latter way because it doesn't make sense for user even if the > transaction commit atomically among not all participants. I'm for the latter. That is, COMMIT or PREPARE TRANSACTION statement issued from an application reports an error. DBMS, particularly relational DBMS (, and even more particularly Postgres?) places high value on data correctness. So I think transaction atomicity should be preserved, at least by default. If we preferred updatability and performance to data correctness, why don't we change the default value of synchronous_commit to off in favor of performance? On the other hand, if we want to allow 1PC commit when not all FDWs support 2PC, we can add a new GUC parameter like "allow_nonatomic_commit = on", just like synchronous_commit and fsync trade-offs data correctness and performance. > Also, regardless whether we take either way I think it would be > better to manage not only 2PC transaction but also non-2PC transaction > in the core and add two_phase_commit argument. I think we can use it > without breaking existing FDWs. Currently FDWs manage transactions > using XactCallback but new APIs being added also manage transactions. > I think it might be better if users use either way (using XactCallback > or using new APIs) for transaction management rather than use both > ways with combination. Otherwise two codes for transaction management > will be required: the code that manages foreign transactions using > XactCallback for non-2PC transactions and code that manages them using > new APIs for 2PC transactions. 
That would not be easy for FDW > developers. So what I imagined for new API is that if FDW developers > use new APIs they can use both 2PC and non-2PC transaction, but if > they use XactCallback they can use only non-2PC transaction. > Any thoughts? If we add new functions, can't we just add functions whose names are straightforward like PrepareTransaction() and CommitTransaction()? FDWs without 2PC support return NULL for the function pointer of PrepareTransaction(). This is similar to XA: XA requires each RM to provide function pointers for xa_prepare() and xa_commit(). If we go this way, maybe we could leverage the artifact of postgres_fdw to create the XA library for C/C++. I mean we put transaction control functions in the XA library, and postgres_fdw also uses it. i.e.: postgres_fdw.so -> libxa.so -> libpq.so \-/ Regards Takayuki Tsunakawa
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Fri, May 11, 2018 at 9:56 AM, Masahiko Sawada wrote: > Thank you for the comment. > > On Fri, May 11, 2018 at 3:57 AM, Robert Haas wrote: >> On Tue, Feb 27, 2018 at 2:21 AM, Masahiko Sawada >> wrote: >>> I might be missing your point. As for API breaking, this patch doesn't >>> break any existing FDWs. All new APIs I proposed are dedicated to 2PC. >>> In other words, FDWs that work today can continue working after this >>> patch gets committed, but if FDWs want to support atomic commit then >>> they should be updated to use new APIs. The reason why the calling of >>> FdwXactRegisterForeignServer is necessary is that the core code >>> controls the foreign servers that involved with the transaction but >>> the whether each foreign server uses 2PC option (two_phase_commit) is >>> known only on FDW code. We can eliminate the necessity of calling >>> FdwXactRegisterForeignServer by moving 2PC option from fdw-level to >>> server-level in order not to enforce calling the registering function >>> on FDWs. If we did that, the user can use the 2PC option as a >>> server-level option. >> >> Well, FdwXactRegisterForeignServer has a "bool two_phase_commit" >> argument. If it only needs to be called by FDWs that support 2PC, >> then that argument is unnecessary. If it needs to be called by all >> FDWs, then it breaks existing FDWs that don't call it. >> > > I understood now. For now since FdwXactRegisterForeignServer needs to > be called by only FDWs that support 2PC, I will remove the argument. Regarding the API design, should we use 2PC for a distributed transaction if both two or more 2PC-capable foreign servers and a 2PC-non-capable foreign server are involved in it? Or should we end up with an error? The 2PC-non-capable server might be either one that has 2PC functionality but just disables it or one that doesn't have it. If we use it, transaction atomicity will be satisfied only among the 2PC-capable servers, which might be just part of all the participants.
Or, if we don't, we use 1PC instead in such a case; but when using 2PC, transactions always end up satisfying transaction atomicity among all participants. The current patch takes the former (it doesn't allow the PREPARE case, though), but I think we could also take the latter way because it doesn't make sense for the user if the transaction commits atomically among only some of the participants. Also, regardless of which way we take, I think it would be better to manage not only 2PC transactions but also non-2PC transactions in the core and add a two_phase_commit argument. I think we can use it without breaking existing FDWs. Currently FDWs manage transactions using XactCallback, but the new APIs being added also manage transactions. I think it might be better if users use either way (using XactCallback or using the new APIs) for transaction management rather than using both ways in combination. Otherwise two code paths for transaction management will be required: the code that manages foreign transactions using XactCallback for non-2PC transactions and the code that manages them using the new APIs for 2PC transactions. That would not be easy for FDW developers. So what I imagined for the new APIs is that if FDW developers use the new APIs they can use both 2PC and non-2PC transactions, but if they use XactCallback they can use only non-2PC transactions. Any thoughts? Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Tue, Feb 27, 2018 at 2:21 AM, Masahiko Sawada wrote: > I might be missing your point. As for API breaking, this patch doesn't > break any existing FDWs. All new APIs I proposed are dedicated to 2PC. > In other words, FDWs that work today can continue working after this > patch gets committed, but if FDWs want to support atomic commit then > they should be updated to use new APIs. The reason why the calling of > FdwXactRegisterForeignServer is necessary is that the core code > controls the foreign servers that involved with the transaction but > the whether each foreign server uses 2PC option (two_phase_commit) is > known only on FDW code. We can eliminate the necessity of calling > FdwXactRegisterForeignServer by moving 2PC option from fdw-level to > server-level in order not to enforce calling the registering function > on FDWs. If we did that, the user can use the 2PC option as a > server-level option. Well, FdwXactRegisterForeignServer has a "bool two_phase_commit" argument. If it only needs to be called by FDWs that support 2PC, then that argument is unnecessary. If it needs to be called by all FDWs, then it breaks existing FDWs that don't call it. >>> But this is now responsible by FDW. I should change it to resolver >>> side. That is, FDW can raise error in ordinarily way but core code >>> should catch and process it. >> >> I don't understand exactly what you mean here. > > Hmm I think I got confused. My understanding is that logging a > complaint in commit case and not doing that in abort case if prepared > transaction doesn't exist is a core code's job. An API contract here > is that FDW raise an error with ERRCODE_UNDEFINED_OBJECT error code if > there is no such transaction. Since it's an expected case in abort > case for the fdwxact manager, the core code can regard the error as > not actual problem. In general, it's not safe to catch an error and continue unless you protect the code that throws the error by a sub-transaction.
That means we shouldn't expect the FDW to throw an error when the prepared transaction isn't found and then just have the core code ignore the error. Instead the FDW should return normally and, if the core code needs to know whether the prepared transaction was found, then the FDW should indicate this through a return value, not an ERROR. > Or do you mean that FDWs should not raise an error if there is the > prepared transaction, and then core code doesn't need to check > sqlstate in case of error? Right. As noted above, that's unsafe, so we shouldn't do it. >>> You're right. Perhaps we can deal with it by PrescanFdwXacts until >>> reach consistent point, and then have vac_update_datfrozenxid check >>> local xids of un-resolved fdwxact to determine the new datfrozenxid. >>> Since the local xids of un-resolved fdwxacts would not be relevant >>> with vacuuming, we don't need to include it to snapshot and >>> GetOldestXmin etc. Also we hint to resolve fdwxact when near >>> wraparound. >> >> I agree with you about snapshots, but I'm not sure about GetOldestXmin. > > Hmm, although I've thought concern in case where we don't consider > local xids of un-resolved fdwxact in GetOldestXmin, I could not find > problem. Could you share your concern if you have? I'll try to find a > possibility based on it. I don't remember exactly what I was thinking when I wrote that, but I think the point is that GetOldestXmin() does a bunch of things other than control the threshold for VACUUM, and we'd need to study them all and look for problems. For example, it won't do for an XID to get reused while it's still associated with an unresolved fdwxact. We therefore certainly need such XIDs to hold back the cluster-wide threshold for clog truncation in some manner -- and maybe that involves GetOldestXmin(). Or maybe not. 
But anyway the point, broadly considered, is that GetOldestXmin() is used in various ways, and I don't know if we've thought through all of the consequences in regard to this new feature. Can I ask what your time frame is for updating this patch? Considering how much work appears to remain, if you want to get this committed to v12, it would be best to get started as early as possible. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
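The return-value contract Robert describes above, where the FDW reports "prepared transaction not found" through its result instead of raising an ERROR that the core code would have to catch, could look roughly like this. This is a minimal standalone sketch of the idea only; the enum, function names, and policy helper are hypothetical, not taken from the patch:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result codes for a resolve callback. */
typedef enum
{
    FDWXACT_RESOLVED,           /* prepared transaction committed/aborted */
    FDWXACT_NOT_FOUND           /* no such prepared transaction on the server */
} FdwXactResolveStatus;

/*
 * Stand-in for an FDW's resolve callback: it returns a status instead of
 * throwing, so the core code never needs a sub-transaction to catch an
 * error.  xact_exists models whether COMMIT/ROLLBACK PREPARED finds the
 * transaction on the remote side.
 */
static FdwXactResolveStatus
resolve_foreign_xact(bool xact_exists, bool commit)
{
    (void) commit;              /* would select COMMIT vs ROLLBACK PREPARED */
    if (!xact_exists)
        return FDWXACT_NOT_FOUND;
    /* ... issue COMMIT PREPARED / ROLLBACK PREPARED here ... */
    return FDWXACT_RESOLVED;
}

/*
 * Core-side policy: NOT_FOUND deserves a complaint when committing (the
 * prepared transaction should still exist) but is expected when aborting
 * (another process may already have resolved it).
 */
static bool
core_should_warn(FdwXactResolveStatus status, bool commit)
{
    return status == FDWXACT_NOT_FOUND && commit;
}
```

With this shape, the fdwxact manager decides what a missing prepared transaction means in context, and no error ever has to be swallowed.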
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Thu, Mar 29, 2018 at 2:27 AM, David Steele wrote: > On 2/27/18 2:21 AM, Masahiko Sawada wrote: >> >> Hmm, although I've thought concern in case where we don't consider >> local xids of un-resolved fdwxact in GetOldestXmin, I could not find >> problem. Could you share your concern if you have? I'll try to find a >> possibility based on it. > > It appears that this entry should be marked Waiting on Author so I have > done that. > > I also think it may be time to move this patch to the next CF. > I agree to move this patch to the next CF. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On 2/27/18 2:21 AM, Masahiko Sawada wrote: > > Hmm, although I've thought concern in case where we don't consider > local xids of un-resolved fdwxact in GetOldestXmin, I could not find > problem. Could you share your concern if you have? I'll try to find a > possibility based on it. It appears that this entry should be marked Waiting on Author so I have done that. I also think it may be time to move this patch to the next CF. Regards, -- -David da...@pgmasters.net
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Wed, Feb 21, 2018 at 6:07 AM, Robert Haas wrote: > On Tue, Feb 13, 2018 at 5:42 AM, Masahiko Sawada > wrote: >>> The fdw-transactions section of the documentation seems to imply that >>> henceforth every FDW must call FdwXactRegisterForeignServer, which I >>> think is an unacceptable API break. >>> >>> It doesn't seem advisable to make this behavior depend on >>> max_prepared_foreign_transactions. I think that it should be an >>> server-level option: use 2PC for this server, or not? FDWs that don't >>> have 2PC default to "no"; but others can be set to "yes" if the user >>> wishes. But we shouldn't force that decision to be on a cluster-wide >>> basis. >> >> Since I've added a new option two_phase_commit to postgres_fdw we need >> to ask FDW whether the foreign server is 2PC-capable server or not in >> order to register the foreign server information. That's why the patch >> asked FDW to call FdwXactRegisterForeignServer. However we can >> register a foreign server information automatically by executor (e.g. >> at BeginDirectModify and at BeginForeignModify) if a foreign server >> itself has that information. We can add two_phase_commit_enabled >> column to pg_foreign_server system catalog and that column is set to >> true if the foriegn server is 2PC-capable (i.g. has enough functions) >> and user want to use it. > > I don't see why this would need a new catalog column. I might be missing your point. As for API breaking, this patch doesn't break any existing FDWs. All new APIs I proposed are dedicated to 2PC. In other words, FDWs that work today can continue working after this patch gets committed, but if FDWs want to support atomic commit then they should be updated to use the new APIs. The reason the call to FdwXactRegisterForeignServer is necessary is that the core code controls the foreign servers involved in the transaction, but whether each foreign server uses the 2PC option (two_phase_commit) is known only to the FDW code. 
We can eliminate the necessity of calling FdwXactRegisterForeignServer by moving 2PC option from fdw-level to server-level in order not to enforce calling the registering function on FDWs. If we did that, the user can use the 2PC option as a server-level option. > >>> But that brings up another issue: why is MyFdwConnections named that >>> way and why does it have those semantics? That is, why do we really >>> need a list of every FDW connection? I think we really only need the >>> ones that are 2PC-capable writes. If a non-2PC-capable foreign server >>> is involved in the transaction, then we don't really to keep it in a >>> list. We just need to flag the transaction as not locally prepareable >>> i.e. clear TwoPhaseReady. I think that this could all be figured out >>> in a much less onerous way: if we ever perform a modification of a >>> foreign table, have nodeModifyTable.c either mark the transaction >>> non-preparable by setting XACT_FLAGS_FDWNOPREPARE if the foreign >>> server is not 2PC capable, or otherwise add the appropriate >>> information to MyFdwConnections, which can then be renamed to indicate >>> that it contains only information about preparable stuff. Then you >>> don't need each individual FDW to be responsible about calling some >>> new function; the core code just does the right thing automatically. >> >> I could not get this comment. Did you mean that the foreign >> transaction on not 2PC-capable foreign server should be end in the >> same way as before (i.g. by XactCallback)? >> >> Currently, because there is not FDW API to end foreign transaction, >> almost FDWs use XactCallbacks to end the transaction. But after >> introduced new FDW APIs, I think it's better to use FDW APIs to end >> transactions rather than using XactCallbacks. Otherwise we end up with >> having FDW APIs for 2PC (prepare and resolve) and XactCallback for >> ending the transaction, which would be hard to understand. 
So I've >> changed the foreign transaction management so that core code >> explicitly asks FDW to end/prepare a foreign transaction instead of >> ending it by individual FDWs. After introduced new FDW APIs, core code >> can have the information of all foreign servers involved with the >> transaction and call each APIs at appropriate timing. > > Well, it's one thing to introduce a new API. It's another thing to > require existing FDWs to be updated to use it. There are a lot of > existing FDWs out there, and I think that it is needlessly unfriendly > to force them all to be updated for v11 (or whenever this gets > committed) even if we think the new API is clearly better. FDWs that > work today should continue working after this patch is committed. Agreed. > Separately, I think there's a question of whether the new API is in > fact better -- I'm not sure I have a completely well-formed opinion > about that yet. I think one API should do one job. In terms of keeping simple API the current four APIs
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Tue, Feb 13, 2018 at 5:42 AM, Masahiko Sawada wrote: >> The fdw-transactions section of the documentation seems to imply that >> henceforth every FDW must call FdwXactRegisterForeignServer, which I >> think is an unacceptable API break. >> >> It doesn't seem advisable to make this behavior depend on >> max_prepared_foreign_transactions. I think that it should be an >> server-level option: use 2PC for this server, or not? FDWs that don't >> have 2PC default to "no"; but others can be set to "yes" if the user >> wishes. But we shouldn't force that decision to be on a cluster-wide >> basis. > > Since I've added a new option two_phase_commit to postgres_fdw we need > to ask FDW whether the foreign server is 2PC-capable server or not in > order to register the foreign server information. That's why the patch > asked FDW to call FdwXactRegisterForeignServer. However we can > register a foreign server information automatically by executor (e.g. > at BeginDirectModify and at BeginForeignModify) if a foreign server > itself has that information. We can add two_phase_commit_enabled > column to pg_foreign_server system catalog and that column is set to > true if the foriegn server is 2PC-capable (i.g. has enough functions) > and user want to use it. I don't see why this would need a new catalog column. >> But that brings up another issue: why is MyFdwConnections named that >> way and why does it have those semantics? That is, why do we really >> need a list of every FDW connection? I think we really only need the >> ones that are 2PC-capable writes. If a non-2PC-capable foreign server >> is involved in the transaction, then we don't really to keep it in a >> list. We just need to flag the transaction as not locally prepareable >> i.e. clear TwoPhaseReady. 
I think that this could all be figured out >> in a much less onerous way: if we ever perform a modification of a >> foreign table, have nodeModifyTable.c either mark the transaction >> non-preparable by setting XACT_FLAGS_FDWNOPREPARE if the foreign >> server is not 2PC capable, or otherwise add the appropriate >> information to MyFdwConnections, which can then be renamed to indicate >> that it contains only information about preparable stuff. Then you >> don't need each individual FDW to be responsible about calling some >> new function; the core code just does the right thing automatically. > > I could not get this comment. Did you mean that the foreign > transaction on not 2PC-capable foreign server should be end in the > same way as before (i.g. by XactCallback)? > > Currently, because there is not FDW API to end foreign transaction, > almost FDWs use XactCallbacks to end the transaction. But after > introduced new FDW APIs, I think it's better to use FDW APIs to end > transactions rather than using XactCallbacks. Otherwise we end up with > having FDW APIs for 2PC (prepare and resolve) and XactCallback for > ending the transaction, which would be hard to understand. So I've > changed the foreign transaction management so that core code > explicitly asks FDW to end/prepare a foreign transaction instead of > ending it by individual FDWs. After introduced new FDW APIs, core code > can have the information of all foreign servers involved with the > transaction and call each APIs at appropriate timing. Well, it's one thing to introduce a new API. It's another thing to require existing FDWs to be updated to use it. There are a lot of existing FDWs out there, and I think that it is needlessly unfriendly to force them all to be updated for v11 (or whenever this gets committed) even if we think the new API is clearly better. FDWs that work today should continue working after this patch is committed. 
Separately, I think there's a question of whether the new API is in fact better -- I'm not sure I have a completely well-formed opinion about that yet. > In FdwXactResolveForeignTranasction(), resolver concludes the fate of > transaction by seeing the status of fdwxact entry and the state of > local transaction in clog. what I need to do is making that function > log a complaint in commit case if couldn't find the prepared > transaction, and not do that in abort case. +1. > Also, postgres_fdw don't > raise an error even if we could not find prepared transaction on > foreign server because it might have been resolved by other process. +1. > But this is now responsible by FDW. I should change it to resolver > side. That is, FDW can raise error in ordinarily way but core code > should catch and process it. I don't understand exactly what you mean here. > You're right. Perhaps we can deal with it by PrescanFdwXacts until > reach consistent point, and then have vac_update_datfrozenxid check > local xids of un-resolved fdwxact to determine the new datfrozenxid. > Since the local xids of un-resolved fdwxacts would not be relevant > with vacuuming, we don't need to include it to snapshot and > GetOldestXmin etc. Also we hint to
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Sat, Feb 10, 2018 at 4:08 AM, Robert Haas wrote: > On Thu, Feb 8, 2018 at 3:58 AM, Masahiko Sawada wrote: >>> Overall, what's the status of this patch? Are we hung up on this >>> issue only, or are there other things? >> >> AFAIK there is no more technical issue in this patch so far other than >> this issue. The patch has tests and docs, and includes all stuff to >> support atomic commit to distributed transactions: the introducing >> both the atomic commit ability to distributed transactions and some >> corresponding FDW APIs, and having postgres_fdw support 2pc. I think >> this patch needs to be reviewed, especially the functionality of >> foreign transaction resolution which is re-designed before. > > OK. I'm going to give 0002 a read-through now, but I think it would > be a good thing if others also contributed to the review effort. > There is a lot of code here, and there are a lot of other patches > competing for attention. That said, here we go: Thank you for your review. I'll thoroughly update the whole patch based on your comments and suggestions. Here are my answers, questions, and my current understanding. > > The fdw-transactions section of the documentation seems to imply that > henceforth every FDW must call FdwXactRegisterForeignServer, which I > think is an unacceptable API break. > > It doesn't seem advisable to make this behavior depend on > max_prepared_foreign_transactions. I think that it should be an > server-level option: use 2PC for this server, or not? FDWs that don't > have 2PC default to "no"; but others can be set to "yes" if the user > wishes. But we shouldn't force that decision to be on a cluster-wide > basis. Since I've added a new option two_phase_commit to postgres_fdw, we need to ask the FDW whether the foreign server is 2PC-capable or not in order to register the foreign server information. That's why the patch asked FDWs to call FdwXactRegisterForeignServer. 
However, we can register foreign server information automatically from the executor (e.g. at BeginDirectModify and at BeginForeignModify) if the foreign server itself has that information. We can add a two_phase_commit_enabled column to the pg_foreign_server system catalog, set to true if the foreign server is 2PC-capable (i.e. has the required functions) and the user wants to use it. > > But that brings up another issue: why is MyFdwConnections named that > way and why does it have those semantics? That is, why do we really > need a list of every FDW connection? I think we really only need the > ones that are 2PC-capable writes. If a non-2PC-capable foreign server > is involved in the transaction, then we don't really to keep it in a > list. We just need to flag the transaction as not locally prepareable > i.e. clear TwoPhaseReady. I think that this could all be figured out > in a much less onerous way: if we ever perform a modification of a > foreign table, have nodeModifyTable.c either mark the transaction > non-preparable by setting XACT_FLAGS_FDWNOPREPARE if the foreign > server is not 2PC capable, or otherwise add the appropriate > information to MyFdwConnections, which can then be renamed to indicate > that it contains only information about preparable stuff. Then you > don't need each individual FDW to be responsible about calling some > new function; the core code just does the right thing automatically. I could not follow this comment. Did you mean that foreign transactions on non-2PC-capable foreign servers should be ended the same way as before (i.e. by XactCallback)? Currently, because there is no FDW API to end a foreign transaction, most FDWs use XactCallbacks to end transactions. But after introducing the new FDW APIs, I think it's better to use the FDW APIs to end transactions rather than XactCallbacks. Otherwise we end up with FDW APIs for 2PC (prepare and resolve) plus a XactCallback for ending the transaction, which would be hard to understand. 
So I've changed the foreign transaction management so that the core code explicitly asks the FDW to end or prepare a foreign transaction instead of each FDW ending it itself. With the new FDW APIs, the core code has information about all foreign servers involved in the transaction and can call each API at the appropriate time. > > +if (!fdw_conn->end_foreign_xact(fdw_conn->serverid, fdw_conn->userid, > +fdw_conn->umid, true)) > +elog(WARNING, "could not commit transaction on server %s", > + fdw_conn->servername); > > First, as I noted above, this assumes that the local transaction can't > fail after this point, which is certainly false -- if nothing else, > think about a plug pull. Second, it assumes that the right thing to > do would be to throw a WARNING, which I think is false: if this is > running in the pre-commit phase, it's not too late to switch to the > abort path, and certainly if we make changes on only 1 server
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Thu, Feb 8, 2018 at 3:58 AM, Masahiko Sawada wrote: >> Overall, what's the status of this patch? Are we hung up on this >> issue only, or are there other things? > > AFAIK there is no more technical issue in this patch so far other than > this issue. The patch has tests and docs, and includes all stuff to > support atomic commit to distributed transactions: the introducing > both the atomic commit ability to distributed transactions and some > corresponding FDW APIs, and having postgres_fdw support 2pc. I think > this patch needs to be reviewed, especially the functionality of > foreign transaction resolution which is re-designed before. OK. I'm going to give 0002 a read-through now, but I think it would be a good thing if others also contributed to the review effort. There is a lot of code here, and there are a lot of other patches competing for attention. That said, here we go: In the documentation for pg_prepared_fdw_xacts, the first two columns have descriptions ending in a preposition. That's typically to be avoided in formal writing. The first one can be fixed by moving "in" before "which". The second can be fixed by changing "that" to "with which" and then dropping the trailing "with". The first three columns have descriptions ending in a period; the latter two do not. Make it consistent with whatever the surrounding style is, or at least internally consistent if the surrounding style varies. Also, some of the descriptions begin with "the" and others do not; again, seems better to be consistent and adhere to surrounding style. The documentation of the serverid column seems to make no sense. Possibly you mean "OID of the foreign server on which this foreign transaction is prepared"? As it is, you use "foreign server" twice, which is why I'm confused. The documentation of max_prepared_foreign_transactions seems a bit brief. I think that if I want to be able to use 2PC for N transactions each across K servers, this variable needs to be set to N*K, not just N. 
That's not the right way to word it here, but somehow you probably want to explain that a single local transaction can give rise to multiple foreign transactions and that this should be taken into consideration when setting a value for this variable. Maybe also include a link to where the user can find more information, which brings me to another point: there doesn't seem to be any general, user-facing explanation of this system. You explain the catalogs, the GUCs, the interface, etc. but there's nothing anywhere that explains the overall point of the whole thing, which seems pretty important. The closest thing you've got is a description for people writing FDWs, but we really need a place to explain the concept to *users*. One idea is to add a new chapter to the "Server Administration" section, maybe just after "Logical Replication" and before "Regression Tests". But I'm open to other ideas. It's important that the documentation of the various GUCs provide users with some clue as to how to set them. I notice this particularly for foreign_transaction_resolution_interval; off the top of my head, without having read the rest of this patch, I don't know why I shouldn't want this to be zero. But the others could use more explanation as well. It is unclear from reading the documentation for GetPreparedId why this should be the responsibility of the FDW, or how the FDW is supposed to guarantee uniqueness. PrepareForignTransaction is spelled wrong. Nearby typos: prepareing, tranasaction. A bit further down, "can _prepare" has an unwanted space in the middle. Various places in this section capitalize certain words in the middle of sentences which is not standard English style. For example, in "Every Foreign Data Wrapper is required..." the word "Every" is appropriately capitalized because it begins a sentence, but there's no reason to capitalize the others. Likewise for "...employs Two-phase commit protocol..." and other similar cases. 
EndForeignTransaction doesn't explain what sorts of things the FDW can legally do when called, or how this method is intended to be used. Those seem like important details. Especially, an FDW that tries to do arbitrary stuff during abort cleanup will probably cause big problems. The fdw-transactions section of the documentation seems to imply that henceforth every FDW must call FdwXactRegisterForeignServer, which I think is an unacceptable API break. It doesn't seem advisable to make this behavior depend on max_prepared_foreign_transactions. I think that it should be an server-level option: use 2PC for this server, or not? FDWs that don't have 2PC default to "no"; but others can be set to "yes" if the user wishes. But we shouldn't force that decision to be on a cluster-wide basis. + shows the functions +available for foreign transaction managements. management +Resolve a foreign transaction. This function search foreign transaction searches
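The max_prepared_foreign_transactions sizing point earlier in this review can be made concrete: one local transaction that writes to K 2PC-capable foreign servers produces K foreign prepared transactions, so N concurrent local transactions need roughly N*K slots. A hedged illustration of that arithmetic (this is the review's reasoning, not a formula taken from the patch):

```c
#include <assert.h>

/*
 * Rough capacity estimate discussed in the review: each of the
 * n_local_xacts concurrent local transactions may prepare one foreign
 * transaction on each of the k_servers foreign servers it writes to,
 * so the shared slot array must hold at least their product.
 */
static int
min_prepared_foreign_xact_slots(int n_local_xacts, int k_servers)
{
    return n_local_xacts * k_servers;
}
```

For example, 10 concurrent distributed transactions each spanning 3 foreign servers would need max_prepared_foreign_transactions to be at least 30.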
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Tue, Jan 9, 2018 at 9:49 AM, Masahiko Sawada wrote: >> If I understand correctly, XactLastRecEnd can be set by, for example, >> a HOT cleanup record, so that doesn't seem like a good thing to use. > > Yes, that's right. > >> Whether we need to use 2PC across remote nodes seems like it shouldn't >> depend on whether a local SELECT statement happened to do a HOT >> cleanup or not. > > So I think we need to check if the top transaction is invalid or not as well. Even if you check both, it doesn't sound like it really does what you want. Won't you still end up partially dependent on whether a HOT cleanup happened, if not in quite the same way as before? How about defining a new bit in MyXactFlags for XACT_FLAGS_WROTENONTEMPREL? Just have heap_insert, heap_update, and heap_delete do something like: if (RelationNeedsWAL(relation)) MyXactFlags |= XACT_FLAGS_WROTENONTEMPREL; Overall, what's the status of this patch? Are we hung up on this issue only, or are there other things? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
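The flag-based tracking Robert sketches above relies on the usual MyXactFlags bit-mask idiom: heap modification paths set a bit only for WAL-logged (non-temporary) relations, and commit-time code tests it. A standalone model of that idiom follows; the variable and flag names mimic the proposal, but the flag value and helper functions here are purely illustrative, not PostgreSQL's actual definitions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag value; the real bit assignments live in xact.h. */
#define XACT_FLAGS_WROTENONTEMPREL (1U << 0)

/* Model of the per-backend transaction flag word. */
static uint32_t MyXactFlags = 0;

/*
 * Called from the heap_insert/heap_update/heap_delete paths: only
 * WAL-logged relations (i.e. not temporary/unlogged-local ones) mark
 * the transaction as having performed a durable local write.
 */
static void
record_heap_modification(bool relation_needs_wal)
{
    if (relation_needs_wal)
        MyXactFlags |= XACT_FLAGS_WROTENONTEMPREL;
}

/* At pre-commit: did the local node do a durable write? */
static bool
wrote_non_temp_rel(void)
{
    return (MyXactFlags & XACT_FLAGS_WROTENONTEMPREL) != 0;
}
```

Unlike testing XactLastRecEnd, this bit cannot be set as a side effect of a read-only statement doing HOT cleanup, which is exactly the property the thread is after.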
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Wed, Dec 27, 2017 at 9:40 PM, Tsunakawa, Takayuki wrote: > (1) > Why don't you use the existing global variable MyXactFlags instead of the new > TransactionDidWrite? Or, how about using XactLastRecEnd != 0 to determine > the transaction did any writes? When the transaction only modified temporary > tables on the local database and some data on one remote database, I think > 2pc is unnecessary. If I understand correctly, XactLastRecEnd can be set by, for example, a HOT cleanup record, so that doesn't seem like a good thing to use. Whether we need to use 2PC across remote nodes seems like it shouldn't depend on whether a local SELECT statement happened to do a HOT cleanup or not. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Mon, Jan 1, 2018 at 7:12 PM, Ashutosh Bapat wrote: > On Thu, Dec 28, 2017 at 11:08 AM, Masahiko Sawada > wrote: >> >>> (1) >>> Why don't you use the existing global variable MyXactFlags instead of the >>> new TransactionDidWrite? Or, how about using XactLastRecEnd != 0 to >>> determine the transaction did any writes? When the transaction only >>> modified temporary tables on the local database and some data on one remote >>> database, I think 2pc is unnecessary. >> >> Perhaps we can use (XactLastRecEnd != 0 && markXidCommitted) to see if >> we did any writes on local node which requires the atomic commit. Will >> fix. >> > > I haven't checked how much code it needs to track whether the local > transaction wrote anything. But probably we can post-pone this > optimization. If it's easy to incorporate, it's good to have in the > first set itself. > Without tracking local writes, we always have to use two-phase commit even when the transaction writes data on only one foreign server. That would cause unnecessary performance degradation and transaction failures on existing systems. We can enable two-phase commit per foreign server with ALTER SERVER, but that affects other transactions. If we could configure it per transaction, perhaps this optimization would be worth postponing. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Thu, Dec 28, 2017 at 11:08 AM, Masahiko Sawada wrote: > >> (1) >> Why don't you use the existing global variable MyXactFlags instead of the >> new TransactionDidWrite? Or, how about using XactLastRecEnd != 0 to >> determine the transaction did any writes? When the transaction only >> modified temporary tables on the local database and some data on one remote >> database, I think 2pc is unnecessary. > > Perhaps we can use (XactLastRecEnd != 0 && markXidCommitted) to see if > we did any writes on local node which requires the atomic commit. Will > fix. > I haven't checked how much code it needs to track whether the local transaction wrote anything. But probably we can post-pone this optimization. If it's easy to incorporate, it's good to have in the first set itself. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
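The optimization being weighed here, skipping 2PC when only one node actually wrote, can be sketched as a participant count: count the local node as a writer if it performed durable writes, add each modified foreign server, and require two-phase commit only when more than one participant wrote. This is only a model of the thread's reasoning, not code from the patch:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether atomic commit needs two-phase commit: 2PC is required
 * only when two or more nodes performed writes in the transaction
 * (the local node counts as one writer if it wrote durably, e.g. per
 * the XactLastRecEnd != 0 && markXidCommitted check discussed above).
 */
static bool
need_two_phase_commit(bool local_node_wrote, int n_modified_foreign_servers)
{
    int writers = n_modified_foreign_servers + (local_node_wrote ? 1 : 0);

    return writers >= 2;
}
```

So a transaction that only modified temporary local tables plus one foreign server (one writer) commits with a plain remote COMMIT, while any combination of two or more writers takes the PREPARE path.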
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Thu, Dec 28, 2017 at 11:40 AM, Tsunakawa, Takayuki wrote: > From: Masahiko Sawada [mailto:sawada.m...@gmail.com] >> I've updated documentation of patches, and fixed some bugs. I did some >> failure tests of this feature using a fault simulation tool[1] for >> PostgreSQL that I created. >> >> 0001 patch adds a mechanism to track of writes on local server. This is >> required to determine whether we should use 2pc at commit. 0002 patch is >> the main part. It adds a distributed transaction manager (currently only >> for atomic commit), APIs for 2pc and foreign transaction manager resolver >> process. 0003 patch makes postgres_fdw support atomic commit using 2pc. >> >> Please review patches. > > I'd like to join the review and testing of this functionality. First, some > comments after taking a quick look at 0001: Thank you so much! > (1) > Why don't you use the existing global variable MyXactFlags instead of the new > TransactionDidWrite? Or, how about using XactLastRecEnd != 0 to determine > the transaction did any writes? When the transaction only modified temporary > tables on the local database and some data on one remote database, I think > 2pc is unnecessary. Perhaps we can use (XactLastRecEnd != 0 && markXidCommitted) to see if we did any writes on the local node that require atomic commit. Will fix. > > (2) > If TransactionDidWrite is necessary, I don't think you need to provide setter > functions, because other transaction state variables are accessed globally > without getter/setters. And you didn't create getter function for > TransactionDidWrite. > > (3) > heap_multi_insert() doesn't modify TransactionDidWrite. Is it sufficient to > just remember heap modifications? Are other modifications on the coordinator > node covered such as TRUNCATE and REINDEX? I think using (XactLastRecEnd != 0 && markXidCommitted) to check whether we did any writes on the local node would be better. After that change I will be able to address all of the above concerns. 
> > > Questions before looking at 0002 and 0003: > > Q1: Does this functionality work when combined with XA 2pc transactions? All transactions, including the local transaction and the foreign transactions, are prepared at PREPARE. All of them are then committed or rolled back by the foreign transaction resolver process at COMMIT/ROLLBACK PREPARED. > Q2: Does the atomic commit cascade across multiple remote databases? For > example: > * The local transaction modifies data on remote database 1 via a foreign > table. > * A trigger fires on the remote database 1, which modifies data on remote > database 2 via a foreign table. > * The local transaction commits. I have not yet tested more complex failure situations, but as far as I have tested in my environment, the cascading atomic commit works. I'll test these cases more deeply. Regards, Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
RE: [HACKERS] Transactions involving multiple postgres foreign servers
From: Masahiko Sawada [mailto:sawada.m...@gmail.com] > I've updated documentation of patches, and fixed some bugs. I did some > failure tests of this feature using a fault simulation tool[1] for > PostgreSQL that I created. > > 0001 patch adds a mechanism to track of writes on local server. This is > required to determine whether we should use 2pc at commit. 0002 patch is > the main part. It adds a distributed transaction manager (currently only > for atomic commit), APIs for 2pc and foreign transaction manager resolver > process. 0003 patch makes postgres_fdw support atomic commit using 2pc. > > Please review patches. I'd like to join the review and testing of this functionality. First, some comments after taking a quick look at 0001: (1) Why don't you use the existing global variable MyXactFlags instead of the new TransactionDidWrite? Or, how about using XactLastRecEnd != 0 to determine the transaction did any writes? When the transaction only modified temporary tables on the local database and some data on one remote database, I think 2pc is unnecessary. (2) If TransactionDidWrite is necessary, I don't think you need to provide setter functions, because other transaction state variables are accessed globally without getter/setters. And you didn't create a getter function for TransactionDidWrite. (3) heap_multi_insert() doesn't modify TransactionDidWrite. Is it sufficient to just remember heap modifications? Are other modifications on the coordinator node covered such as TRUNCATE and REINDEX? Questions before looking at 0002 and 0003: Q1: Does this functionality work when combined with XA 2pc transactions? Q2: Does the atomic commit cascade across multiple remote databases? For example: * The local transaction modifies data on remote database 1 via a foreign table. * A trigger fires on the remote database 1, which modifies data on remote database 2 via a foreign table. * The local transaction commits. Regards, Takayuki Tsunakawa
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Wed, Dec 13, 2017 at 12:03 AM, Robert Haas wrote: > On Mon, Dec 11, 2017 at 5:20 AM, Masahiko Sawada > wrote: >>> The question I have is how would we deal with a foreign server that is >>> not available for longer duration due to crash, longer network outage >>> etc. Example is the foreign server crashed/got disconnected after >>> PREPARE but before COMMIT/ROLLBACK was issued. The backend will remain >>> blocked for much longer duration without user having an idea of what's >>> going on. May be we should add some timeout. >> >> After more thought, I agree with adding some timeout. I can image >> there are users who want the timeout, for example, who cannot accept >> even a few seconds latency. If the timeout occurs backend unlocks the >> foreign transactions and breaks the loop. The resolver process will >> keep to continue to resolve foreign transactions at certain interval. > > I don't think a timeout is a very good idea. There is no timeout for > synchronous replication and the issues here are similar. I will not > try to block a patch adding a timeout, but I think it had better be > disabled by default and have very clear documentation explaining why > it's really dangerous. And this is why: with no timeout, you can > count on being able to see the effects of your own previous > transactions, unless at some point you sent a query cancel or got > disconnected. With a timeout, you may or may not see the effects of > your own previous transactions depending on whether or not you hit the > timeout, which you have no sure way of knowing. > transactions after the coordinator server recovered. On the other hand, for the reading a consistent result on such situation by subsequent reads, for example, we can disallow backends to inquiry SQL to the foreign server if a foreign transaction of the foreign server is remained. >>> >>> +1 for the last sentence. 
If we do that, we don't need the backend to >>> be blocked by resolver since a subsequent read accessing that foreign >>> server would get an error and not inconsistent data. >> >> Yeah, however the disadvantage of this is that we manage foreign >> transactions per foreign servers. If a transaction that modified even >> one table is remained as a in-doubt transaction, we cannot issue any >> SQL that touches that foreign server. Can we occur an error at >> ExecInitForeignScan()? > > I really feel strongly we shouldn't complicate the initial patch with > this kind of thing. Let's make it enough for this patch to guarantee > that either all parts of the transaction commit eventually or they all > abort eventually. Ensuring consistent visibility is a different and > hard project, and if we try to do that now, this patch is not going to > be done any time soon. > Thank you for the suggestion. I was really wondering if we should add a timeout to this feature. It's a common concern that we want to put a timeout on a critical section, but currently we have no such timeout for either synchronous replication or WAL writing. I can imagine there will be users who want a timeout for such cases, but obviously it makes this feature more complex. Anyway, even if we add a timeout to this feature, it can be a separate patch and feature. So I'd like to keep it simple as a first step. This patch guarantees that the transaction either commits or rolls back on all foreign servers, unless the user cancels. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Mon, Dec 11, 2017 at 5:20 AM, Masahiko Sawada wrote: >> The question I have is how would we deal with a foreign server that is >> not available for longer duration due to crash, longer network outage >> etc. Example is the foreign server crashed/got disconnected after >> PREPARE but before COMMIT/ROLLBACK was issued. The backend will remain >> blocked for much longer duration without user having an idea of what's >> going on. May be we should add some timeout. > > After more thought, I agree with adding some timeout. I can image > there are users who want the timeout, for example, who cannot accept > even a few seconds latency. If the timeout occurs backend unlocks the > foreign transactions and breaks the loop. The resolver process will > keep to continue to resolve foreign transactions at certain interval. I don't think a timeout is a very good idea. There is no timeout for synchronous replication and the issues here are similar. I will not try to block a patch adding a timeout, but I think it had better be disabled by default and have very clear documentation explaining why it's really dangerous. And this is why: with no timeout, you can count on being able to see the effects of your own previous transactions, unless at some point you sent a query cancel or got disconnected. With a timeout, you may or may not see the effects of your own previous transactions depending on whether or not you hit the timeout, which you have no sure way of knowing. >>> transactions after the coordinator server recovered. On the other >>> hand, for the reading a consistent result on such situation by >>> subsequent reads, for example, we can disallow backends to inquiry SQL >>> to the foreign server if a foreign transaction of the foreign server >>> is remained. >> >> +1 for the last sentence. If we do that, we don't need the backend to >> be blocked by resolver since a subsequent read accessing that foreign >> server would get an error and not inconsistent data. 
> > Yeah, however the disadvantage of this is that we manage foreign > transactions per foreign servers. If a transaction that modified even > one table is remained as a in-doubt transaction, we cannot issue any > SQL that touches that foreign server. Can we occur an error at > ExecInitForeignScan()? I really feel strongly we shouldn't complicate the initial patch with this kind of thing. Let's make it enough for this patch to guarantee that either all parts of the transaction commit eventually or they all abort eventually. Ensuring consistent visibility is a different and hard project, and if we try to do that now, this patch is not going to be done any time soon. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Tue, Nov 28, 2017 at 12:31 PM, Ashutosh Bapat wrote: > On Tue, Nov 28, 2017 at 3:04 AM, Masahiko Sawada > wrote: >> On Fri, Nov 24, 2017 at 10:28 PM, Antonin Houska wrote: >>> Masahiko Sawada wrote: >>> On Mon, Oct 30, 2017 at 5:48 PM, Ashutosh Bapat wrote: > On Thu, Oct 26, 2017 at 7:41 PM, Masahiko Sawada > wrote: >> >> Because I don't want to break the current user semantics. that is, >> currently it's guaranteed that the subsequent reads can see the >> committed result of previous writes even if the previous transactions >> were distributed transactions. And it's ensured by writer side. If we >> can make the reader side ensure it, the backend process don't need to >> wait for the resolver process. >> >> The waiting backend process are released by resolver process after the >> resolver process tried to resolve foreign transactions. Even if >> resolver process failed to either connect to foreign server or to >> resolve foreign transaction the backend process will be released and >> the foreign transactions are leaved as dangling transaction in that >> case, which are processed later. Also if resolver process takes a long >> time to resolve foreign transactions for whatever reason the user can >> cancel it by Ctl-c anytime. >> > > So, there's no guarantee that the next command issued from the > connection *will* see the committed data, since the foreign > transaction might not have committed because of a network glitch > (say). If we go this route of making backends wait for resolver to > resolve the foreign transaction, we will have add complexity to make > sure that the waiting backends are woken up in problematic events like > crash of the resolver process OR if the resolver process hangs in a > connection to a foreign server etc. I am not sure that the complexity > is worth the half-guarantee. > Hmm, maybe I was wrong. 
I now think that the waiting backends can be woken up only in following cases; - The resolver process succeeded to resolve all foreign transactions. - The user did the cancel (e.g. ctl-c). - The resolver process failed to resolve foreign transaction for a reason of there is no such prepared transaction on foreign server. In other cases the resolver process should not release the waiters. >>> >>> I'm not sure I see consensus here. What Ashutosh says seems to be: "Special >>> effort is needed to ensure that backend does not keep waiting if the >>> resolver >>> can't finish it's work in forseable future. But this effort is not worth >>> because by waking the backend up you might prevent the next transaction from >>> seeing the changes the previous one tried to make." >>> >>> On the other hand, your last comments indicate that you try to be even more >>> stringent in letting the backend wait. However even this stringent approach >>> does not guarantee that the next transaction will see the data changes made >>> by >>> the previous one. >>> >> >> What I'd like to guarantee is that the subsequent read can see the >> committed result of previous writes if the transaction involving >> multiple foreign servers is committed without cancellation by user. In >> other words, the backend should not be waken up and the resolver >> should continue to resolve at certain intervals even if the resolver >> fails to connect to the foreign server or fails to resolve it. This is >> similar to what synchronous replication guaranteed today. Keeping this >> semantics is very important for users. Note that the reading a >> consistent result by concurrent reads is a separated problem. > > The question I have is how would we deal with a foreign server that is > not available for longer duration due to crash, longer network outage > etc. Example is the foreign server crashed/got disconnected after > PREPARE but before COMMIT/ROLLBACK was issued. 
The backend will remain > blocked for much longer duration without user having an idea of what's > going on. May be we should add some timeout. After more thought, I agree with adding some timeout. I can imagine there are users who want the timeout, for example, who cannot accept even a few seconds of latency. If the timeout occurs, the backend unlocks the foreign transactions and breaks the loop. The resolver process will keep trying to resolve foreign transactions at a certain interval. >> >> The read result including foreign servers can be inconsistent if the >> such transaction is cancelled or the coordinator server crashes during >> two-phase commit processing. That is, if there is in-doubt transaction >> the read result can be inconsistent, even for subsequent reads. But I >> think this behaviour can be accepted by users.
Re: [HACKERS] Transactions involving multiple postgres foreign servers
Masahiko Sawada wrote: > On Fri, Nov 24, 2017 at 10:28 PM, Antonin Houska wrote: > > Masahiko Sawada wrote: > > > >> On Mon, Oct 30, 2017 at 5:48 PM, Ashutosh Bapat > >> wrote: > >> > On Thu, Oct 26, 2017 at 7:41 PM, Masahiko Sawada > >> > wrote: > >> >> > >> >> Because I don't want to break the current user semantics. that is, > >> >> currently it's guaranteed that the subsequent reads can see the > >> >> committed result of previous writes even if the previous transactions > >> >> were distributed transactions. And it's ensured by writer side. If we > >> >> can make the reader side ensure it, the backend process don't need to > >> >> wait for the resolver process. > >> >> > >> >> The waiting backend process are released by resolver process after the > >> >> resolver process tried to resolve foreign transactions. Even if > >> >> resolver process failed to either connect to foreign server or to > >> >> resolve foreign transaction the backend process will be released and > >> >> the foreign transactions are leaved as dangling transaction in that > >> >> case, which are processed later. Also if resolver process takes a long > >> >> time to resolve foreign transactions for whatever reason the user can > >> >> cancel it by Ctl-c anytime. > >> >> > >> > > >> > So, there's no guarantee that the next command issued from the > >> > connection *will* see the committed data, since the foreign > >> > transaction might not have committed because of a network glitch > >> > (say). If we go this route of making backends wait for resolver to > >> > resolve the foreign transaction, we will have add complexity to make > >> > sure that the waiting backends are woken up in problematic events like > >> > crash of the resolver process OR if the resolver process hangs in a > >> > connection to a foreign server etc. I am not sure that the complexity > >> > is worth the half-guarantee. > >> > > >> > >> Hmm, maybe I was wrong. 
I now think that the waiting backends can be > >> woken up only in following cases; > >> - The resolver process succeeded to resolve all foreign transactions. > >> - The user did the cancel (e.g. ctl-c). > >> - The resolver process failed to resolve foreign transaction for a > >> reason of there is no such prepared transaction on foreign server. > >> > >> In other cases the resolver process should not release the waiters. > > > > I'm not sure I see consensus here. What Ashutosh says seems to be: "Special > > effort is needed to ensure that backend does not keep waiting if the > > resolver > > can't finish it's work in forseable future. But this effort is not worth > > because by waking the backend up you might prevent the next transaction from > > seeing the changes the previous one tried to make." > > > > On the other hand, your last comments indicate that you try to be even more > > stringent in letting the backend wait. However even this stringent approach > > does not guarantee that the next transaction will see the data changes made > > by > > the previous one. > > > > What I'd like to guarantee is that the subsequent read can see the > committed result of previous writes if the transaction involving > multiple foreign servers is committed without cancellation by user. I missed the point that user should not expect atomicity of the commit to be guaranteed if he has cancelled his request. The other things are clear to me, including the fact that atomic commit and atomic visibility will be implemented separately. Thanks. -- Antonin Houska Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de, http://www.cybertec.at
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Tue, Nov 28, 2017 at 3:04 AM, Masahiko Sawada wrote: > On Fri, Nov 24, 2017 at 10:28 PM, Antonin Houska wrote: >> Masahiko Sawada wrote: >> >>> On Mon, Oct 30, 2017 at 5:48 PM, Ashutosh Bapat >>> wrote: >>> > On Thu, Oct 26, 2017 at 7:41 PM, Masahiko Sawada >>> > wrote: >>> >> >>> >> Because I don't want to break the current user semantics. that is, >>> >> currently it's guaranteed that the subsequent reads can see the >>> >> committed result of previous writes even if the previous transactions >>> >> were distributed transactions. And it's ensured by writer side. If we >>> >> can make the reader side ensure it, the backend process don't need to >>> >> wait for the resolver process. >>> >> >>> >> The waiting backend process are released by resolver process after the >>> >> resolver process tried to resolve foreign transactions. Even if >>> >> resolver process failed to either connect to foreign server or to >>> >> resolve foreign transaction the backend process will be released and >>> >> the foreign transactions are leaved as dangling transaction in that >>> >> case, which are processed later. Also if resolver process takes a long >>> >> time to resolve foreign transactions for whatever reason the user can >>> >> cancel it by Ctl-c anytime. >>> >> >>> > >>> > So, there's no guarantee that the next command issued from the >>> > connection *will* see the committed data, since the foreign >>> > transaction might not have committed because of a network glitch >>> > (say). If we go this route of making backends wait for resolver to >>> > resolve the foreign transaction, we will have add complexity to make >>> > sure that the waiting backends are woken up in problematic events like >>> > crash of the resolver process OR if the resolver process hangs in a >>> > connection to a foreign server etc. I am not sure that the complexity >>> > is worth the half-guarantee. >>> > >>> >>> Hmm, maybe I was wrong. 
I now think that the waiting backends can be >>> woken up only in following cases; >>> - The resolver process succeeded to resolve all foreign transactions. >>> - The user did the cancel (e.g. ctl-c). >>> - The resolver process failed to resolve foreign transaction for a >>> reason of there is no such prepared transaction on foreign server. >>> >>> In other cases the resolver process should not release the waiters. >> >> I'm not sure I see consensus here. What Ashutosh says seems to be: "Special >> effort is needed to ensure that backend does not keep waiting if the resolver >> can't finish it's work in forseable future. But this effort is not worth >> because by waking the backend up you might prevent the next transaction from >> seeing the changes the previous one tried to make." >> >> On the other hand, your last comments indicate that you try to be even more >> stringent in letting the backend wait. However even this stringent approach >> does not guarantee that the next transaction will see the data changes made >> by >> the previous one. >> > > What I'd like to guarantee is that the subsequent read can see the > committed result of previous writes if the transaction involving > multiple foreign servers is committed without cancellation by user. In > other words, the backend should not be waken up and the resolver > should continue to resolve at certain intervals even if the resolver > fails to connect to the foreign server or fails to resolve it. This is > similar to what synchronous replication guaranteed today. Keeping this > semantics is very important for users. Note that the reading a > consistent result by concurrent reads is a separated problem. The question I have is how would we deal with a foreign server that is not available for longer duration due to crash, longer network outage etc. Example is the foreign server crashed/got disconnected after PREPARE but before COMMIT/ROLLBACK was issued. 
The backend will remain blocked for much longer duration without user having an idea of what's going on. May be we should add some timeout. > > The read result including foreign servers can be inconsistent if the > such transaction is cancelled or the coordinator server crashes during > two-phase commit processing. That is, if there is in-doubt transaction > the read result can be inconsistent, even for subsequent reads. But I > think this behaviour can be accepted by users. For the resolution of > in-doubt transactions, the resolver process will try to resolve such > transactions after the coordinator server recovered. On the other > hand, for the reading a consistent result on such situation by > subsequent reads, for example, we can disallow backends to inquiry SQL > to the foreign server if a foreign transaction of the foreign server > is remained. +1 for the last sentence. If we do that, we don't need the backend to be blocked by resolver since a subsequent read accessing that foreign server would get an error and not inconsistent data.
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Mon, Nov 27, 2017 at 4:35 PM Masahiko Sawada wrote: > What I'd like to guarantee is that the subsequent read can see the > committed result of previous writes if the transaction involving > multiple foreign servers is committed without cancellation by user. In > other words, the backend should not be waken up and the resolver > should continue to resolve at certain intervals even if the resolver > fails to connect to the foreign server or fails to resolve it. This is > similar to what synchronous replication guaranteed today. Keeping this > semantics is very important for users. Note that the reading a > consistent result by concurrent reads is a separated problem. > > The read result including foreign servers can be inconsistent if the > such transaction is cancelled or the coordinator server crashes during > two-phase commit processing. That is, if there is in-doubt transaction > the read result can be inconsistent, even for subsequent reads. But I > think this behaviour can be accepted by users. For the resolution of > in-doubt transactions, the resolver process will try to resolve such > transactions after the coordinator server recovered. On the other > hand, for the reading a consistent result on such situation by > subsequent reads, for example, we can disallow backends to inquiry SQL > to the foreign server if a foreign transaction of the foreign server > is remained. > > For the concurrent reads, the reading an inconsistent result can be > happen even without in-doubt transaction because we can read data on a > foreign server between PREPARE and COMMIT PREPARED while other foreign > servers have committed. I think we should deal with this problem by > other feature on reader side, for example, atomic visibility. If we > have atomic visibility feature, we also can solve the above problem. +1 to all of that. ...Robert -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Fri, Nov 24, 2017 at 10:28 PM, Antonin Houska wrote: > Masahiko Sawada wrote: > >> On Mon, Oct 30, 2017 at 5:48 PM, Ashutosh Bapat >> wrote: >> > On Thu, Oct 26, 2017 at 7:41 PM, Masahiko Sawada >> > wrote: >> >> >> >> Because I don't want to break the current user semantics. that is, >> >> currently it's guaranteed that the subsequent reads can see the >> >> committed result of previous writes even if the previous transactions >> >> were distributed transactions. And it's ensured by writer side. If we >> >> can make the reader side ensure it, the backend process don't need to >> >> wait for the resolver process. >> >> >> >> The waiting backend process are released by resolver process after the >> >> resolver process tried to resolve foreign transactions. Even if >> >> resolver process failed to either connect to foreign server or to >> >> resolve foreign transaction the backend process will be released and >> >> the foreign transactions are leaved as dangling transaction in that >> >> case, which are processed later. Also if resolver process takes a long >> >> time to resolve foreign transactions for whatever reason the user can >> >> cancel it by Ctl-c anytime. >> >> >> > >> > So, there's no guarantee that the next command issued from the >> > connection *will* see the committed data, since the foreign >> > transaction might not have committed because of a network glitch >> > (say). If we go this route of making backends wait for resolver to >> > resolve the foreign transaction, we will have add complexity to make >> > sure that the waiting backends are woken up in problematic events like >> > crash of the resolver process OR if the resolver process hangs in a >> > connection to a foreign server etc. I am not sure that the complexity >> > is worth the half-guarantee. >> > >> >> Hmm, maybe I was wrong. I now think that the waiting backends can be >> woken up only in following cases; >> - The resolver process succeeded to resolve all foreign transactions. 
>> - The user did the cancel (e.g. ctl-c). >> - The resolver process failed to resolve foreign transaction for a >> reason of there is no such prepared transaction on foreign server. >> >> In other cases the resolver process should not release the waiters. > > I'm not sure I see consensus here. What Ashutosh says seems to be: "Special > effort is needed to ensure that backend does not keep waiting if the resolver > can't finish it's work in forseable future. But this effort is not worth > because by waking the backend up you might prevent the next transaction from > seeing the changes the previous one tried to make." > > On the other hand, your last comments indicate that you try to be even more > stringent in letting the backend wait. However even this stringent approach > does not guarantee that the next transaction will see the data changes made by > the previous one. > What I'd like to guarantee is that the subsequent read can see the committed result of previous writes if the transaction involving multiple foreign servers is committed without cancellation by user. In other words, the backend should not be waken up and the resolver should continue to resolve at certain intervals even if the resolver fails to connect to the foreign server or fails to resolve it. This is similar to what synchronous replication guaranteed today. Keeping this semantics is very important for users. Note that the reading a consistent result by concurrent reads is a separated problem. The read result including foreign servers can be inconsistent if the such transaction is cancelled or the coordinator server crashes during two-phase commit processing. That is, if there is in-doubt transaction the read result can be inconsistent, even for subsequent reads. But I think this behaviour can be accepted by users. For the resolution of in-doubt transactions, the resolver process will try to resolve such transactions after the coordinator server recovered. 
On the other hand, for the reading a consistent result on such situation by subsequent reads, for example, we can disallow backends to inquiry SQL to the foreign server if a foreign transaction of the foreign server is remained. For the concurrent reads, the reading an inconsistent result can be happen even without in-doubt transaction because we can read data on a foreign server between PREPARE and COMMIT PREPARED while other foreign servers have committed. I think we should deal with this problem by other feature on reader side, for example, atomic visibility. If we have atomic visibility feature, we also can solve the above problem. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] Transactions involving multiple postgres foreign servers
Masahiko Sawada wrote: > On Mon, Oct 30, 2017 at 5:48 PM, Ashutosh Bapat > wrote: > > On Thu, Oct 26, 2017 at 7:41 PM, Masahiko Sawada > > wrote: > >> > >> Because I don't want to break the current user semantics. that is, > >> currently it's guaranteed that the subsequent reads can see the > >> committed result of previous writes even if the previous transactions > >> were distributed transactions. And it's ensured by writer side. If we > >> can make the reader side ensure it, the backend process don't need to > >> wait for the resolver process. > >> > >> The waiting backend process are released by resolver process after the > >> resolver process tried to resolve foreign transactions. Even if > >> resolver process failed to either connect to foreign server or to > >> resolve foreign transaction the backend process will be released and > >> the foreign transactions are leaved as dangling transaction in that > >> case, which are processed later. Also if resolver process takes a long > >> time to resolve foreign transactions for whatever reason the user can > >> cancel it by Ctl-c anytime. > >> > > > > So, there's no guarantee that the next command issued from the > > connection *will* see the committed data, since the foreign > > transaction might not have committed because of a network glitch > > (say). If we go this route of making backends wait for resolver to > > resolve the foreign transaction, we will have add complexity to make > > sure that the waiting backends are woken up in problematic events like > > crash of the resolver process OR if the resolver process hangs in a > > connection to a foreign server etc. I am not sure that the complexity > > is worth the half-guarantee. > > > > Hmm, maybe I was wrong. I now think that the waiting backends can be > woken up only in following cases; > - The resolver process succeeded to resolve all foreign transactions. > - The user did the cancel (e.g. ctl-c). 
> - The resolver process failed to resolve foreign transaction for a > reason of there is no such prepared transaction on foreign server. > > In other cases the resolver process should not release the waiters. I'm not sure I see consensus here. What Ashutosh says seems to be: "Special effort is needed to ensure that backend does not keep waiting if the resolver can't finish it's work in forseable future. But this effort is not worth because by waking the backend up you might prevent the next transaction from seeing the changes the previous one tried to make." On the other hand, your last comments indicate that you try to be even more stringent in letting the backend wait. However even this stringent approach does not guarantee that the next transaction will see the data changes made by the previous one. -- Antonin Houska Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de, http://www.cybertec.at