Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2

2020-02-21 Thread Masahiko Sawada
On Tue, 18 Feb 2020 at 00:40, Muhammad Usama  wrote:
>
> Hi Sawada San,
>
> I have a couple of comments on 
> "v27-0002-Support-atomic-commit-among-multiple-foreign-ser.patch"
>
> 1-  As part of the XLogReadRecord refactoring commit the signature of 
> XLogReadRecord was changed,
> so the function call to XLogReadRecord() needs a small adjustment.
>
> i.e. In function XlogReadFdwXactData(XLogRecPtr lsn, char **buf, int *len)
> ...
> -   record = XLogReadRecord(xlogreader, lsn, &errormsg);
> +   XLogBeginRead(xlogreader, lsn);
> +   record = XLogReadRecord(xlogreader, &errormsg);
>
> 2- In the register_fdwxact(..) function you set the XACT_FLAGS_FDWNOPREPARE
> transaction flag whenever a register request comes in for a foreign server
> that does not support two-phase commit, regardless of the value of the
> 'bool modified' argument. Later, in PreCommit_FdwXacts(), you error out
> when "foreign_twophase_commit" is set to 'required' based only on the
> XACT_FLAGS_FDWNOPREPARE flag, which I think is not correct.
> There is a possibility that the transaction has only read from the foreign
> servers (which are not capable of handling transactions or two-phase
> commit), while all the other servers where we require an atomic commit are
> capable of doing so.
> Unless I am missing something obvious here, IMHO the
> XACT_FLAGS_FDWNOPREPARE flag should only be set when the transaction
> management/two-phase functionality is not available and the "modified"
> argument is true in register_fdwxact().
>

Thank you for reviewing this patch!

Your comments are incorporated in the latest patch set I recently sent[1].

[1] https://www.postgresql.org/message-id/CA%2Bfd4k5ZcDvoiY_5c-mF1oDACS5nUWS7ppoiOwjCOnM%2BgrJO-Q%40mail.gmail.com

Regards,

-- 
Masahiko Sawada  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2

2020-02-17 Thread Muhammad Usama
Hi Sawada San,

I have a couple of comments on 
"v27-0002-Support-atomic-commit-among-multiple-foreign-ser.patch"

1-  As part of the XLogReadRecord refactoring commit the signature of 
XLogReadRecord was changed,
so the function call to XLogReadRecord() needs a small adjustment.

i.e. In function XlogReadFdwXactData(XLogRecPtr lsn, char **buf, int *len)
...
-   record = XLogReadRecord(xlogreader, lsn, &errormsg);
+   XLogBeginRead(xlogreader, lsn);
+   record = XLogReadRecord(xlogreader, &errormsg);

2- In the register_fdwxact(..) function you set the XACT_FLAGS_FDWNOPREPARE
transaction flag whenever a register request comes in for a foreign server
that does not support two-phase commit, regardless of the value of the
'bool modified' argument. Later, in PreCommit_FdwXacts(), you error out
when "foreign_twophase_commit" is set to 'required' based only on the
XACT_FLAGS_FDWNOPREPARE flag, which I think is not correct.
There is a possibility that the transaction has only read from the foreign
servers (which are not capable of handling transactions or two-phase
commit), while all the other servers where we require an atomic commit are
capable of doing so.
Unless I am missing something obvious here, IMHO the
XACT_FLAGS_FDWNOPREPARE flag should only be set when the transaction
management/two-phase functionality is not available and the "modified"
argument is true in register_fdwxact().
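To illustrate the suggested rule, here is a minimal standalone sketch. The names XACT_FLAGS_FDWNOPREPARE and MyXactFlags mirror the patch, but the code below is a simplified illustration, not PostgreSQL source:

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the transaction-flags word in xact.c. */
#define XACT_FLAGS_FDWNOPREPARE (1U << 2)

static unsigned int MyXactFlags = 0;

/*
 * Set the "cannot prepare" flag only when a server that lacks two-phase
 * commit support was actually modified; read-only access to such a server
 * should not block an atomic commit among the remaining, capable servers.
 */
static void
register_fdwxact_sketch(bool server_supports_twophase, bool modified)
{
    if (!server_supports_twophase && modified)
        MyXactFlags |= XACT_FLAGS_FDWNOPREPARE;
}
```

With this rule, a transaction that only reads from a non-2PC server keeps the flag clear, so a commit-time check based on the flag would not reject it under foreign_twophase_commit = 'required'.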

Thanks

Best regards
Muhammad Usama
Highgo Software (Canada/China/Pakistan)

The new status of this patch is: Waiting on Author


Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2

2019-11-30 Thread Michael Paquier
On Wed, Sep 04, 2019 at 12:44:20PM +0900, Masahiko Sawada wrote:
> I forgot to include some new header files. Attached are the updated patches.

No reviews since then, and the patch does not apply anymore.  I am moving it
to the next CF, waiting on author.
--
Michael




Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2

2019-09-03 Thread Alvaro Herrera
Hello Sawada-san,

On 2019-Jul-02, Masahiko Sawada wrote:

> On Mon, Jul 1, 2019 at 8:32 PM Thomas Munro  wrote:

> > Can we please have a fresh rebase?
> 
> Thank you for the notice. Attached rebased patches.

... and again?

-- 
Álvaro Herrera  https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2

2019-07-01 Thread Thomas Munro
On Wed, Apr 17, 2019 at 10:23 PM Masahiko Sawada  wrote:
> Sorry for the very late reply. Attached are the updated version patches.

Hello Sawada-san,

Can we please have a fresh rebase?

Thanks,

-- 
Thomas Munro
https://enterprisedb.com




Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2

2019-02-03 Thread Michael Paquier
On Thu, Jan 31, 2019 at 11:09:09AM +0100, Masahiko Sawada wrote:
> Thanks. Actually I'm updating the patch set, changing the API interface as
> I proposed before and improving the documentation and README. I'll submit
> the latest patch next week.

Cool, I have moved the patch to the next CF.
--
Michael




Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2

2019-01-31 Thread Masahiko Sawada
On Tue, Jan 29, 2019 at 5:47 PM Ildar Musin  wrote:
>
> Hello,
>
> The patch needs a rebase as it doesn't apply to the current master. I applied
> it to an older commit to test it. It worked fine so far.

Thank you for testing the patch!

>
> I found one bug though which would cause the resolver to finish by timeout
> even though there are unresolved foreign transactions in the list. The
> `fdw_xact_exists()` function expects the database id as the first argument
> and the xid as the second, but everywhere it is called the arguments are
> specified in the opposite order (xid first, then dbid). Also, the function
> declaration in the header doesn't match its definition.

Will fix.

>
> There are some other things I found.
> * In `FdwXactResolveAllDanglingTransactions()` the variable `n_resolved` is
>   declared as bool but used as an integer.
> * In fdwxact.c's module comment the functions
>   `FdwXactRegisterForeignTransaction()` and
>   `FdwXactMarkForeignTransactionModified()` are mentioned but no longer
>   exist.
> * In the documentation (storage.sgml) there is no mention of the
>   `pg_fdw_xact` directory.
>
> Couple of stylistic notes.
> * In the `FdwXactCtlData` struct both camel case and snake case naming are
>   used.
> * In `get_fdw_xacts()`, `xid != InvalidTransactionId` can be replaced with
>   `TransactionIdIsValid(xid)`.
> * In `generate_fdw_xact_identifier()` the `fx` prefix could be part of the
>   format string instead of being processed by `sprintf` as an extra
>   argument.
>

I'll incorporate them in the next patch set.

> I'll continue looking into the patch. Thanks!

Thanks. Actually I'm updating the patch set, changing the API interface as
I proposed before and improving the documentation and README. I'll submit
the latest patch next week.


Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2

2019-01-29 Thread Ildar Musin
Hello,

The patch needs a rebase as it doesn't apply to the current master. I applied
it to an older commit to test it. It worked fine so far.

I found one bug though which would cause the resolver to finish by timeout
even though there are unresolved foreign transactions in the list. The
`fdw_xact_exists()` function expects the database id as the first argument
and the xid as the second, but everywhere it is called the arguments are
specified in the opposite order (xid first, then dbid). Also, the function
declaration in the header doesn't match its definition.

There are some other things I found.
* In `FdwXactResolveAllDanglingTransactions()` the variable `n_resolved` is
  declared as bool but used as an integer.
* In fdwxact.c's module comment the functions
  `FdwXactRegisterForeignTransaction()` and
  `FdwXactMarkForeignTransactionModified()` are mentioned but no longer
  exist.
* In the documentation (storage.sgml) there is no mention of the
  `pg_fdw_xact` directory.

Couple of stylistic notes.
* In the `FdwXactCtlData` struct both camel case and snake case naming are
  used.
* In `get_fdw_xacts()`, `xid != InvalidTransactionId` can be replaced with
  `TransactionIdIsValid(xid)`.
* In `generate_fdw_xact_identifier()` the `fx` prefix could be part of the
  format string instead of being processed by `sprintf` as an extra
  argument.
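For the last point, a minimal illustration (the identifier format shown is hypothetical; the patch's actual format may differ):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Illustrative only: the real generate_fdw_xact_identifier() in the patch
 * may use a different format. The point is that a constant "fx" prefix can
 * live in the format string itself instead of being passed as an argument.
 */
static void
make_fdwxact_id(char *buf, size_t buflen, unsigned xid, unsigned serverid)
{
    /* instead of: snprintf(buf, buflen, "%s_%u_%u", "fx", xid, serverid) */
    snprintf(buf, buflen, "fx_%u_%u", xid, serverid);
}
```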

I'll continue looking into the patch. Thanks!



On Tue, Nov 20, 2018 at 12:18 PM Masahiko Sawada wrote:

> On Thu, Nov 15, 2018 at 7:36 PM Masahiko Sawada 
> wrote:
> >
> > On Mon, Oct 29, 2018 at 6:03 PM Masahiko Sawada 
> wrote:
> > >
> > > On Mon, Oct 29, 2018 at 10:16 AM Masahiko Sawada <
> sawada.m...@gmail.com> wrote:
> > > >
> > > > On Wed, Oct 24, 2018 at 9:06 AM Masahiko Sawada <
> sawada.m...@gmail.com> wrote:
> > > > >
> > > > > On Tue, Oct 23, 2018 at 12:54 PM Kyotaro HORIGUCHI
> > > > >  wrote:
> > > > > >
> > > > > > Hello.
> > > > > >
> > > > > > # It took a long time to come here..
> > > > > >
> > > > > > At Fri, 19 Oct 2018 21:38:35 +0900, Masahiko Sawada <
> sawada.m...@gmail.com> wrote in
> 
> > > > > > > On Wed, Oct 10, 2018 at 1:34 PM Masahiko Sawada <
> sawada.m...@gmail.com> wrote:
> > > > > > ...
> > > > > > > * Updated docs, added the new section "Distributed
> Transaction" at
> > > > > > > Chapter 33 to explain the concept to users
> > > > > > >
> > > > > > > * Moved atomic commit codes into src/backend/access/fdwxact
> directory.
> > > > > > >
> > > > > > > * Some bug fixes.
> > > > > > >
> > > > > > > Please review them.
> > > > > >
> > > > > > I have some comments, with apologize in advance for possible
> > > > > > duplicate or conflict with others' comments so far.
> > > > >
> > > > > Thank you so much for reviewing this patch!
> > > > >
> > > > > >
> > > > > > 0001:
> > > > > >
> > > > > > This sets XACT_FLAG_WROTENONTEMPREL when RELPERSISTENT_PERMANENT
> > > > > > relation is modified. Isn't it needed when UNLOGGED tables are
> > > > > > modified? It may be better that we have dedicated classification
> > > > > > macro or function.
> > > > >
> > > > > I think even if we do atomic commit for modifying an UNLOGGED
> > > > > table and a remote table, the data will get inconsistent if the
> > > > > local server crashes. For example, if the local server crashes after
> > > > > preparing the transaction on the foreign server but before the local
> > > > > commit, we will lose all data of the local UNLOGGED table whereas
> > > > > the modification of the remote table is rolled back. In the case of
> > > > > persistent tables, data consistency is preserved. So I think keeping
> > > > > the data consistent between remote data and a local unlogged table
> > > > > is difficult, and I want to leave it as a restriction for now. Am I
> > > > > missing something?
> > > > >
> > > > > >
> > > > > > The flag is handled in heapam.c. I suppose that it should be done
> > > > > > in the upper layer considering coming pluggable storage.
> > > > > > (X_F_ACCESSEDTEMPREL is set in heapam, but..)
> > > > > >
> > > > >
> > > > > Yeah, or we can set the flag after heap_insert in ExecInsert.
> > > > >
> > > > > >
> > > > > > 0002:
> > > > > >
> > > > > > The name FdwXactParticipantsForAC doesn't sound good to me. How
> > > > > > about FdwXactAtomicCommitParticipants?
> > > > >
> > > > > +1, will fix it.
> > > > >
> > > > > >
> > > > > > Well, as the file comment of fdwxact.c,
> > > > > > FdwXactRegisterTransaction is called from FDW driver and
> > > > > > F_X_MarkForeignTransactionModified is called from executor. I
> > > > > > think that we should clarify who is responsible to the whole
> > > > > > sequence. Since the state of local tables affects, I suppose
> > > > > > executor is that. Couldn't we do the whole thing within executor
> > > > > > side?  I'm not sure but I feel that
> > > > > > F_X_RegisterForeignTransaction can be a part of
> > > > > > F_X_MarkForeignTransactionModified.  The callers of
> > > > > > MarkForeignTransactionModified can find whether the table is
> > > > > > involved in 2pc by 

Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2

2018-11-20 Thread Masahiko Sawada
On Thu, Nov 15, 2018 at 7:36 PM Masahiko Sawada  wrote:
>
> On Mon, Oct 29, 2018 at 6:03 PM Masahiko Sawada  wrote:
> >
> > On Mon, Oct 29, 2018 at 10:16 AM Masahiko Sawada  
> > wrote:
> > >
> > > On Wed, Oct 24, 2018 at 9:06 AM Masahiko Sawada  
> > > wrote:
> > > >
> > > > On Tue, Oct 23, 2018 at 12:54 PM Kyotaro HORIGUCHI
> > > >  wrote:
> > > > >
> > > > > Hello.
> > > > >
> > > > > # It took a long time to come here..
> > > > >
> > > > > At Fri, 19 Oct 2018 21:38:35 +0900, Masahiko Sawada 
> > > > >  wrote in 
> > > > > 
> > > > > > On Wed, Oct 10, 2018 at 1:34 PM Masahiko Sawada 
> > > > > >  wrote:
> > > > > ...
> > > > > > * Updated docs, added the new section "Distributed Transaction" at
> > > > > > Chapter 33 to explain the concept to users
> > > > > >
> > > > > > * Moved atomic commit codes into src/backend/access/fdwxact 
> > > > > > directory.
> > > > > >
> > > > > > * Some bug fixes.
> > > > > >
> > > > > > Please review them.
> > > > >
> > > > > I have some comments, with apologize in advance for possible
> > > > > duplicate or conflict with others' comments so far.
> > > >
> > > > Thank you so much for reviewing this patch!
> > > >
> > > > >
> > > > > 0001:
> > > > >
> > > > > This sets XACT_FLAG_WROTENONTEMPREL when RELPERSISTENT_PERMANENT
> > > > > relation is modified. Isn't it needed when UNLOGGED tables are
> > > > > modified? It may be better that we have dedicated classification
> > > > > macro or function.
> > > >
> > > > I think even if we do atomic commit for modifying an UNLOGGED
> > > > table and a remote table, the data will get inconsistent if the local
> > > > server crashes. For example, if the local server crashes after
> > > > preparing the transaction on the foreign server but before the local
> > > > commit, we will lose all data of the local UNLOGGED table whereas the
> > > > modification of the remote table is rolled back. In the case of
> > > > persistent tables, data consistency is preserved. So I think keeping
> > > > the data consistent between remote data and a local unlogged table is
> > > > difficult, and I want to leave it as a restriction for now. Am I
> > > > missing something?
> > > >
> > > > >
> > > > > The flag is handled in heapam.c. I suppose that it should be done
> > > > > in the upper layer considering coming pluggable storage.
> > > > > (X_F_ACCESSEDTEMPREL is set in heapam, but..)
> > > > >
> > > >
> > > > Yeah, or we can set the flag after heap_insert in ExecInsert.
> > > >
> > > > >
> > > > > 0002:
> > > > >
> > > > > The name FdwXactParticipantsForAC doesn't sound good to me. How
> > > > > about FdwXactAtomicCommitParticipants?
> > > >
> > > > +1, will fix it.
> > > >
> > > > >
> > > > > Well, as the file comment of fdwxact.c,
> > > > > FdwXactRegisterTransaction is called from FDW driver and
> > > > > F_X_MarkForeignTransactionModified is called from executor. I
> > > > > think that we should clarify who is responsible to the whole
> > > > > sequence. Since the state of local tables affects, I suppose
> > > > > executor is that. Couldn't we do the whole thing within executor
> > > > > side?  I'm not sure but I feel that
> > > > > F_X_RegisterForeignTransaction can be a part of
> > > > > F_X_MarkForeignTransactionModified.  The callers of
> > > > > MarkForeignTransactionModified can find whether the table is
> > > > > involved in 2pc by IsTwoPhaseCommitEnabled interface.
> > > >
> > > > Indeed. We can register foreign servers by executor while FDWs don't
> > > > need to register anything. I will remove the registration function so
> > > > that FDW developers don't need to call the register function but only
> > > > need to provide atomic commit APIs.
> > > >
> > > > >
> > > > >
> > > > > >   if (foreign_twophase_commit == true &&
> > > > > >   ((MyXactFlags & XACT_FLAGS_FDWNOPREPARE) != 0) )
> > > > > >   ereport(ERROR,
> > > > > >   
> > > > > > (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> > > > > >errmsg("cannot COMMIT a distributed 
> > > > > > transaction that has operated on foreign server that doesn't 
> > > > > > support atomic commit")));
> > > > >
> > > > > The error is emitted when a the GUC is turned off in the
> > > > > trasaction where MarkTransactionModify'ed. I think that the
> > > > > number of the variables' possible states should be reduced for
> > > > > simplicity. For example in the case, once foreign_twopase_commit
> > > > > is checked in a transaction, subsequent changes in the
> > > > > transaction should be ignored during the transaction.
> > > > >
> > > >
> > > > I might have not gotten your comment correctly but since the
> > > > foreign_twophase_commit is a PGC_USERSET parameter I think we need to
> > > > check it at commit time. Also we need to keep participant servers even
> > > > when foreign_twophase_commit is off if both max_prepared_foreign_xacts
> > > > and max_foreign_xact_resolvers are > 0.
> > > >
> > > > I will post the updated patch 

Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2

2018-10-23 Thread Masahiko Sawada
On Tue, Oct 23, 2018 at 12:54 PM Kyotaro HORIGUCHI
 wrote:
>
> Hello.
>
> # It took a long time to come here..
>
> At Fri, 19 Oct 2018 21:38:35 +0900, Masahiko Sawada  
> wrote in 
> > On Wed, Oct 10, 2018 at 1:34 PM Masahiko Sawada  
> > wrote:
> ...
> > * Updated docs, added the new section "Distributed Transaction" at
> > Chapter 33 to explain the concept to users
> >
> > * Moved atomic commit codes into src/backend/access/fdwxact directory.
> >
> > * Some bug fixes.
> >
> > Please review them.
>
> I have some comments, with apologies in advance for possible
> duplicates or conflicts with others' comments so far.

Thank you so much for reviewing this patch!

>
> 0001:
>
> This sets XACT_FLAG_WROTENONTEMPREL when RELPERSISTENT_PERMANENT
> relation is modified. Isn't it needed when UNLOGGED tables are
> modified? It may be better that we have dedicated classification
> macro or function.

I think even if we do atomic commit for modifying an UNLOGGED
table and a remote table, the data will get inconsistent if the local
server crashes. For example, if the local server crashes after
preparing the transaction on the foreign server but before the local
commit, we will lose all data of the local UNLOGGED table whereas the
modification of the remote table is rolled back. In the case of persistent
tables, data consistency is preserved. So I think keeping the data
consistent between remote data and a local unlogged table is difficult,
and I want to leave it as a restriction for now. Am I missing something?

>
> The flag is handled in heapam.c. I suppose that it should be done
> in the upper layer considering coming pluggable storage.
> (X_F_ACCESSEDTEMPREL is set in heapam, but..)
>

Yeah, or we can set the flag after heap_insert in ExecInsert.

>
> 0002:
>
> The name FdwXactParticipantsForAC doesn't sound good to me. How
> about FdwXactAtomicCommitParticipants?

+1, will fix it.

>
> Well, as the file comment of fdwxact.c says,
> FdwXactRegisterTransaction is called from the FDW driver and
> F_X_MarkForeignTransactionModified is called from the executor. I
> think that we should clarify who is responsible for the whole
> sequence. Since the state of local tables affects it, I suppose the
> executor is. Couldn't we do the whole thing within the executor
> side?  I'm not sure, but I feel that
> F_X_RegisterForeignTransaction can be a part of
> F_X_MarkForeignTransactionModified.  The callers of
> MarkForeignTransactionModified can find whether the table is
> involved in 2pc via the IsTwoPhaseCommitEnabled interface.

Indeed. We can register foreign servers from the executor while FDWs don't
need to register anything. I will remove the registration function so
that FDW developers don't need to call it and only
need to provide the atomic commit APIs.

>
>
> >   if (foreign_twophase_commit == true &&
> >   ((MyXactFlags & XACT_FLAGS_FDWNOPREPARE) != 0) )
> >   ereport(ERROR,
> >   (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> >errmsg("cannot COMMIT a distributed 
> > transaction that has operated on foreign server that doesn't support atomic 
> > commit")));
>
> The error is emitted when the GUC is turned off in the
> transaction where MarkTransactionModified was called. I think that the
> number of the variables' possible states should be reduced for
> simplicity. For example, in this case, once foreign_twophase_commit
> has been checked in a transaction, subsequent changes to it should be
> ignored for the rest of the transaction.
>

I might not have gotten your comment correctly, but since
foreign_twophase_commit is a PGC_USERSET parameter, I think we need to
check it at commit time. Also, we need to keep participant servers even
when foreign_twophase_commit is off if both max_prepared_foreign_xacts
and max_foreign_xact_resolvers are > 0.

I will post the updated patch this week.
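To illustrate why the check has to happen at commit time, here is a standalone sketch; the struct and function names are simplified for illustration and do not match the patch:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Simplified model of the commit-time decision discussed above. Since
 * foreign_twophase_commit is PGC_USERSET, its value is evaluated when the
 * transaction commits, not when participants are registered.
 */
typedef struct
{
    bool modified;          /* did we write to this server? */
    bool supports_twophase; /* can it PREPARE TRANSACTION? */
} Participant;

/*
 * Returns true if the distributed commit may proceed, false if it must be
 * rejected (the caller would raise an error in that case).
 */
static bool
precommit_check_sketch(const Participant *parts, int n,
                       bool foreign_twophase_required)
{
    for (int i = 0; i < n; i++)
    {
        if (foreign_twophase_required &&
            parts[i].modified && !parts[i].supports_twophase)
            return false;   /* would need 2PC but the server can't do it */
    }
    return true;
}
```

Because the GUC is read only inside this check, turning it on or off earlier in the transaction has no lingering effect, which is why participants must be kept around even while it is off.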

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2

2018-10-22 Thread Kyotaro HORIGUCHI
Hello.

# It took a long time to come here..

At Fri, 19 Oct 2018 21:38:35 +0900, Masahiko Sawada  
wrote in 
> On Wed, Oct 10, 2018 at 1:34 PM Masahiko Sawada  wrote:
...
> * Updated docs, added the new section "Distributed Transaction" at
> Chapter 33 to explain the concept to users
> 
> * Moved atomic commit codes into src/backend/access/fdwxact directory.
> 
> * Some bug fixes.
> 
> Please review them.

I have some comments, with apologies in advance for possible
duplicates or conflicts with others' comments so far.

0001:

This sets XACT_FLAG_WROTENONTEMPREL when RELPERSISTENT_PERMANENT
relation is modified. Isn't it needed when UNLOGGED tables are
modified? It may be better that we have dedicated classification
macro or function.

The flag is handled in heapam.c. I suppose that it should be done
in the upper layer considering coming pluggable storage.
(X_F_ACCESSEDTEMPREL is set in heapam, but..)


0002:

The name FdwXactParticipantsForAC doesn't sound good to me. How
about FdwXactAtomicCommitParticipants?

Well, as the file comment of fdwxact.c says,
FdwXactRegisterTransaction is called from the FDW driver and
F_X_MarkForeignTransactionModified is called from the executor. I
think that we should clarify who is responsible for the whole
sequence. Since the state of local tables affects it, I suppose the
executor is. Couldn't we do the whole thing within the executor
side?  I'm not sure, but I feel that
F_X_RegisterForeignTransaction can be a part of
F_X_MarkForeignTransactionModified.  The callers of
MarkForeignTransactionModified can find whether the table is
involved in 2pc via the IsTwoPhaseCommitEnabled interface.


>   if (foreign_twophase_commit == true &&
>   ((MyXactFlags & XACT_FLAGS_FDWNOPREPARE) != 0) )
>   ereport(ERROR,
>   (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>errmsg("cannot COMMIT a distributed 
> transaction that has operated on foreign server that doesn't support atomic 
> commit")));

The error is emitted when the GUC is turned off in the
transaction where MarkTransactionModified was called. I think that the
number of the variables' possible states should be reduced for
simplicity. For example, in this case, once foreign_twophase_commit
has been checked in a transaction, subsequent changes to it should be
ignored for the rest of the transaction.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2

2018-10-09 Thread Masahiko Sawada
On Wed, Oct 3, 2018 at 6:02 PM Chris Travers  wrote:
>
>
>
> On Wed, Oct 3, 2018 at 9:41 AM Chris Travers  wrote:
>>
>> The following review has been posted through the commitfest application:
>> make installcheck-world:  tested, failed
>> Implements feature:   not tested
>> Spec compliant:   not tested
>> Documentation:tested, failed
>>
>> I am hoping I am not out of order in writing this before the commitfest 
>> starts.  The patch is big and long and so wanted to start on this while 
>> traffic is slow.
>>
>> I find this patch quite welcome and very close to a minimum viable version.  
>> The few significant limitations can be resolved later.  One thing I may have 
>> missed in the documentation is a discussion of the limits of the current 
>> approach.  I think this would be important to document because the caveats 
>> of the current approach are significant, but the people who need it will 
>> have the knowledge to work with issues if they come up.
>>
>> The major caveat I see in our past discussions and (if I read the patch 
>> correctly) is that the resolver goes through global transactions 
>> sequentially and does not move on to the next until the previous one is 
>> resolved.  This means that if I have a global transaction on server A, with 
>> foreign servers B and C, and I have another one on server A with foreign 
>> servers C and D, if server B goes down at the wrong moment, the background 
>> worker does not look like it will detect the failure and move on to try to 
>> resolve the second, so server D will have a badly set vacuum horizon until 
>> this is resolved.  Also if I read the patch correctly, it looks like one can 
>> invoke SQL commands to remove the bad transaction to allow processing to 
>> continue and manual resolution (this is good and necessary because in this 
>> area there is no ability to have perfect recoverability without occasional 
>> administrative action).  I would really like to see more documentation of 
>> failure cases and appropriate administrative action at present.  Otherwise 
>> this is I think a minimum viable addition and I think we want it.
>>
>> It is possible I missed that in the documentation.  If so, my objection 
>> stands aside.  If it is welcome I am happy to take a first crack at such 
>> docs.
>

Thank you for reviewing the patch!

>
> After further testing I am pretty sure I misread the patch.  It looks like 
> one can have multiple resolvers which can, in fact, work through a queue 
> together solving this problem.  So the objection above is not valid and I 
> withdraw that objection.  I will re-review the docs in light of the 
> experience.

Actually the patch doesn't solve this problem; the foreign transaction
resolver processes distributed transactions sequentially. But since
one resolver process is responsible for one database, a backend
connected to another database can still complete its distributed
transactions. I understand your concern and agree that this
problem should be solved. I'll address it in the next patch.
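As a toy model of this sequential-processing limitation (entirely illustrative; the names and structures do not correspond to the patch):

```c
#include <assert.h>

/*
 * Each resolver owns one database's queue and works through it in order, so
 * a stuck entry blocks only that database's transactions, not those queued
 * for other databases (which have their own resolver).
 */
#define MAX_PENDING 8

typedef struct
{
    unsigned dbid;
    int      pending[MAX_PENDING]; /* queued foreign-transaction ids */
    int      npending;
    int      head;                 /* next entry to resolve */
} Resolver;

/*
 * Resolve entries in order; fail_fxid simulates a transaction whose foreign
 * server is unreachable. Processing stops there and the entry stays at the
 * head to be retried, blocking only this database's queue. Returns the
 * number of entries resolved.
 */
static int
resolver_run(Resolver *r, int fail_fxid)
{
    int done = 0;

    while (r->head < r->npending)
    {
        if (r->pending[r->head] == fail_fxid)
            break;          /* stuck: later entries in THIS queue wait */
        r->head++;
        done++;
    }
    return done;
}
```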

>
>>
>>
>> To my mind that's the only blocker in the code (but see below).  I can say 
>> without a doubt that I would expect we would use this feature once available.
>>
>> --
>>
>> Testing however failed.
>>
>> make installcheck-world fails with errors like the following:
>>
>>  -- Modify foreign server and raise an error
>>   BEGIN;
>>   INSERT INTO ft7_twophase VALUES(8);
>> + ERROR:  prepread foreign transactions are disabled
>> + HINT:  Set max_prepared_foreign_transactions to a nonzero value.
>>   INSERT INTO ft8_twophase VALUES(NULL); -- violation
>> ! ERROR:  current transaction is aborted, commands ignored until end of 
>> transaction block
>>   ROLLBACK;
>>   SELECT * FROM ft7_twophase;
>> ! ERROR:  prepread foreign transactions are disabled
>> ! HINT:  Set max_prepared_foreign_transactions to a nonzero value.
>>   SELECT * FROM ft8_twophase;
>> ! ERROR:  prepread foreign transactions are disabled
>> ! HINT:  Set max_prepared_foreign_transactions to a nonzero value.
>>   -- Rollback foreign transaction that involves both 2PC-capable
>>   -- and 2PC-non-capable foreign servers.
>>   BEGIN;
>>   INSERT INTO ft8_twophase VALUES(7);
>> + ERROR:  prepread foreign transactions are disabled
>> + HINT:  Set max_prepared_foreign_transactions to a nonzero value.
>>   INSERT INTO ft9_not_twophase VALUES(7);
>> + ERROR:  current transaction is aborted, commands ignored until end of 
>> transaction block
>>   ROLLBACK;
>>   SELECT * FROM ft8_twophase;
>> ! ERROR:  prepread foreign transactions are disabled
>> ! HINT:  Set max_prepared_foreign_transactions to a nonzero value.
>>
>> make installcheck in the contrib directory shows the same, so that's the 
>> easiest way of reproducing, at least on a new installation.  I think the 
>> test cases will have to handle that sort of setup.

'make installcheck' is a regression test mode that runs the tests against an
existing installation. If the installation 

Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2

2018-10-03 Thread Chris Travers
On Wed, Oct 3, 2018 at 9:41 AM Chris Travers wrote:

> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, failed
> Implements feature:   not tested
> Spec compliant:   not tested
> Documentation:tested, failed
>
> I am hoping I am not out of order in writing this before the commitfest
> starts.  The patch is big and long and so wanted to start on this while
> traffic is slow.
>
> I find this patch quite welcome and very close to a minimum viable
> version.  The few significant limitations can be resolved later.  One thing
> I may have missed in the documentation is a discussion of the limits of the
> current approach.  I think this would be important to document because the
> caveats of the current approach are significant, but the people who need it
> will have the knowledge to work with issues if they come up.
>
> The major caveat I see in our past discussions and (if I read the patch
> correctly) is that the resolver goes through global transactions
> sequentially and does not move on to the next until the previous one is
> resolved.  This means that if I have a global transaction on server A, with
> foreign servers B and C, and I have another one on server A with foreign
> servers C and D, if server B goes down at the wrong moment, the background
> worker does not look like it will detect the failure and move on to try to
> resolve the second, so server D will have a badly set vacuum horizon until
> this is resolved.  Also if I read the patch correctly, it looks like one
> can invoke SQL commands to remove the bad transaction to allow processing
> to continue and manual resolution (this is good and necessary because in
> this area there is no ability to have perfect recoverability without
> occasional administrative action).  I would really like to see more
> documentation of failure cases and appropriate administrative action at
> present.  Otherwise this is I think a minimum viable addition and I think
> we want it.
>
> It is possible I missed that in the documentation.  If so, my objection
> stands aside.  If it is welcome I am happy to take a first crack at such
> docs.
>

After further testing I am pretty sure I misread the patch.  It looks like
one can have multiple resolvers which can, in fact, work through a queue
together solving this problem.  So the objection above is not valid and I
withdraw that objection.  I will re-review the docs in light of the
experience.


>
> To my mind that's the only blocker in the code (but see below).  I can say
> without a doubt that I would expect we would use this feature once
> available.
>
> --
>
> Testing however failed.
>
> make installcheck-world fails with errors like the following:
>
>  -- Modify foreign server and raise an error
>   BEGIN;
>   INSERT INTO ft7_twophase VALUES(8);
> + ERROR:  prepread foreign transactions are disabled
> + HINT:  Set max_prepared_foreign_transactions to a nonzero value.
>   INSERT INTO ft8_twophase VALUES(NULL); -- violation
> ! ERROR:  current transaction is aborted, commands ignored until end of
> transaction block
>   ROLLBACK;
>   SELECT * FROM ft7_twophase;
> ! ERROR:  prepread foreign transactions are disabled
> ! HINT:  Set max_prepared_foreign_transactions to a nonzero value.
>   SELECT * FROM ft8_twophase;
> ! ERROR:  prepread foreign transactions are disabled
> ! HINT:  Set max_prepared_foreign_transactions to a nonzero value.
>   -- Rollback foreign transaction that involves both 2PC-capable
>   -- and 2PC-non-capable foreign servers.
>   BEGIN;
>   INSERT INTO ft8_twophase VALUES(7);
> + ERROR:  prepread foreign transactions are disabled
> + HINT:  Set max_prepared_foreign_transactions to a nonzero value.
>   INSERT INTO ft9_not_twophase VALUES(7);
> + ERROR:  current transaction is aborted, commands ignored until end of
> transaction block
>   ROLLBACK;
>   SELECT * FROM ft8_twophase;
> ! ERROR:  prepread foreign transactions are disabled
> ! HINT:  Set max_prepared_foreign_transactions to a nonzero value.
>
> make installcheck in the contrib directory shows the same, so that's the
> easiest way of reproducing, at least on a new installation.  I think the
> test cases will have to handle that sort of setup.
>
> make check in the contrib directory passes.
>
> For reasons of test failures, I am setting this back to waiting on author.
>
> --
> I had a few other thoughts that I figure are worth sharing with the
> community on this patch with the idea that once it is in place, this may
> open up more options for collaboration in the area of federated and
> distributed storage generally.  I could imagine other foreign data wrappers
> using this API, and folks might want to refactor out the atomic handling
> part so that extensions that do not use the foreign data wrapper structure
> could use it as well (while this looks like a classic SQL/MED issue, I am
> not sure that only foreign data wrappers would be interested in the API).

Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2

2018-10-03 Thread Chris Travers
On Wed, Oct 3, 2018 at 9:56 AM Chris Travers 
wrote:

>
>
> On Wed, Oct 3, 2018 at 9:41 AM Chris Travers 
> wrote:
>
>>
>> (errmsg("preparing foreign transactions
> (max_prepared_foreign_transactions > 0) requires maX_foreign_xact_resolvers
> > 0")));
>

Two more critical notes here which I think are small blockers.

The error message above references a config variable that does not exist.

The correct name of the config parameter is
max_foreign_transaction_resolvers

Setting that, along with the following, to 10 caused the tests to pass, but
again it fails on default configs:

max_prepared_foreign_transactions, max_prepared_transactions
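For anyone reproducing this, a sketch of the postgresql.conf settings implied
above -- the parameter names are exactly the ones quoted in this thread, and
the value 10 is simply the value reported to make the tests pass, not a
recommendation:

```
max_prepared_foreign_transactions = 10
max_foreign_transaction_resolvers = 10
max_prepared_transactions = 10
```

Note that max_prepared_transactions would need to be set on the remote
(foreign) side as well, since its default of 0 disables PREPARE TRANSACTION
there.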

>
>
>
> --
> Best Regards,
> Chris Travers
> Head of Database
>
> Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
> Saarbrücker Straße 37a, 10405 Berlin
>
>

-- 
Best Regards,
Chris Travers
Head of Database

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin


Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2

2018-10-03 Thread Chris Travers
On Wed, Oct 3, 2018 at 9:41 AM Chris Travers 
wrote:

> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, failed
> Implements feature:   not tested
> Spec compliant:   not tested
> Documentation:tested, failed
>

Also one really minor point:  I think this is a typo (maX vs max)?

(errmsg("preparing foreign transactions (max_prepared_foreign_transactions
> 0) requires maX_foreign_xact_resolvers > 0")));




-- 
Best Regards,
Chris Travers
Head of Database

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin


Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2

2018-10-03 Thread Chris Travers
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:   not tested
Spec compliant:   not tested
Documentation:tested, failed

I am hoping I am not out of order in writing this before the commitfest starts.
The patch is big and long, so I wanted to start on this while traffic is slow.

I find this patch quite welcome and very close to a minimum viable version.  
The few significant limitations can be resolved later.  One thing I may have 
missed in the documentation is a discussion of the limits of the current 
approach.  I think this would be important to document because the caveats of 
the current approach are significant, but the people who need it will have the 
knowledge to work with issues if they come up.

The major caveat I see in our past discussions and (if I read the patch 
correctly) is that the resolver goes through global transactions sequentially 
and does not move on to the next until the previous one is resolved.  This 
means that if I have a global transaction on server A, with foreign servers B 
and C, and I have another one on server A with foreign servers C and D, if 
server B goes down at the wrong moment, the background worker does not look 
like it will detect the failure and move on to try to resolve the second, so 
server D will have a badly set vacuum horizon until this is resolved.  Also if 
I read the patch correctly, it looks like one can invoke SQL commands to remove 
the bad transaction to allow processing to continue and manual resolution (this 
is good and necessary because in this area there is no ability to have perfect 
recoverability without occasional administrative action).  I would really like 
to see more documentation of failure cases and appropriate administrative 
action at present.  Otherwise this is I think a minimum viable addition and I 
think we want it.

It is possible I missed that in the documentation.  If so, my objection stands 
aside.  If it is welcome I am happy to take a first crack at such docs.

To my mind that's the only blocker in the code (but see below).  I can say 
without a doubt that I would expect we would use this feature once available.

--

Testing however failed.

make installcheck-world fails with errors like the following:

 -- Modify foreign server and raise an error
  BEGIN;
  INSERT INTO ft7_twophase VALUES(8);
+ ERROR:  prepread foreign transactions are disabled
+ HINT:  Set max_prepared_foreign_transactions to a nonzero value.
  INSERT INTO ft8_twophase VALUES(NULL); -- violation
! ERROR:  current transaction is aborted, commands ignored until end of 
transaction block
  ROLLBACK;
  SELECT * FROM ft7_twophase;
! ERROR:  prepread foreign transactions are disabled
! HINT:  Set max_prepared_foreign_transactions to a nonzero value.
  SELECT * FROM ft8_twophase;
! ERROR:  prepread foreign transactions are disabled
! HINT:  Set max_prepared_foreign_transactions to a nonzero value.
  -- Rollback foreign transaction that involves both 2PC-capable
  -- and 2PC-non-capable foreign servers.
  BEGIN;
  INSERT INTO ft8_twophase VALUES(7);
+ ERROR:  prepread foreign transactions are disabled
+ HINT:  Set max_prepared_foreign_transactions to a nonzero value.
  INSERT INTO ft9_not_twophase VALUES(7);
+ ERROR:  current transaction is aborted, commands ignored until end of 
transaction block
  ROLLBACK;
  SELECT * FROM ft8_twophase;
! ERROR:  prepread foreign transactions are disabled
! HINT:  Set max_prepared_foreign_transactions to a nonzero value.

make installcheck in the contrib directory shows the same, so that's the 
easiest way of reproducing, at least on a new installation.  I think the test 
cases will have to handle that sort of setup.

make check in the contrib directory passes.

For reasons of test failures, I am setting this back to waiting on author.

--
I had a few other thoughts that I figure are worth sharing with the community 
on this patch with the idea that once it is in place, this may open up more 
options for collaboration in the area of federated and distributed storage 
generally.  I could imagine other foreign data wrappers using this API, and 
folks might want to refactor out the atomic handling part so that extensions 
that do not use the foreign data wrapper structure could use it as well (while 
this looks like a classic SQL/MED issue, I am not sure that only foreign data 
wrappers would be interested in the API).

The new status of this patch is: Waiting on Author


Re: [HACKERS] Transactions involving multiple postgres foreign servers, take 2

2018-10-02 Thread Michael Paquier
On Fri, Aug 03, 2018 at 05:52:24PM +0900, Masahiko Sawada wrote:
> I attached the updated version patch as the previous versions conflict
> with the current HEAD.

Please note that the latest patch set does not apply anymore, so this
patch is moved to next CF, waiting on author.
--
Michael


signature.asc
Description: PGP signature


Re: [HACKERS] Transactions involving multiple postgres foreign servers

2018-06-05 Thread Masahiko Sawada
On Sat, May 26, 2018 at 12:25 AM, Robert Haas  wrote:
> On Fri, May 18, 2018 at 11:21 AM, Masahiko Sawada  
> wrote:
>> Regarding the API design, should we use 2PC for a distributed
>> transaction if both 2PC-capable foreign servers and a 2PC-non-capable
>> foreign server are involved in it?  Or should we end up with an error?
>> The 2PC-non-capable server might be one that has 2PC functionality but
>> disables it, or one that doesn't have it at all.
>
> It seems to me that this is functionality that many people will not
> want to use.  First, doing a PREPARE and then a COMMIT for each FDW
> write transaction is bound to be more expensive than just doing a
> COMMIT.  Second, because the default value of
> max_prepared_transactions is 0, this can only work at all if special
> configuration has been done on the remote side.  Because of the second
> point in particular, it seems to me that the default for this new
> feature must be "off".  It would make no sense to ship a default configuration
> of PostgreSQL that doesn't work with the default configuration of
> postgres_fdw, and I do not think we want to change the default value
> of max_prepared_transactions.  It was changed from 5 to 0 a number of
> years back for good reason.

I'm not sure that many people will not want to use this feature,
because it seems to me that many people don't want to use a database
that lacks transaction atomicity. But I agree that this feature should
not be enabled by default, just as we disable 2PC by default.

>
> So, I think the question could be broadened a bit: how you enable this
> feature if you want it, and what happens if you want it but it's not
> available for your choice of FDW?  One possible enabling method is a
> GUC (e.g. foreign_twophase_commit).  It could be true/false, with true
> meaning use PREPARE for all FDW writes and fail if that's not
> supported, or it could be three-valued, like require/prefer/disable,
> with require throwing an error if PREPARE support is not available and
> prefer using PREPARE where available but without failing when it isn't
> available.  Another possibility could be to make it an FDW option,
> possibly capable of being set at multiple levels (e.g. server or
> foreign table).  If any FDW involved in the transaction demands
> distributed 2PC semantics then the whole transaction must have those
> semantics or it fails.  I was previously leaning toward the latter
> approach, but I guess now the former approach is sounding better.  I'm
> not totally certain I know what's best here.
>

I agree that the former is better. That way, we can also control the
parameter at the transaction level. If we allowed the 'prefer'
behavior, we would need to manage not only 2PC-capable foreign servers
but also 2PC-non-capable ones, which would require all FDWs to call
the registration function. So I think a two-valued parameter would be
better.

BTW, sorry for the delay in submitting the updated patch. I'll post it
this week, but I'd like to share the new API design beforehand.

The APIs I'd like to add are four functions and one registration function:
PrepareForeignTransaction, CommitForeignTransaction,
RollbackForeignTransaction, IsTwoPhaseCommitEnabled and
FdwXactRegisterForeignServer. All FDWs that want to support atomic
commit have to implement all of these APIs and call the registration
function when a foreign transaction opens.

The transaction processing sequence with atomic commit will be as follows.

1. The FDW begins a transaction on a 2PC-capable foreign server.
2. The FDW registers the foreign server, with or without a foreign
transaction identifier, by calling FdwXactRegisterForeignServer().
   * The passed foreign transaction identifier can be NULL. If it's
     NULL, the core code constructs it like 'fx_<4 random
     chars>__'.
   * Providing the foreign transaction identifier at the beginning of
     the transaction is useful because some DBMSs, such as Oracle
     Database or MySQL, require a transaction identifier at the
     beginning of an XA transaction.
   * Registering the foreign transaction guarantees that the
     transaction is controlled by the core and that the APIs are
     called at the appropriate times.
3. Perform 1 and 2 whenever the distributed transaction opens a
transaction on a 2PC-capable foreign server.
   * When the distributed transaction modifies a foreign server, we
     mark it as 'modified'.
   * This mark is used at commit to check whether it's necessary to use 2PC.
   * At the same time, we also check whether the foreign server enables
     2PC by calling IsTwoPhaseCommitEnabled().
   * If an FDW disables or doesn't provide that function, we set
     XACT_FLAGS_FDWNOPREPARE. This is necessary because we need to
     remember that we wrote to a 2PC-non-capable foreign server.
   * When the distributed transaction modifies a local temp table, we
     set XACT_FLAGS_WROTENONTEMPREL.
   * This is used at commit to check whether it's necessary to use 2PC as well.
4. During pre-commit, we prepare all foreign transactions if 2PC is
required by calling 

Re: [HACKERS] Transactions involving multiple postgres foreign servers

2018-05-25 Thread Robert Haas
On Fri, May 18, 2018 at 11:21 AM, Masahiko Sawada  wrote:
> Regarding the API design, should we use 2PC for a distributed
> transaction if both 2PC-capable foreign servers and a 2PC-non-capable
> foreign server are involved in it?  Or should we end up with an error?
> The 2PC-non-capable server might be one that has 2PC functionality but
> disables it, or one that doesn't have it at all.

It seems to me that this is functionality that many people will not
want to use.  First, doing a PREPARE and then a COMMIT for each FDW
write transaction is bound to be more expensive than just doing a
COMMIT.  Second, because the default value of
max_prepared_transactions is 0, this can only work at all if special
configuration has been done on the remote side.  Because of the second
point in particular, it seems to me that the default for this new
feature must be "off".  It would make no sense to ship a default configuration
of PostgreSQL that doesn't work with the default configuration of
postgres_fdw, and I do not think we want to change the default value
of max_prepared_transactions.  It was changed from 5 to 0 a number of
years back for good reason.

So, I think the question could be broadened a bit: how you enable this
feature if you want it, and what happens if you want it but it's not
available for your choice of FDW?  One possible enabling method is a
GUC (e.g. foreign_twophase_commit).  It could be true/false, with true
meaning use PREPARE for all FDW writes and fail if that's not
supported, or it could be three-valued, like require/prefer/disable,
with require throwing an error if PREPARE support is not available and
prefer using PREPARE where available but without failing when it isn't
available.  Another possibility could be to make it an FDW option,
possibly capable of being set at multiple levels (e.g. server or
foreign table).  If any FDW involved in the transaction demands
distributed 2PC semantics then the whole transaction must have those
semantics or it fails.  I was previously leaning toward the latter 
approach, but I guess now the former approach is sounding better.  I'm
not totally certain I know what's best here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



RE: [HACKERS] Transactions involving multiple postgres foreign servers

2018-05-23 Thread Tsunakawa, Takayuki
From: Masahiko Sawada [mailto:sawada.m...@gmail.com]
> > I'm for the latter.  That is, COMMIT or PREPARE TRANSACTION statement
> issued from an application reports an error.
> 
> I'm not sure that we should end up with an error in such case, but if
> we want then we can raise an error when the transaction tries to
> modify 2PC-non-capable server after modified 2PC-capable server.

Such early reporting would be better, but I wonder if we can handle the 
opposite order: update data on a 2PC-capable server after a 2PC-non-capable 
server.  If it's not easy or efficient, I think it's enough to report the error 
at COMMIT and PREPARE TRANSACTION, just like we report "ERROR:  cannot PREPARE 
a transaction that has operated on temporary tables" at PREPARE TRANSACTION.


> Honestly I'm not sure we should use atomic commit by default at this
> point. Because it also means to change default behavior though the
> existing users use them without 2PC. But I think control of global
> transaction atomicity by GUC parameter would be a good idea. For
> example, synchronous_commit = 'global' makes backends wait for
> transaction to be resolved globally before returning to the user.

Regarding the incompatibility of default behavior, Postgres has a history of 
pursuing correctness and reducing confusion, as with the handling of backslash 
characters in strings and the automatic type casts below.

Non-character data types are no longer automatically cast to TEXT (Peter, Tom)
Previously, if a non-character value was supplied to an operator or function 
that requires text input, it was automatically cast to text, for most (though 
not all) built-in data types. This no longer happens: an explicit cast to text 
is now required for all non-character-string types. ... The reason for the 
change is that these automatic casts too often caused surprising behavior.


> I might not understand your comment correctly but the current patch is
> implemented in such way. The patch introduces new FDW APIs:
> PrepareForeignTransaction, EndForeignTransaction,
> ResolvePreparedForeignTransaction and GetPrepareId. The postgres core
> calls each API at the appropriate timing while managing each foreign
> transactions. FDWs that don't support 2PC set the function pointers of
> them to NULL.

Ouch, you are right.


> Also, the current API design might not fit databases other than
> PostgreSQL. For example, in MySQL we have to start an XA transaction
> explicitly using XA START, whereas PostgreSQL can prepare a
> transaction that was started by BEGIN TRANSACTION. So in MySQL a
> global transaction id is required at the beginning of the XA
> transaction. And we have to execute XA END before we prepare the
> transaction or commit it in one phase. So it would be better to
> define the APIs according to X/Open XA in order to make them more
> general.

I thought of:

* Put the functions that implement xa_prepare(), xa_commit() and xa_rollback() 
in libxa.so or libpq.so.
* PrepareForeignTransaction and EndForeignTransaction for postgres_fdw call 
them.

I just meant code reuse for Postgres.  But this is only my intuition, so 
never mind.

Regards
Takayuki Tsunakawa



Re: [HACKERS] Transactions involving multiple postgres foreign servers

2018-05-22 Thread Masahiko Sawada
On Mon, May 21, 2018 at 10:42 AM, Tsunakawa, Takayuki
 wrote:
> From: Masahiko Sawada [mailto:sawada.m...@gmail.com]
>> Regarding the API design, should we use 2PC for a distributed
>> transaction if both 2PC-capable foreign servers and a 2PC-non-capable
>> foreign server are involved in it?  Or should we end up with an error?
>> The 2PC-non-capable server might be one that has 2PC functionality but
>> disables it, or one that doesn't have it at all.
>
>>but I think we also could take
>> the latter way because it doesn't make sense for user even if the
>> transaction commit atomically among not all participants.
>
> I'm for the latter.  That is, COMMIT or PREPARE TRANSACTION statement issued 
> from an application reports an error.

I'm not sure that we should end up with an error in such a case, but if
we want to, we can raise an error when the transaction tries to modify a
2PC-non-capable server after having modified a 2PC-capable server.

> DBMS, particularly relational DBMS (, and even more particularly Postgres?) 
> places high value on data correctness.  So I think transaction atomicity 
> should be preserved, at least by default.  If we preferred updatability and 
> performance to data correctness, why don't we change the default value of 
> synchronous_commit to off in favor of performance?  On the other hand, if we 
> want to allow 1PC commit when not all FDWs support 2PC, we can add a new GUC 
> parameter like "allow_nonatomic_commit = on", just like synchronous_commit 
> and fsync trade-offs data correctness and performance.

Honestly, I'm not sure we should use atomic commit by default at this
point, because it would change the default behavior for existing users
who run without 2PC. But I think controlling global transaction
atomicity by a GUC parameter would be a good idea. For example,
synchronous_commit = 'global' would make backends wait for the
transaction to be resolved globally before returning to the user.

>
>
>> Also, regardless whether we take either way  I think it would be
>> better to manage not only 2PC transaction but also non-2PC transaction
>> in the core and add two_phase_commit argument. I think we can use it
>> without breaking existing FDWs. Currently FDWs manage transactions
>> using XactCallback but new APIs being added also manage transactions.
>> I think it might be better if users use either way (using XactCallback
>> or using new APIs) for transaction management rather than use both
>> ways with combination. Otherwise two codes for transaction management
>> will be required: the code that manages foreign transactions using
>> XactCallback for non-2PC transactions and code that manages them using
>> new APIs for 2PC transactions. That would not be easy for FDW
>> developers. So what I imagined for new API is that if FDW developers
>> use new APIs they can use both 2PC and non-2PC transaction, but if
>> they use XactCallback they can use only non-2PC transaction.
>> Any thoughts?
>
> If we add new functions, can't we just add functions whose names are 
> straightforward like PrepareTransaction() and CommitTransaction()?  FDWs 
> without 2PC support returns NULL for the function pointer of 
> PrepareTransaction().
>
> This is similar to XA: XA requires each RM to provide function pointers for 
> xa_prepare() and xa_commit().  If we go this way, maybe we could leverage the 
> artifact of postgres_fdw to create the XA library for C/C++.  I mean we put 
> transaction control functions in the XA library, and postgres_fdw also uses 
> it.  i.e.:
>
> postgres_fdw.so -> libxa.so -> libpq.so
>  \-/

I might not understand your comment correctly, but the current patch is
implemented in such a way. The patch introduces new FDW APIs:
PrepareForeignTransaction, EndForeignTransaction,
ResolvePreparedForeignTransaction and GetPrepareId. The postgres core
calls each API at the appropriate timing while managing each foreign
transaction. FDWs that don't support 2PC set those function pointers
to NULL.

Also, the current API design might not fit databases other than
PostgreSQL. For example, in MySQL we have to start an XA transaction
explicitly using XA START, whereas PostgreSQL can prepare a
transaction that was started by BEGIN TRANSACTION. So in MySQL a
global transaction id is required at the beginning of the XA
transaction. And we have to execute XA END before we prepare the
transaction or commit it in one phase. So it would be better to define
the APIs according to X/Open XA in order to make them more general.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Transactions involving multiple postgres foreign servers

2018-05-21 Thread Ashutosh Bapat
On Fri, May 11, 2018 at 12:27 AM, Robert Haas  wrote:
> On Tue, Feb 27, 2018 at 2:21 AM, Masahiko Sawada  
> wrote:
>> I might be missing your point. As for API breaking, this patch doesn't
>> break any existing FDWs. All new APIs I proposed are dedicated to 2PC.
>> In other words, FDWs that work today can continue working after this
>> patch gets committed, but if FDWs want to support atomic commit then
>> they should be updated to use new APIs. The reason why the calling of
>> FdwXactRegisterForeignServer is necessary is that the core code
>> controls the foreign servers that involved with the transaction but
>> the whether each foreign server uses 2PC option (two_phase_commit) is
>> known only on FDW code. We can eliminate the necessity of calling
>> FdwXactRegisterForeignServer by moving 2PC option from fdw-level to
>> server-level in order not to enforce calling the registering function
>> on FDWs. If we did that, the user can use the 2PC option as a
>> server-level option.
>
> Well, FdwXactRegisterForeignServer has a "bool two_phase_commit"
> argument.  If it only needs to be called by FDWs that support 2PC,
> then that argument is unnecessary.

An FDW may support 2PC even when a particular foreign server created
using that FDW does not. Without that argument, all the foreign servers
created using a given FDW would need to support 2PC, which may not be
possible.

> If it needs to be called by all
> FDWs, then it breaks existing FDWs that don't call it.
>

That's true. FDWs that do not want to use this facility can just pass
false, without any need for further change.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



RE: [HACKERS] Transactions involving multiple postgres foreign servers

2018-05-20 Thread Tsunakawa, Takayuki
From: Masahiko Sawada [mailto:sawada.m...@gmail.com]
> Regarding the API design, should we use 2PC for a distributed
> transaction if both 2PC-capable foreign servers and a 2PC-non-capable
> foreign server are involved in it?  Or should we end up with an error?
> The 2PC-non-capable server might be one that has 2PC functionality but
> disables it, or one that doesn't have it at all.

>but I think we also could take
> the latter way because it doesn't make sense for user even if the
> transaction commit atomically among not all participants.

I'm for the latter.  That is, COMMIT or PREPARE TRANSACTION statement issued 
from an application reports an error.  DBMS, particularly relational DBMS (, 
and even more particularly Postgres?) places high value on data correctness.  
So I think transaction atomicity should be preserved, at least by default.  If 
we preferred updatability and performance to data correctness, why don't we 
change the default value of synchronous_commit to off in favor of performance?  
On the other hand, if we want to allow 1PC commit when not all FDWs support 
2PC, we can add a new GUC parameter like "allow_nonatomic_commit = on", just 
like synchronous_commit and fsync trade-offs data correctness and performance.


> Also, regardless whether we take either way  I think it would be
> better to manage not only 2PC transaction but also non-2PC transaction
> in the core and add two_phase_commit argument. I think we can use it
> without breaking existing FDWs. Currently FDWs manage transactions
> using XactCallback but new APIs being added also manage transactions.
> I think it might be better if users use either way (using XactCallback
> or using new APIs) for transaction management rather than use both
> ways with combination. Otherwise two codes for transaction management
> will be required: the code that manages foreign transactions using
> XactCallback for non-2PC transactions and code that manages them using
> new APIs for 2PC transactions. That would not be easy for FDW
> developers. So what I imagined for new API is that if FDW developers
> use new APIs they can use both 2PC and non-2PC transaction, but if
> they use XactCallback they can use only non-2PC transaction.
> Any thoughts?

If we add new functions, can't we just add functions whose names are 
straightforward like PrepareTransaction() and CommitTransaction()?  FDWs 
without 2PC support returns NULL for the function pointer of 
PrepareTransaction().

This is similar to XA: XA requires each RM to provide function pointers for 
xa_prepare() and xa_commit().  If we go this way, maybe we could leverage the 
artifact of postgres_fdw to create the XA library for C/C++.  I mean we put 
transaction control functions in the XA library, and postgres_fdw also uses it. 
 i.e.:

postgres_fdw.so -> libxa.so -> libpq.so
 \-/


Regards
Takayuki Tsunakawa



Re: [HACKERS] Transactions involving multiple postgres foreign servers

2018-05-18 Thread Masahiko Sawada
On Fri, May 11, 2018 at 9:56 AM, Masahiko Sawada  wrote:
> Thank you for the comment.
>
> On Fri, May 11, 2018 at 3:57 AM, Robert Haas  wrote:
>> On Tue, Feb 27, 2018 at 2:21 AM, Masahiko Sawada  
>> wrote:
>>> I might be missing your point. As for API breaking, this patch doesn't
>>> break any existing FDWs. All new APIs I proposed are dedicated to 2PC.
>>> In other words, FDWs that work today can continue working after this
>>> patch gets committed, but if FDWs want to support atomic commit then
>>> they should be updated to use new APIs. The reason why the calling of
>>> FdwXactRegisterForeignServer is necessary is that the core code
>>> controls the foreign servers that involved with the transaction but
>>> the whether each foreign server uses 2PC option (two_phase_commit) is
>>> known only on FDW code. We can eliminate the necessity of calling
>>> FdwXactRegisterForeignServer by moving 2PC option from fdw-level to
>>> server-level in order not to enforce calling the registering function
>>> on FDWs. If we did that, the user can use the 2PC option as a
>>> server-level option.
>>
>> Well, FdwXactRegisterForeignServer has a "bool two_phase_commit"
>> argument.  If it only needs to be called by FDWs that support 2PC,
>> then that argument is unnecessary.  If it needs to be called by all
>> FDWs, then it breaks existing FDWs that don't call it.
>>
>
> I understood now. For now since FdwXactRegisterForeignServer needs to
> be called by only FDWs that support 2PC, I will remove the argument.

Regarding the API design, should we use 2PC for a distributed
transaction if both 2PC-capable foreign servers and a 2PC-non-capable
foreign server are involved in it?  Or should we end up with an error?
The 2PC-non-capable server might be one that has 2PC functionality but
disables it, or one that doesn't have it at all.

If we use 2PC, transaction atomicity will be satisfied only among the
2PC-capable servers, which might be just a subset of all participants.
If we don't, we use 1PC instead in such a case, so that whenever 2PC is
used the transaction always satisfies atomicity among all participants.
The current patch takes the former approach (though it doesn't allow
the PREPARE case), but I think we could also take the latter, because a
transaction that commits atomically among only some of the participants
doesn't make sense for the user.

Also, regardless of which way we take, I think it would be better to
manage not only 2PC transactions but also non-2PC transactions in the
core, and to add a two_phase_commit argument. I think we can do this
without breaking existing FDWs. Currently FDWs manage transactions
using XactCallback, but the new APIs being added also manage
transactions. I think it might be better if users use one way or the
other (XactCallback or the new APIs) for transaction management,
rather than a combination of both. Otherwise, two code paths for
transaction management will be required: code that manages foreign
transactions using XactCallback for non-2PC transactions, and code
that manages them using the new APIs for 2PC transactions. That would
not be easy for FDW developers. So what I imagine for the new APIs is
that if FDW developers use the new APIs they can use both 2PC and
non-2PC transactions, but if they use XactCallback they can use only
non-2PC transactions.
Any thoughts?

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Transactions involving multiple postgres foreign servers

2018-05-10 Thread Robert Haas
On Tue, Feb 27, 2018 at 2:21 AM, Masahiko Sawada  wrote:
> I might be missing your point. As for API breakage, this patch doesn't
> break any existing FDWs. All the new APIs I proposed are dedicated to
> 2PC. In other words, FDWs that work today can continue working after
> this patch gets committed, but if an FDW wants to support atomic commit
> then it should be updated to use the new APIs. The reason calling
> FdwXactRegisterForeignServer is necessary is that the core code
> controls the foreign servers involved with the transaction, but
> whether each foreign server uses the 2PC option (two_phase_commit) is
> known only to the FDW code. We could eliminate the need to call
> FdwXactRegisterForeignServer by moving the 2PC option from the FDW
> level to the server level, so that FDWs are not forced to call the
> registration function. If we did that, the user could set the 2PC
> option as a server-level option.

Well, FdwXactRegisterForeignServer has a "bool two_phase_commit"
argument.  If it only needs to be called by FDWs that support 2PC,
then that argument is unnecessary.  If it needs to be called by all
FDWs, then it breaks existing FDWs that don't call it.

>>> But this is currently the FDW's responsibility. I should move it to the
>>> resolver side. That is, the FDW can raise an error in the ordinary way,
>>> but the core code should catch and process it.
>>
>> I don't understand exactly what you mean here.
>
> Hmm, I think I got confused. My understanding is that logging a
> complaint in the commit case, and not doing so in the abort case, when
> the prepared transaction doesn't exist is the core code's job. The API
> contract here is that the FDW raises an error with the
> ERRCODE_UNDEFINED_OBJECT error code if there is no such transaction.
> Since that is an expected case during abort for the fdwxact manager,
> the core code can regard the error as not an actual problem.

In general, it's not safe to catch an error and continue unless you
protect the code that throws the error by a sub-transaction.  That
means we shouldn't expect the FDW to throw an error when the prepared
transaction isn't found and then just have the core code ignore the
error.  Instead the FDW should return normally and, if the core code
needs to know whether the prepared transaction was found, then the FDW
should indicate this through a return value, not an ERROR.

> Or do you mean that FDWs should not raise an error if the prepared
> transaction does not exist, so that the core code doesn't need to check
> the sqlstate in case of error?

Right.  As noted above, that's unsafe, so we shouldn't do it.

>>> You're right. Perhaps we can deal with it by having PrescanFdwXacts run
>>> until we reach a consistent point, and then have vac_update_datfrozenxid
>>> check the local xids of unresolved fdwxacts to determine the new
>>> datfrozenxid. Since the local xids of unresolved fdwxacts are not
>>> relevant to vacuuming, we don't need to include them in snapshots,
>>> GetOldestXmin, etc. Also, we can hint the user to resolve fdwxacts when
>>> nearing wraparound.
>>
>> I agree with you about snapshots, but I'm not sure about GetOldestXmin.
>
> Hmm, although I've thought about possible problems in the case where we
> don't consider the local xids of unresolved fdwxacts in GetOldestXmin,
> I could not find one. Could you share your concern if you have one?
> I'll try to find a possible problem based on it.

I don't remember exactly what I was thinking when I wrote that, but I
think the point is that GetOldestXmin() does a bunch of things other
than control the threshold for VACUUM, and we'd need to study them all
and look for problems.  For example, it won't do for an XID to get
reused while it's still associated with an unresolved fdwxact.  We
therefore certainly need such XIDs to hold back the cluster-wide
threshold for clog truncation in some manner -- and maybe that
involves GetOldestXmin().  Or maybe not.  But anyway the point,
broadly considered, is that GetOldestXmin() is used in various ways,
and I don't know if we've thought through all of the consequences in
regard to this new feature.

Can I ask what your time frame is for updating this patch?
Considering how much work appears to remain, if you want to get this
committed to v12, it would be best to get started as early as
possible.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Transactions involving multiple postgres foreign servers

2018-03-28 Thread Masahiko Sawada
On Thu, Mar 29, 2018 at 2:27 AM, David Steele  wrote:
> On 2/27/18 2:21 AM, Masahiko Sawada wrote:
>>
>> Hmm, although I've thought about possible problems in the case where we
>> don't consider the local xids of unresolved fdwxacts in GetOldestXmin,
>> I could not find one. Could you share your concern if you have one?
>> I'll try to find a possible problem based on it.
>
> It appears that this entry should be marked Waiting on Author so I have
> done that.
>
> I also think it may be time to move this patch to the next CF.
>

I agree to move this patch to the next CF.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Transactions involving multiple postgres foreign servers

2018-03-28 Thread David Steele
On 2/27/18 2:21 AM, Masahiko Sawada wrote:
> 
> Hmm, although I've thought about possible problems in the case where we
> don't consider the local xids of unresolved fdwxacts in GetOldestXmin,
> I could not find one. Could you share your concern if you have one?
> I'll try to find a possible problem based on it.

It appears that this entry should be marked Waiting on Author so I have
done that.

I also think it may be time to move this patch to the next CF.

Regards,
-- 
-David
da...@pgmasters.net



Re: [HACKERS] Transactions involving multiple postgres foreign servers

2018-02-26 Thread Masahiko Sawada
On Wed, Feb 21, 2018 at 6:07 AM, Robert Haas  wrote:
> On Tue, Feb 13, 2018 at 5:42 AM, Masahiko Sawada wrote:
>>> The fdw-transactions section of the documentation seems to imply that
>>> henceforth every FDW must call FdwXactRegisterForeignServer, which I
>>> think is an unacceptable API break.
>>>
>>> It doesn't seem advisable to make this behavior depend on
>>> max_prepared_foreign_transactions.  I think that it should be an
>>> server-level option: use 2PC for this server, or not?  FDWs that don't
>>> have 2PC default to "no"; but others can be set to "yes" if the user
>>> wishes.  But we shouldn't force that decision to be on a cluster-wide
>>> basis.
>>
>> Since I've added a new option two_phase_commit to postgres_fdw, we need
>> to ask the FDW whether the foreign server is 2PC-capable or not in
>> order to register the foreign server information. That's why the patch
>> asks FDWs to call FdwXactRegisterForeignServer. However, we could
>> register foreign server information automatically in the executor (e.g.
>> at BeginDirectModify and BeginForeignModify) if the foreign server
>> itself carried that information. We could add a two_phase_commit_enabled
>> column to the pg_foreign_server system catalog, set to true if the
>> foreign server is 2PC-capable (i.e. has the required functions) and the
>> user wants to use it.
>
> I don't see why this would need a new catalog column.

I might be missing your point. As for API breakage, this patch doesn't
break any existing FDWs. All the new APIs I proposed are dedicated to
2PC. In other words, FDWs that work today can continue working after
this patch gets committed, but if an FDW wants to support atomic commit
then it should be updated to use the new APIs. The reason calling
FdwXactRegisterForeignServer is necessary is that the core code
controls the foreign servers involved with the transaction, but
whether each foreign server uses the 2PC option (two_phase_commit) is
known only to the FDW code. We could eliminate the need to call
FdwXactRegisterForeignServer by moving the 2PC option from the FDW
level to the server level, so that FDWs are not forced to call the
registration function. If we did that, the user could set the 2PC
option as a server-level option.

>
>>> But that brings up another issue: why is MyFdwConnections named that
>>> way and why does it have those semantics?  That is, why do we really
>>> need a list of every FDW connection? I think we really only need the
>>> ones that are 2PC-capable writes.  If a non-2PC-capable foreign server
>>> is involved in the transaction, then we don't really to keep it in a
>>> list.  We just need to flag the transaction as not locally prepareable
>>> i.e. clear TwoPhaseReady.  I think that this could all be figured out
>>> in a much less onerous way: if we ever perform a modification of a
>>> foreign table, have nodeModifyTable.c either mark the transaction
>>> non-preparable by setting XACT_FLAGS_FDWNOPREPARE if the foreign
>>> server is not 2PC capable, or otherwise add the appropriate
>>> information to MyFdwConnections, which can then be renamed to indicate
>>> that it contains only information about preparable stuff.  Then you
>>> don't need each individual FDW to be responsible about calling some
>>> new function; the core code just does the right thing automatically.
>>
>> I could not quite get this comment. Did you mean that a foreign
>> transaction on a non-2PC-capable foreign server should be ended in the
>> same way as before (i.e. by XactCallback)?
>>
>> Currently, because there is no FDW API to end a foreign transaction,
>> most FDWs use XactCallbacks to end the transaction. But once the new
>> FDW APIs are introduced, I think it's better to use them to end
>> transactions rather than XactCallbacks. Otherwise we end up with FDW
>> APIs for 2PC (prepare and resolve) plus XactCallback for ending the
>> transaction, which would be hard to understand. So I've changed the
>> foreign transaction management so that the core code explicitly asks
>> the FDW to end/prepare a foreign transaction instead of each FDW
>> ending it on its own. With the new FDW APIs, the core code has
>> information about all foreign servers involved with the transaction
>> and can call each API at the appropriate time.
>
> Well, it's one thing to introduce a new API.  It's another thing to
> require existing FDWs to be updated to use it.  There are a lot of
> existing FDWs out there, and I think that it is needlessly unfriendly
> to force them all to be updated for v11 (or whenever this gets
> committed) even if we think the new API is clearly better.  FDWs that
> work today should continue working after this patch is committed.

Agreed.

> Separately, I think there's a question of whether the new API is in
> fact better -- I'm not sure I have a completely well-formed opinion
> about that yet.

I think one API should do one job. In terms of keeping the API simple,
the current four APIs 

Re: [HACKERS] Transactions involving multiple postgres foreign servers

2018-02-20 Thread Robert Haas
On Tue, Feb 13, 2018 at 5:42 AM, Masahiko Sawada  wrote:
>> The fdw-transactions section of the documentation seems to imply that
>> henceforth every FDW must call FdwXactRegisterForeignServer, which I
>> think is an unacceptable API break.
>>
>> It doesn't seem advisable to make this behavior depend on
>> max_prepared_foreign_transactions.  I think that it should be an
>> server-level option: use 2PC for this server, or not?  FDWs that don't
>> have 2PC default to "no"; but others can be set to "yes" if the user
>> wishes.  But we shouldn't force that decision to be on a cluster-wide
>> basis.
>
> Since I've added a new option two_phase_commit to postgres_fdw, we need
> to ask the FDW whether the foreign server is 2PC-capable or not in
> order to register the foreign server information. That's why the patch
> asks FDWs to call FdwXactRegisterForeignServer. However, we could
> register foreign server information automatically in the executor (e.g.
> at BeginDirectModify and BeginForeignModify) if the foreign server
> itself carried that information. We could add a two_phase_commit_enabled
> column to the pg_foreign_server system catalog, set to true if the
> foreign server is 2PC-capable (i.e. has the required functions) and the
> user wants to use it.

I don't see why this would need a new catalog column.

>> But that brings up another issue: why is MyFdwConnections named that
>> way and why does it have those semantics?  That is, why do we really
>> need a list of every FDW connection? I think we really only need the
>> ones that are 2PC-capable writes.  If a non-2PC-capable foreign server
>> is involved in the transaction, then we don't really to keep it in a
>> list.  We just need to flag the transaction as not locally prepareable
>> i.e. clear TwoPhaseReady.  I think that this could all be figured out
>> in a much less onerous way: if we ever perform a modification of a
>> foreign table, have nodeModifyTable.c either mark the transaction
>> non-preparable by setting XACT_FLAGS_FDWNOPREPARE if the foreign
>> server is not 2PC capable, or otherwise add the appropriate
>> information to MyFdwConnections, which can then be renamed to indicate
>> that it contains only information about preparable stuff.  Then you
>> don't need each individual FDW to be responsible about calling some
>> new function; the core code just does the right thing automatically.
>
> I could not quite get this comment. Did you mean that a foreign
> transaction on a non-2PC-capable foreign server should be ended in the
> same way as before (i.e. by XactCallback)?
>
> Currently, because there is no FDW API to end a foreign transaction,
> most FDWs use XactCallbacks to end the transaction. But once the new
> FDW APIs are introduced, I think it's better to use them to end
> transactions rather than XactCallbacks. Otherwise we end up with FDW
> APIs for 2PC (prepare and resolve) plus XactCallback for ending the
> transaction, which would be hard to understand. So I've changed the
> foreign transaction management so that the core code explicitly asks
> the FDW to end/prepare a foreign transaction instead of each FDW
> ending it on its own. With the new FDW APIs, the core code has
> information about all foreign servers involved with the transaction
> and can call each API at the appropriate time.

Well, it's one thing to introduce a new API.  It's another thing to
require existing FDWs to be updated to use it.  There are a lot of
existing FDWs out there, and I think that it is needlessly unfriendly
to force them all to be updated for v11 (or whenever this gets
committed) even if we think the new API is clearly better.  FDWs that
work today should continue working after this patch is committed.

Separately, I think there's a question of whether the new API is in
fact better -- I'm not sure I have a completely well-formed opinion
about that yet.

> In FdwXactResolveForeignTransaction(), the resolver concludes the fate
> of the transaction by looking at the status of the fdwxact entry and
> the state of the local transaction in clog. What I need to do is make
> that function log a complaint in the commit case if the prepared
> transaction could not be found, and not do so in the abort case.

+1.

> Also, postgres_fdw doesn't raise an error even if we could not find the
> prepared transaction on the foreign server, because it might have been
> resolved by another process.

+1.

> But this is currently the FDW's responsibility. I should move it to the
> resolver side. That is, the FDW can raise an error in the ordinary way,
> but the core code should catch and process it.

I don't understand exactly what you mean here.

> You're right. Perhaps we can deal with it by having PrescanFdwXacts run
> until we reach a consistent point, and then have vac_update_datfrozenxid
> check the local xids of unresolved fdwxacts to determine the new
> datfrozenxid. Since the local xids of unresolved fdwxacts are not
> relevant to vacuuming, we don't need to include them in snapshots,
> GetOldestXmin, etc. Also, we can hint the user to resolve fdwxacts when
> nearing wraparound.

Re: [HACKERS] Transactions involving multiple postgres foreign servers

2018-02-13 Thread Masahiko Sawada
On Sat, Feb 10, 2018 at 4:08 AM, Robert Haas  wrote:
> On Thu, Feb 8, 2018 at 3:58 AM, Masahiko Sawada  wrote:
>>> Overall, what's the status of this patch?  Are we hung up on this
>>> issue only, or are there other things?
>>
>> AFAIK there are no more technical issues in this patch so far other
>> than this one. The patch has tests and docs, and includes everything
>> needed to support atomic commit for distributed transactions: it
>> introduces the atomic commit capability for distributed transactions
>> along with the corresponding FDW APIs, and makes postgres_fdw support
>> 2PC. I think this patch needs review, especially the foreign
>> transaction resolution functionality, which was redesigned earlier.
>
> OK.  I'm going to give 0002 a read-through now, but I think it would
> be a good thing if others also contributed to the review effort.
> There is a lot of code here, and there are a lot of other patches
> competing for attention.  That said, here we go:

I appreciate your review. I'll thoroughly update the whole patch based
on your comments and suggestions. Here are my answers, questions, and
understanding.

>
> The fdw-transactions section of the documentation seems to imply that
> henceforth every FDW must call FdwXactRegisterForeignServer, which I
> think is an unacceptable API break.
>
> It doesn't seem advisable to make this behavior depend on
> max_prepared_foreign_transactions.  I think that it should be an
> server-level option: use 2PC for this server, or not?  FDWs that don't
> have 2PC default to "no"; but others can be set to "yes" if the user
> wishes.  But we shouldn't force that decision to be on a cluster-wide
> basis.

Since I've added a new option two_phase_commit to postgres_fdw, we need
to ask the FDW whether the foreign server is 2PC-capable or not in
order to register the foreign server information. That's why the patch
asks FDWs to call FdwXactRegisterForeignServer. However, we could
register foreign server information automatically in the executor (e.g.
at BeginDirectModify and BeginForeignModify) if the foreign server
itself carried that information. We could add a two_phase_commit_enabled
column to the pg_foreign_server system catalog, set to true if the
foreign server is 2PC-capable (i.e. has the required functions) and the
user wants to use it.

>
> But that brings up another issue: why is MyFdwConnections named that
> way and why does it have those semantics?  That is, why do we really
> need a list of every FDW connection? I think we really only need the
> ones that are 2PC-capable writes.  If a non-2PC-capable foreign server
> is involved in the transaction, then we don't really to keep it in a
> list.  We just need to flag the transaction as not locally prepareable
> i.e. clear TwoPhaseReady.  I think that this could all be figured out
> in a much less onerous way: if we ever perform a modification of a
> foreign table, have nodeModifyTable.c either mark the transaction
> non-preparable by setting XACT_FLAGS_FDWNOPREPARE if the foreign
> server is not 2PC capable, or otherwise add the appropriate
> information to MyFdwConnections, which can then be renamed to indicate
> that it contains only information about preparable stuff.  Then you
> don't need each individual FDW to be responsible about calling some
> new function; the core code just does the right thing automatically.

I could not quite get this comment. Did you mean that a foreign
transaction on a non-2PC-capable foreign server should be ended in the
same way as before (i.e. by XactCallback)?

Currently, because there is no FDW API to end a foreign transaction,
most FDWs use XactCallbacks to end the transaction. But once the new
FDW APIs are introduced, I think it's better to use them to end
transactions rather than XactCallbacks. Otherwise we end up with FDW
APIs for 2PC (prepare and resolve) plus XactCallback for ending the
transaction, which would be hard to understand. So I've changed the
foreign transaction management so that the core code explicitly asks
the FDW to end/prepare a foreign transaction instead of each FDW
ending it on its own. With the new FDW APIs, the core code has
information about all foreign servers involved with the transaction
and can call each API at the appropriate time.

>
> +if (!fdw_conn->end_foreign_xact(fdw_conn->serverid, fdw_conn->userid,
> +fdw_conn->umid, true))
> +elog(WARNING, "could not commit transaction on server %s",
> + fdw_conn->servername);
>
> First, as I noted above, this assumes that the local transaction can't
> fail after this point, which is certainly false -- if nothing else,
> think about a plug pull.  Second, it assumes that the right thing to
> do would be to throw a WARNING, which I think is false: if this is
> running in the pre-commit phase, it's not too late to switch to the
> abort path, and certainly if we make changes on only 1 server 

Re: [HACKERS] Transactions involving multiple postgres foreign servers

2018-02-09 Thread Robert Haas
On Thu, Feb 8, 2018 at 3:58 AM, Masahiko Sawada  wrote:
>> Overall, what's the status of this patch?  Are we hung up on this
>> issue only, or are there other things?
>
> AFAIK there are no more technical issues in this patch so far other
> than this one. The patch has tests and docs, and includes everything
> needed to support atomic commit for distributed transactions: it
> introduces the atomic commit capability for distributed transactions
> along with the corresponding FDW APIs, and makes postgres_fdw support
> 2PC. I think this patch needs review, especially the foreign
> transaction resolution functionality, which was redesigned earlier.

OK.  I'm going to give 0002 a read-through now, but I think it would
be a good thing if others also contributed to the review effort.
There is a lot of code here, and there are a lot of other patches
competing for attention.  That said, here we go:

In the documentation for pg_prepared_fdw_xacts, the first two columns
have descriptions ending in a preposition.  That's typically to be
avoided in formal writing.  The first one can be fixed by moving "in"
before "which".  The second can be fixed by changing "that" to "with
which" and then dropping the trailing "with".  The first three columns
have descriptions ending in a period; the latter two do not.  Make it
consistent with whatever the surrounding style is, or at least
internally consistent if the surrounding style varies.  Also, some of
the descriptions begin with "the" and others do not; again, seems
better to be consistent and adhere to surrounding style.  The
documentation of the serverid column seems to make no sense.  Possibly
you mean "OID of the foreign server on which this foreign transaction
is prepared"?  As it is, you use "foreign server" twice, which is why
I'm confused.

The documentation of max_prepared_foreign_transactions seems a bit
brief.  I think that if I want to be able to use 2PC for N
transactions each across K servers, this variable needs to be set to
N*K, not just N.  That's not the right way to word it here, but
somehow you probably want to explain that a single local transaction
can give rise to multiple foreign transactions and that this should be
taken into consideration when setting a value for this variable.
Maybe also include a link to where the user can find more information,
which brings me to another point: there doesn't seem to be any
general, user-facing explanation of this system.  You explain the
catalogs, the GUCs, the interface, etc. but there's nothing anywhere
that explains the overall point of the whole thing, which seems pretty
important. The closest thing you've got is a description for people
writing FDWs, but we really need a place to explain the concept to
*users*.  One idea is to add a new chapter to the "Server
Administration" section, maybe just after "Logical Replication" and
before "Regression Tests".  But I'm open to other ideas.

It's important that the documentation of the various GUCs provide
users with some clue as to how to set them.  I notice this
particularly for foreign_transaction_resolution_interval; off the top
of my head, without having read the rest of this patch, I don't know
why I shouldn't want this to be zero.  But the others could use more
explanation as well.

It is unclear from reading the documentation for GetPreparedId why
this should be the responsibility of the FDW, or how the FDW is
supposed to guarantee uniqueness.

PrepareForignTransaction is spelled wrong.  Nearby typos: prepareing,
tranasaction.  A bit further down, "can _prepare" has an unwanted
space in the middle.  Various places in this section capitalize
certain words in the middle of sentences which is not standard English
style.  For example, in "Every Foreign Data Wrapper is required..."
the word "Every" is appropriately capitalized because it begins a
sentence, but there's no reason to capitalize the others.  Likewise
for "...employs Two-phase commit protocol..." and other similar cases.

EndForeignTransaction doesn't explain what sorts of things the FDW can
legally do when called, or how this method is intended to be used.
Those seem like important details.  Especially, an FDW that tries to
do arbitrary stuff during abort cleanup will probably cause big
problems.

The fdw-transactions section of the documentation seems to imply that
henceforth every FDW must call FdwXactRegisterForeignServer, which I
think is an unacceptable API break.

It doesn't seem advisable to make this behavior depend on
max_prepared_foreign_transactions.  I think that it should be an
server-level option: use 2PC for this server, or not?  FDWs that don't
have 2PC default to "no"; but others can be set to "yes" if the user
wishes.  But we shouldn't force that decision to be on a cluster-wide
basis.

+ shows the functions
+available for foreign transaction managements.

management

+Resolve a foreign transaction. This function search foreign transaction

search -> searches 

Re: [HACKERS] Transactions involving multiple postgres foreign servers

2018-02-07 Thread Robert Haas
On Tue, Jan 9, 2018 at 9:49 AM, Masahiko Sawada  wrote:
>> If I understand correctly, XactLastRecEnd can be set by, for example,
>> a HOT cleanup record, so that doesn't seem like a good thing to use.
>
> Yes, that's right.
>
>> Whether we need to use 2PC across remote nodes seems like it shouldn't
>> depend on whether a local SELECT statement happened to do a HOT
>> cleanup or not.
>
> So I think we need to check if the top transaction is invalid or not as well.

Even if you check both, it doesn't sound like it really does what you
want.  Won't you still end up partially dependent on whether a HOT
cleanup happened, if not in quite the same way as before?  How about
defining a new bit in MyXactFlags for XACT_FLAGS_WROTENONTEMPREL?
Just have heap_insert, heap_update, and heap_delete do something like:

if (RelationNeedsWAL(relation))
MyXactFlags |= XACT_FLAGS_WROTENONTEMPREL;

Overall, what's the status of this patch?  Are we hung up on this
issue only, or are there other things?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Transactions involving multiple postgres foreign servers

2018-01-09 Thread Robert Haas
On Wed, Dec 27, 2017 at 9:40 PM, Tsunakawa, Takayuki wrote:
> (1)
> Why don't you use the existing global variable MyXactFlags instead of the new 
> TransactionDidWrite?  Or, how about using XactLastRecEnd != 0 to determine 
> the transaction did any writes?  When the transaction only modified temporary 
> tables on the local database and some data on one remote database, I think 
> 2pc is unnecessary.

If I understand correctly, XactLastRecEnd can be set by, for example,
a HOT cleanup record, so that doesn't seem like a good thing to use.
Whether we need to use 2PC across remote nodes seems like it shouldn't
depend on whether a local SELECT statement happened to do a HOT
cleanup or not.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Transactions involving multiple postgres foreign servers

2018-01-08 Thread Masahiko Sawada
On Mon, Jan 1, 2018 at 7:12 PM, Ashutosh Bapat wrote:
> On Thu, Dec 28, 2017 at 11:08 AM, Masahiko Sawada wrote:
>>
>>> (1)
>>> Why don't you use the existing global variable MyXactFlags instead of the
>>> new TransactionDidWrite?  Or, how about using XactLastRecEnd != 0 to
>>> determine whether the transaction did any writes?  When the transaction only
>>> modified temporary tables on the local database and some data on one remote
>>> database, I think 2PC is unnecessary.
>>
>> Perhaps we can use (XactLastRecEnd != 0 && markXidCommitted) to see
>> whether we did any writes on the local node that require atomic commit.
>> Will fix.
>>
>
> I haven't checked how much code it takes to track whether the local
> transaction wrote anything. But we can probably postpone this
> optimization. If it's easy to incorporate, it would be good to have in
> the first set itself.
>

Without tracking local writes, we always have to use two-phase commit
even when the transaction writes data on only one foreign server. That
would cause unnecessary performance degradation and transaction
failures on existing systems. We can enable two-phase commit per
foreign server with ALTER SERVER, but that affects other transactions.
If we could configure it per transaction, perhaps this optimization
would be worth postponing.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Transactions involving multiple postgres foreign servers

2018-01-01 Thread Ashutosh Bapat
On Thu, Dec 28, 2017 at 11:08 AM, Masahiko Sawada  wrote:
>
>> (1)
>> Why don't you use the existing global variable MyXactFlags instead of the
>> new TransactionDidWrite?  Or, how about using XactLastRecEnd != 0 to
>> determine whether the transaction did any writes?  When the transaction only
>> modified temporary tables on the local database and some data on one remote
>> database, I think 2PC is unnecessary.
>
> Perhaps we can use (XactLastRecEnd != 0 && markXidCommitted) to see
> whether we did any writes on the local node that require atomic commit.
> Will fix.
>

I haven't checked how much code it takes to track whether the local
transaction wrote anything. But we can probably postpone this
optimization. If it's easy to incorporate, it would be good to have in
the first set itself.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



Re: [HACKERS] Transactions involving multiple postgres foreign servers

2017-12-27 Thread Masahiko Sawada
On Thu, Dec 28, 2017 at 11:40 AM, Tsunakawa, Takayuki wrote:
> From: Masahiko Sawada [mailto:sawada.m...@gmail.com]
>> I've updated the documentation of the patches and fixed some bugs. I did
>> some failure tests of this feature using a fault-simulation tool[1] for
>> PostgreSQL that I created.
>>
>> The 0001 patch adds a mechanism to keep track of writes on the local
>> server. This is required to determine whether we should use 2PC at
>> commit. The 0002 patch is the main part: it adds a distributed
>> transaction manager (currently only for atomic commit), APIs for 2PC,
>> and a foreign transaction resolver process. The 0003 patch makes
>> postgres_fdw support atomic commit using 2PC.
>>
>> Please review patches.
>
> I'd like to join the review and testing of this functionality.  First, some 
> comments after taking a quick look at 0001:

Thank you so much!

> (1)
> Why don't you use the existing global variable MyXactFlags instead of the new 
> TransactionDidWrite?  Or, how about using XactLastRecEnd != 0 to determine 
> the transaction did any writes?  When the transaction only modified temporary 
> tables on the local database and some data on one remote database, I think 
> 2pc is unnecessary.

Perhaps we can use (XactLastRecEnd != 0 && markXidCommitted) to see if
we did any writes on the local node that require atomic commit. Will
fix.

>
> (2)
> If TransactionDidWrite is necessary, I don't think you need to provide setter 
> functions, because other transaction state variables are accessed globally 
> without getter/setters.  And you didn't create getter function for 
> TransactionDidWrite.
>
> (3)
> heap_multi_insert() doesn't modify TransactionDidWrite.  Is it sufficient to 
> just remember heap modifications?  Are other modifications on the coordinator 
> node covered, such as TRUNCATE and REINDEX?

I think using (XactLastRecEnd != 0 && markXidCommitted) to check
whether we did any writes on the local node would be better. After that
change, I will be able to address all of the above concerns.

>
>
> Questions before looking at 0002 and 0003:
>
> Q1: Does this functionality work when combined with XA 2pc transactions?

All transactions, both the local transaction and the foreign
transactions, are prepared at PREPARE. They are then committed or
rolled back by the foreign transaction resolver process at
COMMIT/ROLLBACK PREPARED.

> Q2: Does the atomic commit cascade across multiple remote databases? For 
> example:
> * The local transaction modifies data on remote database 1 via a foreign 
> table.
> * A trigger fires on the remote database 1, which modifies data on remote 
> database 2 via a foreign table.
> * The local transaction commits.

I haven't tested more complex failure situations yet, but as far as I
have tested in my environment, the cascading atomic commit works. I'll
test these cases more deeply.

Regards,

Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



RE: [HACKERS] Transactions involving multiple postgres foreign servers

2017-12-27 Thread Tsunakawa, Takayuki
From: Masahiko Sawada [mailto:sawada.m...@gmail.com]
> I've updated documentation of patches, and fixed some bugs. I did some
> failure tests of this feature using a fault simulation tool[1] for
> PostgreSQL that I created.
> 
> 0001 patch adds a mechanism to track of writes on local server. This is
> required to determine whether we should use 2pc at commit. 0002 patch is
> the main part. It adds a distributed transaction manager (currently only
> for atomic commit), APIs for 2pc and foreign transaction manager resolver
> process. 0003 patch makes postgres_fdw support atomic commit using 2pc.
> 
> Please review patches.

I'd like to join the review and testing of this functionality.  First, some 
comments after taking a quick look at 0001:

(1)
Why don't you use the existing global variable MyXactFlags instead of the new 
TransactionDidWrite?  Or, how about using XactLastRecEnd != 0 to determine 
whether the transaction did any writes?  When the transaction only modified temporary 
tables on the local database and some data on one remote database, I think 2pc 
is unnecessary.

(2)
If TransactionDidWrite is necessary, I don't think you need to provide setter 
functions, because other transaction state variables are accessed globally 
without getter/setters.  And you didn't create a getter function for 
TransactionDidWrite.

(3)
heap_multi_insert() doesn't modify TransactionDidWrite.  Is it sufficient to 
just remember heap modifications?  Are other modifications on the coordinator 
node covered, such as TRUNCATE and REINDEX?


Questions before looking at 0002 and 0003:

Q1: Does this functionality work when combined with XA 2pc transactions?

Q2: Does the atomic commit cascade across multiple remote databases? For 
example:
* The local transaction modifies data on remote database 1 via a foreign table.
* A trigger fires on the remote database 1, which modifies data on remote 
database 2 via a foreign table.
* The local transaction commits.

Regards
Takayuki Tsunakawa



Re: [HACKERS] Transactions involving multiple postgres foreign servers

2017-12-12 Thread Masahiko Sawada
On Wed, Dec 13, 2017 at 12:03 AM, Robert Haas  wrote:
> On Mon, Dec 11, 2017 at 5:20 AM, Masahiko Sawada  
> wrote:
>>> The question I have is how would we deal with a foreign server that is
>>> not available for longer duration due to crash, longer network outage
>>> etc. Example is the foreign server crashed/got disconnected after
>>> PREPARE but before COMMIT/ROLLBACK was issued. The backend will remain
>>> blocked for much longer duration without user having an idea of what's
>>> going on. May be we should add some timeout.
>>
>> After more thought, I agree with adding some timeout. I can image
>> there are users who want the timeout, for example, who cannot accept
>> even a few seconds latency. If the timeout occurs backend unlocks the
>> foreign transactions and breaks the loop. The resolver process will
>> keep to continue to resolve foreign transactions at certain interval.
>
> I don't think a timeout is a very good idea.  There is no timeout for
> synchronous replication and the issues here are similar.  I will not
> try to block a patch adding a timeout, but I think it had better be
> disabled by default and have very clear documentation explaining why
> it's really dangerous.  And this is why: with no timeout, you can
> count on being able to see the effects of your own previous
> transactions, unless at some point you sent a query cancel or got
> disconnected.  With a timeout, you may or may not see the effects of
> your own previous transactions depending on whether or not you hit the
> timeout, which you have no sure way of knowing.
>
>>>> transactions after the coordinator server recovered. On the other
>>>> hand, for the reading a consistent result on such situation by
>>>> subsequent reads, for example, we can disallow backends to inquiry SQL
>>>> to the foreign server if a foreign transaction of the foreign server
>>>> is remained.
>>>
>>> +1 for the last sentence. If we do that, we don't need the backend to
>>> be blocked by resolver since a subsequent read accessing that foreign
>>> server would get an error and not inconsistent data.
>>
>> Yeah, however the disadvantage of this is that we manage foreign
>> transactions per foreign servers. If a transaction that modified even
>> one table is remained as a in-doubt transaction, we cannot issue any
>> SQL that touches that foreign server. Can we occur an error at
>> ExecInitForeignScan()?
>
> I really feel strongly we shouldn't complicate the initial patch with
> this kind of thing.  Let's make it enough for this patch to guarantee
> that either all parts of the transaction commit eventually or they all
> abort eventually.  Ensuring consistent visibility is a different and
> hard project, and if we try to do that now, this patch is not going to
> be done any time soon.
>

Thank you for the suggestion.

I was really wondering whether we should add a timeout to this feature.
It's a common concern to want a timeout around a critical section, but
currently we have no such timeout for either synchronous replication or
WAL writes. I can imagine there will be users who want a timeout for
such cases, but it obviously makes this feature more complex. Anyway,
even if we add a timeout, we can do it as a separate patch and feature,
so I'd like to keep it simple as a first step. This patch guarantees
that the transaction either commits or rolls back on all foreign
servers, unless the user cancels.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Transactions involving multiple postgres foreign servers

2017-12-12 Thread Robert Haas
On Mon, Dec 11, 2017 at 5:20 AM, Masahiko Sawada  wrote:
>> The question I have is how would we deal with a foreign server that is
>> not available for longer duration due to crash, longer network outage
>> etc. Example is the foreign server crashed/got disconnected after
>> PREPARE but before COMMIT/ROLLBACK was issued. The backend will remain
>> blocked for much longer duration without user having an idea of what's
>> going on. May be we should add some timeout.
>
> After more thought, I agree with adding some timeout. I can image
> there are users who want the timeout, for example, who cannot accept
> even a few seconds latency. If the timeout occurs backend unlocks the
> foreign transactions and breaks the loop. The resolver process will
> keep to continue to resolve foreign transactions at certain interval.

I don't think a timeout is a very good idea.  There is no timeout for
synchronous replication and the issues here are similar.  I will not
try to block a patch adding a timeout, but I think it had better be
disabled by default and have very clear documentation explaining why
it's really dangerous.  And this is why: with no timeout, you can
count on being able to see the effects of your own previous
transactions, unless at some point you sent a query cancel or got
disconnected.  With a timeout, you may or may not see the effects of
your own previous transactions depending on whether or not you hit the
timeout, which you have no sure way of knowing.

>>> transactions after the coordinator server recovered. On the other
>>> hand, for the reading a consistent result on such situation by
>>> subsequent reads, for example, we can disallow backends to inquiry SQL
>>> to the foreign server if a foreign transaction of the foreign server
>>> is remained.
>>
>> +1 for the last sentence. If we do that, we don't need the backend to
>> be blocked by resolver since a subsequent read accessing that foreign
>> server would get an error and not inconsistent data.
>
> Yeah, however the disadvantage of this is that we manage foreign
> transactions per foreign servers. If a transaction that modified even
> one table is remained as a in-doubt transaction, we cannot issue any
> SQL that touches that foreign server. Can we occur an error at
> ExecInitForeignScan()?

I really feel strongly we shouldn't complicate the initial patch with
this kind of thing.  Let's make it enough for this patch to guarantee
that either all parts of the transaction commit eventually or they all
abort eventually.  Ensuring consistent visibility is a different and
hard project, and if we try to do that now, this patch is not going to
be done any time soon.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Transactions involving multiple postgres foreign servers

2017-12-11 Thread Masahiko Sawada
On Tue, Nov 28, 2017 at 12:31 PM, Ashutosh Bapat
 wrote:
> On Tue, Nov 28, 2017 at 3:04 AM, Masahiko Sawada  
> wrote:
>> On Fri, Nov 24, 2017 at 10:28 PM, Antonin Houska  wrote:
>>> Masahiko Sawada  wrote:
>>>
 On Mon, Oct 30, 2017 at 5:48 PM, Ashutosh Bapat
  wrote:
 > On Thu, Oct 26, 2017 at 7:41 PM, Masahiko Sawada  
 > wrote:
 >>
 >> Because I don't want to break the current user semantics. that is,
 >> currently it's guaranteed that the subsequent reads can see the
 >> committed result of previous writes even if the previous transactions
 >> were distributed transactions. And it's ensured by writer side. If we
 >> can make the reader side ensure it, the backend process don't need to
 >> wait for the resolver process.
 >>
 >> The waiting backend process are released by resolver process after the
 >> resolver process tried to resolve foreign transactions. Even if
 >> resolver process failed to either connect to foreign server or to
 >> resolve foreign transaction the backend process will be released and
 >> the foreign transactions are leaved as dangling transaction in that
 >> case, which are processed later. Also if resolver process takes a long
 >> time to resolve foreign transactions for whatever reason the user can
 >> cancel it by Ctl-c anytime.
 >>
 >
 > So, there's no guarantee that the next command issued from the
 > connection *will* see the committed data, since the foreign
 > transaction might not have committed because of a network glitch
 > (say). If we go this route of making backends wait for resolver to
 > resolve the foreign transaction, we will have add complexity to make
 > sure that the waiting backends are woken up in problematic events like
 > crash of the resolver process OR if the resolver process hangs in a
 > connection to a foreign server etc. I am not sure that the complexity
 > is worth the half-guarantee.
 >

 Hmm, maybe I was wrong. I now think that the waiting backends can be
 woken up only in following cases;
 - The resolver process succeeded to resolve all foreign transactions.
 - The user did the cancel (e.g. ctl-c).
 - The resolver process failed to resolve foreign transaction for a
 reason of there is no such prepared transaction on foreign server.

 In other cases the resolver process should not release the waiters.
>>>
>>> I'm not sure I see consensus here. What Ashutosh says seems to be: "Special
>>> effort is needed to ensure that backend does not keep waiting if the 
>>> resolver
>>> can't finish it's work in forseable future. But this effort is not worth
>>> because by waking the backend up you might prevent the next transaction from
>>> seeing the changes the previous one tried to make."
>>>
>>> On the other hand, your last comments indicate that you try to be even more
>>> stringent in letting the backend wait. However even this stringent approach
>>> does not guarantee that the next transaction will see the data changes made 
>>> by
>>> the previous one.
>>>
>>
>> What I'd like to guarantee is that the subsequent read can see the
>> committed result of previous writes if the transaction involving
>> multiple foreign servers is committed without cancellation by user. In
>> other words, the backend should not be waken up and the resolver
>> should continue to resolve at certain intervals even if the resolver
>> fails to connect to the foreign server or fails to resolve it. This is
>> similar to what synchronous replication guaranteed today. Keeping this
>> semantics is very important for users. Note that the reading a
>> consistent result by concurrent reads is a separated problem.
>
> The question I have is how would we deal with a foreign server that is
> not available for longer duration due to crash, longer network outage
> etc. Example is the foreign server crashed/got disconnected after
> PREPARE but before COMMIT/ROLLBACK was issued. The backend will remain
> blocked for much longer duration without user having an idea of what's
> going on. May be we should add some timeout.

After more thought, I agree with adding some timeout. I can imagine
there are users who want the timeout, for example, those who cannot
accept even a few seconds of latency. If the timeout occurs, the
backend unlocks the foreign transactions and breaks the loop. The
resolver process will continue to resolve foreign transactions at a
certain interval.

>>
>> The read result including foreign servers can be inconsistent if the
>> such transaction is cancelled or the coordinator server crashes during
>> two-phase commit processing. That is, if there is in-doubt transaction
>> the read result can be inconsistent, even for subsequent reads. But I
>> think this behaviour can be accepted by users.

Re: [HACKERS] Transactions involving multiple postgres foreign servers

2017-11-27 Thread Antonin Houska
Masahiko Sawada  wrote:

> On Fri, Nov 24, 2017 at 10:28 PM, Antonin Houska  wrote:
> > Masahiko Sawada  wrote:
> >
> >> On Mon, Oct 30, 2017 at 5:48 PM, Ashutosh Bapat
> >>  wrote:
> >> > On Thu, Oct 26, 2017 at 7:41 PM, Masahiko Sawada  
> >> > wrote:
> >> >>
> >> >> Because I don't want to break the current user semantics. that is,
> >> >> currently it's guaranteed that the subsequent reads can see the
> >> >> committed result of previous writes even if the previous transactions
> >> >> were distributed transactions. And it's ensured by writer side. If we
> >> >> can make the reader side ensure it, the backend process don't need to
> >> >> wait for the resolver process.
> >> >>
> >> >> The waiting backend process are released by resolver process after the
> >> >> resolver process tried to resolve foreign transactions. Even if
> >> >> resolver process failed to either connect to foreign server or to
> >> >> resolve foreign transaction the backend process will be released and
> >> >> the foreign transactions are leaved as dangling transaction in that
> >> >> case, which are processed later. Also if resolver process takes a long
> >> >> time to resolve foreign transactions for whatever reason the user can
> >> >> cancel it by Ctl-c anytime.
> >> >>
> >> >
> >> > So, there's no guarantee that the next command issued from the
> >> > connection *will* see the committed data, since the foreign
> >> > transaction might not have committed because of a network glitch
> >> > (say). If we go this route of making backends wait for resolver to
> >> > resolve the foreign transaction, we will have add complexity to make
> >> > sure that the waiting backends are woken up in problematic events like
> >> > crash of the resolver process OR if the resolver process hangs in a
> >> > connection to a foreign server etc. I am not sure that the complexity
> >> > is worth the half-guarantee.
> >> >
> >>
> >> Hmm, maybe I was wrong. I now think that the waiting backends can be
> >> woken up only in following cases;
> >> - The resolver process succeeded to resolve all foreign transactions.
> >> - The user did the cancel (e.g. ctl-c).
> >> - The resolver process failed to resolve foreign transaction for a
> >> reason of there is no such prepared transaction on foreign server.
> >>
> >> In other cases the resolver process should not release the waiters.
> >
> > I'm not sure I see consensus here. What Ashutosh says seems to be: "Special
> > effort is needed to ensure that backend does not keep waiting if the 
> > resolver
> > can't finish it's work in forseable future. But this effort is not worth
> > because by waking the backend up you might prevent the next transaction from
> > seeing the changes the previous one tried to make."
> >
> > On the other hand, your last comments indicate that you try to be even more
> > stringent in letting the backend wait. However even this stringent approach
> > does not guarantee that the next transaction will see the data changes made 
> > by
> > the previous one.
> >
> 
> What I'd like to guarantee is that the subsequent read can see the
> committed result of previous writes if the transaction involving
> multiple foreign servers is committed without cancellation by user.

I missed the point that the user should not expect atomicity of the commit to
be guaranteed if they have cancelled their request.

The other things are clear to me, including the fact that atomic commit and
atomic visibility will be implemented separately. Thanks.

-- 
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at



Re: [HACKERS] Transactions involving multiple postgres foreign servers

2017-11-27 Thread Ashutosh Bapat
On Tue, Nov 28, 2017 at 3:04 AM, Masahiko Sawada  wrote:
> On Fri, Nov 24, 2017 at 10:28 PM, Antonin Houska  wrote:
>> Masahiko Sawada  wrote:
>>
>>> On Mon, Oct 30, 2017 at 5:48 PM, Ashutosh Bapat
>>>  wrote:
>>> > On Thu, Oct 26, 2017 at 7:41 PM, Masahiko Sawada  
>>> > wrote:
>>> >>
>>> >> Because I don't want to break the current user semantics. that is,
>>> >> currently it's guaranteed that the subsequent reads can see the
>>> >> committed result of previous writes even if the previous transactions
>>> >> were distributed transactions. And it's ensured by writer side. If we
>>> >> can make the reader side ensure it, the backend process don't need to
>>> >> wait for the resolver process.
>>> >>
>>> >> The waiting backend process are released by resolver process after the
>>> >> resolver process tried to resolve foreign transactions. Even if
>>> >> resolver process failed to either connect to foreign server or to
>>> >> resolve foreign transaction the backend process will be released and
>>> >> the foreign transactions are leaved as dangling transaction in that
>>> >> case, which are processed later. Also if resolver process takes a long
>>> >> time to resolve foreign transactions for whatever reason the user can
>>> >> cancel it by Ctl-c anytime.
>>> >>
>>> >
>>> > So, there's no guarantee that the next command issued from the
>>> > connection *will* see the committed data, since the foreign
>>> > transaction might not have committed because of a network glitch
>>> > (say). If we go this route of making backends wait for resolver to
>>> > resolve the foreign transaction, we will have add complexity to make
>>> > sure that the waiting backends are woken up in problematic events like
>>> > crash of the resolver process OR if the resolver process hangs in a
>>> > connection to a foreign server etc. I am not sure that the complexity
>>> > is worth the half-guarantee.
>>> >
>>>
>>> Hmm, maybe I was wrong. I now think that the waiting backends can be
>>> woken up only in following cases;
>>> - The resolver process succeeded to resolve all foreign transactions.
>>> - The user did the cancel (e.g. ctl-c).
>>> - The resolver process failed to resolve foreign transaction for a
>>> reason of there is no such prepared transaction on foreign server.
>>>
>>> In other cases the resolver process should not release the waiters.
>>
>> I'm not sure I see consensus here. What Ashutosh says seems to be: "Special
>> effort is needed to ensure that backend does not keep waiting if the resolver
>> can't finish it's work in forseable future. But this effort is not worth
>> because by waking the backend up you might prevent the next transaction from
>> seeing the changes the previous one tried to make."
>>
>> On the other hand, your last comments indicate that you try to be even more
>> stringent in letting the backend wait. However even this stringent approach
>> does not guarantee that the next transaction will see the data changes made 
>> by
>> the previous one.
>>
>
> What I'd like to guarantee is that the subsequent read can see the
> committed result of previous writes if the transaction involving
> multiple foreign servers is committed without cancellation by user. In
> other words, the backend should not be waken up and the resolver
> should continue to resolve at certain intervals even if the resolver
> fails to connect to the foreign server or fails to resolve it. This is
> similar to what synchronous replication guaranteed today. Keeping this
> semantics is very important for users. Note that the reading a
> consistent result by concurrent reads is a separated problem.

The question I have is how we would deal with a foreign server that is
not available for a longer duration due to a crash, a long network
outage, etc. An example is the foreign server crashing or getting
disconnected after PREPARE but before COMMIT/ROLLBACK was issued. The
backend will remain blocked for a much longer duration without the user
having any idea of what's going on. Maybe we should add some timeout.

>
> The read result including foreign servers can be inconsistent if the
> such transaction is cancelled or the coordinator server crashes during
> two-phase commit processing. That is, if there is in-doubt transaction
> the read result can be inconsistent, even for subsequent reads. But I
> think this behaviour can be accepted by users. For the resolution of
> in-doubt transactions, the resolver process will try to resolve such
> transactions after the coordinator server recovered. On the other
> hand, for the reading a consistent result on such situation by
> subsequent reads, for example, we can disallow backends to inquiry SQL
> to the foreign server if a foreign transaction of the foreign server
> is remained.

+1 for the last sentence. If we do that, we don't need the backend to
be blocked by resolver since a subsequent read accessing that foreign
server would get an error and not inconsistent data.

Re: [HACKERS] Transactions involving multiple postgres foreign servers

2017-11-27 Thread Robert Haas
On Mon, Nov 27, 2017 at 4:35 PM Masahiko Sawada 
wrote:

> What I'd like to guarantee is that the subsequent read can see the
> committed result of previous writes if the transaction involving
> multiple foreign servers is committed without cancellation by user. In
> other words, the backend should not be waken up and the resolver
> should continue to resolve at certain intervals even if the resolver
> fails to connect to the foreign server or fails to resolve it. This is
> similar to what synchronous replication guaranteed today. Keeping this
> semantics is very important for users. Note that the reading a
> consistent result by concurrent reads is a separated problem.
>
> The read result including foreign servers can be inconsistent if the
> such transaction is cancelled or the coordinator server crashes during
> two-phase commit processing. That is, if there is in-doubt transaction
> the read result can be inconsistent, even for subsequent reads. But I
> think this behaviour can be accepted by users. For the resolution of
> in-doubt transactions, the resolver process will try to resolve such
> transactions after the coordinator server recovered. On the other
> hand, for the reading a consistent result on such situation by
> subsequent reads, for example, we can disallow backends to inquiry SQL
> to the foreign server if a foreign transaction of the foreign server
> is remained.
>
> For the concurrent reads, the reading an inconsistent result can be
> happen even without in-doubt transaction because we can read data on a
> foreign server between PREPARE and COMMIT PREPARED while other foreign
> servers have committed. I think we should deal with this problem by
> other feature on reader side, for example, atomic visibility. If we
> have atomic visibility feature, we also can solve the above problem.


+1 to all of that.

...Robert

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] Transactions involving multiple postgres foreign servers

2017-11-27 Thread Masahiko Sawada
On Fri, Nov 24, 2017 at 10:28 PM, Antonin Houska  wrote:
> Masahiko Sawada  wrote:
>
>> On Mon, Oct 30, 2017 at 5:48 PM, Ashutosh Bapat
>>  wrote:
>> > On Thu, Oct 26, 2017 at 7:41 PM, Masahiko Sawada  
>> > wrote:
>> >>
>> >> Because I don't want to break the current user semantics. that is,
>> >> currently it's guaranteed that the subsequent reads can see the
>> >> committed result of previous writes even if the previous transactions
>> >> were distributed transactions. And it's ensured by writer side. If we
>> >> can make the reader side ensure it, the backend process don't need to
>> >> wait for the resolver process.
>> >>
>> >> The waiting backend process are released by resolver process after the
>> >> resolver process tried to resolve foreign transactions. Even if
>> >> resolver process failed to either connect to foreign server or to
>> >> resolve foreign transaction the backend process will be released and
>> >> the foreign transactions are leaved as dangling transaction in that
>> >> case, which are processed later. Also if resolver process takes a long
>> >> time to resolve foreign transactions for whatever reason the user can
>> >> cancel it by Ctl-c anytime.
>> >>
>> >
>> > So, there's no guarantee that the next command issued from the
>> > connection *will* see the committed data, since the foreign
>> > transaction might not have committed because of a network glitch
>> > (say). If we go this route of making backends wait for resolver to
>> > resolve the foreign transaction, we will have add complexity to make
>> > sure that the waiting backends are woken up in problematic events like
>> > crash of the resolver process OR if the resolver process hangs in a
>> > connection to a foreign server etc. I am not sure that the complexity
>> > is worth the half-guarantee.
>> >
>>
>> Hmm, maybe I was wrong. I now think that the waiting backends can be
>> woken up only in following cases;
>> - The resolver process succeeded to resolve all foreign transactions.
>> - The user did the cancel (e.g. ctl-c).
>> - The resolver process failed to resolve foreign transaction for a
>> reason of there is no such prepared transaction on foreign server.
>>
>> In other cases the resolver process should not release the waiters.
>
> I'm not sure I see consensus here. What Ashutosh says seems to be: "Special
> effort is needed to ensure that backend does not keep waiting if the resolver
> can't finish it's work in forseable future. But this effort is not worth
> because by waking the backend up you might prevent the next transaction from
> seeing the changes the previous one tried to make."
>
> On the other hand, your last comments indicate that you try to be even more
> stringent in letting the backend wait. However even this stringent approach
> does not guarantee that the next transaction will see the data changes made by
> the previous one.
>

What I'd like to guarantee is that a subsequent read can see the
committed result of previous writes if the transaction involving
multiple foreign servers is committed without cancellation by the user.
In other words, the backend should not be woken up, and the resolver
should continue to resolve at certain intervals even if it fails to
connect to the foreign server or fails to resolve the transaction. This
is similar to what synchronous replication guarantees today. Keeping
this semantics is very important for users. Note that reading a
consistent result with concurrent reads is a separate problem.

The read result including foreign servers can be inconsistent if such a
transaction is cancelled or the coordinator server crashes during
two-phase commit processing. That is, if there is an in-doubt
transaction, the read result can be inconsistent, even for subsequent
reads. But I think this behaviour can be accepted by users. For the
resolution of in-doubt transactions, the resolver process will try to
resolve such transactions after the coordinator server has recovered.
On the other hand, to give subsequent reads a consistent result in such
a situation, we could, for example, disallow backends to issue SQL to a
foreign server while a foreign transaction of that foreign server
remains.

For concurrent reads, reading an inconsistent result can happen even
without an in-doubt transaction, because we can read data on a foreign
server between PREPARE and COMMIT PREPARED while other foreign servers
have already committed. I think we should deal with this problem with a
separate feature on the reader side, for example, atomic visibility. If
we have an atomic visibility feature, we can also solve the above
problem.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Transactions involving multiple postgres foreign servers

2017-11-24 Thread Antonin Houska
Masahiko Sawada  wrote:

> On Mon, Oct 30, 2017 at 5:48 PM, Ashutosh Bapat
>  wrote:
> > On Thu, Oct 26, 2017 at 7:41 PM, Masahiko Sawada  
> > wrote:
> >>
> >> Because I don't want to break the current user semantics. that is,
> >> currently it's guaranteed that the subsequent reads can see the
> >> committed result of previous writes even if the previous transactions
> >> were distributed transactions. And it's ensured by writer side. If we
> >> can make the reader side ensure it, the backend process don't need to
> >> wait for the resolver process.
> >>
> >> The waiting backend processes are released by the resolver process
> >> after it has tried to resolve the foreign transactions. Even if the
> >> resolver process fails either to connect to a foreign server or to
> >> resolve a foreign transaction, the backend process will be released,
> >> and the foreign transactions are left as dangling transactions in
> >> that case, to be processed later. Also, if the resolver process takes
> >> a long time to resolve foreign transactions for whatever reason, the
> >> user can cancel the wait with Ctrl-C at any time.
> >>
> >
> > So, there's no guarantee that the next command issued from the
> > connection *will* see the committed data, since the foreign
> > transaction might not have committed because of a network glitch
> > (say). If we go this route of making backends wait for resolver to
> > resolve the foreign transaction, we will have to add complexity to
> > make sure that the waiting backends are woken up in problematic events
> > like a crash of the resolver process, or the resolver process hanging
> > on a connection to a foreign server, etc. I am not sure that the complexity
> > is worth the half-guarantee.
> >
> 
> Hmm, maybe I was wrong. I now think that the waiting backends can be
> woken up only in the following cases:
> - The resolver process succeeded in resolving all foreign transactions.
> - The user cancelled the wait (e.g. with Ctrl-C).
> - The resolver process failed to resolve a foreign transaction because
> there is no such prepared transaction on the foreign server.
> 
> In other cases the resolver process should not release the waiters.

I'm not sure I see consensus here. What Ashutosh says seems to be: "Special
effort is needed to ensure that the backend does not keep waiting if the
resolver can't finish its work in the foreseeable future. But this effort is
not worth it, because by waking the backend up you might prevent the next
transaction from seeing the changes the previous one tried to make."

On the other hand, your last comments indicate that you are trying to be even
more stringent about keeping the backend waiting. However, even this stringent
approach does not guarantee that the next transaction will see the data changes
made by the previous one.
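For the record, the stricter wake-up policy quoted above could be sketched like this (invented names, not the actual patch code): the waiting backend is released only in the three listed cases, while other resolver outcomes, such as a failed connection to the foreign server, keep it waiting.

```python
from enum import Enum, auto

class WakeReason(Enum):
    RESOLVED = auto()          # resolver resolved all foreign transactions
    CANCELLED = auto()         # user cancelled the wait, e.g. with Ctrl-C
    NO_SUCH_PREPARED = auto()  # prepared transaction missing on the foreign server
    CONNECT_FAILED = auto()    # resolver could not reach the foreign server

def releases_backend(reason):
    """Only the three cases listed in the quoted mail release the waiter."""
    return reason in (WakeReason.RESOLVED,
                      WakeReason.CANCELLED,
                      WakeReason.NO_SUCH_PREPARED)
```

Which, as noted, still leaves the waiter exposed to a resolver that hangs indefinitely on `CONNECT_FAILED`-style outcomes.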

-- 
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at