On Tue, Sep 27, 2016 at 9:06 PM, Ashutosh Bapat
<ashutosh.ba...@enterprisedb.com> wrote:
> On Tue, Sep 27, 2016 at 2:54 PM, Masahiko Sawada <sawada.m...@gmail.com> 
> wrote:
>> On Mon, Sep 26, 2016 at 9:07 PM, Ashutosh Bapat
>> <ashutosh.ba...@enterprisedb.com> wrote:
>>> On Mon, Sep 26, 2016 at 5:25 PM, Masahiko Sawada <sawada.m...@gmail.com> 
>>> wrote:
>>>> On Mon, Sep 26, 2016 at 7:28 PM, Ashutosh Bapat
>>>> <ashutosh.ba...@enterprisedb.com> wrote:
>>>>> My original patch added code to manage the files for 2 phase
>>>>> transactions opened by the local server on the remote servers. This
>>>>> code was mostly inspired by the code in twophase.c which manages the
>>>>> file for prepared transactions. The logic to manage 2PC files has
>>>>> changed since [1] and has been optimized. One of the things I wanted
>>>>> to do is see if those optimizations are applicable here as well. Have
>>>>> you considered that?
>>>>>
>>>>>
>>>>
>>>> Yeah, we're considering it.
>>>> After these changes are committed, we will post a patch that
>>>> incorporates them.
>>>>
>>>> But what we need to do first is discuss the design in order to reach
>>>> consensus.
>>>> Since the current design of this patch is to transparently execute the
>>>> DCL of 2PC on the foreign server, it changes a lot of code and is
>>>> complicated.
>>>
>>> Can you please elaborate? I am not able to understand what DCL is
>>> involved here. According to [1], examples of DCL are the GRANT and
>>> REVOKE commands.
>>
>> I meant transaction management commands such as PREPARE TRANSACTION and
>> COMMIT/ROLLBACK PREPARED.
>> The web page I referred to might be wrong, sorry.
>>
>>>> Another approach I have is to push down DCL only to foreign servers
>>>> that support the 2PC protocol, which is similar to DML push down.
>>>> This approach would be simpler than the current idea and would be easy
>>>> for a distributed transaction manager to use.
>>>
>>> Again, can you please elaborate on how that would be different from the
>>> current approach and how it simplifies the code?
>>>
>>
>> The idea is just to push down PREPARE TRANSACTION, COMMIT/ROLLBACK
>> PREPARED to foreign servers that support 2PC.
>> With this idea, the client needs to do the following when a foreign
>> server is involved in the transaction.
>>
>> BEGIN;
>> UPDATE parent_table SET ...; -- update including foreign server
>> PREPARE TRANSACTION 'xact_id';
>> COMMIT PREPARED 'xact_id';
>>
>> The above PREPARE TRANSACTION and COMMIT PREPARED commands are pushed
>> down to the foreign server.
>> That is, the client needs to execute PREPARE TRANSACTION and
>> COMMIT/ROLLBACK PREPARED explicitly.
>>
>> With this idea, I think that we don't need to do the following:
>>
>> * Providing the prepare id of 2PC.
>>   The current patch adds a new API, prepare_id_provider(), but we can use
>> the prepare id of 2PC that is used on the parent server.
>>
>> * Keeping track of the status of foreign servers.
>>   The current patch keeps track of the status of foreign servers involved
>> in the transaction, but this idea is just to push down transaction
>> management commands to the foreign server.
>>   So I think that we no longer need to do that.
>
> The problem with this approach is the same as the one previously stated. If
> the connection between the local and foreign servers is lost between
> PREPARE and COMMIT, the prepared transaction on the foreign server
> remains dangling. No one other than the local server knows what to do
> with it, and the local server has lost track of the prepared
> transaction on the foreign server. So, just pushing down those
> commands doesn't work.

Yeah, my idea is only a first step.
A mechanism that resolves dangling foreign transactions, and a
resolver worker process, are still necessary.

>>
>> * Adding the max_prepared_foreign_transactions parameter.
>>   It means that the number of transactions involving foreign servers is
>> the same as max_prepared_transactions.
>>
>
> That isn't exactly true. max_prepared_foreign_transactions indicates
> how many transactions can be prepared on the foreign server, which in
> the method you propose should have a cap of max_prepared_transactions
> * number of foreign servers.

Oh, I understand now, thanks.

Consider a sharding solution using postgres_fdw (that is, the parent
postgres server has multiple shard postgres servers). We would need to
increase max_prepared_foreign_transactions whenever a new shard server
is added to the cluster, or allocate a large enough value in advance. But
estimating a sufficient max_prepared_foreign_transactions would not be
easy. For example, can we estimate it by (max throughput of the system)
* (the number of foreign servers)?
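
To make the sizing concern concrete, here is a small sketch in Python. It
assumes the cap described in the quoted reply above (max_prepared_transactions
* number of foreign servers). The function name and the example values are
hypothetical and only illustrate how the required setting grows as shards
are added.

```python
def foreign_prepared_cap(max_prepared_transactions, num_foreign_servers):
    """Upper bound on foreign transactions that can be prepared at once,
    assuming every local prepared transaction touches every foreign server."""
    if max_prepared_transactions < 0 or num_foreign_servers < 0:
        raise ValueError("both inputs must be non-negative")
    return max_prepared_transactions * num_foreign_servers


# Hypothetical cluster: max_prepared_transactions = 100 on the parent.
for shards in (2, 4, 8):
    print(shards, foreign_prepared_cap(100, shards))
```

Each added shard raises the bound by another max_prepared_transactions, which
is why the value would have to be revisited on every cluster change.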

One new idea I came up with is to embed the transaction id on the parent
server in the global transaction id (gid) that is prepared on the shard
server.
A pg_fdw_resolver worker process then periodically resolves dangling
transactions on the foreign server by comparing the lowest active XID on
the parent server with the XID in the gid used by PREPARE TRANSACTION.

For example, suppose that there is one parent server and one shard
server, and the client executes an update transaction (XID = 100)
involving the foreign server.
In the commit phase, the parent server executes a PREPARE TRANSACTION
command with a gid containing 100, say
'px_<random number>_100_<serverid>_<userid>', on the foreign server.
If the shard server crashes before COMMIT PREPARED, transaction
100 becomes a dangling transaction.
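
A minimal sketch of how such a gid could be built and parsed, written in
Python for brevity (real code would live in the backend, in C). The names
make_gid and parse_gid, the numeric server and user ids, and the 31-bit
random number are assumptions for illustration, not part of the patch.

```python
import random

GID_PREFIX = "px"


def make_gid(xid, server_id, user_id, rng=random):
    """Build 'px_<random number>_<xid>_<serverid>_<userid>'."""
    rand = rng.randrange(2**31)  # assumed width of the random part
    return f"{GID_PREFIX}_{rand}_{xid}_{server_id}_{user_id}"


def parse_gid(gid):
    """Return the gid fields as a dict, or None if the gid is not ours."""
    parts = gid.split("_")
    if len(parts) != 5 or parts[0] != GID_PREFIX:
        return None
    if not all(p.isdecimal() for p in parts[1:]):
        return None
    rand, xid, server_id, user_id = (int(p) for p in parts[1:])
    return {"random": rand, "xid": xid,
            "server_id": server_id, "user_id": user_id}


gid = make_gid(100, 16385, 10)   # hypothetical server and user ids
print(gid, parse_gid(gid)["xid"])
```

Because the parent's XID can be recovered from the gid alone, the parent
does not have to remember which gid it sent to which server.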

But the resolver worker process on the parent server can resolve it with
the following steps.
1. Get the lowest active XID on the parent server (XID = 110).
2. Connect to the foreign server. (Get the foreign server information from
the pg_foreign_server system catalog.)
3. Check whether there is a prepared transaction with an XID less than 110.
4. Roll back the dangling transaction found in step 3.
    The gid 'px_<random number>_100_<serverid>_<userid>' was prepared on
the foreign server by transaction 100, so roll it back.
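
The steps above can be sketched roughly as follows, in Python rather than
backend C. This is only an illustration under several assumptions: the list
of gids is assumed to have been read from the foreign server already (for
example from the pg_prepared_xacts view), the function names are
hypothetical, and XIDs are compared as plain integers, which ignores XID
wraparound.

```python
def xid_from_gid(gid):
    """Extract the parent XID from 'px_<random>_<xid>_<serverid>_<userid>'."""
    parts = gid.split("_")
    if len(parts) != 5 or parts[0] != "px" or not parts[2].isdecimal():
        return None  # not a gid generated by this scheme
    return int(parts[2])


def dangling_gids(prepared_gids, lowest_active_xid):
    """Step 3: gids whose parent XID is older than the lowest active XID."""
    found = []
    for gid in prepared_gids:
        xid = xid_from_gid(gid)
        if xid is not None and xid < lowest_active_xid:
            found.append(gid)
    return found


def rollback_statements(gids):
    """Step 4: one ROLLBACK PREPARED statement per dangling gid."""
    return ["ROLLBACK PREPARED '{}'".format(g.replace("'", "''"))
            for g in gids]


# Step 1 gives 110; step 2 (connecting) is outside this sketch.
prepared = ["px_42_100_1_10", "px_7_115_1_10", "unrelated_gid"]
for stmt in rollback_statements(dangling_gids(prepared, 110)):
    print(stmt)
```

Only the gid for transaction 100 is selected here. The gid for transaction
115 is left alone because that transaction may still be in progress on the
parent, and gids that do not follow the naming scheme are ignored.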

With this idea, we need a gid provider API, but the parent server doesn't
need to keep persistent foreign transaction data.
Also, we could remove max_prepared_foreign_transactions, and fdw_xact.c
would become a simpler implementation.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

