RE: Transactions involving multiple postgres foreign servers, take 2

tsunakawa.ta...@fujitsu.com Thu, 16 Jul 2020 22:23:14 -0700

From: Masahiko Sawada <masahiko.saw...@2ndquadrant.com>
I have briefly checked the only oracle_fdw but in general I think that
> if an existing FDW supports transaction begin, commit, and rollback,
> these can be ported to new FDW transaction APIs easily.


Does oracle_fdw support begin, commit and rollback?

And most importantly, do other major DBMSs, including Oracle, provide the API 
for preparing a transaction?  In other words, will the FDWs other than 
postgres_fdw really be able to take advantage of the new FDW functions to join 
the 2PC processing?  I think we need to confirm that there are concrete 
examples.

What I'm worried is that if only postgres_fdw can implement the prepare 
function, it's a sign that FDW interface will be riddled with functions only 
for Postgres.  That is, the FDW interface is getting away from its original 
purpose "access external data as a relation" and complex.  Tomas Vondra showed 
this concern as follows:

Horizontal scalability/sharding 
https://www.postgresql.org/message-id/flat/CANP8%2BjK%3D%2B3zVYDFY0oMAQKQVJ%2BqReDHr1UPdyFEELO82yVfb9A%40mail.gmail.com#2c45f0ee97855449f1f7fedcef1d5e11


[Tomas Vondra's remarks]
--------------------------------------------------
> This strikes me as a bit of a conflict of interest with FDW which
> seems to want to hide the fact that it's foreign; the FDW
> implementation makes it's own optimization decisions which might
> make sense for single table queries but breaks down in the face of
> joins.

+1 to these concerns

In my mind, FDW is a wonderful tool to integrate PostgreSQL with 
external data sources, and it's nicely shaped for this purpose, which 
implies the abstractions and assumptions in the code.

The truth however is that many current uses of the FDW API are actually 
using it for different purposes because there's no other way to do that, 
not because FDWs are the "right way". And this includes the attempts to 
build sharding on FDW, I think.

Situations like this result in "improvements" of the API that seem to 
improve the API for the second group, but make the life harder for the 
original FDW API audience by making the API needlessly complex. And I 
say "seem to improve" because the second group eventually runs into the 
fundamental abstractions and assumptions the API is based on anyway.

And based on the discussions at pgcon, I think this is the main reason 
why people cringe when they hear "FDW" and "sharding" in the same sentence.

...
My other worry is that we'll eventually mess the FDW infrastructure, 
making it harder to use for the original purpose. Granted, most of the 
improvements proposed so far look sane and useful for FDWs in general, 
but sooner or later that ceases to be the case - there sill be changes 
needed merely for the sharding. Those will be tough decisions.
--------------------------------------------------


> Regarding the comparison between FDW transaction APIs and transaction
> callbacks, I think one of the benefits of providing FDW transaction
> APIs is that the core is able to manage the status of foreign
> transactions. We need to track the status of individual foreign
> transactions to support atomic commit. If we use transaction callbacks
> (XactCallback) that many FDWs are using, I think we will end up
> calling the transaction callback and leave the transaction work to
> FDWs, leading that the core is not able to know the return values of
> PREPARE TRANSACTION for example. We can add more arguments passed to
> transaction callbacks to get the return value from FDWs but I don’t
> think it’s a good idea as transaction callbacks are used not only by
> FDW but also other external modules.

To track the foreign transaction status, we can add GetTransactionStatus() to 
the FDW interface as an alternative, can't we?


> With the current version patch (v23), it supports only
> INSERT/DELETE/UPDATE. But I'm going to change the patch so that it
> supports other writes SQLs as Fujii-san also pointed out.

OK.  I've just read that Fujii san already pointed out a similar thing.  But I 
wonder if we can know that the UDF executed on the foreign server has updated 
data.  Maybe we can know or guess it by calling txid_current_if_any() or 
checking the transaction status in FE/BE protocol, but can we deal with other 
FDWs other than postgres_fdw?


> No, in the current design, the backend who received a query from the
> client does PREPARE, and then the transaction resolver process, a
> background worker, does COMMIT PREPARED.

This "No" means the current implementation cannot group commits from multiple 
transactions?
Does the transaction resolver send COMMIT PREPARED and waits for its response 
for each transaction one by one?  For example,

[local server]
Transaction T1 and T2 performs 2PC at the same time.
Transaction resolver sends COMMIT PREPARED for T1 and then waits for the 
response.
T1 writes COMMIT PREPARED record locally and sync the WAL.
Transaction resolver sends COMMIT PREPARED for T2 and then waits for the 
response.
T2 writes COMMIT PREPARED record locally and sync the WAL.

[foreign server]
T1 writes COMMIT PREPARED record locally and sync the WAL.
T2 writes COMMIT PREPARED record locally and sync the WAL.

If the WAL records of multiple concurrent transactions are written and synced 
separately, i.e. group commit doesn't take effect, then the OLTP transaction 
performance will be unacceptable.


Regards
Takayuki Tsunakawa

RE: Transactions involving multiple postgres foreign servers, take 2

Reply via email to