Le 25/03/2020 à 03:49, Dave Sharpe a écrit :
>
> Hi pghackers,
>
> This is my first time posting here ...  Gilles Darold and I are
> developing a new FDW which is based on the contrib/postgres_fdw. The
> postgres_fdw logic uses a RegisterXactCallback function to send local
> transactions remote. But I found that a registered XactCallback is not
> always called for a successful client ROLLBACK statement. So, a
> successful local ROLLBACK does not get sent remote by FDW, and now the
> local and remote transaction states are messed up, out of sync. The
> local database is "eating" the successful rollback.
>
>  
>
> I attach a git format-patch file
> 0001-Fix-CommitTransactionCommand-to-CallXactCallbacks-in.patch
>
> The patch fixes the problem and is ready to commit as far as I can
> tell. It adds some comment lines and three lines of code to
> CommitTransactionCommand() in the TBLOCK_ABORT_END case. Line (1) to
> reset the transaction's blockState back to TBLOCK_ABORT, ahead of (2)
> a new call to callXactCallbacks(). If the callback returns successful
> (which is usually the case), (3) the new code switches back to
> TBLOCK_ABORT_END, then continues with old code to CleanupTransaction()
> as before. If any callback does error out, the TBLOCK_ABORT was the
> correct block state for an error.
>
>  
>
> I have not added a regression test since my scenario involves a C
> module ... I didn't see such a regression test, but somebody can teach
> me where to put the C module and Makefile if a regression test should
> be added. Heads up that the scenario to hit this is a bit complex (to
> me) ... I attach the following unit test files:
>
> 1) eat_rollback.c, _/PG_init() installs my/_utility_hook() for INFO
> log for debugging.
>
> RegisterSubXactCallback(mySubtransactionCallback) which injects some
> logging and an ERROR for savepoints, i.e., negative testing, e.g.,
> like a remote FDW savepoint failed.
>
> And RegisterXactTransaction(myTransactionCallback) with info logging.
>
> 2) Makefile, to make the eat_rollback.so
>
> 3) eat_rollback2.sql, drives the fail sequence, especially the
> SAVEPOINT, which errors out, then later a successful ROLLBACK gets
> incorrectly eaten (no CALLBACK info logging, demonstrates the bug),
> then another successful ROLLBACK (now there is CALLBACK info logging).
>
> 4) eat_rollback2.out, output without the fix, the rollback at line 68
> is successful but there is not myTransactionCallback() INFO output
>
> 5) eat_rollback2.fixed, output after the fix, the rollback at line 68
> is successful, and now there is a myTransactionCallback() INFO log.
> Success!
>
>  
>
> It first failed for me in v11.1, I suspect it failed since before then
> too. And it is still failing in current master.
>
>  
>
> Bye the way, we worked around the bug in our FDW by handling sub/xact
> in the utility hook. A transaction callback is still needed for
> implicit, internal rollbacks. Another advantage of the workaround is
> (I think) it reduces the code complexity and improves performance
> because the entire subxact stack is not unwound to drive each and
> every subtransaction level to remote. Also sub/transaction statements
> are sent remote as they arrive (local and remote are more
> "transactionally" synced, not stacked by FDW for later). And of
> course, the workaround doesn't hit the above bug, since our utility
> hook correctly handles the client ROLLBACK. If it makes sense to the
> community, I could try and patch contrib/postgres_fdw to do what we
> are doing. But note that I don't need it myself: we have our own new
> FDW for remote DB2 z/OS (!) ... at LzLabs we are a building a software
> defined mainframe with PostgreSQL (of all things).
>
>  
>
> Hope it helps!
>
>  
>
> Dave Sharpe
>
> /I don't necessarily agree with everything I say./(MM)
>
> www.lzlabs.com
>

Hi,


As I'm involved in this thread I have given a review to this bug report
and I don't think that there is a problem here but as a disclaimer my
knowledge on internal transaction management is probably not enough to
have a definitive opinion.


Actually the callback function is called when the error is thrown:

    psql:eat_rollback2.sql:20: INFO:  00000: myTransactionCallback()
    XactEvent 2 (is abort) level 1 <-------------------
    LOCATION:  myTransactionCallback, eat_rollback.c:52
    psql:eat_rollback2.sql:20: ERROR:  XX000: no no no
    LOCATION:  mySubtransactionCallback, eat_rollback.c:65


this is probably why the callback is not called on the subsequent
ROLLBACK execution because abort processing is already done
(src/backend/access/transam/xact.c:3890).


With this simple test and the debug extension:

    BEGIN;
    DROP INDEX "index2", "index3"; -- will fail indexes doesn't exist
    ROLLBACK;


the callback function is also called at error level, not on the ROLLBACK:

    
---------------------------------------------------------------------------------BEGIN
    psql:eat_rollback-2/sql/eat_rollback3.sql:39: INFO:  00000:
    my_utility_hook() utility transaction kind 0 (is not rollback)
    LOCATION:  my_utility_hook, eat_rollback.c:28
    BEGIN
    
---------------------------------------------------------------------------------DROP
    INDEX + error
    psql:eat_rollback-2/sql/eat_rollback3.sql:41: INFO:  00000:
    myTransactionCallback() XactEvent 2 (is abort) level 1 <---------------
    LOCATION:  myTransactionCallback, eat_rollback.c:52
    psql:eat_rollback-2/sql/eat_rollback3.sql:41: ERROR:  42704: index
    "concur_index2" does not exist
    LOCATION:  DropErrorMsgNonExistent, tablecmds.c:1209
    
---------------------------------------------------------------------------------ROLLBACK
    psql:eat_rollback-2/sql/eat_rollback3.sql:43: INFO:  00000:
    my_utility_hook() utility transaction kind 3 (is rollback)
    LOCATION:  my_utility_hook, eat_rollback.c:28
    ROLLBACK


So this is probably the expected behavior , however the proposed patch
will throw the callback function two times which might not be what we want.


    
---------------------------------------------------------------------------------BEGIN
    psql:eat_rollback-2/sql/eat_rollback3.sql:39: INFO:  00000:
    my_utility_hook() utility transaction kind 0 (is not rollback)
    LOCATION:  my_utility_hook, eat_rollback.c:28
    BEGIN
    
---------------------------------------------------------------------------------DROP
    INDEX + error
    psql:eat_rollback-2/sql/eat_rollback3.sql:41: INFO:  00000:
    myTransactionCallback() XactEvent 2 (is abort) level 1
    <-------------- first call
    LOCATION:  myTransactionCallback, eat_rollback.c:52
    psql:eat_rollback-2/sql/eat_rollback3.sql:41: ERROR:  42704: index
    "concur_index2" does not exist
    LOCATION:  DropErrorMsgNonExistent, tablecmds.c:1209
    
---------------------------------------------------------------------------------ROLLBACK
    psql:eat_rollback-2/sql/eat_rollback3.sql:43: INFO:  00000:
    my_utility_hook() utility transaction kind 3 (is
    rollback)<-------------- second call
    LOCATION:  my_utility_hook, eat_rollback.c:28
    psql:eat_rollback-2/sql/eat_rollback3.sql:43: INFO:  00000:
    myTransactionCallback() XactEvent 2 (is abort) level 1
    LOCATION:  myTransactionCallback, eat_rollback.c:52
    ROLLBACK


But again I may be wrong.


-- 
Gilles Darold
http://www.darold.net/

Reply via email to