[
https://issues.apache.org/jira/browse/FLINK-21488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292824#comment-17292824
]
Maciej Obuchowski commented on FLINK-21488:
-------------------------------------------
Unfortunately, swapping those does not work. Using statically generated bytes
as the global transaction id makes the second checkpoint fail, because Oracle
treats this as starting a second transaction branch in an already prepared
transaction.
So I believe the global transaction id has to be, well, globally unique. The
branch id is largely irrelevant here, as every transaction this sink makes
has only one branch.
My proposed solutions right now are:
1) Keep the "semantic" nature of the current generator and make gtrId a
combination of job id, task index and checkpoint id. The problems with this
solution are that we're adding 16 bytes to gtrId, and that getting the job id
at runtime is not trivial:
[http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/How-to-get-flink-JobId-in-runtime-td36756.html]
2) Add a 4- or 8-byte static random component to gtrId.
3) Make gtrId random on each generateXid call.
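
To make options 1 and 2 concrete, here is a minimal sketch of the two gtrId
layouts. It is not the sink's actual generator: the class and method names are
hypothetical, the field order is an assumption, and the job id is passed in as
16 raw bytes because obtaining it at runtime is the open problem noted above.

```java
import java.nio.ByteBuffer;
import java.security.SecureRandom;

/** Hypothetical helper, not part of Flink: gtrId layouts for options 1 and 2. */
final class GtridSketch {
    // XA caps a global transaction id at 64 bytes (javax.transaction.xa.Xid.MAXGTRIDSIZE).
    static final int MAX_GTRID_SIZE = 64;

    /** Option 1: job id (16 bytes) + task index (4 bytes) + checkpoint id (8 bytes). */
    static byte[] semanticGtrid(byte[] jobId, int subtaskIndex, long checkpointId) {
        if (jobId.length != 16) {
            throw new IllegalArgumentException("expected a 16-byte job id");
        }
        return ByteBuffer.allocate(16 + Integer.BYTES + Long.BYTES)
                .put(jobId)
                .putInt(subtaskIndex)
                .putLong(checkpointId)
                .array();
    }

    /** Option 2: static random prefix (for example 4 or 8 bytes) + task index + checkpoint id. */
    static byte[] randomPrefixedGtrid(byte[] staticPrefix, int subtaskIndex, long checkpointId) {
        int size = staticPrefix.length + Integer.BYTES + Long.BYTES;
        if (size > MAX_GTRID_SIZE) {
            throw new IllegalArgumentException("gtrId would exceed " + MAX_GTRID_SIZE + " bytes");
        }
        return ByteBuffer.allocate(size)
                .put(staticPrefix)
                .putInt(subtaskIndex)
                .putLong(checkpointId)
                .array();
    }

    /** Generates the random prefix once; the caller is expected to reuse it for every xid. */
    static byte[] newRandomPrefix(int size) {
        byte[] prefix = new byte[size];
        new SecureRandom().nextBytes(prefix);
        return prefix;
    }
}
```

With these layouts, two jobs that reach the same checkpoint id on the same task
index still produce different gtrIds, as long as their job ids (option 1) or
random prefixes (option 2) differ. How the random prefix would be kept stable
across restarts is left open in this sketch.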
> Jdbc XA sink - XID generation conflicts between jobs
> ----------------------------------------------------
>
> Key: FLINK-21488
> URL: https://issues.apache.org/jira/browse/FLINK-21488
> Project: Flink
> Issue Type: Bug
> Components: Connectors / JDBC
> Affects Versions: 1.13.0
> Reporter: Maciej Obuchowski
> Priority: Major
>
> I'm using Flink 1.13's JDBC XA sink to write data to Oracle DB using
> exactly-once semantics.
> I want to have two jobs doing this. One is working right now. When starting
> the second one, I encountered this error:
> org.apache.flink.util.FlinkRuntimeException: unable to start XA transaction,
> xid: 201:0600000000000000:9b1d1b84e8ce79bb, error -3: resource manager error
> has occurred. [XAErr (-3): A resource manager error has occured in the
> transaction branch. ORA-2079 SQLErr (0)]
> Oracle description:
> ORA-02079: cannot join a committing distributed transaction
> Cause: Once a transaction branch is prepared, no more new transaction
> branches are allowed to start, nor is the prepared transaction branch allowed
> to be joined.
> Action: Check the application code as this is an XA protocol violation.
> I've looked at the implementation of XID generation and noticed the
> following line:
> private transient byte[] gtridBuffer; // globalTransactionId = checkpoint id (long)
> My hypothesis is that the second job generated an xid that referred to a
> global transaction id that the first job created. If I'm right, then I'd
> suppose the fix would rely on embedding part of the job id inside gtridBuffer.
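
The hypothesis in the description can be illustrated with a small sketch. It
assumes, purely for illustration, that the gtrId is nothing but the checkpoint
id written as an 8-byte little-endian long. The byte order is a guess made
because it reproduces the gtrId part of the xid in the error message for
checkpoint id 6, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical illustration, not Flink code: a gtrId built from the checkpoint id alone. */
final class GtridCollisionSketch {
    /** Assumed encoding: the checkpoint id as an 8-byte little-endian long. */
    static byte[] checkpointOnlyGtrid(long checkpointId) {
        return ByteBuffer.allocate(Long.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(checkpointId)
                .array();
    }

    /** Renders bytes as lowercase hex, two digits per byte. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```

Because nothing job-specific goes into these bytes, any two jobs that reach
checkpoint 6 get the same gtrId, and `toHex(checkpointOnlyGtrid(6L))` yields
`0600000000000000`, the middle part of the xid reported above.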