[
https://issues.apache.org/jira/browse/IGNITE-28365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexey Scherbakov updated IGNITE-28365:
---------------------------------------
Description:
The scenario:
Assume coordinator node A, primary partition replica node B
# Tx1 is started on A and locks key K on node B. tx1 timeout=100000
# Tx2 is started on A and requests the locked key K. tx2 timeout=1000
# Tx2 is rolled back by timeout on node A
# A cleanup request is sent to node B
# The method lockManager.failAllWaiters is called during cleanup in
awaitCleanupReadyFutures and fails the lock request for tx2, but the reason is
lost, because the transaction state on B doesn't contain info about the timeout
# The user receives the wrong exception (no mention of "finished due to timeout")
The transaction killed state can also be lost in a similar scenario: just
replace step 3 with a kill.
The fix is to propagate the rollback reason in the cleanup request.
was:
The scenario:
Assume coordinator node A, primary partition replica node B
# Tx1 is started on A and locks key K on node B. tx1 timeout=100000
# Tx2 is started on A and requests the locked key K. tx2 timeout=1000
# Tx2 is rolled back by timeout on node A
# A cleanup request is sent to node B
# The method lockManager.failAllWaiters is called during cleanup in
awaitCleanupReadyFutures and fails the lock request for tx2, but the reason is
lost, because the transaction state on B doesn't contain info about the timeout
# The user receives the wrong exception (no mention of "finished due to timeout")
The fix is to propagate the rollback reason in the cleanup request.
> Transaction failure reason is lost on rollback due to timeout
> -------------------------------------------------------------
>
> Key: IGNITE-28365
> URL: https://issues.apache.org/jira/browse/IGNITE-28365
> Project: Ignite
> Issue Type: Bug
> Components: rw transactions ai3
> Reporter: Alexey Scherbakov
> Assignee: Anton Laletin
> Priority: Major
> Labels: ignite-3
>
> The scenario:
> Assume coordinator node A, primary partition replica node B
> # Tx1 is started on A and locks key K on node B. tx1 timeout=100000
> # Tx2 is started on A and requests the locked key K. tx2 timeout=1000
> # Tx2 is rolled back by timeout on node A
> # A cleanup request is sent to node B
> # The method lockManager.failAllWaiters is called during cleanup in
> awaitCleanupReadyFutures and fails the lock request for tx2, but the reason
> is lost, because the transaction state on B doesn't contain info about the timeout
> # The user receives the wrong exception (no mention of "finished due to timeout")
> The transaction killed state can also be lost in a similar scenario: just
> replace step 3 with a kill.
> The fix is to propagate the rollback reason in the cleanup request.
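
For illustration only, the proposed fix could be sketched as below. This is a
toy model, not Ignite code: RollbackReason, CleanupRequest, ToyLockManager,
TxRolledBackException, describe and onCleanup are all hypothetical names, and
the two-argument failAllWaiters signature is an assumption. The only point is
that the reason travels inside the cleanup request, so node B can fail the
waiting lock request with a message that names the timeout (or the kill).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

final class RollbackReasonSketch {
    /** Hypothetical set of reasons known to the coordinator (node A). */
    enum RollbackReason { USER_ROLLBACK, TIMEOUT, KILLED }

    /** Hypothetical cleanup request that carries the reason to node B. */
    static final class CleanupRequest {
        final UUID txId;
        final RollbackReason reason;

        CleanupRequest(UUID txId, RollbackReason reason) {
            this.txId = txId;
            this.reason = reason;
        }
    }

    /** Hypothetical exception type surfaced to the user. */
    static final class TxRolledBackException extends RuntimeException {
        TxRolledBackException(String message) {
            super(message);
        }
    }

    /** Toy stand-in for the lock manager on the primary replica (node B). */
    static final class ToyLockManager {
        private final Map<UUID, List<CompletableFuture<Void>>> waiters = new HashMap<>();

        /** Registers a lock request that stays pending until it is failed. */
        synchronized CompletableFuture<Void> acquire(UUID txId) {
            CompletableFuture<Void> fut = new CompletableFuture<>();
            waiters.computeIfAbsent(txId, id -> new ArrayList<>()).add(fut);
            return fut;
        }

        /** Fails every pending lock request of the transaction with the given cause. */
        synchronized void failAllWaiters(UUID txId, Throwable cause) {
            List<CompletableFuture<Void>> futs = waiters.remove(txId);
            if (futs != null) {
                for (CompletableFuture<Void> fut : futs) {
                    fut.completeExceptionally(cause);
                }
            }
        }
    }

    /** Maps the propagated reason to a user-facing message. */
    static String describe(UUID txId, RollbackReason reason) {
        switch (reason) {
            case TIMEOUT:
                return "Transaction " + txId + " is finished due to timeout";
            case KILLED:
                return "Transaction " + txId + " is killed";
            default:
                return "Transaction " + txId + " is rolled back";
        }
    }

    /** Cleanup handler on node B: the cause is built from the request, not from local tx state. */
    static void onCleanup(ToyLockManager lockManager, CleanupRequest req) {
        lockManager.failAllWaiters(req.txId, new TxRolledBackException(describe(req.txId, req.reason)));
    }

    public static void main(String[] args) {
        ToyLockManager lockManager = new ToyLockManager();
        UUID tx2 = UUID.randomUUID();

        // Step 2: tx2 requests the locked key and waits.
        CompletableFuture<Void> lockRequest = lockManager.acquire(tx2);

        // Steps 3-5: tx2 times out on A, and the cleanup request reaches B with the reason attached.
        onCleanup(lockManager, new CleanupRequest(tx2, RollbackReason.TIMEOUT));

        // Step 6: the waiter now fails with a message that mentions the timeout.
        try {
            lockRequest.join();
        } catch (CompletionException e) {
            System.out.println(e.getCause().getMessage());
        }
    }
}
```

Passing RollbackReason.KILLED instead of TIMEOUT covers the kill variant of
the scenario in the same way.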