[
https://issues.apache.org/jira/browse/IGNITE-17731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dmitry Pavlov updated IGNITE-17731:
-----------------------------------
Labels: IEP-89 ise (was: IEP-89)
> Possible LRT in case of postponed GridDhtLockRequest
> ----------------------------------------------------
>
> Key: IGNITE-17731
> URL: https://issues.apache.org/jira/browse/IGNITE-17731
> Project: Ignite
> Issue Type: Bug
> Reporter: Mikhail Petrov
> Priority: Major
> Labels: IEP-89, ise
>
> Let's assume the following scenario:
> 1. TX coordinator starts transaction and sends GridDhtLockRequest to "near"
> nodes.
> 2. Some GridDhtLockRequest messages are delayed by the network.
> 3. Not all "near" nodes receive GridDhtLockRequest and, as a result, not all of
> them respond to the TX coordinator.
> 4. TX coordinator aborts the TX on timeout.
> 5. Completed TX ID is stored in IgniteTxManager#completedVersHashMap.
> 6. TX load continues (assume puts in TX cache) and the record about the
> completed TX described above is evicted from the map.
> 7. GridDhtLockRequest from step 2 is finally received by the "near"
> nodes. They lock keys, start the local TX, and respond to the TX coordinator.
> But currently the TX coordinator ignores the GridDhtLockResponse and does
> nothing, because the info about the initial TX was evicted.
> As a result, near nodes keep holding key locks and waiting for the next steps
> of the TX protocol, which will never happen because the TX was already completed.
> As a workaround, the TX can be explicitly killed on the near node.
> It is proposed to handle this situation and not acquire locks on the near node
> if the TX coordinator or other cluster nodes have no record of the TX to which
> the current lock request belongs.
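
The scenario and the proposed check can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Ignite code: the classes Coordinator and NearNode, their methods, and the bounded LinkedHashMap standing in for IgniteTxManager#completedVersHashMap are all hypothetical. In a real cluster the "is this TX still active?" question would be a message round trip rather than a direct method call, and it would have to cope with races that the sketch ignores.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Toy model of the postponed lock request scenario. None of these types are Ignite classes. */
public class PostponedLockRequestSketch {
    /** Hypothetical stand-in for the TX coordinator's bookkeeping. */
    static class Coordinator {
        private final Set<String> activeTxs = new HashSet<>();
        private final Map<String, Boolean> completedTxs;

        Coordinator(final int maxCompleted) {
            // Bounded, insertion-ordered map: the eldest completed TX is evicted first.
            completedTxs = new LinkedHashMap<String, Boolean>() {
                @Override protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                    return size() > maxCompleted;
                }
            };
        }

        void start(String txId) { activeTxs.add(txId); }

        /** Commit, rollback, or abort on timeout: the TX moves to the bounded completed map. */
        void complete(String txId) {
            activeTxs.remove(txId);
            completedTxs.put(txId, Boolean.TRUE);
        }

        boolean isActive(String txId) { return activeTxs.contains(txId); }

        boolean isKnownCompleted(String txId) { return completedTxs.containsKey(txId); }
    }

    /** Hypothetical stand-in for a "near" node that receives lock requests. */
    static class NearNode {
        private final Map<String, String> locks = new HashMap<>(); // key -> owning TX id

        /** Proposed behavior: acquire the lock only if the coordinator still knows the TX as active. */
        boolean onLockRequest(String txId, String key, Coordinator crd) {
            if (!crd.isActive(txId))
                return false; // Unknown or completed TX: do not lock, so nothing is left hanging.

            locks.put(key, txId);
            return true;
        }

        boolean holdsLock(String key) { return locks.containsKey(key); }
    }

    public static void main(String[] args) {
        Coordinator crd = new Coordinator(2);
        NearNode near = new NearNode();

        crd.start("tx-1");
        crd.complete("tx-1"); // Step 4: aborted on timeout.

        // Step 6: further TX load pushes tx-1 out of the bounded completed map.
        for (String id : new String[] {"tx-2", "tx-3"}) {
            crd.start(id);
            crd.complete(id);
        }

        // Step 7: the delayed lock request for tx-1 finally arrives.
        boolean locked = near.onLockRequest("tx-1", "k1", crd);

        System.out.println("tx-1 known as completed: " + crd.isKnownCompleted("tx-1"));
        System.out.println("lock acquired for tx-1: " + locked);
    }
}
```

The design point of the sketch is that the near node decides based on whether the TX is known as active, not on whether it is known as completed, so eviction from the completed map no longer matters for the late request.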