[
https://issues.apache.org/jira/browse/IGNITE-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057915#comment-17057915
]
Ivan Rakov commented on IGNITE-12746:
-------------------------------------
[~ilyak]
> What is the scope of this problem?
It's a rare concurrency scenario that is possible with optimistic transactions
(putAll to a transactional cache works as an optimistic + read_committed
transaction).
Here's the flow that leads to the deadlock:
1) TX 1 adds MVCC candidates on the primary node for keys 1, 3 and 5 (addLocal
call on prepare phase)
2) TX 2 adds MVCC candidates on the primary node for keys 2, 3 and 6 (addLocal
call on prepare phase; TX 1's candidate is first in the queue for key 3)
3) TX 2 acquires the lock for key 2 (readyLocks call on prepare phase)
4) TX 1 acquires the lock for key 1 (readyLocks call on prepare phase)
5) TX 2 tries to acquire the lock for key 3 (unsuccessfully: TX 1 becomes the
owner instead, because it is first in the queue and its previous chain item
[key 1] in the thread chain was concurrently owned by TX 1)
6) Neither TX 1 nor TX 2 continues processing its thread chain
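
To make the queue state in steps 1-5 easier to follow, here is a minimal toy
model in plain Java. It is only a sketch under assumptions: the class
CandidateQueueSketch and its methods addCandidates, head and canAcquire are
hypothetical names made up for illustration and are not Ignite's MVCC code. It
models only the per-key FIFO candidate queues and does not reproduce the hang
in step 6, which depends on how thread chains are processed inside Ignite.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of per-key candidate queues (hypothetical, not Ignite internals). */
public class CandidateQueueSketch {
    /** Key -> FIFO queue of transaction names that registered a candidate for it. */
    private final Map<Integer, Deque<String>> queues = new TreeMap<>();

    /** Models the "addLocal" step: enqueue one candidate per key, in key order. */
    public void addCandidates(String tx, int... keys) {
        for (int key : keys)
            queues.computeIfAbsent(key, k -> new ArrayDeque<>()).addLast(tx);
    }

    /** Returns the transaction whose candidate is first for the key, or null. */
    public String head(int key) {
        Deque<String> q = queues.get(key);
        return q == null ? null : q.peekFirst();
    }

    /** Assumption of this sketch: only the first candidate in the queue may own the key. */
    public boolean canAcquire(String tx, int key) {
        return tx.equals(head(key));
    }

    public static void main(String[] args) {
        CandidateQueueSketch m = new CandidateQueueSketch();
        m.addCandidates("TX1", 1, 3, 5); // step 1
        m.addCandidates("TX2", 2, 3, 6); // step 2

        System.out.println("TX2 may take key 2: " + m.canAcquire("TX2", 2)); // step 3
        System.out.println("TX1 may take key 1: " + m.canAcquire("TX1", 1)); // step 4
        System.out.println("TX2 may take key 3: " + m.canAcquire("TX2", 3)); // step 5
        System.out.println("First candidate for key 3: " + m.head(3));
    }
}
```

In this model key 3 is the only key both transactions share, and TX 1 is ahead
of TX 2 in its queue, which is why TX 2's attempt in step 5 cannot succeed.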
[~ascherbakov]
I agree with all of your proposals. I guess I need a fresh TC visa then.
> Regression in GridCacheColocatedDebugTest: putAll of sorted keys causes
> deadlock
> --------------------------------------------------------------------------------
>
> Key: IGNITE-12746
> URL: https://issues.apache.org/jira/browse/IGNITE-12746
> Project: Ignite
> Issue Type: Bug
> Components: cache
> Reporter: Ilya Kasnacheev
> Assignee: Ivan Rakov
> Priority: Blocker
> Fix For: 2.8.1
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> After this commit:
> 7d4bb49264b IGNITE-12329 Invalid handling of remote entries causes partition
> desync and transaction hanging in COMMITTING state.
> the following tests:
> org.apache.ignite.internal.processors.cache.distributed.dht.GridCacheColocatedDebugTest#testPutsMultithreadedColocated
> org.apache.ignite.internal.processors.cache.distributed.dht.GridCacheColocatedDebugTest#testPutsMultithreadedMixed
> started to be flaky because their ordered putAll operations started
> deadlocking.
> This is a regression compared to 2.7 and should be fixed, since it may affect
> production clusters.