[
https://issues.apache.org/jira/browse/IGNITE-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057788#comment-17057788
]
Alexey Scherbakov edited comment on IGNITE-12746 at 3/12/20, 11:17 AM:
-----------------------------------------------------------------------
[~ivan.glukos]
The changes look good to me.
To summarize a root cause: in the current implementation we can own a lock for
another candidate version during scan of thread chain. In this case thread
chain for another candidate will not be processed, but it should.
My comments on PR:
1. For better clarity I suggest a change
{code}
boolean lockedByThreadChainVer = owner != null && owner.hasCandidate(ver);
// If locked by the thread chain version no need to do recursive thread chain
scans for the same chain.
// This call must be made outside of synchronization.
checkOwnerChanged(prev, owner, val, lockedByThreadChainVer);
return !lockedByThreadChainVer;
{code}
2. Newly added test should be executed @WithSystemProperty(key =
"IGNITE_TO_STRING_MAX_LENGTH", value = "100000"), otherwise txs will be
truncated in output on assertion.
3. I've noticed the enlisted keys are printed in wrong order. This is because
of a bug in
org.apache.ignite.internal.processors.cache.transactions.IgniteTxStateImpl#allEntriesCopy.
We should use ArrayList here.
Can you fix it ?
was (Author: ascherbakov):
[~ivan.glukos]
The changes look good to me.
To summarize a root cause: in the current implementation we can own a lock for
another candidate version during scan of thread chain. In this case thread
chain for another candidate will not be processed, but it should.
My comments on PR:
1. For better clarity I suggest a change
{code}
boolean lockedByThreadChainVer = owner != null && owner.hasCandidate(ver);
// If locked by the thread chain version no need to do recursive thread chain
scans for the same chain.
// This call must be made outside of synchronization.
checkOwnerChanged(prev, owner, val, lockedByThreadChainVer);
return !lockedByThreadChainVer;
{code}
2. Newly added test should be executed @WithSystemProperty(key =
"IGNITE_TO_STRING_MAX_LENGTH", value = "100000"), otherwise txs will be
truncated in output on assertion.
3. I've noticed the enlisted keys are printed in wrong order. This is because
of bug in
org.apache.ignite.internal.processors.cache.transactions.IgniteTxStateImpl#allEntriesCopy.
We should use ArrayList here.
Can you fix it ?
> Regression in GridCacheColocatedDebugTest: putAll of sorted keys causes
> deadlock
> --------------------------------------------------------------------------------
>
> Key: IGNITE-12746
> URL: https://issues.apache.org/jira/browse/IGNITE-12746
> Project: Ignite
> Issue Type: Bug
> Components: cache
> Reporter: Ilya Kasnacheev
> Assignee: Ivan Rakov
> Priority: Blocker
> Fix For: 2.8.1
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> After this commit:
> 7d4bb49264b IGNITE-12329 Invalid handling of remote entries causes partition
> desync and transaction hanging in COMMITTING state.
> the following tests:
> org.apache.ignite.internal.processors.cache.distributed.dht.GridCacheColocatedDebugTest#testPutsMultithreadedColocated
> org.apache.ignite.internal.processors.cache.distributed.dht.GridCacheColocatedDebugTest#testPutsMultithreadedMixed
> started to be flaky because their ordered putAll operations started
> deadlocking.
> This is a regression compared to 2.7 and should be fixed, since it may affect
> production clusters.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)