[
https://issues.apache.org/jira/browse/IGNITE-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17200206#comment-17200206
]
Mirza Aliev edited comment on IGNITE-12451 at 9/22/20, 4:42 PM:
----------------------------------------------------------------
[~alex_pl]
> Since only younger threads retry all locks
Could you please explain this phrase in more detail? As far as I
understand, only _older_ threads retry all locks, because their timeout has
already expired, while younger threads are still trying to lock their entries
because their timeout has not expired yet. This means there is a chance that
the older thread will deadlock again with a new putAll thread that arrives at
the moment the old thread starts taking locks for its entries, and so on. As
a result, the older thread may never make any progress.
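To make the scenario concrete, here is a minimal sketch of the timeout-and-retry pattern I have in mind. It is an assumption about the general approach, not the code from the patch: the class name TimedLockAll, the method tryLockAll and the timeout parameter are hypothetical. A thread that times out on any lock releases everything it already holds and has to start over, which is exactly the point where a newly arrived putAll thread can get in ahead of it.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical illustration only; not taken from the Ignite code base. */
final class TimedLockAll {
    /**
     * Tries to take every lock in the given order. If any single acquisition
     * times out, all locks taken so far are released and false is returned,
     * so the caller has to retry the whole batch from the first lock.
     */
    static boolean tryLockAll(List<ReentrantLock> locks, long timeoutMs)
        throws InterruptedException {
        Deque<ReentrantLock> held = new ArrayDeque<>();
        boolean success = false;

        try {
            for (ReentrantLock lock : locks) {
                if (!lock.tryLock(timeoutMs, TimeUnit.MILLISECONDS))
                    return false; // Timed out: give up on the whole batch.

                held.push(lock);
            }

            success = true;

            return true;
        }
        finally {
            // On timeout or interruption, back off completely (reverse order).
            if (!success) {
                while (!held.isEmpty())
                    held.pop().unlock();
            }
        }
    }
}
```

Nothing in this sketch gives priority to the thread that has already backed off once, so in principle it can lose the race on every retry.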
> Introduce deadlock detection for cache entry reentrant locks
> ------------------------------------------------------------
>
> Key: IGNITE-12451
> URL: https://issues.apache.org/jira/browse/IGNITE-12451
> Project: Ignite
> Issue Type: Improvement
> Affects Versions: 2.7.6
> Reporter: Ivan Rakov
> Assignee: Mirza Aliev
> Priority: Major
> Fix For: 2.10
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Aside from IGNITE-12365, we still have a possible threat of a
> cache-entry-level deadlock in case of careless usage of JCache mass
> operations (putAll, removeAll):
> 1. If two different user threads perform putAll on the same two keys (both
> with the same primary node) in reverse order, there's a chance that
> sys-stripe threads will be deadlocked.
> 2. Even without a direct contract violation from the user side, a HashMap
> can be passed as the argument for putAll. Even if user threads have called
> mass operations with two keys in the same order, HashMap iteration order is
> not strictly defined, which may cause the same deadlock.
> Local deadlock detection should mitigate this issue. We can create a wrapper
> for ReentrantLock with logic that performs cycle detection in the wait-for
> graph in case we are waiting for lock acquisition for too long. An exception
> will be thrown from one of the threads in such a case, failing the user
> operation, but letting the system make progress.
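
The cycle detection described in the issue can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation proposed for Ignite: the class WaitForGraph and its methods are hypothetical, threads and locks are identified by plain strings, and the sketch is not thread-safe. It relies on the fact that a blocked thread waits for exactly one lock and each lock has at most one owner, so checking for a cycle reduces to following a single chain of "thread waits for lock, lock is owned by thread" edges.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration only; not taken from the Ignite code base. */
final class WaitForGraph {
    /** Lock name -> name of the thread that currently owns it. */
    private final Map<String, String> owner = new HashMap<>();

    /** Thread name -> name of the lock that thread is blocked on. */
    private final Map<String, String> waitsFor = new HashMap<>();

    /** Records that the thread now owns the lock and is no longer waiting. */
    void acquired(String thread, String lock) {
        owner.put(lock, thread);
        waitsFor.remove(thread);
    }

    /** Records that the thread is blocked waiting for the lock. */
    void waiting(String thread, String lock) {
        waitsFor.put(thread, lock);
    }

    /** Records that the lock has been released by its owner. */
    void released(String lock) {
        owner.remove(lock);
    }

    /**
     * Follows the chain thread -> awaited lock -> owner of that lock.
     * Returns true only if the chain leads back to the starting thread,
     * i.e. the starting thread is itself part of a wait-for cycle.
     */
    boolean isDeadlocked(String start) {
        Set<String> seen = new HashSet<>();
        String cur = start;

        while (cur != null && seen.add(cur)) {
            String lock = waitsFor.get(cur);

            cur = lock == null ? null : owner.get(lock);
        }

        return start.equals(cur);
    }
}
```

In a wrapper around ReentrantLock, a check like isDeadlocked would presumably run only after the lock acquisition has been pending for longer than some threshold, and the thread that finds itself in a cycle would throw an exception instead of continuing to wait.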
--
This message was sent by Atlassian Jira
(v8.3.4#803005)