[ 
https://issues.apache.org/jira/browse/IGNITE-12451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17200168#comment-17200168
 ] 

Aleksey Plekhanov commented on IGNITE-12451:
--------------------------------------------

{{hasLockCollisions}} sometimes gives false-positive results, but I don't think 
that happens often. If we have already waited some time for a lock, this thread 
is most likely already in a deadlock, and the thread that holds the lock we are 
trying to acquire is most likely the one blocking us. If not, we just retry and 
nothing bad happens: the user operation is not affected by this retry.

Fair cycle searching is more resource-consuming (it affects performance and 
footprint). This solution doesn't add any footprint and has no performance 
impact, and I see no advantages of fair cycle searching here. 
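
To make the cost comparison concrete, here is a minimal sketch of what a fair cycle search would need. It is an illustration only, not Ignite code: the class {{WaitForGraph}} and its methods are hypothetical names, and it assumes each thread waits for at most one lock at a time. The map is the extra footprint, and the traversal on every long wait is the extra work.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical wait-for graph: each thread waits for at most one lock owner. */
public class WaitForGraph {
    /** Waiting thread id -> id of the thread that owns the lock it waits for. */
    private final Map<Long, Long> waitsFor = new ConcurrentHashMap<>();

    /** Must be called before a thread starts blocking on a lock. */
    public void startWaiting(long waiter, long owner) {
        waitsFor.put(waiter, owner);
    }

    /** Must be called once the lock is acquired or the wait is abandoned. */
    public void stopWaiting(long waiter) {
        waitsFor.remove(waiter);
    }

    /** Follows the wait-for chain from {@code start}; true if it leads back to {@code start}. */
    public boolean inCycle(long start) {
        Set<Long> seen = new HashSet<>();
        Long cur = waitsFor.get(start);

        // Stop when the chain ends or when we revisit a thread that is not the start.
        while (cur != null && seen.add(cur)) {
            if (cur == start)
                return true;

            cur = waitsFor.get(cur);
        }

        return false;
    }
}
```

Every lock wait has to be registered and unregistered in the map, which is the bookkeeping the heuristic check avoids.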

This algorithm does guarantee that the system won't hang. Since only younger 
threads retry all locks, older threads don't retry and will always make 
progress (there will be no infinite retries).
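
The "only younger threads retry" rule can be sketched as below. This is a simplified illustration under stated assumptions, not the code from the patch: {{AgedLock}}, {{withLocks}} and {{demo}} are hypothetical names, a plain age comparison with the lock owner stands in for {{hasLockCollisions}}, the locks passed in are assumed to be distinct, and fair mode is used so that a queued older waiter gets the lock first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

public class AgeOrderedLocking {
    /** Hypothetical wrapper: a lock plus the age of the operation that owns it. */
    public static final class AgedLock {
        // Fair mode is an assumption of this sketch: a queued older waiter goes first.
        final ReentrantLock lock = new ReentrantLock(true);
        volatile long ownerAge = Long.MAX_VALUE; // MAX_VALUE means "not owned"
    }

    private static final AtomicLong AGE_SEQ = new AtomicLong();

    /** Locks all (distinct) locks in the given order, runs the action, returns the retry count. */
    public static int withLocks(List<AgedLock> locks, long waitMs, Runnable action)
            throws InterruptedException {
        long myAge = AGE_SEQ.incrementAndGet(); // smaller value = older; kept across retries
        for (int retries = 0; ; retries++) {
            List<AgedLock> held = new ArrayList<>();
            try {
                if (lockAll(locks, held, myAge, waitMs)) {
                    action.run();
                    return retries;
                }
            } finally {
                // Release everything, both after success and before a retry.
                for (int i = held.size() - 1; i >= 0; i--) {
                    AgedLock l = held.get(i);
                    l.ownerAge = Long.MAX_VALUE;
                    l.lock.unlock();
                }
            }
        }
    }

    private static boolean lockAll(List<AgedLock> locks, List<AgedLock> held, long myAge,
            long waitMs) throws InterruptedException {
        for (AgedLock l : locks) {
            while (!l.lock.tryLock(waitMs, TimeUnit.MILLISECONDS)) {
                // Timed out. A younger thread that already holds locks gives way;
                // an older thread (or one holding nothing) simply keeps waiting.
                if (!held.isEmpty() && myAge > l.ownerAge)
                    return false;
            }
            l.ownerAge = myAge;
            held.add(l);
        }
        return true;
    }

    /** Two threads lock the same pair in opposite order; returns completed operations. */
    public static int demo(int iterations) {
        AgedLock a = new AgedLock(), b = new AgedLock();
        AtomicInteger done = new AtomicInteger();
        Thread t1 = new Thread(() -> run(Arrays.asList(a, b), iterations, done));
        Thread t2 = new Thread(() -> run(Arrays.asList(b, a), iterations, done));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return done.get();
    }

    private static void run(List<AgedLock> order, int iterations, AtomicInteger done) {
        try {
            for (int i = 0; i < iterations; i++)
                withLocks(order, 5, done::incrementAndGet);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

In this sketch a false positive only costs the younger thread one release-and-retry cycle, and the oldest operation never releases its locks, so it always completes.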

Throwing an exception to the user leads to a bad user experience. If we can 
perform this operation without an exception, why should we throw it? 


> Introduce deadlock detection for cache entry reentrant locks
> ------------------------------------------------------------
>
>                 Key: IGNITE-12451
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12451
>             Project: Ignite
>          Issue Type: Improvement
>    Affects Versions: 2.7.6
>            Reporter: Ivan Rakov
>            Assignee: Mirza Aliev
>            Priority: Major
>             Fix For: 2.10
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Aside from IGNITE-12365, we still have a possible threat of cache-entry-level 
> deadlock in case of careless usage of JCache mass operations (putAll, 
> removeAll):
> 1. If two different user threads perform putAll on the same two keys in 
> reverse order (and the primary node for both keys is the same), there's a 
> chance that sys-stripe threads will be deadlocked.
> 2. Even without a direct contract violation on the user side, a HashMap can be 
> passed as the argument to putAll. Even if user threads call mass 
> operations with two keys in the same order, HashMap iteration order is not 
> strictly defined, which may cause the same deadlock. 
> Local deadlock detection should mitigate this issue. We can create a wrapper 
> for ReentrantLock with logic that performs cycle detection in the wait-for 
> graph when we have been waiting for lock acquisition for too long. An 
> exception will be thrown from one of the threads in such a case, failing the 
> user operation but letting the system make progress.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
