[ 
https://issues.apache.org/jira/browse/IGNITE-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15760665#comment-15760665
 ] 

Semen Boikov commented on IGNITE-4450:
--------------------------------------

Found the reason for the hang: when the second node is killed, an 'affinity 
change' exchange is in progress and the exchange worker is waiting for locks to 
be released. But the callback that removes locks for a failed node is called 
only when the exchange for the node failure starts, so the 'affinity change' 
exchange never finishes. GridDhtPartitionsExchangeFuture needs to be fixed to 
also call 'removeExplicitNodeLocks' when it receives 'node fail' events. 
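
To make the ordering problem concrete, here is a small standalone Java model of 
the hang. It is only a sketch and not Ignite code: the class 
ExchangeLockSketch, the lock map, the event queue and the releaseOnNodeFail 
flag are all hypothetical, and only the name removeExplicitNodeLocks is borrowed 
from the method mentioned above. The in-progress exchange can finish only once 
no explicit locks remain, and it sees 'node fail' events while it waits.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Toy model of an exchange that waits for explicit locks (not Ignite code). */
final class ExchangeLockSketch {
    /** Explicit locks: cache key -> id of the node that owns the lock. */
    private final Map<String, String> explicitLocks = new HashMap<>();

    /** Records that the given node holds an explicit lock on the key. */
    void lock(String key, String nodeId) {
        explicitLocks.put(key, nodeId);
    }

    /** Drops every explicit lock owned by the given (failed) node. */
    void removeExplicitNodeLocks(String nodeId) {
        explicitLocks.values().removeIf(nodeId::equals);
    }

    /**
     * Models one in-progress exchange that waits for all locks to be released.
     *
     * @param nodeFailEvents ids of nodes that fail while the exchange waits
     * @param releaseOnNodeFail hypothetical switch: true models the proposed fix
     * @return true if the exchange can finish, false if it would wait forever
     */
    boolean runExchange(Deque<String> nodeFailEvents, boolean releaseOnNodeFail) {
        while (!explicitLocks.isEmpty()) {
            String failedNode = nodeFailEvents.poll();

            // No more events and locks are still held: nothing can release them.
            if (failedNode == null)
                return false;

            // Proposed behaviour: release the failed node's locks right away
            // instead of waiting for that node's own exchange to start.
            if (releaseOnNodeFail)
                removeExplicitNodeLocks(failedNode);
        }

        return true;
    }

    public static void main(String[] args) {
        for (boolean fixed : new boolean[] {false, true}) {
            ExchangeLockSketch sketch = new ExchangeLockSketch();
            sketch.lock("key", "node2"); // second node takes an explicit lock

            Deque<String> events = new ArrayDeque<>();
            events.add("node2"); // second node is killed during the exchange

            System.out.println("releaseOnNodeFail=" + fixed
                + " -> finished=" + sketch.runExchange(events, fixed));
        }
    }
}
```

With the flag off the model reports finished=false, which corresponds to the 
'affinity change' exchange never finishing. With the flag on, the failed node's 
locks are dropped as soon as the event is seen and the exchange completes.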

> Explicit lock is not released when node that acquired it is killed
> ------------------------------------------------------------------
>
>                 Key: IGNITE-4450
>                 URL: https://issues.apache.org/jira/browse/IGNITE-4450
>             Project: Ignite
>          Issue Type: Bug
>          Components: cache
>    Affects Versions: 1.8
>            Reporter: Valentin Kulichenko
>            Assignee: Semen Boikov
>            Priority: Critical
>             Fix For: 1.9
>
>         Attachments: ExplicitLockTest.java
>
>
> Test is attached. The scenario is the following:
> # Start first node and create a transactional cache.
> # Start second node and acquire a lock.
> # Kill second node after several seconds without doing unlock.
> # Try to start third node. It can't join because the lock is still held for 
> some reason. The result is a hung cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
