[ https://issues.apache.org/jira/browse/IGNITE-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15760665#comment-15760665 ]
Semen Boikov commented on IGNITE-4450:
--------------------------------------

Found the reason for the hang: when the second node is killed, an 'affinity change' exchange is in progress and the exchange worker is waiting for locks to be released. But the callback that removes locks for a failed node is called only when the exchange for the node failure starts, so the 'affinity change' exchange never finishes. We need to fix GridDhtPartitionsExchangeFuture to also call 'removeExplicitNodeLocks' when it receives 'node fail' events.

> Explicit lock is not released when node that acquired it is killed
> ------------------------------------------------------------------
>
>                 Key: IGNITE-4450
>                 URL: https://issues.apache.org/jira/browse/IGNITE-4450
>             Project: Ignite
>          Issue Type: Bug
>          Components: cache
>    Affects Versions: 1.8
>            Reporter: Valentin Kulichenko
>            Assignee: Semen Boikov
>            Priority: Critical
>             Fix For: 1.9
>
>         Attachments: ExplicitLockTest.java
>
>
> Test is attached. The scenario is the following:
> # Start first node and create a transactional cache.
> # Start second node and acquire a lock.
> # Kill second node after several seconds without doing unlock.
> # Try to start third node. It can't join because the lock is still held for some reason.
> The result is a hung cluster.
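
To illustrate the ordering problem described in the comment, here is a minimal toy model in plain Java. It is not Ignite code and does not reflect the real internals of GridDhtPartitionsExchangeFuture: the class ExchangeHangModel, its methods, and the exchange names are all hypothetical. It assumes only what the comment states: exchanges are processed one after another, an 'affinity change' exchange waits until explicit locks are released, and locks of a failed node are otherwise removed only when that node's own exchange starts.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

/** Toy model of the hang; all names are hypothetical, none come from Ignite. */
public class ExchangeHangModel {
    private static final String NODE_FAIL_PREFIX = "node-fail:";

    /** Ids of nodes that currently hold an explicit lock. */
    private final Set<String> lockOwners = new HashSet<>();

    /** Exchanges are processed strictly in order, one at a time. */
    private final Deque<String> exchangeQueue = new ArrayDeque<>();

    /** If true, locks are released as soon as the 'node fail' event arrives. */
    private final boolean releaseOnNodeFailEvent;

    public ExchangeHangModel(boolean releaseOnNodeFailEvent) {
        this.releaseOnNodeFailEvent = releaseOnNodeFailEvent;
    }

    public void acquireLock(String nodeId) {
        lockOwners.add(nodeId);
    }

    public void startExchange(String name) {
        exchangeQueue.addLast(name);
    }

    /** A 'node fail' event always queues an exchange for the failed node. */
    public void onNodeFailed(String nodeId) {
        exchangeQueue.addLast(NODE_FAIL_PREFIX + nodeId);
        if (releaseOnNodeFailEvent) {
            // Proposed behaviour: drop the failed node's locks immediately.
            lockOwners.remove(nodeId);
        }
    }

    /** Returns true if every queued exchange finished, false if the head is stuck. */
    public boolean drain() {
        while (!exchangeQueue.isEmpty()) {
            String head = exchangeQueue.peekFirst();
            if (head.startsWith(NODE_FAIL_PREFIX)) {
                // Reported behaviour: locks are removed only when this exchange starts.
                lockOwners.remove(head.substring(NODE_FAIL_PREFIX.length()));
            } else if (!lockOwners.isEmpty()) {
                // The exchange waits for lock release that can never happen.
                return false;
            }
            exchangeQueue.removeFirst();
        }
        return true;
    }

    /** Replays the scenario: lock held, affinity exchange in progress, node killed. */
    public static boolean simulate(boolean releaseOnNodeFailEvent) {
        ExchangeHangModel model = new ExchangeHangModel(releaseOnNodeFailEvent);
        model.acquireLock("node2");
        model.startExchange("affinity-change");
        model.onNodeFailed("node2");
        return model.drain();
    }

    public static void main(String[] args) {
        System.out.println("without fix, exchanges finish: " + simulate(false));
        System.out.println("with fix, exchanges finish: " + simulate(true));
    }
}
```

In this model the first run stalls because the 'affinity change' exchange sits ahead of the node-failure exchange that would have released the lock. The second run completes because the lock is dropped when the failure event is received.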