Anton Vinogradov commented on IGNITE-8783:

I found the reason for the hang.
Look at this code:
// There is final ack for created latch.
if (pendingAcks.containsKey(latchId)) {
        // This causes pending acks to be lost when the coordinator failure
        // has not been handled yet (e.g. we are still handling another node's failure).
        pendingAcks.remove(latchId);
        clientLatches.put(latchId, latch);

So I propose replacing this code with a simple

clientLatches.put(latchId, latch);
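
To make the difference concrete, here is a minimal sketch of the two variants. It is a toy model, not the real Ignite code: only pendingAcks, clientLatches and latchId come from the snippet above, while the class, the String ids, the placeholder latch object and the acksAvailableLater() helper are hypothetical and exist only for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of the two maps from the snippet; NOT the real Ignite classes. */
class LatchBookkeepingSketch {
    /** Acks received before a latch was created, keyed by latch id (hypothetical String ids). */
    final Map<String, Set<String>> pendingAcks = new HashMap<>();

    /** Latches registered on this node; the value is a placeholder for a latch object. */
    final Map<String, Object> clientLatches = new HashMap<>();

    /** Records an ack from a node for a latch that may not exist yet. */
    void onAck(String latchId, String nodeId) {
        pendingAcks.computeIfAbsent(latchId, id -> new HashSet<>()).add(nodeId);
    }

    /** Variant from the quoted code: pending acks are removed when the latch is created. */
    void createLatchDroppingAcks(String latchId) {
        if (pendingAcks.containsKey(latchId)) {
            pendingAcks.remove(latchId);
            clientLatches.put(latchId, new Object());
        }
    }

    /** Proposed variant: only register the latch and leave pending acks untouched. */
    void createLatchKeepingAcks(String latchId) {
        clientLatches.put(latchId, new Object());
    }

    /** Hypothetical later step (e.g. a delayed coordinator-failure handler) reading the acks. */
    Set<String> acksAvailableLater(String latchId) {
        return pendingAcks.getOrDefault(latchId, new HashSet<>());
    }
}
```

In this model, an ack recorded before createLatchDroppingAcks() is no longer visible to any later step, whereas createLatchKeepingAcks() leaves it in pendingAcks for whoever processes the topology change afterwards.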

Could you please explain the idea behind handling the final message from 
old_coordinator? As far as I can see, latches will be recreated on each 
topology change and acks will be resent.

> Failover tests periodically cause hanging of the whole Data Structures suite 
> on TC
> ----------------------------------------------------------------------------------
>                 Key: IGNITE-8783
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8783
>             Project: Ignite
>          Issue Type: Bug
>          Components: data structures
>            Reporter: Ivan Rakov
>            Assignee: Anton Vinogradov
>            Priority: Major
>              Labels: MakeTeamcityGreenAgain
> History of suite runs: 
> https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_DataStructures&tab=buildTypeHistoryList&branch_IgniteTests24Java8=%3Cdefault%3E
> Chance of suite hang is 18% in master (based on previous 50 runs).
> Hang is always caused by one of the following failover tests:
> {noformat}
> GridCacheReplicatedDataStructuresFailoverSelfTest#testAtomicSequenceConstantTopologyChange
> GridCachePartitionedDataStructuresFailoverSelfTest#testFairReentrantLockConstantTopologyChangeNonFailoverSafe
> {noformat}

This message was sent by Atlassian JIRA
