[ 
https://issues.apache.org/jira/browse/IGNITE-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash Shinde updated IGNITE-13132:
----------------------------------
    Description: 
*Ignite version: 2.6.0*

We are using Ignite's distributed CountDownLatch to make sure that cache 
loading is completed on all server nodes. We do this to ensure that our 
Kafka consumers start only after cache loading is complete on all server 
nodes. This is the basic criterion that must be fulfilled before actual 
processing starts.

 We have 4 server nodes, and the countdown latch is initialized to 4. We use 
the "cache.loadCache" method to start cache loading. When each server completes 
cache loading, it decrements the count by 1 using the countDown method, so when 
all nodes have completed cache loading the count reaches zero. When the count 
reaches zero, we start the Kafka consumers on all server nodes.
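The coordination described above can be sketched with a plain java.util.concurrent.CountDownLatch; Ignite's distributed IgniteCountDownLatch (obtained via ignite.countDownLatch(name, count, autoDelete, create)) exposes the analogous countDown()/await() calls. The class and method names below are illustrative only, not taken from our code:

```java
import java.util.concurrent.CountDownLatch;

public class CacheLoadGate {
    // One "node" per thread; each counts down after its (simulated) cache
    // load, mirroring cache.loadCache(...) followed by latch.countDown()
    // on a server node.
    static long awaitAllNodesLoaded(int nodes) {
        CountDownLatch latch = new CountDownLatch(nodes);
        // With Ignite this would be a distributed latch, e.g.:
        //   IgniteCountDownLatch latch =
        //       ignite.countDownLatch("cacheLoad", nodes, false, true);
        for (int i = 0; i < nodes; i++) {
            new Thread(() -> {
                // ... cache.loadCache(null) would run here ...
                latch.countDown();
            }).start();
        }
        try {
            latch.await(); // blocks until the count reaches zero
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return latch.getCount(); // 0 => safe to start the Kafka consumers
    }

    public static void main(String[] args) {
        System.out.println("remaining count: " + awaitAllNodesLoaded(4));
    }
}
```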

 However, we saw weird behavior in our production environment. Three of the 
server nodes were shut down at the same time, while one node remained alive. 
When this happened, the countdown was reinitialized to its original value, 
i.e. 4. I am not able to reproduce this in the dev environment.
  
 Note: Partition loss occurred when the three nodes went down at the same time.

 



> Countdown latch gets reinitialized to its original value (4) when one or 
> more (but not all) nodes go down. 
> --------------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-13132
>                 URL: https://issues.apache.org/jira/browse/IGNITE-13132
>             Project: Ignite
>          Issue Type: Bug
>          Components: data structures
>    Affects Versions: 2.6
>            Reporter: Akash Shinde
>            Priority: Critical
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
