[
https://issues.apache.org/jira/browse/IGNITE-13132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Akash Shinde updated IGNITE-13132:
----------------------------------
Affects Version/s: 2.6
> Countdown latch gets reinitialized to original value (4) when one or more
> (but not all) nodes go down.
> --------------------------------------------------------------------------------------------------------
>
> Key: IGNITE-13132
> URL: https://issues.apache.org/jira/browse/IGNITE-13132
> Project: Ignite
> Issue Type: Bug
> Components: data structures
> Affects Versions: 2.6
> Reporter: Akash Shinde
> Priority: Critical
>
> We are using Ignite's distributed countdown latch to make sure that cache
> loading has completed on all server nodes. We do this so that our Kafka
> consumers start only after cache loading is complete on every server node;
> this is the basic criterion that must be fulfilled before actual processing
> starts.
> We have 4 server nodes and the countdown latch is initialized to 4. We use
> the "cache.loadCache" method to start cache loading. When each server
> completes cache loading, it reduces the count by 1 using the countDown
> method, so when all nodes have finished loading, the count reaches zero. At
> that point we start the Kafka consumers on all server nodes (see the sketch
> below the quoted description).
> But we saw weird behavior in the prod env. Three server nodes were shut
> down at the same time while 1 node stayed alive. When this happened, the
> countdown was reinitialized to its original value, i.e. 4. I am not able to
> reproduce this in the dev env.
>
> Note: Partition loss occurred when the three nodes went down at the same
> time.
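
For illustration, here is a minimal sketch of the flow described above, built
on the public Ignite API (Ignition.start, Ignite.countDownLatch,
IgniteCache.loadCache). The latch name "cacheLoadLatch", the cache name
"myCache", and the startKafkaConsumers() helper are assumptions for the
sketch, not names taken from the report; it also assumes a CacheStore is
configured so loadCache has something to load.

    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteCache;
    import org.apache.ignite.IgniteCountDownLatch;
    import org.apache.ignite.Ignition;

    public class CacheLoadGate {
        public static void main(String[] args) {
            // Starts a node with the default configuration.
            Ignite ignite = Ignition.start();

            // Distributed latch shared by all 4 server nodes:
            // (name, initial count, autoDelete=false, create=true).
            IgniteCountDownLatch latch =
                ignite.countDownLatch("cacheLoadLatch", 4, false, true);

            // Load this node's portion of the cache (assumes a CacheStore
            // is configured), then count down once for this node.
            IgniteCache<Object, Object> cache = ignite.cache("myCache");
            cache.loadCache(null);
            latch.countDown();

            // Block until all 4 nodes have counted down, then start consuming.
            latch.await();
            startKafkaConsumers(); // hypothetical helper, not part of Ignite
        }

        private static void startKafkaConsumers() {
            // Placeholder for the application's Kafka consumer startup.
        }
    }

The reported problem is that, after 3 of the 4 nodes went down, latch.getCount()
on the surviving node showed the original value 4 again instead of the
already-decremented count.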
--
This message was sent by Atlassian Jira
(v8.3.4#803005)