[ https://issues.apache.org/jira/browse/IGNITE-6323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mikhail Cherkasov updated IGNITE-6323:
--------------------------------------
    Fix Version/s: 2.3

> Ignite node not stopping after segmentation
> -------------------------------------------
>
>                 Key: IGNITE-6323
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6323
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.1
>            Reporter: Mikhail Cherkasov
>             Fix For: 2.3
>
>
> The problem was found by a user and described on the user list:
> http://apache-ignite-users.70518.x6.nabble.com/Ignite-node-not-stopping-after-segmentation-td16773.html
> Copy of the message:
> """
> I have a follow-up question on segmentation from my previous post. The
> issue I am trying to resolve is that the Ignite node does not stop on the
> segmented node. Here is brief information on my application.
>
> I have embedded Ignite into my application and am using it for distributed
> caches. I am running an Ignite cluster in my lab environment. I have two
> nodes in the cluster. In the current setup, the application receives about
> 1 million data points every minute. I am putting the data into an Ignite
> distributed cache using a data streamer. This way data gets distributed
> among members and each member further processes the data. The application
> also uses other distributed caches while processing the data.
>
> When a member node gets segmented, it does not stop. I get the
> BEFORE_NODE_STOP event but nothing happens after that. The node hangs in
> some unstable state. I suspect that when the node is trying to stop, there
> is data in the streamer's buffers which needs to be sent to other members.
> Because the node is segmented, it is not able to flush/drop the data. The
> application is also trying to access caches while the node is stopping,
> which also causes a deadlock situation.
>
> I have tried a few things to make it work:
> 1. Letting the node stop after segmentation, which is the default
>    behavior. But the node gets stuck.
> 2. Setting the segmentation policy to NOOP. The plan was to stop the node
>    manually after some cleanup. This way, when I get the segmented event, I
>    first try to close the data streamer instance and the cache instance.
>    But when I try to close the data streamer, the close() call gets stuck.
>    I was calling close with true to drop everything in the streamer. But
>    that did not help.
> 3. On receiving the segmentation event, restricting the application from
>    accessing any caches, then stopping the node. Even then the node gets
>    stuck.
>
> I have attached a few thread dumps here. In each of them one thread is
> trying to stop the node, but gets into a waiting state.
> """

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
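
The cleanup sequence described in the quoted message (close the data streamer with cancel, then stop the node) can be run under a deadline, so that a stuck call is reported by name instead of blocking the caller. The following is only a sketch under stated assumptions: the class `SegmentedShutdown` and the method `firstStuckStep` are hypothetical names, it uses only `java.util.concurrent`, and the steps are plain `Runnable` stand-ins. In a real application they might wrap `IgniteDataStreamer.close(true)` and `Ignition.stop(name, true)`, which is an assumption about how the reporter's code is structured.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class SegmentedShutdown {
    /**
     * Runs the named steps in insertion order on a single daemon thread,
     * waiting at most timeoutMs for each one.
     *
     * @return the name of the first step that did not finish in time,
     *         or null if every step completed.
     */
    public static String firstStuckStep(Map<String, Runnable> steps, long timeoutMs) {
        ExecutorService exec = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "segmentation-cleanup");
            t.setDaemon(true); // a stuck step must not keep the JVM alive
            return t;
        });
        try {
            for (Map.Entry<String, Runnable> step : steps.entrySet()) {
                Future<?> f = exec.submit(step.getValue());
                try {
                    f.get(timeoutMs, TimeUnit.MILLISECONDS);
                } catch (TimeoutException e) {
                    // Interrupt the step; a truly deadlocked call may ignore this.
                    f.cancel(true);
                    return step.getKey();
                } catch (ExecutionException e) {
                    throw new IllegalStateException("step failed: " + step.getKey(), e.getCause());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                }
            }
            return null;
        } finally {
            exec.shutdownNow();
        }
    }

    public static void main(String[] args) {
        Map<String, Runnable> steps = new LinkedHashMap<>();
        // Hypothetical stand-in for dropping buffered streamer data.
        steps.put("close streamer", () -> { });
        // Hypothetical stand-in for a stop call that never returns.
        steps.put("stop node", () -> {
            try {
                new CountDownLatch(1).await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("stuck at: " + firstStuckStep(steps, 200));
    }
}
```

This does not fix the hang. It only makes visible which step is waiting, which may complement the thread dumps mentioned in the report.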