Mikhail Cherkasov created IGNITE-6323:
-----------------------------------------
Summary: Ignite node not stopping after segmentation
Key: IGNITE-6323
URL: https://issues.apache.org/jira/browse/IGNITE-6323
Project: Ignite
Issue Type: Bug
Reporter: Mikhail Cherkasov
The problem was found by a user and described on the user list:
http://apache-ignite-users.70518.x6.nabble.com/Ignite-node-not-stopping-after-segmentation-td16773.html
Copy of the message:
"""
I have a follow-up question on segmentation from my previous post. The issue I
am trying to resolve is that Ignite does not stop on the segmented node. Here
is brief information on my application.
I have embedded Ignite into my application and am using it for distributed
caches. I am running an Ignite cluster in my lab environment, with two nodes in
the cluster. In the current setup, the application receives about 1 million
data points every minute. I am putting the data into an Ignite distributed
cache using a data streamer. This way the data gets distributed among the
members, and each member further processes the data. The application also uses
other distributed caches while processing the data.
When a member node gets segmented, it does not stop. I get the BEFORE_NODE_STOP
event, but nothing happens after that. The node hangs in some unstable state. I
suspect that when the node is trying to stop, there is data in the streamer's
buffers that needs to be sent to other members. Because the node is segmented,
it is not able to flush or drop the data. The application is also trying to
access caches while the node is stopping, which also causes a deadlock.
I have tried a few things to make it work:
1. Letting the node stop after segmentation, which is the default behavior. But
   the node gets stuck.
2. Setting the segmentation policy to NOOP. The plan was to stop the node
   manually after some cleanup. This way, when I get the segmented event, I
   first try to close the data streamer instance and the cache instance. But
   when I try to close the data streamer, the close() call gets stuck. I was
   calling close with true to drop everything in the streamer, but that did not
   help.
3. On receiving the segmentation event, restricting the application from
   accessing any caches, then stopping the node. Even then the node gets stuck.
I have attached a few thread dumps here. In each of them, one thread is trying
to stop the node but ends up in a waiting state.
"""
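The third attempt in the message (block cache access on segmentation, then stop the node) can be sketched roughly as below. This is only an illustration, not Ignite code and not a fix for the hang: the class SegmentationGate, its method names, and the timeout are all hypothetical and do not come from the report. The sketch closes a gate that application threads check before touching caches, then runs the cleanup on a separate daemon thread with a bounded wait, so that a stuck stop is at least detected instead of blocking the caller forever.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical helper for illustration only; not part of Ignite. */
public class SegmentationGate {
    // Set once the node has been reported as segmented.
    private final AtomicBoolean segmented = new AtomicBoolean(false);

    /** Application threads check this before accessing any cache. */
    public boolean cacheAccessAllowed() {
        return !segmented.get();
    }

    /**
     * Intended to be called from a segmentation event listener.
     * Closes the gate, runs the cleanup on a separate daemon thread,
     * and waits at most timeoutMs for it (the timeout is an assumption).
     *
     * @return true if cleanup finished in time, false if it timed out,
     *         threw an exception, or the caller was interrupted
     */
    public boolean onSegmented(Runnable cleanup, long timeoutMs) {
        segmented.set(true);
        ExecutorService exec = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "segmentation-cleanup");
            t.setDaemon(true); // a hung cleanup must not keep the JVM alive
            return t;
        });
        try {
            Future<?> result = exec.submit(cleanup);
            result.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException | ExecutionException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            return false;
        } finally {
            exec.shutdownNow(); // interrupts the cleanup thread if still running
        }
    }
}
```

In an Ignite application, the cleanup would presumably wrap the calls described in the message, such as closing the data streamer with close(true) and then stopping the node. Whether those calls react to the interrupt sent on timeout is not established by the report, so a false return value only signals that the stop did not complete in time.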