[
https://issues.apache.org/jira/browse/IGNITE-6323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mikhail Cherkasov reassigned IGNITE-6323:
-----------------------------------------
Assignee: Mikhail Cherkasov
> Ignite node not stopping after segmentation
> -------------------------------------------
>
> Key: IGNITE-6323
> URL: https://issues.apache.org/jira/browse/IGNITE-6323
> Project: Ignite
> Issue Type: Bug
> Components: general
> Affects Versions: 2.0
> Reporter: Mikhail Cherkasov
> Assignee: Mikhail Cherkasov
> Fix For: 2.3
>
> Attachments: thread-dump-9-1.txt, thread-dump-9-2.txt,
> thread-dump-9-4.txt
>
>
> The problem was found by a user and described on the user list:
> http://apache-ignite-users.70518.x6.nabble.com/Ignite-node-not-stopping-after-segmentation-td16773.html
> Copy of the message:
> """
> I have a follow-up question on segmentation from my previous post. The issue
> I am trying to resolve is that the Ignite node does not stop on the segmented
> node. Here is brief information about my application.
>
> I have embedded Ignite into my application and am using it for distributed
> caches. I am running an Ignite cluster in my lab environment, with two nodes
> in the cluster. In the current setup, the application receives about 1 million
> data points every minute. I am putting the data into an Ignite distributed
> cache using a data streamer. This way, the data gets distributed among the
> members, and each member further processes the data. The application also uses
> other distributed caches while processing the data.
>
> When a member node gets segmented, it does not stop. I get the
> BEFORE_NODE_STOP event, but nothing happens after that. The node hangs in some
> unstable state. I suspect that when the node is trying to stop, there is data
> in the streamer's buffers that needs to be sent to other members. Because the
> node is segmented, it is not able to flush or drop the data. The application
> is also trying to access caches while the node is stopping, which also causes
> a deadlock situation.
>
> I have tried a few things to make it work:
> 1. Letting the node stop after segmentation, which is the default behavior.
>    But the node gets stuck.
> 2. Setting the segmentation policy to NOOP. The plan was to stop the node
>    manually after some cleanup. This way, when I get the segmentation event,
>    I first try to close the data streamer instance and the cache instance.
>    But when I try to close the data streamer, the close() call gets stuck.
>    I was calling close with true to drop everything in the streamer, but that
>    did not help.
> 3. On receiving the segmentation event, restricting the application from
>    accessing any caches, then stopping the node. Even then the node gets
>    stuck.
>
> I have attached a few thread dumps here. In each of them, one thread is
> trying to stop the node but gets into a waiting state.
> """
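
The third workaround in the quoted message (restrict cache access on the segmentation event, then stop the node) could be sketched roughly as below. This is only an illustration in plain Java, not Ignite code and not a fix for this issue: CacheAccessGate and its methods are hypothetical names, and the reporter states that the node still got stuck with this approach. The check and the cache operation are not atomic, so an operation that started before the gate closed can still be in flight when the node is stopped.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

/**
 * Hypothetical helper (not part of Ignite): application threads pass through
 * this gate before touching any cache.
 */
final class CacheAccessGate {
    // true while cache access is allowed; flipped once when segmentation is detected.
    private final AtomicBoolean open = new AtomicBoolean(true);

    /**
     * Intended to be called from the segmentation event handler before the
     * node is stopped. Returns true only for the first caller.
     */
    boolean close() {
        return open.compareAndSet(true, false);
    }

    /** Reports whether cache access is still allowed. */
    boolean isOpen() {
        return open.get();
    }

    /**
     * Runs the cache operation only while the gate is open; otherwise returns
     * the fallback without invoking the operation. The check is best effort:
     * an operation already running when the gate closes is not interrupted.
     */
    <T> T call(Supplier<T> cacheOp, T fallback) {
        return open.get() ? cacheOp.get() : fallback;
    }
}
```

In an application, the segmentation event handler would call gate.close() before asking the node to stop, and worker threads would wrap each cache access in gate.call(...).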