> Does a network issue make the JVM halt?
There is a failureDetectionTimeout setting, which lets the other nodes in the
cluster detect that a node is unreachable and exclude it from the
topology. So I believe this could be something like a temporary network
problem. I would recommend adding some network monitoring so that you are
prepared for the next failure.
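
As a starting point for that monitoring, here is a minimal sketch of a TCP connect probe. It is not part of Ignite; the class name DiscoveryPortProbe, the method probeMillis, and the default host are hypothetical, and it assumes the nodes listen on Ignite's default discovery port 47500. It only measures how long a plain TCP connect to a peer takes, so treat it as a rough signal rather than a full diagnosis.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not an Ignite API: probes a peer's TCP port.
class DiscoveryPortProbe {

    /** Returns the TCP connect time in milliseconds, or -1 if the connect fails or times out. */
    static long probeMillis(String host, int port, int timeoutMs) {
        long start = System.nanoTime();
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return (System.nanoTime() - start) / 1_000_000;
        } catch (IOException e) {
            // Covers refused connections, timeouts, and unresolved hosts.
            return -1;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumed defaults: local host and Ignite's default discovery port.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 47500;
        int attempts = args.length > 2 ? Integer.parseInt(args[2]) : 3;

        for (int i = 0; i < attempts; i++) {
            long ms = probeMillis(host, port, 2000);
            if (ms < 0) {
                System.out.println(System.currentTimeMillis() + " UNREACHABLE " + host + ":" + port);
            } else {
                System.out.println(System.currentTimeMillis() + " OK " + host + ":" + port + " connect=" + ms + "ms");
            }
            Thread.sleep(1000);
        }
    }
}
```

Running it from each server node against the other nodes and keeping the timestamped output makes it possible to compare connectivity gaps with the time of the next node failure.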

Best Regards,
Evgenii

On Fri, Jul 26, 2019 at 16:01, Akash Shinde <[email protected]> wrote:

> This issue is not consistent, but it occurs sometimes. Does a network issue
> make the JVM halt? As per my understanding, a node will disconnect from the
> cluster if a network issue happens.
> But in this case multiple JVMs were terminated. Can it be a bug in Ignite
> version 2.6?
>
> Thanks,
> Akash
>
> On Fri, Jul 26, 2019 at 4:00 PM Evgenii Zhuravlev <
> [email protected]> wrote:
>
>> I don't see any specific errors in the logs. To me, it looks like
>> network problems; moreover, the client nodes print messages about
>> connection problems. Is this issue reproducible?
>> Evgenii
>>
>> On Fri, Jul 26, 2019 at 09:21, Akash Shinde <[email protected]> wrote:
>>
>>> Can someone please help me with this issue?
>>>
>>> On Wed, Jul 24, 2019 at 12:04 PM Akash Shinde <[email protected]>
>>> wrote:
>>>
>>>> Hi,
>>>> Please find attached the logs from all server and client nodes. I have
>>>> also attached the GC logs for each node.
>>>>
>>>> Thanks,
>>>> Akash
>>>>
>>>>
>>>> On Tue, Jul 23, 2019 at 8:59 PM Evgenii Zhuravlev <
>>>> [email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Can you please share full logs from the node start from all nodes in
>>>>> the cluster?
>>>>>
>>>>> Thanks,
>>>>> Evgenii
>>>>>
>>>>> On Tue, Jul 23, 2019 at 16:51, Akash Shinde <[email protected]> wrote:
>>>>>
>>>>>> I am using Ignite version 2.6. I have created a cluster of seven server
>>>>>> nodes and three client nodes. Five of the seven server nodes stopped
>>>>>> unexpectedly with the error log lines below.
>>>>>> I have attached the logs of two such server nodes.
>>>>>>
>>>>>> FailureDetectionTimeout is set to 30000 ms in the Ignite configuration.
>>>>>> The network timeout is left at its default.
>>>>>> ClientFailureDetectionTimeout is set to 30000 ms.
>>>>>>
>>>>>> I checked the GC logs, but it does not seem to be a GC pause issue. I
>>>>>> have attached the GC logs too.
>>>>>>
>>>>>> 1) Can someone please help me identify the reason for this issue?
>>>>>> 2) Are there any specific reasons that cause this issue, or is it a
>>>>>> bug in Ignite version 2.6?
>>>>>>
>>>>>>
>>>>>> *ERROR LOG LINES*
>>>>>> 2019-07-22 09:22:47,281 19417675 [tcp-disco-srvr-#3%springDataNode%]
>>>>>> ERROR  - Critical system error detected. Will be handled accordingly to
>>>>>> configured handler [hnd=class o.a.i.failure.StopNodeOrHaltFailureHandler,
>>>>>> failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION,
>>>>>> err=java.lang.IllegalStateException: Thread
>>>>>> tcp-disco-srvr-#3%springDataNode% is terminated unexpectedly.]]
>>>>>> java.lang.IllegalStateException: Thread
>>>>>> tcp-disco-srvr-#3%springDataNode% is terminated unexpectedly.
>>>>>> at
>>>>>> org.apache.ignite.spi.discovery.tcp.ServerImpl$TcpServer.body(ServerImpl.java:5686)
>>>>>> at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
>>>>>> 2019-07-22 09:22:47,281 19417675 [tcp-disco-srvr-#3%springDataNode%]
>>>>>> ERROR  - JVM will be halted immediately due to the failure:
>>>>>> [failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION,
>>>>>> err=java.lang.IllegalStateException: Thread
>>>>>> tcp-disco-srvr-#3%springDataNode% is terminated unexpectedly.]]
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Akash
>>>>>>
>>>>>
