Good question. We do log GC duration, and there is no sign of long GC
activity or warnings of this type.
My understanding is that the only fix for these issues is to adjust the
timeout settings. Is that correct?
Is the line "Previous node alive status" telling me that a node has changed
ID or had issues?
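
To make the fields in that line easier to compare across log entries, here
is a minimal sketch in plain Java that pulls the key=value pairs out of it.
The class and method names are hypothetical, and the line layout is assumed
from the excerpt quoted below. It only extracts the fields and reports
whether checkPreviousNodeId matches the id of actualPreviousNode; it does not
establish what Ignite means when they differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Ignite: extracts fields from a
// "Previous node alive status [...]" log line.
public class PreviousNodeStatusLine {
    // key=value pairs; a value ends at a comma, bracket, or whitespace.
    private static final Pattern FIELD =
        Pattern.compile("(\\w+)=([^,\\[\\]\\s]+)");

    /** Returns the first occurrence of each key=value field in the line. */
    public static Map<String, String> fields(String line) {
        Map<String, String> out = new LinkedHashMap<>();
        Matcher m = FIELD.matcher(line);
        while (m.find()) {
            out.putIfAbsent(m.group(1), m.group(2));
        }
        return out;
    }

    /** True when checkPreviousNodeId is not equal to the first id field. */
    public static boolean idsDiffer(Map<String, String> f) {
        return !Objects.equals(f.get("checkPreviousNodeId"), f.get("id"));
    }

    public static void main(String[] args) {
        // Line assembled from the (truncated) excerpt in the original message.
        String line = "Previous node alive status [alive=false, "
            + "checkPreviousNodeId=fb9c943e-aa4a-4e6c-ae00-1df5212a3f3f, "
            + "actualPreviousNode=TcpDiscoveryNode "
            + "[id=58424f0b-e77f-4127-835a-4274f57955a1, "
            + "consistentId=5faca106-0c39-45ab-8c64-f38df8910238";
        Map<String, String> f = fields(line);
        f.forEach((k, v) -> System.out.println(k + " = " + v));
        System.out.println("ids differ: " + idsDiffer(f));
    }
}
```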

On Wed, 8 Nov 2023 at 22:08, Stephen Darlington <sdarling...@apache.org>
wrote:

> The most common cause of a segmented cluster is not the network but your
> Java garbage collection configuration. Do you see any "Long JVM pause"
> warnings in your logs before the problem occurs?
>
> On Wed, 8 Nov 2023 at 08:48, Alan Rose <alan_r...@trimble.com> wrote:
>
>>
>> I am hoping someone can help me understand some log entries better.
>> I have two Ignite nodes, A and B, running in Linux containers. They appear
>> to have a network issue that results in node A restarting approximately
>> 5 seconds later.
>> In the logs, Node B states the following about Node A:
>>    "Previous node alive status [alive=false,
>> checkPreviousNodeId=fb9c943e-aa4a-4e6c-ae00-1df5212a3f3f,
>> actualPreviousNode=TcpDiscoveryNode
>> [id=58424f0b-e77f-4127-835a-4274f57955a1,
>> consistentId=5faca106-0c39-45ab-8c64-f38df8910238, etc.
>> What is this line telling me about Node A?
>>
>> I then get
>>          "Node FAILED: TcpDiscoveryNode ..etc"
>> and  "Close incoming connection, unknown node..?", which I think refers to
>> node A.
>>
>> Node A log states
>>    Failed to send message to remote node [node=TcpDiscoveryNode [id= etc
>>    but it does appear to be able to ping node B OK.
>> Within 5 seconds I see in the Node A log
>>   Node is out of topology (probably, due to short-time network problems).
>>    Local node SEGMENTED: TcpDiscoveryNode [id=58 etc
>> Finally, node A restarts.
>> I see no other evidence of a network issue. Is there something I can
>> configure so that it is not so quick to time out?
>> The only setting I see in the log at startup that corresponds to 5 seconds
>> is netTimeout=5000.
>>
>> --
>> *Alan Rose*
>> *Senior Software Engineer. *
>>
>> *CCSS Team Merino*
>> *Trimble Navigation New Zealand Limited*
>> P O Box 8729, Riccarton, Christchurch 8440 , New Zealand
>> +64 3 9635616 Ext 604016
>>
>>

-- 
*Alan Rose*
*Senior Software Engineer. *

*CCSS Team Merino*
*Trimble Navigation New Zealand Limited*
P O Box 8729, Riccarton, Christchurch 8440 , New Zealand
+64 3 9635616 Ext 604016
