We recently had an issue under heavy load where two nodes on the same grid
had trouble communicating with each other.

My question is: what do these error messages mean, and what is the best
approach to resolving them? Thanks in advance.



Node PSN-2 logs

Failed to read magic header (too few bytes received)
[rmtAddr=/10.215.105.95:34832, locAddr=/10.215.108.99:47500]



WRN [ImmutableCacheComputeServer]   Node FAILED: TcpDiscoveryNode
[id=82984913-d32a-4707-bf6b-1f488f305e37,
consistentId=2d7a6834-3883-4b48-acf9-aee9e2fd15b6, addrs=ArrayList
[10.215.104.130, 127.0.0.1], sockAddrs=HashSet [/127.0.0.1:47500,
trex-alpha-psn-1.trex-alpha-psnode.alpha.svc.cluster.local/10.215.104.130:47500],
discPort=47500, order=12, intOrder=12,
lastExchangeTime=1686278202953, loc=false,
ver=2.15.0#20230425-sha1:f98f7f35, isClient=false]





Node PSN-1 logs

Timed out waiting for message delivery receipt (most probably, the reason
is in long GC pauses on remote node; consider tuning GC and increasing
'ackTimeout' configuration property). Will retry to send message with
increased timeout [currentTimeout=9496, rmtAddr=trex-alpha-psn-2.trex-alpha-



PSN-1 eventually restarts.
