Hello!

You can set clientReconnectDisabled to 'true' on the client nodes. In this
case the client will not try to reconnect and will instead produce an
error. When you see this error, you can create a new client, which will
hopefully not have these problems.
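
For illustration, here is a minimal sketch of that setting, assuming the client node is configured through a Spring XML file and uses TcpDiscoverySpi (the SPI that appears in your log). The surrounding bean layout and the clientMode property are assumptions on my side, not taken from your configuration, so merge only the clientReconnectDisabled line into whatever discovery settings you already have:

```xml
<!-- Sketch only: client-side IgniteConfiguration with reconnect disabled. -->
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <!-- Assumed: this node is started as a client. -->
    <property name="clientMode" value="true"/>

    <property name="discoverySpi">
        <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
            <!-- The client will not attempt to reconnect after a disconnect. -->
            <property name="clientReconnectDisabled" value="true"/>
            <!-- Keep your existing ipFinder and other discovery settings here. -->
        </bean>
    </property>
</bean>
```

With this in place, the application code that uses the client would need to catch the resulting error and start a fresh client instance itself.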

Regards,
-- 
Ilya Kasnacheev


Wed, 24 Feb 2021 at 14:14, Oğuzhan Melez <[email protected]>:

>
> Thank you. So what should I do? The client node disconnected after this
> error and cannot reconnect to the cluster until I reboot my application,
> the client node and the server node. How can the client node reconnect to
> the cluster?
>
> Ilya Kasnacheev <[email protected]> wrote on Wed, 24 Feb 2021 at 13:57:
>
>> Hello!
>>
>> This looks like network problems, a long GC pause on the server node, or
>> some kind of deadlock on the server node that prevents it from responding.
>>
>> Regards,
>> --
>> Ilya Kasnacheev
>>
>>
>> Wed, 24 Feb 2021 at 13:09, oguzhan <[email protected]>:
>>
>>> Hello,
>>>
>>> We have 1 client node and 1 server node, and we are using Ignite version
>>> 2.9.1.
>>>
>>> Our application is scheduled to run the same jobs every day. It ran
>>> without errors for 2 weeks, but then we got the error shown below (we
>>> get such an error about every 2 weeks).
>>>
>>> I hope you can help me solve my problem. Thanks and best regards...
>>>
>>>
>>> 2021-02-14 02:07:34 WARN  tcp-client-disco-reconnector-#7-#77756
>>> TcpDiscoverySpi:576 - Failed to connect to any address from IP finder
>>> (will
>>> retry to join topology every 2000 ms; change 'reconnectDelay' to
>>> configure
>>> the frequency of retries): [/127.0.0.1:47500, /127.0.0.1:47501,
>>> /127.0.0.1:47502, /127.0.0.1:47503, /127.0.0.1:47504, /127.0.0.1:47505,
>>> /127.0.0.1:47506, /127.0.0.1:47507, /127.0.0.1:47508, /127.0.0.1:47509]
>>> 2021-02-14 02:07:37 INFO  grid-timeout-worker-#206 IgniteKernal:566 -
>>> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
>>>     ^-- Node [id=2fefd66f, uptime=4 days, 13:33:34.341]
>>>     ^-- Cluster [hosts=1, CPUs=16, servers=1, clients=1, topVer=2,
>>> minorTopVer=18985]
>>>     ^-- Network [addrs=[10.86.26.180, 127.0.0.1], discoPort=0,
>>> commPort=47101]
>>>     ^-- CPU [CPUs=16, curLoad=1.07%, avgLoad=0.05%, GC=0.1%]
>>>     ^-- Heap [used=865MB, free=92.96%, comm=12274MB]
>>>     ^-- Off-heap memory [used=0MB, free=100%, allocated=0MB]
>>>     ^-- Page memory [pages=0]
>>>     ^--   sysMemPlc region [type=internal, persistence=false,
>>> lazyAlloc=false,
>>>       ...  initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=100%,
>>> allocRam=0MB]
>>>     ^--   TxLog region [type=internal, persistence=false,
>>> lazyAlloc=false,
>>>       ...  initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=100%,
>>> allocRam=0MB]
>>>     ^--   Default_Region region [type=default, persistence=false,
>>> lazyAlloc=true,
>>>       ...  initCfg=256MB, maxCfg=32768MB, usedRam=0MB, freeRam=100%,
>>> allocRam=0MB]
>>>     ^-- Outbound messages queue [size=0]
>>>     ^-- Public thread pool [active=0, idle=0, qSize=0]
>>>     ^-- System thread pool [active=0, idle=81, qSize=0]
>>> 2021-02-14 02:07:38 ERROR tcp-client-disco-sock-writer-#2-#230
>>> TcpDiscoverySpi:586 - Failed to send message: null
>>> java.io.IOException: Failed to get acknowledge for message:
>>> TcpDiscoveryClientMetricsUpdateMessage [super=TcpDiscoveryAbstractMessage
>>> [sndNodeId=null, id=1d467368771-2fefd66f-0954-45dd-aa32-a33e58567950,
>>> verifierNodeId=null, topVer=0, pendingIdx=0, failedNodes=null,
>>> isClient=true]]
>>>         at
>>>
>>> org.apache.ignite.spi.discovery.tcp.ClientImpl$SocketWriter.body(ClientImpl.java:1471)
>>>         at
>>> org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58)
>>> 2021-02-14 02:07:44 WARN  tcp-comm-worker-#1-#216
>>> TcpCommunicationSpi:576 -
>>> Handshake timed out (will stop attempts to perform the handshake)
>>> [node=6953d599-d606-4781-a6ba-43de7aff59e4,
>>> connTimeoutStrategy=ExponentialBackoffTimeoutStrategy [maxTimeout=600000,
>>> totalTimeout=10000, startNanos=1671033974906026, currTimeout=600000],
>>> err=Operation timed out [timeoutStrategy=
>>> ExponentialBackoffTimeoutStrategy
>>> [maxTimeout=600000, totalTimeout=10000, startNanos=1671033974906026,
>>> currTimeout=600000]], addr=/127.0.0.1:47100,
>>> failureDetectionTimeoutEnabled=true, timeout=0]
>>> 2021-02-14 02:07:54 WARN  tcp-comm-worker-#1-#216
>>> TcpCommunicationSpi:576 -
>>> Handshake timed out (will stop attempts to perform the handshake)
>>> [node=6953d599-d606-4781-a6ba-43de7aff59e4,
>>> connTimeoutStrategy=ExponentialBackoffTimeoutStrategy [maxTimeout=600000,
>>> totalTimeout=10000, startNanos=1671044002786218, currTimeout=600000],
>>> err=Operation timed out [timeoutStrategy=
>>> ExponentialBackoffTimeoutStrategy
>>> [maxTimeout=600000, totalTimeout=10000, startNanos=1671044002786218,
>>> currTimeout=600000]], addr=dwccatp01/10.86.26.180:47100,
>>> failureDetectionTimeoutEnabled=true, timeout=0]
>>> 2021-02-14 02:08:06 ERROR grid-timeout-worker-#206 G:581 - Blocked
>>> system-critical thread has been detected. This can lead to cluster-wide
>>> undefined behaviour [workerName=tcp-comm-worker,
>>> threadName=tcp-comm-worker-#1-#216, blockedFor=11s]
>>> 2021-02-14 02:08:06 WARN  grid-timeout-worker-#206 root:576 - Possible
>>> failure suppressed accordingly to a configured handler
>>> [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
>>> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
>>> [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
>>> failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
>>> o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
>>> igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
>>> class org.apache.ignite.IgniteException: GridWorker
>>> [name=tcp-comm-worker,
>>> igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]
>>>         at sun.misc.Unsafe.park(Native Method)
>>>         at
>>> java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>>>         at
>>>
>>> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
>>>         at
>>>
>>> org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
>>>         at
>>>
>>> org.apache.ignite.spi.discovery.tcp.ClientImpl.pingNode(ClientImpl.java:449)
>>>         at
>>>
>>> org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.pingNode(TcpDiscoverySpi.java:493)
>>>         at
>>>
>>> org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.pingNode(GridDiscoveryManager.java:1688)
>>>         at
>>>
>>> org.apache.ignite.internal.managers.GridManagerAdapter$1.pingNode(GridManagerAdapter.java:409)
>>>         at
>>>
>>> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.processDisconnect(TcpCommunicationSpi.java:5165)
>>>         at
>>>
>>> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.body(TcpCommunicationSpi.java:4951)
>>>         at
>>>
>>> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>>>         at
>>>
>>> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$5.body(TcpCommunicationSpi.java:2503)
>>>         at
>>> org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58)
>>> [02:08:06] Possible failure suppressed accordingly to a configured
>>> handler
>>> [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
>>> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
>>> [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
>>> failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
>>> o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
>>> igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
>>> 2021-02-14 02:08:07 WARN  grid-timeout-worker-#206
>>> CacheDiagnosticManager:571 - Page locks dump:
>>>
>>>
>>> 2021-02-14 02:08:16 ERROR grid-timeout-worker-#206 G:581 - Blocked
>>> system-critical thread has been detected. This can lead to cluster-wide
>>> undefined behaviour [workerName=tcp-comm-worker,
>>> threadName=tcp-comm-worker-#1-#216, blockedFor=21s]
>>> 2021-02-14 02:08:16 WARN  grid-timeout-worker-#206 root:576 - Possible
>>> failure suppressed accordingly to a configured handler
>>> [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
>>> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
>>> [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
>>> failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
>>> o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
>>> igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
>>> class org.apache.ignite.IgniteException: GridWorker
>>> [name=tcp-comm-worker,
>>> igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]
>>>         at sun.misc.Unsafe.park(Native Method)
>>>         at
>>> java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>>>         at
>>>
>>> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
>>>         at
>>>
>>> org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
>>>         at
>>>
>>> org.apache.ignite.spi.discovery.tcp.ClientImpl.pingNode(ClientImpl.java:449)
>>>         at
>>>
>>> org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.pingNode(TcpDiscoverySpi.java:493)
>>>         at
>>>
>>> org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.pingNode(GridDiscoveryManager.java:1688)
>>>         at
>>>
>>> org.apache.ignite.internal.managers.GridManagerAdapter$1.pingNode(GridManagerAdapter.java:409)
>>>         at
>>>
>>> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.processDisconnect(TcpCommunicationSpi.java:5165)
>>>         at
>>>
>>> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.body(TcpCommunicationSpi.java:4951)
>>>         at
>>>
>>> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>>>         at
>>>
>>> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$5.body(TcpCommunicationSpi.java:2503)
>>>         at
>>> org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58)
>>> [02:08:16] Possible failure suppressed accordingly to a configured
>>> handler
>>> [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
>>> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
>>> [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
>>> failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
>>> o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
>>> igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
>>> 2021-02-14 02:08:16 WARN  grid-timeout-worker-#206
>>> CacheDiagnosticManager:571 - Page locks dump:
>>>
>>>
>>> 2021-02-14 02:08:28 ERROR grid-timeout-worker-#206 G:581 - Blocked
>>> system-critical thread has been detected. This can lead to cluster-wide
>>> undefined behaviour [workerName=tcp-comm-worker,
>>> threadName=tcp-comm-worker-#1-#216, blockedFor=33s]
>>> 2021-02-14 02:08:28 WARN  grid-timeout-worker-#206 root:576 - Possible
>>> failure suppressed accordingly to a configured handler
>>> [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
>>> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
>>> [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
>>> failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
>>> o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
>>> igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
>>> class org.apache.ignite.IgniteException: GridWorker
>>> [name=tcp-comm-worker,
>>> igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]
>>>         at sun.misc.Unsafe.park(Native Method)
>>>         at
>>> java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>>>         at
>>>
>>> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
>>>         at
>>>
>>> org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
>>>         at
>>>
>>> org.apache.ignite.spi.discovery.tcp.ClientImpl.pingNode(ClientImpl.java:449)
>>>         at
>>>
>>> org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.pingNode(TcpDiscoverySpi.java:493)
>>>         at
>>>
>>> org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.pingNode(GridDiscoveryManager.java:1688)
>>>         at
>>>
>>> org.apache.ignite.internal.managers.GridManagerAdapter$1.pingNode(GridManagerAdapter.java:409)
>>>         at
>>>
>>> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.processDisconnect(TcpCommunicationSpi.java:5165)
>>>         at
>>>
>>> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.body(TcpCommunicationSpi.java:4951)
>>>         at
>>>
>>> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>>>         at
>>>
>>> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$5.body(TcpCommunicationSpi.java:2503)
>>>         at
>>> org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58)
>>> [02:08:28] Possible failure suppressed accordingly to a configured
>>> handler
>>> [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
>>> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
>>> [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
>>> failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
>>> o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
>>> igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
>>> 2021-02-14 02:08:28 WARN  grid-timeout-worker-#206
>>> CacheDiagnosticManager:571 - Page locks dump:
>>>
>>>
>>> 2021-02-14 02:08:32 WARN  http-nio-8082-exec-5 TcpCommunicationSpi:576 -
>>> Handshake timed out (will stop attempts to perform the handshake)
>>> [node=6953d599-d606-4781-a6ba-43de7aff59e4,
>>> connTimeoutStrategy=ExponentialBackoffTimeoutStrategy [maxTimeout=600000,
>>> totalTimeout=10000, startNanos=1671081715938786, currTimeout=600000],
>>> err=Operation timed out [timeoutStrategy=
>>> ExponentialBackoffTimeoutStrategy
>>> [maxTimeout=600000, totalTimeout=10000, startNanos=1671081715938786,
>>> currTimeout=600000]], addr=/127.0.0.1:47100,
>>> failureDetectionTimeoutEnabled=true, timeout=0]
>>> 2021-02-14 02:08:37 ERROR grid-timeout-worker-#206 G:581 - Blocked
>>> system-critical thread has been detected. This can lead to cluster-wide
>>> undefined behaviour [workerName=tcp-comm-worker,
>>> threadName=tcp-comm-worker-#1-#216, blockedFor=42s]
>>> 2021-02-14 02:08:37 WARN  grid-timeout-worker-#206 root:576 - Possible
>>> failure suppressed accordingly to a configured handler
>>> [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
>>> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
>>> [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
>>> failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
>>> o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
>>> igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
>>> class org.apache.ignite.IgniteException: GridWorker
>>> [name=tcp-comm-worker,
>>> igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]
>>>         at sun.misc.Unsafe.park(Native Method)
>>>         at
>>> java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>>>         at
>>>
>>> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
>>>         at
>>>
>>> org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
>>>         at
>>>
>>> org.apache.ignite.spi.discovery.tcp.ClientImpl.pingNode(ClientImpl.java:449)
>>>         at
>>>
>>> org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.pingNode(TcpDiscoverySpi.java:493)
>>>         at
>>>
>>> org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.pingNode(GridDiscoveryManager.java:1688)
>>>         at
>>>
>>> org.apache.ignite.internal.managers.GridManagerAdapter$1.pingNode(GridManagerAdapter.java:409)
>>>         at
>>>
>>> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.processDisconnect(TcpCommunicationSpi.java:5165)
>>>         at
>>>
>>> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.body(TcpCommunicationSpi.java:4951)
>>>         at
>>>
>>> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>>>         at
>>>
>>> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$5.body(TcpCommunicationSpi.java:2503)
>>>         at
>>> org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58)
>>> [02:08:37] Possible failure suppressed accordingly to a configured
>>> handler
>>> [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
>>> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
>>> [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
>>> failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
>>> o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
>>> igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
>>> 2021-02-14 02:08:37 WARN  grid-timeout-worker-#206
>>> CacheDiagnosticManager:571 - Page locks dump:
>>>
>>>
>>>
>>>
>>> --
>>> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>>>
>>
