Thank you. So what should I do? After this error the client node is
disconnected and cannot reconnect to the cluster until I restart my application,
the client node and the server node. How can the client node reconnect to the cluster?
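
One way to narrow down the causes suggested in the reply quoted below is to check, while the client is stuck, whether the discovery ports listed in the log (127.0.0.1:47500 to 47509) accept TCP connections at all. The following is only a minimal sketch that uses plain JDK sockets and no Ignite API; the class name DiscoveryPortProbe, its method names and the 1000 ms timeout are hypothetical choices made for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Ignite: probes the discovery ports seen in the log.
public class DiscoveryPortProbe {

    /** Builds host:port pairs for an inclusive port range, e.g. 47500..47509. */
    public static List<InetSocketAddress> portRange(String host, int firstPort, int lastPort) {
        List<InetSocketAddress> addrs = new ArrayList<>();
        for (int port = firstPort; port <= lastPort; port++) {
            addrs.add(new InetSocketAddress(host, port));
        }
        return addrs;
    }

    /** Returns true if a TCP connection to addr is established within timeoutMs. */
    public static boolean accepts(InetSocketAddress addr, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(addr, timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Addresses taken from the "Failed to connect to any address" warning; timeout is an assumption.
        for (InetSocketAddress addr : portRange("127.0.0.1", 47500, 47509)) {
            String state = accepts(addr, 1000) ? "accepts connections" : "no connection";
            System.out.println(addr + " -> " + state);
        }
    }
}
```

If none of the ports accepts a connection, that points toward a network problem or a server process that is no longer listening. A successful connection only shows that the port is open: the operating system can accept it even while the server JVM is paused in a long GC or deadlocked, so in that case a thread dump and GC logs from the server node are the next thing to look at.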

On Wed, 24 Feb 2021 at 13:57, Ilya Kasnacheev <[email protected]> wrote:

> Hello!
>
> This looks like network problems, a long GC pause on the server node, or some
> kind of deadlock on the server node that prevents it from responding.
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> On Wed, 24 Feb 2021 at 13:09, oguzhan <[email protected]> wrote:
>
>> Hello,
>>
>> We have 1 client node and 1 server node, and we are using Ignite version
>> 2.9.1.
>>
>> Our application is scheduled to run the same jobs every day. It ran
>> without any errors for 2 weeks, but then we started getting the error
>> shown below (it recurs about every 2 weeks).
>>
>> I hope you can help me solve this problem. Thanks and best regards...
>>
>>
>> 2021-02-14 02:07:34 WARN  tcp-client-disco-reconnector-#7-#77756
>> TcpDiscoverySpi:576 - Failed to connect to any address from IP finder
>> (will retry to join topology every 2000 ms; change 'reconnectDelay' to configure
>> the frequency of retries): [/127.0.0.1:47500, /127.0.0.1:47501,
>> /127.0.0.1:47502, /127.0.0.1:47503, /127.0.0.1:47504, /127.0.0.1:47505,
>> /127.0.0.1:47506, /127.0.0.1:47507, /127.0.0.1:47508, /127.0.0.1:47509]
>> 2021-02-14 02:07:37 INFO  grid-timeout-worker-#206 IgniteKernal:566 -
>> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
>>     ^-- Node [id=2fefd66f, uptime=4 days, 13:33:34.341]
>>     ^-- Cluster [hosts=1, CPUs=16, servers=1, clients=1, topVer=2, minorTopVer=18985]
>>     ^-- Network [addrs=[10.86.26.180, 127.0.0.1], discoPort=0, commPort=47101]
>>     ^-- CPU [CPUs=16, curLoad=1.07%, avgLoad=0.05%, GC=0.1%]
>>     ^-- Heap [used=865MB, free=92.96%, comm=12274MB]
>>     ^-- Off-heap memory [used=0MB, free=100%, allocated=0MB]
>>     ^-- Page memory [pages=0]
>>     ^--   sysMemPlc region [type=internal, persistence=false, lazyAlloc=false,
>>       ...  initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=100%, allocRam=0MB]
>>     ^--   TxLog region [type=internal, persistence=false, lazyAlloc=false,
>>       ...  initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=100%, allocRam=0MB]
>>     ^--   Default_Region region [type=default, persistence=false, lazyAlloc=true,
>>       ...  initCfg=256MB, maxCfg=32768MB, usedRam=0MB, freeRam=100%, allocRam=0MB]
>>     ^-- Outbound messages queue [size=0]
>>     ^-- Public thread pool [active=0, idle=0, qSize=0]
>>     ^-- System thread pool [active=0, idle=81, qSize=0]
>> 2021-02-14 02:07:38 ERROR tcp-client-disco-sock-writer-#2-#230
>> TcpDiscoverySpi:586 - Failed to send message: null
>> java.io.IOException: Failed to get acknowledge for message:
>> TcpDiscoveryClientMetricsUpdateMessage [super=TcpDiscoveryAbstractMessage
>> [sndNodeId=null, id=1d467368771-2fefd66f-0954-45dd-aa32-a33e58567950,
>> verifierNodeId=null, topVer=0, pendingIdx=0, failedNodes=null,
>> isClient=true]]
>>         at org.apache.ignite.spi.discovery.tcp.ClientImpl$SocketWriter.body(ClientImpl.java:1471)
>>         at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58)
>> 2021-02-14 02:07:44 WARN  tcp-comm-worker-#1-#216 TcpCommunicationSpi:576 -
>> Handshake timed out (will stop attempts to perform the handshake)
>> [node=6953d599-d606-4781-a6ba-43de7aff59e4,
>> connTimeoutStrategy=ExponentialBackoffTimeoutStrategy [maxTimeout=600000,
>> totalTimeout=10000, startNanos=1671033974906026, currTimeout=600000],
>> err=Operation timed out [timeoutStrategy=ExponentialBackoffTimeoutStrategy
>> [maxTimeout=600000, totalTimeout=10000, startNanos=1671033974906026,
>> currTimeout=600000]], addr=/127.0.0.1:47100,
>> failureDetectionTimeoutEnabled=true, timeout=0]
>> 2021-02-14 02:07:54 WARN  tcp-comm-worker-#1-#216 TcpCommunicationSpi:576 -
>> Handshake timed out (will stop attempts to perform the handshake)
>> [node=6953d599-d606-4781-a6ba-43de7aff59e4,
>> connTimeoutStrategy=ExponentialBackoffTimeoutStrategy [maxTimeout=600000,
>> totalTimeout=10000, startNanos=1671044002786218, currTimeout=600000],
>> err=Operation timed out [timeoutStrategy=ExponentialBackoffTimeoutStrategy
>> [maxTimeout=600000, totalTimeout=10000, startNanos=1671044002786218,
>> currTimeout=600000]], addr=dwccatp01/10.86.26.180:47100,
>> failureDetectionTimeoutEnabled=true, timeout=0]
>> 2021-02-14 02:08:06 ERROR grid-timeout-worker-#206 G:581 - Blocked
>> system-critical thread has been detected. This can lead to cluster-wide
>> undefined behaviour [workerName=tcp-comm-worker,
>> threadName=tcp-comm-worker-#1-#216, blockedFor=11s]
>> 2021-02-14 02:08:06 WARN  grid-timeout-worker-#206 root:576 - Possible
>> failure suppressed accordingly to a configured handler
>> [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
>> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
>> [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
>> failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
>> o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
>> igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
>> class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker,
>> igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]
>>         at sun.misc.Unsafe.park(Native Method)
>>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>>         at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
>>         at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
>>         at org.apache.ignite.spi.discovery.tcp.ClientImpl.pingNode(ClientImpl.java:449)
>>         at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.pingNode(TcpDiscoverySpi.java:493)
>>         at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.pingNode(GridDiscoveryManager.java:1688)
>>         at org.apache.ignite.internal.managers.GridManagerAdapter$1.pingNode(GridManagerAdapter.java:409)
>>         at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.processDisconnect(TcpCommunicationSpi.java:5165)
>>         at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.body(TcpCommunicationSpi.java:4951)
>>         at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>>         at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$5.body(TcpCommunicationSpi.java:2503)
>>         at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58)
>> [02:08:06] Possible failure suppressed accordingly to a configured handler
>> [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
>> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
>> [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
>> failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
>> o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
>> igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
>> 2021-02-14 02:08:07 WARN  grid-timeout-worker-#206
>> CacheDiagnosticManager:571 - Page locks dump:
>>
>>
>> 2021-02-14 02:08:16 ERROR grid-timeout-worker-#206 G:581 - Blocked
>> system-critical thread has been detected. This can lead to cluster-wide
>> undefined behaviour [workerName=tcp-comm-worker,
>> threadName=tcp-comm-worker-#1-#216, blockedFor=21s]
>> 2021-02-14 02:08:16 WARN  grid-timeout-worker-#206 root:576 - Possible
>> failure suppressed accordingly to a configured handler
>> [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
>> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
>> [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
>> failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
>> o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
>> igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
>> class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker,
>> igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]
>>         at sun.misc.Unsafe.park(Native Method)
>>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>>         at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
>>         at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
>>         at org.apache.ignite.spi.discovery.tcp.ClientImpl.pingNode(ClientImpl.java:449)
>>         at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.pingNode(TcpDiscoverySpi.java:493)
>>         at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.pingNode(GridDiscoveryManager.java:1688)
>>         at org.apache.ignite.internal.managers.GridManagerAdapter$1.pingNode(GridManagerAdapter.java:409)
>>         at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.processDisconnect(TcpCommunicationSpi.java:5165)
>>         at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.body(TcpCommunicationSpi.java:4951)
>>         at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>>         at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$5.body(TcpCommunicationSpi.java:2503)
>>         at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58)
>> [02:08:16] Possible failure suppressed accordingly to a configured handler
>> [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
>> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
>> [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
>> failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
>> o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
>> igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
>> 2021-02-14 02:08:16 WARN  grid-timeout-worker-#206
>> CacheDiagnosticManager:571 - Page locks dump:
>>
>>
>> 2021-02-14 02:08:28 ERROR grid-timeout-worker-#206 G:581 - Blocked
>> system-critical thread has been detected. This can lead to cluster-wide
>> undefined behaviour [workerName=tcp-comm-worker,
>> threadName=tcp-comm-worker-#1-#216, blockedFor=33s]
>> 2021-02-14 02:08:28 WARN  grid-timeout-worker-#206 root:576 - Possible
>> failure suppressed accordingly to a configured handler
>> [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
>> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
>> [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
>> failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
>> o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
>> igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
>> class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker,
>> igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]
>>         at sun.misc.Unsafe.park(Native Method)
>>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>>         at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
>>         at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
>>         at org.apache.ignite.spi.discovery.tcp.ClientImpl.pingNode(ClientImpl.java:449)
>>         at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.pingNode(TcpDiscoverySpi.java:493)
>>         at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.pingNode(GridDiscoveryManager.java:1688)
>>         at org.apache.ignite.internal.managers.GridManagerAdapter$1.pingNode(GridManagerAdapter.java:409)
>>         at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.processDisconnect(TcpCommunicationSpi.java:5165)
>>         at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.body(TcpCommunicationSpi.java:4951)
>>         at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>>         at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$5.body(TcpCommunicationSpi.java:2503)
>>         at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58)
>> [02:08:28] Possible failure suppressed accordingly to a configured handler
>> [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
>> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
>> [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
>> failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
>> o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
>> igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
>> 2021-02-14 02:08:28 WARN  grid-timeout-worker-#206
>> CacheDiagnosticManager:571 - Page locks dump:
>>
>>
>> 2021-02-14 02:08:32 WARN  http-nio-8082-exec-5 TcpCommunicationSpi:576 -
>> Handshake timed out (will stop attempts to perform the handshake)
>> [node=6953d599-d606-4781-a6ba-43de7aff59e4,
>> connTimeoutStrategy=ExponentialBackoffTimeoutStrategy [maxTimeout=600000,
>> totalTimeout=10000, startNanos=1671081715938786, currTimeout=600000],
>> err=Operation timed out [timeoutStrategy=ExponentialBackoffTimeoutStrategy
>> [maxTimeout=600000, totalTimeout=10000, startNanos=1671081715938786,
>> currTimeout=600000]], addr=/127.0.0.1:47100,
>> failureDetectionTimeoutEnabled=true, timeout=0]
>> 2021-02-14 02:08:37 ERROR grid-timeout-worker-#206 G:581 - Blocked
>> system-critical thread has been detected. This can lead to cluster-wide
>> undefined behaviour [workerName=tcp-comm-worker,
>> threadName=tcp-comm-worker-#1-#216, blockedFor=42s]
>> 2021-02-14 02:08:37 WARN  grid-timeout-worker-#206 root:576 - Possible
>> failure suppressed accordingly to a configured handler
>> [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
>> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
>> [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
>> failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
>> o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
>> igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
>> class org.apache.ignite.IgniteException: GridWorker [name=tcp-comm-worker,
>> igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]
>>         at sun.misc.Unsafe.park(Native Method)
>>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>>         at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
>>         at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
>>         at org.apache.ignite.spi.discovery.tcp.ClientImpl.pingNode(ClientImpl.java:449)
>>         at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.pingNode(TcpDiscoverySpi.java:493)
>>         at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.pingNode(GridDiscoveryManager.java:1688)
>>         at org.apache.ignite.internal.managers.GridManagerAdapter$1.pingNode(GridManagerAdapter.java:409)
>>         at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.processDisconnect(TcpCommunicationSpi.java:5165)
>>         at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.body(TcpCommunicationSpi.java:4951)
>>         at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>>         at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$5.body(TcpCommunicationSpi.java:2503)
>>         at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58)
>> [02:08:37] Possible failure suppressed accordingly to a configured handler
>> [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
>> super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
>> [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
>> failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
>> o.a.i.IgniteException: GridWorker [name=tcp-comm-worker,
>> igniteInstanceName=null, finished=false, heartbeatTs=1613257674823]]]
>> 2021-02-14 02:08:37 WARN  grid-timeout-worker-#206
>> CacheDiagnosticManager:571 - Page locks dump:
>>
>>
>>
>>
>> --
>> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>>
>
