How do I turn it off? Also, I think I know what may have caused the Visor issue: I was connecting to the cluster without specifying ports 47500..47509 in the addresses. Once I added the port range it seems much more stable, and I can even see the WiFi node and everything.
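In case it helps, here is roughly how I'm building the client config now (just a sketch; the server-1/2/3 host names are placeholders for my real nodes). The failure handler part at the end is only my guess at what "turning it off" would mean, based on the ignoredFailureTypes that shows up in the log below, so please correct me if you meant something else:

import java.util.Arrays;
import java.util.Collections;

import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.failure.FailureType;
import org.apache.ignite.failure.StopNodeOrHaltFailureHandler;
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder;

public class ClientStart {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setClientMode(true);

        // Discovery: list every server with the full discovery port range,
        // instead of the bare host names I was using before.
        TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
        ipFinder.setAddresses(Arrays.asList(
            "server-1:47500..47509",   // placeholder host names
            "server-2:47500..47509",
            "server-3:47500..47509"));
        cfg.setDiscoverySpi(new TcpDiscoverySpi().setIpFinder(ipFinder));

        // My guess at "turning it off": keep the default handler but have it
        // ignore SYSTEM_WORKER_BLOCKED so a blocked worker does not stop the node.
        StopNodeOrHaltFailureHandler failureHnd = new StopNodeOrHaltFailureHandler(false, 0);
        failureHnd.setIgnoredFailureTypes(Collections.singleton(FailureType.SYSTEM_WORKER_BLOCKED));
        cfg.setFailureHandler(failureHnd);

        Ignition.start(cfg);
    }
}

Is that roughly it, or did you mean something on the server side?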
On Fri, 21 Jun 2019 at 06:01, Ilya Kasnacheev <[email protected]> wrote:

> Hello!
>
> It is recommended to turn off failure detection, since its default config
> is not very convenient. Maybe it is also fixed in 2.7.5.
>
> This just means some operation took longer than expected and Ignite
> panicked.
>
> Regards,
>
> Thu, 20 Jun 2019, 19:28 John Smith <[email protected]>:
>
>> Actually this happened when the WiFi node connected. But it never
>> happened before...
>>
>> [14:51:46,660][INFO][exchange-worker-#43%xxxxxx%][GridDhtPartitionsExchangeFuture] Completed partition exchange [localNode=e9e9f4b9-b249-4a4d-87ee-fc97097ad9ee, exchange=GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion [topVer=59, minorTopVer=0], evt=NODE_JOINED, evtNode=TcpDiscoveryNode [id=45516c37-5ee0-4046-a13a-9573607d25aa, addrs=[0:0:0:0:0:0:0:1, 127.0.0.1, MY_WIFI_IP, MY_WIFI_IP], sockAddrs=[/MY_WIFI_IP:0, /0:0:0:0:0:0:0:1:0, /127.0.0.1:0, /MY_WIFI_IP:0], discPort=0, order=59, intOrder=32, lastExchangeTime=1561042306599, loc=false, ver=2.7.0#20181130-sha1:256ae401, isClient=true], done=true], topVer=AffinityTopologyVersion [topVer=59, minorTopVer=0], durationFromInit=0]
>> [14:51:46,660][INFO][exchange-worker-#43%xxxxxx%][time] Finished exchange init [topVer=AffinityTopologyVersion [topVer=59, minorTopVer=0], crd=true]
>> [14:51:46,662][INFO][exchange-worker-#43%xxxxxx%][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=59, minorTopVer=0], force=false, evt=NODE_JOINED, node=45516c37-5ee0-4046-a13a-9573607d25aa]
>> [14:51:47,123][INFO][grid-nio-worker-tcp-comm-2-#26%xxxxxx%][TcpCommunicationSpi] Accepted incoming communication connection [locAddr=/xxx.xxx.xxx.69:47100, rmtAddr=/MY_WIFI_IP:62249]
>> [14:51:59,428][INFO][db-checkpoint-thread-#1068%xxxxxx%][GridCacheDatabaseSharedManager] Checkpoint started [checkpointId=56e2ea25-7273-49ab-81ac-0fdbc5945626, startPtr=FileWALPointer [idx=137, fileOff=45790479, len=17995], checkpointLockWait=0ms, checkpointLockHoldTime=12ms, walCpRecordFsyncDuration=3ms, pages=242, reason='timeout']
>> [14:51:59,544][INFO][db-checkpoint-thread-#1068%xxxxxx%][GridCacheDatabaseSharedManager] Checkpoint finished [cpId=56e2ea25-7273-49ab-81ac-0fdbc5945626, pages=242, markPos=FileWALPointer [idx=137, fileOff=45790479, len=17995], walSegmentsCleared=0, walSegmentsCovered=[], markDuration=23ms, pagesWrite=14ms, fsync=101ms, total=138ms]
>> [14:52:45,827][INFO][tcp-disco-msg-worker-#2%xxxxxx%][TcpDiscoverySpi] Local node seems to be disconnected from topology (failure detection timeout is reached) [failureDetectionTimeout=10000, connCheckInterval=500]
>> [14:52:45,847][SEVERE][ttl-cleanup-worker-#1652%xxxxxx%][G] Blocked system-critical thread has been detected. This can lead to cluster-wide undefined behaviour [threadName=tcp-disco-msg-worker, blockedFor=39s]
>> [14:52:45,859][INFO][tcp-disco-sock-reader-#36%xxxxxx%][TcpDiscoverySpi] Finished serving remote node connection [rmtAddr=/xxx.xxx.xxx.76:56861, rmtPort=56861
>> [14:52:45,864][WARNING][ttl-cleanup-worker-#1652%xxxxxx%][G] Thread [name="tcp-disco-msg-worker-#2%xxxxxx%", id=83, state=RUNNABLE, blockCnt=6, waitCnt=24621465]
>>
>> [14:52:45,875][SEVERE][ttl-cleanup-worker-#1652%xxxxxx%][] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=tcp-disco-msg-worker, igniteInstanceName=xxxxxx, finished=false, heartbeatTs=1561042326687]]]
>> class org.apache.ignite.IgniteException: GridWorker [name=tcp-disco-msg-worker, igniteInstanceName=xxxxxx, finished=false, heartbeatTs=1561042326687]
>>     at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1831)
>>     at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1826)
>>     at org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:233)
>>     at org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:297)
>>     at org.apache.ignite.internal.processors.cache.GridCacheSharedTtlCleanupManager$CleanupWorker.body(GridCacheSharedTtlCleanupManager.java:151)
>>     at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>>     at java.lang.Thread.run(Thread.java:748)
>>
>> [14:52:47,974][WARNING][jvm-pause-detector-worker][IgniteKernal%xxxxxx] Possible too long JVM pause: 2047 milliseconds.
>> [14:52:47,994][INFO][tcp-disco-srvr-#3%xxxxxx%][TcpDiscoverySpi] TCP discovery accepted incoming connection [rmtAddr=/xxx.xxx.xxx.72, rmtPort=37607]
>> [14:52:47,994][INFO][tcp-disco-srvr-#3%xxxxxx%][TcpDiscoverySpi] TCP discovery spawning a new thread for connection [rmtAddr=/xxx.xxx.xxx.72, rmtPort=37607]
>> [14:52:47,996][INFO][tcp-disco-sock-reader-#37%xxxxxx%][TcpDiscoverySpi] Started serving remote node connection [rmtAddr=/xxx.xxx.xxx.72:37607, rmtPort=37607]
>> [14:52:48,005][WARNING][ttl-cleanup-worker-#1652%xxxxxx%][FailureProcessor] Thread dump at 2019/06/20 14:52:47 UTC
>> Thread [name="sys-#25624%xxxxxx%", id=33109, state=TIMED_WAITING, blockCnt=0, waitCnt=1]
>>     Lock [object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3a9414a4, ownerName=null, ownerId=-1]
>>     at sun.misc.Unsafe.park(Native Method)
>>     at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>>     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
>>     at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
>>     at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>     at java.lang.Thread.run(Thread.java:748)
>>
>> Thread [name="Thread-6972", id=33108, state=TIMED_WAITING, blockCnt=0, waitCnt=17]
>>     Lock [object=java.util.concurrent.SynchronousQueue$TransferStack@62bdd75c, ownerName=null, ownerId=-1]
>>     at sun.misc.Unsafe.park(Native Method)
>>     at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>>     at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
>>     at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
>>     at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
>>     at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>     at java.lang.Thread.run(Thread.java:748)
>>
>>
>> On Thu, 20 Jun 2019 at 10:08, John Smith <[email protected]> wrote:
>>
>>> Ok, where do I look for the Visor logs when it hangs? And it's not a "no
>>> caches" issue; the cluster works great. It's when Visor cannot reach a
>>> specific client node.
>>>
>>> On Thu., Jun. 20, 2019, 8:45 a.m. Vasiliy Sisko, <[email protected]>
>>> wrote:
>>>
>>>> Hello @javadevmtl
>>>>
>>>> I failed to reproduce your problem.
>>>> In case of any error in the cache command, Visor CMD shows the message
>>>> "No caches found".
>>>> Please provide logs of Visor, server and client nodes after the command
>>>> hangs.
>>>>
>>>>
>>>>
>>>> --
>>>> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>>>
