Ivan Daschinskiy created IGNITE-13540:
-----------------------------------------
Summary: Exchange worker waiting for a new task from the queue is
considered blocked.
Key: IGNITE-13540
URL: https://issues.apache.org/jira/browse/IGNITE-13540
Project: Ignite
Issue Type: Bug
Reporter: Ivan Daschinskiy
Assignee: Ivan Daschinskiy
Waiting for a new task in ExchangeWorker#body is currently not marked as a
blocking section.
So if the network timeout (the timeout for polling a task from the queue) is
greater than the system worker blocked timeout, the exchange worker thread is
considered blocked. Sometimes this is reported in the logs a few seconds after
PME has actually finished:
{noformat}
[2020-10-06 16:55:45,939][INFO
][exchange-worker-#50][org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager1]
Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion
[topVer=6, minorTopVer=1], force=false, evt=DISCOVERY_CUSTOM_EVT,
node=163fd0f0-b9a4-4317-a28f-f7dbdb776076]
[2020-10-06 16:55:48,822][ERROR][tcp-disco-msg-worker-[9e18957a
172.18.0.5:47500]-#2-#44][org.apache.ignite.internal.util.typedef.G1] Blocked
system-critical thread has been detected. This can lead to cluster-wide
undefined behaviour [workerName=partition-exchanger,
threadName=exchange-worker-#50, blockedFor=2s]
[2020-10-06 16:55:48,824][WARN ][tcp-disco-msg-worker-[9e18957a
172.18.0.5:47500]-#2-#44][org.apache.ignite.internal.util.typedef.G1] Thread
[name="exchange-worker-#50", id=90, state=TIMED_WAITING, blockCnt=20,
waitCnt=48]
Lock
[object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@14f29e0e,
ownerName=null, ownerId=-1]
[2020-10-06 16:55:48,827][WARN ][tcp-disco-msg-worker-[9e18957a
172.18.0.5:47500]-#2-#44][root1] Possible failure suppressed accordingly to a
configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class
o.a.i.IgniteException: GridWorker [name=partition-exchanger,
igniteInstanceName=null, finished=false, heartbeatTs=1601992545941]]]
class org.apache.ignite.IgniteException: GridWorker [name=partition-exchanger,
igniteInstanceName=null, finished=false, heartbeatTs=1601992545941]
at
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$3.apply(IgnitionEx.java:1860)
at
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$3.apply(IgnitionEx.java:1855)
at
org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:234)
at
org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:299)
{noformat}
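The idea of marking the queue wait as a blocking section can be sketched roughly as follows. This is a simplified, self-contained model and not Ignite's actual code: the class `BlockingSectionSketch`, its nested `Worker`, and the helpers `pollUnmarked` and `pollMarked` are hypothetical names introduced for illustration, and the convention of parking the heartbeat at `Long.MAX_VALUE` while inside a blocking section is an assumption of this sketch.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical, simplified model of a heartbeat-monitored worker; not Ignite code. */
final class BlockingSectionSketch {
    /** Minimal stand-in for a worker that a watchdog checks for liveness. */
    static final class Worker {
        /** Time of the last heartbeat, or Long.MAX_VALUE while in a blocking section. */
        private volatile long heartbeatTs = System.currentTimeMillis();

        /** Records that the worker is alive right now. */
        public void updateHeartbeat() {
            heartbeatTs = System.currentTimeMillis();
        }

        /** Tells the watchdog that the worker is about to wait on purpose. */
        public void blockingSectionBegin() {
            heartbeatTs = Long.MAX_VALUE;
        }

        /** Leaves the blocking section and refreshes the heartbeat. */
        public void blockingSectionEnd() {
            updateHeartbeat();
        }

        /** Watchdog check: true if the heartbeat is older than the blocked timeout. */
        public boolean isBlocked(long now, long blockedTimeoutMs) {
            long ts = heartbeatTs;

            return ts != Long.MAX_VALUE && now - ts > blockedTimeoutMs;
        }
    }

    /**
     * Problematic variant: the heartbeat is refreshed only before the wait, so a poll
     * timeout longer than the blocked timeout makes an idle worker look blocked.
     */
    static <T> T pollUnmarked(Worker w, BlockingQueue<T> q, long pollTimeoutMs)
        throws InterruptedException {
        w.updateHeartbeat();

        return q.poll(pollTimeoutMs, TimeUnit.MILLISECONDS);
    }

    /** Variant with the wait wrapped in a blocking section, so the watchdog ignores it. */
    static <T> T pollMarked(Worker w, BlockingQueue<T> q, long pollTimeoutMs)
        throws InterruptedException {
        w.blockingSectionBegin();

        try {
            return q.poll(pollTimeoutMs, TimeUnit.MILLISECONDS);
        }
        finally {
            w.blockingSectionEnd();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Worker w = new Worker();
        BlockingQueue<String> q = new LinkedBlockingQueue<>();

        // The queue is empty, so this waits for the full poll timeout.
        String task = pollMarked(w, q, 50);

        System.out.println("task=" + task);
    }
}
```

With `pollUnmarked`, a watchdog calling `isBlocked` while the worker is still idle in `poll` would report it once the blocked timeout has elapsed, which matches the symptom described above. With `pollMarked`, the same check returns false for the whole duration of the wait.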