Hi Team,
We encountered the following defect in our PROD environment, after which all
traffic halted for around 10 minutes. We recently upgraded our cluster from
Ignite 2.6.0 to 2.7.6.
Is this related to any known open defect in this version? Has anyone
observed the same issue before?
Any help or pointers would be appreciated.
[2020-07-03T18:17:11,613][ERROR][sys-stripe-36-#37%CustomerCC%][G] Blocked system-critical thread has been detected. This can lead to cluster-wide undefined behaviour [threadName=partition-exchanger, blockedFor=480s]
[2020-07-03T18:17:11,613][WARN ][sys-stripe-36-#37%CustomerCC%][G] Thread [name="exchange-worker-#344%CustomerCC%", id=391, state=TIMED_WAITING, blockCnt=1, waitCnt=2049782]
    Lock [object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@6bf9f3a4, ownerName=null, ownerId=-1]
[2020-07-03T18:17:11,620][ERROR][sys-stripe-36-#37%CustomerCC%][] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=partition-exchanger, igniteInstanceName=CustomerCC, finished=false, heartbeatTs=1593780431612]]]
org.apache.ignite.IgniteException: GridWorker [name=partition-exchanger, igniteInstanceName=CustomerCC, finished=false, heartbeatTs=1593780431612]
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1831) [ignite-core-2.7.6.jar:2.7.6]
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1826) [ignite-core-2.7.6.jar:2.7.6]
    at org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:233) [ignite-core-2.7.6.jar:2.7.6]
    at org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:297) [ignite-core-2.7.6.jar:2.7.6]
    at org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:513) [ignite-core-2.7.6.jar:2.7.6]
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120) [ignite-core-2.7.6.jar:2.7.6]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_151]
[2020-07-03T18:17:11,625][WARN ][sys-stripe-36-#37%CustomerCC%][FailureProcessor] No deadlocked threads detected.
[2020-07-03T18:17:21,790][INFO ][tcp-disco-sock-reader-#201%CustomerCC%][TcpDiscoverySpi] Finished serving remote node connection [rmtAddr=/xx.xx.xx.xx:46416, rmtPort=46416
[2020-07-03T18:17:21,793][WARN ][jvm-pause-detector-worker][IgniteKernal%CustomerCC] Possible too long JVM pause: 10133 milliseconds.
[2020-07-03T18:17:21,794][WARN ][grid-nio-worker-tcp-comm-31-#295%CustomerCC%][TcpCommunicationSpi] Communication SPI session write timed out (consider increasing 'socketWriteTimeout' configuration property) [remoteAddr=/xx.xx.xx.xx:11764, writeTimeout=2000]
[2020-07-03T18:17:21,794][WARN ][grid-nio-worker-tcp-comm-57-#321%CustomerCC%][TcpCommunicationSpi] Communication SPI session write timed out (consider increasing 'socketWriteTimeout' configuration property) [remoteAddr=/xx.xx.xx.xx:38500, writeTimeout=2000]
[2020-07-03T18:17:21,794][WARN ][grid-nio-worker-tcp-comm-5-#269%CustomerCC%][TcpCommunicationSpi] Communication SPI session write timed out (consider increasing 'socketWriteTimeout' configuration property) [remoteAddr=/xx.xx.xx.xx:41442, writeTimeout=2000]
[2020-07-03T18:17:21,794][WARN ][grid-nio-worker-tcp-comm-53-#317%CustomerCC%][TcpCommunicationSpi] Communication SPI session write timed out (consider increasing 'socketWriteTimeout' configuration property) [remoteAddr=/xx.xx.xx.xx:44178, writeTimeout=2000]
[2020-07-03T18:17:21,794][WARN ][grid-nio-worker-tcp-comm-59-#323%CustomerCC%][TcpCommunicationSpi] Communication SPI session write timed out (consider increasing 'socketWriteTimeout' configuration property) [remoteAddr=/xx.xx.xx.xx:11884, writeTimeout=2000]
[2020-07-03T18:17:21,795][WARN ][grid-nio-worker-tcp-comm-59-#323%CustomerCC%][TcpCommunicationSpi] Communication SPI session write timed out (consider increasing 'socketWriteTimeout' configuration property) [remoteAddr=/xx.xx.xx.xx:39044, writeTimeout=2000]
[2020-07-03T18:17:21,795][WARN ][grid-nio-worker-tcp-comm-53-#317%CustomerCC%][TcpCommunicationSpi] Communication SPI session write timed out (consider increasing 'socketWriteTimeout' configuration property) [remoteAddr=/xx.xx.xx.xx:48756, writeTimeout=2000]
[2020-07-03T18:17:21,795][WARN ][grid-nio-worker-tcp-comm-59-#323%CustomerCC%][TcpCommunicationSpi] Communication SPI session write timed out (consider increasing 'socketWriteTimeout' configuration property) [remoteAddr=/xx.xx.xx.xx:42190, writeTimeout=2000]
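
In case it helps when reading the failure context above: the heartbeatTs value looks like an epoch timestamp in milliseconds. Below is a minimal Python sketch for converting it so it can be compared with the log timestamps. The function name heartbeat_to_local is made up for illustration, and the UTC+05:30 offset is an assumption about the server time zone that is not stated anywhere in the logs.

```python
from datetime import datetime, timedelta, timezone

# heartbeatTs copied from the FailureContext in the log (epoch milliseconds).
HEARTBEAT_TS_MS = 1593780431612

# Assumption: the node writes its logs in UTC+05:30. Change this if the
# server uses a different time zone.
ASSUMED_LOG_TZ = timezone(timedelta(hours=5, minutes=30))


def heartbeat_to_local(ts_ms, tz=ASSUMED_LOG_TZ):
    """Convert an epoch-millisecond value to a timezone-aware datetime."""
    # Split seconds and milliseconds with integer arithmetic so that no
    # floating-point rounding affects the millisecond part.
    seconds, millis = divmod(ts_ms, 1000)
    return datetime.fromtimestamp(seconds, tz=tz) + timedelta(milliseconds=millis)


if __name__ == "__main__":
    # Prints 2020-07-03T18:17:11.612000+05:30
    print(heartbeat_to_local(HEARTBEAT_TS_MS).isoformat())
```

Under that time zone assumption, the converted value falls within about a millisecond of the 18:17:11,613 entries in the log.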
Thanks and Regards,
Kamlesh Joshi