Hi Ilya,

Thanks! I am enabling verbose logging and will let it run for a day to collect
the logs. The cluster did not get stuck in the end; we can still get responses.


Regards
Aaron
From: Ilya Kasnacheev
Date: 2018-12-18 19:36
To: user
CC: aaron
Subject: Re: Re: Partition-exchanger blocked after upgrade to 2.7
Hello!

It's still hard to say. Can you enable more verbose logging for 
org.apache.ignite?
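
A minimal sketch of one way to raise the log level for org.apache.ignite, 
assuming the node logs through java.util.logging. That is an assumption: the 
log format in this thread suggests another framework may be in use, in which 
case the equivalent change belongs in that framework's configuration file. The 
class name VerboseIgniteLogging is hypothetical.

```java
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.Logger;

public class VerboseIgniteLogging {
    // Keep a strong reference: java.util.logging holds loggers weakly,
    // so a level set on an otherwise unreferenced logger can be lost.
    private static final Logger IGNITE_LOGGER = Logger.getLogger("org.apache.ignite");

    /** Sets the org.apache.ignite logger to the given level and lets root handlers pass it. */
    public static Logger enable(Level level) {
        IGNITE_LOGGER.setLevel(level);
        // Handlers filter records too (ConsoleHandler defaults to INFO),
        // so lower any root handler that is stricter than the requested level.
        for (Handler handler : Logger.getLogger("").getHandlers()) {
            if (handler.getLevel().intValue() > level.intValue()) {
                handler.setLevel(level);
            }
        }
        return IGNITE_LOGGER;
    }

    public static void main(String[] args) {
        Logger log = enable(Level.FINE);
        log.fine("FINE logging for org.apache.ignite is enabled");
    }
}
```

Only the org.apache.ignite logger is lowered here, so other packages keep 
their inherited level even though the handlers now accept finer records.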

Did the cluster get unstuck eventually?

Regards,
-- 
Ilya Kasnacheev


Tue, 18 Dec 2018 at 14:25, aa...@tophold.com <aa...@tophold.com>:
Hi Ilya, 

Attached is the full log of another Ignite node. The data in the cluster 
is written back to MySQL.

On this node the ERROR occurred at around 2018-12-14 10:38:51.730, but in fact 
the node kept working after that. 


Regards
Aaron
 
From: Ilya Kasnacheev
Date: 2018-12-18 18:44
To: user
Subject: Re: Partition-exchanger blocked after upgrade to 2.7
Hello!

Unfortunately, it's hard to say what happens here from such a short log 
snippet. Can you provide full logs?

Regards,
-- 
Ilya Kasnacheev


Tue, 18 Dec 2018 at 05:51, aa...@tophold.com <aa...@tophold.com>:
Hello, 

After we upgraded to 2.7 we ran into a weird warning. Basically all our Ignite 
caches run in LOCAL mode on an internal network. 

The configuration is almost entirely default. We see an ERROR logged by 
tcp-disco-msg-worker*, but after that the cluster keeps working and no crash 
happens. 

[ERROR] 2018-12-17 23:52:55.989 [tcp-disco-msg-worker-#2%PortfolioEventIgnite%] 
[ig] G - Blocked system-critical thread has been detected. This can lead to 
cluster-wide undefined behaviour [threadName=partition-exchanger, blockedFor=5s]
[WARN ] 2018-12-17 23:52:55.989 [tcp-disco-msg-worker-#2%PortfolioEventIgnite%] 
[ig] G - Thread [name="exchange-worker-#98%PortfolioEventIgnite%", id=152, 
state=TIMED_WAITING, blockCnt=0, waitCnt=10143]
    Lock 
[object=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@39b50130,
 ownerName=null, ownerId=-1]

[WARN ] 2018-12-17 23:52:55.998 [tcp-disco-msg-worker-#2%PortfolioEventIgnite%] 
[ig] FailureProcessor - No deadlocked threads detected.
[WARN ] 2018-12-17 23:52:57.443 [jvm-pause-detector-worker] [ig] 
IgniteKernal%PortfolioEventIgnite - Possible too long JVM pause: 1404 
milliseconds.
[WARN ] 2018-12-17 23:52:57.457 [tcp-disco-msg-worker-#2%PortfolioEventIgnite%] 
[ig] FailureProcessor - Thread dump at 2018/12/17 23:52:57 UTC


Since the caches are local, I am not sure why the partition-exchanger is still 
reported as blocked. 

Also, since tcp-disco-msg-worker runs on an internal network, this warning 
should not happen. 

As for "Possible too long JVM pause: 1404 milliseconds": according to the GC 
details from around that time, the pause cost is reasonable:

2018-12-18T07:44:27.513+0800: 50200.190: [GC pause (G1 Evacuation Pause) 
(young), 0.0241404 secs]
....
[Times: user=0.19 sys=0.00, real=0.02 secs]
 
2018-12-18T07:53:21.453+0800: 50734.129: [GC pause (G1 Evacuation Pause) 
(young), 0.0221342 secs]
...
[Times: user=0.20 sys=0.00, real=0.02 secs]
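
The comparison above can be sketched in code: scan a G1 log for evacuation 
pause durations and check the longest one against the 1404 ms reported by the 
pause detector. This is only an illustration. The class name GcPauseScan and 
the regular expression are assumptions based on the two log entries shown 
here, and the scan covers only pauses that G1 records as young or mixed 
collections.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcPauseScan {
    // Matches the duration in entries such as "(young), 0.0241404 secs]".
    private static final Pattern PAUSE = Pattern.compile(
        "\\((?:young|mixed)\\)(?: \\([a-z-]+\\))?, ([0-9]+\\.[0-9]+) secs\\]");

    /** Returns every matched pause duration, converted to milliseconds. */
    public static List<Double> pausesMillis(String gcLog) {
        List<Double> pauses = new ArrayList<>();
        Matcher m = PAUSE.matcher(gcLog);
        while (m.find()) {
            pauses.add(Double.parseDouble(m.group(1)) * 1000.0);
        }
        return pauses;
    }

    /** Returns the longest matched pause in milliseconds, or 0 if none matched. */
    public static double maxPauseMillis(String gcLog) {
        double max = 0.0;
        for (double pause : pausesMillis(gcLog)) {
            max = Math.max(max, pause);
        }
        return max;
    }

    public static void main(String[] args) {
        String gcLog =
            "2018-12-18T07:44:27.513+0800: 50200.190: [GC pause (G1 Evacuation Pause) \n"
          + "(young), 0.0241404 secs]\n"
          + "2018-12-18T07:53:21.453+0800: 50734.129: [GC pause (G1 Evacuation Pause) \n"
          + "(young), 0.0221342 secs]\n";
        double reportedPauseMillis = 1404.0; // value from the jvm-pause-detector warning
        double longestGc = maxPauseMillis(gcLog);
        System.out.println(String.format("longest GC pause: %.1f ms", longestGc));
        System.out.println("explains reported pause: " + (longestGc >= reportedPauseMillis));
    }
}
```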



Regards
Aaron
