Hello!

From this log:

[17:19:09,949][WARNING][jvm-pause-detector-worker][IgniteKernal] Possible
too long JVM pause: 1405 milliseconds.
[17:19:12,237][WARNING][jvm-pause-detector-worker][IgniteKernal] Possible
too long JVM pause: 1983 milliseconds.
[17:19:14,416][WARNING][jvm-pause-detector-worker][IgniteKernal] Possible
too long JVM pause: 2029 milliseconds.
[17:19:16,619][WARNING][jvm-pause-detector-worker][IgniteKernal] Possible
too long JVM pause: 2103 milliseconds.
[17:19:18,948][WARNING][jvm-pause-detector-worker][IgniteKernal] Possible
too long JVM pause: 2279 milliseconds.
[17:19:21,217][WARNING][jvm-pause-detector-worker][IgniteKernal] Possible
too long JVM pause: 2219 milliseconds.
[17:19:23,268][WARNING][jvm-pause-detector-worker][IgniteKernal] Possible
too long JVM pause: 2001 milliseconds.
[17:19:25,028][WARNING][jvm-pause-detector-worker][IgniteKernal] Possible
too long JVM pause: 1710 milliseconds.
[17:19:28,814][WARNING][jvm-pause-detector-worker][IgniteKernal] Possible
too long JVM pause: 3736 milliseconds.
[17:19:30,962][WARNING][jvm-pause-detector-worker][IgniteKernal] Possible
too long JVM pause: 2098 milliseconds.
[17:19:32,553][WARNING][jvm-pause-detector-worker][IgniteKernal] Possible
too long JVM pause: 1541 milliseconds.
[17:19:37,938][WARNING][jvm-pause-detector-worker][IgniteKernal] Possible
too long JVM pause: 3837 milliseconds.
[17:19:51,271][WARNING][jvm-pause-detector-worker][IgniteKernal] Possible
too long JVM pause: 13200 milliseconds.
[17:19:57,222][WARNING][jvm-pause-detector-worker][IgniteKernal] Possible
too long JVM pause: 7482 milliseconds.
[17:20:17,384][WARNING][jvm-pause-detector-worker][IgniteKernal] Possible
too long JVM pause: 5832 milliseconds.
[17:20:17,384][SEVERE][exchange-worker-#43][G] Blocked system-critical
thread has been detected. This can lead to cluster-wide undefined behaviour
[threadName=grid-timeout-worker, blockedFor=10s]
[17:20:36,342][WARNING][tcp-disco-msg-worker-#2][TcpDiscoverySpi] Timed out
waiting for message delivery receipt (most probably, the reason is in long
GC pauses on remote node; consider tuning GC and increasing 'ackTimeout'
configuration property). Will retry to send message with increased timeout
[currentTimeout=10000, rmtAddr=server: 2016/redacted_ip:47500,
rmtPort=47500]
[17:20:36,342][INFO][tcp-disco-srvr-#3][TcpDiscoverySpi] TCP discovery
accepted incoming connection [rmtAddr=/redacted_ip, rmtPort=56925]
[17:20:36,342][WARNING][jvm-pause-detector-worker][IgniteKernal] Possible
too long JVM pause: 30741 milliseconds.
[17:20:42,276][SEVERE][nio-acceptor-tcp-rest-#39][GridTcpRestProtocol]
Runtime error caught during grid runnable execution: GridWorker
[name=nio-acceptor-tcp-rest, igniteInstanceName=null, finished=false,
heartbeatTs=1581322824712, hashCode=328613569, interrupted=false,
runner=nio-acceptor-tcp-rest-#39]
*java.lang.OutOfMemoryError: GC overhead limit exceeded*

So you have plainly run out of heap: the JVM pauses keep growing until the
node fails with OutOfMemoryError. Ignite is likely not to blame, since
Ignite itself does not use a lot of heap.
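
To get a rough picture of how severe the pauses are, you can total the
durations reported by the pause detector. Below is a minimal sketch, assuming
the message format shown in the log above (the message may be wrapped across
lines, as it is in this mail). The class name PauseLogSummary and its method
are made up for illustration and are not part of Ignite.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PauseLogSummary {
    // Matches the pause-detector message; \s+ also tolerates the line
    // break that mail clients insert between "Possible" and "too long".
    private static final Pattern PAUSE = Pattern.compile(
        "Possible\\s+too\\s+long\\s+JVM\\s+pause:\\s+(\\d+)\\s+milliseconds");

    /** Returns {number of pauses, total pause millis, longest pause millis}. */
    public static long[] summarize(String log) {
        Matcher m = PAUSE.matcher(log);
        long count = 0;
        long total = 0;
        long max = 0;
        while (m.find()) {
            long ms = Long.parseLong(m.group(1));
            count++;
            total += ms;
            max = Math.max(max, ms);
        }
        return new long[] {count, total, max};
    }
}
```

If the total is a large share of the wall-clock time covered by the log, the
node is spending most of its time paused, which is consistent with the "GC
overhead limit exceeded" error. The pause detector reports any JVM stall, so
GC logs are still needed to confirm that GC is the cause.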

I recommend collecting heap dumps and looking for leaks in your own code or
usage patterns.
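
One way to capture a dump is to start the JVM with
-XX:+HeapDumpOnOutOfMemoryError (optionally with -XX:HeapDumpPath=<dir>), so
that a dump is written at the moment of failure. A dump can also be triggered
from code through the HotSpot diagnostic MXBean. Below is a minimal sketch,
assuming a HotSpot-based JVM. The class name HeapDumper and the file naming
are made up for illustration and are not part of Ignite.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Path;

public class HeapDumper {
    /**
     * Writes a heap dump of the current JVM into the given directory and
     * returns the path of the created .hprof file.
     */
    public static Path dumpHeap(Path dir) throws IOException {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);

        // The target file must not exist yet, so use a timestamped name.
        Path file = dir.resolve("heap-" + System.currentTimeMillis() + ".hprof");

        // live = true dumps only reachable objects, which keeps the file
        // smaller and makes leaked (still referenced) objects easier to spot.
        bean.dumpHeap(file.toString(), true);
        return file;
    }
}
```

The resulting .hprof file can be opened in a heap analyzer such as Eclipse
MAT or VisualVM to see which objects retain the most memory.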

Regards,
-- 
Ilya Kasnacheev


Wed, 19 Feb 2020 at 07:01, wentat <[email protected]>:

> Hi Ilya,
>
> Thank you for your reply. I have done this test a few times and I
> consistently get stalling grids during failover/scaling/server swapping.
>
> I have tried tuning some parameters according to the Ignite production
> prep docs <https://apacheignite.readme.io/docs/preparing-for-production>.
> I have increased the heap size to a maximum of 10 GB, removed logging of
> metrics, and set igcfg.setFailureDetectionTimeout(60000); (one minute,
> since the value is in milliseconds). However, this was done after the two
> tries in this thread.
>
> If the problem persists, I will try to run the test once more and collect
> logs for the whole cluster, including GC logs, but it will take some time
> as I have moved on to other tests. Meanwhile, here is the original log
> from my first experiment. Maybe it will give you a clue.
>
> Once again, thank you for your time on this issue.
>
> crash.log
> <http://apache-ignite-users.70518.x6.nabble.com/file/t2779/crash.log>
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>