Re: Node pause for no obvious reason

2018-06-14 Thread dkarachentsev
Hi, Check the system logs for that time; maybe there was a system freeze. Also add more information to the GC logs, for example safepoint timings: -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime. Thanks! -Dmitry
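Once -XX:+PrintGCApplicationStoppedTime is enabled, the GC log can be scanned for long stops. The following is a minimal sketch, assuming the JDK 8 log line format "Total time for which application threads were stopped: N seconds". The class name, method name, and threshold are hypothetical and chosen only for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StoppedTimeScanner {
    // Assumed JDK 8 line format produced by -XX:+PrintGCApplicationStoppedTime.
    private static final Pattern STOPPED = Pattern.compile(
        "Total time for which application threads were stopped: ([0-9]+\\.[0-9]+) seconds");

    /** Returns every stopped time (in seconds) that is greater than the threshold. */
    public static List<Double> longPauses(List<String> logLines, double thresholdSeconds) {
        List<Double> result = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = STOPPED.matcher(line);
            if (m.find()) {
                double seconds = Double.parseDouble(m.group(1));
                if (seconds > thresholdSeconds) {
                    result.add(seconds);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java StoppedTimeScanner <gc-log-file>");
            return;
        }
        // Hypothetical threshold: report stops longer than one second.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        for (double seconds : longPauses(lines, 1.0)) {
            System.out.println("Application threads stopped for " + seconds + " s");
        }
    }
}
```

A stop that is long while the GC phases themselves are short points away from garbage collection and toward the operating system (swap, I/O stalls, a frozen VM).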

Re: Node pause for no obvious reason

2018-06-08 Thread Ray
I don't have dstat logs, but I think these charts show the same information dstat would. CPU usage: CPU.png. Memory usage: Memory.png. Swap: swap.png.

Re: Node pause for no obvious reason

2018-06-08 Thread Andrey Mashenkov
A checkpoint can block other threads only if another checkpoint needs to start. But it seems the checkpoint is not the cause. It is hard to tell what was going on on the server. Do you have any dstat logs? How many nodes do you start per machine? Do you use Docker or any VM? We have bad experience

Re: Node pause for no obvious reason

2018-06-08 Thread Ray
Hi, Please see the GC log and the picture I attached. Looks like the GC is not taking a very long time. Yes, the checkpoint is taking a long time to finish. Could it be the checkpoint thread has something to do with the node crash? In my understanding, the checkpoint will not block other threads

Re: Node pause for no obvious reason

2018-06-08 Thread Andrey Mashenkov
Possibly there is a race. I've created a ticket for this [1]. [1] https://issues.apache.org/jira/browse/IGNITE-8751 On Fri, Jun 8, 2018 at 4:56 PM, Andrey Mashenkov wrote: > Hi, > > Looks like node was segmented due to long JVM pause. > There are 2 "long JVM pause" messages in log and suspiciou

Re: Node pause for no obvious reason

2018-06-08 Thread Andrey Mashenkov
Hi, Looks like the node was segmented due to a long JVM pause. There are 2 "long JVM pause" messages in the log and a suspiciously long checkpoint: Checkpoint finished [cpId=77cf2fa2-2a9f-48ea-bdeb-dda81b15dac1, pages=2050858, markPos=FileWALPointer [idx=2051, fileOff=38583904, len=15981], walSegmentsCleared=0
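For context, a "long JVM pause" is typically detected by a thread that sleeps for a short fixed interval and checks how late it wakes up. The sketch below illustrates that general technique only; it is not Ignite's actual implementation, and the class name, interval, and threshold are hypothetical.

```java
import java.util.concurrent.TimeUnit;

public class PauseDetectorSketch {
    /** Milliseconds by which a wake-up overshot the requested sleep (never negative). */
    public static long overshootMillis(long beforeNanos, long afterNanos, long sleepMillis) {
        long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(afterNanos - beforeNanos);
        return Math.max(0L, elapsedMillis - sleepMillis);
    }

    public static void main(String[] args) throws InterruptedException {
        final long sleepMillis = 50;      // hypothetical sampling interval
        final long thresholdMillis = 500; // hypothetical reporting threshold
        // Bounded loop so the sketch terminates; a real detector would run as a daemon thread.
        for (int i = 0; i < 20; i++) {
            long before = System.nanoTime();
            Thread.sleep(sleepMillis);
            long pause = overshootMillis(before, System.nanoTime(), sleepMillis);
            if (pause > thresholdMillis) {
                System.out.println("Possible long pause: " + pause + " ms");
            }
        }
    }
}
```

Because the detector thread is itself frozen during a stop-the-world event or an OS-level stall, the overshoot approximates the pause length regardless of its cause.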

Node pause for no obvious reason

2018-06-08 Thread Ray
I set up a six-node Ignite cluster to test performance and stability. Here's my setup.