Hi all, we noticed that this is caused by a backend DB issue affecting the JDBC cache store.
So we have a concern: if the backend DB goes offline for a while, will all those cached objects be stuck in heap memory? Is it possible to swap that data to off-heap and write it back to the DB once the DB is back? We already set onheapCacheEnabled to false, but in reality, every time the DB gets stuck the on-heap memory increases, which eventually triggers a Full GC, and then the entire node fails.

Regards,
Aaron

From: [email protected]
Date: 2017-08-18 11:34
To: user
Subject: Re: Re: In which scenario the ignite nodes will shutdown by itself

Hi Dmitry,

We can finally reproduce it: the node shuts down because of a long GC pause > 20s ([Full GC (Allocation Failure) 10182M->7962M(10G), 27.6709967 secs]). When we dumped the JVM state, Ignite occupied a lot of memory, and all of that data was held in the old generation, which caused the Full GC to fail and left the entire JVM stuck.

Some interesting beans: GridLocalCacheEntry and GridCacheObsoleteEntryExtras, about 22 million instances each; this is quite close to the entire data set in our DB. In total, Ignite accounts for about 4G/8G of the data in the old generation.

The lifecycle of the Ignite cached data (transactions) is usually very short, so there should not be so much data on-heap, plus we configured off-heap memory for it.

The GC log and our configuration are attached. Thanks again for your time, it is very much appreciated.

Regards,
Aaron

From: [email protected]
Date: 2017-08-17 20:25
To: user
Subject: Re: Re: In which scenario the ignite nodes will shutdown by itself

Thanks Dmitry!

We monitored the GC and it is healthy; the entire Java process is still running smoothly now. We adjusted the remote service a bit so that it no longer blocks the socket, and now the node does not seem to crash so easily, but we notice that the instance eats almost all of the memory on the machine. The entire system seems to be on the edge of crashing. We explicitly set the memory policy to 16~27G, but I am not sure why it still occupies so much memory; in fact, this process did not query anything from the cache.
It works just as a buffer in front of the DB: we write to the cache and the JDBC cache store flushes to the DB. We notice that about 30 million entries have been inserted already. Will RANDOM_2_LRU evict the data from the pages? Our concern is that if the node crashes now, we may lose some data.

[root@iZuf62zdiq684kn72aatgjZ 1502945772_19623]# free -mh
              total        used        free      shared  buff/cache   available
Mem:            62G         17G        405M        744K         44G         44G
Swap:            0B          0B          0B

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
19376 root      20   0 46.715g 0.021t 0.010t S 313.3 34.3   1332:40 java

Thanks for your time!

Regards,
Aaron

From: dkarachentsev
Date: 2017-08-16 21:51
To: user
Subject: Re: In which scenario the ignite nodes will shutdown by itself

Hi Aaron,

Do you have long GC pauses? As Bob said, large queries or transactions could move a lot of data on-heap. For example, a regular SELECT will load the whole data set into the heap on the server node and respond to the client in pages. But first of all, try to find the reason for the node segmentation; it could be due to GC or system pauses, network loss, etc.

Thanks!
-Dmitry.

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/In-which-scenario-the-ignite-nodes-will-shutdown-by-itself-tp16192p16227.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
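
For anyone checking their own GC logs against the symptom described in this thread, here is a minimal sketch in plain Java (no Ignite dependency) that parses a Full GC line of the shape quoted above and reports how much heap was reclaimed and whether the pause exceeded a limit. The class name GcPauseCheck, its methods, and the 20-second limit passed in main are illustrative assumptions, not part of Ignite or the JVM; the pattern only handles the exact log format quoted in the thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not an Ignite or JDK API.
public class GcPauseCheck {
    // Matches only lines shaped like the one quoted in the thread:
    // [Full GC (Allocation Failure) 10182M->7962M(10G), 27.6709967 secs]
    private static final Pattern FULL_GC = Pattern.compile(
        "\\[Full GC \\(([^)]*)\\)\\s+(\\d+)M->(\\d+)M\\((\\d+)G\\),\\s+([0-9.]+) secs\\]");

    /** Parsed fields of one Full GC log line. */
    public static final class FullGc {
        public final String cause;
        public final long beforeMb;
        public final long afterMb;
        public final long heapGb;
        public final double pauseSeconds;

        FullGc(String cause, long beforeMb, long afterMb, long heapGb, double pauseSeconds) {
            this.cause = cause;
            this.beforeMb = beforeMb;
            this.afterMb = afterMb;
            this.heapGb = heapGb;
            this.pauseSeconds = pauseSeconds;
        }

        /** Heap freed by this collection, in megabytes. */
        public long reclaimedMb() {
            return beforeMb - afterMb;
        }

        /** True if the pause was longer than the given limit in seconds. */
        public boolean exceeds(double limitSeconds) {
            return pauseSeconds > limitSeconds;
        }
    }

    /** Parses one Full GC line; throws if the line has a different shape. */
    public static FullGc parse(String line) {
        Matcher m = FULL_GC.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("Not a Full GC line: " + line);
        }
        return new FullGc(
            m.group(1),
            Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3)),
            Long.parseLong(m.group(4)),
            Double.parseDouble(m.group(5)));
    }

    public static void main(String[] args) {
        FullGc gc = parse("[Full GC (Allocation Failure) 10182M->7962M(10G), 27.6709967 secs]");
        // 20.0 is the pause length mentioned in the thread, used here as an assumed limit.
        System.out.println("cause=" + gc.cause
            + " reclaimedMb=" + gc.reclaimedMb()
            + " pauseSeconds=" + gc.pauseSeconds
            + " overLimit=" + gc.exceeds(20.0));
    }
}
```

For the line quoted in the thread, the collection frees 10182M - 7962M = 2220M of a 10G heap while pausing for more than 27 seconds, which is consistent with most of the old generation still being reachable after the Full GC.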
