Hello! Unfortunately, I would not expect anyone to debug a 1.8 cluster, since most people have upgraded to 2.x.
Next time this happens, can you capture a heap dump from the problematic node? A dominator graph and a per-class histogram may help tremendously.

Regards,
-- 
Ilya Kasnacheev

Fri, 22 Mar 2019 at 15:10, praveeng <[email protected]>:

> Hi,
>
> Ignite version: 1.8
> One of the Ignite nodes in a 3-node cluster went down due to full usage of RAM.
>
> At that point in time I can observe the following logs on this node:
>
> [00:32:02,119][INFO ][grid-timeout-worker-#7%CasinoApacheIgniteServices%][IgniteKernal%CasinoApacheIgniteServices]
> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
>     ^-- Node [id=9f8df386, name=CasinoApacheIgniteServices, uptime=23:21:45:744]
>     ^-- H/N/C [hosts=8, nodes=8, CPUs=44]
>     ^-- CPU [cur=8.33%, avg=1.6%, GC=0%]
>     ^-- Heap [used=3886MB, free=36.65%, comm=6134MB]
>     ^-- Non heap [used=78MB, free=85.96%, comm=529MB]
>     ^-- Public thread pool [active=0, idle=0, qSize=0]
>     ^-- System thread pool [active=0, idle=16, qSize=0]
>     ^-- Outbound messages queue [size=0]
>
> [00:33:24,674][WARN ][exchange-worker-#23%CasinoApacheIgniteServices%][GridCachePartitionExchangeManager]
> Failed to wait for partition map exchange [topVer=AffinityTopologyVersion [topVer=84, minorTopVer=0], node=9f8df386-2886-451f-b1ff-53713878d432].
> Dumping pending objects that might be the cause:
> [00:33:24,674][WARN ][exchange-worker-#23%CasinoApacheIgniteServices%][GridCachePartitionExchangeManager]
> Failed to wait for partition map exchange [topVer=AffinityTopologyVersion [topVer=84, minorTopVer=0], node=9f8df386-2886-451f-b1ff-53713878d432].
> Dumping pending objects that might be the cause:
>
> SAR stats for memory usage on this date:
>
> -- mar 6
> 12:00:01 AM kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit kbactive kbinact kbdirty
> 12:10:01 PM    170120  16090232    98.95         0  3393384  8222696   45.02  9887268 2088504      60
> 01:50:01 PM    168176  16092176    98.97         0  2120848  8224724   45.03 10804712 1596792      48
> 03:10:01 PM    199128  16061224    98.78         0   991832  8224904   45.04 11384652 1241284     436
> 04:10:01 PM    153060  16107292    99.06         0   229984  8224880   45.04 11255628 1627600     208
> 04:20:01 PM    165580  16094772    98.98         0    78572  8224828   45.03 11338592 1560944      52
> 04:30:01 PM    153508  16106844    99.06         0    29740  8224872   45.03 11436544 1579468      44
> 04:40:01 PM    162184  16098168    99.00         0    33152  8224892   45.04 11606584 1580388      24
> 11:10:01 PM    370956  15889396    97.72         0    74816  8225312   45.04 11927676 1610828      36
> 11:20:01 PM    348576  15911776    97.86         0    69012  8225272   45.04 11929820 1602748      48
> 11:30:01 PM    359132  15901220    97.79         0    27060  8225308   45.04 11912656 1577848      36
> 11:40:01 PM    340252  15920100    97.91         0    24908  8225272   45.04 11910516 1577668      32
> 11:50:01 PM    308340  15952012    98.10         0    39208  8242284   45.13 11914564 1589208      48
> Average:       253568  16006784    98.44         0  2317289  8226063   45.04 10368276 1955525     142
>
> Please find the attached file for the cache configuration.
>
> ignite-clb-cache-config_dev.xml
> <http://apache-ignite-users.70518.x6.nabble.com/file/t1753/ignite-clb-cache-config_dev.xml>
>
> Please find the memory snapshot captured by the AppDynamics tool in the attachment.
> memorySnapshot.JPG
> <http://apache-ignite-users.70518.x6.nabble.com/file/t1753/memorySnapshot.JPG>
>
> Following is my analysis.
> When data is evicted from on-heap to off-heap, there is not much space left in off-heap.
> Because of that, off-heap memory fills up and the application becomes slow and unresponsive.
>
> Even the data in off-heap is not expired; because of that, there is not much free memory in RAM.
> After I restarted the application on this node, RAM usage dropped to 25%, and it is now at 45%.
>
> Can you please check and suggest?
>
> Thanks,
> Praveen
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
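
P.S. In case it helps with the heap dump requested above, here is a minimal sketch of taking one from inside the JVM through the standard `HotSpotDiagnosticMXBean`. It assumes a HotSpot-based JVM (Java 7 or later) and that the code runs inside the Ignite node's own process. The class name `HeapDumper` and the default file name `ignite-node-heap.hprof` are made up for illustration and are not part of Ignite.

```java
import com.sun.management.HotSpotDiagnosticMXBean;

import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: writes an .hprof heap dump of the JVM it runs in. */
public class HeapDumper {

    /**
     * Dumps the heap of the current JVM to {@code target}.
     *
     * @param target   output file; should end with ".hprof"
     * @param liveOnly if true, only reachable objects are dumped (forces a GC first)
     */
    public static Path dump(Path target, boolean liveOnly) throws IOException {
        // dumpHeap fails if the file already exists, so remove any stale dump first.
        Files.deleteIfExists(target);

        // HotSpot-specific MXBean; assumed to be available on the node's JVM.
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);

        bean.dumpHeap(target.toString(), liveOnly);
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path out = Paths.get(args.length > 0 ? args[0] : "ignite-node-heap.hprof");
        dump(out, true);
        System.out.println("Heap dump written to " + out.toAbsolutePath());
    }
}
```

If changing the application is not an option, the JDK's `jmap` tool can take the same kind of dump from outside the process (`jmap -dump:live,format=b,file=heap.hprof <pid>`), and `jmap -histo <pid>` prints a per-class histogram. The resulting `.hprof` file can then be opened in a heap analyzer to view the dominator tree.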
