That heap is way too big. Solr does NOT pull the entire index into the JVM heap. You need RAM outside the heap so the OS can keep the index in its file buffers.
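
To put rough numbers on that, here is a small Python sketch of the memory budget. The RAM and index sizes come from the original post; the function name is made up for illustration, and the arithmetic assumes each node holds the full 300 GB index and ignores JVM non-heap memory and any other processes on the box.

```python
# Rough memory budget for one Solr node (illustrative arithmetic only).
RAM_GB = 300     # physical RAM per node, from the original post
INDEX_GB = 300   # on-disk index size, assumed to be per node


def page_cache_budget(ram_gb, heap_gb, index_gb):
    """Return (GB of RAM left outside the heap, fraction of the index it could cache).

    Hypothetical helper: it ignores JVM non-heap memory and other processes.
    """
    free_gb = max(ram_gb - heap_gb, 0)
    fraction = min(free_gb / index_gb, 1.0)
    return free_gb, fraction


# Compare the posted max heap with the two heap sizes suggested in this reply.
for heap_gb in (300, 31, 16):
    free_gb, fraction = page_cache_budget(RAM_GB, heap_gb, INDEX_GB)
    print(f"heap {heap_gb:>3} GB -> {free_gb:>3} GB left for OS file buffers "
          f"(~{fraction:.0%} of the index)")
```

With the heap allowed to grow to all of RAM, nothing is left for file buffers, so every index read goes to disk. With a 16 GB heap, most of the index can stay cached by the OS.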

Try again with:

-Xms16G -Xmx16G

If you run out of that, try 31G.

Starting and maximum heap size should always be the same for a server process. The JVM will increase it to max before doing a full GC.

Also, there is no need for 30 GC threads. Leave that out and use the default.

Finally, the list strips images, so nobody could see the image. Upload it and link it, please.

wunder
Walter Underwood
[email protected]
http://observer.wunderwood.org/  (my blog)

> On May 9, 2021, at 10:54 PM, Vignan Malyala <[email protected]> wrote:
>
> Hi everyone,
>
> We have 3 cluster solr running in 3 different machines with an index size of
> 300 GB.
> RAM: 300 GB per node
> Heap - Xms: 240GB Xmx: 300GB
> Index size: 300GB
>
> GC_TUNE="-XX:+UseG1GC
> -XX:InitiatingHeapOccupancyPercent=45
> -XX:ConcGCThreads=6
> -XX:ParallelGCThreads=30
> -XX:G1ReservePercent=20
>
> <autoCommit>
> <maxTime>${solr.autoCommit.maxTime:400000}</maxTime>
> <openSearcher>false</openSearcher>
> </autoCommit>
>
> <autoSoftCommit>
> <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
> </autoSoftCommit>
>
> Our cloud servers suddenly stopped yesterday. When we try to restart, our JVM
> heap size goes to max of 300 GB just in few seconds and we get the following
> message before stopping automatically.
>
> Heap before GC invocations=0 (full 0):
> garbage-first heap total 251658240K, used 360448K [0x00007eba80000000,
> 0x00007eba8200f000, 0x00007f0580000000)
> region size 32768K, 12 young (393216K), 0 survivors (0K)
> Metaspace used 20504K, capacity 21158K, committed 21248K, reserved
> 22528K
> 2021-05-10T05:31:59.511+0000: 3.036: [GC pause (Metadata GC Threshold)
> (young) (initial-mark)
> Desired survivor size 805306368 bytes, new threshold 15 (max 15)
>
> {Heap before GC invocations=11 (full 0):
> garbage-first heap total 288849920K, used 20398080K [0x00007eba80000000,
> 0x00007eba82011378, 0x00007f0580000000)
> region size 32768K, 440 young (14417920K), 54 survivors (1769472K)
> Metaspace used 58413K, capacity 61495K, committed 61696K, reserved
> 63488K
> 2021-05-10T05:33:15.477+0000: 79.002: [GC pause (G1 Evacuation Pause) (young)
> Desired survivor size 922746880 bytes, new threshold 1 (max 15)
> - age 1: 1043976736 bytes, 1043976736 total
> - age 2: 766998080 bytes, 1810974816 total
> , 0.4319767 secs]
> [Parallel Time: 408.3 ms, GC Workers: 30]
> [GC Worker Start (ms): Min: 79002.5, Avg: 79003.0, Max: 79003.6, Diff:
> 1.2]
> [Ext Root Scanning (ms): Min: 0.1, Avg: 0.8, Max: 2.7, Diff: 2.6, Sum:
> 23.7]
> [Update RS (ms): Min: 0.0, Avg: 1.7, Max: 3.1, Diff: 3.1, Sum: 51.7]
> [Processed Buffers: Min: 0, Avg: 3.8, Max: 17, Diff: 17, Sum: 113]
> [Scan RS (ms): Min: 13.9, Avg: 15.8, Max: 16.7, Diff: 2.8, Sum: 474.0]
> [Code Root Scanning (ms): Min: 0.0, Avg: 0.1, Max: 2.1, Diff: 2.1, Sum:
> 4.3]
> [Object Copy (ms): Min: 385.5, Avg: 387.5, Max: 390.6, Diff: 5.1, Sum:
> 11624.2]
> [Termination (ms): Min: 0.1, Avg: 0.5, Max: 0.9, Diff: 0.9, Sum: 13.8]
> [Termination Attempts: Min: 1, Avg: 82.1, Max: 172, Diff: 171, Sum:
> 2464]
> [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.4, Diff: 0.4, Sum: 3.6]
> [GC Worker Total (ms): Min: 405.9, Avg: 406.5, Max: 407.3, Diff: 1.4,
> Sum: 12195.3]
> [GC Worker End (ms): Min: 79409.4, Avg: 79409.5, Max: 79409.8, Diff: 0.4]
> [Code Root Fixup: 0.1 ms]
> [Code Root Purge: 0.0 ms]
> [Clear CT: 6.7 ms]
> [Other: 16.9 ms]
> [Choose CSet: 0.0 ms]
> [Ref Proc: 5.2 ms]
> [Ref Enq: 0.0 ms]
> [Redirty Cards: 9.2 ms]
> [Humongous Register: 0.3 ms]
> [Humongous Reclaim: 0.0 ms]
> [Free CSet: 0.7 ms]
>
> Please help to solve this issue!
> Thanks in advance!
> Regards!
> Vigz
