[jira] [Updated] (SOLR-15400) JVM Heap becomes full and Solr stops when we try to restart

Sai Vignan (Jira) Sun, 09 May 2021 23:08:04 -0700


     [ https://issues.apache.org/jira/browse/SOLR-15400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Sai Vignan updated SOLR-15400:
------------------------------
    Description: 
Hi everyone,

We have a 3-node Solr cluster running on 3 different machines, with an index 
size of 300 GB.
 RAM: 300 GB per node
 Heap - Xms: 240GB Xmx: 300GB
 Index size: 300GB
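
As a rough check of the sizing above, the following sketch computes how much physical memory is left outside the Java heap at the configured Xms and Xmx. It assumes the figures are exactly as reported and ignores JVM overhead beyond the heap (metaspace, thread stacks, direct buffers). The variable and function names are hypothetical and used only for illustration.

```python
# Figures as reported in this issue, all in GB.
RAM_GB = 300
XMS_GB = 240
XMX_GB = 300


def headroom_gb(ram_gb: int, heap_gb: int) -> int:
    """Physical memory left outside the Java heap (never negative)."""
    return max(ram_gb - heap_gb, 0)


if __name__ == "__main__":
    print("outside heap at Xms:", headroom_gb(RAM_GB, XMS_GB), "GB")  # 60 GB
    print("outside heap at Xmx:", headroom_gb(RAM_GB, XMX_GB), "GB")  # 0 GB
```

With Xmx equal to the node's RAM, nothing is left outside a fully grown heap for the OS file system cache, which Lucene's memory-mapped index files normally rely on.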
  
 GC_TUNE="-XX:+UseG1GC
 -XX:InitiatingHeapOccupancyPercent=45
 -XX:ConcGCThreads=6
 -XX:ParallelGCThreads=30
 -XX:G1ReservePercent=20"
  
 <autoCommit>
   <maxTime>${solr.autoCommit.maxTime:400000}</maxTime>
   <openSearcher>false</openSearcher>
 </autoCommit>

 <autoSoftCommit>
   <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
 </autoSoftCommit>
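
The ${name:default} placeholders in the commit settings above take the value of the named system property when it is set and fall back to the value after the colon otherwise. Below is a minimal sketch of that substitution rule, assuming no property overrides are in place on these nodes. The resolve helper is hypothetical and is not Solr's own implementation.

```python
import re
from typing import Dict, Optional

# Matches ${name} or ${name:default}.
PLACEHOLDER = re.compile(r"\$\{([^:}]+)(?::([^}]*))?\}")


def resolve(text: str, props: Optional[Dict[str, str]] = None) -> str:
    """Replace each placeholder with the property value, else its default."""
    props = props or {}

    def repl(match: "re.Match[str]") -> str:
        name, default = match.group(1), match.group(2)
        if name in props:
            return props[name]
        if default is None:
            raise KeyError(f"no value or default for {name}")
        return default

    return PLACEHOLDER.sub(repl, text)


if __name__ == "__main__":
    hard = resolve("${solr.autoCommit.maxTime:400000}")
    soft = resolve("${solr.autoSoftCommit.maxTime:-1}")
    print("autoCommit maxTime:", hard, "ms =", int(hard) / 1000, "s")
    print("autoSoftCommit maxTime:", soft)
```

Under that assumption the hard commit interval is 400000 ms (400 seconds) and the soft commit timer resolves to -1, which Solr treats as disabled.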
  
 
  
 *Our cloud servers suddenly stopped yesterday. When we try to restart them, 
the JVM heap grows to its 300 GB maximum within a few seconds, and we get the 
following GC log output before Solr stops automatically.*
  
  
  
 Heap before GC invocations=0 (full 0):
 garbage-first heap   total 251658240K, used 360448K [0x00007eba80000000, 0x00007eba8200f000, 0x00007f0580000000)
  region size 32768K, 12 young (393216K), 0 survivors (0K)
 Metaspace       used 20504K, capacity 21158K, committed 21248K, reserved 22528K
 2021-05-10T05:31:59.511+0000: 3.036: [GC pause (Metadata GC Threshold) (young) (initial-mark)
 Desired survivor size 805306368 bytes, new threshold 15 (max 15)

{Heap before GC invocations=11 (full 0):
 garbage-first heap   total 288849920K, used 20398080K [0x00007eba80000000, 0x00007eba82011378, 0x00007f0580000000)
  region size 32768K, 440 young (14417920K), 54 survivors (1769472K)
 Metaspace       used 58413K, capacity 61495K, committed 61696K, reserved 63488K
 2021-05-10T05:33:15.477+0000: 79.002: [GC pause (G1 Evacuation Pause) (young)
 Desired survivor size 922746880 bytes, new threshold 1 (max 15)
 - age   1: 1043976736 bytes, 1043976736 total
 - age   2:  766998080 bytes, 1810974816 total
 , 0.4319767 secs]
   [Parallel Time: 408.3 ms, GC Workers: 30]
      [GC Worker Start (ms): Min: 79002.5, Avg: 79003.0, Max: 79003.6, Diff: 1.2]
      [Ext Root Scanning (ms): Min: 0.1, Avg: 0.8, Max: 2.7, Diff: 2.6, Sum: 23.7]
      [Update RS (ms): Min: 0.0, Avg: 1.7, Max: 3.1, Diff: 3.1, Sum: 51.7]
         [Processed Buffers: Min: 0, Avg: 3.8, Max: 17, Diff: 17, Sum: 113]
      [Scan RS (ms): Min: 13.9, Avg: 15.8, Max: 16.7, Diff: 2.8, Sum: 474.0]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.1, Max: 2.1, Diff: 2.1, Sum: 4.3]
      [Object Copy (ms): Min: 385.5, Avg: 387.5, Max: 390.6, Diff: 5.1, Sum: 11624.2]
      [Termination (ms): Min: 0.1, Avg: 0.5, Max: 0.9, Diff: 0.9, Sum: 13.8]
         [Termination Attempts: Min: 1, Avg: 82.1, Max: 172, Diff: 171, Sum: 2464]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.4, Diff: 0.4, Sum: 3.6]
      [GC Worker Total (ms): Min: 405.9, Avg: 406.5, Max: 407.3, Diff: 1.4, Sum: 12195.3]
      [GC Worker End (ms): Min: 79409.4, Avg: 79409.5, Max: 79409.8, Diff: 0.4]
   [Code Root Fixup: 0.1 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 6.7 ms]
   [Other: 16.9 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 5.2 ms]
      [Ref Enq: 0.0 ms]
      [Redirty Cards: 9.2 ms]
      [Humongous Register: 0.3 ms]
      [Humongous Reclaim: 0.0 ms]
      [Free CSet: 0.7 ms]
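
To make the heap figures in the log above easier to read, here is a small sketch that extracts the total and used values from a G1 heap summary line and converts them from KiB to GiB. It assumes the K suffix means 1024 bytes and that the summary line has the shape seen in this log. The helper name and regular expression are hypothetical.

```python
import re
from typing import Tuple

# Matches the G1 summary line printed with each "Heap before GC" block.
HEAP_LINE = re.compile(r"garbage-first heap\s+total (\d+)K, used (\d+)K")

KIB_PER_GIB = 1024 ** 2


def heap_usage_gib(line: str) -> Tuple[float, float]:
    """Return (total, used) heap size in GiB parsed from one summary line."""
    match = HEAP_LINE.search(line)
    if match is None:
        raise ValueError("not a G1 heap summary line")
    total_k, used_k = (int(group) for group in match.groups())
    return total_k / KIB_PER_GIB, used_k / KIB_PER_GIB


if __name__ == "__main__":
    samples = [
        " garbage-first heap   total 251658240K, used 360448K [0x00007eba80000000,",
        " garbage-first heap   total 288849920K, used 20398080K [0x00007eba80000000,",
    ]
    for sample in samples:
        total, used = heap_usage_gib(sample)
        print(f"total {total:.2f} GiB, used {used:.2f} GiB")
```

For the two snapshots in this log, that gives a total of 240.00 GiB with 0.34 GiB used at invocation 0, and a total of 275.47 GiB with 19.45 GiB used at invocation 11.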
  
  
 Please help us resolve this issue.
 Thanks in advance!
 Regards,
 Vigz



> JVM Heap becomes full and Solr stops when we try to restart
> -----------------------------------------------------------
>
>                 Key: SOLR-15400
>                 URL: https://issues.apache.org/jira/browse/SOLR-15400
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: 8.5.1
>            Reporter: Sai Vignan
>            Priority: Critical



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
