Hi Thomas,

Sorry for the late reply, and thanks for your helpful explanation. I have added some comments and questions inline.

Thanks,
Lijie

On Sat, Nov 18, 2017 at 11:37 PM, Thomas Schatzl <thomas.scha...@oracle.com> wrote:

> Hi,
>
> On Sat, 2017-11-18 at 20:53 +0800, Lijie Xu wrote:
> > Hi All,
> >
> > I recently encountered an OOM error in a Spark application using the
> > G1 collector. This application launches multiple JVM instances to
> > process large data. Each JVM has a 6.5GB heap and uses the G1
> > collector. One JVM instance throws an OOM error while allocating a
> > large (570MB) array. However, this JVM has about 3GB of free heap
> > space at that time. After analyzing the application logic, heap
> > usage, and GC log, I guess the root cause may be the lack of
> > consecutive space for holding this large array in G1. I want to know
> > whether my guess is right ...
>
> Very likely. This is a long-standing issue (actually I once
> investigated it about 10 years ago on a different regional
> collector), and given your findings it is very likely you are correct.
> The issue also has an extra section in the tuning guide [0].
>

*==> This reference is very helpful. Another question: do the Parallel and CMS collectors have this defect too?*

> > ... and why G1 has this defect.
>
> Nobody fixed it yet. :)
>
> Reasons:
> - the workaround is easy and typically "just works".
> - there are no "real world" test setups available where fixes could
>   be tested. People tend to disappear after getting to know the
>   workaround. Unfortunately, Apache Spark is probably one of the more
>   frequent environments where it happens, but it still does not work
>   on JDK 9/10 (and soon 11), where development happens.
> - it's not very interesting work for many. Not sure why, probably
>   because it involves implementing and evaluating longer-term
>   strategies in the collector to minimize the impact of
>   fragmentation, which is a complex topic (at least if you are not
>   satisfied with the last-ditch brute force approach).
> - there are more problematic issues to deal with that affect more
>   installations, have test setups, and have no workaround or no good
>   one.
>
> Actually I was discussing this with colleagues again just last week
> in the context of work for students/interns. :)
>
> If you want to look into this, there are a bunch of CRs open that you
> might want to start with (e.g. [1][2][3]) to get an idea of the
> possibilities. These CRs do not even mention the one brute force
> solution other VMs probably apply in that situation: have the full GC
> move large arrays too.
>
> Feel free to start a discussion about this topic either here or
> preferably on the hotspot-gc-dev mailing list.
>
> > In the following sections, I will detail the JVM info, application,
> > OOM phase, and heap usage. Any suggestions will be appreciated.
>
> Simply either increase the heap size or increase the region size via
> -XX:G1HeapRegionSize. I think 16m regions will fix the issue in your
> case without any other performance impact, and reduce the number of
> humongous objects significantly.
>

*==> Your guess is quite right. I changed the region size to 8m, 16m, and 32m. The application still throws an OOM error with 8m, but finishes successfully with 16m and 32m.*

> > [JVM info]
> > java version "1.8.0_121"
> > Oracle Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
> > Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
>
> While it won't impact this issue, I recommend updating at least to the
> latest 8u release. I am not suggesting JDK 9 here because we know that
> Spark does not work there yet.
>
> Thanks,
>   Thomas
>
> [0] https://docs.oracle.com/javase/9/gctuning/garbage-first-garbage-collector-tuning.htm#GUID-2428DA90-B93D-48E6-B336-A849ADF1C552
> [1] https://bugs.openjdk.java.net/browse/JDK-8172713
> [2] https://bugs.openjdk.java.net/browse/JDK-8038487
> [3] https://bugs.openjdk.java.net/browse/JDK-8173627

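To make the region-size observation concrete, here is a rough sketch of the arithmetic behind it. It assumes the tuning guide's description that an object of about half a region or larger is treated as humongous and must be placed in a run of contiguous free regions. The class and method names are hypothetical, the array header is ignored, and this is not how HotSpot itself implements the check.

```java
// Hypothetical helper, for illustration only: estimates how many contiguous
// G1 regions a large array needs for a given region size.
public class HumongousRegions {
    static final long MB = 1024L * 1024L;

    // Assumption taken from the tuning guide's description: objects of
    // half a region or larger are handled as humongous objects.
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes >= regionBytes / 2;
    }

    // A humongous object occupies whole regions that must be adjacent,
    // so round the size up to the next multiple of the region size.
    static long contiguousRegionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }

    public static void main(String[] args) {
        long arrayBytes = 570 * MB;           // the array size from this thread
        long[] regionSizesMb = {8, 16, 32};   // the region sizes tried above

        for (long regionMb : regionSizesMb) {
            long regionBytes = regionMb * MB;
            System.out.printf(
                "region=%dm humongous=%b contiguous regions needed=%d%n",
                regionMb,
                isHumongous(arrayBytes, regionBytes),
                contiguousRegionsNeeded(arrayBytes, regionBytes));
        }
    }
}
```

For the 570MB array this gives 72, 36, and 18 contiguous regions for 8m, 16m, and 32m regions respectively. The array is humongous in every case, but a larger region size means fewer adjacent free regions have to be found, which fits the runs above that succeeded with 16m and 32m. In a Spark job the flag would typically be passed to the executors through spark.executor.extraJavaOptions, for example -XX:+UseG1GC -XX:G1HeapRegionSize=16m.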
_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use