On Tue, May 12, 2015 at 7:41 PM, David chen <c77...@163.com> wrote:

> A RegionServer was killed because OutOfMemory(OOM), although the process
> killed can be seen in the Linux message log, but i still have two following
> problems:
> 1. How to inspect the root reason to cause OOM?
>

Start the regionserver with -XX:+HeapDumpOnOutOfMemoryError, and use -XX:HeapDumpPath to specify a location for the heap to be dumped to on OOME (see http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html). Remove -XX:OnOutOfMemoryError because it will now conflict with HeapDumpOnOutOfMemoryError. Then open the heap dump in Java Mission Control, JProfiler, etc., to see how the retained objects are associated.

> 2 When RegionServer encounters OOM, why can't it free some memories
> occupied? if so, whether or not killer will not need.
>

We require a certain amount of memory to process a particular workload. If the allocation is insufficient, we OOME. Once an application has OOME'd, its state is indeterminate. We opt to kill the process rather than hang around in a damaged state.

Enable GC logging to figure out why in particular you OOME'd (there are different categories of OOME [1]). Even with a sufficient memory allocation, an incorrectly tuned GC or a badly specified set of heap args may bring on OOME.

St.Ack

1. http://www.javacodegeeks.com/2013/08/understanding-the-outofmemoryerror.html

> Any ideas can be appreciated!
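
If you want to double-check that the heap-dump options mentioned above actually took effect, one way is to ask the JVM itself. The following is only a minimal sketch, assuming a HotSpot-based JVM (the com.sun.management classes are HotSpot-specific); the class name HeapDumpFlagCheck and the helper describe() are made up for illustration and are not part of HBase.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

// Hypothetical helper: prints the heap-dump related options of the JVM
// it is running in. Assumes a HotSpot JVM.
class HeapDumpFlagCheck {

    // Returns "name=value (origin)" for the named HotSpot VM option.
    // getVMOption throws IllegalArgumentException for an unknown option name.
    static String describe(String name) {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        VMOption opt = bean.getVMOption(name);
        return opt.getName() + "=" + opt.getValue() + " (" + opt.getOrigin() + ")";
    }

    public static void main(String[] args) {
        // Whether a heap dump is written on OOME, and where it would go.
        System.out.println(describe("HeapDumpOnOutOfMemoryError"));
        System.out.println(describe("HeapDumpPath"));
    }
}
```

This reports the options of the JVM that runs it, so start it with the same options you pass to the regionserver. To inspect the live regionserver process instead, the JDK's jinfo tool can print a single flag, e.g. jinfo -flag HeapDumpOnOutOfMemoryError <pid>.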