For others who run into a similar issue: it turned out that the OutOfMemoryError was thrown (and subsequently hidden) on the client side. The error was caused by excessive direct memory usage in Java NIO's ByteBuffer caching (described here: http://www.evanjones.ca/java-bytebuffer-leak.html), and setting -Djdk.nio.maxCachedBufferSize=262144 allowed the application to complete.
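In case it helps anyone check whether they are hitting the same thing, here is a minimal sketch of how direct memory usage can be watched from inside the client JVM. It relies on the standard BufferPoolMXBean and on the behaviour described in the linked article (NIO copies a heap ByteBuffer into a temporary direct buffer before writing, and may cache that buffer per thread). The class name DirectBufferProbe, the temp file, and the 10 MB size (chosen to mirror the largest Put in this thread) are my own assumptions for illustration; this is not the HBase client code path itself.

```java
import java.io.IOException;
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of HBase or the JDK.
class DirectBufferProbe {

    /** Bytes currently held by the JVM's "direct" buffer pool, or -1 if no such pool is reported. */
    static long directMemoryUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    /** Writes one heap buffer of the given size through a FileChannel; returns the bytes written. */
    static int writeHeapBuffer(int size) throws IOException {
        Path tmp = Files.createTempFile("probe", ".bin");
        try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
            // Heap buffer: the channel implementation copies it into a temporary direct buffer.
            ByteBuffer heap = ByteBuffer.allocate(size);
            int written = 0;
            while (heap.hasRemaining()) {
                written += ch.write(heap);
            }
            return written;
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        long before = directMemoryUsed();
        int written = writeHeapBuffer(10 * 1024 * 1024); // assumed size, mirrors a 10MB Put
        long after = directMemoryUsed();
        System.out.println("wrote " + written + " bytes; direct pool before=" + before + " after=" + after);
    }
}
```

Running it once without any flags and once with -Djdk.nio.maxCachedBufferSize=262144 should show whether the temporary buffer stays cached after the write; the exact numbers will depend on the JDK version, so treat the output as a hint only.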
Yet another proof that correct handling of OOME is hard.
Thanks,
Daniel

2017-10-11 11:33 GMT+02:00 Daniel Jeliński <[email protected]>:

> Thanks for the hints. I'll see if we can explicitly set
> MaxDirectMemorySize to a safe number.
> Thanks,
> Daniel
>
> 2017-10-10 21:10 GMT+02:00 Esteban Gutierrez <[email protected]>:
>
>> http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/tip/src/share/classes/sun/misc/VM.java#l184
>>
>> // The initial value of this field is arbitrary; during JRE initialization
>> // it will be reset to the value specified on the command line, if any,
>> // otherwise to Runtime.getRuntime().maxMemory().
>>
>> which goes all the way down to memory/heap.cpp to whatever was left to the
>> reserved memory depending on the flags and the platform used as Vladimir
>> says.
>>
>> Also, depending on which distribution and features are used there are
>> specific guidelines about setting that parameter so mileage might vary.
>>
>> thanks,
>> esteban.
>>
>> --
>> Cloudera, Inc.
>>
>> On Tue, Oct 10, 2017 at 1:35 PM, Vladimir Rodionov <[email protected]>
>> wrote:
>>
>> > >> The default value is zero, which means the maximum direct memory is unbounded.
>> >
>> > That is not correct. If you do not specify MaxDirectMemorySize, default is
>> > platform specific
>> >
>> > The link above is for JRockit JVM I presume?
>> >
>> > On Tue, Oct 10, 2017 at 11:19 AM, Esteban Gutierrez <[email protected]>
>> > wrote:
>> >
>> > > I don't think is truly unbounded, IIRC it s limited to the maximum
>> > > allocated heap.
>> > >
>> > > thanks,
>> > > esteban.
>> > >
>> > > --
>> > > Cloudera, Inc.
>> > >
>> > > On Tue, Oct 10, 2017 at 1:11 PM, Ted Yu <[email protected]> wrote:
>> > >
>> > > > From https://docs.oracle.com/cd/E15289_01/doc.40/e15062/optionxx.htm :
>> > > >
>> > > > java -XX:MaxDirectMemorySize=2g myApp
>> > > >
>> > > > Default Value
>> > > >
>> > > > The default value is zero, which means the maximum direct memory is
>> > > > unbounded.
>> > > >
>> > > > On Tue, Oct 10, 2017 at 11:04 AM, Vladimir Rodionov <[email protected]>
>> > > > wrote:
>> > > >
>> > > > > >> XXMaxDirectMemorySize is set to the default 0, which means unlimited as far
>> > > > > >> as I can tell.
>> > > > >
>> > > > > Not sure if this is true. The only conforming that link I found was for
>> > > > > JRockit JVM.
>> > > > >
>> > > > > On Mon, Oct 9, 2017 at 11:29 PM, Daniel Jeliński <[email protected]>
>> > > > > wrote:
>> > > > >
>> > > > > > Vladimir,
>> > > > > > XXMaxDirectMemorySize is set to the default 0, which means unlimited as far
>> > > > > > as I can tell.
>> > > > > > Thanks,
>> > > > > > Daniel
>> > > > > >
>> > > > > > 2017-10-09 19:30 GMT+02:00 Vladimir Rodionov <[email protected]>:
>> > > > > >
>> > > > > > > Have you try to increase direct memory size for server process?
>> > > > > > > -XXMaxDirectMemorySize=?
>> > > > > > >
>> > > > > > > On Mon, Oct 9, 2017 at 2:12 AM, Daniel Jeliński <[email protected]>
>> > > > > > > wrote:
>> > > > > > >
>> > > > > > > > Hello,
>> > > > > > > > I'm running an application doing a lot of Puts (size anywhere between 0 and
>> > > > > > > > 10MB, one cell at a time); occasionally I'm getting an error like the
>> > > > > > > > below:
>> > > > > > > > 2017-10-09 04:29:29,811 WARN [AsyncProcess] - #13368,
>> > > > > > > > table=researchplatform:repo_stripe, attempt=1/1 failed=1ops, last
>> > > > > > > > exception: java.io.IOException: com.google.protobuf.ServiceException:
>> > > > > > > > java.lang.OutOfMemoryError: Direct buffer memory on
>> > > > > > > > c169dzv.int.westgroup.com,60020,1506476748534, tracking started Mon Oct 09
>> > > > > > > > 04:29:29 EDT 2017; not retrying 1 - final failure
>> > > > > > > >
>> > > > > > > > After that the connection to RegionServer becomes unusable. Every
>> > > > > > > > subsequent attempt to execute Put on that connection results in
>> > > > > > > > CallTimeoutException. I only found the OutOfMemory by reducing the number
>> > > > > > > > of tries to 1.
>> > > > > > > >
>> > > > > > > > The host running HBase appears to have at least a few GB of free memory
>> > > > > > > > available. Server logs do not mention anything about this error. Cluster is
>> > > > > > > > running HBase 1.2.0-cdh5.10.2.
>> > > > > > > >
>> > > > > > > > Is this a known problem? Are there workarounds available?
>> > > > > > > > Thanks,
>> > > > > > > > Daniel
