TDB Java heap space and OutOfMemory errors

Rob Vesse Thu, 05 Jun 2014 03:45:07 -0700

Osma

Comments inline:


On 05/06/2014 10:11, "Osma Suominen" <[email protected]> wrote:

>Hi all!
>
>On 30/05/14 16:36, Mark Feblowitz wrote:
>
>> After some amount of time I see a series of messages after update posts
>>
>>      WARN  [xxxxxx] RC = 500 : Java heap space
>>
>> And I’m seeing "java.lang.OutOfMemoryError: Java heap space" errors.
>
>I also got this error yesterday on an important machine.
>
>My setup is this: Fuseki 1.0.1 with a single TDB (no inference) that has
>grown to 13GB and an additional jena-text index of 200MB. Fuseki is
>given approx. 6GB of heap (-Xmx6000M) on a machine with 16GB RAM. The
>machine is a virtual machine running 64bit CentOS 6, kernel
>2.6.32-431.11.2.el6.x86_64, java -version gives this:
>java version "1.7.0_51"
>OpenJDK Runtime Environment (rhel-2.4.4.1.el6_5-x86_64 u51-b02)
>OpenJDK 64-Bit Server VM (build 24.45-b08, mixed mode)
>
>
>At the time this happened there were no updates (we usually only update
>during the night), just read-only SELECT and CONSTRUCT queries coming in
>at around 8 queries per second on average.

As has been explained in this thread, the problem is that the continuous
query load prevents the in-memory transaction journal from being fully
flushed to disk.  If the read-only queries continue through the night
while you do updates, they will block the journal flush and you will
eventually hit this case.

JENA-703 (https://issues.apache.org/jira/browse/JENA-703) describes the
proposed fix for this issue.  The side effect of that fix (as and when
it gets implemented) will be that, on a system under continuous load, reads
will occasionally be blocked and so some queries may experience
delays.
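
To make the blocking behaviour concrete, here is a toy model of it. This is
not Jena or TDB code: the class, its methods and the "flush only when no
reader is active" rule are simplifying assumptions used to illustrate why
overlapping read queries let committed updates pile up in memory.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model only: NOT Jena/TDB code. All names here are hypothetical. */
final class JournalBacklogSketch {
    private int activeReaders = 0;
    private final Deque<String> pendingCommits = new ArrayDeque<String>();
    private int flushedCommits = 0;

    void beginRead() {
        activeReaders++;
    }

    void endRead() {
        activeReaders--;
        flushIfQuiet();
    }

    /** A committed update is queued in memory until it can be flushed. */
    void commitWrite(String label) {
        pendingCommits.addLast(label);
        flushIfQuiet();
    }

    /** Assumption of the model: flushing needs a moment with no readers. */
    private void flushIfQuiet() {
        if (activeReaders == 0) {
            flushedCommits += pendingCommits.size();
            pendingCommits.clear();
        }
    }

    int pending() {
        return pendingCommits.size();
    }

    int flushed() {
        return flushedCommits;
    }

    /** Simulates updates arriving while read queries always overlap. */
    static int backlogUnderOverlappingReads(int commits) {
        JournalBacklogSketch journal = new JournalBacklogSketch();
        journal.beginRead();                  // a query is already running
        for (int i = 0; i < commits; i++) {
            journal.beginRead();              // the next query arrives
            journal.commitWrite("update-" + i);
            journal.endRead();                // an older query finishes
        }
        return journal.pending();             // nothing was ever flushed
    }
}
```

In this model backlogUnderOverlappingReads(5) returns 5, because the reader
count never drops to zero, whereas a commit made with no reader active is
flushed immediately.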

>
>Suddenly queries stop working and CPU usage rises to around 350% (the
>machine has 4 cores). Errors like this appear in the Fuseki log:
>
>2014-06-04 11:42:52,293 WARN Fuseki               :: [14739636] RC = 500
>: Java heap space
>java.lang.OutOfMemoryError: Java heap space
>2014-06-04 11:39:06,670 WARN Fuseki               :: [14739587] RC = 500
>: GC overhead limit exceeded
>java.lang.OutOfMemoryError: GC overhead limit exceeded
>2014-06-04 11:43:54,817 INFO Fuseki               :: [14739587] 500 GC
>overhead limit exceeded (1,032.111 s)
>2014-06-04 12:24:56,868 WARN Fuseki               :: [14739738] RC = 500
>: GC overhead limit exceeded
>java.lang.OutOfMemoryError: GC overhead limit exceeded
>2014-06-04 12:24:04,660 WARN Fuseki               :: [14739862] RC = 500
>: Java heap space
>java.lang.OutOfMemoryError: Java heap space
>2014-06-04 12:43:44,167 INFO Fuseki               :: [14739738] 500 GC
>overhead limit exceeded (2,227.717 s)
>2014-06-04 12:43:44,167 INFO Fuseki               :: [14739862] 500 Java
>heap space (2,227.719 s)
>2014-06-04 14:33:30,906 WARN Fuseki               :: [14740021] RC = 500
>: GC overhead limit exceeded
>java.lang.OutOfMemoryError: GC overhead limit exceeded
>2014-06-04 14:34:21,722 INFO Fuseki               :: [14740021] 500 GC
>overhead limit exceeded (4,850.886 s)
>
>
>The timestamps in the log entries are not always in order, as you can
>see above. Sometimes it takes more than an hour for an individual query
>to fail (see last entry).
>
>Needless to say this is a bit nasty way of failing - the process is
>running, consuming nearly all CPU, but responding to queries very slowly
>or not at all. It would even be better if the process just died, so
>something else could restart it. I am considering using a tool such as
>Monit to watch the Fuseki process and restart it if it starts behaving
>oddly.

This is really a JVM issue and not something we can control.  The JVM
allows catching OOM errors (for better or worse) but in doing so it often
leaves applications in a state where they are extremely close to the heap
limit and so the user code hangs while the JVM furiously tries to GC
enough memory for it to continue.
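
If you do end up watching the process externally, one thing a watchdog could
check is how close the heap is to its limit. The following is a minimal
sketch using the standard java.lang.management API; the class name, the
method names and the idea of a fixed threshold are assumptions for
illustration, and none of this is part of Fuseki.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Hypothetical helper, not part of Fuseki or Jena. */
final class HeapHeadroomSketch {

    /** Fraction of the heap limit in use, for example 0.97 for 97%. */
    static double usedFraction(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            throw new IllegalArgumentException("maxBytes must be positive");
        }
        return (double) usedBytes / maxBytes;
    }

    /** True when usage is at or above the chosen (assumed) threshold. */
    static boolean nearLimit(long usedBytes, long maxBytes, double threshold) {
        return usedFraction(usedBytes, maxBytes) >= threshold;
    }

    /** Reads the current heap usage of this JVM. */
    static double currentUsedFraction() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long max = heap.getMax();         // -1 when no maximum is defined
        if (max <= 0) {
            max = heap.getCommitted();    // fall back to the committed size
        }
        return usedFraction(heap.getUsed(), max);
    }
}
```

A check like this only reports the symptom. It does not stop the JVM from
spending its time in GC once it is already at the limit.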

>
>Am I doing something obviously wrong here?

No, this is a known limitation of TDB's architecture.


>Should I just give the JVM
>even more memory, or adjust some of the other JVM options?

That only prolongs the time to failure.  TDB also relies heavily on memory
mapped files, which are off heap, so increasing the heap size hurts
performance because it causes more swapping at the OS level.
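
As rough arithmetic for the trade-off, using the figures quoted above (16GB
RAM, -Xmx6000M, a 13GB TDB plus a 200MB text index): the sketch below treats
1GB as 1024MB and ignores JVM non-heap overhead and everything else running
on the machine. Those are simplifying assumptions, and the class and method
names are hypothetical.

```java
/** Back-of-envelope arithmetic only; names are hypothetical. */
final class MemoryBudgetSketch {

    /** Memory left for the OS to cache memory mapped files, in MB. */
    static long pageCacheBudgetMb(long ramMb, long heapMb) {
        return ramMb - heapMb;
    }

    /** Whether the on-disk files could be fully cached in that budget. */
    static boolean fitsInPageCache(long filesMb, long budgetMb) {
        return filesMb <= budgetMb;
    }

    public static void main(String[] args) {
        long ramMb = 16L * 1024;            // 16GB machine
        long filesMb = 13L * 1024 + 200;    // 13GB TDB + 200MB text index

        long[] heapSizesMb = {6000, 12000}; // current heap, then a larger one
        for (long heapMb : heapSizesMb) {
            long budgetMb = pageCacheBudgetMb(ramMb, heapMb);
            System.out.println("heap=" + heapMb + "MB leaves " + budgetMb
                    + "MB for mapped files; files fit: "
                    + fitsInPageCache(filesMb, budgetMb));
        }
    }
}
```

With the current 6000MB heap this leaves about 10384MB for roughly 13512MB
of files, so they already do not fit, and doubling the heap shrinks that
budget to about 4384MB.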

>Is there any 
>way to force the GC to fail faster, or otherwise avoid futile attempts
>of freeing more memory?

You can try the solution detailed at
http://stackoverflow.com/a/3878199/107591

Add -XX:OnOutOfMemoryError="kill -9 %p" to the JVM options.

That is an Oracle JVM option so there is no guarantee it works on OpenJDK.
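
If the server is started from another Java program rather than a shell
script, the flag can be passed as a single argument, as in the sketch below.
The class name and the jarPath parameter are placeholders, and the heap
value simply mirrors the -Xmx6000M setting quoted above. When going through
ProcessBuilder no shell is involved, so the surrounding quotes are dropped
and the whole option, spaces included, is one list element.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical launcher helper; the jar path is a placeholder argument. */
final class OomKillFlagSketch {

    /** The option from the linked answer; the JVM substitutes %p itself. */
    static final String OOM_HOOK = "-XX:OnOutOfMemoryError=kill -9 %p";

    /** Builds the command line without starting anything. */
    static List<String> buildCommand(String maxHeap, String jarPath) {
        List<String> command = new ArrayList<String>();
        command.add("java");
        command.add("-Xmx" + maxHeap);
        command.add(OOM_HOOK);   // one argument, no shell quoting needed
        command.add("-jar");
        command.add(jarPath);
        return command;
    }

    /** Starts the process; server arguments are left to the caller. */
    static Process launch(String maxHeap, String jarPath) throws java.io.IOException {
        return new ProcessBuilder(buildCommand(maxHeap, jarPath)).start();
    }
}
```

Whether the hook actually fires on a given JVM build is worth testing
deliberately, for example with a tiny -Xmx value on a non-production copy.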

Rob

>Even better, could Fuseki just stick to the
>memory it is given?
>
>Thanks,
>Osma
>
>-- 
>Osma Suominen
>D.Sc. (Tech), Information Systems Specialist
>National Library of Finland
>P.O. Box 26 (Teollisuuskatu 23)
>00014 HELSINGIN YLIOPISTO
>Tel. +358 50 3199529
>[email protected]
>http://www.nationallibrary.fi
