Osma
Comments inline:
On 05/06/2014 10:11, "Osma Suominen" <[email protected]> wrote:
Hi all!
On 30/05/14 16:36, Mark Feblowitz wrote:
After some amount of time I see a series of messages after update posts
WARN [xxxxxx] RC = 500 : Java heap space
And I’m seeing "java.lang.OutOfMemoryError: Java heap space" errors.
I also got this error yesterday on an important machine.
My setup is this: Fuseki 1.0.1 with a single TDB (no inference) that has
grown to 13GB and an additional jena-text index of 200MB. Fuseki is
given approx. 6GB of heap (-Xmx6000M) on a machine with 16GB RAM. The
machine is a virtual machine running 64bit CentOS 6, kernel
2.6.32-431.11.2.el6.x86_64, java -version gives this:
java version "1.7.0_51"
OpenJDK Runtime Environment (rhel-2.4.4.1.el6_5-x86_64 u51-b02)
OpenJDK 64-Bit Server VM (build 24.45-b08, mixed mode)
At the time this happened there were no updates (we usually only update
during the night), just read-only SELECT and CONSTRUCT queries coming in
at around 8 queries per second on average.
As has been explained in this thread, the problem is that the continuous
query load prevents the in-memory transaction journal from being fully
flushed to disk. If the read-only queries continue through the night
while you do updates, they will block the journal flush and you will
eventually hit this case.
JENA-703 (https://issues.apache.org/jira/browse/JENA-703) describes the
proposed fix for this issue. The side effect of that fix (as and when
it gets implemented) is that, on a system under continuous load, reads
will occasionally be blocked, so some queries may experience delays.
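To make the mechanism concrete, here is a toy model of the behaviour described above. It is only a sketch and not TDB's actual journal code: the class ToyJournal and all of its methods are invented for illustration. It assumes committed changes stay in memory until no reader is active.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model only: NOT TDB's real journal. All names here are hypothetical. */
class ToyJournal {
    private int activeReaders = 0;
    // Committed changes that are still held in memory.
    private final Deque<String> pending = new ArrayDeque<>();
    private int flushedCount = 0;

    synchronized void beginRead() {
        activeReaders++;
    }

    synchronized void endRead() {
        activeReaders--;
        flushIfQuiet();
    }

    synchronized void commitWrite(String change) {
        pending.addLast(change);
        flushIfQuiet();
    }

    // The "flush" can only happen while no reader is active.
    private void flushIfQuiet() {
        if (activeReaders == 0) {
            flushedCount += pending.size();
            pending.clear();
        }
    }

    synchronized int pendingCount() {
        return pending.size();
    }

    synchronized int flushedCount() {
        return flushedCount;
    }

    public static void main(String[] args) {
        ToyJournal journal = new ToyJournal();
        journal.beginRead();                      // a query is running
        for (int i = 0; i < 3; i++) {
            journal.commitWrite("update-" + i);   // updates pile up in memory
        }
        System.out.println("pending while a reader is active: " + journal.pendingCount());
        journal.endRead();                        // quiet moment: flush happens
        System.out.println("pending after the reader finishes: " + journal.pendingCount());
    }
}
```

With overlapping readers the active count never reaches zero, so in this model the pending queue only grows, which is the heap exhaustion pattern seen in the logs.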
Suddenly queries stop working and CPU usage rises to around 350% (the
machine has 4 cores). Errors like this appear in the Fuseki log:
2014-06-04 11:42:52,293 WARN Fuseki :: [14739636] RC = 500 : Java heap space
java.lang.OutOfMemoryError: Java heap space
2014-06-04 11:39:06,670 WARN Fuseki :: [14739587] RC = 500 : GC overhead limit exceeded
java.lang.OutOfMemoryError: GC overhead limit exceeded
2014-06-04 11:43:54,817 INFO Fuseki :: [14739587] 500 GC overhead limit exceeded (1,032.111 s)
2014-06-04 12:24:56,868 WARN Fuseki :: [14739738] RC = 500 : GC overhead limit exceeded
java.lang.OutOfMemoryError: GC overhead limit exceeded
2014-06-04 12:24:04,660 WARN Fuseki :: [14739862] RC = 500 : Java heap space
java.lang.OutOfMemoryError: Java heap space
2014-06-04 12:43:44,167 INFO Fuseki :: [14739738] 500 GC overhead limit exceeded (2,227.717 s)
2014-06-04 12:43:44,167 INFO Fuseki :: [14739862] 500 Java heap space (2,227.719 s)
2014-06-04 14:33:30,906 WARN Fuseki :: [14740021] RC = 500 : GC overhead limit exceeded
java.lang.OutOfMemoryError: GC overhead limit exceeded
2014-06-04 14:34:21,722 INFO Fuseki :: [14740021] 500 GC overhead limit exceeded (4,850.886 s)
The timestamps in the log entries are not always in order, as you can
see above. Sometimes it takes more than an hour for an individual query
to fail (see last entry).
Needless to say, this is a rather nasty way to fail - the process is
running and consuming nearly all CPU, but responding to queries very
slowly or not at all. It would be better if the process just died, so
that something else could restart it. I am considering using a tool such
as Monit to watch the Fuseki process and restart it if it starts
behaving oddly.
This is really a JVM issue and not something we can control. The JVM
allows catching OOM errors (for better or worse) but in doing so it often
leaves applications in a state where they are extremely close to the heap
limit and so the user code hangs while the JVM furiously tries to GC
enough memory for it to continue.
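For what it is worth, a process can at least observe how close it is to the heap limit. The sketch below uses only java.lang.Runtime. The class name HeapWatch and the 0.95 threshold are arbitrary assumptions for illustration, and nothing like this ships with Fuseki.

```java
/** Hypothetical helper: reports how much of the maximum heap is in use. */
class HeapWatch {

    /** Fraction of the heap limit in use, given explicit byte counts. */
    static double usedFraction(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            throw new IllegalArgumentException("maxBytes must be positive");
        }
        return (double) usedBytes / maxBytes;
    }

    /** Same calculation for the current JVM (maxMemory reflects -Xmx). */
    static double currentUsedFraction() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return usedFraction(used, rt.maxMemory());
    }

    /** True when usage is at or above the chosen threshold. */
    static boolean nearLimit(double fraction, double threshold) {
        return fraction >= threshold;
    }

    public static void main(String[] args) {
        double threshold = 0.95; // arbitrary assumption, tune for your setup
        double fraction = currentUsedFraction();
        System.out.printf("heap in use: %.1f%%%n", fraction * 100);
        if (nearLimit(fraction, threshold)) {
            System.out.println("close to the heap limit");
        }
    }
}
```

A single high reading proves little, since the heap may simply not have been collected yet. Any watchdog built on this would need to see the value stay high across several samples before acting.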
Am I doing something obviously wrong here?
No, this is a known limitation of TDB's architecture.
Should I just give the JVM
even more memory, or adjust some of the other JVM options?
That only prolongs the time to failure. TDB also relies heavily on
memory-mapped files, which live off heap, so increasing the heap size
hurts performance because it causes more swapping at the OS level.
Is there any
way to force the GC to fail faster, or otherwise avoid futile attempts
of freeing more memory?
You can try the solution detailed at
http://stackoverflow.com/a/3878199/107591
Add -XX:OnOutOfMemoryError="kill -9 %p"
That is an Oracle JVM option, so there is no guarantee that it works on
OpenJDK.
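One way to confirm that the option actually reached the running JVM is to inspect its input arguments. This is a minimal sketch using the standard RuntimeMXBean API. The class name OomHookCheck is made up, and the launch command in the comment is an assumption about your setup, not a tested Fuseki invocation.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

/**
 * Hypothetical check. Assumed launch command, adjust to your installation:
 *   java -Xmx6000M -XX:OnOutOfMemoryError="kill -9 %p" -jar fuseki-server.jar ...
 */
class OomHookCheck {
    static final String PREFIX = "-XX:OnOutOfMemoryError=";

    /** True if any JVM argument sets the OnOutOfMemoryError hook. */
    static boolean hasOomHook(List<String> jvmArgs) {
        for (String arg : jvmArgs) {
            if (arg.startsWith(PREFIX)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself, not to the application.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("OnOutOfMemoryError hook configured: " + hasOomHook(jvmArgs));
    }
}
```

This only shows that the flag was passed on the command line. Whether a given JVM build runs the command on an OutOfMemoryError still has to be tested, ideally on a non-production machine.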
Rob
Even better, could Fuseki just stick to the
memory it is given?
Thanks,
Osma
--
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 26 (Teollisuuskatu 23)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
[email protected]
http://www.nationallibrary.fi