Hi all,
We've been having trouble with our production Fuseki instance. A few
specifics:
Fuseki 3.6.0, standalone/Jetty, on OpenJDK 1.8.0.171 and RHEL6. It runs
on an m4.2xlarge, shared with two other applications.
We have about 21M triples in the database. We hit Fuseki moderately
hard, on the order of 1000 requests per minute, 99%+ of which are
queries. Our code could stand to do some client-side caching, since we
get lots of repetitive queries. That said, Fuseki is normally plenty
fast at those; it's rare for a query to take more than 10ms.
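A minimal sketch of the kind of client-side caching meant here, not our
actual code. It assumes a hypothetical `run_query` callable that sends a
SPARQL string to Fuseki and returns the result; `QueryCache`,
`ttl_seconds` and `clock` are illustrative names as well.

```python
import time
from typing import Callable, Dict, Tuple


class QueryCache:
    """Cache query results keyed by the exact query string, with a time-to-live."""

    def __init__(
        self,
        run_query: Callable[[str], object],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._run_query = run_query  # hypothetical: sends the query to Fuseki
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, object]] = {}
        self.hits = 0
        self.misses = 0

    def query(self, sparql: str) -> object:
        now = self._clock()
        entry = self._entries.get(sparql)
        if entry is not None and now - entry[0] < self._ttl:
            # Fresh cached result: skip the round trip to the server.
            self.hits += 1
            return entry[1]
        # Missing or expired: run the query and remember when we did.
        self.misses += 1
        result = self._run_query(sparql)
        self._entries[sparql] = (now, result)
        return result
```

The sketch never evicts entries, so a real version would need a size
bound, and the TTL has to be short enough that stale results are
acceptable for the application.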
It looks like I'm getting hit by JENA-1516; I will schedule an upgrade
to 3.7 ASAP.
The log is full of errors like this:
[2018-06-11 16:15:07] BindingTDB ERROR get1(?s)
org.apache.jena.tdb.base.file.FileException: ObjectFileStorage.read[nodes](488281706)[filesize=569694455][file.size()=569694455]: Failed to read the length : got 0 bytes
    at org.apache.jena.tdb.base.objectfile.ObjectFileStorage.read(ObjectFileStorage.java:341)
[2018-06-11 16:15:39] BindingTDB ERROR get1(?identifier)
org.apache.jena.tdb.base.file.FileException: In the middle of an alloc-write
    at org.apache.jena.tdb.base.objectfile.ObjectFileStorage.read(ObjectFileStorage.java:311)
    at org.apache.jena.tdb.base.objectfile.ObjectFileWrapper.read(ObjectFileWrapper.java:57)
    at org.apache.jena.tdb.lib.NodeLib.fetchDecode(NodeLib.java:78)
The problem that got me looking is that Fuseki's memory usage goes nuts,
which causes the server to start swapping. Swapping = slow = pager.
Total memory + swap in use by Fuseki when I investigated was about
32GB; it's configured to use a 16GB heap. Garbage collection logging
was not configured properly, so I can't say whether my immediate
problem was heap exhaustion.
I'm monitoring swap usage hourly. Sometime within a window of less than
an hour, swap usage grew from just past 2GB (10%) to about 11GB (10GB
of which was freed after I restarted Fuseki). So the memory ballooned
fairly quickly when it happened.
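An hourly check is too coarse to catch this. Below is a rough sketch of
a finer-grained per-process check, not something we run today. It
assumes a Linux kernel that exposes `VmRSS` and `VmSwap` in
`/proc/<pid>/status` (if a field is missing it is simply absent from the
result); `parse_status` and `read_memory` are illustrative names.

```python
import re
from pathlib import Path
from typing import Dict

# Lines look like "VmSwap:\t    789 kB" in /proc/<pid>/status.
_FIELD = re.compile(r"^(VmRSS|VmSwap):\s+(\d+)\s+kB$", re.MULTILINE)


def parse_status(text: str) -> Dict[str, int]:
    """Return whichever of VmRSS and VmSwap appear in the text, in kB."""
    return {name: int(kb) for name, kb in _FIELD.findall(text)}


def read_memory(pid: int) -> Dict[str, int]:
    """Read resident and swapped-out memory for one process (Linux only)."""
    return parse_status(Path(f"/proc/{pid}/status").read_text())
```

Calling `read_memory` on the Fuseki JVM's pid every minute or so and
logging the result would show how fast the growth happens and whether it
lines up with the TDB errors.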
The TDB errors start much earlier than the point where memory goes nuts.
Obviously, the memory problem could be a delayed effect of those errors,
but I'm wondering:
- if this rings a bell in some other way
- how much memory I should expect Fuseki to need
- if there is any particular debugging I should enable
- if our traffic level is out of the ordinary
thanks
danno
--
Dan Pritts
ICPSR Computing & Network Services
University of Michigan