Hi all,

We've been having trouble with our production fuseki instance. A few specifics:

fuseki 3.6.0, standalone/jetty. OpenJDK 1.8.0.171 on RHEL6. On an m4.2xlarge, shared with two other applications.

We have about 21M triples in the database. We hit fuseki moderately hard, on the order of 1000 hits per minute; 99%+ of the hits are queries. Our code could stand to do some client-side caching, since we get lots of repetitive queries. That said, fuseki is normally plenty fast at those; it's rare for a query to take >10ms.
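To make the client-side caching idea concrete, here is a minimal sketch of a time-limited cache keyed on the exact query string. It is not our actual code: `run_query` is a hypothetical stand-in for whatever function sends a query to fuseki, and the 60-second lifetime is an assumed value, not something we measured.

```python
import time
from typing import Callable, Dict, Tuple


class QueryCache:
    """Tiny in-process cache keyed on the exact SPARQL query string."""

    def __init__(self, run_query: Callable[[str], str],
                 ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._run_query = run_query  # hypothetical: sends the query to fuseki
        self._ttl = ttl_seconds      # assumed freshness window
        self._clock = clock          # injectable so the cache can be tested
        self._entries: Dict[str, Tuple[float, str]] = {}
        self.hits = 0
        self.misses = 0

    def query(self, sparql: str) -> str:
        now = self._clock()
        cached = self._entries.get(sparql)
        # Serve from the cache only while the stored entry is still fresh.
        if cached is not None and now - cached[0] < self._ttl:
            self.hits += 1
            return cached[1]
        # Miss or stale entry: ask the server and remember the answer.
        self.misses += 1
        result = self._run_query(sparql)
        self._entries[sparql] = (now, result)
        return result
```

Note that this sketch never evicts entries, so a real version would need a size bound, and it is only safe when slightly stale results are acceptable.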

It looks like I'm getting hit by JENA-1516; I will schedule an upgrade to 3.7 ASAP.

The log is full of errors like these:

[2018-06-11 16:15:07] BindingTDB ERROR get1(?s)
org.apache.jena.tdb.base.file.FileException: ObjectFileStorage.read[nodes](488281706)[filesize=569694455][file.size()=569694455]: Failed to read the length : got 0 bytes
        at org.apache.jena.tdb.base.objectfile.ObjectFileStorage.read(ObjectFileStorage.java:341)

[2018-06-11 16:15:39] BindingTDB ERROR get1(?identifier)
org.apache.jena.tdb.base.file.FileException: In the middle of an alloc-write
        at org.apache.jena.tdb.base.objectfile.ObjectFileStorage.read(ObjectFileStorage.java:311)
        at org.apache.jena.tdb.base.objectfile.ObjectFileWrapper.read(ObjectFileWrapper.java:57)
        at org.apache.jena.tdb.lib.NodeLib.fetchDecode(NodeLib.java:78)
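
To see when errors like these begin and how often they occur, a small script can count them per hour. This is only a sketch: it assumes every error header line looks like the two shown above (a bracketed timestamp followed by `BindingTDB ERROR`), and the function name is made up for illustration.

```python
import re
from collections import Counter
from typing import Iterable

# Matches header lines such as:
#   [2018-06-11 16:15:07] BindingTDB ERROR get1(?s)
# and captures the date plus the hour.
_ERROR_LINE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}):\d{2}:\d{2}\] BindingTDB ERROR"
)


def errors_per_hour(lines: Iterable[str]) -> Counter:
    """Count BindingTDB ERROR header lines, bucketed by hour."""
    counts: Counter = Counter()
    for line in lines:
        match = _ERROR_LINE.match(line)
        if match:
            counts[match.group(1) + ":00"] += 1
    return counts
```

It can be fed an open log file directly, for example `errors_per_hour(open(path))`, where `path` is wherever the fuseki log lives on the machine in question.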



The problem that got me looking is that fuseki memory usage goes nuts, which causes the server to start swapping, etc. Swapping = slow = pager. Total memory + swap in use by fuseki when I investigated was about 32GB; it's configured to use a 16GB heap. Garbage collection logging was not configured properly, so I can't say whether my immediate problem was heap exhaustion.

I'm monitoring swap usage hourly. Sometime within a <1hr window the swap usage increased past 2GB (10%) to about 11GB (10GB of which was cleared after I restarted fuseki). So the memory ballooned fairly quickly when it happened.
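
For a per-process view at a finer interval than hourly, one option is to read the `VmSwap` line from `/proc/<pid>/status`. The following is a rough sketch rather than the monitoring we actually run; it assumes the kernel exposes `VmSwap` in that file, and the function names are made up for illustration.

```python
import re
from typing import Optional

# The status file reports swap as a line like "VmSwap:     2048 kB".
_VMSWAP = re.compile(r"^VmSwap:\s+(\d+)\s+kB\s*$", re.MULTILINE)


def swap_kb_from_status(status_text: str) -> Optional[int]:
    """Return the VmSwap value in kB, or None if the line is absent."""
    match = _VMSWAP.search(status_text)
    return int(match.group(1)) if match else None


def swap_kb_for_pid(pid: int) -> Optional[int]:
    """Read /proc/<pid>/status (Linux only) and extract VmSwap."""
    with open(f"/proc/{pid}/status") as status_file:
        return swap_kb_from_status(status_file.read())
```

Calling `swap_kb_for_pid` with the fuseki JVM's pid every minute or so and logging the result with a timestamp would show how fast the growth really is.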

The TDB errors start much earlier than the memory goes nuts. Obviously, it could be a delayed effect of this problem, but I'm wondering:

- if this rings a bell in some other way
- how much memory should I expect fuseki to need?
- if there is any particular debugging I should enable
- if our traffic level is out of the ordinary

thanks
danno
--
Dan Pritts
ICPSR Computing & Network Services
University of Michigan
<https://www.postbox-inc.com>
