On 19/06/18 16:47, Dan Pritts wrote:
Update:

Running without incident since the 14th.  Maximum RSS has been just over 16GB and maximum VSIZE just over 28GB - none of the ballooning of VSIZE to 48GB.  Maximum heap before GC is under 4GB.

So whatever triggered the problem hasn't recurred.  I'm going to see if I can recreate it by replaying traffic from the logs, but unfortunately that won't include any updates (we have the Apache logs and query strings are logged, but updates are POSTs, whose bodies aren't logged).

That said, that 16GB RSS is a lot bigger than the 13.4GB I calculated back-of-the-envelope.  If the heap's only using 4GB, and other JVM memory pools another gig (guesstimate), then why is NIO using 11GB to cache a 2.5GB database?

No idea.

For a database that does fit in RAM, the actual RAM size is a bit bigger than the on-disk size (e.g. nodes are UTF-8 on disk but Java strings in RAM, so roughly 2x the bytes plus string overheads, and that part is in-heap).
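(Very roughly, and assuming Java 8 strings, i.e. UTF-16 char[] with no compact strings - the node value below is made up:)

import java.nio.charset.StandardCharsets;

public class NodeSizeSketch {
    public static void main(String[] args) {
        String node = "http://example.org/resource/12345";               // hypothetical node value
        int onDisk = node.getBytes(StandardCharsets.UTF_8).length;       // UTF-8 bytes as stored on disk
        int inHeap = 2 * node.length() + 40;                             // ~2 bytes/char + String/char[] headers (rough)
        System.out.printf("on disk ~%d bytes, in heap ~%d bytes%n", onDisk, inHeap);
    }
}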


Hmm....
http://www.evanjones.ca/java-bytebuffer-leak.html

In Fuseki with Jetty there are quite a few Jetty threads, but I believe they are reused and new threads aren't being created and destroyed. If that's right, this (slightly scary) NIO effect will grow to "num threads * max buffer size" and stay there. It's not heap. As far as I know, there aren't any arbitrarily sized I/O buffers around.


TL;DR: The Java NIO API caches a maximum-sized direct ByteBuffer for each thread, which looks like a native memory leak if you read or write large blocks from many threads. You can easily patch the JDK yourself to work around this problem. Always use direct ByteBuffers with Java NIO APIs for the best performance, and to avoid this "leak." Under the covers, heap ByteBuffers are copied to temporary direct ByteBuffers on each I/O call. [Update 2016-02-10: JDK 9 has a property to control this (Thanks Tony!). Run services with -Djdk.nio.maxCachedBufferSize=262144 to avoid this problem.]
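A minimal sketch of the effect the article describes (file, buffer size, and thread count are made up; this is not Fuseki/Jetty code): each pool thread that writes a large *heap* ByteBuffer through a FileChannel ends up with a thread-local direct buffer of that size cached behind the scenes.

import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class DirectCacheSketch {
    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("nio-cache", ".bin");
        ByteBuffer heapBuf = ByteBuffer.allocate(64 * 1024 * 1024);  // 64 MB, allocated on the heap
        ExecutorService pool = Executors.newFixedThreadPool(8);      // stands in for a Jetty thread pool
        for (int i = 0; i < 8; i++) {
            pool.execute(() -> {
                try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
                    // The JDK copies the heap buffer into a thread-local *direct*
                    // buffer sized to the whole write, and caches it on the thread.
                    ch.write(heapBuf.duplicate());
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        // While the pool threads stay alive (as Jetty's do), up to 8 * 64 MB of native
        // memory can remain cached; -Djdk.nio.maxCachedBufferSize caps it on JDKs that
        // support the flag (JDK 9, per the article above).
    }
}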

I wonder if this is my problem.  Grepping around in the tdb(1) source, I see the following mentions of allocateDirect:

BufferAllocatorDirect.java: * Delegates to {@link ByteBuffer#allocateDirect(int)}.
BufferAllocatorDirect.java:        return ByteBuffer.allocateDirect(capacity) ;
BufferChannel.java:    // TODO Consider use of allocateDirect

BufferAllocatorDirect is not used by default - it is used only if journal spilling to disk for large transactions is enabled, which is not the default. (Aside to all: if you are in this situation, please consider TDB2.)


...and a separate BufferAllocatorMapped class.  I'm not a Java programmer, so it's not simple for me to track what's getting used where, but the TODO makes me wonder.

Good find - the TODO is misleading though.

Direct ByteBuffers aren't used normally.
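(For context, the default path is memory-mapped files rather than direct buffers, which is also why VSZ can dwarf the heap. A rough sketch of what a mapped read looks like - the path and chunk size here are illustrative, not TDB's actual code:)

import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class MappedSketch {
    public static void main(String[] args) throws Exception {
        try (FileChannel ch = FileChannel.open(Paths.get("/data/tdb/SPO.dat"),   // hypothetical path
                                               StandardOpenOption.READ)) {
            long chunk = Math.min(ch.size(), 8 * 1024 * 1024);   // map one 8MB-ish segment
            // The mapping adds to VSZ immediately; RSS grows only as pages are touched,
            // and those pages live in the OS page cache, not on the Java heap.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, chunk);
            long sum = 0;
            for (int i = 0; i < buf.limit(); i += 4096) {        // touch one byte per page
                sum += buf.get(i);
            }
            System.out.println("touched " + buf.limit() + " bytes, checksum-ish " + sum);
        }
    }
}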

----
If you get a chance to dump threads and heap as per Rob's suggestion, that would be great. I can't see a reason for what you are seeing at the moment.

    Andy





Dan Pritts <mailto:da...@umich.edu>
June 15, 2018 at 11:38 AM
Will do, thanks.


Dan Pritts <mailto:da...@umich.edu>
June 14, 2018 at 10:48 PM

So the issue is that memory goes up, that is, the heap expands to the maximum Xmx size set?  The JVM does not return any heap back to the OS (as far as I know), so if all the applications grow their heaps, real RAM usage grows to match, or swapping may result.
Hi Andy,

thanks for taking the time to help.

The problem is that the NON-HEAP memory usage skyrockets.

I "allocate" memory for the heap.    The gc logs suggested that I was never exceeding 6GB of heap in use, even when things went to hell.  So I set the heap to 10GB.

Now that I know we're using NIO, I "allocate" memory for NIO to hold the entire index in RAM.  The db is 2.4GB on disk.  I don't know NIO well but this seems plausible.

Let's throw another gig at Java for its own internal use.

That would add up to 10 + 2.4 + 1 = 13.4GB of memory I might expect Java to use.  There's nothing else on the server except Apache, Linux, and a few system daemons (postfix, etc.).
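(For what it's worth, the heap/off-heap split can also be checked from inside the JVM - a minimal sketch, class name mine, that prints the heap figures plus the "direct" and "mapped" NIO buffer pools:)

import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;

public class MemoryBreakdown {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024 * 1024;
        System.out.printf("heap used      : %d MB%n", (rt.totalMemory() - rt.freeMemory()) / mb);
        System.out.printf("heap committed : %d MB%n", rt.totalMemory() / mb);
        System.out.printf("heap max (Xmx) : %d MB%n", rt.maxMemory() / mb);
        // "direct" covers ByteBuffer.allocateDirect and the NIO per-thread cache;
        // "mapped" covers FileChannel.map regions such as TDB's .dat/.idn files.
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            System.out.printf("%s buffers: count=%d, used=%d MB%n",
                    pool.getName(), pool.getCount(), pool.getMemoryUsed() / mb);
        }
    }
}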

I upgraded to 3.7 and put fuseki on its own AWS instance last night. RAM was 16GB and swap 10GB.

Once today it filled RAM and swap such that Linux whacked the JVM process.  Two other times today it was swapping heavily (5GB of swap used) and we restarted fuseki before the system ran out of swap.

For some reason, the JVM running fuseki+jetty is going nuts with its memory usage.  It *is* using more heap than usual when this happens, but it's not using more than the 10GB I allocated - at least, not according to the garbage collection logs.

We have had this problem a few times in the past - memory usage would spike drastically.  We'd always attributed it to a slow memory leak, and decided we should restart fuseki regularly.  But in the last couple weeks it's happened probably a dozen times.

After the third time today, I put it on a 32GB instance.  Of course, the problem hasn't happened since.

A couple of possibilities:

1/ A query does an ORDER BY that involves a large set of results to sort. This drives up the heap requirement, the JVM grows the heap, and now the process is larger.  There may well be a CPU spike at this time.

2/ Updates are building up. The journal isn't flushed to the main database until there is a quiet moment and with the high query rate you may get bursts of time when it is not quiet.  The updates are safe in the journal (the commit happened) but also in-memory as an overlay on the database.  The overlays are collapsed when there are no readers or writers.
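(As a concrete picture of that mechanism, a minimal TDB1 sketch - the path and data are hypothetical. A committed write is durable in journal.jrnl immediately, but stays as an in-memory overlay until a quiet moment lets it be written back.)

import org.apache.jena.query.Dataset;
import org.apache.jena.query.ReadWrite;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ResourceFactory;
import org.apache.jena.tdb.TDBFactory;

public class JournalSketch {
    public static void main(String[] args) {
        Dataset ds = TDBFactory.createDataset("/var/lib/fuseki/db");   // hypothetical path
        ds.begin(ReadWrite.WRITE);
        try {
            Model m = ds.getDefaultModel();
            m.add(ResourceFactory.createResource("http://example.org/item/1"),
                  ResourceFactory.createProperty("http://purl.org/dc/terms/title"),
                  "example");
            ds.commit();   // durable in journal.jrnl from this point on
        } finally {
            ds.end();
        }
        // The committed change also sits in memory as an overlay on the main database
        // files; it is only written back (and the journal truncated) when there are
        // no active readers or writers.
    }
}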

What might be happening is that there isn't a quiet moment.
The traffic is certainly steady - it was about 1500 hits/minute today when we first crashed.
A big sudden jump would imply a big update as well.

Setting the log to INFO (and, yes, at load it does get big)

What you are looking for is overlaps of queries/updates, so that the log shows true concurrent execution (i.e. [1] starts, [2] starts, [1] finishes logged after [2] starts) around the time the size grows quickly, and check the size of updates.
I will look for this.  I am dubious, though.  We don't make many writes, and those we do are not very big.  Our dataset is metadata about our archive.  The archive is 50 years old, and grows steadily but slowly.

We had disabled the fuseki log but left httpd logging enabled, because each was huge.  Unfortunately the updates are all in POSTs, which I hadn't noticed until I went looking just now, so I will have to wait until next time.

thanks
danno


Dan Pritts <mailto:da...@umich.edu>
June 12, 2018 at 10:49 PM
We had the problem again today.

Load was higher than average, but again not lunacy - about 3k hits per minute.  There is no immediately obvious bad query, although I hardly know what to look for in the SPARQL - I just looked for extra-long statements.  There is nothing in the fuseki.log at all within an hour of the event.  As you know the logs are verbose, so we have logging set to "WARN" for just about everything.  I'll append the log4j.properties to the end of this message - if there's something in particular that'd be useful to turn up, let me know.

I upgraded our dev & test to 3.7.0 today, am doing production tonight.  Also recreated the database from a backup, and am looking to verify that all db changes made since the 3.6 upgrade made it into fuseki.

For background, could you share a directory listing with files sizes?

total 2.8G
-rw-r--r--. 1 fuseki fuseki 8.0M Feb 18 11:30 GOSP.dat
-rw-r--r--. 1 fuseki fuseki 8.0M Feb 18 11:30 GOSP.idn
-rw-r--r--. 1 fuseki fuseki 8.0M Feb 18 11:30 GPOS.dat
-rw-r--r--. 1 fuseki fuseki 8.0M Feb 18 11:30 GPOS.idn
-rw-r--r--. 1 fuseki fuseki 8.0M Feb 18 11:30 GSPO.dat
-rw-r--r--. 1 fuseki fuseki 8.0M Feb 18 11:30 GSPO.idn
-rw-rw-r--. 1 fuseki fuseki    0 Jun 12 15:27 journal.jrnl
-rw-r--r--. 1 fuseki fuseki 208M Jun 12 15:27 node2id.dat
-rw-r--r--. 1 fuseki fuseki  32M Jun 12 10:55 node2id.idn
-rw-r--r--. 1 fuseki fuseki 545M Jun 12 15:27 nodes.dat
-rw-r--r--. 1 fuseki fuseki 784M Jun 12 15:27 OSP.dat
-rw-r--r--. 1 fuseki fuseki 8.0M Feb 18 11:30 OSPG.dat
-rw-r--r--. 1 fuseki fuseki 8.0M Feb 18 11:30 OSPG.idn
-rw-r--r--. 1 fuseki fuseki  88M Jun 12 11:09 OSP.idn
-rw-r--r--. 1 fuseki fuseki 760M Jun 12 15:27 POS.dat
-rw-r--r--. 1 fuseki fuseki 8.0M Feb 18 11:30 POSG.dat
-rw-r--r--. 1 fuseki fuseki 8.0M Feb 18 11:30 POSG.idn
-rw-r--r--. 1 fuseki fuseki  88M Jun 12 11:09 POS.idn
-rw-r--r--. 1 fuseki fuseki 8.0M Feb 18 11:30 prefix2id.dat
-rw-r--r--. 1 fuseki fuseki 8.0M Feb 18 11:30 prefix2id.idn
-rw-r--r--. 1 fuseki fuseki    0 Feb 18 11:30 prefixes.dat
-rw-r--r--. 1 fuseki fuseki 8.0M Feb 18 11:30 prefixIdx.dat
-rw-r--r--. 1 fuseki fuseki 8.0M Feb 18 11:30 prefixIdx.idn
-rw-r--r--. 1 fuseki fuseki 808M Jun 12 15:27 SPO.dat
-rw-r--r--. 1 fuseki fuseki 8.0M Feb 18 11:30 SPOG.dat
-rw-r--r--. 1 fuseki fuseki 8.0M Feb 18 11:30 SPOG.idn
-rw-r--r--. 1 fuseki fuseki  96M Jun 12 11:09 SPO.idn
-rw-r--r--. 1 fuseki fuseki  20K Feb 18 11:33 stats.opt
-rw-rw-r--. 1 fuseki fuseki    5 Jun 12 12:30 tdb.lock




When you restart - looks like that 10G is the mapped file space being dropped. Mapping on-demand in chunks, so on restart it is very small and grows over time. It should reach a steady state. It should not cause swapping or GC.
Yes, I noticed that the server's VSZ is actually larger than the virtual memory (swap + RAM) it is using; I figured it was something along those lines.  But when I referred to memory + swap used, I meant the actual RSS as reported by ps, plus the inferred swap usage (swap before and after the fuseki restart).

I was running "ps" & "free" every couple minutes.  As you can see between 12:24 & 12:26 fuseki's memory usage skyrockets.

I've mildly edited the below but the numbers are all unmolested.


Tue Jun 12 12:18:01 EDT 2018
USER                 PID %CPU %MEM    VSZ      RSS   TTY STAT START   TIME COMMAND
fuseki             32175 23.1 65.3 41186832 21496864 ?    Sl   Jun11 04:41:46 /etc/alternatives/java_sdk_1.8.0/bin/java -Xmx16G -Dlog4j.configuration=file:/etc/archonnex/fuseki/log4j.properties [ gc logging options here ] -jar /usr/local/apache-jena-fuseki-3.6.0/fuseki-server.jar --config=/etc/archonnex/fuseki/fcrepo.ttl

             total       used       free     shared buffers     cached
Mem:      32877320   31168460    1708860        416 3184     961276
-/+ buffers/cache:   30204000    2673320
Swap:     27257848    3145708   24112140

[...]

Tue Jun 12 12:22:01 EDT 2018
fuseki             32175 23.1 66.5 41383440 21870824 ?        Sl Jun11 04:43:32 /etc/alternatives/java_sdk_1.8.0/bin/java -Xmx16G
Mem:      32877320   31314128    1563192        488 1880     720456
-/+ buffers/cache:   30591792    2285528
Swap:     27257848    3145256   24112592

Tue Jun 12 12:24:01 EDT 2018
fuseki             32175 23.2 64.9 40859152 21352808 ?        Sl Jun11 04:44:19 /etc/alternatives/java_sdk_1.8.0/bin/java -Xmx16G -
Mem:      32877320   31276020    1601300        504 2104    1231452
-/+ buffers/cache:   30042464    2834856
Swap:     27257848    3094076   24163772

Tue Jun 12 12:26:02 EDT 2018
fuseki             32175 23.3 82.6 49183252 27179308 ?        Sl Jun11 04:46:21 /etc/alternatives/java_sdk_1.8.0/bin/java -Xmx16G
Mem:      32877320   32655256     222064        476 1516      25612
-/+ buffers/cache:   32628128     249192
Swap:     27257848    8361760   18896088

Tue Jun 12 12:28:01 EDT 2018
fuseki             32175 23.5 71.6 48702204 23540952 ?        Sl Jun11 04:49:44 /etc/alternatives/java_sdk_1.8.0/bin/java -Xmx16G
Mem:      32877320   30432416    2444904        484 2132     239924
-/+ buffers/cache:   30190360    2686960
Swap:     27257848   10598088   16659760




Java monitoring of the heap size should show the heap in use after a major GC to be a different, smaller size.
Yesterday I fixed the garbage collection logging.  I looked at it with gceasy.io; there is nothing horribly wrong there.  The heap doesn't go above 7GB, even when things went to hell.  Heap usage did increase significantly at the time of the problems - note the repeated Full GCs.





2018-06-12T12:22:28.779-0400: 73381.748: [GC (System.gc())
Desired survivor size 134742016 bytes, new threshold 9 (max 15)
[PSYoungGen: 3068955K->7622K(5460480K)] 3087627K->26295K(6654976K), 0.0068808 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]

2018-06-12T12:22:28.786-0400: 73381.755: [Full GC (System.gc()) [PSYoungGen: 7622K->0K(5460480K)] [ParOldGen: 18672K->24964K(1194496K)] 26295K->24964K(6654976K), [Metaspace: 34037K->34037K(1081344K)], 0.1054190 secs] [Times: user=0.57 sys=0.00, real=0.10 secs]

2018-06-12T12:23:22.592-0400: 73435.562: [GC (System.gc())
Desired survivor size 130547712 bytes, new threshold 8 (max 15)
[PSYoungGen: 2440898K->2816K(5455872K)] 2465863K->27780K(6650368K), 0.0037102 secs] [Times: user=0.02 sys=0.00, real=0.00 secs]

2018-06-12T12:23:22.596-0400: 73435.566: [Full GC (System.gc()) [PSYoungGen: 2816K->0K(5455872K)] [ParOldGen: 24964K->27048K(1194496K)] 27780K->27048K(6650368K), [Metaspace: 34037K->34037K(1081344K)], 0.1114969 secs] [Times: user=0.61 sys=0.00, real=0.11 secs]

2018-06-12T12:24:02.404-0400: 73475.374: [GC (Allocation Failure)
Desired survivor size 175112192 bytes, new threshold 7 (max 15)
[PSYoungGen: 5324288K->127456K(5377536K)] 5351336K->201416K(6572032K), 0.1020528 secs] [Times: user=0.66 sys=0.00, real=0.10 secs]

2018-06-12T12:24:29.348-0400: 73502.318: [GC (System.gc())
Desired survivor size 193986560 bytes, new threshold 6 (max 15)
[PSYoungGen: 880066K->129888K(5380096K)] 954027K->203848K(6574592K), 0.0642832 secs] [Times: user=0.33 sys=0.00, real=0.06 secs]

2018-06-12T12:24:29.412-0400: 73502.382: [Full GC (System.gc()) [PSYoungGen: 129888K->0K(5380096K)] [ParOldGen: 73960K->196536K(1194496K)] 203848K->196536K(6574592K), [Metaspace: 34037K->34037K(1081344K)], 0.3551479 secs] [Times: user=1.78 sys=0.00, real=0.35 secs]

2018-06-12T12:27:48.073-0400: 73701.045: [GC (System.gc())
Desired survivor size 186646528 bytes, new threshold 5 (max 15)
[PSYoungGen: 2862549K->16720K(5409792K)] 3059085K->213256K(6604288K), 2.1344761 secs] [Times: user=1.07 sys=0.09, real=2.13 secs]

2018-06-12T12:27:50.210-0400: 73703.179: [Full GC (System.gc()) [PSYoungGen: 16720K->0K(5409792K)] [ParOldGen: 196536K->206591K(1194496K)] 213256K->206591K(6604288K), [Metaspace: 34037K->34037K(1081344K)], 2.9111523 secs] [Times: user=2.51 sys=0.09, real=2.91 secs]


If that is not how it is, there is something to investigate.

    Andy

Dan Pritts <mailto:da...@umich.edu>
June 11, 2018 at 5:28 PM
Hi all,

We've been having trouble with our production fuseki instance.  A few specifics:

fuseki 3.6.0, standalone/jetty.  OpenJDK 1.8.0.171 on RHEL6.   On an m4.2xlarge, shared with two other applications.

We have about 21M triples in the database.  We hit fuseki medium hard, on the order of 1000 hits per minute, and 99%+ of the hits are queries.  Our code could stand to do some client-side caching; we get lots of repetitive queries.  That said, fuseki is normally plenty fast at those - it's rare that it takes >10ms on a query.

It looks like I'm getting hit by JENA-1516, so I will schedule an upgrade to 3.7 ASAP.

The log is full of errors like this.

[2018-06-11 16:15:07] BindingTDB ERROR get1(?s)
org.apache.jena.tdb.base.file.FileException: ObjectFileStorage.read[nodes](488281706)[filesize=569694455][file.size()=569694455]: Failed to read the length : got 0 bytes
        at org.apache.jena.tdb.base.objectfile.ObjectFileStorage.read(ObjectFileStorage.java:341)

[2018-06-11 16:15:39] BindingTDB ERROR get1(?identifier)
org.apache.jena.tdb.base.file.FileException: In the middle of an alloc-write
        at org.apache.jena.tdb.base.objectfile.ObjectFileStorage.read(ObjectFileStorage.java:311)
        at org.apache.jena.tdb.base.objectfile.ObjectFileWrapper.read(ObjectFileWrapper.java:57)
        at org.apache.jena.tdb.lib.NodeLib.fetchDecode(NodeLib.java:78)



The problem that got me looking is that fuseki memory usage goes nuts, which causes the server to start swapping, etc.  Swapping = slow = pager.  Total memory + swap in use by fuseki when I investigated was about 32GB; it's configured to use a 16GB heap.  Garbage collection logging was not configured properly, so I can't say whether my immediate problem was heap exhaustion.

I'm monitoring swap usage hourly - sometime in a <1hr timeframe the swap usage increased past 2GB (10%) to about 11GB (10GB of which was cleared after I restarted fuseki).  So the memory ballooned fairly quickly when it happened.

The TDB errors happen well before the memory goes nuts.  Obviously they could be a delayed effect of this problem, but I'm wondering:

-  if this rings a bell in some other way - how much memory should I expect fuseki to need?
-  if there is any particular debugging I should enable
-  if our traffic level is out of the ordinary

thanks
danno
