Re: Mystery memory leak in fuseki

Dave Reynolds Tue, 04 Jul 2023 01:32:21 -0700

Tried 4.7.0 under the most up-to-date Java 17 and it acts like 4.8.0. After 16 hours it gets to about 1.6GB and by eye has flattened off somewhat but not completely.


For interest here's a MEM% curve on a 4GB box (hope the link works).


https://www.dropbox.com/s/xjmluk4o3wlwo0y/fuseki-mem-percent.png?dl=0

The flattish curve from 12:00 to 17:20 is a run using 3.16.0 for comparison. The curve from then onwards is 4.7.0.

The spikes on the 4.7.0 match the allocation and recovery of the direct memory buffers. The JVM metrics show those cycling around every 10 mins and being reclaimed each time with no leaking visible at that level. Heap, non-heap and mapped buffers are all basically unchanging which is to be expected since it's doing nothing apart from reporting metrics.

Whereas this curve (again from 17:20 onwards) shows basically the same 4.7.0 set up on a separate host, showing that despite flattening out somewhat usage continues to grow - at least on a 16 hour timescale.


https://www.dropbox.com/s/k0v54yq4kexklk0/fuseki-mem-percent-2.png?dl=0

Both of those runs were using Eclipse Temurin on a base Ubuntu jammy container. Previous runs used AWS Corretto on an AL2 base container. Behaviour is basically unchanged, so that eliminates this being some Corretto-specific issue or a weird base container OS issue.


Dave

On 03/07/2023 14:54, Andy Seaborne wrote:

Hi Dave,

Could you try 4.7.0?

4.6.0 was 2022-08-20
4.7.0 was 2022-12-27
4.8.0 was 2023-04-20

This is an in-memory database?
Micrometer/Prometheus has had several upgrades but if it is not heap and not direct memory (I thought that was a hard bound set at start up), I don't see how it can be involved.
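
One way to check what bound on direct memory a running JVM was started with is sketched below. It assumes a HotSpot-based JVM (Temurin and Corretto both are), where the limit is controlled by the -XX:MaxDirectMemorySize flag and can be read through com.sun.management.HotSpotDiagnosticMXBean. The class and method names here are made up for illustration. A reported value of 0 means the flag was left at its default, in which case the JDK normally derives the limit from the maximum heap size.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Illustrative sketch only: class and method names are hypothetical.
class DirectMemoryLimit {

    /**
     * Returns the value of -XX:MaxDirectMemorySize in bytes,
     * 0 if the flag was left at its default,
     * or -1 if it cannot be read (for example on a non-HotSpot JVM).
     */
    static long configuredMaxDirectBytes() {
        try {
            HotSpotDiagnosticMXBean hs =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (hs == null) {
                return -1;
            }
            return Long.parseLong(hs.getVMOption("MaxDirectMemorySize").getValue());
        } catch (RuntimeException e) {
            // Unknown flag, unparsable value, or bean not available.
            return -1;
        }
    }

    public static void main(String[] args) {
        long configured = configuredMaxDirectBytes();
        System.out.println("MaxDirectMemorySize flag (bytes): " + configured);
        // Assumption: with the flag unset, the limit tracks the max heap size.
        System.out.println("Max heap (bytes): " + Runtime.getRuntime().maxMemory());
    }
}
```

This only reports the cap on NIO direct buffers. Memory allocated natively outside that pool is not covered by this limit.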
     Andy

On 03/07/2023 14:20, Dave Reynolds wrote:
We have a very strange problem with recent fuseki versions when running (in docker containers) on small machines. Suspect a jetty issue but it's not clear.
Wondering if anyone has seen anything like this.
This is a production service but with tiny data (~250k triples, ~60MB as NQuads). Runs on 4GB machines with java heap allocation of 500MB [1].
We used to run using 3.16 on jdk 8 (AWS Corretto for the long term support) with no problems.
Switching to fuseki 4.8.0 on jdk 11 the process grows in the space of a day or so to reach ~3GB of memory at which point the 4GB machine becomes unviable and things get OOM killed.
The strange thing is that this growth happens when the system is answering no Sparql queries at all, just regular health ping checks and (prometheus) metrics scrapes from the monitoring systems.
Furthermore the space being consumed is not visible to any of the JVM metrics:
- Heap and non-heap are stable at around 100MB total (mostly non-heap metaspace).
- Mapped buffers stay at 50MB and remain long term stable.
- Direct memory buffers being allocated up to around 500MB then being reclaimed. Since there are no sparql queries at all we assume this is jetty NIO buffers being churned as a result of the metric scrapes. However, this direct buffer behaviour seems stable, it cycles between 0 and 500MB on approx a 10min cycle but is stable over a period of days and shows no leaks.
Yet the java process grows from an initial 100MB to at least 3GB. This can occur in the space of a couple of hours or can take up to a day or two with no predictability in how fast.
Presumably there is some low level JNI space allocated by Jetty (?) which is invisible to all the JVM metrics and is not being reliably reclaimed.
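
The gap described above, between what the JVM reports and what the process actually occupies, can be measured from inside the JVM with a rough sketch like the following. It uses only the standard java.lang.management beans for heap, non-heap and the NIO buffer pools, and reads the resident set size from /proc/self/status, so the RSS part assumes Linux. The class and method names are hypothetical, and the sum is only an approximation (mapped buffers, for instance, count towards RSS only for pages actually touched).

```java
import java.io.IOException;
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch only: class and method names are hypothetical.
class JvmVsRss {

    /** Memory the JVM itself accounts for: committed heap + non-heap + NIO buffer pools. */
    static long jvmVisibleBytes() {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        long total = mem.getHeapMemoryUsage().getCommitted()
                   + mem.getNonHeapMemoryUsage().getCommitted();
        // Buffer pools are typically named "direct" and "mapped".
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            long used = pool.getMemoryUsed(); // -1 when the pool cannot report it
            if (used > 0) {
                total += used;
            }
        }
        return total;
    }

    /** Resident set size of this process in bytes, or -1 if /proc is unavailable. */
    static long rssBytes() {
        Path status = Path.of("/proc/self/status");
        if (!Files.isReadable(status)) {
            return -1;
        }
        try {
            for (String line : Files.readAllLines(status)) {
                if (line.startsWith("VmRSS:")) {
                    // Line format: "VmRSS:   123456 kB"
                    String[] parts = line.trim().split("\\s+");
                    return Long.parseLong(parts[1]) * 1024L;
                }
            }
        } catch (IOException | RuntimeException e) {
            // Fall through and report "unknown".
        }
        return -1;
    }

    public static void main(String[] args) {
        long jvm = jvmVisibleBytes();
        long rss = rssBytes();
        System.out.println("JVM-visible bytes: " + jvm);
        System.out.println("Process RSS bytes: " + rss);
        if (rss >= 0) {
            System.out.println("Unaccounted bytes (approx): " + (rss - jvm));
        }
    }
}
```

Logging these two numbers periodically would show whether the unaccounted part is what grows over time, which is what the symptoms here suggest.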
Trying 4.6.0, which we've had fewer problems with elsewhere, that seems to grow to around 1GB (plus up to 0.5GB for the cycling direct memory buffers) and then stays stable (at least on a three day soak test). We could live with allocating 1.5GB to a system that should only need a few 100MB but are concerned that it may not be stable in the really long term and, in any case, would rather be able to update to more recent fuseki versions.
Trying 4.8.0 on java 17 it grows rapidly to around 1GB again but then keeps ticking up slowly at random intervals. We project that it would take a few weeks to grow to the scale it did under java 11 but it will still eventually kill the machine.
Anyone seen anything remotely like this?

Dave
[1] 500M heap may be overkill but there can be some complex queries and that should still leave plenty of space for OS buffers etc in the remaining memory on a 4GB machine.
