You can profile it in the container as well :) https://github.com/AtomGraph/fuseki-docker#profiling
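For reference, heap snapshots can also be scripted rather than taken interactively in VisualVM. Below is a minimal sketch (not from the thread) of triggering a dump over JMX from outside the container. It assumes Fuseki's JVM was started with the standard com.sun.management.jmxremote system properties and the port published from the container; the host, port 9010 and dump path are illustrative only:

    import com.sun.management.HotSpotDiagnosticMXBean;
    import javax.management.JMX;
    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class RemoteHeapDump {
        public static void main(String[] args) throws Exception {
            // Illustrative endpoint: assumes JMX is exposed on localhost:9010
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://localhost:9010/jmxrmi");
            try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
                MBeanServerConnection conn = connector.getMBeanServerConnection();
                HotSpotDiagnosticMXBean diag = JMX.newMXBeanProxy(conn,
                        new ObjectName("com.sun.management:type=HotSpotDiagnostic"),
                        HotSpotDiagnosticMXBean.class);
                // Writes the .hprof inside the *container's* filesystem;
                // 'true' limits the snapshot to live (reachable) objects.
                diag.dumpHeap("/tmp/fuseki-heap.hprof", true);
            }
        }
    }

The resulting .hprof can then be copied out (e.g. with docker cp) and opened in VisualVM or Eclipse MAT. Note that a heap dump only explains heap growth; it won't account for native memory, which is the puzzle in the thread below.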
On Tue, 4 Jul 2023 at 11:12, Rob @ DNR <rve...@dotnetrdf.org> wrote:
> Does this only happen in a container? Or can you reproduce it running
> locally as well?
>
> If you can reproduce it locally then attaching a profiler like VisualVM so
> you can take a heap snapshot and see where the memory is going would
> be useful.
>
> Rob
>
> From: Dave Reynolds <dave.e.reyno...@gmail.com>
> Date: Tuesday, 4 July 2023 at 09:31
> To: users@jena.apache.org <users@jena.apache.org>
> Subject: Re: Mystery memory leak in fuseki
> Tried 4.7.0 under the most up-to-date java 17 and it acts like 4.8.0.
> After 16 hours it gets to about 1.6GB and by eye has nearly flattened
> off, but not completely.
>
> For interest here's a MEM% curve on a 4GB box (hope the link works).
>
> https://www.dropbox.com/s/xjmluk4o3wlwo0y/fuseki-mem-percent.png?dl=0
>
> The flattish curve from 12:00 to 17:20 is a run using 3.16.0 for
> comparison. The curve from then onwards is 4.7.0.
>
> The spikes on the 4.7.0 curve match the allocation and recovery of the
> direct memory buffers. The JVM metrics show those cycling around every
> 10 minutes and being reclaimed each time, with no leaking visible at
> that level. Heap, non-heap and mapped buffers are all basically
> unchanging, which is to be expected since it's doing nothing apart from
> reporting metrics.
>
> Whereas this curve (again from 17:20 onwards) shows basically the same
> 4.7.0 setup on a separate host, showing that despite flattening out
> somewhat, usage continues to grow - at least on a 16-hour timescale.
>
> https://www.dropbox.com/s/k0v54yq4kexklk0/fuseki-mem-percent-2.png?dl=0
>
> Both of those runs were using Eclipse Temurin on a base Ubuntu jammy
> container. Previous runs used AWS Corretto on an AL2 base container.
> Behaviour was basically unchanged, which eliminates this being some
> Corretto-specific issue or a weird base container OS issue.
>
> Dave
>
> On 03/07/2023 14:54, Andy Seaborne wrote:
> > Hi Dave,
> >
> > Could you try 4.7.0?
> >
> > 4.6.0 was 2022-08-20
> > 4.7.0 was 2022-12-27
> > 4.8.0 was 2023-04-20
> >
> > This is an in-memory database?
> >
> > Micrometer/Prometheus has had several upgrades, but if it is not heap
> > and not direct memory (I thought that was a hard bound set at start
> > up), I don't see how it can be involved.
> >
> >     Andy
> >
> > On 03/07/2023 14:20, Dave Reynolds wrote:
> >> We have a very strange problem with recent fuseki versions when
> >> running (in docker containers) on small machines. Suspect a jetty
> >> issue but it's not clear.
> >>
> >> Wondering if anyone has seen anything like this.
> >>
> >> This is a production service but with tiny data (~250k triples, ~60MB
> >> as NQuads). Runs on 4GB machines with java heap allocation of 500MB[1].
> >>
> >> We used to run using 3.16 on jdk 8 (AWS Corretto for the long term
> >> support) with no problems.
> >>
> >> Switching to fuseki 4.8.0 on jdk 11, the process grows in the space of
> >> a day or so to reach ~3GB of memory, at which point the 4GB machine
> >> becomes unviable and things get OOM killed.
> >>
> >> The strange thing is that this growth happens when the system is
> >> answering no SPARQL queries at all, just regular health ping checks
> >> and (prometheus) metrics scrapes from the monitoring systems.
> >>
> >> Furthermore, the space being consumed is not visible to any of the JVM
> >> metrics:
> >> - Heap and non-heap are stable at around 100MB total (mostly
> >> non-heap metaspace).
> >> - Mapped buffers stay at 50MB and remain long term stable.
> >> - Direct memory buffers are allocated up to around 500MB and then
> >> reclaimed. Since there are no SPARQL queries at all, we assume this is
> >> jetty NIO buffers being churned as a result of the metric scrapes.
> >> However, this direct buffer behaviour seems stable: it cycles between
> >> 0 and 500MB on approximately a 10-minute cycle but is stable over a
> >> period of days and shows no leaks.
> >>
> >> Yet the java process grows from an initial 100MB to at least 3GB. This
> >> can occur in the space of a couple of hours or can take up to a day or
> >> two, with no predictability in how fast.
> >>
> >> Presumably there is some low level JNI space allocated by Jetty (?)
> >> which is invisible to all the JVM metrics and is not being reliably
> >> reclaimed.
> >>
> >> Trying 4.6.0, which we've had fewer problems with elsewhere, it seems
> >> to grow to around 1GB (plus up to 0.5GB for the cycling direct memory
> >> buffers) and then stays stable (at least on a three day soak test).
> >> We could live with allocating 1.5GB to a system that should only need
> >> a few hundred MB, but we're concerned that it may not be stable in the
> >> really long term and, in any case, would rather be able to update to
> >> more recent fuseki versions.
> >>
> >> Trying 4.8.0 on java 17, it grows rapidly to around 1GB again but then
> >> keeps ticking up slowly at random intervals. We project that it would
> >> take a few weeks to grow to the scale it did under java 11, but it
> >> will still eventually kill the machine.
> >>
> >> Anyone seen anything remotely like this?
> >>
> >> Dave
> >>
> >> [1] 500M heap may be overkill but there can be some complex queries,
> >> and that should still leave plenty of space for OS buffers etc. in the
> >> remaining memory on a 4GB machine.
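As a footnote to the "not visible to any of the JVM metrics" observation above: everything the JVM will self-report can be pulled from the platform MXBeans, and comparing that total against the kernel's resident-set figure makes the invisible native growth measurable. Here is a minimal sketch (the class name and output format are mine, and the /proc part is Linux-only) to be run inside the suspect JVM, or adapted as a JMX client along the lines of the earlier example:

    import java.lang.management.BufferPoolMXBean;
    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryMXBean;
    import java.nio.file.Files;
    import java.nio.file.Path;

    public class MemoryReport {
        public static void main(String[] args) throws Exception {
            MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
            System.out.printf("heap used:      %,d bytes%n",
                    mem.getHeapMemoryUsage().getUsed());
            System.out.printf("non-heap used:  %,d bytes%n",
                    mem.getNonHeapMemoryUsage().getUsed());

            // The "direct" and "mapped" pools cover NIO buffers (e.g. Jetty's).
            // The direct pool is capped at startup (-XX:MaxDirectMemorySize,
            // defaulting to the heap size) - the hard bound mentioned above.
            for (BufferPoolMXBean pool :
                    ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
                System.out.printf("%-6s buffers: %,d bytes%n",
                        pool.getName(), pool.getMemoryUsed());
            }

            // Linux-only: the kernel's view of resident memory - what the
            // OOM killer acts on, and what none of the figures above cover.
            for (String line : Files.readAllLines(Path.of("/proc/self/status"))) {
                if (line.startsWith("VmRSS")) System.out.println(line);
            }
        }
    }

If the MXBean numbers stay flat at a few hundred MB while VmRSS climbs towards 3GB, that confirms the growth is in native allocations that no MXBean tracks; native memory tracking (starting the JVM with -XX:NativeMemoryTracking=summary and inspecting it with jcmd <pid> VM.native_memory) or pmap would be a natural next step.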