You can profile it in the container as well :) https://github.com/AtomGraph/fuseki-docker#profiling
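For reference, heap snapshots can also be scripted rather than taken interactively in VisualVM. Below is a minimal sketch (not from the thread) of triggering a dump over JMX from outside the container. It assumes Fuseki's JVM was started with the standard com.sun.management.jmxremote system properties and the port published from the container; the host, port 9010 and dump path are illustrative only:

    import com.sun.management.HotSpotDiagnosticMXBean;
    import javax.management.JMX;
    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class RemoteHeapDump {
        public static void main(String[] args) throws Exception {
            // Illustrative endpoint: assumes JMX is exposed on localhost:9010
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://localhost:9010/jmxrmi");
            try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
                MBeanServerConnection conn = connector.getMBeanServerConnection();
                HotSpotDiagnosticMXBean diag = JMX.newMXBeanProxy(conn,
                        new ObjectName("com.sun.management:type=HotSpotDiagnostic"),
                        HotSpotDiagnosticMXBean.class);
                // Writes the .hprof inside the *container's* filesystem;
                // 'true' limits the snapshot to live (reachable) objects.
                diag.dumpHeap("/tmp/fuseki-heap.hprof", true);
            }
        }
    }

The resulting .hprof can then be copied out (e.g. with docker cp) and opened in VisualVM or Eclipse MAT. Note that a heap dump only explains heap growth; it won't account for native memory, which is the puzzle in the thread below.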
On Tue, 4 Jul 2023 at 11:12, Rob @ DNR <rve...@dotnetrdf.org> wrote:
> Does this only happen in a container? Or can you reproduce it running
> locally as well?
>
> If you can reproduce it locally then attaching a profiler like VisualVM so
> you can take a heap snapshot and see where the memory is going would
> be useful.
>
> Rob
>
> From: Dave Reynolds <dave.e.reyno...@gmail.com>
> Date: Tuesday, 4 July 2023 at 09:31
> To: users@jena.apache.org <users@jena.apache.org>
> Subject: Re: Mystery memory leak in fuseki
> Tried 4.7.0 under the most up-to-date java 17 and it acts like 4.8.0.
> After 16 hours it gets to about 1.6GB and by eye has nearly flattened
> off, but not completely.
>
> For interest here's a MEM% curve on a 4GB box (hope the link works).
>
> https://www.dropbox.com/s/xjmluk4o3wlwo0y/fuseki-mem-percent.png?dl=0
>
> The flattish curve from 12:00 to 17:20 is a run using 3.16.0 for
> comparison. The curve from then onwards is 4.7.0.
>
> The spikes on the 4.7.0 curve match the allocation and recovery of the
> direct memory buffers. The JVM metrics show those cycling around every
> 10 minutes and being reclaimed each time, with no leaking visible at
> that level. Heap, non-heap and mapped buffers are all basically
> unchanging, which is to be expected since it's doing nothing apart from
> reporting metrics.
>
> Whereas this curve (again from 17:20 onwards) shows basically the same
> 4.7.0 setup on a separate host, showing that despite flattening out
> somewhat, usage continues to grow - at least on a 16-hour timescale.
>
> https://www.dropbox.com/s/k0v54yq4kexklk0/fuseki-mem-percent-2.png?dl=0
>
> Both of those runs were using Eclipse Temurin on a base Ubuntu jammy
> container. Previous runs used AWS Corretto on an AL2 base container.
> Behaviour was basically unchanged, which eliminates this being some
> Corretto-specific issue or a weird base container OS issue.
>
> Dave
>
> On 03/07/2023 14:54, Andy Seaborne wrote:
> > Hi Dave,
> >
> > Could you try 4.7.0?
> >
> > 4.6.0 was 2022-08-20
> > 4.7.0 was 2022-12-27
> > 4.8.0 was 2023-04-20
> >
> > This is an in-memory database?
> >
> > Micrometer/Prometheus has had several upgrades, but if it is not heap
> > and not direct memory (I thought that was a hard bound set at start
> > up), I don't see how it can be involved.
> >
> >     Andy
> >
> > On 03/07/2023 14:20, Dave Reynolds wrote:
> >> We have a very strange problem with recent fuseki versions when
> >> running (in docker containers) on small machines. Suspect a jetty
> >> issue but it's not clear.
> >>
> >> Wondering if anyone has seen anything like this.
> >>
> >> This is a production service but with tiny data (~250k triples, ~60MB
> >> as NQuads). Runs on 4GB machines with java heap allocation of 500MB[1].
> >>
> >> We used to run using 3.16 on jdk 8 (AWS Corretto for the long term
> >> support) with no problems.
> >>
> >> Switching to fuseki 4.8.0 on jdk 11, the process grows in the space of
> >> a day or so to reach ~3GB of memory, at which point the 4GB machine
> >> becomes unviable and things get OOM killed.
> >>
> >> The strange thing is that this growth happens when the system is
> >> answering no SPARQL queries at all, just regular health ping checks
> >> and (prometheus) metrics scrapes from the monitoring systems.
> >>
> >> Furthermore, the space being consumed is not visible to any of the JVM
> >> metrics:
> >> - Heap and non-heap are stable at around 100MB total (mostly
> >> non-heap metaspace).
> >> - Mapped buffers stay at 50MB and remain long term stable.
> >> - Direct memory buffers are allocated up to around 500MB and then
> >> reclaimed. Since there are no SPARQL queries at all, we assume this is
> >> jetty NIO buffers being churned as a result of the metric scrapes.
> >> However, this direct buffer behaviour seems stable: it cycles between
> >> 0 and 500MB on approximately a 10-minute cycle but is stable over a
> >> period of days and shows no leaks.
> >>
> >> Yet the java process grows from an initial 100MB to at least 3GB. This
> >> can occur in the space of a couple of hours or can take up to a day or
> >> two, with no predictability in how fast.
> >>
> >> Presumably there is some low level JNI space allocated by Jetty (?)
> >> which is invisible to all the JVM metrics and is not being reliably
> >> reclaimed.
> >>
> >> Trying 4.6.0, which we've had fewer problems with elsewhere, it seems
> >> to grow to around 1GB (plus up to 0.5GB for the cycling direct memory
> >> buffers) and then stays stable (at least on a three day soak test).
> >> We could live with allocating 1.5GB to a system that should only need
> >> a few hundred MB, but we're concerned that it may not be stable in the
> >> really long term and, in any case, would rather be able to update to
> >> more recent fuseki versions.
> >>
> >> Trying 4.8.0 on java 17, it grows rapidly to around 1GB again but then
> >> keeps ticking up slowly at random intervals. We project that it would
> >> take a few weeks to grow to the scale it did under java 11, but it
> >> will still eventually kill the machine.
> >>
> >> Anyone seen anything remotely like this?
> >>
> >> Dave
> >>
> >> [1] 500M heap may be overkill but there can be some complex queries,
> >> and that should still leave plenty of space for OS buffers etc. in the
> >> remaining memory on a 4GB machine.
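As a footnote to the "not visible to any of the JVM metrics" observation above: everything the JVM will self-report can be pulled from the platform MXBeans, and comparing that total against the kernel's resident-set figure makes the invisible native growth measurable. Here is a minimal sketch (the class name and output format are mine, and the /proc part is Linux-only) to be run inside the suspect JVM, or adapted as a JMX client along the lines of the earlier example:

    import java.lang.management.BufferPoolMXBean;
    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryMXBean;
    import java.nio.file.Files;
    import java.nio.file.Path;

    public class MemoryReport {
        public static void main(String[] args) throws Exception {
            MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
            System.out.printf("heap used:      %,d bytes%n",
                    mem.getHeapMemoryUsage().getUsed());
            System.out.printf("non-heap used:  %,d bytes%n",
                    mem.getNonHeapMemoryUsage().getUsed());

            // The "direct" and "mapped" pools cover NIO buffers (e.g. Jetty's).
            // The direct pool is capped at startup (-XX:MaxDirectMemorySize,
            // defaulting to the heap size) - the hard bound mentioned above.
            for (BufferPoolMXBean pool :
                    ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
                System.out.printf("%-6s buffers: %,d bytes%n",
                        pool.getName(), pool.getMemoryUsed());
            }

            // Linux-only: the kernel's view of resident memory - what the
            // OOM killer acts on, and what none of the figures above cover.
            for (String line : Files.readAllLines(Path.of("/proc/self/status"))) {
                if (line.startsWith("VmRSS")) System.out.println(line);
            }
        }
    }

If the MXBean numbers stay flat at a few hundred MB while VmRSS climbs towards 3GB, that confirms the growth is in native allocations that no MXBean tracks; native memory tracking (starting the JVM with -XX:NativeMemoryTracking=summary and inspecting it with jcmd <pid> VM.native_memory) or pmap would be a natural next step.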