Re: Re: How to change jena-fuseki logging level/running out of memory with jena-fuseki 5.1.0

jaanam Tue, 05 Nov 2024 21:15:05 -0800

> The measurements of memory consumption are not yet ready, but we have some 
> additional interesting results.


I've added memory measurements to GitHub. Please check the explanations 
in https://github.com/jamietti/jena/blob/main/README.md

Br, Jaana

> 05.11.2024 10.25 EET jaa...@kolumbus.fi wrote:
> 
>  
> > Also, can you share some numbers in general? How many triples do you 
> > have? How much memory consumption do you see now compared to earlier 
> > versions? What kind of Update statements do you make to the triple 
> > store? 5h in your test for 20k lines of Excel sounds really slow in my 
> > opinion.
> 
> I have added some information about our use of jena-fuseki here: 
> https://github.com/jamietti/jena/blob/main/README.md
> 
> The measurements of memory consumption are not yet ready, but we have some 
> additional interesting results. 
> So far we have been running jena-fuseki in podman user-space containers on 
> Red Hat Linux, to which we need to move the Python server from its current 
> SUSE Linux platform.
> 
> When running the same code and the same Excel update on the SUSE Linux 
> machine, it takes a bit more than two hours to update all 20k lines of Excel 
> with jena-fuseki 3.13. installed on the host machine, i.e. no containers. 
> 
> When running docker.io/stain/jena-fuseki:4.8.0 in a podman user-space 
> container, the same update takes about ten hours, whereas when running 
> docker.io/stain/jena-fuseki:3.14.0 in a podman container it finishes in less 
> than six hours.
> 
> With docker.io/stain/jena-fuseki:5.1.0 it also takes at least six hours, 
> unless the run fails with the error 'Maximum lock count exceeded' or just 
> 'Remote end closed connection without response'.
> 
> So we are now planning to try installing jena-fuseki directly on RHEL9 
> with its original settings, except with IPv6 disabled. Do you have any hints 
> about operating system settings that could be useful for performance tuning?
> 
> Br, Jaana
> 
> > 30.10.2024 09.36 EET Lorenz Buehmann <buehm...@informatik.uni-leipzig.de> 
> > wrote:
> > 
> >  
> > Not that it matters maybe, but I'm wondering if you ever tried to 
> > combine your SPARQL Update requests into batches to reduce the number of 
> > requests?
> > 
> > You have 20k lines and for each line you do 6 updates - did you at least 
> > try to send those 6 update statements as a single request?
> > 
> > 
> > Also, can you share some numbers in general? How many triples do you 
> > have? How much memory consumption do you see now compared to earlier 
> > versions? What kind of Update statements do you make to the triple 
> > store? 5h in your test for 20k lines of Excel sounds really slow in my 
> > opinion.
> > 
> > On 30.10.24 04:20, jaa...@kolumbus.fi wrote:
> > >> Are you using TDB1? TDB2?
> > > It's not even possible to use TDB1 with 5.1.0, UI offers just in-memory 
> > > and TDB2 databases.
> > >
> > > - Jaana
> > >
> > >
> > >> 29.10.2024 19.17 EET Andy Seaborne <a...@apache.org> wrote:
> > >>
> > >>   
> > >> On 29/10/2024 13:43, jaa...@kolumbus.fi wrote:
> > >>>> The gap between 3.14.0 and 5.1.0 is huge. There is also a Jetty change -
> > >>>> Jetty 12 is a fundamentally different architecture in its HTTP handling.
> > >>> Does it mean that 5.1.0 requires much more memory?
> > >> In your case, it would seem so. From Jena, from Jetty, generally. Maybe
> > >> some internal caches are bigger - there could be lots of reasons, from
> > >> the lifecycle of memory release to many, many small things.
> > >>
> > >> If you care, then bisect on versions to find out.
> > >>
> > >> Are you using TDB1? TDB2?
> > >>
> > >>       Andy
> > >>
> > >>> Jaana
> > >>>
> > >>>> 29.10.2024 14.13 EET Andy Seaborne <a...@apache.org> wrote:
> > >>>>
> > >>>>    
> > >>>> On 29/10/2024 11:12, jaa...@kolumbus.fi wrote:
> > >>>>> Hi,
> > >>>>>> 1. Check that the client is properly reading the whole of the response
> > >>>>>> (even if zero bytes) and is actually closing the connection, or
> > >>>>>> returning it to the connection pool. Check by running "netstat" to see
> > >>>>>> TCP connections ("-t" on *nix)
> > >>>>> With netstat I saw several connections in TIME_WAIT state when 
> > >>>>> running my test. I think it means that the TCP connections have been 
> > >>>>> properly closed.
> > >>>> (the test being the large run?)
> > >>>>
> > >>>> That's good - the important point is that there are not hundreds of
> > >>>> connections.
> > >>>>
> > >>>>> I understand that improperly terminated TCP connections could lead to 
> > >>>>> the error "Maximum lock count exceeded", but what could cause 
> > >>>>> jena-fuseki 5.1.0 to use much more memory than 3.14 did with exactly 
> > >>>>> the same program code and input on the client side and the same data 
> > >>>>> in the database?
> > >>>> The gap between 3.14.0 and 5.1.0 is huge. There is also a Jetty change -
> > >>>> Jetty 12 is a fundamentally different architecture in its HTTP handling.
> > >>>>
> > >>>>        Andy
> > 
> > -- 
> > Lorenz Bühmann
> > Research Associate/Scientific Developer
> > 
> > Email buehm...@infai.org
> > 
> > Institute for Applied Informatics e.V. (InfAI) | Goerdelerring 9 | 04109 
> > Leipzig | Germany
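
The suggestion in the thread to send the six updates for each Excel line as a single request could look roughly like the sketch below. It relies on the SPARQL 1.1 Update rule that several operations in one request are separated by ";", and on the SPARQL protocol's "application/sparql-update" content type. The endpoint URL, the dataset name "ds", and the helper names batch_updates and send_update are assumptions made for illustration; none of them come from the thread or from the client code under discussion.

```python
import urllib.request
from typing import Iterable, Iterator, List

# Hypothetical endpoint: adjust host, port and dataset name to your own setup.
UPDATE_URL = "http://localhost:3030/ds/update"


def batch_updates(statements: Iterable[str], batch_size: int) -> Iterator[str]:
    """Yield SPARQL Update request bodies holding up to batch_size operations each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[str] = []
    for statement in statements:
        # Drop surrounding whitespace and any trailing ';' so joining stays valid.
        cleaned = statement.strip().rstrip(";").strip()
        if not cleaned:
            continue  # skip empty statements
        batch.append(cleaned)
        if len(batch) == batch_size:
            yield " ;\n".join(batch)
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield " ;\n".join(batch)


def send_update(body: str, url: str = UPDATE_URL) -> int:
    """POST one request body to the update endpoint and return the HTTP status."""
    request = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update; charset=utf-8"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        response.read()  # read the whole response, even if it is empty
        return response.status
```

With batch_size=6, the six statements generated for one Excel line travel in one HTTP request, so 20k lines would need about 20k requests instead of 120k. Whether this shortens the run in this particular setup would have to be measured. The with block in send_update also reads the full response and closes the connection, which matches the connection-handling advice given earlier in the thread.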
