This may partly be a case of a simple-looking query having unexpected execution 
semantics.  Strictly speaking, your query says: select all triples in the 
specific graph, then join them with this list of values for ?s.  Now the 
optimiser should, and does appear to, do the right thing and flip the join 
order, i.e. it uses the concrete values from the VALUES block to look up 
triples with those subjects in the specific graph.  However, if the query had 
other elements involved the optimiser might not kick in; a better query would 
place the VALUES block before the triple patterns that use the variables it 
defines.
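For example, your query could be rewritten with the VALUES block first (a sketch using the graph URI and two of the lsr: subjects from your mail, remaining subjects elided; the lsr: prefix declaration is omitted here as in the original):

```sparql
CONSTRUCT { ?s ?p ?o }
FROM <https://resource.lingsoft.fi/4f13c609-48b4-4e4d-a40b-2d7946f88234/>
WHERE
{
    VALUES ?s { lsr:10609f75-5cf3-4544-8fc1-c361778c3bd8
                lsr:88d0bb8c-35d8-4051-a27d-a0d93af77985 }
    ?s ?p ?o
}
```

Written this way, ?s is already bound when the triple pattern is evaluated, so the engine does index lookups for those subjects regardless of whether the optimiser reorders the join.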

This sounds like memory/cache thrashing.  From what you have described (running 
variants of this query 50k times), you are basically walking over your entire 
dataset, extracting it piece by piece?

Assuming the graph URI and the URIs in your VALUES block change in each query, 
every query is looking at a different section of the database.  That causes a 
lot of data to be cached and then evicted, both in the on-heap memory 
structures (the node table cache) and potentially also in the off-heap 
memory-mapped files, which may be paged in and out as the code traverses the 
B-Tree indexes.
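If the 50k lookups cannot be avoided, one mitigation is to batch more subjects into each query so that fewer queries are issued and nearby index pages are reused while still cached.  A minimal Python sketch of that idea (the helper names are hypothetical, and a real query would also need the lsr: PREFIX declaration, omitted here as in your mail):

```python
def chunked(items, size):
    """Yield successive fixed-size chunks of a list."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def build_construct_query(graph_uri, subjects):
    """Build a CONSTRUCT query with the VALUES block placed first,
    so the engine only looks up the named subjects."""
    values = " ".join(subjects)
    return (
        "CONSTRUCT { ?s ?p ?o }\n"
        f"FROM <{graph_uri}>\n"
        "WHERE {\n"
        f"  VALUES ?s {{ {values} }}\n"
        "  ?s ?p ?o\n"
        "}"
    )

# Example: 1000 subjects in batches of 100 -> 10 queries instead of 1000.
subjects = [f"lsr:subject-{i}" for i in range(1000)]
queries = [build_construct_query("https://example.org/g/", batch)
           for batch in chunked(subjects, 100)]
```

Each generated query can then be sent to Fuseki as before; the batch size is a tuning knob to trade per-query result size against query count.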

Is there also some other query involved, executed in parallel with the script, 
that extracts the graph URIs and subject URIs of interest?  Or has the input 
to the script been pre-calculated ahead of time, or does it come from 
elsewhere?

Rob

On 29/01/2019, 14:06, "Mikael Pesonen" <[email protected]> wrote:

    
    Server:
    
    /usr/bin/java
    -Dlog4j.configuration=file:/home/text/tools/apache-jena-fuseki-3.9.0/log4j.properties
    -Xmx5600M -jar fuseki-server.jar --update --port 3030
    --loc=/home/text/tools/jena_data_test/ /ds
    
    No custom configs, default installation package.
    
    
    SPARQL similar to this (returns 5-10 triples):
    
    CONSTRUCT { ?s ?p ?o }
    FROM <https://resource.lingsoft.fi/4f13c609-48b4-4e4d-a40b-2d7946f88234/>
    WHERE
    {
             ?s ?p ?o
    
    VALUES ?s {lsr:10609f75-5cf3-4544-8fc1-c361778c3bd8 
    lsr:88d0bb8c-35d8-4051-a27d-a0d93af77985 
    lsr:fc7b2c65-453e-469b-9c5d-8c7ee4ee6902 
    lsr:239c6da0-4c24-4539-a277-c9756d6257ee 
    lsr:2ef0190d-6271-447a-992f-6225fc440897 
    lsr:6aaf601c-ccf4-4e59-9757-1a463db49fa9 
    lsr:d7c9dc96-cd61-4a31-b466-bb2491a3ceaf 
    lsr:6f6802cf-0336-4234-90b8-cc8780058f0d 
    lsr:d1e2751b-4332-4d57-95e4-ca8070c16782 
    lsr:81053775-4722-4a00-b3f7-33d4feb3629b}
    }
    
    
    I solved this by adding a sleep to the script. So I guess it's about the 
    Java memory manager not getting time to free memory? Even with the sleep 
    it was barely doable, memory consumption changing rapidly between 1.5 GB 
    and 6 GB.
    
    
    
    On 29/01/2019 15:50, Andy Seaborne wrote:
    > Mikael,
    >
    > There aren't enough details except to mention the suspects like sorting.
    >
    > With all the questions on the list, I personally don't track the 
    > details of each installation so please also remind me of your current 
    > setup.
    >
    >     Andy
    >
    > On 29/01/2019 11:32, Mikael Pesonen wrote:
    >>
    >> I'm not able to run a basic read-only script without running out of 
    >> memory on the server.
    >>
    >> Consumption goes to 7+gigs (VM 10+ gigs), then system kills Fuseki 
    >> when running out of memory.
    >> All I'm running is simple sparql query getting few triples of 
    >> resource. This is run for about 50k times.
    >>
    >> All settings are default, using GSP.
    >>
    >>
    
    -- 
    Lingsoft - 30 years of Leading Language Management
    
    www.lingsoft.fi
    
    Speech Applications - Language Management - Translation - Reader's and 
Writer's Tools - Text Tools - E-books and M-books
    
    Mikael Pesonen
    System Engineer
    
    e-mail: [email protected]
    Tel. +358 2 279 3300
    
    Time zone: GMT+2
    
    Helsinki Office
    Eteläranta 10
    FI-00130 Helsinki
    FINLAND
    
    Turku Office
    Kauppiaskatu 5 A
    FI-20100 Turku
    FINLAND
    
    