I haven't got a desktop, but free says:

              total        used        free      shared  buff/cache   available
Mem:        8691124     1844328      399084      100032     6447712     6463068
Swap:             0           0           0

when Fuseki is "resting".
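
To make the numbers above easier to read, here is a minimal Python sketch that splits the Mem: row into shares of total RAM. The helper names (parse_free_mem_row, share) are hypothetical; the row is copied from the output above, and the column order assumes the procps layout shown there (values in KiB).

```python
# Column order of the "Mem:" row as printed by `free` above (values in KiB).
FIELDS = ("total", "used", "free", "shared", "buff_cache", "available")

def parse_free_mem_row(row):
    """Parse the 'Mem:' row of `free` output into a dict of KiB values."""
    label, *values = row.split()
    if label != "Mem:":
        raise ValueError("expected the 'Mem:' row")
    return dict(zip(FIELDS, map(int, values)))

def share(part_kib, total_kib):
    """Percentage of total, rounded to one decimal place."""
    return round(100.0 * part_kib / total_kib, 1)

# Row copied from the output above.
mem = parse_free_mem_row(
    "Mem:        8691124     1844328      399084      100032 6447712     6463068"
)
print("used by processes:", share(mem["used"], mem["total"]), "%")
print("buff/cache:       ", share(mem["buff_cache"], mem["total"]), "%")
print("available:        ", share(mem["available"], mem["total"]), "%")
```

In this snapshot buff/cache (the OS file cache) is by far the largest share, and used + free + buff/cache adds up to the total.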

On 29/01/2019 22:46, Andy Seaborne wrote:
TDB uses the OS file cache via mmap files.

The files appear as part of the process address space, but of course they are not part of the heap. The mapped space also flexes up and down as needed (unlike the heap).

Some system tools report the total address space, and that is not the amount of RAM the process is using.

In top(1) Linux-speak: roughly VIRT and RES (assuming no old-fashioned swapping is going on, which with Java should be avoided at all costs; the JVM heap on swap is very bad for performance).

RES is approximately the

VisualVM lets you see the heap size. That's the figure to look at first.
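
As a rough illustration of the VIRT versus RES distinction above: on Linux the same two figures are exposed per process in /proc/<pid>/status as VmSize and VmRSS. The sketch below only parses that text; the function name parse_vm_fields is hypothetical, and the sample values are made up to echo the "7+ gigs (VM 10+ gigs)" figures, not measured on any system in this thread.

```python
def parse_vm_fields(status_text):
    """Extract VmSize (top's VIRT) and VmRSS (top's RES), in KiB,
    from the text of /proc/<pid>/status."""
    wanted = {"VmSize", "VmRSS"}
    result = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in wanted:
            # Lines look like "VmRSS:\t  123456 kB"; keep the number only.
            result[key] = int(rest.split()[0])
    return result

# Illustrative sample only (assumed values, not from the thread's server).
sample = (
    "Name:\tjava\n"
    "VmSize:\t10485760 kB\n"
    "VmRSS:\t 7340032 kB\n"
    "Threads:\t42\n"
)
vm = parse_vm_fields(sample)
print(f"VIRT {vm['VmSize'] / 1024**2:.1f} GiB, RES {vm['VmRSS'] / 1024**2:.1f} GiB")
```

On a live system the input would come from reading /proc/<pid>/status for the Fuseki process. Neither figure is the heap size; that still has to be read from the JVM itself.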

For Mikael,

-Xmx5600M
process space 7+gigs
(VM 10+ gigs)

so

(For Fuseki+TDB it's either heap or mapped files; there is no use of direct memory, i.e. RAM that is not heap.)

and start with -Xms5600M as well.

    Andy

On 29/01/2019 17:41, Dan Pritts wrote:
It's often misunderstood, but Java programs use memory in addition to the
configured heap.  Fuseki in my experience sometimes uses a LOT more, more
than I could explain.  Some of the folks here (Andy for sure) spent some
time looking at it with me and weren't able to come to any conclusions.
You can look through the list archives for the discussion, maybe 6 months
ago.

I ended up significantly overallocating memory to the instance and being
done with it.

How much RAM does your instance have?  You mentioned -Xmx 5600, and total
usage of 17GB ram+swap - sounds like you have maybe 8GB ram? I'd try
16GB and see how it does; watch the total memory usage.



On Tue, Jan 29, 2019 at 9:43 AM Mikael Pesonen <mikael.peso...@lingsoft.fi>
wrote:




On 29/01/2019 16:28, Rob Vesse wrote:
This may be partly a case of a simple-looking query having unexpected
execution semantics.  Strictly speaking, your query says: select all triples
in the specific graph, then join them with this list of values for ?s.  Now
the optimiser should, and does appear to, do the right thing and flip the
join order, i.e. it uses the concrete values from the VALUES block to search
for triples with those subjects in the specific graph.  However, if the
query had other elements involved the optimiser might not kick in; a better
query would place the VALUES prior to using the variables defined in the
VALUES block.
Thanks for the reminder on VALUES order
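
A minimal sketch of the VALUES-first ordering suggested above, written as a small Python query builder. The function name build_construct_query is hypothetical, and the graph URI and subjects are taken from the query quoted further down. As in that query, the PREFIX declaration for lsr: is not shown in the thread and would have to be added before the query can run.

```python
from typing import Sequence

def build_construct_query(graph_uri: str, subjects: Sequence[str]) -> str:
    """Build a CONSTRUCT query with the VALUES block placed ahead of the
    triple pattern that uses ?s."""
    values = "\n".join(f"    {s}" for s in subjects)
    return (
        "CONSTRUCT { ?s ?p ?o }\n"
        f"FROM <{graph_uri}>\n"
        "WHERE\n"
        "{\n"
        "  VALUES ?s {\n"
        f"{values}\n"
        "  }\n"
        "  ?s ?p ?o\n"
        "}\n"
    )

query = build_construct_query(
    "https://resource.lingsoft.fi/4f13c609-48b4-4e4d-a40b-2d7946f88234/",
    [
        "lsr:10609f75-5cf3-4544-8fc1-c361778c3bd8",
        "lsr:88d0bb8c-35d8-4051-a27d-a0d93af77985",
    ],
)
print(query)
```

With this ordering the query no longer depends on the optimiser flipping the join.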

This sounds like memory/cache thrashing.  From what you have described,
running variants on this query 50k times, you are basically walking over
your entire dataset extracting it piece by piece?
The dataset is larger; these small sets (VALUES) come from our
external index for similar-document search. The index returns an id, and
the related metadata is fetched from Jena.

Assuming the Graph URI and the URIs in your VALUES block change in each
query then every query is looking at a different section of the database
causing a lot of data to be cached and then evicted both in terms of
on-heap memory structures (the node table cache) and potentially also for the off heap memory mapped files which may be being paged in and out as the
code traverses the B-Tree indexes.
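
The caching effect described above can be illustrated with a toy LRU cache. This is only a sketch of the general behaviour, not Jena's node table cache; the class and function names, the capacity, and the key patterns are all assumptions chosen for illustration.

```python
from collections import OrderedDict

class LruCache:
    """Toy least-recently-used cache that counts hits and misses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1
            self.entries[key] = True  # stand-in for the loaded value
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # evict least recently used

def hit_rate(keys, capacity):
    """Fraction of lookups served from the cache."""
    cache = LruCache(capacity)
    for key in keys:
        cache.get(key)
    return cache.hits / (cache.hits + cache.misses)

# Every lookup touches a new key: entries are loaded, then evicted unused.
distinct = hit_rate(range(50_000), capacity=1_000)
# The same 500 keys are revisited 100 times: they all stay cached.
repeated = hit_rate([k for _ in range(100) for k in range(500)], capacity=1_000)
print(distinct, repeated)
```

When every query looks at a different part of the data, the cache fills and empties without ever being reused, which is the churn pattern described above.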

Is there also some other query involved that extracts the Graph URIs and
Subject URIs of interest that is being executed in parallel with the
script?  Or has the input from the script been pre-calculated ahead of
time, comes from elsewhere etc?
There is no parallelism on our part in this case. Only one PHP script
is running and making GSP calls.

Rob

On 29/01/2019, 14:06, "Mikael Pesonen" <mikael.peso...@lingsoft.fi>
wrote:


      Server:

      /usr/bin/java
      -Dlog4j.configuration=file:/home/text/tools/apache-jena-fuseki-3.9.0/log4j.properties
      -Xmx5600M -jar fuseki-server.jar --update --port 3030
      --loc=/home/text/tools/jena_data_test/ /ds

      No custom configs, default installation package.


      SPARQL similar to this (returns 5-10 triples):

      CONSTRUCT { ?s ?p ?o }
      FROM <https://resource.lingsoft.fi/4f13c609-48b4-4e4d-a40b-2d7946f88234/>
      WHERE
      {
               ?s ?p ?o

      VALUES ?s {lsr:10609f75-5cf3-4544-8fc1-c361778c3bd8
      lsr:88d0bb8c-35d8-4051-a27d-a0d93af77985
      lsr:fc7b2c65-453e-469b-9c5d-8c7ee4ee6902
      lsr:239c6da0-4c24-4539-a277-c9756d6257ee
      lsr:2ef0190d-6271-447a-992f-6225fc440897
      lsr:6aaf601c-ccf4-4e59-9757-1a463db49fa9
      lsr:d7c9dc96-cd61-4a31-b466-bb2491a3ceaf
      lsr:6f6802cf-0336-4234-90b8-cc8780058f0d
      lsr:d1e2751b-4332-4d57-95e4-ca8070c16782
      lsr:81053775-4722-4a00-b3f7-33d4feb3629b}
      }


      I solved this by adding a sleep to the script. So I guess it's about the
      Java memory manager not getting time to free memory? Even with the sleep
      it was barely doable, with memory consumption changing rapidly between
      1.5 and 6 gigs.
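
A minimal sketch of the pacing workaround described above, in Python rather than the PHP of the original script. The names run_paced, send and pause_s are hypothetical, and the pause length is an assumption, since the thread does not say how long the sleep was.

```python
import time

def run_paced(jobs, send, pause_s, sleep=time.sleep):
    """Call send(job) for each job, pausing pause_s seconds between calls."""
    results = []
    for index, job in enumerate(jobs):
        if index > 0:
            sleep(pause_s)  # give the server time between requests
        results.append(send(job))
    return results

# Demo with a stand-in for the real request and a recording fake sleep,
# so nothing here actually waits or touches the network.
pauses = []
results = run_paced(
    ["q1", "q2", "q3"],
    send=lambda job: job.upper(),
    pause_s=0.2,
    sleep=pauses.append,
)
print(results, pauses)
```

In real use send would issue the request to Fuseki and the default time.sleep would be left in place.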



      On 29/01/2019 15:50, Andy Seaborne wrote:
      > Mikael,
      >
      > There aren't enough details except to mention the suspects like sorting.
      >
      > With all the questions on the list, I personally don't track the
      > details of each installation so please also remind me of your current
      > setup.
      >
      >     Andy
      >
      > On 29/01/2019 11:32, Mikael Pesonen wrote:
      >>
      >> I'm not able to run a basic read-only script without running out of
      >> memory on the server.
      >>
      >> Consumption goes to 7+gigs (VM 10+ gigs), then system kills Fuseki
      >> when running out of memory.
      >> All I'm running is simple sparql query getting few triples of
      >> resource. This is run for about 50k times.
      >>
      >> All settings are default, using GSP.
      >>
      >>

      --
      Lingsoft - 30 years of Leading Language Management

      www.lingsoft.fi

      Speech Applications - Language Management - Translation - Reader's
and Writer's Tools - Text Tools - E-books and M-books

      Mikael Pesonen
      System Engineer

      e-mail: mikael.peso...@lingsoft.fi
      Tel. +358 2 279 3300

      Time zone: GMT+2

      Helsinki Office
      Eteläranta 10
      FI-00130 Helsinki
      FINLAND

      Turku Office
      Kauppiaskatu 5 A
      FI-20100 Turku
      FINLAND











