Re: Memory management with Fuseki

Luís Moreira de Sousa Fri, 17 Apr 2020 01:26:22 -0700

Hi all, some answers below to the many questions.

1. This Fuseki instance is based on the image maintained at DockerHub by the 
secoresearch account. Copies of the Dockerfile and tdb.cfg files are at the end 
of this message. There is no other code involved.


2. The image is deployed to an Openshift cluster with a default resource base 
of 1 CPU and 1 GB of RAM. The intention is to use Fuseki as a component of an 
information system that is easy to deploy by institutions in developing 
countries, where resources may be limited and know-how lacking. These resources 
have proven sufficient to run software such as Postgres or MapServer.

3. Openshift provides a user interface to easily monitor the resources taken up 
by a running container (aka pod); no code is involved in this monitoring. It is 
also possible to launch a shell session into the container and monitor that 
way. At the end of the message is a printout from top showing that nothing 
else is running in this particular container. All memory is used either by 
Fuseki or the system.

4. The datasets I have been using to test Fuseki were created with rdflib and 
are saved as RDF/XML. Each contains some dozens of objects of interest and 
their respective relations from a larger database. The largest of these RDF 
files contains just under 100 000 triples and occupies 20 MB on disk. I 
uploaded a new graph with more meaningful labels 
(https://pasteboard.co/J4cfPM9.png). Each point in the graph is a dataset; the 
x axis (horizontal) shows the number of triples in the dataset and the y axis 
(vertical) shows the additional memory required by Fuseki once the dataset is 
added. Again, note that all datasets are uploaded in persistent mode.

5. Regarding the JVM, the manual simply states that the heap size is somewhat 
dependent on the kind of queries run. But the problem on this end is with 
dataset upload. At this stage I do not know what to modify in the JVM set-up, 
or how.
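
For reference, a minimal sketch of how a heap cap could be derived for the 
1 GB container, written in Python purely for illustration. The helper name 
suggest_xmx and the 50% heap fraction are assumptions, not values from the 
Fuseki manual. The reasoning is that on a 64-bit JVM TDB accesses its files 
through memory-mapped I/O, which lives outside the Java heap, so the heap 
probably should not be given the whole container limit. How the resulting 
-Xmx flag reaches the JVM depends on the image's start script, which is not 
shown here.

```python
def suggest_xmx(container_limit_mb: int, heap_fraction: float = 0.5) -> str:
    """Return a JVM -Xmx flag capping the heap at a fraction of the container limit.

    heap_fraction is an assumed starting point, not a recommended value;
    the remainder is left for memory-mapped files and JVM overhead.
    """
    if container_limit_mb <= 0:
        raise ValueError("container_limit_mb must be positive")
    if not 0 < heap_fraction < 1:
        raise ValueError("heap_fraction must be between 0 and 1")
    heap_mb = int(container_limit_mb * heap_fraction)
    return f"-Xmx{heap_mb}m"


if __name__ == "__main__":
    # 1 GB container, as in the Openshift default described above.
    print(suggest_xmx(1024))
```

The fraction would need tuning against the memory graph linked above; the 
sketch only makes the split between heap and non-heap memory explicit.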

Thank you for your help.

Dockerfile
----------
FROM secoresearch/fuseki:latest

# Set environment variables
ENV ADMIN_PASSWORD toto
ENV ENABLE_DATA_WRITE true
ENV ENABLE_UPDATE true
ENV ENABLE_UPLOAD true

# Add in config files
COPY ./tbd.cfg $FUSEKI_BASE/tbd.cfg
COPY ./tbd.cfg $FUSEKI_HOME/tbd.cfg


tdb.cfg
-------
{
  "tdb.node2nodeid_cache_size" :  50000 ,
  "tdb.nodeid2node_cache_size" :  250000 ,
}
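
One thing that may be worth checking is the comma after the last entry above: 
strict JSON parsers reject it. Whether the reader Jena uses for this file 
tolerates it has not been verified here. A minimal Python sketch that tests 
the text against a strict parser; the helper names are hypothetical.

```python
import json
import re

# The configuration exactly as pasted above.
TDB_CFG = """{
  "tdb.node2nodeid_cache_size" :  50000 ,
  "tdb.nodeid2node_cache_size" :  250000 ,
}"""


def is_strict_json(text: str) -> bool:
    """Return True if the text parses with Python's strict JSON parser."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def drop_trailing_commas(text: str) -> str:
    """Remove commas that directly precede a closing brace or bracket.

    Naive on purpose: it would also alter such a sequence inside a string
    value, which does not occur in this small file.
    """
    return re.sub(r",(\s*[}\]])", r"\1", text)


if __name__ == "__main__":
    print(is_strict_json(TDB_CFG))
    print(json.loads(drop_trailing_commas(TDB_CFG)))
```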

top
---
Mem: 39251812K used, 26724204K free, 21104K shrd, 58340K buff, 23792776K cached
CPU:   9% usr   5% sys   0% nic  84% idle   0% io   0% irq   0% sirq
Load average: 2.02 1.93 1.75 3/4355 114
  PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
    1     0 9008     S   20528m  30%   4   0% java -cp *:/javalibs/* org.apache.jena.fuseki.cmd.FusekiCmd
  109   102 9008     S     1520   0%   7   0% /bin/sh
  102     0 9008     S     1512   0%   1   0% /bin/sh -c TERM="xterm-termite" /bin/sh
  110   109 9008     R     1508   0%   1   0% top
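
A note on reading this printout, with a minimal Python sketch. The Mem: line 
appears to follow the BusyBox top layout, where used plus free gives the total 
memory seen by top. That total comes to roughly 63 GiB, far above the 1 GB 
limit, which suggests the figures describe the cluster node rather than the 
container. Likewise, the VSZ column is virtual address space, not resident 
memory. The function names below are hypothetical.

```python
import re

# The "Mem:" line exactly as printed by top above.
MEM_LINE = ("Mem: 39251812K used, 26724204K free, 21104K shrd, "
            "58340K buff, 23792776K cached")


def parse_mem_line(line: str) -> dict:
    """Map each field name of the 'Mem:' line to its value in KiB."""
    return {name: int(kib) for kib, name in re.findall(r"(\d+)K (\w+)", line)}


def total_gib(fields: dict) -> float:
    """Total memory seen by top (used + free), converted from KiB to GiB."""
    return (fields["used"] + fields["free"]) / (1024 ** 2)


if __name__ == "__main__":
    fields = parse_mem_line(MEM_LINE)
    print(fields)
    print(f"total seen by top: {total_gib(fields):.1f} GiB")
```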




--
Luís
