We have the same issue and have stopped using Jena for that reason.

On 22/04/2024 18:22, Balduin Landolt wrote:

We're running Fuseki 5.0.0 (the last 4.x versions behaved essentially
the same) with roughly 40 million triples, and the number is growing.
Not sure what configuration is relevant, but we have the default graph as
the union graph.
Also, we use Fuseki as our main database, not just as a "view on our data"
so we do quite a bit of updating on the data all the time.

Lately, we've been having more and more issues with servers running out of
disk space because Fuseki's database grew pretty rapidly.
This can be solved by compacting the DB, but with our data and hardware
this takes about 15 minutes, during which Fuseki does not accept any update
requests, so on the production system we can only really do this during
nighttime hours, when (hopefully) no one is using the system anyway.
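
For reference, a nightly compaction like the one described above could be triggered through Fuseki's HTTP admin interface, which offers a compaction operation at `/$/compact/{dataset}` with an optional `deleteOld=true` parameter. The following is only a minimal sketch: the base URL `http://localhost:3030`, the dataset name `ds`, and the helper names are assumptions for illustration, not taken from the setup described here, and any authentication on the admin endpoints is left out.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen


def build_compact_request(base_url, dataset, delete_old=True):
    """Build (but do not send) a POST request for the compaction endpoint.

    base_url and dataset are placeholders; adjust them to your server.
    """
    url = f"{base_url.rstrip('/')}/$/compact/{quote(dataset.strip('/'))}"
    if delete_old:
        # Ask the server to delete the pre-compaction storage afterwards.
        url += "?deleteOld=true"
    return Request(url, data=b"", method="POST")


def trigger_compaction(base_url, dataset, delete_old=True):
    """Send the request and return the HTTP status and response body."""
    req = build_compact_request(base_url, dataset, delete_old)
    with urlopen(req) as resp:
        return resp.status, resp.read().decode("utf-8")
```

Calling `trigger_compaction("http://localhost:3030", "ds")` from a scheduled job (cron or similar) would start the compaction outside working hours; the call is not made automatically in the sketch.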

Some things we've noticed:
- A subset of our data (I think ~20 million triples) takes up 6 GB in
compacted state and is about 5 GB when dumped to a .trig file. But uploading
the same .trig file to an empty DB makes it grow to about 25 GB
- Dropping graphs does not free up disk space
- A sequence of e.g. 10k queries updating only a small number of triples
(maybe 1-10 or so) on the full dataset seems to grow the DB size a lot,
like 10s to 100s of GB (I don't have numbers on this one, but it was
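
To put numbers on observations like the ones in this list, the size of the database directory can be recorded before and after a batch of updates. This is a rough sketch only: the function name and the example path are hypothetical, and it sums apparent file sizes, which may differ from the blocks actually allocated on disk.

```python
import os


def dir_size_bytes(path):
    """Return the sum of apparent file sizes (in bytes) under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so linked files are not counted.
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total
```

For example, calling `dir_size_bytes("/path/to/fuseki/databases/ds")` (a made-up path) before and after running the update queries, and subtracting the two values, gives the growth caused by that batch.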

My question is:
Would that kind of growth in disk usage be expected? Are other people
having similar issues? Are there strategies to mitigate this? Maybe some
configuration that may be tweaked or so?

Best & thanks in advance,

Lingsoft - 30 years of Leading Language Management


Speech Applications - Language Management - Translation - Reader's and Writer's 
Tools - Text Tools - E-books and M-books

Mikael Pesonen
Semantic Technologies

e-mail: mikael.peso...@lingsoft.fi
Tel. +358 2 279 3300

Time zone: GMT+2

Helsinki Office
Eteläranta 10
FI-00130 Helsinki

Turku Office
Kauppiaskatu 5 A
FI-20100 Turku
