Thanks for the explanation! Yes, it's mainly happening on disk; CPU usage
is at 0.5%. Are there any strategies to get around this? Could the data
somehow be compartmentalized so that deleting one compartment would be more
efficient?
On 14/11/2024 13:30, Rob @ DNR wrote:
Details matter here, e.g. what storage layer is in use? How big is the graph
being deleted? How many other graphs (and triples) are in the server as a
whole? You say a curl request, so can we assume Fuseki? Are there other
secondary indices involved, e.g. Jena Text?
---
Most Jena storage, i.e. TDB/TDB2, is quad-oriented behind the scenes, so when you issue a CLEAR
GRAPH <uri> (or a DROP GRAPH <uri>) it must internally scan each index and delete all quads
with the relevant <uri> in the graph position. For indexes where graph comes later in the
order, e.g. SPOG, these quads could be scattered across the entire index, affecting many
blocks on disk, which means the whole index needs to be read.
For TDB2, which uses copy-on-write data structures, this might also end up
effectively having to rewrite every single block in the index, which for large
datasets could take an exceedingly long time.
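
To make the scattering concrete, here is a toy Python sketch (not Jena code).
It assumes an index is nothing more than a sorted list of quads cut into
fixed-size blocks; the block size, graph names and quad counts are invented
for illustration and say nothing about TDB's real on-disk layout.

```python
# Toy model only: a sorted list of tuples stands in for an index, and
# fixed-size slices of that list stand in for disk blocks.
from itertools import product

BLOCK_SIZE = 100  # quads per block; an arbitrary illustrative value


def build_quads(n_graphs=10, n_subjects=1000):
    """Every subject appears once in every graph, as (g, s, p, o) tuples."""
    return [(f"g{g}", f"s{s:04d}", "p", "o")
            for g, s in product(range(n_graphs), range(n_subjects))]


def blocks_touched(quads, order, graph):
    """Count blocks holding at least one quad of `graph` when the quads are
    sorted in the given key order, e.g. "gspo" or "spog"."""
    pos = {"g": 0, "s": 1, "p": 2, "o": 3}
    index = sorted(quads, key=lambda q: tuple(q[pos[c]] for c in order))
    touched = {i // BLOCK_SIZE for i, q in enumerate(index) if q[0] == graph}
    total = (len(index) + BLOCK_SIZE - 1) // BLOCK_SIZE
    return len(touched), total


quads = build_quads()
for order in ("gspo", "spog"):
    touched, total = blocks_touched(quads, order, "g3")
    print(f"{order.upper()}: {touched} of {total} blocks contain quads from g3")
```

In this toy model the quads of one graph sit in 10 of 100 blocks when the
graph comes first in the sort order, but in all 100 blocks when it comes
last. With a copy-on-write structure, each of those blocks is one that has to
be written out again.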
If you have secondary indices involved, e.g. Jena Text, then it potentially
also has to make the relevant delete requests to those indices.
---
So my guess is that, if you look at your server's resource consumption while
the CLEAR GRAPH is ongoing, you will see a lot of disk IO happening?
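
One rough way to check this on a Linux host is to sample the kernel's
per-device counters twice and compare them. The Python sketch below is an
illustration under assumptions, not something from this thread: it assumes
Linux's /proc/diskstats is available, and the device name "sda", the
5-second interval and the sample line are placeholders.

```python
import time

SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


def parse_diskstats(text, device):
    """Return cumulative (bytes_read, bytes_written) for `device`."""
    for line in text.splitlines():
        fields = line.split()
        # fields: major, minor, name, reads, reads merged, sectors read,
        # ms reading, writes, writes merged, sectors written, ...
        if len(fields) >= 10 and fields[2] == device:
            return int(fields[5]) * SECTOR_BYTES, int(fields[9]) * SECTOR_BYTES
    raise ValueError(f"device {device!r} not found")


def sample_throughput(device="sda", interval=5.0, path="/proc/diskstats"):
    """Return (read_bytes_per_s, write_bytes_per_s) over `interval` seconds."""
    with open(path) as f:
        r0, w0 = parse_diskstats(f.read(), device)
    time.sleep(interval)
    with open(path) as f:
        r1, w1 = parse_diskstats(f.read(), device)
    return (r1 - r0) / interval, (w1 - w0) / interval


# Made-up sample line, used here only to show what the parser extracts.
SAMPLE = "   8       0 sda 1000 0 2048 500 2000 0 4096 900 0 1200 1400\n"
print(parse_diskstats(SAMPLE, "sda"))

# On the server itself (device name is a placeholder):
# read_rate, write_rate = sample_throughput("sda")
```

Sustained high read and write rates alongside near-idle CPU would be
consistent with the operation being bound by disk IO.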
Rob
From: Mikael Pesonen <mikael.peso...@lingsoft.fi>
Date: Thursday, 14 November 2024 at 09:21
To: users@jena.apache.org <users@jena.apache.org>
Subject: Slow clear graph
A curl command has now been running for over 24 hours with Jena. What could
cause that? Shouldn't clear graph always be done in a few seconds? It's not an
expensive operation, is it?
--
Mikael Pesonen