Thanks for the explanation! Yes, the load is mainly on disk; CPU usage is at 0.5%. Are there any strategies to get around this? Could the data somehow be compartmentalized so that deleting one compartment would be more efficient?

On 14/11/2024 13:30, Rob @ DNR wrote:
Details matter here, e.g. what storage layer is in use?  How big is the graph 
being deleted?  How many other graphs (and triples) are in the server as a 
whole?  You say a curl request, so can we assume Fuseki?  Are there other 
secondary indices involved, e.g. Jena Text?

---

Most Jena storage, i.e. TDB/TDB2, is quad-oriented behind the scenes, so when you issue a CLEAR 
GRAPH <uri> (or a DROP GRAPH <uri>) it must internally scan each index and delete all quads 
with the relevant <uri> in the graph position.  For indexes where graph comes later in the 
order, e.g. SPOG, these quads could be scattered across the entire index, affecting many 
blocks on disk, which means the whole index needs to be read.
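
To make the point about index order concrete, here is a rough Python sketch. It is not Jena code: it only models an index as a sorted list of (g, s, p, o) tuples cut into fixed-size blocks, and the block size, graph names and data volumes are made-up assumptions. It counts how many blocks hold at least one quad of the graph being cleared, once for a graph-first order (GSPO) and once for a graph-last order (SPOG).

```python
import random

BLOCK_SIZE = 100  # hypothetical number of quads per index block


def blocks_touched(quads, order, graph, block_size=BLOCK_SIZE):
    """Count simulated index blocks holding at least one quad of `graph`.

    `quads` are (g, s, p, o) tuples; `order` is a string such as "GSPO"
    or "SPOG" naming the sort order of the simulated index.
    Returns (blocks containing the graph, total blocks in the index).
    """
    pos = {"G": 0, "S": 1, "P": 2, "O": 3}
    index = sorted(quads, key=lambda q: tuple(q[pos[c]] for c in order))
    touched = {i // block_size for i, q in enumerate(index) if q[0] == graph}
    total = (len(index) + block_size - 1) // block_size
    return len(touched), total


# Made-up data: 10 named graphs with quads spread evenly across them.
random.seed(0)
quads = list({
    (f"g{random.randrange(10)}", f"s{random.randrange(5000)}",
     f"p{random.randrange(20)}", f"o{random.randrange(5000)}")
    for _ in range(20000)
})

gspo = blocks_touched(quads, "GSPO", "g3")
spog = blocks_touched(quads, "SPOG", "g3")
print("GSPO blocks touched / total:", gspo)
print("SPOG blocks touched / total:", spog)
```

In the graph-first order the quads of one graph sit next to each other, so only a small contiguous run of blocks is affected. In the graph-last order the same quads are interleaved with those of every other graph, so most or all blocks end up containing at least one of them.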

For TDB2, which uses copy-on-write data structures, this might also end up 
effectively having to rewrite every single block in the index, which for large 
datasets could take an exceedingly long time.
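
As a simplified illustration of why copy-on-write makes scattered deletes expensive, the sketch below models a generic copy-on-write tree in Python. It is an assumption-laden toy, not TDB2's actual implementation: the function name, the fanout of 64 and the leaf counts are all hypothetical. The idea it shows is that modifying a leaf means writing a new copy of that leaf and of every node on the path up to the root.

```python
def cow_nodes_copied(num_leaves, touched_leaves, fanout=64):
    """Count nodes a copy-on-write tree must copy for a set of modified leaves.

    Leaves are numbered 0..num_leaves-1. Every modified leaf is copied, and
    so is each ancestor of a modified leaf, up to and including the root.
    """
    copied = 0
    level = set(touched_leaves)   # node numbers modified at the current level
    width = num_leaves            # number of nodes at the current level
    while True:
        copied += len(level)
        if width == 1:            # the root level has been counted
            break
        level = {i // fanout for i in level}      # parents of modified nodes
        width = (width + fanout - 1) // fanout    # size of the parent level
    return copied


# Hypothetical index of 4096 leaf blocks: 4096 leaves + 64 inner nodes + 1 root.
scattered = cow_nodes_copied(4096, range(4096))  # deletes hit every leaf
clustered = cow_nodes_copied(4096, range(64))    # deletes hit 64 adjacent leaves
print("scattered:", scattered, "clustered:", clustered)
```

When the deletes land in every leaf, every node of the tree gets copied, which is the "rewrite every single block" case; when they are clustered in adjacent leaves, only those leaves and one path to the root are copied.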

If you have secondary indices involved, e.g. Jena Text, then it is also 
potentially having to make the relevant delete requests to those indices as 
well.

---

So my guess is that, if you look at the server's resource consumption while 
the CLEAR GRAPH is ongoing, you will see a lot of disk IO happening?

Rob


From: Mikael Pesonen <mikael.peso...@lingsoft.fi>
Date: Thursday, 14 November 2024 at 09:21
To: users@jena.apache.org <users@jena.apache.org>
Subject: SPAM-LOW: Slow clear graph
The curl command has now been running for over 24 hours with Jena. What could 
cause that? Shouldn't a clear graph always be done in a few seconds? Isn't it 
an inexpensive operation?

--
Mikael Pesonen
Semantic Technologies

