I don’t have any specific queries to suggest, since I have no familiarity with 
the database, but here is an observation on the query shown:

 

Scanning the whole database and counting the triples inherently requires a full 
traversal of the B-Tree, so what you have shown so far is an extreme corner 
case. Of course, more complex queries could in fact require multiple traversals 
over the whole database and show a wider performance difference, but YMMV.
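
To put rough numbers on that, here is a back-of-the-envelope sketch. It is a toy 
model and not how TDB is actually implemented: the triple count, the number of 
keys per page, and the helper names are all made-up assumptions. It compares 
how many pages a full scan touches with how many a single key lookup touches.

```python
import math

def leaf_pages(num_keys: int, keys_per_page: int) -> int:
    """Leaf pages needed to hold num_keys entries (pages assumed full)."""
    return math.ceil(num_keys / keys_per_page)

def tree_height(num_keys: int, keys_per_page: int) -> int:
    """Levels from root to leaf, assuming every page holds keys_per_page entries."""
    pages = leaf_pages(num_keys, keys_per_page)
    height = 1
    while pages > 1:
        # Each level up needs one entry per page of the level below.
        pages = math.ceil(pages / keys_per_page)
        height += 1
    return height

# Hypothetical numbers, chosen only for illustration.
NUM_TRIPLES = 1_000_000_000
KEYS_PER_PAGE = 100

# A COUNT(*) over all triples has to visit every leaf page once.
full_scan_pages = leaf_pages(NUM_TRIPLES, KEYS_PER_PAGE)
# A lookup of one key only walks a single root-to-leaf path.
point_lookup_pages = tree_height(NUM_TRIPLES, KEYS_PER_PAGE)

print(full_scan_pages, point_lookup_pages)
```

Under these assumptions the full scan touches millions of pages while a single 
lookup touches a handful, which is why the count query sits at one extreme of 
the range.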

 

Rob

 

From: Wolfgang Fahl <[email protected]>
Organisation: BITPlan GmbH
Reply to: <[email protected]>
Date: Friday, 31 July 2020 at 08:51
To: <[email protected]>
Subject: Difference in query speed for rotating disk and SSD

 

Dear Apache Jena users,

the experience with the 
http://wiki.bitplan.com/index.php/Get_your_own_copy_of_WikiData trials and the 
unanswered question
https://stackoverflow.com/questions/61813248/jena-tdbloader-performance-and-limits
 led me to the assumption that it would be possible
to run the Wikidata import for Jena on a costly 4 TB SSD, but then use the 
resulting database on a much cheaper rotating disk and see not much of a 
performance difference for queries compared to running them from the SSD. 

My assumption was based on the 
https://jena.apache.org/documentation/tdb/architecture.html and the mentioned 
use of https://en.wikipedia.org/wiki/B+_tree.
I thought the B+ tree approach is optimized to ensure that not too many 
time-consuming seeks are necessary when fetching data during a query.

My experiment at 
http://wiki.bitplan.com/index.php/WikiData_Import_2020-07-15#log_for_query 
shows a different result for the query:
SELECT (COUNT(*) as ?Triples) WHERE { ?s ?p ?o}
 
It takes 31.501 secs on a rotating disk, which is only a bit slower than the SSD 
in raw I/O but has the seek time of a rotating disk. The SSD does not have this 
performance penalty, and the query takes 5.516 secs on the SSD. 

Would other queries see the same difference of roughly factor 6, or does the 
speed difference depend on the query? Please suggest some queries that I might 
test and then I will report the results here.
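
For reference, the factor can be worked out from the two timings quoted above. 
This is plain arithmetic on the reported numbers; the variable names are only 
illustrative.

```python
# Timings reported above for SELECT (COUNT(*) as ?Triples) WHERE { ?s ?p ?o}
rotating_disk_secs = 31.501
ssd_secs = 5.516

# How many times slower the rotating disk was for this one query.
ratio = rotating_disk_secs / ssd_secs
print(f"{ratio:.2f}")
```

The result is a little under 6, and it applies only to this single query.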

Cheers

  Wolfgang

 
-- 
Wolfgang Fahl
Pater-Delp-Str. 1, D-47877 Willich Schiefbahn
Tel. +49 2154 811-480, Fax +49 2154 811-481
Web: http://www.bitplan.de
