Hi,

I've been testing how much disk space TDB1 and TDB2 databases require. I was surprised to find out that the disk usage varies quite a lot depending on whether data is loaded to the default graph vs. a named graph. For TDB1 it also depends on the loading method: tdbloader2 gives much more compact databases than tdbloader.

My dataset is about 38M triples of bibliographic data in an N-Triples file. There are some blank nodes in the data (unfortunately). Here I'm testing only the initial load without a preexisting database.

I used Jena 3.5.0 command line tools. I didn't pay much attention to loading times since I'm running this on a VM with shared CPU and disk resources and the background load varies over time. I think tdb2.tdbloader with named graphs was the slowest at around 42 minutes.

For tdb2.tdbloader and tdbloader2 I had to convert the file to N-Quads first to be able to load into a named graph. With tdb2.tdbloader, loading into a named graph via the --graph option didn't work (JENA-1422, being worked on by Andy); tdbloader2 doesn't provide a --graph option at all.
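
The N-Triples to N-Quads conversion mentioned above can be sketched in Python roughly like this. It is only an illustration and not necessarily how the file was converted for these tests. The function name and the graph IRI are made up, and the sketch assumes that every statement sits on a single line ending in " ." with no trailing comment.

```python
def ntriples_to_nquads(lines, graph_iri):
    """Yield N-Quads lines by adding graph_iri as the fourth term of each triple.

    Assumption: one complete statement per line, no comment after the final dot.
    """
    for line in lines:
        stripped = line.strip()
        # Pass blank lines and comment lines through unchanged.
        if not stripped or stripped.startswith("#"):
            yield stripped
            continue
        if not stripped.endswith("."):
            raise ValueError(f"not a complete N-Triples statement: {stripped!r}")
        # Drop the terminating dot, then append the graph label and a new dot.
        body = stripped[:-1].rstrip()
        yield f"{body} <{graph_iri}> ."


if __name__ == "__main__":
    # Hypothetical sample triple and graph IRI, for illustration only.
    sample = ['<http://example.org/s> <http://example.org/p> "o" .']
    for quad in ntriples_to_nquads(sample, "http://example.org/graph/bib"):
        print(quad)
```

Since the function works line by line on any iterable, it can be fed an open file object, so a 38M-triple file never has to fit in memory.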


TDB1 results:

5.2G    tdb-bib-default-tdbloader
3.3G    tdb-bib-default-tdbloader2

13G     tdb-bib-named-tdbloader
7.6G    tdb-bib-named-tdbloader2


TDB2 results:

5.2G    tdb2-bib-default
12G     tdb2-bib-named


Some conclusions:

* Loading the same data into a named graph instead of the default graph uses a lot more disk space
* There is no huge difference between TDB1 and TDB2 disk space usage when doing an apples-to-apples comparison (i.e. either using only the default graph for both, or a named graph for both)
* For TDB1, using tdbloader2 instead of tdbloader results in a much smaller (around 40%) database, both when using the default graph only and when using a named graph
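
For anyone who wants to check the arithmetic behind these conclusions, here is a small Python sketch using the sizes reported above. The dictionary keys are the directory names from the du listings. The helper names are made up for this example, and the inputs are the rounded du figures, so the results are only approximate.

```python
# Database sizes in GB, copied from the du output above.
sizes_gb = {
    "tdb-bib-default-tdbloader": 5.2,
    "tdb-bib-default-tdbloader2": 3.3,
    "tdb-bib-named-tdbloader": 13.0,
    "tdb-bib-named-tdbloader2": 7.6,
    "tdb2-bib-default": 5.2,
    "tdb2-bib-named": 12.0,
}


def percent_smaller(smaller, larger):
    """How much smaller (in percent) the first database is than the second."""
    return 100 * (1 - sizes_gb[smaller] / sizes_gb[larger])


def size_ratio(named, default):
    """How many times larger the named-graph database is than the default-graph one."""
    return sizes_gb[named] / sizes_gb[default]


if __name__ == "__main__":
    # tdbloader2 vs. tdbloader savings for TDB1.
    print(percent_smaller("tdb-bib-default-tdbloader2", "tdb-bib-default-tdbloader"))
    print(percent_smaller("tdb-bib-named-tdbloader2", "tdb-bib-named-tdbloader"))
    # Named graph vs. default graph overhead.
    print(size_ratio("tdb-bib-named-tdbloader", "tdb-bib-default-tdbloader"))
    print(size_ratio("tdb-bib-named-tdbloader2", "tdb-bib-default-tdbloader2"))
    print(size_ratio("tdb2-bib-named", "tdb2-bib-default"))
```

With these rounded inputs, the tdbloader2 savings come out at roughly 37% (default graph) and 42% (named graph), and the named-graph databases are about 2.3 to 2.5 times the size of their default-graph counterparts.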

My larger goal is to decide whether to use TDB1 or TDB2 (or something else, like HDT or Blazegraph...) for a new bibliographic Linked Data service. Disk space is a factor (though not the most important one) in the calculation.

-Osma


--
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 26 (Kaikukatu 4)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
[email protected]
http://www.nationallibrary.fi
