Hi,
I've been testing how much disk space TDB1 and TDB2 databases require. I
was surprised to find out that the disk usage varies quite a lot
depending on whether data is loaded to the default graph vs. a named
graph. For TDB1 it also depends on the loading method: tdbloader2 gives
much more compact databases than tdbloader.
My dataset is about 38M triples of bibliographic data in an N-Triples
file. There are some blank nodes in the data (unfortunately). Here I'm
testing only the initial load without a preexisting database.
I used Jena 3.5.0 command line tools. I didn't pay much attention to
loading times since I'm running this on a VM with shared CPU and disk
resources and the background load varies over time. I think
tdb2.tdbloader with named graphs was the slowest at around 42 minutes.
To load into a named graph with tdb2.tdbloader and tdbloader2, I first
had to convert the file to N-Quads. With tdb2.tdbloader, loading the
N-Triples file directly into a named graph didn't work (JENA-1422, being
worked on by Andy), and tdbloader2 doesn't provide a --graph option at all.
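The conversion itself is mechanical: an N-Quads statement is an N-Triples
statement with a graph label inserted before the terminating dot. Below is a
minimal Python sketch of that idea. The function name nt_to_nq, the sample
triples and the graph IRI are hypothetical and are not what was used for the
tests above. It assumes well-formed input with one statement per line and no
trailing comment after the final dot.

```python
from typing import Iterable, Iterator


def nt_to_nq(lines: Iterable[str], graph_iri: str) -> Iterator[str]:
    """Yield N-Quads statements that put every input triple into graph_iri.

    Assumes well-formed N-Triples with one statement per line and nothing
    after the terminating dot (no trailing comments).
    """
    for line in lines:
        statement = line.strip()
        # Skip blank lines and whole-line comments.
        if not statement or statement.startswith("#"):
            continue
        if not statement.endswith("."):
            raise ValueError(f"not an N-Triples statement: {statement!r}")
        # Drop the terminating dot, then append the graph label and a new dot.
        triple = statement[:-1].rstrip()
        yield f"{triple} <{graph_iri}> ."


# Hypothetical sample data and graph IRI, for illustration only.
sample = [
    '<http://example.org/book1> <http://purl.org/dc/terms/title> "Example." .',
    "_:b0 <http://purl.org/dc/terms/creator> <http://example.org/person1> .",
]
for quad in nt_to_nq(sample, "http://example.org/graph/bib"):
    print(quad)
```

Because nt_to_nq is a generator, it can be fed an open file object and
streamed line by line, so a file of this size never has to fit in memory.
Blank node labels are passed through unchanged.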
TDB1 results:
5.2G tdb-bib-default-tdbloader
3.3G tdb-bib-default-tdbloader2
13G tdb-bib-named-tdbloader
7.6G tdb-bib-named-tdbloader2
TDB2 results:
5.2G tdb2-bib-default
12G tdb2-bib-named
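For anyone who wants to repeat the measurement on their own data, here is a
rough Python sketch that sums file sizes under a database directory, similar
in spirit to du -s. The helper name dir_sizes is hypothetical. It reports both
the apparent size and the space actually allocated on disk, since the two can
differ if any files are sparse. The allocated figure relies on st_blocks, which
is only available on POSIX systems.

```python
import os
from typing import Tuple


def dir_sizes(path: str) -> Tuple[int, int]:
    """Return (apparent_bytes, allocated_bytes) for all files under path."""
    apparent = 0
    allocated = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            st = os.lstat(os.path.join(root, name))
            apparent += st.st_size
            # st_blocks counts 512-byte units on POSIX; it is absent on Windows,
            # in which case the allocated total stays at 0.
            allocated += getattr(st, "st_blocks", 0) * 512
    return apparent, allocated


# Directory names taken from the listings above; skipped if not present.
for db in ("tdb-bib-default-tdbloader", "tdb2-bib-default"):
    if os.path.isdir(db):
        apparent, allocated = dir_sizes(db)
        print(f"{db}: apparent {apparent / 1e9:.1f} GB, allocated {allocated / 1e9:.1f} GB")
```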
Some conclusions:
* Loading the same data into a named graph instead of the default graph
uses a lot more disk space (roughly 2.3 to 2.5 times as much in each of
the cases above)
* There is no huge difference between TDB1 and TDB2 disk space usage
when doing an apples-to-apples comparison (i.e. either using only the
default graph for both, or a named graph for both)
* For TDB1, using tdbloader2 instead of tdbloader results in a much
smaller database (around 40% smaller), both when using the default graph
only and when using a named graph
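The ratios behind these conclusions can be checked with a few lines of Python.
This is only a sketch over the rounded sizes listed above, so the results are
approximate. The dictionary and helper names are made up for the example.

```python
# Sizes in GB as listed above (rounded, so the ratios are approximate).
sizes_gb = {
    "tdb-bib-default-tdbloader": 5.2,
    "tdb-bib-default-tdbloader2": 3.3,
    "tdb-bib-named-tdbloader": 13.0,
    "tdb-bib-named-tdbloader2": 7.6,
    "tdb2-bib-default": 5.2,
    "tdb2-bib-named": 12.0,
}


def ratio(larger: str, smaller: str) -> float:
    """Size of one database relative to another."""
    return sizes_gb[larger] / sizes_gb[smaller]


# Named graph vs. default graph, per loader.
named_vs_default = {
    "TDB1 tdbloader": ratio("tdb-bib-named-tdbloader", "tdb-bib-default-tdbloader"),
    "TDB1 tdbloader2": ratio("tdb-bib-named-tdbloader2", "tdb-bib-default-tdbloader2"),
    "TDB2 tdb2.tdbloader": ratio("tdb2-bib-named", "tdb2-bib-default"),
}

# Space saved by tdbloader2 relative to tdbloader (TDB1 only).
tdbloader2_saving = {
    "default graph": 1 - ratio("tdb-bib-default-tdbloader2", "tdb-bib-default-tdbloader"),
    "named graph": 1 - ratio("tdb-bib-named-tdbloader2", "tdb-bib-named-tdbloader"),
}

for label, value in named_vs_default.items():
    print(f"{label}: named graph is {value:.2f}x the default graph size")
for label, value in tdbloader2_saving.items():
    print(f"tdbloader2, {label}: {value:.0%} smaller than tdbloader")
```

With the rounded figures this gives about 2.3x to 2.5x for named vs. default
graph, and about 37% (default graph) and 42% (named graph) savings for
tdbloader2.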
My larger goal is to decide whether to use TDB1 or TDB2 (or something
else, like HDT or Blazegraph...) for a new bibliographic Linked Data
service. Disk space is a factor (though not the most important one) in
the calculation.
-Osma
--
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 26 (Kaikukatu 4)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
http://www.nationallibrary.fi