Re: TDB1/TDB2 disk space with and without named graphs

Andy Seaborne Thu, 16 Nov 2017 09:41:42 -0800


On 16/11/17 11:30, Osma Suominen wrote:

Rob Vesse wrote on 16.11.2017 at 13:13:
This is by design. As has been discussed in the past, tdbloader2 produces maximally packed B+Trees by preprocessing the data, which will minimise disk space usage.
[...]
As Andy mentioned on an earlier thread, tdb2.tdbloader essentially has the same behaviour as tdbloader. Because of the different data structures, load performance should be much better anyway, and he did not think there would be much benefit to having a tdb2.tdbloader2 variant. Also, given the different data structures, I'm not sure if this would be as practical.
Right. I was just surprised at how big the difference is.
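To illustrate why a maximally packed B+Tree takes less space, here is a toy sketch in Python. It is not TDB's B+Tree code: the leaf capacity, the key count and the split-in-half policy are all assumptions chosen for illustration. It compares leaves bulk-built from sorted keys with leaves grown by inserting the same keys one at a time in random order.

```python
import bisect
import random


def packed_leaves(keys, capacity):
    """Bulk-build leaves from sorted keys, filling each leaf completely."""
    ordered = sorted(keys)
    return [ordered[i:i + capacity] for i in range(0, len(ordered), capacity)]


def incremental_leaves(keys, capacity):
    """Insert keys one at a time, splitting a leaf in half when it overflows."""
    leaves = [[]]
    lows = [float("-inf")]  # lower bound (separator key) of each leaf
    for key in keys:
        i = bisect.bisect_right(lows, key) - 1  # leaf whose range holds key
        leaf = leaves[i]
        bisect.insort(leaf, key)
        if len(leaf) > capacity:
            mid = len(leaf) // 2
            right = leaf[mid:]
            del leaf[mid:]
            leaves.insert(i + 1, right)
            lows.insert(i + 1, right[0])
    return leaves


def fill_factor(leaves, capacity):
    """Fraction of the allocated leaf slots that actually hold a key."""
    return sum(len(leaf) for leaf in leaves) / (len(leaves) * capacity)


random.seed(42)
capacity = 100                 # hypothetical number of entries per leaf block
keys = list(range(100_000))    # hypothetical, distinct keys
random.shuffle(keys)

packed = packed_leaves(keys, capacity)
grown = incremental_leaves(keys, capacity)

print(f"packed: {len(packed)} leaves, fill {fill_factor(packed, capacity):.2f}")
print(f"grown:  {len(grown)} leaves, fill {fill_factor(grown, capacity):.2f}")
```

In this model the packed build fills every leaf, while the incrementally grown tree leaves each block somewhere between half full and full, so it needs more blocks for the same data. How closely this matches the real loaders depends on details not covered in this thread.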

tdbloader2 is both fast and space-efficient,


but does not run on MS Windows.

that makes it a lot more appealing than tdb2.tdbloader which in my (very limited) experience is slow and space-hungry (but similar to tdbloader for TDB1).
But the real surprise was the space overhead of named graphs. More than twice the space just because I decide to put the data in a named graph instead of the default graph? And that seems to be the case both for TDB1 (both tdbloader and tdbloader2) and TDB2.
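A back-of-the-envelope estimate of where a factor of more than two could come from, as a Python sketch. It rests on assumptions that are not stated in this thread: that the default TDB layout keeps three triple indexes (SPO, POS, OSP) for the default graph and six quad indexes (GSPO, GPOS, GOSP, POSG, OSPG, SPOG) for named graphs, and that each index entry stores one 8-byte NodeId per position. The node table and B+Tree block overhead are ignored.

```python
NODE_ID_BYTES = 8  # assumption: each NodeId in an index entry is 8 bytes

# Assumption: default index sets for the default graph and for named graphs.
TRIPLE_INDEXES = ["SPO", "POS", "OSP"]
QUAD_INDEXES = ["GSPO", "GPOS", "GOSP", "POSG", "OSPG", "SPOG"]


def index_bytes_per_statement(indexes):
    """Bytes of index entries written for one statement across all indexes."""
    return sum(len(name) * NODE_ID_BYTES for name in indexes)


triple_bytes = index_bytes_per_statement(TRIPLE_INDEXES)
quad_bytes = index_bytes_per_statement(QUAD_INDEXES)
ratio = quad_bytes / triple_bytes

print(f"default graph: {triple_bytes} bytes of index entries per triple")
print(f"named graph:   {quad_bytes} bytes of index entries per quad")
print(f"ratio:         {ratio:.2f}")
```

Under those assumptions a quad costs 192 bytes of index entries against 72 for a triple, a ratio of about 2.67, which is in line with the "more than twice" observed above.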

The tdbloader in TDB2 is a simple one. I found that this current simple one was faster than expected (maybe because it is append-only and so disk-friendly, or maybe because I was using an SSD for testing).

Rather than wait for that work area to be done, I thought it would be good to contribute and now release TDB2. It's experimental.


The TDB1 loaders could be ported.

TDB2 goals are to address the scale limitations on transactions, the write-back queue overload problems, a better architecture (e.g. fully integrating jena-text into transactions), and no quirks about models across transactions.

The style of tdbloader2 could be used to compact databases down. The current compaction is a simple one, good for getting a trusted one to work.


        Andy
