TDB2 - technical background

Andy Seaborne Sun, 01 Jan 2017 09:17:51 -0800

TDB2 is an upgrade of TDB. It provides fully scalable transactions, e.g. loading hundreds of millions of triples into a live database.


Releases will soon be available from Maven Central:


   <dependency>
     <groupId>org.seaborne.mantis</groupId>
     <artifactId>tdb2</artifactId>
     <version>0.2.0</version>
   </dependency>

** TDB2 databases are not compatible with Apache Jena TDB (TDB1). **
Bad things will happen.

Currently, it is undergoing final testing.



So what are the differences between TDB1 and TDB2?

== Technical Changes

= Indexes

Indexes are now copy-on-write B+Trees which are immutable once a transaction commits. Slightly confusingly, these are called "persistent data structures" in the literature - this does not refer to being on-disk but to the fact that they are there permanently and are not lost on a later update.


Jena TIM uses Dexx Collections for the same purpose.
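To make the copy-on-write idea concrete, here is a minimal sketch using an in-memory binary search tree rather than an on-disk B+Tree. It is not TDB2 code, and the names CowTree, Node, insert and contains are made up for illustration. An insert copies only the nodes on the path from the root to the change and shares everything else with the previous version, so the old root still describes the old, unchanged tree.

```java
// Illustration only: a persistent (copy-on-write) binary search tree.
// TDB2 itself uses on-disk B+Trees; all names here are hypothetical.
final class CowTree {

    // Nodes are immutable: an insert never modifies an existing node.
    static final class Node {
        final int key;
        final Node left;
        final Node right;

        Node(int key, Node left, Node right) {
            this.key = key;
            this.left = left;
            this.right = right;
        }
    }

    // Returns the root of a new version of the tree that contains key.
    // Only nodes on the path to the insertion point are copied;
    // untouched subtrees are shared with the old version.
    static Node insert(Node root, int key) {
        if (root == null)
            return new Node(key, null, null);
        if (key < root.key)
            return new Node(root.key, insert(root.left, key), root.right);
        if (key > root.key)
            return new Node(root.key, root.left, insert(root.right, key));
        return root; // key already present: reuse the old version as-is
    }

    static boolean contains(Node root, int key) {
        while (root != null) {
            if (key == root.key)
                return true;
            root = key < root.key ? root.left : root.right;
        }
        return false;
    }

    public static void main(String[] args) {
        Node v1 = insert(insert(insert(null, 5), 3), 8);
        Node v2 = insert(v1, 4);                  // new version; v1 is untouched
        System.out.println(contains(v1, 4));      // false
        System.out.println(contains(v2, 4));      // true
        System.out.println(v1.right == v2.right); // true: subtree is shared
    }
}
```

A reader holding v1 keeps seeing a consistent tree while a writer builds v2, which is the property the indexes rely on.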

Updates happen to the B+Trees as a write transaction progresses. The journal now only holds a small amount of data recording the new state of the tree: its root pointer and the 2 file limits for the branches and leaves files. 24 bytes.


This has several desirable effects:

* Write-once
* Writer-pays
* No in-memory copy

Data is written straight into the indexes and is flushed to disk by the OS while the transaction runs (i.e. it's asynchronous to the data updates). Changes in the data do not go into the journal at all; only index state goes into the journal. In TDB1, changes are written to the journal then later written to disk as well as buffered in-memory. The final sync() happens as the writer commits.

Active readers no longer hold up the write-back, so that source of growing journals has been eliminated as well.
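As a rough picture of how small the per-index journal entry is, the sketch below assumes the 24 bytes are three 64-bit values: the root pointer and the two file limits. That layout, the field order, and the names JournalEntrySketch, rootId, branchLimit and leafLimit are assumptions for illustration, not the actual TDB2 on-disk format.

```java
import java.nio.ByteBuffer;

// Illustration only: a 24-byte record holding the committed state of one index.
// Assumes three 64-bit fields; names and field order are hypothetical.
final class JournalEntrySketch {
    static final int SIZE = 3 * Long.BYTES; // 24 bytes

    final long rootId;      // root pointer of the B+Tree after commit
    final long branchLimit; // committed limit of the branches file
    final long leafLimit;   // committed limit of the leaves file

    JournalEntrySketch(long rootId, long branchLimit, long leafLimit) {
        this.rootId = rootId;
        this.branchLimit = branchLimit;
        this.leafLimit = leafLimit;
    }

    // Serialise the three values into a buffer ready for writing.
    ByteBuffer encode() {
        ByteBuffer bb = ByteBuffer.allocate(SIZE);
        bb.putLong(rootId).putLong(branchLimit).putLong(leafLimit);
        bb.flip();
        return bb;
    }

    // Read the three values back in the same order.
    static JournalEntrySketch decode(ByteBuffer bb) {
        long root = bb.getLong();
        long branches = bb.getLong();
        long leaves = bb.getLong();
        return new JournalEntrySketch(root, branches, leaves);
    }
}
```

Anything in the files beyond the recorded limits belongs to an uncommitted writer and can simply be ignored on recovery, which is why the data itself never needs to pass through the journal.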


= Nodes

The node data is now held in a binary form (using RDF/Thrift).

The NodeId format has been revised: datatypes are always retained, even for inline values (so xsd:int does not become xsd:integer; "001" still becomes "1").
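The effect on inline values can be sketched as follows. This is not the real NodeId bit layout; InlineValueSketch and its methods are hypothetical, and datatypes are plain strings here. The point is only that the value is stored as a number, so the original lexical form is lost, while the datatype is kept alongside it.

```java
// Illustration only: an inlined integer literal that keeps its datatype.
// Not the real NodeId encoding; all names are hypothetical.
final class InlineValueSketch {
    final String datatype; // retained exactly as given, e.g. "xsd:int"
    final long value;      // the numeric value, not the original characters

    private InlineValueSketch(String datatype, long value) {
        this.datatype = datatype;
        this.value = value;
    }

    // Inline a literal: parse the lexical form, keep the datatype unchanged.
    static InlineValueSketch inline(String lexical, String datatype) {
        return new InlineValueSketch(datatype, Long.parseLong(lexical));
    }

    // The lexical form is regenerated from the value, so "001" comes back as "1".
    String lexical() {
        return Long.toString(value);
    }
}
```

For example, `InlineValueSketch.inline("001", "xsd:int")` yields datatype "xsd:int" and lexical form "1".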


= Transactions

There is a completely new transaction mechanism. It is now a general framework that can work with multiple components. A TDB2 database is a number of such components - one per index, and also the node table. It could be enhanced to provide multiple dataset transactions and work with external indexes. The API on datasets is unchanged.
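One way to picture a transaction framework built from components is a coordinator that asks every registered component to prepare and only commits if all of them succeed. The sketch below is a generic two-phase illustration under that assumption; the names Component, Coordinator, Recording, prepare, commit and abort are hypothetical and are not the TDB2 interfaces.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: a coordinator driving several transactional components.
// All names are hypothetical; this is not the TDB2 transaction API.
final class TxnSketch {

    interface Component {
        void prepare(); // make changes durable; throw to veto the commit
        void commit();  // make the prepared state the current state
        void abort();   // discard any work done in this transaction
    }

    static final class Coordinator {
        private final List<Component> components = new ArrayList<>();

        void register(Component c) {
            components.add(c);
        }

        // Returns true if every component committed, false if the transaction aborted.
        boolean commit() {
            try {
                for (Component c : components)
                    c.prepare();
            } catch (RuntimeException e) {
                for (Component c : components)
                    c.abort();
                return false;
            }
            for (Component c : components)
                c.commit();
            return true;
        }
    }

    // A stand-in component (e.g. one index or the node table) that records calls.
    static final class Recording implements Component {
        private final String name;
        private final boolean failOnPrepare;
        private final List<String> log;

        Recording(String name, boolean failOnPrepare, List<String> log) {
            this.name = name;
            this.failOnPrepare = failOnPrepare;
            this.log = log;
        }

        @Override public void prepare() {
            if (failOnPrepare)
                throw new IllegalStateException(name + " cannot prepare");
            log.add(name + ":prepare");
        }
        @Override public void commit() { log.add(name + ":commit"); }
        @Override public void abort()  { log.add(name + ":abort"); }
    }
}
```

Because the coordinator only sees the Component interface, adding an external index or a second dataset would amount to registering another component.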


== Status

The one remaining work item is to provide storage reclamation. The index style means indexes grow in size. A means to GC the database, pruning it to a specific version, is needed. At the moment, this can be done with a backup/restore.


== Possibilities

Given this design, some features are possible, i.e. could be done but aren't.

"See into the past" - a read-transaction can be started that sees some specific committed state from the past, not the latest commit. The database does not forget any committed changes unless storage is reclaimed.

This can also be used to reset the whole database to a point in the past and then allow it to evolve from there. (Actually, branching from the old version is also possible technically but it will probably cause general chaos to have a database that has branched without a way to merge.)
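Both possibilities fall out of keeping every committed root. The sketch below illustrates this with an immutable linked list standing in for the index; it is not TDB2 code and the names VersionedStoreSketch, commit, readAt and resetTo are hypothetical. Each commit records a new root, a read can start from any recorded root, and a reset simply drops the later roots.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: a store that remembers the root of every committed version.
// An immutable linked list stands in for the index; all names are hypothetical.
final class VersionedStoreSketch {

    // Immutable cell: older versions are never modified by later commits.
    static final class Cell {
        final String item;
        final Cell next;

        Cell(String item, Cell next) {
            this.item = item;
            this.next = next;
        }
    }

    // roots.get(v) is the state after commit v; version 0 is the empty store.
    private final List<Cell> roots = new ArrayList<>();

    VersionedStoreSketch() {
        roots.add(null);
    }

    int latestVersion() {
        return roots.size() - 1;
    }

    // Commit one item on top of the latest version; returns the new version number.
    int commit(String item) {
        Cell latest = roots.get(latestVersion());
        roots.add(new Cell(item, latest));
        return latestVersion();
    }

    // "See into the past": read the state as of a given committed version.
    List<String> readAt(int version) {
        List<String> out = new ArrayList<>();
        for (Cell c = roots.get(version); c != null; c = c.next)
            out.add(c.item); // newest item first
        return out;
    }

    // Reset to a past version; later commits are forgotten and history continues from there.
    void resetTo(int version) {
        roots.subList(version + 1, roots.size()).clear();
    }
}
```

Storage reclamation would correspond to discarding old roots and whatever is reachable only from them, after which those past states can no longer be read.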
