On 23/11/2021 09:40, Rob Vesse wrote:
Marco

So there's a couple of things going on.

Firstly the Node Table, the mapping of RDF Terms to the internal Node IDs used 
in the indexes, can only ever grow.  TDB2 doesn't do reference counting, so it 
never removes entries from the table because it doesn't know when a Node ID is 
no longer needed.  Also, for RDF Terms that aren't inlined directly into the 
Node ID (as some numerics, booleans, dates etc. are), so primarily URIs, Blank 
Nodes and larger/arbitrarily typed literals, the Node ID actually encodes the 
offset into the Node Table, which makes Node ID to RDF Term decoding fast but 
means you can't just arbitrarily rewrite the Node Table.  And even if rewriting 
the Node Table were supported it would require rewriting all the indexes since 
those use the Node IDs.
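
To make the offset point concrete, here is a rough, purely illustrative Java 
sketch of the idea of inline vs. table-backed Node IDs.  The class name, bit 
layout and methods are invented for illustration; this is not TDB2's actual 
encoding:

    // Hypothetical sketch only - the real TDB2 NodeId encoding differs.
    public final class NodeIdSketch {
        // Top bit marks a value that is packed ("inlined") into the ID itself.
        private static final long INLINE_FLAG = 1L << 63;

        // Pack a small non-negative value (certain numerics, booleans, dates...)
        // into the ID itself: no Node Table entry is needed for it.
        static long inlineInteger(long value) {
            return INLINE_FLAG | (value & 0x7FFF_FFFF_FFFF_FFFFL);
        }

        // URIs, blank nodes and larger literals get an ID that is simply the
        // byte offset of the term's record in the Node Table file, so decoding
        // an ID back to a term is a direct seek - which is also why the table
        // cannot be compacted without rewriting every index that stores IDs.
        static long tableOffset(long byteOffset) {
            return byteOffset;                 // flag bit not set
        }

        static boolean isInline(long nodeId) {
            return (nodeId & INLINE_FLAG) != 0;
        }
    }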

TL;DR the Node Table only grows because the cost of compacting it outweighs the 
benefits.  This is also why you may have seen advice in the past that if your 
database has a lot of DELETE operations made against it, then periodically 
dumping all the data and reloading it into a new database is recommended, since 
that generates a fresh Node Table containing only the RDF Terms currently in use.
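
As a rough illustration of that dump-and-reload approach, here is a minimal 
Java sketch using the public Jena APIs.  The database locations "DB-old" and 
"DB-new" and the file name "dump.nq" are placeholders; the tdb2.tdbdump and 
tdb2.tdbloader command-line tools achieve the same thing:

    import java.io.FileOutputStream;
    import java.io.OutputStream;

    import org.apache.jena.query.Dataset;
    import org.apache.jena.riot.Lang;
    import org.apache.jena.riot.RDFDataMgr;
    import org.apache.jena.system.Txn;
    import org.apache.jena.tdb2.TDB2Factory;

    public class DumpAndReload {
        public static void main(String[] args) throws Exception {
            // Dump the old database as N-Quads inside a read transaction.
            Dataset oldDb = TDB2Factory.connectDataset("DB-old");
            try (OutputStream out = new FileOutputStream("dump.nq")) {
                Txn.executeRead(oldDb, () ->
                    RDFDataMgr.write(out, oldDb.asDatasetGraph(), Lang.NQUADS));
            }

            // Reload into a fresh database: this builds a new Node Table that
            // contains only the RDF Terms still referenced by the data.
            Dataset freshDb = TDB2Factory.connectDataset("DB-new");
            Txn.executeWrite(freshDb, () -> RDFDataMgr.read(freshDb, "dump.nq"));
        }
    }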

Secondly, the indexes are themselves versioned storage, so when you modify the 
database a new state is created (potentially pointing to some/all of the 
existing data), but the old data is still there as well.  This is done for a 
few reasons:

1) It allows writes to overlap with ongoing reads to improve concurrency.  
Essentially each read/write transaction operates on a snapshot of the data; a 
write creates a new snapshot, but an ongoing read can continue to work against 
the old snapshot it started with (see the sketch after this list)
2) It provides for strong fault tolerance since a crash/exit during a write 
doesn't affect old data

3) Arbitrarily large transactions.
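
A minimal Java sketch of what that snapshot behaviour looks like through the 
public transaction API (the location "DB-demo" and the example triple are 
placeholders):

    import org.apache.jena.query.Dataset;
    import org.apache.jena.query.ReadWrite;
    import org.apache.jena.rdf.model.Model;
    import org.apache.jena.system.Txn;
    import org.apache.jena.tdb2.TDB2Factory;

    public class SnapshotDemo {
        public static void main(String[] args) throws InterruptedException {
            Dataset ds = TDB2Factory.connectDataset("DB-demo");

            // The read transaction pins the snapshot that is current right now.
            ds.begin(ReadWrite.READ);
            long sizeAtStart = ds.getDefaultModel().size();

            // A writer on another thread commits new data, creating a new snapshot.
            Thread writer = new Thread(() -> Txn.executeWrite(ds, () -> {
                Model m = ds.getDefaultModel();
                m.add(m.createResource("http://example/s"),
                      m.createProperty("http://example/p"),
                      "o");
            }));
            writer.start();
            writer.join();

            // The reader still sees the snapshot it started with.
            long sizeNow = ds.getDefaultModel().size();
            System.out.println(sizeAtStart == sizeNow);   // prints: true
            ds.end();
        }
    }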

Note that you can perform a compact operation on a TDB2 database, which 
essentially discards all but the latest snapshot and should reclaim the index 
data that is no longer needed.  This is a blocking exclusive write operation, 
so it doesn't allow for concurrent reads as a normal write would.

Nowadays, reads continue during compaction; it's only writes that get held up (I'd like to add delta-technology to fix that).

There is a short period of pointer swapping, with some disk sync at the end, to switch the in-use database; it takes milliseconds.
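
For reference, compaction can be triggered either with the tdb2.tdbcompact 
command-line tool or from Java.  A minimal sketch ("DB-demo" is a placeholder 
location):

    import org.apache.jena.query.Dataset;
    import org.apache.jena.tdb2.DatabaseMgr;
    import org.apache.jena.tdb2.TDB2Factory;

    public class CompactDemo {
        public static void main(String[] args) {
            Dataset ds = TDB2Factory.connectDataset("DB-demo");

            // Writes the live view of the data into a fresh storage generation
            // inside the database directory and switches over to it; the old
            // generation(s) can then be deleted to reclaim disk space.
            DatabaseMgr.compact(ds.asDatasetGraph());
        }
    }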

        Andy


Cheers,

Rob

PS. I'm sure Andy will chime in if I've misrepresented/misstated anything above

On 22/11/2021, 21:15, "Marco Neumann" <marco.neum...@gmail.com> wrote:

     Yes, I just had a look at one of my own datasets with 180M triples and a
     footprint of 28G. The overhead is not too bad at 10-20% vs raw NT files.

     I was surprised that the CLEAR ALL directive doesn't remove/release disk
     space. Does TDB2 require a commit to release disk space?

     Impressed to see that load rates went up to 250k triples/s with 4.2, more
     than twice the speed I have seen with 3.15. Not sure if this is OS (Ubuntu
     20.04.3 LTS) related.

     Maybe we should make a recommendation to the wikidata team to provide us
     with a production-environment-type machine to run some load and query tests.






     On Mon, Nov 22, 2021 at 8:43 PM Andy Seaborne <a...@apache.org> wrote:

     >
     >
     > On 21/11/2021 21:03, Marco Neumann wrote:
     > > What's the disk footprint these days for 1b on tdb2?
     >
     > Quite a lot. For 1B BSBM, ~125G (which is a bit heavy on significantly
     > sized literals - the nodes themselves are 50G). Obviously for current
     > WD-scale usage a sprinkling of compression would be good!
     >
     > One thing xloader gives us is that it makes it possible to load on a
     > spinning disk. (It also has lower peak intermediate file space and is
     > faster because it does not fall into the slow loading mode for the node
     > table that tdbloader2 sometimes did.)
     >
     >      Andy
     >
     > >
     > > On Sun, Nov 21, 2021 at 8:00 PM Andy Seaborne <a...@apache.org> wrote:
     > >
     > >>
     > >>
     > >> On 20/11/2021 14:21, Andy Seaborne wrote:
     > >>> Wikidata are looking for a replacement for BlazeGraph
     > >>>
     > >>> About WDQS, current scale and current challenges
     > >>>     https://youtu.be/wn2BrQomvFU?t=9148
     > >>>
     > >>> And in the process of appointing a graph consultant (5 month contract):
     > >>> https://boards.greenhouse.io/wikimedia/jobs/3546920
     > >>>
     > >>> and Apache Jena came up:
     > >>> https://phabricator.wikimedia.org/T206560#7517212
     > >>>
     > >>> Realistically?
     > >>>
     > >>> Full wikidata is 16B triples. Very hard to load - xloader may help
     > >>> though the goal for that was to make loading the truthy subset (5B)
     > >>> easier. 5B -> 16B is not a trivial step.
     > >>
     > >> And it's growing at about 1B per quarter.
     > >>
     > >>
     > >> https://wikitech.wikimedia.org/wiki/Wikidata_Query_Service/ScalingStrategy
     > >>
     > >>>
     > >>> Even if wikidata loads, it would be impractically slow as TDB is today.
     > >>> (yes, that's fixable; not practical in their timescales.)
     > >>>
     > >>> The current discussions feel more like they are looking for a "product"
     > >>> - a triplestore that they can use - rather than a collaboration.
     > >>>
     > >>>       Andy
     > >>
     > >
     > >
     >


     --


     ---
     Marco Neumann
     KONA



