On 23/11/2021 09:40, Rob Vesse wrote:
Marco,

So there are a couple of things going on.

Firstly, the Node Table, the mapping of RDF Terms to the internal Node IDs used in the indexes, can only ever grow. TDB2 doesn't do reference counting, so it never removes entries from the table: it doesn't know when a Node ID is no longer needed. Also, for RDF Terms that aren't directly interned into the Node ID (some numerics, booleans, dates etc. are), which means primarily URIs, Blank Nodes and larger or arbitrarily typed literals, the Node ID actually encodes the offset into the Node Table to make Node ID to RDF Term decoding fast, so you can't just arbitrarily rewrite the Node Table. And even if rewriting the Node Table were supported, it would require rewriting all the indexes, since those use the Node IDs.

TL;DR: the Node Table only grows because the cost of compacting it outweighs the benefits. This is also why you may have seen advice in the past that, if a lot of DELETE operations are made against your database, periodically dumping all the data and reloading it into a new database is recommended, since that generates a fresh Node Table containing only the RDF Terms currently in use.

Secondly, the indexes are themselves versioned storage, so when you modify the database a new state is created (potentially pointing to some/all of the existing data) but the old data is still there as well. This is done for two reasons:

1) It allows writes to overlap with ongoing reads to improve concurrency. Essentially each read/write transaction operates on a snapshot of the data; a write creates a new snapshot, but an ongoing read can continue to read the old snapshot it was working against.

2) It provides strong fault tolerance, since a crash/exit during a write doesn't affect old data.
3) Arbitrarily large transactions.
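[A minimal sketch of the dump-and-reload approach described above, using the TDB2 Java API; the database locations, the dump file name and the choice of N-Quads are illustrative, and for databases of any real size the command line tools (tdb2.tdbdump, tdb2.tdbloader or the xloader) would be the usual route.]

    import java.io.FileOutputStream;
    import java.io.OutputStream;

    import org.apache.jena.query.Dataset;
    import org.apache.jena.riot.Lang;
    import org.apache.jena.riot.RDFDataMgr;
    import org.apache.jena.system.Txn;
    import org.apache.jena.tdb2.TDB2Factory;

    public class RebuildNodeTable {
        public static void main(String[] args) throws Exception {
            // Dump the current contents inside a read transaction so the
            // export sees a single consistent snapshot (illustrative paths).
            Dataset oldDs = TDB2Factory.connectDataset("/data/tdb2-old");
            try (OutputStream out = new FileOutputStream("dump.nq")) {
                Txn.executeRead(oldDs, () -> RDFDataMgr.write(out, oldDs, Lang.NQUADS));
            }

            // Reload into a fresh database: the new Node Table contains
            // only the RDF Terms that are actually still in use.
            Dataset newDs = TDB2Factory.connectDataset("/data/tdb2-new");
            Txn.executeWrite(newDs, () -> RDFDataMgr.read(newDs, "dump.nq"));
        }
    }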
Note that you can perform a compact operation on a TDB2 database, which essentially discards all but the latest snapshot and should reclaim the index data that is no longer needed. This is a blocking, exclusive write operation, so it doesn't allow for concurrent reads the way a normal write would.
Nowadays, reads continue during compaction; it's only writes that get held up (I'd like to add delta-technology to fix that).
There is a short period of pointer swapping with some disk sync at the end to switch the database in use; it takes milliseconds.
Andy
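[For anyone wanting to script this, a minimal sketch of triggering compaction through the TDB2 Java API; the location is illustrative, and a running Fuseki server can also be compacted through its admin interface.]

    import org.apache.jena.query.Dataset;
    import org.apache.jena.tdb2.DatabaseMgr;
    import org.apache.jena.tdb2.TDB2Factory;

    public class CompactDatabase {
        public static void main(String[] args) {
            Dataset ds = TDB2Factory.connectDataset("/data/tdb2");   // illustrative path
            // Writes a new storage generation containing only the latest
            // snapshot; run it outside of any transaction on this dataset.
            DatabaseMgr.compact(ds.asDatasetGraph());
        }
    }

[As far as I know the previous storage generation (the old Data-NNNN directory) stays on disk unless it is deleted; more recent releases offer a compact variant with a boolean argument to remove it.]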
Cheers,

Rob

PS. I'm sure Andy will chime in if I've misrepresented/misstated anything above.

On 22/11/2021, 21:15, "Marco Neumann" <marco.neum...@gmail.com> wrote:

Yes, I just had a look at one of my own datasets with 180M triples and a footprint of 28G. The overhead vs raw NT files is not too bad at 10-20%.

I was surprised that the CLEAR ALL directive doesn't remove/release disk space. Does TDB2 require a commit to release disk space?

Impressed to see that load times went up to 250k/s with 4.2, more than twice the speed I have seen with 3.15. Not sure if this is OS (Ubuntu 20.04.3 LTS) related.

Maybe we should make a recommendation to the wikidata team to provide us with a production environment type machine to run some load and query tests.

On Mon, Nov 22, 2021 at 8:43 PM Andy Seaborne <a...@apache.org> wrote:
>
> On 21/11/2021 21:03, Marco Neumann wrote:
> > What's the disk footprint these days for 1b on tdb2?
>
> Quite a lot. For 1B BSBM, ~125G (which is a bit heavy on significant sized literals - the nodes themselves are 50G). Obviously, for current WD scale usage a sprinkling of compression would be good!
>
> One thing xloader gives us is that it makes it possible to load on a spinning disk. (It also has lower peak intermediate file space and is faster because it does not fall into the slow loading mode for the node table that tdbloader2 sometimes did.)
>
> Andy
>
> > On Sun, Nov 21, 2021 at 8:00 PM Andy Seaborne <a...@apache.org> wrote:
> >>
> >> On 20/11/2021 14:21, Andy Seaborne wrote:
> >>> Wikidata are looking for a replacement for BlazeGraph.
> >>>
> >>> About WDQS, current scale and current challenges:
> >>> https://youtu.be/wn2BrQomvFU?t=9148
> >>>
> >>> And in the process of appointing a graph consultant (5 month contract):
> >>> https://boards.greenhouse.io/wikimedia/jobs/3546920
> >>>
> >>> and Apache Jena came up:
> >>> https://phabricator.wikimedia.org/T206560#7517212
> >>>
> >>> Realistically?
> >>>
> >>> Full wikidata is 16B triples. Very hard to load - xloader may help, though the goal for that was to make loading the truthy subset (5B) easier. 5B -> 16B is not a trivial step.
> >>
> >> And it's growing at about 1B per quarter.
> >>
> >> https://wikitech.wikimedia.org/wiki/Wikidata_Query_Service/ScalingStrategy
> >>
> >>> Even if wikidata loads, it would be impractically slow as TDB is today. (Yes, that's fixable; not practical in their timescales.)
> >>>
> >>> The current discussions feel more like they are looking for a "product" - a triplestore that they can use - rather than a collaboration.
> >>>
> >>> Andy

--
---
Marco Neumann
KONA
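[On the CLEAR ALL question above: committing the update is not enough on its own; the space only comes back after compaction (or a dump and reload). A minimal sketch, with an illustrative database location:]

    import org.apache.jena.query.Dataset;
    import org.apache.jena.tdb2.DatabaseMgr;
    import org.apache.jena.tdb2.TDB2Factory;
    import org.apache.jena.update.UpdateAction;

    public class ClearAndReclaim {
        public static void main(String[] args) {
            Dataset ds = TDB2Factory.connectDataset("/data/tdb2");   // illustrative path
            // CLEAR ALL (run here in its own write transaction) removes the
            // quads, but the files on disk keep their size after commit.
            UpdateAction.parseExecute("CLEAR ALL", ds);
            // Disk space is reclaimed by compacting to a fresh storage generation.
            DatabaseMgr.compact(ds.asDatasetGraph());
        }
    }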