On 21/11/2021 21:03, Marco Neumann wrote:
What's the disk footprint these days for 1B triples on TDB2?

Quite a lot. For 1B BSBM, ~125G (BSBM is a bit heavy on significant-sized literals; the nodes themselves are 50G). Obviously, for current WD-scale usage, a sprinkling of compression would be good!

One thing xloader gives us is that it makes it possible to load on a spinning disk. (It also has lower peak intermediate file space and is faster, because it does not fall into the slow loading mode for the node table that tdbloader2 sometimes did.)

    Andy


On Sun, Nov 21, 2021 at 8:00 PM Andy Seaborne <a...@apache.org> wrote:



On 20/11/2021 14:21, Andy Seaborne wrote:
Wikidata are looking for a replacement for BlazeGraph

About WDQS, current scale and current challenges
    https://youtu.be/wn2BrQomvFU?t=9148

And they are in the process of appointing a graph consultant (5 month contract):
https://boards.greenhouse.io/wikimedia/jobs/3546920

and Apache Jena came up:
https://phabricator.wikimedia.org/T206560#7517212

Realistically?

Full wikidata is 16B triples. Very hard to load - xloader may help
though the goal for that was to make loading the truthy subset (5B)
easier. 5B -> 16B is not a trivial step.

And it's growing at about 1B per quarter.

https://wikitech.wikimedia.org/wiki/Wikidata_Query_Service/ScalingStrategy


Even if wikidata loads, it would be impractically slow as TDB is today.
(yes, that's fixable; not practical in their timescales.)

The current discussions feel more like they are looking for a "product"
- a triplestore that they can use - rather than a collaboration.

      Andy


