On 16/11/17 20:36, Osma Suominen wrote:
> Andy Seaborne wrote on 16.11.2017 at 22:04:
>> TDB1 or HDT. TDB2 has no benefits for you at 40M triples with occasional updates.
>
> Compaction would be a benefit, if it could be automated. But apparently not in the current state (see today's dev@ thread).
>
>> TDB2 goals are to address the scale limitations on transactions, the write-back queue overload problems, a better architecture (e.g. fully integrating jena-text into transactions), and no quirks about models across transactions. TDB2 is experimental at this stage.
>
> Understood.
>
>> (You could use DatasetGraphSwitchable in TDB2 to make a switchable HDT backed database.)
>
> Thanks for the tip!
>
> I think there's a lot of potential in HDT, it's just hampered by implementation bugs and lack of resources on the hdt-java side. For my use case it would be almost perfect, but the hdt-java implementation doesn't support union default graph functionality [1]. It could be added of course, it just hasn't been.

Fuseki (well, ARQ) supports union graph on all datasets these days. It will be a loop over graphs if necessary, and suppressing duplicates is expensive in the general case. Putting graphs one by one into a general purpose RDF Dataset (DatasetImpl) means a loop.
(they use dataset in the general sense of "collection of data", not RDF Dataset)
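
To illustrate why a generic union graph is a loop with a cost attached, here is a minimal sketch in plain Python. It is not Jena or ARQ code: the `union_graph` function, the dict-of-lists dataset, and the `ex:` names are all hypothetical stand-ins. It only shows the general idea of visiting each named graph in turn and remembering every triple already emitted so that duplicates can be dropped.

```python
from typing import Dict, Iterator, List, Set, Tuple

# A triple is modelled as a plain (subject, predicate, object) tuple of strings.
Triple = Tuple[str, str, str]


def union_graph(named_graphs: Dict[str, List[Triple]]) -> Iterator[Triple]:
    """Yield each distinct triple found in any named graph (hypothetical helper)."""
    # The 'seen' set grows with the number of distinct triples; this is the
    # bookkeeping that makes duplicate suppression costly in the general case.
    seen: Set[Triple] = set()
    for _graph_name, triples in named_graphs.items():  # the loop over graphs
        for triple in triples:
            if triple in seen:
                continue  # same triple already emitted from another graph
            seen.add(triple)
            yield triple


# Two illustrative named graphs sharing one triple.
dataset = {
    "http://example.org/g1": [("ex:a", "ex:p", "ex:b"), ("ex:a", "ex:p", "ex:c")],
    "http://example.org/g2": [("ex:a", "ex:p", "ex:b"), ("ex:d", "ex:p", "ex:e")],
}
result = list(union_graph(dataset))
```

A store that indexes quads natively can usually answer the same question from its own indexes rather than by this kind of loop, which is presumably why native support matters for larger datasets.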
Andy
> -Osma
>
> [1] https://github.com/rdfhdt/hdt-java/issues/3
