I've reloaded the GND dataset at http://zbw.eu/beta/sparql/gnd/query with 
4.5.0-SNAPSHOT. The sources were a 133G .nt.gz file plus several small .ttl 
files (ontology etc.). I loaded the large file with tdb2.xloader and, 
immediately afterwards, the smaller ones with tdb2.tdbloader (see the log at 
https://zbw.eu/beta/tmp/fuseki/create_tdb_20220220.log). 

Two things smelled fishy in this load:

1) The tdb2.tdbstats call after loading looped at 100% CPU, and I had to 
kill it after an hour or so (this is reproducible).

2) Some files remained in the fuseki/databases/temp directory: a 1.3G 
triples.tmp.gz, an empty quads.tmp.gz, and a load.json containing:

{
  "ingested" : "2022-02-20T13:15:45.528+00:00" ,
  "data" : [ "../var/gnd/2021-11/src/GND.utf8.ttl.gz" ] ,
  "triples" : 165639860 ,
  "quads" : 0
}
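
In case it helps anyone compare against their own leftover files, here is a 
minimal Python sketch for reading such a load.json. The function name 
summarize_manifest is my own invention, and I am only assuming that the file 
keeps the four keys shown above; I don't know whether the format is stable.

```python
import json

# The load.json content quoted above, copied verbatim.
LOAD_JSON = """
{
  "ingested" : "2022-02-20T13:15:45.528+00:00" ,
  "data" : [ "../var/gnd/2021-11/src/GND.utf8.ttl.gz" ] ,
  "triples" : 165639860 ,
  "quads" : 0
}
"""


def summarize_manifest(text):
    """Return (data files, triples, quads) from a load.json-style document.

    Assumption: the keys "data", "triples" and "quads" are present,
    as in the file quoted above.
    """
    manifest = json.loads(text)
    return manifest["data"], manifest["triples"], manifest["quads"]


files, triples, quads = summarize_manifest(LOAD_JSON)
print(f"files={files} triples={triples:,} quads={quads}")
```

To check a real file, pass its contents instead of LOAD_JSON, e.g. 
summarize_manifest(open("load.json").read()).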

Text indexing worked, however, as did a few example queries. But a basic 
query like "?x gndo:DifferentiatedPerson ." does not work any more.

Any idea what could have gone wrong?

Cheers, Joachim

