Hi Bob,

Half-billion triples sounds larger than 500 million to me.

Did you try --loader=parallel? The default loader also does some of its work in parallel, but the "parallel" loader lets rip.

Warning: it can take over your machine and max out I/O. Your UI may become unresponsive.
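
A minimal sketch of that invocation, reusing the DB2 and myData.ttl placeholder names that appear elsewhere in this thread (they are illustrative, not a required layout). The PATH check is only there so the snippet fails gently if the Jena bin directory is not on PATH:

```shell
# Placeholder names taken from this thread; adjust to your own setup.
DB_DIR="DB2"            # TDB2 database directory to load into
DATA_FILE="myData.ttl"  # RDF file to load

if command -v tdb2.tdbloader >/dev/null 2>&1; then
  # --loader=parallel selects the aggressive multi-threaded loader.
  # Expect heavy CPU and disk use while it runs.
  tdb2.tdbloader --loader=parallel --loc "$DB_DIR" "$DATA_FILE"
else
  echo "tdb2.tdbloader not found on PATH (add apache-jena/bin)" >&2
fi
```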

For larger datasets (1B+ triples), we also have tdb2.xloader (Linux only). It is not as fast, but it will load these datasets on a small machine (I did WikiData truthy (6.6B) on an XPS portable).
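
A sketch of what an xloader run might look like. The database directory and the input file name below are hypothetical, and only the basic --loc form is shown; check the xloader documentation for tuning options before relying on this:

```shell
# Hypothetical names for illustration only.
DB_DIR="DB2"             # target TDB2 database directory
DATA_FILE="data.nt.gz"   # hypothetical large input file

if command -v tdb2.xloader >/dev/null 2>&1; then
  # xloader trades speed for a small memory footprint; Linux only.
  tdb2.xloader --loc "$DB_DIR" "$DATA_FILE"
else
  echo "tdb2.xloader not found on PATH (add apache-jena/bin)" >&2
fi
```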

    Andy

On 29/01/2022 22:23, b...@snee.com wrote:
Thanks Andy! I tried again and I got it to work. I probably had some dumb typo somewhere.

I was doing this to load the half-billion triples in the ChEMBL data set (https://ftp.ebi.ac.uk/pub/databases/chembl/ChEMBL-RDF/27.0/) into Fuseki, and while it took some patience, in the end it all worked great. (Remember those "million triple" challenges?)

Thanks again,

Bob



On 2022-01-29 10:53, Andy Seaborne wrote:
Hi Bob,

Seems to be working for me.

The <http://jena.apache.org/2016/tdb#DatasetTDB2> looks right.

What does myDataset.ttl look like?

Does
  tdb2.tdbloader --loc DB2 myData.ttl
work?

That error would occur if Jena initialization failed, but
tdb2.tdbloader needs the TDB2 code to run at all!

    Andy


Couple of points:

1/ You can load data through the web UI into TDB2 - slower than the bulk
loader at scale but there are no size limits (unlike TDB1). Maybe the
time saved stopping and starting Fuseki compensates!

2/ You can load the database then move it into place in Fuseki.
Sometimes an easier workflow.
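
A rough sketch of point 2, run while Fuseki is stopped. All paths are hypothetical. The target directory is an assumption and must match whatever tdb2:location is set to in run/configuration/myDataset.ttl, so check that file rather than trusting the path below:

```shell
# All paths below are hypothetical placeholders.
FUSEKI_RUN="$HOME/apache-jena-fuseki-4.3.2/run"
STAGING_DB="$HOME/staging/DB2"               # build the database here first
TARGET_DB="$FUSEKI_RUN/databases/myDataset"  # assumption: equals tdb2:location
DATA_FILE="myData.ttl"

if ! command -v tdb2.tdbloader >/dev/null 2>&1; then
  echo "tdb2.tdbloader not found on PATH (add apache-jena/bin)" >&2
elif [ -e "$TARGET_DB" ]; then
  # Avoid moving the new database inside an existing directory.
  echo "Target already exists, move it aside first: $TARGET_DB" >&2
elif tdb2.tdbloader --loc "$STAGING_DB" "$DATA_FILE"; then
  # Only move into place if the load succeeded.
  mv "$STAGING_DB" "$TARGET_DB"
fi
```

Restart Fuseki afterwards so it opens the database from its new location.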

    Andy

On 29/01/2022 15:08, b...@snee.com wrote:

I thought that the following steps worked for me just fine a few weeks ago, but today I'm getting an error.

From the 4.3.2 Fuseki web-based interface, I created a "Persistent (TDB2)" database called myDataset. I then shut down Fuseki and confirmed that a run/configuration/myDataset.ttl file had been created.

I then tried to load data into the dataset like this:

~/Downloads/apache-jena-4.3.2/bin/tdb2.tdbloader --tdb /Users/bobdc/Downloads/apache-jena-fuseki-4.3.2/run/configuration/myDataset.ttl myData.ttl

I then got this error:

     org.apache.jena.sparql.ARQException: No such type: <http://jena.apache.org/2016/tdb#DatasetTDB2>

Can anyone tell me what I'm doing wrong?

Thanks,

Bob
