Thanks Johannes for starting this thread. I am facing the exact same
problem with tdb2: any sufficiently large file takes forever to load.
I hope this problem has a solution.
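
For context, my invocation is essentially the standard one from the Jena
distribution (the paths below are illustrative, and --loader=parallel is
simply one of the documented loader variants, not something I have
confirmed to help at this scale):

```shell
# JVM_ARGS is picked up by the Jena command-line scripts;
# -Xmx120G matches the setting mentioned in this thread.
export JVM_ARGS="-Xmx120G"

# Basic bulk load of the compressed Turtle dump into a fresh
# TDB2 database directory at ./wikidata-tdb2
tdb2.tdbloader --loc ./wikidata-tdb2 latest-all.ttl.bz2

# The loader also accepts an explicit loading algorithm,
# e.g. the parallel loader:
# tdb2.tdbloader --loader=parallel --loc ./wikidata-tdb2 latest-all.ttl.bz2
```
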
Thank you.
-Ahmed

On Mon, Jun 8, 2020 at 11:55 AM Hoffart, Johannes <[email protected]>
wrote:

> Hi,
>
> I want to load the full Wikidata dump, available at
> https://dumps.wikimedia.org/wikidatawiki/entities/latest-all.ttl.bz2 to
> use in Jena.
>
> I tried it using the tdb2.tdbloader with $JVM_ARGS set to -Xmx120G.
> Initially, progress (measured by dataset size on disk) is quick. It slows
> down considerably after a few hundred GB have been written, and finally,
> at around 500GB, progress almost halts.
>
> Did anyone ingest Wikidata into Jena before? What are the system
> requirements? Is there a specific tdb2.tdbloader configuration that would
> speed things up? For example, building the indexes after the data ingest?
>
> Thanks
> Johannes
>
> Johannes Hoffart, Executive Director, Technology Division
> Goldman Sachs Bank Europe SE | Marienturm | Taunusanlage 9-10 | D-60329
> Frankfurt am Main
> Email: [email protected]<mailto:[email protected]> | Tel: +49
> (0)69 7532 3558
> Vorstand: Dr. Wolfgang Fink (Vorsitzender) | Thomas Degn-Petersen | Dr.
> Matthias Bock
> Vorsitzender des Aufsichtsrats: Dermot McDonogh
> Sitz: Frankfurt am Main | Amtsgericht Frankfurt am Main HRB 114190
>
>
