Look through the list archives for posts from Andy describing the
differences between TDB1 and TDB2. They have different optimizations; I
don't recall the details.
thanks
danno
Dan Pritts
ICPSR Computing and Network Services
On 12 Nov 2019, at 7:29, Amandeep Srivastava wrote:
Hi,
I'm trying to create a TDB database from Wikidata's official RDF dump so
that I can read the data through a Fuseki service. I need to run a few
queries for a personal project, and they time out on the online service.
I have a 12 core machine with 36 GB memory.
Can you please advise on the best way to create the database? Since the
dump is huge, I cannot try all the approaches. Besides, I'm not sure
whether tdbloader behaves the same way on data of different sizes.
Questions:
1. Which one would be better to use - tdb.tdbloader2 (TDB1) or
tdb2.tdbloader (TDB2) for creating the database and why? Any specific
configurations that I should be aware of?
2. I'm currently running a job using tdb.tdbloader2, but it is using just
a single core. Also, its loading speed is slowly decreasing. It started
at an avg of 120k tuples and is currently at 80k tuples. Can you advise
how I can utilize all the cores of my machine and maintain the loading
speed at the same time?
Regards,
Aman