Use tdbloader, not tdbloader2, for 10M quads.
As to why the load stage of tdbloader2 drops off, we'd need to know more
about the environment you are running in.
What is the machine? The disk?
How much RAM does the machine have?
Is there anything else running on the machine?
Have you set the heap size or taken the defaults?
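
One possible way to collect answers to the questions above in a single pass is
a small script like the following. It is only a sketch: the function name
environment_report and the db_dir parameter are made up for illustration, the
RAM and load-average lookups only work on Unix-like systems, and it assumes the
heap size, if set at all, was passed to the Jena scripts through the JVM_ARGS
environment variable.

```python
import os
import shutil


def environment_report(db_dir="."):
    """Collect basic facts about the machine a load is running on.

    db_dir is a hypothetical parameter: the directory (and therefore the
    disk) where the database is being built.
    """
    report = {
        "cpu_count": os.cpu_count(),
        # Assumption: the heap setting, if any, is in JVM_ARGS (e.g. -Xmx4G).
        # None here means the variable is not set in this shell.
        "jvm_args": os.environ.get("JVM_ARGS"),
        "disk_free_gib": shutil.disk_usage(db_dir).free / 2**30,
    }
    try:
        # Physical RAM; these sysconf names are not available everywhere.
        pages = os.sysconf("SC_PHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
        report["ram_gib"] = pages * page_size / 2**30
    except (AttributeError, ValueError, OSError):
        report["ram_gib"] = None
    try:
        # 1, 5 and 15 minute load averages hint at other activity.
        report["load_avg"] = os.getloadavg()
    except (AttributeError, OSError):
        report["load_avg"] = None
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

It does not say what kind of disk is in use (SSD or spinning), so that still
needs stating separately.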
Andy
On 15/04/17 11:55, Laura Morales wrote:
I've made a dataset with about 10M nquads, 5-6 graphs, stored as a single .nq
file.
I've launched tdbloader2 to create a new dataset from this file, but I see a
steady and marked slowdown as more nquads are added to the dataset. Here are
some of the INFO lines logged during processing:
INFO Add: 50,000 Data (Batch: 12,983 / Avg: 12,983)
INFO Add: 500,000 Data (Batch: 77,639 / Avg: 51,743)
INFO Add: 1,000,000 Data (Batch: 81,833 / Avg: 64,926)
INFO Add: 2,000,000 Data (Batch: 84,745 / Avg: 72,745)
INFO Add: 3,000,000 Data (Batch: 79,365 / Avg: 76,591)
INFO Add: 4,000,000 Data (Batch: 91,575 / Avg: 77,605)
INFO Add: 5,000,000 Data (Batch: 3,582 / Avg: 49,010)
INFO Add: 6,000,000 Data (Batch: 3,915 / Avg: 22,031)
INFO Add: 7,000,000 Data (Batch: 11,887 / Avg: 16,724)
INFO Add: 8,000,000 Data (Batch: 4,121 / Avg: 15,455)
INFO Add: 9,000,000 Data (Batch: 24,038 / Avg: 14,804)
I wonder if this is normal or if there's anything I can do to speed this up.
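
To pin down where the rate falls off, progress lines like the ones above can be
parsed and compared against the best batch rate seen so far. The following is a
rough sketch, not part of Jena: the names parse_progress and first_drop and the
0.5 threshold are arbitrary choices for illustration, and the regular
expression assumes the log lines have exactly the shape quoted in this message.

```python
import re

# Matches lines such as:
#   INFO Add: 5,000,000 Data (Batch: 3,582 / Avg: 49,010)
LINE_RE = re.compile(
    r"Add:\s*([\d,]+)\s+Data\s+\(Batch:\s*([\d,]+)\s*/\s*Avg:\s*([\d,]+)\)"
)


def parse_progress(lines):
    """Return (total, batch_rate, avg_rate) tuples for matching lines."""
    rows = []
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            rows.append(tuple(int(g.replace(",", "")) for g in match.groups()))
    return rows


def first_drop(rows, factor=0.5):
    """Return the first row whose batch rate is below factor * best so far."""
    best = 0
    for total, batch, avg in rows:
        if best and batch < factor * best:
            return total, batch, avg
        best = max(best, batch)
    return None


SAMPLE = """\
INFO Add: 3,000,000 Data (Batch: 79,365 / Avg: 76,591)
INFO Add: 4,000,000 Data (Batch: 91,575 / Avg: 77,605)
INFO Add: 5,000,000 Data (Batch: 3,582 / Avg: 49,010)
INFO Add: 6,000,000 Data (Batch: 3,915 / Avg: 22,031)
"""

if __name__ == "__main__":
    print(first_drop(parse_progress(SAMPLE.splitlines())))
    # prints (5000000, 3582, 49010)
```

On the figures quoted here the drop shows up between the 4,000,000 and
5,000,000 marks, which is the point to line up against memory and disk
activity on the machine.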