Hello all.

Are there any pointers to inserting large volumes of data in a persistent
RW TDB store please?

I currently have an 8M-line, 500 MB+ input file which is being parsed by
JavaCC, with the resulting quads inserted into a TDB store.

The process generates 120M quads and takes just over 2 hrs, which works out to:

60M quads/hr, or
1M quads/min, or
~16,666 quads/sec.

Parsing is single-threaded (12% total CPU utilization, i.e. 100% of one core)
with -Xmx8GB (16 GB available) on an 8-core i7 with a 512 GB SSD.

I am working directly with the DatasetGraph after opening the TDB store, to
remove any "extra" code which might slow the process down. I begin/commit a
transaction for every 1000 input rows, because before this an OOME occurred
after ~3M input rows when I tried to wrap the entire load in a single
transaction. The TDB store is being read from concurrently, so I am unable to
use a TDB loader.
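For reference, the batching I describe above looks roughly like this. It is a
sketch only: the `Store` interface and its `begin`/`insert`/`commit` methods
are stand-ins for the real Jena calls (`dataset.begin(ReadWrite.WRITE)`,
`dsg.add(quad)`, `dataset.commit()`), and the batch size of 1000 matches my
current setup rather than any recommended value.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class BatchedLoad {

    /** Stand-in for the transactional TDB dataset (not Jena API). */
    interface Store {
        void begin();
        void insert(int row);   // would be dsg.add(quad) in the real loader
        void commit();
    }

    /**
     * Inserts totalRows rows, committing every batchSize rows so that no
     * single transaction grows without bound (avoiding the OOME seen when
     * wrapping the whole load in one transaction).
     */
    static void load(Store store, int totalRows, int batchSize) {
        int inBatch = 0;
        store.begin();
        for (int row = 0; row < totalRows; row++) {
            store.insert(row);
            if (++inBatch == batchSize) {
                store.commit();   // flush this batch to the store
                store.begin();    // start the next transaction
                inBatch = 0;
            }
        }
        store.commit();           // commit any trailing partial batch
    }

    public static void main(String[] args) {
        AtomicInteger commits = new AtomicInteger();
        AtomicInteger rows = new AtomicInteger();
        Store counting = new Store() {
            public void begin() {}
            public void insert(int row) { rows.incrementAndGet(); }
            public void commit() { commits.incrementAndGet(); }
        };
        load(counting, 2500, 1000);
        // 2500 rows at batch size 1000: commits after rows 1000 and 2000,
        // plus a final commit for the remaining 500.
        System.out.println(rows.get() + " rows, " + commits.get() + " commits");
    }
}
```

One trade-off worth noting: a real version might skip the final commit when
the last batch is empty, and batch size itself is tunable (larger batches
amortize commit overhead at the cost of heap).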

I don't believe the runtime is poor, but any pointers that would improve the
speed would be appreciated.

Dick.
