Hello all. Are there any pointers on inserting large volumes of data into a persistent read/write TDB store, please?
I currently have an 8M-line, 500MB+ input file which is being parsed by JavaCC, with the created quads inserted into a TDB store. The process generates 120M quads and takes just over 2hrs, which is 60M quads/hr, 1M quads/min, or ~16,666 quads/sec. The parse is single threaded (12% total CPU utilization, i.e. one core at 100%) with -Xmx8GB (16GB available) on an 8-core i7 with a 512GB SSD.

I am working directly with the DatasetGraph after opening the TDB store, to remove any "extra" code which might slow the process down. I begin/commit a transaction for every 1000 input rows, because before this an OOME occurred after ~3M input rows when I tried to wrap the entire load in a single transaction. The TDB store is concurrently being read from, so I am unable to use a TDB bulk loader. I don't believe the runtime is poor, but I would welcome any pointers that would improve the speed... Dick.
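For reference, the batched-commit pattern described above can be sketched as below. The `Store` interface is a hypothetical stand-in for the real Jena calls (a `Dataset`'s `begin(ReadWrite.WRITE)`/`commit()` around adds on its `DatasetGraph`); the batch size of 1000 matches the figure in the post. This is a minimal illustration of the transaction-chunking idea, not the actual loader code.

```java
import java.util.List;

// Sketch of committing every N input rows to bound transaction memory,
// instead of wrapping the whole 8M-row load in one transaction (which OOMEd).
public class BatchedLoader {
    // Hypothetical abstraction over the TDB dataset:
    interface Store {
        void begin();           // e.g. dataset.begin(ReadWrite.WRITE)
        void add(String row);   // e.g. datasetGraph.add(quad) per quad parsed from the row
        void commit();          // e.g. dataset.commit()
    }

    static final int BATCH_SIZE = 1000;  // rows per transaction, as in the post

    static void load(Iterable<String> rows, Store store) {
        int inBatch = 0;
        store.begin();
        for (String row : rows) {
            store.add(row);      // in the real loader, one row may yield several quads
            if (++inBatch == BATCH_SIZE) {
                store.commit();  // flush this chunk so the write transaction stays small
                store.begin();
                inBatch = 0;
            }
        }
        store.commit();          // commit the final, possibly partial, batch
    }
}
```

The trade-off is that readers can observe the load part-way through, but each transaction's journal stays small enough to avoid the OOME.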
