Hi all again,
I'm asking again about the problem below. It surprises me that it is
so slow (around 2k triples/s, whereas the tdbloader tool is about 10x
faster). I've tried to use org.apache.jena.tdb.TDBLoader, but it isn't
clear to me how to obtain a DatasetGraphTDB or a GraphNonTxnTDB from the
path of a TDB location.
Thanks in advance for any help,
Marco
On 23/12/2017 13:09, Zak Mc Kracken wrote:
Hi all,
I have an application in which exporter threads produce Model
instances of a pre-configured size, and I then want to write those
models into a TDB store.
For the moment, I'm using this (I believe, rather canonical) code for
the writing (dataSet is shared between threads):
  this.dataSet.begin ( ReadWrite.WRITE );
  try {
    Model dmodel = this.dataSet.getDefaultModel ();
    dmodel.add ( model );
    this.dataSet.commit ();
  }
  finally {
    this.dataSet.end ();
  }
It's extremely slow. When tested with about 2G of Turtle data, it is
still running after hours. The same data, exported to a .ttl file and
then loaded with tdbloader, takes a couple of minutes. Am I doing
something wrong? Is the transactional approach inherently slower?
Should I call the TDBLoader instead (the one used by the command line
tool)?
Note that I'm OK with forcing everything into a single thread (or with
serialising a couple of threads, as the code above seems to enforce).
The application is going to be used in different export use cases, and
in some of them it will be truly parallel (e.g., saving data to
different files).
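To make the single-writer option concrete, here is a minimal sketch of
what I have in mind. It is plain Java and has not been tried against
TDB. BatchingWriter, its constructor parameters and the sink callback
are hypothetical names of mine, not Jena API. The assumption, which I
haven't verified, is that one write transaction per batch of models is
cheaper than one transaction per model.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/**
 * Hypothetical helper (not part of Jena): producer threads submit items,
 * a single writer thread drains them in batches and passes each batch to
 * a sink. With TDB, T would be Model and the sink would wrap one
 * begin/commit/end write transaction around the whole batch.
 */
final class BatchingWriter<T> implements AutoCloseable {
    private final BlockingQueue<T> queue;
    private final int maxBatch;
    private final Consumer<List<T>> sink;
    private final Thread writer;
    private volatile boolean closed = false;

    BatchingWriter(int capacity, int maxBatch, Consumer<List<T>> sink) {
        this.queue = new LinkedBlockingQueue<>(capacity);
        this.maxBatch = maxBatch;
        this.sink = sink;
        this.writer = new Thread(this::run, "batching-writer");
        this.writer.start();
    }

    /** Called by producers; blocks while the queue is full. */
    void submit(T item) throws InterruptedException {
        queue.put(item);
    }

    private void run() {
        try {
            // Keep going until close() was called and the queue is empty.
            while (!closed || !queue.isEmpty()) {
                T first = queue.poll(100, TimeUnit.MILLISECONDS);
                if (first == null) continue;
                List<T> batch = new ArrayList<>();
                batch.add(first);
                // Take whatever else is already waiting, up to maxBatch.
                queue.drainTo(batch, maxBatch - 1);
                // Note: if the sink throws, this thread dies; a real
                // version would need to report that to the producers.
                sink.accept(batch);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Call only after all producers have finished submitting. */
    @Override
    public void close() throws InterruptedException {
        closed = true;
        writer.join();
    }
}
```

The exporter threads would call submit ( model ) instead of opening a
transaction themselves, and the sink would run the begin/add/commit/end
code above once per batch, adding every model in the list before the
commit.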
Thanks in advance for any help.
Marco.