On 23/03/18 10:32, Davide wrote:
Well, so what is the best way to do this? I'm trying a lot of ways. Now I
load the default model in memory from the dataset with the getDefaultModel
method, and each time I append triples to another model created with
createDefaultModel(). When I have 200000 triples in that model, I load
it into the dataset by adding it to the default model with the add()
function.

20,000 or 200,000? Previously it was 20K.

But I don't think it is optimal.

Could you expand on that please?

What load rates with what hardware do you get for TDB1 and for TDB2?

Another way that I was trying is
to load models directly into the Fuseki server over a remote connection,
creating models of 200M triples each and loading them into
the server. But that causes a "GC overhead limit exceeded" error. So, what is
the best way to perform this?

TDB2 can do that.

For TDB1, you'll have to break it into chunks depending on the memory available.

    Andy
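
The chunking advice for TDB1 above can be sketched roughly as follows. This is only a minimal illustration of the batching pattern, not Jena code: ChunkedLoader, loadInChunks and the commit callback are hypothetical names introduced here, and the strings stand in for triples. In a real loader the callback would presumably add one batch to the dataset inside a single write transaction, with the batch size chosen to fit the available heap.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper (not part of Jena): groups a stream of items into
// fixed-size batches and hands each batch to a commit callback.
final class ChunkedLoader {

    // Feeds items to 'commit' in lists of at most batchSize elements.
    // Returns the number of batches that were committed.
    static <T> int loadInChunks(Iterator<T> items, int batchSize,
                                Consumer<List<T>> commit) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T> batch = new ArrayList<>();
        int committed = 0;
        while (items.hasNext()) {
            batch.add(items.next());
            if (batch.size() == batchSize) {
                commit.accept(batch);
                committed++;
                // Start a fresh list so the previous batch can be collected.
                batch = new ArrayList<>();
            }
        }
        // Commit the final, possibly smaller, batch.
        if (!batch.isEmpty()) {
            commit.accept(batch);
            committed++;
        }
        return committed;
    }

    public static void main(String[] args) {
        // Placeholder strings stand in for triples (assumption for the demo).
        List<String> triples = List.of("t1", "t2", "t3", "t4", "t5");
        int n = loadInChunks(triples.iterator(), 2,
                batch -> System.out.println("commit " + batch));
        System.out.println(n + " batches");
    }
}
```

Only one batch is held in memory at a time, which is the point of chunking for TDB1. With TDB2, per the reply above, the batch can be much larger, so the per-transaction overhead is spread over more triples.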


2018-03-21 11:22 GMT+01:00 Andy Seaborne:

Bulkloading (TDB1) is for working from an empty dataset.  The tricks it
uses do not work when there is already data in the dataset.  For TDB1, one of
the bulkloaders simply loads triples/quads, the other refuses to load.

For TDB2, which has no limits on the size of transactions, a batch size of
20K, or even 200M, should work. The larger the batch size, the more the
transaction overheads are amortized.

     Andy


On 19/03/18 15:50, Davide wrote:

I have about 20000 triples to load each time. I load data into models with
the Jena API, and write the data to a StreamWriter. When the buffer reaches a
certain size, I load the data into the dataset with the Bulkloader. Now I'm
trying to use TDB2 with the Loader.Bulkload method to see if there are
improvements, but I have a problem. I retrieve the dataset with
TDB2Factory.connectDataset(location) and pass it to the Bulkload
function, but I get a ClassCastException at runtime:
"org.apache.jena.tdb2.store.DatasetGraphSwitchable cannot be cast to
org.apache.jena.tdb2.store.DatasetGraphTDB". How can I resolve this?


