I want to emphasize what Andy said first:

> The fastest way is to use the bulk loader directly to setup the database,
> then add it to Fuseki.

This will be much faster, and it removes any question of your needing to write efficient loading code. If you can find a workflow that allows it, I suspect it is the best immediate choice.

ajs6f

> On Jul 18, 2019, at 8:08 AM, Scarlet Remilia <[email protected]> wrote:
>
> Thank you for reply!
>
> The server storage is HDD on local with RAID 10.
>
> CPU is 4x 14 cores with 28 threads but only one core is used during the load.
>
> The JVM of fuseki2 is tuned by adding -Xmx=50GB -Xms=50GB and TDB2 used is
> also tuned by tuning cache size.
>
> I observed disk IO by iostat, but it seems not utilized much disk IO and also
> it is observed that memory usage of fuseki2 is increasing after loading every
> 3 millions triples.
>
> Fuseki2 is setup as a standalone server by the command below:
>
> ./fuseki-server --tdb2 --loc=./tdb2dataset --port 2222 -update /fuseki2
>
> Thank you very much!
>
> ________________________________
> From: Andy Seaborne <[email protected]>
> Sent: Thursday, July 18, 2019 6:41:56 PM
> To: [email protected]
> Subject: Re: About fuseki2 load performance by java API
>
> That's quite slow. I get maybe 50-70K triples for a 100m load via the
> Fuseki UI.
>
> The fastest way is to use the bulk loader directly to setup the
> database, then add it to Fuseki.
>
> The hardware of the server makes a big difference. What's the server
> setup? Disk/SSD? Local or remote storage?
>
> Andy
>
> You don't need the begin/commit in the client - the transaction is in
> the backend server.
>
> On 18/07/2019 09:02, Scarlet Remilia wrote:
>> Hello everyone,
>> I want to load a hundred millions triple into TDB2-backend fuseki2 by Java
>> API.
>> I used code below:
>>
>> Model model = ModelFactory.createDefaultModel();
>> model.add(model.asStatement(triple));
>> RDFConnectionRemoteBuilder builder = RDFConnectionFuseki.create()
>>         .destination(FusekiURL);
>> RDFConnection conn = builder.build();
>> conn.begin(ReadWrite.WRITE);
>> try {
>>     conn.load(model);
>>     conn.commit();
>> } finally {
>>     conn.end();
>> }
>>
>> The code is actually worked but performance is not ideal enough.
>>
>> [2019-07-18 23:29:25] Fuseki INFO [46] POST
>> http://192.168.204.244:2222/fuseki2?default
>> [2019-07-18 23:30:45] Fuseki INFO [15] Body: Content-Length=-1,
>> Content-Type=application/rdf+thrift, Charset=null => RDF-THRIFT :
>> Count=3257309 Triples=3257309 Quads=0
>> [2019-07-18 23:31:12] Fuseki INFO [15] 200 OK (3,302.546 s)
>>
>> Every 3 millions triples cost 3,302.546 seconds and there are totally 300
>> millions triples in queue…(One in-mem Model is impossible to contain so much
>> triples…)
>>
>> Is there any better method to load them quicker?
>>
>> Thanks!
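
For a sense of scale, the figures quoted in this thread can be put side by side. What follows is only a back-of-the-envelope sketch in plain Java. It assumes that Andy's "50-70K triples" means triples per second and that throughput stays constant over the whole load. The class and method names (LoadEstimate, hoursToLoad) are made up for illustration and are not part of Jena.

```java
public class LoadEstimate {

    // Estimated hours needed to load `triples` at a constant rate of
    // `triplesPerSecond`. Constant throughput is an assumption; real
    // loads tend to slow down as the indexes grow.
    static double hoursToLoad(double triples, double triplesPerSecond) {
        return triples / triplesPerSecond / 3600.0;
    }

    public static void main(String[] args) {
        // Figures reported in the thread: one batch of 3,257,309 triples
        // took 3,302.546 s, and about 300 million triples are queued.
        double observedRate = 3_257_309 / 3302.546;
        double totalTriples = 300_000_000;

        System.out.printf("Observed rate:    %.0f triples/s%n", observedRate);
        System.out.printf("At observed rate: %.1f hours%n",
                hoursToLoad(totalTriples, observedRate));

        // Assumed reading of Andy's figure: 50,000 to 70,000 triples/s.
        System.out.printf("At 50K triples/s: %.1f hours%n",
                hoursToLoad(totalTriples, 50_000));
        System.out.printf("At 70K triples/s: %.1f hours%n",
                hoursToLoad(totalTriples, 70_000));
    }
}
```

Under those assumptions the reported rate works out to roughly a thousand triples per second, which puts the full 300 million at several days, while the rates Andy mentions would finish in under two hours. A gap that size is why I would try the bulk loader before tuning the client code.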
