Re: TDBTransactionException: Not in a transaction

2019-07-18 Thread ajs6f
I'm thinking a design in which Union and Multiunion and UnionDatasetGraph actually try to handle read transactions fully would be possible, but it would be a bit of finicking to get it right. Perhaps we could keep a lock in the type to hold while transactions are opened on all the underlying

Re: About fuseki2 load performance by java API

2019-07-18 Thread Andy Seaborne
On 18/07/2019 13:08, Scarlet Remilia wrote: Thank you for the reply! The server storage is local HDD in RAID 10. The CPU is 4x 14 cores with 28 threads, but only one core is used during the load. The JVM of fuseki2 is tuned by adding -Xmx=50GB -Xms=50GB and TDB2 used is also tuned by tuning

Re: About fuseki2 load performance by java API

2019-07-18 Thread Andy Seaborne
On 18/07/2019 13:49, Laura Morales wrote: I had a similar problem when trying to load wikidata on my laptop with 8GB RAM, i7 CPU, 750GB HDD. It started fine but then slowed to a crawl after about 100 million triples. I don't think CPU or RAM are the problem, it's probably to do with disk

Re: RE: About fuseki2 load performance by java API

2019-07-18 Thread Laura Morales
I had a similar problem when trying to load wikidata on my laptop with 8GB RAM, i7 CPU, 750GB HDD. It started fine but then slowed to a crawl after about 100 million triples. I don't think CPU or RAM are the problem, it's probably to do with disk queues or caches or something like that. IIRC

Re: About fuseki2 load performance by java API

2019-07-18 Thread ajs6f
I want to emphasize what Andy said first: "The fastest way is to use the bulk loader directly to set up the database, then add it to Fuseki." This will be very much faster, as well as eliminating any questions of you needing to write efficient code. If you can find a workflow that does

RE: About fuseki2 load performance by java API

2019-07-18 Thread Scarlet Remilia
Thank you for the reply! The server storage is local HDD in RAID 10. The CPU is 4x 14 cores with 28 threads, but only one core is used during the load. The JVM of fuseki2 is tuned by adding -Xmx=50GB -Xms=50GB and TDB2 used is also tuned by tuning the cache size. I observed disk IO with iostat, but

Re: TDBTransactionException: Not in a transaction

2019-07-18 Thread Andy Seaborne
Having them in the same storage or one in memory would help. This is the crux of the difficulty in JENA-1667: how to know whether a transaction has already been started on a dataset when there are several levels of models over the top. At the moment, a union graph starts a transaction on its base graph,

Re: About fuseki2 load performance by java API

2019-07-18 Thread Andy Seaborne
That's quite slow. I get maybe 50-70K triples for a 100m load via the Fuseki UI. The fastest way is to use the bulk loader directly to set up the database, then add it to Fuseki. The hardware of the server makes a big difference. What's the server setup? Disk/SSD? Local or remote storage?

About fuseki2 load performance by java API

2019-07-18 Thread Scarlet Remilia
Hello everyone, I want to load a hundred million triples into a TDB2-backed fuseki2 using the Java API. I used the code below: Model model = ModelFactory.createDefaultModel(); model.add(model.asStatement(triple)); RDFConnectionRemoteBuilder builder = RDFConnectionFuseki.create()

Re: What detail steps are required to Restore a TDB from the back-up file

2019-07-18 Thread Laura Morales
To restore from the Fuseki web UI, in the "manage datasets" page there are buttons for creating new stores as well as for selecting your files containing the triples ("upload data"). From the CLI you can type "tdbloader --help" for the list of options. Example: "tdbloader --loc mydb data.nt" will