Hello
I have written a Java program that maps relational data and loads it
over the network into Virtuoso (open source edition) as RDF/OWL graphs.
The program first uses the D2RQ API for the relational-to-RDF mapping
and then loads the generated list of triples into Virtuoso with the
Virtuoso Jena provider API call:
virtGraph.getBulkUpdateHandler().add(triplelist).
When I tested the performance of this program, I noticed that loading
the RDF this way took about one hundred times longer than loading the
RDF file locally on the server with the isql tool and Virtuoso's
TTLP_MT function.
Programmatic loading of 1.5 million triples took about 10000 seconds,
whereas with isql it took only 100 seconds. If I try to load bigger RDF
graphs, I get a Virtuoso Communications Link Failure (timeout).

1) I understand that the Jena provider will be slower in general, but
is a factor of 100 to be expected?
2) Is getBulkUpdateHandler().add() the best way to load big RDF graphs,
or are there other options?
3) Are there any configuration tricks to increase performance?
4) Another problem I ran into when trying to use the D2RQ API and the
Virtuoso Jena provider in the same Java program is that they seem to
require different versions of the Jena packages. Does that mean I can't
run SPARQL queries in the same program that does the graph loading with
D2RQ? Is there some other good option for mapping and loading external
relational data into Virtuoso RDF graphs?
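
To make question 2 more concrete: one variant I could try is sending
the triples in smaller batches instead of one big add() call. Below is
a minimal sketch of that idea, not my actual code. TripleBatcher and
addInBatches are hypothetical names, plain strings stand in for Jena
Triple objects, and the sink callback stands in for the add() call.
Whether this would help with the speed or the timeout at all is part of
what I am asking.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not part of D2RQ or the Virtuoso Jena provider.
final class TripleBatcher {

    // Passes `items` to `sink` in consecutive chunks of at most
    // `batchSize` elements and returns the number of chunks sent.
    static <T> int addInBatches(List<T> items, int batchSize,
                                Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the slice so the sink gets an independent list,
            // not a view backed by the full list.
            sink.accept(new ArrayList<>(items.subList(start, end)));
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        // Strings stand in for Jena Triple objects (assumption).
        List<String> triples = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            triples.add("triple-" + i);
        }
        // In the real program the sink would be something like
        // batch -> virtGraph.getBulkUpdateHandler().add(batch)
        int sent = addInBatches(triples, 10, batch ->
                System.out.println("sending " + batch.size() + " triples"));
        System.out.println(sent + " batches");
    }
}
```

The batch size of 10 is only for illustration; a real run would need a
much larger value, and I do not know what a sensible size would be.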

Virtuoso server: Ubuntu 12.04, Intel Q9400 2.66 GHz, 5 GB RAM.
Java client: Windows 7, 10 GB RAM.

Regards
Pekka Aarnio
