Hello! We are experimenting with Virtuoso Open Source at the BBC to see if it could be used as a backend for some of our applications. However, we are running into two main issues.
First, we noticed that on an average-sized dataset (a couple of million triples), the following query is really fast:

    PREFIX mbz: <http://www.bbc.co.uk/ontologies/musicbrainz/>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?artist ?type
    WHERE {
      ?artist mbz:firstLetter "a" .
    }

However, the following query is really, really slow (it takes a couple of minutes to answer):

    PREFIX mbz: <http://www.bbc.co.uk/ontologies/musicbrainz/>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?artist ?type
    WHERE {
      ?artist mbz:firstLetter "a" .
      ?artist a ?type
    }

I am guessing the optimiser sees the rdf:type predicate, decides it should use an index on type first, and ends up going through every resource in the dataset.

The other major issue we're running into is deadlocks. We have a constant flow of updates going in through SPARQL/Update. Our dataset is a collection of fairly small graphs (around 30 triples each), and at almost any point in time there is an update going on in one of them. When we run a query like the one above, which goes through all these graphs, we're almost sure to hit a deadlock at some point.

Is there a way to work around that? For example, could the query just wait for the lock on the problematic graphs to be released and then return the SPARQL results?

Cheers,
y
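
P.S. To make the last question concrete, here is a minimal client-side sketch of the behaviour we are after: if a query is aborted because of a deadlock, wait a little and run it again. Everything in it is an assumption for illustration. `DeadlockError` and `run_with_retry` are made-up names, not part of Virtuoso or of any client library, and how a deadlock is actually reported will depend on the client used to send the query.

```python
import time


class DeadlockError(Exception):
    """Hypothetical error: the caller's query function raises this
    when the server aborts the query because of a deadlock."""


def run_with_retry(run_query, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call run_query() and retry with exponential backoff on DeadlockError.

    run_query    -- zero-argument callable that sends the SPARQL query
                    and returns its results (supplied by the caller)
    max_attempts -- total number of tries before giving up
    base_delay   -- delay in seconds before the first retry; doubles each time
    sleep        -- injectable so the waiting can be stubbed out in tests
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return run_query()
        except DeadlockError:
            if attempt == max_attempts:
                # Out of attempts: let the caller see the deadlock.
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

This only hides the problem from the caller. A long query that touches every graph can still be aborted repeatedly while updates keep arriving, which is why a server-side option would be preferable.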
