Hello Yves,

Re. hang. When a deadlock happens, you can attach GDB to the running instance and check the stacks of the hanging threads. We can then try to fix the problem using these stacks as a hint.

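For illustration only, here is a minimal sketch of one way to collect those stacks non-interactively. It assumes gdb is installed, that you already know the process ID of the hanging server, and that you have permission to attach to it. The helper names gdb_backtrace_command and dump_thread_stacks are hypothetical and not part of Virtuoso or gdb.

```python
import subprocess
import sys


def gdb_backtrace_command(pid: int) -> list[str]:
    """Build a gdb command line that prints every thread's stack and exits."""
    return [
        "gdb",
        "-p", str(pid),                # attach to the running process
        "-batch",                      # run the commands below, then exit
        "-ex", "thread apply all bt",  # backtrace of every thread
    ]


def dump_thread_stacks(pid: int) -> str:
    """Run gdb against `pid` and return whatever it printed as text."""
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
        check=False,  # gdb may exit non-zero, e.g. if attaching is not permitted
    )
    return result.stdout + result.stderr


if __name__ == "__main__":
    # Usage (hypothetical script name): python dump_stacks.py <pid>
    print(dump_thread_stacks(int(sys.argv[1])))
```

The captured text can be pasted into a reply as-is. Stacks are far more useful when the server binary still has its debug symbols.
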
Re. slow query. More importantly, do you know the graph the triples should come from? If there is a single known graph, please specify it in the query. If both triples should come from one graph but the graph is unknown, then it helps to write the query as

PREFIX mbz: <http://www.bbc.co.uk/ontologies/musicbrainz/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?artist ?type WHERE {
  graph ?g {
    ?artist mbz:firstLetter "a" .
    ?artist a ?type
  }
}

to give more hints to the optimizer.

In any case, if the graph is not a constant then additional indexes are strongly advised, as described in the User's Guide. If indexes are already in place, what type of indexes have you created? If the last item of an index is S then make it a bitmap index. A bitmap index takes half the space of a regular one, resulting in a big difference in speed.

In any case, the output of

explain ('PREFIX mbz: <http://www.bbc.co.uk/ontologies/musicbrainz/> PREFIX foaf: <http://xmlns.com/foaf/0.1/> SELECT ?artist ?type WHERE { ?artist mbz:firstLetter "a" . ?artist a ?type }')

may give us a hint.

Best Regards,
Ivan Mikhailov
OpenLink Software
http://virtuoso.openlinksw.com

On Thu, 2009-06-18 at 12:57 +0100, Yves Raimond wrote:
> Hello!
>
> We are experimenting with Virtuoso Open Source at the BBC to see if it
> could be used as a backend for some of our applications. However, we
> are running in two main issues.
>
> First, we noticed that on an average dataset (a couple of million
> triples), the following query is really fast:
>
> PREFIX mbz: <http://www.bbc.co.uk/ontologies/musicbrainz/>
> PREFIX foaf: <htp://xmlns.com/foaf/0.1/>
> SELECT ?artist ?type WHERE {
> ?artist mbz:firstLetter "a" .
> }
>
>
> However the following query is really, really slow (a couple of
> minutes to answer):
>
> PREFIX mbz: <http://www.bbc.co.uk/ontologies/musicbrainz/>
> PREFIX foaf: <htp://xmlns.com/foaf/0.1/>
> SELECT ?artist ?type WHERE {
> ?artist mbz:firstLetter "a" .
> ?artist a ?type
> }
>
> I am guessing the optimiser must see the rdf:type predicate, figure
> that it should use an index on type first, and end up going through
> every resource in the dataset.
>
> Another major issue we're running into is the deadlocking mechanism.
> We have a constant flow of updates going in through SPARQL/Update. Our
> dataset is a collection of fairly small graphs (around 30 triples
> each). When we do a query like the above, going through all these
> graphs, we're almost sure to reach a deadlock at some point. At almost
> any point in time, there is an update going on in one of the graphs.
>
> Is there a way to work around that? Like just waiting for the deadlock
> to be removed on the problematic graphs and return the SPARQL results?
>
> Cheers,
> y
>
