Andy Seaborne <[email protected]> wrote on 07/14/2012 04:59:13 AM:
> From: Andy Seaborne <[email protected]>
> To: [email protected]
> Date: 07/14/2012 05:00 AM
> Subject: Re: LARQ query with GRAPH clause
> Sent by: Andy Seaborne <[email protected]>
>
> On 13/07/12 21:55, Andy Seaborne wrote:
> > On 13/07/12 17:04, Frank Budinsky wrote:
> >>
> >> Hi,
> >>
> >> We've noticed that this (unionDefaultGraph = true) query:
> >>
> >> SELECT ?subject ?predicate ?object ?score
> >> WHERE {
> >>   (?object ?score) <http://jena.hpl.hp.com/ARQ/property#textMatch>
> >>       "cruise" .
> >>   ?subject ?predicate ?object .
> >> }
> >> ORDER BY Desc(?score)
> >>
> >> runs significantly faster (i.e., 100x) than this one:
> >>
> >> SELECT ?subject ?predicate ?object ?graph ?score
> >> WHERE {
> >>   GRAPH ?graph {
> >>     (?object ?score) <http://jena.hpl.hp.com/ARQ/property#textMatch>
> >>         "cruise" .
> >>     ?subject ?predicate ?object .
> >>   }
> >> }
> >> ORDER BY Desc(?score)
> >>
> >> Is that expected, and if so, is there another (more efficient) way of
> >> writing such a query that also returns the graphs of the matches?
> >
> > Which storage layer is this? (TDB?)
> > How many named graphs are there? And other details of the data distribution?
> >
> > Given the other report about property functions and TDB, can I assume
> > you are using 0.9.1 or 0.9.2?
> >
> > It is possible it will be slower when there are many named graphs - with
> > GRAPH ?g and a property function ARQ may have to iterate over each named
> > graph in order to know whether the pattern
> >
> > {
> >   (?object ?score) pf:textMatch "cruise" .
> >   ?subject ?predicate ?object .
> > }
> >
> > matches for that graph.
> >
> > {
> >   (?object ?score) pf:textMatch "cruise" .
> >   ?subject ?predicate ?object .
> > }
> >
> > on the union graph is in effect ignoring the quad field.
> >
> > ARQ does not have property function support for quads, only triples.
> >
> > The text index is not tied to the graph.
>
> Corollary:
>
> WHERE
> {
>   (?object ?score) pf:textMatch "cruise" .
>   GRAPH ?graph {
>     ?subject ?predicate ?object .
>   }
> }
>
> should be efficient - it goes to the text index once, then checks all
> the graphs (as a single quad pattern in TDB).
>
> Andy

Hi Andy,

You're absolutely right about the cause of the problem - the performance
of the original query is tied to the number of named graphs in the
datastore. Your suggested alternative query fixes the problem.

Thanks a lot for your help.

Frank.
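
For reference, the complete query with Andy's suggested WHERE clause would presumably look like the sketch below. It is assembled from the SELECT and ORDER BY lines of the original query and the rewritten pattern quoted above; the pf: prefix declaration is an assumption, inferred from the full property URI used in the original query. The text-index lookup sits outside the GRAPH block, so it runs once, and only the triple pattern is matched against the named graphs.

```sparql
# Assumed prefix, matching the full URI used in the original query
PREFIX pf: <http://jena.hpl.hp.com/ARQ/property#>

SELECT ?subject ?predicate ?object ?graph ?score
WHERE {
  # Text-index lookup, evaluated once (not per named graph)
  (?object ?score) pf:textMatch "cruise" .
  # Find which named graph(s) contain triples with the matched object
  GRAPH ?graph {
    ?subject ?predicate ?object .
  }
}
ORDER BY DESC(?score)
```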
