Andy Seaborne <[email protected]> wrote on 07/14/2012 04:59:13 AM:

> From: Andy Seaborne <[email protected]>
> To: [email protected],
> Date: 07/14/2012 05:00 AM
> Subject: Re: LARQ query with GRAPH clause
> Sent by: Andy Seaborne <[email protected]>
>
> On 13/07/12 21:55, Andy Seaborne wrote:
> > On 13/07/12 17:04, Frank Budinsky wrote:
> >>
> >>
> >> Hi,
> >>
> >> We've noticed that this (unionDefaultGraph = true) query:
> >>
> >> SELECT ?subject ?predicate ?object ?score
> >>    WHERE {
> >>      (?object ?score) <http://jena.hpl.hp.com/ARQ/property#textMatch> "cruise" .
> >>      ?subject ?predicate ?object .
> >>    }
> >> ORDER BY Desc(?score)
> >>
> >> runs significantly faster (i.e., 100x) than this one:
> >>
> >> SELECT ?subject ?predicate ?object ?graph ?score
> >>    WHERE {
> >>      GRAPH ?graph {
> >>        (?object ?score) <http://jena.hpl.hp.com/ARQ/property#textMatch> "cruise" .
> >>        ?subject ?predicate ?object .
> >>      }
> >>    }
> >> ORDER BY Desc(?score)
> >>
> >> Is that expected, and if so, is there another (more efficient) way of
> >> writing such a query that also returns the graphs of the matches?
> >
> > Which storage layer is this? (TDB?)
> > How many named graphs are there? And what are the other details of the
> > data distribution?
> >
> > Given the other report about property functions and TDB, can I assume
> > you are using 0.9.1 or 0.9.2?
> >
> > It is possible it will be slower when there are many named graphs - with
> > GRAPH ?g and a property function, ARQ may have to iterate over each named
> > graph in order to know whether the pattern
> >
> > {
> >    (?object ?score) pf:textMatch "cruise" .
> >    ?subject ?predicate ?object .
> > }
> >
> > matches for that graph.
> >
> > {
> >   (?object ?score) pf:textMatch "cruise" .
> >    ?subject ?predicate ?object .
> > }
> >
> > on the union graph is in effect ignoring the quad field.
> >
> > ARQ does not have property function support for quads, only triples.
> >
>
> The text index is not tied to the graph.
>
> Corollary:
>
> WHERE
> {
>     (?object ?score) pf:textMatch "cruise" .
>      GRAPH ?graph {
>         ?subject ?predicate ?object .
>        }
> }
>
> should be efficient - it goes to the text index once, then checks all
> the graphs (as a single quad pattern in TDB).
>
>    Andy
>

Hi Andy,

You're absolutely right about the cause of the problem - the performance of
the original query is tied to the number of named graphs in the datastore.

Your suggested alternative query fixes the problem.
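
For the archives, here is a sketch of what the complete rewritten query
looks like. It assumes the same SELECT list and ORDER BY as the original
query, and that the pf: prefix is bound to the ARQ property function
namespace used above:

```sparql
PREFIX pf: <http://jena.hpl.hp.com/ARQ/property#>

SELECT ?subject ?predicate ?object ?graph ?score
WHERE {
  # Text match sits outside the GRAPH block, so (per Andy's explanation)
  # the text index is consulted once rather than once per named graph.
  (?object ?score) pf:textMatch "cruise" .
  # Only the triple pattern is evaluated against the named graphs.
  GRAPH ?graph {
    ?subject ?predicate ?object .
  }
}
ORDER BY DESC(?score)
```

The only change from the slow version is that the textMatch pattern has
moved out of the GRAPH ?graph block.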

Thanks a lot for your help.

Frank.
