On 13/07/12 21:55, Andy Seaborne wrote:
On 13/07/12 17:04, Frank Budinsky wrote:
Hi,
We've noticed that this (unionDefaultGraph = true) query:
SELECT ?subject ?predicate ?object ?score
WHERE {
(?object ?score) <http://jena.hpl.hp.com/ARQ/property#textMatch>
"cruise" .
?subject ?predicate ?object .
}
ORDER BY Desc(?score)
runs significantly faster (i.e., 100x) than this one:
SELECT ?subject ?predicate ?object ?graph ?score
WHERE {
GRAPH ?graph {
(?object ?score) <http://jena.hpl.hp.com/ARQ/property#textMatch>
"cruise" .
?subject ?predicate ?object .
}
}
ORDER BY Desc(?score)
Is that expected, and if so, is there another (more efficient) way of
writing such a query that also returns the graphs of the matches?
Which storage layer is this? (TDB?)
How many named graphs are there? And other details of the data distribution?
Given the other report about property functions and TDB, can I assume
you are using 0.9.1 or 0.9.2?
It is possible it will be slower when there are many named graphs - with
GRAPH ?g and a property function ARQ may have to iterate over each named
graph in order to know whether the pattern
{
(?object ?score) pf:textMatch "cruise" .
?subject ?predicate ?object .
}
matches for that graph.
{
(?object ?score) pf:textMatch "cruise" .
?subject ?predicate ?object .
}
on the union graph is in effect ignoring the quad field.
ARQ does not have property function support for quads, only triples.
The text index is not tied to the graph.
Corollary:
WHERE
{
(?object ?score) pf:textMatch "cruise" .
GRAPH ?graph {
?subject ?predicate ?object .
}
}
should be efficient - it goes to the text index once, then checks all
the graphs (as a single quad pattern in TDB).
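Written out in full, the rewritten query might look like the sketch below. It assumes pf: is bound to the ARQ property function namespace used in the original query, and it keeps the original SELECT list and ordering. It has not been run against your data.

```sparql
# Assumption: pf: is the namespace of the textMatch property function
# as it appears in the original query.
PREFIX pf: <http://jena.hpl.hp.com/ARQ/property#>

SELECT ?subject ?predicate ?object ?graph ?score
WHERE {
  # Text index lookup sits outside GRAPH, so it is evaluated once.
  (?object ?score) pf:textMatch "cruise" .
  # Each matched ?object is then looked up across all named graphs.
  GRAPH ?graph {
    ?subject ?predicate ?object .
  }
}
ORDER BY DESC(?score)
```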
Andy