On 20/08/14 10:03, Ruben Verborgh wrote:
Dear all,
Can/should a SPARQL parser join groups within a scope?
The parser and the execution engine can do anything they like, provided
it does not change the results. What the results are is defined by a
sequence of
1/ Abstract syntax tree -> SPARQL algebra
2/ Execution of the SPARQL algebra.
That's the "can" part ... the "should" is "it depends".
For instance, Section 5.2.2 of the SPARQL 1.1 spec [1]
says that the following patterns all have the same solutions:
{ ?x foaf:name ?name .
?x foaf:mbox ?mbox .
FILTER regex(?name, "Smith")
}
{ FILTER regex(?name, "Smith")
?x foaf:name ?name .
?x foaf:mbox ?mbox .
}
{ ?x foaf:name ?name .
FILTER regex(?name, "Smith")
?x foaf:mbox ?mbox .
}
Parse trees for the above could look like:
GROUP(GROUP(a, b), FILTER(c))
A sequence of adjacent triple patterns is a "basic graph pattern"
and you may wish to keep those together. The SPARQL grammar does pick
this up with
[55] TriplesBlock
so it's more like GROUP(BGP(a, b), FILTER(c))
What really matters for execution is the algebra.
"Translation to the SPARQL Algebra"
http://www.w3.org/TR/sparql11-query/#sparqlQuery
GROUP(FILTER(c), GROUP(a, b))
GROUP(GROUP(a), FILTER(c), GROUP(b))
In the algebra, they are all the same: FILTERs are placed logically at
the end of basic graph patterns (and, of course, an optimizer may do
magic with them after that).
The query, as algebra, is:
project (?book ?title)
  filter (regex ?name "Smith")
    bgp
      ?x foaf:name ?name
      ?x foaf:mbox ?mbox
Produced by:
http://www.sparql.org/query-validator.html
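To make the "they are all the same" point concrete, here is a minimal sketch in Python. It is a toy model, not a real SPARQL parser: the names translate_group and Element are made up for illustration, it only handles groups made of triple patterns and filters, and it merges all triple patterns into a single BGP (a strict translation would join the separate BGPs instead).

```python
# Illustrative sketch only: a toy model of group translation,
# not a real SPARQL parser. All names here are hypothetical.
from typing import List, Tuple

Element = Tuple[str, str]  # ("triple", text) or ("filter", expression)


def translate_group(elements: List[Element]) -> tuple:
    """Translate one group: triples form a BGP, filters wrap the result."""
    # Assumption: every triple pattern in the group goes into one BGP.
    triples = tuple(text for kind, text in elements if kind == "triple")
    filters = [text for kind, text in elements if kind == "filter"]
    algebra = ("bgp", triples)
    # Filters apply to the whole group, wherever they were written.
    for expr in filters:
        algebra = ("filter", expr, algebra)
    return algebra


a = ("triple", "?x foaf:name ?name")
b = ("triple", "?x foaf:mbox ?mbox")
c = ("filter", 'regex(?name, "Smith")')

first = translate_group([a, b, c])   # filter written last
second = translate_group([c, a, b])  # filter written first
third = translate_group([a, c, b])   # filter written in the middle
```

Under these assumptions all three orderings give the same tuple, a filter wrapped around one BGP, which mirrors the algebra shown above.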
One possible optimization is to apply the filter as soon as
"?x foaf:name ?name" has been matched, rather than after the whole BGP
has been executed.
Like many things, that is not always how an engine might wish to do it. It
might get a stream of "?x" and all its properties at once, for example,
as the data access primitive.
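A rough sketch of that optimization in Python, under loud assumptions: the data is invented, the matcher is a naive nested loop, and names such as match, filter_last and filter_pushed are hypothetical. It only illustrates that filtering early and filtering late agree here, not how any particular engine does it.

```python
import re
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]

# Made-up sample data, for illustration only.
DATA: List[Triple] = [
    ("ex:a", "foaf:name", "Alice Smith"),
    ("ex:a", "foaf:mbox", "mailto:alice@example.org"),
    ("ex:b", "foaf:name", "Bob Jones"),
    ("ex:b", "foaf:mbox", "mailto:bob@example.org"),
]

NAME: Triple = ("?x", "foaf:name", "?name")
MBOX: Triple = ("?x", "foaf:mbox", "?mbox")


def match(pattern: Triple, binding: Binding) -> Iterable[Binding]:
    """Extend `binding` with every data triple that matches `pattern`."""
    for triple in DATA:
        new = dict(binding)
        ok = True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable must agree with any value already bound to it.
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield new


def name_is_smith(b: Binding) -> bool:
    """Stand-in for FILTER regex(?name, "Smith")."""
    return re.search("Smith", b["?name"]) is not None


def filter_last() -> List[Binding]:
    """Evaluate the whole BGP, then apply the filter."""
    rows = [b2 for b1 in match(NAME, {}) for b2 in match(MBOX, b1)]
    return [b for b in rows if name_is_smith(b)]


def filter_pushed() -> List[Binding]:
    """Apply the filter right after the name pattern, before the mbox join."""
    kept = [b for b in match(NAME, {}) if name_is_smith(b)]
    return [b2 for b1 in kept for b2 in match(MBOX, b1)]
```

Both functions return the same bindings on this data; the pushed version simply never looks up a mailbox for a name that fails the filter.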
I wondered if it thus makes sense for a parser
to join the groups together and move them to the front,
so that all of the above become:
GROUP(GROUP(a, b), FILTER(c))
Is it a good idea for a parser to do this in general?
Probably!
Best,
Ruben
(The optimization mentioned above amounts to pushing the filter into the
left side of a join, assuming that basic graph patterns are just joins of
triples and so can be broken up.)
[1] http://www.w3.org/TR/sparql11-query/#scopeFilters
Andy