On 20/08/14 10:03, Ruben Verborgh wrote:
Dear all,

Can/should a SPARQL parser join groups within a scope?

The parser and the execution engine can do anything they like, provided it does not change the results. What the results are is defined by a sequence of two steps:

1/ Abstract syntax tree -> SPARQL algebra
2/ Execution of the SPARQL algebra.

That's the "can" part ... the "should" is "it depends".

For instance, Section 5.2.2 of the SPARQL 1.1 spec [1]
says that the following patterns all have the same solutions:

  {  ?x foaf:name ?name .
     ?x foaf:mbox ?mbox .
     FILTER regex(?name, "Smith")
  }

  {  FILTER regex(?name, "Smith")
     ?x foaf:name ?name .
     ?x foaf:mbox ?mbox .
  }

  {  ?x foaf:name ?name .
     FILTER regex(?name, "Smith")
     ?x foaf:mbox ?mbox .
  }

Parse trees for the above could look like:

GROUP(GROUP(a, b), FILTER(c))

A sequence of adjacent triple patterns is a "basic graph pattern"
and you may wish to keep those together. The SPARQL grammar does pick this out with the rule

[55]    TriplesBlock

so it's more like GROUP(BGP(a, b), FILTER(c))

What really matters for execution is the algebra.

"Translation to the SPARQL Algebra"
http://www.w3.org/TR/sparql11-query/#sparqlQuery


GROUP(FILTER(c), GROUP(a, b))

GROUP(GROUP(a), FILTER(c), GROUP(b))

In the algebra, they are all the same: FILTERs are placed logically at the end of basic graph patterns (and, of course, an optimizer may do magic with them after that).
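
As a rough illustration of that placement rule, here is a minimal Python sketch of the "collect the FILTERs, then build the BGP" step for a flat group. It is a toy, not the full translation from the spec: it assumes the group contains only triple patterns and FILTERs (no OPTIONAL, UNION, nested groups, etc.), and the tuple-based algebra and the name translate_group are made up for this example.

```python
def translate_group(elements):
    """Translate a flat group of triple patterns and FILTERs to a toy algebra.

    Assumption: elements are ("triple", pattern) or ("filter", expression)
    pairs; anything else a real group may contain is out of scope here.
    """
    # FILTERs are collected first, wherever they appear in the group.
    filters = tuple(e[1] for e in elements if e[0] == "filter")
    # With the FILTERs removed, the remaining triple patterns are adjacent
    # and form a single basic graph pattern.
    triples = tuple(e[1] for e in elements if e[0] == "triple")
    algebra = ("bgp", triples)
    # The collected FILTERs apply to the whole group, logically at the end.
    if filters:
        algebra = ("filter", filters, algebra)
    return algebra


a = ("triple", ("?x", "foaf:name", "?name"))
b = ("triple", ("?x", "foaf:mbox", "?mbox"))
c = ("filter", 'regex(?name, "Smith")')

# The three orderings from the example patterns above.
forms = [[a, b, c], [c, a, b], [a, c, b]]
algebras = [translate_group(form) for form in forms]

print(algebras[0])
print(all(alg == algebras[0] for alg in algebras))
```

All three orderings end up as the same filter-over-bgp structure, which is why the position of the FILTER inside the group does not matter for the results.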

The query in algebra form:

    project (?book ?title)
      filter (regex ?name "Smith")
        bgp
          ?x foaf:name ?name
          ?x foaf:mbox ?mbox

Produced by:
http://www.sparql.org/query-validator.html

One possible optimization is to apply the filter right after matching "?x foaf:name ?name", rather than after the whole BGP has been executed.

Like many things, it's not always how an engine might wish to do it. It might get a stream of "?x" and all its properties at once, for example, as the data access primitive.
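
To make the filter placement idea concrete, here is a small Python sketch that evaluates the two triple patterns over a tiny in-memory dataset in two ways: filtering after the whole BGP, and filtering right after the foaf:name pattern. Everything in it (the sample data, match, name_filter, filter_last, filter_pushed) is invented for illustration and is not how any particular engine works; the regex is approximated with Python's re.search.

```python
import re

# Hypothetical sample data: (subject, predicate, object) triples.
TRIPLES = [
    ("alice", "foaf:name", "Alice Smith"),
    ("alice", "foaf:mbox", "mailto:alice@example.org"),
    ("bob", "foaf:name", "Bob Jones"),
    ("bob", "foaf:mbox", "mailto:bob@example.org"),
]

NAME = ("?x", "foaf:name", "?name")
MBOX = ("?x", "foaf:mbox", "?mbox")


def match(pattern, binding):
    """Yield extensions of `binding` under which `pattern` matches a triple."""
    for triple in TRIPLES:
        new = dict(binding)
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Variable: bind it, or check it agrees with an earlier binding.
                if new.setdefault(term, value) != value:
                    break
            elif term != value:
                # Constant term that does not match this triple.
                break
        else:
            yield new


def name_filter(binding):
    """Stand-in for FILTER regex(?name, "Smith")."""
    return re.search("Smith", binding["?name"]) is not None


def filter_last():
    """Evaluate the whole BGP first, then apply the filter."""
    rows = [b2 for b1 in match(NAME, {}) for b2 in match(MBOX, b1)]
    return [b for b in rows if name_filter(b)]


def filter_pushed():
    """Apply the filter right after the foaf:name pattern, then join."""
    left = [b for b in match(NAME, {}) if name_filter(b)]
    return [b2 for b1 in left for b2 in match(MBOX, b1)]


print(filter_last() == filter_pushed())
```

Both strategies give the same solutions; the second one just avoids looking up mailboxes for names that the filter would reject anyway.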


I wondered if it thus makes sense for a parser
to join the groups together and move them to the front,
so that all of the above become:

GROUP(GROUP(a, b), FILTER(c))

Is it a good idea for a parser to do this in general?

Probably!


Best,

Ruben

That optimization amounts to pushing the filter into the left side of a join, assuming that basic graph patterns are just joins of triples and so can be broken up.



[1] http://www.w3.org/TR/sparql11-query/#scopeFilters


        Andy

