Rob Vesse created JENA-473:
------------------------------

             Summary: ARQ should be able to optimize implicit joins and implicit left joins
                 Key: JENA-473
                 URL: https://issues.apache.org/jira/browse/JENA-473
             Project: Apache Jena
          Issue Type: Improvement
          Components: ARQ
            Reporter: Rob Vesse
            Assignee: Rob Vesse
             Fix For: Jena 2.10.2


ARQ currently does not attempt a useful class of optimizations that are 
usually referred to as implicit joins.

A trivial example is as follows:

SELECT *
WHERE
{
  ?x ?p1 ?o1 .
  ?y ?p2 ?o2 .
  FILTER(?x = ?y)
}

Currently this requires us to compute a cross product and then apply the 
filter. Even with streaming evaluation this can be extremely costly. The aim 
of this optimization is to produce a query like the following:

SELECT *
WHERE
{
  ?x ?p1 ?o1 .
  ?x ?p2 ?o2 .
  BIND(?x AS ?y)
}
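
As a rough illustration of the rewrite above (this is not ARQ code), here is a 
minimal Python sketch. It assumes a flat basic graph pattern modelled as 
(subject, predicate, object) tuples of strings, with variables starting with 
"?". The function names are hypothetical.

```python
def substitute_var(patterns, old, new):
    """Return the patterns with every occurrence of variable `old` replaced by `new`."""
    return [tuple(new if term == old else term for term in triple)
            for triple in patterns]


def rewrite_implicit_join(patterns, keep, drop):
    """Rewrite FILTER(keep = drop) over `patterns` into a join on `keep`.

    Returns the rewritten patterns and a (source, target) pair standing for
    BIND(source AS target), which restores the dropped variable in the results.
    The safety conditions for the rewrite are NOT checked here.
    """
    return substitute_var(patterns, drop, keep), (keep, drop)


# The trivial example: ?x ?p1 ?o1 . ?y ?p2 ?o2 . FILTER(?x = ?y)
patterns = [("?x", "?p1", "?o1"), ("?y", "?p2", "?o2")]
rewritten, bind = rewrite_implicit_join(patterns, keep="?x", drop="?y")

for triple in rewritten:
    print(" ".join(triple), ".")
print("BIND(%s AS %s)" % bind)
```

The filter disappears because both patterns now share ?x, so the engine can 
evaluate them as an ordinary join instead of a cross product.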

This optimization can also be applied to some left joins where the implicit 
join condition spans the two sides of the join, e.g.

SELECT *
WHERE
{
  ?x ?p1 ?o1 .
  OPTIONAL
  {
    ?y ?p2 ?o2 .
    FILTER(?x = ?y)
  }
}

This can be thought of as a generalization of TransformFilterEquality that 
covers the case where both operands are variables.  Because both operands are 
variables we need to be careful about when we apply this optimization: = 
compares literals by value rather than by term (for example 1 and 1.0 are 
equal under = but are different RDF terms), so we need to guarantee that 
substituting one variable for the other does not alter the semantics of the 
query.

I believe the optimization is safe to apply provided that we can guarantee (as 
far as possible) that at least one of the variables cannot be bound to a 
literal.  This can be done by inspecting the positions in which the mentioned 
variables are used and ensuring that at least one of the variables occurs in 
the graph, subject or predicate position.
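
The position check described above could look roughly like the following 
Python sketch. It is an assumption-laden illustration, not ARQ's 
implementation: patterns are modelled as (graph, subject, predicate, object) 
tuples with None for the default graph, and all names are hypothetical.

```python
# Positions in which a variable cannot be bound to a literal.
NON_LITERAL_POSITIONS = {"graph", "subject", "predicate"}


def positions_of(var, quads):
    """Collect the names of the positions in which `var` occurs."""
    names = ("graph", "subject", "predicate", "object")
    return {name
            for quad in quads
            for name, term in zip(names, quad)
            if term == var}


def is_safe_implicit_join(var_a, var_b, quads):
    """True if at least one variable occurs in a non-literal position."""
    return any(positions_of(var, quads) & NON_LITERAL_POSITIONS
               for var in (var_a, var_b))


quads = [(None, "?x", "?p1", "?o1"), (None, "?y", "?p2", "?o2")]
subjects_ok = is_safe_implicit_join("?x", "?y", quads)    # both are subjects
objects_ok = is_safe_implicit_join("?o1", "?o2", quads)   # both only objects
print(subjects_ok, objects_ok)
```

A variable used only in the object position may be bound to a literal, so a 
filter such as FILTER(?o1 = ?o2) would be left alone by this check.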

Safety for left joins is a little more complex: we must ensure that at 
least one of the variables occurs in the RHS, and we can only make the 
substitution in the RHS, as otherwise we change the join semantics.
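
A conservative Python sketch of the left join case follows. It only ever 
rewrites the RHS, and it additionally requires that the replaced variable not 
occur in the LHS; that stricter rule is an assumption of this sketch rather 
than something stated in this issue. The non-literal check is omitted here 
and the names are hypothetical.

```python
def vars_in(patterns):
    """Return the set of variables mentioned in (s, p, o) string tuples."""
    return {term for triple in patterns for term in triple
            if term.startswith("?")}


def rewrite_implicit_left_join(lhs, rhs, var_a, var_b):
    """Rewrite OPTIONAL { rhs FILTER(var_a = var_b) } without touching `lhs`.

    Returns (new_rhs, (source, target)), where the pair stands for
    BIND(source AS target) placed inside the OPTIONAL, or None if the
    rewrite is not applied.
    """
    lhs_vars, rhs_vars = vars_in(lhs), vars_in(rhs)
    for drop, keep in ((var_a, var_b), (var_b, var_a)):
        # Assumption of this sketch: only replace a variable that is
        # introduced by the RHS and never mentioned in the LHS.
        if drop in rhs_vars and drop not in lhs_vars:
            new_rhs = [tuple(keep if term == drop else term for term in triple)
                       for triple in rhs]
            return new_rhs, (keep, drop)
    return None


# ?x ?p1 ?o1 . OPTIONAL { ?y ?p2 ?o2 . FILTER(?x = ?y) }
lhs = [("?x", "?p1", "?o1")]
rhs = [("?y", "?p2", "?o2")]
result = rewrite_implicit_left_join(lhs, rhs, "?x", "?y")
print(result)
```

If neither variable is introduced by the RHS the function returns None and 
the query is left unchanged.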
