Rob Vesse created JENA-389:
------------------------------

             Summary: Subquery containing a FILTER NOT EXISTS can cause 
unexpected results
                 Key: JENA-389
                 URL: https://issues.apache.org/jira/browse/JENA-389
             Project: Apache Jena
          Issue Type: Bug
          Components: ARQ
    Affects Versions: Jena 2.10.0
            Reporter: Rob Vesse


Highlighted by Tim Harsch on Answers.SemanticWeb.com - 
http://answers.semanticweb.com/questions/20737/why-doesnt-subquery-variable-get-projected

He found a pair of queries (run against the trivial books database on 
sparql.org) which should give the same results yet yield entirely different 
results.

Query 1:

SELECT *
{
  SELECT (COUNT(?x1) as ?openTriplets)
  WHERE {
    ?x1 ?a1 ?y1 .
    ?y1 ?b1 ?z1 .
    FILTER NOT EXISTS {?z1 ?c1 ?x1}
  }
}

Result 1 - ?openTriplets has count of 6

Query 2:

SELECT ?openTriplets
{
  SELECT (COUNT(?x1) as ?openTriplets)
  WHERE {
    ?x1 ?a1 ?y1 .
    ?y1 ?b1 ?z1 .
    FILTER NOT EXISTS {?z1 ?c1 ?x1}
  }
}

Result 2 - ?openTriplets = 0

This seems to be because explicitly naming the variable in Query 2 adds an 
extra projection, and the way ARQ renames variables to give correct scoping 
then generates a different algebra.

Algebra for Query 1:

(base <http://example/base/>
   (project (?openTriplets)
     (extend ((?openTriplets ?.0))
       (group () ((?.0 (count ?x1)))
         (filter (notexists (bgp (triple ?z1 ?c1 ?x1)))
           (bgp
             (triple ?x1 ?a1 ?y1)
             (triple ?y1 ?b1 ?z1)
           ))))))

Algebra for Query 2:

(base <http://example/base/>
   (project (?openTriplets)
     (project (?openTriplets)
       (extend ((?openTriplets ?/.0))
         (group () ((?/.0 (count ?/x1)))
           (filter (notexists (bgp (triple ?//z1 ?//c1 ?//x1)))
             (bgp
               (triple ?/x1 ?/a1 ?/y1)
               (triple ?/y1 ?/b1 ?/z1)
             )))))))

As the second algebra shows, the extra project appears to cause ARQ to rename 
variables differently: the variables inside the NOT EXISTS (?//z1, ?//c1, 
?//x1) no longer share names with those in the BGP being filtered (?/x1, 
?/y1, ?/z1).  The NOT EXISTS pattern is therefore treated as independent of 
the BGP solutions and matches everything, so every solution is eliminated and 
the second query returns a count of zero.
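
To illustrate the effect, here is a minimal sketch in Python.  It is a toy 
pattern matcher, not ARQ's evaluator, and the two-triple dataset is invented 
for the example (it is not the books database).  All function and constant 
names are hypothetical.  It only shows that when the NOT EXISTS pattern shares 
no variable names with the outer BGP, no bindings are substituted into it, so 
it matches any triple and every solution is filtered out.

```python
# Toy model only: NOT ARQ's implementation, just an illustration of the scoping effect.

def is_var(term):
    return term.startswith("?")

def match_triple(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` matches `triple`; None on failure."""
    result = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # Bind the variable, or fail if it is already bound to something else.
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result

def eval_bgp(patterns, data, binding=None):
    """Return all solutions of a basic graph pattern, starting from `binding`."""
    solutions = [dict(binding or {})]
    for pattern in patterns:
        solutions = [
            extended
            for sol in solutions
            for triple in data
            if (extended := match_triple(pattern, triple, sol)) is not None
        ]
    return solutions

def count_open_triplets(outer, inner, data):
    """Count outer solutions for which the inner (NOT EXISTS) pattern has no match."""
    return sum(1 for sol in eval_bgp(outer, data) if not eval_bgp(inner, data, sol))

# Hypothetical data: a chain a -> b -> c with no triple leading back from c to a.
DATA = [("a", "p", "b"), ("b", "p", "c")]

# Variable names as in the algebra for Query 1: inner and outer names are shared.
OUTER_Q1 = [("?x1", "?a1", "?y1"), ("?y1", "?b1", "?z1")]
INNER_Q1 = [("?z1", "?c1", "?x1")]

# Variable names as in the algebra for Query 2: the inner names differ (?//x1 vs ?/x1).
OUTER_Q2 = [("?/x1", "?/a1", "?/y1"), ("?/y1", "?/b1", "?/z1")]
INNER_Q2 = [("?//z1", "?//c1", "?//x1")]

print(count_open_triplets(OUTER_Q1, INNER_Q1, DATA))  # prints 1
print(count_open_triplets(OUTER_Q2, INNER_Q2, DATA))  # prints 0
```

With shared names the single open chain is counted; with the renamed inner 
variables the NOT EXISTS pattern matches any triple in the data, so the count 
drops to zero, which mirrors the behaviour reported for Query 2.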

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
