Hi Andy,

Oh my, this is a much bigger issue than I thought. The following query pattern also no longer works:

SELECT *
WHERE {
    GRAPH <http://spinrdf.org/spin> {
        ?x rdfs:label ?label .
    }
    BIND (?label AS ?result) .
}

The above is again an artificial test case that makes no sense, but we and our customers have an unknown number of queries in production that use values from other { ... } blocks in BIND steps, often multiple BINDs where intermediate values are sliced and diced by succeeding BINDs. Here is a typical example (?graph is pre-bound from the outside):

SELECT ?result
WHERE {
    {
        BIND (xsd:string(?graph) AS ?str) .
        FILTER fn:starts-with(?str, "urn:x-evn-master:") .
    } .
    BIND (fn:substring(?str, 18) AS ?a) .
    BIND (spif:lastIndexOf(?a, ":") AS ?last) .
    BIND (IF(bound(?last), fn:substring(?a, 1, ?last), ?a) AS ?result) .
}

In another example from our queries

SELECT ?node ?label ?leaf ?icon ?movable
WHERE {
    {
        ?node skos:broader ?parent .
        BIND ((!swa:isReadOnlyTriple(?node, skos:broader, ?parent)) AS ?movable) .
    }
    UNION
    {
        ?parent skos:hasTopConcept ?node .
        BIND ((!swa:isReadOnlyTriple(?parent, skos:hasTopConcept, ?node)) AS ?movable) .
    } .
    BIND (NOT EXISTS  {
        ?child skos:broader ?node .
    } AS ?leaf) .
    BIND (ui:label(?node) AS ?label) .
    BIND ("evn-icon-concept" AS ?icon) .
}
ORDER BY (?label)

I have to move the computation of ?label into each branch of the UNION, and move the computation of ?leaf into the SELECT projection. The latter isn't a big problem except for readability, but the double appearance of ?label is really bad. The new query is

SELECT ?node ?label ((NOT EXISTS  {
    ?child skos:broader ?node .
}) AS ?leaf) ?icon ?movable
WHERE {
    {
        ?node skos:broader ?parent .
        BIND ((!swa:isReadOnlyTriple(?node, skos:broader, ?parent)) AS ?movable) .
        BIND (ui:label(?node) AS ?label) .
    }
    UNION
    {
        ?parent skos:hasTopConcept ?node .
        BIND ((!swa:isReadOnlyTriple(?parent, skos:hasTopConcept, ?node)) AS ?movable) .
        BIND (ui:label(?node) AS ?label) .
    } .
    BIND ("evn-icon-concept" AS ?icon) .
}
ORDER BY (?label)

Others are harder to refactor. For example I have to reformulate this query

SELECT ?actionName ?onSelect ?enabled ?group ?label ?iconClass
WHERE {
    GRAPH ui:unionGraph {
        {
            ?action a swa:ResourceAction .
            ?action rdfs:label ?label .
            ?action arg:condition ?condition .
            BIND (ui:encodeNode(?action) AS ?actionName) .
            BIND (spl:object(?action, arg:onSelect) AS ?onSelectRaw) .
            BIND (COALESCE(?onSelectRaw, IF(swa:hasOtherArgument(?action), CONCAT("swa.openHandlerDialog(\"", ui:escapeJSON(?label), "\", \"<", xsd:string(?action), ">\", \"", ui:escapeJSON(xsd:string(?resource)), "\")"), ?none)) AS ?onSelect) .
            BIND (COALESCE(spl:object(?action, arg:group), "") AS ?group) .
            BIND (spl:object(?action, arg:iconClass) AS ?iconClass) .
            FILTER (((!bound(?appName)) || (?appName = "")) || swa:actionHasAppName(?action, ?appName)) .
        } .
        BIND (spin:eval(?condition, arg:resource, ?resource) AS ?enabled) .
        FILTER bound(?enabled) .
    } .
}
ORDER BY (?group) (?label)

because the FILTER depends on the previous BIND, but the BIND can't use the values from the upper block. I really don't want the spin:eval to be called if the FILTER above it is false - it's an expensive operation. I guess it has to become

SELECT ?actionName ?onSelect ?enabled ?group ?label ?iconClass
WHERE {
    GRAPH ui:unionGraph {
        ?action a swa:ResourceAction .
        ?action rdfs:label ?label .
        ?action arg:condition ?condition .
        BIND (ui:encodeNode(?action) AS ?actionName) .
        BIND (spl:object(?action, arg:onSelect) AS ?onSelectRaw) .
        BIND (COALESCE(?onSelectRaw, IF(swa:hasOtherArgument(?action), CONCAT("swa.openHandlerDialog(\"", ui:escapeJSON(?label), "\", \"<", xsd:string(?action), ">\", \"", ui:escapeJSON(xsd:string(?resource)), "\")"), ?none)) AS ?onSelect) .
        BIND (COALESCE(spl:object(?action, arg:group), "") AS ?group) .
        BIND (spl:object(?action, arg:iconClass) AS ?iconClass) .
        BIND ((((!bound(?appName)) || (?appName = "")) || swa:actionHasAppName(?action, ?appName)) AS ?app) .
        BIND (IF(?app, spin:eval(?condition, arg:resource, ?resource), ?none) AS ?enabled) .
        FILTER bound(?enabled) .
    } .
}
ORDER BY (?group) (?label)

i.e. the trick is to replace the upper FILTER with the intermediate helper variable ?app, and use this to prevent the spin:eval call with an IF. This trick obviously doesn't scale if there is a chain of other BINDs.


While I don't understand all the technical details, I believe BIND has become unnecessarily limited and unintuitive with this spec. If your previous implementation (that you had for many years, including LET) was indeed just a bug, then it was a very useful bug. Whatever happened to the nice mantra that SPARQL is executed from the inside out, if it becomes impossible to use the produced values in BIND statements? It seems that the baby has been thrown out with the bath water here.

I believe TQ will need to raise this issue with the SPARQL 1.1 WG again, although it seems we are very late in the process.

BTW in the future it would be helpful to see such changes listed in the release notes.

And yes, optimizing the FILTER placement would be great and would remove some of the pain and allow query authors to improve query performance.

Thanks,
Holger


On 8/12/2012 1:19, Andy Seaborne wrote:
On 11/08/12 00:50, Holger Knublauch wrote:
On 8/10/2012 19:40, Andy Seaborne wrote:
On 10/08/12 02:12, Holger Knublauch wrote:
Andy,

we are evaluating the move to 2.7.3 and have been immediately hit by
what looks like a change of SPARQL semantics in ARQ. See the attached
Java test which returns "Test" in 272 but null in 273. The query is
really simple:

     SELECT *
     WHERE {
         {
             BIND ("Test" AS ?label) .
         } .
         BIND (?label AS ?result) .
     }

but ?label is no longer visible in the outer BIND. The same happens if
you replace the inner BIND with a BGP that binds ?label, but I wanted to
make the example model independent.

So my obvious question: is this the intended behavior, why the change
etc?

2.7.3 is right - 2.7.2 is wrong (plain old bug, fixed due to having
to clarify scoping in the SPARQL spec so I went back and checked ARQ).

>          {
>              BIND ("Test" AS ?label) .
>          } .
>          BIND (?label AS ?result) .

That's a join of the inner, first BIND and the outer BIND.

The outer BIND applies to the immediately preceding BGP. BIND binds
quite tightly (if you'll forgive the pun).

The preceding BGP is actually empty - it's between the "}" and
"BIND (?label AS ?result) ."

Think of it as :

    {
        { BIND ("Test" AS ?label) . }
        {} BIND (?label AS ?result) .
    }

Technically, that's structurally different but it stresses the empty
part before second BIND.

The important factor is the scope of ?label.

The query joins "BIND ("Test" AS ?label)" and
"BIND (?label AS ?result)".  So it evals "BIND (?label AS ?result)"
not in the context of the "BIND ("Test" AS ?label)" i.e. the use of
?label in "BIND (?label AS ?result)" is unbound.
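
The evaluation described above can be sketched in Python. This is only an illustrative model, not ARQ code: solutions are modelled as plain dictionaries, and the helper names `join` and `extend` are assumptions made for this sketch. It contrasts the join reading described in this message with the 2.7.2 behaviour, where the outer BIND saw the inner block's bindings.

```python
def join(left, right):
    """Join two solution sequences: merge every pair of compatible solutions."""
    out = []
    for a in left:
        for b in right:
            # Two solutions are compatible if they agree on all shared variables.
            if all(a[k] == b[k] for k in a.keys() & b.keys()):
                out.append({**a, **b})
    return out


def extend(solutions, var, expr):
    """BIND: add var to each solution; leave it unbound if expr raises an error."""
    out = []
    for mu in solutions:
        try:
            out.append({**mu, var: expr(mu)})
        except KeyError:  # expression referenced an unbound variable
            out.append(dict(mu))
    return out


EMPTY_BGP = [{}]  # the empty pattern yields one solution with no bindings

# { BIND ("Test" AS ?label) }
inner = extend(EMPTY_BGP, "label", lambda mu: "Test")

# Reading described in this message: the outer BIND extends its own empty BGP,
# and the result is joined with the inner block.
outer = extend(EMPTY_BGP, "result", lambda mu: mu["label"])
joined_reading = join(inner, outer)

# 2.7.2 behaviour: the outer BIND extends the solutions of the inner block.
sequential_reading = extend(inner, "result", lambda mu: mu["label"])

print(joined_reading)
print(sequential_reading)
```

In the joined reading `?label` is not in scope when the outer expression is evaluated, so `?result` stays unbound; in the sequential reading it is bound.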

Thanks Andy. I cannot claim that I understand this yet. Nor do I believe
many of our users will. Where does the "hidden {}" come from?

The pattern that I don't see how to solve with the new design is as
follows:

It's not a new design ... it's what the spec has said all along, although it was a bit of a mess. The descriptive section was clear; the formal section was open to "multiple interpretations" at best, including none :-( Any spec changes are to make it clear. Also, ARQ was just plain wrong and had a bug regardless of the spec.

     {
         ?x ex:prop ?value .
         FILTER (?value some condition) .
     }
     BIND (my:function(?value) AS ?result) .

I only want my:function to execute if the FILTER is passed. Therefore I
cannot simply write

     ?x ex:prop ?value .
     FILTER (?value some condition) .
     BIND (my:function(?value) AS ?result) .

because 2.7.2 moves the FILTER to the end and makes it effectively

     ?x ex:prop ?value .
     BIND (my:function(?value) AS ?result) .
     FILTER (?value some condition) .

I had introduced the inner { ... } block to ensure that the FILTER is
grouped together with the previous line. The mantra "SPARQL executes
from the inside out" was just easy enough to explain, but now inner
blocks seem to have become useless.

How would I have to rewrite the first query to make sure that the BIND
is only executed after the FILTER, but with ?value bound?

So this will do exactly what you want - the SELECT expression form:

SELECT ?value (my:function(?value) AS ?result)
{
   ?x ex:prop ?value .
   FILTER (?value some condition) .
}

It is regrettable that * isn't allowed in this position.

Then it really is like:

BIND (my:function(?value) AS ?result WHERE
      {
         ?x ex:prop ?value .
         FILTER (?value some condition) .
      })

The other way to approach it is:

  {
    ?x ex:prop ?value .
    BIND (my:function(?value) AS ?result) .
    FILTER (?value some condition) .
  }

Any function should really cope with anything passed to it - it can return an error (an exception), in which case ?result is not bound.

The optimizer can push the filter through the (extend) - the algebra operator for BIND - so the execution is more efficient.

BGP -> extend -> filter

becomes

BGP -> filter -> extend

It can do this because the extend variable ?result is not used in the filter. The code (TransformFilterPlacement) does not currently do this.
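
A minimal Python sketch of that reordering, under the assumption that solutions are dictionaries and that `my_function` stands in for an expensive extension function (all names here are illustrative and not part of ARQ). It shows that both orders give the same solutions when the filter does not mention the extend variable, while the pushed-down filter avoids function calls on rows that would be discarded anyway.

```python
call_log = []  # records every argument the expensive function is called with


def my_function(value):
    """Stand-in for an expensive function such as my:function."""
    call_log.append(value)
    return value * 2


def extend(solutions, var, expr):
    """BIND: add var to each solution."""
    return [{**mu, var: expr(mu)} for mu in solutions]


def filter_(solutions, cond):
    """FILTER: keep only the solutions that satisfy cond."""
    return [mu for mu in solutions if cond(mu)]


# Hypothetical BGP result for "?x ex:prop ?value".
bgp = [{"value": v} for v in (1, 5, 10, 20)]


def condition(mu):
    # The filter uses only ?value, never ?result.
    return mu["value"] > 5


def bind_result(mu):
    return my_function(mu["value"])


# BGP -> extend -> filter
call_log.clear()
extend_then_filter = filter_(extend(bgp, "result", bind_result), condition)
calls_extend_first = len(call_log)

# BGP -> filter -> extend
call_log.clear()
filter_then_extend = extend(filter_(bgp, condition), "result", bind_result)
calls_filter_first = len(call_log)

print(extend_then_filter == filter_then_extend)
print(calls_extend_first, calls_filter_first)
```

With this sample data the function is called for all four rows when the extend runs first, and only for the two surviving rows when the filter runs first.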

I'd file a JIRA for it but JIRA@ASF is undergoing maintenance at the moment. They are having to move it to a bigger machine due to too much load.

    Andy


Thanks
Holger


