Thank you very much, very helpful.
One final question. When you say a FILTER (such as FILTER
(!sameTerm(?a, ?b))) removes certain solutions in a SELECT query, does that
mean it removes whole rows that do not pass the FILTER? It does not try to
do anything clever about removing individual values within a row?
On 5/13/2020 3:44 AM, Richard Cyganiak wrote:
> Hi Steve,
>
>> On 12 May 2020, at 22:19, Steve Vestal <[email protected]>
>> wrote:
>>
>> DuCharme's book says that the order of OPTIONALs matters, in the sense
>> that if bindings have been done by an OPTIONAL, then later OPTIONALs
>> will have no effect if any variables they might bind have already been
>> bound (so order of OPTIONALs can affect query results).
>>
>> What prompted this query (no pun intended) is the discovery that putting
>> a FILTER inside an OPTIONAL {} versus outside makes a difference, even
>> though the FILTERs (which were some !sameTerm() conditions to avoid
>> aliasing some variables) are the same. If I have the pattern "OPTIONAL
>> { <triples> FILTER <filter>} " versus "OPTIONAL {<triples>} FILTER
>> <filter>", is the behavior that the first form is considered to fail and
>> not bind its variables due to filtering within {}; while the second form
>> would consider the OPTIONAL to have succeeded in binding the variables
>> and cause subsequent OPTIONALs to be skipped?
> Yes, that sounds about right.
>
> SELECT query results are “tables” where each “row” is called a “solution” and
> each “column” a “variable”. When we say that a variable is bound to some
> value, that always means it is bound to a value *within a particular row*
> a.k.a. solution. The intermediate results at every step of SPARQL evaluation
> are also “tables” of this same form, and all SPARQL operators can be
> understood in those terms—they produce a table from a graph (in the case of
> triple patterns), or produce new tables from other tables. For example, the
> result of UNION is simply one table appended to another; the result of FILTER
> is a table with non-matching rows removed.
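
To make the "table" view concrete, here is a rough Python sketch, assuming each solution is a plain dict from variable name to value. The helper names filter_rows and union are made up for illustration; this is not how any SPARQL engine is actually implemented. It shows that FILTER drops whole rows and never touches individual values within a row.

```python
# Each solution ("row") is a dict mapping variable names to values.
# The values here are invented placeholders, not real data.
rows = [
    {"a": "ex:x", "b": "ex:x"},
    {"a": "ex:x", "b": "ex:y"},
]


def filter_rows(solutions, condition):
    """FILTER: keep only the rows for which the condition holds."""
    return [row for row in solutions if condition(row)]


def union(left, right):
    """UNION: one table appended to the other."""
    return left + right


# Roughly FILTER (!sameTerm(?a, ?b)): the first row is removed entirely.
not_same = filter_rows(rows, lambda row: row["a"] != row["b"])
print(not_same)
```

The surviving row keeps all of its bindings; nothing is "unbound" inside it.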
>
> In SPARQL semantics, when curly braces are involved, evaluation is
> “inside-out”. Whatever is inside curly braces is completely evaluated
> independently from anything outside the braces, before being combined with
> other solutions from outside.
>
> For OPTIONAL this means that the graph pattern inside { } is completely
> evaluated independently from what's before or after the OPTIONAL { }. FILTERs
> inside the { } will be evaluated as part of this. Once this is complete, the
> results will be combined in an OPTIONAL join (left join) with the solutions
> of the graph pattern that preceded the OPTIONAL.
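
The inside-versus-outside difference can be sketched the same way in Python. This follows the simplified description above and assumes the FILTER only mentions variables bound inside the braces. The function names and sample values are hypothetical, chosen only for illustration.

```python
def compatible(left, right):
    """Two solutions are compatible if they agree on every shared variable."""
    return all(left[v] == right[v] for v in left.keys() & right.keys())


def left_join(left_rows, right_rows):
    """OPTIONAL: extend each left row with every compatible right row,
    or keep the left row unchanged when there is no compatible right row."""
    result = []
    for left in left_rows:
        matches = [{**left, **right} for right in right_rows
                   if compatible(left, right)]
        result.extend(matches if matches else [left])
    return result


def filter_rows(solutions, condition):
    """FILTER: keep only the rows for which the condition holds."""
    return [row for row in solutions if condition(row)]


def not_same(row):
    """Stand-in for !sameTerm(?a, ?b); only applied to rows binding both."""
    return row["a"] != row["b"]


before = [{"s": "ex:s1"}]                           # pattern before the OPTIONAL
group = [{"s": "ex:s1", "a": "ex:x", "b": "ex:x"}]  # triples inside the { }

# OPTIONAL { <triples> FILTER <filter> }: the group is filtered first.
inside = left_join(before, filter_rows(group, not_same))

# OPTIONAL { <triples> } FILTER <filter>: the joined rows are filtered.
outside = filter_rows(left_join(before, group), not_same)

print(inside)   # the row for ex:s1 survives, without ?a and ?b
print(outside)  # the row for ex:s1 is removed altogether
```

With the FILTER inside, the OPTIONAL simply contributes nothing and the preceding row is kept; with the FILTER outside, the OPTIONAL binds ?a and ?b and the whole row is then thrown away.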
>
>> Do variables stay bound
>> once bound, a FILTER just filters the final result, as if all FILTERs
>> were applied at the end?
> Within a group (a { ... } section), one could say that variables in a
> solution stay bound once bound. But entire solutions can be removed by
> FILTERs. Within a group, the order of *some* constructs matters (OPTIONAL
> being one). Other constructs between these order-dependent ones can be
> rearranged without changing the result. So, if you had a graph pattern of
> this shape:
>
> { ... OPTIONAL {...} triples FILTER triples FILTER triples OPTIONAL {...}
> ... }
>
> Then the order of the triples and FILTERs can be freely changed, but moving
> them before or after the OPTIONALs would potentially affect the result.
>
>> But in the first pattern above, the filter
>> nested inside the OPTIONAL{} is able to unbind anything that was bound
>> inside that OPTIONAL?
> “Unbinding” is not a particularly good way to think of it.
>
> I would say: The FILTER inside the OPTIONAL { } removes certain solutions
> from the result of the { } group, and therefore those removed solutions are
> not taken into account when the group's result gets left-joined to whatever
> came before the OPTIONAL.
>
>> Ignoring performance issues, are there any cases where the order of
>> FILTER statements would affect the result of the query? Or are
>> OPTIONALs the only things that have order-dependent semantics?
> OPTIONAL, BIND and MINUS are order-dependent. FILTERs can be moved around
> between those, but moving them “over” an order-dependent construct
> potentially changes the result.
>
>> Do nested braces have any impact on variable name visibility or
>> semantics?
> It's complicated.
>
> Some occurrences of a variable potentially “bind” the variable, that is, they
> may cause a value to appear for that variable in a solution. This includes
> variables in triple patterns, in BIND (expr AS ?var), and in VALUES.
>
> Other occurrences of a variable only “use” variables that have been
> previously bound. For example, FILTER can never cause new bindings.
>
> “Bound” variables travel inside-out through curly braces, and top-down within
> a single level of nesting. “Used” variables don't travel.
>
> Consider this:
>
> { a b OPTIONAL { c } d OPTIONAL { e } f }
>
> A variable bound in c would be visible at d and f, but not at a, b, or e.
>
> A sub-SELECT only binds the variables in its variable list; other variables
> that may occur inside it are not visible outside.
>
> (One way to test all this is by inserting “BIND (1 AS ?var)” at various
> points in the query. This is a syntax error if ?var is already bound at that
> place.)
>
>> Are all variables appearing anywhere in a WHERE{} in the
>> same global namespace, or are there cases where nested {} have some
>> namespace semantics?
> All variables are in the same namespace. But queries can be written so that
> two occurrences of the same variable never interact. The question is what
> operators connect them.
>
> Excellent questions by the way!
>
> (What I describe above is one way of determining the correct result of a
> SPARQL query. There are other ways that produce the same correct results. The
> language in the spec is different. Implementations may do something
> different. What matters is that all get the same results.)
>
> Richard