The problem is that the substitution is done after optimization, but it
still has substitution semantics.
Currently, execution proceeds as:
QueryEngineBase.createPlan:
    Op op = modifyOp(queryOp) ;
    eval(op, dsg, binding, context) ;
and modifyOp does the high-level optimizations as selected by the engine.
At evaluation time, something like this happens in eval for each query
engine:
    if ( ! input.isEmpty() )
        op = Substitute.substitute(op, input) ;
We can:
A/ Do early algebra substitution:
Advantage: gives the optimizer more chance to do a good job, especially
on FILTER (?x = ?param)
Disadvantage:
If Service.exec changes to use the original input syntax, it breaks. See
note in that method.
The change is:
QueryEngineBase
    protected Plan createPlan()
    {
        // Decide the algebra to actually execute.
        Op op = queryOp ;
*** New code ***
        if ( ! startBinding.isEmpty() ) {
            op = Substitute.substitute(op, startBinding) ;
            context.put(ARQConstants.sysCurrentAlgebra, op) ;
            // Don't reset the startBinding because it also is needed
            // in the output.
        }
*** New code ***
        op = modifyOp(op) ;
Doing it in setOp() may be better.
This works - testInitialBindings5 and testInitialBindings6 then fail as
expected.
Query engines then don't need to do this step - they still need to
create bindings including the initial conditions (else CONSTRUCT does
not work).
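
To illustrate the advantage claimed for A, here is a minimal, self-contained
sketch of why the order of substitution and optimization matters. It does not
use ARQ: Op, substitute and optimize below are hypothetical toy stand-ins, and
the "optimization" is a simplified equality-filter rewrite (it drops the
filter variable from the pattern, which a real optimizer would need to keep
bound in the results).

```java
import java.util.Collections;
import java.util.Map;

public class EarlySubstitutionSketch {

    /** Toy algebra node: one triple pattern, optionally under FILTER(left = right). */
    public static final class Op {
        final String s, p, o;       // triple pattern terms; "?name" marks a variable
        final String left, right;   // filter operands; both null means "no filter"

        public Op(String s, String p, String o, String left, String right) {
            this.s = s; this.p = p; this.o = o;
            this.left = left; this.right = right;
        }

        @Override
        public String toString() {
            String bgp = "(bgp " + s + " " + p + " " + o + ")";
            return left == null ? bgp : "(filter (= " + left + " " + right + ") " + bgp + ")";
        }
    }

    static boolean isVar(String term) {
        return term != null && term.startsWith("?");
    }

    /** Replace a term by its bound value if it is a bound variable. */
    static String subst(String term, Map<String, String> binding) {
        return isVar(term) && binding.containsKey(term) ? binding.get(term) : term;
    }

    /** Toy counterpart of algebra substitution: apply the binding everywhere. */
    public static Op substitute(Op op, Map<String, String> binding) {
        return new Op(subst(op.s, binding), subst(op.p, binding), subst(op.o, binding),
                      subst(op.left, binding), subst(op.right, binding));
    }

    /** Toy optimizer: rewrite FILTER(?var = constant) into the pattern itself. */
    public static Op optimize(Op op) {
        if (op.left == null || !isVar(op.left) || isVar(op.right)) {
            return op; // nothing to do unless the filter is "variable = constant"
        }
        Map<String, String> eq = Collections.singletonMap(op.left, op.right);
        return new Op(subst(op.s, eq), subst(op.p, eq), subst(op.o, eq), null, null);
    }

    /** Option A: substitute first, then optimize. */
    public static String earlyPlan(Op op, Map<String, String> binding) {
        return optimize(substitute(op, binding)).toString();
    }

    /** Current behaviour: optimize first, substitute at evaluation time. */
    public static String latePlan(Op op, Map<String, String> binding) {
        return substitute(optimize(op), binding).toString();
    }

    public static void main(String[] args) {
        Op op = new Op("?x", "foaf:name", "?n", "?x", "?param");
        Map<String, String> binding =
            Collections.singletonMap("?param", "<http://example/alice>");
        System.out.println("early: " + earlyPlan(op, binding));
        System.out.println("late:  " + latePlan(op, binding));
    }
}
```

In this toy model the early plan collapses to a single triple pattern with a
concrete subject, while the late plan still carries the filter, because the
optimizer only ever saw ?x = ?param.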
B/ Do abstract query syntax substitution.
Do this inside QueryExecutionBase.setInitialBinding.
To this end I have now committed code that does abstract query syntax
substitution that I wrote as an experiment during the last discussions:
https://svn.apache.org/repos/asf/jena/Scratch/AFS/Jena-Dev/trunk/src/element/
This is not advocacy of this approach - it's giving us choices. It also
sort-of works for updates. (Biggest hole is in testing - updates don't
support structure .equals).
Substitution and Update:
We need something. The most natural case IMO is
INSERT DATA { ?param1 foaf:name ?param2 }
so rewriting the abstract syntax tree will not currently work - that's
illegal syntax for an INSERT DATA. The parser could be modified and the
"no vars" check done later.
The current parametrized SPARQL Strings can do this. It is a touch odd
that we have two separate mechanisms.
Multiple Values:
Whatever mechanism we end up with, the semantics of multiple values
should be loop-substitute.
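
As a rough sketch of what loop-substitute means here: for each row of values,
substitute into the template and run the resulting concrete operation, one
after another. The class below is hypothetical and purely textual (closer in
spirit to parametrized strings than to algebra or syntax-tree rewriting); it
only produces the concrete operation strings and does not execute anything. It
also naively treats any "?name" in the text as a variable, including inside
literals or IRIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LoopSubstituteSketch {

    // Matches "?name"; group 1 is the variable name without the question mark.
    private static final Pattern VAR = Pattern.compile("\\?(\\w+)");

    /** Substitute one row of values into the template; unbound variables are left as-is. */
    public static String substitute(String template, Map<String, String> row) {
        Matcher m = VAR.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = row.get(m.group(1));
            String replacement = (value != null) ? value : m.group();
            // quoteReplacement stops '$' and '\' in values being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Loop-substitute: one concrete operation per row, in row order. */
    public static List<String> loopSubstitute(String template, List<Map<String, String>> rows) {
        List<String> operations = new ArrayList<>();
        for (Map<String, String> row : rows) {
            operations.add(substitute(template, row));
        }
        return operations;
    }
}
```

With the template INSERT DATA { ?param1 foaf:name ?param2 } and two rows of
values, this yields two separate INSERT DATA operations, each with no
variables left, rather than one operation over a joined table of values.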
Rob's list:
1 – Remove support for initial bindings on queries entirely
AFS: -1
2 – Change initial bindings to be a pre-optimization algebra
transformation of the query
2a – Algebra -> Algebra done before optimization (A)
2b – AST -> AST done before creating the query engine (B)
AFS: +1 (A is tested, and seems to work).
Note the checked-in code for B does the case of
SELECT ?x {}
==>
SELECT (1 AS ?x) {}
which parametrized SPARQL strings and algebra->algebra do not.
3 – Change initial bindings to be done by injection of VALUES clauses
AFS: -0.5 - seems complicated and I'm not clear it will work (I'd need
more time to think about it).
However this approach might get rather complex
Yes - complicated - certainly it isn't a simple add VALUES to the top
level pattern but I'd have to think harder to know whether it interacts
with scoping and name substitution.
Need to be very careful that it does not assume ARQ's current execution
strategy which is scope sensitive.
4 – Skip optimization when initial bindings are involved
AFS: -1
This isn't per se an optimization issue - the feature is there for
taking information obtained elsewhere - it can turn a graph-wide
query into a quite local one, which is a big advantage.
Andy