List readers : It would be good to hear from organisations that plug in and provide their own query engines).

Hi Stephen,

On 14/08/13 01:19, Stephen Allen wrote:
I don't believe there is anything necessarily holding back the addition an
initial binding to the existing UpdateEngine implementations.  It could be
carried out similarly to QueryEngineBase and then activate only for
DELETE/INSERT/WHERE
and DELETE WHERE query operations.

However, I think the whole design we currently have is not a very good one.
  It is the wrong level for it.  It meant (until Andy changed it) that the
substitution occurred after query optimization, which is not really
expected.  It also adds a burden on all QueryEngine/UpdateEngine
implementations
to provide the substitution logic.

I think the better way to do it would be to apply the substitution on Query
and Update algebra objects before they are passed into the
QueryEngine/UpdateEngine, and remove all initial binding handling from the
engines.  Users can create a Query or Update objects and call
Substitute.substitute()
themselves (alternatively we can add a helper method to Query, UpdateModify,
and UpdateDeleteWhere that does the same thing).  This has the added
benefit of removing the ambiguity surrounding which Update types the
binding applies to.

I agree that things could be cleaner - it's the passage of time.

Are you proposing a change to the public API? If so , what exact because I don't see how it would work.

Substitute.substitute works on the algebra (we could add operations to work on queries but that's not how it is currently; operations on Update by syntax can't be done because of the possibility of streaming).

Query objects do not have an algebra object - this does not happen until inside QueryEngineBase. There is no presumption that the algebra as provided by ARQ will ever be involved. Algebra creation happens in the constructor of QueryEngineBase.

QueryExecutionBase holds the initial binding and passes it to the query engine factory chosen (to produce a Plan - QueryEngineFactory is not required to use QueryEngineBase).

The binding does still need to be passed the execution because it must come out the other side of pattern matching for use in template substitution. It's easier to send it via the query pattern solutions than add an addition route for the binding to get to the templating in CONSTRUCT or an update.

If there is no initial binding set by the app, a root one is passed in (one row, no solutions), and that is necessary to kick of the execution. It forms the common root of all results.

At the moment, in trunk, the creation of the plan does the Substitute.substitute before calling the optimizer at modifyOp.

queryOp is set in the constructor by QueryEngineBase.createOp (an extension point).

protected Plan createPlan()
{
    // Decide the algebra to actually execute.
    Op op = queryOp ;


    if ( ! startBinding.isEmpty() ) {
        op = Substitute.substitute(op, startBinding) ;
        context.put(ARQConstants.sysCurrentAlgebra, op) ;
        // Don't reset the startBinding because it also is
        // needed in the output.
    }

    // Optimization happens here.
    op = modifyOp(op) ;

    ...
    evaluate(op, dataset, startBinding, context) ;
    ...


It also moves the feature out of the core API that 3rd party engine
developers need to implement, and that is not relevant in  many
situations (Fuseki).

We may wish to add initial bindings to the protocol. They work well with query-by-reference.

So I think it is a feature that Fuseki might wish to use. The cost in execution is currently one call to "startBinding.isEmpty()". The common root binding is always there.

The work involved in this would be to deprecate initial binding support
from QueryEngineBase, QueryExecutionFactory, etc, and then also adding two
new static methods to Substitute do the substitution of UpdateModify and
UpdateDeleteWhere objects.

I can only see how to do this if it's query->query rewrite, not algebra->algebra because queries don't have an algebra that early.
Update->update rewrite and streaming don't fit together.

        Andy


-Stephen


On Mon, Aug 12, 2013 at 9:41 PM, Holger Knublauch <[email protected]>wrote:

Hi Andy,

this is in response to the parallel thread. I honestly wasn't aware that
there remain open issues with UPDATE and thought that the best solution was
to bring back initial bindings. Your email seems to state that

INSERT DATA { ?param1 foaf:name ?param2 }

is problematic because it's not valid syntax. But why is this a show
stopper to restoring initial bindings for other updates? Isn't this just a
matter of expectation management? Of course any query that takes initial
bindings must also be valid syntactically - if the template query is
invalid then the template with bindings is also invalid.

This may be one of the advantages of the parameterized SPARQL strings - it
allows cases such as above because it sits before the string compiler.
Another advantage of that is that magic properties (property functions) can
be substituted while initial bindings don't recognize those. But this
should be in the hand of the developer to decide which mechanism to use -
the parameterized strings have their own disadvantages and limitations too.

In a nutshell, I remain of the opinion that corner cases such as INSERT
DATA should not lead to dropping an otherwise useful feature for the other
99% of use cases (we never use INSERT DATA for example, but have hundreds
of other UPDATEs). It is inconsistent to support initial bindings for Query
but not for Update.

Thanks,
Holger



On 8/10/2013 7:19, Andy Seaborne wrote:

On 08/08/13 00:02, Rob Vesse wrote:

  On a related topic I looked at Holger's question around injecting
BNodes into SPARQL updates via ParameterizedSparqlString and it
doesn't work in the scenario he describes (an INSERT WHERE) if the
variable is used both in the INSERT template and WHERE since the
template mention is treated as a minting a fresh BNode.  Either he
needs to use the BIND workaround discussed by yourself in another
thread 
(http://markmail.org/message/**3lsnjq7yca4es2wb<http://markmail.org/message/3lsnjq7yca4es2wb>)
which I suspect
is not workable for TQ OR we need to look at restoring initial
bindings for updates.


BIND will work - this is way it works for CONSTRUCT.

All Holger's examples are WHERE based.

  I think restoring the feature is going to be the best option, the
documentation just needs to be really clear that initial bindings
only apply to the WHERE portion of updates and not more generally
since that is the only way they were used prior to the feature being
  removed (I went back and looked at the ARW 2.9.4 code).  We can
always look at expanding their scope later as we've discussed in the
  past.


Just to be clear - this is restoring the WHERE based functionality for
update.

I think that is the way forward.  I'd like to hear from Stephen about
the implications for streaming but, superficially at least, it does not
look too bad. The initial binding is applied to each DELETE/INSERT/WHERE
(and DELETE WHERE?), when it is evaluated as a query.  The streaming is
most critical in the DATA parts.

There is a separate issue about <_:id> and anything reserialized and sent
over the network.

     Andy





Reply via email to