Re: Initial Bindings in Update

Stephen Allen Tue, 20 Aug 2013 14:34:16 -0700

I've created a patch and attached to JENA-516, which adds back initial
binding support for update queries.  It will only work for
INSERT/DELETE/WHERE and DELETE/WHERE queries.


Please take a look at it, and test it.  If it looks good, I can check it in.

-Stephen


On Wed, Aug 14, 2013 at 7:52 AM, Andy Seaborne <[email protected]> wrote:
>
> List readers : It would be good to hear from organisations that plug in
> and provide their own query engines).
>
>
> Hi Stephen,
>
>
> On 14/08/13 01:19, Stephen Allen wrote:
>>
>> I don't believe there is anything necessarily holding back the addition
>> an
>> initial binding to the existing UpdateEngine implementations.  It could
>> be
>> carried out similarly to QueryEngineBase and then activate only for
>> DELETE/INSERT/WHERE
>> and DELETE WHERE query operations.
>>
>> However, I think the whole design we currently have is not a very good
>> one.
>>   It is the wrong level for it.  It meant (until Andy changed it) that
>> the
>> substitution occurred after query optimization, which is not really
>> expected.  It also adds a burden on all QueryEngine/UpdateEngine
>> implementations
>> to provide the substitution logic.
>>
>> I think the better way to do it would be to apply the substitution on
>> Query
>> and Update algebra objects before they are passed into the
>> QueryEngine/UpdateEngine, and remove all initial binding handling from
>> the
>> engines.  Users can create a Query or Update objects and call
>> Substitute.substitute()
>> themselves (alternatively we can add a helper method to Query,
>> UpdateModify,
>> and UpdateDeleteWhere that does the same thing).  This has the added
>> benefit of removing the ambiguity surrounding which Update types the
>> binding applies to.
>
>
> I agree that things could be cleaner - it's the passage of time.
>
> Are you proposing a change to the public API?  If so , what exact because
> I don't see how it would work.
>
> Substitute.substitute works on the algebra (we could add operations to
> work on queries but that's not how it is currently; operations on Update by
> syntax can't be done because of the possibility of streaming).
>
> Query objects do not have an algebra object - this does not happen until
> inside QueryEngineBase.  There is no presumption that the algebra as
> provided by ARQ will ever be involved.  Algebra creation happens in the
> constructor of QueryEngineBase.
>
> QueryExecutionBase holds the initial binding and passes it to the query
> engine factory chosen (to produce a Plan - QueryEngineFactory is not
> required to use QueryEngineBase).
>
> The binding does still need to be passed the execution because it must
> come out the other side of pattern matching for use in template
> substitution.  It's easier to send it via the query pattern solutions than
> add an addition route for the binding to get to the templating in CONSTRUCT
> or an update.
>
> If there is no initial binding set by the app, a root one is passed in
> (one row, no solutions), and that is necessary to kick of the execution.  It
> forms the common root of all results.
>
> At the moment, in trunk, the creation of the plan does the
> Substitute.substitute before calling the optimizer at modifyOp.
>
> queryOp is set in the constructor by QueryEngineBase.createOp (an
> extension point).
>
> protected Plan createPlan()
> {
>     // Decide the algebra to actually execute.
>     Op op = queryOp ;
>
>
>     if ( ! startBinding.isEmpty() ) {
>         op = Substitute.substitute(op, startBinding) ;
>         context.put(ARQConstants.sysCurrentAlgebra, op) ;
>         // Don't reset the startBinding because it also is
>         // needed in the output.
>     }
>
>     // Optimization happens here.
>     op = modifyOp(op) ;
>
>     ...
>     evaluate(op, dataset, startBinding, context) ;
>     ...
>
>
>
>> It also moves the feature out of the core API that 3rd party engine
>> developers need to implement, and that is not relevant in  many
>> situations (Fuseki).
>
>
> We may wish to add initial bindings to the protocol.  They work well with
> query-by-reference.
>
> So I think it is a feature that Fuseki might wish to use. The cost in
> execution is currently one call to "startBinding.isEmpty()".  The common
> root binding is always there.
>
>
>> The work involved in this would be to deprecate initial binding support
>> from QueryEngineBase, QueryExecutionFactory, etc, and then also adding
>> two
>> new static methods to Substitute do the substitution of UpdateModify and
>> UpdateDeleteWhere objects.
>
>
> I can only see how to do this if it's query->query rewrite, not
> algebra->algebra because queries don't have an algebra that early.
> Update->update rewrite and streaming don't fit together.
>
>         Andy
>
>>
>> -Stephen
>>
>>
>> On Mon, Aug 12, 2013 at 9:41 PM, Holger Knublauch
>> <[email protected]>wrote:
>>
>>> Hi Andy,
>>>
>>> this is in response to the parallel thread. I honestly wasn't aware that
>>> there remain open issues with UPDATE and thought that the best solution
>>> was
>>> to bring back initial bindings. Your email seems to state that
>>>
>>> INSERT DATA { ?param1 foaf:name ?param2 }
>>>
>>> is problematic because it's not valid syntax. But why is this a show
>>> stopper to restoring initial bindings for other updates? Isn't this just
>>> a
>>> matter of expectation management? Of course any query that takes initial
>>> bindings must also be valid syntactically - if the template query is
>>> invalid then the template with bindings is also invalid.
>>>
>>> This may be one of the advantages of the parameterized SPARQL strings -
>>> it
>>> allows cases such as above because it sits before the string compiler.
>>> Another advantage of that is that magic properties (property functions)
>>> can
>>> be substituted while initial bindings don't recognize those. But this
>>> should be in the hand of the developer to decide which mechanism to use
>>> -
>>> the parameterized strings have their own disadvantages and limitations
>>> too.
>>>
>>> In a nutshell, I remain of the opinion that corner cases such as INSERT
>>> DATA should not lead to dropping an otherwise useful feature for the
>>> other
>>> 99% of use cases (we never use INSERT DATA for example, but have
>>> hundreds
>>> of other UPDATEs). It is inconsistent to support initial bindings for
>>> Query
>>> but not for Update.
>>>
>>> Thanks,
>>> Holger
>>>
>>>
>>>
>>> On 8/10/2013 7:19, Andy Seaborne wrote:
>>>
>>>> On 08/08/13 00:02, Rob Vesse wrote:
>>>>
>>>>   On a related topic I looked at Holger's question around injecting
>>>>>
>>>>> BNodes into SPARQL updates via ParameterizedSparqlString and it
>>>>> doesn't work in the scenario he describes (an INSERT WHERE) if the
>>>>> variable is used both in the INSERT template and WHERE since the
>>>>> template mention is treated as a minting a fresh BNode.  Either he
>>>>> needs to use the BIND workaround discussed by yourself in another
>>>>> thread
>>>>> (http://markmail.org/message/**3lsnjq7yca4es2wb<http://markmail.org/message/3lsnjq7yca4es2wb>)
>>>>>
>>>>> which I suspect
>>>>> is not workable for TQ OR we need to look at restoring initial
>>>>> bindings for updates.
>>>>>
>>>>
>>>> BIND will work - this is way it works for CONSTRUCT.
>>>>
>>>> All Holger's examples are WHERE based.
>>>>
>>>>   I think restoring the feature is going to be the best option, the
>>>>>
>>>>> documentation just needs to be really clear that initial bindings
>>>>> only apply to the WHERE portion of updates and not more generally
>>>>> since that is the only way they were used prior to the feature being
>>>>>   removed (I went back and looked at the ARW 2.9.4 code).  We can
>>>>> always look at expanding their scope later as we've discussed in the
>>>>>   past.
>>>>>
>>>>
>>>> Just to be clear - this is restoring the WHERE based functionality for
>>>> update.
>>>>
>>>> I think that is the way forward.  I'd like to hear from Stephen about
>>>> the implications for streaming but, superficially at least, it does not
>>>> look too bad. The initial binding is applied to each
>>>> DELETE/INSERT/WHERE
>>>> (and DELETE WHERE?), when it is evaluated as a query.  The streaming is
>>>> most critical in the DATA parts.
>>>>
>>>> There is a separate issue about <_:id> and anything reserialized and
>>>> sent
>>>> over the network.
>>>>
>>>>      Andy
>>>>
>>>
>>>
>>
>

Re: Initial Bindings in Update

Reply via email to