I've created a patch and attached to JENA-516, which adds back initial binding support for update queries. It will only work for INSERT/DELETE/WHERE and DELETE/WHERE queries.
Please take a look at it, and test it. If it looks good, I can check it in. -Stephen On Wed, Aug 14, 2013 at 7:52 AM, Andy Seaborne <[email protected]> wrote: > > List readers : It would be good to hear from organisations that plug in > and provide their own query engines). > > > Hi Stephen, > > > On 14/08/13 01:19, Stephen Allen wrote: >> >> I don't believe there is anything necessarily holding back the addition >> an >> initial binding to the existing UpdateEngine implementations. It could >> be >> carried out similarly to QueryEngineBase and then activate only for >> DELETE/INSERT/WHERE >> and DELETE WHERE query operations. >> >> However, I think the whole design we currently have is not a very good >> one. >> It is the wrong level for it. It meant (until Andy changed it) that >> the >> substitution occurred after query optimization, which is not really >> expected. It also adds a burden on all QueryEngine/UpdateEngine >> implementations >> to provide the substitution logic. >> >> I think the better way to do it would be to apply the substitution on >> Query >> and Update algebra objects before they are passed into the >> QueryEngine/UpdateEngine, and remove all initial binding handling from >> the >> engines. Users can create a Query or Update objects and call >> Substitute.substitute() >> themselves (alternatively we can add a helper method to Query, >> UpdateModify, >> and UpdateDeleteWhere that does the same thing). This has the added >> benefit of removing the ambiguity surrounding which Update types the >> binding applies to. > > > I agree that things could be cleaner - it's the passage of time. > > Are you proposing a change to the public API? If so , what exact because > I don't see how it would work. > > Substitute.substitute works on the algebra (we could add operations to > work on queries but that's not how it is currently; operations on Update by > syntax can't be done because of the possibility of streaming). > > Query objects do not have an algebra object - this does not happen until > inside QueryEngineBase. There is no presumption that the algebra as > provided by ARQ will ever be involved. Algebra creation happens in the > constructor of QueryEngineBase. > > QueryExecutionBase holds the initial binding and passes it to the query > engine factory chosen (to produce a Plan - QueryEngineFactory is not > required to use QueryEngineBase). > > The binding does still need to be passed the execution because it must > come out the other side of pattern matching for use in template > substitution. It's easier to send it via the query pattern solutions than > add an addition route for the binding to get to the templating in CONSTRUCT > or an update. > > If there is no initial binding set by the app, a root one is passed in > (one row, no solutions), and that is necessary to kick of the execution. It > forms the common root of all results. > > At the moment, in trunk, the creation of the plan does the > Substitute.substitute before calling the optimizer at modifyOp. > > queryOp is set in the constructor by QueryEngineBase.createOp (an > extension point). > > protected Plan createPlan() > { > // Decide the algebra to actually execute. > Op op = queryOp ; > > > if ( ! startBinding.isEmpty() ) { > op = Substitute.substitute(op, startBinding) ; > context.put(ARQConstants.sysCurrentAlgebra, op) ; > // Don't reset the startBinding because it also is > // needed in the output. > } > > // Optimization happens here. > op = modifyOp(op) ; > > ... > evaluate(op, dataset, startBinding, context) ; > ... > > > >> It also moves the feature out of the core API that 3rd party engine >> developers need to implement, and that is not relevant in many >> situations (Fuseki). > > > We may wish to add initial bindings to the protocol. They work well with > query-by-reference. > > So I think it is a feature that Fuseki might wish to use. The cost in > execution is currently one call to "startBinding.isEmpty()". The common > root binding is always there. > > >> The work involved in this would be to deprecate initial binding support >> from QueryEngineBase, QueryExecutionFactory, etc, and then also adding >> two >> new static methods to Substitute do the substitution of UpdateModify and >> UpdateDeleteWhere objects. > > > I can only see how to do this if it's query->query rewrite, not > algebra->algebra because queries don't have an algebra that early. > Update->update rewrite and streaming don't fit together. > > Andy > >> >> -Stephen >> >> >> On Mon, Aug 12, 2013 at 9:41 PM, Holger Knublauch >> <[email protected]>wrote: >> >>> Hi Andy, >>> >>> this is in response to the parallel thread. I honestly wasn't aware that >>> there remain open issues with UPDATE and thought that the best solution >>> was >>> to bring back initial bindings. Your email seems to state that >>> >>> INSERT DATA { ?param1 foaf:name ?param2 } >>> >>> is problematic because it's not valid syntax. But why is this a show >>> stopper to restoring initial bindings for other updates? Isn't this just >>> a >>> matter of expectation management? Of course any query that takes initial >>> bindings must also be valid syntactically - if the template query is >>> invalid then the template with bindings is also invalid. >>> >>> This may be one of the advantages of the parameterized SPARQL strings - >>> it >>> allows cases such as above because it sits before the string compiler. >>> Another advantage of that is that magic properties (property functions) >>> can >>> be substituted while initial bindings don't recognize those. But this >>> should be in the hand of the developer to decide which mechanism to use >>> - >>> the parameterized strings have their own disadvantages and limitations >>> too. >>> >>> In a nutshell, I remain of the opinion that corner cases such as INSERT >>> DATA should not lead to dropping an otherwise useful feature for the >>> other >>> 99% of use cases (we never use INSERT DATA for example, but have >>> hundreds >>> of other UPDATEs). It is inconsistent to support initial bindings for >>> Query >>> but not for Update. >>> >>> Thanks, >>> Holger >>> >>> >>> >>> On 8/10/2013 7:19, Andy Seaborne wrote: >>> >>>> On 08/08/13 00:02, Rob Vesse wrote: >>>> >>>> On a related topic I looked at Holger's question around injecting >>>>> >>>>> BNodes into SPARQL updates via ParameterizedSparqlString and it >>>>> doesn't work in the scenario he describes (an INSERT WHERE) if the >>>>> variable is used both in the INSERT template and WHERE since the >>>>> template mention is treated as a minting a fresh BNode. Either he >>>>> needs to use the BIND workaround discussed by yourself in another >>>>> thread >>>>> (http://markmail.org/message/**3lsnjq7yca4es2wb<http://markmail.org/message/3lsnjq7yca4es2wb>) >>>>> >>>>> which I suspect >>>>> is not workable for TQ OR we need to look at restoring initial >>>>> bindings for updates. >>>>> >>>> >>>> BIND will work - this is way it works for CONSTRUCT. >>>> >>>> All Holger's examples are WHERE based. >>>> >>>> I think restoring the feature is going to be the best option, the >>>>> >>>>> documentation just needs to be really clear that initial bindings >>>>> only apply to the WHERE portion of updates and not more generally >>>>> since that is the only way they were used prior to the feature being >>>>> removed (I went back and looked at the ARW 2.9.4 code). We can >>>>> always look at expanding their scope later as we've discussed in the >>>>> past. >>>>> >>>> >>>> Just to be clear - this is restoring the WHERE based functionality for >>>> update. >>>> >>>> I think that is the way forward. I'd like to hear from Stephen about >>>> the implications for streaming but, superficially at least, it does not >>>> look too bad. The initial binding is applied to each >>>> DELETE/INSERT/WHERE >>>> (and DELETE WHERE?), when it is evaluated as a query. The streaming is >>>> most critical in the DATA parts. >>>> >>>> There is a separate issue about <_:id> and anything reserialized and >>>> sent >>>> over the network. >>>> >>>> Andy >>>> >>> >>> >> >
