Hi Andy,

this looks great, and is just in time for the ongoing discussions in the SHACL group. I apologize in advance for not having the bandwidth yet to try this out from your branch, but this topic will definitely bubble up in the priorities soon...

I have not fully understood how the semantics of this are different from the setInitialBinding feature that we currently use in SPIN, and which seems to do a pretty good job. However, having a facility to do further pre-processing in advance may improve performance and provide a more formal definition of what setInitialBinding is doing. I am personally not enthusiastic about approaches based on text-substitution, so working on the parsed syntax tree looks good to me. There are some (rare) cases where text-substitution would be more powerful, e.g. dynamic path properties and some solution modifiers, but as you say no approach is perfect.

Questions:

- Would this also pre-bind variables inside nested SELECTs?
- I assume this can handle blank nodes (e.g. rdf:Lists) as bindings?
- What happens with bound(?var) when ?var is pre-bound?

Thanks
Holger


On 6/28/15 8:08 PM, Andy Seaborne wrote:
(info / discussion / ...)

In working on JENA-963 (OpAsQuery; reworked handling of SPARQL modifiers for GROUP BY), it was easier/better to add the code I had for rewriting syntax by transformation, much like the algebra is rewritten by the optimizer. The use case is rewriting the output of OpAsQuery to remove unnecessary nesting of levels of "{}" which arise during translation for the safety of the translation.

Hence I am putting a general framework for rewriting syntax into package oaj.sparql.syntax.syntaxtransform, like the one we have for the SPARQL+ algebra.

It is also capable of being a parameterized query system (PQ). We already have ParameterizedSparqlString (PSS), so how do they compare?

Work-in-progress:

https://github.com/afs/jena-workspace/blob/master/src/main/java/syntaxtransform/ParameterizedQuery.java

PQ is a rewrite of a Query object (the template) with a map of variables to constants. That is, it works on the syntax tree after parsing and produces a syntax tree.

PSS is a builder with substitution. It builds a string, carefully (injection attacks) and is neutral as to what it is working with - query or update or something weird. http://jena.apache.org/documentation/query/parameterized-sparql-strings.html

Summary:

PQ is only for replacement of a variable in a template.
PSS is a builder that can do that as part of building.

PQ covers cases PSS doesn't - neither is perfect.

PSS works with INSERT DATA.
PQ would use the INSERT { ... } WHERE {} form.

Details:

PSS:
  Can build query, update strings and fragments
  Supports JDBC style positional parameters (a '?')
    These must be bound to get a valid query.
    Can generate illegal syntax.
  Tests the type of the injected value (string, iri, double etc).
  Has corner cases
     Looks for ?x as a string so ...
       "This is not a ?x as a variable"
       <http://example/foo?x=123>
       "SELECT ?x"
       ns:local\?x (a legal local part)
  Protects against injection by checking.
  Works on INSERT DATA.
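
To make the corner cases above concrete, here is a minimal sketch of what goes wrong when "?x" is treated purely as text. It is a deliberately naive stand-in, not the PSS implementation (which does more checking); the class name NaiveSubstitution and its method are hypothetical and use only the Java standard library.

```java
import java.util.List;

class NaiveSubstitution {
    // Replace every textual occurrence of "?name" with the given value.
    // There is no tokenizing here, so literals and IRIs are not protected.
    static String substitute(String text, String name, String value) {
        return text.replace("?" + name, value);
    }

    public static void main(String[] args) {
        // The four corner cases listed above, as raw query fragments.
        List<String> fragments = List.of(
            "\"This is not a ?x as a variable\"",
            "<http://example/foo?x=123>",
            "SELECT ?x",
            "ns:local\\?x");
        for (String fragment : fragments) {
            // Only the "SELECT ?x" case contains a real variable, yet all
            // four fragments are changed by the textual replacement.
            System.out.println(fragment + "  =>  "
                + substitute(fragment, "x", "\"Bristol\""));
        }
    }
}
```

Every fragment is rewritten, although only one of them holds a variable; that is the failure mode a string-based approach has to guard against.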

PQ:
  Replaces SPARQL variables where identified as variables.
    (no extra-syntax positional '?')
  Legal query to legal syntax query.
    The query may violate scope rules (example below).
    Not a query builder.
  Post parser, so no reparsing to use the query
    (for large updates and queries)
  Injection is meaningless - can only inject values, not syntax.
  Can rewrite structurally: "SELECT ?x" => "SELECT  (:value AS ?x)"
    which is useful to record the injection variables.
  Works with "INSERT {?s ?p ?o } WHERE { }"
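
As an illustration of "replaces SPARQL variables where identified as variables", here is a toy model of substitution on already-parsed terms. It is only a sketch: Term, Kind and TermSubstitution are hypothetical names invented for this example and are not Jena classes; the real code works on Jena's Query and Element objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

class TermSubstitution {
    enum Kind { VARIABLE, IRI, LITERAL }

    // A parsed term: the parser has already decided what kind of thing it is.
    static final class Term {
        final Kind kind;
        final String text;

        Term(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }

        @Override
        public String toString() {
            switch (kind) {
                case VARIABLE: return "?" + text;
                case IRI:      return "<" + text + ">";
                default:       return "\"" + text + "\"";
            }
        }
    }

    // Replace only terms that are variables and have a binding; IRIs and
    // literals pass through untouched, whatever characters they contain.
    static List<Term> substitute(List<Term> pattern, Map<String, Term> bindings) {
        List<Term> result = new ArrayList<>();
        for (Term term : pattern) {
            if (term.kind == Kind.VARIABLE && bindings.containsKey(term.text)) {
                result.add(bindings.get(term.text));
            } else {
                result.add(term);
            }
        }
        return result;
    }

    // Write the terms back out, separated by single spaces.
    static String render(List<Term> pattern) {
        StringJoiner joiner = new StringJoiner(" ");
        for (Term term : pattern) {
            joiner.add(term.toString());
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        List<Term> pattern = List.of(
            new Term(Kind.VARIABLE, "s"),
            new Term(Kind.IRI, "http://example/foo?x=123"),
            new Term(Kind.VARIABLE, "x"));
        Map<String, Term> bindings = Map.of("x", new Term(Kind.LITERAL, "Bristol"));
        System.out.println(render(substitute(pattern, bindings)));
    }
}
```

Because only values of an existing term kind can be put in, there is no way for a binding to introduce new syntax, which is the sense in which injection is meaningless here.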

PQ example:

  Query template = QueryFactory.create(.. valid query ..) ;
  Map<String, RDFNode> map = new HashMap<>() ;
  map.put("y", ResourceFactory.createPlainLiteral("Bristol")) ;
  Query query = ParameterizedQuery.setVariables(template, map) ;


A perfect system probably needs a "template language" which is SPARQL extended with a new "template variable" which is only allowed in certain places in the query and must be bound before use.

Some examples of hard templates:

(1) Not variables:
<http://example/foo?x=123>
"This is not a ?x as a variable"
ns:local\?x

(2) Some places ?x can not be replaced with a value directly.
   SELECT ?x { ?s ?p ?x }



A possible output is:
  SELECT  (:X AS ?x) { ?s ?p :X }
which is nice as it records the substitution but it fails when nested again.

SELECT ?x { {SELECT ?x { ?s ?p ?x } } ?s ?p ?o }

This is a bad query:
SELECT (:X AS ?x) { {SELECT (:X AS ?x) { ...
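
The reason the nested form is bad can be sketched as a scope check: the variable introduced by AS must not already be in scope, and the inner SELECT still exposes ?x after it has been rewritten. The model below is a toy under stated assumptions, not Jena code: Select and ProjectionScopeCheck are hypothetical names, a query is reduced to its projected variables, the variables in its own triple patterns, and its nested SELECTs, and every level is assumed to be rewritten to the "(:X AS ?x)" form.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ProjectionScopeCheck {
    // Toy SELECT: names only, no real syntax tree.
    static final class Select {
        final List<String> projected;    // variables in the SELECT clause
        final List<String> patternVars;  // variables in this level's triple patterns
        final List<Select> nested;       // sub-SELECTs inside the WHERE pattern

        Select(List<String> projected, List<String> patternVars, List<Select> nested) {
            this.projected = projected;
            this.patternVars = patternVars;
            this.nested = nested;
        }
    }

    // Variables visible in the body of q once `bound` has been replaced by a
    // constant in the triple patterns at this level.
    static Set<String> inScopeAfterSubstitution(Select q, String bound) {
        Set<String> scope = new HashSet<>();
        for (String v : q.patternVars) {
            if (!v.equals(bound)) {
                scope.add(v); // substituted occurrences are no longer variables
            }
        }
        for (Select inner : q.nested) {
            // A sub-SELECT exposes what it projects; rewritten to
            // (:X AS ?x) it still exposes ?x to the enclosing pattern.
            scope.addAll(inner.projected);
        }
        return scope;
    }

    // True if writing the projection as (:X AS ?bound) keeps q legal.
    static boolean canProjectAsConstant(Select q, String bound) {
        return q.projected.contains(bound)
            && !inScopeAfterSubstitution(q, bound).contains(bound);
    }

    public static void main(String[] args) {
        // SELECT ?x { ?s ?p ?x }
        Select flat = new Select(List.of("x"), List.of("s", "p", "x"), List.of());
        // SELECT ?x { {SELECT ?x { ?s ?p ?x } } ?s ?p ?o }
        Select outer = new Select(List.of("x"), List.of("s", "p", "o"), List.of(flat));
        System.out.println("flat:  " + canProjectAsConstant(flat, "x"));
        System.out.println("outer: " + canProjectAsConstant(outer, "x"));
    }
}
```

In this model the flat query can take the "(:X AS ?x)" form, while the outer query cannot, because ?x arrives from the sub-SELECT and is therefore already in scope.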

(3) Other places:
SELECT ?x { BIND(1 AS ?x) }
SELECT ?x { VALUES ?x { 123 } }

    Andy
