Hi Andy,
this looks great, and is just in time for the ongoing discussions in the
SHACL group. I apologize in advance for not having the bandwidth yet to
try this out from your branch, but this topic will definitely bubble up
in the priorities soon...
I have not fully understood how the semantics of this are different from
the setInitialBinding feature that we currently use in SPIN, and which
seems to do a pretty good job. However, having a facility to do further
pre-processing in advance may improve performance and provide a more
formal definition of what setInitialBinding is doing. I am personally
not enthusiastic about approaches based on text-substitution, so working
on the parsed syntax tree looks good to me. There are some (rare) cases
where text-substitution would be more powerful, e.g. dynamic path
properties and some solution modifiers, but as you say no approach is
perfect.
Questions:
- Would this also pre-bind variables inside of nested SELECTs?
- I assume this can handle blank nodes (e.g. rdf:Lists) as bindings?
- What about bound(?var) when ?var is pre-bound?
Thanks
Holger
On 6/28/15 8:08 PM, Andy Seaborne wrote:
(info / discussion / ...)
In working on JENA-963 (OpAsQuery; reworked handling of SPARQL
modifiers for GROUP BY), it was easier/better to add the code I had
for rewriting syntax by transformation, much like the algebra is
rewritten by the optimizer. The use case is rewriting the output of
OpAsQuery to remove unnecessary nesting of levels of "{}" which arise
during translation for the safety of the translation.
Hence putting in package oaj.sparql.syntax.syntaxtransform, a general
framework for rewriting syntax, like we have for the SPARQL+ algebra.
It is also capable of being a parameterized query system (PQ). We
already have ParameterizedSparqlString (PSS), so how do they compare?
Work-in-progress:
https://github.com/afs/jena-workspace/blob/master/src/main/java/syntaxtransform/ParameterizedQuery.java
PQ is a rewrite of a Query object (the template) with a map of
variables to constants. That is, it works on the syntax tree after
parsing and produces a syntax tree.
PSS is a builder with substitution. It builds a string, carefully
(injection attacks) and is neutral as to what it is working with -
query or update or something weird.
http://jena.apache.org/documentation/query/parameterized-sparql-strings.html
Summary:
PQ is only for replacement of a variable in a template.
PSS is a builder that can do that as part of building.
PQ covers cases PSS doesn't - neither is perfect.
PSS works with INSERT DATA.
PQ would use the INSERT { ... } WHERE {} form.
Details:
PSS:
  Can build query, update strings and fragments
  Supports JDBC style positional parameters (a '?')
    These must be bound to get a valid query.
  Can generate illegal syntax.
  Tests the type of the injected value (string, iri, double etc).
  Has corner cases
    Looks for ?x as a string so ...
      "This is not a ?x as a variable"
      <http://example/foo?x=123>
      "SELECT ?x"
      ns:local\?x (a legal local part)
  Protects against injection by checking.
  Works on INSERT DATA.
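
To make the corner cases above concrete, here is a deliberately naive
sketch of text substitution in plain Java. It is not how PSS is
implemented; the class NaiveSubstitution and its substitute() helper are
hypothetical, and they only show what happens when "?x" is matched as a
string with no knowledge of SPARQL syntax.

```java
import java.util.List;

public class NaiveSubstitution {
    // Hypothetical helper (an assumption for illustration, not a Jena API):
    // replaces every textual occurrence of "?name" with the given value.
    static String substitute(String text, String name, String value) {
        return text.replace("?" + name, value);
    }

    public static void main(String[] args) {
        // The four inputs listed in the corner cases above.
        List<String> inputs = List.of(
            "\"This is not a ?x as a variable\"",  // inside a string literal
            "<http://example/foo?x=123>",          // inside an IRI
            "\"SELECT ?x\"",                       // query text quoted as a literal
            "ns:local\\?x");                       // escaped '?' in a local name
        for (String in : inputs) {
            // Each input is changed although none of them contains a variable.
            System.out.println(in + "  =>  " + substitute(in, "x", "<http://example/v>"));
        }
    }
}
```

Any string-based approach has to recognise and skip these positions
itself, which is where the corner cases come from.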
PQ:
  Replaces SPARQL variables where identified as variables.
    (no extra-syntax positional '?')
  Legal query in, syntactically legal query out.
    The query may violate scope rules (example below).
  Not a query builder.
  Post parser, so no reparsing to use the query
    (for large updates and queries)
  Injection is meaningless - can only inject values, not syntax.
  Can rewrite structurally: "SELECT ?x" => "SELECT (:value AS ?x)"
    which is useful to record the injection variables.
  Works with "INSERT {?s ?p ?o } WHERE { }"
PQ example:
Query template = QueryFactory.create(.. valid query ..) ;
Map<String, RDFNode> map = new HashMap<>() ;
map.put("y", ResourceFactory.createPlainLiteral("Bristol")) ;
Query query = ParameterizedQuery.setVariables(template, map) ;
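
As a rough illustration of why substitution on the parsed form avoids
the textual corner cases, here is a self-contained sketch that does not
use Jena at all. The Node class and the var(), term() and substitute()
helpers are hypothetical stand-ins for a real syntax tree; the only
point is that a node is replaced if and only if the parser already
classified it as a variable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class TreeSubstitution {
    // Hypothetical mini syntax tree node: either a variable or a constant term.
    static final class Node {
        final boolean isVar;
        final String text;  // variable name without '?', or the constant's lexical form
        Node(boolean isVar, String text) { this.isVar = isVar; this.text = text; }
        @Override public String toString() { return isVar ? "?" + text : text; }
    }

    static Node var(String name) { return new Node(true, name); }
    static Node term(String lexical) { return new Node(false, lexical); }

    // Builds a new pattern; constants are copied untouched, even if their
    // text happens to contain "?y".
    static List<Node> substitute(List<Node> pattern, Map<String, Node> bindings) {
        List<Node> out = new ArrayList<>();
        for (Node n : pattern) {
            out.add(n.isVar && bindings.containsKey(n.text) ? bindings.get(n.text) : n);
        }
        return out;
    }

    public static void main(String[] args) {
        // One triple pattern: ?s <http://example/foo?y=123> ?y
        List<Node> pattern = List.of(var("s"), term("<http://example/foo?y=123>"), var("y"));
        List<Node> result = substitute(pattern, Map.of("y", term("\"Bristol\"")));
        System.out.println(result);  // prints [?s, <http://example/foo?y=123>, "Bristol"]
    }
}
```

The IRI containing "?y" is left alone because it was never a variable
node. Scope problems such as nested SELECTs are not addressed by this
sketch.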
A perfect system probably needs a "template language", which is SPARQL
extended with a new "template variable" that is only allowed in
certain places in the query and must be bound before use.
Some examples of hard templates:
(1) Not variables:
<http://example/foo?x=123>
"This is not a ?x as a variable"
ns:local\?x
(2) Some places ?x can not be replaced with a value directly.
SELECT ?x { ?s ?p ?x }
A possible output is:
SELECT (:X AS ?x) { ?s ?p :X }
which is nice as it records the substitution, but it fails when nested
again.
SELECT ?x { {SELECT ?x { ?s ?p ?x } } ?s ?p ?o }
This is a bad query:
SELECT (:X AS ?x) { {SELECT (:X AS ?x) { ...
(3) Other places:
SELECT ?x { BIND(1 AS ?x) }
SELECT ?x { VALUES ?x { 123 } }
Andy