Hi Rick,

Are you in a position where you can give some background on what you're trying to achieve overall?

and inline ...


On 04/06/15 13:49, Rick Moynihan wrote:
Hi Rob,

Firstly thanks for filing the bug for me.

Secondly in the case you cite, I don't understand why the query you cite:

SELECT * {
   SELECT ?x { ?x a ?type }
}

Isn't converted into the SSE:

(project (?x ?type)
  (project (?x)
   (bgp
    (triple ?x <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?type))))


I hope not - that is the algebra for:

SELECT ?x ?type {
   SELECT ?x { ?x a ?type }
}

same effect, different way of writing it.

Rewriting does no "optimization" however simple.

Or the semantically equivalent:

(project (?x ?type)
   (bgp
    (triple ?x <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?type)))

Both of which will OpAsQuery.asQuery back into a semantically equivalent
SPARQL query.

I clearly don't quite understand the role of the SSE's and the various
representations of Query inside ARQ.

SSE is the syntax in which the ARQ internal algebra can be written out and read in. We do loosely talk about SSE for the algebra but in fact there are lots of other things that can be written in SSE. Including RDF data!

http://jena.apache.org/documentation/notes/sse.html

SSE is just a syntax.


Essentially I was wanting to use the SSE's to convert a SPARQL query into a
tree, rewrite the query tree, and convert it back into SPARQL with
OpAsQuery.

If you want to do transforms that restructure the query, that's the way I'd do it.
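As a rough illustration of what an op-level rewrite looks like, here is a minimal, self-contained sketch. It is a toy model only: the Op, Bgp, Graph and Project classes below are hypothetical stand-ins written for this example and are not ARQ's classes. In ARQ itself the same shape would presumably go through QueryFactory.create, Algebra.compile, a transform applied with ARQ's Transformer/TransformCopy, and then OpAsQuery.asQuery. The example rewrite wraps every basic graph pattern in a graph op; the graph name <http://example/g> is made up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Toy algebra tree plus a bottom-up rewrite. NOT ARQ's API; every name here is illustrative.
class OpRewriteSketch {

    // Minimal stand-in for an algebra operator that can print itself SSE-style.
    interface Op {
        String toSse();
    }

    // Leaf: a basic graph pattern holding one triple pattern as text.
    static final class Bgp implements Op {
        final String triple;
        Bgp(String triple) { this.triple = triple; }
        public String toSse() { return "(bgp (triple " + triple + "))"; }
    }

    // (graph name sub): evaluate sub against a named graph.
    static final class Graph implements Op {
        final String name;
        final Op sub;
        Graph(String name, Op sub) { this.name = name; this.sub = sub; }
        public String toSse() { return "(graph " + name + " " + sub.toSse() + ")"; }
    }

    // (project (vars) sub): keep only the listed variables.
    static final class Project implements Op {
        final List<String> vars;
        final Op sub;
        Project(List<String> vars, Op sub) { this.vars = vars; this.sub = sub; }
        public String toSse() {
            return "(project (" + String.join(" ", vars) + ") " + sub.toSse() + ")";
        }
    }

    // Rebuild the tree bottom-up, replacing every Bgp with rewriteBgp(bgp).
    static Op rewrite(Op op, Function<Bgp, Op> rewriteBgp) {
        if (op instanceof Bgp) {
            return rewriteBgp.apply((Bgp) op);
        }
        if (op instanceof Graph) {
            Graph g = (Graph) op;
            return new Graph(g.name, rewrite(g.sub, rewriteBgp));
        }
        if (op instanceof Project) {
            Project p = (Project) op;
            return new Project(p.vars, rewrite(p.sub, rewriteBgp));
        }
        throw new IllegalArgumentException("unknown op: " + op);
    }

    // Structural rewrite: wrap each BGP in a (made-up) named graph.
    static String demo() {
        Op original = new Project(Arrays.asList("?x"), new Bgp("?x ?p ?o"));
        Op rewritten = rewrite(original, bgp -> new Graph("<http://example/g>", bgp));
        return rewritten.toSse();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the design is that the rewrite only knows about operators, not about how the query was originally written, which is why it copes with restructuring but cannot promise to give back the original syntax.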

If you want more of a surface manipulation, and only such manipulations, e.g. rename a variable, then a transformation of the query syntax tree might be easier. I have some code for this I can contribute (written as an alternative to the query builder / parameterized query code).

https://github.com/afs/AFS-Dev/tree/master/src/main/java/element

If you think you are ever going to need the complicated part, investing in setting up the op based one is better. AST manipulation is a bit of a dead end.

The OpAsQuery contract can't be query=>op=>exactly the same query because there are ways to write two queries that lead to "the same" algebra.

ARQ executes the algebra so the Op=>Query step isn't needed (or desirable).

I had assumed that aside from handling the trivialities around restoring
the outer query's type (e.g. CONSTRUCT/ASK/DESCRIBE etc...) that this would
work, and that SPARQL queries would be round tripable.

What is the reasoning for this property not holding?  I understand that
SSE's can express things that can't be expressed in SPARQL, presumably
useful once the SSE has been optimized; but why isn't every valid SPARQL
query round tripable, as suggested above?

We hope to have that contract (essentially, reverse the syntax to algebra algorithm in the SPARQL spec).


Aside from this, what is the best way to use ARQ to rewrite a SPARQL 1.1
query, and get a valid SPARQL 1.1 query back out?

Query.toString.

By the way, arq.qparse has internally a lot of checking. It parses, serializes and reparses a query, then checks that the two are .equals (bnode isomorphic), and then does the same for the algebra from the query.

It does not cover OpAsQuery as it may return an equivalent AST that is not .equals.
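The shape of that parse, serialize, reparse check can be sketched in a few lines. This is only a toy, not what arq.qparse does internally: the "parser" below is a hypothetical stand-in that splits on whitespace, where the real check would compare Query objects. What it shows is that equality is tested on the parsed form, so the serialized text is free to differ from the input.

```java
import java.util.Arrays;
import java.util.List;

// Toy round-trip check. The parse/serialize pair is a stand-in, NOT ARQ's parser or writer.
class RoundTripCheckSketch {

    // Stand-in "parser": the parsed form is just the list of whitespace-separated tokens.
    static List<String> parse(String text) {
        return Arrays.asList(text.trim().split("\\s+"));
    }

    // Stand-in "serializer": write the tokens back out with single spaces.
    static String serialize(List<String> parsed) {
        return String.join(" ", parsed);
    }

    // parse -> serialize -> reparse, then compare the two parsed forms.
    static boolean roundTrips(String text) {
        List<String> first = parse(text);
        List<String> second = parse(serialize(first));
        return first.equals(second);
    }

    public static void main(String[] args) {
        String query = "SELECT  ?x\nWHERE { ?x a ?type }";
        System.out.println("parsed forms equal: " + roundTrips(query));
        System.out.println("text identical:     " + serialize(parse(query)).equals(query));
    }
}
```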

        Andy

Thanks again for filing the bug report for me, and answering my questions.

Kind regards,

R.

On 4 June 2015 at 12:28, Rob Vesse <[email protected]> wrote:

Rick

Yes this does look like a bug

Please bear in mind however that conversion to algebra does NOT guarantee
to round trip because some parts of a query do not end up in the algebra
and so OpAsQuery has simply no way to reconstruct the exact original query

For example:

SELECT * {
   SELECT ?x { ?x a ?type }
}

Would round trip back to just:

SELECT ?x { ?x a ?type }


There are also other cases where things could move around slightly, for
example a BIND is potentially indistinguishable from a SELECT expression
depending on the structure of the query.

I have filed this as JENA-954 -
https://issues.apache.org/jira/browse/JENA-954

Thanks for reporting this,

Rob

On 04/06/2015 11:45, "Rick Moynihan" <[email protected]> wrote:

Hi all,

I have been playing around using ARQ to rewrite queries with Jena 2.13.0
and have encountered what appears to be a bug when roundtripping a valid
SPARQL query through to an SSE and back out as SPARQL.

The original SPARQL query is this:

SELECT (COUNT(*) as ?count) {
  SELECT DISTINCT ?uri ?graph WHERE {
    GRAPH ?graph {
      ?uri ?p ?o .
      }
    } LIMIT 1
}

This parses into the following SSE by going through QueryFactory.create ->
Algebra.compile :

#<OpProject (project (?tripod_count_var)
  (extend ((?tripod_count_var ?.0))
    (group () ((?.0 (count)))
      (distinct
        (project (?uri ?graph)
          (graph ?graph
            (bgp (triple ?uri ?p ?o))))))))

To my eye this looks correct so far... next we round trip it back into a
SPARQL query by using OpAsQuery.asQuery that results in:

#<Query SELECT DISTINCT  (count(*) AS ?tripod_count_var)
WHERE
  { { SELECT  ?uri ?graph
      WHERE
        { GRAPH ?graph
            { ?uri ?p ?o}
        }
    }
  }


This now seems broken...  asQuery has mixed the inner select distinct onto
the outer one.  This appears to happen with all sub selects.  I suspect it
might be due to OpAsQuery.asQuery building only one Query object which is
somehow being reused for all sub queries.

I took a look in the unit tests and found that some of the test queries in
TestOpAsQuery are also subject to this bug e.g. the query on line 223:

SELECT ?key ?agg WHERE { { SELECT ?key (COUNT(*) AS ?agg) { ?key ?p ?o }
GROUP BY ?key } }

Though the tests don't seem to currently test for this kind of thing.

Can anyone confirm that this is a bug?

Kind regards,

R.






