Re: Possible Bug Roundtripping SPARQL to an SSE and back

Martynas Jusevičius Fri, 05 Jun 2015 04:43:38 -0700

Rick,

SPIN is indeed more than a SPARQL vocabulary -- it can do SPARQL-based
rules, constraints etc.


I think in your case it could be used to represent queries in RDF -
which is a platform-independent way to do it.
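
As a rough, library-free illustration of what "treating a query as data"
buys you (this is neither the SPIN vocabulary nor the Jena API; the tuple
layout, the example.org URIs and the names `rewrite_uris` and `to_sse` are
all made up for the sketch), URI substitution over a small query tree
might look like this:

```python
# Toy query tree written as nested tuples, loosely mirroring SSE:
# (operator, arg, ...). Leaves are plain strings: operator names,
# variables such as "?x", or URI constants wrapped in <...>.
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"

query = (
    "project", ("?x",),
    ("bgp",
        ("triple", "?x", RDF_TYPE, "<http://example.org/old/Thing>")),
)


def rewrite_uris(node, mapping):
    """Return a copy of the tree with leaves replaced according to `mapping`."""
    if isinstance(node, tuple):
        return tuple(rewrite_uris(child, mapping) for child in node)
    # Leaf: swap it if it is a key in the mapping, otherwise keep it.
    return mapping.get(node, node)


def to_sse(node):
    """Render the tree as a one-line SSE-style s-expression."""
    if isinstance(node, tuple):
        return "(" + " ".join(to_sse(child) for child in node) + ")"
    return node


# Hypothetical substitution table: old URI constant -> new URI constant.
mapping = {"<http://example.org/old/Thing>": "<http://example.org/new/Thing>"}
rewritten = rewrite_uris(query, mapping)
print(to_sse(rewritten))
```

The rewrite itself is the easy part; with a real store you would still
need a renderer that turns the rewritten structure back into valid SPARQL.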

So far there is no SPIN support in Sesame, but the SPIN API is built
on Jena: http://topbraid.org/spin/api/
If you're using ARQ anyway, I think you might just as well use the SPIN API.

I think you should give Graphity a try, we have SPIN built-in ;)
https://github.com/Graphity/graphity-processor/wiki/How-Graphity-Processor-works

Martynas
graphityhq.com

On Thu, Jun 4, 2015 at 6:38 PM, Rick Moynihan <[email protected]> wrote:
> Hi Martynas,
>
> Yes I'm aware of SPIN; and I agree it's a really nice way to solve this
> problem.  I have a few issues with it as a solution to my problem right now
> though... Which is why I didn't follow up on your suggestion earlier:
>
> 1) I've not had time to fully understand it.  It seems SPIN is more than
> just representing SPARQL queries in RDF.
>
> 2) It doesn't appear to have much real-world support yet (the only place
> I've seen it mentioned is in relation to TopQuadrant stuff).  In
> particular, to my knowledge none of the stores I've used support it yet
> (Sesame, Jena, Stardog, Blazegraph, GraphDB).
>
> I don't know if the intention is for stores to run SPIN queries; but if the
> stores themselves can't process SPIN RDF queries, then to use SPIN I'd need
> to parse a SPARQL query into SPIN RDF.  Once in RDF I can obviously easily
> rewrite the query; but I then need to render the query back out into SPARQL
> to use it with one of my stores.
>
> I skimmed the code when you posted it on the Sesame list, but couldn't
> quite figure out how to use it to solve my problem.  IIRC it seemed to only
> do the first bit, but not the final rendering back out as SPARQL, and the
> code appears also to be embedded in a larger project.
>
> So basically I figured it was less risky to use ARQ.
>
> Anyway, if I'm wrong on any of these points I'd genuinely love to know what
> I can do to make use of it.
>
> Thanks again,
>
> R.
>
> On 4 June 2015 at 16:37, Martynas Jusevičius <[email protected]> wrote:
>
>> Hey,
>>
>> this is probably not directly relevant to your use case, but once
>> again I'd like to suggest using SPIN Vocabulary as an API-neutral way
>> to represent and manage SPARQL natively as RDF:
>> http://spinrdf.org/sp.html
>>
>> I know Sesame has plans to implement support for SPIN:
>> https://openrdf.atlassian.net/browse/SES-1840
>>
>> Martynas
>>
>> On Thu, Jun 4, 2015 at 5:23 PM, Rick Moynihan <[email protected]> wrote:
>> > On 4 June 2015 at 14:08, Andy Seaborne <[email protected]> wrote:
>> >
>> >> Hi Rick,
>> >>
>> >> Are you in a position to give some background on what you're trying
>> >> to achieve overall?
>> >>
>> >
>> > Sure.  Basically we're porting some existing work of ours which was based
>> > on Sesame, which would rewrite queries (and then the results on the way
>> > out) using Sesame's AST.  We're not doing anything especially fancy...
>> > mostly simple URI substitutions on URI constants, both on the query and
>> > then the results; so from the query side the transformation is
>> > essentially invisible.
>> >
>> > This worked great whilst we were using the Sesame native store, as
>> > we'd essentially go from SPARQL -> AST -> rewritten AST -> execution &
>> > results.  However we now wish to make this same code work with remote
>> > stores, which essentially means we have to convert the rewritten queries
>> > back into valid SPARQL queries.
>> >
>> > We don't care about syntactic or structural differences that occur in
>> > the queries, but semantically the queries need to be isomorphic.  We
>> > originally tried using Sesame's query renderer, but it doesn't support
>> > SPARQL 1.1, so we thought we'd port this code to use ARQ.
>> >
>> > and inline ...
>> >>
>> >> On 04/06/15 13:49, Rick Moynihan wrote:
>> >>
>> >>> Hi Rob,
>> >>>
>> >>> Firstly thanks for filing the bug for me.
>> >>>
>> >>> Secondly, in the case you cite, I don't understand why the query:
>> >>>
>> >>> SELECT * {
>> >>>    SELECT ?x { ?x a ?type }
>> >>> }
>> >>>
>> >>> Isn't converted into the SSE:
>> >>>
>> >>> (project (?x ?type)
>> >>>   (project (?x)
>> >>>    (bgp
>> >>>     (triple ?x <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?type))))
>> >>>
>> >>>
>> >> I hope not - that is the algebra for:
>> >>
>> >> SELECT ?x ?type {
>> >>    SELECT ?x { ?x a ?type }
>> >> }
>> >>
>> >> same effect, different way of writing it.
>> >>
>> >
>> > I expanded the * partly because I didn't know whether SSE's represent
>> > wildcards.  For my particular use case either way is fine.
>> >
>> > Rewriting does no "optimization" however simple.
>> >>
>> >>  Or the semantically equivalent:
>> >>>
>> >>> (project (?x ?type)
>> >>>    (bgp
>> >>>     (triple ?x <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?type)))
>> >>>
>> >>> Both of which will OpAsQuery.asQuery back into a semantically
>> >>> equivalent SPARQL query.
>> >>>
>> >>> I clearly don't quite understand the role of the SSE's and the various
>> >>> representations of Query inside ARQ.
>> >>>
>> >>
>> >> SSE is the syntax in which the ARQ internal algebra can be written out
>> >> and read in.  We do loosely talk about SSE for the algebra but in fact
>> >> there are lots of other things that can be written in SSE.  Including
>> >> RDF data!
>> >>
>> >> http://jena.apache.org/documentation/notes/sse.html
>> >>
>> >> SSE is just a syntax.
>> >>
>> >>
>> >>> Essentially I was wanting to use the SSEs to convert a SPARQL query
>> >>> into a tree, rewrite the query tree, and convert it back into SPARQL
>> >>> with OpAsQuery.
>> >>>
>> >>
>> >> If you want to do transforms that restructure the query, that's the way
>> >> I'd do it.
>> >>
>> >
>> > That's a relief, as I've already spent 2 days porting my rewriting code
>> > to ARQ and this style, and my only failing test case is currently with
>> > nested sub queries.
>> >
>> > I think I'd misread Rob's answer with regards to the semantics of * and
>> > subqueries, and thought it was returning a different query!
>> >
>> > So am I right that once JENA-954 is fixed, round tripping should always
>> > be possible, and should always yield a semantically equivalent query?
>> >
>> >
>> >> If you want more of a surface manipulation, and only such manipulations,
>> >> e.g. rename a variable, then a transformation of the query syntax tree
>> >> might be easier.  I have some code for this I can contribute (written as
>> >> an alternative to the query builder / parameterized query code).
>> >>
>> >> https://github.com/afs/AFS-Dev/tree/master/src/main/java/element
>> >>
>> >> If you think you are ever going to need the complicated part, investing
>> >> in setting up the op based one is better. AST manipulation is a bit of a
>> >> dead end.
>> >>
>> >> The OpAsQuery contract can't be query=>op=>exactly the same query
>> >> because there are ways to write two queries that lead to "the same"
>> >> algebra.
>> >>
>> >> ARQ executes the algebra so the Op=>Query step isn't needed (or
>> >> desirable).
>> >>
>> >>> I had assumed that aside from handling the trivialities around
>> >>> restoring the outer query's type (e.g. CONSTRUCT/ASK/DESCRIBE etc...)
>> >>> that this would work, and that SPARQL queries would be round-trippable.
>> >>>
>> >>> What is the reasoning for this property not holding?  I understand that
>> >>> SSEs can express things that can't be expressed in SPARQL, presumably
>> >>> useful once the SSE has been optimized; but why isn't every valid
>> >>> SPARQL query round-trippable, as suggested above?
>> >>>
>> >>
>> >> We hope to have that contract (essentially, reverse the syntax to
>> >> algebra algorithm in the SPARQL spec).
>> >>
>> >
>> > Amazing!  I think this answers my above question.
>> >
>> > I noticed that JENA-954 has been scheduled for 3.0.0; is that release
>> > expected anytime soon?  I'm not sure I know enough about Jena to provide
>> > a patch for this issue yet, but if we could assemble one, could it be
>> > included in an earlier release?
>> >
>> >
>> > R.
>> >
>> >
>> >> Thanks again for filing the bug report for me, and answering my
>> >> questions.
>> >>
>> >> Kind regards,
>> >>
>> >> R.
>> >>
>> >> On 4 June 2015 at 12:28, Rob Vesse <[email protected]> wrote:
>> >>
>> >>  Rick
>> >>>
>> >>> Yes this does look like a bug
>> >>>
>> >>> Please bear in mind however that conversion to algebra does NOT
>> >>> guarantee to round trip because some parts of a query do not end up in
>> >>> the algebra and so OpAsQuery has simply no way to reconstruct the exact
>> >>> original query.
>> >>>
>> >>> For example:
>> >>>
>> >>> SELECT * {
>> >>>    SELECT ?x { ?x a ?type }
>> >>> }
>> >>>
>> >>> Would round trip back to just:
>> >>>
>> >>> SELECT ?x { ?x a ?type }
>> >>>
>> >>>
>> >>> There are also other cases where things could move around slightly, for
>> >>> example a BIND is potentially indistinguishable from a SELECT
>> >>> expression depending on the structure of the query.
>> >>>
>> >>> I have filed this as JENA-954 -
>> >>> https://issues.apache.org/jira/browse/JENA-954
>> >>>
>> >>> Thanks for reporting this,
>> >>>
>> >>> Rob
>> >>>
>> >>> On 04/06/2015 11:45, "Rick Moynihan" <[email protected]> wrote:
>> >>>
>> >>>  Hi all,
>> >>>>
>> >>>> I have been playing around using ARQ to rewrite queries with Jena
>> >>>> 2.13.0 and have encountered what appears to be a bug when roundtripping
>> >>>> a valid SPARQL query through to an SSE and back out as SPARQL.
>> >>>>
>> >>>> The original SPARQL query is this:
>> >>>>
>> >>>> SELECT (COUNT(*) as ?count) {
>> >>>>   SELECT DISTINCT ?uri ?graph WHERE {
>> >>>>     GRAPH ?graph {
>> >>>>       ?uri ?p ?o .
>> >>>>       }
>> >>>>     } LIMIT 1
>> >>>> }
>> >>>>
>> >>>> This parses into the following SSE by going through
>> >>>> QueryFactory.create -> Algebra.compile:
>> >>>>
>> >>>> #<OpProject (project (?tripod_count_var)
>> >>>>   (extend ((?tripod_count_var ?.0))
>> >>>>     (group () ((?.0 (count)))
>> >>>>       (distinct
>> >>>>         (project (?uri ?graph)
>> >>>>           (graph ?graph
>> >>>>             (bgp (triple ?uri ?p ?o))))))))
>> >>>>
>> >>>> To my eye this looks correct so far... next we round trip it back
>> >>>> into a SPARQL query by using OpAsQuery.asQuery that results in:
>> >>>>
>> >>>> #<Query SELECT DISTINCT  (count(*) AS ?tripod_count_var)
>> >>>> WHERE
>> >>>>   { { SELECT  ?uri ?graph
>> >>>>       WHERE
>> >>>>         { GRAPH ?graph
>> >>>>             { ?uri ?p ?o}
>> >>>>         }
>> >>>>     }
>> >>>>   }
>> >>>>
>> >>>>>
>> >>>>>
>> >>>> This now seems broken...  asQuery has mixed the inner select distinct
>> >>>> onto the outer one.  This appears to happen with all sub selects.  I
>> >>>> suspect it might be due to OpAsQuery.asQuery building only one Query
>> >>>> object which is somehow being reused for all sub queries.
>> >>>>
>> >>>> I took a look in the unit tests and found that some of the test
>> >>>> queries in TestOpAsQuery are also subject to this bug e.g. the query on
>> >>>> line 223:
>> >>>>
>> >>>> SELECT ?key ?agg WHERE { { SELECT ?key (COUNT(*) AS ?agg) { ?key ?p ?o }
>> >>>> GROUP BY ?key } }
>> >>>>
>> >>>> Though the tests don't seem to currently test for this kind of thing.
>> >>>>
>> >>>> Can anyone confirm that this is a bug?
>> >>>>
>> >>>> Kind regards,
>> >>>>
>> >>>> R.
>> >>>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>
>>
