On 17 June 2015 at 14:13, Andy Seaborne <[email protected]> wrote:

> On 17/06/15 10:16, Rick Moynihan wrote:
>
>> Hi Andy,
>>
>> Thanks for raising JENA-963 for me - I'll raise the issues directly in the
>> future. Sometimes it's hard to know whether things are intended (or at
>> least accepted) behaviours though.
>>
>
> Point taken. It's those unfunded volunteers - can't rely on them!
> The project takes whatever channels work; we're not, I hope, dogmatic.
> When stuff gets detailed, email isn't so good, whether basic formatting
> stuff or just as a record over time, JIRA is better, at least I find so.
> Helps people see what they can contribute as well.
>
I completely agree about your channels point, which is precisely why I'll
often go to the mailing list before the bug tracker. If you're unsure about
the behaviour, or whether it's a bug, you can get quicker feedback by going
to the mailing list first and then filing it once you're satisfied it's a
bug.

Regardless, I think 963 was clearly a bug, and I should have filed it
directly in JIRA; I will do so in the future.

>> Unfortunately I haven't got an exhaustive set of queries we need to
>> support; but we're basically hoping to have all arbitrary SPARQL 1.1
>> queries round-trip back to a query which is at least equivalent when
>> evaluated on any compliant SPARQL 1.1 database to what went in.
>>
>> Most of the problems I've run into have been uncovered either by using it,
>> writing unit tests for my domain code, by integration testing with some of
>> our other components, or in this particular case by a colleague trying to
>> generate some stats on data we have.
>>
>> Would every example query from the SPARQL 1.1 spec be a good start?
>>
>> http://www.w3.org/TR/sparql11-query/
>>
>> I also have a small collection of about 28 different real world queries
>> (mostly for handling RDF data cubes) which were generated via some of our
>> systems that may be useful. If you'd like me to provide them as potential
>> test cases I'm sure I can do that.
>>
>
> That would be great.
>

Ok, I'm not sure how useful these will be for this bug, but I've created a
repo with 56 real world SPARQL queries (no data), which you're more than
welcome to use as you please. I've licensed the repo as MIT, which I think
should work with Apache; but I'm happy to grant you an Apache license to the
queries as they are, too. Many of the queries were auto generated, so they
might not be what a user would write.

https://github.com/Swirrl/sparql-corpus

Let me know if you need anything else.

> I've done some analysis on JENA-963 and written in the cases I think turn
> out for GROUP BY and it would be good to validate that analysis with real
> world queries of interest.
>

Ok, there happen to be 11 real world GROUP BY queries in that repo:

12:07 $ git grep GROUP
cabi/cabi-calculate-level.sparql:} GROUP BY ?leafConcept ?topConcept
cabi/cabi-count-documents-countries.sparql:} GROUP BY ?countable ?countryLabel LIMIT 10 OFFSET 0
cabi/cabi-count-documents-regions.sparql:} GROUP BY ?countable ?countableId ?countableName
cabi/cabi-count-documents-themes.sparql:} GROUP BY ?countable ?countableId
cabi/cabi-graphs.sparql:} GROUP BY ?o ?g
cabi/cabi-research-outputs.sparql:  } GROUP BY ?resource ?title ?projectUri ?outputTitle ?outputDate
cabi/opendatascotland.sparql:} GROUP BY
cabi/spog.sparql:} GROUP BY ?g
cabi/test-sparql.sparql:  } GROUP BY ?resource ?title ?projectUri ?projectId ?outputTitle ?outputDate
cabi/test.sparql:} GROUP BY ?countable ?countableId ?countableName LIMIT 10 OFFSET 10
pmd/dataset_period_row_labels.sparql:  GROUP BY ?row

> It looks to me like the top-down visit-driven translation is good for the
> WHERE{} part of the algebra to query but spotting group, and all its
> details, is more of a pattern matching task. In fact, having pattern
> matching for the parts outside WHERE{}, all the modifiers in SPARQL, looks
> good.
>
> Algebra that is not in the shape originally generated by the query needs
> to be factored in (not that the contract of OpAsQuery can promise
> perfection there), it's just that, my guess, algebra-like-queries is the
> major use case.
>

I can't say I understand all the details here, but it sounds good. If you
let me know when the code lands in a SNAPSHOT jar, I'll happily integrate it
with our stuff and see if anything else falls out.

> (Yes, clojure would be perfect for this!)
>

It's funny you should say that!
Our systems are actually written in Clojure, and rather than make use of the
visitors Jena provides, I wrote a small functional zipper, in just 9 lines
using clojure.zip, that lets you traverse SSE trees in a few lines.
Obviously, from a Clojure perspective it would be better if SSE items, lists
and nodes were actual Clojure data, but the SSE idea made the whole thing a
joy. Bravo!

R.

> On 15 June 2015 at 18:31, Andy Seaborne <[email protected]> wrote:
>>
>> Hi Rick,
>>>
>>> Sorry, you're not having a good time of it here.
>>>
>>> Not one but 2 related bugs (filter in wrong place, lost the aggregate
>>> function) this time. HAVING is particularly hard because it isn't a
>>> simple mapping to one algebra form.
>>>
>>> If split up:
>>> --------------
>>> PREFIX qb: <http://purl.org/linked-data/cube#>
>>>
>>> SELECT ?obs (COUNT(?value) AS ?C)
>>> WHERE
>>>   { ?obs a qb:Observation .
>>>     ?obs qb:measureType ?measure .
>>>     ?obs ?measure ?value
>>>   }
>>> GROUP BY ?obs
>>> HAVING ( ?C > 1 )
>>> --------------
>>> it goes wrong as well.
>>>
>>> I've recorded it as
>>>
>>> https://issues.apache.org/jira/browse/JENA-963
>>>
>>> A couple of things would be good:
>>>
>>> You can raise JIRA directly - I attached code to the JIRA like it was
>>> from JENA-954. Prefixes etc. - query-in, query-out.
>>>
>>> What would be really good is fix the test coverage. "TestOpAsQuery" is
>>> the test class. Do you have a complete (nearly complete ...) list of
>>> features? What's missing in TestOpAsQuery?
>>>
>>> If we can get the coverage up, we'll be in a better position long term.
>>>
>>> Andy
>>>
>>>
>>> On 15/06/15 16:57, Rick Moynihan wrote:
>>>
>>> Hi all,
>>>>
>>>> I've been using the recent fixes to ARQ (made in JENA-954) around
>>>> rendering SPARQL queries and have encountered another problem where a
>>>> valid query appears to roundtrip to an invalid one.
>>>>
>>>> The problematic query is this:
>>>>
>>>> SELECT ?obs
>>>> WHERE {
>>>>   ?obs a qb:Observation ;
>>>>        qb:measureType ?measure ;
>>>>        ?measure ?value ;
>>>>   .
>>>> }
>>>> GROUP BY ?obs
>>>> HAVING (COUNT(?value) > 1)
>>>>
>>>> Which generates this SSE:
>>>>
>>>> (project
>>>>   (?obs)
>>>>   (filter
>>>>     (> ?.0 1)
>>>>     (group
>>>>       (?obs)
>>>>       ((?.0
>>>>         (count ?value)))
>>>>       (bgp
>>>>         (triple ?obs <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
>>>>                 <http://purl.org/linked-data/cube#Observation>)
>>>>         (triple ?obs <http://purl.org/linked-data/cube#measureType>
>>>>                 ?measure)
>>>>         (triple ?obs ?measure ?value)))))
>>>>
>>>> But when round tripped back into SPARQL with OpAsQuery.asQuery, leads to
>>>> this invalid query:
>>>>
>>>> SELECT ?obs
>>>> WHERE
>>>>   { ?obs <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> qb:Observation .
>>>>     ?obs qb:measureType ?measure .
>>>>     ?obs ?measure ?value
>>>>     FILTER ( ?.0 > 1 )
>>>>   }
>>>> GROUP BY ?obs
>>>>
>>>>
>>>> R.
>>>>
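
For readers following the thread, the "pattern matching" idea discussed above can be illustrated with a small, self-contained Python sketch. It does not use Jena's API and is not the JENA-963 fix; `parse_sse`, `walk`, `having_conditions` and `to_sparql` are hypothetical helper names invented here. The sketch assumes the SSE shape shown in the original report: a `filter` sitting directly on a `group`, whose internal aggregate variable (`?.0`) should be substituted back to give a HAVING condition. The tokenizer is deliberately naive and would mis-read a bare `<` operator as the start of an IRI.

```python
import re


def parse_sse(text):
    """Parse an SSE-style s-expression into nested lists of string tokens."""
    # IRIs in angle brackets, single parentheses, or any other bare token.
    tokens = re.findall(r"<[^>]*>|[()]|[^\s()]+", text)
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok != "(":
            return tok
        items = []
        while tokens[pos] != ")":
            items.append(read())
        pos += 1  # consume the closing ")"
        return items

    return read()


def walk(node):
    """Yield every list node in the tree, depth-first, parents first."""
    if isinstance(node, list):
        yield node
        for child in node:
            yield from walk(child)


def substitute(expr, bindings):
    """Replace variables found in `bindings` with their aggregate expression."""
    if isinstance(expr, list):
        return [substitute(e, bindings) for e in expr]
    return bindings.get(expr, expr)


def having_conditions(op):
    """Match (filter EXPR (group VARS AGGS SUB)) and return each EXPR with
    the internal aggregate variables expanded."""
    found = []
    for node in walk(op):
        if (len(node) == 3 and node[0] == "filter"
                and isinstance(node[2], list) and node[2]
                and node[2][0] == "group"):
            group = node[2]
            # Assumption: a 4-element group carries an aggregate list.
            aggs = group[2] if len(group) == 4 else []
            bindings = {var: agg for var, agg in aggs}
            found.append(substitute(node[1], bindings))
    return found


def to_sparql(expr):
    """Render a tiny subset of expressions: binary comparisons and calls."""
    if not isinstance(expr, list):
        return expr
    head, *args = expr
    if head in {">", "<", ">=", "<=", "=", "!="} and len(args) == 2:
        return f"{to_sparql(args[0])} {head} {to_sparql(args[1])}"
    return f"{head.upper()}({', '.join(to_sparql(a) for a in args)})"


SSE = """
(project (?obs)
  (filter (> ?.0 1)
    (group (?obs) ((?.0 (count ?value)))
      (bgp
        (triple ?obs <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
                <http://purl.org/linked-data/cube#Observation>)
        (triple ?obs <http://purl.org/linked-data/cube#measureType> ?measure)
        (triple ?obs ?measure ?value)))))
"""

for cond in having_conditions(parse_sse(SSE)):
    print("HAVING (" + to_sparql(cond) + ")")  # prints: HAVING (COUNT(?value) > 1)
```

Matching on the filter-over-group shape, rather than visiting each operator in isolation, is what lets the aggregate be restored instead of leaking `?.0` into a FILTER inside WHERE. Algebra that has been rewritten into another shape would need further patterns.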
