Hi Andy,

Thanks very much for the detailed assessment.
No, I wasn't on the reviewing panel for the W3C/OGC event - I haven't seen your paper for that! (I do plan to attend the event.)

I asked the question because we have been looking at the various approaches, all based on the Lucene geo functions (Jena spatial, ElasticSearch etc), for indexing and searching geographical data in a linked data context. The Lucene-based approaches should do what we want for now, but I was just thinking that if there was a standardised way of incorporating the geo functions into SPARQL, that would be attractive for us - if the performance was decent of course, and I realise that is far from easy to achieve!

Cheers

Bill

On 12 Feb 2014, at 10:32, Andy Seaborne <[email protected]> wrote:

> On 11/02/14 11:05, Bill Roberts wrote:
>> Hi All
>>
>> Does anyone have plans to implement GeoSPARQL in Jena? I'm aware of Jena
>> Spatial which obviously has many functional similarities, but just wondering
>> if there are plans for GeoSPARQL itself?
>>
>> Thanks
>>
>> Bill
>>
>
> Hi Bill -
>
> Hope you weren't one of the reviewers for the submission to the W3C/OGC
> workshop that's coming up! The "why not GeoSPARQL" came up but as the
> submission tries to point out, the work needed to do even a partial GeoSPARQL
> is not insignificant.
>
> There are non-technical issues as well. Support and user questions -
> suppose a complete, perfect implementation is released in Jena. Or suppose
> it's a partial implementation - now there is a need to explain what is and
> isn't implemented.
>
> As a first step, it needs someone to investigate it properly; it does look to
> me like something that needs resource with access to a geospatial expert for
> at least advice. It's not in the same league as a one-off patch to ARQ.
>
> jena-spatial was driven by the availability of the geo functions in Lucene.
> jena-spatial is a self-contained extension; GeoSPARQL needs deep integration
> into the query engine just to do the same point-in-bounding-box functionality.
>
> From what I can see, there needs to be a community around geospatial data
> somehow, not just users learning about geospatial data. That would be good
> to have wherever it is; Jena community, sub project, independent project on
> github.
>
> GeoSPARQL is a core and a number of extensions. The core is just some class
> definitions - Jena already supports all the core requirements, as do all
> general SPARQL engines, but it does not do anything. It's the various
> extensions that give the functionality.
>
> GeoSPARQL covers regions and boundaries - for the Topology Vocabulary
> Extension (section 7) it needs one or more geo-reasoners to provide the
> topological relations e.g. geo:sfDisjoint in relation_family=Simple
> Features; there is also relation_family=Egenhofer and relation_family=RCC8.
>
> Geometry Extension (section 8) has the interesting part "Non-topological
> Query Functions" (section 8.7)
>
> Take function "geof:distance"
>
>     FILTER ( geof:distance(?geoPoints, SomeFixedGeo, units) < 56 )
>
> which is the within-circle function.
>
> If you simply add that function as a custom function to a general purpose
> SPARQL engine, then to calculate it you need a full scan of the geo data to
> find all the ?geoPoints, and filter them. That's the situation we had
> pre-jena-spatial. It's slow even on modest data without access to a
> geospatial index (R-tree, quad-tree, Lucene spatial, whatever).
>
> jena-spatial collects the bounded geospatial access together in one property
> function that asks a geo index that can find a few points of interest very
> quickly then adds info from the rest of the RDF data.
>
> To do the GeoSPARQL style, you need to pick out from the graph pattern part
> where ?geo came from, being careful that the non-geo access patterns are not
> made inefficient in the process. It's an optimization problem. If the focus
> is on a geospatial DB, then it's not too bad but if the RDF database is some
> geo and a lot of other data, all the optimization choices get mixed up and
> compete.
>
> There are various other geof:* functions which work on regions and run into
> later sections getting more complicated.
>
> The Query Rewrite Extension (section 11) looks fun. It's query rewrite to
> turn property relationships into primitive data access and custom functions.
> ARQ can do that but again, what about when in the context of general data as
> well?
>
> I haven't found geo libraries to use except spatial4j. There are various
> ones using GPL which I haven't tried, and obviously they have
> consequences for the whole of Jena. There would need to be some kind of geo
> index, working with the optimizer and data loading. The Lucene spatial index
> is just point data. An R-tree and regions are needed for more general
> GeoSPARQL extensions.
>
> So - call to geo-experts - is that a fair assessment? Being wrong about the
> amount of work needed would be very good news.
>
> Andy
>
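
To make the full-scan point above concrete, here is a rough Python sketch. It is not Jena, jena-spatial or GeoSPARQL code: the names (haversine_km, within_full_scan, GridIndex) and the sample coordinates are made up for illustration, and a simple lat/lon grid stands in for a real geospatial index such as an R-tree or Lucene spatial. It assumes points are (lat, lon) in degrees, a spherical Earth, and queries that do not cross the antimeridian.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_full_scan(points, centre, radius_km):
    """The 'custom FILTER function' approach: test every point in the data."""
    return sorted(name for name, (lat, lon) in points.items()
                  if haversine_km(lat, lon, *centre) < radius_km)


class GridIndex:
    """Hypothetical stand-in for a geo index: buckets points into lat/lon cells."""

    def __init__(self, points, cell_deg=1.0):
        self.cell_deg = cell_deg
        self.cells = defaultdict(list)
        for name, (lat, lon) in points.items():
            self.cells[self._cell(lat, lon)].append((name, lat, lon))

    def _cell(self, lat, lon):
        return (math.floor(lat / self.cell_deg), math.floor(lon / self.cell_deg))

    def candidates(self, centre, radius_km):
        """Points in cells overlapping the circle's bounding box (no antimeridian wrap)."""
        lat0, lon0 = centre
        delta = radius_km / EARTH_RADIUS_KM  # angular radius in radians
        dlat = math.degrees(delta)
        cos_lat, sin_d = math.cos(math.radians(lat0)), math.sin(delta)
        # Near a pole the circle spans all longitudes, so fall back to 180 degrees.
        dlon = math.degrees(math.asin(sin_d / cos_lat)) if cos_lat > sin_d else 180.0
        lat_lo, lon_lo = self._cell(lat0 - dlat, lon0 - dlon)
        lat_hi, lon_hi = self._cell(lat0 + dlat, lon0 + dlon)
        found = []
        for i in range(lat_lo, lat_hi + 1):
            for j in range(lon_lo, lon_hi + 1):
                found.extend(self.cells.get((i, j), []))
        return found

    def within(self, centre, radius_km):
        """Exact distance test, but only on the few candidates the index returns."""
        return sorted(name for name, lat, lon in self.candidates(centre, radius_km)
                      if haversine_km(lat, lon, *centre) < radius_km)


# Illustrative sample data (approximate city coordinates).
points = {
    "London": (51.5074, -0.1278),
    "Oxford": (51.7520, -1.2577),
    "Bristol": (51.4545, -2.5879),
    "Edinburgh": (55.9533, -3.1883),
    "Paris": (48.8566, 2.3522),
}
centre = points["London"]
index = GridIndex(points)
print(within_full_scan(points, centre, 100.0))
print(index.within(centre, 100.0))
```

Both calls give the same answer, but the full scan computes a distance for every point, while the grid only looks at points in the cells around the query circle. In a mixed RDF store the hard part is not this lookup but getting the query optimizer to use it at the right place in the graph pattern.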
