Hi Andy

Thanks very much for the detailed assessment.

No, I wasn't on the reviewing panel for the W3C/OGC event - I haven't seen your 
paper for that!  (I do plan to attend the event.)

I asked the question because we have been looking at the various approaches, all 
based on the Lucene geo functions (Jena spatial, ElasticSearch etc), for 
indexing and searching geographical data in a linked data context.

The Lucene-based approaches should do what we want for now, but I was just 
thinking that if there were a standardised way of incorporating the geo 
functions into SPARQL, that would be attractive for us - if the performance was 
decent of course, and I realise that is far from easy to achieve!


Cheers

Bill





On 12 Feb 2014, at 10:32, Andy Seaborne <[email protected]> wrote:

> On 11/02/14 11:05, Bill Roberts wrote:
>> Hi All
>> 
>> Does anyone have plans to implement GeoSPARQL in Jena?  I'm aware of Jena 
>> Spatial which obviously has many functional similarities, but just wondering 
>> if there are plans for GeoSPARQL itself?
>> 
>> Thanks
>> 
>> Bill
>> 
> 
> Hi Bill -
> 
> Hope you weren't one of the reviewers for the submission to the W3C/OGC 
> workshop that's coming up!  The "why not GeoSPARQL" question came up but, as 
> the submission tries to point out, the work needed to do even a partial 
> GeoSPARQL is not insignificant.
> 
> There are non-technical issues as well.  Support and user questions - 
> suppose a complete, perfect implementation is released in Jena.  Or suppose 
> it's a partial implementation - now there is a need to explain what is and 
> isn't implemented.
> 
> As a first step, it needs someone to investigate it properly; it does look to 
> me like something that needs resource with access to a geospatial expert for 
> at least advice.  It's not in the same league as a one-off patch to ARQ.
> 
> jena-spatial was driven by the availability of the geo functions in Lucene.  
> jena-spatial is a self-contained extension; GeoSPARQL needs deep integration 
> into the query engine just to do the same point-in-bounding-box functionality.
> 
> From what I can see, there needs to be a community around geospatial data 
> somehow, not just users learning about geospatial data.  That would be good 
> to have wherever it is; Jena community, sub project, independent project on 
> github.
> 
> GeoSPARQL is a core and a number of extensions.  The core is just some class 
> definitions - Jena already supports all the core requirements, as do all 
> general SPARQL engines, but the core on its own does not do anything.  It's 
> the various extensions that give the functionality.
> 
> GeoSPARQL covers regions and boundaries - for the Topology Vocabulary 
> Extension (section 7) it needs one or more geo-reasoners to provide the 
> topological relations, e.g. geo:sfDisjoint in relation_family=Simple 
> Features; there is also relation_family=Egenhofer and relation_family=RCC8.
> 
> The Geometry Extension (section 8) has the interesting part, "Non-topological 
> Query Functions" (section 8.7).
> 
> Take function "geof:distance"
> 
>    FILTER ( geof:distance(?geoPoints, SomeFixedGeo, units) < 56 )
> 
> which is the within-circle function.
> 
> If you simply add that function as a custom function to a general purpose 
> SPARQL engine, then to calculate it you need a full scan of the geo data to 
> find all the ?geoPoints, and filter them.  That's the situation we had 
> pre-jena-spatial.  It's slow even on modest data without access to a 
> geospatial index (R-tree, quad-tree, Lucene spatial, whatever).
> 
> jena-spatial collects the bounded geospatial access together in one property 
> function that asks a geo index, which can find a few points of interest very 
> quickly, then adds info from the rest of the RDF data.
> 
> To do the GeoSPARQL style, you need to pick out from the graph pattern part 
> where ?geo came from, being careful that the non-geo access patterns are not 
> made inefficient in the process.  It's an optimization problem.  If the focus 
> is on a geospatial DB, then it's not too bad but if the RDF database is some 
> geo and a lot of other data, all the optimization choices get mixed up and 
> compete.
> 
> There are various other geof:* functions which work on regions, and the 
> later sections get more complicated.
> 
> The Query Rewrite Extension (section 11) looks fun.  It's query rewrite to 
> turn property relationships into primitive data access and custom functions.  
> ARQ can do that but, again, what about in the context of general data as 
> well?
> 
> I haven't found geo libraries to use except spatial4j.  There are various 
> ones using GPL which I haven't tried, and obviously they have consequences 
> for the whole of Jena.  There would need to be some kind of geo index, 
> working with the optimizer and data loading.  The Lucene spatial index is 
> just point data.  An R-tree and regions are needed for more general 
> GeoSPARQL extensions.
> 
> So - call to geo-experts - is that a fair assessment?  Being wrong about the 
> amount of work needed would be very good news.
> 
>       Andy
> 
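
To make the full-scan versus index point about geof:distance concrete, here is
a minimal Python sketch.  It assumes planar x/y coordinates in arbitrary units
rather than real geodesic distance, and the names within_full_scan and
GridIndex are hypothetical: they are not part of Jena, jena-spatial or
GeoSPARQL.  The uniform grid only stands in for whatever real index (R-tree,
quad-tree, Lucene spatial) would be used.

```python
import math
from collections import defaultdict


def within_full_scan(points, cx, cy, radius):
    """Test every point, as a plain FILTER custom function would have to."""
    return {name for name, (x, y) in points.items()
            if math.hypot(x - cx, y - cy) < radius}


class GridIndex:
    """Toy uniform grid: buckets points by cell so a query visits few cells."""

    def __init__(self, points, cell_size):
        self.points = points
        self.cell_size = cell_size
        self.cells = defaultdict(list)
        for name, (x, y) in points.items():
            self.cells[self._cell(x, y)].append(name)

    def _cell(self, x, y):
        return (math.floor(x / self.cell_size),
                math.floor(y / self.cell_size))

    def within(self, cx, cy, radius):
        """Return (hits, examined): matches and how many points were tested."""
        # Only cells overlapping the circle's bounding box can hold a match.
        min_i, min_j = self._cell(cx - radius, cy - radius)
        max_i, max_j = self._cell(cx + radius, cy + radius)
        hits, examined = set(), 0
        for i in range(min_i, max_i + 1):
            for j in range(min_j, max_j + 1):
                for name in self.cells.get((i, j), ()):
                    examined += 1
                    x, y = self.points[name]
                    # Exact distance test on the few remaining candidates.
                    if math.hypot(x - cx, y - cy) < radius:
                        hits.add(name)
        return hits, examined


# Made-up sample data; 56 echoes the threshold in the FILTER example.
sample = {"a": (1, 1), "b": (50, 50), "c": (3, 4),
          "d": (-200, 10), "e": (500, 500)}
index = GridIndex(sample, cell_size=100)
hits, examined = index.within(0, 0, 56)
print(sorted(hits), examined)  # ['a', 'c'] 3
```

Both routes give the same answer, but the grid only runs the distance test on
the three points in nearby cells instead of all five.  The harder part
described above, getting the query optimizer to choose the index for the geo
part of a graph pattern without hurting the rest, is not covered by this
sketch.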
