Hi Marco. Yes, that's it. The indexes work well in isolation, but don't combine well. Smooshing them into a single index would be a great idea, especially if the query could resolve both text and spatial predicates with one matching scan of the index.
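
To make the difference concrete, here is a toy Python sketch. It is not Jena code: plain lists and dicts stand in for the two indexes, and every name in it (`intersect_by_probe`, `intersect_by_sets`, `text_lookup`) is made up for illustration. The first function does what I am guessing happens today, with one lookup against the second index for every hit from the first. The second fetches each hit list once and intersects them in memory, which is roughly what a combined index might allow.

```python
# Toy illustration only: lists and dicts stand in for the spatial and text
# indexes. None of these names exist in jena-text or jena-spatial.

def intersect_by_probe(spatial_hits, text_lookup):
    """Probe the text index once per spatial hit (the guessed current behaviour)."""
    results = []
    probes = 0
    for ent in spatial_hits:
        probes += 1                      # one round trip to the text index per entity
        score = text_lookup(ent)
        if score is not None:
            results.append((score, ent))
    return results, probes


def intersect_by_sets(spatial_hits, text_hits):
    """Fetch both hit lists once, then intersect them in memory."""
    common = set(spatial_hits).intersection(text_hits)  # iterating a dict yields its keys
    return [(text_hits[ent], ent) for ent in common]


if __name__ == "__main__":
    # Hypothetical data: entity ids near the point, and text scores by entity id.
    spatial_hits = ["e1", "e2", "e3"]
    text_hits = {"e2": 0.9, "e3": 0.4, "e7": 0.1}

    probed, probes = intersect_by_probe(spatial_hits, text_hits.get)
    merged = intersect_by_sets(spatial_hits, text_hits)

    print(sorted(probed), "using", probes, "text lookups")
    print(sorted(merged), "using one fetch per index")
```

Both functions return the same matches; the only difference is how many times the second index is consulted. With about 450 entities in the radius, the probing version would mean about 450 separate text lookups, assuming that is really how the two predicates are combined.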
Perhaps Stephen could be persuaded to pick up the pace on this one?

Thanks

Mark

On 20 December 2015 12:53:39 GMT+00:00, Marco Neumann <[email protected]> wrote:
>yes correct Mark I am only referring to the extra payload here for
>invoking the spatial filter in the SPARQL query.
>
>now that you mention a particular issue with the combined use of both
>jena-text and jena-spatial (something I am not aware of ) this might
>be related to duplicated code in the two projects. back in May Stephen
>Allen wrote on the dev-list that he is about to address some of this
>possibly in a new jena-external-index project.
>
>http://mail-archives.apache.org/mod_mbox/jena-dev/201505.mbox/%3ccaptxtvpwu2ijogyj0kx8o6-07yokk5g1t32b_k3g_cjaqvk...@mail.gmail.com%3E
>
>On Sun, Dec 20, 2015 at 2:29 AM, Mark Wharton
><[email protected]> wrote:
>> Hi
>>
>> Thanks for this. I've read the chapter in the book and now I'm not sure
>> if I misunderstand your reply or you've only addressed half of the problem.
>>
>> I'm not worried about the performance of the spatial search in isolation
>> - that's 97ms which is fine. The text search on its own takes a bit
>> longer but that's acceptable, too.
>>
>> It's when I put the spatial and text *together* that query time increase
>> by 10-30 times. That's the bit I don't understand and would like some
>> help with.
>>
>> Is there a SPARQL query formulation that can "AND the indexes" rather
>> than retrieving one set and looping through to retrieve the matches
>> individually on the other. (Which is my guess as to how it works).
>>
>> Thanks for your help so far.
>>
>> Mark
>>
>> Technology Lead, Iotic Labs
>> [email protected]
>> https://www.iotic-labs.com
>>
>> On 18/12/15 18:59, Marco Neumann wrote:
>>> it's a common spatial access method latency in paticular for small
>>> data sets. you can try a mbr range query instead.
>>>
>>> see Chapter 13 Managing Space and Time in Semantic Web Programming by
>>> John Hebeler et. al.. 2009
>>>
>>> On Fri, Dec 18, 2015 at 10:13 AM, Mark Wharton
>>> <[email protected]> wrote:
>>>> Hi Jena users.
>>>>
>>>> I'm having performance problems with a query that uses text and location
>>>> search
>>>>
>>>> The query is roughly this:
>>>>
>>>>
>>>> SELECT ?score ?ent
>>>> WHERE {
>>>>     ?ent spatial:nearby(51.507999420166016 -0.10999999940395355 70.01807880401611 'km') .
>>>>     (?ent ?score) text:query ('environment' 'lang:en') .
>>>>     ?ent rdf:type iotic:Entity .
>>>> }
>>>>
>>>>
>>>> There are about 450 entities in that radius
>>>> There are about 2200 entities with environment in their rdfs:comment
>>>>
>>>> The query takes 5 seconds.
>>>>
>>>> I've tried this:
>>>> Commenting out the text predicate the query takes 97 ms
>>>> Commenting out the spatial predicate the query takes 438 ms
>>>> Swapping the spatial and text predicates it takes 15 seconds
>>>>
>>>>
>>>> My question is this... It looks like the query is separately getting
>>>> the results of the first two predicates and merging (somehow) to find
>>>> the intersection. Is there a formulation which will intersect the two
>>>> sets faster?
>>>>
>>>> Many TIAs,
>>>>
>>>> Mark
>>>> --
>>>> Technology Lead, Iotic Labs
>>>> [email protected]
>>>> https://www.iotic-labs.com
>>>
>>>
>>>
>
>
>
>--
>
>
>---
>Marco Neumann
>KONA
