ASF GitHub Bot commented on JENA-1277:

GitHub user osma opened a pull request:


    JENA-1277: don't use sorting in spatial queries, for much better performance

    This PR proposes removing the `distSort` parameter from the Lucene spatial
    query performed by jena-spatial. Dropping the sorting gives a massive
    performance boost; in the Geonames example given in JENA-1277, the query
    time drops from over 20 seconds to less than 200 ms.

    I suppose that the sorting is not necessary since jena-spatial results are
    just raw material for the SPARQL engine anyway.
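
The reasoning above can be illustrated with a minimal, library-free sketch. It does not reproduce the jena-spatial or Lucene code path; the `BoxQuerySketch` class, the `Place` type, and the sample coordinates are all hypothetical. It only shows that sorting box-query hits by distance changes their order, not which hits are returned, so a consumer that treats the hits as an unordered set (as the PR assumes the SPARQL engine does) sees the same result either way.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class BoxQuerySketch {
    /** Hypothetical indexed point: an identifier plus latitude/longitude. */
    static final class Place {
        final String id;
        final double lat;
        final double lon;

        Place(String id, double lat, double lon) {
            this.id = id;
            this.lat = lat;
            this.lon = lon;
        }
    }

    /** Box filter only: one pass over the candidates, no ordering imposed. */
    static List<Place> withinBox(List<Place> places, double latMin, double lonMin,
                                 double latMax, double lonMax) {
        List<Place> hits = new ArrayList<>();
        for (Place p : places) {
            if (p.lat >= latMin && p.lat <= latMax
                    && p.lon >= lonMin && p.lon <= lonMax) {
                hits.add(p);
            }
        }
        return hits;
    }

    /** Same filter, followed by an extra sort on squared distance to the box centre. */
    static List<Place> withinBoxSorted(List<Place> places, double latMin, double lonMin,
                                       double latMax, double lonMax) {
        List<Place> hits = withinBox(places, latMin, lonMin, latMax, lonMax);
        double cLat = (latMin + latMax) / 2;
        double cLon = (lonMin + lonMax) / 2;
        hits.sort(Comparator.comparingDouble(
                p -> (p.lat - cLat) * (p.lat - cLat) + (p.lon - cLon) * (p.lon - cLon)));
        return hits;
    }

    public static void main(String[] args) {
        // Illustrative data only; not taken from the Geonames dump.
        List<Place> places = List.of(
                new Place("a", 32.560, -117.130),
                new Place("b", 32.566, -117.135),
                new Place("c", 40.000, -100.000));

        // Sets compare by object identity here, which is enough because both
        // result lists hold the same Place instances.
        Set<Place> unsorted = new HashSet<>(
                withinBox(places, 32.55668, -117.13865, 32.56668, -117.12865));
        Set<Place> sorted = new HashSet<>(
                withinBoxSorted(places, 32.55668, -117.13865, 32.56668, -117.12865));

        System.out.println("same result set: " + unsorted.equals(sorted));
    }
}
```

Whether the ordering is truly unused downstream is the assumption the PR itself makes; a query that relied on nearest-first ordering of the raw hits would behave differently.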

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/osma/jena jena-spatial-no-sort

Alternatively you can review and apply these changes as the patch at:


To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #205

commit 864e2ce831f41a15ba683d0edd29cbe3236ff636
Author: Osma Suominen <osma.suomi...@helsinki.fi>
Date:   2017-01-13T12:29:16Z

    JENA-1277: don't use sorting in spatial queries, for much better performance


> Spatial Queries Very Slow For Large Databases
> ---------------------------------------------
>                 Key: JENA-1277
>                 URL: https://issues.apache.org/jira/browse/JENA-1277
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: Spatial
>    Affects Versions: Jena 3.1.1
>         Environment: Linux Ubuntu
>            Reporter: samur araujo
>         Attachments: spatial-assembler.ttl
> I loaded Geonames into Jena, but the spatial queries take more than 3 s to 
> execute. The query is below:
> PREFIX spatial: <http://jena.apache.org/spatial#>
> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> SELECT distinct ?place
> {
>     ?place spatial:intersectBox (32.55668 -117.12865 32.56668  -117.13865) .
> }
> The data can be downloaded here:
> https://drive.google.com/file/d/0B-fwYPJYT1GOYVVIZF9ROUxzclk/view?usp=sharing
> For small datasets the queries execute in about 200 ms, which is very fast. 
> I noticed that when I access the Lucene index directly the queries are also 
> very fast, about 20 ms. 
> The issue may be related to the post-processing of Lucene results.

This message was sent by Atlassian JIRA