[
https://issues.apache.org/jira/browse/JENA-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15821721#comment-15821721
]
ASF GitHub Bot commented on JENA-1277:
--------------------------------------
GitHub user osma opened a pull request:
https://github.com/apache/jena/pull/205
JENA-1277: don't use sorting in spatial queries, for much better performance
This PR proposes removing the `distSort` parameter from the Lucene spatial
query performed by jena-spatial. Dropping the sorting gives a massive
performance boost; in the Geonames example given in JENA-1277, the query time
drops from over 20 seconds to less than 200 ms.
I suppose that the sorting is not necessary since jena-spatial results are
just raw material for the SPARQL engine anyway.
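To illustrate the reasoning above, here is a minimal sketch in plain Java. It does not use the jena-spatial or Lucene APIs; the `Place` class, the method names, and the sample coordinates are all hypothetical. It only shows that a bounding-box filter returns the same set of hits with or without a distance sort, so the sort is extra work when the consumer does not depend on hit order. In Lucene terms this roughly corresponds to calling `IndexSearcher.search(query, n)` instead of `IndexSearcher.search(query, n, sort)`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;

// Illustrative stand-in only: not the jena-spatial or Lucene API.
final class SpatialSortSketch {

    /** Hypothetical indexed point: a resource URI with latitude and longitude. */
    static final class Place {
        final String uri;
        final double lat;
        final double lon;

        Place(String uri, double lat, double lon) {
            this.uri = uri;
            this.lat = lat;
            this.lon = lon;
        }
    }

    /** True if the place lies inside the box [minLat, maxLat] x [minLon, maxLon]. */
    static boolean inBox(Place p, double minLat, double minLon, double maxLat, double maxLon) {
        return p.lat >= minLat && p.lat <= maxLat && p.lon >= minLon && p.lon <= maxLon;
    }

    /** Filter only: one pass over the index, hits returned in index order. */
    static List<String> matchUnsorted(List<Place> index,
                                      double minLat, double minLon, double maxLat, double maxLon) {
        List<String> hits = new ArrayList<>();
        for (Place p : index) {
            if (inBox(p, minLat, minLon, maxLat, maxLon)) {
                hits.add(p.uri);
            }
        }
        return hits;
    }

    /** Filter, then sort hits by squared distance to the box centre (the extra step). */
    static List<String> matchSortedByDistance(List<Place> index,
                                              double minLat, double minLon, double maxLat, double maxLon) {
        final double cLat = (minLat + maxLat) / 2.0;
        final double cLon = (minLon + maxLon) / 2.0;
        List<Place> hits = new ArrayList<>();
        for (Place p : index) {
            if (inBox(p, minLat, minLon, maxLat, maxLon)) {
                hits.add(p);
            }
        }
        hits.sort(Comparator.comparingDouble(
                (Place p) -> (p.lat - cLat) * (p.lat - cLat) + (p.lon - cLon) * (p.lon - cLon)));
        List<String> uris = new ArrayList<>();
        for (Place p : hits) {
            uris.add(p.uri);
        }
        return uris;
    }

    /** Made-up sample data; the URIs and coordinates are assumptions for this sketch. */
    static List<Place> sampleIndex() {
        List<Place> index = new ArrayList<>();
        index.add(new Place("urn:example:a", 32.5650, -117.1250));
        index.add(new Place("urn:example:b", 32.5601, -117.1301));
        index.add(new Place("urn:example:c", 32.7000, -117.1000));
        index.add(new Place("urn:example:d", 32.5520, -117.1390));
        return index;
    }

    public static void main(String[] args) {
        List<Place> index = sampleIndex();
        List<String> unsorted = matchUnsorted(index, 32.55, -117.14, 32.57, -117.12);
        List<String> sorted = matchSortedByDistance(index, 32.55, -117.14, 32.57, -117.12);
        System.out.println("unsorted: " + unsorted);
        System.out.println("sorted:   " + sorted);
        // The order differs, but the set of matching resources is identical.
        System.out.println("same set: " + new HashSet<>(unsorted).equals(new HashSet<>(sorted)));
    }
}
```

A `SELECT DISTINCT ?place` query like the one in the issue only needs the set of matches, which is why the ordering produced by the second method would go unused there.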
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/osma/jena jena-spatial-no-sort
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/jena/pull/205.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #205
----
commit 864e2ce831f41a15ba683d0edd29cbe3236ff636
Author: Osma Suominen
Date: 2017-01-13T12:29:16Z
JENA-1277: don't use sorting in spatial queries, for much better performance
----
> Spatial Queries Very Slow For Large Databases
> ---------------------------------------------
>
> Key: JENA-1277
> URL: https://issues.apache.org/jira/browse/JENA-1277
> Project: Apache Jena
> Issue Type: Improvement
> Components: Spatial
> Affects Versions: Jena 3.1.1
> Environment: Linux Ubuntu
> Reporter: samur araujo
> Attachments: spatial-assembler.ttl
>
>
> I loaded the GeoNames data into Jena, but spatial queries take more than
> 3 s to execute. The query is below:
> PREFIX spatial: <http://jena.apache.org/spatial#>
> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> SELECT distinct ?place
> {
> ?place spatial:intersectBox (32.55668 -117.12865 32.56668 -117.13865) .
>
> }
> The data can be downloaded here:
> https://drive.google.com/file/d/0B-fwYPJYT1GOYVVIZF9ROUxzclk/view?usp=sharing
> For small datasets the queries execute in about 200 ms, which is very fast.
> I noticed that when I access the Lucene index directly, the queries are
> also very fast, about 20 ms.
> The issue may be related to the post-processing of the Lucene results.