[ https://issues.apache.org/jira/browse/JENA-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15821705#comment-15821705 ]
Osma Suominen commented on JENA-1277:
-------------------------------------
Did some basic profiling using current Jena (Fuseki 1.5.0-SNAPSHOT). The slow
part, which takes ~20s, is this line in SpatialIndexLucene.java, i.e. the point
where the actual Lucene search is done:
{noformat}
TopDocs docs = indexSearcher.search(new MatchAllDocsQuery(), filter,
limit, distSort);
{noformat}
Of the parameters,
* {{filter}} is a (Lucene) {{IntersectsPrefixTreeFilter}}
* {{limit}} is 10000 (the default)
* {{distSort}} is
{{<custom:"ShapeFieldCacheDistanceValueSource(org.apache.lucene.spatial.prefix.PointPrefixTreeFieldCacheProvider@2805220,
Pt(x=62.86635000000001,y=32.561679999999996))":
org.apache.lucene.queries.function.ValueSource$ValueSourceComparatorSource@558a840>}},
whatever that means.
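One possible reading of the {{Pt(...)}} value inside {{distSort}}: it looks like the centre of the query box. The sketch below re-derives it from the box corners in the quoted query, assuming the arguments are (latMin lonMin latMax lonMax) and that a box whose minimum longitude is greater than its maximum is treated as crossing the dateline. The class and method names are made up for illustration, and none of this has been checked against the jena-spatial or Lucene sources.

```java
// Sketch only: re-derives the sort point from the query's box corners.
// Assumption: the box is (latMin lonMin latMax lonMax), and a box whose
// minimum longitude exceeds its maximum is read as crossing the dateline.
final class BoxCenterSketch {

    // Longitude of the centre of a box running eastwards from minLon to maxLon.
    static double centerLon(double minLon, double maxLon) {
        double width = maxLon - minLon;
        if (width < 0) {
            width += 360.0; // the box wraps around the dateline
        }
        double x = minLon + width / 2.0;
        // Normalise the result back into [-180, 180].
        if (x > 180.0) {
            x -= 360.0;
        }
        if (x < -180.0) {
            x += 360.0;
        }
        return x;
    }

    // Latitude of the centre of the box.
    static double centerLat(double minLat, double maxLat) {
        return minLat + (maxLat - minLat) / 2.0;
    }

    public static void main(String[] args) {
        // Corner values copied from the query in the issue description.
        double x = centerLon(-117.12865, -117.13865);
        double y = centerLat(32.55668, 32.56668);
        System.out.println("Pt(x=" + x + ",y=" + y + ")");

        // The same box with the two longitudes swapped, for comparison.
        System.out.println("swapped x=" + centerLon(-117.13865, -117.12865));
    }
}
```

With the corners as written in the query, the centre comes out at roughly x=62.866, y=32.562, which agrees with the point shown in {{distSort}}. With the two longitudes swapped it is about x=-117.134. If this reading is right, the box as written spans almost 360 degrees of longitude rather than 0.01, which might explain why so many documents pass the filter.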
I'm not sure how fast this operation should be on such a large data set. I've
never used jena-spatial or Lucene spatial queries before...
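For anyone wanting to repeat the measurement, a minimal stopwatch sketch of the kind of timing used above. This is not the profiler setup actually used; {{TimingSketch}}, {{Timed}} and {{time}} are hypothetical names, and the workload in {{main}} is a placeholder standing in for the {{indexSearcher.search(...)}} call.

```java
import java.util.concurrent.Callable;

// Sketch only: wall-clock timing of a single call.
final class TimingSketch {

    // Result of a timed call: the returned value plus elapsed milliseconds.
    static final class Timed<T> {
        final T value;
        final long millis;

        Timed(T value, long millis) {
            this.value = value;
            this.millis = millis;
        }
    }

    // Runs the call once and measures it with System.nanoTime().
    // Checked exceptions (e.g. an IOException from a search) are rethrown unchecked.
    static <T> Timed<T> time(Callable<T> call) {
        long start = System.nanoTime();
        T value;
        try {
            value = call.call();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
        return new Timed<>(value, elapsedMillis);
    }

    public static void main(String[] args) {
        // Placeholder workload; in SpatialIndexLucene.java this would wrap
        // the indexSearcher.search(...) call quoted above.
        Timed<Long> result = time(() -> {
            long sum = 0;
            for (int i = 0; i < 5_000_000; i++) {
                sum += i % 7;
            }
            return sum;
        });
        System.out.println("value=" + result.value + " took " + result.millis + " ms");
    }
}
```

A single run like this includes JIT warm-up and any first-use cache loading, so timing the same call several times gives a better idea of whether the ~20s is a one-off cost or paid on every query.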
> Spatial Queries Very Slow For Large Databases
> ---------------------------------------------
>
> Key: JENA-1277
> URL: https://issues.apache.org/jira/browse/JENA-1277
> Project: Apache Jena
> Issue Type: Improvement
> Components: Spatial
> Affects Versions: Jena 3.1.1
> Environment: Linux Ubuntu
> Reporter: samur araujo
> Attachments: spatial-assembler.ttl
>
>
> I loaded GeoNames into Jena, but the spatial queries take more than 3s to
> execute. The query is below:
> PREFIX spatial: <http://jena.apache.org/spatial#>
> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> SELECT distinct ?place
> {
> ?place spatial:intersectBox (32.55668 -117.12865 32.56668 -117.13865) .
>
> }
> The data can be downloaded here:
> https://drive.google.com/file/d/0B-fwYPJYT1GOYVVIZF9ROUxzclk/view?usp=sharing
> For small datasets the queries execute in 200ms, which is very fast. I noticed
> that when I access the Lucene index directly the queries are also very fast,
> about 20ms.
> The issue may be related to the post-processing of the Lucene results.