[jira] [Commented] (JENA-1277) Spatial Queries Very Slow For Large Databases

Osma Suominen (JIRA) Fri, 13 Jan 2017 04:12:48 -0800

    [ 
https://issues.apache.org/jira/browse/JENA-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15821705#comment-15821705
 ]


Osma Suominen commented on JENA-1277:
-------------------------------------

I did some basic profiling with current Jena (Fuseki 1.5.0-SNAPSHOT). The slow 
part, which takes ~20s, is this line in SpatialIndexLucene.java, i.e. the point 
where the actual Lucene search is done:

{noformat}
 TopDocs docs = indexSearcher.search(new MatchAllDocsQuery(), filter,
                                limit, distSort);
{noformat}

Of the parameters:
* {{filter}} is a (Lucene) {{IntersectsPrefixTreeFilter}}
* {{limit}} is 10000 (the default)
* {{distSort}} is 
{{<custom:"ShapeFieldCacheDistanceValueSource(org.apache.lucene.spatial.prefix.PointPrefixTreeFieldCacheProvider@2805220,
 Pt(x=62.86635000000001,y=32.561679999999996))": 
org.apache.lucene.queries.function.ValueSource$ValueSourceComparatorSource@558a840>}},
 whatever that means.

I'm not sure how fast this operation should be on such a large data set. I've 
never used jena-spatial or Lucene spatial queries before...
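
For anyone who wants to reproduce the timing, here is a minimal sketch of a wall-clock wrapper that could be put around the search call. {{SearchTimer}} and {{Timed}} are hypothetical names made up for this example; they are not part of Jena or Lucene, and this is not necessarily how the ~20s figure above was measured.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for ad hoc timing; not part of Jena or Lucene. */
final class SearchTimer {

    /** Holds the value returned by the timed call and the elapsed wall-clock time. */
    static final class Timed<T> {
        final T value;
        final long elapsedNanos;

        Timed(T value, long elapsedNanos) {
            this.value = value;
            this.elapsedNanos = elapsedNanos;
        }

        /** Elapsed time rounded down to whole milliseconds. */
        long elapsedMillis() {
            return TimeUnit.NANOSECONDS.toMillis(elapsedNanos);
        }
    }

    private SearchTimer() {
        // static helper only
    }

    /**
     * Runs the call once and measures it with System.nanoTime(),
     * which is monotonic and therefore suitable for measuring intervals.
     */
    static <T> Timed<T> time(Callable<T> call) throws Exception {
        long start = System.nanoTime();
        T value = call.call();
        long elapsed = System.nanoTime() - start;
        return new Timed<>(value, elapsed);
    }
}
```

Inside SpatialIndexLucene.java the call could then be wrapped as {{SearchTimer.time(() -> indexSearcher.search(new MatchAllDocsQuery(), filter, limit, distSort))}}, logging {{elapsedMillis()}} afterwards. A single measurement like this includes any cache warm-up, so repeating the query a few times gives a fairer picture.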

> Spatial Queries Very Slow For Large Databases
> ---------------------------------------------
>
>                 Key: JENA-1277
>                 URL: https://issues.apache.org/jira/browse/JENA-1277
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: Spatial
>    Affects Versions: Jena 3.1.1
>         Environment: Linux Ubuntu
>            Reporter: samur araujo
>         Attachments: spatial-assembler.ttl
>
>
> I loaded GeoNames into Jena, but the spatial queries take more than 3s to 
> execute. The query is below:
> PREFIX spatial: <http://jena.apache.org/spatial#>
> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> SELECT distinct ?place
> {
>     ?place spatial:intersectBox (32.55668 -117.12865 32.56668  -117.13865) .
>   
> }
> The data can be downloaded here:
> https://drive.google.com/file/d/0B-fwYPJYT1GOYVVIZF9ROUxzclk/view?usp=sharing
> For small datasets the queries execute in about 200ms, which is very fast. I 
> noticed that when I access the Lucene index directly the queries are also 
> very fast, about 20ms. 
> The issue may be related to the post-processing of Lucene results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
