[jira] [Commented] (JENA-1277) Spatial Queries Very Slow For Large Databases

Osma Suominen (JIRA) Fri, 13 Jan 2017 03:38:16 -0800

    [ https://issues.apache.org/jira/browse/JENA-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15821669#comment-15821669 ]


Osma Suominen commented on JENA-1277:
-------------------------------------

I tried reproducing this. It takes a while, though, since the data set is 
rather large. I will attach the assembler/Fuseki configuration file that I used.

First I created the TDB using tdbloader2:

{noformat}
tdbloader2 --loc tdb geonames.nt.gz
{noformat}

This took 69 minutes on my i3-2330M Ubuntu 16.04 laptop with SSD.

Then I created the spatial index. I had to experiment a bit to find out how 
much memory it needed. Luckily 6G was enough; I'm on an 8G machine, so I 
couldn't have afforded much more:

{noformat}
java -Xmx6G -cp fuseki-server.jar jena.spatialindexer --desc=spatial-assembler.ttl
{noformat}

This took 19 minutes.

Next I ran Fuseki 1.4.1. I tweaked the fuseki-server startup script beforehand 
to give it 4G of memory, just in case.

{noformat}
./fuseki-server --config spatial-assembler.ttl
{noformat}

Finally I executed the query:
{noformat}
s-query --service=http://localhost:3030/ds/sparql --query query.rq --output=csv >results.csv
{noformat}

I ran this a few times and the response time varied between 18 and 27 seconds. 
I got 2469 results, not 17 as you said on the mailing list. I suspect that your 
spatial index is somehow incomplete, since you got fewer results in a shorter 
time.
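
To repeat the timing measurement, a small shell helper along these lines could 
be used. This is only a sketch: the helper name time_runs and the run count are 
illustrative assumptions, not part of the test described above, and it relies 
on "date +%s" (available on Ubuntu), so it only has whole-second resolution.

```shell
# Hypothetical helper: run a command N times and print the wall-clock
# time of each run in whole seconds. The command's stdout is discarded.
time_runs() {
  runs=$1
  shift
  i=1
  while [ "$i" -le "$runs" ]; do
    start=$(date +%s)
    "$@" > /dev/null
    end=$(date +%s)
    echo "run $i: $((end - start)) s"
    i=$((i + 1))
  done
}

# Example invocation (assumes Fuseki is already running as started above):
# time_runs 5 s-query --service=http://localhost:3030/ds/sparql --query query.rq --output=csv
```

Whole-second resolution is coarse, but it should be enough to tell response 
times in the tens of seconds apart from the sub-second times reported for small 
datasets.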

In any case, I can confirm that this spatial query is really slow.

> Spatial Queries Very Slow For Large Databases
> ---------------------------------------------
>
>                 Key: JENA-1277
>                 URL: https://issues.apache.org/jira/browse/JENA-1277
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: Spatial
>    Affects Versions: Jena 3.1.1
>         Environment: Linux Ubuntu
>            Reporter: samur araujo
>
> I loaded geonames on Jena but the spatial queries take more than 3s to 
> execute. The query is below:
> PREFIX spatial: <http://jena.apache.org/spatial#>
> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> SELECT distinct ?place
> {
>     ?place spatial:intersectBox (32.55668 -117.12865 32.56668  -117.13865) .
>   
> }
> The data can be downloaded here:
> https://drive.google.com/file/d/0B-fwYPJYT1GOYVVIZF9ROUxzclk/view?usp=sharing
> For small datasets the queries are executed in 200ms, very fast. I noticed 
> that when I access the lucene index directly the queries are also very fast, 
> about 20ms. 
> The issue may be related to the post-processing of lucene results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
