[
https://issues.apache.org/jira/browse/JENA-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15821669#comment-15821669
]
Osma Suominen commented on JENA-1277:
-------------------------------------
I tried reproducing this. It takes a while though since the data set is rather
large. I will attach the assembler/fuseki configuration file that I used.
First I created the TDB using tdbloader2:
{noformat}
tdbloader2 --loc tdb geonames.nt.gz
{noformat}
This took 69 minutes on my i3-2330M Ubuntu 16.04 laptop with SSD.
Then I created the spatial index. I had to experiment a bit to find out how
much memory it needs. Luckily 6G was enough; I'm on an 8G machine, so I
couldn't have afforded much more:
{noformat}
java -Xmx6G -cp fuseki-server.jar jena.spatialindexer --desc=spatial-assembler.ttl
{noformat}
This took 19 minutes.
Next I ran Fuseki 1.4.1. I tweaked the fuseki-server startup script beforehand
to give it 4G of memory, just in case.
{noformat}
./fuseki-server --config spatial-assembler.ttl
{noformat}
Finally I executed the query:
{noformat}
s-query --service=http://localhost:3030/ds/sparql --query query.rq --output=csv > results.csv
{noformat}
I ran this a few times and the response time varied between 18 and 27 seconds.
I got 2469 results, not 17 as you said on the mailing list. I suspect that your
spatial index is somehow incomplete, since you got fewer results in a shorter
time.
In any case, I can confirm that this spatial query is really slow.
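To make the timing and result-count comparison easier to repeat, something like the following could be used. This is only a sketch: the helper names time_runs and count_rows are made up for illustration, timing is whole-second resolution, and it assumes the CSV output starts with a single header line naming the variables.

```shell
# Hypothetical helper: run a command N times and print the wall-clock
# time of each run in whole seconds. Output of the command is discarded.
time_runs() {
  n="$1"; shift
  i=1
  while [ "$i" -le "$n" ]; do
    start=$(date +%s)
    "$@" > /dev/null
    end=$(date +%s)
    echo "run $i: $((end - start)) s"
    i=$((i + 1))
  done
}

# Hypothetical helper: count result rows in a CSV file, skipping the
# header line (assumed to be the first line of the file).
count_rows() {
  tail -n +2 "$1" | grep -c ''
}

# Example usage against the setup described above (not executed here):
#   time_runs 5 s-query --service=http://localhost:3030/ds/sparql --query query.rq --output=csv
#   count_rows results.csv
```

Running the query several times rather than once matters here because the first run is likely to include cold-cache effects, so the spread between runs is more informative than a single number.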
> Spatial Queries Very Slow For Large Databases
> ---------------------------------------------
>
> Key: JENA-1277
> URL: https://issues.apache.org/jira/browse/JENA-1277
> Project: Apache Jena
> Issue Type: Improvement
> Components: Spatial
> Affects Versions: Jena 3.1.1
> Environment: Linux Ubuntu
> Reporter: samur araujo
>
> I loaded geonames on Jena but the spatial queries take more than 3s to
> execute. The query is below:
> PREFIX spatial: <http://jena.apache.org/spatial#>
> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> SELECT DISTINCT ?place
> {
>   ?place spatial:intersectBox (32.55668 -117.12865 32.56668 -117.13865) .
> }
> The data can be downloaded here:
> https://drive.google.com/file/d/0B-fwYPJYT1GOYVVIZF9ROUxzclk/view?usp=sharing
> For small datasets the queries are executed in 200ms, very fast. I noticed
> that when I access the Lucene index directly the queries are also very fast,
> about 20ms.
> The issue may be related to the post-processing of Lucene results.