So 4x not 10x; I’m not feeling depressed anymore ;-)
Your approach of using QuadTree and a coordinate reference system makes sense.
One day hopefully not too far away, I expect Lucene-spatial/Spatial4j will have
built-in projection support. But instead of indexing bounding boxes, you
should ideally be indexing the actual shapes, and then you can pull the WKB and
check for actual intersection.
I’m excited to announce to you and others reading this that I’m currently
working on a much more sophisticated system for indexing shapes and computing
intersections that will be much faster. The first release (within 2-3 weeks)
will index shapes using the grid and then any matches will be double-checked
against a WKB representation stored in Lucene "doc-values". The subsequent
release, to occur within the next ~30 days, will tweak the grid encoding to
include a little bit more metadata such that most queries will be completely
satisfied by examining the fast index grid; only shapes that barely touch an
indexed shape will have to be double-checked against the WKB representation.
The net effect should be a dramatic increase in spatial accuracy and
performance over the current scheme. You can expect to see a blog post with
illustrations about this within 30 days.
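
For readers who want to picture the two-phase scheme described above (a grid proposes matches, the exact shape confirms them), here is a minimal sketch in plain Java. It is not Lucene-spatial code: the class `GridRefineIndex`, its methods, the uniform grid, and the use of circles in place of geometries decoded from WKB are all assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy filter-and-refine index (hypothetical): a uniform grid proposes candidates, an exact test confirms them. */
final class GridRefineIndex {
    /** Exact shape; a circle stands in for the geometry that would be decoded from stored WKB. */
    static final class Circle {
        final double cx, cy, r;
        Circle(double cx, double cy, double r) { this.cx = cx; this.cy = cy; this.r = r; }

        /** Exact circle/rectangle test: distance from the centre to the nearest rectangle point. */
        boolean intersects(double minX, double minY, double maxX, double maxY) {
            double nx = Math.max(minX, Math.min(cx, maxX));
            double ny = Math.max(minY, Math.min(cy, maxY));
            double dx = cx - nx, dy = cy - ny;
            return dx * dx + dy * dy <= r * r;
        }
    }

    private final double cellSize;
    private final Map<Long, List<Integer>> cells = new HashMap<>();
    private final List<Circle> shapes = new ArrayList<>();

    GridRefineIndex(double cellSize) { this.cellSize = cellSize; }

    private long cellOf(double v) { return (long) Math.floor(v / cellSize); }

    /** Packs a (column, row) pair into one map key. */
    private long key(long col, long row) { return (col << 32) | (row & 0xFFFFFFFFL); }

    /** Registers the shape in every grid cell overlapped by its bounding box; returns its id. */
    int add(Circle c) {
        int id = shapes.size();
        shapes.add(c);
        for (long col = cellOf(c.cx - c.r); col <= cellOf(c.cx + c.r); col++) {
            for (long row = cellOf(c.cy - c.r); row <= cellOf(c.cy + c.r); row++) {
                cells.computeIfAbsent(key(col, row), k -> new ArrayList<>()).add(id);
            }
        }
        return id;
    }

    /** Phase 1 (fast, approximate): ids of all shapes sharing a grid cell with the query rectangle. */
    Set<Integer> candidates(double minX, double minY, double maxX, double maxY) {
        Set<Integer> out = new LinkedHashSet<>();
        for (long col = cellOf(minX); col <= cellOf(maxX); col++) {
            for (long row = cellOf(minY); row <= cellOf(maxY); row++) {
                List<Integer> ids = cells.get(key(col, row));
                if (ids != null) out.addAll(ids);
            }
        }
        return out;
    }

    /** Phase 2 (exact): keep only the candidates whose real shape intersects the query rectangle. */
    List<Integer> query(double minX, double minY, double maxX, double maxY) {
        List<Integer> hits = new ArrayList<>();
        for (int id : candidates(minX, minY, maxX, maxY)) {
            if (shapes.get(id).intersects(minX, minY, maxX, maxY)) hits.add(id);
        }
        return hits;
    }

    public static void main(String[] args) {
        GridRefineIndex index = new GridRefineIndex(10.0);
        index.add(new Circle(5, 5, 4));   // id 0, falls entirely in cell (0,0)
        index.add(new Circle(25, 25, 4)); // id 1, falls entirely in cell (2,2)
        // The query shares cell (0,0) with circle 0 but does not actually touch it.
        System.out.println(index.candidates(0, 0, 1, 1)); // prints [0]
        System.out.println(index.query(0, 0, 1, 1));      // prints []
    }
}
```

The grid alone reports a false positive for the corner query; the exact check removes it. Storing extra per-cell metadata, as planned for the later release, would presumably let many candidates skip phase 2, but that part is not sketched here.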
~ David
From: Demeter Sztanko <[email protected]>
Date: Friday, January 17, 2014 at 11:13 AM
To: "Smiley, David W." <[email protected]>
Cc: <[email protected]>
Subject: Re: [Jts-topo-suite-user] Persistent STR tree
Hi David,
First of all, thanks for developing Lucene - it is an amazing and
unique library.
Sorry, 10x was a very rough estimate - Lucene is actually about 4 times slower.
When using Lucene, I can perform around 700 queries/second (that's 8 threads on
an 8-core MacBook Pro with SSD disks). With the JTS STRtree I was able to get
around 2800 queries/sec, so that's around a 4x slowdown. I was counting only
query performance, not indexing.
I am storing bounding-box (BB) rectangles as the geometry and the real geometry
in WKB format as the value field of the record. I am also using QuadPrefixTree.
One thing I have noticed is that Lucene deals with lat/lng coordinates
only, so it won't allow any other reference system (I am using the British
National Grid: http://spatialreference.org/ref/epsg/27700/ ), so I had to
scale down all bounding boxes so that the coordinates fit into the 0-180 interval.
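
To illustrate the kind of scaling meant here, below is a minimal sketch assuming a plain linear mapping with one factor for both axes. The class `GridScaler`, its methods, and the 1,300,000 m maximum coordinate are hypothetical values chosen for the example, not the code or extent actually used.

```java
import java.util.Arrays;

/** Hypothetical helper: linearly scales projected coordinates (metres) into the interval 0..targetMax. */
final class GridScaler {
    private final double scale;

    /**
     * @param maxCoordinate largest source coordinate expected on either axis, in metres (assumed)
     * @param targetMax     upper bound of the target interval, e.g. 180
     */
    GridScaler(double maxCoordinate, double targetMax) {
        if (maxCoordinate <= 0 || targetMax <= 0) {
            throw new IllegalArgumentException("both bounds must be positive");
        }
        // One factor for both axes, so shapes keep their aspect ratio.
        this.scale = targetMax / maxCoordinate;
    }

    /** Metres to the scaled value that gets indexed. */
    double toIndexed(double metres) { return metres * scale; }

    /** Scaled value back to metres, e.g. for interpreting query results. */
    double toMetres(double indexed) { return indexed / scale; }

    /** Scales a bounding box and returns {minX, minY, maxX, maxY}. */
    double[] scaleBox(double minX, double minY, double maxX, double maxY) {
        return new double[] { toIndexed(minX), toIndexed(minY), toIndexed(maxX), toIndexed(maxY) };
    }

    public static void main(String[] args) {
        // Assumption: no coordinate exceeds 1,300,000 m on either axis.
        GridScaler scaler = new GridScaler(1_300_000.0, 180.0);
        double[] box = scaler.scaleBox(530_000, 180_000, 531_000, 181_000);
        System.out.println(Arrays.toString(box));
    }
}
```

Query rectangles have to go through the same `scaleBox` call before searching. Note that a geodetic setup would also need the y values to stay within the valid latitude range of -90 to 90.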
I haven't tried any of the standalone databases as I believe the simple network
overhead will kill all possible performance benefits. Also for other reasons I
do not want to deal with those.
I still believe the operations I am performing on the R-tree are relatively
simple and Lucene is optimised for much more general use, so I have some hope
of improving its performance.
D.
On Fri, Jan 17, 2014 at 3:39 PM, Smiley, David W.
<[email protected]> wrote:
Whoops; forgot to reply-all.
From: "Smiley, David W." <[email protected]>
Date: Friday, January 17, 2014 at 10:15 AM
To: Demeter Sztanko <[email protected]>
Subject: Re: [Jts-topo-suite-user] Persistent STR tree
Thanks for sharing your experience with Lucene-spatial. I’m responsible for a
large part of it. I don’t think you’re ever going to get an on-disk structure
(even on SSD) to match the performance of an in-memory one. Of course if you
find one then let me know. FWIW I’m looking to improve the accuracy &
performance of Lucene-spatial a lot this year. Can you tell me if the indexed
spatial objects are all points or if they’re mostly non-points? And was the 10x
slowdown just query performance or did that include indexing?
In the NoSQL space (or shall we say… not a relational database space), the
systems with the best spatial support to my knowledge are MongoDB, CouchDB
(the spatial module is a separate add-on), and Lucene-spatial. Your data set isn’t
huge though; I’d try PostGIS if I were you. And I’m very impressed with what I
see in SQL Server.
Good luck,
~ David Smiley
From: Demeter Sztanko <[email protected]>
Date: Friday, January 17, 2014 at 7:56 AM
To: <[email protected]>
Subject: [Jts-topo-suite-user] Persistent STR tree
Hi all,
I need to store around 50M objects in a spatial index (I only need support for
bulk insert and concurrent intersection() operations). I then need to
access the objects semi-randomly (that is, I will probably have 300 requests
within one location, then another 300 in another random location, etc.).
STRtree is great and fast, but I need around 50 GB of RAM to fit the
tree, which is unfortunately too expensive for me to maintain in the long term.
I need a solution that can run on 1 GB of RAM and SSD disks (it's a DigitalOcean
cloud instance).
I have also tried using Lucene to store the spatial index, which is
feasible but around 10 times slower, even on SSD disks.
I was wondering if you know of any other minimal Java libraries that can do
what I am looking for and are still relatively fast.
Thanks,
D.
_______________________________________________
Jts-topo-suite-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/jts-topo-suite-user