That's great news, David! Thank you for your work! Is it going to be
released in the next version of Lucene?


On Fri, Jan 17, 2014 at 4:53 PM, Smiley, David W. <[email protected]> wrote:

>  So 4x not 10x; I’m not feeling depressed anymore ;-)
>
>  Your approach of using QuadTree and a coordinate reference system makes
> sense.  One day hopefully not too far away, I expect
> Lucene-spatial/Spatial4j will have built-in projection support.  But
> instead of indexing bounding boxes, you should ideally be indexing the
> actual shapes, and then you can pull the WKB and check for actual
> intersection.
>
>  I’m excited to announce to you and others reading this that I’m
> currently working on a much more sophisticated system for indexing shapes
> and computing intersections that will be much faster.  The first release
> (within 2-3 weeks) will index shapes using the grid and then any matches
> will be double-checked against a WKB representation stored in Lucene
> "doc-values".  The subsequent release, to occur within the next ~30 days,
> will tweak the grid encoding to include a little bit more metadata such
> that most queries will be completely satisfied by examining the fast index
> grid; only shapes that barely touch an indexed shape will have to be
> double-checked against the WKB representation.  The net effect should be a
> dramatic increase in spatial accuracy and performance over the current
> scheme.  You can expect to see a blog post with illustrations about this
> within 30 days.
>
>  ~ David
>
>   From: Demeter Sztanko <[email protected]>
> Date: Friday, January 17, 2014 at 11:13 AM
> To: "Smiley, David W." <[email protected]>
> Cc: "[email protected]" <
> [email protected]>
>
> Subject: Re: [Jts-topo-suite-user] Persistent STR tree
>
>   Hi David,
>
>  First of all, thanks for the development of Lucene - it is an amazing
> and unique library.
>
>  Sorry, 10x was a very rough estimate - Lucene is actually 4 times
> slower.
>
>  When using Lucene, I can perform around 700 queries/second (that's 8
> threads on an 8-core MacBook Pro with SSD disks). With JTS STRtree I
> was able to get around 2800 queries/sec, so that's around a 4x slowdown.
> I was counting only query performance, not indexing.
>
>  I am storing BB rectangles as geometry and the real geometry in WKB
> format as the value field of the record. And I am using QuadPrefixTree.
>
>  One thing I have noticed is that Lucene is dealing with lat/lng
> coordinates only - therefore it won't allow any other reference systems (I
> am using the British National Grid:
> http://spatialreference.org/ref/epsg/27700/ ), so I had to scale down all
> bounding boxes so the coordinates fit into the 0-180 interval.
>
>  I haven't tried any of the standalone databases as I believe the simple
> network overhead will kill all possible performance benefits. Also for
> other reasons I do not want to deal with those.
>
>  I still believe the operations I am performing on the RTree are
> relatively simple and Lucene is optimised for much more general use, so I
> have some hopes of improving its performance.
>
>  D.
>
>
> On Fri, Jan 17, 2014 at 3:39 PM, Smiley, David W. <[email protected]> wrote:
>
>>  Whoops; forgot to reply-all.
>>
>>
>>   From: "Smiley, David W." <[email protected]>
>> Date: Friday, January 17, 2014 at 10:15 AM
>> To: Demeter Sztanko <[email protected]>
>> Subject: Re: [Jts-topo-suite-user] Persistent STR tree
>>
>>   Thanks for sharing your experience with Lucene-spatial.  I’m
>> responsible for a large part of it.  I don’t think you’re ever going to get
>> the performance of an in-memory structure to compare to an on-disk one
>> (even SSD).  Of course if you find one then let me know.  FWIW I’m looking
>> to improve the accuracy & performance of lucene-spatial a lot this year.
>>  Can you tell me if the indexed spatial objects are all points or if it’s
>> mostly non-points?  And was the 10x slower just query performance or did
>> that include indexing?
>>
>>  In the NoSQL space (or shall we say… not a relational database space),
>> the systems with the best spatial support to my knowledge are MongoDB,
>> CouchDB (spatial module is a separate add-on), and Lucene-spatial.  Your
>> data set isn’t huge though; I’d try PostGIS if I were you.  And I’m very
>> impressed with what I see in SQL Server.
>>
>>  Good luck,
>>   ~ David Smiley
>>
>>   From: Demeter Sztanko <[email protected]>
>> Date: Friday, January 17, 2014 at 7:56 AM
>> To: "[email protected]" <
>> [email protected]>
>> Subject: [Jts-topo-suite-user] Persistent STR tree
>>
>>   Hi all,
>>
>>  I need to store around 50M objects in a spatial index (I need only
>> support for bulk insert and concurrent intersection() operations). I need
>> then to semi-randomly access the objects (that is, I probably will have 300
>> requests within one location, then another 300 in another random location,
>> etc.)
>>
>>  STRtree is great and fast, however I need around 50GB of RAM for
>> fitting the tree which is unfortunately too expensive for me to maintain in
>> the long term.
>>
>>  I need a solution that can run on 1 GB of RAM and SSD disks (it's a
>> DigitalOcean cloud instance).
>>
>>  I have also tried using Lucene for storing the spatial index, which is
>> also feasible but around 10 times slower even on SSD disks.
>>
>>  I was wondering if you know of any other minimal Java libraries that
>> can do what I am looking for and are still relatively fast.
>>
>>
>>  Thanks,
>>
>>  D.
>>
>>
>>
>> ------------------------------------------------------------------------------
>> CenturyLink Cloud: The Leader in Enterprise Cloud Services.
>> Learn Why More Businesses Are Choosing CenturyLink Cloud For
>> Critical Workloads, Development Environments & Everything In Between.
>> Get a Quote or Start a Free Trial Today.
>>
>> http://pubads.g.doubleclick.net/gampad/clk?id=119420431&iu=/4140/ostg.clktrk
>> _______________________________________________
>> Jts-topo-suite-user mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/jts-topo-suite-user
>>
>>
>
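
The pattern discussed in the thread (index a cheap approximation such as a
bounding box, then double-check each candidate against the real geometry
kept alongside it) can be illustrated with a minimal, dependency-free sketch.
Everything named here is hypothetical and only for illustration: the class
FilterRefineSketch, its Entry type, and the helper methods are not part of
Lucene-spatial, Spatial4j, or JTS. A linear scan stands in for the spatial
index, and a java.awt.geom.Path2D stands in for a geometry that would, in
practice, be decoded from the stored WKB.

```java
import java.awt.geom.Path2D;
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.List;

class FilterRefineSketch {

    /** One record: the bounding box that would be indexed, plus the exact outline. */
    static final class Entry {
        final String id;
        final Rectangle2D box;  // cheap approximation (what the index would hold)
        final Path2D shape;     // stand-in for the geometry decoded from stored WKB

        Entry(String id, Path2D shape) {
            this.id = id;
            this.shape = shape;
            this.box = shape.getBounds2D();
        }
    }

    /** Builds a closed polygon from a flat list of x,y pairs. */
    static Path2D polygon(double... xy) {
        Path2D.Double path = new Path2D.Double();
        path.moveTo(xy[0], xy[1]);
        for (int i = 2; i + 1 < xy.length; i += 2) {
            path.lineTo(xy[i], xy[i + 1]);
        }
        path.closePath();
        return path;
    }

    /**
     * Filter step: entries whose bounding box intersects the query rectangle.
     * Assumption: a linear scan replaces the real index lookup.
     */
    static List<Entry> filter(List<Entry> index, Rectangle2D query) {
        List<Entry> candidates = new ArrayList<>();
        for (Entry e : index) {
            if (e.box.intersects(query)) {
                candidates.add(e);
            }
        }
        return candidates;
    }

    /** Refine step: keep only candidates whose actual outline intersects the query. */
    static List<String> query(List<Entry> index, Rectangle2D query) {
        List<String> hits = new ArrayList<>();
        for (Entry e : filter(index, query)) {
            if (e.shape.intersects(query)) {
                hits.add(e.id);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Entry> index = List.of(
                new Entry("triangle", polygon(0, 0, 10, 0, 0, 10)),
                new Entry("square", polygon(20, 20, 30, 20, 30, 30, 20, 30)));

        // This query overlaps the triangle's bounding box but not the triangle itself.
        Rectangle2D corner = new Rectangle2D.Double(8, 8, 1, 1);
        System.out.println("box candidates: " + filter(index, corner).size());
        System.out.println("exact hits:     " + query(index, corner));
    }
}
```

The query near the triangle's empty corner shows why the second check
matters: the bounding box alone reports a match that the exact outline
rejects. With JTS, the refine step would presumably read the stored bytes
with a WKB reader and call the geometry's intersects method instead of
using Path2D.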
