Re: Spatial search using R-tree for indexed bounding boxes
There are many spatial solutions out there - R-tree - Quad-Tree - SRID with positional proximity like geohash - Voronoi diagrams etc.. All have their pros & cons as do Cartesian grids. Feel free to contribute the more there are, the more solutions that can be applied to different problems, I use a Cartesian method meets my needs for my work which has a substantial data set. If you look at the LocalSolr implementation of LocalLucene you'll notice that there are multiple grids indexed at different levels / tiers. In a very similar fashion to Quad-Trees. The reason, you can 'zoom' in as low as you need to. There are multiple filters occurring for the geo filters in locallucene, first is an MMB based on bestFit or zoom level, then afterwards those results are filtered by your text intersection and finally by a radial filter. The MMB filter, pre-generates the shape or in this case box with all the id's you're interested in and using a TermEnumerator pulls those from the index. As the id's are stored in a sorted fashion it's a very fast retrieval. Without having to manage memory or a data structure outside of lucene. Storing multiple points is possible, doing a radial filter, or sorting on distance (which is what my work depends on) is tricky as I use FieldCache to retrieve actual points quickly. Multiple values in a field cache dont work. But I am looking at the Uninverted Field method of solr facets for that. You will have the same issue no matter what method you use, unless you don't care about distance. If you want to do something more complex like polygon's, you extend the Shape's class, and create either another MMB or convex hull method. Basically my belief is, that it's faster to find something if you know what your looking for. i.e. grid / box id's The bestFit method essentially lets you skip traversing and just get to the level you want. Pre-generate the id's used in a shape, and simply pull them out of the index. hossman wrote: > > > : Patrick (of local lucene fame) thinks it is possible to do extent > queries with > : the cartesian grid method -- essentially you select the "best fit" level > and > : cell, and that should be set for anything within the extent. The > advantage of > : this approach is that it is super-fast and scaleable. The disadvantage > is > : that it is only as accurate as the grid. > > i'm way out of my league on spatial search -- but couldn't you use the > grid method to whittle down the result space, and then do the computation > to determine if a true overlap exists? > > > -Hoss > > > -- View this message in context: http://www.nabble.com/Spatial-search-using-R-tree-for-indexed-bounding-boxes-tp22318859p22462731.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Spatial search using R-tree for indexed bounding boxes
: Patrick (of local lucene fame) thinks it is possible to do extent queries with : the cartesian grid method -- essentially you select the "best fit" level and : cell, and that should be set for anything within the extent. The advantage of : this approach is that it is super-fast and scaleable. The disadvantage is : that it is only as accurate as the grid. i'm way out of my league on spatial search -- but couldn't you use the grid method to whittle down the result space, and then do the computation to determine if a true overlap exists? -Hoss
Re: Spatial search using R-tree for indexed bounding boxes
Are there any easily foreseeable problems with implementing an r- tree box indexing/searching extension to Solr, in the spirit of localsolr? If anyone has any pointers I'm all ears. I have implemented an R-Tree based integration for solr. It is pretty ugly and memory intensive, but works for now. I plan to release it (or something like it) in the lucene spatial contrib sometime. I am waiting to see what the "Flexible indexing" thread leads since that may be a good way to just build the R-Tree at index time rather then every time you open a searcher. In my current approach, I store a string representing a bounding box. When you open a searcher, it walks through every document and builds an in memory R-Tree. Then I have a solr query component that applies a filter based on what matches. - - - - - Patrick (of local lucene fame) thinks it is possible to do extent queries with the cartesian grid method -- essentially you select the "best fit" level and cell, and that should be set for anything within the extent. The advantage of this approach is that it is super-fast and scaleable. The disadvantage is that it is only as accurate as the grid. ryan
Spatial search using R-tree for indexed bounding boxes
Hi list We're in need of a spatial search extension to Solr, and it looks like we might end up implementing it ourselves. At first glance localsolr looked promising, but a closer look revealed that each document has only a single point location, and querying is done via a cartesian grid. Our documents have bounding rectangles, and we need to query for intersection with a given extent (with ranking as some measure of similarity of the two areas.) An existing one is already on our website at www.koordinates.com . The textbook solution is to use an R-tree as the index instead of a cartesian grid. Does anyone know of any prior work implementing that for Lucene/Solr? So far searching relevant places on the internet hasn't turned up anything. Are there any easily foreseeable problems with implementing an r-tree box indexing/searching extension to Solr, in the spirit of localsolr? If anyone has any pointers I'm all ears. Regards Craig de Stigter -- Koordinates Ltd PO Box 1604, Shortland St, Auckland, New Zealand Phone +64-9-966 0433 Fax +64-9-969 0045 Web http://www.koordinates.com