Re: Spatial search using R-tree for indexed bounding boxes

2009-03-11 Thread pjaol

There are many spatial solutions out there
- R-tree
- Quad-Tree
- SRID with positional proximity like geohash
- Voronoi diagrams
etc..

All have their pros & cons as do Cartesian grids.
Feel free to contribute the more there are, the more solutions that can be
applied to different problems, I use a Cartesian method meets my needs for
my work which has a substantial data set. 

If you look at the LocalSolr implementation of LocalLucene you'll notice
that there are 
multiple grids indexed at different levels / tiers. In a very similar
fashion to Quad-Trees.

The reason, you can 'zoom' in as low as you need to.

There are multiple filters occurring for the geo filters in locallucene,
first is an MMB based
on bestFit or zoom level, then afterwards those results are filtered by your
text intersection
and finally by a radial filter. 
The MMB filter, pre-generates the shape or in this case box with all the
id's you're interested in
and using a TermEnumerator pulls those from the index. As the id's are
stored in a sorted fashion
it's a very fast retrieval. Without having to manage memory or a data
structure outside of lucene.

Storing multiple points is possible, doing a radial filter, or sorting on
distance (which is what my work
depends on) is tricky as I use FieldCache to retrieve actual points quickly.
Multiple values in a field cache
dont work. But I am looking at the Uninverted Field method of solr facets
for that.
You will have the same issue no matter what method you use, unless you don't
care
about distance.

If you want to do something more complex like polygon's, you extend the
Shape's class, and create
either another MMB or convex hull method. 
Basically my belief is, that it's faster to find something if you know what
your looking for.
i.e. grid / box id's
The bestFit method essentially lets you skip traversing and just get to the
level you want.
Pre-generate the id's used in a shape, and simply pull them out of the
index.




hossman wrote:
> 
> 
> : Patrick (of local lucene fame) thinks it is possible to do extent
> queries with
> : the cartesian grid method -- essentially you select the "best fit" level
> and
> : cell, and that should be set for anything within the extent.  The
> advantage of
> : this approach is that it is super-fast and scaleable.  The disadvantage
> is
> : that it is only as accurate as the grid.
> 
> i'm way out of my league on spatial search -- but couldn't you use the 
> grid method to whittle down the result space, and then do the computation 
> to determine if a true overlap exists?
> 
> 
> -Hoss
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Spatial-search-using-R-tree-for-indexed-bounding-boxes-tp22318859p22462731.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Spatial search using R-tree for indexed bounding boxes

2009-03-09 Thread Chris Hostetter

: Patrick (of local lucene fame) thinks it is possible to do extent queries with
: the cartesian grid method -- essentially you select the "best fit" level and
: cell, and that should be set for anything within the extent.  The advantage of
: this approach is that it is super-fast and scaleable.  The disadvantage is
: that it is only as accurate as the grid.

i'm way out of my league on spatial search -- but couldn't you use the 
grid method to whittle down the result space, and then do the computation 
to determine if a true overlap exists?


-Hoss



Re: Spatial search using R-tree for indexed bounding boxes

2009-03-03 Thread Ryan McKinley


Are there any easily foreseeable problems with implementing an r- 
tree box
indexing/searching extension to Solr, in the spirit of localsolr? If  
anyone

has any pointers I'm all ears.


I have implemented an R-Tree based integration for solr.  It is pretty  
ugly and memory intensive, but works for now.  I plan to release it  
(or something like it) in the lucene spatial contrib sometime.  I am  
waiting to see what the "Flexible indexing" thread leads since that  
may be a good way to just build the R-Tree at index time rather then  
every time you open a searcher.


In my current approach, I store a string representing a bounding box.   
When you open a searcher, it walks through every document and builds  
an in memory R-Tree.  Then I have a solr query component that applies  
a filter based on what matches.


- - - - -

Patrick (of local lucene fame) thinks it is possible to do extent  
queries with the cartesian grid method -- essentially you select the  
"best fit" level and cell, and that should be set for anything within  
the extent.  The advantage of this approach is that it is super-fast  
and scaleable.  The disadvantage is that it is only as accurate as the  
grid.


ryan




Spatial search using R-tree for indexed bounding boxes

2009-03-03 Thread Craig de Stigter
Hi list

We're in need of a spatial search extension to Solr, and it looks like we
might end up implementing it ourselves.

At first glance localsolr looked promising, but a closer look revealed that
each document has only a single point location, and querying is done via a
cartesian grid. Our documents have bounding rectangles, and we need to query
for intersection with a given extent (with ranking as some measure of
similarity of the two areas.) An existing one is already on our website at
www.koordinates.com .

The textbook solution is to use an R-tree as the index instead of a
cartesian grid. Does anyone know of any prior work implementing that for
Lucene/Solr? So far searching relevant places on the internet hasn't turned
up anything.

Are there any easily foreseeable problems with implementing an r-tree box
indexing/searching extension to Solr, in the spirit of localsolr? If anyone
has any pointers I'm all ears.

Regards
Craig de Stigter

-- 
Koordinates Ltd
PO Box 1604, Shortland St, Auckland, New Zealand
Phone +64-9-966 0433 Fax +64-9-969 0045
Web http://www.koordinates.com