Correction: We only do the euclidan computation during sorting. For filtering, a simple bounding box is computed to approximate the radius, and 2 range comparisons are made to exclude documents. Because these comparisons are done outside of Lucene as integer comparisons, it is pretty fast. With 13000 results, the seach time with distance sort is about 200 msec (compared to 30 ms for a simple non-radius, date-sorted keyword search).
Peter On 1/27/07, no spam <[EMAIL PROTECTED]> wrote:
Isn't this extremely ineffecient to do the euclidean distance twice? Perhaps not a huge deal if a small search result set. I at times have 13,000 results that match my search terms of an index with 1.2 million docs. Can't you do some simple radian math first to ensure it's way out of bounds, then do the euclidian distance for the subset within bounds? I'm currently only doing the distance calc once (post hit collector). I don't have any performance numbers with the double vs single distance calc. I'm still working out the sort by radius myself. Mark On 11/3/06, Peter Keegan <[EMAIL PROTECTED]> wrote: > > Daniel, > Yes, this is correct if you happen to be doing a radius search and sorting > by mileage. > Peter > >