Dan,

The filtering is done in the HitCollector by the bounding box, so the only
hits that get collected are those that match the keywords, the bounding box,
and some Lucene filters (BitSets) (I'm probably overloading the word
'filter' a bit). So, the only hits from the collector that need to be sorted
are those that are roughly within the search radius. When the search radius
gets larger, a new bounding box is computed for that query. Make sense?

Peter

On 3/16/07, Daniel Rosher <[EMAIL PROTECTED]> wrote:

Hi Peter,

Shouldn't the search perform the euclidean distance during filtering as
well
though, otherwise you will obtain perhaps highly relevant hits reported to
the user outside the range they specified? Particularly as the search
radius
gets larger.

Cheers,
Dan

On 1/28/07, Peter Keegan <[EMAIL PROTECTED]> wrote:
>
> Correction:
> We only do the euclidan computation during sorting. For filtering, a
> simple
> bounding box is computed to approximate the radius, and 2 range
> comparisons
> are made to exclude documents. Because these comparisons are done
outside
> of
> Lucene as integer comparisons, it is pretty fast. With 13000 results,
the
> seach time with distance sort is about 200 msec (compared to 30 ms for a
> simple non-radius, date-sorted keyword search).
>
> Peter
>
> On 1/27/07, no spam <[EMAIL PROTECTED]> wrote:
> >
> > Isn't this extremely ineffecient to do the euclidean distance twice?
> > Perhaps not a huge deal if a small search result set.  I at times have
> > 13,000 results that match my search terms of an index with 1.2 million
> > docs.
> >
> > Can't you do some simple radian math first to ensure it's way out of
> > bounds,
> > then do the euclidian distance for the subset within bounds?  I'm
> > currently
> > only doing the distance calc once (post hit collector). I don't have
any
> > performance numbers with the double vs single distance calc.
> >
> > I'm still working out the sort by radius myself.
> >
> > Mark
> >
> > On 11/3/06, Peter Keegan <[EMAIL PROTECTED]> wrote:
> > >
> > > Daniel,
> > > Yes, this is correct if you happen to be doing a radius search and
> > sorting
> > > by mileage.
> > > Peter
> > >
> > >
> >
> >
>

Reply via email to