On Apr 14, 2010, at 12:12 PM, Chris Male wrote:

> Hi,
>
> On Wed, Apr 14, 2010 at 6:07 PM, Grant Ingersoll <gsing...@apache.org> wrote:
>
> On Apr 14, 2010, at 11:06 AM, Chris Male wrote:
>
> > Hi,
> >
> > My understanding of the benefits of the new algorithm is that it means a
> > lower tier level resulting in fewer boxes, but more documents inside those
> > boxes that are outside of the search radius.
> >
> > While having fewer boxes means fewer term queries to make against the
> > index, more documents means more costly calculations to filter out those
> > extraneous documents.
> >
> > For those doing just Cartesian Tier filtering it seems like the new
> > approach is a win, but for those doing distance calculations on those
> > documents passing the filter, it seems to come at a cost.
>
> Currently, this is only used for filtering. AIUI, Tiers aren't really that
> useful for distance calculations, are they? After all, all you have is a box
> id and you'd have to reverse out the calc of that to be able to calc a
> distance, no? Perhaps I'm missing something.
>
>
> How Spatial Lucene currently works (or at least one of the ways it was
> designed to work) is by using a 2-step filtering process. Step 1 is the
> Cartesian Tier filtering. The resulting set of Documents is then passed on
> through to Step 2, which then calculates the distance from each Document to
> the search centre. If the distance is greater than the radius, the Document
> is filtered out. This means that after both filtering steps you have only
> those Documents that are in the search radius.
>
> How this impacts this algorithm choice is that the more Documents that pass
> through Step 1, the more calculations that have to be done in Step 2.

OK, I see what you mean now. I thought you were implying the box id would be used for calculating a distance, too.
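
For anyone following along, the two-step process described above can be sketched roughly as below. This is only an illustration in Python, not Spatial Lucene's code: a plain lat/lon grid stands in for the real Cartesian tier scheme, and box_id, two_step_filter, haversine_miles, the box sizes and the sample points are all hypothetical names and values made up for the example. It also assumes the search circle does not reach a pole or cross the antimeridian.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # approximate mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in miles."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def box_id(lat, lon, box_deg):
    """Hypothetical box id: the (row, col) of a grid cell box_deg degrees wide.

    This is a stand-in for a tier box id, not the scheme Lucene uses.
    """
    return (math.floor(lat / box_deg), math.floor(lon / box_deg))


def two_step_filter(docs, center, radius_miles, box_deg):
    """Return (candidates, hits) for docs given as (doc_id, lat, lon) tuples."""
    clat, clon = center

    # Bounding box of the search circle, in degrees.
    ang = radius_miles / EARTH_RADIUS_MILES
    dlat = math.degrees(ang)
    dlon = math.degrees(math.asin(min(1.0, math.sin(ang) / math.cos(math.radians(clat)))))
    lo = box_id(clat - dlat, clon - dlon, box_deg)
    hi = box_id(clat + dlat, clon + dlon, box_deg)

    # Step 1: coarse filter. Keep every doc whose box overlaps the bounding box.
    candidates = []
    for doc in docs:
        row, col = box_id(doc[1], doc[2], box_deg)
        if lo[0] <= row <= hi[0] and lo[1] <= col <= hi[1]:
            candidates.append(doc)

    # Step 2: exact filter. One distance calculation per surviving doc.
    hits = [d for d in candidates
            if haversine_miles(clat, clon, d[1], d[2]) <= radius_miles]
    return candidates, hits


# Made-up sample points around a made-up search centre.
DOCS = [("a", 40.05, -75.05), ("b", 40.4, -75.4), ("c", 41.0, -75.0)]
CENTER = (40.0, -75.0)

fine_candidates, fine_hits = two_step_filter(DOCS, CENTER, 10.0, box_deg=0.5)
coarse_candidates, coarse_hits = two_step_filter(DOCS, CENTER, 10.0, box_deg=2.0)
```

With the larger boxes, more documents survive Step 1 and so more distance calculations run in Step 2, while the final set of hits stays the same. That is the trade-off being discussed: fewer boxes to look up versus more candidates to check.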