Thanks. I added my comment on the issue. I think we should revert and then someone can put up a patch to make this pluggable. As it stands, this Best Fit calculation has nothing to do with the CartesianTierPlotter anyway, so we could refactor it pretty easily.
-Grant On Apr 14, 2010, at 12:42 PM, Helleringer, Nicolas wrote: > Tables are well on JIRA : https://issues.apache.org/jira/browse/LUCENE-2359 > > Nicolas > > 2010/4/14 Helleringer, Nicolas <nicolas.hellerin...@novacodex.net> > Here are the summary tables : > > First a table to remind metrics on the Tiers : > Tile Level TierLegnth TierBoxes TileLength (miles) 0 1 1 24902 1 2 4 12451 2 > 4 16 6225,5 3 8 64 3112,75 4 16 256 1556,375 5 32 1024 778,1875 6 64 4096 > 389,09375 7 128 16384 194,546875 8 256 65536 97,2734375 9 512 262144 > 48,63671875 10 1024 1048576 24,31835938 11 2048 4194304 12,15917969 12 4096 > 16777216 6,079589844 13 8192 67108864 3,039794922 14 16384 268435456 > 1,519897461 15 32768 1073741824 0,75994873 > > > Then the comparaison table between legacy and new bestFit : > Radius (miles) legacy bestFit legacy bestFit TileLength legacy bestFit max > number of Box to fetch new bestFit new bestFit TileLength new bestFit number > of Box to fetch 1 18 0,75994873 9 14 1,519897461 4 5 16 0,75994873 64 12 > 6,079589844 4 10 15 0,75994873 225 11 12,15917969 4 25 13 3,039794922 100 9 > 24,31835938 9 50 12 6,079589844 100 8 97,2734375 4 100 11 12,15917969 100 7 > 194,546875 4 250 10 24,31835938 144 6 389,09375 4 500 9 48,63671875 144 5 > 778,1875 4 1000 8 97,2734375 144 4 1556,375 4 2500 7 194,546875 196 3 3112,75 > 4 5000 6 389,09375 196 2 6225,5 4 10000 5 778,1875 196 1 12451 4 > > I hope mailers will keep the formating ... > > > > > > If not I shall post on JIRA. > > Formulas : > TileLength is 24902 (earth circumference) / TierLength > bestFit formulas as summarized by Grant in his email. > number of box to fetch : pow(ceil(TileLength/Radius)+1,2) => > TileLength/Radius is for how many tiles are needed to cover the radius, +1 is > because you are not always well aligned, the pow(X,2) because there is two > directions/axis > > Best regards, > > Nicolas > > 2010/4/14 Chris Male <gento...@gmail.com> > > Hi, > > On Wed, Apr 14, 2010 at 6:07 PM, Grant Ingersoll <gsing...@apache.org> wrote: > > On Apr 14, 2010, at 11:06 AM, Chris Male wrote: > > > Hi, > > > > My understanding of the benefits of the new algorithm is that it means a > > lower tier level resulting in fewer boxes, but more documents inside those > > boxes that are outside of the search radius. > > > > While having fewer boxes means fewer term queries to make against the > > index, more documents means more costly calculations to filter out those > > extraneous documents. > > > > For those doing just Cartesian Tier filtering it seems like the new > > approach is a win, but for those doing distance calculations on those > > documents passing the filter, it seems to come at a cost. > > Currently, this is only used for filtering. AIUI, Tiers aren't really that > useful for distance calculations, are they? After all, all you have is a box > id and you'd have to reverse out the calc of that to be able to calc a > distance, no? Perhaps I'm missing something. > > > How Spatial Lucene currently works (or at least one of the ways it was > designed to work), is using a 2 step filtering process. Step 1 is the > Cartesian Tier filtering. The resulting set of Documents is then passed on > through to Step 2 which then calculates the distance from each Document to > the search centre. If the distance is greater than the radius, the Document > is filtered out. This means that after both filtering steps you have only > those Documents that are in the search radius. > > How this impacts this algorithm choice is that the more Documents the pass > through Step 1, the more calculations that have to be done in Step 2. > > I'm not sure, however, that it is a win for filtering. It seems like you end > up including docs in the result set that should be in there. > > I'll wait for Nicolas' summary table, but I'm inclined to revert and then > someone can refactor if they want to offer alternate implementations. > > -Grant > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-dev-h...@lucene.apache.org > > > > > -- > Chris Male | Software Developer | JTeam BV.| www.jteam.nl > > -------------------------- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem using Solr/Lucene: http://www.lucidimagination.com/search