See comments below. --- Erik Hatcher <[EMAIL PROTECTED]> wrote:
> [trimming the post a bit] > > On Sep 18, 2005, at 11:51 AM, James Huang wrote: > > The problem is quite generic, I believe. What I > like > > to do is similar to LIA-ch6, i.e. to find a "good > > Chinese Hunan-style restaurant near me." I prefer > > Hunan-style; however, if a good Human-style one is > 12 > > miles, where there is a Shanghai-style only 2 > miles, I > > may want to take that instead. So it's not a > simple > > multi-sorting problem, it's an empirical ordering > and > > the parameters may have to be experimented. Thus > far, > > I'm happy with that formula I gave earlier. > > The example in LIA was purely a distance sort, not > blended as you > desire. > > > Separately, earlier in this thread, you also > mentioned > > "what if 10M search results?" -- that's also my > > concern, for both space and time. > > > > 1. Space-wise, the 10M Document's will be dragged > into > > memory (in a Hits, say), right? > > No, that is not correct, and this is an important > point about Lucene > and it's ability to scale extremely well. Hits > caches up to 200 > documents (I believe) and uses a mechanism to score > single documents > at a time and only keep the top scoring ones. > > There is no problem for Lucene to search and have > Hits with a massive size. > > There are memory considerations with sorting, though > - these are > described in detail in the javadocs and a little in > LIA. > > > 1. How to use a compound scoring at search-time > (where > > you suggested a Query-subclass, but what/how?) > > I'm going to defer to others to assist with this, or > validate that > this is the right approach in this situation. > > > 2. Space concern about large search result set. > > With a Query subclass, this shouldn't be a concern. > With sorting > using Lucene's Sort there are some memory concerns, > but less so than with your own TreeSet. > OK, so external sorting does not scale and has to be ruled out! Now I have to find a way to customize the scoring during search (using Hits, not customized HitsCollector). Help is desparately needed here! Thanks in advance, -James > > Erik > __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]