On 5/30/07, Donna L Gresh <[EMAIL PROTECTED]> wrote:
My guess is that my user would really like the scoring to be done considering only that subset of person ids as well, but we haven't explicitly discussed it. I'm pretty sure that the scoring is based on information in the entire index and can't be changed on the fly, correct?
Yes, term document frequency (used for idf - inverse document frequency) will be based on the whole index.
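To make the effect concrete, here is a small sketch of how idf shifts when the document counts come from the whole index rather than from a subset. It assumes the classic default formula, idf = log(numDocs / (docFreq + 1)) + 1. The class name and all the counts are made up for illustration.

```java
// Minimal sketch, not Lucene code: compares idf computed over a whole
// index with idf computed over a small subset of it.
class IdfSketch {

    // Assumed formula (classic default similarity):
    //   idf = ln(numDocs / (docFreq + 1)) + 1
    static double idf(int docFreq, int numDocs) {
        return Math.log((double) numDocs / (docFreq + 1)) + 1.0;
    }

    public static void main(String[] args) {
        // Hypothetical counts: the term occurs in 9 documents.
        double wholeIndex = idf(9, 1000); // rare across 1000 documents
        double subsetOnly = idf(9, 20);   // common within a 20-document subset

        System.out.println("idf over whole index: " + wholeIndex);
        System.out.println("idf over subset only: " + subsetOnly);
    }
}
```

A term that is rare in the full index but common among the selected people gets a high weight under whole-index statistics, so the ranking within the subset can differ from what subset-only statistics would give.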
In any case it seems to me that the "natural" way to only return people who are in the original input list is to simply use Lucene as it is, getting all the hits I need, and then only returning out of the application those on the original input list. Does this seem appropriate? Thanks in advance for any pointers--
It's probably easier to use a Filter (which essentially does the same thing at a lower level in the search API). Use termDocs(Term) to look up the ids, add them to a BitSet, and make a Filter with that. You might want to check out CachingWrapperFilter or QueryFilter too.

-Yonik
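The BitSet approach can be sketched roughly as follows. This is a stand-alone model of the idea, not Lucene code: the map from person id to document numbers stands in for the termDocs(Term) lookup, and the names IdFilterSketch, allowedDocs and applyFilter are hypothetical. In Lucene itself the BitSet would be returned from a Filter and applied during the search rather than afterwards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-alone model of "look up ids, set bits, filter hits".
class IdFilterSketch {

    // Builds the set of allowed document numbers for the input ids.
    // idToDocs is a hypothetical stand-in for the index lookup that
    // termDocs(Term) would perform on the id field.
    static BitSet allowedDocs(Map<String, int[]> idToDocs,
                              List<String> inputIds, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (String id : inputIds) {
            int[] docs = idToDocs.get(id);
            if (docs == null) {
                continue; // id is not in the index, nothing to allow
            }
            for (int doc : docs) {
                bits.set(doc);
            }
        }
        return bits;
    }

    // Keeps only the hits whose bit is set, preserving ranked order.
    static List<Integer> applyFilter(int[] rankedHits, BitSet allowed) {
        List<Integer> kept = new ArrayList<Integer>();
        for (int doc : rankedHits) {
            if (allowed.get(doc)) {
                kept.add(doc);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, int[]> idToDocs = new HashMap<String, int[]>();
        idToDocs.put("p1", new int[] {0});
        idToDocs.put("p2", new int[] {3});
        idToDocs.put("p3", new int[] {5});

        BitSet allowed = allowedDocs(idToDocs, Arrays.asList("p1", "p3", "missing"), 6);
        int[] rankedHits = {5, 3, 0, 2}; // doc numbers in score order

        System.out.println(applyFilter(rankedHits, allowed)); // prints [5, 0]
    }
}
```

Because the bits depend only on the input list and the index, the same BitSet can be reused across queries for that list, which is where a caching wrapper would help.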