Hi, I would use a pure span- or cover-density-based ranking algorithm, which does not take document length into consideration (perhaps by tweaking whatever is currently in the standard Lucene distribution?).
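
A minimal sketch of length-independent cover counting, in plain Python rather than Lucene code. It assumes a "cover" means a shortest window of tokens containing every keyword. The names `tokenize`, `find_covers` and `cover_score` are hypothetical, and weighting each cover by 1/width is only one illustrative choice, not something taken from the Lucene distribution.

```python
import re

def tokenize(text):
    # Lowercase and split on non-alphanumerics; a stand-in for a real analyzer.
    return re.findall(r"[a-z0-9]+", text.lower())

def find_covers(tokens, keywords):
    """Return (start, end) token positions of the minimal windows
    that contain every keyword at least once."""
    wanted = {k.lower() for k in keywords}
    last_seen = {}      # keyword -> most recent position
    covers = []
    prev_start = None
    for pos, tok in enumerate(tokens):
        if tok not in wanted:
            continue
        last_seen[tok] = pos
        if len(last_seen) < len(wanted):
            continue    # not every keyword has been seen yet
        start = min(last_seen.values())
        # A window with the same start as the previous cover would contain
        # that cover, so it is not minimal; only record it when start moves.
        if start != prev_start:
            covers.append((start, pos))
            prev_start = start
    return covers

def cover_score(text, keywords):
    # Assumption: each cover contributes 1/width, so tighter covers count more.
    # Document length never enters the formula.
    covers = find_covers(tokenize(text), keywords)
    return sum(1.0 / (end - start + 1) for start, end in covers)

keywords = ["beautiful", "house"]
short_doc = "beautiful xxxxxx house"
long_doc = " ".join(["filler"] * 500 + ["beautiful", "xxxxxx", "house"] + ["filler"] * 500)

# Both documents contain one cover of the same width, so their scores are equal.
print(cover_score(short_doc, keywords), cover_score(long_doc, keywords))
```

Because the score depends only on how many covers exist and how wide each one is, padding a document with unrelated text leaves its score unchanged.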

For example, when searching for the keywords "beautiful house", span/cover ranking gives a long document and a short document the same rank as long as they have the same number of spans/covers (for example, "beautiful xxxxxx house" is one cover) and, within each span/cover, the edit distance between the keywords is the same.

Just my 2 cents,

Cheers,
Jian

On 29 Jun 2005 20:30:49 -0000, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Short documents bubble to the top of the results because the field
> length is short. Does anyone have a good strategy for working around this?
> Will doing something like log(document length) flatten out my results while
> still making them meaningful? I'm going to try some different approaches
> but any advice is appreciated.
>
> Thanks.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]