Re: Strategy for making short documents not bubble to the top?

jian chen Wed, 29 Jun 2005 13:39:56 -0700

Hi,

I would use a ranking algorithm based purely on span or cover density,
which does not take document length into consideration (perhaps by
tweaking whatever is currently in the standard Lucene distribution?).


For example, when searching for the keywords "beautiful house", span/cover
ranking will give a long document and a short document the same rank
as long as they have the same number of spans/covers (for example,
"beautiful xxxxxx house" is one cover) and, within each span/cover,
the edit distance between the keywords is the same.
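
To make the idea concrete, here is a rough sketch in plain Python. It is
not Lucene code and not a definitive implementation. The names
find_covers and cover_score and the cutoff k are assumptions made up for
illustration, and the scoring rule is only loosely modelled on cover
density ranking. The point is that the score depends on the number of
covers and their widths, never on the document length.

```python
def find_covers(tokens, terms):
    """Return (start, end) positions of minimal windows holding every term."""
    terms = set(terms)
    covers = []
    last_seen = {}  # query term -> latest position still usable for a cover
    for pos, tok in enumerate(tokens):
        if tok not in terms:
            continue
        last_seen[tok] = pos
        if len(last_seen) == len(terms):
            start = min(last_seen.values())
            covers.append((start, pos))
            # Forget the leftmost term so the next cover has to start later.
            del last_seen[tokens[start]]
    return covers


def cover_score(tokens, terms, k=4):
    """Sum a per-cover contribution; k is a hypothetical width cutoff."""
    score = 0.0
    for start, end in find_covers(tokens, terms):
        width = end - start + 1
        # Tight covers count fully, wider ones are discounted by their width.
        score += 1.0 if width <= k else k / width
    return score


query = ["beautiful", "house"]
short_doc = "beautiful old house".split()
long_doc = short_doc + ["filler"] * 500

# Both documents contain exactly one cover of the same width,
# so they get the same score regardless of their length.
same_rank = cover_score(short_doc, query) == cover_score(long_doc, query)
```

With this kind of scoring, padding a document with unrelated text does
not change its score; only additional covers, or tighter ones, do.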

Just my 2 cents, 

Cheers,

Jian

On 29 Jun 2005 20:30:49 -0000, [EMAIL PROTECTED]
<[EMAIL PROTECTED]> wrote:
> Hi,
> 
> Short documents bubble to the top of the results because the field
> length is short.  Does anyone have a good strategy for working around this?
>  Will doing something like log(document length) flatten out my results while
> still making them meaningful?  I'm going to try some different approaches
> but any advice is appreciated.
> 
> Thanks.
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 
>
