...or just set a lower boost on fields with fewer than $x characters while indexing.
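
A minimal sketch of that idea in plain Java, with no Lucene dependency so it runs on its own. The class name, the 20-character threshold and the 0.5 penalty are all hypothetical values chosen for illustration; they are not taken from Lucene or from this thread.

```java
public final class ShortFieldBoost {
    private ShortFieldBoost() {}

    /**
     * Returns the boost to apply to a field at index time.
     * Text shorter than minChars gets reducedBoost; anything else keeps 1.0f.
     */
    public static float boostFor(String text, int minChars, float reducedBoost) {
        int length = (text == null) ? 0 : text.trim().length();
        return length < minChars ? reducedBoost : 1.0f;
    }

    public static void main(String[] args) {
        int minChars = 20;          // hypothetical threshold (the "$x" above)
        float reducedBoost = 0.5f;  // hypothetical penalty for short fields
        String[] bodies = { "Hello World", "The World is often a dangerous place" };
        for (String body : bodies) {
            System.out.println(boostFor(body, minChars, reducedBoost) + "  " + body);
        }
    }
}
```

In an indexer the returned value would presumably be handed to the field's boost setter (Field.setBoost(float) in the Lucene API of this era) before the document is added to the index.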

John

Otis Gospodnetic wrote:
Kevin,

You could try setting index-time field length-dependent boosts.

Another possibility may be your own sorting that takes field length into
consideration, but I'm not sure how well that would work.

Finally, you could use your own Similarity and implement your own
lengthNorm(String, int):
http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/search/Similarity.html#lengthNorm(java.lang.String,%20int)
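
To make the lengthNorm idea concrete, here is a rough standalone sketch of the arithmetic, not a drop-in Similarity. The first method mirrors the classic default of 1/sqrt(numTerms); the second is one possible replacement that scores any field shorter than a floor as if it had exactly that many terms, so very short fields lose their advantage. The class name, the method names and the floor of 10 terms are assumptions for illustration, and the term counts assume no stop-word removal.

```java
public final class LengthNormSketch {
    private LengthNormSketch() {}

    /** Classic default length normalization: 1 / sqrt(numTerms). */
    public static float defaultLengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    /**
     * Hypothetical variant: fields with fewer than minTerms terms are
     * normalized as if they contained minTerms terms.
     */
    public static float flooredLengthNorm(int numTerms, int minTerms) {
        return (float) (1.0 / Math.sqrt(Math.max(numTerms, minTerms)));
    }

    public static void main(String[] args) {
        int minTerms = 10;  // hypothetical floor
        int helloWorld = 2; // terms in "Hello World"
        int longer = 7;     // terms in "The World is often a dangerous place"
        System.out.println("default: " + defaultLengthNorm(helloWorld)
                + " vs " + defaultLengthNorm(longer));
        System.out.println("floored: " + flooredLengthNorm(helloWorld, minTerms)
                + " vs " + flooredLengthNorm(longer, minTerms));
    }
}
```

With the default formula the two-term field gets a noticeably larger norm than the seven-term one; with the floor both get the same norm, so the ranking falls back on the other scoring factors. In a real index this logic would live in a Similarity subclass overriding lengthNorm(String fieldName, int numTerms), and the index would need rebuilding because norms are computed at index time.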

Otis


--- "Kevin A. Burton" <[EMAIL PROTECTED]> wrote:


I've noticed that Lucene ranks results poorly when the text has just a few words in the body.

For example, if you searched for the word "World" in the following two paragraphs:

"Hello World"

and

"The World is often a dangerous place"

The first paragraph would probably be the top match.

Is there a way I can tweak Lucene to return richer content?

Kevin




--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]




