Well, this web page helps quite a bit, but I don't see anything about the density of words in the document compared to the number of times the word appears. I could have sworn I read that somewhere.
Anyway, it makes sense that it's doing this, because I get a document back with 5 words, just 5 words, with 1 word matching the query, and it's scored higher than a document with 20 words where the term appears twice, which is obviously not what we want.

http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Similarity.html

Continuing investigation.
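
For reference, the ranking described above can be reproduced by hand from two factors on that Similarity page. This is only a rough sketch: it assumes the stock DefaultSimilarity formulas (tf = sqrt(freq), lengthNorm = 1/sqrt(numTerms)), a single-term query, and no boosts, so that idf, coord and queryNorm are identical for both documents and can be left out. It also ignores the precision lost when Lucene stores the norm in a single byte. The class and method names are made up for illustration and are not Lucene API.

```java
// Hand calculation of the two score factors that differ between the documents.
// Assumes DefaultSimilarity's formulas; names here are illustrative, not Lucene API.
class LengthNormSketch {

    // Term-frequency factor: grows with the square root of the raw count.
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // Length normalization: shorter fields get a larger factor.
    static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    // Product of the two factors; idf, coord and queryNorm are omitted
    // because they are the same for both documents in this scenario.
    static double partialScore(int freq, int numTerms) {
        return tf(freq) * lengthNorm(numTerms);
    }

    // Hypothetical variant with length normalization switched off (norm = 1).
    static double flatScore(int freq) {
        return tf(freq);
    }

    public static void main(String[] args) {
        System.out.println("5 words, 1 hit:   " + partialScore(1, 5));
        System.out.println("20 words, 2 hits: " + partialScore(2, 20));
        System.out.println("no norm, 1 hit:   " + flatScore(1));
        System.out.println("no norm, 2 hits:  " + flatScore(2));
    }
}
```

Under those assumptions the 5-word document comes out at roughly 0.447 and the 20-word document at roughly 0.316, which matches the ordering seen in the results. With the norm held at 1 the order flips, which suggests that a Similarity subclass overriding lengthNorm (or omitting norms on the field) may be worth trying. Norms are written at index time, so the documents would presumably need reindexing for such a change to take effect.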
