What is the best way to determine relevancy and the cutoff of results to show?
The system I'm working on searches an inventory and returns the results. Each result must be reviewed by an employee to determine whether it is a true match, so we want to minimize the number of false matches we return. I've been tweaking boosts and other query settings to improve the scoring, but we still have trouble determining relevancy.

An absolute threshold doesn't work because search scores are only meaningful relative to the other results of the same query. A score of 200 on one query may be less relevant than a score of 0.2 on another.

The other method I've seen is to normalize each score with respect to the top score of the query and then return all results that are within x% of that top score. However, if there are no good results, the top result is itself very poor, and all the results we return will be poor.

How can I determine which documents are relevant and which are not?

--
View this message in context: http://lucene.472066.n3.nabble.com/Determining-Relevancy-Cutoff-tp4151714.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
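
To make the two cutoff strategies described above concrete, here is a minimal Python sketch that reproduces both failure modes. It does not use Lucene itself. The function names, document ids, scores, and the thresholds (1.0 and 70%) are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# A hit is (document id, raw search score). All values below are made up.
Hit = Tuple[str, float]


def absolute_cutoff(hits: List[Hit], min_score: float) -> List[Hit]:
    """Keep hits whose raw score is at least min_score."""
    return [(doc, score) for doc, score in hits if score >= min_score]


def relative_cutoff(hits: List[Hit], fraction: float) -> List[Hit]:
    """Keep hits whose score is at least `fraction` of the query's top score."""
    if not hits:
        return []
    top = max(score for _, score in hits)
    if top <= 0:
        return []
    return [(doc, score) for doc, score in hits if score >= fraction * top]


# Assumption: in this query the top hit is a true match despite its low raw score.
query_with_match = [("part-1", 0.2), ("part-2", 0.05), ("part-3", 0.01)]

# Assumption: in this query no hit is a true match despite the high raw scores.
query_without_match = [("part-7", 200.0), ("part-8", 180.0), ("part-9", 30.0)]

# Absolute threshold: drops the true match and keeps every false one.
abs_with = absolute_cutoff(query_with_match, 1.0)
abs_without = absolute_cutoff(query_without_match, 1.0)

# Relative threshold: isolates the true match in the first query, but still
# returns the best of the poor hits in the second query.
rel_with = relative_cutoff(query_with_match, 0.7)
rel_without = relative_cutoff(query_without_match, 0.7)
```

With these made-up numbers the absolute threshold rejects the only true match, while the relative threshold still hands two false matches to the reviewer for the second query, which is exactly the problem I am trying to solve.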