Hoss wrote: this would be meaningless even if it were easier... http://wiki.apache.org/lucene-java/LuceneFAQ#head-912c1f237bb00259185353182948e5935f0c2f03
FAQ: "Can I filter by score?" -Hoss I've read the warnings referenced there; but still have a problem to solve. We have "fact-based" information about people and the jobs they might fill (availability dates, experience level, languages spoken, etc.) and we have textual information about both the jobs and the people (e.g. resumes). We'd like to use the "goodness of match" of the textual description of the job to the person's resume as a way to suggest, for example, additional people who should be considered for the job, even if, say, their specific job title does not match the requested job title. I can use the job description to construct a query (and I've done it in a variety of ways), but how best to choose which of the returned people to allow to "fit" the job? An obviously desirable way to do it is using the score, but all discussion seems to say "don't do that, since absolute score isn't meaningful" (and I don't use the normalized score, BTW, I use the raw score, but the same caveats apply). Certainly the relative scores for a single query can be used to rank the goodness of fit to that particular job, but that doesn't solve the problem in general. Should I just give up and return the "25 best" fits to the job, and only use score to rank them relative to one another? This then means that a job description that has very few words that match *anything* in the collection of resumes will still produce 25 people that "match". In practice (anecdotally) it does appear to me that when the highest score for a particular job description is fairly small (say 0.10) that there's a good reason for that, and when the highest score is something like 0.60, there's a good reason for that as well. That is, queries that yield a small "best score" are queries for which I would not expect good matches, and vice versa. So it does seem (again anecdotally) that the score has *some* relevance. What are the experts' thoughts on this? Donna L. Gresh Services Research, Mathematical Sciences Department IBM T.J. Watson Research Center (914) 945-2472 http://www.research.ibm.com/people/g/donnagresh [EMAIL PROTECTED]