Perhaps this is more of an NLP question, but are there any tests regarding
relevance for Lucene? Given an example corpus of documents, what are the
golden sets for specific queries? The Wikipedia dump is used as a
benchmarking tool for both indexing and querying in Lucene, but there are
no metrics in terms of precision.
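For concreteness, here is a minimal sketch of the kind of evaluation I mean, assuming a golden set of relevant document IDs per query (the IDs here are hypothetical, not from any real benchmark):

```python
def precision_at_k(ranked_ids, golden_set, k):
    """Fraction of the top-k ranked results that appear in the golden set."""
    top = ranked_ids[:k]
    hits = sum(1 for doc_id in top if doc_id in golden_set)
    return hits / k

# Hypothetical ranked results for one query, and its golden (relevant) set.
ranked = ["d3", "d7", "d1", "d9", "d4"]
golden = {"d1", "d3", "d5"}
print(precision_at_k(ranked, golden, 5))  # 2 relevant docs in top 5 -> 0.4
```

Indexing and query-latency benchmarks exist, but without published golden sets there is no way to compute numbers like this for Lucene's scoring.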

The Open Relevance project was closed yesterday (
http://lucene.apache.org/openrelevance/), which is what prompted me to ask
this question. Was the sub-project closed because others have found
alternate solutions?

Relevance is of course extremely context-dependent and subjective, but my
hope is that there is an example catalog somewhere with defined golden sets.

Cheers,

Ivan
