Perhaps this is more of an NLP question, but are there any relevance tests for Lucene? Given an example corpus of documents, what are the golden sets for specific queries? The Wikipedia dump is used as a benchmark for both indexing and querying in Lucene, but there are no accompanying metrics for retrieval quality, such as precision.
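To make the question concrete: by a "golden set" I mean a mapping from a query to the document IDs judged relevant for it, against which a metric like precision@k can be computed. A minimal sketch (the names `goldenSet`, `ranked`, and `precisionAtK` are illustrative, not part of any Lucene API):

```java
import java.util.List;
import java.util.Set;

public class PrecisionAtK {
    // Fraction of the top-k ranked results that appear in the golden set.
    public static double precisionAtK(List<String> ranked, Set<String> goldenSet, int k) {
        if (k <= 0) {
            return 0.0;
        }
        int hits = 0;
        int limit = Math.min(k, ranked.size());
        for (int i = 0; i < limit; i++) {
            if (goldenSet.contains(ranked.get(i))) {
                hits++;
            }
        }
        return (double) hits / k;
    }

    public static void main(String[] args) {
        // Hypothetical ranked results for one query, plus its golden set.
        List<String> ranked = List.of("d1", "d2", "d3", "d4", "d5");
        Set<String> golden = Set.of("d1", "d3", "d9");
        // Two of the top five results are relevant -> precision@5 = 0.4
        System.out.println(precisionAtK(ranked, golden, 5));
    }
}
```

The hard part is not the metric but obtaining the judged golden sets per query, which is what I am hoping a public catalog provides.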
The Open Relevance project was closed yesterday (http://lucene.apache.org/openrelevance/), which is what prompted this question. Was the sub-project closed because others have found alternative solutions? Relevance is of course extremely context-dependent and subjective, but my hope is that there is an example catalog somewhere with defined golden sets. Cheers, Ivan