On Jun 25, 2007, at 2:04 PM, Doug Cutting wrote:
Doron Cohen wrote:
It is very important that we would be able to assess the search
quality in a repeatable manner - so that anyone can repeat the quality
tests, and maybe find ways to improve them. (This would also allow to
verify the "improvements claims" above...). This capability seems like
a natural part of the benchmark package. I started to look at
extending the benchmark package with search quality module, that would
open an index (or first create one), run a set of queries (similar to
the performance benchmark), and compute and report the set of known
statistics mentioned above and more. Such a module depends on input
data - documents, queries, and judgements. And that's my second
question. We don't have to invent this data - TREC has it already, and
it is getting wider every year as there are more judgements. So,
theoretically we could use TREC data.
We should be careful not to tune things too much for any one
application and/or dataset. Tools to perform evaluation would
clearly be valuable. But changes that improve Lucene's results on
TREC data may or may not be of general utility. The best way to
tune an application is to sample its query stream and evaluate
these against its documents.
+1. To do this, we could use Reuters or Wikipedia. The hard part is
generating the queries and having people make relevance judgments for
a sufficient sample size. Over time it would get better, and we might
get more support from outsiders, especially if we had a nice way for
people to add queries/judgments w/o going through the patch/commit
process. (Maybe a page on the wiki could hold the queries and
judgments? That could get tricky.)
That said, Lucene's scoring method has never been systematically
tuned, and some judicious tuning based on TREC results would
probably benefit a majority of Lucene applications. Ideally we can
develop evaluation tools, use them on a variety of datasets to find
better defaults for Lucene, and make the tools available so that
folks can fine-tune things for their particular applications.
+1 as well.
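
To make the evaluation-tool idea a bit more concrete, here is a rough
sketch of one statistic such a tool might report: mean average
precision over a set of queries. It is written in Python only for
brevity and is not part of the benchmark package. The function names
(parse_qrels, average_precision, mean_average_precision) and the sample
data are made up for illustration. It assumes judgments come as
whitespace-separated "topic iteration docno relevance" lines, as in
TREC qrels files, and that each ranked result list has no duplicates.

```python
from collections import defaultdict


def parse_qrels(lines):
    """Parse judgment lines of the form 'topic iteration docno relevance'.

    Returns a dict mapping topic -> set of docnos judged relevant.
    """
    relevant = defaultdict(set)
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        topic, _iteration, docno, rel = parts
        if int(rel) > 0:
            relevant[topic].add(docno)
    return relevant


def average_precision(ranked_docs, relevant_docs):
    """Average precision of one ranked list against a set of relevant docs.

    Assumes ranked_docs contains no duplicate docnos.
    """
    if not relevant_docs:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc in enumerate(ranked_docs, start=1):
        if doc in relevant_docs:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / len(relevant_docs)


def mean_average_precision(runs, qrels):
    """Mean of average precision over topics that have relevant docs.

    runs: dict topic -> ranked list of docnos returned by the search.
    qrels: dict topic -> set of relevant docnos (see parse_qrels).
    """
    topics = [t for t in runs if t in qrels]
    if not topics:
        return 0.0
    total = sum(average_precision(runs[t], qrels[t]) for t in topics)
    return total / len(topics)


# Hypothetical sample data, for illustration only.
sample_qrels = parse_qrels([
    "1 0 docA 1",
    "1 0 docB 0",
    "1 0 docC 1",
    "2 0 docD 1",
])
sample_runs = {
    "1": ["docA", "docB", "docC"],
    "2": ["docX", "docD"],
}
sample_map = mean_average_precision(sample_runs, sample_qrels)
```

The same functions would work for judgments collected on a wiki page or
sampled from an application's own query stream, as long as they can be
turned into the topic -> relevant docs mapping above.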