Hi,

On 5/29/06, Sebastiano Vigna <[EMAIL PROTECTED]> wrote:
> Dear Lucene developers, I'd be interested in doing some benchmarking on (at least) Lucene, Egothor and MG4J. There is no actual data around on publicly available collections, and it would be nice to have some more objective data on efficiency for a significantly large collection.
I was wondering if you have seen the TREC 2004 paper by Giuseppe Attardi, Andrea Esuli and Chirag Pate from the University of Pisa, Italy, titled "Using Clustering and Blade Clusters in the TeraByte task"?

http://trec.nist.gov/pubs/trec13/papers/upisa-tera.pdf

In the paper, three search engines (including Lucene) were benchmarked on the GOV2 corpus.

--
Dave Kor
Center for Information Mining and Extraction
School of Computing
National University of Singapore