Hello, the industry standard is a collection called Reuters-21578.
http://www.daviddlewis.com/resources/testcollections/reuters21578/

That should do,
Karsten

-----Original Message-----
From: Jos van der Meer [mailto:[EMAIL PROTECTED]
Sent: Wednesday, 30 July 2003 15:20
To: [EMAIL PROTECTED]
Subject: Free, medium size, downloadable corpus of newspaper articles ?

For my experiments with Lucene, I would like to have a publicly available, free, medium-size, downloadable corpus of newspaper articles (neither the topics nor the publication dates matter). I would like to share the results of the experiments, and people should be able to reproduce and extend them.

Don't send the corpora themselves (..), but please send me their URLs.

Thanks in advance,
[EMAIL PROTECTED]

aidministrator nederland bv - http://www.aidministrator.nl/
prinses julianaplein 14-b, 3817 cs amersfoort, the netherlands
tel. +31-(0)33-4659987 fax. +31-(0)33-4659987

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
