Take a look at Project Gutenberg: http://www.gutenberg.org/

Igor

On 4/1/06, Pasha Bizhan <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> > From: Marvin Humphrey [mailto:[EMAIL PROTECTED]
>
> > I'm looking for a test corpus to use for some benchmarking
> > and parsing tests. I can whip one up myself, but it would be
> > nice to use something standardized. I'd like something that
> > doesn't require a license/fee, so that other people can run
> > the same tests. At least 1000 docs, a few hundred words
> > each. Any suggestions?
>
> See Corpora section at http://wiki.apache.org/jakarta-lucene/Resources
>
> Pasha Bizhan