Take a look at Project Gutenberg: http://www.gutenberg.org/
Igor

On 4/1/06, Pasha Bizhan <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> > From: Marvin Humphrey [mailto:[EMAIL PROTECTED]
>
> > I'm looking for a test corpus to use for some benchmarking
> > and parsing tests.  I can whip one up myself, but it would be
> > nice to use something standardized.  I'd like something that
> > doesn't require a license/fee, so that other people can run
> > the same tests.  At least 1000 docs, a few hundred words
> > each.  Any suggestions?
>
> See Corpora section at http://wiki.apache.org/jakarta-lucene/Resources
>
> Pasha Bizhan
