Hi all,

I'm giving a presentation on Lucene to my local JUG, and I'm looking for a "good" set of documents to use as a demonstration.

Ideally the set would:
1) be large (10,000+ documents?).
2) contain some metadata besides "body" (like author, date, primary key, etc.).
3) be freely available.


I was going to use the data from the previous Google programming contest, but it doesn't seem to be available.

If I can't find anything satisfactory, I'll probably:
- generate a fake white-pages phonebook
- grab documents from Project Gutenberg
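
For the fake phonebook option, something like the rough sketch below might be enough. Everything in it is an assumption made for illustration: the class name FakePhonebook, the Entry fields (id, name, phone, city), and the sample names and cities are all made up, and it deliberately doesn't touch Lucene at all. The idea is that each field would later map onto one indexed field, with "id" standing in for the primary key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class FakePhonebook {
    // Sample values are invented purely for illustration.
    private static final String[] FIRST = {"Alice", "Bob", "Carol", "Dave", "Erin", "Frank"};
    private static final String[] LAST = {"Smith", "Jones", "Nguyen", "Garcia", "Patel", "Kim"};
    private static final String[] CITY = {"Springfield", "Riverton", "Lakeside", "Fairview"};

    /** One phonebook record: a primary key plus a few metadata fields. */
    public static final class Entry {
        public final int id;
        public final String name;
        public final String phone;
        public final String city;

        Entry(int id, String name, String phone, String city) {
            this.id = id;
            this.name = name;
            this.phone = phone;
            this.city = city;
        }
    }

    /** Generates count entries; the same seed always yields the same data. */
    public static List<Entry> generate(int count, long seed) {
        Random rnd = new Random(seed);
        List<Entry> entries = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            String name = FIRST[rnd.nextInt(FIRST.length)] + " " + LAST[rnd.nextInt(LAST.length)];
            // Fictional "555" numbers with a zero-padded four-digit suffix.
            String phone = String.format("555-%04d", rnd.nextInt(10000));
            String city = CITY[rnd.nextInt(CITY.length)];
            entries.add(new Entry(i, name, phone, city));
        }
        return entries;
    }

    public static void main(String[] args) {
        List<Entry> entries = generate(10000, 42L);
        System.out.println(entries.size() + " entries generated");
        Entry first = entries.get(0);
        System.out.println(first.id + "\t" + first.name + "\t" + first.phone + "\t" + first.city);
    }
}
```

A fixed seed keeps the demo repeatable from one run to the next. The obvious downside is that with so few distinct names and cities the data will look very repetitive, which is part of why I'd still prefer a real corpus.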

My preference is for some "real" data, but I'm happy to generate fake data if no one has any better ideas.

:D

=Matt
