I'm doing a presentation to my local JUG on Lucene, and I'm looking for a "good" set of documents to use as a demonstration.
Ideally the set would:
1) be large (10,000+ documents?),
2) contain some metadata besides "body" (author, date, primary key, etc.),
3) be freely available.
I was going to use the data from the previous Google programming contest, but it doesn't seem to be available.
If I can't find anything satisfactory, I'll probably:
- generate a fake white-pages phonebook
- grab documents from Project Gutenberg
My preference is for some "real" data, but I'm happy to generate fake data if no-one has any better ideas.
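
For the fake-phonebook fallback, something like the rough sketch below is what I have in mind. It is plain Java with no Lucene calls; the class name, the field names, and the sample name/city lists are all made up for illustration. Each map would become one document, with its keys as the metadata fields.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Random;

public class FakePhonebook {
    // Hypothetical sample values; a real demo would want much longer lists.
    private static final String[] FIRST = {"Alice", "Bob", "Carol", "Dave", "Erin"};
    private static final String[] LAST = {"Smith", "Jones", "Nguyen", "Garcia", "Kim"};
    private static final String[] CITY = {"Springfield", "Riverton", "Lakeside"};

    // Builds `count` fake entries. A fixed seed makes the data repeatable
    // between runs, which is handy for a live demo.
    public static List<Map<String, String>> generate(int count, long seed) {
        Random rnd = new Random(seed);
        List<Map<String, String>> entries = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            Map<String, String> entry = new LinkedHashMap<>();
            entry.put("id", Integer.toString(i)); // primary key
            entry.put("firstName", FIRST[rnd.nextInt(FIRST.length)]);
            entry.put("lastName", LAST[rnd.nextInt(LAST.length)]);
            entry.put("city", CITY[rnd.nextInt(CITY.length)]);
            // 555-xxxx numbers so nothing collides with a real phone number
            entry.put("phone", String.format(Locale.ROOT, "555-%04d", rnd.nextInt(10000)));
            entries.add(entry);
        }
        return entries;
    }

    public static void main(String[] args) {
        // Print a handful of entries to eyeball the output.
        for (Map<String, String> entry : generate(5, 42L)) {
            System.out.println(entry);
        }
    }
}
```

Calling generate(10000, 42L) would give the 10,000 entries mentioned above, but with only a few names to draw from the results will be very repetitive, which is part of why I'd rather have real data.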
:D
=Matt