Very cool, I've added these to our collections wiki:  
http://cwiki.apache.org/confluence/display/MAHOUT/Collections

On Nov 19, 2009, at 3:31 AM, Robert Muir wrote:

> Hello,
> 
> While doing some work for the open relevance project, I thought that a large
> corpus of categorized documents might be useful test data for mahout.
> 
> Here is one I am working with:
> http://ece.ut.ac.ir/DBRG/Hamshahri/ (Approximately 160k categorized
> docs)
> There is a newer beta version here:
> http://ece.ut.ac.ir/DBRG/Hamshahri/ham2/ (Approximately 320k
> categorized docs)
> 
> -- 
> Robert Muir
> [email protected]
