https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5845





------- Additional Comments From [EMAIL PROTECTED]  2008-03-04 10:49 -------
Be sure to use mbox rather than individual files for the corpus.  With
individual files on HDFS, you'd be lucky to get 1/3 of the current message
throughput.

That is, don't try using the message paths as the input and having the mappers
open the individual files from HDFS.  As expected, it doesn't work so well. ;)
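
For anyone starting from a directory of one-message-per-file, here is a minimal
sketch of packing it into a single mbox before loading it into HDFS.  It is not
taken from the bug's patches: pack_into_mbox and its arguments are hypothetical
names, it uses only Python's standard mailbox module, and the copy into HDFS is
left out.

```python
import mailbox
from pathlib import Path


def pack_into_mbox(message_dir, mbox_path):
    """Append every regular file in message_dir to one mbox file.

    Hypothetical helper for illustration only. mbox_path should live
    outside message_dir so the output is not picked up as an input.
    Returns the number of messages written.
    """
    # Collect the inputs first, in a stable order.
    paths = sorted(p for p in Path(message_dir).iterdir() if p.is_file())

    box = mailbox.mbox(mbox_path)  # the file is created if it is missing
    count = 0
    try:
        for path in paths:
            # add() accepts raw bytes and writes the "From " separator line
            # itself when the message does not already start with one.
            box.add(path.read_bytes())
            count += 1
    finally:
        box.close()  # flushes pending writes to disk
    return count
```

The resulting file is one large sequential input, so the mappers read messages
from it instead of each opening many small HDFS files.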



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.