https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5845
------- Additional Comments From [EMAIL PROTECTED] 2008-03-04 10:49 -------
Be sure to use mbox rather than individual files for the corpus. Using HDFS with individual files you'd be lucky to get 1/3 the current message throughput.

That is, don't try using the message paths as the input and having the mappers open the individual files from HDFS. As expected, it doesn't work so well. ;)
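For illustration only, here is a minimal sketch of packing a directory of one-message-per-file mail into a single mbox before it goes anywhere near HDFS. It uses Python's standard `mailbox` module. The function name `pack_into_mbox` and the directory layout are hypothetical and not taken from the bug, and this is not necessarily how the corpus in question was built.

```python
import mailbox
from pathlib import Path


def pack_into_mbox(message_dir, mbox_path):
    """Append every regular file in message_dir to one mbox; return the count.

    Assumes each file holds exactly one RFC 822 message and that nothing
    else is writing to mbox_path at the same time.
    """
    box = mailbox.mbox(str(mbox_path), create=True)
    count = 0
    try:
        for path in sorted(Path(message_dir).iterdir()):
            if not path.is_file():
                continue
            # mailbox parses the raw bytes and writes the "From " separator
            # line that marks the start of each message in the mbox.
            box.add(path.read_bytes())
            count += 1
    finally:
        # close() flushes pending writes to disk.
        box.close()
    return count
```

The resulting file can then be copied into HDFS as one large input, so each mapper reads a sequential stream of messages instead of opening one small file per message.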
