I'm using Postfix 2.4.6, Amavisd-new 2.5.2, ClamAV 0.91.2 and Mail-SpamAssassin 3.2.3 in a Linux mail filter. I'm having problems conveniently getting enough ham and spam for Bayes training. I'm aware that Bayes is more closely related to SA than Amavisd, but please humor me before sending me off to the SA forums :)
I am currently using the Postfix always_bcc function to copy each email coming through the system to postmaster. From postmaster's mailbox, I manually classify and copy each email into separate spam-#### or ham-#### files. The problem is that this alters the recipient and adds a number of X-Amavis headers that could affect Bayes accuracy. It seems to me that it would be better if Amavisd could just make an unaltered copy of every email it processes and place each copy in a separate disk file. From that point, it should be fairly easy to write a script that would allow postmaster to quickly review and classify the files. The script would then assign each file an appropriate spam or ham filename. That would take a lot of the effort out of building a corpus. Any thoughts on that suggestion?

Thanks!
Ken Morley
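To make the review script idea concrete, here is a rough sketch in Python. It assumes the unaltered copies already sit one message per file in a directory; how Amavisd would write them there is not covered. The directory names `incoming` and `corpus`, and the function names, are made up for illustration.

```python
import os
import re
import shutil


def next_name(corpus_dir, label):
    """Return the next unused name of the form label-NNNN in corpus_dir."""
    pattern = re.compile(r"^%s-(\d{4,})$" % re.escape(label))
    highest = 0
    for name in os.listdir(corpus_dir):
        match = pattern.match(name)
        if match:
            highest = max(highest, int(match.group(1)))
    return "%s-%04d" % (label, highest + 1)


def classify(path, corpus_dir, label):
    """Move one message file into corpus_dir under a spam-#### or ham-#### name."""
    if label not in ("spam", "ham"):
        raise ValueError("label must be 'spam' or 'ham'")
    os.makedirs(corpus_dir, exist_ok=True)
    dest = os.path.join(corpus_dir, next_name(corpus_dir, label))
    shutil.move(path, dest)
    return dest


def review(incoming_dir, corpus_dir, preview_lines=20):
    """Show the top of each message and ask postmaster how to file it."""
    for name in sorted(os.listdir(incoming_dir)):
        path = os.path.join(incoming_dir, name)
        if not os.path.isfile(path):
            continue
        # Print only the first few lines so review stays quick.
        with open(path, "r", errors="replace") as handle:
            for number, line in enumerate(handle):
                if number >= preview_lines:
                    break
                print(line.rstrip("\n"))
        answer = input("[s]pam, [h]am, anything else skips: ").strip().lower()
        if answer == "s":
            print("filed as", classify(path, corpus_dir, "spam"))
        elif answer == "h":
            print("filed as", classify(path, corpus_dir, "ham"))


if __name__ == "__main__":
    # Hypothetical directory names; adjust to wherever the copies are stored.
    review("incoming", "corpus")
```

The files are moved rather than rewritten, so their headers stay exactly as they were stored. The resulting spam-#### and ham-#### files could then be passed to sa-learn with its --spam and --ham options.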
