I'm using Postfix 2.4.6, Amavisd-new 2.5.2, ClamAV 0.91.2 and
Mail-SpamAssassin 3.2.3 in a Linux mail filter.  I'm having problems
conveniently getting enough ham and spam for Bayes training.  I'm aware
that Bayes is more closely related to SA than Amavisd, but please humor
me before sending me off to the SA forums :)

I am currently using the Postfix always_bcc function to copy each email
coming through the system to postmaster.  From postmaster's mailbox, I
manually classify and copy each email into separate spam-#### or
ham-#### files.  The problem is that this alters the recipient and adds
a number of X-Amavis headers that could affect Bayes accuracy.
 
It seems to me that it would be better if Amavisd could just make an
unaltered copy of every e-mail it processes and place each one in a
separate disk file.  From that point, it should be fairly easy to write a
script that would allow postmaster to quickly review and classify the files.
Then, the script would assign the files an appropriate spam or ham
filename.  That would take a lot of effort out of building a corpus.
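
To make the idea concrete, here is a rough sketch in Python of the kind
of review script I have in mind.  It assumes something has already
written one raw message per file into an "incoming" directory; that
directory, the corpus directory, and the function names (next_name,
classify, preview, review) are all made up for illustration and are not
part of Amavisd or SpamAssassin.

```python
import os
import shutil


def next_name(corpus_dir, label):
    """Return the next unused name such as 'spam-0007' in corpus_dir."""
    prefix = label + "-"
    used = []
    for name in os.listdir(corpus_dir):
        suffix = name[len(prefix):]
        if name.startswith(prefix) and suffix.isdigit():
            used.append(int(suffix))
    return "%s%04d" % (prefix, max(used, default=0) + 1)


def classify(path, label, corpus_dir):
    """Move one raw message file into the corpus as spam-#### or ham-####."""
    if label not in ("spam", "ham"):
        raise ValueError("label must be 'spam' or 'ham'")
    os.makedirs(corpus_dir, exist_ok=True)
    dest = os.path.join(corpus_dir, next_name(corpus_dir, label))
    shutil.move(path, dest)  # the message content itself is not modified
    return dest


def preview(path, lines=30):
    """Return the first few lines of a message so it can be judged quickly."""
    with open(path, "r", errors="replace") as f:
        return "".join(line for _, line in zip(range(lines), f))


def review(incoming_dir, corpus_dir):
    """Show each unclassified message and ask postmaster how to file it."""
    for name in sorted(os.listdir(incoming_dir)):
        path = os.path.join(incoming_dir, name)
        if not os.path.isfile(path):
            continue
        print(preview(path))
        answer = input("[s]pam, [h]am, or Enter to skip: ").strip().lower()
        if answer in ("s", "h"):
            label = "spam" if answer == "s" else "ham"
            print("->", classify(path, label, corpus_dir))


# Hypothetical usage; both paths are placeholders:
# review("/var/tmp/unclassified", "/var/tmp/corpus")
```

Skipped messages stay in the incoming directory, so a review session can
be interrupted and resumed later.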
 
Any thoughts on that suggestion?
 
Thanks!
 
Ken Morley
 

