Hello everyone,

I'm a new ASSP user - I have set up one production proxy and am preparing to set up another. In the meantime I found that sorting mail into spam and ham is rather tedious, because not everyone can just copy their own mailbox as the initial learning corpus for ASSP - some of us set it up mainly for other people, so it is their mail, not ours, that makes the right corpus.
Plus, I gather that from time to time one has to retrain ASSP for slightly different ham and spam characteristics, so one has to manually sort at least the new spam all over again - is that correct?

Anyway, this little Bash script can make manual mail sorting for the ASSP learning corpus less effort-intensive. Since sorting moves mail into the folders sorted/spam, sorted/notspam, etc., one can stop at any time by pressing Ctrl-C without losing any sorting results.

If one runs ./spamsort --rate-collected-spam before starting manual sorting, the script also displays spam/ham probabilities and Bayesian confidence during sorting. The caveat, however, is that spamsort gets these figures from ASSP by using curl to connect to ASSP's administrative interface. It works, but it's horribly slow - it takes over 1 second to rate one mail! This is because the script opens a new HTTP connection, rates one mail and closes the connection, and then has to do the same again for the next mail. Does anybody have a clue how to make curl use a persistent HTTP connection, so that mail probabilities can be fetched faster? If so, please raise your hand.

I obviously welcome comments, improvements and bug reports.

Link to script: http://www.wbp.krakow.pl/mk/spamsort

Usage:

spamsort v0.2 - Bash script for sorting spam collected by Anti-Spam SMTP Proxy.

Commandline options:

  --spam
  --ham
      Start manual sorting of spam or ham, respectively.

  --rate-collected-spam
      Calculate stats of spam in 'spam' folder of ASSP.
      WARNING: IT'S *VERY* SLOW AT THE MOMENT SINCE IT USES SEPARATE
      CURL CONNECTION TO ASSP FOR RATING EACH MAIL.

  --quiet
      Don't print stats for each mail during rating.

  --delete-high-rated-spam
      Moves highly spam-positive mail to sorted/spam folder.
      WARNING: YOU HAVE TO DO INITIAL ASSP LEARNING IN ORDER TO AVOID
      DELETING MANY FALSE POSITIVES IN SPAM FOLDER.
      USE THIS OPTION AT YOUR OWN RISK!!!

Regards,
Marcin Krol
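
P.S. On the persistent-connection question: curl normally reuses a connection when several transfers to the same host happen within one curl invocation, so one possible approach is to hand all the requests to a single curl run through a config file read with --config, separating requests with "next". Below is a minimal sketch of that idea, not what spamsort currently does. The URL, the form field name "mail", the ASSP_AUTH variable and the output layout are all placeholders I made up for illustration, and it only helps if ASSP's admin server actually keeps the connection open, which I have not verified.

```bash
#!/usr/bin/env bash
# Sketch: build one curl config so a single curl process sends all rating
# requests. ASSUMPTIONS (not from spamsort or the ASSP docs): the URL, the
# form field name "mail", ASSP_AUTH and the output file naming are placeholders.

# Escape backslashes and double quotes for a quoted curl config value.
cfg_escape() {
    local s=$1
    s=${s//\\/\\\\}
    s=${s//\"/\\\"}
    printf '%s' "$s"
}

# Print a curl config with one request stanza per mail file.
# Usage: build_rating_config URL OUTDIR FILE...
build_rating_config() {
    local url=$1 outdir=$2 first=1 f
    shift 2
    for f in "$@"; do
        # "next" starts a new request with fresh options in the same curl run.
        if [ "$first" -eq 1 ]; then first=0; else printf 'next\n'; fi
        printf 'url = "%s"\n' "$(cfg_escape "$url")"
        # Options are reset after "next", so credentials are repeated per stanza.
        if [ -n "${ASSP_AUTH:-}" ]; then
            printf 'user = "%s"\n' "$(cfg_escape "$ASSP_AUTH")"
        fi
        # POST the file contents, URL-encoded, as the (hypothetical) field "mail".
        printf 'data-urlencode = "mail@%s"\n' "$(cfg_escape "$f")"
        printf 'output = "%s/%s.html"\n' "$(cfg_escape "$outdir")" "$(cfg_escape "${f##*/}")"
    done
}

# Example run (placeholders throughout):
#   mkdir -p ratings
#   ASSP_AUTH='admin:PASSWORD' \
#     build_rating_config 'http://localhost:55555/analyze' ratings spam/* \
#     | curl --silent --config -
```

Each response would land in its own file under ratings/, to be parsed afterwards the same way the per-mail output is parsed now. Running curl with --verbose on a small batch should show whether the connection is really being reused.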
_______________________________________________
Assp-user mailing list
https://lists.sourceforge.net/lists/listinfo/assp-user
