I'm considering setting up a system that will allow non-Unix users to submit false positives and false negatives back to the system so that they can be relearned on a per-user basis (using the SQL Bayes feature of SA 3.0). I expect most users will not remember to train ham, or will not do so consistently, so I'm thinking about setting the system up so that it learns ham automatically but requires the user to train it for spam.
This would be done by having a reasonable ham auto-learn threshold, but a very high (200) spam auto-learn threshold, so that spam is effectively never learned automatically.

Has anybody done anything similar to this? How successful was it for you? (I'm specifically wondering about the effect on SA of auto-learning ham while training spam manually, as well as whether I'm correct about users' willingness to train ham.)

What I'm planning on doing is having the users create two SA rules: one that looks for the BAYES_99 message text in the header, moves those messages into a "positively spam" folder, and then stops processing rules. (At some point they can alter the rule to just move the BAYES_99 stuff straight to their trash folder.) The next rule will look for 'X-Spam-Flag: YES' and move the message into a "possible spam" folder. This is their chance to reinforce the training, or point out mistakes. I may also just increase the score of BAYES_90 to 5.1, or have the second rule detect BAYES_90 in the headers as well.

Thanks,
-ron
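
To make the two-rule idea concrete, here is a rough Python sketch of the sorting logic. It is only an illustration, not how any particular mail client implements its filters. It assumes the rule names appear in the "tests=" part of an X-Spam-Status header, and the function names (spam_tests, choose_folder), the also_bayes_90 switch, and the "INBOX" fallback are made up for the example.

```python
import re
from email import message_from_string


def spam_tests(msg):
    """Return the set of rule names found after 'tests=' in X-Spam-Status.

    Assumes the header looks like
    'Yes, score=9.1 required=5.0 tests=BAYES_99,HTML_MESSAGE autolearn=no'
    and may be folded across lines after a comma.
    """
    status = msg.get("X-Spam-Status", "")
    # Undo header folding inside the comma-separated test list.
    status = re.sub(r",\s+", ",", status)
    match = re.search(r"tests=(\S+)", status)
    if not match:
        return set()
    return set(match.group(1).split(","))


def choose_folder(msg, also_bayes_90=False):
    """Apply the two rules in order and return a folder name.

    Rule 1: BAYES_99 hit -> "positively spam", and stop.
    Rule 2: X-Spam-Flag: YES (optionally also BAYES_90) -> "possible spam".
    Anything else stays in the (hypothetical) "INBOX".
    """
    tests = spam_tests(msg)
    if "BAYES_99" in tests:
        return "positively spam"
    flagged = msg.get("X-Spam-Flag", "").strip().upper() == "YES"
    if flagged or (also_bayes_90 and "BAYES_90" in tests):
        return "possible spam"
    return "INBOX"
```

Calling choose_folder(message_from_string(raw_message)) on a raw message gives the folder the message would land in. The also_bayes_90 switch corresponds to the alternative of having the second rule look for BAYES_90 directly instead of raising its score to 5.1.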
