Bayesit plugin and filters executing order
I just upgraded to TB v2 after a long time on v1.x and started checking out the Bayesit plugin. I noticed that some of the messages I have filters on (to move them into separate folders) were being flagged as junk mail.

I have my filters set up in the following order:

1. Multiple filters for mailing lists, known subject lines, etc., which move messages into their respective folders.
2. Second to last, a KNOWN filter that moves all messages from people in my address book into a special folder.
3. Last, a potential-SPAM filter that moves everything that made it this far into a holding folder for later review.

I'm assuming that the Bayesit plugin runs against all incoming mail before any of the filters, and that this is why I am seeing some of my mailing list messages tagged as SPAM. I suppose I could add a bunch of entries to the Bayesit plugin's whitelist, but I hate duplicating what's already in my filters.

Just trying to confirm my suspicions.

~Mike

Current version is 2.04.7 | 'Using TBUDL' information: http://www.silverstones.com/thebat/TBUDLInfo.html
Re: Bayesit plugin and filters executing order
Second follow-up question:

Does the Bayesit plugin base its calculations only on the subject and body of the message?

For example, say I subscribe to a mailing list called [EMAIL PROTECTED] and have filters set up to move these messages into a specific folder (based on the sender's address and/or the fact that [ABC] shows up in the subject line). Occasionally (well, too often on some lists), a spammer sends some junk mail to the list. If I mark these as JUNK in TB, will it potentially have an effect on legit messages from that group? Perhaps by adding the [ABC] from the subject line to the list of words associated with junk mail?

Just curious whether I should ignore spam that comes in via a mailing list, or start flagging it as junk for the Bayesit plugin.

~Mike
Re: Bayesit plugin and filters executing order
Hello bats,

on Tue, 6. Apr 2004 at 10:44:42 -0700 Michael R Kizer wrote:

> Does the Bayesit plugin merely base its calculations on the subject and body of the message?

On the raw content/source of the mail, i.e. all words.

> a spammer sends some junk mail to the list... if I mark these as JUNK in TB, will it potentially have an effect on legit messages from that group? Perhaps adding the [ABC] from the subject line to the list of words associated with junk mail?

It will notice that [ABC] has been used for spam, but significantly less often than for legit mail. So [ABC] would still belong to the Ham group, not Spam.

> Just curious if I should just ignore spam that comes in via a mailing list, or start flagging it as junk for the Bayesit plugin.

No, don't ignore it; you should mark every spam. That way a Bayesian filter will also catch the spam sent to a list (because of other words that belong to the Spam group).

--
shinE!
GnuPG/PGP key: http://thequod.de/danielhahler.asc
lifted with The Bat! 2.05 Beta/14 on Windows XP Service Pack 1.
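To illustrate why a list tag such as [ABC] stays on the Ham side even after a few list spams are marked as junk, here is a minimal sketch of a per-token spam estimate. It is not Bayesit's actual code: the function name, the frequency-ratio formula, and the training counts are all assumptions made up for the example.

```python
from collections import Counter


def token_spam_probability(token, spam_counts, ham_counts, n_spam, n_ham):
    """Estimate how 'spammy' a token is from per-class message frequencies.

    Hypothetical helper, not Bayesit's real scoring function.
    """
    spam_rate = spam_counts[token] / n_spam if n_spam else 0.0
    ham_rate = ham_counts[token] / n_ham if n_ham else 0.0
    if spam_rate + ham_rate == 0:
        return 0.5  # unseen token: no evidence either way
    return spam_rate / (spam_rate + ham_rate)


# Made-up training data: 200 legit list mails and 100 mails marked as junk.
# Each count is the number of messages of that class containing the token.
n_ham, n_spam = 200, 100
ham_counts = Counter({"[ABC]": 200, "meeting": 40})
spam_counts = Counter({"[ABC]": 5, "mortgage": 60, "meeting": 1})

abc_score = token_spam_probability("[ABC]", spam_counts, ham_counts, n_spam, n_ham)
mortgage_score = token_spam_probability("mortgage", spam_counts, ham_counts, n_spam, n_ham)
unseen_score = token_spam_probability("hello", spam_counts, ham_counts, n_spam, n_ham)

print(f"[ABC]:    {abc_score:.3f}")
print(f"mortgage: {mortgage_score:.3f}")
print(f"hello:    {unseen_score:.3f}")
```

With these assumed counts, the five list spams barely move [ABC] away from the Ham end of the scale, while a word seen only in junk scores as strongly spammy. A real filter would also combine the scores of many tokens per message.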
Re: Bayesit plugin and filters executing order
Hello bats,

on Tue, 6. Apr 2004 at 10:24:26 -0700 Michael R Kizer wrote:

> I'm assuming that the Bayesit plugin runs against all incoming mail prior to any of the filters, so that's why I am seeing some of my mailing list messages tagged as SPAM.

Correct.

> I suppose I could add a bunch of entries to the Bayesit plugin's whitelist, but I hate duplicating what's already in my filters.

AFAIK Bayesit is not able (due to the plugin API) to insert headers into the mail (e.g. a spam score). If that were possible, you could untick the option in the Spam plugin config that moves the mail into the Junk folder and filter on those Bayesit headers instead.

Nevertheless, you should re-train Bayesit with the mails it got wrong, and you probably wouldn't notice those mistakes if the mails got filtered correctly anyway. So with the current setup you are more or less forced to re-train, and that's good for Bayesit's learning.

I myself use POPFile, which uses the same (Bayesian) approach but is a lot more useful, as it can have as many buckets as you want. E.g., I have spam, english, german, admin and PGP. Accuracy is 99.62% for 28293 mails, which is awesome.

--
shinE!
GnuPG/PGP key: http://thequod.de/danielhahler.asc
lifted with The Bat! 2.05 Beta/14 on Windows XP Service Pack 1.
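For readers wondering what "as many buckets as you want" means in practice, the following is a rough sketch of a multi-bucket naive Bayes classifier. It is not POPFile's implementation: the class name `BucketClassifier`, the whitespace tokenizer, the add-one smoothing, and the training sentences are all assumptions chosen to keep the example short.

```python
import math
from collections import Counter, defaultdict


class BucketClassifier:
    """Toy naive Bayes classifier with any number of buckets (hypothetical)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # bucket -> word -> count
        self.message_counts = Counter()          # bucket -> trained messages

    def train(self, bucket, text):
        """Add one message to a bucket; re-training is just another call."""
        self.word_counts[bucket].update(text.lower().split())
        self.message_counts[bucket] += 1

    def classify(self, text):
        """Return the bucket with the highest log-probability, or None."""
        words = text.lower().split()
        vocabulary = {w for counts in self.word_counts.values() for w in counts}
        total_messages = sum(self.message_counts.values())
        best_bucket, best_score = None, -math.inf
        for bucket, counts in self.word_counts.items():
            # Log prior plus log likelihoods with add-one (Laplace) smoothing.
            score = math.log(self.message_counts[bucket] / total_messages)
            total_words = sum(counts.values())
            for word in words:
                score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
            if score > best_score:
                best_bucket, best_score = bucket, score
        return best_bucket


# Made-up training messages for three of the buckets mentioned above.
clf = BucketClassifier()
clf.train("spam", "cheap pills buy now cheap offer")
clf.train("english", "the meeting is moved to friday")
clf.train("german", "das treffen ist auf freitag verschoben")

print(clf.classify("buy cheap pills"))
print(clf.classify("treffen am freitag"))
print(clf.classify("meeting on friday"))
```

The only difference from a two-way spam/ham filter is that the scoring loop runs over every bucket instead of two, so adding a bucket such as admin or PGP needs no new code, only training messages for it.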