Re: [Evolution] Built-in spam filtering?
On Thu, 2003-01-02 at 15:42, Not Zed wrote:

It's been suggested before ... You could just use an external app, and link it in the same way the spamassassin stuff is normally linked in. I still think doing it at the server end is the way to go though, otherwise you have to waste time downloading the message anyway.

Great idea, but according to the Kmail handbook (Kmail does have filtering on the server), you still have to download the headers in order for the filters to work. So any message that you keep for download is effectively downloaded twice - first to get the headers for the filter, and then to get the message for your inbox.

-- Bill Hartwell [EMAIL PROTECTED] MacManus Enterprises
Re: [Evolution] Built-in spam filtering?
cheers();

Great idea, but according to the Kmail handbook (Kmail does have filtering on the server), you still have to download the headers in order for the filters to work. So any message that you keep for download is effectively downloaded twice - first to get the headers for the filter, and then to get the message for your inbox.

Nope, the message isn't downloaded twice (or at least doesn't have to be). Even with POP3 there is an (optional) command 'TOP n m' to get the header and the first m lines of mail n. So 'TOP n 0' only gets the header.

...guenther

-- char *t=\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4; main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;il;i++){ i%8? c=1: (c=*++x); c128 (s+=h); if (!(h=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}} ___ evolution maillist - [EMAIL PROTECTED] http://lists.ximian.com/mailman/listinfo/evolution
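For anyone who wants to try the 'TOP n 0' trick, here is a minimal sketch in Python using the standard poplib module. The function names and the blocked-sender check are made up for illustration; only the TOP call itself comes from the message above, and the server has to support that optional command.

```python
import poplib
from email.message import Message
from email.parser import BytesParser


def fetch_headers(conn, msg_num: int) -> Message:
    """Fetch only the header block of message msg_num via 'TOP n 0'."""
    # poplib's top(which, howmuch) sends "TOP which howmuch";
    # howmuch=0 asks for the headers and zero body lines.
    _response, lines, _octets = conn.top(msg_num, 0)
    return BytesParser().parsebytes(b"\r\n".join(lines), headersonly=True)


def wanted(headers: Message, blocked_senders) -> bool:
    """Hypothetical header-only filter: reject mail from blocked senders."""
    sender = (headers.get("From") or "").lower()
    return not any(blocked in sender for blocked in blocked_senders)


# Against a real server (host name and credentials are placeholders):
#   conn = poplib.POP3("mail.example.org")
#   conn.user("me"); conn.pass_("secret")
#   if wanted(fetch_headers(conn, 1), {"spammer@example.com"}):
#       conn.retr(1)      # download the full message only now
#   else:
#       conn.dele(1)      # never download the body at all
```

So only the header travels twice for a message you keep, not the body.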
Re: [Evolution] Built-in spam filtering?
On Sat, 2003-01-04 at 15:14, guenther wrote: cheers(); Great idea, but according to the Kmail handbook (Kmail does have filtering on the server), you still have to download the headers in order for the filters to work. So any message that you keep for download is effectively downloaded twice - first to get the headers for the filter, and then to get the message for your inbox. Nope, the message isn't downloaded twice (or at least hasn't to). Even with POP3 there is an (optional) command 'TOP n m' to get the header and the first m lines of mail n. So 'TOP n 0' only gets the header. Well, that does save some bandwidth, at least. Still, you are getting the headers no matter what. It seems (if I understand right) that the idea here is to make fetching mail work like fetching news...get all the headers, filter them, then do a delete/fetch on the bodies once the headers have been filtered. Is that what you have in mind? -- Bill Hartwell [EMAIL PROTECTED] MacManus Enterprises signature.asc Description: This is a digitally signed message part
Re: [Evolution] Built-in spam filtering?
cheers(); Great idea, but according to the Kmail handbook (Kmail does have filtering on the server), you still have to download the headers in order for the filters to work. So any message that you keep for download is effectively downloaded twice - first to get the headers for the filter, and then to get the message for your inbox. Nope, the message isn't downloaded twice (or at least hasn't to). Even with POP3 there is an (optional) command 'TOP n m' to get the header and the first m lines of mail n. So 'TOP n 0' only gets the header. Well, that does save some bandwidth, at least. Still, you are getting the headers no matter what. It seems (if I understand right) that the idea here is to make fetching mail work like fetching news...get all the headers, filter them, then do a delete/fetch on the bodies once the headers have been filtered. Is that what you have in mind? Yep. But _I_ don't really have that in mind. That was intended as info, cause you wrote 'first to get the headers for the filter' and stated, it 'is effectively downloaded twice'. You only have to get the header twice. If and only if you can filter by header. And I doubt, you can filter SPAM by header... So for an effective SPAM filtering you do have to get all the mail and filter it client-side. ...guenther -- char *t=\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4; main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;il;i++){ i%8? c=1: (c=*++x); c128 (s+=h); if (!(h=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}} ___ evolution maillist - [EMAIL PROTECTED] http://lists.ximian.com/mailman/listinfo/evolution
[Evolution] Built-in spam filtering?
I didn't have time to check the archives thoroughly but I'm somewhat surprised nobody has put a Bayesian spam filter or something like that into Evolution. I'd kill for that. I'd do it myself if I had the time, but I really don't. Anyway, in case this spurs someone to do some work, I did spend some time working on an imap server based bayesian system. The idea was that with imap the folders are all on the server and I can easily create a special spam folder that users can drag and drop spam into, and use their personal folders for the not-spam side of things. My system was rebuilding the databases every once in awhile out of cron but with a built-in system you could do it as-you-go (which would be cool). This was drop-dead simple to use from the user's point of view (my goal was that my wife should be able to use it without my help). The downfall was that I haven't had the time to get the delivery stuff working and integrated into my mail delivery system. Apple's mail client with Jaguar (OSX 10.2) does something more or less like this, but instead of a spam folder there's a this is spam button. And instead of moving probable spam into a special folder it colorizes them or destroys them (at your option). In some ways I like this, but I would kind of like to be able to go in and edit the spam template messages so I think I'd still rather have a spam folder and have colorization or prioritization versus a trash folder as an option. Anyway, if anyone has time to work on something like this I bet a ton of people would love it. I sure would. In fact, I'd pay money if this feature were an add-on ala Exchange connectivity (hint). I'd also pay money for a Windows version of Evolution (hint hint) so I didn't have to switch to Outlook whenever I have to use Windows. I note that I looked into spamassassin, which seems to be the preferred technique using an external filter, and I really dislike its rule-based system. 
Way too many false positives, and a lot of work to set up and maintain too. Spam filtering would be a great integrated feature and doesn't look like it'd be a lot of work to implement. jim ___ evolution maillist - [EMAIL PROTECTED] http://lists.ximian.com/mailman/listinfo/evolution
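To give an idea of how little code the core of such a filter needs, here is a rough sketch in Python of a naive Bayes classifier trained from a "spam" pile and a "not spam" pile, as in the folder scheme described above. This is a simplified textbook variant (token presence with add-one smoothing), not the algorithm of any particular tool; the class and method names are invented for the example.

```python
import math
import re
from collections import Counter

TOKEN_RE = re.compile(r"[A-Za-z0-9$'!-]+")


def tokenize(text):
    """Lower-cased word-like tokens; a deliberately crude tokenizer."""
    return [t.lower() for t in TOKEN_RE.findall(text)]


class BayesFilter:
    def __init__(self):
        # Per-label count of messages containing each token.
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Add one message to the 'spam' or 'ham' database."""
        self.counts[label].update(set(tokenize(text)))
        self.messages[label] += 1

    def spam_probability(self, text):
        """Combine per-token evidence as log-odds, with add-one smoothing."""
        n_spam, n_ham = self.messages["spam"], self.messages["ham"]
        log_odds = math.log((n_spam + 1) / (n_ham + 1))
        for tok in set(tokenize(text)):
            p_spam = (self.counts["spam"][tok] + 1) / (n_spam + 2)
            p_ham = (self.counts["ham"][tok] + 1) / (n_ham + 2)
            log_odds += math.log(p_spam / p_ham)
        # Clamp so exp() cannot overflow on very long messages.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))
```

Calling train() whenever a message is dropped into the spam folder would give the as-you-go behaviour; rebuilding from cron just means calling train() over every folder in one pass.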
Re: [Evolution] Built-in spam filtering?
It's been suggested before ... You could just use an external app, and link it in the same way the spamassassin stuff is normally linked in. I still think doing it at the server end is the way to go though, otherwise you have to waste time downloading the message anyway.

On Fri, 2003-01-03 at 03:42, Jim Frost wrote:

I didn't have time to check the archives thoroughly but I'm somewhat surprised nobody has put a Bayesian spam filter or something like that into Evolution. I'd kill for that. I'd do it myself if I had the time, but I really don't.

Anyway, in case this spurs someone to do some work, I did spend some time working on an imap server based bayesian system. The idea was that with imap the folders are all on the server and I can easily create a special spam folder that users can drag and drop spam into, and use their personal folders for the not-spam side of things. My system was rebuilding the databases every once in awhile out of cron but with a built-in system you could do it as-you-go (which would be cool). This was drop-dead simple to use from the user's point of view (my goal was that my wife should be able to use it without my help). The downfall was that I haven't had the time to get the delivery stuff working and integrated into my mail delivery system.

Apple's mail client with Jaguar (OSX 10.2) does something more or less like this, but instead of a spam folder there's a "this is spam" button. And instead of moving probable spam into a special folder it colorizes them or destroys them (at your option). In some ways I like this, but I would kind of like to be able to go in and edit the spam template messages so I think I'd still rather have a spam folder and have colorization or prioritization versus a trash folder as an option.

Anyway, if anyone has time to work on something like this I bet a ton of people would love it. I sure would. In fact, I'd pay money if this feature were an add-on ala Exchange connectivity (hint).
I'd also pay money for a Windows version of Evolution (hint hint) so I didn't have to switch to Outlook whenever I have to use Windows. I note that I looked into spamassassin, which seems to be the preferred technique using an external filter, and I really dislike its rule-based system. Way too many false positives, and a lot of work to set up and maintain too. Spam filtering would be a great integrated feature and doesn't look like it'd be a lot of work to implement. jim ___ evolution maillist - [EMAIL PROTECTED] http://lists.ximian.com/mailman/listinfo/evolution
Re: [Evolution] Built-in spam filtering? (Spam grading is better)
On Thu, 2003-01-02 at 09:12, Jim Frost wrote:

I note that I looked into spamassassin, which seems to be the preferred technique using an external filter, and I really dislike its rule-based system. Way too many false positives, and a lot of work to set up and maintain too. Spam filtering would be a great integrated feature and doesn't look like it'd be a lot of work to implement.

SpamAssassin + fetchmail + procmail + Evolution is great. One of the big mistakes people make when using spam filtering is to consider it a binary filter: spam or not spam. I have my spam sorted in two categories: marginal and high. I check my marginal folder only weekly, if at all. The high folder I do not check at all. The rare false positives you talk about end up in the marginal folder.

-Arthur ___ evolution maillist - [EMAIL PROTECTED] http://lists.ximian.com/mailman/listinfo/evolution
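The grading idea boils down to two thresholds instead of one. A small sketch in Python, assuming the filter produces a numeric score where higher means more spam-like; the threshold values and folder names here are placeholders, not anyone's actual settings:

```python
def grade(score: float, marginal: float = 5.0, high: float = 10.0) -> str:
    """Sort a message into one of three folders by spam score.

    The thresholds are illustrative assumptions; tune them to your filter.
    """
    if score >= high:
        return "high"       # never reviewed
    if score >= marginal:
        return "marginal"   # skimmed occasionally for false positives
    return "inbox"
```

A false positive then usually costs a delay until the next look at the marginal folder rather than a lost message.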
Re: [Evolution] Built-in spam filtering?
On Thu, 2003-01-02 at 16:46, Jim Frost wrote:
On Thu, 2003-01-02 at 18:13, guenther wrote:

If you don't control the server, get your own. ;)

I do have my own for personal use. Cannot have my own for corporate use, don't have the choice.

What, is there a corporate policy preventing you from running your own (local) email server? How would they even know?

[..]

I want statistical, thanks.

Me too. Bogofilter rocks!

[...]

What pop3 server? Everything I use is imap, which is one reason that it's hard to use a lot of the existing bayesian tools.

Eh? What does the protocol have to do with which tools you use?

[...]

So, running my own server does work with personal mail although, as I said, it's not a straightforward drop-in to put in most of the server based filters. Finding the time to figure out what I need to do has been problematic. And, even with that done, I still have to deal with the corporate spam residing on servers I do not and cannot control.

FWIW, I'll describe the system I have set up to use bogofilter (which, after a month or so of training, has achieved pretty close to 99.9% detection, and I've yet to get a false positive):

* First, I'm running the UW imap server locally (which sucks, as you said, but it's really easy...).
* I use fetchmail to grab mail from my various email sources (like the stupid corporate mandated M$ exchange server).
* I use procmail to filter mail into various folders (like mailing lists, for example), and also to invoke bogofilter for filtering spam.
* I use evolution (mostly) to read my mail, so I've also got some filters set up in evolution that I use to train bogofilter.
* I have also set up a couple of email aliases on the machine in question so that, when I'm not using evolution, I still have a way to train bogofilter on new spams (by forwarding the target email to the given alias).
The only part of this setup that's less than trivially easy is the procmail setup, so I'll explain that here:

Here are the procmail rules that I use (put in ~/.procmailrc):

    # bogospam and bogoham are email aliases I set up for the sole
    # purpose of training bogofilter:

    # Anything sent to the bogospam mail alias just goes into the
    # bogofilter database as spam.
    :0HB
    * ^TO.*bogospam
    | bogofilter -S

    # Anything sent to the bogoham mail alias just goes into the
    # bogofilter database as legit mail.
    :0HB
    * ^TO.*bogoham
    | bogofilter -H

    # Here's where we let bogofilter do its work...
    :0HB
    * ? bogofilter
    {
        # If bogofilter thinks this message is spam, reinforce that
        # conclusion by adding it to the spam database.
        :0HBc
        | bogofilter -s

        # Then file it away in my spam folder, for later perusal and deletion.
        :0
        $AUTOFILED/Spam
    }

    # By default, assume that everything else is not spam (and reinforce
    # the assumption by adding it to the non-spam database).
    :0EHBc
    | bogofilter -n

The evolution filters I use are trivial, although perhaps not obvious:

* First, I defined a couple of labels I can use to label target messages (spam and not spam).
* Then, when I get a spam that wasn't detected (or if I ever get a false positive), I label the message in question appropriately, and re-run the evolution filters on the folder that contains it.
* I have two filters defined (bogofilter spam and bogofilter not spam).
* The spam filter just has two criteria: [label is spam] and [pipe message to shell command bogofilter -S].
* The not spam filter has: [label is not spam] and [pipe message to shell command bogofilter -H] criteria.

Probably 99% of Evolution's users don't run their own servers and would benefit from this kind of thing even if you personally don't, and a hell of a lot of people would prefer not to be screwing around with procmail just to get rid of spam.

That's true. I also think that a lot of people would prefer to not be screwing around with spam at all (I know that's my preference)!
Cheers! -- Brett Johnson [EMAIL PROTECTED] - i n v e n t - ___ evolution maillist - [EMAIL PROTECTED] http://lists.ximian.com/mailman/listinfo/evolution
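For readers who don't speak procmail, the control flow of the recipes above can be modelled in a few lines of Python. This is only a sketch of the decision logic, not something to run in place of procmail: the function name and the returned action strings are made up, the alias check is simplified to a substring test on the To header, and it assumes bogofilter's convention of exiting with status 0 when it judges a message to be spam (which is what the `* ? bogofilter` condition relies on).

```python
from typing import List


def route(to_header: str, bogofilter_exit: int) -> List[str]:
    """Return the actions the recipes would take for one message.

    to_header       -- value of the To: header (simplified stand-in for ^TO)
    bogofilter_exit -- exit status of running bogofilter on the message
    """
    # Training aliases: the pipe recipes have no 'c' flag, so they
    # consume the message and nothing further happens to it.
    if "bogospam" in to_header:
        return ["bogofilter -S"]
    if "bogoham" in to_header:
        return ["bogofilter -H"]

    # '* ? bogofilter' succeeds when the program exits with status 0.
    if bogofilter_exit == 0:
        # 'c' flag: train on a copy, then file the original as spam.
        return ["bogofilter -s", "deliver $AUTOFILED/Spam"]

    # 'E' flag: only reached when the previous recipe did not match.
    # 'c' flag: train on a copy, then continue to normal delivery.
    return ["bogofilter -n", "deliver default"]
```

Note that every message ends up in one database or the other, so the occasional misclassification has to be corrected by hand (through the aliases or the Evolution filters), otherwise the filter reinforces its own mistakes.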
Re: [Evolution] Built-in spam filtering? (Spam grading is better)
On Thu, 2003-01-02 at 18:48, Arthur Britto wrote: On Thu, 2003-01-02 at 09:12, Jim Frost wrote: I note that I looked into spamassassin, which seems to be the preferred technique using an external filter, and I really dislike its rule-based system. Way too many false positives, and a lot of work to set up and maintain too. Spam filtering would be a great integrated feature and doesn't look like it'd be a lot of work to implement. SpamAssassin + fetchmail + procmail + Evolution is great. This may be the case, although I reiterate that I don't like spamassassin because it doesn't do as good a job as statistical filters and takes a lot more work to tune. Still, setting up something like this is not straightforward and has no advantages over having an integrated filter. I mean, I have to learn how to set up and maintain not one software package but four. It's nice that you all have the time to screw around with all that stuff, but I have an actual job I have to do. jim ___ evolution maillist - [EMAIL PROTECTED] http://lists.ximian.com/mailman/listinfo/evolution
Re: [Evolution] Built-in spam filtering?
On Thu, 2003-01-02 at 20:36, Brett Johnson wrote: On Thu, 2003-01-02 at 16:46, Jim Frost wrote: On Thu, 2003-01-02 at 18:13, guenther wrote: If you don't control the server, get your own. ;) I do have my own for personal use. Cannot have my own for corporate use, don't have the choice. What, is there a corporate policy preventing you from running your own (local) email server? How would they even know? I could do that if I were so inclined, but I'd have to suck mail out of their server first on a polling basis ... and I wouldn't get their backup support if I did this, nor the web mail support. I would prefer to keep it on their server rather than maintaining yet another server myself in any case (though if they go to Exchange I may do this just out of self preservation). I really do have better things to do than set up chains of software, which was also why I'd rather have filtering in the client than setting up like three or four other software packages just to do filtering for me. What pop3 server? Everything I use is imap, which is one reason that it's hard to use a lot of the existing bayesian tools. Eh? What does the protocol have to do with which tools you use? All of the statistical filtering tools need source data, and every one I've looked at wants that data local. That works fine if I'm running the filters on the same machine as the server, otherwise it's a pain. But not a pain if it's in the client, which can already download and process mail. The only part of this setup that's less than trivially easy is the procmail setup, so I'll explain that here: Thanks, this could be useful. jim ___ evolution maillist - [EMAIL PROTECTED] http://lists.ximian.com/mailman/listinfo/evolution
Re: [Evolution] Built-in spam filtering? (Spam grading is better)
just to get this thread to stop, since it's going nowhere...

Ximian is considering implementing a bayesian spam filter within Evolution itself. Management is pushing for it to be implemented for Evolution 1.4 but I seriously doubt Michael and I will have the time to do it in so short an amount of time, but that doesn't mean it won't make it into the version *after* 1.4 (or a version shortly after?).

Jeff

On Thu, 2003-01-02 at 21:10, Jim Frost wrote:
On Thu, 2003-01-02 at 18:48, Arthur Britto wrote:
On Thu, 2003-01-02 at 09:12, Jim Frost wrote:

I note that I looked into spamassassin, which seems to be the preferred technique using an external filter, and I really dislike its rule-based system. Way too many false positives, and a lot of work to set up and maintain too. Spam filtering would be a great integrated feature and doesn't look like it'd be a lot of work to implement.

SpamAssassin + fetchmail + procmail + Evolution is great.

This may be the case, although I reiterate that I don't like spamassassin because it doesn't do as good a job as statistical filters and takes a lot more work to tune. Still, setting up something like this is not straightforward and has no advantages over having an integrated filter. I mean, I have to learn how to set up and maintain not one software package but four. It's nice that you all have the time to screw around with all that stuff, but I have an actual job I have to do.

jim ___ evolution maillist - [EMAIL PROTECTED] http://lists.ximian.com/mailman/listinfo/evolution

-- Jeffrey Stedfast Evolution Hacker - Ximian, Inc. [EMAIL PROTECTED] - www.ximian.com ___ evolution maillist - [EMAIL PROTECTED] http://lists.ximian.com/mailman/listinfo/evolution
Re: [Evolution] Built-in spam filtering? (Spam grading is better)
On Thu, 2003-01-02 at 21:46, Jeffrey Stedfast wrote:

just to get this thread to stop, since it's going nowhere... Ximian is considering implementing a bayesian spam filter within Evolution itself. Management is pushing for it to be implemented for Evolution 1.4 but I seriously doubt Michael and I will have the time to do it in so short an amount of time, but that doesn't mean it won't make it into the version *after* 1.4 (or a version shortly after?).

That is terrific to hear. What kind of timeframe is 1.4, and what are the odds that things might get done faster if you got some help with the initial implementation?

I don't know that I have time to help, but I might ... and if my choice is to spend time setting up procmail et al in a one-off versus contributing my time towards a solution that lots of people can use, I'll pick the latter. Though I really hate the idea of going back to C++ :-).

jim ___ evolution maillist - [EMAIL PROTECTED] http://lists.ximian.com/mailman/listinfo/evolution
Re: [Evolution] Built-in spam filtering? (Spam grading is better)
On Thu, 2003-01-02 at 21:56, Jim Frost wrote:
On Thu, 2003-01-02 at 21:46, Jeffrey Stedfast wrote:

just to get this thread to stop, since it's going nowhere... Ximian is considering implementing a bayesian spam filter within Evolution itself. Management is pushing for it to be implemented for Evolution 1.4 but I seriously doubt Michael and I will have the time to do it in so short an amount of time, but that doesn't mean it won't make it into the version *after* 1.4 (or a version shortly after?).

That is terrific to hear. What kind of timeframe is 1.4,

feature freeze is in 2 weeks

and what are the odds that things might get done faster if you got some help with the initial implementation?

given that the timeframe is 2 weeks, I'm thinking that the odds aren't much better since any additional helpers would have to learn the codebase in a jiffy :-)

we also need to finish porting evolution to gnome 2.0 :-)

I don't know that I have time to help, but I might ... and if my choice is to spend time setting up procmail et al in a one-off versus contributing my time towards a solution that lots of people can use, I'll pick the latter. Though I really hate the idea of going back to C++ :-).

it's actually implemented in c, not c++.

Jeff

-- Jeffrey Stedfast Evolution Hacker - Ximian, Inc. [EMAIL PROTECTED] - www.ximian.com ___ evolution maillist - [EMAIL PROTECTED] http://lists.ximian.com/mailman/listinfo/evolution
Re: [Evolution] Built-in spam filtering?
Jim Frost ([EMAIL PROTECTED]) had this to say on 01/02/03 at 19:07:

Of course I did. It has about a 90% success rate and more than a 1% false positive rate and requires me to diligently keep up the rule base. Now, 90% success would be great, but 1% false is a killer. That means I'll see like five or ten falses a day, which means I'll be constantly going through the filtered mail, which defeats the purpose.

I use the latest 2.43 version, and I get much better than 90%, and can't recall the last time I had a false positive. Perhaps your experiences were with earlier versions?

Probably 99% of Evolution's users don't run their own servers and would

I think that's an overstatement, although I will agree to a large majority.

-- PGP Fingerprint: 0AA8 DC47 CB63 AE3F C739 6BF9 9AB4 1EF6 5AA5 BCDF
Member, LEAF Project http://leaf.sourceforge.net
AIM: MikeLeone
Public Key - http://www.mike-leone.com/~turgon/turgon-public-key.asc
Registered Linux user #201348
Re: [Evolution] Built-in spam filtering?
On Thu, 2003-01-02 at 17:42, Not Zed wrote:

It's been suggested before ... You could just use an external app, and link it in the same way the spamassassin stuff is normally linked in. I still think doing it at the server end is the way to go though, otherwise you have to waste time downloading the message anyway.

Two points:

1) Setting up an external program is, at best, a PITA. It's also a LOT slower and managing the databases is very difficult when the mail store is not local. Still, I'll look into this when I have the time because it's better than what I have now.

2) Do you actually control all of the mail servers you connect to? I don't. One of them is controlled by a group that seems to think our best move for the future is to switch to Exchange. Their previous spam filtering system was to delete every bit of mail that matched certain substrings without bothering to tell any of us what those substrings were or to notify us that they deleted it. Clearly not going to be real helpful in putting a good spam filter up.

jim ___ evolution maillist - [EMAIL PROTECTED] http://lists.ximian.com/mailman/listinfo/evolution
Re: [Evolution] Built-in spam filtering?
cheers();

It's been suggested before ... You could just use an external app, and link it in the same way the spamassassin stuff is normally linked in. I still think doing it at the server end is the way to go though, otherwise you have to waste time downloading the message anyway.

Two points:

1) Setting up an external program is, at best, a PITA. It's also a LOT slower and managing the databases is very difficult when the mail store is not local. Still, I'll look into this when I have the time because it's better than what I have now.

2) Do you actually control all of the mail servers you connect to? I don't. One of them is controlled by a group that seems to think our best move for the future is to switch to Exchange. Their previous spam filtering system was to delete every bit of mail that matched certain substrings without bothering to tell any of us what those substrings were or to notify us that they deleted it. Clearly not going to be real helpful in putting a good spam filter up.

If you don't control the server, get your own. ;) Not joking.

I know, you mentioned spamassassin, but have you really considered it? fetchmail, procmail and spamassassin are really powerful. And you never have to wait for new mail to get sucked from the POP3 server.

Setting up an IMAP server on my local machine was only about an rpm install on my Mandrake 9.0 system here. (That is, 400 km away from here, cause I'm still with my family for holiday... ;) For me, it is the perfect solution. Maybe it can be useful for you, too.

(spamassassin has AFAIK even some server based spam detection, not only their rules based.)

...guenther

-- char *t=\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4; main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;il;i++){ i%8? c=1: (c=*++x); c128 (s+=h); if (!(h=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}} ___ evolution maillist - [EMAIL PROTECTED] http://lists.ximian.com/mailman/listinfo/evolution
Re: [Evolution] Built-in spam filtering?
On Thu, 2003-01-02 at 18:13, guenther wrote: If you don't control the server, get your own. ;) I do have my own for personal use. Cannot have my own for corporate use, don't have the choice. I know, you mentioned spamassassin, but have you really considered it? Of course I did. It has about a 90% success rate and more than a 1% false positive rate and requires me to diligently keep up the rule base. Now, 90% success would be great, but 1% false is a killer. That means I'll see like five or ten falses a day, which means I'll be constantly going through the filtered mail, which defeats the purpose. Statistical techniques are exceeding 99% accuracy with false positives of 0.1% or less, and maintenance is a matter of stuffing new spam into the database. I want statistical, thanks. And you never have to wait for new mail to get sucked from the POP3 server. What pop3 server? Everything I use is imap, which is one reason that it's hard to use a lot of the existing bayesian tools. Setting up an IMAP server on my local machine was only about an rpm install on my Mandrake 9.0 system here. (That is, 400 km away from here, cause I'm still with my family for holiday... ;) For me, it is the perfect solution. Maybe it can be useful for you, too. I have been running my own imap server since 1997. Started with Cyrus (which was great), then UW imapd since Cyrus didn't coexist well with Red Hat 6 (UW imapd sucks sucks sucks and yet is the standard on Linux systems) and these days I'm running courier imap on BSD (which is really great). So, running my own server does work with personal mail although, as I said, it's not a straightforward drop-in to put in most of the server based filters. Finding the time to figure out what I need to do has been problematic. And, even with that done, I still have to deal with the corporate spam residing on servers I do not and cannot control. 
Probably 99% of Evolution's users don't run their own servers and would benefit from this kind of thing even if you personally don't, and a hell of a lot of people would prefer not to be screwing around with procmail just to get rid of spam. (spamassassin has AFAIK even some server based spam detection, not only their rules based.) Yea, it does, but that's like using a nuke to kill rodents. jim ___ evolution maillist - [EMAIL PROTECTED] http://lists.ximian.com/mailman/listinfo/evolution
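The "1% false is a killer" arithmetic is easy to check. A small Python sketch, under the assumption that the false positive rate is measured against legitimate mail; the daily volumes of 500 and 1000 messages are not stated in the thread, they are simply what "five or ten falses a day" at 1% would imply:

```python
def false_positives_per_day(legit_per_day: float, fp_rate: float) -> float:
    """Expected number of legitimate messages misfiled as spam each day."""
    return legit_per_day * fp_rate


# Assumed volumes (not from the thread): 500 and 1000 legit messages a day.
for volume in (500, 1000):
    rule_based = false_positives_per_day(volume, 0.01)    # "more than a 1%"
    statistical = false_positives_per_day(volume, 0.001)  # "0.1% or less"
    print(f"{volume}/day: {rule_based:g} vs {statistical:g} false positives")
```

At those volumes a tenfold lower rate is the difference between checking the spam folder every day and checking it about once a day or two for a single message.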