Re: Learning only on read emails?
On 19.10.15 17:30, Ryan Coleman wrote:
> I actually get THOUSANDS of emails a day. Most of it is spam. And not caught by SA. And when it is put into the spam folder it is not learned. But, hey, you know… you obviously know me better than me, so why don't you have this back and forth publicly with yourself and keep me out of it.

I would still be careful about training all read mail as ham... there are cases where you don't notice it's spam (some of them are hardly distinguishable), forget to move it to spam, or don't have time to do it...

>> P.S.: no need for reply-all on a mailing list
>
> Habit. Besides, there's no reply-to header rewrite on this mailing list. If I hit reply it goes only to you.

That's why there are list headers, and why some MUAs support them.

> On Oct 19, 2015, at 5:25 PM, Reindl Harald wrote:
>> nonsense - there are list headers and if you use a broken client just remove anything but the list-address
>
> Wow, you really are an asshole, huh? I looked at the headers before I said anything. Broken client? No… Apple Mail. There are lists where it works because it EXISTS IN THE HEADERS.

Not that I like him, but he's right: a mail client that is not capable of handling mailing lists is kind of broken... Lists should not break mail by inserting a Reply-To header, because it's supposed to be inserted by a client, not by a mailing list. And the fact that it's made by Apple doesn't make it a good client. Microsoft and Apple tend to screw things up their own way just because they are huge companies and don't care about compatibility (even backwards) and correctness.

> Speaking of learning spam… your email address will be joining the blacklist very soon.

Just be careful when blacklisting and spam-training...

-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Nothing is fool-proof to a talented fool.
Re: Learning only on read emails?
On 10/20/2015 12:41 AM, Ryan Coleman wrote:
> Actually it makes absolute sense since I dump my spam into a folder to be scanned as spam, and anything that is still in my inbox, and read, is indeed ham.
>
> I just have to re-investigate the ./new and ./cur folders to make sure they will operate how I want. But if the email was delivered to my phone and it moves (but not read) then it's not an option.

cur and new folders work as expected when the IMAP server is Courier, but NOT when you use Dovecot. That is how I have been learning from these two.

br. jarif

> On Oct 19, 2015, at 4:35 PM, Reindl Harald wrote:
>> On 19.10.2015 at 23:21, Ryan Coleman wrote:
>>> OK, so it was established I don't have a ham scan (correct). So how do I do it so that it only scans the read emails in a MAILDIR?
>>
>> That makes no sense. Train a specific ham and a specific spam folder where you move messages you are sure how to classify, not a generic inbox just because you have read a message.
Re: Learning only on read emails?
On Tue, 20 Oct 2015 08:29:27 -0500 Ryan Coleman wrote:
> > On Oct 20, 2015, at 8:21 AM, RW wrote:
> >
> > On Tue, 20 Oct 2015 15:14:42 +0300 Jari Fredriksson wrote:
> >
> >> On 10/20/2015 12:41 AM, Ryan Coleman wrote:
> >>> Actually it makes absolute sense since I dump my spam into a folder to be scanned as spam, and anything that is still in my inbox, and read, is indeed ham.
> >>>
> >>> I just have to re-investigate the ./new and ./cur folders to make sure they will operate how I want. But if the email was delivered to my phone and it moves (but not read) then it's not an option.
> >>
> >> cur and new folders work as expected when the IMAP server is Courier, but NOT when you use Dovecot.
> >
> > How does it not work as expected?
>
> I haven't seen anything appear in the "new" folder, to be honest.

Bear in mind that the "new" directory is there for mail that's been delivered into the maildir folder without going through a mail client. If the mail is delivered there by a POP/IMAP client, or copied/moved between maildir folders, "new" shouldn't be used.

Even when it is used, an IMAP server should move mail from "new" to "cur" immediately after its existence has been reported to a client, and that can be instantaneous if the IMAP client supports IDLE. In my experience Dovecot's MDA does the right thing.

There is a complication, though, in that when Sieve is used to set a flag, the MDA has no choice but to put the message in "cur".
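Since Maildir encodes the Seen flag in the filename (after the ":2," marker, for mail in cur/), the "train only read mail as ham" idea from this thread can be sketched in shell. This is an assumption-laden sketch, not a drop-in script: the Maildir path, the flag-matching pattern, and the bare `sa-learn --ham` invocation all need adapting to the local layout (and, as noted above, flags set at delivery time via Sieve complicate where messages land).

```shell
#!/bin/sh
# Sketch: feed only *read* inbox messages to Bayes ham training.
# Maildir stores IMAP flags in the filename after ":2,"; "S" (Seen)
# marks a read message, and read mail normally sits in cur/.
MAILDIR="${MAILDIR:-$HOME/Maildir}"

# List read messages: files in cur/ whose flag suffix contains S.
select_read() {
    find "$1/cur" -type f -name '*:2,*S*' 2>/dev/null
}

# Train ham on read mail only. Guarded so the sketch is a no-op on
# machines without SpamAssassin installed.
if command -v sa-learn >/dev/null 2>&1; then
    select_read "$MAILDIR" | xargs -r sa-learn --ham
fi
```

A cron job running this, plus an unconditional `sa-learn --spam` pass over the junk folder, matches the workflow described above, with the caveat raised earlier in the thread: unread ham and unnoticed spam in the inbox will be mislabeled.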
Re: Misbehaving HEADER_HOST_IN_BLACKLIST? And no SPF on SA list host?
On Tue, 20 Oct 2015, Amir Caspi wrote:
> On Oct 19, 2015, at 1:16 PM, RW wrote:
>> body   URI_HOST_IN_BLACKLIST    eval:check_uri_host_in_blacklist()
>> header HEADER_HOST_IN_BLACKLIST eval:check_uri_host_listed('BLACK')
>>
>> These appear to be the same thing. The first call is just a shorthand form for the second. I don't see where headers come into it. I think the second rule is probably just a mistake.
>
> So, following up on this... do any of the main devs see the second rule as a problem? It seems to me that a header rule shouldn't be checking URI hosts, but even if so, it absolutely shouldn't be hitting when those hosts aren't even in the headers (per the two spamples I posted).

My default assumption for the behavior of a header eval() rule would be that it only checks message headers. If that's not the case (as you describe) then I'd agree the rule is a problem, especially if it leads to duplicate hits.

Whether that's a bug in the documentation, or a bug in the rules, or a bug in eval(), or a bug in the implementation of check_uri_host_*, I can't really say at this point.

Speculation: if the check_uri_host_* eval()s look only at the URI list regardless of the rule type (i.e. they always behave as if they were in a uri rule), then that needs to be documented clearly (if it isn't documented by more than just an example uri rule) and the rules fixed to remove the duplicate hits. If the intent of the eval()s was to respect the rule type, they're apparently not doing that. I don't have time at the moment to dig around in the code to see what it's doing, and whether it's a documentation/rule issue or an eval() code issue.

> Kevin, John, others?
>
> Obviously this is only causing a few rare FPs, and presumably it would most likely affect this or some other spam-discussion list... but it appears to be a bug, no?
>
> Thanks!
--- Amir

-- 
John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
jhar...@impsec.org    FALaholic #11174    pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
Re: Misbehaving HEADER_HOST_IN_BLACKLIST? And no SPF on SA list host?
On Oct 19, 2015, at 1:16 PM, RW wrote:
> body   URI_HOST_IN_BLACKLIST    eval:check_uri_host_in_blacklist()
> header HEADER_HOST_IN_BLACKLIST eval:check_uri_host_listed('BLACK')
>
> These appear to be the same thing. The first call is just a shorthand form for the second. I don't see where headers come into it. I think the second rule is probably just a mistake.

So, following up on this... do any of the main devs see the second rule as a problem? It seems to me that a header rule shouldn't be checking URI hosts, but even if so, it absolutely shouldn't be hitting when those hosts aren't even in the headers (per the two spamples I posted).

Kevin, John, others?

Obviously this is only causing a few rare FPs, and presumably it would most likely affect this or some other spam-discussion list... but it appears to be a bug, no?

Thanks!

--- Amir
Re: SpamAssassin Rules Regarding Abuse of New Top Level Domains
On Tue, 20 Oct 2015, Rob McEwen wrote:
> On 10/20/2015 12:13 PM, sha...@shanew.net wrote:
>> Unlike Larry (and others) I DO want to block the vast majority of the new tlds, because we see nothing but spam from them (and my users tend toward the more false-positives than false-negatives side of the spectrum). Rather than maintain a list of all the problematic tlds, I'd rather have a blanket block rule with the ability to whitelist the handful that might be legit.
>
> Be careful about doing this for the long term. I think that spammers exploit new TLDs because they know that many anti-spam systems don't account for them correctly at first (and/or maybe they are cheaper at first?). But in the longer term (years down the road) they tend to move on to other ones, while the legit TLDs slowly increase. So this strategy can backfire in the long term. (But, of course, YMMV... and some smaller hosters don't have to be as concerned about a few extra FPs.)

I totally agree. In fact, I assume anything I'm doing right now to successfully block spam could change tomorrow, much less months or years from now. For now, though, I'm seeing almost no legitimate traffic from most of the new ones (I'm thinking of the longer ones especially: .work, .ninja, .site, .science, etc.).

I already have rules that score for these tlds in received or envelope from, but I'm getting tired of making the regular expression longer and longer (in two different places), and I know there's a smarter way. Whether I'm smart enough to implement that smarter way is another matter entirely.

Is there an existing (relatively simple) plugin that behaves similarly that I could crib from?

-- 
Public key #7BBC68D9 at    | Shane Williams
http://pgp.mit.edu/        | System Admin - UT CompSci
All syllogisms contain three lines | sha...@shanew.net
Therefore this is not a syllogism  | www.ischool.utexas.edu/~shanew
Re: SpamAssassin Rules Regarding Abuse of New Top Level Domains
On 10/20/2015 12:13 PM, sha...@shanew.net wrote:
> Unlike Larry (and others) I DO want to block the vast majority of the new tlds, because we see nothing but spam from them (and my users tend toward the more false-positives than false-negatives side of the spectrum). Rather than maintain a list of all the problematic tlds, I'd rather have a blanket block rule with the ability to whitelist the handful that might be legit.

Be careful about doing this for the long term. I think that spammers exploit new TLDs because they know that many anti-spam systems don't account for them correctly at first (and/or maybe they are cheaper at first?). But in the longer term (years down the road) they tend to move on to other ones, while the legit TLDs slowly increase. So this strategy can backfire in the long term. (But, of course, YMMV... and some smaller hosters don't have to be as concerned about a few extra FPs.)

-- 
Rob McEwen
+1 478-475-9032
Re: SpamAssassin Rules Regarding Abuse of New Top Level Domains
If you have 3.4.1 and use sa-update then we add new tlds to a rule file that is then parsed. This does not block those tlds. It lets the engine recognize the urls for further rules. If you have a tld that is missed and you are using 3.4.1 with sa-update, let us know.

Regards,
KAM

On October 14, 2015 3:37:58 PM PDT, sha...@shanew.net wrote:
> On Tue, 13 Oct 2015, Kevin A. McGrail wrote:
>> At the end of the day, if you are having problems with new TLDs, ONE solution is to use something that uses SA 3.4.1 and has sa-update configured so you get updates with said new TLDs.
>
> I think maybe people are confused about how exactly this change helps them get rid of all the spam that's coming from the "new" TLDs.
>
> So, in other words, having just updated to 3.4.1, how does one go from having a list of all the new TLDs that can now be nicely maintained with sa-update to getting rules which actually score against the vast majority of the new TLDs (since most of them seem to be 99.99% spam)?
>
> I had created a local rule before moving to 3.4.1 that looks for new TLDs in the Received, From and EnvelopeFrom headers, but it was obvious that this wasn't going to scale well. Did the new system in 3.4.1 make this easier for me to do, or did it just make it possible for new TLDs to be handed off to RBLs and the like (not that that's not a major win)?
>
> Any elaboration (or a pointer to documentation (not the man page)) would be greatly appreciated.
>
> -- 
> Public key #7BBC68D9 at    | Shane Williams
> http://pgp.mit.edu/        | System Admin - UT CompSci
> All syllogisms contain three lines | sha...@shanew.net
> Therefore this is not a syllogism  | www.ischool.utexas.edu/~shanew
Re: SpamAssassin Rules Regarding Abuse of New Top Level Domains
I've got 3.4.1 installed and sa-update runs regularly.

Unlike Larry (and others) I DO want to block the vast majority of the new tlds, because we see nothing but spam from them (and my users tend toward the more false-positives than false-negatives side of the spectrum). Rather than maintain a list of all the problematic tlds, I'd rather have a blanket block rule with the ability to whitelist the handful that might be legit.

Is anyone doing anything like this (perhaps as a plugin)?

On Tue, 20 Oct 2015, Kevin A. McGrail wrote:
> If you have 3.4.1 and use sa-update then we add new tlds to a rule file that is then parsed. This does not block those tlds. It lets the engine recognize the urls for further rules. If you have a tld that is missed and you are using 3.4.1 with sa-update, let us know.
>
> Regards,
> KAM
>
> On October 14, 2015 3:37:58 PM PDT, sha...@shanew.net wrote:
>> On Tue, 13 Oct 2015, Kevin A. McGrail wrote:
>>> At the end of the day, if you are having problems with new TLDs, ONE solution is to use something that uses SA 3.4.1 and has sa-update configured so you get updates with said new TLDs.
>>
>> I think maybe people are confused about how exactly this change helps them get rid of all the spam that's coming from the "new" TLDs.
>>
>> So, in other words, having just updated to 3.4.1, how does one go from having a list of all the new TLDs that can now be nicely maintained with sa-update to getting rules which actually score against the vast majority of the new TLDs (since most of them seem to be 99.99% spam)?
>>
>> I had created a local rule before moving to 3.4.1 that looks for new TLDs in the Received, From and EnvelopeFrom headers, but it was obvious that this wasn't going to scale well. Did the new system in 3.4.1 make this easier for me to do, or did it just make it possible for new TLDs to be handed off to RBLs and the like (not that that's not a major win)?
>>
>> Any elaboration (or a pointer to documentation (not the man page)) would be greatly appreciated.
-- 
Public key #7BBC68D9 at    | Shane Williams
http://pgp.mit.edu/        | System Admin - UT CompSci
All syllogisms contain three lines | sha...@shanew.net
Therefore this is not a syllogism  | www.ischool.utexas.edu/~shanew
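A "blanket block with a whitelist escape hatch" of the kind asked about above can be approximated with plain local rules, without a plugin. The following is a hypothetical local.cf sketch: the rule names, scores, TLD choices, and the example.ninja exemption are all invented for illustration, not part of any shipped ruleset.

```
# Hypothetical local.cf fragment: penalize senders in abused new TLDs.
header   LOCAL_ABUSED_TLD_FROM    From:addr =~ /\.(?:work|ninja|site|science|date|review)$/i
describe LOCAL_ABUSED_TLD_FROM    From address uses a frequently abused new TLD
score    LOCAL_ABUSED_TLD_FROM    3.0

header   LOCAL_ABUSED_TLD_ENVFROM EnvelopeFrom =~ /\.(?:work|ninja|site|science|date|review)$/i
describe LOCAL_ABUSED_TLD_ENVFROM Envelope sender uses a frequently abused new TLD
score    LOCAL_ABUSED_TLD_ENVFROM 3.0

# "Whitelist the handful that might be legit": cancel the penalty for
# specific known-good domains (example.ninja is a placeholder).
header   __LOCAL_TLD_EXEMPT       From:addr =~ /\@example\.ninja$/i
meta     LOCAL_TLD_EXEMPTED       (LOCAL_ABUSED_TLD_FROM && __LOCAL_TLD_EXEMPT)
score    LOCAL_TLD_EXEMPTED       -3.0
```

The obvious drawback is that the TLD alternation still has to be maintained by hand, in two places, which is why generating it from a list is attractive.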
Re: Misbehaving HEADER_HOST_IN_BLACKLIST? And no SPF on SA list host?
On Tue, 20 Oct 2015 11:58:11 -0700 (PDT) John Hardin wrote:
> On Tue, 20 Oct 2015, Amir Caspi wrote:
>> On Oct 19, 2015, at 1:16 PM, RW wrote:
>>> body   URI_HOST_IN_BLACKLIST    eval:check_uri_host_in_blacklist()
>>> header HEADER_HOST_IN_BLACKLIST eval:check_uri_host_listed('BLACK')
>>>
>>> These appear to be the same thing. The first call is just a shorthand form for the second. I don't see where headers come into it. I think the second rule is probably just a mistake.
>>
>> So, following up on this... do any of the main devs see the second rule as a problem? It seems to me that a header rule shouldn't be checking URI hosts, but even if so, it absolutely shouldn't be hitting when those hosts aren't even in the headers (per the two spamples I posted).
>
> My default assumption for the behavior of a header eval() rule would be that it only checks message headers. If that's not the case (as you describe) then I'd agree the rule is a problem, especially if it leads to duplicate hits.
>
> Whether that's a bug in the documentation, or a bug in the rules, or a bug in eval(), or a bug in the implementation of check_uri_host_*, I can't really say at this point.

https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7256
Re: SpamAssassin Rules Regarding Abuse of New Top Level Domains
On 10/20/2015 10:04 PM, RW wrote:
> On Tue, 20 Oct 2015 13:29:45 -0500 (CDT) sha...@shanew.net wrote:
>> I already have rules that score for these tlds in received or envelope from, but I'm getting tired of making the regular expression longer and longer (in two different places), and I know there's a smarter way. Whether I'm smart enough to implement that smarter way is another matter entirely.
>>
>> Is there an existing (relatively simple) plugin that behaves similarly that I could crib from?
>
> You don't need a plugin, just autogenerate your rules from this:
> http://data.iana.org/TLD/tlds-alpha-by-domain.txt

Or put a choice of wildcarded TLDs in an rbldnsd zone, and use a header check_rbl_envfrom rule for senders and URIBL.pm plugin lookups for URIs.
Re: SpamAssassin Rules Regarding Abuse of New Top Level Domains
On Tue, 20 Oct 2015 13:29:45 -0500 (CDT) sha...@shanew.net wrote:
> I already have rules that score for these tlds in received or envelope from, but I'm getting tired of making the regular expression longer and longer (in two different places), and I know there's a smarter way. Whether I'm smart enough to implement that smarter way is another matter entirely.
>
> Is there an existing (relatively simple) plugin that behaves similarly that I could crib from?

You don't need a plugin, just autogenerate your rules from this:
http://data.iana.org/TLD/tlds-alpha-by-domain.txt
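RW's suggestion of autogenerating rules could look roughly like the following. This is a sketch under stated assumptions: the hard-coded sample TLDs stand in for whatever subset of the IANA list you decide to penalize, the rule name and score are invented, and fetching/filtering http://data.iana.org/TLD/tlds-alpha-by-domain.txt (curl plus a local keep-list) is left out.

```shell
#!/bin/sh
# Sketch: turn a list of TLDs to penalize into a SpamAssassin rule file.
# In real use, $tlds would be derived from the IANA list minus a local
# whitelist; here it is a hard-coded sample.
tlds="work ninja site science"

# Build the regex alternation: work|ninja|site|science
alt=$(printf '%s' "$tlds" | tr ' ' '|')

# Emit a single header rule using the generated alternation.
cat > local_newtld.cf <<EOF
header   LOCAL_NEW_TLD_FROM  From:addr =~ /\.(?:${alt})\$/i
describe LOCAL_NEW_TLD_FROM  Sender address in a frequently abused new TLD
score    LOCAL_NEW_TLD_FROM  2.5
EOF

cat local_newtld.cf
```

Regenerating the file from cron and letting spamd pick it up on restart keeps the growing alternation out of hand-edited config, which addresses the "longer and longer regex in two places" complaint above.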
Re: Learning only on read emails?
On Tue, 20 Oct 2015 15:14:42 +0300 Jari Fredriksson wrote:
> On 10/20/2015 12:41 AM, Ryan Coleman wrote:
>> Actually it makes absolute sense since I dump my spam into a folder to be scanned as spam, and anything that is still in my inbox, and read, is indeed ham.
>>
>> I just have to re-investigate the ./new and ./cur folders to make sure they will operate how I want. But if the email was delivered to my phone and it moves (but not read) then it's not an option.
>
> cur and new folders work as expected when the IMAP server is Courier, but NOT when you use Dovecot.

How does it not work as expected?
Re: Learning only on read emails?
> On Oct 20, 2015, at 8:21 AM, RW wrote:
>
> On Tue, 20 Oct 2015 15:14:42 +0300 Jari Fredriksson wrote:
>
>> On 10/20/2015 12:41 AM, Ryan Coleman wrote:
>>> Actually it makes absolute sense since I dump my spam into a folder to be scanned as spam, and anything that is still in my inbox, and read, is indeed ham.
>>>
>>> I just have to re-investigate the ./new and ./cur folders to make sure they will operate how I want. But if the email was delivered to my phone and it moves (but not read) then it's not an option.
>>
>> cur and new folders work as expected when the IMAP server is Courier, but NOT when you use Dovecot.
>
> How does it not work as expected?

I haven't seen anything appear in the "new" folder, to be honest.