Re: Malformed spam email gets through.
> Also, can anyone suggest a nicely written rule, that triggers when an html
> tag's text contains both upper and lower case letters? Thanks. - Mark

Hi Mark, and happy new year!

For short tags a simple rule, ugly but very cheap, may work: /Src|sRc|srC|... and so on. The number of alternatives grows as 2 to the power of the number of letters, so it is not useful for long tags, but it is cheap in terms of regex steps.

A more elaborate regex... The following rules are far from perfect, but they can detect "something that looks like" mixed upper and lower case HTML tags in the pristine body.

    full     __MIXED_UPLOCASE_SRC   /(?=(?i:src))(?!src|SRC)...\s*=/
    tflags   __MIXED_UPLOCASE_SRC   multiple maxhits=2
    full     __MIXED_UPLOCASE_HREF  /(?=(?i:href))(?!href|HREF)....\s*=/
    tflags   __MIXED_UPLOCASE_HREF  multiple maxhits=2
    meta     MIX_UPLOCASE_HTAGS     __MIXED_UPLOCASE_SRC >1 && __MIXED_UPLOCASE_HREF >1
    describe MIX_UPLOCASE_HTAGS     MIX OF UPPER AND lower LETTERS in HTML TAGS
    score    MIX_UPLOCASE_HTAGS     1

You can also check for invalid Base64 characters and invalid Base64 line length... if all of them match... "Hasta luego Lucas" or, as Rupert Gallagher says: easter eggs... :-)

Hope they help you...

-PedroD
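The lookahead trick in the rules above can be tried outside SpamAssassin. This is only a sketch in Python, not a rule: the helper name `mixed_case_attr` and the sample HTML are made up for illustration. It asserts the attribute name case-insensitively, rejects the all-lowercase and all-uppercase spellings, then consumes as many characters as the name has letters before the `=`.

```python
import re

def mixed_case_attr(name: str) -> re.Pattern:
    """Match `name` in any mixed letter case (not all-lower, not all-upper) before '='."""
    lower = re.escape(name.lower())
    upper = re.escape(name.upper())
    # (?=(?i:name))  the next characters spell the name, ignoring case
    # (?!lower|UPPER) but not in one of the two uniform spellings
    # .{n}\s*=        consume the n letters, optional whitespace, then '='
    return re.compile(
        rf"(?=(?i:{re.escape(name)}))(?!{lower}|{upper}).{{{len(name)}}}\s*="
    )

SRC = mixed_case_attr("src")
HREF = mixed_case_attr("href")

# Hypothetical sample input.
sample = '<img sRc="http://example.invalid/a.png"><a HrEf = "http://example.invalid/">x</a>'

print(bool(SRC.search(sample)))
print(bool(HREF.search(sample)))
print(bool(SRC.search('<img src="a.png">')))
```

Like the rules it mirrors, the pattern has no word boundary, so it would also fire on a longer word that merely ends in a mixed-case "src" or "href" followed by `=`.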
Re: Malformed spam email gets through.
On 1 Jan 2018, at 9:59 (-0500), David Jones wrote: I think some mail systems will keep the same message-ID per email thread so your system must reject some replies. I have not seen such behavior in the past 20 years... Intentionally re-using another site's MIDs is so wrong that I'd happily make it break hard. HOWEVER, the idea of enforcing any standard on MIDs beyond gross format (e.g.: <[[:ascii:]]{3,996}>) on a system where the admin isn't the sole user is ludicrous. -- Bill Cole b...@scconsult.com or billc...@apache.org (AKA @grumpybozo and many *@billmail.scconsult.com addresses) Currently Seeking Steady Work: https://linkedin.com/in/billcole
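The "gross format" check mentioned above is written with a POSIX character class, which Python's `re` module does not support. A rough Python equivalent, as a sketch only: the function name `looks_like_mid` is hypothetical, and anchoring the pattern to the whole header value is an assumption that the original expression does not state.

```python
import re

# <[[:ascii:]]{3,996}> rewritten with an explicit ASCII range.
GROSS_MID = re.compile(r"<[\x00-\x7f]{3,996}>")

def looks_like_mid(value: str) -> bool:
    """True if the whole value is '<' + 3..996 ASCII characters + '>'."""
    return GROSS_MID.fullmatch(value.strip()) is not None

print(looks_like_mid("<abc@example.com>"))
print(looks_like_mid("<ab>"))
```

Note that this deliberately says nothing about '@', dots, or domains; it only bounds the length and the character set.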
Re: Malformed spam email gets through.
On 1 Jan 2018, at 10:33 (-0500), David Jones wrote:
> On 01/01/2018 09:29 AM, Bill Cole wrote:
> > On 1 Jan 2018, at 9:59 (-0500), David Jones wrote:
> > > I think some mail systems will keep the same message-ID per email
> > > thread so your system must reject some replies.
> >
> > I have not seen such behavior in the past 20 years...
>
> Ok. I stand corrected then. What about bounces? Don't they
> intentionally keep all of the same headers with an empty envelope-from?

Nope. A modern standard 'bounce' message is a MIME entity with a special type, denoted by a header somewhat like this:

    Content-Type: multipart/report; report-type=delivery-status;
        boundary="blah.foo.bar-baz/example.com"

It should have a unique MID, a Date header reflecting the time of the bounce, a Subject header like "Undelivered Mail Returned to Sender", a To header with the original message's envelope sender, a From header clearly identifying the last MTA to hold the message and its non-human nature, such as 'mailer-dae...@example.com (Mail Delivery System)', and Received headers reflecting only the transit from that MTA to the target of the bounce.

One PART of a bounce is a message/rfc822 entity which has at least the headers of the original message and usually some or all of the body.

-- 
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Currently Seeking Steady Work: https://linkedin.com/in/billcole
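To make the bounce structure described above concrete, here is a small sketch using Python's standard `email` package. The sample message, its addresses, and the helper names are all invented for illustration, and the sample omits the message/delivery-status part that a real report would also carry. It shows that the bounce has its own Message-ID while the original one survives only inside the message/rfc822 part.

```python
from email import message_from_string
from email.message import Message

# Hypothetical, heavily trimmed bounce for illustration only.
RAW = """\
From: mailer-daemon@example.com (Mail Delivery System)
To: sender@example.org
Subject: Undelivered Mail Returned to Sender
Message-ID: <bounce-1@example.com>
MIME-Version: 1.0
Content-Type: multipart/report; report-type=delivery-status;
 boundary="b1"

--b1
Content-Type: text/plain

Delivery failed.
--b1
Content-Type: message/rfc822

Message-ID: <original-1@example.org>
Subject: hello

body
--b1--
"""

def is_delivery_report(msg: Message) -> bool:
    """True for multipart/report with report-type=delivery-status."""
    return (
        msg.get_content_type() == "multipart/report"
        and str(msg.get_param("report-type", "")).lower() == "delivery-status"
    )

def original_message_id(msg: Message):
    """Message-ID of the embedded original message, or None if absent."""
    for part in msg.walk():
        if part.get_content_type() == "message/rfc822":
            payload = part.get_payload()
            if isinstance(payload, list) and payload:
                return payload[0]["Message-ID"]
    return None

bounce = message_from_string(RAW)
print(is_delivery_report(bounce))
print(bounce["Message-ID"], original_message_id(bounce))
```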
Question about BAYES_999
I just had a spam message hit BAYES_999 but not BAYES_99. Based on BAYES_999 default score of 0.2, I thought that it was always supposed to complement the BAYES_99 rule and both would trigger when BAYES_999 hit. https://pastebin.com/QsVgXwdC If they are independent, then it would seem logical to bump up the default score higher than BAYES_99. -- David Jones
Re: Malformed spam email gets through.
On 01/01/2018 01:30 PM, Alan Hodgson wrote:
> I've had good success junking anything with one of my domains in the
> message-id, where I know the mail isn't actually from someone in that
> domain. That's a pretty solid spam signature.

Are you sure it's not your mail servers adding Message-Id to the incoming mail?

On 01.01.18 14:01, David Jones wrote:
> I too have seen spam with my own domain in the Message-ID but I combined
> it with a meta rule of !ALL_TRUSTED to be safe. You are correct. This is
> a good indicator of spam but each person is going to have to create this
> local rule unless someone wants to write a plugin that can detect this
> dynamically.

I've had problems with a similar rule when I send mail directly from one of my mail servers. I had to replace it with !ALL_TRUSTED && !NO_RELAYS.

Just FYI.

-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
A day without sunshine is like, night.
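The logic of the combined check discussed above can be written out as a short sketch. This is Python, not a SpamAssassin rule; the function names and the example domain are hypothetical, and the two boolean parameters merely stand in for whether the ALL_TRUSTED and NO_RELAYS rules hit on the message.

```python
def mid_domain(message_id: str) -> str:
    """Lower-cased right-hand side of a Message-ID, or '' if there is no '@'."""
    inner = message_id.strip().strip("<>")
    _, sep, right = inner.rpartition("@")
    return right.lower() if sep else ""

def forged_local_mid(message_id, local_domains, all_trusted, no_relays) -> bool:
    """Local domain in the Message-ID, but the mail did not originate locally."""
    domain = mid_domain(message_id)
    uses_local = any(domain == d or domain.endswith("." + d) for d in local_domains)
    # Mirrors the meta condition: !ALL_TRUSTED && !NO_RELAYS
    return uses_local and not all_trusted and not no_relays

LOCAL = {"example.com"}  # hypothetical local domain
print(forged_local_mid("<x1@mail.example.com>", LOCAL, False, False))
print(forged_local_mid("<x1@mail.example.com>", LOCAL, False, True))
```

The second call models mail submitted directly on the server with no relays at all, which is the case that caused the false hits described above.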
Re: Question about BAYES_999
On 01/01/2018 06:52 PM, David Jones wrote: On 01/01/2018 06:47 PM, Reindl Harald wrote: Am 02.01.2018 um 01:18 schrieb David Jones: I just had a spam message hit BAYES_999 but not BAYES_99. Based on BAYES_999 default score of 0.2, I thought that it was always supposed to complement the BAYES_99 rule and both would trigger when BAYES_999 hit. https://pastebin.com/QsVgXwdC If they are independent, then it would seem logical to bump up the default score higher than BAYES_99 never ever seen that and since bayes is based on a number between 0 and 1 this should be technically impossible at all with BAYES_00 that message has [score: 0.0003] I checked my logs and I am seeing both together when BAYES_999 hits except for a few times. Is this a bug? Should I open a bug issue? I am not sure how to reproduce the problem unless others also see the same thing with that message. Sorry. Not thinking clearly. Others would have to have the same Bayes DB to get that message to do the same thing. I was able to reproduce the same results on another SA platform running MailScanner using the same Bayes DB in redis. If others could check their mail logs to see if they are hitting BAYES_999 without BAYES_99 on the same message, please let me know. -- David Jones
Re: Question about BAYES_999
On 01/01/2018 06:47 PM, Reindl Harald wrote: Am 02.01.2018 um 01:18 schrieb David Jones: I just had a spam message hit BAYES_999 but not BAYES_99. Based on BAYES_999 default score of 0.2, I thought that it was always supposed to complement the BAYES_99 rule and both would trigger when BAYES_999 hit. https://pastebin.com/QsVgXwdC If they are independent, then it would seem logical to bump up the default score higher than BAYES_99 never ever seen that and since bayes is based on a number between 0 and 1 this should be technically impossible at all with BAYES_00 that message has [score: 0.0003] I checked my logs and I am seeing both together when BAYES_999 hits except for a few times. Is this a bug? Should I open a bug issue? I am not sure how to reproduce the problem unless others also see the same thing with that message. -- David Jones
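Whether BAYES_999 can hit without BAYES_99 depends on how the two probability ranges are laid out, which the messages above do not quote. The sketch below compares two layouts; both sets of bounds are assumptions for illustration only, not the definitions shipped in any particular SpamAssassin release, so check the local rule files before drawing conclusions.

```python
def hits(p, ranges):
    """Names of the ranges containing probability p (upper bound exclusive, except 1.0)."""
    found = []
    for name, (lo, hi) in ranges.items():
        if lo <= p < hi or (hi == 1.0 and p == 1.0):
            found.append(name)
    return sorted(found)

# Assumed layout A: both rules are lower bounds on the same probability.
NESTED = {"BAYES_99": (0.99, 1.0), "BAYES_999": (0.999, 1.0)}
# Assumed layout B: BAYES_99 stops where BAYES_999 starts.
DISJOINT = {"BAYES_99": (0.99, 0.999), "BAYES_999": (0.999, 1.0)}

print(hits(0.9995, NESTED))
print(hits(0.9995, DISJOINT))
```

Under layout A every BAYES_999 hit is also a BAYES_99 hit, which matches the "technically impossible" argument; under layout B the two rules never fire together, so a low BAYES_999 score would no longer be a supplement to BAYES_99.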
Re: Malformed spam email gets through.
On 1 Jan 2018, at 12:47 (-0500), Matus UHLAR - fantomas wrote:
> > On 1 Jan 2018, at 11:41 (-0500), Matus UHLAR - fantomas wrote:
> > > the gross format in RFCs 822,2822 and 5322 describes message-id
> > > consisting of local and domain part, thus it must contain "@".
>
> On 01.01.18 12:17, Bill Cole wrote:
> > No, it does not. Re-read the cited sections. From RFC5322, the ABNF
> > definition:
> >
> >    msg-id = [CFWS] "<" id-left "@" id-right ">" [CFWS]
>
> this is the part that says message-id must consist of local and domain
> parts. It just says it implicitly, not explicitly, but:
>
> It's not possible to construct Message-Id without the "@" while
> conforming to any of mentioned RFCs.

True, but one could just as easily split up a UUID with '@' instead of '-' and comply while being as sure of uniqueness as could ever matter. Or put full UUIDs on both sides of the '@'. If a V1 UUID is on the right, it is even a host-unique identifier after a fashion.

> > Also note that if you demand that MIDs contain '@' with conforming
> > strings on both sides, you risk losing mail that users want. This is a
> > mistake I have made.
>
> what exactly was the problem? Message-Id without the "@" or the
> non-conforming parts there?

Missing '@'

Some messages lacking it were generated by antique systems that had proven themselves resistant to evolutionary pressures.

-- 
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Currently Seeking Steady Work: https://linkedin.com/in/billcole
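The two UUID-based constructions mentioned above are easy to sketch with Python's standard `uuid` module. The function names are hypothetical. Both results fit the msg-id shape because hex digits and '-' are ordinary atom characters, yet neither side names a host or domain.

```python
import uuid

def mid_from_split_uuid() -> str:
    """One UUID with its first '-' replaced by '@'."""
    return "<" + str(uuid.uuid4()).replace("-", "@", 1) + ">"

def mid_from_two_uuids() -> str:
    """A full UUID on each side of the '@'."""
    return f"<{uuid.uuid4().hex}@{uuid.uuid4().hex}>"

print(mid_from_split_uuid())
print(mid_from_two_uuids())
```

The version 1 variant mentioned above would use `uuid.uuid1()` on the right-hand side, which embeds a node identifier and a timestamp.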
Re: Malformed spam email gets through.
On Mon, 2018-01-01 at 10:29 -0500, Bill Cole wrote:
> On 1 Jan 2018, at 9:59 (-0500), David Jones wrote:
>
> > I think some mail systems will keep the same message-ID per email
> > thread so your system must reject some replies.
>
> I have not seen such behavior in the past 20 years...
>
> Intentionally re-using another site's MIDs is so wrong that I'd happily
> make it break hard.
>
> HOWEVER, the idea of enforcing any standard on MIDs beyond gross format
> (e.g.: <[[:ascii:]]{3,996}>) on a system where the admin isn't the sole
> user is ludicrous.

I've had good success junking anything with one of my domains in the message-id, where I know the mail isn't actually from someone in that domain. That's a pretty solid spam signature.

Lack of any message-id is also significant, but sadly there are still some real senders sending mail with no message-id.
Re: Malformed spam email gets through.
On 01/01/2018 01:30 PM, Alan Hodgson wrote: On Mon, 2018-01-01 at 10:29 -0500, Bill Cole wrote: On 1 Jan 2018, at 9:59 (-0500), David Jones wrote: I think some mail systems will keep the same message-ID per email thread so your system must reject some replies. I have not seen such behavior in the past 20 years... Intentionally re-using another site's MIDs is so wrong that I'd happily make it break hard. HOWEVER, the idea of enforcing any standard on MIDs beyond gross format (e.g.: <[[:ascii:]]{3,996}>) on a system where the admin isn't the sole user is ludicrous. I've had good success junking anything with one of my domains in the message-id, where I know the mail isn't actually from someone in that domain. That's a pretty solid spam signature. I too have seen spam with my own domain in the Message-ID but I combined it with a meta rule of !ALL_TRUSTED to be safe. You are correct. This is a good indicator of spam but each person is going to have to create this local rule unless someone wants to write a plugin that can detect this dynamically. Lack of any message-id is also significant, but sadly there are still some real senders sending mail with no message-id. -- David Jones
Re: Malformed spam email gets through.
On 1 Jan 2018, at 14:30 (-0500), Alan Hodgson wrote: On Mon, 2018-01-01 at 10:29 -0500, Bill Cole wrote: [...] HOWEVER, the idea of enforcing any standard on MIDs beyond gross format (e.g.: <[[:ascii:]]{3,996}>) on a system where the admin isn't the sole user is ludicrous. I've had good success junking anything with one of my domains in the message-id, where I know the mail isn't actually from someone in that domain. That's a pretty solid spam signature. Yes, I was a bit imprecise. Very specific idiosyncratic MID patterns can be extremely accurate spam indicators. Enforcement of RFC or common practice "standards" is riskier than it is worth. Lack of any message-id is also significant, but sadly there are still some real senders sending mail with no message-id. Yes. It's one of the most annoying persistent sorts of mail sloppiness. -- Bill Cole b...@scconsult.com or billc...@apache.org (AKA @grumpybozo and many *@billmail.scconsult.com addresses) Currently Seeking Steady Work: https://linkedin.com/in/billcole
Re: Question about BAYES_999
On 01/01/2018 07:08 PM, Reindl Harald wrote: Am 02.01.2018 um 01:59 schrieb David Jones: On 01/01/2018 06:52 PM, David Jones wrote: On 01/01/2018 06:47 PM, Reindl Harald wrote: Am 02.01.2018 um 01:18 schrieb David Jones: I just had a spam message hit BAYES_999 but not BAYES_99. Based on BAYES_999 default score of 0.2, I thought that it was always supposed to complement the BAYES_99 rule and both would trigger when BAYES_999 hit. https://pastebin.com/QsVgXwdC If they are independent, then it would seem logical to bump up the default score higher than BAYES_99 never ever seen that and since bayes is based on a number between 0 and 1 this should be technically impossible at all with BAYES_00 that message has [score: 0.0003] I checked my logs and I am seeing both together when BAYES_999 hits except for a few times. Is this a bug? Should I open a bug issue? I am not sure how to reproduce the problem unless others also see the same thing with that message. Sorry. Not thinking clearly. Others would have to have the same Bayes DB to get that message to do the same thing. I was able to reproduce the same results on another SA platform running MailScanner using the same Bayes DB in redis. 
If others could check their mail logs to see if they are hitting BAYES_999 without BAYES_99 on the same message, please let me know

    [sa-milt@mail-gw:/var/log]$ xzcat maillog-2017*.xz | grep "BAYES_999," | wc -l
    9125
    [sa-milt@mail-gw:/var/log]$ xzcat maillog-2017*.xz | grep "BAYES_999," | grep "BAYES_99," | wc -l
    9125
    [sa-milt@mail-gw:/var/log]$ xzcat maillog-2017*.xz | grep "BAYES_999," | grep -v "BAYES_99," | wc -l
    0

Since yesterday morning:

    # grep "BAYES_999=" /var/log/maillog | grep "BAYES_99=" | wc -l
    8006
    # grep "BAYES_999=" /var/log/maillog | wc -l
    8092
    # grep "BAYES_999=" /var/log/maillog | grep -v "BAYES_99=" | wc -l
    86

Last week:

    # grep "BAYES_999=" /var/log/maillog-20171231 | grep "BAYES_99=" | wc -l
    43753
    # grep "BAYES_999=" /var/log/maillog-20171231 | wc -l
    44108
    # grep "BAYES_999=" /var/log/maillog-20171231 | grep -v "BAYES_99=" | wc -l
    355

-- 
David Jones
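The three grep pipelines above can be folded into one pass. This is a sketch in Python under the assumption that each scanned message produces a single log line listing its rule hits; the sample lines are invented and real log formats differ between setups. The word-boundary pattern plays the role of the trailing "," or "=" in the greps, so that "BAYES_99" is not found inside "BAYES_999".

```python
import re
from collections import Counter

B999 = re.compile(r"\bBAYES_999\b")
B99 = re.compile(r"\bBAYES_99\b")   # '\b' fails before the third '9' of BAYES_999

def tally(lines):
    """Count BAYES_999 lines, split by whether BAYES_99 appears on the same line."""
    counts = Counter()
    for line in lines:
        if B999.search(line):
            counts["bayes_999"] += 1
            counts["both" if B99.search(line) else "bayes_999_only"] += 1
    return counts

# Hypothetical log lines for illustration.
sample = [
    "spamd: result: Y 9 - BAYES_99=3.5, BAYES_999=0.2, HTML_MESSAGE=0.001",
    "spamd: result: . 1 - BAYES_999=0.2, HTML_MESSAGE=0.001",
    "spamd: result: . 0 - BAYES_50=0.8",
]
print(dict(tally(sample)))
```

For a real log, the list can be replaced by an open file handle, for example `tally(open("/var/log/maillog"))`.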
Re: Malformed spam email gets through.
On 1 Jan 2018, at 11:41 (-0500), Matus UHLAR - fantomas wrote:
> the gross format in RFCs 822,2822 and 5322 describes message-id
> consisting of local and domain part, thus it must contain "@".

No, it does not. Re-read the cited sections. From RFC5322, the ABNF definition:

    msg-id          = [CFWS] "<" id-left "@" id-right ">" [CFWS]
    id-left         = dot-atom-text / obs-id-left
    id-right        = dot-atom-text / no-fold-literal / obs-id-right
    no-fold-literal = "[" *dtext "]"

Note the lack of specification of "local" and "domain" parts.

Also note that if you demand that MIDs contain '@' with conforming strings on both sides, you risk losing mail that users want. This is a mistake I have made.

-- 
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Currently Seeking Steady Work: https://linkedin.com/in/billcole
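For anyone who wants to see what the quoted ABNF accepts, here is a sketch of it as a Python regular expression. It is deliberately incomplete: the obsolete forms (obs-id-left, obs-id-right) and the optional CFWS around the angle brackets are left out, and the names are mine. The character sets follow the atext and dtext definitions of RFC 5322.

```python
import re

# atext: letters, digits, and the printable specials allowed in an atom.
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
DOT_ATOM_TEXT = rf"{ATEXT}+(?:\.{ATEXT}+)*"
# dtext: printable ASCII 33-90 and 94-126, i.e. without '[', '\' and ']'.
DTEXT = r"[\x21-\x5a\x5e-\x7e]"
NO_FOLD_LITERAL = rf"\[{DTEXT}*\]"

MSG_ID = re.compile(rf"<{DOT_ATOM_TEXT}@(?:{DOT_ATOM_TEXT}|{NO_FOLD_LITERAL})>")

def is_strict_msg_id(value: str) -> bool:
    """True if value matches the non-obsolete msg-id grammar sketched above."""
    return MSG_ID.fullmatch(value) is not None

for candidate in ("<abc.def@example.com>", "<abc@[192.0.2.1]>", "<x@y>", "<abcdef>"):
    print(candidate, is_strict_msg_id(candidate))
```

The "<x@y>" case passes, which illustrates the point made above: the grammar requires an '@' but nothing that has to be a local part or a resolvable domain. As the message warns, rejecting mail on this check alone risks losing wanted mail.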
Re: Malformed spam email gets through.
On 01/01/2018 09:29 AM, Bill Cole wrote: On 1 Jan 2018, at 9:59 (-0500), David Jones wrote: I think some mail systems will keep the same message-ID per email thread so your system must reject some replies. I have not seen such behavior in the past 20 years... Ok. I stand corrected then. What about bounces? Don't they intentionally keep all of the same headers with an empty envelope-from? Intentionally re-using another site's MIDs is so wrong that I'd happily make it break hard. HOWEVER, the idea of enforcing any standard on MIDs beyond gross format (e.g.: <[[:ascii:]]{3,996}>) on a system where the admin isn't the sole user is ludicrous. -- David Jones
Re: Malformed spam email gets through.
David Jones wrote on 2018-01-01 15:59:
> There is no way that most of us on this mailing list can be as strict or
> our customers would complain constantly about missing email.

Postfix adds an RFC-conformant Message-ID to mail that doesn't follow the RFCs, so the first MTA (Postfix here) hides the MUA's fault of not following the RFCs. I don't know how other MTAs help spammers in this way.
Re: Malformed spam email gets through.
On 1 Jan 2018, at 3:54 (-0500), Rupert Gallagher wrote: We reject anything whose mid does not include the fqdn or address literal of their sending server. We do this because the RFC says explicitly that the mid *MUST* have those features. This is a blatant falsehood. Relevant RFCs: https://tools.ietf.org/html/rfc5322#section-3.6.4 https://tools.ietf.org/html/rfc2822#section-3.6.4 https://tools.ietf.org/html/rfc822#section-4.6 The only "MUST" in regard to MID content in any of those is uniqueness. Use of a domain identifier is merely RECOMMENDED. Beyond that, it is *IMPOSSIBLE* for a receiving system to reliably determine whether the right-hand part of a MID is a valid host or domain identifier for the generator of the MID. -- Bill Cole b...@scconsult.com or billc...@apache.org (AKA @grumpybozo and many *@billmail.scconsult.com addresses) Currently Seeking Steady Work: https://linkedin.com/in/billcole
Re: Malformed spam email gets through.
On 01/01/2018 09:33 AM, David Jones wrote: On 01/01/2018 09:29 AM, Bill Cole wrote: On 1 Jan 2018, at 9:59 (-0500), David Jones wrote: I think some mail systems will keep the same message-ID per email thread so your system must reject some replies. I have not seen such behavior in the past 20 years... Ok. I stand corrected then. What about bounces? Don't they intentionally keep all of the same headers with an empty envelope-from? Answering myself. No. I checked a few and the Message-ID is generated new on bounces too. NM Ignore me ... :) I was thinking of something else related to email archiving that dedupes based on the Message-ID. Happy New Year! Intentionally re-using another site's MIDs is so wrong that I'd happily make it break hard. HOWEVER, the idea of enforcing any standard on MIDs beyond gross format (e.g.: <[[:ascii:]]{3,996}>) on a system where the admin isn't the sole user is ludicrous. -- David Jones
Re: Malformed spam email gets through.
On 01/01/2018 02:54 AM, Rupert Gallagher wrote:
> We reject anything whose mid does not include the fqdn or address
> literal of their sending server. We do this because the RFC says
> explicitly that the mid *MUST* have those features. We write exceptions
> for those few senders who are legitimate but have lazy and incompetent
> sysadmins.
>
> On Mon, Jan 1, 2018 at 00:15, Mark London wrote:
> > Message-ID:

Wow! You must not have any spam problems because you don't accept much email -- ham or spam. :)

I think some mail systems will keep the same message-ID per email thread so your system must reject some replies.

There is no way that most of us on this mailing list can be as strict or our customers would complain constantly about missing email.

-- 
David Jones
Re: Malformed spam email gets through.
> On 1 Jan 2018, at 9:59 (-0500), David Jones wrote:
> > I think some mail systems will keep the same message-ID per email
> > thread so your system must reject some replies.

On 01.01.18 10:29, Bill Cole wrote:
> I have not seen such behavior in the past 20 years...
>
> Intentionally re-using another site's MIDs is so wrong that I'd happily
> make it break hard.
>
> HOWEVER, the idea of enforcing any standard on MIDs beyond gross format
> (e.g.: <[[:ascii:]]{3,996}>) on a system where the admin isn't the sole
> user is ludicrous.

The gross format in RFCs 822, 2822 and 5322 describes a message-id consisting of a local and a domain part, thus it must contain "@".

-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Due to unexpected conditions Windows 2000 will be released in first quarter of year 1901
Re: Malformed spam email gets through.
On 1 Jan 2018, at 09:41, Matus UHLAR - fantomas wrote:
> the gross format in RFCs 822,2822 and 5322 describes message-id consisting
> of local and domain part,

You are misreading the RFC. The Message-ID itself is a *should*, and there is no MUST in any of the description of the construction of the Message-ID, only that it MUST be globally unique.

5322 specifically states:

"Though other algorithms will work, it is RECOMMENDED that the right-hand side contain some domain identifier (either of the host itself or otherwise) such that the generator of the message identifier can guarantee the uniqueness of the left-hand side within the scope of that domain."

There is no requirement to include a local and domain part in any part of a Message-ID.

A 256-bit value would be unique to some significant fraction of the atoms in the universe. I'd posit that meets any reasonable definition of "must be globally unique."

But, in practice, the simplest way to guarantee uniqueness is to generate a timestamp and add it to a domain/IP/local ID.

-- 
"We take off our Republican hats and put on our American hats" -- Many Republicans in Sep 2008
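The "timestamp plus domain" recipe above is roughly what Python's standard library does in `email.utils.make_msgid`, which combines the current time, the process ID and random bits on the left and a domain on the right. A minimal sketch, where the domain `mail.example.com` is a placeholder:

```python
from email.utils import make_msgid

# The domain is a hypothetical placeholder; by default the local host name is used.
first = make_msgid(domain="mail.example.com")
second = make_msgid(domain="mail.example.com")

print(first)
print(first != second)
```

The domain here only scopes uniqueness for the generator; as discussed above, a receiver cannot rely on it naming the sending host.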
Re: Malformed spam email gets through.
We reject anything whose mid does not include the fqdn or address literal of their sending server. We do this because the RFC says explicitly that the mid *MUST* have those features. We write exceptions for those few senders who are legitimate but have lazy and incompetent sysadmins.

On Mon, Jan 1, 2018 at 00:15, Mark London wrote:
> Message-ID:
Re: Malformed spam email gets through.
> On 1 Jan 2018, at 11:41 (-0500), Matus UHLAR - fantomas wrote:
> > the gross format in RFCs 822,2822 and 5322 describes message-id
> > consisting of local and domain part, thus it must contain "@".

On 01.01.18 12:17, Bill Cole wrote:
> No, it does not. Re-read the cited sections. From RFC5322, the ABNF
> definition:
>
>    msg-id = [CFWS] "<" id-left "@" id-right ">" [CFWS]

This is the part that says message-id must consist of local and domain parts. It just says it implicitly, not explicitly, but:

It's not possible to construct a Message-Id without the "@" while conforming to any of the mentioned RFCs.

> Also note that if you demand that MIDs contain '@' with conforming
> strings on both sides, you risk losing mail that users want. This is a
> mistake I have made.

What exactly was the problem? Message-Id without the "@", or the non-conforming parts there?

-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Your mouse has moved. Windows NT will now restart for changes to take effect. [OK]