Re: Multiple test failures
I haven't had a chance yet to read this thread carefully, but spamd when run as root in tests will, at least in some cases, set itself to run as user "nobody". If you do that in a subdirectory of your non-nobody user's HOME, the usual permission configuration will not provide read access to nobody and the test will fail. Would it be worth adding some sort of test for this kind of thing that could make a reasonably explicit "don't run install as root" or "incorrect directory permissions" or some such, to make it more obvious what is going wrong? I seem to remember at least one or two very similar postings here in the last year or so, so it is something that does happen and confuses people.
Re: Beginner Setting up Spam Assassin
SpamAssassin cannot block or eliminate spam. It does not have the facilities to do that. SA can only score potential spam. Whatever method you used to glue SA into your mail path needs to parse the score SA assigned in the returned mail, and do whatever routing it thinks is appropriate. We do not know what glue you are using to put SA into your mail path, so it is hard to give suggestions on how to set that unknown software up. With more details of your setup we may be able to help. We can suggest rules to assign a score to mail if it comes from a particular account. But something other than SA will then have to deal with that score and do the routing.

- Original Message -
From: FalconChristopher
To: Michael Grant ; users@spamassassin.apache.org
Sent: Saturday, December 30, 2023 2:48 AM
Subject: Re: Beginner Setting up Spam Assassin

Hi, can I not ask how to set up Spam Assassin in this mailing group it is a group for Spam Assassin.

On 12/30/2023 4:30 AM, Michael Grant wrote:
Can you ban this user in whatever your equivalent of the access file is so instead of putting the messages into a spam folder, you reject messages from that address at delivery time (SMTP)?

On 30 December 2023 04:08:17 CET, FalconChristopher wrote:
Anyone know how I can check and setup SpamAssassin so that I can eliminate some spam from coming in from a email account ?

On 12/28/2023 2:24 AM, Matus UHLAR - fantomas wrote:
> On 27.12.23 16:53, FalconChristopher wrote:
>> Hi, I want to setup Spam Assassin so that any email that Spam
>> Assassin flags as spam
>
> this is spamassassin's job
>
>> gets placed into a folder for a specific SMTP or IMAP email account.
>
> this is not spamassassin's job.
> It's job of mail delivery agent - procmail, maildrop, sieve
>
>> Then if Spam Assassin flags emails that are not spam I can tell it
>> which of those emails to not place into the spam folder for the
>> specific email client. Until it gradually learns which emails are
>> spam and which are not.
>
> dovecot (imap/pop3 server) has plugins that support training of
> spam/ham, if you move the mail from/to spam folder.
>
> https://doc.dovecot.org/configuration_manual/spam_reporting/
>
>> I've done a little research and I have access with my distribution to
>> a mail directory as well as the local.cf file for which
>> configurations are for Spam Assassin but I don't know how to setup
>> what I mentioned above ?
Re: My apologies
I've blocked him on my mail server, as well. Reindl now and then says something useful, but as you have noticed his people skills are somewhere in the negative 200 score level. I don't know that I'd block him, but you do need to take anything he says with a few horselicks of salt.
Re: Sudden surge in spam appearing to come from my email address
> header __FROM_THOMAS_1 From =~ //i
You can simplify this. The parenthesized grouping was only necessary when there was more than one possible string, in my case .com and .net. Since you only have .com you can remove the (?: and ) and make the regex a little more efficient:
> header __FROM_THOMAS_1 From =~ //i
Re: Sudden surge in spam appearing to come from my email address
> Am I correct? Sorry if I'm being dense. I'm just a sysadmin, not a developer, > so I'm not super clear on how macros and expansions work in perl. You have the concepts right. I'd try the rules you posted and see if they seem to be producing correct results. You can run a spam thru SA with the -t switch and see which rules hit, and hopefully the NOT_FROM_ rule will hit. Send yourself a test mail and see that it doesn't hit. If that all works it is time to add the rules for the family. If it doesn't work, look at how the From header is formatted in the mail you sent to yourself. Loren
Re: Sudden surge in spam appearing to come from my email address
I am suddenly getting hammered by a BUNCH of spam that appears to be from me. It scores low, and even though I keep feeding it to Bayes, it's still not hitting the threshold to be marked as spam. When I check the headers, it's coming from multiple random email servers, but many appear to originate from hotmail/outlook.com. So from outlook.com, through some unsecured email server, then to my server.

SA can't block this trash by itself, but if something post the SA invocation can look at the headers you might be able to block it. You can certainly mark it as spam. For instance:

#
# Ok, catch 'from me' when it isn't
header __FROM_ME_1 From =~ //i
header __FROM_ME_2 From =~ /\"First Last\" /
header __FROM_ME_3 From =~ /First Last /
meta NOT_FROM_ME __FROM_ME_1 && !(__FROM_ME_2 || __FROM_ME_3)
score NOT_FROM_ME 10
describe NOT_FROM_ME Spammer faking the mail from me!

Mind the backslash on the quotes and at sign. Depending on versions of things these are necessary, and don't hurt if they are not necessary.
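The logic of the NOT_FROM_ME meta can be sketched outside SA. This is only an illustration in Python, not an SA rule: the address `me@example.com` and the exact header forms are assumptions standing in for the ones elided from the rules above.

```python
import re

# Hypothetical address and display name, standing in for the elided ones.
ADDRESS = re.compile(r"me@example\.com", re.IGNORECASE)       # like __FROM_ME_1
QUOTED_FORM = re.compile(r'"First Last" <me@example\.com>')   # like __FROM_ME_2
BARE_FORM = re.compile(r"First Last <me@example\.com>")       # like __FROM_ME_3


def not_from_me(from_header: str) -> bool:
    """True when my address appears, but not in a form my own mailer writes."""
    from_me_1 = bool(ADDRESS.search(from_header))
    from_me_2 = bool(QUOTED_FORM.search(from_header))
    from_me_3 = bool(BARE_FORM.search(from_header))
    # Same boolean shape as: __FROM_ME_1 && !(__FROM_ME_2 || __FROM_ME_3)
    return from_me_1 and not (from_me_2 or from_me_3)
```

A forged header such as `"Cheap Pills" <me@example.com>` trips the check, while either of the two genuine forms does not.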
Re: Best practice for adding headers?
I've patched spamass milter to let any previously added "X-Spam" headers untouched

It's generally considered bad practice to pass thru X-Spam headers from an unknown source. Like most anything else in an email header, a spammer could inject his own headers, probably populated with items designed to generate a negative score.
Re: Help with rule
> meta FROM_CLIENT_TEST from FROM_CLIENT_EMAIL && FROM_CLIENT_IP

Is that a typo when you were making this mail, or is it actually how the line is coded? There is an extra "from" there. Even if you fix that, you won't get the results you expect. Both FROM_CLIENT_EMAIL and FROM_CLIENT_IP will score as 1 point each if they hit, so your final adjusted score will be +1, not -1. You can fix that in several ways:

header FROM_CLIENT_EMAIL From =~ /client@client\.com/i
score FROM_CLIENT_EMAIL 0.01
header FROM_CLIENT_IP Received =~ /from 138\.31\.230\.222/
score FROM_CLIENT_IP 0.01

Or

header FROM_CLIENT_EMAIL From =~ /client@client\.com/i
header FROM_CLIENT_IP Received =~ /from 138\.31\.230\.222/
meta FROM_CLIENT_TEST FROM_CLIENT_EMAIL && FROM_CLIENT_IP
score FROM_CLIENT_TEST -3.0

Or, probably the best way once you have the tests debugged and you know they both hit correctly:

header __FROM_CLIENT_EMAIL From =~ /client@client\.com/i
header __FROM_CLIENT_IP Received =~ /from 138\.31\.230\.222/
meta FROM_CLIENT_TEST __FROM_CLIENT_EMAIL && __FROM_CLIENT_IP
score FROM_CLIENT_TEST -1.0

The double underscore on the front of the rule will keep it from contributing a score of its own, and it will not show in the list of hit rules. Thus you will only see the result of the meta.

Loren
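The score arithmetic above can be checked with a toy model. This is a sketch in Python, not SA's actual scoring code; it assumes only what the message states: a rule with no score line counts 1 point, and a rule whose name starts with a double underscore contributes nothing.

```python
DEFAULT_SCORE = 1.0  # assumption from the text: unscored rules count 1 point each


def total_score(hits, scores):
    """Sum the scores of the rules that hit, skipping __ sub-rules."""
    total = 0.0
    for name in hits:
        if name.startswith("__"):
            continue  # sub-rule: no score of its own
        total += scores.get(name, DEFAULT_SCORE)
    return total


# Original layout: both header rules score 1 each, the meta scores -1.
original = total_score(
    ["FROM_CLIENT_EMAIL", "FROM_CLIENT_IP", "FROM_CLIENT_TEST"],
    {"FROM_CLIENT_TEST": -1.0},
)

# Double-underscore layout: only the meta contributes.
with_subrules = total_score(
    ["__FROM_CLIENT_EMAIL", "__FROM_CLIENT_IP", "FROM_CLIENT_TEST"],
    {"FROM_CLIENT_TEST": -1.0},
)
```

Here `original` comes out at +1 and `with_subrules` at -1, which is the difference the message describes.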
Re: authres missing when ran from spamass-milter
This is not an area I know anything about, so I may be completely wrong. That said, I seem to remember a conversation very like this some years back. If I remember correctly, someone found some switch that could be set to get spamass-milter to add the Received header before calling the other milters. Even if there isn't a switch, maybe it would only take a few lines of code change in spamass-milter to put out the Received header earlier.
Re: comparing sender domain against recipient domain
But I was more interested if SA already has something like that? It does not.

Weren't there a whole set of "FUZZY" rules once? I'm pretty sure that they looked for words in the subject and maybe body of the email that had exactly this sort of obfuscation. I don't think they were applied to domain names, and certainly not matching two fields in different headers. But if the code for the fuzzy rules is still around, it possibly could be adapted for this use.
How can I detect a text/plain base64 email message with no other text parts?
I get a lot of spams, and a major characteristic is they only have text/plain that is base-64 encoded. Since I live in an area where base-64 encoding is basically never necessary, almost all base-64 encoded text parts are major spam signs.

Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: base64

Thanks, Loren
Re: BAYES scores
From: "Bill Cole" It is my understanding that an automated rescoring job was run quite some time ago (before I was on the PMC) to generate the Bayes scores, which determined that to be the best supplemental score to give to the greater certainty. I was around in those days. My memory isn't the greatest anymore, but what I recall was that they did automatic rescoring, and then manually tweaked a few of the values, basically to make them look pretty by rounding off long fractions. BAYES_999 may have been scored almost completely manually, I can't quite recall. Loren
Re: Strange findings debugging bayes results
From: "Reindl Harald"
in other words a system for morons - morons which will drag mails to spam instead click on "unsubscribe" per-user bayes don't work well, never

Well Harald, you are certainly welcome to your opinion. It would be nicer if you had kept it to yourself though. The system works just fine with the userbase it has. It probably wouldn't work for AOL or *.online.
Re: Strange findings debugging bayes results
This is a home system with only a few users. All users have "Spam" and "Ham" folders showing up in their email program of choice, and they just drag messages they do or don't like into the appropriate folders. There are "Oldham" and "Oldspam" mboxes, and the new spam and ham (respectively) get merged into these folders after learning, and removed from the current Spam and Ham folders.

- Original Message -
From: Michael Grant
To: users@spamassassin.apache.org ; Loren Wilton ; hg user
Sent: Monday, February 20, 2023 12:47 PM
Subject: Re: Strange findings debugging bayes results

On 20 February 2023 12:28:00 CET, Loren Wilton wrote:
> A cron job that will harvest Spam and Ham mboxes and feed them to sa-learn once a day, then archive the learned messages. Per-user bayes and learning. Mail is hand-moved into the spam and ham learning folders, and for my personal account, I do this rarely, generally only when a message is mis-categorized. Although messages being mis-categorized as spam is often the result of a lot of quite aggressive local rules I have rather than a Bayes mis-classification.

When you "harvest" ham from mboxes, what do you consider ham? You also, additionally, have a Ham folder for your users then? Interesting. Did you manage to train your users to use it easily? Does it grow unbounded or are old messages removed from it? If so, how to know they can be deleted like from the Spam folder. It's an interesting idea, just wondering about the details. Getting my users to train spamassassin has always been impossible for me.
Re: Strange findings debugging bayes results
> Can you please give me some details on your bayes setup? > Headers exclusion, bayes_token_sources, how do you "sa-learn" messages... Standard options on Bayes. No autolearn. A cron job that will harvest Spam and Ham mboxes and feed them to sa-learn once a day, then archive the learned messages. Per-user bayes and learning. Mail is hand-moved into the spam and ham learning folders, and for my personal account, I do this rarely, generally only when a message is mis-categorized. Although messages being mis-categorized as spam is often the result of a lot of quite aggressive local rules I have rather than a Bayes mis-classification.
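A daily job of the kind described might look roughly like the following. This is a hedged sketch, not the actual script: the mail directory layout and the function name are hypothetical, the folder names "Spam" and "Ham" are taken from this thread, and the only sa-learn options used are `--spam`, `--ham` and `--mbox`. The function only builds the command lines, so nothing is learned until they are actually run.

```python
from pathlib import PurePosixPath


def learn_commands(mail_dir):
    """Build the two sa-learn invocations for one user's learning folders.

    mail_dir is a hypothetical per-user directory holding mbox files
    named "Spam" and "Ham".
    """
    base = PurePosixPath(mail_dir)
    return [
        ["sa-learn", "--spam", "--mbox", str(base / "Spam")],
        ["sa-learn", "--ham", "--mbox", str(base / "Ham")],
    ]
```

With per-user Bayes, each command would be run as the owning user (for example with `subprocess.run(cmd, check=True)` from that user's crontab), and archiving of the learned mail would follow as a separate step.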
Re: Strange findings debugging bayes results
> The real question is: has bayes still its use case in 2023 ? Is it still used > with important scores or just to flag messages for a review? It works fine for me here.
Re: BAYES_00 BODY. Negative score?
They receive wildly different BAYES scores.

* -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1%
*      [score: 0.0002]
*  2.2 BAYES_20 BODY: Bayes spam probability is 5 to 20%
*      [score: 0.0881]

This looks like you have per-user Bayes databases, and the message type has been trained differently in each. Also, it looks like there are per-user rules, since BAYES_50 has a normal score of 0.2, and there is no reason BAYES_20 (indicating much less spammy) should have a score of 2.2.
Re: Seeing big (>1MB) spam
I started seeing some spam today in the 1-1.5 MB range. It's been over a year now, but for a while I was getting a huge number of spams that were either 1143 KB or 3831 KB. The 3831 KB variant used the same obfuscation payload as the 1143 KB spams, they just put it in twice in a row. Loren
Re: BAYES_00 BODY. Negative score?
Have some annoying SPAM that consistently shows a negative score on BAYES. Is the default scoring or influenced by BAYES in some way?

*-1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1%
* [score: 0.]

The score is reasonable for guaranteed ham, which is what your Bayes thinks this spam email is. Of course the score isn't reasonable for spam, but Bayes thinks it is ham. In addition to being cautious of autolearn as Benny described, yes, you need to retrain your Bayes, because it is very clearly confused on this point.

Loren
Re: New rule wanted
I believe 3MB is above the default scan size for SA, so likely it won't even look at the file. Loren - Original Message - From: Rupert Gallagher To: users@spamassassin.apache.org Sent: Tuesday, February 07, 2023 2:26 AM Subject: Re: New rule wanted Note: Both client and server are not Windows. The attached file type is a generic "data" on unix. On a Windows client the file runs as executable. A SA rule should merely detect that the file type is a generic "data" file. Original Message On Feb 7, 2023, 11:15, Rupert Gallagher < r...@protonmail.com> wrote: I received a spam with score -1. Well written, looks legit commercial, asking for a quotation, with details in the attachment, a 3MB file with unknown extension ".one". The file turns out to be a Windows Trojan: https://www.virustotal.com/gui/file/f4d587f60f2d34add9f77fcbd8c3c0df3ca51cfaecd9de85c45d25647eaac40b Both SA and ClamAV passed it as legit. We should have a SA rule that says: "attached file with unknown data type".
Re: Rule Help - not sure what is wrong with my syntax
> header TO_SPECIFIC_DOMAIN To:addr =~ /\@(test\.com|test\.net)$/ That for efficiency really should use a non-capturing grouping: header TO_SPECIFIC_DOMAIN To:addr =~ /\@(?:test\.com|test\.net)$/ Note the "?:" after the left parend. Loren
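The difference between the two groupings can be seen in any Perl-style regex engine. The sketch below uses Python's `re` module as a stand-in for Perl (an assumption made for illustration; SA itself runs the pattern in Perl), with the `@` written unescaped.

```python
import re

capturing = re.compile(r"@(test\.com|test\.net)$")
non_capturing = re.compile(r"@(?:test\.com|test\.net)$")

# Both accept and reject exactly the same addresses...
same_match = bool(capturing.search("user@test.net")) and bool(non_capturing.search("user@test.net"))
same_reject = capturing.search("user@test.org") is None and non_capturing.search("user@test.org") is None

# ...but only the first one has to remember what the group matched.
group_counts = (capturing.groups, non_capturing.groups)
```

`group_counts` shows one saved group for the first pattern and none for the second, which is the bookkeeping the `?:` avoids.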
Re: Rule Help - not sure what is wrong with my syntax
Why not do a simple rule rather than inventing some Perl code?

header TO_SPECIFIC_EMAIL To:addr =~ m'(?:\bus...@example.com|\bus...@example.com|\bus...@example.com)'
describe TO_SPECIFIC_EMAIL Mail to a specific email address
score TO_SPECIFIC_EMAIL -2

header TO_SPECIFIC_DOMAIN To:addr =~ m'(?:\@example1\.com|\@example2\.com|\@example3\.com)'
describe TO_SPECIFIC_DOMAIN Mail to specific email domain
score TO_SPECIFIC_DOMAIN -2

or possibly

header TO_SPECIFIC_DOMAIN To:addr =~ m'\@(?:example1\.com|example2\.com|example3\.com)$'

Loren

- Original Message -
From: Joey J
To: users@spamassassin.apache.org
Sent: Wednesday, January 11, 2023 3:39 PM
Subject: Rule Help - not sure what is wrong with my syntax

Hello All,
I created this rule to check for email addresses matching a list to get added some negative value. I also tried it with just domains so it would be more efficient, but I can't seem to get them to run. Any suggestions?

header TO_SPECIFIC_EMAIL eval:check_to_specific_email()
describe TO_SPECIFIC_EMAIL Mail to a specific email address
score TO_SPECIFIC_EMAIL -2

sub check_to_specific_email {
  my ($self) = @_;
  my $to = lc($self->get('To:addr'));
  my $list_of_address = qr/us...@example.com|us...@example.com|us...@example.com/;
  if ($to =~ $list_of_address) {
    return 1;
  }
  return 0;
}

This version was to simply check for the domain matches, but can't seem to get it to work

header TO_SPECIFIC_DOMAIN eval:check_to_specific_domain()
describe TO_SPECIFIC_DOMAIN Mail to specific email domain
score TO_SPECIFIC_DOMAIN -2

sub check_to_specific_domain {
  my ($self) = @_;
  my $to = lc($self->get('To:addr'));
  if ($to =~ /\@example1\.com$|\@example2\.com$|\@example3\.com$/) {
    return 1;
  }
  return 0;
}

--
Thanks!
Joey
Re: local rule exclude all domains except "my list of approved"
You can simplify your rule code a little if you want:

header __LOCAL_FROM_BE From =~ /.\.beauty/i
meta LOCAL_BE (__LOCAL_FROM_BE)
score LOCAL_BE 2
describe LOCAL_BE from beauty domain

to

header LOCAL_BE From =~ /.\.beauty/i
score LOCAL_BE 2
describe LOCAL_BE from beauty domain

The meta isn't really doing anything there, since it only has a single clause. Metas are good when you want to combine the results of several matches with boolean logic.

You might also want to add a \b to the rule:

header LOCAL_BE From =~ /.\.beauty\b/i

Without that the rule will match ".beauty", but also ".beautyrest".

Another thing you might want to consider is using "From:addr" rather than just "From". As it is, it will match ".beauty" both in the address and in the person's name description. So it would match:

From: "janice.beautyfull"

Maybe you want that, in which case a bare "From" is fine.
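The difference between matching the whole From header and matching only the address can be illustrated outside SA. This sketch uses Python's `email.utils.parseaddr` as a rough stand-in for what "From:addr" extracts; the address `janice@example.com` is hypothetical, added only so the header has an address part.

```python
import re
from email.utils import parseaddr

BEAUTY = re.compile(r".\.beauty", re.IGNORECASE)

from_header = '"janice.beautyfull" <janice@example.com>'  # address is made up
display_name, address = parseaddr(from_header)

hits_whole_header = bool(BEAUTY.search(from_header))  # sees the display name too
hits_address_only = bool(BEAUTY.search(address))      # sees only the address
```

The whole-header match fires because of the display name, while the address-only match does not.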
Re: SA build from cpan fails under certain conditions
If this is on 4.0, perhaps a bug should be opened.

- Original Message -
From: Shawn Iverson
To: SA Mailing list
Sent: Wednesday, December 21, 2022 10:05 AM
Subject: SA build from cpan fails under certain conditions

Hello SA Users,

Just posting this in case anyone else runs into similar trouble...

sudo cpan Mail::Spamassassin seems to only build properly on recent flavors of rhel under very specific conditions, notably:

  You are not root
  The cpan configuration is set to build specifically using local lib or sudo.

sudo seems to work if you want to install SA for all users instead of in the home directory. The key is that the building/testing is happening in the user's home directory.
Re: Whitelist or add negative values for score
Personally I'd look at why BIGNUM_EMAILS_MANY is hitting and see if there is something the sender could do to avoid it. I'm pretty sure I've never seen that rule hit in any of my spam, so it must be something a bit unique. Loren
Re: Problems matching the last word in multi-OR Regex
> body __ANIMALS /cat|mouse|bird|dog/i

There is a possible problem with your rule. It probably isn't related to what you are seeing, but could be a problem for you anyway. There is no word boundary in the regex, so 'cat' will match catamaran, 'mouse' will match mousehouse, 'bird' will match birddog, and so will 'dog'. You can solve this by adding word boundaries:

body __ANIMALS /\b(?:cat|mouse|bird|dog)\b/i

or

body __ANIMALS /\bcat\b|\bmouse\b|\bbird\b|\bdog\b/i
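The effect of the word boundaries is easy to check with any Perl-style regex engine. The sketch below uses Python's `re` module as a stand-in for the Perl engine SA uses, with the alternation written inside a non-capturing group.

```python
import re

loose = re.compile(r"cat|mouse|bird|dog", re.IGNORECASE)
strict = re.compile(r"\b(?:cat|mouse|bird|dog)\b", re.IGNORECASE)

# Without boundaries, substrings inside longer words hit.
loose_hits = loose.findall("a catamaran and a birddog")

# With boundaries, only whole words hit.
strict_hits = strict.findall("a catamaran and a birddog")
whole_word = strict.findall("The cat chased a Mouse")
```

`loose_hits` picks up "cat", "bird" and "dog" from inside the longer words, `strict_hits` is empty, and `whole_word` still finds the two real animals.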
Re: spam subject marking
So the alternative is adding a header and move it to the spam folder automatically on the basis of the header? Currently I just want to 'warn' users that the message is possible spam, they can decide to move such emails automatically to a spam folder by enabling a sieve rule. What would be an alternative method to keep such functionality without altering the subject? If SA sees the message and classifies it as spam, it normally adds (from an example) X-Spam-Flag: YES X-Spam-Level: X-Spam-Status: Yes, score=8.2 required=5.0 tests=BAYES_50=0.8,DKIM_SIGNED=0.1, It should be trivial to look for the "X-Spam-Flag: YES" line.
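Whatever sits after SA in the delivery path (sieve, procmail, or a script) only has to read that one header. As an illustration of how little is involved, here is a sketch in Python using the standard `email` module; the function name is hypothetical and this is not a replacement for a sieve rule.

```python
from email import message_from_string


def is_flagged_spam(raw_message: str) -> bool:
    """True if the message carries an X-Spam-Flag header set to YES."""
    msg = message_from_string(raw_message)
    flag = msg.get("X-Spam-Flag") or ""
    return flag.strip().upper() == "YES"


sample = (
    "X-Spam-Flag: YES\n"
    "X-Spam-Status: Yes, score=8.2 required=5.0\n"
    "Subject: hello\n"
    "\n"
    "body text\n"
)
```

Calling `is_flagged_spam(sample)` reports the message as flagged; the same message without the X-Spam-Flag line does not, and the Subject is never touched.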
Re: PDS_DBL_URL_TNB_RUNON
Pretty obviously a spam, I'm surprized that it didn't get a lot of "fake order" type of points. Here is the (or at least one) double URL that it caught: ;" href=3D"ht= tps://nam02.safelinks.protection.outlook.com/?url=3Dhttps%3A%2F%2Fsojaprote= in.rs
How do I check for a jpeg attachment?
I'm getting a bunch of spams from fake gmail accounts that consist of one short line of text and a 2 MB jpg file. The subject and body text are pretty much random beyond that. How do I check for the following?

--e345f305ea2680cd
Content-Type: image/jpeg; name="MMM.jpg"
Content-Disposition: attachment; filename="MMM.jpg"
Content-Transfer-Encoding: base64
Content-ID:
X-Attachment-Id: f_l8t6clr50

I want to match on /^Content-Type: image\/jpeg;/ but I can't figure out how to do that. rawbody doesn't seem to work. Thanks
Re: Mail with image marked as spam
It sure seems to me like people are just using email to share pictures (licenses, legal docs, as well as pictures of the kids.)

Are these messages that are being sent by individuals from their phones? Or is there some program that is sending these? I can see it being too much work to type a subject if you are just texting a snapshot to someone else on another phone, but if it is a program, perhaps it could be trained to put a "Here is your photo!" subject in the mail and eliminate the problem.
Re: Matching on missing To field?
> The problem I'm having is that my To header rules aren't matching because
> there is no To header,
> and I'm otherwise unsure what to match on. The only occurrence of the
> recipient in the entire email
> is in that Received header.
>
> It does match on "ALL", but I think I need to be more specific than that, to
> avoid matching on "From:"
> or Return-Path or EnvelopeFrom./

If you want to match on text in Received headers only, then just write a rule to check that header type:

header __TO_FRED_JOHNSON To =~ /\bfred\.johnson@foo\.com\b/
header __RCVD_FRED_JOHNSON Received =~ /\bfred\.johnson@foo\.com\b/
meta TO_FRED_JOHNSON __TO_FRED_JOHNSON || __RCVD_FRED_JOHNSON
meta NOT_TO_ME !TO_FRED_JOHNSON

You could do that with ALL, but this way is probably more efficient, and will be a lot less confusing regex.

Loren
Re: Matching on missing To field?
header __HDRS_MISSP ALL:raw =~ /^(?:Subject|From|To|Reply-To):\S/ism That rule just says: look at all the raw header data and match if there's none of Subject, From, To, Reply-To entries. IE a really malformed message. Hum. As I read it, that is "headers misspelled" (not "headers missing") and it is checking for any of the listed words at the start of a line, followed by a colon, and NOT followed by a space. Loren
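That reading can be checked by running the same pattern over two small header blocks. The sketch below uses Python's `re` module as a stand-in for Perl, with the `i`, `s` and `m` modifiers mapped to the equivalent flags; the sample headers are made up.

```python
import re

HDRS_MISSP = re.compile(
    r"^(?:Subject|From|To|Reply-To):\S",
    re.IGNORECASE | re.DOTALL | re.MULTILINE,
)

well_formed = "From: a@example.com\nSubject: Hello\n"
no_space = "From: a@example.com\nSubject:Hello\n"

hit_well_formed = HDRS_MISSP.search(well_formed)
hit_no_space = HDRS_MISSP.search(no_space)
```

The well-formed block does not match, and the block with "Subject:Hello" does, so the rule fires on a missing space after the colon rather than on a missing header.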
Re: Rule to detect non-standard headers that aren't X- prefixed
Minicomputers-Exhume: sides
Malthus-Films: 88976dea
Parasitic-Homogeneity: db5da28ba3e69a
Capitalizations-Grievously: oilers

It looks like the pattern is /[A-Z][a-z]{1,20}-[A-Z][a-z]{1,20}\:\s{1,10}[\w\d]{3,20}/ or something close to that. Obviously it can mutate, but generally these are made by a tool, and until a new version of the tool comes along, they will be stable. Try something like

header LW_BOGUS_HEADERS ALL =~ /[A-Z][a-z]{1,20}-[A-Z][a-z]{1,20}\:\s{1,10}[\w\d]{3,20}\n/is
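A quick way to sanity-check a pattern like this before putting it in a rule is to run it over the sample headers. The sketch below does that in Python, whose `re` module is used here as a stand-in for Perl; both quantifiers are written as `{1,20}`, and the two ordinary headers used as negatives are made-up examples.

```python
import re

BOGUS = re.compile(
    r"[A-Z][a-z]{1,20}-[A-Z][a-z]{1,20}\:\s{1,10}[\w\d]{3,20}\n",
    re.IGNORECASE | re.DOTALL,
)

samples = [
    "Minicomputers-Exhume: sides\n",
    "Malthus-Films: 88976dea\n",
    "Parasitic-Homogeneity: db5da28ba3e69a\n",
    "Capitalizations-Grievously: oilers\n",
]
ordinary = [
    "Subject: Re: hello world\n",
    "Content-Type: text/plain\n",
]

sample_hits = [bool(BOGUS.search(h)) for h in samples]
ordinary_hits = [bool(BOGUS.search(h)) for h in ordinary]
```

All four samples match and the two ordinary headers do not. Since the pattern is not anchored to the start of a line, it is still worth running it over a ham corpus before giving it a real score.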
Re: Another evil number
Fascinating thread I just stumbled on. Yes, in early parts of the phone system, the letters were geographic and referenced the street for where the central office was located switching those calls. For example, in Arlington VA, my grandfathers number was 533-9389 which was referred to as JE3-9389 and the CO was on Jefferson St.

I'm pretty sure this fell apart rapidly as the system grew. Whether it referenced the CO or not was a regional thing at best. For instance my home number when growing up was YU-71314, where YU was Yukon. For the first year or two that we had a phone it was only YU-1314, the 7 came along later when it became possible to direct dial more than the single CO you were attached to. There was absolutely nothing within several thousand miles that was named Yukon, Alaska, or anything else cold. The CO was on Foothill Blvd, which was a two-lane undivided street.

The main reason for naming toll areas was memorability. It wasn't easy to remember a 7 digit number, but a prefix followed by 3, 4, or 5 digits (depending on the era and where you lived) was much easier to remember, or at least so Ma Bell thought at the time. Named toll codes stayed around until the mid to late 1960s. What finished them off was the introduction of DDD - Direct Distance Dialing, or the area code system we are all familiar with.

Loren
Re: Linting of local.cf
Is there a tool I can use to do a manual lint of the local.cf file ? At command prompt: spamassassin --lint Loren
Re: T_SCC_BODY_TEXT_LINE
What is the purpose of the rule named T_SCC_BODY_TEXT_LINE? On my servers, it hits nearly every spam and ham email.

Rules beginning with T_ are test rules, and should have a very small score. So someone is testing some concept there. I don't seem to have this rule, so I can't say any more about what it does.

Loren
Re: rules for a sneaky SPEAR-VIRUS spam that gets past bayes
Just off the top of my head:

rawbody ONEDRIVE_DOWNLOAD m'https://onedrive\.live\.com/download[?]cid='
score ONEDRIVE_DOWNLOAD 0.5
describe ONEDRIVE_DOWNLOAD Download link to a file on Onedrive

Personally I'd be inclined to put an i on the end of that.

body FILE_PWD_INFO /\b(?:Fil lösenord|File password):\s[A-Z]{2}\d{4}\b/
score FILE_PWD_INFO 3
describe FILE_PWD_INFO Email has a password to an archive file

meta PWD_ONEDRIVE_DLOAD ONEDRIVE_DOWNLOAD && FILE_PWD_INFO
score PWD_ONEDRIVE_DLOAD 4
describe PWD_ONEDRIVE_DLOAD Email contains download for passworded Onedrive file

Loren
Re: OT - Hotmail/Outlook.com marking most of our email as Junk
Cian is rumored to have said: Anne, I am incredibly grateful for the offer. I sent my emails to the tester and to the support email. Hopefully, they come up with something actionable. If you get a useful result it might be nice to summarize it to the list. Loren
Re: CONTENT_AFTER_HTML: better not discuss formatting!!
Are you talking about the use of m'' as the regex delimiter? Yes. It will probably work just fine for the foreseeable future, as long as the input validation of rules files is lenient.

I think you may have a very hard time removing the m matching delimiters from SA. I suspect there are at least hundreds of rules like that in the release database. I have about a hundred local rules of my own that use that. Any time I have more than one backslash in a pattern, I use an alternate delimiter (usually single quote) so that I don't have to escape all the backslashes in the rule body. I'm not a fan of obfuscated rule bodies where it is impossible to tell what it is intended to match.

My experience is that any time you have to write \/ or \\ multiple times in a rule body, you are almost guaranteed to get the number of backslashes wrong, and the rule won't work. But of course it may work in some cases (like the one you used to test it) while not working in general. I don't have time in my life to deal with that sort of thing. It caused me enough grief when I started writing rules 20 years ago, which is why I started using m'.

BTW, that particular rule dates from RulesEmporium days, which was what, 2005 or so?

Loren
Re: CONTENT_AFTER_HTML: better not discuss formatting!!
No, I added that after observing multiple spams with random garbage after the closing HTML tag in the HTML body part. Presumably it was an attempt at Bayes poison, checksum avoidance, or some other filter evasion technique. I'll tighten it up.

FWIW, here is the rule I use. It obviously could be better, but I haven't noticed that it misfires.

full __GOODEHTML1 m''i
full __GOODEHTML2 m'(?:\s|=0A){0,50}(?:$|--|=)'is # stop on mime ending boundary
meta LW_BADEHTML1 (__GOODEHTML1 && !__GOODEHTML2)
describe LW_BADEHTML1 Bad ending - something after
score LW_BADEHTML1 1
Re: CONTENT_AFTER_HTML: better not discuss formatting!!
But, it had:
* 2.5 CONTENT_AFTER_HTML More content after HTML close tag
but one was only text/plain and I could see nothing wrong. reading 72_active.cf I found:
rawbody __CONTENT_AFTER_HTML /<\/htnl>\s*[a-z0-9]/i
which fires on a text/plain part that discusses html formatting!

Note you show __CONTENT_AFTER_HTML and CONTENT_AFTER_HTML, which are not the same rule. I suspect the meta for CONTENT_AFTER_HTML contains some other things that should in theory make it not hit in this case. I've personally never seen this rule hit, and didn't know it existed. Are you sure it isn't a local rule?

I have a rule of my own that gives 1 point for extra trash after the /html end tag. I see it frequently on spam and UCE that has a tracking tag in the HTML section after the official end of the html.

Loren
Re: resubmit mail or just delete
If they are more than a month or so old, just dropping them would probably be appropriate. Spam changes with time, and learning old spam patterns may not do you much good. If you aren't running Bayes, just dump all of them. If you are running Bayes, it might be worth running the last month or so thru SA-Learn as spam, as long as you are sure they are all spam. Just my opinion.

Loren
Re: Avoid processing upsteam trusted mail with X-Spam-Flag: YES?
that header should be on same host as the email clients read there mails, if its trusted outside of local mta, then its forged say X-Spam-Flag: NO do we want to trust it ?

I have a somewhat similar situation where the mail provider for my personal account runs filtering software that usually puts classification headers in the mail I receive from them. I ignore any statement they make about the mail being clean, but if they say it is spam, I add 2 points in SA. So far that has rarely resulted in FPs for me.

I think it is perfectly reasonable policy to trust a (semi)trusted upstream system that says the message is spam, and award points for that decision: either the upstream system's original score, or a new score on the closer host, as you wish. But unless it is a completely trusted host I'd never trust its decision of ham.

Unfortunately it is difficult (without trickery of some sort) to do that if the upstream system is also SA, since SA removes the incoming X-Spam-* headers.

Loren
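The policy is deliberately one-sided, and that can be stated in a few lines. This is a sketch in Python of the decision only, not of how SA applies it; the function name and the verdict strings are hypothetical, and the 2 points come from the message above.

```python
UPSTREAM_SPAM_POINTS = 2.0  # the 2 points mentioned in the message


def adjusted_score(local_score, upstream_verdict):
    """Add points when upstream says spam; ignore anything else it says."""
    if upstream_verdict == "spam":
        return local_score + UPSTREAM_SPAM_POINTS
    # A "ham" verdict, or no verdict at all, earns no credit.
    return local_score
```

In SA terms this would be a header rule with a positive score matching the provider's spam marker, with no corresponding negative-score rule for its clean marker.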
Re: Rawheader or Rawsubject? Or how to match UTF-8 Emoji in Header.
How do I do this? There is no rawheader or rawbody matcher as far as I could determine.

There is 'rawbody', but it may or may not help you. I seem to recall the Subject is prepended to the body text, but I don't recall if it is prepended to rawbody. You could try it. Short of that, you may have to fall back on 'full' and match for something like

full MY_SUB /\nSubject: \n/
Re: SPF_NONE scoring
So how is this score arrived at? I believe that scores of 0.001 are generally manually set, and not intended to be anything other than a visible marker that the rule hit. That is probably the case here. Loren
Re: Seeing "check: exceeded time limit in ..." and need to resolve it
What would be helpful here would be logging of when a rule *starts* evaluation. Normally that would be painful, but for tracking a runaway it would be useful. Perhaps I can code up something to capture that and log it on a timeout... Actually what sounds like it would be useful would be knowing the name of the rule that timed out. I'm presuming when the timeout occurs that there is still some indication of the current rule being processed so it can be killed. I'd think that should be enough to backtrack to the rule name. A modification to the timeout message could display the name of the rule and even how long it took to that point. I guess there might be multiple rules running when the timeout occurs and not know which one really timed out, but that would still be a small number of rule names. Loren
Re: Fw: spam from gmail.com
I have to admit I'd never paid much attention to the RCVD_IN_DNSWL_* scores on spam before. Looking at spam for last month, I don't have a single RCVD_IN_DNSWL_MED. But I do have 12 pretty blatant spams that hit RCVD_IN_DNSWL_HI. It makes me wonder just how useful a rule it is. Especially when it includes sendgrid as part of the "HI" reputation senders.

[ 66. 70.136.180] mta1.bevocalforlocal.info
[ 88. 80.190.164] 88-80-190-164.ip.linodeusercontent.com
[107.175.219. 38] dhrf266.medley.com.de
[107.175.219. 54] dhrf2106.realatelier.xyz
[107.175.219.103] dhrf2208.rollrs.xyz
[139.162. 81.182] 139-162-81-182.ip.linodeusercontent.com
[167. 89. 10.203] o1678910x203.outbound-mail.sendgrid.net
[167. 89. 10.203] o1678910x203.outbound-mail.sendgrid.net
[172.104.183.201] 172-104-183-201.ip.linodeusercontent.com
[172.105.221. 77] li1875-77.members.linode.com
[178. 79.178. 52] li347-52.members.linode.com
[185. 51. 39.149] static-185-51-39-149.uludns.net
Re: Unicode considered harmful again
In v4.x, Unicode support will be better. That also means it may be easier to make this sort of attack quieter in the future, as non-ASCII rules won't be definitively wrong as they are now.

The question is whether non-ASCII malicious rules could do anything more damaging than simply failing to match on the obvious strings "visible" in the rule, or alternately deliberately match on some string that should not be matched, in some form of DoS attempt. It's hard to see how someone could inject Perl (or any other) code with screwy rules. There was a time when Perl code was allowed in rules, but that was disallowed many years ago:

uri LW_PRINTIT /(^.*$)(?{ print "URI:\n$^N\nEnd URI\n\n" })/is

That was a real handy debugging rule once, but you can't get away with that anymore.

Loren
Sometimes the spammers should render the plain text manually v2.0
Or maybe they should just write a plain text body: Hello there, ! This is a test template...
Sometimes the spammers should render the plain text manually
Rather than just a direct translation of the obfuscated HTML: Clipxuck thDe button belo2bw avnd ewnd the confiIrmtion stTodeps
Re: Disabling autolearn on given rule
> (2) where would I go to look at building a plugin for this? Ideally something that ends up upstream, but though I can write code, I know no perl :).

Well, from the few I've seen, they all seem to have a relatively constant structure. Someone pointed you to a plugin that is at least dealing in this general area; that might be a good starting point, barring anyone else having a better suggestion.

While I wrote a little Perl a decade ago I've forgotten many of the peculiarities, but there are some good web sites out there, and there is one of the animal books on the subject. Perl is a bit peculiar in syntax and function compared to the C/C++ I did much of my career, but I didn't have much trouble picking up enough to make some local SA hacks long ago, so if you can program in most anything it probably won't be too much trouble.

I don't recall if Bayes itself is called from a plugin or from the main SA code, but I'm pretty sure it is only called if an internal 'autolearn' token is true for the message. If you make a plugin that runs late in the rule evaluation it should be able to look at the score and rule hits and items in the message header and body and decide if it wants to turn off the autolearn flag for the message. Hopefully there isn't something in main SA code that determines the value of this flag after all of the rules have run.

I guess one thing you might be able to do is implement a tflags flag of absolutely_no_autolearn or some such that would force-disable the autolearn decision if the rule had hit, but that might be something that would have to be put into the main SA code itself.

Maybe Henrick will chime in here. This may be really trivial if you know where to look.

Loren
Re: Disabling autolearn on given rule
> None of these seem to accomplish disabling learning for a specific rule

I think the problem is that I believe Bayes works off of the total score, and probably only sees rule names as more tokens, if it sees them at all. If it indeed works off the total score, about all you can do is somehow tweak that score for a given rule or rule combination.

Loren
An interesting bit of HTML from a spam
I found this little wonder in a bunch of spams I've been getting for the last few days: http://; http://; http://; http://; http://; http://; href="http:/mi.wey.vandalized655bccemetries.cleaning/id>">unsubscribe here I have no idea if that actually works, since I'm not about to try it. Loren
Re: Does anyone know what generates these email headers?
The originating PHP script header helps people who run shared servers track down the source of problematic mail. The two most common cases are: Does this look valid? X-PHP-Originating-Script: 48:class.phpmailer.php Just looking at a dozen or so of the spams I've gotten in the last couple days that match this pattern, they all have an x-originating-spam-status of -2.9, which makes me a little suspicious that that header is faked. Maybe the others are too. Loren
Does anyone know what generates these email headers?
I'm getting a lot of mails with some very curious headers in them. I tried searching with Google, and it has never heard of many of these strings. Does anyone recognize what might be generating these headers?

X-EOPTenantAttributedMessage
X-EmailAdvisor
X-Mxtb-Transitionid
X-MG-Subscriptionuid
X-PHP-Originating-Script
X-EmailTransmit-type
CMM-X-SID-Result
CMM-X-AUTH-Result
CMM-X-Message-Status
X-OutGoing-Spam-Status
X-EmailTransmit-aid
X-rext

Thanks!
Loren
Re: Website "help" spams
body NOT_INTERESTED /“[Nn]ot\S{1,5}[Ii]nterested\.?â€/

Might also be an interesting test. I assume the gibberish on the front and back is quotes in some character set or another, but they seem a little unlikely in a real mail.

Loren
Re: Another evil "order response" number
And yet another rather amusing one from a crypto trading scam:

The BTC wallet which you have to send is: 1GF1DcYFpe MoA4Ttj6TeWPK sJFRV43JjYc (PLEASE REMOVE THE SPACES FROM THE WALLET NUMBER)
Our trading system will automatically recognize your investment and start making profits for YOU!
If you need some additional info please text or contact me using Telegram: +44 775 254 0482
Another evil "order response" number
Thanks Regards, Billing Team Defender Firewall Protection +1, 888, 313, 1366
Re: number in sender name
Perhaps memory fails, but was there not, once, a standard rule that detected non alpha characters in sender name? The domain/provider is not of interest for this question. I think there was, but I suspect that the spam/ham ratio would be about even, which is probably why it doesn't show up now.
Another evil number
From a fake "subscription" spam: You can reach out to our Customer Support Team+1 (800) 781 - 2511.
Re: Maybe it's time to revive EvilNumbers?
A number of the rules I passed along are generic "order" rules rather than Amazon specific. I had to go back to last month's spam to find an Amazon order spam, but I've gotten a dozen or so fake orders for other things this month, all of which hit on the LW_BOGUS_ORDER rule. Loren - Original Message - From: Mark London To: users@spamassassin.apache.org Sent: Thursday, June 17, 2021 8:52 AM Subject: Re: Maybe it's time to revive EvilNumbers? Loren - Unfortunately, the fake amazon shipment email that we received, doesn't contain the word Amazon in it's From or Subject headers. Or even the word amazon in the text of the message! Just the Amazon logo. And they've removed all the URLs, so the links don't work at the bottom. And they left the postal address of amazon, without the word amazon. I hate bogus spam that is so obviously bogus that it avoids filter rules. :) - Mark On 6/17/2021 10:52 AM, users-digest-h...@spamassassin.apache.org wrote: Subject: Re: Maybe it's time to revive EvilNumbers? From: "Loren Wilton" Date: 6/16/2021, 8:18 PM To: Here are a handful of rules that work for me. Feel free to try them. If you do, please let me know how they work for you. (Apologies for my mail client trashing the formatting. Be sure to check for possible line wrap on some of the rules!)
Re: Maybe it's time to revive EvilNumbers?
Here are a handful of rules that work for me. Feel free to try them. If you do, please let me know how they work for you. (Apologies for my mail client trashing the formatting. Be sure to check for possible line wrap on some of the rules!)

Loren

body LW_PAYMENT /You\s+sent\s+a\s+Payment\s+of/i
score LW_PAYMENT 0.5
describe LW_PAYMENT You sent someone a payment

body LW_ORDER /\b(?:order|purchase)\s+(?:number|ID|date|description)\b/i
score LW_ORDER 0.5
describe LW_ORDER Contains order information

header __LW_SUB_INVOICE Subject =~ /\b(?:invoice|order)\b/
header __LW_FROM_INVOICE From =~ /\b(?:invoice|order)\b/
header __LW_ABC_LISTID List-Id =~ /\w{13}\s+\, some

meta LW_BOGUS_ORDER (__LW_SUB_INVOICE || __LW_FROM_INVOICE) && __LW_ABC_LISTID
score LW_BOGUS_ORDER 5
describe LW_BOGUS_ORDER Fake order or invoice

meta LW_SPAM_LISTID __LW_ABC_LISTID
score LW_SPAM_LISTID 1
describe LW_SPAM_LISTID The List_Id header seems to indicate spam

meta LW_FREEMAIL_ORDER FREEMAIL_FROM && (LW_ORDER || LW_PAYMENT)
score LW_FREEMAIL_ORDER 4
describe LW_FREEMAIL_ORDER An order receipt from a free email address

header __LW_SUB_AMZ_ORDER Subject =~ /^Your Amazon\.com order \#\d{3}-\d{7}-\d{7}\s*$/
header __LW_FROM_AMZ_ORDER From =~ /\"Amazon\.com\"\s+/
header __LW_REP_AMZ_ORDER Reply-To =~ /^no-reply\@amazon\.com\s*$/
body __LW_BODY_AMZ_ORDER /Amazon.com Order Confirmation/

meta LW_REAL_AMZ_ORDER __LW_SUB_AMZ_ORDER && __LW_FROM_AMZ_ORDER && __LW_REP_AMZ_ORDER && __LW_BODY_AMZ_ORDER
score LW_REAL_AMZ_ORDER -2
describe LW_REAL_AMZ_ORDER Amazon order confirmation

header __LW_FROM_AMZ From =~ /\bamazon\b/i
header __LW_SUB_ORDER Subject =~ /\border\b/i

meta LW_FAKE_AMZ_ORDER __LW_FROM_AMZ && __LW_SUB_ORDER && !LW_REAL_AMZ_ORDER
score LW_FAKE_AMZ_ORDER 7
describe LW_FAKE_AMZ_ORDER Amazon order phish
Re: Maybe it's time to revive EvilNumbers?
> My site is getting a lot of spam that is getting past spamassassin. Because it has a phone number to call, rather than a link to login using username and password. Mostly fake amazon purchases. They are getting past a lot of URL block lists because of that. FWIW. - Mark

I have a number of "purchase" rules that add about 30 points for fake Amazon (and other) scams. I haven't had one get thru in the last couple of months since I instituted them, but I only have a personal account and not a whole site, so YMMV. None of them look for phone numbers, but I do have a set of rules for a handful of stolen business addresses commonly used in spams I get. They add a few points when those show up.

Loren
Re: Header exists with a dollar sign in it
You could try

header X_SWITCH ALL =~ /^X-\$switch\b/sm

Loren
Re: Bayes autolearn: how does it resolve whether rules are body or header related?
so you don't have points from body rules. your mentioned URI_DEOBFU_INSTR is a meta rule:

meta URI_DEOBFU_INSTR __URI_DEOBFU_INSTR && !__MSGID_OK_HOST

so maybe it's not considered. They are treated as header, or ignored if marked as net.

I think a bug report should be submitted for this. Either they should be treated split 50/50 as header and body score, or when the metas are built they should have a "body rule" flag, and that used to determine where the score goes. I tried, but for some reason apache decided that I'm evil and blocked the submission attempt, so someone else can do it.

Loren
Re: How do I search and capture text for use in a rule?
I think the OP was trying to find a way to match "To: " to "Hi user". Loren
Re: ExtractText and docx
I'm trying to use the latest ExtractText plugin, but the docx2txt program the plugin references is no longer available from http://docx2txt.sourceforge.net The latest version appears to be 1.4 from several years ago. I just tried downloading the 1.4 version and the CVS version, and in both cases was rewarded with an archive file. Loren
Re: My 10 years old domain have a bad TLD
.pro have a -1 with SUSP_URI_NTLD_PRO. Is that really minus 1? Negative scores are good, they counteract spammy scores, which are positive. Loren
Re: SA seems powerless against marketing emails for SEO/web development
> I could add another point between BAYES_999 and BAYES_99 scores but that seems reactionary. Is there a better way? Should I throw in another point for certain keywords in marketing emails like these?

For this specific message I might be inclined to add a rule to check for a URL in the subject and add a point for that. I can't think of very many legit mails I've ever received with a URL in the subject. A point or two for that should be safe enough if it isn't spam, but could trip it over the edge if it is.

header URI_IN_SUBJECT Subject =~ /\b[-\w\._]+\@(?:[-\w]+\.)+(?:com|org|biz|cloud)\b/
score URI_IN_SUBJECT 1.5
describe URI_IN_SUBJECT A URI in the subject of the message

Something like that, maybe.

Loren
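A rough way to try the "URL in the subject" idea outside SpamAssassin is to run a pattern over a few sample subjects first. The Python sketch below is only an illustration: the pattern, the function name subject_has_domain and the sample subjects are all assumptions, and the pattern is a looser variant (bare domain or http(s) URL) rather than the rule as posted. The TLD list simply mirrors the one in the rule above.

```python
import re

# Hypothetical pattern: a bare domain or http(s) URL ending in one of a few TLDs.
# The TLD list is illustrative only; a real rule would need a longer list.
DOMAIN_IN_SUBJECT = re.compile(
    r"\b(?:https?://)?(?:[-\w]+\.)+(?:com|org|biz|cloud)\b",
    re.IGNORECASE,
)


def subject_has_domain(subject: str) -> bool:
    """Return True if the subject contains something shaped like a domain or URL."""
    return DOMAIN_IN_SUBJECT.search(subject) is not None


if __name__ == "__main__":
    # Made-up subjects, not taken from the thread.
    for subject in ("Boost your ranking at www.example.com today",
                    "Meeting notes for Tuesday"):
        print(subject_has_domain(subject), subject)
```

Each domain label is matched with `[-\w]+\.` so that multi-character labels such as "example." are accepted, and the trailing `\b` keeps a word like "your.computer" from counting as a ".com" hit.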
Re: Spoofed amazon order email
While I haven't received a forged Amazon order email in this exact form, there is all kinds of stuff here that could be caught with appropriate rules.

"In-case you require any change in order or like to cancel we recommend giving us call immediately at "

"In-case" is unlikely in mail; there should be no dash there. "giving us call" is missing "a" and is bad grammar, but typical of non-English speaking spam. "In case you require any change in order" is also poor phrasing. The whole "call us immediately to change your order" concept rates 3 points on my mail system. No phrase of any similar sort appears in a real Amazon order confirmation.

An actual Amazon order has a subject of the form

Subject: Your Amazon.com order #114-2489974-7888243

The Subject here is

Subject: IVK-1250703-9254770 | Apple Watch Series 6 Order Now Confirmed

The order number is in the wrong format.
The order number is in the wrong place in the subject text.
The subject text is in the wrong format.

An actual Amazon order confirmation has the headers, in this order:

From: "Amazon.com"
Reply-To: no-re...@amazon.com
To:
Message-ID: <010001774af541dc-d38f4184-621e-4014-a295-c520285ae319-00 0...@email.amazonses.com>
Subject: Your Amazon.com order #114-2489974-7888242

This mail has

From: "or...@amazon.com"
X-Google-Original-From: "or...@amazon.com"
Content-Type: multipart/alternative; boundary="===2707982310301423984=="
MIME-Version: 1.0
Subject: IVK-1250703-9254770 | Apple Watch Series 6 Order Now Confirmed
To: s...@dondley.com

The header order is completely different.
There is no Reply-To header.
The From address is completely wrong.
There should be no X-Google-* headers.
There should also be a header:

X-AMAZON-MAIL-RELAY-TYPE: notification

A real Amazon order receipt has Content-Type = multipart/alternative, but it only contains a text/plain part encoded in QP, with no HTML part. This message has an HTML part and should be getting MPART_ALT_DIFF.

"This email was sent from a customer service address kindly write us back if you have any concern. "

This is bad grammar and a very unlikely form of robot sending account notice. A real Amazon order contains

"This email was sent from a notification-only address that cannot accept incoming email. Please do not reply to this message."

This is a very standard phrasing for this sort of notice.

A real Amazon order confirmation does not contain an "unsubscribe" link. This phish does.

There is a lot of other stuff that could be caught by various rules, but a trivial set would be something like

#---
# 04/16/2021
# A bunch of rules to try to catch fake Amazon order confirmations, based on a
# message pasted to the SA Users list.
header __LW_SUB_AMZ_ORDER Subject =~ /^Your Amazon\.com order \#\d{3}-\d{7}-\d{7}\s*$/
header __LW_FROM_AMZ_ORDER From =~ /\"Amazon\.com\"\s+/
header __LW_REP_AMZ_ORDER Reply-To =~ /^no-reply\@amazon\.com\s*$/
body __LW_BODY_AMZ_ORDER /Amazon.com Order Confirmation/

meta LW_REAL_AMZ_ORDER __LW_SUB_AMZ_ORDER && __LW_FROM_AMZ_ORDER && __LW_REP_AMZ_ORDER && __LW_BODY_AMZ_ORDER
score LW_REAL_AMZ_ORDER -2
describe LW_REAL_AMZ_ORDER Amazon order confirmation

header __LW_FROM_AMZ From =~ /\bamazon\b/i
header __LW_SUB_ORDER Subject =~ /\border\b/i

meta LW_FAKE_AMZ_ORDER __LW_FROM_AMZ && __LW_SUB_ORDER && !LW_REAL_AMZ_ORDER
score LW_FAKE_AMZ_ORDER 7
describe LW_FAKE_AMZ_ORDER Amazon order phish

You might also like

body LW_PAYMENT /You\s+sent\s+a\s+Payment\s+of/i
score LW_PAYMENT 0.5
describe LW_PAYMENT You sent someone a payment

body LW_ORDER /\b(?:order|purchase)\s+(?:number|ID|date|description)\b/i
score LW_ORDER 0.5
describe LW_ORDER Contains order information

meta LW_FREEMAIL_ORDER FREEMAIL_FROM && (LW_ORDER || LW_PAYMENT)
score LW_FREEMAIL_ORDER 4
describe LW_FREEMAIL_ORDER An order receipt from a free email address
Re: Re: LANSET, do they create anything but SPAM?
Examples: https://pastebin.com/pF6Nmquc

Well, I can see a couple of simple rules that would catch these two, but I don't know if they would also trip on legit mail.

List-Unsubscribe: m'http://180e977\.olink1\.xyz'
X-Mailer-SID: m'\b180e977_18\b'
Re: Using spamassassin to thwart sharepoint phishing attacks
3.5 BAYES_99 BODY: Bayes spam probability is 99 to 100% [score: 1.]
0.5 BAYES_999 BODY: Bayes spam probability is 99.9 to 100% [score: 1.]

I have

5.0 BAYES_99 BODY: Bayes spam probability is 99 to 100% [score: 1.]
0.2 BAYES_999 BODY: Bayes spam probability is 99.9 to 100% [score: 1.]

I suggest raising BAYES_99 to at least 5.

Loren
Re: gmail hotmail picture and a lot of spam-rubish
We would need to see the original headers from the spam, or ideally the whole spam before we could say anything. It would also be helpful to see the rules it hit on your system. Loren
Re: Are X-MC-xxx headers legit?
I would not be so broad with that. I have 49 messages in my personal archives with X-MC-User headers, none of which I have classified as spam. Bill, do you see multiple X-MC- headers in the mails that come thru MailChimp? As in, "multiple many" or "multiple 2 or 3"? Or just the Users header? I can't tell from the MailChimp documentation whether the headers will be generally filtered from the final email message, or passed through. The majority of them are instructions to MailChimp to do something in either the headers or body of the message, so really it makes little sense to leave them in the final message. I can write rules to detect bogus values for quite a few of the headers, but the allowed text for a lot of the headers is moderately complicated, so gets to be a big and expensive regex. It would be a lot easier to just add points if there are say 3 or more X-MC headers in a row. But of course that is no good if MC does just pass all the direction headers through to the final messages it generates. Thanks, Loren
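The "3 or more X-MC headers in a row" idea can be tried offline before committing to a rule. The Python sketch below is only an illustration of that counting: the function name longest_x_mc_run, the threshold of 3 and the sample message are assumptions for the example, not anything taken from MailChimp documentation or from SpamAssassin.

```python
from email import message_from_string


def longest_x_mc_run(raw_message: str) -> int:
    """Length of the longest run of consecutive headers named X-MC-*."""
    msg = message_from_string(raw_message)
    longest = current = 0
    # msg.items() yields headers in the order they appear in the message.
    for name, _value in msg.items():
        if name.lower().startswith("x-mc-"):
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


# Made-up sample message with three X-MC headers in a row.
SAMPLE = (
    "From: someone@example.com\n"
    "X-MC-Track: 456\n"
    "X-MC-Autotext: global\n"
    "X-MC-Template: smiley\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)

if __name__ == "__main__":
    run = longest_x_mc_run(SAMPLE)
    print(run, "suspicious" if run >= 3 else "ok")
```

Counting a run rather than checking each header's value avoids the big, expensive regex, at the cost of being useless if legitimate MailChimp mail does pass these headers through.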
Re: Are X-MC-xxx headers legit?
Ah, OK. Looking at the MailChimp page, it appears that these headers appear on a message being sent to MC, and then it extracts them, most likely removes them from the final generated email, and uses them as processing instructions on how to generate the email or sequence of emails. In any case it seems rather unlikely that they should ever appear in a received email message. And whether they should or not, the values given for about 90% of the headers are simply invalid according to the MC page describing them. The headers that are valid are direct copies of the examples given on the MC page, and would not likely work for any real email campaign. I'd say that presence of X-MC-xxx headers in a received message is a 100% guarantee of a targeted advertising message, and a 99% guarantee that the message is a spam. If the values given for the options in the headers are obviously invalid, that rises to a 100% chance that the message is spam. I'd call these headers a great spam sign. Loren
Are X-MC-xxx headers legit?
I've started seeing a number of spams with the following block of X headers in it. I've never seen these before. While these look really fake to me (from the content of most of them), does any real tool or site make headers like this, or are they just from some spam tool and I can use them as a guarantee of spam?

x-mcpf-jobid: mc.us6_13712451.1216993.5a2d921a72084.full_03
X-MC-User: 6b669534c4be2b401d8744486
X-MC-Tags:45829
X-MC-Track:456
X-MC-Autotext:global
X-MC-AutoHtml: format
X-MC-Template: smiley
X-MC-MergeVars: {"_rcpt": "emailadr...@domain.com", "fname": "John", "lname":"Smith"}
X-MC-GM-Analytics: normal
X-MC-GoogleAnalyticsCampaign: good
X-MC-Metadata: { "user_id": "45829", "location_id": "111" }
X-MC-URLStripQS: link to
X-MC-PreserveRecipients: 123
X-MC-InlineCSS: used
X-MC-Subaccount: sendgrid
X-MC-ViewContentLink: {uurl}
X-MC-BccAddress: {ourl}
X-MC-Important: notes
X-MC-IpPool: 99%
X-MC-ReturnPathDomain: server1.tech
X-MC-SendAt: "AsianBeauties Team"
X-MC-MergeLanguage: {all language}
X-MC-MergeVars: {"var1": "global value 1"}
X-MC-MergeVars: {"_rcpt": "emailadr...@domain.com", "fname": "John", "lname":"Smith"}
X-MC-GoogleAnalytics: www.domain.com, domain.
X-MC-Metadata: { "user_id": "45829", "location_id": "111" }
X-MC-Metadata: { "group_id": "users_active" }
X-MC-Metadata: { "_rcpt": "f...@example.com", "user_id": "123" }
X-MC-Metadata: { "_rcpt": "b...@example.com", "user_id": "456" }
x-mcda: TRUE
Re: ReturnPath rule renaming
In order to bring the SenderScore/ReturnPath DNS reputation and blocklist rules up-to-date with their current ownership and administration, the rules are being renamed:

RCVD_IN_RP_CERTIFIED -> RCVD_IN_VALIDITY_CERTIFIED
RCVD_IN_RP_SAFE -> RCVD_IN_VALIDITY_SAFE
RCVD_IN_RP_RNBL -> RCVD_IN_VALIDITY_RPBL

John, you might add this text to the comment you made on Bug 6247. I read through your comment there, then went and scanned the entire comment stream in the bug (most all from 2009) to try to figure out what was being changed, and finally came up empty. There was no description of what the ownership change was, nor the administration change, nor any mention of what exactly had been changed in the rules.

Loren
No rule for fake payPal messages?
I just got this little wonder, and was surprised that it got thru as ham. From: "PayPal Billing" I've fixed that locally, but I'd think SA ought to have a rule for "PayPal" that doesn't come from paypal.
Has anyone heard of these people?
I just got what appears to be a legit email from my ISP. It has a tracker tag pointing to 102.122.207.net. Note that is a site name and not a dotquad. Somehow this doesn't make me real comfortable with the possible veracity of the email. Has anyone come across 102.122.2O7.net before? Loren
Re: Points for improbable Received header date?
why is date important ?, spamassassin do test it already DATE_IN_PAST * Well, the date is a spam sign. That is good enough for me to be important. And the DATE_IN_PAST * rules don't hit these spams. Loren
Re: Points for improbable Received header date?
and if you want to become an hero patches to document those evals are always welcome ;-) Well, if I use undocumented code I have to figure out, I always do my own documentation, since my memory these days is about five minutes long. The trick for me will be figuring out how I could submit those changes as a patch, since I'm an old mainframe and now Windows guy, and not used to creating Unix patch files. I guess there must be a tool somewhere to diff two file versions and create a proper format file. Loren
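On the question of a tool to diff two file versions: on Unix the usual choice is `diff -u old new`, which writes a unified diff. For anyone without that tool handy, the Python standard library can produce the same format. The sketch below is only an illustration; the function name make_patch, the file name Example.pm and the sample contents are made up, and whether the project wants patches in exactly this form is an assumption.

```python
import difflib


def make_patch(old_text: str, new_text: str, filename: str) -> str:
    """Return a unified diff turning old_text into new_text."""
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    return "".join(difflib.unified_diff(
        old_lines,
        new_lines,
        fromfile=f"a/{filename}",   # conventional a/ and b/ prefixes
        tofile=f"b/{filename}",
    ))


if __name__ == "__main__":
    # Hypothetical before/after contents of a file being documented.
    before = "sub check_foo {\n    return 1;\n}\n"
    after = "# Returns true when foo is present.\nsub check_foo {\n    return 1;\n}\n"
    print(make_patch(before, after, "Example.pm"), end="")
```

The output can be saved to a file and attached to a bug report; the same text works on Windows as on Unix.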
Points for improbable Received header date?
I'm getting a lot of spams that all have a series of completely bogus Received headers in them. A characteristic of these headers is a rather improbable datestamp, considering today's date: Received: from 69-171-232-143.mail-mail.facebook.com ([69.171.232.143]) by oxsus1nmtai03p.internal.vadesecure.com with ngmta id 0574d1a8-1628c15907fbaba1; Thu, 06 Aug 2020 18:30:56 + Note that this message must have been in flight for about a year and a half according to that header. Anyone know an easy way to check for a Received header date more than say a week old and add some points? Loren
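As an offline illustration of the check being asked about, the Python sketch below pulls the timestamp off the end of a Received header and compares it with the current time. It is not a SpamAssassin rule or plugin; the function names, the one-week limit as a default, and the sample header (including its "+0000" zone, which is cut off in the header quoted above) are assumptions for the example.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def received_age(received_value: str, now: datetime) -> Optional[timedelta]:
    """Age of a Received header's timestamp relative to now, or None if unparseable."""
    # The timestamp follows the last semicolon in a Received header.
    _, sep, date_part = received_value.rpartition(";")
    if not sep:
        return None
    try:
        stamp = parsedate_to_datetime(date_part.strip())
    except (TypeError, ValueError):
        return None
    if stamp.tzinfo is None:
        # Assumption: treat a timestamp without a usable zone as UTC.
        stamp = stamp.replace(tzinfo=timezone.utc)
    return now - stamp


def is_stale(received_value: str, now: datetime,
             limit: timedelta = timedelta(days=7)) -> bool:
    """True if the Received timestamp is older than limit."""
    age = received_age(received_value, now)
    return age is not None and age > limit


if __name__ == "__main__":
    # Hypothetical header modeled on the example; the +0000 zone is assumed.
    header = ("from mail.example.com ([192.0.2.1]) by mx.example.net "
              "with ngmta id 1; Thu, 06 Aug 2020 18:30:56 +0000")
    print(is_stale(header, datetime.now(timezone.utc)))
```

A header whose date cannot be parsed returns None rather than counting as stale, so a truncated timestamp does not add points by accident.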
ViraLife
Has anyone been getting spams from "ViraLife"? They have slowly started, one by one, hitting all of my email inboxes. It shows up about once a week as a "newsletter". It claims to be from a legit email hosting company I know nothing about, and I certainly have never signed up for this spam. Here are some headers for what I just got:

Received: from ver-vp1.custonews.net ([192.250.230.87]) by oxsus1nmtai04p.internal.vadesecure.com with ngmta id 3cd7cc16-165e208a6ec4ede9; Wed, 27 Jan 2021 15:31:36 +
Received: from localhost ([127.0.0.1]) by ver-vp1.custonews.net with SMTP; Wed, 27 Jan 2021 10:31:36 -0500 (EST)
Date: Wed, 27 Jan 2021 10:31:36 -0500 (EST)
Subject: Some articles from this week that might interest you..
From: "ViraLife"
Message-ID:
Re: More undetected hidden test spam signs
Right, but __STY_INVIS is currently tag-blind (it only looks for the style="" clause), so it hits that, and if lots of ham is hiding tracking images that way that might explain the poor S/O. I suspect that might be the case. The vast majority of invisible garbage I see is hidden in a ... pair, typically two per spam and about 50K in each one. Looking at the definition of the
Re: More undetected hidden test spam signs
On 16 Dec 2020, at 23:21, Loren Wilton wrote: I just got a batch of spams containing Such rules are there. Unfortunately, for whatever reason, lots of ham uses "invisible" text so it's not useful as a spam sign by itself and it's hard to come up with any useful combination rules. I think I may have figured it out - tracking images. Like: style="visibility: hidden !important; display:none !important; max-height: 0; width: 0; line-height: 0; mso-hide: all;"> Note in your example the display:none is in a contained tag and not in an opening tag of a span. The tag is probably fairly long because the URL is probably huge, but it is still the one item that is hidden. I put in a local rawbody rule for m'.{100,}(?:$|)'is and so far I haven't gotten any hits on ham. Of course that is a pretty heavy rule, but it would seem to indicate that hidden spans may not be that common in ham.
More undetected hidden test spam signs
I just got a batch of spams containing That was followed by about 2K bytes of garbage containing GUIDs and links to putatively some youtube video. The span was then terminated correctly, the body of the spam, and then the same garbage for about another 2KB. The small font rules didn't seem to catch this. Loren
Re: Possible spam sign
That probably should have hit at least one scored base rule: https://ruleqa.spamassassin.org/?rule=%2FFROM_2_ Nope. I think my rules are up to date, but maybe not. Feel free to pastebin it and I'll take a look. https://drive.google.com/file/d/1WQ0Mm1iUsKhTj51mFJwwehuTatSm8Nux/view?usp=sharing
Re: Possible spam sign
That probably should have hit at least one scored base rule: https://ruleqa.spamassassin.org/?rule=%2FFROM_2_ Nope. I think my rules are up to date, but maybe not.
Possible spam sign
I just received a spam with this interesting From address: From: "VA Rate Guide" I wonder if it is worth checking for mail from more than one sender at once? Loren
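Counting the addresses in a From header is easy to try offline. The Python sketch below uses the standard library's address parser; the function name from_address_count and the sample addresses are made up for the example (the addresses in the header quoted above did not survive in the archive), and it is not a SpamAssassin rule.

```python
from email.utils import getaddresses


def from_address_count(from_header: str) -> int:
    """Number of mailbox addresses found in a From header value."""
    return sum(1 for _name, addr in getaddresses([from_header]) if addr)


if __name__ == "__main__":
    # Hypothetical header with two senders.
    header = '"VA Rate Guide" <first@example.com>, <second@example.net>'
    print(from_address_count(header))
```

A count above 1 would be the condition to score on; how often legitimate mail carries several From addresses is a separate question that only a corpus check can answer.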
Are these valid email headers?
I don't have a Facebook account and don't know anyone on Facebook that would send me mail (and don't want to!), so I have absolutely no idea if these headers from recent spams are completely made up out of the air (and thus spam signs) or are valid headers. Can anyone tell me if this stuff is valid or obviously fake?

X-Facebook: from 2401:db00:1050:208b:face:0:4f:0 ([MTI3LjAuMC4x]) by www.facebook.com with HTTPS (ZuckMail);
X-Priority: 3
X-Mailer: ZuckMail [version 1.00]
X-Facebook-Notify: skipped_password_change; mailid=5ac39662d1c08G5af32c89e396G5ac39afc31edaG569
Feedback-ID: 509:skipped_password_change:Facebook
X-FACEBOOK-PRIORITY: 0
X-Auto-Response-Suppress: All
Require-Recipient-Valid-Since: gouldi...@earthlink.net; Sunday, 29 Nov 2009 00:17:08 +

Thanks,
Loren
Re: Apache SpamAssassin and Spammers 1st Amendment Rights
Keep in mind that freedom of speech says that you can stand in the park on a soapbox and shout. It does NOT say that passers-by are forced to stand there and listen to you until you run out of voice. They can walk away any time they want to. It also does not say that the local newspaper is required to take down everything you say and print it on the front page for all their readers. (But freedom of the press says that they can, and you can't sue them for copyright infringement or much of anything else for doing so, even though people and organizations now try that.) Whether any organization has an obligation to convey the voice in the park to all of its customers, or even a selection of them, is an interesting question. By and large, organizations in the US have the rights of individuals, and that includes the right to walk away and stop listening. The major possible exception would be a "common carrier", which cannot disconnect a call because they don't like what is being said. But ISPs are NOT common carriers. So even if the spammer is addressing spam to specific individuals, the ISPs, not being common carriers, do not have an obligation to deliver the spam.
Re: Problem with matching regex against long body
You may also want to stick optional whitespace in there to avoid trivial bypass: There's also the possibility of adding a typeface or other options to the tag, which would bypass your simple rule. And HTML is not case-sensitive. And avoid * on complex stuff when matching arbitrarily long texts, which can lead to runaway backtracking and scan timeouts.

Thanks. This spammer is prolific, but seems to be very stupid and pattern based, hardly ever varying what he puts in some parts of the message. I've been seeing this pattern without change for about 3 months now. I almost never have to tweak a rule for his stuff to account for a possible variation.

It would be interesting (at least to me) to run a set of test rules against the SA corpus to try to determine the optimal cutoff point for a good S/O as regards length of 0-point text. I personally have absolutely no idea what a "reasonable" size is for 0-point text in an email. Personally I'd be inclined to say that any 0-point text isn't reasonable, but mass marketers seem to believe otherwise.
Re: Problem with matching regex against long body
> See rawbody_part_scan in the docs. Also the chunking of the rawbody into 2-4 kB blocks may make a difference.

I wasn't able to find rawbody_part_scan in any of the docs that I managed to find, but after digging into the source I found the chunking logic and dug out the 2K limit. I'm not sure why I was hitting a limit at just under 1K; I can only guess that the header was included in the first rawbody chunk, which seems a little unlikely.

I was able to get the rule to work using a full rule, but I sure hated to do that, since I lose the base64 decoding of the body, and full rules are ugly and potentially dangerously inefficient. But at least it worked. Fortunately these spams are plain text encoded.
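The effect of chunking on a long match can be shown with a toy model. The Python sketch below is a simplification: it cuts text into fixed 2048-byte pieces and tests the pattern against each piece separately, which is an assumption made for illustration and not a reproduction of SpamAssassin's actual chunking logic. The names chunks, matches_whole and matches_any_chunk, the sample text and the pattern are all made up.

```python
import re


def chunks(text: str, size: int):
    """Yield consecutive fixed-size pieces of text."""
    for start in range(0, len(text), size):
        yield text[start:start + size]


def matches_whole(pattern: re.Pattern, text: str) -> bool:
    """Match against the complete text in one piece."""
    return pattern.search(text) is not None


def matches_any_chunk(pattern: re.Pattern, text: str, size: int) -> bool:
    """Match against each chunk on its own, as a chunking scanner would."""
    return any(pattern.search(piece) for piece in chunks(text, size))


# 3000 characters with no "<", then a tag: longer than a single chunk.
TEXT = "x" * 3000 + "<end>"
LONG_RUN = re.compile(r"[^<]{2500,}<", re.DOTALL)

print(matches_whole(LONG_RUN, TEXT))            # True
print(matches_any_chunk(LONG_RUN, TEXT, 2048))  # False
```

Any pattern that needs more contiguous text than fits in one chunk can never match when chunks are scanned one at a time, which is consistent with a rule working on the full message but not as a rawbody rule.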
Re: Problem with matching regex against long body
> basics of escaping at least *anything* won't do any harm
> php > echo preg_quote('[^<]*<');
> \\[\^\<\]\*\<

Well, escaping the [^<]* part certainly will do harm, since it will turn it from a group match into individual characters that don't exist in the text to be matched. But I've tried escaping the standalone characters like <, =, :, etc., and that doesn't help. I have many regex patterns without these escaped, so I'm pretty sure they work as expected normally, so should here too.
Problem with matching regex against long body
I'm getting lots of spams that are about 100+K long. The spam body contains two blocks of random news text copied from fox news or msnbc or the like, enclosed in a zero-point font block. I'm trying to match this simple pattern to give some extra points, but I can't seem to get it to work. I'm wondering if there is some buffer limit in SA that is preventing the match from working. If I try rawbody LONG_HIDDEN m'[^<]*<'s I don't get a match, even though I know there is a about 50K into the message. But if I try rawbody LONG_HIDDEN m'[^<]*'s I do get a match. Note all I've done is remove the final "<" from the match text. If I try rawbody LONG_HIDDEN m'[^<]{990,}'s I get a match. but if I try rawbody LONG_HIDDEN m'[^<]{997,}'s I don't get a match, but I know there is over 100K of text after that font tag. Can anyone see something I'm doing wrong, or know of some limitation in SA that will prevent these long matches from working? Thanks, Loren
Re: The most efficient SPAM implementation ever
> Can you please tell me how to generate that report? I believe he is asking for the results of something like spamassassin -t
Re: blacklisting the likes of sendgrid, mailgun, mailchimp etc.
https://krebsonsecurity.com/2020/08/sendgrid-under-siege-from-hacked-accounts/ also sheds light on the issue. SendGrid knows (or should know) that it has compromised accounts. It could find out what some of them are for free by downloading Rob's list of 25 or so compromised accounts. It could find out what some of the other 400 are for $15 each, and could find out what some of the major offenders are for $400 each. Let's see, 400 compromised accounts times $400 is $16,000. SendGrid or Twilio can't afford a $16,000 cash outlay to find the account names of the major compromised accounts? Their head of security probably gets that much a month in salary and bonuses. It would be a trivial expense. So what could they do once they knew which accounts are compromised? Are they helpless, and can only wring their hands and issue press releases saying They Have A Plan? No. They can SHUT THE DAMN ACCOUNTS DOWN. Issue refunds to the owners if they feel generous. Tell the owners to open new accounts with 2FA. But they won't do this, because they get their money from sending spam. Loren
Re: Amazon, dhl, fedex, etc. phishing
> We are regularly getting phishes from dhl, fedex, usps, amazon, netflix,
> spotify that fakes the from (eg. amazon wants
> to send me a amadon-legit.pdf). Usually these are previously unknown to
> pyzor, dcc, rbls, and domain reputation doesn't really exist[0].
>
> I'm wondering if anyone has made a rule that looks to see if the From
> contains amazon, but it is not amazon.com/.ca/.jp (all their TLDs), then
> score them up, if it wants to also drop a psd, or a tar.xz, or a png, or
> a pdf or whatever, then light them on fire.

I have rules similar to that to catch other things. I just made one for you to catch a spam that claims to be from USPS but is not. Simple modifications will catch other putative senders.

#---
# 08/24/2020
# Someone on the SA mailing list is upset about spams that claim to be from some
# reputable company, usually a package transfer company, but actually aren't.
# I have an example in today's spam, though it is caught by lots of other rules:
#
# From: USPS
header   NOT_FROM_USPS From =~ /\bUSPS\b[^<]*<[\w\-.]+\@(?!(?:[\w\-]+\.)*usps\.com\s{0,3}>)[\w\-.]+\s{0,3}>/
score    NOT_FROM_USPS 1
describe NOT_FROM_USPS Claims to be from USPS, but isn't

I'm also including two general rules that catch this sort of stuff most of the time.
#---
# 01/21/08
# Return-Path:
# Message-Id: <20080121072522.16582.qmail@comp2>
# From:
#
# The from and the return-path should match
# The from host and the message-id host should match
header   __FROM_SENDER ALL =~ m'Return-Path:\s+<([^\n>]+)>.*\nFrom:(?:[^<\n]+<\1>|\s+\1$)'si
header   __NULL_SENDER Return-Path =~ /<>/
meta     NOT_FROM_SENDER !__FROM_SENDER && !__NULL_SENDER
score    NOT_FROM_SENDER 1
describe NOT_FROM_SENDER Not from putative sender

# Return-Path:
# Message-ID: <7a9a01c85ca2$0fcbc910$c0a80102@Ricky>
header   __SENDER_MSGID ALL =~ m'Return-Path:[^\@\n]+\@([^>.]+).*\nMessage-Id:[^\@\n]+\@[\w.]{0,30}\1'si
meta     NOT_SENDER_MSGID !__SENDER_MSGID && !__NULL_SENDER
score    NOT_SENDER_MSGID 0.5
describe NOT_SENDER_MSGID Sender host doesn't match message-id host
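One way to sanity-check rules like these before deploying them is to run equivalent patterns over a few sample headers. The sketch below does that in Python for the USPS check and for the From versus Return-Path check. It is an illustration under assumptions, not SpamAssassin itself: Python's `re` stands in for the Perl regex engine, the header block is passed as one plain string in place of the `ALL` pseudo-header, the USPS pattern places its negative lookahead directly after the "@" so that it inspects the whole domain, and the helper names and sample addresses (on reserved example domains) are made up.

```python
import re

# Python port of the NOT_FROM_USPS idea. The negative lookahead is placed
# directly after the "@" so it inspects the whole domain (sketch assumption).
NOT_FROM_USPS = re.compile(
    r"\bUSPS\b[^<]*<[\w\-.]+@"
    r"(?!(?:[\w\-]+\.)*usps\.com\s{0,3}>)"
    r"[\w\-.]+\s{0,3}>"
)

# Python ports of __FROM_SENDER and __NULL_SENDER, applied to the full
# header block as one string (standing in for SpamAssassin's ALL).
FROM_SENDER = re.compile(
    r"Return-Path:\s+<([^\n>]+)>.*\nFrom:(?:[^<\n]+<\1>|\s+\1$)",
    re.S | re.I,
)
NULL_SENDER = re.compile(r"^Return-Path:\s*<>", re.I | re.M)


def claims_usps_but_is_not(from_header):
    """True when the display name says USPS but the domain is not usps.com."""
    return NOT_FROM_USPS.search(from_header) is not None


def not_from_sender(all_headers):
    """Mirror of the meta rule: !__FROM_SENDER && !__NULL_SENDER."""
    from_matches = FROM_SENDER.search(all_headers) is not None
    null_sender = NULL_SENDER.search(all_headers) is not None
    return not from_matches and not null_sender


# Hypothetical messages built on reserved example domains.
matching = (
    "Return-Path: <bounce@example.org>\n"
    "Message-Id: <1@example.org>\n"
    "From: Alice <bounce@example.org>\n"
)
mismatched = matching.replace("Alice <bounce@example.org>",
                              "Alice <alice@example.net>")
```

Calling `claims_usps_but_is_not("USPS Delivery <notice@usps.example.net>")` flags the header, while the same display name with an address at usps.com or one of its subdomains does not. `not_from_sender(mismatched)` fires, and a message with a null Return-Path is exempt, as in the meta rule.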
Re: Zero-point garbage text that isn't caught by the small-font rules
I've seen mail containing ONLY the text mentioned above, in which case it's strange. From the original mail I got feeling that the mails also contain mentioned text only... The original mails I clipped the original obfuscation text from were using it to hide a phishing attempt. I have not seen it used with no other content in my mail stream. However, from time to time I see a malformed spam that lacks content and just has the formatting. Perhaps that is what you are seeing. Loren
Re: SendGrid (Was: Re: Freshdesk (again))
money should not make the emails go around, likewise no president should be elected by money Well, no judge nor congressman should be elected by money either. But we changed the rules some decades back and legalized bribery, specifically in the payment of money to elect your favorite candidate. So, as functionally implemented in the current US Government, all elected officials *should* be elected by money, because that is the law. But that is very off topic, hopefully. Loren