Re: thanks to thinking people.
On Thu, 22 Jul 2010, Benny Pedersen wrote: On Wed 21 Jul 2010 19:09:55 CEST, Alexandre Chapellon wrote You can have forged return-path and/or stolen credentials... in both cases you look like a backscatter source. i believe postfix is smart to change forged sender to something that is not fqdn before it bounce :) A forged sender looks no different than a legitimate sender. Postfix would have no way to be 'smart' about this (except for some instances of SPF fail, but then why 'bounce'? Why not reject?). - C
Re: [sa] Re: thanks to thinking people.
On Thu, 22 Jul 2010, Benny Pedersen wrote: On Thu 22 Jul 2010 20:03:18 CEST, Charles Gregory wrote A forged sender looks no different than a legitimate sender. Postfix would have no way to be 'smart' about this (except for some instances of SPF fail, but then why 'bounce'? Why not reject?). and why not show logs ? Sorry. Not OP. Just noting that the opinion that postfix should be smart enough to rewrite a forged sender just doesn't make sense. bounces is never external since postfix change sender to mailer-daemon which will end in some mailbox local if it was sent from local ip Postfix doesn't change the sender. Mailer Daemon is the 'sender' for all bounces. But the bounce will be sent TO the original sender address (strictly the envelope sender, which usually matches the 'From' header). If postfix has generated the From header based on transaction authentication, then a 'bounce' would indeed go back to the originating mail account. But if you are merely going by IP, then the 'sender' that postfix tries to 'bounce' mail to will be the forged sender. And postfix has no way to know it is forged. - C
Re: [sa] Re: thanks to thinking people.
On Tue, 20 Jul 2010, LuKreme wrote: We are talking about Checking OUTBOUND messages. It is perfectly ok to bounce internal messages. Caveat: As long as proper care is taken to send the bounce to the authenticated sender of the mail and NOT just lamely use the 'From' header! Still prefer an SMTP reject over a bounce! - C
Re: First run score: 25.7 Second: 2.6
On Fri, 16 Jul 2010, Emin Akbulut wrote: X-Spam-Status: No, score=2.6 required=6.3 tests=HTML_IMAGE_ONLY_32, X-Spam-Status: No, score=2.6 required=6.3 tests=HTML_IMAGE_ONLY_32, X-Spam-Status: No, score=5.5 required=6.3 tests=HTML_IMAGE_ONLY_32, X-Spam-Status: Yes, score=24.4 required=6.3 tests=HTML_IMAGE_ONLY_32, (liberally snipped) There are commas at the end of these lines, implying you have trimmed the rest of the list of tests that account for the different scores. Go back and assemble the FULL logs, so that we can see the difference in what tests fire and what tests don't. Now if I have to GUESS on insufficient data, I would suspect that the 'port' of spamd to Windows(?) does not properly tidy up its children when finished. The fact that it crashes certainly points in this direction. May I presume that you did a 'full' memory test? To verify this situation, try running the same test as before, but leave a one minute gap between each run/test (and with no other spamd calls during that time interval!) so that we can see what happens when the spamd children have time to properly terminate. - C Ps. I'm not researching this deeply, so I may trip over some minor aspect of spamd coding/behaviour that the developers will call me on, I'm sure. :)
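Editor's sketch: once the full X-Spam-Status headers from two runs are available, the comparison asked for above (which tests fire on one run but not the other) can be done mechanically. This is a minimal Python sketch, not part of SpamAssassin; the function names are hypothetical and the second-run test list in the usage note is illustrative only, since the original post trimmed it.

```python
import re

def parse_status(value):
    """Return (score, set_of_test_names) from an X-Spam-Status header value."""
    score = float(re.search(r"score=(-?[\d.]+)", value).group(1))
    # The tests list is comma-separated and may be folded across lines.
    match = re.search(r"tests=((?:\w+(?:,\s*)?)*)", value)
    names = match.group(1).split(",") if match else []
    tests = {n.strip() for n in names if n.strip() and n.strip() != "none"}
    return score, tests

def diff_runs(first, second):
    """Return (tests only in the first run, tests only in the second run)."""
    _, a = parse_status(first)
    _, b = parse_status(second)
    return sorted(a - b), sorted(b - a)
```

Usage: pass the two header values as strings, for example the 25.7 run and one of the 2.6 runs, and `diff_runs` lists the rules that stopped firing.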
Re: [sa] How to block a network
On Fri, 16 Jul 2010, Igor Chudov wrote: I receive a large number of spams from network IPs belonging to SharkTech, 70.39.69.99 or so and so on. Does Ubuntu use the 'iptables' firewall? Throw it in there, and forget even the wasted initial SMTP connections. - C
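Editor's sketch: a firewall block of this kind boils down to one iptables rule per network. The Python below only builds the command line; it does not run it. The /24 prefix is an assumption for illustration (the post gives only one address "or so"), so check the real allocation before blocking anything. `drop_rule` is a hypothetical helper name.

```python
import ipaddress

def drop_rule(cidr, port=25):
    """Build an iptables command that drops inbound TCP to `port` from `cidr`."""
    # strict=False lets a host address such as 70.39.69.99/24 be
    # normalised to its network address.
    net = ipaddress.ip_network(cidr, strict=False)
    return ["iptables", "-I", "INPUT", "-s", str(net),
            "-p", "tcp", "--dport", str(port), "-j", "DROP"]

if __name__ == "__main__":
    # Prints the command; running it requires root privileges.
    print(" ".join(drop_rule("70.39.69.99/24")))
```

Dropping only port 25 keeps the block limited to SMTP; omit the `-p`/`--dport` arguments by hand if the whole network should be refused.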
Re: First run score: 25.7 Second: 2.6
On Wed, 14 Jul 2010, Matt Kettler wrote: On 7/14/2010 11:27 AM, Emin Akbulut wrote: I noticed randomly while I was testing SA. All I did is below: WinSpamC realspam.txt result1.txt NET STOP Spamassassin NET START Spamassassin WinSpamC realspam.txt result2.txt WinSpamC realspam.txt result3.txt result1: under 6.3 result2: very high result3: under 6.3 That is quite strange.. sounds like you've got DNS timeout problems. No, it's something more than that. Go back to the original test and there are other tests that stop firing like that FROM_IN_SUBJECT. It almost seems like some of his spamd children are failing to load all their parameters. Noting the frequent crashes mentioned in another post, I would say that there is something to it. I suggest to OP that he try the spamassassin executable, to see if this score anomaly repeats itself. If it is only happening on spamd, then somehow those crashes point to a problem. If the load is not too high, he could even use spamassassin for production. I do. And 99% of the time it works fine. (Footnote for people who will inevitably ask: My glue doesn't seem to like the way spamc returns the original mail.) - C
Re: [sa] Re: First run score: 25.7 Second: 2.6
On Thu, 15 Jul 2010, Emin Akbulut wrote: spamassassin.exe always calculates the same/correct score. Good... Good. spamd second run reports only a few tests. Is it OK? I mean spamd runs all test but only adds which one increases score to its report? Or these tests are processed tests list only? First run has tons of tests, second run has only 5 tests. I am presuming, by your description, that the exact same *unmodified* file is passing through spamc/spamd all three times, and that there are no other variables. The spamc calls are literally one after the other, with no change of userid or other change that would possibly lead to a different set of configuration files being read. So this means that it is spamd itself that is 'different' on the second execution. You are going to need to enable verbose logging for spamd and do these three tests and see what messages appear in the logs (presumably) showing a failure to load config files on the second run. Is it possible that the file LOCKING on your system prevents spamd from accessing certain files under certain circumstances? What happens if you run ANY other message through spamc as the 'second' run, and then run the third one on the original file? Is spamd sensitive to it being the same message, or does it just mess up on *whatever* the second message would happen to be? Timing or content? - C
Re: First run score: 25.7 Second: 2.6
On Wed, 14 Jul 2010, Bowie Bailey wrote: First run: --- X-Spam-Status: Yes, score=25.7 required=6.3 tests=HTML_IMAGE_ONLY_32, HTML_IMAGE_RATIO_02,HTML_MESSAGE,LOCALPART_IN_SUBJECT What sticks out to me is that most of the missing score hits on the second run are from blacklists. Quite true. What also sticks out to me is that test LOCALPART_IN_SUBJECT disappears, which means that the headers on the second run are substantially different from the headers on the first run. SOMETHING is severely mangling the mail between the two runs, and quite obviously this degrades spamassassin's capability to detect spam. I suppose I should ask (of the OP) WHY there are two runs at all? - C
Re: How to stop weird From: crap?
On Mon, 12 Jul 2010, Michelle Konzack wrote: From: Coupon Dept. CouponDeptdOS_V`CcOP IW^GIdATOn2PbJK_/v...@perezcentral.com I realize that the spammers will soon recognize that you are filtering them, but for the moment, why not score heavily on the 'unusual' characters inside these coded addresses? header LOC_WEIRD_FROM From =~ /[...@\]*[\^\`\ ]...@\]*@/ score LOC_WEIRD_FROM 2 # not too high a score, just enough to tip them over... # note: the '[...@\]*' confines the match to within a local address part - C
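Editor's sketch: the rule as archived has been mangled by address obfuscation, so the pattern below is a reconstruction of the idea, not the posted rule. It assumes the intent described in the message: start at an angle bracket, stay inside the local part (no '@' or '>'), and look for a caret, backtick, or space before the '@'. The sample addresses are invented.

```python
import re

# Hypothetical reconstruction: '<', then local-part characters, one
# "unusual" character (^, ` or space), more local-part characters, then '@'.
WEIRD_LOCAL = re.compile(r"<[^@>]*[\^` ][^@>]*@")

def has_weird_from(from_header):
    """True if any bracketed address in the header has an unusual local part."""
    return bool(WEIRD_LOCAL.search(from_header))
```

Because the search is not anchored, it examines every bracketed address on the header, which matches the stated preference for checking all addresses rather than only the first.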
Re: [sa] Re: How to stop weird From: crap?
On Mon, 12 Jul 2010, Karsten Bräckelmann wrote: header LOC_WEIRD_FROM From =~ /[...@\]*[\^\`\ ]...@\]*@/ # note: the '[...@\]*' confines the match to within a local address part Using From:addr instead is better and more accurate. Provided the spammer doesn't use more than one address on the From header. :) That RE is more complicated than it needs to be, yet might even match the real name. From is not From:raw. From:raw, according to docs, only prevents decoding of quoted printable and base 64 strings, and preserves whitespace. So the RE, as given, looks for the angle bracket at the beginning of ALL possible addresses, and scans for the undesirable characters. I don't see any unnecessary complexity in the RE (except that yes, you could use From:addr and eliminate the sections that pin down the address, but I've already explained I prefer an RE that captures ALL addresses, not just the first). As a footnote to OP, these characters ARE 'legal' even though rarely used. That's why you can't score too high... But I posted that solution yesterday already. Coming late to the show, eh? ;) 1) Sysadmin's New Year's Resolution: I will read all posts before responding. 2) Sorry, I got used to seeing so much *discussion* trying to dissect what was, to me, an obvious problem that I got fed up with it, and figured no one else was posting a rule, so I would... Great minds, and all that? :) - C
Re: Move SPAM to directory and notify user
On Fri, 9 Jul 2010, Jose Luis Marin Perez wrote: In a CentOS 4.7 server I installed qmail + simscan + ClamAV + Spamassassin 3.3.0 that is working properly. Now my intention is that when a mail is considered SPAM this is moved to a folder called SPAM and in turn notifies the user (via email) so you can review it. Is it possible? If your Spamassassin is properly adding a header showing the Spam 'score' as a row of asterisks, then you can check for that header in procmail and deliver accordingly. I would suggest, to avoid having a notice for every piece of spam, try instead to have a cron job that checks the spam folder for *new* mail, once nightly, and sends a single message to the user. Of course, if the user is always getting spam, then that notice gets ignored pretty quickly, so you may want to decide whether it is worth the trouble. - C
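Editor's sketch: the nightly notice suggested above could look roughly like this in Python, assuming the spam folder is a Maildir (so unseen mail sits in its `new/` subdirectory). The paths, function names, and wording of the notice are all hypothetical; actually sending the message (for example via the local MTA) is left out.

```python
import os
from email.message import EmailMessage

def count_new_spam(maildir):
    """Count files in <maildir>/new, i.e. messages the user has not yet seen."""
    new_dir = os.path.join(maildir, "new")
    try:
        with os.scandir(new_dir) as entries:
            return sum(1 for entry in entries if entry.is_file())
    except FileNotFoundError:
        return 0

def build_notice(user, count):
    """Return one summary message, or None when there is nothing to report."""
    if count == 0:
        return None
    msg = EmailMessage()
    msg["To"] = user
    msg["Subject"] = f"{count} new message(s) in your SPAM folder"
    msg.set_content("Please review your SPAM folder for misfiled mail.")
    return msg
```

Run once per night from cron for each mailbox; returning None on an empty folder avoids sending a notice when nothing new arrived.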
Re: SA checking of authenticated users' messages
On Wed, 7 Jul 2010, Louis Guillaume wrote: (spamass-milter doesn't tell SA about auth) == [ rbl checks run against authenticated user's IP address lack of ALL_TRUSTED for authenticated user's mail That last one seems to be my problem. Does the patch fix this? I'll try updating and see what happens. Hi Again! I just need to clarify one thing that's not clear to me in re-reading our thread from the other day: Is there a work-around for this? My users are getting restless. Every time their ISP changes their IP address I have to whitelist them! Uh, I missed the original thread, so maybe this was explained, but why aren't the users sending mail through their ISP's SMTP server? Presuming there is a good answer for this, then, have you considered just whitelisting based on the user's From: header? There's a trick to it: 90% of the time, spammers have a harvested address, but *don't* have the NAME portion of the user's From: header. So build a rule that matches their WHOLE 'From:' header, like this: header LOC_FROMOURUSER From =~ /^User Name theiraddr...@example.com/ Notice the absence of the commonly used 'i' flag on the regex. If they have quotes around their name, include them in the regex. The entire line should *exactly* match what the user's MUA generates. The only thing that messes this up is when users have the annoying habit of changing their 'name' on their mail... Naturally, there is a small risk of having a spammer send a message with exactly that header, but really, how many of those will there be? - Charles
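Editor's sketch: the point of matching the whole header without the 'i' flag is that a harvested address alone, or a differently cased name, fails to match. The Python below illustrates that behaviour only; it is not a SpamAssassin rule, and the name and address are placeholders.

```python
import re

def make_exact_from_matcher(exact_header):
    """Build a case-sensitive matcher for one complete From: header value."""
    # re.escape keeps dots, quotes and brackets literal; no re.IGNORECASE.
    pattern = re.compile("^" + re.escape(exact_header) + "$")
    return lambda value: bool(pattern.search(value))

is_our_user = make_exact_from_matcher('"User Name" <user@example.com>')
```

A header that carries only the bare address, or the same name in different case, will not match, which is the property the whitelist trick relies on.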
Re: Problems with File::Scan::ClamAV
On Sat, 3 Jul 2010, sebast...@debianfan.de wrote: i have a debian Lenny system with SpamAssassin version 3.3.1 running on Perl version 5.10.0. Is it running properly? I had installed clamav and i got a problem by installing file::scan::clamav. How is this connected to spamassassin? My first feeling on this is that you would have better luck posting this to the Debian lists. Or perhaps contact the author... I presume you have already googled for your errors :) - C
Re: Whitelist programmatically
On Sat, 26 Jun 2010, Massimiliano Giovine wrote: What does it do? How can i read the documentation of the spamassassin behavior with whitelisting? Firstly, the behaviour of the various whitelist options is described in the Mail::SpamAssassin::Conf documentation. There is a copy on the web at: http://spamassassin.apache.org/full/3.3.x/doc/Mail_SpamAssassin_Conf.html Now I have to ask what functionality you are trying to achieve that is not already in SA? Are you simply trying to give your users a 'friendly' way to add whitelist entries to their spamassassin config? If so, and the volume of entries is not large, I would suggest you use an 'include whitelistfile' command in the user's .spamassassin/user_prefs and then use whatever user interface you like to put a listing of whitelist commands into that file. Using a separate file would avoid issues with a script error corrupting the main user_prefs file. - Charles 2010/6/22 Martin Gregorie mar...@gregorie.org: On Tue, 2010-06-22 at 07:28 +0200, Massimiliano Giovine wrote: Really thanks for the answers. So, i need to configure my spamassassin installation to use the running database (i'm already using a mysql database for other reason) for whitelisting or i have to write the logic of a whitelist using my database installation? You can do it all in SA. The steps are: 1) add another table to the database. This need only have a single column that contains the list of e-mail addresses you want to whitelist. The column needs to be the primary key, which is normally indexed. The e-mail address needs to be indexed for good performance. 2) you need a way of adding addresses to the table. If you're happy to use SQL you can use the MySQL interactive SQL tool or wrap it in a shell script to implement a shell command like whitelist someb...@example.com 3) of course you need some form of backup, but MySQL's standard database backup and restore tools should do just fine.
If you already have a whitelist, you can easily load it into the database with the MySQL bulk loader. 4) you need to write or otherwise obtain a Spamassassin plugin to access the database and a rule to call the plugin. My whitelisting plugin interrogates a database view containing a moderately complex query. This appears to the plugin as the sort of table I've just described. If I was implementing your plugin I'd: - define a table that uses my view name as the table name and contains the same column name. This way I could use my existing plugin to access the whitelist table without any SQL changes, i.e. create table whitelist ( email varchar(80) primary key ); - Modify my plugin to work with MySQL. I use PostgreSQL as my database but I think the changes would be minimal - possibly little more than configuration changes. I've never used MySQL, so can't be more definite. Is there something i can read to go deep into this argument? There isn't a lot. There's an SA document about writing plugins, which is quite helpful. I found it was easy enough to read that and then grab a plugin that accessed a database and modify that, but I do know some Perl and understand object-oriented programming. You need both to successfully create a plugin without too much trial and error. I found that figuring out the database access was easy enough, but the SA facility for configuring a plugin, i.e. telling it what sort of database to access and where to find it, was poorly documented and did need quite a bit of experimentation to get right. Caveat: As I've never used MySQL the preceding description assumes that it has all the tools that come as standard with every other SQL database I've used. Martin -- -Massimiliano Giovine Aksel Peter Jørgensen says: Why make things difficult, when it is possible to make them cryptic and totally illogic, with just a little bit more effort? Blog: http://opentalking.blogspot.com Linus Torvalds doesn't die, he simply returns zero.
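Editor's sketch: the single-column whitelist table and the "add an address" step described in this thread can be tried out without a MySQL server. The Python below uses the standard-library sqlite3 module purely as a stand-in; the function names are hypothetical, and `INSERT OR IGNORE` is SQLite syntax (MySQL spells it `INSERT IGNORE`). The SpamAssassin plugin that would query the table is not shown.

```python
import sqlite3

def open_whitelist(path=":memory:"):
    """Open the database and create the one-column whitelist table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS whitelist (email VARCHAR(80) PRIMARY KEY)"
    )
    return conn

def add_address(conn, email):
    """Add an address; adding it twice is harmless because of the primary key."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO whitelist (email) VALUES (?)",
            (email.lower(),),
        )

def is_whitelisted(conn, email):
    """Primary-key lookup, so it uses the index mentioned in step 1."""
    row = conn.execute(
        "SELECT 1 FROM whitelist WHERE email = ?", (email.lower(),)
    ).fetchone()
    return row is not None
```

Lower-casing on both insert and lookup is a design choice of this sketch, made so that address case does not cause misses; drop it if local parts must stay case-sensitive.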
Re: Whitelist programmatically
On Sat, 26 Jun 2010, Massimiliano Giovine wrote: You guessed right! It's a little bit more complicated but the target is what you said! If i write into user_prefs i have to restart spamassassin service? H Not sure about that one. I know you have to restart spamd for changes to the site-wide config, but it wouldn't make sense to have to restart for every user change Easy enough to test out... Make some changes and see if they take. So, what are the complicated bits? :) -C
Re: [sa] Re: NO_RELAYS spam
On Fri, 18 Jun 2010, Randy Ramsdell wrote: I have no problem going over there but I am not convinced that the Amavis program is the problem. The header field is changed by spamassassin. Doesn't the email simply get handed to Spamassassin by Amavis where the headers are modified by spam report etc...? The headers are missing. Spamassassin records this fact, but is not responsible for it. So find out what happens to your message BEFORE spamassassin is called. Amavis is just a suggested starting place. And if it is to blame, someone on their list will recognize your query as soon as you post it. Suggestion: After each step of your mail processing, if you can, save a copy of the mail to a log file. At least that way you get a quick overview of *which* component removes those headers. - C
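Editor's sketch: if a copy of the message is saved after each processing step, the step that strips headers can be found by counting Received headers in each copy. The Python below assumes the copies are available as raw message strings in processing order; the stage names and helper names are hypothetical.

```python
from email import message_from_string

def received_count(raw_message):
    """Number of Received: headers in one saved copy of the message."""
    return len(message_from_string(raw_message).get_all("Received", []))

def first_stage_dropping_headers(snapshots):
    """snapshots: list of (stage_name, raw_message) in processing order.

    Returns the first stage whose copy has fewer Received headers than the
    copy saved by the previous stage, or None if the count never goes down.
    """
    previous = None
    for name, raw in snapshots:
        count = received_count(raw)
        if previous is not None and count < previous:
            return name
        previous = count
    return None
```

Each hop normally adds a Received header, so a count that goes down between two snapshots points at the component in between.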
Re: NO_RELAYS spam
On Thu, 17 Jun 2010, Randy Ramsdell wrote: The original email did not hit the NO_RELAYS rule but subsequent runs through do hit this rule and it isn't on all email. This sounds to me like you are 'resending' the mail from a local address to your mail server, rather than 'feeding' the original mail back into spamassassin. If this is the case, then you would naturally produce a new set of headers, and there would be no external relays, thus triggering the NO_RELAYS rule. Original rules hit. X-Spam-Status: No, score=-0.394 tagged_above=- required=5 tests=[BAYES_00=-2.599, HTML_MESSAGE=0.001, RCVD_IN_SORBS_WEB=0.619,URG_BIZ=1.585] Right there, we see 'RCVD_IN_SORBS'. This would not happen even if your own server was blacklisted with SORBS. There *was* a Received header for a relay, and somehow you have 'removed' it, either via a filtering mechanism outside SA, or by 'resending' or 'forwarding' the mail. After running spamassassin -D If this is what you used, then the forwarding and header rewriting must have occurred prior to this. Did someone 'forward' the spam to you as a complaint? Users often fail to properly forward with full headers enclosed. - C
Re: NO_RELAYS spam
On Thu, 17 Jun 2010, Randy Ramsdell wrote: Hmmm, this mail came in and went straight to the users inbox. 1. Postfix --- 2. Amavis ( Spamd/Clamd) --- 3. Postfix --- 4. Dovecot-deliver So the problem is somewhere during the 2 --- 3 or step 3 or 4. Step 4 it is unlikely since Deliver simply send the file to a directory location. I'm afraid I'm going to have to side with the people who suggested that something in the above steps is deleting headers. Postfix is pretty much guaranteed to add at least one Received header, even if it is just 'Received from localhost'. so if you can guarantee that Step 1 is being done, then something in a later Step is removing headers. Good luck with finding it! :) - C
Re: Please Help with SA Rule: FH_HOST_IN_ADDRARPA
On Thu, 17 Jun 2010, gwilodailo wrote: I've discovered that some mail between two of my clients (on separate hosts) is getting flagged as spam, because of this rule (FH_HOST_IN_ADDRARPA). I'm not at all an expert with spamassassin, and I'm having some difficulty finding what this rule is about and what to do about it. Your reverse DNS lookup for the hostname resolves to a string containing 'in-addr.arpa'. This can be corrected by setting your reverse DNS zone to a real hostname for the IP. If you are not in control of the DNS you may have to talk to your upstream provider. If you are only doing this internally, and never send external mail from that host, you can just add a whitelist entry for that hostname. -Charles
Re: SpamAssassin Integration
On Wed, 16 Jun 2010, Gnanam wrote: I want to integrate SpamAssassin in my web-based application to test spam score of the email content... If this is your own custom web software, then it is as simple as adding a call to spamassassin (or spamc) in the same area of the script that validates things like the format of e-mail addresses. You can keep it simple and just report spamassassin's exit code, or you could parse the results from SA and pass them back to your user, so that they know what rules were triggered, and how to correct their e-mail. If your web interface is a pre-packaged piece of software, then it likely sends mail via your local SMTP server by calling 'sendmail' or an equivalent function that mimics that command. As long as the web client handles SMTP rejections and notifies users of problems sending, you should be able to run spamassassin normally in the context of your outgoing mail server. - Charles
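Editor's sketch: for the custom-software case, the "keep it simple" option can be a thin wrapper around `spamc -c`, which prints `score/threshold` and exits with status 1 when the message is judged spam. The Python below assumes spamc is on the PATH and a spamd is reachable; the function names are hypothetical, and reporting which rules fired would need a different spamc mode that is not shown here.

```python
import subprocess

def parse_check_output(text):
    """Parse the 'score/threshold' line printed by 'spamc -c'."""
    score, _, threshold = text.strip().partition("/")
    return float(score), float(threshold)

def check_message(raw_bytes, spamc="spamc"):
    """Pipe one raw message through 'spamc -c' and summarise the verdict."""
    proc = subprocess.run([spamc, "-c"], input=raw_bytes,
                          stdout=subprocess.PIPE)
    score, threshold = parse_check_output(proc.stdout.decode("ascii", "replace"))
    return {"is_spam": proc.returncode == 1,
            "score": score,
            "threshold": threshold}
```

The web script would call `check_message` at the same point where it validates address formats and show the score to the user before sending.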
Re: More large spam....
On Sat, 12 Jun 2010, Karsten Bräckelmann wrote: Please do not hijack a thread. Please do not hit Reply, if you do not intend to reply and contribute to that thread. Removing all quoted text and changing the Subject does *not* make it a new thread or post. (Hint: In-Reply-To and References headers.) (grumble grumble) Stupid mail programs (grumble grumble) Yeah okay. Not so stupid. I'll comply... Footnote: and I was refraining from commenting on another thread on how people 'complain' about features of SA that don't work in ways that match *their* style of thinking... Oh, the irony :) Has there been any progress... No changes since this has been asked the last time. (nod) Alright. So far this is still a less than once a week phenomenon, for me personally. I just raise it occasionally to put a data point into the archives. If my inquiry had shaken loose a bunch of 'me too' comments, it might have led somewhere. But it hasn't, so the issue remains on the far back burner :) There are just a very few rules scanning non-textual parts of a mail. Large-ish binary attachments don't have much of an impact on performance. Large-ish textual attachments potentially do. Now THAT is a curious comment. All the usage guidelines I have ever read implied or outright stated that scanning mails over a certain size was a significant degradation to system performance. Am I confusing the guidelines for antivirus programs with those for SA? Would it be 'safe' to run SA on messages with larger attachments? Anyone ever tested this? - C
Re: Set for Whitelist Only?
On Sat, 12 Jun 2010, andrewj wrote: I am migrating to a new server with SpamAssassin. I have a well-known email address which is a common spam target, and I want to set it up so that only addresses on my whitelist are allowed, everything else is automatically blacklisted. How do I set this up? Other advice on whitelisting aside, if your statement implies that you are starting to use spamassassin on mail that was previously unfiltered, you might want to see how much spam actually still arrives in that mailbox once SA is doing its job. I found that even some of my hardest hit mailboxes suddenly dropped down to a manageable 3-4 spams delivered per day when I got SA working on them. - C
More large spam....
I got another 1MB spam today. I still don't want to kill my system by attempting to scan every large mail that comes in. Has there been any progress on an 'option' to scan only text portions of mail past a certain size limit and/or scan only the first X bytes? The former is preferable because it avoids any issues with incomplete mail, or text sections being last - Charles
Re: Performance problem body tests
On Thu, 3 Jun 2010, Helmut Schneider wrote: I then started from scratch and tried with SA 3.2.5. The particular body_tests take only 5 seconds (instead of 30). As I mentioned before, I noticed this difference myself, and presumed it was just a characteristic of the 'improved' logic for deep-scanning the body of emails, and perhaps just a larger number of rules than before... Though I am still intrigued by your comment that this happens only on 'some' e-mails, not all. Apologies if I missed a response, but was there any difference noticeable for the mails that process quicker? - Charles
Re: Performance problem body tests
On Thu, 3 Jun 2010, Mark Martinec wrote: Here is one common problem of 'certain mail messages' taking a long time to process - unresolvable for now: https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5590 Sorry, but that bug has been around since 3.2.3 - it would not explain a sudden sixfold increase in processing time from 3.2.5 to 3.3.1. - C
Re: [sa] Performance problem body tests
On Wed, 2 Jun 2010, Helmut Schneider wrote: with certain mails on FreeBSD 8.0 and SA 3.3.1 I have a performance problem: What distinguishes 'certain mails'? Length? Content? MIME attachments? So the body tests take ~ 30 of 37 seconds. It's not a load problem, I noticed a significant increase in processing time when I upgraded from 3.2 to 3.3, but it was pretty much for all messages. You might want to raise the level of debugging so that you see the tests which did NOT match, so that you can truly assess how long each body rule takes to process. Your logs show 6 second 'gaps' but they may have just been filled with non-hitting rule tests. - C
Re: SPF_HELO_PASS on a spam message?
On Fri, 28 May 2010, theTree wrote: I received a spam email that scored zero on the SpamAssassin score. I think it may be to do with the SPF_HELO_PASS that it scored - would someone be able to give me some pointers? I can't be certain with the munged headers, but it looks like you are FORWARDING your mail internally from one server to another, and then doing an SPF check on the 'helo' between your two servers. You might want to see if you can put SA on your gateway mail server. Otherwise, be sure that 'trusted_networks' is set properly, so that SA has a better chance of examining the received header from the first external connection. - Charles
Re: Arabic Spam
On Mon, 24 May 2010, Jason Bertoch wrote: A user reported the following FN to me which is written in an Arabic character set. I have ok_locales en set, but I don't see any rules hitting that appear language related. I also found the normalize_charset option, but don't know if it will help or hurt my ability to detect these messages. Ideas or thoughts? http://pastebin.com/KtQSvZ5w At a guess I would say the bulk of your score is attributed to the URI in the body that has been flagged as being on the SURBL blocklist. Beyond that, the issue seems to be that they have used a body 'type' of text/html without actually using HTML. So spamassassin is complaining about various aspects of the improper use of HTML... Though I can't see how it decided that a large font was in use. The solution here seems to be a combination of getting rid of the 'bad' URI from the text and getting the sender to fix their (web based?) mail client so that all those HTML problems don't occur. - C
Re: percentage off spam
I agree that full samples are needed. The % Subject alone is not enough. But I would expect there is something 'common' to the body that would combine in a meta rule for a decent score with minimal FPs... So throw some examples up on pastebin. - C
Re: percentage off spam
On Tue, 18 May 2010, Kenneth Porter wrote: So throw some examples up on pastebin. Here's some: http://sewingwitch.com/ken/Stuff/foo.txt I'm currently catching them with this: header KP_PERCENT Subject =~ /\b-?[78][0-9]%/ describe KP_PERCENT 70-89 percent in subject score KP_PERCENT 1.0 Given how high these spams score already, this will work quite well. I also noticed that the 'view in a browser' line repeats consistently. I see some hits on RBL's and URIBL's. Perhaps those should score just a little bit higher? - C
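Editor's sketch: the KP_PERCENT pattern is easy to sanity-check outside SpamAssassin, because Python's `re` module treats `\b`, character classes, and `%` the same way Perl does for this expression. The subjects below are invented test strings, not samples from the linked file.

```python
import re

# Same pattern as the KP_PERCENT rule: a word boundary, an optional minus
# sign, then a two-digit number from 70 to 89 followed by a percent sign.
KP_PERCENT = re.compile(r"\b-?[78][0-9]%")

def hits_kp_percent(subject):
    """True if the subject would trigger the 70-89 percent rule."""
    return bool(KP_PERCENT.search(subject))
```

The leading `\b` is what keeps three-digit figures such as 175% from matching on their last two digits.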
Re: [sa] Re: Custom rules - escape characters
On Fri, 7 May 2010, Daniel Lemke wrote: Am I seeing ghosts or is this the third time you asked the same question on this list? Your first mail was already replied so I suggest you have a look there to get your answers. Daniel Oh, good, it's not my mail server acting up again! (smile) To OP: Spamassassin uses perl regular expressions - man perlre - C
Re: [sa] odd FPs
On Tue, 4 May 2010, Greg Troxel wrote: Thanks - I did pretty much understand the tests. What I'm boggled about is that they suddenly started firing, and then now suddenly do not. This is perfectly consistent with the explanation I offered at the beginning of this thread. A legitimate Google MX was temporarily blacklisted. Given that it was hitting dialup Lists, I would guess that maybe Google was (re)assigned an IP block that was previously dynamic. - Charles
Re: Scanning Outbound emails
On Wed, 5 May 2010, Bernd Petrovitsch wrote: Why shouldn't it be possible? SpamAssassin doesn't care where the mail comes from Well, actually, it DOES. The test DOS_DIRECT_TO_MX being an example. Which brings me back to the slightly confused feeling that I still get over 'trusted_networks' (which is what the OP should specify so that his outbound clients do not trigger RBL rules) and internal networks. In particular, I find these two paragraphs from Mail::SpamAssassin::Conf to be contradictory: Trusted relays that accept mail directly from dial-up connections (i.e. are also performing a role of mail submission agents - MSA) should not be listed in internal_networks. List them only in trusted_networks. If trusted_networks is set and internal_networks is not, the value of trusted_networks will be used for this parameter. So my mail server handles ALL mail, incoming and outgoing. According to the first paragraph, I should not list my mail server under 'internal_networks' because it is an MSA. Because I have no other MTA to list as 'internal' I have NO setting for 'internal_networks'. But according to the second paragraph, this makes my MSA 'default' to being an internal_network because its value is lifted from 'trusted_networks'? I don't think our dialup IP's are triggering the direct-to-mx rules, but that may only be because our dynamic IP's are not listed on the appropriate RBL's. So is the second paragraph *wrong* about the default usage? Or am I lucky? Should I specify a 'not' rule for internal networks, just to preserve the trusted-only status of my dialups? - Charles
Re: Scanning Outbound emails
On Wed, 5 May 2010, Jari Fredriksson wrote: There is one special group that will suffer from that decision: namely SpamAssassin users within your network. If they do report their spam to SpamCop using SpamAssassin's own report mechanism, they are screwed Why not just add a negative-scoring rule for mail sent to spamcop? I have to do the same for mail from this list, to avoid FP'ing on every post that quotes a bit of spam :) - C
Re: [sa] odd FPs
On Tue, 4 May 2010, Greg Troxel wrote: I use spamass-milter and reject at about 8 points. Normally this is fine. I just got a few false positives. BAYES_40,DKIM_FORGED,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DOS_OUTLOOK_TO_MX,HELO_NO_DOMAIN,RCVD_IN_PBL,RCVD_IN_SORBS_DUL,RDNS_NONE,UNPARSEABLE_RELAY Your list of 'matched' rules includes several DNS blacklists (PBL, SORBS). Occasionally a gmail or earthlink server gets abused, and is temporarily blacklisted. If they don't fix it themselves, you may have to write to the blacklist maintainers and request removal of the IP - C
How many Froms?
Hiyo! Occasionally I see an e-mail with multiple addresses on the 'From:' header. (not the envelope) Can anyone think of legitimate uses for multiple From: addresses? Or could I just use a rule like: header From =~ /\...@.*\@/ - C
Re: [sa] Re: How many Froms?
On Wed, 28 Apr 2010, David B Funk wrote: There's an easy fix for that FP, just use the 'From:addr =~ ' variant of the header rule. That ignores the comment part of the 'From:' address and only examines the stuff inside the 'b...@blah.blah' part. Avoid FP, yes, but also avoid the live header that is triggering the rule, which was *not* formatted with I guess I'll just test for *3* '@'s - C
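The 'three @s' idea can be tried out in a few lines of Python. This is only a sketch: the pattern below is an assumption about how such a test might be written (the post does not give one), and the sample From values are invented.

```python
import re

# Assumed pattern for "at least three '@' characters in the header value".
THREE_ATS = re.compile(r"@.*@.*@")


def too_many_ats(from_value: str) -> bool:
    """Return True if the From header value contains three or more '@'."""
    return THREE_ATS.search(from_value) is not None


if __name__ == "__main__":
    samples = [
        'Bob <bob@example.com>',                     # one '@'
        '"bob@example.com" <bob@example.com>',       # address repeated in the comment
        'a@x.example, b@y.example, c@z.example',     # three addresses
    ]
    for value in samples:
        print(value, "->", too_many_ats(value))
```

Requiring three avoids the case where the display name simply repeats the address, at the cost of letting a From header with exactly two plain addresses slip through.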
Re: Score overriding and behaviour
On Tue, 27 Apr 2010, Giampaolo Tomassoni wrote: Also, why body __SOMMA m'\Wsomma\W'i doesn't fire? I have the Rule2XSBody plugin active. Maybe somehow it wasn't compiled? But why, then? Do ANY of the rules in your local.cf fire? Try putting a test rule that will 'always' fire (like 'header From =~ /\@/') at the end of local.cf, then if it doesn't fire, start moving it up, to see if you can home in on a line that is perhaps aborting further reading of local.cf - C
Re: [sa] RE: Score overriding and behaviour
On Tue, 27 Apr 2010, Giampaolo Tomassoni wrote: Do ANY of the rules in your local.cf fire? Yes, they do. The __IN_ITALIAN rule referred by SOMMA and SOMMA2, in example. Just a side thought, but are we checking for SOMMA or SOMA? One 'm' or two? FRT_SOMA2 Try 'retyping' the __SOMMA rule without the m' body __SOMMA /\Wsomma\W/i Also, look for a 'runaway' unclosed quote on a prior rule (though I would expect such a condition to barf error messages like crazy) - C
Re: Whitelisting local domain (spamassassin qmail)
On Mon, 26 Apr 2010, Martin Caine wrote: Received: from host[my_ip_address].in-addr.btopenworld.com (HELO ?192.168.32.10?) (mar...@[my_domain_dot_com]@[my_ip_address]) by [our_servers_hostname].memset.net with SMTP; 26 Apr 2010 09:26:45 - If 'my_ip_address' is truly 'internal' then you should be able to add it to 'trusted_networks'. But that allows *all* mail from that internal IP. - C
Re: Whitelisting local domain (spamassassin qmail)
You used the phrase 'internal' to describe the IP from which you are sending your mail. If you are trying to send mail by connecting from an untrusted (external) dynamic IP address (including blackberries) then you need to use some form of SMTP authentication on the connection to verify that the mail is really legitimate mail from your domain. In which case, if your MSA properly inserts the auth information into the headers, SpamAssassin should react appropriately. - Charles On Mon, 26 Apr 2010, Martin Caine wrote: Thanks for the reply. Unfortunately where I put my ip it's actually showing the IP I have here at work, it's the IP assigned for our internet connection in the office and is dynamic (and even if it was static, whitelisting it would only fix the problem if we were emailing from the office and wouldn't whitelist emails sent from blackberries, iphones and other locations).
Re: [sa] Re: Match returned message headers on any NDR
On Wed, 14 Apr 2010, Kris Deugau wrote: I have yet to figure out why people think it's a good idea to relay mail from your domain host to your ISP account (especially when the two are different companies) Do not mistake the following statement for any form of approval :) To many of our users, Outlook Express et al. are mysterious black boxes into which they have 'entered their user information' following the 'instructions' provided by their ISP. They are completely unaware that the mail client can handle more than one address/account. Or they may be dimly aware of the capability, but feel seriously overwhelmed by the options and accounts screens. But forwarding? That is simple in concept. You fill in ONE form with your ISP and it is done. AND the ISP will provide help and support for using the form to setup forwarding. Most ISP's tend to shove off the onerous task of teaching their users how to use Windows. So given the choice between filling in a form and 'using my mail the way I always have' and 'what if I do something wrong, mess up my mail and my ISP won't help me?', well, guess what they are going to choose? Again, don't confuse this for approval of any sort. For my part, I try my best to help users make intelligent use of their software. :) Personally, I *hate* forwarding because too many 'big players' setup 'reputation based' filtering strategies. So for every False Negative that I forward, there is one more chance that some dimwit user will click the button that says this is spam and lower *our* reputation. (sigh) /RANT :) - Charles
Re: skipping dynamic tests for ISP's own dynamic networks?
On Thu, 15 Apr 2010, Royce Williams wrote: I will also file a bug to suggest updates to the *_networks language that is in direct contradiction to the advice in other parts of this thread. One thing I might add: It seemed to me that at certain points in the discussion there was confusion as to whether the status of the mail server running spamassassin was influenced by the mua_networks setting. I believe some language in the docs would be appropriate to distinguish any '*_networks' settings that would be appropriate for the server *running* SA. Ie. The language about 'mua_networks' makes sense when you realize it is about 'trusting' *another* server on your network that has handled the dynamic IP's, as opposed to having SA simply 'trust' (as in 'trusted_networks') the dynamic IP's from which it directly receives outgoing mail. - C
Re: FROM_STARTS_WITH_NUMS matches on text-to-email
On Mon, 12 Apr 2010, Ted Mittelstaedt wrote: Seriously, you shouldn't be asking that question. The fundamental flaw here is in the assumption that an all-number mailbox user ID is virtually certain to be spam. It is not. Clearly, the default score assignment to that rule is too high. Well, firstly, the rule name says STARTS with nums. That would imply that the original condition for which this rule was created was NOT an 'all numeric' user part, but perhaps some 'jumble' of characters that merely *starts* with numbers. I would PROPOSE (to those with a nice testing rig) that the rule be modified so that there has to be at least one non-numeric character after the initial first 6 digits ie. /^\d{6,}\S*[^\d\s]\S*@/ This will reduce the 'hits' on phone numbers, while possibly still hitting the 'bad' usernames that it was intended to hit? - C
Re: FROM_STARTS_WITH_NUMS matches on text-to-email
On Tue, 13 Apr 2010, Martin Gregorie wrote: header FROM_STARTS_WITH_NUMS From =~ /\d{6,}[a-z._-][a-z0-9._-]{0,50}@/i This regex requires that the 7th character be non-numeric. Look at the regex I posted It covers all cases with six leading digits that is not a purely numeric address. /^\d{6,}\S*[^\s\d]\S*@/ As an aside, let's not forget that the high score that is causing concern is only used when there is no bayes and no network testing - Charles
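The two patterns in this exchange are easy to compare side by side. The sketch below uses Python's re module as a stand-in for SA's Perl regexes (an assumption that holds for these simple patterns) and feeds both a few invented bare addresses; a real From header would also carry a display name and angle brackets, which this does not model.

```python
import re

# The header rule quoted in the message above (case-insensitive, unanchored).
QUOTED_RE = re.compile(r"\d{6,}[a-z._-][a-z0-9._-]{0,50}@", re.I)

# The proposed replacement: six or more leading digits, then at least one
# character before the '@' that is neither a digit nor whitespace.
PROPOSED_RE = re.compile(r"^\d{6,}\S*[^\s\d]\S*@")


def compare(address: str) -> tuple:
    """Return (quoted_rule_hits, proposed_rule_hits) for one address."""
    return (QUOTED_RE.search(address) is not None,
            PROPOSED_RE.search(address) is not None)


if __name__ == "__main__":
    # Hypothetical addresses for illustration only.
    for addr in ["5551234567@txt.example",
                 "1234567abc@example.com",
                 "123456+tag@example.com",
                 "12345abc@example.com"]:
        print(addr, compare(addr))
```

Neither pattern fires on a purely numeric phone-number style address. They differ only when the first non-digit is something outside the quoted rule's a-z, dot, underscore, hyphen set, such as a plus sign.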
Re: [sa] Re: FROM_STARTS_WITH_NUMS matches on text-to-email
On Tue, 13 Apr 2010, Martin Gregorie wrote: header FROM_STARTS_WITH_NUMS From =~ /\d{6,}[a-z._-][a-z0-9._-]{0,50}@/i This regex requires that the 7th character be non-numeric. Nope - only that a character after the first six is a legal address character but non-numeric. Hmmm My bad. I forgot that the '{6,} would match more than 6 digits... Silly me. :-} - C
Re: CLAMAV 0.95 to be disabled
Realize this is OT, and that even the instigation is OT :) But I'm hoping someone here just KNOWS 'rpm'. and can help... (Or can point me to the best forum for a quick answer) While attempting to use rpm on RH9 to update to a newer set of clamav packages, the rpm process locked up, and I had to kill it, and now rpm does not seem to be working at all I'm currently trying 'rpm --rebuilddb' but it's just sitting there, and I've got a feeling it has locked-up too - C
Re: CLAMAV 0.95 to be disabled
OT - RPM On Fri, 9 Apr 2010, Daniel McDonald wrote: I'm currently trying 'rpm --rebuilddb' but it's just sitting there, and I've got a feeling it has locked-up too You've got to delete the __db.* files in /var/lib/rpm before you run --rebuilddb I'm trying that now, but don't have much hope. None of the db files were modified since 2007. So I suspect the corruption is in one of the other files :( - C
Re: [sa] Re: CLAMAV 0.95 to be disabled
On Fri, 9 Apr 2010, Daniel McDonald wrote: You've got to delete the __db.* files in /var/lib/rpm before you run --rebuilddb That worked. Thanks! (wiping brow with relief) - C
Re: Domain specific configuration files??
Rajesh M wrote: if you standard score is say : 5.0 you can write a header rule to allocate a positive or negative score if the to field contains the specific domain example required_score 5 header header1 To =~ /example1\.com/i score header1 -1 Your rule would not work with Bcc mail (for example, mail from this list). You might get the desired result by using a 'Received:' or 'Delivered-To:' header This will vary depending on MTA, so examine your own mail and test for consistent performance. - C
Re: [sa] Re: Confused about how to use sa-update
On Thu, 1 Apr 2010, Phill Edwards wrote: actually posting to the right place! Is this the official spamassassin mailing list? Your own spam filter might be eating a lot of the messages? Try setting a rule to score -100 on mail received from apache.org... - C
Re: Limit SA to scan messages 100k and below
On Wed, 31 Mar 2010, Keith De Souza wrote: Sorry as I'm new to SA can you elaborate what you mean by glue? Geek terminology for the program, script or other mechanism that 'connects' your MTA and your SA. Ie. The calling MTA or its script must do the size check, then decide *whether* to call SA I'm trying to understand why is it taking 300.0 seconds to scan a message only 24Kb in size?? 1) Server is overloaded. Your load only has to go 10-20% over your system's 'maximum capacity' to cause processing times to jump from 20 seconds up to five minutes or more 2) Something that SA relies upon, like your DNS server, is taking way too long to do its job. Check that your DNS has a reasonable timeout value. Otherwise it could be waiting for a non-existent domain. This would be the case if the problem occurs for certain addresses, or more often on spam (which comes from 'unknown' systems) than on legitimate mail 3) There may be a 'locking' issue with any databases (Bayes?) that SA uses. Again, this may only become a problem under heavy load, with too many concurrent SA processes My thoughts so far are to perhaps reduce the file size that SA takes to scan and see if the scan time reduces. It is a better idea to try and reduce the number of emails that SA will process at the same time. - C
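To make the 'glue does the size check' point concrete, here is a rough Python sketch. Everything in it is hypothetical: the 100k limit comes from the thread subject, and the scan callable stands in for however a given MTA actually hands mail to SA.

```python
from typing import Callable, Tuple

# Hypothetical limit taken from the thread subject (100k); adjust to taste.
MAX_SCAN_BYTES = 100 * 1024


def should_scan(message: bytes, limit: int = MAX_SCAN_BYTES) -> bool:
    """Return True when the message is small enough to hand to SA."""
    return len(message) <= limit


def filter_message(message: bytes,
                   scan: Callable[[bytes], bytes],
                   limit: int = MAX_SCAN_BYTES) -> Tuple[bytes, bool]:
    """Call `scan` only for small messages; pass large ones through untouched.

    `scan` is a placeholder for the real call into SA and is expected to
    return the full message with its markup added.
    Returns (message_to_deliver, was_scanned).
    """
    if should_scan(message, limit):
        return scan(message), True
    return message, False
```

The decision lives entirely in the glue, so SA never sees the oversized message and spends no time on it.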
Re: Scanning large-body spam
On Wed, 31 Mar 2010, Henrik K wrote: SA 3.3 has special handling for truncated messages Excuse me for not *thinking* earlier, but it occurs to me that there is a very big drawback to *truncating* a message before passing it to SA, as opposed to my original request/suggestion to *flag* (or set a config param?) to tell SA to *ignore* parts of a message past a certain size. I believe it is fairly common practice for MTA's to expect SA to return the *entire* message, complete with X-Spam header 'markup', from SA's standard output stream. This is particularly important where mail classified as *slightly* spammy is delivered to a special spam folder based upon the headers added by SA. Or on a system where all mail tagged as spam is quarantined. Having SA's markup/explanations is critical to analysing false positives/negatives. So SA needs to read and write the *entire* message, but then be given a parameter to keep it from thrashing over the really large ones. - Charles
Re: Scanning large-body spam
On Wed, 31 Mar 2010, Mark Martinec wrote: and let it handle arbitrary size messages by avoiding its current paradigm of keeping the entire message in memory. Is there really a problem with the in-memory size? I would have thought the major concern was the processing time for evaluating 'full' (and rawbody?) rules on a large message Anyway, the amavisd glue to SpamAssassin does just that: let SpamAssassin see only the first 400 kB (configurable) of a large message, then edit the original message based on results obtained from SpamAssassin. Good for amavis-d, but not for those of us relying on SA to do the whole job, and not have our MTA's perform any further message modification I would be interested in having some of the developers offer an opinion on this. Where is the real 'cost' in running SA against a large message? Is it just the memory used? Or is it, as I suspect, the use of 'full' rules? - Charles
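The 'scan only the first part, then mark up the original' approach described for amavisd can be sketched roughly as follows. This is not amavisd's code; the function names are invented, and verdict_headers is a hypothetical callable standing in for whatever returns SA's X-Spam header lines for a chunk of mail.

```python
from typing import Callable, List

SCAN_LIMIT = 400 * 1024  # the 400 kB figure mentioned in the quoted text


def scan_prefix_and_tag(message: bytes,
                        verdict_headers: Callable[[bytes], List[str]],
                        limit: int = SCAN_LIMIT) -> bytes:
    """Scan only the first `limit` bytes, but return the whole message.

    `verdict_headers` (hypothetical) receives the truncated copy and returns
    header lines such as 'X-Spam-Flag: NO'. Those lines are prepended to the
    untouched original, so nothing past the limit is lost.
    """
    headers = verdict_headers(message[:limit])
    block = "".join(line + "\r\n" for line in headers).encode("ascii")
    return block + message
```

The scanner only ever sees the truncated copy, while delivery still gets the complete message plus markup, which is the property a truncate-before-SA setup loses when the MTA expects SA to return the message itself.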
Re: Mega-Spam
(Subject line changed to remove the 'flag' to developers) On Mon, 29 Mar 2010, Karsten Bräckelmann wrote: .. But then again, this is a topic for the dev list [1] to start a discussion, not here. Uh, no, I'm not a developer. And the description of that list specifically says... For those involved in the development effort to discuss their work on the project. Unless you are working on a patch to SpamAssassin, this is probably not a list you need to use. If you're not already on the general users list, you should probably go there first... THIS is the list for users to ask questions *and* make suggestions, and has been used this way many times in the past. And the list description of this list says (and personal experience has proven) that developers monitor this list as time permits. So with respect, please stop telling people to clutter up a working list with (possibly) dumb ideas. And this one was mine, so I can call it dumb if I want to. I'm thinking it's not, but open to the idea it may be. :) As a side note, I had not intended to open a discussion, but merely drop my suggestion in the 'suggestion box', but I am glad I posted here, because the discussion among users has clarified the extent of the problem. :) [1] Also note your very own Subject. Intended to grab the attention of time-constrained developers, though honestly I am regretting it now, because I don't like having that 'all caps' flag attached to a discussion. I've removed it from this post - Charles
ATTN DEVELOPERS: Mega-Spam
Literally, Mega-Spam. I just got a spam with 1MB of images. My suggestion has been made before, but I would like to ask that it now be taken a bit more seriously. SA needs an option to allow efficient 'partial' scanning of large e-mails, so that, for example, we can perform all the valuable header checks, and maybe even scan for URIBL hits within the first few hundred K of the body? Is it possible (and easy!) to set a flag that tells SA to stop testing against the body when it reaches a certain byte count? Or perhaps, if I understand the docs correctly, most rules only trigger on textual message parts anyway, so by simply disabling 'full' rules and possibly 'rawbody', we could get the desired result without too much of a processing hit? - C
Re: ATTN DEVELOPERS: Mega-Spam
On Mon, 29 Mar 2010, Karsten Bräckelmann wrote: You did read the entire thread, right? :) There's nothing new about this. Moreover, this still is a rare occurrence. Note even Charles, who started this thread, claims to have received *one* such spam. And it appears to be his first. ;) Last September the number of spams exceeding 256KB became frequent enough that I bumped up my limit. Now I'm starting to see spams past the new limit (400KB). But when they jump up to 1MB, maybe it's time for a different solution, and maybe regain some system efficiency by adding the suggested mechanism to SA and only doing significant body scans on messages less than 256KB again :) Now, if this starts to become a more general pattern... The spams I've seen so far look more 'amateur' than 'pro'. Easily traceable IP's. Blacklistable domains. I'm just throwing my idea into the queue now so that it can be smoothly integrated with a future release. We've got plenty of time, but I suggest not waiting until it becomes a big problem before desperately rushing to fix it :) My 0.02 dollars - C
razor default in SA 3.3.1?
Hallo! Follow-up on SA 3.3.1 upgrade yesterday. My system changes log reported the addition of several files named .razor/... which brought to my attention that 'RAZOR2' tests are now enabled by default in SA 3.3.1. Is there anything that I should be concerned about? It seems to be functioning well, and I like the stats for the rules on rulesqa :) - Charles
add_header + report_safe 0 positioning in 3.3.1
In case anyone else uses a script to scan the SA injected message headers to build log records (to detail matched tests, etc), and that script cares about the *order* of the headers, then please take note that in 3.3.1 the position of the 'report_safe 0' command in your .cf files relative to the add_header command(s) determines the position in which X-Spam-Report will appear in the headers, relative to the others. This is a minor difference from 3.2.5 - strictly speaking it gives 3.3.1 superior behaviour, with more control/flexibility. So no complaints. :) Just wanted to mention this in case anyone else notes anomalies in their custom logging - Charles
Re: razor default in SA 3.3.1?
On Thu, 25 Mar 2010, Michael Scheidell wrote: (you using the freebsd SA port?) CentOS 4 (RHEL 4) rpm from rpmforge - C
Re: WARNING CENTOS USERS! BEWARE AUTO YUM INSTALL OF 3.3.1!
On Thu, 25 Mar 2010, fakessh wrote: I have different problems with latest spamassassin from rpmforge. it does not start Did you run sa-update as per my warning? - C
WARNING CENTOS USERS! BEWARE AUTO YUM INSTALL OF 3.3.1!
Had a nice HEART-STOPPING moment this morning! Logged in and found my mailbox had no new mail! WTF!?? Checked the logs and discovered that my nightly automatic updates via YUM had pulled in the new SA 3.3.1-3. WARNING: Centos does NOT run the required sa-update to get all the files into shape to run with the new SA engine! SA will ERROR. In my particular case, it turns out the 'Mail Avenger' MTA doesn't handle the error condition the way I expected and was dropping the mails on the floor! OUCH! :( Fortunately I have been reading all the posts about 3.3.1 with a view to installing it as soon as I was sure there were no major bugs. So I knew what to try first, and thankfully, yes, it was as simple as running sa-update. Mail is flowing again! Yay! But if anyone is running CentOS and runs yum manually, be warned that SA 3.3.1 will come in on the next update and you will have to run sa-update manually as soon as it is installed. - Charles, HWCN
Re: [sa] correction: was: WARNING CENTOS USERS! BEWARE AUTO YUM INSTALL OF 3.3.1!
On Wed, 24 Mar 2010, R P Herrold wrote: WARNING: Centos does NOT run the required sa-update to get all the files into shape to run with the new SA engine! SA will ERROR. rather: ... some third-party repository packagings, oriented to be used on CentOS, do not ... Correct. My warning more specifically applies to RPMFORGE rpm of SA-3.3.1-3... The CentOS provided packages are fine -- the independent packager aftermarket has the unexpected behaviour (nod) Thanks for the clarification. - C
Re: [sa] Re: Yahoo/URL spam
On Tue, 23 Mar 2010, Alex wrote: This is what I have: /^[^a-z]{0,10}(http:\/\/|www\.)(\w+\.)+(com|net|org|biz|cn|ru)\/?[^ ]{0,20}[a-z]{0,10}$/msi My bad. I got an option wrong. Please remove the 'm' above. I always get it backwards. According to 'man perlre' (the definitive resource for SA regexes!) the 'm' makes '^' match every newline! We want it to only match the beginning of the body. So just remove it, and, as noted by others, add the '^' that was missing... like so ... ]{0,20}[^a-z]{0,10}$/si - Charles
Re: Yahoo/URL spam
On Mon, 22 Mar 2010, Alex wrote: rawbody __BODY_ONLY_URI /^[^a-z]{0,10}(http:\/\/|www\.)(\w+\.)+(com|net|org|biz|cn|ru)\/?[^ ]{0,20}[^a-z]{0,10}$/msi This allows for some amount (up to ten chars?) of text before and after the URI if I'm reading that right, correct? Nope. With the /ms flags ^ and $ at beginning and end match the *whole* body as a single 'string' and permit 'any character' (. or [^x]) matches to also match newlines. So the above regex translates to:
/^ - Beginning of body
[^a-z]{0,10} - match 0-10 non-alpha characters *including* newlines
(http:\/\/|www\.) - match a uri beginning with http *or* www
(\w+\.)+ - match multiple occurrences of word followed by . (this will match 'domain.' *or* 'www.domain.')
(com|net|biz|org|cn|ru) - match TLD (adjust to fit your mail)
\/? - match a slash if there is one
[^ ]{0,20} - match 0-20 non-blank characters (page name, if given)
[^a-z]{0,10} - match 0-10 non-alpha chars including newlines (did I TYPO in my OP and leave out the '^'?)
$ - match end of body
/msi
Is it possible to determine the beginning of the line with a body rule? Insert '\n' into the above regex where you want to match newline. I didn't think that was possible. I believe this is also what this is trying to do? It's possible, but NOT what this regex does. Essentially this regex matches against a complete body that consists of nothing more than a single URI on a line, with possible blank lines before or after. Rather than test for newlines, I test for non-alpha so that a stray space or tab or LF code does not fail to match. This simple regex can also be 'dressed up' with elements of the form (\<[^\>]+\> +)+ to match any HTML code inserted before or after the URI. A regex could also check for a link consisting of text enclosed by <a href=...> ... </a> The key is to be sure that you don't use '*' or '+' in any context where it could 'run away' and try to match large message bodies. This way as soon as the body exceeds 40 characters on either side of an unbroken string of characters it stops the test. Relatively efficient for a rawbody test - C
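Here is a small Python sketch for trying the 'body is nothing but a URI' pattern against sample bodies. It uses the corrected /si flags from the follow-up in this thread (no 'm', and the trailing [^a-z]), and it assumes Python's re.S and re.I behave like Perl's /s and /i for this pattern. The sample bodies and the helper name are invented.

```python
import re

# Pattern from the rule discussed above, with the /si flags (no /m).
BODY_ONLY_URI = re.compile(
    r"^[^a-z]{0,10}(http:\/\/|www\.)(\w+\.)+(com|net|org|biz|cn|ru)\/?"
    r"[^ ]{0,20}[^a-z]{0,10}$",
    re.S | re.I,
)


def is_only_uri(body: str) -> bool:
    """Return True if the whole body is essentially a single URI."""
    return BODY_ONLY_URI.search(body) is not None


if __name__ == "__main__":
    # Hypothetical bodies for illustration only.
    bodies = [
        "\nhttp://www.example.com/page\n\n",
        "www.example.net",
        "Hi Bob, see http://www.example.com/page for details",
        "http://www.example.com/\nThis is a long paragraph of ordinary text.",
    ]
    for body in bodies:
        print(repr(body), "->", is_only_uri(body))
```

Because every quantifier is bounded, the match attempt gives up quickly on a long body instead of scanning all of it, which is the efficiency point made above.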
Re: Yahoo/URL spam
On Thu, 18 Mar 2010, Ned Slider wrote: If that's not an option, how about a meta rule for FROM_YAHOO and __HAS_ANY_URI (this rule exists in SA). Lots of ham may contain a URI, but how much ham contains ONLY a URI? Rough outline of rule, untested. rawbody __BODY_ONLY_URI /^[^a-z]{0,10}(http:\/\/|www\.)(\w+\.)+(com|net|org|biz|cn|ru)\/?[^ ]{0,20}[a-z]{0,10}$/msi Combine that with 'frequent abusers' like Yahoo, and you've got something you can give a few points. There will probably need to be a variant on this to account for HTML mail and/or the 'standard' footers inserted by free mail agents. Which incidentally, surprises me here. I thought Yahoo always added a tagline? - C
Re: Hijacked thread :) (was: ruleset for German...)
On Mon, 15 Mar 2010, Karsten Bräckelmann wrote: The TextCat plugin. Even part of stock SA, though not enabled by default. Supports per-user settings. (nod) For reasons specific to my MTA, I can't run SA 'per user', but I can choose the most common languages (en fr) in our system's mail and flag when neither of them are used (assigning UNWANTED_LANGUAGE_BODY a minimal score) - then the user can set a procmail delivery rule (quarantine when that rule is present in the X-Spam headers). It will do. :) But you just forked (to avoid the word hijacked) this thread, which is about a very specific, on-going spam run. The OP really doesn't want to identify German spam for scoring, cause that's likely his first language. ;) My bad. :) But my compliments on the OP's excellent English! :) - C
Re: [sa] Re: ruleset for German Bettchen and Schlafzimmer spam
On Sun, 14 Mar 2010, Jörg Frings-Fürst wrote: take a look at http://wiki.apache.org/spamassassin/CustomRulesets and search to German Language Ruleset. H. I guess this goes back to my inquiry about the Brazilian spam I'm still looking for a way (hopefully) to simply identify the *language* of the mail (when not determined from CHARSET_FARAWAY rules), so that our users may opt-in for additional filtering based on language - Charles
Re: [sa] Re: Bogus mails from hijacked accounts
On Fri, 12 Mar 2010, Dennis B. Hopp wrote: describe FORGED_YAHOO Yahoo with non-Yahoo Reply-to address header __FORGED_YH1 From =~ /\...@yahoo\.com/i header __FORGED_YH2 Reply-to =~ /\...@yahoo\.com/i meta FORGED_YAHOO (__FORGED_YH1 && !__FORGED_YH2) The problem with this is that the !__FORGED_YH2 matches when there is *NO* Reply-To header at all! You need something like this: header __FORGED_YH2 Reply-To =~ /\@([^y]|y[^a]|ya[^h]|yah[^o])/i meta FORGED_YAHOO (__FORGED_YH1 && __FORGED_YH2) (remove the negation from the meta) This directly tests for an existing Reply-To specifically to a domain that does not begin with 'yaho'. However, keep in mind that the headers for *this* mailing list would trigger your rule. So you will also need to meta this with a rule that tests for yahoo mail server being the sending SMTP client. Gets tricky, doesn't it? - C
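The difference between 'negate a yahoo match' and 'positively match a non-yahoo domain' shows up clearly in a quick test. The sketch below runs the replacement Reply-To pattern through Python's re module (assumed equivalent to Perl here); an empty string stands in for a missing header, and the addresses are invented.

```python
import re

# The replacement Reply-To pattern suggested above.
NON_YAHOO_REPLY_TO = re.compile(r"@([^y]|y[^a]|ya[^h]|yah[^o])", re.I)


def non_yahoo_reply_to(reply_to: str) -> bool:
    """Return True only if a Reply-To exists and its domain does not start with 'yaho'."""
    return NON_YAHOO_REPLY_TO.search(reply_to) is not None


if __name__ == "__main__":
    # Hypothetical header values; "" models a message with no Reply-To at all.
    for value in ["someone@yahoo.com", "someone@example.com",
                  "someone@ya.example", ""]:
        print(repr(value), "->", non_yahoo_reply_to(value))
```

With no Reply-To there is nothing to match, so the rule stays quiet, which is exactly the case the negated version got wrong. Note that the pattern only looks at the first four letters of the domain, so any domain beginning with 'yaho' is treated as Yahoo.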
Re: SMTP REJECT after DATA (was: SpamAssassin Milter Plugin...)
On Wed, 10 Mar 2010, R-Elists wrote: Charles Gregory Quote:Re: [sa] Re: SMTP REJECT after DATA The only efficiency to be gained is to reject as much as possible after the RCPT_TO, before accepting DATA. But for systems like mine, with lousy user cooperation, rejecting some of the mail after DATA is still the best option. i would say you are arguing both sides and that it might be the issue. I'm arguing that with such a strong component of YMMV there is NO side in this debate that is so woefully wrong as to be labelled 'misguided', which is what I was responding to in my first post in this thread. i would tend to believe that most have made the choice not to straddle the fence I made my own choice, as outlined above, but 'sit on the fence' with regard to my opinion on 'best practice' or 'misguided decisions', because I don't believe there really is any one 'good' or 'bad' decision (except maybe the decision to backscatter, but we all agree that is 'bad'). are you blaming the users for your administration? ;-) Naturally. All good administration is customer driven. |-D - C
Re: [sa] Inconsistent Application of Rules?
On Wed, 10 Mar 2010, Stephen Carville wrote: I've been seeing several emails lately that are being scored low that, from what I know of the SA rules should be scored higher. A recent example was a typical spam message: FROM_STARTS_WITH_NUMS,RCVD_IN_DNSWL_LOW,URIBL_AB_SURBL,URIBL_JP_SURBL, URIBL_OB_SURBL,URIBL_SC_SURBL,URIBL_WS_SURBL autolearn=no The second message invoked a larger number of body check rules than the first but I don't understand why. Is that normal or do I have something configured incorrectly? The extra rules are all 'SURBL' blocklist tests which check the embedded URI against internet blocklists. It is not uncommon for the first few spams using a new URI to get through before the blocklists are updated. By the time you reran your tests, they had been updated, and so it scored higher - C
Re: [sa] Re: End of Thread [Was: [Emerging-Sigs] SIG: SpamAssassin Milter Plugin Remote Arbitrary Command Injection Attempt]
On Tue, 9 Mar 2010, Ned Slider wrote: It's clear you either haven't read or haven't understood what Kai wrote, which btw was spot on. More attitude. Yeesh. Kai has an opinion. And in fairness, I give his arguments some serious weight. It's not black-n-white. But this attitude that he/you have the 'best' solution is just yeah YAWN. End of Thread. Hope so.
Re: [sa] Re: [Emerging-Sigs] SIG: SpamAssassin Milter Plugin Remote Arbitrary Command Injection Attempt
On Tue, 9 Mar 2010, Brian wrote: I'm happy to stay on the Postfix 'merry-go-round' for an answer, or we can just agree Postfix can't easily do this and move on and stop flogging this dead horse :-) I use Mail Avenger for a front end SMTP Says it all - Charles
SMTP REJECT after DATA (was: SpamAssassin Milter Plugin...)
On Tue, 9 Mar 2010, Kai Schaetzl wrote: Second: you are completely misguided in your wish to reject mail after SMTP data stage. You may certainly argue for YOUR preference (and I emphasise *preference*) for the most 'efficient' way to run an SMTP server, but there is nothing sufficiently 'wrong' with rejecting mail after DATA that you can use the term 'misguided'. All this term implies is your attitude. Apart from this, you make some nice arguments, but again, you seem to have a bias that weighs them too heavily. It does not make any sense to process a complete message and then reject it. If this were true, no one would have added 'header' and 'body' checks to the postfix configuration and no one would have been jumping through hoops to find ways to integrate SA into the front end of MTA's. Indeed, it makes far LESS sense to have a system accept mail but send it to a spam folder. That practice leaves the sender with the mistaken impression that their mail was successfully delivered. And argue as you will, there is simply no way to get a broad user base to adopt the habit of reviewing a spam folder. I mean the whole point of filtering is that the user no longer has to sift through a pile of junk, right? Processing a message takes CPU power and precious SMTP time. Doing that at SMTP stage means you cannot take in as much mail as you could. It also means that the sending MTA cannot send as much mail as it could. Think about that statement twice. It IS correct, but it is an argument FOR processing mail at SMTP time. A legitimate outbound SMTP server is *never* as busy as an incoming mail server. So a legitimate server will not suffer *any* penalty from my system introducing a 5-6 second delay into the SMTP transaction. But a spammer's zombie is trying to pump out mail as fast as it can. The spambot will be slowed down. That is a GOOD thing. Yes? :) There are other reasons not to do this, for instance legal ones. Again, you are quoting arguments that favor SMTP reject. 
It is better to reject a mail, so that legitimate senders know it, rather than have them believe it was delivered when it was sent into a spam folder, perhaps suffer consequences and then sue the recipient. Sure, OUR butts will be covered by our user agreements, but only if we have jumped through hoops so that the user cannot claim they did not know about their spam folder. But in the real world, even if we don't get sued, we get a lot of people complaining that they didn't know about the optional spam folder on our system that the user turned ON themselves! Now we use a spam folder for 'borderline' spams that score 5-10. The rest get rejected at SMTP time. But still I get these occasional complaints. It's just the way users are. LOL The idea is not to punish the other side because it sends spam. If they send spam, I'm happy to see them punished. If they send legitimate mail, they should not be punished for the actions of spammers by having to GUESS whether their mail made it through. The idea behind a rejection at SMTP stage is twofold: avoid unnecessary processing and avoid unnecessary traffic. None of that is achieved if you take a whole message, scan it and reject it at SMTP stage. Well, firstly, ALL of that is achieved *regardless* of these arguments because the helo/rbl checks are done BEFORE the DATA stage. The only 'loss' of time is on mail that you were going to have to fully process anyway because it made it past those checks. No loss to me. A few seconds delay on the SMTP connection that saves a legitimate sender worry without incurring the 'cost' of backscatter, and actually might slow spammers down a bit. Maybe I personally don't gain any time. But maybe by the end of the day the spammer doesn't get to send quite as many e-mails, and someone out there enjoys less traffic on their server! - Charles
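The policy described above (deliver clean mail, file 'borderline' scores of 5-10 into a spam folder, reject the rest during the SMTP transaction) can be sketched roughly as follows. This is only an illustration: the function name, the return labels, and the treatment of a score of exactly 5 or 10 are assumptions, not taken from any real MTA or SA configuration.

```python
def smtp_action(score, tag_threshold=5.0, reject_threshold=10.0):
    """Map a spam score to an action taken before the SMTP session ends.

    Thresholds follow the 5-10 'borderline' range mentioned in the post;
    how the exact boundary values are handled is an assumption.
    """
    if score > reject_threshold:
        # Refuse after DATA: the sending MTA sees the failure immediately,
        # so no separate bounce (and no backscatter) is generated.
        return "reject"
    if score >= tag_threshold:
        # Accept, but file into the user's spam folder.
        return "spam-folder"
    # Accept and deliver normally.
    return "deliver"
```

With these assumed thresholds a score of 2.6 is delivered, 7 goes to the spam folder, and 25.7 is rejected while the sender is still connected.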
Re: [sa] Re: SMTP REJECT after DATA (was: SpamAssassin Milter Plugin...)
On Tue, 9 Mar 2010, Kai Schaetzl wrote: and you find it doesn't make sense to spam-scan messages and reject them in/after DATA stage in a real world scenario. You ignore my arguments. Hardly surprising. You reword yours, but say nothing new. It makes only sense if you are die-hard spam-fighter who wants to retaliate... I stated my objectives and they have nothing to do with this pathetic straw-man argument. Most if not all of your arguments are arguments for spam-filtering mail, not in favor of rejection at DATA stage. How is that English-as-a-second-language class coming along? I refuse to bore this group by repeating arguments that you so grossly mis-categorize in a feeble attempt to promote your point of view. Last, keep in mind that filtering mechanisms in whatever stage are not solely meant for rejecting or spam-fighting, they are for *filtering* and then assigning appropriate actions - which often have nothing to do with spam/malware detection at all. Now THAT is off-topic. We are discussing the use of SA at SMTP time. Please stay on-topic for this group, and for this thread. If you actually care to continue, I expect a reasonable response to my arguments about rejection being better than bouncing or silent diversion. Geez, you didn't even try to advocate a system of notices to the user to overcome the 'silent' portion of that argument. Do I have to argue both sides for you? :) - C
Re: [sa] Re: SMTP REJECT after DATA
On Tue, 9 Mar 2010, Andy Dorman wrote: So even if we can decide an email is spam before the DATA stage, it makes no difference since we have to store the thing for a while anyway in case the user wants to look for something caught that shouldn't be. (nod) To rely on this methodology requires that you *rely* upon your users to apply a conscientious and consistent system of reviewing their spam trap/folder on a regular basis. If you have this, then without sarcasm I would say you are very fortunate. But in a system like mine where educating ignorant users is difficult at best, it feels a bit too dangerous to allow (too much) mail to be received and held without notice to the sender. And unfortunately SMTP protocols do not contain a code to tell the sender that mail was 'accepted but held for review'. The only way to do that is with a separate mail, and that leads back to the backscatter horrorshow, which I am quite sure you would never advocate :) So for us (and we recognize not for everyone), the policy/practice we have chosen is the most workable and efficient. I think the only reason I leaped into this thread was because of the overbearing attitudes that seemed to completely ignore the fundamental notion of YMMV - C
Re: [sa] Re: SMTP REJECT after DATA
On Tue, 9 Mar 2010, David Morton wrote: Charles Gregory wrote: Indeed, it makes far LESS sense to have a system accept mail but send it to a spam folder. Maybe in your particular situation, but you can hardly apply that to everyone (nod) It was subject to the conditions I consider 'wide spread' but by no means universal: the failure of users to review spamtraps. - since we are supporting several large companies that find it more acceptable to quarantine mail than to reject it, and *have* trained their employees to look in a spam folder in the rare case that it is needed. Stop it! You're making me jealous! LOL If postfix and amavisd-new have improvements lately that allow for efficient rejecting at SMTP time, that's great! The only efficiency to be gained is to reject as much as possible after the RCPT_TO, before accepting DATA. But for systems like mine, with lousy user cooperation, rejecting some of the mail after DATA is still the best option. Again, I emphasise 'some', and only speak out because someone is describing any approach other than their own as 'misguided'. You are not misguided, and neither am I. We just have different situations. Hmm... policy. Sounds a lot like a feature of postfix, doesn't it? LOL... And not at all 'misguided' :) - C
Re: [sa] Re: SMTP REJECT after DATA
On Tue, 9 Mar 2010, Ted Mittelstaedt wrote: There are other reasons not to do this, for instance legal ones. Again, you are quoting arguments that favor SMTP reject. It is better to reject a mail, so that legitimate senders know it, rather than have them believe it was delivered when it was sent into a spam folder... This is one of the stupidest arguments in this thread Well, hey, now that we've got *that* off our chest NOBODY is legally required to accept e-mail. That is a crock of baloney. Well then it's a good thing I didn't say that, isn't it? It is NOT illegal to break a contract. It's called 'fraud'. Look it up. - C
Re: [sa] Re: SMTP REJECT after DATA
On Tue, 9 Mar 2010, Ted Mittelstaedt wrote: It is NOT illegal to break a contract. It's called 'fraud'. Look it up. No, sorry, it's NOT fraud. Fraud requires proving an intentional misrepresentation. Well duh. Did you think I meant something else? Breaking a contract does not imply that the contract was entered into with an intent to break it. But sending back an SMTP 'delivered' response when the mail was diverted to a spam folder could be PERCEIVED as misrepresentation (and therefore fraud, because clearly the decision to divert is based in policies established long before the implicit 'contract' of accepting a mail). But again, I stress this is only true for the STUPID USER who does not understand that the spam folder is an alternate form of delivery TO THEM. My responsibility is complete (and legal) when that mail is delivered to either location. It's all about the hassle and misperceptions. The fewer times I have to explain to users how their mail 'disappeared', the easier my life :) And please remember that my entire context was only to stress that my weak definition of 'something illegal' was in CONTRAST to the utterly ridiculous notion that rejecting a mail at SMTP DATA time had anything illegal to it at all! - C
Spanish/Brazilian/Mexican spam
Hello! I think I asked about this once before. I keep getting foreign language spams with no obvious (to me) indicators that I could test for. Can anyone take a look at this crud and see a header or flag/type that I could score in SA? http://pastebin.com/3gGiaZVK (Note: post is set to expire at 3pm Tues Mar 9) Thanks! - Charles
Re: UPS Delivery Problems
On Wed, 3 Mar 2010, twofers wrote: I have been getting bombarded for weeks with these and even tho I have created specific rules in LOCAL.cf, Spamassassin refuses to even check The only reason for SA to 'refuse' to check a mail is if it exceeds the SIZE LIMIT for scanning. This limit is most often not within SA itself, but a parameter in whatever script/shell calls SA. If you are using 'spamc' as your client, make sure the -s (max size) parameter is a good size to catch jpg and virus spam. I use 40. - C
Re: [sa] Re: is this right? uribl_dbl seems to have a very odd number
On Wed, 3 Mar 2010, Bill Landry wrote: Yeah. You shouldn't be using it like that on 3.3.0. Go to http://www.spamhaus.org/dbl and look for SpamAssassin on the FAQ page. The DBL entries were added via sa-update yesterday, not added manually - at least for me. Anytime someone uses a new concept, like the URI checker that doesn't take IP's, shouldn't a new syntax be used, or a check for a new plugin? - C
Re: [sa] Putting your dead domains to use
On Mon, 1 Mar 2010, Marc Perkel wrote: For what it's worth - if any of you have domains you don't use you can point them to my virus harvesting server for spam harvesting. (SNIP) The sender has to do several other things in order to be blacklisted. Simple question: Does your 'harvester' have the smarts to detect (possible) correspondence from domain *registrars* (or ARIN) to the owners of a domain name? I can't guarantee that someone somewhere doesn't have our old domain as a 'contact' even though the MX has been a non-existent server for the last several years. Subject to this important consideration for the one possible form of 'legitimate' mail, I have a domain that used to be excessively spammed, which would be *perfect* to feed to your harvester... (unless the domain is in fact so old that it has dropped from spammers lists). - Charles
Re: [sa] Setting Blacklist_from and whitelist_to
On Sun, 28 Feb 2010, damuz wrote: Secondly, it occurred to me that all the (legit) mail to us will only be to a handful of email addresses and much of the spam still getting through is sent to spurious recipie...@mydomain.com. So with this in mind, is it useful or advisable to setup those legit email addresses as whitelist_to and if so, what becomes of the 'rest' of the mail or do you have to define only receive to whitelist_to? You have to 'fine tune' this kind of test. Keep in mind that the visible 'To:' header is hardly more than a *comment* on the mail. It may contain a mailing list name, or another *valid* recipient on another domain, while the mail was sent to *your* domain as a 'Bcc' hidden recipient. At the first stage of the SMTP transaction your MTA (should have) already rejected any mail that was actually 'addressed' to an invalid address. So the issue you are dealing with can be described as 'mail to a legitimate recipient with a suspicious To: header'. So it quickly devolves to the fact that the *only* thing you can reject is mail that has a 'To:' address that is @your.domain but which is not a valid (now or at any time in the past!) recipient on your domain. You can't flag mail that is 'To:' another domain. That could be valid! Now you need to be careful that when you invoke a 'whitelist' you do so for the 'To:' header, and NOT for the envelope recipient, which, by definition will always be a 'hit'. Unfortunately, the standard 'whitelist_to' will 'hit' on any embedded headers that your MTA adds to show the envelope recipient. You could essentially end up whitelisting all mail. So you need to whitelist on the visible headers *manually*. So, if your list of internal recipients is not overly large, you may want to try the following:
  header   __VALID_MYDOMAIN      ToCc =~ /(validuser1|validuser2|...)\...@yourdomain.com/i
  header   __TO_MYDOMAIN         ToCc =~ /\...@yourdomain.com/i
  meta     LOC_INVALID_MYDOMAIN  ( __TO_MYDOMAIN && ( ! __VALID_MYDOMAIN ) )
  describe LOC_INVALID_MYDOMAIN  Address in To or Cc header to invalid address on our domain
  score    LOC_INVALID_MYDOMAIN  1
Obviously, score modestly until we are sure there are no false positives. The big 'problem' with this scheme is that *any* change to the list of valid users requires the first rule to be updated. So I only recommend this approach if you have absolute control over your mail system. - Charles
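To check the logic of that meta rule outside of SA, here is a rough Python sketch of the same test: it fires when the visible To/Cc headers name our domain but none of the listed valid users. It is not SpamAssassin code; OUR_DOMAIN and VALID_USERS are placeholder values, and it parses the addresses rather than running a regex over the raw header as the rules do.

```python
from email import message_from_string
from email.utils import getaddresses

OUR_DOMAIN = "yourdomain.com"               # placeholder domain from the post
VALID_USERS = {"validuser1", "validuser2"}  # hypothetical list of real mailboxes


def loc_invalid_mydomain(raw_message):
    """True when To/Cc mention our domain but no valid local user."""
    msg = message_from_string(raw_message)
    headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    # Local parts of every visible address that claims to be on our domain.
    ours = []
    for _name, addr in getaddresses(headers):
        local, _, domain = addr.rpartition("@")
        if domain.lower() == OUR_DOMAIN:
            ours.append(local.lower())
    to_mydomain = bool(ours)                                # like __TO_MYDOMAIN
    valid_mydomain = any(u in VALID_USERS for u in ours)    # like __VALID_MYDOMAIN
    return to_mydomain and not valid_mydomain
```

Mail whose To/Cc only names other domains never triggers, which matches the point above that such mail could be valid (a list address, or a Bcc to a real user).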
Re: [sa] Re: Finding URLs in html attachments
On Sun, 28 Feb 2010, LuKreme wrote: Your best bet is to check if mail claiming to be from paypal is, in fact, from paypal. Actually, I think his problem is that the reference to paypal has been buried in an attachment, described as 'type' of 'octet/binary' so that SA won't think it is text and scan it, and thus he doesn't *have* any 'visible' cue that the mail claims to be from paypal. And yes, I think that is a pretty serious problem. Looks like he may have to use a 'full' test to look for the references to paypal - C
Re: [sa] Re: Finding URLs in html attachments
On Mon, 1 Mar 2010, David B Funk wrote: Looks like he may have to use a 'full' test to look for the references to paypal Been there, done that, doesn't work. AFAIK SA ignores 'octet/binary' attachments for the rule engine. None of the rules that I tried (uri, body, full, rawbody) saw anything that was known to be in one of those attachments. You may have to examine the 'raw' message and look for 'encoding' that disguises the URI's in the attachment. The whole thing might be encoded as base64 or something... A real mess to work with. You might have more success making a rule that looks for mime headers that are type 'octet' but named 'html'. You won't be able to score that too high on its own, but it might combine well in a meta rule with certain buzz phrases from the text portions of the e-mail. - C
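As a way to experiment with the 'type octet but named html' idea outside of SA, the following Python sketch walks the MIME parts of a message and reports attachments declared as application/octet-stream whose filename ends in .html or .htm, along with their decoded payload. It assumes the 'octet/binary' type mentioned above means application/octet-stream; the function name is made up for illustration, and this is not an SA plugin.

```python
from email import message_from_string


def octet_parts_named_html(raw_message):
    """Return (filename, decoded_bytes) for binary-typed parts named like HTML."""
    msg = message_from_string(raw_message)
    hits = []
    for part in msg.walk():
        filename = part.get_filename() or ""
        if (part.get_content_type() == "application/octet-stream"
                and filename.lower().endswith((".html", ".htm"))):
            # decode=True undoes base64 / quoted-printable transfer encoding,
            # so the hidden markup (and any URIs in it) becomes searchable.
            hits.append((filename, part.get_payload(decode=True)))
    return hits
```

The decoded bytes could then be searched for the buzz phrases or URIs of interest, which is roughly what a meta rule combining the MIME mismatch with text-part phrases would be after.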
Re: Off-topic? Off-list!
On Fri, 26 Feb 2010, Karsten Bräckelmann wrote: I know I'm tired from repeatedly deleting clearly off-topic posts without even caring to open them. Wonder how the majority of subscribers feels about it. Well, there was a posting with some spam-related SPF stats the other day that proved very interesting. And relevant to how I might want to score SPF in my SA config. But yeah, it's otherwise getting a bit opinion-heavy and repetitive. Let's drop it and move on - C
tflags userconf
Hallo! Back on topic :) I happened to notice that 'tflags userconf' was specified for a few tests that, as far as I could tell, have no user configurable parameters. Example (3.2.5): 25_spf.cf:tflags SPF_PASS nice userconf So what 'user configuration' is needed for SPF_PASS that is NOT needed for SPF_FAIL? In general, what does a 'userconf' specification 'look for' before permitting a test to run? - C
Re: tflags userconf
On Fri, 26 Feb 2010, RW wrote: I'm guessing it's also used to exclude rules from score optimization. There is a comment in 25_spf.cf: # these are userconf so that scores are set by hand tflags SPF_PASS nice userconf net tflags SPF_HELO_PASS nice userconf net Ah. I didn't see that because I was grepping * for 'SPF'... :) Thanks. - C
Re: Off-topic? Off-list!
On Fri, 26 Feb 2010, Karsten Bräckelmann wrote: Don't make me stomp my foot (Homer Simpson). LOL would you believe that someone in my girlfriend's computer class actually *said* to the instructor that famous Homerism, Where is the ANY key? Yes, really. And they are old enough to vote Brrr - C
Re: Off Topic - SPF - What a Disaster
On Fri, 26 Feb 2010, Benny Pedersen wrote: On Fri 26 Feb 2010 06:50:12 PM CET, Marc Perkel wrote And - SPF was originally introduced as a spam fighting solution. alot of lies out there Okay, this is getting stupid. Everyone on this thread, go to: http://www.openspf.org/Introduction Spammers are explicitly identified as one of the problems addressed. And even if this were somehow a 'lie', the original intent of the authors does not change whether SPF is *effective* for a given role. So this petulant arguing over its purpose is. (ad hominems snipped). Take it off list, PLEASE. - C
Re: Is there any Plugin to parse the “quoted email text” part in a mail (replied mail part)
On Fri, 26 Feb 2010, LuKreme wrote: On 26-Feb-10 11:31, Karsten Bräckelmann wrote: Uhm, what's with your real name? (Rewritten in RE style.) How do you pronounce *82* f's in a row? Fff for 8.2 seconds. That's ten fs a second? Wow. Fast little F'er. ;) - C
Re: [sa] Re: Bogus Dollar Amounts
On Thu, 25 Feb 2010, John Hardin wrote: i still see lot of junk mail coming with different characters, i do not even read them clearly how can i stop those kind of emails Reject languages you can't read at SMTP time? I've been noticing more 'foreign language' spams that do not use a 'foreign' character set and therefore do not trigger the 'faraway' rules. I don't suppose anyone has developed a generic rule that would spot 'foreign language usage in non-foreign charset'? - C
Re: SA on outgoing SMTP
On Wed, 17 Feb 2010, Kris Deugau wrote: My experience has been that Outlook in particular (not Outlook Express or its descendant Windows (Live) Mail) does NOT in fact display SMTP error messages exactly as the server spits them out. :( Sorry. You've heard that old phrase goes without saying? Well, I didn't say it. (smile) Where Microsoft and error messages are concerned, I consider it par for the course that what is reported to the user will be a miserable distortion of whatever actual error occurred. But just the same, the user will know that *something* has gone wrong with their mail. Obviously the fewer FP's the better when dealing with confusing error messages :) - C
Re: SA on outgoing SMTP
Slightly OT. To get 'control' of what my MX does at SMTP time I installed a simple SMTP daemon called 'Mail Avenger', which acts as a front end to my spamassassin and postfix. Its scripting capabilities allow for such interesting things as tracking the volume of mail sent by any one IP over a given time period. Stuff like that. Primarily designed for use as an MX, but no reason it couldn't help monitor/limit outgoing mail http://www.mailavenger.org - C On Tue, 16 Feb 2010, Alexandre Chapellon wrote: I have a quite buggy customer network, full of zombie PCs that spends all days sending spam and wasting the whole reputation of my networks. As a result it sometimes become quite hard to delivers queues for specific domains such as Yahoo!'s hosted ones. Indeed they have some temp fail (blacklist) mechanism that forbid my servers to send messages to them during hours. That's why I would like to setup some outgoing filtering to avoid sending too much spam through my mail relays. I think SA can help me in doing so, but I know too it's not really intended to work this way. I guess SA expects to work on MX hosts more than on smtp relays. My prerequisites are mainly: - STOP as much spam as possible at SMTP time (before queuing) - Have NO (or very few) false positives cause I could not manage telling thousands of users that they should *always_have_a_subject*, *shouldn't_write_the_subject_in_CAPS* or anything else. Further more I can't rely on RBL because a lot of my dyn IP address are regularly listed on different blacklist. Does anyone have already setup something like that and what specific config/tools/plugin could be useful for me. If some one already done it does he/she have any statistics about the efficiency of this setup. Best regards.
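For anyone wondering what 'tracking the volume of mail sent by any one IP over a given time period' amounts to, here is a minimal sliding-window counter in Python. It is not Mail Avenger's scripting interface; the class name, the window length and the message limit are all assumptions chosen for illustration.

```python
import time
from collections import defaultdict, deque


class SenderRateTracker:
    """Count messages per client IP inside a sliding time window."""

    def __init__(self, window_seconds=3600, max_messages=100):
        self.window = window_seconds
        self.max_messages = max_messages
        self._seen = defaultdict(deque)  # ip -> timestamps of recent messages

    def record(self, ip, now=None):
        """Record one message from ip; return True while ip is within its limit."""
        now = time.time() if now is None else now
        stamps = self._seen[ip]
        stamps.append(now)
        # Forget messages that have fallen out of the window.
        cutoff = now - self.window
        while stamps and stamps[0] <= cutoff:
            stamps.popleft()
        return len(stamps) <= self.max_messages
```

A front end could call record() once per accepted message and temp-fail or reject a customer IP once it returns False, which limits a zombie PC without scoring message content at all.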
Re: [sa] Re: MTX - How does it stop spam?
On Tue, 16 Feb 2010, Kris Deugau wrote: *nod* This is the biggest question I still see remaining; who maintains the blacklist? How many spams can come from an MTX-approved IP before it can/should be blacklisted? Why do we need any new/special blacklist at all? If the spamming from a given IP is sufficiently large, the regular internet blacklists will capture this IP and do a far better job of blacklisting, managing removes, etc, etc. Why reinvent the wheel? - C
Re: MTX public blacklist implemented Re: MTX plugin functionally complete?
On Sun, 14 Feb 2010, Jonas Eckerman wrote: 1: The participation record is optional, so you only use it if you want everything else to be rejected. This is why I would support mtamark... It permits the sysadmin to determine the default behaviour for his IP range, rather than defining a dangerous default in the client. And I quote: This subdomain MAY be inserted at any level in the DNS tree for IPv4 IN-ADDR.ARPA reverse zones. For IPv6, to limit the number of DNS queries, _srv is only queried at the /128 (host), /64 (subnet) and /32 (site) level. That way it can either provide information for a specific IP address or for a whole network block. More specific information takes precedence over information found closer to the top of the tree. The beauty of this mechanism is that we can 'sell' large ISP's on it by saying you only need to create one 'allow' entry for each legitimate MTA and one 'deny' entry for each netblock. And for SA there is no need to give it 'starting' scores, like SPF, the mechanism is effective as soon as it is used, and ignorable if not... - C
Re: MTX public blacklist implemented Re: MTX plugin functionally complete?
On Tue, 16 Feb 2010, Jonas Eckerman wrote: 1: The participation record is optional, so you only use it if you want everything else to be rejected. This is why I would support mtamark... It permits the sysadmin to determine the default behaviour for his IP range, rather than defining a dangerous default in the client. In what way does the above define a dangerous default? It doesn't. My comment refers to early messages where the author of 'mtx' said that the 'standard' behaviour in the absence of any mtx record as being equivalent to a 'deny' condition. That is, the domain would be scored as 'spammish' if it did not participate. The default in the statement above is to consider a domain as *not* participating unless otherwise stated by whoever manages the DNS for the domain. Correct. And my comment was that this was a much better alternative to the 'dangerous default' of having 'not participating' mean 'spammy'. If the domain does not participate it should not be punished when a MTX record isn't found. You got it. Exactly. And that's why I gave up on MTX. Because the author was insisting that exactly that should happen. - C
Re: bayes learning '0 messages found'
On Sat, 13 Feb 2010, smfabac wrote: Now that we're all on the same page. How do I find out why sa-learn is not processing the legal not-spam file? To re-cap, sa-learn --spam --mbox isspam works but sa-learn --ham --mbox not-spam is not working. Well, I would expect if this suggestion were right you would have had all sorts of warning messages about syntax, but just in case Maybe linux is interpreting the dash in the filename as a switch indicator? Try enclosing the file name in single quotes or use a filename without a dash... - C
Re: MTX plugin created (Re: Spam filtering similar to SPF, less breakage)
On Sat, 13 Feb 2010, Per Jessen wrote: Justin Mason wrote: It might be useful to compare with MTA MARK and see what the status of that proposal currently is: http://tools.ietf.org/draft/draft-stumpf-dns-mtamark/ Amazing. Justin, you must have known about that one - you can't possibly have just googled it? Well, I certainly had never heard of this one. And I think that with one minor variation in concept it could be useful to scoring systems like SA... Because of the threat of hacks, any system that 'favors' an MTA is simply giving spammers a target for exploitation. But an explicit 'disallow' record (MTA=0) created by the sysadmin would have a similar impact to deliberately naming PTR records as 'dynamic'. SA could 'detect' the explicit MTA=0 and add a score (or block outright at MTA level) The only thing I would *not* do, given the general laziness of the internet, is apply any default meaning to the absence of this TXT record. Only explicit identification of an IP or subnet as 'not permitted to send mail' would have significance to SA or a blocking MTA. H. Could work. No impact for non-implementation. Disables an unauthorized IP for any case where it is used. I like it... - C
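The scoring idea above fits in a few lines of Python: only an explicit 'disallow' value earns points, while a missing record or an 'allow' value changes nothing. The function name, the "0" record value (mirroring the MTA=0 shorthand above) and the default penalty are assumptions; this is not an existing SA plugin.

```python
def mtamark_score(txt_value, deny_score=3.0):
    """Score an MTA-mark style TXT lookup result.

    txt_value is None when no record was found.
    """
    if txt_value is None:
        # No record published: no meaning is attached to its absence.
        return 0.0
    if txt_value.strip() == "0":
        # Sysadmin explicitly marked this IP/subnet as not permitted to send mail.
        return deny_score
    # An 'allow' mark earns no bonus, so a hacked record gains a spammer nothing.
    return 0.0
```

Because nothing is ever subtracted, the check gives spammers no favourable mark to aim for, and sites that publish no record are unaffected.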