Re: application/octet-stream Content-Type used to obfuscate terse .RTF spam
John Hardin wrote:
> Does this catch it?
>
> mimeheader __UNSPEC_BINARY_ATTACH Content-Type =~ /application\/octet-stream/i
> meta     MIME_BINARY_ONLY (__CTYPE_MULTIPART_MXD && __UNSPEC_BINARY_ATTACH && !__ANY_TEXT_ATTACH)
> score    MIME_BINARY_ONLY 2.00
> describe MIME_BINARY_ONLY Unspecified binary body part but no text body parts

Of course that builds on your previously posted rules (thank you very much!) and isn't by itself a complete set of rules.

I installed the following variation locally and it seems to be working well for me:

header     __CTYPE_MULTIPART Content-Type =~ m{multipart/\w}i
mimeheader __MIME_CTYPE_TEXT Content-Type =~ m{text/\w}
mimeheader __MIME_CTYPE_APPLICATION_OCTETSTREAM Content-Type =~ m{application/octet-stream}
meta       MULTIPART_OCTETSTREAM_NO_TEXT (__CTYPE_MULTIPART && __MIME_CTYPE_APPLICATION_OCTETSTREAM && !__MIME_CTYPE_TEXT)
score      MULTIPART_OCTETSTREAM_NO_TEXT 2.0
describe   MULTIPART_OCTETSTREAM_NO_TEXT Octet-Stream body part but no text body parts

Here is a more specific set of rules that zeroes in on the RTF attachment. It might be safer in that it targets only this type of spam.

header     __CTYPE_MULTIPART Content-Type =~ m{multipart/\w}i
mimeheader __MIME_CTYPE_TEXT Content-Type =~ m{text/\w}
mimeheader __MIME_CTYPE_APPLICATION_OCTETSTREAM_RTF Content-Type =~ m{application/octet-stream.*name=.*\.rtf}
meta       MULTIPART_OCTETSTREAM_RTF_NO_TEXT (__CTYPE_MULTIPART && __MIME_CTYPE_APPLICATION_OCTETSTREAM_RTF && !__MIME_CTYPE_TEXT)
score      MULTIPART_OCTETSTREAM_RTF_NO_TEXT 2.0
describe   MULTIPART_OCTETSTREAM_RTF_NO_TEXT Octet-Stream body rtf part but no text body parts

However, playing whack-a-mole with each new type isn't productive. Perhaps the following, completely untested, would be a better way to go. Just look for any multipart message that doesn't have any text parts. I haven't thought too much about what kind of false positives this might cause, but it seems like a good direction anyway.

header     __CTYPE_MULTIPART Content-Type =~ m{multipart/\w}i
mimeheader __MIME_CTYPE_TEXT Content-Type =~ m{text/\w}
meta       MULTIPART_NO_TEXT (__CTYPE_MULTIPART && !__MIME_CTYPE_TEXT)
score      MULTIPART_NO_TEXT 2.0
describe   MULTIPART_NO_TEXT Multipart body but no text parts

Bob
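For anyone who wants to try the "multipart but no text parts" idea against saved samples before committing it to a rule file, here is a minimal Python sketch of the same logic. It is only an approximation of the rules above, not how SpamAssassin evaluates them: the function name and the two sample messages are made up for illustration, and Python's email module treats a part with no Content-Type header as text/plain, whereas the mimeheader rule only sees headers that are actually present.

```python
from email import message_from_string


def is_multipart_without_text(raw_message: str) -> bool:
    """Return True for a multipart message in which no MIME part is text/*."""
    msg = message_from_string(raw_message)
    if not msg.is_multipart():
        return False
    # walk() yields the top-level container and then every nested part.
    return not any(part.get_content_maintype() == "text" for part in msg.walk())


# Hypothetical sample: a single octet-stream attachment and no text part.
RTF_ONLY = (
    'Content-Type: multipart/mixed; boundary="b"\n'
    "\n"
    "--b\n"
    'Content-Type: application/octet-stream; name="sample.rtf"\n'
    "\n"
    "e1xydGYxfQ==\n"
    "--b--\n"
)

# Hypothetical sample: the same attachment accompanied by a text/plain part.
WITH_TEXT = (
    'Content-Type: multipart/mixed; boundary="b"\n'
    "\n"
    "--b\n"
    "Content-Type: text/plain\n"
    "\n"
    "See attached.\n"
    "--b\n"
    'Content-Type: application/octet-stream; name="sample.rtf"\n'
    "\n"
    "e1xydGYxfQ==\n"
    "--b--\n"
)

if __name__ == "__main__":
    print(is_multipart_without_text(RTF_ONLY))
    print(is_multipart_without_text(WITH_TEXT))
```

Running something like this over a ham corpus would give a rough idea of the false-positive risk mentioned above before any score is assigned.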
Re: Filtering through mailing lists
On 29.05.09 17:26, Garik wrote:
> I have a situation whereby mail passes through a mailing list and then goes on to the destination mailbox that's subscribed to the mailing list. Here's my problem: SpamAssassin checks the emails going through the mailing list for spam and adds the subject [**SPAM**] to the email,

Why do you resend spam to ML users? Either filter it out completely, or don't mess with it. If you need filtering, the X-Spam-* headers should do the job. And if you don't want to resend spam, moderate suspicious mail.

-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Boost your system's speed by 500% - DEL C:\WINDOWS\*.*
Re: application/octet-stream Content-Type used to obfuscate terse .RTF spam
On 31.05.09 21:25, Chip M. wrote:
> Mildly redacted sample posted here:
> http://puffin.net\software\spam\samples\0005_rtf.txt
> and the plain body, after decoding to plain text (purely for convenience):
> http://puffin.net\software\spam\samples\0005_body.txt

Address Not Found
puffin.net\software\spam\samples\0005_body.txt could not be found. Please check the name and try again.

Did nobody ever tell you that URL directories are separated by slashes, not backslashes? Or is this part of the obfuscation scheme?

-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
REALITY.SYS corrupted. Press any key to reboot Universe.
Re: twitter spam why RCVD_IN_DNSWL?
> On 28-May-2009, at 14:57, Michael Scheidell wrote:
>> why does a company that is so easy to spam through get a -8 point pass?
>
> Because the only way to get a message from twitter is to
> 1) have an account on twitter
> 2) have someone who YOU ARE FOLLOWING send a direct message to you
> 3) have Twitter set to send Direct messages to you via email.
>
> No one can send you a direct message unless you follow them, so you have 'opted-in' to receive their messages. If they are spammers, UNFOLLOW them.

Incorrect. I don't follow anyone on twitter. I went to their web site for the first time last week to look for their complaint address.

> Whether you want the message or not, it is not spam. If you don't want these messages, disable the emailing in your twitter account.

I can't. I don't have a twitter account.

So, since twitter can send email to someone without a twitter account, it's spam. Since I can't opt out without an account, it's spam. If twitter allows people to forge an email address when signing up, without confirmed opt-in, it's spam. Since it doesn't include the full physical address of the sender, it's also a violation of federal (you) can-spam laws.

-- 
Michael Scheidell, CTO | SECNAP Network Security
Finalist 2009 Network Products Guide Hot Companies
FreeBSD SpamAssassin Ports maintainer
Re: application/octet-stream Content-Type used to obfuscate terse .RTF spam
On Mon, 1 Jun 2009, Bob Proulx wrote:
> However playing whack-a-mole with each new type isn't productive. Perhaps this following, completely untested, would be the better way to go. Just look for any multipart message that doesn't have any text parts.

That actually sounds best to me.

-- 
John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
jhar...@impsec.org    FALaholic #11174    pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
---
It is not the business of government to make men virtuous or religious, or to preserve the fool from the consequences of his own folly. -- Henry George
---
5 days until the 65th anniversary of D-Day
Identifying Source of False Positives
I'm running SA-3.2.5 on Slackware-12.2 and encountering false positives on messages that SA has never before flagged as spam. Specifically, the daily postfix mail log summary report and the daily logwatch report are marked as spam; they are sent by root to me as a user. Because /etc/procmailrc threw these messages away, it took a long time to figure out that SA mislabeling them was the immediate problem.

Over the past few months I've also had problems with messages from three specific domains that were never delivered to my inbox. However, when a procmail recipe directed all messages addressed to me at my business domain to a different mail file, they were delivered.

How can I determine what causes SA to mark the log summary reports as spam? This is the first issue I want to resolve.

I saw nothing appropriate on the web site's FAQ or front page, so if I missed the information please point me to the appropriate location.

Rich
Re: Identifying Source of False Positives
On Mon, 2009-06-01 at 09:28 -0700, Rich Shepard wrote:
> I'm running SA-3.2.5 on Slackware-12.2 and encountering false positives on messages that have not before been seen as spam by SA. Specifically, the daily postfix mail log summary report and the daily logwatch report are marked as spam; they are sent by root to me as a user. Because /etc/procmailrc threw these messages away it took a long time to figure out that it was SA mis-labeling these messages that was the immediate problem.
>
> Over the past few months I've also had problems with messages from three specific domains that were never delivered to my inbox. However, when a procmail recipe directed all messages to me at my business domain to a different mail file, they were delivered.
>
> How can I determine what causes SA to mark the log summary reports as spam?

Run the message through spamassassin -D and see which tests fire. Most likely some of the domains reported in your summary are listed in URIBL, SURBL, or some other URI block list.

-- 
Daniel J McDonald, CCIE # 2495, CISSP # 78281, CNX
www.austinenergy.com
Re: Identifying Source of False Positives
On Mon, 1 Jun 2009, Rich Shepard wrote:
> messages that have not before been seen as spam by SA. Specifically, the daily postfix mail log summary report and the daily logwatch report are marked as spam;

Well, firstly, examine the mail's full headers. There should be an X-Spam-Status header listing the tests that matched on the e-mail.

At a first guess, I would suspect that your log includes a reference to a blacklisted URI or e-mail address. Given that logs tend to contain information of this sort, I would strongly urge you to 'whitelist' the logs.

For that matter, if this is internally generated mail, why are you running spamassassin on it at all? Or is this mail being passed via an outside (untrusted) network to your mailbox?

- C
Re: Identifying Source of False Positives
On Mon, 1 Jun 2009, Rich Shepard wrote:
> I'm running SA-3.2.5 on Slackware-12.2 and encountering false positives on messages that have not before been seen as spam by SA. Specifically, the daily postfix mail log summary report and the daily logwatch report are marked as spam; they are sent by root to me as a user.

That sort of thing shouldn't even be hitting SA. If you're using procmail to glue in SA, you might want to add some exclusionary clauses to the stanza that calls SA.

> Over the past few months I've also had problems with messages from three specific domains that were never delivered to my inbox. However, when a procmail recipe directed all messages to me at my business domain to a different mail file, they were delivered.

It can be a bad idea, particularly if you're an administrator or delegate for the postmaster@ or abuse@ aliases, to discard mail that SA has marked as spam. Quarantine it and periodically review the quarantine.

> How can I determine what causes SA to mark the log summary reports as spam? This is the first issue I want to resolve.

First, capture the messages rather than discarding them. The FPs should have the list of rules that hit in the headers. For historical messages you should be able to look in your mail log (typically /var/log/maillog, or rotated to /var/log/maillog.1.gz etc.) for the SA log entry for the messages in question, which also lists the rules hit.

If you post the list of rules hit, or better a complete FP message with all headers intact, we may be able to make more precise suggestions. Please don't post the messages themselves to the list; put them on pastebin or a webserver you control, and send the URL to the list.

-- 
John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
sa-update not updating since March 30.
We have a cron job that runs every day to update the spamassassin rules, but there have been no new updates since March 30. When I run it manually with the -D (debug) flag, I get this output:

[24667] dbg: logger: adding facilities: all
[24667] dbg: logger: logging level is DBG
[24667] dbg: generic: SpamAssassin version 3.2.5
[24667] dbg: config: score set 0 chosen.
[24667] dbg: dns: is Net::DNS::Resolver available? yes
[24667] dbg: dns: Net::DNS version: 0.65
[24667] dbg: generic: sa-update version svn607589
[24667] dbg: generic: using update directory: /var/lib/spamassassin/3.002005
[24667] dbg: diag: perl platform: 5.01 linux
[24667] dbg: diag: module installed: Digest::SHA1, version 2.11
[24667] dbg: diag: module installed: HTML::Parser, version 3.60
[24667] dbg: diag: module installed: Net::DNS, version 0.65
[24667] dbg: diag: module installed: MIME::Base64, version 3.07_01
[24667] dbg: diag: module installed: DB_File, version 1.816_1
[24667] dbg: diag: module installed: Net::SMTP, version 2.31
[24667] dbg: diag: module installed: Mail::SPF, version v2.006
[24667] dbg: diag: module installed: Mail::SPF::Query, version 1.999001
[24667] dbg: diag: module installed: IP::Country::Fast, version 604.001
[24667] dbg: diag: module not installed: Razor2::Client::Agent ('require' failed)
[24667] dbg: diag: module not installed: Net::Ident ('require' failed)
[24667] dbg: diag: module installed: IO::Socket::INET6, version 2.56
[24667] dbg: diag: module installed: IO::Socket::SSL, version 1.24
[24667] dbg: diag: module installed: Compress::Zlib, version 2.015
[24667] dbg: diag: module installed: Time::HiRes, version 1.9711
[24667] dbg: diag: module installed: Mail::DomainKeys, version 1.0
[24667] dbg: diag: module installed: Mail::DKIM, version 0.32
[24667] dbg: diag: module installed: DBI, version 1.608
[24667] dbg: diag: module installed: Getopt::Long, version 2.37
[24667] dbg: diag: module installed: LWP::UserAgent, version 5.826
[24667] dbg: diag: module installed: HTTP::Date, version 5.810
[24667] dbg: diag: module installed: Archive::Tar, version 1.38
[24667] dbg: diag: module installed: IO::Zlib, version 1.09
[24667] dbg: diag: module not installed: Encode::Detect ('require' failed)
[24667] dbg: gpg: Searching for 'gpg'
[24667] dbg: util: current PATH is: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/vpopmail/bin
[24667] dbg: util: executable for gpg was found at /usr/local/bin/gpg
[24667] dbg: gpg: found /usr/local/bin/gpg
[24667] dbg: gpg: release trusted key id list:
[24667] dbg: channel: attempting channel updates.spamassassin.org
[24667] dbg: channel: update directory /var/lib/spamassassin/3.002005/updates_spamassassin_org
[24667] dbg: channel: channel cf file /var/lib/spamassassin/3.002005/updates_spamassassin_org.cf
[24667] dbg: channel: channel pre file /var/lib/spamassassin/3.002005/updates_spamassassin_org.pre
[24667] dbg: channel: metadata version = 759778
[24667] dbg: dns: 5.2.3.updates.spamassassin.org = 759778, parsed as 759778
[24667] dbg: channel: current version is 759778, new version is 759778, skipping channel
[24667] dbg: diag: updates complete, exiting with code 1

-- 
View this message in context: http://www.nabble.com/sa-update-not-updating-since-March-30.-tp23819671p23819671.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
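The last few debug lines show how the decision was reached: the published version is looked up under a DNS name made from the SpamAssassin version with its components reversed, and the channel is skipped because the published number equals the installed one. The Python sketch below only restates that logic as it can be read from the log; the function names are invented for illustration, the "greater than" comparison is my assumption, and it performs no DNS lookup. As far as I know, the sa-update documentation describes exit code 1 as "no fresh updates were available", which is consistent with this output.

```python
def update_query_name(sa_version: str, channel: str) -> str:
    """Build the DNS name seen in the debug log, e.g. 3.2.5 -> 5.2.3.<channel>."""
    reversed_version = ".".join(reversed(sa_version.split(".")))
    return f"{reversed_version}.{channel}"


def channel_needs_update(installed: int, published: int) -> bool:
    """Assumed comparison: only a strictly newer published version triggers a download."""
    return published > installed


if __name__ == "__main__":
    print(update_query_name("3.2.5", "updates.spamassassin.org"))
    # Both numbers are taken from the debug output above.
    print(channel_needs_update(installed=759778, published=759778))
```

In other words, the cron job is working; there is simply nothing newer than 759778 published for this channel.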
Re: sa-update not updating since March 30.
On Mon, 1 Jun 2009, Ernie Dunbar wrote:
> We have a cron job that runs every day to update the spamassassin rules, but there have been no new updates since March 30.

That's because there haven't been any updates recently. There's no firm schedule for releases of updates.

-- 
John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
Re: sa-update not updating since March 30.
On Mon, 2009-06-01 at 11:26 -0700, Ernie Dunbar wrote:
> We have a cron job that runs every day to update the spamassassin rules, but there have been no new updates since March 30.

Correct. updates_spamassassin_org has not been updated since March 30. I have seen updates on 90_sare_freemail_cf_sare_sa-update_dostech_net (May 11), 90_2tld_cf_sare_sa-update_dostech_net (May 24th), and sought_rules_yerp_org (today).

-- 
Daniel J McDonald, CCIE # 2495, CISSP # 78281, CNX
www.austinenergy.com
Re: Identifying Source of False Positives
On Mon, 1 Jun 2009, Charles Gregory wrote:
> Well, firstly, examine the mail full headers. There should be an X-Spam-Status header listing the tests that matched on the e-mail.

Charles/Dan/John:

I certainly managed to forget this. I just ran /etc/cron.daily/1pflogsumm and looked at the report. Here are the headers:

From r...@salmo.appl-ecosys.com Mon Jun 1 11:25:44 2009
Return-Path: r...@salmo.appl-ecosys.com
X-Spam-Flag: YES
X-Spam-Checker-Version: SpamAssassin 3.2.5-ph20040310.0 (2008-06-10) on salmo.appl-ecosys.com
X-Spam-Level:
X-Spam-Status: Yes, score=4.9 required=4.0 tests=ALL_TRUSTED,AWL,BAYES_99,
    EMPTY_BODY,NORMAL_HTTP_TO_IP,NUMERIC_HTTP_ADDR,URI_HEX,URI_NOVOWEL
    autolearn=no version=3.2.5-ph20040310.0
X-Spam-Report:
    * -1.3 ALL_TRUSTED Passed through trusted hosts only via SMTP
    *  3.5 BAYES_99 BODY: Bayesian spam probability is 99 to 100%
    *      [score: 1.]
    *  2.5 EMPTY_BODY BODY: Message has subject but no body
    *  0.0 NORMAL_HTTP_TO_IP URI: Uses a dotted-decimal IP address in URL
    *  0.4 URI_HEX URI: URI hostname has long hexadecimal sequence
    *  0.0 NUMERIC_HTTP_ADDR URI: Uses a numeric IP address in URL
    *  1.6 URI_NOVOWEL URI: URI hostname has long non-vowel sequence
    * -1.8 AWL AWL: From: address is in the auto white-list
X-Original-To: rshep...@appl-ecosys.com

I can send the entire report if that's necessary.

There is certainly body content in the message; it's not empty, so I don't understand the 2.5 on that third test (EMPTY_BODY). I also don't know where the 3.5 on the second test (BAYES_99) comes from.

For about a decade these log summary reports showed up every day with no problems. Earlier this spring they became sporadic, then ceased appearing at all. This correlates with a distribution and SpamAssassin upgrade, so it must be something different in SA that's triggering this response now.

Suggestions on how to proceed greatly appreciated.

Thanks,

Rich
Re: sa-update not updating since March 30.
John Hardin wrote:
> On Mon, 1 Jun 2009, Ernie Dunbar wrote:
>> We have a cron job that runs every day to update the spamassassin rules, but there have been no new updates since March 30.
>
> That's because there haven't been any updates recently. There's no firm schedule for releases of updates.

Ah. That certainly explains why we're getting massive amounts of spam. How would I go about contributing to that end?
Re: sa-update not updating since March 30.
On Mon, 1 Jun 2009, Ernie Dunbar wrote:
> John Hardin wrote:
>> On Mon, 1 Jun 2009, Ernie Dunbar wrote:
>>> We have a cron job that runs every day to update the spamassassin rules, but there have been no new updates since March 30.
>>
>> That's because there haven't been any updates recently. There's no firm schedule for releases of updates.
>
> Ah. That certainly explains why we're getting massive amounts of spam. How would I go about contributing to that end?

Read up in the SA wiki on how to become a committer. You need that level of access to submit rules for consideration even if you're not going to be hacking the SA code itself. Barring that, posting rules to the users list for discussion may lead to them being adopted into the base rules.

I'm in the process of becoming a committer for this reason myself. One thing I intend to do is get some newer rules into the 3.2 tree, as the real devs are focusing on 3.3.

-- 
John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
Re: [sa] Re: Identifying Source of False Positives
On Mon, 1 Jun 2009, Rich Shepard wrote:
> * 2.5 EMPTY_BODY BODY: Message has subject but no body
>
> There is certainly body content in the message; it's not empty so I don't understand the 2.5 on that third test. I also don't know where the 3.5 on the second test arises.

Just to be clear, are you looking at the body in the actual rejected message, to make sure it is still there (not 'stripped' from the message)?

First guess: look at the procmail code that 'chooses' to run spamassassin. Have you used an 'h' where you meant to use an 'H', thereby feeding *only* the header to spamassassin?

- C
Re: Identifying Source of False Positives
On Mon, 1 Jun 2009, Rich Shepard wrote:
> Here are the headers:
>
> From r...@salmo.appl-ecosys.com Mon Jun 1 11:25:44 2009
> Return-Path: r...@salmo.appl-ecosys.com
> X-Spam-Flag: YES
> X-Spam-Checker-Version: SpamAssassin 3.2.5-ph20040310.0 (2008-06-10) on salmo.appl-ecosys.com
> X-Spam-Level:
> X-Spam-Status: Yes, score=4.9 required=4.0 tests=ALL_TRUSTED,AWL,BAYES_99,
>     EMPTY_BODY,NORMAL_HTTP_TO_IP,NUMERIC_HTTP_ADDR,URI_HEX,URI_NOVOWEL
>     autolearn=no version=3.2.5-ph20040310.0
> X-Spam-Report:
>     * -1.3 ALL_TRUSTED Passed through trusted hosts only via SMTP
>     *  3.5 BAYES_99 BODY: Bayesian spam probability is 99 to 100%
>     *      [score: 1.]
>
> I also don't know where the 3.5 on the second test arises.

If these are system-generated messages, something is improperly training SA that they are spam. Do you use autolearn?

> Suggestions on how to proceed greatly appreciated.

Primarily I'd suggest you exclude locally-generated emails from SA completely. If you'd post the Received: headers from such a message and the procmail stanza where you pass messages to SA for scoring, I could suggest something.

-- 
John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
Re: [sa] Re: Identifying Source of False Positives
On Mon, 1 Jun 2009, Charles Gregory wrote:
> Just to be clear, are you looking at the body in the actual rejected message,

Charles,

Yes. The body consists of the mail log summary.

> First guess, look at the procmail code that 'chooses' to run spamassassin. Have you used an 'h' where you meant to use an 'H', thereby feeding *only* the header to spamassassin?

## Call SpamAssassin
:0fw: spamassassin.lock
* < 256000
| spamassassin

This is how it's been for years.

Rich
Re: Identifying Source of False Positives
On Mon, 1 Jun 2009, John Hardin wrote:
> If these are system-generated messages, something is improperly training SA that they are spam. Do you use autolearn?

John,

No. Once a week or so I run sa-learn specifying spam on the spam-uncaught mbox file. Less frequently I run it on mail list files, specifying them as ham.

> Primarily I'd suggest you exclude locally-generated emails from SA completely. If you'd post the Received: headers from such a message and the procmail stanza where you pass messages to SA for scoring I could suggest something.

Here are all the headers from the mail log summary:

From r...@salmo.appl-ecosys.com Mon Jun 1 11:25:44 2009
Return-Path: r...@salmo.appl-ecosys.com
X-Spam-Flag: YES
X-Spam-Checker-Version: SpamAssassin 3.2.5-ph20040310.0 (2008-06-10) on salmo.appl-ecosys.com
X-Spam-Level:
X-Spam-Status: Yes, score=4.9 required=4.0 tests=ALL_TRUSTED,AWL,BAYES_99,
    EMPTY_BODY,NORMAL_HTTP_TO_IP,NUMERIC_HTTP_ADDR,URI_HEX,URI_NOVOWEL
    autolearn=no version=3.2.5-ph20040310.0
X-Spam-Report:
    * -1.3 ALL_TRUSTED Passed through trusted hosts only via SMTP
    *  3.5 BAYES_99 BODY: Bayesian spam probability is 99 to 100%
    *      [score: 1.]
    *  2.5 EMPTY_BODY BODY: Message has subject but no body
    *  0.0 NORMAL_HTTP_TO_IP URI: Uses a dotted-decimal IP address in URL
    *  0.4 URI_HEX URI: URI hostname has long hexadecimal sequence
    *  0.0 NUMERIC_HTTP_ADDR URI: Uses a numeric IP address in URL
    *  1.6 URI_NOVOWEL URI: URI hostname has long non-vowel sequence
    * -1.8 AWL AWL: From: address is in the auto white-list
X-Original-To: rshep...@appl-ecosys.com
Delivered-To: rshep...@appl-ecosys.com
Received: from salmo.appl-ecosys.com (localhost.localdomain [127.0.0.1])
    by salmo.appl-ecosys.com (Postfix) with ESMTP id 8DA0F1026
    for rshep...@appl-ecosys.com; Mon, 1 Jun 2009 11:25:44 -0700 (PDT)
Received: (from r...@localhost)
    by salmo.appl-ecosys.com (8.14.3/8.14.2/Submit) id n51IPibx030133;
    Mon, 1 Jun 2009 11:25:44 -0700
Date: Mon, 1 Jun 2009 11:25:44 -0700
From: r...@salmo.appl-ecosys.com
Message-Id: 200906011825.n51ipibx030...@salmo.appl-ecosys.com
To: rshep...@appl-ecosys.com
Subject: *SPAM* salmo Daily Mail Report for Monday, 01 June 2009
X-Spam-Prev-Subject: salmo Daily Mail Report for Monday, 01 June 2009

Report based on information in /var/log/maillog

And this is from ~/procmail/recipes.rc:

## Call SpamAssassin
:0fw: spamassassin.lock
* < 256000
| spamassassin

Thanks,

Rich
Re: Identifying Source of False Positives
Rich Shepard wrote:
> Here are all headers from the mail log summary:
>
> From r...@salmo.appl-ecosys.com Mon Jun 1 11:25:44 2009
> Return-Path: r...@salmo.appl-ecosys.com
> X-Spam-Flag: YES
> X-Spam-Checker-Version: SpamAssassin 3.2.5-ph20040310.0 (2008-06-10) on salmo.appl-ecosys.com
> X-Spam-Level:
> X-Spam-Status: Yes, score=4.9 required=4.0 tests=ALL_TRUSTED,AWL,BAYES_99,
>     EMPTY_BODY,NORMAL_HTTP_TO_IP,NUMERIC_HTTP_ADDR,URI_HEX,URI_NOVOWEL
>     autolearn=no version=3.2.5-ph20040310.0
> X-Spam-Report:
>     * -1.3 ALL_TRUSTED Passed through trusted hosts only via SMTP
>     *  3.5 BAYES_99 BODY: Bayesian spam probability is 99 to 100%
>     *      [score: 1.]
>     *  2.5 EMPTY_BODY BODY: Message has subject but no body
>     *  0.0 NORMAL_HTTP_TO_IP URI: Uses a dotted-decimal IP address in URL
>     *  0.4 URI_HEX URI: URI hostname has long hexadecimal sequence
>     *  0.0 NUMERIC_HTTP_ADDR URI: Uses a numeric IP address in URL
>     *  1.6 URI_NOVOWEL URI: URI hostname has long non-vowel sequence
>     * -1.8 AWL AWL: From: address is in the auto white-list
> X-Original-To: rshep...@appl-ecosys.com
> Delivered-To: rshep...@appl-ecosys.com
> Received: from salmo.appl-ecosys.com (localhost.localdomain [127.0.0.1])
>     by salmo.appl-ecosys.com (Postfix) with ESMTP id 8DA0F1026
>     for rshep...@appl-ecosys.com; Mon, 1 Jun 2009 11:25:44 -0700 (PDT)
> Received: (from r...@localhost)
>     by salmo.appl-ecosys.com (8.14.3/8.14.2/Submit) id n51IPibx030133;
>     Mon, 1 Jun 2009 11:25:44 -0700
> Date: Mon, 1 Jun 2009 11:25:44 -0700
> From: r...@salmo.appl-ecosys.com
> Message-Id: 200906011825.n51ipibx030...@salmo.appl-ecosys.com
> To: rshep...@appl-ecosys.com
> Subject: *SPAM* salmo Daily Mail Report for Monday, 01 June 2009
> X-Spam-Prev-Subject: salmo Daily Mail Report for Monday, 01 June 2009
>
> Report based on information in /var/log/maillog

Your biggest problems here are BAYES_99 and EMPTY_BODY.

To fix the Bayes problem, sa-learn some of these messages as ham. Make sure you are learning as the right user...

The empty body problem is more difficult. Have procmail save a copy of the raw message somewhere and take a look at it. Make sure there is a blank line between the headers and the body. Run 'spamassassin -D' on this saved message and look for anything unusual in the debug output.

-- 
Bowie
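To put numbers on why those two rules matter most, the per-rule scores from the quoted X-Spam-Report can simply be added up. The Python sketch below does that with the values from the report; the helper name is made up, and in practice the AWL adjustment would shift if other scores changed, so the "what if" totals are only indicative.

```python
from decimal import Decimal

REQUIRED = Decimal("4.0")  # "required=4.0" from the X-Spam-Status header

# Per-rule scores copied from the X-Spam-Report quoted above.
HITS = {
    "ALL_TRUSTED": Decimal("-1.3"),
    "BAYES_99": Decimal("3.5"),
    "EMPTY_BODY": Decimal("2.5"),
    "NORMAL_HTTP_TO_IP": Decimal("0.0"),
    "URI_HEX": Decimal("0.4"),
    "NUMERIC_HTTP_ADDR": Decimal("0.0"),
    "URI_NOVOWEL": Decimal("1.6"),
    "AWL": Decimal("-1.8"),
}


def score_without(*excluded: str) -> Decimal:
    """Total score if the named rules had not fired (AWL held fixed)."""
    return sum(
        (score for name, score in HITS.items() if name not in excluded),
        Decimal("0"),
    )


if __name__ == "__main__":
    print("as reported:       ", score_without())
    print("without BAYES_99:  ", score_without("BAYES_99"))
    print("without EMPTY_BODY:", score_without("EMPTY_BODY"))
    print("without both:      ", score_without("BAYES_99", "EMPTY_BODY"))
```

The total reproduces the reported 4.9, and removing either BAYES_99 or EMPTY_BODY alone would already bring the message under the 4.0 threshold.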
Re: Identifying Source of False Positives
On Mon, 1 Jun 2009, Rich Shepard wrote:
> On Mon, 1 Jun 2009, John Hardin wrote:
>> If these are system-generated messages, something is improperly training SA that they are spam. Do you use autolearn?
>
> John,
>
> No. Once a week or so I run sa-learn specifying spam on the spam-uncaught mbox file. Less frequently I run it on mail list files specifying them as ham.

And I assume you look at the spam-uncaught file before learning it? If some log files got in there and were learned, that could explain the deterioration.

Have you kept your spam and ham corpora? I would suggest wiping your Bayes database and retraining it, after reviewing the corpora.

>> Primarily I'd suggest you exclude locally-generated emails from SA completely. If you'd post the Received: headers from such a message and the procmail stanza where you pass messages to SA for scoring I could suggest something.
>
> Here are all headers from the mail log summary:
>
> From r...@salmo.appl-ecosys.com Mon Jun 1 11:25:44 2009
> Return-Path: r...@salmo.appl-ecosys.com
> X-Spam-Flag: YES
> X-Spam-Checker-Version: SpamAssassin 3.2.5-ph20040310.0 (2008-06-10) on salmo.appl-ecosys.com
> X-Spam-Level:
> X-Spam-Status: Yes, score=4.9 required=4.0 tests=ALL_TRUSTED,AWL,BAYES_99,
>     EMPTY_BODY,NORMAL_HTTP_TO_IP,NUMERIC_HTTP_ADDR,URI_HEX,URI_NOVOWEL
>     autolearn=no version=3.2.5-ph20040310.0
> X-Spam-Report:
>     * -1.3 ALL_TRUSTED Passed through trusted hosts only via SMTP
>     *  3.5 BAYES_99 BODY: Bayesian spam probability is 99 to 100%
>     *      [score: 1.]
>     *  2.5 EMPTY_BODY BODY: Message has subject but no body
>     *  0.0 NORMAL_HTTP_TO_IP URI: Uses a dotted-decimal IP address in URL
>     *  0.4 URI_HEX URI: URI hostname has long hexadecimal sequence
>     *  0.0 NUMERIC_HTTP_ADDR URI: Uses a numeric IP address in URL
>     *  1.6 URI_NOVOWEL URI: URI hostname has long non-vowel sequence
>     * -1.8 AWL AWL: From: address is in the auto white-list
> X-Original-To: rshep...@appl-ecosys.com
> Delivered-To: rshep...@appl-ecosys.com
> Received: from salmo.appl-ecosys.com (localhost.localdomain [127.0.0.1])
>     by salmo.appl-ecosys.com (Postfix) with ESMTP id 8DA0F1026
>     for rshep...@appl-ecosys.com; Mon, 1 Jun 2009 11:25:44 -0700 (PDT)

Okay, let's key on that one.

> ## Call SpamAssassin
> :0fw: spamassassin.lock
> * < 256000
> | spamassassin

:0 fw: spamassassin.lock
* < 256000
* ! ^TO_abuse@
* ! ^List-Id: .*?use...@.]spamassassin\.apache\.org?
* ! ^Received: from salmo\.appl-ecosys\.com \(localhost\.localdomain \[127\.0\.0\.1\]\) by salmo\.appl-ecosys\.com
| /usr/bin/spamc

Using spamc creates less load than launching spamassassin from scratch for every email, but you do have to manage the daemon (i.e. restart it if the rules change).

Are your resources really so limited that you want to serialize all email delivery? As a middle ground you might consider per-user lockfiles instead, e.g.:

:0 fw: $HOME/.spamassassin.lock

I'd also suggest upping the size limit a bit, but that's not a big issue.

There are more complex things you can do; you might want to take a look at http://www.impsec.org/~jhardin/antispam/spamassassin.procmail

-- 
John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
Re: [sa] Re: Identifying Source of False Positives
>> First guess, look at the procmail code that 'chooses' to run spamassassin. Have you used an 'h' where you meant to use an 'H', thereby feeding *only* the header to spamassassin?
>
> ## Call SpamAssassin
> :0fw: spamassassin.lock
> * < 256000
> | spamassassin

Is there anywhere in the procmailrc *above* this recipe where some special condition has been specified as:

:0fwh

...which has the effect of 'filtering' the message down to just its headers? It wouldn't necessarily have to be a recent change to your procmailrc; it might just be a subtle change in the log mail that 'triggers' the rule when it didn't before.

Next guess: has this log summary grown in size past some limit that would cause the whole body to be 'truncated'?

- Charles
Re: [sa] Re: Identifying Source of False Positives
On Mon, 1 Jun 2009, Charles Gregory wrote:
> Is there anywhere in the procmail recipe *above* this one that some special condition has been specified as: :0fwh ...which has the effect of 'filtering' the message down to just its headers? It wouldn't necessarily have to be a recent change to your procmailrc, it might just be a subtle change in the log mail that 'triggers' the rule when it didn't before.

Charles,

# BEGIN RECIPES

# Nuke duplicate messages
#:0 Wh: msgid.lock
#| $FORMAIL -D 8192 msgid.cache

## Call SpamAssassin
:0fw: spamassassin.lock
* < 256000
| spamassassin

The first recipe has been commented out for a while now, so the call to SA is at the top of the list.

> Next guess: Has this log summary grown in size past some limit that would cause the whole body to be 'truncated'?

No. The log summary report (with headers) is 26,000 bytes.

Rich
Re: Identifying Source of False Positives
On Mon, 1 Jun 2009, Bowie Bailey wrote:

> Your biggest problems here are BAYES_99 and EMPTY_BODY. To fix the Bayes
> problem, sa-learn some of these messages as ham. Make sure you are
> learning as the right user...

Bowie,

I just did this on a run from this morning. I'll do so again tomorrow morning with both the mail log and log watch reports.

> The empty body problem is a more difficult problem. Have procmail save a
> copy of the raw message somewhere and take a look at it. Make sure there
> is a blank line between the headers and the body. Run 'spamassassin -D' on
> this saved message and look for anything unusual in the debug output.

There is always a blank line between headers and body. I tried running 'spamassassin -D' on the saved message and nothing happened. Should it take more than a few seconds to complete and return a debug report?

Thanks,

Rich
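The ham-training step Bowie describes can be sketched as a small shell function. This is only an illustration under assumptions: the mbox path in the usage comment is a placeholder, the function name `learn_ham` is made up, and `SA_LEARN` is a hypothetical override variable added so the sketch can be tried against a stand-in command. The `--ham`, `--mbox`, and `--dump magic` options are standard sa-learn options.

```shell
#!/bin/sh
# SA_LEARN is a hypothetical override hook for this sketch;
# it defaults to the real sa-learn command.
SA_LEARN=${SA_LEARN:-sa-learn}

learn_ham() {
    mbox=$1   # mbox file holding the misclassified log reports
    # Bayes data is normally kept per user, so show which account is training.
    echo "training Bayes as user: $(id -un)"
    # --mbox tells sa-learn the file contains several messages in mbox format.
    "$SA_LEARN" --ham --mbox "$mbox" || return 1
    # Print the nspam/nham counters so you can see that the database changed.
    "$SA_LEARN" --dump magic | grep -E 'n(spam|ham)$'
}

# Example (the path is a placeholder):
#   learn_ham "$HOME/mail/log-reports"
```

If the nham count does not go up after a run, the training probably went to a different user's database than the one consulted at delivery time.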
Re: Identifying Source of False Positives
fwiw, even if there isn't a blank line, SA will figure it out (though it'll trigger a MISSING_HB_SEP rule hit).

As for the debug output ... it depends on how you ran the command (i.e., what was the command you tried?). My guess is you did something like "spamassassin -D filename", where filename gets treated as the argument to -D, so then it was waiting for input. If this is the case, try "spamassassin -D < filename > /dev/null". :)

On Mon, Jun 1, 2009 at 6:09 PM, Rich Shepard rshep...@appl-ecosys.com wrote:
> There is always a blank line between headers and body. I tried running
> 'spamassassin -D' on the saved message and nothing happened. Should it
> take more than a few seconds to complete and return a debug report?
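A minimal sketch of the invocation Theo suggests, wrapped so the debug trace lands in a file instead of scrolling past. The function name `sa_debug`, the file names, and the `SA_BIN` override variable are assumptions made for this illustration; only `spamassassin -D` with the message on stdin comes from the thread.

```shell
#!/bin/sh
# SA_BIN is a hypothetical override hook for this sketch;
# it defaults to the real spamassassin command.
SA_BIN=${SA_BIN:-spamassassin}

sa_debug() {
    msg=$1   # saved raw message (headers, blank line, body)
    log=$2   # file that receives the debug trace
    # -D accepts an optional argument, so a file name placed right after it
    # is taken as that argument. Feed the message on stdin instead.
    # The rewritten message on stdout is discarded; debug lines go to stderr.
    "$SA_BIN" -D < "$msg" > /dev/null 2> "$log"
}

# Example (file names are placeholders):
#   sa_debug saved-message.txt sa-debug.log
```

Keeping the trace in a file makes it easier to search for individual rule hits afterwards.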
Re: Identifying Source of False Positives
On Mon, 1 Jun 2009, John Hardin wrote:

> And I assume you look at the spam-uncaught file before learning it?

Yes. The messages in there are those I deliberately move there after they've ended up in my inbox because neither the postfix filters nor the spamassassin rules caught them.

> If some log files got in there and were learned, that could explain the
> deterioration.

That seems very reasonable, but I would have had to move them there myself and I cannot recall doing so. Also, before running sa-learn to classify them as spam I look over the list. So, it's quite possible that they ended up classified as spam unintentionally.

> Have you kept your spam and ham corpora?

I'm not sure. The spam comes from the spam-uncaught file, which is cleared each time it's run. The ham comes from various mail lists, and those grow over time.

> Okay, let's key on that one.
>
>> ## Call SpamAssassin
>> :0fw: spamassassin.lock
>> * < 256000
>> | spamassassin
>
> :0 fw: spamassassin.lock
> * < 256000
> * ! ^TO_abuse@
> * ! ^List-Id: .*?use...@.]spamassassin\.apache\.org?
> * ! ^Received: from salmo\.appl-ecosys\.com \(localhost\.localdomain \[127\.0\.0\.1\]\) by salmo\.appl-ecosys\.com
> | /usr/bin/spamc
>
> Using spamc creates less load than launching spamassassin from scratch for
> every email, but you do have to manage the daemon (i.e. restart it if the
> rules change).

I run spamd:

 2978 ?  Ss  12:16 /usr/bin/spamd -d --pidfile=/var/run/spamd.pid
 3052 ?  S    0:04 spamd child
 3054 ?  S    0:05 spamd child

Is this not adequate for a light load?

> Are your resources really so limited that you want to serialize all email
> delivery? As a middle ground you might consider per-user lockfiles
> instead, e.g.:
>
> :0 fw: $HOME/.spamassassin.lock
>
> I'd also suggest upping the size limit a bit, but that's not a big issue.
>
> There are more complex things you can do; you might want to take a look at
> http://www.impsec.org/~jhardin/antispam/spamassassin.procmail

There are only two users on this network and a low mail volume for each of us. The size limit has been at that value for years without a problem.

I'll keep teaching SA that the log reports are ham and see if that makes a difference. As I wrote earlier, this is all within the past quarter year, and it's been a PITA since it's taken time and attention away from my business.

Thanks,

Rich
Re: Identifying Source of False Positives
On Mon, 1 Jun 2009, Theo Van Dinter wrote:

> My guess is you did something like "spamassassin -D filename", where
> filename gets treated as the argument to -D, so then it was waiting for
> input.

Theo,

Yes, this is what I did.

> If this is the case, try "spamassassin -D < filename > /dev/null". :)

Interesting:

[785] dbg: rules: running uri tests; score so far=1.2
[785] dbg: rules: compiled uri tests
[785] dbg: rules: ran uri rule NORMAL_HTTP_TO_IP == got hit: http://211.129.107.12;
[785] dbg: rules: ran uri rule URI_HEX == got hit: http://kemp-5d866973;
[785] dbg: rules: ran uri rule NUMERIC_HTTP_ADDR == got hit: http://1898218;
[785] dbg: rules: ran uri rule URI_NOVOWEL == got hit: http://jcwpjkp;
[785] dbg: rules: ran uri rule __DOS_HAS_ANY_URI == got hit: h
[785] dbg: eval: stock info total: 0
[785] warn: rules: failed to run CG_FUJI_JPG test, skipping:
[785] warn: (Can't locate object method image_name_regex via package Mail::SpamAssassin::PerMsgStatus at (eval 719) line 1315.
[785] warn: )
[785] warn: rules: failed to run CG_DOUBLEDOT_GIF test, skipping:
[785] warn: (Can't locate object method image_name_regex via package Mail::SpamAssassin::PerMsgStatus at (eval 719) line 1580.
[785] warn: )
[785] warn: rules: failed to run CG_SONY_JPG test, skipping:
[785] warn: (Can't locate object method image_name_regex via package Mail::SpamAssassin::PerMsgStatus at (eval 719) line 2601.
[785] warn: )
[785] dbg: rules: ran eval rule BAYES_50 == got hit (1)
[785] warn: rules: failed to run CG_CANON_JPG test, skipping:
[785] warn: (Can't locate object method image_name_regex via package Mail::SpamAssassin::PerMsgStatus at (eval 719) line 4000.
[785] warn: )
[785] dbg: rules: running rawbody tests; score so far=3.191
[785] dbg: rules: compiled rawbody tests
[785] dbg: rules: running full tests; score so far=3.191
[785] dbg: rules: compiled full tests
[785] dbg: util: current PATH is: /root/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/local/bin:/usr/bin:/bin:/usr/lib/java/bin:/usr/lib/java/jre/bin:/usr/lib/java/bin:/usr/lib/java/jre/bin:/usr/lib/qt/bin:/usr/share/texmf/bin
[785] dbg: pyzor: pyzor is not available: no pyzor executable found
[785] dbg: pyzor: no pyzor found, disabling Pyzor
[785] dbg: rules: running meta tests; score so far=3.191
[785] dbg: rules: compiled meta tests
[785] dbg: check: running tests for priority: 500
[785] dbg: dns: harvest_dnsbl_queries
[785] dbg: async: select found 4 responses ready (t.o.=0.0)
[785] dbg: async: completed in 0.149 s: URI-DNSBL, DNSBL:sbl.spamhaus.org.:10.96.127.75
[785] dbg: async: completed in 0.156 s: URI-DNSBL, DNSBL:sbl.spamhaus.org.:10.178.19.65
[785] dbg: async: completed in 0.155 s: URI-DNSBL, DNSBL:sbl.spamhaus.org.:11.25.147.192
[785] dbg: async: completed in 0.155 s: URI-DNSBL, DNSBL:sbl.spamhaus.org.:110.0.55.209
[785] dbg: async: queries completed: 4, started: 0
[785] dbg: async: queries active: URI-DNSBL=62 URI-NS=10 at Mon Jun 1 15:53:13 2009
[785] dbg: dns: harvest_dnsbl_queries - check_tick
[785] dbg: async: select found 1 responses ready (t.o.=1.0)
[785] dbg: async: completed in 0.158 s: URI-DNSBL, DNSBL:sbl.spamhaus.org.:39.0.58.80
[785] dbg: async: queries completed: 1, started: 0
[785] dbg: async: queries active: URI-DNSBL=61 URI-NS=10 at Mon Jun 1 15:53:13 2009
[785] dbg: dns: harvest_dnsbl_queries - check_tick
...
[785] dbg: check: is spam? score=3.191 required=4
[785] dbg: check: tests=ALL_TRUSTED,BAYES_50,EMPTY_BODY,NORMAL_HTTP_TO_IP,NUMERIC_HTTP_ADDR,URI_HEX,URI_NOVOWEL
[785] dbg: check: subtests=__DATE_700,__DOS_BODY_MON,__DOS_HAS_ANY_URI,__DOS_RCVD_MON,__DOS_REF_TODAY,__ENV_AND_HDR_FROM_MATCH,__FB_NUM_PERCNT,__HAS_ANY_EMAIL,__HAS_ANY_URI,__HAS_MSGID,__HAS_RCVD,__HAS_SUBJECT,__KAM_MED2,__KAM_NUMBER2,__KAM_TIME4,__MISSING_REF,__MSGID_OK_DIGITS,__MSGID_OK_HOST,__MSOE_MID_WRONG_CASE,__NAKED_TO,__NONEMPTY_BODY,__SANE_MSGID,__TOCC_EXISTS,__hk_obfdomreq2

It suddenly jumps from 1.2 to 3.191 after looking for images. I don't know where to fix that.

I think that I need to update SPF, too, because that's compiled against an earlier perl version.

Rich
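One way to see where the score moves in a trace like the one above is to pull out only the "score so far" checkpoints. The sketch below is an illustration, not part of SpamAssassin: the function name `score_checkpoints` and the sample file name are made up, and the sample lines are copied from the trace in this message. It assumes each "running ... tests" line reports the score accumulated before that phase starts.

```shell
#!/bin/sh
# Print "<phase> <score>" for every checkpoint line in a debug trace.
score_checkpoints() {
    # Match "rules: running <phase> tests; score so far=<n>" and keep the
    # phase name and the number; all other lines are dropped.
    sed -n 's/.*rules: running \([a-z]*\) tests; score so far=\([0-9.-]*\).*/\1 \2/p' "$1"
}

# Sample lines taken from the trace in this message.
cat > sa-debug-sample.log <<'EOF'
[785] dbg: rules: running uri tests; score so far=1.2
[785] dbg: rules: ran eval rule BAYES_50 == got hit (1)
[785] dbg: rules: running rawbody tests; score so far=3.191
[785] dbg: rules: running full tests; score so far=3.191
[785] dbg: rules: running meta tests; score so far=3.191
EOF

score_checkpoints sa-debug-sample.log
```

On the sample this prints "uri 1.2", "rawbody 3.191", "full 3.191", and "meta 3.191", one per line. The rule hits logged between the uri and rawbody checkpoints (the URI rules and BAYES_50) are therefore the first place to look for the increase; the CG_* image tests are reported as skipped in the warn lines.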
Re: Identifying Source of False Positives
On Mon, 1 Jun 2009, Rich Shepard wrote:

> On Mon, 1 Jun 2009, John Hardin wrote:
>
>> Have you kept your spam and ham corpora?
>
> I'm not sure. The spam comes from the spam-uncaught file, which is cleared
> each time it's run.

Pity. If you're manually training, it's a very good idea to retain your corpora so you can review training and retrain from scratch if needed.

>> Okay, let's key on that one.
>>
>>> ## Call SpamAssassin
>>> :0fw: spamassassin.lock
>>> * < 256000
>>> | spamassassin
>>
>> :0 fw: spamassassin.lock
>> * < 256000
>> * ! ^TO_abuse@
>> * ! ^List-Id: .*?use...@.]spamassassin\.apache\.org?
>> * ! ^Received: from salmo\.appl-ecosys\.com \(localhost\.localdomain \[127\.0\.0\.1\]\) by salmo\.appl-ecosys\.com
>> | /usr/bin/spamc
>>
>> Using spamc creates less load than launching spamassassin from scratch
>> for every email, but you do have to manage the daemon (i.e. restart it if
>> the rules change).
>
> I run spamd:
>
>  2978 ?  Ss  12:16 /usr/bin/spamd -d --pidfile=/var/run/spamd.pid
>  3052 ?  S    0:04 spamd child
>  3054 ?  S    0:05 spamd child
>
> Is this not adequate for a light load?

That's fine. If you're currently running spamd, then having procmail call spamassassin is wasteful. That recompiles all of the rules from scratch for every message you receive, whereas using spamc/spamd compiles the rules once when you restart the daemon.

>> Are your resources really so limited that you want to serialize all email
>> delivery? As a middle ground you might consider per-user lockfiles
>> instead, e.g.:
>>
>> :0 fw: $HOME/.spamassassin.lock
>>
>> I'd also suggest upping the size limit a bit, but that's not a big issue.
>>
>> There are more complex things you can do; you might want to take a look
>> at http://www.impsec.org/~jhardin/antispam/spamassassin.procmail
>
> There are only two users on this network and a low mail volume for each of
> us.

Ok, then your locking should work okay.

> I'll keep teaching SA that the log reports are ham and see if that makes a
> difference.

It will help, though it may take a while to override their current learning as spam.

> As I wrote earlier, this is all within the past quarter year, and it's
> been a PITA since it's taken time and attention away from my business.

--
John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
jhar...@impsec.org    FALaholic #11174    pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
---
...to announce there must be no criticism of the President or to stand by the President right or wrong is not only unpatriotic and servile, but is morally treasonous to the American public.
    -- Theodore Roosevelt, 1918
---
5 days until the 65th anniversary of D-Day
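John's advice to retain the training corpora could be followed with a small wrapper that copies each mbox into an archive before it is learned, so that clearing the source file afterwards loses nothing. This is a sketch under assumptions: the function name `archive_and_learn`, the directory layout, and the `SA_LEARN` override variable are hypothetical; `--spam`, `--ham`, and `--mbox` are standard sa-learn options.

```shell
#!/bin/sh
# SA_LEARN is a hypothetical override hook for this sketch;
# it defaults to the real sa-learn command.
SA_LEARN=${SA_LEARN:-sa-learn}

archive_and_learn() {
    kind=$1     # "spam" or "ham"
    mbox=$2     # mbox file about to be learned
    corpus=$3   # directory where retained corpora are kept
    case $kind in
        spam|ham) ;;
        *) echo "kind must be spam or ham" >&2; return 2 ;;
    esac
    mkdir -p "$corpus" || return 1
    # Append a copy first, so the messages survive even if the source
    # file is emptied after training.
    cat "$mbox" >> "$corpus/$kind.mbox" || return 1
    "$SA_LEARN" "--$kind" --mbox "$mbox"
}

# Example (paths are placeholders):
#   archive_and_learn spam "$HOME/mail/spam-uncaught" "$HOME/sa-corpus"
```

With the archive in place, a mis-trained database can be reviewed message by message, or rebuilt from the retained spam.mbox and ham.mbox files.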
Re: Plugin/TVD.pm
Yup, that's the beast. Missed the news that it had become part of 3.2. Excellent. Thanks.

Theo Van Dinter wrote:
> That depends, what's TVD.pm? ;)
>
> Doing a quick search shows
> http://mail-archives.apache.org/mod_mbox/spamassassin-users/200603.mbox/%3c20060316233124.gv22...@kluge.net%3e
> which was a conversation we had way back in 2006 about SA 3.1 and bug
> 4255. There was a TVD.pm in discussion, so I assume that's the plugin in
> question. It appears to have become HTTPSMismatch.pm, already included as
> a standard plugin in SA 3.2 and beyond. :)
>
> On Sun, May 31, 2009 at 2:03 PM, Philip Prindeville philipp_s...@redfish-solutions.com wrote:
>> I upgraded from FC8 to FC9 recently, and spamassassin could no longer
>> find TVD.pm after I deprecated the old Perl install. Where does TVD.pm
>> currently live?
Re: twitter spam why RCVD_IN_DNSWL?
On 1-Jun-2009, at 05:52, Michael Scheidell wrote:
> I don't follow anyone on twitter. Went to their web site for the first
> time last week to look for their complaint address.

I've never seen a mail from twitter that was not directed to my twitter account. I searched the entire mailspool for the last 31 days. This is certainly not normal.

--
The quality of our thoughts and ideas can only be as good as the quality of our language.