Re: sa-learn weirdness...
Arthur Dent wrote: Hmmm... Not delete exactly, but the sa-learn job takes so long that the archivemail job has kicked off and finds the TempSpam and TempHam mboxes in the Mail directory and dutifully chops out anything older than 180 days. I didn't think that that would be a problem, but maybe it's upsetting sa-learn? I will try switching the order of the jobs (archivemail running first) and see if that makes a difference. At this point you have probably already swapped the two processes. I think sa-learn or the process feeding it does not like the chopping. Well, as I explained in my previous post, the TempHam folder is a concatenation of all my non-spam folders. Mail that is older than 180 days is taken off at one end and new mail (c. 30-40 per day) added on at the other. The total remains roughly constant. Don't forget that sa-learn remembers which messages have been learned. Once your old messages have all been learned, you need to feed it only new arrivals, that is, those received since the last sa-learn run. No need to keep 180 days' worth of ham and spam in the temp folder! Let sa-learn complete and then chop the folder. Just chain the two jobs so that one runs after the other, rather than scheduling them separately in crontab. It should fix your apparent weirdness. Paolo
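The advice above (let sa-learn finish, then chop the folders) could be sketched as a small wrapper that runs the jobs one after another and stops at the first failure. This is only a sketch: the mbox paths and the exact command-line options are assumptions for illustration, so check them against your own sa-learn and archivemail installation.

```python
import subprocess


def run_in_order(commands, runner=subprocess.run):
    """Run each command in turn; stop at the first one that fails."""
    for cmd in commands:
        result = runner(cmd)
        if result.returncode != 0:
            return False
    return True


# Hypothetical command lines: paths and options are assumptions.
JOBS = [
    ["sa-learn", "--ham", "--mbox", "Mail/TempHam"],
    ["sa-learn", "--spam", "--mbox", "Mail/TempSpam"],
    ["archivemail", "--days=180", "Mail/TempHam", "Mail/TempSpam"],
]

# From a single cron entry one would call: run_in_order(JOBS)
```

With a single cron entry calling the wrapper, archivemail can never start while sa-learn is still reading the mboxes.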
Re: sa-learn weirdness...
Arthur Dent wrote: Learned tokens from 8 message(s) (3165 message(s) examined) Learned tokens from 4628 message(s) (8703 message(s) examined) Learned tokens from 3890 message(s) (8634 message(s) examined) Learned tokens from 2264 message(s) (8671 message(s) examined) Learned tokens from 2303 message(s) (8620 message(s) examined) Odds 2,000,127 against one... and counting... Notice that although the amount of tokens being learned seems to be coming down gradually, the total far exceeds the total amount of ham mails in the corpus. The number of *messages* learned is decreasing, not the number of tokens. Could it be that something deletes the temp folder before sa-learn has finished, so it gets distracted and starts flying away carrying a suitcase? Or do you receive 8600 messages each day? Some of them might have been autolearned on the incoming SMTP channel, BTW. IMHO it is not necessary to train the Bayes DB so extensively. If you want the process to complete in a decent amount of time, feed it fewer messages at a time. Paolo PS: whoever knows who Arthur Dent is/was will understand the oddities in this reply. All others: get a copy of the HHGTTG. :-)
Re: Spam Assassin Load Balancing
Thomas Ledbetter wrote: First of all: we're running amavisd-new, not plain spamc/spamd anymore. We used to have N servers each running its own spamd daemons, each with a separate Bayes/AWL DB. I have not understood how many machines run spamc and how many spamd. With a round robin policy on a hardware load balancer, once the connection is routed to a specific 'worker bee', if that machine times out, the request will fail, and the mail won't get scanned. However, more intelligent hardware load balancing setups can monitor the work on each node, and take it out of service as necessary. A load balancer marks non-responding nodes as offline, according to different levels of checks (ICMP ping, TCP ping, service check, ...). But these checks are not in real-time, so if spamd dies during analysis the connection will drop (or hang) and spamc will time out. The load balancer won't restart the connection to another node. At least not our HLB. Been there (with LDAP), done that! Also, when running a round-robin based cluster, is there any problem having a mix of machines with different performance capacities? i.e. If I have a 10 node cluster, and 3 of the servers are much slower than the others, will it impact performance of the cluster as a whole? Even if I limit the number of spamd that run to a lower value than the higher performance machines? What do you consider as performance? I think the global average analysis time (what I call performance) will obviously be affected, to an amount that depends on load distribution. With a real load balancer you can use different priorities for each node, so as to keep faster machines busier than slower ones. Anyway, I've seen spamd running on different hardware since 2004 and I wouldn't say the analysis speed has improved significantly. Just don't let spamd nodes swap memory to disk. Good luck with the high-load spam fight, Paolo
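To illustrate the idea of per-node priorities, here is a minimal weighted round-robin sketch. It is not how any particular hardware load balancer works; the node names and weights are hypothetical, and it performs no health checks at all.

```python
import itertools


def weighted_cycle(weights):
    """Return an endless iterator in which each node appears
    `weight` times per round (plain weighted round-robin)."""
    expanded = [node for node, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)


# Hypothetical nodes: two fast machines and one slow machine.
nodes = {"fast1": 3, "fast2": 3, "slow1": 1}
picker = weighted_cycle(nodes)
first_round = [next(picker) for _ in range(7)]
```

In one round of seven connections the slow node receives a single connection, while each fast node receives three.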
Re: OT - massive newsletter
mizzio wrote: I'm setting up an SMTP server (centos + qmail) on a dell quad core machine for sending out a periodic newsletter (10 millions a month). In order to avoid any possible blacklisting problem, I'm looking for all the best practices. Right now I've set up: You need EXPLICIT authorization (opt-in) of all recipients and be able to prove it. This is required by EU (and thus your/my country law) and the best insurance not to end up in blacklists. Good luck, Paolo
Re: Blocking MMS messages?
Steve Monkhouse wrote: Yeah that works for that one.. but with multiple originating sources and multiple carriers etc I thought there must be a better way than manually entering every mms provider... ?? I'm old fashioned and don't own an MMS-enabled phone, but phone numbers are generally 12 digits long if in the standard international form, prefixed with a +. I just sent myself an SMS-to-email with Vodafone Italy and hit these rules: X-Spam-Status: No, score=2.532 tagged_above=-999 required=3.5 tests=[BAYES_00=-2.599, DNS_FROM_RFC_ABUSE=0.2, FORGED_RCVD_HELO=0.135, FROM_ENDS_IN_NUMS=2.53, FROM_LOCAL_HEX=1.305, NO_REAL_NAME=0.9 while the sender was [EMAIL PROTECTED] Take a survey of how your local providers format senders and write a set of rules accordingly. Paolo
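Before writing SpamAssassin rules, the "survey" could start from a quick pattern test like the sketch below. It assumes, purely for illustration, that gateways put the phone number in the local part of the sender address; the 10 to 15 digit range and the example addresses are assumptions, not something taken from any provider.

```python
import re

# Optional leading "+", then 10 to 15 digits, then the "@" of the address.
PHONE_SENDER = re.compile(r"^\+?\d{10,15}@")


def looks_like_phone_sender(address):
    """True if the local part of the address looks like a phone number."""
    return PHONE_SENDER.match(address) is not None
```

For instance, `looks_like_phone_sender("+391234567890@mms.example.com")` is true, while an ordinary address such as `paolo@example.com` is not matched.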
spamd errors... SpamdForkScaling.pm
Got these errors in maillog on a postfix+spamc/spamd Linux RedHat ES3 installation. Looks like this issue has not been fixed in 3.1.7, targeted for 3.1.9? Could it be that the system runs out of file descriptors? Don't think so...

[EMAIL PROTECTED] cat /proc/sys/fs/file-nr
8431 4030 314564
[EMAIL PROTECTED] cat /proc/sys/fs/file-max
314564

Here's an excerpt from maillog. Process 31633 is the spamd master.

Dec 18 11:20:39 srv-asgw02 spamd[31633]: prefork: child states: BIIBBIB
Dec 18 11:20:39 srv-asgw02 spamd[31633]: spamd: handled cleanup of child pid 5654 due to SIGCHLD
Dec 18 11:20:39 srv-asgw02 spamd[31633]: prefork: child states: BIIBBB
Dec 18 11:20:39 srv-asgw02 spamd[31633]: syswrite() on closed filehandle GEN452736 at /usr/lib/perl5/5.8.0/i386-linux-thread-multi/IO/Handle.pm line 447.
Dec 18 11:20:39 srv-asgw02 spamd[31633]: Use of uninitialized value in concatenation (.) or string at /usr/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/SpamdForkScaling.pm line 419.
Dec 18 11:20:39 srv-asgw02 spamd[31633]: prefork: killing rogue child 330, failed to write on fd :
Dec 18 11:20:39 srv-asgw02 spamd[31633]: prefork: killing failed child 330 fd=undefined at /usr/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/SpamdForkScaling.pm line 137.
Dec 18 11:20:39 srv-asgw02 spamd[31633]: prefork: killed child 330
Dec 18 11:20:39 srv-asgw02 spamd[31633]: prefork: child states: BKBBBI

Paolo
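As a quick sanity check on the file descriptor question, the file-nr output can be parsed and compared with the maximum. The sketch assumes the output in the post reads as three whitespace-separated fields, 8431 4030 314564, with the last one matching file-max; the meaning of the middle field differs between kernel versions, so it is left uninterpreted here.

```python
def parse_file_nr(text):
    """Split the content of /proc/sys/fs/file-nr into its three fields."""
    allocated, second, maximum = (int(field) for field in text.split())
    return {
        "allocated": allocated,
        "second_field": second,  # meaning depends on the kernel version
        "max": maximum,
        "allocated_ratio": allocated / maximum,
    }


# Sample value as read from the post (assumed field split).
info = parse_file_nr("8431 4030 314564")
```

With these numbers the allocated handles are under 3% of the maximum, which supports the "Don't think so..." in the post.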
Re: bayes_seen on MySQL, growing and growing
Jim Maul wrote: I don't use mysql with SA, but you should be able to use truncate instead of delete. It may very well be faster with all those rows. From MySQL 4.x manual: For InnoDB, TRUNCATE TABLE is mapped to DELETE, so there is no difference. We're using InnoDB rather than MyISAM, so there's apparently no big difference. It doesn't free disk space, though, so an OPTIMIZE TABLE should be issued. Still no input from developers/maintainers: can I empty the bayes_seen table without breaking DB consistency? Thanks, Paolo
bayes_seen on MySQL, growing and growing
Hi, while doing some checkup on production servers, I noticed that the bayes_seen table on MySQL is rather big: rows: 15'814'021 (15.8M rows), size: 1'853'882'368 bytes (1.8 GB). I've understood SA doesn't clean up that table, so it has to be done manually. Can I simply do a DELETE FROM bayes_seen and live long and employed? ;-) I know it works if Bayes is on files. I would also OPTIMIZE TABLE bayes_seen to regain the disk space. It would probably be faster to drop and re-create the table, but on a production system... Any other issues? TIA, Paolo
FP with Outlook SMTPing to Lotus Domino
Hi, I just spotted this FP in our SA 3.1.4 quarantine... I have no means to contact the sender, but I guess he used an Outlook (Express?) client to SMTP a Domino server. Even if we had the threshold at the default 5 it would have been stopped. Is there a workaround on the rules or should I decrease some scores?! Moreover PRIORITY_NO_NAME is not listed in http://spamassassin.apache.org/tests_3_1_x.html but is present in my 20_head_tests.cf (require_version 3.001004). TIA, Paolo X-Spam-Status: Yes, score=5.091 tag=-999 tag2=3.5 kill=3.5 tests=[BAYES_00=-2.599, HTML_40_50=0.496, HTML_MESSAGE=0.001, MSGID_DOLLARS=1.716, PRIORITY_NO_NAME=2.7, RATWARE_OUTLOOK_NONAME=2.777] Received: from smarthost02.ISP.it (smarthost02.ISP.it [xxx.yyy.zzz.nnn]) by MYamavisSERVER.it (Postfix) with ESMTP id 777AD5840A5; Fri, 25 Aug 2006 09:12:11 +0200 (CEST) Received: from relay03.portal ([192.168.bbb.aaa]) by smarthost02.ISP.it (Lotus Domino Release 6.5.1) with ESMTP id 2006082509000547-2363 ; Fri, 25 Aug 2006 09:00:05 +0200 Received: from acme ([xxx.yyy.zzz.mmm]) by relay03.portal (Lotus Domino Release 6.5.1) with ESMTP id 2006082509004734-2554 ; Fri, 25 Aug 2006 09:00:47 +0200 Message-ID: [EMAIL PROTECTED] From: RFC2822 COMPLIANT [EMAIL PROTECTED] To: RFC2822 COMPLIANT Subject: RFC2822 COMPLIANT Date: Fri, 25 Aug 2006 09:11:30 +0200 MIME-Version: 1.0 X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1807 Content-Type: multipart/alternative; boundary==_NextPart_000_0005_01C6C826.73BE4680
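To see how much each rule contributes, the tests list of the X-Spam-Status header above can be parsed and summed. The snippet below is only a sketch of that arithmetic in Python; the helper name is made up for illustration and is not part of SpamAssassin or amavisd-new.

```python
import re


def parse_tests(status):
    """Extract {rule_name: score} from the tests=[...] part of the header."""
    inside = re.search(r"tests=\[(.*?)\]", status, re.S).group(1)
    scores = {}
    for item in inside.split(","):
        name, value = item.strip().split("=")
        scores[name] = float(value)
    return scores


status = (
    "Yes, score=5.091 tag=-999 tag2=3.5 kill=3.5 "
    "tests=[BAYES_00=-2.599, HTML_40_50=0.496, HTML_MESSAGE=0.001, "
    "MSGID_DOLLARS=1.716, PRIORITY_NO_NAME=2.7, RATWARE_OUTLOOK_NONAME=2.777]"
)
scores = parse_tests(status)
total = round(sum(scores.values()), 3)
without_ratware = round(total - scores["RATWARE_OUTLOOK_NONAME"], 3)
```

The individual scores add up to the reported 5.091, and dropping RATWARE_OUTLOOK_NONAME alone would bring the message down to 2.314, below both the 3.5 and the default 5 thresholds.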
Re: FP with Outlook SMTPing to Lotus Domino
Randal, Phil wrote: You might wish to look at tweaking your BAYES_xx scores to reduce false positives. I guess that depends on how healthy your Bayes database is, though. Can't really say how healthy it is. 99% of spam (guessing, but pretty close) is in English language, 99% of our ham is in Italian language. Spam in Italian is so rare (so far!) that I had to write custom rules to catch specific spam, because Bayes wouldn't hit hard enough after several training rounds. So... our Bayes is probably highly unbalanced due to the nature of our traffic and spam. Am I right? Any workaround? Paolo
Re: false positive on FORGED_MUA_OUTLOOK (v.3.1)
Tony Finch wrote: The following headers come from a legitimate message - I have obscured the sender's name, but that's all. The SlipStream SP Server seems to have appended the client username and IP address to the message-ID, causing the FP. See also: http://mail-archives.apache.org/mod_mbox/spamassassin-users/200509.mbox/[EMAIL PROTECTED] Yep! That was me! :-) I investigated with the sender of that message, and since he's a friend, I could ask him all sorts of questions. It turned out that he used a dialup connection *and* a dialup connection accelerator offered by the provider itself. I don't know how that [EMAIL PROTECTED] thing works, but it probably re-routes all IP traffic through a software-compressed tunnel established between the PC and the provider's servers. I don't know where the Message-IDs are altered, but it is not done by the client itself. I tried the same dialup without the compression software and everything went fine. So, the FORGED rule triggers correctly. Someone deals improperly with Message-IDs! Paolo
Re: sa-learn Lotus Notes
Andy Jezierski wrote: There have been numerous threads on how to have end users drop misclassified mail to spam/ham folders in Exchange, but I don't recall seeing any mention of a way of doing this with Notes. Although we don't let users train Bayes, Lotus client and server from version 5 and above support IMAP, both as a client and as a server. When I need to extract a message from a LN mailbox I open an IMAP mailbox and copy it there. Or, the other way around, I access my LN mailbox via IMAP. Don't know if LN supports shared IMAP folders, or proxy authentication. But this need depends if you're training shared Bayes or per-user. Paolo
Re: Idea for new SA Rule
Gustafson, Tim wrote: Could SpamAssassin benefit from a filter that would actually check the spelling of the text parts of the message, and if misspelled words exceeds, for example, 50%, then we can add a few points to the SPAM score? I'm not sure how to begin coding this, but I think it should be pretty easy (using pSpell or aSpell or something) and I think it would be a very useful tool. And how would you deal with messages in other languages? Over here 99% of messages in English are spam! AFAIK there's no language indicator in email messages. Paolo
Re: Best Practices: SpamAssassin
of the above? Test them and decide which apply to your case. Dunno how independent your current antispam solution is; with SA you need to invest some time to review false negatives/positives (if any) and review extra rulesets. How have people fared with MySQL replication of the DB? I will need this solution to present the same data for backup MX which is not local to the primary MX. First of all: we dropped the secondary MX record because it received more spam than the primary. We use a load balancer for HA. What do you want to store on MySQL? Bayes, AWL, quarantine are your non-mutually exclusive options. Bayes and AWL can be regenerated in a matter of minutes, and you can start (I mean power up) a backup MX without them. Replicating quarantine is like replicating your trash between two bins. If you provide delegated quarantine, how likely is it that a HW failure will destroy a false positive? You're probably better off without the hassle of MySQL master-slave replication. AFAIK there is a MySQL master-master replication function, but its limitations make it incompatible with amavis SQL needs. OT MODE ON X-Mailer: Novell GroupWise Internet Agent 6.0.4 OMG! It formatted your message paragraphs without breaking up lines! Luckily Thunderbird has a rewrap function! OT MODE OFF Have a nice weekend! Paolo Cravero -- |QRPp-I #707 + www.paolocravero.tk + I QRP #476 | | SpamAssassin-based email antispam/antivirus solutions | \Italian/English-to/from-Croatian translations/ \ Skype: pcravero /
Re: Spamassassin Appliances?
Hi, this is a copy'n'paste from a message I wrote in December 2005 to the AMaViS list. Hi, I thought you might like to know how much a commercial solution _very_ similar to amavisd-new+ClamAV+SA+MySQL+mailzu costs. Something with AV+AS and webQuarantine to be installed on your own hardware, and a nice web interface for management (configuration). For 10k mailboxes it is about 12 USD/mailbox/year. But for 200 mailboxes the cost increases to 50 USD/mbx/year. There are of course reductions if the license covers 2 or 3 years. How much money is your setup worth? :-) . I'd go for spare lower-level machines that can just be turned on until the main one is fixed. Anyway, unless someone has shell access to your SA installation, it shouldn't software-break. Over here it hasn't in over 3 years of uninterrupted 100kmsgs/day/server. Mind disk occupation if you quarantine to disk, though! Depending on your traffic, Postfix+SA could be handled by a P4 1GB RAM machine without slowdowns. Paolo
Forged Outlook false positive
Hi, these headers trigger the FORGED_MUA_OUTLOOK check on 2.64 and 3.1.0: X-Spam-Checker-Version: SpamAssassin 3.1.0 (2005-09-13) X-Spam-Level: * X-Spam-Status: No, score=1.6 required=5.0 tests=BAYES_00,FORGED_MUA_OUTLOOK, FORGED_RCVD_HELO autolearn=no version=3.1.0 Received: from xx.yy.yu (user-broadband-wireless-2.4GHz-1.xx.yy.yu [1.2.3.4]) by zz.yy.it (Postfix) with ESMTP id 90F881A3A14 for [EMAIL PROTECTED]; Mon, 23 Jan 2006 07:52:38 +0100 (CET) Received: from galerija2 ([192.168.13.195]) by xx.yy.yu (kg.org.yu [192.168.13.5]) (MDaemon.PRO.v6.8.5.R) with ESMTP id 1-md501.tmp for [EMAIL PROTECTED]; Mon, 23 Jan 2006 07:52:57 +0100 Message-ID: [EMAIL PROTECTED] From: International [EMAIL PROTECTED] To: L R [EMAIL PROTECTED] References: [EMAIL PROTECTED] Subject: Read: ok subject line Date: Mon, 23 Jan 2006 07:52:56 +0100 MIME-Version: 1.0 Content-Type: multipart/report; report-type=disposition-notification; boundary==_NextPart_000_0006_01C61FF2.05C6A910 X-Mailer: Microsoft Outlook Express 5.50.4952.2800 X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4952.2800 X-MDRemoteIP: 192.168.13.195 X-Return-Path: [EMAIL PROTECTED] X-MDaemon-Deliver-To: [EMAIL PROTECTED] What is wrong with these? This should be a Return Receipt sent by OE through a MDaemon SMTP server (whose behavior is to me unknown). Does it change Message-ID?! TIA, Paolo
A self-declared Bulk message
Just reviewed the spam that passed through our amavisd-new + SA3.1.0 barrier and noticed something funny at the bottom of a message: This email has been sent with an unregistered version of MaxBulk Mailer. MaxBulk Mailer is a new easy-to-use mail merge software for Macintosh. This message came to a non-existing address we started to trap for free spam, so nobody ever opted-in. Here's the message, I removed our internal SMTP headers and cut the actual recipient domain: http://spazioinwind.libero.it/ik1zyw/temp/selfDeclaredBulk.eml That website is now probably into most SURBLs, but it is funny anyway. Happy New Year to those who are in A.D. 2006! :-) Paolo
Re: Load ldap prefs
Philip S. Hempel wrote: Did you copy'n'paste this or retype? user_scores_dsn ldap://locahost/dc=qmailldap,dc=lh,dc=com?spamassassin?sub?uid=__USERNAME__ locaLhost, perhaps? Let us know... pc
Re: I need help with false spam (ham flagged as spam)
Liviu Lalescu wrote: Spamassassin is reporting it as spam, with a score of 5.6, but it is surely not spam. I have also used a sa-learn --ham on it, but even after that the message is still flagged as spam. I have done sa-learn --ham timetabling and after that spamassassin -t timetabling timetabling.out, obtaining also a 5.6 score. I can mention that I have used learning (sa-learn) for about 8000 ham messages and for 14 spam messages.

Thus Bayes does not kick in, which is obvious since no BAYES_* test gets reported. Anyway, SpamAssassin is NOT guilty at all in this false positive.

pts  rule name             description
---- --------------------- ----------------------------------------------
-0.0 SPF_HELO_PASS         SPF: HELO matches SPF record
 0.9 MSGID_FROM_MTA_ID     Message-Id for external message added locally
-0.0 SPF_PASS              SPF: sender matches SPF record
 1.9 DATE_IN_FUTURE_96_XX  Date: is 96 hours or more after Received: date

Your correspondent sent a message dated 19 *JANUARY* *2006*. Without this hit alone, the message would have got through.

 0.5 DNS_FROM_RFC_ABUSE    RBL: Envelope sender in abuse.rfc-ignorant.org
 0.9 DNS_FROM_RFC_WHOIS    RBL: Envelope sender in whois.rfc-ignorant.org
 1.4 DNS_FROM_RFC_POST     RBL: Envelope sender in postmaster.rfc-ignorant.org

Paulo's provider has been listed in rfc-ignorant.org lists. Go to those websites to understand why. Then, take some time to finish your Bayesian engine training and feed it/him/her some good ol' spam, so that it starts working. Last but not least, add Paulo to the whitelisted senders. Paolo
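The effect of each hit can be checked with a little "what if" arithmetic on the reported points. This is a sketch only: the 5.0 threshold is SpamAssassin's default and is assumed here, since the post does not state the required score of this installation, and the function name is made up for illustration.

```python
# Points as listed in the report quoted above.
hits = {
    "SPF_HELO_PASS": -0.0,
    "MSGID_FROM_MTA_ID": 0.9,
    "SPF_PASS": -0.0,
    "DATE_IN_FUTURE_96_XX": 1.9,
    "DNS_FROM_RFC_ABUSE": 0.5,
    "DNS_FROM_RFC_WHOIS": 0.9,
    "DNS_FROM_RFC_POST": 1.4,
}
REQUIRED = 5.0  # assumption: default threshold


def score_without(hits, dropped):
    """Total score if the rules named in `dropped` had not fired."""
    return round(sum(v for k, v in hits.items() if k not in dropped), 1)


total = score_without(hits, set())
no_date = score_without(hits, {"DATE_IN_FUTURE_96_XX"})
no_rbl = score_without(
    hits, {"DNS_FROM_RFC_ABUSE", "DNS_FROM_RFC_WHOIS", "DNS_FROM_RFC_POST"}
)
```

The hits add up to the reported 5.6. With a correct Date: header the score would be 3.7, and without the three rfc-ignorant.org listings it would be 2.8, both under the assumed threshold.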
Re: Using sa-learn with Notes/Domino Servers via agents
Not a solution but a few thoughts since we have LN here as well. Domino servers add a hell of headers to email messages that might confuse the Bayesian engine. Forwarding internet mail from one LN account to another DESTROYS RFC2822 headers. Copying preserves. LN clients can access IMAP mailboxes (sort-of undocumented hidden feature). sa-learn can be fed through a call from fetchmail accessing an IMAP mailbox+folder. (I think the latter is documented in the Wiki.) You may widen the autolearn thresholds so that fewer messages are fed automatically to the Bayes DB. Another issue I have is that we have 2 loadbalanced exim servers for tagging spam, yet I would like to keep the bayes DB the same on both hosts. Did anyone ever come up with a solution to this problem? Yes, a RDBMS backend for the Bayes database (MySQL here). Otherwise you might elect one server as master and align DBs nightly (spamd restart!). Or stay with mis-aligned Bayes DBs: if your servers route a lot of msgs/day (n*10k) and are round-robin balanced, they'll be statistically identical. Same goes for AWL, if used. HTH, Paolo -- |QRPp-I #707 + www.paolocravero.tk + I QRP #476 | | SpamAssassin-based email antispam/antivirus solutions | \Italian/English-to/from-Croatian translations/ \ Skype: pcravero /
Re: f-secure messaging security gateway x-series??
Mathias Homann wrote: So, has anyone here seen/touched this thing before? Not that one, but touched two other vendors' appliances. For me, the only strong point with it seems to be the combined firewall/AV/spam scanner thing (waitaminute... single point of failure??), and the web admin frontend which can generate colorful pie charts about spam/virus statistics (which, of course, can be printed on overhead films and used to increase the IT budget...). Anyone ever seen one of those? Lately they *all* look like an amavisd-new wrapper with a commercial AV, SA- or DSPAM-based AS analysis plus all those colorful niceties that impress managers but don't actually improve performance. One big issue with these appliances is how they decide whether content is spam or not, and how you can adapt the appliance to your needs. Many of them keep a sort of centralized ruleset (Bayes? heuristic? ...) that spreads to each appliance, so you really don't know what is behind the decision process. That makes it hard to explain to your customer why his favourite Ikea newsletter was blocked. Same goes for non-English spam/ham. There might be other issues, but they're OT for this list. SA rulez! :) Paolo PS: I asked one of those vendors (the one whose pricing I posted a few weeks ago) how they deal with DNS-based lists. I wanted to know if they use vendor-based DNS replicas or query public nameservers, since they advertise +100kmsgs/day. They haven't answered yet...
POP3 proxy with SA 3.x?
Hi, I have successfully used a Perl POP3proxy on a Linux box with SA 2.6.x . I have now migrated to 3.x, and some internal functions have been dropped or renamed, so that Perl program doesn't work anymore. Does anyone know of a (Linux) POP3 proxy that supports SA 3.x? TIA, Paolo
SA 3.1 false positive on FORGED_MUA_OUTLOOK
Hi, just incurred in a false positive with SA 3.1 (through amavisd-new). The message comes from a friend, and he uses a real Outlook Express client, perhaps the Italian version. libero.it is one of the biggest Italian (free) ISPs. Any hint on fixing this? Paolo . Received: from localhost (172.16.1.84) by smtp2.libero.it (7.0.027-DD01) id 431C3A2400E8EB94; Mon, 19 Sep 2005 19:44:51 +0200 Received: from smtp0.libero.it ([172.16.1.76]) by localhost (asav5.libero.it [193.70.192.154]) (amavisd-new, port 10024) with ESMTP id 11243-11-5; Mon, 19 Sep 2005 19:44:50 +0200 (CEST) Received: from Vecchio (195.210.65.40) by smtp0.libero.it (7.0.027-DD01) id 431C393500235EBE; Mon, 19 Sep 2005 19:44:50 +0200 Received: from ppp-231-174.25-151.libero.it ([EMAIL PROTECTED] [151.25.174.xxx]) by wca20.libero.it (SlipStream SP Server 4.0.112 built 2005/05/06 17:01:26 -0400 (EDT)); Mon, 19 Sep 2005 19:44:50 +0200 (CEST) X-Originating-IP: [151.25.174.xxx] X-Originating-User: [USER_ANONYMOUS] Message-ID: [EMAIL PROTECTED] From: Name Surname [EMAIL PROTECTED] To: Name Surname [EMAIL PROTECTED], Name Surname [EMAIL PROTECTED] Subject: cinema Date: Mon, 19 Sep 2005 19:43:12 +0200 MIME-Version: 1.0 Content-Type: multipart/alternative; boundary==_NextPart_000_0056_01C5BD52.5F07C5C0 X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1106 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 X-Scanned: with antispam and antivirus automated system at libero.it X-Spam-Status: Yes, hits=4.924 tag=-999 tag2=3.5 kill=3.5 tests=[BAYES_00=-2.599, DNS_FROM_RFC_ABUSE=0.2, NS_FROM_RFC_POST=1.708, FORGED_MUA_OUTLOOK=4.056, HTML_MESSAGE=0.001, RCVD_IN_BL_SPAMCOP_NET=1.558] X-Spam-Score: 4.924 X-Spam-Level: X-Spam-Flag: YES
Re: SA 3.1 false positive on FORGED_MUA_OUTLOOK
Michael Monnerie wrote: X-Spam-Status: Yes, hits=4.924 tag=-999 tag2=3.5 kill=3.5 tests=[BAYES_00=-2.599, DNS_FROM_RFC_ABUSE=0.2, NS_FROM_RFC_POST=1.708, FORGED_MUA_OUTLOOK=4.056, HTML_MESSAGE=0.001, RCVD_IN_BL_SPAMCOP_NET=1.558] X-Spam-Score: 4.924 Yes, increase the level at which an e-mail is marked as SPAM. This one got only 4.924 points, which is still below the default 5 points from where it should be marked as SPAM. A level of 3.5 is very optimistic, leading to lots of FP. I work for an ISP and we've been running SA 2.64 at 3.5 threshold for a couple of years now. All false positives so far have been well beyond the threshold and came from mailing lists. Today's FP on SA 3.1 under evaluation was a personal mail. Maybe an update to his Outlook Express could help, saves 4.056 points if his e-mail program works correctly :-) Well, although I can suggest it to my friend (who uses a 56k dialup BTW), I can't force the whole world to update their OE clients (they'd better switch to something better, anyway!). Even if we increase the threshold to 5, what about false positives that hit both FORGED_MUA_OUTLOOK and BAYES_nm (with nm leading to a score of 1)? IMHO FORGED_MUA_OUTLOOK is buggy, but I have too little experience with Outlook Express Message-IDs. Paolo -- |QRPp-I #707 + www.paolocravero.tk + I QRP #476 | | SpamAssassin-based email antispam/antivirus solutions | \Italian/English-to/from-Croatian translations/ \ Skype: pcravero /
(OT) SURBL local-DNS sample file?
Hi, what follows is certainly OT for SpamAssassin. I am setting up SA3 with SURBL support, and I am configuring RBLDNSD in order to run a local SURBL copy. Before asking for rsync permission, I'd like to test the configuration on a non-production system (with a non-production IP address). I need a sample of the files that are actually downloaded with rsync, but I've not been able to find any sample to use on surbl.org and related sites. I am not enough of a DNS expert to write my own. Can someone provide me with a sample? Would SURBL.org people mind publishing a sample rsync file on their pages? Thanks for your attention, Paolo
Re: SpamAssassin w/POP3 SMTP outsourced e-mail server...
Jesse Shumaker wrote: Let me try and summarize what I have recieved from all these e-mails as [...] use and am trying to piece it all together. Correct, except that the remote POP3 server is specified on client configuration and not wired statically on the pop3 proxy box. At least with the SApop3proxy we're using. Ciao, pc
Re: SpamAssassin w/POP3 SMTP outsourced e-mail server...
Jesse Shumaker wrote: Hi This looks good and I think I may try this perl module. It seems that it's geared towards a single workstation and not a network of machines. They say that you point your client to localhost, which means that each machine must have this installed. How are you guys running this so that you can have one centralized SA server? Also, how does the SA box authenticate with the ISP's POP servers for each e-mail client? In my organization each user has their own password and username for their e-mail account. We installed it on a linux box with SA, and run it as a daemon. It supports concurrent connections, although we haven't tested it thoroughly (hundreds of simultaneous connections...). So, rather than installing it locally on each machine, use a shared POP proxy. The client sends SAproxy the user/password, which SAproxy then submits to the remote server. It is a proxy for the POP3 protocol (no support for POP3*S*), except that each message is scanned by SA before being sent to the client. It is also very flexible, since the destination server has to be specified as part of the login string ([EMAIL PROTECTED] to retrieve mail with login [EMAIL PROTECTED] from pop.domain.com server): your colleagues can use the same proxy box for retrieving mail from other POP3 accounts as well. PC -- |QRPp-I #707 + www.paolocravero.tk + I QRP #476 | | SpamAssassin-based email antispam/antivirus solutions | \Italian/English-to/from-Croatian translations/ \ Skype: pcravero /
Re: SpamAssassin w/POP3 SMTP outsourced e-mail server...
Jesse Shumaker wrote: Jesse, It would be just like a web proxy. The outlook clients are redirected to the spamassassin box which filters the e-mail and forwards/relays the requests onto our ISP's e-mail servers. If you can assist me at all with this I would greatly appreciate it. you can try this: http://mcd.perlmonk.org/pop3proxy/ It is written in Perl and apparently works on Win and Linux boxes. I believe it is the one we're using in my organization. Very stable. Paolo -- |QRPp-I #707 + www.paolocravero.tk + I QRP #476 | | SpamAssassin-based email antispam/antivirus solutions | \Italian/English-to/from-Croatian translations/ \ Skype: pcravero /
Re: OT: Mail/Spam Stats and MRTG
Jake Colman wrote: Does anyone have any suggestions for using mrtg to produce a graph showing the amount of received email and how much of it was flagged as spam? I am using mrtg, sendmail, and procmail on all the same server. You need to write an external program (script) for the SNMP daemon on the server. It returns a single number, computed from the sendmail/procmail maillog, for whatever you want to monitor. Then use MRTG to manipulate the value (cumulative vs last-5-minutes). Here we use Cricket to monitor SpamAssassin performance in quasi-real-time. But I didn't set it up myself. HTHAL, Paolo --- SpamAssassin-based email antispam/antivirus solutions Italian/English-to/from-Croatian translations
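A counting script of the kind described could look like the sketch below. The two marker strings are hypothetical: they stand for whatever your sendmail and procmail logs actually write for a received message and for a message flagged as spam, so adapt them to your log format. The second helper prints the counts in the four-line form (two values, uptime, target name) that MRTG reads from an external script, in case you feed MRTG directly instead of going through SNMP.

```python
def count_messages(lines, received_marker="from=<",
                   spam_marker="X-Spam-Flag: YES"):
    """Count received and spam-flagged messages in an iterable of log lines.
    Both markers are assumptions about the log format."""
    total = 0
    spam = 0
    for line in lines:
        if received_marker in line:
            total += 1
        if spam_marker in line:
            spam += 1
    return total, spam


def mrtg_report(total, spam, uptime="unknown", name="mail"):
    """Format two counters the way an MRTG external script prints them."""
    return f"{total}\n{spam}\n{uptime}\n{name}\n"
```

In practice the lines would come from the maillog, for example by opening the file and passing the file object to `count_messages`.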
Re: Logfile analyzer
Chris Santerre wrote: Can anyone recommend a good logfile analyzer for Spamassassin? Depends on what you want to analyze. One of the ninjas wrote a great script to parse the logs and show rule hit statistics. If you are looking for that I can see if I can find it in my vast archive of ninja info. Let me know. pflogsumm.pl if using SA with Postfix... I also wrote a script that gives stats per domain of spam caught, if using SA with Postfix. If anyone's interested in joining me in beta-testing it... Paolo -- QRPp-I #707 + www.paolocravero.tk + I QRP #476 \ Skype: pcravero /
Re: German Spam
Netmail wrote: Hi I'm new for spamassassin , when modify the local.cf file after restart sendmail or what ? If you are using spamc/spamd you need to restart spamd in order to activate new rules. If you are simply calling spamassassin executable from sendmail (highly inefficient), no restart is needed. Ciao, Paolo -- SpamAssassin-based antispam/antivirus email gateways Italian/English-to/from-Croatian translations
Re: R: German Spam
Netmail wrote: Ok Now this is my config file

# This is the right place to customize your installation of SpamAssassin.
# See 'perldoc Mail::SpamAssassin::Conf' for details of what can be
# tweaked.
#
###
# rewrite_subject 1
#report_safe 1

If i want add block for the header of message ..how to ?

Although custom rules on a particular text are not the best way to exploit SpamAssassin's potential, you need to add to local.cf something like:

header   SUBJ_RETHANKS Subject =~ /Re\: Thanks \:\)/
describe SUBJ_RETHANKS Subject is Re: Thanks :)
score    SUBJ_RETHANKS 10 10 10 10
# Gives 10 points to those messages whose subject is Re: Thanks :)

You need to be able to write Perl regexps if you want to be more successful. You should test new rules on a development server, not on your primary box. Don't forget to run spamassassin --lint before you put any new rule/ruleset into production! Custom rules are very time consuming for the sysadmin, especially if not well written. There are enough free resources (rulesemporium) to keep SpamAssassin's hit ratio very high. And please do not forget that the Bayesian filter is your Friend! SpamAssassin is _extremely_ well documented. Paolo -- SpamAssassin-based email antispam/antivirus solutions Italian/English-to/from-Croatian translations
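Before putting a pattern like the one above into local.cf, it can help to try it against a few sample subjects. The sketch below does that in Python rather than Perl, so it only approximates how SpamAssassin evaluates the rule; for a simple pattern like this one the two regexp dialects behave the same. The function name is made up for illustration.

```python
import re

# Same pattern as the SUBJ_RETHANKS rule; only the ")" needs escaping.
SUBJ_RETHANKS = re.compile(r"Re: Thanks :\)")


def rule_hits(subject):
    """True if the Subject text would match the pattern."""
    return SUBJ_RETHANKS.search(subject) is not None
```

Note that the pattern is not anchored, so a subject such as "Fwd: Re: Thanks :) again" matches too, while "Re: Thanks" without the smiley does not.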
SA3.0.2 + amavisd-new ignoring $sa_tag_level_deflt ?
Hi, I'm testing a setup with amavisd-new (latest download version) and SA 3.0.2 on RedHat ES3. This setup serves as a laboratory for upgrading our SA 2.64 servers. I would like to have amavisd-new add X-Spam-* headers to all messages, so I set the following:

$sa_tag_level_deflt = -999;   # add spam info headers if at, or above
$sa_tag2_level_deflt = 3.50;  # add 'spam detected' headers at that level
$sa_kill_level_deflt = 3.50;  # triggers spam evasive actions
$sa_dsn_cutoff_level = -999;  # spam level beyond which a DSN is not

Unfortunately X-Spam-* headers are NOT added to messages scoring between -999 and 3.5. What am I missing? Thanks, Paolo -- QRPp-I #707 + www.paolocravero.tk + I QRP #476 \ Skype: pcravero /
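The behaviour expected from the three levels, as described by the comments in the configuration, can be written down as a small sketch. It models only what those comments say, not amavisd-new's actual code, which may apply further conditions before adding headers.

```python
def spam_actions(score, tag=-999.0, tag2=3.5, kill=3.5):
    """Actions expected at a given score, per the config comments."""
    actions = []
    if score >= tag:
        actions.append("add X-Spam-* info headers")
    if score >= tag2:
        actions.append("add 'spam detected' headers")
    if score >= kill:
        actions.append("spam evasive actions")
    return actions
```

With the values from the post, a message scoring 1.0 should get only the informational headers, and a message scoring 4.9 should trigger all three actions.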
Re: highly available sitewide bayes, local db vs. sql
Ben Poliakoff wrote: Hi Ben What sort of experiences have people had managing a sitewide bayes db that is used by spamassassin (spamd|amavisd) instances on multiple machines? I've got an environment with spamassassin/amavisd-new running in parallel on a pool of two (but possibly more in the future) equally weighted machines. How have you avoided the dreaded Single Point of Failure? We run two servers with SA in load balancing here. Each machine has its own local Bayes/AWL DB (no SPoF). Given the amount of incoming traffic (100kmsgs/server/workday) we are statistically sure that both servers see the same (spam) messages. We have not noticed any efficiency imbalance between the two instances in over 12 months. Having two DBs also has one advantage: if Bayes on one machine gets corrupted (wrong training, ...) you can restore it from the twin server with a simple FTP. We have done this at least once. What needs to be done periodically is AWL DB purging/reset since it keeps growing and growing... We were considering a MySQL DB on a third machine (with failover on the other two), but the loss of Bayes history is not such a big issue IMHO. A nighttime backup is probably enough as long as you have another machine to restore the DB a few hours after failure. Nevertheless a good ham/spam collection will re-train your Bayesian filter in a matter of minutes. Our third machine will probably run a local mirror of SURBL, instead! HTH, Paolo