Re: Understanding SpamAssassin
On 21-Sep-2009, at 13:05, poifgh wrote:
> Mails which have a very high or very low score are fed to Bayesian learning. Since we are confident about them being HAM or SPAM, what do we want to learn from them? The regex filters have identified that the mail is spam (say); what additional does Bayesian learning achieve? Does it learn other words in the spam mail (say, words surrounding an obfuscated term) in the hope of matching them in future emails? Or am I understanding it completely differently?

Bayes learning from spam helps score messages as spam that would not otherwise score as spam. Similarly, bayes learning from ham helps score messages as ham that might otherwise be tagged as ham.

--
Heisenberg's only uncertainty was what pub to vomit in next and Jung fancied Freud's mother too. -- Jared Earle
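To make the answer above concrete, here is a minimal sketch of the idea, not SpamAssassin's actual Bayes code (which tokenizes and combines probabilities differently). It is a plain naive Bayes toy with Laplace smoothing; the sample messages and the function names `train` and `spam_probability` are made up for illustration. It shows how words that merely surround an obfuscated term in learned spam can later push a message toward spam even when no regex rule matches.

```python
import math
from collections import Counter


def train(messages):
    """Count, per token, how many messages of one class contain it."""
    counts = Counter()
    for text in messages:
        counts.update(set(text.lower().split()))
    return counts


def spam_probability(text, spam_counts, ham_counts, n_spam, n_ham):
    """Naive Bayes in log-odds form with add-one (Laplace) smoothing."""
    log_odds = math.log(n_spam / n_ham)
    for token in set(text.lower().split()):
        p_token_spam = (spam_counts[token] + 1) / (n_spam + 2)
        p_token_ham = (ham_counts[token] + 1) / (n_ham + 2)
        log_odds += math.log(p_token_spam / p_token_ham)
    return 1 / (1 + math.exp(-log_odds))


# Spam that the static rules already scored very high, so it was auto-learned.
learned_spam = [
    "cheap v1agra shipped discreetly overnight",
    "v1agra discount shipped overnight today",
]
# Ham that scored very low, also auto-learned.
learned_ham = [
    "meeting notes attached for review today",
    "please review the attached budget notes",
]
spam_counts = train(learned_spam)
ham_counts = train(learned_ham)

# A new message without the obfuscated term: only the surrounding words remain.
new_spam_p = spam_probability(
    "pills shipped discreetly overnight", spam_counts, ham_counts, 2, 2)
new_ham_p = spam_probability(
    "budget meeting notes attached", spam_counts, ham_counts, 2, 2)
print(f"spam-like: {new_spam_p:.2f}, ham-like: {new_ham_p:.2f}")
```

The first message gets a high probability purely from tokens such as "shipped" and "overnight" that were learned from mail the rules had already caught; the second is pulled toward ham by tokens learned from low-scoring mail.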
Re: Re-running SA on an mbox
On Tuesday September 22 2009 06:32:12 Benny Pedersen wrote:
> On man 21 sep 2009 20:33:57 CEST, MySQL Student wrote
> > > but this will invalidate DKIM headers if these headers are signed; is SpamAssassin aware of this problem? (in general)
> > Are you saying there is a bug?
> partly yes; it's not a bug as long as you keep the original email, but spamassassin --mbox < infile > outfile invalidates DKIM-signed mails, no?

It is neither common nor wise to have X-Spam-* header fields included in a DKIM signature. Neither amavisd nor dkim-milter/OpenDKIM nor dkimproxy would do it without special effort. I wouldn't expect stripping of X-Spam-* header fields to be problematic in view of invalidating signatures.

What can be detrimental to signatures is modification of existing header fields like From or Subject by inserting 'tags' like **SPAM**. Whether this matters or not depends on what will happen next with such mail.

Mark
Re: About reporting
Hello,

please configure your mailer to wrap lines below 80 characters per line. 72 to 75 is usually OK. Thank you.

On 19.09.09 22:45, João Eiras wrote:
> > > I still haven't got the answer to my last question, so here it goes again: Can I report a full mbs file with many mails in one go? Or should I split each mail into its own file?
> > sa-learn can accept mail in mbox format. It's even in its manual page. Is this what you have meant?
> Might be. I'm not familiar with SpamAssassin's internals, just a normal e-mail user wanting to report some spam. A quick man sa-learn shows
>
>   --mbox  sa-learn will read in the file(s) containing the emails to be learned, and will process them in mbox format (one or more emails per file).
>
> So, I have my answer. The page at http://wiki.apache.org/spamassassin/ReportingSpam is ambiguous in this regard. It just mentions a message.txt. I hope you can make something out of my uploaded mbox file :)

On 22.09.09 03:05, João Eiras wrote:
> After further testing: while sa-learn might support multiple emails in a single mbox file, spamassassin -r does not, so I cooked a perl script to split the mbox file into individual mails, and then a simple loop is enough to report everything.

sa-learn only does bayes training. spamassassin -r and spamassassin -k do other things - report to network services like razor/pyzor/dcc and SpamCop. This is quite useless for older messages, which will probably be in the mailbox unless it contains mail received in the last few days.

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
I don't have lysdexia. The Dog wouldn't allow that.
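The perl splitter mentioned above was not posted to the list. As a rough illustration of the same step, here is a minimal Python sketch using the standard `mailbox` module; the function name `split_mbox`, the output file naming and the sample addresses are assumptions made for this example.

```python
import mailbox
import tempfile
from pathlib import Path


def split_mbox(mbox_path, out_dir):
    """Write each message of an mbox file to its own file; return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    box = mailbox.mbox(str(mbox_path), create=False)
    try:
        for index, message in enumerate(box, start=1):
            target = out / f"message-{index:04d}.txt"
            # as_bytes() leaves out the mbox "From " separator line by default.
            target.write_bytes(message.as_bytes())
            written.append(target)
    finally:
        box.close()
    return written


# Small demonstration on a two-message mbox built in a temporary directory.
SAMPLE = (
    "From alice@example.com Tue Sep 22 03:05:00 2009\n"
    "Subject: first\n\nbody one\n\n"
    "From bob@example.com Tue Sep 22 03:06:00 2009\n"
    "Subject: second\n\nbody two\n"
)
workdir = Path(tempfile.mkdtemp())
(workdir / "sample.mbox").write_text(SAMPLE)
paths = split_mbox(workdir / "sample.mbox", workdir / "split")
print([p.name for p in paths])
```

Each resulting file can then be fed to spamassassin -r in a loop, as described above; keep in mind the caveat that reporting old messages to the network services is of little use.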
partial (lazy) scoring?
I would like to configure Spamassassin to only do certain tests when the required_score is not yet reached. For example, do the usual rule-based and bayesian tests first, and if the score is lower than the required_score, then do the DCC and RAZOR2 tests. Is it possible?
Re: Understanding SpamAssassin
On tir 22 sep 2009 09:43:23 CEST, LuKreme wrote
> bayes learning from ham helps score messages as ham that might otherwise be tagged as ham.

ups :)

--
xpoint
Re: partial (lazy) scoring? (shortcircuit features)
ArtemGr wrote:
> I would like to configure SpamAssassin to only do certain tests when the required_score is not yet reached. For example, do the usual rule-based and bayesian tests first, and if the score is lower than the required_score, then do the DCC and RAZOR2 tests. Is it possible?

Not exactly the way you describe, no.

SpamAssassin has a priority and a shortcircuit facility that provide vaguely similar functionality, but they don't work exactly the way you want.

Priority allows you to change the order in which rules are processed, so you can make some rules run earlier, or later, than others. This part fits your needs.

Shortcircuit allows you to stop processing when a particular rule fires. However, it is strictly based on the rule firing, not the message score. This part doesn't fit your needs.

Collectively they allow you to make some rules (e.g. USER_IN_WHITELIST, USER_IN_BLACKLIST) run first, and abort processing if they fire. However, this doesn't really work for your scenario of delaying a few rules and aborting if they're not needed.

I suppose there could be some kind of mod to the shortcircuit plugin to do this; however, it's a little dangerous from a false-positive perspective, so the devs may not be very enthusiastic about adding it.

A long, long time ago, SpamAssassin had a feature where it would abort as soon as a given score was hit. However, this could cause false positives. A nonspam message might hit several spam rules early in the processing and drive the score over the abort threshold, causing it to be tagged as spam. The abort would prevent it from matching negative-scoring rules that would have pushed it back under the spam threshold.

Now, that version of SA was a long time ago, and we didn't have any priority going on, and it was also checking the score pretty often in between rules.

In theory, a feature could be added to let you do something like this (SA doesn't have this feature, but I'm proposing it could be added):

shortcircuit_if_score_above_at score priority

Which would let you do:

shortcircuit_if_score_above_at 5.0 99
priority RAZOR_CHECK 100
priority DCC_CHECK 100

You'd have to be careful about your priorities, as this will prevent any nonspam rules with higher priority numbers from running, but it could work for this scenario.

You could also prevent the rules from running on nonspam, where they're pointless as well, with a similar score-below feature:

shortcircuit_if_score_below_at -1.17 99

The highest score you can ever get out of both DCC and Razor (with the current scores) is +6.17 (unlikely, but possible, assuming both e4 and e8 have high cf's and DCC fires too). If the score is already below -1.17, there's no way these rules can ever drive the score up enough to be over 5.0 and make the message spam.

Obviously this would greatly depend on what rules you're running late.
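The trade-off described above can be sketched in a few lines of Python. This is only a simulation of the proposed (non-existent) score-based short-circuit, not SpamAssassin code; the function `score_message`, the rule names and the individual scores are hypothetical. Only the 5.0 threshold, the checkpoint priority 99 and the +6.17 maximum come from the message.

```python
def score_message(fired, checkpoint=None, above=None, below=None):
    """Sum the scores of fired rules, optionally skipping late rules.

    fired: list of (priority, name, score) for rules that would match.
    Rules with priority >= checkpoint are skipped when the running total
    at the checkpoint is already above `above` or below `below`.
    Returns (total, late_rules_skipped).
    """
    if checkpoint is None:
        return round(sum(s for _, _, s in fired), 2), False
    total = sum(s for p, _, s in fired if p < checkpoint)
    decided = (above is not None and total > above) or (
        below is not None and total < below)
    if not decided:
        total += sum(s for p, _, s in fired if p >= checkpoint)
    return round(total, 2), decided


# Hypothetical rule hits: (priority, name, score).
obvious_spam = [
    (0, "EARLY_RULE_A", 6.2),
    (0, "EARLY_RULE_B", 4.1),
    (100, "RAZOR_CHECK", 2.0),
    (100, "DCC_CHECK", 1.5),
]
risky_ham = [
    (0, "EARLY_RULE_A", 3.6),
    (0, "EARLY_RULE_C", 2.0),
    (500, "LATE_NEGATIVE_RULE", -3.0),  # runs after the checkpoint
]

print(score_message(obvious_spam, checkpoint=99, above=5.0))  # network tests skipped
print(score_message(risky_ham))                               # full scan
print(score_message(risky_ham, checkpoint=99, above=5.0))     # aborted early

# Lower bound from the message: threshold minus the largest possible late score.
lower_bound = round(5.0 - 6.17, 2)
print(lower_bound)
```

In the third call the message is cut off above the threshold although a full scan would have brought it back under 5.0, which is exactly the false-positive risk when negative-scoring rules have higher priority numbers than the checkpoint.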
Re: partial (lazy) scoring? (shortcircuit features)
Matt Kettler mkettler_sa at verizon.net writes:
> In theory, a feature could be added to let you do something like this (SA doesn't have this feature, but I'm proposing it could be added):

That would be a nice optimization: most of the spam we receive has a 10 score. It seems a real waste of resources to perform all the complex tests (like distributed hashing or OCR-ing) on spam which is DNS- and rule-detectable.
Re: Re-running SA on an mbox
On Mon, 2009-09-21 at 23:18 -0400, MySQL Student wrote:
> How can I tell when another process is using the database and when it is free for my script to use? Is there a faster way to run spamassassin just to strip the SA headers?

Try using a local SA setup for stripping the headers. By local, I mean don't use your main production SA - run a separate copy with its own (cut down) configuration and all database accesses and UBL calls etc. turned off. By using a separate SA instance you'll avoid access conflicts with your production SA, and by using a minimal configuration it will initialise and run faster than if it was setting up for a normal scan run.

I have a similar spamc/spamd system that is only used for testing new local rules. It works well and (important to me anyway) doesn't write anything to the production maillog, so testing new rules doesn't contaminate my daily SA performance report.

> Maybe there is a faster way, like passing the messages through the running amavisd instead of having to restart spamassassin each time to re-process each message?

I maintain a cleaned spam corpus for developing and regression testing local rules. I use the following script to delete SA headers from this corpus:

=== cleaner ===
#!/bin/bash
for f in data/*.txt
do
    echo "Cleaning $f"
    gawk '
        BEGIN       { act = "copy" }
        /^X-Spam/   { act = "skip" }
        /^[A-WYZ]/  { act = "copy" }
                    { if (act == "copy") { print } }
    ' "$f" > temp.txt
    mv temp.txt "$f"
done
=== end of cleaner ===

This is certainly much faster than using SA for that job: it scans 167 spam messages in 2.3 seconds on a 1.6 GHz Core Duo laptop, as compared with a spamc/spamd run on the same corpus and host, which takes 155 seconds.

Martin
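For comparison, the same header stripping can be sketched with Python's standard `email` package. This is an alternative illustration, not the script from the message; the function name `strip_spam_headers` and the sample message (including the rule names in it) are invented for the example. Parsing the header block means folded continuation lines are removed together with their header, and other X- headers are left alone.

```python
from email import message_from_bytes


def strip_spam_headers(raw):
    """Return the message bytes with every X-Spam-* header removed."""
    msg = message_from_bytes(raw)
    for name in set(msg.keys()):
        if name.lower().startswith("x-spam"):
            del msg[name]  # deletes all occurrences of this header name
    return msg.as_bytes()


SAMPLE = (
    b"Received: from a.example.com by b.example.org\n"
    b"X-Spam-Flag: YES\n"
    b"X-Spam-Status: Yes, score=12.3 required=5.0\n"
    b"\ttests=EXAMPLE_RULE_A,EXAMPLE_RULE_B\n"
    b"X-Mailer: demo\n"
    b"Subject: hello\n"
    b"\n"
    b"body text\n"
)
cleaned = strip_spam_headers(SAMPLE)
print(cleaned.decode())
```

Note that the `email` package may re-fold very long header lines when writing the message back out, so a corpus that must stay byte-identical outside the removed headers is probably better served by a line-based filter like the gawk one.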
Re: partial (lazy) scoring? - run a second time?
ArtemGr artemciy at gmail.com writes:
> I would like to configure SpamAssassin to only do certain tests when the required_score is not yet reached. For example, do the usual rule-based and bayesian tests first, and if the score is lower than the required_score, then do the DCC and RAZOR2 tests. Is it possible?

Another way comes to mind: run SpamAssassin the first time with the basic set of detections, and if the score is low, run it a second time with only the advanced detections turned on (using an alternative configuration by means of --configpath=).

That raises the question of whether the basic detections can be turned off. I found the following options:

skip_rbl_checks 1
dns_available no
use_bayes 0
use_bayes_rules 0
bayes_auto_learn 0

but I do not see an option to turn off the static rules shipped with SpamAssassin. Is there such an option?
RBL error with SA
hi,

I set SpamAssassin to work with RBLs via the option skip_rbl_checks 0, but my statistics do not report any RBL tests.

I ran

spamassassin -D --lint

and these messages came out:

[5563] dbg: dns: no ipv6
[5563] dbg: dns: is Net::DNS::Resolver available? yes
[5563] dbg: dns: Net::DNS version: 0.61
[4315] dbg: dns: is_dns_available() last checked 1253631680.0 seconds ago; re-checking
[4315] dbg: dns: is DNS available? 0
[4315] dbg: rules: local tests only, ignoring RBL eval
Re: RBL error with SA
Luis campo wrote:
> hi,
>
> I set SpamAssassin to work with RBLs via the option skip_rbl_checks 0, but my statistics do not report any RBL tests.
>
> I ran spamassassin -D --lint

One of the side-effects of '--lint' is to disable network checks. That should only be used to test your config files. Try this instead:

spamassassin -D < samplemessage.txt

--
Bowie
Re: partial (lazy) scoring? (shortcircuit features)
Matt Kettler mkettler_sa at verizon.net writes:
> > In theory, a feature could be added to let you do something like this (SA doesn't have this feature, but I'm proposing it could be added):

On 22.09.09 11:46, ArtemGr wrote:
> That would be a nice optimization: most of the spam we receive has a 10 score. It seems a real waste of resources to perform all the complex tests (like distributed hashing or OCR-ing) on spam which is DNS- and rule-detectable.

You haven't read Matt's explanation of why it wasn't a good idea, have you? There are rules with negative scores which can push the score back to ham, e.g. whitelist. Would you like to stop scoring before e.g. the whitelist is checked?

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Micro$oft random number generator: 0, 0, 0, 4.33e+67, 0, 0, 0...
Re: partial (lazy) scoring? - run a second time?
ArtemGr artemciy at gmail.com writes:
> > I would like to configure SpamAssassin to only do certain tests when the required_score is not yet reached. For example, do the usual rule-based and bayesian tests first, and if the score is lower than the required_score, then do the DCC and RAZOR2 tests.

On 22.09.09 12:08, ArtemGr wrote:
> Another way comes to mind: run SpamAssassin the first time with the basic set of detections, and if the score is low, run it a second time with only the advanced detections turned on (using an alternative configuration by means of --configpath=).
>
> That raises the question of whether the basic detections can be turned off. I found the following options:
>
> skip_rbl_checks 1
> dns_available no
> use_bayes 0
> use_bayes_rules 0
> bayes_auto_learn 0
>
> but I do not see an option to turn off the static rules shipped with SpamAssassin. Is there such an option?

No, unless you run spamassassin twice, or spamc once and spamassassin the other time. Note that running spamassassin would take much more time, and the most CPU-time-consuming rules are those that are not disabled in local mode.

Why do you want to do that?

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Boost your system's speed by 500% - DEL C:\WINDOWS\*.*
Re: Eliminating russian spam
Thank you, John! Both how-to (http://sa-russian.narod.ru/no_russian.html) and the ruleset (http://sa-russian.narod.ru/files/20090916/99_no_russian_mail.cf) are updated.
Re: RBL error with SA
On tir 22 sep 2009 17:24:30 CEST, Luis campo wrote
> [4315] dbg: rules: local tests only, ignoring RBL eval

--lint will disable rbl testing

so:

spamassassin 2>&1 -D | less

any errors ?

--
xpoint
RE: Problems with high spam
Dear friends,

I appreciate your support.

Yesterday at approximately 15:00 I made some changes:

- Added skip_rbl_checks 0 to the SA configuration
- Increased required_score from 3.5 to 5.0

Spam statistics from yesterday were:

Total messages:   Ham:    Spam:   % Spam:
-----------------------------------------
11656             5225    6431    55.17%

Spam detection increased 1% compared to the previous statistics.

Regarding whitelist_from, these are the statistics:

TOP HAM RULES FIRED
---------------------------------------------------------------
RANK   RULE NAME           COUNT   %OFMAIL   %OFSPAM   %OFHAM
---------------------------------------------------------------
22     USER_IN_WHITELIST   110     0.95      0.02      2.11

If I removed the entire whitelist_from configuration from SA, detection would improve by 1%.

Additionally, the rules that score 100 points were created for mass mailings that we identify as spam (advertising) but that SA does not detect.

I also noticed that there are emails that should be detected as spam (for example those covered by the 100-point advertising rules) but are not filtered. What could be happening?

What else could I add to or remove from the SA configuration?

I understand that there may be errors in the SA configuration, basically because I do not have much experience; that is why I turn to the list for support, and I also want to learn more about SA.

Thanks

Jose Luis

> Date: Mon, 21 Sep 2009 19:36:24 -0400
> Subject: Re: Problems with high spam
> From: aawo...@gmail.com
> To: users@spamassassin.apache.org
>
> On Mon, Sep 21, 2009 at 11:34 AM, Martin Gregorie mar...@gregorie.org wrote:
> > On Mon, 2009-09-21 at 09:58 -0500, Jose Luis Marin Perez wrote:
> > > I will implement improvements in the configuration suggested and observe the results; however, what more could be suggested to improve my spam service?
> >
> > I think you need to find out more about where your system resources are going. For starters, take a look at maillog (/var/log/maillog on my system) to check whether any SA child processes are timing out. If they are, you need to find out why processing those messages took so long and, if possible, speed that up, e.g. if RBL checks or domain name lookups are slow, consider running a local caching DNS.
> >
> > If that doesn't turn up anything obvious, use performance monitoring tools (sar, iostat, mpstat, etc.) to see what is consuming the system resources: you have to know where and what the bottleneck(s) are before you can do anything about them. You can find these tools here: http://freshmeat.net/projects/sysstat/ if they aren't part of your distro's package repository.
> >
> > Martin
>
> Has there been any evidence that the OP's system is short on resources? If so I missed it.
>
> The complaint was that too much spam is making it past the filter, with a detection rate of only 54%. This is not a very good percentage for a typical mail flow (if it is actually accurate, i.e. not missing the mails rejected by RBLs or RFC/syntax checks). There were several issues with the configuration that kind people on the list have pointed out. Assuming these suggested changes have been implemented, what is the detection rate now?
>
> From the posted local.cf, it is evident that the SA configuration is not working very well. There are many manually entered whitelist rules, and also many manually added rules that score 100. This is a telltale sign of a very bad setup that is attempting to bandaid instead of fixing the core issue. And as pointed out before, both the whitelist and the subject match - 100 are very bad ideas. Whitelisting the sender is so easily taken advantage of by spammers, and those +100pts matches are sure to generate FPs.
>
> Using rules this way demonstrates a lack of understanding of the way that SA is supposed to work. SA rules rarely attempt to kill a message in one shot (100 pts); instead they add or subtract a small amount from the score based on the likelihood that a match means spam or ham. Fine tuning, not smashing with a hammer.
>
> So, I think it is pretty safe to assume that the problem lies within the SA configuration. Maybe there are old rulesets that need to be updated. Maybe not a good selection of rulesets in the first place. Perhaps this is an out-of-the-box configuration that has never been properly set up. There are many good guides to setting up SA and supporting services available online. If the OP were to follow one of them to the letter, I think the detection rate would be much improved. Also, some time spent learning more about SA in general would allow the OP to fine tune his config so that the current manual effort put into creating hammer-smashing rules is unneeded.
>
> Good luck
> -Aaron
RE: Problems with high spam
On Tue, 22 Sep 2009, Jose Luis Marin Perez wrote:
> Additionally I noticed that there are emails that should be detected as spam (for example those of 100 points - advertising) but are not filtered.
>
> What else could I add to or remove from the SA configuration?

First we need to see why they aren't being scored well. Collect a small representative set of these (say, five or six) _including_ _all_ _headers_ and publish them on pastebin or on a webserver you have access to, and post the URLs to the list. We'll take a look at them and see if there are any obvious suggestions.

Two more questions:

(1) Are you using any SMTP-time DNSBL checks? You may find using the spamhaus zen list at SMTP time (if that is possible in your environment) will greatly reduce your spam volume with minimal problems.

(2) Are you using any third-party SA rulesets, for example from the SARE repository?

--
John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
jhar...@impsec.org    FALaholic #11174    pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
---
W-w-w-w-w-where did he learn to n-n-negotiate like that?
---
Approximately 8756100 firearms legally purchased in the U.S. this year
RE: Problems with high spam
Dear Sirs,

Thank you for your answers.

I'll gather some examples of emails that my users consider spam (I was recently able to configure SA to display the report in the headers).

Regarding the questions:

1. Yes, I have set up qmail-smtpd to use rblsmtpd, and it definitely blocks a lot of mail before SA can analyze it.

2. I am not using any third-party SA rulesets, but I will install them now.

Thanks for your reply

Jose Luis

> Date: Tue, 22 Sep 2009 11:00:12 -0700
> From: jhar...@impsec.org
> To: users@spamassassin.apache.org
> CC: aawo...@gmail.com
> Subject: RE: Problems with high spam
>
> On Tue, 22 Sep 2009, Jose Luis Marin Perez wrote:
> > Additionally I noticed that there are emails that should be detected as spam (for example those of 100 points - advertising) but are not filtered.
> >
> > What else could I add to or remove from the SA configuration?
>
> First we need to see why they aren't being scored well. Collect a small representative set of these (say, five or six) _including_ _all_ _headers_ and publish them on pastebin or on a webserver you have access to, and post the URLs to the list. We'll take a look at them and see if there are any obvious suggestions.
>
> Two more questions:
>
> (1) Are you using any SMTP-time DNSBL checks? You may find using the spamhaus zen list at SMTP time (if that is possible in your environment) will greatly reduce your spam volume with minimal problems.
>
> (2) Are you using any third-party SA rulesets, for example from the SARE repository?
>
> --
> John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
> jhar...@impsec.org    FALaholic #11174    pgpk -a jhar...@impsec.org
> key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
> ---
> W-w-w-w-w-where did he learn to n-n-negotiate like that?
> ---
> Approximately 8756100 firearms legally purchased in the U.S. this year
RE: Problems with high spam
On Tue, 22 Sep 2009, Jose Luis Marin Perez wrote:
> I'll gather some examples of emails that my users consider spam (I was recently able to configure SA to display the report in the headers).
>
> Regarding the questions:
>
> 1. Yes, I have set up qmail-smtpd to use rblsmtpd, and it definitely blocks a lot of mail before SA can analyze it.

Which RBLs are you using, if I may ask?

> 2. I am not using any third-party SA rulesets, but I will install them now.

In addition to the SARE rules, I recommend the SOUGHT rules. Those are automatically generated and updated regularly based on current spam. You will want to set up sa-update to update SOUGHT daily.

> > Two more questions:
> >
> > (1) Are you using any SMTP-time DNSBL checks? You may find using the spamhaus zen list at SMTP time (if that is possible in your environment) will greatly reduce your spam volume with minimal problems.
> >
> > (2) Are you using any third-party SA rulesets, for example from the SARE repository?

--
John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
jhar...@impsec.org    FALaholic #11174    pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
---
You do not examine legislation in the light of the benefits it will convey if properly administered, but in the light of the wrongs it would do and the harms it would cause if improperly administered. -- Lyndon B. Johnson
---
Approximately 8758860 firearms legally purchased in the U.S. this year
Re: Re-running SA on an mbox
Hi,

> Try using a local SA setup for stripping the headers. By local, I mean don't use your main production SA - run a separate copy with its own (cut down) configuration and all database accesses and UBL calls etc. turned off.

Much better idea, thanks. Thanks for the script, too.

Best,
Alex
RE: Problems with high spam
Dear Sirs,

Thank you for your answers.

qmail-smtpd has the following RBLs configured:

bl.spamcop.net
cbl.abuseat.org
combined.njabl.org

These are the SARE rules that I added to SA:

echo 70_sare_adult.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_bayes_poison_nxm.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_evilnum0.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_evilnum1.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_evilnum2.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_genlsubj0.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_genlsubj1.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_genlsubj2.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_genlsubj3.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_genlsubj.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_genlsubj_x30.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_header0.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_header1.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_header2.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_header3.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_header.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_highrisk.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_html0.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_html1.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_html2.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_html3.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_html4.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_html.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_obfu0.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_obfu1.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_obfu2.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_obfu3.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_obfu.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_oem.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_random.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_specific.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_spoof.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_stocks.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_unsub.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_uri0.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_uri1.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_uri3.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_whitelist.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_whitelist_rcvd.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_whitelist_spf.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 72_sare_bml_post25x.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 72_sare_redirect_post3.0.0.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 99_sare_fraud_post25x.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt

How can I verify that the SARE rules are working?

I would also like to ask about the SOUGHT rules.

Thanks

Jose Luis

> Date: Tue, 22 Sep 2009 12:27:27 -0700
> From: jhar...@impsec.org
> To: users@spamassassin.apache.org
> CC: aawo...@gmail.com
> Subject: RE: Problems with high spam
>
> On Tue, 22 Sep 2009, Jose Luis Marin Perez wrote:
Re: About reporting
> spamassassin -r and spamassassin -k do other things - report to network services like razor/pyzor/dcc and SpamCop.

Hum, then how do the default spam filters that come with a clean SpamAssassin installation know what's spam and what's not? Is there a service we can report spam to?
Re: Problems with high spam
Jose Luis Marin Perez wrote:
> Dear Sirs,
>
> Thank you for your answers.
>
> qmail-smtpd has the following RBLs configured:
>
> bl.spamcop.net
> cbl.abuseat.org
> combined.njabl.org

You might want to try zen.spamhaus.org. That is the only one I trust enough to block mail on my MTA.

> These are the SARE rules that I added to SA:
>
> echo 70_sare_adult.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_bayes_poison_nxm.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_evilnum0.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_evilnum1.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_evilnum2.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_genlsubj0.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_genlsubj1.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_genlsubj2.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_genlsubj3.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_genlsubj.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_genlsubj_x30.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_header0.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_header1.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_header2.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_header3.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_header.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_highrisk.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_html0.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_html1.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_html2.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_html3.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_html4.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_html.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_obfu0.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_obfu1.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_obfu2.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_obfu3.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_obfu.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_oem.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_random.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_specific.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_spoof.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_stocks.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_unsub.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_uri0.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_uri1.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_uri3.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_whitelist.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_whitelist_rcvd.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 70_sare_whitelist_spf.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 72_sare_bml_post25x.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 72_sare_redirect_post3.0.0.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt
> echo 99_sare_fraud_post25x.cf.sare.sa-update.dostech.net >> /etc/mail/spamassassin/sare-sa-update-channels.txt

With all of those extra rules, I would expect to see false positives, not spam slipping through.

> As certify that the SARE rules are
RE: Problems with high spam
On Tue, 22 Sep 2009, Jose Luis Marin Perez wrote:

echo 70_sare_highrisk.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt

Did you read the ruleset descriptions before choosing which ones to use?

Inquire about rules SOUGHT http://wiki.apache.org/spamassassin/SoughtRules

-- John Hardin KA7OHZ http://www.impsec.org/~jhardin/ jhar...@impsec.org FALaholic #11174 pgpk -a jhar...@impsec.org key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79 --- End users want eye candy and the ooo's and hhh's experience when reading mail. To them email isn't a tool, but an entertainment form. -- Steve Lake --- Approximately 8760240 firearms legally purchased in the U.S. this year
Re: Problems with high spam
On Tue, Sep 22, 2009 at 4:02 PM, Jose Luis Marin Perez jolumape...@hotmail.com wrote:

Dear Sirs. Thank you for your answers. Qmail-smtpd has the following RBLs configured: bl.spamcop.net cbl.abuseat.org combined.njabl.org

Consider zen. It is excellent. Spamcop and NJABL have caused too many false positives to be used for blocking here, although they are very useful in scoring mail. Everyone's mail is different, YMMV. Also consider the invaluement block lists, see http://dnsbl.invaluement.com/ A very, very good list that is usable for blocking. Not free, but very affordable.

These are the SARE rules added to SA:

Careful with this, some of those sets will cause you FPs! Don't just blindly copy things, read about what you are doing first.

echo 70_sare_adult.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_bayes_poison_nxm.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_evilnum0.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_evilnum1.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_evilnum2.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_genlsubj0.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_genlsubj1.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_genlsubj2.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_genlsubj3.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_genlsubj.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_genlsubj_x30.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_header0.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_header1.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_header2.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_header3.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_header.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_highrisk.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_html0.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_html1.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_html2.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_html3.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_html4.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_html.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_obfu0.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_obfu1.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_obfu2.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_obfu3.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_obfu.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_oem.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_random.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_specific.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_spoof.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_stocks.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_unsub.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_uri0.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_uri1.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_uri3.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_whitelist.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_whitelist_rcvd.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo 70_sare_whitelist_spf.cf.sare.sa-update.dostech.net /etc/mail/spamassassin/sare-sa-update-channels.txt
echo
Re: Re-running SA on an mbox
On Tue, 22 Sep 2009 13:03:16 +0100 Martin Gregorie mar...@gregorie.org wrote:

    gawk '
      BEGIN { act = "copy" }
      /^X-Spam/ { act = "skip" }
      /^[A-WYZ]/ { act = "copy" }
      { if (act == "copy") { print } }
    '

There are a few problems with that:

1 - it deletes all consecutive headers starting with an X that follow an X-Spam header, e.g. X-Delivered-to
2 - if the bottom header is deleted, the header-body separator is also deleted
3 - ^X-Spam can match in the body, causing part of it to be deleted and, in the worst case, corrupting the MIME structure.

I think the following is a bit more robust:

    awk '
      /^[^[:space:]]/ { remove = 0 }
      /^X-Spam/ { remove = 1 }
      /^$/ { isbody = 1 }
      isbody || !remove { print }
    '
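For anyone who would rather not do this in awk, the same state machine can be sketched in Python. This is only an illustration under a few assumptions: the function name strip_x_spam_headers is made up here, the input is a single message rather than a whole mbox, and the header name is matched case-insensitively, which the awk version does not do.

```python
def strip_x_spam_headers(raw: str) -> str:
    """Drop X-Spam* header fields (including folded continuation lines)
    from one message, leaving the separator line and the body untouched."""
    kept = []
    in_body = False
    removing = False
    for line in raw.splitlines(keepends=True):
        if in_body:
            # Past the blank separator line: never touch body text.
            kept.append(line)
            continue
        if line in ("\n", "\r\n"):
            # First empty line ends the header block; always keep it.
            in_body = True
            kept.append(line)
            continue
        if not line[0].isspace():
            # A new header field starts here; continuation lines start
            # with whitespace and inherit the previous decision.
            removing = line.lower().startswith("x-spam")
        if not removing:
            kept.append(line)
    return "".join(kept)
```

The three cases listed above map onto the three branches: other X- headers reset the flag, the separator is kept unconditionally, and body lines are never tested.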
Re: Re-running SA on an mbox
From: MySQL Student mysqlstud...@gmail.com Date: Tue, 22 Sep 2009 15:38:47 -0400

Try using a local SA setup for stripping the headers. By local, I mean don't use your main production SA - run a separate copy with its own (cut down) configuration and all database accesses and UBL calls etc turned off.

Much better idea, thanks. Thanks for the script, too. Alex

formail can be used to remove headers, for example:

To remove all Received: fields from the header:

    formail -I Received:

The following should do what you wanted to remove the X-Spam headers:

    formail -I X-Spam < msg

-jeff
Re: Re-running SA on an mbox
On Tue, 22 Sep 2009, Jeff Mincy wrote:

From: MySQL Student mysqlstud...@gmail.com Date: Tue, 22 Sep 2009 15:38:47 -0400

Try using a local SA setup for stripping the headers. By local, I mean don't use your main production SA - run a separate copy with its own (cut down) configuration and all database accesses and UBL calls etc turned off.

Much better idea, thanks. Thanks for the script, too. Alex

formail can be used to remove headers, for example:

To remove all Received: fields from the header:

    formail -I Received:

The following should do what you wanted to remove the X-Spam headers:

    formail -I X-Spam < msg

And if it's still in multiple-message format:

    formail -Yb -I X-Spam -s < in > out

You can add more headers, like:

    formail -Yb -I X-Spam -I X-Greylist -s < in > out

-- John Hardin KA7OHZ http://www.impsec.org/~jhardin/ jhar...@impsec.org FALaholic #11174 pgpk -a jhar...@impsec.org key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79 --- A sword is never a killer, it is but a tool in the killer's hands. -- Lucius Annaeus Seneca (Martial) 4BC-65AD --- Approximately 8761620 firearms legally purchased in the U.S. this year
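Where formail is not available, the multiple-message case can also be sketched with Python's standard mailbox module. This is a rough equivalent under stated assumptions, not a drop-in replacement: the function name strip_headers_from_mbox and its prefixes parameter are invented for this example, header names are matched by prefix and case-insensitively, and the output file is appended to if it already exists.

```python
import mailbox


def strip_headers_from_mbox(in_path, out_path, prefixes=("X-Spam",)):
    """Copy every message from in_path to out_path, dropping header fields
    whose names start with any of the given prefixes. Returns the number
    of messages copied."""
    lowered = tuple(p.lower() for p in prefixes)
    src = mailbox.mbox(in_path, create=False)
    dst = mailbox.mbox(out_path)  # created if missing, appended to otherwise
    copied = 0
    try:
        for msg in src:
            # Snapshot the names first; deleting a name removes every
            # occurrence of that field, and repeated deletes are harmless.
            for name in list(msg.keys()):
                if name.lower().startswith(lowered):
                    del msg[name]
            dst.add(msg)
            copied += 1
        dst.flush()
    finally:
        dst.close()
        src.close()
    return copied
```

Calling strip_headers_from_mbox("in", "out", prefixes=("X-Spam", "X-Greylist")) would correspond to the two-header variant. Because the messages are parsed and re-serialized, byte-for-byte fidelity of the remaining headers is not guaranteed, which matters if DKIM signatures need to stay valid.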
Fwd: Episode 17 of the Who and Why Show: SpamAssassin
Hi all -- FYI, I had a chat with Steve Santorelli a while back for Team Cymru's 'Who and Why Show' podcast -- an occasional series designed to directly assist network administrators, looking at various Open Source Tools that can be used to check and secure systems. It's now up: Today we talk with Justin Mason, the original author of SpamAssassin, a world class anti-SPAM tool that's fundamentally changed the way we deal with the SPAM problem. These short overviews will give you a good grounding in the basics of the tool, the alternatives and also perhaps enlighten long time users to some of the upcoming features for future versions. See it at www.youtube.com/teamcymru It probably won't be news for this audience, as it's more newbie-oriented... but anyway ;) -- --j.
Re: Problems with high spam
On 22-Sep-2009, at 14:42, Aaron Wolfe wrote:

Also consider the invaluement block lists, see http://dnsbl.invaluement.com/ A very, very good list that is usable for blocking. Not free, but very affordable.

I don't like how invaluement does their pricing structure, actually. Firstly, I don't feel comfortable telling a 3rd party how many 'users' I have. Secondly, I don't feel like determining what they consider a 'user'. Third, because of my HELO/EHLO restrictions and rejection of unknown users, I make FAR fewer RBL calls than most mailservers (I reject about 87% of all connections, and 98% of those rejections are in HELO/EHLO or unknown; only 0.66% over the last week were rejected by zen's RBL). So if I used invaluement, it would probably only be for a handful of callouts per day, but I would be paying the same amount as someone who was using it to do many tens of thousands of callouts per day.

Sure, $20 a month is not a lot of money, but I average only about 1000 checks of zen per week. Assuming I made exactly as many checks to invaluement as to zen, that would be costing me about 1/2 a cent per check, if not more.

-- Don't congratulate yourself too much, or berate yourself either. Your choices are half chance; so are everybody else's.
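The per-check figure can be checked with a few lines of arithmetic. This is a back-of-the-envelope sketch using only the numbers quoted in the message ($20 a month, about 1000 checks a week); the function name and the average month length of 365.25 / 12 days are assumptions made for the illustration.

```python
def cents_per_check(monthly_fee_usd: float, checks_per_week: float) -> float:
    """Average cost of one lookup, in US cents, for a flat monthly fee."""
    weeks_per_month = 365.25 / 7 / 12  # about 4.35 weeks in an average month
    checks_per_month = checks_per_week * weeks_per_month
    return monthly_fee_usd * 100 / checks_per_month


# $20 a month spread over roughly 1000 checks a week
print(round(cents_per_check(20, 1000), 2))  # 0.46
```

With a four-week month instead, the result is exactly half a cent, so "about 1/2 a cent" holds either way.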
Re: Problems with high spam
On Tue, Sep 22, 2009 at 10:21 PM, LuKreme krem...@kreme.com wrote:

On 22-Sep-2009, at 14:42, Aaron Wolfe wrote: Also consider the invaluement block lists, see http://dnsbl.invaluement.com/ A very, very good list that is usable for blocking. Not free, but very affordable.

I don't like how invaluement does their pricing structure, actually. Firstly, I don't feel comfortable telling a 3rd party how many 'users' I have. Secondly, I don't feel like determining what they consider a 'user'. Third, because of my HELO/EHLO restrictions and rejection of unknown users, I make FAR fewer RBL calls than most mailservers (I reject about 87% of all connections, and 98% of those rejections are in HELO/EHLO or unknown; only 0.66% over the last week were rejected by zen's RBL). So if I used invaluement, it would probably only be for a handful of callouts per day, but I would be paying the same amount as someone who was using it to do many tens of thousands of callouts per day.

If you used the invaluement lists, you would not be doing *any* callouts per day. The list is provided via rsync; you serve it from your own DNS servers to your MXes. You rsync the entire list every few minutes. Thus all sites, 10 users or 10 million users, use the same amount of invaluement's resources to acquire the list. This is not what you are paying for. You're paying for the time and effort that the maintainer has put into making this list so good. How else can such a system offer a fair payment structure, if not by basing it on the number of users who benefit at each site?

Sure, $20 a month is not a lot of money, but I average only about 1000 checks of zen per week. Assuming I made exactly as many checks to invaluement as to zen, that would be costing me about 1/2 a cent per check, if not more.
Most people would value this in terms of the time they save by not dealing with the spam, or in a larger organization, the reduced calls to tech support about spam plus the time the employees save by not getting the spam. You might also find that there is great value in the reduced load on your content scanners, because the invaluement list can cut the traffic making it to these resource-hungry systems quite dramatically. The list has saved my organization many times its cost simply by reducing the number of content filtering nodes we have to run, or in other words, allowing us to support more paying customers on less hardware. Everyone is entitled to their opinion, but for us the invaluement RBL is a no-brainer.

Sorry to sound like an advert here. Practically all these same reasons are used to justify the purchase of a Zen rsync feed when you outgrow their free level of service. That will cost you quite a bit more, but it is still generally worth it in terms of support and hardware savings.

-- Don't congratulate yourself too much, or berate yourself either. Your choices are half chance; so are everybody else's.
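Whether the zone is queried remotely or served from a locally rsynced copy, a DNSBL check is an ordinary DNS lookup on a name built from the client address. The sketch below shows only how that name is formed for IPv4; the function name is made up for this example, and the actual resolution step (an A record coming back means "listed") is deliberately left out.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name that is looked up to test an IPv4 address
    against a DNSBL zone: reversed octets followed by the zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone.strip('.')}"


print(dnsbl_query_name("192.0.2.1", "zen.spamhaus.org"))
# 1.2.0.192.zen.spamhaus.org
```

Since every check is one such lookup, the number of lookups depends on how many connections survive earlier tests such as HELO/EHLO restrictions, which is the point being argued in both directions above.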
Do I need to do anything to maintain MySQL?
Every so often, I see some large MySQL accesses taking place from SA. Is there any regular maintenance needed or should I just leave it alone?

-- Time flies like the wind. Fruit flies like a banana. Stranger things have happened but none stranger than this. Does your driver's license say Organ Donor? Black holes are where God divided by zero. Listen to me! We are all individuals! What if this weren't a hypothetical question? steveo at syslang.net