RE: AWL q?
On Thu 27 Aug 2009 12:08:47 AM CEST, Gary Smith wrote:

>> I don't let that junk get past envelope stage:
>>
>> postmap -q weekendhotdeals.info mysql:/usr/local/etc/postfix/mysql-from_senders_rhsbl.cf
>> 554 RHSBL_DOMAIN

Post the mysql map (without the password, of course) if you want to share it. I believe what you do is keep AWL data in SQL and use it from a postfix mysql map? I had this in mind too, but never found a good, stable query for it.

> I assume you are running some type of background process that generates the list of senders based upon some criteria. Can you share more.

IMHO it's just standard AWL with an SQL backend, used from postfix.

> I also use mysql lookups for postfix (though I'm in the process of converting them to memcache for some of the larger ones (with a preloader) so I can hit memcached first (then lookup to the database after if necessary). I'm also looking for better ways to deal with spam.

memcache is nice, but how do you use memcache data in postfix?

-- xpoint
RE: corpus ham/spam balance
On Thu 27 Aug 2009 02:34:16 AM CEST, Karsten Bräckelmann wrote:

> Also, I do agree with the post by RW. By lowering the auto-learn ham threshold you managed to get the ratio more sane. However, continuing to do so you won't really learn any ham, but spam only.

Not if nham is bigger than nspam. IMHO these counters tell you whether your threshold is good or bad, and they also show what to tweak to get more learning in bayes.

If a spam message scores 5.1, it's unsafe to learn it as spam, and if a ham message scores 4.9, it's unsafe to learn it as ham.

Is there also a "last learned" entry in sa-learn --dump magic? If it has been a very long time since the last learn, the threshold ranges are too big.

> Raising the threshold back to the default likely would be a good idea, and occasionally lower to get the effect you just observed: Get the ratio back to somewhat balanced.

There are a lot of ways to solve it, but since the settings are not hardcoded in SA, why not change them if it helps? IMHO this is the point: we all run the same source code, but that does not mean we all get the same ham and spam to fight against.

-- xpoint
RE: AWL q?
>>> postmap -q weekendhotdeals.info mysql:/usr/local/etc/postfix/mysql-from_senders_rhsbl.cf
>>> 554 RHSBL_DOMAIN
>
> post the mysql map

It's a two-field table, just like a postfix .map file, index + data:

1. rhsbl_domain
2. 554 RHSBL_DOMAIN

> , without password of course if you want to share it, but i belive what you do is that you have awl data in sql and use this from a postfix mysql map ?

Yes.

> i had this in mind also but newer found a good stable query to make it
>
>> I assume you are running some type of background process that generates the list of senders based upon some criteria. Can you share more.

When a domain gets hits from two RHSBL servers, the message is rejected and the domain is inserted into the mysql table, reducing repetitive queries to the RHSBL servers for the same domains. The mysql table is approaching 28K rows. On weekdays, that table produces about 25K rejects/day, which is separate from the rejects that come from queries to the RHSBL servers.

Len
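The caching scheme Len describes could be sketched roughly as follows. This is only an illustration, not Len's actual setup: the function name, the in-memory dict standing in for the mysql table, and the lookup callables standing in for RHSBL queries are all assumptions.

```python
from typing import Callable, Dict, Iterable, Optional

# The action text is the one the postmap lookup in this thread returns.
REJECT_ACTION = "554 RHSBL_DOMAIN"


def check_sender_domain(
    domain: str,
    cache: Dict[str, str],
    rhsbl_lookups: Iterable[Callable[[str], bool]],
    min_hits: int = 2,
) -> Optional[str]:
    """Return a reject action for a listed domain, or None to let it pass.

    `cache` is a hypothetical stand-in for the two-field table
    (index = domain, data = action). `rhsbl_lookups` are hypothetical
    callables that return True when a given RHSBL server lists the domain.
    """
    key = domain.lower()
    if key in cache:
        # Already known: answer locally, no RHSBL queries needed.
        return cache[key]
    hits = sum(1 for lookup in rhsbl_lookups if lookup(key))
    if hits >= min_hits:
        # Listed on enough servers: remember it for the next message.
        cache[key] = REJECT_ACTION
        return REJECT_ACTION
    return None
```

The point of the design is that a domain costs RHSBL queries only the first time it is seen; every later message from it is answered from the local table.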
Obfuscation Question
When I send a test message for my broadcast email I am receiving 0.6 HTML_OBFUSCATE_05_10 BODY: Message is 5% to 10% HTML obfuscation in the spam score. It is a pretty basic email message with a few hyperlinks and a numbered list. Can you explain what may be causing this spam score. Lori Willard Irish Online Help Desk University of Notre Dame, Alumni Association Email: onlineh...@alumni.nd.edu Website: http://alumni.nd.edu Phone: (574) 631-1579
Re: SA: lottery message scored hammy by bayes
Apparently I am not sure if bayes is autolearning I am on a shared host service (midphase) which uses cPanel and has exim do the spamassassin stuff. They use my scores but ignore other commands. When I get a message I think I shouldn't have I save it and run spamc m .out inorder to see the X-Spam-report (which is Not included in ham !) My userprefs is always available at http:/www.Real-World-Systems.com/mail/user_prefs.html I have not manually trained bayes. Thanks John Hardin wrote: On Tue, 25 Aug 2009, Dennis German wrote: email with this content: CONGRATULATION ... received these scores X-Spam-testscores: BAYES_00=-2.599,HTML_MESSAGE=0.001,MISSING_HEADERS=5.7, SUBJ_ALL_CAPS=3.1,UPPERCASE_75_100=1.528 Does this indicate that bayes needs tuning/learning? Can you paste the output from sa-learn --dump magic ? It probably indicates that Bayes has been mistrained - somebody is training spammy messages as ham. How do you do your Bayes training? Autolearning, or purely manual, or some combination? How many messages are getting inappropriate Bayes scores? If a lot are, you'll probably want to turn off autolearning (if you're using it) until you analyze the problem. You may need to wipe your Bayes database and start fresh if the problem is bad enough. If you're using autolearning, what are your learning thresholds? If you're manually training, do you keep your corpora so that you can review and correct errors? If so, review your ham corpora and see if any spams have crept in - and if so, retrain them as spam, SA will forget that they were hammy.
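For anyone comparing the nspam and nham counters that sa-learn --dump magic prints, a small parser along these lines may help. It is a sketch only: the sample text imitates the usual column layout of that output, the counts are made up, and the function name is hypothetical.

```python
from typing import Dict

# Illustrative sample only; the counts are invented for this example.
SAMPLE_DUMP = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0       1200          0  non-token data: nspam
0.000          0        300          0  non-token data: nham
"""


def parse_magic(dump_text: str) -> Dict[str, int]:
    """Map each 'non-token data' name to the integer in the third column."""
    counters: Dict[str, int] = {}
    for line in dump_text.splitlines():
        if "non-token data:" not in line:
            continue
        columns, name = line.split("non-token data:", 1)
        fields = columns.split()
        if len(fields) >= 3:
            counters[name.strip()] = int(fields[2])
    return counters


counters = parse_magic(SAMPLE_DUMP)
print("nspam:", counters["nspam"], "nham:", counters["nham"])
```

Watching how these two numbers move over time says more about autolearning than a single snapshot does.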
Re: Training spam as ham and forwarding
On Wed, 26 Aug 2009 22:44:50 -0400 MySQL Student mysqlstud...@gmail.com wrote: Hi SA users, I have a few messages found in the quarantine that I need to train as ham because they were marked as spam incorrectly. To do this, I added the following to the top of the file so it becomes a normal email: From DUMMY-LINE Thu Jan 1 00:00:00 1970 This turns it into a *mailbox* in mbox format, which is one of the formats that pine supports along with its preferred mbx format. sa-learn would be happy with a raw email, but if you use the above, you should use --mbox.
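The effect of adding that first line can be sketched in a few lines of Python. This is an illustration under assumptions: the function name is made up, and escaping body lines that begin with "From " is the common mbox convention rather than something stated in this thread.

```python
# The separator line is the one quoted in the message above.
DUMMY_FROM_LINE = "From DUMMY-LINE Thu Jan 1 00:00:00 1970"


def wrap_as_mbox(raw_message: str) -> str:
    """Turn one raw email into a single-message mbox string."""
    lines = raw_message.splitlines()
    # Lines starting with "From " would look like a new message separator,
    # so they are prefixed with ">" (an assumption about the mbox flavour).
    escaped = [">" + line if line.startswith("From ") else line for line in lines]
    # A blank line terminates each message in an mbox file.
    return DUMMY_FROM_LINE + "\n" + "\n".join(escaped) + "\n\n"
```

A file written this way is what you would then hand to sa-learn together with --mbox, as described above.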
Re: Obfuscation Question
On Wed 26 Aug 2009 05:30:31 PM CEST, Irish Online Help Desk wrote When I send a test message for my broadcast email I am receiving 0.6 HTML_OBFUSCATE_05_10 BODY: Message is 5% to 10% HTML obfuscation in the spam score. It is a pretty basic email message with a few hyperlinks and a numbered list. Can you explain what may be causing this spam score. try disable html in your mail client -- xpoint
Date parsing
Hi, I received an email with a date header like this: Date: 27 Aug 09 13:50:20 0100 That header triggered the following rule: 1.7 INVALID_DATE Invalid Date: header (not RFC 2822) That's fair enough, but then a second rule was incorrectly triggered: 2.3 DATE_IN_PAST_96_XX Date: is 96 hours or more before Received: date Although the date header was badly formatted, it wasn't actually incorrect as far as when the message was sent. I don't think the DATE_IN_PAST rules should fire if the date isn't valid in the first place... -- Mike Cardwell - IT Consultant and LAMP developer Cardwell IT Ltd. (UK Reg'd Company #06920226) http://cardwellit.com/
RE: corpus ham/spam balance
On Thu, 2009-08-27 at 11:25 +0200, Benny Pedersen wrote: On Thu 27 Aug 2009 02:34:16 AM CEST, Karsten Bräckelmann wrote Also, I do agree with the post by RW. By lowering the auto-learn ham threshold you managed to get the ratio more sane. However, continuing to do so you won't really learn any ham, but spam only. not if nham is bigger then nspam, this counters say if your thrshold is good or bad imho, and it also show what to tweek to get more learning in bayes Benny, what the heck are you talking about? Have a look at the OP again, specifically the numbers and how they changed over time and conf changes. That's what *we* are talking about. And no, nham nspam is irrelevant on its own. The important part is, how it developed. if a spam mas scores 5.1 its unsafe to learn as spam, and if a ham msgs scores 4.9 its unsafe to learn as ham This is as wrong as it can get, if you are talking about manual training. And with auto-learning, this won't happen anyway. -- char *t=\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4; main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;il;i++){ i%8? c=1: (c=*++x); c128 (s+=h); if (!(h=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
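Why scores of 4.9 or 5.1 never get auto-learned can be shown with a simplified sketch. The default values used here are SpamAssassin's documented defaults for bayes_auto_learn_threshold_nonspam (0.1) and bayes_auto_learn_threshold_spam (12.0); the function itself, and the exact boundary comparisons, are assumptions, and the real plugin applies further conditions that are left out.

```python
def autolearn_decision(
    score: float,
    ham_threshold: float = 0.1,
    spam_threshold: float = 12.0,
) -> str:
    """Simplified auto-learn decision: 'ham', 'spam', or 'no'."""
    if score < ham_threshold:
        return "ham"
    if score >= spam_threshold:
        return "spam"
    # Everything between the two thresholds is left unlearned.
    return "no"


for score in (-1.0, 4.9, 5.1, 15.0):
    print(score, autolearn_decision(score))
```

Both borderline scores fall in the wide gap between the thresholds, so neither is learned automatically.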
Re: Date parsing
On 27-Aug-2009, at 06:59, Mike Cardwell wrote: Date: 27 Aug 09 13:50:20 0100 Although the date header was badly formatted, it wasn't actually incorrect as far as when the message was sent. I don't think the DATE_IN_PAST rules should fire if the date isn't valid in the first place… I dunno, the message is tagged as being 2,000 years old. That does qualify as more than 96 hours. -- I draw the line at 7 unreturned phone calls.
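The two readings of the year "09" can be compared with a short sketch. The normalisation follows the obsolete-year rule in RFC 2822 section 4.3; the helper names and the Received time are hypothetical, and time zones are ignored for simplicity.

```python
from datetime import datetime


def normalize_obsolete_year(year: int) -> int:
    """Apply the RFC 2822 section 4.3 rule for two- and three-digit years."""
    if 0 <= year <= 49:
        return 2000 + year
    if 50 <= year <= 999:
        return 1900 + year
    return year


def hours_before(date_year: int, received: datetime) -> float:
    """Hours between the Date: header (with the given year) and Received."""
    sent = datetime(date_year, 8, 27, 13, 50, 20)
    return (received - sent).total_seconds() / 3600


# Hypothetical Received time, shortly after the Date: header in the thread.
received = datetime(2009, 8, 27, 14, 0, 0)

literal = hours_before(9, received)
normalized = hours_before(normalize_obsolete_year(9), received)
print(literal > 96, normalized > 96)
```

Read literally, year 9 puts the message about two thousand years in the past; read with the obsolete-year rule, it was sent minutes before it was received.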
Re: corpus ham/spam balance
LuKreme wrote: On 26-Aug-2009, at 10:53, Kris Deugau wrote: If you're running a sitewide AWL on any kind of scale beyond a few tens of domains, and a couple hundred accounts, you should probably look at putting it in SQL - it's a *lot* easier to maintain there. Is there a good writeup on doing this? I don't remember where I found the references I used (and I can't seem to find any of them again :/ ), but I just updated one of the wiki pages referring to the config changes necessary for SQL-based AWL with some comments about the changes I've made to tune things here. http://wiki.apache.org/spamassassin/BetterDocumentation/SqlReadmeAwl -kgd
forwarding emails will sa learn?
Hello, I got an email that was not tagged high enough as spam, it was a 3.063 when needed was 3.5, so it got through, it is rare. I notice that spam messages go to username+s...@domain.com can i teach SA that the message is spam by forwarding the message to that email will it learn for the future? So far i've just increased the value of the three sa tests that got it. Thanks. Dave.
Google/Yahoo Spam
Hi all,

I'm seeing an increase in Google Reader and Yahoo groups/personals/profile spam. Here's an example of the Google Reader spam:

http://pastebin.com/m1021fc5f

Any ideas on how to catch this one?

For the Yahoo spam (with links to Yahoo sites ending in '/1'), I've created these:

    uri       LOC_YAHOO1  m{http://groups\.yahoo\.com\/}i
    score     LOC_YAHOO1  0 1.5 0 1.5
    describe  LOC_YAHOO1  Contains groups.yahoo.com uri

    uri       LOC_YAHOO2  m{http://profile\.yahoo\.com\/}i
    score     LOC_YAHOO2  0 1.5 0 1.5
    describe  LOC_YAHOO2  Raw body contains profile.yahoo

    uri       LOC_YAHOO3  m{http://personals\.yahoo\.com\/}i
    score     LOC_YAHOO3  0 1.5 0 1.5
    describe  LOC_YAHOO3  Raw body contains personals.yahoo

They're somewhat pared down because I'm not very good at pattern matching, so I thought someone could improve on this?

Thanks,
Alex
RE: AWL q?
> memcache is nice, but how do you use memcache data in postfix ?

There is a patch for memcached and postfix (http://www.aurore.net/projects/postfix_memcached/). The problem, which is what I'm working on, is how to populate it: the patch only gives you the mechanism for using memcached.

So my intent, when I have time over the next couple of weeks, is to work on an app that will populate it (add/update) from a key-pair stream (so I can populate it with whatever data I want) and just run it from crontab. The problem is that my C is pretty rusty, and the scripting languages use a different data format for memcached than the C API does. But the theory is sound, and for something like having AWL integrated into postfix this would be an ideal way to handle it, as it's fast and can be modified externally.

With that said, I spent some time last night thinking of a better implementation of what Len had mentioned. I don't want to block singleton email addresses, as most of the emails are coming from random IPs, so it defeats the purpose. Instead, I was thinking of creating a table that houses the domain and IP with an aggregate score (based on some algorithm yet to be defined) and using that for the quick lookup from postfix. If a domain has passed in a couple of spams from a single IP, this could be a fluke, but if it is passing hundreds, it's obviously not.

Anyway, if you have any ideas on populating memcached, and have C experience and some time, you might want to run with the idea as well and share some code.
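The domain-plus-IP table could be prototyped along the following lines before any C is written. Everything here is an assumption made for illustration: the class and method names, the use of a plain spam count as the "aggregate score", the threshold of 100, and the "domain|ip" key format for the key/value stream.

```python
from collections import Counter
from typing import Iterator, Tuple


class DomainIpScores:
    """Count spam per (domain, IP) pair and flag pairs above a threshold."""

    def __init__(self, reject_threshold: int = 100) -> None:
        # Hypothetical cut-off; the thread leaves the real algorithm open.
        self.reject_threshold = reject_threshold
        self.counts: Counter = Counter()

    def record_spam(self, domain: str, ip: str) -> None:
        self.counts[(domain.lower(), ip)] += 1

    def should_reject(self, domain: str, ip: str) -> bool:
        # A couple of spams may be a fluke; only sustained volume rejects.
        return self.counts[(domain.lower(), ip)] >= self.reject_threshold

    def key_value_pairs(self) -> Iterator[Tuple[str, str]]:
        """Yield a key/value stream that a cache loader could consume."""
        for (domain, ip), count in sorted(self.counts.items()):
            yield f"{domain}|{ip}", str(count)
```

Keeping the counting separate from the key/value export means the scoring rule can change later without touching whatever loads the cache.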
Converting spam to email message
Hi all, I thought I understood, but I'm still having trouble converting a message in the quarantine back into a normal email message that I can forward on to a recipient. Does anyone know how to do this? Thanks so much. Best regards, Alex
Re: Converting spam to email message
At 10:39 AM 8/27/2009, you wrote:

> Hi all, I thought I understood, but I'm still having trouble converting a message in the quarantine back into a normal email message that I can forward on to a recipient. Does anyone know how to do this?

Maybe I missed something, but SpamAssassin doesn't have a quarantine.

http://wiki.apache.org/spamassassin/SpamQuarantine

    SpamAssassin (http://wiki.apache.org/spamassassin/SpamAssassin) itself just tags the messages it scans as spam or nonspam; if you want to quarantine spam, you will need an add-on product that does this. Here's a list (please expand it if you know of more!)

So perhaps mention which quarantine you use or, if no one here can help, look for a mailing list for the quarantine you use.
Re: Google/Yahoo Spam
On Thu, 2009-08-27 at 12:38 -0400, MySQL Student wrote:

> I'm seeing an increase in Google Reader and yahoo groups/personals/profile spam.
> Any ideas on how to catch this one?
> For the Yahoo spam (with links to yahoo sites ending in '/1', I've created these:

This should catch your set and more:

    uri       LOC_YAHOO  /^http:.{1,40}\.yahoo[.,]com/i
    score     LOC_YAHOO  0 1.5 0 1.5
    describe  LOC_YAHOO  Contains *.yahoo.com uri

Or, if you want to be more specific, try this:

    uri       LOC_YAHOO  /^http:\/\/(groups|profile|personals)\.yahoo[.,]com/i
    score     LOC_YAHOO  0 1.5 0 1.5
    describe  LOC_YAHOO  Contains yahoo.com groups/profile/personals uri

Martin
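Before deploying either pattern, it can be handy to try it against a few URLs outside of SpamAssassin. The sketch below ports the two regular expressions to Python's re module, which behaves the same as Perl for these constructs; the sample URLs are made up.

```python
import re

# Python ports of the two patterns above (delimiters and /i flag translated).
BROAD = re.compile(r"^http:.{1,40}\.yahoo[.,]com", re.IGNORECASE)
SPECIFIC = re.compile(
    r"^http://(groups|profile|personals)\.yahoo[.,]com", re.IGNORECASE
)

# Hypothetical test URLs.
samples = [
    "http://groups.yahoo.com/group/example/1",
    "http://PROFILE.yahoo.com/example/1",
    "http://mail.yahoo.com/",
]

for url in samples:
    print(url, bool(BROAD.search(url)), bool(SPECIFIC.search(url)))
```

The broad pattern hits any yahoo.com subdomain, including ordinary ones such as mail.yahoo.com, so the specific variant is the safer choice if false positives are a concern.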
Re: Converting spam to email message
Hi, I thought I understood, but I'm still having trouble converting a message in the quarantine back into a normal email message that I can forward on to a recipient. Does anyone know how to do this? Maybe I missed something, but SpamAssassin doesn't have a quarantine. http://wiki.apache.org/spamassassin/SpamQuarantine Yes, my apologies. I guess it would then be amavisd-new that's managing the quarantine. I didn't realize that amavisd manipulated the mail in that way. Hopefully someone can still help. Thanks, Alex
Re: forwarding emails will sa learn?
On 27-Aug-2009, at 10:22, Dave wrote: I got an email that was not tagged high enough as spam, it was a 3.063 when needed was 3.5, so it got through, it is rare. I notice that spam messages go to username+s...@domain.com can i teach SA that the message is spam by forwarding the message to that email will it learn for the future? That depends on your system setting. So far i've just increased the value of the three sa tests that got it. This is almost certainly a very bad idea. -- Imagine all the people Sharing all the world
Re: Converting spam to email message
SpamAssassin does not handle mail. SpamAssassin analyzes a message and returns a score/report to whatever asked for the analysis. That is all. Other products do things with mail - store/reject/accept/deliver, etc. - and some of those products use a SpamAssassin score as part of the basis for choosing what action they take. You are more likely to get better assistance by asking your question in a forum that is devoted to the product which actually does the function you want assistance with. MySQL Student mysqlstud...@gmail.com 08/27/09 2:54 PM Hi, I thought I understood, but I'm still having trouble converting a message in the quarantine back into a normal email message that I can forward on to a recipient. Does anyone know how to do this? Maybe I missed something, but SpamAssassin doesn't have a quarantine. http://wiki.apache.org/spamassassin/SpamQuarantine Yes, my apologies. I guess it would then be amavisd-new that's managing the quarantine. I didn't realize that amavisd manipulated the mail in that way. Hopefully someone can still help. Thanks, Alex
razor/spamcop report question
Hello,

I'm using the amavisd-new/spamassassin 3.2.5/clamav combo on some servers (FreeBSD, Mac OS X Server). I would like spamassassin to report spam using the razor and spamcop services.

In /usr/local/etc/mail/spamassassin/v310.pre (FreeBSD), I have this:

    loadplugin Mail::SpamAssassin::Plugin::Razor2
    loadplugin Mail::SpamAssassin::Plugin::SpamCop
    spamcop_to_address submit.mysubmitaddress@spam.spamcop.net

1- How do I know that spamcop gets reports from SpamAssassin?

2- I don't understand why Razor does not work. I run:

    # su vscan -c 'spamassassin -r /tmp/spam'

and it returns:

    [28395] warn: reporter: razor2 report failed: No such file or directory report requires authentication at /usr/local/lib/perl5/site_perl/5.8.9/Mail/SpamAssassin/Plugin/Razor2.pm line 178. at /usr/local/lib/perl5/site_perl/5.8.9/Mail/SpamAssassin/Plugin/Razor2.pm line 326.
    1 message(s) examined.

- razor complains about auth. But I'm using Razor version 2.84, which is supposed to provide the credentials automatically (since 2.74 iirc).

- spamcop sends me an email:

    SpamCop is now ready to process your spam. Use links to finish spam reporting (members use cookie-login please!): http://www.spamcop.net/sc?id=z3261...

And I only get an email like this one when I'm running `su vscan -c 'spamassassin -r /tmp/spam'`. During normal operations, I don't get any email from Spamcop asking me to finish a spam report.

Am I missing something?

regards,
patpro
Re: Converting spam to email message
Mr. Student,

On 8/27/2009 11:54 AM, MySQL Student wrote:

> Hi, I thought I understood, but I'm still having trouble converting a message in the quarantine back into a normal email message that I can forward on to a recipient. Does anyone know how to do this?

Probably best answered on the amavis users list. Still...

Use amavisd-release, which comes with amavis. Adjust the AM.PDP-SOCK policy_bank as necessary for your environment. See the comments at the top of the program.

Interesting settings examples:

    # $release_format = 'resend';  # 'attach', 'plain', 'resend'
    # $report_format = 'arf';      # 'attach', 'plain', 'resend', 'arf'

    # Use with amavis-release over a socket or with Petr Rehor's amavis-milter.c
    # only applies with $unix_socketname
    $interface_policy{'SOCK'} = 'AM.PDP-SOCK';

    $policy_bank{'AM.PDP-SOCK'} = {
      protocol => 'AM.PDP',
      # do not require secret_id for amavisd-release
      auth_required_release => 0,
    };

>> Maybe I missed something, but SpamAssassin doesn't have a quarantine. http://wiki.apache.org/spamassassin/SpamQuarantine
>
> Yes, my apologies. I guess it would then be amavisd-new that's managing the quarantine. I didn't realize that amavisd manipulated the mail in that way. Hopefully someone can still help. Thanks, Alex

-- Mike
Writing spamassassin rules
I'm sure I'm missing the obvious, but I can't seem to find a guide to writing spamassassin rules on the spamassassin web page. I'd like to write some custom rules, and some documentation would be really handy. Anybody got that URL handy?

Thanks much...

...Kevin
--
Kevin Miller                Registered Linux User No: 307357
CBJ MIS Dept.               Network Systems Admin., Mail Admin.
155 South Seward Street     ph: (907) 586-0242
Juneau, Alaska 99801        fax: (907) 586-4500
Re: Writing spamassassin rules
At 12:46 PM 8/27/2009, you wrote: I'm sure I'm missing the obvious, but I can't seem to find a guide to writing spamassassin rules on the spamassassin web page. I'd like to write some custom rules, and some documentation would be really handy. Anybody got that URL handy? I'm guessing this is still valid... :) http://wiki.apache.org/spamassassin/WritingRules (Never written any of my own, so...) Evan
Re: Writing spamassassin rules
Kevin Miller wrote: I'm sure I'm missing the obvious, but I can't seem to find a guide to writing spamassassin rules on the spamassassin web page. I'd like to write some custom rules, and some documentation would be really handy. Anybody got that URL handy? http://wiki.apache.org/spamassassin/WritingRules?highlight=%28rule%29|%28writing%29
RE: Writing spamassassin rules
Evan Platt wrote:

> At 12:46 PM 8/27/2009, you wrote:
>> I'm sure I'm missing the obvious, but I can't seem to find a guide to writing spamassassin rules on the spamassassin web page. I'd like to write some custom rules, and some documentation would be really handy. Anybody got that URL handy?
>
> I'm guessing this is still valid... :) http://wiki.apache.org/spamassassin/WritingRules (Never written any of my own, so...) Evan

Excellent - thanks much!

...Kevin
--
Kevin Miller                Registered Linux User No: 307357
CBJ MIS Dept.               Network Systems Admin., Mail Admin.
155 South Seward Street     ph: (907) 586-0242
Juneau, Alaska 99801        fax: (907) 586-4500
Re: Obfuscation Question
Not subscribed. You are missing the on-list replies. Well, if any useful, given that post... On Wed, 2009-08-26 at 11:30 -0400, Irish Online Help Desk wrote: When I send a test message for my broadcast email I am receiving “0.6 HTML_OBFUSCATE_05_10 BODY: Message is 5% to 10% HTML obfuscation” in the spam score. It is a pretty basic email message with a few hyperlinks and a numbered list. Can you explain what may be causing this spam score. Why do you care? Some observations... 50_scores.cf: score HTML_OBFUSCATE_05_10 0.638 0.572 0.000 0.001 So you are either using score-set 0 (neither Bayes nor network tests) or score-set 1 (with network tests). Since the latter is irrelevant in your pre-send tests, I'll assume 0. Anyway, that's 0.6 of 5.0 required (by default). Or, in other words, 12% of being marked as spam. Not above the threshold of 5.0, thus no spam. If you *really* want more reasoning why this might come across with a footprint that translates to with about 12% confidence spam, but no spam, then I recommend studying the sources. You got them already. -- char *t=\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4; main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;il;i++){ i%8? c=1: (c=*++x); c128 (s+=h); if (!(h=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
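The score-set reasoning and the 12% figure above can be reproduced with a short sketch. The four values are the ones quoted from 50_scores.cf, and the ordering of the score sets (0 = no Bayes, no network tests; 1 = network only; 2 = Bayes only; 3 = both) follows SpamAssassin's documented convention; the function names are hypothetical.

```python
# The four per-score-set values quoted from 50_scores.cf above.
HTML_OBFUSCATE_05_10 = (0.638, 0.572, 0.000, 0.001)


def score_set_index(use_bayes: bool, use_net: bool) -> int:
    """Pick which of the four scores applies for a given configuration."""
    return (2 if use_bayes else 0) + (1 if use_net else 0)


def fraction_of_threshold(score: float, required: float = 5.0) -> float:
    """How far a single rule's score goes towards the spam threshold."""
    return score / required


index = score_set_index(use_bayes=False, use_net=False)
print(HTML_OBFUSCATE_05_10[index])
print(f"{fraction_of_threshold(0.6):.0%}")
```

Both 0.638 and 0.572 round to the reported 0.6, which is why the report alone cannot distinguish score-set 0 from score-set 1.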
Re: Your message to the Irish Online Help Desk Re: Obfuscation Question
See, this is one of the reasons why I prefer NOT to moderate through posts by non-subscribers. I am *seriously* trying hard not to use any words that are inappropriate for a public list. Funnily enough, I can't even begin to explain how I feel about trying to help you and getting that bloody reply -- without going all ranty and flamy. Dear Irish Online Help Desk, before you even consider replying on- list, DO subscribe with an address that does not bloody auto-respond. Personal apologies accepted, personal help off-list sure as hell won't happen. Thank you for understanding even the tiniest bit of netiquette and how volunteer driven projects work. Full quote below for the amusement of the general public. On Thu, 2009-08-27 at 18:19 -0700, Irish Online Help Desk wrote: Thank you for contacting the Irish Online Help Desk. Your message has been received and we will respond to you as quickly as possible. * * * * * If you are having problems with Irish Online, you may review our Frequently Asked Questions at http://alumni.nd.edu/iofaq. When requesting assistance be sure to provide the following information: • Full Name • Class Year • Current Email address • 10-digit Notre Dame ID (see Notre Dame Magazine or other ND correspondence) or last 4-digits of SSN Thank you. We will respond to you as quickly as possible. University of Notre Dame, Alumni Association Irish Online Help Desk onlineh...@alumni.nd.edu 574-631-1579 -- char *t=\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4; main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;il;i++){ i%8? c=1: (c=*++x); c128 (s+=h); if (!(h=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
RE: Your message to the Irish Online Help Desk Re: ObfuscationQuestion
-Original Message- From: Karsten Bräckelmann [mailto:guent...@rudersport.de] Sent: Friday, 28 August 2009 1:34 p.m. To: Irish Online Help Desk Cc: users@spamassassin.apache.org Subject: Re: Your message to the Irish Online Help Desk Re: ObfuscationQuestion See, this is one of the reasons why I prefer NOT to moderate through posts by non-subscribers. Then why do it? If it causes you frustration, is the time worthwhile?. Surely readers of this list aren't expecting anyone to develop an Aneurysm from dealing with non-subscribers to the list.. Cheers, Mike
Re: Obfuscation Question
Irish Online Help Desk wrote: When I send a test message for my broadcast email I am receiving “0.6 HTML_OBFUSCATE_05_10 BODY: Message is 5% to 10% HTML obfuscation” in the spam score. It is a pretty basic email message with a few hyperlinks and a numbered list. Can you explain what may be causing this spam score. Well, at 0.6 points, it's not really anything to worry about. Nobody (at least nobody with more than 2 braincells) should be tagging or discarding email at such a low score level. As for the rule, it's generally going to be looking for abuse of tables, etc to obscure what the user-perceived text of a message is. (ie: writing a message by populating columns vertical-first in a table), etc. If you're really worried, you might want to look at the raw message source and see if the innocent looking text has a lot of really weird html layout in it. However, with such a low obfuscation ratio, and such a small score.. I'd really not worry about it.