Re: More text/plain questions

2014-07-24 Thread Amir 'CG' Caspi
On 2014-07-24 16:11, Philip Prindeville wrote: You might have a shorter wait if you move to CentOS 6.5 instead. I would, but the VPS software I'm using does not run on CentOS 6.x, only 5.x. It's rather old software and I should convert to something else, but it's not worth the time I don't

Re: More text/plain questions

2014-07-23 Thread Amir 'CG' Caspi
On 2014-07-02 15:04, Amir Caspi wrote: For what it's worth, I just received a spam that basically is the same as what Philip complained about. I've posted a spample here: http://pastebin.com/Y2YGwL49 [...] I'm wondering if we shouldn't write a rule looking for lots of #x0[0-9]{3};
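A minimal sketch of the counting idea raised in this message, written in Python rather than as a SpamAssassin rule. It assumes the pattern refers to hex character references such as "&#x0041;"; the leading "&" is an assumption, since the archived text shows only "#x0[0-9]{3};". The function names and the threshold of 20 are hypothetical.

```python
import re

# Assumption: the thread's pattern targets hex character references like
# "&#x0041;". The leading "&" is a guess; the archive shows "#x0[0-9]{3};".
ENCODED_CHAR = re.compile(r"&#x0[0-9]{3};")


def count_encoded_chars(body: str) -> int:
    """Return how many matching character references appear in body."""
    return len(ENCODED_CHAR.findall(body))


def looks_entity_stuffed(body: str, threshold: int = 20) -> bool:
    """Hypothetical heuristic: flag bodies with many encoded characters."""
    return count_encoded_chars(body) >= threshold
```

Note that [0-9] only covers references whose remaining digits are decimal; references containing hex letters (a-f) would need a wider class.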

Re: More text/plain questions

2014-07-23 Thread Amir 'CG' Caspi
On 2014-07-23 12:23, Paul Stead wrote: I've also implemented several rules to try and catch these types of emails. Care to share? Counting encoded chars is easy, of course. One thing to note, webmail and my MUA often will render the encoded characters in their translated format, not

Re: More text/plain questions

2014-07-23 Thread Amir 'CG' Caspi
On 2014-07-23 13:14, Axb wrote: doesn't your VPS offer you shell access? if yes, uninstall the SA rpm stuff and install SA 3.4 from source/trunk. I think I didn't explain properly. I'm running the dedicated server on which there is VPS software. I need RPMs so that they get distributed to

Re: More text/plain questions

2014-07-23 Thread Amir 'CG' Caspi
On 2014-07-23 13:38, Axb wrote: If you're using spamd, why not run a/multiple dedicated VMs for SA 3.4 and have your other VMs use the spamd on the SA VMs ? There is a dedicated spamd. It's the other tools that need to be distributed, like sa-learn. Bayes rules are handled per-user. (No,

Re: Increase in Image Spam

2014-02-20 Thread Amir 'CG' Caspi
On Thu, February 20, 2014 12:57 pm, John Hardin wrote: 0 messages examined generally means either the format isn't what sa-learn expected, or the message is larger than the size limit. The file format is most certainly MBOX... it was created by my MUA, and running file on it tells me that it is

Re: Increase in Image Spam

2014-02-20 Thread Amir 'CG' Caspi
On Thu, February 20, 2014 2:39 pm, Axb wrote: what's wrong with installing from source? I run a virtual-hosting server where the individual site RPMs are copied from server-level RPMs. Basically all software has to be installed as RPMs in order to propagate to the individual virtual hosts. ---

Re: Increase in Image Spam

2014-02-20 Thread Amir 'CG' Caspi
On Thu, February 20, 2014 2:49 pm, Benny Pedersen wrote: On 2014-02-20 22:39, Kevin A. McGrail wrote: --max-size= I believe. Default is 256K. sa-learn barfs, that flag is not accepted. That flag works for spamc, but not for sa-learn. sa-learn man page and CLI help don't have any mention of a

Re: Increase in Image Spam

2014-02-20 Thread Amir 'CG' Caspi
On Thu, February 20, 2014 3:16 pm, Kevin A. McGrail wrote: Are you using 3.4.0? I believe the size was hard-coded until then when the max-size option was added to sa-learn. No, as mentioned previously in this flurry of emails, I'm using 3.3.2. However, note that using spamassassin directly

Re: Increase in Image Spam

2014-02-20 Thread Amir 'CG' Caspi
On Thu, February 20, 2014 3:52 pm, Kevin A. McGrail wrote: Questions that will be answered by "that is solved in 3.4.0" aren't really going to get much support from me... Understood, though it'll be a while before I can upgrade to 3.4 due to the RPM issue that I've mentioned previously. However,

Re: Increase in Image Spam

2014-02-20 Thread Amir 'CG' Caspi
On Thu, February 20, 2014 4:08 pm, Kevin A. McGrail wrote: Probably best if you install 3.4.0 (or even trunk) on a test system and throw the offending email onto that server and run sa-learn on that box with -D. In the meantime, anyone want to do it on my behalf? =) I provided the mbox link

Re: Increase in Image Spam

2014-02-20 Thread Amir 'CG' Caspi
On Thu, February 20, 2014 5:13 pm, Kevin A. McGrail wrote: Resend the mbox.link and I will likely have a cycle to throw it through. https://www.dropbox.com/s/m4fuv670wnvwa16/SA_testspam.mbox To be deleted in 24-48 hours (don't want spammers harvesting it). If you have a chance, please run it

Bayes ID depends on mailbox format?

2014-02-05 Thread Amir 'CG' Caspi
Hi all, Occasionally, I will receive an FN that is autolearned as ham. Normally, I dump it into my spam folder and sa-learn that as spam, so it should be forgotten as ham and relearned as spam, and all is well with the world (except for actually getting the spam, of course). Today, I was

Re: Help with a regex to catch spam with gibberish html tags

2014-01-29 Thread Amir 'CG' Caspi
On Wed, January 29, 2014 11:34 am, John Hardin wrote: There is already a style gibberish rule. http://ruleqa.spamassassin.org/20140128-r1562007-n/STYLE_GIBBERISH/detail I haven't seen STYLE_GIBBERISH hitting on very much in the last month... some hits, but it's missing a bunch of stuff,

Re: Bayes and multipart messages

2014-01-09 Thread Amir 'CG' Caspi
On Thu, January 9, 2014 6:20 pm, Karsten Bräckelmann wrote: Even the most effective results I have ever seen on a non-personal attack is merely getting the Bayes classification to a neutral. And that was not a regular text token, but includes mail headers. And a biased Bayes database towards

Re: Bayes and multipart messages

2014-01-09 Thread Amir 'CG' Caspi
On Thu, January 9, 2014 9:46 pm, Karsten Bräckelmann wrote: Unfortunately, well, for the scumbags, the shorter it gets, the less likely it is to be understood. Fallen for. Or even understood to be actual language. Well, not really true, because of the rising resurgence of spammers using

Re: Email address in subject line

2013-12-28 Thread Amir 'CG' Caspi
On Mon, December 23, 2013 5:39 pm, Amir 'CG' Caspi wrote: I did check... unless I'm completely blind, I don't see it. I'm basically looking for something like: header EMAIL_IN_SUBJ Subject =~ /[A-Za-z0-9.-_+]+@(?:[A-Za-z0-9]+\.)+[A-Za-z]{3,5}/ i.e. something that will match a valid email
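A small Python sketch of how the posted pattern behaves, offered as an illustration only. One thing worth noting: inside the character class, ".-_" is read as a range from "." to "_", which also admits characters such as "<", ">", "=" and "@". The STRICT variant and the helper name below are hypothetical; they simply move the hyphen to the end of the class so it is literal.

```python
import re

# The pattern as posted. Inside the class, ".-_" is a range ('.' to '_'),
# so characters like "<", ">" and "@" are accepted in the local part.
POSTED = re.compile(r"[A-Za-z0-9.-_+]+@(?:[A-Za-z0-9]+\.)+[A-Za-z]{3,5}")

# Hypothetical variant: hyphen placed last so it is a literal character.
STRICT = re.compile(r"[A-Za-z0-9._+-]+@(?:[A-Za-z0-9]+\.)+[A-Za-z]{3,5}")


def address_in_subject(subject, pattern=STRICT):
    """Return the first address-like substring in subject, or None."""
    match = pattern.search(subject)
    return match.group(0) if match else None
```

With a subject such as "<user@example.com> cheap pills", the strict variant returns the bare address, while the posted pattern also swallows the leading angle bracket. Both still flag the subject, so the difference matters mainly when the matched text is reused. The {3,5} bound also means two-letter TLDs are not matched by either version.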

Re: Email address in subject line

2013-12-28 Thread Amir 'CG' Caspi
On Sat, December 28, 2013 7:57 pm, John Hardin wrote: And in case you actually did mean From: rather than recipient address... Sorry, no, I meant To: as you surmised. Unfortunately __SUBJ_HAS_TO_1 isn't performing well enough against the current masscheck corpora to be published. It's

Email address in subject line

2013-12-23 Thread Amir 'CG' Caspi
Hi all, Over the past couple of days I've been getting slammed with spams that have subjects of the form: Subject: u...@host.com spammy subject line That is, the recipient's email address is included explicitly in angular brackets at the beginning of the subject line. These are new and varied

Re: Email address in subject line

2013-12-23 Thread Amir 'CG' Caspi
On Mon, December 23, 2013 3:06 pm, Axb wrote: To save you time, grep the pattern you're looking for thru the rules directories. If it's not there check in SVN's trunk to see if such a rule indeed exists and if it's getting auto-promoted and what score it gets. I did check... unless I'm

Re: few words

2013-12-06 Thread Amir 'CG' Caspi
On Fri, December 6, 2013 1:23 pm, Marcio Humpris wrote: But how to make it catch just an email with ONLY 2 words in BODY? Not match empty message? I strongly recommend http://www.regular-expressions.info to learn how regexps work. If you want to catch _exactly_ two words then you'd do
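The reply above is cut off in the archive, so the following is not the author's regex. It is a hedged Python sketch of one way to match a body containing exactly two whitespace-separated words while rejecting an empty message. The names TWO_WORDS and is_exactly_two_words are hypothetical.

```python
import re

# Optional surrounding whitespace, then exactly two runs of
# non-whitespace separated by at least one whitespace character.
TWO_WORDS = re.compile(r"\s*\S+\s+\S+\s*")


def is_exactly_two_words(body: str) -> bool:
    """True only if body consists of exactly two words."""
    return TWO_WORDS.fullmatch(body) is not None
```

Because \S+ cannot cross whitespace, a third word has nowhere to fit, and an empty or one-word body fails the first or second \S+.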

Re: 225 spreadsheets in title

2013-12-06 Thread Amir 'CG' Caspi
On Fri, December 6, 2013 1:25 pm, Marcio Humpris wrote: how can I do something that catches in subject of an email 225 spreadsheets for download and variations? such as xx to xxx spreadsheets, header DL_SPREADSHEETS /^\d+ spreadsheets for download$/ Omit for download if you want. Omit the ^
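To illustrate the effect of the anchors discussed in this reply, here is a short Python sketch that applies the same pattern to a subject string. It assumes the pattern is meant to run against the Subject header; the LOOSE variant and the function name are hypothetical.

```python
import re

# The pattern from the reply: the whole subject must be
# "<digits> spreadsheets for download".
ANCHORED = re.compile(r"^\d+ spreadsheets for download$")

# Hypothetical looser variant: no anchors, no "for download",
# case-insensitive, so it matches anywhere in the subject.
LOOSE = re.compile(r"\d+ spreadsheets", re.IGNORECASE)


def subject_hits(subject: str, pattern=ANCHORED) -> bool:
    """True if pattern is found in the subject line."""
    return pattern.search(subject) is not None
```

The anchored form only hits when nothing else surrounds the phrase; dropping ^ and $ trades precision for coverage of the variations asked about.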

Re: Regular spam with distinctive URLs

2013-12-03 Thread Amir 'CG' Caspi
On Tue, December 3, 2013 4:18 pm, Geoff Soper wrote: Are other people also suffering from these? Yes. I think there are regexps that would work but I haven't had a chance to test any yet, to add them to my growing list of spammy template regexps. I'll try to work something up next week. ---

Re: spamc -L apparently not working properly

2013-11-08 Thread Amir 'CG' Caspi
On Fri, November 8, 2013 2:39 pm, Sergio Durigan Junior wrote: I don't think sa-learn can help with spamd. Its own manpage mentions that, for spamd users, spamc -L is the way to go. Hm, really? I thought spamd kept a global Bayes database, and that everyone calling spamc -L would end up

Re: spamc -L apparently not working properly

2013-11-08 Thread Amir 'CG' Caspi
On Fri, November 8, 2013 2:56 pm, Sergio Durigan Junior wrote: The problem with having a user-tailored database is that I will have to run sa-update for every user, right? No, or at least, not that I've seen. If spamd is running as root, it will load the sa-update rules from the root

Re: spamc -L apparently not working properly

2013-11-08 Thread Amir 'CG' Caspi
On Fri, November 8, 2013 3:24 pm, Karsten Bräckelmann wrote: The latter is incorrect -- spamc by default sends the effective user ID, and spamd switches users before processing the mail (assuming the daemon has been started as root). The -u user option is only necessary to change that default.

Re: LONGWORDS not hitting?

2013-08-24 Thread Amir 'CG' Caspi
are below. Thanks. --- Amir At 2:10 PM -0600 08/10/2013, Amir 'CG' Caspi wrote: At 12:42 PM -0600 06/30/2013, Amir 'CG' Caspi wrote: Hi all, Just got this spam: http://pastebin.com/KM5paaZ9 To me, it looks like LONGWORDS should have hit

Re: ADDRESS_IN_SUBJECT et al

2013-08-24 Thread Amir 'CG' Caspi
At 3:39 PM -0600 07/31/2013, Amir 'CG' Caspi wrote: At 3:23 AM +0200 07/25/2013, Karsten Bräckelmann wrote: header LOCALPART_IN_SUBJECT eval:check_for_to_in_subject('user') And all of them do hit that rule. A super-set of the ADDRESS variant, using the local part instead of the complete

Re: LONGWORDS not hitting?

2013-08-24 Thread Amir 'CG' Caspi
At 1:43 PM +0100 08/24/2013, RW wrote: LONGWORDS is a body rule, i.e. it runs on a normalized version of the Gah, THAT'S why it wasn't working? I feel like an idiot now. =P --- Amir

Re: New spam rule for specific content

2013-08-11 Thread Amir 'CG' Caspi
At 1:41 PM -0600 08/10/2013, Amir 'CG' Caspi wrote: (The HTML comment gibberish rule would be a big step here, since that's one of the few things that would distinguish this from ham... unlikely that a real person would embed tens of KB of comment gibberish.) OK, I'm trying to test an HTML

Re: New spam rule for specific content

2013-08-11 Thread Amir 'CG' Caspi
At 2:22 AM -0600 08/11/2013, Amir 'CG' Caspi wrote: My regex is valid and appropriate for those comments... I tested it at regexpal.com, which shows that all three comments match just fine (all three get highlighted). So... why is SA hitting only on the final comment, and ignoring the first

Re: New spam rule for specific content

2013-08-11 Thread Amir 'CG' Caspi
At 9:31 PM -0400 08/11/2013, Alex wrote: Can you post this rule again so we can investigate? # HTML comment gibberish # Looks for sequence of 100 or more words (alphanum + punct separated by whitespace) within HTML comment rawbody HTML_COMMENT_GIBBERISH

Re: New spam rule for specific content

2013-08-11 Thread Amir 'CG' Caspi
At 6:56 PM -0700 08/11/2013, John Hardin wrote: I'm also going to make FP-avoidance changes that should also help. Care to share? =) Just make sure that the rule does not match the -- comment-end token I tried doing that and it caused SA to hang... couldn't figure out why the regex wasn't

Re: New spam rule for specific content

2013-08-11 Thread Amir 'CG' Caspi
At 7:20 PM -0700 08/11/2013, John Hardin wrote: The unbounded matches you're using probably caused the RE engine to get stuck backing off and retrying. That's what I figured. That's why I changed things to the current version, which is bounded by the end-tag of the comment. My current
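The backtracking concern above can be sidestepped by splitting the work in two steps. This Python sketch is a different technique from the single-regex rawbody rule discussed in the thread: it first isolates each comment with a non-greedy match bounded by the "-->" end token, then counts words with a plain split. The function names are hypothetical, and the 100-word threshold is taken from the rule description earlier in the thread.

```python
import re

# Non-greedy match bounded by the comment-end token, so the engine
# never has to retry nested quantifiers across the whole message.
COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)


def longest_comment_word_count(html: str) -> int:
    """Word count of the wordiest HTML comment, or 0 if there is none."""
    return max(
        (len(m.group(1).split()) for m in COMMENT.finditer(html)),
        default=0,
    )


def has_comment_gibberish(html: str, min_words: int = 100) -> bool:
    """Flag messages whose longest comment holds min_words or more words."""
    return longest_comment_word_count(html) >= min_words
```

Counting with split() keeps the regex itself free of unbounded nested repetition, which is the pattern shape most likely to cause the hang described in this exchange.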

Re: New spam rule for specific content

2013-08-11 Thread Amir 'CG' Caspi
At 8:23 PM -0700 08/11/2013, John Hardin wrote: However, I may be taking too-conservative a stance here. It's possible that, while HTML comments can appear in ham, *long* HTML comments won't, and the fact that we're looking for long blocks of comment text is enough safety. That's why

Re: New spam rule for specific content

2013-08-10 Thread Amir 'CG' Caspi
At 10:41 AM -0700 08/09/2013, John Hardin wrote: Can you provide a spample or two? Sure. http://pastebin.com/VfSCB7fw http://pastebin.com/VCtvzjzV Note the outl and outi links near the very bottom. The actual domains used in these URIs vary... they used to be .pw, but recently most have

Re: New spam rule for specific content

2013-08-10 Thread Amir 'CG' Caspi
At 10:41 AM -0700 08/09/2013, John Hardin wrote: Can you provide a spample or two? Looks like a similar spam method has come out in recent weeks (since Jul 30, it seems) that uses slightly different footers... example is here: http://pastebin.com/QCmSPzwG Although running SA on this spam

Re: LONGWORDS not hitting?

2013-08-10 Thread Amir 'CG' Caspi
At 12:42 PM -0600 06/30/2013, Amir 'CG' Caspi wrote: Hi all, Just got this spam: http://pastebin.com/KM5paaZ9 To me, it looks like LONGWORDS should have hit... but it didn't. I ran it manually through spamassassin and spamc, and LONGWORDS still didn't hit, so it seems to just

Re: New spam rule for specific content

2013-08-10 Thread Amir 'CG' Caspi
At 2:17 PM -0700 08/10/2013, John Hardin wrote: Perhaps it's time to bring FuzzyOCR up-to-date...? Is this something I need to manually update or something that needs updating in the SA distribution? Thanks. --- Amir

New spam rule for specific content

2013-08-09 Thread Amir 'CG' Caspi
Hi all, A number of my users have been receiving spam formatted in a very specific way which seems to very often miss Bayes... I don't know why, whether it's because of the HTML gibberish flooding Bayes with useless tokens (to reduce the relative strength of the spammy tokens), or if it's

Re: New spam rule for specific content

2013-08-09 Thread Amir 'CG' Caspi
On Fri, August 9, 2013 1:01 pm, RW wrote: BAYES works on rendered text it doesn't see the HTML. Hmmm. It doesn't see HTML comments, which would appear in rendered HTML source even though they are invisible? OK, in that case, I have NO idea why the spam isn't hitting Bayes, because it looks

Re: ADDRESS_IN_SUBJECT et al

2013-07-31 Thread Amir 'CG' Caspi
At 3:23 AM +0200 07/25/2013, Karsten Bräckelmann wrote: header LOCALPART_IN_SUBJECT eval:check_for_to_in_subject('user') And all of them do hit that rule. A super-set of the ADDRESS variant, using the local part instead of the complete address. Still in stock rules. Hm. One of my

Re: Forgetting mis-learned email

2013-07-29 Thread Amir 'CG' Caspi
At 12:00 PM +0200 07/29/2013, Karsten Bräckelmann wrote: Your best bet is to just train what you have as spam, to counter the Sure, I was planning to do that. The reason I wanted to --forget it was to make sure that I wasn't learning it twice (once as ham, once as spam). You do not

Re: Forgetting mis-learned email

2013-07-29 Thread Amir 'CG' Caspi
At 5:48 PM +0200 07/29/2013, Karsten Bräckelmann wrote: I strongly suggest to NEVER copy-n-paste like that, but to either run sa-learn on an entire mbox, or *save* a single mail to a file. Since For what it's worth, I also opened the mbox in a text editor and copied the actual raw message (as

Re: Forgetting mis-learned email

2013-07-29 Thread Amir 'CG' Caspi
On Mon, July 29, 2013 10:21 am, Karsten Bräckelmann wrote: There were none for this email. Content-Type: text/plain Content-Transfer-Encoding: 8bit Whoops. I missed those... I guess this could be why a 7-bit copy/paste wouldn't work, and using the mbox file directly is required. Tried

Forgetting mis-learned email

2013-07-28 Thread Amir 'CG' Caspi
Hi all, So, some of my FNs get autolearned as ham, and because of the way my mail queue is set up, I typically only see this once the mail reaches my MUA and has already been deleted from the online inbox. I have one particular message that got autolearned as ham (but should be spam), and

Re: Running as root.

2013-07-15 Thread Amir 'CG' Caspi
At 12:05 AM +0100 07/16/2013, RW wrote: OTOH when I just tried this in 3.3.2, spamd didn't pick up a test rule I added to ~/.spamassassin/user_prefs (which worked with the spamassassin script). Do you have allow_user_rules enabled in your local.cf? According to

Re: LONGWORDS not hitting?

2013-07-01 Thread Amir 'CG' Caspi
At 3:24 PM +0200 07/01/2013, Benny Pedersen wrote: if content end user see is mangled, then end user cant relearn ham to be spam Yes, they can, because SA sees the mangled email before the user does. Therefore if SA misclassifies an email as ham, that exact same email is the one seen by the

LONGWORDS not hitting?

2013-06-30 Thread Amir 'CG' Caspi
Hi all, Just got this spam: http://pastebin.com/KM5paaZ9 To me, it looks like LONGWORDS should have hit... but it didn't. I ran it manually through spamassassin and spamc, and LONGWORDS still didn't hit, so it seems to just not be hitting that rule. But, to my eye, it looks like

Re: LONGWORDS not hitting?

2013-06-30 Thread Amir 'CG' Caspi
At 8:57 PM +0200 06/30/2013, Benny Pedersen wrote: well it might confuse bayes yes, but it cant confuse you to run sa-learn --spam on it ? I've been running sa-learn --spam on these messages for a month straight. Some get picked up, others don't. I'm still getting a lot of BAYES_50 on

Re: LONGWORDS not hitting?

2013-06-30 Thread Amir 'CG' Caspi
At 11:23 PM +0200 06/30/2013, Benny Pedersen wrote: does it continue if one msg is learned as spam, does it still after say bayes_50 ? No, it has BAYES_99 if I learn the message. That is, running SA on the SAME message will give BAYES_99 after it's learned. It's not a ham problem. you

Re: SPF lookup error

2013-06-25 Thread Amir 'CG' Caspi
On Tue, June 25, 2013 5:15 am, Matus UHLAR - fantomas wrote: This looks like a Net::DNS issue, try upgrading that one. Also, try upgrading perl... Why do you say this looks like a Net::DNS issue? The error is being reported from Mail::SPF, and I've traced that error through the code, it tracks to

Bayes scoring priority

2013-06-24 Thread Amir 'CG' Caspi
Hi all, So, I think I've gotten my Bayes DB largely under control... most of the FN spam I'm getting is getting high Bayes scores and simply not a large enough aggregate score to count as spam. So, now I'm wondering if I should increase the points assigned to high Bayes scores. For

Re: SPF lookup error

2013-06-24 Thread Amir 'CG' Caspi
end up failing due to this error and some spam therefore gets missed (FNs). Any ideas are most welcome. Thanks. --- Amir At 12:01 AM -0600 06/13/2013, Amir 'CG' Caspi wrote: Hi all, I am getting the following error peppering my

Re: New rule for HTML spam, using comments?

2013-06-20 Thread Amir 'CG' Caspi
At 9:47 AM +0200 06/20/2013, Tom Hendrikx wrote: Since mailscanner already has support for integrating spamassassin [1] (As I mentioned explicitly in a previous email...) why would you ever want to put work in reversing some of mailscanners 'protection'? Because, given the particulars of

Re: New rule for HTML spam, using comments?

2013-06-19 Thread Amir 'CG' Caspi
On Wed, June 19, 2013 3:14 pm, Axb wrote: iirc, MailScanner munges the URL before SA sees it so unless your plugin idea involves a crystal ball, it's not possible. Yes, MailScanner gets to it before SA does, unless SA is called from within MailScanner (which it isn't, on my setup, but that is a

Re: New rule for HTML spam, using comments?

2013-06-19 Thread Amir 'CG' Caspi
On Wed, June 19, 2013 3:47 pm, Axb wrote: SA's URIBL plugin doesn't and shouldn't look in the alt attribute. Why not, exactly? I wouldn't look at it for _all_ img tags, only for ones that are clearly MailScanner-munged. That is, one would look for the patterns that MailScanner uses for

Re: New rule for HTML spam, using comments?

2013-06-18 Thread Amir 'CG' Caspi
At 4:37 PM -0400 06/14/2013, Alex wrote: On Fri, Jun 14, 2013 at 4:18 PM, Amir 'CG' Caspi ceph...@3phase.com wrote: I wonder if there's some difference between running spamassassin manually on the message versus running spamd. I think the only difference would be if spamd somehow didn't

Re: New rule for HTML spam, using comments?

2013-06-18 Thread Amir 'CG' Caspi
At 10:13 AM -0700 06/18/2013, John Hardin wrote: On Mon, 17 Jun 2013, Amir 'CG' Caspi wrote: Any idea why it failed to hit, and does this need another rule revision? Yep, and yep. Revision committed. Initial comment gibberish rule committed. Thanks for the revision. Do you want to explain

Re: New rule for HTML spam, using comments?

2013-06-18 Thread Amir 'CG' Caspi
At 8:58 AM -0400 06/18/2013, Ben Johnson wrote: a.) You are copying/pasting the body of the email, but not the headers. No, I am copying the headers... however, I am using Eudora (ancient, I know) as a mail client, and it's possible the headers are not properly formatted. For example, for

Re: New rule for HTML spam, using comments?

2013-06-18 Thread Amir 'CG' Caspi
At 10:24 AM -0700 06/18/2013, John Hardin wrote: The earlier version wasn't allowing for some punctuation in the gibberish. There may be a period of whack-a-mole here, I was conservative in the change I made. Makes sense. Both of those examples are good for creating an

Re: New rule for HTML spam, using comments?

2013-06-18 Thread Amir 'CG' Caspi
Replies to multiple folks below... At 1:42 PM -0400 06/18/2013, Kris Deugau wrote: Try opening the on-disk file with Notepad (or your favourite text editor on *nix). If you see the same thing you see when you hit the blah blah blah button in Eudora, you should be OK. If not... I've done

Re: New rule for HTML spam, using comments?

2013-06-17 Thread Amir 'CG' Caspi
At 7:20 PM -0700 06/15/2013, John Hardin wrote: I took a closer look at this and it seems they're working around trivial gibberish detection by putting a valid CSS property at the very beginning of the style tag. Revising the rules... I am now seeing STYLE_GIBBERISH hitting on a lot of spam

Re: New rule for HTML spam, using comments?

2013-06-17 Thread Amir 'CG' Caspi
At 10:48 AM -0700 06/17/2013, John Hardin wrote: On Mon, 17 Jun 2013, Amir 'CG' Caspi wrote: I am now seeing STYLE_GIBBERISH hitting on a lot of spam in the past day or so, since the new rules hit the distribution. So far, all TPs, no FPs. Yay! But, I found one today that should have hit

Re: New rule for HTML spam, using comments?

2013-06-14 Thread Amir 'CG' Caspi
At 9:43 PM -0400 06/13/2013, Alex wrote: I'd say if you have any that are hitting bayes20 or lower, your database is not working properly and you should probably start over. Not quite sure I want to do that... I don't really have a sufficient corpus of mail for good training. It's working

Re: New rule for HTML spam, using comments?

2013-06-14 Thread Amir 'CG' Caspi
At 4:37 PM -0400 06/14/2013, Alex wrote: Yeah, but not bayes20. That's bad for sure. You should start collecting now, or pull a few hundred from your recent quarantine and use those, along with people's mail folders. Well, I got bayes99 when I ran spamassassin manually just now. So, I really

Re: New rule for HTML spam, using comments?

2013-06-14 Thread Amir 'CG' Caspi
At 4:37 PM -0400 06/14/2013, Alex wrote: I think the only difference would be if spamd somehow didn't recognize all the locations for your rules. Perhaps create a rule that you know will hit with a very low score in each directory that contains rules. Maybe there's a way to run spamd in the

Re: New rule for HTML spam, using comments?

2013-06-14 Thread Amir 'CG' Caspi
At 11:43 PM +0100 06/14/2013, Martin Gregorie wrote: Are you sure? Take a look at how sa_update is getting run to make sure that it is doing what you expect. Yes, I'm sure. I looked at the update script (in my case, it's called update_spamassassin, due to the way Parallels Pro configures

SPF lookup error

2013-06-13 Thread Amir 'CG' Caspi
Hi all, I am getting the following error peppering my maillogs: Jun 13 01:26:42 kismet spamd[24575]: spf: lookup failed: Can't locate object method new_from_string via package Mail::SPF::v1::Record at /usr/lib/perl5/vendor_perl/5.8.8/Mail/SPF/Server.pm line 524. This occurs very often,

New rule for HTML spam, using comments?

2013-06-13 Thread Amir 'CG' Caspi
Lately, I've been getting hit with a LOT of this type of spam: http://pastebin.com/HD0rNdxU Not all of it is identical in format, but there seems to be one thing in common: they include lots of random garbage inside either CSS or in HTML comments. All of this gets ignored by the HTML parser

Re: New rule for HTML spam, using comments?

2013-06-13 Thread Amir 'CG' Caspi
At 7:25 PM -0400 06/13/2013, Alex wrote: I think people will start by telling you to block the pw domain Sure, but not all of the comment-laden spam is from the pw domain. It comes in from .net, .com, .us, and a bunch of other places as well. This is just the one example I happened to pick

Re: New rule for HTML spam, using comments?

2013-06-13 Thread Amir 'CG' Caspi
At 8:04 PM -0400 06/13/2013, Alex wrote: After looking at it more closely, it's also only hitting bayes20 for you. Do the others also score so low? This hits bayes99 on my system. The ones that SA doesn't catch, yes, they are typically low. I have some that are bayes50, some bayes20, some