Re: SEO Spam

Martin Gregorie Tue, 19 May 2015 12:23:51 -0700

On Tue, 2015-05-19 at 14:38 -0400, Alex Regan wrote:
> I've got more than a dozen now. It's a regular thing. I was just trying 
> to somehow gain support for somehow being more proactive with these.
> 
Here are a couple of ideas that may help. Both use lists of alternate
patterns, e.g.  body RULE /(man|woman|child)/i :


1) if the phrases you're matching fall into groups, such as sales
phrases and product names: 

  'big discounts' available on 'Mickey Mouse Chronometers' 

where I've quoted the candidate sales and product patterns, it may pay
to make each group into a separate list of alternates with a minimum
score and put the punitive score on a meta that requires at least one
hit on each of the groups before it will fire. The benefit of this is
that as the lists get a bit bigger they'll start matching on
combinations that you haven't seen earlier. This approach seems to be
fairly resistant to FPs.

2) If you can't split the matches into categories, consider using a
single list of alternates with 'tflags multiple' set and a moderate
score, chosen so that the message is only classified as spam if three
or more alternates match. Again, this will hit combinations you haven't
seen earlier, though it's probably a bit more FP-prone than my first
suggestion.
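
A minimal sketch of both ideas, in case it helps to see them as rules.
The rule names, phrase lists and scores below are illustrative
assumptions, not tested values. The second part is a variant of idea 2:
rather than relying on the score adding up per hit, it counts hits of an
unscored subrule in a meta, which is the documented use of 'tflags
multiple'.

```
# Idea 1: one list of alternates per group, each with a token score,
# and a meta carrying the punitive score. Names and phrases are
# hypothetical.
body      SEO_SALES          /\b(?:big discounts?|special offers?|lowest prices?)\b/i
describe  SEO_SALES          Sales phrase (example list)
score     SEO_SALES          0.01

body      SEO_PRODUCT        /\b(?:mickey mouse chronometers?|designer handbags?)\b/i
describe  SEO_PRODUCT        Product name (example list)
score     SEO_PRODUCT        0.01

# Fires only when at least one alternate from each group has matched.
meta      SEO_SALES_PRODUCT  (SEO_SALES && SEO_PRODUCT)
describe  SEO_SALES_PRODUCT  Sales phrase and product name in one message
score     SEO_SALES_PRODUCT  4.0

# Idea 2 (variant): a single list, counted by a meta.
# The leading double underscore makes __SEO_PHRASE an unscored subrule.
# maxhits=3 stops the search once three hits have been found.
body      __SEO_PHRASE       /\b(?:search engine optimi[sz]ation|first page of google|link building|guaranteed rankings?)\b/i
tflags    __SEO_PHRASE       multiple maxhits=3

meta      SEO_PHRASE_X3      (__SEO_PHRASE >= 3)
describe  SEO_PHRASE_X3      Three or more SEO phrases matched
score     SEO_PHRASE_X3      4.0
```

Note that 'multiple' counts hits rather than distinct alternates, so the
same phrase repeated three times would also satisfy the meta. Run
'spamassassin --lint' after adding rules like these to catch syntax
slips.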

The disadvantage of both approaches is that manually editing large
alternate lists is painful. So, I developed a scripted solution, based
on awk/gawk, that lets you keep the list of matching patterns in an
editor-friendly form and generates SA rules from the edited list. Here's
a link to it:
http://www.libelle-systems.com/free/portmanteau/portmanteau.tgz

HTH

Martin
