Re: [Razor-users] Question about reporting

Matt Kettler Fri, 27 Oct 2006 14:52:59 -0700

John Andersen wrote:
> On Friday 27 October 2006 09:57, Matt Kettler wrote:
>> Giampaolo Tomassoni wrote:
>>> Dears,
>>>
>>> some time ago (in Nov. 2004, I guess) I registered one of my servers
>>> with Razor2 in order to report spam.
>>>
>>> I could easily see that whatever piece of spam I was reporting, was soon
>>> reported as spam by a further check on the very same message.
>>>
>>> Recently, I see that my reports are basically ignored.
>>>
>>> Is there any policy change in this matter, or is it due to some
>>> misconfiguration on my side?
>  
>> Are the spams you're doing this with some of the current image spams?
> 
> He said "the very same message", and I took him at his word.


As did I. I took it to mean the EXACT SAME message, i.e. run it through
razor-report, then later run the exact same file through razor-check.

My explanation of the one-offs is not to suggest he's using two different
messages in his test, it is to explain why nobody else will ever report the same
one. That's why it's not being listed.

Giampaolo is not being ignored; there are just no other reports of the same
signature.

And regardless, even if his one report did cause listing, it would not matter.
Nobody, not even Giampaolo, will ever get that same image again.
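
To make the one-off problem concrete, here is a toy sketch in Python. It is
not how Razor computes signatures: SHA-256 stands in for the real signature
engines, and ToyCatalogue, one_off and min_reports are hypothetical names
invented for this illustration. The only point is that a per-copy change in
the image gives every copy its own signature, so a report of one copy says
nothing about the next.

```python
import hashlib
from collections import Counter


def signature(body: bytes) -> str:
    # Stand-in for a content signature. Razor's real engines differ;
    # SHA-256 is an assumption made only for this sketch.
    return hashlib.sha256(body).hexdigest()


class ToyCatalogue:
    """Hypothetical model of a shared signature catalogue."""

    def __init__(self, min_reports: int = 1):
        # min_reports is an illustrative threshold, not a Razor setting.
        self.min_reports = min_reports
        self.reports = Counter()

    def report(self, body: bytes) -> None:
        self.reports[signature(body)] += 1

    def check(self, body: bytes) -> bool:
        return self.reports[signature(body)] >= self.min_reports


def one_off(template: bytes, serial: int) -> bytes:
    # Stand-in for the per-copy randomization spammers apply to the image.
    return template + serial.to_bytes(4, "big")


template = b"GIF89a stock tip pixels"
my_copy = one_off(template, 1)
next_copy = one_off(template, 2)

catalogue = ToyCatalogue(min_reports=1)
catalogue.report(my_copy)

print("same file listed:", catalogue.check(my_copy))
print("next copy listed:", catalogue.check(next_copy))
```

Even with a threshold of one report, only the exact file that was reported
is ever listed. With a higher threshold, a single reporter never gets it
listed at all.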


> This is not to say that one-off custom messages (be it text or image) do
> not present a challenge for razor.  However other engines such as
> spamassassin usually do properly catch these things. 

My SpamAssassin setup catches about 80-90% of these. I'd also point out that
I'm much more proficient in SA than most folks are. (I wrote antidrug.cf and
the original WritingRules guide that's in the SA wiki.)

> Even the random text messages.  (The theory of bayes poisoning has been
> pretty well de-bunked).


Well, yes, bayes poisoning has been debunked, and I am personally one of the
people who have done a lot of public education to debunk it and to explain why
it can't work in a chi-squared system.

But this isn't poisoning, at least not in the true sense. Poisoning is where you
mix nonspam content with your spam. These messages aren't a mix. In this case,
there's NO spam in the text section of the message. All the spam parts are in
the image.

Because of that, a bayes filter won't detect these messages very well.

Looking at my most recent 13 image-based stock spams, every single one scored
BAYES_50. All were tagged as spam by SA (obviously not due to bayes), and 7 of
them were autolearned as spam.

But despite all the autolearning, the bayes scores aren't going up. Why? Because
the text in one looks nothing like the text in the other. Learning them does
almost nothing for your bayes system. It's statistically irrelevant to detecting
other similar spams. Fortunately, it's irrelevant to just about everything, so
this won't hurt your bayes DB either.
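
A small Python sketch of why that happens. This is not SpamAssassin's code:
it is a toy classifier using Robinson-style token smoothing and chi-squared
(Fisher) combining in the style popularized by SpamBayes, and ToyBayes, its
strength and prior parameters, and the sample tokens are all made up for the
illustration. Tokens the database has never seen get the neutral prior, so a
message made entirely of fresh random text lands at the neutral score no
matter how many other random-text messages were learned.

```python
import math
from collections import Counter


def chi2q(x2: float, dof: int) -> float:
    # Chi-squared survival function, valid for an even number of
    # degrees of freedom.
    m = x2 / 2.0
    total = term = math.exp(-m)
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


class ToyBayes:
    """Hypothetical toy classifier, not SpamAssassin's implementation."""

    def __init__(self, strength: float = 1.0, prior: float = 0.5):
        self.strength = strength  # weight given to the prior
        self.prior = prior        # probability assumed for unseen tokens
        self.spam = Counter()
        self.ham = Counter()
        self.nspam = 0
        self.nham = 0

    def learn(self, tokens, is_spam: bool) -> None:
        (self.spam if is_spam else self.ham).update(set(tokens))
        if is_spam:
            self.nspam += 1
        else:
            self.nham += 1

    def token_prob(self, token: str) -> float:
        s, h = self.spam[token], self.ham[token]
        n = s + h
        if n == 0:
            return self.prior  # never seen: carries no evidence
        spam_ratio = s / max(self.nspam, 1)
        ham_ratio = h / max(self.nham, 1)
        p = spam_ratio / (spam_ratio + ham_ratio)
        return (self.strength * self.prior + n * p) / (self.strength + n)

    def score(self, tokens) -> float:
        probs = [self.token_prob(t) for t in set(tokens)]
        if not probs:
            return 0.5
        dof = 2 * len(probs)
        spamminess = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in probs), dof)
        hamminess = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in probs), dof)
        return (spamminess - hamminess + 1.0) / 2.0


db = ToyBayes()
# Autolearn three image spams whose text parts share no words.
db.learn(["xqzv", "plomb", "trefoil"], is_spam=True)
db.learn(["garnet", "whist", "oblong"], is_spam=True)
db.learn(["kestrel", "vamp", "dross"], is_spam=True)

# The next image spam arrives with yet another set of random words.
print("fresh random text:", db.score(["wkrn", "habble", "quist"]))
# Repeated text, by contrast, is something the database can use.
print("repeated text:    ", db.score(["xqzv", "plomb", "trefoil"]))
```

The first score stays at the neutral midpoint because every token is new,
which is the situation behind the BAYES_50 hits described above.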

The bayes system did very little here. It consistently gets fed lots of these
messages, but can't pick them up as more than BAYES_50. They're all tagged based
on RBLs and SARE gif rules, not bayes or razor.

I would not rely on bayes any more than razor to detect these. Both are of
limited use against this particular variant.
