On 5/1/2010 1:51 PM, Andy Schmidt wrote:


<snip/>

 

Right - that's the same scheme I just pointed out to Dave myself - except in my case you could pick a distinct factor for the "-" vs. the "+" side of the scale (because Declude already has that option anyhow).


I was trying to provide a simple example. In practice it would probably be better to have separate positive-going and negative-going weights.


<snip/>

Here’s an important question, though:

 

Do you have a distribution chart for the reputation scale? It of course makes a HUGE difference whether the reputations reported for the inflow of email are evenly distributed between -1.0 and +1.0, or follow a bell curve where 80% are in the “center” area, or follow some sort of exponential curve that has very few with “good” reputation, a modest amount around the 0 point, and then exponentially increasing numbers toward the bad reputations.

 

This way one could decide what factors to use for the + and – sides and where to set the “mid” point (Declude allows you to shift the mid-point left and right).


The research we have shows that the curve is largely bipolar and heavily weighted toward the black. Supposedly "good" ISPs frequently produce more than 90% spam from their systems!! Indeed, one of the mistakes we made during early testing was to assume that anybody producing more than 80% spam was probably not to be trusted and that the remaining 20% might be explained largely by false negatives. We were very wrong about that. (SCIENCE!)

On the other hand, good reputation values do occur and when there is a strong confidence value they can often be trusted. BUT NOT ALWAYS... When one of the new pre-tested campaigns hits a fresh bot-net some of the sources can gain strong positive reputations for a short time. Our real-time IP conflict instrumentation has shown us a clearer picture of this -- while we knew it was possible (even likely) we were surprised to see how often solid new rules for these campaigns will be met with auto-panics in the field when first deployed.

For this reason we chose a nonlinear curve to boil the statistics down to a single value: R = sign(p) * sqrt(abs(p) * c), where p is the probability figure and c is the confidence figure.

From:

https://svn.microneil.com/websvn/filedetails.php?repname=PKG-SNF-SDK-WIN&path=%2Ftrunk%2FSNFMultiDll%2Fsnfmultidll.cpp
        default: {                                                              // Ugly means we calculate the reputation
             Reputation =                                                       // figure from the statistics. Start by
               sqrt(fabs(Tester.G.Probability() * Tester.G.Confidence()));      // combining the c & p figures then
             if(0 > Tester.G.Probability()) Reputation *= -1.0;                 // flip the sign if p is negative.
        }
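
As a rough illustration of that formula outside the SDK, here is a minimal free-standing sketch. The function name `reputation` and the input ranges (p in [-1.0, 1.0], c in [0.0, 1.0]) are my assumptions for the example; only the arithmetic follows the snippet above.

```cpp
#include <cmath>

// Hypothetical stand-alone version of the calculation quoted above.
// p: probability figure, assumed to lie in [-1.0, 1.0]
// c: confidence figure, assumed to lie in [0.0, 1.0]
double reputation(double p, double c) {
    // Combine the two figures, then take the square root of the magnitude.
    double r = std::sqrt(std::fabs(p * c));
    // Flip the sign if p is negative, so R keeps the sign of p.
    return (p < 0.0) ? -r : r;
}
```

Because of the square root, a source with modest statistics already lands fairly far from zero: `reputation(0.5, 0.5)` gives 0.5, and `reputation(-0.25, 1.0)` gives -0.5.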

I recommend a softer weight for "good looking" IP reputations -- something calculated to negate "iffy" tests and avoid false positives.
For "bad looking" IP reputations, a strong weight is generally sound, provided there are some countering weights to balance it out when one of those "Good" ISPs is delivering the message in the midst of their 80% spam flood.


 

>> I'm guessing on how that test is implemented, but if I've guessed correctly then -0.8 would certainly be a good WHITE set point.<<

 

Thank you – that means in their “default” (sample) config file, they really should adjust the midpoint away from “0” to “-8” (they multiply the reputation scale by 10 to be able to work with integers).


You know -- a lot of the professional filtering houses that started with (or still use) Declude adjusted their scales up to 100 or higher in order to give more room for fine adjustments. When we were developing MDLP we preferred that as well. The choice of scale is a matter of opinion and application -- and in a weight-driven system it's always up for adjustment, as every weight interacts with every other weight.
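
To make the effect of the scale concrete, here is a small sketch of converting a reputation to an integer scale. The helper name `scaledReputation` and the choice to round to the nearest integer are my assumptions; how Declude itself rounds is not stated in this thread.

```cpp
#include <cmath>

// Hypothetical helper: convert a reputation in [-1.0, 1.0] to an integer
// on a scale of +/- `scale`, rounding to the nearest integer.
long scaledReputation(double reputation, int scale) {
    return std::lround(reputation * scale);
}
```

On a scale of 10, reputations of -0.8 and -0.83 both become -8. On a scale of 100 they become -80 and -83, which is the extra room for fine adjustment.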


Best,

_M

-- 
President
MicroNeil Research Corporation
www.microneil.com




---
This E-mail came from the Declude.JunkMail mailing list. To
unsubscribe, just send an E-mail to imail...@declude.com, and
type "unsubscribe Declude.JunkMail". The archives can be found
at http://www.mail-archive.com.
