RE: [Declude.JunkMail] Sniffer IP vs. Sniffer IP Reputation vs. Sniffer Truncate

2010-05-01 Thread David Barker
My quick response.

The out-of-the-box Declude customer CAN use the samples given. The extra 
scoring ensures that bad IPs are eliminated as spam. It would be the same 
as placing an extra high score on a specific test. Pete's notes suggest:

63 - Black
Systems should usually quarantine or reject messages produced by this IP.

20 - Truncate
Systems should usually refuse connections from this IP.

Which means that, for the majority of our customers, an exaggerated score on 
these messages is fine. (I will have to check on Monday, but I don't believe 
it triples the score; I think the maximum would be 2 tests based on the same 
information.) Unfortunately, a large portion of our customers today do not 
understand or even care about the details. The beauty of Declude is that 
you are welcome to score tests however you feel is appropriate for your 
email server.

I do agree with you that it could be made clearer, but advising the 
list NOT to use the current Declude settings is your opinion. What would be 
helpful is a suggestion of what settings you use, based on your 
results.

David


From: Andy Schmidt andy_schm...@hm-software.com
Sent: Friday, April 30, 2010 9:26 PM
To: declude.junkmail@declude.com
Subject: RE: [Declude.JunkMail] Sniffer IP vs. Sniffer IP Reputation vs. 
Sniffer Truncate 

Thanks Pete - that confirms what I feared.

Declude's own sample should NOT be used as is because it duplicates the IP
results (at minimum).

> The SNFIPREP test gives you a variable weight based on the IP reputation
> in GBUdb. This allows you to get some weighting positively or negatively
> based on the reputation even when that reputation is not in one of the
> defined GBUdb envelopes.

Yes - according to Dave's explanation earlier today, Declude will get a
decimal number between -1 and +1. Their sample/default configuration treats
0 as normal, treats anything negative as GOOD (and subtracts 5 points),
and treats anything positive as BAD (and adds 10 points).

So - even though Sniffer returns information on a very graduated scale,
Declude then returns one of 3 discrete numbers. In fact, 0 is only returned
for 10% of the range - 90% of the range returns either -5 or 10.
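
To make the three-step behaviour concrete, here is a minimal Python sketch of the mapping as it is described in this thread. The function name and its default arguments are hypothetical; only the -5 / 0 / +10 outcomes come from the sample line `IPREPUTATION SNFIPREP x 0 10 -5`, and Declude's actual implementation may well differ.

```python
def sample_snfiprep_weight(reputation, bad_weight=10, good_weight=-5):
    """Stepped mapping as described in the thread (an assumption, not Declude's code)."""
    if reputation > 0:
        return bad_weight    # any positive (bad) reputation adds the full weight
    if reputation < 0:
        return good_weight   # any negative (good) reputation gets the full credit
    return 0                 # exactly 0 means "no opinion"


for rep in (-1.0, -0.1, 0.0, 0.1, 1.0):
    print(rep, sample_snfiprep_weight(rep))
```

Under this reading, a reputation of -0.1 earns the same credit as -1.0, which is the loss of graduation being pointed out here.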

> I presume that even when SNFIP does return Caution, Black, or Truncate
> that SNFIPREP continues to work and in that case will provide some shading
> to those values... so, if you will, more or less Black, etc.

Based on Dave's explanation, Caution, Black and Truncate would
certainly always return a value > 0. Consequently, 10 would ALWAYS be
added to the weight for those 3 reputations.

Their default example basically TRIPLES the 10 weight that is assigned in
many cases (once for SNFIP, once for SNFIPREP, and once for SNF).

Let's see if Dave chips in - but it certainly seems to me that Declude's
Sniffer sample/default config should NOT be used (because it doesn't do what
an innocent user might expect). It's not at all clear that after all
their Sniffer rules, 30 would be added to the weight in several cases.
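
The arithmetic behind that figure can be sketched in a few lines of Python. The weights are copied from the sample lines quoted in this thread; the dictionary, the function name, and above all the assumption that all three tests fire for the same message are illustrative only (whether they really do is the point under discussion).

```python
# Weights taken from the sample config lines quoted in this thread.
SAMPLE_WEIGHTS = {
    "IPREPUTATION": 10,    # SNFIPREP, positive (bad) reputation
    "SNFIPTRUNCATE": 10,   # SNFIP result 6
    "SNFTRUNCATE": 10,     # SNF result 20
}


def total_weight(fired_tests, weights=SAMPLE_WEIGHTS):
    """Sum the weights of every test that fired (hypothetical helper)."""
    return sum(weights[name] for name in fired_tests)


# Assumption: one Truncate-range IP triggers all three tests at once.
print(total_weight(["IPREPUTATION", "SNFIPTRUNCATE", "SNFTRUNCATE"]))  # 30
```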

-Original Message-
From: supp...@declude.com [mailto:supp...@declude.com] On Behalf Of Pete
McNeil
Sent: Friday, April 30, 2010 7:07 PM
To: declude.junkmail@declude.com
Subject: Re: [Declude.JunkMail] Sniffer IP vs. Sniffer IP Reputation vs.
Sniffer Truncate

On 4/30/2010 5:16 PM, Andy Schmidt wrote:
> Hi Pete,
>
> I'm looking over Declude's recommended Sniffer configuration and trying to
> understand how much overlap there is between these options:
>
> IPREPUTATION      SNFIPREP  x  0   10  -5
>
> SNFIPCAUTION      SNFIP     x  4   5   0
> SNFIPBLACK        SNFIP     x  5   10  0
> SNFIPTRUNCATE     SNFIP     x  6   10  0
>
> SNFTRUNCATE       SNF       x  20  10  0
> SNIFFER-IP-RULES  SNF       x  63  10  0
>
> Looking at the Sniffer documentation of IP test result codes
> http://www.armresearch.com/support/articles/software/snfClient/resultCodes.jsp
> it seems that the SNFIP tests for 4, 5 and 6 (SNFIPCAUTION,
> SNFIPBLACK, SNFIPTRUNCATE) might coincide with 40, 63 and 20.



I am not intimately familiar with Declude's configuration and SNF 
integration --- not like I used to be anyway (so many platforms now).

I _think_ these tests work like this:

The SNFIPREP test gives you a variable weight based on the IP reputation 
in GBUdb. This allows you to get some weighting positively or negatively 
based on the reputation even when that reputation is not in one of the 
defined GBUdb envelopes. It's a subtle nudge in the right direction.

The SNFIP test gives you a hard result code based only on the IP 
reputation when that reputation is within one of the envelopes defined 
for GBUdb. So if the IP reputation is in the Caution, Black, or Truncate 
range then that test will fire.

Presumably all of the IP tests happen before SNF scans the 

Re: [Declude.JunkMail] Sniffer IP Reputation for white listing

2010-05-01 Thread Pete McNeil

On 4/30/2010 9:32 PM, Andy Schmidt wrote:


<snip/>


> But your documentation of the reputation system has a graph that shows that
> there is yet another category: WHITE.
   


I don't know the details of Declude's implementation. Presumably they 
could (or maybe even do) implement WHITE.



> The SNFIPREP test does offer the ability to define at what decimal value
> (between -1 and +1, in .1 increments) a weight can be subtracted. But the
> question is - is that a SENSIBLE use of your reputation database? For example,
> could -0.8 be a sensible threshold to give an email credit for coming from
> a reputable IP source?
   


I'm guessing on how that test is implemented, but if I've guessed 
correctly then -0.8 would certainly be a good WHITE set point.


My guess is based on using a combined score value from the IP reputation 
that combines the confidence figure and the probability figure. In that 
case only a strongly negative p coupled with a strong c would result in 
a -0.8.



> Or is it better to let the good reputation be considered AFTER the content
> scan and then use the combined exit code?
   


As I understand it Declude uses a weighting system --- except for some 
short-circuit abilities, that means all tests are run, their scores are 
added together, and then the total is used to determine the disposition 
of the message. I don't think there is an 'AFTER' in this case.


The IP reputation test is useful in cases where a message might be too 
new to hit a pattern match and where the IP reputation is not quite 
strong enough to be in one of the GBUdb envelopes. In such a case it 
might be useful to combine the 'analog' reputation score with the scores 
from other tests to push the message over the fence one way or 
another... at least that's how the test was designed to work in the API 
we provide.


It sounds like you're describing the IP Reputation test as having 
thresholds. That's an interesting way to do it (I haven't looked to see 
if it is actually that way)... a better way to do it would be to scale 
the result so that from 0 to -1 the negative weight (let's pick a 
factor of 5) would rise linearly from 0 to -5 and similarly a positive 
going reputation would scale linearly from 0 to +5 as the API result 
scaled from 0 to +1.
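
A minimal Python sketch of the linear scaling suggested above, using the factor of 5 from the example. The function name and the clamping to the -1..+1 range are assumptions made for illustration; this is not code from Declude or from the SNF API.

```python
def linear_weight(reputation, factor=5.0):
    """Scale a reputation in [-1, +1] linearly to a weight in [-factor, +factor]."""
    clamped = max(-1.0, min(1.0, reputation))  # guard against out-of-range input
    return clamped * factor


for rep in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(rep, linear_weight(rep))
```

With this shape a mildly bad reputation contributes only a small positive weight, and the contribution grows steadily as the reputation approaches +1.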


The API result holds 0 as meaning "I don't know" --- either because the 
confidence figure (c) is 0 or because the probability figure (p) is 0 
(meaning a 50% chance of spam or ham). The farther away from 0 you get 
the more certain the statistics.


Hope this helps,

_M



---
This E-mail came from the Declude.JunkMail mailing list.  To
unsubscribe, just send an E-mail to imail...@declude.com, and
type unsubscribe Declude.JunkMail.  The archives can be found
at http://www.mail-archive.com.



RE: [Declude.JunkMail] Sniffer IP vs. Sniffer IP Reputation vs. Sniffer Truncate -- SUGGESTION

2010-05-01 Thread Andy Schmidt
Hi Dave,

 

Oh, I think it's not just an opinion. Especially the SNFIPREP sample seems
to reduce the weight for 45% of all emails - when we all know that 98% are
spam. But, let's look at the facts, and then you can correct me where I got
them wrong so that either one of us can learn from our mistake.

 

> I don't believe it triples the score I think the max would be 2 tests
> based on the same information

 

1.   A black or truncate will trigger

a)  The SNF test (either Truncate or IP Rules)

b)  The SNFIP test (either Truncate or Black)

c)  The SNFIPREP (IP reputation) test (because Truncate or Black will NOT
have a Good or Neutral reputation)

It is therefore my belief that the score is tripled - or where am I
thinking wrong?

 

Now - this may actually be a somewhat desirable outcome (because the mail will
be blocked) - I do agree with you on that. But, it can result in an undesirable
outcome once customers attempt to slightly adjust the default settings
without realizing that some tests are triplicated.

 

> What would be helpful is making a suggestion to what settings you use
> based on your results

 

2.   Fair enough, my suggestion would obviously have to be to comment
out the two redundant SNF rules and let SNFIP handle the IP scoring part -
and increase those weights if you don't like them to be just 10 (remember, if
the content SNF tests find a match, THAT score will also be added!)

 

SNFIPBLACK          SNFIP  x  5   20  0
SNFIPTRUNCATE       SNFIP  x  6   30  0

# SNFTRUNCATE       SNF    x  20  10  0
# SNIFFER-IP-RULES  SNF    x  63  10  0



This way, your users can SEE that the SNF options exist and how they would
be coded, but would realize that they have to research the implications
first before removing the comment.

 

3.   I suspect that the biggest problem is the SNFIPREP test - but I'm
waiting for Pete to give some input. The way I understand your email from
Thursday about your algorithm, it potentially assigns 10% of all emails a
score of 0, potentially assigns 45% of ALL emails a score of -5, and
adds a weight of 10 to the remaining 45% of all emails - which also seems
rather arbitrary. (Disclaimer: what we don't know is the distribution
curve. Is it a bell curve, where the majority fall into the range of -0.05
to +0.05 and very few fall on the + or - side? Or is the distribution some
logarithmic curve that has very few on the good side, a moderate
frequency in the middle, and then increases sharply the further it gets on
the bad side?)
Now, maybe you analyzed the distribution curve before you developed this
sample?

 

Until all these crucial questions are resolved (I wouldn't want to reduce
the weight for a totally unknown percentage of all the spam!) I would
comment it out for sure:

 

# IPREPUTATION SNFIPREP x 0 10 -5

 

But, I have a really good suggestion on how to make this entire test more
usable:

 

The whole point of the reputation scale (between -1 and +1) is to allow
Sniffer customers a graduated response - not just 2 values for 90% of the
number scale. My suggestion would be to slightly enhance your formula by
multiplying  the reputation value with the assigned weight (after shifting
it for the base score). THEN I think this test would be useful, because it
would actually produce a sliding scale of weights based on the reputation
scale. In other words:

 

(( Abs(Reputation Value) * 10 ) - Base Value) * [Pos or
Neg]WeightFactor = Final Weight

 

For this line:

 

# IPREPUTATION SNFIPREP x 0 2 -1

 

it would result in weights between +20 and -10 - which is in line with what the
reputation scale was intended to provide:

 

Reputation 0.0: ( ( 0.0 * 10 ) - 0 ) * 2 = 0

 

Reputation 0.3: ( ( 0.3 * 10 ) - 0 ) * 2 = 6

Reputation 1.0: ( ( 1.0 * 10 ) - 0 ) * 2 = 20

  

Reputation -0.3: ( ( 0.3 * 10 ) - 0 ) * -1 = -3

Reputation -1.0: ( ( 1.0 * 10 ) - 0 ) * -1 = -10
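
The proposed formula can be checked with a short Python sketch. This only restates the suggestion above; the function and parameter names are hypothetical, the defaults mirror the example line `IPREPUTATION SNFIPREP x 0 2 -1`, and nothing here reflects how Declude actually computes the weight.

```python
def proposed_weight(reputation, base=0, pos_factor=2, neg_factor=-1):
    """((abs(reputation) * 10) - base) * factor, with the factor chosen by sign."""
    factor = pos_factor if reputation >= 0 else neg_factor
    return (abs(reputation) * 10 - base) * factor


for rep in (0.0, 0.3, 1.0, -0.3, -1.0):
    # Rounded for display only, since the inputs are binary floats.
    print(rep, round(proposed_weight(rep), 1))
```

With a base of 0 the result slides between -10 and +20 rather than jumping between three fixed values.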

 

Best Regards,

Andy

 

From: supp...@declude.com [mailto:supp...@declude.com] On Behalf Of David
Barker
Sent: Saturday, May 01, 2010 10:11 AM
To: declude.junkmail@declude.com
Subject: RE: [Declude.JunkMail] Sniffer IP vs. Sniffer IP Reputation vs.
Sniffer Truncate

 

My quick response.

The out of the box Declude Customer CAN use the samples given. The extra
scoring ensures that bad IP's are eliminated as spam. It would be the same
as placing an extra high score on a specific test. Pete's notes suggest:

63 - Black
Systems should usually quarantine or reject messages produced by this IP.

20 - Truncate
Systems should usually refuse connections from this IP.

Which means for the majority of our customers an exaggerated score on these
message is fine (I will have to check on Monday but I don't believe it
triples the score I think the max would be 2 tests based on the same
information) Unfortunately a large portion of our customers today do not
understand or even care about the details. The beauty of  Declude 

[Declude.JunkMail] Statistic programs for Junkmail

2010-05-01 Thread David Dodell
Curious what programs everyone is using to generate the nice reports showing 
what Junkmail tests are being activated?

Thanks

David




RE: [Declude.JunkMail] Statistic programs for Junkmail

2010-05-01 Thread Andy Schmidt
I happen to run Invariant Software's Declude Analyzer (for Declude Virus
and Declude Spam).

-Original Message-
From: supp...@declude.com [mailto:supp...@declude.com] On Behalf Of David
Dodell
Sent: Saturday, May 01, 2010 12:39 PM
To: declude.junkmail@declude.com
Subject: [Declude.JunkMail] Statistic programs for Junkmail

Curious what programs everyone is using to generate the nice reports showing
what Junkmail tests are being activated?

Thanks

David




RE: [Declude.JunkMail] Sniffer IP Reputation for white listing

2010-05-01 Thread Andy Schmidt
Hi Pete,

 

Funny - our messages overlapped. But I'm glad I was on the right track with
my suspicions. Hopefully this will help Declude to refine things.

 

> a better way to do it would be to scale the result so that from 0 to -1
> the negative weight (let's pick a factor of 5) would rise linearly from 0
> to -5 and similarly a positive going reputation would scale linearly from
> 0 to +5 as the API result scaled from 0 to +1.

 

Right - that's the same scheme I just pointed out to Dave myself - except in
my case you could pick a distinct factor for the - vs. the + side of the
scale (because Declude already has that option anyhow)

 

(( Abs(Reputation Value) * 10 ) - Base Value) * [Pos or
Neg]WeightFactor = Final Weight

 

For this line in the Declude config:

 

IPREPUTATION SNFIPREP x 0 2 -1

 

it would result in weights between +20 and -10, e.g.:

Reputation  0.0: ( ( 0.0 * 10 ) - 0 ) *  2 =   0
Reputation  0.3: ( ( 0.3 * 10 ) - 0 ) *  2 =   6
Reputation  1.0: ( ( 1.0 * 10 ) - 0 ) *  2 =  20

Reputation -0.3: ( ( 0.3 * 10 ) - 0 ) * -1 =  -3
Reputation -1.0: ( ( 1.0 * 10 ) - 0 ) * -1 = -10

 

 

Here's an important question, though:

 

Do you have a distribution chart for the reputation scale? It of course
makes a HUGE difference whether the reputations reported for the inflow of
email are evenly distributed between -1.0 and +1.0, or whether it is a bell
curve where 80% are in the center area, or whether it's some sort of
exponential curve that has very few with good reputation, a modest amount
around the 0 point, and then exponentially increasing numbers towards the
bad and truncate reputations.

This way one could decide what factors to use for the + and - sides and
where to set the midpoint (Declude allows you to shift the midpoint left
and right).

 

> I'm guessing on how that test is implemented, but if I've guessed
> correctly then -0.8 would certainly be a good WHITE set point.

 

Thank you - that means in their default (sample) config file, they really
should adjust the midpoint away from 0 to -8 (they multiply the
reputation scale by 10 to be able to work with integers) 

 

IPREPUTATION  SNFIPREP  x  0  2   -1

 

probably to

 

IPREPUTATION   SNFIPREP   x -8  2 -1

 

but I'd have to check with Dave to see if -8 will indeed set the midpoint
to -0.8 or if the sign has to be reversed.

 

Thanks for taking the time to help all of us understand Sniffer in the
context of the Declude integration.

 

I'm very happy that Declude took the time and integrated the product. I just
would like to make sure it comes with an implementation sample that is a
good enough compromise for day-to-day use.

 

Best Regards,

Andy

 

 

 


FW: [Declude.JunkMail] SNFIP option for WHITE?

2010-05-01 Thread Andy Schmidt
Dave,

Pete confirmed that in addition to the Caution, Black and Truncate
categories, there is a WHITE category (which was also mentioned in the
Sniffer documentation).

So, it seems as if besides the existing three SNFIP options:

  SNFIPCAUTION   SNFIP x 4  5 0
  SNFIPBLACK SNFIP x 5 10 0
  SNFIPTRUNCATE  SNFIP x 6 10 0

there should/could be a:

  SNFIPWHITE SNFIP x ??? -5 0

Best Regards,
Andy

-Original Message-
From: supp...@declude.com [mailto:supp...@declude.com] On Behalf Of Pete
McNeil
Sent: Saturday, May 01, 2010 11:57 AM
To: declude.junkmail@declude.com
Subject: Re: [Declude.JunkMail] Sniffer IP Reputation for white listing

> But your documentation of the reputation system has a graph that shows
> that there is yet another category: WHITE.

I don't know the details of Declude's implementation. Presumably they 
could (or maybe even do) implement WHITE.








Re: [Declude.JunkMail] Sniffer IP Reputation for white listing

2010-05-01 Thread Pete McNeil




On 5/1/2010 1:51 PM, Andy Schmidt wrote:

<snip/>

> Right - that's the same scheme I just pointed out to Dave myself - except
> in my case you could pick a distinct factor for the "-" vs. the "+" side
> of the scale (because Declude already has that option anyhow)



I was trying to provide a simple example. In practice it would probably
be better to have separate positive and negative going weights.


<snip/>

> Here's an important question, though:
>
> Do you have a distribution chart for the reputation scale? It of course
> makes a HUGE difference whether the reputations reported for the inflow
> of email are evenly distributed between -1.0 and +1.0, or whether it is a
> bell curve where 80% are in the center area, or whether it's some sort of
> exponential curve that has very few with good reputation, a modest amount
> around the 0 point, and then exponentially increasing numbers towards the
> bad and truncate reputations.
>
> This way one could decide what factors to use for the + and - sides and
> where to set the midpoint (Declude allows you to shift the midpoint left
> and right).
  


The research we have shows that the curve is largely bipolar and
heavily weighted toward the black. Supposedly "good" ISPs frequently
produce > 90% spam from their systems!! Indeed one of the mistakes
we made during early testing was to assume that anybody producing more
than 80% spam was probably not to be trusted and that the remaining 20%
might be explained largely by false negatives --- we were very wrong
about that. (SCIENCE!)

On the other hand, good reputation values do occur and when there is a
strong confidence value they can often be trusted. BUT NOT ALWAYS...
When one of the new pre-tested campaigns hits a fresh bot-net some of
the sources can gain strong positive reputations for a short time. Our
real-time IP conflict instrumentation has shown us a clearer picture of
this -- while we knew it was possible (even likely) we were surprised
to see how often solid new rules for these campaigns will be met with
auto-panics in the field when first deployed.

For this reason we chose a nonlinear curve to boil the statistics down
to a single value: R = sign(p) * sqrt(abs(p) * c)

From:

https://svn.microneil.com/websvn/filedetails.php?repname=PKG-SNF-SDK-WIN&path=%2Ftrunk%2FSNFMultiDll%2Fsnfmultidll.cpp

default: {                                              // Ugly means we calculate the reputation
    Reputation =                                        // figure from the statistics. Start by
        sqrt(fabs(Tester.G.Probability() * Tester.G.Confidence()));  // combining the c & p figures then
    if(0 > Tester.G.Probability()) Reputation *= -1.0;  // flip the sign if p is negative.
}
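
For readers who want to try values, the same calculation can be restated as a small Python sketch. It mirrors the C++ excerpt above; the function name and the assumption that p lies in [-1, +1] and c in [0, 1] are illustrative and not taken from the SNF SDK.

```python
import math


def reputation(p, c):
    """Combine probability p and confidence c into one signed figure.

    Mirrors the quoted excerpt: sqrt(|p * c|), negated when p is negative.
    """
    r = math.sqrt(abs(p * c))
    return -r if p < 0 else r


# A strongly negative p only reaches about -0.8 when c is also fairly strong.
print(reputation(-1.0, 0.64))
print(reputation(-1.0, 0.10))
```

Because of the square root, the figure moves away from 0 quickly at first and then flattens out, so values near -1 or +1 require both a strong p and a strong c.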


I recommend a softer weight for "good looking" IP reputations --
something calculated to negate "iffy" tests and avoid false positives.
For "bad looking" IP reputations a strong weight is generally sound
provided there are some countering weights to balance it off when one
of those "Good" ISPs is delivering the message in the midst of their
80% spam flood.



  
  
  
> > I'm guessing on how that test is implemented, but if I've guessed
> > correctly then -0.8 would certainly be a good WHITE set point.
>
> Thank you - that means in their default (sample) config file, they really
> should adjust the midpoint away from 0 to -8 (they multiply the reputation
> scale by 10 to be able to work with integers)
  


You know -- a lot of the professional filtering houses that started
with (or still use) Declude adjusted their scales up to 100 or higher
in order to give more room for fine adjustments. When we were
developing MDLP we preferred that as well. The choice of scale is a
matter of opinion and application -- and in a weight driven system it's
always up for adjustment as every weight interacts with every other
weight.


Best,

_M

-- 
President
MicroNeil Research Corporation
www.microneil.com



