-------- Original Message --------
> Date: Mon, 23 Nov 2009 12:55:26 +0100
> From: coma <coma....@gm...>
> To: dspam-u...@li...
> Subject: [Dspam-user] Question about Graham Burton and Pvalue

> Hi,
>
Hello Coma,


> I would like to know what calculations Graham Burton uses to compute
> the probability and confidence.
>
> I looked at the source code and at the equation for the combined
> probability, but it is too hard for me :(
>
> I think the probability per token is calculated like this: if the word
> "viagra" (example) appears in 400 of 3000 spam messages and in 5 of the
> 300 legitimate messages, then its spam probability would be 0.8889
> (that is, [400 / 3000] divided by [5 / 300 + 400 / 3000])
>
The formula for computing the probability for Graham Burton can be found
here:
http://en.wikipedia.org/wiki/Bayesian_spam_filtering#Computing_the_probability_that_a_message_containing_a_given_word_is_spam

Since you are French, this version might be easier for you to read:
http://fr.wikipedia.org/wiki/Filtrage_bay%C3%A9sien_du_spam#Calculer_la_probabilit.C3.A9_qu.27un_message_contenant_un_mot_donn.C3.A9_soit_un_pourriel
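
As a rough illustration, the per-token calculation from your example can be
sketched in C like this. The function name token_spam_probability is
hypothetical and this is only the simple formula from the Wikipedia article,
not a copy of what DSPAM does internally. The 0.5 returned for a token that
was never seen is also just an assumption for the sketch.

```c
#include <assert.h>

/* Hypothetical helper (not a DSPAM function): spam probability of one token,
 * computed as spam_rate / (ham_rate + spam_rate). */
double token_spam_probability(long spam_hits, long spam_total,
                              long ham_hits, long ham_total)
{
    /* Both corpora must contain at least one message. */
    assert(spam_total > 0 && ham_total > 0);

    double spam_rate = (double)spam_hits / (double)spam_total;
    double ham_rate  = (double)ham_hits  / (double)ham_total;

    /* Assumption for this sketch: a token seen in neither corpus is neutral. */
    if (spam_rate + ham_rate == 0.0)
        return 0.5;

    return spam_rate / (ham_rate + spam_rate);
}
```

Calling token_spam_probability(400, 3000, 5, 300) reproduces your "viagra"
numbers: (400 / 3000) / (5 / 300 + 400 / 3000), which is about 0.8889.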


> But I do not know how the confidence is calculated.
>
The confidence for Graham Burton uses Robinson's Geometric Mean. The
formula for it can be found in libdspam.c in the function
_ds_calc_result(). The main part that you are looking for is this:
  /* Robinson's */
  if (rob_used == 0)
  {
    p = q = s = 0;
  }
  else
  {
    p = 1.0 - pow (rob_bot, 1.0 / rob_used);
    q = 1.0 - pow (rob_top, 1.0 / rob_used);
    s = (p - q) / (p + q);
    s = (s + 1.0) / 2.0;
  }



> And what exactly is PValue?
>
PValue is short for Probability Value. It is a configuration option in
dspam.conf that lets the operator of DSPAM choose which method is used
for computing probability values.


Steve



Thank you once again Steve =)


coma
_______________________________________________
Dspam-user mailing list
Dspam-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspam-user