-------- Original Message --------
> Date: Mon, 23 Nov 2009 12:55:26 +0100
> From: coma <coma....@gmail.com>
> To: dspam-user@lists.sourceforge.net
> Subject: [Dspam-user] Question about Graham Burton and Pvalue

> Hi,
>
Hello Coma,

> I am looking to know what calculations Graham Burton makes to calculate
> the probability and confidence.
>
> I looked at the source code and at the equation for the combined
> probability, but it's too hard for me :(
>
> I think the probability per token is calculated like this: if the word
> "viagra" (example) appears in 400 of 3,000 spam messages and in 5 of 300
> legitimate messages, then its spam probability would be 0.8889
> (that is, [400/3000] divided by [5/300 + 400/3000])
>
The formula for computing the probability for Graham Burton can be found here:
http://en.wikipedia.org/wiki/Bayesian_spam_filtering#Computing_the_probability_that_a_message_containing_a_given_word_is_spam

Since you are French, this version might be easier for you to read:
http://fr.wikipedia.org/wiki/Filtrage_bay%C3%A9sien_du_spam#Calculer_la_probabilit.C3.A9_qu.27un_message_contenant_un_mot_donn.C3.A9_soit_un_pourriel

> But for the confidence I do not know?
>
The confidence for Graham Burton uses Robinson's Geometric Mean. The formula is in libdspam.c, in the function _ds_calc_result(). The main part that you are looking for is this:

    /* Robinson's */
    if (rob_used == 0)
    {
      p = q = s = 0;
    }
    else
    {
      p = 1.0 - pow (rob_bot, 1.0 / rob_used);
      q = 1.0 - pow (rob_top, 1.0 / rob_used);
      s = (p - q) / (p + q);
      s = (s + 1.0) / 2.0;
    }

> And what is exactly Pvalue?
>
PValue is short for Probability Value. It is a configuration option in dspam.conf that lets the DSPAM operator choose which method is used to compute probability values.

> Thank you in advance once again,
>
> coma
>
Steve