On Tue, 13 Apr 2010 23:59:12 -0700 (PDT)
john espiro <john_esp...@yahoo.com> wrote:

> I am trying to figure out the pros/cons of the following pvalues...
> 
> PValue robinson
> #PValue markov
> #PValue bcr
> 
> What should _I_ use, and is there any documentation?
> 
I would suggest using "bcr" if you are not using the Hash driver. If you are 
using the Hash driver, then use "markov".


PValue is the algorithm used for computing the probability. Available options 
are:
 bcr         Bayesian Chain Rule (Graham's Technique - "A Plan for Spam")
 robinson    Robinson's Technique (used in Chi-Square) 
 markov      Markovian Weighted Technique (for Markovian discrimination)
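
To make the difference between "bcr" and "robinson" a bit more concrete, here 
is a rough Python sketch of the two per-token probability formulas as they are 
described in Graham's and Robinson's articles. This is not DSPAM's actual code. 
The function names, the parameter names and the fallback for rare tokens are 
assumptions made for illustration, and both functions assume that total_spam 
and total_ham are greater than zero.

```python
def graham_pvalue(spam_hits, ham_hits, total_spam, total_ham):
    """Token probability in the style of Graham's "A Plan for Spam"."""
    good = 2 * ham_hits          # Graham doubles the innocent count
    bad = spam_hits
    if good + bad < 5:
        # Assumption: rarely seen tokens fall back to Graham's 0.4 value
        # for unknown words instead of being skipped.
        return 0.4
    b = min(1.0, bad / total_spam)
    g = min(1.0, good / total_ham)
    # Clamp so that a single token can never be fully decisive.
    return max(0.01, min(0.99, b / (g + b)))


def robinson_pvalue(spam_hits, ham_hits, total_spam, total_ham, s=1.0, x=0.5):
    """Token probability in the style of Robinson's f(w).

    s is the strength given to the background assumption and x is the
    assumed probability for a token that has never been seen.
    """
    n = spam_hits + ham_hits
    if n == 0:
        return x                 # no data at all: use the assumed value
    b = spam_hits / total_spam
    g = ham_hits / total_ham
    p = b / (b + g)              # raw spam probability of the token
    # Pull p toward x; the pull gets weaker as n grows.
    return (s * x + n * p) / (s + n)
```

The practical difference in this sketch: a token seen once, and only in spam, 
gets 0.75 from robinson_pvalue, while graham_pvalue treats it as too rare and 
returns the 0.4 fallback.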


You can use "markov" only with the Hash driver, and only together with 
Tokenizer SBPH. For more info about the "markov" algorithm, see this Wikipedia 
article: http://en.wikipedia.org/wiki/Markov_Random_Field

"robinson" is the one to use if you use "Chi-Square" as the algorithm in DSPAM. 
For more info about the "robinson" algorithm, see this article: 
http://www.linuxjournal.com/article/6467

For more info about "Chi-Square", see this Wikipedia article: 
http://en.wikipedia.org/wiki/Chi-square_statistic

More info about "bcr" can be found here: http://www.paulgraham.com/spam.html


> John
> 
-- 
Kind Regards from Switzerland,

Stevan Bajić

_______________________________________________
Dspam-user mailing list
Dspam-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspam-user
