Justin Mason wrote:

Marc Perkel writes:
  
Continuing my experiments with a second Bayesian filter - using 
spamprobe and controlling the tokens myself - and using SA to score the 
output.

So - I noticed that spam and ham often have different header fields. 
Some headers only show up in ham - and some headers only show up in 
spam. So I tokenized the headers themselves, feeding in just the header 
names as data, and got some really good results.

So - I don't know if SA is doing this, but tokenizing the header names 
(excluding the common ones that every message has) is very effective.
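
A rough sketch of the header-name idea in Python, for illustration only. 
It uses the standard library email parser. The COMMON_HEADERS list, the 
"hdr:" token prefix and the function name header_name_tokens are all 
assumptions made up for this example - they are not what spamprobe or SA 
actually use, and how the tokens get handed to the filter is left out.

```python
from email import message_from_string

# Hypothetical list of headers that nearly every message carries; they
# add little signal, so they are skipped. Tune this for your own mail.
COMMON_HEADERS = {
    "from", "to", "subject", "date", "message-id", "received",
    "return-path", "mime-version", "content-type",
}


def header_name_tokens(raw_message, common=COMMON_HEADERS):
    """Return one token per distinct header name, skipping common ones."""
    msg = message_from_string(raw_message)
    # Header names are case-insensitive, so normalize before comparing.
    names = {name.lower() for name in msg.keys()}
    # The "hdr:" prefix (an assumption) keeps these tokens apart from
    # ordinary body-word tokens in the same database.
    return sorted("hdr:" + name for name in names if name not in common)


sample = (
    "From: a@example.com\n"
    "To: b@example.com\n"
    "Subject: hi\n"
    "X-Mailer: ExampleMailer 1.0\n"
    "List-Unsubscribe: <mailto:leave@example.com>\n"
    "\n"
    "body\n"
)

print(header_name_tokens(sample))
# ['hdr:list-unsubscribe', 'hdr:x-mailer']
```

Only the names are kept, not the values, so a header counts as evidence 
simply by being present in a message.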
    

yes, we do that.


  
Great minds think alike .....

-- 
Marc Perkel - [EMAIL PROTECTED]

Spam Filter: http://www.junkemailfilter.com
    My Blog: http://marc.perkel.com
My Religion: http://www.churchofreality.org
~ "If it's real - we believe in it!" ~
