Re: Spam and Ham have different headers - bayesian tricks

Justin Mason 14 Feb 2005 19:05:19 -0000

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Marc Perkel writes:
> Continuing with my experimenting with a second bayesian filter - using 
> spamprobe and controlling the tokens myself - and using SA to score the 
> output.
> 
> So - I noticed that spam and ham often have different header fields. 
> Some headers only show up in ham - and some headers only show up in 
> spam. So I tokenized the headers themselves and fed just the header 
> names in as data and got some really good results.
> 
> So - I don't know if SA is doing this, but tokenizing the header names
> (excluding the common ones that all messages have) is very effective.

yes, we do that.

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCEPZRMJF5cimLx9ARArWOAKCNCT7foX79+h06EFFiL3lQ0lZjVQCgrh97
VO71tbPWil5052pDSmyley4=
=1m7C
-----END PGP SIGNATURE-----
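As a rough illustration of the idea discussed above (feeding header names, not header values, to a Bayesian classifier), here is a minimal Python sketch. It is not SpamAssassin's or spamprobe's actual code. The function name header_name_tokens, the COMMON_HEADERS set, and the "hdr:" token prefix are all hypothetical choices made for this example; the list of "common" headers in particular is an assumption and would need tuning against a real corpus.

```python
from email import message_from_string

# Assumed set of header names that appear in nearly every message and so
# carry little signal. Hypothetical; adjust for your own mail.
COMMON_HEADERS = {"from", "to", "subject", "date", "message-id", "received"}


def header_name_tokens(raw_message, common=COMMON_HEADERS):
    """Return one token per distinct, non-common header name in the message."""
    msg = message_from_string(raw_message)
    # Header names are case-insensitive, so normalize to lowercase.
    # Using a set collapses repeated headers into a single token.
    names = {name.lower() for name in msg.keys()}
    # The "hdr:" prefix (hypothetical) keeps these tokens from colliding
    # with ordinary body-word tokens in the classifier's database.
    return sorted("hdr:" + name for name in names if name not in common)


sample = (
    "Received: from mail.example.org\n"
    "Received: from relay.example.net\n"
    "From: a@example.com\n"
    "To: b@example.com\n"
    "Subject: hello\n"
    "X-Mailer: ExampleMailer 1.0\n"
    "List-Id: <users.example.org>\n"
    "\n"
    "body text\n"
)

tokens = header_name_tokens(sample)
print(tokens)
```

The resulting tokens would then be passed to the classifier's training or scoring step alongside (or instead of) the usual body tokens. Emitting each name once per message, regardless of how many times the header repeats, is a design choice of this sketch, not something stated in the thread.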
