-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Marc Perkel writes:
> Continuing with my experimenting with a second bayesian filter - using
> spamprobe and controlling the tokens myself - and using SA to score the
> output.
>
> So - I noticed that spam and ham often have different header fields.
> Some headers only show up in ham - and some headers only show up in
> spam. So I tokenized the headers themselves and fed just the header
> names in as data and got some really good results.
>
> So - I don't know if SA is doing this but tokenizing the header names
> (excluding the common ones that all headers have) is very effective.

yes, we do that.

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFCEPZRMJF5cimLx9ARArWOAKCNCT7foX79+h06EFFiL3lQ0lZjVQCgrh97
VO71tbPWil5052pDSmyley4=
=1m7C
-----END PGP SIGNATURE-----
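
For anyone following the thread, here is a rough sketch of the header-name idea described in the quoted message: take only the names of the header fields, drop the ones nearly every message carries, and hand the rest to a Bayesian classifier as tokens. This is not how SpamAssassin or spamprobe implement it. The COMMON_HEADERS list, the "HDR:" token prefix, and the header_name_tokens function are all made up for illustration.

```python
from email import message_from_string

# Assumption: an illustrative set of headers present in nearly all mail.
# A real filter would choose this list from its own corpus.
COMMON_HEADERS = {"from", "to", "subject", "date", "message-id", "received"}


def header_name_tokens(raw_message, common=COMMON_HEADERS):
    """Return one token per distinct, non-common header name.

    Header values are ignored on purpose; only the field names are used.
    The "HDR:" prefix is a hypothetical way to keep these tokens apart
    from ordinary body-word tokens.
    """
    msg = message_from_string(raw_message)
    # Header names are case-insensitive, so normalise before comparing.
    names = {name.lower() for name in msg.keys()}
    return sorted("HDR:" + name for name in names if name not in common)


sample = (
    "From: a@example.com\n"
    "To: b@example.com\n"
    "Subject: hello\n"
    "X-Mailer: ExampleMailer 1.0\n"
    "List-Id: <users.example.org>\n"
    "\n"
    "body text\n"
)

# One token per line or space-separated, whichever the classifier expects.
print(" ".join(header_name_tokens(sample)))
```

The resulting token string could then be fed to the second classifier in place of the message text. Each header name is counted once per message here, so a long chain of repeated headers does not dominate the score; whether that is the right choice depends on the filter.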
