-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

mine were all discussions of spam :(  doh!  I'll have to remember -- *never*
mark spam discussions as ham, even if you can't spot a spamsign.

- --j.

Theo Van Dinter writes:
> On Wed, Nov 24, 2004 at 01:19:49AM -0500, Matt Kettler wrote:
> > Quite frankly, I suspect corpus pollution. It really only takes 1 high
> > scoring spam in the nonspam corpus to really screw up the message scores.
>
> That's quite possible.  I don't think anyone has 100% non-polluted corpus,
> though try we might. :(
>
> > 1) DRUGS_PAIN_OBFU actually hit some nonspam? I find that odd, but it could
> > be a typo.
>
> Looking at the submitted results:
>
> dave.log:. /home/dave/corpus/cooked-ham.43366468
> jm.log:. /home/jm/Mail/deld.priv/34675
> jm.log:. /home/jm/Mail/deld.priv/34682
> jm.log:. /home/jm/Mail/deld.priv/34699
> jm.log:. /home/jm/Mail/deld.priv/34703
> quinlan.log:. /home/corpus/mail/ham/166370
> quinlan.log:. /home/corpus/mail/ham/166400
> quinlan.log:. /home/corpus/mail/ham/166430
> quinlan.log:. /home/corpus/mail/ham/166437
>
> > 2) DRUGS_SMEAR1 hit some nonspam? I find that damn near impossible. I don't
> > think any nonspam email other than one quoting spam will ever hit that
> > rule. It seems there's one drug spam, or drug spam quote in somebody's
> > corpus, and it was run in all 4 sets. (If anyone can show me the nonspam
> > matching that rule and it's not spam or a spam quote or discussion of SA's
> > rules, I'll send em $20. Really.)
>
> jm.log:. /home/jm/Mail/deld.priv/26352
>
> > 4) NIGERIAN_BODY3? could be a finance newsletter, but very unlikely.
>
> That was mine:
>
> theo.log:Y ham/misc200405-200407.33861588
>
> Unfortunately I took those misc ham mboxes and converted them to dir
> format a while ago, so I don't know what message that was.
>
> > 6) PERCENT_RANDOM? Very unlikely. What would have %rnd_x in it?
>
> jm.log:. /home/jm/Mail/deld.pub/12701
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFBq8S+MJF5cimLx9ARAqIgAJ9cvW676a9p9lliRZwZIb79xDNnqwCgstps
ie+5pylFyumlfeFwt2kTRXA=
=cb4U
-----END PGP SIGNATURE-----