[EMAIL PROTECTED] (Justin Mason) writes:

> Yes, I agree -- this is the problem with older ham.  (esp. the SPF
> problem.  SPF is very brittle on this point.)
> 
> How's about putting stricter limits on the net check corpora?

Well, do we really want to use an extra 6 months of mail on only one of
the runs?  I think it would be better to use more or less the same data.
 
> I would suggest though that Malte's point is also valid -- some "special
> case" reported FP mails should be kept in the ham corpus, if they really
> are special cases that the submitter is worried about.

Yes, I *am* keeping my non-SpamAssassin-list spam-related mail in the
corpus.  The main reason to remove the SpamAssassin list mail is that
keeping it would totally bias the corpus; I'm sure we'll get more than
enough FPs for iffy rules from our everyday mail.
 
> And the ham?  I'm +1 on keeping ham bounces.

Agreed, I am keeping ham bounces.
 
Daniel

-- 
Daniel Quinlan
http://www.pathname.com/~quinlan/
