-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Daniel Quinlan writes:
> Bob Menschel <[EMAIL PROTECTED]> writes:
> 
> > I can see the reason for most of Daniel's suggestions, and while I
> > think 12 months is too short a period for ham (I'd favor 18 or 24
> > months), I could live with that.
> 
> I might be able to live with 18, but I think we should stick with 12
> because of the network tests (which are on for 2 of the 3 mass-check
> runs if I recall correctly).  The problem is that you get more and more
> mail that is no longer representative of the current sender
> configuration: SPF negative, host no longer exists, IP address has
> changed, etc.

Yes, I agree -- this is the problem with older ham.  (esp. the SPF
problem.  SPF is very brittle on this point.)

How's about putting stricter limits on the net check corpora?
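
As a rough sketch of what that could look like (the 183-day net limit, the
table and the function name are all made up for illustration, and none of
this is in the mass-check scripts), a per-run age cutoff keyed on the Date
header might be:

```python
from datetime import datetime, timedelta, timezone
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical limits, not agreed policy: a tighter window for net-check runs.
MAX_AGE_DAYS = {"net": 183, "nonnet": 365}


def within_age_limit(raw_message, run_type, now):
    """Return True if the message's Date header is recent enough for run_type.

    `now` must be a timezone-aware datetime. Messages with a missing or
    unparseable Date header are rejected rather than guessed at.
    """
    msg = message_from_string(raw_message)
    date_header = msg.get("Date")
    if date_header is None:
        return False
    try:
        sent = parsedate_to_datetime(date_header)
    except (TypeError, ValueError):
        return False
    if sent.tzinfo is None:
        # A "-0000" zone parses as a naive datetime; treat it as UTC.
        sent = sent.replace(tzinfo=timezone.utc)
    return now - sent <= timedelta(days=MAX_AGE_DAYS[run_type])
```

The Date header is only as trustworthy as the sender, so this is a
first-pass filter at best.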

I would suggest though that Malte's point is also valid -- some "special
case" reported FP mails should be kept in the ham corpus, if they really
are special cases that the submitter is worried about.

> > Ham bounces (valid bounces of ham sent from our systems) are ham, and
> > should be in the ham corpus.  Spam bounces (blind bounces of spam sent
> > back to forged or faked from addresses) are spam, often containing the
> > content of the spam as well as the notification.
> 
> I agree those are spam, but since those can be addressed with techniques
> like envelope rewriting that are 100% reliable and non-probabilistic, I
> think we should just remove them.

And the ham?  I'm +1 on keeping ham bounces.

Spam bounces, however, I don't think should be used in the corpus at
all.

> >>> 5. no mailing list moderation administrative messages since these also
> >>>    contain spam
> > 
> > They also contain ham. If a system administrator can differentiate
> > between them, why shouldn't the spam messages be in a spam corpus, and
> > the ham messages in a ham corpus?
> 
> Moderators can't ignore either type of moderation message for a large
> proportion of mailing list software (especially mailman).  If anything,
> they should all be ham and I don't think we want to do that.  I think
> it's better to just remove them.

OK, I've come around to that view BTW.  +1

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFA2e1wQTcbUG5Y7woRApGIAJ96HbTdMromHvsVa/gH1BOev1FtvgCgtbDM
dngT9ZZmVyR1VUa1MKwgT9U=
=WjwV
-----END PGP SIGNATURE-----
