On Tue, 2009-10-27 at 19:28 -0400, Timo Sirainen wrote:
> On Tue, 2009-09-01 at 22:20 +0200, Karsten Bräckelmann wrote:
> > The mail that is being trained is different than its respective source
> > in the mbox file. The trained one shows added, trailing carriage-return
> > chars for all headers, which are not in the headers in the mbox file.
> >
> > This breaks sa-learn -- both these variations are different, and SA
> > would learn *both* when run against each one separately.
> >
> > How comes? Any insight?
>
> Probably because incoming mails have CRLF linefeeds. Antispam plugin
> could drop these by wrapping the mail_get_stream()'s returned input
> stream to i_stream_create_lf().

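To make the CRLF point concrete, here is a rough sketch in Python of the effect being described. It is not Dovecot code and does not use i_stream_create_lf(); the sample messages, the helper names normalize_to_lf() and digest(), and the use of SHA-1 as a stand-in for "are these two copies byte-identical?" are all assumptions made up for illustration.

```python
import hashlib


def normalize_to_lf(raw: bytes) -> bytes:
    """Replace every CRLF pair with a bare LF (hypothetical helper)."""
    return raw.replace(b"\r\n", b"\n")


def digest(raw: bytes) -> str:
    """Hex SHA-1 of the raw bytes, used here only to compare two copies."""
    return hashlib.sha1(raw).hexdigest()


# Made-up copy as it might sit in the mbox file: LF line endings only.
stored = b"From: a@example.org\nSubject: test\n\nbody\n"

# Made-up copy as handed to the trainer: headers end in CRLF.
trained = b"From: a@example.org\r\nSubject: test\r\n\r\nbody\n"

# Byte-wise the two copies differ, so anything keyed on the raw bytes
# treats them as two separate messages.
copies_differ = digest(stored) != digest(trained)

# After dropping the carriage returns the copies are identical again.
same_after_lf = normalize_to_lf(trained) == stored
```

With these inputs copies_differ and same_after_lf both come out true, which is the behaviour reported above: same mail, two byte-wise different variants, and stripping the CRs makes them match again.
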
I'm not sure this is what we want -- shouldn't we keep it as pristine as possible? However, I don't understand Karsten anyway: which message is "the trained one"?

Karsten, please list the three relevant messages: the one first handed to SA _before_ dovecot gets involved, the one stored, and the one handed to SA via antispam.

johannes
