This is a crosspost from AmaVis-users. As it's something of a
cross-list issue, I'll pose the same question here.

Thanks,
Richard.

Greetings:

I know this may well be OT for the AmaVis list, but I was wondering if
there was any expertise here that I could draw on...

I'm using Amavis and SpamAssassin (obviously), and I would like to get
something set up to feed false negatives to sa-learn... Unfortunately
there's not much I can do with my mail client setup... The only thing I
can really do is "forward as attachment" when there's a false negative
(FN)...

The "forward as attachment" option produces messages that look like
this:

>From [EMAIL PROTECTED]  Tue May  4 13:09:21 2004
Return-Path: <[EMAIL PROTECTED]>
Received: from localhost (localhost [127.0.0.1])
        by whfirewall.nwtel.ca (8.12.11/8.12.9) with ESMTP id
i44K9LRp008313
        for <[EMAIL PROTECTED]>; Tue, 4 May 2004 13:09:21
-0700
Received: from whfirewall.nwtel.ca ([127.0.0.1])
 by localhost (whfirewall [127.0.0.1]) (amavisd-new, port 10024) with
LMTP
 id 07235-06 for <[EMAIL PROTECTED]>;
 Tue,  4 May 2004 13:09:19 -0700 (PDT)
Received: from hobbes.nwtel.ca (hobbes.nwtel.ca [172.16.96.89])
        by whfirewall.nwtel.ca (8.12.11/8.12.11) with ESMTP id
i44K99ea008300
        for <[EMAIL PROTECTED]>; Tue, 4 May 2004 13:09:09
-0700
Received: from WHTHYT-MTA by hobbes.nwtel.ca
        with Novell_GroupWise; Tue, 04 May 2004 13:09:09 -0700
Message-Id: <[EMAIL PROTECTED]>
X-Mailer: Novell GroupWise Internet Agent 6.5.1 
Date: Tue, 04 May 2004 13:08:41 -0700
From: "Richard Whittaker" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Subject: Fwd: anemone
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=__Part1E3FC459.0__="
X-Virus-Scanned: by amavisd-new 20030616-p9 and SA 2.63 at nwtel.ca
 
This is a MIME message. If you are reading this text, you may want to 
consider changing to a mail reader or gateway that understands how to 
properly handle MIME multipart messages.
 
--=__Part1E3FC459.0__=
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
 
 
 
Richard Whittaker, CISSP
Whitehorse Systems Manager,
IS Security Officer
NorthwesTel Inc.
 
--=__Part1E3FC459.0__=
Content-Type: message/rfc822
 
Return-path: <[EMAIL PROTECTED]>
Received: from whfirewall.nwtel.ca [192.168.90.253]
        by hobbes.nwtel.ca; Tue, 04 May 2004 10:52:06 -0700
Received: from localhost (localhost [127.0.0.1])
        by whfirewall.nwtel.ca (8.12.11/8.12.9) with ESMTP id
i44Hq6tT029961
        for <[EMAIL PROTECTED]>; Tue, 4 May 2004 10:52:06 -0700
Received: from whfirewall.nwtel.ca ([127.0.0.1])
 by localhost (whfirewall [127.0.0.1]) (amavisd-new, port 10024) with
LMTP
 id 29573-03-2; Tue,  4 May 2004 10:51:59 -0700 (PDT)
Received: from 199.85.228.1 ([219.234.169.251])
        by whfirewall.nwtel.ca (8.12.11/8.12.9) with SMTP id
i44HpBxh029825;
        Tue, 4 May 2004 10:51:17 -0700
X-Message-Info: 509FMR52245RND_UC_CHAR[1-3]io9/HHbewCikRD14cOCit919uVJ
Received: from plight ([142.200.25.110])
          by 730ns.hotbed.kinglet.dour.168.com
          (InterMail vM.4.00.91.80 724-8-1-033-1-12724) with ESMTP
          id
<[EMAIL PROTECTED]>
          for <[EMAIL PROTECTED]>; Tue, 04 May 2004 16:45:31 -0200
Message-ID: <[EMAIL PROTECTED]>
Reply-To: "Sanders" <[EMAIL PROTECTED]>
From: "Sanders" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Subject: anemone
Date: Tue, 04 May 2004 12:47:31 -0600
MIME-Version: 1.0
Content-Type: multipart/alternative;
        boundary="--71261322241384563126"
X-Virus-Scanned: by amavisd-new 20030616-p9 and SA 2.63 at nwtel.ca
X-Spam-Status: No, hits=-1.4 tagged_above=-999.0 required=2.0
tests=BAYES_01,
 BIZ_TLD
X-Spam-Level: 
 
----71261322241384563126
Content-Type: text/plain;
Content-Transfer-Encoding: 7Bit
 
....junk removed...
 
----71261322241384563126--
 
 
 
 
 
 
 
 
 
--=__Part1E3FC459.0__=--

When I run "sa-learn", I believe it is learning from the wrong part of
the message, which will taint my Bayesian DB...

[EMAIL PROTECTED]:/var/adm# su - amavis -c "sa-learn --spam -D -L --mbox /var/spool/mail/amavis"

...blah, blah, blah...

debug: Learning Spam
debug: uri tests: Done uriRE
debug: tokenize: header tokens for *p = "U*RWHITTAKER D*nwtel.ca D*ca"
debug: tokenize: header tokens for *m = " s09795f5 024 hobbes nwtel ca
"
debug: tokenize: header tokens for *x = "Novell GroupWise Internet
Agent 6.5.1 "
debug: tokenize: header tokens for *F = "U*RWHITTAKER D*nwtel.ca D*ca"
debug: tokenize: header tokens for To = "U*amavis D*whfirewall.nwtel.ca
D*nwtel.ca D*ca"
debug: tokenize: header tokens for Mime-Version = "1.0"
debug: tokenize: header tokens for *c = "multipart/mixed;   =__
PHrtHHHHHHHH . H __= "
debug: tokenize: header tokens for *r = "  WHTHYT-MTA by
hobbes.nwtel.ca   Novell_GroupWise; "
debug: tokenize: header tokens for *r = "  WHTHYT-MTA by
hobbes.nwtel.ca   Novell_GroupWise;    hobbes.nwtel.ca (hobbes.nwtel.ca
[172.16.96]) by whfirewall.nwtel.ca (8.12.11/8.12.11)        
<[EMAIL PROTECTED]>; "
debug: bayes: Learned '[EMAIL PROTECTED]'
Learned from 3 message(s) (3 message(s) examined).
debug: bayes: 8433 untie-ing
debug: bayes: 8433 untie-ing db_toks
debug: bayes: 8433 untie-ing db_seen
debug: bayes: files locked, now unlocking lock
debug: unlock: 8433 unlink /usr/share/bayes/.lock
[EMAIL PROTECTED]:/var/adm# ls -l | more

The learner is mis-identifying the message as coming from me (I
forwarded it, but as an attachment), so it is tokenizing the headers of
the forwarding wrapper rather than those of the spam itself...

Is there something I can do to pre-process these messages and strip
out the forwarding headers before sa-learn gets its hooks into
them?... Has anyone dealt with this before?... I've looked at the
folder/IMAP option, and it's not likely going to help me much...
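
To make the question concrete, here is a minimal sketch of the kind of
pre-processing I have in mind, assuming Python's standard "mailbox" and
"email" modules are available on the box. It walks each forwarded
message in an mbox, pulls out the message/rfc822 attachment(s), and
writes only those to a second mbox. The function names
(attached_messages, extract_to_mbox) are made up for illustration, and
it does not handle attachments that GroupWise might base64-encode; I
have not confirmed that this is the best way to do it.

```python
import mailbox
from email.message import Message


def attached_messages(msg: Message) -> list:
    """Return the messages carried as message/rfc822 parts of `msg`.

    The search stops at each attached message, so a message nested
    inside an already-extracted attachment is not returned twice.
    """
    if msg.get_content_type() == "message/rfc822":
        payload = msg.get_payload()
        # The parser stores an attached message as a one-element list.
        return list(payload) if isinstance(payload, list) else []
    found = []
    if msg.is_multipart():
        for part in msg.get_payload():
            found.extend(attached_messages(part))
    return found


def extract_to_mbox(src_path: str, dst_path: str) -> int:
    """Copy every attached message in the mbox at `src_path` into
    the mbox at `dst_path`. Returns the number of messages written."""
    src = mailbox.mbox(src_path, create=False)
    dst = mailbox.mbox(dst_path)
    count = 0
    try:
        for wrapper in src:
            for inner in attached_messages(wrapper):
                dst.add(inner)
                count += 1
        dst.flush()
    finally:
        dst.close()
        src.close()
    return count
```

The idea would be to call extract_to_mbox("/var/spool/mail/amavis",
"/tmp/spam-only.mbox") (the output path is just an example) and then
point "sa-learn --spam --mbox" at the new file instead of the original
spool, so that only the original spam's headers and body get tokenized.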

Regards,
Richard.


Richard Whittaker, CISSP
Whitehorse Systems Manager,
IS Security Officer
NorthwesTel Inc.
