Re: Does spamc unwrap spam reports?

2017-01-05 Thread @lbutlr
On Dec 28, 2016, at 3:01 AM, Lukas Erlacher wrote:
> I'm calling "spamc --learntype=spam/ham" from a script, passing in emails 
> fetched from imap (I'm using ISBG with --learnspambox / --learnhambox and 
> --spamc actually).

Why are you calling spamc instead of sa-learn?

-- 
Apple broke AppleScripting signatures in Mail.app, so no random signatures.




Re: Does spamc unwrap spam reports?

2017-01-03 Thread RW
On Tue, 3 Jan 2017 10:39:52 +0100
Lukas Erlacher wrote:

> On 12/28/2016 03:12 PM, RW wrote:
> >
> > It's done in spamd. Don't attempt to remove X-Spam-* headers
> > yourself or it won't attempt to remove the mime encapsulation.
> >  
> 
> I'd like to convince myself of that... I ran `sudo -u debian-spamd
> spamc -c < spamspam.eml` on a mail that has spamlevel 14.4 and is
> encapsulated in a spam report. It gave 2.7/7.0... which I suppose is
> ok because it's an assessment of the spamminess of the whole mail.
> But that doesn't convince me...
> 
> How do I convince myself that it'll actually use the text of the 
> original spam mail to update the bayesian db?
 
You could try what I just did.

Edit a spam report and put made-up words at the beginning of each
section, train it as spam and then put the made-up words through
spamassassin -D bayes.


printf "\n\n Lhjkl  Ohjkl  Ihjkl \n" | spamassassin -D bayes
...
dbg: bayes: token 'ihjkl' => 0.986543689320388

Ihjkl was in the correct mime section at training - in the body of the
embedded spam.

What I don't get, though, is why there isn't a case-sensitive token
"Ihjkl".
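
The check above can be scripted. What follows is a minimal sketch, assuming the debug lines look like the one quoted in this message; bayes_tokens is a made-up helper name, not something SpamAssassin ships.

```shell
#!/bin/sh
# Sketch only. bayes_tokens is a hypothetical helper, not an SA command.
# It reads "spamassassin -D bayes" debug output on stdin and prints
# "token probability" for each line of the form
#   dbg: bayes: token 'ihjkl' => 0.986543689320388
bayes_tokens() {
    sed -n "s/.*bayes: token '\(.*\)' => \([0-9.e-]*\).*/\1 \2/p"
}

# Usage (debug output normally goes to stderr, so swap it onto the pipe):
#   printf "\n\n Lhjkl  Ohjkl  Ihjkl \n" \
#       | spamassassin -D bayes 2>&1 >/dev/null | bayes_tokens
```

A made-up word that was never trained should not show up with a high probability, so seeing it listed after training suggests the embedded text was learned. Look for the lowercase form, as in the output above.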


Re: Does spamc unwrap spam reports?

2017-01-03 Thread Lukas Erlacher

On 12/28/2016 03:12 PM, RW wrote:

> It's done in spamd. Don't attempt to remove X-Spam-* headers yourself
> or it won't attempt to remove the mime encapsulation.



I'd like to convince myself of that... I ran `sudo -u debian-spamd spamc 
-c < spamspam.eml` on a mail that has spamlevel 14.4 and is encapsulated 
in a spam report. It gave 2.7/7.0... which I suppose is ok because it's 
an assessment of the spamminess of the whole mail. But that doesn't 
convince me...


How do I convince myself that it'll actually use the text of the 
original spam mail to update the bayesian db?


Best,
Luke





Re: Does spamc unwrap spam reports?

2016-12-28 Thread RW
On Wed, 28 Dec 2016 11:01:05 +0100
Lukas Erlacher wrote:

> Hello,
> 
> https://wiki.apache.org/spamassassin/BayesInSpamAssassin says:
> 
> > It's OK to feed emails with Spamassassin markup into the sa-learn
> > command -- sa-learn will ignore any standard Spamassassin headers,
> > and if the original email has been encapsulated into an attachment
> > it will decapsulate the email. In other words sa-learn will undo
> > any changes which Spamassassin has done before learning the
> > spam/ham character of the email.  
> 
> I haven't found any documentation that specifies this for spamc/spamd.
> 
> I'm calling "spamc --learntype=spam/ham" from a script, passing in 
> emails fetched from imap (I'm using ISBG with --learnspambox / 
> --learnhambox and --spamc actually).
> 
> So, will spamc perform the same sanitization / unwrapping of messages 
> that were already processed by spamassassin that sa-learn does?

It's done in spamd. Don't attempt to remove X-Spam-* headers yourself
or it won't attempt to remove the mime encapsulation.
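
One way to act on this advice in a learning script is to warn when a message no longer carries any X-Spam-* header before it is handed to spamc. This is only a sketch: has_sa_headers and learn_spam are hypothetical names, and the check is a precaution in the calling script, not a description of what spamd tests internally.

```shell
#!/bin/sh
# Sketch only. Both function names are made up for illustration.

# Succeed if the header block of the message in file $1 still contains
# at least one X-Spam-* header. The headers end at the first empty line.
has_sa_headers() {
    awk '
        $0 == "" || $0 == "\r"   { exit }             # end of headers
        tolower($0) ~ /^x-spam-/ { found = 1; exit }
        END                      { exit !found }
    ' "$1"
}

# Hypothetical wrapper: warn if the SA headers are gone, then learn.
learn_spam() {
    has_sa_headers "$1" ||
        echo "warning: $1 has no X-Spam-* headers" >&2
    spamc --learntype=spam < "$1"
}
```

The warning does not stop the learning step; it just flags messages where something upstream may have stripped the headers.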


Re: Does spamc unwrap spam reports?

2016-12-28 Thread Martin Gregorie
On Wed, 2016-12-28 at 11:01 +0100, Lukas Erlacher wrote:
> I haven't found any documentation that specifies this for
> spamc/spamd.
> 
I don't think that passing an email to SA via spamc makes any attempt
to strip pre-existing SA headers, but there's an easy way to check:

Find any message that has already been scanned by SA and has the SA
headers in place. Use spamc to process it through SA again and examine
the message text returned by SA:

    spamc temp.txt
    mv temp.txt "$1"
}

if [ $# -gt 0 ]
then
    for f in "$@"
    do
        clean "$f"
    done
else
    for f in data/*.txt
    do
        clean "$f"
    done
fi

Some recent Linuxes have dropped 'gawk' from their command list. If
yours is one of them, replace 'gawk' with 'awk' in the script. Awk is a
very fast text processing tool, so this script is pretty rapid too. 

[*] 'at least some' because if the configuration is different between
the first and subsequent scans then different sets of SA headers will
have been included each time.
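
The "do the SA headers appear twice" check can also be done by counting rather than by eye. A minimal sketch follows; count_header is a hypothetical helper and the file names in the usage comment are placeholders.

```shell
#!/bin/sh
# Sketch only. count_header is a made-up helper name.
# Print how many times header $2 occurs in the header block of the
# message in file $1. Counting stops at the first empty line, so
# matching text in the body is ignored.
count_header() {
    awk -v name="$2" '
        BEGIN { prefix = tolower(name) ":"; n = 0 }
        $0 == "" || $0 == "\r"          { exit }   # end of headers
        index(tolower($0), prefix) == 1 { n++ }
        END   { print n }
    ' "$1"
}

# Usage, with an already-scanned message in scanned.eml:
#   spamc < scanned.eml > rescanned.eml
#   count_header rescanned.eml X-Spam-Status
```

A count above 1 would mean the earlier headers were left in place; a count of 1 would suggest they were stripped before the new ones were added.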


Martin



Does spamc unwrap spam reports?

2016-12-28 Thread Lukas Erlacher

Hello,

https://wiki.apache.org/spamassassin/BayesInSpamAssassin says:


> It's OK to feed emails with Spamassassin markup into the sa-learn
> command -- sa-learn will ignore any standard Spamassassin headers,
> and if the original email has been encapsulated into an attachment
> it will decapsulate the email. In other words sa-learn will undo
> any changes which Spamassassin has done before learning the
> spam/ham character of the email.


I haven't found any documentation that specifies this for spamc/spamd.

I'm calling "spamc --learntype=spam/ham" from a script, passing in 
emails fetched from imap (I'm using ISBG with --learnspambox / 
--learnhambox and --spamc actually).


So, will spamc perform the same sanitization / unwrapping of messages 
that were already processed by spamassassin that sa-learn does?


Thanks!

Best regards,
Luke


