Re: Bayes autolearn: how does it resolve whether rules are body or header related?

2021-05-10 Thread RW
On Mon, 10 May 2021 20:39:31 +0200
Bert Van de Poel wrote:


> Based on what I've read, I agree that this is indeed a bug (or
> actually several). I've filed the following bug reports:
> https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7904 (missing body 
> types, as mentioned by RW)
> https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7905 (meta
> tflags=net tests are ignored)
> https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7906 (meta 
> tflags!=net tests are always header tests)
> https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7907 (better
> support for meta tests in autolearning in general, with 2 possible
> solutions)
> 
> Thank you very much to RW and Matus Uhlar for helping me figure out
> what code to look at and for all three of you to confirm that this is
> clearly a set of bugs.


I don't agree that they are bugs. I think it would be useful to add
missing body types, but I don't think the rest is hugely wrong, and
it's not sensible for anyone to spend a lot of time on it, particularly
when it is so easy to turn off the 3+3 test selectively with
autolearn_force.

Net meta rules usually contain scored net eval rules so it's sensible
to ignore them. Treating meta rules as header points seems to be erring
on the right side. There's a case for ignoring meta rules altogether.
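
As a rough illustration of the behaviour described in this thread, here
is a minimal Python sketch. It is not SpamAssassin's actual Perl code.
The Hit class, the simplified type labels ("header", "body", "meta") and
the function names are assumptions made for this example. It models only
what is stated here: meta rules marked net are ignored, other meta rules
count as header points, and a hit rule carrying autolearn_force switches
off the 3+3 test.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One rule hit (hypothetical structure for this sketch)."""
    name: str
    rule_type: str              # "header", "body" or "meta" (simplified)
    score: float
    tflags: frozenset = frozenset()


def tally(hits):
    """Return (header_points, body_points) for a list of hits."""
    header = 0.0
    body = 0.0
    for hit in hits:
        if hit.rule_type == "meta":
            if "net" in hit.tflags:
                continue            # net meta rules are ignored
            header += hit.score     # other meta rules count as header points
        elif hit.rule_type == "header":
            header += hit.score
        elif hit.rule_type == "body":
            body += hit.score
    return header, body


def passes_3_plus_3(hits, min_per_type=3.0):
    """Model only the 3+3 test, not the overall autolearn threshold."""
    if any("autolearn_force" in hit.tflags for hit in hits):
        return True                 # 3+3 test switched off by the flag
    header, body = tally(hits)
    return header >= min_per_type and body >= min_per_type
```

In a real configuration the flag is set per rule with a line of the form
"tflags RULENAME autolearn_force". The overall autolearn score threshold
still applies; the sketch covers the header/body split only.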

Autolearning is something that's best avoided if at all possible.
Erring on the side of avoiding mistraining is a good thing.


bayes stopwords.cf missing ifplugin

2021-05-10 Thread Benny Pedersen



ups


Re: Bayes autolearn: how does it resolve whether rules are body or header related?

2021-05-10 Thread Bert Van de Poel

Dear Loren,

Thank you very much for your email. Based on your message I could deduce 
there were earlier messages (which I then read through a web archive). 
For some unexplained reason I never received the previous 3 responses to 
my email. I hope the university network isn't randomly over-filtering 
spam again (we've had those kinds of problems for a while now; it's 
quite a problem, and we are much more careful about how we mark spam).


Based on what I've read, I agree that this is indeed a bug (or actually 
several). I've filed the following bug reports:
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7904 (missing body 
types, as mentioned by RW)
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7905 (meta tflags=net 
tests are ignored)
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7906 (meta 
tflags!=net tests are always header tests)
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7907 (better support 
for meta tests in autolearning in general, with 2 possible solutions)


Thank you very much to RW and Matus Uhlar for helping me figure out what 
code to look at and for all three of you to confirm that this is clearly 
a set of bugs.


Feel free to file more bugs if you consider there are more based on my 
issue, as well as to give support, write suggestions or submit patches 
on the bugs I have already filed.


Kind regards,
Bert Van de Poel

On 10/05/2021 06:41, Loren Wilton wrote:

so you don't have points from body rules.

your mentioned URI_DEOBFU_INSTR is a meta rule:

meta URI_DEOBFU_INSTR __URI_DEOBFU_INSTR && !__MSGID_OK_HOST

so maybe it's not considered.


They are treated as header, or ignored if marked as net.


I think a bug report should be submitted for this.

Either they should be treated split 50/50 as header and body score, or 
when the metas are built they should have a "body rule" flag, and that 
should be used to determine where the score goes.
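
The two alternatives can be sketched in a few lines of Python. This is
only an illustration of the proposal, not existing SpamAssassin code.
Both function names and the is_body_rule argument are hypothetical, and
how such a flag would be derived when a meta rule is built is left open
here.

```python
def split_evenly(score):
    """Option 1: credit half of a meta rule's score to each side.

    Returns (header_points, body_points).
    """
    half = score / 2
    return half, half


def by_body_flag(score, is_body_rule):
    """Option 2: credit the whole score according to a "body rule" flag.

    Returns (header_points, body_points).
    """
    if is_body_rule:
        return 0.0, score
    return score, 0.0
```

With either option a meta rule built purely from body sub-rules would
contribute body points instead of counting only as header points.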


I tried, but for some reason apache decided that I'm evil and blocked 
the submission attempt, so someone else can do it.


   Loren