From: "Matt Kettler" <[EMAIL PROTECTED]>

> jdow wrote:
>> And it is scored LESS than BAYES_95 by default. That's a clear signal
>> that the theory behind the scoring system is a little skewed and needs
>> some rethinking.
>
> No. It does not mean there's a problem with the scoring system. It
> means you're trying to apply a simple linear model to something which is
> inherently not linear, nor simple. This is a VERY common misconception.

I have a few more thoughts that are probably more "constructive" than
merely saying that the perceptron model is "obviously" wrong where the
rubber meets the road.

It seems to me that the observed operation of the perceptron is driving
spam scores toward the smallest margin above 5.0 that still manages to
capture most of the spam.

I've been operating here on a slightly different principle, at least
for my own rules. I work to drive scores away from 5.0, in both
directions as needed. If I see spam that used to score low now
consistently scoring above 8 or 10, I am pleased. When I see items
in the 5 to 10 range I figure out what I can do to drive them in the
correct direction, toward ham or spam. (Bayes is usually my choice of
action. I usually discover another email that has a mid-level Bayes
score rather than an extreme one. And I wish I could codify how I
choose to feed Bayes. I feed it almost on an intuitive level: "This
is Bayes food," or "Bayes already has a lot of this food and is
obviously a little confused for my mail mix." That's hardly a good
"rule" for feeding that I can pass on to people. <sigh>)

So rather than having the perceptron push all scores toward a
relatively smooth curve, it should work to push the overall score
profile into what one wag in an SF story called a "brassiere curve",
which is wonderfully descriptive when you think of some of the 50's
and 60's fashions. {^_-} If it can create a viable valley, with very
few messages scoring near 5.0 and as wide a gap between the ham peak
and the spam peak as possible, it may behave better.
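
If you want to see whether your own mail shows that valley, a rough
histogram can be scraped out of stored mail. A quick sketch, assuming
your messages carry X-Spam-Status headers and sit in mbox files with
the (made up) names below:

    grep -h 'X-Spam-Status:' ham.mbox spam.mbox \
      | sed -n 's/.*score=\(-\{0,1\}[0-9]*\)\..*/\1/p' \
      | sort -n | uniq -c

Each output line is a count followed by an integer score bucket. Two
lumps with a deep gap around 5 is the shape you want.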

THAT said, I note that I regularly use meta rules to generate some
modest negative scores as well as positive scores. This has had some
good side effects on the reliability of scoring here. I've noticed that
a small few of the SARE rules have, over time, decayed into being fairly
good indicators of ham rather than spam. Since SARE is more "agile"
than the basic SA rule sets, it might be good if the SARE people took
this as a tool for <choke> lift and separation of the ham and spam
peaks. It might be interesting to check whether the obverse of "in this
BL" is a decent indication of "not spam" and give that a modest bit
of negative score in some cases.
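
To illustrate the meta rule idea, here is the general shape of what I
mean. The rule name is made up and the component tests are merely
plausible stand-ins (and need their network plugins enabled), so treat
this as a sketch rather than something to paste in:

    # A message hitting none of these network tests gets a small ham nudge.
    meta     LOCAL_CLEAN_NETTESTS  !RCVD_IN_SORBS_WEB && !RAZOR2_CHECK && !URIBL_BLACK
    describe LOCAL_CLEAN_NETTESTS  Hit none of the usual network blocklist tests
    score    LOCAL_CLEAN_NETTESTS  -0.5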

For example, I just pulled RATWR10a_MESSID because it was hitting 13%
of my ham and only 4% of my spam. Perhaps I should have given it a very
small negative score instead. I note right now that SPF_PASS seems to
hit 50% (!) of ham and only 4% of spam. Perhaps it, too, should carry a
slight negative score to help widen the span between the ham peak
and the spam peak.
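
In config terms both of those come out as one-line score overrides;
the -0.3 is again illustrative:

    # Retire a rule outright by zeroing its score...
    score RATWR10a_MESSID 0
    # ...or turn a reliable ham sign into a modest negative score.
    score SPF_PASS -0.3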

It does seem clear to me that the objective is not to nudge spam just
past the minimum score that marks it as spam, so much as to create as
large a separation between typical ham and spam scores as possible. The
more reliable rules should carry larger negative and positive scores,
as appropriate.

And of course the final caveat is that I am running a two-person
install of SpamAssassin, with per-user rules and scores, and two fairly
intelligent (although some people question that about me) people running
their own user rules and Bayes. I also do not use automatic anything.
I cannot get over the idea that automatic whitelisting and automatic
learning are not necessarily stable UNTIL you have a very reliable Bayes
setup and set of rules built from manual training. I have that and still
cannot convince myself to "fix what isn't broken."
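
For anyone who wants to run the same way, the knobs I leave off amount
to two lines in local.cf, as far as I know (option names have shifted
between releases, so check the docs for your version):

    # No auto-learning into Bayes; all training is by hand via sa-learn.
    bayes_auto_learn 0
    # No automatic whitelist score averaging.
    use_auto_whitelist 0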

{^_^}   Joanne
