On 07/04/2012 04:40 PM, John Hardin wrote:
On Wed, 4 Jul 2012, Axb wrote:
From the last update's 72_scores.cf:
score HDRS_LCASE 3.749 3.999 3.749 3.999
score MANY_HDRS_LCASE 1.251 1.004 1.251 1.004
Although John manually set low scores in the sandbox file, these are
ignored (by design).
They are _limits_. The generator should not exceed those scores. The
newly limited scores may take a bit to show up in an update.
I'll watch those scores closely.
Fixed/forced scores should be set via 73_sandbox_manual_scores.cf and
not in sandbox files.
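
For illustration only, a forced score in that file might look like the
sketch below. The values are hypothetical placeholders, not what is or
should be committed. The only part taken from the SpamAssassin
documentation is the `score` syntax: a single value applies to all four
score sets, while four values map to the sets in the order no Bayes/no
net, no Bayes/net, Bayes/no net, Bayes/net.

```
# 73_sandbox_manual_scores.cf (sketch; all values are hypothetical)

# One value: used for all four score sets.
score HDRS_LCASE       0.100

# Four values: score sets 0-3, i.e.
# no Bayes/no net, no Bayes/net, Bayes/no net, Bayes/net.
score MANY_HDRS_LCASE  0.100 0.100 0.100 0.100
```
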
The rules carry the comment:
# observed in UCE 9/2009
As they are hitting lots of ham, can we please lose these?
HDRS_LCASE_IMGONLY may be another candidate to be dropped.
Alex, I don't recall if you're running masschecks; if you are, can you
include such FPs in your ham corpus? The reason they're being scored so
highly by the rescorer is they do perform well against the masscheck
corpus.
I am running masschecks, but I see these hits on msgs (maillog)
gatewayed thru $dayjob's boxes, not on stuff stored locally.
I understand that a lot of rules may perform well in masschecks, but
overall, generic patterns should be dropped if real-world traffic
shows they're dangerous.
IMO, we should be able to trust our traffic & judgement more than
masscheck corpora, which may be highly biased.
Axb