On 03/06/2018 12:54 PM, John Hardin wrote:
> On Tue, 6 Mar 2018, RW wrote:
>> On Tue, 6 Mar 2018 08:47:35 -0800 (PST)
>> John Hardin wrote:
>>> On Tue, 6 Mar 2018, David Jones wrote:
>>>> In this case these were really bad spam so the APOSTROPHE_TOCC is
>>>> just riding on the back of other rules, BLs, and high Bayes
>>> What I generally look at is the detailed rule performance in
>>> masscheck. If it primarily hits on spams that score in total 1-3
>> Why not under 5?
> If it's close to 5 and there's a limit, that suggests the limit could be
> increased a bit.
> It also needs to take into account the ham hits, which is why having a
> ham-starved corpus is such a problem.

Are you saying we have a ham-starved corpus?

             OVERALL     SPAM      HAM
ena-week0     77,945   36,459   41,486
ena-week1     93,847   52,781   41,066
ena-week2     69,297   30,328   38,969
ena-week3     75,853   31,995   43,858
ena-week4     92,680   37,511   55,169
TOTAL        409,622  189,074  220,548
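
For anyone who wants to check the ham/spam balance in those numbers, here is a minimal Python sketch. The counts are copied from the table above; the `corpus` dict and the `ham_share` helper are names made up for this illustration and are not part of masscheck or SpamAssassin.

```python
# Weekly (spam, ham) counts copied from the table above.
# "corpus" and "ham_share" are illustrative names, not masscheck tooling.
corpus = {
    "ena-week0": (36459, 41486),
    "ena-week1": (52781, 41066),
    "ena-week2": (30328, 38969),
    "ena-week3": (31995, 43858),
    "ena-week4": (37511, 55169),
}


def ham_share(spam, ham):
    """Return ham as a fraction of all messages (spam + ham)."""
    return ham / (spam + ham)


# Column totals, recomputed from the weekly rows.
total_spam = sum(spam for spam, _ in corpus.values())
total_ham = sum(ham for _, ham in corpus.values())

for name, (spam, ham) in corpus.items():
    print(f"{name}: {ham_share(spam, ham):.1%} ham")
print(f"total: {ham_share(total_spam, total_ham):.1%} ham")
```

The recomputed totals match the last row of the table, and ham comes out at a little over half of the messages overall, with ena-week1 the only week where spam outnumbers ham.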