[EMAIL PROTECTED] (Justin Mason) writes:

> Bear in mind, the TCR figure that's output to the user in
> "fp-fn-statistics" output is mostly useful to compare against
> published algorithms, since it's the de-facto std of effectiveness
> in the academic lit on spam-filtering.
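For readers who haven't seen it: TCR (total cost ratio) in the academic papers being referenced is defined as TCR = N_spam / (lambda * FP + FN), where a filter only beats the do-nothing baseline when TCR > 1. A minimal sketch (my own function and variable names, not SpamAssassin code) showing how strongly lambda dominates the figure:

```python
# Sketch of the TCR (total cost ratio) metric from the spam-filtering
# literature: the cost of filtering nothing (all spam delivered)
# divided by the lambda-weighted cost of the filter's mistakes.
# TCR > 1 means the filter is better than no filter at all.

def tcr(n_spam: int, n_fp: int, n_fn: int, lam: float) -> float:
    """Total cost ratio for a run over n_spam spam messages,
    with n_fp false positives and n_fn false negatives."""
    return n_spam / (lam * n_fp + n_fn)

# Identical error counts, very different TCR depending on lambda:
# 1000 spams, 2 false positives, 50 false negatives.
for lam in (1, 5, 9, 100):
    print(f"lambda={lam:3}: TCR={tcr(1000, 2, 50, lam):.2f}")
```

This is why comparisons across papers are shaky: the same confusion matrix yields a TCR of roughly 19 at lambda=1 but only 4 at lambda=100.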
Erm, but everyone uses different lambdas and different corpora, so I'm
not sure how often this type of comparison is actually possible.

> But we shouldn't use it ourselves internally as an effectiveness
> metric, because I don't think it's trustworthy (see below).
>
> To remind us what they represent in Ion's papers:
>
>   lambda=1:   filing into a "spam" folder
>   lambda=9:   bouncing back to sender saying "your mail was spam"
>   lambda=100: silent disposal
>
> We should really be at a lambda of 1, given that; but since
> SpamAssassin is also used in other systems (e.g. with a system-wide
> quarantine, unavailable to the end user), and because it was getting
> crazily-good efficiency figures (like TCR > 100) at l=1, I picked a
> compromise l=5.

I think the example mapping of policy to lambda number is wrong.
Clearly, 1 FP is not the same amount of pain as 1 FN when filing
probable spam into a "spam" folder.  lambda may be especially low if
only 75% of spam is being caught with a high FP rate and you have to
check your spam folder every day, but it's much higher once you get
to SA-level accuracy.

Maybe it shouldn't be considered at all when we're doing score
optimizer work.  Maybe a better metric is needed.

> IMO a better metric would be to pick a desired FP rate, and then use
> FN as a single-figure metric given that FP rate.  Or vice versa.
> Basically lock down a desired FP or FN rate and allow the perceptron
> to find its "best" rate for the other figure.

I agree with that.  The perceptron is not quite there, though.

Daniel

-- 
Daniel Quinlan
http://www.pathname.com/~quinlan/
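P.S. The fixed-FP-rate metric Justin proposes could be computed offline from per-message scores, even before the perceptron supports it directly. A sketch, assuming we have raw scores for a ham corpus and a spam corpus (function name and data are illustrative, not SpamAssassin code):

```python
# Sketch of "FN rate at a fixed FP budget": choose the score threshold
# that keeps the FP rate on the ham corpus at or below the target,
# then report the resulting FN rate on the spam corpus as the single
# figure of merit.  A message is classified as spam iff score > threshold.

def fn_at_fixed_fp(ham_scores, spam_scores, max_fp_rate):
    """Lowest FN rate achievable while keeping FP rate <= max_fp_rate."""
    ham = sorted(ham_scores)
    # Number of false positives the budget allows on this ham corpus.
    allowed = int(max_fp_rate * len(ham))
    # Threshold at the (allowed+1)-th largest ham score: at most
    # `allowed` ham messages can score strictly above it.
    threshold = ham[-(allowed + 1)]
    fn = sum(1 for s in spam_scores if s <= threshold)
    return fn / len(spam_scores)

# Toy scores: tightening the FP budget to zero pushes FN up.
ham = [0.1, 0.2, 0.3, 0.4, 0.9]
spam = [0.5, 0.6, 0.95, 0.3]
print(fn_at_fixed_fp(ham, spam, 0.2))  # one FP allowed
print(fn_at_fixed_fp(ham, spam, 0.0))  # zero FPs allowed
```

The toy data makes the trade-off concrete: allowing one FP out of five hams misses only the 0.3-scored spam, while demanding zero FPs forces the threshold above the 0.9-scored ham and misses three of the four spams.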
