http://bugzilla.spamassassin.org/show_bug.cgi?id=3096
------- Additional Comments From [EMAIL PROTECTED] 2004-03-28 20:49 -------

+1 on duncan's latest comments.

'> I propose the following format:
> <manual class> <result class> <score> <id> <rules> <value pairs>
> where, for our current code, <manual class> is:
> "spam" | "ham" | "none"
> and <result class> is:
> "spam" | "ham"

I'd prefer that we stick with single characters, since that is what ArchiveIterator does. (It passes "s" or "h" around instead of "spam" or "ham".) Furthermore, having it fixed width is a good thing imho.'

What about:

<manual class><result class> <score> <id> <rules> <value pairs>

with one-letter classes. That gives us:

hh: manually ham, classed as ham
hs: false positive
sh: false negative
ss: manually spam, classed as spam

That's handy because
(a) it's closer to what the academic lit uses (TCR calculation in particular uses just those classes with pretty much that nomenclature in its computation),
(b) it's very logical and obvious,
(c) it fits in 2 bytes, so fixed width,
(d) it fits in one non-whitespace "token", so very little script modification will be required in rule-qa et al. where /\S+\s+\S+etc./ is used.

The "no manual classification" type would then be:

us: unknown, marked as spam
uh: unknown, marked as ham

like this:

hh 0 ...path... RULES bayes=0.001
hh 0 ...path... RULES bayes=0.001
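
For illustration only, here is a minimal sketch of how a script might split a line in the proposed format. It is written in Python rather than taken from rule-qa or mass-check, and every name in it (LINE_RE, Result, parse_line, outcome) is hypothetical. It also assumes two things the comment does not state: that <rules> is a single comma-separated token, and that <value pairs> are whitespace-separated key=value items, as the bayes=0.001 example suggests.

```python
import re
from typing import NamedTuple

# Hypothetical sketch, not code from SpamAssassin's scripts.
# First token: manual class (h, s, or u for unknown) immediately
# followed by result class (h or s), so it is always two characters.
LINE_RE = re.compile(r"^([hsu])([hs])\s+(\S+)\s+(\S+)\s+(\S+)(?:\s+(.*))?$")

# Labels taken from the list in the comment above.
LABELS = {
    "hh": "manually ham, classed as ham",
    "hs": "false positive",
    "sh": "false negative",
    "ss": "manually spam, classed as spam",
}


class Result(NamedTuple):
    manual: str   # "h", "s" or "u"
    result: str   # "h" or "s"
    score: float
    msg_id: str
    rules: list
    pairs: dict


def parse_line(line):
    """Parse one result line; return None if it does not match the format."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    manual, result, score, msg_id, rules, rest = m.groups()
    # Assumption: value pairs look like key=value and are space-separated.
    pairs = dict(item.split("=", 1) for item in (rest or "").split() if "=" in item)
    # Assumption: rule names are comma-separated within one token.
    return Result(manual, result, float(score), msg_id, rules.split(","), pairs)


def outcome(r):
    """Map the two-letter class to a label; 'u' has no ground truth to compare."""
    if r.manual == "u":
        return "no manual classification"
    return LABELS[r.manual + r.result]


if __name__ == "__main__":
    r = parse_line("hh 0 ...path... RULES bayes=0.001")
    print(r.manual + r.result, outcome(r), r.pairs)
```

Because the two classes share one token, the field count is one lower than in the space-separated proposal, which is the point (d) above is making about existing /\S+\s+\S+/-style matching.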
