Hello Justin,

Tuesday, August 24, 2004, 7:02:00 PM, you wrote:
JM> BTW, I think you guys (SARE that is) are using mass-check to measure
JM> accuracy, right?

Yes. Our mass-check runs are more manual than the nightly runs used by
the development team, but we hope to work on automating that.

JM> That's the main issue that we had in the past with "external"
JM> rulesets -- most of those were developed without measuring accuracy,
JM> and once tested they don't come out too hot. But from what I can see
JM> (from outside) it looks like you all have been doing that for a
JM> while, which is cool.

Yes. We post the rules to each other, run them through two or more
(usually three) corpora, and use the combined results to determine
whether the rules are viable. (We're hoping to add a fourth corpus
soon.) Our viability thresholds are less strict than the development
team's, but the basic philosophies are the same, I think.

JM> (BTW I should qualify what Daniel means by "non-heavyweight" -- in
JM> other words, the rule doesn't greatly affect speed/RAM usage. I
JM> think that's what he means at least.)

That's also important to us. My system, for instance, takes about half
an hour to run a comprehensive mass-check on anything from a single
rule to dozens of rules. If any rule causes a noticeable jump in that
run time, we either fix it or toss it. (I can't really measure RAM
usage on my system, but the same concern applies.)

We've also been trying, to some extent, to document each rule's
history, so we know whether it came from a CLA member or elsewhere.
We're discussing ways of making that more formal.

JM> If we can work something out, that'll be great ;)

We're all agreed about that. I'm hopeful we can.

Bob Menschel
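
PS: for list readers who haven't run mass-check by hand, a manual run
looks roughly like this. This is a minimal sketch, assuming the masses/
tools from a SpamAssassin source checkout; the corpus paths are
placeholders, and the flags are from memory, so check masses/README
before relying on them:

    cd masses
    # score every message in a ham and a spam corpus against the ruleset;
    # by default this writes per-message results to ham.log and spam.log
    ./mass-check ham:dir:/path/to/ham-corpus spam:dir:/path/to/spam-corpus
    # summarize per-rule hit rates from those logs (-x for extended output)
    ./hit-frequencies -x

The per-rule ham/spam hit percentages from hit-frequencies are what we
compare across corpora when deciding whether a rule is viable.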
