Hello Justin,

Tuesday, August 24, 2004, 7:02:00 PM, you wrote:

JM> BTW, I think you guys (SARE that is) are using mass-check to measure
JM> accuracy, right?

Yes.  Our mass-check runs are more manual than the nightly runs used by
the development team, but we hope to automate them further...

JM> That's the main issue that we had in the past with "external"
JM> rulesets -- most of those were developed without measuring accuracy,
JM> and once tested they don't come out too hot.  But from what I can see
JM> (from outside) it looks like you all have been doing that for a
JM> while, which is cool.

Yes.  We post the rules to each other, run them through two or more
(usually three) corpora, and use the combined results to determine
whether rules are viable. (We're hoping to add a fourth corpus soon.)

Our notion of "viable" is less strict than the development team's (we
use lower thresholds), but the basic philosophies are the same, I
think.
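
In rough terms, the check works something like the sketch below
(Python; the CorpusResult shape, the threshold numbers, and the sample
counts are illustrative placeholders, not our actual criteria):

    from dataclasses import dataclass

    @dataclass
    class CorpusResult:
        spam_hits: int   # spam messages the rule matched in this corpus
        spam_total: int
        ham_hits: int    # ham messages it matched (false positives)
        ham_total: int

    def is_viable(results, min_spam_rate=0.005, max_ham_rate=0.0005):
        """Pool the per-corpus counts and apply simple thresholds:
        the rule must hit enough spam overall while staying under a
        false-positive ceiling on ham."""
        spam_hits = sum(r.spam_hits for r in results)
        spam_total = sum(r.spam_total for r in results)
        ham_hits = sum(r.ham_hits for r in results)
        ham_total = sum(r.ham_total for r in results)
        if spam_total == 0 or ham_total == 0:
            return False  # can't judge a rule with no corpus data
        return (spam_hits / spam_total >= min_spam_rate
                and ham_hits / ham_total <= max_ham_rate)

    # Three corpora, as in our usual runs:
    runs = [CorpusResult(120, 10000, 0, 8000),
            CorpusResult(95, 12000, 1, 9000),
            CorpusResult(210, 15000, 0, 11000)]
    print(is_viable(runs))   # True for these made-up numbers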

JM> (BTW I should qualify what Daniel means by "non-heavyweight" -- in
JM> other words, the rule doesn't greatly affect speed/RAM usage.  I
JM> think that's what he means at least.)

Also important to us.  My system, for instance, runs a comprehensive
mass-check on anywhere from a single rule to dozens of rules in about
half an hour. If any rule causes a noticeable jump in that time, we
either fix it or toss it.

(I can't really measure RAM usage on my system, but the same concern
applies.)
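
The timing side of that test could look something like this (again
just a sketch: run_scan stands in for whatever actually drives the
mass-check, and the 10% tolerance is a made-up number):

    import time

    def timed(fn, *args):
        """Return how long fn(*args) takes, in seconds."""
        start = time.perf_counter()
        fn(*args)
        return time.perf_counter() - start

    def causes_noticeable_jump(run_scan, corpus, rule, tolerance=0.10):
        """Time a baseline pass over the corpus and a pass with the
        candidate rule enabled; flag the rule if it adds more than
        tolerance (as a fraction) to the baseline time."""
        baseline = timed(run_scan, corpus, [])
        with_rule = timed(run_scan, corpus, [rule])
        return (with_rule - baseline) / baseline > tolerance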

We've also been trying, to some extent, to document each rule's
history, so we know whether it came from a CLA member or from
elsewhere. We're discussing ways of making that more formal.

JM> If we can work something out, that'll be great ;)

We're all agreed about that.  I'm hopeful we can.

Bob Menschel


