Justin Mason <[EMAIL PROTECTED]> writes:

> sounds good to me. One thing - could you run some tests on the
> sampling so we can see how reliable it is, in terms of
> hit-frequencies? I'd like to get a "sanity check" on that, it's a key
> aspect.

Yes. I will do so. I'm going to compare auto-learning with
sample-learning using a similar percentage of messages learned. It
should be easy enough to get it to a similar error rate if it's too
good.

Another thing I want to do is make it deterministic instead of using
rand(100). If I change it to "learn 1 in N" instead of a percentage,
then I can easily do a mod on the md5sum of the id and/or date.

Daniel

-- 
Daniel Quinlan
http://www.pathname.com/~quinlan/
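
The deterministic "learn 1 in N" idea above could look roughly like the
following. This is only an illustrative sketch in Python, not the actual
SpamAssassin code: the function name should_learn is hypothetical, and
hashing only the message id (rather than the id and/or date) is an
assumption made to keep the example short.

```python
import hashlib


def should_learn(message_id: str, n: int) -> bool:
    """Deterministically select roughly 1 in n messages.

    Hypothetical helper: hashes the message id with MD5 and takes the
    digest modulo n, so the same id always gives the same answer.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    digest = hashlib.md5(message_id.encode("utf-8")).hexdigest()
    # Interpret the 128-bit digest as an integer and reduce it mod n.
    return int(digest, 16) % n == 0


if __name__ == "__main__":
    # Made-up ids, for illustration only.
    for mid in ("<1@example.com>", "<2@example.com>", "<3@example.com>"):
        print(mid, should_learn(mid, 4))
```

Because the decision depends only on the id, running over the same corpus
twice selects the same messages, which should make the hit-frequency
comparison repeatable in a way that rand(100) is not.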
