> baseline vs. x-lookup_ip: [. . .]
> false negative percentages
>     2.228  1.671  won    -25.00%
>     3.343  3.064  won     -8.35%
>     5.292  4.735  won    -10.53%
>     4.735  4.457  won     -5.87%
>     2.786  2.507  won    -10.01%
>
> won   5 times
> tied  0 times
> lost  0 times

I'm glad to see that. That's the sort of improvement that I see with
that code, but I think it's the first time that anyone else has
reproduced it.

Still, as people have pointed out before, there's at least one
potential problem in the code: data from DNS isn't necessarily stable.
If someone needed to un-train their database on a message a day or two
later, the tokens generated might easily not be the same as they were
when the message was first trained on. That could send a token's count
below zero.

That doesn't affect me in practice, but it would surely affect someone
if the code were used widely. Fixing it in general would require some
rather elaborate persistence mechanism, I think.

Regards,
Matt

_______________________________________________
spambayes-dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/spambayes-dev
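
To make the un-training problem concrete, here is a minimal sketch of one
possible persistence approach: remember the exact tokens each message
produced at training time, and un-train with those stored tokens instead
of re-tokenizing (and re-querying DNS). This is not the SpamBayes code;
the class name StableTrainer, its methods, and the message-id keying are
all hypothetical, and the store is kept in memory only for brevity.

```python
from collections import Counter


class StableTrainer:
    """Hypothetical wrapper, not part of SpamBayes.

    Records the token set used when a message was trained so that
    un-training removes exactly those tokens, even if a DNS-derived
    token would come out differently today.
    """

    def __init__(self, tokenize):
        self.tokenize = tokenize   # callable: message -> iterable of tokens
        self.counts = Counter()    # token -> number of trained messages with it
        self.trained = {}          # message id -> tokens used at train time

    def train(self, msg_id, message):
        if msg_id in self.trained:
            return                 # already trained; avoid double counting
        tokens = frozenset(self.tokenize(message))
        self.trained[msg_id] = tokens
        self.counts.update(tokens)

    def untrain(self, msg_id):
        # Use the stored tokens; do NOT call tokenize() again, since a
        # fresh DNS lookup could yield tokens that were never counted.
        tokens = self.trained.pop(msg_id)   # KeyError if never trained
        self.counts.subtract(tokens)
        for token in tokens:
            if self.counts[token] == 0:
                del self.counts[token]      # drop tokens no longer in use
```

Because un-training only ever subtracts what training added, no count can
drop below zero. A real version would need the per-message token sets to
survive restarts (on disk, for instance), which is presumably where the
"rather elaborate" part comes in.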
