Matt> It took some training for me before my SpamBayes started to
Matt> recognize those reliably, but it seems that my old hack to
Matt> tokenize URL's IPs helps:

> This doesn't seem to be in the code base. 'Zat so?

Yup! The patch is at:

http://www.mondoinfo.com/tokenizerpatch.txt

and the local cache I use it with is at:

http://www.mondoinfo.com/dnscache.py

There was some discussion of it here some time ago. It didn't seem to
help on historical corpora, perhaps because spammers don't maintain
their DNS for long. But on current spam it helps for me.

I haven't experimented with breaking the IP up at anything other than
byte boundaries. I also haven't looked at the related issue of whether
four tokens for an exact match is optimal.

Regards,
Matt
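
For readers who have not seen the patch, here is a minimal sketch of what splitting an IP at byte boundaries could look like. It is not the code from tokenizerpatch.txt: the function names, the "url-ip:" token format, and the injectable resolve argument are all assumptions made for illustration, and the sketch does no caching of the kind dnscache.py provides.

```python
import socket


def ip_prefix_tokens(ip):
    """Yield one token per leading-byte prefix of a dotted-quad IPv4 address.

    The "url-ip:<prefix>/<bits>" format is hypothetical, chosen for this sketch.
    """
    parts = ip.split(".")
    if len(parts) != 4 or not all(p.isdecimal() and int(p) <= 255 for p in parts):
        return  # not a dotted-quad IPv4 address; produce no tokens
    for n in range(1, 5):
        # n leading bytes -> a prefix of 8 * n bits
        yield "url-ip:%s/%d" % (".".join(parts[:n]), 8 * n)


def url_host_tokens(host, resolve=socket.gethostbyname):
    """Resolve a URL's host name and yield its IP prefix tokens.

    Yields nothing if the lookup fails, e.g. because the spammer's DNS
    record no longer exists.
    """
    try:
        ip = resolve(host)
    except OSError:  # socket.gaierror is a subclass of OSError
        return
    yield from ip_prefix_tokens(ip)
```

With this scheme an exact match on an address contributes four tokens, while two hosts that only share their first three bytes still share three of the four. That is one way to read the "four tokens for an exact match" question above.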
