I stumbled across this article

http://www.macdevcenter.com/pub/a/mac/2004/05/18/spam_pt2.html

while Googling around for anything that relates cluster analysis techniques to spam filtering.

This may be old knowledge to some people here, but it was new to me. Apparently the trainable spam filter in Apple's Mail program does not use the Bayesian approach that we are familiar with. Instead, it uses a cluster discovery tool that was developed for document search and retrieval.

It would be interesting to compare this approach to Bayes. I'm also curious whether this provides some hints about using techniques from bioinformatics (as Justin referred to in a recent message to this list), such as UPGMA cluster analysis (http://www.nmsr.org/upgma.htm).
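
For anyone who has not seen UPGMA before, here is a minimal sketch of the idea in Python: repeatedly merge the two closest clusters, and take the distance from the merged cluster to every other cluster as the size-weighted average of the two old distances. The function name `upgma`, the layout of the distance dictionary, and the toy distances are all made up for illustration. This is not what Apple's Mail does, and how to compute a sensible distance between two messages is left open.

```python
from itertools import combinations


def upgma(labels, dist):
    """Cluster items by UPGMA (average linkage) and return a nested-tuple tree.

    labels: list of item names (hypothetical message ids, for example)
    dist:   dict mapping frozenset({a, b}) to the distance between items a and b
    """
    # Active clusters: id -> (subtree, number of items in it).
    clusters = {i: (label, 1) for i, label in enumerate(labels)}
    # Pairwise distances, re-keyed by cluster id.
    d = {
        frozenset((i, j)): dist[frozenset((labels[i], labels[j]))]
        for i, j in combinations(range(len(labels)), 2)
    }
    next_id = len(labels)

    while len(clusters) > 1:
        # Find the closest pair of active clusters.
        pair = min(
            (frozenset(p) for p in combinations(clusters, 2)),
            key=lambda p: d[p],
        )
        a, b = sorted(pair)
        (tree_a, n_a), (tree_b, n_b) = clusters.pop(a), clusters.pop(b)

        # Distance from the merged cluster to each remaining cluster is the
        # average of the old distances, weighted by cluster size.
        for k in clusters:
            d[frozenset((next_id, k))] = (
                n_a * d[frozenset((a, k))] + n_b * d[frozenset((b, k))]
            ) / (n_a + n_b)

        clusters[next_id] = ((tree_a, tree_b), n_a + n_b)
        next_id += 1

    (tree, _size), = clusters.values()
    return tree


# Toy example with invented distances between three messages.
toy = {
    frozenset(("a", "b")): 1.0,
    frozenset(("a", "c")): 4.0,
    frozenset(("b", "c")): 6.0,
}
print(upgma(["a", "b", "c"], toy))
```

In the toy example, "a" and "b" are merged first because they are closest, and "c" joins them at the averaged distance (4 + 6) / 2 = 5.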

-- sidney
