Hi,

I've made some progress on this. I've written a couple of modules, AI::Categorize and AI::Categorize::NaiveBayes. I've also written a paper that describes what they do (same paper, different formats):

  http://forum.swarthmore.edu/~ken/bayes/bayes.pod
  http://forum.swarthmore.edu/~ken/bayes/bayes.html
  http://forum.swarthmore.edu/~ken/bayes/bayes.txt

I'm not completely sold on the naming scheme or interface yet. The theory is that I'm making an abstract superclass for auto-categorization, and the various categorization algorithms are modules descended from it. So far ...::NaiveBayes is the only subclass I've implemented, but I plan to look into Support Vectors or kNN or something if I get the chance. (I use these terms like I know what they mean, but I don't; writing these implementations is my way of learning about them. More info in a comparative paper: http://www.cs.cmu.edu/~yiming/papers.yy/irj99.ps)

If anyone has feedback or interest, I'd love to hear it before putting any of this stuff on CPAN.

[EMAIL PROTECTED] (Ken Williams) wrote:
>I'm currently working on an automatic document categorization system
>written in Perl. The goal is to take questions coming in to an
>ask-an-expert service (http://mathforum.com/dr.math/) and guess at the
>category of their subject matter. The system I've begun writing uses
>the "Naïve Bayes" classification theory. I believe this is one of the
>more popular ways of doing categorization, but it's my first stab at
>understanding the technique.
>
>If anyone's interested in seeing anything I'm doing, you can look at
>http://mathforum.com/~ken/bayes/ . Everything I've done so far is
>there, but there's precious little explanation. Eventually I'll be
>writing some explanation, because this is a project for a class I'm
>taking in Computational Linguistics.
>
>Also, if anyone knows anything about the Bayesian methods I'm trying to
>use, I'm interested in hearing it. I haven't really found any good
>examples of the techniques in action; I'm just going on instinct and a
>knowledge of Bayes' Theorem.
>
>I'll send updates to the list whenever I feel like I'm making progress.

  -------------------                            -------------------
  Ken Williams                             Last Bastion of Euclidity
  [EMAIL PROTECTED]                            The Math Forum
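
For anyone who wants a concrete picture of the design described in this message (an abstract categorizer superclass with a Naive Bayes subclass), here is a minimal sketch. It is written in Python rather than Perl, and it is not the AI::Categorize interface: the class names, the method names (add_document, scores, categorize), the whitespace tokenizer, the add-one smoothing, and the toy training data are all assumptions made up for illustration.

```python
import math
from abc import ABC, abstractmethod
from collections import Counter, defaultdict


class Categorizer(ABC):
    """Hypothetical abstract superclass: one interface, many algorithms."""

    @abstractmethod
    def add_document(self, text, category):
        """Record one labeled training document."""

    @abstractmethod
    def categorize(self, text):
        """Return the best-guess category for an unlabeled document."""


class NaiveBayes(Categorizer):
    """Multinomial Naive Bayes with add-one (Laplace) smoothing (assumed variant)."""

    def __init__(self):
        self.doc_counts = Counter()              # documents seen per category
        self.word_counts = defaultdict(Counter)  # word frequencies per category
        self.vocabulary = set()                  # every word seen in training

    @staticmethod
    def tokenize(text):
        # Deliberately crude: lowercase and split on whitespace.
        return text.lower().split()

    def add_document(self, text, category):
        words = self.tokenize(text)
        self.doc_counts[category] += 1
        self.word_counts[category].update(words)
        self.vocabulary.update(words)

    def scores(self, text):
        """Per category: log P(category) + sum of log P(word | category)."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        result = {}
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            denom = sum(counts.values()) + vocab_size
            log_p = math.log(n_docs / total_docs)
            for word in self.tokenize(text):
                # Add-one smoothing keeps unseen words from zeroing the product.
                log_p += math.log((counts[word] + 1) / denom)
            result[category] = log_p
        return result

    def categorize(self, text):
        scores = self.scores(text)
        return max(scores, key=scores.get)


# Toy training data, invented for this example.
classifier = NaiveBayes()
classifier.add_document("solve the equation for x", "algebra")
classifier.add_document("factor the quadratic polynomial", "algebra")
classifier.add_document("area of a triangle", "geometry")
classifier.add_document("angle between two lines in a circle", "geometry")

guess = classifier.categorize("how do I find the area of a circle")
print(guess)
```

Working in log space avoids underflow when many small word probabilities are multiplied together. A second algorithm such as kNN would be another subclass of the same hypothetical Categorizer, so calling code would not need to change.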
