Hi,

I've made some progress on this.  I've written a couple of modules,
AI::Categorize and AI::Categorize::NaiveBayes.  I've also written a
paper that describes what they do:

 http://forum.swarthmore.edu/~ken/bayes/bayes.pod    (same paper,
 http://forum.swarthmore.edu/~ken/bayes/bayes.html    different
 http://forum.swarthmore.edu/~ken/bayes/bayes.txt     formats)

I'm not completely sold on the naming scheme or interface yet.  The
theory is that I'm making an abstract superclass for
auto-categorization, and various categorization algorithms are modules
descended therefrom.  So far ...::NaiveBayes is the only subclass I've
implemented, but I plan to look into Support Vectors or kNN or something
if I get the chance.  (I use these terms like I know what they mean, but
I don't - writing these implementations is my way of learning about
them.  There's more info in a comparative paper:
http://www.cs.cmu.edu/~yiming/papers.yy/irj99.ps)
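
To make the superclass/subclass idea concrete, here is a minimal sketch
of how such a design could look.  It is written in Python for brevity
and is NOT the actual AI::Categorize interface: the class and method
names (Categorizer, add_document, categorize, scores), the add-one
smoothing, and the toy training sentences are all assumptions made up
for illustration.

```python
import math
from abc import ABC, abstractmethod
from collections import Counter, defaultdict


class Categorizer(ABC):
    """Hypothetical abstract superclass: one interface, many algorithms."""

    @abstractmethod
    def add_document(self, words, category):
        """Record one training document (a list of words) under a category."""

    @abstractmethod
    def categorize(self, words):
        """Return the best-guess category for an unseen document."""


class NaiveBayes(Categorizer):
    """Multinomial Naive Bayes with add-one (Laplace) smoothing (an assumption)."""

    def __init__(self):
        self.doc_counts = Counter()               # category -> number of documents
        self.word_counts = defaultdict(Counter)   # category -> word -> count
        self.vocabulary = set()                   # every word seen in training

    def add_document(self, words, category):
        self.doc_counts[category] += 1
        self.word_counts[category].update(words)
        self.vocabulary.update(words)

    def scores(self, words):
        """Return log P(category) + sum of log P(word | category), per category."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        result = {}
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)
            for word in words:
                # Add-one smoothing keeps an unseen word from zeroing the product.
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            result[category] = score
        return result

    def categorize(self, words):
        scores = self.scores(words)
        return max(scores, key=scores.get)


# Toy training data, invented for this example.
classifier = NaiveBayes()
classifier.add_document("find the area of a triangle".split(), "geometry")
classifier.add_document("prove the angles of a triangle sum to 180".split(), "geometry")
classifier.add_document("solve the equation for x".split(), "algebra")
classifier.add_document("factor the polynomial equation".split(), "algebra")

print(classifier.categorize("what is the area of this triangle".split()))
```

Another algorithm (kNN, say) would be a second subclass implementing
the same two methods, so calling code wouldn't need to change.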

If anyone has feedback or interest, I'd love to hear it before putting
any of this stuff on CPAN.


[EMAIL PROTECTED] (Ken Williams) wrote:
>I'm currently working on an automatic document categorization system
>written in Perl.  The goal is to take questions coming in to an
>ask-an-expert service (http://mathforum.com/dr.math/) and guess at the
>category of their subject matter.  The system I've begun writing uses
>the "Naïve Bayes" classification theory.  I believe this is one of the
>more popular ways of doing categorization, but it's my first stab at
>understanding the technique.
>
>If anyone's interested in seeing anything I'm doing, you can look at
>http://mathforum.com/~ken/bayes/ .  Everything I've done so far is
>there, but there's precious little explanation.  Eventually I'll be
>writing some explanation, because this is a project for a class I'm
>taking in Computational Linguistics.
>
>Also, if anyone knows anything about the Bayesian methods I'm trying to
>use, I'm interested in hearing it.  I haven't really found any good
>examples of the techniques in action; I'm just going on instinct and a
>knowledge of Bayes' Theorem.
>
>I'll send updates to the list whenever I feel like I'm making progress.

  -------------------                            -------------------
  Ken Williams                             Last Bastion of Euclidity
  [EMAIL PROTECTED]                            The Math Forum
