On Tuesday, September 23, 2003, at 09:12 AM, Dominique Vlieghe wrote:


On Tue, 2003-09-23 at 16:00, Ken Williams wrote:

Algorithm::NaiveBayes uses numerical attributes:


   $nb->add_instance
     (attributes => {foo => 1.7, bar => 3.234},
      label => 'whatever');

Or do I misunderstand your question?

I think you do. What I mean is that it should fit some sort of distribution (e.g. a normal distribution) to the values of a given attribute. The Naive Bayes module will (if I'm not mistaken) count the number of occurrences of each attribute value, numeric or not; e.g. in your example you would have the value 1.7 once and the value 3.234 once.

Right, those values essentially mean that the "foo" attribute is counted 1.7 times, and the "bar" attribute is counted 3.234 times. They're simply arbitrary weights, so "count" is not quite the right term, but it's the way many people think of them.
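
To make the "weights, not counts" point concrete, here is a minimal sketch of the idea. It is not the module's actual code; the %weight_for hash and the add_weighted_instance helper are hypothetical names used only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Algorithm::NaiveBayes:
# accumulate attribute weights per label as fractional "counts".
my %weight_for;   # $weight_for{$label}{$attribute} = summed weight

sub add_weighted_instance {
    my ($label, %attributes) = @_;
    $weight_for{$label}{$_} += $attributes{$_} for keys %attributes;
}

add_weighted_instance('whatever', foo => 1.7, bar => 3.234);
add_weighted_instance('whatever', foo => 0.3);

# Prints:
#   bar: 3.234
#   foo: 2.000
printf "%s: %.3f\n", $_, $weight_for{whatever}{$_}
    for sort keys %{ $weight_for{whatever} };
```

Under this reading the numbers simply add up per attribute name, so 1.7 and 1.9 for "foo" are treated as more or less of the same thing, not as two points on a numeric scale.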


So what you're looking for is something like this, right?

   $nb->attribute_type('numeric');
   $nb->add_instance
     (attributes => {23 => 1.7, 35 => 3.234},
      label => 'whatever');

and then the model should correctly interpolate values for unseen attributes like 29 using an underlying distribution model? Currently this is unsupported by the module, but it could be added - especially with some help. =)
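
For what it's worth, one possible reading of the request is a Gaussian model per label and attribute: estimate a mean and variance from the values seen in training, then score an unseen value with the normal density. The sketch below is standalone Perl under that assumption; every name in it (add_numeric_instance, mean_and_variance, log_gaussian, predict_label) is hypothetical and none of it exists in Algorithm::NaiveBayes.

```perl
use strict;
use warnings;

# Hypothetical sketch; none of these names exist in Algorithm::NaiveBayes.
my %values;    # $values{$label}{$attribute} = [ observed numeric values ]
my %n_label;   # number of training instances per label

sub add_numeric_instance {
    my ($label, %attributes) = @_;
    $n_label{$label}++;
    push @{ $values{$label}{$_} }, $attributes{$_} for keys %attributes;
}

# Mean and population variance; the variance is floored so the
# density stays finite when all observed values are identical.
sub mean_and_variance {
    my @x = @_;
    my $n = @x;
    my $mean = 0;
    $mean += $_ / $n for @x;
    my $var = 0;
    $var += ($_ - $mean) ** 2 / $n for @x;
    return ($mean, $var > 1e-9 ? $var : 1e-9);
}

# Log of the normal density at $x.
sub log_gaussian {
    my ($x, $mean, $var) = @_;
    my $pi = 4 * atan2(1, 1);
    return -0.5 * log(2 * $pi * $var) - ($x - $mean) ** 2 / (2 * $var);
}

# Return the label with the highest log prior plus summed log densities.
sub predict_label {
    my (%attributes) = @_;
    my $total = 0;
    $total += $_ for values %n_label;
    my ($best, $best_score);
    for my $label (sort keys %values) {
        my $score = log($n_label{$label} / $total);
        for my $attr (keys %attributes) {
            next unless $values{$label}{$attr};   # attribute unseen for this label
            $score += log_gaussian($attributes{$attr},
                                   mean_and_variance(@{ $values{$label}{$attr} }));
        }
        ($best, $best_score) = ($label, $score)
            if !defined $best_score || $score > $best_score;
    }
    return $best;
}

add_numeric_instance('small', foo => 1.5);
add_numeric_instance('small', foo => 1.9);
add_numeric_instance('large', foo => 7.8);
add_numeric_instance('large', foo => 8.4);

print predict_label(foo => 2.3), "\n";   # prints "small"
```

The value 2.3 never appears in training, but it lies close to the mean for 'small' (1.7) and far from the mean for 'large' (8.1), so the fitted distributions handle the interpolation.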

-Ken


