Hi Mahout,

In Naive Bayes, I think that a term that does not appear in the training data
should not affect the score.
What do you think?

  org.apache.mahout.classifier.naivebayes.AbstractNaiveBayesClassifier

 Before:
  protected double getScoreForLabelInstance(int label, Vector instance) {
    double result = 0.0;
    for (Element e : instance.nonZeroes()) {
      result += e.get() * getScoreForLabelFeature(label, e.index());
    }
    return result;
  }

 After:
  protected double getScoreForLabelInstance(int label, Vector instance) {
    double result = 0.0;
    for (Element e : instance.nonZeroes()) {
      int index = e.index();
      // Total weight of this feature over all labels; 0 means it never appeared in training.
      double featureWeight = model.featureWeight(index);
      if (featureWeight != 0) {
        result += e.get() * getScoreForLabelFeature(label, index);
      }
    }
    return result;
  }
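
To show the effect I mean, here is a small self-contained sketch. It does not
use the Mahout classes. The weight table, the ALPHA value, and the smoothed
log weight log((weight + alpha) / (labelWeight + alpha * numFeatures)) are my
assumptions for illustration, and the class and method names are made up.
Feature 2 never appears in the training data, yet without the check it still
changes the scores, and it does so differently for each label.

```java
import java.util.Map;

public class UnseenTermSketch {
    // Hypothetical training weights: WEIGHTS[label][feature].
    // Feature 2 has zero weight under every label, i.e. it was never seen in training.
    static final double[][] WEIGHTS = {
        {4.0, 1.0, 0.0},   // label 0 (little training data)
        {1.0, 14.0, 0.0},  // label 1 (more training data)
    };
    static final double ALPHA = 1.0; // assumed smoothing parameter

    // Sum of all feature weights observed for one label.
    static double labelWeight(int label) {
        double sum = 0.0;
        for (double w : WEIGHTS[label]) {
            sum += w;
        }
        return sum;
    }

    // Sum of one feature's weight over all labels; 0 means unseen in training.
    static double featureWeight(int feature) {
        double sum = 0.0;
        for (double[] row : WEIGHTS) {
            sum += row[feature];
        }
        return sum;
    }

    // Assumed smoothed log weight of a feature under a label.
    static double scoreForLabelFeature(int label, int feature) {
        int numFeatures = WEIGHTS[label].length;
        double numerator = WEIGHTS[label][feature] + ALPHA;
        double denominator = labelWeight(label) + ALPHA * numFeatures;
        return Math.log(numerator / denominator);
    }

    // instance maps feature index to term value; skipUnseen switches between
    // the "Before" behaviour (false) and the "After" behaviour (true).
    static double scoreForLabelInstance(int label, Map<Integer, Double> instance, boolean skipUnseen) {
        double result = 0.0;
        for (Map.Entry<Integer, Double> e : instance.entrySet()) {
            int index = e.getKey();
            if (skipUnseen && featureWeight(index) == 0.0) {
                continue; // term never seen in training: contributes nothing
            }
            result += e.getValue() * scoreForLabelFeature(label, index);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, Double> instance = Map.of(2, 1.0); // only the unseen term
        for (int label = 0; label < WEIGHTS.length; label++) {
            System.out.println("label " + label
                + " before=" + scoreForLabelInstance(label, instance, false)
                + " after=" + scoreForLabelInstance(label, instance, true));
        }
    }
}
```

In this sketch the unseen term alone makes label 0 score higher than label 1
before the change, only because label 0 has less training weight. After the
change both labels get the same contribution (zero) from that term.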

Thanks,
Toyoharu
