Hi Xiaobo,
as far as I recall, the paper on which Mahout's NB implementation is
based consists of two parts: the first part describes techniques that
generally improve NB's prediction quality on skewed input data and the
like, while the second part shows how to handle textual data.
I think that bayes.* is an older implementation that includes the first
and the second part of the paper, while naivebayes.* is a newer one that
only contains the general algorithm described in the first part of the
paper.
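To illustrate what I mean by the general part: assuming the paper in
question is the one that introduces complement naive Bayes (that is my
assumption, I have not named it above), the core idea for skewed data is
to estimate each class's term weights from all documents that do NOT
belong to that class. The sketch below is plain Python, not Mahout code;
the function names and the toy data are made up for illustration.

```python
import math
from collections import defaultdict


def train_complement_nb(docs, alpha=1.0):
    """Train complement-NB weights.

    docs  : list of (label, {term: count}) pairs (hypothetical format)
    alpha : additive smoothing constant
    Returns {label: {term: log weight}}.
    """
    vocab = sorted({t for _, counts in docs for t in counts})
    labels = sorted({label for label, _ in docs})

    # Term counts over all documents and per label.
    total = defaultdict(float)
    per_label = {label: defaultdict(float) for label in labels}
    for label, counts in docs:
        for term, n in counts.items():
            total[term] += n
            per_label[label][term] += n

    weights = {}
    for label in labels:
        # Counts of each term in every class EXCEPT this one.
        comp = {t: total[t] - per_label[label][t] for t in vocab}
        denom = sum(comp.values()) + alpha * len(vocab)
        weights[label] = {
            t: math.log((comp[t] + alpha) / denom) for t in vocab
        }
    return weights


def classify(weights, counts):
    """Pick the label whose complement explains the document worst."""
    def score(label):
        w = weights[label]
        return sum(n * w[t] for t, n in counts.items() if t in w)
    return min(weights, key=score)


# Skewed toy data: one "spam" document against three "ham" documents.
docs = [
    ("spam", {"buy": 3, "now": 1}),
    ("ham", {"meeting": 2, "now": 1}),
    ("ham", {"lunch": 1, "meeting": 1}),
    ("ham", {"report": 2}),
]
weights = train_complement_nb(docs)
print(classify(weights, {"buy": 1}))
print(classify(weights, {"meeting": 2}))
```

Because every class is scored against the pooled counts of all other
classes, a class with very few training documents still gets its weights
from a reasonably large sample, which is why this helps on skewed input.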
--sebastian
On 14.08.2011 06:32, Xiaobo Gu wrote:
Hi,
1. What's the difference between them from an algorithmic point of view?
Do they support only categorical predictors?
2. What are their input file format requirements?
org.apache.mahout.naivebayes.* requires
SequenceFile<Text,VectorWritable>, while org.apache.mahout.bayes.*
requires a tab-separated text file without a header. Why don't they use
the same input format?
Regards,
Xiaobo Gu