On 14.08.2011 08:53, Xiaobo Gu wrote:
Can you show me the paper, please? And do you mean they are the same
thing, and we should always use naivebayes.*, provided we can prepare the
input data as required?

Here's a link to the paper: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.13.8572&rank=1

--sebastian


Regards,

Xiaobo Gu

On Sun, Aug 14, 2011 at 2:31 PM, Sebastian Schelter <[email protected]> wrote:
Hi Xiaobo,

as far as I recall, the paper on which Mahout's NB implementation is based
consists of two parts: the first part describes techniques to generally
improve NB's prediction quality on skewed input data and the like, while
the second part shows how to handle textual data.

I think that bayes.* is an older implementation that includes the first and
the second part of the paper, while naivebayes.* is a newer one that only
contains the general algorithm described in the first part of the paper.

--sebastian

On 14.08.2011 06:32, Xiaobo Gu wrote:

Hi,
1. What's the difference between them from an algorithmic point of view? Do
they support only categorical predictors?
2. What are their input file format requirements? org.apache.mahout.naivebayes.*
requires SequenceFile<Text,VectorWritable>, while org.apache.mahout.bayes.*
requires a tab-separated text file without a header. Why not use the
same input format?
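
To make the tab-separated format concrete, here is a minimal sketch of
reading one line of such a file. It assumes each line has the layout
label<TAB>text, which is an assumption about the file and is not confirmed
in this thread. The class and method names are hypothetical and are not part
of Mahout's API; only the Java standard library is used.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

public class TabSeparatedExample {

    // Splits one input line into (label, text) at the FIRST tab character.
    // Assumed layout: label<TAB>text (hypothetical, not taken from Mahout).
    static Map.Entry<String, String> parseLine(String line) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
            throw new IllegalArgumentException("no tab separator in line: " + line);
        }
        String label = line.substring(0, tab);
        String text = line.substring(tab + 1);
        return new SimpleEntry<>(label, text);
    }

    public static void main(String[] args) {
        Map.Entry<String, String> entry = parseLine("sports\tthe team won the match");
        System.out.println(entry.getKey() + " -> " + entry.getValue());
    }
}
```

Splitting at the first tab only means that any further tabs stay inside the
text part instead of being silently dropped.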


Regards,

Xiaobo Gu


