Can you show me the paper, please? And do you mean they are the same thing, so we should always use naivebayes.*, provided we can prepare the input data as required?
Regards,

Xiaobo Gu

On Sun, Aug 14, 2011 at 2:31 PM, Sebastian Schelter <[email protected]> wrote:
> Hi Xiaobo,
>
> as far as I recall, the paper on which Mahout's NB implementation is based
> consists of two parts: the first part describes techniques to generally
> improve NB's prediction quality on skewed input data and the like, while
> the second part shows how to handle textual data.
>
> I think that bayes.* is an older implementation that includes the first and
> the second part of the paper, while naivebayes.* is a newer one that only
> contains the general algorithm described in the first part of the paper.
>
> --sebastian
>
> On 14.08.2011 06:32, Xiaobo Gu wrote:
>>
>> Hi,
>> 1. What's the difference between them from the algorithm point of view? Do
>> they only support categorical predictors?
>> 2. What are the input file format requirements of them? For
>> org.apache.mahout.naivebayes.*, it requires
>> SequenceFile<Text,VectorWritable>, and for org.apache.mahout.bayes.*,
>> it requires a tab-separated text file without a header. Why not use the
>> same input format?
>>
>>
>> Regards,
>>
>> Xiaobo Gu
