Could you please point me to any material describing the input file format
required by Naïve Bayes?


> -----Original Message-----
> From: Ted Dunning [mailto:[email protected]]
> Sent: Monday, July 25, 2011 11:16 PM
> To: [email protected]
> Cc: [email protected]
> Subject: Re: What about a universal input data handling mechanism for Mahout?
> 
> Good idea.
> 
> Somebody should file a JIRA.  My guess is that the best first step would be
> to have the logistic regression handle the naive Bayes input format.
> 
> 2011/7/25 Fernando Fernández <[email protected]>
> 
> > That would be very nice. Actually, I haven't tested most of the Mahout
> > algorithms for that reason...
> >
> > 2011/7/25 Xiaobo Gu <[email protected]>
> >
> > > Hi,
> > > Most of the time, Mahout algorithms use Vector as the model training
> > > input but do not specify how the instance vectors are generated. Each
> > > algorithm does this in its own way, so the required input file format
> > > is bound to a specific algorithm. That causes a lot of work for
> > > actual users, especially command-line users. For example, if we want
> > > to build both a Logistic Regression and a Naïve Bayes model from the
> > > same data, we must prepare the data in two formats. Hence this
> > > request: could you provide a universal mechanism for handling input
> > > data, such as CSV plus a CSV-to-Vector encoder? All algorithms could
> > > then use it, and users would only have to prepare their data as CSV.
> > >
> > > Regards,
> > >
> > > Xiaobo Gu
> > >
> >
