Good idea.

Somebody should file a JIRA.  My guess is that the best first step would be
to have the logistic regression handle the naive Bayes input format.

2011/7/25 Fernando Fernández <[email protected]>

> That would be very nice. I actually haven't tested most of Mahout's
> algorithms for that reason...
>
> 2011/7/25 Xiaobo Gu <[email protected]>
>
> > Hi,
> > Most of the time, Mahout algorithms take Vectors as their model training
> > input but leave it to each algorithm to decide how the instance vectors
> > are generated. As a result, every algorithm does this in its own way, and
> > the required input file format is bound to the specific algorithm. That
> > causes a lot of work for actual users, especially command-line users.
> > For example, if we want to build a Logistic Regression model and a Naive
> > Bayes model from the same data, we must prepare the data in two formats.
> > Hence this request: could you provide a universal mechanism for handling
> > input data, such as CSV plus a CSV-to-Vector encoder, that all algorithms
> > would use, so that users only have to prepare their data as CSV?
> >
> > Regards,
> >
> > Xiaobo Gu
> >
>
