[ https://issues.apache.org/jira/browse/MAHOUT-785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen updated MAHOUT-785:
-----------------------------
Affects Version/s: (was: 0.6)
I think this is a fairly open-ended item. It's not clear to me that these
different algorithms operate on logically the same input, though I imagine the
command line options and such could be more standardized. Do you have
particular views on concrete changes to this end?
> Universal input file format for classifier algorithms in Mahout
> ---------------------------------------------------------------
>
> Key: MAHOUT-785
> URL: https://issues.apache.org/jira/browse/MAHOUT-785
> Project: Mahout
> Issue Type: Improvement
> Components: Classification
> Reporter: XiaoboGu
>
> I think a universal input file format would be much more convenient for users,
> especially command-line users, and we should even consider using some universal
> command-line options for the classification algorithms, such as options for
> target/predictor variables and their types. Then users can prepare their data
> once and build different models to find the best one. Currently we should
> consider the following:
> 1. SGD LogisticRegression
> 2. NaiveBayes
> 3. Bayes
> 4. Random Forest
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira