[
https://issues.apache.org/jira/browse/MAHOUT-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13856671#comment-13856671
]
Yexi Jiang commented on MAHOUT-1388:
------------------------------------
[~smarthi] OK, I'll add it. Currently, it only supports CSV.
> Add command line support and logging for MLP
> --------------------------------------------
>
> Key: MAHOUT-1388
> URL: https://issues.apache.org/jira/browse/MAHOUT-1388
> Project: Mahout
> Issue Type: Improvement
> Components: Classification
> Affects Versions: 1.0
> Reporter: Yexi Jiang
> Labels: mlp, sgd
> Fix For: 1.0
>
>
> The user should be able to run the multilayer perceptron (MLP) from the
> command line. The MLP has two modes, training and labeling. Training mode
> takes the data as input and outputs the model. Labeling mode takes the model
> and unlabeled data as input and outputs the results.
> The parameters are as follows:
> ------------------------------------------------
> --mode -mo // train or label
> --input -i (input data)
> --model -mo // in training mode, this is the location to store the model (if
> the specified location has an existing model, it will update the model
> through incremental learning); in labeling mode, this is the location of the
> trained model to load
> --output -o // this is only useful in labeling mode
> --layersize -ls // comma-separated list giving the number of neurons in each
> layer, including the input layer and the output layer
> --momentum -m
> --learningrate -l
> --regularizationweight -r
> --costfunction -cf // the type of cost function
> ------------------------------------------------
> For example, to train a 3-layer (input, hidden, and output) MLP with the
> minus_squared cost function, a learning rate of 0.1, a momentum of 0.1, and a
> regularization weight of 0.01, the command would be:
> mlp --mode train -i /tmp/training-data.csv --model /tmp/model.model -ls 5,3,1
> -l 0.1 -m 0.1 -r 0.01 -cf minus_squared
> This command would read the training data from /tmp/training-data.csv and
> write the trained model to /tmp/model.model.
> To label data with an existing model, the user would run the following
> command:
> mlp --mode label -i /tmp/unlabel-data.csv --model /tmp/model.model -o
> /tmp/label-result
> Moreover, default values should be provided for any parameters the user does
> not specify.
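The option handling described in the issue could be sketched roughly as follows. This is only an illustration, not Mahout code: the class name MlpArgsSketch and its methods are hypothetical, only the long option names from the issue are handled, and the default values are placeholders borrowed from the training example, since the issue does not say what the defaults should be.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of option handling for the proposed "mlp" command; not Mahout code. */
final class MlpArgsSketch {

    /** Placeholder defaults (assumed for illustration; the issue does not specify any). */
    static Map<String, String> defaults() {
        Map<String, String> d = new HashMap<>();
        d.put("learningrate", "0.1");
        d.put("momentum", "0.1");
        d.put("regularizationweight", "0.01");
        d.put("costfunction", "minus_squared");
        return d;
    }

    /** Reads "--name value" pairs; options that are omitted keep their default value. */
    static Map<String, String> parse(String[] args) {
        Map<String, String> opts = defaults();
        for (int i = 0; i < args.length; i += 2) {
            if (!args[i].startsWith("--") || i + 1 >= args.length) {
                throw new IllegalArgumentException("expected --name value at: " + args[i]);
            }
            opts.put(args[i].substring(2), args[i + 1]);
        }
        // The issue defines exactly two modes: train and label.
        String mode = opts.get("mode");
        if (!"train".equals(mode) && !"label".equals(mode)) {
            throw new IllegalArgumentException("--mode must be train or label");
        }
        return opts;
    }

    /** Turns a comma-separated layer size list such as "5,3,1" into {5, 3, 1}. */
    static int[] layerSizes(String spec) {
        return Arrays.stream(spec.split(","))
                .map(String::trim)
                .mapToInt(Integer::parseInt)
                .toArray();
    }

    public static void main(String[] argv) {
        Map<String, String> opts = parse(new String[] {
            "--mode", "train", "--input", "/tmp/training-data.csv",
            "--model", "/tmp/model.model", "--layersize", "5,3,1"});
        System.out.println(Arrays.toString(layerSizes(opts.get("layersize"))));
        System.out.println(opts.get("learningrate"));
    }
}
```

Here the learning rate is not passed on the command line, so the lookup falls back to the placeholder default. A real implementation would more likely build on an option-parsing library and would also need the short option names.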