Yexi Jiang created MAHOUT-1388:
----------------------------------

             Summary: Add command line support and logging for MLP
                 Key: MAHOUT-1388
                 URL: https://issues.apache.org/jira/browse/MAHOUT-1388
             Project: Mahout
          Issue Type: Improvement
          Components: Classification
    Affects Versions: 1.0
            Reporter: Yexi Jiang
             Fix For: 1.0


Users should be able to run the multilayer perceptron (MLP) from the command line.

The MLP has two modes, training and labeling. Training takes the data as input 
and outputs the model. Labeling takes the model and unlabeled data as input and 
outputs the results.

The parameters are as follows:
------------------------------------------------
--mode -mo                 // train or label
--input -i                 // the input data
--model -mo                // in training mode, the location to store the model
                           // (if a model already exists at that location, it
                           // is updated through incremental learning); in
                           // labeling mode, the location of the model to load
--output -o                // the location to store the result; only used in
                           // labeling mode
--layersize -ls            // comma-separated number of units in each layer,
                           // including the input and output layers
--momentum -m
--learningrate -l
--regularizationweight -r
--costfunction -cf         // the type of cost function
------------------------------------------------
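The option list above could be wired up roughly as follows. This is a minimal Python sketch for illustration only, not Mahout code; the names build_parser and parse_layer_sizes are hypothetical. Because the list assigns -mo to both --mode and --model, the sketch registers only the long names for those two options.

```python
import argparse


def parse_layer_sizes(text):
    """Turn a comma-separated string such as "5,3,1" into [5, 3, 1]."""
    sizes = [int(part) for part in text.split(",")]
    # At least an input layer and an output layer are needed.
    if len(sizes) < 2 or any(size <= 0 for size in sizes):
        raise argparse.ArgumentTypeError(
            "expected at least two positive layer sizes, e.g. 5,3,1")
    return sizes


def build_parser():
    """Build a parser for the options listed in this issue (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="mlp")
    parser.add_argument("--mode", choices=["train", "label"], required=True)
    parser.add_argument("--input", "-i", required=True)
    parser.add_argument("--model", required=True)
    parser.add_argument("--output", "-o")
    parser.add_argument("--layersize", "-ls", type=parse_layer_sizes)
    parser.add_argument("--momentum", "-m", type=float)
    parser.add_argument("--learningrate", "-l", type=float)
    parser.add_argument("--regularizationweight", "-r", type=float)
    parser.add_argument("--costfunction", "-cf")
    return parser
```

Calling build_parser().parse_args([...]) with the training arguments from this issue would yield layersize as a list of integers and the three rates as floats.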
For example, to train a 3-layer MLP (input, hidden, and output layers) with the 
minus_squared cost function, a learning rate of 0.1, a momentum of 0.1, and a 
regularization weight of 0.01, the command would be:

mlp -mo train -i /tmp/training-data.csv --model /tmp/model.model -ls 5,3,1 -l 0.1 -m 0.1 -r 0.01 -cf minus_squared

This command would read the training data from /tmp/training-data.csv and write 
the trained model to /tmp/model.model.

To label data with an existing model, the user would run the following command:
mlp -mo label -i /tmp/unlabel-data.csv --model /tmp/model.model -o /tmp/label-result
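
Since the two modes need different options, a front end would probably check them before doing any work. A rough Python sketch, for illustration only; check_options is a hypothetical name, and the required sets are read off the descriptions in this issue (training needs the input and the model location, labeling additionally needs an output location).

```python
def check_options(mode, options):
    """Return the names of required options that are missing for `mode`.

    `options` maps long option names (without dashes) to their values.
    """
    required = {
        "train": ["input", "model"],
        "label": ["input", "model", "output"],
    }
    if mode not in required:
        raise ValueError("mode must be 'train' or 'label', got %r" % (mode,))
    # An option counts as missing if it is absent, None, or empty.
    return [name for name in required[mode] if not options.get(name)]
```

An empty list means the command can proceed; otherwise the caller can report the missing names and exit.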

Moreover, we should provide default values for any parameters the user does 
not specify.
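
One way the fallback could work is to merge the user's options over a table of defaults. The Python sketch below is illustrative only: with_defaults is a hypothetical name, and the values in HYPOTHETICAL_DEFAULTS are placeholders borrowed from the training example in this issue, not proposed defaults.

```python
# Placeholder values only; this issue does not specify the actual defaults.
HYPOTHETICAL_DEFAULTS = {
    "learningrate": 0.1,
    "momentum": 0.1,
    "regularizationweight": 0.01,
    "costfunction": "minus_squared",
}


def with_defaults(options, defaults=HYPOTHETICAL_DEFAULTS):
    """Return a new dict where unspecified (None or absent) options fall back to defaults."""
    merged = dict(defaults)
    merged.update({name: value for name, value in options.items()
                   if value is not None})
    return merged
```

Values the user did supply always win, and the input dictionaries are left unmodified.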



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
