[ https://issues.apache.org/jira/browse/MAHOUT-703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036009#comment-13036009 ]

Ted Dunning commented on MAHOUT-703:
------------------------------------

It comes almost for free with SGD neural net code to put L1 and L2 penalties 
in as well.  I would recommend doing it.

The trick is that you can't depend on the gradient being sparse, so you can't 
use the lazy regularization.  Leon Bottou describes 
a stochastic full regularization with an adjusted learning rate which should 
perform comparably.  He mostly talks about weight decay (which is L_2 
regularization); that can be handled cleverly by keeping a multiplier and a 
vector.  I think L_1 is important, but it requires something like truncated 
constant decay, which can't be done with a multiplier.

See http://leon.bottou.org/projects/sgd
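
Roughly, the two updates could look like the sketch below (plain Java, not
Mahout code; the method names and the exact truncation rule are only for
illustration).  The L_2 decay is folded into a single scale factor so each
step only changes the multiplier, while the L_1 truncated decay has to touch
every weight on every step, since the neural net gradient is dense anyway.

public final class RegularizedSgdSketch {

  /**
   * L2 ("weight decay") handled with a scalar multiplier: keep weights as
   * scale * values and fold the uniform shrinkage into the scale instead of
   * rescaling every weight.  Returns the updated multiplier for the caller
   * to keep alongside the weight vector.
   */
  static double l2Step(double[] values, double scale, double[] gradient,
                       double learningRate, double lambda2) {
    double newScale = scale * (1.0 - learningRate * lambda2);  // decay via the multiplier
    for (int i = 0; i < values.length; i++) {
      // gradient is with respect to the effective weight scale * values[i],
      // so divide by the new scale when updating the stored value
      values[i] -= learningRate * gradient[i] / newScale;
    }
    return newScale;
  }

  /**
   * L1 handled by truncated constant decay: take the gradient step, pull each
   * weight toward zero by a fixed amount, and clip at zero.  This cannot be
   * folded into a multiplier, so it visits every weight on every step.
   */
  static void l1Step(double[] weights, double[] gradient,
                     double learningRate, double lambda1) {
    double shrink = learningRate * lambda1;
    for (int i = 0; i < weights.length; i++) {
      double w = weights[i] - learningRate * gradient[i];
      if (w > shrink) {
        weights[i] = w - shrink;
      } else if (w < -shrink) {
        weights[i] = w + shrink;
      } else {
        weights[i] = 0.0;                 // truncate small weights to exactly zero
      }
    }
  }
}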

> Implement Gradient machine
> --------------------------
>
>                 Key: MAHOUT-703
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-703
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Classification
>    Affects Versions: 0.6
>            Reporter: Hector Yee
>            Priority: Minor
>              Labels: features
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> Implement a gradient machine (aka 'neural network') that can be used for 
> classification or auto-encoding.
> It will have just an input layer, an identity, sigmoid, or tanh hidden layer, 
> and an output layer.
> Training will be done by stochastic gradient descent (possibly mini-batch later).
> Sparsity will be optionally enforced by tweaking the bias in the hidden unit.
> For now it will go in classifier/sgd and the auto-encoder will wrap it in the 
> filter unit later on.
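
A bare-bones sketch of that shape (one tanh hidden layer, squared-error SGD;
the class and method names here are assumptions, not the eventual
classifier/sgd API) might look like:

import java.util.Random;

public final class GradientMachineSketch {
  private final double[][] w1, w2;   // input->hidden and hidden->output weights
  private final double[] b1, b2;     // layer biases

  GradientMachineSketch(int in, int hidden, int out, Random rnd) {
    w1 = new double[hidden][in];
    w2 = new double[out][hidden];
    b1 = new double[hidden];
    b2 = new double[out];
    for (double[] row : w1) for (int i = 0; i < in; i++) row[i] = 0.01 * rnd.nextGaussian();
    for (double[] row : w2) for (int j = 0; j < hidden; j++) row[j] = 0.01 * rnd.nextGaussian();
  }

  /** Forward pass: hidden = tanh(w1 * x + b1), output = w2 * hidden + b2. */
  double[] forward(double[] x, double[] hidden) {
    for (int j = 0; j < b1.length; j++) {
      double s = b1[j];
      for (int i = 0; i < x.length; i++) s += w1[j][i] * x[i];
      hidden[j] = Math.tanh(s);
    }
    double[] out = new double[b2.length];
    for (int k = 0; k < b2.length; k++) {
      double s = b2[k];
      for (int j = 0; j < hidden.length; j++) s += w2[k][j] * hidden[j];
      out[k] = s;
    }
    return out;
  }

  /** One stochastic gradient step on 0.5 * ||output - target||^2. */
  void train(double[] x, double[] target, double rate) {
    double[] hidden = new double[b1.length];
    double[] out = forward(x, hidden);

    double[] dOut = new double[out.length];
    for (int k = 0; k < out.length; k++) dOut[k] = out[k] - target[k];

    // back-propagate into the hidden layer before w2 is modified
    double[] dHidden = new double[hidden.length];
    for (int j = 0; j < hidden.length; j++) {
      double s = 0.0;
      for (int k = 0; k < out.length; k++) s += w2[k][j] * dOut[k];
      dHidden[j] = s * (1.0 - hidden[j] * hidden[j]);   // tanh derivative
    }

    for (int k = 0; k < out.length; k++) {
      for (int j = 0; j < hidden.length; j++) w2[k][j] -= rate * dOut[k] * hidden[j];
      b2[k] -= rate * dOut[k];
    }
    for (int j = 0; j < hidden.length; j++) {
      for (int i = 0; i < x.length; i++) w1[j][i] -= rate * dHidden[j] * x[i];
      b1[j] -= rate * dHidden[j];
    }
  }
}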

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
