OK, I will try to update the source code to the latest version.

2013/6/10 Yexi Jiang (JIRA) <[email protected]>

> [
> https://issues.apache.org/jira/browse/MAHOUT-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13679837#comment-13679837]
>
> Yexi Jiang edited comment on MAHOUT-975 at 6/10/13 8:07 PM:
> ------------------------------------------------------------
>
> [~smarthi] When I apply this patch, the source code cannot be compiled.
> One of the errors is that hiddenActivations cannot be resolved. Another
> error is that the class Functions.NEGATE is misspelled as Function.NEGATE.
>
> > Bug in Gradient Machine - Computation of the gradient
> > ------------------------------------------------------
> >
> >                 Key: MAHOUT-975
> >                 URL: https://issues.apache.org/jira/browse/MAHOUT-975
> >             Project: Mahout
> >          Issue Type: Bug
> >          Components: Classification
> >    Affects Versions: 0.7
> >            Reporter: Christian Herta
> >            Assignee: Ted Dunning
> >             Fix For: 0.8
> >
> >         Attachments: GradientMachine.patch
> >
> > The initialisation used to compute the gradient descent weight updates
> > for the output units appears to be wrong:
> >
> > The comment says: "dy / dw is just w since y = x' * w + b."
> > This is wrong. dy/dw is x (ignoring the indices). The same
> > initialisation is done in the code.
> >
> > Check using neural network terminology:
> > The gradient machine is a specialized version of a multi-layer
> > perceptron (MLP).
> > In an MLP the gradient used to compute the "weight change" for the
> > output units is:
> > dE / dw_ij = dE / dz_i * dz_i / dw_ij   with z_i = sum_j (w_ij * a_j)
> > here: i is the index in the output layer, j the index in the hidden layer
> > (d stands for the partial derivative)
> > here: z_i = a_i (no squashing in the output layer)
> > With the special loss (cost function) E = 1 - a_g + a_b = 1 - z_g + z_b,
> > where
> > g: index of the output unit with target value +1 (positive class)
> > b: index of a random output unit with target value 0
> > =>
> > dE / dw_gj = dE/dz_g * dz_g/dw_gj = -1 * a_j (a_j: activity of hidden unit j)
> > dE / dw_bj = dE/dz_b * dz_b/dw_bj = +1 * a_j (a_j: activity of hidden unit j)
> >
> > That is what the code would compute if the comment were corrected:
> > dy/dw = x (x is here the activation of the hidden unit), times (-1) for
> > weights to the output unit with target value +1.
> >
> > ------------
> > In neural network implementations it is common to compute the gradient
> > numerically to test the implementation. This can be done by:
> > dE/dw_ij = (E(w_ij + epsilon) - E(w_ij - epsilon)) / (2 * epsilon)
>
> --
> This message is automatically generated by JIRA.
> If you think it was sent incorrectly, please contact your JIRA
> administrators.
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>

--
------
Yexi Jiang,
ECS 251, [email protected]
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/
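
To make the corrected derivation above concrete, here is a minimal Java sketch
of the output-layer gradient for the loss E = 1 - z_g + z_b. The class and
method names are illustrative only, not Mahout's actual GradientMachine code:
the point is that the gradient with respect to an output weight is the hidden
activation a_j (with the appropriate sign), not the weight itself.

// Illustrative only: one hidden layer with activations a_j and linear output
// units z_i = sum_j w[i][j] * a_j, trained with the ranking loss E = 1 - z_g + z_b.
public class OutputGradientSketch {

  /**
   * Analytic gradient of E = 1 - z_g + z_b with respect to the output weights.
   * Only the rows for the positive unit g and the sampled unit b are non-zero
   * (assumes g != b).
   */
  static double[][] outputWeightGradient(double[] hidden, int numOutputs, int g, int b) {
    double[][] grad = new double[numOutputs][hidden.length];
    for (int j = 0; j < hidden.length; j++) {
      grad[g][j] = -hidden[j]; // dE/dw_gj = dE/dz_g * dz_g/dw_gj = (-1) * a_j
      grad[b][j] = +hidden[j]; // dE/dw_bj = dE/dz_b * dz_b/dw_bj = (+1) * a_j
    }
    return grad;
  }
}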
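
The numerical check mentioned at the end of the issue can be sketched in the
same illustrative style: perturb one weight at a time and compare the central
difference against the analytic value. If the two agree (up to floating-point
noise), the analytic gradient is implemented correctly.

// Illustrative central-difference gradient check for E = 1 - z_g + z_b,
// where z_i = sum_j w[i][j] * a_j (biases omitted, g != b assumed).
public class GradientCheckSketch {

  static double loss(double[][] w, double[] hidden, int g, int b) {
    double zg = 0.0;
    double zb = 0.0;
    for (int j = 0; j < hidden.length; j++) {
      zg += w[g][j] * hidden[j];
      zb += w[b][j] * hidden[j];
    }
    return 1.0 - zg + zb;
  }

  /** Largest absolute difference between the analytic and numerical gradients. */
  static double maxGradientError(double[][] w, double[] hidden, int g, int b, double eps) {
    double maxErr = 0.0;
    for (int i = 0; i < w.length; i++) {
      for (int j = 0; j < hidden.length; j++) {
        double saved = w[i][j];
        w[i][j] = saved + eps;
        double ePlus = loss(w, hidden, g, b);
        w[i][j] = saved - eps;
        double eMinus = loss(w, hidden, g, b);
        w[i][j] = saved; // restore the original weight
        double numerical = (ePlus - eMinus) / (2.0 * eps);
        double analytic = (i == g) ? -hidden[j] : (i == b) ? hidden[j] : 0.0;
        maxErr = Math.max(maxErr, Math.abs(numerical - analytic));
      }
    }
    return maxErr;
  }
}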
