OK, I will try to update the source code to the latest version.

2013/6/10 Yexi Jiang (JIRA) <[email protected]>

> [
> https://issues.apache.org/jira/browse/MAHOUT-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13679837#comment-13679837]
>
> Yexi Jiang edited comment on MAHOUT-975 at 6/10/13 8:07 PM:
> ------------------------------------------------------------
>
> [~smarthi] When I apply this patch, the source code cannot be compiled.
> One of the errors is that hiddenActivations cannot be resolved. Another
> error is that the class Functions.NEGATE is misspelled as Function.NEGATE.
>
> > Bug in Gradient Machine - Computation of the gradient
> > ------------------------------------------------------
> >
> >                 Key: MAHOUT-975
> >                 URL: https://issues.apache.org/jira/browse/MAHOUT-975
> >             Project: Mahout
> >          Issue Type: Bug
> >          Components: Classification
> >    Affects Versions: 0.7
> >            Reporter: Christian Herta
> >            Assignee: Ted Dunning
> >             Fix For: 0.8
> >
> >         Attachments: GradientMachine.patch
> >
> > The initialisation used to compute the gradient descent weight updates
> > for the output units appears to be wrong:
> >
> > The comment says: "dy / dw is just w since y = x' * w + b."
> > This is wrong. dy/dw is x (ignoring the indices). The same
> > initialisation is done in the code.
> >
> > Check using neural network terminology:
> > The gradient machine is a specialized version of a multi-layer
> > perceptron (MLP).
> > In an MLP the gradient used to compute the "weight change" for the
> > output units is:
> > dE / dw_ij = dE / dz_i * dz_i / dw_ij   with z_i = sum_j (w_ij * a_j)
> > here: i is the index in the output layer, j the index in the hidden layer
> > (d stands for the partial derivative)
> > here: z_i = a_i (no squashing in the output layer)
> > With the special loss (cost function) E = 1 - a_g + a_b = 1 - z_g + z_b,
> > where
> > g: index of the output unit with target value +1 (positive class)
> > b: index of a random output unit with target value 0
> > =>
> > dE / dw_gj = dE/dz_g * dz_g/dw_gj = -1 * a_j (a_j: activity of hidden unit j)
> > dE / dw_bj = dE/dz_b * dz_b/dw_bj = +1 * a_j (a_j: activity of hidden unit j)
> >
> > That is what the code would compute if the comment were corrected:
> > dy/dw = x (x is here the activation of the hidden unit), times (-1) for
> > weights to the output unit with target value +1.
> >
> > ------------
> > In neural network implementations it is common to compute the gradient
> > numerically to test the implementation. This can be done by:
> > dE/dw_ij = (E(w_ij + epsilon) - E(w_ij - epsilon)) / (2 * epsilon)
>
> --
> This message is automatically generated by JIRA.
> If you think it was sent incorrectly, please contact your JIRA
> administrators.
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>

--
------
Yexi Jiang,
ECS 251, [email protected]
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/
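
To make the corrected derivation above concrete, here is a minimal Java sketch
of the output-layer gradient for the loss E = 1 - z_g + z_b. The class and
method names are illustrative only, not Mahout's actual GradientMachine code:
the point is that the gradient with respect to an output weight is the hidden
activation a_j (with the appropriate sign), not the weight itself.

// Illustrative only: one hidden layer with activations a_j and linear output
// units z_i = sum_j w[i][j] * a_j, trained with the ranking loss E = 1 - z_g + z_b.
public class OutputGradientSketch {

  /**
   * Analytic gradient of E = 1 - z_g + z_b with respect to the output weights.
   * Only the rows for the positive unit g and the sampled unit b are non-zero
   * (assumes g != b).
   */
  static double[][] outputWeightGradient(double[] hidden, int numOutputs, int g, int b) {
    double[][] grad = new double[numOutputs][hidden.length];
    for (int j = 0; j < hidden.length; j++) {
      grad[g][j] = -hidden[j]; // dE/dw_gj = dE/dz_g * dz_g/dw_gj = (-1) * a_j
      grad[b][j] = +hidden[j]; // dE/dw_bj = dE/dz_b * dz_b/dw_bj = (+1) * a_j
    }
    return grad;
  }
}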
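
The numerical check mentioned at the end of the issue can be sketched in the
same illustrative style: perturb one weight at a time and compare the central
difference against the analytic value. If the two agree (up to floating-point
noise), the analytic gradient is implemented correctly.

// Illustrative central-difference gradient check for E = 1 - z_g + z_b,
// where z_i = sum_j w[i][j] * a_j (biases omitted, g != b assumed).
public class GradientCheckSketch {

  static double loss(double[][] w, double[] hidden, int g, int b) {
    double zg = 0.0;
    double zb = 0.0;
    for (int j = 0; j < hidden.length; j++) {
      zg += w[g][j] * hidden[j];
      zb += w[b][j] * hidden[j];
    }
    return 1.0 - zg + zb;
  }

  /** Largest absolute difference between the analytic and numerical gradients. */
  static double maxGradientError(double[][] w, double[] hidden, int g, int b, double eps) {
    double maxErr = 0.0;
    for (int i = 0; i < w.length; i++) {
      for (int j = 0; j < hidden.length; j++) {
        double saved = w[i][j];
        w[i][j] = saved + eps;
        double ePlus = loss(w, hidden, g, b);
        w[i][j] = saved - eps;
        double eMinus = loss(w, hidden, g, b);
        w[i][j] = saved; // restore the original weight
        double numerical = (ePlus - eMinus) / (2.0 * eps);
        double analytic = (i == g) ? -hidden[j] : (i == b) ? hidden[j] : 0.0;
        maxErr = Math.max(maxErr, Math.abs(numerical - analytic));
      }
    }
    return maxErr;
  }
}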
