The gradient of T.eq is zero (almost) everywhere, and you're using it to compute num_win and num_lose. The T.argmax and the hard one-hot built with T.set_subtensor have the same problem: every step from pred to the loss is piecewise constant, so the gradient w.r.t. your parameters comes out exactly 0, which is what you're seeing.
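You can check this with a tiny graph: differentiate a sum of T.eq outputs and the gradient comes back as exact zeros rather than an error. A minimal sketch (the * 1.0 is only there to make the cost a float scalar):

import numpy as np
import theano
import theano.tensor as T

x = T.dvector('x')
# T.eq is piecewise constant: nudging x never changes its output,
# so Theano returns a gradient of exactly zero instead of raising
# a disconnected input error.
cost = T.sum(T.eq(x, 0.0) * 1.0)
g = T.grad(cost, x)
f = theano.function([x], g)
print(f(np.array([0.0, 1.0, 2.0])))  # -> [ 0.  0.  0.]

Any loss built from comparisons like this is flat almost everywhere, which matches the exactly-zero min/max/mean gradients you printed.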
On Sunday, March 5, 2017 at 2:42:14 PM UTC-8, tarom...@alum.northwestern.edu wrote:
>
> Also, the return values of this loss function are small compared to
> cross-entropy; some sample values after random initialization were around
> +/- 0.01. There is an LSTM layer and the input sequences are thousands of
> elements long, so I suspected vanishing gradients. However, I'm printing
> out the min, max, and mean of the gradients w.r.t. each parameter, and
> they are all exactly equal to 0, which seems to indicate a different
> problem.
>
> On Sunday, March 5, 2017 at 3:59:42 PM UTC-6, tarom...@alum.northwestern.edu wrote:
>>
>> I have defined a custom loss function, and despite the loss function
>> returning correct values given the inputs, the gradients are always 0
>> w.r.t. each of my parameters. I am not suppressing any Theano errors,
>> including the disconnected input error, so I can't explain what is
>> causing this. I have copied the loss function below; in words, I first
>> convert a 3-class softmax output into a one-hot representation, then I
>> compare a subset of it to the response and compute a quantity of
>> interest. More generally, I was under the impression that if one could
>> express a function using Theano ops, it could be used as a loss
>> function. Is this not the case?
>>
>> def calc_one_hot_loss(pred, y, mask):
>>     mask_flat = T.flatten(mask)
>>     length = T.sum(mask_flat, dtype='int32')
>>     pred_unmasked = pred[mask_flat.nonzero()]
>>     max_indices = T.argmax(pred_unmasked, axis=1)
>>     pred_zero = T.set_subtensor(pred_unmasked[:], 0)
>>     pred_one_hot = T.set_subtensor(pred_zero[T.arange(length), max_indices], 1)
>>     y_unmasked = y[mask_flat.nonzero()]
>>     unchanged_col = pred_one_hot[:, preprocess.unchanged_index]
>>     pred_up = T.flatten(pred_one_hot[T.eq(unchanged_col, 0).nonzero(), preprocess.up_index])
>>     pred_down = T.flatten(pred_one_hot[T.eq(unchanged_col, 0).nonzero(), preprocess.down_index])
>>     y_up = T.flatten(y_unmasked[T.eq(unchanged_col, 0).nonzero(), preprocess.up_index])
>>     y_down = T.flatten(y_unmasked[T.eq(unchanged_col, 0).nonzero(), preprocess.down_index])
>>     diff_up = T.abs_(pred_up - y_up)
>>     diff_down = T.abs_(pred_down - y_down)
>>     diff_sum = diff_up + diff_down
>>     num_win = T.sum(T.eq(diff_sum, 0))
>>     num_lose = T.sum(T.eq(diff_sum, 2))
>>     loss = -1 * (num_win - num_lose) / length
>>     return loss
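If you want something trainable that still tracks the win/lose count, one option is a smooth surrogate computed from the softmax probabilities instead of the hard one-hot. A rough sketch, assuming pred and y have the same 2-D layout as in your function (calc_soft_loss is my name, and it ignores the up/down/unchanged split for brevity):

import theano.tensor as T

def calc_soft_loss(pred, y, mask):
    # Use the probabilities directly instead of a hard one-hot,
    # so gradients can flow back through pred.
    mask_flat = T.flatten(mask)
    pred_unmasked = pred[mask_flat.nonzero()]
    y_unmasked = y[mask_flat.nonzero()]
    # Probability mass on the correct class at each unmasked step;
    # its mean is a smooth proxy for num_win / length.
    p_correct = T.sum(pred_unmasked * y_unmasked, axis=1)
    return -T.mean(p_correct)

You can keep calc_one_hot_loss as an evaluation metric and differentiate only the surrogate.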