Hi all,
I am building the following computation:
p_1 = 1 / (1 + T.exp(-T.dot(x, (w1-w2)) - (b1-b2)))
Here w1, w2, b1, and b2 are parameters, and I use the cross-entropy of p_1 as
the loss function.
But when I take the gradients with
T.grad(loss, [w1, b1, w2, b2])
all of the resulting gradients are nan.
What could be causing this, and is there a way to fix it?
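My only guess so far is that the sigmoid saturates, so the cross-entropy ends up evaluating log(0). Below is a small NumPy sketch of that suspicion. It is not my actual Theano graph: z stands in for T.dot(x, w1-w2) + (b1-b2), the values of z and y are made up, and the function names are hypothetical.

```python
import numpy as np

def naive_loss(z, y):
    # Sigmoid followed by cross-entropy, written the same way as p_1 above.
    p = 1.0 / (1.0 + np.exp(-z))
    return -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def stable_loss(z, y):
    # The same cross-entropy expressed directly in terms of the logit z:
    # max(z, 0) - y*z + log(1 + exp(-|z|)), which never takes log(0).
    return np.maximum(z, 0.0) - y * z + np.log1p(np.exp(-np.abs(z)))

# Made-up logits and labels: one moderate value, two saturating ones.
z = np.array([0.5, 40.0, -800.0])
y = np.array([1.0, 1.0, 0.0])

# Silence the overflow / divide-by-zero / invalid warnings for the demo.
with np.errstate(all="ignore"):
    naive = naive_loss(z, y)
stable = stable_loss(z, y)

print("naive :", naive)
print("stable:", stable)
```

In this sketch the naive version gives nan for the two saturating inputs because 0 * log(0) is evaluated, while the rewritten version stays finite. I do not know whether this is what actually happens inside my Theano graph.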
Thank you in advance!