I found the problem: the w1-w2 term caused the numerical instability.

On Thursday, January 5, 2017 at 10:14:37 PM UTC+8, Cosmo Zhang wrote:
>
> Hi all,
>
> I construct the following computation:
>
> p_1 = 1 / (1 + T.exp(-T.dot(x, (w1-w2)) - (b1-b2)))
>
> where w1, w2, b1, b2 are parameters, and I use a cross-entropy as the loss
> function.
> But when I take the gradients
> T.grad(loss, [w1, b1, w2, b2])
> all the resulting gradients are nan.
> Are there any possible reasons for this problem? And is there a solution?
>
> Thank you in advance!
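For anyone who finds this thread later: the messages above do not say exactly how w1-w2 triggered the instability, so the following is only a sketch of one common failure mode, not a confirmed diagnosis. It assumes the logit z = dot(x, w1-w2) + (b1-b2) grows large in magnitude, so that the sigmoid saturates and the cross-entropy evaluates log(0). It is written in plain Python rather than Theano, and the names naive_loss, stable_loss, and naive_fails are made up for illustration. In plain Python the naive form raises an exception; array libraries typically return inf or nan in the same situation.

```python
import math

def naive_loss(z, y):
    # Sigmoid followed by binary cross-entropy, as written in the question.
    p = 1.0 / (1.0 + math.exp(-z))
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def stable_loss(z, y):
    # Algebraically the same loss, computed from the logit directly:
    #   loss = log(1 + exp(z)) - y * z
    # with log(1 + exp(z)) evaluated as max(z, 0) + log1p(exp(-|z|)),
    # so exp never receives a large positive argument.
    return max(z, 0.0) + math.log1p(math.exp(-abs(z))) - y * z

def naive_fails(z, y):
    # True if the naive form raises or returns a non-finite value.
    try:
        return not math.isfinite(naive_loss(z, y))
    except (OverflowError, ValueError):
        return True

# Illustrative logits: moderate, saturated positive, saturated negative.
for z, y in [(0.5, 1), (40.0, 0), (-1000.0, 1)]:
    print(z, y, "naive fails:", naive_fails(z, y), "stable:", stable_loss(z, y))
```

If this is the cause, computing the loss from the logit (or clipping p away from 0 and 1 before taking the log) should keep both the loss and its gradients finite.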
