I found the problem: the w1-w2 term caused the numerical instability.
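
For anyone who hits the same thing, here is a minimal NumPy sketch of one common way this kind of nan can arise. It is an assumption about the mechanism, not a confirmed diagnosis of the model above: if x.(w1-w2) + (b1-b2) becomes large, the sigmoid saturates to exactly 1, log(1 - p_1) becomes -inf, and 0 * -inf gives nan. The names naive_bce and stable_bce and all the numbers are hypothetical, chosen only for illustration.

```python
import numpy as np

def naive_bce(z, y):
    # Direct translation of p_1 = 1 / (1 + exp(-z)) followed by cross-entropy.
    # Warnings are silenced so the nan shows up in the result instead.
    with np.errstate(over="ignore", divide="ignore", invalid="ignore"):
        p = 1.0 / (1.0 + np.exp(-z))
        return -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def stable_bce(z, y):
    # Algebraically the same loss, computed from z without forming p:
    # max(z, 0) - z*y + log(1 + exp(-|z|))
    return np.maximum(z, 0.0) - z * y + np.log1p(np.exp(-np.abs(z)))

# Hypothetical data: the second row makes x.(w1 - w2) very large.
x = np.array([[1.0, -1.0], [400.0, 400.0]])
w1 = np.array([1.0, 0.5])
w2 = np.array([0.0, 0.0])
b1, b2 = 0.0, 0.0
y = np.array([0.0, 1.0])

z = x.dot(w1 - w2) + (b1 - b2)   # the argument of the sigmoid
naive = naive_bce(z, y)          # second entry is nan
stable = stable_bce(z, y)        # both entries are finite

print("z      =", z)
print("naive  =", naive)
print("stable =", stable)
```

In float32 the saturation happens at much smaller values of z than in float64. In Theano, the stable form could probably be written as T.nnet.softplus(z) - z * y, with z = T.dot(x, w1 - w2) + (b1 - b2), though I have not tested that on the original model.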

On Thursday, January 5, 2017 at 10:14:37 PM UTC+8, Cosmo Zhang wrote:
>
> Hi all,
>
>     When I construct a computation such as
>
> p_1 = 1 / (1 + T.exp(-T.dot(x, (w1-w2)) - (b1-b2)))
>
>     where w1, w2, b1, and b2 are parameters, and use cross-entropy as the
> loss function,
>     then take the gradients with
> T.grad(loss, [w1, b1, w2, b2])
>     all the resulting gradients are nan.
>    Is there any possible reason for this problem, and is there a solution?
>
>     Thank you in advance!
>
>
>
