# [theano-users] Trying to figure out why getting NaNs with softplus

I've been banging my head against this problem for several hours so I
wanted to make sure one of my assumptions is not flawed (and hopefully get

I have a relatively simple network that is entirely linear except for the
loss function which is a sum of many softpluses.  A snippet of the code is
(here a_emb is a matrix and bias a scalar):

z = bias - (a_emb - b_emb).norm(2, axis=1)
if clip:
    z = z.clip(-bound, bound)
L0 = -T.nnet.softplus(-z)

L0 is one contribution to the loss.  There are a few more from higher rank
tensors that look like this (zn is a 3-tensor):

L1 = T.sum(-T.nnet.softplus(zn), axis=1)

The only real complexity in the network is the use of subtensor indexing.
Basically I'm training a very large embedding model so to avoid updating
the whole matrix I take all inputs (e.g. indices corresponding to "a_emb",
"b_emb" above), put them in a subtensor, and then extract them out again
(by subindexing).  I then only update the subtensor via something like this:

If it helps, I could post code showing how I set up the subtensor (but all
that stuff is just indexing; there's no non-linear operation there).

There's also a non-linear L2 loss function on the subtensor but I can't
imagine that's causing the problem:

L2lossV = subV.norm(2, axis=1)
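
For reference, the gradient of a row's 2-norm is x / ||x||, which is 0/0 when the row is exactly zero. The sketch below is plain Python, not Theano, and assumes the norm is evaluated as sqrt(sum(x**2)); the helper names and the `eps` value are hypothetical. It illustrates that the unsmoothed gradient is NaN for an all-zero row (for example a zero-initialised embedding row, or rows of a_emb and b_emb that coincide in the distance term), while a small epsilon under the square root keeps it finite.

```python
import math

def norm_grad(row):
    """Gradient of sqrt(sum(x**2)) w.r.t. each x, mimicking IEEE 0/0 -> NaN."""
    norm = math.sqrt(sum(x * x for x in row))
    if norm == 0.0:
        # Plain Python raises ZeroDivisionError on 0.0 / 0.0; array libraries
        # following IEEE 754 return NaN instead, so return NaN explicitly.
        return [math.nan for _ in row]
    return [x / norm for x in row]

def smoothed_norm_grad(row, eps=1e-8):
    """Gradient of sqrt(sum(x**2) + eps); eps is a hypothetical smoothing constant."""
    norm = math.sqrt(sum(x * x for x in row) + eps)
    return [x / norm for x in row]

zero_row = [0.0, 0.0, 0.0]
print(norm_grad(zero_row))           # every entry is NaN
print(smoothed_norm_grad(zero_row))  # every entry is 0.0
print(norm_grad([3.0, 4.0]))         # [0.6, 0.8]
```

Whether this applies here depends on whether any row can actually reach zero during training, which the snippets above do not show.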

I'm not sure all of the above is relevant, but the issue is that I'm getting
NaNs very consistently and I'm having a hard time figuring out which
operation is causing them (using NanGuardMode just made the code too slow to
ever get to the NaN).
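
A cheaper alternative to guarding every op might be to check a handful of intermediate values (z, L0, the norm terms, the gradients) once per minibatch. Below is a minimal sketch in plain Python; `first_non_finite` is a hypothetical helper, and the values are made up to stand in for extra outputs of the compiled function, flattened to lists.

```python
import math

def first_non_finite(named_values):
    """Return the name of the first entry containing NaN or +/-inf, else None.

    named_values: iterable of (name, flat sequence of floats) pairs, listed in
    the order the quantities are computed in the graph.
    """
    for name, values in named_values:
        if any(not math.isfinite(v) for v in values):
            return name
    return None

# Hypothetical values standing in for one minibatch's intermediate outputs.
step_outputs = [
    ("z", [-1.5, 0.25, 3.0]),
    ("L0", [-1.70, -0.58, -0.05]),
    ("grad_subV", [0.1, float("nan"), -0.2]),
]
print(first_non_finite(step_outputs))  # grad_subV
```

Listing the quantities in graph order means the first name reported is the earliest point at which something went non-finite.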

As you can see from the above, I tried to fight the NaNs by clipping the
input to softplus, but this doesn't seem to work.  I clip the inputs to the
range [-10, 10] but I still get NaNs.
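
As a sanity check on that clipping, here is a plain-Python sketch (not Theano's implementation) of softplus and its derivative, the logistic sigmoid, evaluated at the bounds. Both stay finite on [-10, 10], far from any float32 overflow, which would suggest that the NaN comes from some other op or from the gradient of one.

```python
import math

def softplus(x):
    """Numerically stable softplus: log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def sigmoid(x):
    """Derivative of softplus, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

bound = 10.0  # the clipping bound used for z
for z in (-bound, 0.0, bound):
    value, slope = softplus(z), sigmoid(z)
    assert math.isfinite(value) and math.isfinite(slope)
    print(f"z={z:+.1f}  softplus={value:.6f}  sigmoid={slope:.6f}")
```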

My understanding, from reading around a little, was that softplus was
supposed to help avoid NaNs so I'm a bit confused that they're still
cropping up (and I can't see where else they could come from).  I would
appreciate any advice as to how to figure out the problem or even code
around it.

This is all with Theano 0.8.2 on Ubuntu 16.04, and I'm using a CPU (but with
float32).

Thanks in advance for any help.
