Thank you very much.
That is an excellent method: "You can bypass that by sampling all the
masks first and passing them as a sequence."
--------------
Qiang Cui
On Saturday, March 7, 2015 at 5:01:11 AM UTC+8, Pascal Lamblin wrote:
>
> Hi,
>
> On Thu, Mar 05, 2015, Bitton Tenessi wrote:
> > Is there a way to make it work?
>
> This is a limitation we are aware of. The main problem is that scan is
> currently not able to recreate the right random sample at each step when
> computing the gradient.
>
> You can bypass that by sampling all the masks first and passing them as a
> sequence.
>
> Also note that since you initialize x and w with integers, the gradient
> wrt w (w_grad) will be 0. If you initialize them with float32, you can do:
>
> x = th.shared(np.array([1,2,3], dtype='float32'))
> w = th.shared(np.array([5,6,7], dtype='float32'))
>
> rng = RandomStreams()
> masks = rng.binomial(size=[3] + [x.shape[i] for i in range(x.ndim)])
>
> def step(idx, mask):
>     x_drop = mask * x
>     out = t.dot(x_drop, w)
>     return out
>
> res, updates = th.scan(step, sequences=[t.arange(3), masks])
> w_grad = t.grad(res.sum(), w)
> fun = function([], [w_grad], updates=updates)
> print fun()
>
> [array([ 1., 2., 3.])]
>
> >
> > from theano.tensor.shared_randomstreams import RandomStreams
> > from theano import function
> > import numpy as np
> > import theano.tensor as t
> > import theano as th
> > from theano.printing import Print
> >
> > x = th.shared(np.array([1,2,3]))
> > w = th.shared(np.array([5,6,7]))
> >
> > def step(idx):
> >     rng = RandomStreams()
> >     mask = rng.binomial(size=x.shape)
> >     x_drop = mask * x
> >     out = t.dot(x_drop, w)
> >     return out
> > res, updates = th.scan(step, sequences=t.arange(3))
> > w_grad = t.grad(res.sum(), w)
> > fun = function([], [w_grad], updates=updates)
> > print fun()
> >
> > w_grad = t.grad(res.sum(), w)
> >   File "C:\Python27\lib\site-packages\theano\gradient.py", line 543, in grad
> >     grad_dict, wrt, cost_name)
> >   File "C:\Python27\lib\site-packages\theano\gradient.py", line 1273, in _populate_grad_dict
> >     rval = [access_grad_cache(elem) for elem in wrt]
> >   File "C:\Python27\lib\site-packages\theano\gradient.py", line 1243, in access_grad_cache
> >     term.type.why_null)
> > theano.gradient.NullTypeGradError: tensor.grad encountered a NaN. This
> > variable is Null because the grad method for input 2 (<RandomStateType>)
> > of the for{cpu,scan_fn} op is mathematically undefined. Depends on a
> > shared variable
> >
>
>
> --
> Pascal
>
--
---
You received this message because you are subscribed to the Google Groups
"theano-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
For more options, visit https://groups.google.com/d/optout.