Thank you! I will try it.

On Thursday, March 30, 2017 at 12:58:34 AM UTC+1, Pascal Lamblin wrote:
>
> You can try OpFromGraph with inline=True, and specify
> override_gradients=... with the right expression.
> This is still experimental.
>
> On Wed, Mar 29, 2017, [email protected] wrote:
> > Hi guys!
> >
> > I recently started using Theano and am struggling to implement a custom
> > gradient for a stochastic node. Can anyone help me?
> >
> > What I want is an op that produces a one-hot vector whose hot element is
> > sampled from a softmax distribution.
> > The op is not differentiable, but I want to "fake" it as if its gradient
> > were the softmax's ("straight-through estimator").
> > Below is the minimal code that performs the forward pass; it raises
> > DisconnectedInputError due to the missing gradient.
> >
> > import theano
> > import theano.tensor as T
> > import numpy as np
> >
> > logits_values = np.random.uniform(-1, 1, size=3)
> > logits = theano.shared(logits_values, 'logits')
> > probabilities = T.nnet.softmax(logits)
> > print('probabilities', probabilities.eval())
> > # result: probabilities [[ 0.55155489  0.290773    0.15767211]]
> >
> > random_streams = T.shared_randomstreams.RandomStreams()
> > index = random_streams.choice(size=(1,), a=3, p=probabilities[0])
> > samples = T.extra_ops.to_one_hot(index, logits.shape[-1])
> > print('samples', samples.eval())
> > # result: samples [[ 1.  0.  0.]]
> >
> > # We want to use the gradient of probabilities instead of samples!
> > samples_grad = T.grad(samples[0][0], logits)
> > # result: raises DisconnectedInputError
> >
> > The node is not the final layer, so I can't use a categorical
> > cross-entropy loss to train it.
> >
> > I am trying to implement a custom op (see attached stochastic_softmax.py),
> > but it is not working in practice.
> >
> > Since I have a working expression for the forward pass, can I simply
> > override the gradient of an existing expression?
> >
> > import numpy as np
> > import theano
> > import theano.tensor as T
> >
> >
> > class StochasticSoftmax(theano.Op):
> >     def __init__(self, random_state=np.random.RandomState()):
> >         self.random_state = random_state
> >
> >     def make_node(self, x):
> >         x = T.as_tensor_variable(x)
> >         return theano.Apply(self, [x], [x.type()])
> >
> >     def perform(self, node, inputs, output_storage):
> >         # Gumbel-max trick
> >         x, = inputs
> >         z = self.random_state.gumbel(loc=0, scale=1, size=x.shape)
> >         indices = (x + z).argmax(axis=-1)
> >         y = np.eye(x.shape[-1], dtype=np.float32)[indices]
> >         output_storage[0][0] = y
> >
> >     def grad(self, inp, grads):
> >         x, = inp
> >         g_sm, = grads
> >
> >         sm = T.nnet.softmax(x)
> >         return [T.nnet.softmax_grad(g_sm, sm)]
> >
> >     def infer_shape(self, node, i0_shapes):
> >         return i0_shapes
>
> --
> Pascal
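
In case it helps anyone who finds this thread later, below is the rough, untested sketch I plan to try: wrapping the existing sampling expression in OpFromGraph and overriding its gradient with the softmax gradient (straight-through estimator). Two assumptions on my side: I am guessing the keyword argument is grad_overrides (Pascal wrote override_gradients above, so the exact name may differ between Theano versions), and that the override is a callable taking (inputs, output_grads) and returning the gradients with respect to the inputs. I am also not yet sure how OpFromGraph handles the shared random state created by RandomStreams inside the wrapped graph.

import numpy as np
import theano
import theano.tensor as T

# Symbolic input for the graph to be wrapped.
logits_var = T.dvector('logits_var')
probabilities = T.nnet.softmax(logits_var)

# Forward pass: sample an index from the softmax and turn it into a
# one-hot row, exactly as in the original snippet.
random_streams = T.shared_randomstreams.RandomStreams()
index = random_streams.choice(size=(1,), a=3, p=probabilities[0])
samples = T.extra_ops.to_one_hot(index, logits_var.shape[-1])

# Backward pass override: pretend the op had computed softmax(logits_var),
# so the gradient flowing back is the softmax gradient.
def straight_through_grad(inputs, output_grads):
    x, = inputs
    g_y, = output_grads
    sm = T.nnet.softmax(x)
    return [T.nnet.softmax_grad(g_y, sm)]

stochastic_softmax = theano.OpFromGraph(
    [logits_var], [samples],
    inline=True,
    grad_overrides=straight_through_grad)  # keyword name assumed; may be override_gradients

# Usage with the shared logits from the original post:
logits = theano.shared(np.random.uniform(-1, 1, size=3), 'logits')
y = stochastic_softmax(logits)
print('samples', y.eval())
print('grad', T.grad(y[0][0], logits).eval())  # hopefully no DisconnectedInputError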
