Thank you! I will try it.

On Thursday, March 30, 2017 at 12:58:34 AM UTC+1, Pascal Lamblin wrote:
>
> You can try OpFromGraph with inline=True, and specify
> override_gradients=... with the right expression.
> This is still experimental.
>
> On Wed, Mar 29, 2017, [email protected] wrote:
> > Hi guys!
> >
> > I recently started using Theano and am struggling to implement a custom
> > gradient for a stochastic node. Can anyone help me?
> >
> > What I want is an op that produces a one-hot vector whose hot element is
> > sampled from a softmax distribution.
> > The op is not differentiable, but I want to "fake" it as if its gradient
> > were the softmax's ("straight-through estimator").
> > Below is the minimal code that performs the forward pass; it raises
> > DisconnectedInputError due to the missing gradient.
> >
> > import theano
> > import theano.tensor as T
> > import numpy as np
> >
> > logits_values = np.random.uniform(-1, 1, size=3)
> > logits = theano.shared(logits_values, 'logits')
> > probabilities = T.nnet.softmax(logits)
> > print('probabilities', probabilities.eval())
> > # result: probabilities [[ 0.55155489  0.290773    0.15767211]]
> >
> > random_streams = T.shared_randomstreams.RandomStreams()
> > index = random_streams.choice(size=(1,), a=3, p=probabilities[0])
> > samples = T.extra_ops.to_one_hot(index, logits.shape[-1])
> > print('samples', samples.eval())
> > # result: samples [[ 1.  0.  0.]]
> >
> > # We want to use the gradient of probabilities instead of samples!
> > samples_grad = T.grad(samples[0][0], logits)
> > # result: raises DisconnectedInputError
> >
> > The node is not the final layer, so I can't use a categorical
> > cross-entropy loss to train it.
> >
> > I am trying to implement a custom op (see attached stochastic_softmax.py),
> > but it is not working in practice.
> >
> > Since I have a working expression for the forward pass, can I simply
> > override the gradient of an existing expression?
> >
> > import numpy as np
> > import theano
> > import theano.tensor as T
> >
> >
> > class StochasticSoftmax(theano.Op):
> >     def __init__(self, random_state=np.random.RandomState()):
> >         self.random_state = random_state
> >
> >     def make_node(self, x):
> >         x = T.as_tensor_variable(x)
> >         return theano.Apply(self, [x], [x.type()])
> >
> >     def perform(self, node, inputs, output_storage):
> >         # Gumbel-max trick
> >         x, = inputs
> >         z = self.random_state.gumbel(loc=0, scale=1, size=x.shape)
> >         indices = (x + z).argmax(axis=-1)
> >         y = np.eye(x.shape[-1], dtype=np.float32)[indices]
> >         output_storage[0][0] = y
> >
> >     def grad(self, inp, grads):
> >         x, = inp
> >         g_sm, = grads
> >
> >         sm = T.nnet.softmax(x)
> >         return [T.nnet.softmax_grad(g_sm, sm)]
> >
> >     def infer_shape(self, node, i0_shapes):
> >         return i0_shapes
>
> --
> Pascal
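
In case it helps anyone who finds this thread later, below is the rough, untested sketch I plan to try: wrapping the existing sampling expression in OpFromGraph and overriding its gradient with the softmax gradient (straight-through estimator). Two assumptions on my side: I am guessing the keyword argument is grad_overrides (Pascal wrote override_gradients above, so the exact name may differ between Theano versions), and that the override is a callable taking (inputs, output_grads) and returning the gradients with respect to the inputs. I am also not yet sure how OpFromGraph handles the shared random state created by RandomStreams inside the wrapped graph.

import numpy as np
import theano
import theano.tensor as T

# Symbolic input for the graph to be wrapped.
logits_var = T.dvector('logits_var')
probabilities = T.nnet.softmax(logits_var)

# Forward pass: sample an index from the softmax and turn it into a
# one-hot row, exactly as in the original snippet.
random_streams = T.shared_randomstreams.RandomStreams()
index = random_streams.choice(size=(1,), a=3, p=probabilities[0])
samples = T.extra_ops.to_one_hot(index, logits_var.shape[-1])

# Backward pass override: pretend the op had computed softmax(logits_var),
# so the gradient flowing back is the softmax gradient.
def straight_through_grad(inputs, output_grads):
    x, = inputs
    g_y, = output_grads
    sm = T.nnet.softmax(x)
    return [T.nnet.softmax_grad(g_y, sm)]

stochastic_softmax = theano.OpFromGraph(
    [logits_var], [samples],
    inline=True,
    grad_overrides=straight_through_grad)  # keyword name assumed; may be override_gradients

# Usage with the shared logits from the original post:
logits = theano.shared(np.random.uniform(-1, 1, size=3), 'logits')
y = stochastic_softmax(logits)
print('samples', y.eval())
print('grad', T.grad(y[0][0], logits).eval())  # hopefully no DisconnectedInputError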
