You can try OpFromGraph with inline=True, and pass grad_overrides=... with the right gradient expression. This feature is still experimental.
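To make concrete what the overridden gradient would have to compute, here is a plain-NumPy sketch of the straight-through idea (no Theano required): the forward pass emits a hard one-hot sample, and the backward pass substitutes the softmax vector-Jacobian product, which is the same formula T.nnet.softmax_grad implements symbolically. The helper names softmax and softmax_vjp are local to this sketch, not Theano APIs.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax along the last axis
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def softmax_vjp(g, sm):
    # vector-Jacobian product of softmax: sm * (g - <g, sm>)
    # (the same formula T.nnet.softmax_grad computes symbolically)
    return sm * (g - (g * sm).sum(axis=-1, keepdims=True))

rng = np.random.RandomState(0)
logits = rng.uniform(-1, 1, size=3)
sm = softmax(logits[None, :])

# forward: hard one-hot sample from the softmax distribution
index = rng.choice(3, p=sm[0])
one_hot = np.eye(3)[index][None, :]

# backward (straight-through): the gradient of one_hot[0, 0] w.r.t.
# logits is taken as if the op had been softmax itself
upstream = np.zeros_like(one_hot)
upstream[0, 0] = 1.0
st_grad = softmax_vjp(upstream, sm)
```

This is exactly the pairing the question asks for: a non-differentiable sample on the forward pass, softmax's gradient on the backward pass.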
On Wed, Mar 29, 2017, nokunok...@gmail.com wrote:
> Hi guys!
>
> I recently started using Theano and am struggling to implement a custom
> gradient for a stochastic node. Can anyone help me?
>
> What I want is an op that produces a one-hot vector whose hot element is
> sampled from a softmax distribution.
> The op is not differentiable, but I want to "fake" its gradient as if it
> were softmax's ("straight-through estimator").
> Below is the minimal code that performs the forward pass, which raises
> DisconnectedInputError due to the missing gradient.
>
>     import theano
>     import theano.tensor as T
>     import numpy as np
>
>     logits_values = np.random.uniform(-1, 1, size=3)
>     logits = theano.shared(logits_values, 'logits')
>     probabilities = T.nnet.softmax(logits)
>     print('probabilities', probabilities.eval())
>     # result: probabilities [[ 0.55155489  0.290773    0.15767211]]
>
>     random_streams = T.shared_randomstreams.RandomStreams()
>     index = random_streams.choice(size=(1,), a=3, p=probabilities[0])
>     samples = T.extra_ops.to_one_hot(index, logits.shape[-1])
>     print('samples', samples.eval())
>     # result: samples [[ 1.  0.  0.]]
>
>     # We want to use the gradient of probabilities instead of samples!
>     samples_grad = T.grad(samples[0][0], logits)
>     # result: raises DisconnectedInputError
>
> The node is not the final layer, so I can't use categorical cross-entropy
> loss to train it.
>
> I am trying to implement a custom op (see attached stochastic_softmax.py),
> but it is not working in practice.
>
> Since I have a working expression for the forward pass, can I simply
> override the gradient of an existing expression?
>
> --
> You received this message because you are subscribed to the Google Groups
> "theano-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to theano-users+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
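The attached op relies on the Gumbel-max trick: adding independent standard Gumbel noise to the logits and taking the argmax yields an exact sample from the categorical distribution with softmax probabilities. A quick NumPy check of that claim, comparing empirical frequencies against the softmax probabilities (logits values here are arbitrary illustration data):

```python
import numpy as np

def softmax(x):
    # numerically stable softmax for a 1-D vector
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.RandomState(42)
logits = np.array([0.5, -0.2, 1.0])
p = softmax(logits)

# Gumbel-max trick: argmax(logits + Gumbel noise) is an exact sample
# from Categorical(softmax(logits))
n = 200000
z = rng.gumbel(loc=0, scale=1, size=(n, 3))
counts = np.bincount((logits + z).argmax(axis=-1), minlength=3)
freq = counts / float(n)
```

With 200,000 draws the empirical frequencies should match the softmax probabilities to within about a percentage point of sampling error.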
> stochastic_softmax.py (note: the original attachment hard-coded the one-hot
> output to float32, which can fail the output type check when x is float64;
> using x.dtype below fixes that):
>
>     import numpy as np
>     import theano
>     import theano.tensor as T
>
>
>     class StochasticSoftmax(theano.Op):
>         __props__ = ()
>
>         def __init__(self, random_state=np.random.RandomState()):
>             self.random_state = random_state
>
>         def make_node(self, x):
>             x = T.as_tensor_variable(x)
>             return theano.Apply(self, [x], [x.type()])
>
>         def perform(self, node, inputs, output_storage):
>             # Gumbel-max trick: argmax over noisy logits samples
>             # from the softmax distribution
>             x, = inputs
>             z = self.random_state.gumbel(loc=0, scale=1, size=x.shape)
>             indices = (x + z).argmax(axis=-1)
>             # dtype must match the output type declared in make_node
>             y = np.eye(x.shape[-1], dtype=x.dtype)[indices]
>             output_storage[0][0] = y
>
>         def grad(self, inp, grads):
>             # straight-through: pretend the op was softmax
>             x, = inp
>             g_sm, = grads
>             sm = T.nnet.softmax(x)
>             return [T.nnet.softmax_grad(g_sm, sm)]
>
>         def infer_shape(self, node, i0_shapes):
>             return i0_shapes

--
Pascal
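The grad method above hinges on the identity behind T.nnet.softmax_grad: the vector-Jacobian product of softmax is sm * (g - <g, sm>). A NumPy sanity check of that formula against central finite differences (helper names here are local to the check, not Theano APIs):

```python
import numpy as np

def softmax(x):
    # numerically stable softmax for a 1-D vector
    e = np.exp(x - x.max())
    return e / e.sum()

def softmax_grad(g, sm):
    # vector-Jacobian product of softmax: sm * (g - <g, sm>),
    # the same formula T.nnet.softmax_grad implements symbolically
    return sm * (g - np.dot(g, sm))

x = np.array([0.3, -0.5, 0.8])
g = np.array([1.0, 0.0, 0.0])  # upstream gradient selecting output 0

analytic = softmax_grad(g, softmax(x))

# central finite differences of output 0 w.r.t. each input
eps = 1e-6
numeric = np.array([
    (softmax(x + eps * np.eye(3)[j])[0] - softmax(x - eps * np.eye(3)[j])[0])
    / (2 * eps)
    for j in range(3)
])
```

If the two gradients agree, the straight-through op's backward pass is computing exactly what softmax's own backward pass would.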