Hi, thank you for the suggestion; inlining indeed makes more sense for what I am trying to do.
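(By "inlining" I mean constructing the ops with OpFromGraph's inline=True flag, as in the full listing below; here is a minimal stand-alone form, detached from my actual graph, just to make sure we are talking about the same thing:)

import theano
import theano.tensor as T

x = T.vector()
# With inline=True the wrapped graph is expanded into the outer graph during
# optimization instead of being compiled as a separate sub-function.
double = theano.OpFromGraph(inputs=[x], outputs=[2 * x], inline=True)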
However, a casting issue arises when trying to compute the derivative with respect to the continuous input. If I understood correctly, a theano.gradient.DisconnectedType()() value should be returned as the gradient for integer-typed inputs (or for inputs with respect to which I do not need the derivative), right? Below is the slightly modified code which illustrates this new issue:

----

import numpy as np
import theano.tensor as T
import theano


def make_ops():
    x_var = T.vector()
    m_var = T.bvector()

    r = m_var.sum().astype('floatX')
    z = x_var * m_var / r

    def grad_op1(inputs, output_gradients):
        return [
            output_gradients[0],  # computation delegated to op2
            theano.gradient.DisconnectedType()(),
        ]

    op1 = theano.OpFromGraph(
        inputs=[x_var, m_var],
        outputs=[z, r],
        grad_overrides=grad_op1,
        inline=True)

    z_var = T.vector()
    r_var = T.scalar()

    def grad_op2(inputs, output_gradients):
        _, m_, r_ = inputs
        return [
            m_ * r_,
            theano.gradient.DisconnectedType()(),
            theano.gradient.DisconnectedType()()
        ]

    op2 = theano.OpFromGraph(
        inputs=[z_var, m_var, r_var],
        outputs=[z_var],
        grad_overrides=grad_op2,
        inline=True)

    return op1, op2


op1, op2 = make_ops()
x_var = T.vector()
m_var = T.bvector()
z_, r = op1(x_var, m_var)
z = op2(z_, m_var, r)

g = theano.grad(T.sum(z), wrt=x_var)
print(g.eval({x_var: np.array([1., .3, .0, .2], dtype=np.float32),
              m_var: np.array([1, 0, 1, 1], dtype=np.int8)}))

On Tuesday, 11 July 2017 at 11:32:50 UTC+2, nicolas....@gmail.com wrote:
>
> Hi,
>
> I am trying to split a computation over two ops in order to avoid
> spurious computations when computing the gradient.
> My current attempt uses a first op which returns the desired result for
> the forward part plus extra intermediate results. The second op just
> forwards the desired result, but its grad is overridden to compute the
> gradient based on the intermediate results.
>
> In this configuration, Theano complains about unused inputs in the forward
> computation because the intermediate results are not used by the forward
> method of the second op.
>
> Is this the expected behaviour or a bug?
>
> ----
>
> import numpy as np
> import theano.tensor as T
> import theano
>
>
> def make_ops():
>     x_var = T.vector()
>     m_var = T.bvector()
>
>     r = m_var.sum().astype('floatX')
>     z = x_var * m_var / r
>
>     def grad_op1(inputs, output_gradients):
>         return [
>             output_gradients[0],  # computation delegated to op2
>             theano.gradient.DisconnectedType()()
>         ]
>
>     op1 = theano.OpFromGraph(
>         inputs=[x_var, m_var],
>         outputs=[z, r],
>         grad_overrides=grad_op1)
>
>     z_var = T.vector()
>     r_var = T.scalar()
>
>     def grad_op2(inputs, output_gradients):
>         _, m_, r_ = inputs
>         return [
>             m_ * r_,
>             theano.gradient.DisconnectedType()(),
>             theano.gradient.DisconnectedType()()
>         ]
>
>     op2 = theano.OpFromGraph(
>         inputs=[z_var, m_var, r_var],
>         outputs=[z_var],
>         grad_overrides=grad_op2)
>
>     return op1, op2
>
>
> op1, op2 = make_ops()
> x_var = T.vector()
> m_var = T.bvector()
> z_, r = op1(x_var, m_var)
> z = op2(z_, m_var, r)
>
> print(z_.eval({x_var: np.array([1., .3, .0, .2], dtype=np.float32),
>                m_var: np.array([1, 0, 1, 1], dtype=np.int8)}))
>
> f = theano.function([x_var, m_var], [z], on_unused_input='ignore')  # raises anyway
>
> print(f(np.array([1., .3, .0, .2], dtype=np.float32),
>         np.array([1, 0, 1, 1], dtype=np.int8)))
>
> # g = theano.grad(T.sum(z), wrt=x_var)
> # print(g.eval({x_var: np.array([1., .3, .0, .2], dtype=np.float32),
> #               m_var: np.array([1, 0, 1, 1], dtype=np.int8)}))
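P.S. To make the casting question more concrete: one variant I am considering (but have not verified against the error above) is to cast the int8 mask to floatX explicitly inside the overridden gradient, so that the expression returned for the continuous input is purely floating point. m_float below is a name I am introducing just for this sketch; the rest is grad_op2 from the listing above, with the same imports:

def grad_op2(inputs, output_gradients):
    _, m_, r_ = inputs
    # Cast the int8 mask so no integer variable appears in the gradient wrt x.
    m_float = T.cast(m_, theano.config.floatX)
    return [
        m_float * r_,
        theano.gradient.DisconnectedType()(),
        theano.gradient.DisconnectedType()()
    ]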