Hi!
I think I am misunderstanding something about theano.scan or theano.grad,
possibly both.
For a TensorVariable ld representing a list of costs, I have thought of two
ways of computing the element-wise gradients with respect to a list of
shared variables that all of these costs depend on, *and* of subsequently
adding those gradients. (In truth, I want to do something in between those
two steps; that is why I don't just compute the gradient after the
summation. But that is not relevant to the question.)
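For concreteness, here is a toy stand-in for my actual setup; the real ld
comes from a bigger model, and the names w, b, and x below are purely
illustrative:

import numpy as np
import theano
import theano.tensor as T

# Illustrative stand-in: two shared parameters and a per-example cost
# ld[i] = (dot(x[i], w) + b)**2
w = theano.shared(np.ones(3), name='w')
b = theano.shared(0.5, name='b')
ParameterList = [w, b]

x = T.matrix('x')
ld = (T.dot(x, w) + b) ** 2  # vector of costs, one per row of x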
*1.- This way appears to be successful*
grads, _ = theano.scan(lambda i, ld: theano.grad(ld[i], ParameterList),
                       sequences=T.arange(ld.shape[0]),
                       non_sequences=ld)
# grads is a list with one entry per parameter; each entry stacks the
# per-example gradients along axis 0
finalgrad = [T.sum(entry, axis=0) for entry in grads]
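With the toy setup above, this compiles and runs as expected:

f = theano.function([x], finalgrad)
print(f(np.eye(3)))  # same values as theano.grad(T.sum(ld), ParameterList)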
*2.- Unsuccessful*
def listsum(lA, lB):
    # element-wise sum of two equally long lists
    if len(lA) != len(lB):
        raise ValueError("lists must have the same length")
    return [a + b for a, b in zip(lA, lB)]
grads, _ = theano.scan(lambda i, ld, out: listsum(out, theano.grad(ld[i], ParameterList)),
                       sequences=T.arange(ld.shape[0]),
                       non_sequences=ld,
                       outputs_info=np.zeros((len(self.ParametersSet['LogDensity']),)))
finalgrad = grads[-1]
This method throws a DisconnectedInputError, with a traceback pointing to
where the variable in question was created. Why does this method fail? All
of the variables ld[i] certainly depend on the shared variables in
ParameterList.
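In the toy setup above, the same gradient taken outside of scan builds
without complaint:

g = theano.grad(ld[0], ParameterList)  # no error here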
Thanks!
Daniel