You need to take the subtensor in the forward pass to save all of that
computation. It is very hard to remove useless computation when the
subtensor sits at the end of the graph: we implement very few of the
optimizations that would be needed for that. So move the subtensor into the
forward computation.
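
Concretely, a minimal sketch of this approach (it reuses the X/Y names from
the example quoted below; the code is illustrative, not taken from the
thread): build the cost on the slice, then take the gradient with respect to
the slice.

  import theano
  import theano.tensor as T

  X = T.matrix('X')

  X0 = X[0]                  # subtensor taken in the forward pass
  Y = T.sum(X0 ** 2)         # cost built on the slice, not on the full X

  g0 = T.grad(Y, X0)         # works: Y is connected to X0
  f = theano.function([X], g0)

  # Only the first row enters the compiled graph.
  print(f([[1.0, 2.0], [3.0, 4.0]]))   # -> [ 2.  4.]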

Fred

On Mon, Oct 2, 2017 at 23:35, dhern <[email protected]> wrote:

> Thanks for the reply.
>
> Right, that method, however, seems to address the issue only for gradients
> with respect to shared variables. I am interested, as in the code above, in
> taking symbolic gradients with respect to subarrays of Theano tensors. That
> doesn't seem to be possible, correct? I will look more closely into taking
> a subtensor of the gradient, although I am not sure it reduces computation
> time in my actual code, since that is what I did to begin with and it is
> still very time consuming.
>
>
> On Thursday, September 28, 2017 at 3:32:19 PM UTC-4, Pascal Lamblin wrote:
>
>> Maybe the following can help you.
>>
>>
>> http://deeplearning.net/software/theano/tutorial/faq_tutorial.html#how-to-update-a-subset-of-weights
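>>
>> (The pattern described on that page is roughly the following sketch; the
>> names W, idx and x are made up here. The subtensor of the shared variable
>> is taken in the forward pass, and the update uses inc_subtensor:)
>>
>>   import numpy
>>   import theano
>>   import theano.tensor as T
>>
>>   W = theano.shared(numpy.zeros((5, 3)), name='W')  # full weight matrix
>>   idx = T.ivector('idx')        # indices of the rows actually used
>>   x = T.matrix('x')             # one target row per selected index
>>
>>   W_subset = W[idx]             # subtensor taken in the forward pass
>>   cost = T.sum((x - W_subset) ** 2)
>>
>>   g = T.grad(cost, wrt=W_subset)   # gradient only for the selected rows
>>   updates = [(W, T.inc_subtensor(W_subset, -0.1 * g))]
>>   train = theano.function([x, idx], cost, updates=updates)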
>>
>> Also, if you take a subtensor of the gradient itself, some graph
>> optimizations can apply that avoid computing the full gradient.
>>
>> For instance, with your example, the "subtensor" and "* 2" operations
>> are swapped:
>>
>>  >>> grad0 = full_grad[0]
>>  >>> g0 = theano.function([X, Y], grad0)
>>
>>  >>> theano.printing.debugprint(g0)
>> Elemwise{mul,no_inplace} [id A] ''   1
>>   |TensorConstant{(1,) of 2.0} [id B]
>>   |Subtensor{int64} [id C] ''   0
>>     |<TensorType(float64, matrix)> [id D]
>>     |Constant{0} [id E]
>>
>>
>> On 2017-09-27 05:25 PM, Daniel Hernandez wrote:
>> > Hi,
>> >
>> > I was wondering if someone here had an answer to this unsolved question
>> > over in stack overflow:
>> >
>> >
>> https://stackoverflow.com/questions/37545325/theano-gradient-of-subtensor
>> >
>> > Basically, how do you compute gradients w.r.t. a subtensor?
>> >
>> > The question arises in the context of large tensors, say Y and X, where
>> > it is known that each entry in Y depends only on a small subset of the
>> > entries of X. Taking T.grad(Y, X) is computationally expensive since it
>> > will compute every possible gradient, so one would like to be able to
>> > compute, e.g., T.grad(Y, X[i]). Here is some basic code illustrating the
>> > problem.
>> >
>> > import theano.tensor as T
>> >
>> > X = T.matrix()
>> > Y = T.sum(X**2)
>> >
>> > full_grad = T.grad(Y, X) # This works
>> >
>> > X0 = X[0]
>> > test = T.grad(Y, X0) # This pukes a Disconnected Input error
>> >
>> > Silencing the Disconnected Input error can be done in grad, but of course
>> > that doesn't solve anything: evaluating the gradients only results in a
>> > bunch of 0s. So, is there a way of taking these gradients with respect
>> > to a subtensor?
>> >
>>
>> --
>> Pascal Lamblin
>>
