Hi Olivier,

Thank you very much. 'grad(loss, codes)[idx]' solves my problem. It is the right solution for updating only the subtensor, and it avoids computing the whole gradient for each item in codes.
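In case it helps anyone who finds this thread later, here is a minimal runnable sketch of the pattern. The loss is a toy quadratic stand-in (with a made-up target tensor) rather than my actual convolutional loss, and the names codes, idx, and learning_rate just mirror my original post quoted below:

    import numpy as np
    import theano
    import theano.tensor as T

    # toy stand-in for the shared 4-tensor of codes from my post below
    codes = theano.shared(
        np.random.randn(5, 3, 4, 4).astype(theano.config.floatX), name="codes")
    # made-up target just to give the toy loss something to match
    target = theano.shared(
        np.random.randn(3, 4, 4).astype(theano.config.floatX), name="target")
    learning_rate = np.asarray(0.1, dtype=theano.config.floatX)

    idx = T.lscalar("idx")
    loss = T.sum(1. / 2. * (codes[idx] - target) ** 2)   # toy loss, not the conv one

    # gradient w.r.t. the full shared variable, then slice out the one item;
    # Theano can often optimize away the parts of the full gradient it never needs
    del_codes = T.grad(loss, codes)[idx]

    train_codes = theano.function(
        [idx], loss,
        updates=[(codes,
                  T.set_subtensor(codes[idx],
                                  codes[idx] - learning_rate * del_codes))])

    print(train_codes(0))   # one gradient step that touches only codes[0]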
On Thursday, July 3, 2014 at 9:31:38 AM UTC+8, Olivier Delalleau wrote:

When you write

    grad(loss, codes[idx])

the codes[idx] statement creates a new symbolic variable that you have not yet used anywhere in the computational graph of loss, which is why Theano complains.

The correct way to write it is grad(loss, codes)[idx] (and if everything goes well, Theano will be able to figure out by itself if it can avoid computing the full grad(loss, codes)).

-=- Olivier

2014-06-30 13:09 GMT-04:00 Justin Brody <[email protected]>:

In case anyone else has similar problems, I got a good answer on stackoverflow:

http://stackoverflow.com/questions/24468482/defining-a-gradient-with-respect-to-a-subtensor-in-theano

On Saturday, June 28, 2014 9:20:41 PM UTC-4, Justin Brody wrote:

For whatever it's worth, I think my fundamental misunderstanding is about how symbolic variables get "bound". For example, I tried changing my code to:

    current_codes = T.tensor3('current_codes')
    del_codes = T.grad(loss, current_codes)
    delc_fn = function([idx], del_codes,
                       givens=[[current_codes, codes[idx]]])

and then calling delc_fn in the updates part of my training function. Theano complains that current_codes is not part of the computational graph of loss. In my mind, it will *become* part of the computational graph when it gets bound to codes[idx]. So this is the tension I'm having trouble resolving: I want to use the code I just wrote to get the loss defined with respect to a specific variable (rather than a subtensor of a specific variable), but I want to use T.grad(loss, codes[idx]) to express what I'm really trying to do.
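For concreteness, here is a toy sketch of the ordering I am imagining, where loss would be built from current_codes itself and the substitution to codes[idx] would only happen via givens when the function is compiled (simplified stand-in loss, not my real convolutional one):

    import numpy as np
    import theano
    import theano.tensor as T

    # toy stand-in for the shared 4-tensor of codes
    codes = theano.shared(
        np.random.randn(5, 3, 4, 4).astype(theano.config.floatX), name="codes")
    idx = T.lscalar("idx")

    # build the loss from current_codes itself, so it is part of the graph
    # before any substitution happens
    current_codes = T.tensor3("current_codes")
    loss = T.sum(1. / 2. * current_codes ** 2)

    del_codes = T.grad(loss, current_codes)   # fine: current_codes is in the graph
    delc_fn = theano.function([idx], del_codes,
                              givens={current_codes: codes[idx]})

    print(delc_fn(0).shape)   # gradient for the idx-th code block only

In my actual code, though, loss is already defined directly in terms of codes[idx] (see below), which I suspect is why current_codes never ends up in its graph.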
On Saturday, June 28, 2014 7:01:47 PM UTC-4, Justin Brody wrote:

Hello,

I've been trying for many days to properly understand how shared variables and symbolic variables interact in Theano, but sadly I don't think I'm there. My ignorance is quite probably reflected in this question, but I would still be very grateful for any guidance.

I'm trying to implement a "deconvolutional network"; specifically, I have a 3-tensor of inputs (each input is a 2D image) and a 4-tensor of codes; for the ith input, codes[i] represents a set of codewords which together code for input i.

I've been having a lot of trouble figuring out how to do gradient descent on the codewords. Here are the relevant parts of my code:

    codes = shared(initial_codes, name="codes")   # shared 4-tensor w/ dims (input #, code #, row #, col #)
    idx = T.lscalar()
    pre_loss_conv = conv2d(input=codes[idx].dimshuffle('x', 0, 1, 2),
                           filters=dicts.dimshuffle('x', 0, 1, 2),
                           border_mode='valid')
    loss_conv = pre_loss_conv.reshape((pre_loss_conv.shape[2], pre_loss_conv.shape[3]))
    loss_in = inputs[idx]
    loss = T.sum(1./2.*(loss_in - loss_conv)**2)

    del_codes = T.grad(loss, codes[idx])
    delc_fn = function([idx], del_codes)
    train_codes = function([input_index], loss, updates=[
        [codes, T.set_subtensor(codes[input_index],
                                codes[input_index] - learning_rate*del_codes[input_index])]])

(Here codes and dicts are shared tensor variables.) Theano is unhappy with this, specifically with defining

    del_codes = T.grad(loss, codes[idx])

The error message I'm getting is:

    theano.gradient.DisconnectedInputError: grad method was asked to compute
    the gradient with respect to a variable that is not part of the
    computational graph of the cost, or is used only by a non-differentiable
    operator: Subtensor{int64}.0

I'm guessing that it wants a symbolic variable instead of codes[idx], but then I'm not sure how to get everything connected to produce the intended effect. I'm guessing I'll need to change the final line to something like

    learning_rate*del_codes) ]])

Can someone give me some pointers as to how to define this function properly? I think I'm probably missing something basic about working with Theano, but I'm not sure what.

Thanks in advance!

-Justin
