Hi Olivier,

Thank you very much. 'grad(loss, codes)[idx]' solves my problem. It is the right solution for updating only the subtensor, and it avoids computing the whole gradient for each item in codes.
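In case it helps anyone who finds this thread later, here is a minimal runnable sketch of the pattern. The loss is a toy quadratic stand-in (with a made-up target tensor) rather than my actual convolutional loss, and the names codes, idx, and learning_rate just mirror my original post quoted below:

    import numpy as np
    import theano
    import theano.tensor as T

    # toy stand-in for the shared 4-tensor of codes from my post below
    codes = theano.shared(
        np.random.randn(5, 3, 4, 4).astype(theano.config.floatX), name="codes")
    # made-up target just to give the toy loss something to match
    target = theano.shared(
        np.random.randn(3, 4, 4).astype(theano.config.floatX), name="target")
    learning_rate = np.asarray(0.1, dtype=theano.config.floatX)

    idx = T.lscalar("idx")
    loss = T.sum(1. / 2. * (codes[idx] - target) ** 2)   # toy loss, not the conv one

    # gradient w.r.t. the full shared variable, then slice out the one item;
    # Theano can often optimize away the parts of the full gradient it never needs
    del_codes = T.grad(loss, codes)[idx]

    train_codes = theano.function(
        [idx], loss,
        updates=[(codes,
                  T.set_subtensor(codes[idx],
                                  codes[idx] - learning_rate * del_codes))])

    print(train_codes(0))   # one gradient step that touches only codes[0]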
On Thursday, July 3, 2014 at 9:31:38 AM UTC+8, Olivier Delalleau wrote:

When you write

    grad(loss, codes[idx])

the codes[idx] statement creates a new symbolic variable that you have not yet used anywhere in the computational graph of loss, which is why Theano complains.

The correct way to write it is grad(loss, codes)[idx] (and if everything goes well, Theano will be able to figure out by itself if it can avoid computing the full grad(loss, codes)).

-=- Olivier

2014-06-30 13:09 GMT-04:00 Justin Brody <[email protected]>:

In case anyone else has similar problems, I got a good answer on stackoverflow:

http://stackoverflow.com/questions/24468482/defining-a-gradient-with-respect-to-a-subtensor-in-theano

On Saturday, June 28, 2014 9:20:41 PM UTC-4, Justin Brody wrote:

For whatever it's worth, I think my fundamental misunderstanding is about how symbolic variables get "bound". For example, I tried changing my code to:

    current_codes = T.tensor3('current_codes')
    del_codes = T.grad(loss, current_codes)
    delc_fn = function([idx], del_codes,
                       givens=[[current_codes, codes[idx]]])

and then calling delc_fn in the updates part of my training function. Theano complains that current_codes is not part of the computational graph of loss. In my mind, it will *become* part of the computational graph when it gets bound to codes[idx]. So this is the tension I'm having trouble resolving: I want to use the code I just wrote to get the loss defined with respect to a specific variable (rather than a subtensor of a specific variable), but I want to use T.grad(loss, codes[idx]) to express what I'm really trying to do.
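For concreteness, here is a toy sketch of the ordering I am imagining, where loss would be built from current_codes itself and the substitution to codes[idx] would only happen via givens when the function is compiled (simplified stand-in loss, not my real convolutional one):

    import numpy as np
    import theano
    import theano.tensor as T

    # toy stand-in for the shared 4-tensor of codes
    codes = theano.shared(
        np.random.randn(5, 3, 4, 4).astype(theano.config.floatX), name="codes")
    idx = T.lscalar("idx")

    # build the loss from current_codes itself, so it is part of the graph
    # before any substitution happens
    current_codes = T.tensor3("current_codes")
    loss = T.sum(1. / 2. * current_codes ** 2)

    del_codes = T.grad(loss, current_codes)   # fine: current_codes is in the graph
    delc_fn = theano.function([idx], del_codes,
                              givens={current_codes: codes[idx]})

    print(delc_fn(0).shape)   # gradient for the idx-th code block only

In my actual code, though, loss is already defined directly in terms of codes[idx] (see below), which I suspect is why current_codes never ends up in its graph.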
On Saturday, June 28, 2014 7:01:47 PM UTC-4, Justin Brody wrote:

Hello,

I've been trying for many days to properly understand how shared variables and symbolic variables interact in Theano, but sadly I don't think I'm there. My ignorance is quite probably reflected in this question, but I would still be very grateful for any guidance.

I'm trying to implement a "deconvolutional network"; specifically, I have a 3-tensor of inputs (each input is a 2D image) and a 4-tensor of codes; for the ith input, codes[i] represents a set of codewords which together code for input i.

I've been having a lot of trouble figuring out how to do gradient descent on the codewords. Here are the relevant parts of my code:

    codes = shared(initial_codes, name="codes")   # shared 4-tensor w/ dims (input #, code #, row #, col #)
    idx = T.lscalar()
    pre_loss_conv = conv2d(input=codes[idx].dimshuffle('x', 0, 1, 2),
                           filters=dicts.dimshuffle('x', 0, 1, 2),
                           border_mode='valid')
    loss_conv = pre_loss_conv.reshape((pre_loss_conv.shape[2], pre_loss_conv.shape[3]))
    loss_in = inputs[idx]
    loss = T.sum(1./2.*(loss_in - loss_conv)**2)

    del_codes = T.grad(loss, codes[idx])
    delc_fn = function([idx], del_codes)
    train_codes = function([input_index], loss, updates=[
        [codes, T.set_subtensor(codes[input_index],
                                codes[input_index] - learning_rate*del_codes[input_index])]])

(Here codes and dicts are shared tensor variables.) Theano is unhappy with this, specifically with defining

    del_codes = T.grad(loss, codes[idx])

The error message I'm getting is:

    theano.gradient.DisconnectedInputError: grad method was asked to compute
    the gradient with respect to a variable that is not part of the
    computational graph of the cost, or is used only by a non-differentiable
    operator: Subtensor{int64}.0

I'm guessing that it wants a symbolic variable instead of codes[idx], but then I'm not sure how to get everything connected to produce the intended effect. I'm guessing I'll need to change the final line to something like

    learning_rate*del_codes) ]])

Can someone give me some pointers as to how to define this function properly? I think I'm probably missing something basic about working with Theano, but I'm not sure what.

Thanks in advance!

-Justin
