Anyone? Does anyone know how I could loop the network n times, feeding the output of the previous step back in as inp.input_var? Preferably without explicitly propagating a hidden state, so that I end up with a single variable that depends on the composed outputs and whose gradients I can compute for SGD.
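
Concretely, this is the kind of purely symbolic unrolling I am after (an untested sketch reusing the names from the code quoted below; if I read the Lasagne docs correctly, lasagne.layers.get_output accepts an inputs dict mapping input layers to arbitrary symbolic expressions, which would avoid scan and theano.function inside the loop entirely):

    # untested sketch: unroll the 5 steps in Python instead of theano.scan
    futurePred = inp.input_var
    for disk in [disk1, disk2, disk3, disk4, disk5]:
        # substitute the previous step's symbolic output for the input layer
        out = LL.get_output(ffn, inputs={inp: futurePred, patch_op: disk},
                            deterministic=True)
        futurePred = out + futurePred  # skip connection, as in my code below
    # futurePred is now one expression composed of 5 network applications,
    # so T.grad of a cost built on it backpropagates through all of them

Is that a sane way to express it, or is scan still the recommended route?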
Thank you,
Fab

On Tuesday, 1 August 2017 10:56:18 UTC+1, Fab wrote:
>
> Hello everyone,
>
> I am trying to implement the multi-frame loss in this paper (equation (4),
> Figure 1) applied to 3D meshes using Theano and Lasagne:
> https://arxiv.org/pdf/1607.03597.pdf
>
> I am using a custom convolutional layer that operates on meshes and needs
> a sparse matrix per training sample.
> Basically, in order to learn a physics simulation, I want to calculate the
> velocity loss for a single frame n, then step 5 frames into the future, get
> that future loss, and accumulate its partial derivatives onto the gradient
> of the single frame.
> The stepping into the future is the problematic part. I am trying, with a
> scan, to pass as input at each loop the result of the previous iteration (a
> vector with 3 channels X, Y, Z) and a sparse matrix.
> With the following (simplified) code I get this:
>
> NotImplementedError: Theano has no sparse vector. Use X[a:b, c:d],
> X[a:b, c:c+1] or X[a:b] instead.
>
> So essentially I don't know how to create a scan loop in which, at each
> iteration, inp.input_var (the network input) is updated with the output of
> the network at the previous step. It seems, in fact, that inp.input_var can
> only be fed from a theano.function?
>
> Apart from the error, my other questions are:
> 1 - Is this the correct way to do backpropagation through time? (I want the
> loss to be a composition of losses, as in RNNs, but from the future.) My
> doubt is especially about how I have to use inp.input_var.
> 2 - Should I use deterministic=True when I do the prediction in the future?
>
> Any suggestions would be greatly appreciated! Please ask me anything that
> is not clear.
>
> import numpy as np
>
> import theano
> import theano.tensor as T
> import theano.sparse as Tsp
>
> import lasagne as L
> import lasagne.layers as LL
> import lasagne.objectives as LO
>
> # utils_lasagne (custom mesh layers), lambda_vel and l2_weight are
> # defined elsewhere in my project
>
> # build a deep network with custom convolutional layers for meshes
> def get_model(inp, patch_op):
>     icnn = inp
>     # geodesic convolutional layers
>     icnn = utils_lasagne.GCNNLayer([icnn, patch_op], 32, nrings=5, nrays=16)
>     icnn = utils_lasagne.GCNNLayer([icnn, patch_op], 64, nrings=5, nrays=16)
>     icnn = utils_lasagne.GCNNLayer([icnn, patch_op], 128, nrings=5, nrays=16)
>     ffn = utils_lasagne.GCNNLayer([icnn, patch_op], 3, nrings=5, nrays=16,
>                                   nonlinearity=None)
>     return ffn
>
> inp = LL.InputLayer(shape=(None, 3))
> # patches (disks)
> patch_op = LL.InputLayer(input_var=Tsp.csc_fmatrix('patch_op'),
>                          shape=(None, None))
>
> ffn = get_model(inp, patch_op)
>
> # ground-truth velocities
> truth = T.matrix('Truth')
>
> disk1 = Tsp.csc_fmatrix('disk1')
> disk2 = Tsp.csc_fmatrix('disk2')
> disk3 = Tsp.csc_fmatrix('disk3')
> disk4 = Tsp.csc_fmatrix('disk4')
> disk5 = Tsp.csc_fmatrix('disk5')
>
> targetFutureVelocitiesTruth = T.matrix('targetFutureVelocitiesTruth')
>
> learn_rate = T.scalar('learning_rate')
>
> output = LL.get_output(ffn)
> pred = LL.get_output(ffn, deterministic=True)
>
> # skip connection
> # slice x, y, z, excluding the constraint last dimension of zeros and ones
> inputVelocities = inp.input_var[:, :3]
> output += inputVelocities
>
> targetVelocities = truth[:, 3:6]
> # mean so the cost is a scalar, as T.grad requires
> velCost = LO.squared_error(output, targetVelocities).mean()
>
> regL2 = L.regularization.regularize_network_params(ffn, L.regularization.l2)
>
> # the cost of a single frame
> cost = lambda_vel * velCost + l2_weight * regL2
>
> # one step of the network
> step = theano.function([patch_op.input_var, inp.input_var],
>                        [output],
>                        on_unused_input='warn')
>
> # list of sparse matrices
> disks = [disk1, disk2, disk3, disk4, disk5]
> # step 5 frames into the future
> futurePredictions, _ = theano.scan(step,
>                                    outputs_info=inp.input_var,
>                                    sequences=disks,
>                                    n_steps=5)
>
> # future velocity
> lastFuturePrediction = futurePredictions[-1]
> # error in the future
> futureCost = LO.squared_error(lastFuturePrediction,
>                               targetFutureVelocitiesTruth).mean()
>
> # get network parameters
> params = LL.get_all_params(ffn, trainable=True)
> # accumulate partial derivatives from this frame and from the future frame
> grads = T.grad(cost, params)
> gradsFuture = T.grad(futureCost, params)
> # elementwise sum per parameter (list + list would only concatenate)
> grads = [g + gf for g, gf in zip(grads, gradsFuture)]
>
> updates = L.updates.adam(grads, params, learning_rate=learn_rate)
>
> funcs = dict()
> funcs['train'] = theano.function([inp.input_var, patch_op.input_var, truth,
>                                   learn_rate, targetFutureVelocitiesTruth,
>                                   disk1, disk2, disk3, disk4, disk5],
>                                  [cost],
>                                  updates=updates,
>                                  on_unused_input='warn')
>
> # now just call the train function with a velocity field, disks, truth,
> # etc. to get the global loss in one go
>
> Thank you very much for your help
> Fab
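
And assuming the unrolled futurePred sketched above works, I would then collapse everything into a single scalar objective instead of summing the two gradient lists by hand (again an untested sketch; lambda_future is a hypothetical weight I would still have to tune):

    # one scalar objective: current-frame cost plus weighted future cost
    futureCost = LO.squared_error(futurePred,
                                  targetFutureVelocitiesTruth).mean()
    totalCost = cost + lambda_future * futureCost  # lambda_future: hypothetical
    # a single T.grad call accumulates both contributions automatically
    grads = T.grad(totalCost, params)
    updates = L.updates.adam(grads, params, learning_rate=learn_rate)

Does that match how backpropagation through time is normally set up, or am I missing something?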
