I have large data arrays X and Y representing the inputs and outputs of a 
classifier. I need to use mini-batch gradient descent to train a classifier 
(say, logistic regression).

I have stored the weights as a theano.shared variable. Instead of copying 
small batches of data to the GPU on every step, I also place all of the data 
on the GPU as theano.shared variables and only pass indexes to the GPU.

The scan function accepts batches of indexes (fr) into X_shared and 
Y_shared and then computes a loss.


*I want to perform a gradient update inside scan so that every scan 
iteration uses an updated version of the weights. Is this possible/efficient? 
If so, how can I accomplish it?*


I essentially want to do mini-batch gradient descent entirely on the GPU.


I have seen examples where the CPU sends the GPU random indexes, but I want 
to know if I can avoid that!
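
For example, would something like this (a sketch using MRG_RandomStreams; 
batch_size and n_samples are placeholders for my batch size and number of 
data rows) let the random indexes be drawn on the GPU itself?

import theano.tensor as T
from theano.sandbox.rng_mrg import MRG_RandomStreams

srng = MRG_RandomStreams(seed=1234)
# draw uniform floats on the GPU and scale them to valid row indexes
fr = T.cast(srng.uniform((batch_size,)) * n_samples, 'int32')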


This is my scan function:

import numpy as np
import theano
import theano.tensor as T

# shared variable for the weights (data_shape is the shape of the weight matrix)
weights = theano.shared(np.float32(np.random.rand(*data_shape)), 'weights')

# shared data
X_shared = theano.shared(np.float32(MY_DATA_X), 'X')
Y_shared = theano.shared(np.float32(MY_DATA_Y), 'Y')

def scan_shared_loss_acc(fr, total_mean_loss):
    # fr is a batch of random indexes into the data
    fr = T.cast(fr, 'int32')
    x = X_shared[fr]                    # features stay float32
    y = T.cast(Y_shared[fr], 'int32')   # integer class labels
    y_pred = get_prediction(x, y, weights)
    loss = T.nnet.categorical_crossentropy(y_pred, y).mean()
    # ---- I want to do a gradient update here ----
    # ---- is the update below efficient? ----
    # grad_weights = T.grad(loss, weights)
    # weights = weights - learning_rate * grad_weights
    total_mean_loss = total_mean_loss + loss
    return total_mean_loss
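
From reading the scan documentation, I think what I am after looks something 
like the sketch below, where the step function returns an updates dictionary 
on weights so that each iteration sees the weights written by the previous 
one (learning_rate and the batches matrix are placeholders; the shared 
variables and get_prediction are the ones defined above). Is this the 
right/efficient way to do it?

from collections import OrderedDict

batches = T.imatrix('batches')        # each row is one batch of random indexes

def scan_step(fr, total_mean_loss):
    x = X_shared[fr]
    y = T.cast(Y_shared[fr], 'int32')
    y_pred = get_prediction(x, y, weights)
    loss = T.nnet.categorical_crossentropy(y_pred, y).mean()
    grad_weights = T.grad(loss, weights)
    # returning an updates dict tells scan to apply the weight update every iteration
    updates = OrderedDict([(weights, weights - learning_rate * grad_weights)])
    return total_mean_loss + loss, updates

results, scan_updates = theano.scan(
    scan_step,
    sequences=[batches],
    outputs_info=[T.as_tensor_variable(np.float32(0.0))])

# a single call then runs a whole epoch of mini-batch updates on the GPU
train_epoch = theano.function([batches], results[-1], updates=scan_updates)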
