The following is the code that I am trying to put into a for loop:
layer0_input = x.reshape((batch_size, n_ch, img_ht, img_wdt))

# Construct the first convolutional pooling layer:
# "full" filtering increases the image size to (28+5-1, 28+5-1) = (32, 32)
# maxpooling reduces this further to (32/2, 32/2) = (16, 16)
# 4D output tensor is thus of shape (batch_size, nkerns[0], 16, 16)
layer0 = LeNetConvPoolLayer(
    input=layer0_input,
    image_shape=(batch_size, 1, 28, 28),
    filter_shape=(nkerns[0], 1, 5, 5),
    poolsize=(2, 2)
)

layer1 = LeNetConvPoolLayer(
    input=layer0.output,
    image_shape=(batch_size, nkerns[0], 16, 16),
    filter_shape=(nkerns[1], nkerns[0], 5, 5),
    poolsize=(2, 2)
)

layer2 = LeNetConvPoolLayer(
    input=layer1.output,
    image_shape=(batch_size, nkerns[0], 10, 10),
    filter_shape=(nkerns[1], nkerns[0], 5, 5),
    poolsize=(2, 2)
)

# construct a fully-connected sigmoidal layer
layer3 = HiddenLayer(
    input=layer2.output.flatten(2),
    n_in=nkerns[1] * 7 * 7,
    n_out=500,
)

# classify the values of the fully-connected sigmoidal layer
layer4 = LogisticClassifier(input=layer3.output, n_in=500, n_out=10)

# the cost we minimize during training is the NLL of the model
cost = layer4.negative_log_likelihood(y)

# create a list of all model parameters to be fit by gradient descent
params = layer4.params + layer3.params + layer2.params + layer1.params + layer0.params

# create a list of gradients for all model parameters
grads = t.grad(cost, params)

On Saturday, 27 August 2016 07:31:16 UTC+5:30, martin.de...@gmail.com wrote:
>
> Check your shared variables that you send to the T.grad() function.
> That's where the problem starts.
> For instance, creating a shared variable this way:
>
>     theano.shared(np.random.randn(20, 20)).astype(dtype='float32')
>
> will create a cast on the shared variable and will most certainly result
> in a disconnected gradient error.
>
> Instead:
>
>     theano.shared(np.random.randn(20, 20).astype(dtype='float32'))
>
> will work without any problem.
>
> My suggestion would be to see what kind of variables the T.grad()
> function is accepting when it is called.
> Make sure they are of the proper type.

I have tried printing the "params" before they are fed into t.grad, on both
the working code and the for-loop code I am trying to write, and I get the
same output:

[W, b, W, b, <TensorType(float64, 4D)>, <TensorType(float64, vector)>,
 <TensorType(float64, 4D)>, <TensorType(float64, vector)>,
 <TensorType(float64, 4D)>, <TensorType(float64, vector)>]

I don't think the shared variables are causing the problem, because I
haven't touched those lines of the original working code. I am attaching
the entire program if you would like to have a look at it.

> On Saturday, August 27, 2016 at 1:38:40 AM UTC+1, pranav inani wrote:
>>
>> Hello,
>>
>> I am trying to create a general framework for a CNN.
>> I use a list "layers" to store the ConvPoolLayer objects, using the
>> following code (mostly picked up from the Theano tutorial on
>> deeplearning.net):
>>
>> nkerns = [10, 10, 10]
>> batch_size = 500
>> img_ht = 28
>> img_wdt = 28
>> n_ch = 1
>> layers = [None] * 3
>> f_ht = 5
>> f_wdt = 5
>>
>> i = 0
>> for layer in range(len(layers)):
>>     layer = LeNetConvPoolLayer(
>>         input=layer0_input,
>>         image_shape=(batch_size, n_ch, img_ht, img_wdt),
>>         filter_shape=(nkerns[i], n_ch, f_ht, f_wdt),
>>         poolsize=(2, 2)
>>     )
>>     img_ht = (img_ht + f_ht - 1) / 2
>>     img_wdt = (img_wdt + f_wdt - 1) / 2
>>     n_ch = nkerns[i]
>>     layers[i] = layer
>>     input = layer[i].output
>>     i += 1
>>
>> # construct a fully-connected sigmoidal layer
>> layer3 = HiddenLayer(
>>     input=layers[2].output.flatten(2),
>>     n_in=nkerns[2] * img_ht * img_wdt,
>>     n_out=500,
>> )
>>
>> # classify the values of the fully-connected sigmoidal layer
>> layer4 = LogisticClassifier(input=layer3.output, n_in=500, n_out=10)
>>
>> # the cost we minimize during training is the NLL of the model
>> cost = layer4.negative_log_likelihood(y)
>>
>> # create a list of all model parameters to be fit by gradient descent
>> params = layer4.params + layer3.params + layers[2].params + layers[1].params + layers[0].params
>>
>> While calculating the grads I get the following error:
>>
>> Traceback (most recent call last):
>>   File "test1.py", line 321, in <module>
>>     evaluate_convnet()
>>   File "test1.py", line 288, in evaluate_convnet
>>     grads = t.grad(cost, params)
>>   File "/home/pi/.local/lib/python2.7/site-packages/theano/gradient.py", line 545, in grad
>>     handle_disconnected(elem)
>>   File "/home/pi/.local/lib/python2.7/site-packages/theano/gradient.py", line 532, in handle_disconnected
>>     raise DisconnectedInputError(message)
>> theano.gradient.DisconnectedInputError: grad method was asked to compute
>> the gradient with respect to a variable that is not part of the
>> computational graph of the cost, or is used only by a non-differentiable
>> operator: <TensorType(float64, 4D)>
>> Backtrace when the node is created:
>>   File "test1.py", line 321, in <module>
>>     evaluate_convnet()
>>   File "test1.py", line 245, in evaluate_convnet
>>     poolsize=(2, 2)
>>   File "test1.py", line 164, in __init__
>>     borrow=True
>>
>> Any help on the matter would be much appreciated! Thanks.
>>
>> Regards,
>> Pranav
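For reference, a minimal sketch of how such a loop could chain the layers
together (it assumes the same LeNetConvPoolLayer class and the variables
defined in the post; the names prev_input and prev_ch are only illustrative).
In the loop as posted, every layer is constructed with input=layer0_input, so
the outputs of layers[0] and layers[1] are never used and their parameters do
not appear in the graph of the cost, which is the kind of situation the
DisconnectedInputError above describes:

    prev_input = layer0_input   # start from the reshaped input images
    prev_ch = n_ch              # feature maps produced by the previous layer
    for i in range(len(layers)):
        layers[i] = LeNetConvPoolLayer(
            input=prev_input,
            image_shape=(batch_size, prev_ch, img_ht, img_wdt),
            filter_shape=(nkerns[i], prev_ch, f_ht, f_wdt),
            poolsize=(2, 2)
        )
        # "full" convolution followed by 2x2 pooling: new size = (old + 5 - 1) / 2
        img_ht = (img_ht + f_ht - 1) // 2
        img_wdt = (img_wdt + f_wdt - 1) // 2
        prev_ch = nkerns[i]
        prev_input = layers[i].output   # feed this layer's output to the next layer

With this wiring, layers[2].output depends on the parameters of all three
conv-pool layers, so every entry of the params list would be reachable from
the cost.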
import sys
import timeit
import six.moves.cPickle as pickle
import gzip
import numpy as np
from numpy import loadtxt
from theano import shared
from theano import function
import theano
import theano.tensor as t
from theano.tensor.signal import pool
from theano.tensor.nnet import conv2d

Training_Set_Batch_Range = (0, 20000)   # range of the training set in the feature/actual-value file (tuple)
Validation_set_Batch_Range = None       # range of the validation set in the feature/actual-value file (tuple)
Test_Set_Batch_Range = None             # range of the test set in the feature/actual-value file (tuple)


def load_data():
    b = np.loadtxt("mnist_train.txt", delimiter=',', skiprows=40000)
    data_y = np.loadtxt("mnist_train.txt", dtype='int32', delimiter=',',
                        usecols=(0,), skiprows=40000)
    data_x = b[:, 1:]

    # training batch (Training_Set_Batch_Range is a tuple)
    train_batch_x = data_x[Training_Set_Batch_Range[0]:Training_Set_Batch_Range[1]]
    train_batch_y = data_y[Training_Set_Batch_Range[0]:Training_Set_Batch_Range[1]]

    if Validation_set_Batch_Range is None:
        valid_batch_x = None
        valid_batch_y = None
    else:
        # validation batch
        valid_batch_x = data_x[Validation_set_Batch_Range[0]:Validation_set_Batch_Range[1]]
        valid_batch_y = data_y[Validation_set_Batch_Range[0]:Validation_set_Batch_Range[1]]

    if Test_Set_Batch_Range is None:
        test_batch_x = None
        test_batch_y = None
    else:
        # test batch
        test_batch_x = data_x[Test_Set_Batch_Range[0]:Test_Set_Batch_Range[1]]
        test_batch_y = data_y[Test_Set_Batch_Range[0]:Test_Set_Batch_Range[1]]

    train_batch = [train_batch_x, train_batch_y]
    valid_batch = [valid_batch_x, valid_batch_y]
    test_batch = [test_batch_x, test_batch_y]
    dataset = [train_batch, valid_batch, test_batch]
    return dataset


class LogisticClassifier(object):
    def __init__(self, input, n_in, n_out):
        # n_in  = no. of units in the previous hidden layer
        # n_out = no. of units in the output layer
        # input = activation values of the previous hidden layer
        self.W = shared(value=np.ones((n_in, n_out), dtype=theano.config.floatX),
                        name='W', borrow=True)   # weight parameters
        self.b = shared(value=np.zeros((n_out,), dtype=theano.config.floatX),
                        name='b', borrow=True)   # bias units
        # class probabilities given the input
        self.p_y_given_x = t.nnet.softmax(t.dot(input, self.W) + self.b)
        self.y_pred = t.argmax(self.p_y_given_x, axis=1)
        self.params = [self.W, self.b]
        self.input = input

    def negative_log_likelihood(self, y):
        # average cross-entropy over a minibatch
        return -t.mean(t.log(self.p_y_given_x)[t.arange(y.shape[0]), y])


class HiddenLayer(object):
    def __init__(self, input, n_in, n_out):
        # n_in  = no. of input units
        # n_out = no. of hidden-layer units
        # input = input feature values
        self.input = input
        # weight parameters of the hidden layer
        W_values = np.array(
            np.random.uniform(low=-np.sqrt(6. / (n_in + n_out)),
                              high=np.sqrt(6. / (n_in + n_out)),
                              size=(n_in, n_out)),
            dtype=theano.config.floatX
        )
        self.W = theano.shared(value=W_values, name='W', borrow=True)
        # bias parameters of the hidden layer
        b_values = np.zeros((n_out,), dtype=theano.config.floatX)
        self.b = theano.shared(value=b_values, name='b', borrow=True)
        lin_output = t.dot(input, self.W) + self.b   # pre-activation
        self.output = t.tanh(lin_output)             # activation
        self.params = [self.W, self.b]               # parameters of the hidden layer


class LeNetConvPoolLayer(object):
    """Pool layer of a convolutional network."""

    def __getitem__(self, key):
        return self

    def __init__(self, input, filter_shape, image_shape, poolsize=(2, 2)):
        """
        Allocate a LeNetConvPoolLayer with shared variable internal parameters.
        :type input: theano.tensor.dtensor4
        :param input: symbolic image tensor, of shape image_shape

        :type filter_shape: tuple or list of length 4
        :param filter_shape: (number of filters, num input feature maps,
                              filter height, filter width)

        :type image_shape: tuple or list of length 4
        :param image_shape: (batch size, num input feature maps,
                             image height, image width)

        :type poolsize: tuple or list of length 2
        :param poolsize: the downsampling (pooling) factor (#rows, #cols)
        """
        assert image_shape[1] == filter_shape[1]
        self.input = input

        # there are "num input feature maps * filter height * filter width"
        # inputs to each hidden unit
        fan_in = np.prod(filter_shape[1:])
        # each unit in the lower layer receives a gradient from:
        # "num output feature maps * filter height * filter width" / pooling size
        fan_out = (filter_shape[0] * np.prod(filter_shape[2:]) //
                   np.prod(poolsize))
        # initialize weights with random weights
        W_bound = np.sqrt(6. / (fan_in + fan_out))
        self.W = theano.shared(
            np.asarray(
                np.random.uniform(low=-W_bound, high=W_bound, size=filter_shape),
                dtype=theano.config.floatX
            ),
            borrow=True
        )

        # the bias is a 1D tensor -- one bias per output feature map
        b_values = np.zeros((filter_shape[0],), dtype=theano.config.floatX)
        self.b = theano.shared(value=b_values, borrow=True)

        # convolve input feature maps with filters
        conv_out = conv2d(
            input=input,
            filters=self.W,
            filter_shape=filter_shape,
            input_shape=image_shape,
            border_mode="full"
        )

        # pool each feature map individually, using maxpooling
        pooled_out = pool.pool_2d(
            input=conv_out,
            ds=poolsize,
            ignore_border=True
        )

        # add the bias term. Since the bias is a vector (1D array), we first
        # reshape it to a tensor of shape (1, n_filters, 1, 1). Each bias will
        # thus be broadcast across mini-batches and feature map width & height.
        self.output = t.tanh(pooled_out + self.b.dimshuffle('x', 0, 'x', 'x'))

        # store parameters of this layer
        self.params = [self.W, self.b]

        # keep track of model input
        self.input = input


def evaluate_convnet():
    learning_rate = 0.1
    n_epochs = 200
    nkerns = [10, 10, 10]
    batch_size = 500
    img_ht = 28
    img_wdt = 28
    n_ch = 1
    layers = [None] * 3
    f_ht = 5
    f_wdt = 5

    dataset = load_data()
    train_set = dataset[0]
    x_1 = train_set[0]
    y_1 = train_set[1]
    l = x_1.shape[0]

    x = t.dmatrices('x')
    y = t.ivector('y')

    ######################
    # BUILD ACTUAL MODEL #
    ######################
    print('... building the model')

    # Reshape matrix of rasterized images of shape (batch_size, 28 * 28)
    # to a 4D tensor, compatible with our LeNetConvPoolLayer.
    # (28, 28) is the size of MNIST images.
    layer0_input = x.reshape((batch_size, n_ch, img_ht, img_wdt))

    # Construct the convolutional pooling layers:
    # "full" filtering increases the image size to (28+5-1, 28+5-1) = (32, 32)
    # maxpooling reduces this further to (32/2, 32/2) = (16, 16)
    # the 4D output tensor of the first layer is thus of shape
    # (batch_size, nkerns[0], 16, 16)
    i = 0
    for layer in range(len(layers)):
        layer = LeNetConvPoolLayer(
            input=layer0_input,
            image_shape=(batch_size, n_ch, img_ht, img_wdt),
            filter_shape=(nkerns[i], n_ch, f_ht, f_wdt),
            poolsize=(2, 2)
        )
        img_ht = (img_ht + f_ht - 1) / 2
        img_wdt = (img_wdt + f_wdt - 1) / 2
        n_ch = nkerns[i]
        layers[i] = layer
        input = layer[i].output
        i += 1

    # The HiddenLayer being fully-connected, it operates on 2D matrices of
    # shape (batch_size, num_pixels) (i.e. a matrix of rasterized images).
    # This will generate a matrix of shape (batch_size, nkerns[2] * 7 * 7),
    # i.e. (500, 10 * 7 * 7) = (500, 490) with the values above.
    # construct a fully-connected sigmoidal layer
    layer3 = HiddenLayer(
        input=layers[2].output.flatten(2),
        n_in=nkerns[2] * img_ht * img_wdt,
        n_out=500,
    )

    # classify the values of the fully-connected sigmoidal layer
    layer4 = LogisticClassifier(input=layer3.output, n_in=500, n_out=10)

    # the cost we minimize during training is the NLL of the model
    cost = layer4.negative_log_likelihood(y)

    # create a list of all model parameters to be fit by gradient descent
    params = (layer4.params + layer3.params + layers[2].params +
              layers[1].params + layers[0].params)

    # create a list of gradients for all model parameters
    grads = t.grad(cost, params)

    # train_model is a function that updates the model parameters by SGD.
    # Since this model has many parameters, it would be tedious to manually
    # create an update rule for each model parameter. We thus create the
    # updates list by automatically looping over all (params[i], grads[i]) pairs.
    updates = [
        (param_i, param_i - learning_rate * grad_i)
        for param_i, grad_i in zip(params, grads)
    ]

    train_model = function(inputs=[x, y], outputs=cost, updates=updates)

    epoch = 0
    while epoch < n_epochs:
        i = 0
        z = batch_size
        while i < l:
            X = x_1[i:z]
            Y = y_1[i:z]
            i = i + batch_size
            z = z + batch_size
            batch_avg_cost = train_model(X, Y)
            print(batch_avg_cost)
        epoch = epoch + 1


evaluate_convnet()
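As a debugging aid (not part of the original program, and assuming the cost
and params variables built in evaluate_convnet, placed just before the t.grad
call), each parameter can be checked separately to see which shared variables
are missing from the graph of the cost:

    # debugging sketch: relies on 'cost' and 'params' already defined above
    for p in params:
        try:
            t.grad(cost, p)
            print(str(p) + ' is connected to the cost')
        except theano.gradient.DisconnectedInputError:
            print(str(p) + ' is NOT in the computational graph of the cost')

Theano's grad function also accepts a disconnected_inputs argument, so
t.grad(cost, params, disconnected_inputs='warn') should downgrade the error to
one warning per disconnected variable, which makes it easier to see how many
of the ten parameters are affected.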