The following is the code that I am trying to put into a for loop:
layer0_input = x.reshape((batch_size, n_ch, img_ht, img_wdt))

# Construct the first convolutional pooling layer:
# "full" filtering increases the image size to (28+5-1, 28+5-1) = (32, 32)
# maxpooling reduces this further to (32/2, 32/2) = (16, 16)
# 4D output tensor is thus of shape (batch_size, nkerns[0], 16, 16)
layer0 = LeNetConvPoolLayer(
    input=layer0_input,
    image_shape=(batch_size, 1, 28, 28),
    filter_shape=(nkerns[0], 1, 5, 5),
    poolsize=(2, 2)
)

layer1 = LeNetConvPoolLayer(
    input=layer0.output,
    image_shape=(batch_size, nkerns[0], 16, 16),
    filter_shape=(nkerns[1], nkerns[0], 5, 5),
    poolsize=(2, 2)
)

layer2 = LeNetConvPoolLayer(
    input=layer1.output,
    image_shape=(batch_size, nkerns[0], 10, 10),
    filter_shape=(nkerns[1], nkerns[0], 5, 5),
    poolsize=(2, 2)
)

# construct a fully-connected sigmoidal layer
layer3 = HiddenLayer(
    input=layer2.output.flatten(2),
    n_in=nkerns[1] * 7 * 7,
    n_out=500,
)

# classify the values of the fully-connected sigmoidal layer
layer4 = LogisticClassifier(input=layer3.output, n_in=500, n_out=10)

# the cost we minimize during training is the NLL of the model
cost = layer4.negative_log_likelihood(y)

# create a list of all model parameters to be fit by gradient descent
params = layer4.params + layer3.params + layer2.params + layer1.params + layer0.params

# create a list of gradients for all model parameters
grads = t.grad(cost, params)

On Saturday, 27 August 2016 07:31:16 UTC+5:30, martin.de...@gmail.com wrote:
>
> Check your shared variables that you send to the T.grad() function.
> That's where the problem starts.
> For instance, creating a shared variable this way:
>
>     theano.shared(np.random.randn(20, 20)).astype(dtype='float32')
>
> will create a cast on the shared variable and will most certainly result
> in a disconnected gradient error.
>
> Instead:
>
>     theano.shared(np.random.randn(20, 20).astype(dtype='float32'))
>
> will work without any problem.
>
> My suggestion would be to see what kind of variables the T.grad()
> function is accepting when it is called.
> Make sure they are of the proper type.

I have tried printing the "params" before they are fed into t.grad, on both
the working code and the for-loop code I am trying to write, and I get the
same output:

[W, b, W, b, <TensorType(float64, 4D)>, <TensorType(float64, vector)>,
 <TensorType(float64, 4D)>, <TensorType(float64, vector)>,
 <TensorType(float64, 4D)>, <TensorType(float64, vector)>]

I don't think the shared variables are causing the problem, because I
haven't touched those lines of the original working code. I am attaching
the entire program if you would like to have a look at it.

> On Saturday, August 27, 2016 at 1:38:40 AM UTC+1, pranav inani wrote:
>>
>> Hello,
>>
>> I am trying to create a general framework for a CNN.
>> I use a list "layers" to store the ConvPoolLayer objects, using the
>> following code (mostly picked up from the Theano tutorial on
>> deeplearning.net):
>>
>> nkerns = [10, 10, 10]
>> batch_size = 500
>> img_ht = 28
>> img_wdt = 28
>> n_ch = 1
>> layers = [None] * 3
>> f_ht = 5
>> f_wdt = 5
>>
>> i = 0
>> for layer in range(len(layers)):
>>     layer = LeNetConvPoolLayer(
>>         input=layer0_input,
>>         image_shape=(batch_size, n_ch, img_ht, img_wdt),
>>         filter_shape=(nkerns[i], n_ch, f_ht, f_wdt),
>>         poolsize=(2, 2)
>>     )
>>     img_ht = (img_ht + f_ht - 1) / 2
>>     img_wdt = (img_wdt + f_wdt - 1) / 2
>>     n_ch = nkerns[i]
>>     layers[i] = layer
>>     input = layer[i].output
>>     i += 1
>>
>> # construct a fully-connected sigmoidal layer
>> layer3 = HiddenLayer(
>>     input=layers[2].output.flatten(2),
>>     n_in=nkerns[2] * img_ht * img_wdt,
>>     n_out=500,
>> )
>>
>> # classify the values of the fully-connected sigmoidal layer
>> layer4 = LogisticClassifier(input=layer3.output, n_in=500, n_out=10)
>>
>> # the cost we minimize during training is the NLL of the model
>> cost = layer4.negative_log_likelihood(y)
>>
>> # create a list of all model parameters to be fit by gradient descent
>> params = layer4.params + layer3.params + layers[2].params + layers[1].params + layers[0].params
>>
>> While calculating the grads I get the following error:
>>
>> Traceback (most recent call last):
>>   File "test1.py", line 321, in <module>
>>     evaluate_convnet()
>>   File "test1.py", line 288, in evaluate_convnet
>>     grads = t.grad(cost, params)
>>   File "/home/pi/.local/lib/python2.7/site-packages/theano/gradient.py", line 545, in grad
>>     handle_disconnected(elem)
>>   File "/home/pi/.local/lib/python2.7/site-packages/theano/gradient.py", line 532, in handle_disconnected
>>     raise DisconnectedInputError(message)
>> theano.gradient.DisconnectedInputError: grad method was asked to compute
>> the gradient with respect to a variable that is not part of the
>> computational graph of the cost, or is used only by a non-differentiable
>> operator: <TensorType(float64, 4D)>
>> Backtrace when the node is created:
>>   File "test1.py", line 321, in <module>
>>     evaluate_convnet()
>>   File "test1.py", line 245, in evaluate_convnet
>>     poolsize=(2, 2)
>>   File "test1.py", line 164, in __init__
>>     borrow=True
>>
>> Any help on the matter would be much appreciated! Thanks.
>>
>> Regards,
>> Pranav
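For reference, a minimal sketch of how such a loop could chain the layers
together (it assumes the same LeNetConvPoolLayer class and the variables
defined in the post; the names prev_input and prev_ch are only illustrative).
In the loop as posted, every layer is constructed with input=layer0_input, so
the outputs of layers[0] and layers[1] are never used and their parameters do
not appear in the graph of the cost, which is the kind of situation the
DisconnectedInputError above describes:

    prev_input = layer0_input   # start from the reshaped input images
    prev_ch = n_ch              # feature maps produced by the previous layer
    for i in range(len(layers)):
        layers[i] = LeNetConvPoolLayer(
            input=prev_input,
            image_shape=(batch_size, prev_ch, img_ht, img_wdt),
            filter_shape=(nkerns[i], prev_ch, f_ht, f_wdt),
            poolsize=(2, 2)
        )
        # "full" convolution followed by 2x2 pooling: new size = (old + 5 - 1) / 2
        img_ht = (img_ht + f_ht - 1) // 2
        img_wdt = (img_wdt + f_wdt - 1) // 2
        prev_ch = nkerns[i]
        prev_input = layers[i].output   # feed this layer's output to the next layer

With this wiring, layers[2].output depends on the parameters of all three
conv-pool layers, so every entry of the params list would be reachable from
the cost.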
import sys
import timeit
import six.moves.cPickle as pickle
import gzip
import numpy as np
from numpy import loadtxt
from theano import shared
from theano import function
import theano
import theano.tensor as t
from theano.tensor.signal import pool
from theano.tensor.nnet import conv2d

Training_Set_Batch_Range = (0, 20000)   # range of the training set in the feature/actual-value file (tuple)
Validation_set_Batch_Range = None       # range of the validation set in the feature/actual-value file (tuple)
Test_Set_Batch_Range = None             # range of the test set in the feature/actual-value file (tuple)


def load_data():
    b = np.loadtxt("mnist_train.txt", delimiter=',', skiprows=40000)
    data_y = np.loadtxt("mnist_train.txt", dtype='int32', delimiter=',',
                        usecols=(0,), skiprows=40000)
    data_x = b[:, 1:]

    # training batch (Training_Set_Batch_Range is a tuple)
    train_batch_x = data_x[Training_Set_Batch_Range[0]:Training_Set_Batch_Range[1]]
    train_batch_y = data_y[Training_Set_Batch_Range[0]:Training_Set_Batch_Range[1]]

    if Validation_set_Batch_Range is None:
        valid_batch_x = None
        valid_batch_y = None
    else:
        # validation batch
        valid_batch_x = data_x[Validation_set_Batch_Range[0]:Validation_set_Batch_Range[1]]
        valid_batch_y = data_y[Validation_set_Batch_Range[0]:Validation_set_Batch_Range[1]]

    if Test_Set_Batch_Range is None:
        test_batch_x = None
        test_batch_y = None
    else:
        # test batch
        test_batch_x = data_x[Test_Set_Batch_Range[0]:Test_Set_Batch_Range[1]]
        test_batch_y = data_y[Test_Set_Batch_Range[0]:Test_Set_Batch_Range[1]]

    train_batch = [train_batch_x, train_batch_y]
    valid_batch = [valid_batch_x, valid_batch_y]
    test_batch = [test_batch_x, test_batch_y]
    dataset = [train_batch, valid_batch, test_batch]
    return dataset


class LogisticClassifier(object):
    def __init__(self, input, n_in, n_out):
        # n_in  = no. of units in the previous hidden layer
        # n_out = no. of units in the output layer
        # input = activation values of the previous hidden layer
        self.W = shared(value=np.ones((n_in, n_out), dtype=theano.config.floatX),
                        name='W', borrow=True)   # weight parameters
        self.b = shared(value=np.zeros((n_out,), dtype=theano.config.floatX),
                        name='b', borrow=True)   # bias units
        # class probabilities given the input
        self.p_y_given_x = t.nnet.softmax(t.dot(input, self.W) + self.b)
        self.y_pred = t.argmax(self.p_y_given_x, axis=1)
        self.params = [self.W, self.b]
        self.input = input

    def negative_log_likelihood(self, y):
        # average cross-entropy over a minibatch
        return -t.mean(t.log(self.p_y_given_x)[t.arange(y.shape[0]), y])


class HiddenLayer(object):
    def __init__(self, input, n_in, n_out):
        # n_in  = no. of input units
        # n_out = no. of hidden-layer units
        # input = input feature values
        self.input = input
        # weight parameters of the hidden layer
        W_values = np.array(
            np.random.uniform(low=-np.sqrt(6. / (n_in + n_out)),
                              high=np.sqrt(6. / (n_in + n_out)),
                              size=(n_in, n_out)),
            dtype=theano.config.floatX
        )
        self.W = theano.shared(value=W_values, name='W', borrow=True)
        # bias parameters of the hidden layer
        b_values = np.zeros((n_out,), dtype=theano.config.floatX)
        self.b = theano.shared(value=b_values, name='b', borrow=True)
        lin_output = t.dot(input, self.W) + self.b   # pre-activation
        self.output = t.tanh(lin_output)             # activation
        self.params = [self.W, self.b]               # parameters of the hidden layer


class LeNetConvPoolLayer(object):
    """Pool layer of a convolutional network."""

    def __getitem__(self, key):
        return self

    def __init__(self, input, filter_shape, image_shape, poolsize=(2, 2)):
        """
        Allocate a LeNetConvPoolLayer with shared variable internal parameters.
        :type input: theano.tensor.dtensor4
        :param input: symbolic image tensor, of shape image_shape

        :type filter_shape: tuple or list of length 4
        :param filter_shape: (number of filters, num input feature maps,
                              filter height, filter width)

        :type image_shape: tuple or list of length 4
        :param image_shape: (batch size, num input feature maps,
                             image height, image width)

        :type poolsize: tuple or list of length 2
        :param poolsize: the downsampling (pooling) factor (#rows, #cols)
        """
        assert image_shape[1] == filter_shape[1]
        self.input = input

        # there are "num input feature maps * filter height * filter width"
        # inputs to each hidden unit
        fan_in = np.prod(filter_shape[1:])
        # each unit in the lower layer receives a gradient from:
        # "num output feature maps * filter height * filter width" / pooling size
        fan_out = (filter_shape[0] * np.prod(filter_shape[2:]) //
                   np.prod(poolsize))
        # initialize weights with random weights
        W_bound = np.sqrt(6. / (fan_in + fan_out))
        self.W = theano.shared(
            np.asarray(
                np.random.uniform(low=-W_bound, high=W_bound, size=filter_shape),
                dtype=theano.config.floatX
            ),
            borrow=True
        )

        # the bias is a 1D tensor -- one bias per output feature map
        b_values = np.zeros((filter_shape[0],), dtype=theano.config.floatX)
        self.b = theano.shared(value=b_values, borrow=True)

        # convolve input feature maps with filters
        conv_out = conv2d(
            input=input,
            filters=self.W,
            filter_shape=filter_shape,
            input_shape=image_shape,
            border_mode="full"
        )

        # pool each feature map individually, using maxpooling
        pooled_out = pool.pool_2d(
            input=conv_out,
            ds=poolsize,
            ignore_border=True
        )

        # add the bias term. Since the bias is a vector (1D array), we first
        # reshape it to a tensor of shape (1, n_filters, 1, 1). Each bias will
        # thus be broadcast across mini-batches and feature map width & height.
        self.output = t.tanh(pooled_out + self.b.dimshuffle('x', 0, 'x', 'x'))

        # store parameters of this layer
        self.params = [self.W, self.b]

        # keep track of model input
        self.input = input


def evaluate_convnet():
    learning_rate = 0.1
    n_epochs = 200
    nkerns = [10, 10, 10]
    batch_size = 500
    img_ht = 28
    img_wdt = 28
    n_ch = 1
    layers = [None] * 3
    f_ht = 5
    f_wdt = 5

    dataset = load_data()
    train_set = dataset[0]
    x_1 = train_set[0]
    y_1 = train_set[1]
    l = x_1.shape[0]

    x = t.dmatrices('x')
    y = t.ivector('y')

    ######################
    # BUILD ACTUAL MODEL #
    ######################
    print('... building the model')

    # Reshape matrix of rasterized images of shape (batch_size, 28 * 28)
    # to a 4D tensor, compatible with our LeNetConvPoolLayer.
    # (28, 28) is the size of MNIST images.
    layer0_input = x.reshape((batch_size, n_ch, img_ht, img_wdt))

    # Construct the convolutional pooling layers:
    # "full" filtering increases the image size to (28+5-1, 28+5-1) = (32, 32)
    # maxpooling reduces this further to (32/2, 32/2) = (16, 16)
    # the 4D output tensor of the first layer is thus of shape
    # (batch_size, nkerns[0], 16, 16)
    i = 0
    for layer in range(len(layers)):
        layer = LeNetConvPoolLayer(
            input=layer0_input,
            image_shape=(batch_size, n_ch, img_ht, img_wdt),
            filter_shape=(nkerns[i], n_ch, f_ht, f_wdt),
            poolsize=(2, 2)
        )
        img_ht = (img_ht + f_ht - 1) / 2
        img_wdt = (img_wdt + f_wdt - 1) / 2
        n_ch = nkerns[i]
        layers[i] = layer
        input = layer[i].output
        i += 1

    # The HiddenLayer being fully-connected, it operates on 2D matrices of
    # shape (batch_size, num_pixels) (i.e. a matrix of rasterized images).
    # This will generate a matrix of shape (batch_size, nkerns[2] * 7 * 7),
    # i.e. (500, 10 * 7 * 7) = (500, 490) with the values above.
    # construct a fully-connected sigmoidal layer
    layer3 = HiddenLayer(
        input=layers[2].output.flatten(2),
        n_in=nkerns[2] * img_ht * img_wdt,
        n_out=500,
    )

    # classify the values of the fully-connected sigmoidal layer
    layer4 = LogisticClassifier(input=layer3.output, n_in=500, n_out=10)

    # the cost we minimize during training is the NLL of the model
    cost = layer4.negative_log_likelihood(y)

    # create a list of all model parameters to be fit by gradient descent
    params = (layer4.params + layer3.params + layers[2].params +
              layers[1].params + layers[0].params)

    # create a list of gradients for all model parameters
    grads = t.grad(cost, params)

    # train_model is a function that updates the model parameters by SGD.
    # Since this model has many parameters, it would be tedious to manually
    # create an update rule for each model parameter. We thus create the
    # updates list by automatically looping over all (params[i], grads[i]) pairs.
    updates = [
        (param_i, param_i - learning_rate * grad_i)
        for param_i, grad_i in zip(params, grads)
    ]

    train_model = function(inputs=[x, y], outputs=cost, updates=updates)

    epoch = 0
    while epoch < n_epochs:
        i = 0
        z = batch_size
        while i < l:
            X = x_1[i:z]
            Y = y_1[i:z]
            i = i + batch_size
            z = z + batch_size
            batch_avg_cost = train_model(X, Y)
            print(batch_avg_cost)
        epoch = epoch + 1


evaluate_convnet()
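As a debugging aid (not part of the original program, and assuming the cost
and params variables built in evaluate_convnet, placed just before the t.grad
call), each parameter can be checked separately to see which shared variables
are missing from the graph of the cost:

    # debugging sketch: relies on 'cost' and 'params' already defined above
    for p in params:
        try:
            t.grad(cost, p)
            print(str(p) + ' is connected to the cost')
        except theano.gradient.DisconnectedInputError:
            print(str(p) + ' is NOT in the computational graph of the cost')

Theano's grad function also accepts a disconnected_inputs argument, so
t.grad(cost, params, disconnected_inputs='warn') should downgrade the error to
one warning per disconnected variable, which makes it easier to see how many
of the ten parameters are affected.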