I'm trying to verify a vanilla RNN model using Theano, but I am seeing an 
error that I am not able to understand. Any help will be much appreciated.

 
The following is the code snippet for the RNN:
-----------------------------------------------------
        W_xh, W_hy, W_hh, b_h, b_y = self.params
        x = T.vector('x')
        y = T.vector('y')
        def forward_prop_step(x_t, h_t_prev, W_xh, W_hy, W_hh, b_h, b_y):
            h_t = T.tanh(W_xh[:, x_t] + T.dot(W_hh, h_t_prev) + b_h)  # (1)
            # h_t = T.tanh(T.dot(W_xh, x_t) + T.dot(W_hh, h_t_prev) + b_h)  # (2)
            o_t = T.nnet.softmax(T.dot(W_hy, h_t) + b_y)
            return [o_t[0], h_t]
        h_0 = T.zeros(self.n_hidden)
        [o,h], _ = theano.scan(
            forward_prop_step,
            sequences=x,
            outputs_info=[None, h_0],
            non_sequences=[W_xh, W_hy, W_hh, b_h, b_y],
            truncate_gradient=self.bptt_truncate,
            strict=True)
---------------------------------------------------

The only difference between lines (1) and (2) is that W_xh[:, x_t] is 
replaced with the actual dot product T.dot(W_xh, x_t). When the input is 
one-hot encoded, the column selection in (1) works fine. But if the input 
is a vector of floating-point numbers, we need to use the form in (2). 
However, using (2) throws the following error:

ValueError: When compiling the inner function of scan (the function called 
by scan in each of its iterations) the following error has been 
encountered: The initial state (`outputs_info` in scan nomenclature) of 
variable IncSubtensor{Set;:int64:}.0 (argument number 1) has 2 
dimension(s), while the corresponding variable in the result of the inner 
function of scan (`fn`) has 2 dimension(s) (it should be one less than the 
initial state). For example, if the inner function of scan returns a vector 
of size d and scan uses the values of the previous time-step, then the 
initial state in scan should be a matrix of shape (1, d). The first 
dimension of this matrix corresponds to the number of previous time-steps 
that scan uses in each of its iterations. In order to solve this issue if 
the two varialbe currently have the same dimensionality, you can increase 
the dimensionality of the variable in the initial state of scan by using 
dimshuffle or shape_padleft. 


It seems like there is some problem with the initialization of 'h_0'.
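
To make the shapes involved concrete, here is a minimal plain-Python sketch 
(not Theano) of the two cases. The toy matrix and the helper names `matvec` 
and `ndim` are made up for illustration. The last case assumes that a dot 
product with a single number acts as elementwise scaling, as numpy.dot does; 
I believe T.dot behaves the same way, but I have not confirmed it.

```python
# Plain-Python sketch of the shapes in (1) and (2); no Theano required.
# W_xh and the inputs below are made-up toy values.

def matvec(W, v):
    """Matrix-vector product: returns one number per row of W."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def ndim(a):
    """Number of dimensions of a nested list (0 for a plain number)."""
    return 1 + ndim(a[0]) if isinstance(a, list) else 0

W_xh = [[0.1, 0.2, 0.3],
        [0.4, 0.5, 0.6]]  # 2 hidden units, 3 input features

# (1) x_t is an integer index: pick column 1 of W_xh.
col = [row[1] for row in W_xh]

# (2) with the matching one-hot vector gives the same 1-D result.
one_hot = [0.0, 1.0, 0.0]
dot_one_hot = matvec(W_xh, one_hot)

# (2) with a float vector x_t is also 1-D, like the hidden state.
dot_vector = matvec(W_xh, [0.7, -1.2, 0.3])

# (2) with a single float x_t: under the scaling assumption the result
# keeps the 2-D shape of W_xh, one dimension more than the hidden state.
x_scalar = 0.7
dot_scalar = [[w * x_scalar for w in row] for row in W_xh]

print(ndim(col), ndim(dot_one_hot), ndim(dot_vector), ndim(dot_scalar))
```

Since scan iterates over the first dimension of its sequences, declaring x 
as T.vector would make each x_t a single number, which corresponds to the 
last case above. Declaring x as a matrix with one row per time step would 
correspond to the float-vector case.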

My questions are:
1. Is there a workaround for this issue, for example a different way to 
initialize 'h_0'?
2. Why does it work for the one-hot encoded input in line (1) but not in 
line (2)?
