It looks like x was originally a list of ints, whereas your x is a vector. As 
written, scan will take one element of the vector x per time step, so 
T.dot(W_xh, x_t) is a matrix times a scalar, which is probably what causes 
your shape problems. Do you have a sequence of vectors for x, or just one 
vector? If you have a sequence, x should come from T.matrix; if you have just 
one vector, pre-compute the dot product T.dot(W_xh, x) outside of scan and 
pass it in as a non-sequence.
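To illustrate the single-vector case, here is a NumPy sketch (shapes are made 
up for illustration, not taken from your code): the input term W_xh.dot(x) 
does not change between time steps, so compute it once and reuse it, which is 
what passing it to scan as a non-sequence accomplishes.

```python
import numpy as np

# Single-input case, illustrative shapes only: W_xh.dot(x) is the same
# at every time step, so compute it once outside the loop. In Theano,
# pass this precomputed term to scan as a non-sequence instead of
# feeding x as a sequence.
rng = np.random.RandomState(0)
n_input, n_hidden, n_steps = 4, 3, 5

W_xh = rng.randn(n_hidden, n_input)
W_hh = rng.randn(n_hidden, n_hidden)
b_h = np.zeros(n_hidden)
x = rng.randn(n_input)          # a single input vector

xh = W_xh.dot(x)                # precomputed once, shape (n_hidden,)
h = np.zeros(n_hidden)          # h_0
for _ in range(n_steps):
    h = np.tanh(xh + W_hh.dot(h) + b_h)
print(h.shape)                  # (n_hidden,)
```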

If x is a matrix, scan will iterate over the rows of the matrix.
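And a NumPy sketch of what scan computes in the sequence case, again with 
illustrative shapes: x is a matrix with one row per time step, so each step 
sees a full vector and W_xh.dot(x_t) is a proper matrix-vector product rather 
than a matrix times a scalar.

```python
import numpy as np

# Sequence case, illustrative shapes only: x is a matrix with one row
# per time step, which is what scan iterates over when given
# sequences=x with x = T.matrix('x'). Note h_0 stays a vector.
rng = np.random.RandomState(0)
n_input, n_hidden, n_steps = 4, 3, 5

W_xh = rng.randn(n_hidden, n_input)
W_hh = rng.randn(n_hidden, n_hidden)
b_h = np.zeros(n_hidden)

x = rng.randn(n_steps, n_input)  # one row per time step
h = np.zeros(n_hidden)           # h_0, a vector (not a matrix)
hs = []
for x_t in x:                    # scan iterates over the rows of x
    h = np.tanh(W_xh.dot(x_t) + W_hh.dot(h) + b_h)
    hs.append(h)
hs = np.stack(hs)
print(hs.shape)                  # (n_steps, n_hidden)
```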


On Tuesday, March 28, 2017 at 10:58:04 AM UTC-7, mrinmoy maity wrote:
>
> I'm trying to verify a vanilla RNN model using Theano, but I'm seeing an 
> error I am not able to understand. Any help will be much appreciated.
>
>  
> Following is the code snippet for RNN:
> -----------------------------------------------------
>         W_xh, W_hy, W_hh, b_h, b_y = self.params
>         x = T.vector('x')
>         y = T.vector('y')
>         def forward_prop_step(x_t, h_t_prev, W_xh, W_hy, W_hh, b_h, b_y):
>             h_t = T.tanh(W_xh[:, x_t] + T.dot(W_hh, h_t_prev) + b_h)       # (1)
> #           h_t = T.tanh(T.dot(W_xh, x_t) + T.dot(W_hh, h_t_prev) + b_h)   # (2)
>             o_t = T.nnet.softmax(T.dot(W_hy, h_t) + b_y)
>             return [o_t[0], h_t]
>         h_0 = T.zeros(self.n_hidden)
>         [o,h], _ = theano.scan(
>             forward_prop_step,
>             sequences=x,
>             outputs_info=[None, h_0],
>             non_sequences=[W_xh, W_hy, W_hh, b_h, b_y],
>             truncate_gradient=self.bptt_truncate,
>             strict=True)
> ---------------------------------------------------
>
> The only difference between lines (1) and (2) is replacing W_xh[:, x_t] 
> with the actual dot product T.dot(W_xh, x_t). When x_t is a one-hot 
> encoded input, the column selection in (1) works fine. But if the input is 
> a vector of floating-point numbers, we need to use (2). However, using (2) 
> throws the following error:
>
> ValueError: When compiling the inner function of scan (the function called 
> by scan in each of its iterations) the following error has been 
> encountered: The initial state (`outputs_info` in scan nomenclature) of 
> variable IncSubtensor{Set;:int64:}.0 (argument number 1) has 2 
> dimension(s), while the corresponding variable in the result of the inner 
> function of scan (`fn`) has 2 dimension(s) (it should be one less than the 
> initial state). For example, if the inner function of scan returns a vector 
> of size d and scan uses the values of the previous time-step, then the 
> initial state in scan should be a matrix of shape (1, d). The first 
> dimension of this matrix corresponds to the number of previous time-steps 
> that scan uses in each of its iterations. In order to solve this issue if 
> the two varialbe currently have the same dimensionality, you can increase 
> the dimensionality of the variable in the initial state of scan by using 
> dimshuffle or shape_padleft. 
>
>
> Seems like there's some problem with the initialization of 'h_0'.
>
> My questions are:
> 1. Is there a workaround for this issue, e.g. how should 'h_0' be 
> initialized?
> 2. Why does it work for a one-hot encoded vector in line (1) but not in 
> line (2)?
>
