It looks like x was originally a list of ints, one index per time step, which is why W_xh[:, x_t] could select a column. Your x is a vector. I think what you have written takes one element of x per time step, so x_t is a scalar and T.dot(W_xh, x_t) is a matrix times a scalar. That gives a matrix rather than a vector for h_t, which is probably what leads to your shape problems. Do you have a sequence of vectors for x, or just one? If you have a sequence, x should come from T.matrix. If you have just one, pre-compute the dot product T.dot(W_xh, x) outside of scan and pass the result as a non-sequence.
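To make the shapes concrete, here is a rough sketch in plain NumPy rather than Theano. The Python loops only stand in for what scan does when it walks over the leading axis of a sequence. The sizes (n_in, n_hidden, n_steps) and the helper names run_rnn and run_rnn_fixed_input are made up for illustration, and the output layer is left out.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_steps = 3, 4, 5  # illustrative sizes, not from the original post

W_xh = rng.standard_normal((n_hidden, n_in))
W_hh = rng.standard_normal((n_hidden, n_hidden))
b_h = np.zeros(n_hidden)
h_0 = np.zeros(n_hidden)


def run_rnn(inputs, h_init):
    """Iterate over the leading axis of `inputs`, one x_t per step."""
    h_t = h_init
    states = []
    for x_t in inputs:
        h_t = np.tanh(np.dot(W_xh, x_t) + np.dot(W_hh, h_t) + b_h)
        states.append(h_t)
    return np.stack(states)


def run_rnn_fixed_input(xh, h_init, steps):
    """Same recurrence, but the input term `xh` is computed once up front."""
    h_t = h_init
    states = []
    for _ in range(steps):
        h_t = np.tanh(xh + np.dot(W_hh, h_t) + b_h)
        states.append(h_t)
    return np.stack(states)


# Problem case: x is a single vector, so each step sees one scalar element.
x_vec = rng.standard_normal(n_in)
x_t = x_vec[0]
scalar_case_shape = np.dot(W_xh, x_t).shape  # (n_hidden, n_in): a matrix, not a vector

# Sequence of vectors: x is a matrix and each step sees one row of length n_in.
x_seq = rng.standard_normal((n_steps, n_in))
states_seq = run_rnn(x_seq, h_0)  # shape (n_steps, n_hidden)

# Single vector: compute the input term once and reuse it at every step.
xh = np.dot(W_xh, x_vec)  # shape (n_hidden,)
states_fixed = run_rnn_fixed_input(xh, h_0, n_steps)  # shape (n_steps, n_hidden)
```

In the fixed-input case there is no sequence to set the number of iterations, so the sketch takes the step count explicitly. I would expect the Theano version to need the same thing through scan's n_steps argument, but treat that as an assumption about your setup.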
If x is a matrix, scan will iterate over the rows of the matrix.

On Tuesday, March 28, 2017 at 10:58:04 AM UTC-7, mrinmoy maity wrote:
>
> I'm trying to verify the vanilla RNN model using Theano but am seeing an
> error I am not able to understand. Any help will be much appreciated.
>
> Following is the code snippet for RNN:
> -----------------------------------------------------
> W_xh, W_hy, W_hh, b_h, b_y = self.params
> x = T.vector('x')
> y = T.vector('y')
> def forward_prop_step(x_t, h_t_prev, W_xh, W_hy, W_hh, b_h, b_y):
>     h_t = T.tanh(W_xh[:, x_t] + T.dot(W_hh, h_t_prev) + b_h)  # (1)
>     # h_t = T.tanh(T.dot(W_xh, x_t) + T.dot(W_hh, h_t_prev) + b_h)  (2)
>     o_t = T.nnet.softmax(T.dot(W_hy, h_t) + b_y)
>     return [o_t[0], h_t]
> h_0 = T.zeros(self.n_hidden)
> [o,h], _ = theano.scan(
>     forward_prop_step,
>     sequences=x,
>     outputs_info=[None, h_0],
>     non_sequences=[W_xh, W_hy, W_hh, b_h, b_y],
>     truncate_gradient=self.bptt_truncate,
>     strict=True)
> ---------------------------------------------------
>
> In lines (1) and (2), the only difference is replacing W_xh[:, x_t] with
> the actual dot product T.dot(W_xh, x_t). When x_t is a one-hot encoding
> vector, the column selection specified in (1) works fine. But if the input
> is a vector of floating point numbers, we need to use what is specified in
> (2). However, using (2) throws the following error:
>
> ValueError: When compiling the inner function of scan (the function called
> by scan in each of its iterations) the following error has been
> encountered: The initial state (`outputs_info` in scan nomenclature) of
> variable IncSubtensor{Set;:int64:}.0 (argument number 1) has 2
> dimension(s), while the corresponding variable in the result of the inner
> function of scan (`fn`) has 2 dimension(s) (it should be one less than the
> initial state).
> For example, if the inner function of scan returns a vector
> of size d and scan uses the values of the previous time-step, then the
> initial state in scan should be a matrix of shape (1, d). The first
> dimension of this matrix corresponds to the number of previous time-steps
> that scan uses in each of its iterations. In order to solve this issue if
> the two varialbe currently have the same dimensionality, you can increase
> the dimensionality of the variable in the initial state of scan by using
> dimshuffle or shape_padleft.
>
> Seems like there's some problem with initialization 'h_0'.
>
> My questions are:
> 1. Is there any workaround to avoid the issue like how 'h_0' should be
> initialized?
> 2. Why does it work for one hot encoding vector in line (1) but not in
> line (2)?