It looks like x was originally a list of ints, which is the case that the
column selection W_xh[:, x_t] handles. Your x is now a vector of floats. As
written, scan takes one element of the vector x per time step, so x_t is a
scalar and T.dot(W_xh, x_t) is a matrix times a scalar. The result is a
matrix rather than a vector, which is probably what is causing your shape
problems. Do you have a sequence of vectors for x or just one? If you have a
sequence, x should come from T.matrix; if you have just one, pre-compute the
dot product and pass it as a non-sequence.
If x is a matrix, scan will iterate over the rows of the matrix.
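
To make the shapes concrete, here is a rough NumPy sketch of the three cases. It is not Theano code: a plain Python loop stands in for scan, the sizes (n_in, n_hidden, n_steps) and the random data are made up for illustration, and only the hidden-state update from line (2) is reproduced.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
n_in, n_hidden, n_steps = 3, 4, 5
rng = np.random.default_rng(0)
W_xh = rng.standard_normal((n_hidden, n_in))
W_hh = rng.standard_normal((n_hidden, n_hidden))
b_h = np.zeros(n_hidden)
h_0 = np.zeros(n_hidden)

# Case 1: x is a vector, so each step sees a scalar x_t.
# A matrix "dotted" with a scalar is just a scaled matrix, not a vector.
x_vec = rng.standard_normal(n_steps)
scaled = np.dot(W_xh, x_vec[0])
print(scaled.shape)  # (4, 3): two dimensions, but h_0 has only one

# Case 2: x is a matrix with one input vector per row.
# Iterating over it yields rows, much as scan does with a T.matrix sequence.
x_mat = rng.standard_normal((n_steps, n_in))
h = h_0
hidden_states = []
for x_t in x_mat:
    h = np.tanh(np.dot(W_xh, x_t) + np.dot(W_hh, h) + b_h)
    hidden_states.append(h)
hidden_states = np.stack(hidden_states)
print(hidden_states.shape)  # (5, 4): one hidden vector per time step

# Case 3: a single input vector reused at every step.
# The product W_xh . x does not change, so compute it once outside the loop.
x_single = rng.standard_normal(n_in)
xw = np.dot(W_xh, x_single)
h_single = h_0
for _ in range(n_steps):
    h_single = np.tanh(xw + np.dot(W_hh, h_single) + b_h)
print(h_single.shape)  # (4,)
```

In Theano terms, case 2 corresponds to declaring x with T.matrix and keeping sequences=x, and case 3 corresponds to passing the pre-computed product through non_sequences.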
On Tuesday, March 28, 2017 at 10:58:04 AM UTC-7, mrinmoy maity wrote:
>
> I'm trying to verify a vanilla RNN model using Theano but am seeing an
> error I am not able to understand. Any help would be much appreciated.
>
>
> Following is the code snippet for RNN:
> -----------------------------------------------------
> W_xh, W_hy, W_hh, b_h, b_y = self.params
> x = T.vector('x')
> y = T.vector('y')
> def forward_prop_step(x_t, h_t_prev, W_xh, W_hy, W_hh, b_h, b_y):
>     h_t = T.tanh(W_xh[:, x_t] + T.dot(W_hh, h_t_prev) + b_h)  # (1)
>     # h_t = T.tanh(T.dot(W_xh, x_t) + T.dot(W_hh, h_t_prev) + b_h)  # (2)
>     o_t = T.nnet.softmax(T.dot(W_hy, h_t) + b_y)
>     return [o_t[0], h_t]
> h_0 = T.zeros(self.n_hidden)
> [o,h], _ = theano.scan(
>     forward_prop_step,
>     sequences=x,
>     outputs_info=[None, h_0],
>     non_sequences=[W_xh, W_hy, W_hh, b_h, b_y],
>     truncate_gradient=self.bptt_truncate,
>     strict=True)
> ---------------------------------------------------
>
> Lines (1) and (2) differ only in replacing W_xh[:, x_t] with the actual
> dot product T.dot(W_xh, x_t). When the input is a one-hot encoding, the
> column selection in (1) works fine. But if the input is a vector of
> floating-point numbers, we need to use what is specified in (2). However,
> using (2) throws the following error:
>
> ValueError: When compiling the inner function of scan (the function called
> by scan in each of its iterations) the following error has been
> encountered: The initial state (`outputs_info` in scan nomenclature) of
> variable IncSubtensor{Set;:int64:}.0 (argument number 1) has 2
> dimension(s), while the corresponding variable in the result of the inner
> function of scan (`fn`) has 2 dimension(s) (it should be one less than the
> initial state). For example, if the inner function of scan returns a vector
> of size d and scan uses the values of the previous time-step, then the
> initial state in scan should be a matrix of shape (1, d). The first
> dimension of this matrix corresponds to the number of previous time-steps
> that scan uses in each of its iterations. In order to solve this issue if
> the two varialbe currently have the same dimensionality, you can increase
> the dimensionality of the variable in the initial state of scan by using
> dimshuffle or shape_padleft.
>
>
> It seems like there's some problem with the initialization of 'h_0'.
>
> My questions are:
> 1. Is there a workaround for this issue, such as a different way to
> initialize 'h_0'?
> 2. Why does it work for one hot encoding vector in line (1) but not in
> line (2)?
>