Hi Robin, I just came across the same issue. It looks very strange to me because I had been using this kind of code (a scan in which some intermediate values of the scan function have different sizes in different iterations) for a long time and everything was OK, but after making small modifications (putting the mentioned fragment into an ifelse statement) it suddenly stopped working. Have you figured out how to manage this issue, and could you suggest a solution?
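In case it helps anyone hitting this: a common way to sidestep varying intermediate shapes inside scan is to pad every step to the size of the largest group and mask out the padding, so each iteration works on constant-shape arrays. The snippet below is only a NumPy sketch of that equivalence (not Theano code, and not a fix for the optimizer bug itself); it reuses the shapes and index layout from Robin's example, and `max_k` is just a name I chose for the largest group size.

```python
import numpy as np

rng = np.random.RandomState(0)
d, h = 11, 7
W1 = rng.uniform(-0.1, 0.1, (d, h))
W2 = rng.uniform(-0.1, 0.1, (h,))

vecs = np.ones((10, d))
inds = np.array([[0, 0], [1, 1], [1, 2], [2, 3], [3, 4], [3, 5],
                 [3, 6], [3, 7], [4, 8], [4, 9]])
n = 5

# Variable-size version: mirrors the recurrence in the original post,
# where the number of selected rows changes from step to step.
def score_step(t):
    cur_vecs = vecs[inds[inds[:, 0] == t, 1]]   # shape (k_t, d), k_t varies
    return np.tanh(cur_vecs.dot(W1)).dot(W2).sum()

var_total = sum(score_step(t) for t in range(n))

# Fixed-size version: pad every step to the largest group and mask,
# so every iteration sees arrays of the same shape.
max_k = max((inds[:, 0] == t).sum() for t in range(n))

def padded_step(t):
    rows = inds[inds[:, 0] == t, 1]
    cur_vecs = np.zeros((max_k, d))             # constant shape every step
    mask = np.zeros(max_k)
    cur_vecs[:len(rows)] = vecs[rows]
    mask[:len(rows)] = 1.0
    return (np.tanh(cur_vecs.dot(W1)).dot(W2) * mask).sum()

pad_total = sum(padded_step(t) for t in range(n))
assert np.allclose(var_total, pad_total)
```

The same padding could then be applied inside the scan recurrence (with the mask passed in as a sequence or computed from `inds`), which keeps every intermediate the same shape across iterations.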
Norbert

On Sunday, 29 May 2016 at 23:33:38 UTC+2, [email protected] wrote:
>
> Hi,
>
> I'm getting an error computing gradients through a scan in which some
> intermediate values of the scan function have different sizes in different
> iterations (the inputs and outputs always have the same size). Here's a
> minimal example:
>
> import numpy as np
> import theano
> import theano.tensor as T
>
> d = 11
> h = 7
> W1 = theano.shared(name='W1', value=np.random.uniform(-0.1, 0.1, (d, h)))
> W2 = theano.shared(name='W2', value=np.random.uniform(-0.1, 0.1, (h,)))
>
> n = T.lscalar('n')
> vecs = T.matrix('vecs')
> inds = T.lmatrix('inds')
>
> def recurrence(t, vecs, inds, W1, W2):
>     cur_inds = inds[T.eq(inds[:, 0], t).nonzero()]
>     cur_vecs = vecs[cur_inds[:, 1]]
>     hidden_layers = T.tanh(cur_vecs.dot(W1))
>     scores = hidden_layers.dot(W2)
>     return T.sum(scores)
>
> results, _ = theano.scan(
>     fn=recurrence, sequences=[T.arange(n)], outputs_info=[None],
>     non_sequences=[vecs, inds, W1, W2], strict=True)
> obj = T.sum(results)
> grads = T.grad(obj, [W1, W2])
> f = theano.function(inputs=[n, vecs, inds], outputs=grads)
>
> vecs_in = np.ones((10, d))
> inds_in = np.array([[0, 0], [1, 1], [1, 2], [2, 3], [3, 4], [3, 5],
>                     [3, 6], [3, 7], [4, 8], [4, 9]])
> print f(5, vecs_in, inds_in)
>
> Running this code results in the following error message (tried on 0.7.0,
> 0.8.2, and 0.9.0dev1.dev-0044349fdf4244c5b616994bf16ad2ff1ff8ce8a):
>
> Traceback (most recent call last):
>   File "edge_scores.py", line 33, in <module>
>     print f(5, vecs_in, inds_in)
>   File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 912, in __call__
>     storage_map=getattr(self.fn, 'storage_map', None))
>   File "/usr/local/lib/python2.7/dist-packages/theano/gof/link.py", line 314, in raise_with_op
>     reraise(exc_type, exc_value, exc_trace)
>   File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 899, in __call__
>     self.fn() if output_subset is None else\
>   File "/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan_op.py", line 951, in rval
>     r = p(n, [x[0] for x in i], o)
>   File "/usr/local/lib/python2.7/dist-packages/theano/scan_module/scan_op.py", line 940, in <lambda>
>     self, node)
>   File "theano/scan_module/scan_perform.pyx", line 547, in theano.scan_module.scan_perform.perform
>     (/home/robinjia/.theano/compiledir_Linux-3.13--generic-x86_64-with-Ubuntu-14.04-trusty-x86_64-2.7.6-64/scan_perform/mod.cpp:6224)
> ValueError: could not broadcast input array from shape (11,4) into shape (11,2)
> Apply node that caused the error: forall_inplace,cpu,grad_of_scan_fn}(n,
>     Alloc.0, Elemwise{eq,no_inplace}.0, Alloc.0, n, n, W1, W2, vecs, inds,
>     InplaceDimShuffle{x,0}.0)
> Toposort index: 47
> Inputs types: [TensorType(int64, scalar), TensorType(float64, col),
>     TensorType(int8, matrix), TensorType(float64, matrix), TensorType(int64, scalar),
>     TensorType(int64, scalar), TensorType(float64, matrix), TensorType(float64, vector),
>     TensorType(float64, matrix), TensorType(int64, matrix), TensorType(float64, row)]
> Inputs shapes: [(), (5, 1), (5, 10), (2, 7), (), (), (11, 7), (7,), (10, 11), (10, 2), (1, 7)]
> Inputs strides: [(), (8, 8), (10, 1), (56, 8), (), (), (56, 8), (8,), (88, 8), (16, 8), (56, 8)]
> Inputs values: [array(5), array([[ 1.],
>     [ 1.],
>     [ 1.],
>     [ 1.],
>     [ 1.]]), 'not shown', 'not shown', array(5), array(5), 'not shown',
>     'not shown', 'not shown', 'not shown', 'not shown']
> Outputs clients: [[Subtensor{int64}(forall_inplace,cpu,grad_of_scan_fn}.0,
>     ScalarFromTensor.0)],
>     [InplaceDimShuffle{1,0,2}(forall_inplace,cpu,grad_of_scan_fn}.1)],
>     [Reshape{2}(forall_inplace,cpu,grad_of_scan_fn}.2,
>     MakeVector{dtype='int64'}.0),
>     Shape_i{1}(forall_inplace,cpu,grad_of_scan_fn}.2)]]
>
> HINT: Re-running with most Theano optimizations disabled could give you a
> back-trace of when this node was created. This can be done by setting the
> Theano flag 'optimizer=fast_compile'. If that does not work, Theano
> optimizations can be disabled with 'optimizer=None'.
> HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and
> storage map footprint of this apply node.
>
> A couple of observations:
> - There's no error if I turn off optimizations (theano.config.optimizer = 'None').
> - There's no error if I have a single layer and no hidden layer (i.e., if
>   scores = cur_vecs.dot(W) for W of the appropriate shape).
>
> Thanks!
>
> Robin

--
You received this message because you are subscribed to the Google Groups "theano-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
For more options, visit https://groups.google.com/d/optout.
