The error appears only once I increment Ci with activation in the third-to-last line (the T.inc_subtensor call). If I leave that line out, the whole thing works.
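For anyone wanting to reproduce the semantics of that increment outside Theano: as far as I know, `T.inc_subtensor` on an advanced-indexed subtensor accumulates repeated indices, much like NumPy's `np.add.at`. A minimal NumPy sketch (the shapes and index values here are made up for illustration, loosely mirroring `Ci` and `nonzero_par`):

```python
import numpy as np

Ci = np.zeros((6, 3))
idx = np.array([0, 2, 2])      # repeated index, as nonzero_par could contain
activation = np.ones((3, 3))

# np.add.at accumulates into repeated indices, analogous to
# Ci = T.inc_subtensor(Ci[idx], activation) in Theano.
np.add.at(Ci, idx, activation)
# Row 2 received two increments of ones, row 0 received one.
```

Plain `Ci[idx] += activation` would not accumulate the duplicate index, which is why `inc_subtensor` / `np.add.at` is the right comparison.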
On Wednesday, 20 July 2016 17:12:12 UTC-4, Aditya Gudimella wrote:
>
> I have written the following code in Theano. It keeps giving me a
> gradient error. Can someone please help me? The error I get is:
>
>     NotImplementedError: Grad is not implemented for inputs with number
>     of dimension other than 1.
>
> import numpy as np
> import theano.tensor as T
> from theano import shared
> from theano.tensor.extra_ops import repeat, cumsum, diff
>
> nodes_np = np.array([1, 2, 3, 4, 5, 6])
> child_np = np.array([2, 3, 4, 5, 6])
> n_child_np = np.array([2, 0, 3, 0, 0, 0])
> leaves_np = np.array([4, 0, 3, 0, 0, 0])
>
> nodes, child, n_child, leaves = (shared(nodes_np.astype(float)),
>                                  shared(child_np.astype(float)),
>                                  shared(n_child_np), shared(leaves_np))
>
> embed_dim = 3
> vocab = np.arange(200)
> n_vocab = len(vocab)
>
> vci = np.random.random((1 + n_vocab, embed_dim))
> VCi = shared(vci, name='Embeddings')
>
> unique_ind = T.eq(nodes[:, None], vocab[None, :])
> Ci = VCi[unique_ind.nonzero()[1]]
>
> # ... Initialize a few more weights here
>
> nonzero_par = n_child.nonzero()
> nonzero_child = n_child.nonzero_values()
>
> luc = leaves[1:]
> lup = repeat(leaves[nonzero_par], nonzero_child)
> l = luc / lup
>
> sep = cumsum(nonzero_child)
> i = T.arange(1, child.shape[0] + 1)
> i = i - repeat(T.concatenate((T.zeros(1), i[sep - 1][:-1])), nonzero_child)
> n = repeat(nonzero_child, nonzero_child)
> li, ri = (n - i) / (n - 1), (i - 1) / (n - 1)
> li, ri = (T.set_subtensor(li[T.eq(n, 1)], 0.5),
>           T.set_subtensor(ri[T.eq(n, 1)], 0.5))
>
> W_code_i = li[:, None, None]*Wl[None, :, :] + ri[:, None, None]*Wr[None, :, :]
> # Wl and Wr are previously initialized weights
>
> product = l[:, None]*T.sum(W_code_i*Ci[1:, None, :], axis=2) + Bias
>
> def split_sum(array, split_cts):
>     '''
>     Given an array-like and an array of counts, return an array where
>     each element is the sum of `count` consecutive rows of `array`.
>     That is, retval[i] = sum(array[counts.cumsum()[i-1]:counts.cumsum()[i]])
>     '''
>     sep = cumsum(split_cts) - 1
>     return diff(T.concatenate((T.zeros((array.shape[1],))[None, :],
>                                cumsum(array, axis=0)[sep]), axis=0), axis=0)
>
> activation = T.tanh(split_sum(product, nonzero_child))
>
> Ci = T.sum(Wcomb1[None, :, :]*Ci[:, None, :], axis=2)
> activation = T.sum(Wcomb2[None, :, :]*activation[:, None, :], axis=2)
>
> Ci = T.inc_subtensor(Ci[nonzero_par], activation)
>
> owp = T.max(Ci, axis=-1)
>
> T.grad(owp.norm(2), wrt=VCi)
>
> Any help would be appreciated. Thanks in advance for your help.
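As a sanity check on the `split_sum` helper quoted above, here is a direct NumPy translation of the same cumsum/diff trick, with a tiny worked example (the input values are made up; this only demonstrates the helper's intended semantics, not the Theano graph):

```python
import numpy as np

def split_sum_np(array, split_cts):
    """NumPy version of the split_sum helper from the post:
    retval[i] = array[cumsum[i-1]:cumsum[i]].sum(axis=0)."""
    sep = np.cumsum(split_cts) - 1
    # Prepend a zero row to the cumulative sums sampled at the segment
    # boundaries, then difference to recover per-segment sums.
    padded = np.vstack([np.zeros((1, array.shape[1])),
                        np.cumsum(array, axis=0)[sep]])
    return np.diff(padded, axis=0)

arr = np.arange(12.0).reshape(6, 2)   # rows [0,1], [2,3], ..., [10,11]
counts = np.array([2, 1, 3])          # segment lengths summing to 6 rows
out = split_sum_np(arr, counts)
# out[0] = rows 0-1 summed, out[1] = row 2, out[2] = rows 3-5 summed
```

This matches the docstring: each output row is the sum of `counts[i]` consecutive input rows, which is how `activation` collapses the per-child `product` rows back to one row per parent.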
