[theano-users] Re: Weights are not updated on each iteration

2016-10-12 Thread Kv Manohar
I had normalized the input so that all the values were between 0 and 1.
I've had no luck with a ReLU activation unit or tanh, nor with
multiplying the initial weights by a factor of 0.01.
There seems to be some other problem with my implementation that I'm
unable to figure out.
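
One way to narrow this down is to check whether the gradient with respect to
W1 is actually reaching the weights. A minimal diagnostic sketch (my addition,
not code from this thread; it assumes the cost, params, x, and y variables
defined in the code quoted below):

from theano import function
import theano.tensor as T

# Report the mean absolute gradient for each parameter; if the W1 entry
# stays near zero across batches, the gradient really is vanishing rather
# than the update rule being wrong.
grads = T.grad(cost, params)
grad_norms = function(inputs=[x, y],
                      outputs=[T.mean(T.abs_(g)) for g in grads])

# Usage: print(grad_norms(x_batch, y_batch)) once per training iteration.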



On Wednesday, October 12, 2016 at 4:14:43 PM UTC+5:30, Kv Manohar wrote:
>
> Initial Variables
> x = T.dmatrix('x')
> y = T.dmatrix('y')
>
> These are the weights of a neural network
> W1_vals = np.asarray(rng.randn(input, hidden), dtype=theano.config.floatX)
> W1 = shared(value=W1_vals, name='W1')
> W2_vals = np.asarray(rng.randn(hidden, output), dtype=theano.config.floatX)
> W2 = shared(value=W2_vals, name='W2')
>
> Cost function is:
> hidden_activations = T.nnet.sigmoid(T.dot(x, W1))
> prob_y_given_x = T.nnet.softmax(T.dot(hidden_activations, W2))
>
> # y is one-hot vectors
> cost = T.mean(T.nnet.categorical_crossentropy(prob_y_given_x, y))
> params = [W1, W2]
>
> Corresponding gradients are computed as
> grads = T.grad(cost, params)
>
> The update rule is
> lr = 0.01
> updates = [(param, param - lr * grad) for param, grad in zip(params, grads)]
>
> Function to train the model
> train = function(inputs=[x, y], outputs=cost, updates=updates)
>
> The problem I'm facing
> I'm updating the weights after one full sweep of the training data (50
> examples). When I print out the values of W1 and W2 after each iteration
> (using W1.get_value() etc.), W2 seems to get updated but not W1; the
> values of W1 are constant throughout.
> Where is the mistake in my code? I'm unable to figure it out.
> Thanks!



[theano-users] Re: Unable to write gradient step for rnn in theano

2016-10-12 Thread Doug
The error is telling you the issue: your original shared variables are
float32, but the updates you produce are float64. I'm guessing you don't
have floatX set to float32 in your Theano config, so when you multiply the
gradient by the learning rate, the result gets upcast to float64. You can
either set floatX=float32 or manually cast the updates to float32.
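
Either fix is small. A hedged sketch of both options (my illustration, using
the variable names from the code quoted below; the third variant, giving the
learning rate an explicit dtype, is my addition):

# Option 1: set floatX once, e.g. in ~/.theanorc
# [global]
# floatX = float32

# Option 2: give the learning rate an explicit dtype so nothing upcasts
learning_rate = T.scalar('learning_rate', dtype='float32')

# Option 3: cast each update expression back to float32 manually
updates = ((U, T.cast(U - learning_rate * du, 'float32')),
           (V, T.cast(V - learning_rate * dv, 'float32')),
           (W, T.cast(W - learning_rate * dw, 'float32')))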

On Tuesday, October 11, 2016 at 5:26:22 PM UTC-4, Raghu Ram wrote:
>
> The following is the code:
>
> # coding: utf-8
>
> # In[68]:
>
> # Importing stuff
> import theano
> import theano.tensor as T
> import numpy as np
>
>
> # In[69]:
>
> import nltk
> import sys
> import operator
> import csv
> import itertools
> from utils import *
> from datetime import datetime
>
>
> # In[70]:
>
> # Fixing vocabulary size for one-hot vectors and some initialization stuff
> v_size = 8000
> unknown_token = "UNKNOWN_TOKEN"
> start_token = ""
> end_token = ""
>
>
> # In[71]:
>
> # Read data and start preprocessing
> with open('reddit-comments-2015-08.csv', 'rb') as f:
>     reader = csv.reader(f, skipinitialspace=True)
>     reader.next()
>     sentences = list(itertools.chain(
>         *[nltk.sent_tokenize(x[0].decode('utf-8')) for x in reader]))
> print len(sentences)
>
>
> # In[72]:
>
> # Tokenize the sentences and add start and end tokens
> tokenized_sentences = [nltk.word_tokenize(s) for s in sentences]
> tokenized_sentences = [[start_token] + s + [end_token]
>                        for s in tokenized_sentences]
>
>
> # In[73]:
>
> # Get word frequencies and use only the most frequent words in the vocabulary
> word_freq = nltk.FreqDist(itertools.chain(*tokenized_sentences))
> vocab = word_freq.most_common(v_size - 1)
>
>
> # In[74]:
>
> # Do mapping and reverse mapping
> index_to_word = [x[0] for x in vocab]
> index_to_word.append(unknown_token)
> word_to_index = {w: i for i, w in enumerate(index_to_word)}
>
> # Replace less frequent words with the unknown token
> for i, s in enumerate(tokenized_sentences):
>     tokenized_sentences[i] = [w if w in word_to_index else unknown_token
>                               for w in s]
>
> # Got vectors but they are not one-hot
> X_train = np.asarray([[word_to_index[w] for w in s[:-1]]
>                       for s in tokenized_sentences])
> Y_train = np.asarray([[word_to_index[w] for w in s[1:]]
>                       for s in tokenized_sentences])
> # Preprocessing ends here
>
>
> # In[75]:
>
> # Take only one sentence for now
> X_train = X_train[0]
> Y_train = Y_train[0]
>
>
> # In[76]:
>
> # Make input and output one-hot vectors. This can easily be replaced with
> # vectors generated by word2vec.
> X_train_onehot = np.eye(v_size)[X_train]
> X = theano.shared(np.array(X_train_onehot).astype('float32'), name='X')
> Y_train_onehot = np.eye(v_size)[Y_train]
> Y = theano.shared(np.array(Y_train_onehot).astype('float32'), name='Y')
>
>
> # In[77]:
>
> # Initializing U, V and W
> i_dim = v_size
> h_dim = 100
> o_dim = v_size
>
> U = theano.shared(np.random.randn(i_dim, h_dim).astype('float32'), name='U')
> W = theano.shared(np.random.randn(h_dim, h_dim).astype('float32'), name='W')
> V = theano.shared(np.random.randn(h_dim, o_dim).astype('float32'), name='V')
>
>
> # In[78]:
>
> # Forward propagation
> s = T.vector('s')
>
> results, updates = theano.scan(lambda x, sm1: T.tanh(T.dot(x, U) + T.dot(sm1, W)),
>                                sequences=X_train_onehot,
>                                outputs_info=s)
> y_hat = T.dot(results, V)
>
> forward_propagation = theano.function(inputs=[s], outputs=y_hat)
>
>
> # In[80]:
>
> # Loss
> loss = T.sum(T.nnet.categorical_crossentropy(y_hat, Y))
>
>
> # In[81]:
>
> # Gradients
> dw = T.grad(loss, W)
> du = T.grad(loss, U)
> dv = T.grad(loss, V)
>
>
> # In[82]:
>
> # BPTT
> learning_rate = T.scalar('learning_rate')
> gradient_step = theano.function(inputs=[s, learning_rate],
>                                 updates=((U, U - learning_rate * du),
>                                          (V, V - learning_rate * dv),
>                                          (W, W - learning_rate * dw)))
>
> On Wednesday, October 12, 2016 at 2:43:37 AM UTC+5:30, Raghu Ram wrote:
>>
>> Hello all,
>> I am new to Theano.
>> I have the following code (attached). It keeps throwing an error at the
>> gradient step. I added a screenshot of the Jupyter notebook as well. Can
>> someone tell me what is going wrong?
>>
>> Please look at the attached file for the full code.
>>
>



[theano-users] When I run "import theano" I get "Not able to select available GPU from 2 cards (out of memory)."

2016-10-12 Thread joseph
On this machine I have another process using Theano with memory on the GPU.
Normally I can launch many processes and import theano, but now I get this
error.

$ python
Python 2.7.12 |Anaconda 2.3.0 (64-bit)| (default, Jul  2 2016, 17:42:40)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>> import theano
ERROR (theano.sandbox.cuda): ERROR: Not using GPU. Initialisation of device
0 failed:
cublasCreate() returned this error 'the CUDA Runtime initialization failed'
ERROR (theano.sandbox.cuda): ERROR: Not using GPU. Initialisation of device
gpu failed:
Not able to select available GPU from 2 cards (out of memory).
ERR!
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/u/cohenjos/.local/lib/python2.7/site-packages/theano/__init__.py",
    line 111, in <module>
    theano.sandbox.cuda.tests.test_driver.test_nvidia_driver1()
  File "/u/cohenjos/.local/lib/python2.7/site-packages/theano/sandbox/cuda/tests/test_driver.py",
    line 29, in test_nvidia_driver1
    A = cuda.shared_constructor(a)
  File "/u/cohenjos/.local/lib/python2.7/site-packages/theano/sandbox/cuda/var.py",
    line 218, in float32_shared_constructor
    enable_cuda=False)
  File "/u/cohenjos/.local/lib/python2.7/site-packages/theano/sandbox/cuda/__init__.py",
    line 554, in use
    cuda_ndarray.cuda_ndarray.select_a_gpu()
RuntimeError: ('Not able to select available GPU from 2 cards (out of
memory).', 'You asked to force this device and it failed. No fallback to the
cpu or other gpu device.')
>>>


~/.theanorc
[global]
device=gpu0
floatX=float32

Nvidia-smi shows there is plenty of memory:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.44                 Driver Version: 367.44                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX TIT...  Off  | :04:00.0          On |                  N/A |
| 82%   86C    P2   195W / 250W |   2358MiB /  6081MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX TIT...  Off  | :05:00.0         Off |                  N/A |
| 40%   68C    P2   224W / 250W |   5851MiB /  6082MiB |     92%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |

Re: Private message regarding: [theano-users] how to use theano scan with RandomStreams

2016-10-12 Thread Pascal Lamblin
I did not test this, because I don't have data or code for one_step, but it 
would be something like:

def loop_over_examples(x):
    # hidden and outputs of the entire sequence
    [h_vals, o_vals], inner_updates = theano.scan(
        fn=one_step,
        sequences=dict(input=x, taps=[0]),
        outputs_info=[h0, None],  # corresponds to return type of one_step
        non_sequences=[W_ih, W_hh, b_h, W_ho, b_o])
    return o_vals, inner_updates


O_vals, updates = theano.scan(
    fn=loop_over_examples,
    sequences=dict(input=V, taps=[0]),
    outputs_info=None)
f = theano.function(inputs=[V], outputs=O_vals, updates=updates)
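
For the RandomStreams case that started this thread, the same pattern
applies. A minimal self-contained sketch (my illustration, not tested code
from this thread; I create the stream once outside the step and use a fixed
sample size to sidestep the symbolic-shape limitation discussed below):

import numpy as np
import theano
import theano.tensor as T
from theano.sandbox.rng_mrg import MRG_RandomStreams as RandomStreams

rng = RandomStreams(seed=42)  # create the stream once, outside the step
x = T.ivector()

def step(i):
    # each draw creates random-state updates, which scan collects
    return rng.binomial(size=(4,))

result, updates = theano.scan(fn=step, sequences=[x], outputs_info=None)
f = theano.function([x], result, updates=updates)  # do not drop the updates

print(f(np.array([1, 2, 3], dtype='int32')))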

On Tue, Oct 11, 2016, Arzoo wrote:
> Hi Pascal,
> 
> I am trying to use nested scans in Theano and I am getting a
> MissingInputError.
> It looks like in the document
> http://christianherta.de/lehre/dataScience/machineLearning/neuralNetworks/recurrentNeuralNetworks.php
> they ignore the updates from the inner scan as well as the outer scan.
> But if I do something similar, it returns the MissingInputError again.
>
> I was wondering if you could elaborate on your solution above.
> Maybe with an example from the documentation:
> 
> def loop_over_examples(x):
>     # hidden and outputs of the entire sequence
>     [h_vals, o_vals], _ = theano.scan(
>         fn=one_step,
>         sequences=dict(input=x, taps=[0]),
>         outputs_info=[h0, None],  # corresponds to return type of one_step
>         non_sequences=[W_ih, W_hh, b_h, W_ho, b_o])
>     return o_vals  # return y_vals
>
>
> O_vals, _ = theano.scan(
>     fn=loop_over_examples,
>     sequences=dict(input=V, taps=[0]),
>     outputs_info=None)
> f = theano.function(inputs=[V], outputs=O_vals)
> 
> 
> How are the updates passed from the inner scan to the outer scan?
> 
> Thanks
> Arzoo
> 
> 
> 
> On Thursday, September 29, 2016 at 10:16:38 PM UTC-4, Pascal Lamblin wrote:
> >
> > On Fri, Sep 30, 2016, 杨培 wrote: 
> > > Thanks. As you answered, the problem was that I ignored the updates
> > > dictionary returned by theano.scan.
> > > Here I have a new question.
> > > In my program, this code is inside another scan loop, and I have
> > > already used the updates dictionary for the outer scan loop.
> > > So how can I pass the updates dictionary of the RandomStream to the
> > > theano.function method?
> >
> > In the step function of the outer loop, return the updates dictionary 
> > coming from the inner scan. 
> > Then, pass the updates dictionary coming from the outer scan to 
> > theano.function. 
> >
> > > 
> > > 2016-09-30 5:57 GMT+08:00 Pascal Lamblin:
> > > 
> > > > Also, never ignore the updates returned by theano.scan! 
> > > > In other words, always do: 
> > > > 
> > > > result, updates = theano.scan(...) 
> > > > f = theano.function(..., updates=updates) 
> > > > 
> > > > and never: 
> > > > 
> > > > result, _ = theano.scan(...) 
> > > > 
> > > > On Thu, Sep 29, 2016, 杨培 wrote: 
> > > > > When I use a Theano loop with RandomStreams to generate random
> > > > > numbers, the Theano compilation fails with a MissingInputError.
> > > > >
> > > > > I Googled this problem, and I found:
> > > > >
> > > > >    - an issue (https://github.com/Theano/Theano/issues/3437); this
> > > > >    issue says we cannot use RandomStreams with a symbolic shape in
> > > > >    scan.
> > > > >
> > > > > But I also found:
> > > > >
> > > > >    - documentation in Theano
> > > > >    (http://deeplearning.net/software/theano/library/scan.html#using-shared-variables-gibbs-sampling);
> > > > >    the code in the documentation uses RandomStreams with a symbolic
> > > > >    shape in scan.
> > > > >
> > > > > So, how can I use theano.scan with RandomStreams? Thanks for your
> > > > > help.
> > > > >
> > > > > Here is my code. This code fails to compile:
> > > > >
> > > > > import theano
> > > > > from theano import tensor as T
> > > > > import numpy as np
> > > > >
> > > > > from theano.sandbox.rng_mrg import MRG_RandomStreams as RandomStreams
> > > > >
> > > > > x = T.ivector()
> > > > >
> > > > > def step(i):
> > > > >     sample = RandomStreams().binomial(size=(i,))
> > > > >     return sample
> > > > >
> > > > > result, _ = theano.scan(fn=step, outputs_info=None, sequences=[x])
> > > > >
> > > > > f = theano.function([x], result)
> > > > >
> > > > > x_val = np.array([1, 2, 3], dtype='int32')
> > > > > print f(x_val)
> > > > >
> > > > > This code works fine:
> > > > >
> > > > > import theano
> > > > > from theano import tensor as T
> > > > > import numpy as np
> > > > >
> > > > > from

Re: [theano-users] Testing 'borrow=True' with cnmem=0 and cnmem=0.3 to assess performance times

2016-10-12 Thread Pascal Lamblin
Hi,

My guess is that:

- without cnmem, allocation and deallocation of intermediate results
force synchronization of the GPU more often, so the overall time is
longer;

- with cnmem and borrow=False, there is no synchronization at all, and
what is measured is just the time to launch the GPU kernels, not the
time to actually execute them;

- with cnmem and borrow=True, there seems to be one synchronization
forced after each function call; I'm not sure why.
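
So a launch-only timing can look deceptively fast. A hedged sketch of a more
robust measurement (my addition, not from this thread; `f` stands for the
compiled function from the tutorial's borrow_test.py): copying the result
back to the host blocks until the GPU has actually finished, so the measured
time includes kernel execution regardless of the cnmem or borrow settings.

import time
import numpy as np

t0 = time.time()
for _ in range(1000):
    out = f()       # may return as soon as the kernels are launched
np.asarray(out)     # host transfer forces a GPU synchronization
t1 = time.time()
print('%f seconds including kernel execution' % (t1 - t0))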

On Sun, Oct 09, 2016, Chris Hanning wrote:
> Testing the following code from:
> 
> http://deeplearning.net/software/theano/tutorial/aliasing.html#borrowfunction
> 
> copy: https://paste.pound-python.org/show/vGCQlEMIoOPWZuUPo2DJ/
> 
> I found that running it on an iMac, i5, GeForce GT 640M gave significant 
> gains when enabling lib.cnmem.
> 
> With cnmem disabled:
> 
> $ THEANO_FLAGS='device=gpu0,lib.cnmem=0' python borrow_test.py
> 
> Looping 1000 times took 0.49251699447631836 seconds without borrow and 
> 0.34339094161987305 seconds with borrow
> 
> With cnmem enabled:
> 
> $ THEANO_FLAGS='device=gpu0,lib.cnmem=0.3' python borrow_test.py
> 
> Looping 1000 times took 0.019893884658813477 seconds without borrow and 
> 0.3345789909362793 seconds with borrow
> 
> On this system, any value for cnmem over 0.4 would crash the program due to 
> memory constraints.
> There is no significant difference in performance between 0.1 and 0.4.
> 


-- 
Pascal



Re: [theano-users] Weights are not updated on each iteration

2016-10-12 Thread Pascal Lamblin
The sigmoid activation function tends to saturate and block gradient
propagation, so the gradient wrt W1 is probably really close to zero in
your case.

Potential solutions include using another activation function (ReLU, for
instance, or tanh), initializing W1 with smaller weights, or making sure
your inputs are normalized (scaled between 0 and 1, or between -1 and 1).
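
A minimal sketch of those suggestions (my illustration, not Pascal's code;
the layer sizes are hypothetical, and it swaps sigmoid for tanh and scales
the initial weights down by 0.01):

import numpy as np
import theano
import theano.tensor as T
from theano import shared, function

rng = np.random.RandomState(0)
n_in, n_hidden, n_out = 20, 50, 10  # hypothetical sizes

x = T.matrix('x')
y = T.matrix('y')

# smaller initial weights keep the hidden units out of the saturated regime
W1 = shared(np.asarray(0.01 * rng.randn(n_in, n_hidden),
                       dtype=theano.config.floatX), name='W1')
W2 = shared(np.asarray(0.01 * rng.randn(n_hidden, n_out),
                       dtype=theano.config.floatX), name='W2')

hidden_activations = T.tanh(T.dot(x, W1))  # tanh instead of sigmoid
prob_y_given_x = T.nnet.softmax(T.dot(hidden_activations, W2))
cost = T.mean(T.nnet.categorical_crossentropy(prob_y_given_x, y))

params = [W1, W2]
grads = T.grad(cost, params)
lr = 0.01
updates = [(p, p - lr * g) for p, g in zip(params, grads)]
train = function([x, y], cost, updates=updates)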

On Wed, Oct 12, 2016, Kv Manohar wrote:
> Initial Variables
> x = T.dmatrix('x')
> y = T.dmatrix('y')
> 
> These are the weights of a neural network
> W1_vals = np.asarray(rng.randn(input, hidden), dtype=theano.config.floatX)
> W1 = shared(value=W1_vals, name='W1')
> W2_vals = np.asarray(rng.randn(hidden, output), dtype=theano.config.floatX)
> W2 = shared(value=W2_vals, name='W2')
> 
> Cost function is:
> hidden_activations = T.nnet.sigmoid(T.dot(x, W1))
> prob_y_given_x = T.nnet.softmax(T.dot(hidden_activations, W2))
> 
> # y is one-hot vectors
> cost = T.mean(T.nnet.categorical_crossentropy(prob_y_given_x, y))
> params = [W1, W2]
> 
> Corresponding gradients are computed as
> grads = T.grad(cost, params)
> 
> The update rule is
> lr = 0.01
> updates = [(param, param - lr * grad) for param, grad in zip(params, grads)]
> 
> Function to train the model
> train = function(inputs=[x, y], outputs=cost, updates=updates)
> 
> The problem I'm facing
> I'm updating the weights after one full sweep of the training data (50
> examples). When I print out the values of W1 and W2 after each iteration
> (using W1.get_value() etc.), W2 seems to get updated but not W1; the
> values of W1 are constant throughout.
> Where is the mistake in my code? I'm unable to figure it out.
> Thanks!


-- 
Pascal



Re: [theano-users] Can't get bilinear_upsampling work

2016-10-12 Thread 狄凯
Hi Fred, thanks for the info, will try that.

On Tuesday, 11 October 2016 20:51:49 UTC+8, nouiz wrote:
>
> Update to the Theano dev version. There have been updates to it since the
> last release that could help you.
>
> Fred
>
> On 11 Oct 2016 01:31, "狄凯" wrote:
>
>> Hi guys, I'm trying to use the Theano bilinear_upsampling function to
>> write a custom layer for Keras, but I have failed to build even a simple
>> function with it. Below is an example showing the failure:
>>
>> import theano.tensor as T
>> from theano import function
>> from theano.tensor.nnet.abstract_conv import bilinear_upsampling
>> x = T.tensor4('x')
>> y = bilinear_upsampling(x, 2)
>>
>> I get the following error:
>>
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>>   File "/home/dikai/bin/anaconda2/lib/python2.7/site-packages/theano/tensor/nnet/abstract_conv.py",
>>     line 569, in bilinear_upsampling
>>     row * ratio, col * ratio))
>>   File "/home/dikai/bin/anaconda2/lib/python2.7/site-packages/theano/tensor/var.py",
>>     line 327, in reshape
>>     return theano.tensor.basic.reshape(self, shape, ndim=ndim)
>>   File "/home/dikai/bin/anaconda2/lib/python2.7/site-packages/theano/tensor/basic.py",
>>     line 4526, in reshape
>>     newshape = as_tensor_variable(newshape)
>>   File "/home/dikai/bin/anaconda2/lib/python2.7/site-packages/theano/tensor/basic.py",
>>     line 208, in as_tensor_variable
>>     raise AsTensorError("Cannot convert %s to TensorType" % str_x, type(x))
>> theano.tensor.var.AsTensorError: ('Cannot convert (None, None,
>> Elemwise{mul,no_inplace}.0, Elemwise{mul,no_inplace}.0) to TensorType',
>> )
>>
>> I thought the problem was that I didn't specify batch_size and
>> num_input_channels for the bilinear_upsampling function, so I tested the
>> following code:
>>
>> import theano.tensor as T
>> from theano import function
>> from theano.tensor.nnet.abstract_conv import bilinear_upsampling
>> x = T.tensor4('x')
>> y = bilinear_upsampling(x, 2, batch_size=x.shape[0],
>>                         num_input_channels=x.shape[1])
>>
>> I got a different error:
>>
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>>   File "/home/dikai/bin/anaconda2/lib/python2.7/site-packages/theano/tensor/nnet/abstract_conv.py",
>>     line 542, in bilinear_upsampling
>>     filter_flip=True)
>>   File "/home/dikai/bin/anaconda2/lib/python2.7/site-packages/theano/tensor/nnet/abstract_conv.py",
>>     line 241, in conv2d_grad_wrt_inputs
>>     integer_types, type(None)))
>> AssertionError
>>
>> I also checked the source code that raises this error:
>>
>> # checking the type of input_shape
>> for dim in [0, 1]:
>>     assert isinstance(input_shape[dim], (theano.tensor.TensorConstant,
>>                                          integer_types, type(None)))
>>
>> I don't know if this is a bug or not. Can anyone provide a working
>> example that upsamples a 4D tensor using this function (assuming the
>> actual shape of x is known only at runtime)?
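
The assertion quoted above means batch_size and num_input_channels must be
Python ints (or None), not symbolic shapes like x.shape[0]. A hedged sketch
with known constants (my addition, with made-up sizes; on the release the
poster is running, even this may still need the dev version Fred mentions):

import numpy as np
import theano
import theano.tensor as T
from theano import function
from theano.tensor.nnet.abstract_conv import bilinear_upsampling

x = T.tensor4('x')
# plain ints satisfy the isinstance check quoted above
y = bilinear_upsampling(x, ratio=2, batch_size=1, num_input_channels=3)
f = function([x], y)

inp = np.ones((1, 3, 4, 4), dtype=theano.config.floatX)
print(f(inp).shape)  # expected: (1, 3, 8, 8)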
>>
>
