[theano-users] MILA and the future of Theano

2017-10-04 Thread Michaeel Kazi
Echoing the thanks of many here. I've long considered Theano the best math 
expression package, since it made it very intuitive to implement many 
techniques beyond the standard deep learning structures.

It looks like TensorFlow has caught up, though I admit it was strange to see it 
declared 'the winner' over a year ago, before it even supported a 
non-bucketed approach to recurrence...



[theano-users] help to write theano op

2017-10-04 Thread Matthias Zoehrer
Hi,

We are currently working on a project porting complex gradients to Theano 
using the Wirtinger calculus. So far we have written a couple of ops 
performing complex math operations and gradient calculations with 
real-valued cost functions. All gradients of these functions are 
numerically tested.
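As an illustration of what "numerically tested" means here (this is only a 
rough finite-difference sketch with made-up names, not our actual test code): 
for a real-valued cost of complex parameters, the symbolic gradient is compared 
against central differences taken separately along the real and imaginary axes.

#---
import numpy as np

def numeric_grad(cost, w, eps=1e-5):
    """Central-difference gradient of a real-valued cost w.r.t. a complex array w."""
    g = np.zeros_like(w)
    it = np.nditer(w, flags=['multi_index'])
    while not it.finished:
        idx = it.multi_index
        for delta in (eps, 1j * eps):
            # perturb one entry along the real axis, then along the imaginary axis
            w_p, w_m = w.copy(), w.copy()
            w_p[idx] += delta
            w_m[idx] -= delta
            d = (cost(w_p) - cost(w_m)) / (2.0 * eps)
            g[idx] += d if delta == eps else 1j * d
        it.iternext()
    return g

# toy check: cost(w) = sum(|w|^2); the combined real/imaginary
# finite-difference gradient should equal 2 * w
w = (np.random.randn(2, 2) + 1j * np.random.randn(2, 2)).astype('complex128')
assert np.allclose(numeric_grad(lambda v: np.sum(np.abs(v) ** 2), w), 2 * w)
#---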

At the moment we are interested in making those ops faster, as the forward 
path currently uses only Python. We want to use GPU calculations in forward 
mode as well. Importantly, we will write CUDA code only for selected ops, 
as most operations can be run using Theano's base math functionality (T.xx()).

The problem arises when I want to use those base GPU ops. Using them inside 
perform() does not solve the problem, since there we are working with numpy 
arrays. I was not able to figure out how to use make_thunk in combination 
with T.xx(). My current workaround is to compute the T.xx() expression in 
make_node and pass it to Apply(). The Theano op is then calculated on the 
GPU (if supported) and its result is passed to perform(). I really don't 
like this solution, so my question is: is there a better way to use Theano 
math ops (T.xx()) in the forward pass when writing a theano.Op? As far as I 
can tell, the existing code examples only cover hand-written CUDA code.
#---

import theano
import theano.tensor as T

import numpy as np


class cDot(theano.Op):
    __props__ = ()

    def make_node(self, x, y):
        x = theano.tensor.as_tensor_variable(x)
        y = theano.tensor.as_tensor_variable(y)
        # workaround: build the symbolic dot here and pass it in as an extra
        # input, so it is computed by Theano (on the GPU if supported)
        z = T.dot(x, y)
        return theano.Apply(self, [x, y, z], [x.type()])

    def perform(self, node, inputs, output_storage):
        # z already holds the numeric result of T.dot(x, y); just store it
        x, y, z = inputs
        output_storage[0][0] = z

        '''
        #   standard np calculation
        x, y, _ = inputs
        z = output_storage[0]
        z[0] = np.dot(x, y)
        '''

    def grad(self, inputs, output_grads):
        x, y, _ = inputs
        g = output_grads[0]

        # Wirtinger-calculus gradients of the complex dot product
        # (see eqs. 4.42 and 4.43)
        x_out = T.dot(g, T.conj(y.T))
        y_out = T.dot(T.conj(x.T), g)
        return [T.cast(x_out, 'complex64'), T.cast(y_out, 'complex64'),
                T.zeros_like(x)]


cdot = cDot()
#---
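
For reference, a minimal usage sketch of the op above (the shapes, dtype and 
tolerance are only illustrative; whether the intermediate dot actually runs on 
the GPU depends on Theano's complex support):

#---
import numpy as np
import theano
import theano.tensor as T

# build a small graph with the custom op and compare against numpy's dot
x = T.cmatrix('x')   # complex64 matrices
y = T.cmatrix('y')
f = theano.function([x, y], cdot(x, y))

rng = np.random.RandomState(0)
a = (rng.randn(3, 4) + 1j * rng.randn(3, 4)).astype('complex64')
b = (rng.randn(4, 2) + 1j * rng.randn(4, 2)).astype('complex64')

assert np.allclose(f(a, b), np.dot(a, b), atol=1e-4)
#---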


Thanks in advance

Matthias

P.S.: we will share the complex op code after submitting the paper. Maybe we 
could merge it into the master branch, but a lot of work regarding interfaces 
etc. has to be done first. 



Re: [theano-users] How to change the learning rate dynamically

2017-10-04 Thread Beatriz G.
I have understood it, and it works for me. I have modified the code so that 
the learning rate is changed after training each epoch and after validating; 
a sketch is below.

The cost became "nan" because the learning rate value was too high; I am not 
entirely sure why.
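
Roughly like this (only a sketch; n_epochs, train_model, validate_model and 
the batch counts are the variables from my code quoted below, and the 
updates.append((l_r, 0.95*l_r)) line is removed so the decay happens once per 
epoch instead of once per minibatch):

import numpy as np
import theano

for epoch in range(n_epochs):
    for minibatch_index in range(n_train_batches):
        cost_ij = train_model(minibatch_index)

    validation_losses = [validate_model(i) for i in range(n_valid_batches)]
    this_validation_loss = np.mean(validation_losses)

    # decay the learning rate once per epoch, after validation
    l_r.set_value(np.asarray(0.95 * l_r.get_value(),
                             dtype=theano.config.floatX))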

On Tuesday, October 3, 2017 at 23:36:00 (UTC+2), Michael Klachko wrote:
>
> The learning rate schedule is defined in this line: 
> updates.append((l_r,0.95*l_r)), and is used in this line: train_model = 
> theano.function([index], cost, updates=updates ...
> If you don't understand what's going on, read about theano functions.
>
>
>
>
> On Thursday, September 7, 2017 at 6:58:34 AM UTC-7, Beatriz G. wrote:
>>
>> Hi. 
>>
>> I have tried to apply your code, but it does not work for me; I get "nan" 
>> as the training cost.
>>
>> l_r = theano.shared(np.array(learning_rate, dtype=theano.config.floatX))
>>
>> cost = layer8.negative_log_likelihood(y)
>>
>> validate_model = theano.function(
>>     [index],
>>     layer8.errors(y),
>>     givens={
>>         x: valid_set_x[index * batch_size: (index + 1) * batch_size],
>>         y: valid_set_y[index * batch_size: (index + 1) * batch_size],
>>         is_train: numpy.cast['int32'](0)
>>     }
>> )
>>
>> # create a list of all model parameters to be fit by gradient descent
>> params = (layer0.params + layer1.params + layer2.params + layer3.params +
>>           layer4.params + layer5.params + layer6.params + layer7.params +
>>           layer8.params)
>>
>> # create a list of gradients for all model parameters
>> grads = T.grad(cost, params)
>>
>> ## Learning rate update
>> updates = [
>>     (param_i, param_i - l_r * grad_i)
>>     for param_i, grad_i in zip(params, grads)
>> ]
>> updates.append((l_r, 0.95 * l_r))
>>
>> train_model = theano.function(
>>     [index], cost, updates=updates,
>>     givens={
>>         x: train_set_x[index * batch_size: (index + 1) * batch_size],
>>         y: train_set_y[index * batch_size: (index + 1) * batch_size],
>>         is_train: np.cast['int32'](1)})
>>
>> In the while loop:
>> cost_ij = train_model(minibatch_index)
>>
>> When it is time to validate:
>> validation_losses = [validate_model(i) for i in range(n_valid_batches)]
>>
>>
>> The learning rate updates its value, but the validation error stays at 90% 
>> and the training cost is "nan". 
>>
>> Thank you in advance.
>>
>> Regards.
>>
>>
>> On Monday, October 6, 2014 at 18:40:20 (UTC+2), Pascal Lamblin wrote:
>>>
>>> On Mon, Oct 06, 2014, Frédéric Bastien wrote: 
>>> > Exactly. 
>>>
>>> Also, you can make l_r a shared variable, and update it via a Theano 
>>> expression, if it is convenient to do so. For instance: 
>>>
>>> l_r = theano.shared(np.array(0.1, dtype=theano.config.floatX)) 
>>> ... 
>>> # Or however you want to change your learning rate 
>>> updates.append((l_r, 0.9 * l_r)) 
>>>
>>> train_model = theano.function([index], cost, updates=updates, givens=...) 
>>>
>>> > 
>>> > Fred 
>>> > 
>>> > On Mon, Oct 6, 2014 at 11:14 AM, Ofir Levy  wrote: 
>>> > 
>>> > > ok I think I got it 
>>> > > 
>>> > > learning_rate = 0.1 
>>> > > 
>>> > > l_r = T.scalar('l_r', dtype=theano.config.floatX) 
>>> > > 
>>> > > updates = []
>>> > > for param_i, grad_i in zip(params, grads): 
>>> > >     updates.append((param_i, param_i - l_r * grad_i)) 
>>> > > 
>>> > > train_model = theano.function([index, l_r], cost, updates=updates, 
>>> > >     givens={ 
>>> > >         x: train_set_x[index * batch_size: (index + 1) * batch_size], 
>>> > >         y: train_set_y[index * batch_size: (index + 1) * batch_size]}) 
>>> > > 
>>> > > and in the training loop: 
>>> > > 
>>> > > cost_ij = train_model(minibatch_index, learning_rate) 
>>> > > 
>>> > > 
>>> > > 
>>> > > On Monday, October 6, 2014 5:38:33 PM UTC+3, Ofir Levy wrote: 
>>> > >> 
>>> > >> for the CNN example we currently have the following code: 
>>> > >> 
>>> > >> learning_rate = 0.1 
>>> > >> 
>>> > >> updates = []
>>> > >> for param_i, grad_i in zip(params, grads): 
>>> > >>     updates.append((param_i, param_i - learning_rate * grad_i))
>>> > >> 
>>> > >> train_model = theano.function([index], cost, updates=updates, 
>>> > >>     givens={ 
>>> > >>         x: train_set_x[index * batch_size: (index + 1) * batch_size], 
>>> > >>         y: train_set_y[index * batch_size: (index + 1) * batch_size]}) 
>>> > >> 
>>> > >> and in the training loop: 
>>> > >> 
>>> > >> cost_ij = train_model(minibatch_index) 
>>> > >> 
>>> > >> 
>>> > >> can you kindly tell me how to change it to have a adaptive learning 
>>> rate? 
>>> > >> 
>>> > >> 
>>> > >> 
>>> > >> 
>>> > >> 
>>> > >> 
>>> > >> On Thursday, July 17, 2014 9:48:24 PM UTC+3, Frédéric Bastien 
>>> wrote: 
>>> > >>> 
>>> > >>> Make a theano variable that is the