I figured it out. I was not using the 'axes' parameter, so I was computing 
"per-activation" mean and variance, which is simply wrong for convolutions. 
After setting axes='spatial' for the convolutional layers, I am getting good 
results now. 

Perhaps Theano should make 'spatial' the default for 4D tensor inputs and 
'per-activation' the default for 2D tensors. 


On Saturday, February 25, 2017 at 6:47:03 PM UTC-7, Ragav Venkatesan wrote:
>
> There still seems to be a bug. It works reasonably for dot-product layers, 
> but it fails for conv layers: all I get is NaNs. 
>
> On Monday, February 20, 2017 at 9:55:27 AM UTC-7, nouiz wrote:
>>
>> BN had a bug that we fixed Friday. Can you update Theano and try again? 
>> Maybe it is already fixed.
>>
>> Fred
>>
>> On Mon, Feb 20, 2017 at 12:28 AM Ragav Venkatesan <[email protected]> 
>> wrote:
>>
>>> My previous comments had some bugs. Here is how I use it. 
>>>
>>>             self.gamma = theano.shared(
>>>                 value=numpy.ones((1, channels, width, height),
>>>                                  dtype=theano.config.floatX),
>>>                 name='gamma', borrow=borrow)
>>>             self.beta = theano.shared(
>>>                 value=numpy.ones((1, channels, width, height),
>>>                                  dtype=theano.config.floatX),
>>>                 name='beta', borrow=borrow)
>>>             self.mean = theano.shared(
>>>                 value=numpy.ones((1, channels, width, height),
>>>                                  dtype=theano.config.floatX),
>>>                 name='population_mean', borrow=borrow)
>>>             self.var = theano.shared(
>>>                 value=numpy.ones((1, channels, width, height),
>>>                                  dtype=theano.config.floatX),
>>>                 name='population_var', borrow=borrow)
>>>
>>>             batch_norm_out, _, _, self.mean, self.var = batch_normalization_train(
>>>                 inputs=pool_out + self.b.dimshuffle('x', 0, 'x', 'x'),
>>>                 gamma=self.gamma,
>>>                 beta=self.beta,
>>>                 running_mean=self.mean,
>>>                 running_var=self.var)
>>>
>>>             batch_norm_inference = batch_normalization_test(
>>>                 inputs=pool_out + self.b.dimshuffle('x', 0, 'x', 'x'),
>>>                 gamma=self.gamma,
>>>                 beta=self.beta,
>>>                 mean=self.mean,
>>>                 var=self.var)
>>>
>>>       I use batch_norm_out while training and batch_norm_inference while 
>>> testing.
>>>
>>> The question I still have, though, is about the running mean and variance 
>>> returned by the train method. Is it all right to overwrite them the way I 
>>> have done? If not, should I create an automatic update for the mean and 
>>> variance, such as 
>>>
>>> updates[self.mean] = (running mean returned by the train method)?
>>>
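>>> That is, instead of assigning the returned tensors to self.mean and 
>>> self.var, keep them as separate outputs and write them back through the 
>>> updates of theano.function, something along these lines (x and cost here 
>>> just stand in for my input variable and loss): 
>>>
>>>             batch_norm_out, _, _, new_mean, new_var = batch_normalization_train(
>>>                 inputs=pool_out + self.b.dimshuffle('x', 0, 'x', 'x'),
>>>                 gamma=self.gamma,
>>>                 beta=self.beta,
>>>                 running_mean=self.mean,
>>>                 running_var=self.var)
>>>
>>>             # push the new running statistics back into the shared variables
>>>             updates = [(self.mean, new_mean), (self.var, new_var)]
>>>             train_fn = theano.function([x], cost, updates=updates)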
>>>
>>> On Sunday, February 19, 2017 at 9:19:57 PM UTC-7, Ragav Venkatesan wrote:
>>>>
>>>> I also have a question on this. This is how I am using it at the moment for 
>>>> my convolutional layer (pool_out holds the pre-activation output of conv + pool): 
>>>>
>>>>             self.mean = theano.shared(
>>>>                 value=numpy.zeros((channels,), dtype=theano.config.floatX),
>>>>                 name='population_mean', borrow=borrow)
>>>>             self.var = theano.shared(
>>>>                 value=numpy.zeros((nkerns,), dtype=theano.config.floatX),
>>>>                 name='population_var', borrow=borrow)
>>>>
>>>>             batch_norm_out, _, _, self.mean, self.var = batch_normalization_train(
>>>>                 inputs=pool_out + self.b.dimshuffle('x', 0, 'x', 'x'),
>>>>                 gamma=self.gamma,
>>>>                 beta=self.beta,
>>>>                 running_mean=self.mean,
>>>>                 running_var=self.var)
>>>>
>>>> And at inference time, I use the following: 
>>>>
>>>>             batch_norm_inference = batch_normalization_test(
>>>>                 inputs=pool_out + self.b.dimshuffle('x', 0, 'x', 'x'),
>>>>                 gamma=self.gamma,
>>>>                 beta=self.beta,
>>>>                 mean=self.mean,
>>>>                 var=self.var)
>>>>
>>>> The question I have, though, is about the running mean and variance returned 
>>>> by the train method. Is it all right to overwrite them the way I have done? 
>>>> If not, should I create an automatic update for the mean and variance, such as 
>>>>
>>>> updates[self.mean] = (running mean returned by the train method)?
>>>>
>>>>
>>>> On Thursday, February 16, 2017 at 8:17:24 PM UTC-7, David Leon wrote:
>>>>>
>>>>> I'm using nnet.bn.batch_normalization_train() and 
>>>>> nnet.bn.batch_normalization_test() for batch normalization; however, during 
>>>>> the test phase, nnet.bn.batch_normalization_test() produces wrong results. 
>>>>> For the time being, I just use nnet.bn.batch_normalization_train() with 
>>>>> running_average_factor set to zero for the test phase, as follows:
>>>>>
>>>>> if deterministic is False:  # train phase
>>>>>     normalized, input_mean, input_inv_std, self.mean, self.var = \
>>>>>         T.nnet.bn.batch_normalization_train(input, self.gamma, self.beta,
>>>>>                                             self.axes, self.epsilon,
>>>>>                                             self.alpha, self.mean, self.var)
>>>>> else:  # test phase
>>>>>     # normalized = T.nnet.bn.batch_normalization_test(input, self.gamma,
>>>>>     #     self.beta, self.mean, self.var, self.axes, self.epsilon)
>>>>>     normalized, _, _, _, _ = T.nnet.bn.batch_normalization_train(
>>>>>         input, self.gamma, self.beta, self.axes, self.epsilon, 0.0,
>>>>>         self.mean, self.var)
>>>>> return normalized
>>>>>
>>>>>
>>>>>
>>>>> My theano version is 
>>>>> '0.9.0beta1.dev-b2afa088d1cb416b4507348019af34adae908b73', CUDA 8.0 and 
>>>>> CuDNN 5.1
>>>>>
>>
