Is this a correct usage of theano's dev version batchnorm? 


            from theano.tensor.nnet.bn import (batch_normalization_train,
                                               batch_normalization_test)

            w = theano.shared(...)
            b = theano.shared(...)
            gamma = theano.shared(
                value=numpy.ones((1, channels, height, width),
                                 dtype=theano.config.floatX),
                name='gamma', borrow=borrow)
            beta = theano.shared(
                value=numpy.zeros((1, channels, height, width),
                                  dtype=theano.config.floatX),
                name='beta', borrow=borrow)
            running_mean = theano.shared(
                value=numpy.zeros((1, channels, height, width),
                                  dtype=theano.config.floatX),
                name='population_mean', borrow=borrow)
            running_var = theano.shared(
                value=numpy.ones((1, channels, height, width),
                                 dtype=theano.config.floatX),
                name='population_var', borrow=borrow)
                                                        
            # Perform conv and pool; batch norm is applied before the activation.

            # batch_normalization_train returns the normalised output and the
            # batch mean/inv-std; since running_mean and running_var are passed,
            # it also returns the updated running averages, captured here as
            # `mean` and `var`.
            batch_norm_out, _, _, mean, var = batch_normalization_train(
                inputs=pool_out,
                gamma=gamma,
                beta=beta,
                running_mean=running_mean,
                running_var=running_var)

            mean = theano.tensor.unbroadcast(mean, 0)
            var = theano.tensor.unbroadcast(var, 0)
            updates[running_mean] = mean
            updates[running_var] = var + 0.001  # to avoid the variance being zero

            batch_norm_inference = batch_normalization_test(
                inputs=pool_out,
                gamma=gamma,
                beta=beta,
                mean=running_mean,
                var=running_var)
        

Of course I am learning beta and gamma along with w and b, and I am
passing the updates to the training theano function. This seems to work
fine with dot-product (fully connected) layers, but convolutional layers
always run into NaNs for me, so I was wondering whether I am using the
layer correctly. I use batch_norm_out for training and
batch_norm_inference for testing, roughly as in the sketch below.
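
For reference, this is roughly how I compile the two functions (a minimal
sketch: names such as x, y, lr, train_cost and test_cost are placeholders
for my actual code, where train_cost is built from batch_norm_out and
test_cost from batch_norm_inference, and the update rule shown is plain SGD):

            # gamma and beta are learned together with w and b
            params = [w, b, gamma, beta]
            grads = theano.tensor.grad(train_cost, params)

            # SGD updates for the parameters go into the same dictionary as
            # the running-average updates collected above
            for p, g in zip(params, grads):
                updates[p] = p - lr * g

            # training uses the graph built from batch_norm_out and applies
            # both the parameter and the running-average updates
            train_fn = theano.function([x, y], train_cost, updates=updates)

            # testing uses the graph built from batch_norm_inference; the
            # stored population statistics are read but never updated
            test_fn = theano.function([x, y], test_cost)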
