Is this a correct usage of Theano's dev-version batch normalization?
import numpy
import theano
from theano.tensor.nnet.bn import batch_normalization_train, batch_normalization_test

w = theano.shared(...)
b = theano.shared(...)

# Per-activation parameters, one per (channel, height, width) position.
gamma = theano.shared(
    value=numpy.ones((1, channels, height, width), dtype=theano.config.floatX),
    name='gamma', borrow=borrow)
beta = theano.shared(
    value=numpy.zeros((1, channels, height, width), dtype=theano.config.floatX),
    name='beta', borrow=borrow)
running_mean = theano.shared(
    value=numpy.zeros((1, channels, height, width), dtype=theano.config.floatX),
    name='population_mean', borrow=borrow)
running_var = theano.shared(
    value=numpy.ones((1, channels, height, width), dtype=theano.config.floatX),
    name='population_var', borrow=borrow)

# Perform conv and pool, but normalize before the activation .....
# batch_normalization_train returns the normalized output, the batch mean,
# the batch inverse std, and (because running_mean/running_var were passed)
# the updated running averages.
batch_norm_out, _, _, mean, var = batch_normalization_train(
    inputs=pool_out,
    gamma=gamma,
    beta=beta,
    running_mean=running_mean,
    running_var=running_var)

# The returned running averages are broadcastable along axis 0, while the
# shared variables are not, so unbroadcast before storing the updates.
mean = theano.tensor.unbroadcast(mean, 0)
var = theano.tensor.unbroadcast(var, 0)
updates[running_mean] = mean
updates[running_var] = var + 0.001  # to avoid the variance being zero

batch_norm_inference = batch_normalization_test(
    inputs=pool_out,
    gamma=gamma,
    beta=beta,
    mean=running_mean,
    var=running_var)
Of course, I am learning beta and gamma along with w and b, and I am
passing the updates to the training theano.function. This seems to work
fine with dot-product (fully connected) layers, but convolutional layers
always run into NaNs for me, so I was wondering whether I am using the
layer correctly. I use batch_norm_out for training and batch_norm_inference
for testing.
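For completeness, this is roughly how I wire everything into the training
function; x, y, cost, and learning_rate below are placeholders standing in
for my actual graph, and the SGD step is just for illustration:

import theano
import theano.tensor as T

# beta and gamma are trained exactly like w and b.
params = [w, b, gamma, beta]
grads = T.grad(cost, params)

# Plain SGD step for illustration; the updates dict already contains the
# running_mean / running_var updates from the snippet above.
for p, g in zip(params, grads):
    updates[p] = p - learning_rate * g

# Training uses batch_norm_out (inside cost); testing uses a separate
# function built on batch_norm_inference with no updates applied.
train_fn = theano.function([x, y], cost, updates=updates)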