There still seems to be a bug. It works reasonably for dot-product layers, but it fails for conv layers: all I get is NaNs. (A sketch of the conv-layer setup follows below; the running mean/variance question from the thread is sketched again at the end.)
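For concreteness, here is roughly the conv-layer setup I am describing, as a minimal self-contained sketch rather than my exact code. The dummy channels/pool_out/b definitions, the axes='spatial' choice, and the (1, channels, 1, 1) parameter shapes with explicit broadcastable flags are my own assumptions from reading the docs:

    import numpy
    import theano
    import theano.tensor as T
    from theano.tensor.nnet.bn import (batch_normalization_train,
                                       batch_normalization_test)

    channels = 16                        # dummy value, just to make the sketch runnable
    pool_out = T.tensor4('pool_out')     # (batch, channels, rows, cols) conv+pool output
    b = theano.shared(numpy.zeros((channels,), dtype=theano.config.floatX), name='b')

    # Per-channel parameters, broadcastable over the batch and spatial axes.
    shape = (1, channels, 1, 1)
    bcast = (True, False, True, True)
    gamma = theano.shared(numpy.ones(shape, dtype=theano.config.floatX),
                          name='gamma', broadcastable=bcast)
    beta = theano.shared(numpy.zeros(shape, dtype=theano.config.floatX),
                         name='beta', broadcastable=bcast)
    running_mean = theano.shared(numpy.zeros(shape, dtype=theano.config.floatX),
                                 name='population_mean', broadcastable=bcast)
    running_var = theano.shared(numpy.ones(shape, dtype=theano.config.floatX),
                                name='population_var', broadcastable=bcast)

    pre_act = pool_out + b.dimshuffle('x', 0, 'x', 'x')

    # Training graph: normalizes with batch statistics and also returns the
    # updated running statistics as symbolic expressions.
    bn_train, _, _, new_mean, new_var = batch_normalization_train(
        inputs=pre_act, gamma=gamma, beta=beta, axes='spatial',
        running_mean=running_mean, running_var=running_var)

    # Inference graph: normalizes with the stored population statistics.
    bn_test = batch_normalization_test(
        inputs=pre_act, gamma=gamma, beta=beta,
        mean=running_mean, var=running_var, axes='spatial')

As I understand it, 'spatial' computes one mean/variance per channel over the batch and spatial axes (the usual choice for conv layers), whereas the default 'per-activation' with full (1, channels, width, height) parameters, as in my quoted code below, keeps separate statistics for every spatial position.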
On Monday, February 20, 2017 at 9:55:27 AM UTC-7, nouiz wrote:
> BN had a bug that we fixed Friday. Can you update Theano and try again?
> Maybe it is already fixed.
>
> Fred
>
> On Mon, Feb 20, 2017 at 12:28 AM Ragav Venkatesan <[email protected]> wrote:
>> My previous comments had some bugs. Here is how I use it:
>>
>>     self.gamma = theano.shared(value=numpy.ones((1, channels, width, height),
>>                                dtype=theano.config.floatX), name='gamma', borrow=borrow)
>>     self.beta = theano.shared(value=numpy.ones((1, channels, width, height),
>>                               dtype=theano.config.floatX), name='beta', borrow=borrow)
>>     self.mean = theano.shared(value=numpy.ones((1, channels, width, height),
>>                               dtype=theano.config.floatX), name='population_mean', borrow=borrow)
>>     self.var = theano.shared(value=numpy.ones((1, channels, width, height),
>>                              dtype=theano.config.floatX), name='population_var', borrow=borrow)
>>
>>     batch_norm_out, _, _, self.mean, self.var = batch_normalization_train(
>>         inputs=pool_out + self.b.dimshuffle('x', 0, 'x', 'x'),
>>         gamma=self.gamma,
>>         beta=self.beta,
>>         running_mean=self.mean,
>>         running_var=self.var)
>>
>>     batch_norm_inference = batch_normalization_test(
>>         inputs=pool_out + self.b.dimshuffle('x', 0, 'x', 'x'),
>>         gamma=self.gamma,
>>         beta=self.beta,
>>         mean=self.mean,
>>         var=self.var)
>>
>> I use batch_norm_out while training and batch_norm_inference while testing.
>>
>> The question I still have, though, is about the running mean and variance returned by the train method. Is it all right to overwrite them the way I have done? If not, should I create an automatic update for the mean and the variance, such as
>>
>>     updates[self.mean] = (running mean returned by the train method)
>>
>> On Sunday, February 19, 2017 at 9:19:57 PM UTC-7, Ragav Venkatesan wrote:
>>> I also have a question on this. This is how I am using it at the moment for my convolutional layer (conv + pool output in pool_out, pre-activation):
>>>
>>>     self.mean = theano.shared(value=numpy.zeros((channels,), dtype=theano.config.floatX),
>>>                               name='population_mean', borrow=borrow)
>>>     self.var = theano.shared(value=numpy.zeros((nkerns,), dtype=theano.config.floatX),
>>>                              name='population_var', borrow=borrow)
>>>
>>>     batch_norm_out, _, _, self.mean, self.var = batch_normalization_train(
>>>         inputs=pool_out + self.b.dimshuffle('x', 0, 'x', 'x'),
>>>         gamma=self.gamma,
>>>         beta=self.beta,
>>>         running_mean=self.mean,
>>>         running_var=self.var)
>>>
>>> And for inference time, I use the following:
>>>
>>>     batch_norm_inference = batch_normalization_test(
>>>         inputs=pool_out + self.b.dimshuffle('x', 0, 'x', 'x'),
>>>         gamma=self.gamma,
>>>         beta=self.beta,
>>>         mean=self.mean,
>>>         var=self.var)
>>>
>>> The question I have, though, is about the running mean and variance returned by the train method. Is it all right to overwrite them the way I have done? If not, should I create an automatic update for the mean and the variance, such as
>>>
>>>     updates[self.mean] = (running mean returned by the train method)
>>>
>>> On Thursday, February 16, 2017 at 8:17:24 PM UTC-7, David Leon wrote:
>>>> I'm using nnet.bn.batch_normalization_train() and nnet.bn.batch_normalization_test() for batch normalization; however, during the test phase, nnet.bn.batch_normalization_test() produces wrong results. For the time being, I just use nnet.bn.batch_normalization_train() with running_average_factor set to zero for the test phase, as:
>>>>
>>>>     if deterministic is False:  # train phase
>>>>         normalized, input_mean, input_inv_std, self.mean, self.var = \
>>>>             T.nnet.bn.batch_normalization_train(input, self.gamma, self.beta, self.axes,
>>>>                                                 self.epsilon, self.alpha, self.mean, self.var)
>>>>     else:  # test phase
>>>>         # normalized = T.nnet.bn.batch_normalization_test(input, self.gamma, self.beta,
>>>>         #                                                 self.mean, self.var, self.axes, self.epsilon)
>>>>         normalized, _, _, _, _ = \
>>>>             T.nnet.bn.batch_normalization_train(input, self.gamma, self.beta, self.axes,
>>>>                                                 self.epsilon, 0.0, self.mean, self.var)
>>>>     return normalized
>>>>
>>>> My Theano version is '0.9.0beta1.dev-b2afa088d1cb416b4507348019af34adae908b73', with CUDA 8.0 and cuDNN 5.1.
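Regarding the running mean/variance question that comes up twice in the quoted thread: as far as I understand Theano, rebinding self.mean/self.var to the symbolic values returned by batch_normalization_train only grows the graph; the numbers stored in the shared variables change only through updates (or set_value). What I had in mind is something like the following minimal sketch; x, n_units, and train_fn are made-up placeholders, not names from the thread:

    import numpy
    import theano
    import theano.tensor as T
    from theano.tensor.nnet.bn import batch_normalization_train

    floatX = theano.config.floatX
    n_units = 8                        # dummy size, just for the sketch
    x = T.matrix('x')                  # (batch, n_units) pre-activation

    shape = (1, n_units)
    bcast = (True, False)
    gamma = theano.shared(numpy.ones(shape, dtype=floatX), name='gamma', broadcastable=bcast)
    beta = theano.shared(numpy.zeros(shape, dtype=floatX), name='beta', broadcastable=bcast)
    running_mean = theano.shared(numpy.zeros(shape, dtype=floatX),
                                 name='population_mean', broadcastable=bcast)
    running_var = theano.shared(numpy.ones(shape, dtype=floatX),
                                name='population_var', broadcastable=bcast)

    # Keep the shared variables as the population statistics; do not rebind them.
    bn_out, _, _, new_mean, new_var = batch_normalization_train(
        x, gamma, beta, axes='per-activation',
        running_mean=running_mean, running_var=running_var)

    # Write the returned running statistics back into the shared variables,
    # once per call to the compiled training function.
    bn_updates = [(running_mean, new_mean), (running_var, new_var)]
    train_fn = theano.function([x], bn_out, updates=bn_updates)

The parameter updates for gamma/beta (and the rest of the model) would be appended to the same updates list, and batch_normalization_test then reads the shared variables directly, so nothing extra should be needed at inference time.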
