sammieghabra opened a new issue #18665:
URL: https://github.com/apache/incubator-mxnet/issues/18665


   ## Description
   Hi MXNet team,
   
   My team wants to implement a ShiftScale operator in MXNet. This layer is similar to BatchNorm; however, we want `moving_mean` and `moving_var` to be used instead of `data_mean` and `data_var` to compute the output of the layer. I see that BatchNorm has a `use_global_stats` flag, and from the [MXNet docs](http://beta.mxnet.io/r/api/mx.symbol.BatchNorm.html) it seems that setting this flag to true would do something similar to what I'm trying to achieve. However, upon inspecting the [batch-norm code](https://github.com/apache/incubator-mxnet/blob/master/src/operator/nn/batch_norm.cc#L260-L271), it seems that `running_mean` and `running_var` won't be updated during training if that flag is set to true.
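
   For reference, here is a minimal sketch (Gluon API, MXNet 1.x assumed) of the behaviour just described: with `use_global_stats=True` the layer normalizes with the stored running statistics even inside `autograd.record()`, and the running statistics themselves are never updated.

   ```python
   import mxnet as mx
   from mxnet.gluon import nn

   bn = nn.BatchNorm(use_global_stats=True)
   bn.initialize()

   x = mx.nd.random.normal(shape=(4, 8))
   with mx.autograd.record():  # training mode
       y = bn(x)  # normalized with running_mean/running_var, not batch stats

   # The running statistics keep their initial values (zeros/ones), because
   # use_global_stats=True also skips the moving-average update.
   print(bn.running_mean.data())
   ```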
   
   1. Is there a design reason why setting the `use_global_stats` flag to true won't update the running mean and running var?
   2. We would like to support this ShiftScale layer during training. My proposal is to add another flag, `use_shift_scale`, to the `BatchNorm` operator, which would simply replace the batch mean and var with the running mean and running var when updating the weights (see the sketch after this list). Is this something the MXNet team would be OK with?
   3. We also plan to train with more than one instance: will the `running_mean` and `running_var` parameters be the same across instances?
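
   To make point 2 concrete, below is a hedged sketch of the semantics I have in mind, written as a standalone Gluon block (MXNet 1.x assumed) rather than as the actual change to the BatchNorm kernels; the `ShiftScale` name and the update rule are my proposal, not an existing MXNet API. The idea is to normalize with the running statistics, but, unlike `use_global_stats=True`, keep updating them from the batch statistics during training.

   ```python
   import mxnet as mx
   from mxnet import autograd
   from mxnet.gluon import nn

   class ShiftScale(nn.Block):
       """Proposed semantics: normalize with the running statistics, but
       still update them from batch statistics while training. A 2-D
       (N, C) input is assumed to keep the sketch short."""

       def __init__(self, channels, momentum=0.9, epsilon=1e-5, **kwargs):
           super(ShiftScale, self).__init__(**kwargs)
           self.momentum, self.epsilon = momentum, epsilon
           with self.name_scope():
               self.gamma = self.params.get('gamma', shape=(channels,), init='ones')
               self.beta = self.params.get('beta', shape=(channels,), init='zeros')
               self.running_mean = self.params.get(
                   'running_mean', shape=(channels,), init='zeros', grad_req='null')
               self.running_var = self.params.get(
                   'running_var', shape=(channels,), init='ones', grad_req='null')

       def forward(self, x):
           rm, rv = self.running_mean.data(), self.running_var.data()
           if autograd.is_training():
               # Unlike use_global_stats=True, keep the moving averages fresh.
               with autograd.pause():
                   batch_mean = x.mean(axis=0)
                   batch_var = ((x - batch_mean) ** 2).mean(axis=0)
                   rm[:] = self.momentum * rm + (1 - self.momentum) * batch_mean
                   rv[:] = self.momentum * rv + (1 - self.momentum) * batch_var
           # Normalize with the *running* statistics in both train and test;
           # this is the one difference from standard BatchNorm training.
           return (x - rm) / (rv + self.epsilon).sqrt() * self.gamma.data() \
               + self.beta.data()
   ```

   This is only meant to pin down what `use_shift_scale` would compute; the real change would live in the batch-norm code linked above.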
   
   Thanks
   Sammie

