[GitHub] zhanghang1989 commented on a change in pull request #9931: Add axes support to Dropout for variational dropout in NLP

2018-03-05 Thread GitBox
zhanghang1989 commented on a change in pull request #9931: Add axes support to 
Dropout for variational dropout in NLP
URL: https://github.com/apache/incubator-mxnet/pull/9931#discussion_r172389676
 
 

 ##
 File path: src/operator/nn/dropout-inl.h
 ##
 @@ -228,11 +301,27 @@ class DropoutOp {
     if (!MKLForward(s, pgen, this->pkeep_, in_data, out_data)) {
       const TBlob &mask = out_data[dropout::kMask];
       CHECK(req[dropout::kOut] != kAddTo);
-      LaunchRNG<DropoutKernel, xpu>(s, pgen, out.Size(),
-                                    out.dptr<DType>(),
-                                    mask.dptr<DType>(),
-                                    in_data[dropout::kData].dptr<DType>(),
-                                    this->pkeep_);
+      // initialize the mask
+      LaunchRNG<BernoulliKernel, xpu>(s, pgen, out.Size(),
+                                      mask.dptr<DType>(),
+                                      this->pkeep_);
+      if (req[0] != kNullOp) {
+        // broadcast mul
+        TShape new_lshape, new_rshape, new_oshape;
+        int ndim = BinaryBroadcastShapeCompact(in_data[dropout::kData].shape_,
+                                               mask.shape_, out.shape_,
+                                               &new_lshape, &new_rshape, &new_oshape);
+        BROADCAST_NDIM_SWITCH(ndim, NDim, {
+          mshadow::Shape<NDim> oshape = new_oshape.get<NDim>();
+          mshadow::Shape<NDim> lstride = mxnet_op::calc_stride(new_lshape.get<NDim>());
+          mshadow::Shape<NDim> rstride = mxnet_op::calc_stride(new_rshape.get<NDim>());
+          mxnet_op::Kernel<mxnet_op::binary_broadcast_kernel<NDim, DType, mshadow_op::mul>, xpu>::
+              template LaunchEx(s, new_oshape.Size(), req[0], lstride, rstride, oshape,
+                                in_data[dropout::kData].dptr<DType>(),
+                                mask.dptr<DType>(), out.dptr<DType>());
+        });
+      }
     }
 
 Review comment:
  Thx @cjolivier01. I added the condition check here:
https://github.com/apache/incubator-mxnet/pull/9931/files#diff-4aea2cc24c0bb4e8e48face9faf4aa26R249
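
  For context, the linked check dispatches between the two code paths. A
  minimal sketch of that dispatch, assuming the fused DropoutKernel fast
  path is retained alongside the new BernoulliKernel path (illustrative,
  not the exact committed code):

    if (this->axes_.ndim() == 0) {
      // standard dropout: one fused kernel fills mask and output together
      LaunchRNG<DropoutKernel, xpu>(s, pgen, out.Size(),
                                    out.dptr<DType>(),
                                    mask.dptr<DType>(),
                                    in_data[dropout::kData].dptr<DType>(),
                                    this->pkeep_);
    } else {
      // variational dropout: draw a (smaller) mask sized to the mask blob,
      // then broadcast-multiply as in the hunk above
      LaunchRNG<BernoulliKernel, xpu>(s, pgen, mask.Size(),
                                      mask.dptr<DType>(),
                                      this->pkeep_);
    }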




[GitHub] zhanghang1989 commented on a change in pull request #9931: Add axes support to Dropout for variational dropout in NLP

2018-03-05 Thread GitBox
zhanghang1989 commented on a change in pull request #9931: Add axes support to 
Dropout for variational dropout in NLP
URL: https://github.com/apache/incubator-mxnet/pull/9931#discussion_r171937954
 
 

 ##
 File path: src/operator/nn/dropout-inl.h
 ##
 @@ -178,30 +184,17 @@ class DropoutOp {
   /*!
    * \brief Dropout kernel, compute dropout tensor
    */
-  struct DropoutKernel {
-    /*!
-     * \brief Dropout kernel function
-     * \param id Thread number (0-based representing count)
-     * \param gen Random number generator
-     * \param N Total number of items in the output
-     * \param step Step between items, related to parallelism
-     * \param dropout_out Output dropout values
-     * \param mask_out Output mask (is multiplied to create dropout output, may be 0)
-     * \param input_data Input data to perform the dropout on
-     * \param pkeep Dropout rate (keep when the generated random number is less than this value)
-     */
+  struct BernoulliKernel {
+    /*! \brief Bernoulli kernel for generating mask */
     MSHADOW_XINLINE static void Map(int id,
                                     RandGenerator<xpu, DType> gen,
                                     const int N,
                                     const int step,
-                                    DType *dropout_out,
                                     DType *mask_out,
-                                    const DType *input_data,
                                     const real_t pkeep) {
       RNG_KERNEL_LOOP(xpu, DType, id, gen, N, step, {
         const real_t rand_num = static_cast<real_t>(genImpl.uniform());
         mask_out[i] = mshadow_op::threshold::Map<real_t>(rand_num, pkeep) * (1.0f / pkeep);
-        dropout_out[i] = input_data[i] * mask_out[i];
 
 Review comment:
  Thx @cjolivier01. I get your point about efficiency. I have added a
condition check for standard dropout, which has the same efficiency when
no axes are provided:

https://github.com/apache/incubator-mxnet/pull/9931/files#diff-4aea2cc24c0bb4e8e48face9faf4aa26R252




[GitHub] zhanghang1989 commented on a change in pull request #9931: Add axes support to Dropout for variational dropout in NLP

2018-03-05 Thread GitBox
zhanghang1989 commented on a change in pull request #9931: Add axes support to 
Dropout for variational dropout in NLP
URL: https://github.com/apache/incubator-mxnet/pull/9931#discussion_r172388387
 
 

 ##
 File path: src/operator/nn/dropout-inl.h
 ##
 @@ -337,6 +336,7 @@ class DropoutOp {
   real_t pkeep_;
   /*! \brief Dropout mode */
   dropout::DropoutOpMode mode_;
+  TShape axes;
 
 Review comment:
  Thx!




[GitHub] zhanghang1989 commented on a change in pull request #9931: Add axes support to Dropout for variational dropout in NLP

2018-03-02 Thread GitBox
zhanghang1989 commented on a change in pull request #9931: Add axes support to 
Dropout for variational dropout in NLP
URL: https://github.com/apache/incubator-mxnet/pull/9931#discussion_r171937954
 
 

 ##
 File path: src/operator/nn/dropout-inl.h
 ##
 @@ -178,30 +184,17 @@ class DropoutOp {
   /*!
    * \brief Dropout kernel, compute dropout tensor
    */
-  struct DropoutKernel {
-    /*!
-     * \brief Dropout kernel function
-     * \param id Thread number (0-based representing count)
-     * \param gen Random number generator
-     * \param N Total number of items in the output
-     * \param step Step between items, related to parallelism
-     * \param dropout_out Output dropout values
-     * \param mask_out Output mask (is multiplied to create dropout output, may be 0)
-     * \param input_data Input data to perform the dropout on
-     * \param pkeep Dropout rate (keep when the generated random number is less than this value)
-     */
+  struct BernoulliKernel {
+    /*! \brief Bernoulli kernel for generating mask */
     MSHADOW_XINLINE static void Map(int id,
                                     RandGenerator<xpu, DType> gen,
                                     const int N,
                                     const int step,
-                                    DType *dropout_out,
                                     DType *mask_out,
-                                    const DType *input_data,
                                     const real_t pkeep) {
       RNG_KERNEL_LOOP(xpu, DType, id, gen, N, step, {
         const real_t rand_num = static_cast<real_t>(genImpl.uniform());
         mask_out[i] = mshadow_op::threshold::Map<real_t>(rand_num, pkeep) * (1.0f / pkeep);
-        dropout_out[i] = input_data[i] * mask_out[i];
 
 Review comment:
  The mask size differs from the input size.

  For example: if the input is 10x5x2x3 and the axes are (1, 2), then the
mask will have shape 10x1x1x3.
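
  A minimal sketch of how that mask shape falls out of the axes
  (illustrative names, not PR code): every axis listed in axes is
  collapsed to 1, so the mask is shared, i.e. broadcast, along those axes.

    // in_shape = (10, 5, 2, 3), axes = (1, 2)
    TShape mask_shape(in_shape);
    for (index_t i = 0; i < axes.ndim(); ++i) {
      mask_shape[axes[i]] = 1;  // -> (10, 1, 1, 3)
    }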




[GitHub] zhanghang1989 commented on a change in pull request #9931: Add axes support to Dropout for variational dropout in NLP

2018-03-02 Thread GitBox
zhanghang1989 commented on a change in pull request #9931: Add axes support to 
Dropout for variational dropout in NLP
URL: https://github.com/apache/incubator-mxnet/pull/9931#discussion_r171931913
 
 

 ##
 File path: src/operator/nn/dropout-inl.h
 ##
 @@ -178,30 +184,17 @@ class DropoutOp {
   /*!
    * \brief Dropout kernel, compute dropout tensor
    */
-  struct DropoutKernel {
-    /*!
-     * \brief Dropout kernel function
-     * \param id Thread number (0-based representing count)
-     * \param gen Random number generator
-     * \param N Total number of items in the output
-     * \param step Step between items, related to parallelism
-     * \param dropout_out Output dropout values
-     * \param mask_out Output mask (is multiplied to create dropout output, may be 0)
-     * \param input_data Input data to perform the dropout on
-     * \param pkeep Dropout rate (keep when the generated random number is less than this value)
-     */
+  struct BernoulliKernel {
+    /*! \brief Bernoulli kernel for generating mask */
     MSHADOW_XINLINE static void Map(int id,
                                     RandGenerator<xpu, DType> gen,
                                     const int N,
                                     const int step,
-                                    DType *dropout_out,
                                     DType *mask_out,
-                                    const DType *input_data,
                                     const real_t pkeep) {
       RNG_KERNEL_LOOP(xpu, DType, id, gen, N, step, {
         const real_t rand_num = static_cast<real_t>(genImpl.uniform());
         mask_out[i] = mshadow_op::threshold::Map<real_t>(rand_num, pkeep) * (1.0f / pkeep);
-        dropout_out[i] = input_data[i] * mask_out[i];
 
 Review comment:
  I am not sure I understand you correctly.
  I split the original dropout kernel into two parts: 1) BernoulliKernel
and 2) broadcast mul, so that we can enable axes support for variational
dropout.

  Thx
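
  To make the two phases concrete, here is a tiny standalone C++ sketch
  (hypothetical code, not MXNet internals) that drops axis 1 of a 2x3
  input: one Bernoulli draw per row, then a broadcast multiply across
  columns.

    #include <cstdio>
    #include <random>

    int main() {
      const int rows = 2, cols = 3;
      const float pkeep = 0.5f;
      float input[rows][cols] = {{1, 2, 3}, {4, 5, 6}};
      float mask[rows];   // mask shape 2x1: axis 1 is dropped
      float out[rows][cols];

      std::mt19937 gen(0);
      std::bernoulli_distribution keep(pkeep);

      // phase 1: BernoulliKernel analogue -- scaled mask (inverted dropout)
      for (int r = 0; r < rows; ++r)
        mask[r] = keep(gen) ? 1.0f / pkeep : 0.0f;

      // phase 2: broadcast mul -- the row's mask value applies to every column
      for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)
          out[r][c] = input[r][c] * mask[r];

      for (int r = 0; r < rows; ++r)
        printf("%g %g %g\n", out[r][0], out[r][1], out[r][2]);
      return 0;
    }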




[GitHub] zhanghang1989 commented on a change in pull request #9931: Add axes support to Dropout for variational dropout in NLP

2018-03-02 Thread GitBox
zhanghang1989 commented on a change in pull request #9931: Add axes support to 
Dropout for variational dropout in NLP
URL: https://github.com/apache/incubator-mxnet/pull/9931#discussion_r171931760
 
 

 ##
 File path: src/operator/nn/dropout-inl.h
 ##
 @@ -178,30 +184,17 @@ class DropoutOp {
   /*!
    * \brief Dropout kernel, compute dropout tensor
    */
-  struct DropoutKernel {
-    /*!
-     * \brief Dropout kernel function
-     * \param id Thread number (0-based representing count)
-     * \param gen Random number generator
-     * \param N Total number of items in the output
-     * \param step Step between items, related to parallelism
-     * \param dropout_out Output dropout values
-     * \param mask_out Output mask (is multiplied to create dropout output, may be 0)
-     * \param input_data Input data to perform the dropout on
-     * \param pkeep Dropout rate (keep when the generated random number is less than this value)
-     */
+  struct BernoulliKernel {
+    /*! \brief Bernoulli kernel for generating mask */
     MSHADOW_XINLINE static void Map(int id,
                                     RandGenerator<xpu, DType> gen,
                                     const int N,
                                     const int step,
-                                    DType *dropout_out,
                                     DType *mask_out,
-                                    const DType *input_data,
                                     const real_t pkeep) {
       RNG_KERNEL_LOOP(xpu, DType, id, gen, N, step, {
         const real_t rand_num = static_cast<real_t>(genImpl.uniform());
         mask_out[i] = mshadow_op::threshold::Map<real_t>(rand_num, pkeep) * (1.0f / pkeep);
-        dropout_out[i] = input_data[i] * mask_out[i];
 
 Review comment:
  I am not sure I understand you correctly.
  I split the original dropout kernel into two parts: 1) BernoulliKernel
and 2) broadcast mul.




[GitHub] zhanghang1989 commented on a change in pull request #9931: Add axes support to Dropout for variational dropout in NLP

2018-03-02 Thread GitBox
zhanghang1989 commented on a change in pull request #9931: Add axes support to 
Dropout for variational dropout in NLP
URL: https://github.com/apache/incubator-mxnet/pull/9931#discussion_r171929974
 
 

 ##
 File path: src/operator/nn/dropout-inl.h
 ##
 @@ -228,11 +301,27 @@ class DropoutOp {
     if (!MKLForward(s, pgen, this->pkeep_, in_data, out_data)) {
       const TBlob &mask = out_data[dropout::kMask];
       CHECK(req[dropout::kOut] != kAddTo);
-      LaunchRNG<DropoutKernel, xpu>(s, pgen, out.Size(),
-                                    out.dptr<DType>(),
-                                    mask.dptr<DType>(),
-                                    in_data[dropout::kData].dptr<DType>(),
-                                    this->pkeep_);
+      // initialize the mask
+      LaunchRNG<BernoulliKernel, xpu>(s, pgen, out.Size(),
+                                      mask.dptr<DType>(),
+                                      this->pkeep_);
+      if (req[0] != kNullOp) {
+        // broadcast mul
+        TShape new_lshape, new_rshape, new_oshape;
+        int ndim = BinaryBroadcastShapeCompact(in_data[dropout::kData].shape_,
+                                               mask.shape_, out.shape_,
+                                               &new_lshape, &new_rshape, &new_oshape);
+        BROADCAST_NDIM_SWITCH(ndim, NDim, {
+          mshadow::Shape<NDim> oshape = new_oshape.get<NDim>();
+          mshadow::Shape<NDim> lstride = mxnet_op::calc_stride(new_lshape.get<NDim>());
+          mshadow::Shape<NDim> rstride = mxnet_op::calc_stride(new_rshape.get<NDim>());
+          mxnet_op::Kernel<mxnet_op::binary_broadcast_kernel<NDim, DType, mshadow_op::mul>, xpu>::
+              template LaunchEx(s, new_oshape.Size(), req[0], lstride, rstride, oshape,
+                                in_data[dropout::kData].dptr<DType>(),
+                                mask.dptr<DType>(), out.dptr<DType>());
+        });
+      }
     }
 
 Review comment:
  I haven't updated the MKL code for variational dropout (enabling axes)
yet. I need help with the MKL part.




[GitHub] zhanghang1989 commented on a change in pull request #9931: Add axes support to Dropout for variational dropout in NLP

2018-03-02 Thread GitBox
zhanghang1989 commented on a change in pull request #9931: Add axes support to 
Dropout for variational dropout in NLP
URL: https://github.com/apache/incubator-mxnet/pull/9931#discussion_r171929232
 
 

 ##
 File path: src/operator/nn/dropout.cc
 ##
 @@ -93,10 +93,16 @@ Example::
       std::vector<TShape> *in_shape, std::vector<TShape> *out_shape) {
   using namespace mshadow;
   CHECK_EQ(in_shape->size(), 1U);
-  const TShape &dshape = in_shape->at(0);
+  const DropoutParam& param = nnvm::get<DropoutParam>(attrs.parsed);
+  TShape dshape(in_shape->at(0));
   if (dshape.ndim() == 0) return false;
   out_shape->clear();
   out_shape->push_back(dshape);
+  if (param.axes.ndim() != 0) {
 
 Review comment:
  The axes can be empty for normal dropout :)
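
  A minimal sketch of the mask-shape inference the empty-axes case feeds
  into (based on the hunk above; the body of the if is an assumption, not
  the verbatim committed code): collapsing each listed axis to 1 gives the
  mask shape, and empty axes leave normal dropout unchanged.

    out_shape->push_back(dshape);      // output shape == data shape
    if (param.axes.ndim() != 0) {
      for (index_t i = 0; i < param.axes.ndim(); ++i) {
        dshape[param.axes[i]] = 1;     // collapse each dropped axis
      }
    }
    out_shape->push_back(dshape);      // mask shape (broadcastable)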

