DickJC123 commented on pull request #18905:
URL: https://github.com/apache/incubator-mxnet/pull/18905#issuecomment-675718858


   Beginning my review.  The change in strides looks good to me based on the 
following reasoning:
   
   A kernel that accesses an element at location (n_index, c_index, h_index, 
w_index) will use the equation:
   ```
   element_offset = n_index * n_stride + c_index * c_stride + h_index * 
h_stride + w_index * w_stride
   ```
   For a tensor with a particular dimension size of 1, the only valid index in 
that dimension is 0, so the stride in that dimension is essentially a don't 
care.  Your PR is changing strides of the bias tensor for dimension sizes equal 
to 1, so that really shouldn't matter.  The new strides you are proposing seem 
to make more sense though, and if that helps existing cuDNN libraries make a 
better bias kernel choice, all the better.
   
   Did you measure perf gains and under what environments?  Did you come to 
understand whether it was because of the reordering of bias and wgrad, or based 
on the actual cuDNN bias kernel choice, or both?
   
   Thanks for making the changes similarly in Deconvolution- it's good to keep 
these operators in sync as much as possible for ease of code maintenance.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to