fhieber commented on pull request #18531:
URL: https://github.com/apache/incubator-mxnet/pull/18531#issuecomment-643804092


   @eric-haibin-lin Thanks for the pointer. We tried custom operators in a 
separate project (int8 quantization) but found them unsuitable for 
performance reasons.
   @sxjscience Thanks! I will take a look. I haven't benchmarked SoftmaxOutput 
against a Gluon implementation recently; as of mid-2019 we still observed a 
~15-20% slowdown when not using SoftmaxOutput. I will give your label-smoothing 
implementation a try, but my guess is that a single operator combining the 
softmax forward pass, the (smoothed) cross-entropy gradient in the backward 
pass, and masking without materializing an explicit mask will still have a 
slight edge over a composition of multiple operators.
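   For reference, here is a minimal NumPy sketch (an assumption of mine, not the SoftmaxOutput source) of the math such a fused operator performs: with label smoothing, the backward pass reduces to `softmax(logits) - smoothed_target`, and masking falls out of treating a padding label id as "ignore", so no explicit mask array is needed. The function name, `alpha`, and `pad_id` are hypothetical illustration choices.

```python
import numpy as np

def smoothed_ce_and_grad(logits, labels, alpha=0.1, pad_id=0):
    """Label-smoothed softmax cross-entropy with padding-based masking.

    Sketch of the fused computation: forward loss plus the analytic
    gradient softmax(logits) - smoothed_target, with rows whose label
    equals pad_id zeroed out instead of using an explicit mask tensor.
    """
    n, v = logits.shape
    # numerically stable softmax
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    probs = e / e.sum(axis=1, keepdims=True)
    # smoothed one-hot targets: (1 - alpha) on the label, alpha spread elsewhere
    targets = np.full((n, v), alpha / (v - 1))
    targets[np.arange(n), labels] = 1.0 - alpha
    # mask rows whose label is the padding id (derived from labels, not stored)
    mask = (labels != pad_id).astype(logits.dtype)
    log_probs = z - np.log(e.sum(axis=1, keepdims=True))
    loss = -(targets * log_probs).sum(axis=1) * mask
    grad = (probs - targets) * mask[:, None]
    return loss, grad
```

   Composing the same result from separate softmax, smoothing, and masking operators launches several kernels and materializes intermediates, which is one plausible source of the slowdown mentioned above.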


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
