ptrendx commented on a change in pull request #20320:
URL: https://github.com/apache/incubator-mxnet/pull/20320#discussion_r648547822
##########
File path: src/common/cuda/rtc/forward_functions-inl.h
##########
@@ -694,6 +694,15 @@ __device__ inline DType log_sigmoid(const DType val) {
}
}
+template <typename DType>
+__device__ inline DType mish(const DType val) {
+ if (type_util::has_double_or_integral<DType>::value) {
+ return val * ::tanh(::log(1 + ::exp(val)));
Review comment:
One thing that could be improved here (I did not notice this PR earlier,
sorry for the late feedback) is the numerical stability of the softrelu part,
`log(1 + exp(val))`: see the implementation of softrelu, which switches to
softrelu(x) = x for large values of x to avoid overflow. @Adnios could you
open another PR changing e.g. this function to
```
return val * op::tanh(op::softrelu(val));
```
(the double vs. float distinction is already handled inside op::tanh and
op::softrelu, so this function also becomes simpler as a result), and
similarly for the backward function?
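
To illustrate the overflow concern, here is a minimal host-side C++ sketch, not the actual MXNet code. The names `softrelu_naive`, `softrelu_sketch`, and `mish_stable` are hypothetical, and the cutoff of 20 is an assumption for illustration; the real `op::softrelu` should be checked for its exact threshold. It shows that the naive `log(1 + exp(x))` becomes infinite in float for large x, while the thresholded form returns x.

```cpp
#include <cmath>

// Naive softrelu as written in the diff: log(1 + exp(x)).
// In float, exp(x) overflows to +inf for x above roughly 88.7.
template <typename T>
inline T softrelu_naive(const T val) {
  return std::log(T(1) + std::exp(val));
}

// Hypothetical stand-in for op::softrelu: for large inputs,
// log(1 + exp(x)) is indistinguishable from x, so return x directly.
// The cutoff of 20 is an assumption made for this sketch.
template <typename T>
inline T softrelu_sketch(const T val) {
  if (val > T(20)) {
    return val;
  }
  return std::log1p(std::exp(val));
}

// mish(x) = x * tanh(softrelu(x)), built on the thresholded softrelu.
template <typename T>
inline T mish_stable(const T val) {
  return val * std::tanh(softrelu_sketch(val));
}
```

With strict IEEE arithmetic the forward result of the naive version may still come out right, because tanh saturates to 1 at infinity. The intermediate value is infinite, though, and a backward expression that reuses `exp(val)` can be more fragile, depending on how it is written.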