ChaiBapchya commented on issue #17980:
URL:
https://github.com/apache/incubator-mxnet/issues/17980#issuecomment-637751126
@aaraujom any update?
This is an automated message from the Apache Git Service.
ChaiBapchya commented on issue #17980:
URL:
https://github.com/apache/incubator-mxnet/issues/17980#issuecomment-629526785
Can confirm that this issue is specific to AVX512 kernels.
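For anyone wanting to check whether their own machine would hit the AVX512 code path, here is a minimal sketch that pulls the AVX512-related entries out of a `Flags:` line as printed by `lscpu`. The helper name `avx512_flags` and the sample line are assumptions for illustration; the sample is not the actual output from this thread.

```python
from typing import List


def avx512_flags(flags_line: str) -> List[str]:
    """Return the AVX512-related CPU feature flags found in a flags line."""
    # lscpu prints "Flags: fpu vme ..."; keep only the part after the colon.
    _, _, values = flags_line.partition(":")
    # Linux names these features avx512f, avx512bw, avx512vl, and so on.
    return sorted(flag for flag in values.split() if flag.startswith("avx512"))


# Hypothetical sample line, made up for this example.
sample = "Flags: fpu sse4_2 avx avx2 avx512f avx512dq avx512cd avx512bw avx512vl"
print(avx512_flags(sample))
```

An empty list means the CPU does not advertise AVX512, so the AVX512 kernels should not be selected on that machine.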
Tried this on c5.xl
```
$ lscpu
Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz
Flags:
```
ChaiBapchya commented on issue #17980:
URL:
https://github.com/apache/incubator-mxnet/issues/17980#issuecomment-629477511
Yes, the logs agree with these statements, but the perf difference isn't
visible via opperf.
My bad, I interpreted lhs and rhs wrongly. But that's for dot and batch_dot. FC
ChaiBapchya commented on issue #17980:
URL:
https://github.com/apache/incubator-mxnet/issues/17980#issuecomment-629385677
> Tested with MXNet
[cfb474b](https://github.com/apache/incubator-mxnet/commit/cfb474ba743d5ea85161bf19875488f4cb409d3c).
Compiled with mostly-default cmake settings:
ChaiBapchya commented on issue #17980:
URL:
https://github.com/apache/incubator-mxnet/issues/17980#issuecomment-623803913
> In case somebody finds this issue and wants their optimized build, here is
a different workaround that removes the need for `LD_PRELOAD`. Just do this
before