[GitHub] [incubator-mxnet] ChaiBapchya commented on issue #17980: When compiled with MKL, fully_connected calls DNNL while dot and batch_dot call MKL

2020-06-02 Thread GitBox
ChaiBapchya commented on issue #17980: URL: https://github.com/apache/incubator-mxnet/issues/17980#issuecomment-637751126 @aaraujom any update?

[GitHub] [incubator-mxnet] ChaiBapchya commented on issue #17980: When compiled with MKL, fully_connected calls DNNL while dot and batch_dot call MKL

2020-05-15 Thread GitBox
ChaiBapchya commented on issue #17980: URL: https://github.com/apache/incubator-mxnet/issues/17980#issuecomment-629526785 Can confirm that this issue is specific to AVX512 kernels. Tried this on c5.xl $ lscpu ``` Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz Flags:

[GitHub] [incubator-mxnet] ChaiBapchya commented on issue #17980: When compiled with MKL, fully_connected calls DNNL while dot and batch_dot call MKL

2020-05-15 Thread GitBox
ChaiBapchya commented on issue #17980: URL: https://github.com/apache/incubator-mxnet/issues/17980#issuecomment-629477511 Yes, the logs agree with these statements, but the perf difference isn't visible via opperf. My bad, I interpreted lhs and rhs wrongly. But that's for dot and batch_dot. FC

[GitHub] [incubator-mxnet] ChaiBapchya commented on issue #17980: When compiled with MKL, fully_connected calls DNNL while dot and batch_dot call MKL

2020-05-15 Thread GitBox
ChaiBapchya commented on issue #17980: URL: https://github.com/apache/incubator-mxnet/issues/17980#issuecomment-629385677 > Tested with MXNet [cfb474b](https://github.com/apache/incubator-mxnet/commit/cfb474ba743d5ea85161bf19875488f4cb409d3c). Compiled with mostly-default cmake settings:

[GitHub] [incubator-mxnet] ChaiBapchya commented on issue #17980: When compiled with MKL, fully_connected calls DNNL while dot and batch_dot call MKL

2020-05-04 Thread GitBox
ChaiBapchya commented on issue #17980: URL: https://github.com/apache/incubator-mxnet/issues/17980#issuecomment-623803913 > In case somebody finds this issue and wants their optimized build, here is a different workaround that removes the need for `LD_PRELOAD`. Just do this before