While the speed-up looks solid, I noticed the following:

1. A difference in top-1 inference accuracy for SqueezeNet in this comment: https://github.com/apache/incubator-mxnet/pull/12591#issuecomment-423105682
2. Higher variance in training accuracy compared to GPU, and no reported validation accuracy, in this comment: https://github.com/apache/incubator-mxnet/pull/12591#issuecomment-423889618
3. A clear difference in accuracy in https://github.com/apache/incubator-mxnet/pull/12591#issuecomment-423890706
4. A lack of comparison between regular builds and MKL builds, which is the comparison we should actually establish (see the sketch after the questions below).
I also have the following questions about the results:

1. What does "multi-node" mean in the second diagram in this comment? https://github.com/apache/incubator-mxnet/pull/12591#issuecomment-423889618
2. What would the results look like on more common CPUs?

Overall, I don't think this evaluation yet answers the most important question for this PR: can we say with confidence that, by switching to USE_MKLDNN by default, our library achieves a speed-up without losing accuracy across different CPUs?
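To make point 4 above concrete, here is a minimal sketch of the apples-to-apples check I have in mind: run the same script once under a regular build and once under an MKLDNN build (for example, environments with the `mxnet` vs. `mxnet-mkl` pip packages, or local builds with `USE_MKLDNN=0`/`USE_MKLDNN=1`), then compare the measured throughput and diff the saved outputs numerically. The model choice, batch size, and iteration count are placeholders, not a proposed benchmark suite.

```python
# bench_build.py -- run this once under each build, then compare the results.
# Usage: python bench_build.py plain    (under the regular build)
#        python bench_build.py mkldnn   (under the MKLDNN build)
import sys
import time

import numpy as np
import mxnet as mx
from mxnet.gluon.model_zoo import vision

tag = sys.argv[1]  # label identifying which build produced these results
ctx = mx.cpu()

# Placeholder model; any pretrained model from the zoo would do.
net = vision.resnet18_v1(pretrained=True, ctx=ctx)
net.hybridize()

# Fixed seed so both builds see an identical input batch.
mx.random.seed(42)
x = mx.nd.random.uniform(shape=(32, 3, 224, 224), ctx=ctx)

# Warm-up pass, then timed runs.
net(x).wait_to_read()
n_iters = 50
start = time.time()
for _ in range(n_iters):
    out = net(x)
out.wait_to_read()
elapsed = time.time() - start
print("%s: %.1f images/sec" % (tag, n_iters * x.shape[0] / elapsed))

# Save the outputs so the two builds can be diffed offline, e.g.:
#   np.abs(np.load("out_plain.npy") - np.load("out_mkldnn.npy")).max()
np.save("out_%s.npy" % tag, out.asnumpy())
```

Running something like this across a few CPU generations (not just the high-end machines in the current numbers) would directly answer both the speed-up and the accuracy-parity questions.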