HyperZealot edited a comment on issue #12997: A better take forward kernel for
CPU
URL: https://github.com/apache/incubator-mxnet/pull/12997#issuecomment-434862551
I don't think the workloads provided by @rongzha1 are suitable for
determining the memory bandwidth, based on the following
HyperZealot edited a comment on issue #12997: A better take forward kernel for
CPU
URL: https://github.com/apache/incubator-mxnet/pull/12997#issuecomment-434436260
@rongzha1 Your memory bandwidth seems suspiciously high to me (>100 GB/s).
Are you using a special type of memory? You can
HyperZealot edited a comment on issue #12997: A better take forward kernel for
CPU
URL: https://github.com/apache/incubator-mxnet/pull/12997#issuecomment-434408459
@rongzha1 From your code, M should be num_cols (61400), which is a pretty big
number.
Update: I tested my changes with a