[GitHub] HyperZealot edited a comment on issue #12997: A better take forward kernel for CPU

2018-10-31 Thread GitBox
HyperZealot edited a comment on issue #12997: A better take forward kernel for CPU URL: https://github.com/apache/incubator-mxnet/pull/12997#issuecomment-434862551 I don't think the workloads provided by @rongzha1 are suitable for determining the memory bandwidth, based on the following

[GitHub] HyperZealot edited a comment on issue #12997: A better take forward kernel for CPU

2018-10-30 Thread GitBox
HyperZealot edited a comment on issue #12997: A better take forward kernel for CPU URL: https://github.com/apache/incubator-mxnet/pull/12997#issuecomment-434436260 @rongzha1 Your memory bandwidth seems suspiciously high to me (>100 GB/s). Are you using a special type of memory? You can

[GitHub] HyperZealot edited a comment on issue #12997: A better take forward kernel for CPU

2018-10-30 Thread GitBox
HyperZealot edited a comment on issue #12997: A better take forward kernel for CPU URL: https://github.com/apache/incubator-mxnet/pull/12997#issuecomment-434408459 @rongzha1 From your code, M should be num_cols (61400), which is a pretty big number. Update: I tested my changes with a