[GitHub] HyperZealot edited a comment on issue #12997: A better take forward kernel for CPU

GitBox Wed, 31 Oct 2018 15:00:32 -0700

HyperZealot edited a comment on issue #12997: A better take forward kernel for 
CPU
URL: https://github.com/apache/incubator-mxnet/pull/12997#issuecomment-434862551
 
 
   I don't think the workloads provided by @rongzha1 are suitable for 
determining the memory bandwidth, for the following reasons:
   1. For each input, the effective working set size is test_rows * num_cols * 
4 bytes. Here's the working set size for each input:
   (1M, 20k, 512) = ~39MB
   (1M, 20k, 8) = ~0.6MB
   (800, 8, 61400) = ~1.87MB
   2. The benchmark script runs 100 trials for each group of indices, so if the 
cache is larger than the working set, the source data can be held entirely in 
cache once the compulsory misses of the 1st trial have been paid. From the 2nd 
trial on you're actually measuring the cache bandwidth.
   3. Although the benchmark runs 100 trials for the same input, users usually 
only run once for each input, so they always pay for the compulsory misses and 
the bottleneck is the memory speed.
   4. With a bit of searching I found that the Skylake 8180 has a 38.5MB L3 
cache, so both the (800, 8, 61400) and (1M, 20k, 8) workloads fit entirely in 
the L3 cache. Measurements on those workloads therefore cannot accurately 
compare how well the different versions of the code utilize memory bandwidth.
   5. If you really want to showcase the effect of the different versions on 
num_cols=8, maybe you can switch to a CPU with a smaller cache, or you can 
increase test_rows to make the working set larger than your L3 cache size.
   6. I tested (50M, 1M, 8) (~30.5MB working set, definitely greater than my 
L3 cache size) on my own machine and got 7.90 GB/s for the "for" version and 
9.80 GB/s for the "memcpy" version.
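
   The arithmetic behind points 1, 4 and 5 can be checked with a small 
sketch. It is not part of the benchmark script. It assumes each tuple is 
(data rows, test_rows, num_cols), that elements are 4 bytes wide, and that 
"MB" means 2^20 bytes. The function names are made up for illustration.

```python
MIB = 1024 * 1024  # assumption: "MB" in the comment means 2**20 bytes


def working_set_bytes(test_rows, num_cols, elem_size=4):
    """Effective working set: test_rows * num_cols * element size."""
    return test_rows * num_cols * elem_size


def min_rows_to_exceed_cache(cache_bytes, num_cols, elem_size=4):
    """Smallest test_rows whose working set is strictly larger than the cache."""
    return cache_bytes // (num_cols * elem_size) + 1


# L3 size quoted above for the Skylake 8180.
L3_BYTES = int(38.5 * MIB)

# (test_rows, num_cols) for the three workloads listed in point 1.
workloads = {
    "(1M, 20k, 512)": (20_000, 512),
    "(1M, 20k, 8)": (20_000, 8),
    "(800, 8, 61400)": (8, 61_400),
}

for name, (test_rows, num_cols) in workloads.items():
    size = working_set_bytes(test_rows, num_cols)
    verdict = "fits in L3" if size <= L3_BYTES else "exceeds L3"
    print(f"{name}: {size / MIB:.2f} MB, {verdict}")

# test_rows needed for num_cols=8 to spill out of a 38.5MB L3 (point 5).
print("min test_rows for num_cols=8:", min_rows_to_exceed_cache(L3_BYTES, 8))
```

   Under these assumptions only the num_cols=512 workload is larger than the 
38.5MB L3, and only by a small margin. The (50M, 1M, 8) run in point 6 was on 
a different machine whose L3 size is not stated here, so it is left out of 
the comparison.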


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
