On Thu, Aug 27, 2020 at 8:48 PM Jakub Wartak <jakub.war...@tomtom.com> wrote:
> I've tried to get cache misses ratio via PMCs, apparently on EC2 they are
> (even on bigger) reporting as not-supported or zeros.

I heard some of the counters are only allowed on their dedicated instance
types.

> However interestingly the workload has IPC of 1.40 (instruction bound) which
> to me is strange as I would expect BufTableLookup() to be actually heavy
> memory bound (?) Maybe I'll try on some different hardware one day.

Hmm, OK now you've made me go and read a bunch of Brendan Gregg blogs and
try some experiments of my own to get a feel for this number and what it
might be telling us about the cache miss counters you can't see. Since I
know how to generate arbitrary cache miss workloads for quick experiments
using hash joins of different sizes, I tried that and noticed that when LLC
misses were at 76% (bad), IPC was at 1.69, which is still higher than what
you're seeing. When the hash table was much smaller and LLC misses were
down to 15% (much better), IPC was at 2.83. I know Gregg said[1] "An IPC <
1.0 likely means memory bound, and an IPC > 1.0 likely means instruction
bound", but that's not what I'm seeing here, and in his comments section
that is disputed. So I'm not sure your IPC of 1.40 is evidence against the
memory-bound hypothesis on its own.

[1] http://www.brendangregg.com/blog/2017-05-09/cpu-utilization-is-wrong.html
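
For anyone following along, here is a minimal sketch of the arithmetic behind
the two ratios discussed above. The raw counter values are hypothetical, picked
only so that the ratios come out to the figures quoted in this mail; they are
not measurements. The function names are also just for illustration. The
classifier encodes nothing more than the rule of thumb from [1], which is
exactly the heuristic being questioned here.

```python
def ipc(instructions: int, cycles: int) -> float:
    """Instructions retired per CPU cycle."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return instructions / cycles


def llc_miss_ratio(llc_misses: int, llc_references: int) -> float:
    """Fraction of last-level cache references that missed."""
    if llc_references <= 0:
        raise ValueError("llc_references must be positive")
    return llc_misses / llc_references


def rule_of_thumb(ipc_value: float, threshold: float = 1.0) -> str:
    """Apply the IPC heuristic quoted from [1]; a heuristic, not a verdict."""
    if ipc_value < threshold:
        return "likely memory bound"
    if ipc_value > threshold:
        return "likely instruction bound"
    return "inconclusive"


# Hypothetical raw counts, chosen only to reproduce the ratios in this mail.
big_hash_ipc = ipc(instructions=1_690_000, cycles=1_000_000)
big_hash_miss = llc_miss_ratio(llc_misses=760, llc_references=1_000)

# The heuristic labels this run by IPC alone, regardless of the miss ratio.
print(f"IPC={big_hash_ipc:.2f} LLC-miss={big_hash_miss:.0%} "
      f"-> {rule_of_thumb(big_hash_ipc)}")
```

The point of the sketch is that the heuristic looks at IPC alone, so a run
with a 76% LLC miss ratio still gets labelled by its IPC of 1.69.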