Hi all,

I am trying to understand the output of the perf mem tool on my workstation, which has two Intel Xeon X5650 processors.
I recorded a perf.data file with memory load sampling (write sampling is not available on these processors) as follows, from the root directory of a Linux kernel source tree:

    perf mem -t load rec -c 1 make -j18

Then I report the results with:

    perf mem rep --sort=mem

    97.00%  25519343  L1 hit
     1.31%     43687  L3 hit
     1.15%     37253  LFB hit
     0.32%      3156  Remote Cache (1 hop) hit
     0.14%     38579  L3 miss
     0.05%      6309  L2 hit
     0.03%       231  Remote RAM (1 hop) hit
     0.00%         8  Local RAM hit
     0.00%         2  Uncached hit

As you can see, 97% of the loads hit the L1 cache. (I assume that I am sampling all loads with -c 1: is that right?)

My first question is about this high L1 hit ratio and the small number of RAM requests (231 + 8). Is it realistic to have a 97% L1 hit ratio and only 239 RAM accesses when compiling a Linux kernel?

While writing this email and looking at the Intel SDM again, I came to think that the L3 misses are what the SDM calls "unknown L3 cache miss". If so, the total number of RAM accesses would be L3 miss + Remote RAM + Local RAM. Is that correct?

The second question is about the Uncached hit: is it the un-cacheable memory described in the SDM? If yes, I guess it is also a request to RAM.

Finally, it is not very clear to me what the Line Fill Buffer (LFB) is exactly, and I was not able to find a pointer explaining it. Do you know where I can read about this?

Thanks,
Manuel
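
For reference, the arithmetic behind the first question can be sketched as below. This is only a rough illustration: the sample counts are copied from the report above, the variable names are made up for this sketch, and treating every "L3 miss" as a RAM access is exactly the unconfirmed assumption raised in the question.

```python
# Sample counts copied from the `perf mem rep --sort=mem` output above.
samples = {
    "L1 hit": 25519343,
    "L3 hit": 43687,
    "LFB hit": 37253,
    "Remote Cache (1 hop) hit": 3156,
    "L3 miss": 38579,
    "L2 hit": 6309,
    "Remote RAM (1 hop) hit": 231,
    "Local RAM hit": 8,
    "Uncached hit": 2,
}

# ASSUMPTION (not confirmed): "L3 miss" samples are ultimately served from RAM.
ram_candidates = ("L3 miss", "Remote RAM (1 hop) hit", "Local RAM hit")


def share(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


total = sum(samples.values())
ram_like = sum(samples[name] for name in ram_candidates)

print(f"total sampled loads : {total}")
print(f"L1 hit share (count): {share(samples['L1 hit'], total):.2f}%")
print(f"RAM-like loads      : {ram_like} ({share(ram_like, total):.3f}%)")
```

Under that assumption the number of loads reaching RAM is 38818 rather than 239. The share of L1 hits computed from raw sample counts also comes out higher than the 97.00% printed in the first column, so that column is probably not a plain ratio of sample counts.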