Hi Paul! I read the info you provided, but none of the programs actually support detecting cache conflicts.
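To be concrete about what I mean by a cache conflict, here is a minimal sketch of a toy cache model. It is not how any of the tools discussed here work. The cache geometry (64-byte lines, 64 sets, 8 ways) is an assumption, and the names `LRUCache`, `classify_misses` and `strided_trace` are hypothetical helpers made up for this illustration. It counts a miss as a conflict miss when a fully associative LRU cache of the same capacity would have hit.

```python
from collections import OrderedDict

# Assumed cache geometry, for illustration only: 64 sets * 8 ways * 64 B = 32 KiB.
LINE_SIZE = 64
NUM_SETS = 64
WAYS = 8


class LRUCache:
    """Set-associative cache with LRU replacement (num_sets=1 is fully associative)."""

    def __init__(self, num_sets, ways):
        self.num_sets = num_sets
        self.ways = ways
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, addr):
        """Return True on a hit, False on a miss."""
        line = addr // LINE_SIZE
        lines = self.sets[line % self.num_sets]
        if line in lines:
            lines.move_to_end(line)    # mark as most recently used
            return True
        if len(lines) >= self.ways:
            lines.popitem(last=False)  # evict the least recently used line
        lines[line] = None
        return False


def classify_misses(addresses):
    """Split the misses of the set-associative cache into cold/capacity/conflict."""
    real = LRUCache(NUM_SETS, WAYS)
    ideal = LRUCache(1, NUM_SETS * WAYS)  # same capacity, fully associative
    seen = set()
    counts = {"cold": 0, "capacity": 0, "conflict": 0}
    for addr in addresses:
        line = addr // LINE_SIZE
        hit_real = real.access(addr)
        hit_ideal = ideal.access(addr)
        if not hit_real:
            if line not in seen:
                counts["cold"] += 1      # first touch of this line
            elif hit_ideal:
                counts["conflict"] += 1  # only the set mapping caused the miss
            else:
                counts["capacity"] += 1  # working set too large anyway
        seen.add(line)
    return counts


def strided_trace(num_lines, stride, passes):
    """Addresses of num_lines accesses, stride bytes apart, repeated passes times."""
    return [i * stride for _ in range(passes) for i in range(num_lines)]


if __name__ == "__main__":
    # 16 lines that all map to the same set versus 16 contiguous lines.
    print(classify_misses(strided_trace(16, NUM_SETS * LINE_SIZE, 10)))
    print(classify_misses(strided_trace(16, LINE_SIZE, 10)))
```

Both traces touch only 16 cache lines, far below the capacity of this model. With the 4096-byte stride every line lands in the same set, so the model reports 16 cold misses and 144 conflict misses. With the contiguous layout it reports only the 16 cold misses. A plain miss counter would show 160 versus 16 misses without saying why.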
Performance counters can detect cache misses, much as cachegrind does, but they cannot distinguish misses caused by cache conflicts from other cache misses. pahole serves a completely different purpose: it detects padding in data structures, which is unrelated to cache conflicts. DHAT provides useful information by letting you assess which data is accessed most frequently, but you need additional data to verify that the hot data is not evicted from the cache too soon.

On Sun, Jan 29, 2023 at 4:25 PM Paul Floyd <pjfl...@wanadoo.fr> wrote:
>
> On 29-01-23 14:31, Ivica B wrote:
> > Hi!
> >
> > I am looking for a tool that can detect cache conflicts, but I am not
> > finding any. There are a few that are mostly academic, and thus not
> > maintained. I think it is important for the performance analysis
> > community to have a tool that to some extent can detect cache
> > conflicts. Is it possible to implement support for detecting source
> > code lines where cache conflicts occur? More info on cache conflicts
> > below.
>
> [snip]
>
> I agree that this is an interesting topic. If anyone else has ideas I'm
> all ears.
>
> My recommendations for this are:
>
> 1/ PMU/PMC (performance monitoring unit/counter) event counting tools
> (perf record on Linux, pmcstat on FreeBSD, Oracle Studio collect on
> Solaris, don't know for macOS). These can record events such as cache
> misses with the associated callstacks. You can then use tools such as
> HotSpot and perfgrind/kcachegrind (I have used HotSpot but not perfgrind).
>
> The big advantage of this is that the PMCs are part of the hardware and
> the overhead of doing this is minor. The only slight limitation is that
> the number of counters is limited.
>
> 2/ pahole
> https://github.com/acmel/dwarves
> A really nice binary analysis tool. It will analyze your binary (with
> debuginfo) and generate a report for all structures showing holes,
> padding and cache lines. It can even generate modified source with
> members reordered to improve the packing. However, as this is a static
> tool working only on the data structures, it knows nothing about your
> access patterns.
>
> 3/ DHAT
> One of the Valgrind tools. This profiles heap memory. If the block is
> less than 1k it will also generate a kind of ascii-html heat map. That
> map is an aggregate, but you can usually guess which offsets get hit the
> most together.
>
> Cachegrind doesn't really do this with the kind of accuracy that PMCs
> do. It has a reduced model of the cache and has a basic branch
> predictor. I don't know if or how speculative execution affects the
> cache hit rate, but Valgrind doesn't do any of that.
>
> A+
> Paul
>
>
> _______________________________________________
> Valgrind-users mailing list
> Valgrind-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/valgrind-users