Currently, the energy-cores event in the power PMU aggregates energy consumption data at the package level. The core energy RAPL counter in AMD CPUs, on the other hand, has core scope, meaning that energy consumption is recorded separately for each core. Earlier efforts to add the core event to the power PMU failed [1] because of the difference in scope between these two events. Hence, there is a need for a new core-scope PMU.
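As a rough illustration of the scope difference (not the driver's actual code), the sketch below picks one designated CPU per package and one per core from a made-up topology. The topology table and the designated_cpus() helper are hypothetical and exist only for this example. The point is that a package-scope PMU needs one reader per package, while a core-scope PMU needs one reader per core, so the two cannot share a single CPU mask.

```python
# Hypothetical topology: CPU id -> (package id, core id).
# SMT siblings share a core, e.g. CPUs 0 and 2 both sit on package 0, core 0.
TOPOLOGY = {
    0: (0, 0), 1: (0, 1), 2: (0, 0), 3: (0, 1),
    4: (1, 0), 5: (1, 1), 6: (1, 0), 7: (1, 1),
}


def designated_cpus(topology, scope):
    """Return the lowest-numbered CPU of each package or of each core."""
    if scope not in ("package", "core"):
        raise ValueError("scope must be 'package' or 'core'")
    chosen = {}
    for cpu in sorted(topology):
        pkg, core = topology[cpu]
        # A package is identified by its id alone; a core needs both ids,
        # because core ids repeat across packages.
        key = pkg if scope == "package" else (pkg, core)
        chosen.setdefault(key, cpu)
    return sorted(chosen.values())


if __name__ == "__main__":
    print("package scope:", designated_cpus(TOPOLOGY, "package"))
    print("core scope:   ", designated_cpus(TOPOLOGY, "core"))
```

With this example topology, package scope yields two designated CPUs and core scope yields four, which is why an event counted per core does not fit into a PMU whose readers are chosen per package.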

This patchset adds a new "power_per_core" PMU alongside the existing "power" PMU, which will be responsible for collecting the new "energy-per-core" event.

Tested the package level and core level PMU counters with workloads pinned to different CPUs.

Results with workload pinned to CPU 1 in Core 1 on an AMD Zen4 Genoa machine:

$ perf stat -a --per-core -e power_per_core/energy-per-core/ sleep 1

 Performance counter stats for 'system wide':

S0-D0-C0         1          0.02 Joules power_per_core/energy-per-core/
S0-D0-C1         1          5.72 Joules power_per_core/energy-per-core/
S0-D0-C2         1          0.02 Joules power_per_core/energy-per-core/
S0-D0-C3         1          0.02 Joules power_per_core/energy-per-core/
S0-D0-C4         1          0.02 Joules power_per_core/energy-per-core/
S0-D0-C5         1          0.02 Joules power_per_core/energy-per-core/
S0-D0-C6         1          0.02 Joules power_per_core/energy-per-core/
S0-D0-C7         1          0.02 Joules power_per_core/energy-per-core/
S0-D0-C8         1          0.02 Joules power_per_core/energy-per-core/
S0-D0-C9         1          0.02 Joules power_per_core/energy-per-core/
S0-D0-C10        1          0.02 Joules power_per_core/energy-per-core/

[1]: https://lore.kernel.org/lkml/[email protected]/

This patchset applies cleanly on top of v6.10-rc3 as well as latest tip/master.

Dhananjay Ugwekar (6):
  perf/x86/rapl: Fix the energy-pkg event for AMD CPUs
  perf/x86/rapl: Rename rapl_pmu variables
  perf/x86/rapl: Make rapl_model struct global
  perf/x86/rapl: Move cpumask variable to rapl_pmus struct
  perf/x86/rapl: Add wrapper for online/offline functions
  perf/x86/rapl: Add per-core energy counter support for AMD CPUs

 arch/x86/events/rapl.c | 311 ++++++++++++++++++++++++++++++-----------
 1 file changed, 233 insertions(+), 78 deletions(-)

-- 
2.34.1
