This is a follow-up to the earlier discussion on sharing hardware PMU counters
across compatible events:
https://marc.info/?t=151213803600016
A lot of this set is based on Tejun's work. I also got a lot of ideas and
insights from Jiri's version.

The major effort in this version is to make perf event scheduling faster.
Specifically, all the operations on the critical paths have O(1) execution
time. The commit message of RFC 2/2 has more information about the data
structure we used for these operations.

RFC 1/2 is a preparatory patch. It may become unnecessary if we introduce a
virtual master later on.

RFC 2/2 has the majority of the new data structure and operations.

I have tested this version on a VM with tracepoint events. I also briefly
tested it on real hardware, where it shows sharing of perf events and doesn't
break too badly.

Please share your comments and suggestions. Is this in the right direction
for PMU counter sharing?

Thanks in advance.
Song

Song Liu (2):
  perf: add move_dup() for PMU sharing.
  perf: Sharing PMU counters across compatible events

 arch/x86/events/core.c          |   8 ++
 include/linux/perf_event.h      |  57 +++++++++
 include/linux/trace_events.h    |   3 +
 kernel/events/core.c            | 255 +++++++++++++++++++++++++++++++++++++---
 kernel/trace/trace_event_perf.c |  11 ++
 5 files changed, 316 insertions(+), 18 deletions(-)

--
2.9.5

