Organize per-CPU perf event groups by cgroup, then by group/insertion index. To support cgroup hierarchies, visit_groups_merge needs a set of iterators. To avoid bounding the size of that set, use a per-CPU allocated buffer. To make the set of iterators fast, use a min-heap ordered by the group index.
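As a rough illustration of the min-heap idea, here is a minimal userspace sketch, not the code from these patches. It merges several runs that are each already sorted by group index, always visiting the smallest index next. The names run_iter, sift_down and merge_runs are hypothetical, and plain arrays of unsigned long stand in for the per-cgroup event lists.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one iterator over a run sorted by group index. */
struct run_iter {
	const unsigned long *cur;	/* next group index to visit */
	const unsigned long *end;	/* one past the last element */
};

/* Restore the min-heap property for the subtree rooted at slot i. */
static void sift_down(struct run_iter *heap, size_t n, size_t i)
{
	for (;;) {
		size_t l = 2 * i + 1, r = l + 1, min = i;
		struct run_iter tmp;

		if (l < n && *heap[l].cur < *heap[min].cur)
			min = l;
		if (r < n && *heap[r].cur < *heap[min].cur)
			min = r;
		if (min == i)
			return;
		tmp = heap[i];
		heap[i] = heap[min];
		heap[min] = tmp;
		i = min;
	}
}

/*
 * Merge n sorted runs into out in ascending order. The heap array is
 * reordered and consumed. Returns the number of values written.
 */
static size_t merge_runs(struct run_iter *heap, size_t n,
			 unsigned long *out, size_t out_cap)
{
	size_t count = 0, i;

	/* Drop runs that are empty from the start. */
	for (i = 0; i < n; ) {
		if (heap[i].cur == heap[i].end)
			heap[i] = heap[--n];
		else
			i++;
	}

	/* Build the heap bottom-up. */
	for (i = n / 2; i-- > 0; )
		sift_down(heap, n, i);

	while (n > 0) {
		assert(count < out_cap);
		/* The root always holds the smallest pending group index. */
		out[count++] = *heap[0].cur++;
		/* Replace an exhausted iterator with the last heap slot. */
		if (heap[0].cur == heap[0].end)
			heap[0] = heap[--n];
		sift_down(heap, n, 0);
	}
	return count;
}
```

With k iterators, each step costs O(log k) comparisons instead of the O(k) needed to scan every iterator for the minimum, which is the motivation for the heap once the number of iterators is no longer a small constant.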

These patches include a caching algorithm by Kan Liang <kan.li...@linux.intel.com> that avoids a search for the first event in a group, and the set of patches as a whole has benefited from conversation with him.

Version 2 of these patches addresses review comments and fixes bugs found by Jiri Olsa and Peter Zijlstra.

Ian Rogers (7):
  perf: propagate perf_install_in_context errors up
  perf/cgroup: order events in RB tree by cgroup id
  perf: order iterators for visit_groups_merge into a min-heap
  perf: avoid a bounded set of visit_groups_merge iterators
  perf: cache perf_event_groups_first for cgroups
  perf: avoid double checking CPU and cgroup
  perf: rename visit_groups_merge to ctx_groups_sched_in

 include/linux/perf_event.h |   8 +
 kernel/events/core.c       | 511 +++++++++++++++++++++++++++++--------
 2 files changed, 414 insertions(+), 105 deletions(-)

--
2.22.0.709.g102302147b-goog