Hi David,

Is there any update on the patch series?
We recently encountered another performance issue on KNL. I think the RB-tree
solution would also benefit it.

Thanks,
Kan

> Subject: [RFC 0/6] optimize ctx switch with rb-tree
>
> Following the discussion in:
> https://patchwork.kernel.org/patch/9420035/
>
> This is an early version of a series of perf context switch
> optimizations.
>
> The main idea is to create and maintain a list of inactive events sorted by
> timestamp, and a rb-tree to index it. The rb-tree's keys are
> {cpu,flexible,stamp} for task contexts and {cgroup,flexible,stamp} for CPU
> contexts.
>
> The rb-tree provides functions to find intervals in the inactive event list so
> that ctx_sched_in only has to visit the events that can potentially be
> scheduled (i.e. avoid iterations over events bound to CPUs or cgroups that
> are not current).
>
> Since the inactive list is sorted by timestamp, rotation can be done by simply
> scheduling out and in the events. This implies that on each timer interrupt,
> the events will rotate by q events (where q is the number of hardware
> counters). This changes the current behavior of rotation.
> Feedback welcome!
>
> I haven't profiled the new approach. I am only assuming it will be superior
> when the number of per-cpu or distinct cgroup events is large.
>
> The last patch shows how perf_iterate_ctx can use the new rb-tree index to
> reduce the number of visited events. I haven't checked carefully whether
> locking and other things are correct.
>
> If these changes are in the right direction, a next version could remove some
> existing code; specifically, the lists ctx->pinned_groups and
> ctx->flexible_groups could be removed. Also, event_filter_match could be
> simplified when called on event groups filtered using the rb-tree, since both
> perform similar checks.
>
> David Carrillo-Cisneros (6):
>   perf/core: create active and inactive event groups
>   perf/core: add a rb-tree index to inactive_groups
>   perf/core: use rb-tree to sched in event groups
>   perf/core: avoid rb-tree traversal when no inactive events
>   perf/core: rotation no longer neccesary. Behavior has changed. Beware
>   perf/core: use rb-tree index to optimize filtered perf_iterate_ctx
>
>  include/linux/perf_event.h |  13 ++
>  kernel/events/core.c       | 466 +++++++++++++++++++++++++++++++++++++++------
>  2 files changed, 426 insertions(+), 53 deletions(-)
>
> --
> 2.11.0.390.gc69c2f50cf-goog
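
For anyone following the thread, here is how I read the {cpu,flexible,stamp}
key and the interval lookup from the cover letter. This is only a userspace
sketch, not code from the series. The names ev_key, ev_key_cmp, ev_lower_bound
and ev_interval are made up for illustration. A sorted array stands in for the
rb-tree, and I am assuming the key fields are compared lexicographically in
the order listed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the {cpu,flexible,stamp} key of a task context. */
struct ev_key {
	int      cpu;       /* CPU the event is bound to, -1 for "any CPU" */
	int      flexible;  /* 0 = pinned, 1 = flexible */
	uint64_t stamp;     /* timestamp used to order the inactive list */
};

/* Assumed ordering: compare cpu first, then flexible, then stamp. */
static int ev_key_cmp(const struct ev_key *a, const struct ev_key *b)
{
	if (a->cpu != b->cpu)
		return a->cpu < b->cpu ? -1 : 1;
	if (a->flexible != b->flexible)
		return a->flexible < b->flexible ? -1 : 1;
	if (a->stamp != b->stamp)
		return a->stamp < b->stamp ? -1 : 1;
	return 0;
}

/*
 * Index of the first element of the sorted array keys[0..n) that is not
 * less than *k. A binary search over an array plays the role that a
 * descent through the rb-tree would play in the kernel.
 */
static size_t ev_lower_bound(const struct ev_key *keys, size_t n,
			     const struct ev_key *k)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (ev_key_cmp(&keys[mid], k) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/*
 * Compute the half-open interval [*first, *last) of events that match
 * (cpu, flexible). Events inside the interval are ordered by stamp, so
 * the oldest one comes first.
 */
static void ev_interval(const struct ev_key *keys, size_t n,
			int cpu, int flexible,
			size_t *first, size_t *last)
{
	struct ev_key lo = { cpu, flexible, 0 };
	/* flexible is 0 or 1, so flexible + 1 sorts just past the group. */
	struct ev_key hi = { cpu, flexible + 1, 0 };

	assert(first && last);
	*first = ev_lower_bound(keys, n, &lo);
	*last = ev_lower_bound(keys, n, &hi);
}
```

With something like this, a sched-in path would only walk the intervals for
cpu == -1 and for the current CPU, and skip events bound to other CPUs
entirely. Whether the series does exactly this I can't say without the
patches in front of me.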