Hello Steven,

I am correlating Linux sched events, following tasks as they move between CPUs, and one thing that would be really convenient would be a global trace_pipe_raw, in addition to the per-CPU ones, with the events already sorted.
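For reference, this is the interface I am working against today (a minimal sketch, assuming tracefs is mounted at the usual /sys/kernel/debug/tracing location and accessed as root):

$ cd /sys/kernel/debug/tracing
# raw, binary event stream, currently available only per CPU:
$ ls per_cpu/cpu*/trace_pipe_raw
# text output, already merged and time-sorted across all CPUs:
$ cat trace_pipe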
I would imagine the core functionality is already available, since trace_pipe in the tracing directory already shows all events regardless of CPU, so it would be a matter of doing the same for trace_pipe_raw. But is there a good reason why trace_pipe_raw is available only per CPU? Would work in the direction of adding a global trace_pipe_raw be considered for inclusion?

Thank you,

Claudio

On 07/09/2018 05:32 PM, Steven Rostedt wrote:
> On Fri, 6 Jul 2018 08:22:01 +0200
> Claudio <[email protected]> wrote:
>
>> Hello all,
>>
>> I have been experimenting with the idea of leaving ftrace enabled, with
>> sched events, on production systems.
>>
>> My main concern at the moment is the impact on the system. Enabling the
>> sched events that I currently need for the tracing application seems to
>> slow down context switches considerably and make the system less
>> responsive.
>>
>> I have tested with cyclictest on the mainline kernel, and noticed an
>> increase of min and avg latencies of around 25%.
>>
>> Is this expected?
>>
>> Some initial investigation into ftrace seems to point at the reservation
>> and commit of the events into the ring buffer as the highest sources of
>> overhead, while copying of the event parameters, including COMM, does not
>> seem to have any noticeable effect relative to those costs.
>>
>> I have run the following test 20 times and thrown away the first results:
>>
>> $ sudo ./cyclictest --smp -p95 -m -s -N -l 100000 -q
>
> OK, I just noticed that you are using -N, which means all numbers are in
> nanoseconds.
>
>> $ uname -a
>> Linux claudio-HP-ProBook-470-G5 4.18.0-rc3+ #3 SMP Tue Jul 3 15:50:30 CEST 2018 x86_64 x86_64 x86_64 GNU/Linux
>>
>> For brevity, this is a comparison of one test's results. All other test
>> results show the same ~25% increase.
>>
>> On the left side, the run without ftrace sched events; on the right side,
>> the run with ftrace sched events enabled.
>>
>> CPU   Count   Min   Act   Avg     Max    Count  Min-ftrace  Act-ftrace  Avg-ftrace  Max-ftrace
>>   0  100000  2339  2936  2841  139478   100000        2900        3182        3566       93056
>>   1   66742  2365  3386  2874   93639    66750        2959        3786        3646      154074
>>   2   50080  2376  3058  2910  196221    50097        2997        4209        3655       18707
>>   3   40076  2394  3461  2931   17914    40091        3006        4417        3750       17159
>>   4   33404  2371  3612  2834   15336    33419        2997        3836        3594       23172
>>   5   28635  2387  3313  2885   25863    28649        2995        3795        3647        9956
>>   6   25058  2384  3428  2968   12162    25071        3051        4366        3719       18151
>>   7   22275  2381  2859  2982   10706    22287        3046        5078        3825       10781
>>
>> I would be thankful for any advice or comments on this, especially with
>> the goal of lowering the runtime impact on the system as much as
>> possible.
>
> Thus, the tracing is causing the wakeup time to be an average of 0.8us
> longer.
>
> Yes that is expected.
>
> -- Steve
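For completeness, the ~0.8us figure is consistent with the Avg columns in the table above (for example 3566 - 2841 = 725 ns on CPU 0, up to 3825 - 2982 = 843 ns on CPU 7). Below is a minimal sketch of how such a comparison can be reproduced; it assumes tracefs is mounted at /sys/kernel/debug/tracing and is accessed as root, and enabling the whole events/sched group is just the simplest case, not necessarily the exact event set referenced above:

$ cd /sys/kernel/debug/tracing

# enable all sched events (individual events can be enabled instead,
# e.g. via events/sched/sched_switch/enable)
$ echo 1 > events/sched/enable

# run with sched event tracing active; -N reports latencies in nanoseconds
$ sudo ./cyclictest --smp -p95 -m -s -N -l 100000 -q

# disable the events again and repeat for the baseline run
$ echo 0 > events/sched/enable
$ sudo ./cyclictest --smp -p95 -m -s -N -l 100000 -q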

