Hi Vincent,

On Thu, Mar 23, 2017 at 3:08 PM, Vincent Guittot
<vincent.guit...@linaro.org> wrote:
[..]
>>>
>>>> So I'm not really aligned with the description of your problem: PELT
>>>> metric underestimates the load of the CPU. The PELT is just about
>>>> tracking CFS task utilization but not whole CPU utilization and
>>>> according to your description of the problem (time stolen by irq),
>>>> your problem doesn't come from an underestimation of CFS task but from
>>>> time spent in something else but not accounted in the value used by
>>>> schedutil
>>>
>>> Quite likely. Indeed, it can really be that the CFS task is preempted
>>> because of some RT activity generated by the IRQ handler.
>>>
>>> More in general, I've also noticed many suboptimal freq switches when
>>> RT tasks interleave with CFS ones, because of:
>>> - relatively long down _and up_ throttling times
>>> - the way schedutil's flags are tracked and updated
>>> - the callsites from where we call schedutil updates
>>>
>>> For example it can really happen that we are running at the highest
>>> OPP because of some RT activity. Then we switch back to a relatively
>>> low utilization CFS workload and then:
>>> 1. a tick happens which produces a frequency drop
>>
>> Any idea why this frequency drop would happen? Say a running CFS task
>> gets preempted by RT task, the PELT signal shouldn't drop for the
>> duration the CFS task is preempted because the task is runnable, so
>
> utilization only tracks the running state but not runnable state.
> Runnable state is tracked in load_avg

Thanks. I got it now. Correct me if I'm wrong, but strictly speaking,
utilization for a cfs_rq (which drives the frequency for CFS) still
tracks the blocked/runnable time of tasks, although it's decayed as time
moves forward. Only when we migrate the rq of a CFS task is the util_avg
contribution removed from the rq. But I can see now why running RT can
decay this load tracking signal.

Regards,
Joel