On 06/07/2014 03:07 AM, Rafael J. Wysocki wrote:
> On Wednesday, June 04, 2014 03:17:00 AM Srivatsa S. Bhat wrote:
>> Cpufreq governors like the ondemand governor calculate the load on the CPU
>> periodically by employing deferrable timers. A deferrable timer won't fire
>> if the CPU is completely idle (and there are no other timers to be run), in
>> order to avoid unnecessary wakeups and thus save CPU power.
>>
>> However, the load calculation logic is agnostic to all this, and this can
>> lead to the problem described below.
>>
>>
>> Time (ms)          CPU 1
>>
>> 100                Task-A running
>>
>> 110                Governor's timer fires, finds load as 100% in the last
>>                    10ms interval and increases the CPU frequency.
>>
>> 110.5              Task-A running
>>
>> 120                Governor's timer fires, finds load as 100% in the last
>>                    10ms interval and increases the CPU frequency.
>>
>> 125                Task-A went to sleep. With nothing else to do, CPU 1
>>                    went completely idle.
>>
>> 200                Task-A woke up and started running again.
>>
>> 200.5              Governor's deferred timer (which was originally
>>                    programmed to fire at time 130) fires now. It calculates
>>                    load for the time period 120 to 200.5, and finds the load
>>                    is almost zero. Hence it decreases the CPU frequency to
>>                    the minimum.
>>
>> 210                Governor's timer fires, finds load as 100% in the last
>>                    10ms interval and increases the CPU frequency.
>>
>>
>> So, after the workload woke up and started running, the frequency was
>> suddenly dropped to absolute minimum, and after that, there was an
>> unnecessary delay of 10ms (sampling period) to increase the CPU frequency
>> back to a reasonable value. And this pattern repeats for every
>> wake-up-from-cpu-idle for that workload. This can be quite undesirable for
>> latency- or response-time sensitive bursty workloads. So we need to fix the
>> governor's logic to detect such wake-up-from-cpu-idle scenarios and start
>> the workload at a reasonably high CPU frequency.
>>
>> One extreme solution would be to fake a load of 100% in such scenarios. But
>> that might lead to undesirable side-effects such as frequency spikes (which
>> might also need voltage changes) especially if the previous frequency
>> happened to be very low.
>>
>> We just want to avoid the stupidity of dropping down the frequency to a
>> minimum and then enduring a needless (and long) delay before ramping it up
>> back again. So, let us simply carry forward the previous load - that is,
>> let us just pretend that the 'load' for the current time-window is the same
>> as the load for the previous window. That way, the frequency and voltage
>> will continue to be set to whatever values they were set at previously.
>> This means that bursty workloads will get a chance to influence the CPU
>> frequency at which they wake up from cpu-idle, based on their past
>> execution history. Thus, they might be able to avoid suffering from slow
>> wakeups and long response-times.
>>
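
For readers of the archive, the carry-forward idea in the paragraph above can
be sketched in a few lines of C. This is only a minimal sketch, not the code
from the patch: the struct, the function name compute_load(), the "window
longer than twice the sampling period" test for spotting a deferred timer,
and the one-shot reuse of the saved load are all assumptions made for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU sampling state; names are illustrative only. */
struct cpu_sample_state {
	unsigned int prev_load;   /* load (0..100) measured in the last window */
	bool have_prev_load;      /* true if prev_load may be carried forward */
};

/*
 * wall_us:     time elapsed since the previous sample
 * idle_us:     portion of wall_us the CPU spent idle
 * sampling_us: nominal sampling period of the governor
 *
 * Returns the load (0..100) the governor should act on.
 */
unsigned int compute_load(struct cpu_sample_state *st, uint64_t wall_us,
			  uint64_t idle_us, uint64_t sampling_us)
{
	unsigned int load;

	assert(wall_us > 0 && idle_us <= wall_us && sampling_us > 0);

	/*
	 * Assumption: a window much longer than the sampling period means
	 * the deferrable timer was held back while the CPU was idle.
	 */
	if (wall_us > 2 * sampling_us && st->have_prev_load) {
		/* Pretend this window looked like the previous one. */
		load = st->prev_load;
		/* Use the saved value once, so stale history is not reused. */
		st->have_prev_load = false;
	} else {
		/* Normal case: busy time as a percentage of the window. */
		load = (unsigned int)(100 * (wall_us - idle_us) / wall_us);
		st->prev_load = load;
		st->have_prev_load = true;
	}

	return load;
}
```

With the timeline quoted earlier (10ms sampling period), the window from 120
to 200.5 contains 5.5ms of busy time out of 80.5ms, which a plain measurement
would report as under 7% load. The sketch returns the previous window's 100%
instead, so the frequency is left where it was.
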
>> [ The right way to solve this problem is to teach the CPU frequency
>> governors to track load on a per-task basis, not a per-CPU basis, and set
>> the appropriate frequency on whichever CPU the task executes. But that
>> involves redesigning the cpufreq subsystem, so this patch should make the
>> situation bearable until then. ]
>>
>> Experimental results:
>> ====================
> 
> This formatting of the changelog evidently confused Patchwork.
> 

Oh, I didn't realize that that would create problems!

> That's not a big deal, but please try to avoid that in the future if possible.
> 

Sorry, I'll be careful next time. Thanks for letting me know!

Regards,
Srivatsa S. Bhat

