[tesla-dev] Adaptive Optimizations page update (ML)

Eric Saxe Wed, 06 Jun 2007 15:24:37 -0700

David Vengerov wrote:
> Eric Saxe wrote:
>
>> David Vengerov wrote:
>>
>>> Eric Saxe wrote:
>>>
>>>> If there are N or more running threads in an N-CPU system, then 
>>>> utilization is at 100%. Generally, I'm thinking that the CPUs 
>>>> should all be clocked up, if we want to maximize performance. 
>>>> There's not much opportunity to squander power in this case. It's 
>>>> really only the "partial utilization" scenario, where power is 
>>>> being directed at the part of the system that isn't being used. 
>>>
>>> If you hold this view, then the policy I described previously that 
>>> decides on the clock rate of idle CPUs based on how their number has 
>>> fluctuated in the past should be very relevant and allow us to find 
>>> the desired tradeoff between maximizing the system's performance 
>>> (by never clocking down the idle CPUs) and minimizing the power 
>>> consumption (by running idle CPUs at the lowest power and increasing 
>>> their clock rate only when they receive some threads). The policy I 
>>> am envisioning should be able to clock down CPUs to different 
>>> extents based on estimated probabilities of some of them receiving 
>>> threads in the near future.
>>
>>
>> Where I think this is especially relevant is where there exists a 
>> non-trivial amount of time to bring online additional resources 
>> (latency). If ML techniques can help predict when utilization will 
>> increase/decrease, then perhaps the latency associated with bringing 
>> online/offline the additional capacity can be better hidden. In the 
>> data center context, where the unit of power management may be 
>> suspended / powered off systems, this could be significant. 
>
> Right. What latency do you expect in AMD and Intel systems?


For P-state (frequency/voltage scaling) of processors, I believe the 
latencies involved are very low.

> Do you think it will be more efficient for the power management 
> algorithm to poll CPUs at regular time intervals enquiring about their 
> workload, or would it be better for each CPU to notify the algorithm 
> immediately whenever it becomes idle and whenever it switches from 
> idle to working state? 

I believe that not polling would be preferable. Otherwise, during the lag 
between the utilization change and the time when the power is managed, 
either performance or power efficiency would be left on the table.
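
To put a rough number on that lag, here is a minimal sketch. It is only an
illustration: the function names are made up, polls are assumed to fire at
fixed multiples of the interval, and the notification path is assumed to
cost nothing, which a real system would not quite achieve.

```python
def polling_lag_us(change_time_us: int, poll_interval_us: int) -> int:
    """Microseconds between a utilization change and the first poll that sees it.

    Assumption: polls fire at t = 0, T, 2T, ... where T = poll_interval_us.
    """
    if poll_interval_us <= 0:
        raise ValueError("poll_interval_us must be positive")
    remainder = change_time_us % poll_interval_us
    # A change that lands exactly on a poll is seen immediately.
    return 0 if remainder == 0 else poll_interval_us - remainder


def notification_lag_us(change_time_us: int) -> int:
    """Assumption: a CPU-initiated notification is delivered with no delay."""
    return 0


def mean_polling_lag_us(change_times_us, poll_interval_us):
    """Average lag over a list of utilization change times."""
    lags = [polling_lag_us(t, poll_interval_us) for t in change_times_us]
    return sum(lags) / len(lags)
```

With changes spread evenly across the interval, the average polling lag
comes out near half the interval, and the worst case is just under one
full interval.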

> In the former case, the time interval between polls will result in an 
> additional latency before the CPU power can be changed. In the latter 
> case no additional latency will be experienced. However, there might 
> still be a cost associated with changing the CPU clock frequency. Do 
> you envision any such cost being present? 

Changing the CPU clock frequency is a fairly lightweight operation, I 
believe.
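
For the policy quoted near the top of this message (clocking idle CPUs
down to different extents based on the estimated probability that they
receive threads soon), one possible shape is sketched below. Everything
in it is an assumption made for illustration: the estimator (a plain
frequency count over past idle-CPU samples), the function names, and the
mapping from probability to clock level are not taken from the thread.

```python
def wake_probabilities(idle_history, idle_now):
    """Estimate, for j = 1..idle_now, the chance that at least j of the
    currently idle CPUs will be needed.

    Assumption: the chance is approximated by the fraction of past samples
    in which at most (idle_now - j) CPUs were idle.
    """
    n = len(idle_history)
    if n == 0:
        return [0.0] * idle_now
    return [
        sum(1 for h in idle_history if h <= idle_now - j) / n
        for j in range(1, idle_now + 1)
    ]


def pick_frequencies(probs, freqs_mhz):
    """Map each probability to a clock level: higher probability, higher clock.

    Assumption: the available levels split the [0, 1] range into equal bands.
    """
    levels = sorted(freqs_mhz)
    chosen = []
    for p in probs:
        idx = min(int(p * len(levels)), len(levels) - 1)
        chosen.append(levels[idx])
    return chosen


# Hypothetical data: samples of how many CPUs were idle, and three clock levels.
history = [4, 3, 3, 2, 4, 1, 3, 4]
probs = wake_probabilities(history, idle_now=4)
plan = pick_frequencies(probs, freqs_mhz=[1000, 1800, 2600])
```

The idle CPU most likely to be needed stays at a higher clock, while the
ones unlikely to receive work drop to the lowest level.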

-Eric
