[tesla-dev] Adaptive Optimizations page update (ML)

Mark Haywood Wed, 06 Jun 2007 21:22:09 -0400

Dana H. Myers wrote:
> Eric Saxe wrote:
>> David Vengerov wrote:
>>> Eric Saxe wrote:
>>>
>>>> David Vengerov wrote:
>>>>
>>>>> Eric Saxe wrote:
>>>>>
>>>>>> If there are N or more running threads in an N-CPU system, then 
>>>>>> utilization is at 100%. Generally, I'm thinking that the CPUs 
>>>>>> should all be clocked up, if we want to maximize performance. 
>>>>>> There's not much opportunity to squander power in this case. It's 
>>>>>> really only the "partial utilization" scenario, where power is 
>>>>>> being directed at the part of the system that isn't being used. 
>>>>>
>>>>> If you hold this view, then the policy I described previously that 
>>>>> decides on the clock rate of idle CPUs based on how their number 
>>>>> has fluctuated in the past should be very relevant and allows us to 
>>>>> find the desired tradeoff between maximizing the system's 
>>>>> performance (by never clocking down the idle CPUs) and minimizing 
>>>>> the power consumption (by running idle CPUs at the lowest power and 
>>>>> increasing their clock rate only when they receive some threads). 
>>>>> The policy I am envisioning should be able to clock down CPUs to 
>>>>> different extents based on estimated probabilities of some of them 
>>>>> receiving threads in the near future.
>>>>
>>>>
>>>> Where I think this is especially relevant is where there exists a 
>>>> non-trivial amount of time to bring online additional resources 
>>>> (latency). If ML techniques can help predict when utilization will 
>>>> increase/decrease, then perhaps the latency associated with bringing 
>>>> online/offline the additional capacity can be better hidden. In the 
>>>> data center context, where the unit of power management may be 
>>>> suspended / powered off systems, this could be significant. 
>>>
>>> Right. What latency do you expect in AMD and Intel systems? 
>>
>> For P-state (frequency/voltage scaling) of processors, I believe the 
>> latencies involved are very low.
> Doesn't it depend on the specific chip?  I believe the Intel Enhanced 
> SpeedStep parts are pretty quick to change state, but Opteron changes 
> are more involved.
> Mark - any observations here?


Yes, it does depend upon the chip. Intel claims that its newer chips 
have much lower P-state transition latency than AMD's, which is why 
Intel believes that changing P-states multiple times a second is 
optimal. I don't know the numbers off the top of my head though.
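
For illustration only, the idle-CPU policy David describes in the quoted 
text above could be sketched roughly as follows. This is a hypothetical 
Python sketch, not code from the project: the class name 
IdleClockPolicy, the sliding-window probability estimate, and the 
example clock rates are all assumptions made for the example. It picks 
a lower clock rate for an idle CPU the less likely that CPU is to 
receive a thread, judged from how the busy-CPU count has fluctuated 
recently.

```python
from collections import deque


class IdleClockPolicy:
    """Hypothetical sketch: choose clock rates for idle CPUs from history."""

    def __init__(self, ncpus, pstates, window=100):
        self.ncpus = ncpus
        # Available clock rates in MHz, lowest first (assumed values).
        self.pstates = sorted(pstates)
        # Sliding window of recently observed busy-CPU counts.
        self.history = deque(maxlen=window)

    def observe(self, busy):
        """Record how many CPUs were busy at the latest sample."""
        self.history.append(busy)

    def prob_needed(self, k):
        """Fraction of recent samples in which at least k CPUs were busy."""
        if not self.history:
            return 1.0  # no data yet: assume the CPU will be needed
        hits = sum(1 for b in self.history if b >= k)
        return hits / len(self.history)

    def plan(self, busy_now):
        """Return one clock rate per CPU, the k-th entry for the k-th CPU used."""
        rates = []
        for k in range(1, self.ncpus + 1):
            if k <= busy_now:
                # Busy CPUs always run at the highest clock rate.
                rates.append(self.pstates[-1])
            else:
                # Idle CPUs: map the estimated probability onto a P-state index.
                p = self.prob_needed(k)
                idx = round(p * (len(self.pstates) - 1))
                rates.append(self.pstates[idx])
        return rates


policy = IdleClockPolicy(ncpus=4, pstates=[1000, 1800, 2600], window=10)
for busy in [1, 1, 2, 2, 2, 3, 1, 2, 4, 2]:
    policy.observe(busy)
print(policy.plan(busy_now=1))  # prints [2600, 1800, 1000, 1000]
```

A real policy would also have to weigh the per-chip P-state transition 
latency discussed above, which this sketch ignores.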
