Mark Haywood wrote:
> Dana H. Myers wrote:
>> Eric Saxe wrote:
>>> David Vengerov wrote:
>>>> Eric Saxe wrote:
>>>>
>>>>> David Vengerov wrote:
>>>>>
>>>>>> Eric Saxe wrote:
>>>>>>
>>>>>>> If there are N or more running threads in an N-CPU system, then
>>>>>>> utilization is at 100%. Generally, I'm thinking that the CPUs
>>>>>>> should all be clocked up, if we want to maximize performance.
>>>>>>> There's not much opportunity to squander power in this case. It's
>>>>>>> really only the "partial utilization" scenario, where power is
>>>>>>> being directed at the part of the system that isn't being used.
>>>>>>
>>>>>> If you hold this view, then the policy I described previously that
>>>>>> decides on the clock rate of idle CPUs based on how their number
>>>>>> has fluctuated in the past should be very relevant and allows us
>>>>>> to find the desired tradeoff between maximizing the system's
>>>>>> performance (by never clocking down the idle CPUs) and minimizing
>>>>>> the power consumption (by running idle CPUs at the lowest power
>>>>>> and increasing their clock rate only when they receive some
>>>>>> threads). The policy I am envisioning should be able to clock down
>>>>>> CPUs to different extents based on estimated probabilities of some
>>>>>> of them receiving threads in the near future.
>>>>>
>>>>>
>>>>> Where I think this is especially relevant is where there exists a
>>>>> non-trivial amount of time to bring online additional resources
>>>>> (latency). If ML techniques can help predict when utilization will
>>>>> increase/decrease, then perhaps the latency associated with
>>>>> bringing online/offline the additional capacity can be better
>>>>> hidden. In the data center context, where the unit of power
>>>>> management may be suspended / powered off systems, this could be
>>>>> significant.
>>>>
>>>> Right. What latency do you expect in AMD and Intel systems?
>>>
>>> For P-state (frequency/voltage scaling) of processors, I believe the
>>> latencies involved are very low.
>> Doesn't it depend on the specific chip? I believe the Intel Enhanced
>> Speed Step are pretty quick to change state, but Opteron changes are
>> more involved.
>> Mark - any observations here?
>
> Yes, it does depend upon the chip. Intel claims to have much less
> latency than AMD with their newer chips, which is why they believe that
> changing P-states multiple times a second is optimal. I don't know the
> numbers off the top of my head though.

Actually, I lie. Off the top of my head I was pretty sure that Intel was
claiming 10 milliseconds, and I just googled to confirm it:

http://www.intel.com/cd/ids/developer/asmo-na/eng/195910.htm?page=4

I don't know about Opteron. I haven't measured it. I'm not sure it
matters, as I don't believe we'll implement P-state support for Opteron.
Greyhound though ...
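
For illustration only, the idle-CPU policy discussed in the quoted thread
(clock idle CPUs down to different extents, depending on how likely each is
to receive a thread soon) might be sketched roughly as below. Everything in
it is an assumption made for the sketch: the class and method names, the
sliding window of idle-CPU counts, the three clock levels, and the linear
mapping from probability to clock level. None of it comes from an actual
implementation.

```python
from collections import deque


class IdleCpuClockPolicy:
    """Hypothetical policy sketch; not taken from any real power manager."""

    def __init__(self, clock_levels, window=100):
        self.clock_levels = sorted(clock_levels)   # e.g. MHz, slowest first
        self.idle_history = deque(maxlen=window)   # recent idle-CPU counts

    def observe(self, n_idle):
        """Record one sample of how many CPUs were idle."""
        self.idle_history.append(n_idle)

    def wake_probability(self, n_idle_now, k):
        """Estimate the chance that the k-th idle CPU (1-based) gets a thread.

        That CPU is needed whenever the idle count drops to n_idle_now - k
        or below, so use the fraction of past samples where that happened.
        """
        if not self.idle_history:
            return 1.0  # no evidence yet: assume it will be needed
        hits = sum(1 for n in self.idle_history if n <= n_idle_now - k)
        return hits / len(self.idle_history)

    def clock_plan(self, n_idle_now):
        """Return one clock level per idle CPU, likeliest-to-wake first."""
        n_levels = len(self.clock_levels)
        plan = []
        for k in range(1, n_idle_now + 1):
            p = self.wake_probability(n_idle_now, k)
            # Assumed mapping: higher wake probability, higher clock level.
            index = min(int(p * n_levels), n_levels - 1)
            plan.append(self.clock_levels[index])
        return plan


policy = IdleCpuClockPolicy([800, 1600, 2400])
for sample in [1, 1, 2, 3, 1, 2, 1, 2, 1, 2]:
    policy.observe(sample)
print(policy.clock_plan(3))
```

With no history the sketch keeps every idle CPU at the highest level, which
is the "never clock down" end of the tradeoff. A longer window, or a
different mapping from probability to level, would shift it toward the
power-saving end.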
