[tesla-dev] CPU power management policies

David Vengerov Mon, 11 Jun 2007 13:01:41 -0700

Mark Haywood wrote:

> Again I'm not following. Can you describe a pure utility curve and 
> explain how it would be presented to the system for its consumption?


The utility curve can be provided by the user/administrator by clicking 
on one of the graphic images presented by the system, where each image 
depicts a concave, convex, or linear curve starting at (0,0) and ending 
at (1,1). The x-axis is the power level at which the user wants the 
system to operate (as a fraction of the maximum power level) and the 
y-axis is the performance the user wants the system to achieve (as a 
fraction of the maximum performance). This curve represents the set of 
(performance, power) points among which the user is indifferent (in 
economic terms it can be called an "indifference" curve or an 
"Efficient Frontier"), thus allowing the power management algorithm to 
make tradeoffs such as deciding whether it is worth "compacting" the 
workload onto fewer CPUs, losing some performance but saving some power 
as a result. To make such a tradeoff, the algorithm would compute the 
expected power level following the decision, plug it into the utility 
curve to obtain the performance required at that power level, and then 
make the "compacting" decision only if the predicted performance is 
higher than what the "indifference" curve requires. The big question 
is: how can the system's 
performance be computed starting from a particular thread-to-CPU 
allocation? That's where reinforcement learning (RL) comes in, as it 
allows one to use performance feedback to *learn* the mapping from 
currently observed variables (number of free CPUs, average utilization 
of occupied CPUs, etc.) into expected future performance.
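As a rough illustration of the decision rule described above, here is a 
minimal Python sketch. Everything in it is an assumption made for the 
example: the function names, the use of power functions to get the 
concave and convex shapes, and the exponent value are hypothetical, and 
the predicted performance is simply passed in rather than learned.

```python
# Hypothetical sketch of the utility-curve check; names and curve
# formulas are illustrative assumptions, not an existing interface.

def make_utility_curve(shape, exponent=2.0):
    """Return f(power) -> required performance, both as fractions of
    their maximum, with f(0) == 0 and f(1) == 1."""
    if exponent <= 1.0:
        raise ValueError("exponent must be greater than 1")
    if shape == "linear":
        return lambda power: power
    if shape == "concave":
        # Lies above the diagonal: demands more performance per unit power.
        return lambda power: power ** (1.0 / exponent)
    if shape == "convex":
        # Lies below the diagonal: tolerates larger performance losses.
        return lambda power: power ** exponent
    raise ValueError("unknown curve shape: %r" % (shape,))


def should_compact(curve, expected_power, predicted_performance):
    """Accept the compacting decision only if the predicted performance
    is higher than what the curve requires at the new power level."""
    if not 0.0 <= expected_power <= 1.0:
        raise ValueError("expected_power must be a fraction in [0, 1]")
    required_performance = curve(expected_power)
    return predicted_performance > required_performance
```

For example, with a convex curve and an expected power level of 0.5, 
the required performance is 0.25, so a predicted performance of 0.3 
would justify compacting, while the same prediction would be rejected 
under a linear curve.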

>
> I'm not sure (because I'm not sure what feedback you're expecting), 
> but I doubt it. 

I would like to receive real-valued feedback (something correlated 
with the system's performance) at the same frequency at which the 
system can potentially make decisions.
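To show how such a feedback signal could be consumed, here is a minimal 
Python sketch of one possible learner: TD(0) with a linear function 
approximator. This particular method is chosen only for illustration, 
since nothing above fixes which RL algorithm would be used. The class 
name, the feature list, the step size, and the discount factor are all 
hypothetical.

```python
# Hypothetical sketch: a linear predictor of expected future performance,
# updated once per decision interval from a real-valued feedback signal.

class LinearPerformancePredictor:
    def __init__(self, n_features, step_size=0.1, discount=0.9):
        self.weights = [0.0] * n_features
        self.step_size = step_size
        self.discount = discount

    def predict(self, features):
        """Expected discounted future performance for the observed
        variables (e.g. free CPUs, average utilization), as a dot product."""
        return sum(w * x for w, x in zip(self.weights, features))

    def update(self, features, feedback, next_features):
        """One TD(0) step: move the prediction for `features` toward
        feedback + discount * prediction for `next_features`."""
        target = feedback + self.discount * self.predict(next_features)
        td_error = target - self.predict(features)
        self.weights = [
            w + self.step_size * td_error * x
            for w, x in zip(self.weights, features)
        ]
        return td_error
```

Each call to update() needs exactly one feedback value per decision 
interval, which is why the rate at which the system can supply that 
value matters.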

> The Solaris PM framework doesn't scan often enough (usually once every 
> 15 seconds) and incurs a lot of overhead for our task. 

Once every 15 seconds is definitely not enough.

> I really think we're going to need a standalone driver that manages 
> the CPUs independent of the PM framework. I'm working on it now. Of 
> course, how we're going to make this all work with the PM framework in 
> the future is another discussion. Just in case you're interested in 
> the Solaris PM framework, you can find the source at 
> usr/src/uts/os/sunpm.c. There's also the pm driver (provides the 
> ioctls I mentioned) at usr/src/uts/common/io/pm.c. 

OK, please let us know when you finish implementing the new 
functionality and whether it will be able to provide more frequent 
performance feedback.

David
