http://defect.opensolaris.org/bz/show_bug.cgi?id=6232
--- Comment #2 from Bill Holler <bill.holler at sun.com> 2009-01-26 13:09:06 ---
(In reply to comment #0)
> usr/src/uts/i86pc/os/cpupm/cpu_idle.c
> - Deep C state and C1 state is represented in mc_haltset. Will it be more
> expensive to wakeup arbitrary CPU in cstate_wakeup() when the CPU passed is
> not in halted cpuset. We may prefer to wakeup CPUs in C1 state rather than
> in deep C state. Line 162-177.

This is http://defect.opensolaris.org/bz/show_bug.cgi?id=4616.
Power savings went up nicely. Unfortunately, performance went down when the
"shallowest idle" CPU is selected. Such is the frustrating nature of trading
performance for power. :-(  We need a better C-state throttle mechanism
before switching this on. (Also, the scheduling algorithm you mentioned tends
to favor scheduling threads onto CPUs with high interrupt loads etc.)

In general we would like to move the dispatcher towards looking at Power
Domains (cores) instead of looking at individual CPUs. CMT load balancing
levels are only aware of their CPUs. Instead, cmt_pgs should be aware of
their child cmt_pgs. The motivation is that, on hyper-threaded architectures,
the C-state of the hardware core is the higher-power (shallowest) C-state of
its sibling CPUs. :-(  The C-state of a CPU alone may not be sufficient to
know its hardware C-state. A higher power savings policy could prefer
consolidating on cores instead of looking for the shallowest idle CPU.
Higher power saving policies are a future OpenSolaris project.

> - We can also consider time-stamping CPU idle loop. If a CPU has been in
> idle state for long then put the CPU in deep-C state. Starting with mwait
> first and then progressing to deep-C sleep state. This way if the CPU were
> to awakened soon then we will not go through expensive deep-C sleep state
> transition.

This will require testing. The current thinking is to attempt to predict
short idle periods and just go to C1 when the system thinks the CPU will not
be able to enter deeper C-states.
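To make the prediction idea above concrete, here is a minimal sketch. It is
only an illustration, not code from cpu_idle.c: the type and function names
(idle_pred_t, idle_pred_update, idle_pred_choose), the 1/4 averaging weight,
and the notion of a single "deep_cost_ns" break-even threshold are all
assumptions. It keeps a running average of recent idle durations and only
picks a deep C-state when that average suggests the CPU will stay idle long
enough to pay for the transition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical idle states; not the identifiers used in cpu_idle.c. */
typedef enum { IDLE_C1, IDLE_DEEP } idle_state_t;

/* Hypothetical per-CPU predictor state. */
typedef struct {
	uint64_t avg_idle_ns;	/* running average of recent idle durations */
	uint64_t deep_cost_ns;	/* assumed break-even time for a deep C-state */
} idle_pred_t;

void
idle_pred_init(idle_pred_t *p, uint64_t deep_cost_ns)
{
	assert(p != NULL);
	p->avg_idle_ns = 0;
	p->deep_cost_ns = deep_cost_ns;
}

/* Fold the last observed idle duration into the average (weight 1/4). */
void
idle_pred_update(idle_pred_t *p, uint64_t last_idle_ns)
{
	assert(p != NULL);
	p->avg_idle_ns = p->avg_idle_ns - p->avg_idle_ns / 4 + last_idle_ns / 4;
}

/* Predict a short idle period: stay in C1 unless the average is long enough. */
idle_state_t
idle_pred_choose(const idle_pred_t *p)
{
	assert(p != NULL);
	return (p->avg_idle_ns >= p->deep_cost_ns ? IDLE_DEEP : IDLE_C1);
}
```

In this sketch the idle loop would call idle_pred_choose() on entry and
idle_pred_update() on wakeup with the measured idle time. A real policy would
likely also need to account for pending interrupts and timers.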
We have spent the last few months testing different idle policy algorithms.
We will probably put back more changes in this area before the Nevada
putback.

> - I think we would want to consider the size of the system while waking up
> CPUs and/or runq of active CPUs. For instance, on a laptop/desktop system,
> we shouldn't end up waking up other CPUs through setbackdq()/setfrontdq()
> if there is a sudden burst of workload (callback disp_enq_thread is invoked
> whenever there is a thread being enqueued in the run queue). I guess runq
> balance code may take care of this but just checking. On large systems it
> will have cascading effect but I guess we have to do better in numbers and
> saving power is secondary at that point of time.

As you noted, maintaining performance is the highest priority because deep
C-states will be enabled by default. Currently the scheduler looks through
the CMT levels for thread placement based on LOAD_BALANCING or COALESCING at
each level. (A level is something like a shared socket, shared cache, shared
pipeline etc.) Certainly a "prefer-power" policy may use different
scheduling policies. We are actively investigating other algorithms.
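As a rough illustration of the per-level placement choice mentioned above,
the sketch below picks a child group at one level under either policy. This
is not the dispatcher's code: group_t, level_policy_t and choose_child() are
hypothetical names, and the interpretation (load balancing prefers the least
busy child, coalescing prefers the busiest child that still has a free CPU)
is an assumption about what the two policies mean.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical policy names modelled on LOAD_BALANCING / COALESCING. */
typedef enum { POLICY_LOAD_BALANCING, POLICY_COALESCING } level_policy_t;

/* Hypothetical child group at one level (e.g. a core within a socket). */
typedef struct {
	unsigned running;	/* threads currently running in this group */
	unsigned capacity;	/* hardware threads (CPUs) in this group */
} group_t;

/*
 * Return the index of the child group to place a new thread on,
 * or -1 if every child is already fully busy.
 */
int
choose_child(const group_t *children, size_t n, level_policy_t policy)
{
	int best = -1;

	assert(children != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		const group_t *g = &children[i];

		if (g->running >= g->capacity)
			continue;	/* no idle CPU left in this group */
		if (best < 0) {
			best = (int)i;
			continue;
		}
		if (policy == POLICY_LOAD_BALANCING) {
			/* spread out: prefer the least busy group */
			if (g->running < children[best].running)
				best = (int)i;
		} else {
			/* consolidate: prefer the busiest group with room */
			if (g->running > children[best].running)
				best = (int)i;
		}
	}
	return (best);
}
```

With three two-CPU cores running 1, 0 and 2 threads, load balancing would
pick the idle core, while coalescing would pick the half-busy one and leave
the idle core free to stay in a deep C-state.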
