Would cpu_pm_policy essentially be treated as a cap on the register setting? That is, if the cpu_pm_policy setting maps to an MSR setting of 9, does this mean the OS would then only dynamically choose MSR values between 9 and 15?
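To make the question concrete, here is a minimal sketch of the "cap" reading. It assumes the policy value acts as a floor on the 4-bit energy/performance hint, where 0 favors performance and 15 favors energy saving. The helper name epb_clamp_to_policy and the EPB_MAX constant are hypothetical and are not taken from the webrev.

```c
#include <assert.h>
#include <stdint.h>

/* Largest value of the 4-bit hint: 0 favors performance, 15 favors energy saving. */
#define EPB_MAX 15u

/*
 * Hypothetical helper: treat the policy value as a floor on the hint,
 * i.e. a cap on how performance-biased the OS may go at runtime.
 */
static uint8_t
epb_clamp_to_policy(uint8_t policy_floor, uint8_t os_choice)
{
	assert(policy_floor <= EPB_MAX);

	/* Keep the OS choice inside the valid 4-bit range first. */
	if (os_choice > EPB_MAX)
		os_choice = EPB_MAX;

	/* Never go below the floor set by the policy. */
	return (os_choice < policy_floor ? policy_floor : os_choice);
}
```

With a policy floor of 9, a dynamic OS choice of 3 would be raised to 9, while a choice of 12 would pass through unchanged.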
-- jdh

Bill Holler wrote:
> Hi,
>
> I forgot to mention that cpu_pm_policy is just a policy.
> There is no guarantee it maps to a specific MSR or hardware
> implementation.
>
> For example, Solaris could be dynamically setting the
> ENERGY_PERFORMANCE_BIAS register to different
> settings depending on things such as system load, the
> priority of the application being scheduled, a power policy
> of the application, or the power policy of the zone.
>
> Regards,
> Bill
>
>
> On 03/03/10 16:21, Bill Holler wrote:
>> +1.
>>
>> Hi Aubrey,
>>
>> I also think it is time to move forward with this proposal.
>> Generally we want the system to work best "out of the box"
>> with no tuning. On the other hand, vendors will keep
>> improving products with new features, and there will
>> always be some specific applications where custom settings
>> may be better. I feel this proposal supports innovation and
>> application-specific customization in line with the
>> OpenSolaris community goals.
>>
>> This proposal applies to all types of CPUs. It uses
>> "cpu_pm_policy" instead of, for example, mentioning a
>> specific CPU's MSR. ;-) This proposal will be useful
>> with other CPUs if/when they have hardware mechanisms
>> for tuning power / performance.
>>
>>
>> In the ARC case we want to mention that there could
>> be a policy conflict between this component setting and
>> a system-power-policy, external Power Capping, etc.
>> Generally we want users to use the default or a higher
>> level policy such as the system power policy.
>> Unfortunately the system power policy may not be
>> fine-grained or diverse enough for some applications to
>> specify CPU power policy. In that case cpu_pm_policy
>> will be useful. My thought is: the user must really know
>> what they want if they specify a component policy
>> such as cpu_pm_policy instead of just using the
>> system power policy. For that reason I feel cpu_pm_policy
>> should override the system-power-policy at the cpupm level.
>>
>> Power Capping is different. Power Capping is an external
>> policy. It is currently "owned" by the SP external to the
>> OS. Power Capping should override a local cpu_pm_policy.
>>
>>
>> Implementation comments:
>> IMHO the mcpu_pm_policy pointer should be in the
>> mcpu_pm_mach_state structure instead of in the machcpu.
>> We may want to allow the user to specify a number
>> instead of just Perf, Balanced, Power, Default?
>>
>> Regards,
>> Bill
>>
>>
>> On 02/20/10 18:43, Li, Aubrey wrote:
>>> Hi Bill,
>>>
>>> I think it's time to continue this proposal, since b134 is closed and
>>> the build is not limited now. The power/perf bias setting is a starting
>>> point for future power-related work. I'll prepare a PSARC file for the
>>> new option if this is acceptable. No is also a good answer with good
>>> reason.
>>>
>>> Thanks,
>>> -Aubrey
>>>
>>>
>>>> Bill.Holler Wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> This proposal is for a mechanism to set the new MSR
>>>>> IA32_ENERGY_PERF_BIAS_MSR. This is a new hardware
>>>>> feature. The MSR affects overall power/performance.
>>>>> It gives a hint to the processor & package for desired
>>>>> power/performance characteristics. It is related to p-states
>>>>> and c-states (and may affect these features), but this feature
>>>>> can have other socket/system-level effects as well.
>>>>> The programmer's guides do not go into detail about what the
>>>>> other effects can be. :-(
>>>>>
>>>> The perf and power impact of this MSR is model specific.
>>>> It's able to throttle turbo on WSM and will probably help the
>>>> hardware make more decisions in the future. For example, when a short
>>>> interrupt storm is detected, it can demote a CC6 request to CC3.
>>>>
>>>>
>>>>> On 11/05/09 05:15, minskey guo wrote:
>>>>>
>>>>>> Jedy Wang wrote:
>>>>>>
>>>>>>> Hi Li,
>>>>>>>
>>>>>>> As far as I know, gnome-power-manager has removed the support for
>>>>>>> changing governor which is the same as profile I think.
>>>>>>> I remember someone wrote a blog explaining the reason but I can not
>>>>>>> find it now. I wonder what makes us still need to implement this
>>>>>>> feature.
>>>>>>>
>>>>>> In the Linux world, there is an ondemand governor in the kernel. It
>>>>>> sets the CPU frequency according to the CPU's current load. So some
>>>>>> people consider that everybody should use that governor, and let CPUs
>>>>>> finish their jobs ASAP and then enter C-states for power saving.
>>>>>> Compared to P-states, C-states do save more power. That's why gnome
>>>>>> removed it.
>>>>>>
>>>> This is also model specific and depends on whether the frequency,
>>>> voltage and power are linear. That's true on the latest processors but
>>>> not on earlier processors.
>>>>
>>>> I'm not sure why gnome removed it, but it does not seem a good idea to
>>>> me. Some users want max perf and others want longer battery life.
>>>>
>>>>
>>>>> Yes, a good p-state + c-state implementation is not easy
>>>>> to tune for more power savings. Running in lower p-states
>>>>> when a CPU is busy burns more power due to shorter time
>>>>> in deeper C-states. Entering deeper C-states too aggressively
>>>>> also burns more power (on both an idle and busy system) due
>>>>> to unnecessary wakeup latency. ;-) Without knowing the
>>>>> details, it seems likely that the gnome-power-manager
>>>>> was removed because setting it made worse decisions
>>>>> than a runtime prediction.
>>>>>
>>>>>
>>>>> Solaris currently has mechanisms to turn P-state and
>>>>> deeper C-state support on/off.
>>>>>
>>>>> A requirement is that the Energy Perf Bias MSR can be
>>>>> set on systems not running a GUI. We would like to support
>>>>> a possible future Gnome interface to set this MSR if/when it
>>>>> exists. The proposal provides a mechanism that works on
>>>>> systems without Gnome.
>>>>>
>>>> Right, most servers do not run gnome.
>>>> I don't expect gnome support, but it would be great if it happens. :-)
>>>>
>>>> IMHO, we should use this global cpu power policy setting instead of
>>>> "cpupm" and "cpu-deep-idle"; this is more friendly to the user. The
>>>> users just want more perf or more power savings; I think they don't
>>>> care whether the system supports p/c-states at the same time. "cpupm"
>>>> is confusing because it covers only p-states. We named it "cpupm"
>>>> before we had deep idle support. Actually cpu-deep-idle is also one
>>>> part of cpu power management. :)
>>>>
>>>>>
>>>>>> But some people don't care about power saving when comparing it to
>>>>>> other factors. For example, if you are plagued by the noise of the
>>>>>> CPU fan and want it quiet, then you can lower the CPU frequency,
>>>>>> which results in lower heat, and then the fan can be stopped.
>>>>>>
>>>>>> Personally, I vote +1 for this project if I could vote, but I don't
>>>>>> like the names of "perf-bias" etc. :)
>>>>>>
>>>>>>
>>>>>> Besides, can somebody tell me where IA32_ENERGY_PERF_BIAS_MSR
>>>>>> comes from? Is it a part of the IPS feature?
>>>>>>
>>>>> Intel's Software Developer's Manual volume 2A describes
>>>>> CPUID detection of IA32_ENERGY_PERF_BIAS_MSR
>>>>> and volume 3A describes the MSR.
>>>>> http://www.intel.com/products/processor/manuals/
>>>>> Sorry, I do not know what IPS stands for.
>>>>>
>>>> cough, cough, IPS is not a released feature and should not be discussed
>>>> here. ;p
>>>>
>>>> Thanks,
>>>> -Aubrey
>>>>
>>>>
>>>>> Regards,
>>>>> Bill
>>>>>
>>>>>
>>>>>
>>>>>> -minskey
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> I remember we already support 2 profiles through gnome-power-manager
>>>>>>> on Solaris. What's the difference between them?
>>>>>>>
>>>>>>> I do not understand the exact meaning of perf-bias, balanced and
>>>>>>> power-bias either. Doesn't perf-bias mean the cpu frequency will
>>>>>>> always be at the highest level?
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Jedy
>>>>>>> On Wed, 2009-11-04 at 08:47 +0800, Li, Aubrey wrote:
>>>>>>>
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> When we enabled the Intel energy performance bias feature, we found
>>>>>>>> that a power profile implementation is necessary. Here I did a draft
>>>>>>>> for a cpu level power policy.
>>>>>>>> http://cr.opensolaris.org/~aubrey/cpu_power_policy_v1/
>>>>>>>>
>>>>>>>> The proposal adds a new keyword to /etc/power.conf,
>>>>>>>> "cpu-power-policy", and we have 4 options for this new keyword:
>>>>>>>> 1) perf-bias
>>>>>>>> 2) balanced
>>>>>>>> 3) power-bias
>>>>>>>> 4) default, the same as perf-bias.
>>>>>>>>
>>>>>>>> /etc/power.conf accepts the user input and passes the preferred
>>>>>>>> policy to the kernel through ioctl. Then pm_ioctl calls the callback
>>>>>>>> to walk a cpu power policy list. Every cpu pm feature which wants to
>>>>>>>> be adjusted by this option and is verified to be supported will
>>>>>>>> register its callback function to the list, so that it can be called
>>>>>>>> and adjusted by pmconfig.
>>>>>>>> --------------------------------------------------------
>>>>>>>> /etc/power.conf
>>>>>>>>       |
>>>>>>>> pm_ioctl(cpu_power_policy, policy)
>>>>>>>>       |
>>>>>>>> cpu_power_policy_callb (policy)
>>>>>>>>       |
>>>>>>>>       ----> registered pm feature callback 1 (ENERGY_PERF_BIAS)
>>>>>>>>       |
>>>>>>>>       ----> registered pm feature callback 2
>>>>>>>>       ...
>>>>>>>> --------------------------------------------------------
>>>>>>>> Currently, only the energy_perf_bias feature is registered, because
>>>>>>>> my intention is to support adjusting the energy_perf_bias MSR without
>>>>>>>> reboot. I guess we can probably add p/t/c-state support later. When
>>>>>>>> we add p/t/c-state support, my quick thought is that this option will
>>>>>>>> override the "cpupm" and "cpu-deep-idle" settings.
>>>>>>>>
>>>>>>>> Any comments and suggestions are welcome.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> -Aubrey
>>>>>>>> _______________________________________________
>>>>>>>> pm-discuss mailing list
>>>>>>>> pm-discuss at opensolaris.org
>>>>>>>> http://mail.opensolaris.org/mailman/listinfo/pm-discuss
>
> _______________________________________________
> tesla-dev mailing list
> tesla-dev at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/tesla-dev

--
---------------------
Julia Harper, julia.harper at sun.com
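
For reference, the callback flow in Aubrey's original proposal (quoted at the bottom of the thread) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the webrev: the type and function names (cpu_policy_t, cpu_policy_register, cpu_policy_apply, epb_policy_cb) are hypothetical, and the hint values chosen for each policy are placeholders, since the thread notes that a policy is not guaranteed to map to a specific MSR value.

```c
#include <assert.h>
#include <stddef.h>

/* Policy names follow the proposed "cpu-power-policy" keyword options. */
typedef enum {
	CPU_POLICY_PERF_BIAS,
	CPU_POLICY_BALANCED,
	CPU_POLICY_POWER_BIAS
} cpu_policy_t;

/* Hypothetical callback signature for a cpu pm feature. */
typedef void (*cpu_policy_cb_t)(cpu_policy_t policy, void *arg);

typedef struct cpu_policy_node {
	cpu_policy_cb_t cb;
	void *arg;
	struct cpu_policy_node *next;
} cpu_policy_node_t;

/* Head of the singly linked list of registered feature callbacks. */
static cpu_policy_node_t *cpu_policy_list = NULL;

/* Register a feature callback; the caller owns the node storage. */
static void
cpu_policy_register(cpu_policy_node_t *node, cpu_policy_cb_t cb, void *arg)
{
	assert(node != NULL && cb != NULL);
	node->cb = cb;
	node->arg = arg;
	node->next = cpu_policy_list;
	cpu_policy_list = node;
}

/* Walk the list and call every registered feature; returns how many ran. */
static int
cpu_policy_apply(cpu_policy_t policy)
{
	int called = 0;

	for (cpu_policy_node_t *n = cpu_policy_list; n != NULL; n = n->next) {
		n->cb(policy, n->arg);
		called++;
	}
	return (called);
}

/*
 * Example feature callback: stores a 0..15 energy/performance hint in *arg.
 * The three hint values are placeholders for illustration only.
 */
static void
epb_policy_cb(cpu_policy_t policy, void *arg)
{
	static const unsigned hint[] = { 0, 7, 15 };

	*(unsigned *)arg = hint[policy];
}
```

A feature would call cpu_policy_register() once it has verified hardware support, and the ioctl path would then call cpu_policy_apply() with the policy read from /etc/power.conf. A real kernel implementation would also need locking around the list, which is omitted here.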
