Hi, I forgot to mention that cpu_pm_policy is just a policy. There is no guarantee it maps to a specific MSR or hardware implementation.
For example, Solaris could dynamically set the ENERGY_PERFORMANCE_BIAS register to different values depending on things such as system load, the priority of the application being scheduled, a power policy of the application, or a power policy of the zone.

Regards,
Bill

On 03/03/10 16:21, Bill Holler wrote:
> +1.
>
> Hi Aubrey,
>
> I also think it is time to move forward with this proposal.
> Generally we want the system to work best "out of the box"
> with no tuning. On the other hand, vendors will keep
> improving products with new features, and there will
> always be some specific applications where custom settings
> may be better. I feel this proposal supports innovation and
> application-specific customization in line with the
> OpenSolaris community goals.
>
> This proposal applies to all types of CPUs. It uses
> "cpu_pm_policy" instead of, for example, mentioning a
> specific CPU's MSR. ;-) This proposal will be useful
> with other CPUs if/when they have hardware mechanisms
> for tuning power / performance.
>
>
> In the ARC case we want to mention that there could
> be a policy conflict between this component setting and
> a system-power-policy, external Power Capping, etc.
> Generally we want users to use the default or a higher
> level policy such as the system power policy.
> Unfortunately the system power policy may not be
> fine-grained or diverse enough for some applications to
> specify cpu power policy. In that case cpu_pm_policy
> will be useful. My thought is: the user must really know
> what they want if they specify a component policy
> such as cpu_pm_policy instead of just using the
> system power policy. For that reason I feel cpu_pm_policy
> should override the system-power-policy at the cpupm level.
>
> Power Capping is different. Power Capping is an external
> policy. It is currently "owned" by the SP external to the
> OS. Power Capping should override a local cpu_pm_policy.
>
>
> Implementation comments:
> IMHO the mcpu_pm_policy pointer should be in the
> mcpu_pm_mach_state structure instead of in the machcpu.
> We may want to allow the user to specify a number
> instead of just Perf, Balanced, Power, Default?
>
> Regards,
> Bill
>
>
> On 02/20/10 18:43, Li, Aubrey wrote:
>> Hi Bill,
>>
>> I think it's time to continue this proposal, since b134 is closed and
>> the build is not limited now. The power/perf bias setting is a starting
>> point for future power-related work. I'll prepare a PSARC file for the
>> new option if this is acceptable. No is also a good answer with good
>> reason.
>>
>> Thanks,
>> -Aubrey
>>
>>
>>> Bill.Holler Wrote:
>>>
>>>> Hi,
>>>>
>>>> This proposal is for a mechanism to set the new MSR
>>>> IA32_ENERGY_PERF_BIAS_MSR. This is a new hardware
>>>> feature. The MSR affects overall power/performance.
>>>> It gives a hint to the processor & package for desired
>>>> power/performance characteristics. It is related to p-states
>>>> and c-states (and may affect these features), but this feature
>>>> can have other socket/system-level effects as well.
>>>> The programmer's guides do not go into detail about what the
>>>> other effects can be. :-(
>>>>
>>> The perf and power impact of this MSR is model specific.
>>> It's able to throttle turbo on WSM and will probably help to make more
>>> hardware decisions in future. For example, when a short interrupt
>>> storm is detected, it can demote a CC6 request to CC3.
>>>
>>>
>>>> On 11/05/09 05:15, minskey guo wrote:
>>>>
>>>>> Jedy Wang wrote:
>>>>>
>>>>>> Hi Li,
>>>>>>
>>>>>> As far as I know, gnome-power-manager has removed the support for
>>>>>> changing the governor, which is the same as a profile I think. I
>>>>>> remember someone wrote a blog explaining the reason but I can not
>>>>>> find it now. I wonder what makes us still need to implement this
>>>>>> feature.
>>>>>>
>>>>> In the Linux world, there is an ondemand governor in the kernel.
>>>>> It sets the cpu frequency
>>>>> according to the cpu's current load. So, some people consider that
>>>>> everybody should use that governor, and let CPUs finish their jobs
>>>>> asap and then enter C-states for power saving. Compared to P-states,
>>>>> C-states do save more power. That's why gnome removed it.
>>>>>
>>> This is also model specific and depends on whether the frequency,
>>> voltage, and power are linear. That's true on the latest processors
>>> but not on earlier processors.
>>>
>>> I'm not sure why gnome removed it, but it does not seem a good idea to
>>> me. Some users want max perf and others want longer battery life.
>>>
>>>
>>>> Yes, a good p-state + c-state implementation is not easy
>>>> to tune for more power savings. Running in lower p-states
>>>> when a CPU is busy burns more power due to shorter time
>>>> in deeper C-states. Entering deeper C-states too aggressively
>>>> also burns more power (on both an idle and busy system) due
>>>> to unnecessary wakeup latency. ;-) Without knowing the
>>>> details, it seems likely that the gnome-power-manager
>>>> was removed because setting it made worse decisions
>>>> than a runtime prediction.
>>>>
>>>>
>>>> Solaris currently has mechanisms to turn P-state and
>>>> deeper C-state support on/off.
>>>>
>>>> A requirement is that the Energy Perf Bias MSR can be
>>>> set on systems not running a GUI. We would like to support
>>>> a possible future Gnome interface to set this MSR if/when it
>>>> exists. The proposal provides a mechanism that works on
>>>> systems without Gnome.
>>>>
>>> Right, most servers do not run gnome. I don't expect gnome support
>>> but it would be great if it will, :-)
>>>
>>> IMHO, we should use this global cpu power policy setting instead of
>>> "cpupm" and "cpu-deep-idle"; this is more friendly to the user. The
>>> users just want more perf or more power savings. I think they don't
>>> care whether the system supports p-states and c-states at the same
>>> time. "cpupm" is confusing, since it covers only p-states. We named it
>>> "cpupm" before we had deep idle support. Actually cpu-deep-idle is
>>> also one part of cpu power management, :)
>>>
>>>>
>>>>> but some people don't care about power saving when comparing it to
>>>>> other factors. For example, if you are plagued by the noise of the
>>>>> CPU fan and want it quiet, then you can lower the cpu frequency,
>>>>> which results in lower heat, and then the fan can be stopped.
>>>>>
>>>>> Personally, I vote +1 for this project if I could vote, but I don't
>>>>> like the names of "perf-bias" etc :)
>>>>>
>>>>>
>>>>> Besides, can somebody tell me where IA32_ENERGY_PERF_BIAS_MSR
>>>>> comes from? Is it a part of the IPS feature?
>>>>>
>>>> Intel's Software Developer's Manual volume 2A describes
>>>> CPUID detection of IA32_ENERGY_PERF_BIAS_MSR
>>>> and volume 3A describes the MSR.
>>>> http://www.intel.com/products/processor/manuals/
>>>> Sorry, I do not know what IPS stands for?
>>>>
>>> cough, cough, IPS is not a released feature and should not be discussed
>>> here, ;p
>>>
>>> Thanks,
>>> -Aubrey
>>>
>>>
>>>> Regards,
>>>> Bill
>>>>
>>>>
>>>>
>>>>> -minskey
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> I remember we already support 2 profiles through gnome-power-manager
>>>>>> on Solaris. What's the difference between them?
>>>>>>
>>>>>> I do not understand the exact meaning of perf-bias, balanced and
>>>>>> power-bias either. Doesn't perf-bias mean the cpu frequency will
>>>>>> always be at the highest level?
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Jedy
>>>>>> On Wed, 2009-11-04 at 08:47 +0800, Li, Aubrey wrote:
>>>>>>
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> When we enabled the Intel energy performance bias feature, we found
>>>>>>> that a power profile implementation is necessary. Here is a draft
>>>>>>> for a cpu-level power policy.
>>>>>>> http://cr.opensolaris.org/~aubrey/cpu_power_policy_v1/
>>>>>>>
>>>>>>> The proposal adds a new keyword to /etc/power.conf,
>>>>>>> "cpu-power-policy",
>>>>>>> and we have 4 options for this new keyword:
>>>>>>> 1) perf-bias
>>>>>>> 2) balanced
>>>>>>> 3) power-bias
>>>>>>> 4) default, the same as perf-bias.
>>>>>>>
>>>>>>> /etc/power.conf accepts the user input and passes the preferred
>>>>>>> policy to the kernel through an ioctl. Then pm_ioctl calls the
>>>>>>> callback to walk a cpu power policy list. Every cpu pm feature which
>>>>>>> wants to be adjusted by this option and is verified to be supported
>>>>>>> will register its callback function to the list, so that it can be
>>>>>>> called and adjusted by pmconfig.
>>>>>>> --------------------------------------------------------
>>>>>>> /etc/power.conf
>>>>>>>       |
>>>>>>> pm_ioctl(cpu_power_policy, policy)
>>>>>>>       |
>>>>>>> cpu_power_policy_callb (policy)
>>>>>>>       |
>>>>>>>       ----> registered pm feature callback 1 (ENERGY_PERF_BIAS)
>>>>>>>       |
>>>>>>>       ----> registered pm feature callback 2
>>>>>>>       ...
>>>>>>> --------------------------------------------------------
>>>>>>> Currently, only the energy_perf_bias feature is registered, because
>>>>>>> my intention is to support adjusting the energy_perf_bias MSR
>>>>>>> without a reboot. I guess we can probably add p/t/c-state support
>>>>>>> later. When we add p/t/c-state support, my quick thought is that
>>>>>>> this option will override the "cpupm" and "cpu-deep-idle" settings.
>>>>>>>
>>>>>>> Any comments and suggestions are welcome.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> -Aubrey
>>>>>>> _______________________________________________
>>>>>>> pm-discuss mailing list
>>>>>>> pm-discuss at opensolaris.org
>>>>>>> http://mail.opensolaris.org/mailman/listinfo/pm-discuss
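
The callback flow in the original proposal (power.conf keyword, then pm_ioctl, then a walk over registered feature callbacks) can be modeled roughly as follows. This is only a user-space sketch in Python, not the kernel code from the webrev: the names register_policy_callback, parse_power_conf_line and energy_perf_bias_callback are hypothetical, and the callback merely records the policy instead of writing any MSR.

```python
from enum import Enum
from typing import Callable, List, Tuple


class CpuPowerPolicy(Enum):
    """The four keyword values listed in the proposal."""
    PERF_BIAS = "perf-bias"
    BALANCED = "balanced"
    POWER_BIAS = "power-bias"
    DEFAULT = "default"  # the proposal says default is the same as perf-bias


PolicyCallback = Callable[[CpuPowerPolicy], None]

# Hypothetical stand-in for the kernel's cpu power policy callback list.
_callbacks: List[PolicyCallback] = []


def register_policy_callback(cb: PolicyCallback) -> None:
    """A supported cpu pm feature adds itself to the list."""
    _callbacks.append(cb)


def cpu_power_policy_callb(policy: CpuPowerPolicy) -> CpuPowerPolicy:
    """Walk the list and hand the resolved policy to every registered feature."""
    if policy is CpuPowerPolicy.DEFAULT:
        policy = CpuPowerPolicy.PERF_BIAS
    for cb in _callbacks:
        cb(policy)
    return policy


def parse_power_conf_line(line: str) -> CpuPowerPolicy:
    """Parse a 'cpu-power-policy <value>' line (format assumed for illustration)."""
    keyword, value = line.split()
    if keyword != "cpu-power-policy":
        raise ValueError(f"unexpected keyword: {keyword}")
    return CpuPowerPolicy(value)  # raises ValueError for unknown values


# Stand-in for the ENERGY_PERF_BIAS feature: it only records what it was told.
applied: List[Tuple[str, str]] = []


def energy_perf_bias_callback(policy: CpuPowerPolicy) -> None:
    applied.append(("energy_perf_bias", policy.value))


register_policy_callback(energy_perf_bias_callback)
cpu_power_policy_callb(parse_power_conf_line("cpu-power-policy balanced"))
```

Adding p/t/c-state support later would, in this model, just mean registering further callbacks on the same list; nothing in the dispatch path changes.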

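
The precedence Bill suggests in his 03/03/10 message (external Power Capping overrides a local cpu_pm_policy, which in turn overrides the system power policy at the cpupm level) can be written down as a small resolution function. This is a sketch under assumptions: the function name and the idea of representing each layer as an optional policy string are illustrative only, and real power capping is an external limit owned by the SP rather than a named policy.

```python
from typing import Optional


def effective_cpu_policy(system_policy: str,
                         cpu_pm_policy: Optional[str] = None,
                         power_cap_policy: Optional[str] = None) -> str:
    """Pick the policy that wins at the cpupm level.

    Order taken from the thread: power capping > cpu_pm_policy > system policy.
    A value of None means that layer was not specified.
    """
    if power_cap_policy is not None:
        return power_cap_policy      # external policy, overrides local settings
    if cpu_pm_policy is not None:
        return cpu_pm_policy         # explicit component policy set by the user
    return system_policy             # fall back to the higher-level default


# An explicit component policy beats the system-wide policy.
choice = effective_cpu_policy("balanced", cpu_pm_policy="perf-bias")
```

The ordering encodes the reasoning in the message: a user who sets a component policy presumably knows what they want, while a cap imposed from outside the OS is not something the local setting should be able to defeat.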