Hi Bill,

Thanks for bringing this forward to the power architects.
Bill Holler wrote:
>
>Hi Aubrey,
>
>+1. This looks great to me. We need it for many projects to allow
>the admin to specify that the system should run more energy-efficiently.
>
>We discussed this with the power architects, and they agree it looks
>good.
>One suggestion was to change "power" to "energy". For example
>they would like to see the "power-bias" name changed to "energy-bias",
>and this should be called something like "system energy policy".
>That would help identify that this knob is for system *energy*
>efficiency.

This sounds good. I'll change it in the new version of the onepager.

>
>As a side note, saving energy is different from saving power.
>Power Capping (an externally imposed policy) takes effect when
>power has an increased cost or increased marginal cost. For example:
>1. the power grid fails or is over-budgeted, 2. the server room has
>exceeded cooling capacity. Power Capping is different and has
>higher precedence than this system-level energy efficiency policy.

Definitely. I'll add this note to the PSARC file as well.
I'll post a new onepager according to the new SMF implementation of
the tunable power option.

Thanks,
-Aubrey

>
>I think I covered the points the pm architects were concerned with?
>Sarito or Julia etc can comment if I left something out. :-)
>
>Regards,
>Bill
>
>
>On 04/01/10 01:43, Li, Aubrey wrote:
>> Randy Fishel wrote:
>>
>>> This might be a bit contentious, as there not only is effort to
>>> migrate the configuration to SMF, there is a consideration to define
>>> something similar to system-pm-policy. On the other hand, there also
>>> is lacking architecture and there doesn't seem to be much momentum in
>>> providing it.
>>>
>>> I am also leaving for vacation on Friday morning. I will take a
>>> printout with me in hopes of maybe reviewing it over the next week.
>>> It may also give others the opportunity to see how this might fit into
>>> the "new" architecture.
>>>
>>> Cheers!
>>>
>>> ---- Randy
>>>
>>
>> This was intended as cpu-pm-policy, a mechanism to provide a knob for the
>> user to tune, on the fly, the pm policy introduced by the Intel
>> Energy_Perf_Bias feature. Currently Energy_Perf_Bias is set to performance
>> bias by default, which means the power control unit in the processor will
>> drive the processor to peak performance at any energy cost. This feature
>> can, for example, throttle turbo performance boost by setting an MSR to
>> power bias. In the near future, the trend of silicon design is to do more
>> and more in hardware: package/core C-state auto promotion or demotion, QPI
>> link state, DRAM refreshing, etc. will all accept the hint from this feature.
>>
>> Besides this, as for CPU, we don't have an option to let the processor run at
>> the lowest frequency, or always enter the deepest supported idle state when
>> idle. The CMT_COALESCE dispatching policy is disabled in the kernel because
>> it hurts peak performance. But this policy helps to group the utilization
>> onto one package, or even one core where possible. If we could group the
>> utilization onto one package, the other packages could sleep longer and
>> deeper, and hence save more energy. This should help prolong battery life,
>> or save energy on a server outside its rush hour.
>>
>> Besides CPU, memory and other devices are in the same situation. In the
>> current kernel, the memory power management driver FIPE has a default policy
>> setting
>>     fipe_pm_policy = FIPE_PM_POLICY_BALANCE
>> From the source, the FIPE_PM_POLICY_POWERSAVE policy could save more power,
>> I think. Sooner or later, DDR3 could have the same requirement if we
>> implement power management on it.
>>
>> Recently, I found the USB EHCI driver is not friendly to idle power when I
>> did a power characterization analysis.
>> The EHCI driver keeps polling and making the host controller issue DMA read
>> and write operations when there are no USB related ops, or even when there
>> is no USB device connected. This problem throttles the package c-state and
>> makes a big gap between Solaris and other OSes. This might not depend on
>> the power/perf profile. But a profile could make the solution easy.
>>
>> I believe there are a few other cases I missed that would give more momentum
>> to introducing a user profile for power performance bias, :)
>>
>> Thanks,
>> -Aubrey
>>
>>> On Thu, 1 Apr 2010, Li, Aubrey wrote:
>>>
>>>> Just wanna move forward for this work, here is a PSARC onepager. Any
>>>> inputs are really appreciated!
>>>>
>>>> Thanks,
>>>> -Aubrey
>>>>
>>>> ======== system-pm-policy_onepager_v1.txt =================================
>>>> Template Version: @(#)onepager.txt 1.35 07/11/07 SMI
>>>>
>>>> 1. Introduction
>>>>    1.1. Project/Component Working Name:
>>>>         system-pm-policy keyword
>>>>
>>>>    1.2. Name of Document Author/Supplier:
>>>>         Author: Aubrey Li <[email protected]>
>>>>
>>>>    1.3. Date of This Document:
>>>>         April 28, 2010
>>>>
>>>> 2. Project Summary
>>>>    2.1. Project Description:
>>>>         Solaris support for the system-pm-policy keyword in power.conf(4).
>>>>         A mechanism is desired to set system wide power performance bias.
>>>>
>>>>    2.2. Risks and Assumptions:
>>>>         Very few customers will use this keyword. Most customers will desire
>>>>         power performance balanced policy to be the default.
>>>>
>>>> 4. Technical Description:
>>>>    4.1. Details:
>>>>
>>>>         pmconfig(1M) parses /etc/power.conf. If the system-pm-policy keyword
>>>>         is in power.conf(4), it passes the user preferred policy to the kernel
>>>>         thru pm_ioctl by the command PM_SET_SYSTEM_POLICY.
>>>>         pm_ioctl() then calls pm_set_system_policy() to set the global policy
>>>>         variable and calls the power manageable modules to pass the policy
>>>>         down.
>>>>
>>>>         Currently pm_set_system_policy() only sets the CPU power management
>>>>         policy, and could set memory and other devices' power management
>>>>         policy in future. CPU pm policy setting is machine specific.
>>>>
>>>>         The CPU has a few power management features, like C-state, P-state,
>>>>         energy performance bias etc. Every CPU pm feature which wants to
>>>>         inherit the system-pm-policy will register its callback function to a
>>>>         list. When pmconfig passes the policy to the kernel, the kernel will
>>>>         walk the list to call each callback function and hence set the user
>>>>         preferred policy in the different modules.
>>>>
>>>>         /etc/power.conf may have [system-pm-policy <value>]
>>>>                 |
>>>>                 v
>>>>             pmconfig
>>>>                 |
>>>>                 v
>>>>         pm_ioctl(PM_SET_SYSTEM_POLICY, policy)
>>>>                 |
>>>>                 v
>>>>         pm_set_system_policy(policy)
>>>>                 |
>>>>                 ----> CPU pm policy callback
>>>>                 |         |
>>>>                 |         ----> registered CPU pm feature 1 callback(ENERGY_PERF_BIAS)
>>>>                 |         |
>>>>                 |         ----> ...
>>>>                 |
>>>>                 ----> Memory pm policy callback in future
>>>>                 |
>>>>                 ----> ...
>>>>
>>>>
>>>>         Power performance balanced policy will be set by default; this keeps
>>>>         the current out-of-box setting unchanged. A system which has extreme
>>>>         performance requirements could disable the power management features
>>>>         by the performance bias setting. If a laptop runs on battery, or a
>>>>         system at low utilization prefers power savings over performance,
>>>>         system-pm-policy could be set to power bias to save more power; this
>>>>         could lead to the lowest CPU clock and always the deepest idle state.
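
The register-then-walk mechanism described in the onepager text above can be sketched in a few lines of C. This is only a minimal user-space illustration under stated assumptions: the names `pm_policy_t`, `pm_policy_register()`, `pm_set_system_policy_sketch()` and `epb_policy_cb()` are hypothetical and are not taken from the webrev, the fixed-size table stands in for whatever list the kernel would really use, and a kernel version would need locking around the list.

```c
#include <stddef.h>

/* Hypothetical policy values; the names are illustrative only. */
typedef enum {
	PM_POLICY_BALANCED = 0,		/* default, out-of-box behavior */
	PM_POLICY_PERF_BIAS,
	PM_POLICY_POWER_BIAS
} pm_policy_t;

/* A pm feature registers a callback that receives the new policy. */
typedef void (*pm_policy_cb_t)(pm_policy_t policy, void *arg);

#define	PM_MAX_CALLBACKS	8

static struct {
	pm_policy_cb_t	cb;
	void		*arg;
} pm_cb_list[PM_MAX_CALLBACKS];

static size_t pm_cb_count;
static pm_policy_t pm_system_policy = PM_POLICY_BALANCED;

/* Add a feature callback; returns 0 on success, -1 if cb is NULL or the table is full. */
int
pm_policy_register(pm_policy_cb_t cb, void *arg)
{
	if (cb == NULL || pm_cb_count >= PM_MAX_CALLBACKS)
		return (-1);
	pm_cb_list[pm_cb_count].cb = cb;
	pm_cb_list[pm_cb_count].arg = arg;
	pm_cb_count++;
	return (0);
}

/*
 * Record the global policy, then walk the list and hand the policy to
 * every registered feature. Returns the number of callbacks invoked.
 */
size_t
pm_set_system_policy_sketch(pm_policy_t policy)
{
	size_t i;

	pm_system_policy = policy;
	for (i = 0; i < pm_cb_count; i++)
		pm_cb_list[i].cb(policy, pm_cb_list[i].arg);
	return (pm_cb_count);
}

/*
 * Stand-in for an ENERGY_PERF_BIAS handler: it only stores the policy
 * it was given. A real handler would translate it into a hardware hint.
 */
void
epb_policy_cb(pm_policy_t policy, void *arg)
{
	*(pm_policy_t *)arg = policy;
}
```

A caller would register `epb_policy_cb` once at feature attach time and then call `pm_set_system_policy_sketch()` whenever the ioctl delivers a new policy; features that never register are simply unaffected, which matches the "wants to inherit" wording in the onepager.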
>>>>
>>>>         Different power manageable devices could inherit the system wide
>>>>         policy completely, or they can maintain a specific pm policy
>>>>         themselves, but the system wide policy must carry the largest weight
>>>>         in their own mechanism.
>>>>
>>>>
>>>>    4.2. Bug/RFE Number(s): xxxxxxx
>>>>
>>>>    4.5. Interfaces:
>>>>         This project will import these existing interfaces.
>>>>         Interface stability will be "committed".
>>>>
>>>>         Import:
>>>>             power.conf(4) (PSARC/1992/202)
>>>>             pmconfig(1M)
>>>>
>>>>         Export:
>>>>             system-pm-policy
>>>>
>>>>         system-pm-policy keyword.
>>>>         A system-pm-policy entry can be added to power.conf(4) to set the
>>>>         system wide power policy. If this entry is present and set to default,
>>>>         or it is not present, then the default balanced policy will be used;
>>>>         this keeps the current behavior unchanged. The other options will tune
>>>>         the policy to power bias or performance bias.
>>>>
>>>>         power.conf(4) man page addition:
>>>>
>>>>         A system-pm-policy entry may be used to set system wide power policy.
>>>>         The format of the system-pm-policy entry is system-pm-policy policy.
>>>>
>>>>         Acceptable policy values are:
>>>>
>>>>         default     Power performance balanced policy.
>>>>
>>>>         perf-bias   The system drives to maximum performance at any energy
>>>>                     cost.
>>>>
>>>>         balanced    Balanced performance vs. power and energy.
>>>>
>>>>         power-bias  Maximum energy efficiency.
>>>>
>>>>         absent      If the system-pm-policy keyword is absent from
>>>>                     power.conf(4), the behavior is the same as the default
>>>>                     case.
>>>>
>>>>    4.6. Doc Impact:
>>>>         power.conf man page. See above.
>>>>
>>>>    4.7. Admin/Config Impact:
>>>>         Administrators of systems can use this option to match different
>>>>         power performance requirements.
>>>>
>>>>    4.8. HA Impact: None.
>>>>
>>>>    4.9. I18N/L10N Impact: No.
>>>>
>>>>    4.10.
Packaging & Delivery: >>>> This change will be delivered as part of the Deep C-State >RFE. >>>> These changes will be made at the same time: >>>> kernel package >>>> power.conf package >>>> pmconfig package >>>> >>>> 4.11. Security Impact: None. >>>> >>>> 4.12. Dependencies: power.conf, pmconfig(1M) >>>> >>>> 6. Resources and Schedule: >>>> 6.1. Projected Availability: April 2010 >>>> >>>> 6.4. Product Approval Committee requested information: >>>> 6.4.1. Consolidation C-team Name: >>>> ON >>>> 6.5. ARC review type: FastTrack >>>> 6.6. ARC Exposure: open >>>> >>>> 7. Prototype Availability: >>>> 7.1. Prototype Availability: >>>> Prototype available on OpenSolaris in April 2010. >>>> >>>> >>> >======================================================================== >>> =========== >>> >>>> Li, Aubrey wrote: >>>> >>>>> Hi Bill, >>>>> >>>>> Here I made a change to propose system-wide policy support. >>>>> http://cr.opensolaris.org/~aubrey/sys_pm_policy_v1/ >>>>> The user profile from /etc/power.conf is still passed to the kernel >>>>> thru pm_ioctl, then call pm_set_system_policy(). Currently there is >>>>> >>> only >>> >>>>> cpu pm policy setting there, if memory/other devices need a bias as >>>>> >>> well, >>> >>>>> they can also be added to that function. >>>>> cpu pm policy related implementation has minor change against last >>>>> webrev, >>>>> mcpu_pm_policy pointer has been moved from machcpu to >>>>> >>> mcpu_pm_mach_state >>> >>>>> structure according to your suggestion. >>>>> >>>>> Any comments and suggestions are highly appreciated. >>>>> >>>>> Thanks, >>>>> -Aubrey >>>>> >>>>> Li, Aubrey wrote: >>>>> >>>>>> It looks like memory PM need such a bias as well. So I'd like to >>>>>> >>> change >>> >>>>>> the proposal to use the keyword "sys-pm-policy" instead. 
The >>>>>> >>> mechanism >>> >>>>>> will use the existing callb implementation to pass the user policy >>>>>> >>> from >>> >>>>>> /etc/power.conf to the kernel and walk the module registered list >to >>>>>> call >>>>>> module hook function to set the pm policy individually. >>>>>> >>>>>> I'm not sure if any other device driver need or be happy with this >>>>>> proposal. >>>>>> It would be great if the device driver developer can share some >>>>>> >>>>> thoughts >>>>> >>>>>> here. >>>>>> >>>>>> Thanks, >>>>>> -Aubrey >>>>>> >>>>>> Julia.Harper wrote: >>>>>> >>>>>>> I assume that this knob (profile) when turned way down would >>>>>>> >>> basically >>> >>>>>>> put the >>>>>>> system into "power savings" mode -- where the set of power states >>>>>>> >>> is >>> >>>>>>> restricted. >>>>>>> That is, no matter how long the utilization level demands more >>>>>>> >>> power, >>> >>>>>>> the >>>>>>> highest power states (for the cpus, memory, whatever) will never >be >>>>>>> entered. We >>>>>>> should probably use terminology that makes this clear. >>>>>>> >>>>>>> -- jdh >>>>>>> >>>>>>> >>>>>>> Liu, Jiang wrote: >>>>>>> >>>>>>>> I prefer the solution to introduce a global power profile for >all >>>>>>>> >>>>>>> devices. Currently >>>>>>> >>>>>>>> we need such a profile for CPUPM. In future when supporting >>>>>>>> >>> memory >>> >>>>>>> power >>>>>>> >>>>>>>> management, we may need a similiar profile for memory PM. And >>>>>>>> >>> user >>> >>>>>>> won't >>>>>>> >>>>>>>> like two variables/profiles for the same objective. >>>>>>>> >>>>>>>> Li, Aubrey <> wrote: >>>>>>>> >>>>>>>>> Bill Holler wrote: >>>>>>>>> >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> I forgot to mention that cpu_pm_policy is just a policy. >>>>>>>>>> There is no guaranty it maps to a specific MSR or hardware >>>>>>>>>> implementation. >>>>>>>>>> >>>>>>>>> Yes, I would like to propose a new option for CPU power >>>>>>>>> >>> management >>> >>>>>>>>> policy. 
This policy is a CPU bias between performance and power, >>>>>>>>> >>>>> the >>>>> >>>>>>>>> future CPU power management enhancement work can be based on >>>>>>>>> >>> this >>> >>>>>>>>> policy. - the default policy should keep the current "out of >the >>>>>>>>> >>>>>> box" >>>>>> >>>>>>>>> behavior unchanged, we'll try to save more power without >>>>>>>>> >>>>> performance >>>>> >>>>>>>>> hurt. >>>>>>>>> - there will be more power management futures coming on the >>>>>>>>> >>> future >>> >>>>>>>>> processor, like ENERGY_PERFORMANCE_BIAS, we can register these >>>>>>>>> >>> new >>> >>>>>>>>> futures under the policy framework, and offer a knob to the >user >>>>>>>>> >>> to >>> >>>>>>>>> change these settings on the fly. >>>>>>>>> - laptop users who want to prolong the battery life and less >>>>>>>>> >>> heat >>> >>>>>> and >>>>>> >>>>>>>>> smaller fan noise may want the system to work in some edge >>>>>>>>> >>>>> situation: >>>>> >>>>>>>>> for example, currently CPU can work in the highest clock if >>>>>>>>> >>> cpupm >>> >>>>> is >>>>> >>>>>>>>> disabled, but no choice to let CPU always work in the lowest >>>>>>>>> >>> clock. >>> >>>>>>>>> Similarly, Always enter deepest c-state is another choice to >>>>>>>>> >>> save >>> >>>>>>>>> more power. What's more, power aware dispatcher could be more >>>>>>>>> flexible to pick up CPU and dispatch thread if there is a >policy >>>>>>>>> indicator. - Some users doesn't care about power. Yes, we >>>>>>>>> >>> already >>> >>>>>>>>> have the options to let them to set ENERGY_PERFORMANCE_BIAS to >>>>>>>>> >>> be >>> >>>>>>>>> performance bias, to close c-state/p-state, and so on and so >>>>>>>>> >>> forth. >>> >>>>>>>>> But it's more friendly to the user to just change only one >>>>>>>>> >>> option. >>> >>>>>>>>> Here, the policy only focus on CPU. If you think we should have >>>>>>>>> >>> a >>> >>>>>>>>> policy for the memory, for the devices, or we should have a >>>>>>>>> system-wide policy, let's do this. 
cpu_pm_policy can be one >part >>>>>>>>> >>> of >>> >>>>>>>>> system-wide policy. >>>>>>>>> If nobody have thoughts on it, I'll continue to prepare a PSARC >>>>>>>>> >>>>> file >>>>> >>>>>>>>> to add cpu_pm_policy keyword. >>>>>>>>> >>>>>>>>> >>>>>>>>>> For example Solaris could be dynamically setting the >>>>>>>>>> ENERGY_PERFORMANCE_BIAS register to different settings >>>>>>>>>> >>> depending >>> >>>>> on >>>>> >>>>>>>>>> things such as system-load, >>>>>>>>>> >>>>>>>>> Yes, such of these settings can be dynamically changed if we >see >>>>>>>>> >>>>> the >>>>> >>>>>>>>> benefit. >>>>>>>>> >>>>>>>>> >>>>>>>>>> the priority of the application being scheduled, a power >policy >>>>>>>>>> >>> of >>> >>>>>>>>>> the application, >>>>>>>>>> >>>>>>>>> Making the thread power aware need another bunch of interfaces >I >>>>>>>>> think. For example, cmt_balance() can choose the different >>>>>>>>> >>>>> processor >>>>> >>>>>>>>> group according to the perf/power bias of the thread. >>>>>>>>> >>>>>>>>> >>>>>>>>>> or power policy of the zone. >>>>>>>>>> >>>>>>>>> Zone policy is an interesting topic. Different zone could have >>>>>>>>> different CPU resource, or can share the global CPU resource, >>>>>>>>> different zone could have different power policy, or they can >>>>>>>>> >>>>>> inherit >>>>>> >>>>>>>>> the global cpu_pm_policy setting. The virtual container could >>>>>>>>> >>> have >>> >>>>>>>>> many, but the hardware resource is unique. I think this can be >>>>>>>>> enhanced in the zone management, which will not be covered in >my >>>>>>>>> proposal, :) >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> -Aubrey >>>>>>>>> >>>>>>>>> >>>>>>>>>> Regards, >>>>>>>>>> Bill >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 03/03/10 16:21, Bill Holler wrote: >>>>>>>>>> >>>>>>>>>>> +1. >>>>>>>>>>> >>>>>>>>>>> Hi Aubrey, >>>>>>>>>>> >>>>>>>>>>> I also think it is time to move forward with this proposal. >>>>>>>>>>> Generally we want the system to work best "out of the box" >>>>>>>>>>> with no tuning. 
On the other hand, vendors will keep >>>>>>>>>>> >>> improving >>> >>>>>>>>>>> products with new features, and there will always be some >>>>>>>>>>> >>>>> specific >>>>> >>>>>>>>>>> applications were custom settings may be better. I feel this >>>>>>>>>>> proposal supports innovation and application specific >>>>>>>>>>> >>>>>> customization >>>>>> >>>>>>>>>>> in line with the OpenSolaris community goals. >>>>>>>>>>> >>>>>>>>>>> This proposal applies to all types of CPUs. It uses >>>>>>>>>>> >>>>>>> "cpu_pm_policy" >>>>>>> >>>>>>>>>>> instead of for example mentioning a specific CPU's MSR. ;-) >>>>>>>>>>> >>>>> This >>>>> >>>>>>>>>>> proposal will be useful with other CPUs if/when they have >>>>>>>>>>> >>>>> hardware >>>>> >>>>>>>>>>> mechanisms for tuning power / performance. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> In the arc case we want to mention that there could be a >>>>>>>>>>> >>> policy >>> >>>>>>>>>>> conflict between this component setting and a system-power- >>>>>>>>>>> >>> policy, >>> >>>>>>>>>>> external Power Caping, etc. Generally we want users to use >the >>>>>>>>>>> default or a higher level policy such as the system power >>>>>>>>>>> >>> policy. >>> >>>>>>>>>>> Unfortunately the system power policy may not be fine-grain >or >>>>>>>>>>> diverse enough for some applications to specify cpu power >>>>>>>>>>> >>> policy. >>> >>>>>>>>>>> In that case cpu_pm_policy will be useful. My thought is: >the >>>>>>>>>>> >>>>>> user >>>>>> >>>>>>>>>>> must really know what they want if they specify a component >>>>>>>>>>> >>>>> policy >>>>> >>>>>>>>>>> such as cpu_pm_policy instead of just using the system power >>>>>>>>>>> policy. For that reason I feel cpu_pm_policy should override >>>>>>>>>>> >>> the >>> >>>>>>>>>>> system-power-policy at the cpupm level. >>>>>>>>>>> >>>>>>>>>>> Power Caping is different. Power Capping is an external >>>>>>>>>>> >>> policy. >>> >>>>>>> It >>>>>>> >>>>>>>>>>> is currently "owned" by the SP external to the OS. 
Power >>>>>>>>>>> >>> Caping >>> >>>>>>>>>>> should override a local cpu_pm_policy. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Implementation comments: >>>>>>>>>>> IMHO mcpu_pm_policy pointer should be in the >>>>>>>>>>> >>> mcpu_pm_mach_state >>> >>>>>>>>>>> structure instead of in the machcpu. >>>>>>>>>>> We may want to allow the user to specify a number instead of >>>>>>>>>>> >>> just >>> >>>>>>>>>>> Perf, Balanced, Power, Default? >>>>>>>>>>> >>>>>>>>>>> Regards, >>>>>>>>>>> Bill >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 02/20/10 18:43, Li, Aubrey wrote: >>>>>>>>>>> >>>>>>>>>>>> Hi Bill, >>>>>>>>>>>> >>>>>>>>>>>> I think it's time to continue this proposal, since b134 is >>>>>>>>>>>> >>>>> closed >>>>> >>>>>>>>>>>> and the build is not limited now. power/perf bias setting is >>>>>>>>>>>> >>> a >>> >>>>>>>>>>>> start point for future power related work, I'll prepare a >>>>>>>>>>>> >>> PSARC >>> >>>>>>>>>>>> file for the new option if this is acceptable. No is also a >>>>>>>>>>>> >>> good >>> >>>>>>>>>>>> answer with good reason. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> -Aubrey >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> Bill.Holler Wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>> >>>>>>>>>>>>>> This proposal is for a mechanism to set the new MSR >>>>>>>>>>>>>> IA32_ENERGY_PERF_BIAS_MSR. This is a new hardware >>>>>>>>>>>>>> feature. The MSR effects overall power/performance. >>>>>>>>>>>>>> It gives a hint to the processor & package for desired >>>>>>>>>>>>>> power/performance characteristics. It is related to p- >>>>>>>>>>>>>> >>> states >>> >>>>>>> and >>>>>>> >>>>>>>>>>>>>> c-states (and may effect these features), but this feature >>>>>>>>>>>>>> >>> can >>> >>>>>>>>>>>>>> have other socket/system-level effects as well. >>>>>>>>>>>>>> The programmers guides do not go into details what the >>>>>>>>>>>>>> >>> other >>> >>>>>>>>>>>>>> effects can be. 
:-( >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> The perf and power impact of this MSR is model specific. >>>>>>>>>>>>> It's able to throttle turbo on WSM and probably help to do >>>>>>>>>>>>> >>> more >>> >>>>>>>>>>>>> hardware decision in future. For example, when the short >>>>>>>>>>>>> >>>>>>> interrupt >>>>>>> >>>>>>>>>>>>> storm is detected, it can demote CC6 request to CC3. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> On 11/05/09 05:15, minskey guo wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Jedy Wang ??: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi Li, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> As far as I know, gnome-power-manager has removed the >>>>>>>>>>>>>>>> >>>>> support >>>>> >>>>>>>>>>>>>>>> for changing governor which is the same as profile I >>>>>>>>>>>>>>>> >>> think. >>> >>>>> I >>>>> >>>>>>>>>>>>>>>> remember someone wrote a blog explaining the reason but >I >>>>>>>>>>>>>>>> >>>>> can >>>>> >>>>>>>>>>>>>>>> not find it now. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>> I >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>>>> wonder why what makes us still need to implement this >>>>>>>>>>>>>>>> >>>>> feature. >>>>> >>>>>>>>>>>>>>> In linux world, there is ondemand governor in kernel. It >>>>>>>>>>>>>>> >>> sets >>> >>>>>>>>>>>>>>> cpu freqency according to cpu's current load. So, >somebody >>>>>>>>>>>>>>> consider that >>>>>>>>>>>>>>> >>>>>>>>>> eveybody >>>>>>>>>> >>>>>>>>>>>>>>> should use that governor, and let CPUs finish their jobs >>>>>>>>>>>>>>> >>> asap >>> >>>>>>>>>>>>>>> and >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> then >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>>> enter >>>>>>>>>>>>>>> into C states for power-saving. Comparing to P state, c- >>>>>>>>>>>>>>> >>> state >>> >>>>>>>>>>>>>>> does >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> save >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>>> more power. That's why gnome removed it. 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> This is also model specific and depends on if the frequency >>>>>>>>>>>>> >>> and >>> >>>>>>>>>>>>> voltage and power are linear. That's true on latest >>>>>>>>>>>>> >>> processor >>> >>>>>> but >>>>>> >>>>>>>>>>>>> not on earlier processor. >>>>>>>>>>>>> >>>>>>>>>>>>> I'm not sure why gnome removed it, but seems not a good >idea >>>>>>>>>>>>> >>> to >>> >>>>>>>>>>>>> me. Some users want max perf and others want longer battery >>>>>>>>>>>>> >>>>> life. >>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> Yes, a good p-state + c-state implementation is not easy >to >>>>>>>>>>>>>> >>>>>> tune >>>>>> >>>>>>>>>>>>>> for more power savings. Running in lower p-states when a >>>>>>>>>>>>>> >>> CPU >>> >>>>>> is >>>>>> >>>>>>>>>>>>>> busy burns more power due to shorter time in deeper C- >>>>>>>>>>>>>> >>> states. >>> >>>>>>>>>>>>>> Entering deeper C-states too aggressively also burns more >>>>>>>>>>>>>> >>>>> power >>>>> >>>>>>>>>>>>>> (on both an idle and busy system) due to unnecessary >wakeup >>>>>>>>>>>>>> latency. ;-) Without knowing the details, it seems >likely >>>>>>>>>>>>>> >>>>>> that >>>>>> >>>>>>>>>>>>>> the gnome-power-manager was removed because setting it >made >>>>>>>>>>>>>> >>>>>>> worse >>>>>>> >>>>>>>>>>>>>> decisions than a runtime prediction. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Solaris currently has mechanisms to turn P-state and >deeper >>>>>>>>>>>>>> C-state support on/off. >>>>>>>>>>>>>> >>>>>>>>>>>>>> A requirement is that the Energy Perf Bias MSR can be set >>>>>>>>>>>>>> >>> on >>> >>>>>>>>>>>>>> systems not running a GUI. We would like to support a >>>>>>>>>>>>>> >>>>> possible >>>>> >>>>>>>>>>>>>> future Gnome interface to set this MSR if/when it exists. >>>>>>>>>>>>>> >>> The >>> >>>>>>>>>>>>>> proposal provides a mechanism that works on systems >without >>>>>>>>>>>>>> Gnome. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> Right, most of servers do not run gnome. 
I don't expect >>>>>>>>>>>>> >>> gnome >>> >>>>>>>>>>>>> support but it would be great if it will, :-) >>>>>>>>>>>>> >>>>>>>>>>>>> IMHO, we should use this global cpu power policy setting >>>>>>>>>>>>> >>>>> instead >>>>> >>>>>>>>>>>>> of "cpupm" and "cpu-deep-idle", this is more friendly to >the >>>>>>>>>>>>> user. The users just want more perf or more power, I think >>>>>>>>>>>>> >>> they >>> >>>>>>>>>>>>> don't care if the system support p/c- state at the same >time. >>>>>>>>>>>>> "cpupm" is a confusion only for p-state. we call "cpupm" >>>>>>>>>>>>> >>> before >>> >>>>>>>>>>>>> we have deep idle support. Actually cpu-deep-idle is also >>>>>>>>>>>>> >>> one >>> >>>>>>>>>>>>> part of cpu power management, :) >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>>> but, someone doesn't care power-saving, when comparing it >>>>>>>>>>>>>>> >>> to >>> >>>>>>>>>>>>>>> other factors. For example, if you are plagued by the >>>>>>>>>>>>>>> >>> noise >>> >>>>> of >>>>> >>>>>>>>>>>>>>> CPU fan, >>>>>>>>>>>>>>> >>>>>>>>>> and >>>>>>>>>> >>>>>>>>>>>>>>> expect quiet it then you can lower cpu frequency, which >>>>>>>>>>>>>>> >>>>>> results >>>>>> >>>>>>>>>>>>>>> in lower heat, and then fan can be stopped. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> personally, I vote +1 for this project if I could vote, >>>>>>>>>>>>>>> >>> but I >>> >>>>>>>>>>>>>>> don't >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> like >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>>> the names of "perf-bias" etc :) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Besides, can somebody tell me where >>>>>>>>>>>>>>> >>> IA32_ENERGY_PERF_BIAS_MSR >>> >>>>>>>>>>>>>>> comes ? Is it a part of IPS feature ? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> Intel's Software Developer's Manuals 2A describes CPUID >>>>>>>>>>>>>> >>>>>>> detection >>>>>>> >>>>>>>>>>>>>> of IA32_ENERGY_PERF_BIAS_MSR and volume 3A describes the >>>>>>>>>>>>>> >>> MSR. 
>>> >>>>>>>>>>>>>> http://www.intel.com/products/processor/manuals/ >>>>>>>>>>>>>> Sorry, I do not know what IPS stands for? >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> cough, cough, IPS is not a released feature and should not >>>>>>>>>>>>> >>> be >>> >>>>>>>>>>>>> discussed here, ;p >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> -Aubrey >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>> Bill >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> -minskey >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I remember why already support 2 profile through gnome- >>>>>>>>>>>>>>>> >>>>> power- >>>>> >>>>>>>>>>>>>>>> manager >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>> on >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>>>> Solaris. What's the difference between them? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I do not understand the exact meaning perf-bias, >balanced >>>>>>>>>>>>>>>> >>>>> and >>>>> >>>>>>>>>>>>>>>> power- >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>> bias >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>>>> either. Does not perf-bias means the cpu frequency will >>>>>>>>>>>>>>>> >>> be >>> >>>>>>>>>>>>>>>> always >>>>>>>>>>>>>>>> >>>>>>>>>> at >>>>>>>>>> >>>>>>>>>>>>> the >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>>>> highest level? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Jedy >>>>>>>>>>>>>>>> On Wed, 2009-11-04 at 08:47 +0800, Li, Aubrey wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> When we enable intel energy performance bias feature, >we >>>>>>>>>>>>>>>>> found the power profile implementation is necessary. >>>>>>>>>>>>>>>>> >>> Here I >>> >>>>>>>>>>>>>>>>> did a draft for cpu level power policy. 
>>>>>>>>>>>>>>>>> http://cr.opensolaris.org/~aubrey/cpu_power_policy_v1/ >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The proposal added a new keyword to /etc/power.conf >>>>>>>>>>>>>>>>> "cpu-power-policy", And we have 4 options for this new >>>>>>>>>>>>>>>>> keyword: 1) perf-bias 2) balanced >>>>>>>>>>>>>>>>> 3) power-bias >>>>>>>>>>>>>>>>> 4) default, the same as perf-bias. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> /etc/power.conf accepts the user input and passes the >>>>>>>>>>>>>>>>> >>>>>>> prefered >>>>>>> >>>>>>>>>>>>> policy >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>>>>> to the kernel thru ioctl. Then pm_ioctl calls the >>>>>>>>>>>>>>>>> >>> callback >>> >>>>>> to >>>>>> >>>>>>>>>>>>>>>>> walk >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>> a >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>>>>> cpu >>>>>>>>>>>>>>>>> power policy list. Every cpu pm feature which wants to >>>>>>>>>>>>>>>>> >>> be >>> >>>>>>>>>>>>>>>>> adjusted >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>> by >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>>>>> this option and verified to be supported will register >>>>>>>>>>>>>>>>> >>> its >>> >>>>>>>>>>>>>>>>> callback function to the list, so that it can be called >>>>>>>>>>>>>>>>> >>> and >>> >>>>>>>>>>>>>>>>> adjusted by pmconfig. >>>>>>>>>>>>>>>>> --------------------------------------------------- >- >>>>>>>>>>>>>>>>> >>> --- >>> >>>>> - >>>>> >>>>>>>>>>>>>>>>> /etc/power.conf | pm_ioctl(cpu_power_policy, policy) >>>>>>>>>>>>>>>>> | >>>>>>>>>>>>>>>>> cpu_power_policy_callb (policy) >>>>>>>>>>>>>>>>> | >>>>>>>>>>>>>>>>> ----> registered pm feature callback 1 >>>>>>>>>>>>>>>>> >>>>> (ENERGY_PERF_BIAS) >>>>> >>>>>>>>>>>>>>>>> | >>>>>>>>>>>>>>>>> ----> registered pm feature callback 2 >>>>>>>>>>>>>>>>> ... 
>>>>>>>>>>>>>>>>> ------------------------------------------------------- >- >>>>>>>>>>>>>>>>> >>> - >>> >>>>>>>>>>>>>>>>> Currently, only energy_perf_bias feature is registered, >>>>>>>>>>>>>>>>> because my intention is to support adjusting >>>>>>>>>>>>>>>>> >>>>>> energy_perf_bias >>>>>> >>>>>>>>>>>>>>>>> MSR without reboot. I guess >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>> we >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>>>>> probably >>>>>>>>>>>>>>>>> can add p/t/c-state support later. When we add p/t/c- >>>>>>>>>>>>>>>>> >>> state >>> >>>>>>>>>>>>>>>>> support, my quick thought is, this option will override >>>>>>>>>>>>>>>>> "cpupm" and "cpu-deep-idle" setting. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Welcome your any comments and suggestions. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> -Aubrey >>>>>>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>>>>>> pm-discuss mailing list >>>>>>>>>>>>>>>>> [email protected] >>>>>>>>>>>>>>>>> http://mail.opensolaris.org/mailman/listinfo/pm-discuss >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>>>>> pm-discuss mailing list >>>>>>>>>>>>>>>> [email protected] >>>>>>>>>>>>>>>> http://mail.opensolaris.org/mailman/listinfo/pm-discuss >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>>>> pm-discuss mailing list >>>>>>>>>>>>>>> [email protected] >>>>>>>>>>>>>>> http://mail.opensolaris.org/mailman/listinfo/pm-discuss >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>>> pm-discuss mailing list >>>>>>>>>>>>>> [email protected] >>>>>>>>>>>>>> http://mail.opensolaris.org/mailman/listinfo/pm-discuss >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>> pm-discuss mailing list >>>>>>>>>>>>> [email protected] 
>>>>>>>>>>>>> http://mail.opensolaris.org/mailman/listinfo/pm-discuss
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> tesla-dev mailing list
>>>>>>>>> [email protected]
>>>>>>>>> http://mail.opensolaris.org/mailman/listinfo/tesla-dev
>>>>>>>>>
>>>>>>>> Liu Jiang (Gerry)
>>>>>>>> OpenSolaris, OTC, SSG, Intel
>>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> ---------------------
>>>>>>> Julia Harper, [email protected]
>>>>>>>
_______________________________________________
pm-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pm-discuss
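
For reference, the policy values listed in the quoted onepager (default, perf-bias, balanced, power-bias, or keyword absent) map onto three effective policies. The C sketch below shows one way that mapping could be written. It is an assumption-laden illustration, not the pmconfig implementation: the function name `parse_system_pm_policy()` and the enum are hypothetical, a NULL argument is used here to model the keyword being absent from power.conf(4), and the proposed "energy-bias" rename from this thread is not included because the final keyword spelling is not settled.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical effective policies; names are illustrative only. */
typedef enum {
	SYS_PM_POLICY_BALANCED,
	SYS_PM_POLICY_PERF_BIAS,
	SYS_PM_POLICY_POWER_BIAS
} sys_pm_policy_t;

/*
 * Map a system-pm-policy value to an effective policy.
 * value == NULL models "keyword absent", which the onepager treats the
 * same as "default", i.e. balanced.
 * Returns 0 on success, -1 for an unknown value or a NULL out pointer.
 */
int
parse_system_pm_policy(const char *value, sys_pm_policy_t *out)
{
	if (out == NULL)
		return (-1);

	if (value == NULL || strcmp(value, "default") == 0 ||
	    strcmp(value, "balanced") == 0) {
		*out = SYS_PM_POLICY_BALANCED;
		return (0);
	}
	if (strcmp(value, "perf-bias") == 0) {
		*out = SYS_PM_POLICY_PERF_BIAS;
		return (0);
	}
	if (strcmp(value, "power-bias") == 0) {
		*out = SYS_PM_POLICY_POWER_BIAS;
		return (0);
	}
	return (-1);	/* unknown value: leave *out untouched */
}
```

Rejecting unknown values instead of silently falling back to balanced is a design choice made for this sketch; it lets the caller report a typo in power.conf while still keeping the out-of-box behavior when the keyword is missing.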
