Would the cpu_pm_policy be treated as essentially a cap on the register
setting?  That is, if the cpu_pm_policy setting maps to an MSR setting
of 9, does this mean the OS would then only dynamically choose MSR
values between 9 and 15?
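
To make the question concrete, here is a minimal sketch of the "cap" reading, assuming the bias hint is a 4-bit value where 0 favors performance and 15 favors power saving. The function names are hypothetical and do not come from the proposal or the webrev.

```python
# Hypothetical sketch of the "policy as a cap" interpretation.
# Assumes a 4-bit bias hint: 0 = favor performance, 15 = favor power saving.
EPB_MIN = 0
EPB_MAX = 15


def allowed_bias_range(policy_value):
    """Return the (low, high) range the OS could choose from if the
    policy value capped how far toward performance it may go."""
    if not EPB_MIN <= policy_value <= EPB_MAX:
        raise ValueError("bias hint must fit in 4 bits (0-15)")
    return policy_value, EPB_MAX


def clamp_bias(requested, policy_value):
    """Clamp a dynamically chosen hint into the range the policy allows."""
    low, high = allowed_bias_range(policy_value)
    return max(low, min(high, requested))
```

Under this reading, a policy that maps to 9 would turn a dynamic request for 4 into 9 and leave a request for 12 unchanged. Whether the OS would actually behave this way is exactly what I am asking.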

-- jdh


Bill Holler wrote:
> Hi,
> 
> I forgot to mention that cpu_pm_policy is just a policy.
> There is no guarantee it maps to a specific MSR or hardware
> implementation.
> 
> For example Solaris could be dynamically setting the
> ENERGY_PERFORMANCE_BIAS register to different
> settings depending on things such as system-load, the
> priority of the application being scheduled, a power policy
> of the application, or power policy of the zone.
> 
> Regards,
> Bill
> 
> 
> On 03/03/10 16:21, Bill Holler wrote:
>> +1.
>>
>> Hi Aubrey,
>>
>> I also think it is time to move forward with this proposal.
>> Generally we want the system to work best "out of the box"
>> with no tuning.  On the other hand, vendors will keep
>> improving products with new features, and there will
>> always be some specific applications where custom settings
>> may be better.  I feel this proposal supports innovation and
>> application-specific customization in line with the
>> OpenSolaris community goals.
>>
>> This proposal applies to all types of CPUs.  It uses
>> "cpu_pm_policy" instead of, for example, mentioning a
>> specific CPU's MSR.  ;-)  This proposal will be useful
>> with other CPUs if/when they have hardware mechanisms
>> for tuning power / performance.
>>
>>
>> In the ARC case we want to mention that there could
>> be a policy conflict between this component setting and
>> a system-power-policy, external Power Capping, etc.
>> Generally we want users to use the default or a higher
>> level policy such as the system power policy.
>> Unfortunately the system power policy may not be
>> fine-grained or diverse enough for some applications to
>> specify cpu power policy.  In that case cpu_pm_policy
>> will be useful.  My thought is: the user must really know
>> what they want if they specify a component policy
>> such as cpu_pm_policy instead of just using the
>> system power policy.  For that reason I feel cpu_pm_policy
>> should override the system-power-policy at the cpupm level.
>>
>> Power Capping is different.  Power Capping is an external
>> policy.  It is currently "owned" by the SP external to the
>> OS.  Power Capping should override a local cpu_pm_policy.
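
The precedence suggested above (external Power Capping over cpu_pm_policy, and cpu_pm_policy over the system power policy) can be sketched as follows. This is only an illustration: representing a power cap as a policy string and the function name itself are assumptions, not part of the proposal.

```python
# Hypothetical sketch of the precedence order described in the quoted text.
# A value of None means "not set".
def effective_cpu_policy(system_policy, cpu_pm_policy=None, power_cap_policy=None):
    """Pick the policy that applies at the cpupm level."""
    if power_cap_policy is not None:
        # External Power Capping (owned by the SP) wins over local settings.
        return power_cap_policy
    if cpu_pm_policy is not None:
        # An explicit component policy overrides the system power policy.
        return cpu_pm_policy
    # Otherwise fall back to the higher-level system power policy.
    return system_policy
```

For example, effective_cpu_policy("balanced", cpu_pm_policy="perf-bias") picks "perf-bias", while adding power_cap_policy="power-bias" makes the cap win.
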
>>
>>
>> Implementation comments:
>> IMHO mcpu_pm_policy pointer should be in the
>> mcpu_pm_mach_state structure instead of in the machcpu.
>> We may want to allow the user to specify a number
>> instead of just Perf, Balanced, Power, Default?
>>
>> Regards,
>> Bill
>>
>>
>> On 02/20/10 18:43, Li, Aubrey wrote:
>>> Hi Bill,
>>>
>>> I think it's time to continue this proposal, since b134 is closed and
>>> the build is no longer restricted. The power/perf bias setting is a
>>> starting point for future power-related work. I'll prepare a PSARC file
>>> for the new option if this is acceptable. "No" is also a good answer,
>>> given a good reason.
>>>
>>> Thanks,
>>> -Aubrey
>>>
>>>  
>>>> Bill.Holler Wrote:
>>>>   
>>>>> Hi,
>>>>>
>>>>> This proposal is for a mechanism to set the new MSR
>>>>> IA32_ENERGY_PERF_BIAS_MSR.   This is a new hardware
>>>>> feature.  The MSR affects overall power/performance.
>>>>> It gives a hint to the processor & package for desired
>>>>> power/performance characteristics.  It is related to p-states
>>>>> and c-states (and may affect these features), but this feature
>>>>> can have other socket/system-level effects as well.
>>>>> The programmer's guides do not go into detail about what the
>>>>> other effects can be.  :-(
>>>>>       
>>>> The perf and power impact of this MSR is model specific.
>>>> It can throttle turbo on WSM and will probably help the hardware
>>>> make more decisions in the future. For example, when a short interrupt
>>>> storm is detected, it can demote a CC6 request to CC3.
>>>>
>>>>   
>>>>> On 11/05/09 05:15, minskey guo wrote:
>>>>>     
>>>>>> Jedy Wang wrote:
>>>>>>       
>>>>>>> Hi Li,
>>>>>>>
>>>>>>> As far as I know, gnome-power-manager has removed support for
>>>>>>> changing the governor, which I think is the same as a profile. I
>>>>>>> remember someone wrote a blog post explaining the reason, but I
>>>>>>> cannot find it now. I wonder what makes us still need to
>>>>>>> implement this feature.
>>>>>>>           
>>>>>> In the Linux world, the kernel has an ondemand governor. It sets the
>>>>>> CPU frequency according to the CPU's current load. So some people
>>>>>> think that everybody should use that governor, letting CPUs finish
>>>>>> their jobs as soon as possible and then enter C-states to save
>>>>>> power. Compared to P-states, C-states do save more power. That's
>>>>>> why gnome removed it.
>>>>>>         
>>>> This is also model specific and depends on whether frequency, voltage,
>>>> and power scale linearly. That's true on the latest processors but not
>>>> on earlier ones.
>>>>
>>>> I'm not sure why gnome removed it, but it doesn't seem like a good idea
>>>> to me. Some users want max perf and others want longer battery life.
>>>>
>>>>   
>>>>> Yes, a good p-state + c-state implementation is not easy
>>>>> to tune for more power savings.  Running in lower p-states
>>>>> when a CPU is busy burns more power due to shorter time
>>>>> in deeper C-states.  Entering deeper C-states too aggressively
>>>>> also burns more power (on both an idle and busy system) due
>>>>> to unnecessary wakeup latency.  ;-)  Without knowing the
>>>>> details, it seems likely that the gnome-power-manager
>>>>> governor setting was removed because setting it by hand
>>>>> made worse decisions than a runtime prediction.
>>>>>
>>>>>
>>>>> Solaris currently has mechanisms to turn P-state and
>>>>> deeper C-state support on/off.
>>>>>
>>>>> A requirement is that the Energy Perf Bias MSR can be
>>>>> set on systems not running a GUI.  We would like to support
>>>>> a possible future Gnome interface to set this MSR if/when it
>>>>> exists.  The proposal provides a mechanism that works on
>>>>> systems without Gnome.
>>>>>       
>>>> Right, most servers do not run gnome. I don't expect gnome support,
>>>> but it would be great to have it. :-)
>>>>
>>>> IMHO, we should use this global cpu power policy setting instead of
>>>> "cpupm" and "cpu-deep-idle"; it is friendlier to the user. Users just
>>>> want more performance or more power savings; I don't think they care
>>>> whether the system supports P-states and C-states at the same time.
>>>> "cpupm" is a confusing name because it covers only P-states. We chose
>>>> the name "cpupm" before we had deep idle support. Actually,
>>>> cpu-deep-idle is also a part of cpu power management. :)
>>>>   
>>>>>     
>>>>>> But some people care less about power saving than about other
>>>>>> factors. For example, if you are plagued by the noise of the CPU fan
>>>>>> and want quiet, you can lower the cpu frequency, which results in
>>>>>> less heat, and then the fan can be stopped.
>>>>>>
>>>>>> Personally, I would vote +1 for this project if I could vote, but I
>>>>>> don't like the names "perf-bias" etc. :)
>>>>>>
>>>>>>
>>>>>> Besides, can somebody tell me where IA32_ENERGY_PERF_BIAS_MSR
>>>>>> comes from? Is it part of the IPS feature?
>>>>>>         
>>>>> Volume 2A of Intel's Software Developer's Manuals describes
>>>>> CPUID detection of IA32_ENERGY_PERF_BIAS_MSR
>>>>> and volume 3A describes the MSR.
>>>>> http://www.intel.com/products/processor/manuals/
>>>>> Sorry, I do not know what IPS stands for.
>>>>>       
>>>> cough, cough, IPS is not a released feature and should not be discussed
>>>> here, ;p
>>>>
>>>> Thanks,
>>>> -Aubrey
>>>>
>>>>   
>>>>> Regards,
>>>>> Bill
>>>>>
>>>>>
>>>>>     
>>>>>> -minskey
>>>>>>
>>>>>>
>>>>>>
>>>>>>       
>>>>>>> I remember we already support 2 profiles through gnome-power-manager
>>>>>>> on Solaris. What's the difference between them?
>>>>>>>
>>>>>>> I do not understand the exact meaning of perf-bias, balanced and
>>>>>>> power-bias either. Doesn't perf-bias mean the cpu frequency will
>>>>>>> always be at the highest level?
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Jedy
>>>>>>> On Wed, 2009-11-04 at 08:47 +0800, Li, Aubrey wrote:
>>>>>>>
>>>>>>>         
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> While enabling the Intel energy performance bias feature, we found
>>>>>>>> that a power profile implementation is necessary. Here is a draft
>>>>>>>> of a cpu-level power policy.
>>>>>>>> http://cr.opensolaris.org/~aubrey/cpu_power_policy_v1/
>>>>>>>>
>>>>>>>> The proposal adds a new keyword to /etc/power.conf,
>>>>>>>> "cpu-power-policy", with 4 options:
>>>>>>>> 1) perf-bias
>>>>>>>> 2) balanced
>>>>>>>> 3) power-bias
>>>>>>>> 4) default, the same as perf-bias.
>>>>>>>>
>>>>>>>> /etc/power.conf accepts the user input and passes the preferred
>>>>>>>> policy to the kernel through an ioctl. Then pm_ioctl calls the
>>>>>>>> callback to walk a cpu power policy list. Every cpu pm feature
>>>>>>>> that wants to be adjusted by this option, and is verified to be
>>>>>>>> supported, registers its callback function on the list so that it
>>>>>>>> can be called and adjusted by pmconfig.
>>>>>>>> --------------------------------------------------------
>>>>>>>> /etc/power.conf
>>>>>>>>     |
>>>>>>>>     pm_ioctl(cpu_power_policy, policy)
>>>>>>>>     |
>>>>>>>> cpu_power_policy_callb (policy)
>>>>>>>>     |
>>>>>>>>     ----> registered pm feature callback 1 (ENERGY_PERF_BIAS)
>>>>>>>>     |
>>>>>>>>     ----> registered pm feature callback 2
>>>>>>>>     ...
>>>>>>>> ---------------------------------------------------------
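
The flow in the diagram above (a policy arrives, then every registered feature callback is called with it) can be sketched in user-space Python as below. This is a rough model, not the kernel code from the webrev: the class and function names are hypothetical, and the policy-to-hint numbers are placeholders chosen only so the example runs.

```python
# Hypothetical user-space model of the callback list in the diagram.
VALID_POLICIES = ("perf-bias", "balanced", "power-bias", "default")

# Placeholder hint values for illustration only; the real mapping is
# not given in the thread.
ILLUSTRATIVE_BIAS = {"perf-bias": 0, "balanced": 7, "power-bias": 15}


def energy_perf_bias_callback(policy):
    """Stand-in for the ENERGY_PERF_BIAS feature callback."""
    return ILLUSTRATIVE_BIAS[policy]


class CpuPowerPolicyList:
    """List of feature callbacks that react to a policy change."""

    def __init__(self):
        self._callbacks = []

    def register(self, name, callback, supported=True):
        """Register a feature callback only if the feature is supported."""
        if supported:
            self._callbacks.append((name, callback))
        return supported

    def set_policy(self, policy):
        """Model of the ioctl path: walk the list, calling each callback."""
        if policy not in VALID_POLICIES:
            raise ValueError("unknown cpu-power-policy: " + policy)
        if policy == "default":
            policy = "perf-bias"  # the proposal defines default as perf-bias
        return [(name, cb(policy)) for name, cb in self._callbacks]
```

Registering energy_perf_bias_callback and calling set_policy("power-bias") would invoke just that one callback; p/t/c-state callbacks could later be added to the same list without changing the caller.
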
>>>>>>>> Currently, only the energy_perf_bias feature is registered, because
>>>>>>>> my intention is to support adjusting the energy_perf_bias MSR
>>>>>>>> without a reboot. I guess we can probably add p/t/c-state support
>>>>>>>> later. When we add p/t/c-state support, my quick thought is that
>>>>>>>> this option will override the "cpupm" and "cpu-deep-idle" settings.
>>>>>>>>
>>>>>>>> Any comments and suggestions are welcome.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> -Aubrey
>>>>>>>> _______________________________________________
>>>>>>>> pm-discuss mailing list
>>>>>>>> pm-discuss at opensolaris.org
>>>>>>>> http://mail.opensolaris.org/mailman/listinfo/pm-discuss
>>>>>>>>
>>>>>>>>             
> 
> _______________________________________________
> tesla-dev mailing list
> tesla-dev at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/tesla-dev

-- 

---------------------
     Julia Harper, julia.harper at sun.com
