<Sorry for the long delay. Strangely, this mail went missing from my
company mailbox, but fortunately I caught it in my Gmail box just now.>

I think the power-saving profile can be divided into two profiles if we
take laptops into account.

low power mode
  - The usage scenario is: the user moves his box from the desktop to his
    knees, and he probably needs it cooler or quieter. If we translate the
    requirement into implementation details, this could be i) redirect
    unaffinitized threads to a subset of cores in order to do socket or
    core offlining, ii) pick from the lower part of the supported P-states
    when throttling the frequency, rather than the marked one, and
    iii) activate the cooling mechanisms (i.e. T-states, fan control) at a
    lower temperature threshold, etc.
max power saving mode
  - The usage scenario is: the system has encountered a critical power
    issue, or it is a client console that does not care about performance
    at all. We translate the requirement into i) always entering the
    deepest supported idle state, ii) always running at the lowest
    frequency, and iii) aggregating threads as much as the system can in
    order to do socket or core offlining, etc. (A rough sketch of both
    profiles follows below.)
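
To make the two profiles concrete, here is a rough sketch in plain C
(purely illustrative: none of these structures or values exist in Solaris;
they only restate the knobs described above):

    #include <stdio.h>

    /*
     * Hypothetical knobs for the two power-saving profiles above.
     * Values are placeholders, not tuned recommendations.
     */
    struct ps_profile {
            const char *name;
            int online_core_pct;    /* cores kept online; rest offlined   */
            int pstate_ceiling_pct; /* how far up the P-state range we go */
            int deepest_idle_only;  /* 1: always request deepest C-state  */
            int cooling_thresh_c;   /* temperature to start T-states/fan  */
    };

    static const struct ps_profile low_power = {
            "low-power",        50, 50, 0, 60
    };
    static const struct ps_profile max_power_saving = {
            "max-power-saving", 25,  0, 1, 60
    };

    int
    main(void)
    {
            const struct ps_profile *p[] = { &low_power, &max_power_saving };

            for (int i = 0; i < 2; i++)
                    (void) printf("%s: %d%% cores, P-states <= %d%%, "
                        "deepest idle only=%d, cool at %dC\n",
                        p[i]->name, p[i]->online_core_pct,
                        p[i]->pstate_ceiling_pct, p[i]->deepest_idle_only,
                        p[i]->cooling_thresh_c);
            return (0);
    }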

Thanks,
-Aubrey

David J. Brown <> wrote:
> The attached is a draft of the case which introduces the current engineering
> program for Power Management in Solaris.
>
> Note that it is intentionally high-level, and does not itself present any
> specific interfaces.  It attempts to provide an overview of what the program
> has in mind, with a partial outline of some of the initially intended
> projects.  Specific component design and interfaces will be the purview of
> each of the succeeding one-pagers.
>
> Comments are invited.
> -db
>
> --
> ; David J. Brown Ph.D. (cantab.)
> ; Principal Engineer
> ; Solaris Engineering
> ; Oracle
> ; --
> ; Postal Address:                   Telephone: (650) 786-5558
> ;  4150 Network Circle, UMPK17-307  FAX:       (650) 786-5734
> ;  Santa Clara, CA 95054            e-mail:    [email protected]
>
>
>
> Template Version: @(#)onepager.txt 1.35 07/11/07 SMI
> Copyright 2007 Sun Microsystems
>
> 1. Introduction
>   1.1. Project/Component Working Name:
>        Power Management 2.0 Umbrella Case
>
>   1.2. Name of Document Author/Supplier:
>        David J. Brown
>
>   1.3. Date of This Document:
>        06/15/2010
>
>        1.3.1. Date this project was conceived:
>                June 2009 (This umbrella case is derived and extended
>                from earlier work to support suspend/resume and CPU
>                power management on Sun's x64 hardware platforms).
>
>   1.4. Name of Major Document Customer(s)/Consumer(s):
>        1.4.1. The PAC or CPT you expect to review your project:
>                Systems PAC
>        1.4.2. The ARC(s) you expect to review your project:
>                PSARC
>        1.4.3. The Director/VP who is "Sponsoring" this project:
>                [email protected], [email protected]
>        1.4.4. The name of your business unit:
>                x64 Platform Software
>
>   1.5. Email Aliases:
>        1.5.1. Responsible Manager:     [email protected]
>        1.5.2. Responsible Engineer:    [email protected]
>        1.5.3. Marketing Manager:       [email protected]
>        1.5.4. Interest List:           [email protected]
>
> 2. Project Summary
>   2.1. Project Description:
>        The existing power management in Solaris dates from over 17 years
>        ago (April 1993), when the original effort to implement checkpoint
>        resume (CPR) for the "Voyager" product took place.
>        Recently, there has been a great deal of vigor in the industry
>        related to energy efficiency, and hence the appearance of many new
>        power management facilities - ranging from individual hardware
>        components to contemporary hardware platforms.
>
>        Over the past four years specific work has been done to support
>        contemporary features on the Intel-architecture platforms (both
>        Sun's AMD- and Intel-based systems).  The principal focus of these
>        projects has been to implement Suspend-to-RAM (ACPI S3), and to
>        support contemporary CPU power-management features (P-states,
>        C-states, and T-states) for contemporary AMD and Intel processors.
>
>        A range of modern facilities for power management are emerging.
>        The system's earlier conceptions for power management need to be
>        revised to support these properly, and to pursue the end-objective of
>        energy-efficient computing.
>
>        This case introduces the Program, and a number of the initial
>        projects that constitute the next generation of power management
>        facilities within Solaris.
>
>
>   2.2. Risks and Assumptions:
>
>        The program's broad goal is to provide a practical solution to the
>        energy-efficient computing problem.  This requires the ability to
>        construct a comprehensive system power model (for each platform
>        upon which it runs), and will also ultimately require improved
>        knowledge of the dynamic resource use of both individual
>        applications and workloads.  We expect to gain better informational
>        interfaces from both hardware component vendors (e.g. Intel) and
>        the hardware platform teams to address the first point.
>        Improved knowledge of applications and workloads is a great
>        opportunity now that Sun has been integrated with Oracle.  This
>        relies on the participation of the various software product groups
>        in question.
>
> 3. Business Summary
>
>   3.1. Problem Area:
>        Support for contemporary component- and platform-level power
>        management facilities provides the basis for the more
>        energy-efficient operation of the computing hardware Sun/Oracle
>        sells.
>
>        A number of rudimentary facilities are now coming into place in the
>        hardware components and platforms, and the remainder of the systems
>        stack above must be improved to exploit these.  Particular
>        attention is needed at the levels of the firmware, the system
>        virtualization layer (hypervisor) and the operating system (whether
>        virtualized or running natively).
>
>        The focus of this program is on facilities implemented within the
>        Solaris operating system.  The initial attention will be to the
>        system design and implementation required when the OS is run on
>        bare iron (i.e. the non-virtualized case).  These techniques will
>        then be considered within the context where Solaris is a
>        virtualized guest, and extended as appropriate.  The specific
>        virtualized settings of interest are Solaris under both the Sun4v
>        and OVM hypervisors (i.e. these two para-virtualized cases).
>
>        The Program's primary focus will be on Server power management
>        (with particular attention to the company's volume servers to begin
>        with).
>
>
>   3.2. Market/Requester:
>        All customers are now attentive to the energy consequences of
>        the systems they operate.  This is of particular concern in
>        data centers, where the energy costs to operate equipment
>        can now be expected to meet or exceed the capital cost of the
>        equipment's acquisition.
>
>        All federal government customers (and possibly state and local
>        government ones, among others) will be required to purchase
>        equipment that meets the EPA's Energy Star guidelines.  The EPA
>        already has a specification for consumer equipment, and one for
>        mobile and workstation-class computer equipment.  It is presently
>        developing the Energy Star specifications for servers, storage, and
>        data centers.
>
>
>   3.3. Business Justification:
>        Energy management is now a principal concern for all computer
>        equipment purchasers - pointedly so for those in the
>        enterprise/commercial and government sectors.
>
>   3.4. Competitive Analysis:
>        This feature is needed for Solaris to be competitive with operating
>        systems from other vendors, and perhaps even to provide advantage
>        over them.
>
>   3.5. Opportunity Window/Exposure:
>        Exposure is immediate.  All major hardware component vendors are
>        driving and delivering these features, and we must support and
>        exploit them to keep pace.  Red Hat, Microsoft Windows (both client
>        and server), SuSE and others are all following these features and
>        working to support them.
>
>        Power management work in Solaris is publicly visible via
>        development projects in OpenSolaris.
>
>   3.6. How will you know when you are done?:
>        This Power Management Program can be considered done when the
>        system has a power management facility that can:
>
>        - Be enabled or disabled: when enabled, the system strives to be
>        "energy efficient" -- by making dynamic changes to the hardware
>        platform's provisioning and/or performance levels in order to
>        minimize the amount of energy required to perform any workload run
>        on the system.
>
>        - When enabled, the system administrator can express bounds on the
>        degradation in performance and/or responsiveness to dynamic changes
>        in load that the system will stay within.
>
>        - In addition (whether dynamic power management is enabled or not),
>        the system will be able to restrict itself to operate within a
>        specified portion of the hardware platform's full capacity when the
>        environment it's operating in requires that.  This may occur when
>        either power or reserved energy are limited; when the system is
>        running as a virtualized guest; or when otherwise specified by the
>        system's administrator.
>
>        Power Management will be enabled by default.  It is a goal that no
>        administrator would wish to disable it, except in a small number of
>        special or unusual circumstances:
>        a. Certain mission-critical or pseudo real-time deployments where
>        static provisioning for the worst case is required.
>        b. Pathological workloads whose dynamic behaviors grossly violate
>        the assumptions of the energy-efficiency algorithms used by the PM
>        system.
>
> 4. Technical Description:
>
>        Power management refers to the system's dynamic adjustment of a
>        platform's hardware resources.  This may be achieved by adjusting
>        the performance levels of particular resources, and/or what is
>        presently provisioned (available), in order to achieve the best
>        possible energy efficiency while running computational tasks.
>
>        At the highest level, the simple purpose of power management on
>        Oracle's platforms is to maximize energy-efficiency at all times.
>        That is, the objective is to minimize the total energy required to
>        complete any computational task, and/or to operate any service.
>        The following basic operating conceptions are illustrative, and may
>        be helpful to a more detailed understanding.
>
>        - "Performance-only"
>
>        The system does not perform any dynamic power management.  It
>        operates as systems software has traditionally, using a statically
>        defined set of resources whose performance levels and available
>        capacity are not adjusted according to the workload's requirements
>        while the system runs.
>
>        - Energy-efficient at maximum performance (elsewhere called
>        "Adaptive Performance")
>
>        The system does perform dynamic power management, but must still
>        achieve the maximum possible performance for sustained workloads.
>        Energy is saved where possible by dynamically adjusting the
>        platform's provisioning and resource performance levels, but only
>        for those hardware assets that do not improve the performance of
>        what is running.  The simplest way to think of this is that the
>        system will eliminate any gratuitous over-provisioning - all
>        resources which do not affect performance of the current workload
>        are appropriately adjusted.
>
>        The key desideratum for this choice is that there should be no
>        practical performance difference between workloads run in this way,
>        when compared to the situation in which no power management at all
>        is being done (i.e. when the system's power management function is
>        disabled).  In practice this may mean that we designate a certain
>        small "principled amount" of allowed performance regression, as
>        assessed under certain standard benchmarks.
>
>        - Energy-efficient with tolerated performance regression (elsewhere
>        called "Elastic")
>
>        The general case of energy-efficiency is that in which the
>        constraint of maximum possible performance for the workload can be
>        relaxed.  The system's objective is still to minimize the total
>        energy to perform any computational task run on the platform.
>
>        While the best achievable performance is not required, some bounds
>        are established in order to limit the degradation of performance
>        and/or responsiveness to changing load.  The system may adjust
>        provisioning dynamically, so long as it can restore full capacity
>        within the specified responsiveness when transient load requires
>        it.
>
>        - Power-constrained or Energy-constrained operation (elsewhere
>        called "Power-saving")
>
>        This operating constraint applies when there is a practical limit
>        on power (the rate of energy delivery) or a limited [reserve]
>        supply of energy available.  System capacity must be reduced to
>        remain within these limits.
>
>        In these cases, the system is prepared to degrade the quality of
>        one or more services, and/or to reduce the resourcing behind them,
>        which may reduce their service level.  In addition, it may be
>        decided that certain tasks or services are not to be run at all, in
>        favor of others that are deemed more critical.
>
>        The system might choose to reduce capacity and use that more
>        limited resource in such a way that all tasks are affected equally
>        (in their performance or responsiveness to changing demand).
>        Ideally, different tasks or services running on the system might be
>        degraded non-uniformly, with the objective of keeping critical
>        services running at the required throughput, so as to stay within
>        available power limits or so that an energy reserve can be made to
>        last for an appropriate duration.
>
>        The usage cases for this operating condition are a limited energy
>        reserve, such as for mobile systems whilst operating on battery
>        power, or for tethered systems that find themselves running on a
>        UPS, battery-backup, or generator backup power source due to a
>        power outage.  Another usage case is when instantaneous *power*
>        availability is limited - such as during a power utility brownout,
>        or any other power distribution situation that might cause this.
>        In each of these cases, load must be shed and/or service quality
>        reduced to stay within these limits.
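>
>        One purely illustrative way to make these conceptions concrete (the
>        names below are not defined by this case or by Solaris; they simply
>        restate the four modes and the bounds that accompany them):
>
>            /* Illustrative only - restates the conceptions above. */
>            typedef enum {
>                    PM_PERFORMANCE_ONLY,     /* no dynamic PM             */
>                    PM_ADAPTIVE_PERFORMANCE, /* efficient at max perf     */
>                    PM_ELASTIC,              /* tolerated perf regression */
>                    PM_POWER_SAVING          /* power/energy constrained  */
>            } pm_conception_t;
>
>            typedef struct {
>                    pm_conception_t mode;
>                    int max_perf_regression_pct; /* bound on degradation  */
>                    int max_reprovision_ms;      /* responsiveness bound  */
>                    int power_cap_watts;         /* when power is capped  */
>            } pm_policy_t;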
>
>    4.1. Details
>
>        This umbrella case provides context and scope for the Program.
>        Specific design is to be provided in the projects under it.
>        A number of projects are expected.  The following provides a
>        partial outline:
>
>        1. The system's power management facility will be implemented as a
>        Solaris service, with new high-level administrative controls
>        expressed in SMF (Solaris's service management facility).  These
>        controls will be abstract with respect to hardware platform and
>        instruction-set architecture.  The primary method of administration
>        is SMF.
>
>        2. Aspects of the system's earlier power management facility which
>        are obsolete or inadequate to address the above-described dynamic
>        power management solution will be removed.  This includes a number
>        of low-level implementation-specific controls (hardware platform
>        and/or ISA-specific) which were exposed earlier.
>
>        3. The usability of the service will be improved.  Both command-line
>        and programmatic interfaces to the PM service and its facilities
>        will be provided.  Appropriate exposure of the administrative
>        controls of the service is another consideration.  An appropriate
>        means to access aspects of the service's SMF description will be
>        provided.
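>
>        As a sketch of what the programmatic interface might eventually
>        feel like (the calls below are invented for illustration and
>        stubbed in place so the fragment is self-contained; the real
>        bindings are to be defined by the libpower project under this
>        umbrella):
>
>            #include <stdio.h>
>
>            /* Hypothetical policy get/set calls, stubbed for illustration. */
>            static char current_policy[32] = "adaptive-performance";
>
>            static int
>            pm_get_policy(char *buf, size_t len)
>            {
>                    (void) snprintf(buf, len, "%s", current_policy);
>                    return (0);
>            }
>
>            static int
>            pm_set_policy(const char *policy)
>            {
>                    /* A real binding would update and refresh the SMF
>                     * service's configuration; here we just record it. */
>                    (void) snprintf(current_policy,
>                        sizeof (current_policy), "%s", policy);
>                    return (0);
>            }
>
>            int
>            main(void)
>            {
>                    char buf[32];
>
>                    (void) pm_get_policy(buf, sizeof (buf));
>                    (void) printf("policy: %s\n", buf);
>                    (void) pm_set_policy("elastic");
>                    (void) pm_get_policy(buf, sizeof (buf));
>                    (void) printf("policy: %s\n", buf);
>                    return (0);
>            }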
>
>        4. An improved framework for resource-centric power management will
>        be provided.  The initial work done with the power-aware dispatcher
>        (as that relates to the CPU resources) will be used as the example
>        to extend similar dynamic power management capabilities to other
>        hardware devices on the platform.
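>
>        A hypothetical illustration of the shape such a resource-centric
>        framework could take (the power-aware dispatcher is only the
>        inspiration here; none of these types exist today):
>
>            /*
>             * Hypothetical per-resource operations the framework could use
>             * to observe utilization and adjust provisioning/performance.
>             */
>            typedef struct pm_resource_ops {
>                    int (*get_utilization)(void *res, int *pct); /* 0-100  */
>                    int (*set_perf_level)(void *res, int level); /* 0=low  */
>                    int (*quiesce)(void *res);                   /* offline */
>                    int (*unquiesce)(void *res);                 /* online  */
>            } pm_resource_ops_t;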
>
>        5. A new device driver interface for power-relevant operation will
>        be introduced, so that devices can describe their PM-relevant
>        capabilities to the system: for example, the various power and
>        performance states they offer, their ability to perform software
>        actions such as suspend/resume, and the interfaces required to
>        operate those controls.
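>
>        A hypothetical sketch of the kind of capability description a
>        driver might hand to the system (not an existing or proposed DDI
>        interface; the structures and registration call are invented purely
>        for illustration):
>
>            typedef struct pm_state_desc {
>                    const char *name;      /* e.g. "full", "reduced", "off" */
>                    int typical_power_mw;  /* draw while in this state      */
>                    int resume_latency_us; /* time to return to full power  */
>            } pm_state_desc_t;
>
>            typedef struct pm_dev_caps {
>                    int nstates;
>                    const pm_state_desc_t *states;
>                    int can_suspend_resume;                       /* boolean */
>                    int (*set_state)(void *dev, int state_index); /* control */
>            } pm_dev_caps_t;
>
>            /* Hypothetical call a driver would make at attach time. */
>            extern int pm_register_device(void *dev, const pm_dev_caps_t *caps);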
>
>        6. Development practices for power management and energy-aware
>        modules (with particular attention to device drivers) within the
>        Solaris OS will be defined and codified.
>
>        7. Observability and debuggability of the system's power-aware and
>        energy-efficient facilities and their operation will be improved.
>        DTrace probes seem one likely avenue.
>
>        8. The system's Suspend/Resume capability will be expanded to
>        encompass a broader range of system-level Power states.
>        The suspend/resume facility will be improved to encompass and unify
>        a more complete range of system suspend types (from power-on
>        suspend, best-available-suspend, suspend-to-RAM, and suspend-to-disk
>        [non-volatile storage], through hybrid-suspend, to soft-off).
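>
>        For orientation, the suspend types listed above can be related to
>        the ACPI sleeping states they customarily correspond to; an
>        illustrative enumeration (the unified facility itself remains to be
>        designed):
>
>            typedef enum {
>                    SUSPEND_POWER_ON,       /* ACPI S1                       */
>                    SUSPEND_TO_RAM,         /* ACPI S3: memory stays powered */
>                    SUSPEND_TO_DISK,        /* ACPI S4: image saved to
>                                               non-volatile storage          */
>                    SUSPEND_HYBRID,         /* S3 entry with an S4 image
>                                               kept as a fallback            */
>                    SUSPEND_BEST_AVAILABLE, /* deepest type the platform
>                                               and devices support           */
>                    SOFT_OFF                /* ACPI S5                       */
>            } suspend_type_t;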
>
>
>    4.2. Bug/RFE Number(s):
>        N/A
>
>    4.3. In Scope:
>        The framework and system implementation needed to offer a
>        power-management service that operates according to the
>        aforementioned approach to energy-efficiency.
>        This encompasses power management facilities for both active
>        (while-running) and inactive (non-running) operation of the
>        platform or its individual components.
>
>    4.4. Out of Scope:
>
>        Near-term, we will give much less attention to non-server systems
>        (desktops and laptops).
>
>    4.5. Interfaces:
>        Interfaces will be specified on a per-project basis.
>
>    4.6. Doc Impact:
>        Manual pages, developer docs and administration guides will be
>        impacted by the individual projects on this roadmap.
>
>    4.7. Admin/Config Impact:
>        The default installation will enable the system to perform dynamic
>        power management, and its default configuration shall be to perform
>        energy-efficient operation as described by the "adaptive
>        performance" conception above.
>        The administrator may configure the system to perform
>        energy-efficient operation under more relaxed performance
>        constraints.
>
>    4.8. HA Impact:
>        More rapid availability (activation) of non-provisioned
>        (idle/suspended) resources is one expected outcome of this program.
>
>    4.9. I18N/L10N Impact:
>        Limited to the addition of tens of messages as exposed by the new
>        SMF power service.
>
>    4.10. Packaging & Delivery:
>        Part of the core OS facilities delivered in the Solaris OS/Net
>        consolidation.
>
>    4.11. Security Impact:
>        This project introduces system-level facilities that require an
>        appropriate level of authorization to configure and/or enact, but
>        this is not in any way extraordinary.
>
>        Configuration and enactment of power-management actions will be
>        auditable.
>
>    4.12. Dependencies:
>        The capacity and utilization abstractions presently underlying the
>        implementation of the Power-aware dispatcher, and the
>        energy-efficiency heuristic it presently uses, are something this
>        Program expects to sustain.
>
> 5. Reference Documents:
>
>    5.1. Design/Specification documents
>        To be provided by each individual project under this umbrella.
>
>    5.2. Related documents
>        Brown, David J. and Charles Reams, "Toward Energy-efficient
>        Computing," Communications of the ACM, Vol. 53, No. 3, pp. 50-58,
>        March 2010,
>
>  http://cacm.acm.org/magazines/2010/3/76284-toward-energy-efficient-computing/fulltext
>
>        Saxe, Eric, "Power-efficient Software," Communications of the ACM,
>        Vol. 53, No. 2, pp. 44-48, Feb 2010,
>
>  http://cacm.acm.org/magazines/2010/2/69355-power-efficient-software/fulltext
>
>        Power management community on OpenSolaris:
>        http://hub.opensolaris.org/bin/view/Community+Group+pm/
>
>        Recent PM-related ARC cases
>
>        PSARC 2009/396 Tickless Kernel Architecture / lbolt decoupling
>        PSARC 2009/289 FBDIMM Idle Power Enhancement (FIPE) driver
>        PSARC 2009/283 Default enabling of CPU power management in S10U8 for x86 systems
>        PSARC 2009/112 sys-suspend(1)
>        PSARC 2009/101 Turbo mode observability
>        PSARC 2009/086 PowerTOP --cpu option
>        PSARC 2008/777 cpupm keyword mode extensions
>        PSARC 2008/742 SDcard Framework Suspend & Resume
>        PSARC 2008/376 PowerTOP for OpenSolaris
>        PSARC 2008/291 Power Management Core-disable for n2/vf CPUs
>        PSARC 2008/091 Libtopo enumeration of fans and power supplies via IPMI
>        PSARC 2008/021 HAL Power Management Support
>        PSARC 2007/679 CPUFreq HAL
>        PSARC 2006/273 Rage XL Framebuffer Driver
>        PSARC 2006/132 Wake On LAN
>        PSARC 2005/469 X86 Energy Star compliance
>
>
> 6. Resources and Schedule:
>   6.1. Projected Availability:
>        CY2010Q3 Suspend-to-disk
>        CY2010Q4 Power service (initial PM 2.0) SMF facility
>        CY2010Q4 libpower - C language bindings (programmatic interface)
>        CY2010Q4 Suspend/Resume for initial reference set of x64 volume
>                server products (e.g. x4170, x4270, x4275, Lynx+)
>        CY2011Q4 Energy-star compliance for certain volume servers
>
>   6.2. Cost of Effort:
>        To be defined by the follow-on projects.
>
>   6.3. Cost of Capital Resources:
>        Existing lab clients and servers will be used.
>
>   6.4. Product Approval Committee requested information:
>        6.4.1. Consolidation or Component Name:
>                ON
>        6.4.3. Type of CPT Review and Approval expected:
>                Standard
>        6.4.4. Project Boundary Conditions:
>                To be defined by the follow-on projects.
>        6.4.5. Is this a necessary project for OEM agreements:
>                No
>        6.4.6. Notes:
>                // See dependencies section above.
>        6.4.7. Target RTI Date/Release:
>                To be defined by the follow-on projects.
>        6.4.8. Target Code Design Review Date:
>                To be defined by the follow-on projects.
>        6.4.9. Update approval addition:
>                N/A
>
>   6.5. ARC review type:
>        Standard
>
>   6.6. ARC Exposure:
>        open
>       6.6.1. Rationale:
>                N/A
>
> 7. Prototype Availability:
>   7.1. Prototype Availability:
>        To be defined by the follow-on projects.
>
>   7.2. Prototype Cost:
>        To be defined by the follow-on projects.
>
>
_______________________________________________
pm-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pm-discuss
