---------- Forwarded message ----------
Date: Wed, 18 Feb 2009 12:20:32 -0800
From: Garrett D'Amore <[email protected]>
To: Randy Fishel <randy.fishel at sun.com>
Cc: PSARC-ext at sun.com, jiang.liu at intel.com, frank.wang at intel.com,
    tesla-dev at opensolaris.org
Subject: Re: CPU Idle Notification [PSARC/2009/115 FastTrack timeout 02/25/2009]

Randy Fishel wrote:
> I am sponsoring the following fasttrack for Gerry Liu from Intel.  It
> proposes a notification interface initially used and provided by memory
> power management work done by Intel engineers in the OpenSolaris
> Tesla project.
> 
> It requests micro/patch release binding.  All interfaces will have
> a "Volatile" stability.  Timeout is February 25th.
> 
> This project, release binding, and stability has been reviewed and
> approved by the OpenSolaris Tesla project leaders and contributors.
> 
> 
> 
> 1. Introduction
>     1.1. Project/Component Working Name:
>       CPU idle notification interface
> 
>     1.2. Name of Document Author/Supplier:
>       Author: Gerry Liu <jiang.liu at intel.com>
> 
>     1.3. Date of This Document:
>       November 16, 2008
> 
> 4. Technical Description:
>     4.1. Problem
>       A CPU idle notification mechanism is needed to signal other
>       components that are interested in CPU idle state change events
>       when a CPU enters or exits the idle state. This mechanism could be
>       used by the following components:
>       A) Memory power saving driver
>       B) Lazy TLB flush on x86 system
>       C) CPU power management framework
> 
>     4.2. Proposal
>       We propose to add the following data structures/interfaces to
>       the OpenSolaris kernel.
> 
>     4.2.1 CPU idle notification information data structure
>       typedef struct cpu_idle_info {
>               int      ci_flags;
>               int      ci_intr_count;    /* Interrupt count. */
>               int      ci_idle_state;    /* Idle state to enter. */
>               hrtime_t ci_idle_latency;  /* Idle round trip latency. */
>               hrtime_t ci_max_idle_time; /* Predicted max idle time. */
>               hrtime_t ci_last_idle_time;/* Last idle period. */
>               hrtime_t ci_last_busy_time;/* Last busy period. */
>       } cpu_idle_info_t;
>       This structure is used to pass CPU idle information to callbacks.
>       It could be extended to support architecture/platform specific
>       information in future.
>       Valid flags for cpu_idle_info_t:
>           CPU_IDLE_CI_FLAG_IDLE_STATE: field ci_idle_state is valid.
>           CPU_IDLE_CI_FLAG_IDLE_LATENCY: field ci_idle_latency is valid.
>           CPU_IDLE_CB_FLAG_MAX_IDLE_TIME: field ci_max_idle_time is valid.
> 
>   

Is there a reason that a single structure is required (are some of these fields
linked and need to be dealt with atomically?)  If not, then I'd prefer to have
separate name-value pairs for each kind of datum.  Perhaps something like a
named property interface.

My experience is that structures with lots of detail tend to have problems
evolving over time (you wind up with lots of "reserved" or "obsolete" fields...)
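
For illustration, a named property interface of the kind suggested above
might look roughly like the sketch below. Nothing here is part of the
proposal: the property names, idle_prop_t, and idle_prop_get() are all
hypothetical, and a real kernel interface would need to settle value
types and lifetime rules.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical property names; none of these appear in the proposal. */
#define IDLE_PROP_IDLE_STATE    "idle-state"
#define IDLE_PROP_IDLE_LATENCY  "idle-latency"
#define IDLE_PROP_MAX_IDLE_TIME "max-idle-time"

/* One name-value pair; ip_valid replaces the per-field ci_flags bits. */
typedef struct idle_prop {
	const char *ip_name;
	int64_t     ip_value;
	int         ip_valid;
} idle_prop_t;

/*
 * Look up a property by name.  Returns 0 and stores the value in *valp
 * on success, or -1 if the property is unknown or not currently valid.
 */
static int
idle_prop_get(const idle_prop_t *props, size_t nprops, const char *name,
    int64_t *valp)
{
	size_t i;

	for (i = 0; i < nprops; i++) {
		if (strcmp(props[i].ip_name, name) != 0)
			continue;
		if (!props[i].ip_valid)
			return (-1);
		*valp = props[i].ip_value;
		return (0);
	}
	return (-1);
}
```

With this shape, adding a new datum means adding a new name rather than
a new structure member, so existing callbacks are unaffected.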

>       typedef void (*cpu_idle_enter_cbfn)(void *arg,
>           cpu_idle_info_t *infop);
>       The entering idle state notification callback must obey all
>       constraints that apply to the idle thread because it will be called
>       in idle thread context.
>       Argument arg is the parameter passed in when registering the callback.
>       
>     4.2.3 Prototype of exiting idle state notification callback
>       typedef void (*cpu_idle_exit_cbfn)(void *arg, int flags);
>       The exiting idle state notification callback will be called in idle
>       thread context or interrupt context. The flags argument
>       distinguishes the calling context.
>       Argument arg is the parameter passed in when registering the callback.
>       Valid flags for exiting idle state notification callback:
>           CPU_IDLE_CB_FLAG_INTR: called in interrupt context
>           CPU_IDLE_CB_FLAG_IDLE: called in idle thread context
> 
>     4.2.4 CPU idle notification callback data structures
>       typedef struct cpu_idle_callback {
>               int                     version;
>               cpu_idle_enter_cbfn     idle_enter;
>               cpu_idle_exit_cbfn      idle_exit;
>       } cpu_idle_callback_t;
>       At least one of idle_enter and idle_exit must be non-NULL.
>   

What is the value of "version" in this structure?
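
For context, one common use of such a field is to let the registration
path reject callback structures built against a layout it does not
understand. The proposal does not say this is the intent; the sketch
below is only an assumption. CPU_IDLE_CALLBACK_VERS,
cpu_idle_callback_validate(), and count_exit() are hypothetical names.

```c
#include <errno.h>
#include <stddef.h>

typedef struct cpu_idle_info cpu_idle_info_t;	/* opaque in this sketch */

typedef void (*cpu_idle_enter_cbfn)(void *arg, cpu_idle_info_t *infop);
typedef void (*cpu_idle_exit_cbfn)(void *arg, int flags);

typedef struct cpu_idle_callback {
	int                 version;
	cpu_idle_enter_cbfn idle_enter;
	cpu_idle_exit_cbfn  idle_exit;
} cpu_idle_callback_t;

/* Hypothetical: the only structure layout this sketch understands. */
#define CPU_IDLE_CALLBACK_VERS	1

/*
 * Check a callback structure before accepting it.  Returns 0 if it is
 * usable, ENOTSUP for an unknown version, EINVAL for a NULL pointer or
 * a structure with no callbacks set.
 */
static int
cpu_idle_callback_validate(const cpu_idle_callback_t *cbp)
{
	if (cbp == NULL)
		return (EINVAL);
	if (cbp->version != CPU_IDLE_CALLBACK_VERS)
		return (ENOTSUP);
	if (cbp->idle_enter == NULL && cbp->idle_exit == NULL)
		return (EINVAL);
	return (0);
}

/* Example exit callback: arg points at an int wakeup counter. */
static void
count_exit(void *arg, int flags)
{
	(void) flags;
	(*(int *)arg)++;
}
```

If the structure never changes shape, a check like this adds nothing,
which is the substance of the question.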

>     4.2.5. Register CPU idle notification callback
>       int cpu_idle_register_callback(uint_t priority,
>           cpu_idle_callback_t *callbackp, void *arg, void **hdlpp);
>       This interface registers a callback to be called when the CPU idle
>       state changes. All registered callbacks will be called in priority
>       order from high to low when a CPU enters the idle state, and in
>       reverse order when a CPU exits the idle state.
>       Argument priority determines the calling order of registered
>       callbacks.
>       Argument arg will be passed back to the registered callback; how it
>       is used is up to the callback.
> 
>     4.2.6. Deregister CPU notification callback
>       int cpu_idle_unregister_callback(uint_t priority,
>           cpu_idle_callback_t *callbackp, void *arg, void *hdlp);
>       This interface deregisters a registered callback.
> 
>     4.2.7. Signal entering idle state event
>       void cpu_idle_enter(cpu_idle_info_t *infop);
>       This interface notifies the idle notification subsystem that a
>       specific CPU is entering the idle state.
> 
>     4.2.8. Signal exiting idle state event
>       void cpu_idle_exit(int flags);
>       This interface notifies the idle notification subsystem that a
>       specific CPU is exiting the idle state.
>   
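
The ordering rule in 4.2.5 (high to low priority on enter, the reverse
on exit) can be sketched in userland as below. This is a toy model and
not the proposed implementation: the sketch_* functions, MAX_CB, the
flag values, and the trace callbacks are all assumptions, and it omits
the handle argument, locking, and per-CPU state a kernel version would
need.

```c
#include <stddef.h>

/* Flag names are from the proposal; the numeric values are assumed. */
#define CPU_IDLE_CB_FLAG_INTR	0x1
#define CPU_IDLE_CB_FLAG_IDLE	0x2
#define MAX_CB			8	/* arbitrary table size */

typedef struct cpu_idle_info cpu_idle_info_t;	/* opaque in this sketch */
typedef void (*cpu_idle_enter_cbfn)(void *arg, cpu_idle_info_t *infop);
typedef void (*cpu_idle_exit_cbfn)(void *arg, int flags);

typedef struct cpu_idle_callback {
	int                 version;
	cpu_idle_enter_cbfn idle_enter;
	cpu_idle_exit_cbfn  idle_exit;
} cpu_idle_callback_t;

typedef struct cb_slot {
	unsigned int         priority;
	cpu_idle_callback_t *cb;
	void                *arg;
} cb_slot_t;

static cb_slot_t cb_table[MAX_CB];	/* sorted, highest priority first */
static size_t cb_count;

/* Insert a callback, keeping the table sorted by descending priority. */
static int
sketch_register(unsigned int priority, cpu_idle_callback_t *cbp, void *arg)
{
	size_t i;

	if (cbp == NULL || cb_count == MAX_CB)
		return (-1);
	/* Shift strictly lower-priority entries down one slot. */
	for (i = cb_count; i > 0 && cb_table[i - 1].priority < priority; i--)
		cb_table[i] = cb_table[i - 1];
	cb_table[i].priority = priority;
	cb_table[i].cb = cbp;
	cb_table[i].arg = arg;
	cb_count++;
	return (0);
}

/* Entering idle: call enter callbacks from high to low priority. */
static void
sketch_idle_enter(cpu_idle_info_t *infop)
{
	size_t i;

	for (i = 0; i < cb_count; i++) {
		if (cb_table[i].cb->idle_enter != NULL)
			cb_table[i].cb->idle_enter(cb_table[i].arg, infop);
	}
}

/* Exiting idle: call exit callbacks in the reverse order. */
static void
sketch_idle_exit(int flags)
{
	size_t i;

	for (i = cb_count; i > 0; i--) {
		if (cb_table[i - 1].cb->idle_exit != NULL)
			cb_table[i - 1].cb->idle_exit(cb_table[i - 1].arg,
			    flags);
	}
}

/* Demo callbacks: arg points at a one-character tag that is recorded. */
static char trace[32];
static size_t trace_len;

static void
trace_enter(void *arg, cpu_idle_info_t *infop)
{
	(void) infop;
	if (trace_len < sizeof (trace) - 1)
		trace[trace_len++] = *(const char *)arg;
}

static void
trace_exit(void *arg, int flags)
{
	(void) flags;
	if (trace_len < sizeof (trace) - 1)
		trace[trace_len++] = *(const char *)arg;
}
```

Registering a tag 'L' at priority 1 and a tag 'H' at priority 10, then
signalling one enter followed by one exit, records the tags in the
order H, L, L, H.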

This seems specific to "idle" notification.  Would it be useful to have
something that dealt with other kinds of CPU state transitions as well?
(C-states, for example?)

   -- Garrett
> 6. Resources and Schedule:
>     6.4. Steering Committee requested information
>       6.4.1. Consolidation C-team Name:
>               ON
>     6.5. ARC review type: FastTrack
>     6.6. ARC Exposure: open
> 
>   
