[tesla-dev] Dispatcher driven P-state management under simple policy

Li, Aubrey Fri, 15 Aug 2008 10:25:24 +0800

Eric.Saxe wrote:

> Liu, Jiang wrote:
>> It's dependent on the domain coordination method. Currently
>> there are three kinds of domain coordination:
>> 1) Software all mode: to transition a domain into a specific
>> P-state, software needs to write to the P-state control register
>> on every CPU in the domain.
>> 2) Software any mode: to transition a domain into a specific
>> P-state, software only needs to write to the P-state control
>> register on one of the CPUs in the domain.
>> 3) Hardware coordination mode: each CPU in the domain writes
>> what it wants to the P-state control register and hardware/BIOS
>> will choose the correct P-state for the domain.
>> So the answer to your question is that it depends, and a domain
>> controller is needed to support the different coordination methods.
>>
> Thanks. The code as it's currently written lends itself particularly
> well to "Software any". Mark, Anup, Bill and I were discussing a few
> days ago ideas for what the implementation of the current simple
> policy would look like under modes 1 or 3.
> 
> What's currently implemented is analogous to how one might manage
> the state of a light switch in a room.
> - First one in the room turns the switch on.
> - Last one out of the room turns the switch off.
> - No polling to see if someone left the room without turning
> the switch off.
> 
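A minimal sketch of the light-switch policy described above, assuming one busy-CPU counter per P-state domain. The structure and function names are hypothetical and are not taken from the dispatcher code; locking and the actual register write are left out.

```c
#include <stdbool.h>

/* Hypothetical per-domain bookkeeping, for illustration only. */
typedef struct pstate_domain {
	unsigned int busy_cpus;   /* CPUs in the domain running a non-idle thread */
	bool         fast;        /* true if the domain was last asked to speed up */
	unsigned int transitions; /* number of P-state change requests issued */
} pstate_domain_t;

/* A CPU in the domain leaves idle: the first one in turns the switch on. */
void
domain_cpu_busy(pstate_domain_t *d)
{
	if (d->busy_cpus++ == 0) {
		d->fast = true;
		d->transitions++;
	}
}

/* A CPU in the domain goes idle: the last one out turns the switch off. */
void
domain_cpu_idle(pstate_domain_t *d)
{
	if (d->busy_cpus == 0)
		return;		/* unbalanced call, nothing to do */
	if (--d->busy_cpus == 0) {
		d->fast = false;
		d->transitions++;
	}
}
```

Only the 0 -> 1 and 1 -> 0 edges of the counter request a transition, so nothing has to poll. A real implementation would need to update the counter atomically or under a lock.
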
> This policy makes sense if P-state transitions are inexpensive. The
> more x-calls are needed to implement P-state transitions, the higher
> the transition cost (at least from the software point of view). Mark
> was remarking that even if the hardware handles the coordination, it
> might still be expensive, because it simply means someone else (the
> hardware) is having to do more work.
> 
> We want to be able to leverage any coordination the hardware is able
> to do, since that can reduce the number of x-calls needed to have a
> domain transition P-states. The best case for this is mode 2, where
> zero x-calls are needed, and the worst seems to be mode 1, where you
> always need to x-call all CPUs to have the domain transition
> P-states. Mode 3 is somewhere in the middle. If we can assume the
> behavior in 3 (that, for example, the real P-state for the domain is
> the fastest P-state requested), then that will help us to better
> understand when x-calls are needed, and when they aren't.
> 
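The x-call cost per coordination mode discussed above could be sketched roughly as follows. This assumes the initiating CPU is itself a member of the domain and that, in mode 3, the hardware picks the fastest P-state requested (the assumption mentioned in the quoted text, not something confirmed here). All names are hypothetical.

```c
/* The three coordination modes from the list earlier in the thread. */
typedef enum coord_mode {
	COORD_SW_ALL,	/* mode 1: write the control register on every CPU */
	COORD_SW_ANY,	/* mode 2: write the control register on any one CPU */
	COORD_HW	/* mode 3: every CPU posts a request, hardware decides */
} coord_mode_t;

/*
 * Worst-case number of x-calls the initiating CPU needs in order to move
 * a domain of ncpus CPUs to a new P-state. "raising" is non-zero when the
 * domain is being moved to a faster P-state.
 */
unsigned int
xcalls_needed(coord_mode_t mode, unsigned int ncpus, int raising)
{
	if (ncpus == 0)
		return (0);

	switch (mode) {
	case COORD_SW_ALL:
		/* Local write plus one x-call per remaining CPU. */
		return (ncpus - 1);
	case COORD_SW_ANY:
		/* A local write is enough. */
		return (0);
	case COORD_HW:
		/*
		 * Assumed "fastest request wins": raising needs only a
		 * local write; lowering needs every other CPU that still
		 * holds a faster request to lower it as well.
		 */
		return (raising ? 0 : ncpus - 1);
	}
	return (0);
}
```

Under these assumptions mode 3 lands between the other two: as cheap as mode 2 when speeding up, and as costly as mode 1 when slowing down a domain whose other CPUs have not yet lowered their own requests.
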
> The other way to go here is to always have CPUs make their own
> P-state transitions, and then track in software the actual speed of
> the domain. This has the benefit of never requiring x-calls, but all
> CPUs would need to request P-state transitions... which I suppose,
> if necessary, is better than having to do it via an x-call.
> 
> I think it's important that we:
>    - have an understanding of the real "performance" state of the
>      domain, since that will influence where threads are scheduled
>      to run.
>    - don't have to poll to determine when state transitions should
>      happen.
>    - find a good balance between the overhead resulting from policy
>      evaluation and state transitions, and being responsive about
>      making those state transitions.
> 
> Depending on the overhead, we can add some hysteresis into the
> policy... for example by only requesting a raising of P-state when a
> thread has successfully finished its time quantum, or is switching to
> another (non-idle) thread... which would help eliminate state
> transition thrashing triggered by otherwise transient system
> activity.
> 
> Thoughts?


It's a bit hard for me to understand why we need to implement P-state
policy or C-state policy in PAD. IMHO, what PAD does is choose a CPU
according to the domain power state. All the cores in a P-state domain
are assumed to run at the same speed. All the cores in a C-state domain
are assumed to sleep at the same depth. PAD just needs to collect this
information and determine which domain, and which CPU, should be chosen
for the next scheduled thread.
The decision of when to make a P-state speed transition should be put
into the P-state driver. Whether an x-call is needed or not may not
have a big impact on PAD. P-state speed transitions still need to be
checked periodically according to the workload.

C-states follow a similar methodology: the next idle type will not be
decided by PAD, but still by the length of the last C-state residency.
According to the policy and the current C-state domain power state, PAD
decides whether to schedule the next thread to the C3 domain or the C0
domain.
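
To illustrate the split I have in mind, here is a rough sketch in which PAD only reads the per-domain power state and picks a domain. The names are hypothetical, and the selection rule (prefer the shallowest C-state, then the fastest P-state) is just one possible policy assumed for the example.

```c
#include <stddef.h>

/* Hypothetical per-domain state as PAD might collect it from the drivers. */
typedef struct pad_domain {
	int cstate;	/* 0 = C0 (running), larger = deeper sleep */
	int pstate;	/* 0 = P0 (fastest), larger = slower */
} pad_domain_t;

/*
 * Pick the domain for the next thread: prefer the shallowest C-state,
 * and among equals the fastest P-state. Returns the index of the chosen
 * domain, or -1 if there are no domains.
 */
int
pad_pick_domain(const pad_domain_t *doms, size_t n)
{
	int best = -1;

	for (size_t i = 0; i < n; i++) {
		if (best < 0 ||
		    doms[i].cstate < doms[best].cstate ||
		    (doms[i].cstate == doms[best].cstate &&
		    doms[i].pstate < doms[best].pstate))
			best = (int)i;
	}
	return (best);
}
```

With domains { C3/P0, C0/P2, C0/P1 } this picks the third one: it is already awake, and it is faster than the other awake domain. PAD never requests a transition here; that stays in the P-state and C-state drivers.
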

Thanks,
-Aubrey
