On 08/13/2012 08:21 PM, Alex Shi wrote:

> Since the CFS scheduler currently has no power saving consideration, I
> have a very rough idea for enabling a new power saving scheme in CFS.
> 
> It is based on the following assumptions:
> 1, If many tasks are crowded in the system, keeping only a few domains'
> cpus running and letting the other cpus idle cannot save power. Letting
> all cpus take the load, finish the tasks early, and then go idle will
> save more power and give a better user experience.
> 
> 2, Schedule domains and schedule groups perfectly match the hardware and
> the power consumption units. So pulling tasks out of a domain means
> that power consumption unit can potentially go idle.
> 
> So, following what Peter mentioned in commit 8e7fbcbc22c (sched: Remove
> stale power aware scheduling), this proposal will adopt the
> sched_balance_policy concept and use 2 kinds of policy: performance and
> power.
> 
> In scheduling, 2 places will care about the policy: load_balance(), and
> select_task_rq_fair() at task fork/exec.



Any comments on this rough proposal, especially on the assumptions?

> 
> Here is some pseudo code that tries to explain the proposed behaviour in
> load_balance() and select_task_rq_fair():
> 
> 
> load_balance() {
>       update_sd_lb_stats(); //get busiest group, idlest group data.
> 
>       if (sd->nr_running > sd's capacity) {
>               //the power saving policy is not suitable for
>               //this scenario, so behave like the performance policy
>               move tasks from busiest cpu in busiest group to
>               idlest cpu in idlest group;
>       } else {// the sd has enough capacity to hold all tasks.
>               if (sg->nr_running > sg's capacity) {
>                       //imbalanced between groups
>                       if (schedule policy == performance) {
>                               //when 2 busiest groups are equally
>                               //busy, need to prefer the one that has
>                               //the softest group??
>                               move tasks from busiest group to
>                                       idlest group;
>                       } else if (schedule policy == power)
>                               move tasks from busiest group to
>                               idlest group until busiest is just full
>                               of capacity.
>                               //the busiest group can balance
>                               //internally at the next LB.
>               } else {
>                       //all groups have enough capacity for their tasks.
>                       if (schedule policy == performance)
>                               //all tasks may have enough cpu
>                               //resources to run.
>                               //move tasks from busiest to idlest group?
>                               //no, at this time it's better to keep
>                               //the task on its current cpu.
>                               //so it may be better to balance
>                               //within each of the groups
>                               for_each_imbalance_groups()
>                                       move tasks from busiest cpu to
>                                       idlest cpu in each of groups;
>                       else if (schedule policy == power) {
>                               if (no hard pin in idlest group)
>                                       move tasks from idlest group to
>                                       busiest until busiest is full.
>                               else
>                                       move unpinned tasks to the biggest
>                                       hard-pinned group.
>                       }
>               }
>       }
> }
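
The decision tree in the load_balance() pseudo code above can be sketched
as a small pure function. This is a user-space illustration only, not
kernel code: every name here (lb_policy, lb_action, lb_stats,
lb_choose_action) is hypothetical, capacity is assumed to be a plain task
count, and the hard-pin case from the power branch is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; none of these exist in the kernel. */
enum lb_policy { LB_PERFORMANCE, LB_POWER };

enum lb_action {
    LB_SPREAD_ACROSS_DOMAIN,  /* domain overloaded: busiest cpu -> idlest cpu */
    LB_SPREAD_ACROSS_GROUPS,  /* performance: busiest group -> idlest group */
    LB_DRAIN_TO_CAPACITY,     /* power: move only the excess out of busiest */
    LB_BALANCE_INSIDE_GROUPS, /* performance: balance within each group */
    LB_PACK_INTO_BUSIEST      /* power: idlest group -> busiest until full */
};

/* Assumed inputs, standing in for what update_sd_lb_stats() would gather. */
struct lb_stats {
    unsigned int sd_nr_running;      /* running tasks in the whole domain */
    unsigned int sd_capacity;        /* tasks the domain can hold */
    unsigned int busiest_nr_running; /* running tasks in the busiest group */
    unsigned int busiest_capacity;   /* tasks the busiest group can hold */
};

enum lb_action lb_choose_action(const struct lb_stats *s, enum lb_policy policy)
{
    assert(s != NULL);

    /* More tasks than the domain can hold: policy no longer matters. */
    if (s->sd_nr_running > s->sd_capacity)
        return LB_SPREAD_ACROSS_DOMAIN;

    /* The domain has room, but the busiest group is over its capacity. */
    if (s->busiest_nr_running > s->busiest_capacity)
        return policy == LB_PERFORMANCE ? LB_SPREAD_ACROSS_GROUPS
                                        : LB_DRAIN_TO_CAPACITY;

    /* Every group has room for its own tasks. */
    return policy == LB_PERFORMANCE ? LB_BALANCE_INSIDE_GROUPS
                                    : LB_PACK_INTO_BUSIEST;
}
```

Only the last two branches depend on the policy; an overloaded domain
always takes the performance path, which matches assumption 1.
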
> 
> select_task_rq_fair()
> {
>       for_each_domain(cpu, tmp) {
>               if (policy == power && tmp_has_capacity &&
>                        tmp->flags & sd_flag) {
>                       sd = tmp;
>                       //It is fine to get a cpu in this domain
>                       break;
>               }
>       }
> 
>       while(sd) {
>               if (policy == power)
>                       group = find_busiest_and_capable_group();
>               else
>                       group = find_idlest_group();
>               if (!group) {
>                       sd = sd->child;
>                       continue;
>               }
>               ...
>       }
> }
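
The group choice inside the while loop above could look roughly like the
following. Again a user-space sketch under stated assumptions: struct
sg_sample and both pick_* helpers are hypothetical, "busiest" is taken to
mean the highest nr_running, and "capable" means nr_running is still below
a task-count capacity. A return of -1 stands for the !group case, where
the pseudo code descends to sd->child.

```c
#include <stddef.h>

/* Hypothetical per-group snapshot; not a kernel structure. */
struct sg_sample {
    unsigned int nr_running; /* tasks currently running in the group */
    unsigned int capacity;   /* tasks the group can hold */
};

/* Power policy: the fullest group that can still take one more task. */
int pick_busiest_capable_group(const struct sg_sample *g, size_t n)
{
    int best = -1;

    for (size_t i = 0; i < n; i++) {
        if (g[i].nr_running >= g[i].capacity)
            continue; /* already full, not capable */
        if (best < 0 || g[i].nr_running > g[best].nr_running)
            best = (int)i;
    }
    return best; /* -1 when every group is full */
}

/* Performance policy: the group with the fewest running tasks. */
int pick_idlest_group(const struct sg_sample *g, size_t n)
{
    int best = -1;

    for (size_t i = 0; i < n; i++) {
        if (best < 0 || g[i].nr_running < g[best].nr_running)
            best = (int)i;
    }
    return best; /* -1 only when n == 0 */
}
```

With four groups holding 4/4, 2/4, 3/4 and 0/4 tasks, the power helper
picks the 3/4 group so the empty one can stay idle, while the performance
helper picks the empty group.
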
> 
> sub proposal:
> 1, Is it possible to balance tasks onto the idlest cpu instead of the
> appointed 'balance cpu'? If so, it may save one more round of balancing.
> The idlest cpu can prefer the newly idle cpu, and is the least loaded cpu.
> 2, se or task load is good for setting running time, but it should be
> the second basis in load balancing. The first basis of LB is the number
> of running tasks in the group/cpu, since whatever the weight of the
> group is, if the number of tasks is less than the number of cpus, the
> group still has capacity to take more tasks. (will consider the SMT cpu
> power or other big/little cpu capacity on ARM.)
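
Sub proposal 2 amounts to a two-key ordering: task count first, load
weight only as a tie-breaker. A minimal sketch, assuming a hypothetical
struct grp_load and ignoring SMT and big/little capacity differences:

```c
#include <stdbool.h>

/* Hypothetical group summary; field names are illustrative only. */
struct grp_load {
    unsigned int nr_running; /* running tasks in the group */
    unsigned int nr_cpus;    /* cpus in the group */
    unsigned long weight;    /* summed load weight of the tasks */
};

/* First basis: fewer tasks than cpus means the group can take more. */
bool grp_has_room(const struct grp_load *g)
{
    return g->nr_running < g->nr_cpus;
}

/* True when a counts as busier than b: task count, then weight. */
bool grp_busier(const struct grp_load *a, const struct grp_load *b)
{
    if (a->nr_running != b->nr_running)
        return a->nr_running > b->nr_running;
    return a->weight > b->weight;
}
```

Under this ordering a group running 3 light tasks is busier than one
running 2 heavy tasks, and both still have room on 4 cpus.
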
> 
> unsolved issues:
> 1, like the current scheduler, it doesn't handle cpu affinity well in
> load_balance.
> 2, task groups aren't considered well in this rough proposal.
> 
> It isn't well thought out and may contain mistakes. So I am just sharing
> my ideas and hope they become better and workable through your comments
> and discussion.
> 
> Thanks
> Alex


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
