On 08/13/2012 08:21 PM, Alex Shi wrote:
> Since there is no power saving consideration in the scheduler's CFS, I
> have a very rough idea for enabling a new power saving schema in CFS.
>
> It is based on the following assumptions:
>
> 1, If many tasks crowd the system, letting only a few domain cpus run
> while the other cpus idle cannot save power. Letting all cpus take the
> load, finish the tasks early, and then go idle will save more power and
> give a better user experience.
>
> 2, A sched domain or sched group matches the hardware, and thus the
> power consumption unit, exactly. So pulling the tasks out of a domain
> potentially lets this power consumption unit go idle.
>
> So, following what Peter mentioned in commit 8e7fbcbc22c ("sched: Remove
> stale power aware scheduling"), this proposal adopts the
> sched_balance_policy concept and uses 2 kinds of policy: performance
> and power.
>
> In the scheduler, 2 places care about the policy: load_balance() and
> task fork/exec via select_task_rq_fair().
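The core decision implied by the two assumptions above (pack tasks while a domain still has capacity, spread once it is overloaded) could be sketched roughly as follows; every name here is illustrative, not the actual kernel API, and capacity is counted in tasks per the sub-proposal later in the mail:

```c
#include <assert.h>

/* Illustrative only: these names are not the actual kernel API. */
enum sched_balance_policy { POLICY_PERFORMANCE, POLICY_POWER };
enum lb_action { LB_SPREAD, LB_PACK };

/*
 * Per the proposal, capacity is counted in tasks: a group (or domain)
 * of nr_cpus cpus still has capacity while it runs fewer tasks than it
 * has cpus.
 */
static int has_capacity(unsigned int nr_running, unsigned int nr_cpus)
{
	return nr_running < nr_cpus;
}

/*
 * Pick the balancing direction: under the power policy, keep packing
 * tasks while the domain can hold them all; once the domain is
 * overloaded (sd->nr_running > sd's capacity), fall back to spreading
 * exactly as the performance policy does.
 */
static enum lb_action pick_lb_action(enum sched_balance_policy policy,
				     unsigned int sd_nr_running,
				     unsigned int sd_nr_cpus)
{
	if (policy == POLICY_PERFORMANCE)
		return LB_SPREAD;
	if (sd_nr_running > sd_nr_cpus)
		return LB_SPREAD;	/* power policy runs like performance */
	return LB_PACK;
}
```

Note that a fully loaded domain (nr_running == nr_cpus) still packs under the power policy, matching the strict `>` test in the pseudo code below.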
> Any comments on this rough proposal, especially on the assumptions?
>
> Here is some pseudo code that tries to explain the proposed behaviour
> in load_balance() and select_task_rq_fair():
>
> load_balance()
> {
> 	update_sd_lb_stats(); /* get busiest group, idlest group data */
>
> 	if (sd->nr_running > sd's capacity) {
> 		/*
> 		 * The power saving policy is not suitable for this
> 		 * scenario; run like the performance policy:
> 		 */
> 		move tasks from the busiest cpu in the busiest group
> 		to the idlest cpu in the idlest group;
> 	} else { /* the sd has enough capacity to hold all tasks */
> 		if (sg->nr_running > sg's capacity) {
> 			/* imbalance between groups */
> 			if (schedule policy == performance) {
> 				/*
> 				 * When the 2 busiest groups are at the
> 				 * same busy degree, prefer the one that
> 				 * has the softest group??
> 				 */
> 				move tasks from the busiest group to
> 				the idlest group;
> 			} else if (schedule policy == power) {
> 				move tasks from the busiest group to
> 				the idlest group until the busiest is
> 				just full of capacity;
> 				/*
> 				 * The busiest group can balance
> 				 * internally after the next LB.
> 				 */
> 			}
> 		} else {
> 			/* all groups have enough capacity for their tasks */
> 			if (schedule policy == performance) {
> 				/*
> 				 * All tasks may have enough cpu
> 				 * resources to run: move tasks from the
> 				 * busiest to the idlest group? No -- at
> 				 * this time it is better to keep each
> 				 * task on its current cpu, so it is
> 				 * probably better to balance within
> 				 * each of the groups:
> 				 */
> 				for_each_imbalanced_group()
> 					move tasks from the busiest cpu
> 					to the idlest cpu in the group;
> 			} else if (schedule policy == power) {
> 				if (no hard pin in the idlest group)
> 					move tasks from the idlest group
> 					to the busiest until the busiest
> 					is full;
> 				else
> 					move unpinned tasks to the
> 					biggest hard-pinned group;
> 			}
> 		}
> 	}
> }
>
> select_task_rq_fair()
> {
> 	for_each_domain(cpu, tmp) {
> 		if (policy == power && tmp_has_capacity &&
> 		    tmp->flags & sd_flag) {
> 			sd = tmp;
> 			/* it is fine to get a cpu in this domain */
> 			break;
> 		}
> 	}
>
> 	while (sd) {
> 		if (policy == power)
> 			find_busiest_and_capable_group();
> 		else
> 			find_idlest_group();
> 		if (!group) {
> 			sd = sd->child;
> 			continue;
> 		}
> 		...
> } > } > > sub proposal: > 1, If it's possible to balance task on idlest cpu not appointed 'balance > cpu'. If so, it may can reduce one more time balancing. > The idlest cpu can prefer the new idle cpu; and is the least load cpu; > 2, se or task load is good for running time setting. > but it should the second basis in load balancing. The first basis of LB > is running tasks' number in group/cpu. Since whatever of the weight of > groups is, if the tasks number is less than cpu number, the group is > still has capacity to take more tasks. (will consider the SMT cpu power > or other big/little cpu capacity on ARM.) > > unsolved issues: > 1, like current scheduler, it didn't handled cpu affinity well in > load_balance. > 2, task group that isn't consider well in this rough proposal. > > It isn't consider well and may has mistaken . So just share my ideas and > hope it become better and workable in your comments and discussion. > > Thanks > Alex -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/