On Thursday 25 Feb 2021 at 08:36:11 (+0000), [email protected] wrote:
> From: Vincent Donnefort <[email protected]>
> 
> find_energy_efficient_cpu() (feec()) computes for each perf_domain (pd) an
> energy delta as follows:
> 
>   feec(task)
>     for_each_pd
>       base_energy = compute_energy(task, -1, pd)
>         -> for_each_cpu(pd)
>            -> cpu_util_next(cpu, task, -1)
> 
>       energy_delta = compute_energy(task, dst_cpu, pd)
>         -> for_each_cpu(pd)
>            -> cpu_util_next(cpu, task, dst_cpu)
>       energy_delta -= base_energy
> 
> Then it picks the best CPU as being the one that minimizes energy_delta.
> 
> cpu_util_next() estimates the CPU utilization that would happen if the
> task was placed on dst_cpu as follows:
> 
>   max(cpu_util + task_util, cpu_util_est + _task_util_est)
> 
> The task contribution to the energy delta can then be either:
> 
>   (1) _task_util_est, on a mostly idle CPU, where cpu_util is close to 0
>       and _task_util_est > cpu_util.
>   (2) task_util, on a mostly busy CPU, where cpu_util > _task_util_est.
> 
>   (cpu_util_est doesn't appear here. It is 0 when a CPU is idle and
>    otherwise must be small enough so that feec() takes the CPU as a
>    potential target for the task placement)
> 
> This is problematic for feec(), as cpu_util_next() might give an unfair
> advantage to a CPU which is mostly busy (2) compared to one which is
> mostly idle (1). _task_util_est being always bigger than task_util in
> feec() (as the task is waking up), the task contribution to the energy
> might look smaller on certain CPUs (2) and this breaks the energy
> comparison.
> 
> This issue is, moreover, not sporadic. By starving idle CPUs, it keeps
> their cpu_util < _task_util_est (1) while others will maintain cpu_util >
> _task_util_est (2).
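
To make the bias concrete, here is a minimal sketch of the estimate quoted
above. It is a simplified model, not the kernel code: struct cpu_sample,
max_ul() and the task_contribution_old() helper are hypothetical names, and
the baseline without the task is assumed to be max(cpu_util, cpu_util_est).

```c
/* Hypothetical, simplified model of one CPU's signals (not a kernel struct). */
struct cpu_sample {
	unsigned long util;     /* cpu_util */
	unsigned long util_est; /* cpu_util_est */
};

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/* Estimate from the changelog: max(cpu_util + task_util, cpu_util_est + _task_util_est) */
static unsigned long util_with_task(const struct cpu_sample *c,
				    unsigned long task_util,
				    unsigned long task_util_est)
{
	return max_ul(c->util + task_util, c->util_est + task_util_est);
}

/* Assumed baseline when the task is placed elsewhere: max(cpu_util, cpu_util_est) */
static unsigned long util_without_task(const struct cpu_sample *c)
{
	return max_ul(c->util, c->util_est);
}

/* Utilization the task appears to add on this CPU under the old estimate. */
static unsigned long task_contribution_old(const struct cpu_sample *c,
					   unsigned long task_util,
					   unsigned long task_util_est)
{
	return util_with_task(c, task_util, task_util_est) - util_without_task(c);
}
```

With illustrative numbers task_util = 100 and _task_util_est = 200, an idle
CPU (util 0, util_est 0) sees a contribution of 200, case (1), while a busy
CPU (util 300, util_est 50) sees only 100, case (2), for the same task.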
> 
> Fix this problem by always using max(task_util, _task_util_est) as a task
> contribution to the energy (ENERGY_UTIL). The new estimated CPU
> utilization for the energy would then be:
> 
>   max(cpu_util, cpu_util_est) + max(task_util, _task_util_est)
> 
> compute_energy() still needs to know which OPP would be selected if the
> task were migrated into the perf_domain (FREQUENCY_UTIL). Hence,
> cpu_util_next() is still used to estimate the maximum util within the pd.
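
The split between the two estimates could be sketched as follows. This is
only an illustration under stated assumptions, not the patch itself: struct
cpu_sample, max_ul() and the three helpers are hypothetical names, and
capacity clamping is deliberately left out.

```c
/* Hypothetical, simplified model of one CPU's signals (not a kernel struct). */
struct cpu_sample {
	unsigned long util;     /* cpu_util */
	unsigned long util_est; /* cpu_util_est */
};

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/*
 * ENERGY_UTIL on the destination CPU, per the changelog:
 *   max(cpu_util, cpu_util_est) + max(task_util, _task_util_est)
 */
static unsigned long energy_util_with_task(const struct cpu_sample *c,
					   unsigned long task_util,
					   unsigned long task_util_est)
{
	return max_ul(c->util, c->util_est) + max_ul(task_util, task_util_est);
}

/*
 * FREQUENCY_UTIL keeps the existing estimate:
 *   max(cpu_util + task_util, cpu_util_est + _task_util_est)
 */
static unsigned long freq_util_with_task(const struct cpu_sample *c,
					 unsigned long task_util,
					 unsigned long task_util_est)
{
	return max_ul(c->util + task_util, c->util_est + task_util_est);
}

/* Task contribution to the energy estimate; no longer depends on CPU load. */
static unsigned long task_contribution_new(const struct cpu_sample *c,
					   unsigned long task_util,
					   unsigned long task_util_est)
{
	return energy_util_with_task(c, task_util, task_util_est) -
	       max_ul(c->util, c->util_est);
}
```

For illustrative values task_util = 100 and _task_util_est = 200, the
contribution is 200 on both an idle CPU (util 0, util_est 0) and a busy CPU
(util 300, util_est 50), so the energy comparison is made on equal terms.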
> 
> Signed-off-by: Vincent Donnefort <[email protected]>

Reviewed-by: Quentin Perret <[email protected]>

Thanks,
Quentin
