On Wed, Jun 07, 2017 at 01:18:57PM -0600, Jeffrey Hugo wrote:
> If load_balance() fails to migrate any tasks because all tasks were
> affined, load_balance() removes the source cpu from consideration and
> attempts to redo and balance among the new subset of cpus.
>
> There is a bug in this code path where the algorithm considers all active
> cpus in the system (minus the source that was just masked out). This is
> not valid for two reasons: some active cpus may not be in the current
> scheduling domain and one of the active cpus is dst_cpu. These cpus should
> not be considered, as we cannot pull load from them.
>
> Instead of failing out of load_balance(), we may end up redoing the search
> with no valid cpus and incorrectly concluding the domain is balanced.
> Additionally, if the group_imbalance flag was just set, it may also be
> incorrectly unset, thus the flag will not be seen by other cpus in future
> load_balance() runs as that algorithm intends.
>
> Fix the check by removing cpus not in the current domain and the dst_cpu
> from consideration, thus limiting the evaluation to valid remaining cpus
> from which load might be migrated.
>
> Co-authored-by: Austin Christ <[email protected]>
> Co-authored-by: Dietmar Eggemann <[email protected]>
> Signed-off-by: Jeffrey Hugo <[email protected]>
> Tested-by: Tyler Baicar <[email protected]>

Yes, this looks good. Thanks!
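
For readers following along, the redo check described in the quoted
changelog can be illustrated with a small userspace model. This is only a
sketch, not the patch: cpumasks are modelled as plain 64-bit words, and
the names cpu_bit(), clear_cpu(), should_redo_buggy() and
should_redo_fixed() are hypothetical helpers invented for the example,
not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Toy cpumask (assumption for this sketch): bit n set means cpu n is present. */
typedef uint64_t toy_cpumask_t;

/* Hypothetical helper: mask with only 'cpu' set. */
toy_cpumask_t cpu_bit(unsigned int cpu)
{
	assert(cpu < 64);
	return (toy_cpumask_t)1 << cpu;
}

/* Hypothetical helper: drop the source cpu after a failed migration. */
toy_cpumask_t clear_cpu(toy_cpumask_t cpus, unsigned int cpu)
{
	return cpus & ~cpu_bit(cpu);
}

/*
 * Behaviour described as buggy: retry whenever any active cpu is left,
 * even one outside the domain or the destination cpu itself.
 */
int should_redo_buggy(toy_cpumask_t cpus)
{
	return cpus != 0;
}

/*
 * Behaviour described as the fix: only cpus that are inside the current
 * domain and are not dst_cpu count as places to pull load from.
 */
int should_redo_fixed(toy_cpumask_t cpus, toy_cpumask_t domain_span,
		      unsigned int dst_cpu)
{
	toy_cpumask_t valid = cpus & domain_span & ~cpu_bit(dst_cpu);

	return valid != 0;
}
```

With four active cpus (0-3), a domain spanning cpus 0-1, dst_cpu 0 and
source cpu 1 masked out, the first check still asks for a redo although
nothing pullable remains, while the second one does not.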

