On 2026/1/2 3:15, Waiman Long wrote:
> Since commit f62a5d39368e ("cgroup/cpuset: Remove remote_partition_check()
> & make update_cpumasks_hier() handle remote partition"), the
> compute_effective_exclusive_cpumask() helper was extended to
> strip exclusive CPUs from siblings when computing effective_xcpus
> (cpuset.cpus.exclusive.effective). This helper was later renamed to
> compute_excpus() in commit 86bbbd1f33ab ("cpuset: Refactor exclusive
> CPU mask computation logic").
>
> This helper is supposed to be used consistently to compute
> effective_xcpus. However, there is an exception within the callback
> critical section in update_cpumasks_hier() when exclusive_cpus of a
> valid partition root is empty. This can cause effective_xcpus value to
> differ depending on where exactly it is last computed. Fix this by using
> compute_excpus() in this case to give a consistent result.
>
> Signed-off-by: Waiman Long <[email protected]>
> ---
> kernel/cgroup/cpuset.c | 14 +++++---------
> 1 file changed, 5 insertions(+), 9 deletions(-)
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index da2b3b51630e..37d118a9ad4d 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -2168,17 +2168,13 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp,
> spin_lock_irq(&callback_lock);
> cpumask_copy(cp->effective_cpus, tmp->new_cpus);
> cp->partition_root_state = new_prs;
> - if (!cpumask_empty(cp->exclusive_cpus) && (cp != cs))
> - compute_excpus(cp, cp->effective_xcpus);
> -
> /*
> - * Make sure effective_xcpus is properly set for a valid
> - * partition root.
> + * Need to compute effective_xcpus if either exclusive_cpus
> + * is non-empty or it is a valid partition root.
> */
> - if ((new_prs > 0) && cpumask_empty(cp->exclusive_cpus))
> - cpumask_and(cp->effective_xcpus,
> - cp->cpus_allowed, parent->effective_xcpus);
> - else if (new_prs < 0)
> + if ((new_prs > 0) || !cpumask_empty(cp->exclusive_cpus))
> + compute_excpus(cp, cp->effective_xcpus);
> + if (new_prs < 0)
> reset_partition_data(cp);
> spin_unlock_irq(&callback_lock);
>
The code resets partition data only for new_prs < 0. My understanding is
that a partition is invalid when new_prs <= 0. Shouldn't
reset_partition_data() also be called when new_prs = 0? Is there a
specific reason to skip the reset in that case?
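
To make the question concrete, here is a small standalone sketch (not
kernel code) of the branch logic after this patch, as I read the hunk.
The enum values mirror the PRS_* constants used by cpuset; struct
actions and post_patch_actions() are hypothetical names used only for
illustration.

```c
#include <stdbool.h>

/* Values mirror the PRS_* partition root states used by cpuset. */
enum prs_state {
	PRS_INVALID_ISOLATED = -2,
	PRS_INVALID_ROOT     = -1,
	PRS_MEMBER           =  0,
	PRS_ROOT             =  1,
	PRS_ISOLATED         =  2,
};

/* Hypothetical: which of the two steps in the hunk would run. */
struct actions {
	bool compute_excpus;        /* compute_excpus() is called */
	bool reset_partition_data;  /* reset_partition_data() is called */
};

/*
 * Hypothetical helper modelling only the two conditions in the
 * patched hunk; excl_empty stands in for
 * cpumask_empty(cp->exclusive_cpus).
 */
static struct actions post_patch_actions(int new_prs, bool excl_empty)
{
	struct actions a = { false, false };

	if (new_prs > 0 || !excl_empty)
		a.compute_excpus = true;
	if (new_prs < 0)
		a.reset_partition_data = true;
	return a;
}
```

With new_prs == PRS_MEMBER and an empty exclusive_cpus, neither step
runs in this model, which is the case I am asking about.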
--
Best regards,
Ridong