On 2/9/26 2:53 AM, Chen Ridong wrote:

On 2026/2/7 4:37, Waiman Long wrote:
Now that we are going to defer any changes to the HK_TYPE_DOMAIN
housekeeping cpumasks to either task_work or workqueue, where the
rebuild_sched_domains() call will be issued, the current
rebuild_sched_domains_locked() call near the end of the cpuset critical
section can be removed in such cases.

Currently, a boolean force_sd_rebuild flag is used to decide if
rebuild_sched_domains_locked() needs to be invoked. To allow such
deferral, we change it to a tri-state sd_rebuild enumeration
type.

Signed-off-by: Waiman Long <[email protected]>
---
  kernel/cgroup/cpuset.c | 20 ++++++++++++++------
  1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index d26c77a726b2..e224df321e34 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -173,7 +173,11 @@ static bool                isolcpus_twork_queued;  /* T */
   * Note that update_relax_domain_level() in cpuset-v1.c can still call
   * rebuild_sched_domains_locked() directly without using this flag.
   */
-static bool force_sd_rebuild;                  /* RWCS */
+static enum {
+       SD_NO_REBUILD = 0,
+       SD_REBUILD,
+       SD_DEFER_REBUILD,
+} sd_rebuild;                                  /* RWCS */
/*
   * Partition root states:
@@ -990,7 +994,7 @@ void rebuild_sched_domains_locked(void)
lockdep_assert_cpus_held();
        lockdep_assert_cpuset_lock_held();
-       force_sd_rebuild = false;
+       sd_rebuild = SD_NO_REBUILD;
/* Generate domain masks and attrs */
        ndoms = generate_sched_domains(&doms, &attr);
@@ -1377,6 +1381,9 @@ static void update_isolation_cpumasks(void)
        else
                isolated_cpus_updating = false;
If isolated_hk_cpus is defined, I believe isolated_cpus_updating becomes
redundant.

Note that they have different exclusion rules. Other than that, you are right that "!cpumask_equal(isolated_hk_cpus, isolated_cpus)" should be equivalent to isolated_cpus_updating. But because of the different exclusion rules, there are restrictions on where you can use one or the other.

+       /* Defer rebuild_sched_domains() to task_work or wq */
+       sd_rebuild = SD_DEFER_REBUILD;
+
There is a potential issue: we defer all domain rebuilds here, including those
triggered by hotplug events which may change the isolation state.

The problem is that functions like cpuset_cpu_active, which rely on the
scheduler domains being up to date, will also be delayed. Is that okay?

No, we are not deferring all domain rebuilds. We are just deferring domain rebuilds that involve changes in the set of isolated CPUs. Domain rebuilds will still happen if there are no changes in the set of isolated CPUs. I need to take a further look to investigate whether this is a problem or not. Anyway, as suggested in my reply to Frederic, I am considering not changing isolated_cpus due to hotplug events. In that case, this problem should be gone.
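
To make the distinction concrete, here is a minimal user-space sketch of how the tri-state flag could behave. It is not the kernel code. The enum values and the reset in the rebuild path come from the quoted diff; every model_*() helper, the rebuild_count counter, the isolated_changed parameter, and the check at the end of the critical section are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Tri-state flag, with the same values as the enum added by the patch. */
enum sd_rebuild_state {
	SD_NO_REBUILD = 0,
	SD_REBUILD,
	SD_DEFER_REBUILD,
};

static enum sd_rebuild_state sd_rebuild;
static int rebuild_count;	/* hypothetical: number of rebuilds done */

/* Stand-in for rebuild_sched_domains_locked(): resets the flag, as in the diff. */
static void model_rebuild_locked(void)
{
	sd_rebuild = SD_NO_REBUILD;
	rebuild_count++;
}

/* Hypothetical: a cpuset change that needs the sched domains rebuilt. */
static void model_mark_rebuild(void)
{
	sd_rebuild = SD_REBUILD;
}

/*
 * Stand-in for update_isolation_cpumasks(). Assumption: the deferral is
 * only reached when the set of isolated CPUs actually changes.
 */
static void model_update_isolation(bool isolated_changed)
{
	if (!isolated_changed)
		return;
	/* Defer the rebuild to task_work or a workqueue. */
	sd_rebuild = SD_DEFER_REBUILD;
}

/*
 * Assumed check near the end of the cpuset critical section: rebuild
 * right away only when no deferral was requested. Returns true if a
 * rebuild was done here.
 */
static bool model_end_critical_section(void)
{
	if (sd_rebuild != SD_REBUILD)
		return false;
	model_rebuild_locked();
	return true;
}

/* Stand-in for the deferred task_work/workqueue callback. */
static void model_deferred_work(void)
{
	/* In this model the callback only runs after a deferral. */
	assert(sd_rebuild == SD_DEFER_REBUILD);
	model_rebuild_locked();
}
```

In this model, model_mark_rebuild() followed by model_update_isolation(false) still rebuilds at the end of the critical section, while model_update_isolation(true) leaves the rebuild to model_deferred_work().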

Cheers,
Longman

