On 1/30/26 7:47 PM, Chen Ridong wrote:
> On 2026/1/30 23:42, Waiman Long wrote:
>> The update_isolation_cpumasks() function can be called either directly
>> from a regular cpuset control file write with cpuset_full_lock() held,
>> or via the CPU hotplug path with cpus_write_lock and cpuset_mutex held.
> Note this statement.
>> As we are going to enable dynamic updates to the nohz_full housekeeping
>> cpumask (HK_TYPE_KERNEL_NOISE) soon with the help of CPU hotplug,
>> allowing the CPU hotplug path to call into housekeeping_update()
>> directly from update_isolation_cpumasks() will likely cause a deadlock.
>> So we have to defer any call to housekeeping_update() until after the
>> CPU hotplug operation has finished. This is now done via a workqueue,
>> where the actual housekeeping_update() call, if needed, will happen
>> after cpus_write_lock is released.
>>
>> We can't use the synchronous task_work API because calls from the CPU
>> hotplug path happen in the per-cpu kthread of the CPU that is being
>> shut down or brought up. Because of the asynchronous nature of
>> workqueues, the HK_TYPE_DOMAIN housekeeping cpumask will be updated a
>> bit later than the "cpuset.cpus.isolated" control file in this case.
>>
>> Also add a check in test_cpuset_prs.sh and modify some existing test
>> cases to confirm that "cpuset.cpus.isolated" and the HK_TYPE_DOMAIN
>> housekeeping cpumask will both be updated.
>>
>> Signed-off-by: Waiman Long <[email protected]>
>> ---
>>  kernel/cgroup/cpuset.c                          | 37 +++++++++++++++++--
>>  .../selftests/cgroup/test_cpuset_prs.sh         | 13 +++++--
>>  2 files changed, 44 insertions(+), 6 deletions(-)
>>
>> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
>> index 7b7d12ab1006..0b0eb1df09d5 100644
>> --- a/kernel/cgroup/cpuset.c
>> +++ b/kernel/cgroup/cpuset.c
>> @@ -84,6 +84,9 @@ static cpumask_var_t isolated_cpus;
>>   */
>>  static bool isolated_cpus_updating;
>>
>> +/* Set when both cpuset_mutex and cpus_read_lock are acquired */
>> +static bool cpuset_locked;
>> +
>>  /*
>>   * A flag to force sched domain rebuild at the end of an operation.
>>   * It can be set in
>> @@ -285,10 +288,12 @@ void cpuset_full_lock(void)
>>  {
>>          cpus_read_lock();
>>          mutex_lock(&cpuset_mutex);
>> +        cpuset_locked = true;
>>  }
>>
>>  void cpuset_full_unlock(void)
>>  {
>> +        cpuset_locked = false;
>>          mutex_unlock(&cpuset_mutex);
>>          cpus_read_unlock();
>>  }
>> @@ -1285,6 +1290,16 @@ static bool prstate_housekeeping_conflict(int prstate, struct cpumask *new_cpus)
>>                  return false;
>>          }
>>
>> +static void isolcpus_workfn(struct work_struct *work)
>> +{
>> +        cpuset_full_lock();
>> +        if (isolated_cpus_updating) {
>> +                WARN_ON_ONCE(housekeeping_update(isolated_cpus) < 0);
>> +                isolated_cpus_updating = false;
>> +        }
>> +        cpuset_full_unlock();
>> +}
>> +
>>  /*
>>   * update_isolation_cpumasks - Update external isolation related CPU masks
>>   *
>> @@ -1293,14 +1308,30 @@ static bool prstate_housekeeping_conflict(int prstate, struct cpumask *new_cpus)
>>   */
>>  static void update_isolation_cpumasks(void)
>>  {
>> -        int ret;
>> +        static DECLARE_WORK(isolcpus_work, isolcpus_workfn);
>>
>>          if (!isolated_cpus_updating)
>>                  return;
> Can this happen?
>
> cpu0                                    cpu1
> [...]
> isolated_cpus_updating = true;
> ...
> // 'full_lock' is not acquired
> update_isolation_cpumasks
>                                         // exec worker concurrently
>                                         isolcpus_workfn
>                                           cpuset_full_lock
>                                           isolated_cpus_updating = false;
>                                           cpuset_full_unlock();
> // This returns incorrectly
> if (!isolated_cpus_updating)
>         return;

That is not true. Either cpus_read_lock or cpus_write_lock, together
with cpuset_mutex, is held whenever update_isolation_cpumasks() is
called, so its callers are mutually exclusive with the worker.
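To spell out the lock coverage: every access to isolated_cpus_updating
happens with cpuset_mutex held. A simplified sketch, where write_path()
and hotplug_path() are hypothetical stand-ins for the actual call sites:

        static void write_path(void)          /* cpuset control file write */
        {
                cpus_read_lock();
                mutex_lock(&cpuset_mutex);
                isolated_cpus_updating = true;
                update_isolation_cpumasks();  /* flag stable: mutex held */
                mutex_unlock(&cpuset_mutex);
                cpus_read_unlock();
        }

        static void hotplug_path(void)        /* CPU hotplug */
        {
                cpus_write_lock();
                mutex_lock(&cpuset_mutex);
                update_isolation_cpumasks();  /* flag stable: mutex held */
                mutex_unlock(&cpuset_mutex);
                cpus_write_unlock();
        }

isolcpus_workfn() takes cpuset_full_lock(), and hence cpuset_mutex as
well, before it tests and clears the flag, so the worker cannot
interleave with the flag check running on another CPU.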
Cheers,
Longman