Hi,

This series fixes a cpuset DL bandwidth rollback bug and adds a selftest to 
cover it.

cpuset_can_attach() only reserves deadline bandwidth when migrating DL
tasks to a disjoint CPU mask, but cpuset_cancel_attach() rolls back based
only on the presence of migrating DL tasks. This makes the rollback path
asymmetric: it can call dl_bw_free() even when no dl_bw_alloc() was done.
Because the alloc and free CPU selection also differed, rollback could
return bandwidth to a different root domain than the one originally
charged.

Patch 1 fixes the rollback accounting by recording the CPU used for
dl_bw_alloc() during attach and using that exact state in
cpuset_cancel_attach(). If no allocation happened, rollback skips
dl_bw_free(). Successful attach behavior is unchanged.

Patch 2 adds a sched_ext selftest which exercises the real attach
rollback path from userspace. The test uses a sched_ext scheduler whose
cgroup_prep_move() rejects SCHED_DEADLINE tasks so that the cpu
controller fails after cpuset has already accepted the move. It then
checks that dl_bw->total_bw on the target CPU remains unchanged across
the failed move.

Local testing:

With patch 1:
ok 1 cpuset_dl_rollback

Without patch 1:
ERR: cpuset_dl_rollback.c:760
Expected total_bw for CPU0 to remain unchanged (1677696 != 2027221)
not ok 1 cpuset_dl_rollback

Guopeng Zhang (2):
  cgroup/cpuset: record DL BW alloc CPU for attach rollback
  selftests/sched_ext: add cpuset DL rollback test

 kernel/cgroup/cpuset-internal.h               |   5 +
 kernel/cgroup/cpuset.c                        |  13 +-
 tools/testing/selftests/sched_ext/Makefile    |   1 +
 .../sched_ext/cpuset_dl_rollback.bpf.c        |  28 +
 .../selftests/sched_ext/cpuset_dl_rollback.c  | 810 ++++++++++++++++++
 5 files changed, 853 insertions(+), 4 deletions(-)
 create mode 100644 tools/testing/selftests/sched_ext/cpuset_dl_rollback.bpf.c
 create mode 100644 tools/testing/selftests/sched_ext/cpuset_dl_rollback.c

-- 
2.43.0


Reply via email to