Thanks again for all the feedback, Matthew.  I've incorporated the
requested suggestions and will send out a v2 shortly.

** Summary changed:

- Re-enable memcg v1 on Noble (6.14)
+ enable MEMCG_V1 and CPUSETS_V1 on Noble HWE

** Description changed:

  [Impact]
  
- Although v1 cgroups are deprecated in Noble, it was still possible for 
+ Although v1 cgroups are deprecated in Noble, it was still possible for
  users on 6.8 kernels to utilize them.  This was especially helpful in
- helping migrating users to Noble and then separately upgrading their
- remaining v1 cgroups applications.  Instead of requiring all users to
- upgrade and fix their v2 support, v1 support could be provisionally
- enabled until the necessary support was available in the applications
- that still lack v2 support.
+ the Noble migration process.  It allowed users to pick up the new OS and
+ then separately upgrade their remaining v1 cgroups applications.  This
+ unblocked the migration path for v1 cgroups users, because v1 support
+ could be provisionally enabled until the applications that still lack
+ v2 support catch up.
  
- Starting in 6.12, CONFIG_MEMCG_V1 was added and defaulted to false.
- Noble 6.8 users that were unlucky enough to still need V1 cgroups found
- that they could no longer use memcgs in the 6.14 kernel.
+ Starting in 6.12, CONFIG_MEMCG_V1 and CONFIG_CPUSETS_V1 were added and
+ defaulted to false.  Noble 6.8 users that were unlucky enough to still
+ need these V1 cgroups found that they could no longer use them in the
+ 6.14 kernel.
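+ 
+ For reference, the effective settings can be checked against a running
+ kernel's config; on an unpatched 6.14 kernel both options are expected
+ to show up as disabled:
+ 
+    $ grep -E 'CONFIG_(MEMCG|CPUSETS)_V1' /boot/config-$(uname -r) | sort
+    # CONFIG_CPUSETS_V1 is not set
+    # CONFIG_MEMCG_V1 is not set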
  
- Specific use cases include older JVMs that fail to correctly handle
- missing controllers from /proc/cgroups.  In that case, the container
- limit detection is turned off and the JVM uses the host's limits.
+ Some of the specific failures encountered include older JVMs that fail
+ to correctly handle controllers missing from /proc/cgroups.  If the
+ memory or cpuset controller is absent, container limit detection is
+ turned off and the JVM falls back to the host's limits.  JVMs
+ configured in containers with specific memory usage percentages then
+ end up consuming too much memory and often crash.
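+ 
+ As an illustration of how this can be checked from inside an affected
+ container (MaxRAMPercentage requires JDK 10+ or 8u191+; the 50% figure
+ is an arbitrary example):
+ 
+    $ java -XX:MaxRAMPercentage=50.0 -XX:+PrintFlagsFinal -version | grep MaxHeapSize
+ 
+ If container limit detection is working, the reported MaxHeapSize
+ tracks the container's memory limit; on an affected kernel it tracks
+ the host's memory instead.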
  
- Further, Apache Yarn is still completing their v1 -> v2 migration, which
- leaves some Hadoop use cases without proper support.
+ Apache Yarn is still completing their v1 -> v2 migration, which leaves
+ some Hadoop use cases without proper support.
  
- The request here is to enable MEMCG_V1 on Noble, but not newer releases,
- for as long as the Noble HWE kernel train still has kernels with cgroup
- v1 support.  This gives users a little bit longer to complete their
- migration while still using newer hardware, but with the understanding
- that this really is the end of the line for v1 cgroups.
+ The request here is to enable these V1 controllers on Noble, but not
+ newer releases, for as long as the Noble HWE kernel train still has
+ kernels with upstream cgroup v1 support.  This gives users a little bit
+ longer to complete their migration while still using newer hardware, but
+ with the understanding that this really is the end of the line for v1
+ cgroups.
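+ 
+ For context, even with these controllers built in, Noble still boots
+ with the unified (v2) hierarchy by default; users who need the v1
+ hierarchy typically opt in explicitly, e.g. via the systemd switch on
+ the kernel command line:
+ 
+    systemd.unified_cgroup_hierarchy=0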
  
  [Fix]
  
- Re-enable CONFIG_MEMCG_V1 in the 6.14 Noble config.
+ Re-enable the missing v1 controllers in the 6.14 Noble config.
+ 
+ In 6.8 there were 14 controllers.  In the current 6.14 config there are
+ also 14 controllers.  However, the set differs: in the current 6.14
+ build the dmem controller was added, while the cpuset and memory
+ controllers were removed.
+ 
+ Diffing both the /proc/cgroups and configs between the 6.14 and 6.8
+ releases gives:
+ 
+   -CPUSETS_V1 n
+   -MEMCG_V1 n
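+ 
+ A comparable check can be reproduced directly against the shipped
+ configs (the version strings below are placeholders for the installed
+ kernels); on 6.14 both options appear only as "is not set", while 6.8
+ predates them entirely:
+ 
+    $ diff /boot/config-6.8.0-XX-generic /boot/config-6.14.0-XX-generic \
+          | grep -E 'CPUSETS_V1|MEMCG_V1'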
+ 
+ These differences were also corroborated via source inspection.  Changes
+ in 6.12 moved these controllers behind new Kconfig options (and the
+ corresponding ifdefs) that default to n, so make olddefconfig disables
+ them when updating an older config.
+ 
+ In order to ensure that 6.14 has the same v1 cgroup controllers enabled
+ as 6.8, enable both CONFIG_CPUSETS_V1 and CONFIG_MEMCG_V1 for Noble.
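+ 
+ The intended end state in the generated config is simply the following
+ (the actual change is expected to be carried via the Ubuntu kernel
+ config annotations, so the patch itself may take a different form):
+ 
+    CONFIG_CPUSETS_V1=y
+    CONFIG_MEMCG_V1=y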
  
  [Test]
  
- Booted a kernel with this change and validated that v1 memcgs were
- present again.
+ Booted a kernel with this change and validated that the missing v1
+ controllers (memory and cpuset) were present again.
  
- [Potential Regression]
+ Before:
  
- The regression potential here should be low since this merely restores
- and existing feature that most users were not using but that a few still
- depended upon.
+    $ grep memory /proc/cgroups 
+    $ grep cpuset /proc/cgroups 
+    
+  with v1 cgroups enabled:
+    
+    $ mount | grep cgroup | grep memory
+    $ mount | grep cgroup | grep cpuset
+    
+    $ ls /sys/fs/cgroup | grep memory
+    $ ls /sys/fs/cgroup | grep cpuset
+ 
+ After:
+ 
+    $ grep memory /proc/cgroups 
+    memory     0       88      1
+    $ grep cpuset /proc/cgroups 
+    cpuset     0       88      1
+    
+  with v1 cgroups enabled:
+    
+    $ mount | grep cgroup | grep memory
+    cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
+    $ mount | grep cgroup | grep cpuset
+    cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
+    
+    $ ls /sys/fs/cgroup | grep memory
+    memory
+    $ ls /sys/fs/cgroup | grep cpuset
+    cpuset
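+ 
+ Beyond these presence checks, a quick functional sanity check of the v1
+ memory controller is possible (the cgroup name below is arbitrary):
+ 
+    $ sudo mkdir /sys/fs/cgroup/memory/lp2122368-test
+    $ echo 67108864 | sudo tee \
+          /sys/fs/cgroup/memory/lp2122368-test/memory.limit_in_bytes
+    $ sudo rmdir /sys/fs/cgroup/memory/lp2122368-test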
+ 
+ A config diff of the previous build versus a build cranked from these
+ patches:
+ 
+  CPUSETS_V1 n -> y
+  MEMCG_V1 n -> y
+ 
+ [Where problems can occur]
+ 
+ Since these changes re-introduce code that was disabled via ifdef,
+ there's a possible increase in the binary size.  After comparing the
+ results from an identical build with these config flags disabled, the
+ difference in compressed artifact size for an x86 vmlinuz is an increase
+ of 16k.
+ 
+ The difference in uncompressed memory usage after boot is an increase of
+ 40k, broken down as 21k code, 19k rwdata, 12k rodata, 8k init, -28k
+ bss, and 8k reserved.
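+ 
+ For reference, this breakdown matches the format of the kernel's
+ boot-time memory report, so the two builds can be compared with:
+ 
+    $ dmesg | grep 'Memory:'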
+ 
+ The primary remaining risk is future breakage of these interfaces,
+ since they are no longer part of the default configuration.  If these
+ options are not in upstream's test matrix, regressions could land
+ unnoticed.  However, the author is not aware of any actual v1 cgroups
+ breakage at the time this patch is being submitted.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2122368

Title:
  enable MEMCG_V1 and CPUSETS_V1 on Noble HWE

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-hwe-6.14/+bug/2122368/+subscriptions

