** Description changed:

  This is a public version of https://bugs.launchpad.net/bugs/2049792
  
  Backport:  [SRF] performance: hwmon: (coretemp) Fix core count
  limitation  (merged upstream in 6.9) to jammy
  
- [Description]
-   coretemp driver supports at most 128 cores per package. Cores higher than 128 will lose their core temperature information.
-   Some SRF SKUs have more than 128 cores per package and triggers the issue.
+ [Impact]
+ 
+ In Linux 6.8 the coretemp driver supports at most 128 cores per package.
+ Any cores beyond that limit lose their core temperature information.
+ 
+ An upstream patch set adds support for more than 128 cores per package;
+ it was applied to linux-next and then to Noble.
+ 
+ We should apply the patch set to the Jammy 5.15 kernel, so that we can
+ properly support systems with a large number of cores per package.
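
To check whether a given machine is affected, the number of cores per package can be counted from the CPU topology files in sysfs. The following is a minimal sketch, not part of the patch set: the function name `cores_per_package` and the constant `OLD_LIMIT` are illustrative, the 128 threshold is taken from the description above, and the code assumes that `core_id` is unique within a package.

```python
import glob
import os
from collections import defaultdict

# Per-package core limit quoted in the bug description (assumption: the
# unpatched driver stops exposing sensors beyond this many cores).
OLD_LIMIT = 128


def cores_per_package(cpu_root="/sys/devices/system/cpu"):
    """Return {package_id: number of distinct core_ids} read from sysfs."""
    cores = defaultdict(set)
    for topo in glob.glob(os.path.join(cpu_root, "cpu[0-9]*", "topology")):
        try:
            with open(os.path.join(topo, "physical_package_id")) as f:
                pkg = int(f.read())
            with open(os.path.join(topo, "core_id")) as f:
                core = int(f.read())
        except (OSError, ValueError):
            # CPUs without readable topology files are skipped.
            continue
        # SMT siblings share a core_id, so a set counts each core once.
        cores[pkg].add(core)
    return {pkg: len(ids) for pkg, ids in cores.items()}


if __name__ == "__main__":
    for pkg, count in sorted(cores_per_package().items()):
        note = " (above the old limit)" if count > OLD_LIMIT else ""
        print(f"package {pkg}: {count} cores{note}")
```

Only packages reported as above the old limit would be expected to show missing core temperature sensors on an unpatched kernel.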
+ 
+ [Test case]
+ 
+ Read temperature info from /sys/class/hwmon on a system with > 128 cores
+ per package (no such system is available to us, so we don't have a proper
+ test case to verify the fix at the moment).
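
Reading the temperature info could look like the sketch below. It is an illustration only, written on the assumption that coretemp devices report the hwmon name `coretemp`, expose `tempN_label` / `tempN_input` pairs, and label per-core sensors as `Core N`; the helper names `read_coretemp` and `core_sensor_count` are hypothetical.

```python
import glob
import os
import re


def read_coretemp(hwmon_root="/sys/class/hwmon"):
    """Return {(hwmon_dir, label): millidegrees Celsius} for coretemp sensors."""
    readings = {}
    for dev in sorted(glob.glob(os.path.join(hwmon_root, "hwmon*"))):
        try:
            with open(os.path.join(dev, "name")) as f:
                if f.read().strip() != "coretemp":
                    continue  # some other hwmon driver
        except OSError:
            continue
        for label_path in sorted(glob.glob(os.path.join(dev, "temp*_label"))):
            input_path = label_path[: -len("_label")] + "_input"
            try:
                with open(label_path) as f:
                    label = f.read().strip()
                with open(input_path) as f:
                    value = int(f.read())
            except (OSError, ValueError):
                continue  # sensor vanished or returned an error
            readings[(os.path.basename(dev), label)] = value
    return readings


def core_sensor_count(readings):
    """Count per-core sensors (labels of the form 'Core N') per hwmon device."""
    counts = {}
    for dev, label in readings:
        if re.fullmatch(r"Core \d+", label):
            counts[dev] = counts.get(dev, 0) + 1
    return counts


if __name__ == "__main__":
    readings = read_coretemp()
    for (dev, label), milli in sorted(readings.items()):
        print(f"{dev} {label}: {milli / 1000:.1f} C")
    print(core_sensor_count(readings))
```

On a system with more than 128 cores per package, comparing the per-device sensor count before and after the fix would show whether the missing cores gained a temperature reading.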
  
  [Fix]
+ 
  The following series of patches is part of this improvement:
+ 
  1a793caf6f69 hwmon: (coretemp) Use dynamic allocated memory for core temp_data
  18b24a5f9ca3 hwmon: (coretemp) Remove redundant temp_data->is_pkg_data
  326241f71f3d hwmon: (coretemp) Split package temp_data and core temp_data
  b0b01414a261 hwmon: (coretemp) Abstract core_temp helpers
  87eb801925a0 hwmon: (coretemp) Remove redundant pdata->cpu_map[]
  18d8f5583388 hwmon: (coretemp) Replace sensor_device_attribute with device_attribute
  25f8e01baa05 hwmon: (coretemp) Remove unnecessary dependency of array index
  c8c2074020a8 hwmon: (coretemp) Introduce enum for attr index
+ 
  Some additional patches are required to make the backport apply cleanly:
+ 
  34cf8c657cf03 hwmon: (coretemp) Enlarge per package core count limit
  fdaf0c8629d45 hwmon: (coretemp) Fix bogus core_id to attr name mapping
  4e440abc89458 hwmon: (coretemp) Fix out-of-bounds memory access
  a2930f6dc90f0 hwmon: (coretemp) Delete an obsolete comment
  6c2b659913ad9 hwmon: (coretemp) Delete tjmax debug message
  0f8b916bc5b5d hwmon: (coretemp) avoid RDMSR interrupts to isolated CPUs
  fae30e3c203e0 hwmon: (coretemp) Add support for dynamic ttarget
  c0c67f8761cec hwmon: (coretemp) Add support for dynamic tjmax
  2bc0e6d07ee50 hwmon: (coretemp) rearrange tjmax handing code
  5c0e64dde80ff hwmon: (coretemp) Remove obsolete temp_data->valid
  
- Only 5c0e64dde80ff has to be modified as it's delete a variable which changed type
+ Only 5c0e64dde80ff has to be modified, as it deletes a variable whose type changed
  because of a refactoring.
  
- [Test]
- Verify on specific hardware if we can read temperature accordingly.
+ There are a number of commits, but they only change one file.
+ 
+ [Regression potential]
+ 
+ We may experience hwmon-related regressions, either systems reading
+ incorrect temperature information or even bugs/crashes when accessing
+ data from /sys/class/hwmon.

** Changed in: linux (Ubuntu Jammy)
       Status: New => In Progress

** Changed in: linux (Ubuntu Jammy)
     Assignee: (unassigned) => Thibf (thibf)

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2058668

Title:
   [SRF] performance: hwmon: (coretemp) Fix core count limitation


