** Description changed:
This is a public version of https://bugs.launchpad.net/bugs/2049792
Backport: [SRF] performance: hwmon: (coretemp) Fix core count
limitation (merged upstream in 6.9) to jammy
- [Description]
- coretemp driver supports at most 128 cores per package. Cores higher than
128 will lose their core temperature information.
- Some SRF SKUs have more than 128 cores per package and triggers the issue.
+ [Impact]
+
+ In Linux 6.8 the coretemp driver supports at most 128 cores per package.
+ Cores beyond that limit lose their core temperature information.
+
+ An upstream patch set adds support for more than 128 cores per package;
+ it has been applied to linux-next and then to Noble.
+
+ We should apply the patch set to the Jammy 5.15 kernel, so that we can
+ properly support systems with a large number of cores per package.
+
+ [Test case]
+
+ Read temperature info from /sys/class/hwmon on a system with > 128 cores
+ per package (that means we don't have a proper test case to verify the
+ fix at the moment).
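
A minimal sketch of that check in Python, for anyone with access to
suitable hardware. It assumes the standard hwmon sysfs layout (a name
file per device, plus tempN_label and tempN_input files with values in
millidegrees Celsius). The function names read_coretemp and core_count
are illustrative only and are not part of the patch set.

```python
import glob
import os
import re


def read_coretemp(root="/sys/class/hwmon"):
    """Return {hwmon_dir: {label: millidegrees_C}} for coretemp devices."""
    readings = {}
    for dev in sorted(glob.glob(os.path.join(root, "hwmon*"))):
        # Each hwmon device names its driver in the "name" file.
        try:
            with open(os.path.join(dev, "name")) as f:
                if f.read().strip() != "coretemp":
                    continue
        except OSError:
            continue
        sensors = {}
        for label_path in glob.glob(os.path.join(dev, "temp*_label")):
            # tempN_label pairs with tempN_input.
            input_path = label_path[: -len("_label")] + "_input"
            try:
                with open(label_path) as f:
                    label = f.read().strip()
                with open(input_path) as f:
                    sensors[label] = int(f.read().strip())
            except (OSError, ValueError):
                # A sensor that cannot be read is skipped, not fatal.
                continue
        readings[dev] = sensors
    return readings


def core_count(sensors):
    """Count per-core sensors, ignoring the package-level one."""
    return sum(1 for label in sensors if re.fullmatch(r"Core \d+", label))


if __name__ == "__main__":
    for dev, sensors in read_coretemp().items():
        print(f"{dev}: {core_count(sensors)} core sensors")
```

On a kernel with the fix, the number of core sensors printed for each
package should match the number of cores in that package instead of
stopping at the old limit.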
[Fix]
+
A series of patches is part of this improvement:
+
1a793caf6f69 hwmon: (coretemp) Use dynamic allocated memory for core temp_data
18b24a5f9ca3 hwmon: (coretemp) Remove redundant temp_data->is_pkg_data
326241f71f3d hwmon: (coretemp) Split package temp_data and core temp_data
b0b01414a261 hwmon: (coretemp) Abstract core_temp helpers
87eb801925a0 hwmon: (coretemp) Remove redundant pdata->cpu_map[]
18d8f5583388 hwmon: (coretemp) Replace sensor_device_attribute with device_attribute
25f8e01baa05 hwmon: (coretemp) Remove unnecessary dependency of array index
c8c2074020a8 hwmon: (coretemp) Introduce enum for attr index
+
And some patches are required to make the backport clean:
+
34cf8c657cf03 hwmon: (coretemp) Enlarge per package core count limit
fdaf0c8629d45 hwmon: (coretemp) Fix bogus core_id to attr name mapping
4e440abc89458 hwmon: (coretemp) Fix out-of-bounds memory access
a2930f6dc90f0 hwmon: (coretemp) Delete an obsolete comment
6c2b659913ad9 hwmon: (coretemp) Delete tjmax debug message
0f8b916bc5b5d hwmon: (coretemp) avoid RDMSR interrupts to isolated CPUs
fae30e3c203e0 hwmon: (coretemp) Add support for dynamic ttarget
c0c67f8761cec hwmon: (coretemp) Add support for dynamic tjmax
2bc0e6d07ee50 hwmon: (coretemp) rearrange tjmax handing code
5c0e64dde80ff hwmon: (coretemp) Remove obsolete temp_data->valid
- Only 5c0e64dde80ff has to be modified as it's delete a variable which changed
type
+ Only 5c0e64dde80ff has to be modified, as it deletes a variable whose type
changed because of a refactoring.
- [Test]
- Verify on specific hardware if we can read temperature accordingly.
+ There are a number of commits, but they change only one file.
+
+ [Regression potential]
+
+ We may experience hwmon-related regressions, either systems reading
+ incorrect temperature information or even bugs/crashes when accessing
+ data from /sys/class/hwmon.
** Changed in: linux (Ubuntu Jammy)
Status: New => In Progress
** Changed in: linux (Ubuntu Jammy)
Assignee: (unassigned) => Thibf (thibf)
--
https://bugs.launchpad.net/bugs/2058668
Title:
[SRF] performance: hwmon: (coretemp) Fix core count limitation