** Description changed:

+ == SRU Justification [ HIRSUTE ] ==
+ 
  The upstream thermald 2.4.6 has been released with some more bug fixes
- that are pertinent to H/W available in older releases.  Pull in these
- fixes:
+ that are pertinent to H/W available in older releases such as Hirsute.
+ These fix a variety of issues found on H/W in the field, such as
+ over-throttling, handling alternate ACPI object names for B0D4, handling
+ trip zones which may have wrong settings, and disabling the legacy rapl
+ cdev when rapl-mmio is available.
+ 
+ == The fixes ==
+ 
+ Pull in these fixes:
  
  From 2.4.6:
  
  commit 273a53a11da2a7302ad7a5bc7e3bf04f221ce4e2
  Author: Srinivas Pandruvada <srinivas.pandruv...@linux.intel.com>
  Date:   Mon Jun 7 16:47:47 2021 -0700
  
-     Use Adaptive PPCC limits for RAPL MMIO
-     
-     Set the correct device name as RAPL-MSR so that RAPL-MMIO can
-     also set the correct default power limits.
+     Use Adaptive PPCC limits for RAPL MMIO
+ 
+     Set the correct device name as RAPL-MSR so that RAPL-MMIO can
+     also set the correct default power limits.
  
  commit 2dd67300448fa4a2aa8f3e00ee5b604c73a1f7d9
  Author: Srinivas Pandruvada <srinivas.pandruv...@linux.intel.com>
  Date:   Mon Jun 7 16:45:14 2021 -0700
  
-     Increase power limit for disabled RAPL-MMIO
-     
-     Increase 100W to 200W as some desktop platform already have limit
-     more than 100W.
+     Increase power limit for disabled RAPL-MMIO
+ 
+     Increase 100W to 200W as some desktop platform already have limit
+     more than 100W.
  
  commit 3de1004a49d0d157573bbdc1097b2fbed056879f
  Author: Srinivas Pandruvada <srinivas.pandruv...@linux.intel.com>
  Date:   Sun Jun 6 19:19:15 2021 -0700
  
-     Special case for default PSVT
-     
-     When there are no adaptive tables and only one default PSVT table
-     is present with just one entry with MAX type. Add one additional
-     entry as done for non default case.
-  
+     Special case for default PSVT
+ 
+     When there are no adaptive tables and only one default PSVT table
+     is present with just one entry with MAX type. Add one additional
+     entry as done for non default case.
+ 
  From 2.4.5:
  
  commit 301a89284e9d74a9a1f8315c1673a548dcacda8c
  Author: Srinivas Pandruvada <srinivas.pandruv...@linux.intel.com>
  Date:   Sat May 29 10:50:44 2021 -0700
  
-     Set a very high RAPL MSR PL1 with --adaptive
-     
-     After upgrading Dell Latitude 5420, again noticed performance degradation.
-     The PPCC power limit for MSR RAPL PL1 is reduced to 15W. Even though we
-     disable MSR RAPL with --adaptive option, it is not getting disabled. So
-     MSR RAPL limits still playing role.
-     
-     To fix that set a very high MSR RAPL PL1 limit so that it never causes
-     throttling. All throttling with --adaptive option is done using RAPL-MMIO.
+     Set a very high RAPL MSR PL1 with --adaptive
+ 
+     After upgrading Dell Latitude 5420, again noticed performance degradation.
+     The PPCC power limit for MSR RAPL PL1 is reduced to 15W. Even though we
+     disable MSR RAPL with --adaptive option, it is not getting disabled. So
+     MSR RAPL limits still playing role.
+ 
+     To fix that set a very high MSR RAPL PL1 limit so that it never causes
+     throttling. All throttling with --adaptive option is done using RAPL-MMIO.
  
  From 2.4.4:
  
  commit ea4491971059259e46daad10ae850d3d530b02f2
  Author: Srinivas Pandruvada <srinivas.pandruv...@linux.intel.com>
  Date:   Thu Mar 11 10:06:15 2021 -0800
  
-     Fix error for condition names
-     
-     The current code caps the max name as the last condition name,
-     which is "Power_Slider". So any condition more than 56 will be
-     printing error, with "Power_Slider" as condition name. For example
-     for condition = 57:
-     Unsupported condition 57 (Power_slider)
-     
-     This is confusing during debug, so print "UNKNOWN" for condition
-     name 56.
+     Fix error for condition names
+ 
+     The current code caps the max name as the last condition name,
+     which is "Power_Slider". So any condition more than 56 will be
+     printing error, with "Power_Slider" as condition name. For example
+     for condition = 57:
+     Unsupported condition 57 (Power_slider)
+ 
+     This is confusing during debug, so print "UNKNOWN" for condition
+     name 56.
  
  commit 9115f2fb0b296a22a62908f2718ca873af2a452f
  Author: Srinivas Pandruvada <srinivas.pandruv...@linux.intel.com>
  Date:   Tue Mar 9 16:36:13 2021 -0800
  
-     Check for alternate names for B0D4 device
-     
-     B0D4 can be named as TCPU or B0D4. So search for both names
-     if failed to find one.
+     Check for alternate names for B0D4 device
+ 
+     B0D4 can be named as TCPU or B0D4. So search for both names
+     if failed to find one.
  
  commit 660ee6f1f6351e6c291c0699147231be402c2bb8
  Author: Srinivas Pandruvada <srinivas.pandruv...@linux.intel.com>
  Date:   Tue Mar 9 16:33:08 2021 -0800
  
-     Delete all trips from zones before psvt install
-     
-     Initially zones has all the trips from sysfs, which may have wrong
-     settings. Instead of deleting only for matched psvt zones, delete
-     for all zones. In this way only zones which are in PSVT will be
-     present.
+     Delete all trips from zones before psvt install
+ 
+     Initially zones has all the trips from sysfs, which may have wrong
+     settings. Instead of deleting only for matched psvt zones, delete
+     for all zones. In this way only zones which are in PSVT will be
+     present.
  
  commit 1ad03424f7f3d339521635f08377b323375b2747
  Author: Srinivas Pandruvada <srinivas.pandruv...@linux.intel.com>
  Date:   Tue Mar 9 11:36:27 2021 -0800
  
-     Disable legacy rapl cdev when rapl-mmio is in use
-     
-     Explicitly disable legacy rapl based on MSR interface when rapl-mmio
-     is in use. This will prevent PL1/PL2 power limit from MSR based rapl,
-     which may not be the correct one.
+     Disable legacy rapl cdev when rapl-mmio is in use
  
- commit 9b5c7a45021a5376be98d88b903daeb2d5f30ac4
- Merge: 9e731de 45832e1
- Author: Srinivas Pandruvada <3802550+spandruv...@users.noreply.github.com>
- Date:   Thu Apr 1 12:09:54 2021 -0700
- 
-     Merge pull request #295 from ColinIanKing/master
-     
-     Fix spelling mistakes found using codespell
+     Explicitly disable legacy rapl based on MSR interface when rapl-mmio
+     is in use. This will prevent PL1/PL2 power limit from MSR based rapl,
+     which may not be the correct one.
  
  commit 45832e16290fec7353513b1ce03533b73b18f0c6
  Author: Colin Ian King <colin.k...@canonical.com>
  Date:   Thu Mar 18 11:22:38 2021 +0000
  
-     Fix spelling mistakes found using codespell
-     
-     There are a handful of spelling mistakes in comments and literal
-     strings found by codespell. Fix these.
-     
-     Signed-off-by: Colin Ian King <colin.k...@canonical.com>
+     Fix spelling mistakes found using codespell
+ 
+     There are a handful of spelling mistakes in comments and literal
+     strings found by codespell. Fix these.
+ 
+     Signed-off-by: Colin Ian King <colin.k...@canonical.com>
+ 
+ == Test plan ==
+ 
+ Testing is problematic, as the changes affect different H/W in
+ different ways, and exercising all of them requires fully loading the
+ CPU to try to trigger thermal overrun.
+ 
+ The test plan is as follows:
+ 
+ 1. Install thermald from -proposed and run it with debug logging enabled.
+ 2. Run stress-ng --cpu 0 for a few hours on various H/W with the air
+    vents plugged, to try to trigger thermal overrun and exercise
+    thermald's thermal throttling.
+ 3. Check the thermald log for state-change behaviour on thermal changes.
+ 
+ 
+ == Where problems could occur ==
+ 
+ These fixes have already been tested on H/W in the field, for example:
+ https://bugs.launchpad.net/ubuntu/+source/thermald/+bug/1930422
+ 
+ However, all these fixes can alter the functionality of thermald for a
+ range of platforms, so the regression potential is high. These changes
+ are already in the upstream thermald shipped in Ubuntu Impish, so they
+ will already have been exercised on the development release.
+ 
+ Issues can occur in:
+ 
+ 1. H/W supporting RAPL MMIO.
+ 2. Devices with TCPU ACPI object names; these will now behave
+    differently (correctly).
+ 3. Systems with both RAPL cdev support and RAPL MMIO support; these will
+    behave differently, as the cdev will now be (correctly) disabled.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to thermald in Ubuntu.
https://bugs.launchpad.net/bugs/1931565

Title:
  pull in latest thermald bug fixes into thermald 2.4.3

Status in thermald package in Ubuntu:
  In Progress

Bug description:
  == SRU Justification [ HIRSUTE ] ==

  The upstream thermald 2.4.6 has been released with some more bug fixes
  that are pertinent to H/W available in older releases such as Hirsute.
  These fix a variety of issues found on H/W in the field, such as
  over-throttling, handling alternate ACPI object names for B0D4,
  handling trip zones which may have wrong settings, and disabling the
  legacy rapl cdev when rapl-mmio is available.

  == The fixes ==

  Pull in these fixes:

  From 2.4.6:

  commit 273a53a11da2a7302ad7a5bc7e3bf04f221ce4e2
  Author: Srinivas Pandruvada <srinivas.pandruv...@linux.intel.com>
  Date:   Mon Jun 7 16:47:47 2021 -0700

      Use Adaptive PPCC limits for RAPL MMIO

      Set the correct device name as RAPL-MSR so that RAPL-MMIO can
      also set the correct default power limits.

  commit 2dd67300448fa4a2aa8f3e00ee5b604c73a1f7d9
  Author: Srinivas Pandruvada <srinivas.pandruv...@linux.intel.com>
  Date:   Mon Jun 7 16:45:14 2021 -0700

      Increase power limit for disabled RAPL-MMIO

      Increase 100W to 200W as some desktop platform already have limit
      more than 100W.

  commit 3de1004a49d0d157573bbdc1097b2fbed056879f
  Author: Srinivas Pandruvada <srinivas.pandruv...@linux.intel.com>
  Date:   Sun Jun 6 19:19:15 2021 -0700

      Special case for default PSVT

      When there are no adaptive tables and only one default PSVT table
      is present with just one entry with MAX type. Add one additional
      entry as done for non default case.

  From 2.4.5:

  commit 301a89284e9d74a9a1f8315c1673a548dcacda8c
  Author: Srinivas Pandruvada <srinivas.pandruv...@linux.intel.com>
  Date:   Sat May 29 10:50:44 2021 -0700

      Set a very high RAPL MSR PL1 with --adaptive

      After upgrading Dell Latitude 5420, again noticed performance degradation.
      The PPCC power limit for MSR RAPL PL1 is reduced to 15W. Even though we
      disable MSR RAPL with --adaptive option, it is not getting disabled. So
      MSR RAPL limits still playing role.

      To fix that set a very high MSR RAPL PL1 limit so that it never causes
      throttling. All throttling with --adaptive option is done using RAPL-MMIO.
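
When verifying the fix above on a test machine, it may help to compare the PL1 limit exposed by the MSR and MMIO RAPL domains. The following is a minimal sketch, not part of thermald. It assumes the Linux powercap sysfs layout (domains named intel-rapl:0 and intel-rapl-mmio:0) and assumes that constraint_0 is the long-term (PL1) limit, which should be confirmed by reading constraint_0_name on the machine under test. The helper name read_pl1_watts is hypothetical.

```python
import os
from typing import Optional


def read_pl1_watts(domain: str, base: str = "/sys/class/powercap") -> Optional[float]:
    """Return the constraint_0 power limit of a powercap domain in watts.

    Assumption: constraint_0 is the long-term (PL1) limit. Returns None if
    the domain is missing or the value cannot be read or parsed.
    """
    path = os.path.join(base, domain, "constraint_0_power_limit_uw")
    try:
        with open(path) as f:
            # The kernel reports the limit in microwatts.
            return int(f.read().strip()) / 1_000_000
    except (OSError, ValueError):
        return None
```

For example, read_pl1_watts("intel-rapl:0") gives the MSR limit and read_pl1_watts("intel-rapl-mmio:0") gives the MMIO limit. With the fix and --adaptive, the MSR value would be expected to be well above the MMIO one.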

  From 2.4.4:

  commit ea4491971059259e46daad10ae850d3d530b02f2
  Author: Srinivas Pandruvada <srinivas.pandruv...@linux.intel.com>
  Date:   Thu Mar 11 10:06:15 2021 -0800

      Fix error for condition names

      The current code caps the max name as the last condition name,
      which is "Power_Slider". So any condition more than 56 will be
      printing error, with "Power_Slider" as condition name. For example
      for condition = 57:
      Unsupported condition 57 (Power_slider)

      This is confusing during debug, so print "UNKNOWN" for condition
      name 56.
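
The behaviour change described in this commit can be illustrated with a short sketch. This is not thermald's code: the name table below is a shortened, hypothetical stand-in (only "Power_Slider" as the last entry comes from the commit message), and both function names are made up for illustration.

```python
# Hypothetical, shortened name table. Only the last entry, "Power_Slider",
# is taken from the commit message; the other names are placeholders.
CONDITION_NAMES = ["Cond_A", "Cond_B", "Power_Slider"]


def condition_name_clamped(condition: int) -> str:
    """Old behaviour as described: out-of-range values (assumed to be
    non-negative) are clamped to the last name in the table."""
    index = min(condition, len(CONDITION_NAMES) - 1)
    return CONDITION_NAMES[index]


def condition_name(condition: int) -> str:
    """New behaviour as described: out-of-range values map to "UNKNOWN"."""
    if 0 <= condition < len(CONDITION_NAMES):
        return CONDITION_NAMES[condition]
    return "UNKNOWN"
```

With the clamped lookup, an unsupported condition is reported under the name of the last valid one, which is the misleading debug output the commit removes.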

  commit 9115f2fb0b296a22a62908f2718ca873af2a452f
  Author: Srinivas Pandruvada <srinivas.pandruv...@linux.intel.com>
  Date:   Tue Mar 9 16:36:13 2021 -0800

      Check for alternate names for B0D4 device

      B0D4 can be named as TCPU or B0D4. So search for both names
      if failed to find one.
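
The fallback lookup described here amounts to trying each known name in turn. A minimal sketch follows, assuming the set of available device names has already been collected; the function find_device and its parameters are hypothetical and do not reflect thermald's actual implementation.

```python
from typing import Iterable, Optional, Sequence


def find_device(available: Iterable[str],
                candidates: Sequence[str] = ("B0D4", "TCPU")) -> Optional[str]:
    """Return the first candidate name that is present, or None.

    The candidate names come from the commit message; the order in which
    they are tried is an assumption.
    """
    present = set(available)
    for name in candidates:
        if name in present:
            return name
    return None
```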

  commit 660ee6f1f6351e6c291c0699147231be402c2bb8
  Author: Srinivas Pandruvada <srinivas.pandruv...@linux.intel.com>
  Date:   Tue Mar 9 16:33:08 2021 -0800

      Delete all trips from zones before psvt install

      Initially zones has all the trips from sysfs, which may have wrong
      settings. Instead of deleting only for matched psvt zones, delete
      for all zones. In this way only zones which are in PSVT will be
      present.
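
The difference between the old and new trip handling can be sketched with plain dictionaries. This is an illustration under assumptions, not thermald's data model: zones and PSVT entries are represented as a mapping from zone name to a list of trip temperatures, and both function names are hypothetical.

```python
from typing import Dict, List

Trips = Dict[str, List[int]]


def install_psvt_matched_only(zones: Trips, psvt: Trips) -> Trips:
    """Old behaviour as described: only zones matched by the PSVT have
    their sysfs trips replaced; other zones keep possibly wrong trips."""
    result = {name: list(trips) for name, trips in zones.items()}
    for name, trips in psvt.items():
        if name in result:
            result[name] = list(trips)
    return result


def install_psvt_clear_all(zones: Trips, psvt: Trips) -> Trips:
    """New behaviour as described: trips are deleted from every zone first,
    so only zones listed in the PSVT end up with any trips."""
    result: Trips = {name: [] for name in zones}
    for name, trips in psvt.items():
        if name in result:
            result[name] = list(trips)
    return result
```

In the second variant a zone that is absent from the PSVT is left with no trips at all, so stale sysfs settings can no longer trigger throttling.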

  commit 1ad03424f7f3d339521635f08377b323375b2747
  Author: Srinivas Pandruvada <srinivas.pandruv...@linux.intel.com>
  Date:   Tue Mar 9 11:36:27 2021 -0800

      Disable legacy rapl cdev when rapl-mmio is in use

      Explicitly disable legacy rapl based on MSR interface when rapl-mmio
      is in use. This will prevent PL1/PL2 power limit from MSR based rapl,
      which may not be the correct one.

  commit 45832e16290fec7353513b1ce03533b73b18f0c6
  Author: Colin Ian King <colin.k...@canonical.com>
  Date:   Thu Mar 18 11:22:38 2021 +0000

      Fix spelling mistakes found using codespell

      There are a handful of spelling mistakes in comments and literal
      strings found by codespell. Fix these.

      Signed-off-by: Colin Ian King <colin.k...@canonical.com>

  == Test plan ==

  Testing is problematic, as the changes affect different H/W in
  different ways, and exercising all of them requires fully loading the
  CPU to try to trigger thermal overrun.

  The test plan is as follows:

  1. Install thermald from -proposed and run it with debug logging enabled.
  2. Run stress-ng --cpu 0 for a few hours on various H/W with the air
     vents plugged, to try to trigger thermal overrun and exercise
     thermald's thermal throttling.
  3. Check the thermald log for state-change behaviour on thermal changes.
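
While the stress test in the plan above is running, it can be useful to record zone temperatures alongside the thermald log. The following is a minimal sketch, assuming the standard Linux thermal sysfs layout (thermal_zone*/type and thermal_zone*/temp, with temp in millidegrees Celsius). The function names and the sampling approach are illustrative only and are not part of thermald or the SRU process.

```python
import glob
import os
import time
from typing import Dict, List


def read_zone_temps(base: str = "/sys/class/thermal") -> Dict[str, float]:
    """Return {"thermal_zoneN:type": temperature in degrees Celsius}."""
    temps: Dict[str, float] = {}
    for zone in sorted(glob.glob(os.path.join(base, "thermal_zone*"))):
        try:
            with open(os.path.join(zone, "type")) as f:
                zone_type = f.read().strip()
            with open(os.path.join(zone, "temp")) as f:
                millideg = int(f.read().strip())
        except (OSError, ValueError):
            # Skip zones that cannot be read or report no valid value.
            continue
        temps[f"{os.path.basename(zone)}:{zone_type}"] = millideg / 1000.0
    return temps


def sample_temps(count: int, interval_s: float,
                 base: str = "/sys/class/thermal") -> List[Dict[str, float]]:
    """Take `count` readings, sleeping `interval_s` seconds between them."""
    samples = []
    for i in range(count):
        samples.append(read_zone_temps(base))
        if i + 1 < count:
            time.sleep(interval_s)
    return samples
```

Calling sample_temps(60, 5.0) in a second terminal while stress-ng --cpu 0 runs would give five minutes of readings to compare against the state changes in the thermald log.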


  == Where problems could occur ==

  These fixes have already been tested on H/W in the field, for example:
  https://bugs.launchpad.net/ubuntu/+source/thermald/+bug/1930422

  However, all these fixes can alter the functionality of thermald for a
  range of platforms, so the regression potential is high. These changes
  are already in the upstream thermald shipped in Ubuntu Impish, so they
  will already have been exercised on the development release.

  Issues can occur in:

  1. H/W supporting RAPL MMIO.
  2. Devices with TCPU ACPI object names; these will now behave
     differently (correctly).
  3. Systems with both RAPL cdev support and RAPL MMIO support; these
     will behave differently, as the cdev will now be (correctly) disabled.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/thermald/+bug/1931565/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
