** Changed in: ubuntu-z-systems
       Status: In Progress => Fix Committed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1904884

Title:
  s390: dbginfo.sh triggers kernel panic, reading from
  /sys/kernel/mm/page_idle/bitmap

Status in Ubuntu on IBM z Systems:
  Fix Committed
Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Bionic:
  Fix Committed
Status in linux source package in Focal:
  Fix Committed
Status in linux source package in Groovy:
  Fix Released
Status in linux source package in Hirsute:
  Fix Released

Bug description:
  SRU Justification:
  ==================

  [Impact]

  * While executing dbginfo.sh (a script to collect runtime,
  configuration, and trace information on s390x) the system hangs.

  * This is because 'idle page tracking' users can pass an arbitrary
  pfn that might be mapped to an offline page, and attempts to access
  offline pages lead to the hang.

  * Access to such pages needs to be avoided.

  * The upstream commit modifies 'page_idle_get_page()' to use
  'pfn_to_online_page()' instead of a 'pfn_valid()' and 'pfn_to_page()'
  combination, so that a pfn mapped to an offline page is skipped.
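
  The difference between the two lookups can be illustrated with a toy
  model. This is only a sketch in Python, not kernel code: the class
  ToyMemoryModel, the Page type and the two get_page_* helpers are
  hypothetical names invented for illustration, and the "initialized"
  flag merely stands in for a struct page that belongs to an offline
  memory section.

```python
class Page:
    """Hypothetical stand-in for a struct page."""
    def __init__(self, pfn, initialized):
        self.pfn = pfn
        self.initialized = initialized  # False models an offline section


class ToyMemoryModel:
    """Toy memory map: section number -> online flag (illustration only).

    A section that is present in the dict has a memmap; a section that
    is absent has none at all.
    """
    def __init__(self, pages_per_section, sections):
        self.pages_per_section = pages_per_section
        self.sections = sections

    def _section(self, pfn):
        return pfn // self.pages_per_section

    def pfn_valid(self, pfn):
        # True as soon as a memmap exists, even if the section is offline.
        return self._section(pfn) in self.sections

    def pfn_to_page(self, pfn):
        # Hands out a page unconditionally; the caller must have checked it.
        return Page(pfn, self.sections[self._section(pfn)])

    def pfn_to_online_page(self, pfn):
        # Returns None unless the section is present and online.
        if self.sections.get(self._section(pfn), False):
            return Page(pfn, True)
        return None


def get_page_old(mem, pfn):
    """Models the pfn_valid() + pfn_to_page() combination."""
    if not mem.pfn_valid(pfn):
        return None
    page = mem.pfn_to_page(pfn)
    if not page.initialized:
        # In the toy model the bad access is reported as an exception.
        raise RuntimeError("accessed offline page for pfn %d" % pfn)
    return page


def get_page_new(mem, pfn):
    """Models the lookup via pfn_to_online_page(): offline pfns are skipped."""
    return mem.pfn_to_online_page(pfn)
```

  With a model such as ToyMemoryModel(4, {0: True, 1: False}), pfn 5
  lies in the offline section: get_page_old() raises, while
  get_page_new() simply returns None, which mirrors the intended
  "skip" behaviour.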

  [Fix]

  * 92fb1db26eef "mm/page_idle.c: skip offline pages"

  [Test Case]

  * IBM Z or LinuxONE hardware with Ubuntu Server 18.04 (GA kernel,
  4.15) installed.

  * Execute a test application that tries to access offline pages.

  * Or execute dbginfo.sh while some offline (idle) pages exist in the
  system.
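
  A test application along these lines could look like the following
  sketch. It is an assumption about how such a test might be written,
  not the test actually used for this bug. It relies on the documented
  layout of /sys/kernel/mm/page_idle/bitmap (an array of 8-byte words
  in native byte order, one bit per pfn, readable by root only); the
  function name read_idle_bits and its parameters are hypothetical.
  On a kernel without the fix, reading a range that covers offline
  memory may hang the machine, so this should only be run on a
  disposable test system.

```python
import struct

BITMAP_PATH = "/sys/kernel/mm/page_idle/bitmap"
WORD_BYTES = 8        # the bitmap is an array of 8-byte words
PFNS_PER_WORD = 64    # one bit per page frame number


def read_idle_bits(start_pfn, nr_pfns, path=BITMAP_PATH):
    """Return {pfn: idle_flag} for nr_pfns page frames from start_pfn.

    Reads are issued on 8-byte boundaries, as the interface requires.
    PFNs beyond the data actually returned are left out of the result.
    """
    if nr_pfns <= 0:
        return {}
    first_word = start_pfn // PFNS_PER_WORD
    last_word = (start_pfn + nr_pfns - 1) // PFNS_PER_WORD
    want = (last_word - first_word + 1) * WORD_BYTES

    # Unbuffered, so Python does not read ahead past the requested range.
    with open(path, "rb", buffering=0) as f:
        f.seek(first_word * WORD_BYTES)
        data = f.read(want)

    # Only decode whole words; "=" selects native byte order.
    nwords = len(data) // WORD_BYTES
    words = struct.unpack("=%dQ" % nwords, data[:nwords * WORD_BYTES])

    bits = {}
    for pfn in range(start_pfn, start_pfn + nr_pfns):
        idx = pfn // PFNS_PER_WORD - first_word
        if idx >= nwords:
            break
        bits[pfn] = bool((words[idx] >> (pfn % PFNS_PER_WORD)) & 1)
    return bits
```

  Called as root, for example read_idle_bits(0, 64), it returns the
  idle flags of the first 64 page frames. Choosing a pfn range that
  includes a memory block set offline beforehand exercises the code
  path this bug is about.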

  [Regression Potential]

  * There is a certain regression risk, especially for bionic, since the
  structure in the kernel 4.15 is a bit different compared to kernel 5.4
  (and newer).

  * However, for newer kernels the modification is pretty safe, since
  it has been accepted upstream since kernel 5.8 and is therefore
  already included in hirsute and groovy.

  * And the patch is fine (and cherry picks cleanly) for focal as well.

  * For bionic there is a slightly conflicting context, since the struct
  'zone' was replaced by 'pg_data_t *pgdat' (by another commit:
  92fb1db26eef), but that change (or any change to the struct zone)
  would not be necessary to fix the uninitialized struct page access.

  * Hence the upstream commit/patch needs to be adjusted/backported to
  bionic 4.15, largely by keeping the existing line 'struct zone *zone;'
  instead of the upstream line 'pg_data_t *pgdat;'.

  * But this needs to be carefully considered, since mishandling idle
  pages could be harmful and, in the worst case, could break even more
  than it fixes.

  [Other]

  * The patch was accepted upstream with kernel v5.8, hence it is
  already in groovy and hirsute.

  * The upstream commit cherry picks cleanly to focal, but for bionic a
  backport is needed.

  * Hence this kernel SRU request is for focal (cherry-pick) and bionic
  (backport).
  __________

  System hangs on dbginfo.sh script execution.

  Solution:
  Commit 92fb1db26eef ("mm/page_idle.c: skip offline pages")

  Included upstream since kernel v5.8, so it is already included in
  Ubuntu 20.10, but not in 20.04 and earlier.

  Commit 92fb1db26eef ("mm/page_idle.c: skip offline pages") applies
  cleanly on ubuntu-focal, but not on ubuntu-bionic.

  Adjustment / backport for bionic should be trivial, but it is not IBM
  code and therefore the backport will be requested here by Canonical.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-z-systems/+bug/1904884/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
