** Description changed:

  SRU Justification:
  
  [ Impact ]
  
-  * KVM 2nd level guest (i.e. a KVM VM that runs nested on top of a Power 10
-    PowerVM hypervisor) hangs during the LTP (Linux Test Project) test suite.
+  * KVM 2nd level guest (i.e. a KVM VM that runs nested on top of a Power 10
+    PowerVM hypervisor) hangs during the LTP (Linux Test Project) test suite.
  
-  * It hangs with:
-    "Back trace of paca->saved_r1 (0xc000000c1bc8bb00) (possibly stale) @ new_slab"
+  * It hangs with:
+    "Back trace of paca->saved_r1 (0xc000000c1bc8bb00) (possibly stale) @ new_slab"
  
-  * Diagnosing the issue points to this fix/upstream commit:
-    [commit message, by Barry Song <[email protected]>]
-    Within try_to_unmap_one(), page_vma_mapped_walk() races with other PTE
-    modifications preceded by pte clear. While iterating over PTEs of a large folio,
-    it only starts acquiring PTL from the first valid (present) PTE.
-    PTE modifications can temporarily set PTEs to pte_none.
-    Consequently, the initial PTEs of a large folio might be skipped
-    in try_to_unmap_one().
-    For example, for an anon folio, if we skip PTE0, we may have PTE0 which is
-    still present, while PTE1 ~ PTE(nr_pages - 1) are swap entries after
-    try_to_unmap_one().
-    So folio will be still mapped, the folio fails to be reclaimed and is put
-    back to LRU in this round.
-    This also breaks up PTEs optimization such as CONT-PTE on this large folio
-    and may lead to accident folio_split() afterwards.
-    And since a part of PTEs are now swap entries, accessing those parts will
-    introduce overhead - do_swap_page.
-    Although the kernel can withstand all of the above issues, the situation
-    still seems quite awkward and warrants making it more ideal.
-    The same race also occurs with small folios, but they have only one PTE,
-    thus, it won't be possible for them to be partially unmapped.
-    This patch [see below] holds PTL from PTE0, allowing us to avoid reading
-    PTE values that are in the process of being transformed. With stable PTE
-    values, we can ensure that this large folio is either completely reclaimed
-    or that all PTEs remain untouched in this round.
-    A corner case is that if we hold PTL from PTE0 and most initial PTEs have
-    been really unmapped before that, we may increase the duration of holding
-    PTL. Thus we only apply this optimization to folios which are still entirely
-    mapped (not in deferred_split list). 
+  * Diagnosing the issue points to this fix/upstream commit:
+    [commit message, by Barry Song <[email protected]>]
+    Within try_to_unmap_one(), page_vma_mapped_walk() races with other PTE
+    modifications preceded by pte clear. While iterating over PTEs of a large folio,
+    it only starts acquiring PTL from the first valid (present) PTE.
+    PTE modifications can temporarily set PTEs to pte_none.
+    Consequently, the initial PTEs of a large folio might be skipped
+    in try_to_unmap_one().
+    For example, for an anon folio, if we skip PTE0, we may have PTE0 which is
+    still present, while PTE1 ~ PTE(nr_pages - 1) are swap entries after
+    try_to_unmap_one().
+    So folio will be still mapped, the folio fails to be reclaimed and is put
+    back to LRU in this round.
+    This also breaks up PTEs optimization such as CONT-PTE on this large folio
+    and may lead to accident folio_split() afterwards.
+    And since a part of PTEs are now swap entries, accessing those parts will
+    introduce overhead - do_swap_page.
+    Although the kernel can withstand all of the above issues, the situation
+    still seems quite awkward and warrants making it more ideal.
+    The same race also occurs with small folios, but they have only one PTE,
+    thus, it won't be possible for them to be partially unmapped.
+    This patch [see below] holds PTL from PTE0, allowing us to avoid reading
+    PTE values that are in the process of being transformed. With stable PTE
+    values, we can ensure that this large folio is either completely reclaimed
+    or that all PTEs remain untouched in this round.
+    A corner case is that if we hold PTL from PTE0 and most initial PTEs have
+    been really unmapped before that, we may increase the duration of holding
+    PTL. Thus we only apply this optimization to folios which are still entirely
+    mapped (not in deferred_split list).
  
  [ Fix ]
  
-  * 73bc32875ee9 73bc32875ee9b1881dd780308c6793fe463fe803
-    "mm: hold PTL from the first PTE while reclaiming a large folio"
+  * 73bc32875ee9 73bc32875ee9b1881dd780308c6793fe463fe803
+    "mm: hold PTL from the first PTE while reclaiming a large folio"
  
  [ Test Plan ]
  
-  * An IBM Power 10 system (where PowerVM is mandatory)
-    running Ubuntu Server 24.04 (kernel 6.8) or later 
-    with (nested) KVM setup (so KVM on top of PowerVM).
+  * An IBM Power 10 system (where PowerVM is mandatory)
+    running Ubuntu Server 24.04 (kernel 6.8) or later
+    with (nested) KVM setup (so KVM on top of PowerVM).
  
-  * Run LTP test suite
-    Tests running: SLS(io,base)
+  * Run LTP test suite
+    Tests running: SLS(io,base)
  
-  * Without the patch the above test will hang with
-    Back trace of paca->saved_r1 (0xc000000c1bc8bb00) (possibly stale) @ new_slab
+  * Without the patch the above test will hang with
+    Back trace of paca->saved_r1 (0xc000000c1bc8bb00) (possibly stale) @ new_slab
  
  [ Where problems could occur ]
  
-  * This is a common code change in the memory management sub-system,
-    hence great care needs to be taken, even if it was discussed upfront
-    at the https://lore.kernel.org/ mailing list and the upstream commit
-    provenance shows that many eyes had a look at this.
+  * This is a common code change in the memory management sub-system,
+    hence great care needs to be taken, even if it was discussed upfront
+    at the https://lore.kernel.org/ mailing list and the upstream commit
+    provenance shows that many eyes had a look at this.
  
-  * The modification is relatively small with just one if statement
-    (across two lines) in mm/vmscan.c.
+  * The modification is relatively small with just one if statement
+    (across two lines) in mm/vmscan.c.
  
-  * This change is to assist 'try_to_unmap' to acquire page table locks (PTL)
-    from the first page table entry (PTE) and to eliminate the influence of
-    temporary and volatile PTE values.
+  * This change is to assist 'try_to_unmap' to acquire page table locks (PTL)
+    from the first page table entry (PTE) and to eliminate the influence of
+    temporary and volatile PTE values.
  
-  * If done wrong it can have a negative impact, especially in case of large
-    folios, and wrong hints might be given to try_to_unmap,
-    which may lead to bad page swapping.
+  * If done wrong it can have a negative impact, especially in case of large
+    folios, and wrong hints might be given to try_to_unmap,
+    which may lead to bad page swapping.
  
-  * In case of an issue with this patch the result can also be decreased
-    performance and efficiency in the page table handling - the opposite
-    of what the patch is supposed to address.
+  * In case of an issue with this patch the result can also be decreased
+    performance and efficiency in the page table handling - the opposite
+    of what the patch is supposed to address.
  
-  * Fortunately several developers had their eyes on this commit,
-    as the provenance of the patch and the discussion ot lkml shows.
+  * Fortunately several developers had their eyes on this commit,
+    as the provenance of the patch and the discussion at lkml shows.
  
  [ Other Info ]
-  
-  * The commit is upstream since v6.10(-rc1), hence it will be included
-    in oracular with the planned target kernel.
+ 
+  * The commit is upstream since v6.10(-rc1), hence it will be included
+    in oracular with the planned target kernel.
  
  __________
  
  == Comment: #0 - SEETEENA THOUFEEK <[email protected]> - 2024-08-06 00:20:57 ==
  +++ This bug was initially created as a clone of Bug #206372 +++
  
  ---Problem Description---
  L2 Guest hung during LTP Tests. Back trace of paca->saved_r1 (0xc000000c1bc8bb00) (possibly stale) @ new_slab
  
  ---uname output---
  NA
  
  ---Additional Hardware Info---
  NA
  
  Contact Information = na
  
  ---Debugger Data---
  NA
  
  ---Patches Installed---
  NA
  
  ---Steps to Reproduce---
  
  Tests running: SLS(io,base)
  LPAR Config:
  ============
  PHYP Environment:  PowerVM
  LPAR Hostname/IP: 10.33.2.107
  Rootvg Filesystem: xfs
  Network Interface: Shiner-T
  vNIC/SR-IOV Config: n/a
  IO Type: SAN
  IO Disk Type: raw
  Multipath Enabled: No
  
-------------------------------------------------------------------------------------
  DUMP Config:
  ============
  KDUMP configured: Yes
  XMON enabled no
  DUMP Available: no
  
  Machine Type = na
  
  Userspace rpm: NA
  
  The userspace tool has the following bit modes: NA
  
  Userspace tool obtained from project website:  na
  
  Userspace tool common name: NA
  
  *Additional Instructions for na:
  -Post a private note with access information to the machine that is currently in the debugger.
  -Attach ltrace and strace of userspace application.
  
  please include this commit in Ubuntu 24.04
  
  upstream commit which is solving these data store lockups:
  73bc32875ee9b1881dd780308c6793fe463fe803 mm: hold PTL from the first PTE while reclaiming a large folio
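
The race described in the commit message above can be illustrated with a
small toy model. This is only a sketch under stated assumptions, not kernel
code: the PTE states, the function names and the `sync` parameter are all
hypothetical stand-ins. `sync` models the hint that makes the walk hold the
PTL from PTE0, and `transient_none` models PTEs that a concurrent writer has
temporarily cleared, so a lockless reader sees them as pte_none.

```python
# Toy model only: all names here are hypothetical, none are kernel APIs.
PRESENT, NONE, SWAP = "present", "none", "swap"


def want_sync(is_large, on_deferred_split_list):
    """Model of the gating described in the commit message: only large
    folios that are still entirely mapped get the "lock from PTE0" hint."""
    return is_large and not on_deferred_split_list


def unmap_large_folio(stable, transient_none, sync):
    """Walk the PTEs of one folio and turn present PTEs into swap entries.

    stable:         PTE values as they would be read with the PTL held
    transient_none: indices a concurrent writer has temporarily cleared;
                    a lockless reader observes these as NONE
    sync:           if True, the lock is held starting from PTE0
    """
    ptes = list(stable)
    locked = sync
    for i in range(len(ptes)):
        if not locked:
            # Lockless peek: may observe a transient NONE value.
            observed = NONE if i in transient_none else ptes[i]
            if observed != PRESENT:
                continue  # PTE skipped without ever taking the lock
            locked = True  # lock is taken at the first present PTE
        # Lock held: the stable value is read and unmapped.
        if ptes[i] == PRESENT:
            ptes[i] = SWAP
    return ptes


def still_mapped(ptes):
    """A folio stays mapped while any of its PTEs is still present."""
    return any(p == PRESENT for p in ptes)


if __name__ == "__main__":
    folio = [PRESENT] * 4
    racy = unmap_large_folio(folio, transient_none={0}, sync=False)
    safe = unmap_large_folio(folio, transient_none={0},
                             sync=want_sync(True, False))
    print("without hint:", racy, "still mapped:", still_mapped(racy))
    print("with hint:   ", safe, "still mapped:", still_mapped(safe))
```

In this model the walk without the hint skips PTE0 and leaves the folio
partially unmapped (PTE0 present, the remaining PTEs swap entries), while
the walk with the hint unmaps every PTE in the same round.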

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2076147

Title:
  Add 'mm: hold PTL from the first PTE while reclaiming a large folio'
  to fix L2 Guest hang during LTP Test
