** Description changed:

+ SRU Justification:
+
+ [ Impact ]
+
+  * A KVM 2nd level guest (i.e. a KVM VM that runs nested on top of a Power 10
+    PowerVM hypervisor) hangs during the LTP (Linux Test Project) test suite.
+
+  * It hangs with:
+    "Back trace of paca->saved_r1 (0xc000000c1bc8bb00) (possibly stale) @ new_slab"
+
+  * Diagnosing the issue points to this fix/upstream commit:
+    [commit message, by Barry Song <[email protected]>]
+    Within try_to_unmap_one(), page_vma_mapped_walk() races with other PTE
+    modifications preceded by pte clear. While iterating over PTEs of a large folio,
+    it only starts acquiring PTL from the first valid (present) PTE.
+    PTE modifications can temporarily set PTEs to pte_none.
+    Consequently, the initial PTEs of a large folio might be skipped
+    in try_to_unmap_one().
+    For example, for an anon folio, if we skip PTE0, we may have PTE0 which is
+    still present, while PTE1 ~ PTE(nr_pages - 1) are swap entries after
+    try_to_unmap_one().
+    So folio will be still mapped, the folio fails to be reclaimed and is put
+    back to LRU in this round.
+    This also breaks up PTEs optimization such as CONT-PTE on this large folio
+    and may lead to accident folio_split() afterwards.
+    And since a part of PTEs are now swap entries, accessing those parts will
+    introduce overhead - do_swap_page.
+    Although the kernel can withstand all of the above issues, the situation
+    still seems quite awkward and warrants making it more ideal.
+    The same race also occurs with small folios, but they have only one PTE,
+    thus, it won't be possible for them to be partially unmapped.
+    This patch [see below] holds PTL from PTE0, allowing us to avoid reading
+    PTE values that are in the process of being transformed. With stable PTE
+    values, we can ensure that this large folio is either completely reclaimed
+    or that all PTEs remain untouched in this round.
+    A corner case is that if we hold PTL from PTE0 and most initial PTEs have
+    been really unmapped before that, we may increase the duration of holding
+    PTL. Thus we only apply this optimization to folios which are still entirely
+    mapped (not in deferred_split list).
+
+ [ Fix ]
+
+  * 73bc32875ee9 73bc32875ee9b1881dd780308c6793fe463fe803
+    "mm: hold PTL from the first PTE while reclaiming a large folio"
+
+ [ Test Plan ]
+
+  * An IBM Power 10 system (where PowerVM is mandatory)
+    running Ubuntu Server 24.04 (kernel 6.8) or later
+    with a (nested) KVM setup (so KVM on top of PowerVM).
+
+  * Run the LTP test suite
+    Tests running: SLS(io,base)
+
+  * Without the patch the above test will hang with
+    Back trace of paca->saved_r1 (0xc000000c1bc8bb00) (possibly stale) @ new_slab
+
+ [ Where problems could occur ]
+
+  * This is a common code change in the memory management sub-system,
+    hence great care needs to be taken, even though it was discussed upfront
+    on the https://lore.kernel.org/ mailing list and the upstream commit
+    provenance shows that many eyes had a look at this.
+
+  * The modification is relatively small, with just one if statement
+    (across two lines) in mm/vmscan.c.
+
+  * The change helps 'try_to_unmap' acquire the page table lock (PTL)
+    from the first page table entry (PTE) and eliminates the influence of
+    temporary and volatile PTE values.
+
+  * If done wrong, it can have a negative impact especially in case of large
+    folios, and wrong hints might be given to try_to_unmap,
+    which may lead to bad page swapping.
+
+  * In case of an issue with this patch the result can also be decreased
+    performance and efficiency in the page table handling - the opposite
+    of what the patch is supposed to address.
+
+  * Fortunately several developers had their eyes on this commit,
+    as the provenance of the patch and the discussion on LKML show.
+
+ [ Other Info ]
+
+  * The commit is upstream since v6.10(-rc1), hence it will be included
+    in oracular with the planned target kernel.
+
+ __________
+
  == Comment: #0 - SEETEENA THOUFEEK <[email protected]> - 2024-08-06 00:20:57 ==

  +++ This bug was initially created as a clone of Bug #206372 +++

  ---Problem Description---
  L2 Guest hung during LTP Tests. Back trace of paca->saved_r1 (0xc000000c1bc8bb00) (possibly stale) @ new_slab (edit)
-
  ---uname output---
  NA
-
+
  ---Additional Hardware Info---
- NA
+ NA
-
- Contact Information = na
-
+ Contact Information = na
+
  ---Debugger Data---
- NA
-
+ NA
+
  ---Patches Installed---
  NA
-
+
  ---Steps to Reproduce---
-
+
  Tests running: SLS(io,base)

  LPAR Config:
  ============
  PHYP Environment: PowerVM
  LPAR Hostname/IP: 10.33.2.107
  Rootvg Filesystem: xfs
  Network Interface: Shiner-T
  vNIC/SR-IOV Config: n/a
  IO Type: SAN
  IO Disk Type: raw
  Multipath Enabled: No
  -------------------------------------------------------------------------------------
  DUMP Config:
  ============
  KDUMP configured: Yes
  XMON enabled no
  DUMP Available: no
-
- Machine Type = na
- Userspace rpm: NA
-
- The userspace tool has the following bit modes: NA
+ Machine Type = na
- Userspace tool obtained from project website: na
-
- Userspace tool common name: NA
-
- *Additional Instructions for na:
+ Userspace rpm: NA
+
+ The userspace tool has the following bit modes: NA
+
+ Userspace tool obtained from project website: na
+
+ Userspace tool common name: NA
+
+ *Additional Instructions for na:
  -Post a private note with access information to the machine that is currently in the debugger.
  -Attach ltrace and strace of userspace application.
-
  please include this commit in Ubuntu 24.04
  upstream commit which is solving these data store lockups:
  73bc32875ee9b1881dd780308c6793fe463fe803 mm: hold PTL from the first PTE while reclaiming a large folio
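
The race described in the commit message can be illustrated with a small userspace model. This is only a sketch and not kernel code: all names in it (unmap_pass, still_mapped, should_hold_from_first_pte, lockless_view, settled) are hypothetical, and it assumes that a concurrent writer makes PTE0 look like pte_none only briefly before restoring a present value.

```python
PRESENT, NONE, SWAP = "present", "none", "swap"


def should_hold_from_first_pte(is_large, on_deferred_split_list):
    """Per the commit text: only large folios that are still entirely mapped
    (not on the deferred_split list) get the PTL held from PTE0."""
    return is_large and not on_deferred_split_list


def unmap_pass(lockless_view, settled, hold_ptl_from_first_pte):
    """Model one unmap attempt over the PTEs of a large folio.

    lockless_view: PTE values seen before the PTL is taken; may contain
                   transient NONE entries from a concurrent modifier.
    settled:       PTE values once that modifier has finished, i.e. what
                   is seen while holding the PTL.
    """
    ptes = list(settled)
    if hold_ptl_from_first_pte:
        start = 0  # lock taken at PTE0, transient values are never observed
    else:
        # Lock is only taken at the first entry that does not look empty.
        start = next(
            (i for i, v in enumerate(lockless_view) if v != NONE),
            len(lockless_view),
        )
    for i in range(start, len(ptes)):
        if ptes[i] == PRESENT:
            ptes[i] = SWAP  # unmapped, replaced by a swap entry
    return ptes


def still_mapped(ptes):
    """A folio stays mapped as long as any of its PTEs is present."""
    return any(v == PRESENT for v in ptes)


# PTE0 is transiently cleared while the walker looks at it without the lock.
lockless_view = [NONE, PRESENT, PRESENT, PRESENT]
settled = [PRESENT, PRESENT, PRESENT, PRESENT]

partial = unmap_pass(lockless_view, settled, hold_ptl_from_first_pte=False)
full = unmap_pass(
    lockless_view,
    settled,
    hold_ptl_from_first_pte=should_hold_from_first_pte(True, False),
)
print(partial, still_mapped(partial))
print(full, still_mapped(full))
```

In this model the first pass skips PTE0 and leaves the folio partially unmapped (PTE0 present, the rest swap entries), while the second pass starts at PTE0 and unmaps the folio completely.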
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2076147

Title:
  L2 Guest hung during LTP Tests. Back trace of paca->saved_r1
  (0xc000000c1bc8bb00) (possibly stale) @ new_slab

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/2076147/+subscriptions

--
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
