On 12/09/25(Fri) 18:45, Alexander Bluhm wrote:
> On Thu, Sep 11, 2025 at 06:31:42PM +0200, Martin Pieuchot wrote:
> > On 08/09/25(Mon) 18:53, Martin Pieuchot wrote:
> > > On 29/08/25(Fri) 19:12, Alexander Bluhm wrote:
> > > > Hi,
> > > > 
> > > > One of my i386 test machines crashed during make build.  Kernel is
> > > > GENERIC.MP built from current sources.
> > > > 
> > > > panic: uvm_fault(0xd59b2424, 0xcf800000, 0, 1) -> e
> > > > Stopped at      db_enter+0x4:   popl    %ebp
> > > >     TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
> > > > *435945  40779     21         0x2          0    5  llvm-tblgen
> > > >  212835  13904     21         0x2          0    3  llvm-tblgen
> > > >  150086  44694     21         0x2          0   11  llvm-tblgen
> > > >  332575  10539     21         0x2          0   10  llvm-tblgen
> > > >  385473  77182     21         0x2          0    8  llvm-tblgen
> > > >  104320  19436      0     0x14000      0x200    1  aiodoned
> > > > db_enter() at db_enter+0x4
> > > > panic(d0cbc6b7) at panic+0x7a
> > > > kpageflttrap(f86b5c94,cf800000) at kpageflttrap+0x133
> > > > trap(f86b5c94) at trap+0x255
> > > > calltrap() at calltrap+0xc
> > > > pmap_remove_ptes_pae(d0f6fda0,0,cf800000,0,1000,0,f86b5d1c) at pmap_remove_ptes_pae+0x4f
> > > > pmap_do_remove_pae(d0f6fda0,0,1000,0) at pmap_do_remove_pae+0x120
> > > > pmap_remove(d0f6fda0,0,1000) at pmap_remove+0x18
> > > > uvm_pagermapout(0,1) at uvm_pagermapout+0x1a
> > > 
> > > This is very wrong.  That means `kva' is 0.  The only way this can
> > > happen is if pmap_enter(9) failed in uvm_pagermapin().
> > > 
> > > Using pmap_kenter_pa(9) would not only prevent this issue, it would also
> > > speed up memory recovery.  Sadly we had to revert such change because on
> > > Landisk it doesn't handle conflicting cache aliases like pmap_enter(9).
> > 
> > Here's a diff that fixes the bug and does not let landisk slow down other
> > architectures.  This gives a noticeable boost for page faults and
> > aggressive swapping (like a stress test with torture).  
> > 
> > Note that NetBSD also calls pmap_kenter_pa(9) in this case. So maybe
> > there's a fix for landisk out there.  Anyone care about landisk?
> > 
> > Alexander would you please test this on your i386?
> 
> I did make build, release and regress on the affected machines.  No
> more crashes seen, but the crashes did not happen reliably before.

Alexander, could you try the diff below?

Index: uvm_pager.c
===================================================================
RCS file: /cvs/src/sys/uvm/uvm_pager.c,v
diff -u -p -r1.94 uvm_pager.c
--- uvm_pager.c 10 Mar 2025 14:13:58 -0000      1.94
+++ uvm_pager.c 6 Oct 2025 08:13:07 -0000
@@ -263,13 +263,16 @@ uvm_pagermapin(struct vm_page **pps, int
                pp = *pps++;
                KASSERT(pp);
                KASSERT(pp->pg_flags & PG_BUSY);
-               /* Allow pmap_enter to fail. */
-               if (pmap_enter(pmap_kernel(), cva, VM_PAGE_TO_PHYS(pp),
+               while (pmap_enter(pmap_kernel(), cva, VM_PAGE_TO_PHYS(pp),
                    prot, PMAP_WIRED | PMAP_CANFAIL | prot) != 0) {
-                       pmap_remove(pmap_kernel(), kva, cva);
-                       pmap_update(pmap_kernel());
-                       uvm_pseg_release(kva);
-                       return 0;
+                       if (flags & UVMPAGER_MAPIN_WAITOK)
+                               uvm_wait("pgrmapin");
+                       else {
+                               pmap_remove(pmap_kernel(), kva, cva);
+                               pmap_update(pmap_kernel());
+                               uvm_pseg_release(kva);
+                               return 0;
+                       }
                }
        }
        pmap_update(pmap_kernel());
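
To make the intended control flow easier to review, here is a minimal
userland sketch of the retry loop in the diff above.  It is not kernel
code: fake_enter(), fake_wait(), MAPIN_WAITOK and mapin() are hypothetical
stand-ins for pmap_enter(9) with PMAP_CANFAIL, uvm_wait(),
UVMPAGER_MAPIN_WAITOK and uvm_pagermapin(), and the "undo" step is reduced
to returning 0.

```c
#include <assert.h>

#define MAPIN_WAITOK 0x01	/* stand-in for UVMPAGER_MAPIN_WAITOK */

static int failures_left;	/* times fake_enter() will still fail */
static int waits;		/* times fake_wait() has been called */

/* Hypothetical stand-in for pmap_enter(9) with PMAP_CANFAIL: 0 on success. */
static int
fake_enter(void)
{
	if (failures_left > 0) {
		failures_left--;
		return 1;
	}
	return 0;
}

/* Hypothetical stand-in for uvm_wait(): only counts the calls. */
static void
fake_wait(void)
{
	waits++;
}

/*
 * Map npages pages.  Returns 1 (a non-zero "kva") on success, or 0 if
 * entering a mapping failed and the caller did not allow sleeping.
 */
static int
mapin(int npages, int flags)
{
	int i;

	assert(npages > 0);
	for (i = 0; i < npages; i++) {
		while (fake_enter() != 0) {
			if (flags & MAPIN_WAITOK)
				fake_wait();	/* sleep, then retry this page */
			else
				return 0;	/* undo partial mapping, fail */
		}
	}
	return 1;
}
```

With MAPIN_WAITOK set, mapin() keeps retrying the same page and never
returns 0.  Without the flag it still fails on the first error, as before.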

