> On Mar 15, 2019, at 2:48 PM, Robert Elz <[email protected]> wrote:
> Upon reflection, there is no hurry to fix this one, unlike the previous
> one which was screwing up the b5 tests - we (at least currently) have no
> tests which do anything as crazy as the code sequence to trigger this, so
> we can take our time and solve it properly.
Well, true, but I think a "fix before netbsd-9, with pull-ups to -8 and -7" is
certainly a worthy goal. After all, there is now a known sequence of calls
that can cause a crash.
> | POSIX's semantics could just as well be represented with a bit
> | in a flags word,
>
> the one in the UVM map entry - yes, but that one isn't really the
> issue. What matters is the pmap count, and even in posix that needs
> to be a count, as multiple processes can independently lock the same
> (shared) region, and neither one's unwire affects the wiring done by
> another.
>
> Unless my assumptions about what is what here are incorrect (which they
> easily could be) the count that matters is the one which needs to remain
> a count.
The pmap layer doesn't really have a count. It just has a "this PTE is wired"
bit. When the vm_map_entry that covers that PTE transitions from "not-wired"
to "wired", the PTE gets the wired bit; when the vm_map_entry transitions from
"wired" to "not-wired", the PTE loses the bit. It's really as simple as that.
The pmap layer doesn't assume a count, it just depends on the upper layers
keeping track of the state transitions, and updating the bottom layer
accordingly.
The same goes for the backing pages -- you've probably noticed that the pages
are wired or unwired only at those rising and falling edges of vm_map_entry
"wired-ness". The pages, of course, do carry a count, because multiple
mappings can exist for a single page.
UVM history lesson time! In some ways it's slightly silly to even have a
wire_count in the vm_map_entry, because vm_map_entry's are not really shared
... they exist only in a single vm_map, and they correspond to one or more PTEs
in the pmap's tables (one pmap per uvm_map)... but the count is in some ways an
artifact of how uvm_vslock() / uvm_vsunlock() used to work ... they *used* to
call uvm_map_pageable() (because the old Mach VM implementation used to call
vm_map_pageable()) for doing physio and other things that necessitated wiring
down user buffers so the kernel / devices could safely access them. But that
changed some 2 decades ago (again, I think this may have been my fault :-) for
a couple of reasons:
(1) munlock(2) and its semantics; you don't want it to unwire the
buffers that a device is going to DMA into!
(2) uvm_map_pageable() can fragment the map because of the entry
clipping.
...so the transient wirings used by uvm_vslock() and uvm_vsunlock() were
changed to use uvm_fault_wire() and uvm_fault_unwire() directly, to
specifically fiddle with the wired-ness of the underlying pages, while leaving
the vm_map_entry's unchanged.
> | I would suggest that the right way to fix this would be as follows:
>
> I think we ought to work out what the data structs should look like
> in the various possible cases - including mixed shm and m*() allocations,
> mappings, wiring, protection schemes - including where pages are
> mapped (either in more than once in one process, or in different
> processes) in both forms (a page that is a shm in one place is mmap'd
> in another, and wired by one of them, or both, or neither).
This should be relatively straightforward... I'll see if I can put together a
couple of diagrams this weekend between various kid / household duties (and
also recovering from this bout of late winter flu that's kept me out of my
$DayJob office for a couple of days, bleh). The wiring propagation between the
various layers is really all about rising and falling edges, and once you
understand the rules, it's pretty easy to work out what the data structures at
each layer should look like for any given scenario. In fact, the current code
mostly follows those rules; the bugs, it seems, are really in defining what
constitutes a rising or falling edge.
> Until we know what it will look like, I don't think trying to find
> minimal code changes from what we have now will be productive.
>
> First we need an audit of everything that affects or uses the UVM
> mappings to see just what is required. The shm stuff is easy that
> way, as they have a very small visible footprint - even if they are
> an ugly design.
>
> Tomorrow (or much later today, or whatever you want to call it!)
>
> kre
>
-- thorpej