Hello David,

On 21/12/22(Wed) 11:37, David Hill wrote:
> On 12/21/22 11:23, Martin Pieuchot wrote:
> > On 21/12/22(Wed) 09:20, David Hill wrote:
> > > On 12/21/22 07:08, David Hill wrote:
> > > > On 12/21/22 05:33, Martin Pieuchot wrote:
> > > > > On 18/12/22(Sun) 20:55, Martin Pieuchot wrote:
> > > > > > On 17/12/22(Sat) 14:15, David Hill wrote:
> > > > > > > On 10/28/22 03:46, Renato Aguiar wrote:
> > > > > > > > Use of the bbolt Go library causes 7.2 to freeze. I suspect it
> > > > > > > > is triggering some sort of deadlock in mmap because threads
> > > > > > > > get stuck at vmmaplk.
> > > > > > > >
> > > > > > > > I managed to reproduce it consistently on a laptop with 4
> > > > > > > > cores (i5-1135G7) using one unit test from bbolt:
> > > > > > > >
> > > > > > > > $ doas pkg_add git go
> > > > > > > > $ git clone https://github.com/etcd-io/bbolt.git
> > > > > > > > $ cd bbolt
> > > > > > > > $ git checkout v1.3.6
> > > > > > > > $ go test -v -run TestSimulate_10000op_10p
> > > > > > > >
> > > > > > > > The test never ends and this is the 'top' report:
> > > > > > > >
> > > > > > > >   PID     TID PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
> > > > > > > > 32181  438138 -18    0   57M   13M idle      uvn_fls   0:00  0.00% bbolt.test
> > > > > > > > 32181  331169  10    0   57M   13M sleep/1   nanoslp   0:00  0.00% bbolt.test
> > > > > > > > 32181  497390  10    0   57M   13M idle      vmmaplk   0:00  0.00% bbolt.test
> > > > > > > > 32181  380477  14    0   57M   13M idle      vmmaplk   0:00  0.00% bbolt.test
> > > > > > > > 32181  336950  14    0   57M   13M idle      vmmaplk   0:00  0.00% bbolt.test
> > > > > > > > 32181  491043  14    0   57M   13M idle      vmmaplk   0:00  0.00% bbolt.test
> > > > > > > > 32181  347071   2    0   57M   13M idle      kqread    0:00  0.00% bbolt.test
> > > > > > > >
> > > > > > > > After this, most commands just hang.
> > > > > > > > For example, running a 'ps | grep foo' in another shell would
> > > > > > > > do it.
> > > > > > >
> > > > > > > I can reproduce this on MP, but not SP. Here is /trace from ddb
> > > > > > > after using the ddb.trigger sysctl. Is there any other
> > > > > > > information I could pull from DDB that may help?
> > > > > >
> > > > > > Thanks for the useful report David!
> > > > > >
> > > > > > The issue seems to be a deadlock between the `vmmaplk' and a
> > > > > > particular `vmobjlock'. uvm_map_clean() calls uvn_flush() which
> > > > > > sleeps with the `vmmaplk' held.
> > > > > >
> > > > > > I'll think a bit about this and try to come up with a fix ASAP.
> > > > >
> > > > > I'm missing a piece of information. All the threads in your report
> > > > > seem to want a read version of the `vmmaplk' so they should not
> > > > > block. Could you reproduce the hang with a WITNESS kernel and print
> > > > > 'show all locks' in addition to all the information you've reported?
> > > >
> > > > Sure. It's always the same: 2 processes (sysctl and bbolt.test) and 3
> > > > locks (sysctllk, kernel_lock, and vmmaplk), with bbolt.test always on
> > > > the uvn_flsh thread.
> > > >
> > > > Process 98301 (sysctl) thread 0xfff......
> > > > exclusive rwlock sysctllk r = 0 (0xfffff...)
> > > > exclusive kernel_lock &kernel_lock r = 0 (0xffffff......)
> > > > Process 32181 (bbolt.test) thread (0xffffff...) (438138)
> > > > shared rwlock vmmaplk r = 0 (0xfffff......)
> > > >
> > > > To reproduce, just do:
> > > > $ doas pkg_add git go
> > > > $ git clone https://github.com/etcd-io/bbolt.git
> > > > $ cd bbolt
> > > > $ git checkout v1.3.6
> > > > $ go test -v -run TestSimulate_10000op_10p
> > > >
> > > > The test hangs almost instantly.
> > >
> > > Not sure if this is a hint..
> > > https://github.com/etcd-io/bbolt/blob/master/db.go#L27-L31
> > >
> > > // IgnoreNoSync specifies whether the NoSync field of a DB is ignored when
> > > // syncing changes to a file. This is required as some operating systems,
> > > // such as OpenBSD, do not have a unified buffer cache (UBC) and writes
> > > // must be synchronized using the msync(2) syscall.
> > > const IgnoreNoSync = runtime.GOOS == "openbsd"
> >
> > Yes, the issue is related to sync(2). Could you try the diff below, it
> > is not a fix, and tell me if you can reproduce the issue with it? I
> > can't.
>
> Ran it 20 times and all completed and passed. I was also able to
> interrupt it as well, no issues.

Excellent!
Here's the best fix I could come up with. We mark the VM map as "busy"
during the page fault just before the faulting thread releases the shared
lock. This ensures no other thread will grab an exclusive lock until the
fault is finished. I couldn't trigger the reproducer with this, can you?

Index: uvm/uvm_fault.c
===================================================================
RCS file: /cvs/src/sys/uvm/uvm_fault.c,v
retrieving revision 1.133
diff -u -p -r1.133 uvm_fault.c
--- uvm/uvm_fault.c	4 Nov 2022 09:36:44 -0000	1.133
+++ uvm/uvm_fault.c	20 Jan 2023 13:52:58 -0000
@@ -1277,6 +1277,9 @@ uvm_fault_lower(struct uvm_faultinfo *uf
 	/* update rusage counters */
 	curproc->p_ru.ru_majflt++;
 
+	/* prevent siblings from grabbing an exclusive lock on the map */
+	vm_map_busy(ufi->map);
+
 	uvmfault_unlockall(ufi, amap, NULL);
 
 	counters_inc(uvmexp_counters, flt_get);
@@ -1293,13 +1296,16 @@ uvm_fault_lower(struct uvm_faultinfo *uf
 	KASSERT(result != VM_PAGER_PEND);
 
 	if (result == VM_PAGER_AGAIN) {
+		vm_map_unbusy(ufi->map);
 		tsleep_nsec(&nowake, PVM, "fltagain2", MSEC_TO_NSEC(5));
 		return ERESTART;
 	}
 
-	if (!UVM_ET_ISNOFAULT(ufi->entry))
+	if (!UVM_ET_ISNOFAULT(ufi->entry)) {
+		vm_map_unbusy(ufi->map);
 		return (EIO);
+	}
 
 	uobjpage = PGO_DONTCARE;
 	uobj = NULL;
@@ -1308,6 +1314,7 @@ uvm_fault_lower(struct uvm_faultinfo *uf
 
 	/* re-verify the state of the world. */
 	locked = uvmfault_relock(ufi);
+	vm_map_unbusy(ufi->map);
 
 	if (locked && amap != NULL)
 		amap_lock(amap);