On Tue, Dec 07, 2021 at 10:04:04AM -0600, Scott Cheloha wrote:
> On Mon, Nov 29, 2021 at 10:50:32PM +0100, Martin Pieuchot wrote:
> > On 24/11/21(Wed) 11:16, Martin Pieuchot wrote:
> > > Diff below unlock the bottom part of the UVM fault handler.  I'm
> > > interested in squashing the remaining bugs.  Please test with your usual
> > > setup & report back.
> > 
> > Thanks to all the testers, here's a new version that includes a bug fix.
> > 
> > Tests on !x86 architectures are much appreciated!
> 
> [...]
> 
> witness: lock order reversal:
>  1st 0xfffffd83d37b23c8 uobjlk (&uobj->vmobjlock)
>  2nd 0xffff8000015fdb00 objmm (&obj->mm.lock)
> lock order "&obj->mm.lock"(rwlock) -> "&uobj->vmobjlock"(rwlock) first seen at:
> #0  rw_enter+0x68
> #1  uvm_obj_wire+0x4f
> #2  shmem_get_pages+0xae
> #3  __i915_gem_object_get_pages+0x85
> #4  i915_vma_pin_ww+0x451
> #5  i915_ggtt_pin+0x61
> #6  intel_execlists_submission_setup+0x396
> #7  intel_engines_init+0x2ff
> #8  intel_gt_init+0x136
> #9  i915_gem_init+0x9d
> #10 i915_driver_probe+0x760
> #11 inteldrm_attachhook+0x46
> #12 config_process_deferred_mountroot+0x5b
> #13 main+0x743
> lock order "&uobj->vmobjlock"(rwlock) -> "&obj->mm.lock"(rwlock) first seen at:
> #0  rw_enter+0x68
> #1  __i915_gem_object_get_pages+0x29
> #2  i915_gem_fault+0x1cb
> #3  drm_fault+0x163
> #4  uvm_fault+0x19b
> #5  upageflttrap+0x5e
> #6  usertrap+0x190
> #7  recall_trap+0x8

I'm seeing the same reversal with the same stack trace on every boot.

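For anyone skimming the two stacks: together they form the classic
AB/BA pattern.  Here is a minimal sketch of the two paths using the
lock names from the traces (the functions are illustrative stand-ins,
not actual driver code):

        /* illustrative only */
        struct rwlock *objmm;           /* obj->mm.lock    */
        struct rwlock *uobjlk;          /* uobj->vmobjlock */

        void
        wire_path(void)                 /* cf. the uvm_obj_wire() stack */
        {
                rw_enter(objmm, RW_WRITE);      /* 1st */
                rw_enter(uobjlk, RW_WRITE);     /* 2nd */
                /* ... */
                rw_exit(uobjlk);
                rw_exit(objmm);
        }

        void
        fault_path(void)                /* cf. the i915_gem_fault() stack */
        {
                rw_enter(uobjlk, RW_WRITE);     /* 1st */
                rw_enter(objmm, RW_WRITE);      /* 2nd */
                /* ... */
                rw_exit(objmm);
                rw_exit(uobjlk);
        }

If both paths run concurrently each can take its first lock and then
block forever on the other's, which is why witness complains as soon
as it has seen both orders, before any actual deadlock happens.
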
Also, I just saw a real novelty: a simultaneous trap panic and a warning
printf from vref().  The console showed this:

pWaAnRiNcI:NG:  vSrPeLf   OTs eLdO   ER e Dr eO Nv gTeRtA Pr eEqXuIiTr eda 0

Which I deciphered into:

        WARNING: vref used where vget required
and
        panic: SPL NOT LOWERED ON TRAP EXIT a 0

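For anyone who hasn't seen this failure mode: console output from the
panic path and a concurrent printf(9) is not serialized, so the two
messages land on the screen interleaved roughly character by character.
A quick userland approximation (assuming strict alternation, which the
real console didn't quite manage, so the output only resembles the
garbled line above):

        #include <stdio.h>

        int
        main(void)
        {
                const char *a = "panic: SPL NOT LOWERED ON TRAP EXIT a 0";
                const char *b = "WARNING: vref used where vget required";

                /* emit one character from each message in turn */
                while (*a || *b) {
                        if (*a)
                                putchar(*a++);
                        if (*b)
                                putchar(*b++);
                }
                putchar('\n');
                return 0;
        }
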
The warning printf is from vref() in vfs_subr.c.

The panic could be from Xintr_user_exit() in locore.S or
alltraps_kern() in vector.S.  I'm not familiar enough with that code
to say which.

The console locked up and I was not able to look at anything in ddb.

Same machine as before:

> dmesg:
> 
> OpenBSD 7.0-current (GENERIC.MP) #17: Tue Dec  7 09:39:06 CST 2021
>     ssc@jetsam.local:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> real mem = 16895528960 (16112MB)
> avail mem = 16237621248 (15485MB)
> random: good seed from bootblocks
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 3.0 @ 0x9f03b000 (63 entries)
> bios0: vendor LENOVO version "N23ET59W (1.34 )" date 11/08/2018
> bios0: LENOVO 20KHCTO1WW
> acpi0 at bios0: ACPI 5.0
> acpi0: sleep states S0 S3 S4 S5
> acpi0: tables DSDT FACP SSDT SSDT TPM2 UEFI SSDT SSDT HPET APIC MCFG ECDT 
> SSDT SSDT BOOT BATB SSDT SSDT SSDT LPIT WSMT SSDT SSDT SSDT DBGP DBG2 MSDM 
> DMAR NHLT ASF! FPDT UEFI
> acpi0: wakeup devices GLAN(S4) XHC_(S3) XDCI(S4) HDAS(S4) RP01(S4) PXSX(S4) 
> RP02(S4) PXSX(S4) PXSX(S4) RP04(S4) PXSX(S4) RP05(S4) PXSX(S4) RP06(S4) 
> PXSX(S4) RP07(S4) [...]
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpihpet0 at acpi0: 23999999 Hz
> acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz, 1795.82 MHz, 06-8e-0a
> cpu0: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SRBDS_CTRL,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
> cpu0: 256KB 64b/line 8-way L2 cache
> cpu0: smt 0, core 0, package 0
> mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
> cpu0: apic clock running at 24MHz
> cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4.1.1.1, IBE
> cpu1 at mainbus0: apid 2 (application processor)
> tsc: cpu0/cpu1 sync round 1: 1865 regressions
> tsc: cpu0 lags cpu1 by 0 cycles
> tsc: cpu1 lags cpu0 by 56 cycles
> cpu1: Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz, 1795.82 MHz, 06-8e-0a
> cpu1: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SRBDS_CTRL,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
> cpu1: 256KB 64b/line 8-way L2 cache
> cpu1: smt 0, core 1, package 0
> cpu2 at mainbus0: apid 4 (application processor)
> tsc: cpu0/cpu2 sync round 1: 1921 regressions
> tsc: cpu0 lags cpu2 by 0 cycles
> tsc: cpu2 lags cpu0 by 60 cycles
> cpu2: Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz, 1794.85 MHz, 06-8e-0a
> cpu2: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SRBDS_CTRL,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
> cpu2: 256KB 64b/line 8-way L2 cache
> cpu2: smt 0, core 2, package 0
> cpu3 at mainbus0: apid 6 (application processor)
> tsc: cpu0/cpu3 sync round 1: 1819 regressions
> tsc: cpu0 lags cpu3 by 0 cycles
> tsc: cpu3 lags cpu0 by 60 cycles
> cpu3: Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz, 1795.19 MHz, 06-8e-0a
> cpu3: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SRBDS_CTRL,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
> cpu3: 256KB 64b/line 8-way L2 cache
> cpu3: smt 0, core 3, package 0
> cpu4 at mainbus0: apid 1 (application processor)
> tsc: cpu0/cpu4 sync round 1: 12962 regressions
> tsc: cpu0 lags cpu4 by 0 cycles
> tsc: cpu4 lags cpu0 by 134 cycles
> cpu4: Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz, 1795.30 MHz, 06-8e-0a
> cpu4: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SRBDS_CTRL,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
> cpu4: 256KB 64b/line 8-way L2 cache
> cpu4: smt 1, core 0, package 0
> cpu5 at mainbus0: apid 3 (application processor)
> tsc: cpu0/cpu5 sync round 1: 1970 regressions
> tsc: cpu0 lags cpu5 by 0 cycles
> tsc: cpu5 lags cpu0 by 70 cycles
> cpu5: Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz, 1794.74 MHz, 06-8e-0a
> cpu5: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SRBDS_CTRL,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
> cpu5: 256KB 64b/line 8-way L2 cache
> cpu5: smt 1, core 1, package 0
> cpu6 at mainbus0: apid 5 (application processor)
> tsc: cpu0/cpu6 sync round 1: 2097 regressions
> tsc: cpu0 lags cpu6 by 0 cycles
> tsc: cpu6 lags cpu0 by 76 cycles
> cpu6: Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz, 1795.15 MHz, 06-8e-0a
> cpu6: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SRBDS_CTRL,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
> cpu6: 256KB 64b/line 8-way L2 cache
> cpu6: smt 1, core 2, package 0
> cpu7 at mainbus0: apid 7 (application processor)
> tsc: cpu0/cpu7 sync round 1: 1849 regressions
> tsc: cpu0 lags cpu7 by 0 cycles
> tsc: cpu7 lags cpu0 by 62 cycles
> cpu7: Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz, 1794.87 MHz, 06-8e-0a
> cpu7: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SRBDS_CTRL,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
> cpu7: 256KB 64b/line 8-way L2 cache
> cpu7: smt 1, core 3, package 0
> ioapic0 at mainbus0: apid 2 pa 0xfec00000, version 20, 120 pins
> acpimcfg0 at acpi0
> acpimcfg0: addr 0xf0000000, bus 0-127
> acpiec0 at acpi0
> acpiprt0 at acpi0: bus 0 (PCI0)
> acpiprt1 at acpi0: bus 2 (RP01)
> acpiprt2 at acpi0: bus -1 (RP02)
> acpiprt3 at acpi0: bus -1 (RP03)
> acpiprt4 at acpi0: bus -1 (RP04)
> acpiprt5 at acpi0: bus 4 (RP05)
> acpiprt6 at acpi0: bus -1 (RP06)
> acpiprt7 at acpi0: bus -1 (RP07)
> acpiprt8 at acpi0: bus -1 (RP08)
> acpiprt9 at acpi0: bus -1 (RP09)
> acpiprt10 at acpi0: bus -1 (RP10)
> acpiprt11 at acpi0: bus -1 (RP11)
> acpiprt12 at acpi0: bus -1 (RP12)
> acpiprt13 at acpi0: bus -1 (RP13)
> acpiprt14 at acpi0: bus -1 (RP14)
> acpiprt15 at acpi0: bus -1 (RP15)
> acpiprt16 at acpi0: bus -1 (RP16)
> acpiprt17 at acpi0: bus -1 (RP17)
> acpiprt18 at acpi0: bus -1 (RP18)
> acpiprt19 at acpi0: bus -1 (RP19)
> acpiprt20 at acpi0: bus -1 (RP20)
> acpiprt21 at acpi0: bus -1 (RP21)
> acpiprt22 at acpi0: bus -1 (RP22)
> acpiprt23 at acpi0: bus -1 (RP23)
> acpiprt24 at acpi0: bus -1 (RP24)
> acpipci0 at acpi0 PCI0: 0x00000000 0x00000011 0x00000001
> acpithinkpad0 at acpi0: version 2.0
> acpiac0 at acpi0: AC unit online
> acpibat0 at acpi0: BAT0 model "01AV430" serial  3080 type LiP oem "SMP"
> "LEN0100" at acpi0 not configured
> "INT3403" at acpi0 not configured
> acpicmos0 at acpi0
> "ALPS0000" at acpi0 not configured
> "INT0E0C" at acpi0 not configured
> acpibtn0 at acpi0: SLPB
> "PNP0C14" at acpi0 not configured
> acpibtn1 at acpi0: LID_
> "PNP0C14" at acpi0 not configured
> "PNP0C14" at acpi0 not configured
> "PNP0C14" at acpi0 not configured
> "PNP0C14" at acpi0 not configured
> "INT3400" at acpi0 not configured
> tpm0 at acpi0 TPM_ 2.0 (TIS) addr 0xfed40000/0x5000, device 0x0000104a rev 
> 0x4e
> acpicpu0 at acpi0: C3(200@1034 mwait.1@0x60), C2(200@151 mwait.1@0x33), 
> C1(1000@1 mwait.1), PSS
> acpicpu1 at acpi0: C3(200@1034 mwait.1@0x60), C2(200@151 mwait.1@0x33), 
> C1(1000@1 mwait.1), PSS
> acpicpu2 at acpi0: C3(200@1034 mwait.1@0x60), C2(200@151 mwait.1@0x33), 
> C1(1000@1 mwait.1), PSS
> acpicpu3 at acpi0: C3(200@1034 mwait.1@0x60), C2(200@151 mwait.1@0x33), 
> C1(1000@1 mwait.1), PSS
> acpicpu4 at acpi0: C3(200@1034 mwait.1@0x60), C2(200@151 mwait.1@0x33), 
> C1(1000@1 mwait.1), PSS
> acpicpu5 at acpi0: C3(200@1034 mwait.1@0x60), C2(200@151 mwait.1@0x33), 
> C1(1000@1 mwait.1), PSS
> acpicpu6 at acpi0: C3(200@1034 mwait.1@0x60), C2(200@151 mwait.1@0x33), 
> C1(1000@1 mwait.1), PSS
> acpicpu7 at acpi0: C3(200@1034 mwait.1@0x60), C2(200@151 mwait.1@0x33), 
> C1(1000@1 mwait.1), PSS
> acpipwrres0 at acpi0: PUBS, resource for XHC_
> acpitz0 at acpi0: critical temperature is 128 degC
> acpivideo0 at acpi0: GFX0
> acpivout0 at acpivideo0: DD1F
> cpu0: using VERW MDS workaround (except on vmm entry)
> cpu0: Enhanced SpeedStep 1795 MHz: speeds: 2101, 2100, 1900, 1800, 1700, 
> 1600, 1500, 1400, 1200, 1100, 1000, 800, 700, 600, 500, 400 MHz
> pci0 at mainbus0 bus 0
> pchb0 at pci0 dev 0 function 0 "Intel Core 8G Host" rev 0x08
> inteldrm0 at pci0 dev 2 function 0 "Intel UHD Graphics 620" rev 0x07
> drm0 at inteldrm0
> inteldrm0: msi, KABYLAKE, gen 9
> "Intel Core 6G Thermal" rev 0x08 at pci0 dev 4 function 0 not configured
> "Intel Core GMM" rev 0x00 at pci0 dev 8 function 0 not configured
> xhci0 at pci0 dev 20 function 0 "Intel 100 Series xHCI" rev 0x21: msi, xHCI 
> 1.0
> usb0 at xhci0: USB revision 3.0
> uhub0 at usb0 configuration 1 interface 0 "Intel xHCI root hub" rev 3.00/1.00 
> addr 1
> pchtemp0 at pci0 dev 20 function 2 "Intel 100 Series Thermal" rev 0x21
> "Intel 100 Series MEI" rev 0x21 at pci0 dev 22 function 0 not configured
> ppb0 at pci0 dev 28 function 0 "Intel 100 Series PCIE" rev 0xf1: msi
> pci1 at ppb0 bus 2
> iwm0 at pci1 dev 0 function 0 "Intel Dual Band Wireless-AC 8265" rev 0x78, msi
> ppb1 at pci0 dev 28 function 4 "Intel 100 Series PCIE" rev 0xf1: msi
> pci2 at ppb1 bus 4
> nvme0 at pci2 dev 0 function 0 "Samsung SM981/PM981 NVMe" rev 0x00: msix, 
> NVMe 1.2
> nvme0: SAMSUNG MZVLB1T0HALR-000L7, firmware 4L2QEXA7, serial S3TPNX0K919907
> scsibus1 at nvme0: 2 targets, initiator 0
> sd0 at scsibus1 targ 1 lun 0: <NVMe, SAMSUNG MZVLB1T0, 4L2Q>
> sd0: 976762MB, 512 bytes/sector, 2000409264 sectors
> pcib0 at pci0 dev 31 function 0 "Intel 200 Series LPC" rev 0x21
> "Intel 100 Series PMC" rev 0x21 at pci0 dev 31 function 2 not configured
> azalia0 at pci0 dev 31 function 3 "Intel 200 Series HD Audio" rev 0x21: msi
> azalia0: codecs: Realtek ALC285, Intel/0x280b, using Realtek ALC285
> audio0 at azalia0
> ichiic0 at pci0 dev 31 function 4 "Intel 100 Series SMBus" rev 0x21: apic 2 
> int 16
> iic0 at ichiic0
> em0 at pci0 dev 31 function 6 "Intel I219-LM" rev 0x21: msi, address 
> e8:6a:64:78:68:5f
> isa0 at pcib0
> isadma0 at isa0
> pckbc0 at isa0 port 0x60/5 irq 1 irq 12
> pckbd0 at pckbc0 (kbd slot)
> wskbd0 at pckbd0: console keyboard
> pms0 at pckbc0 (aux slot)
> wsmouse0 at pms0 mux 0
> wsmouse1 at pms0 mux 0
> pms0: Synaptics clickpad, firmware 9.16, 0x1e2a1 0x940300 0x33d840 0xf00ba3 
> 0x12e800
> pcppi0 at isa0 port 0x61
> spkr0 at pcppi0
> vmm0 at mainbus0: VMX/EPT
> efifb at mainbus0 not configured
> dt: 451 probes
> Profiling kernel, textsize=31479140 [ffffffff80000000..ffffffff81e05564]
> uvideo0 at uhub0 port 5 configuration 1 interface 0 "SunplusIT Inc Integrated 
> IR Camera" rev 2.01/0.12 addr 2
> uvideo0: device not supported
> ugen0 at uhub0 port 7 "Intel Bluetooth" rev 2.00/0.10 addr 3
> uvideo1 at uhub0 port 8 configuration 1 interface 0 "Chicony Electronics 
> Co.,Ltd. Integrated Camera" rev 2.01/0.12 addr 4
> video0 at uvideo1
> ugen1 at uhub0 port 9 "Synaptics product 0x009a" rev 2.00/1.64 addr 5
> umass0 at uhub0 port 15 configuration 1 interface 0 "Generic USB3.0-CRW" rev 
> 3.00/29.08 addr 6
> umass0: using SCSI over Bulk-Only
> scsibus2 at umass0: 2 targets, initiator 0
> sd1 at scsibus2 targ 1 lun 0: <Generic-, SD/MMC CRW, 1.00> removable 
> serial.0bda0328008282014000
> vscsi0 at root
> scsibus3 at vscsi0: 256 targets
> softraid0 at root
> scsibus4 at softraid0: 256 targets
> root on sd0a (847a92f1384e1964.a) swap on sd0b dump on sd0b
> WARNING: / was not properly unmounted
> inteldrm0: 2560x1440, 32bpp
> wsdisplay0 at inteldrm0 mux 1: console (std, vt100 emulation), using wskbd0
> wsdisplay0: screen 1-5 added (std, vt100 emulation)
> iwm0: hw rev 0x230, fw ver 36.ca7b901d.0, address 98:3b:8f:ef:6b:ef
> 
> 
> > 
> > Thanks a lot,
> > Martin
> > 
> > diff --git sys/arch/amd64/conf/GENERIC.MP sys/arch/amd64/conf/GENERIC.MP
> > index bb842f6d96e..e5334c19eac 100644
> > --- sys/arch/amd64/conf/GENERIC.MP
> > +++ sys/arch/amd64/conf/GENERIC.MP
> > @@ -4,6 +4,6 @@ include "arch/amd64/conf/GENERIC"
> >  
> >  option     MULTIPROCESSOR
> >  #option    MP_LOCKDEBUG
> > -#option    WITNESS
> > +option     WITNESS
> >  
> >  cpu*               at mainbus?
> > diff --git sys/arch/i386/conf/GENERIC.MP sys/arch/i386/conf/GENERIC.MP
> > index 980a572b8fd..ef7ded61501 100644
> > --- sys/arch/i386/conf/GENERIC.MP
> > +++ sys/arch/i386/conf/GENERIC.MP
> > @@ -7,6 +7,6 @@ include "arch/i386/conf/GENERIC"
> >  
> >  option             MULTIPROCESSOR  # Multiple processor support
> >  #option            MP_LOCKDEBUG
> > -#option            WITNESS
> > +option             WITNESS
> >  
> >  cpu*               at mainbus?
> > diff --git sys/dev/pci/drm/i915/gem/i915_gem_shmem.c sys/dev/pci/drm/i915/gem/i915_gem_shmem.c
> > index ce8e2eca141..47b567087e7 100644
> > --- sys/dev/pci/drm/i915/gem/i915_gem_shmem.c
> > +++ sys/dev/pci/drm/i915/gem/i915_gem_shmem.c
> > @@ -268,8 +268,10 @@ shmem_truncate(struct drm_i915_gem_object *obj)
> >  #ifdef __linux__
> >     shmem_truncate_range(file_inode(obj->base.filp), 0, (loff_t)-1);
> >  #else
> > +   rw_enter(obj->base.uao->vmobjlock, RW_WRITE);
> >     obj->base.uao->pgops->pgo_flush(obj->base.uao, 0, obj->base.size,
> >         PGO_ALLPAGES | PGO_FREE);
> > +   rw_exit(obj->base.uao->vmobjlock);
> >  #endif
> >     obj->mm.madv = __I915_MADV_PURGED;
> >     obj->mm.pages = ERR_PTR(-EFAULT);
> > diff --git sys/dev/pci/drm/radeon/radeon_ttm.c sys/dev/pci/drm/radeon/radeon_ttm.c
> > index eb879b5c72c..837a9f94298 100644
> > --- sys/dev/pci/drm/radeon/radeon_ttm.c
> > +++ sys/dev/pci/drm/radeon/radeon_ttm.c
> > @@ -1006,6 +1006,8 @@ radeon_ttm_fault(struct uvm_faultinfo *ufi, vaddr_t vaddr, vm_page_t *pps,
> >     struct radeon_device *rdev;
> >     int r;
> >  
> > +   KASSERT(rw_write_held(ufi->entry->object.uvm_obj->vmobjlock));
> > +
> >     bo = (struct drm_gem_object *)ufi->entry->object.uvm_obj;
> >     rdev = bo->dev->dev_private;
> >     down_read(&rdev->pm.mclk_lock);
> > diff --git sys/uvm/uvm_aobj.c sys/uvm/uvm_aobj.c
> > index 20051d95dc1..a5c403ab67d 100644
> > --- sys/uvm/uvm_aobj.c
> > +++ sys/uvm/uvm_aobj.c
> > @@ -184,7 +184,7 @@ const struct uvm_pagerops aobj_pager = {
> >   * deadlock.
> >   */
> >  static LIST_HEAD(aobjlist, uvm_aobj) uao_list = LIST_HEAD_INITIALIZER(uao_list);
> > -static struct mutex uao_list_lock = MUTEX_INITIALIZER(IPL_NONE);
> > +static struct mutex uao_list_lock = MUTEX_INITIALIZER(IPL_MPFLOOR);
> >  
> >  
> >  /*
> > @@ -277,6 +277,7 @@ uao_find_swslot(struct uvm_object *uobj, int pageidx)
> >   * uao_set_swslot: set the swap slot for a page in an aobj.
> >   *
> >   * => setting a slot to zero frees the slot
> > + * => object must be locked by caller
> >   * => we return the old slot number, or -1 if we failed to allocate
> >   *    memory to record the new slot number
> >   */
> > @@ -286,7 +287,7 @@ uao_set_swslot(struct uvm_object *uobj, int pageidx, int slot)
> >     struct uvm_aobj *aobj = (struct uvm_aobj *)uobj;
> >     int oldslot;
> >  
> > -   KERNEL_ASSERT_LOCKED();
> > +   KASSERT(rw_write_held(uobj->vmobjlock) || uobj->uo_refs == 0);
> >     KASSERT(UVM_OBJ_IS_AOBJ(uobj));
> >  
> >     /*
> > @@ -358,7 +359,9 @@ uao_free(struct uvm_aobj *aobj)
> >     struct uvm_object *uobj = &aobj->u_obj;
> >  
> >     KASSERT(UVM_OBJ_IS_AOBJ(uobj));
> > +   KASSERT(rw_write_held(uobj->vmobjlock));
> >     uao_dropswap_range(uobj, 0, 0);
> > +   rw_exit(uobj->vmobjlock);
> >  
> >     if (UAO_USES_SWHASH(aobj)) {
> >             /*
> > @@ -671,6 +674,7 @@ struct uvm_object *
> >  uao_create(vsize_t size, int flags)
> >  {
> >     static struct uvm_aobj kernel_object_store;
> > +   static struct rwlock bootstrap_kernel_object_lock;
> >     static int kobj_alloced = 0;
> >     int pages = round_page(size) >> PAGE_SHIFT;
> >     struct uvm_aobj *aobj;
> > @@ -742,6 +746,11 @@ uao_create(vsize_t size, int flags)
> >      * Initialise UVM object.
> >      */
> >     uvm_obj_init(&aobj->u_obj, &aobj_pager, refs);
> > +   if (flags & UAO_FLAG_KERNOBJ) {
> > +           /* Use a temporary static lock for kernel_object. */
> > +           rw_init(&bootstrap_kernel_object_lock, "kobjlk");
> > +           uvm_obj_setlock(&aobj->u_obj, &bootstrap_kernel_object_lock);
> > +   }
> >  
> >     /*
> >      * now that aobj is ready, add it to the global list
> > @@ -822,20 +831,20 @@ uao_detach(struct uvm_object *uobj)
> >      * involved in is complete), release any swap resources and free
> >      * the page itself.
> >      */
> > -   uvm_lock_pageq();
> > -   while((pg = RBT_ROOT(uvm_objtree, &uobj->memt)) != NULL) {
> > +   rw_enter(uobj->vmobjlock, RW_WRITE);
> > +   while ((pg = RBT_ROOT(uvm_objtree, &uobj->memt)) != NULL) {
> > +           pmap_page_protect(pg, PROT_NONE);
> >             if (pg->pg_flags & PG_BUSY) {
> >                     atomic_setbits_int(&pg->pg_flags, PG_WANTED);
> > -                   uvm_unlock_pageq();
> > -                   tsleep_nsec(pg, PVM, "uao_det", INFSLP);
> > -                   uvm_lock_pageq();
> > +                   rwsleep_nsec(pg, uobj->vmobjlock, PVM, "uao_det",
> > +                       INFSLP);
> >                     continue;
> >             }
> > -           pmap_page_protect(pg, PROT_NONE);
> >             uao_dropswap(&aobj->u_obj, pg->offset >> PAGE_SHIFT);
> > +           uvm_lock_pageq();
> >             uvm_pagefree(pg);
> > +           uvm_unlock_pageq();
> >     }
> > -   uvm_unlock_pageq();
> >  
> >     /*
> >      * Finally, free the anonymous UVM object itself.
> > @@ -864,7 +873,7 @@ uao_flush(struct uvm_object *uobj, voff_t start, voff_t stop, int flags)
> >     voff_t curoff;
> >  
> >     KASSERT(UVM_OBJ_IS_AOBJ(uobj));
> > -   KERNEL_ASSERT_LOCKED();
> > +   KASSERT(rw_write_held(uobj->vmobjlock));
> >  
> >     if (flags & PGO_ALLPAGES) {
> >             start = 0;
> > @@ -901,7 +910,8 @@ uao_flush(struct uvm_object *uobj, voff_t start, voff_t stop, int flags)
> >             /* Make sure page is unbusy, else wait for it. */
> >             if (pp->pg_flags & PG_BUSY) {
> >                     atomic_setbits_int(&pp->pg_flags, PG_WANTED);
> > -                   tsleep_nsec(pp, PVM, "uaoflsh", INFSLP);
> > +                   rwsleep_nsec(pp, uobj->vmobjlock, PVM, "uaoflsh",
> > +                       INFSLP);
> >                     curoff -= PAGE_SIZE;
> >                     continue;
> >             }
> > @@ -972,7 +982,7 @@ uao_flush(struct uvm_object *uobj, voff_t start, voff_t stop, int flags)
> >   * 2: page is zero-fill    -> allocate a new page and zero it.
> >   * 3: page is swapped out  -> fetch the page from swap.
> >   *
> > - * cases 1 and 2 can be handled with PGO_LOCKED, case 3 cannot.
> > + * case 1 can be handled with PGO_LOCKED, cases 2 and 3 cannot.
> >   * so, if the "center" page hits case 3 (or any page, with PGO_ALLPAGES),
> >   * then we will need to return VM_PAGER_UNLOCK.
> >   *
> > @@ -992,7 +1002,7 @@ uao_get(struct uvm_object *uobj, voff_t offset, struct vm_page **pps,
> >     boolean_t done;
> >  
> >     KASSERT(UVM_OBJ_IS_AOBJ(uobj));
> > -   KERNEL_ASSERT_LOCKED();
> > +   KASSERT(rw_write_held(uobj->vmobjlock));
> >  
> >     /*
> >      * get number of pages
> > @@ -1115,7 +1125,10 @@ uao_get(struct uvm_object *uobj, voff_t offset, struct vm_page **pps,
> >  
> >                             /* out of RAM? */
> >                             if (ptmp == NULL) {
> > +                                   rw_exit(uobj->vmobjlock);
> >                                     uvm_wait("uao_getpage");
> > +                                   rw_enter(uobj->vmobjlock, RW_WRITE);
> > +                                   /* goto top of pps while loop */
> >                                     continue;
> >                             }
> >  
> > @@ -1135,7 +1148,8 @@ uao_get(struct uvm_object *uobj, voff_t offset, struct vm_page **pps,
> >                     /* page is there, see if we need to wait on it */
> >                     if ((ptmp->pg_flags & PG_BUSY) != 0) {
> >                             atomic_setbits_int(&ptmp->pg_flags, PG_WANTED);
> > -                           tsleep_nsec(ptmp, PVM, "uao_get", INFSLP);
> > +                           rwsleep_nsec(ptmp, uobj->vmobjlock, PVM,
> > +                               "uao_get", INFSLP);
> >                             continue;       /* goto top of pps while loop */
> >                     }
> >  
> > @@ -1169,8 +1183,12 @@ uao_get(struct uvm_object *uobj, voff_t offset, struct vm_page **pps,
> >             } else {
> >                     /*
> >                      * page in the swapped-out page.
> > +                    * unlock object for i/o, relock when done.
> >                      */
> > +
> > +                   rw_exit(uobj->vmobjlock);
> >                     rv = uvm_swap_get(ptmp, swslot, PGO_SYNCIO);
> > +                   rw_enter(uobj->vmobjlock, RW_WRITE);
> >  
> >                     /*
> >                      * I/O done.  check for errors.
> > @@ -1194,6 +1212,7 @@ uao_get(struct uvm_object *uobj, voff_t offset, struct vm_page **pps,
> >                             uvm_lock_pageq();
> >                             uvm_pagefree(ptmp);
> >                             uvm_unlock_pageq();
> > +                           rw_exit(uobj->vmobjlock);
> >  
> >                             return rv;
> >                     }
> > @@ -1215,11 +1234,14 @@ uao_get(struct uvm_object *uobj, voff_t offset, struct vm_page **pps,
> >  
> >     }       /* lcv loop */
> >  
> > +   rw_exit(uobj->vmobjlock);
> >     return VM_PAGER_OK;
> >  }
> >  
> >  /*
> >   * uao_dropswap:  release any swap resources from this aobj page.
> > + *
> > + * => aobj must be locked or have a reference count of 0.
> >   */
> >  int
> >  uao_dropswap(struct uvm_object *uobj, int pageidx)
> > @@ -1238,6 +1260,7 @@ uao_dropswap(struct uvm_object *uobj, int pageidx)
> >  /*
> >   * page in every page in every aobj that is paged-out to a range of swslots.
> >   * 
> > + * => aobj must be locked and is returned locked.
> >   * => returns TRUE if pagein was aborted due to lack of memory.
> >   */
> >  boolean_t
> > @@ -1272,7 +1295,9 @@ uao_swap_off(int startslot, int endslot)
> >             /*
> >              * Page in all pages in the swap slot range.
> >              */
> > +           rw_enter(aobj->u_obj.vmobjlock, RW_WRITE);
> >             rv = uao_pagein(aobj, startslot, endslot);
> > +           rw_exit(aobj->u_obj.vmobjlock);
> >  
> >             /* Drop the reference of the current object. */
> >             uao_detach(&aobj->u_obj);
> > @@ -1375,14 +1400,21 @@ restart:
> >  static boolean_t
> >  uao_pagein_page(struct uvm_aobj *aobj, int pageidx)
> >  {
> > +   struct uvm_object *uobj = &aobj->u_obj;
> >     struct vm_page *pg;
> >     int rv, slot, npages;
> >  
> >     pg = NULL;
> >     npages = 1;
> > +
> > +   KASSERT(rw_write_held(uobj->vmobjlock));
> >     rv = uao_get(&aobj->u_obj, (voff_t)pageidx << PAGE_SHIFT,
> >         &pg, &npages, 0, PROT_READ | PROT_WRITE, 0, 0);
> >  
> > +   /*
> > +    * relock and finish up.
> > +    */
> > +   rw_enter(uobj->vmobjlock, RW_WRITE);
> >     switch (rv) {
> >     case VM_PAGER_OK:
> >             break;
> > @@ -1430,7 +1462,7 @@ uao_dropswap_range(struct uvm_object *uobj, voff_t start, voff_t end)
> >     int swpgonlydelta = 0;
> >  
> >     KASSERT(UVM_OBJ_IS_AOBJ(uobj));
> > -   /* KASSERT(mutex_owned(uobj->vmobjlock)); */
> > +   KASSERT(rw_write_held(uobj->vmobjlock));
> >  
> >     if (end == 0) {
> >             end = INT64_MAX;
> > diff --git sys/uvm/uvm_device.c sys/uvm/uvm_device.c
> > index e5d035f2947..994ab537a82 100644
> > --- sys/uvm/uvm_device.c
> > +++ sys/uvm/uvm_device.c
> > @@ -166,7 +166,9 @@ udv_attach(dev_t device, vm_prot_t accessprot, voff_t off, vsize_t size)
> >                     /*
> >                      * bump reference count, unhold, return.
> >                      */
> > +                   rw_enter(lcv->u_obj.vmobjlock, RW_WRITE);
> >                     lcv->u_obj.uo_refs++;
> > +                   rw_exit(lcv->u_obj.vmobjlock);
> >  
> >                     mtx_enter(&udv_lock);
> >                     if (lcv->u_flags & UVM_DEVICE_WANTED)
> > @@ -228,8 +230,9 @@ udv_attach(dev_t device, vm_prot_t accessprot, voff_t off, vsize_t size)
> >  static void
> >  udv_reference(struct uvm_object *uobj)
> >  {
> > -   KERNEL_ASSERT_LOCKED();
> > +   rw_enter(uobj->vmobjlock, RW_WRITE);
> >     uobj->uo_refs++;
> > +   rw_exit(uobj->vmobjlock);
> >  }
> >  
> >  /*
> > @@ -248,8 +251,10 @@ udv_detach(struct uvm_object *uobj)
> >      * loop until done
> >      */
> >  again:
> > +   rw_enter(uobj->vmobjlock, RW_WRITE);
> >     if (uobj->uo_refs > 1) {
> >             uobj->uo_refs--;
> > +           rw_exit(uobj->vmobjlock);
> >             return;
> >     }
> >     KASSERT(uobj->uo_npages == 0 && RBT_EMPTY(uvm_objtree, &uobj->memt));
> > @@ -260,10 +265,7 @@ again:
> >     mtx_enter(&udv_lock);
> >     if (udv->u_flags & UVM_DEVICE_HOLD) {
> >             udv->u_flags |= UVM_DEVICE_WANTED;
> > -           /*
> > -            * lock interleaving. -- this is ok in this case since the
> > -            * locks are both IPL_NONE
> > -            */
> > +           rw_exit(uobj->vmobjlock);
> >             msleep_nsec(udv, &udv_lock, PVM | PNORELOCK, "udv_detach",
> >                 INFSLP);
> >             goto again;
> > @@ -276,6 +278,7 @@ again:
> >     if (udv->u_flags & UVM_DEVICE_WANTED)
> >             wakeup(udv);
> >     mtx_leave(&udv_lock);
> > +   rw_exit(uobj->vmobjlock);
> >  
> >     uvm_obj_destroy(uobj);
> >     free(udv, M_TEMP, sizeof(*udv));
> > diff --git sys/uvm/uvm_fault.c sys/uvm/uvm_fault.c
> > index c90d9b3fa81..ed72f1bbf92 100644
> > --- sys/uvm/uvm_fault.c
> > +++ sys/uvm/uvm_fault.c
> > @@ -326,7 +326,8 @@ uvmfault_anonget(struct uvm_faultinfo *ufi, struct vm_amap *amap,
> >                     if (pg->uobject) {
> >                             /* Owner of page is UVM object. */
> >                             uvmfault_unlockall(ufi, amap, NULL);
> > -                           tsleep_nsec(pg, PVM, "anonget1", INFSLP);
> > +                           rwsleep_nsec(pg, pg->uobject->vmobjlock,
> > +                               PVM | PNORELOCK, "anonget1", INFSLP);
> >                     } else {
> >                             /* Owner of page is anon. */
> >                             uvmfault_unlockall(ufi, NULL, NULL);
> > @@ -620,6 +621,7 @@ uvm_fault(vm_map_t orig_map, vaddr_t vaddr, vm_fault_t fault_type,
> >                      */
> >                     if (uobj != NULL && uobj->pgops->pgo_fault != NULL) {
> >                             KERNEL_LOCK();
> > +                           rw_enter(uobj->vmobjlock, RW_WRITE);
> >                             error = uobj->pgops->pgo_fault(&ufi,
> >                                 flt.startva, pages, flt.npages,
> >                                 flt.centeridx, fault_type, flt.access_type,
> > @@ -634,10 +636,8 @@ uvm_fault(vm_map_t orig_map, vaddr_t vaddr, vm_fault_t fault_type,
> >                                     error = EACCES;
> >                     } else {
> >                             /* case 2: fault on backing obj or zero fill */
> > -                           KERNEL_LOCK();
> >                             error = uvm_fault_lower(&ufi, &flt, pages,
> >                                 fault_type);
> > -                           KERNEL_UNLOCK();
> >                     }
> >             }
> >     }
> > @@ -793,10 +793,10 @@ uvm_fault_check(struct uvm_faultinfo *ufi, struct uvm_faultctx *flt,
> >                     voff_t uoff;
> >  
> >                     uoff = (flt->startva - ufi->entry->start) + ufi->entry->offset;
> > -                   KERNEL_LOCK();
> > +                   rw_enter(uobj->vmobjlock, RW_WRITE);
> >                     (void) uobj->pgops->pgo_flush(uobj, uoff, uoff +
> >                         ((vsize_t)nback << PAGE_SHIFT), PGO_DEACTIVATE);
> > -                   KERNEL_UNLOCK();
> > +                   rw_exit(uobj->vmobjlock);
> >             }
> >  
> >             /* now forget about the backpages */
> > @@ -1098,6 +1098,8 @@ uvm_fault_lower_lookup(
> >     int lcv, gotpages;
> >     vaddr_t currva;
> >  
> > +   rw_enter(uobj->vmobjlock, RW_WRITE);
> > +
> >     counters_inc(uvmexp_counters, flt_lget);
> >     gotpages = flt->npages;
> >     (void) uobj->pgops->pgo_get(uobj,
> > @@ -1211,6 +1213,14 @@ uvm_fault_lower(struct uvm_faultinfo *ufi, struct uvm_faultctx *flt,
> >      * made it BUSY.
> >      */
> >  
> > +   /*
> > +    * locked:
> > +    */
> > +   KASSERT(amap == NULL ||
> > +       rw_write_held(amap->am_lock));
> > +   KASSERT(uobj == NULL ||
> > +       rw_write_held(uobj->vmobjlock));
> > +
> >     /*
> >      * note that uobjpage can not be PGO_DONTCARE at this point.  we now
> >      * set uobjpage to PGO_DONTCARE if we are doing a zero fill.  if we
> > @@ -1268,6 +1278,7 @@ uvm_fault_lower(struct uvm_faultinfo *ufi, struct uvm_faultctx *flt,
> >                             return (EIO);
> >  
> >                     uobjpage = PGO_DONTCARE;
> > +                   uobj = NULL;
> >                     promote = TRUE;
> >             }
> >  
> > @@ -1276,6 +1287,12 @@ uvm_fault_lower(struct uvm_faultinfo *ufi, struct uvm_faultctx *flt,
> >             if (locked && amap != NULL)
> >                     amap_lock(amap);
> >  
> > +           /* might be changed */
> > +           if (uobjpage != PGO_DONTCARE) {
> > +                   uobj = uobjpage->uobject;
> > +                   rw_enter(uobj->vmobjlock, RW_WRITE);
> > +           }
> > +
> >             /*
> >              * Re-verify that amap slot is still free. if there is
> >              * a problem, we clean up.
> > @@ -1300,10 +1317,12 @@ uvm_fault_lower(struct uvm_faultinfo *ufi, struct uvm_faultctx *flt,
> >                     atomic_clearbits_int(&uobjpage->pg_flags,
> >                         PG_BUSY|PG_WANTED);
> >                     UVM_PAGE_OWN(uobjpage, NULL);
> > -                   return ERESTART;
> >             }
> > -           if (locked == FALSE)
> > +
> > +           if (locked == FALSE) {
> > +                   rw_exit(uobj->vmobjlock);
> >                     return ERESTART;
> > +           }
> >  
> >             /*
> >              * we have the data in uobjpage which is PG_BUSY
> > @@ -1423,6 +1442,7 @@ uvm_fault_lower(struct uvm_faultinfo *ufi, struct uvm_faultctx *flt,
> >                     uvm_lock_pageq();
> >                     uvm_pageactivate(uobjpage);
> >                     uvm_unlock_pageq();
> > +                   rw_exit(uobj->vmobjlock);
> >                     uobj = NULL;
> >             } else {
> >                     counters_inc(uvmexp_counters, flt_przero);
> > @@ -1434,7 +1454,7 @@ uvm_fault_lower(struct uvm_faultinfo *ufi, struct uvm_faultctx *flt,
> >  
> >             if (amap_add(&ufi->entry->aref,
> >                 ufi->orig_rvaddr - ufi->entry->start, anon, 0)) {
> > -                   uvmfault_unlockall(ufi, amap, NULL);
> > +                   uvmfault_unlockall(ufi, amap, uobj);
> >                     uvm_anfree(anon);
> >                     counters_inc(uvmexp_counters, flt_noamap);
> >  
> > @@ -1483,25 +1503,32 @@ uvm_fault_lower(struct uvm_faultinfo *ufi, struct uvm_faultctx *flt,
> >             return ERESTART;
> >     }
> >  
> > -   uvm_lock_pageq();
> > -
> >     if (fault_type == VM_FAULT_WIRE) {
> > +           uvm_lock_pageq();
> >             uvm_pagewire(pg);
> > +           uvm_unlock_pageq();
> >             if (pg->pg_flags & PQ_AOBJ) {
> >                     /*
> >                      * since the now-wired page cannot be paged out,
> >                      * release its swap resources for others to use.
> > -                    * since an aobj page with no swap cannot be PG_CLEAN,
> > -                    * clear its clean flag now.
> > +                    * since an aobj page with no swap cannot be clean,
> > +                    * mark it dirty now.
> > +                    *
> > +                    * use pg->uobject here.  if the page is from a
> > +                    * tmpfs vnode, the pages are backed by its UAO and
> > +                    * not the vnode.
> >                      */
> > +                   KASSERT(uobj != NULL);
> > +                   KASSERT(uobj->vmobjlock == pg->uobject->vmobjlock);
> >                     atomic_clearbits_int(&pg->pg_flags, PG_CLEAN);
> >                     uao_dropswap(uobj, pg->offset >> PAGE_SHIFT);
> >             }
> >     } else {
> >             /* activate it */
> > +           uvm_lock_pageq();
> >             uvm_pageactivate(pg);
> > +           uvm_unlock_pageq();
> >     }
> > -   uvm_unlock_pageq();
> >  
> >     if (pg->pg_flags & PG_WANTED)
> >             wakeup(pg);
> > @@ -1567,7 +1594,7 @@ uvm_fault_unwire(vm_map_t map, vaddr_t start, vaddr_t end)
> >  void
> >  uvm_fault_unwire_locked(vm_map_t map, vaddr_t start, vaddr_t end)
> >  {
> > -   vm_map_entry_t entry, next;
> > +   vm_map_entry_t entry, oentry = NULL, next;
> >     pmap_t pmap = vm_map_pmap(map);
> >     vaddr_t va;
> >     paddr_t pa;
> > @@ -1578,12 +1605,9 @@ uvm_fault_unwire_locked(vm_map_t map, vaddr_t start, vaddr_t end)
> >     /*
> >      * we assume that the area we are unwiring has actually been wired
> >      * in the first place.   this means that we should be able to extract
> > -    * the PAs from the pmap.   we also lock out the page daemon so that
> > -    * we can call uvm_pageunwire.
> > +    * the PAs from the pmap.
> >      */
> >  
> > -   uvm_lock_pageq();
> > -
> >     /*
> >      * find the beginning map entry for the region.
> >      */
> > @@ -1605,6 +1629,17 @@ uvm_fault_unwire_locked(vm_map_t map, vaddr_t start, vaddr_t end)
> >                     entry = next;
> >             }
> >  
> > +           /*
> > +            * lock it.
> > +            */
> > +           if (entry != oentry) {
> > +                   if (oentry != NULL) {
> > +                           uvm_map_unlock_entry(oentry);
> > +                   }
> > +                   uvm_map_lock_entry(entry);
> > +                   oentry = entry;
> > +           }
> > +
> >             /*
> >              * if the entry is no longer wired, tell the pmap.
> >              */
> > @@ -1612,11 +1647,16 @@ uvm_fault_unwire_locked(vm_map_t map, vaddr_t start, vaddr_t end)
> >                     pmap_unwire(pmap, va);
> >  
> >             pg = PHYS_TO_VM_PAGE(pa);
> > -           if (pg)
> > +           if (pg) {
> > +                   uvm_lock_pageq();
> >                     uvm_pageunwire(pg);
> > +                   uvm_unlock_pageq();
> > +           }
> >     }
> >  
> > -   uvm_unlock_pageq();
> > +   if (oentry != NULL) {
> > +           uvm_map_unlock_entry(entry);
> > +   }
> >  }
> >  
> >  /*
> > @@ -1650,6 +1690,8 @@ void
> >  uvmfault_unlockall(struct uvm_faultinfo *ufi, struct vm_amap *amap,
> >      struct uvm_object *uobj)
> >  {
> > +   if (uobj)
> > +           rw_exit(uobj->vmobjlock);
> >     if (amap != NULL)
> >             amap_unlock(amap);
> >     uvmfault_unlockmaps(ufi, FALSE);
> > diff --git sys/uvm/uvm_km.c sys/uvm/uvm_km.c
> > index fc31ae99dff..5f36935c09d 100644
> > --- sys/uvm/uvm_km.c
> > +++ sys/uvm/uvm_km.c
> > @@ -249,13 +249,15 @@ uvm_km_pgremove(struct uvm_object *uobj, vaddr_t startva, vaddr_t endva)
> >     int swpgonlydelta = 0;
> >  
> >     KASSERT(UVM_OBJ_IS_AOBJ(uobj));
> > +   KASSERT(rw_write_held(uobj->vmobjlock));
> >  
> >     pmap_remove(pmap_kernel(), startva, endva);
> >     for (curoff = start ; curoff < end ; curoff += PAGE_SIZE) {
> >             pp = uvm_pagelookup(uobj, curoff);
> >             if (pp && pp->pg_flags & PG_BUSY) {
> >                     atomic_setbits_int(&pp->pg_flags, PG_WANTED);
> > -                   tsleep_nsec(pp, PVM, "km_pgrm", INFSLP);
> > +                   rwsleep_nsec(pp, uobj->vmobjlock, PVM, "km_pgrm",
> > +                       INFSLP);
> >                     curoff -= PAGE_SIZE; /* loop back to us */
> >                     continue;
> >             }
> > @@ -383,6 +385,9 @@ uvm_km_kmemalloc_pla(struct vm_map *map, struct uvm_object *obj, vsize_t size,
> >             return (0);
> >     }
> >  
> > +   if (obj != NULL)
> > +           rw_enter(obj->vmobjlock, RW_WRITE);
> > +
> >     loopva = kva;
> >     while (loopva != kva + size) {
> >             pg = TAILQ_FIRST(&pgl);
> > @@ -409,6 +414,9 @@ uvm_km_kmemalloc_pla(struct vm_map *map, struct uvm_object *obj, vsize_t size,
> >     KASSERT(TAILQ_EMPTY(&pgl));
> >     pmap_update(pmap_kernel());
> >  
> > +   if (obj != NULL)
> > +           rw_exit(obj->vmobjlock);
> > +
> >     return kva;
> >  }
> >  
> > @@ -474,12 +482,14 @@ uvm_km_alloc1(struct vm_map *map, vsize_t size, vsize_t align, boolean_t zeroit)
> >     /* now allocate the memory.  we must be careful about released pages. */
> >     loopva = kva;
> >     while (size) {
> > +           rw_enter(uvm.kernel_object->vmobjlock, RW_WRITE);
> >             /* allocate ram */
> >             pg = uvm_pagealloc(uvm.kernel_object, offset, NULL, 0);
> >             if (pg) {
> >                     atomic_clearbits_int(&pg->pg_flags, PG_BUSY);
> >                     UVM_PAGE_OWN(pg, NULL);
> >             }
> > +           rw_exit(uvm.kernel_object->vmobjlock);
> >             if (__predict_false(pg == NULL)) {
> >                     if (curproc == uvm.pagedaemon_proc) {
> >                             /*
> > diff --git sys/uvm/uvm_map.c sys/uvm/uvm_map.c
> > index d153bbfd20b..06553a814c6 100644
> > --- sys/uvm/uvm_map.c
> > +++ sys/uvm/uvm_map.c
> > @@ -124,6 +124,8 @@ struct vm_map_entry     *uvm_mapent_alloc(struct vm_map*, int);
> >  void                        uvm_mapent_free(struct vm_map_entry*);
> >  void                        uvm_unmap_kill_entry(struct vm_map*,
> >                         struct vm_map_entry*);
> > +void                        uvm_unmap_kill_entry_withlock(struct vm_map *,
> > +                       struct vm_map_entry *, int);
> >  void                        uvm_unmap_detach_intrsafe(struct uvm_map_deadq *);
> >  void                        uvm_mapent_mkfree(struct vm_map*,
> >                         struct vm_map_entry*, struct vm_map_entry**,
> > @@ -499,6 +501,28 @@ uvm_map_reference(struct vm_map *map)
> >     atomic_inc_int(&map->ref_count);
> >  }
> >  
> > +void
> > +uvm_map_lock_entry(struct vm_map_entry *entry)
> > +{
> > +   if (entry->aref.ar_amap != NULL) {
> > +           amap_lock(entry->aref.ar_amap);
> > +   }
> > +   if (UVM_ET_ISOBJ(entry)) {
> > +           rw_enter(entry->object.uvm_obj->vmobjlock, RW_WRITE);
> > +   }
> > +}
> > +
> > +void
> > +uvm_map_unlock_entry(struct vm_map_entry *entry)
> > +{
> > +   if (UVM_ET_ISOBJ(entry)) {
> > +           rw_exit(entry->object.uvm_obj->vmobjlock);
> > +   }
> > +   if (entry->aref.ar_amap != NULL) {
> > +           amap_unlock(entry->aref.ar_amap);
> > +   }
> > +}
> > +
> >  /*
> >   * Calculate the dused delta.
> >   */
> > @@ -2101,7 +2125,8 @@ uvm_mapent_mkfree(struct vm_map *map, struct vm_map_entry *entry,
> >   * Unwire and release referenced amap and object from map entry.
> >   */
> >  void
> > -uvm_unmap_kill_entry(struct vm_map *map, struct vm_map_entry *entry)
> > +uvm_unmap_kill_entry_withlock(struct vm_map *map, struct vm_map_entry *entry,
> > +    int needlock)
> >  {
> >     /* Unwire removed map entry. */
> >     if (VM_MAPENT_ISWIRED(entry)) {
> > @@ -2111,6 +2136,9 @@ uvm_unmap_kill_entry(struct vm_map *map, struct vm_map_entry *entry)
> >             KERNEL_UNLOCK();
> >     }
> >  
> > +   if (needlock)
> > +           uvm_map_lock_entry(entry);
> > +
> >     /* Entry-type specific code. */
> >     if (UVM_ET_ISHOLE(entry)) {
> >             /* Nothing to be done for holes. */
> > @@ -2157,17 +2185,19 @@ uvm_unmap_kill_entry(struct vm_map *map, struct vm_map_entry *entry)
> >              */
> >             uvm_km_pgremove(entry->object.uvm_obj, entry->start,
> >                 entry->end);
> > -
> > -           /*
> > -            * null out kernel_object reference, we've just
> > -            * dropped it
> > -            */
> > -           entry->etype &= ~UVM_ET_OBJ;
> > -           entry->object.uvm_obj = NULL;  /* to be safe */
> >     } else {
> >             /* remove mappings the standard way. */
> >             pmap_remove(map->pmap, entry->start, entry->end);
> >     }
> > +
> > +   if (needlock)
> > +           uvm_map_unlock_entry(entry);
> > +}
> > +
> > +void
> > +uvm_unmap_kill_entry(struct vm_map *map, struct vm_map_entry *entry)
> > +{
> > +   uvm_unmap_kill_entry_withlock(map, entry, 0);
> >  }
> >  
> >  /*
> > @@ -2227,7 +2257,7 @@ uvm_unmap_remove(struct vm_map *map, vaddr_t start, vaddr_t end,
> >                     map->sserial++;
> >  
> >             /* Kill entry. */
> > -           uvm_unmap_kill_entry(map, entry);
> > +           uvm_unmap_kill_entry_withlock(map, entry, 1);
> >  
> >             /* Update space usage. */
> >             if ((map->flags & VM_MAP_ISVMSPACE) &&
> > @@ -3420,8 +3450,10 @@ uvm_map_protect(struct vm_map *map, vaddr_t start, vaddr_t end,
> >                              */
> >                             iter->wired_count = 0;
> >                     }
> > +                   uvm_map_lock_entry(iter);
> >                     pmap_protect(map->pmap, iter->start, iter->end,
> >                         iter->protection & mask);
> > +                   uvm_map_unlock_entry(iter);
> >             }
> >  
> >             /*
> > @@ -3967,11 +3999,13 @@ uvm_mapent_forkcopy(struct vmspace *new_vm, struct vm_map *new_map,
> >                      */
> >                     if (!UVM_ET_ISNEEDSCOPY(old_entry)) {
> >                             if (old_entry->max_protection & PROT_WRITE) {
> > +                                   uvm_map_lock_entry(old_entry);
> >                                     pmap_protect(old_map->pmap,
> >                                         old_entry->start,
> >                                         old_entry->end,
> >                                         old_entry->protection &
> >                                         ~PROT_WRITE);
> > +                                   uvm_map_unlock_entry(old_entry);
> >                                     pmap_update(old_map->pmap);
> >                             }
> >                             old_entry->etype |= UVM_ET_NEEDSCOPY;
> > @@ -4751,9 +4785,11 @@ flush_object:
> >                 ((flags & PGO_FREE) == 0 ||
> >                  ((entry->max_protection & PROT_WRITE) != 0 &&
> >                   (entry->etype & UVM_ET_COPYONWRITE) == 0))) {
> > +                   rw_enter(uobj->vmobjlock, RW_WRITE);
> >                     rv = uobj->pgops->pgo_flush(uobj,
> >                         cp_start - entry->start + entry->offset,
> >                         cp_end - entry->start + entry->offset, flags);
> > +                   rw_exit(uobj->vmobjlock);
> >  
> >                     if (rv == FALSE)
> >                             error = EFAULT;
> > diff --git sys/uvm/uvm_map.h sys/uvm/uvm_map.h
> > index 12092ebfcd2..6c02bc93137 100644
> > --- sys/uvm/uvm_map.h
> > +++ sys/uvm/uvm_map.h
> > @@ -442,6 +442,9 @@ void            vm_map_unbusy_ln(struct vm_map*, char*, int);
> >  #define vm_map_unbusy(map) vm_map_unbusy_ln(map, NULL, 0)
> >  #endif
> >  
> > +void               uvm_map_lock_entry(struct vm_map_entry *);
> > +void               uvm_map_unlock_entry(struct vm_map_entry *);
> > +
> >  #endif /* _KERNEL */
> >  
> >  /*
> > diff --git sys/uvm/uvm_object.c sys/uvm/uvm_object.c
> > index 675cd9de2da..8b52a14459f 100644
> > --- sys/uvm/uvm_object.c
> > +++ sys/uvm/uvm_object.c
> > @@ -1,7 +1,7 @@
> >  /* $OpenBSD: uvm_object.c,v 1.22 2021/10/23 14:42:08 mpi Exp $     */
> >  
> >  /*
> > - * Copyright (c) 2006 The NetBSD Foundation, Inc.
> > + * Copyright (c) 2006, 2010, 2019 The NetBSD Foundation, Inc.
> >   * All rights reserved.
> >   *
> >   * This code is derived from software contributed to The NetBSD Foundation
> > @@ -38,6 +38,7 @@
> >  #include <sys/systm.h>
> >  #include <sys/mman.h>
> >  #include <sys/atomic.h>
> > +#include <sys/rwlock.h>
> >  
> >  #include <uvm/uvm.h>
> >  
> > @@ -51,15 +52,27 @@ const struct uvm_pagerops bufcache_pager = {
> >     /* nothing */
> >  };
> >  
> > -/* We will fetch this page count per step */
> > +/* Page count to fetch per single step. */
> >  #define    FETCH_PAGECOUNT 16
> >  
> >  /*
> > - * uvm_obj_init: initialise a uvm object.
> > + * uvm_obj_init: initialize UVM memory object.
> >   */
> >  void
> >  uvm_obj_init(struct uvm_object *uobj, const struct uvm_pagerops *pgops, int refs)
> >  {
> > +   int alock;
> > +
> > +   alock = ((pgops != NULL) && (pgops != &pmap_pager) &&
> > +       (pgops != &bufcache_pager) && (refs != UVM_OBJ_KERN));
> > +
> > +   if (alock) {
> > +           /* Allocate and assign a lock. */
> > +           rw_obj_alloc(&uobj->vmobjlock, "uobjlk");
> > +   } else {
> > +           /* The lock will need to be set via uvm_obj_setlock(). */
> > +           uobj->vmobjlock = NULL;
> > +   }
> >     uobj->pgops = pgops;
> >     RBT_INIT(uvm_objtree, &uobj->memt);
> >     uobj->uo_npages = 0;
> > @@ -73,12 +86,38 @@ void
> >  uvm_obj_destroy(struct uvm_object *uo)
> >  {
> >     KASSERT(RBT_EMPTY(uvm_objtree, &uo->memt));
> > +
> > +   rw_obj_free(uo->vmobjlock);
> > +}
> > +
> > +/*
> > + * uvm_obj_setlock: assign a vmobjlock to the UVM object.
> > + *
> > + * => Caller is responsible for ensuring that the UVM object is not in use.
> > + * => Only a dynamic lock may be previously set; we drop its reference then.
> > + */
> > +void
> > +uvm_obj_setlock(struct uvm_object *uo, struct rwlock *lockptr)
> > +{
> > +   struct rwlock *olockptr = uo->vmobjlock;
> > +
> > +   if (olockptr) {
> > +           /* Drop the reference on the old lock. */
> > +           rw_obj_free(olockptr);
> > +   }
> > +   if (lockptr == NULL) {
> > +           /* If no new lock was passed, allocate a default one. */
> > +           rw_obj_alloc(&lockptr, "uobjlk");
> > +   }
> > +   uo->vmobjlock = lockptr;
> >  }
> >  
> >  #ifndef SMALL_KERNEL
> >  /*
> > - * uvm_obj_wire: wire the pages of entire uobj
> > + * uvm_obj_wire: wire the pages of entire UVM object.
> >   *
> > + * => NOTE: this function should only be used for types of objects
> > + *  where the PG_RELEASED flag is never set (aobj objects)
> >   * => caller must pass page-aligned start and end values
> >   * => if the caller passes in a pageq pointer, we'll return a list of
> >   *  wired pages.
> > @@ -94,6 +133,7 @@ uvm_obj_wire(struct uvm_object *uobj, voff_t start, voff_t end,
> >  
> >     left = (end - start) >> PAGE_SHIFT;
> >  
> > +   rw_enter(uobj->vmobjlock, RW_WRITE);
> >     while (left) {
> >  
> >             npages = MIN(FETCH_PAGECOUNT, left);
> > @@ -107,6 +147,7 @@ uvm_obj_wire(struct uvm_object *uobj, voff_t start, voff_t end,
> >             if (error)
> >                     goto error;
> >  
> > +           rw_enter(uobj->vmobjlock, RW_WRITE);
> >             for (i = 0; i < npages; i++) {
> >  
> >                     KASSERT(pgs[i] != NULL);
> > @@ -134,6 +175,7 @@ uvm_obj_wire(struct uvm_object *uobj, voff_t start, voff_t end,
> >             left -= npages;
> >             offset += (voff_t)npages << PAGE_SHIFT;
> >     }
> > +   rw_exit(uobj->vmobjlock);
> >  
> >     return 0;
> >  
> > @@ -145,17 +187,17 @@ error:
> >  }
> >  
> >  /*
> > - * uobj_unwirepages: unwire the pages of entire uobj
> > + * uvm_obj_unwire: unwire the pages of entire UVM object.
> >   *
> >   * => caller must pass page-aligned start and end values
> >   */
> > -
> >  void
> >  uvm_obj_unwire(struct uvm_object *uobj, voff_t start, voff_t end)
> >  {
> >     struct vm_page *pg;
> >     off_t offset;
> >  
> > +   rw_enter(uobj->vmobjlock, RW_WRITE);
> >     uvm_lock_pageq();
> >     for (offset = start; offset < end; offset += PAGE_SIZE) {
> >             pg = uvm_pagelookup(uobj, offset);
> > @@ -166,6 +208,7 @@ uvm_obj_unwire(struct uvm_object *uobj, voff_t start, voff_t end)
> >             uvm_pageunwire(pg);
> >     }
> >     uvm_unlock_pageq();
> > +   rw_exit(uobj->vmobjlock);
> >  }
> >  #endif /* !SMALL_KERNEL */
> >  
> > diff --git sys/uvm/uvm_object.h sys/uvm/uvm_object.h
> > index 9a74600c9df..5fc32ca3eb8 100644
> > --- sys/uvm/uvm_object.h
> > +++ sys/uvm/uvm_object.h
> > @@ -32,14 +32,25 @@
> >  #define _UVM_UVM_OBJECT_H_
> >  
> >  /*
> > - * uvm_object.h
> > - */
> > -
> > -/*
> > - * uvm_object: all that is left of mach objects.
> > + * The UVM memory object interface.  Notes:
> > + *
> > + * A UVM memory object represents a list of pages, which are managed by
> > + * the object's pager operations (uvm_object::pgops).  All pages belonging
> > + * to an object are owned by it and thus protected by the object lock.
> > + *
> > + * The lock (uvm_object::vmobjlock) may be shared amongst UVM objects.  By
> > + * default, the lock is allocated dynamically from the rw_obj_init() cache.
> > + * Lock sharing is normally used when there is an underlying object.  For
> > + * example, a vnode representing a file may have an underlying node, which
> > + * is the case for tmpfs and layered file systems.  In such a case, the
> > + * vnode's UVM object and the underlying UVM object share the lock.
> > + *
> > + * The reference count is managed atomically for the anonymous UVM objects.
> > + * For other objects, it is arbitrary (may use the lock or atomics).
> >   */
> >  
> >  struct uvm_object {
> > +   struct rwlock                   *vmobjlock;     /* lock on object */
> >     const struct uvm_pagerops       *pgops;         /* pager ops */
> >     RBT_HEAD(uvm_objtree, vm_page)   memt;          /* pages in object */
> >     int                              uo_npages;     /* # of pages in memt */
> > @@ -52,10 +63,10 @@ struct uvm_object {
> >   * memory objects don't have reference counts -- they never die).
> >   *
> >   * this value is used to detected kernel object mappings at uvm_unmap()
> > - * time.  normally when an object is unmapped its pages eventually become
> > - * deactivated and then paged out and/or freed.  this is not useful
> > + * time.   normally when an object is unmapped its pages eventually become
> > + * deactivated and then paged out and/or freed.    this is not useful
> >   * for kernel objects... when a kernel object is unmapped we always want
> > - * to free the resources associated with the mapping.  UVM_OBJ_KERN
> > + * to free the resources associated with the mapping.   UVM_OBJ_KERN
> >   * allows us to decide which type of unmapping we want to do.
> >   *
> >   * in addition, we have kernel objects which may be used in an
> > @@ -100,8 +111,12 @@ RBT_PROTOTYPE(uvm_objtree, vm_page, objt, uvm_pagecmp)
> >  #define UVM_OBJ_IS_BUFCACHE(uobj)                                  \
> >     ((uobj)->pgops == &bufcache_pager)
> >  
> > +#define UVM_OBJ_IS_DUMMY(uobj)                                             \
> > +   (UVM_OBJ_IS_PMAP(uobj) || UVM_OBJ_IS_BUFCACHE(uobj))
> > +
> >  void       uvm_obj_init(struct uvm_object *, const struct uvm_pagerops *, int);
> >  void       uvm_obj_destroy(struct uvm_object *);
> > +void       uvm_obj_setlock(struct uvm_object *, struct rwlock *);
> >  int        uvm_obj_wire(struct uvm_object *, voff_t, voff_t, struct pglist *);
> >  void       uvm_obj_unwire(struct uvm_object *, voff_t, voff_t);
> >  void       uvm_obj_free(struct uvm_object *);
> > diff --git sys/uvm/uvm_page.c sys/uvm/uvm_page.c
> > index a90b23af6df..b0d705994d1 100644
> > --- sys/uvm/uvm_page.c
> > +++ sys/uvm/uvm_page.c
> > @@ -118,6 +118,7 @@ static vaddr_t      virtual_space_end;
> >   */
> >  static void uvm_pageinsert(struct vm_page *);
> >  static void uvm_pageremove(struct vm_page *);
> > +int uvm_page_owner_locked_p(struct vm_page *);
> >  
> >  /*
> >   * inline functions
> > @@ -125,7 +126,7 @@ static void uvm_pageremove(struct vm_page *);
> >  /*
> >   * uvm_pageinsert: insert a page in the object
> >   *
> > - * => caller must lock page queues XXX questionable
> > + * => caller must lock object
> >   * => call should have already set pg's object and offset pointers
> >   *    and bumped the version counter
> >   */
> > @@ -134,7 +135,10 @@ uvm_pageinsert(struct vm_page *pg)
> >  {
> >     struct vm_page  *dupe;
> >  
> > +   KASSERT(UVM_OBJ_IS_DUMMY(pg->uobject) ||
> > +       rw_write_held(pg->uobject->vmobjlock));
> >     KASSERT((pg->pg_flags & PG_TABLED) == 0);
> > +
> >     dupe = RBT_INSERT(uvm_objtree, &pg->uobject->memt, pg);
> >     /* not allowed to insert over another page */
> >     KASSERT(dupe == NULL);
> > @@ -145,12 +149,15 @@ uvm_pageinsert(struct vm_page *pg)
> >  /*
> >   * uvm_page_remove: remove page from object
> >   *
> > - * => caller must lock page queues
> > + * => caller must lock object
> >   */
> >  static inline void
> >  uvm_pageremove(struct vm_page *pg)
> >  {
> > +   KASSERT(UVM_OBJ_IS_DUMMY(pg->uobject) ||
> > +       rw_write_held(pg->uobject->vmobjlock));
> >     KASSERT(pg->pg_flags & PG_TABLED);
> > +
> >     RBT_REMOVE(uvm_objtree, &pg->uobject->memt, pg);
> >  
> >     atomic_clearbits_int(&pg->pg_flags, PG_TABLED);
> > @@ -683,11 +690,19 @@ uvm_pagealloc_pg(struct vm_page *pg, struct uvm_object *obj, voff_t off,
> >  {
> >     int     flags;
> >  
> > +   KASSERT(obj == NULL || anon == NULL);
> > +   KASSERT(anon == NULL || off == 0);
> > +   KASSERT(off == trunc_page(off));
> > +   KASSERT(obj == NULL || UVM_OBJ_IS_DUMMY(obj) ||
> > +       rw_write_held(obj->vmobjlock));
> > +   KASSERT(anon == NULL || anon->an_lock == NULL ||
> > +       rw_write_held(anon->an_lock));
> > +
> >     flags = PG_BUSY | PG_FAKE;
> >     pg->offset = off;
> >     pg->uobject = obj;
> >     pg->uanon = anon;
> > -
> > +   KASSERT(uvm_page_owner_locked_p(pg));
> >     if (anon) {
> >             anon->an_page = pg;
> >             flags |= PQ_ANON;
> > @@ -846,7 +861,9 @@ uvm_pagerealloc_multi(struct uvm_object *obj, voff_t off, vsize_t size,
> >                     uvm_pagecopy(tpg, pg);
> >                     KASSERT(tpg->wire_count == 1);
> >                     tpg->wire_count = 0;
> > +                   uvm_lock_pageq();
> >                     uvm_pagefree(tpg);
> > +                   uvm_unlock_pageq();
> >                     uvm_pagealloc_pg(pg, obj, offset, NULL);
> >             }
> >     }
> > @@ -873,6 +890,10 @@ uvm_pagealloc(struct uvm_object *obj, voff_t off, struct vm_anon *anon,
> >     KASSERT(obj == NULL || anon == NULL);
> >     KASSERT(anon == NULL || off == 0);
> >     KASSERT(off == trunc_page(off));
> > +   KASSERT(obj == NULL || UVM_OBJ_IS_DUMMY(obj) ||
> > +       rw_write_held(obj->vmobjlock));
> > +   KASSERT(anon == NULL || anon->an_lock == NULL ||
> > +       rw_write_held(anon->an_lock));
> >  
> >     pmr_flags = UVM_PLA_NOWAIT;
> >  
> > @@ -940,10 +961,9 @@ uvm_pageclean(struct vm_page *pg)
> >  {
> >     u_int flags_to_clear = 0;
> >  
> > -#if all_pmap_are_fixed
> > -   if (pg->pg_flags & (PG_TABLED|PQ_ACTIVE|PQ_INACTIVE))
> > +   if ((pg->pg_flags & (PG_TABLED|PQ_ACTIVE|PQ_INACTIVE)) &&
> > +       (pg->uobject == NULL || !UVM_OBJ_IS_PMAP(pg->uobject)))
> >             MUTEX_ASSERT_LOCKED(&uvm.pageqlock);
> > -#endif
> >  
> >  #ifdef DEBUG
> >     if (pg->uobject == (void *)0xdeadbeef &&
> > @@ -953,6 +973,10 @@ uvm_pageclean(struct vm_page *pg)
> >  #endif
> >  
> >     KASSERT((pg->pg_flags & PG_DEV) == 0);
> > +   KASSERT(pg->uobject == NULL || UVM_OBJ_IS_DUMMY(pg->uobject) ||
> > +       rw_write_held(pg->uobject->vmobjlock));
> > +   KASSERT(pg->uobject != NULL || pg->uanon == NULL ||
> > +           rw_write_held(pg->uanon->an_lock));
> >  
> >     /*
> >      * if the page was an object page (and thus "TABLED"), remove it
> > @@ -1009,10 +1033,9 @@ uvm_pageclean(struct vm_page *pg)
> >  void
> >  uvm_pagefree(struct vm_page *pg)
> >  {
> > -#if all_pmap_are_fixed
> > -   if (pg->pg_flags & (PG_TABLED|PQ_ACTIVE|PQ_INACTIVE))
> > +   if ((pg->pg_flags & (PG_TABLED|PQ_ACTIVE|PQ_INACTIVE)) &&
> > +       (pg->uobject == NULL || !UVM_OBJ_IS_PMAP(pg->uobject)))
> >             MUTEX_ASSERT_LOCKED(&uvm.pageqlock);
> > -#endif
> >  
> >     uvm_pageclean(pg);
> >     uvm_pmr_freepages(pg, 1);
> > @@ -1037,6 +1060,10 @@ uvm_page_unbusy(struct vm_page **pgs, int npgs)
> >             if (pg == NULL || pg == PGO_DONTCARE) {
> >                     continue;
> >             }
> > +
> > +           KASSERT(uvm_page_owner_locked_p(pg));
> > +           KASSERT(pg->pg_flags & PG_BUSY);
> > +
> >             if (pg->pg_flags & PG_WANTED) {
> >                     wakeup(pg);
> >             }
> > @@ -1207,6 +1234,7 @@ uvm_pagelookup(struct uvm_object *obj, voff_t off)
> >  void
> >  uvm_pagewire(struct vm_page *pg)
> >  {
> > +   KASSERT(uvm_page_owner_locked_p(pg));
> >     MUTEX_ASSERT_LOCKED(&uvm.pageqlock);
> >  
> >     if (pg->wire_count == 0) {
> > @@ -1237,6 +1265,7 @@ uvm_pagewire(struct vm_page *pg)
> >  void
> >  uvm_pageunwire(struct vm_page *pg)
> >  {
> > +   KASSERT(uvm_page_owner_locked_p(pg));
> >     MUTEX_ASSERT_LOCKED(&uvm.pageqlock);
> >  
> >     pg->wire_count--;
> > @@ -1258,6 +1287,7 @@ uvm_pageunwire(struct vm_page *pg)
> >  void
> >  uvm_pagedeactivate(struct vm_page *pg)
> >  {
> > +   KASSERT(uvm_page_owner_locked_p(pg));
> >     MUTEX_ASSERT_LOCKED(&uvm.pageqlock);
> >  
> >     if (pg->pg_flags & PQ_ACTIVE) {
> > @@ -1294,6 +1324,7 @@ uvm_pagedeactivate(struct vm_page *pg)
> >  void
> >  uvm_pageactivate(struct vm_page *pg)
> >  {
> > +   KASSERT(uvm_page_owner_locked_p(pg));
> >     MUTEX_ASSERT_LOCKED(&uvm.pageqlock);
> >  
> >     if (pg->pg_flags & PQ_INACTIVE) {
> > @@ -1341,6 +1372,24 @@ uvm_pagecopy(struct vm_page *src, struct vm_page *dst)
> >     pmap_copy_page(src, dst);
> >  }
> >  
> > +/*
> > + * uvm_page_owner_locked_p: return true if the page's owner (uobject or
> > + * uanon) is locked.  this is a weak check for runtime assertions only.
> > + */
> > +int
> > +uvm_page_owner_locked_p(struct vm_page *pg)
> > +{
> > +   if (pg->uobject != NULL) {
> > +           if (UVM_OBJ_IS_DUMMY(pg->uobject))
> > +                   return 1;
> > +           return rw_write_held(pg->uobject->vmobjlock);
> > +   }
> > +   if (pg->uanon != NULL) {
> > +           return rw_write_held(pg->uanon->an_lock);
> > +   }
> > +   return 1;
> > +}
> > +
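
The uvm_page_owner_locked_p() helper looks like its NetBSD namesake.
A minimal sketch of the calling convention I believe it expects; the
function name here is hypothetical, only the KASSERT pattern and the
flag names come from the diff:

        /* any code mutating page state must hold the owner's lock */
        void
        example_mark_dirty(struct vm_page *pg)  /* hypothetical */
        {
                KASSERT(uvm_page_owner_locked_p(pg));
                atomic_clearbits_int(&pg->pg_flags, PG_CLEAN);
        }

Note the check is deliberately weak: dummy objects (the pmap and
buffer cache pagers) always pass, as does a page with neither an
object nor an anon.
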
> >  /*
> >   * uvm_pagecount: count the number of physical pages in the address range.
> >   */
> > diff --git sys/uvm/uvm_pager.c sys/uvm/uvm_pager.c
> > index 286e7c2a025..4766f21df3a 100644
> > --- sys/uvm/uvm_pager.c
> > +++ sys/uvm/uvm_pager.c
> > @@ -134,6 +134,24 @@ uvm_pseg_get(int flags)
> >     int i;
> >     struct uvm_pseg *pseg;
> >  
> > +   /*
> > +    * XXX Prevent lock ordering issue in uvm_unmap_detach().  A real
> > +    * fix would be to move the KERNEL_LOCK() out of uvm_unmap_detach().
> > +    *
> > +    *  witness_checkorder() at witness_checkorder+0xba0
> > +    *  __mp_lock() at __mp_lock+0x5f
> > +    *  uvm_unmap_detach() at uvm_unmap_detach+0xc5
> > +    *  uvm_map() at uvm_map+0x857
> > +    *  uvm_km_valloc_try() at uvm_km_valloc_try+0x65
> > +    *  uvm_pseg_get() at uvm_pseg_get+0x6f
> > +    *  uvm_pagermapin() at uvm_pagermapin+0x45
> > +    *  uvn_io() at uvn_io+0xcf
> > +    *  uvn_get() at uvn_get+0x156
> > +    *  uvm_fault_lower() at uvm_fault_lower+0x28a
> > +    *  uvm_fault() at uvm_fault+0x1b3
> > +    *  upageflttrap() at upageflttrap+0x62
> > +    */
> > +   KERNEL_LOCK();
> >     mtx_enter(&uvm_pseg_lck);
> >  
> >  pager_seg_restart:
> > @@ -159,6 +177,7 @@ pager_seg_restart:
> >                     if (!UVM_PSEG_INUSE(pseg, i)) {
> >                             pseg->use |= 1 << i;
> >                             mtx_leave(&uvm_pseg_lck);
> > +                           KERNEL_UNLOCK();
> >                             return pseg->start + i * MAXBSIZE;
> >                     }
> >             }
> > @@ -171,6 +190,7 @@ pager_seg_fail:
> >     }
> >  
> >     mtx_leave(&uvm_pseg_lck);
> > +   KERNEL_UNLOCK();
> >     return 0;
> >  }
> >  
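
On the XXX workaround above: as I read it, taking the kernel lock
before uvm_pseg_lck means that when uvm_unmap_detach() later grabs
the kernel lock deeper in this call chain, the acquisition is a
recursion rather than a fresh lock ordered after the pager locks,
which is what witness was flagging.  A condensed sketch of the
ordering this enforces (nothing new, just the shape of the change):

        KERNEL_LOCK();                  /* XXX ordering workaround */
        mtx_enter(&uvm_pseg_lck);
        /* ... find or map a pager segment ... */
        mtx_leave(&uvm_pseg_lck);
        KERNEL_UNLOCK();
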
> > @@ -543,11 +563,15 @@ ReTry:
> >                     /* XXX daddr_t -> int */
> >                     int nswblk = (result == VM_PAGER_AGAIN) ? swblk : 0;
> >                     if (pg->pg_flags & PQ_ANON) {
> > +                           rw_enter(pg->uanon->an_lock, RW_WRITE);
> >                             pg->uanon->an_swslot = nswblk;
> > +                           rw_exit(pg->uanon->an_lock);
> >                     } else {
> > +                           rw_enter(pg->uobject->vmobjlock, RW_WRITE);
> >                             uao_set_swslot(pg->uobject,
> >                                            pg->offset >> PAGE_SHIFT,
> >                                            nswblk);
> > +                           rw_exit(pg->uobject->vmobjlock);
> >                     }
> >             }
> >             if (result == VM_PAGER_AGAIN) {
> > @@ -612,6 +636,8 @@ uvm_pager_dropcluster(struct uvm_object *uobj, struct vm_page *pg,
> >  {
> >     int lcv;
> >  
> > +   KASSERT(uobj == NULL || rw_write_held(uobj->vmobjlock));
> > +
> >     /* drop all pages but "pg" */
> >     for (lcv = 0 ; lcv < *npages ; lcv++) {
> >             /* skip "pg" or empty slot */
> > @@ -625,10 +651,13 @@ uvm_pager_dropcluster(struct uvm_object *uobj, struct vm_page *pg,
> >              */
> >             if (!uobj) {
> >                     if (ppsp[lcv]->pg_flags & PQ_ANON) {
> > +                           rw_enter(ppsp[lcv]->uanon->an_lock, RW_WRITE);
> >                             if (flags & PGO_REALLOCSWAP)
> >                                       /* zap swap block */
> >                                       ppsp[lcv]->uanon->an_swslot = 0;
> >                     } else {
> > +                           rw_enter(ppsp[lcv]->uobject->vmobjlock,
> > +                               RW_WRITE);
> >                             if (flags & PGO_REALLOCSWAP)
> >                                     uao_set_swslot(ppsp[lcv]->uobject,
> >                                         ppsp[lcv]->offset >> PAGE_SHIFT, 0);
> > @@ -649,7 +678,6 @@ uvm_pager_dropcluster(struct uvm_object *uobj, struct vm_page *pg,
> >                             UVM_PAGE_OWN(ppsp[lcv], NULL);
> >  
> >                             /* kills anon and frees pg */
> > -                           rw_enter(ppsp[lcv]->uanon->an_lock, RW_WRITE);
> >                             uvm_anon_release(ppsp[lcv]->uanon);
> >  
> >                             continue;
> > @@ -672,6 +700,14 @@ uvm_pager_dropcluster(struct uvm_object *uobj, struct vm_page *pg,
> >                     pmap_clear_modify(ppsp[lcv]);
> >                     atomic_setbits_int(&ppsp[lcv]->pg_flags, PG_CLEAN);
> >             }
> > +
> > +           /* if anonymous cluster, unlock the owner (anon or object) */
> > +           if (!uobj) {
> > +                   if (ppsp[lcv]->pg_flags & PQ_ANON)
> > +                           rw_exit(ppsp[lcv]->uanon->an_lock);
> > +                   else
> > +                           rw_exit(ppsp[lcv]->uobject->vmobjlock);
> > +           }
> >     }
> >  }
> >  
> > @@ -736,6 +772,7 @@ uvm_aio_aiodone(struct buf *bp)
> >                     swap = (pg->pg_flags & PQ_SWAPBACKED) != 0;
> >                     if (!swap) {
> >                             uobj = pg->uobject;
> > +                           rw_enter(uobj->vmobjlock, RW_WRITE);
> >                     }
> >             }
> >             KASSERT(swap || pg->uobject == uobj);
> > @@ -763,6 +800,9 @@ uvm_aio_aiodone(struct buf *bp)
> >             }
> >     }
> >     uvm_page_unbusy(pgs, npages);
> > +   if (!swap) {
> > +           rw_exit(uobj->vmobjlock);
> > +   }
> >  
> >  #ifdef UVM_SWAP_ENCRYPT
> >  freed:
> > diff --git sys/uvm/uvm_pdaemon.c sys/uvm/uvm_pdaemon.c
> > index e0ab150cddc..1ac4b29d256 100644
> > --- sys/uvm/uvm_pdaemon.c
> > +++ sys/uvm/uvm_pdaemon.c
> > @@ -440,19 +440,6 @@ uvmpd_scan_inactive(struct pglist *pglst)
> >                     uvmexp.pdscans++;
> >                     nextpg = TAILQ_NEXT(p, pageq);
> >  
> > -                   /*
> > -                    * move referenced pages back to active queue and
> > -                    * skip to next page (unlikely to happen since
> > -                    * inactive pages shouldn't have any valid mappings
> > -                    * and we cleared reference before deactivating).
> > -                    */
> > -
> > -                   if (pmap_is_referenced(p)) {
> > -                           uvm_pageactivate(p);
> > -                           uvmexp.pdreact++;
> > -                           continue;
> > -                   }
> > -
> >                     if (p->pg_flags & PQ_ANON) {
> >                             anon = p->uanon;
> >                             KASSERT(anon != NULL);
> > @@ -461,6 +448,16 @@ uvmpd_scan_inactive(struct pglist *pglst)
> >                                     /* lock failed, skip this page */
> >                                     continue;
> >                             }
> > +                           /*
> > +                            * move referenced pages back to active queue
> > +                            * and skip to next page.
> > +                            */
> > +                           if (pmap_is_referenced(p)) {
> > +                                   uvm_pageactivate(p);
> > +                                   rw_exit(anon->an_lock);
> > +                                   uvmexp.pdreact++;
> > +                                   continue;
> > +                           }
> >                             if (p->pg_flags & PG_BUSY) {
> >                                     rw_exit(anon->an_lock);
> >                                     uvmexp.pdbusy++;
> > @@ -471,7 +468,23 @@ uvmpd_scan_inactive(struct pglist *pglst)
> >                     } else {
> >                             uobj = p->uobject;
> >                             KASSERT(uobj != NULL);
> > +                           if (rw_enter(uobj->vmobjlock,
> > +                               RW_WRITE|RW_NOSLEEP)) {
> > +                                   /* lock failed, skip this page */
> > +                                   continue;
> > +                           }
> > +                           /*
> > +                            * move referenced pages back to active queue
> > +                            * and skip to next page.
> > +                            */
> > +                           if (pmap_is_referenced(p)) {
> > +                                   uvm_pageactivate(p);
> > +                                   rw_exit(uobj->vmobjlock);
> > +                                   uvmexp.pdreact++;
> > +                                   continue;
> > +                           }
> >                             if (p->pg_flags & PG_BUSY) {
> > +                                   rw_exit(uobj->vmobjlock);
> >                                     uvmexp.pdbusy++;
> >                                     /* someone else owns page, skip it */
> >                                     continue;
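
The pmap_is_referenced() reactivation check moved below the trylock
and is now duplicated for both owner types, which makes sense given
that uvm_pageactivate() now asserts the owner's lock.  Both branches
reduce to the same shape ("lock" stands for anon->an_lock or
uobj->vmobjlock; everything else is from the diff):

        if (rw_enter(lock, RW_WRITE|RW_NOSLEEP))
                continue;               /* lock failed, skip this page */
        if (pmap_is_referenced(p)) {
                uvm_pageactivate(p);
                rw_exit(lock);
                uvmexp.pdreact++;
                continue;
        }
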
> > @@ -507,6 +520,8 @@ uvmpd_scan_inactive(struct pglist *pglst)
> >                                     /* remove from object */
> >                                     anon->an_page = NULL;
> >                                     rw_exit(anon->an_lock);
> > +                           } else {
> > +                                   rw_exit(uobj->vmobjlock);
> >                             }
> >                             continue;
> >                     }
> > @@ -518,6 +533,8 @@ uvmpd_scan_inactive(struct pglist *pglst)
> >                     if (free + uvmexp.paging > uvmexp.freetarg << 2) {
> >                             if (anon) {
> >                                     rw_exit(anon->an_lock);
> > +                           } else {
> > +                                   rw_exit(uobj->vmobjlock);
> >                             }
> >                             continue;
> >                     }
> > @@ -533,6 +550,8 @@ uvmpd_scan_inactive(struct pglist *pglst)
> >                             uvm_pageactivate(p);
> >                             if (anon) {
> >                                     rw_exit(anon->an_lock);
> > +                           } else {
> > +                                   rw_exit(uobj->vmobjlock);
> >                             }
> >                             continue;
> >                     }
> > @@ -602,6 +621,9 @@ uvmpd_scan_inactive(struct pglist *pglst)
> >                                             UVM_PAGE_OWN(p, NULL);
> >                                             if (anon)
> >                                                     rw_exit(anon->an_lock);
> > +                                           else
> > +                                                   rw_exit(
> > +                                                       uobj->vmobjlock);
> >                                             continue;
> >                                     }
> >                                     swcpages = 0;   /* cluster is empty */
> > @@ -635,6 +657,8 @@ uvmpd_scan_inactive(struct pglist *pglst)
> >                     if (p) {        /* if we just added a page to cluster */
> >                             if (anon)
> >                                     rw_exit(anon->an_lock);
> > +                           else
> > +                                   rw_exit(uobj->vmobjlock);
> >  
> >                             /* cluster not full yet? */
> >                             if (swcpages < swnpages)
> > @@ -748,6 +772,8 @@ uvmpd_scan_inactive(struct pglist *pglst)
> >                     if (swap_backed) {
> >                             if (anon)
> >                                     rw_enter(anon->an_lock, RW_WRITE);
> > +                           else
> > +                                   rw_enter(uobj->vmobjlock, RW_WRITE);
> >                     }
> >  
> >  #ifdef DIAGNOSTIC
> > @@ -810,6 +836,8 @@ uvmpd_scan_inactive(struct pglist *pglst)
> >                      */
> >                     if (anon)
> >                             rw_exit(anon->an_lock);
> > +                   else if (uobj)
> > +                           rw_exit(uobj->vmobjlock);
> >  
> >                     if (nextpg && (nextpg->pg_flags & PQ_INACTIVE) == 0) {
> >                             nextpg = TAILQ_FIRST(pglst);    /* reload! */
> > @@ -920,8 +948,12 @@ uvmpd_scan(void)
> >                     KASSERT(p->uanon != NULL);
> >                     if (rw_enter(p->uanon->an_lock, RW_WRITE|RW_NOSLEEP))
> >                             continue;
> > -           } else
> > +           } else {
> >                     KASSERT(p->uobject != NULL);
> > +                   if (rw_enter(p->uobject->vmobjlock,
> > +                       RW_WRITE|RW_NOSLEEP))
> > +                           continue;
> > +           }
> >  
> >             /*
> >              * if there's a shortage of swap, free any swap allocated
> > @@ -959,6 +991,8 @@ uvmpd_scan(void)
> >             }
> >             if (p->pg_flags & PQ_ANON)
> >                     rw_exit(p->uanon->an_lock);
> > +           else
> > +                   rw_exit(p->uobject->vmobjlock);
> >     }
> >  }
> >  
> > @@ -982,6 +1016,10 @@ uvmpd_drop(struct pglist *pglst)
> >                     continue;
> >  
> >             if (p->pg_flags & PG_CLEAN) {
> > +                   struct uvm_object *uobj = p->uobject;
> > +
> > +                   rw_enter(uobj->vmobjlock, RW_WRITE);
> > +                   uvm_lock_pageq();
> >                     /*
> >                      * we now have the page queues locked.
> >                      * the page is not busy.   if the page is clean we
> > @@ -997,6 +1035,8 @@ uvmpd_drop(struct pglist *pglst)
> >                             pmap_page_protect(p, PROT_NONE);
> >                             uvm_pagefree(p);
> >                     }
> > +                   uvm_unlock_pageq();
> > +                   rw_exit(uobj->vmobjlock);
> >             }
> >     }
> >  }
> > @@ -1004,13 +1044,9 @@ uvmpd_drop(struct pglist *pglst)
> >  void
> >  uvmpd_hibernate(void)
> >  {
> > -   uvm_lock_pageq();
> > -
> >     uvmpd_drop(&uvm.page_inactive_swp);
> >     uvmpd_drop(&uvm.page_inactive_obj);
> >     uvmpd_drop(&uvm.page_active);
> > -
> > -   uvm_unlock_pageq();
> >  }
> >  
> >  #endif
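
Worth noting: uvmpd_drop()/uvmpd_hibernate() no longer hold the page
queue lock around the whole scan; it is now taken per page, after the
object lock.  So the order on this path becomes object lock first,
then pageq (a sketch assembled from the hunks above, not new code):

        rw_enter(uobj->vmobjlock, RW_WRITE);
        uvm_lock_pageq();
        pmap_page_protect(p, PROT_NONE);
        uvm_pagefree(p);
        uvm_unlock_pageq();
        rw_exit(uobj->vmobjlock);

That matches uvm_pagerealloc_multi() above, where uvm_pagefree() is
now wrapped in uvm_lock_pageq()/uvm_unlock_pageq() as well.
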
> > diff --git sys/uvm/uvm_vnode.c sys/uvm/uvm_vnode.c
> > index 3cbdd5222b6..af69e8352ed 100644
> > --- sys/uvm/uvm_vnode.c
> > +++ sys/uvm/uvm_vnode.c
> > @@ -280,8 +280,9 @@ uvn_reference(struct uvm_object *uobj)
> >             panic("uvn_reference: invalid state");
> >     }
> >  #endif
> > -   KERNEL_ASSERT_LOCKED();
> > +   rw_enter(uobj->vmobjlock, RW_WRITE);
> >     uobj->uo_refs++;
> > +   rw_exit(uobj->vmobjlock);
> >  }
> >  
> >  /*
> > @@ -300,9 +301,10 @@ uvn_detach(struct uvm_object *uobj)
> >     struct vnode *vp;
> >     int oldflags;
> >  
> > -   KERNEL_ASSERT_LOCKED();
> > +   rw_enter(uobj->vmobjlock, RW_WRITE);
> >     uobj->uo_refs--;                        /* drop ref! */
> >     if (uobj->uo_refs) {                    /* still more refs */
> > +           rw_exit(uobj->vmobjlock);
> >             return;
> >     }
> >  
> > @@ -323,8 +325,7 @@ uvn_detach(struct uvm_object *uobj)
> >     if (uvn->u_flags & UVM_VNODE_CANPERSIST) {
> >             /* won't block */
> >             uvn_flush(uobj, 0, 0, PGO_DEACTIVATE|PGO_ALLPAGES);
> > -           vrele(vp);                      /* drop vnode reference */
> > -           return;
> > +           goto out;
> >     }
> >  
> >     /* its a goner! */
> > @@ -353,7 +354,8 @@ uvn_detach(struct uvm_object *uobj)
> >     /* wait on any outstanding io */
> >     while (uobj->uo_npages && uvn->u_flags & UVM_VNODE_RELKILL) {
> >             uvn->u_flags |= UVM_VNODE_IOSYNC;
> > -           tsleep_nsec(&uvn->u_nio, PVM, "uvn_term", INFSLP);
> > +           rwsleep_nsec(&uvn->u_nio, uobj->vmobjlock, PVM, "uvn_term",
> > +               INFSLP);
> >     }
> >  
> >     if ((uvn->u_flags & UVM_VNODE_RELKILL) == 0)
> > @@ -373,6 +375,8 @@ uvn_detach(struct uvm_object *uobj)
> >     /* wake up any sleepers */
> >     if (oldflags & UVM_VNODE_WANTED)
> >             wakeup(uvn);
> > +out:
> > +   rw_exit(uobj->vmobjlock);
> >  
> >     /* drop our reference to the vnode. */
> >     vrele(vp);
> > @@ -409,10 +413,13 @@ void
> >  uvm_vnp_terminate(struct vnode *vp)
> >  {
> >     struct uvm_vnode *uvn = vp->v_uvm;
> > +   struct uvm_object *uobj = &uvn->u_obj;
> >     int oldflags;
> >  
> >     /* check if it is valid */
> > +   rw_enter(uobj->vmobjlock, RW_WRITE);
> >     if ((uvn->u_flags & UVM_VNODE_VALID) == 0) {
> > +           rw_exit(uobj->vmobjlock);
> >             return;
> >     }
> >  
> > @@ -479,7 +486,8 @@ uvm_vnp_terminate(struct vnode *vp)
> >              */
> >  #endif
> >             uvn->u_flags |= UVM_VNODE_IOSYNC;
> > -           tsleep_nsec(&uvn->u_nio, PVM, "uvn_term", INFSLP);
> > +           rwsleep_nsec(&uvn->u_nio, uobj->vmobjlock, PVM, "uvn_term",
> > +               INFSLP);
> >     }
> >  
> >     /*
> > @@ -512,6 +520,8 @@ uvm_vnp_terminate(struct vnode *vp)
> >  
> >     if (oldflags & UVM_VNODE_WANTED)
> >             wakeup(uvn);
> > +
> > +   rw_exit(uobj->vmobjlock);
> >  }
> >  
> >  /*
> > @@ -589,7 +599,7 @@ uvn_flush(struct uvm_object *uobj, voff_t start, voff_t stop, int flags)
> >     boolean_t retval, need_iosync, needs_clean;
> >     voff_t curoff;
> >  
> > -   KERNEL_ASSERT_LOCKED();
> > +   KASSERT(rw_write_held(uobj->vmobjlock));
> >     TAILQ_INIT(&dead);
> >  
> >     /* get init vals and determine how we are going to traverse object */
> > @@ -673,8 +683,8 @@ uvn_flush(struct uvm_object *uobj, voff_t start, voff_t stop, int flags)
> >                                     atomic_setbits_int(&pp->pg_flags,
> >                                         PG_WANTED);
> >                                     uvm_unlock_pageq();
> > -                                   tsleep_nsec(pp, PVM, "uvn_flsh",
> > -                                       INFSLP);
> > +                                   rwsleep_nsec(pp, uobj->vmobjlock, PVM,
> > +                                       "uvn_flsh", INFSLP);
> >                                     uvm_lock_pageq();
> >                                     curoff -= PAGE_SIZE;
> >                                     continue;
> > @@ -824,7 +834,8 @@ ReTry:
> >     if (need_iosync) {
> >             while (uvn->u_nio != 0) {
> >                     uvn->u_flags |= UVM_VNODE_IOSYNC;
> > -                   tsleep_nsec(&uvn->u_nio, PVM, "uvn_flush", INFSLP);
> > +                   rwsleep_nsec(&uvn->u_nio, uobj->vmobjlock, PVM,
> > +                       "uvn_flush", INFSLP);
> >             }
> >             if (uvn->u_flags & UVM_VNODE_IOSYNCWANTED)
> >                     wakeup(&uvn->u_flags);
> > @@ -878,7 +889,7 @@ uvn_put(struct uvm_object *uobj, struct vm_page **pps, int npages, int flags)
> >  {
> >     int retval;
> >  
> > -   KERNEL_ASSERT_LOCKED();
> > +   KASSERT(rw_write_held(uobj->vmobjlock));
> >  
> >     retval = uvn_io((struct uvm_vnode*)uobj, pps, npages, flags, UIO_WRITE);
> >  
> > @@ -903,7 +914,8 @@ uvn_get(struct uvm_object *uobj, voff_t offset, struct vm_page **pps,
> >     int lcv, result, gotpages;
> >     boolean_t done;
> >  
> > -   KERNEL_ASSERT_LOCKED();
> > +   KASSERT(((flags & PGO_LOCKED) != 0 && rw_lock_held(uobj->vmobjlock)) ||
> > +       (flags & PGO_LOCKED) == 0);
> >  
> >     /* step 1: handled the case where fault data structures are locked. */
> >     if (flags & PGO_LOCKED) {
> > @@ -1033,7 +1045,8 @@ uvn_get(struct uvm_object *uobj, voff_t offset, struct vm_page **pps,
> >                     /* page is there, see if we need to wait on it */
> >                     if ((ptmp->pg_flags & PG_BUSY) != 0) {
> >                             atomic_setbits_int(&ptmp->pg_flags, PG_WANTED);
> > -                           tsleep_nsec(ptmp, PVM, "uvn_get", INFSLP);
> > +                           rwsleep_nsec(ptmp, uobj->vmobjlock, PVM,
> > +                               "uvn_get", INFSLP);
> >                             continue;       /* goto top of pps while loop */
> >                     }
> >  
> > @@ -1077,6 +1090,7 @@ uvn_get(struct uvm_object *uobj, voff_t offset, struct vm_page **pps,
> >                     uvm_lock_pageq();
> >                     uvm_pagefree(ptmp);
> >                     uvm_unlock_pageq();
> > +                   rw_exit(uobj->vmobjlock);
> >                     return result;
> >             }
> >  
> > @@ -1098,6 +1112,8 @@ uvn_get(struct uvm_object *uobj, voff_t offset, struct vm_page **pps,
> >  
> >     }
> >  
> > +
> > +   rw_exit(uobj->vmobjlock);
> >     return (VM_PAGER_OK);
> >  }
> >  
> > @@ -1113,6 +1129,7 @@ uvn_get(struct uvm_object *uobj, voff_t offset, struct vm_page **pps,
> >  int
> >  uvn_io(struct uvm_vnode *uvn, vm_page_t *pps, int npages, int flags, int rw)
> >  {
> > +   struct uvm_object *uobj = &uvn->u_obj;
> >     struct vnode *vn;
> >     struct uio uio;
> >     struct iovec iov;
> > @@ -1123,6 +1140,8 @@ uvn_io(struct uvm_vnode *uvn, vm_page_t *pps, int npages, int flags, int rw)
> >     int netunlocked = 0;
> >     int lkflags = (flags & PGO_NOWAIT) ? LK_NOWAIT : 0;
> >  
> > +   KASSERT(rw_write_held(uobj->vmobjlock));
> > +
> >     /* init values */
> >     waitf = (flags & PGO_SYNCIO) ? M_WAITOK : M_NOWAIT;
> >     vn = uvn->u_vnode;
> > @@ -1134,7 +1153,8 @@ uvn_io(struct uvm_vnode *uvn, vm_page_t *pps, int npages, int flags, int rw)
> >                     return VM_PAGER_AGAIN;
> >             }
> >             uvn->u_flags |= UVM_VNODE_IOSYNCWANTED;
> > -           tsleep_nsec(&uvn->u_flags, PVM, "uvn_iosync", INFSLP);
> > +           rwsleep_nsec(&uvn->u_flags, uobj->vmobjlock, PVM, "uvn_iosync",
> > +               INFSLP);
> >     }
> >  
> >     /* check size */
> > @@ -1157,6 +1177,7 @@ uvn_io(struct uvm_vnode *uvn, vm_page_t *pps, int npages, int flags, int rw)
> >      * (this time with sleep ok).
> >      */
> >     uvn->u_nio++;                   /* we have an I/O in progress! */
> > +   rw_exit(uobj->vmobjlock);
> >     if (kva == 0)
> >             kva = uvm_pagermapin(pps, npages,
> >                 mapinflags | UVMPAGER_MAPIN_WAITOK);
> > @@ -1200,6 +1221,7 @@ uvn_io(struct uvm_vnode *uvn, vm_page_t *pps, int npages, int flags, int rw)
> >      * Ideally, this kind of operation *should* work.
> >      */
> >     result = 0;
> > +   KERNEL_LOCK();
> >     if ((uvn->u_flags & UVM_VNODE_VNISLOCKED) == 0)
> >             result = vn_lock(vn, LK_EXCLUSIVE | LK_RECURSEFAIL | lkflags);
> >     if (result == 0) {
> > @@ -1215,6 +1237,7 @@ uvn_io(struct uvm_vnode *uvn, vm_page_t *pps, int npages, int flags, int rw)
> >                     VOP_UNLOCK(vn);
> >  
> >     }
> > +   KERNEL_UNLOCK();
> >  
> >     if (netunlocked)
> >             NET_LOCK();
> > @@ -1241,6 +1264,7 @@ uvn_io(struct uvm_vnode *uvn, vm_page_t *pps, int npages, int flags, int rw)
> >     uvm_pagermapout(kva, npages);
> >  
> >     /* now clean up the object (i.e. drop I/O count) */
> > +   rw_enter(uobj->vmobjlock, RW_WRITE);
> >     uvn->u_nio--;                   /* I/O DONE! */
> >     if ((uvn->u_flags & UVM_VNODE_IOSYNC) != 0 && uvn->u_nio == 0) {
> >             wakeup(&uvn->u_nio);
> > @@ -1252,8 +1276,12 @@ uvn_io(struct uvm_vnode *uvn, vm_page_t *pps, int npages, int flags, int rw)
> >             KASSERT(flags & PGO_NOWAIT);
> >             return VM_PAGER_AGAIN;
> >     } else {
> > -           while (rebooting)
> > -                   tsleep_nsec(&rebooting, PVM, "uvndead", INFSLP);
> > +           if (rebooting) {
> > +                   KERNEL_LOCK();
> > +                   while (rebooting)
> > +                           tsleep_nsec(&rebooting, PVM, "uvndead", INFSLP);
> > +                   KERNEL_UNLOCK();
> > +           }
> >             return VM_PAGER_ERROR;
> >     }
> >  }
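
One pattern worth calling out: every tsleep_nsec() in uvm_vnode.c
that waits on uvn state becomes rwsleep_nsec() on uobj->vmobjlock,
so the object lock is atomically released for the sleep and
re-acquired on wakeup.  A condensed sketch of the loop shape, lifted
from the uvn_detach() hunk above:

        rw_enter(uobj->vmobjlock, RW_WRITE);
        while (uobj->uo_npages && uvn->u_flags & UVM_VNODE_RELKILL) {
                uvn->u_flags |= UVM_VNODE_IOSYNC;
                /* sleeps with vmobjlock released, relocked on return */
                rwsleep_nsec(&uvn->u_nio, uobj->vmobjlock, PVM,
                    "uvn_term", INFSLP);
        }
        rw_exit(uobj->vmobjlock);

The matching wakeup(&uvn->u_nio) in uvn_io() now runs with the
vmobjlock held once u_nio drops to zero.
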
> > @@ -1300,11 +1328,14 @@ int
> >  uvm_vnp_uncache(struct vnode *vp)
> >  {
> >     struct uvm_vnode *uvn = vp->v_uvm;
> > +   struct uvm_object *uobj = &uvn->u_obj;
> >  
> >     /* lock uvn part of the vnode and check if we need to do anything */
> >  
> > +   rw_enter(uobj->vmobjlock, RW_WRITE);
> >     if ((uvn->u_flags & UVM_VNODE_VALID) == 0 ||
> >                     (uvn->u_flags & UVM_VNODE_BLOCKED) != 0) {
> > +           rw_exit(uobj->vmobjlock);
> >             return TRUE;
> >     }
> >  
> > @@ -1314,6 +1345,7 @@ uvm_vnp_uncache(struct vnode *vp)
> >      */
> >     uvn->u_flags &= ~UVM_VNODE_CANPERSIST;
> >     if (uvn->u_obj.uo_refs) {
> > +           rw_exit(uobj->vmobjlock);
> >             return FALSE;
> >     }
> >  
> > @@ -1323,6 +1355,7 @@ uvm_vnp_uncache(struct vnode *vp)
> >      */
> >     vref(vp);                       /* seems ok, even with VOP_LOCK */
> >     uvn->u_obj.uo_refs++;           /* value is now 1 */
> > +   rw_exit(uobj->vmobjlock);
> >  
> >  #ifdef VFSLCKDEBUG
> >     /*
> > @@ -1374,6 +1407,11 @@ void
> >  uvm_vnp_setsize(struct vnode *vp, off_t newsize)
> >  {
> >     struct uvm_vnode *uvn = vp->v_uvm;
> > +   struct uvm_object *uobj = &uvn->u_obj;
> > +
> > +   KERNEL_ASSERT_LOCKED();
> > +
> > +   rw_enter(uobj->vmobjlock, RW_WRITE);
> >  
> >     /* lock uvn and check for valid object, and if valid: do it! */
> >     if (uvn->u_flags & UVM_VNODE_VALID) {
> > @@ -1389,6 +1427,7 @@ uvm_vnp_setsize(struct vnode *vp, off_t newsize)
> >             }
> >             uvn->u_size = newsize;
> >     }
> > +   rw_exit(uobj->vmobjlock);
> >  }
> >  
> >  /*
> > @@ -1447,6 +1486,7 @@ uvm_vnp_sync(struct mount *mp)
> >  
> >     /* step 3: we now have a list of uvn's that may need cleaning. */
> >     SIMPLEQ_FOREACH(uvn, &uvn_sync_q, u_syncq) {
> > +           rw_enter(uvn->u_obj.vmobjlock, RW_WRITE);
> >  #ifdef DEBUG
> >             if (uvn->u_flags & UVM_VNODE_DYING) {
> >                     printf("uvm_vnp_sync: dying vnode on sync list\n");
> > @@ -1465,6 +1505,7 @@ uvm_vnp_sync(struct mount *mp)
> >                     LIST_REMOVE(uvn, u_wlist);
> >                     uvn->u_flags &= ~UVM_VNODE_WRITEABLE;
> >             }
> > +           rw_exit(uvn->u_obj.vmobjlock);
> >  
> >             /* now drop our reference to the uvn */
> >             uvn_detach(&uvn->u_obj);
> > 
