Re: Incorrect IPL when pool_get(9) is called under rwlock
On Wed, Sep 01, 2021 at 08:53:35AM +0200, Martin Pieuchot wrote: > syzkaller reported [0] the following lock ordering issue: > > db{0}> trace > db_enter() at db_enter+0x18 sys/arch/amd64/amd64/db_interface.c:440 > panic(82464b8f) at panic+0x177 sys/kern/subr_prf.c:202 > witness_checkorder(82838c20,9,0) at witness_checkorder+0x11eb > sys/kern/subr_witness.c:833 > __mp_lock(82838a18) at __mp_lock+0xa1 read_rflags > machine/cpufunc.h:195 [inline] > __mp_lock(82838a18) at __mp_lock+0xa1 intr_disable > machine/cpufunc.h:216 [inline] > __mp_lock(82838a18) at __mp_lock+0xa1 sys/kern/kern_lock.c:142 > intr_handler(80002123ad80,80255d80) at intr_handler+0x5e > sys/arch/amd64/amd64/intr.c:532 > Xintr_ioapic_edge20_untramp() at Xintr_ioapic_edge20_untramp+0x18f > Xspllower() at Xspllower+0x19 > mtx_enter_try(829b8d10) at mtx_enter_try+0x100 > mtx_enter(829b8d10) at mtx_enter+0x4b sys/kern/kern_lock.c:266 > pool_get(829b8d10,9) at pool_get+0xbf sys/kern/subr_pool.c:581 > vm_create(80b29000,8000211922a8) at vm_create+0x261 > sys/arch/amd64/amd64/vmm.c:1526 > vmmioctl(a00,c5005601,80b29000,1,8000211922a8) at vmmioctl+0x1f2 > VOP_IOCTL(fd806e213830,c5005601,80b29000,1,fd807f7d8840,8000211922a8) > at VOP_IOCTL+0x9a sys/kern/vfs_vops.c:295 > vn_ioctl(fd806e4aca28,c5005601,80b29000,8000211922a8) at > vn_ioctl+0xba sys/kern/vfs_vnops.c:531 > sys_ioctl(8000211922a8,80002123b398,80002123b3e0) at > sys_ioctl+0x4a2 > > > The issue is that pool_get(9) at line 1526 is done after grabbing the > `vm_lock'. If an interrupt needing the KERNEL_LOCK() occurs at that > moment the above mentionned lock ordering problem could cause a > deadlock. > > To prevent such issue we generally mark the pool with IPL_MPFLOOR. > > [0] > https://syzkaller.appspot.com/bug?id=c73756cc996a58a625da35fbaa90ba6b9e0c60dc > ok mlarkin@ > Index: arch/amd64/amd64/vmm.c > === > RCS file: /cvs/src/sys/arch/amd64/amd64/vmm.c,v > retrieving revision 1.287 > diff -u -p -r1.287 vmm.c > --- arch/amd64/amd64/vmm.c31 Aug 2021 17:40:59 - 1.287 > +++ arch/amd64/amd64/vmm.c1 Sep 2021 06:45:38 - > @@ -430,7 +430,7 @@ vmm_attach(struct device *parent, struct > > pool_init(_pool, sizeof(struct vm), 0, IPL_NONE, PR_WAITOK, > "vmpool", NULL); > - pool_init(_pool, sizeof(struct vcpu), 64, IPL_NONE, PR_WAITOK, > + pool_init(_pool, sizeof(struct vcpu), 64, IPL_MPFLOOR, PR_WAITOK, > "vcpupl", NULL); > > vmm_softc = sc;
Re: Incorrect IPL when pool_get(9) is called under rwlock
On Wed, Sep 01, 2021 at 08:53:35AM +0200, Martin Pieuchot wrote: > syzkaller reported [0] the following lock ordering issue: > > db{0}> trace > db_enter() at db_enter+0x18 sys/arch/amd64/amd64/db_interface.c:440 > panic(82464b8f) at panic+0x177 sys/kern/subr_prf.c:202 > witness_checkorder(82838c20,9,0) at witness_checkorder+0x11eb > sys/kern/subr_witness.c:833 > __mp_lock(82838a18) at __mp_lock+0xa1 read_rflags > machine/cpufunc.h:195 [inline] > __mp_lock(82838a18) at __mp_lock+0xa1 intr_disable > machine/cpufunc.h:216 [inline] > __mp_lock(82838a18) at __mp_lock+0xa1 sys/kern/kern_lock.c:142 > intr_handler(80002123ad80,80255d80) at intr_handler+0x5e > sys/arch/amd64/amd64/intr.c:532 > Xintr_ioapic_edge20_untramp() at Xintr_ioapic_edge20_untramp+0x18f > Xspllower() at Xspllower+0x19 > mtx_enter_try(829b8d10) at mtx_enter_try+0x100 > mtx_enter(829b8d10) at mtx_enter+0x4b sys/kern/kern_lock.c:266 > pool_get(829b8d10,9) at pool_get+0xbf sys/kern/subr_pool.c:581 > vm_create(80b29000,8000211922a8) at vm_create+0x261 > sys/arch/amd64/amd64/vmm.c:1526 > vmmioctl(a00,c5005601,80b29000,1,8000211922a8) at vmmioctl+0x1f2 > VOP_IOCTL(fd806e213830,c5005601,80b29000,1,fd807f7d8840,8000211922a8) > at VOP_IOCTL+0x9a sys/kern/vfs_vops.c:295 > vn_ioctl(fd806e4aca28,c5005601,80b29000,8000211922a8) at > vn_ioctl+0xba sys/kern/vfs_vnops.c:531 > sys_ioctl(8000211922a8,80002123b398,80002123b3e0) at > sys_ioctl+0x4a2 > > > The issue is that pool_get(9) at line 1526 is done after grabbing the > `vm_lock'. If an interrupt needing the KERNEL_LOCK() occurs at that > moment the above mentionned lock ordering problem could cause a > deadlock. > > To prevent such issue we generally mark the pool with IPL_MPFLOOR. > > [0] > https://syzkaller.appspot.com/bug?id=c73756cc996a58a625da35fbaa90ba6b9e0c60dc > Thanks, will take a look. This was introduced yesterday with the new vcpu locking diff. -ml > Index: arch/amd64/amd64/vmm.c > === > RCS file: /cvs/src/sys/arch/amd64/amd64/vmm.c,v > retrieving revision 1.287 > diff -u -p -r1.287 vmm.c > --- arch/amd64/amd64/vmm.c31 Aug 2021 17:40:59 - 1.287 > +++ arch/amd64/amd64/vmm.c1 Sep 2021 06:45:38 - > @@ -430,7 +430,7 @@ vmm_attach(struct device *parent, struct > > pool_init(_pool, sizeof(struct vm), 0, IPL_NONE, PR_WAITOK, > "vmpool", NULL); > - pool_init(_pool, sizeof(struct vcpu), 64, IPL_NONE, PR_WAITOK, > + pool_init(_pool, sizeof(struct vcpu), 64, IPL_MPFLOOR, PR_WAITOK, > "vcpupl", NULL); > > vmm_softc = sc; >
Incorrect IPL when pool_get(9) is called under rwlock
syzkaller reported [0] the following lock ordering issue: db{0}> trace db_enter() at db_enter+0x18 sys/arch/amd64/amd64/db_interface.c:440 panic(82464b8f) at panic+0x177 sys/kern/subr_prf.c:202 witness_checkorder(82838c20,9,0) at witness_checkorder+0x11eb sys/kern/subr_witness.c:833 __mp_lock(82838a18) at __mp_lock+0xa1 read_rflags machine/cpufunc.h:195 [inline] __mp_lock(82838a18) at __mp_lock+0xa1 intr_disable machine/cpufunc.h:216 [inline] __mp_lock(82838a18) at __mp_lock+0xa1 sys/kern/kern_lock.c:142 intr_handler(80002123ad80,80255d80) at intr_handler+0x5e sys/arch/amd64/amd64/intr.c:532 Xintr_ioapic_edge20_untramp() at Xintr_ioapic_edge20_untramp+0x18f Xspllower() at Xspllower+0x19 mtx_enter_try(829b8d10) at mtx_enter_try+0x100 mtx_enter(829b8d10) at mtx_enter+0x4b sys/kern/kern_lock.c:266 pool_get(829b8d10,9) at pool_get+0xbf sys/kern/subr_pool.c:581 vm_create(80b29000,8000211922a8) at vm_create+0x261 sys/arch/amd64/amd64/vmm.c:1526 vmmioctl(a00,c5005601,80b29000,1,8000211922a8) at vmmioctl+0x1f2 VOP_IOCTL(fd806e213830,c5005601,80b29000,1,fd807f7d8840,8000211922a8) at VOP_IOCTL+0x9a sys/kern/vfs_vops.c:295 vn_ioctl(fd806e4aca28,c5005601,80b29000,8000211922a8) at vn_ioctl+0xba sys/kern/vfs_vnops.c:531 sys_ioctl(8000211922a8,80002123b398,80002123b3e0) at sys_ioctl+0x4a2 The issue is that pool_get(9) at line 1526 is done after grabbing the `vm_lock'. If an interrupt needing the KERNEL_LOCK() occurs at that moment the above mentionned lock ordering problem could cause a deadlock. To prevent such issue we generally mark the pool with IPL_MPFLOOR. [0] https://syzkaller.appspot.com/bug?id=c73756cc996a58a625da35fbaa90ba6b9e0c60dc Index: arch/amd64/amd64/vmm.c === RCS file: /cvs/src/sys/arch/amd64/amd64/vmm.c,v retrieving revision 1.287 diff -u -p -r1.287 vmm.c --- arch/amd64/amd64/vmm.c 31 Aug 2021 17:40:59 - 1.287 +++ arch/amd64/amd64/vmm.c 1 Sep 2021 06:45:38 - @@ -430,7 +430,7 @@ vmm_attach(struct device *parent, struct pool_init(_pool, sizeof(struct vm), 0, IPL_NONE, PR_WAITOK, "vmpool", NULL); - pool_init(_pool, sizeof(struct vcpu), 64, IPL_NONE, PR_WAITOK, + pool_init(_pool, sizeof(struct vcpu), 64, IPL_MPFLOOR, PR_WAITOK, "vcpupl", NULL); vmm_softc = sc;