Re: [PATCH] dmaengine: pl330: fix a race condition in case of threaded irqs
On Tue, Mar 06, 2018 at 09:13:37AM +0800, Qi Hou wrote:
> When booting up with "threadirqs" in the command line, all irq handlers of
> the DMA controller pl330 will be forcibly threaded. These threads will race
> for the same list, pl330->req_done.
>
> Before the callback, the spinlock was released. And after it, the spinlock
> was taken. This opened a race window where another threaded irq handler
> could steal the spinlock and be permitted to delete entries of the list,
> pl330->req_done.
>
> If the latter deleted an entry that was still referred to by the former,
> there would be a kernel panic when the former was scheduled and tried to
> get the next sibling of the deleted entry.

Applied, thanks

--
~Vinod
[PATCH] dmaengine: pl330: fix a race condition in case of threaded irqs
When booting up with "threadirqs" in the command line, all irq handlers of the
DMA controller pl330 will be forcibly threaded. These threads will race for the
same list, pl330->req_done.

Before the callback, the spinlock was released. And after it, the spinlock was
taken. This opened a race window where another threaded irq handler could steal
the spinlock and be permitted to delete entries of the list, pl330->req_done.

If the latter deleted an entry that was still referred to by the former, there
would be a kernel panic when the former was scheduled and tried to get the next
sibling of the deleted entry.

The scenario could be depicted as below:

    Thread: T1     pl330->req_done     Thread: T2
        |                |                 |
        |            -A-B-C-D-             |
      Locked             |                 |
        |                |              Waiting
      Del A              |                 |
        |             -B-C-D-              |
      Unlocked           |                 |
        |                |               Locked
      Waiting            |                 |
        |                |               Del B
        |                |                 |
        |              -C-D-            Unlocked
      Waiting            |                 |
        |                |                 |
      Locked             |                 |
        |                |                 |
    get C via B          |                 |
         \               |                 |
          - Kernel panic

The kernel panic looked as below:

  Unable to handle kernel paging request at virtual address dead0108
  pgd = ff8008c9e000
  [dead0108] *pgd=00027fffe003, *pud=00027fffe003, *pmd=
  Internal error: Oops: 9644 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 0 PID: 85 Comm: irq/59-6633 Not tainted 4.8.24-WR9.0.0.12_standard #2
  Hardware name: Broadcom NS2 SVK (DT)
  task: ffc1f5cc3c00 task.stack: ffc1f5ce
  PC is at pl330_irq_handler+0x27c/0x390
  LR is at pl330_irq_handler+0x2a8/0x390
  pc : [] lr : [] pstate: 81c5
  sp : ffc1f5ce3d00
  x29: ffc1f5ce3d00 x28: 0140
  x27: ffc1f5c530b0 x26: dead0100
  x25: dead0200 x24: 00418958
  x23: 0001 x22: ffc1f5ccd668
  x21: ffc1f5ccd590 x20: ffc1f5ccd418
  x19: dead0060 x18: 0001
  x17: 0007 x16: 0001
  x15: x14: x13: x12:
  x11: 0001 x10: 0840
  x9 : ffc1f5ce x8 : ffc1f5cc3338
  x7 : ff8008ce2020 x6 :
  x5 : x4 : 0001
  x3 : dead0200 x2 : dead0100
  x1 : 0140 x0 : ffc1f5ccd590
  Process irq/59-6633 (pid: 85, stack limit = 0xffc1f5ce0020)
  Stack: (0xffc1f5ce3d00 to 0xffc1f5ce4000)
  3d00: ffc1f5ce3d80 ff80080f09d0 ffc1f5ca0c00 ffc1f6f7c600
  3d20: ffc1f5ce ffc1f6f7c600 ffc1f5ca0c00 ff80080f0998
  3d40: ffc1f5ce ff80080f
  3d60: ff8008ce202c ff8008ce2020 ffc1f5ccd668 ffc1f5c530b0
  3d80: ffc1f5ce3db0 ff80080f0d70 ffc1f5ca0c40 0001
  3da0: ffc1f5ce ff80080f0cfc ffc1f5ce3e20 ff80080bf4f8
  3dc0: ffc1f5ca0c80 ff8008bf3798 ff8008955528 ffc1f5ca0c00
  3de0: ff80080f0c30
  3e00: ff80080f0b68
  3e20: ff8008083690 ff80080bf420 ffc1f5ca0c80
  3e40: ff80080cb648
  3e60: ff8008b1c780 ffc1f5ca0c00
  3e80: ffc1 ff80 ffc1f5ce3e90 ffc1f5ce3e90
  3ea0: ff80 ffc1f5ce3eb0 ffc1f5ce3eb0
  3ec0:
  3ee0:
  3f00:
  3f20:
  3f40:
  3f60:
  3f80:
  3fa0:
  3fc0: 0005
  3fe0: 000275ce3ff0 000275ce3ff8
  Call trace:
  Exception stack(0xffc1f5ce3b30 to 0xffc1f5ce3c60)
  3b20: dead0060 0080
  3b40: ffc1f5ce3d00 ff80084cb694 0008 0e88
  3b60: ffc1f5ce3bb0 ff80080dac68 ffc1f5ce3b90 ff8008826fe4
  3b80: 01c0 01c0 ffc1f5ce3bb0 ff800848dfcc
  3ba0: 0002 ff8008b15ae4 ffc1f5ce3c00 ff800808f000
  3bc0: 0010 ff80088377f0 ffc1f5ccd590 0140
  3be0: dead0100 dead0200 0001