Re: [PATCH] block: fix possible bd_size_lock deadlock
On 2021/03/13 4:37, Jens Axboe wrote:
> On 3/11/21 5:11 AM, yanfei...@windriver.com wrote:
>> From: Yanfei Xu
>>
>> bd_size_lock spinlock could be taken in block softirq, thus we should
>> disable the softirq before taking the lock.
>>
>> WARNING: inconsistent lock state
>> 5.12.0-rc2-syzkaller #0 Not tainted
>>
>> inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-R} usage.
>> kworker/u4:0/7 [HC0[0]:SC1[1]:HE0:SE0] takes:
>> 8f87826c (&inode->i_size_seqcount){+.+-}-{0:0}, at:
>> end_bio_bh_io_sync+0x38/0x54 fs/buffer.c:3006
>> {SOFTIRQ-ON-W} state was registered at:
>>   lock_acquire.part.0+0xf0/0x41c kernel/locking/lockdep.c:5510
>>   lock_acquire+0x6c/0x74 kernel/locking/lockdep.c:5483
>>   do_write_seqcount_begin_nested include/linux/seqlock.h:520 [inline]
>>   do_write_seqcount_begin include/linux/seqlock.h:545 [inline]
>>   i_size_write include/linux/fs.h:863 [inline]
>>   set_capacity+0x13c/0x1f8 block/genhd.c:50
>>   brd_alloc+0x130/0x180 drivers/block/brd.c:401
>>   brd_init+0xcc/0x1e0 drivers/block/brd.c:500
>>   do_one_initcall+0x8c/0x59c init/main.c:1226
>>   do_initcall_level init/main.c:1299 [inline]
>>   do_initcalls init/main.c:1315 [inline]
>>   do_basic_setup init/main.c:1335 [inline]
>>   kernel_init_freeable+0x2cc/0x330 init/main.c:1537
>>   kernel_init+0x10/0x120 init/main.c:1424
>>   ret_from_fork+0x14/0x20 arch/arm/kernel/entry-common.S:158
>>   0x0
>> irq event stamp: 2783413
>> hardirqs last enabled at (2783412): [<802011ec>]
>> __do_softirq+0xf4/0x7ac kernel/softirq.c:329
>> hardirqs last disabled at (2783413): [<8277d260>]
>> __raw_read_lock_irqsave include/linux/rwlock_api_smp.h:157 [inline]
>> hardirqs last disabled at (2783413): [<8277d260>]
>> _raw_read_lock_irqsave+0x84/0x88 kernel/locking/spinlock.c:231
>> softirqs last enabled at (2783410): [<826b5050>] spin_unlock_bh
>> include/linux/spinlock.h:399 [inline]
>> softirqs last enabled at (2783410): [<826b5050>]
>> batadv_nc_purge_paths+0x10c/0x148 net/batman-adv/network-coding.c:467
>> softirqs last disabled at (2783411): [<8024ddfc>] do_softirq_own_stack
>> include/asm-generic/softirq_stack.h:10 [inline]
>> softirqs last disabled at (2783411): [<8024ddfc>] do_softirq
>> kernel/softirq.c:248 [inline]
>> softirqs last disabled at (2783411): [<8024ddfc>] do_softirq+0xd8/0xe4
>> kernel/softirq.c:235
>>
>> other info that might help us debug this:
>> Possible unsafe locking scenario:
>>
>>        CPU0
>>
>>   lock(&inode->i_size_seqcount);
>>
>>     lock(&inode->i_size_seqcount);
>>
>> *** DEADLOCK ***
>>
>> 3 locks held by kworker/u4:0/7:
>> #0: 88c622a8 ((wq_completion)bat_events){+.+.}-{0:0}, at: set_work_data
>> kernel/workqueue.c:615 [inline]
>> #0: 88c622a8 ((wq_completion)bat_events){+.+.}-{0:0}, at:
>> set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
>> #0: 88c622a8 ((wq_completion)bat_events){+.+.}-{0:0}, at:
>> process_one_work+0x214/0x998 kernel/workqueue.c:2246
>> #1: 85147ef8
>> ((work_completion)(&(&bat_priv->nc.work)->work)){+.+.}-{0:0}, at:
>> set_work_data kernel/workqueue.c:615 [inline]
>> #1: 85147ef8
>> ((work_completion)(&(&bat_priv->nc.work)->work)){+.+.}-{0:0}, at:
>> set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
>> #1: 85147ef8
>> ((work_completion)(&(&bat_priv->nc.work)->work)){+.+.}-{0:0}, at:
>> process_one_work+0x214/0x998 kernel/workqueue.c:2246
>> #2: 8f878010 (&ni->size_lock){...-}-{2:2}, at:
>> ntfs_end_buffer_async_read+0x6c/0x558 fs/ntfs/aops.c:66
>
> Damien? We have that revert queued up for this for 5.12, but looking
> at that, the state before that was kind of messy too.

Indeed... I was thinking about this and I think I am with Christoph on
this: drivers should not call set_capacity() from command completion
context. I think the best thing to do would be to fix drivers that do
that, but that may not be RC material?

Looking into more details of this case, it is slightly different though.
set_capacity() is here not called from soft IRQ context. It looks like a
regular initialization, but one that seems way too early in the boot
process when a secondary core is being initialized with IRQs not yet
enabled... I think. And the warnings come from i_size_write() calling
preempt_disable() rather than set_capacity()'s use of
spin_lock(&bdev->bd_size_lock). I wonder how it is possible to have brd
being initialized so early. I am not sure how to fix that. It looks like
arm arch code territory.

For now, we could revert the revert as I do not think that Yanfei's patch
is enough, since completions may be from hard IRQ context too, which is
not covered by the spin_lock_bh() variants (cf. a similar problem we are
facing with that in scsi completion [1]).

I do not have any good idea how to proceed though.

[1] https://lore.kernel.org/linux-scsi/ph0pr04mb7416c8330459e92d8aa21a889b...@ph0pr04mb7416.namprd04.prod.outlook.com/T/#t

--
Damien Le Moal
Western Digital Research

Re: [PATCH] block: fix possible bd_size_lock deadlock
On 3/11/21 5:11 AM, yanfei...@windriver.com wrote:
> From: Yanfei Xu
>
> bd_size_lock spinlock could be taken in block softirq, thus we should
> disable the softirq before taking the lock.
>
> WARNING: inconsistent lock state
> 5.12.0-rc2-syzkaller #0 Not tainted
>
> inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-R} usage.
> kworker/u4:0/7 [HC0[0]:SC1[1]:HE0:SE0] takes:
> 8f87826c (&inode->i_size_seqcount){+.+-}-{0:0}, at:
> end_bio_bh_io_sync+0x38/0x54 fs/buffer.c:3006
> {SOFTIRQ-ON-W} state was registered at:
>   lock_acquire.part.0+0xf0/0x41c kernel/locking/lockdep.c:5510
>   lock_acquire+0x6c/0x74 kernel/locking/lockdep.c:5483
>   do_write_seqcount_begin_nested include/linux/seqlock.h:520 [inline]
>   do_write_seqcount_begin include/linux/seqlock.h:545 [inline]
>   i_size_write include/linux/fs.h:863 [inline]
>   set_capacity+0x13c/0x1f8 block/genhd.c:50
>   brd_alloc+0x130/0x180 drivers/block/brd.c:401
>   brd_init+0xcc/0x1e0 drivers/block/brd.c:500
>   do_one_initcall+0x8c/0x59c init/main.c:1226
>   do_initcall_level init/main.c:1299 [inline]
>   do_initcalls init/main.c:1315 [inline]
>   do_basic_setup init/main.c:1335 [inline]
>   kernel_init_freeable+0x2cc/0x330 init/main.c:1537
>   kernel_init+0x10/0x120 init/main.c:1424
>   ret_from_fork+0x14/0x20 arch/arm/kernel/entry-common.S:158
>   0x0
> irq event stamp: 2783413
> hardirqs last enabled at (2783412): [<802011ec>]
> __do_softirq+0xf4/0x7ac kernel/softirq.c:329
> hardirqs last disabled at (2783413): [<8277d260>]
> __raw_read_lock_irqsave include/linux/rwlock_api_smp.h:157 [inline]
> hardirqs last disabled at (2783413): [<8277d260>]
> _raw_read_lock_irqsave+0x84/0x88 kernel/locking/spinlock.c:231
> softirqs last enabled at (2783410): [<826b5050>] spin_unlock_bh
> include/linux/spinlock.h:399 [inline]
> softirqs last enabled at (2783410): [<826b5050>]
> batadv_nc_purge_paths+0x10c/0x148 net/batman-adv/network-coding.c:467
> softirqs last disabled at (2783411): [<8024ddfc>] do_softirq_own_stack
> include/asm-generic/softirq_stack.h:10 [inline]
> softirqs last disabled at (2783411): [<8024ddfc>] do_softirq
> kernel/softirq.c:248 [inline]
> softirqs last disabled at (2783411): [<8024ddfc>] do_softirq+0xd8/0xe4
> kernel/softirq.c:235
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
>        CPU0
>
>   lock(&inode->i_size_seqcount);
>
>     lock(&inode->i_size_seqcount);
>
> *** DEADLOCK ***
>
> 3 locks held by kworker/u4:0/7:
> #0: 88c622a8 ((wq_completion)bat_events){+.+.}-{0:0}, at: set_work_data
> kernel/workqueue.c:615 [inline]
> #0: 88c622a8 ((wq_completion)bat_events){+.+.}-{0:0}, at:
> set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
> #0: 88c622a8 ((wq_completion)bat_events){+.+.}-{0:0}, at:
> process_one_work+0x214/0x998 kernel/workqueue.c:2246
> #1: 85147ef8
> ((work_completion)(&(&bat_priv->nc.work)->work)){+.+.}-{0:0}, at:
> set_work_data kernel/workqueue.c:615 [inline]
> #1: 85147ef8
> ((work_completion)(&(&bat_priv->nc.work)->work)){+.+.}-{0:0}, at:
> set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
> #1: 85147ef8
> ((work_completion)(&(&bat_priv->nc.work)->work)){+.+.}-{0:0}, at:
> process_one_work+0x214/0x998 kernel/workqueue.c:2246
> #2: 8f878010 (&ni->size_lock){...-}-{2:2}, at:
> ntfs_end_buffer_async_read+0x6c/0x558 fs/ntfs/aops.c:66

Damien? We have that revert queued up for this for 5.12, but looking
at that, the state before that was kind of messy too.

--
Jens Axboe

[PATCH] block: fix possible bd_size_lock deadlock
From: Yanfei Xu

bd_size_lock spinlock could be taken in block softirq, thus we should
disable the softirq before taking the lock.

WARNING: inconsistent lock state
5.12.0-rc2-syzkaller #0 Not tainted

inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-R} usage.
kworker/u4:0/7 [HC0[0]:SC1[1]:HE0:SE0] takes:
8f87826c (&inode->i_size_seqcount){+.+-}-{0:0}, at: end_bio_bh_io_sync+0x38/0x54 fs/buffer.c:3006
{SOFTIRQ-ON-W} state was registered at:
  lock_acquire.part.0+0xf0/0x41c kernel/locking/lockdep.c:5510
  lock_acquire+0x6c/0x74 kernel/locking/lockdep.c:5483
  do_write_seqcount_begin_nested include/linux/seqlock.h:520 [inline]
  do_write_seqcount_begin include/linux/seqlock.h:545 [inline]
  i_size_write include/linux/fs.h:863 [inline]
  set_capacity+0x13c/0x1f8 block/genhd.c:50
  brd_alloc+0x130/0x180 drivers/block/brd.c:401
  brd_init+0xcc/0x1e0 drivers/block/brd.c:500
  do_one_initcall+0x8c/0x59c init/main.c:1226
  do_initcall_level init/main.c:1299 [inline]
  do_initcalls init/main.c:1315 [inline]
  do_basic_setup init/main.c:1335 [inline]
  kernel_init_freeable+0x2cc/0x330 init/main.c:1537
  kernel_init+0x10/0x120 init/main.c:1424
  ret_from_fork+0x14/0x20 arch/arm/kernel/entry-common.S:158
  0x0
irq event stamp: 2783413
hardirqs last enabled at (2783412): [<802011ec>] __do_softirq+0xf4/0x7ac kernel/softirq.c:329
hardirqs last disabled at (2783413): [<8277d260>] __raw_read_lock_irqsave include/linux/rwlock_api_smp.h:157 [inline]
hardirqs last disabled at (2783413): [<8277d260>] _raw_read_lock_irqsave+0x84/0x88 kernel/locking/spinlock.c:231
softirqs last enabled at (2783410): [<826b5050>] spin_unlock_bh include/linux/spinlock.h:399 [inline]
softirqs last enabled at (2783410): [<826b5050>] batadv_nc_purge_paths+0x10c/0x148 net/batman-adv/network-coding.c:467
softirqs last disabled at (2783411): [<8024ddfc>] do_softirq_own_stack include/asm-generic/softirq_stack.h:10 [inline]
softirqs last disabled at (2783411): [<8024ddfc>] do_softirq kernel/softirq.c:248 [inline]
softirqs last disabled at (2783411): [<8024ddfc>] do_softirq+0xd8/0xe4 kernel/softirq.c:235

other info that might help us debug this:
Possible unsafe locking scenario:

       CPU0

  lock(&inode->i_size_seqcount);

    lock(&inode->i_size_seqcount);

*** DEADLOCK ***

3 locks held by kworker/u4:0/7:
#0: 88c622a8 ((wq_completion)bat_events){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:615 [inline]
#0: 88c622a8 ((wq_completion)bat_events){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
#0: 88c622a8 ((wq_completion)bat_events){+.+.}-{0:0}, at: process_one_work+0x214/0x998 kernel/workqueue.c:2246
#1: 85147ef8 ((work_completion)(&(&bat_priv->nc.work)->work)){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:615 [inline]
#1: 85147ef8 ((work_completion)(&(&bat_priv->nc.work)->work)){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
#1: 85147ef8 ((work_completion)(&(&bat_priv->nc.work)->work)){+.+.}-{0:0}, at: process_one_work+0x214/0x998 kernel/workqueue.c:2246
#2: 8f878010 (&ni->size_lock){...-}-{2:2}, at: ntfs_end_buffer_async_read+0x6c/0x558 fs/ntfs/aops.c:66

Fixes: 0f47227705d8 (block: revert "block: fix bd_size_lock use")
Reported-by: syzbot+a464ba0296692a4d2...@syzkaller.appspotmail.com
Signed-off-by: Yanfei Xu
---
 block/genhd.c           | 4 ++--
 block/partitions/core.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index c55e8f0fced1..a246fcbd6fc5 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -46,9 +46,9 @@ void set_capacity(struct gendisk *disk, sector_t sectors)
 {
 	struct block_device *bdev = disk->part0;

-	spin_lock(&bdev->bd_size_lock);
+	spin_lock_bh(&bdev->bd_size_lock);
 	i_size_write(bdev->bd_inode, (loff_t)sectors << SECTOR_SHIFT);
-	spin_unlock(&bdev->bd_size_lock);
+	spin_unlock_bh(&bdev->bd_size_lock);
 }
 EXPORT_SYMBOL(set_capacity);

diff --git a/block/partitions/core.c b/block/partitions/core.c
index 1a7558917c47..777db55debce 100644
--- a/block/partitions/core.c
+++ b/block/partitions/core.c
@@ -88,9 +88,9 @@ static int (*check_part[])(struct parsed_partitions *) = {

 static void bdev_set_nr_sectors(struct block_device *bdev, sector_t sectors)
 {
-	spin_lock(&bdev->bd_size_lock);
+	spin_lock_bh(&bdev->bd_size_lock);
 	i_size_write(bdev->bd_inode, (loff_t)sectors << SECTOR_SHIFT);
-	spin_unlock(&bdev->bd_size_lock);
+	spin_unlock_bh(&bdev->bd_size_lock);
 }

 static struct parsed_partitions *allocate_partitions(struct gendisk *hd)
--
2.27.0
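
For readers unfamiliar with the {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-R} report quoted in this thread, the following may help. It is a minimal userspace sketch, not kernel code, of why a sequence-counter writer that can be interrupted by a reader on the same CPU is a problem. All names (model_seqcount, model_write_begin, model_read, model_interrupted_write) are hypothetical and only loosely mirror the kernel's seqcount helpers. The retry bound is an assumption added so the example terminates; a real reader would keep spinning for as long as the sequence stays odd.

```c
#include <stdbool.h>

/* Hypothetical userspace model of a sequence counter; NOT the kernel API. */
struct model_seqcount {
	unsigned int sequence;	/* odd while a write is in progress */
	long long value;	/* the protected data (think: i_size) */
};

/* Writer side: make the sequence odd before the update, even after it. */
static void model_write_begin(struct model_seqcount *s) { s->sequence++; }
static void model_write_end(struct model_seqcount *s) { s->sequence++; }

/*
 * Reader side: retry while a write is in progress or the sequence changed.
 * Returns true and stores the snapshot in *out on success, false if no
 * consistent snapshot was obtained within max_spins attempts. The bound is
 * an assumption of this sketch so that it always terminates.
 */
static bool model_read(const struct model_seqcount *s, int max_spins,
		       long long *out)
{
	for (int i = 0; i < max_spins; i++) {
		unsigned int start = s->sequence;
		long long v;

		if (start & 1)
			continue;	/* writer active: try again */
		v = s->value;
		if (s->sequence == start) {
			*out = v;
			return true;
		}
	}
	return false;
}

/*
 * Models a "softirq" reader running in the middle of a write on the same
 * CPU. The writer cannot finish until the reader returns, and the reader
 * cannot succeed until the writer finishes, so the read never succeeds.
 */
static bool model_interrupted_write(struct model_seqcount *s,
				    long long new_value)
{
	long long seen;
	bool reader_ok;

	model_write_begin(s);
	s->value = new_value;
	reader_ok = model_read(s, 1000, &seen);	/* the interrupting reader */
	model_write_end(s);
	return reader_ok;
}
```

In this model, model_interrupted_write() returns false however large the retry bound is made, while a read issued after model_write_end() succeeds at once. Keeping the interrupting context out for the duration of the write removes the window, which is the role of the _bh lock variants in the patch for softirqs; as noted in the thread, those variants do not cover hard IRQ context.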