Re: btrfs: lock inversion between delayed_node->mutex and found->groups_sem
On Fri, Apr 04, 2014 at 05:15:23PM -0400, Sasha Levin wrote:
> On 03/26/2014 01:01 PM, Jeff Mahoney wrote:
> > On 3/17/14, 9:05 AM, David Sterba wrote:
> > > On Fri, Mar 14, 2014 at 08:12:16PM -0400, Sasha Levin wrote:
> > > > While fuzzing with trinity inside a KVM tools guest running the latest
> > > > -next kernel I've stumbled on the following:
> > > >
> > > > [ 788.458756]        CPU0                    CPU1
> > > > [ 788.459188]        ----                    ----
> > > > [ 788.459625]   lock(&found->groups_sem);
> > > > [ 788.460041]                                local_irq_disable();
> > > > [ 788.460041]                                lock(&delayed_node->mutex);
> > > > [ 788.460041]                                lock(&found->groups_sem);
> > > > [ 788.460041]   <Interrupt>
> > > > [ 788.460041]     lock(&delayed_node->mutex);
> > > > [ 788.460041]
> > > > [ 788.460041]  *** DEADLOCK ***
> > > > [ 788.460041]
> > > > [ 788.460041] 2 locks held by kswapd3/4199:
> > >
> > > I've once (3.14-rc5) seen the same warning also caused by
> > > xfstests/generic/224
> >
> > I think this is from my sysfs patches. We call kobject_add while holding
> > the group_sem. kobject_add ultimately allocates with GFP_KERNEL, so it
> > can enter reclaim. This particular case isn't dangerous, but it could hit
> > while hot-adding a device. The fix should be pretty simple.
>
> Is that fix available anywhere? I'm still seeing the issue in -next.

It is: https://patchwork.kernel.org/patch/3894781/ , will probably hit -rc2
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: btrfs: lock inversion between delayed_node->mutex and found->groups_sem
On 04/07/2014 12:54 PM, David Sterba wrote:
> On Fri, Apr 04, 2014 at 05:15:23PM -0400, Sasha Levin wrote:
> > Is that fix available anywhere? I'm still seeing the issue in -next.
>
> It is: https://patchwork.kernel.org/patch/3894781/ , will probably hit -rc2

It's in the integration branch now along with some other important fixes.
We'll get it out shortly.

-chris
Re: btrfs: lock inversion between delayed_node->mutex and found->groups_sem
On 04/07/2014 01:17 PM, Chris Mason wrote:
> On 04/07/2014 12:54 PM, David Sterba wrote:
> > It is: https://patchwork.kernel.org/patch/3894781/ , will probably hit -rc2
>
> It's in the integration branch now along with some other important fixes.
> We'll get it out shortly

Chris,

Can I suggest adding the integration branch to linux-next as well? That way
all the folks who report issues coming out of -next would be able to test
the fixes as well.

Thanks,
Sasha
Re: btrfs: lock inversion between delayed_node->mutex and found->groups_sem
I was on vacation last week, I'll update btrfs-next today once we are happy
with integration.

Thanks,
Josef

Sasha Levin <sasha.le...@oracle.com> wrote:
> On 04/07/2014 01:17 PM, Chris Mason wrote:
> > It's in the integration branch now along with some other important fixes.
> > We'll get it out shortly
>
> Chris,
>
> Can I suggest adding the integration branch to linux-next as well? That way
> all the folks who report issues coming out of -next would be able to test
> the fixes as well.
>
> Thanks,
> Sasha
Re: btrfs: lock inversion between delayed_node->mutex and found->groups_sem
On 04/07/2014 02:03 PM, Sasha Levin wrote:
> Chris,
>
> Can I suggest adding the integration branch to linux-next as well? That way
> all the folks who report issues coming out of -next would be able to test
> the fixes as well.

Hi Sasha,

The ink is still a little wet on the integration branch. It'll definitely
go to linux-next and to Linus.

-chris
Re: btrfs: lock inversion between delayed_node->mutex and found->groups_sem
On 03/26/2014 01:01 PM, Jeff Mahoney wrote:
> On 3/17/14, 9:05 AM, David Sterba wrote:
> > I've once (3.14-rc5) seen the same warning also caused by
> > xfstests/generic/224
>
> I think this is from my sysfs patches. We call kobject_add while holding
> the group_sem. kobject_add ultimately allocates with GFP_KERNEL, so it
> can enter reclaim. This particular case isn't dangerous, but it could hit
> while hot-adding a device. The fix should be pretty simple.

Is that fix available anywhere? I'm still seeing the issue in -next.

Thanks,
Sasha
Re: btrfs: lock inversion between delayed_node->mutex and found->groups_sem
On Fri, Mar 14, 2014 at 08:12:16PM -0400, Sasha Levin wrote:
> While fuzzing with trinity inside a KVM tools guest running the latest -next
> kernel I've stumbled on the following:
>
> [ 788.458756]        CPU0                    CPU1
> [ 788.459188]        ----                    ----
> [ 788.459625]   lock(&found->groups_sem);
> [ 788.460041]                                local_irq_disable();
> [ 788.460041]                                lock(&delayed_node->mutex);
> [ 788.460041]                                lock(&found->groups_sem);
> [ 788.460041]   <Interrupt>
> [ 788.460041]     lock(&delayed_node->mutex);
> [ 788.460041]
> [ 788.460041]  *** DEADLOCK ***
> [ 788.460041]
> [ 788.460041] 2 locks held by kswapd3/4199:

I've once (3.14-rc5) seen the same warning also caused by xfstests/generic/224

2 locks held by 224/31203:
 #0: (shrinker_rwsem){..}, at: [8113be6d] shrink_slab+0x3d/0x110
 #1: (&type->s_umount_key#32){++}, at: [8117cd84] grab_super_passive+0x44/0x90

the shortest dependencies between 2nd lock and 1st lock:
 -> (&found->groups_sem){+.} ops: 405561 {
    HARDIRQ-ON-W at:
      [810af476] __lock_acquire+0x7f6/0x1fb0
      [810b12e2] lock_acquire+0x92/0x120
      [81a01d3c] down_write+0x5c/0xc0
      [a001f6a6] __link_block_group+0x46/0x130 [btrfs]
      [a0023411] btrfs_read_block_groups+0x341/0x690 [btrfs]
      [a0031c50] open_ctree+0x1880/0x2310 [btrfs]
      [a00065db] btrfs_mount+0x55b/0x860 [btrfs]
      [8117d6f0] mount_fs+0x20/0xe0
      [81199ab6] vfs_kern_mount+0x76/0x160
      [8119c25d] do_mount+0x31d/0x970
      [8119cc30] SyS_mount+0x90/0xe0
      [81a0ca92] system_call_fastpath+0x16/0x1b
    HARDIRQ-ON-R at:
      [810af265] __lock_acquire+0x5e5/0x1fb0
      [810b12e2] lock_acquire+0x92/0x120
      [81a01c8c] down_read+0x4c/0xa0
      [a002da32] btrfs_calc_num_tolerated_disk_barrier_failures+0x142/0x240 [btrfs]
      [a0031c6e] open_ctree+0x189e/0x2310 [btrfs]
      [a00065db] btrfs_mount+0x55b/0x860 [btrfs]
      [8117d6f0] mount_fs+0x20/0xe0
      [81199ab6] vfs_kern_mount+0x76/0x160
      [8119c25d] do_mount+0x31d/0x970
      [8119cc30] SyS_mount+0x90/0xe0
      [81a0ca92] system_call_fastpath+0x16/0x1b
    SOFTIRQ-ON-W at:
      [810af4aa] __lock_acquire+0x82a/0x1fb0
      [810b12e2] lock_acquire+0x92/0x120
      [81a01d3c] down_write+0x5c/0xc0
      [a001f6a6] __link_block_group+0x46/0x130 [btrfs]
      [a0023411] btrfs_read_block_groups+0x341/0x690 [btrfs]
      [a0031c50] open_ctree+0x1880/0x2310 [btrfs]
      [a00065db] btrfs_mount+0x55b/0x860 [btrfs]
      [8117d6f0] mount_fs+0x20/0xe0
      [81199ab6] vfs_kern_mount+0x76/0x160
      [8119c25d] do_mount+0x31d/0x970
      [8119cc30] SyS_mount+0x90/0xe0
      [81a0ca92] system_call_fastpath+0x16/0x1b
    SOFTIRQ-ON-R at:
      [810af4aa] __lock_acquire+0x82a/0x1fb0
      [810b12e2] lock_acquire+0x92/0x120
      [81a01c8c] down_read+0x4c/0xa0
      [a002da32] btrfs_calc_num_tolerated_disk_barrier_failures+0x142/0x240 [btrfs]
      [a0031c6e] open_ctree+0x189e/0x2310 [btrfs]
      [a00065db] btrfs_mount+0x55b/0x860 [btrfs]
      [8117d6f0] mount_fs+0x20/0xe0
      [81199ab6] vfs_kern_mount+0x76/0x160
      [8119c25d] do_mount+0x31d/0x970
      [8119cc30] SyS_mount+0x90/0xe0
      [81a0ca92] system_call_fastpath+0x16/0x1b
    RECLAIM_FS-ON-W at:
      [810b1c6c] mark_held_locks+0x8c/0x170
      [810b249a] lockdep_trace_alloc+0x8a/0xd0
      [8116f807] __kmalloc_track_caller+0x47/0x210
      [813cdefb] kvasprintf+0x5b/0x90
      [813c166a] kobject_set_name_vargs+0x2a/0x70
      [813c1ffa] kobject_add+0x5a/0xb0
      [a001f75d] __link_block_group+0xfd/0x130 [btrfs]
      [a0023411]
btrfs: lock inversion between delayed_node->mutex and found->groups_sem
Hi all,

While fuzzing with trinity inside a KVM tools guest running the latest -next
kernel I've stumbled on the following:

[ 788.451695] =
[ 788.452455] [ INFO: possible irq lock inversion dependency detected ]
[ 788.453020] 3.14.0-rc6-next-20140313-sasha-00010-gb8c1db1-dirty #217 Tainted: GW
[ 788.453827] -
[ 788.454371] kswapd3/4199 just changed the state of lock:
[ 788.454902] (&delayed_node->mutex){+.+.-.}, at: __btrfs_release_delayed_node+0x4f/0x140 (fs/btrfs/delayed-inode.c:263)
[ 788.455890] but this lock took another, RECLAIM_FS-unsafe lock in the past:
[ 788.456543] (&found->groups_sem){+.}

and interrupts could create inverse lock ordering between them.

[ 788.457491]
[ 788.457491] other info that might help us debug this:
[ 788.458115] Possible interrupt unsafe locking scenario:
[ 788.458115]
[ 788.458756]        CPU0                    CPU1
[ 788.459188]        ----                    ----
[ 788.459625]   lock(&found->groups_sem);
[ 788.460041]                                local_irq_disable();
[ 788.460041]                                lock(&delayed_node->mutex);
[ 788.460041]                                lock(&found->groups_sem);
[ 788.460041]   <Interrupt>
[ 788.460041]     lock(&delayed_node->mutex);
[ 788.460041]
[ 788.460041]  *** DEADLOCK ***
[ 788.460041]
[ 788.460041] 2 locks held by kswapd3/4199:
[ 788.460041] #0: (shrinker_rwsem){..}, at: shrink_slab+0x3f/0x160 (mm/vmscan.c:360)
[ 788.460041] #1: (&type->s_umount_key#108){.+.+..}, at: grab_super_passive+0x56/0x90 (fs/super.c:361)
[ 788.460041]
[ 788.460041] the shortest dependencies between 2nd lock and 1st lock:
[ 788.460041] -> (&found->groups_sem){+.} ops: 46 {
[ 788.460041]    HARDIRQ-ON-W at:
[ 788.460041]      mark_irqflags+0xf0/0x170 (kernel/locking/lockdep.c:2800)
[ 788.460041]      __lock_acquire+0x2de/0x5a0 (kernel/locking/lockdep.c:3138)
[ 788.460041]      lock_acquire+0x182/0x1d0 (arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602)
[ 788.460041]      down_write+0x5c/0xc0 (arch/x86/include/asm/rwsem.h:130 kernel/locking/rwsem.c:50)
[ 788.460041]      __link_block_group+0x45/0x110 (fs/btrfs/extent-tree.c:8348)
[ 788.460041]      btrfs_read_block_groups+0x3ae/0x700 (fs/btrfs/extent-tree.c:8533)
[ 788.460041]      open_ctree+0x1abf/0x2210 (fs/btrfs/disk-io.c:2749)
[ 788.460041]      btrfs_fill_super+0x81/0x140 (fs/btrfs/super.c:958)
[ 788.460041]      btrfs_mount+0x26a/0x300 (fs/btrfs/super.c:1295)
[ 788.460041]      mount_fs+0x8d/0x1a0 (fs/super.c:1091)
[ 788.460041]      vfs_kern_mount+0x79/0x150 (fs/namespace.c:813)
[ 788.460041]      do_new_mount+0xcd/0x1c0 (fs/namespace.c:2068)
[ 788.460041]      do_mount+0x15d/0x210 (fs/namespace.c:2392)
[ 788.460041]      SyS_mount+0x9d/0xe0 (fs/namespace.c:2589 fs/namespace.c:2560)
[ 788.460041]      tracesys+0xdd/0xe2 (arch/x86/kernel/entry_64.S:749)
[ 788.460041]    HARDIRQ-ON-R at:
[ 788.460041]      mark_irqflags+0xbc/0x170 (kernel/locking/lockdep.c:2792)
[ 788.460041]      __lock_acquire+0x2de/0x5a0 (kernel/locking/lockdep.c:3138)
[ 788.460041]      lock_acquire+0x182/0x1d0 (arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602)
[ 788.460041]      down_read+0x4c/0xa0 (arch/x86/include/asm/rwsem.h:83 kernel/locking/rwsem.c:23)
[ 788.460041]      btrfs_calc_num_tolerated_disk_barrier_failures+0x2a7/0x3a0 (fs/btrfs/disk-io.c:3309)
[ 788.460041]      open_ctree+0x1af7/0x2210 (fs/btrfs/disk-io.c:2755)
[ 788.460041]      btrfs_fill_super+0x81/0x140 (fs/btrfs/super.c:958)
[ 788.460041]      btrfs_mount+0x26a/0x300 (fs/btrfs/super.c:1295)
[ 788.460041]      mount_fs+0x8d/0x1a0 (fs/super.c:1091)
[ 788.460041]      vfs_kern_mount+0x79/0x150 (fs/namespace.c:813)
[ 788.460041]      do_new_mount+0xcd/0x1c0 (fs/namespace.c:2068)
[ 788.460041]      do_mount+0x15d/0x210 (fs/namespace.c:2392)
[ 788.460041]      SyS_mount+0x9d/0xe0 (fs/namespace.c:2589 fs/namespace.c:2560)
[ 788.460041]      tracesys+0xdd/0xe2 (arch/x86/kernel/entry_64.S:749)
[ 788.460041]    SOFTIRQ-ON-W at:
[ 788.460041]      mark_irqflags+0x110/0x170 (kernel/locking/lockdep.c:2804)
[ 788.460041]      __lock_acquire+0x2de/0x5a0 (kernel/locking/lockdep.c:3138)
[ 788.460041]      lock_acquire+0x182/0x1d0 (arch/x86/include/asm/current.h:14