Re: btrfs: lock inversion between delayed_node->mutex and found->groups_sem
On Fri, Apr 04, 2014 at 05:15:23PM -0400, Sasha Levin wrote:
> On 03/26/2014 01:01 PM, Jeff Mahoney wrote:
>> On 3/17/14, 9:05 AM, David Sterba wrote:
>>> On Fri, Mar 14, 2014 at 08:12:16PM -0400, Sasha Levin wrote:
>>>> While fuzzing with trinity inside a KVM tools guest running the
>>>> latest -next kernel I've stumbled on the following:
>>>>
>>>> [ 788.458756]        CPU0                    CPU1
>>>> [ 788.459188]        ----                    ----
>>>> [ 788.459625]   lock(&found->groups_sem);
>>>> [ 788.460041]                                local_irq_disable();
>>>> [ 788.460041]                                lock(&delayed_node->mutex);
>>>> [ 788.460041]                                lock(&found->groups_sem);
>>>> [ 788.460041]   <Interrupt>
>>>> [ 788.460041]     lock(&delayed_node->mutex);
>>>> [ 788.460041]
>>>> [ 788.460041]  *** DEADLOCK ***
>>>> [ 788.460041]
>>>> [ 788.460041] 2 locks held by kswapd3/4199:
>>>
>>> I've once (3.14-rc5) seen the same warning also caused by
>>> xfstests/generic/224
>>
>> I think this is from my sysfs patches. We call kobject_add while holding
>> the group_sem. kobject_add ultimately allocates with GFP_KERNEL, so it
>> can enter reclaim. This particular case isn't dangerous, but it could
>> hit while hot-adding a device. The fix should be pretty simple.
>
> Is that fix available anywhere? I'm still seeing the issue in -next.

It is: https://patchwork.kernel.org/patch/3894781/ , will probably hit -rc2
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: btrfs: lock inversion between delayed_node->mutex and found->groups_sem
On 04/07/2014 12:54 PM, David Sterba wrote:
> On Fri, Apr 04, 2014 at 05:15:23PM -0400, Sasha Levin wrote:
>> On 03/26/2014 01:01 PM, Jeff Mahoney wrote:
>>> On 3/17/14, 9:05 AM, David Sterba wrote:
>>>> On Fri, Mar 14, 2014 at 08:12:16PM -0400, Sasha Levin wrote:
>>>>> While fuzzing with trinity inside a KVM tools guest running the
>>>>> latest -next kernel I've stumbled on the following:
>>>>>
>>>>> [ 788.458756]        CPU0                    CPU1
>>>>> [ 788.459188]        ----                    ----
>>>>> [ 788.459625]   lock(&found->groups_sem);
>>>>> [ 788.460041]                                local_irq_disable();
>>>>> [ 788.460041]                                lock(&delayed_node->mutex);
>>>>> [ 788.460041]                                lock(&found->groups_sem);
>>>>> [ 788.460041]   <Interrupt>
>>>>> [ 788.460041]     lock(&delayed_node->mutex);
>>>>> [ 788.460041]
>>>>> [ 788.460041]  *** DEADLOCK ***
>>>>> [ 788.460041]
>>>>> [ 788.460041] 2 locks held by kswapd3/4199:
>>>>
>>>> I've once (3.14-rc5) seen the same warning also caused by
>>>> xfstests/generic/224
>>>
>>> I think this is from my sysfs patches. We call kobject_add while
>>> holding the group_sem. kobject_add ultimately allocates with
>>> GFP_KERNEL, so it can enter reclaim. This particular case isn't
>>> dangerous, but it could hit while hot-adding a device. The fix should
>>> be pretty simple.
>>
>> Is that fix available anywhere? I'm still seeing the issue in -next.
>
> It is: https://patchwork.kernel.org/patch/3894781/ , will probably hit -rc2

It's in the integration branch now along with some other important fixes.
We'll get it out shortly

-chris
Re: btrfs: lock inversion between delayed_node->mutex and found->groups_sem
On 04/07/2014 01:17 PM, Chris Mason wrote:
> On 04/07/2014 12:54 PM, David Sterba wrote:
>> On Fri, Apr 04, 2014 at 05:15:23PM -0400, Sasha Levin wrote:
>>> On 03/26/2014 01:01 PM, Jeff Mahoney wrote:
>>>> On 3/17/14, 9:05 AM, David Sterba wrote:
>>>>> On Fri, Mar 14, 2014 at 08:12:16PM -0400, Sasha Levin wrote:
>>>>>> While fuzzing with trinity inside a KVM tools guest running the
>>>>>> latest -next kernel I've stumbled on the following:
>>>>>>
>>>>>> [ 788.458756]        CPU0                    CPU1
>>>>>> [ 788.459188]        ----                    ----
>>>>>> [ 788.459625]   lock(&found->groups_sem);
>>>>>> [ 788.460041]                                local_irq_disable();
>>>>>> [ 788.460041]                                lock(&delayed_node->mutex);
>>>>>> [ 788.460041]                                lock(&found->groups_sem);
>>>>>> [ 788.460041]   <Interrupt>
>>>>>> [ 788.460041]     lock(&delayed_node->mutex);
>>>>>> [ 788.460041]
>>>>>> [ 788.460041]  *** DEADLOCK ***
>>>>>> [ 788.460041]
>>>>>> [ 788.460041] 2 locks held by kswapd3/4199:
>>>>>
>>>>> I've once (3.14-rc5) seen the same warning also caused by
>>>>> xfstests/generic/224
>>>>
>>>> I think this is from my sysfs patches. We call kobject_add while
>>>> holding the group_sem. kobject_add ultimately allocates with
>>>> GFP_KERNEL, so it can enter reclaim. This particular case isn't
>>>> dangerous, but it could hit while hot-adding a device. The fix should
>>>> be pretty simple.
>>>
>>> Is that fix available anywhere? I'm still seeing the issue in -next.
>>
>> It is: https://patchwork.kernel.org/patch/3894781/ , will probably hit -rc2
>
> It's in the integration branch now along with some other important
> fixes. We'll get it out shortly

Chris,

Can I suggest adding the integration branch to linux-next as well? That
way all the folks who report issues coming out of -next would be able to
test the fixes as well.

Thanks,
Sasha
Re: btrfs: lock inversion between delayed_node->mutex and found->groups_sem
I was on vacation last week, I'll update btrfs-next today once we are
happy with integration. Thanks,

Josef

Sasha Levin <sasha.le...@oracle.com> wrote:
> On 04/07/2014 01:17 PM, Chris Mason wrote:
>> On 04/07/2014 12:54 PM, David Sterba wrote:
>>> On Fri, Apr 04, 2014 at 05:15:23PM -0400, Sasha Levin wrote:
>>>> On 03/26/2014 01:01 PM, Jeff Mahoney wrote:
>>>>> On 3/17/14, 9:05 AM, David Sterba wrote:
>>>>>> On Fri, Mar 14, 2014 at 08:12:16PM -0400, Sasha Levin wrote:
>>>>>>> While fuzzing with trinity inside a KVM tools guest running the
>>>>>>> latest -next kernel I've stumbled on the following:
>>>>>>>
>>>>>>> [ 788.458756]        CPU0                    CPU1
>>>>>>> [ 788.459188]        ----                    ----
>>>>>>> [ 788.459625]   lock(&found->groups_sem);
>>>>>>> [ 788.460041]                                local_irq_disable();
>>>>>>> [ 788.460041]                                lock(&delayed_node->mutex);
>>>>>>> [ 788.460041]                                lock(&found->groups_sem);
>>>>>>> [ 788.460041]   <Interrupt>
>>>>>>> [ 788.460041]     lock(&delayed_node->mutex);
>>>>>>> [ 788.460041]
>>>>>>> [ 788.460041]  *** DEADLOCK ***
>>>>>>> [ 788.460041]
>>>>>>> [ 788.460041] 2 locks held by kswapd3/4199:
>>>>>>
>>>>>> I've once (3.14-rc5) seen the same warning also caused by
>>>>>> xfstests/generic/224
>>>>>
>>>>> I think this is from my sysfs patches. We call kobject_add while
>>>>> holding the group_sem. kobject_add ultimately allocates with
>>>>> GFP_KERNEL, so it can enter reclaim. This particular case isn't
>>>>> dangerous, but it could hit while hot-adding a device. The fix
>>>>> should be pretty simple.
>>>>
>>>> Is that fix available anywhere? I'm still seeing the issue in -next.
>>>
>>> It is: https://patchwork.kernel.org/patch/3894781/ , will probably hit -rc2
>>
>> It's in the integration branch now along with some other important
>> fixes. We'll get it out shortly
>
> Chris,
>
> Can I suggest adding the integration branch to linux-next as well? That
> way all the folks who report issues coming out of -next would be able
> to test the fixes as well.
>
> Thanks,
> Sasha
Re: btrfs: lock inversion between delayed_node->mutex and found->groups_sem
On 04/07/2014 02:03 PM, Sasha Levin wrote:
> On 04/07/2014 01:17 PM, Chris Mason wrote:
>> On 04/07/2014 12:54 PM, David Sterba wrote:
>>> On Fri, Apr 04, 2014 at 05:15:23PM -0400, Sasha Levin wrote:
>>>> On 03/26/2014 01:01 PM, Jeff Mahoney wrote:
>>>>> On 3/17/14, 9:05 AM, David Sterba wrote:
>>>>>> On Fri, Mar 14, 2014 at 08:12:16PM -0400, Sasha Levin wrote:
>>>>>>> While fuzzing with trinity inside a KVM tools guest running the
>>>>>>> latest -next kernel I've stumbled on the following:
>>>>>>>
>>>>>>> [ 788.458756]        CPU0                    CPU1
>>>>>>> [ 788.459188]        ----                    ----
>>>>>>> [ 788.459625]   lock(&found->groups_sem);
>>>>>>> [ 788.460041]                                local_irq_disable();
>>>>>>> [ 788.460041]                                lock(&delayed_node->mutex);
>>>>>>> [ 788.460041]                                lock(&found->groups_sem);
>>>>>>> [ 788.460041]   <Interrupt>
>>>>>>> [ 788.460041]     lock(&delayed_node->mutex);
>>>>>>> [ 788.460041]
>>>>>>> [ 788.460041]  *** DEADLOCK ***
>>>>>>> [ 788.460041]
>>>>>>> [ 788.460041] 2 locks held by kswapd3/4199:
>>>>>>
>>>>>> I've once (3.14-rc5) seen the same warning also caused by
>>>>>> xfstests/generic/224
>>>>>
>>>>> I think this is from my sysfs patches. We call kobject_add while
>>>>> holding the group_sem. kobject_add ultimately allocates with
>>>>> GFP_KERNEL, so it can enter reclaim. This particular case isn't
>>>>> dangerous, but it could hit while hot-adding a device. The fix
>>>>> should be pretty simple.
>>>>
>>>> Is that fix available anywhere? I'm still seeing the issue in -next.
>>>
>>> It is: https://patchwork.kernel.org/patch/3894781/ , will probably hit -rc2
>>
>> It's in the integration branch now along with some other important
>> fixes. We'll get it out shortly
>
> Chris,
>
> Can I suggest adding the integration branch to linux-next as well? That
> way all the folks who report issues coming out of -next would be able
> to test the fixes as well.

Hi Sasha,

The ink is still a little wet on the integration branch. It'll definitely
go to linux-next and to Linus.

-chris
Re: btrfs: lock inversion between delayed_node->mutex and found->groups_sem
On 03/26/2014 01:01 PM, Jeff Mahoney wrote:
> On 3/17/14, 9:05 AM, David Sterba wrote:
>> On Fri, Mar 14, 2014 at 08:12:16PM -0400, Sasha Levin wrote:
>>> While fuzzing with trinity inside a KVM tools guest running the
>>> latest -next kernel I've stumbled on the following:
>>>
>>> [ 788.458756]        CPU0                    CPU1
>>> [ 788.459188]        ----                    ----
>>> [ 788.459625]   lock(&found->groups_sem);
>>> [ 788.460041]                                local_irq_disable();
>>> [ 788.460041]                                lock(&delayed_node->mutex);
>>> [ 788.460041]                                lock(&found->groups_sem);
>>> [ 788.460041]   <Interrupt>
>>> [ 788.460041]     lock(&delayed_node->mutex);
>>> [ 788.460041]
>>> [ 788.460041]  *** DEADLOCK ***
>>> [ 788.460041]
>>> [ 788.460041] 2 locks held by kswapd3/4199:
>>
>> I've once (3.14-rc5) seen the same warning also caused by
>> xfstests/generic/224
>
> I think this is from my sysfs patches. We call kobject_add while holding
> the group_sem. kobject_add ultimately allocates with GFP_KERNEL, so it
> can enter reclaim. This particular case isn't dangerous, but it could
> hit while hot-adding a device. The fix should be pretty simple.

Is that fix available anywhere? I'm still seeing the issue in -next.

Thanks,
Sasha
Re: btrfs: lock inversion between delayed_node->mutex and found->groups_sem
On Fri, Mar 14, 2014 at 08:12:16PM -0400, Sasha Levin wrote:
> While fuzzing with trinity inside a KVM tools guest running the latest
> -next kernel I've stumbled on the following:
>
> [ 788.458756]        CPU0                    CPU1
> [ 788.459188]        ----                    ----
> [ 788.459625]   lock(&found->groups_sem);
> [ 788.460041]                                local_irq_disable();
> [ 788.460041]                                lock(&delayed_node->mutex);
> [ 788.460041]                                lock(&found->groups_sem);
> [ 788.460041]   <Interrupt>
> [ 788.460041]     lock(&delayed_node->mutex);
> [ 788.460041]
> [ 788.460041]  *** DEADLOCK ***
> [ 788.460041]
> [ 788.460041] 2 locks held by kswapd3/4199:

I've once (3.14-rc5) seen the same warning also caused by
xfstests/generic/224

2 locks held by 224/31203:
 #0:  (shrinker_rwsem){..}, at: [8113be6d] shrink_slab+0x3d/0x110
 #1:  (&type->s_umount_key#32){++}, at: [8117cd84] grab_super_passive+0x44/0x90

the shortest dependencies between 2nd lock and 1st lock:
 -> (&found->groups_sem){+.} ops: 405561 {
    HARDIRQ-ON-W at:
      [810af476] __lock_acquire+0x7f6/0x1fb0
      [810b12e2] lock_acquire+0x92/0x120
      [81a01d3c] down_write+0x5c/0xc0
      [a001f6a6] __link_block_group+0x46/0x130 [btrfs]
      [a0023411] btrfs_read_block_groups+0x341/0x690 [btrfs]
      [a0031c50] open_ctree+0x1880/0x2310 [btrfs]
      [a00065db] btrfs_mount+0x55b/0x860 [btrfs]
      [8117d6f0] mount_fs+0x20/0xe0
      [81199ab6] vfs_kern_mount+0x76/0x160
      [8119c25d] do_mount+0x31d/0x970
      [8119cc30] SyS_mount+0x90/0xe0
      [81a0ca92] system_call_fastpath+0x16/0x1b
    HARDIRQ-ON-R at:
      [810af265] __lock_acquire+0x5e5/0x1fb0
      [810b12e2] lock_acquire+0x92/0x120
      [81a01c8c] down_read+0x4c/0xa0
      [a002da32] btrfs_calc_num_tolerated_disk_barrier_failures+0x142/0x240 [btrfs]
      [a0031c6e] open_ctree+0x189e/0x2310 [btrfs]
      [a00065db] btrfs_mount+0x55b/0x860 [btrfs]
      [8117d6f0] mount_fs+0x20/0xe0
      [81199ab6] vfs_kern_mount+0x76/0x160
      [8119c25d] do_mount+0x31d/0x970
      [8119cc30] SyS_mount+0x90/0xe0
      [81a0ca92] system_call_fastpath+0x16/0x1b
    SOFTIRQ-ON-W at:
      [810af4aa] __lock_acquire+0x82a/0x1fb0
      [810b12e2] lock_acquire+0x92/0x120
      [81a01d3c] down_write+0x5c/0xc0
      [a001f6a6] __link_block_group+0x46/0x130 [btrfs]
      [a0023411] btrfs_read_block_groups+0x341/0x690 [btrfs]
      [a0031c50] open_ctree+0x1880/0x2310 [btrfs]
      [a00065db] btrfs_mount+0x55b/0x860 [btrfs]
      [8117d6f0] mount_fs+0x20/0xe0
      [81199ab6] vfs_kern_mount+0x76/0x160
      [8119c25d] do_mount+0x31d/0x970
      [8119cc30] SyS_mount+0x90/0xe0
      [81a0ca92] system_call_fastpath+0x16/0x1b
    SOFTIRQ-ON-R at:
      [810af4aa] __lock_acquire+0x82a/0x1fb0
      [810b12e2] lock_acquire+0x92/0x120
      [81a01c8c] down_read+0x4c/0xa0
      [a002da32] btrfs_calc_num_tolerated_disk_barrier_failures+0x142/0x240 [btrfs]
      [a0031c6e] open_ctree+0x189e/0x2310 [btrfs]
      [a00065db] btrfs_mount+0x55b/0x860 [btrfs]
      [8117d6f0] mount_fs+0x20/0xe0
      [81199ab6] vfs_kern_mount+0x76/0x160
      [8119c25d] do_mount+0x31d/0x970
      [8119cc30] SyS_mount+0x90/0xe0
      [81a0ca92] system_call_fastpath+0x16/0x1b
    RECLAIM_FS-ON-W at:
      [810b1c6c] mark_held_locks+0x8c/0x170
      [810b249a] lockdep_trace_alloc+0x8a/0xd0
      [8116f807] __kmalloc_track_caller+0x47/0x210
      [813cdefb] kvasprintf+0x5b/0x90
      [813c166a] kobject_set_name_vargs+0x2a/0x70
      [813c1ffa] kobject_add+0x5a/0xb0
      [a001f75d] __link_block_group+0xfd/0x130 [btrfs]
      [a0023411]