On 3/19/18 2:08 PM, David Sterba wrote:
> On Mon, Mar 19, 2018 at 01:52:05PM -0400, Jeff Mahoney wrote:
>> On 3/16/18 4:12 PM, David Sterba wrote:
>>> On Fri, Mar 16, 2018 at 02:36:27PM -0400, [email protected] wrote:
>>>> From: Jeff Mahoney <[email protected]>
>>>>
>>>> While running btrfs/011, I hit the following lockdep splat.
>>>>
>>>> This is the important bit:
>>>> pcpu_alloc+0x1ac/0x5e0
>>>> __percpu_counter_init+0x4e/0xb0
>>>> btrfs_init_fs_root+0x99/0x1c0 [btrfs]
>>>> btrfs_get_fs_root.part.54+0x5b/0x150 [btrfs]
>>>> resolve_indirect_refs+0x130/0x830 [btrfs]
>>>> find_parent_nodes+0x69e/0xff0 [btrfs]
>>>> btrfs_find_all_roots_safe+0xa0/0x110 [btrfs]
>>>> btrfs_find_all_roots+0x50/0x70 [btrfs]
>>>> btrfs_qgroup_prepare_account_extents+0x53/0x90 [btrfs]
>>>> btrfs_commit_transaction+0x3ce/0x9b0 [btrfs]
>>>>
>>>> The percpu_counter_init call in btrfs_alloc_subvolume_writers
>>>> uses GFP_KERNEL, which we can't do during transaction commit.
>>>>
>>>> This switches it to GFP_NOFS.
>>>
>>>> Signed-off-by: Jeff Mahoney <[email protected]>
>>>> ---
>>>> fs/btrfs/disk-io.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
>>>> index 21f34ad0d411..eb6bb3169a9e 100644
>>>> --- a/fs/btrfs/disk-io.c
>>>> +++ b/fs/btrfs/disk-io.c
>>>> @@ -1108,7 +1108,7 @@ static struct btrfs_subvolume_writers
>>>> *btrfs_alloc_subvolume_writers(void)
>>>> 	if (!writers)
>>>> 		return ERR_PTR(-ENOMEM);
>>>>
>>>> -	ret = percpu_counter_init(&writers->counter, 0, GFP_KERNEL);
>>>> +	ret = percpu_counter_init(&writers->counter, 0, GFP_NOFS);
>>>
>>> A line above the diff context is another allocation that does GFP_NOFS,
>>> so one of the gfp flags were wrong.
>>>
>>> Looks like there's another instance where percpu allocates with
>>> GFP_KERNEL: create_space_info that can be called from the path that
>>> allocates chunks, so this also looks like a NOFS candidate.
>>
>> We can get rid of this case entirely. Those call sites should be
>> removed since the space_infos are all allocated at mount time.
>
> That would be great and make a few things simpler. So this means that
> __find_space_info never fails once the space infos are properly
> initialized, right? That was my concern in do_chunk_alloc and
> btrfs_make_block_group (that's called from __btrfs_alloc_chunk).

That's a different case. The raid levels are added when the first block
group of a particular raid level is loaded up. That can happen when the
block groups are read in initially, where it should be safe to use
GFP_KERNEL, or when a chunk of a new type is allocated. The thing is
that a chunk of a new type will only be allocated when we're converting
via balance, so we may be able to do the kobject_add for the raid level
when we start the balance rather than wait for it to create the block
group.

-Jeff

-- 
Jeff Mahoney
SUSE Labs
