On 3/16/18 4:12 PM, David Sterba wrote:
> On Fri, Mar 16, 2018 at 02:36:27PM -0400, [email protected] wrote:
>> From: Jeff Mahoney <[email protected]>
>>
>> While running btrfs/011, I hit the following lockdep splat.
>>
>> This is the important bit:
>>    pcpu_alloc+0x1ac/0x5e0
>>    __percpu_counter_init+0x4e/0xb0
>>    btrfs_init_fs_root+0x99/0x1c0 [btrfs]
>>    btrfs_get_fs_root.part.54+0x5b/0x150 [btrfs]
>>    resolve_indirect_refs+0x130/0x830 [btrfs]
>>    find_parent_nodes+0x69e/0xff0 [btrfs]
>>    btrfs_find_all_roots_safe+0xa0/0x110 [btrfs]
>>    btrfs_find_all_roots+0x50/0x70 [btrfs]
>>    btrfs_qgroup_prepare_account_extents+0x53/0x90 [btrfs]
>>    btrfs_commit_transaction+0x3ce/0x9b0 [btrfs]
>>
>> The percpu_counter_init call in btrfs_alloc_subvolume_writers
>> uses GFP_KERNEL, which we can't do during transaction commit.
>>
>> This switches it to GFP_NOFS.
> 
>> Signed-off-by: Jeff Mahoney <[email protected]>
>> ---
>>  fs/btrfs/disk-io.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
>> index 21f34ad0d411..eb6bb3169a9e 100644
>> --- a/fs/btrfs/disk-io.c
>> +++ b/fs/btrfs/disk-io.c
>> @@ -1108,7 +1108,7 @@ static struct btrfs_subvolume_writers *btrfs_alloc_subvolume_writers(void)
>>      if (!writers)
>>              return ERR_PTR(-ENOMEM);
>>  
>> -    ret = percpu_counter_init(&writers->counter, 0, GFP_KERNEL);
>> +    ret = percpu_counter_init(&writers->counter, 0, GFP_NOFS);
> 
> A line above the diff context is another allocation that does GFP_NOFS,
> so one of the gfp flags was wrong.

This one was wrong.  percpu_counter_init was initially implicitly
GFP_KERNEL until Tejun added the gfp_t argument and used GFP_KERNEL for
most of the call sites.
Since that was effectively a no-op, it was the right thing for him to do
without asking every subsystem maintainer their preference.

> Looks like there's another instance where percpu allocates with
> GFP_KERNEL: create_space_info that can be called from the path that
> allocates chunks, so this also looks like a NOFS candidate.

That's probably for the same reason.

> And in the same function, there's another indirect and hidden GFP_KERNEL
> allocation from kobject_init_and_add. So in this case we can't fix all
> the gfp problems at the call site and will have to use the scoped
> approach eventually.

Yep.  That's not a huge barrier, though.  We can push the kobject_add
into a workqueue pretty easily.
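
For reference, a rough userspace model of why the scoped approach covers
hidden allocations that a call-site fix cannot reach.  In the kernel the
real helpers are memalloc_nofs_save() and memalloc_nofs_restore() from
<linux/sched/mm.h>.  Everything named model_* below, and the flag values,
are made up for illustration and are not kernel definitions; this is a
sketch of the idea, not how the allocator is actually implemented.

```c
#include <assert.h>

/* Hypothetical stand-ins for gfp bits; the values are illustrative only. */
#define MODEL_GFP_IO     0x1u
#define MODEL_GFP_FS     0x2u
#define MODEL_GFP_KERNEL (MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOFS   (MODEL_GFP_IO)

/*
 * Models the per-task "no FS reclaim" state.  The kernel keeps this per
 * task; a single global is enough for a single-threaded model.
 */
static unsigned int model_task_nofs;

/* Enter a NOFS scope and hand back the previous state so scopes nest. */
static unsigned int model_nofs_save(void)
{
	unsigned int old = model_task_nofs;

	model_task_nofs = 1;
	return old;
}

/* Leave the scope by restoring whatever state the matching save saw. */
static void model_nofs_restore(unsigned int old)
{
	model_task_nofs = old;
}

/* The mask an allocation effectively runs with inside or outside a scope. */
static unsigned int model_effective_gfp(unsigned int gfp)
{
	if (model_task_nofs)
		gfp &= ~MODEL_GFP_FS;
	return gfp;
}

/*
 * Stands in for a callee with a hidden GFP_KERNEL allocation that the
 * caller cannot pass a gfp mask to.
 */
static unsigned int model_hidden_alloc(void)
{
	return model_effective_gfp(MODEL_GFP_KERNEL);
}

/* Walks through the scenario; returns 0 if the model behaves as described. */
int model_demo(void)
{
	unsigned int outside, inside, after, flags;

	outside = model_hidden_alloc();	/* no scope: FS reclaim allowed */

	flags = model_nofs_save();
	inside = model_hidden_alloc();	/* scoped: FS bit masked off */
	model_nofs_restore(flags);

	after = model_hidden_alloc();	/* scope closed: back to normal */

	assert(outside == MODEL_GFP_KERNEL);
	assert(inside == MODEL_GFP_NOFS);
	assert(after == MODEL_GFP_KERNEL);
	return 0;
}
```

The save call returns the previous state rather than simply clearing the
flag on exit, so a nested scope does not end the outer scope early.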

> I haven't found any instance of such lockdep reports in my logs (over a
> long period), so it's quite unlikely to end up in the recursive
> allocation.
> 
> Patch added to next, thanks. 

When hunting to see if this had already been fixed, I did find two
reports.  One from Qu from April of last year and another from Mike
Galbraith in 2016.

-Jeff

-- 
Jeff Mahoney
SUSE Labs
