Hi Ritesh,

On 2018/11/20 18:33, Ritesh Harjani wrote:
> 
> On 11/20/2018 2:33 AM, Jaegeuk Kim wrote:
>> On 11/19, Ritesh Harjani wrote:
>>> Hi Chao,
>>>
>>> On 11/19/2018 12:09 PM, Chao Yu wrote:
>>>> Hi Ritesh,
>>>>
>>>> On 2018/11/19 14:02, Ritesh Harjani wrote:
>>>>> This is seen when CP_ERROR_FLAG is not set & the FS may be corrupted.
>>>>> A case was observed where a stale dirty inode pointer is still
>>>>> present in the gdirty_list, causing a panic on access while doing
>>>>> the checkpoint operation.
>>>>>
>>>>> WARNING: CPU: 3 PID: 1827 at kernel/msm-4.14/fs/f2fs/inode.c:567
>>>>> f2fs_evict_inode+0x364/0x37c
>>>>> <...>
>>>>> [42246.776289] BUG: spinlock bad magic on CPU#4, 1245
>>>>> [42246.782674] Unable to handle kernel paging request at virtual
>>>>> address 6b6b6b6b6b713b
>>>>> <...>
>>>>> [42246.896370] task: ffffffc0f0434080 task.stack: ffffff8023ea0000
>>>>> [42246.902465] pc : spin_bug+0x80/0xb8
>>>>> [42246.906055] lr : spin_bug+0x64/0xb8
>>>>> <...>
>>>>> [42247.122346] Call trace:
>>>>> [42247.124876] spin_bug+0x80/0xb8
>>>>> [42247.128110] do_raw_spin_lock+0xe8/0x118
>>>>> [42247.132144] _raw_spin_lock+0x24/0x30
>>>>> [42247.135916] igrab+0x20/0x6c
>>>>> [42247.138894] f2fs_sync_inode_meta+0x58/0xc0
>>>>> [42247.143199] write_checkpoint+0x1c4/0xecc
>>>>> [42247.147322] f2fs_sync_fs+0x118/0x170
>>>>> [42247.151096] f2fs_do_sync_file+0x4f0/0x798
>>>>> [42247.155311] f2fs_sync_file+0x54/0x6c
>>>>> [42247.159087] vfs_fsync_range+0x90/0xac
>>>>> [42247.162950] vfs_fsync+0x2c/0x38
>>>>> [42247.166278] do_fsync+0x3c/0x78
>>>>> [42247.169515] SyS_fdatasync+0x20/0x30
>>>>>
>>>>> Signed-off-by: Ritesh Harjani <[email protected]>
>>>>> ---
>>>>>  fs/f2fs/inode.c | 10 ++++++----
>>>>>  1 file changed, 6 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
>>>>> index 91ceee0..c57f636 100644
>>>>> --- a/fs/f2fs/inode.c
>>>>> +++ b/fs/f2fs/inode.c
>>>>> @@ -702,11 +702,13 @@ void f2fs_evict_inode(struct inode *inode)
>>>>>  	stat_dec_inline_dir(inode);
>>>>>  	stat_dec_inline_inode(inode);
>>>>>  
>>>>> -	if (likely(!is_set_ckpt_flags(sbi, CP_ERROR_FLAG) &&
>>>>> -			!is_sbi_flag_set(sbi, SBI_CP_DISABLED)))
>>>>> -		f2fs_bug_on(sbi, is_inode_flag_set(inode, FI_DIRTY_INODE));
>>>>> -	else
>>>>> +	if (unlikely(is_inode_flag_set(inode, FI_DIRTY_INODE))) {
>>>>>  		f2fs_inode_synced(inode);
>>>>> +		f2fs_msg(sbi->sb, KERN_WARNING,
>>>>> +			"inconsistent dirty inode:%u entry found during eviction\n",
>>>>> +			inode->i_ino);
>>>>> +		f2fs_bug_on(sbi, 1);
>>>>
>>>> IIRC, Jaegeuk added the below condition to avoid f2fs_bug_on while
>>>> doing tests w/ checkpoint error injection; if we remove this, we may
>>>> still encounter such a problem.
>>>>
>>>> 	if (likely(!is_set_ckpt_flags(sbi, CP_ERROR_FLAG)))
>>>
>>> Ok, agreed. Does the below sound good then?
>>> The idea is to go ahead and call f2fs_inode_synced(inode) when the
>>> inode's FI_DIRTY_INODE flag is set, irrespective of whether
>>> CP_ERROR_FLAG is set or not, because otherwise a stale inode entry
>>> will remain in gdirty_list => causing a kernel panic in the
>>> checkpoint path.
>>>
>>> diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
>>> index 91ceee0..00915c2 100644
>>> --- a/fs/f2fs/inode.c
>>> +++ b/fs/f2fs/inode.c
>>> @@ -702,11 +702,15 @@ void f2fs_evict_inode(struct inode *inode)
>>>  	stat_dec_inline_dir(inode);
>>>  	stat_dec_inline_inode(inode);
>>>  
>>> -	if (likely(!is_set_ckpt_flags(sbi, CP_ERROR_FLAG) &&
>>> -			!is_sbi_flag_set(sbi, SBI_CP_DISABLED)))
>>> -		f2fs_bug_on(sbi, is_inode_flag_set(inode, FI_DIRTY_INODE));
>>> -	else
>>> +	if (unlikely(is_inode_flag_set(inode, FI_DIRTY_INODE))) {
>>>  		f2fs_inode_synced(inode);
>>> +		f2fs_msg(sbi->sb, KERN_WARNING,
>>> +			"inconsistent dirty inode:%u entry found during eviction\n",
>>> +			inode->i_ino);
>>> +		if ((!is_set_ckpt_flags(sbi, CP_ERROR_FLAG) &&
>>> +			!is_sbi_flag_set(sbi, SBI_CP_DISABLED)))
>>> +			f2fs_bug_on(sbi, 1);
>>> +	}
>>>
>>>> So I'd like to know what kind of case can make a dirty inode during
>>>> evict(); can you explain more?
>>>
>>> Yes, we also could not get the exact path by which this can happen,
>>> but below are the parallel ongoing contexts when the issue is seen.
>>> We do suspect that something was already wrong in the FS even before
>>> this issue occurred (due to the "inconsistent node block" logs); this
>>> could be due to some other underlying storage driver issue, though we
>>> are not sure.
>>>
>>> 1. unlinkat operation causing f2fs_evict_inode to hit the warning (as
>>>    mentioned in the commit text).
>>> 2. echo 3 > /proc/sys/vm/drop_caches.
>>> 3. Rename operation on some file.
>>> 4. vfs_fsync -> doing checkpointing => this causes a kernel panic,
>>>    since a stale entry is still present for an inode which was already
>>>    freed via the f2fs_evict_inode path.
>>>
>>> Some error logs which were seen long before this issue occurred:
>>>
>>> [42219.089810] F2FS-fs (mmcblk0p75): inconsistent node block, nid:50018,
>>> node_footer[nid:50327,ino:50327,ofs:0,cpver:12146505996282773182,blkaddr:515484]
>>> [42219.104281] F2FS-fs (mmcblk0p75): inconsistent node block, nid:49836,
>>> node_footer[nid:50228,ino:50228,ofs:0,cpver:10230247294256961017,blkaddr:1496346]
>>> [42219.118723] F2FS-fs (mmcblk0p75): inconsistent node block, nid:50327,
>>> node_footer[nid:0,ino:0,ofs:0,cpver:0,blkaddr:0]
>>> [42219.130782] F2FS-fs (mmcblk0p75): inconsistent node block, nid:50228,
>>> node_footer[nid:0,ino:0,ofs:0,cpver:0,blkaddr:0]
>>
>> I'm concerned that this patch hides the bug and makes the partition
>> even more corrupted. We have to figure out where such node blocks were
>> generated.
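Just to confirm we are describing the same failure mode: IIUC the crash
boils down to the pattern in the below userspace sketch, where an object
is freed while still linked on a shared dirty list and a later list
walker (here, the checkpoint path via igrab()) dereferences the freed
memory. The names are simplified stand-ins of my own, not the real f2fs
structures, locking, or refcounting; unlinking before the free is what
f2fs_inode_synced() is expected to guarantee for gdirty_list:

	/*
	 * Userspace sketch of the lifetime bug (simplified stand-ins,
	 * not the real f2fs structures, locking, or refcounting).
	 */
	#include <stdio.h>
	#include <stdlib.h>

	struct obj {
		unsigned long ino;	/* stand-in for inode->i_ino */
		int dirty;		/* stand-in for FI_DIRTY_INODE */
		struct obj *next;	/* stand-in for gdirty_list linkage */
	};

	static struct obj *dirty_list;	/* stand-in for the global dirty list */

	static void mark_dirty(struct obj *o)
	{
		o->dirty = 1;
		o->next = dirty_list;
		dirty_list = o;
	}

	/* what f2fs_inode_synced() must guarantee: unlink before the free */
	static void clear_dirty(struct obj *o)
	{
		struct obj **p;

		for (p = &dirty_list; *p && *p != o; p = &(*p)->next)
			;
		if (*p)
			*p = o->next;
		o->dirty = 0;
	}

	static void evict(struct obj *o)
	{
		if (o->dirty)		/* the inconsistent case the patch handles */
			clear_dirty(o);	/* skip this and a stale pointer remains */
		free(o);
	}

	/* stand-in for f2fs_sync_inode_meta() walking the list at checkpoint */
	static void sync_dirty(void)
	{
		struct obj *o;

		for (o = dirty_list; o; o = o->next)	/* UAF here if o was freed */
			printf("syncing ino %lu\n", o->ino);
	}

	int main(void)
	{
		struct obj *o = calloc(1, sizeof(*o));

		o->ino = 50327;
		mark_dirty(o);
		evict(o);	/* evicted while still dirty */
		sync_dirty();	/* safe only because evict() unlinked first */
		return 0;
	}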
That said, I agree with Jaegeuk: we'd better figure out the root cause of
this issue instead of hiding it.

> This issue seems to be mostly due to a problem in the underlying
> storage driver. As of now the suspect is not f2fs, since we don't see
> this issue with a different configuration of the storage driver.
> The above patch is mainly to fix a kernel crash when the system/FS has
> actually gone bad due to some other subsystem causing corruption.
> We thought it is a good fix to have because, in any case, if a dirty
> inode is found during eviction, we should remove any stale entries of
> it from the in-memory data structures before freeing the inode (with a
> warning message).

Thanks for the explanation; I can understand your concern. If you worry
about further corruption, would it be better to stop checkpoint in
addition for this case in qcom's kernel?
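Something like the below, perhaps, on top of your v2 diff. This is
untested and just to show the idea; IIRC f2fs_stop_checkpoint() sets
CP_ERROR_FLAG so that later checkpoints are refused. (Also note that
i_ino is unsigned long, so the format should be %lu.)

	if (unlikely(is_inode_flag_set(inode, FI_DIRTY_INODE))) {
		f2fs_inode_synced(inode);
		f2fs_msg(sbi->sb, KERN_WARNING,
			"inconsistent dirty inode:%lu entry found during eviction\n",
			inode->i_ino);
		if (!is_set_ckpt_flags(sbi, CP_ERROR_FLAG) &&
				!is_sbi_flag_set(sbi, SBI_CP_DISABLED))
			f2fs_bug_on(sbi, 1);
		/* refuse further checkpoints on a known-inconsistent image */
		f2fs_stop_checkpoint(sbi, false);
	}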
>
>> How many patches have you cherry-picked? Which kernel version are you
>> using?
>
> Kernel 4.14. Yes, we do have all the patches pulled from android aosp.

You mean the code under the below tag?

https://android.googlesource.com/kernel/common/+/ASB-2018-11-05_4.14

Thanks,

>> Have you enabled ICE?
>
> Yes. ICE is enabled.
>
>
> Thanks
> Ritesh
>
>>
>> Thanks,
>>
>>
>>> Thanks,
>>>
>>>> Thanks,
>>>>
>>>>> +	}
>>>>>  	/* ino == 0, if f2fs_new_inode() was failed t*/
>>>>>  	if (inode->i_ino)
>>>>>
>
>
> .
>

_______________________________________________
Linux-f2fs-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel