Hi Ritesh,

On 2018/11/20 18:33, Ritesh Harjani wrote:
> 
> On 11/20/2018 2:33 AM, Jaegeuk Kim wrote:
>> On 11/19, Ritesh Harjani wrote:
>>> Hi Chao,
>>>
>>> On 11/19/2018 12:09 PM, Chao Yu wrote:
>>>> Hi Ritesh,
>>>>
>>>> On 2018/11/19 14:02, Ritesh Harjani wrote:
>>>>> This is seen when CP_ERROR_FLAG is not set & the FS may be corrupted.
>>>>> A case was observed where a stale dirty inode pointer is still
>>>>> present in the gdirty_list, causing a panic on access while doing
>>>>> the checkpoint operation.
>>>>>
>>>>> WARNING: CPU: 3 PID: 1827 at kernel/msm-4.14/fs/f2fs/inode.c:567
>>>>> f2fs_evict_inode+0x364/0x37c
>>>>> <...>
>>>>> [42246.776289] BUG: spinlock bad magic on CPU#4, 1245
>>>>> [42246.782674] Unable to handle kernel paging request at virtual
>>>>> address 6b6b6b6b6b713b
>>>>> <...>
>>>>> [42246.896370] task: ffffffc0f0434080 task.stack: ffffff8023ea0000
>>>>> [42246.902465] pc : spin_bug+0x80/0xb8
>>>>> [42246.906055] lr : spin_bug+0x64/0xb8
>>>>> <...>
>>>>> [42247.122346] Call trace:
>>>>> [42247.124876] spin_bug+0x80/0xb8
>>>>> [42247.128110] do_raw_spin_lock+0xe8/0x118
>>>>> [42247.132144] _raw_spin_lock+0x24/0x30
>>>>> [42247.135916] igrab+0x20/0x6c
>>>>> [42247.138894] f2fs_sync_inode_meta+0x58/0xc0
>>>>> [42247.143199] write_checkpoint+0x1c4/0xecc
>>>>> [42247.147322] f2fs_sync_fs+0x118/0x170
>>>>> [42247.151096] f2fs_do_sync_file+0x4f0/0x798
>>>>> [42247.155311] f2fs_sync_file+0x54/0x6c
>>>>> [42247.159087] vfs_fsync_range+0x90/0xac
>>>>> [42247.162950] vfs_fsync+0x2c/0x38
>>>>> [42247.166278] do_fsync+0x3c/0x78
>>>>> [42247.169515] SyS_fdatasync+0x20/0x30
>>>>>
>>>>> Signed-off-by: Ritesh Harjani <[email protected]>
>>>>> ---
>>>>>  fs/f2fs/inode.c | 10 ++++++----
>>>>>  1 file changed, 6 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
>>>>> index 91ceee0..c57f636 100644
>>>>> --- a/fs/f2fs/inode.c
>>>>> +++ b/fs/f2fs/inode.c
>>>>> @@ -702,11 +702,13 @@ void f2fs_evict_inode(struct inode *inode)
>>>>>  	stat_dec_inline_dir(inode);
>>>>>  	stat_dec_inline_inode(inode);
>>>>>  
>>>>> -	if (likely(!is_set_ckpt_flags(sbi, CP_ERROR_FLAG) &&
>>>>> -			!is_sbi_flag_set(sbi, SBI_CP_DISABLED)))
>>>>> -		f2fs_bug_on(sbi, is_inode_flag_set(inode, FI_DIRTY_INODE));
>>>>> -	else
>>>>> +	if (unlikely(is_inode_flag_set(inode, FI_DIRTY_INODE))) {
>>>>>  		f2fs_inode_synced(inode);
>>>>> +		f2fs_msg(sbi->sb, KERN_WARNING,
>>>>> +			"inconsistent dirty inode:%u entry found during eviction\n",
>>>>> +			inode->i_ino);
>>>>> +		f2fs_bug_on(sbi, 1);
>>>>
>>>> IIRC, Jaegeuk added the below condition to avoid f2fs_bug_on while
>>>> doing tests w/ checkpoint error injection; if we remove this, we may
>>>> still encounter such a problem.
>>>>
>>>> 	if (likely(!is_set_ckpt_flags(sbi, CP_ERROR_FLAG)))
>>>
>>> Ok, agreed. Does the below sound good then?
>>> The idea is to go ahead and call f2fs_inode_synced(inode) when the
>>> inode's FI_DIRTY_INODE flag is set, irrespective of whether
>>> CP_ERROR_FLAG is set or not, because otherwise a stale inode entry
>>> will remain in gdirty_list => causing a kernel panic in the
>>> checkpoint path.
>>>
>>> diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
>>> index 91ceee0..00915c2 100644
>>> --- a/fs/f2fs/inode.c
>>> +++ b/fs/f2fs/inode.c
>>> @@ -702,11 +702,15 @@ void f2fs_evict_inode(struct inode *inode)
>>>  	stat_dec_inline_dir(inode);
>>>  	stat_dec_inline_inode(inode);
>>>  
>>> -	if (likely(!is_set_ckpt_flags(sbi, CP_ERROR_FLAG) &&
>>> -			!is_sbi_flag_set(sbi, SBI_CP_DISABLED)))
>>> -		f2fs_bug_on(sbi, is_inode_flag_set(inode, FI_DIRTY_INODE));
>>> -	else
>>> +	if (unlikely(is_inode_flag_set(inode, FI_DIRTY_INODE))) {
>>>  		f2fs_inode_synced(inode);
>>> +		f2fs_msg(sbi->sb, KERN_WARNING,
>>> +			"inconsistent dirty inode:%u entry found during eviction\n",
>>> +			inode->i_ino);
>>> +		if ((!is_set_ckpt_flags(sbi, CP_ERROR_FLAG) &&
>>> +			!is_sbi_flag_set(sbi, SBI_CP_DISABLED)))
>>> +			f2fs_bug_on(sbi, 1);
>>> +	}
>>>
>>>> So I'd like to know what kind of case can make a dirty inode during
>>>> evict(); can you explain more?
>>>
>>> Yes, we also could not get the exact path by which this can happen,
>>> but below are the parallel ongoing contexts when the issue is seen.
>>> We do suspect that something was already wrong in the FS even before
>>> this issue occurred (due to the "inconsistent node block" logs); this
>>> could be due to some other underlying storage driver issue, though we
>>> are not sure.
>>>
>>> 1. unlinkat operation causing f2fs_evict_inode to hit the warning (as
>>>    mentioned in the commit text).
>>> 2. echo 3 > /proc/sys/vm/drop_caches.
>>> 3. Rename operation on some file.
>>> 4. vfs_fsync -> doing checkpointing => this causes a kernel panic,
>>>    since a stale entry is still present for an inode which was already
>>>    freed via the f2fs_evict_inode path.
>>>
>>> Some error logs which were seen long before this issue occurred:
>>>
>>> [42219.089810] F2FS-fs (mmcblk0p75): inconsistent node block, nid:50018,
>>> node_footer[nid:50327,ino:50327,ofs:0,cpver:12146505996282773182,blkaddr:515484]
>>> [42219.104281] F2FS-fs (mmcblk0p75): inconsistent node block, nid:49836,
>>> node_footer[nid:50228,ino:50228,ofs:0,cpver:10230247294256961017,blkaddr:1496346]
>>> [42219.118723] F2FS-fs (mmcblk0p75): inconsistent node block, nid:50327,
>>> node_footer[nid:0,ino:0,ofs:0,cpver:0,blkaddr:0]
>>> [42219.130782] F2FS-fs (mmcblk0p75): inconsistent node block, nid:50228,
>>> node_footer[nid:0,ino:0,ofs:0,cpver:0,blkaddr:0]
>>
>> I'm concerned that this patch hides the bug and makes the partition
>> even more corrupted. We have to figure out where such node blocks were
>> generated.
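Just to confirm we are describing the same failure mode: IIUC the crash
boils down to the pattern in the below userspace sketch, where an object
is freed while still linked on a shared dirty list and a later list
walker (here, the checkpoint path via igrab()) dereferences the freed
memory. The names are simplified stand-ins of my own, not the real f2fs
structures, locking, or refcounting; unlinking before the free is what
f2fs_inode_synced() is expected to guarantee for gdirty_list:

	/*
	 * Userspace sketch of the lifetime bug (simplified stand-ins,
	 * not the real f2fs structures, locking, or refcounting).
	 */
	#include <stdio.h>
	#include <stdlib.h>

	struct obj {
		unsigned long ino;	/* stand-in for inode->i_ino */
		int dirty;		/* stand-in for FI_DIRTY_INODE */
		struct obj *next;	/* stand-in for gdirty_list linkage */
	};

	static struct obj *dirty_list;	/* stand-in for the global dirty list */

	static void mark_dirty(struct obj *o)
	{
		o->dirty = 1;
		o->next = dirty_list;
		dirty_list = o;
	}

	/* what f2fs_inode_synced() must guarantee: unlink before the free */
	static void clear_dirty(struct obj *o)
	{
		struct obj **p;

		for (p = &dirty_list; *p && *p != o; p = &(*p)->next)
			;
		if (*p)
			*p = o->next;
		o->dirty = 0;
	}

	static void evict(struct obj *o)
	{
		if (o->dirty)		/* the inconsistent case the patch handles */
			clear_dirty(o);	/* skip this and a stale pointer remains */
		free(o);
	}

	/* stand-in for f2fs_sync_inode_meta() walking the list at checkpoint */
	static void sync_dirty(void)
	{
		struct obj *o;

		for (o = dirty_list; o; o = o->next)	/* UAF here if o was freed */
			printf("syncing ino %lu\n", o->ino);
	}

	int main(void)
	{
		struct obj *o = calloc(1, sizeof(*o));

		o->ino = 50327;
		mark_dirty(o);
		evict(o);	/* evicted while still dirty */
		sync_dirty();	/* safe only because evict() unlinked first */
		return 0;
	}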
That said, I agree with Jaegeuk: we'd better figure out the root cause of
this issue instead of hiding it.

> This issue seems to be mostly due to a problem in the underlying
> storage driver. As of now the suspect is not f2fs, since we don't see
> this issue with a different configuration of the storage driver.
> The above patch is mainly to fix a kernel crash when the system/FS has
> actually gone bad due to some other subsystem causing corruption.
> We thought it is a good fix to have because, in any case, if a dirty
> inode is found during eviction, we should remove any stale entries of
> it from the in-memory data structures before freeing the inode (with a
> warning message).

Thanks for the explanation; I can understand your concern. If you worry
about further corruption, would it be better to stop checkpoint in
addition for this case in qcom's kernel?
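Something like the below, perhaps, on top of your v2 diff. This is
untested and just to show the idea; IIRC f2fs_stop_checkpoint() sets
CP_ERROR_FLAG so that later checkpoints are refused. (Also note that
i_ino is unsigned long, so the format should be %lu.)

	if (unlikely(is_inode_flag_set(inode, FI_DIRTY_INODE))) {
		f2fs_inode_synced(inode);
		f2fs_msg(sbi->sb, KERN_WARNING,
			"inconsistent dirty inode:%lu entry found during eviction\n",
			inode->i_ino);
		if (!is_set_ckpt_flags(sbi, CP_ERROR_FLAG) &&
				!is_sbi_flag_set(sbi, SBI_CP_DISABLED))
			f2fs_bug_on(sbi, 1);
		/* refuse further checkpoints on a known-inconsistent image */
		f2fs_stop_checkpoint(sbi, false);
	}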
>
>> How many patches have you cherry-picked? Which kernel version are you
>> using?
>
> Kernel 4.14. Yes, we do have all the patches pulled from android aosp.

You mean the code under the below tag?

https://android.googlesource.com/kernel/common/+/ASB-2018-11-05_4.14

Thanks,

>> Have you enabled ICE?
>
> Yes. ICE is enabled.
>
>
> Thanks
> Ritesh
>
>>
>> Thanks,
>>
>>
>>> Thanks,
>>>
>>>> Thanks,
>>>>
>>>>> +	}
>>>>>  	/* ino == 0, if f2fs_new_inode() was failed t*/
>>>>>  	if (inode->i_ino)
>>>>>
>
>
> .
>

_______________________________________________
Linux-f2fs-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel