On 2018/9/20 6:38, Jaegeuk Kim wrote:
> On 09/19, Chao Yu wrote:
>> On 2018/9/19 0:45, Jaegeuk Kim wrote:
>>> On 09/18, Chao Yu wrote:
>>>> On 2018/9/18 10:05, Jaegeuk Kim wrote:
>>>>> On 09/18, Chao Yu wrote:
>>>>>> On 2018/9/18 9:19, Jaegeuk Kim wrote:
>>>>>>> On 09/13, Chao Yu wrote:
>>>>>>>> On 2018/9/13 3:54, Jaegeuk Kim wrote:
>>>>>>>>> On 09/12, Chao Yu wrote:
>>>>>>>>>> On 2018/9/12 9:40, Chao Yu wrote:
>>>>>>>>>>> On 2018/9/12 9:25, Jaegeuk Kim wrote:
>>>>>>>>>>>> On 09/12, Chao Yu wrote:
>>>>>>>>>>>>> On 2018/9/12 8:27, Jaegeuk Kim wrote:
>>>>>>>>>>>>>> On 09/11, Jaegeuk Kim wrote:
>>>>>>>>>>>>>>> On 09/12, Chao Yu wrote:
>>>>>>>>>>>>>>>> On 2018/9/12 4:15, Jaegeuk Kim wrote:
>>>>>>>>>>>>>>>>> fsck.f2fs is able to recover the quota structure, since roll-forward
>>>>>>>>>>>>>>>>> recovery can recover it based on previous user information.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I don't get it. Both fsck and the kernel recover the quota file based on
>>>>>>>>>>>>>>>> all inodes' uid/gid/prjid; if {x}id didn't change, wouldn't the two
>>>>>>>>>>>>>>>> recovery results be the same?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I thought so too, but had to add this, since I was encountering quota
>>>>>>>>>>>>>>> errors right after getting some files recovered. Also, I thought it'd be
>>>>>>>>>>>>>>> safer to run fsck after roll-forward recovery.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Anyway, let me test again without this patch for a while.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hmm, I just got a fsck failure right after some files recovered.
>>>>>>>>>>>>>
>>>>>>>>>>>>> To make sure: are you testing with "f2fs: guarantee journalled quota data
>>>>>>>>>>>>> by checkpoint"? If not, I think there is no guarantee that f2fs can
>>>>>>>>>>>>> recover quota info into a correct quota file, because the quota file may
>>>>>>>>>>>>> have been corrupted/inconsistent in the last checkpoint. Right?
>>>>>>>>>>>
>>>>>>>>>>> Oh, I forgot to mention that I added a patch to fsck to let it notice the
>>>>>>>>>>> CP_QUOTA_NEED_FSCK_FLAG flag; by default, fsck will fix a corrupted quota
>>>>>>>>>>> file if the flag is set. But if fsck still detects a corrupted quota file
>>>>>>>>>>> w/o this flag, I guess there is a bug in v8.
>>>>>>>>>>
>>>>>>>>>> In v8, there are two cases where we didn't guarantee the quota file's
>>>>>>>>>> consistency:
>>>>>>>>>> 1. flush time in block_operation exceeds a threshold.
>>>>>>>>>> 2. a dquot subsystem error occurs.
>>>>>>>>>>
>>>>>>>>>> For the above cases, fsck should repair the quota file by default.
>>>>>>>>>
>>>>>>>>> Okay, I got another failure and it seems CP_QUOTA_NEED_FSCK_FLAG was not
>>>>>>>>> set during the recovery. So, we have something missing in the recovery in
>>>>>>>>> terms of quota updates.
>>>>>>>>
>>>>>>>> Yeah, I checked the code and found one suspicious place:
>>>>>>>>
>>>>>>>> find_fsync_dnodes()
>>>>>>>>  - f2fs_recover_inode_page
>>>>>>>>   - inc_valid_node_count
>>>>>>>>    - dquot_reserve_block  dquot info is not initialized now
>>>>>>>>  - add_fsync_inode
>>>>>>>>   - dquot_initialize
>>>>>>>>
>>>>>>>> I think we should reserve the block for the inode block after
>>>>>>>> dquot_initialize(); can you confirm this?
>>>>>>>
>>>>>>> Let me test this.
>>>>>>>
>>>>>>> From b90260bc577fe87570b1ef7b134554a8295b1f6c Mon Sep 17 00:00:00 2001
>>>>>>> From: Jaegeuk Kim <jaeg...@kernel.org>
>>>>>>> Date: Mon, 17 Sep 2018 18:14:41 -0700
>>>>>>> Subject: [PATCH] f2fs: count inode block for recovered files
>>>>>>>
>>>>>>> If a new file is recovered, we failed to reserve its inode block.
>>>>>>
>>>>>> As I remember, to stay in line with other filesystems, and unlike on-disk
>>>>>> where we have to keep backward compatibility, in memory we don't account
>>>>>> a block count for f2fs' inode block, only an inode count. So here, as in
>>>>>> inc_valid_node_count(), we don't need to do this.
>>>>>
>>>>> Okay, I just hit the error again w/o your patch. Another cause that comes
>>>>> to mind is a uid/gid change during recovery. Let me try out your patch.
>>>>
>>>> I guess we should update dquot and inode's uid/gid atomically under
>>>> lock_op() in f2fs_setattr() to prevent corruption on sys quota file.
>>>>
>>>> v9 passes all xfstest cases and the por_fsstress case w/ sys quota file
>>>> enabled, but w/ normal quota file, I got one regression reported by
>>>> generic/232. I fixed it in v10; I will do some tests and release it later.
>>>>
>>>> Note that, my fsck can fix corrupted quota file automatically once
>>>> CP_QUOTA_NEED_FSCK_FLAG is set.
>>>
>>> I hit failures again with your v9 w/ sysfile quota and modified fsck to 
>>> detect
>>
>> That's strange. In my environment, before v9, I always encountered a corrupted
>> quota sysfile after step 9); with v9, I never hit the failure again.
>>
>> 1) enable fault injection
>> 2) run fsstress
>> 3) call shutdown
>> 4) kill fsstress
>> 5) unmount
>> 6) fsck
>> 7) mount
>> 8) umount
>> 9) fsck
>> 10) go to 1).
>>
>>> CP_QUOTA_NEED_FSCK_FLAG to fix the partition. Note that, if I set NEED_FSCK
>>> flag in roll-forward recovery, everything is fine.
>>
>> I ran the test based on the code in my git tree; could you check the result
>> again based on my code? There I just disable nat_bits recovery, since I'm not
>> sure whether fsck in step 6) can break something in the image.
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=f2fs-dev
>>
>> Also, I just sent the fsck code, could you check that too?
>>
>> And I'd like to know your mount and mkfs options, could you list them for me?
> 
> I'm just doing this.
> https://github.com/jaegeuk/xfstests-f2fs/blob/f2fs/run.sh#L220

I just sent a patch to fix a POR issue where the inode's uid/gid was not
recovered.

[PATCH] f2fs: fix to recover inode's uid/gid during POR

After applying this patch, I can reproduce sys quota file corruption... let
me figure out the solution.
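
One way the uid/gid change discussed earlier in this thread could lead to
such a mismatch, as a toy user-space model only. This is not f2fs code, and
every name in it (toy_quota, toy_inode, toy_chown, toy_recount,
toy_consistent) is hypothetical. It assumes the check recounts usage from
each inode's current owner, and shows that changing the owner without moving
the existing charge leaves the usage table disagreeing with that recount:

```c
#include <string.h>

#define TOY_MAX_ID 4

/* Hypothetical per-uid usage table, standing in for a quota file. */
struct toy_quota {
	unsigned int usage[TOY_MAX_ID];
};

/* Hypothetical inode: its current owner and the blocks charged for it. */
struct toy_inode {
	unsigned int uid;
	unsigned int blocks;
};

/*
 * Change the owner of an inode. With transfer != 0 the existing charge
 * moves to the new owner in the same step; with transfer == 0 only the
 * uid changes and the old owner keeps the charge.
 */
static void toy_chown(struct toy_quota *q, struct toy_inode *ino,
		      unsigned int new_uid, int transfer)
{
	if (transfer) {
		q->usage[ino->uid] -= ino->blocks;
		q->usage[new_uid] += ino->blocks;
	}
	ino->uid = new_uid;
}

/* Rebuild the usage table from the inode's current owner. */
static void toy_recount(struct toy_quota *q, const struct toy_inode *ino)
{
	memset(q, 0, sizeof(*q));
	q->usage[ino->uid] += ino->blocks;
}

/* Return 1 if the live table matches the recount after an owner change. */
static int toy_consistent(int transfer)
{
	struct toy_inode ino = { .uid = 1, .blocks = 8 };
	struct toy_quota live = { { 0 } };
	struct toy_quota recounted;

	live.usage[ino.uid] = ino.blocks;	/* state before the change */
	toy_chown(&live, &ino, 2, transfer);	/* owner becomes uid 2 */
	toy_recount(&recounted, &ino);

	return memcmp(&live, &recounted, sizeof(live)) == 0;
}
```

In this model toy_consistent(0) reports a mismatch and toy_consistent(1)
does not. Whether the real corruption follows the same pattern is still an
open question here.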

Thanks,

> 
>>
>> Thanks,
>>
>>>
>>>>
>>>> Thanks,
>>>>
>>>>>
>>>>>>
>>>>>> Can you test v9 first? I didn't encounter quota corruption with your
>>>>>> testcase right now. Will check it in cell phone environment.
>>>>>>
>>>>>>>
>>>>>>> Signed-off-by: Chao Yu <yuch...@huawei.com>
>>>>>>> Signed-off-by: Jaegeuk Kim <jaeg...@kernel.org>
>>>>>>> ---
>>>>>>>  fs/f2fs/recovery.c | 5 +++++
>>>>>>>  1 file changed, 5 insertions(+)
>>>>>>>
>>>>>>> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
>>>>>>> index 56d34193a74b..bff5cf730e13 100644
>>>>>>> --- a/fs/f2fs/recovery.c
>>>>>>> +++ b/fs/f2fs/recovery.c
>>>>>>> @@ -84,6 +84,11 @@ static struct fsync_inode_entry *add_fsync_inode(struct f2fs_sb_info *sbi,
>>>>>>>                 err = dquot_alloc_inode(inode);
>>>>>>>                 if (err)
>>>>>>>                         goto err_out;
>>>>>>> +               err = dquot_reserve_block(inode, 1);
>>>>>>> +               if (err) {
>>>>>>> +                       dquot_drop(inode);
>>>>>>> +                       goto err_out;
>>>>>>> +               }
>>>>>>>         }
>>>>>>>  
>>>>>>>         entry = f2fs_kmem_cache_alloc(fsync_entry_slab, GFP_F2FS_ZERO);
>>>>>>>
>>>>>
>>>>> .
>>>>>
>>>
>>> .
>>>
> 
> .
> 



_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel