> On Aug 21, 2015, at 8:48 PM, Chao Yu <[email protected]> wrote:
> 
> Hi Jaegeuk,
> 
>> -----Original Message-----
>> From: Jaegeuk Kim [mailto:[email protected]]
>> Sent: Thursday, August 20, 2015 11:35 PM
>> To: Chao Yu
>> Cc: [email protected]; [email protected];
>> [email protected]
>> Subject: Re: [f2fs-dev] [PATCH 5/5] f2fs: check the node block address of 
>> newly allocated nid
>> 
>> On Thu, Aug 20, 2015 at 05:12:03PM +0800, Chao Yu wrote:
>>> Hi Jaegeuk,
>>> 
>>>> -----Original Message-----
>>>> From: Jaegeuk Kim [mailto:[email protected]]
>>>> Sent: Tuesday, August 18, 2015 4:46 PM
>>>> To: [email protected]; [email protected];
>>>> [email protected]
>>>> Cc: Jaegeuk Kim
>>>> Subject: [f2fs-dev] [PATCH 5/5] f2fs: check the node block address of 
>>>> newly allocated nid
>>>> 
>>>> This patch adds a routine which checks the block address of newly 
>>>> allocated nid.
>>>> If an nid has already allocated by other thread due to subtle data races, 
>>>> it
>>>> will result in filesystem corruption.
>>>> So, it needs to check whether its block address was already allocated or 
>>>> not
>>>> in prior to nid allocation as the last chance.
>>>> 
>>>> Signed-off-by: Jaegeuk Kim <[email protected]>
>>>> ---
>>>> fs/f2fs/node.c | 9 +++++++++
>>>> 1 file changed, 9 insertions(+)
>>>> 
>>>> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
>>>> index 3cc32b8..6bef5a2 100644
>>>> --- a/fs/f2fs/node.c
>>>> +++ b/fs/f2fs/node.c
>>>> @@ -1573,6 +1573,8 @@ retry:
>>>> 
>>>>    /* We should not use stale free nids created by build_free_nids */
>>>>    if (nm_i->fcnt && !on_build_free_nids(nm_i)) {
>>>> +          struct node_info ni;
>>>> +
>>>>            f2fs_bug_on(sbi, list_empty(&nm_i->free_nid_list));
>>>>            list_for_each_entry(i, &nm_i->free_nid_list, list)
>>>>                    if (i->state == NID_NEW)
>>>> @@ -1583,6 +1585,13 @@ retry:
>>>>            i->state = NID_ALLOC;
>>>>            nm_i->fcnt--;
>>>>            spin_unlock(&nm_i->free_nid_list_lock);
>>>> +
>>>> +          /* check nid is allocated already */
>>>> +          get_node_info(sbi, *nid, &ni);
>>>> +          if (ni.blk_addr != NULL_ADDR) {
>>> 
>>> I didn't get it, why free nid is with non-NULL blkaddr?
>>> Could you please explain more about this?
>> 
>> As I wrote in the description, I've been suffering from wrongly added free 
>> nids
>> which results in fs corruption. I suspect somewhat race condition in
>> build_free_nids, but it is very subtle to figure out exactly.
>> So, I wrote this patch to fix that.
>> 
>> The concern would be performance regarding to cold cache miss at an NAT 
>> entry.
>> However, I expect that it would be tolerable since get_node_info will be 
>> called
>> after alloc_nid later.
> 
> After investigating, I think I can reproduce this bug:
> 
> 1. touch a (nid = 4) & touch b (nid = 5)
> 2. sync
> 3. rm a & rm b
> a) rm a to make next_scan_nid = 4.
> b) I change the logical of f2fs code making remove_inode_page failed when
> file b is being removed, so file b's nat entry is not set dirty;
> 4. sync
> 5. touch 1815 files
> 6. echo 3 > /proc/sys/vm/drop_caches
> drop clean nat entry of inode (nid:5), it makes we can pass blkaddr
> verification in add_free_nid:
>       if (build) {
>               /* do not add allocated nids */
> 
> 7. touch c
> because there is no free nids in cache, we try to build cache by two steps:
> a) build nids by loading from nat pages;
> b) build nids by loading from curseg and try to unload nids which has valid
> blkaddr in curseg.
> 
> unfortunately, our build operation is not atomic, so after step a), nid:5

After rethinking about this issue on the way coming back home, I find that
it seems not right here, because we will try to check build_lock status in
on_build_free_nids, allocation will not happen during building free nid
cache. I missed that previously.

Sorry for my wrong conclusion, please ignore them. :(

I’d like to reinvestigate this issue.

Thanks,

> should be in free nids cache and it should be removed in step b). So all
> free nids allocated between step a) and step b) can be risky of incorrect
> allocation.
> 
> If I'm not miss something, the root casue looks like our recent change:
> allocate free nid aggressively.
> 
> Thanks,
>> 
>>> 
>>>> +                  alloc_nid_done(sbi, *nid);
>>> 
>>> Will another thread call alloc_nid_done too, making this free nid being
>>> released again?
>> 
>> No, its state became NID_ALLOC, so no other thread can pick this up till
>> alloc_nid_done is called.
>> 
>> Thanks,
>> 
>>> 
>>> Thanks,
>>> 
>>>> +                  goto retry;
>>>> +          }
>>>>            return true;
>>>>    }
>>>>    spin_unlock(&nm_i->free_nid_list_lock);
>>>> --
>>>> 2.1.1
>>>> 
>>>> 
>>>> ------------------------------------------------------------------------------
>>>> _______________________________________________
>>>> Linux-f2fs-devel mailing list
>>>> [email protected]
>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> 
> 
> ------------------------------------------------------------------------------
> _______________________________________________
> Linux-f2fs-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to