Hi Jaegeuk,

> -----Original Message-----
> From: Chao Yu [mailto:yuchaoch...@hotmail.com]
> Sent: Friday, August 21, 2015 11:00 PM
> To: Chao Yu
> Cc: Jaegeuk Kim; linux-fsde...@vger.kernel.org; linux-kernel@vger.kernel.org;
> linux-f2fs-de...@lists.sourceforge.net
> Subject: Re: [f2fs-dev] [PATCH 5/5] f2fs: check the node block address of 
> newly allocated nid
> 
> > On Aug 21, 2015, at 8:48 PM, Chao Yu <chao2...@samsung.com> wrote:
> >
> > Hi Jaegeuk,
> >
> >> -----Original Message-----
> >> From: Jaegeuk Kim [mailto:jaeg...@kernel.org]
> >> Sent: Thursday, August 20, 2015 11:35 PM
> >> To: Chao Yu
> >> Cc: linux-kernel@vger.kernel.org; linux-fsde...@vger.kernel.org;
> >> linux-f2fs-de...@lists.sourceforge.net
> >> Subject: Re: [f2fs-dev] [PATCH 5/5] f2fs: check the node block address of 
> >> newly allocated
> nid
> >>
> >> On Thu, Aug 20, 2015 at 05:12:03PM +0800, Chao Yu wrote:
> >>> Hi Jaegeuk,
> >>>
> >>>> -----Original Message-----
> >>>> From: Jaegeuk Kim [mailto:jaeg...@kernel.org]
> >>>> Sent: Tuesday, August 18, 2015 4:46 PM
> >>>> To: linux-kernel@vger.kernel.org; linux-fsde...@vger.kernel.org;
> >>>> linux-f2fs-de...@lists.sourceforge.net
> >>>> Cc: Jaegeuk Kim
> >>>> Subject: [f2fs-dev] [PATCH 5/5] f2fs: check the node block address of 
> >>>> newly allocated nid
> >>>>
> >>>> This patch adds a routine which checks the block address of newly 
> >>>> allocated nid.
> >>>> If an nid has already allocated by other thread due to subtle data 
> >>>> races, it
> >>>> will result in filesystem corruption.
> >>>> So, it needs to check whether its block address was already allocated or 
> >>>> not
> >>>> in prior to nid allocation as the last chance.
> >>>>
> >>>> Signed-off-by: Jaegeuk Kim <jaeg...@kernel.org>
> >>>> ---
> >>>> fs/f2fs/node.c | 9 +++++++++
> >>>> 1 file changed, 9 insertions(+)
> >>>>
> >>>> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> >>>> index 3cc32b8..6bef5a2 100644
> >>>> --- a/fs/f2fs/node.c
> >>>> +++ b/fs/f2fs/node.c
> >>>> @@ -1573,6 +1573,8 @@ retry:
> >>>>
> >>>>  /* We should not use stale free nids created by build_free_nids */
> >>>>  if (nm_i->fcnt && !on_build_free_nids(nm_i)) {
> >>>> +                struct node_info ni;
> >>>> +
> >>>>          f2fs_bug_on(sbi, list_empty(&nm_i->free_nid_list));
> >>>>          list_for_each_entry(i, &nm_i->free_nid_list, list)
> >>>>                  if (i->state == NID_NEW)
> >>>> @@ -1583,6 +1585,13 @@ retry:
> >>>>          i->state = NID_ALLOC;
> >>>>          nm_i->fcnt--;
> >>>>          spin_unlock(&nm_i->free_nid_list_lock);
> >>>> +
> >>>> +                /* check nid is allocated already */
> >>>> +                get_node_info(sbi, *nid, &ni);
> >>>> +                if (ni.blk_addr != NULL_ADDR) {
> >>>
> >>> I didn't get it, why free nid is with non-NULL blkaddr?
> >>> Could you please explain more about this?
> >>
> >> As I wrote in the description, I've been suffering from wrongly added free 
> >> nids
> >> which results in fs corruption. I suspect somewhat race condition in
> >> build_free_nids, but it is very subtle to figure out exactly.
> >> So, I wrote this patch to fix that.
> >>
> >> The concern would be performance regarding to cold cache miss at an NAT 
> >> entry.
> >> However, I expect that it would be tolerable since get_node_info will be 
> >> called
> >> after alloc_nid later.
> >
> > After investigating, I think I can reproduce this bug:
> >
> > 1. touch a (nid = 4) & touch b (nid = 5)
> > 2. sync
> > 3. rm a & rm b
> > a) rm a to make next_scan_nid = 4.
> > b) I change the logical of f2fs code making remove_inode_page failed when
> > file b is being removed, so file b's nat entry is not set dirty;
> > 4. sync
> > 5. touch 1815 files
> > 6. echo 3 > /proc/sys/vm/drop_caches
> > drop clean nat entry of inode (nid:5), it makes we can pass blkaddr
> > verification in add_free_nid:
> >     if (build) {
> >             /* do not add allocated nids */
> >
> > 7. touch c
> > because there is no free nids in cache, we try to build cache by two steps:
> > a) build nids by loading from nat pages;
> > b) build nids by loading from curseg and try to unload nids which has valid
> > blkaddr in curseg.
> >
> > unfortunately, our build operation is not atomic, so after step a), nid:5
> 
> After rethinking about this issue on the way coming back home, I find that
> it seems not right here, because we will try to check build_lock status in
> on_build_free_nids, allocation will not happen during building free nid
> cache. I missed that previously.
> 
> Sorry for my wrong conclusion, please ignore them. :(
> 
> I’d like to reinvestigate this issue.

I reinvestigate this issue and find one possible call path for reproducing this
issue, and I wrote patches for fxing, can you please help to review the 
following
patches?

Thanks,


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to