Hi Yunlei,

> -----Original Message-----
> From: He YunLei [mailto:heyun...@huawei.com]
> Sent: Wednesday, February 24, 2016 3:32 PM
> To: Chao Yu; jaeg...@kernel.org; linux-f2fs-devel@lists.sourceforge.net
> Cc: bintian.w...@huawei.com; 'Biao He'
> Subject: Re: [f2fs-dev] [PATCH] f2fs: avoid hungtask problem caused by losing 
> wake_up
> 
> On 2016/2/24 11:46, Chao Yu wrote:
> > Hi Yunlei,
> >
> >> -----Original Message-----
> >> From: He YunLei [mailto:heyun...@huawei.com]
> >> Sent: Tuesday, February 23, 2016 7:36 PM
> >> To: Chao Yu; jaeg...@kernel.org; linux-f2fs-devel@lists.sourceforge.net
> >> Cc: bintian.w...@huawei.com; 'Biao He'
> >> Subject: Re: [f2fs-dev] [PATCH] f2fs: avoid hungtask problem caused by 
> >> losing wake_up
> >>
> >> On 2016/2/23 17:15, Chao Yu wrote:
> >> Hi Chao,
> >>
> >>> Hi Yunlei,
> >>>
> >>>> -----Original Message-----
> >>>> From: He YunLei [mailto:heyun...@huawei.com]
> >>>> Sent: Tuesday, February 23, 2016 3:03 PM
> >>>> To: Chao Yu; jaeg...@kernel.org; linux-f2fs-devel@lists.sourceforge.net
> >>>> Cc: bintian.w...@huawei.com; 'Biao He'
> >>>> Subject: Re: [f2fs-dev] [PATCH] f2fs: avoid hungtask problem caused by 
> >>>> losing wake_up
> >>>>
> >>>> On 2016/2/23 13:44, Chao Yu wrote:
> >>>>> Hi Yunlei,
> >>>> Hi Chao,
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Yunlei He [mailto:heyun...@huawei.com]
> >>>>>> Sent: Tuesday, February 23, 2016 12:08 PM
> >>>>>> To: chao2...@samsung.com; jaeg...@kernel.org; 
> >>>>>> linux-f2fs-devel@lists.sourceforge.net
> >>>>>> Cc: bintian.w...@huawei.com; Yunlei He; Biao He
> >>>>>> Subject: [f2fs-dev] [PATCH] f2fs: avoid hungtask problem caused by 
> >>>>>> losing wake_up
> >>>>>>
> >>>>>> A task sleeping in D state in wait_on_all_pages_writeback should be
> >>>>>> woken by f2fs_write_end_io once all writeback pages have been
> >>>>>> successfully written to the device. It's possible that the wake_up
> >>>>>> comes between get_pages and io_schedule. In that case the wake_up
> >>>>>> may be lost and the task stays in D state even though all pages
> >>>>>> have been written back to the device, and eventually the whole
> >>>>>> system ends up in the hungtask state.
> >>>>>
> >>>>> I haven't encountered such an issue so far. Have you hit this in the
> >>>>> real world?
> >>>>>
> >>>> Yes, I have encountered it: the whole file system was blocked in
> >>>> wait_on_all_pages_writeback for more than 120s while writing a cp, and
> >>>> no error was reported by the storage device driver.
> >>>
> >>> Is this reproducible? If so, could you please share the details?
> >>> And did this occur on a huge f2fs image?
> >>>
> >>>>>>
> >>>>>>                    if (!get_pages(sbi, F2FS_WRITEBACK))
> >>>>>>                             break;
> >>>>>>                                        <---------  wake_up
> >>>>>
> >>>>> wake_up will put all tasks linked in sbi->cp_wait on the run-queue, so
> >>>>> it should be safe to call io_schedule here; after being rescheduled,
> >>>>> the task gets the chance to check the above condition and break out.
> >>>>>
> >>>>> Thanks,
> >>>>
> >>>> Here, we just suspect that something weird may keep
> >>>> wait_on_all_pages_writeback from being woken. wake_up is triggered only
> >>>> once, by the last bio's end_io function; if the thread happens to miss
> >>>> it, the thread will stay in D state forever. So we change the code to
> >>>> wake wait_on_all_pages_writeback periodically and then check the
> >>>> condition.
> >>>
> >>> Got it.
> >>>
> >>> The patch can fix the issue where the checkpointer waits forever in case
> >>> write_end_io failed to call wake_up for some reason.
> >
> > I found one possible case:
> >
> > CPU0:                                       CPU1:
> >   - write_checkpoint
> >    - do_checkpoint
> >     - wait_on_all_pages_writeback
> >                                      - f2fs_write_end_io
> >                                       - wake_up
> >                                     this is the last writeback page, but
> >                                     there is no sleeper in the sbi->cp_wait
> >                                     wait queue, so wake_up is not called.
> >      - prepare_to_wait(TASK_UNINTERRUPTIBLE)
> >      Here, the current task is preempted,
> >      but there will be no waker to wake up
> >      this task since the last write_end_io
> >      has already been called. So the current
> >      task will sleep forever.
> >      - io_schedule
> >
> > What do you think of it?
> Hi Chao,
> 
> Here, the current task adds itself to the wait queue first, and then checks
> whether the number of writeback pages is zero. So, in the above situation,
> if the current task is preempted in prepare_to_wait(TASK_UNINTERRUPTIBLE),
> it will not sleep, because the number of writeback pages is zero.

Oh, I meant:

 - prepare_to_wait(TASK_UNINTERRUPTIBLE)
 - Preempt
 - if (!get_pages(sbi, F2FS_WRITEBACK)) break;    or maybe preempt here
 - io_schedule
 - finish_wait

Even if no page is left in writeback state, preemption can still happen before
finish_wait. After that, once the task has been switched out, there is no
chance for it to be scheduled again, since its state was set to
TASK_UNINTERRUPTIBLE.
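
As a rough user-space analogy of the "check the count and sleep under one
lock" idea in the patch quoted below, here is a minimal pthreads sketch. It is
not kernel code and does not model preemption of a TASK_UNINTERRUPTIBLE task;
struct wb_state, complete_one_page, wait_on_all_pages and run_demo are
hypothetical names that only stand in for cp_wb_lock, cp_wait and the
F2FS_WRITEBACK page count.

```c
#include <pthread.h>

/* Hypothetical user-space stand-ins for the f2fs state discussed here. */
struct wb_state {
	pthread_mutex_t lock;	/* plays the role of cp_wb_lock */
	pthread_cond_t wait;	/* plays the role of cp_wait */
	int writeback_pages;	/* plays the role of the F2FS_WRITEBACK count */
};

/* Analogue of f2fs_write_end_io(): finish one page, wake waiters on the last. */
static void *complete_one_page(void *arg)
{
	struct wb_state *s = arg;

	pthread_mutex_lock(&s->lock);
	if (--s->writeback_pages == 0)
		pthread_cond_broadcast(&s->wait);
	pthread_mutex_unlock(&s->lock);
	return NULL;
}

/* Analogue of wait_on_all_pages_writeback(): recheck the count under the lock. */
static void wait_on_all_pages(struct wb_state *s)
{
	pthread_mutex_lock(&s->lock);
	while (s->writeback_pages > 0)
		/* Releases the lock and sleeps atomically, so no wakeup is lost. */
		pthread_cond_wait(&s->wait, &s->lock);
	pthread_mutex_unlock(&s->lock);
}

/*
 * Start one completion thread per page, wait for all of them, and return
 * the number of pages still in "writeback" (0 on success, -1 on error).
 */
static int run_demo(int pages)
{
	struct wb_state s = { .writeback_pages = pages };
	pthread_t tids[16];
	int created, left, i;

	if (pages < 0 || pages > 16)
		return -1;

	pthread_mutex_init(&s.lock, NULL);
	pthread_cond_init(&s.wait, NULL);

	for (i = 0; i < pages; i++) {
		if (pthread_create(&tids[i], NULL, complete_one_page, &s) != 0)
			break;
	}
	created = i;

	/* Only wait if every completion was started, otherwise we would hang. */
	if (created == pages)
		wait_on_all_pages(&s);

	for (i = 0; i < created; i++)
		pthread_join(tids[i], NULL);

	left = (created == pages) ? s.writeback_pages : -1;
	pthread_cond_destroy(&s.wait);
	pthread_mutex_destroy(&s.lock);
	return left;
}
```

Calling run_demo(4) from a small main() built with -pthread should return 0
regardless of whether the completions run before or after the waiter starts,
because the count is only checked and changed while holding the lock.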

Thanks,

> 
> Thanks,
> >
> > And if this is right, following patch can fix this issue.
> >
> > ---
> >   fs/f2fs/checkpoint.c | 14 +++++++++-----
> >   fs/f2fs/data.c       |  9 +++++++--
> >   fs/f2fs/f2fs.h       |  3 ++-
> >   fs/f2fs/super.c      |  1 +
> >   4 files changed, 19 insertions(+), 8 deletions(-)
> >
> > diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
> > index 9d277f8..9446c3d 100644
> > --- a/fs/f2fs/checkpoint.c
> > +++ b/fs/f2fs/checkpoint.c
> > @@ -914,15 +914,19 @@ static void wait_on_all_pages_writeback(struct f2fs_sb_info *sbi)
> >   {
> >     DEFINE_WAIT(wait);
> >
> > -   for (;;) {
> > -           prepare_to_wait(&sbi->cp_wait, &wait, TASK_UNINTERRUPTIBLE);
> > +   spin_lock(&sbi->cp_wb_lock);
> >
> > -           if (!get_pages(sbi, F2FS_WRITEBACK))
> > -                   break;
> > +   while (get_pages(sbi, F2FS_WRITEBACK)) {
> > +           prepare_to_wait(&sbi->cp_wait, &wait, TASK_UNINTERRUPTIBLE);
> >
> > +           spin_unlock(&sbi->cp_wb_lock);
> >             io_schedule();
> > +           spin_lock(&sbi->cp_wb_lock);
> > +
> > +           finish_wait(&sbi->cp_wait, &wait);
> >     }
> > -   finish_wait(&sbi->cp_wait, &wait);
> > +
> > +   spin_unlock(&sbi->cp_wb_lock);
> >   }
> >
> >   static int do_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
> > diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> > index e5c762b..e31deb97 100644
> > --- a/fs/f2fs/data.c
> > +++ b/fs/f2fs/data.c
> > @@ -59,6 +59,7 @@ static void f2fs_write_end_io(struct bio *bio)
> >   {
> >     struct f2fs_sb_info *sbi = bio->bi_private;
> >     struct bio_vec *bvec;
> > +   unsigned long flags;
> >     int i;
> >
> >     bio_for_each_segment_all(bvec, bio, i) {
> > @@ -74,8 +75,12 @@ static void f2fs_write_end_io(struct bio *bio)
> >             dec_page_count(sbi, F2FS_WRITEBACK);
> >     }
> >
> > -   if (!get_pages(sbi, F2FS_WRITEBACK) && wq_has_sleeper(&sbi->cp_wait))
> > -           wake_up(&sbi->cp_wait);
> > +   if (!get_pages(sbi, F2FS_WRITEBACK)) {
> > +           spin_lock_irqsave(&sbi->cp_wb_lock, flags);
> > +           if (wq_has_sleeper(&sbi->cp_wait))
> > +                   wake_up(&sbi->cp_wait);
> > +           spin_unlock_irqrestore(&sbi->cp_wb_lock, flags);
> > +   }
> >
> >     bio_put(bio);
> >   }
> > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> > index 0d25430..fd47984 100644
> > --- a/fs/f2fs/f2fs.h
> > +++ b/fs/f2fs/f2fs.h
> > @@ -727,7 +727,8 @@ struct f2fs_sb_info {
> >     struct rw_semaphore cp_rwsem;           /* blocking FS operations */
> >     struct rw_semaphore node_write;         /* locking node writes */
> >     struct mutex writepages;                /* mutex for writepages() */
> > -   wait_queue_head_t cp_wait;
> > +   wait_queue_head_t cp_wait;              /* for wait pages writeback */
> > +   spinlock_t cp_wb_lock;                  /* for protect cp_wait */
> >     unsigned long last_time[MAX_TIME];      /* to store time in jiffies */
> >     long interval_time[MAX_TIME];           /* to store thresholds */
> >
> > diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> > index 7b62016..5316c7a 100644
> > --- a/fs/f2fs/super.c
> > +++ b/fs/f2fs/super.c
> > @@ -1374,6 +1374,7 @@ try_onemore:
> >
> >     init_rwsem(&sbi->cp_rwsem);
> >     init_waitqueue_head(&sbi->cp_wait);
> > +   spin_lock_init(&sbi->cp_wb_lock);
> >     init_sb_info(sbi);
> >
> >     /* get an inode for meta space */
> >



_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
