Hi Chao,

> -----Original Message-----
> From: Chao Yu [mailto:chao2...@samsung.com]
> Sent: Wednesday, February 24, 2016 4:05 PM
> To: heyunlei; jaeg...@kernel.org; linux-f2fs-devel@lists.sourceforge.net
> Cc: Wangbintian; hebiao (G)
> Subject: RE: [f2fs-dev] [PATCH] f2fs: avoid hungtask problem caused by losing
> wake_up
>
> Hi Yunlei,
>
> > -----Original Message-----
> > From: He YunLei [mailto:heyun...@huawei.com]
> > Sent: Wednesday, February 24, 2016 3:32 PM
> > To: Chao Yu; jaeg...@kernel.org;
> > linux-f2fs-devel@lists.sourceforge.net
> > Cc: bintian.w...@huawei.com; 'Biao He'
> > Subject: Re: [f2fs-dev] [PATCH] f2fs: avoid hungtask problem caused by
> > losing wake_up
> >
> > On 2016/2/24 11:46, Chao Yu wrote:
> > > Hi Yunlei,
> > >
> > >> -----Original Message-----
> > >> From: He YunLei [mailto:heyun...@huawei.com]
> > >> Sent: Tuesday, February 23, 2016 7:36 PM
> > >> To: Chao Yu; jaeg...@kernel.org;
> > >> linux-f2fs-devel@lists.sourceforge.net
> > >> Cc: bintian.w...@huawei.com; 'Biao He'
> > >> Subject: Re: [f2fs-dev] [PATCH] f2fs: avoid hungtask problem caused
> > >> by losing wake_up
> > >>
> > >> On 2016/2/23 17:15, Chao Yu wrote:
> > >> Hi Chao,
> > >>
> > >>> Hi Yunlei,
> > >>>
> > >>>> -----Original Message-----
> > >>>> From: He YunLei [mailto:heyun...@huawei.com]
> > >>>> Sent: Tuesday, February 23, 2016 3:03 PM
> > >>>> To: Chao Yu; jaeg...@kernel.org;
> > >>>> linux-f2fs-devel@lists.sourceforge.net
> > >>>> Cc: bintian.w...@huawei.com; 'Biao He'
> > >>>> Subject: Re: [f2fs-dev] [PATCH] f2fs: avoid hungtask problem
> > >>>> caused by losing wake_up
> > >>>>
> > >>>> On 2016/2/23 13:44, Chao Yu wrote:
> > >>>>> Hi Yunlei,
> > >>>> Hi Chao,
> > >>>>>
> > >>>>>> -----Original Message-----
> > >>>>>> From: Yunlei He [mailto:heyun...@huawei.com]
> > >>>>>> Sent: Tuesday, February 23, 2016 12:08 PM
> > >>>>>> To: chao2...@samsung.com; jaeg...@kernel.org;
> > >>>>>> linux-f2fs-devel@lists.sourceforge.net
> > >>>>>> Cc: bintian.w...@huawei.com; Yunlei He; Biao He
> > >>>>>> Subject: [f2fs-dev] [PATCH] f2fs: avoid hungtask problem caused
> > >>>>>> by losing wake_up
> > >>>>>>
> > >>>>>> The D state of wait_on_all_pages_writeback should be waken by
> > >>>>>> function f2fs_write_end_io when all writeback pages have been
> > >>>>>> succesfully written to device. It's possible that wake_up comes
> > >>>>>> between get_pages and io_schedule. Maybe in this case it will
> > >>>>>> lost wake_up and still in D state even if all pages have been
> > >>>>>> write back to device, and finally, the whole system will be
> > >>>>>> into the hungtask state.
> > >>>>>
> > >>>>> I haven't encountered such issue so far, do you suffer this in
> > >>>>> real world?
> > >>>>>
> > >>>> yes, I have encounter it, the whole file system is blocked at
> > >>>> function wait_on_all_pages_writeback beyond 120s when write cp,
> > >>>> and no error reported by storage device driver.
> > >>>
> > >>> Could this reproducible? If it could, could you please share the
> > >>> details.
> > >>> And did this occur in a huge size f2fs image?
> > >>>
> > >>>>>>
> > >>>>>> 		if (!get_pages(sbi, F2FS_WRITEBACK))
> > >>>>>> 			break;
> > >>>>>> 					<--------- wake_up
> > >>>>>
> > >>>>> wake_up will put all tasks linked in sbi->cp_wait on run-queue,
> > >>>>> so here it should be save to call io_schedule, after being
> > >>>>> rescheduled, it will get the chance to check above condition to
> > >>>>> break out.
> > >>>>>
> > >>>>> Thanks,
> > >>>>
> > >>>> Here, we just doubt something weird may cause
> > >>>> wait_on_all_pages_writeback could not be waken. Wake_up trigger
> > >>>> only one time by last bio's end_io function, if the thread happen
> > >>>> to miss it, the thread will be in D state forever. So we change
> > >>>> the code to make wait_on_all_pages_writeback awaken periodically,
> > >>>> then check the condition.
> > >>>
> > >>> Got it.
> > >>>
> > >>> The patch can fix issue that checkpointer will wait forever in
> > >>> case of write_end_io was failed to call wake_up for some reason.
> > >
> > > I found one possible case:
> > >
> > > CPU0:                                    CPU1:
> > >  - write_checkpoint
> > >   - do_checkpoint
> > >    - wait_on_all_pages_writeback
> > >                                          - f2fs_write_end_io
> > >                                           - wake_up
> > >                                           this is last writebacked page, but
> > >                                           no sleeper in sbi->cp_wait wait
> > >                                           queue, wake_up is not been called.
> > >     - prepare_to_wait(TASK_UNINTERRUPTIBLE)
> > >     Here, current task is been preempted,
> > >     but there will be no waker to wake up
> > >     this task since last write_end_io
> > >     has been called before. So current
> > >     task will sleep forever.
> > >     - io_schedule
> > >
> > > How do you think of it?
> > Hi Chao,
> >
> > Here, current task add itself into wait queue at first, and then check
> > the condition whether write back page is zero. So, in the above
> > situation, current task is been preempted in -
> > prepare_to_wait(TASK_UNINTERRUPTIBLE),
> > current task will not sleep for the write back page is zero.
>
> Oh, I meant:
>
> - prepare_to_wait(TASK_UNINTERRUPTIBLE)
> - Preempt
> - if (!get_pages(sbi, F2FS_WRITEBACK)) break; or maybe preempt here
> - io_schedule
> - finish_wait
>
> Even there is no more writeback state page, also preemption can happen
> before finish_wait, after that, once the task was been switched out, as it was
> set as TASK_UNINTERRUPTIBLE, there is no chance to schedule it again.
>
> Thanks,
>

First, according to our hung-task stack trace, we can confirm that the task is in io_schedule.
Secondly, from my point of view, it is safe to preempt the task regardless of its state; it will eventually be rescheduled. See __schedule() in kernel/sched/core.c: if PREEMPT_ACTIVE is set, the current task is not removed from the run queue.

> >
> > Thanks,
> > >
> > > And if this is right, following patch can fix this issue.
> > >
> > > ---
> > >  fs/f2fs/checkpoint.c | 14 +++++++++-----
> > >  fs/f2fs/data.c       |  9 +++++++--
> > >  fs/f2fs/f2fs.h       |  3 ++-
> > >  fs/f2fs/super.c      |  1 +
> > >  4 files changed, 19 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
> > > index 9d277f8..9446c3d 100644
> > > --- a/fs/f2fs/checkpoint.c
> > > +++ b/fs/f2fs/checkpoint.c
> > > @@ -914,15 +914,19 @@ static void wait_on_all_pages_writeback(struct f2fs_sb_info *sbi)
> > >  {
> > >  	DEFINE_WAIT(wait);
> > >
> > > -	for (;;) {
> > > -		prepare_to_wait(&sbi->cp_wait, &wait, TASK_UNINTERRUPTIBLE);
> > > +	spin_lock(&sbi->cp_wb_lock);
> > >
> > > -		if (!get_pages(sbi, F2FS_WRITEBACK))
> > > -			break;
> > > +	while (get_pages(sbi, F2FS_WRITEBACK)) {
> > > +		prepare_to_wait(&sbi->cp_wait, &wait, TASK_UNINTERRUPTIBLE);
> > >
> > > +		spin_unlock(&sbi->cp_wb_lock);
> > >  		io_schedule();
> > > +		spin_lock(&sbi->cp_wb_lock);
> > > +
> > > +		finish_wait(&sbi->cp_wait, &wait);
> > >  	}
> > > -	finish_wait(&sbi->cp_wait, &wait);
> > > +
> > > +	spin_unlock(&sbi->cp_wb_lock);
> > >  }
> > >
> > >  static int do_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
> > > diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> > > index e5c762b..e31deb97 100644
> > > --- a/fs/f2fs/data.c
> > > +++ b/fs/f2fs/data.c
> > > @@ -59,6 +59,7 @@ static void f2fs_write_end_io(struct bio *bio)
> > >  {
> > >  	struct f2fs_sb_info *sbi = bio->bi_private;
> > >  	struct bio_vec *bvec;
> > > +	unsigned long flags;
> > >  	int i;
> > >
> > >  	bio_for_each_segment_all(bvec, bio, i) {
> > > @@ -74,8 +75,12 @@ static void f2fs_write_end_io(struct bio *bio)
> > >  		dec_page_count(sbi, F2FS_WRITEBACK);
> > >  	}
> > >
> > > -	if (!get_pages(sbi, F2FS_WRITEBACK) && wq_has_sleeper(&sbi->cp_wait))
> > > -		wake_up(&sbi->cp_wait);
> > > +	if (!get_pages(sbi, F2FS_WRITEBACK)) {
> > > +		spin_lock_irqsave(&sbi->cp_wb_lock, flags);
> > > +		if (wq_has_sleeper(&sbi->cp_wait))
> > > +			wake_up(&sbi->cp_wait);
> > > +		spin_unlock_irqrestore(&sbi->cp_wb_lock, flags);
> > > +	}
> > >
> > >  	bio_put(bio);
> > >  }
> > > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> > > index 0d25430..fd47984 100644
> > > --- a/fs/f2fs/f2fs.h
> > > +++ b/fs/f2fs/f2fs.h
> > > @@ -727,7 +727,8 @@ struct f2fs_sb_info {
> > >  	struct rw_semaphore cp_rwsem;		/* blocking FS operations */
> > >  	struct rw_semaphore node_write;		/* locking node writes */
> > >  	struct mutex writepages;		/* mutex for writepages() */
> > > -	wait_queue_head_t cp_wait;
> > > +	wait_queue_head_t cp_wait;		/* for wait pages writeback */
> > > +	spinlock_t cp_wb_lock;			/* for protect cp_wait */
> > >  	unsigned long last_time[MAX_TIME];	/* to store time in jiffies */
> > >  	long interval_time[MAX_TIME];		/* to store thresholds */
> > >
> > > diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> > > index 7b62016..5316c7a 100644
> > > --- a/fs/f2fs/super.c
> > > +++ b/fs/f2fs/super.c
> > > @@ -1374,6 +1374,7 @@ try_onemore:
> > >
> > >  	init_rwsem(&sbi->cp_rwsem);
> > >  	init_waitqueue_head(&sbi->cp_wait);
> > > +	spin_lock_init(&sbi->cp_wb_lock);
> > >  	init_sb_info(sbi);
> > >
> > >  	/* get an inode for meta space */
> > >
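For anyone following the ordering argument in this thread, here is a minimal toy model in plain user-space C. It is only a sketch: every name in it (struct toy, toy_end_io, toy_prepare_to_wait and so on) is hypothetical and is not a kernel API, and it models a single waiter with no preemption and no memory-ordering effects. It only shows why queueing on the wait queue before checking the writeback count avoids a lost wake-up, and why the reverse order does not.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names are hypothetical and are not kernel APIs. */
enum toy_state { TOY_RUNNING, TOY_UNINTERRUPTIBLE };

struct toy {
	enum toy_state state;	/* task state of the single waiter */
	bool queued;		/* waiter is on the (modelled) cp_wait queue */
	int writeback;		/* pages still under writeback */
};

/* Waker: drop the page count; the last one wakes a queued waiter, if any. */
static void toy_end_io(struct toy *t)
{
	assert(t->writeback > 0);
	if (--t->writeback == 0 && t->queued)
		t->state = TOY_RUNNING;
}

/* Waiter: join the queue and mark the task as sleeping. */
static void toy_prepare_to_wait(struct toy *t)
{
	t->queued = true;
	t->state = TOY_UNINTERRUPTIBLE;
}

/*
 * Queue first, then check, then schedule.
 * wake_point 0: the last end_io runs before prepare_to_wait.
 * wake_point 1: the last end_io runs between the check and the schedule.
 * Returns true if the waiter would block with no waker left.
 */
bool toy_queue_then_check_hangs(int wake_point)
{
	struct toy t = { TOY_RUNNING, false, 1 };

	assert(wake_point == 0 || wake_point == 1);

	if (wake_point == 0)
		toy_end_io(&t);
	toy_prepare_to_wait(&t);
	if (t.writeback == 0)
		return false;		/* condition seen after queueing: break */
	if (wake_point == 1)
		toy_end_io(&t);		/* waiter is queued, so it is woken */
	return t.state != TOY_RUNNING;
}

/* Check first, then queue: a wake-up in between finds no sleeper and is lost. */
bool toy_check_then_queue_hangs(void)
{
	struct toy t = { TOY_RUNNING, false, 1 };

	if (t.writeback == 0)
		return false;
	toy_end_io(&t);			/* nobody queued yet: no wake-up */
	toy_prepare_to_wait(&t);
	return t.state != TOY_RUNNING;
}
```

In this model toy_queue_then_check_hangs(0) and toy_queue_then_check_hangs(1) both return false, while toy_check_then_queue_hangs() returns true. Because the model leaves out preemption and memory ordering, it cannot settle whether the hang reported here comes from a lost wake-up; it only illustrates the ordering being discussed.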