Hi Jaegeuk,

Sorry for the delay, the modification looks good to me. ;)

Thanks,

On 2017/8/16 1:54, Jaegeuk Kim wrote:
> On 08/15, Chao Yu wrote:
>> On 2017/8/15 11:45, Jaegeuk Kim wrote:
>>> On 08/07, Chao Yu wrote:
>>>> From: Chao Yu <[email protected]>
>>>>
>>>> Commit d618ebaf0aa8 ("f2fs: enable small discard by default") enables
>>>> f2fs to issue 4K size discard in real-time discard mode. However, issuing
>>>> smaller discard may cost more lifetime but releasing less free space in
>>>> flash device. Since f2fs has ability of separating hot/cold data and
>>>> garbage collection, we can expect that small-sized invalid region would
>>>> expand soon with OPU, deletion or garbage collection on valid datas, so
>>>> it's better to delay or skip issuing smaller size discards, it could help
>>>> to reduce overmuch consumption of IO bandwidth and lifetime of flash
>>>> storage.
>>>>
>>>> This patch makes f2fs selectng 64K size as its default minimal
>>>> granularity, and issue discard with the size which is not smaller than
>>>> minimal granularity. Also it exposes discard granularity as sysfs entry
>>>> for configuration in different scenario.
>>>
>>> Hi Chao,
>>>
>>> I'd like to change the default value to 1 in order to keep the original
>>> behavior, since we must avoid performance fluctuation after this single
>>> patch. Instead, you probably can change the value through sysfs.
>>
>> As I know, in fragmented filesystem space, there are may dozens of thousand
>> discard, in scenario of cellphone user are using, 30% is above 64K size, but
>> occupy 75% space of all undiscard space, so I changed discard_granularity to
>> 64K just to release bulk space in device. For other small-sized discards, I
>> expect that they may extend and cross the granularity threshold soon, and
>> fstrim of android could cover them in the night.
>
> Yup, I thought that, but this patch prevents fstrim from issuing small
> discards due to the granularity check. And, low-end device likes to issue
> small discards much more. How about this?
>
> From a0f38a8574a35995ba9e9e81ae5138919bb672a8 Mon Sep 17 00:00:00 2001
> From: Chao Yu <[email protected]>
> Date: Mon, 7 Aug 2017 23:09:56 +0800
> Subject: [PATCH] f2fs: introduce discard_granularity sysfs entry
>
> Commit d618ebaf0aa8 ("f2fs: enable small discard by default") enables
> f2fs to issue 4K size discard in real-time discard mode. However, issuing
> smaller discard may cost more lifetime but releasing less free space in
> flash device. Since f2fs has ability of separating hot/cold data and
> garbage collection, we can expect that small-sized invalid region would
> expand soon with OPU, deletion or garbage collection on valid datas, so
> it's better to delay or skip issuing smaller size discards, it could help
> to reduce overmuch consumption of IO bandwidth and lifetime of flash
> storage.
>
> This patch makes f2fs selectng 64K size as its default minimal
> granularity, and issue discard with the size which is not smaller than
> minimal granularity. Also it exposes discard granularity as sysfs entry
> for configuration in different scenario.
>
> Jaegeuk Kim:
> We must issue all the accumulated discard commands when fstrim is called.
> So, I've added pend_list_tag[] to indicate whether we should issue the
> commands or not. If tag sets P_ACTIVE or P_TRIM, we have to issue them.
> P_TRIM is set once at a time, given fstrim trigger.
>
> Signed-off-by: Chao Yu <[email protected]>
> Signed-off-by: Jaegeuk Kim <[email protected]>
> ---
>  Documentation/ABI/testing/sysfs-fs-f2fs |  9 +++++++
>  fs/f2fs/f2fs.h                          |  9 +++++++
>  fs/f2fs/segment.c                       | 43 +++++++++++++++++++++++++++++++--
>  fs/f2fs/sysfs.c                         | 23 ++++++++++++++++++
>  4 files changed, 82 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs
> index 621da3fc56c5..11b7f4ebea7c 100644
> --- a/Documentation/ABI/testing/sysfs-fs-f2fs
> +++ b/Documentation/ABI/testing/sysfs-fs-f2fs
> @@ -57,6 +57,15 @@ Contact:	"Jaegeuk Kim" <[email protected]>
>  Description:
>  		 Controls the issue rate of small discard commands.
>
> +What:		/sys/fs/f2fs/<disk>/discard_granularity
> +Date:		July 2017
> +Contact:	"Chao Yu" <[email protected]>
> +Description:
> +		Controls discard granularity of inner discard thread, inner thread
> +		will not issue discards with size that is smaller than granularity.
> +		The unit size is one block, now only support configuring in range
> +		of [1, 512].
> +
>  What:		/sys/fs/f2fs/<disk>/max_victim_search
>  Date:		January 2014
>  Contact:	"Jaegeuk Kim" <[email protected]>
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index e252e5bf9791..336021b9b93e 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -196,11 +196,18 @@ struct discard_entry {
>  	unsigned char discard_map[SIT_VBLOCK_MAP_SIZE];	/* segment discard bitmap */
>  };
>
> +/* default discard granularity of inner discard thread, unit: block count */
> +#define DEFAULT_DISCARD_GRANULARITY		16
> +
>  /* max discard pend list number */
>  #define MAX_PLIST_NUM		512
>  #define plist_idx(blk_num)	((blk_num) >= MAX_PLIST_NUM ?		\
>  					(MAX_PLIST_NUM - 1) : (blk_num - 1))
>
> +#define P_ACTIVE	0x01
> +#define P_TRIM		0x02
> +#define plist_issue(tag)	(((tag) & P_ACTIVE) || ((tag) & P_TRIM))
> +
>  enum {
>  	D_PREP,
>  	D_SUBMIT,
> @@ -236,11 +243,13 @@ struct discard_cmd_control {
>  	struct task_struct *f2fs_issue_discard;	/* discard thread */
>  	struct list_head entry_list;		/* 4KB discard entry list */
>  	struct list_head pend_list[MAX_PLIST_NUM];/* store pending entries */
> +	unsigned char pend_list_tag[MAX_PLIST_NUM];/* tag for pending entries */
>  	struct list_head wait_list;		/* store on-flushing entries */
>  	wait_queue_head_t discard_wait_queue;	/* waiting queue for wake-up */
>  	struct mutex cmd_lock;
>  	unsigned int nr_discards;		/* # of discards in the list */
>  	unsigned int max_discards;		/* max. discards to be issued */
> +	unsigned int discard_granularity;	/* discard granularity */
>  	unsigned int undiscard_blks;		/* # of undiscard blocks */
>  	atomic_t issued_discard;		/* # of issued discard */
>  	atomic_t issing_discard;		/* # of issing discard */
> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> index 05144b3a7f62..8c90b69dcd6d 100644
> --- a/fs/f2fs/segment.c
> +++ b/fs/f2fs/segment.c
> @@ -1028,22 +1028,49 @@ static void __issue_discard_cmd(struct f2fs_sb_info *sbi, bool issue_cond)
>  	f2fs_bug_on(sbi,
>  		!__check_rb_tree_consistence(sbi, &dcc->root));
>  	blk_start_plug(&plug);
> -	for (i = MAX_PLIST_NUM - 1; i >= 0; i--) {
> +	for (i = MAX_PLIST_NUM - 1;
> +			i >= 0 && plist_issue(dcc->pend_list_tag[i]); i--) {
>  		pend_list = &dcc->pend_list[i];
>  		list_for_each_entry_safe(dc, tmp, pend_list, list) {
>  			f2fs_bug_on(sbi, dc->state != D_PREP);
>
> +			/* Hurry up to finish fstrim */
> +			if (dcc->pend_list_tag[i] & P_TRIM) {
> +				__submit_discard_cmd(sbi, dc);
> +				continue;
> +			}
> +
>  			if (!issue_cond || is_idle(sbi))
>  				__submit_discard_cmd(sbi, dc);
>  			if (issue_cond && iter++ > DISCARD_ISSUE_RATE)
>  				goto out;
>  		}
> +		if (list_empty(pend_list) && dcc->pend_list_tag[i] & P_TRIM)
> +			dcc->pend_list_tag[i] &= (~P_TRIM);
>  	}
>  out:
>  	blk_finish_plug(&plug);
>  	mutex_unlock(&dcc->cmd_lock);
>  }
>
> +static void __drop_discard_cmd(struct f2fs_sb_info *sbi)
> +{
> +	struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
> +	struct list_head *pend_list;
> +	struct discard_cmd *dc, *tmp;
> +	int i;
> +
> +	mutex_lock(&dcc->cmd_lock);
> +	for (i = MAX_PLIST_NUM - 1; i >= 0; i--) {
> +		pend_list = &dcc->pend_list[i];
> +		list_for_each_entry_safe(dc, tmp, pend_list, list) {
> +			f2fs_bug_on(sbi, dc->state != D_PREP);
> +			__remove_discard_cmd(sbi, dc);
> +		}
> +	}
> +	mutex_unlock(&dcc->cmd_lock);
> +}
> +
>  static void __wait_one_discard_bio(struct f2fs_sb_info *sbi,
>  							struct discard_cmd *dc)
>  {
> @@ -1126,6 +1153,7 @@ void stop_discard_thread(struct f2fs_sb_info *sbi)
>  void f2fs_wait_discard_bios(struct f2fs_sb_info *sbi)
>  {
>  	__issue_discard_cmd(sbi, false);
> +	__drop_discard_cmd(sbi);
>  	__wait_discard_cmd(sbi, false);
>  }
>
> @@ -1448,9 +1476,13 @@ static int create_discard_cmd_control(struct f2fs_sb_info *sbi)
>  	if (!dcc)
>  		return -ENOMEM;
>
> +	dcc->discard_granularity = DEFAULT_DISCARD_GRANULARITY;
>  	INIT_LIST_HEAD(&dcc->entry_list);
> -	for (i = 0; i < MAX_PLIST_NUM; i++)
> +	for (i = 0; i < MAX_PLIST_NUM; i++) {
>  		INIT_LIST_HEAD(&dcc->pend_list[i]);
> +		if (i >= dcc->discard_granularity - 1)
> +			dcc->pend_list_tag[i] |= P_ACTIVE;
> +	}
>  	INIT_LIST_HEAD(&dcc->wait_list);
>  	mutex_init(&dcc->cmd_lock);
>  	atomic_set(&dcc->issued_discard, 0);
> @@ -2079,11 +2111,13 @@ bool exist_trim_candidates(struct f2fs_sb_info *sbi, struct cp_control *cpc)
>
>  int f2fs_trim_fs(struct f2fs_sb_info *sbi, struct fstrim_range *range)
>  {
> +	struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
>  	__u64 start = F2FS_BYTES_TO_BLK(range->start);
>  	__u64 end = start + F2FS_BYTES_TO_BLK(range->len) - 1;
>  	unsigned int start_segno, end_segno;
>  	struct cp_control cpc;
>  	int err = 0;
> +	int i;
>
>  	if (start >= MAX_BLKADDR(sbi) || range->len < sbi->blocksize)
>  		return -EINVAL;
> @@ -2127,6 +2161,11 @@ int f2fs_trim_fs(struct f2fs_sb_info *sbi, struct fstrim_range *range)
>
>  		schedule();
>  	}
> +	/* It's time to issue all the filed discards */
> +	mutex_lock(&dcc->cmd_lock);
> +	for (i = 0; i < MAX_PLIST_NUM; i++)
> +		dcc->pend_list_tag[i] |= P_TRIM;
> +	mutex_unlock(&dcc->cmd_lock);
>  out:
>  	range->len = F2FS_BLK_TO_BYTES(cpc.trimmed);
>  	return err;
> diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
> index c40e5d24df9f..4bcaa9059026 100644
> --- a/fs/f2fs/sysfs.c
> +++ b/fs/f2fs/sysfs.c
> @@ -152,6 +152,27 @@ static ssize_t f2fs_sbi_store(struct f2fs_attr *a,
>  		spin_unlock(&sbi->stat_lock);
>  		return count;
>  	}
> +
> +	if (!strcmp(a->attr.name, "discard_granularity")) {
> +		struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
> +		int i;
> +
> +		if (t == 0 || t > MAX_PLIST_NUM)
> +			return -EINVAL;
> +		if (t == *ui)
> +			return count;
> +
> +		mutex_lock(&dcc->cmd_lock);
> +		for (i = 0; i < MAX_PLIST_NUM; i++) {
> +			if (i >= t - 1)
> +				dcc->pend_list_tag[i] |= P_ACTIVE;
> +			else
> +				dcc->pend_list_tag[i] &= (~P_ACTIVE);
> +		}
> +		mutex_unlock(&dcc->cmd_lock);
> +		return count;
> +	}
> +
>  	*ui = t;
>
>  	if (!strcmp(a->attr.name, "iostat_enable") && *ui == 0)
> @@ -248,6 +269,7 @@ F2FS_RW_ATTR(GC_THREAD, f2fs_gc_kthread, gc_idle, gc_idle);
>  F2FS_RW_ATTR(GC_THREAD, f2fs_gc_kthread, gc_urgent, gc_urgent);
>  F2FS_RW_ATTR(SM_INFO, f2fs_sm_info, reclaim_segments, rec_prefree_segments);
>  F2FS_RW_ATTR(DCC_INFO, discard_cmd_control, max_small_discards, max_discards);
> +F2FS_RW_ATTR(DCC_INFO, discard_cmd_control, discard_granularity, discard_granularity);
>  F2FS_RW_ATTR(RESERVED_BLOCKS, f2fs_sb_info, reserved_blocks, reserved_blocks);
>  F2FS_RW_ATTR(SM_INFO, f2fs_sm_info, batched_trim_sections, trim_sections);
>  F2FS_RW_ATTR(SM_INFO, f2fs_sm_info, ipu_policy, ipu_policy);
> @@ -290,6 +312,7 @@ static struct attribute *f2fs_attrs[] = {
>  	ATTR_LIST(gc_urgent),
>  	ATTR_LIST(reclaim_segments),
>  	ATTR_LIST(max_small_discards),
> +	ATTR_LIST(discard_granularity),
>  	ATTR_LIST(batched_trim_sections),
>  	ATTR_LIST(ipu_policy),
>  	ATTR_LIST(min_ipu_util),
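
For readers following the pend_list_tag[] discussion, here is a rough user-space sketch of how the P_ACTIVE and P_TRIM tags decide which pending lists get visited. The macros are copied from the quoted f2fs.h hunk; the global array and the model_* function names are hypothetical stand-ins, and the sketch leaves out locking, the discard lists themselves, and the clearing of P_TRIM once a list drains. It is only an illustration of the tag arithmetic, not the kernel code.

```c
#include <assert.h>

/* Macros copied from the quoted f2fs.h hunk. */
#define MAX_PLIST_NUM	512
#define plist_idx(blk_num)	((blk_num) >= MAX_PLIST_NUM ?		\
					(MAX_PLIST_NUM - 1) : (blk_num - 1))
#define P_ACTIVE	0x01
#define P_TRIM		0x02
#define plist_issue(tag)	(((tag) & P_ACTIVE) || ((tag) & P_TRIM))

/* Hypothetical stand-in for dcc->pend_list_tag[]; no locking, no lists. */
unsigned char pend_list_tag[MAX_PLIST_NUM];

/* Same loop shape as the sysfs store path: lists holding discards of at
 * least `granularity` blocks become active, smaller ones inactive. */
void model_set_granularity(unsigned int granularity)
{
	int i;

	assert(granularity >= 1 && granularity <= MAX_PLIST_NUM);
	for (i = 0; i < MAX_PLIST_NUM; i++) {
		if (i >= (int)granularity - 1)
			pend_list_tag[i] |= P_ACTIVE;
		else
			pend_list_tag[i] &= (unsigned char)~P_ACTIVE;
	}
}

/* Same loop shape as the f2fs_trim_fs hunk: fstrim tags every list. */
void model_fstrim(void)
{
	int i;

	for (i = 0; i < MAX_PLIST_NUM; i++)
		pend_list_tag[i] |= P_TRIM;
}

/* Count the lists the issue loop would visit: it walks down from the
 * largest size and stops at the first list with neither tag set. */
int model_issuable_lists(void)
{
	int i, n = 0;

	for (i = MAX_PLIST_NUM - 1; i >= 0 && plist_issue(pend_list_tag[i]); i--)
		n++;
	return n;
}

/* Would a pending discard of blk_num blocks (blk_num >= 1) be reached? */
int model_would_issue(int blk_num)
{
	int idx = plist_idx(blk_num);

	return idx >= MAX_PLIST_NUM - model_issuable_lists();
}
```

With the default of 16 blocks, the walk covers indices 511 down to 15, so a 16-block discard is reached and a 15-block one is not until fstrim sets P_TRIM on every list.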

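As a small illustration of the sysfs entry described in the ABI hunk, the sketch below checks a value against the documented [1, 512] range and writes it to a caller-supplied path. The function names are hypothetical, and the 4 KB block size is an assumption taken from the commit message (16 blocks described as 64K); the kernel side remains the authority on what is accepted.

```c
#include <errno.h>
#include <stdio.h>

#define MAX_PLIST_NUM		512	/* upper bound, from the patch */
#define ASSUMED_BLOCK_SIZE	4096UL	/* assumption: 4 KB f2fs block */

/* Same range check as the sysfs store hunk: reject 0 and values > 512. */
int granularity_is_valid(unsigned long blocks)
{
	return blocks != 0 && blocks <= MAX_PLIST_NUM;
}

/* Convert a granularity in blocks to bytes under the 4 KB assumption. */
unsigned long granularity_to_bytes(unsigned long blocks)
{
	return blocks * ASSUMED_BLOCK_SIZE;
}

/*
 * Hypothetical helper: write a granularity (in blocks) to the sysfs file
 * at `path`. Returns 0 on success or a negative errno value.
 */
int write_granularity(const char *path, unsigned long blocks)
{
	FILE *f;
	int ret = 0;

	if (!granularity_is_valid(blocks))
		return -EINVAL;

	f = fopen(path, "w");
	if (!f)
		return -errno;
	if (fprintf(f, "%lu\n", blocks) < 0)
		ret = -EIO;
	/* Buffered data is flushed here, so a rejected write may only
	 * surface as an fclose() failure. */
	if (fclose(f) != 0 && ret == 0)
		ret = -errno;
	return ret;
}
```

Calling write_granularity() on /sys/fs/f2fs/<disk>/discard_granularity with a value of 1 would correspond to the original 4K behavior discussed above, and 16 to the 64K default in the patch.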