[f2fs-dev] documentation for atomic writes, encryption and other features

2017-02-16 Thread Daniel Black
I was looking at Documentation/filesystems/f2fs.txt trying to work out
how to do atomic writes.

I did find https://www.spinics.net/lists/kernel/msg1839691.html which
contains some elements.

It seems F2FS_IOC_DB_OPEN might been replaced by
F2FS_IOC_START_ATOMIC_WRITE looking at fs/f2fs/f2fs.h

I'd really like some documentation to be written in filesystems/f2fs.txt
to aid the use of features that have been so diligently developed. I'd
also like to know what happens when the atomic write size gets too big.

After that perhaps it can be used in application like other storage
systems e.g. https://jira.mariadb.org/browse/MDEV-11417

Keep up the good work.

Cheers,

Daniel Black


--
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


Re: [f2fs-dev] [RFC] add ovp valid_blocks check for bg gc victim to fg_gc

2017-02-16 Thread Hou Pengyang
On 2017/2/17 7:48, Jaegeuk Kim wrote:
> Hi Pengyang,
>
> Nice
Hi Jaegeuk,
catch!
>
> I think fggc_threshold needs to be revised, and we need to consider about
> general victim selection as well.
>
> Could you take a look at this?
>
>>From 23b265f5ca6405032d092e240c94a827f743e42b Mon Sep 17 00:00:00 2001
> From: Hou Pengyang 
> Date: Thu, 16 Feb 2017 12:34:31 +
> Subject: [PATCH] f2fs: add ovp valid_blocks check for bg gc victim to fg_gc
>
> For foreground gc, greedy algorithm should be adapted, which makes
> this formula work well:
>
>   (2 * (100 / config.overprovision + 1) + 6)
>
> But currently, we fg_gc have a prior to select bg_gc victim segments to gc 
> first,
> these victims are selected by cost-benefit algorithm, we can't guarantee such 
> segments
> have the small valid blocks, which may destroy the f2fs rule, on the worstest 
> case, would
> consume all the free segments.
>
> This patch fix this by add a filter in check_bg_victims, if segment's has # 
> of valid blocks
> over overprovision ratio, skip such segments.
>
> Cc: 
> Signed-off-by: Hou Pengyang 
> Signed-off-by: Jaegeuk Kim 
> ---
>   fs/f2fs/f2fs.h|  3 +++
>   fs/f2fs/gc.c  | 22 --
>   fs/f2fs/segment.h |  9 +
>   3 files changed, 32 insertions(+), 2 deletions(-)
>
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index cc22dc458896..1c9f0cc8f027 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -888,6 +888,9 @@ struct f2fs_sb_info {
>   struct f2fs_gc_kthread  *gc_thread; /* GC thread */
>   unsigned int cur_victim_sec;/* current victim section num */
>
> + /* threshold for converting bg victims for fg */
> + u64 fggc_threshold;
> +
>   /* maximum # of trials to find a victim segment for SSR and GC */
>   unsigned int max_victim_search;
>
> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> index 88e5e7b10ab6..fd4e479820e6 100644
> --- a/fs/f2fs/gc.c
> +++ b/fs/f2fs/gc.c
> @@ -166,7 +166,8 @@ static void select_policy(struct f2fs_sb_info *sbi, int 
> gc_type,
>   p->ofs_unit = sbi->segs_per_sec;
>   }
>
> - if (p->max_search > sbi->max_victim_search)
> + /* we need to check every dirty segments in the FG_GC case */
> + if (gc_type != FG_GC && p->max_search > sbi->max_victim_search)
>   p->max_search = sbi->max_victim_search;
>
>   p->offset = sbi->last_victim[p->gc_mode];
> @@ -199,6 +200,10 @@ static unsigned int check_bg_victims(struct f2fs_sb_info 
> *sbi)
>   for_each_set_bit(secno, dirty_i->victim_secmap, MAIN_SECS(sbi)) {
>   if (sec_usage_check(sbi, secno))
>   continue;
> +
> + if (no_fggc_candidate(sbi, secno))
> + continue;
> +
>   clear_bit(secno, dirty_i->victim_secmap);
>   return secno * sbi->segs_per_sec;
>   }
> @@ -322,13 +327,15 @@ static int get_victim_by_default(struct f2fs_sb_info 
> *sbi,
>   nsearched++;
>   }
>
> -
>   secno = GET_SECNO(sbi, segno);
>
>   if (sec_usage_check(sbi, secno))
>   goto next;
>   if (gc_type == BG_GC && test_bit(secno, dirty_i->victim_secmap))
>   goto next;
> + if (gc_type == FG_GC && p.alloc_mode == LFS &&
> + no_fggc_candidate(sbi, secno))
> + goto next;
>
>   cost = get_gc_cost(sbi, segno, );
>
> @@ -989,5 +996,16 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync, bool 
> background)
>
>   void build_gc_manager(struct f2fs_sb_info *sbi)
>   {
> + u64 user_block_count, ovp_count, blocks_per_sec, th;
> +
>   DIRTY_I(sbi)->v_ops = _v_ops;
> +
> + /* threshold of # of valid blocks in a section for victims of FG_GC */
> + user_block_count = sbi->user_block_count;
> + ovp_count = SM_I(sbi)->ovp_segments << sbi->log_blocks_per_seg;

About the ovp_count calculation,

in mkfs.f2fs, we get ovp_segment by

 set_cp(overprov_segment_count, (get_sb(segment_count_main) -
get_cp(rsvd_segment_count)) *
 config.overprovision / 100);

 set_cp(overprov_segment_count, get_cp(overprov_segment_count) +
 get_cp(rsvd_segment_count));

where the overprov calculation is based on the space excluding the
rsvd segment, and the final overprov_segment is sum of the REAL
overprov segments and the rsvd ones.

So, when to calculate the overprov ratio, the rsvd segments should
be subtracted from the ckpt->overprov_semgents?

Thanks,

> + blocks_per_sec = sbi->blocks_per_seg * sbi->segs_per_sec;
> +
> + th = user_block_count * 100 * blocks_per_sec /
> + ((user_block_count + ovp_count) * 100);
> + sbi->fggc_threshold = th;
>   }
> diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
> index 5cb5755c75d9..f4020f141d83 100644
> --- 

Re: [f2fs-dev] [RFC] add ovp valid_blocks check for bg gc victim to fg_gc

2017-02-16 Thread Jaegeuk Kim
On 02/17, Hou Pengyang wrote:
> On 2017/2/17 7:48, Jaegeuk Kim wrote:
> > Hi Pengyang,
> > 
> > Nice
> Hi Jaegeuk,
> catch!
> > 
> > I think fggc_threshold needs to be revised, and we need to consider about
> > general victim selection as well.
> > 
> > Could you take a look at this?
> > 
> > > From 23b265f5ca6405032d092e240c94a827f743e42b Mon Sep 17 00:00:00 2001
> > From: Hou Pengyang 
> > Date: Thu, 16 Feb 2017 12:34:31 +
> > Subject: [PATCH] f2fs: add ovp valid_blocks check for bg gc victim to fg_gc
> > 
> > For foreground gc, greedy algorithm should be adapted, which makes
> > this formula work well:
> > 
> > (2 * (100 / config.overprovision + 1) + 6)
> > 
> > But currently, we fg_gc have a prior to select bg_gc victim segments to gc 
> > first,
> > these victims are selected by cost-benefit algorithm, we can't guarantee 
> > such segments
> > have the small valid blocks, which may destroy the f2fs rule, on the 
> > worstest case, would
> > consume all the free segments.
> > 
> > This patch fix this by add a filter in check_bg_victims, if segment's has # 
> > of valid blocks
> > over overprovision ratio, skip such segments.
> > 
> > Cc: 
> > Signed-off-by: Hou Pengyang 
> > Signed-off-by: Jaegeuk Kim 
> > ---
> >   fs/f2fs/f2fs.h|  3 +++
> >   fs/f2fs/gc.c  | 22 --
> >   fs/f2fs/segment.h |  9 +
> >   3 files changed, 32 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> > index cc22dc458896..1c9f0cc8f027 100644
> > --- a/fs/f2fs/f2fs.h
> > +++ b/fs/f2fs/f2fs.h
> > @@ -888,6 +888,9 @@ struct f2fs_sb_info {
> > struct f2fs_gc_kthread  *gc_thread; /* GC thread */
> > unsigned int cur_victim_sec;/* current victim section num */
> > 
> > +   /* threshold for converting bg victims for fg */
> > +   u64 fggc_threshold;
> > +
> > /* maximum # of trials to find a victim segment for SSR and GC */
> > unsigned int max_victim_search;
> > 
> > diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> > index 88e5e7b10ab6..fd4e479820e6 100644
> > --- a/fs/f2fs/gc.c
> > +++ b/fs/f2fs/gc.c
> > @@ -166,7 +166,8 @@ static void select_policy(struct f2fs_sb_info *sbi, int 
> > gc_type,
> > p->ofs_unit = sbi->segs_per_sec;
> > }
> > 
> > -   if (p->max_search > sbi->max_victim_search)
> > +   /* we need to check every dirty segments in the FG_GC case */
> > +   if (gc_type != FG_GC && p->max_search > sbi->max_victim_search)
> > p->max_search = sbi->max_victim_search;
> > 
> > p->offset = sbi->last_victim[p->gc_mode];
> > @@ -199,6 +200,10 @@ static unsigned int check_bg_victims(struct 
> > f2fs_sb_info *sbi)
> > for_each_set_bit(secno, dirty_i->victim_secmap, MAIN_SECS(sbi)) {
> > if (sec_usage_check(sbi, secno))
> > continue;
> > +
> > +   if (no_fggc_candidate(sbi, secno))
> > +   continue;
> > +
> > clear_bit(secno, dirty_i->victim_secmap);
> > return secno * sbi->segs_per_sec;
> > }
> > @@ -322,13 +327,15 @@ static int get_victim_by_default(struct f2fs_sb_info 
> > *sbi,
> > nsearched++;
> > }
> > 
> > -
> > secno = GET_SECNO(sbi, segno);
> > 
> > if (sec_usage_check(sbi, secno))
> > goto next;
> > if (gc_type == BG_GC && test_bit(secno, dirty_i->victim_secmap))
> > goto next;
> > +   if (gc_type == FG_GC && p.alloc_mode == LFS &&
> > +   no_fggc_candidate(sbi, secno))
> > +   goto next;
> > 
> > cost = get_gc_cost(sbi, segno, );
> > 
> > @@ -989,5 +996,16 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync, bool 
> > background)
> > 
> >   void build_gc_manager(struct f2fs_sb_info *sbi)
> >   {
> > +   u64 user_block_count, ovp_count, blocks_per_sec, th;
> > +
> > DIRTY_I(sbi)->v_ops = _v_ops;
> > +
> > +   /* threshold of # of valid blocks in a section for victims of FG_GC */
> > +   user_block_count = sbi->user_block_count;
> > +   ovp_count = SM_I(sbi)->ovp_segments << sbi->log_blocks_per_seg;
> 
> About the ovp_count calculation,
> 
> in mkfs.f2fs, we get ovp_segment by
> 
> set_cp(overprov_segment_count, (get_sb(segment_count_main) -
>get_cp(rsvd_segment_count)) *
> config.overprovision / 100);
> 
> set_cp(overprov_segment_count, get_cp(overprov_segment_count) +
> get_cp(rsvd_segment_count));
> 
> where the overprov calculation is based on the space excluding the
> rsvd segment, and the final overprov_segment is sum of the REAL
> overprov segments and the rsvd ones.
> 
> So, when to calculate the overprov ratio, the rsvd segments should
> be subtracted from the ckpt->overprov_semgents?

I just got calculation from fresh mounted image. What I could confirm was that
user can see 

Re: [f2fs-dev] documentation for atomic writes, encryption and other features

2017-02-16 Thread Jaegeuk Kim
Hi Daniel,

On 02/17, Daniel Black wrote:
> I was looking at Documentation/filesystems/f2fs.txt trying to work out
> how to do atomic writes.
> 
> I did find https://www.spinics.net/lists/kernel/msg1839691.html which
> contains some elements.
> 
> It seems F2FS_IOC_DB_OPEN might been replaced by
> F2FS_IOC_START_ATOMIC_WRITE looking at fs/f2fs/f2fs.h
> 
> I'd really like some documentation to be written in filesystems/f2fs.txt
> to aid the use of features that have been so diligently developed.

Agreed that we need to describe it in f2fs.txt, and let me add the work in our
TODO list.

> I'd also like to know what happens when the atomic write size gets too big.

I didn't test large atomic writes a lot, since we enabled and have tested it in
sqlite under Android products, which triggers small atomic writes across lots of
different db files. In current design, there is no limit on the size of atomic
writes. And it handles basic errors like ENOMEM or EIO during commit.

> After that perhaps it can be used in application like other storage
> systems e.g. https://jira.mariadb.org/browse/MDEV-11417

Quite interesting!
Yup, I thought that there would be many use-cases of this feature. But I've not
been able to dive into other scenarios seriously, since application needs to be
modified; sqlite was quite easy for me to change tho.

Thanks,

> 
> Keep up the good work.
> 
> Cheers,
> 
> Daniel Black
> 
> 
> --
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, SlashDot.org! http://sdm.link/slashdot
> ___
> Linux-f2fs-devel mailing list
> Linux-f2fs-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


Re: [f2fs-dev] [RFC] add ovp valid_blocks check for bg gc victim to fg_gc

2017-02-16 Thread Jaegeuk Kim
Hi Pengyang,

Nice catch!

I think fggc_threshold needs to be revised, and we need to consider about
general victim selection as well.

Could you take a look at this?

>From 23b265f5ca6405032d092e240c94a827f743e42b Mon Sep 17 00:00:00 2001
From: Hou Pengyang 
Date: Thu, 16 Feb 2017 12:34:31 +
Subject: [PATCH] f2fs: add ovp valid_blocks check for bg gc victim to fg_gc

For foreground gc, greedy algorithm should be adapted, which makes
this formula work well:

(2 * (100 / config.overprovision + 1) + 6)

But currently, we fg_gc have a prior to select bg_gc victim segments to gc 
first,
these victims are selected by cost-benefit algorithm, we can't guarantee such 
segments
have the small valid blocks, which may destroy the f2fs rule, on the worstest 
case, would
consume all the free segments.

This patch fix this by add a filter in check_bg_victims, if segment's has # of 
valid blocks
over overprovision ratio, skip such segments.

Cc: 
Signed-off-by: Hou Pengyang 
Signed-off-by: Jaegeuk Kim 
---
 fs/f2fs/f2fs.h|  3 +++
 fs/f2fs/gc.c  | 22 --
 fs/f2fs/segment.h |  9 +
 3 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index cc22dc458896..1c9f0cc8f027 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -888,6 +888,9 @@ struct f2fs_sb_info {
struct f2fs_gc_kthread  *gc_thread; /* GC thread */
unsigned int cur_victim_sec;/* current victim section num */
 
+   /* threshold for converting bg victims for fg */
+   u64 fggc_threshold;
+
/* maximum # of trials to find a victim segment for SSR and GC */
unsigned int max_victim_search;
 
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 88e5e7b10ab6..fd4e479820e6 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -166,7 +166,8 @@ static void select_policy(struct f2fs_sb_info *sbi, int 
gc_type,
p->ofs_unit = sbi->segs_per_sec;
}
 
-   if (p->max_search > sbi->max_victim_search)
+   /* we need to check every dirty segments in the FG_GC case */
+   if (gc_type != FG_GC && p->max_search > sbi->max_victim_search)
p->max_search = sbi->max_victim_search;
 
p->offset = sbi->last_victim[p->gc_mode];
@@ -199,6 +200,10 @@ static unsigned int check_bg_victims(struct f2fs_sb_info 
*sbi)
for_each_set_bit(secno, dirty_i->victim_secmap, MAIN_SECS(sbi)) {
if (sec_usage_check(sbi, secno))
continue;
+
+   if (no_fggc_candidate(sbi, secno))
+   continue;
+
clear_bit(secno, dirty_i->victim_secmap);
return secno * sbi->segs_per_sec;
}
@@ -322,13 +327,15 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
nsearched++;
}
 
-
secno = GET_SECNO(sbi, segno);
 
if (sec_usage_check(sbi, secno))
goto next;
if (gc_type == BG_GC && test_bit(secno, dirty_i->victim_secmap))
goto next;
+   if (gc_type == FG_GC && p.alloc_mode == LFS &&
+   no_fggc_candidate(sbi, secno))
+   goto next;
 
cost = get_gc_cost(sbi, segno, );
 
@@ -989,5 +996,16 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync, bool 
background)
 
 void build_gc_manager(struct f2fs_sb_info *sbi)
 {
+   u64 user_block_count, ovp_count, blocks_per_sec, th;
+
DIRTY_I(sbi)->v_ops = _v_ops;
+
+   /* threshold of # of valid blocks in a section for victims of FG_GC */
+   user_block_count = sbi->user_block_count;
+   ovp_count = SM_I(sbi)->ovp_segments << sbi->log_blocks_per_seg;
+   blocks_per_sec = sbi->blocks_per_seg * sbi->segs_per_sec;
+
+   th = user_block_count * 100 * blocks_per_sec /
+   ((user_block_count + ovp_count) * 100);
+   sbi->fggc_threshold = th;
 }
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index 5cb5755c75d9..f4020f141d83 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -716,6 +716,15 @@ static inline block_t sum_blk_addr(struct f2fs_sb_info 
*sbi, int base, int type)
- (base + 1) + type;
 }
 
+static inline bool no_fggc_candidate(struct f2fs_sb_info *sbi,
+   unsigned int secno)
+{
+   if (get_valid_blocks(sbi, secno, sbi->segs_per_sec) >=
+   sbi->fggc_threshold)
+   return true;
+   return false;
+}
+
 static inline bool sec_usage_check(struct f2fs_sb_info *sbi, unsigned int 
secno)
 {
if (IS_CURSEC(sbi, secno) || (sbi->cur_victim_sec == secno))
-- 
2.11.0


--
Check out the vibrant tech community 

[f2fs-dev] [RFC] add ovp valid_blocks check for bg gc victim to fg_gc

2017-02-16 Thread Hou Pengyang
For foreground gc, greedy algorithm should be adapted, which makes
this formula work well:

(2 * (100 / config.overprovision + 1) + 6)

But currently, we fg_gc have a prior to select bg_gc victim segments to gc 
first,
these victims are selected by cost-benefit algorithm, we can't guarantee such 
segments
have the small valid blocks, which may destroy the f2fs rule, on the worstest 
case, would 
consume all the free segments.

This patch fix this by add a filter in check_bg_victims, if segment's has # of 
valid blocks
over overprovision ratio, skip such segments.

Signed-off-by: Hou Pengyang 
---
 fs/f2fs/f2fs.h|  1 +
 fs/f2fs/gc.c  |  4 
 fs/f2fs/segment.h | 12 
 fs/f2fs/super.c   | 10 ++
 4 files changed, 27 insertions(+)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index c9a97c3..0afe94f 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -881,6 +881,7 @@ struct f2fs_sb_info {
struct mutex gc_mutex;  /* mutex for GC */
struct f2fs_gc_kthread  *gc_thread; /* GC thread */
unsigned int cur_victim_sec;/* current victim section num */
+   unsigned int fggc_threshold;/* threshold for converting bg victims 
for fg */
 
/* maximum # of trials to find a victim segment for SSR and GC */
unsigned int max_victim_search;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 88e5e7b..18f3efa 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -199,6 +199,10 @@ static unsigned int check_bg_victims(struct f2fs_sb_info 
*sbi)
for_each_set_bit(secno, dirty_i->victim_secmap, MAIN_SECS(sbi)) {
if (sec_usage_check(sbi, secno))
continue;
+
+   if (valid_blocks_ovp_check(sbi, secno))
+   continue;
+
clear_bit(secno, dirty_i->victim_secmap);
return secno * sbi->segs_per_sec;
}
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index 5cb5755..2a6a1a9 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -716,6 +716,18 @@ static inline block_t sum_blk_addr(struct f2fs_sb_info 
*sbi, int base, int type)
- (base + 1) + type;
 }
 
+static inline bool valid_blocks_ovp_check(struct f2fs_sb_info *sbi, unsigned 
int secno)
+{
+   struct sit_info *sit_i = SIT_I(sbi);
+   unsigned int valid_blocks;
+
+   valid_blocks = sit_i->sec_entries[secno].valid_blocks;
+   if (valid_blocks >= sbi->fggc_threshold)
+   return true;
+   return false;
+
+}
+
 static inline bool sec_usage_check(struct f2fs_sb_info *sbi, unsigned int 
secno)
 {
if (IS_CURSEC(sbi, secno) || (sbi->cur_victim_sec == secno))
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 68a555b..54c4dac 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -2090,6 +2090,16 @@ static int f2fs_fill_super(struct super_block *sb, void 
*data, int silent)
if (err)
goto free_kobj;
}
+
+   sbi->fggc_threshold = (le32_to_cpu(sbi->ckpt->overprov_segment_count) -
+   
le32_to_cpu(sbi->ckpt->rsvd_segment_count)) * 100 /
+   
(le32_to_cpu(sbi->raw_super->segment_count_main) -
+   
le32_to_cpu(sbi->ckpt->rsvd_segment_count)) *
+   sbi->blocks_per_seg * sbi->segs_per_sec 
/ 100;
+
+   sbi->fggc_threshold = sbi->blocks_per_seg * sbi->segs_per_sec -
+   sbi->fggc_threshold;
+
kfree(options);
 
/* recover broken superblock */
-- 
2.10.1


--
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


Re: [f2fs-dev] [PATCH] f2fs: fix multiple f2fs_add_link() calls having same name

2017-02-16 Thread Jaegeuk Kim
On 02/16, Chao Yu wrote:
> On 2017/2/16 9:16, Jaegeuk Kim wrote:
> > On 02/14, Jaegeuk Kim wrote:
> >> On 02/15, Chao Yu wrote:
> >>> On 2017/2/15 2:03, Jaegeuk Kim wrote:
>  VFS uses f2fs_lookup() to decide f2fs_add_link() call during file 
>  creation.
>  But, if there is a race condition, f2fs_add_link() can be called multiple
>  times, resulting in multiple dentries with a same name. This patches 
>  fixes
>  it by adding __f2fs_find_entry() in f2fs_add_link() path.
> >>>
> >>> As during ->lookup, i_mutex will be held all the time, so there is no race
> >>> condition in between different file creators?
> >>
> >> Hehe, yup.
> >> I dropped this patch.
> > 
> > It turns out sdcardfs has a race condition between lookup and create, and it
> > calls vfs_create() twice. Workaround by f2fs would be needed for whole AOSP 
> > as
> > well. And I added to check current to avoid performance regression.
> > 
> > Thanks,
> > 
> >>From 2f433dbfa491550e666a3d08f5a59ac5bed42b0f Mon Sep 17 00:00:00 2001
> > From: Jaegeuk Kim 
> > Date: Tue, 14 Feb 2017 09:54:37 -0800
> > Subject: [PATCH] f2fs: fix multiple f2fs_add_link() calls having same name
> > 
> > It turns out a stakable filesystem like sdcardfs in AOSP can trigger 
> > multiple
> > vfs_create() to lower filesystem. In that case, f2fs will add multiple 
> > dentries
> > having same name which breaks filesystem consistency.
> > 
> > Until upper layer fixes, let's work around by f2fs, which shows actually not
> > much performance regression.
> > 
> > Cc: 
> > Signed-off-by: Jaegeuk Kim 
> > ---
> >  fs/f2fs/dir.c  | 26 --
> >  fs/f2fs/f2fs.h |  1 +
> >  2 files changed, 25 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
> > index 827c5daef4fc..4f5b3bb514c6 100644
> > --- a/fs/f2fs/dir.c
> > +++ b/fs/f2fs/dir.c
> > @@ -207,9 +207,11 @@ static struct f2fs_dir_entry *find_in_level(struct 
> > inode *dir,
> > f2fs_put_page(dentry_page, 0);
> > }
> >  
> > +   /* This is to increase the speed of f2fs_create */
> > if (!de && room && F2FS_I(dir)->chash != namehash) {
> > F2FS_I(dir)->chash = namehash;
> > F2FS_I(dir)->clevel = level;
> > +   F2FS_I(dir)->task = current;
> > }
> 
> Hash collision can happen on different file names, here, assignment of .task
> should be more accurately to prevent racing between lookup and creat, so how 
> about:
> 
> if (!de && room) {
>   .task = current;
>   if (chash != namehash) {
>   .chash = namehash;
>   .clevel = level;
>   }
> }

Makes sense.
Will apply that. ;)

Thanks,

> 
> Thanks,
> 
> >  
> > return de;
> > @@ -643,14 +645,34 @@ int __f2fs_add_link(struct inode *dir, const struct 
> > qstr *name,
> > struct inode *inode, nid_t ino, umode_t mode)
> >  {
> > struct fscrypt_name fname;
> > +   struct page *page = NULL;
> > +   struct f2fs_dir_entry *de = NULL;
> > int err;
> >  
> > err = fscrypt_setup_filename(dir, name, 0, );
> > if (err)
> > return err;
> >  
> > -   err = __f2fs_do_add_link(dir, , inode, ino, mode);
> > -
> > +   /*
> > +* An immature stakable filesystem shows a race condition between lookup
> > +* and create. If we have same task when doing lookup and create, it's
> > +* definitely fine as expected by VFS normally. Otherwise, let's just
> > +* verify on-disk dentry one more time, which guarantees filesystem
> > +* consistency more.
> > +*/
> > +   if (current != F2FS_I(dir)->task) {
> > +   de = __f2fs_find_entry(dir, , );
> > +   F2FS_I(dir)->task = NULL;
> > +   }
> > +   if (de) {
> > +   f2fs_dentry_kunmap(dir, page);
> > +   f2fs_put_page(page, 0);
> > +   err = -EEXIST;
> > +   } else if (!IS_ERR(page)) {
> > +   err = __f2fs_do_add_link(dir, , inode, ino, mode);
> > +   } else {
> > +   err = PTR_ERR(page);
> > +   }
> > fscrypt_free_filename();
> > return err;
> >  }
> > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> > index d263dade5e4c..cc22dc458896 100644
> > --- a/fs/f2fs/f2fs.h
> > +++ b/fs/f2fs/f2fs.h
> > @@ -454,6 +454,7 @@ struct f2fs_inode_info {
> > atomic_t dirty_pages;   /* # of dirty pages */
> > f2fs_hash_t chash;  /* hash value of given file name */
> > unsigned int clevel;/* maximum level of given file name */
> > +   struct task_struct *task;   /* lookup and create consistency */
> > nid_t i_xattr_nid;  /* node id that contains xattrs */
> > loff_t  last_disk_size; /* lastly written file size */
> >  
> > 

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot

Re: [f2fs-dev] [RFC] add ovp valid_blocks check for bg gc victim to fg_gc

2017-02-16 Thread Hou Pengyang
On 2017/2/17 10:54, Jaegeuk Kim wrote:
> On 02/17, Hou Pengyang wrote:
>> On 2017/2/17 7:48, Jaegeuk Kim wrote:
>>> Hi Pengyang,
>>>
>>> Nice
>> Hi Jaegeuk,
>> catch!
>>>
>>> I think fggc_threshold needs to be revised, and we need to consider about
>>> general victim selection as well.
>>>
>>> Could you take a look at this?
>>>
  From 23b265f5ca6405032d092e240c94a827f743e42b Mon Sep 17 00:00:00 2001
>>> From: Hou Pengyang 
>>> Date: Thu, 16 Feb 2017 12:34:31 +
>>> Subject: [PATCH] f2fs: add ovp valid_blocks check for bg gc victim to fg_gc
>>>
>>> For foreground gc, greedy algorithm should be adapted, which makes
>>> this formula work well:
>>>
>>> (2 * (100 / config.overprovision + 1) + 6)
>>>
>>> But currently, we fg_gc have a prior to select bg_gc victim segments to gc 
>>> first,
>>> these victims are selected by cost-benefit algorithm, we can't guarantee 
>>> such segments
>>> have the small valid blocks, which may destroy the f2fs rule, on the 
>>> worstest case, would
>>> consume all the free segments.
>>>
>>> This patch fix this by add a filter in check_bg_victims, if segment's has # 
>>> of valid blocks
>>> over overprovision ratio, skip such segments.
>>>
>>> Cc: 
>>> Signed-off-by: Hou Pengyang 
>>> Signed-off-by: Jaegeuk Kim 
>>> ---
>>>fs/f2fs/f2fs.h|  3 +++
>>>fs/f2fs/gc.c  | 22 --
>>>fs/f2fs/segment.h |  9 +
>>>3 files changed, 32 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>>> index cc22dc458896..1c9f0cc8f027 100644
>>> --- a/fs/f2fs/f2fs.h
>>> +++ b/fs/f2fs/f2fs.h
>>> @@ -888,6 +888,9 @@ struct f2fs_sb_info {
>>> struct f2fs_gc_kthread  *gc_thread; /* GC thread */
>>> unsigned int cur_victim_sec;/* current victim section num */
>>>
>>> +   /* threshold for converting bg victims for fg */
>>> +   u64 fggc_threshold;
>>> +
>>> /* maximum # of trials to find a victim segment for SSR and GC */
>>> unsigned int max_victim_search;
>>>
>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
>>> index 88e5e7b10ab6..fd4e479820e6 100644
>>> --- a/fs/f2fs/gc.c
>>> +++ b/fs/f2fs/gc.c
>>> @@ -166,7 +166,8 @@ static void select_policy(struct f2fs_sb_info *sbi, int 
>>> gc_type,
>>> p->ofs_unit = sbi->segs_per_sec;
>>> }
>>>
>>> -   if (p->max_search > sbi->max_victim_search)
>>> +   /* we need to check every dirty segments in the FG_GC case */
>>> +   if (gc_type != FG_GC && p->max_search > sbi->max_victim_search)
>>> p->max_search = sbi->max_victim_search;
>>>
>>> p->offset = sbi->last_victim[p->gc_mode];
>>> @@ -199,6 +200,10 @@ static unsigned int check_bg_victims(struct 
>>> f2fs_sb_info *sbi)
>>> for_each_set_bit(secno, dirty_i->victim_secmap, MAIN_SECS(sbi)) {
>>> if (sec_usage_check(sbi, secno))
>>> continue;
>>> +
>>> +   if (no_fggc_candidate(sbi, secno))
>>> +   continue;
>>> +
>>> clear_bit(secno, dirty_i->victim_secmap);
>>> return secno * sbi->segs_per_sec;
>>> }
>>> @@ -322,13 +327,15 @@ static int get_victim_by_default(struct f2fs_sb_info 
>>> *sbi,
>>> nsearched++;
>>> }
>>>
>>> -
>>> secno = GET_SECNO(sbi, segno);
>>>
>>> if (sec_usage_check(sbi, secno))
>>> goto next;
>>> if (gc_type == BG_GC && test_bit(secno, dirty_i->victim_secmap))
>>> goto next;
>>> +   if (gc_type == FG_GC && p.alloc_mode == LFS &&
>>> +   no_fggc_candidate(sbi, secno))
>>> +   goto next;
>>>
>>> cost = get_gc_cost(sbi, segno, );
>>>
>>> @@ -989,5 +996,16 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync, bool 
>>> background)
>>>
>>>void build_gc_manager(struct f2fs_sb_info *sbi)
>>>{
>>> +   u64 user_block_count, ovp_count, blocks_per_sec, th;
>>> +
>>> DIRTY_I(sbi)->v_ops = _v_ops;
>>> +
>>> +   /* threshold of # of valid blocks in a section for victims of FG_GC */
>>> +   user_block_count = sbi->user_block_count;
>>> +   ovp_count = SM_I(sbi)->ovp_segments << sbi->log_blocks_per_seg;
>>
>> About the ovp_count calculation,
>>
>> in mkfs.f2fs, we get ovp_segment by
>>
>>  set_cp(overprov_segment_count, (get_sb(segment_count_main) -
>> get_cp(rsvd_segment_count)) *
>>  config.overprovision / 100);
>>
>>  set_cp(overprov_segment_count, get_cp(overprov_segment_count) +
>>  get_cp(rsvd_segment_count));
>>
>> where the overprov calculation is based on the space excluding the
>> rsvd segment, and the final overprov_segment is sum of the REAL
>> overprov segments and the rsvd ones.
>>
>> So, when to calculate the overprov ratio, the rsvd segments should
>> be subtracted from the ckpt->overprov_semgents?
>
> I just got calculation from fresh mounted image. What