from:"Yunlong Song"

[f2fs-dev] [PATCH] f2fs: remove redundant comment of unused wio_mutex

2018-12-13 Thread Yunlong Song

Commit 089842de ("f2fs: remove codes of unused wio_mutex") removed the
unused wio_mutex code but left its comment behind, so delete the comment too.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/f2fs.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 7cec897..03e7b37 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1175,7 +1175,6 @@ struct f2fs_sb_info {
 
/* for bio operations */
struct f2fs_bio_info *write_io[NR_PAGE_TYPE];   /* for write bios */
-   /* bio ordering for NODE/DATA */
/* keep migration IO order for LFS mode */
struct rw_semaphore io_order_lock;
mempool_t *write_io_dummy;  /* Dummy pages */
-- 
1.8.5.2



___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

[f2fs-dev] [PATCH v2] f2fs: remove codes of unused wio_mutex

2018-12-11 Thread Yunlong Song

v1->v2: delete comments in f2fs.h: "/* bio ordering for NODE/DATA */"

Signed-off-by: Yunlong Song 
Reviewed-by: Chao Yu 
Signed-off-by: Jaegeuk Kim 
---
 fs/f2fs/f2fs.h  | 2 --
 fs/f2fs/super.c | 5 +
 2 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 1e03197..195850e 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1170,8 +1170,6 @@ struct f2fs_sb_info {
 
/* for bio operations */
struct f2fs_bio_info *write_io[NR_PAGE_TYPE];   /* for write bios */
-   struct mutex wio_mutex[NR_PAGE_TYPE - 1][NR_TEMP_TYPE];
-   /* bio ordering for NODE/DATA */
/* keep migration IO order for LFS mode */
struct rw_semaphore io_order_lock;
mempool_t *write_io_dummy;  /* Dummy pages */
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index af58b2c..2d18de5 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -2674,7 +2674,7 @@ int f2fs_sanity_check_ckpt(struct f2fs_sb_info *sbi)
 static void init_sb_info(struct f2fs_sb_info *sbi)
 {
struct f2fs_super_block *raw_super = sbi->raw_super;
-   int i, j;
+   int i;
 
sbi->log_sectors_per_block =
le32_to_cpu(raw_super->log_sectors_per_block);
@@ -2710,9 +2710,6 @@ static void init_sb_info(struct f2fs_sb_info *sbi)
 
INIT_LIST_HEAD(&sbi->s_list);
mutex_init(&sbi->umount_mutex);
-   for (i = 0; i < NR_PAGE_TYPE - 1; i++)
-       for (j = HOT; j < NR_TEMP_TYPE; j++)
-           mutex_init(&sbi->wio_mutex[i][j]);
init_rwsem(&sbi->io_order_lock);
spin_lock_init(&sbi->cp_lock);
 
-- 
1.8.5.2




[f2fs-dev] [PATCH v3] f2fs: only flush the single temp bio cache which owns the target page

2018-11-12 Thread Yunlong Song

Previously, once f2fs found that one of the temp bio caches owned the target
page, it flushed all three temp bio caches. Only the cache that owns the page
needs flushing; leaving the other two alone helps keep their bios merged.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/data.c | 37 ++---
 1 file changed, 10 insertions(+), 27 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 6e0ffb1..cd8f670 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -372,29 +372,6 @@ static bool __has_merged_page(struct f2fs_bio_info *io, struct inode *inode,
return false;
 }
 
-static bool has_merged_page(struct f2fs_sb_info *sbi, struct inode *inode,
-   struct page *page, nid_t ino,
-   enum page_type type)
-{
-   enum page_type btype = PAGE_TYPE_OF_BIO(type);
-   enum temp_type temp;
-   struct f2fs_bio_info *io;
-   bool ret = false;
-
-   for (temp = HOT; temp < NR_TEMP_TYPE; temp++) {
-   io = sbi->write_io[btype] + temp;
-
-   down_read(&io->io_rwsem);
-   ret = __has_merged_page(io, inode, page, ino);
-   up_read(&io->io_rwsem);
-
-   /* TODO: use HOT temp only for meta pages now. */
-   if (ret || btype == META)
-   break;
-   }
-   return ret;
-}
-
 static void __f2fs_submit_merged_write(struct f2fs_sb_info *sbi,
enum page_type type, enum temp_type temp)
 {
@@ -420,13 +397,19 @@ static void __submit_merged_write_cond(struct f2fs_sb_info *sbi,
nid_t ino, enum page_type type, bool force)
 {
enum temp_type temp;
-
-   if (!force && !has_merged_page(sbi, inode, page, ino, type))
-   return;
+   bool ret = true;
 
for (temp = HOT; temp < NR_TEMP_TYPE; temp++) {
+   if (!force) {
+   enum page_type btype = PAGE_TYPE_OF_BIO(type);
+   struct f2fs_bio_info *io = sbi->write_io[btype] + temp;
 
-   __f2fs_submit_merged_write(sbi, type, temp);
+   down_read(&io->io_rwsem);
+   ret = __has_merged_page(io, inode, page, ino);
+   up_read(&io->io_rwsem);
+   }
+   if (ret)
+   __f2fs_submit_merged_write(sbi, type, temp);
 
/* TODO: use HOT temp only for meta pages now. */
if (type >= META)
-- 
1.8.5.2
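
As a rough illustration of the v3 loop above (not the kernel code itself),
the sketch below models the three temp bio caches as a small array. The
struct toy_bio_cache type, its fields and toy_submit_merged_write_cond are
made-up names for this example; only HOT/WARM/COLD and NR_TEMP_TYPE mirror
f2fs. Locking and the early exit for META pages are left out.

```c
#include <assert.h>
#include <stdbool.h>

enum temp_type { HOT, WARM, COLD, NR_TEMP_TYPE };

/* Hypothetical stand-in for f2fs_bio_info: one cached bio per temperature. */
struct toy_bio_cache {
	int pending_page;	/* id of the page held in the cached bio */
	bool flushed;		/* set once this cache has been submitted */
};

/*
 * With force, flush every cache. Without force, flush only the cache
 * whose bio holds the target page. Returns how many caches were flushed.
 */
static inline int toy_submit_merged_write_cond(struct toy_bio_cache *io,
					       int page, bool force)
{
	int flushed = 0;

	assert(io != 0);
	for (int temp = HOT; temp < NR_TEMP_TYPE; temp++) {
		bool ret = true;

		if (!force)
			ret = (io[temp].pending_page == page);
		if (ret) {
			io[temp].flushed = true;
			flushed++;
		}
	}
	return flushed;
}
```

With page 2 sitting in the WARM cache, a non-forced call flushes one cache
and leaves HOT and COLD untouched, while a forced call flushes all three.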




[f2fs-dev] [PATCH v2] f2fs: fix get_blocktype_secs bug when segs_per_sec is larger than 1

2018-10-30 Thread Yunlong Song

f2fs_need_SSR uses get_blocktype_secs to calculate the number of dirty
sections needed. However, when segs_per_sec > 1 and the needed segments are
fewer than segs_per_sec, the integer division returns 0, so fix it.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/f2fs.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 56204a8..ef41ea2 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1842,7 +1842,7 @@ static inline int get_blocktype_secs(struct f2fs_sb_info *sbi, int block_type)
unsigned int segs = (get_pages(sbi, block_type) + pages_per_sec - 1) >>
sbi->log_blocks_per_seg;
 
-   return segs / sbi->segs_per_sec;
+   return (segs + sbi->segs_per_sec - 1) / sbi->segs_per_sec;
 }
 
 static inline block_t valid_user_blocks(struct f2fs_sb_info *sbi)
-- 
1.8.5.2
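
For reference, a minimal standalone sketch of the rounding difference,
assuming plain unsigned arithmetic. secs_floor and secs_ceil are
hypothetical helper names, not f2fs functions; they only mirror the old and
new return expressions in the hunk above.

```c
#include <assert.h>

/* Truncating division, as in the removed return statement. */
static inline unsigned int secs_floor(unsigned int segs,
				      unsigned int segs_per_sec)
{
	assert(segs_per_sec > 0);
	return segs / segs_per_sec;
}

/* Round-up division, as in the added return statement. */
static inline unsigned int secs_ceil(unsigned int segs,
				     unsigned int segs_per_sec)
{
	assert(segs_per_sec > 0);
	return (segs + segs_per_sec - 1) / segs_per_sec;
}
```

With segs_per_sec = 4, three needed segments give secs_floor(3, 4) == 0 but
secs_ceil(3, 4) == 1, and exact multiples are unchanged.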




[f2fs-dev] [PATCH v2] f2fs: change segment to section in f2fs_ioc_gc_range

2018-10-30 Thread Yunlong Song

f2fs_ioc_gc_range advances by blocks_per_seg on each pass, but f2fs_gc moves
a whole section's blocks each time, so advance by a section, not a segment.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 88b1246..f981b6c 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -2155,7 +2155,7 @@ static int f2fs_ioc_gc_range(struct file *filp, unsigned long arg)
}
 
ret = f2fs_gc(sbi, range.sync, true, GET_SEGNO(sbi, range.start));
-   range.start += sbi->blocks_per_seg;
+   range.start += BLKS_PER_SEC(sbi);
if (range.start <= end)
goto do_more;
 out:
-- 
1.8.5.2
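
To get a feel for the effect of the step size, here is a small sketch that
counts how many passes a do_more-style loop makes over a block range.
gc_passes is a hypothetical helper, and the geometry in the note below (512
blocks per segment, 4 segments per section) is an assumed example, not a
value taken from the patch.

```c
#include <assert.h>

/*
 * Count the passes of a loop that does one GC call, advances start by
 * step blocks, and repeats while start <= end.
 */
static inline unsigned int gc_passes(unsigned long long start,
				     unsigned long long end,
				     unsigned long long step)
{
	unsigned int passes = 0;

	assert(step > 0);
	do {
		passes++;	/* stands in for one f2fs_gc() call */
		start += step;
	} while (start <= end);
	return passes;
}
```

Over blocks 0..4095 (two sections in the assumed geometry), stepping by a
segment makes 8 passes while stepping by a section makes 2.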




[f2fs-dev] [PATCH] f2fs: fix get_blocktype_secs bug when segs_per_sec is larger than 1

2018-10-30 Thread Yunlong Song

f2fs_need_SSR uses get_blocktype_secs to calculate the number of dirty
sections needed. However, when segs_per_sec > 1 and the needed segments are
fewer than segs_per_sec, the integer division returns 0, so fix it.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/f2fs.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 56204a8..d47417b 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1842,7 +1842,7 @@ static inline int get_blocktype_secs(struct f2fs_sb_info *sbi, int block_type)
unsigned int segs = (get_pages(sbi, block_type) + pages_per_sec - 1) >>
sbi->log_blocks_per_seg;
 
-   return segs / sbi->segs_per_sec;
+   return (segs / sbi->segs_per_sec + sbi->segs_per_sec - 1) / sbi->segs_per_sec;
 }
 
 static inline block_t valid_user_blocks(struct f2fs_sb_info *sbi)
-- 
1.8.5.2
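
A quick way to see what the return expression in this first posting
evaluates to is to copy it into a standalone function. secs_v1 is a
hypothetical name used only here. The division by segs_per_sec happens
before the round-up term is added, so small inputs still come out as 0. The
thread itself does not comment on this; the values below are just the
arithmetic.

```c
#include <assert.h>

/* The expression from the first posting, copied as written. */
static inline unsigned int secs_v1(unsigned int segs,
				   unsigned int segs_per_sec)
{
	assert(segs_per_sec > 0);
	return (segs / segs_per_sec + segs_per_sec - 1) / segs_per_sec;
}
```

With segs_per_sec = 4, secs_v1(1, 4) is 0 and secs_v1(16, 4) is 1, although
16 segments at 4 per section span 4 sections.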




Re: [f2fs-dev] [PATCH 3/3] f2fs: export migration_granularity sysfs entry

2018-10-24 Thread Yunlong Song





On 2018/10/24 18:37, Chao Yu wrote:

Add one sysfs entry to control migration granularity of GC in large
section f2fs, it can be tuned to mitigate heavy overhead of migrating
huge number of blocks in large section.

Signed-off-by: Chao Yu 
---
  Documentation/ABI/testing/sysfs-fs-f2fs | 9 +
  fs/f2fs/sysfs.c | 7 +++
  2 files changed, 16 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs
index 3ac41774ad3c..a7ce33199457 100644
--- a/Documentation/ABI/testing/sysfs-fs-f2fs
+++ b/Documentation/ABI/testing/sysfs-fs-f2fs
@@ -92,6 +92,15 @@ Contact: "Jaegeuk Kim" 
  Description:
 Controls the number of trials to find a victim segment.
  
+What:		/sys/fs/f2fs/<disk>/migration_granularity
+Date:  October 2018
+Contact:   "Chao Yu" 
+Description:
+Controls migration granularity of garbage collection on large
+section, it can let GC move partial segment{s} of one section
+in one GC cycle, so that dispersing heavy overhead GC to
+multiple lightweight one.
+
  What: /sys/fs/f2fs/<disk>/dir_level
  Date: March 2014
  Contact:  "Jaegeuk Kim" 
diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
index 240e4881279e..b393fda6d6dc 100644
--- a/fs/f2fs/sysfs.c
+++ b/fs/f2fs/sysfs.c
@@ -246,6 +246,11 @@ static ssize_t __sbi_store(struct f2fs_attr *a,
return count;
}
  
+	if (!strcmp(a->attr.name, "discard_granularity")) {


Should be migration_granularity ?


+   if (t == 0 || t > sbi->segs_per_sec)
+   return -EINVAL;
+   }
+
if (!strcmp(a->attr.name, "trim_sections"))
return -EINVAL;
  
@@ -406,6 +411,7 @@ F2FS_RW_ATTR(NM_INFO, f2fs_nm_info, ram_thresh, ram_thresh);

  F2FS_RW_ATTR(NM_INFO, f2fs_nm_info, ra_nid_pages, ra_nid_pages);
  F2FS_RW_ATTR(NM_INFO, f2fs_nm_info, dirty_nats_ratio, dirty_nats_ratio);
  F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, max_victim_search, max_victim_search);
+F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, migration_granularity, migration_granularity);
  F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, dir_level, dir_level);
  F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, cp_interval, interval_time[CP_TIME]);
  F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, idle_interval, interval_time[REQ_TIME]);
@@ -460,6 +466,7 @@ static struct attribute *f2fs_attrs[] = {
ATTR_LIST(min_hot_blocks),
ATTR_LIST(min_ssr_sections),
ATTR_LIST(max_victim_search),
+   ATTR_LIST(migration_granularity),
ATTR_LIST(dir_level),
ATTR_LIST(ram_thresh),
ATTR_LIST(ra_nid_pages),


--
Thanks,
Yunlong Song
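
For illustration, the range check in the quoted __sbi_store hunk, keyed on
the attribute name suggested in the inline comment, could be sketched as a
standalone function. check_sbi_store and its parameters are hypothetical;
the real function works on struct f2fs_attr and struct f2fs_sb_info.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Reject a migration_granularity of 0 or larger than segs_per_sec.
 * Other attribute names pass through unchecked in this sketch.
 */
static inline int check_sbi_store(const char *name, unsigned long t,
				  unsigned int segs_per_sec)
{
	assert(name != 0);
	if (!strcmp(name, "migration_granularity")) {
		if (t == 0 || t > segs_per_sec)
			return -EINVAL;
	}
	return 0;
}
```

With segs_per_sec = 4, values 1 through 4 are accepted and 0 or 5 return
-EINVAL. A write to discard_granularity is not affected by this check.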





[f2fs-dev] [PATCH] f2fs: change segment to section in f2fs_ioc_gc_range

2018-10-24 Thread Yunlong Song

f2fs_ioc_gc_range advances by blocks_per_seg on each pass, but f2fs_gc moves
a whole section's blocks each time, so advance by a section, not a segment.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 88b1246..8c06724 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -2155,7 +2155,7 @@ static int f2fs_ioc_gc_range(struct file *filp, unsigned long arg)
}
 
ret = f2fs_gc(sbi, range.sync, true, GET_SEGNO(sbi, range.start));
-   range.start += sbi->blocks_per_seg;
+   range.start += sbi->blocks_per_seg * sbi->segs_per_sec;
if (range.start <= end)
goto do_more;
 out:
-- 
1.8.5.2




[f2fs-dev] [PATCH v2] f2fs: only flush the single temp bio cache which owns the target page

2018-10-24 Thread Yunlong Song

Previously, once f2fs found that one of the temp bio caches owned the target
page, it flushed all three temp bio caches. Only the cache that owns the page
needs flushing; leaving the other two alone helps keep their bios merged.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/data.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 106f116..882e217 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -396,13 +396,17 @@ static bool has_merged_page(struct f2fs_sb_info *sbi, struct inode *inode,
 }
 
 static void __f2fs_submit_merged_write(struct f2fs_sb_info *sbi,
-   enum page_type type, enum temp_type temp)
+   struct inode *inode, struct page *page, nid_t ino,
+   enum page_type type, enum temp_type temp, bool force)
 {
enum page_type btype = PAGE_TYPE_OF_BIO(type);
struct f2fs_bio_info *io = sbi->write_io[btype] + temp;
 
down_write(&io->io_rwsem);
 
+   if (!force && !__has_merged_page(io, inode, page, ino))
+   goto out;
+
/* change META to META_FLUSH in the checkpoint procedure */
if (type >= META_FLUSH) {
io->fio.type = META_FLUSH;
@@ -412,6 +416,7 @@ static void __f2fs_submit_merged_write(struct f2fs_sb_info *sbi,
io->fio.op_flags |= REQ_PREFLUSH | REQ_FUA;
}
__submit_merged_bio(io);
+out:
up_write(&io->io_rwsem);
 }
 
@@ -426,7 +431,7 @@ static void __submit_merged_write_cond(struct f2fs_sb_info *sbi,
 
for (temp = HOT; temp < NR_TEMP_TYPE; temp++) {
 
-   __f2fs_submit_merged_write(sbi, type, temp);
+   __f2fs_submit_merged_write(sbi, inode, page, ino, type, temp, force);
 
/* TODO: use HOT temp only for meta pages now. */
if (type >= META)
-- 
1.8.5.2




[f2fs-dev] [PATCH] f2fs: only flush the single temp bio cache which owns the target page

2018-10-24 Thread Yunlong Song

Previously, once f2fs found that one of the temp bio caches owned the target
page, it flushed all three temp bio caches. Only the cache that owns the page
needs flushing; leaving the other two alone helps keep their bios merged.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/data.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 106f116..04ebbad 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -374,7 +374,7 @@ static bool __has_merged_page(struct f2fs_bio_info *io, struct inode *inode,
 
 static bool has_merged_page(struct f2fs_sb_info *sbi, struct inode *inode,
struct page *page, nid_t ino,
-   enum page_type type)
+   enum page_type type, enum temp_type *owner)
 {
enum page_type btype = PAGE_TYPE_OF_BIO(type);
enum temp_type temp;
@@ -392,6 +392,10 @@ static bool has_merged_page(struct f2fs_sb_info *sbi, struct inode *inode,
if (ret || btype == META)
break;
}
+   if (!ret || (!inode && !page && !ino))
+   *owner = NR_TEMP_TYPE;
+   else
+   *owner = temp;
return ret;
 }
 
@@ -421,9 +425,14 @@ static void __submit_merged_write_cond(struct f2fs_sb_info *sbi,
 {
enum temp_type temp;
 
-   if (!force && !has_merged_page(sbi, inode, page, ino, type))
+   if (!force && !has_merged_page(sbi, inode, page, ino, type, &temp))
return;
 
+   if (!force && temp != NR_TEMP_TYPE) {
+   __f2fs_submit_merged_write(sbi, type, temp);
+   return;
+   }
+
for (temp = HOT; temp < NR_TEMP_TYPE; temp++) {
 
__f2fs_submit_merged_write(sbi, type, temp);
-- 
1.8.5.2




[f2fs-dev] [PATCH] f2fs: remove codes of unused wio_mutex

2018-10-24 Thread Yunlong Song

Signed-off-by: Yunlong Song 
---
 fs/f2fs/f2fs.h  | 1 -
 fs/f2fs/super.c | 5 +
 2 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 56204a8..4dfa28d 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1170,7 +1170,6 @@ struct f2fs_sb_info {
 
/* for bio operations */
struct f2fs_bio_info *write_io[NR_PAGE_TYPE];   /* for write bios */
-   struct mutex wio_mutex[NR_PAGE_TYPE - 1][NR_TEMP_TYPE];
/* bio ordering for NODE/DATA */
/* keep migration IO order for LFS mode */
struct rw_semaphore io_order_lock;
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index af58b2c..2d18de5 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -2674,7 +2674,7 @@ int f2fs_sanity_check_ckpt(struct f2fs_sb_info *sbi)
 static void init_sb_info(struct f2fs_sb_info *sbi)
 {
struct f2fs_super_block *raw_super = sbi->raw_super;
-   int i, j;
+   int i;
 
sbi->log_sectors_per_block =
le32_to_cpu(raw_super->log_sectors_per_block);
@@ -2710,9 +2710,6 @@ static void init_sb_info(struct f2fs_sb_info *sbi)
 
INIT_LIST_HEAD(&sbi->s_list);
mutex_init(&sbi->umount_mutex);
-   for (i = 0; i < NR_PAGE_TYPE - 1; i++)
-       for (j = HOT; j < NR_TEMP_TYPE; j++)
-           mutex_init(&sbi->wio_mutex[i][j]);
init_rwsem(&sbi->io_order_lock);
spin_lock_init(&sbi->cp_lock);
 
-- 
1.8.5.2




[f2fs-dev] [PATCH v2] f2fs: fix count of seg_freed to make sec_freed correct

2018-10-24 Thread Yunlong Song

When sbi->segs_per_sec > 1, and if some segno has 0 valid blocks before
gc starts, do_garbage_collect will skip counting seg_freed++, and this
will cause seg_freed < sbi->segs_per_sec and finally skip sec_freed++.

Signed-off-by: Yunlong Song 
Signed-off-by: Chao Yu 
---
 fs/f2fs/gc.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index a07241f..57841e9 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -1130,9 +1130,9 @@ static int do_garbage_collect(struct f2fs_sb_info *sbi,
GET_SUM_BLOCK(sbi, segno));
f2fs_put_page(sum_page, 0);
 
-   if (get_valid_blocks(sbi, segno, false) == 0 ||
-   !PageUptodate(sum_page) ||
-   unlikely(f2fs_cp_error(sbi)))
+   if (get_valid_blocks(sbi, segno, false) == 0)
+   goto freed;
+   if (!PageUptodate(sum_page) || unlikely(f2fs_cp_error(sbi)))
goto next;
 
sum = page_address(sum_page);
@@ -1160,6 +1160,7 @@ static int do_garbage_collect(struct f2fs_sb_info *sbi,
 
stat_inc_seg_count(sbi, type, gc_type);
 
+freed:
if (gc_type == FG_GC &&
get_valid_blocks(sbi, segno, false) == 0)
seg_freed++;
-- 
1.8.5.2
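
The counting problem described in the commit message can be reproduced with
a toy model of the per-section loop for FG_GC. count_seg_freed is a
hypothetical function; it assumes migration always succeeds and leaves each
non-empty segment with 0 valid blocks, which the real code does not
guarantee.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * valid[i] is the valid-block count of segment i before GC starts.
 * count_empty = false models the old "goto next" for empty segments,
 * count_empty = true models the patched "goto freed".
 */
static inline int count_seg_freed(const unsigned int *valid, int nsegs,
				  bool count_empty)
{
	int seg_freed = 0;

	assert(valid != 0 && nsegs > 0);
	for (int i = 0; i < nsegs; i++) {
		if (valid[i] == 0) {
			if (count_empty)
				seg_freed++;	/* patched path */
			continue;		/* old path counts nothing */
		}
		/* blocks are migrated here; the segment ends up empty */
		seg_freed++;
	}
	return seg_freed;
}
```

For a 4-segment section with valid counts {0, 10, 0, 7}, the old path
reports 2 freed segments, which is less than segs_per_sec, so sec_freed
would not be incremented. The patched path reports 4.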




Re: [f2fs-dev] [PATCH] f2fs: avoid GC causing encrypted file corrupted

2018-10-24 Thread Yunlong Song


ping...

On 2018/9/18 20:39, Yunlong Song wrote:

The encrypted file may be corrupted by GC in following case:

Time 1: | segment 1 blkaddr = A |  GC -> | segment 2 blkaddr = B |
Encrypted block 1 is moved from blkaddr A of segment 1 to blkaddr B of
segment 2,

Time 2: | segment 1 blkaddr = B |  GC -> | segment 3 blkaddr = C |

Before page 1 is written back and if segment 2 become a victim, then
page 1 is moved from blkaddr B of segment 2 to blkaddr Cof segment 3,
during the GC process of Time 2, f2fs should wait for page 1 written back
before reading it, or move_data_block will read a garbage block from
blkaddr B since page is not written back to blkaddr B yet.

Commit 6aa58d8a ("f2fs: readahead encrypted block during GC") introduce
ra_data_block to read encrypted block, but it forgets to add
f2fs_wait_on_page_writeback to avoid racing between GC and flush.

Signed-off-by: Yunlong Song 
---
  fs/f2fs/gc.c | 10 ++
  1 file changed, 10 insertions(+)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index a4c1a41..c55fb62 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -641,6 +641,14 @@ static int ra_data_block(struct inode *inode, pgoff_t index)
fio.page = page;
fio.new_blkaddr = fio.old_blkaddr = dn.data_blkaddr;
  
+	/*
+* don't cache encrypted data into meta inode until previous dirty
+* data were writebacked to avoid racing between GC and flush.
+*/
+   f2fs_wait_on_page_writeback(page, DATA, true);
+
+   f2fs_wait_on_block_writeback(inode, dn.data_blkaddr);
+
fio.encrypted_page = f2fs_pagecache_get_page(META_MAPPING(sbi),
dn.data_blkaddr,
FGP_LOCK | FGP_CREAT, GFP_NOFS);
@@ -723,6 +731,8 @@ static void move_data_block(struct inode *inode, block_t bidx,
 */
f2fs_wait_on_page_writeback(page, DATA, true);
  
+	f2fs_wait_on_block_writeback(inode, dn.data_blkaddr);
+
err = f2fs_get_node_info(fio.sbi, dn.nid, &ni);
if (err)
goto put_out;


--
Thanks,
Yunlong Song





[f2fs-dev] [PATCH] f2fs: fix count of seg_freed to make sec_freed correct

2018-10-10 Thread Yunlong Song

When sbi->segs_per_sec > 1, and if some segno has 0 valid blocks before
gc starts, do_garbage_collect will skip counting seg_freed++, and this
will cause seg_freed < sbi->segs_per_sec and finally skip sec_freed++.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/gc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index a07241f..dc63cd5 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -1160,10 +1160,10 @@ static int do_garbage_collect(struct f2fs_sb_info *sbi,
 
stat_inc_seg_count(sbi, type, gc_type);
 
+next:
if (gc_type == FG_GC &&
get_valid_blocks(sbi, segno, false) == 0)
seg_freed++;
-next:
f2fs_put_page(sum_page, 0);
}
 
-- 
1.8.5.2




Re: [f2fs-dev] [PATCH] f2fs: avoid GC causing encrypted file corrupted

2018-09-18 Thread Yunlong Song





On 2018/9/19 2:17, Jaegeuk Kim wrote:

On 09/18, Yunlong Song wrote:

The encrypted file may be corrupted by GC in following case:

Time 1: | segment 1 blkaddr = A |  GC -> | segment 2 blkaddr = B |
Encrypted block 1 is moved from blkaddr A of segment 1 to blkaddr B of
segment 2,

Time 2: | segment 1 blkaddr = B |  GC -> | segment 3 blkaddr = C |

 segment 2 blkaddr = B?

Sorry for typing error.
Yes.



Before page 1 is written back and if segment 2 become a victim, then
page 1 is moved from blkaddr B of segment 2 to blkaddr Cof segment 3,

  C of ?

Yes.



during the GC process of Time 2, f2fs should wait for page 1 written back
before reading it, or move_data_block will read a garbage block from
blkaddr B since page is not written back to blkaddr B yet.

move_data_block() checks PageUptodate() so it won't get garbage, yes?
So, does ra_data_block need to check PageUptodate?
You mean that if page 1 is read from blkaddr B before it is written back to
blkaddr B, the page ends up not uptodate? Why? Is it because __read_end_io
checks "(bio->bi_status || PageError(page))" and calls
ClearPageUptodate(page)?




Commit 6aa58d8a ("f2fs: readahead encrypted block during GC") introduce
ra_data_block to read encrypted block, but it forgets to add
f2fs_wait_on_page_writeback to avoid racing between GC and flush.

Signed-off-by: Yunlong Song 
---
  fs/f2fs/gc.c | 10 ++
  1 file changed, 10 insertions(+)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index a4c1a41..c55fb62 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -641,6 +641,14 @@ static int ra_data_block(struct inode *inode, pgoff_t index)
fio.page = page;
fio.new_blkaddr = fio.old_blkaddr = dn.data_blkaddr;
  
+	/*
+* don't cache encrypted data into meta inode until previous dirty
+* data were writebacked to avoid racing between GC and flush.
+*/
+   f2fs_wait_on_page_writeback(page, DATA, true);
+
+   f2fs_wait_on_block_writeback(inode, dn.data_blkaddr);
+
fio.encrypted_page = f2fs_pagecache_get_page(META_MAPPING(sbi),
dn.data_blkaddr,
FGP_LOCK | FGP_CREAT, GFP_NOFS);
@@ -723,6 +731,8 @@ static void move_data_block(struct inode *inode, block_t bidx,
 */
f2fs_wait_on_page_writeback(page, DATA, true);
  
+	f2fs_wait_on_block_writeback(inode, dn.data_blkaddr);
+
err = f2fs_get_node_info(fio.sbi, dn.nid, &ni);
if (err)
goto put_out;
--
1.8.5.2

.



--
Thanks,
Yunlong Song





[f2fs-dev] [PATCH] f2fs: avoid GC causing encrypted file corrupted

2018-09-18 Thread Yunlong Song

The encrypted file may be corrupted by GC in following case:

Time 1: | segment 1 blkaddr = A |  GC -> | segment 2 blkaddr = B |
Encrypted block 1 is moved from blkaddr A of segment 1 to blkaddr B of
segment 2,

Time 2: | segment 1 blkaddr = B |  GC -> | segment 3 blkaddr = C |

Before page 1 is written back and if segment 2 become a victim, then
page 1 is moved from blkaddr B of segment 2 to blkaddr Cof segment 3,
during the GC process of Time 2, f2fs should wait for page 1 written back
before reading it, or move_data_block will read a garbage block from
blkaddr B since page is not written back to blkaddr B yet.

Commit 6aa58d8a ("f2fs: readahead encrypted block during GC") introduce
ra_data_block to read encrypted block, but it forgets to add
f2fs_wait_on_page_writeback to avoid racing between GC and flush.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/gc.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index a4c1a41..c55fb62 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -641,6 +641,14 @@ static int ra_data_block(struct inode *inode, pgoff_t index)
fio.page = page;
fio.new_blkaddr = fio.old_blkaddr = dn.data_blkaddr;
 
+   /*
+* don't cache encrypted data into meta inode until previous dirty
+* data were writebacked to avoid racing between GC and flush.
+*/
+   f2fs_wait_on_page_writeback(page, DATA, true);
+
+   f2fs_wait_on_block_writeback(inode, dn.data_blkaddr);
+
fio.encrypted_page = f2fs_pagecache_get_page(META_MAPPING(sbi),
dn.data_blkaddr,
FGP_LOCK | FGP_CREAT, GFP_NOFS);
@@ -723,6 +731,8 @@ static void move_data_block(struct inode *inode, block_t bidx,
 */
f2fs_wait_on_page_writeback(page, DATA, true);
 
+   f2fs_wait_on_block_writeback(inode, dn.data_blkaddr);
+
err = f2fs_get_node_info(fio.sbi, dn.nid, &ni);
if (err)
goto put_out;
-- 
1.8.5.2




Re: [f2fs-dev] [PATCH 4/5] f2fs: let BG_GC check every dirty segments and gc over a threshold

2018-07-24 Thread Yunlong Song





On 2018/7/24 22:52, Chao Yu wrote:

On 2018/7/23 22:10, Yunlong Song wrote:

BG_GC is triggered in idle time, so it is better check every dirty
segment and finds the best victim to gc. Otherwise, BG_GC will be
limited to only 8G areas, and probably select a victim which has nearly

If 8GB range is not enough and just hard code now, we can export it in sysfs and
do the configuration.
There is already a sysfs entry called max_victim_search, but its default
value is DEF_MAX_VICTIM_SEARCH, i.e. 4096, which equals 8GB. So can we
increase this default value to UINT_MAX?
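
The 8GB figure follows from the segment size. A small sketch of the
arithmetic, assuming the usual f2fs geometry of 4 KiB blocks and 512 blocks
per segment (2 MiB segments) and one segment per section.
search_span_bytes is a hypothetical helper, not a kernel function.

```c
#include <assert.h>

/*
 * Bytes covered when max_search sections are scanned, each section
 * holding segs_per_sec segments of 512 blocks of 4096 bytes.
 */
static inline unsigned long long search_span_bytes(unsigned int max_search,
						   unsigned int segs_per_sec)
{
	const unsigned long long seg_bytes = 512ULL * 4096ULL;

	assert(segs_per_sec > 0);
	return (unsigned long long)max_search * segs_per_sec * seg_bytes;
}
```

search_span_bytes(4096, 1) is 8 GiB. With larger sections the same search
limit covers proportionally more space.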



full of valid blocks, resulting a big WAI. Besides, we also add a

BGGC should move cold data anyway, if we only consider WA, hot data section can
be selected with very high probability, but hot data can do OPU itself sooner or
later, so moving them will cause higher WA.
Yes, but this problem also appears in the default 8G area: even there, the
victim section with the fewest valid blocks may still hold hot data. This
patch adds a bggc_threshold to avoid a big WA, and hopes SSR will write data
to sections that are over bggc_threshold but hold cold data. Since the
initial min_cost in BG_GC is UINT_MAX, BG_GC can always select a victim and
move blocks in the common case, but sometimes that is not needed. For
example, if there are already enough free sections and each dirty section
has the same number of valid blocks, letting BG_GC continue causes a big WA.




I think the better way is we can export a sysfs entry to adjust factor to
control weight of aging or valid block of section. So that, user can adjust it
to select less valid block candidate first instead of high aging one.
How about exporting bggc_threshold as a sysfs entry, with the default value
defined as the old fggc_threshold, i.e. (main - ovp) / (main - rsvd) *
BLKS_PER_SEC? The user can adjust this value to trade off the WA and non-WA
requirements.
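
As a worked example of the formula quoted above, here is a sketch in plain
integer arithmetic. toy_bggc_threshold and the sample numbers are
hypothetical. The multiplication is done before the division, since
dividing first would truncate the ratio to 0 with integers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * (main - ovp) / (main - rsvd) * BLKS_PER_SEC, with all counts in blocks.
 * The result is a number of valid blocks per section.
 */
static inline uint64_t toy_bggc_threshold(uint64_t main_count,
					  uint64_t ovp_count,
					  uint64_t resv_count,
					  uint64_t blks_per_sec)
{
	assert(main_count > resv_count && main_count >= ovp_count);
	return (main_count - ovp_count) * blks_per_sec /
	       (main_count - resv_count);
}
```

With 1000000 main blocks, 100000 overprovisioned, 50000 reserved and 512
blocks per section, the threshold comes out as 485 valid blocks.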



Thanks,


bggc_threshold (which is the old "fggc_threshold", so revert commit
"299254") to stop BG_GC when there is no good choice. This is especially
good for large section case to reduce WAI.

Signed-off-by: Yunlong Song 
---
  fs/f2fs/f2fs.h|  2 ++
  fs/f2fs/gc.c  | 23 ---
  fs/f2fs/segment.h |  9 +
  3 files changed, 31 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index f8a7b42..24a9d7f 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1220,6 +1220,8 @@ struct f2fs_sb_info {
unsigned int cur_fg_victim_sec; /* current FG_GC victim section num */
unsigned int cur_bg_victim_sec; /* current BG_GC victim section num */
unsigned int gc_mode;   /* current GC state */
+   /* threshold for selecting bg victims */
+   u64 bggc_threshold;
/* for skip statistic */
unsigned long long skipped_atomic_files[2]; /* FG_GC and BG_GC */
  
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 0e7a265..21e8d59 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -189,9 +189,8 @@ static void select_policy(struct f2fs_sb_info *sbi, int gc_type,
p->ofs_unit = sbi->segs_per_sec;
}
  
-	/* we need to check every dirty segments in the FG_GC case */
-   if (gc_type != FG_GC &&
-   (sbi->gc_mode != GC_URGENT) &&
+   /* we need to check every dirty segments in the GC case */
+   if (p->alloc_mode == SSR &&
p->max_search > sbi->max_victim_search)
p->max_search = sbi->max_victim_search;
  
@@ -230,6 +229,10 @@ static unsigned int check_bg_victims(struct f2fs_sb_info *sbi)
for_each_set_bit(secno, dirty_i->victim_secmap, MAIN_SECS(sbi)) {
if (sec_usage_check(sbi, secno))
continue;
+
+   if (no_bggc_candidate(sbi, secno))
+   continue;
+
clear_bit(secno, dirty_i->victim_secmap);
return GET_SEG_FROM_SEC(sbi, secno);
}
@@ -368,6 +371,10 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
if (sec_usage_check(sbi, secno))
goto next;
  
+		if (gc_type == BG_GC && p.alloc_mode == LFS &&
+   no_bggc_candidate(sbi, secno))
+   goto next;
+
cost = get_gc_cost(sbi, segno, &p);
  
  		if (p.min_cost > cost) {

@@ -1140,8 +1147,18 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
  
  void f2fs_build_gc_manager(struct f2fs_sb_info *sbi)

  {
+   u64 main_count, resv_count, ovp_count;
+
DIRTY_I(sbi)->v_ops = &default_v_ops;
  
+	/* threshold of # of valid blocks in a section for victims of BG_GC */
+   main_count = SM_I(sbi)->main_segments << sbi->log_blocks_per_seg;
+   resv_count = SM_I(sbi)->reserved_segments

Re: [f2fs-dev] [PATCH 2/5] f2fs: add cur_victim_sec for BG_GC to avoid skipping BG_GC victim

2018-07-24 Thread Yunlong Song





On 2018/7/24 22:17, Chao Yu wrote:

On 2018/7/24 21:39, Yunlong Song wrote:


On 2018/7/24 21:11, Chao Yu wrote:

On 2018/7/23 22:10, Yunlong Song wrote:

If f2fs aborts BG_GC, then the section bit of victim_secmap will be set,
which will cause the section skipped in the future get_victim of BG_GC.
In a worst case that each section in the victim_secmap is set and there
are enough free sections (so FG_GC can not be triggered), then BG_GC
will skip all the sections and cannot find any victims, causing BG_GC

If f2fs aborts BG_GC, we'd better to clear victim_secmap?

We can keep the bit set in victim_secmap for FG_GC use next time as before, the

No, I don't think we could assume that FGGC will come soon, and in adaptive
mode, after we triggered SSR aggressively, FG_GC will be much less.

For your case, we need to clear victim_secmap.
However, if it is cleared, FG_GC loses the chance to quickly select the
victim candidate that BG_GC selected and aborted in the last round, or that
still has some blocks un-GCed because they belong to an open atomic file.
Especially for the large section case, when BG_GC stops because the IO state
changes from idle to busy, it is better that FG_GC can continue to GC the
section selected before. So how about adding another map to record these
sections, and letting both FG_GC and BG_GC select them? The old
victim_secmap would keep its old logic: BG_GC cannot select the sections in
victim_secmap, but FG_GC can.





difference
is that this patch will make BG_GC ignore the bit set in victim_secmap, so BG_GC
can still
get the section (whose bit is set) as victim and do GC jobs.

I guess this scenario is the case our previous scheme tries to prevent: if all
blocks in the selected section are cached and set dirty, BG_GC will end up
doing nothing, which is inefficient.


OK, I understand.



Thanks,


failed each time. Besides, SSR also uses BG_GC to get ssr segment, if

Looks like foreground GC will try to grab section which is selected as
victim of background GC?

Yes, this is exactly the value of victim_secmap, it helps FG_GC reduce time in
selecting victims
and continue the job which BG_GC has not finished.


Thanks,


many sections in the victim_secmap are set, then SSR cannot get a proper
ssr segment to allocate blocks, which makes SSR inefficiently. To fix
this problem, we can add cur_victim_sec for BG_GC similar like that in
FG_GC to avoid selecting the same section repeatedly.

Signed-off-by: Yunlong Song 
---
   fs/f2fs/f2fs.h  |  3 ++-
   fs/f2fs/gc.c| 15 +--
   fs/f2fs/segment.h   |  3 ++-
   fs/f2fs/super.c |  3 ++-
   include/trace/events/f2fs.h | 18 --
   5 files changed, 27 insertions(+), 15 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 57a8851..f8a7b42 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1217,7 +1217,8 @@ struct f2fs_sb_info {
   /* for cleaning operations */
   struct mutex gc_mutex;/* mutex for GC */
   struct f2fs_gc_kthread*gc_thread;/* GC thread */
-unsigned int cur_victim_sec;/* current victim section num */
+unsigned int cur_fg_victim_sec;/* current FG_GC victim section num */
+unsigned int cur_bg_victim_sec;/* current BG_GC victim section num */
   unsigned int gc_mode;/* current GC state */
   /* for skip statistic */
   unsigned long long skipped_atomic_files[2];/* FG_GC and BG_GC */
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 2ba470d..705d419 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -367,8 +367,6 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
 if (sec_usage_check(sbi, secno))
   goto next;
-if (gc_type == BG_GC && test_bit(secno, dirty_i->victim_secmap))
-goto next;
 cost = get_gc_cost(sbi, segno, &p);
@@ -391,14 +389,17 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
   if (p.alloc_mode == LFS) {
   secno = GET_SEC_FROM_SEG(sbi, p.min_segno);
   if (gc_type == FG_GC)
-sbi->cur_victim_sec = secno;
-else
+sbi->cur_fg_victim_sec = secno;
+else {
   set_bit(secno, dirty_i->victim_secmap);
+sbi->cur_bg_victim_sec = secno;
+}
   }
   *result = (p.min_segno / p.ofs_unit) * p.ofs_unit;
 trace_f2fs_get_victim(sbi->sb, type, gc_type, &p,
-sbi->cur_victim_sec,
+sbi->cur_fg_victim_sec,
+sbi->cur_bg_victim_sec,
   prefree_segments(sbi), free_segments(sbi));
   }
   out:
@@ -1098,7 +1099,9 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
   }
 if (gc_type == FG_GC)
-sbi->cur_victim_sec = NULL_SEGNO;
+sbi->cur_fg_victim

Re: [f2fs-dev] [PATCH 2/5] f2fs: add cur_victim_sec for BG_GC to avoid skipping BG_GC victim

2018-07-24 Thread Yunlong Song





On 2018/7/24 21:11, Chao Yu wrote:

On 2018/7/23 22:10, Yunlong Song wrote:

If f2fs aborts BG_GC, then the section bit of victim_secmap will be set,
which will cause the section skipped in the future get_victim of BG_GC.
In a worst case that each section in the victim_secmap is set and there
are enough free sections (so FG_GC can not be triggered), then BG_GC
will skip all the sections and cannot find any victims, causing BG_GC

If f2fs aborts BG_GC, we'd better to clear victim_secmap?
We can keep the bit set in victim_secmap for FG_GC to use next time, as
before. The difference is that this patch makes BG_GC ignore the bit set in
victim_secmap, so BG_GC can still get the section (whose bit is set) as a
victim and do GC jobs.



failed each time. Besides, SSR also uses BG_GC to get ssr segment, if

Looks like foreground GC will try to grab section which is selected as
victim of background GC?
Yes, this is exactly the value of victim_secmap, it helps FG_GC reduce time in
selecting victims and continue the job which BG_GC has not finished.



Thanks,


many sections in the victim_secmap are set, then SSR cannot get a proper
ssr segment to allocate blocks, which makes SSR inefficiently. To fix
this problem, we can add cur_victim_sec for BG_GC similar like that in
FG_GC to avoid selecting the same section repeatedly.

Signed-off-by: Yunlong Song 
---
  fs/f2fs/f2fs.h  |  3 ++-
  fs/f2fs/gc.c| 15 +--
  fs/f2fs/segment.h   |  3 ++-
  fs/f2fs/super.c |  3 ++-
  include/trace/events/f2fs.h | 18 --
  5 files changed, 27 insertions(+), 15 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 57a8851..f8a7b42 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1217,7 +1217,8 @@ struct f2fs_sb_info {
/* for cleaning operations */
struct mutex gc_mutex;  /* mutex for GC */
struct f2fs_gc_kthread  *gc_thread; /* GC thread */
-   unsigned int cur_victim_sec;/* current victim section num */
+   unsigned int cur_fg_victim_sec; /* current FG_GC victim section num */
+   unsigned int cur_bg_victim_sec; /* current BG_GC victim section num */
unsigned int gc_mode;   /* current GC state */
/* for skip statistic */
unsigned long long skipped_atomic_files[2]; /* FG_GC and BG_GC */
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 2ba470d..705d419 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -367,8 +367,6 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
  
  		if (sec_usage_check(sbi, secno))

goto next;
-   if (gc_type == BG_GC && test_bit(secno, dirty_i->victim_secmap))
-   goto next;
  
  		cost = get_gc_cost(sbi, segno, &p);
  
@@ -391,14 +389,17 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,

if (p.alloc_mode == LFS) {
secno = GET_SEC_FROM_SEG(sbi, p.min_segno);
if (gc_type == FG_GC)
-   sbi->cur_victim_sec = secno;
-   else
+   sbi->cur_fg_victim_sec = secno;
+   else {
set_bit(secno, dirty_i->victim_secmap);
+   sbi->cur_bg_victim_sec = secno;
+   }
}
*result = (p.min_segno / p.ofs_unit) * p.ofs_unit;
  
  		trace_f2fs_get_victim(sbi->sb, type, gc_type, &p,

-   sbi->cur_victim_sec,
+   sbi->cur_fg_victim_sec,
+   sbi->cur_bg_victim_sec,
prefree_segments(sbi), free_segments(sbi));
}
  out:
@@ -1098,7 +1099,9 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
}
  
  	if (gc_type == FG_GC)

-   sbi->cur_victim_sec = NULL_SEGNO;
+   sbi->cur_fg_victim_sec = NULL_SEGNO;
+   else
+   sbi->cur_bg_victim_sec = NULL_SEGNO;
  
  	if (!sync) {

if (has_not_enough_free_secs(sbi, sec_freed, 0)) {
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index 5049551..b21bb96 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -787,7 +787,8 @@ static inline block_t sum_blk_addr(struct f2fs_sb_info *sbi, int base, int type)
  
  static inline bool sec_usage_check(struct f2fs_sb_info *sbi, unsigned int secno)

  {
-   if (IS_CURSEC(sbi, secno) || (sbi->cur_victim_sec == secno))
+   if (IS_CURSEC(sbi, secno) || (sbi->cur_fg_victim_sec == secno) ||
+   (sbi->cur_bg_victim_sec == secno))
return true;
return false;
  }
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 7187885..ef69ebf 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -2386,7 +2386,8 @@ static void init_sb_inf

[f2fs-dev] [PATCH v2] f2fs: clear victim_secmap when section has full valid blocks

2018-07-24 Thread Yunlong Song

Without this patch, f2fs clears a section's bit in victim_secmap only when
it finds out that the section has no valid blocks at all, but forgets to
clear the bit when every block in the section is valid.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/segment.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index cfff7cf..0a79554 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -776,7 +776,8 @@ static void __remove_dirty_segment(struct f2fs_sb_info *sbi, unsigned int segno,
if (test_and_clear_bit(segno, dirty_i->dirty_segmap[t]))
dirty_i->nr_dirty[t]--;
 
-   if (get_valid_blocks(sbi, segno, true) == 0)
+   if (get_valid_blocks(sbi, segno, true) == 0 ||
+   get_valid_blocks(sbi, segno, true) == BLKS_PER_SEC(sbi))
clear_bit(GET_SEC_FROM_SEG(sbi, segno),
dirty_i->victim_secmap);
}
-- 
1.8.5.2


--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
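
The condition this patch introduces can be read as a small predicate. The
following is a user-space sketch for illustration only, not kernel code:
should_clear_victim_bit(), valid_blocks and blks_per_sec are hypothetical
names standing in for the get_valid_blocks() result and BLKS_PER_SEC(sbi).

```c
#include <stdbool.h>

/*
 * Hypothetical helper: returns true when a section's bit in victim_secmap
 * should be cleared, i.e. when the section has nothing left to collect
 * (no valid blocks) or every block in it is still valid.
 */
static bool should_clear_victim_bit(unsigned int valid_blocks,
				    unsigned int blks_per_sec)
{
	return valid_blocks == 0 || valid_blocks == blks_per_sec;
}
```

A partially valid section keeps its bit, so only the two boundary cases
change the bitmap.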

[f2fs-dev] [PATCH 1/5] f2fs: clear victim_secmap when section has full valid blocks

2018-07-23 Thread Yunlong Song

Without this patch, f2fs clears a section's bit in victim_secmap only when
it finds out that the section has no valid blocks at all, but forgets to
clear the bit when every block in the section is valid.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/segment.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index cfff7cf..255bff5 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -776,7 +776,9 @@ static void __remove_dirty_segment(struct f2fs_sb_info *sbi, unsigned int segno,
if (test_and_clear_bit(segno, dirty_i->dirty_segmap[t]))
dirty_i->nr_dirty[t]--;
 
-   if (get_valid_blocks(sbi, segno, true) == 0)
+   if (get_valid_blocks(sbi, segno, true) == 0 ||
+   get_valid_blocks(sbi, segno, true) ==
+   (sbi->segs_per_sec << sbi->log_blocks_per_seg))
clear_bit(GET_SEC_FROM_SEG(sbi, segno),
dirty_i->victim_secmap);
}
-- 
1.8.5.2
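
The expression sbi->segs_per_sec << sbi->log_blocks_per_seg used in this
version is the number of blocks in one section. A worked sketch of that
arithmetic follows; blocks_per_section() is a hypothetical helper, and the
values 4 segments per section and log_blocks_per_seg = 9 (512 blocks per
segment) are assumptions chosen for the example.

```c
/*
 * Hypothetical helper: blocks in one section, computed as
 * segments per section times 2^log_blocks_per_seg.
 */
static unsigned int blocks_per_section(unsigned int segs_per_sec,
				       unsigned int log_blocks_per_seg)
{
	return segs_per_sec << log_blocks_per_seg;
}
```

With the assumed values this gives 4 * 512 = 2048 blocks. The v2 posting of
this patch writes the comparison with BLKS_PER_SEC(sbi) instead.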



[f2fs-dev] [PATCH 3/5] f2fs: clear_bit the SSR selected section in the victim_secmap

2018-07-23 Thread Yunlong Song

SSR uses get_victim to select an SSR segment in which to allocate data
blocks, which makes the previous result recorded in victim_secmap inaccurate,
so we had better clear the bit of that section in victim_secmap.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/gc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 705d419..0e7a265 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -394,7 +394,8 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
set_bit(secno, dirty_i->victim_secmap);
sbi->cur_bg_victim_sec = secno;
}
-   }
+   } else
+   clear_bit(secno, dirty_i->victim_secmap);
*result = (p.min_segno / p.ofs_unit) * p.ofs_unit;
 
trace_f2fs_get_victim(sbi->sb, type, gc_type, &p,
-- 
1.8.5.2



[f2fs-dev] [PATCH 2/5] f2fs: add cur_victim_sec for BG_GC to avoid skipping BG_GC victim

2018-07-23 Thread Yunlong Song

If f2fs aborts BG_GC, the section's bit in victim_secmap stays set, which
causes the section to be skipped in future get_victim calls of BG_GC.
In the worst case, where every bit in victim_secmap is set and there are
enough free sections (so FG_GC cannot be triggered), BG_GC skips all the
sections and cannot find any victim, so BG_GC fails each time. Besides, SSR
also uses BG_GC to get an SSR segment; if many bits in victim_secmap are
set, SSR cannot get a proper segment to allocate blocks from, which makes
SSR inefficient. To fix this problem, add a cur_victim_sec for BG_GC,
similar to the one for FG_GC, to avoid selecting the same section
repeatedly.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/f2fs.h  |  3 ++-
 fs/f2fs/gc.c| 15 +--
 fs/f2fs/segment.h   |  3 ++-
 fs/f2fs/super.c |  3 ++-
 include/trace/events/f2fs.h | 18 --
 5 files changed, 27 insertions(+), 15 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 57a8851..f8a7b42 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1217,7 +1217,8 @@ struct f2fs_sb_info {
/* for cleaning operations */
struct mutex gc_mutex;  /* mutex for GC */
struct f2fs_gc_kthread  *gc_thread; /* GC thread */
-   unsigned int cur_victim_sec;/* current victim section num */
+   unsigned int cur_fg_victim_sec; /* current FG_GC victim section num */
+   unsigned int cur_bg_victim_sec; /* current BG_GC victim section num */
unsigned int gc_mode;   /* current GC state */
/* for skip statistic */
unsigned long long skipped_atomic_files[2]; /* FG_GC and BG_GC */
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 2ba470d..705d419 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -367,8 +367,6 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
 
if (sec_usage_check(sbi, secno))
goto next;
-   if (gc_type == BG_GC && test_bit(secno, dirty_i->victim_secmap))
-   goto next;
 
cost = get_gc_cost(sbi, segno, &p);
 
@@ -391,14 +389,17 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
if (p.alloc_mode == LFS) {
secno = GET_SEC_FROM_SEG(sbi, p.min_segno);
if (gc_type == FG_GC)
-   sbi->cur_victim_sec = secno;
-   else
+   sbi->cur_fg_victim_sec = secno;
+   else {
set_bit(secno, dirty_i->victim_secmap);
+   sbi->cur_bg_victim_sec = secno;
+   }
}
*result = (p.min_segno / p.ofs_unit) * p.ofs_unit;
 
trace_f2fs_get_victim(sbi->sb, type, gc_type, &p,
-   sbi->cur_victim_sec,
+   sbi->cur_fg_victim_sec,
+   sbi->cur_bg_victim_sec,
prefree_segments(sbi), free_segments(sbi));
}
 out:
@@ -1098,7 +1099,9 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
}
 
if (gc_type == FG_GC)
-   sbi->cur_victim_sec = NULL_SEGNO;
+   sbi->cur_fg_victim_sec = NULL_SEGNO;
+   else
+   sbi->cur_bg_victim_sec = NULL_SEGNO;
 
if (!sync) {
if (has_not_enough_free_secs(sbi, sec_freed, 0)) {
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index 5049551..b21bb96 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -787,7 +787,8 @@ static inline block_t sum_blk_addr(struct f2fs_sb_info *sbi, int base, int type)
 
 static inline bool sec_usage_check(struct f2fs_sb_info *sbi, unsigned int secno)
 {
-   if (IS_CURSEC(sbi, secno) || (sbi->cur_victim_sec == secno))
+   if (IS_CURSEC(sbi, secno) || (sbi->cur_fg_victim_sec == secno) ||
+   (sbi->cur_bg_victim_sec == secno))
return true;
return false;
 }
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 7187885..ef69ebf 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -2386,7 +2386,8 @@ static void init_sb_info(struct f2fs_sb_info *sbi)
sbi->root_ino_num = le32_to_cpu(raw_super->root_ino);
sbi->node_ino_num = le32_to_cpu(raw_super->node_ino);
sbi->meta_ino_num = le32_to_cpu(raw_super->meta_ino);
-   sbi->cur_victim_sec = NULL_SECNO;
+   sbi->cur_fg_victim_sec = NULL_SECNO;
+   sbi->cur_bg_victim_sec = NULL_SECNO;
sbi->max_victim_search = DEF_MAX_VICTIM_SEARCH;
 
sbi->dir_level = DEF_DIR_LEVEL;
diff --git a/include/trace/events/f2fs.h b/include/trace/events/f2fs.h
index 7956989..0f01f82 100644
--- a/include/trace/events/f2fs.h
+++ b/include/trac
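
The starvation case described in the PATCH 2/5 message above can be modelled
in user space. This is a simplified sketch under stated assumptions, not the
kernel logic: struct gc_model, NR_SECS, NULL_SEC and both pick functions are
hypothetical, and the model returns the first eligible section instead of
comparing GC costs as get_victim_by_default() does.

```c
#include <stdbool.h>

#define NR_SECS   4        /* hypothetical number of sections */
#define NULL_SEC  (~0u)    /* stands in for NULL_SECNO */

struct gc_model {
	bool victim_secmap[NR_SECS];    /* set when BG_GC picked the section */
	unsigned int cur_fg_victim_sec; /* section FG_GC is working on */
	unsigned int cur_bg_victim_sec; /* section BG_GC is working on */
};

/* Old behaviour: BG_GC skips every section whose victim_secmap bit is set. */
static unsigned int pick_bg_victim_old(const struct gc_model *m)
{
	for (unsigned int secno = 0; secno < NR_SECS; secno++) {
		if (secno == m->cur_fg_victim_sec || m->victim_secmap[secno])
			continue;
		return secno;
	}
	return NULL_SEC;
}

/* Patched behaviour: only sections currently being collected are skipped. */
static unsigned int pick_bg_victim_new(const struct gc_model *m)
{
	for (unsigned int secno = 0; secno < NR_SECS; secno++) {
		if (secno == m->cur_fg_victim_sec ||
		    secno == m->cur_bg_victim_sec)
			continue;
		return secno;
	}
	return NULL_SEC;
}
```

When every bit in victim_secmap is set, pick_bg_victim_old() finds nothing,
while pick_bg_victim_new() still returns a section and only avoids the one
recorded in cur_bg_victim_sec.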

[f2fs-dev] [PATCH 5/5] f2fs: add proc entry to show victim_secmap bitmap

2018-07-23 Thread Yunlong Song

This patch adds a new proc entry to show victim_secmap information in
more detail, which helps to see the get_victim candidate status clearly
and to debug problems (e.g., some sections cannot have all of their
blocks GCed because some blocks belong to an atomic file, which leaves
the section bit set in victim_secmap; in an extreme case, this leads to
every byte of victim_secmap being set to 0xff).

Signed-off-by: Yunlong Song 
---
 fs/f2fs/sysfs.c | 25 +
 1 file changed, 25 insertions(+)

diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
index bca1236..f22782a 100644
--- a/fs/f2fs/sysfs.c
+++ b/fs/f2fs/sysfs.c
@@ -615,6 +615,28 @@ static int __maybe_unused iostat_info_seq_show(struct seq_file *seq,
return 0;
 }
 
+static int __maybe_unused victim_bits_seq_show(struct seq_file *seq,
+   void *offset)
+{
+   struct super_block *sb = seq->private;
+   struct f2fs_sb_info *sbi = F2FS_SB(sb);
+   struct dirty_seglist_info *dirty_i = DIRTY_I(sbi);
+   int i;
+
+   seq_puts(seq, "format: victim_secmap bitmaps\n");
+
+   for (i = 0; i < MAIN_SECS(sbi); i++) {
+   if ((i % 10) == 0)
+   seq_printf(seq, "%-10d", i);
+   seq_printf(seq, "%d", test_bit(i, dirty_i->victim_secmap) ? 1 : 0);
+   if ((i % 10) == 9 || i == (MAIN_SECS(sbi) - 1))
+   seq_putc(seq, '\n');
+   else
+   seq_putc(seq, ' ');
+   }
+   return 0;
+}
+
 int __init f2fs_init_sysfs(void)
 {
int ret;
@@ -664,6 +686,8 @@ int f2fs_register_sysfs(struct f2fs_sb_info *sbi)
segment_bits_seq_show, sb);
proc_create_single_data("iostat_info", S_IRUGO, sbi->s_proc,
iostat_info_seq_show, sb);
+   proc_create_single_data("victim_bits", S_IRUGO, sbi->s_proc,
+   victim_bits_seq_show, sb);
}
return 0;
 }
@@ -674,6 +698,7 @@ void f2fs_unregister_sysfs(struct f2fs_sb_info *sbi)
remove_proc_entry("iostat_info", sbi->s_proc);
remove_proc_entry("segment_info", sbi->s_proc);
remove_proc_entry("segment_bits", sbi->s_proc);
+   remove_proc_entry("victim_bits", sbi->s_proc);
remove_proc_entry(sbi->sb->s_id, f2fs_proc_root);
}
kobject_del(&sbi->s_kobj);
-- 
1.8.5.2
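
To make the output layout of victim_bits_seq_show() easier to picture, here
is a user-space sketch that renders a bitmap the same way: ten bits per row,
each row prefixed with the index of its first section in a left-aligned
10-character column. format_victim_bits() and its buffer-based interface are
hypothetical; the kernel code writes to a seq_file instead.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical user-space rendering of the victim_bits layout.
 * Returns the number of characters written, or -1 if buf is too small
 * (each row needs at most 30 characters, plus one terminating NUL).
 */
static int format_victim_bits(char *buf, size_t len,
			      const bool *secmap, unsigned int nr_secs)
{
	size_t rows = ((size_t)nr_secs + 9) / 10;
	int pos = 0;

	if (len < rows * 30 + 1)
		return -1;
	buf[0] = '\0';

	for (unsigned int i = 0; i < nr_secs; i++) {
		/* Start of a row: print the index of its first section. */
		if (i % 10 == 0)
			pos += snprintf(buf + pos, len - pos, "%-10u", i);
		pos += snprintf(buf + pos, len - pos, "%d", secmap[i] ? 1 : 0);
		/* End the row after ten bits or after the last section. */
		pos += snprintf(buf + pos, len - pos, "%c",
				(i % 10 == 9 || i == nr_secs - 1) ? '\n' : ' ');
	}
	return pos;
}
```

For a 12-section map this produces one full row starting at index 0 and a
short second row starting at index 10.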



[f2fs-dev] [PATCH 4/5] f2fs: let BG_GC check every dirty segments and gc over a threshold

2018-07-23 Thread Yunlong Song

BG_GC is triggered in idle time, so it is better to check every dirty
segment and find the best victim to GC. Otherwise, BG_GC is limited to
only an 8G area and will probably select a victim that is nearly full of
valid blocks, resulting in a big WAI. Besides, we also add a
bggc_threshold (which is the old "fggc_threshold", so this reverts commit
"299254") to stop BG_GC when there is no good choice. This is especially
good for the large-section case to reduce WAI.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/f2fs.h|  2 ++
 fs/f2fs/gc.c  | 23 ---
 fs/f2fs/segment.h |  9 +
 3 files changed, 31 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index f8a7b42..24a9d7f 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1220,6 +1220,8 @@ struct f2fs_sb_info {
unsigned int cur_fg_victim_sec; /* current FG_GC victim section num */
unsigned int cur_bg_victim_sec; /* current BG_GC victim section num */
unsigned int gc_mode;   /* current GC state */
+   /* threshold for selecting bg victims */
+   u64 bggc_threshold;
/* for skip statistic */
unsigned long long skipped_atomic_files[2]; /* FG_GC and BG_GC */
 
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 0e7a265..21e8d59 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -189,9 +189,8 @@ static void select_policy(struct f2fs_sb_info *sbi, int gc_type,
p->ofs_unit = sbi->segs_per_sec;
}
 
-   /* we need to check every dirty segments in the FG_GC case */
-   if (gc_type != FG_GC &&
-   (sbi->gc_mode != GC_URGENT) &&
+   /* we need to check every dirty segments in the GC case */
+   if (p->alloc_mode == SSR &&
p->max_search > sbi->max_victim_search)
p->max_search = sbi->max_victim_search;
 
@@ -230,6 +229,10 @@ static unsigned int check_bg_victims(struct f2fs_sb_info *sbi)
for_each_set_bit(secno, dirty_i->victim_secmap, MAIN_SECS(sbi)) {
if (sec_usage_check(sbi, secno))
continue;
+
+   if (no_bggc_candidate(sbi, secno))
+   continue;
+
clear_bit(secno, dirty_i->victim_secmap);
return GET_SEG_FROM_SEC(sbi, secno);
}
@@ -368,6 +371,10 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
if (sec_usage_check(sbi, secno))
goto next;
 
+   if (gc_type == BG_GC && p.alloc_mode == LFS &&
+   no_bggc_candidate(sbi, secno))
+   goto next;
+
cost = get_gc_cost(sbi, segno, &p);
 
if (p.min_cost > cost) {
@@ -1140,8 +1147,18 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
 
 void f2fs_build_gc_manager(struct f2fs_sb_info *sbi)
 {
+   u64 main_count, resv_count, ovp_count;
+
DIRTY_I(sbi)->v_ops = &default_v_ops;
 
+   /* threshold of # of valid blocks in a section for victims of BG_GC */
+   main_count = SM_I(sbi)->main_segments << sbi->log_blocks_per_seg;
+   resv_count = SM_I(sbi)->reserved_segments << sbi->log_blocks_per_seg;
+   ovp_count = SM_I(sbi)->ovp_segments << sbi->log_blocks_per_seg;
+
+   sbi->bggc_threshold = div64_u64((main_count - ovp_count) *
+   BLKS_PER_SEC(sbi), (main_count - resv_count));
+
sbi->gc_pin_file_threshold = DEF_GC_FAILED_PINNED_FILES;
 
/* give warm/cold data area from slower device */
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index b21bb96..932e59b 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -785,6 +785,15 @@ static inline block_t sum_blk_addr(struct f2fs_sb_info *sbi, int base, int type)
- (base + 1) + type;
 }
 
+static inline bool no_bggc_candidate(struct f2fs_sb_info *sbi,
+unsigned int secno)
+{
+if (get_valid_blocks(sbi, GET_SEG_FROM_SEC(sbi, secno), true) >
+sbi->bggc_threshold)
+return true;
+return false;
+}
+
 static inline bool sec_usage_check(struct f2fs_sb_info *sbi, unsigned int secno)
 {
if (IS_CURSEC(sbi, secno) || (sbi->cur_fg_victim_sec == secno) ||
-- 
1.8.5.2
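
The threshold set up in f2fs_build_gc_manager() above can be checked with a
small worked example. This is a user-space sketch, not the kernel code:
bggc_threshold() and over_bggc_threshold() are hypothetical helpers that
mirror the formula and the comparison in no_bggc_candidate(), and the layout
(1000 main, 50 reserved and 100 overprovision segments of 512 blocks, one
segment per section) is an assumption made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical user-space version of the threshold formula.
 * All counts are in blocks; main_count must exceed resv_count.
 */
static uint64_t bggc_threshold(uint64_t main_count, uint64_t resv_count,
			       uint64_t ovp_count, uint64_t blks_per_sec)
{
	return (main_count - ovp_count) * blks_per_sec /
	       (main_count - resv_count);
}

/* A section with more valid blocks than the threshold is skipped by BG_GC. */
static bool over_bggc_threshold(uint64_t valid_blocks, uint64_t threshold)
{
	return valid_blocks > threshold;
}
```

With the assumed layout, main_count = 512000, resv_count = 25600 and
ovp_count = 51200, so the threshold is 460800 * 512 / 486400 = 485 blocks
(integer division). A section holding exactly 485 valid blocks is still a
candidate; one holding 486 or more is not.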



[f2fs-dev] [PATCH 0/5] f2fs: fix and improve for victim_secmap

2018-07-23 Thread Yunlong Song

This series contains some fixes and improvements for the use of victim_secmap.

Yunlong Song (5):
  f2fs: clear victim_secmap when section has full valid blocks
  f2fs: add cur_victim_sec for BG_GC to avoid skipping BG_GC victim
  f2fs: clear_bit the SSR selected section in the victim_secmap
  f2fs: let BG_GC check every dirty segments and gc over a threshold
  f2fs: add proc entry to show victim_secmap bitmap

 fs/f2fs/f2fs.h  |  5 -
 fs/f2fs/gc.c| 39 ++-
 fs/f2fs/segment.c   |  4 +++-
 fs/f2fs/segment.h   | 12 +++-
 fs/f2fs/super.c |  3 ++-
 fs/f2fs/sysfs.c | 25 +
 include/trace/events/f2fs.h | 18 --
 7 files changed, 87 insertions(+), 19 deletions(-)

-- 
1.8.5.2



[f2fs-dev] [PATCH v3] f2fs: issue discard align to section in LFS mode

2018-07-19 Thread Yunlong Song

For the case when sbi->segs_per_sec > 1 in LFS mode, take
section:segment = 5 as an example. If the section's prefree_map is
...previous section | current section (1 1 0 1 1) | next section...,
then start = x and end = x + 1; after start = start_segno +
sbi->segs_per_sec, start = x + 5, which skips x + 3 and x + 4 although
their bits are still set. This causes a duplicated
f2fs_issue_discard of the same section in the next write_checkpoint:

round 1: section bitmap : 1 1 1 1 1, all valid, prefree_map: 0 0 0 0 0
then rm data block NO.2, block NO.2 becomes invalid, prefree_map: 0 0 1 0 0
write_checkpoint: section bitmap: 1 1 0 1 1, prefree_map: 0 0 0 0 0,
prefree of NO.2 is cleared, and no discard issued

round 2: rm data block NO.0, NO.1, NO.3, NO.4
all invalid, but prefree bit of NO.2 is set and cleared in round 1, then
prefree_map: 1 1 0 1 1
write_checkpoint: section bitmap: 0 0 0 0 0, prefree_map: 0 0 0 1 1, no
valid blocks of this section, so discard issued, but this time prefree
bit of NO.3 and NO.4 is skipped due to start = start_segno + sbi->segs_per_sec;

round 3:
write_checkpoint: section bitmap: 0 0 0 0 0, prefree_map: 0 0 0 1 1 ->
0 0 0 0 0, no valid blocks of this section, so discard issued,
this time prefree bit of NO.3 and NO.4 is cleared, but the discard of
this section is sent again...

To fix this problem, align the start and end values to the section
boundary for both fstrim and real-time discard, and issue a discard only
when the whole section is invalid. This issues discards aligned to the
section size as much as possible and avoids redundant discards.

Signed-off-by: Yunlong Song 
Signed-off-by: Chao Yu 
---
 fs/f2fs/segment.c | 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index cfff7cf..97770f3 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1710,21 +1710,30 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi,
unsigned int start = 0, end = -1;
unsigned int secno, start_segno;
bool force = (cpc->reason & CP_DISCARD);
+   bool need_align = test_opt(sbi, LFS) && sbi->segs_per_sec > 1;
 
mutex_lock(&dirty_i->seglist_lock);
 
while (1) {
int i;
+
+   if (need_align && end != -1)
+   end--;
start = find_next_bit(prefree_map, MAIN_SEGS(sbi), end + 1);
if (start >= MAIN_SEGS(sbi))
break;
end = find_next_zero_bit(prefree_map, MAIN_SEGS(sbi),
start + 1);
 
-   for (i = start; i < end; i++)
-   clear_bit(i, prefree_map);
+   if (need_align) {
+   start = rounddown(start, sbi->segs_per_sec);
+   end = roundup(end, sbi->segs_per_sec);
+   }
 
-   dirty_i->nr_dirty[PRE] -= end - start;
+   for (i = start; i < end; i++) {
+   if (test_and_clear_bit(i, prefree_map))
+   dirty_i->nr_dirty[PRE]--;
+   }
 
if (!test_opt(sbi, DISCARD))
continue;
@@ -2511,6 +2520,7 @@ int f2fs_trim_fs(struct f2fs_sb_info *sbi, struct fstrim_range *range)
struct discard_policy dpolicy;
unsigned long long trimmed = 0;
int err = 0;
+   bool need_align = test_opt(sbi, LFS) && sbi->segs_per_sec > 1;
 
if (start >= MAX_BLKADDR(sbi) || range->len < sbi->blocksize)
return -EINVAL;
@@ -2528,6 +2538,10 @@ int f2fs_trim_fs(struct f2fs_sb_info *sbi, struct fstrim_range *range)
start_segno = (start <= MAIN_BLKADDR(sbi)) ? 0 : GET_SEGNO(sbi, start);
end_segno = (end >= MAX_BLKADDR(sbi)) ? MAIN_SEGS(sbi) - 1 :
GET_SEGNO(sbi, end);
+   if (need_align) {
+   start_segno = rounddown(start_segno, sbi->segs_per_sec);
+   end_segno = roundup(end_segno + 1, sbi->segs_per_sec) - 1;
+   }
 
cpc.reason = CP_DISCARD;
cpc.trim_minlen = max_t(__u64, 1, F2FS_BYTES_TO_BLK(range->minlen));
-- 
1.8.5.2
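
The aligned scan in f2fs_clear_prefree_segments() above can be sketched in
user space on the (1 1 0 1 1) example from the message. This is a simplified
model under stated assumptions, not the kernel loop: NSEGS, next_bit() and
clear_prefree_aligned() are hypothetical, a bool array replaces the bitmap,
and the search simply resumes at the aligned end instead of using the
end--/end + 1 bookkeeping of the patch.

```c
#include <stdbool.h>

#define NSEGS 10   /* hypothetical: two sections of five segments each */

/* Index of the first entry at or after 'from' equal to val, else NSEGS. */
static unsigned int next_bit(const bool *map, unsigned int from, bool val)
{
	while (from < NSEGS && map[from] != val)
		from++;
	return from;
}

/*
 * Clear prefree bits run by run, widening each run to section boundaries.
 * Returns how many bits were really cleared, which is the amount
 * nr_dirty[PRE] has to drop by.
 */
static unsigned int clear_prefree_aligned(bool *prefree,
					  unsigned int segs_per_sec)
{
	unsigned int start, end = 0, cleared = 0;

	for (;;) {
		start = next_bit(prefree, end, true);
		if (start >= NSEGS)
			break;
		end = next_bit(prefree, start + 1, false);

		/* rounddown(start) and roundup(end) to the section size */
		start = start / segs_per_sec * segs_per_sec;
		end = (end + segs_per_sec - 1) / segs_per_sec * segs_per_sec;
		if (end > NSEGS)
			end = NSEGS;

		for (unsigned int i = start; i < end; i++) {
			if (prefree[i]) {
				prefree[i] = false;
				cleared++;
			}
		}
	}
	return cleared;
}
```

On the example bitmap the first run (segments 0 and 1) is widened to the
whole section 0..4, so all four set bits are cleared in one pass. The widened
range holds five segments but only four set bits, which is presumably why the
patch counts with test_and_clear_bit() rather than subtracting end - start.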



[f2fs-dev] [PATCH v2] f2fs: issue discard align to section in LFS mode

2018-07-17 Thread Yunlong Song

For the case when sbi->segs_per_sec > 1 in LFS mode, take
section:segment = 5 as an example. If the section's prefree_map is
...previous section | current section (1 1 0 1 1) | next section...,
then start = x and end = x + 1; after start = start_segno +
sbi->segs_per_sec, start = x + 5, which skips x + 3 and x + 4 although
their bits are still set. This causes a duplicated
f2fs_issue_discard of the same section in the next write_checkpoint:

round 1: section bitmap : 1 1 1 1 1, all valid, prefree_map: 0 0 0 0 0
then rm data block NO.2, block NO.2 becomes invalid, prefree_map: 0 0 1 0 0
write_checkpoint: section bitmap: 1 1 0 1 1, prefree_map: 0 0 0 0 0,
prefree of NO.2 is cleared, and no discard issued

round 2: rm data block NO.0, NO.1, NO.3, NO.4
all invalid, but prefree bit of NO.2 is set and cleared in round 1, then
prefree_map: 1 1 0 1 1
write_checkpoint: section bitmap: 0 0 0 0 0, prefree_map: 0 0 0 1 1, no
valid blocks of this section, so discard issued, but this time prefree
bit of NO.3 and NO.4 is skipped due to start = start_segno + sbi->segs_per_sec;

round 3:
write_checkpoint: section bitmap: 0 0 0 0 0, prefree_map: 0 0 0 1 1 ->
0 0 0 0 0, no valid blocks of this section, so discard issued,
this time prefree bit of NO.3 and NO.4 is cleared, but the discard of
this section is sent again...

To fix this problem, align the start and end values to the section
boundary for both fstrim and real-time discard, and issue a discard only
when the whole section is invalid. This issues discards aligned to the
section size as much as possible and avoids redundant discards.

Signed-off-by: Yunlong Song 
Signed-off-by: Chao Yu 
---
 fs/f2fs/segment.c | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index cfff7cf..c25e2eb 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1710,6 +1710,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi,
unsigned int start = 0, end = -1;
unsigned int secno, start_segno;
bool force = (cpc->reason & CP_DISCARD);
+   bool need_align = test_opt(sbi, LFS) && sbi->segs_per_sec > 1;
 
mutex_lock(&dirty_i->seglist_lock);
 
@@ -1721,14 +1722,19 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi,
end = find_next_zero_bit(prefree_map, MAIN_SEGS(sbi),
start + 1);
 
-   for (i = start; i < end; i++)
-   clear_bit(i, prefree_map);
-
-   dirty_i->nr_dirty[PRE] -= end - start;
-
if (!test_opt(sbi, DISCARD))
continue;
 
+   if (need_align) {
+   start = rounddown(start, sbi->segs_per_sec);
+   end = roundup(end, sbi->segs_per_sec);
+   }
+
+   for (i = start; i < end; i++) {
+   if (test_and_clear_bit(i, prefree_map))
+   dirty_i->nr_dirty[PRE]--;
+   }
+
if (force && start >= cpc->trim_start &&
(end - 1) <= cpc->trim_end)
continue;
@@ -2511,6 +2517,7 @@ int f2fs_trim_fs(struct f2fs_sb_info *sbi, struct fstrim_range *range)
struct discard_policy dpolicy;
unsigned long long trimmed = 0;
int err = 0;
+   bool need_align = test_opt(sbi, LFS) && sbi->segs_per_sec > 1;
 
if (start >= MAX_BLKADDR(sbi) || range->len < sbi->blocksize)
return -EINVAL;
@@ -2528,6 +2535,10 @@ int f2fs_trim_fs(struct f2fs_sb_info *sbi, struct fstrim_range *range)
start_segno = (start <= MAIN_BLKADDR(sbi)) ? 0 : GET_SEGNO(sbi, start);
end_segno = (end >= MAX_BLKADDR(sbi)) ? MAIN_SEGS(sbi) - 1 :
GET_SEGNO(sbi, end);
+   if (need_align) {
+   start_segno = rounddown(start_segno, sbi->segs_per_sec);
+   end_segno = roundup(end_segno + 1, sbi->segs_per_sec) - 1;
+   }
 
cpc.reason = CP_DISCARD;
cpc.trim_minlen = max_t(__u64, 1, F2FS_BYTES_TO_BLK(range->minlen));
-- 
1.8.5.2
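
The f2fs_trim_fs() hunk above widens an inclusive segment range to whole
sections. A user-space sketch of that arithmetic follows; struct seg_range,
round_down(), round_up() and align_trim_range() are hypothetical names, and
5 segments per section is the example value from the message.

```c
/* Round x down or up to a multiple of n (n > 0). */
static unsigned int round_down(unsigned int x, unsigned int n)
{
	return x / n * n;
}

static unsigned int round_up(unsigned int x, unsigned int n)
{
	return (x + n - 1) / n * n;
}

/* Hypothetical inclusive segment range [start_segno, end_segno]. */
struct seg_range {
	unsigned int start_segno;
	unsigned int end_segno;
};

/* Widen the range so it starts and ends on section boundaries. */
static struct seg_range align_trim_range(struct seg_range r,
					 unsigned int segs_per_sec)
{
	r.start_segno = round_down(r.start_segno, segs_per_sec);
	/* end is inclusive: round the exclusive end up, then step back */
	r.end_segno = round_up(r.end_segno + 1, segs_per_sec) - 1;
	return r;
}
```

For example, segments 7..11 become 5..14, which covers sections 1 and 2
completely, while an already aligned range such as 5..9 is left unchanged.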



[f2fs-dev] [PATCH] f2fs: issue discard align to section in LFS mode

2018-07-17 Thread Yunlong Song

For the case when sbi->segs_per_sec > 1 in LFS mode, take
section:segment = 5 as an example. If the section's prefree_map is
...previous section | current section (1 1 0 1 1) | next section...,
then start = x and end = x + 1; after start = start_segno +
sbi->segs_per_sec, start = x + 5, which skips x + 3 and x + 4 although
their bits are still set. This causes a duplicated
f2fs_issue_discard of the same section in the next write_checkpoint:

round 1: section bitmap : 1 1 1 1 1, all valid, prefree_map: 0 0 0 0 0
then rm data block NO.2, block NO.2 becomes invalid, prefree_map: 0 0 1 0 0
write_checkpoint: section bitmap: 1 1 0 1 1, prefree_map: 0 0 0 0 0,
prefree of NO.2 is cleared, and no discard issued

round 2: rm data block NO.0, NO.1, NO.3, NO.4
all invalid, but prefree bit of NO.2 is set and cleared in round 1, then
prefree_map: 1 1 0 1 1
write_checkpoint: section bitmap: 0 0 0 0 0, prefree_map: 0 0 0 1 1, no
valid blocks of this section, so discard issued, but this time prefree
bit of NO.3 and NO.4 is skipped due to start = start_segno + sbi->segs_per_sec;

round 3:
write_checkpoint: section bitmap: 0 0 0 0 0, prefree_map: 0 0 0 1 1 ->
0 0 0 0 0, no valid blocks of this section, so discard issued,
this time prefree bit of NO.3 and NO.4 is cleared, but the discard of
this section is sent again...

To fix this problem, align the start and end values to section
boundaries for both fstrim and real-time discard, and issue a discard
only when the whole section is invalid. This keeps discards aligned to
the section size as much as possible and avoids redundant discards.
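
As a rough illustration of the aligned loop, here is a minimal user-space
sketch (not kernel code). The names NSEGS, SEGS_PER_SEC, scan and
clear_prefree_aligned are hypothetical, the bitmap is a plain int array, and
the sketch assumes every visited section has no valid blocks and is not a
current section, so each one receives a discard.

```c
#define NSEGS 10          /* two sections of five segments (illustrative) */
#define SEGS_PER_SEC 5

/* Linear stand-in for find_next_bit()/find_next_zero_bit(). */
static unsigned int scan(const int *map, unsigned int from, int want)
{
	unsigned int i;

	for (i = from; i < NSEGS; i++)
		if (map[i] == want)
			return i;
	return NSEGS;
}

/*
 * Clear prefree bits over a section-aligned range and count one discard
 * per section. *nr_prefree is decremented once per bit actually cleared,
 * mirroring the test_and_clear_bit() accounting in the patch.
 */
static int clear_prefree_aligned(int *map, int *nr_prefree)
{
	unsigned int start = 0, end = -1, i;
	int issued = 0;

	while (1) {
		start = scan(map, end + 1, 1);
		if (start >= NSEGS)
			break;
		end = scan(map, start + 1, 0);

		/* rounddown(start) / roundup(end) to the section boundary */
		start -= start % SEGS_PER_SEC;
		end = (end + SEGS_PER_SEC - 1) / SEGS_PER_SEC * SEGS_PER_SEC;

		for (i = start; i < end; i++) {
			if (map[i]) {
				map[i] = 0;
				(*nr_prefree)--;
			}
		}

		/* one discard per section covered by [start, end) */
		for (; start < end; start += SEGS_PER_SEC)
			issued++;
		end = start - 1;
	}
	return issued;
}
```

With prefree_map 1 1 0 1 1 | 0 0 0 0 0 and a prefree count of 4, the first
call issues one discard and leaves no bit set, so a second call has nothing
left to do.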

Signed-off-by: Yunlong Song 
Signed-off-by: Chao Yu 
---
 fs/f2fs/segment.c | 24 +++-
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index cfff7cf..ff5de34 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1710,6 +1710,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi,
unsigned int start = 0, end = -1;
unsigned int secno, start_segno;
bool force = (cpc->reason & CP_DISCARD);
+   bool need_align = test_opt(sbi, LFS) && sbi->segs_per_sec > 1;
 
mutex_lock(&dirty_i->seglist_lock);
 
@@ -1721,14 +1722,19 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi,
end = find_next_zero_bit(prefree_map, MAIN_SEGS(sbi),
start + 1);
 
-   for (i = start; i < end; i++)
-   clear_bit(i, prefree_map);
-
-   dirty_i->nr_dirty[PRE] -= end - start;
-
if (!test_opt(sbi, DISCARD))
continue;
 
+   if (need_align) {
+   start = rounddown(start, sbi->segs_per_sec);
+   end = roundup(end, sbi->segs_per_sec);
+   }
+
+   for (i = start; i < end; i++) {
+   if (test_and_clear_bit(i, prefree_map))
+   dirty_i->nr_dirty[PRE]--;
+   }
+
if (force && start >= cpc->trim_start &&
(end - 1) <= cpc->trim_end)
continue;
@@ -2511,6 +2517,7 @@ int f2fs_trim_fs(struct f2fs_sb_info *sbi, struct fstrim_range *range)
struct discard_policy dpolicy;
unsigned long long trimmed = 0;
int err = 0;
+   bool need_align = test_opt(sbi, LFS) && sbi->segs_per_sec > 1;
 
if (start >= MAX_BLKADDR(sbi) || range->len < sbi->blocksize)
return -EINVAL;
@@ -2528,6 +2535,13 @@ int f2fs_trim_fs(struct f2fs_sb_info *sbi, struct fstrim_range *range)
start_segno = (start <= MAIN_BLKADDR(sbi)) ? 0 : GET_SEGNO(sbi, start);
end_segno = (end >= MAX_BLKADDR(sbi)) ? MAIN_SEGS(sbi) - 1 :
GET_SEGNO(sbi, end);
+   if (need_align) {
+   start_segno = rounddown(start_segno, sbi->segs_per_sec);
+   end_segno = roundup(end_segno, sbi->segs_per_sec);
+   if (start_segno == end_segno)
+   end_segno += sbi->segs_per_sec;
+   end_segno--;
+   }
 
cpc.reason = CP_DISCARD;
cpc.trim_minlen = max_t(__u64, 1, F2FS_BYTES_TO_BLK(range->minlen));
-- 
1.8.5.2



Re: [f2fs-dev] [PATCH v2] f2fs: clear the remaining prefree_map of the section

2018-07-17 Thread Yunlong Song





On 2018/7/17 17:05, Chao Yu wrote:

On 2018/7/16 18:03, Yunlong Song wrote:

For the case when sbi->segs_per_sec > 1 with lfs mode, take
section:segment = 5 for example, if the section prefree_map is
...previous section | current section (1 1 0 1 1) | next section...,
then the start = x, end = x + 1, after start = start_segno +
sbi->segs_per_sec, start = x + 5, then it will skip x + 3 and x + 4, but
their bitmap is still set, which will cause duplicated
f2fs_issue_discard of this same section in the next write_checkpoint, so
fix it.

I mean:

Subject: [PATCH] f2fs: issue discard align to section in LFS mode

---
  fs/f2fs/segment.c | 19 ---
  1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index f12dad627fb4..6640c790cf64 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1761,6 +1761,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi,
unsigned int start = 0, end = -1;
unsigned int secno, start_segno;
bool force = (cpc->reason & CP_DISCARD);
+   bool need_align = test_opt(sbi, LFS) && sbi->segs_per_sec > 1;

mutex_lock(&dirty_i->seglist_lock);

@@ -1772,10 +1773,15 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi,
end = find_next_zero_bit(prefree_map, MAIN_SEGS(sbi),
start + 1);

-   for (i = start; i < end; i++)
-   clear_bit(i, prefree_map);
+   if (need_align) {
+   start = rounddown(start, sbi->segs_per_sec);
+   end = roundup(start, sbi->segs_per_sec);
+   }

-   dirty_i->nr_dirty[PRE] -= end - start;
+   for (i = start; i < end; i++) {
+   if (test_and_clear_bit(i, prefree_map))
+   dirty_i->nr_dirty[PRE]--;
+   }
The above part should be placed below the if (!test_opt(sbi, DISCARD)) check.
Otherwise, in the test_opt(sbi, DISCARD) == 0 case, the first segment of the
next section will be skipped when only that first segment is prefree and all
the other segments of that section are not.


if (!test_opt(sbi, DISCARD))
continue;
@@ -2564,6 +2570,7 @@ int f2fs_trim_fs(struct f2fs_sb_info *sbi, struct fstrim_range *range)
struct discard_policy dpolicy;
unsigned long long trimmed = 0;
int err = 0;
+   bool need_align = test_opt(sbi, LFS) && sbi->segs_per_sec > 1;

if (start >= MAX_BLKADDR(sbi) || range->len < sbi->blocksize)
return -EINVAL;
@@ -2582,6 +2589,12 @@ int f2fs_trim_fs(struct f2fs_sb_info *sbi, struct fstrim_range *range)
end_segno = (end >= MAX_BLKADDR(sbi)) ? MAIN_SEGS(sbi) - 1 :
GET_SEGNO(sbi, end);

+   if (need_align) {
+   start_segno = rounddown(start_segno, sbi->segs_per_sec);
+   end_segno = roundup(end_segno, sbi->segs_per_sec);
+   end_segno = min(end_segno, MAIN_SEGS(sbi) - 1);
+   }
+
cpc.reason = CP_DISCARD;
cpc.trim_minlen = max_t(__u64, 1, F2FS_BYTES_TO_BLK(range->minlen));
cpc.trim_start = start_segno;
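
The placement concern raised inline above (aligning the range before the
early continue for the DISCARD-disabled case) can be sketched in user space
as follows. This is only a model under stated assumptions: NSEGS,
SEGS_PER_SEC, scan and clear_without_discard are hypothetical names, the
bitmap is an int array, and end is rounded up from end as in the posted
patch.

```c
#define NSEGS 10          /* two sections of five segments (illustrative) */
#define SEGS_PER_SEC 5

/* Linear stand-in for find_next_bit()/find_next_zero_bit(). */
static unsigned int scan(const int *map, unsigned int from, int want)
{
	unsigned int i;

	for (i = from; i < NSEGS; i++)
		if (map[i] == want)
			return i;
	return NSEGS;
}

/*
 * Model of the path taken when DISCARD is disabled: bits are cleared and
 * the loop continues immediately, so the next search starts at end + 1.
 * A non-zero align rounds the range to section boundaries before that
 * early continue. Returns the number of prefree bits still set.
 */
static int clear_without_discard(int *map, int align)
{
	unsigned int start = 0, end = -1, i;
	int left = 0;

	while (1) {
		start = scan(map, end + 1, 1);
		if (start >= NSEGS)
			break;
		end = scan(map, start + 1, 0);

		if (align) {
			start -= start % SEGS_PER_SEC;
			end = (end + SEGS_PER_SEC - 1) /
				SEGS_PER_SEC * SEGS_PER_SEC;
		}

		for (i = start; i < end; i++)
			map[i] = 0;
		/* test_opt(sbi, DISCARD) == 0: continue from here */
	}

	for (i = 0; i < NSEGS; i++)
		left += map[i];
	return left;
}
```

For prefree_map 1 1 0 0 0 | 1 0 0 0 0, the unaligned variant clears every
bit, while the aligned variant resumes its search at segment 6 and leaves
the bit of segment 5 set.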


--
Thanks,
Yunlong Song




[f2fs-dev] [PATCH v2] f2fs: clear the remaining prefree_map of the section

2018-07-16 Thread Yunlong Song

When sbi->segs_per_sec > 1 in lfs mode, take section:segment = 5 as an
example. If the section prefree_map is
...previous section | current section (1 1 0 1 1) | next section...,
then start = x and end = x + 1. After start = start_segno +
sbi->segs_per_sec, start = x + 5, so x + 3 and x + 4 are skipped while
their prefree bits are still set. This causes a duplicated
f2fs_issue_discard of the same section in the next write_checkpoint, so
fix it.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/segment.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index cfff7cf..5dc1d5cc 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1729,6 +1729,15 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi,
if (!test_opt(sbi, DISCARD))
continue;
 
+   if (test_opt(sbi, LFS) && sbi->segs_per_sec > 1) {
+   start = rounddown(start, sbi->segs_per_sec);
+   i = end;
+   end = roundup(end, sbi->segs_per_sec);
+   while (++i < end)
+   if (test_and_clear_bit(i, prefree_map))
+   dirty_i->nr_dirty[PRE]--;
+   }
+
if (force && start >= cpc->trim_start &&
(end - 1) <= cpc->trim_end)
continue;
-- 
1.8.5.2



Re: [f2fs-dev] [PATCH 2/5] f2fs: clear the remaining prefree_map of the section

2018-07-12 Thread Yunlong Song


Because of this code in f2fs_clear_prefree_segments:
...
while (1) {
int i;
start = find_next_bit(prefree_map, MAIN_SEGS(sbi), end + 1);
if (start >= MAIN_SEGS(sbi))
break;
end = find_next_zero_bit(prefree_map, MAIN_SEGS(sbi),
start + 1);

for (i = start; i < end; i++)
clear_bit(i, prefree_map);
...
next:
secno = GET_SEC_FROM_SEG(sbi, start);
start_segno = GET_SEG_FROM_SEC(sbi, secno);
if (!IS_CURSEC(sbi, secno) &&
!get_valid_blocks(sbi, start, true))
f2fs_issue_discard(sbi, START_BLOCK(sbi, start_segno),
sbi->segs_per_sec << sbi->log_blocks_per_seg);

start = start_segno + sbi->segs_per_sec;
if (start < end)
goto next;
else
end = start - 1;
...
In round 2, for prefree_map: 1 1 0 1 1, start = 0 and end = 2. Then
start = start_segno + sbi->segs_per_sec makes start = 5, so the check
if (start < end) fails (start = 5, end = 2) and end = start - 1 sets
end = 4. Control returns to the while loop, which now searches from
end + 1 = 5 and therefore skips prefree bits 3 and 4.
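
The walk-through above can be reproduced with a small user-space model of
the loop. This is a sketch rather than the kernel code: NSEGS, SEGS_PER_SEC,
next_bit and clear_prefree_old are hypothetical names, the bitmap is an int
array, and it assumes the section has no valid blocks and is not a current
section, so the discard condition always holds.

```c
#define NSEGS 10          /* two sections of five segments (illustrative) */
#define SEGS_PER_SEC 5

/* Linear stand-in for find_next_bit()/find_next_zero_bit(). */
static unsigned int next_bit(const int *map, unsigned int from, int want)
{
	unsigned int i;

	for (i = from; i < NSEGS; i++)
		if (map[i] == want)
			return i;
	return NSEGS;
}

/*
 * Pre-patch loop for the lfs, segs_per_sec > 1 path. discards[] counts
 * how often each section was discarded; the return value is the number
 * of discards issued by this call.
 */
static int clear_prefree_old(int *map, int *discards)
{
	unsigned int start = 0, end = -1, i, secno, start_segno;
	int issued = 0;

	while (1) {
		start = next_bit(map, end + 1, 1);
		if (start >= NSEGS)
			break;
		end = next_bit(map, start + 1, 0);

		for (i = start; i < end; i++)
			map[i] = 0;
next:
		secno = start / SEGS_PER_SEC;
		start_segno = secno * SEGS_PER_SEC;
		discards[secno]++;	/* f2fs_issue_discard() of the section */
		issued++;

		start = start_segno + SEGS_PER_SEC;
		if (start < end)
			goto next;
		else
			end = start - 1;
	}
	return issued;
}
```

Starting from prefree_map 1 1 0 1 1 | 0 0 0 0 0, the first call discards
section 0 once and leaves bits 3 and 4 set; a second call clears them and
discards the same section again.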

On 2018/7/13 11:42, Chao Yu wrote:

On 2018/7/13 11:28, Yunlong Song wrote:

round 1: section bitmap : 1 1 1 1 1, all valid, prefree_map: 0 0 0 0 0
then rm data block NO.2, block NO.2 becomes invalid, prefree_map: 0 0 1 0 0
write_checkpoint: section bitmap: 1 1 0 1 1, prefree_map: 0 0 0 0 0,
prefree of NO.2 is cleared, and no discard issued

round2: rm data block NO.0, NO.1, NO.3, NO.4
all invalid, but prefree bit of NO.2 is set and cleared in round1, then
prefree_map: 1 1 0 1 1
write_checkpoint: section bitmap: 0 0 0 0 0, prefree_map: 0 0 0 1 1, no

Why prefree_map is not 0 0 0 0 0?

Thanks,


valid blocks of this section, so discard issued
but this time prefree bit of NO.3 and NO.4 is skipped...

round3:
write_checkpoint: section bitmap: 0 0 0 0 0, prefree_map: 0 0 0 1 1 - >
0 0 0 0 0, no valid blocks of this section, so discard issued
this time prefree bit of NO.3 and NO.4 is cleared, but the discard of
this section is sent again...

On 2018/7/13 11:13, Chao Yu wrote:

On 2018/7/12 23:09, Yunlong Song wrote:

For the case when sbi->segs_per_sec > 1, take section:segment = 5 for
example, if the section prefree_map is ...previous section | current
section (1 1 0 1 1) | next section..., then the start = x, end = x + 1,
after start = start_segno + sbi->segs_per_sec, start = x + 5, then it
will skip x + 3 and x + 4, but their bitmap is still set, which will
cause duplicated f2fs_issue_discard of this same section in the next
write_checkpoint, so fix it.

I didn't get it, if # 2 segment is not prefree state, so it still has valid
blocks there, so we won't issue discard due to below condition, right?

if (!IS_CURSEC(sbi, secno) &&
    !get_valid_blocks(sbi, start, true))

Thanks,


Signed-off-by: Yunlong Song 
---
   fs/f2fs/segment.c | 19 +--
   1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 47b6595..fd38b61 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1684,8 +1684,23 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi,
start = start_segno + sbi->segs_per_sec;
if (start < end)
goto next;
-   else
-   end = start - 1;
+   else {
+   start_segno = start;
+
+   while (1) {
+   start = find_next_bit(prefree_map, start_segno,
+   end + 1);
+   if (start >= start_segno)
+   break;
+   end = find_next_zero_bit(prefree_map, start_segno,
+   start + 1);
+   for (i = start; i < end; i++)
+   clear_bit(i, prefree_map);
+   dirty_i->nr_dirty[PRE] -= end - start;
+   }
+
+   end = start_segno - 1;
+   }
    }
mutex_unlock(&dirty_i->seglist_lock);
   


.



.



--
Thanks,
Yunlong Song




Re: [f2fs-dev] [PATCH 5/5] f2fs: do not __punch_discard_cmd in lfs mode

2018-07-12 Thread Yunlong Song

How about changing test_opt(sbi, LFS) to f2fs_sb_has_blkzoned(sbi->sb) in
this patch? That would avoid punching discards (creating small discards) on
zoned block devices.

On 2018/7/13 11:26, Chao Yu wrote:

On 2018/7/12 23:09, Yunlong Song wrote:

In lfs mode, it is better to submit and wait for discard of the
new_blkaddr's overall section, rather than punch it which makes
more small discards and is not friendly with flash alignment. And
f2fs does not have to wait discard of each new_blkaddr except for the
start_block of each section with this patch.

For non-zoned block device, unaligned discard can be allowed; and if synchronous
discard is very slow, it will block block allocator here, rather than that, I
prefer just punch 4k lba of discard entry for performance.

If you don't want to encounter this condition, I suggest issue large size
discard more quickly.

Thanks,


Signed-off-by: Yunlong Song 
---
  fs/f2fs/segment.c | 76 ++-
  fs/f2fs/segment.h |  7 -
  2 files changed, 75 insertions(+), 8 deletions(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index f6c20e0..bce321a 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -893,7 +893,19 @@ static void __remove_discard_cmd(struct f2fs_sb_info *sbi,
  static void f2fs_submit_discard_endio(struct bio *bio)
  {
struct discard_cmd *dc = (struct discard_cmd *)bio->bi_private;
+   struct f2fs_sb_info *sbi = F2FS_SB(dc->bdev->bd_super);
  
+	if (test_opt(sbi, LFS)) {

+   unsigned int segno = GET_SEGNO(sbi, dc->lstart);
+   unsigned int secno = GET_SEC_FROM_SEG(sbi, segno);
+   int cnt = (dc->len >> sbi->log_blocks_per_seg) /
+   sbi->segs_per_sec;
+
+   while (cnt--) {
+   set_bit(secno, FREE_I(sbi)->discard_secmap);
+   secno++;
+   }
+   }
dc->error = blk_status_to_errno(bio->bi_status);
dc->state = D_DONE;
complete_all(&dc->wait);
@@ -1349,8 +1361,15 @@ static void f2fs_wait_discard_bio(struct f2fs_sb_info *sbi, block_t blkaddr)
dc = (struct discard_cmd *)f2fs_lookup_rb_tree(&dcc->root,
NULL, blkaddr);
if (dc) {
-   if (dc->state == D_PREP) {
+   if (dc->state == D_PREP && !test_opt(sbi, LFS))
__punch_discard_cmd(sbi, dc, blkaddr);
+   else if (dc->state == D_PREP && test_opt(sbi, LFS)) {
+   struct discard_policy dpolicy;
+
+   __init_discard_policy(sbi, &dpolicy, DPOLICY_FORCE, 1);
+   __submit_discard_cmd(sbi, &dpolicy, dc);
+   dc->ref++;
+   need_wait = true;
} else {
dc->ref++;
need_wait = true;
@@ -2071,9 +2090,10 @@ static void get_new_segment(struct f2fs_sb_info *sbi,
unsigned int hint = GET_SEC_FROM_SEG(sbi, *newseg);
unsigned int old_zoneno = GET_ZONE_FROM_SEG(sbi, *newseg);
unsigned int left_start = hint;
-   bool init = true;
+   bool init = true, check_discard = test_opt(sbi, LFS) ? true : false;
int go_left = 0;
int i;
+   unsigned long *free_secmap;
  
  	spin_lock(&free_i->segmap_lock);
  
@@ -2084,11 +2104,25 @@ static void get_new_segment(struct f2fs_sb_info *sbi,

goto got_it;
}
  find_other_zone:
-   secno = find_next_zero_bit(free_i->free_secmap, MAIN_SECS(sbi), hint);
+   if (check_discard) {
+   int entries = f2fs_bitmap_size(MAIN_SECS(sbi)) / sizeof(unsigned long);
+
+   free_secmap = free_i->tmp_secmap;
+   for (i = 0; i < entries; i++)
+   free_secmap[i] = (!(free_i->free_secmap[i] ^
+   free_i->discard_secmap[i])) | free_i->free_secmap[i];
+   } else
+   free_secmap = free_i->free_secmap;
+
+   secno = find_next_zero_bit(free_secmap, MAIN_SECS(sbi), hint);
if (secno >= MAIN_SECS(sbi)) {
if (dir == ALLOC_RIGHT) {
-   secno = find_next_zero_bit(free_i->free_secmap,
+   secno = find_next_zero_bit(free_secmap,
MAIN_SECS(sbi), 0);
+   if (secno >= MAIN_SECS(sbi) && check_discard) {
+   check_discard = false;
+   goto find_other_zone;
+   }
f2fs_bug_on(sbi, secno >= MAIN_SECS(sbi));
} else {
go_left = 1;
@@ -2098,13 +2132,17 @@ static void get_new_segment(struct f2fs_sb_info *sbi,
if (go_left == 0)
goto skip_left;
  
-	while (test_

Re: [f2fs-dev] [PATCH 4/5] f2fs: disable small discard in lfs mode

2018-07-12 Thread Yunlong Song

How about changing test_opt(sbi, LFS) to f2fs_sb_has_blkzoned(sbi->sb) in
this patch, so that it applies only to zoned block devices?

On 2018/7/13 11:17, Chao Yu wrote:

On 2018/7/12 23:09, Yunlong Song wrote:

In lfs mode, it is better to send the discard of the overall section
each time to avoid missing alignment with flash.

Hmm.. I think LFS mode can be used widely on different kind of device instead of
just on zoned block device, so let's just keep old implementation here.

Thanks,


Signed-off-by: Yunlong Song 
---
  fs/f2fs/segment.c | 3 ++-
  fs/f2fs/sysfs.c   | 4 
  2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index fd38b61..f6c20e0 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1766,7 +1766,8 @@ static int create_discard_cmd_control(struct f2fs_sb_info *sbi)
atomic_set(&dcc->issing_discard, 0);
atomic_set(&dcc->discard_cmd_cnt, 0);
dcc->nr_discards = 0;
-   dcc->max_discards = MAIN_SEGS(sbi) << sbi->log_blocks_per_seg;
+   dcc->max_discards = test_opt(sbi, LFS) ? 0 :
+   MAIN_SEGS(sbi) << sbi->log_blocks_per_seg;
dcc->undiscard_blks = 0;
dcc->root = RB_ROOT;
dcc->rbtree_check = false;
diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
index 2e7e611..4b6c457 100644
--- a/fs/f2fs/sysfs.c
+++ b/fs/f2fs/sysfs.c
@@ -271,6 +271,10 @@ static ssize_t __sbi_store(struct f2fs_attr *a,
return count;
}
  
+	if (!strcmp(a->attr.name, "max_small_discards") &&

+   test_opt(sbi, LFS))
+   return -EINVAL;
+
*ui = t;
  
  	if (!strcmp(a->attr.name, "iostat_enable") && *ui == 0)




.



--
Thanks,
Yunlong Song




Re: [f2fs-dev] [PATCH 2/5] f2fs: clear the remaining prefree_map of the section

2018-07-12 Thread Yunlong Song


round 1: section bitmap : 1 1 1 1 1, all valid, prefree_map: 0 0 0 0 0
then rm data block NO.2, block NO.2 becomes invalid, prefree_map: 0 0 1 0 0
write_checkpoint: section bitmap: 1 1 0 1 1, prefree_map: 0 0 0 0 0, 
prefree of NO.2 is cleared, and no discard issued


round2: rm data block NO.0, NO.1, NO.3, NO.4
all invalid, but prefree bit of NO.2 is set and cleared in round1, then 
prefree_map: 1 1 0 1 1
write_checkpoint: section bitmap: 0 0 0 0 0, prefree_map: 0 0 0 1 1, no 
valid blocks of this section, so discard issued

but this time prefree bit of NO.3 and NO.4 is skipped...

round3:
write_checkpoint: section bitmap: 0 0 0 0 0, prefree_map: 0 0 0 1 1 - > 
0 0 0 0 0, no valid blocks of this section, so discard issued
this time prefree bit of NO.3 and NO.4 is cleared, but the discard of 
this section is sent again...


On 2018/7/13 11:13, Chao Yu wrote:

On 2018/7/12 23:09, Yunlong Song wrote:

For the case when sbi->segs_per_sec > 1, take section:segment = 5 for
example, if the section prefree_map is ...previous section | current
section (1 1 0 1 1) | next section..., then the start = x, end = x + 1,
after start = start_segno + sbi->segs_per_sec, start = x + 5, then it
will skip x + 3 and x + 4, but their bitmap is still set, which will
cause duplicated f2fs_issue_discard of this same section in the next
write_checkpoint, so fix it.

I didn't get it, if # 2 segment is not prefree state, so it still has valid
blocks there, so we won't issue discard due to below condition, right?

if (!IS_CURSEC(sbi, secno) &&
!get_valid_blocks(sbi, start, true))

Thanks,


Signed-off-by: Yunlong Song 
---
  fs/f2fs/segment.c | 19 +--
  1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 47b6595..fd38b61 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1684,8 +1684,23 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi,
start = start_segno + sbi->segs_per_sec;
if (start < end)
goto next;
-   else
-   end = start - 1;
+   else {
+   start_segno = start;
+
+   while (1) {
+   start = find_next_bit(prefree_map, start_segno,
+   end + 1);
+   if (start >= start_segno)
+   break;
+   end = find_next_zero_bit(prefree_map, start_segno,
+   start + 1);
+   for (i = start; i < end; i++)
+   clear_bit(i, prefree_map);
+   dirty_i->nr_dirty[PRE] -= end - start;
+   }
+
+   end = start_segno - 1;
+   }
}
mutex_unlock(&dirty_i->seglist_lock);
  



.



--
Thanks,
Yunlong Song




[f2fs-dev] [PATCH 4/5] f2fs: disable small discard in lfs mode

2018-07-12 Thread Yunlong Song

In lfs mode, it is better to discard a whole section at a time so that
discards stay aligned with the flash.
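
The effect of the two hunks below can be modelled in user space roughly as
follows. This is a sketch, not the kernel code: struct discard_ctl,
default_max_discards and store_max_small_discards are hypothetical names,
and the sketch assumes the max_small_discards sysfs entry is backed by the
max_discards field.

```c
#include <errno.h>

/* Hypothetical stand-in for the discard command control structure. */
struct discard_ctl {
	unsigned int max_discards;
};

/* Default limit: no small discards at all in lfs mode. */
static unsigned int default_max_discards(int lfs_mode, unsigned int main_segs,
					 unsigned int log_blocks_per_seg)
{
	return lfs_mode ? 0 : main_segs << log_blocks_per_seg;
}

/* sysfs-style store: refuse to change the limit in lfs mode. */
static int store_max_small_discards(struct discard_ctl *dcc, int lfs_mode,
				    unsigned int value)
{
	if (lfs_mode)
		return -EINVAL;
	dcc->max_discards = value;
	return 0;
}
```

In this model an lfs mount starts with a limit of 0 and any later write to
the attribute fails with -EINVAL, so the limit stays at 0.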

Signed-off-by: Yunlong Song 
---
 fs/f2fs/segment.c | 3 ++-
 fs/f2fs/sysfs.c   | 4 
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index fd38b61..f6c20e0 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1766,7 +1766,8 @@ static int create_discard_cmd_control(struct f2fs_sb_info *sbi)
atomic_set(&dcc->issing_discard, 0);
atomic_set(&dcc->discard_cmd_cnt, 0);
dcc->nr_discards = 0;
-   dcc->max_discards = MAIN_SEGS(sbi) << sbi->log_blocks_per_seg;
+   dcc->max_discards = test_opt(sbi, LFS) ? 0 :
+   MAIN_SEGS(sbi) << sbi->log_blocks_per_seg;
dcc->undiscard_blks = 0;
dcc->root = RB_ROOT;
dcc->rbtree_check = false;
diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
index 2e7e611..4b6c457 100644
--- a/fs/f2fs/sysfs.c
+++ b/fs/f2fs/sysfs.c
@@ -271,6 +271,10 @@ static ssize_t __sbi_store(struct f2fs_attr *a,
return count;
}
 
+   if (!strcmp(a->attr.name, "max_small_discards") &&
+   test_opt(sbi, LFS))
+   return -EINVAL;
+
*ui = t;
 
if (!strcmp(a->attr.name, "iostat_enable") && *ui == 0)
-- 
1.8.5.2



[f2fs-dev] [PATCH 5/5] f2fs: do not __punch_discard_cmd in lfs mode

2018-07-12 Thread Yunlong Song

In lfs mode, it is better to submit the discard for the whole section
containing new_blkaddr and wait for it, rather than punching it, which
creates more small discards and is unfriendly to flash alignment. With
this patch, f2fs only has to wait for a discard at the start_block of
each section, not for every new_blkaddr.
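
The get_new_segment hunk in this patch searches a scratch bitmap in which a
zero bit is meant to mark a section that is both free and already discarded.
A minimal user-space sketch of such a mask follows. The function names are
hypothetical, and note one deliberate difference: the hunk as posted writes
the inversion with the logical !, whereas this sketch uses the bitwise ~, on
the assumption that a per-bit mask was intended.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * free_map:    bit set   = section in use
 * discard_map: bit set   = discard of the section has completed
 * tmp_map:     bit clear = section is free AND discarded (a candidate)
 */
static void build_candidate_map(const unsigned long *free_map,
				const unsigned long *discard_map,
				unsigned long *tmp_map, size_t words)
{
	size_t i;

	for (i = 0; i < words; i++)
		tmp_map[i] = ~(free_map[i] ^ discard_map[i]) | free_map[i];
}

/* Return the first section whose bit is clear, or nsecs if none. */
static size_t first_candidate(const unsigned long *map, size_t nsecs)
{
	size_t i;

	for (i = 0; i < nsecs; i++)
		if (!((map[i / BITS_PER_WORD] >> (i % BITS_PER_WORD)) & 1UL))
			return i;
	return nsecs;
}
```

With four sections where 0 and 2 are in use and only section 1 has been
discarded, section 1 is the first candidate; section 3 is free but not yet
discarded, so its bit stays set in the scratch map.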

Signed-off-by: Yunlong Song 
---
 fs/f2fs/segment.c | 76 ++-
 fs/f2fs/segment.h |  7 -
 2 files changed, 75 insertions(+), 8 deletions(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index f6c20e0..bce321a 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -893,7 +893,19 @@ static void __remove_discard_cmd(struct f2fs_sb_info *sbi,
 static void f2fs_submit_discard_endio(struct bio *bio)
 {
struct discard_cmd *dc = (struct discard_cmd *)bio->bi_private;
+   struct f2fs_sb_info *sbi = F2FS_SB(dc->bdev->bd_super);
 
+   if (test_opt(sbi, LFS)) {
+   unsigned int segno = GET_SEGNO(sbi, dc->lstart);
+   unsigned int secno = GET_SEC_FROM_SEG(sbi, segno);
+   int cnt = (dc->len >> sbi->log_blocks_per_seg) /
+   sbi->segs_per_sec;
+
+   while (cnt--) {
+   set_bit(secno, FREE_I(sbi)->discard_secmap);
+   secno++;
+   }
+   }
dc->error = blk_status_to_errno(bio->bi_status);
dc->state = D_DONE;
complete_all(&dc->wait);
@@ -1349,8 +1361,15 @@ static void f2fs_wait_discard_bio(struct f2fs_sb_info *sbi, block_t blkaddr)
dc = (struct discard_cmd *)f2fs_lookup_rb_tree(&dcc->root,
NULL, blkaddr);
if (dc) {
-   if (dc->state == D_PREP) {
+   if (dc->state == D_PREP && !test_opt(sbi, LFS))
__punch_discard_cmd(sbi, dc, blkaddr);
+   else if (dc->state == D_PREP && test_opt(sbi, LFS)) {
+   struct discard_policy dpolicy;
+
+   __init_discard_policy(sbi, &dpolicy, DPOLICY_FORCE, 1);
+   __submit_discard_cmd(sbi, &dpolicy, dc);
+   dc->ref++;
+   need_wait = true;
} else {
dc->ref++;
need_wait = true;
@@ -2071,9 +2090,10 @@ static void get_new_segment(struct f2fs_sb_info *sbi,
unsigned int hint = GET_SEC_FROM_SEG(sbi, *newseg);
unsigned int old_zoneno = GET_ZONE_FROM_SEG(sbi, *newseg);
unsigned int left_start = hint;
-   bool init = true;
+   bool init = true, check_discard = test_opt(sbi, LFS) ? true : false;
int go_left = 0;
int i;
+   unsigned long *free_secmap;
 
spin_lock(&free_i->segmap_lock);
 
@@ -2084,11 +2104,25 @@ static void get_new_segment(struct f2fs_sb_info *sbi,
goto got_it;
}
 find_other_zone:
-   secno = find_next_zero_bit(free_i->free_secmap, MAIN_SECS(sbi), hint);
+   if (check_discard) {
+   int entries = f2fs_bitmap_size(MAIN_SECS(sbi)) / sizeof(unsigned long);
+
+   free_secmap = free_i->tmp_secmap;
+   for (i = 0; i < entries; i++)
+   free_secmap[i] = (!(free_i->free_secmap[i] ^
+   free_i->discard_secmap[i])) | free_i->free_secmap[i];
+   } else
+   free_secmap = free_i->free_secmap;
+
+   secno = find_next_zero_bit(free_secmap, MAIN_SECS(sbi), hint);
if (secno >= MAIN_SECS(sbi)) {
if (dir == ALLOC_RIGHT) {
-   secno = find_next_zero_bit(free_i->free_secmap,
+   secno = find_next_zero_bit(free_secmap,
MAIN_SECS(sbi), 0);
+   if (secno >= MAIN_SECS(sbi) && check_discard) {
+   check_discard = false;
+   goto find_other_zone;
+   }
f2fs_bug_on(sbi, secno >= MAIN_SECS(sbi));
} else {
go_left = 1;
@@ -2098,13 +2132,17 @@ static void get_new_segment(struct f2fs_sb_info *sbi,
if (go_left == 0)
goto skip_left;
 
-   while (test_bit(left_start, free_i->free_secmap)) {
+   while (test_bit(left_start, free_secmap)) {
if (left_start > 0) {
left_start--;
continue;
}
-   left_start = find_next_zero_bit(free_i->free_secmap,
+   left_start = find_next_zero_bit(free_secmap,
MAIN_SECS(sbi), 0);
+   if (left_start >= MAIN_SECS(sbi) && check_discard) {
+   check_d

[f2fs-dev] [PATCH 2/5] f2fs: clear the remaining prefree_map of the section

2018-07-12 Thread Yunlong Song

When sbi->segs_per_sec > 1, take section:segment = 5 as an example. If
the section prefree_map is ...previous section | current section
(1 1 0 1 1) | next section..., then start = x and end = x + 1. After
start = start_segno + sbi->segs_per_sec, start = x + 5, so x + 3 and
x + 4 are skipped while their prefree bits are still set. This causes
a duplicated f2fs_issue_discard of the same section in the next
write_checkpoint, so fix it.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/segment.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 47b6595..fd38b61 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1684,8 +1684,23 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi,
start = start_segno + sbi->segs_per_sec;
if (start < end)
goto next;
-   else
-   end = start - 1;
+   else {
+   start_segno = start;
+
+   while (1) {
+   start = find_next_bit(prefree_map, start_segno,
+   end + 1);
+   if (start >= start_segno)
+   break;
+   end = find_next_zero_bit(prefree_map, start_segno,
+   start + 1);
+   for (i = start; i < end; i++)
+   clear_bit(i, prefree_map);
+   dirty_i->nr_dirty[PRE] -= end - start;
+   }
+
+   end = start_segno - 1;
+   }
}
mutex_unlock(&dirty_i->seglist_lock);
 
-- 
1.8.5.2



[f2fs-dev] [PATCH 1/5] f2fs: do not set free of current section

2018-07-12 Thread Yunlong Song

When sbi->segs_per_sec > 1, take section:segment = 5 as an example. If
segment 1 has just been used, new segment 2 is allocated, and the blocks
of segment 1 are then invalidated, the previous code uses
__set_test_and_free to clear the section's bit in free_secmap and
increment free_sections. This is not correct, since the section is
still a current section, so fix it.
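
The guard added below can be sketched in user space as follows. This is a
simplified model, not the kernel code: struct free_info and
set_test_and_free are hypothetical, the bitmaps are int arrays, locking is
omitted, and a single cur_secno stands in for the IS_CURSEC() check.

```c
#define SEGS_PER_SEC 5
#define NSEGS 10
#define NSECS (NSEGS / SEGS_PER_SEC)

/* Hypothetical stand-in for the free segment/section bookkeeping. */
struct free_info {
	int free_segmap[NSEGS];	/* 1 = segment in use */
	int free_secmap[NSECS];	/* 1 = section in use */
	int free_segments;
	int free_sections;
};

static void set_test_and_free(struct free_info *fi, unsigned int segno,
			      unsigned int cur_secno)
{
	unsigned int secno = segno / SEGS_PER_SEC;
	unsigned int start_segno = secno * SEGS_PER_SEC;
	unsigned int i;

	if (!fi->free_segmap[segno])
		return;			/* segment was already free */
	fi->free_segmap[segno] = 0;
	fi->free_segments++;

	if (secno == cur_secno)
		return;			/* still a current section: skip */

	for (i = start_segno; i < start_segno + SEGS_PER_SEC; i++)
		if (fi->free_segmap[i])
			return;		/* another segment is still in use */

	if (fi->free_secmap[secno]) {
		fi->free_secmap[secno] = 0;
		fi->free_sections++;
	}
}
```

Freeing the only used segment of the current section bumps free_segments
but leaves the section marked as in use, while the same operation on a
non-current section also frees the section.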

Signed-off-by: Yunlong Song 
---
 fs/f2fs/segment.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index b5bd328..5049551 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -448,6 +448,8 @@ static inline void __set_test_and_free(struct f2fs_sb_info *sbi,
if (test_and_clear_bit(segno, free_i->free_segmap)) {
free_i->free_segments++;
 
+   if (IS_CURSEC(sbi, secno))
+   goto skip_free;
next = find_next_bit(free_i->free_segmap,
start_segno + sbi->segs_per_sec, start_segno);
if (next >= start_segno + sbi->segs_per_sec) {
@@ -455,6 +457,7 @@ static inline void __set_test_and_free(struct f2fs_sb_info *sbi,
free_i->free_sections++;
}
}
+skip_free:
spin_unlock(&free_i->segmap_lock);
 }
 
-- 
1.8.5.2



[f2fs-dev] [PATCH 3/5] f2fs: blk_finish_plug of submit_bio in lfs mode

2018-07-12 Thread Yunlong Song

Extend the blk_finish_plug call from blkzoned devices to lfs mode in
general, since plugging can cause out-of-order IO submission, which is
not flash-friendly in lfs mode.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/data.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 70813a4..f12151d 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -263,7 +263,7 @@ static inline void __submit_bio(struct f2fs_sb_info *sbi,
if (type != DATA && type != NODE)
goto submit_io;
 
-   if (f2fs_sb_has_blkzoned(sbi->sb) && current->plug)
+   if (test_opt(sbi, LFS) && current->plug)
blk_finish_plug(current->plug);
 
start = bio->bi_iter.bi_size >> F2FS_BLKSIZE_BITS;
-- 
1.8.5.2



[f2fs-dev] [PATCH 0/5] f2fs: fix some bugs in lfs mode with large section

2018-07-12 Thread Yunlong Song

f2fs has some bugs when section:segment > 1 in lfs mode; this series fixes them.

Yunlong Song (5):
  f2fs: do not set free of current section
  f2fs: clear the remaining prefree_map of the section
  f2fs: blk_finish_plug of submit_bio in lfs mode
  f2fs: disable small discard in lfs mode
  f2fs: do not __punch_discard_cmd in lfs mode

 fs/f2fs/data.c|  2 +-
 fs/f2fs/segment.c | 98 +--
 fs/f2fs/segment.h | 10 +-
 fs/f2fs/sysfs.c   |  4 +++
 4 files changed, 102 insertions(+), 12 deletions(-)

-- 
1.8.5.2



[f2fs-dev] [PATCH] f2fs-tools: do not ignore *.rej in git

2018-05-30 Thread Yunlong Song

When resolving conflicts with git apply --reject, all the *.rej files
should be clearly visible in git, or we may miss merging some of the
rejected hunks.

Signed-off-by: Yunlong Song 
---
 .gitignore | 1 -
 1 file changed, 1 deletion(-)

diff --git a/.gitignore b/.gitignore
index 3f04e85..391c528 100644
--- a/.gitignore
+++ b/.gitignore
@@ -11,7 +11,6 @@
 *.mod.c
 *.lst
 *.orig
-*.rej
 
 CVS
 !.gitignore
-- 
1.8.5.2



[f2fs-dev] [PATCH] f2fs-tools: fix to ignore sg_write_buffer in git

2018-05-30 Thread Yunlong Song

Add tools/sg_write_buffer/sg_write_buffer to .gitignore.

Signed-off-by: Yunlong Song 
---
 .gitignore | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.gitignore b/.gitignore
index d5ca55d..3f04e85 100644
--- a/.gitignore
+++ b/.gitignore
@@ -50,6 +50,7 @@ stamp-h1
 /tools/fibmap.f2fs
 /tools/parse.f2fs
 /tools/f2fscrypt
+/tools/sg_write_buffer/sg_write_buffer
 
 # cscope files
 cscope.*
-- 
1.8.5.2



[f2fs-dev] [PATCH rebase] f2fs-tools: fix overflow bug of start_sector when computing zone_align_start_offset

2018-05-30 Thread Yunlong Song

zone_align_start_offset should be u64, but config.start_sector is u32,
so the computation of zone_align_start_offset may overflow.
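
A small sketch of the wraparound, assuming a 512-byte sector and a 4096-byte
block: SECTOR_SIZE and BLKSIZE stand in for DEFAULT_SECTOR_SIZE and
F2FS_BLKSIZE, and the function names are hypothetical.

```c
#include <stdint.h>

#define SECTOR_SIZE 512u	/* stand-in for DEFAULT_SECTOR_SIZE */
#define BLKSIZE 4096u		/* stand-in for F2FS_BLKSIZE */

/* Product computed in 32 bits: wraps once it reaches 4 GiB. */
static uint32_t start_bytes_32(uint32_t start_sector)
{
	return start_sector * SECTOR_SIZE;
}

/* Widen the operand first, as the casts in the patch do. */
static uint64_t start_bytes_64(uint32_t start_sector)
{
	return (uint64_t)start_sector * SECTOR_SIZE;
}

/* The zone alignment expression evaluated with a 64-bit base offset. */
static uint64_t zone_align_start_offset(uint32_t start_sector,
					uint32_t zone_size_bytes)
{
	uint64_t base = (uint64_t)start_sector * SECTOR_SIZE;

	return (base + 2 * BLKSIZE + zone_size_bytes - 1) /
		zone_size_bytes * zone_size_bytes - base;
}
```

For a start sector of 8388608 (a 4 GiB offset), the 32-bit product wraps to
0 while the 64-bit product is 4294967296; with a 2 MiB zone the aligned
offset is 2097152 in both the zero-offset and the 4 GiB case.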

Signed-off-by: Yunlong Song 
---
 fsck/resize.c  | 7 ---
 mkfs/f2fs_format.c | 4 ++--
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fsck/resize.c b/fsck/resize.c
index 649f5d9..3f8b01d 100644
--- a/fsck/resize.c
+++ b/fsck/resize.c
@@ -11,7 +11,8 @@
 
 static int get_new_sb(struct f2fs_super_block *sb)
 {
-   u_int32_t zone_size_bytes, zone_align_start_offset;
+   u_int32_t zone_size_bytes;
+   u_int64_t zone_align_start_offset;
u_int32_t blocks_for_sit, blocks_for_nat, blocks_for_ssa;
u_int32_t sit_segments, nat_segments, diff, total_meta_segments;
u_int32_t total_valid_blks_available;
@@ -27,10 +28,10 @@ static int get_new_sb(struct f2fs_super_block *sb)
 
zone_size_bytes = segment_size_bytes * segs_per_zone;
zone_align_start_offset =
-   (c.start_sector * DEFAULT_SECTOR_SIZE +
+   ((u_int64_t) c.start_sector * DEFAULT_SECTOR_SIZE +
2 * F2FS_BLKSIZE + zone_size_bytes - 1) /
zone_size_bytes * zone_size_bytes -
-   c.start_sector * DEFAULT_SECTOR_SIZE;
+   (u_int64_t) c.start_sector * DEFAULT_SECTOR_SIZE;
 
set_sb(segment_count, (c.target_sectors * c.sector_size -
zone_align_start_offset) / segment_size_bytes /
diff --git a/mkfs/f2fs_format.c b/mkfs/f2fs_format.c
index e0c3cb8..2350c10 100644
--- a/mkfs/f2fs_format.c
+++ b/mkfs/f2fs_format.c
@@ -212,10 +212,10 @@ static int f2fs_prepare_super_block(void)
set_sb(block_count, c.total_sectors >> log_sectors_per_block);
 
zone_align_start_offset =
-   (c.start_sector * DEFAULT_SECTOR_SIZE +
+   ((u_int64_t) c.start_sector * DEFAULT_SECTOR_SIZE +
2 * F2FS_BLKSIZE + zone_size_bytes - 1) /
zone_size_bytes * zone_size_bytes -
-   c.start_sector * DEFAULT_SECTOR_SIZE;
+   (u_int64_t) c.start_sector * DEFAULT_SECTOR_SIZE;
 
if (c.start_sector % DEFAULT_SECTORS_PER_BLOCK) {
MSG(1, "\t%s: Align start sector number to the page unit\n",
-- 
1.8.5.2
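
The overflow described above can be reproduced outside f2fs-tools. The
following is only a minimal sketch, assuming a 32-bit unsigned int; the
function names and the SECTOR_SIZE/BLKSIZE macros are hypothetical
stand-ins for DEFAULT_SECTOR_SIZE and F2FS_BLKSIZE, not code from the tree.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512u   /* stand-in for DEFAULT_SECTOR_SIZE */
#define BLKSIZE     4096u  /* stand-in for F2FS_BLKSIZE */

/* All arithmetic stays in 32 bits, as in the code before the patch. */
uint32_t align_offset_u32(uint32_t start_sector, uint32_t zone_size_bytes)
{
	assert(zone_size_bytes != 0);
	/* start_sector * 512 wraps modulo 2^32 for large start sectors */
	uint32_t start_bytes = (uint32_t)(start_sector * SECTOR_SIZE);

	return (start_bytes + 2 * BLKSIZE + zone_size_bytes - 1) /
		zone_size_bytes * zone_size_bytes - start_bytes;
}

/* The cast widens the product before it can wrap, as the patch does. */
uint64_t align_offset_u64(uint32_t start_sector, uint32_t zone_size_bytes)
{
	assert(zone_size_bytes != 0);
	uint64_t start_bytes = (uint64_t)start_sector * SECTOR_SIZE;

	return (start_bytes + 2 * BLKSIZE + zone_size_bytes - 1) /
		zone_size_bytes * zone_size_bytes - start_bytes;
}
```

With a start sector of 8388608 (a 4 GiB byte offset) and an illustrative
6 MiB zone size, the two functions disagree; for a small start sector
such as 2048 they return the same value.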



Re: [f2fs-dev] [PATCH] f2fs-tools: fix overflow bug of start_sector when computing zone_align_start_offset

2018-05-27 Thread Yunlong Song

This just keeps it consistent with the u_int64_t declaration in
mkfs/f2fs_format.c. c.start_sector * c.sector_size may overflow u32, so
the patch adds a (u_int64_t) cast before c.start_sector * c.sector_size
and changes the type of the target value zone_align_start_offset to
u_int64_t.

On 2018/5/26 19:27, Junling Zheng wrote:

No need to change zone_align_start_offset to u64, because
zone_align_start_offset is always smaller than zone_size_bytes, which
is u32.

Thanks,
Junling

On 2018/5/26 16:09, Yunlong Song wrote:

zone_align_start_offset should be u64, but config.start_sector is u32,
so it may be overflow when computing zone_align_start_offset.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
  fsck/resize.c  | 7 ---
  mkfs/f2fs_format.c | 4 ++--
  2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fsck/resize.c b/fsck/resize.c
index d285dd7..8ac7d45 100644
--- a/fsck/resize.c
+++ b/fsck/resize.c
@@ -11,7 +11,8 @@
  
  static int get_new_sb(struct f2fs_super_block *sb)

  {
-   u_int32_t zone_size_bytes, zone_align_start_offset;
+   u_int32_t zone_size_bytes;
+   u_int64_t zone_align_start_offset;
u_int32_t blocks_for_sit, blocks_for_nat, blocks_for_ssa;
u_int32_t sit_segments, nat_segments, diff, total_meta_segments;
u_int32_t total_valid_blks_available;
@@ -27,10 +28,10 @@ static int get_new_sb(struct f2fs_super_block *sb)
  
  	zone_size_bytes = segment_size_bytes * segs_per_zone;

zone_align_start_offset =
-   (c.start_sector * c.sector_size +
+   ((u_int64_t) c.start_sector * c.sector_size +
2 * F2FS_BLKSIZE + zone_size_bytes - 1) /
zone_size_bytes * zone_size_bytes -
-   c.start_sector * c.sector_size;
+   (u_int64_t) c.start_sector * c.sector_size;
  
  	set_sb(segment_count, (c.target_sectors * c.sector_size -

zone_align_start_offset) / segment_size_bytes /
diff --git a/mkfs/f2fs_format.c b/mkfs/f2fs_format.c
index 0a99a77..f045e23 100644
--- a/mkfs/f2fs_format.c
+++ b/mkfs/f2fs_format.c
@@ -212,10 +212,10 @@ static int f2fs_prepare_super_block(void)
set_sb(block_count, c.total_sectors >> log_sectors_per_block);
  
  	zone_align_start_offset =

-   (c.start_sector * c.sector_size +
+   ((u_int64_t) c.start_sector * c.sector_size +
2 * F2FS_BLKSIZE + zone_size_bytes - 1) /
zone_size_bytes * zone_size_bytes -
-   c.start_sector * c.sector_size;
+   (u_int64_t) c.start_sector * c.sector_size;
  
  	if (c.start_sector % c.sectors_per_blk) {

MSG(1, "\t%s: Align start sector number to the page unit\n",







--
Thanks,
Yunlong Song




Re: [f2fs-dev] [PATCH v2] f2fs-tools: fix to match with the start_sector

2018-05-26 Thread Yunlong Song


ping...

On 2018/5/7 10:15, Yunlong Song wrote:

f2fs-tools uses ioctl BLKSSZGET to get sector_size, however, this ioctl
will return a value which may be larger than 512 (according to the value
of q->limits.logical_block_size), then this will be inconsistent with
the start_sector, since start_sector is got from ioctl HDIO_GETGEO and
is always in 512 size unit for a sector. To fix this problem, just
change the sector_size to the default value when computing with
start_sector. And fix sectors_per_blk as well.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
  fsck/resize.c  |  4 ++--
  mkfs/f2fs_format.c | 12 ++--
  2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/fsck/resize.c b/fsck/resize.c
index d285dd7..ada2155 100644
--- a/fsck/resize.c
+++ b/fsck/resize.c
@@ -27,10 +27,10 @@ static int get_new_sb(struct f2fs_super_block *sb)
  
  	zone_size_bytes = segment_size_bytes * segs_per_zone;

zone_align_start_offset =
-   (c.start_sector * c.sector_size +
+   (c.start_sector * DEFAULT_SECTOR_SIZE +
2 * F2FS_BLKSIZE + zone_size_bytes - 1) /
zone_size_bytes * zone_size_bytes -
-   c.start_sector * c.sector_size;
+   c.start_sector * DEFAULT_SECTOR_SIZE;
  
  	set_sb(segment_count, (c.target_sectors * c.sector_size -

zone_align_start_offset) / segment_size_bytes /
diff --git a/mkfs/f2fs_format.c b/mkfs/f2fs_format.c
index 0a99a77..ced5fea 100644
--- a/mkfs/f2fs_format.c
+++ b/mkfs/f2fs_format.c
@@ -212,18 +212,18 @@ static int f2fs_prepare_super_block(void)
set_sb(block_count, c.total_sectors >> log_sectors_per_block);
  
  	zone_align_start_offset =

-   (c.start_sector * c.sector_size +
+   (c.start_sector * DEFAULT_SECTOR_SIZE +
2 * F2FS_BLKSIZE + zone_size_bytes - 1) /
zone_size_bytes * zone_size_bytes -
-   c.start_sector * c.sector_size;
+   c.start_sector * DEFAULT_SECTOR_SIZE;
  
-	if (c.start_sector % c.sectors_per_blk) {

+   if (c.start_sector % DEFAULT_SECTORS_PER_BLOCK) {
MSG(1, "\t%s: Align start sector number to the page unit\n",
c.zoned_mode ? "FAIL" : "WARN");
MSG(1, "\ti.e., start sector: %d, ofs:%d (sects/page: %d)\n",
c.start_sector,
-   c.start_sector % c.sectors_per_blk,
-   c.sectors_per_blk);
+   c.start_sector % DEFAULT_SECTORS_PER_BLOCK,
+   DEFAULT_SECTORS_PER_BLOCK);
if (c.zoned_mode)
return -1;
}
@@ -235,7 +235,7 @@ static int f2fs_prepare_super_block(void)
get_sb(segment0_blkaddr));
  
  	if (c.zoned_mode && (get_sb(segment0_blkaddr) + c.start_sector /

-   c.sectors_per_blk) % c.zone_blocks) {
+   DEFAULT_SECTORS_PER_BLOCK) % 
c.zone_blocks) {
MSG(1, "\tError: Unaligned segment0 block address %u\n",
    get_sb(segment0_blkaddr));
return -1;


--
Thanks,
Yunlong Song




[f2fs-dev] [PATCH] f2fs-tools: fix overflow bug of start_sector when computing zone_align_start_offset

2018-05-26 Thread Yunlong Song

zone_align_start_offset should be u64, but config.start_sector is u32,
so the multiplication may overflow when computing zone_align_start_offset.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fsck/resize.c  | 7 ---
 mkfs/f2fs_format.c | 4 ++--
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fsck/resize.c b/fsck/resize.c
index d285dd7..8ac7d45 100644
--- a/fsck/resize.c
+++ b/fsck/resize.c
@@ -11,7 +11,8 @@
 
 static int get_new_sb(struct f2fs_super_block *sb)
 {
-   u_int32_t zone_size_bytes, zone_align_start_offset;
+   u_int32_t zone_size_bytes;
+   u_int64_t zone_align_start_offset;
u_int32_t blocks_for_sit, blocks_for_nat, blocks_for_ssa;
u_int32_t sit_segments, nat_segments, diff, total_meta_segments;
u_int32_t total_valid_blks_available;
@@ -27,10 +28,10 @@ static int get_new_sb(struct f2fs_super_block *sb)
 
zone_size_bytes = segment_size_bytes * segs_per_zone;
zone_align_start_offset =
-   (c.start_sector * c.sector_size +
+   ((u_int64_t) c.start_sector * c.sector_size +
2 * F2FS_BLKSIZE + zone_size_bytes - 1) /
zone_size_bytes * zone_size_bytes -
-   c.start_sector * c.sector_size;
+   (u_int64_t) c.start_sector * c.sector_size;
 
set_sb(segment_count, (c.target_sectors * c.sector_size -
zone_align_start_offset) / segment_size_bytes /
diff --git a/mkfs/f2fs_format.c b/mkfs/f2fs_format.c
index 0a99a77..f045e23 100644
--- a/mkfs/f2fs_format.c
+++ b/mkfs/f2fs_format.c
@@ -212,10 +212,10 @@ static int f2fs_prepare_super_block(void)
set_sb(block_count, c.total_sectors >> log_sectors_per_block);
 
zone_align_start_offset =
-   (c.start_sector * c.sector_size +
+   ((u_int64_t) c.start_sector * c.sector_size +
2 * F2FS_BLKSIZE + zone_size_bytes - 1) /
zone_size_bytes * zone_size_bytes -
-   c.start_sector * c.sector_size;
+   (u_int64_t) c.start_sector * c.sector_size;
 
if (c.start_sector % c.sectors_per_blk) {
MSG(1, "\t%s: Align start sector number to the page unit\n",
-- 
1.8.5.2



[f2fs-dev] [PATCH v2] f2fs-tools: fix to match with the start_sector

2018-05-06 Thread Yunlong Song

f2fs-tools uses ioctl BLKSSZGET to get sector_size. However, this ioctl
may return a value larger than 512 (according to the value of
q->limits.logical_block_size), which is inconsistent with start_sector:
start_sector is obtained from ioctl HDIO_GETGEO and is always counted in
512-byte sectors. To fix this problem, use the default sector size when
computing with start_sector, and fix sectors_per_blk in the same way.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fsck/resize.c  |  4 ++--
 mkfs/f2fs_format.c | 12 ++--
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/fsck/resize.c b/fsck/resize.c
index d285dd7..ada2155 100644
--- a/fsck/resize.c
+++ b/fsck/resize.c
@@ -27,10 +27,10 @@ static int get_new_sb(struct f2fs_super_block *sb)
 
zone_size_bytes = segment_size_bytes * segs_per_zone;
zone_align_start_offset =
-   (c.start_sector * c.sector_size +
+   (c.start_sector * DEFAULT_SECTOR_SIZE +
2 * F2FS_BLKSIZE + zone_size_bytes - 1) /
zone_size_bytes * zone_size_bytes -
-   c.start_sector * c.sector_size;
+   c.start_sector * DEFAULT_SECTOR_SIZE;
 
set_sb(segment_count, (c.target_sectors * c.sector_size -
zone_align_start_offset) / segment_size_bytes /
diff --git a/mkfs/f2fs_format.c b/mkfs/f2fs_format.c
index 0a99a77..ced5fea 100644
--- a/mkfs/f2fs_format.c
+++ b/mkfs/f2fs_format.c
@@ -212,18 +212,18 @@ static int f2fs_prepare_super_block(void)
set_sb(block_count, c.total_sectors >> log_sectors_per_block);
 
zone_align_start_offset =
-   (c.start_sector * c.sector_size +
+   (c.start_sector * DEFAULT_SECTOR_SIZE +
2 * F2FS_BLKSIZE + zone_size_bytes - 1) /
zone_size_bytes * zone_size_bytes -
-   c.start_sector * c.sector_size;
+   c.start_sector * DEFAULT_SECTOR_SIZE;
 
-   if (c.start_sector % c.sectors_per_blk) {
+   if (c.start_sector % DEFAULT_SECTORS_PER_BLOCK) {
MSG(1, "\t%s: Align start sector number to the page unit\n",
c.zoned_mode ? "FAIL" : "WARN");
MSG(1, "\ti.e., start sector: %d, ofs:%d (sects/page: %d)\n",
c.start_sector,
-   c.start_sector % c.sectors_per_blk,
-   c.sectors_per_blk);
+   c.start_sector % DEFAULT_SECTORS_PER_BLOCK,
+   DEFAULT_SECTORS_PER_BLOCK);
if (c.zoned_mode)
return -1;
}
@@ -235,7 +235,7 @@ static int f2fs_prepare_super_block(void)
get_sb(segment0_blkaddr));
 
if (c.zoned_mode && (get_sb(segment0_blkaddr) + c.start_sector /
-   c.sectors_per_blk) % c.zone_blocks) {
+   DEFAULT_SECTORS_PER_BLOCK) % c.zone_blocks) {
MSG(1, "\tError: Unaligned segment0 block address %u\n",
get_sb(segment0_blkaddr));
return -1;
-- 
1.8.5.2
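
The unit mismatch described in the commit message can be shown with plain
arithmetic. This is a sketch only: the macro values (512, 4096, 8) are
assumed here for illustration, and the three helper functions are
hypothetical, not part of f2fs-tools.

```c
#include <assert.h>
#include <stdint.h>

#define DEFAULT_SECTOR_SIZE        512u   /* assumed value */
#define F2FS_BLKSIZE               4096u  /* assumed value */
#define DEFAULT_SECTORS_PER_BLOCK  (F2FS_BLKSIZE / DEFAULT_SECTOR_SIZE)

/* Byte offset of the partition start; start_sector counts 512-byte units. */
uint64_t start_offset_bytes(uint32_t start_sector)
{
	return (uint64_t)start_sector * DEFAULT_SECTOR_SIZE;
}

/* What scaling by the BLKSSZGET logical sector size would compute instead. */
uint64_t start_offset_bytes_scaled(uint32_t start_sector,
				   uint32_t logical_sector_size)
{
	assert(logical_sector_size != 0);
	return (uint64_t)start_sector * logical_sector_size;
}

/* Non-zero if the start sector falls on a 4 KiB block boundary. */
int start_is_block_aligned(uint32_t start_sector)
{
	return start_sector % DEFAULT_SECTORS_PER_BLOCK == 0;
}
```

For a partition starting at sector 2048, the 512-byte interpretation gives
a 1 MiB offset, while scaling by a 4096-byte logical sector size gives
8 MiB, which is the inconsistency the patch avoids.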



[f2fs-dev] [PATCH] f2fs-tools: fix the sector_size to default value

2018-05-04 Thread Yunlong Song

f2fs-tools uses ioctl BLKSSZGET to get sector_size. However, this ioctl
may return a value larger than 512 (according to the value of
q->limits.logical_block_size), which is inconsistent with start_sector:
start_sector is obtained from ioctl HDIO_GETGEO and is always counted in
512-byte sectors. To fix this problem, just set the sector_size to the
default value 512.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 lib/libf2fs.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/lib/libf2fs.c b/lib/libf2fs.c
index 102e579..e160f2a 100644
--- a/lib/libf2fs.c
+++ b/lib/libf2fs.c
@@ -768,7 +768,6 @@ void get_kernel_uname_version(__u8 *version)
 int get_device_info(int i)
 {
int32_t fd = 0;
-   uint32_t sector_size;
 #ifndef BLKGETSIZE64
uint32_t total_sectors;
 #endif
@@ -822,12 +821,6 @@ int get_device_info(int i)
} else if (S_ISREG(stat_buf->st_mode)) {
dev->total_sectors = stat_buf->st_size / dev->sector_size;
} else if (S_ISBLK(stat_buf->st_mode)) {
-#ifdef BLKSSZGET
-   if (ioctl(fd, BLKSSZGET, &sector_size) < 0)
-   MSG(0, "\tError: Using the default sector size\n");
-   else if (dev->sector_size < sector_size)
-   dev->sector_size = sector_size;
-#endif
 #ifdef BLKGETSIZE64
if (ioctl(fd, BLKGETSIZE64, &dev->total_sectors) < 0) {
MSG(0, "\tError: Cannot get the device size\n");
-- 
1.8.5.2



Re: [f2fs-dev] [PATCH] f2fs: fix the way to wake up issue_flush thread

2018-05-03 Thread Yunlong Song


Please ignore this patch; I made a mistake.

On 2018/5/3 15:45, Yunlong Song wrote:

Commit 6f890df0 ("f2fs: fix out-of-order execution in f2fs_issue_flush")
uses waitqueue_active to wake up issue_flush thread, but there is no
wait entry to wake in this queue, so change it back to use the original
fcc->dispatch_list to wake up issue_flush thread.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
  fs/f2fs/segment.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 503a98a..5aa5ee4e 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -612,7 +612,7 @@ int f2fs_issue_flush(struct f2fs_sb_info *sbi, nid_t ino)
/* update issue_list before we wake up issue_flush thread */
smp_mb();
 
-   if (waitqueue_active(&fcc->flush_wait_queue))
+   if (!fcc->dispatch_list)
wake_up(&fcc->flush_wait_queue);
  
  	if (fcc->f2fs_issue_flush) {


--
Thanks,
Yunlong Song




[f2fs-dev] [PATCH] f2fs: fix the way to wake up issue_flush thread

2018-05-03 Thread Yunlong Song

Commit 6f890df0 ("f2fs: fix out-of-order execution in f2fs_issue_flush")
uses waitqueue_active to wake up issue_flush thread, but there is no
wait entry to wake in this queue, so change it back to use the original
fcc->dispatch_list to wake up issue_flush thread.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/segment.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 503a98a..5aa5ee4e 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -612,7 +612,7 @@ int f2fs_issue_flush(struct f2fs_sb_info *sbi, nid_t ino)
/* update issue_list before we wake up issue_flush thread */
smp_mb();
 
-   if (waitqueue_active(&fcc->flush_wait_queue))
+   if (!fcc->dispatch_list)
wake_up(&fcc->flush_wait_queue);
 
if (fcc->f2fs_issue_flush) {
-- 
1.8.5.2



[f2fs-dev] [PATCH] f2fs: remove unmatched zero_user_segment when convert inline dentry

2018-04-03 Thread Yunlong Song

Since the layout of a regular dentry block is different from that of an
inline dentry block, zero_user_segment starting from MAX_INLINE_DATA(dir)
is not correct for a regular dentry block. Besides, the bitmap is already
copied and used, so there is no need to zero the page at all; just remove
the zero_user_segment call.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/inline.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/f2fs/inline.c b/fs/f2fs/inline.c
index 3b77d64..573ec2f 100644
--- a/fs/f2fs/inline.c
+++ b/fs/f2fs/inline.c
@@ -367,7 +367,6 @@ static int f2fs_move_inline_dirents(struct inode *dir, struct page *ipage,
goto out;
 
f2fs_wait_on_page_writeback(page, DATA, true);
-   zero_user_segment(page, MAX_INLINE_DATA(dir), PAGE_SIZE);
 
dentry_blk = page_address(page);
 
-- 
1.8.5.2



[f2fs-dev] [PATCH] f2fs: make assignment of t->dentry_bitmap more readable

2018-04-02 Thread Yunlong Song

In make_dentry_ptr_block, it is confusing to use "&" for t->dentry_bitmap
but no "&" for t->dentry, so delete the "&" to make the code more readable.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/f2fs.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 191ee57..474b9e9 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -471,7 +471,7 @@ static inline void make_dentry_ptr_block(struct inode *inode,
d->inode = inode;
d->max = NR_DENTRY_IN_BLOCK;
d->nr_bitmap = SIZE_OF_DENTRY_BITMAP;
-   d->bitmap = &t->dentry_bitmap;
+   d->bitmap = t->dentry_bitmap;
d->dentry = t->dentry;
d->filename = t->filename;
 }
-- 
1.8.5.2
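
Why dropping the "&" does not change the stored address can be checked in
plain C. The sketch below is an illustration under stated assumptions: the
struct, its field sizes, and the helper function are hypothetical and only
mirror the shape of an array member like dentry_bitmap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BITMAP_BYTES 27  /* illustrative size, not taken from the patch */

struct dentry_block_sketch {
	uint8_t dentry_bitmap[BITMAP_BYTES];
	uint8_t dentry[64];
};

/* Returns 1 when both spellings yield the same address. */
int bitmap_spellings_agree(struct dentry_block_sketch *t)
{
	assert(t != NULL);
	const void *with_amp = &t->dentry_bitmap;    /* pointer to the whole array */
	const void *without_amp = t->dentry_bitmap;  /* decays to uint8_t * */

	return with_amp == without_amp;
}
```

The two expressions differ in type but point at the same byte, so once the
value is stored in a void pointer the spelling only affects readability.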



[f2fs-dev] [PATCH v2] f2fs: no need to initialize zero value for GFP_F2FS_ZERO

2018-03-21 Thread Yunlong Song

Since f2fs_inode_info is allocated with flag GFP_F2FS_ZERO, we do not
need to initialize its members to zero any more.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/super.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 0c1fe9b..42d564c 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -827,7 +827,6 @@ static struct inode *f2fs_alloc_inode(struct super_block *sb)
/* Initialize f2fs-specific inode info */
atomic_set(&fi->dirty_pages, 0);
fi->i_current_depth = 1;
-   fi->i_advise = 0;
init_rwsem(&fi->i_sem);
INIT_LIST_HEAD(&fi->dirty_list);
INIT_LIST_HEAD(&fi->gdirty_list);
@@ -839,10 +838,6 @@ static struct inode *f2fs_alloc_inode(struct super_block *sb)
init_rwsem(&fi->i_mmap_sem);
init_rwsem(&fi->i_xattr_sem);
 
-#ifdef CONFIG_QUOTA
-   memset(&fi->i_dquot, 0, sizeof(fi->i_dquot));
-   fi->i_reserved_quota = 0;
-#endif
/* Will be used by directory only */
fi->i_dir_level = F2FS_SB(sb)->dir_level;
 
-- 
1.8.5.2
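
The idea behind this cleanup has a simple userspace analogue. The sketch
below assumes that GFP_F2FS_ZERO requests zero-filled memory, as the commit
message implies, and uses calloc as a stand-in for the kernel allocator;
the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct inode_info_sketch {
	int current_depth;
	unsigned char advise;
	long long reserved_quota;
};

/* calloc returns zero-filled memory, so only non-zero defaults are set. */
struct inode_info_sketch *alloc_inode_info(void)
{
	struct inode_info_sketch *fi = calloc(1, sizeof(*fi));

	if (!fi)
		return NULL;

	fi->current_depth = 1;	/* the only member with a non-zero default */
	return fi;
}

/* Non-zero if the object still holds its freshly allocated defaults. */
int inode_info_is_fresh(const struct inode_info_sketch *fi)
{
	assert(fi != NULL);
	return fi->current_depth == 1 && fi->advise == 0 &&
	       fi->reserved_quota == 0;
}
```

Members that are meant to start at zero need no explicit assignment after
a zeroing allocation, which is the redundancy the patch removes.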



[f2fs-dev] [PATCH] f2fs: no need to initialize zero value for GFP_F2FS_ZERO

2018-03-21 Thread Yunlong Song

Since f2fs_inode_info is allocated with flag GFP_F2FS_ZERO, we do not
need to initialize its members to zero any more.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/super.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 0c1fe9b..3a7fa03 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -825,9 +825,7 @@ static struct inode *f2fs_alloc_inode(struct super_block *sb)
init_once((void *) fi);
 
/* Initialize f2fs-specific inode info */
-   atomic_set(&fi->dirty_pages, 0);
fi->i_current_depth = 1;
-   fi->i_advise = 0;
init_rwsem(&fi->i_sem);
INIT_LIST_HEAD(&fi->dirty_list);
INIT_LIST_HEAD(&fi->gdirty_list);
@@ -839,10 +837,6 @@ static struct inode *f2fs_alloc_inode(struct super_block *sb)
init_rwsem(&fi->i_mmap_sem);
init_rwsem(&fi->i_xattr_sem);
 
-#ifdef CONFIG_QUOTA
-   memset(&fi->i_dquot, 0, sizeof(fi->i_dquot));
-   fi->i_reserved_quota = 0;
-#endif
/* Will be used by directory only */
fi->i_dir_level = F2FS_SB(sb)->dir_level;
 
-- 
1.8.5.2



Re: [f2fs-dev] [PATCH] f2fs: don't put dentry page in pagecache into highmem

2018-03-19 Thread Yunlong Song


OK, got it.

On 2018/3/19 14:23, Jaegeuk Kim wrote:

On 03/19, Yunlong Song wrote:

Hi, Jaegeuk,
 I find this patch is removed from current branch of dev-test
recently, why? Any bugs?

Moved into the beginning of the tree for cherry-picking into f2fs-stable.

Thanks,


On 2018/2/28 20:31, Yunlong Song wrote:

Previous dentry page uses highmem, which will cause panic in platforms
using highmem (such as arm), since the address space of dentry pages
from highmem directly goes into the decryption path via the function
fscrypt_fname_disk_to_usr. But sg_init_one assumes the address is not
from highmem, and then cause panic since it doesn't call kmap_high but
kunmap_high is triggered at the end. To fix this problem in a simple
way, this patch avoids to put dentry page in pagecache into highmem.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
   fs/f2fs/dir.c   | 23 +--
   fs/f2fs/f2fs.h  |  6 --
   fs/f2fs/inline.c|  3 +--
   fs/f2fs/inode.c |  2 +-
   fs/f2fs/namei.c | 14 +-
   fs/f2fs/recovery.c  | 11 +--
   include/linux/f2fs_fs.h |  1 -
   7 files changed, 13 insertions(+), 47 deletions(-)

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index f00b5ed..797eb05 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -94,14 +94,12 @@ static struct f2fs_dir_entry *find_in_block(struct page 
*dentry_page,
struct f2fs_dir_entry *de;
struct f2fs_dentry_ptr d;
-   dentry_blk = (struct f2fs_dentry_block *)kmap(dentry_page);
+   dentry_blk = (struct f2fs_dentry_block *)page_address(dentry_page);
make_dentry_ptr_block(NULL, &d, dentry_blk);
de = find_target_dentry(fname, namehash, max_slots, &d);
if (de)
*res_page = dentry_page;
-   else
-   kunmap(dentry_page);
return de;
   }
@@ -287,7 +285,6 @@ ino_t f2fs_inode_by_name(struct inode *dir, const struct 
qstr *qstr,
de = f2fs_find_entry(dir, qstr, page);
if (de) {
res = le32_to_cpu(de->ino);
-   f2fs_dentry_kunmap(dir, *page);
f2fs_put_page(*page, 0);
}
@@ -302,7 +299,6 @@ void f2fs_set_link(struct inode *dir, struct f2fs_dir_entry 
*de,
f2fs_wait_on_page_writeback(page, type, true);
de->ino = cpu_to_le32(inode->i_ino);
set_de_type(de, inode->i_mode);
-   f2fs_dentry_kunmap(dir, page);
set_page_dirty(page);
dir->i_mtime = dir->i_ctime = current_time(dir);
@@ -350,13 +346,11 @@ static int make_empty_dir(struct inode *inode,
if (IS_ERR(dentry_page))
return PTR_ERR(dentry_page);
-   dentry_blk = kmap_atomic(dentry_page);
+   dentry_blk = page_address(dentry_page);
make_dentry_ptr_block(NULL, &d, dentry_blk);
do_make_empty_dir(inode, parent, &d);
-   kunmap_atomic(dentry_blk);
-
set_page_dirty(dentry_page);
f2fs_put_page(dentry_page, 1);
return 0;
@@ -547,13 +541,12 @@ int f2fs_add_regular_entry(struct inode *dir, const 
struct qstr *new_name,
if (IS_ERR(dentry_page))
return PTR_ERR(dentry_page);
-   dentry_blk = kmap(dentry_page);
+   dentry_blk = page_address(dentry_page);
bit_pos = room_for_filename(&dentry_blk->dentry_bitmap,
slots, NR_DENTRY_IN_BLOCK);
if (bit_pos < NR_DENTRY_IN_BLOCK)
goto add_dentry;
-   kunmap(dentry_page);
f2fs_put_page(dentry_page, 1);
}
@@ -588,7 +581,6 @@ int f2fs_add_regular_entry(struct inode *dir, const struct 
qstr *new_name,
if (inode)
up_write(&F2FS_I(inode)->i_sem);
-   kunmap(dentry_page);
f2fs_put_page(dentry_page, 1);
return err;
@@ -642,7 +634,6 @@ int __f2fs_add_link(struct inode *dir, const struct qstr 
*name,
F2FS_I(dir)->task = NULL;
}
if (de) {
-   f2fs_dentry_kunmap(dir, page);
f2fs_put_page(page, 0);
err = -EEXIST;
} else if (IS_ERR(page)) {
@@ -730,7 +721,6 @@ void f2fs_delete_entry(struct f2fs_dir_entry *dentry, 
struct page *page,
bit_pos = find_next_bit_le(&dentry_blk->dentry_bitmap,
NR_DENTRY_IN_BLOCK,
0);
-   kunmap(page); /* kunmap - pair of f2fs_find_entry */
set_page_dirty(page);
dir->i_ctime = dir->i_mtime = current_time(dir);
@@ -775,7 +765,7 @@ bool f2fs_empty_dir(struct inode *dir)
return false;
}
-   dentry_blk = kmap_atomic(dentry_page);
+   dentry_blk = page_address(dentry_page);
if (bidx == 0)
bit_pos = 2;
else
@@ -783,7 +773,6 @@ bool f2fs_empty_dir(struct inode *dir)
b

Re: [f2fs-dev] [PATCH] f2fs: don't put dentry page in pagecache into highmem

2018-03-18 Thread Yunlong Song


Hi, Jaegeuk,
I noticed that this patch was recently removed from the current dev-test
branch. Why? Any bugs?

On 2018/2/28 20:31, Yunlong Song wrote:

Previous dentry page uses highmem, which will cause panic in platforms
using highmem (such as arm), since the address space of dentry pages
from highmem directly goes into the decryption path via the function
fscrypt_fname_disk_to_usr. But sg_init_one assumes the address is not
from highmem, and then cause panic since it doesn't call kmap_high but
kunmap_high is triggered at the end. To fix this problem in a simple
way, this patch avoids to put dentry page in pagecache into highmem.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
  fs/f2fs/dir.c   | 23 +--
  fs/f2fs/f2fs.h  |  6 --
  fs/f2fs/inline.c|  3 +--
  fs/f2fs/inode.c |  2 +-
  fs/f2fs/namei.c | 14 +-
  fs/f2fs/recovery.c  | 11 +--
  include/linux/f2fs_fs.h |  1 -
  7 files changed, 13 insertions(+), 47 deletions(-)

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index f00b5ed..797eb05 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -94,14 +94,12 @@ static struct f2fs_dir_entry *find_in_block(struct page 
*dentry_page,
struct f2fs_dir_entry *de;
struct f2fs_dentry_ptr d;
  
-	dentry_blk = (struct f2fs_dentry_block *)kmap(dentry_page);

+   dentry_blk = (struct f2fs_dentry_block *)page_address(dentry_page);
  
make_dentry_ptr_block(NULL, &d, dentry_blk);
de = find_target_dentry(fname, namehash, max_slots, &d);
if (de)
*res_page = dentry_page;
-   else
-   kunmap(dentry_page);
  
  	return de;

  }
@@ -287,7 +285,6 @@ ino_t f2fs_inode_by_name(struct inode *dir, const struct 
qstr *qstr,
de = f2fs_find_entry(dir, qstr, page);
if (de) {
res = le32_to_cpu(de->ino);
-   f2fs_dentry_kunmap(dir, *page);
f2fs_put_page(*page, 0);
}
  
@@ -302,7 +299,6 @@ void f2fs_set_link(struct inode *dir, struct f2fs_dir_entry *de,

f2fs_wait_on_page_writeback(page, type, true);
de->ino = cpu_to_le32(inode->i_ino);
set_de_type(de, inode->i_mode);
-   f2fs_dentry_kunmap(dir, page);
set_page_dirty(page);
  
  	dir->i_mtime = dir->i_ctime = current_time(dir);

@@ -350,13 +346,11 @@ static int make_empty_dir(struct inode *inode,
if (IS_ERR(dentry_page))
return PTR_ERR(dentry_page);
  
-	dentry_blk = kmap_atomic(dentry_page);

+   dentry_blk = page_address(dentry_page);
  
make_dentry_ptr_block(NULL, &d, dentry_blk);
do_make_empty_dir(inode, parent, &d);
  
-	kunmap_atomic(dentry_blk);

-
set_page_dirty(dentry_page);
f2fs_put_page(dentry_page, 1);
return 0;
@@ -547,13 +541,12 @@ int f2fs_add_regular_entry(struct inode *dir, const 
struct qstr *new_name,
if (IS_ERR(dentry_page))
return PTR_ERR(dentry_page);
  
-		dentry_blk = kmap(dentry_page);

+   dentry_blk = page_address(dentry_page);
bit_pos = room_for_filename(&dentry_blk->dentry_bitmap,
slots, NR_DENTRY_IN_BLOCK);
if (bit_pos < NR_DENTRY_IN_BLOCK)
goto add_dentry;
  
-		kunmap(dentry_page);

f2fs_put_page(dentry_page, 1);
}
  
@@ -588,7 +581,6 @@ int f2fs_add_regular_entry(struct inode *dir, const struct qstr *new_name,

if (inode)
up_write(&F2FS_I(inode)->i_sem);
  
-	kunmap(dentry_page);

f2fs_put_page(dentry_page, 1);
  
  	return err;

@@ -642,7 +634,6 @@ int __f2fs_add_link(struct inode *dir, const struct qstr 
*name,
F2FS_I(dir)->task = NULL;
}
if (de) {
-   f2fs_dentry_kunmap(dir, page);
f2fs_put_page(page, 0);
err = -EEXIST;
} else if (IS_ERR(page)) {
@@ -730,7 +721,6 @@ void f2fs_delete_entry(struct f2fs_dir_entry *dentry, 
struct page *page,
bit_pos = find_next_bit_le(&dentry_blk->dentry_bitmap,
NR_DENTRY_IN_BLOCK,
0);
-   kunmap(page); /* kunmap - pair of f2fs_find_entry */
set_page_dirty(page);
  
  	dir->i_ctime = dir->i_mtime = current_time(dir);

@@ -775,7 +765,7 @@ bool f2fs_empty_dir(struct inode *dir)
return false;
}
  
-		dentry_blk = kmap_atomic(dentry_page);

+   dentry_blk = page_address(dentry_page);
if (bidx == 0)
bit_pos = 2;
else
@@ -783,7 +773,6 @@ bool f2fs_empty_dir(struct inode *dir)
bit_pos = find_next_bit_le(&dentry_blk->dentry_bitmap,
NR_DENTRY_IN_BLOCK,
bit_pos);
-   kunmap_atomic(dentry_blk);
  
  		f2fs

Re: [f2fs-dev] [PATCH v4] Revert "f2fs crypto: avoid unneeded memory allocation in ->readdir"

2018-02-28 Thread Yunlong Song


On 2018/2/28 13:48, Jaegeuk Kim wrote:

Hi Yunlong,

As Eric pointed out, what do you think about using nohighmem for
directories, as ext4 does, which looks more efficient?


OK, I have sent out another patch like this.

Actually, we don't need to do this in most of recent kernels, right?



Why? I hit this panic on arm with a recent kernel.



[f2fs-dev] [PATCH] f2fs: don't put dentry page in pagecache into highmem

2018-02-28 Thread Yunlong Song

Previously, dentry pages could be allocated from highmem, which causes a
panic on platforms using highmem (such as arm), since the address of a
dentry page from highmem goes directly into the decryption path via the
function fscrypt_fname_disk_to_usr. sg_init_one assumes the address is
not from highmem, and this causes a panic because kmap_high is never
called but kunmap_high is triggered at the end. To fix this problem in a
simple way, this patch avoids putting dentry pages in the pagecache into
highmem.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/dir.c   | 23 +--
 fs/f2fs/f2fs.h  |  6 --
 fs/f2fs/inline.c|  3 +--
 fs/f2fs/inode.c |  2 +-
 fs/f2fs/namei.c | 14 +-
 fs/f2fs/recovery.c  | 11 +--
 include/linux/f2fs_fs.h |  1 -
 7 files changed, 13 insertions(+), 47 deletions(-)

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index f00b5ed..797eb05 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -94,14 +94,12 @@ static struct f2fs_dir_entry *find_in_block(struct page 
*dentry_page,
struct f2fs_dir_entry *de;
struct f2fs_dentry_ptr d;
 
-   dentry_blk = (struct f2fs_dentry_block *)kmap(dentry_page);
+   dentry_blk = (struct f2fs_dentry_block *)page_address(dentry_page);
 
make_dentry_ptr_block(NULL, &d, dentry_blk);
de = find_target_dentry(fname, namehash, max_slots, &d);
if (de)
*res_page = dentry_page;
-   else
-   kunmap(dentry_page);
 
return de;
 }
@@ -287,7 +285,6 @@ ino_t f2fs_inode_by_name(struct inode *dir, const struct 
qstr *qstr,
de = f2fs_find_entry(dir, qstr, page);
if (de) {
res = le32_to_cpu(de->ino);
-   f2fs_dentry_kunmap(dir, *page);
f2fs_put_page(*page, 0);
}
 
@@ -302,7 +299,6 @@ void f2fs_set_link(struct inode *dir, struct f2fs_dir_entry 
*de,
f2fs_wait_on_page_writeback(page, type, true);
de->ino = cpu_to_le32(inode->i_ino);
set_de_type(de, inode->i_mode);
-   f2fs_dentry_kunmap(dir, page);
set_page_dirty(page);
 
dir->i_mtime = dir->i_ctime = current_time(dir);
@@ -350,13 +346,11 @@ static int make_empty_dir(struct inode *inode,
if (IS_ERR(dentry_page))
return PTR_ERR(dentry_page);
 
-   dentry_blk = kmap_atomic(dentry_page);
+   dentry_blk = page_address(dentry_page);
 
make_dentry_ptr_block(NULL, &d, dentry_blk);
do_make_empty_dir(inode, parent, &d);
 
-   kunmap_atomic(dentry_blk);
-
set_page_dirty(dentry_page);
f2fs_put_page(dentry_page, 1);
return 0;
@@ -547,13 +541,12 @@ int f2fs_add_regular_entry(struct inode *dir, const 
struct qstr *new_name,
if (IS_ERR(dentry_page))
return PTR_ERR(dentry_page);
 
-   dentry_blk = kmap(dentry_page);
+   dentry_blk = page_address(dentry_page);
bit_pos = room_for_filename(&dentry_blk->dentry_bitmap,
slots, NR_DENTRY_IN_BLOCK);
if (bit_pos < NR_DENTRY_IN_BLOCK)
goto add_dentry;
 
-   kunmap(dentry_page);
f2fs_put_page(dentry_page, 1);
}
 
@@ -588,7 +581,6 @@ int f2fs_add_regular_entry(struct inode *dir, const struct 
qstr *new_name,
if (inode)
up_write(&F2FS_I(inode)->i_sem);
 
-   kunmap(dentry_page);
f2fs_put_page(dentry_page, 1);
 
return err;
@@ -642,7 +634,6 @@ int __f2fs_add_link(struct inode *dir, const struct qstr 
*name,
F2FS_I(dir)->task = NULL;
}
if (de) {
-   f2fs_dentry_kunmap(dir, page);
f2fs_put_page(page, 0);
err = -EEXIST;
} else if (IS_ERR(page)) {
@@ -730,7 +721,6 @@ void f2fs_delete_entry(struct f2fs_dir_entry *dentry, 
struct page *page,
bit_pos = find_next_bit_le(&dentry_blk->dentry_bitmap,
NR_DENTRY_IN_BLOCK,
0);
-   kunmap(page); /* kunmap - pair of f2fs_find_entry */
set_page_dirty(page);
 
dir->i_ctime = dir->i_mtime = current_time(dir);
@@ -775,7 +765,7 @@ bool f2fs_empty_dir(struct inode *dir)
return false;
}
 
-   dentry_blk = kmap_atomic(dentry_page);
+   dentry_blk = page_address(dentry_page);
if (bidx == 0)
bit_pos = 2;
else
@@ -783,7 +773,6 @@ bool f2fs_empty_dir(struct inode *dir)
bit_pos = find_next_bit_le(&dentry_blk->dentry_bitmap,
NR_DENTRY_IN_BLOCK,
bit_pos);
-   kunmap_atomic(dentry_blk);
 
f2fs_put_page(dentry_page, 1);
 
@@ -901,19 +890,17 @@ static int f2fs_rea

[f2fs-dev] [PATCH v4] Revert "f2fs crypto: avoid unneeded memory allocation in ->readdir"

2018-02-27 Thread Yunlong Song

This reverts commit e06f86e61d7a67fe6e826010f57aa39c674f4b1b.

Conflicts:
fs/f2fs/dir.c

On some platforms (such as arm), high memory is used, and decrypting a
filename then causes a panic. For the reason, see commit
569cf1876a32e574ba8a7fb825cd91bafd003882 ("f2fs crypto: allocate buffer
for decrypting filename"):

 We got dentry pages from high_mem, and its address space directly goes into the
 decryption path via f2fs_fname_disk_to_usr.
 But, sg_init_one assumes the address is not from high_mem, so we can get this
 panic since it doesn't call kmap_high but kunmap_high is triggered at the end.

 kernel BUG at ../../../../../../kernel/mm/highmem.c:290!
 Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
 ...
  (kunmap_high+0xb0/0xb8) from [] (__kunmap_atomic+0xa0/0xa4)
  (__kunmap_atomic+0xa0/0xa4) from [] (blkcipher_walk_done+0x128/0x1ec)
  (blkcipher_walk_done+0x128/0x1ec) from [] (crypto_cbc_decrypt+0xc0/0x170)
  (crypto_cbc_decrypt+0xc0/0x170) from [] (crypto_cts_decrypt+0xc0/0x114)
  (crypto_cts_decrypt+0xc0/0x114) from [] (async_decrypt+0x40/0x48)
  (async_decrypt+0x40/0x48) from [] (f2fs_fname_disk_to_usr+0x124/0x304)
  (f2fs_fname_disk_to_usr+0x124/0x304) from [] (f2fs_fill_dentries+0xac/0x188)
  (f2fs_fill_dentries+0xac/0x188) from [] (f2fs_readdir+0x1f0/0x300)
  (f2fs_readdir+0x1f0/0x300) from [] (vfs_readdir+0x90/0xb4)
  (vfs_readdir+0x90/0xb4) from [] (SyS_getdents64+0x64/0xcc)
  (SyS_getdents64+0x64/0xcc) from [] (ret_fast_syscall+0x0/0x30)

However, a later patch,
commit e06f86e61d7a ("f2fs crypto: avoid unneeded memory allocation in ->readdir"),
reverted that code, which causes the panic again on arm, so restore the
old version.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
Reviewed-by: Chao Yu <yuch...@huawei.com>
---
 fs/f2fs/dir.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index f00b5ed..de2e295 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -825,9 +825,16 @@ int f2fs_fill_dentries(struct dir_context *ctx, struct 
f2fs_dentry_ptr *d,
int save_len = fstr->len;
int err;
 
+   de_name.name = f2fs_kmalloc(sbi, de_name.len, GFP_NOFS);
+   if (!de_name.name)
+   return -ENOMEM;
+
+   memcpy(de_name.name, d->filename[bit_pos], de_name.len);
+
err = fscrypt_fname_disk_to_usr(d->inode,
(u32)de->hash_code, 0,
&de_name, fstr);
+   kfree(de_name.name);
if (err)
return err;
 
-- 
1.8.5.2
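
The allocate, copy, call, free sequence added by the hunk above can be
modeled in userspace roughly as follows. This is only a sketch under
stated assumptions: malloc/free stand in for f2fs_kmalloc/kfree, and
transform_in_place() and transform_via_bounce() are hypothetical names
(the former stands in for fscrypt_fname_disk_to_usr and merely reverses
the bytes). It is not the kernel code.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for a routine that must only be handed a
 * private, directly addressable buffer. Here it just reverses bytes.
 */
static int transform_in_place(unsigned char *buf, size_t len)
{
	for (size_t i = 0; i < len / 2; i++) {
		unsigned char tmp = buf[i];

		buf[i] = buf[len - 1 - i];
		buf[len - 1 - i] = tmp;
	}
	return 0;
}

/*
 * Copy src into a freshly allocated bounce buffer, run the transform
 * on the copy, and store the result in out. The bounce buffer is
 * freed on every path, mirroring the kfree() placed right after the
 * call in the patch.
 */
static int transform_via_bounce(const unsigned char *src, size_t len,
				unsigned char *out)
{
	unsigned char *bounce;
	int err;

	/* malloc(0) may legally return NULL, so never request 0 bytes. */
	bounce = malloc(len ? len : 1);
	if (!bounce)
		return -ENOMEM;

	memcpy(bounce, src, len);
	err = transform_in_place(bounce, len);
	if (!err)
		memcpy(out, bounce, len);

	free(bounce);
	return err;
}
```

The point of the copy is that the callee never sees the original
(possibly highmem) address, at the cost of one allocation per name.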



[f2fs-dev] [PATCH v3] f2fs: allocate buffer for decrypting filename to avoid panic

2018-02-27 Thread Yunlong Song

On some platforms (such as arm), high memory is used, and decrypting a
filename then causes a panic. For the reason, see commit
569cf1876a32e574ba8a7fb825cd91bafd003882 ("f2fs crypto: allocate buffer
for decrypting filename"):

 We got dentry pages from high_mem, and its address space directly goes into the
 decryption path via f2fs_fname_disk_to_usr.
 But, sg_init_one assumes the address is not from high_mem, so we can get this
 panic since it doesn't call kmap_high but kunmap_high is triggered at the end.

 kernel BUG at ../../../../../../kernel/mm/highmem.c:290!
 Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
 ...
  (kunmap_high+0xb0/0xb8) from [] (__kunmap_atomic+0xa0/0xa4)
  (__kunmap_atomic+0xa0/0xa4) from [] (blkcipher_walk_done+0x128/0x1ec)
  (blkcipher_walk_done+0x128/0x1ec) from [] (crypto_cbc_decrypt+0xc0/0x170)
  (crypto_cbc_decrypt+0xc0/0x170) from [] (crypto_cts_decrypt+0xc0/0x114)
  (crypto_cts_decrypt+0xc0/0x114) from [] (async_decrypt+0x40/0x48)
  (async_decrypt+0x40/0x48) from [] (f2fs_fname_disk_to_usr+0x124/0x304)
  (f2fs_fname_disk_to_usr+0x124/0x304) from [] (f2fs_fill_dentries+0xac/0x188)
  (f2fs_fill_dentries+0xac/0x188) from [] (f2fs_readdir+0x1f0/0x300)
  (f2fs_readdir+0x1f0/0x300) from [] (vfs_readdir+0x90/0xb4)
  (vfs_readdir+0x90/0xb4) from [] (SyS_getdents64+0x64/0xcc)
  (SyS_getdents64+0x64/0xcc) from [] (ret_fast_syscall+0x0/0x30)

However, a later patch,
commit e06f86e61d7a67fe6e826010f57aa39c674f4b1b ("f2fs crypto: avoid
unneeded memory allocation in ->readdir"),

reverted that code, which causes the panic again on arm, so let's add
back the part of the old patch that covers dentry pages.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/dir.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index f00b5ed..de2e295 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -825,9 +825,16 @@ int f2fs_fill_dentries(struct dir_context *ctx, struct 
f2fs_dentry_ptr *d,
int save_len = fstr->len;
int err;
 
+   de_name.name = f2fs_kmalloc(sbi, de_name.len, GFP_NOFS);
+   if (!de_name.name)
+   return -ENOMEM;
+
+   memcpy(de_name.name, d->filename[bit_pos], de_name.len);
+
err = fscrypt_fname_disk_to_usr(d->inode,
(u32)de->hash_code, 0,
&de_name, fstr);
+   kfree(de_name.name);
if (err)
return err;
 
-- 
1.8.5.2



[f2fs-dev] [PATCH] f2fs: allocate buffer for decrypting filename to avoid panic

2018-02-24 Thread Yunlong Song

On some platforms (such as arm), high memory is used, and decrypting a
filename then causes a panic. For the reason, see commit
569cf1876a32e574ba8a7fb825cd91bafd003882 ("f2fs crypto: allocate buffer
for decrypting filename"):

 We got dentry pages from high_mem, and its address space directly goes into the
 decryption path via f2fs_fname_disk_to_usr.
 But, sg_init_one assumes the address is not from high_mem, so we can get this
 panic since it doesn't call kmap_high but kunmap_high is triggered at the end.

 kernel BUG at ../../../../../../kernel/mm/highmem.c:290!
 Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
 ...
  (kunmap_high+0xb0/0xb8) from [] (__kunmap_atomic+0xa0/0xa4)
  (__kunmap_atomic+0xa0/0xa4) from [] (blkcipher_walk_done+0x128/0x1ec)
  (blkcipher_walk_done+0x128/0x1ec) from [] (crypto_cbc_decrypt+0xc0/0x170)
  (crypto_cbc_decrypt+0xc0/0x170) from [] (crypto_cts_decrypt+0xc0/0x114)
  (crypto_cts_decrypt+0xc0/0x114) from [] (async_decrypt+0x40/0x48)
  (async_decrypt+0x40/0x48) from [] (f2fs_fname_disk_to_usr+0x124/0x304)
  (f2fs_fname_disk_to_usr+0x124/0x304) from [] (f2fs_fill_dentries+0xac/0x188)
  (f2fs_fill_dentries+0xac/0x188) from [] (f2fs_readdir+0x1f0/0x300)
  (f2fs_readdir+0x1f0/0x300) from [] (vfs_readdir+0x90/0xb4)
  (vfs_readdir+0x90/0xb4) from [] (SyS_getdents64+0x64/0xcc)
  (SyS_getdents64+0x64/0xcc) from [] (ret_fast_syscall+0x0/0x30)

However, later patches:
commit 922ec355f86365388203672119b5bca346a45085 ("f2fs crypto: avoid
unneeded memory allocation when {en/de}crypting symlink")
commit e06f86e61d7a67fe6e826010f57aa39c674f4b1b ("f2fs crypto: avoid
unneeded memory allocation in ->readdir")

reverted that code, which causes the panic again on arm, so let's add
the old patch back.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/dir.c   |  7 +++
 fs/f2fs/namei.c | 10 +-
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index f00b5ed..c0cf3e7b 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -825,9 +825,16 @@ int f2fs_fill_dentries(struct dir_context *ctx, struct 
f2fs_dentry_ptr *d,
int save_len = fstr->len;
int err;
 
+   de_name.name = kmalloc(de_name.len, GFP_NOFS);
+   if (!de_name.name)
+   return -ENOMEM;
+
+   memcpy(de_name.name, d->filename[bit_pos], de_name.len);
+
err = fscrypt_fname_disk_to_usr(d->inode,
(u32)de->hash_code, 0,
&de_name, fstr);
+   kfree(de_name.name);
if (err)
return err;
 
diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c
index c4c94c7..2cb70c1 100644
--- a/fs/f2fs/namei.c
+++ b/fs/f2fs/namei.c
@@ -1170,8 +1170,13 @@ static const char *f2fs_encrypted_get_link(struct dentry 
*dentry,
 
/* Symlink is encrypted */
sd = (struct fscrypt_symlink_data *)caddr;
-   cstr.name = sd->encrypted_path;
cstr.len = le16_to_cpu(sd->len);
+   cstr.name = kmalloc(cstr.len, GFP_NOFS);
+   if (!cstr.name) {
+   res = -ENOMEM;
+   goto errout;
+   }
+   memcpy(cstr.name, sd->encrypted_path, cstr.len);
 
/* this is broken symlink case */
if (unlikely(cstr.len == 0)) {
@@ -1198,6 +1203,8 @@ static const char *f2fs_encrypted_get_link(struct dentry 
*dentry,
goto errout;
}
 
+   kfree(cstr.name);
+
paddr = pstr.name;
 
/* Null-terminate the name */
@@ -1207,6 +1214,7 @@ static const char *f2fs_encrypted_get_link(struct dentry 
*dentry,
set_delayed_call(done, kfree_link, paddr);
return paddr;
 errout:
+   kfree(cstr.name);
fscrypt_fname_free_buffer(&pstr);
put_page(cpage);
return ERR_PTR(res);
-- 
1.8.5.2



Re: [f2fs-dev] [PATCH] f2fs: set_code_data in move_data_block

2018-02-10 Thread Yunlong Song


OK, Got it.

On 2018/2/11 11:50, Chao Yu wrote:

On 2018/2/11 11:34, Yunlong Song wrote:

Ping...

move_data_block misses set_cold_data, so F2FS_WB_CP_DATA will not
count these data pages in move_data_block, and write_checkpoint cannot
make sure these pages are committed to the flash.


Hmm.. data block migration is running based on meta inode, so it will
be safe since checkpoint will flush all meta pages including encrypted
pages cached in meta inode?

Thanks,



On 2018/2/8 20:33, Yunlong Song wrote:

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
   fs/f2fs/gc.c | 1 +
   1 file changed, 1 insertion(+)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index b9d93fd..2095630 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -692,6 +692,7 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
fio.op = REQ_OP_WRITE;
fio.op_flags = REQ_SYNC;
fio.new_blkaddr = newaddr;
+   set_cold_data(fio.page);
err = f2fs_submit_page_write(&fio);
if (err) {
if (PageWriteback(fio.encrypted_page))

--
Thanks,
Yunlong Song



Re: [f2fs-dev] [PATCH] f2fs: set_code_data in move_data_block

2018-02-10 Thread Yunlong Song


Ping...

move_data_block misses set_cold_data, so F2FS_WB_CP_DATA will not
count these data pages in move_data_block, and write_checkpoint cannot
make sure these pages are committed to the flash.

On 2018/2/8 20:33, Yunlong Song wrote:

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
  fs/f2fs/gc.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index b9d93fd..2095630 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -692,6 +692,7 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
fio.op = REQ_OP_WRITE;
fio.op_flags = REQ_SYNC;
fio.new_blkaddr = newaddr;
+   set_cold_data(fio.page);
err = f2fs_submit_page_write(&fio);
if (err) {
if (PageWriteback(fio.encrypted_page))



--
Thanks,
Yunlong Song



Re: [f2fs-dev] [PATCH] f2fs: add fi->commit_lock to protect commit GCed pages

2018-02-09 Thread Yunlong Song


The problem is that you cannot find a proper value for the threshold
time. When f2fs_gc selects a GCed data page of an atomic file (one that
has started an atomic write but not committed it yet), f2fs_gc will
run into a loop, and all f2fs ops will be blocked in f2fs_balance_fs.
If the threshold is set larger (e.g. 5s), all f2fs ops will block for
5s, which gives an unexpectedly bad user experience. If the threshold
is set smaller (e.g. 500ms), the atomic ops will probably fail
frequently. BTW, some more patches are needed to notify the atomic ops
themselves that they have timed out and should handle the inmem pages.

Back to these two patches: why not use them to separate inmem pages
and GCed data pages in such a simple way?

On 2018/2/9 21:38, Chao Yu wrote:

On 2018/2/9 21:29, Yunlong Song wrote:

Back to the problem, if we skip out, then the f2fs_gc will go
into dead loop if the apps only atomic start but never atomic


That's another issue, for which I have suggested setting a threshold time
to release atomic/volatile pages in balance_fs_bg.

Thanks,


commit. The main aim of my two patches is to remove the skip
action to avoid the dead loop.

On 2018/2/9 21:26, Chao Yu wrote:

On 2018/2/9 20:56, Yunlong Song wrote:

As I pointed out in my last mail, if the atomic file is not committed
yet, gc_data_segment will register_inmem_page the GCed data pages.


We will skip GCing that page as below check:

- move_data_{page,block}
   - f2fs_is_atomic_file()
 skip out;

No?

Thanks,


This will cause these data pages to be written twice: the first write
happens in move_data_page->do_write_data_page, and the second
write happens in later __commit_inmem_pages->do_write_data_page.

On 2018/2/9 20:44, Chao Yu wrote:

On 2018/2/8 11:11, Yunlong Song wrote:

Then the GCed data pages are totally mixed with the inmem atomic pages,


If we add dio_rwsem, the GC flow is exclusive with the atomic write flow.
There will be no race that mixes GCed pages into atomic pages.

Or you mean:

  - gc_data_segment
   - move_data_page
- f2fs_is_atomic_file
- f2fs_ioc_start_atomic_write
- set_inode_flag(inode, FI_ATOMIC_FILE);
- f2fs_set_data_page_dirty
 - register_inmem_page

In this case, a GCed page can be mixed into a database transaction, but could
it cause any problem other than breaking the isolation rule for transactions?


this will cause the atomic commit ops to write the GCed data pages twice
(the first write happens in GC).

How about using the earlier two patches to separate the inmem data pages
and GCed data pages, and using dio_rwsem instead of this patch to fix the
dnode page problem (dnode page committed but data pages not committed
for the GCed page)?


Could we fix the race case first, based on that fixing, and then find the
place that we can improve?




On 2018/2/7 20:16, Chao Yu wrote:

On 2018/2/6 11:49, Yunlong Song wrote:

This patch adds fi->commit_lock to avoid the case where GCed node pages
are committed but GCed data pages are not. This keeps the db file from
running into an inconsistent state on sudden power-off if data pages of
an atomic file are allowed to be GCed beforehand.


do_fsync:   GC:
- mutex_lock(&fi->commit_lock);
   - lock_page()
- mutex_lock(&fi->commit_lock);
 - lock_page()


Well, please consider lock dependency & code complexity, IMO, reuse
fi->dio_rwsem[WRITE] will be enough as below:

---
 fs/f2fs/file.c | 3 +++
 fs/f2fs/gc.c   | 5 -
 2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 672a542e5464..1bdc11feb8d0 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -1711,6 +1711,8 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)

 inode_lock(inode);

+down_write(&F2FS_I(inode)->dio_rwsem[WRITE]);
+
 if (f2fs_is_volatile_file(inode))
 goto err_out;

@@ -1729,6 +1731,7 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
 ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
 }
 err_out:
+up_write(&F2FS_I(inode)->dio_rwsem[WRITE]);
 inode_unlock(inode);
 mnt_drop_write_file(filp);
 return ret;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index b9d93fd532a9..e49416283563 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,9 +622,6 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
 if (!check_valid_map(F2FS_I_SB(inode), segno, off))
 goto out;

-if (f2fs_is_atomic_file(inode))
-goto out;


Seems that we need this check.


-
 if (f2fs_is_pinned_file(inode)) {
 f2fs_pin_file_control(inode, true);
 goto out;
@@ -729,8 +726,6 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
 if (!check_valid_map(F2FS_I_SB(inode), segno,

Re: [f2fs-dev] [PATCH] f2fs: add fi->commit_lock to protect commit GCed pages

2018-02-09 Thread Yunlong Song


Back to the problem: if we skip out, f2fs_gc will go into a dead loop
when an app only starts an atomic write but never commits it. The main
aim of my two patches is to remove the skip action to avoid the dead
loop.

On 2018/2/9 21:26, Chao Yu wrote:

On 2018/2/9 20:56, Yunlong Song wrote:

As I pointed out in my last mail, if the atomic file is not committed
yet, gc_data_segment will register_inmem_page the GCed data pages.


We will skip GCing that page as below check:

- move_data_{page,block}
  - f2fs_is_atomic_file()
skip out;

No?

Thanks,


This will cause these data pages to be written twice: the first write
happens in move_data_page->do_write_data_page, and the second
write happens in later __commit_inmem_pages->do_write_data_page.

On 2018/2/9 20:44, Chao Yu wrote:

On 2018/2/8 11:11, Yunlong Song wrote:

Then the GCed data pages are totally mixed with the inmem atomic pages,


If we add dio_rwsem, the GC flow is exclusive with the atomic write flow.
There will be no race that mixes GCed pages into atomic pages.

Or you mean:

 - gc_data_segment
  - move_data_page
   - f2fs_is_atomic_file
- f2fs_ioc_start_atomic_write
   - set_inode_flag(inode, FI_ATOMIC_FILE);
   - f2fs_set_data_page_dirty
- register_inmem_page

In this case, a GCed page can be mixed into a database transaction, but could
it cause any problem other than breaking the isolation rule for transactions?


this will cause the atomic commit ops to write the GCed data pages twice
(the first write happens in GC).

How about using the earlier two patches to separate the inmem data pages
and GCed data pages, and using dio_rwsem instead of this patch to fix the
dnode page problem (dnode page committed but data pages not committed
for the GCed page)?


Could we fix the race case first, based on that fixing, and then find the
place that we can improve?




On 2018/2/7 20:16, Chao Yu wrote:

On 2018/2/6 11:49, Yunlong Song wrote:

This patch adds fi->commit_lock to avoid the case where GCed node pages
are committed but GCed data pages are not. This keeps the db file from
running into an inconsistent state on sudden power-off if data pages of
an atomic file are allowed to be GCed beforehand.


do_fsync:   GC:
- mutex_lock(&fi->commit_lock);
  - lock_page()
   - mutex_lock(&fi->commit_lock);
- lock_page()


Well, please consider lock dependency & code complexity, IMO, reuse
fi->dio_rwsem[WRITE] will be enough as below:

---
fs/f2fs/file.c | 3 +++
fs/f2fs/gc.c   | 5 -
2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 672a542e5464..1bdc11feb8d0 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -1711,6 +1711,8 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)

inode_lock(inode);

+down_write(&F2FS_I(inode)->dio_rwsem[WRITE]);
+
if (f2fs_is_volatile_file(inode))
goto err_out;

@@ -1729,6 +1731,7 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
}
err_out:
+up_write(&F2FS_I(inode)->dio_rwsem[WRITE]);
inode_unlock(inode);
mnt_drop_write_file(filp);
return ret;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index b9d93fd532a9..e49416283563 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,9 +622,6 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;

-if (f2fs_is_atomic_file(inode))
-goto out;


Seems that we need this check.


-
if (f2fs_is_pinned_file(inode)) {
f2fs_pin_file_control(inode, true);
goto out;
@@ -729,8 +726,6 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;

-if (f2fs_is_atomic_file(inode))
-goto out;


Ditto.

Thanks,


if (f2fs_is_pinned_file(inode)) {
if (gc_type == FG_GC)
f2fs_pin_file_control(inode, true);

--
Thanks,
Yunlong Song



Re: [f2fs-dev] [PATCH] f2fs: add fi->commit_lock to protect commit GCed pages

2018-02-09 Thread Yunlong Song


As I pointed out in my last mail, if the atomic file is not committed
yet, gc_data_segment will register_inmem_page the GCed data pages.
This will cause these data pages to be written twice: the first write
happens in move_data_page->do_write_data_page, and the second write
happens later in __commit_inmem_pages->do_write_data_page.

On 2018/2/9 20:44, Chao Yu wrote:

On 2018/2/8 11:11, Yunlong Song wrote:

Then the GCed data pages are totally mixed with the inmem atomic pages,


If we add dio_rwsem, the GC flow is exclusive with the atomic write flow.
There will be no race that mixes GCed pages into atomic pages.

Or you mean:

- gc_data_segment
 - move_data_page
  - f2fs_is_atomic_file
- f2fs_ioc_start_atomic_write
  - set_inode_flag(inode, FI_ATOMIC_FILE);
  - f2fs_set_data_page_dirty
   - register_inmem_page

In this case, a GCed page can be mixed into a database transaction, but could
it cause any problem other than breaking the isolation rule for transactions?


this will cause the atomic commit ops to write the GCed data pages twice
(the first write happens in GC).

How about using the earlier two patches to separate the inmem data pages
and GCed data pages, and using dio_rwsem instead of this patch to fix the
dnode page problem (dnode page committed but data pages not committed
for the GCed page)?


Could we fix the race case first, based on that fixing, and then find the
place that we can improve?




On 2018/2/7 20:16, Chao Yu wrote:

On 2018/2/6 11:49, Yunlong Song wrote:

This patch adds fi->commit_lock to avoid the case where GCed node pages
are committed but GCed data pages are not. This keeps the db file from
running into an inconsistent state on sudden power-off if data pages of
an atomic file are allowed to be GCed beforehand.


do_fsync:   GC:
- mutex_lock(&fi->commit_lock);
 - lock_page()
  - mutex_lock(&fi->commit_lock);
   - lock_page()


Well, please consider lock dependency & code complexity, IMO, reuse
fi->dio_rwsem[WRITE] will be enough as below:

---
   fs/f2fs/file.c | 3 +++
   fs/f2fs/gc.c   | 5 -
   2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 672a542e5464..1bdc11feb8d0 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -1711,6 +1711,8 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)

   inode_lock(inode);

+down_write(&F2FS_I(inode)->dio_rwsem[WRITE]);
+
   if (f2fs_is_volatile_file(inode))
   goto err_out;

@@ -1729,6 +1731,7 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
   ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
   }
   err_out:
+up_write(&F2FS_I(inode)->dio_rwsem[WRITE]);
   inode_unlock(inode);
   mnt_drop_write_file(filp);
   return ret;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index b9d93fd532a9..e49416283563 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,9 +622,6 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
   if (!check_valid_map(F2FS_I_SB(inode), segno, off))
   goto out;

-if (f2fs_is_atomic_file(inode))
-goto out;


Seems that we need this check.


-
   if (f2fs_is_pinned_file(inode)) {
   f2fs_pin_file_control(inode, true);
   goto out;
@@ -729,8 +726,6 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
   if (!check_valid_map(F2FS_I_SB(inode), segno, off))
   goto out;

-if (f2fs_is_atomic_file(inode))
-goto out;


Ditto.

Thanks,


   if (f2fs_is_pinned_file(inode)) {
   if (gc_type == FG_GC)
   f2fs_pin_file_control(inode, true);

--
Thanks,
Yunlong Song



[f2fs-dev] [PATCH] f2fs: set_code_data in move_data_block

2018-02-08 Thread Yunlong Song

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/gc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index b9d93fd..2095630 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -692,6 +692,7 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
fio.op = REQ_OP_WRITE;
fio.op_flags = REQ_SYNC;
fio.new_blkaddr = newaddr;
+   set_cold_data(fio.page);
err = f2fs_submit_page_write(&fio);
if (err) {
if (PageWriteback(fio.encrypted_page))
-- 
1.8.5.2



Re: [f2fs-dev] [PATCH] f2fs: add fi->commit_lock to protect commit GCed pages

2018-02-07 Thread Yunlong Song


Then the GCed data pages are totally mixed with the inmem atomic pages;
this will cause the atomic commit ops to write the GCed data pages twice
(the first write happens in GC).

How about using the earlier two patches to separate the inmem data pages
and GCed data pages, and using dio_rwsem instead of this patch to fix the
dnode page problem (dnode page committed but data pages not committed
for the GCed page)?


On 2018/2/7 20:16, Chao Yu wrote:

On 2018/2/6 11:49, Yunlong Song wrote:

This patch adds fi->commit_lock to avoid the case where GCed node pages
are committed but GCed data pages are not. This keeps the db file from
running into an inconsistent state on sudden power-off if data pages of
an atomic file are allowed to be GCed beforehand.


do_fsync:   GC:
- mutex_lock(&fi->commit_lock);
- lock_page()
 - mutex_lock(&fi->commit_lock);
  - lock_page()


Well, please consider lock dependency & code complexity, IMO, reuse
fi->dio_rwsem[WRITE] will be enough as below:

---
  fs/f2fs/file.c | 3 +++
  fs/f2fs/gc.c   | 5 -
  2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 672a542e5464..1bdc11feb8d0 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -1711,6 +1711,8 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)

inode_lock(inode);

+   down_write(&F2FS_I(inode)->dio_rwsem[WRITE]);
+
if (f2fs_is_volatile_file(inode))
goto err_out;

@@ -1729,6 +1731,7 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
}
  err_out:
+   up_write(&F2FS_I(inode)->dio_rwsem[WRITE]);
inode_unlock(inode);
mnt_drop_write_file(filp);
return ret;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index b9d93fd532a9..e49416283563 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,9 +622,6 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;

-   if (f2fs_is_atomic_file(inode))
-   goto out;
-
if (f2fs_is_pinned_file(inode)) {
f2fs_pin_file_control(inode, true);
goto out;
@@ -729,8 +726,6 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;

-   if (f2fs_is_atomic_file(inode))
-   goto out;
if (f2fs_is_pinned_file(inode)) {
if (gc_type == FG_GC)
f2fs_pin_file_control(inode, true);
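
The lock-dependency concern in the do_fsync/GC diagram quoted above can
be modeled in userspace as a tiny order checker. This is a sketch only,
and it rests on one reading of that diagram: one path takes commit_lock
and then a page lock, the other takes them in the opposite order. The
names lock_id, seen, reset_orders and record_order are hypothetical and
not part of f2fs or of the kernel's lockdep.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock identifiers, used by this model only. */
enum lock_id { COMMIT_LOCK, PAGE_LOCK, NR_LOCKS };

/* seen[a][b] is true once some path took lock b while holding lock a. */
static bool seen[NR_LOCKS][NR_LOCKS];

/* Forget every recorded ordering. */
static void reset_orders(void)
{
	memset(seen, 0, sizeof(seen));
}

/*
 * Record that `inner` was acquired while `outer` was held.
 * Returns false if the opposite order was recorded earlier, meaning
 * two paths could block each other (ABBA). Only direct inversions
 * between two locks are detected, not longer cycles.
 */
static bool record_order(enum lock_id outer, enum lock_id inner)
{
	if (seen[inner][outer])
		return false;
	seen[outer][inner] = true;
	return true;
}
```

Recording COMMIT_LOCK then PAGE_LOCK for one path, followed by
PAGE_LOCK then COMMIT_LOCK for the other, makes the second call return
false, which is the inversion the review comment warns about.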



--
Thanks,
Yunlong Song



[f2fs-dev] [PATCH] f2fs: add fi->commit_lock to protect commit GCed pages

2018-02-05 Thread Yunlong Song

This patch adds fi->commit_lock to avoid the case where GCed node pages
are committed but GCed data pages are not. This keeps the db file from
running into an inconsistent state on sudden power-off if data pages of
an atomic file are allowed to be GCed beforehand.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/f2fs.h  |  1 +
 fs/f2fs/file.c  | 15 --
 fs/f2fs/gc.c| 61 +
 fs/f2fs/super.c |  1 +
 4 files changed, 55 insertions(+), 23 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index dbe87c7..b58a8f2 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -628,6 +628,7 @@ struct f2fs_inode_info {
struct list_head inmem_pages;   /* inmemory pages managed by f2fs */
struct task_struct *inmem_task; /* store inmemory task */
struct mutex inmem_lock;/* lock for inmemory pages */
+   struct mutex commit_lock;   /* lock for commit GCed pages */
struct extent_tree *extent_tree;/* cached extent_tree entry */
struct rw_semaphore dio_rwsem[2];/* avoid racing between dio and gc */
struct rw_semaphore i_mmap_sem;
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 672a542..7e14724 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -202,6 +202,7 @@ static int f2fs_do_sync_file(struct file *file, loff_t 
start, loff_t end,
 {
struct inode *inode = file->f_mapping->host;
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
+   struct f2fs_inode_info *fi = F2FS_I(inode);
nid_t ino = inode->i_ino;
int ret = 0;
enum cp_reason_type cp_reason = 0;
@@ -219,11 +220,13 @@ static int f2fs_do_sync_file(struct file *file, loff_t 
start, loff_t end,
/* if fdatasync is triggered, let's do in-place-update */
if (datasync || get_dirty_pages(inode) <= SM_I(sbi)->min_fsync_blocks)
set_inode_flag(inode, FI_NEED_IPU);
+   mutex_lock(&fi->commit_lock);
ret = file_write_and_wait_range(file, start, end);
clear_inode_flag(inode, FI_NEED_IPU);
 
if (ret) {
trace_f2fs_sync_file_exit(inode, cp_reason, datasync, ret);
+   mutex_unlock(&fi->commit_lock);
return ret;
}
 
@@ -244,8 +247,11 @@ static int f2fs_do_sync_file(struct file *file, loff_t 
start, loff_t end,
goto go_write;
 
if (is_inode_flag_set(inode, FI_UPDATE_WRITE) ||
-   exist_written_data(sbi, ino, UPDATE_INO))
+   exist_written_data(sbi, ino, UPDATE_INO)) {
+   mutex_unlock(&fi->commit_lock);
goto flush_out;
+   }
+   mutex_unlock(&fi->commit_lock);
goto out;
}
 go_write:
@@ -268,16 +274,20 @@ static int f2fs_do_sync_file(struct file *file, loff_t 
start, loff_t end,
try_to_fix_pino(inode);
clear_inode_flag(inode, FI_APPEND_WRITE);
clear_inode_flag(inode, FI_UPDATE_WRITE);
+   mutex_unlock(&fi->commit_lock);
goto out;
}
 sync_nodes:
ret = fsync_node_pages(sbi, inode, &wbc, atomic);
-   if (ret)
+   if (ret) {
+   mutex_unlock(&fi->commit_lock);
goto out;
+   }
 
/* if cp_error was enabled, we should avoid infinite loop */
if (unlikely(f2fs_cp_error(sbi))) {
ret = -EIO;
+   mutex_unlock(&fi->commit_lock);
goto out;
}
 
@@ -286,6 +296,7 @@ static int f2fs_do_sync_file(struct file *file, loff_t 
start, loff_t end,
f2fs_write_inode(inode, NULL);
goto sync_nodes;
}
+   mutex_unlock(&fi->commit_lock);
 
/*
 * If it's atomic_write, it's just fine to keep write ordering. So
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 9d54ddb..b98aff5 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -676,13 +676,20 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
goto put_page_out;
}
 
-   if (f2fs_is_atomic_file(inode) &&
-   !f2fs_is_commit_atomic_write(inode) &&
-   !IS_GC_WRITTEN_PAGE(fio.encrypted_page)) {
-   set_page_private(fio.encrypted_page, (unsigned long)GC_WRITTEN_PAGE);
-   SetPagePrivate(fio.encrypted_page);
-   }
-   set_page_dirty(fio.encrypted_page);
+   if (f2fs_is_atomic_file(inode)) {
+   struct f2fs_inode_info *fi = F2FS_I(inode);
+
+   mutex_lock(&fi->commit_lock);
+   if (!f2fs_is_commit_atomic_write(inode) &&
+   !IS_GC_WRITTEN_PAGE(fio.encrypted_page)) {
+   set_page_private(fio.encrypted_page,
+   (unsigned long)GC_WRITTEN_PAGE);
+   SetPagePrivate(fio.encrypted

Re: [f2fs-dev] [PATCH 1/2] f2fs: enable to gc page whose inode already atomic commit

2018-02-05 Thread Yunlong Song


OK, now I get it, thanks for the explanation. So the point is to keep
set_page_dirty from running between file_write_and_wait_range and
fsync_node_pages. We can lock before file_write_and_wait_range and unlock
after fsync_node_pages, and also lock before set_page_dirty and unlock
after set_page_dirty. These patches and the locks make sure the GCed data
pages are all committed to nand flash together with their nodes.
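
The locking order described above can be sketched in user space. This is only an illustrative model, not kernel code: `struct commit_lock_model` and its helper functions are hypothetical names, a pthread mutex stands in for `fi->commit_lock`, and page counters stand in for real dirty pages.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the per-inode state; not kernel code. */
struct commit_lock_model {
	pthread_mutex_t commit_lock;	/* plays the role of fi->commit_lock */
	int dirty_data, dirty_node;	/* pages dirtied but not yet written */
	int flushed_data, flushed_node;	/* pages that reached storage */
};

static void model_init(struct commit_lock_model *m)
{
	int rc = pthread_mutex_init(&m->commit_lock, NULL);

	assert(rc == 0);
	(void)rc;
	m->dirty_data = m->dirty_node = 0;
	m->flushed_data = m->flushed_node = 0;
}

/* GC side: dirtying the migrated page and its node happens under the lock. */
static void gc_move_block(struct commit_lock_model *m)
{
	pthread_mutex_lock(&m->commit_lock);
	m->dirty_data++;	/* stands in for set_page_dirty() */
	m->dirty_node++;	/* the block address update dirties the node */
	pthread_mutex_unlock(&m->commit_lock);
}

/* Commit side: data flush and node flush form one critical section. */
static void commit_pages(struct commit_lock_model *m)
{
	pthread_mutex_lock(&m->commit_lock);
	m->flushed_data += m->dirty_data;	/* file_write_and_wait_range() */
	m->dirty_data = 0;
	m->flushed_node += m->dirty_node;	/* fsync_node_pages() */
	m->dirty_node = 0;
	pthread_mutex_unlock(&m->commit_lock);
}

/* True when no node reached storage without its data page. */
static bool nodes_covered_by_data(const struct commit_lock_model *m)
{
	return m->flushed_node <= m->flushed_data;
}
```

Because GC can only dirty pages before or after the whole commit section, a node update made by GC is never flushed ahead of the data page it refers to in this model.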

On 2018/2/5 19:10, Chao Yu wrote:

On 2018/2/5 17:37, Yunlong Song wrote:



OK, details as I explained before:

atomic_commit   GC
- file_write_and_wait_range
- move_data_block
 - f2fs_submit_page_write
  - f2fs_update_data_blkaddr
   - set_page_dirty
   - fsync_node_pages

1. atomic writes data page #1 & update node #1
2. GC data page #2 & update node #2
3. page #1 & node #1 & #2 can be committed into nand flash before page #2 is
committed.

After a sudden power-cut, the database transaction will be inconsistent. So I
think it would be better to make gc and atomic_write exclude each other with a
lock instead of flag checking.



I do not understand why this transaction is inconsistent, is it a
problem that page #2 is not committed into nand flash? Since normal


Yes, node #2 contains the newly updated LBAx of page #2, but if page #2 is not
committed to LBAx, then after recovery page #2's block address in node #2 will
point to LBAx, which contains random data, resulting in a corrupted db file.
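
The failure mode described above can be modeled with a few lines of C. This is a deliberately tiny sketch under stated assumptions: `struct flash_model` and its helpers are hypothetical, a node is reduced to a single block address, and "random data" is reduced to a per-LBA `written` flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_BLOCKS 8

/* Hypothetical model of what is durable on flash at the moment of a power cut. */
struct flash_model {
	bool written[NR_BLOCKS];	/* true once valid data reached this LBA */
	uint32_t node_blkaddr;		/* block address stored in the durable node */
};

/* The data page reached flash at the given LBA. */
static void persist_data(struct flash_model *f, uint32_t lba)
{
	assert(lba < NR_BLOCKS);
	f->written[lba] = true;
}

/* The node page reached flash and now points at the given LBA. */
static void persist_node(struct flash_model *f, uint32_t lba)
{
	assert(lba < NR_BLOCKS);
	f->node_blkaddr = lba;
}

/* After recovery the node is trusted, so the LBA it names must hold data. */
static bool file_intact_after_crash(const struct flash_model *f)
{
	return f->node_blkaddr < NR_BLOCKS && f->written[f->node_blkaddr];
}
```

If GC moves a page to a new LBA and only the node update becomes durable, `file_intact_after_crash()` returns false in this model; once the data page is durable as well, it returns true again.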


gc also has this problem:

Suppose that there is db file A, f2fs_gc moves data page #1 of db file
A. But if write checkpoint only commit node page #1 and then a sudden


f2fs will ensure GCed data being persisted during checkpoint, so migrated page
#1 and updated node #1 will both be committed in this checkpoint.

Please check WB_DATA_TYPE macro to see how we define data type that cp
guarantees to writeback.


power-cut happens. Data page #1 is not committed to nand flash, but
node page #1 is committed. Is the db transaction broken and
inconsistent?

Come back to your example, I think data page 2 of atomic file does not
belong to this transaction, so even node page 2 is committed, it is just


If node #2 is committed only, it will be harmful to db transaction due to the
reason I said above.

Thanks,


the same problem as what I have listed above(db file A), and it does not
break this transaction.


Thanks,



So how about just using dio_rwsem[WRITE] during atomic committing to exclude
GCing data block of atomic opened file?

Thanks,



Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/data.c | 5 ++---
 fs/f2fs/gc.c   | 6 --
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 7435830..edafcb6 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1580,14 +1580,13 @@ bool should_update_outplace(struct inode *inode, struct 
f2fs_io_info *fio)
return true;
if (S_ISDIR(inode->i_mode))
return true;
-   if (f2fs_is_atomic_file(inode))
-   return true;
if (fio) {
if (is_cold_data(fio->page))
return true;
if (IS_ATOMIC_WRITTEN_PAGE(fio->page))
return true;
-   }
+   } else if (f2fs_is_atomic_file(inode))
+   return true;
return false;
 }
 
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c

index b9d93fd..84ab3ff 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,7 +622,8 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
 
-	if (f2fs_is_atomic_file(inode))

+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
 
 	if (f2fs_is_pinned_file(inode)) {

@@ -729,7 +730,8 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
 
-	if (f2fs_is_atomic_file(inode))

+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
if (f2fs_is_pinned_file(inode)) {
if (gc_type == FG_GC)






--
Thanks,
Yunlong Song



Re: [f2fs-dev] [PATCH 1/2] f2fs: enable to gc page whose inode already atomic commit

2018-02-05 Thread Yunlong Song




OK, details as I explained before:

atomic_commit   GC
- file_write_and_wait_range
- move_data_block
 - f2fs_submit_page_write
  - f2fs_update_data_blkaddr
   - set_page_dirty
  - fsync_node_pages

1. atomic writes data page #1 & update node #1
2. GC data page #2 & update node #2
3. page #1 & node #1 & #2 can be committed into nand flash before page #2 is
committed.

After a sudden power-cut, the database transaction will be inconsistent. So I
think it would be better to make gc and atomic_write exclude each other with a
lock instead of flag checking.



I do not understand why this transaction is inconsistent, is it a
problem that page #2 is not committed into nand flash? Since normal
gc also has this problem:

Suppose that there is db file A, f2fs_gc moves data page #1 of db file
A. But if write checkpoint only commit node page #1 and then a sudden
power-cut happens. Data page #1 is not committed to nand flash, but
node page #1 is committed. Is the db transaction broken and
inconsistent?

Come back to your example, I think data page 2 of atomic file does not
belong to this transaction, so even node page 2 is committed, it is just
the same problem as what I have listed above(db file A), and it does not
break this transaction.


Thanks,



So how about just using dio_rwsem[WRITE] during atomic committing to exclude
GCing data block of atomic opened file?

Thanks,



Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
fs/f2fs/data.c | 5 ++---
fs/f2fs/gc.c   | 6 --
2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 7435830..edafcb6 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1580,14 +1580,13 @@ bool should_update_outplace(struct inode *inode, struct 
f2fs_io_info *fio)
return true;
if (S_ISDIR(inode->i_mode))
return true;
-   if (f2fs_is_atomic_file(inode))
-   return true;
if (fio) {
if (is_cold_data(fio->page))
return true;
if (IS_ATOMIC_WRITTEN_PAGE(fio->page))
return true;
-   }
+   } else if (f2fs_is_atomic_file(inode))
+   return true;
return false;
}

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c

index b9d93fd..84ab3ff 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,7 +622,8 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;

-	if (f2fs_is_atomic_file(inode))

+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;

	if (f2fs_is_pinned_file(inode)) {

@@ -729,7 +730,8 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;

-	if (f2fs_is_atomic_file(inode))

+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
if (f2fs_is_pinned_file(inode)) {
if (gc_type == FG_GC)






--
Thanks,
Yunlong Song



Re: [f2fs-dev] [PATCH 1/2] f2fs: enable to gc page whose inode already atomic commit

2018-02-04 Thread Yunlong Song


Is it necessary to make atomic commit fail? What's the problem with this
patch (no lock at all, and it does not make atomic commit fail)? These two
patches aim to make it possible to gc old blocks of an opened atomic file,
without affecting the original atomic commit and without mixing with inmem
pages.

On 2018/2/5 14:29, Chao Yu wrote:

On 2018/2/5 10:53, Yunlong Song wrote:

Is it necessary to add a lock here? What's the problem with this patch (no
lock at all)? Anyway, the problem should be fixed asap, since an attacker
can easily write an app that does atomic start and never atomic commit.
If the disk layout is heavily fragmented, this makes f2fs loop in gc:
f2fs_gc selects the same victim every time (e.g. one block of the victim
belongs to the opened atomic file, so it is not moved, do_garbage_collect
finally returns 0, and the same victim is selected again next time) and
jumps to gc_more over and over, which blocks all fs ops (they all hang in
f2fs_balance_fs).


Hmm.. from the original commit log and implementation, I supposed that the
patch intended to isolate atomic writes from other IOs like GC-triggered
writes...

Alright, we have discussed the problem before at the link below:
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1571951.html

I meant, for example:

f2fs_ioc_start_atomic_write()
inode->atomic_open_time = get_mtime();

f2fs_ioc_commit_atomic_write()
inode->atomic_open_time = 0;

f2fs_balance_fs_bg()
for_each_atomic_open_file()
if (inode->atomic_open_time &&
inode->atomic_open_time > threshold) {
drop_inmem_pages();
f2fs_msg();
}

threshold = 30s
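
The timeout idea in the pseudocode above can be sketched as plain C. This is an interpretation, not the proposed kernel change: `struct atomic_file_model` and the helpers are hypothetical, the check is read as "elapsed time since atomic start exceeds the threshold", and clearing `atomic_open_time` after the drop is an added assumption so the same file is not dropped twice.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

#define ATOMIC_OPEN_THRESHOLD_SEC 30	/* "threshold = 30s" from the proposal */

/* Hypothetical per-file state; 0 in atomic_open_time means "not open". */
struct atomic_file_model {
	time_t atomic_open_time;
	int inmem_pages;
};

/* Models f2fs_ioc_start_atomic_write(): remember when the write began. */
static void start_atomic_write(struct atomic_file_model *f, time_t now)
{
	assert(now > 0);	/* 0 is reserved as the "not open" marker */
	f->atomic_open_time = now;
}

/* Models f2fs_ioc_commit_atomic_write(): pages are written, timer cleared. */
static void commit_atomic_write(struct atomic_file_model *f)
{
	f->inmem_pages = 0;
	f->atomic_open_time = 0;
}

/*
 * Models the check in f2fs_balance_fs_bg(): drop the in-memory pages of a
 * file that has been atomically open for longer than the threshold.
 * Returns true when pages were dropped.
 */
static bool drop_if_stale(struct atomic_file_model *f, time_t now)
{
	if (!f->atomic_open_time)
		return false;
	if (now - f->atomic_open_time <= ATOMIC_OPEN_THRESHOLD_SEC)
		return false;
	f->inmem_pages = 0;		/* stands in for drop_inmem_pages() */
	f->atomic_open_time = 0;	/* assumption: avoid dropping twice */
	return true;
}
```

With this shape, an app that starts an atomic write and never commits loses its in-memory pages after 30 seconds instead of pinning blocks against GC indefinitely.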

Any thoughts?

Thanks,



On 2018/2/4 22:56, Chao Yu wrote:

On 2018/2/3 10:47, Yunlong Song wrote:

If inode has already started to atomic commit, then set_page_dirty will
not mix the gc pages with the inmem atomic pages, so the page can be
gced safely.


Let's avoid Ccing fs mailing list if the patch didn't change vfs common
codes.

As you know, the problem here is flushing the mixed dnode block without
writing back the gced data block, which breaks the integrity of the transaction.

So how about just using dio_rwsem[WRITE] during atomic committing to exclude
GCing data block of atomic opened file?

Thanks,



Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
   fs/f2fs/data.c | 5 ++---
   fs/f2fs/gc.c   | 6 --
   2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 7435830..edafcb6 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1580,14 +1580,13 @@ bool should_update_outplace(struct inode *inode, struct 
f2fs_io_info *fio)
return true;
if (S_ISDIR(inode->i_mode))
return true;
-   if (f2fs_is_atomic_file(inode))
-   return true;
if (fio) {
if (is_cold_data(fio->page))
return true;
if (IS_ATOMIC_WRITTEN_PAGE(fio->page))
return true;
-   }
+   } else if (f2fs_is_atomic_file(inode))
+   return true;
return false;
   }
   
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c

index b9d93fd..84ab3ff 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,7 +622,8 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
   
-	if (f2fs_is_atomic_file(inode))

+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
   
   	if (f2fs_is_pinned_file(inode)) {

@@ -729,7 +730,8 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
   
-	if (f2fs_is_atomic_file(inode))

+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
if (f2fs_is_pinned_file(inode)) {
    if (gc_type == FG_GC)






--
Thanks,
Yunlong Song



Re: [f2fs-dev] [PATCH 1/2] f2fs: enable to gc page whose inode already atomic commit

2018-02-04 Thread Yunlong Song

Is it necessary to add a lock here? What's the problem with this patch (no
lock at all)? Anyway, the problem should be fixed asap, since an attacker
can easily write an app that does atomic start and never atomic commit.
If the disk layout is heavily fragmented, this makes f2fs loop in gc:
f2fs_gc selects the same victim every time (e.g. one block of the victim
belongs to the opened atomic file, so it is not moved, do_garbage_collect
finally returns 0, and the same victim is selected again next time) and
jumps to gc_more over and over, which blocks all fs ops (they all hang in
f2fs_balance_fs).
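
The gc loop described above can be illustrated with a toy model. Everything here is a simplification introduced for illustration: `struct seg_model`, the greedy "fewest valid blocks" policy, and the helper names are hypothetical and much simpler than the real victim selection in f2fs.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_SEGS 4

/* Hypothetical segment summary: valid block count plus a "pinned" marker. */
struct seg_model {
	int valid_blocks;
	bool has_unmovable_block;	/* e.g. a block of an opened atomic file */
};

/* Greedy selection: the non-empty segment with the fewest valid blocks. */
static int select_victim(const struct seg_model *segs)
{
	int victim = -1;

	for (int i = 0; i < NR_SEGS; i++) {
		if (segs[i].valid_blocks == 0)
			continue;
		if (victim < 0 || segs[i].valid_blocks < segs[victim].valid_blocks)
			victim = i;
	}
	return victim;
}

/* Returns 1 if the segment was freed, 0 if a block had to stay behind. */
static int collect(struct seg_model *seg)
{
	assert(seg->valid_blocks > 0);
	if (seg->has_unmovable_block) {
		seg->valid_blocks = 1;	/* movable blocks leave, one block stays */
		return 0;
	}
	seg->valid_blocks = 0;
	return 1;
}

/* Counts rounds that freed nothing, giving up after max_rounds. */
static int stuck_rounds(struct seg_model *segs, int max_rounds)
{
	int stuck = 0;

	for (int round = 0; round < max_rounds; round++) {
		int victim = select_victim(segs);

		if (victim < 0)
			break;
		if (!collect(&segs[victim]))
			stuck++;
	}
	return stuck;
}
```

In this model a single segment holding one unmovable block is chosen in every round, so `stuck_rounds()` equals `max_rounds`, which mirrors the repeated jump to gc_more described in the message.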


On 2018/2/4 22:56, Chao Yu wrote:

On 2018/2/3 10:47, Yunlong Song wrote:

If inode has already started to atomic commit, then set_page_dirty will
not mix the gc pages with the inmem atomic pages, so the page can be
gced safely.


Let's avoid Ccing fs mailing list if the patch didn't change vfs common
codes.

As you know, the problem here is flushing the mixed dnode block without
writing back the gced data block, which breaks the integrity of the transaction.

So how about just using dio_rwsem[WRITE] during atomic committing to exclude
GCing data block of atomic opened file?

Thanks,



Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
  fs/f2fs/data.c | 5 ++---
  fs/f2fs/gc.c   | 6 --
  2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 7435830..edafcb6 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1580,14 +1580,13 @@ bool should_update_outplace(struct inode *inode, struct 
f2fs_io_info *fio)
return true;
if (S_ISDIR(inode->i_mode))
return true;
-   if (f2fs_is_atomic_file(inode))
-   return true;
if (fio) {
if (is_cold_data(fio->page))
return true;
if (IS_ATOMIC_WRITTEN_PAGE(fio->page))
return true;
-   }
+   } else if (f2fs_is_atomic_file(inode))
+   return true;
return false;
  }
  
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c

index b9d93fd..84ab3ff 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,7 +622,8 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
  
-	if (f2fs_is_atomic_file(inode))

+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
  
  	if (f2fs_is_pinned_file(inode)) {

@@ -729,7 +730,8 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
  
-	if (f2fs_is_atomic_file(inode))

+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
if (f2fs_is_pinned_file(inode)) {
if (gc_type == FG_GC)






--
Thanks,
Yunlong Song



[f2fs-dev] [PATCH 2/2] f2fs: add GC_WRITTEN_PAGE to gc atomic file

2018-02-02 Thread Yunlong Song

This patch enables gc of atomic files by adding GC_WRITTEN_PAGE to
identify the gced pages of an atomic file, which avoids
register_inmem_page in set_page_dirty, so the gced pages will not mix
with the inmem pages.
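
The decision this patch changes in f2fs_set_data_page_dirty can be sketched as a pure function. This is a model under stated assumptions: `struct gc_inode_model`, `struct gc_page_model` and `should_register_inmem()` are hypothetical names, and the three sentinel values are copied from the segment.h hunk of this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Sentinel values stored in page private, as in the segment.h hunk. */
#define ATOMIC_WRITTEN_PAGE	((unsigned long)-1)
#define DUMMY_WRITTEN_PAGE	((unsigned long)-2)
#define GC_WRITTEN_PAGE		((unsigned long)-3)

/* Hypothetical reductions of struct inode and struct page. */
struct gc_inode_model {
	bool atomic_file;	/* f2fs_is_atomic_file() */
	bool committing;	/* f2fs_is_commit_atomic_write() */
};

struct gc_page_model {
	unsigned long private_data;	/* page_private() */
};

/* True when dirtying this page would queue it on the inmem list. */
static bool should_register_inmem(const struct gc_inode_model *inode,
				  const struct gc_page_model *page)
{
	assert(inode != NULL && page != NULL);
	if (!inode->atomic_file || inode->committing)
		return false;
	if (page->private_data == GC_WRITTEN_PAGE)
		return false;	/* gc-tagged pages bypass the inmem list */
	return page->private_data != ATOMIC_WRITTEN_PAGE;
}
```

A page tagged by gc is therefore dirtied like an ordinary page, while an untagged page of the same uncommitted atomic file still goes through register_inmem_page.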

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/data.c|  7 ++-
 fs/f2fs/gc.c  | 25 ++---
 fs/f2fs/segment.h |  3 +++
 3 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index edafcb6..5e1fc5d 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -120,6 +120,10 @@ static void f2fs_write_end_io(struct bio *bio)
 
dec_page_count(sbi, type);
clear_cold_data(page);
+   if (IS_GC_WRITTEN_PAGE(page)) {
+   set_page_private(page, 0);
+   ClearPagePrivate(page);
+   }
end_page_writeback(page);
}
if (!get_pages(sbi, F2FS_WB_CP_DATA) &&
@@ -2418,7 +2422,8 @@ static int f2fs_set_data_page_dirty(struct page *page)
if (!PageUptodate(page))
SetPageUptodate(page);
 
-   if (f2fs_is_atomic_file(inode) && !f2fs_is_commit_atomic_write(inode)) {
+   if (f2fs_is_atomic_file(inode) && !f2fs_is_commit_atomic_write(inode)
+   && !IS_GC_WRITTEN_PAGE(page)) {
if (!IS_ATOMIC_WRITTEN_PAGE(page)) {
register_inmem_page(inode, page);
return 1;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 84ab3ff..9d54ddb 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,10 +622,6 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
 
-   if (f2fs_is_atomic_file(inode) &&
-   !f2fs_is_commit_atomic_write(inode))
-   goto out;
-
if (f2fs_is_pinned_file(inode)) {
f2fs_pin_file_control(inode, true);
goto out;
@@ -680,6 +676,12 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
goto put_page_out;
}
 
+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode) &&
+   !IS_GC_WRITTEN_PAGE(fio.encrypted_page)) {
+   set_page_private(fio.encrypted_page, (unsigned long)GC_WRITTEN_PAGE);
+   SetPagePrivate(fio.encrypted_page);
+   }
set_page_dirty(fio.encrypted_page);
f2fs_wait_on_page_writeback(fio.encrypted_page, DATA, true);
if (clear_page_dirty_for_io(fio.encrypted_page))
@@ -730,9 +732,6 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
 
-   if (f2fs_is_atomic_file(inode) &&
-   !f2fs_is_commit_atomic_write(inode))
-   goto out;
if (f2fs_is_pinned_file(inode)) {
if (gc_type == FG_GC)
f2fs_pin_file_control(inode, true);
@@ -742,6 +741,12 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (gc_type == BG_GC) {
if (PageWriteback(page))
goto out;
+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode) &&
+   !IS_GC_WRITTEN_PAGE(page)) {
+   set_page_private(page, (unsigned long)GC_WRITTEN_PAGE);
+   SetPagePrivate(page);
+   }
set_page_dirty(page);
set_cold_data(page);
} else {
@@ -762,6 +767,12 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
int err;
 
 retry:
+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode) &&
+   !IS_GC_WRITTEN_PAGE(page)) {
+   set_page_private(page, (unsigned long)GC_WRITTEN_PAGE);
+   SetPagePrivate(page);
+   }
set_page_dirty(page);
f2fs_wait_on_page_writeback(page, DATA, true);
if (clear_page_dirty_for_io(page)) {
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index f11c4bc..f0a6432 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -203,11 +203,14 @@ struct segment_allocation {
  */
 #define ATOMIC_WRITTEN_PAGE((unsigned long)-1)
 #define DUMMY_WRITTEN_PAGE ((unsigned long)-2)
+#define GC_WRITTEN_PAGE((unsigned long)-3)
 
 #define IS_ATOMIC_WRITTEN_PAGE(page)   \
(page_private(page) == (unsigned long)ATOMIC_WRITTEN_PAGE)
 #define IS_DUMMY_WRITTEN_PAGE(page)\
(page_private(page) == (unsigned long)DUMMY_WRITTEN_PAGE)
+#define

[f2fs-dev] [PATCH 1/2] f2fs: enable to gc page whose inode already atomic commit

2018-02-02 Thread Yunlong Song

If the inode has already started its atomic commit, then set_page_dirty
will not mix the gc pages with the inmem atomic pages, so the page can be
gced safely.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/data.c | 5 ++---
 fs/f2fs/gc.c   | 6 --
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 7435830..edafcb6 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1580,14 +1580,13 @@ bool should_update_outplace(struct inode *inode, struct 
f2fs_io_info *fio)
return true;
if (S_ISDIR(inode->i_mode))
return true;
-   if (f2fs_is_atomic_file(inode))
-   return true;
if (fio) {
if (is_cold_data(fio->page))
return true;
if (IS_ATOMIC_WRITTEN_PAGE(fio->page))
return true;
-   }
+   } else if (f2fs_is_atomic_file(inode))
+   return true;
return false;
 }
 
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index b9d93fd..84ab3ff 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,7 +622,8 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
 
-   if (f2fs_is_atomic_file(inode))
+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
 
if (f2fs_is_pinned_file(inode)) {
@@ -729,7 +730,8 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
 
-   if (f2fs_is_atomic_file(inode))
+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
if (f2fs_is_pinned_file(inode)) {
if (gc_type == FG_GC)
-- 
1.8.5.2



Re: [f2fs-dev] [PATCH v2] f2fs: handle newly created page when revoking inmem pages

2018-01-10 Thread Yunlong Song


Should it be "When committing inmem pages is not successful" ?

On 2018/1/11 8:17, Daeho Jeong wrote:

When committing inmem pages is successful, we revoke already committed
blocks in __revoke_inmem_pages() and finally replace the committed
ones with the old blocks using f2fs_replace_block(). However, if
the committed block was a newly created one, the address of the old
block is NEW_ADDR and __f2fs_replace_block() cannot handle NEW_ADDR
as new_blkaddr properly and a kernel panic occurs.

Signed-off-by: Daeho Jeong <daeho.je...@samsung.com>
Tested-by: Shu Tan <shu@samsung.com>
Reviewed-by: Chao Yu <yuch...@huawei.com>
---
  fs/f2fs/segment.c | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index c117e09..0673d08 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -248,7 +248,11 @@ static int __revoke_inmem_pages(struct inode *inode,
goto next;
}
get_node_info(sbi, dn.nid, &ni);
-   f2fs_replace_block(sbi, &dn, dn.data_blkaddr,
+   if (cur->old_addr == NEW_ADDR) {
+   invalidate_blocks(sbi, dn.data_blkaddr);
+   f2fs_update_data_blkaddr(&dn, NEW_ADDR);
+   } else
+   f2fs_replace_block(sbi, &dn, dn.data_blkaddr,
cur->old_addr, ni.version, true, true);
f2fs_put_dnode(&dn);
    }
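
The branch added by the quoted patch can be summarized as a small decision function. This is a sketch only: `enum revoke_action` and `revoke_action_for()` are hypothetical, and `NEW_ADDR` is modeled as an all-ones 32-bit value on the assumption that it mirrors the kernel's sentinel for a block that has been allocated but never written.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror the kernel sentinel for "new, not yet written" blocks. */
#define NEW_ADDR	((uint32_t)-1)

/* Hypothetical names for the two paths taken when revoking a committed block. */
enum revoke_action {
	REVOKE_INVALIDATE_AND_MARK_NEW,	/* invalidate_blocks() + set NEW_ADDR */
	REVOKE_REPLACE_WITH_OLD,	/* f2fs_replace_block() back to old_addr */
};

/* Picks the revoke path from the block address recorded before the commit. */
static enum revoke_action revoke_action_for(uint32_t old_addr)
{
	if (old_addr == NEW_ADDR)
		return REVOKE_INVALIDATE_AND_MARK_NEW;
	return REVOKE_REPLACE_WITH_OLD;
}

/* Usage sketch: a fresh block is invalidated, an existing one is restored. */
static void revoke_action_demo(void)
{
	assert(revoke_action_for(NEW_ADDR) == REVOKE_INVALIDATE_AND_MARK_NEW);
	assert(revoke_action_for(0x1234) == REVOKE_REPLACE_WITH_OLD);
}
```

The point of the split is that there is no old on-disk block to restore when the page was newly created, so the committed block is simply invalidated and the address reset.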



--
Thanks,
Yunlong Song



Re: [f2fs-dev] [PATCH 2/2] f2fs: add reserved blocks for root user

2018-01-05 Thread Yunlong Song




On 2018/1/4 2:58, Jaegeuk Kim wrote:

@@ -1590,11 +1598,17 @@ static inline int inc_valid_block_count(struct 
f2fs_sb_info *sbi,
sbi->total_valid_block_count += (block_t)(*count);
avail_user_block_count = sbi->user_block_count -
sbi->current_reserved_blocks;
+
+   if (!(test_opt(sbi, RESERVE_ROOT) && capable(CAP_SYS_RESOURCE)))
+   avail_user_block_count -= sbi->root_reserved_blocks;


It would be better as:

+   if (test_opt(sbi, RESERVE_ROOT) && !capable(CAP_SYS_RESOURCE))
+   avail_user_block_count -= sbi->root_reserved_blocks;


@@ -1783,9 +1797,13 @@ static inline int inc_valid_node_count(struct 
f2fs_sb_info *sbi,
  
  	spin_lock(&sbi->stat_lock);
  
-	valid_block_count = sbi->total_valid_block_count + 1;

-   if (unlikely(valid_block_count + sbi->current_reserved_blocks >
-   sbi->user_block_count)) {
+   valid_block_count = sbi->total_valid_block_count +
+   sbi->current_reserved_blocks + 1;
+



+   if (!(test_opt(sbi, RESERVE_ROOT) && capable(CAP_SYS_RESOURCE)))
+   valid_block_count += sbi->root_reserved_blocks;
+

It would be better as:

+   if (test_opt(sbi, RESERVE_ROOT) && !capable(CAP_SYS_RESOURCE))
+   valid_block_count += sbi->root_reserved_blocks;



+   if (unlikely(valid_block_count > sbi->user_block_count)) {
    spin_unlock(&sbi->stat_lock);
goto enospc;
}
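
The two conditions discussed above are not equivalent, which a truth-table check makes visible. The sketch below uses hypothetical function names; `reserve_root` stands for `test_opt(sbi, RESERVE_ROOT)` and `cap` for `capable(CAP_SYS_RESOURCE)`.

```c
#include <assert.h>
#include <stdbool.h>

/* Condition in the posted patch: !(RESERVE_ROOT && capable). */
static bool applies_in_patch(bool reserve_root, bool cap)
{
	return !(reserve_root && cap);
}

/* Condition suggested in the reply: RESERVE_ROOT && !capable. */
static bool applies_in_suggestion(bool reserve_root, bool cap)
{
	return reserve_root && !cap;
}

/* Counts the input combinations on which the two conditions disagree. */
static int count_disagreements(void)
{
	int n = 0;

	for (int r = 0; r <= 1; r++)
		for (int c = 0; c <= 1; c++)
			if (applies_in_patch(r, c) != applies_in_suggestion(r, c))
				n++;
	return n;
}

/* Usage sketch: the two forms differ only when RESERVE_ROOT is off. */
static void reserve_condition_demo(void)
{
	assert(applies_in_patch(false, false) && !applies_in_suggestion(false, false));
	assert(applies_in_patch(false, true) && !applies_in_suggestion(false, true));
	assert(applies_in_patch(true, false) == applies_in_suggestion(true, false));
	assert(applies_in_patch(true, true) == applies_in_suggestion(true, true));
}
```

When the mount option is off, the patch's form still adjusts the count by `root_reserved_blocks`, which only matters if that field can be non-zero without the option; the suggested form skips the adjustment entirely in that case.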



--
Thanks,
Yunlong Song




Re: [f2fs-dev] [PATCH 1/2] f2fs: show precise # of blocks that user/root can use

2018-01-04 Thread Yunlong Song


NACK

man statfs shows:

struct statfs {
...
fsblkcnt_t   f_bfree;   /* free blocks in fs */
fsblkcnt_t   f_bavail;  /* free blocks available to
unprivileged user */
...
}

f_bfree is free blocks in fs, so buf->f_bfree should be

buf->f_bfree = user_block_count - valid_user_blocks(sbi) + ovp_count;
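
The distinction being argued for can be written out as two formulas taken from the removed lines of the quoted diff. This is a sketch with hypothetical names (`struct fs_counts`, `model_bfree()`, `model_bavail()`) and made-up numbers; it only shows that the two values differ by the overprovision and reserved counts.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of the counters used by f2fs_statfs(). */
struct fs_counts {
	uint64_t user_block_count;
	uint64_t valid_user_blocks;
	uint64_t ovp_count;			/* overprovision blocks */
	uint64_t current_reserved_blocks;
};

/* Free blocks in the filesystem, including overprovision space. */
static uint64_t model_bfree(const struct fs_counts *c)
{
	assert(c->valid_user_blocks <= c->user_block_count);
	return c->user_block_count - c->valid_user_blocks + c->ovp_count;
}

/* Free blocks an unprivileged user can actually allocate. */
static uint64_t model_bavail(const struct fs_counts *c)
{
	assert(c->valid_user_blocks + c->current_reserved_blocks <=
	       c->user_block_count);
	return c->user_block_count - c->valid_user_blocks -
	       c->current_reserved_blocks;
}
```

With `user_block_count = 1000`, `valid_user_blocks = 400`, `ovp_count = 50` and `current_reserved_blocks = 20`, the model gives 650 for f_bfree and 580 for f_bavail, so setting the two fields equal would hide 70 blocks of difference.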

On 2018/1/4 2:58, Jaegeuk Kim wrote:

Let's show precise # of blocks that user/root can use through bavail and bfree
respectively.

Signed-off-by: Jaegeuk Kim <jaeg...@kernel.org>
---
  fs/f2fs/super.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 0a820ba55b10..4c1c99cf54ef 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1005,9 +1005,9 @@ static int f2fs_statfs(struct dentry *dentry, struct 
kstatfs *buf)
buf->f_bsize = sbi->blocksize;
  
  	buf->f_blocks = total_count - start_count;

-   buf->f_bfree = user_block_count - valid_user_blocks(sbi) + ovp_count;
-   buf->f_bavail = user_block_count - valid_user_blocks(sbi) -
+   buf->f_bfree = user_block_count - valid_user_blocks(sbi) -
sbi->current_reserved_blocks;
+   buf->f_bavail = buf->f_bfree;
  
  	avail_node_count = sbi->total_node_count - sbi->nquota_files -

    F2FS_RESERVED_NODE_NUM;


--
Thanks,
Yunlong Song




[f2fs-dev] [PATCH v4] f2fs: check segment type in __f2fs_replace_block

2018-01-03 Thread Yunlong Song

In some cases, the node blocks have a wrong blkaddr whose segment type is
NODE, e.g., a recovered inode is missing its xattr flag and the blkaddr is
in the xattr range. Since fsck.f2fs does not check the recovery nodes, this
causes __f2fs_replace_block to change the curseg of node and call
update_sit_entry(sbi, new_blkaddr, 1) without refreshing next_blkoff. As a
result, when the recovery process writes the checkpoint and syncs nodes, the
next_blkoff of curseg is used in the segment bitmap, which triggers
f2fs_bug_on. So let's check the segment type in __f2fs_replace_block.
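
The check added by this patch can be sketched outside the kernel. The enum order and the two macros below are assumptions modeled on f2fs (data types first, node types after), and `replace_block_would_bug()` is a hypothetical helper that evaluates the same expression the patch passes to f2fs_bug_on.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed ordering: the three data log types come before the node types. */
enum seg_type_model {
	CURSEG_HOT_DATA = 0,
	CURSEG_WARM_DATA,
	CURSEG_COLD_DATA,
	CURSEG_HOT_NODE,
	CURSEG_WARM_NODE,
	CURSEG_COLD_NODE,
};

/* Modeled on the kernel macros; treat the exact definitions as assumptions. */
#define IS_DATASEG(t)	((t) <= CURSEG_COLD_DATA)
#define IS_NODESEG(t)	((t) >= CURSEG_HOT_NODE)

/* Same expression as f2fs_bug_on(sbi, se->valid_blocks && !IS_DATASEG(type)). */
static bool replace_block_would_bug(unsigned int valid_blocks,
				    enum seg_type_model type)
{
	return valid_blocks && !IS_DATASEG(type);
}

/* Usage sketch: only a populated non-data segment trips the check. */
static void seg_type_check_demo(void)
{
	assert(replace_block_would_bug(1, CURSEG_HOT_NODE));
	assert(!replace_block_would_bug(1, CURSEG_WARM_DATA));
	assert(!replace_block_would_bug(0, CURSEG_COLD_NODE));
}
```

Under these assumed definitions `!IS_DATASEG(type)` and `IS_NODESEG(type)` agree for the six types listed, so the move from the v2 form to the v3 form only matters for type values outside that range.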

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/segment.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 890d483..50575d5 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -2720,6 +2720,7 @@ void __f2fs_replace_block(struct f2fs_sb_info *sbi, 
struct f2fs_summary *sum,
type = se->type;
 
down_write(&SM_I(sbi)->curseg_lock);
+   f2fs_bug_on(sbi, se->valid_blocks && !IS_DATASEG(type));
 
if (!recover_curseg) {
/* for recovery flow */
-- 
1.8.5.2



[f2fs-dev] [PATCH v3] f2fs: check segment type in __f2fs_replace_block

2018-01-03 Thread Yunlong Song

In some cases, the node blocks have a wrong blkaddr whose segment type is
NODE, e.g., a recovered inode is missing its xattr flag and the blkaddr is
in the xattr range. Since fsck.f2fs does not check the recovery nodes, this
causes __f2fs_replace_block to change the curseg of node and call
update_sit_entry(sbi, new_blkaddr, 1) without refreshing next_blkoff. As a
result, when the recovery process writes the checkpoint and syncs nodes, the
next_blkoff of curseg is used in the segment bitmap, which triggers
f2fs_bug_on. So let's check the segment type in __f2fs_replace_block.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/segment.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 890d483..6c6d2dd 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -2719,6 +2719,8 @@ void __f2fs_replace_block(struct f2fs_sb_info *sbi, 
struct f2fs_summary *sum,
se = get_seg_entry(sbi, segno);
type = se->type;
 
+   f2fs_bug_on(sbi, se->valid_blocks && !IS_DATASEG(type));
+
down_write(&SM_I(sbi)->curseg_lock);
 
if (!recover_curseg) {
-- 
1.8.5.2



[f2fs-dev] [PATCH v2] f2fs: check segment type in __f2fs_replace_block

2018-01-03 Thread Yunlong Song

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/segment.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 890d483..e3bbabf 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -2719,6 +2719,8 @@ void __f2fs_replace_block(struct f2fs_sb_info *sbi, 
struct f2fs_summary *sum,
se = get_seg_entry(sbi, segno);
type = se->type;
 
+   f2fs_bug_on(sbi, se->valid_blocks && IS_NODESEG(type));
+
down_write(&SM_I(sbi)->curseg_lock);
 
if (!recover_curseg) {
-- 
1.8.5.2



Re: [f2fs-dev] [PATCH] f2fs: check segment type before recover data

2018-01-02 Thread Yunlong Song




On 2018/1/2 14:49, Chao Yu wrote:

On 2017/12/30 15:42, Yunlong Song wrote:

In some case, the node blocks has wrong blkaddr whose segment type is

You mean *data block* has wrong blkaddr whose segment type is NODE?

Yes.



NODE, e.g., recover inode has missing xattr flag and the blkaddr is in
the xattr range. Since fsck.f2fs does not check the recovery nodes, this
will cause __f2fs_replace_block change the curseg of node and do the
update_sit_entry(sbi, new_blkaddr, 1) with no next_blkoff refresh, as a

Do you mean the root cause is that __f2fs_replace_block didn't update
next_blkoff?

No, it's not the root cause. The root cause may be something like DDR flip.



result, when recovery process write checkpoint and sync nodes, the
next_blkoff of curseg is used in the segment bit map, then it will
cause f2fs_bug_on. So let's check the segment type before recover data,
and stop recover if it is not in DATA segment.

Sorry, I can't catch the whole cause and effect from you description, if
possible, could you give an example?
For example, the i_inline flag has F2FS_INLINE_XATTR set, and the last 50
i_addrs hold xattr content. But if a DDR bit flips, the i_inline flag may
lose F2FS_INLINE_XATTR, and the last 50 i_addrs are then treated as data
block addresses. Suppose the xattr content is 0x1234, 0x1234 happens to be
a valid block address, and block 0x1234 happens to be in a warm node
segment. Then do_recover_data will call f2fs_replace_block() with
dest = 0x1234, which changes the warm node curseg to 0x1234's segment and
calls update_sit_entry(sbi, 0x1234, 1); curseg->next_blkoff also points to
0x1234's offset in its segment. When the recovery process calls
write_checkpoint, syncing nodes writes to 0x1234's offset in the warm node
curseg. update_sit_entry then checks that offset, finds the bitmap bit
already set to 1, and calls f2fs_bug_on.
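
The sequence above can be modelled with a toy validity bitmap. This is only
a sketch under stated assumptions: a 512-block segment, and the hypothetical
helpers mark_valid(), recover_block() and write_node(), which stand in for
update_sit_entry(), the replace path and the node write respectively. None of
these names come from the kernel.

```c
#include <assert.h>
#include <stdbool.h>

#define BLOCKS_PER_SEG 512	/* assumed segment size, for illustration */

/* Hypothetical, reduced model of one segment's validity bitmap. */
struct seg_sketch {
	unsigned char valid_map[BLOCKS_PER_SEG / 8];
	unsigned int next_blkoff;	/* where the current log writes next */
};

/* Set the bit for blkoff; return false if it was already set (the bug). */
static bool mark_valid(struct seg_sketch *seg, unsigned int blkoff)
{
	unsigned char mask = (unsigned char)(1u << (blkoff % 8));

	assert(blkoff < BLOCKS_PER_SEG);
	if (seg->valid_map[blkoff / 8] & mask)
		return false;
	seg->valid_map[blkoff / 8] |= mask;
	return true;
}

/*
 * Recovery replaces a block at blkoff: the block is marked valid and the
 * log position is left pointing at the same offset (no refresh).
 */
static bool recover_block(struct seg_sketch *seg, unsigned int blkoff)
{
	seg->next_blkoff = blkoff;
	return mark_valid(seg, blkoff);
}

/* A later node write lands on next_blkoff and marks that offset valid. */
static bool write_node(struct seg_sketch *seg)
{
	return mark_valid(seg, seg->next_blkoff);
}
```

Calling recover_block() and then write_node() on the same segment makes the
second call return false, which corresponds to the point where the kernel
would hit f2fs_bug_on.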



Thanks,


Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
  fs/f2fs/recovery.c | 3 ++-
  fs/f2fs/segment.h  | 3 +++
  2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
index 7d63faf..e8fee4a 100644
--- a/fs/f2fs/recovery.c
+++ b/fs/f2fs/recovery.c
@@ -478,7 +478,8 @@ static int do_recover_data(struct f2fs_sb_info *sbi, struct 
inode *inode,
}
  
  		/* dest is valid block, try to recover from src to dest */

-   if (is_valid_blkaddr(sbi, dest, META_POR)) {
+   if (is_valid_blkaddr(sbi, dest, META_POR) &&
+   is_data_blkaddr(sbi, dest)) {
  
  			if (src == NULL_ADDR) {

err = reserve_new_block();
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index 71a2aaa..5c5a215 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -115,6 +115,9 @@
  #define SECTOR_TO_BLOCK(sectors)  \
((sectors) >> F2FS_LOG_SECTORS_PER_BLOCK)
  
+#define is_data_blkaddr(sbi, blkaddr)	\

+   (IS_DATASEG(get_seg_entry(sbi, GET_SEGNO(sbi, blkaddr))->type))
+
  /*
   * indicate a block allocation direction: RIGHT and LEFT.
   * RIGHT means allocating new sections towards the end of volume.



.



--
Thanks,
Yunlong Song




[f2fs-dev] [PATCH] f2fs: check segment type before recover data

2017-12-29 Thread Yunlong Song

In some cases, a node block holds a wrong data blkaddr whose segment type
is NODE, e.g., the recovered inode is missing its xattr flag and the
blkaddr is in the xattr range. Since fsck.f2fs does not check the recovery
nodes, this causes __f2fs_replace_block to change the node curseg and call
update_sit_entry(sbi, new_blkaddr, 1) without refreshing next_blkoff. As a
result, when the recovery process writes the checkpoint and syncs nodes,
the next_blkoff of the curseg is already marked as used in the segment
bitmap, which triggers f2fs_bug_on. So let's check the segment type before
recovering data, and stop the recovery if the block is not in a DATA
segment.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/recovery.c | 3 ++-
 fs/f2fs/segment.h  | 3 +++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
index 7d63faf..e8fee4a 100644
--- a/fs/f2fs/recovery.c
+++ b/fs/f2fs/recovery.c
@@ -478,7 +478,8 @@ static int do_recover_data(struct f2fs_sb_info *sbi, struct 
inode *inode,
}
 
/* dest is valid block, try to recover from src to dest */
-   if (is_valid_blkaddr(sbi, dest, META_POR)) {
+   if (is_valid_blkaddr(sbi, dest, META_POR) &&
+   is_data_blkaddr(sbi, dest)) {
 
if (src == NULL_ADDR) {
err = reserve_new_block();
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index 71a2aaa..5c5a215 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -115,6 +115,9 @@
 #define SECTOR_TO_BLOCK(sectors)   \
((sectors) >> F2FS_LOG_SECTORS_PER_BLOCK)
 
+#define is_data_blkaddr(sbi, blkaddr)  \
+   (IS_DATASEG(get_seg_entry(sbi, GET_SEGNO(sbi, blkaddr))->type))
+
 /*
  * indicate a block allocation direction: RIGHT and LEFT.
  * RIGHT means allocating new sections towards the end of volume.
-- 
1.8.5.2



Re: [f2fs-dev] [PATCH] f2fs: avoid f2fs_gc dead loop

2017-12-25 Thread Yunlong Song

In this case, f2fs_gc will skip all the victims and return with no dead
loop. The atomic file will use SSR to OPU, it's OK.

On 2017/12/25 17:45, Chao Yu wrote:

On 2017/12/25 14:15, Yunlong Song wrote:

What if the application starts atomic write but forgets to commit, e.g.
bugs in application or the application
is a malicious software itself?

I agree we should consider robustness of f2fs in security aspect, but
please consider more scenario of these sqlite customized interface usage,
it looks just skipping gc is not enough, for example, if there is one large
size db in our partition, with random write, its data spreads in each
segment, once this db has been atomic opened, foreground gc may loop for ever.

How about checking opened time of atomic or volatile file in
f2fs_balance_fs, if it exceeds threshold, we can restore the file to normal
one to avoid potential security issue.

Thanks,


On 2017/12/25 11:44, Chao Yu wrote:

On 2017/12/23 21:09, Yunlong Song wrote:

For some corner case, f2fs_gc selects one target victim but cannot free
that victim segment due to some reason (e.g. the segment has some blocks
of atomic file which is not commited yet), in this case, the victim

File should not be atomic opened for long time since normally sqlite
transaction will finish quickly, so we can expect that gc loop could be
ended up soon, right?

Thanks,


segment may probably be selected over and over, and then f2fs_gc will
go to dead loop. This patch identifies the dead-loop segment, and skips
it in __get_victim next time.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
   fs/f2fs/f2fs.h  |  8 
   fs/f2fs/gc.c| 34 ++
   fs/f2fs/super.c |  3 +++
   3 files changed, 45 insertions(+)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index ca6b0c9..b75851b 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -115,6 +115,13 @@ struct f2fs_mount_info {
unsigned intopt;
   };
   
+struct gc_loop_info {

+   int count;
+   unsigned int segno;
+   unsigned long *segmap;
+};
+#define GC_LOOP_MAX 10
+
   #define F2FS_FEATURE_ENCRYPT 0x0001
   #define F2FS_FEATURE_BLKZONED0x0002
   #define F2FS_FEATURE_ATOMIC_WRITE0x0004
@@ -1125,6 +1132,7 @@ struct f2fs_sb_info {
   
   	/* threshold for converting bg victims for fg */

u64 fggc_threshold;
+   struct gc_loop_info gc_loop;
   
   	/* maximum # of trials to find a victim segment for SSR and GC */

unsigned int max_victim_search;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 5d5bba4..4ee9e1b 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -229,6 +229,10 @@ static unsigned int check_bg_victims(struct f2fs_sb_info 
*sbi)
if (no_fggc_candidate(sbi, secno))
continue;
   
+		if (sbi->gc_loop.segmap &&

+   test_bit(GET_SEG_FROM_SEC(sbi, secno), 
sbi->gc_loop.segmap))
+   continue;
+
clear_bit(secno, dirty_i->victim_secmap);
return GET_SEG_FROM_SEC(sbi, secno);
}
@@ -371,6 +375,9 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
if (gc_type == FG_GC && p.alloc_mode == LFS &&
no_fggc_candidate(sbi, secno))
goto next;
+   if (gc_type == FG_GC && p.alloc_mode == LFS &&
+   sbi->gc_loop.segmap && test_bit(segno, 
sbi->gc_loop.segmap))
+   goto next;
   
  		cost = get_gc_cost(sbi, segno, &p);
   
@@ -1042,6 +1049,27 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,

seg_freed = do_garbage_collect(sbi, segno, &gc_list, gc_type);
if (gc_type == FG_GC && seg_freed == sbi->segs_per_sec)
sec_freed++;
+   else if (gc_type == FG_GC && seg_freed == 0) {
+   if (!sbi->gc_loop.segmap) {
+   sbi->gc_loop.segmap =
+   kvzalloc(f2fs_bitmap_size(MAIN_SEGS(sbi)), 
GFP_KERNEL);
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
+   }
+   if (segno == sbi->gc_loop.segno) {
+   if (sbi->gc_loop.count > GC_LOOP_MAX) {
+   f2fs_bug_on(sbi, 1);
+   set_bit(segno, sbi->gc_loop.segmap);
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
+   }
+   else
+   sbi->gc_loop.count++;
+   } else {
+   sbi->gc_loop.segno = segno;
+   sbi->gc_loop.count = 0;
+   }
+   }
total_freed += seg_freed;
   
   	if (gc_type == FG_GC)

@@ -1075,6 +1103,12 @@ int

[f2fs-dev] [PATCH] f2fs: avoid f2fs_gc dead loop

2017-12-23 Thread Yunlong Song

In some corner cases, f2fs_gc selects a victim but cannot free that victim
segment for some reason (e.g. the segment has blocks of an atomic file
which is not committed yet). In this case, the victim segment may be
selected over and over, and f2fs_gc will fall into a dead loop. This patch
identifies the dead-loop segment and skips it in __get_victim next time.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/f2fs.h  |  8 
 fs/f2fs/gc.c| 34 ++
 fs/f2fs/super.c |  3 +++
 3 files changed, 45 insertions(+)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index ca6b0c9..b75851b 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -115,6 +115,13 @@ struct f2fs_mount_info {
unsigned intopt;
 };
 
+struct gc_loop_info {
+   int count;
+   unsigned int segno;
+   unsigned long *segmap;
+};
+#define GC_LOOP_MAX 10
+
 #define F2FS_FEATURE_ENCRYPT   0x0001
 #define F2FS_FEATURE_BLKZONED  0x0002
 #define F2FS_FEATURE_ATOMIC_WRITE  0x0004
@@ -1125,6 +1132,7 @@ struct f2fs_sb_info {
 
/* threshold for converting bg victims for fg */
u64 fggc_threshold;
+   struct gc_loop_info gc_loop;
 
/* maximum # of trials to find a victim segment for SSR and GC */
unsigned int max_victim_search;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 5d5bba4..4ee9e1b 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -229,6 +229,10 @@ static unsigned int check_bg_victims(struct f2fs_sb_info 
*sbi)
if (no_fggc_candidate(sbi, secno))
continue;
 
+   if (sbi->gc_loop.segmap &&
+   test_bit(GET_SEG_FROM_SEC(sbi, secno), 
sbi->gc_loop.segmap))
+   continue;
+
clear_bit(secno, dirty_i->victim_secmap);
return GET_SEG_FROM_SEC(sbi, secno);
}
@@ -371,6 +375,9 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
if (gc_type == FG_GC && p.alloc_mode == LFS &&
no_fggc_candidate(sbi, secno))
goto next;
+   if (gc_type == FG_GC && p.alloc_mode == LFS &&
+   sbi->gc_loop.segmap && test_bit(segno, 
sbi->gc_loop.segmap))
+   goto next;
 
cost = get_gc_cost(sbi, segno, &p);
 
@@ -1042,6 +1049,27 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
seg_freed = do_garbage_collect(sbi, segno, &gc_list, gc_type);
if (gc_type == FG_GC && seg_freed == sbi->segs_per_sec)
sec_freed++;
+   else if (gc_type == FG_GC && seg_freed == 0) {
+   if (!sbi->gc_loop.segmap) {
+   sbi->gc_loop.segmap =
+   kvzalloc(f2fs_bitmap_size(MAIN_SEGS(sbi)), 
GFP_KERNEL);
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
+   }
+   if (segno == sbi->gc_loop.segno) {
+   if (sbi->gc_loop.count > GC_LOOP_MAX) {
+   f2fs_bug_on(sbi, 1);
+   set_bit(segno, sbi->gc_loop.segmap);
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
+   }
+   else
+   sbi->gc_loop.count++;
+   } else {
+   sbi->gc_loop.segno = segno;
+   sbi->gc_loop.count = 0;
+   }
+   }
total_freed += seg_freed;
 
if (gc_type == FG_GC)
@@ -1075,6 +1103,12 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
 
if (sync)
ret = sec_freed ? 0 : -EAGAIN;
+   if (sbi->gc_loop.segmap) {
+   kvfree(sbi->gc_loop.segmap);
+   sbi->gc_loop.segmap = NULL;
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
+   }
return ret;
 }
 
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 031cb26..76f0b72 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -2562,6 +2562,9 @@ static int f2fs_fill_super(struct super_block *sb, void 
*data, int silent)
sbi->last_valid_block_count = sbi->total_valid_block_count;
sbi->reserved_blocks = 0;
sbi->current_reserved_blocks = 0;
+   sbi->gc_loop.segmap = NULL;
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
 
for (i = 0; i < NR_INODE_TYPE; i++) {
INIT_LIST_HEAD(&sbi->inode_list[i]);
-- 
1.8.5.2
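
The bookkeeping in the f2fs_gc hunk can be followed in isolation. The sketch
below is a user-space model, not the kernel code: struct gc_loop_sketch, the
fixed MAX_SEGS and the bool array replacing the kvzalloc'ed bitmap are all
assumptions made for illustration, and the f2fs_bug_on(sbi, 1) call from the
patch is left out.

```c
#include <assert.h>
#include <stdbool.h>

#define GC_LOOP_MAX	10
#define NULL_SEGNO	((unsigned int)~0)
#define MAX_SEGS	64	/* assumed small volume, for illustration */

/* Hypothetical user-space stand-in for the patch's struct gc_loop_info. */
struct gc_loop_sketch {
	int count;		/* repeated failures on the same segno */
	unsigned int segno;	/* last victim that freed nothing */
	bool skip[MAX_SEGS];	/* plays the role of segmap */
};

static void gc_loop_init(struct gc_loop_sketch *gl)
{
	gl->count = 0;
	gl->segno = NULL_SEGNO;
	for (unsigned int i = 0; i < MAX_SEGS; i++)
		gl->skip[i] = false;
}

/* Record that foreground GC picked segno and freed nothing. */
static void gc_loop_note_failure(struct gc_loop_sketch *gl, unsigned int segno)
{
	assert(segno < MAX_SEGS);
	if (segno != gl->segno) {
		/* A different victim: start counting from scratch. */
		gl->segno = segno;
		gl->count = 0;
		return;
	}
	if (gl->count > GC_LOOP_MAX) {
		/* Same victim failed too often: exclude it from selection. */
		gl->skip[segno] = true;
		gl->count = 0;
		gl->segno = NULL_SEGNO;
	} else {
		gl->count++;
	}
}

/* Victim selection would consult this before choosing segno. */
static bool gc_loop_should_skip(const struct gc_loop_sketch *gl,
				unsigned int segno)
{
	return segno < MAX_SEGS && gl->skip[segno];
}
```

With the comparison written as count > GC_LOOP_MAX, as in the patch, the
same victim has to fail 13 times in a row in this model before it is marked.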



Re: [f2fs-dev] [PATCH v4] fsck.f2fs: check and fix i_namelen to avoid double free

2017-12-22 Thread Yunlong Song


And there is en[namelen] = '\0', so namelen should be fixed to its right value.

On 2017/12/23 11:35, Chao Yu wrote:

On 2017/12/23 11:19, Yunlong Song wrote:

Double free problem:
Since a DDR bit flip makes i_namelen a larger value (> 255), when the file
is not encrypted, convert_encrypted_name will memcpy beyond the range of
en[255]. When en is freed, there will be a double free problem.

It looks there is only memcpy overflow problem here.

Thanks,


On 2017/12/23 11:05, Chao Yu wrote:

On 2017/12/18 21:25, Yunlong Song wrote:

v1 -> v2: use child_info to pass dentry namelen
v2 -> v3: check child != NULL to include the F2FS_FT_ORPHAN file type
v3 -> v4: fix the i_namelen problem of dump.f2fs

There is no commit log, so what do you mean about "avoid double free"?

Other than that, looks good to me.

Reviewed-by: Chao Yu <yuch...@huawei.com>

Thanks,


.



.



--
Thanks,
Yunlong Song




Re: [f2fs-dev] [PATCH v4] fsck.f2fs: check and fix i_namelen to avoid double free

2017-12-22 Thread Yunlong Song


Double free problem:
Since a DDR bit flip makes i_namelen a larger value (> 255), when the file
is not encrypted, convert_encrypted_name will memcpy beyond the range of
en[255]. When en is freed, there will be a double free problem.
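
The overflow described above comes from trusting the stored length when
copying into a buffer of F2FS_NAME_LEN + 1 bytes. A minimal sketch of the
clamping idea follows; copy_name_clamped() is a hypothetical helper written
for this note, not a function from f2fs-tools, and it only shows the fallback
to F2FS_NAME_LEN, not the dentry-based repair.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define F2FS_NAME_LEN 255

/*
 * Hypothetical helper: copy an on-disk name into buf, which must hold
 * F2FS_NAME_LEN + 1 bytes, without trusting the stored length.
 * name must provide at least F2FS_NAME_LEN readable bytes.
 * Returns the length actually used.
 */
static size_t copy_name_clamped(char *buf, const unsigned char *name,
				size_t stored_len)
{
	size_t len = stored_len;

	assert(buf != NULL && name != NULL);
	if (len > F2FS_NAME_LEN)
		len = F2FS_NAME_LEN;	/* corrupted i_namelen: clamp */
	memcpy(buf, name, len);
	buf[len] = '\0';		/* always inside the buffer now */
	return len;
}
```

A stored length with one high bit flipped, e.g. 3 | (1 << 16), is reduced to
255 here, so both the memcpy and the terminating write stay inside buf.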

On 2017/12/23 11:05, Chao Yu wrote:

On 2017/12/18 21:25, Yunlong Song wrote:

v1 -> v2: use child_info to pass dentry namelen
v2 -> v3: check child != NULL to include the F2FS_FT_ORPHAN file type
v3 -> v4: fix the i_namelen problem of dump.f2fs

There is no commit log, so what do you mean about "avoid double free"?

Other than that, looks good to me.

Reviewed-by: Chao Yu <yuch...@huawei.com>

Thanks,


.



--
Thanks,
Yunlong Song




[f2fs-dev] [PATCH v3] fsck.f2fs: check and fix i_namelen to avoid double free

2017-12-18 Thread Yunlong Song

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fsck/fsck.c | 26 +-
 fsck/fsck.h |  3 ++-
 2 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/fsck/fsck.c b/fsck/fsck.c
index 2212aa3..b52b6e4 100644
--- a/fsck/fsck.c
+++ b/fsck/fsck.c
@@ -539,7 +539,7 @@ int fsck_chk_node_blk(struct f2fs_sb_info *sbi, struct 
f2fs_inode *inode,
 
if (sanity_check_inode(sbi, node_blk))
goto err;
-   fsck_chk_inode_blk(sbi, nid, ftype, node_blk, blk_cnt, &ni);
+   fsck_chk_inode_blk(sbi, nid, ftype, node_blk, blk_cnt, &ni, child);
quota_add_inode_usage(fsck->qctx, nid, &node_blk->i);
} else {
switch (ntype) {
@@ -633,7 +633,7 @@ unmatched:
 /* start with valid nid and blkaddr */
 void fsck_chk_inode_blk(struct f2fs_sb_info *sbi, u32 nid,
enum FILE_TYPE ftype, struct f2fs_node *node_blk,
-   u32 *blk_cnt, struct node_info *ni)
+   u32 *blk_cnt, struct node_info *ni, struct child_info *child_d)
 {
struct f2fs_fsck *fsck = F2FS_FSCK(sbi);
struct child_info child;
@@ -850,8 +850,23 @@ skip_blkcnt_fix:
en = malloc(F2FS_NAME_LEN + 1);
ASSERT(en);
 
-   namelen = convert_encrypted_name(node_blk->i.i_name,
-   le32_to_cpu(node_blk->i.i_namelen),
+   namelen = le32_to_cpu(node_blk->i.i_namelen);
+   if (namelen > F2FS_NAME_LEN) {
+   if (child_d && child_d->i_namelen <= F2FS_NAME_LEN) {
+   ASSERT_MSG("ino: 0x%x has i_namelen: 0x%x, "
+   "but has %d characters for name",
+   nid, namelen, child_d->i_namelen);
+   if (c.fix_on) {
+   FIX_MSG("[0x%x] i_namelen=0x%x -> 0x%x", nid, 
namelen,
+   child_d->i_namelen);
+   node_blk->i.i_namelen = 
cpu_to_le32(child_d->i_namelen);
+   need_fix = 1;
+   }
+   namelen = child_d->i_namelen;
+   } else
+   namelen = F2FS_NAME_LEN;
+   }
+   namelen = convert_encrypted_name(node_blk->i.i_name, namelen,
en, file_enc_name(&node_blk->i));
en[namelen] = '\0';
if (ftype == F2FS_FT_ORPHAN)
@@ -1414,9 +1429,10 @@ static int __chk_dentries(struct f2fs_sb_info *sbi, 
struct child_info *child,
dentry, max, i, last_blk, enc_name);
 
blk_cnt = 1;
+   child->i_namelen = name_len;
ret = fsck_chk_node_blk(sbi,
NULL, le32_to_cpu(dentry[i].ino),
-   ftype, TYPE_INODE, &blk_cnt, NULL);
+   ftype, TYPE_INODE, &blk_cnt, child);
 
if (ret && c.fix_on) {
int j;
diff --git a/fsck/fsck.h b/fsck/fsck.h
index 0343fbd..d635c5a 100644
--- a/fsck/fsck.h
+++ b/fsck/fsck.h
@@ -54,6 +54,7 @@ struct child_info {
u32 pp_ino; /*parent parent ino*/
struct extent_info ei;
u32 last_blk;
+   u32 i_namelen;  /* dentry namelen */
 };
 
 struct f2fs_fsck {
@@ -128,7 +129,7 @@ extern int fsck_chk_node_blk(struct f2fs_sb_info *, struct 
f2fs_inode *, u32,
enum FILE_TYPE, enum NODE_TYPE, u32 *,
struct child_info *);
 extern void fsck_chk_inode_blk(struct f2fs_sb_info *, u32, enum FILE_TYPE,
-   struct f2fs_node *, u32 *, struct node_info *);
+   struct f2fs_node *, u32 *, struct node_info *, struct 
child_info *);
 extern int fsck_chk_dnode_blk(struct f2fs_sb_info *, struct f2fs_inode *,
u32, enum FILE_TYPE, struct f2fs_node *, u32 *,
struct child_info *, struct node_info *);
-- 
1.8.5.2



[f2fs-dev] [PATCH v2] fsck.f2fs: check nid range before use to avoid segmentation fault

2017-12-18 Thread Yunlong Song

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fsck/fsck.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fsck/fsck.c b/fsck/fsck.c
index 11b8b0b..faf0663 100644
--- a/fsck/fsck.c
+++ b/fsck/fsck.c
@@ -740,7 +740,7 @@ void fsck_chk_inode_blk(struct f2fs_sb_info *sbi, u32 nid,
for (idx = 0; idx < 5; idx++) {
u32 nid = le32_to_cpu(node_blk->i.i_nid[idx]);
 
-   if (nid != 0) {
+   if (nid != 0 && IS_VALID_NID(sbi, nid)) {
struct node_info ni;
 
get_node_info(sbi, nid, &ni);
-- 
1.8.5.2



[f2fs-dev] [PATCH] fsck.f2fs: check nid range before use to avoid segmentation fault

2017-12-14 Thread Yunlong Song

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fsck/fsck.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/fsck/fsck.c b/fsck/fsck.c
index 11b8b0b..2212aa3 100644
--- a/fsck/fsck.c
+++ b/fsck/fsck.c
@@ -14,6 +14,15 @@
 char *tree_mark;
 uint32_t tree_mark_size = 256;
 
+static inline int check_nid_range(struct f2fs_sb_info *sbi, nid_t nid)
+{
+if (nid < F2FS_ROOT_INO(sbi))
+return -EINVAL;
+if (nid >= NM_I(sbi)->max_nid)
+return -EINVAL;
+return 0;
+}
+
 int f2fs_set_main_bitmap(struct f2fs_sb_info *sbi, u32 blk, int type)
 {
struct f2fs_fsck *fsck = F2FS_FSCK(sbi);
@@ -740,7 +749,7 @@ void fsck_chk_inode_blk(struct f2fs_sb_info *sbi, u32 nid,
for (idx = 0; idx < 5; idx++) {
u32 nid = le32_to_cpu(node_blk->i.i_nid[idx]);
 
-   if (nid != 0) {
+   if (nid != 0 && !check_nid_range(sbi, nid)) {
struct node_info ni;
 
get_node_info(sbi, nid, &ni);
-- 
1.8.5.2
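
The range test in check_nid_range() can be exercised on its own. The sketch
below keeps the two bounds from the patch (nid below the root inode number,
nid at or above max_nid) but returns a bool; struct nid_limits and
nid_in_range() are hypothetical names, and the limits used when trying it out
are example values, not taken from a real image.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced view of the two fields the check needs. */
struct nid_limits {
	uint32_t root_ino;	/* lower bound used by the patch (F2FS_ROOT_INO) */
	uint32_t max_nid;	/* first nid that is out of range */
};

/* True when nid may safely be passed on to a node-info lookup. */
static bool nid_in_range(const struct nid_limits *lim, uint32_t nid)
{
	assert(lim->root_ino < lim->max_nid);
	return nid >= lim->root_ino && nid < lim->max_nid;
}
```

A corrupted i_nid[] entry such as 0xFFFFFFFF fails the upper bound and is
skipped instead of being used as an index.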



[f2fs-dev] [PATCH] fsck.f2fs: check and fix i_namelen to avoid double free

2017-12-14 Thread Yunlong Song

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fsck/fsck.c | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/fsck/fsck.c b/fsck/fsck.c
index 2212aa3..8ff4e4b 100644
--- a/fsck/fsck.c
+++ b/fsck/fsck.c
@@ -643,7 +643,7 @@ void fsck_chk_inode_blk(struct f2fs_sb_info *sbi, u32 nid,
u64 i_blocks = le64_to_cpu(node_blk->i.i_blocks);
int ofs = get_extra_isize(node_blk);
unsigned char *en;
-   int namelen;
+   int namelen, i_namelen;
unsigned int idx = 0;
int need_fix = 0;
int ret;
@@ -850,8 +850,21 @@ skip_blkcnt_fix:
en = malloc(F2FS_NAME_LEN + 1);
ASSERT(en);
 
-   namelen = convert_encrypted_name(node_blk->i.i_name,
-   le32_to_cpu(node_blk->i.i_namelen),
+   i_namelen = le32_to_cpu(node_blk->i.i_namelen);
+   namelen = strlen((const char *)node_blk->i.i_name);
+   if (i_namelen > F2FS_NAME_LEN) {
+   ASSERT_MSG("ino: 0x%x has i_namelen: 0x%x, "
+   "but has %d characters for name",
+   nid, i_namelen, namelen);
+   if (c.fix_on) {
+   FIX_MSG("[0x%x] i_namelen=0x%x -> 0x%x", nid, i_namelen,
+   namelen);
+   node_blk->i.i_namelen = cpu_to_le32(namelen);
+   need_fix = 1;
+   }
+   i_namelen = namelen;
+   }
+   namelen = convert_encrypted_name(node_blk->i.i_name, i_namelen,
en, file_enc_name(&node_blk->i));
en[namelen] = '\0';
if (ftype == F2FS_FT_ORPHAN)
-- 
1.8.5.2



Re: [f2fs-dev] [PATCH v4 RESEND] f2fs: fix out-of-free problem caused by atomic write

2017-11-29 Thread Yunlong Song


ping...

On 2017/11/17 8:54, Yunlong Song wrote:

f2fs_balance_fs is only invoked once in commit_inmem_pages, but there is
more than one page to commit, so all the other pages miss the check. This
leads to an out-of-free problem when committing a very large file. However,
we cannot call f2fs_balance_fs for each inmem page, since that would break
atomicity. As a result, we should do f2fs_balance_fs for all the inmem
pages together.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
  fs/f2fs/debug.c   |  5 +++--
  fs/f2fs/f2fs.h| 26 --
  fs/f2fs/segment.c | 30 --
  fs/f2fs/segment.h |  4 +++-
  4 files changed, 54 insertions(+), 11 deletions(-)

diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
index f7eec50..41c47c4 100644
--- a/fs/f2fs/debug.c
+++ b/fs/f2fs/debug.c
@@ -50,6 +50,7 @@ static void update_general_status(struct f2fs_sb_info *sbi)
si->ndirty_files = sbi->ndirty_inode[FILE_INODE];
si->ndirty_all = sbi->ndirty_inode[DIRTY_META];
si->inmem_pages = get_pages(sbi, F2FS_INMEM_PAGES);
+   si->inmem_commit_pages = get_pages(sbi, F2FS_INMEM_COMMIT_PAGES);
si->aw_cnt = atomic_read(&sbi->aw_cnt);
si->vw_cnt = atomic_read(&sbi->vw_cnt);
si->max_aw_cnt = atomic_read(&sbi->max_aw_cnt);
@@ -360,9 +361,9 @@ static int stat_show(struct seq_file *s, void *v)
   si->nr_discarding, si->nr_discarded,
   si->nr_discard_cmd, si->undiscard_blks);
seq_printf(s, "  - inmem: %4d, atomic IO: %4d (Max. %4d), "
-   "volatile IO: %4d (Max. %4d)\n",
+   "volatile IO: %4d (Max. %4d), commit: %4d\n",
   si->inmem_pages, si->aw_cnt, si->max_aw_cnt,
-  si->vw_cnt, si->max_vw_cnt);
+  si->vw_cnt, si->max_vw_cnt, si->inmem_commit_pages);
seq_printf(s, "  - nodes: %4d in %4d\n",
   si->ndirty_node, si->node_pages);
seq_printf(s, "  - dents: %4d in dirs:%4d (%4d)\n",
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 13a96b8..749bdb6 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -610,6 +610,7 @@ struct f2fs_inode_info {
struct list_head inmem_pages;   /* inmemory pages managed by f2fs */
struct task_struct *inmem_task; /* store inmemory task */
struct mutex inmem_lock;/* lock for inmemory pages */
+   unsigned long inmem_blocks; /* inmemory blocks */
struct extent_tree *extent_tree;/* cached extent_tree entry */
struct rw_semaphore dio_rwsem[2];/* avoid racing between dio and gc */
struct rw_semaphore i_mmap_sem;
@@ -863,6 +864,7 @@ enum count_type {
F2FS_DIRTY_NODES,
F2FS_DIRTY_META,
F2FS_INMEM_PAGES,
+   F2FS_INMEM_COMMIT_PAGES,
F2FS_DIRTY_IMETA,
F2FS_WB_CP_DATA,
F2FS_WB_DATA,
@@ -1600,7 +1602,21 @@ static inline void inc_page_count(struct f2fs_sb_info 
*sbi, int count_type)
atomic_inc(&sbi->nr_pages[count_type]);
  
  	if (count_type == F2FS_DIRTY_DATA || count_type == F2FS_INMEM_PAGES ||

-   count_type == F2FS_WB_CP_DATA || count_type == F2FS_WB_DATA)
+   count_type == F2FS_WB_CP_DATA || count_type == F2FS_WB_DATA ||
+   count_type == F2FS_INMEM_COMMIT_PAGES)
+   return;
+
+   set_sbi_flag(sbi, SBI_IS_DIRTY);
+}
+
+static inline void inc_pages_count(struct f2fs_sb_info *sbi, int count_type,
+   int pages)
+{
+   atomic_add(pages, &sbi->nr_pages[count_type]);
+
+   if (count_type == F2FS_DIRTY_DATA || count_type == F2FS_INMEM_PAGES ||
+   count_type == F2FS_WB_CP_DATA || count_type == F2FS_WB_DATA ||
+   count_type == F2FS_INMEM_COMMIT_PAGES)
return;
  
  	set_sbi_flag(sbi, SBI_IS_DIRTY);

@@ -1618,6 +1634,12 @@ static inline void dec_page_count(struct f2fs_sb_info 
*sbi, int count_type)
atomic_dec(&sbi->nr_pages[count_type]);
  }
  
+static inline void dec_pages_count(struct f2fs_sb_info *sbi, int count_type,

+   int pages)
+{
+   atomic_sub(pages, &sbi->nr_pages[count_type]);
+}
+
  static inline void inode_dec_dirty_pages(struct inode *inode)
  {
if (!S_ISDIR(inode->i_mode) && !S_ISREG(inode->i_mode) &&
@@ -2716,7 +2738,7 @@ struct f2fs_stat_info {
unsigned long long hit_total, total_ext;
int ext_tree, zombie_tree, ext_node;
int ndirty_node, ndirty_dent, ndirty_meta, ndirty_data, ndirty_imeta;
-   int inmem_pages;
+   int inmem_pages, inmem_commit_pages;
unsigned int ndirty_dirs, ndirty_files, ndirty_all;
int nats, dirty_nats, sits, dirty_sits;
int free_nids, avail_nids

Re: [f2fs-dev] [PATCH] f2fs: avoid false positive of free secs check

2017-11-29 Thread Yunlong Song


SSR can write hot/warm/cold nodes together, so why should we account for
them differently?

On 2017/11/29 19:56, Chao Yu wrote:

On 2017/11/27 14:54, Yunlong Song wrote:

Sometimes f2fs_gc is called with no target victim (e.g. xfstest
generic/027, ndirty_node:545 ndiry_dent:1 ndirty_imeta:513 rsvd_segs:21
free_segs:27, has_not_enough_free_secs will return true). This patch
first merges pages and then converts into sections.

I don't think this could be right, IMO, instead, it would be better to
account dirty hot/warm/cold nodes or imeta separately, as actually, they
will use different section, but currently, our calculation way is based
on that they could be written to same section.

Thanks,


Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
  fs/f2fs/f2fs.h|  9 -
  fs/f2fs/segment.c | 12 +++-
  fs/f2fs/segment.h | 13 +
  3 files changed, 16 insertions(+), 18 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index ca6b0c9..e89cff7 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1675,15 +1675,6 @@ static inline int get_dirty_pages(struct inode *inode)
return atomic_read(&F2FS_I(inode)->dirty_pages);
  }
  
-static inline int get_blocktype_secs(struct f2fs_sb_info *sbi, int block_type)

-{
-   unsigned int pages_per_sec = sbi->segs_per_sec * sbi->blocks_per_seg;
-   unsigned int segs = (get_pages(sbi, block_type) + pages_per_sec - 1) >>
-   sbi->log_blocks_per_seg;
-
-   return segs / sbi->segs_per_sec;
-}
-
  static inline block_t valid_user_blocks(struct f2fs_sb_info *sbi)
  {
return sbi->total_valid_block_count;
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index c117e09..603f805 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -171,17 +171,19 @@ static unsigned long __find_rev_next_zero_bit(const 
unsigned long *addr,
  
  bool need_SSR(struct f2fs_sb_info *sbi)

  {
-   int node_secs = get_blocktype_secs(sbi, F2FS_DIRTY_NODES);
-   int dent_secs = get_blocktype_secs(sbi, F2FS_DIRTY_DENTS);
-   int imeta_secs = get_blocktype_secs(sbi, F2FS_DIRTY_IMETA);
+   s64 node_pages = get_pages(sbi, F2FS_DIRTY_NODES);
+   s64 dent_pages = get_pages(sbi, F2FS_DIRTY_DENTS);
+   s64 imeta_pages = get_pages(sbi, F2FS_DIRTY_IMETA);
  
  	if (test_opt(sbi, LFS))

return false;
if (sbi->gc_thread && sbi->gc_thread->gc_urgent)
return true;
  
-	return free_sections(sbi) <= (node_secs + 2 * dent_secs + imeta_secs +

-   SM_I(sbi)->min_ssr_sections + reserved_sections(sbi));
+   return free_sections(sbi) <=
+   (PAGE2SEC(sbi, node_pages + imeta_pages) +
+   PAGE2SEC(sbi, 2 * dent_pages) +
+   SM_I(sbi)->min_ssr_sections + reserved_sections(sbi));
  }
  
  void register_inmem_page(struct inode *inode, struct page *page)

diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index d1d394c..723d79e 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -115,6 +115,10 @@
  #define SECTOR_TO_BLOCK(sectors)  \
((sectors) >> F2FS_LOG_SECTORS_PER_BLOCK)
  
+#define PAGE2SEC(sbi, pages)\
+   ((((pages) + BLKS_PER_SEC(sbi) - 1) \
+   >> sbi->log_blocks_per_seg) / sbi->segs_per_sec)
+
  /*
   * indicate a block allocation direction: RIGHT and LEFT.
   * RIGHT means allocating new sections towards the end of volume.
@@ -527,9 +531,9 @@ static inline bool has_curseg_enough_space(struct 
f2fs_sb_info *sbi)
  static inline bool has_not_enough_free_secs(struct f2fs_sb_info *sbi,
int freed, int needed)
  {
-   int node_secs = get_blocktype_secs(sbi, F2FS_DIRTY_NODES);
-   int dent_secs = get_blocktype_secs(sbi, F2FS_DIRTY_DENTS);
-   int imeta_secs = get_blocktype_secs(sbi, F2FS_DIRTY_IMETA);
+   s64 node_pages = get_pages(sbi, F2FS_DIRTY_NODES);
+   s64 dent_pages = get_pages(sbi, F2FS_DIRTY_DENTS);
+   s64 imeta_pages = get_pages(sbi, F2FS_DIRTY_IMETA);
  
  	if (unlikely(is_sbi_flag_set(sbi, SBI_POR_DOING)))
return false;
@@ -538,7 +542,8 @@ static inline bool has_not_enough_free_secs(struct f2fs_sb_info *sbi,
has_curseg_enough_space(sbi))
return false;
return (free_sections(sbi) + freed) <=
-   (node_secs + 2 * dent_secs + imeta_secs +
+   (PAGE2SEC(sbi, node_pages + imeta_pages) +
+   PAGE2SEC(sbi, 2 * dent_pages) +
    reserved_sections(sbi) + needed);
  }
  





--
Thanks,
Yunlong Song



--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Linux-f2fs-devel m

[f2fs-dev] [PATCH] f2fs: avoid false positive of free secs check

2017-11-26 Thread Yunlong Song

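Sometimes f2fs_gc is called with no target victim (e.g. in xfstests
generic/027, with ndirty_node:545 ndirty_dent:1 ndirty_imeta:513 rsvd_segs:21
free_segs:27, has_not_enough_free_secs returns true). This patch merges the
node and imeta page counts before converting them into sections, and converts
the doubled dent page count separately, so that rounding each type up on its
own no longer overestimates the sections needed.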

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/f2fs.h|  9 -
 fs/f2fs/segment.c | 12 +++-
 fs/f2fs/segment.h | 13 +
 3 files changed, 16 insertions(+), 18 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index ca6b0c9..e89cff7 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1675,15 +1675,6 @@ static inline int get_dirty_pages(struct inode *inode)
return atomic_read(_I(inode)->dirty_pages);
 }
 
-static inline int get_blocktype_secs(struct f2fs_sb_info *sbi, int block_type)
-{
-   unsigned int pages_per_sec = sbi->segs_per_sec * sbi->blocks_per_seg;
-   unsigned int segs = (get_pages(sbi, block_type) + pages_per_sec - 1) >>
-   sbi->log_blocks_per_seg;
-
-   return segs / sbi->segs_per_sec;
-}
-
 static inline block_t valid_user_blocks(struct f2fs_sb_info *sbi)
 {
return sbi->total_valid_block_count;
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index c117e09..603f805 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -171,17 +171,19 @@ static unsigned long __find_rev_next_zero_bit(const unsigned long *addr,
 
 bool need_SSR(struct f2fs_sb_info *sbi)
 {
-   int node_secs = get_blocktype_secs(sbi, F2FS_DIRTY_NODES);
-   int dent_secs = get_blocktype_secs(sbi, F2FS_DIRTY_DENTS);
-   int imeta_secs = get_blocktype_secs(sbi, F2FS_DIRTY_IMETA);
+   s64 node_pages = get_pages(sbi, F2FS_DIRTY_NODES);
+   s64 dent_pages = get_pages(sbi, F2FS_DIRTY_DENTS);
+   s64 imeta_pages = get_pages(sbi, F2FS_DIRTY_IMETA);
 
if (test_opt(sbi, LFS))
return false;
if (sbi->gc_thread && sbi->gc_thread->gc_urgent)
return true;
 
-   return free_sections(sbi) <= (node_secs + 2 * dent_secs + imeta_secs +
-   SM_I(sbi)->min_ssr_sections + reserved_sections(sbi));
+   return free_sections(sbi) <=
+   (PAGE2SEC(sbi, node_pages + imeta_pages) +
+   PAGE2SEC(sbi, 2 * dent_pages) +
+   SM_I(sbi)->min_ssr_sections + reserved_sections(sbi));
 }
 
 void register_inmem_page(struct inode *inode, struct page *page)
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index d1d394c..723d79e 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -115,6 +115,10 @@
 #define SECTOR_TO_BLOCK(sectors)   \
((sectors) >> F2FS_LOG_SECTORS_PER_BLOCK)
 
+#define PAGE2SEC(sbi, pages)   \
+   ((((pages) + BLKS_PER_SEC(sbi) - 1) \
+   >> sbi->log_blocks_per_seg) / sbi->segs_per_sec)
+
 /*
  * indicate a block allocation direction: RIGHT and LEFT.
  * RIGHT means allocating new sections towards the end of volume.
@@ -527,9 +531,9 @@ static inline bool has_curseg_enough_space(struct f2fs_sb_info *sbi)
 static inline bool has_not_enough_free_secs(struct f2fs_sb_info *sbi,
int freed, int needed)
 {
-   int node_secs = get_blocktype_secs(sbi, F2FS_DIRTY_NODES);
-   int dent_secs = get_blocktype_secs(sbi, F2FS_DIRTY_DENTS);
-   int imeta_secs = get_blocktype_secs(sbi, F2FS_DIRTY_IMETA);
+   s64 node_pages = get_pages(sbi, F2FS_DIRTY_NODES);
+   s64 dent_pages = get_pages(sbi, F2FS_DIRTY_DENTS);
+   s64 imeta_pages = get_pages(sbi, F2FS_DIRTY_IMETA);
 
if (unlikely(is_sbi_flag_set(sbi, SBI_POR_DOING)))
return false;
@@ -538,7 +542,8 @@ static inline bool has_not_enough_free_secs(struct f2fs_sb_info *sbi,
has_curseg_enough_space(sbi))
return false;
return (free_sections(sbi) + freed) <=
-   (node_secs + 2 * dent_secs + imeta_secs +
+   (PAGE2SEC(sbi, node_pages + imeta_pages) +
+   PAGE2SEC(sbi, 2 * dent_pages) +
reserved_sections(sbi) + needed);
 }
 
-- 
1.8.5.2



Re: [f2fs-dev] [PATCH v3 RESEND] f2fs: add bug_on when f2fs_gc even fails to get one victim

2017-11-25 Thread Yunlong Song


Ok, I have found a panic with this bug_on for generic/027 today:

[ 5157.753224] F2FS-fs (loop2): Mounted with checkpoint version = 2e2
generic/027[ 5168.741251] run fstests generic/027 at 2017-11-25 04:46:40
[ 5189.445989] F2FS-fs (loop3): Found nat_bits in checkpoint
[ 5189.510872] F2FS-fs (loop3): Mounted with checkpoint version = 165da00b
[ 5250.613849] [ cut here ]
[ 5250.616840] kernel BUG at /opt/s00293685/src/kernel/jaegeuk/f2fs/fs/f2fs/gc.c:1038!
[ 5250.628467] invalid opcode:  [#1] SMP
[ 5250.628467] Modules linked in:
[ 5250.628467] CPU: 7 PID: 3173 Comm: xfs_io Not tainted 4.14.0-rc4+ #128
[ 5250.628467] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[ 5250.628467] task: 880130f2be80 task.stack: c9000acd
[ 5250.628467] RIP: 0010:f2fs_gc+0x9da/0xa80
[ 5250.628467] RSP: 0018:c9000acd3b48 EFLAGS: 0246
[ 5250.628467] RAX: 001b RBX: 880134fa2648 RCX: 880134fa2f00
[ 5250.628467] RDX: 0006 RSI: 0200 RDI: 0001
[ 5250.628467] RBP: c9000acd3c38 R08: 001b R09: 0001
[ 5250.628467] R10:  R11: 0001 R12:
[ 5250.628467] R13: 0001 R14: 880138472000 R15: 0002
[ 5250.628467] FS:  01666880() GS:88013fdc() knlGS:
[ 5250.628467] CS:  0010 DS:  ES:  CR0: 80050033
[ 5250.628467] CR2: 006ef120 CR3: 000130f48000 CR4: 06e0
[ 5250.628467] Call Trace:
[ 5250.628467]  f2fs_balance_fs+0x13c/0x1f0
[ 5250.628467]  f2fs_create+0x146/0x260
[ 5250.628467]  path_openat+0xe31/0x12c0
[ 5250.628467]  do_filp_open+0x7e/0xd0
[ 5250.628467]  ? kmem_cache_alloc+0x92/0x160
[ 5250.628467]  ? getname_flags+0x4f/0x1f0
[ 5250.628467]  do_sys_open+0x115/0x1f0
[ 5250.628467]  SyS_open+0x1e/0x20
[ 5250.628467]  entry_SYSCALL_64_fastpath+0x13/0x94
[ 5250.628467] RIP: 0033:0x4171d0
[ 5250.628467] RSP: 002b:7fff9a45b678 EFLAGS: 0246 ORIG_RAX: 0002
[ 5250.628467] RAX: ffda RBX: 0001 RCX: 004171d0
[ 5250.628467] RDX: 0180 RSI: 0042 RDI: 7fff9a45c1cb
[ 5250.628467] RBP: 7fff9a45c1bf R08: 7fff9a45b7f0 R09: 0001
[ 5250.628467] R10: 004bd8d3 R11: 0246 R12: 0006
[ 5250.628467] R13: 7fff9a45b830 R14: 0180 R15:
[ 5250.628467] Code: 00 bb c3 ff ff ff e9 2c fa ff ff 4d 8b 27 bb fb ff ff ff c7 44 24 7c 00 00 00 00 c7 84 24 80 00 00 00 00 00 00 00 e9 0c fa ff ff <0f> 0b 41 8b 96 fc 03 00 00 41 8b be f4 03 00 00 4c 8b 21 45 8b
[ 5250.628467] RIP: f2fs_gc+0x9da/0xa80 RSP: c9000acd3b48
[ 5250.685538] ---[ end trace 00b8c84c59632b32 ]---

Let me fix it one by one.

On 2017/11/23 21:05, Chao Yu wrote:

On 2017/11/22 11:50, Yunlong Song wrote:

ping again...

On 2017/11/17 9:09, Yunlong Song wrote:

This can help to find potential bugs in some corner cases.

Could you test this patch with the fstests suite? If any test cases
trigger this bug_on, it would be better to fix them all together.

Thanks,


Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
   fs/f2fs/gc.c | 1 +
   1 file changed, 1 insertion(+)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 5d5bba4..c89128b 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -1035,6 +1035,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
   goto stop;
   }
   if (!__get_victim(sbi, , gc_type)) {
+f2fs_bug_on(sbi, !total_freed && has_not_enough_free_secs(sbi, 0, 0));
   ret = -ENODATA;
   goto stop;
   }




--
Thanks,
Yunlong Song




Re: [f2fs-dev] [PATCH v4 RESEND] f2fs: fix out-of-free problem caused by atomic write

2017-11-21 Thread Yunlong Song


ping again...

On 2017/11/17 8:54, Yunlong Song wrote:

f2fs_balance_fs is only called once in commit_inmem_pages, but there
is more than one page to commit, so all the other pages miss the
check. This leads to an out-of-free problem when committing a very large
file. However, we cannot call f2fs_balance_fs for each inmem page, since
that would break atomicity. As a result, we should do f2fs_balance_fs
for all the inmem pages together.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
  fs/f2fs/debug.c   |  5 +++--
  fs/f2fs/f2fs.h| 26 --
  fs/f2fs/segment.c | 30 --
  fs/f2fs/segment.h |  4 +++-
  4 files changed, 54 insertions(+), 11 deletions(-)

diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
index f7eec50..41c47c4 100644
--- a/fs/f2fs/debug.c
+++ b/fs/f2fs/debug.c
@@ -50,6 +50,7 @@ static void update_general_status(struct f2fs_sb_info *sbi)
si->ndirty_files = sbi->ndirty_inode[FILE_INODE];
si->ndirty_all = sbi->ndirty_inode[DIRTY_META];
si->inmem_pages = get_pages(sbi, F2FS_INMEM_PAGES);
+   si->inmem_commit_pages = get_pages(sbi, F2FS_INMEM_COMMIT_PAGES);
si->aw_cnt = atomic_read(>aw_cnt);
si->vw_cnt = atomic_read(>vw_cnt);
si->max_aw_cnt = atomic_read(>max_aw_cnt);
@@ -360,9 +361,9 @@ static int stat_show(struct seq_file *s, void *v)
   si->nr_discarding, si->nr_discarded,
   si->nr_discard_cmd, si->undiscard_blks);
seq_printf(s, "  - inmem: %4d, atomic IO: %4d (Max. %4d), "
-   "volatile IO: %4d (Max. %4d)\n",
+   "volatile IO: %4d (Max. %4d), commit: %4d\n",
   si->inmem_pages, si->aw_cnt, si->max_aw_cnt,
-  si->vw_cnt, si->max_vw_cnt);
+  si->vw_cnt, si->max_vw_cnt, si->inmem_commit_pages);
seq_printf(s, "  - nodes: %4d in %4d\n",
   si->ndirty_node, si->node_pages);
seq_printf(s, "  - dents: %4d in dirs:%4d (%4d)\n",
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 13a96b8..749bdb6 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -610,6 +610,7 @@ struct f2fs_inode_info {
struct list_head inmem_pages;   /* inmemory pages managed by f2fs */
struct task_struct *inmem_task; /* store inmemory task */
struct mutex inmem_lock;/* lock for inmemory pages */
+   unsigned long inmem_blocks; /* inmemory blocks */
struct extent_tree *extent_tree;/* cached extent_tree entry */
struct rw_semaphore dio_rwsem[2];/* avoid racing between dio and gc */
struct rw_semaphore i_mmap_sem;
@@ -863,6 +864,7 @@ enum count_type {
F2FS_DIRTY_NODES,
F2FS_DIRTY_META,
F2FS_INMEM_PAGES,
+   F2FS_INMEM_COMMIT_PAGES,
F2FS_DIRTY_IMETA,
F2FS_WB_CP_DATA,
F2FS_WB_DATA,
@@ -1600,7 +1602,21 @@ static inline void inc_page_count(struct f2fs_sb_info *sbi, int count_type)
atomic_inc(>nr_pages[count_type]);
  
  	if (count_type == F2FS_DIRTY_DATA || count_type == F2FS_INMEM_PAGES ||
-   count_type == F2FS_WB_CP_DATA || count_type == F2FS_WB_DATA)
+   count_type == F2FS_WB_CP_DATA || count_type == F2FS_WB_DATA ||
+   count_type == F2FS_INMEM_COMMIT_PAGES)
+   return;
+
+   set_sbi_flag(sbi, SBI_IS_DIRTY);
+}
+
+static inline void inc_pages_count(struct f2fs_sb_info *sbi, int count_type,
+   int pages)
+{
+   atomic_add(pages, >nr_pages[count_type]);
+
+   if (count_type == F2FS_DIRTY_DATA || count_type == F2FS_INMEM_PAGES ||
+   count_type == F2FS_WB_CP_DATA || count_type == F2FS_WB_DATA ||
+   count_type == F2FS_INMEM_COMMIT_PAGES)
return;
  
  	set_sbi_flag(sbi, SBI_IS_DIRTY);
@@ -1618,6 +1634,12 @@ static inline void dec_page_count(struct f2fs_sb_info *sbi, int count_type)
atomic_dec(>nr_pages[count_type]);
  }
  
+static inline void dec_pages_count(struct f2fs_sb_info *sbi, int count_type,
+   int pages)
+{
+   atomic_sub(pages, >nr_pages[count_type]);
+}
+
  static inline void inode_dec_dirty_pages(struct inode *inode)
  {
if (!S_ISDIR(inode->i_mode) && !S_ISREG(inode->i_mode) &&
@@ -2716,7 +2738,7 @@ struct f2fs_stat_info {
unsigned long long hit_total, total_ext;
int ext_tree, zombie_tree, ext_node;
int ndirty_node, ndirty_dent, ndirty_meta, ndirty_data, ndirty_imeta;
-   int inmem_pages;
+   int inmem_pages, inmem_commit_pages;
unsigned int ndirty_dirs, ndirty_files, ndirty_all;
int nats, dirty_nats, sits, dirty_sits;
int free_nids, avail_nids

Re: [f2fs-dev] [PATCH v3 RESEND] f2fs: add bug_on when f2fs_gc even fails to get one victim

2017-11-21 Thread Yunlong Song


ping again...

On 2017/11/17 9:09, Yunlong Song wrote:

This can help to find potential bugs on some corner case.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
  fs/f2fs/gc.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 5d5bba4..c89128b 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -1035,6 +1035,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
goto stop;
}
if (!__get_victim(sbi, , gc_type)) {
+   f2fs_bug_on(sbi, !total_freed && has_not_enough_free_secs(sbi, 0, 0));
ret = -ENODATA;
goto stop;
}


--
Thanks,
Yunlong Song




Re: [f2fs-dev] [PATCH] f2fs: let f2fs also gc atomic file to avoid loop gc

2017-11-16 Thread Yunlong Song


How about adding file_write_and_wait_range in __write_node_page, as follows:
	if (atomic && !test_opt(sbi, NOBARRIER)) {
		file_write_and_wait_range(file, 0, LLONG_MAX);
		fio.op_flags |= REQ_PREFLUSH | REQ_FUA;
	}

Then all the GCed data will be flushed to non-volatile storage before the
last node write with REQ_PREFLUSH | REQ_FUA.


On 2017/11/17 11:20, Chao Yu wrote:

On 2017/11/17 11:04, Yunlong Song wrote:

The atomic commit will trigger:
  -f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true)
  -file_write_and_wait_range(file, 0, LLONG_MAX)
  -fsync_node_pages
  -__write_node_page
  -REQ_PREFLUSH | REQ_FUA

So data is flushed to non-volatile storage before the last node write with
REQ_PREFLUSH | REQ_FUA,

I mean GCed data.

- file_write_and_wait_range
- move_data_block
 - f2fs_submit_page_write
  - f2fs_update_data_blkaddr
   - set_page_dirty
  - fsync_node_pages

Thanks,


so we do not need to worry about the inconsistency problem. Right?

On 2017/11/17 10:49, Chao Yu wrote:

On 2017/11/17 8:58, Yunlong Song wrote:

Is there any problem with just deleting the check, as this patch does?

IIRC, dirty nodes that come from data segment GC can be written back and flushed
during atomic commit, but the related data will still be in the inner bio cache,
so after a later SPOR the data would be inconsistent.

Thanks,


On 2017/11/8 17:28, Chao Yu wrote:

On 2017/11/8 10:34, Yunlong Song wrote:

If some files are opened with the atomic flag and have not been committed yet,
and at the same time all the target victim segments have at least one page
of these atomic files, then f2fs gc will fail to do gc and hang in the
loop back to gc_more: gc_data_segment will not move any data,
get_valid_blocks will never be 0, and do_garbage_collect will
always return 0.

Oh, I added this check to keep data GC from ruining atomic writes. Could
we find another way to solve this issue? BTW, if there is direct
IO, we will also skip data segment GC.

Thanks,


Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
fs/f2fs/gc.c | 6 --
1 file changed, 6 deletions(-)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 5d5bba4..3fdcd04 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -621,9 +621,6 @@ static void move_data_block(struct inode *inode, block_t bidx,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;

-   if (f2fs_is_atomic_file(inode))
-   goto out;
-
set_new_dnode(, inode, NULL, NULL, 0);
err = get_dnode_of_data(, bidx, LOOKUP_NODE);
if (err)
@@ -718,9 +715,6 @@ static void move_data_page(struct inode *inode, block_t bidx, int gc_type,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;

-   if (f2fs_is_atomic_file(inode))
-   goto out;
-
if (gc_type == BG_GC) {
if (PageWriteback(page))
goto out;





--
Thanks,
Yunlong Song




Re: [f2fs-dev] [PATCH] f2fs: let f2fs also gc atomic file to avoid loop gc

2017-11-16 Thread Yunlong Song


The atomic commit will trigger:
-f2fs_do_sync_file(filp, 0, LLONG_MAX, 0, true)
-file_write_and_wait_range(file, 0, LLONG_MAX)
-fsync_node_pages
-__write_node_page
-REQ_PREFLUSH | REQ_FUA

So data is flushed to non-volatile storage before the last node write with
REQ_PREFLUSH | REQ_FUA,

so we do not need to worry about the inconsistency problem. Right?

On 2017/11/17 10:49, Chao Yu wrote:

On 2017/11/17 8:58, Yunlong Song wrote:

Is there any problem with just deleting the check, as this patch does?

IIRC, dirty nodes that come from data segment GC can be written back and flushed
during atomic commit, but the related data will still be in the inner bio cache,
so after a later SPOR the data would be inconsistent.

Thanks,


On 2017/11/8 17:28, Chao Yu wrote:

On 2017/11/8 10:34, Yunlong Song wrote:

If some files are opened with the atomic flag and have not been committed yet,
and at the same time all the target victim segments have at least one page
of these atomic files, then f2fs gc will fail to do gc and hang in the
loop back to gc_more: gc_data_segment will not move any data,
get_valid_blocks will never be 0, and do_garbage_collect will
always return 0.

Oh, I added this check to keep data GC from ruining atomic writes. Could
we find another way to solve this issue? BTW, if there is direct
IO, we will also skip data segment GC.

Thanks,


Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
   fs/f2fs/gc.c | 6 --
   1 file changed, 6 deletions(-)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 5d5bba4..3fdcd04 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -621,9 +621,6 @@ static void move_data_block(struct inode *inode, block_t bidx,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;

-   if (f2fs_is_atomic_file(inode))
-   goto out;
-
set_new_dnode(, inode, NULL, NULL, 0);
err = get_dnode_of_data(, bidx, LOOKUP_NODE);
if (err)
@@ -718,9 +715,6 @@ static void move_data_page(struct inode *inode, block_t bidx, int gc_type,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;

-   if (f2fs_is_atomic_file(inode))
-   goto out;
-
if (gc_type == BG_GC) {
if (PageWriteback(page))
goto out;





--
Thanks,
Yunlong Song



