[PATCH] f2fs: remove redundant comment of unused wio_mutex

2018-12-13 Thread Yunlong Song
Commit 089842de ("f2fs: remove codes of unused wio_mutex") removes codes
of unused wio_mutex, but missing the comment, so delete it.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/f2fs.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 7cec897..03e7b37 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1175,7 +1175,6 @@ struct f2fs_sb_info {
 
/* for bio operations */
struct f2fs_bio_info *write_io[NR_PAGE_TYPE];   /* for write bios */
-   /* bio ordering for NODE/DATA */
/* keep migration IO order for LFS mode */
struct rw_semaphore io_order_lock;
mempool_t *write_io_dummy;  /* Dummy pages */
-- 
1.8.5.2



[PATCH v2] f2fs: remove codes of unused wio_mutex

2018-12-11 Thread Yunlong Song
v1->v2: delete comments in f2fs.h: "/* bio ordering for NODE/DATA */"

Signed-off-by: Yunlong Song 
Reviewed-by: Chao Yu 
Signed-off-by: Jaegeuk Kim 
---
 fs/f2fs/f2fs.h  | 2 --
 fs/f2fs/super.c | 5 +
 2 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 1e03197..195850e 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1170,8 +1170,6 @@ struct f2fs_sb_info {
 
/* for bio operations */
struct f2fs_bio_info *write_io[NR_PAGE_TYPE];   /* for write bios */
-   struct mutex wio_mutex[NR_PAGE_TYPE - 1][NR_TEMP_TYPE];
-   /* bio ordering for NODE/DATA */
/* keep migration IO order for LFS mode */
struct rw_semaphore io_order_lock;
mempool_t *write_io_dummy;  /* Dummy pages */
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index af58b2c..2d18de5 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -2674,7 +2674,7 @@ int f2fs_sanity_check_ckpt(struct f2fs_sb_info *sbi)
 static void init_sb_info(struct f2fs_sb_info *sbi)
 {
struct f2fs_super_block *raw_super = sbi->raw_super;
-   int i, j;
+   int i;
 
sbi->log_sectors_per_block =
le32_to_cpu(raw_super->log_sectors_per_block);
@@ -2710,9 +2710,6 @@ static void init_sb_info(struct f2fs_sb_info *sbi)
 
INIT_LIST_HEAD(>s_list);
mutex_init(>umount_mutex);
-   for (i = 0; i < NR_PAGE_TYPE - 1; i++)
-   for (j = HOT; j < NR_TEMP_TYPE; j++)
-   mutex_init(>wio_mutex[i][j]);
init_rwsem(>io_order_lock);
spin_lock_init(>cp_lock);
 
-- 
1.8.5.2



[PATCH] f2fs: issue discard align to section in LFS mode

2018-07-17 Thread Yunlong Song
For the case when sbi->segs_per_sec > 1 with lfs mode, take
section:segment = 5 for example, if the section prefree_map is
...previous section | current section (1 1 0 1 1) | next section...,
then the start = x, end = x + 1, after start = start_segno +
sbi->segs_per_sec, start = x + 5, then it will skip x + 3 and x + 4, but
their bitmap is still set, which will cause duplicated
f2fs_issue_discard of this same section in the next write_checkpoint:

round 1: section bitmap : 1 1 1 1 1, all valid, prefree_map: 0 0 0 0 0
then rm data block NO.2, block NO.2 becomes invalid, prefree_map: 0 0 1 0 0
write_checkpoint: section bitmap: 1 1 0 1 1, prefree_map: 0 0 0 0 0,
prefree of NO.2 is cleared, and no discard issued

round 2: rm data block NO.0, NO.1, NO.3, NO.4
all invalid, but prefree bit of NO.2 is set and cleared in round 1, then
prefree_map: 1 1 0 1 1
write_checkpoint: section bitmap: 0 0 0 0 0, prefree_map: 0 0 0 1 1, no
valid blocks of this section, so discard issued, but this time prefree
bit of NO.3 and NO.4 is skipped due to start = start_segno + sbi->segs_per_sec;

round 3:
write_checkpoint: section bitmap: 0 0 0 0 0, prefree_map: 0 0 0 1 1 ->
0 0 0 0 0, no valid blocks of this section, so discard issued,
this time prefree bit of NO.3 and NO.4 is cleared, but the discard of
this section is sent again...

To fix this problem, we can align the start and end value to section
boundary for fstrim and real-time discard operation, and decide to issue
discard only when the whole section is invalid, which can issue discard
aligned to section size as much as possible and avoid redundant discard.

Signed-off-by: Yunlong Song 
Signed-off-by: Chao Yu 
---
 fs/f2fs/segment.c | 24 +++-
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index cfff7cf..ff5de34 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1710,6 +1710,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi,
unsigned int start = 0, end = -1;
unsigned int secno, start_segno;
bool force = (cpc->reason & CP_DISCARD);
+   bool need_align = test_opt(sbi, LFS) && sbi->segs_per_sec > 1;
 
mutex_lock(_i->seglist_lock);
 
@@ -1721,14 +1722,19 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info 
*sbi,
end = find_next_zero_bit(prefree_map, MAIN_SEGS(sbi),
start + 1);
 
-   for (i = start; i < end; i++)
-   clear_bit(i, prefree_map);
-
-   dirty_i->nr_dirty[PRE] -= end - start;
-
if (!test_opt(sbi, DISCARD))
continue;
 
+   if (need_align) {
+   start = rounddown(start, sbi->segs_per_sec);
+   end = roundup(end, sbi->segs_per_sec);
+   }
+
+   for (i = start; i < end; i++) {
+   if (test_and_clear_bit(i, prefree_map))
+   dirty_i->nr_dirty[PRE]--;
+   }
+
if (force && start >= cpc->trim_start &&
(end - 1) <= cpc->trim_end)
continue;
@@ -2511,6 +2517,7 @@ int f2fs_trim_fs(struct f2fs_sb_info *sbi, struct 
fstrim_range *range)
struct discard_policy dpolicy;
unsigned long long trimmed = 0;
int err = 0;
+   bool need_align = test_opt(sbi, LFS) && sbi->segs_per_sec > 1;
 
if (start >= MAX_BLKADDR(sbi) || range->len < sbi->blocksize)
return -EINVAL;
@@ -2528,6 +2535,13 @@ int f2fs_trim_fs(struct f2fs_sb_info *sbi, struct 
fstrim_range *range)
start_segno = (start <= MAIN_BLKADDR(sbi)) ? 0 : GET_SEGNO(sbi, start);
end_segno = (end >= MAX_BLKADDR(sbi)) ? MAIN_SEGS(sbi) - 1 :
GET_SEGNO(sbi, end);
+   if (need_align) {
+   start_segno = rounddown(start_segno, sbi->segs_per_sec);
+   end_segno = roundup(end_segno, sbi->segs_per_sec);
+   if (start_segno == end_segno)
+   end_segno += sbi->segs_per_sec;
+   end_segno--;
+   }
 
cpc.reason = CP_DISCARD;
cpc.trim_minlen = max_t(__u64, 1, F2FS_BYTES_TO_BLK(range->minlen));
-- 
1.8.5.2



[PATCH] f2fs: issue discard align to section in LFS mode

2018-07-17 Thread Yunlong Song
For the case when sbi->segs_per_sec > 1 with lfs mode, take
section:segment = 5 for example, if the section prefree_map is
...previous section | current section (1 1 0 1 1) | next section...,
then the start = x, end = x + 1, after start = start_segno +
sbi->segs_per_sec, start = x + 5, then it will skip x + 3 and x + 4, but
their bitmap is still set, which will cause duplicated
f2fs_issue_discard of this same section in the next write_checkpoint:

round 1: section bitmap : 1 1 1 1 1, all valid, prefree_map: 0 0 0 0 0
then rm data block NO.2, block NO.2 becomes invalid, prefree_map: 0 0 1 0 0
write_checkpoint: section bitmap: 1 1 0 1 1, prefree_map: 0 0 0 0 0,
prefree of NO.2 is cleared, and no discard issued

round 2: rm data block NO.0, NO.1, NO.3, NO.4
all invalid, but prefree bit of NO.2 is set and cleared in round 1, then
prefree_map: 1 1 0 1 1
write_checkpoint: section bitmap: 0 0 0 0 0, prefree_map: 0 0 0 1 1, no
valid blocks of this section, so discard issued, but this time prefree
bit of NO.3 and NO.4 is skipped due to start = start_segno + sbi->segs_per_sec;

round 3:
write_checkpoint: section bitmap: 0 0 0 0 0, prefree_map: 0 0 0 1 1 ->
0 0 0 0 0, no valid blocks of this section, so discard issued,
this time prefree bit of NO.3 and NO.4 is cleared, but the discard of
this section is sent again...

To fix this problem, we can align the start and end value to section
boundary for fstrim and real-time discard operation, and decide to issue
discard only when the whole section is invalid, which can issue discard
aligned to section size as much as possible and avoid redundant discard.

Signed-off-by: Yunlong Song 
Signed-off-by: Chao Yu 
---
 fs/f2fs/segment.c | 24 +++-
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index cfff7cf..ff5de34 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -1710,6 +1710,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi,
unsigned int start = 0, end = -1;
unsigned int secno, start_segno;
bool force = (cpc->reason & CP_DISCARD);
+   bool need_align = test_opt(sbi, LFS) && sbi->segs_per_sec > 1;
 
mutex_lock(_i->seglist_lock);
 
@@ -1721,14 +1722,19 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info 
*sbi,
end = find_next_zero_bit(prefree_map, MAIN_SEGS(sbi),
start + 1);
 
-   for (i = start; i < end; i++)
-   clear_bit(i, prefree_map);
-
-   dirty_i->nr_dirty[PRE] -= end - start;
-
if (!test_opt(sbi, DISCARD))
continue;
 
+   if (need_align) {
+   start = rounddown(start, sbi->segs_per_sec);
+   end = roundup(end, sbi->segs_per_sec);
+   }
+
+   for (i = start; i < end; i++) {
+   if (test_and_clear_bit(i, prefree_map))
+   dirty_i->nr_dirty[PRE]--;
+   }
+
if (force && start >= cpc->trim_start &&
(end - 1) <= cpc->trim_end)
continue;
@@ -2511,6 +2517,7 @@ int f2fs_trim_fs(struct f2fs_sb_info *sbi, struct 
fstrim_range *range)
struct discard_policy dpolicy;
unsigned long long trimmed = 0;
int err = 0;
+   bool need_align = test_opt(sbi, LFS) && sbi->segs_per_sec > 1;
 
if (start >= MAX_BLKADDR(sbi) || range->len < sbi->blocksize)
return -EINVAL;
@@ -2528,6 +2535,13 @@ int f2fs_trim_fs(struct f2fs_sb_info *sbi, struct 
fstrim_range *range)
start_segno = (start <= MAIN_BLKADDR(sbi)) ? 0 : GET_SEGNO(sbi, start);
end_segno = (end >= MAX_BLKADDR(sbi)) ? MAIN_SEGS(sbi) - 1 :
GET_SEGNO(sbi, end);
+   if (need_align) {
+   start_segno = rounddown(start_segno, sbi->segs_per_sec);
+   end_segno = roundup(end_segno, sbi->segs_per_sec);
+   if (start_segno == end_segno)
+   end_segno += sbi->segs_per_sec;
+   end_segno--;
+   }
 
cpc.reason = CP_DISCARD;
cpc.trim_minlen = max_t(__u64, 1, F2FS_BYTES_TO_BLK(range->minlen));
-- 
1.8.5.2



Re: [f2fs-dev] [PATCH] f2fs-tools: fix overflow bug of start_sector when computing zone_align_start_offset

2018-05-27 Thread Yunlong Song
Just keep it same with u_int64_t defined in mkfs/f2fs_format.c, and 
c.start_sector * c.sector_size
may be u32 overflow, so add (u_int64_t) before c.start_sector * 
c.sector_size and change the target

value zone_align_start_offset to (u_int64_t).

On 2018/5/26 19:27, Junling Zheng wrote:

No neet to change zone_align_start_offset to u64, because 
zone_align_start_offset is always
smaller than zone_size_bytes, which is u32.

Thanks,
Junling

On 2018/5/26 16:09, Yunlong Song wrote:

zone_align_start_offset should be u64, but config.start_sector is u32,
so it may be overflow when computing zone_align_start_offset.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
  fsck/resize.c  | 7 ---
  mkfs/f2fs_format.c | 4 ++--
  2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fsck/resize.c b/fsck/resize.c
index d285dd7..8ac7d45 100644
--- a/fsck/resize.c
+++ b/fsck/resize.c
@@ -11,7 +11,8 @@
  
  static int get_new_sb(struct f2fs_super_block *sb)

  {
-   u_int32_t zone_size_bytes, zone_align_start_offset;
+   u_int32_t zone_size_bytes;
+   u_int64_t zone_align_start_offset;
u_int32_t blocks_for_sit, blocks_for_nat, blocks_for_ssa;
u_int32_t sit_segments, nat_segments, diff, total_meta_segments;
u_int32_t total_valid_blks_available;
@@ -27,10 +28,10 @@ static int get_new_sb(struct f2fs_super_block *sb)
  
  	zone_size_bytes = segment_size_bytes * segs_per_zone;

zone_align_start_offset =
-   (c.start_sector * c.sector_size +
+   ((u_int64_t) c.start_sector * c.sector_size +
2 * F2FS_BLKSIZE + zone_size_bytes - 1) /
zone_size_bytes * zone_size_bytes -
-   c.start_sector * c.sector_size;
+   (u_int64_t) c.start_sector * c.sector_size;
  
  	set_sb(segment_count, (c.target_sectors * c.sector_size -

zone_align_start_offset) / segment_size_bytes /
diff --git a/mkfs/f2fs_format.c b/mkfs/f2fs_format.c
index 0a99a77..f045e23 100644
--- a/mkfs/f2fs_format.c
+++ b/mkfs/f2fs_format.c
@@ -212,10 +212,10 @@ static int f2fs_prepare_super_block(void)
set_sb(block_count, c.total_sectors >> log_sectors_per_block);
  
  	zone_align_start_offset =

-   (c.start_sector * c.sector_size +
+   ((u_int64_t) c.start_sector * c.sector_size +
2 * F2FS_BLKSIZE + zone_size_bytes - 1) /
zone_size_bytes * zone_size_bytes -
-   c.start_sector * c.sector_size;
+   (u_int64_t) c.start_sector * c.sector_size;
  
  	if (c.start_sector % c.sectors_per_blk) {

MSG(1, "\t%s: Align start sector number to the page unit\n",




.



--
Thanks,
Yunlong Song




Re: [f2fs-dev] [PATCH] f2fs-tools: fix overflow bug of start_sector when computing zone_align_start_offset

2018-05-27 Thread Yunlong Song
Just keep it same with u_int64_t defined in mkfs/f2fs_format.c, and 
c.start_sector * c.sector_size
may be u32 overflow, so add (u_int64_t) before c.start_sector * 
c.sector_size and change the target

value zone_align_start_offset to (u_int64_t).

On 2018/5/26 19:27, Junling Zheng wrote:

No neet to change zone_align_start_offset to u64, because 
zone_align_start_offset is always
smaller than zone_size_bytes, which is u32.

Thanks,
Junling

On 2018/5/26 16:09, Yunlong Song wrote:

zone_align_start_offset should be u64, but config.start_sector is u32,
so it may be overflow when computing zone_align_start_offset.

Signed-off-by: Yunlong Song 
---
  fsck/resize.c  | 7 ---
  mkfs/f2fs_format.c | 4 ++--
  2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fsck/resize.c b/fsck/resize.c
index d285dd7..8ac7d45 100644
--- a/fsck/resize.c
+++ b/fsck/resize.c
@@ -11,7 +11,8 @@
  
  static int get_new_sb(struct f2fs_super_block *sb)

  {
-   u_int32_t zone_size_bytes, zone_align_start_offset;
+   u_int32_t zone_size_bytes;
+   u_int64_t zone_align_start_offset;
u_int32_t blocks_for_sit, blocks_for_nat, blocks_for_ssa;
u_int32_t sit_segments, nat_segments, diff, total_meta_segments;
u_int32_t total_valid_blks_available;
@@ -27,10 +28,10 @@ static int get_new_sb(struct f2fs_super_block *sb)
  
  	zone_size_bytes = segment_size_bytes * segs_per_zone;

zone_align_start_offset =
-   (c.start_sector * c.sector_size +
+   ((u_int64_t) c.start_sector * c.sector_size +
2 * F2FS_BLKSIZE + zone_size_bytes - 1) /
zone_size_bytes * zone_size_bytes -
-   c.start_sector * c.sector_size;
+   (u_int64_t) c.start_sector * c.sector_size;
  
  	set_sb(segment_count, (c.target_sectors * c.sector_size -

zone_align_start_offset) / segment_size_bytes /
diff --git a/mkfs/f2fs_format.c b/mkfs/f2fs_format.c
index 0a99a77..f045e23 100644
--- a/mkfs/f2fs_format.c
+++ b/mkfs/f2fs_format.c
@@ -212,10 +212,10 @@ static int f2fs_prepare_super_block(void)
set_sb(block_count, c.total_sectors >> log_sectors_per_block);
  
  	zone_align_start_offset =

-   (c.start_sector * c.sector_size +
+   ((u_int64_t) c.start_sector * c.sector_size +
2 * F2FS_BLKSIZE + zone_size_bytes - 1) /
zone_size_bytes * zone_size_bytes -
-   c.start_sector * c.sector_size;
+   (u_int64_t) c.start_sector * c.sector_size;
  
  	if (c.start_sector % c.sectors_per_blk) {

MSG(1, "\t%s: Align start sector number to the page unit\n",




.



--
Thanks,
Yunlong Song




Re: [PATCH v2] f2fs-tools: fix to match with the start_sector

2018-05-26 Thread Yunlong Song

ping...

On 2018/5/7 10:15, Yunlong Song wrote:

f2fs-tools uses ioctl BLKSSZGET to get sector_size, however, this ioctl
will return a value which may be larger than 512 (according to the value
of q->limits.logical_block_size), then this will be inconsistent with
the start_sector, since start_sector is got from ioctl HDIO_GETGEO and
is always in 512 size unit for a sector. To fix this problem, just
change the sector_size to the default value when computing with
start_sector. And fix sectors_per_blk as well.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
  fsck/resize.c  |  4 ++--
  mkfs/f2fs_format.c | 12 ++--
  2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/fsck/resize.c b/fsck/resize.c
index d285dd7..ada2155 100644
--- a/fsck/resize.c
+++ b/fsck/resize.c
@@ -27,10 +27,10 @@ static int get_new_sb(struct f2fs_super_block *sb)
  
  	zone_size_bytes = segment_size_bytes * segs_per_zone;

zone_align_start_offset =
-   (c.start_sector * c.sector_size +
+   (c.start_sector * DEFAULT_SECTOR_SIZE +
2 * F2FS_BLKSIZE + zone_size_bytes - 1) /
zone_size_bytes * zone_size_bytes -
-   c.start_sector * c.sector_size;
+   c.start_sector * DEFAULT_SECTOR_SIZE;
  
  	set_sb(segment_count, (c.target_sectors * c.sector_size -

zone_align_start_offset) / segment_size_bytes /
diff --git a/mkfs/f2fs_format.c b/mkfs/f2fs_format.c
index 0a99a77..ced5fea 100644
--- a/mkfs/f2fs_format.c
+++ b/mkfs/f2fs_format.c
@@ -212,18 +212,18 @@ static int f2fs_prepare_super_block(void)
set_sb(block_count, c.total_sectors >> log_sectors_per_block);
  
  	zone_align_start_offset =

-   (c.start_sector * c.sector_size +
+   (c.start_sector * DEFAULT_SECTOR_SIZE +
2 * F2FS_BLKSIZE + zone_size_bytes - 1) /
zone_size_bytes * zone_size_bytes -
-   c.start_sector * c.sector_size;
+   c.start_sector * DEFAULT_SECTOR_SIZE;
  
-	if (c.start_sector % c.sectors_per_blk) {

+   if (c.start_sector % DEFAULT_SECTORS_PER_BLOCK) {
MSG(1, "\t%s: Align start sector number to the page unit\n",
c.zoned_mode ? "FAIL" : "WARN");
MSG(1, "\ti.e., start sector: %d, ofs:%d (sects/page: %d)\n",
c.start_sector,
-   c.start_sector % c.sectors_per_blk,
-   c.sectors_per_blk);
+   c.start_sector % DEFAULT_SECTORS_PER_BLOCK,
+   DEFAULT_SECTORS_PER_BLOCK);
if (c.zoned_mode)
return -1;
}
@@ -235,7 +235,7 @@ static int f2fs_prepare_super_block(void)
get_sb(segment0_blkaddr));
  
  	if (c.zoned_mode && (get_sb(segment0_blkaddr) + c.start_sector /

-   c.sectors_per_blk) % c.zone_blocks) {
+   DEFAULT_SECTORS_PER_BLOCK) % 
c.zone_blocks) {
MSG(1, "\tError: Unaligned segment0 block address %u\n",
    get_sb(segment0_blkaddr));
return -1;


--
Thanks,
Yunlong Song




Re: [PATCH v2] f2fs-tools: fix to match with the start_sector

2018-05-26 Thread Yunlong Song

ping...

On 2018/5/7 10:15, Yunlong Song wrote:

f2fs-tools uses ioctl BLKSSZGET to get sector_size, however, this ioctl
will return a value which may be larger than 512 (according to the value
of q->limits.logical_block_size), then this will be inconsistent with
the start_sector, since start_sector is got from ioctl HDIO_GETGEO and
is always in 512 size unit for a sector. To fix this problem, just
change the sector_size to the default value when computing with
start_sector. And fix sectors_per_blk as well.

Signed-off-by: Yunlong Song 
---
  fsck/resize.c  |  4 ++--
  mkfs/f2fs_format.c | 12 ++--
  2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/fsck/resize.c b/fsck/resize.c
index d285dd7..ada2155 100644
--- a/fsck/resize.c
+++ b/fsck/resize.c
@@ -27,10 +27,10 @@ static int get_new_sb(struct f2fs_super_block *sb)
  
  	zone_size_bytes = segment_size_bytes * segs_per_zone;

zone_align_start_offset =
-   (c.start_sector * c.sector_size +
+   (c.start_sector * DEFAULT_SECTOR_SIZE +
2 * F2FS_BLKSIZE + zone_size_bytes - 1) /
zone_size_bytes * zone_size_bytes -
-   c.start_sector * c.sector_size;
+   c.start_sector * DEFAULT_SECTOR_SIZE;
  
  	set_sb(segment_count, (c.target_sectors * c.sector_size -

zone_align_start_offset) / segment_size_bytes /
diff --git a/mkfs/f2fs_format.c b/mkfs/f2fs_format.c
index 0a99a77..ced5fea 100644
--- a/mkfs/f2fs_format.c
+++ b/mkfs/f2fs_format.c
@@ -212,18 +212,18 @@ static int f2fs_prepare_super_block(void)
set_sb(block_count, c.total_sectors >> log_sectors_per_block);
  
  	zone_align_start_offset =

-   (c.start_sector * c.sector_size +
+   (c.start_sector * DEFAULT_SECTOR_SIZE +
2 * F2FS_BLKSIZE + zone_size_bytes - 1) /
zone_size_bytes * zone_size_bytes -
-   c.start_sector * c.sector_size;
+   c.start_sector * DEFAULT_SECTOR_SIZE;
  
-	if (c.start_sector % c.sectors_per_blk) {

+   if (c.start_sector % DEFAULT_SECTORS_PER_BLOCK) {
MSG(1, "\t%s: Align start sector number to the page unit\n",
c.zoned_mode ? "FAIL" : "WARN");
MSG(1, "\ti.e., start sector: %d, ofs:%d (sects/page: %d)\n",
c.start_sector,
-   c.start_sector % c.sectors_per_blk,
-   c.sectors_per_blk);
+   c.start_sector % DEFAULT_SECTORS_PER_BLOCK,
+   DEFAULT_SECTORS_PER_BLOCK);
if (c.zoned_mode)
return -1;
}
@@ -235,7 +235,7 @@ static int f2fs_prepare_super_block(void)
get_sb(segment0_blkaddr));
  
  	if (c.zoned_mode && (get_sb(segment0_blkaddr) + c.start_sector /

-   c.sectors_per_blk) % c.zone_blocks) {
+   DEFAULT_SECTORS_PER_BLOCK) % 
c.zone_blocks) {
MSG(1, "\tError: Unaligned segment0 block address %u\n",
get_sb(segment0_blkaddr));
return -1;


--
Thanks,
Yunlong Song




[PATCH] f2fs-tools: fix overflow bug of start_sector when computing zone_align_start_offset

2018-05-26 Thread Yunlong Song
zone_align_start_offset should be u64, but config.start_sector is u32,
so it may be overflow when computing zone_align_start_offset.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fsck/resize.c  | 7 ---
 mkfs/f2fs_format.c | 4 ++--
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fsck/resize.c b/fsck/resize.c
index d285dd7..8ac7d45 100644
--- a/fsck/resize.c
+++ b/fsck/resize.c
@@ -11,7 +11,8 @@
 
 static int get_new_sb(struct f2fs_super_block *sb)
 {
-   u_int32_t zone_size_bytes, zone_align_start_offset;
+   u_int32_t zone_size_bytes;
+   u_int64_t zone_align_start_offset;
u_int32_t blocks_for_sit, blocks_for_nat, blocks_for_ssa;
u_int32_t sit_segments, nat_segments, diff, total_meta_segments;
u_int32_t total_valid_blks_available;
@@ -27,10 +28,10 @@ static int get_new_sb(struct f2fs_super_block *sb)
 
zone_size_bytes = segment_size_bytes * segs_per_zone;
zone_align_start_offset =
-   (c.start_sector * c.sector_size +
+   ((u_int64_t) c.start_sector * c.sector_size +
2 * F2FS_BLKSIZE + zone_size_bytes - 1) /
zone_size_bytes * zone_size_bytes -
-   c.start_sector * c.sector_size;
+   (u_int64_t) c.start_sector * c.sector_size;
 
set_sb(segment_count, (c.target_sectors * c.sector_size -
zone_align_start_offset) / segment_size_bytes /
diff --git a/mkfs/f2fs_format.c b/mkfs/f2fs_format.c
index 0a99a77..f045e23 100644
--- a/mkfs/f2fs_format.c
+++ b/mkfs/f2fs_format.c
@@ -212,10 +212,10 @@ static int f2fs_prepare_super_block(void)
set_sb(block_count, c.total_sectors >> log_sectors_per_block);
 
zone_align_start_offset =
-   (c.start_sector * c.sector_size +
+   ((u_int64_t) c.start_sector * c.sector_size +
2 * F2FS_BLKSIZE + zone_size_bytes - 1) /
zone_size_bytes * zone_size_bytes -
-   c.start_sector * c.sector_size;
+   (u_int64_t) c.start_sector * c.sector_size;
 
if (c.start_sector % c.sectors_per_blk) {
MSG(1, "\t%s: Align start sector number to the page unit\n",
-- 
1.8.5.2



[PATCH] f2fs-tools: fix overflow bug of start_sector when computing zone_align_start_offset

2018-05-26 Thread Yunlong Song
zone_align_start_offset should be u64, but config.start_sector is u32,
so it may be overflow when computing zone_align_start_offset.

Signed-off-by: Yunlong Song 
---
 fsck/resize.c  | 7 ---
 mkfs/f2fs_format.c | 4 ++--
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fsck/resize.c b/fsck/resize.c
index d285dd7..8ac7d45 100644
--- a/fsck/resize.c
+++ b/fsck/resize.c
@@ -11,7 +11,8 @@
 
 static int get_new_sb(struct f2fs_super_block *sb)
 {
-   u_int32_t zone_size_bytes, zone_align_start_offset;
+   u_int32_t zone_size_bytes;
+   u_int64_t zone_align_start_offset;
u_int32_t blocks_for_sit, blocks_for_nat, blocks_for_ssa;
u_int32_t sit_segments, nat_segments, diff, total_meta_segments;
u_int32_t total_valid_blks_available;
@@ -27,10 +28,10 @@ static int get_new_sb(struct f2fs_super_block *sb)
 
zone_size_bytes = segment_size_bytes * segs_per_zone;
zone_align_start_offset =
-   (c.start_sector * c.sector_size +
+   ((u_int64_t) c.start_sector * c.sector_size +
2 * F2FS_BLKSIZE + zone_size_bytes - 1) /
zone_size_bytes * zone_size_bytes -
-   c.start_sector * c.sector_size;
+   (u_int64_t) c.start_sector * c.sector_size;
 
set_sb(segment_count, (c.target_sectors * c.sector_size -
zone_align_start_offset) / segment_size_bytes /
diff --git a/mkfs/f2fs_format.c b/mkfs/f2fs_format.c
index 0a99a77..f045e23 100644
--- a/mkfs/f2fs_format.c
+++ b/mkfs/f2fs_format.c
@@ -212,10 +212,10 @@ static int f2fs_prepare_super_block(void)
set_sb(block_count, c.total_sectors >> log_sectors_per_block);
 
zone_align_start_offset =
-   (c.start_sector * c.sector_size +
+   ((u_int64_t) c.start_sector * c.sector_size +
2 * F2FS_BLKSIZE + zone_size_bytes - 1) /
zone_size_bytes * zone_size_bytes -
-   c.start_sector * c.sector_size;
+   (u_int64_t) c.start_sector * c.sector_size;
 
if (c.start_sector % c.sectors_per_blk) {
MSG(1, "\t%s: Align start sector number to the page unit\n",
-- 
1.8.5.2



[PATCH v2] f2fs-tools: fix to match with the start_sector

2018-05-06 Thread Yunlong Song
f2fs-tools uses ioctl BLKSSZGET to get sector_size, however, this ioctl
will return a value which may be larger than 512 (according to the value
of q->limits.logical_block_size), then this will be inconsistent with
the start_sector, since start_sector is got from ioctl HDIO_GETGEO and
is always in 512 size unit for a sector. To fix this problem, just
change the sector_size to the default value when computing with
start_sector. And fix sectors_per_blk as well.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fsck/resize.c  |  4 ++--
 mkfs/f2fs_format.c | 12 ++--
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/fsck/resize.c b/fsck/resize.c
index d285dd7..ada2155 100644
--- a/fsck/resize.c
+++ b/fsck/resize.c
@@ -27,10 +27,10 @@ static int get_new_sb(struct f2fs_super_block *sb)
 
zone_size_bytes = segment_size_bytes * segs_per_zone;
zone_align_start_offset =
-   (c.start_sector * c.sector_size +
+   (c.start_sector * DEFAULT_SECTOR_SIZE +
2 * F2FS_BLKSIZE + zone_size_bytes - 1) /
zone_size_bytes * zone_size_bytes -
-   c.start_sector * c.sector_size;
+   c.start_sector * DEFAULT_SECTOR_SIZE;
 
set_sb(segment_count, (c.target_sectors * c.sector_size -
zone_align_start_offset) / segment_size_bytes /
diff --git a/mkfs/f2fs_format.c b/mkfs/f2fs_format.c
index 0a99a77..ced5fea 100644
--- a/mkfs/f2fs_format.c
+++ b/mkfs/f2fs_format.c
@@ -212,18 +212,18 @@ static int f2fs_prepare_super_block(void)
set_sb(block_count, c.total_sectors >> log_sectors_per_block);
 
zone_align_start_offset =
-   (c.start_sector * c.sector_size +
+   (c.start_sector * DEFAULT_SECTOR_SIZE +
2 * F2FS_BLKSIZE + zone_size_bytes - 1) /
zone_size_bytes * zone_size_bytes -
-   c.start_sector * c.sector_size;
+   c.start_sector * DEFAULT_SECTOR_SIZE;
 
-   if (c.start_sector % c.sectors_per_blk) {
+   if (c.start_sector % DEFAULT_SECTORS_PER_BLOCK) {
MSG(1, "\t%s: Align start sector number to the page unit\n",
c.zoned_mode ? "FAIL" : "WARN");
MSG(1, "\ti.e., start sector: %d, ofs:%d (sects/page: %d)\n",
c.start_sector,
-   c.start_sector % c.sectors_per_blk,
-   c.sectors_per_blk);
+   c.start_sector % DEFAULT_SECTORS_PER_BLOCK,
+   DEFAULT_SECTORS_PER_BLOCK);
if (c.zoned_mode)
return -1;
}
@@ -235,7 +235,7 @@ static int f2fs_prepare_super_block(void)
get_sb(segment0_blkaddr));
 
if (c.zoned_mode && (get_sb(segment0_blkaddr) + c.start_sector /
-   c.sectors_per_blk) % c.zone_blocks) {
+   DEFAULT_SECTORS_PER_BLOCK) % 
c.zone_blocks) {
MSG(1, "\tError: Unaligned segment0 block address %u\n",
get_sb(segment0_blkaddr));
return -1;
-- 
1.8.5.2



[PATCH v2] f2fs-tools: fix to match with the start_sector

2018-05-06 Thread Yunlong Song
f2fs-tools uses ioctl BLKSSZGET to get sector_size, however, this ioctl
will return a value which may be larger than 512 (according to the value
of q->limits.logical_block_size), then this will be inconsistent with
the start_sector, since start_sector is got from ioctl HDIO_GETGEO and
is always in 512 size unit for a sector. To fix this problem, just
change the sector_size to the default value when computing with
start_sector. And fix sectors_per_blk as well.

Signed-off-by: Yunlong Song 
---
 fsck/resize.c  |  4 ++--
 mkfs/f2fs_format.c | 12 ++--
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/fsck/resize.c b/fsck/resize.c
index d285dd7..ada2155 100644
--- a/fsck/resize.c
+++ b/fsck/resize.c
@@ -27,10 +27,10 @@ static int get_new_sb(struct f2fs_super_block *sb)
 
zone_size_bytes = segment_size_bytes * segs_per_zone;
zone_align_start_offset =
-   (c.start_sector * c.sector_size +
+   (c.start_sector * DEFAULT_SECTOR_SIZE +
2 * F2FS_BLKSIZE + zone_size_bytes - 1) /
zone_size_bytes * zone_size_bytes -
-   c.start_sector * c.sector_size;
+   c.start_sector * DEFAULT_SECTOR_SIZE;
 
set_sb(segment_count, (c.target_sectors * c.sector_size -
zone_align_start_offset) / segment_size_bytes /
diff --git a/mkfs/f2fs_format.c b/mkfs/f2fs_format.c
index 0a99a77..ced5fea 100644
--- a/mkfs/f2fs_format.c
+++ b/mkfs/f2fs_format.c
@@ -212,18 +212,18 @@ static int f2fs_prepare_super_block(void)
set_sb(block_count, c.total_sectors >> log_sectors_per_block);
 
zone_align_start_offset =
-   (c.start_sector * c.sector_size +
+   (c.start_sector * DEFAULT_SECTOR_SIZE +
2 * F2FS_BLKSIZE + zone_size_bytes - 1) /
zone_size_bytes * zone_size_bytes -
-   c.start_sector * c.sector_size;
+   c.start_sector * DEFAULT_SECTOR_SIZE;
 
-   if (c.start_sector % c.sectors_per_blk) {
+   if (c.start_sector % DEFAULT_SECTORS_PER_BLOCK) {
MSG(1, "\t%s: Align start sector number to the page unit\n",
c.zoned_mode ? "FAIL" : "WARN");
MSG(1, "\ti.e., start sector: %d, ofs:%d (sects/page: %d)\n",
c.start_sector,
-   c.start_sector % c.sectors_per_blk,
-   c.sectors_per_blk);
+   c.start_sector % DEFAULT_SECTORS_PER_BLOCK,
+   DEFAULT_SECTORS_PER_BLOCK);
if (c.zoned_mode)
return -1;
}
@@ -235,7 +235,7 @@ static int f2fs_prepare_super_block(void)
get_sb(segment0_blkaddr));
 
if (c.zoned_mode && (get_sb(segment0_blkaddr) + c.start_sector /
-   c.sectors_per_blk) % c.zone_blocks) {
+   DEFAULT_SECTORS_PER_BLOCK) % 
c.zone_blocks) {
MSG(1, "\tError: Unaligned segment0 block address %u\n",
get_sb(segment0_blkaddr));
return -1;
-- 
1.8.5.2



[PATCH] f2fs-tools: fix the sector_size to default value

2018-05-04 Thread Yunlong Song
f2fs-tools uses ioctl BLKSSZGET to get sector_size, however, this ioctl
will return a value which may be larger than 512 (according to the value
of q->limits.logical_block_size), then this will be inconsistent with
the start_sector, since start_sector is got from ioctl HDIO_GETGEO and
is always in 512 size unit for a sector. To fix this problem, just set
the sector_size to the default value 512.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 lib/libf2fs.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/lib/libf2fs.c b/lib/libf2fs.c
index 102e579..e160f2a 100644
--- a/lib/libf2fs.c
+++ b/lib/libf2fs.c
@@ -768,7 +768,6 @@ void get_kernel_uname_version(__u8 *version)
 int get_device_info(int i)
 {
int32_t fd = 0;
-   uint32_t sector_size;
 #ifndef BLKGETSIZE64
uint32_t total_sectors;
 #endif
@@ -822,12 +821,6 @@ int get_device_info(int i)
} else if (S_ISREG(stat_buf->st_mode)) {
dev->total_sectors = stat_buf->st_size / dev->sector_size;
} else if (S_ISBLK(stat_buf->st_mode)) {
-#ifdef BLKSSZGET
-   if (ioctl(fd, BLKSSZGET, _size) < 0)
-   MSG(0, "\tError: Using the default sector size\n");
-   else if (dev->sector_size < sector_size)
-   dev->sector_size = sector_size;
-#endif
 #ifdef BLKGETSIZE64
if (ioctl(fd, BLKGETSIZE64, >total_sectors) < 0) {
MSG(0, "\tError: Cannot get the device size\n");
-- 
1.8.5.2



[PATCH] f2fs-tools: fix the sector_size to default value

2018-05-04 Thread Yunlong Song
f2fs-tools uses ioctl BLKSSZGET to get sector_size, however, this ioctl
will return a value which may be larger than 512 (according to the value
of q->limits.logical_block_size), then this will be inconsistent with
the start_sector, since start_sector is got from ioctl HDIO_GETGEO and
is always in 512 size unit for a sector. To fix this problem, just set
the sector_size to the default value 512.

Signed-off-by: Yunlong Song 
---
 lib/libf2fs.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/lib/libf2fs.c b/lib/libf2fs.c
index 102e579..e160f2a 100644
--- a/lib/libf2fs.c
+++ b/lib/libf2fs.c
@@ -768,7 +768,6 @@ void get_kernel_uname_version(__u8 *version)
 int get_device_info(int i)
 {
int32_t fd = 0;
-   uint32_t sector_size;
 #ifndef BLKGETSIZE64
uint32_t total_sectors;
 #endif
@@ -822,12 +821,6 @@ int get_device_info(int i)
} else if (S_ISREG(stat_buf->st_mode)) {
dev->total_sectors = stat_buf->st_size / dev->sector_size;
} else if (S_ISBLK(stat_buf->st_mode)) {
-#ifdef BLKSSZGET
-   if (ioctl(fd, BLKSSZGET, _size) < 0)
-   MSG(0, "\tError: Using the default sector size\n");
-   else if (dev->sector_size < sector_size)
-   dev->sector_size = sector_size;
-#endif
 #ifdef BLKGETSIZE64
if (ioctl(fd, BLKGETSIZE64, >total_sectors) < 0) {
MSG(0, "\tError: Cannot get the device size\n");
-- 
1.8.5.2



Re: [PATCH] f2fs: fix the way to wake up issue_flush thread

2018-05-03 Thread Yunlong Song

Please avoid this patch, I make a mistake.

On 2018/5/3 15:45, Yunlong Song wrote:

Commit 6f890df0 ("f2fs: fix out-of-order execution in f2fs_issue_flush")
uses waitqueue_active to wake up issue_flush thread, but there is no
wait entry to wake in this queue, so change it back to use the original
fcc->dispatch_list to wake up issue_flush thread.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
  fs/f2fs/segment.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 503a98a..5aa5ee4e 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -612,7 +612,7 @@ int f2fs_issue_flush(struct f2fs_sb_info *sbi, nid_t ino)
/* update issue_list before we wake up issue_flush thread */
smp_mb();
  
-	if (waitqueue_active(>flush_wait_queue))

+   if (!fcc->dispatch_list)
wake_up(>flush_wait_queue);
  
  	if (fcc->f2fs_issue_flush) {


--
Thanks,
Yunlong Song




Re: [PATCH] f2fs: fix the way to wake up issue_flush thread

2018-05-03 Thread Yunlong Song

Please avoid this patch, I make a mistake.

On 2018/5/3 15:45, Yunlong Song wrote:

Commit 6f890df0 ("f2fs: fix out-of-order execution in f2fs_issue_flush")
uses waitqueue_active to wake up issue_flush thread, but there is no
wait entry to wake in this queue, so change it back to use the original
fcc->dispatch_list to wake up issue_flush thread.

Signed-off-by: Yunlong Song 
---
  fs/f2fs/segment.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 503a98a..5aa5ee4e 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -612,7 +612,7 @@ int f2fs_issue_flush(struct f2fs_sb_info *sbi, nid_t ino)
/* update issue_list before we wake up issue_flush thread */
smp_mb();
  
-	if (waitqueue_active(>flush_wait_queue))

+   if (!fcc->dispatch_list)
wake_up(>flush_wait_queue);
  
  	if (fcc->f2fs_issue_flush) {


--
Thanks,
Yunlong Song




[PATCH] f2fs: fix the way to wake up issue_flush thread

2018-05-03 Thread Yunlong Song
Commit 6f890df0 ("f2fs: fix out-of-order execution in f2fs_issue_flush")
uses waitqueue_active to wake up issue_flush thread, but there is no
wait entry to wake in this queue, so change it back to use the original
fcc->dispatch_list to wake up issue_flush thread.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/segment.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 503a98a..5aa5ee4e 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -612,7 +612,7 @@ int f2fs_issue_flush(struct f2fs_sb_info *sbi, nid_t ino)
/* update issue_list before we wake up issue_flush thread */
smp_mb();
 
-   if (waitqueue_active(>flush_wait_queue))
+   if (!fcc->dispatch_list)
wake_up(>flush_wait_queue);
 
if (fcc->f2fs_issue_flush) {
-- 
1.8.5.2



[PATCH] f2fs: fix the way to wake up issue_flush thread

2018-05-03 Thread Yunlong Song
Commit 6f890df0 ("f2fs: fix out-of-order execution in f2fs_issue_flush")
uses waitqueue_active to wake up issue_flush thread, but there is no
wait entry to wake in this queue, so change it back to use the original
fcc->dispatch_list to wake up issue_flush thread.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/segment.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 503a98a..5aa5ee4e 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -612,7 +612,7 @@ int f2fs_issue_flush(struct f2fs_sb_info *sbi, nid_t ino)
/* update issue_list before we wake up issue_flush thread */
smp_mb();
 
-   if (waitqueue_active(>flush_wait_queue))
+   if (!fcc->dispatch_list)
wake_up(>flush_wait_queue);
 
if (fcc->f2fs_issue_flush) {
-- 
1.8.5.2



[PATCH] f2fs: remove unmatched zero_user_segment when convert inline dentry

2018-04-03 Thread Yunlong Song
Since the layout of regular dentry block is different from inline dentry
block, zero_user_segment starting from MAX_INLINE_DATA(dir) is not
correct for regular dentry block, besides, bitmap is already copied and
used, so there is no necessary to zero page at all, so just remove the
zero_user_segment is OK.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/inline.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/f2fs/inline.c b/fs/f2fs/inline.c
index 3b77d64..573ec2f 100644
--- a/fs/f2fs/inline.c
+++ b/fs/f2fs/inline.c
@@ -367,7 +367,6 @@ static int f2fs_move_inline_dirents(struct inode *dir, 
struct page *ipage,
goto out;
 
f2fs_wait_on_page_writeback(page, DATA, true);
-   zero_user_segment(page, MAX_INLINE_DATA(dir), PAGE_SIZE);
 
dentry_blk = page_address(page);
 
-- 
1.8.5.2



[PATCH] f2fs: remove unmatched zero_user_segment when convert inline dentry

2018-04-03 Thread Yunlong Song
Since the layout of regular dentry block is different from inline dentry
block, zero_user_segment starting from MAX_INLINE_DATA(dir) is not
correct for regular dentry block, besides, bitmap is already copied and
used, so there is no necessary to zero page at all, so just remove the
zero_user_segment is OK.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/inline.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/f2fs/inline.c b/fs/f2fs/inline.c
index 3b77d64..573ec2f 100644
--- a/fs/f2fs/inline.c
+++ b/fs/f2fs/inline.c
@@ -367,7 +367,6 @@ static int f2fs_move_inline_dirents(struct inode *dir, 
struct page *ipage,
goto out;
 
f2fs_wait_on_page_writeback(page, DATA, true);
-   zero_user_segment(page, MAX_INLINE_DATA(dir), PAGE_SIZE);
 
dentry_blk = page_address(page);
 
-- 
1.8.5.2



[PATCH] f2fs: make assignment of t->dentry_bitmap more readable

2018-04-02 Thread Yunlong Song
In make_dentry_ptr_block, it is confused with "&" for t->dentry_bitmap
but without "&" for t->dentry, so delete "&" to make code more readable.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/f2fs.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 191ee57..474b9e9 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -471,7 +471,7 @@ static inline void make_dentry_ptr_block(struct inode 
*inode,
d->inode = inode;
d->max = NR_DENTRY_IN_BLOCK;
d->nr_bitmap = SIZE_OF_DENTRY_BITMAP;
-   d->bitmap = >dentry_bitmap;
+   d->bitmap = t->dentry_bitmap;
d->dentry = t->dentry;
d->filename = t->filename;
 }
-- 
1.8.5.2



[PATCH] f2fs: make assignment of t->dentry_bitmap more readable

2018-04-02 Thread Yunlong Song
In make_dentry_ptr_block, it is confused with "&" for t->dentry_bitmap
but without "&" for t->dentry, so delete "&" to make code more readable.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/f2fs.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 191ee57..474b9e9 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -471,7 +471,7 @@ static inline void make_dentry_ptr_block(struct inode 
*inode,
d->inode = inode;
d->max = NR_DENTRY_IN_BLOCK;
d->nr_bitmap = SIZE_OF_DENTRY_BITMAP;
-   d->bitmap = >dentry_bitmap;
+   d->bitmap = t->dentry_bitmap;
d->dentry = t->dentry;
d->filename = t->filename;
 }
-- 
1.8.5.2



[PATCH v2] f2fs: no need to initialize zero value for GFP_F2FS_ZERO

2018-03-21 Thread Yunlong Song
Since f2fs_inode_info is allocated with flag GFP_F2FS_ZERO, so we do not
need to initialize zero value for its member any more.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/super.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 0c1fe9b..42d564c 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -827,7 +827,6 @@ static struct inode *f2fs_alloc_inode(struct super_block 
*sb)
/* Initialize f2fs-specific inode info */
atomic_set(>dirty_pages, 0);
fi->i_current_depth = 1;
-   fi->i_advise = 0;
init_rwsem(>i_sem);
INIT_LIST_HEAD(>dirty_list);
INIT_LIST_HEAD(>gdirty_list);
@@ -839,10 +838,6 @@ static struct inode *f2fs_alloc_inode(struct super_block 
*sb)
init_rwsem(>i_mmap_sem);
init_rwsem(>i_xattr_sem);
 
-#ifdef CONFIG_QUOTA
-   memset(>i_dquot, 0, sizeof(fi->i_dquot));
-   fi->i_reserved_quota = 0;
-#endif
/* Will be used by directory only */
fi->i_dir_level = F2FS_SB(sb)->dir_level;
 
-- 
1.8.5.2



[PATCH v2] f2fs: no need to initialize zero value for GFP_F2FS_ZERO

2018-03-21 Thread Yunlong Song
Since f2fs_inode_info is allocated with flag GFP_F2FS_ZERO, so we do not
need to initialize zero value for its member any more.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/super.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 0c1fe9b..42d564c 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -827,7 +827,6 @@ static struct inode *f2fs_alloc_inode(struct super_block 
*sb)
/* Initialize f2fs-specific inode info */
atomic_set(>dirty_pages, 0);
fi->i_current_depth = 1;
-   fi->i_advise = 0;
init_rwsem(>i_sem);
INIT_LIST_HEAD(>dirty_list);
INIT_LIST_HEAD(>gdirty_list);
@@ -839,10 +838,6 @@ static struct inode *f2fs_alloc_inode(struct super_block 
*sb)
init_rwsem(>i_mmap_sem);
init_rwsem(>i_xattr_sem);
 
-#ifdef CONFIG_QUOTA
-   memset(>i_dquot, 0, sizeof(fi->i_dquot));
-   fi->i_reserved_quota = 0;
-#endif
/* Will be used by directory only */
fi->i_dir_level = F2FS_SB(sb)->dir_level;
 
-- 
1.8.5.2



[PATCH] f2fs: no need to initialize zero value for GFP_F2FS_ZERO

2018-03-21 Thread Yunlong Song
Since f2fs_inode_info is allocated with flag GFP_F2FS_ZERO, so we do not
need to initialize zero value for its member any more.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/super.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 0c1fe9b..3a7fa03 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -825,9 +825,7 @@ static struct inode *f2fs_alloc_inode(struct super_block 
*sb)
init_once((void *) fi);
 
/* Initialize f2fs-specific inode info */
-   atomic_set(>dirty_pages, 0);
fi->i_current_depth = 1;
-   fi->i_advise = 0;
init_rwsem(>i_sem);
INIT_LIST_HEAD(>dirty_list);
INIT_LIST_HEAD(>gdirty_list);
@@ -839,10 +837,6 @@ static struct inode *f2fs_alloc_inode(struct super_block 
*sb)
init_rwsem(>i_mmap_sem);
init_rwsem(>i_xattr_sem);
 
-#ifdef CONFIG_QUOTA
-   memset(>i_dquot, 0, sizeof(fi->i_dquot));
-   fi->i_reserved_quota = 0;
-#endif
/* Will be used by directory only */
fi->i_dir_level = F2FS_SB(sb)->dir_level;
 
-- 
1.8.5.2



[PATCH] f2fs: no need to initialize zero value for GFP_F2FS_ZERO

2018-03-21 Thread Yunlong Song
Since f2fs_inode_info is allocated with flag GFP_F2FS_ZERO, so we do not
need to initialize zero value for its member any more.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/super.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 0c1fe9b..3a7fa03 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -825,9 +825,7 @@ static struct inode *f2fs_alloc_inode(struct super_block 
*sb)
init_once((void *) fi);
 
/* Initialize f2fs-specific inode info */
-   atomic_set(>dirty_pages, 0);
fi->i_current_depth = 1;
-   fi->i_advise = 0;
init_rwsem(>i_sem);
INIT_LIST_HEAD(>dirty_list);
INIT_LIST_HEAD(>gdirty_list);
@@ -839,10 +837,6 @@ static struct inode *f2fs_alloc_inode(struct super_block 
*sb)
init_rwsem(>i_mmap_sem);
init_rwsem(>i_xattr_sem);
 
-#ifdef CONFIG_QUOTA
-   memset(>i_dquot, 0, sizeof(fi->i_dquot));
-   fi->i_reserved_quota = 0;
-#endif
/* Will be used by directory only */
fi->i_dir_level = F2FS_SB(sb)->dir_level;
 
-- 
1.8.5.2



Re: [PATCH] f2fs: don't put dentry page in pagecache into highmem

2018-03-19 Thread Yunlong Song

OK, got it.

On 2018/3/19 14:23, Jaegeuk Kim wrote:

On 03/19, Yunlong Song wrote:

Hi, Jaegeuk,
 I find this patch is removed from current branch of dev-test
recently, why? Any bugs?

Moved into the beginning of the tree for cherry-picking into f2fs-stable.

Thanks,


On 2018/2/28 20:31, Yunlong Song wrote:

Previous dentry page uses highmem, which will cause panic in platforms
using highmem (such as arm), since the address space of dentry pages
from highmem directly goes into the decryption path via the function
fscrypt_fname_disk_to_usr. But sg_init_one assumes the address is not
from highmem, and then cause panic since it doesn't call kmap_high but
kunmap_high is triggered at the end. To fix this problem in a simple
way, this patch avoids to put dentry page in pagecache into highmem.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
   fs/f2fs/dir.c   | 23 +--
   fs/f2fs/f2fs.h  |  6 --
   fs/f2fs/inline.c|  3 +--
   fs/f2fs/inode.c |  2 +-
   fs/f2fs/namei.c | 14 +-
   fs/f2fs/recovery.c  | 11 +--
   include/linux/f2fs_fs.h |  1 -
   7 files changed, 13 insertions(+), 47 deletions(-)

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index f00b5ed..797eb05 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -94,14 +94,12 @@ static struct f2fs_dir_entry *find_in_block(struct page 
*dentry_page,
struct f2fs_dir_entry *de;
struct f2fs_dentry_ptr d;
-   dentry_blk = (struct f2fs_dentry_block *)kmap(dentry_page);
+   dentry_blk = (struct f2fs_dentry_block *)page_address(dentry_page);
make_dentry_ptr_block(NULL, , dentry_blk);
de = find_target_dentry(fname, namehash, max_slots, );
if (de)
*res_page = dentry_page;
-   else
-   kunmap(dentry_page);
return de;
   }
@@ -287,7 +285,6 @@ ino_t f2fs_inode_by_name(struct inode *dir, const struct 
qstr *qstr,
de = f2fs_find_entry(dir, qstr, page);
if (de) {
res = le32_to_cpu(de->ino);
-   f2fs_dentry_kunmap(dir, *page);
f2fs_put_page(*page, 0);
}
@@ -302,7 +299,6 @@ void f2fs_set_link(struct inode *dir, struct f2fs_dir_entry 
*de,
f2fs_wait_on_page_writeback(page, type, true);
de->ino = cpu_to_le32(inode->i_ino);
set_de_type(de, inode->i_mode);
-   f2fs_dentry_kunmap(dir, page);
set_page_dirty(page);
dir->i_mtime = dir->i_ctime = current_time(dir);
@@ -350,13 +346,11 @@ static int make_empty_dir(struct inode *inode,
if (IS_ERR(dentry_page))
return PTR_ERR(dentry_page);
-   dentry_blk = kmap_atomic(dentry_page);
+   dentry_blk = page_address(dentry_page);
make_dentry_ptr_block(NULL, , dentry_blk);
do_make_empty_dir(inode, parent, );
-   kunmap_atomic(dentry_blk);
-
set_page_dirty(dentry_page);
f2fs_put_page(dentry_page, 1);
return 0;
@@ -547,13 +541,12 @@ int f2fs_add_regular_entry(struct inode *dir, const 
struct qstr *new_name,
if (IS_ERR(dentry_page))
return PTR_ERR(dentry_page);
-   dentry_blk = kmap(dentry_page);
+   dentry_blk = page_address(dentry_page);
bit_pos = room_for_filename(_blk->dentry_bitmap,
slots, NR_DENTRY_IN_BLOCK);
if (bit_pos < NR_DENTRY_IN_BLOCK)
goto add_dentry;
-   kunmap(dentry_page);
f2fs_put_page(dentry_page, 1);
}
@@ -588,7 +581,6 @@ int f2fs_add_regular_entry(struct inode *dir, const struct 
qstr *new_name,
if (inode)
up_write(_I(inode)->i_sem);
-   kunmap(dentry_page);
f2fs_put_page(dentry_page, 1);
return err;
@@ -642,7 +634,6 @@ int __f2fs_add_link(struct inode *dir, const struct qstr 
*name,
F2FS_I(dir)->task = NULL;
}
if (de) {
-   f2fs_dentry_kunmap(dir, page);
f2fs_put_page(page, 0);
err = -EEXIST;
} else if (IS_ERR(page)) {
@@ -730,7 +721,6 @@ void f2fs_delete_entry(struct f2fs_dir_entry *dentry, 
struct page *page,
bit_pos = find_next_bit_le(_blk->dentry_bitmap,
NR_DENTRY_IN_BLOCK,
0);
-   kunmap(page); /* kunmap - pair of f2fs_find_entry */
set_page_dirty(page);
dir->i_ctime = dir->i_mtime = current_time(dir);
@@ -775,7 +765,7 @@ bool f2fs_empty_dir(struct inode *dir)
return false;
}
-   dentry_blk = kmap_atomic(dentry_page);
+   dentry_blk = page_address(dentry_page);
if (bidx == 0)
bit_pos = 2;
else
@@ -783,7 +773,6 @@ bool f2fs_empty_dir(struct inode *dir)
b

Re: [PATCH] f2fs: don't put dentry page in pagecache into highmem

2018-03-19 Thread Yunlong Song

OK, got it.

On 2018/3/19 14:23, Jaegeuk Kim wrote:

On 03/19, Yunlong Song wrote:

Hi, Jaegeuk,
 I find this patch is removed from current branch of dev-test
recently, why? Any bugs?

Moved into the beginning of the tree for cherry-picking into f2fs-stable.

Thanks,


On 2018/2/28 20:31, Yunlong Song wrote:

Previous dentry page uses highmem, which will cause panic in platforms
using highmem (such as arm), since the address space of dentry pages
from highmem directly goes into the decryption path via the function
fscrypt_fname_disk_to_usr. But sg_init_one assumes the address is not
from highmem, and then cause panic since it doesn't call kmap_high but
kunmap_high is triggered at the end. To fix this problem in a simple
way, this patch avoids to put dentry page in pagecache into highmem.

Signed-off-by: Yunlong Song 
---
   fs/f2fs/dir.c   | 23 +--
   fs/f2fs/f2fs.h  |  6 --
   fs/f2fs/inline.c|  3 +--
   fs/f2fs/inode.c |  2 +-
   fs/f2fs/namei.c | 14 +-
   fs/f2fs/recovery.c  | 11 +--
   include/linux/f2fs_fs.h |  1 -
   7 files changed, 13 insertions(+), 47 deletions(-)

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index f00b5ed..797eb05 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -94,14 +94,12 @@ static struct f2fs_dir_entry *find_in_block(struct page 
*dentry_page,
struct f2fs_dir_entry *de;
struct f2fs_dentry_ptr d;
-   dentry_blk = (struct f2fs_dentry_block *)kmap(dentry_page);
+   dentry_blk = (struct f2fs_dentry_block *)page_address(dentry_page);
make_dentry_ptr_block(NULL, , dentry_blk);
de = find_target_dentry(fname, namehash, max_slots, );
if (de)
*res_page = dentry_page;
-   else
-   kunmap(dentry_page);
return de;
   }
@@ -287,7 +285,6 @@ ino_t f2fs_inode_by_name(struct inode *dir, const struct 
qstr *qstr,
de = f2fs_find_entry(dir, qstr, page);
if (de) {
res = le32_to_cpu(de->ino);
-   f2fs_dentry_kunmap(dir, *page);
f2fs_put_page(*page, 0);
}
@@ -302,7 +299,6 @@ void f2fs_set_link(struct inode *dir, struct f2fs_dir_entry 
*de,
f2fs_wait_on_page_writeback(page, type, true);
de->ino = cpu_to_le32(inode->i_ino);
set_de_type(de, inode->i_mode);
-   f2fs_dentry_kunmap(dir, page);
set_page_dirty(page);
dir->i_mtime = dir->i_ctime = current_time(dir);
@@ -350,13 +346,11 @@ static int make_empty_dir(struct inode *inode,
if (IS_ERR(dentry_page))
return PTR_ERR(dentry_page);
-   dentry_blk = kmap_atomic(dentry_page);
+   dentry_blk = page_address(dentry_page);
make_dentry_ptr_block(NULL, , dentry_blk);
do_make_empty_dir(inode, parent, );
-   kunmap_atomic(dentry_blk);
-
set_page_dirty(dentry_page);
f2fs_put_page(dentry_page, 1);
return 0;
@@ -547,13 +541,12 @@ int f2fs_add_regular_entry(struct inode *dir, const 
struct qstr *new_name,
if (IS_ERR(dentry_page))
return PTR_ERR(dentry_page);
-   dentry_blk = kmap(dentry_page);
+   dentry_blk = page_address(dentry_page);
bit_pos = room_for_filename(_blk->dentry_bitmap,
slots, NR_DENTRY_IN_BLOCK);
if (bit_pos < NR_DENTRY_IN_BLOCK)
goto add_dentry;
-   kunmap(dentry_page);
f2fs_put_page(dentry_page, 1);
}
@@ -588,7 +581,6 @@ int f2fs_add_regular_entry(struct inode *dir, const struct 
qstr *new_name,
if (inode)
up_write(_I(inode)->i_sem);
-   kunmap(dentry_page);
f2fs_put_page(dentry_page, 1);
return err;
@@ -642,7 +634,6 @@ int __f2fs_add_link(struct inode *dir, const struct qstr 
*name,
F2FS_I(dir)->task = NULL;
}
if (de) {
-   f2fs_dentry_kunmap(dir, page);
f2fs_put_page(page, 0);
err = -EEXIST;
} else if (IS_ERR(page)) {
@@ -730,7 +721,6 @@ void f2fs_delete_entry(struct f2fs_dir_entry *dentry, 
struct page *page,
bit_pos = find_next_bit_le(_blk->dentry_bitmap,
NR_DENTRY_IN_BLOCK,
0);
-   kunmap(page); /* kunmap - pair of f2fs_find_entry */
set_page_dirty(page);
dir->i_ctime = dir->i_mtime = current_time(dir);
@@ -775,7 +765,7 @@ bool f2fs_empty_dir(struct inode *dir)
return false;
}
-   dentry_blk = kmap_atomic(dentry_page);
+   dentry_blk = page_address(dentry_page);
if (bidx == 0)
bit_pos = 2;
else
@@ -783,7 +773,6 @@ bool f2fs_empty_dir(struct inode *dir)
bit_pos

Re: [PATCH] f2fs: don't put dentry page in pagecache into highmem

2018-03-18 Thread Yunlong Song

Hi, Jaegeuk,
I find this patch is removed from current branch of dev-test
recently, why? Any bugs?

On 2018/2/28 20:31, Yunlong Song wrote:

Previous dentry page uses highmem, which will cause panic in platforms
using highmem (such as arm), since the address space of dentry pages
from highmem directly goes into the decryption path via the function
fscrypt_fname_disk_to_usr. But sg_init_one assumes the address is not
from highmem, and then cause panic since it doesn't call kmap_high but
kunmap_high is triggered at the end. To fix this problem in a simple
way, this patch avoids to put dentry page in pagecache into highmem.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
  fs/f2fs/dir.c   | 23 +--
  fs/f2fs/f2fs.h  |  6 --
  fs/f2fs/inline.c|  3 +--
  fs/f2fs/inode.c |  2 +-
  fs/f2fs/namei.c | 14 +-
  fs/f2fs/recovery.c  | 11 +--
  include/linux/f2fs_fs.h |  1 -
  7 files changed, 13 insertions(+), 47 deletions(-)

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index f00b5ed..797eb05 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -94,14 +94,12 @@ static struct f2fs_dir_entry *find_in_block(struct page 
*dentry_page,
struct f2fs_dir_entry *de;
struct f2fs_dentry_ptr d;
  
-	dentry_blk = (struct f2fs_dentry_block *)kmap(dentry_page);

+   dentry_blk = (struct f2fs_dentry_block *)page_address(dentry_page);
  
  	make_dentry_ptr_block(NULL, , dentry_blk);

de = find_target_dentry(fname, namehash, max_slots, );
if (de)
*res_page = dentry_page;
-   else
-   kunmap(dentry_page);
  
  	return de;

  }
@@ -287,7 +285,6 @@ ino_t f2fs_inode_by_name(struct inode *dir, const struct 
qstr *qstr,
de = f2fs_find_entry(dir, qstr, page);
if (de) {
res = le32_to_cpu(de->ino);
-   f2fs_dentry_kunmap(dir, *page);
f2fs_put_page(*page, 0);
}
  
@@ -302,7 +299,6 @@ void f2fs_set_link(struct inode *dir, struct f2fs_dir_entry *de,

f2fs_wait_on_page_writeback(page, type, true);
de->ino = cpu_to_le32(inode->i_ino);
set_de_type(de, inode->i_mode);
-   f2fs_dentry_kunmap(dir, page);
set_page_dirty(page);
  
  	dir->i_mtime = dir->i_ctime = current_time(dir);

@@ -350,13 +346,11 @@ static int make_empty_dir(struct inode *inode,
if (IS_ERR(dentry_page))
return PTR_ERR(dentry_page);
  
-	dentry_blk = kmap_atomic(dentry_page);

+   dentry_blk = page_address(dentry_page);
  
  	make_dentry_ptr_block(NULL, , dentry_blk);

do_make_empty_dir(inode, parent, );
  
-	kunmap_atomic(dentry_blk);

-
set_page_dirty(dentry_page);
f2fs_put_page(dentry_page, 1);
return 0;
@@ -547,13 +541,12 @@ int f2fs_add_regular_entry(struct inode *dir, const 
struct qstr *new_name,
if (IS_ERR(dentry_page))
return PTR_ERR(dentry_page);
  
-		dentry_blk = kmap(dentry_page);

+   dentry_blk = page_address(dentry_page);
bit_pos = room_for_filename(_blk->dentry_bitmap,
slots, NR_DENTRY_IN_BLOCK);
if (bit_pos < NR_DENTRY_IN_BLOCK)
goto add_dentry;
  
-		kunmap(dentry_page);

f2fs_put_page(dentry_page, 1);
}
  
@@ -588,7 +581,6 @@ int f2fs_add_regular_entry(struct inode *dir, const struct qstr *new_name,

if (inode)
up_write(_I(inode)->i_sem);
  
-	kunmap(dentry_page);

f2fs_put_page(dentry_page, 1);
  
  	return err;

@@ -642,7 +634,6 @@ int __f2fs_add_link(struct inode *dir, const struct qstr 
*name,
F2FS_I(dir)->task = NULL;
}
if (de) {
-   f2fs_dentry_kunmap(dir, page);
f2fs_put_page(page, 0);
err = -EEXIST;
} else if (IS_ERR(page)) {
@@ -730,7 +721,6 @@ void f2fs_delete_entry(struct f2fs_dir_entry *dentry, 
struct page *page,
bit_pos = find_next_bit_le(_blk->dentry_bitmap,
NR_DENTRY_IN_BLOCK,
0);
-   kunmap(page); /* kunmap - pair of f2fs_find_entry */
set_page_dirty(page);
  
  	dir->i_ctime = dir->i_mtime = current_time(dir);

@@ -775,7 +765,7 @@ bool f2fs_empty_dir(struct inode *dir)
return false;
}
  
-		dentry_blk = kmap_atomic(dentry_page);

+   dentry_blk = page_address(dentry_page);
if (bidx == 0)
bit_pos = 2;
else
@@ -783,7 +773,6 @@ bool f2fs_empty_dir(struct inode *dir)
bit_pos = find_next_bit_le(_blk->dentry_bitmap,
NR_DENTRY_IN_BLOCK,
bit_pos);
-   kunmap_atomic(dentry_blk);
  
  		f2fs

Re: [PATCH] f2fs: don't put dentry page in pagecache into highmem

2018-03-18 Thread Yunlong Song

Hi, Jaegeuk,
I find this patch is removed from current branch of dev-test
recently, why? Any bugs?

On 2018/2/28 20:31, Yunlong Song wrote:

Previous dentry page uses highmem, which will cause panic in platforms
using highmem (such as arm), since the address space of dentry pages
from highmem directly goes into the decryption path via the function
fscrypt_fname_disk_to_usr. But sg_init_one assumes the address is not
from highmem, and then cause panic since it doesn't call kmap_high but
kunmap_high is triggered at the end. To fix this problem in a simple
way, this patch avoids to put dentry page in pagecache into highmem.

Signed-off-by: Yunlong Song 
---
  fs/f2fs/dir.c   | 23 +--
  fs/f2fs/f2fs.h  |  6 --
  fs/f2fs/inline.c|  3 +--
  fs/f2fs/inode.c |  2 +-
  fs/f2fs/namei.c | 14 +-
  fs/f2fs/recovery.c  | 11 +--
  include/linux/f2fs_fs.h |  1 -
  7 files changed, 13 insertions(+), 47 deletions(-)

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index f00b5ed..797eb05 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -94,14 +94,12 @@ static struct f2fs_dir_entry *find_in_block(struct page 
*dentry_page,
struct f2fs_dir_entry *de;
struct f2fs_dentry_ptr d;
  
-	dentry_blk = (struct f2fs_dentry_block *)kmap(dentry_page);

+   dentry_blk = (struct f2fs_dentry_block *)page_address(dentry_page);
  
  	make_dentry_ptr_block(NULL, , dentry_blk);

de = find_target_dentry(fname, namehash, max_slots, );
if (de)
*res_page = dentry_page;
-   else
-   kunmap(dentry_page);
  
  	return de;

  }
@@ -287,7 +285,6 @@ ino_t f2fs_inode_by_name(struct inode *dir, const struct 
qstr *qstr,
de = f2fs_find_entry(dir, qstr, page);
if (de) {
res = le32_to_cpu(de->ino);
-   f2fs_dentry_kunmap(dir, *page);
f2fs_put_page(*page, 0);
}
  
@@ -302,7 +299,6 @@ void f2fs_set_link(struct inode *dir, struct f2fs_dir_entry *de,

f2fs_wait_on_page_writeback(page, type, true);
de->ino = cpu_to_le32(inode->i_ino);
set_de_type(de, inode->i_mode);
-   f2fs_dentry_kunmap(dir, page);
set_page_dirty(page);
  
  	dir->i_mtime = dir->i_ctime = current_time(dir);

@@ -350,13 +346,11 @@ static int make_empty_dir(struct inode *inode,
if (IS_ERR(dentry_page))
return PTR_ERR(dentry_page);
  
-	dentry_blk = kmap_atomic(dentry_page);

+   dentry_blk = page_address(dentry_page);
  
  	make_dentry_ptr_block(NULL, , dentry_blk);

do_make_empty_dir(inode, parent, );
  
-	kunmap_atomic(dentry_blk);

-
set_page_dirty(dentry_page);
f2fs_put_page(dentry_page, 1);
return 0;
@@ -547,13 +541,12 @@ int f2fs_add_regular_entry(struct inode *dir, const 
struct qstr *new_name,
if (IS_ERR(dentry_page))
return PTR_ERR(dentry_page);
  
-		dentry_blk = kmap(dentry_page);

+   dentry_blk = page_address(dentry_page);
bit_pos = room_for_filename(_blk->dentry_bitmap,
slots, NR_DENTRY_IN_BLOCK);
if (bit_pos < NR_DENTRY_IN_BLOCK)
goto add_dentry;
  
-		kunmap(dentry_page);

f2fs_put_page(dentry_page, 1);
}
  
@@ -588,7 +581,6 @@ int f2fs_add_regular_entry(struct inode *dir, const struct qstr *new_name,

if (inode)
up_write(_I(inode)->i_sem);
  
-	kunmap(dentry_page);

f2fs_put_page(dentry_page, 1);
  
  	return err;

@@ -642,7 +634,6 @@ int __f2fs_add_link(struct inode *dir, const struct qstr 
*name,
F2FS_I(dir)->task = NULL;
}
if (de) {
-   f2fs_dentry_kunmap(dir, page);
f2fs_put_page(page, 0);
err = -EEXIST;
} else if (IS_ERR(page)) {
@@ -730,7 +721,6 @@ void f2fs_delete_entry(struct f2fs_dir_entry *dentry, 
struct page *page,
bit_pos = find_next_bit_le(_blk->dentry_bitmap,
NR_DENTRY_IN_BLOCK,
0);
-   kunmap(page); /* kunmap - pair of f2fs_find_entry */
set_page_dirty(page);
  
  	dir->i_ctime = dir->i_mtime = current_time(dir);

@@ -775,7 +765,7 @@ bool f2fs_empty_dir(struct inode *dir)
return false;
}
  
-		dentry_blk = kmap_atomic(dentry_page);

+   dentry_blk = page_address(dentry_page);
if (bidx == 0)
bit_pos = 2;
else
@@ -783,7 +773,6 @@ bool f2fs_empty_dir(struct inode *dir)
bit_pos = find_next_bit_le(_blk->dentry_bitmap,
NR_DENTRY_IN_BLOCK,
bit_pos);
-   kunmap_atomic(dentry_blk);
  
  		f2fs_put_page(dentry_page, 1);
  
@@ 

Re: [PATCH v4] Revert "f2fs crypto: avoid unneeded memory allocation in ->readdir"

2018-02-28 Thread Yunlong Song

On 2018/2/28 13:48, Jaegeuk Kim wrote:

Hi Yunlong,

As Eric pointed out, how do you think using nohighmem for directory likewise
ext4, which looks like more efficient?


OK, I have sent out another patch like this.

 Actually, we don't need to do this in

most of recent kernels, right?



Why? I have got this panic using arm with recent kernel.



Re: [PATCH v4] Revert "f2fs crypto: avoid unneeded memory allocation in ->readdir"

2018-02-28 Thread Yunlong Song

On 2018/2/28 13:48, Jaegeuk Kim wrote:

Hi Yunlong,

As Eric pointed out, how do you think using nohighmem for directory likewise
ext4, which looks like more efficient?


OK, I have sent out another patch like this.

 Actually, we don't need to do this in

most of recent kernels, right?



Why? I have got this panic using arm with recent kernel.



[PATCH] f2fs: don't put dentry page in pagecache into highmem

2018-02-28 Thread Yunlong Song
Previous dentry page uses highmem, which will cause panic in platforms
using highmem (such as arm), since the address space of dentry pages
from highmem directly goes into the decryption path via the function
fscrypt_fname_disk_to_usr. But sg_init_one assumes the address is not
from highmem, and then cause panic since it doesn't call kmap_high but
kunmap_high is triggered at the end. To fix this problem in a simple
way, this patch avoids to put dentry page in pagecache into highmem.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/dir.c   | 23 +--
 fs/f2fs/f2fs.h  |  6 --
 fs/f2fs/inline.c|  3 +--
 fs/f2fs/inode.c |  2 +-
 fs/f2fs/namei.c | 14 +-
 fs/f2fs/recovery.c  | 11 +--
 include/linux/f2fs_fs.h |  1 -
 7 files changed, 13 insertions(+), 47 deletions(-)

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index f00b5ed..797eb05 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -94,14 +94,12 @@ static struct f2fs_dir_entry *find_in_block(struct page 
*dentry_page,
struct f2fs_dir_entry *de;
struct f2fs_dentry_ptr d;
 
-   dentry_blk = (struct f2fs_dentry_block *)kmap(dentry_page);
+   dentry_blk = (struct f2fs_dentry_block *)page_address(dentry_page);
 
make_dentry_ptr_block(NULL, , dentry_blk);
de = find_target_dentry(fname, namehash, max_slots, );
if (de)
*res_page = dentry_page;
-   else
-   kunmap(dentry_page);
 
return de;
 }
@@ -287,7 +285,6 @@ ino_t f2fs_inode_by_name(struct inode *dir, const struct 
qstr *qstr,
de = f2fs_find_entry(dir, qstr, page);
if (de) {
res = le32_to_cpu(de->ino);
-   f2fs_dentry_kunmap(dir, *page);
f2fs_put_page(*page, 0);
}
 
@@ -302,7 +299,6 @@ void f2fs_set_link(struct inode *dir, struct f2fs_dir_entry 
*de,
f2fs_wait_on_page_writeback(page, type, true);
de->ino = cpu_to_le32(inode->i_ino);
set_de_type(de, inode->i_mode);
-   f2fs_dentry_kunmap(dir, page);
set_page_dirty(page);
 
dir->i_mtime = dir->i_ctime = current_time(dir);
@@ -350,13 +346,11 @@ static int make_empty_dir(struct inode *inode,
if (IS_ERR(dentry_page))
return PTR_ERR(dentry_page);
 
-   dentry_blk = kmap_atomic(dentry_page);
+   dentry_blk = page_address(dentry_page);
 
make_dentry_ptr_block(NULL, , dentry_blk);
do_make_empty_dir(inode, parent, );
 
-   kunmap_atomic(dentry_blk);
-
set_page_dirty(dentry_page);
f2fs_put_page(dentry_page, 1);
return 0;
@@ -547,13 +541,12 @@ int f2fs_add_regular_entry(struct inode *dir, const 
struct qstr *new_name,
if (IS_ERR(dentry_page))
return PTR_ERR(dentry_page);
 
-   dentry_blk = kmap(dentry_page);
+   dentry_blk = page_address(dentry_page);
bit_pos = room_for_filename(_blk->dentry_bitmap,
slots, NR_DENTRY_IN_BLOCK);
if (bit_pos < NR_DENTRY_IN_BLOCK)
goto add_dentry;
 
-   kunmap(dentry_page);
f2fs_put_page(dentry_page, 1);
}
 
@@ -588,7 +581,6 @@ int f2fs_add_regular_entry(struct inode *dir, const struct 
qstr *new_name,
if (inode)
up_write(_I(inode)->i_sem);
 
-   kunmap(dentry_page);
f2fs_put_page(dentry_page, 1);
 
return err;
@@ -642,7 +634,6 @@ int __f2fs_add_link(struct inode *dir, const struct qstr 
*name,
F2FS_I(dir)->task = NULL;
}
if (de) {
-   f2fs_dentry_kunmap(dir, page);
f2fs_put_page(page, 0);
err = -EEXIST;
} else if (IS_ERR(page)) {
@@ -730,7 +721,6 @@ void f2fs_delete_entry(struct f2fs_dir_entry *dentry, 
struct page *page,
bit_pos = find_next_bit_le(_blk->dentry_bitmap,
NR_DENTRY_IN_BLOCK,
0);
-   kunmap(page); /* kunmap - pair of f2fs_find_entry */
set_page_dirty(page);
 
dir->i_ctime = dir->i_mtime = current_time(dir);
@@ -775,7 +765,7 @@ bool f2fs_empty_dir(struct inode *dir)
return false;
}
 
-   dentry_blk = kmap_atomic(dentry_page);
+   dentry_blk = page_address(dentry_page);
if (bidx == 0)
bit_pos = 2;
else
@@ -783,7 +773,6 @@ bool f2fs_empty_dir(struct inode *dir)
bit_pos = find_next_bit_le(_blk->dentry_bitmap,
NR_DENTRY_IN_BLOCK,
bit_pos);
-   kunmap_atomic(dentry_blk);
 
f2fs_put_page(dentry_page, 1);
 
@@ -901,19 +890,17 @@ static int f2fs_rea

[PATCH] f2fs: don't put dentry page in pagecache into highmem

2018-02-28 Thread Yunlong Song
Previous dentry page uses highmem, which will cause panic in platforms
using highmem (such as arm), since the address space of dentry pages
from highmem directly goes into the decryption path via the function
fscrypt_fname_disk_to_usr. But sg_init_one assumes the address is not
from highmem, and then cause panic since it doesn't call kmap_high but
kunmap_high is triggered at the end. To fix this problem in a simple
way, this patch avoids to put dentry page in pagecache into highmem.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/dir.c   | 23 +--
 fs/f2fs/f2fs.h  |  6 --
 fs/f2fs/inline.c|  3 +--
 fs/f2fs/inode.c |  2 +-
 fs/f2fs/namei.c | 14 +-
 fs/f2fs/recovery.c  | 11 +--
 include/linux/f2fs_fs.h |  1 -
 7 files changed, 13 insertions(+), 47 deletions(-)

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index f00b5ed..797eb05 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -94,14 +94,12 @@ static struct f2fs_dir_entry *find_in_block(struct page 
*dentry_page,
struct f2fs_dir_entry *de;
struct f2fs_dentry_ptr d;
 
-   dentry_blk = (struct f2fs_dentry_block *)kmap(dentry_page);
+   dentry_blk = (struct f2fs_dentry_block *)page_address(dentry_page);
 
make_dentry_ptr_block(NULL, , dentry_blk);
de = find_target_dentry(fname, namehash, max_slots, );
if (de)
*res_page = dentry_page;
-   else
-   kunmap(dentry_page);
 
return de;
 }
@@ -287,7 +285,6 @@ ino_t f2fs_inode_by_name(struct inode *dir, const struct 
qstr *qstr,
de = f2fs_find_entry(dir, qstr, page);
if (de) {
res = le32_to_cpu(de->ino);
-   f2fs_dentry_kunmap(dir, *page);
f2fs_put_page(*page, 0);
}
 
@@ -302,7 +299,6 @@ void f2fs_set_link(struct inode *dir, struct f2fs_dir_entry 
*de,
f2fs_wait_on_page_writeback(page, type, true);
de->ino = cpu_to_le32(inode->i_ino);
set_de_type(de, inode->i_mode);
-   f2fs_dentry_kunmap(dir, page);
set_page_dirty(page);
 
dir->i_mtime = dir->i_ctime = current_time(dir);
@@ -350,13 +346,11 @@ static int make_empty_dir(struct inode *inode,
if (IS_ERR(dentry_page))
return PTR_ERR(dentry_page);
 
-   dentry_blk = kmap_atomic(dentry_page);
+   dentry_blk = page_address(dentry_page);
 
make_dentry_ptr_block(NULL, , dentry_blk);
do_make_empty_dir(inode, parent, );
 
-   kunmap_atomic(dentry_blk);
-
set_page_dirty(dentry_page);
f2fs_put_page(dentry_page, 1);
return 0;
@@ -547,13 +541,12 @@ int f2fs_add_regular_entry(struct inode *dir, const 
struct qstr *new_name,
if (IS_ERR(dentry_page))
return PTR_ERR(dentry_page);
 
-   dentry_blk = kmap(dentry_page);
+   dentry_blk = page_address(dentry_page);
bit_pos = room_for_filename(_blk->dentry_bitmap,
slots, NR_DENTRY_IN_BLOCK);
if (bit_pos < NR_DENTRY_IN_BLOCK)
goto add_dentry;
 
-   kunmap(dentry_page);
f2fs_put_page(dentry_page, 1);
}
 
@@ -588,7 +581,6 @@ int f2fs_add_regular_entry(struct inode *dir, const struct 
qstr *new_name,
if (inode)
up_write(_I(inode)->i_sem);
 
-   kunmap(dentry_page);
f2fs_put_page(dentry_page, 1);
 
return err;
@@ -642,7 +634,6 @@ int __f2fs_add_link(struct inode *dir, const struct qstr 
*name,
F2FS_I(dir)->task = NULL;
}
if (de) {
-   f2fs_dentry_kunmap(dir, page);
f2fs_put_page(page, 0);
err = -EEXIST;
} else if (IS_ERR(page)) {
@@ -730,7 +721,6 @@ void f2fs_delete_entry(struct f2fs_dir_entry *dentry, 
struct page *page,
bit_pos = find_next_bit_le(_blk->dentry_bitmap,
NR_DENTRY_IN_BLOCK,
0);
-   kunmap(page); /* kunmap - pair of f2fs_find_entry */
set_page_dirty(page);
 
dir->i_ctime = dir->i_mtime = current_time(dir);
@@ -775,7 +765,7 @@ bool f2fs_empty_dir(struct inode *dir)
return false;
}
 
-   dentry_blk = kmap_atomic(dentry_page);
+   dentry_blk = page_address(dentry_page);
if (bidx == 0)
bit_pos = 2;
else
@@ -783,7 +773,6 @@ bool f2fs_empty_dir(struct inode *dir)
bit_pos = find_next_bit_le(_blk->dentry_bitmap,
NR_DENTRY_IN_BLOCK,
bit_pos);
-   kunmap_atomic(dentry_blk);
 
f2fs_put_page(dentry_page, 1);
 
@@ -901,19 +890,17 @@ static int f2fs_readdir(s

[PATCH v4] Revert "f2fs crypto: avoid unneeded memory allocation in ->readdir"

2018-02-27 Thread Yunlong Song
This reverts commit e06f86e61d7a67fe6e826010f57aa39c674f4b1b.

Conflicts:
fs/f2fs/dir.c

In some platforms (such as arm), high memory is used, then the
decrypting filename will cause panic, the reason see commit
569cf1876a32e574ba8a7fb825cd91bafd003882 ("f2fs crypto: allocate buffer
for decrypting filename"):

 We got dentry pages from high_mem, and its address space directly goes into the
 decryption path via f2fs_fname_disk_to_usr.
 But, sg_init_one assumes the address is not from high_mem, so we can get this
 panic since it doesn't call kmap_high but kunmap_high is triggered at the end.

 kernel BUG at ../../../../../../kernel/mm/highmem.c:290!
 Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
 ...
  (kunmap_high+0xb0/0xb8) from [] (__kunmap_atomic+0xa0/0xa4)
  (__kunmap_atomic+0xa0/0xa4) from [] 
(blkcipher_walk_done+0x128/0x1ec)
  (blkcipher_walk_done+0x128/0x1ec) from [] 
(crypto_cbc_decrypt+0xc0/0x170)
  (crypto_cbc_decrypt+0xc0/0x170) from [] 
(crypto_cts_decrypt+0xc0/0x114)
  (crypto_cts_decrypt+0xc0/0x114) from [] (async_decrypt+0x40/0x48)
  (async_decrypt+0x40/0x48) from [] 
(f2fs_fname_disk_to_usr+0x124/0x304)
  (f2fs_fname_disk_to_usr+0x124/0x304) from [] 
(f2fs_fill_dentries+0xac/0x188)
  (f2fs_fill_dentries+0xac/0x188) from [] (f2fs_readdir+0x1f0/0x300)
  (f2fs_readdir+0x1f0/0x300) from [] (vfs_readdir+0x90/0xb4)
  (vfs_readdir+0x90/0xb4) from [] (SyS_getdents64+0x64/0xcc)
  (SyS_getdents64+0x64/0xcc) from [] (ret_fast_syscall+0x0/0x30)

Howerver, later patch:
commit e06f86e61d7a ("f2fs crypto: avoid unneeded memory allocation in 
->readdir")
reverts the codes, which causes panic again in arm, so fix it back to the old 
version.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
Reviewed-by: Chao Yu <yuch...@huawei.com>
---
 fs/f2fs/dir.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index f00b5ed..de2e295 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -825,9 +825,16 @@ int f2fs_fill_dentries(struct dir_context *ctx, struct 
f2fs_dentry_ptr *d,
int save_len = fstr->len;
int err;
 
+   de_name.name = f2fs_kmalloc(sbi, de_name.len, GFP_NOFS);
+   if (!de_name.name)
+   return -ENOMEM;
+
+   memcpy(de_name.name, d->filename[bit_pos], de_name.len);
+
err = fscrypt_fname_disk_to_usr(d->inode,
(u32)de->hash_code, 0,
_name, fstr);
+   kfree(de_name.name);
if (err)
return err;
 
-- 
1.8.5.2



[PATCH v4] Revert "f2fs crypto: avoid unneeded memory allocation in ->readdir"

2018-02-27 Thread Yunlong Song
This reverts commit e06f86e61d7a67fe6e826010f57aa39c674f4b1b.

Conflicts:
fs/f2fs/dir.c

In some platforms (such as arm), high memory is used, then the
decrypting filename will cause panic, the reason see commit
569cf1876a32e574ba8a7fb825cd91bafd003882 ("f2fs crypto: allocate buffer
for decrypting filename"):

 We got dentry pages from high_mem, and its address space directly goes into the
 decryption path via f2fs_fname_disk_to_usr.
 But, sg_init_one assumes the address is not from high_mem, so we can get this
 panic since it doesn't call kmap_high but kunmap_high is triggered at the end.

 kernel BUG at ../../../../../../kernel/mm/highmem.c:290!
 Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
 ...
  (kunmap_high+0xb0/0xb8) from [] (__kunmap_atomic+0xa0/0xa4)
  (__kunmap_atomic+0xa0/0xa4) from [] 
(blkcipher_walk_done+0x128/0x1ec)
  (blkcipher_walk_done+0x128/0x1ec) from [] 
(crypto_cbc_decrypt+0xc0/0x170)
  (crypto_cbc_decrypt+0xc0/0x170) from [] 
(crypto_cts_decrypt+0xc0/0x114)
  (crypto_cts_decrypt+0xc0/0x114) from [] (async_decrypt+0x40/0x48)
  (async_decrypt+0x40/0x48) from [] 
(f2fs_fname_disk_to_usr+0x124/0x304)
  (f2fs_fname_disk_to_usr+0x124/0x304) from [] 
(f2fs_fill_dentries+0xac/0x188)
  (f2fs_fill_dentries+0xac/0x188) from [] (f2fs_readdir+0x1f0/0x300)
  (f2fs_readdir+0x1f0/0x300) from [] (vfs_readdir+0x90/0xb4)
  (vfs_readdir+0x90/0xb4) from [] (SyS_getdents64+0x64/0xcc)
  (SyS_getdents64+0x64/0xcc) from [] (ret_fast_syscall+0x0/0x30)

Howerver, later patch:
commit e06f86e61d7a ("f2fs crypto: avoid unneeded memory allocation in 
->readdir")
reverts the codes, which causes panic again in arm, so fix it back to the old 
version.

Signed-off-by: Yunlong Song 
Reviewed-by: Chao Yu 
---
 fs/f2fs/dir.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index f00b5ed..de2e295 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -825,9 +825,16 @@ int f2fs_fill_dentries(struct dir_context *ctx, struct 
f2fs_dentry_ptr *d,
int save_len = fstr->len;
int err;
 
+   de_name.name = f2fs_kmalloc(sbi, de_name.len, GFP_NOFS);
+   if (!de_name.name)
+   return -ENOMEM;
+
+   memcpy(de_name.name, d->filename[bit_pos], de_name.len);
+
err = fscrypt_fname_disk_to_usr(d->inode,
(u32)de->hash_code, 0,
_name, fstr);
+   kfree(de_name.name);
if (err)
return err;
 
-- 
1.8.5.2



[PATCH v3] f2fs: allocate buffer for decrypting filename to avoid panic

2018-02-27 Thread Yunlong Song
In some platforms (such as arm), high memory is used, then the
decrypting filename will cause panic, the reason see commit
569cf1876a32e574ba8a7fb825cd91bafd003882 ("f2fs crypto: allocate buffer
for decrypting filename"):

 We got dentry pages from high_mem, and its address space directly goes into the
 decryption path via f2fs_fname_disk_to_usr.
 But, sg_init_one assumes the address is not from high_mem, so we can get this
 panic since it doesn't call kmap_high but kunmap_high is triggered at the end.

 kernel BUG at ../../../../../../kernel/mm/highmem.c:290!
 Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
 ...
  (kunmap_high+0xb0/0xb8) from [] (__kunmap_atomic+0xa0/0xa4)
  (__kunmap_atomic+0xa0/0xa4) from [] 
(blkcipher_walk_done+0x128/0x1ec)
  (blkcipher_walk_done+0x128/0x1ec) from [] 
(crypto_cbc_decrypt+0xc0/0x170)
  (crypto_cbc_decrypt+0xc0/0x170) from [] 
(crypto_cts_decrypt+0xc0/0x114)
  (crypto_cts_decrypt+0xc0/0x114) from [] (async_decrypt+0x40/0x48)
  (async_decrypt+0x40/0x48) from [] 
(f2fs_fname_disk_to_usr+0x124/0x304)
  (f2fs_fname_disk_to_usr+0x124/0x304) from [] 
(f2fs_fill_dentries+0xac/0x188)
  (f2fs_fill_dentries+0xac/0x188) from [] (f2fs_readdir+0x1f0/0x300)
  (f2fs_readdir+0x1f0/0x300) from [] (vfs_readdir+0x90/0xb4)
  (vfs_readdir+0x90/0xb4) from [] (SyS_getdents64+0x64/0xcc)
  (SyS_getdents64+0x64/0xcc) from [] (ret_fast_syscall+0x0/0x30)

Howerver, later patches:
commit e06f86e61d7a67fe6e826010f57aa39c674f4b1b ("f2fs crypto: avoid
unneeded memory allocation in ->readdir")

reverts the codes, which causes panic again in arm, so let's add part of
the old patch again for dentry page.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/dir.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index f00b5ed..de2e295 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -825,9 +825,16 @@ int f2fs_fill_dentries(struct dir_context *ctx, struct 
f2fs_dentry_ptr *d,
int save_len = fstr->len;
int err;
 
+   de_name.name = f2fs_kmalloc(sbi, de_name.len, GFP_NOFS);
+   if (!de_name.name)
+   return -ENOMEM;
+
+   memcpy(de_name.name, d->filename[bit_pos], de_name.len);
+
err = fscrypt_fname_disk_to_usr(d->inode,
(u32)de->hash_code, 0,
_name, fstr);
+   kfree(de_name.name);
if (err)
return err;
 
-- 
1.8.5.2



[PATCH v3] f2fs: allocate buffer for decrypting filename to avoid panic

2018-02-27 Thread Yunlong Song
In some platforms (such as arm), high memory is used, then the
decrypting filename will cause panic, the reason see commit
569cf1876a32e574ba8a7fb825cd91bafd003882 ("f2fs crypto: allocate buffer
for decrypting filename"):

 We got dentry pages from high_mem, and its address space directly goes into the
 decryption path via f2fs_fname_disk_to_usr.
 But, sg_init_one assumes the address is not from high_mem, so we can get this
 panic since it doesn't call kmap_high but kunmap_high is triggered at the end.

 kernel BUG at ../../../../../../kernel/mm/highmem.c:290!
 Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
 ...
  (kunmap_high+0xb0/0xb8) from [] (__kunmap_atomic+0xa0/0xa4)
  (__kunmap_atomic+0xa0/0xa4) from [] 
(blkcipher_walk_done+0x128/0x1ec)
  (blkcipher_walk_done+0x128/0x1ec) from [] 
(crypto_cbc_decrypt+0xc0/0x170)
  (crypto_cbc_decrypt+0xc0/0x170) from [] 
(crypto_cts_decrypt+0xc0/0x114)
  (crypto_cts_decrypt+0xc0/0x114) from [] (async_decrypt+0x40/0x48)
  (async_decrypt+0x40/0x48) from [] 
(f2fs_fname_disk_to_usr+0x124/0x304)
  (f2fs_fname_disk_to_usr+0x124/0x304) from [] 
(f2fs_fill_dentries+0xac/0x188)
  (f2fs_fill_dentries+0xac/0x188) from [] (f2fs_readdir+0x1f0/0x300)
  (f2fs_readdir+0x1f0/0x300) from [] (vfs_readdir+0x90/0xb4)
  (vfs_readdir+0x90/0xb4) from [] (SyS_getdents64+0x64/0xcc)
  (SyS_getdents64+0x64/0xcc) from [] (ret_fast_syscall+0x0/0x30)

Howerver, later patches:
commit e06f86e61d7a67fe6e826010f57aa39c674f4b1b ("f2fs crypto: avoid
unneeded memory allocation in ->readdir")

reverts the codes, which causes panic again in arm, so let's add part of
the old patch again for dentry page.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/dir.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index f00b5ed..de2e295 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -825,9 +825,16 @@ int f2fs_fill_dentries(struct dir_context *ctx, struct 
f2fs_dentry_ptr *d,
int save_len = fstr->len;
int err;
 
+   de_name.name = f2fs_kmalloc(sbi, de_name.len, GFP_NOFS);
+   if (!de_name.name)
+   return -ENOMEM;
+
+   memcpy(de_name.name, d->filename[bit_pos], de_name.len);
+
err = fscrypt_fname_disk_to_usr(d->inode,
(u32)de->hash_code, 0,
_name, fstr);
+   kfree(de_name.name);
if (err)
return err;
 
-- 
1.8.5.2



Re: [PATCH] f2fs: allocate buffer for decrypting filename to avoid panic

2018-02-25 Thread Yunlong Song

Hi, Eric,
Thanks for your review, I have removed the symlink part and use
kmemdup instead. As for directory pages restricted to lowmem, I am
not sure whether it is proper for f2fs, since the directory structures
of f2fs are different from ext4. So I just keep its old kmap.

On 2018/2/25 2:32, Eric Biggers wrote:

Hi Yunlong,

On Sat, Feb 24, 2018 at 05:14:58PM +0800, Yunlong Song wrote:

In some platforms (such as arm), high memory is used, then the
decrypting filename will cause panic, the reason see commit
569cf1876a32e574ba8a7fb825cd91bafd003882 ("f2fs crypto: allocate buffer
for decrypting filename"):

  We got dentry pages from high_mem, and its address space directly goes into 
the
  decryption path via f2fs_fname_disk_to_usr.
  But, sg_init_one assumes the address is not from high_mem, so we can get this
  panic since it doesn't call kmap_high but kunmap_high is triggered at the end.

  kernel BUG at ../../../../../../kernel/mm/highmem.c:290!
  Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
  ...
   (kunmap_high+0xb0/0xb8) from [] (__kunmap_atomic+0xa0/0xa4)
   (__kunmap_atomic+0xa0/0xa4) from [] 
(blkcipher_walk_done+0x128/0x1ec)
   (blkcipher_walk_done+0x128/0x1ec) from [] 
(crypto_cbc_decrypt+0xc0/0x170)
   (crypto_cbc_decrypt+0xc0/0x170) from [] 
(crypto_cts_decrypt+0xc0/0x114)
   (crypto_cts_decrypt+0xc0/0x114) from [] (async_decrypt+0x40/0x48)
   (async_decrypt+0x40/0x48) from [] 
(f2fs_fname_disk_to_usr+0x124/0x304)
   (f2fs_fname_disk_to_usr+0x124/0x304) from [] 
(f2fs_fill_dentries+0xac/0x188)
   (f2fs_fill_dentries+0xac/0x188) from [] (f2fs_readdir+0x1f0/0x300)
   (f2fs_readdir+0x1f0/0x300) from [] (vfs_readdir+0x90/0xb4)
   (vfs_readdir+0x90/0xb4) from [] (SyS_getdents64+0x64/0xcc)
   (SyS_getdents64+0x64/0xcc) from [] (ret_fast_syscall+0x0/0x30)

Howerver, later patches:
commit 922ec355f86365388203672119b5bca346a45085 ("f2fs crypto: avoid
unneeded memory allocation when {en/de}crypting symlink")
commit e06f86e61d7a67fe6e826010f57aa39c674f4b1b ("f2fs crypto: avoid
unneeded memory allocation in ->readdir")

reverts the codes, which causes panic again in arm, so let's add the old
patch again.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
  fs/f2fs/dir.c   |  7 +++
  fs/f2fs/namei.c | 10 +-
  2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index f00b5ed..c0cf3e7b 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -825,9 +825,16 @@ int f2fs_fill_dentries(struct dir_context *ctx, struct 
f2fs_dentry_ptr *d,
int save_len = fstr->len;
int err;
  
+			de_name.name = kmalloc(de_name.len, GFP_NOFS);

+   if (!de_name.name)
+   return -ENOMEM;
+
+   memcpy(de_name.name, d->filename[bit_pos], de_name.len);
+
err = fscrypt_fname_disk_to_usr(d->inode,
(u32)de->hash_code, 0,
_name, fstr);
+   kfree(de_name.name);
if (err)
return err;
  
diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c

index c4c94c7..2cb70c1 100644
--- a/fs/f2fs/namei.c
+++ b/fs/f2fs/namei.c
@@ -1170,8 +1170,13 @@ static const char *f2fs_encrypted_get_link(struct dentry 
*dentry,
  
  	/* Symlink is encrypted */

sd = (struct fscrypt_symlink_data *)caddr;
-   cstr.name = sd->encrypted_path;
cstr.len = le16_to_cpu(sd->len);
+   cstr.name = kmalloc(cstr.len, GFP_NOFS);
+   if (!cstr.name) {
+   res = -ENOMEM;
+   goto errout;
+   }
+   memcpy(cstr.name, sd->encrypted_path, cstr.len);
  
  	/* this is broken symlink case */

if (unlikely(cstr.len == 0)) {
@@ -1198,6 +1203,8 @@ static const char *f2fs_encrypted_get_link(struct dentry 
*dentry,
goto errout;
}
  
+	kfree(cstr.name);

+
paddr = pstr.name;
  
  	/* Null-terminate the name */

@@ -1207,6 +1214,7 @@ static const char *f2fs_encrypted_get_link(struct dentry 
*dentry,
set_delayed_call(done, kfree_link, paddr);
return paddr;
  errout:
+   kfree(cstr.name);
fscrypt_fname_free_buffer();
put_page(cpage);
return ERR_PTR(res);
--
1.8.5.2



The pagecache for symlinks in f2fs already uses lowmem only, so the change to
->get_link() isn't needed.  Also that part of the patch doesn't apply to
upstream.

For directories we may need your fix.  Note: kmalloc + memcpy should be replaced
with kmemdup.  But alternatively, directory pages could be restricted to lowmem
which would match ext4.

Eric

.



--
Thanks,
Yunlong Song



Re: [PATCH] f2fs: allocate buffer for decrypting filename to avoid panic

2018-02-25 Thread Yunlong Song

Hi, Eric,
Thanks for your review, I have removed the symlink part and use
kmemdup instead. As for directory pages restricted to lowmem, I am
not sure whether it is proper for f2fs, since the directory structures
of f2fs are different from ext4. So I just keep its old kmap.

On 2018/2/25 2:32, Eric Biggers wrote:

Hi Yunlong,

On Sat, Feb 24, 2018 at 05:14:58PM +0800, Yunlong Song wrote:

In some platforms (such as arm), high memory is used, then the
decrypting filename will cause panic, the reason see commit
569cf1876a32e574ba8a7fb825cd91bafd003882 ("f2fs crypto: allocate buffer
for decrypting filename"):

  We got dentry pages from high_mem, and its address space directly goes into 
the
  decryption path via f2fs_fname_disk_to_usr.
  But, sg_init_one assumes the address is not from high_mem, so we can get this
  panic since it doesn't call kmap_high but kunmap_high is triggered at the end.

  kernel BUG at ../../../../../../kernel/mm/highmem.c:290!
  Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
  ...
   (kunmap_high+0xb0/0xb8) from [] (__kunmap_atomic+0xa0/0xa4)
   (__kunmap_atomic+0xa0/0xa4) from [] 
(blkcipher_walk_done+0x128/0x1ec)
   (blkcipher_walk_done+0x128/0x1ec) from [] 
(crypto_cbc_decrypt+0xc0/0x170)
   (crypto_cbc_decrypt+0xc0/0x170) from [] 
(crypto_cts_decrypt+0xc0/0x114)
   (crypto_cts_decrypt+0xc0/0x114) from [] (async_decrypt+0x40/0x48)
   (async_decrypt+0x40/0x48) from [] 
(f2fs_fname_disk_to_usr+0x124/0x304)
   (f2fs_fname_disk_to_usr+0x124/0x304) from [] 
(f2fs_fill_dentries+0xac/0x188)
   (f2fs_fill_dentries+0xac/0x188) from [] (f2fs_readdir+0x1f0/0x300)
   (f2fs_readdir+0x1f0/0x300) from [] (vfs_readdir+0x90/0xb4)
   (vfs_readdir+0x90/0xb4) from [] (SyS_getdents64+0x64/0xcc)
   (SyS_getdents64+0x64/0xcc) from [] (ret_fast_syscall+0x0/0x30)

Howerver, later patches:
commit 922ec355f86365388203672119b5bca346a45085 ("f2fs crypto: avoid
unneeded memory allocation when {en/de}crypting symlink")
commit e06f86e61d7a67fe6e826010f57aa39c674f4b1b ("f2fs crypto: avoid
unneeded memory allocation in ->readdir")

reverts the codes, which causes panic again in arm, so let's add the old
patch again.

Signed-off-by: Yunlong Song 
---
  fs/f2fs/dir.c   |  7 +++
  fs/f2fs/namei.c | 10 +-
  2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index f00b5ed..c0cf3e7b 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -825,9 +825,16 @@ int f2fs_fill_dentries(struct dir_context *ctx, struct 
f2fs_dentry_ptr *d,
int save_len = fstr->len;
int err;
  
+			de_name.name = kmalloc(de_name.len, GFP_NOFS);

+   if (!de_name.name)
+   return -ENOMEM;
+
+   memcpy(de_name.name, d->filename[bit_pos], de_name.len);
+
err = fscrypt_fname_disk_to_usr(d->inode,
(u32)de->hash_code, 0,
_name, fstr);
+   kfree(de_name.name);
if (err)
return err;
  
diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c

index c4c94c7..2cb70c1 100644
--- a/fs/f2fs/namei.c
+++ b/fs/f2fs/namei.c
@@ -1170,8 +1170,13 @@ static const char *f2fs_encrypted_get_link(struct dentry 
*dentry,
  
  	/* Symlink is encrypted */

sd = (struct fscrypt_symlink_data *)caddr;
-   cstr.name = sd->encrypted_path;
cstr.len = le16_to_cpu(sd->len);
+   cstr.name = kmalloc(cstr.len, GFP_NOFS);
+   if (!cstr.name) {
+   res = -ENOMEM;
+   goto errout;
+   }
+   memcpy(cstr.name, sd->encrypted_path, cstr.len);
  
  	/* this is broken symlink case */

if (unlikely(cstr.len == 0)) {
@@ -1198,6 +1203,8 @@ static const char *f2fs_encrypted_get_link(struct dentry 
*dentry,
goto errout;
}
  
+	kfree(cstr.name);

+
paddr = pstr.name;
  
  	/* Null-terminate the name */

@@ -1207,6 +1214,7 @@ static const char *f2fs_encrypted_get_link(struct dentry 
*dentry,
set_delayed_call(done, kfree_link, paddr);
return paddr;
  errout:
+   kfree(cstr.name);
fscrypt_fname_free_buffer();
put_page(cpage);
return ERR_PTR(res);
--
1.8.5.2



The pagecache for symlinks in f2fs already uses lowmem only, so the change to
->get_link() isn't needed.  Also that part of the patch doesn't apply to
upstream.

For directories we may need your fix.  Note: kmalloc + memcpy should be replaced
with kmemdup.  But alternatively, directory pages could be restricted to lowmem
which would match ext4.

Eric

.



--
Thanks,
Yunlong Song



[PATCH v2] f2fs: allocate buffer for decrypting filename to avoid panic

2018-02-25 Thread Yunlong Song
In some platforms (such as arm), high memory is used, then the
decrypting filename will cause panic, the reason see commit
569cf1876a32e574ba8a7fb825cd91bafd003882 ("f2fs crypto: allocate buffer
for decrypting filename"):

 We got dentry pages from high_mem, and its address space directly goes into the
 decryption path via f2fs_fname_disk_to_usr.
 But, sg_init_one assumes the address is not from high_mem, so we can get this
 panic since it doesn't call kmap_high but kunmap_high is triggered at the end.

 kernel BUG at ../../../../../../kernel/mm/highmem.c:290!
 Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
 ...
  (kunmap_high+0xb0/0xb8) from [] (__kunmap_atomic+0xa0/0xa4)
  (__kunmap_atomic+0xa0/0xa4) from [] 
(blkcipher_walk_done+0x128/0x1ec)
  (blkcipher_walk_done+0x128/0x1ec) from [] 
(crypto_cbc_decrypt+0xc0/0x170)
  (crypto_cbc_decrypt+0xc0/0x170) from [] 
(crypto_cts_decrypt+0xc0/0x114)
  (crypto_cts_decrypt+0xc0/0x114) from [] (async_decrypt+0x40/0x48)
  (async_decrypt+0x40/0x48) from [] 
(f2fs_fname_disk_to_usr+0x124/0x304)
  (f2fs_fname_disk_to_usr+0x124/0x304) from [] 
(f2fs_fill_dentries+0xac/0x188)
  (f2fs_fill_dentries+0xac/0x188) from [] (f2fs_readdir+0x1f0/0x300)
  (f2fs_readdir+0x1f0/0x300) from [] (vfs_readdir+0x90/0xb4)
  (vfs_readdir+0x90/0xb4) from [] (SyS_getdents64+0x64/0xcc)
  (SyS_getdents64+0x64/0xcc) from [] (ret_fast_syscall+0x0/0x30)

Howerver, later patches:
commit e06f86e61d7a67fe6e826010f57aa39c674f4b1b ("f2fs crypto: avoid
unneeded memory allocation in ->readdir")

reverts the codes, which causes panic again in arm, so let's add part of
the old patch again for dentry page.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/dir.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index f00b5ed..23fff48 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -825,9 +825,15 @@ int f2fs_fill_dentries(struct dir_context *ctx, struct 
f2fs_dentry_ptr *d,
int save_len = fstr->len;
int err;
 
+   de_name.name = kmemdup(d->filename[bit_pos],
+   de_name.len, GFP_NOFS);
+   if (!de_name.name)
+   return -ENOMEM;
+
err = fscrypt_fname_disk_to_usr(d->inode,
(u32)de->hash_code, 0,
_name, fstr);
+   kfree(de_name.name);
if (err)
return err;
 
-- 
1.8.5.2



[PATCH v2] f2fs: allocate buffer for decrypting filename to avoid panic

2018-02-25 Thread Yunlong Song
In some platforms (such as arm), high memory is used, then the
decrypting filename will cause panic, the reason see commit
569cf1876a32e574ba8a7fb825cd91bafd003882 ("f2fs crypto: allocate buffer
for decrypting filename"):

 We got dentry pages from high_mem, and its address space directly goes into the
 decryption path via f2fs_fname_disk_to_usr.
 But, sg_init_one assumes the address is not from high_mem, so we can get this
 panic since it doesn't call kmap_high but kunmap_high is triggered at the end.

 kernel BUG at ../../../../../../kernel/mm/highmem.c:290!
 Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
 ...
  (kunmap_high+0xb0/0xb8) from [] (__kunmap_atomic+0xa0/0xa4)
  (__kunmap_atomic+0xa0/0xa4) from [] 
(blkcipher_walk_done+0x128/0x1ec)
  (blkcipher_walk_done+0x128/0x1ec) from [] 
(crypto_cbc_decrypt+0xc0/0x170)
  (crypto_cbc_decrypt+0xc0/0x170) from [] 
(crypto_cts_decrypt+0xc0/0x114)
  (crypto_cts_decrypt+0xc0/0x114) from [] (async_decrypt+0x40/0x48)
  (async_decrypt+0x40/0x48) from [] 
(f2fs_fname_disk_to_usr+0x124/0x304)
  (f2fs_fname_disk_to_usr+0x124/0x304) from [] 
(f2fs_fill_dentries+0xac/0x188)
  (f2fs_fill_dentries+0xac/0x188) from [] (f2fs_readdir+0x1f0/0x300)
  (f2fs_readdir+0x1f0/0x300) from [] (vfs_readdir+0x90/0xb4)
  (vfs_readdir+0x90/0xb4) from [] (SyS_getdents64+0x64/0xcc)
  (SyS_getdents64+0x64/0xcc) from [] (ret_fast_syscall+0x0/0x30)

Howerver, later patches:
commit e06f86e61d7a67fe6e826010f57aa39c674f4b1b ("f2fs crypto: avoid
unneeded memory allocation in ->readdir")

reverts the codes, which causes panic again in arm, so let's add part of
the old patch again for dentry page.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/dir.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index f00b5ed..23fff48 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -825,9 +825,15 @@ int f2fs_fill_dentries(struct dir_context *ctx, struct 
f2fs_dentry_ptr *d,
int save_len = fstr->len;
int err;
 
+   de_name.name = kmemdup(d->filename[bit_pos],
+   de_name.len, GFP_NOFS);
+   if (!de_name.name)
+   return -ENOMEM;
+
err = fscrypt_fname_disk_to_usr(d->inode,
(u32)de->hash_code, 0,
_name, fstr);
+   kfree(de_name.name);
if (err)
return err;
 
-- 
1.8.5.2



[PATCH] f2fs: allocate buffer for decrypting filename to avoid panic

2018-02-24 Thread Yunlong Song
In some platforms (such as arm), high memory is used, then the
decrypting filename will cause panic, the reason see commit
569cf1876a32e574ba8a7fb825cd91bafd003882 ("f2fs crypto: allocate buffer
for decrypting filename"):

 We got dentry pages from high_mem, and its address space directly goes into the
 decryption path via f2fs_fname_disk_to_usr.
 But, sg_init_one assumes the address is not from high_mem, so we can get this
 panic since it doesn't call kmap_high but kunmap_high is triggered at the end.

 kernel BUG at ../../../../../../kernel/mm/highmem.c:290!
 Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
 ...
  (kunmap_high+0xb0/0xb8) from [] (__kunmap_atomic+0xa0/0xa4)
  (__kunmap_atomic+0xa0/0xa4) from [] 
(blkcipher_walk_done+0x128/0x1ec)
  (blkcipher_walk_done+0x128/0x1ec) from [] 
(crypto_cbc_decrypt+0xc0/0x170)
  (crypto_cbc_decrypt+0xc0/0x170) from [] 
(crypto_cts_decrypt+0xc0/0x114)
  (crypto_cts_decrypt+0xc0/0x114) from [] (async_decrypt+0x40/0x48)
  (async_decrypt+0x40/0x48) from [] 
(f2fs_fname_disk_to_usr+0x124/0x304)
  (f2fs_fname_disk_to_usr+0x124/0x304) from [] 
(f2fs_fill_dentries+0xac/0x188)
  (f2fs_fill_dentries+0xac/0x188) from [] (f2fs_readdir+0x1f0/0x300)
  (f2fs_readdir+0x1f0/0x300) from [] (vfs_readdir+0x90/0xb4)
  (vfs_readdir+0x90/0xb4) from [] (SyS_getdents64+0x64/0xcc)
  (SyS_getdents64+0x64/0xcc) from [] (ret_fast_syscall+0x0/0x30)

Howerver, later patches:
commit 922ec355f86365388203672119b5bca346a45085 ("f2fs crypto: avoid
unneeded memory allocation when {en/de}crypting symlink")
commit e06f86e61d7a67fe6e826010f57aa39c674f4b1b ("f2fs crypto: avoid
unneeded memory allocation in ->readdir")

reverts the codes, which causes panic again in arm, so let's add the old
patch again.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/dir.c   |  7 +++
 fs/f2fs/namei.c | 10 +-
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index f00b5ed..c0cf3e7b 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -825,9 +825,16 @@ int f2fs_fill_dentries(struct dir_context *ctx, struct 
f2fs_dentry_ptr *d,
int save_len = fstr->len;
int err;
 
+   de_name.name = kmalloc(de_name.len, GFP_NOFS);
+   if (!de_name.name)
+   return -ENOMEM;
+
+   memcpy(de_name.name, d->filename[bit_pos], de_name.len);
+
err = fscrypt_fname_disk_to_usr(d->inode,
(u32)de->hash_code, 0,
_name, fstr);
+   kfree(de_name.name);
if (err)
return err;
 
diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c
index c4c94c7..2cb70c1 100644
--- a/fs/f2fs/namei.c
+++ b/fs/f2fs/namei.c
@@ -1170,8 +1170,13 @@ static const char *f2fs_encrypted_get_link(struct dentry 
*dentry,
 
/* Symlink is encrypted */
sd = (struct fscrypt_symlink_data *)caddr;
-   cstr.name = sd->encrypted_path;
cstr.len = le16_to_cpu(sd->len);
+   cstr.name = kmalloc(cstr.len, GFP_NOFS);
+   if (!cstr.name) {
+   res = -ENOMEM;
+   goto errout;
+   }
+   memcpy(cstr.name, sd->encrypted_path, cstr.len);
 
/* this is broken symlink case */
if (unlikely(cstr.len == 0)) {
@@ -1198,6 +1203,8 @@ static const char *f2fs_encrypted_get_link(struct dentry 
*dentry,
goto errout;
}
 
+   kfree(cstr.name);
+
paddr = pstr.name;
 
/* Null-terminate the name */
@@ -1207,6 +1214,7 @@ static const char *f2fs_encrypted_get_link(struct dentry 
*dentry,
set_delayed_call(done, kfree_link, paddr);
return paddr;
 errout:
+   kfree(cstr.name);
fscrypt_fname_free_buffer();
put_page(cpage);
return ERR_PTR(res);
-- 
1.8.5.2



[PATCH] f2fs: allocate buffer for decrypting filename to avoid panic

2018-02-24 Thread Yunlong Song
In some platforms (such as arm), high memory is used, then the
decrypting filename will cause panic, the reason see commit
569cf1876a32e574ba8a7fb825cd91bafd003882 ("f2fs crypto: allocate buffer
for decrypting filename"):

 We got dentry pages from high_mem, and its address space directly goes into the
 decryption path via f2fs_fname_disk_to_usr.
 But, sg_init_one assumes the address is not from high_mem, so we can get this
 panic since it doesn't call kmap_high but kunmap_high is triggered at the end.

 kernel BUG at ../../../../../../kernel/mm/highmem.c:290!
 Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
 ...
  (kunmap_high+0xb0/0xb8) from [] (__kunmap_atomic+0xa0/0xa4)
  (__kunmap_atomic+0xa0/0xa4) from [] 
(blkcipher_walk_done+0x128/0x1ec)
  (blkcipher_walk_done+0x128/0x1ec) from [] 
(crypto_cbc_decrypt+0xc0/0x170)
  (crypto_cbc_decrypt+0xc0/0x170) from [] 
(crypto_cts_decrypt+0xc0/0x114)
  (crypto_cts_decrypt+0xc0/0x114) from [] (async_decrypt+0x40/0x48)
  (async_decrypt+0x40/0x48) from [] 
(f2fs_fname_disk_to_usr+0x124/0x304)
  (f2fs_fname_disk_to_usr+0x124/0x304) from [] 
(f2fs_fill_dentries+0xac/0x188)
  (f2fs_fill_dentries+0xac/0x188) from [] (f2fs_readdir+0x1f0/0x300)
  (f2fs_readdir+0x1f0/0x300) from [] (vfs_readdir+0x90/0xb4)
  (vfs_readdir+0x90/0xb4) from [] (SyS_getdents64+0x64/0xcc)
  (SyS_getdents64+0x64/0xcc) from [] (ret_fast_syscall+0x0/0x30)

Howerver, later patches:
commit 922ec355f86365388203672119b5bca346a45085 ("f2fs crypto: avoid
unneeded memory allocation when {en/de}crypting symlink")
commit e06f86e61d7a67fe6e826010f57aa39c674f4b1b ("f2fs crypto: avoid
unneeded memory allocation in ->readdir")

reverts the codes, which causes panic again in arm, so let's add the old
patch again.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/dir.c   |  7 +++
 fs/f2fs/namei.c | 10 +-
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index f00b5ed..c0cf3e7b 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -825,9 +825,16 @@ int f2fs_fill_dentries(struct dir_context *ctx, struct 
f2fs_dentry_ptr *d,
int save_len = fstr->len;
int err;
 
+   de_name.name = kmalloc(de_name.len, GFP_NOFS);
+   if (!de_name.name)
+   return -ENOMEM;
+
+   memcpy(de_name.name, d->filename[bit_pos], de_name.len);
+
err = fscrypt_fname_disk_to_usr(d->inode,
(u32)de->hash_code, 0,
_name, fstr);
+   kfree(de_name.name);
if (err)
return err;
 
diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c
index c4c94c7..2cb70c1 100644
--- a/fs/f2fs/namei.c
+++ b/fs/f2fs/namei.c
@@ -1170,8 +1170,13 @@ static const char *f2fs_encrypted_get_link(struct dentry 
*dentry,
 
/* Symlink is encrypted */
sd = (struct fscrypt_symlink_data *)caddr;
-   cstr.name = sd->encrypted_path;
cstr.len = le16_to_cpu(sd->len);
+   cstr.name = kmalloc(cstr.len, GFP_NOFS);
+   if (!cstr.name) {
+   res = -ENOMEM;
+   goto errout;
+   }
+   memcpy(cstr.name, sd->encrypted_path, cstr.len);
 
/* this is broken symlink case */
if (unlikely(cstr.len == 0)) {
@@ -1198,6 +1203,8 @@ static const char *f2fs_encrypted_get_link(struct dentry 
*dentry,
goto errout;
}
 
+   kfree(cstr.name);
+
paddr = pstr.name;
 
/* Null-terminate the name */
@@ -1207,6 +1214,7 @@ static const char *f2fs_encrypted_get_link(struct dentry 
*dentry,
set_delayed_call(done, kfree_link, paddr);
return paddr;
 errout:
+   kfree(cstr.name);
fscrypt_fname_free_buffer();
put_page(cpage);
return ERR_PTR(res);
-- 
1.8.5.2



Re: [PATCH] f2fs: set_code_data in move_data_block

2018-02-10 Thread Yunlong Song

OK, Got it.

On 2018/2/11 11:50, Chao Yu wrote:

On 2018/2/11 11:34, Yunlong Song wrote:

Ping...

move_data_block misses set_cold_data, then the F2FS_WB_CP_DATA will
lack these data pages in move_data_block, and write_checkpoint can
not make sure this pages committed to the flash.


Hmm.. data block migration is running based on meta inode, so it will
be safe since checkpoint will flush all meta pages including encrypted
pages cached in meta inode?

Thanks,



On 2018/2/8 20:33, Yunlong Song wrote:

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
   fs/f2fs/gc.c | 1 +
   1 file changed, 1 insertion(+)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index b9d93fd..2095630 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -692,6 +692,7 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
fio.op = REQ_OP_WRITE;
fio.op_flags = REQ_SYNC;
fio.new_blkaddr = newaddr;
+   set_cold_data(fio.page);
err = f2fs_submit_page_write();
if (err) {
if (PageWriteback(fio.encrypted_page))






.



--
Thanks,
Yunlong Song



Re: [PATCH] f2fs: set_code_data in move_data_block

2018-02-10 Thread Yunlong Song

OK, Got it.

On 2018/2/11 11:50, Chao Yu wrote:

On 2018/2/11 11:34, Yunlong Song wrote:

Ping...

move_data_block misses set_cold_data, then the F2FS_WB_CP_DATA will
lack these data pages in move_data_block, and write_checkpoint can
not make sure this pages committed to the flash.


Hmm.. data block migration is running based on meta inode, so it will
be safe since checkpoint will flush all meta pages including encrypted
pages cached in meta inode?

Thanks,



On 2018/2/8 20:33, Yunlong Song wrote:

Signed-off-by: Yunlong Song 
---
   fs/f2fs/gc.c | 1 +
   1 file changed, 1 insertion(+)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index b9d93fd..2095630 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -692,6 +692,7 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
fio.op = REQ_OP_WRITE;
fio.op_flags = REQ_SYNC;
fio.new_blkaddr = newaddr;
+   set_cold_data(fio.page);
err = f2fs_submit_page_write();
if (err) {
if (PageWriteback(fio.encrypted_page))






.



--
Thanks,
Yunlong Song



Re: [PATCH] f2fs: set_code_data in move_data_block

2018-02-10 Thread Yunlong Song

Ping...

move_data_block misses set_cold_data, then the F2FS_WB_CP_DATA will
lack these data pages in move_data_block, and write_checkpoint can
not make sure this pages committed to the flash.

On 2018/2/8 20:33, Yunlong Song wrote:

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
  fs/f2fs/gc.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index b9d93fd..2095630 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -692,6 +692,7 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
fio.op = REQ_OP_WRITE;
fio.op_flags = REQ_SYNC;
fio.new_blkaddr = newaddr;
+   set_cold_data(fio.page);
err = f2fs_submit_page_write();
if (err) {
if (PageWriteback(fio.encrypted_page))



--
Thanks,
Yunlong Song



Re: [PATCH] f2fs: set_code_data in move_data_block

2018-02-10 Thread Yunlong Song

Ping...

move_data_block misses set_cold_data, then the F2FS_WB_CP_DATA will
lack these data pages in move_data_block, and write_checkpoint can
not make sure this pages committed to the flash.

On 2018/2/8 20:33, Yunlong Song wrote:

Signed-off-by: Yunlong Song 
---
  fs/f2fs/gc.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index b9d93fd..2095630 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -692,6 +692,7 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
fio.op = REQ_OP_WRITE;
fio.op_flags = REQ_SYNC;
fio.new_blkaddr = newaddr;
+   set_cold_data(fio.page);
err = f2fs_submit_page_write();
if (err) {
if (PageWriteback(fio.encrypted_page))



--
Thanks,
Yunlong Song



Re: [f2fs-dev] [PATCH] f2fs: add fi->commit_lock to protect commit GCed pages

2018-02-09 Thread Yunlong Song

The problem is that you can not find a proper value of the threshold
time, when f2fs_gc select the GCed data page of the atomic file (which
has atomic started but not atomic committed yet), then f2fs_gc will
run into loop, and all the f2fs ops will be blocked in f2fs_balane_fs.
If the threshold time is set larger (e.g. 5s? Then all the f2fs ops will
block 5s, which will cause unexpected bad result of user experience).
And if the threshold time is set smaller (e.g. 500ms? Then the atomic
ops will probably fail frequently). BTW, some more patches are needed
to notify the atomic ops itself that it has run time out, and should
handle the inmem pages

Back to these two patches, why not use them to separate inmem pages
and GCed data pages in such a simple way.

On 2018/2/9 21:38, Chao Yu wrote:

On 2018/2/9 21:29, Yunlong Song wrote:

Back to the problem, if we skip out, then the f2fs_gc will go
into dead loop if the apps only atomic start but never atomic


That's another issue, which I have suggest to set a threshold time
to release atomic/volatile pages by balance_fs_bg.

Thanks,


commit. The main aim of my two patches is to remove the skip
action to avoid the dead loop.

On 2018/2/9 21:26, Chao Yu wrote:

On 2018/2/9 20:56, Yunlong Song wrote:

As what I point in last mail, if the atomic file is not committed
yet, gc_data_segment will register_inmem_page the GCed data pages.


We will skip GCing that page as below check:

- move_data_{page,block}
   - f2fs_is_atomic_file()
 skip out;

No?

Thanks,


This will cause these data pages written twice, the first write
happens in move_data_page->do_write_data_page, and the second
write happens in later __commit_inmem_pages->do_write_data_page.

On 2018/2/9 20:44, Chao Yu wrote:

On 2018/2/8 11:11, Yunlong Song wrote:

Then the GCed data pages are totally mixed with the inmem atomic pages,


If we add dio_rwsem, GC flow is exclude with atomic write flow. There
will be not race case to mix GCed page into atomic pages.

Or you mean:

  - gc_data_segment
   - move_data_page
- f2fs_is_atomic_file
- f2fs_ioc_start_atomic_write
- set_inode_flag(inode, FI_ATOMIC_FILE);
- f2fs_set_data_page_dirty
 - register_inmem_page

In this case, GCed page can be mixed into database transaction, but could
it cause any problem except break rule of isolation for transaction.


this will cause the atomic commit ops write the GCed data pages twice
(the first write happens in GC).

How about using the early two patches to separate the inmem data pages
and GCed data pages, and use dio_rwsem instead of this patch to fix the
dnode page problem (dnode page commited but data page are not committed
for the GCed page)?


Could we fix the race case first, based on that fixing, and then find the
place that we can improve?




On 2018/2/7 20:16, Chao Yu wrote:

On 2018/2/6 11:49, Yunlong Song wrote:

This patch adds fi->commit_lock to avoid the case that GCed node pages
are committed but GCed data pages are not committed. This can avoid the
db file run into inconsistent state when sudden-power-off happens if
data pages of atomic file is allowed to be GCed before.


do_fsync:GC:
- mutex_lock(>commit_lock);
   - lock_page()
- mutex_lock(>commit_lock);
 - lock_page()


Well, please consider lock dependency & code complexity, IMO, reuse
fi->dio_rwsem[WRITE] will be enough as below:

---
 fs/f2fs/file.c | 3 +++
 fs/f2fs/gc.c   | 5 -
 2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 672a542e5464..1bdc11feb8d0 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -1711,6 +1711,8 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)

 inode_lock(inode);

+down_write(_I(inode)->dio_rwsem[WRITE]);
+
 if (f2fs_is_volatile_file(inode))
 goto err_out;

@@ -1729,6 +1731,7 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
 ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
 }
 err_out:
+up_write(_I(inode)->dio_rwsem[WRITE]);
 inode_unlock(inode);
 mnt_drop_write_file(filp);
 return ret;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index b9d93fd532a9..e49416283563 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,9 +622,6 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
 if (!check_valid_map(F2FS_I_SB(inode), segno, off))
 goto out;

-if (f2fs_is_atomic_file(inode))
-goto out;


Seems that we need this check.


-
 if (f2fs_is_pinned_file(inode)) {
 f2fs_pin_file_control(inode, true);
 goto out;
@@ -729,8 +726,6 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
 if (!check_valid_map(F2FS_I_SB(inode), segno,

Re: [f2fs-dev] [PATCH] f2fs: add fi->commit_lock to protect commit GCed pages

2018-02-09 Thread Yunlong Song

The problem is that you can not find a proper value of the threshold
time, when f2fs_gc select the GCed data page of the atomic file (which
has atomic started but not atomic committed yet), then f2fs_gc will
run into loop, and all the f2fs ops will be blocked in f2fs_balane_fs.
If the threshold time is set larger (e.g. 5s? Then all the f2fs ops will
block 5s, which will cause unexpected bad result of user experience).
And if the threshold time is set smaller (e.g. 500ms? Then the atomic
ops will probably fail frequently). BTW, some more patches are needed
to notify the atomic ops itself that it has run time out, and should
handle the inmem pages

Back to these two patches, why not use them to separate inmem pages
and GCed data pages in such a simple way.

On 2018/2/9 21:38, Chao Yu wrote:

On 2018/2/9 21:29, Yunlong Song wrote:

Back to the problem, if we skip out, then the f2fs_gc will go
into dead loop if the apps only atomic start but never atomic


That's another issue, which I have suggest to set a threshold time
to release atomic/volatile pages by balance_fs_bg.

Thanks,


commit. The main aim of my two patches is to remove the skip
action to avoid the dead loop.

On 2018/2/9 21:26, Chao Yu wrote:

On 2018/2/9 20:56, Yunlong Song wrote:

As what I point in last mail, if the atomic file is not committed
yet, gc_data_segment will register_inmem_page the GCed data pages.


We will skip GCing that page as below check:

- move_data_{page,block}
   - f2fs_is_atomic_file()
 skip out;

No?

Thanks,


This will cause these data pages written twice, the first write
happens in move_data_page->do_write_data_page, and the second
write happens in later __commit_inmem_pages->do_write_data_page.

On 2018/2/9 20:44, Chao Yu wrote:

On 2018/2/8 11:11, Yunlong Song wrote:

Then the GCed data pages are totally mixed with the inmem atomic pages,


If we add dio_rwsem, GC flow is exclude with atomic write flow. There
will be not race case to mix GCed page into atomic pages.

Or you mean:

  - gc_data_segment
   - move_data_page
- f2fs_is_atomic_file
- f2fs_ioc_start_atomic_write
- set_inode_flag(inode, FI_ATOMIC_FILE);
- f2fs_set_data_page_dirty
 - register_inmem_page

In this case, GCed page can be mixed into database transaction, but could
it cause any problem except break rule of isolation for transaction.


this will cause the atomic commit ops write the GCed data pages twice
(the first write happens in GC).

How about using the early two patches to separate the inmem data pages
and GCed data pages, and use dio_rwsem instead of this patch to fix the
dnode page problem (dnode page commited but data page are not committed
for the GCed page)?


Could we fix the race case first, based on that fixing, and then find the
place that we can improve?




On 2018/2/7 20:16, Chao Yu wrote:

On 2018/2/6 11:49, Yunlong Song wrote:

This patch adds fi->commit_lock to avoid the case that GCed node pages
are committed but GCed data pages are not committed. This can avoid the
db file run into inconsistent state when sudden-power-off happens if
data pages of atomic file is allowed to be GCed before.


do_fsync:GC:
- mutex_lock(>commit_lock);
   - lock_page()
- mutex_lock(>commit_lock);
 - lock_page()


Well, please consider lock dependency & code complexity, IMO, reuse
fi->dio_rwsem[WRITE] will be enough as below:

---
 fs/f2fs/file.c | 3 +++
 fs/f2fs/gc.c   | 5 -
 2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 672a542e5464..1bdc11feb8d0 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -1711,6 +1711,8 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)

 inode_lock(inode);

+down_write(_I(inode)->dio_rwsem[WRITE]);
+
 if (f2fs_is_volatile_file(inode))
 goto err_out;

@@ -1729,6 +1731,7 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
 ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
 }
 err_out:
+up_write(_I(inode)->dio_rwsem[WRITE]);
 inode_unlock(inode);
 mnt_drop_write_file(filp);
 return ret;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index b9d93fd532a9..e49416283563 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,9 +622,6 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
 if (!check_valid_map(F2FS_I_SB(inode), segno, off))
 goto out;

-if (f2fs_is_atomic_file(inode))
-goto out;


Seems that we need this check.


-
 if (f2fs_is_pinned_file(inode)) {
 f2fs_pin_file_control(inode, true);
 goto out;
@@ -729,8 +726,6 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
 if (!check_valid_map(F2FS_I_SB(inode), segno,

Re: [PATCH] f2fs: add fi->commit_lock to protect commit GCed pages

2018-02-09 Thread Yunlong Song

Back to the problem, if we skip out, then the f2fs_gc will go
into dead loop if the apps only atomic start but never atomic
commit. The main aim of my two patches is to remove the skip
action to avoid the dead loop.

On 2018/2/9 21:26, Chao Yu wrote:

On 2018/2/9 20:56, Yunlong Song wrote:

As what I point in last mail, if the atomic file is not committed
yet, gc_data_segment will register_inmem_page the GCed data pages.


We will skip GCing that page as below check:

- move_data_{page,block}
  - f2fs_is_atomic_file()
skip out;

No?

Thanks,


This will cause these data pages written twice, the first write
happens in move_data_page->do_write_data_page, and the second
write happens in later __commit_inmem_pages->do_write_data_page.

On 2018/2/9 20:44, Chao Yu wrote:

On 2018/2/8 11:11, Yunlong Song wrote:

Then the GCed data pages are totally mixed with the inmem atomic pages,


If we add dio_rwsem, GC flow is exclude with atomic write flow. There
will be not race case to mix GCed page into atomic pages.

Or you mean:

 - gc_data_segment
  - move_data_page
   - f2fs_is_atomic_file
- f2fs_ioc_start_atomic_write
   - set_inode_flag(inode, FI_ATOMIC_FILE);
   - f2fs_set_data_page_dirty
- register_inmem_page

In this case, GCed page can be mixed into database transaction, but could
it cause any problem except break rule of isolation for transaction.


this will cause the atomic commit ops write the GCed data pages twice
(the first write happens in GC).

How about using the early two patches to separate the inmem data pages
and GCed data pages, and use dio_rwsem instead of this patch to fix the
dnode page problem (dnode page commited but data page are not committed
for the GCed page)?


Could we fix the race case first, based on that fixing, and then find the
place that we can improve?




On 2018/2/7 20:16, Chao Yu wrote:

On 2018/2/6 11:49, Yunlong Song wrote:

This patch adds fi->commit_lock to avoid the case that GCed node pages
are committed but GCed data pages are not committed. This can avoid the
db file run into inconsistent state when sudden-power-off happens if
data pages of atomic file is allowed to be GCed before.


do_fsync:GC:
- mutex_lock(>commit_lock);
  - lock_page()
   - mutex_lock(>commit_lock);
- lock_page()


Well, please consider lock dependency & code complexity, IMO, reuse
fi->dio_rwsem[WRITE] will be enough as below:

---
fs/f2fs/file.c | 3 +++
fs/f2fs/gc.c   | 5 -
2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 672a542e5464..1bdc11feb8d0 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -1711,6 +1711,8 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)

inode_lock(inode);

+down_write(_I(inode)->dio_rwsem[WRITE]);
+
if (f2fs_is_volatile_file(inode))
goto err_out;

@@ -1729,6 +1731,7 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
}
err_out:
+up_write(_I(inode)->dio_rwsem[WRITE]);
inode_unlock(inode);
mnt_drop_write_file(filp);
return ret;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index b9d93fd532a9..e49416283563 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,9 +622,6 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;

-if (f2fs_is_atomic_file(inode))
-goto out;


Seems that we need this check.


-
if (f2fs_is_pinned_file(inode)) {
f2fs_pin_file_control(inode, true);
goto out;
@@ -729,8 +726,6 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;

-if (f2fs_is_atomic_file(inode))
-goto out;


Ditto.

Thanks,


if (f2fs_is_pinned_file(inode)) {
if (gc_type == FG_GC)
f2fs_pin_file_control(inode, true);





.





.



--
Thanks,
Yunlong Song



Re: [PATCH] f2fs: add fi->commit_lock to protect commit GCed pages

2018-02-09 Thread Yunlong Song

Back to the problem, if we skip out, then the f2fs_gc will go
into dead loop if the apps only atomic start but never atomic
commit. The main aim of my two patches is to remove the skip
action to avoid the dead loop.

On 2018/2/9 21:26, Chao Yu wrote:

On 2018/2/9 20:56, Yunlong Song wrote:

As what I point in last mail, if the atomic file is not committed
yet, gc_data_segment will register_inmem_page the GCed data pages.


We will skip GCing that page as below check:

- move_data_{page,block}
  - f2fs_is_atomic_file()
skip out;

No?

Thanks,


This will cause these data pages written twice, the first write
happens in move_data_page->do_write_data_page, and the second
write happens in later __commit_inmem_pages->do_write_data_page.

On 2018/2/9 20:44, Chao Yu wrote:

On 2018/2/8 11:11, Yunlong Song wrote:

Then the GCed data pages are totally mixed with the inmem atomic pages,


If we add dio_rwsem, GC flow is exclude with atomic write flow. There
will be not race case to mix GCed page into atomic pages.

Or you mean:

 - gc_data_segment
  - move_data_page
   - f2fs_is_atomic_file
- f2fs_ioc_start_atomic_write
   - set_inode_flag(inode, FI_ATOMIC_FILE);
   - f2fs_set_data_page_dirty
- register_inmem_page

In this case, GCed page can be mixed into database transaction, but could
it cause any problem except break rule of isolation for transaction.


this will cause the atomic commit ops write the GCed data pages twice
(the first write happens in GC).

How about using the early two patches to separate the inmem data pages
and GCed data pages, and use dio_rwsem instead of this patch to fix the
dnode page problem (dnode page commited but data page are not committed
for the GCed page)?


Could we fix the race case first, based on that fixing, and then find the
place that we can improve?




On 2018/2/7 20:16, Chao Yu wrote:

On 2018/2/6 11:49, Yunlong Song wrote:

This patch adds fi->commit_lock to avoid the case that GCed node pages
are committed but GCed data pages are not committed. This can avoid the
db file run into inconsistent state when sudden-power-off happens if
data pages of atomic file is allowed to be GCed before.


do_fsync:GC:
- mutex_lock(>commit_lock);
  - lock_page()
   - mutex_lock(>commit_lock);
- lock_page()


Well, please consider lock dependency & code complexity, IMO, reuse
fi->dio_rwsem[WRITE] will be enough as below:

---
fs/f2fs/file.c | 3 +++
fs/f2fs/gc.c   | 5 -
2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 672a542e5464..1bdc11feb8d0 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -1711,6 +1711,8 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)

inode_lock(inode);

+down_write(_I(inode)->dio_rwsem[WRITE]);
+
if (f2fs_is_volatile_file(inode))
goto err_out;

@@ -1729,6 +1731,7 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
}
err_out:
+up_write(_I(inode)->dio_rwsem[WRITE]);
inode_unlock(inode);
mnt_drop_write_file(filp);
return ret;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index b9d93fd532a9..e49416283563 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,9 +622,6 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;

-if (f2fs_is_atomic_file(inode))
-goto out;


Seems that we need this check.


-
if (f2fs_is_pinned_file(inode)) {
f2fs_pin_file_control(inode, true);
goto out;
@@ -729,8 +726,6 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;

-if (f2fs_is_atomic_file(inode))
-goto out;


Ditto.

Thanks,


if (f2fs_is_pinned_file(inode)) {
if (gc_type == FG_GC)
f2fs_pin_file_control(inode, true);





.





.



--
Thanks,
Yunlong Song



Re: [PATCH] f2fs: add fi->commit_lock to protect commit GCed pages

2018-02-09 Thread Yunlong Song

As what I point in last mail, if the atomic file is not committed
yet, gc_data_segment will register_inmem_page the GCed data pages.
This will cause these data pages written twice, the first write
happens in move_data_page->do_write_data_page, and the second
write happens in later __commit_inmem_pages->do_write_data_page.

On 2018/2/9 20:44, Chao Yu wrote:

On 2018/2/8 11:11, Yunlong Song wrote:

Then the GCed data pages are totally mixed with the inmem atomic pages,


If we add dio_rwsem, GC flow is exclude with atomic write flow. There
will be not race case to mix GCed page into atomic pages.

Or you mean:

- gc_data_segment
 - move_data_page
  - f2fs_is_atomic_file
- f2fs_ioc_start_atomic_write
  - set_inode_flag(inode, FI_ATOMIC_FILE);
  - f2fs_set_data_page_dirty
   - register_inmem_page

In this case, GCed page can be mixed into database transaction, but could
it cause any problem except break rule of isolation for transaction.


this will cause the atomic commit ops write the GCed data pages twice
(the first write happens in GC).

How about using the early two patches to separate the inmem data pages
and GCed data pages, and use dio_rwsem instead of this patch to fix the
dnode page problem (dnode page commited but data page are not committed
for the GCed page)?


Could we fix the race case first, based on that fixing, and then find the
place that we can improve?




On 2018/2/7 20:16, Chao Yu wrote:

On 2018/2/6 11:49, Yunlong Song wrote:

This patch adds fi->commit_lock to avoid the case that GCed node pages
are committed but GCed data pages are not committed. This can avoid the
db file run into inconsistent state when sudden-power-off happens if
data pages of atomic file is allowed to be GCed before.


do_fsync:GC:
- mutex_lock(>commit_lock);
 - lock_page()
  - mutex_lock(>commit_lock);
   - lock_page()


Well, please consider lock dependency & code complexity, IMO, reuse
fi->dio_rwsem[WRITE] will be enough as below:

---
   fs/f2fs/file.c | 3 +++
   fs/f2fs/gc.c   | 5 -
   2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 672a542e5464..1bdc11feb8d0 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -1711,6 +1711,8 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)

   inode_lock(inode);

+down_write(_I(inode)->dio_rwsem[WRITE]);
+
   if (f2fs_is_volatile_file(inode))
   goto err_out;

@@ -1729,6 +1731,7 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
   ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
   }
   err_out:
+up_write(_I(inode)->dio_rwsem[WRITE]);
   inode_unlock(inode);
   mnt_drop_write_file(filp);
   return ret;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index b9d93fd532a9..e49416283563 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,9 +622,6 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
   if (!check_valid_map(F2FS_I_SB(inode), segno, off))
   goto out;

-if (f2fs_is_atomic_file(inode))
-goto out;


Seems that we need this check.


-
   if (f2fs_is_pinned_file(inode)) {
   f2fs_pin_file_control(inode, true);
   goto out;
@@ -729,8 +726,6 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
   if (!check_valid_map(F2FS_I_SB(inode), segno, off))
   goto out;

-if (f2fs_is_atomic_file(inode))
-goto out;


Ditto.

Thanks,


   if (f2fs_is_pinned_file(inode)) {
   if (gc_type == FG_GC)
   f2fs_pin_file_control(inode, true);





.



--
Thanks,
Yunlong Song



Re: [PATCH] f2fs: add fi->commit_lock to protect commit GCed pages

2018-02-09 Thread Yunlong Song

As what I point in last mail, if the atomic file is not committed
yet, gc_data_segment will register_inmem_page the GCed data pages.
This will cause these data pages written twice, the first write
happens in move_data_page->do_write_data_page, and the second
write happens in later __commit_inmem_pages->do_write_data_page.

On 2018/2/9 20:44, Chao Yu wrote:

On 2018/2/8 11:11, Yunlong Song wrote:

Then the GCed data pages are totally mixed with the inmem atomic pages,


If we add dio_rwsem, GC flow is exclude with atomic write flow. There
will be not race case to mix GCed page into atomic pages.

Or you mean:

- gc_data_segment
 - move_data_page
  - f2fs_is_atomic_file
- f2fs_ioc_start_atomic_write
  - set_inode_flag(inode, FI_ATOMIC_FILE);
  - f2fs_set_data_page_dirty
   - register_inmem_page

In this case, GCed page can be mixed into database transaction, but could
it cause any problem except break rule of isolation for transaction.


this will cause the atomic commit ops write the GCed data pages twice
(the first write happens in GC).

How about using the early two patches to separate the inmem data pages
and GCed data pages, and use dio_rwsem instead of this patch to fix the
dnode page problem (dnode page commited but data page are not committed
for the GCed page)?


Could we fix the race case first, based on that fixing, and then find the
place that we can improve?




On 2018/2/7 20:16, Chao Yu wrote:

On 2018/2/6 11:49, Yunlong Song wrote:

This patch adds fi->commit_lock to avoid the case that GCed node pages
are committed but GCed data pages are not committed. This can avoid the
db file run into inconsistent state when sudden-power-off happens if
data pages of atomic file is allowed to be GCed before.


do_fsync:GC:
- mutex_lock(>commit_lock);
 - lock_page()
  - mutex_lock(>commit_lock);
   - lock_page()


Well, please consider lock dependency & code complexity, IMO, reuse
fi->dio_rwsem[WRITE] will be enough as below:

---
   fs/f2fs/file.c | 3 +++
   fs/f2fs/gc.c   | 5 -
   2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 672a542e5464..1bdc11feb8d0 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -1711,6 +1711,8 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)

   inode_lock(inode);

+down_write(_I(inode)->dio_rwsem[WRITE]);
+
   if (f2fs_is_volatile_file(inode))
   goto err_out;

@@ -1729,6 +1731,7 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
   ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
   }
   err_out:
+up_write(_I(inode)->dio_rwsem[WRITE]);
   inode_unlock(inode);
   mnt_drop_write_file(filp);
   return ret;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index b9d93fd532a9..e49416283563 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,9 +622,6 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
   if (!check_valid_map(F2FS_I_SB(inode), segno, off))
   goto out;

-if (f2fs_is_atomic_file(inode))
-goto out;


Seems that we need this check.


-
   if (f2fs_is_pinned_file(inode)) {
   f2fs_pin_file_control(inode, true);
   goto out;
@@ -729,8 +726,6 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
   if (!check_valid_map(F2FS_I_SB(inode), segno, off))
   goto out;

-if (f2fs_is_atomic_file(inode))
-goto out;


Ditto.

Thanks,


   if (f2fs_is_pinned_file(inode)) {
   if (gc_type == FG_GC)
   f2fs_pin_file_control(inode, true);





.



--
Thanks,
Yunlong Song



[PATCH] f2fs: set_code_data in move_data_block

2018-02-08 Thread Yunlong Song
Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/gc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index b9d93fd..2095630 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -692,6 +692,7 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
fio.op = REQ_OP_WRITE;
fio.op_flags = REQ_SYNC;
fio.new_blkaddr = newaddr;
+   set_cold_data(fio.page);
err = f2fs_submit_page_write();
if (err) {
if (PageWriteback(fio.encrypted_page))
-- 
1.8.5.2



[PATCH] f2fs: set_code_data in move_data_block

2018-02-08 Thread Yunlong Song
Signed-off-by: Yunlong Song 
---
 fs/f2fs/gc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index b9d93fd..2095630 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -692,6 +692,7 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
fio.op = REQ_OP_WRITE;
fio.op_flags = REQ_SYNC;
fio.new_blkaddr = newaddr;
+   set_cold_data(fio.page);
err = f2fs_submit_page_write();
if (err) {
if (PageWriteback(fio.encrypted_page))
-- 
1.8.5.2



Re: [PATCH] f2fs: add fi->commit_lock to protect commit GCed pages

2018-02-07 Thread Yunlong Song

Then the GCed data pages are totally mixed with the inmem atomic pages,
this will cause the atomic commit ops write the GCed data pages twice
(the first write happens in GC).

How about using the early two patches to separate the inmem data pages
and GCed data pages, and use dio_rwsem instead of this patch to fix the
dnode page problem (dnode page commited but data page are not committed
for the GCed page)?


On 2018/2/7 20:16, Chao Yu wrote:

On 2018/2/6 11:49, Yunlong Song wrote:

This patch adds fi->commit_lock to avoid the case that GCed node pages
are committed but GCed data pages are not committed. This can avoid the
db file run into inconsistent state when sudden-power-off happens if
data pages of atomic file is allowed to be GCed before.


do_fsync:   GC:
- mutex_lock(>commit_lock);
- lock_page()
 - mutex_lock(>commit_lock);
  - lock_page()


Well, please consider lock dependency & code complexity, IMO, reuse
fi->dio_rwsem[WRITE] will be enough as below:

---
  fs/f2fs/file.c | 3 +++
  fs/f2fs/gc.c   | 5 -
  2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 672a542e5464..1bdc11feb8d0 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -1711,6 +1711,8 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)

inode_lock(inode);

+   down_write(_I(inode)->dio_rwsem[WRITE]);
+
if (f2fs_is_volatile_file(inode))
goto err_out;

@@ -1729,6 +1731,7 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
}
  err_out:
+   up_write(_I(inode)->dio_rwsem[WRITE]);
inode_unlock(inode);
mnt_drop_write_file(filp);
return ret;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index b9d93fd532a9..e49416283563 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,9 +622,6 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;

-   if (f2fs_is_atomic_file(inode))
-   goto out;
-
if (f2fs_is_pinned_file(inode)) {
f2fs_pin_file_control(inode, true);
goto out;
@@ -729,8 +726,6 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;

-   if (f2fs_is_atomic_file(inode))
-   goto out;
if (f2fs_is_pinned_file(inode)) {
if (gc_type == FG_GC)
f2fs_pin_file_control(inode, true);



--
Thanks,
Yunlong Song



Re: [PATCH] f2fs: add fi->commit_lock to protect commit GCed pages

2018-02-07 Thread Yunlong Song

Then the GCed data pages are totally mixed with the inmem atomic pages,
this will cause the atomic commit ops write the GCed data pages twice
(the first write happens in GC).

How about using the early two patches to separate the inmem data pages
and GCed data pages, and use dio_rwsem instead of this patch to fix the
dnode page problem (dnode page commited but data page are not committed
for the GCed page)?


On 2018/2/7 20:16, Chao Yu wrote:

On 2018/2/6 11:49, Yunlong Song wrote:

This patch adds fi->commit_lock to avoid the case that GCed node pages
are committed but GCed data pages are not committed. This can avoid the
db file run into inconsistent state when sudden-power-off happens if
data pages of atomic file is allowed to be GCed before.


do_fsync:   GC:
- mutex_lock(>commit_lock);
- lock_page()
 - mutex_lock(>commit_lock);
  - lock_page()


Well, please consider lock dependency & code complexity, IMO, reuse
fi->dio_rwsem[WRITE] will be enough as below:

---
  fs/f2fs/file.c | 3 +++
  fs/f2fs/gc.c   | 5 -
  2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 672a542e5464..1bdc11feb8d0 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -1711,6 +1711,8 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)

inode_lock(inode);

+   down_write(_I(inode)->dio_rwsem[WRITE]);
+
if (f2fs_is_volatile_file(inode))
goto err_out;

@@ -1729,6 +1731,7 @@ static int f2fs_ioc_commit_atomic_write(struct file *filp)
ret = f2fs_do_sync_file(filp, 0, LLONG_MAX, 1, false);
}
  err_out:
+   up_write(_I(inode)->dio_rwsem[WRITE]);
inode_unlock(inode);
mnt_drop_write_file(filp);
return ret;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index b9d93fd532a9..e49416283563 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,9 +622,6 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;

-   if (f2fs_is_atomic_file(inode))
-   goto out;
-
if (f2fs_is_pinned_file(inode)) {
f2fs_pin_file_control(inode, true);
goto out;
@@ -729,8 +726,6 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;

-   if (f2fs_is_atomic_file(inode))
-   goto out;
if (f2fs_is_pinned_file(inode)) {
if (gc_type == FG_GC)
f2fs_pin_file_control(inode, true);



--
Thanks,
Yunlong Song



[PATCH] f2fs: add fi->commit_lock to protect commit GCed pages

2018-02-05 Thread Yunlong Song
This patch adds fi->commit_lock to avoid the case that GCed node pages
are committed but GCed data pages are not committed. This can avoid the
db file run into inconsistent state when sudden-power-off happens if
data pages of atomic file is allowed to be GCed before.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/f2fs.h  |  1 +
 fs/f2fs/file.c  | 15 --
 fs/f2fs/gc.c| 61 +
 fs/f2fs/super.c |  1 +
 4 files changed, 55 insertions(+), 23 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index dbe87c7..b58a8f2 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -628,6 +628,7 @@ struct f2fs_inode_info {
struct list_head inmem_pages;   /* inmemory pages managed by f2fs */
struct task_struct *inmem_task; /* store inmemory task */
struct mutex inmem_lock;/* lock for inmemory pages */
+   struct mutex commit_lock;   /* lock for commit GCed pages */
struct extent_tree *extent_tree;/* cached extent_tree entry */
struct rw_semaphore dio_rwsem[2];/* avoid racing between dio and gc */
struct rw_semaphore i_mmap_sem;
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 672a542..7e14724 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -202,6 +202,7 @@ static int f2fs_do_sync_file(struct file *file, loff_t 
start, loff_t end,
 {
struct inode *inode = file->f_mapping->host;
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
+   struct f2fs_inode_info *fi = F2FS_I(inode);
nid_t ino = inode->i_ino;
int ret = 0;
enum cp_reason_type cp_reason = 0;
@@ -219,11 +220,13 @@ static int f2fs_do_sync_file(struct file *file, loff_t 
start, loff_t end,
/* if fdatasync is triggered, let's do in-place-update */
if (datasync || get_dirty_pages(inode) <= SM_I(sbi)->min_fsync_blocks)
set_inode_flag(inode, FI_NEED_IPU);
+   mutex_lock(>commit_lock);
ret = file_write_and_wait_range(file, start, end);
clear_inode_flag(inode, FI_NEED_IPU);
 
if (ret) {
trace_f2fs_sync_file_exit(inode, cp_reason, datasync, ret);
+   mutex_unlock(>commit_lock);
return ret;
}
 
@@ -244,8 +247,11 @@ static int f2fs_do_sync_file(struct file *file, loff_t 
start, loff_t end,
goto go_write;
 
if (is_inode_flag_set(inode, FI_UPDATE_WRITE) ||
-   exist_written_data(sbi, ino, UPDATE_INO))
+   exist_written_data(sbi, ino, UPDATE_INO)) {
+   mutex_unlock(>commit_lock);
goto flush_out;
+   }
+   mutex_unlock(>commit_lock);
goto out;
}
 go_write:
@@ -268,16 +274,20 @@ static int f2fs_do_sync_file(struct file *file, loff_t 
start, loff_t end,
try_to_fix_pino(inode);
clear_inode_flag(inode, FI_APPEND_WRITE);
clear_inode_flag(inode, FI_UPDATE_WRITE);
+   mutex_unlock(>commit_lock);
goto out;
}
 sync_nodes:
ret = fsync_node_pages(sbi, inode, , atomic);
-   if (ret)
+   if (ret) {
+   mutex_unlock(>commit_lock);
goto out;
+   }
 
/* if cp_error was enabled, we should avoid infinite loop */
if (unlikely(f2fs_cp_error(sbi))) {
ret = -EIO;
+   mutex_unlock(>commit_lock);
goto out;
}
 
@@ -286,6 +296,7 @@ static int f2fs_do_sync_file(struct file *file, loff_t 
start, loff_t end,
f2fs_write_inode(inode, NULL);
goto sync_nodes;
}
+   mutex_unlock(>commit_lock);
 
/*
 * If it's atomic_write, it's just fine to keep write ordering. So
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 9d54ddb..b98aff5 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -676,13 +676,20 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
goto put_page_out;
}
 
-   if (f2fs_is_atomic_file(inode) &&
-   !f2fs_is_commit_atomic_write(inode) &&
-   !IS_GC_WRITTEN_PAGE(fio.encrypted_page)) {
-   set_page_private(fio.encrypted_page, (unsigned 
long)GC_WRITTEN_PAGE);
-   SetPagePrivate(fio.encrypted_page);
-   }
-   set_page_dirty(fio.encrypted_page);
+   if (f2fs_is_atomic_file(inode)) {
+   struct f2fs_inode_info *fi = F2FS_I(inode);
+
+   mutex_lock(>commit_lock);
+   if (!f2fs_is_commit_atomic_write(inode) &&
+   !IS_GC_WRITTEN_PAGE(fio.encrypted_page)) {
+   set_page_private(fio.encrypted_page,
+   (unsigned long)GC_WRITTEN_PAGE);
+   SetPagePrivate(fio.encrypted

[PATCH] f2fs: add fi->commit_lock to protect commit GCed pages

2018-02-05 Thread Yunlong Song
This patch adds fi->commit_lock to avoid the case that GCed node pages
are committed but GCed data pages are not committed. This can avoid the
db file run into inconsistent state when sudden-power-off happens if
data pages of atomic file is allowed to be GCed before.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/f2fs.h  |  1 +
 fs/f2fs/file.c  | 15 --
 fs/f2fs/gc.c| 61 +
 fs/f2fs/super.c |  1 +
 4 files changed, 55 insertions(+), 23 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index dbe87c7..b58a8f2 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -628,6 +628,7 @@ struct f2fs_inode_info {
struct list_head inmem_pages;   /* inmemory pages managed by f2fs */
struct task_struct *inmem_task; /* store inmemory task */
struct mutex inmem_lock;/* lock for inmemory pages */
+   struct mutex commit_lock;   /* lock for commit GCed pages */
struct extent_tree *extent_tree;/* cached extent_tree entry */
struct rw_semaphore dio_rwsem[2];/* avoid racing between dio and gc */
struct rw_semaphore i_mmap_sem;
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 672a542..7e14724 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -202,6 +202,7 @@ static int f2fs_do_sync_file(struct file *file, loff_t 
start, loff_t end,
 {
struct inode *inode = file->f_mapping->host;
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
+   struct f2fs_inode_info *fi = F2FS_I(inode);
nid_t ino = inode->i_ino;
int ret = 0;
enum cp_reason_type cp_reason = 0;
@@ -219,11 +220,13 @@ static int f2fs_do_sync_file(struct file *file, loff_t 
start, loff_t end,
/* if fdatasync is triggered, let's do in-place-update */
if (datasync || get_dirty_pages(inode) <= SM_I(sbi)->min_fsync_blocks)
set_inode_flag(inode, FI_NEED_IPU);
+   mutex_lock(>commit_lock);
ret = file_write_and_wait_range(file, start, end);
clear_inode_flag(inode, FI_NEED_IPU);
 
if (ret) {
trace_f2fs_sync_file_exit(inode, cp_reason, datasync, ret);
+   mutex_unlock(>commit_lock);
return ret;
}
 
@@ -244,8 +247,11 @@ static int f2fs_do_sync_file(struct file *file, loff_t 
start, loff_t end,
goto go_write;
 
if (is_inode_flag_set(inode, FI_UPDATE_WRITE) ||
-   exist_written_data(sbi, ino, UPDATE_INO))
+   exist_written_data(sbi, ino, UPDATE_INO)) {
+   mutex_unlock(>commit_lock);
goto flush_out;
+   }
+   mutex_unlock(>commit_lock);
goto out;
}
 go_write:
@@ -268,16 +274,20 @@ static int f2fs_do_sync_file(struct file *file, loff_t 
start, loff_t end,
try_to_fix_pino(inode);
clear_inode_flag(inode, FI_APPEND_WRITE);
clear_inode_flag(inode, FI_UPDATE_WRITE);
+   mutex_unlock(>commit_lock);
goto out;
}
 sync_nodes:
ret = fsync_node_pages(sbi, inode, , atomic);
-   if (ret)
+   if (ret) {
+   mutex_unlock(>commit_lock);
goto out;
+   }
 
/* if cp_error was enabled, we should avoid infinite loop */
if (unlikely(f2fs_cp_error(sbi))) {
ret = -EIO;
+   mutex_unlock(>commit_lock);
goto out;
}
 
@@ -286,6 +296,7 @@ static int f2fs_do_sync_file(struct file *file, loff_t 
start, loff_t end,
f2fs_write_inode(inode, NULL);
goto sync_nodes;
}
+   mutex_unlock(>commit_lock);
 
/*
 * If it's atomic_write, it's just fine to keep write ordering. So
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 9d54ddb..b98aff5 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -676,13 +676,20 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
goto put_page_out;
}
 
-   if (f2fs_is_atomic_file(inode) &&
-   !f2fs_is_commit_atomic_write(inode) &&
-   !IS_GC_WRITTEN_PAGE(fio.encrypted_page)) {
-   set_page_private(fio.encrypted_page, (unsigned 
long)GC_WRITTEN_PAGE);
-   SetPagePrivate(fio.encrypted_page);
-   }
-   set_page_dirty(fio.encrypted_page);
+   if (f2fs_is_atomic_file(inode)) {
+   struct f2fs_inode_info *fi = F2FS_I(inode);
+
+   mutex_lock(>commit_lock);
+   if (!f2fs_is_commit_atomic_write(inode) &&
+   !IS_GC_WRITTEN_PAGE(fio.encrypted_page)) {
+   set_page_private(fio.encrypted_page,
+   (unsigned long)GC_WRITTEN_PAGE);
+   SetPagePrivate(fio.encrypted_page)

Re: [PATCH 1/2] f2fs: enable to gc page whose inode already atomic commit

2018-02-05 Thread Yunlong Song

OK, now I got it, thanks for the explanation. Then the point is to avoid
set_page_dirty between file_write_and_wait_range and fsync_node_pages,
so we can lock before file_write_and_wait_range and unlock after 
fsync_node_pages, and lock before set_page_dirty and unlock after

set_page_dirty. These patches and the locks can make sure the GCed data
pages are all committed to nand flash with their nodes.

On 2018/2/5 19:10, Chao Yu wrote:

On 2018/2/5 17:37, Yunlong Song wrote:



OK, details as I explained before:

atomic_commit   GC
- file_write_and_wait_range
- move_data_block
 - f2fs_submit_page_write
  - f2fs_update_data_blkaddr
   - set_page_dirty
   - fsync_node_pages

1. atomic writes data page #1 & update node #1
2. GC data page #2 & update node #2
3. page #1 & node #1 & #2 can be committed into nand flash before page #2 be
committed.

After a sudden pow-cut, database transaction will be inconsistent. So I think
there will be better to exclude gc/atomic_write to each other, with a lock
instead of flag checking.



I do not understand why this transaction is inconsistent, is it a
problem that page #2 is not committed into nand flash? Since normal


Yes, node #2 contains newly updated LBAx of page #2, but if page #2 is not
committed to LBAx, after recovery, page #2 's block address in node #2 will
point to LBAx which contains random data, result in corrupted db file.


gc also has this problem:

Suppose that there is db file A, f2fs_gc moves data page #1 of db file
A. But if write checkpoint only commit node page #1 and then a sudden


f2fs will ensure GCed data being persisted during checkpoint, so migrated page
#1 and updated node #1 will both be committed in this checkpoint.

Please check WB_DATA_TYPE macro to see how we define data type that cp
guarantees to writeback.


power-cut happens. Data page #1 is not committed to nand flash, but
node page #1 is committed. Is the db transaction broken and
inconsistent?

Come back to your example, I think data page 2 of atomic file does not
belong to this transaction, so even node page 2 is committed, it is just


If node #2 is committed only, it will be harmful to db transaction due to the
reason I said above.

Thanks,


the same problem as what I have listed above(db file A), and it does not
break this transaction.


Thanks,



So how about just using dio_rwsem[WRITE] during atomic committing to exclude
GCing data block of atomic opened file?

Thanks,



Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/data.c | 5 ++---
 fs/f2fs/gc.c   | 6 --
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 7435830..edafcb6 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1580,14 +1580,13 @@ bool should_update_outplace(struct inode *inode, struct 
f2fs_io_info *fio)
return true;
if (S_ISDIR(inode->i_mode))
return true;
-   if (f2fs_is_atomic_file(inode))
-   return true;
if (fio) {
if (is_cold_data(fio->page))
return true;
if (IS_ATOMIC_WRITTEN_PAGE(fio->page))
return true;
-   }
+   } else if (f2fs_is_atomic_file(inode))
+   return true;
return false;
 }
 
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c

index b9d93fd..84ab3ff 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,7 +622,8 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
 
-	if (f2fs_is_atomic_file(inode))

+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
 
 	if (f2fs_is_pinned_file(inode)) {

@@ -729,7 +730,8 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
 
-	if (f2fs_is_atomic_file(inode))

+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
if (f2fs_is_pinned_file(inode)) {
if (gc_type == FG_GC)



.






.






.






.



--
Thanks,
Yunlong Song



Re: [PATCH 1/2] f2fs: enable to gc page whose inode already atomic commit

2018-02-05 Thread Yunlong Song

OK, now I got it, thanks for the explanation. Then the point is to avoid
set_page_dirty between file_write_and_wait_range and fsync_node_pages,
so we can lock before file_write_and_wait_range and unlock after 
fsync_node_pages, and lock before set_page_dirty and unlock after

set_page_dirty. These patches and the locks can make sure the GCed data
pages are all committed to nand flash with their nodes.

On 2018/2/5 19:10, Chao Yu wrote:

On 2018/2/5 17:37, Yunlong Song wrote:



OK, details as I explained before:

atomic_commit   GC
- file_write_and_wait_range
- move_data_block
 - f2fs_submit_page_write
  - f2fs_update_data_blkaddr
   - set_page_dirty
   - fsync_node_pages

1. atomic writes data page #1 & update node #1
2. GC data page #2 & update node #2
3. page #1 & node #1 & #2 can be committed into nand flash before page #2 be
committed.

After a sudden pow-cut, database transaction will be inconsistent. So I think
there will be better to exclude gc/atomic_write to each other, with a lock
instead of flag checking.



I do not understand why this transaction is inconsistent, is it a
problem that page #2 is not committed into nand flash? Since normal


Yes, node #2 contains newly updated LBAx of page #2, but if page #2 is not
committed to LBAx, after recovery, page #2 's block address in node #2 will
point to LBAx which contains random data, result in corrupted db file.


gc also has this problem:

Suppose that there is db file A, f2fs_gc moves data page #1 of db file
A. But if write checkpoint only commit node page #1 and then a sudden


f2fs will ensure GCed data being persisted during checkpoint, so migrated page
#1 and updated node #1 will both be committed in this checkpoint.

Please check WB_DATA_TYPE macro to see how we define data type that cp
guarantees to writeback.


power-cut happens. Data page #1 is not committed to nand flash, but
node page #1 is committed. Is the db transaction broken and
inconsistent?

Come back to your example, I think data page 2 of atomic file does not
belong to this transaction, so even node page 2 is committed, it is just


If node #2 is committed only, it will be harmful to db transaction due to the
reason I said above.

Thanks,


the same problem as what I have listed above(db file A), and it does not
break this transaction.


Thanks,



So how about just using dio_rwsem[WRITE] during atomic committing to exclude
GCing data block of atomic opened file?

Thanks,



Signed-off-by: Yunlong Song 
---
 fs/f2fs/data.c | 5 ++---
 fs/f2fs/gc.c   | 6 --
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 7435830..edafcb6 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1580,14 +1580,13 @@ bool should_update_outplace(struct inode *inode, struct 
f2fs_io_info *fio)
return true;
if (S_ISDIR(inode->i_mode))
return true;
-   if (f2fs_is_atomic_file(inode))
-   return true;
if (fio) {
if (is_cold_data(fio->page))
return true;
if (IS_ATOMIC_WRITTEN_PAGE(fio->page))
return true;
-   }
+   } else if (f2fs_is_atomic_file(inode))
+   return true;
return false;
 }
 
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c

index b9d93fd..84ab3ff 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,7 +622,8 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
 
-	if (f2fs_is_atomic_file(inode))

+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
 
 	if (f2fs_is_pinned_file(inode)) {

@@ -729,7 +730,8 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
 
-	if (f2fs_is_atomic_file(inode))

+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
if (f2fs_is_pinned_file(inode)) {
if (gc_type == FG_GC)



.






.






.






.



--
Thanks,
Yunlong Song



Re: [PATCH 1/2] f2fs: enable to gc page whose inode already atomic commit

2018-02-05 Thread Yunlong Song



OK, details as I explained before:

atomic_commit   GC
- file_write_and_wait_range
- move_data_block
 - f2fs_submit_page_write
  - f2fs_update_data_blkaddr
   - set_page_dirty
  - fsync_node_pages

1. atomic writes data page #1 & update node #1
2. GC data page #2 & update node #2
3. page #1 & node #1 & #2 can be committed into nand flash before page #2 be
committed.

After a sudden pow-cut, database transaction will be inconsistent. So I think
there will be better to exclude gc/atomic_write to each other, with a lock
instead of flag checking.



I do not understand why this transaction is inconsistent, is it a
problem that page #2 is not committed into nand flash? Since normal
gc also has this problem:

Suppose that there is db file A, f2fs_gc moves data page #1 of db file
A. But if write checkpoint only commit node page #1 and then a sudden
power-cut happens. Data page #1 is not committed to nand flash, but
node page #1 is committed. Is the db transaction broken and
inconsistent?

Come back to your example, I think data page 2 of atomic file does not
belong to this transaction, so even node page 2 is committed, it is just
the same problem as what I have listed above(db file A), and it does not
break this transaction.


Thanks,



So how about just using dio_rwsem[WRITE] during atomic committing to exclude
GCing data block of atomic opened file?

Thanks,



Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
fs/f2fs/data.c | 5 ++---
fs/f2fs/gc.c   | 6 --
2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 7435830..edafcb6 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1580,14 +1580,13 @@ bool should_update_outplace(struct inode *inode, struct 
f2fs_io_info *fio)
return true;
if (S_ISDIR(inode->i_mode))
return true;
-   if (f2fs_is_atomic_file(inode))
-   return true;
if (fio) {
if (is_cold_data(fio->page))
return true;
if (IS_ATOMIC_WRITTEN_PAGE(fio->page))
return true;
-   }
+   } else if (f2fs_is_atomic_file(inode))
+   return true;
return false;
}

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c

index b9d93fd..84ab3ff 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,7 +622,8 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;

-	if (f2fs_is_atomic_file(inode))

+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;

	if (f2fs_is_pinned_file(inode)) {

@@ -729,7 +730,8 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;

-	if (f2fs_is_atomic_file(inode))

+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
if (f2fs_is_pinned_file(inode)) {
if (gc_type == FG_GC)



.






.






.



--
Thanks,
Yunlong Song



Re: [PATCH 1/2] f2fs: enable to gc page whose inode already atomic commit

2018-02-05 Thread Yunlong Song



OK, details as I explained before:

atomic_commit   GC
- file_write_and_wait_range
- move_data_block
 - f2fs_submit_page_write
  - f2fs_update_data_blkaddr
   - set_page_dirty
  - fsync_node_pages

1. atomic writes data page #1 & update node #1
2. GC data page #2 & update node #2
3. page #1 & node #1 & #2 can be committed into nand flash before page #2 be
committed.

After a sudden pow-cut, database transaction will be inconsistent. So I think
there will be better to exclude gc/atomic_write to each other, with a lock
instead of flag checking.



I do not understand why this transaction is inconsistent, is it a
problem that page #2 is not committed into nand flash? Since normal
gc also has this problem:

Suppose that there is db file A, f2fs_gc moves data page #1 of db file
A. But if write checkpoint only commit node page #1 and then a sudden
power-cut happens. Data page #1 is not committed to nand flash, but
node page #1 is committed. Is the db transaction broken and
inconsistent?

Come back to your example, I think data page 2 of atomic file does not
belong to this transaction, so even node page 2 is committed, it is just
the same problem as what I have listed above(db file A), and it does not
break this transaction.


Thanks,



So how about just using dio_rwsem[WRITE] during atomic committing to exclude
GCing data block of atomic opened file?

Thanks,



Signed-off-by: Yunlong Song 
---
fs/f2fs/data.c | 5 ++---
fs/f2fs/gc.c   | 6 --
2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 7435830..edafcb6 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1580,14 +1580,13 @@ bool should_update_outplace(struct inode *inode, struct 
f2fs_io_info *fio)
return true;
if (S_ISDIR(inode->i_mode))
return true;
-   if (f2fs_is_atomic_file(inode))
-   return true;
if (fio) {
if (is_cold_data(fio->page))
return true;
if (IS_ATOMIC_WRITTEN_PAGE(fio->page))
return true;
-   }
+   } else if (f2fs_is_atomic_file(inode))
+   return true;
return false;
}

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c

index b9d93fd..84ab3ff 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,7 +622,8 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;

-	if (f2fs_is_atomic_file(inode))

+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;

	if (f2fs_is_pinned_file(inode)) {

@@ -729,7 +730,8 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;

-	if (f2fs_is_atomic_file(inode))

+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
if (f2fs_is_pinned_file(inode)) {
if (gc_type == FG_GC)



.






.






.



--
Thanks,
Yunlong Song



Re: [PATCH 1/2] f2fs: enable to gc page whose inode already atomic commit

2018-02-04 Thread Yunlong Song

Is it necessary to make atomic commit fail? What's the problem of this
patch (no lock at all and does not make atomic fail)? These two patches
aims to provide ability to gc old blocks of opened atomic file, with no
affection to original atomic commit and no mix with inmem pages.

On 2018/2/5 14:29, Chao Yu wrote:

On 2018/2/5 10:53, Yunlong Song wrote:

Is it necessary to add a lock here? What's the problem of this patch (no
lock at all)? Anyway, the problem is expected to be fixed asap, since
attackers can easily write an app with only atomic start and no atomic
commit, which will cause f2fs run into loop gc if the disk layout is
much fragmented, since f2fs_gc will select the same target victim all
the time (e.g. one block of target victim belongs to the opened atomic
file, and it will not be moved and do_garbage_collect will finally
return 0, and that victim is selected again next time) and goto gc_more
time and time again, which will block all the fs ops (all the fs ops
will hang up in f2fs_balance_fs).


Hmm.. w/ original commit log and implementation, I supposed that the patch
intended to fix to make atomic write be isolated from other IOs like GC
triggered writes...

Alright, we have discuss the problem before in below link:
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1571951.html

I meant, for example:

f2fs_ioc_start_atomic_write()
inode->atomic_open_time = get_mtime();

f2fs_ioc_commit_atomic_write()
inode->atomic_open_time = 0;

f2fs_balance_fs_bg()
for_each_atomic_open_file()
if (inode->atomic_open_time &&
inode->atomic_open_time > threshold) {
drop_inmem_pages();
f2fs_msg();
}

threshold = 30s

Any thoughts?

Thanks,



On 2018/2/4 22:56, Chao Yu wrote:

On 2018/2/3 10:47, Yunlong Song wrote:

If inode has already started to atomic commit, then set_page_dirty will
not mix the gc pages with the inmem atomic pages, so the page can be
gced safely.


Let's avoid Ccing fs mailing list if the patch didn't change vfs common
codes.

As you know, the problem here is mixed dnode block flushing w/o writebacking
gced data block, result in making transaction unintegrated.

So how about just using dio_rwsem[WRITE] during atomic committing to exclude
GCing data block of atomic opened file?

Thanks,



Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
   fs/f2fs/data.c | 5 ++---
   fs/f2fs/gc.c   | 6 --
   2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 7435830..edafcb6 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1580,14 +1580,13 @@ bool should_update_outplace(struct inode *inode, struct 
f2fs_io_info *fio)
return true;
if (S_ISDIR(inode->i_mode))
return true;
-   if (f2fs_is_atomic_file(inode))
-   return true;
if (fio) {
if (is_cold_data(fio->page))
return true;
if (IS_ATOMIC_WRITTEN_PAGE(fio->page))
return true;
-   }
+   } else if (f2fs_is_atomic_file(inode))
+   return true;
return false;
   }
   
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c

index b9d93fd..84ab3ff 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,7 +622,8 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
   
-	if (f2fs_is_atomic_file(inode))

+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
   
   	if (f2fs_is_pinned_file(inode)) {

@@ -729,7 +730,8 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
   
-	if (f2fs_is_atomic_file(inode))

+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
if (f2fs_is_pinned_file(inode)) {
    if (gc_type == FG_GC)



.






.



--
Thanks,
Yunlong Song



Re: [PATCH 1/2] f2fs: enable to gc page whose inode already atomic commit

2018-02-04 Thread Yunlong Song

Is it necessary to make atomic commit fail? What's the problem of this
patch (no lock at all and does not make atomic fail)? These two patches
aims to provide ability to gc old blocks of opened atomic file, with no
affection to original atomic commit and no mix with inmem pages.

On 2018/2/5 14:29, Chao Yu wrote:

On 2018/2/5 10:53, Yunlong Song wrote:

Is it necessary to add a lock here? What's the problem of this patch (no
lock at all)? Anyway, the problem is expected to be fixed asap, since
attackers can easily write an app with only atomic start and no atomic
commit, which will cause f2fs run into loop gc if the disk layout is
much fragmented, since f2fs_gc will select the same target victim all
the time (e.g. one block of target victim belongs to the opened atomic
file, and it will not be moved and do_garbage_collect will finally
return 0, and that victim is selected again next time) and goto gc_more
time and time again, which will block all the fs ops (all the fs ops
will hang up in f2fs_balance_fs).


Hmm.. w/ original commit log and implementation, I supposed that the patch
intended to fix to make atomic write be isolated from other IOs like GC
triggered writes...

Alright, we have discuss the problem before in below link:
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1571951.html

I meant, for example:

f2fs_ioc_start_atomic_write()
inode->atomic_open_time = get_mtime();

f2fs_ioc_commit_atomic_write()
inode->atomic_open_time = 0;

f2fs_balance_fs_bg()
for_each_atomic_open_file()
if (inode->atomic_open_time &&
inode->atomic_open_time > threshold) {
drop_inmem_pages();
f2fs_msg();
}

threshold = 30s

Any thoughts?

Thanks,



On 2018/2/4 22:56, Chao Yu wrote:

On 2018/2/3 10:47, Yunlong Song wrote:

If inode has already started to atomic commit, then set_page_dirty will
not mix the gc pages with the inmem atomic pages, so the page can be
gced safely.


Let's avoid Ccing fs mailing list if the patch didn't change vfs common
codes.

As you know, the problem here is mixed dnode block flushing w/o writebacking
gced data block, result in making transaction unintegrated.

So how about just using dio_rwsem[WRITE] during atomic committing to exclude
GCing data block of atomic opened file?

Thanks,



Signed-off-by: Yunlong Song 
---
   fs/f2fs/data.c | 5 ++---
   fs/f2fs/gc.c   | 6 --
   2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 7435830..edafcb6 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1580,14 +1580,13 @@ bool should_update_outplace(struct inode *inode, struct 
f2fs_io_info *fio)
return true;
if (S_ISDIR(inode->i_mode))
return true;
-   if (f2fs_is_atomic_file(inode))
-   return true;
if (fio) {
if (is_cold_data(fio->page))
return true;
if (IS_ATOMIC_WRITTEN_PAGE(fio->page))
return true;
-   }
+   } else if (f2fs_is_atomic_file(inode))
+   return true;
return false;
   }
   
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c

index b9d93fd..84ab3ff 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,7 +622,8 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
   
-	if (f2fs_is_atomic_file(inode))

+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
   
   	if (f2fs_is_pinned_file(inode)) {

@@ -729,7 +730,8 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
   
-	if (f2fs_is_atomic_file(inode))

+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
if (f2fs_is_pinned_file(inode)) {
if (gc_type == FG_GC)



.






.



--
Thanks,
Yunlong Song



Re: [PATCH 1/2] f2fs: enable to gc page whose inode already atomic commit

2018-02-04 Thread Yunlong Song
Is it necessary to add a lock here? What's the problem of this patch (no 
lock at all)? Anyway, the problem is expected to be fixed asap, since 
attackers can easily write an app with only atomic start and no atomic 
commit, which will cause f2fs run into loop gc if the disk layout is 
much fragmented, since f2fs_gc will select the same target victim all 
the time (e.g. one block of target victim belongs to the opened atomic 
file, and it will not be moved and do_garbage_collect will finally 
return 0, and that victim is selected again next time) and goto gc_more 
time and time again, which will block all the fs ops (all the fs ops 
will hang up in f2fs_balance_fs).


On 2018/2/4 22:56, Chao Yu wrote:

On 2018/2/3 10:47, Yunlong Song wrote:

If inode has already started to atomic commit, then set_page_dirty will
not mix the gc pages with the inmem atomic pages, so the page can be
gced safely.


Let's avoid Ccing fs mailing list if the patch didn't change vfs common
codes.

As you know, the problem here is mixed dnode block flushing w/o writebacking
gced data block, result in making transaction unintegrated.

So how about just using dio_rwsem[WRITE] during atomic committing to exclude
GCing data block of atomic opened file?

Thanks,



Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
  fs/f2fs/data.c | 5 ++---
  fs/f2fs/gc.c   | 6 --
  2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 7435830..edafcb6 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1580,14 +1580,13 @@ bool should_update_outplace(struct inode *inode, struct 
f2fs_io_info *fio)
return true;
if (S_ISDIR(inode->i_mode))
return true;
-   if (f2fs_is_atomic_file(inode))
-   return true;
if (fio) {
if (is_cold_data(fio->page))
return true;
if (IS_ATOMIC_WRITTEN_PAGE(fio->page))
return true;
-   }
+   } else if (f2fs_is_atomic_file(inode))
+   return true;
return false;
  }
  
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c

index b9d93fd..84ab3ff 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,7 +622,8 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
  
-	if (f2fs_is_atomic_file(inode))

+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
  
  	if (f2fs_is_pinned_file(inode)) {

@@ -729,7 +730,8 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
  
-	if (f2fs_is_atomic_file(inode))

+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
if (f2fs_is_pinned_file(inode)) {
if (gc_type == FG_GC)



.



--
Thanks,
Yunlong Song



Re: [PATCH 1/2] f2fs: enable to gc page whose inode already atomic commit

2018-02-04 Thread Yunlong Song
Is it necessary to add a lock here? What's the problem of this patch (no 
lock at all)? Anyway, the problem is expected to be fixed asap, since 
attackers can easily write an app with only atomic start and no atomic 
commit, which will cause f2fs run into loop gc if the disk layout is 
much fragmented, since f2fs_gc will select the same target victim all 
the time (e.g. one block of target victim belongs to the opened atomic 
file, and it will not be moved and do_garbage_collect will finally 
return 0, and that victim is selected again next time) and goto gc_more 
time and time again, which will block all the fs ops (all the fs ops 
will hang up in f2fs_balance_fs).


On 2018/2/4 22:56, Chao Yu wrote:

On 2018/2/3 10:47, Yunlong Song wrote:

If inode has already started to atomic commit, then set_page_dirty will
not mix the gc pages with the inmem atomic pages, so the page can be
gced safely.


Let's avoid Ccing fs mailing list if the patch didn't change vfs common
codes.

As you know, the problem here is mixed dnode block flushing w/o writebacking
gced data block, result in making transaction unintegrated.

So how about just using dio_rwsem[WRITE] during atomic committing to exclude
GCing data block of atomic opened file?

Thanks,



Signed-off-by: Yunlong Song 
---
  fs/f2fs/data.c | 5 ++---
  fs/f2fs/gc.c   | 6 --
  2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 7435830..edafcb6 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1580,14 +1580,13 @@ bool should_update_outplace(struct inode *inode, struct 
f2fs_io_info *fio)
return true;
if (S_ISDIR(inode->i_mode))
return true;
-   if (f2fs_is_atomic_file(inode))
-   return true;
if (fio) {
if (is_cold_data(fio->page))
return true;
if (IS_ATOMIC_WRITTEN_PAGE(fio->page))
return true;
-   }
+   } else if (f2fs_is_atomic_file(inode))
+   return true;
return false;
  }
  
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c

index b9d93fd..84ab3ff 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,7 +622,8 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
  
-	if (f2fs_is_atomic_file(inode))

+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
  
  	if (f2fs_is_pinned_file(inode)) {

@@ -729,7 +730,8 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
  
-	if (f2fs_is_atomic_file(inode))

+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
if (f2fs_is_pinned_file(inode)) {
if (gc_type == FG_GC)



.



--
Thanks,
Yunlong Song



[PATCH 2/2] f2fs: add GC_WRITTEN_PAGE to gc atomic file

2018-02-02 Thread Yunlong Song
This patch enables to gc atomic file by adding GC_WRITTEN_PAGE to
identify the gced pages of atomic file, which can avoid
register_inmem_page in set_page_dirty, so the gced pages will not mix
with the inmem pages.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/data.c|  7 ++-
 fs/f2fs/gc.c  | 25 ++---
 fs/f2fs/segment.h |  3 +++
 3 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index edafcb6..5e1fc5d 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -120,6 +120,10 @@ static void f2fs_write_end_io(struct bio *bio)
 
dec_page_count(sbi, type);
clear_cold_data(page);
+   if (IS_GC_WRITTEN_PAGE(page)) {
+   set_page_private(page, 0);
+   ClearPagePrivate(page);
+   }
end_page_writeback(page);
}
if (!get_pages(sbi, F2FS_WB_CP_DATA) &&
@@ -2418,7 +2422,8 @@ static int f2fs_set_data_page_dirty(struct page *page)
if (!PageUptodate(page))
SetPageUptodate(page);
 
-   if (f2fs_is_atomic_file(inode) && !f2fs_is_commit_atomic_write(inode)) {
+   if (f2fs_is_atomic_file(inode) && !f2fs_is_commit_atomic_write(inode)
+   && !IS_GC_WRITTEN_PAGE(page)) {
if (!IS_ATOMIC_WRITTEN_PAGE(page)) {
register_inmem_page(inode, page);
return 1;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 84ab3ff..9d54ddb 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,10 +622,6 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
 
-   if (f2fs_is_atomic_file(inode) &&
-   !f2fs_is_commit_atomic_write(inode))
-   goto out;
-
if (f2fs_is_pinned_file(inode)) {
f2fs_pin_file_control(inode, true);
goto out;
@@ -680,6 +676,12 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
goto put_page_out;
}
 
+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode) &&
+   !IS_GC_WRITTEN_PAGE(fio.encrypted_page)) {
+   set_page_private(fio.encrypted_page, (unsigned 
long)GC_WRITTEN_PAGE);
+   SetPagePrivate(fio.encrypted_page);
+   }
set_page_dirty(fio.encrypted_page);
f2fs_wait_on_page_writeback(fio.encrypted_page, DATA, true);
if (clear_page_dirty_for_io(fio.encrypted_page))
@@ -730,9 +732,6 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
 
-   if (f2fs_is_atomic_file(inode) &&
-   !f2fs_is_commit_atomic_write(inode))
-   goto out;
if (f2fs_is_pinned_file(inode)) {
if (gc_type == FG_GC)
f2fs_pin_file_control(inode, true);
@@ -742,6 +741,12 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (gc_type == BG_GC) {
if (PageWriteback(page))
goto out;
+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode) &&
+   !IS_GC_WRITTEN_PAGE(page)) {
+   set_page_private(page, (unsigned long)GC_WRITTEN_PAGE);
+   SetPagePrivate(page);
+   }
set_page_dirty(page);
set_cold_data(page);
} else {
@@ -762,6 +767,12 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
int err;
 
 retry:
+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode) &&
+   !IS_GC_WRITTEN_PAGE(page)) {
+   set_page_private(page, (unsigned long)GC_WRITTEN_PAGE);
+   SetPagePrivate(page);
+   }
set_page_dirty(page);
f2fs_wait_on_page_writeback(page, DATA, true);
if (clear_page_dirty_for_io(page)) {
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index f11c4bc..f0a6432 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -203,11 +203,14 @@ struct segment_allocation {
  */
 #define ATOMIC_WRITTEN_PAGE((unsigned long)-1)
 #define DUMMY_WRITTEN_PAGE ((unsigned long)-2)
+#define GC_WRITTEN_PAGE((unsigned long)-3)
 
 #define IS_ATOMIC_WRITTEN_PAGE(page)   \
(page_private(page) == (unsigned long)ATOMIC_WRITTEN_PAGE)
 #define IS_DUMMY_WRITTEN_PAGE(page)\
(page_private(page) == (unsigned long)DUMMY_WRITTEN_PAGE)
+#define 

[PATCH 2/2] f2fs: add GC_WRITTEN_PAGE to gc atomic file

2018-02-02 Thread Yunlong Song
This patch enables to gc atomic file by adding GC_WRITTEN_PAGE to
identify the gced pages of atomic file, which can avoid
register_inmem_page in set_page_dirty, so the gced pages will not mix
with the inmem pages.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/data.c|  7 ++-
 fs/f2fs/gc.c  | 25 ++---
 fs/f2fs/segment.h |  3 +++
 3 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index edafcb6..5e1fc5d 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -120,6 +120,10 @@ static void f2fs_write_end_io(struct bio *bio)
 
dec_page_count(sbi, type);
clear_cold_data(page);
+   if (IS_GC_WRITTEN_PAGE(page)) {
+   set_page_private(page, 0);
+   ClearPagePrivate(page);
+   }
end_page_writeback(page);
}
if (!get_pages(sbi, F2FS_WB_CP_DATA) &&
@@ -2418,7 +2422,8 @@ static int f2fs_set_data_page_dirty(struct page *page)
if (!PageUptodate(page))
SetPageUptodate(page);
 
-   if (f2fs_is_atomic_file(inode) && !f2fs_is_commit_atomic_write(inode)) {
+   if (f2fs_is_atomic_file(inode) && !f2fs_is_commit_atomic_write(inode)
+   && !IS_GC_WRITTEN_PAGE(page)) {
if (!IS_ATOMIC_WRITTEN_PAGE(page)) {
register_inmem_page(inode, page);
return 1;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 84ab3ff..9d54ddb 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,10 +622,6 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
 
-   if (f2fs_is_atomic_file(inode) &&
-   !f2fs_is_commit_atomic_write(inode))
-   goto out;
-
if (f2fs_is_pinned_file(inode)) {
f2fs_pin_file_control(inode, true);
goto out;
@@ -680,6 +676,12 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
goto put_page_out;
}
 
+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode) &&
+   !IS_GC_WRITTEN_PAGE(fio.encrypted_page)) {
+   set_page_private(fio.encrypted_page, (unsigned 
long)GC_WRITTEN_PAGE);
+   SetPagePrivate(fio.encrypted_page);
+   }
set_page_dirty(fio.encrypted_page);
f2fs_wait_on_page_writeback(fio.encrypted_page, DATA, true);
if (clear_page_dirty_for_io(fio.encrypted_page))
@@ -730,9 +732,6 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
 
-   if (f2fs_is_atomic_file(inode) &&
-   !f2fs_is_commit_atomic_write(inode))
-   goto out;
if (f2fs_is_pinned_file(inode)) {
if (gc_type == FG_GC)
f2fs_pin_file_control(inode, true);
@@ -742,6 +741,12 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (gc_type == BG_GC) {
if (PageWriteback(page))
goto out;
+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode) &&
+   !IS_GC_WRITTEN_PAGE(page)) {
+   set_page_private(page, (unsigned long)GC_WRITTEN_PAGE);
+   SetPagePrivate(page);
+   }
set_page_dirty(page);
set_cold_data(page);
} else {
@@ -762,6 +767,12 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
int err;
 
 retry:
+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode) &&
+   !IS_GC_WRITTEN_PAGE(page)) {
+   set_page_private(page, (unsigned long)GC_WRITTEN_PAGE);
+   SetPagePrivate(page);
+   }
set_page_dirty(page);
f2fs_wait_on_page_writeback(page, DATA, true);
if (clear_page_dirty_for_io(page)) {
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index f11c4bc..f0a6432 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -203,11 +203,14 @@ struct segment_allocation {
  */
 #define ATOMIC_WRITTEN_PAGE((unsigned long)-1)
 #define DUMMY_WRITTEN_PAGE ((unsigned long)-2)
+#define GC_WRITTEN_PAGE((unsigned long)-3)
 
 #define IS_ATOMIC_WRITTEN_PAGE(page)   \
(page_private(page) == (unsigned long)ATOMIC_WRITTEN_PAGE)
 #define IS_DUMMY_WRITTEN_PAGE(page)\
(page_private(page) == (unsigned long)DUMMY_WRITTEN_PAGE)
+#define IS_GC_WRITTEN_PAGE(page) 

[PATCH 1/2] f2fs: enable to gc page whose inode already atomic commit

2018-02-02 Thread Yunlong Song
If inode has already started to atomic commit, then set_page_dirty will
not mix the gc pages with the inmem atomic pages, so the page can be
gced safely.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/data.c | 5 ++---
 fs/f2fs/gc.c   | 6 --
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 7435830..edafcb6 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1580,14 +1580,13 @@ bool should_update_outplace(struct inode *inode, struct 
f2fs_io_info *fio)
return true;
if (S_ISDIR(inode->i_mode))
return true;
-   if (f2fs_is_atomic_file(inode))
-   return true;
if (fio) {
if (is_cold_data(fio->page))
return true;
if (IS_ATOMIC_WRITTEN_PAGE(fio->page))
return true;
-   }
+   } else if (f2fs_is_atomic_file(inode))
+   return true;
return false;
 }
 
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index b9d93fd..84ab3ff 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,7 +622,8 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
 
-   if (f2fs_is_atomic_file(inode))
+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
 
if (f2fs_is_pinned_file(inode)) {
@@ -729,7 +730,8 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
 
-   if (f2fs_is_atomic_file(inode))
+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
if (f2fs_is_pinned_file(inode)) {
if (gc_type == FG_GC)
-- 
1.8.5.2



[PATCH 1/2] f2fs: enable to gc page whose inode already atomic commit

2018-02-02 Thread Yunlong Song
If inode has already started to atomic commit, then set_page_dirty will
not mix the gc pages with the inmem atomic pages, so the page can be
gced safely.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/data.c | 5 ++---
 fs/f2fs/gc.c   | 6 --
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 7435830..edafcb6 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1580,14 +1580,13 @@ bool should_update_outplace(struct inode *inode, struct 
f2fs_io_info *fio)
return true;
if (S_ISDIR(inode->i_mode))
return true;
-   if (f2fs_is_atomic_file(inode))
-   return true;
if (fio) {
if (is_cold_data(fio->page))
return true;
if (IS_ATOMIC_WRITTEN_PAGE(fio->page))
return true;
-   }
+   } else if (f2fs_is_atomic_file(inode))
+   return true;
return false;
 }
 
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index b9d93fd..84ab3ff 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -622,7 +622,8 @@ static void move_data_block(struct inode *inode, block_t 
bidx,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
 
-   if (f2fs_is_atomic_file(inode))
+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
 
if (f2fs_is_pinned_file(inode)) {
@@ -729,7 +730,8 @@ static void move_data_page(struct inode *inode, block_t 
bidx, int gc_type,
if (!check_valid_map(F2FS_I_SB(inode), segno, off))
goto out;
 
-   if (f2fs_is_atomic_file(inode))
+   if (f2fs_is_atomic_file(inode) &&
+   !f2fs_is_commit_atomic_write(inode))
goto out;
if (f2fs_is_pinned_file(inode)) {
if (gc_type == FG_GC)
-- 
1.8.5.2



Re: [PATCH] f2fs: fix heap mode to reset it back

2018-01-29 Thread Yunlong Song
The old commit allocates hot data & nodes in the beginning of partition 
both for heap and
noheap mode. But from the commit message, the heap mode should be like 
before, i.e.,

allocate hot data & nodes from curseg to left.

On 2018/1/29 16:12, Chao Yu wrote:

Hi Yunlong,

On 2018/1/29 11:37, Yunlong Song wrote:

Commit 7a20b8a61eff81bdb7097a578752a74860e9d142 ("f2fs: allocate node
and hot data in the beginning of partition") introduces another mount
option, heap, to reset it back. But it does not do anything for heap
mode, so fix it.

I think Jaegeuk did three things in that patch:
a) add missing heap mount option handling in ->show_options.
b) set noheap by default.
c) change allocation policy to the one that allocate hotdata & nodes in the
front of main are intensively.

They could be separated, independent, and I don't see such intention that
we can only use c) the new introduced allocation policy in noheap mode.

Anyway, I think Jaegeuk can help to double check that.

Thanks,


Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
  fs/f2fs/gc.c  | 5 +++--
  fs/f2fs/segment.c | 3 ++-
  2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index aa720cc..b9d93fd 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -191,8 +191,9 @@ static void select_policy(struct f2fs_sb_info *sbi, int 
gc_type,
if (gc_type != FG_GC && p->max_search > sbi->max_victim_search)
p->max_search = sbi->max_victim_search;
  
-	/* let's select beginning hot/small space first */

-   if (type == CURSEG_HOT_DATA || IS_NODESEG(type))
+   /* let's select beginning hot/small space first in no_heap mode*/
+   if (test_opt(sbi, NOHEAP) &&
+   (type == CURSEG_HOT_DATA || IS_NODESEG(type)))
p->offset = 0;
else
p->offset = SIT_I(sbi)->last_victim[p->gc_mode];
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index e5739ce..77a48c4 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -2167,7 +2167,8 @@ static unsigned int __get_next_segno(struct f2fs_sb_info 
*sbi, int type)
if (sbi->segs_per_sec != 1)
return CURSEG_I(sbi, type)->segno;
  
-	if (type == CURSEG_HOT_DATA || IS_NODESEG(type))

+   if (test_opt(sbi, NOHEAP) &&
+   (type == CURSEG_HOT_DATA || IS_NODESEG(type)))
        return 0;
  
  	if (SIT_I(sbi)->last_victim[ALLOC_NEXT])




.



--
Thanks,
Yunlong Song




Re: [PATCH] f2fs: fix heap mode to reset it back

2018-01-29 Thread Yunlong Song
The old commit allocates hot data & nodes in the beginning of partition 
both for heap and
noheap mode. But from the commit message, the heap mode should be like 
before, i.e.,

allocate hot data & nodes from curseg to left.

On 2018/1/29 16:12, Chao Yu wrote:

Hi Yunlong,

On 2018/1/29 11:37, Yunlong Song wrote:

Commit 7a20b8a61eff81bdb7097a578752a74860e9d142 ("f2fs: allocate node
and hot data in the beginning of partition") introduces another mount
option, heap, to reset it back. But it does not do anything for heap
mode, so fix it.

I think Jaegeuk did three things in that patch:
a) add missing heap mount option handling in ->show_options.
b) set noheap by default.
c) change allocation policy to the one that allocate hotdata & nodes in the
front of main are intensively.

They could be separated, independent, and I don't see such intention that
we can only use c) the new introduced allocation policy in noheap mode.

Anyway, I think Jaegeuk can help to double check that.

Thanks,


Signed-off-by: Yunlong Song 
---
  fs/f2fs/gc.c  | 5 +++--
  fs/f2fs/segment.c | 3 ++-
  2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index aa720cc..b9d93fd 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -191,8 +191,9 @@ static void select_policy(struct f2fs_sb_info *sbi, int 
gc_type,
if (gc_type != FG_GC && p->max_search > sbi->max_victim_search)
p->max_search = sbi->max_victim_search;
  
-	/* let's select beginning hot/small space first */

-   if (type == CURSEG_HOT_DATA || IS_NODESEG(type))
+   /* let's select beginning hot/small space first in no_heap mode*/
+   if (test_opt(sbi, NOHEAP) &&
+   (type == CURSEG_HOT_DATA || IS_NODESEG(type)))
p->offset = 0;
else
p->offset = SIT_I(sbi)->last_victim[p->gc_mode];
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index e5739ce..77a48c4 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -2167,7 +2167,8 @@ static unsigned int __get_next_segno(struct f2fs_sb_info 
*sbi, int type)
if (sbi->segs_per_sec != 1)
return CURSEG_I(sbi, type)->segno;
  
-	if (type == CURSEG_HOT_DATA || IS_NODESEG(type))

+   if (test_opt(sbi, NOHEAP) &&
+   (type == CURSEG_HOT_DATA || IS_NODESEG(type)))
    return 0;
  
  	if (SIT_I(sbi)->last_victim[ALLOC_NEXT])




.



--
Thanks,
Yunlong Song




[PATCH] f2fs: fix heap mode to reset it back

2018-01-28 Thread Yunlong Song
Commit 7a20b8a61eff81bdb7097a578752a74860e9d142 ("f2fs: allocate node
and hot data in the beginning of partition") introduces another mount
option, heap, to reset it back. But it does not do anything for heap
mode, so fix it.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/gc.c  | 5 +++--
 fs/f2fs/segment.c | 3 ++-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index aa720cc..b9d93fd 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -191,8 +191,9 @@ static void select_policy(struct f2fs_sb_info *sbi, int 
gc_type,
if (gc_type != FG_GC && p->max_search > sbi->max_victim_search)
p->max_search = sbi->max_victim_search;
 
-   /* let's select beginning hot/small space first */
-   if (type == CURSEG_HOT_DATA || IS_NODESEG(type))
+   /* let's select beginning hot/small space first in no_heap mode*/
+   if (test_opt(sbi, NOHEAP) &&
+   (type == CURSEG_HOT_DATA || IS_NODESEG(type)))
p->offset = 0;
else
p->offset = SIT_I(sbi)->last_victim[p->gc_mode];
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index e5739ce..77a48c4 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -2167,7 +2167,8 @@ static unsigned int __get_next_segno(struct f2fs_sb_info 
*sbi, int type)
if (sbi->segs_per_sec != 1)
return CURSEG_I(sbi, type)->segno;
 
-   if (type == CURSEG_HOT_DATA || IS_NODESEG(type))
+   if (test_opt(sbi, NOHEAP) &&
+   (type == CURSEG_HOT_DATA || IS_NODESEG(type)))
return 0;
 
if (SIT_I(sbi)->last_victim[ALLOC_NEXT])
-- 
1.8.5.2



[PATCH] f2fs: fix heap mode to reset it back

2018-01-28 Thread Yunlong Song
Commit 7a20b8a61eff81bdb7097a578752a74860e9d142 ("f2fs: allocate node
and hot data in the beginning of partition") introduces another mount
option, heap, to reset it back. But it does not do anything for heap
mode, so fix it.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/gc.c  | 5 +++--
 fs/f2fs/segment.c | 3 ++-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index aa720cc..b9d93fd 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -191,8 +191,9 @@ static void select_policy(struct f2fs_sb_info *sbi, int 
gc_type,
if (gc_type != FG_GC && p->max_search > sbi->max_victim_search)
p->max_search = sbi->max_victim_search;
 
-   /* let's select beginning hot/small space first */
-   if (type == CURSEG_HOT_DATA || IS_NODESEG(type))
+   /* let's select beginning hot/small space first in no_heap mode*/
+   if (test_opt(sbi, NOHEAP) &&
+   (type == CURSEG_HOT_DATA || IS_NODESEG(type)))
p->offset = 0;
else
p->offset = SIT_I(sbi)->last_victim[p->gc_mode];
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index e5739ce..77a48c4 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -2167,7 +2167,8 @@ static unsigned int __get_next_segno(struct f2fs_sb_info 
*sbi, int type)
if (sbi->segs_per_sec != 1)
return CURSEG_I(sbi, type)->segno;
 
-   if (type == CURSEG_HOT_DATA || IS_NODESEG(type))
+   if (test_opt(sbi, NOHEAP) &&
+   (type == CURSEG_HOT_DATA || IS_NODESEG(type)))
return 0;
 
if (SIT_I(sbi)->last_victim[ALLOC_NEXT])
-- 
1.8.5.2



Re: [PATCH v2] f2fs: handle newly created page when revoking inmem pages

2018-01-10 Thread Yunlong Song

Should it be "When committing inmem pages is not successful" ?

On 2018/1/11 8:17, Daeho Jeong wrote:

When committing inmem pages is successful, we revoke already committed
blocks in __revoke_inmem_pages() and finally replace the committed
ones with the old blocks using f2fs_replace_block(). However, if
the committed block was newly created one, the address of the old
block is NEW_ADDR and __f2fs_replace_block() cannot handle NEW_ADDR
as new_blkaddr properly and a kernel panic occurrs.

Signed-off-by: Daeho Jeong <daeho.je...@samsung.com>
Tested-by: Shu Tan <shu@samsung.com>
Reviewed-by: Chao Yu <yuch...@huawei.com>
---
  fs/f2fs/segment.c | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index c117e09..0673d08 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -248,7 +248,11 @@ static int __revoke_inmem_pages(struct inode *inode,
goto next;
}
get_node_info(sbi, dn.nid, );
-   f2fs_replace_block(sbi, , dn.data_blkaddr,
+   if (cur->old_addr == NEW_ADDR) {
+   invalidate_blocks(sbi, dn.data_blkaddr);
+   f2fs_update_data_blkaddr(, NEW_ADDR);
+   } else
+   f2fs_replace_block(sbi, , dn.data_blkaddr,
cur->old_addr, ni.version, true, true);
f2fs_put_dnode();
    }



--
Thanks,
Yunlong Song



Re: [PATCH v2] f2fs: handle newly created page when revoking inmem pages

2018-01-10 Thread Yunlong Song

Should it be "When committing inmem pages is not successful" ?

On 2018/1/11 8:17, Daeho Jeong wrote:

When committing inmem pages is successful, we revoke already committed
blocks in __revoke_inmem_pages() and finally replace the committed
ones with the old blocks using f2fs_replace_block(). However, if
the committed block was newly created one, the address of the old
block is NEW_ADDR and __f2fs_replace_block() cannot handle NEW_ADDR
as new_blkaddr properly and a kernel panic occurrs.

Signed-off-by: Daeho Jeong 
Tested-by: Shu Tan 
Reviewed-by: Chao Yu 
---
  fs/f2fs/segment.c | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index c117e09..0673d08 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -248,7 +248,11 @@ static int __revoke_inmem_pages(struct inode *inode,
goto next;
}
get_node_info(sbi, dn.nid, );
-   f2fs_replace_block(sbi, , dn.data_blkaddr,
+   if (cur->old_addr == NEW_ADDR) {
+   invalidate_blocks(sbi, dn.data_blkaddr);
+   f2fs_update_data_blkaddr(, NEW_ADDR);
+   } else
+   f2fs_replace_block(sbi, , dn.data_blkaddr,
cur->old_addr, ni.version, true, true);
f2fs_put_dnode();
}



--
Thanks,
Yunlong Song



Re: [f2fs-dev] [PATCH 2/2] f2fs: add reserved blocks for root user

2018-01-05 Thread Yunlong Song



On 2018/1/4 2:58, Jaegeuk Kim wrote:

@@ -1590,11 +1598,17 @@ static inline int inc_valid_block_count(struct 
f2fs_sb_info *sbi,
sbi->total_valid_block_count += (block_t)(*count);
avail_user_block_count = sbi->user_block_count -
sbi->current_reserved_blocks;
+
+   if (!(test_opt(sbi, RESERVE_ROOT) && capable(CAP_SYS_RESOURCE)))
+   avail_user_block_count -= sbi->root_reserved_blocks;


Should better be:

+   if (test_opt(sbi, RESERVE_ROOT) && !capable(CAP_SYS_RESOURCE))
+   avail_user_block_count -= sbi->root_reserved_blocks;


@@ -1783,9 +1797,13 @@ static inline int inc_valid_node_count(struct 
f2fs_sb_info *sbi,
  
  	spin_lock(>stat_lock);
  
-	valid_block_count = sbi->total_valid_block_count + 1;

-   if (unlikely(valid_block_count + sbi->current_reserved_blocks >
-   sbi->user_block_count)) {
+   valid_block_count = sbi->total_valid_block_count +
+   sbi->current_reserved_blocks + 1;
+



+   if (!(test_opt(sbi, RESERVE_ROOT) && capable(CAP_SYS_RESOURCE)))
+   valid_block_count += sbi->root_reserved_blocks;
+

should better be:

+   if (test_opt(sbi, RESERVE_ROOT) && !capable(CAP_SYS_RESOURCE))
+   valid_block_count += sbi->root_reserved_blocks;



+   if (unlikely(valid_block_count > sbi->user_block_count)) {
    spin_unlock(>stat_lock);
goto enospc;
}



--
Thanks,
Yunlong Song




Re: [f2fs-dev] [PATCH 2/2] f2fs: add reserved blocks for root user

2018-01-05 Thread Yunlong Song



On 2018/1/4 2:58, Jaegeuk Kim wrote:

@@ -1590,11 +1598,17 @@ static inline int inc_valid_block_count(struct 
f2fs_sb_info *sbi,
sbi->total_valid_block_count += (block_t)(*count);
avail_user_block_count = sbi->user_block_count -
sbi->current_reserved_blocks;
+
+   if (!(test_opt(sbi, RESERVE_ROOT) && capable(CAP_SYS_RESOURCE)))
+   avail_user_block_count -= sbi->root_reserved_blocks;


Should better be:

+   if (test_opt(sbi, RESERVE_ROOT) && !capable(CAP_SYS_RESOURCE))
+   avail_user_block_count -= sbi->root_reserved_blocks;


@@ -1783,9 +1797,13 @@ static inline int inc_valid_node_count(struct 
f2fs_sb_info *sbi,
  
  	spin_lock(>stat_lock);
  
-	valid_block_count = sbi->total_valid_block_count + 1;

-   if (unlikely(valid_block_count + sbi->current_reserved_blocks >
-   sbi->user_block_count)) {
+   valid_block_count = sbi->total_valid_block_count +
+   sbi->current_reserved_blocks + 1;
+



+   if (!(test_opt(sbi, RESERVE_ROOT) && capable(CAP_SYS_RESOURCE)))
+   valid_block_count += sbi->root_reserved_blocks;
+

should better be:

+   if (test_opt(sbi, RESERVE_ROOT) && !capable(CAP_SYS_RESOURCE))
+   valid_block_count += sbi->root_reserved_blocks;



+   if (unlikely(valid_block_count > sbi->user_block_count)) {
    spin_unlock(>stat_lock);
goto enospc;
}



--
Thanks,
Yunlong Song




Re: [f2fs-dev] [PATCH 1/2] f2fs: show precise # of blocks that user/root can use

2018-01-04 Thread Yunlong Song

NACK

man statfs shows:

struct statfs {
...
fsblkcnt_t   f_bfree;   /* free blocks in fs */
fsblkcnt_t   f_bavail;  /* free blocks available to
unprivileged user */
...
}

f_bfree is free blocks in fs, so buf->bfree should be

buf->f_bfree = user_block_count - valid_user_blocks(sbi) + ovp_count;

On 2018/1/4 2:58, Jaegeuk Kim wrote:

Let's show precise # of blocks that user/root can use through bavail and bfree
respectively.

Signed-off-by: Jaegeuk Kim <jaeg...@kernel.org>
---
  fs/f2fs/super.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 0a820ba55b10..4c1c99cf54ef 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1005,9 +1005,9 @@ static int f2fs_statfs(struct dentry *dentry, struct 
kstatfs *buf)
buf->f_bsize = sbi->blocksize;
  
  	buf->f_blocks = total_count - start_count;

-   buf->f_bfree = user_block_count - valid_user_blocks(sbi) + ovp_count;
-   buf->f_bavail = user_block_count - valid_user_blocks(sbi) -
+   buf->f_bfree = user_block_count - valid_user_blocks(sbi) -
sbi->current_reserved_blocks;
+   buf->f_bavail = buf->f_bfree;
  
  	avail_node_count = sbi->total_node_count - sbi->nquota_files -

    F2FS_RESERVED_NODE_NUM;


--
Thanks,
Yunlong Song




Re: [f2fs-dev] [PATCH 1/2] f2fs: show precise # of blocks that user/root can use

2018-01-04 Thread Yunlong Song

NACK

man statfs shows:

struct statfs {
...
fsblkcnt_t   f_bfree;   /* free blocks in fs */
fsblkcnt_t   f_bavail;  /* free blocks available to
unprivileged user */
...
}

f_bfree is free blocks in fs, so buf->bfree should be

buf->f_bfree = user_block_count - valid_user_blocks(sbi) + ovp_count;

On 2018/1/4 2:58, Jaegeuk Kim wrote:

Let's show precise # of blocks that user/root can use through bavail and bfree
respectively.

Signed-off-by: Jaegeuk Kim 
---
  fs/f2fs/super.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 0a820ba55b10..4c1c99cf54ef 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1005,9 +1005,9 @@ static int f2fs_statfs(struct dentry *dentry, struct 
kstatfs *buf)
buf->f_bsize = sbi->blocksize;
  
  	buf->f_blocks = total_count - start_count;

-   buf->f_bfree = user_block_count - valid_user_blocks(sbi) + ovp_count;
-   buf->f_bavail = user_block_count - valid_user_blocks(sbi) -
+   buf->f_bfree = user_block_count - valid_user_blocks(sbi) -
sbi->current_reserved_blocks;
+   buf->f_bavail = buf->f_bfree;
  
  	avail_node_count = sbi->total_node_count - sbi->nquota_files -

    F2FS_RESERVED_NODE_NUM;


--
Thanks,
Yunlong Song




[PATCH v4] f2fs: check segment type in __f2fs_replace_block

2018-01-03 Thread Yunlong Song
In some case, the node blocks has wrong blkaddr whose segment type is
NODE, e.g., recover inode has missing xattr flag and the blkaddr is in
the xattr range. Since fsck.f2fs does not check the recovery nodes, this
will cause __f2fs_replace_block change the curseg of node and do the
update_sit_entry(sbi, new_blkaddr, 1) with no next_blkoff refresh, as a
result, when recovery process write checkpoint and sync nodes, the
next_blkoff of curseg is used in the segment bit map, then it will
cause f2fs_bug_on. So let's check segment type in __f2fs_replace_block.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/segment.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 890d483..50575d5 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -2720,6 +2720,7 @@ void __f2fs_replace_block(struct f2fs_sb_info *sbi, 
struct f2fs_summary *sum,
type = se->type;
 
down_write(_I(sbi)->curseg_lock);
+   f2fs_bug_on(sbi, se->valid_blocks && !IS_DATASEG(type));
 
if (!recover_curseg) {
/* for recovery flow */
-- 
1.8.5.2



[PATCH v4] f2fs: check segment type in __f2fs_replace_block

2018-01-03 Thread Yunlong Song
In some case, the node blocks has wrong blkaddr whose segment type is
NODE, e.g., recover inode has missing xattr flag and the blkaddr is in
the xattr range. Since fsck.f2fs does not check the recovery nodes, this
will cause __f2fs_replace_block change the curseg of node and do the
update_sit_entry(sbi, new_blkaddr, 1) with no next_blkoff refresh, as a
result, when recovery process write checkpoint and sync nodes, the
next_blkoff of curseg is used in the segment bit map, then it will
cause f2fs_bug_on. So let's check segment type in __f2fs_replace_block.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/segment.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 890d483..50575d5 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -2720,6 +2720,7 @@ void __f2fs_replace_block(struct f2fs_sb_info *sbi, 
struct f2fs_summary *sum,
type = se->type;
 
down_write(_I(sbi)->curseg_lock);
+   f2fs_bug_on(sbi, se->valid_blocks && !IS_DATASEG(type));
 
if (!recover_curseg) {
/* for recovery flow */
-- 
1.8.5.2



[PATCH v3] f2fs: check segment type in __f2fs_replace_block

2018-01-03 Thread Yunlong Song
In some case, the node blocks has wrong blkaddr whose segment type is
NODE, e.g., recover inode has missing xattr flag and the blkaddr is in
the xattr range. Since fsck.f2fs does not check the recovery nodes, this
will cause __f2fs_replace_block change the curseg of node and do the
update_sit_entry(sbi, new_blkaddr, 1) with no next_blkoff refresh, as a
result, when recovery process write checkpoint and sync nodes, the
next_blkoff of curseg is used in the segment bit map, then it will
cause f2fs_bug_on. So let's check segment type in __f2fs_replace_block.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/segment.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 890d483..6c6d2dd 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -2719,6 +2719,8 @@ void __f2fs_replace_block(struct f2fs_sb_info *sbi, 
struct f2fs_summary *sum,
se = get_seg_entry(sbi, segno);
type = se->type;
 
+   f2fs_bug_on(sbi, se->valid_blocks && !IS_DATASEG(type));
+
down_write(_I(sbi)->curseg_lock);
 
if (!recover_curseg) {
-- 
1.8.5.2



[PATCH v3] f2fs: check segment type in __f2fs_replace_block

2018-01-03 Thread Yunlong Song
In some case, the node blocks has wrong blkaddr whose segment type is
NODE, e.g., recover inode has missing xattr flag and the blkaddr is in
the xattr range. Since fsck.f2fs does not check the recovery nodes, this
will cause __f2fs_replace_block change the curseg of node and do the
update_sit_entry(sbi, new_blkaddr, 1) with no next_blkoff refresh, as a
result, when recovery process write checkpoint and sync nodes, the
next_blkoff of curseg is used in the segment bit map, then it will
cause f2fs_bug_on. So let's check segment type in __f2fs_replace_block.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/segment.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 890d483..6c6d2dd 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -2719,6 +2719,8 @@ void __f2fs_replace_block(struct f2fs_sb_info *sbi, 
struct f2fs_summary *sum,
se = get_seg_entry(sbi, segno);
type = se->type;
 
+   f2fs_bug_on(sbi, se->valid_blocks && !IS_DATASEG(type));
+
down_write(_I(sbi)->curseg_lock);
 
if (!recover_curseg) {
-- 
1.8.5.2



[PATCH v2] f2fs: check segment type in __f2fs_replace_block

2018-01-03 Thread Yunlong Song
Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/segment.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 890d483..e3bbabf 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -2719,6 +2719,8 @@ void __f2fs_replace_block(struct f2fs_sb_info *sbi, 
struct f2fs_summary *sum,
se = get_seg_entry(sbi, segno);
type = se->type;
 
+   f2fs_bug_on(sbi, se->valid_blocks && IS_NODESEG(type));
+
down_write(_I(sbi)->curseg_lock);
 
if (!recover_curseg) {
-- 
1.8.5.2



[PATCH v2] f2fs: check segment type in __f2fs_replace_block

2018-01-03 Thread Yunlong Song
Signed-off-by: Yunlong Song 
---
 fs/f2fs/segment.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 890d483..e3bbabf 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -2719,6 +2719,8 @@ void __f2fs_replace_block(struct f2fs_sb_info *sbi, 
struct f2fs_summary *sum,
se = get_seg_entry(sbi, segno);
type = se->type;
 
+   f2fs_bug_on(sbi, se->valid_blocks && IS_NODESEG(type));
+
down_write(_I(sbi)->curseg_lock);
 
if (!recover_curseg) {
-- 
1.8.5.2



Re: [PATCH] f2fs: check segment type before recover data

2018-01-02 Thread Yunlong Song



On 2018/1/2 14:49, Chao Yu wrote:

On 2017/12/30 15:42, Yunlong Song wrote:

In some case, the node blocks has wrong blkaddr whose segment type is

You mean *data block* has wrong blkaddr whose segment type is NODE?

Yes.



NODE, e.g., recover inode has missing xattr flag and the blkaddr is in
the xattr range. Since fsck.f2fs does not check the recovery nodes, this
will cause __f2fs_replace_block change the curseg of node and do the
update_sit_entry(sbi, new_blkaddr, 1) with no next_blkoff refresh, as a

Do you mean the root cause is that __f2fs_replace_block didn't update
next_blkoff?

No, it's not the root cause. The root cause may be something like DDR flip.



result, when recovery process write checkpoint and sync nodes, the
next_blkoff of curseg is used in the segment bit map, then it will
cause f2fs_bug_on. So let's check the segment type before recover data,
and stop recover if it is not in DATA segment.

Sorry, I can't catch the whole cause and effect from you description, if
possible, could you give an example?
For example, the i_inline flag has F2FS_INLINE_XATTR, and the last 50 
i_addrs have xattr
context. But if DDR flips, the i_inline flag may miss F2FS_INLINE_XATTR, 
and the last 50 i_addrs
are considered as data block addr. If the xattr context is 0x1234, and 
0x1234 happens to be
a valid block addr, and the block 0x1234 happens to be in a warm node 
segment. Then do_recover_data
will call f2fs_replace_block() with dest = 0x1234, which will change 
curseg of warm node to
0x1234's segment, and make update_sit_entry(sbi, 0x1234, 1), the 
curseg->next_blkoff also
points to 0x1234's offset in its segment. When recovery process calls 
write_checkpoint, sync
nodes will write to 0x1234's offset of curseg warm node. The 
update_sit_entry will check that

offset and find the bitmap is already set to 1 and then calls f2fs_bug_on.



Thanks,


Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
  fs/f2fs/recovery.c | 3 ++-
  fs/f2fs/segment.h  | 3 +++
  2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
index 7d63faf..e8fee4a 100644
--- a/fs/f2fs/recovery.c
+++ b/fs/f2fs/recovery.c
@@ -478,7 +478,8 @@ static int do_recover_data(struct f2fs_sb_info *sbi, struct 
inode *inode,
}
  
  		/* dest is valid block, try to recover from src to dest */

-   if (is_valid_blkaddr(sbi, dest, META_POR)) {
+   if (is_valid_blkaddr(sbi, dest, META_POR) &&
+   is_data_blkaddr(sbi, dest)) {
  
  			if (src == NULL_ADDR) {

err = reserve_new_block();
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index 71a2aaa..5c5a215 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -115,6 +115,9 @@
  #define SECTOR_TO_BLOCK(sectors)  \
((sectors) >> F2FS_LOG_SECTORS_PER_BLOCK)
  
+#define is_data_blkaddr(sbi, blkaddr)	\

+   (IS_DATASEG(get_seg_entry(sbi, GET_SEGNO(sbi, blkaddr))->type))
+
  /*
   * indicate a block allocation direction: RIGHT and LEFT.
   * RIGHT means allocating new sections towards the end of volume.



.



--
Thanks,
Yunlong Song




Re: [PATCH] f2fs: check segment type before recover data

2018-01-02 Thread Yunlong Song



On 2018/1/2 14:49, Chao Yu wrote:

On 2017/12/30 15:42, Yunlong Song wrote:

In some case, the node blocks has wrong blkaddr whose segment type is

You mean *data block* has wrong blkaddr whose segment type is NODE?

Yes.



NODE, e.g., recover inode has missing xattr flag and the blkaddr is in
the xattr range. Since fsck.f2fs does not check the recovery nodes, this
will cause __f2fs_replace_block change the curseg of node and do the
update_sit_entry(sbi, new_blkaddr, 1) with no next_blkoff refresh, as a

Do you mean the root cause is that __f2fs_replace_block didn't update
next_blkoff?

No, it's not the root cause. The root cause may be something like DDR flip.



result, when recovery process write checkpoint and sync nodes, the
next_blkoff of curseg is used in the segment bit map, then it will
cause f2fs_bug_on. So let's check the segment type before recover data,
and stop recover if it is not in DATA segment.

Sorry, I can't catch the whole cause and effect from you description, if
possible, could you give an example?
For example, the i_inline flag has F2FS_INLINE_XATTR, and the last 50 
i_addrs have xattr
context. But if DDR flips, the i_inline flag may miss F2FS_INLINE_XATTR, 
and the last 50 i_addrs
are considered as data block addr. If the xattr context is 0x1234, and 
0x1234 happens to be
a valid block addr, and the block 0x1234 happens to be in a warm node 
segment. Then do_recover_data
will call f2fs_replace_block() with dest = 0x1234, which will change 
curseg of warm node to
0x1234's segment, and make update_sit_entry(sbi, 0x1234, 1), the 
curseg->next_blkoff also
points to 0x1234's offset in its segment. When recovery process calls 
write_checkpoint, sync
nodes will write to 0x1234's offset of curseg warm node. The 
update_sit_entry will check that

offset and find the bitmap is already set to 1 and then calls f2fs_bug_on.



Thanks,


Signed-off-by: Yunlong Song 
---
  fs/f2fs/recovery.c | 3 ++-
  fs/f2fs/segment.h  | 3 +++
  2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
index 7d63faf..e8fee4a 100644
--- a/fs/f2fs/recovery.c
+++ b/fs/f2fs/recovery.c
@@ -478,7 +478,8 @@ static int do_recover_data(struct f2fs_sb_info *sbi, struct 
inode *inode,
}
  
  		/* dest is valid block, try to recover from src to dest */

-   if (is_valid_blkaddr(sbi, dest, META_POR)) {
+   if (is_valid_blkaddr(sbi, dest, META_POR) &&
+   is_data_blkaddr(sbi, dest)) {
  
  			if (src == NULL_ADDR) {

err = reserve_new_block();
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index 71a2aaa..5c5a215 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -115,6 +115,9 @@
  #define SECTOR_TO_BLOCK(sectors)  \
((sectors) >> F2FS_LOG_SECTORS_PER_BLOCK)
  
+#define is_data_blkaddr(sbi, blkaddr)	\

+   (IS_DATASEG(get_seg_entry(sbi, GET_SEGNO(sbi, blkaddr))->type))
+
  /*
   * indicate a block allocation direction: RIGHT and LEFT.
   * RIGHT means allocating new sections towards the end of volume.



.



--
Thanks,
Yunlong Song




[PATCH] f2fs: check segment type before recover data

2017-12-29 Thread Yunlong Song
In some case, the node blocks has wrong blkaddr whose segment type is
NODE, e.g., recover inode has missing xattr flag and the blkaddr is in
the xattr range. Since fsck.f2fs does not check the recovery nodes, this
will cause __f2fs_replace_block change the curseg of node and do the
update_sit_entry(sbi, new_blkaddr, 1) with no next_blkoff refresh, as a
result, when recovery process write checkpoint and sync nodes, the
next_blkoff of curseg is used in the segment bit map, then it will
cause f2fs_bug_on. So let's check the segment type before recover data,
and stop recover if it is not in DATA segment.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/recovery.c | 3 ++-
 fs/f2fs/segment.h  | 3 +++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
index 7d63faf..e8fee4a 100644
--- a/fs/f2fs/recovery.c
+++ b/fs/f2fs/recovery.c
@@ -478,7 +478,8 @@ static int do_recover_data(struct f2fs_sb_info *sbi, struct 
inode *inode,
}
 
/* dest is valid block, try to recover from src to dest */
-   if (is_valid_blkaddr(sbi, dest, META_POR)) {
+   if (is_valid_blkaddr(sbi, dest, META_POR) &&
+   is_data_blkaddr(sbi, dest)) {
 
if (src == NULL_ADDR) {
err = reserve_new_block();
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index 71a2aaa..5c5a215 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -115,6 +115,9 @@
 #define SECTOR_TO_BLOCK(sectors)   \
((sectors) >> F2FS_LOG_SECTORS_PER_BLOCK)
 
+#define is_data_blkaddr(sbi, blkaddr)  \
+   (IS_DATASEG(get_seg_entry(sbi, GET_SEGNO(sbi, blkaddr))->type))
+
 /*
  * indicate a block allocation direction: RIGHT and LEFT.
  * RIGHT means allocating new sections towards the end of volume.
-- 
1.8.5.2



[PATCH] f2fs: check segment type before recover data

2017-12-29 Thread Yunlong Song
In some case, the node blocks has wrong blkaddr whose segment type is
NODE, e.g., recover inode has missing xattr flag and the blkaddr is in
the xattr range. Since fsck.f2fs does not check the recovery nodes, this
will cause __f2fs_replace_block change the curseg of node and do the
update_sit_entry(sbi, new_blkaddr, 1) with no next_blkoff refresh, as a
result, when recovery process write checkpoint and sync nodes, the
next_blkoff of curseg is used in the segment bit map, then it will
cause f2fs_bug_on. So let's check the segment type before recover data,
and stop recover if it is not in DATA segment.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/recovery.c | 3 ++-
 fs/f2fs/segment.h  | 3 +++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
index 7d63faf..e8fee4a 100644
--- a/fs/f2fs/recovery.c
+++ b/fs/f2fs/recovery.c
@@ -478,7 +478,8 @@ static int do_recover_data(struct f2fs_sb_info *sbi, struct 
inode *inode,
}
 
/* dest is valid block, try to recover from src to dest */
-   if (is_valid_blkaddr(sbi, dest, META_POR)) {
+   if (is_valid_blkaddr(sbi, dest, META_POR) &&
+   is_data_blkaddr(sbi, dest)) {
 
if (src == NULL_ADDR) {
err = reserve_new_block();
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index 71a2aaa..5c5a215 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -115,6 +115,9 @@
 #define SECTOR_TO_BLOCK(sectors)   \
((sectors) >> F2FS_LOG_SECTORS_PER_BLOCK)
 
+#define is_data_blkaddr(sbi, blkaddr)  \
+   (IS_DATASEG(get_seg_entry(sbi, GET_SEGNO(sbi, blkaddr))->type))
+
 /*
  * indicate a block allocation direction: RIGHT and LEFT.
  * RIGHT means allocating new sections towards the end of volume.
-- 
1.8.5.2



Re: [PATCH] f2fs: avoid f2fs_gc dead loop

2017-12-25 Thread Yunlong Song
In this case, f2fs_gc will skip all the victims and return with no dead 
loop. The atomic file will

use SSR to OPU, it‘s OK.

On 2017/12/25 17:45, Chao Yu wrote:

On 2017/12/25 14:15, Yunlong Song wrote:

What if the application starts atomic write but forgets to commit, e.g.
bugs in application or the application
is a malicious software itself?

I agree we should consider robustness of f2fs in security aspect, but
please consider more scenario of these sqlite customized interface usage,
it looks just skipping gc is not enough, for example, if there is one large
size db in our partition, with random write, its data spreads in each
segment, once this db has been atomic opened, foreground gc may loop for ever.

How about checking opened time of atomic or volatile file in
f2fs_balance_fs, if it exceeds threshold, we can restore the file to normal
one to avoid potential security issue.

Thanks,


On 2017/12/25 11:44, Chao Yu wrote:

On 2017/12/23 21:09, Yunlong Song wrote:

For some corner case, f2fs_gc selects one target victim but cannot free
that victim segment due to some reason (e.g. the segment has some blocks
of atomic file which is not commited yet), in this case, the victim

File should not be atomic opened for long time since normally sqlite
transaction will finish quickly, so we can expect that gc loop could be
ended up soon, right?

Thanks,


segment may probably be selected over and over, and then f2fs_gc will
go to dead loop. This patch identifies the dead-loop segment, and skips
it in __get_victim next time.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
   fs/f2fs/f2fs.h  |  8 
   fs/f2fs/gc.c| 34 ++
   fs/f2fs/super.c |  3 +++
   3 files changed, 45 insertions(+)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index ca6b0c9..b75851b 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -115,6 +115,13 @@ struct f2fs_mount_info {
unsigned intopt;
   };
   
+struct gc_loop_info {

+   int count;
+   unsigned int segno;
+   unsigned long *segmap;
+};
+#define GC_LOOP_MAX 10
+
   #define F2FS_FEATURE_ENCRYPT 0x0001
   #define F2FS_FEATURE_BLKZONED0x0002
   #define F2FS_FEATURE_ATOMIC_WRITE0x0004
@@ -1125,6 +1132,7 @@ struct f2fs_sb_info {
   
   	/* threshold for converting bg victims for fg */

u64 fggc_threshold;
+   struct gc_loop_info gc_loop;
   
   	/* maximum # of trials to find a victim segment for SSR and GC */

unsigned int max_victim_search;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 5d5bba4..4ee9e1b 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -229,6 +229,10 @@ static unsigned int check_bg_victims(struct f2fs_sb_info 
*sbi)
if (no_fggc_candidate(sbi, secno))
continue;
   
+		if (sbi->gc_loop.segmap &&

+   test_bit(GET_SEG_FROM_SEC(sbi, secno), 
sbi->gc_loop.segmap))
+   continue;
+
clear_bit(secno, dirty_i->victim_secmap);
return GET_SEG_FROM_SEC(sbi, secno);
}
@@ -371,6 +375,9 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
if (gc_type == FG_GC && p.alloc_mode == LFS &&
no_fggc_candidate(sbi, secno))
goto next;
+   if (gc_type == FG_GC && p.alloc_mode == LFS &&
+   sbi->gc_loop.segmap && test_bit(segno, 
sbi->gc_loop.segmap))
+   goto next;
   
   		cost = get_gc_cost(sbi, segno, );
   
@@ -1042,6 +1049,27 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,

seg_freed = do_garbage_collect(sbi, segno, _list, gc_type);
if (gc_type == FG_GC && seg_freed == sbi->segs_per_sec)
sec_freed++;
+   else if (gc_type == FG_GC && seg_freed == 0) {
+   if (!sbi->gc_loop.segmap) {
+   sbi->gc_loop.segmap =
+   kvzalloc(f2fs_bitmap_size(MAIN_SEGS(sbi)), 
GFP_KERNEL);
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
+   }
+   if (segno == sbi->gc_loop.segno) {
+   if (sbi->gc_loop.count > GC_LOOP_MAX) {
+   f2fs_bug_on(sbi, 1);
+   set_bit(segno, sbi->gc_loop.segmap);
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
+   }
+   else
+   sbi->gc_loop.count++;
+   } else {
+   sbi->gc_loop.segno = segno;
+   sbi->gc_loop.count = 0;
+   }
+   }
total_freed += seg_freed;
   
   	if (gc_type == FG_GC)

@@ -1075,6 +1103,12 @@ int

Re: [PATCH] f2fs: avoid f2fs_gc dead loop

2017-12-25 Thread Yunlong Song
In this case, f2fs_gc will skip all the victims and return with no dead 
loop. The atomic file will

use SSR to OPU, it‘s OK.

On 2017/12/25 17:45, Chao Yu wrote:

On 2017/12/25 14:15, Yunlong Song wrote:

What if the application starts atomic write but forgets to commit, e.g.
bugs in application or the application
is a malicious software itself?

I agree we should consider robustness of f2fs in security aspect, but
please consider more scenario of these sqlite customized interface usage,
it looks just skipping gc is not enough, for example, if there is one large
size db in our partition, with random write, its data spreads in each
segment, once this db has been atomic opened, foreground gc may loop for ever.

How about checking opened time of atomic or volatile file in
f2fs_balance_fs, if it exceeds threshold, we can restore the file to normal
one to avoid potential security issue.

Thanks,


On 2017/12/25 11:44, Chao Yu wrote:

On 2017/12/23 21:09, Yunlong Song wrote:

For some corner case, f2fs_gc selects one target victim but cannot free
that victim segment due to some reason (e.g. the segment has some blocks
of atomic file which is not commited yet), in this case, the victim

File should not be atomic opened for long time since normally sqlite
transaction will finish quickly, so we can expect that gc loop could be
ended up soon, right?

Thanks,


segment may probably be selected over and over, and then f2fs_gc will
go to dead loop. This patch identifies the dead-loop segment, and skips
it in __get_victim next time.

Signed-off-by: Yunlong Song 
---
   fs/f2fs/f2fs.h  |  8 
   fs/f2fs/gc.c| 34 ++
   fs/f2fs/super.c |  3 +++
   3 files changed, 45 insertions(+)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index ca6b0c9..b75851b 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -115,6 +115,13 @@ struct f2fs_mount_info {
unsigned intopt;
   };
   
+struct gc_loop_info {

+   int count;
+   unsigned int segno;
+   unsigned long *segmap;
+};
+#define GC_LOOP_MAX 10
+
   #define F2FS_FEATURE_ENCRYPT 0x0001
   #define F2FS_FEATURE_BLKZONED0x0002
   #define F2FS_FEATURE_ATOMIC_WRITE0x0004
@@ -1125,6 +1132,7 @@ struct f2fs_sb_info {
   
   	/* threshold for converting bg victims for fg */

u64 fggc_threshold;
+   struct gc_loop_info gc_loop;
   
   	/* maximum # of trials to find a victim segment for SSR and GC */

unsigned int max_victim_search;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 5d5bba4..4ee9e1b 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -229,6 +229,10 @@ static unsigned int check_bg_victims(struct f2fs_sb_info 
*sbi)
if (no_fggc_candidate(sbi, secno))
continue;
   
+		if (sbi->gc_loop.segmap &&

+   test_bit(GET_SEG_FROM_SEC(sbi, secno), 
sbi->gc_loop.segmap))
+   continue;
+
clear_bit(secno, dirty_i->victim_secmap);
return GET_SEG_FROM_SEC(sbi, secno);
}
@@ -371,6 +375,9 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
if (gc_type == FG_GC && p.alloc_mode == LFS &&
no_fggc_candidate(sbi, secno))
goto next;
+   if (gc_type == FG_GC && p.alloc_mode == LFS &&
+   sbi->gc_loop.segmap && test_bit(segno, 
sbi->gc_loop.segmap))
+   goto next;
   
   		cost = get_gc_cost(sbi, segno, );
   
@@ -1042,6 +1049,27 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,

seg_freed = do_garbage_collect(sbi, segno, _list, gc_type);
if (gc_type == FG_GC && seg_freed == sbi->segs_per_sec)
sec_freed++;
+   else if (gc_type == FG_GC && seg_freed == 0) {
+   if (!sbi->gc_loop.segmap) {
+   sbi->gc_loop.segmap =
+   kvzalloc(f2fs_bitmap_size(MAIN_SEGS(sbi)), 
GFP_KERNEL);
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
+   }
+   if (segno == sbi->gc_loop.segno) {
+   if (sbi->gc_loop.count > GC_LOOP_MAX) {
+   f2fs_bug_on(sbi, 1);
+   set_bit(segno, sbi->gc_loop.segmap);
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
+   }
+   else
+   sbi->gc_loop.count++;
+   } else {
+   sbi->gc_loop.segno = segno;
+   sbi->gc_loop.count = 0;
+   }
+   }
total_freed += seg_freed;
   
   	if (gc_type == FG_GC)

@@ -1075,6 +1103,12 @@ int f2fs_gc(struct f2fs_s

Re: [PATCH] f2fs: avoid f2fs_gc dead loop

2017-12-24 Thread Yunlong Song
What if the application starts atomic write but forgets to commit, e.g. 
bugs in application or the application

is a malicious software itself?

On 2017/12/25 11:44, Chao Yu wrote:

On 2017/12/23 21:09, Yunlong Song wrote:

For some corner case, f2fs_gc selects one target victim but cannot free
that victim segment due to some reason (e.g. the segment has some blocks
of atomic file which is not commited yet), in this case, the victim

File should not be atomic opened for long time since normally sqlite
transaction will finish quickly, so we can expect that gc loop could be
ended up soon, right?

Thanks,


segment may probably be selected over and over, and then f2fs_gc will
go to dead loop. This patch identifies the dead-loop segment, and skips
it in __get_victim next time.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
  fs/f2fs/f2fs.h  |  8 
  fs/f2fs/gc.c| 34 ++
  fs/f2fs/super.c |  3 +++
  3 files changed, 45 insertions(+)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index ca6b0c9..b75851b 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -115,6 +115,13 @@ struct f2fs_mount_info {
unsigned intopt;
  };
  
+struct gc_loop_info {

+   int count;
+   unsigned int segno;
+   unsigned long *segmap;
+};
+#define GC_LOOP_MAX 10
+
  #define F2FS_FEATURE_ENCRYPT  0x0001
  #define F2FS_FEATURE_BLKZONED 0x0002
  #define F2FS_FEATURE_ATOMIC_WRITE 0x0004
@@ -1125,6 +1132,7 @@ struct f2fs_sb_info {
  
  	/* threshold for converting bg victims for fg */

u64 fggc_threshold;
+   struct gc_loop_info gc_loop;
  
  	/* maximum # of trials to find a victim segment for SSR and GC */

unsigned int max_victim_search;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 5d5bba4..4ee9e1b 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -229,6 +229,10 @@ static unsigned int check_bg_victims(struct f2fs_sb_info 
*sbi)
if (no_fggc_candidate(sbi, secno))
continue;
  
+		if (sbi->gc_loop.segmap &&

+   test_bit(GET_SEG_FROM_SEC(sbi, secno), 
sbi->gc_loop.segmap))
+   continue;
+
clear_bit(secno, dirty_i->victim_secmap);
return GET_SEG_FROM_SEC(sbi, secno);
}
@@ -371,6 +375,9 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
if (gc_type == FG_GC && p.alloc_mode == LFS &&
no_fggc_candidate(sbi, secno))
goto next;
+   if (gc_type == FG_GC && p.alloc_mode == LFS &&
+   sbi->gc_loop.segmap && test_bit(segno, 
sbi->gc_loop.segmap))
+   goto next;
  
  		cost = get_gc_cost(sbi, segno, );
  
@@ -1042,6 +1049,27 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,

seg_freed = do_garbage_collect(sbi, segno, _list, gc_type);
if (gc_type == FG_GC && seg_freed == sbi->segs_per_sec)
sec_freed++;
+   else if (gc_type == FG_GC && seg_freed == 0) {
+   if (!sbi->gc_loop.segmap) {
+   sbi->gc_loop.segmap =
+   kvzalloc(f2fs_bitmap_size(MAIN_SEGS(sbi)), 
GFP_KERNEL);
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
+   }
+   if (segno == sbi->gc_loop.segno) {
+   if (sbi->gc_loop.count > GC_LOOP_MAX) {
+   f2fs_bug_on(sbi, 1);
+   set_bit(segno, sbi->gc_loop.segmap);
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
+   }
+   else
+   sbi->gc_loop.count++;
+   } else {
+   sbi->gc_loop.segno = segno;
+   sbi->gc_loop.count = 0;
+   }
+   }
total_freed += seg_freed;
  
  	if (gc_type == FG_GC)

@@ -1075,6 +1103,12 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
  
  	if (sync)

ret = sec_freed ? 0 : -EAGAIN;
+   if (sbi->gc_loop.segmap) {
+   kvfree(sbi->gc_loop.segmap);
+   sbi->gc_loop.segmap = NULL;
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
+   }
return ret;
  }
  
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c

index 031cb26..76f0b72 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -2562,6 +2562,9 @@ static int f2fs_fill_super(struct super_block *sb, void 
*data, int silent)
sbi->last_valid_block_count = sbi->total_valid_block_count;
sbi->reserved_blocks = 0;
sbi->current_reserved_blocks = 0;
+   sbi->gc_loop.segmap

Re: [PATCH] f2fs: avoid f2fs_gc dead loop

2017-12-24 Thread Yunlong Song
What if the application starts atomic write but forgets to commit, e.g. 
bugs in application or the application

is a malicious software itself?

On 2017/12/25 11:44, Chao Yu wrote:

On 2017/12/23 21:09, Yunlong Song wrote:

For some corner case, f2fs_gc selects one target victim but cannot free
that victim segment due to some reason (e.g. the segment has some blocks
of atomic file which is not commited yet), in this case, the victim

File should not be atomic opened for long time since normally sqlite
transaction will finish quickly, so we can expect that gc loop could be
ended up soon, right?

Thanks,


segment may probably be selected over and over, and then f2fs_gc will
go to dead loop. This patch identifies the dead-loop segment, and skips
it in __get_victim next time.

Signed-off-by: Yunlong Song 
---
  fs/f2fs/f2fs.h  |  8 
  fs/f2fs/gc.c| 34 ++
  fs/f2fs/super.c |  3 +++
  3 files changed, 45 insertions(+)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index ca6b0c9..b75851b 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -115,6 +115,13 @@ struct f2fs_mount_info {
unsigned intopt;
  };
  
+struct gc_loop_info {

+   int count;
+   unsigned int segno;
+   unsigned long *segmap;
+};
+#define GC_LOOP_MAX 10
+
  #define F2FS_FEATURE_ENCRYPT  0x0001
  #define F2FS_FEATURE_BLKZONED 0x0002
  #define F2FS_FEATURE_ATOMIC_WRITE 0x0004
@@ -1125,6 +1132,7 @@ struct f2fs_sb_info {
  
  	/* threshold for converting bg victims for fg */

u64 fggc_threshold;
+   struct gc_loop_info gc_loop;
  
  	/* maximum # of trials to find a victim segment for SSR and GC */

unsigned int max_victim_search;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 5d5bba4..4ee9e1b 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -229,6 +229,10 @@ static unsigned int check_bg_victims(struct f2fs_sb_info 
*sbi)
if (no_fggc_candidate(sbi, secno))
continue;
  
+		if (sbi->gc_loop.segmap &&

+   test_bit(GET_SEG_FROM_SEC(sbi, secno), 
sbi->gc_loop.segmap))
+   continue;
+
clear_bit(secno, dirty_i->victim_secmap);
return GET_SEG_FROM_SEC(sbi, secno);
}
@@ -371,6 +375,9 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
if (gc_type == FG_GC && p.alloc_mode == LFS &&
no_fggc_candidate(sbi, secno))
goto next;
+   if (gc_type == FG_GC && p.alloc_mode == LFS &&
+   sbi->gc_loop.segmap && test_bit(segno, 
sbi->gc_loop.segmap))
+   goto next;
  
  		cost = get_gc_cost(sbi, segno, );
  
@@ -1042,6 +1049,27 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,

seg_freed = do_garbage_collect(sbi, segno, _list, gc_type);
if (gc_type == FG_GC && seg_freed == sbi->segs_per_sec)
sec_freed++;
+   else if (gc_type == FG_GC && seg_freed == 0) {
+   if (!sbi->gc_loop.segmap) {
+   sbi->gc_loop.segmap =
+   kvzalloc(f2fs_bitmap_size(MAIN_SEGS(sbi)), 
GFP_KERNEL);
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
+   }
+   if (segno == sbi->gc_loop.segno) {
+   if (sbi->gc_loop.count > GC_LOOP_MAX) {
+   f2fs_bug_on(sbi, 1);
+   set_bit(segno, sbi->gc_loop.segmap);
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
+   }
+   else
+   sbi->gc_loop.count++;
+   } else {
+   sbi->gc_loop.segno = segno;
+   sbi->gc_loop.count = 0;
+   }
+   }
total_freed += seg_freed;
  
  	if (gc_type == FG_GC)

@@ -1075,6 +1103,12 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
  
  	if (sync)

ret = sec_freed ? 0 : -EAGAIN;
+   if (sbi->gc_loop.segmap) {
+   kvfree(sbi->gc_loop.segmap);
+   sbi->gc_loop.segmap = NULL;
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
+   }
return ret;
  }
  
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c

index 031cb26..76f0b72 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -2562,6 +2562,9 @@ static int f2fs_fill_super(struct super_block *sb, void 
*data, int silent)
sbi->last_valid_block_count = sbi->total_valid_block_count;
sbi->reserved_blocks = 0;
sbi->current_reserved_blocks = 0;
+   sbi->gc_loop.segmap = NULL;
+   sbi

[PATCH] f2fs: avoid f2fs_gc dead loop

2017-12-23 Thread Yunlong Song
For some corner case, f2fs_gc selects one target victim but cannot free
that victim segment due to some reason (e.g. the segment has some blocks
of atomic file which is not commited yet), in this case, the victim
segment may probably be selected over and over, and then f2fs_gc will
go to dead loop. This patch identifies the dead-loop segment, and skips
it in __get_victim next time.

Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
---
 fs/f2fs/f2fs.h  |  8 
 fs/f2fs/gc.c| 34 ++
 fs/f2fs/super.c |  3 +++
 3 files changed, 45 insertions(+)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index ca6b0c9..b75851b 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -115,6 +115,13 @@ struct f2fs_mount_info {
unsigned intopt;
 };
 
+struct gc_loop_info {
+   int count;
+   unsigned int segno;
+   unsigned long *segmap;
+};
+#define GC_LOOP_MAX 10
+
 #define F2FS_FEATURE_ENCRYPT   0x0001
 #define F2FS_FEATURE_BLKZONED  0x0002
 #define F2FS_FEATURE_ATOMIC_WRITE  0x0004
@@ -1125,6 +1132,7 @@ struct f2fs_sb_info {
 
/* threshold for converting bg victims for fg */
u64 fggc_threshold;
+   struct gc_loop_info gc_loop;
 
/* maximum # of trials to find a victim segment for SSR and GC */
unsigned int max_victim_search;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 5d5bba4..4ee9e1b 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -229,6 +229,10 @@ static unsigned int check_bg_victims(struct f2fs_sb_info 
*sbi)
if (no_fggc_candidate(sbi, secno))
continue;
 
+   if (sbi->gc_loop.segmap &&
+   test_bit(GET_SEG_FROM_SEC(sbi, secno), 
sbi->gc_loop.segmap))
+   continue;
+
clear_bit(secno, dirty_i->victim_secmap);
return GET_SEG_FROM_SEC(sbi, secno);
}
@@ -371,6 +375,9 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
if (gc_type == FG_GC && p.alloc_mode == LFS &&
no_fggc_candidate(sbi, secno))
goto next;
+   if (gc_type == FG_GC && p.alloc_mode == LFS &&
+   sbi->gc_loop.segmap && test_bit(segno, 
sbi->gc_loop.segmap))
+   goto next;
 
cost = get_gc_cost(sbi, segno, );
 
@@ -1042,6 +1049,27 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
seg_freed = do_garbage_collect(sbi, segno, _list, gc_type);
if (gc_type == FG_GC && seg_freed == sbi->segs_per_sec)
sec_freed++;
+   else if (gc_type == FG_GC && seg_freed == 0) {
+   if (!sbi->gc_loop.segmap) {
+   sbi->gc_loop.segmap =
+   kvzalloc(f2fs_bitmap_size(MAIN_SEGS(sbi)), 
GFP_KERNEL);
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
+   }
+   if (segno == sbi->gc_loop.segno) {
+   if (sbi->gc_loop.count > GC_LOOP_MAX) {
+   f2fs_bug_on(sbi, 1);
+   set_bit(segno, sbi->gc_loop.segmap);
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
+   }
+   else
+   sbi->gc_loop.count++;
+   } else {
+   sbi->gc_loop.segno = segno;
+   sbi->gc_loop.count = 0;
+   }
+   }
total_freed += seg_freed;
 
if (gc_type == FG_GC)
@@ -1075,6 +1103,12 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
 
if (sync)
ret = sec_freed ? 0 : -EAGAIN;
+   if (sbi->gc_loop.segmap) {
+   kvfree(sbi->gc_loop.segmap);
+   sbi->gc_loop.segmap = NULL;
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
+   }
return ret;
 }
 
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 031cb26..76f0b72 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -2562,6 +2562,9 @@ static int f2fs_fill_super(struct super_block *sb, void 
*data, int silent)
sbi->last_valid_block_count = sbi->total_valid_block_count;
sbi->reserved_blocks = 0;
sbi->current_reserved_blocks = 0;
+   sbi->gc_loop.segmap = NULL;
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
 
for (i = 0; i < NR_INODE_TYPE; i++) {
INIT_LIST_HEAD(>inode_list[i]);
-- 
1.8.5.2



[PATCH] f2fs: avoid f2fs_gc dead loop

2017-12-23 Thread Yunlong Song
For some corner case, f2fs_gc selects one target victim but cannot free
that victim segment due to some reason (e.g. the segment has some blocks
of atomic file which is not commited yet), in this case, the victim
segment may probably be selected over and over, and then f2fs_gc will
go to dead loop. This patch identifies the dead-loop segment, and skips
it in __get_victim next time.

Signed-off-by: Yunlong Song 
---
 fs/f2fs/f2fs.h  |  8 
 fs/f2fs/gc.c| 34 ++
 fs/f2fs/super.c |  3 +++
 3 files changed, 45 insertions(+)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index ca6b0c9..b75851b 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -115,6 +115,13 @@ struct f2fs_mount_info {
unsigned intopt;
 };
 
+struct gc_loop_info {
+   int count;
+   unsigned int segno;
+   unsigned long *segmap;
+};
+#define GC_LOOP_MAX 10
+
 #define F2FS_FEATURE_ENCRYPT   0x0001
 #define F2FS_FEATURE_BLKZONED  0x0002
 #define F2FS_FEATURE_ATOMIC_WRITE  0x0004
@@ -1125,6 +1132,7 @@ struct f2fs_sb_info {
 
/* threshold for converting bg victims for fg */
u64 fggc_threshold;
+   struct gc_loop_info gc_loop;
 
/* maximum # of trials to find a victim segment for SSR and GC */
unsigned int max_victim_search;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 5d5bba4..4ee9e1b 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -229,6 +229,10 @@ static unsigned int check_bg_victims(struct f2fs_sb_info 
*sbi)
if (no_fggc_candidate(sbi, secno))
continue;
 
+   if (sbi->gc_loop.segmap &&
+   test_bit(GET_SEG_FROM_SEC(sbi, secno), 
sbi->gc_loop.segmap))
+   continue;
+
clear_bit(secno, dirty_i->victim_secmap);
return GET_SEG_FROM_SEC(sbi, secno);
}
@@ -371,6 +375,9 @@ static int get_victim_by_default(struct f2fs_sb_info *sbi,
if (gc_type == FG_GC && p.alloc_mode == LFS &&
no_fggc_candidate(sbi, secno))
goto next;
+   if (gc_type == FG_GC && p.alloc_mode == LFS &&
+   sbi->gc_loop.segmap && test_bit(segno, 
sbi->gc_loop.segmap))
+   goto next;
 
cost = get_gc_cost(sbi, segno, );
 
@@ -1042,6 +1049,27 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
seg_freed = do_garbage_collect(sbi, segno, _list, gc_type);
if (gc_type == FG_GC && seg_freed == sbi->segs_per_sec)
sec_freed++;
+   else if (gc_type == FG_GC && seg_freed == 0) {
+   if (!sbi->gc_loop.segmap) {
+   sbi->gc_loop.segmap =
+   kvzalloc(f2fs_bitmap_size(MAIN_SEGS(sbi)), 
GFP_KERNEL);
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
+   }
+   if (segno == sbi->gc_loop.segno) {
+   if (sbi->gc_loop.count > GC_LOOP_MAX) {
+   f2fs_bug_on(sbi, 1);
+   set_bit(segno, sbi->gc_loop.segmap);
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
+   }
+   else
+   sbi->gc_loop.count++;
+   } else {
+   sbi->gc_loop.segno = segno;
+   sbi->gc_loop.count = 0;
+   }
+   }
total_freed += seg_freed;
 
if (gc_type == FG_GC)
@@ -1075,6 +1103,12 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
 
if (sync)
ret = sec_freed ? 0 : -EAGAIN;
+   if (sbi->gc_loop.segmap) {
+   kvfree(sbi->gc_loop.segmap);
+   sbi->gc_loop.segmap = NULL;
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
+   }
return ret;
 }
 
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 031cb26..76f0b72 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -2562,6 +2562,9 @@ static int f2fs_fill_super(struct super_block *sb, void 
*data, int silent)
sbi->last_valid_block_count = sbi->total_valid_block_count;
sbi->reserved_blocks = 0;
sbi->current_reserved_blocks = 0;
+   sbi->gc_loop.segmap = NULL;
+   sbi->gc_loop.count = 0;
+   sbi->gc_loop.segno = NULL_SEGNO;
 
for (i = 0; i < NR_INODE_TYPE; i++) {
INIT_LIST_HEAD(>inode_list[i]);
-- 
1.8.5.2



Re: [PATCH v4] fsck.f2fs: check and fix i_namelen to avoid double free

2017-12-22 Thread Yunlong Song

And there is en[namelen] = '\0', should fix namelen to its right value.

On 2017/12/23 11:35, Chao Yu wrote:

On 2017/12/23 11:19, Yunlong Song wrote:

Double free problem:
Since ddr bit jump makes i_namelen a larger value (> 255),when file is
not encrypted,
the convert_encrypted_name will memcpy out range of en[255], when en is
freed, there
will be double free problem.

It looks there is only memcpy overflow problem here.

Thanks,


On 2017/12/23 11:05, Chao Yu wrote:

On 2017/12/18 21:25, Yunlong Song wrote:

v1 -> v2: use child_info to pass dentry namelen
v2 -> v3: check child != NULL to include the F2FS_FT_ORPHAN file type
v3 -> v4: fix the i_namelen problem of dump.f2fs、

There is no commit log, so what do you mean about "avoid double free"?

Other than that, looks good to me.

Reviewed-by: Chao Yu <yuch...@huawei.com>

Thanks,


.



.



--
Thanks,
Yunlong Song




Re: [PATCH v4] fsck.f2fs: check and fix i_namelen to avoid double free

2017-12-22 Thread Yunlong Song

And there is en[namelen] = '\0', should fix namelen to its right value.

On 2017/12/23 11:35, Chao Yu wrote:

On 2017/12/23 11:19, Yunlong Song wrote:

Double free problem:
Since ddr bit jump makes i_namelen a larger value (> 255),when file is
not encrypted,
the convert_encrypted_name will memcpy out range of en[255], when en is
freed, there
will be double free problem.

It looks there is only memcpy overflow problem here.

Thanks,


On 2017/12/23 11:05, Chao Yu wrote:

On 2017/12/18 21:25, Yunlong Song wrote:

v1 -> v2: use child_info to pass dentry namelen
v2 -> v3: check child != NULL to include the F2FS_FT_ORPHAN file type
v3 -> v4: fix the i_namelen problem of dump.f2fs、

There is no commit log, so what do you mean about "avoid double free"?

Other than that, looks good to me.

Reviewed-by: Chao Yu 

Thanks,


.



.



--
Thanks,
Yunlong Song




  1   2   3   4   5   6   7   8   >