Re: [f2fs-dev] [PATCH v11] f2fs: guarantee journalled quota data by checkpoint

2018-10-16 Thread Chao Yu
Jaegeuk,

Sorry for the long delay, I'm busy on other thing.

I'm trying your fixing code on both fsck and kernel with 'run.sh
por_fsstress' case.

And got below output, is that normal in updated fsck? I didn't have time to
look into this.

Info: checkpoint state = 8c6 :  quota_need_fsck nat_bits crc
compacted_summary orphan_inodes sudden-power-off
[fsck_chk_quota_files:1755] Fixing Quota file ([  0] ino [0x4])
[ERROR] quotaio_tree.c:83:write_blk:: Cannot write block (1320):
Inappropriate ioctl for device
[ERROR] quotaio_tree.c:110:get_free_dqblk:: Cannot allocate new quota block
(out of disk space).
[ERROR] quotaio_tree.c:315:dq_insert_tree:: Cannot write quota (id
67368348): Inappropriate ioctl for device
[fsck_chk_quota_files:1755] Fixing Quota file ([  1] ino [0x5])
[ERROR] quotaio_tree.c:83:write_blk:: Cannot write block (1332):
Inappropriate ioctl for device
[ERROR] quotaio_tree.c:110:get_free_dqblk:: Cannot allocate new quota block
(out of disk space).
[ERROR] quotaio_tree.c:315:dq_insert_tree:: Cannot write quota (id
73435216): Inappropriate ioctl for device

Thanks,


On 2018/10/3 0:45, Jaegeuk Kim wrote:
> On 10/01, Chao Yu wrote:
>> On 2018-10-1 9:49, Jaegeuk Kim wrote:
>>> On 10/01, Chao Yu wrote:
 On 2018-10-1 9:29, Jaegeuk Kim wrote:
> On 10/01, Chao Yu wrote:
>> Hi Jaegeuk,
>>
>> On 2018-10-1 8:06, Jaegeuk Kim wrote:
>>> Hi Chao,
>>>
>>> This fails on fsstress with godown without fault injection. Could you 
>>> please
>>> test a bit? I assumed that this patch should give no fsck failure along 
>>> with
>>> valid checkpoint having no flag.
>>
>> Okay, let me reproduce with that case.
>>
>>>
>>> BTW, I'm in doubt that f2fs_lock_all covers entire quota modification. 
>>> What
>>> about prepare_write_begin() -> f2fs_get_block() ... -> 
>>> inc_valid_block_count()?
>>
>> If quota data changed in above path, we will detect that in below 
>> condition:
>>
>> block_operation()
>>
>>  down_write(>node_change);
>>
>>  if (__need_flush_quota(sbi)) {
>>  up_write(>node_change);
>>  f2fs_unlock_all(sbi);
>>  goto retry_flush_quotas;
>>  }
>>
>> So there is no problem?
>
> We may need to check quota is dirty, since we have no way to detect by
> f2fs structures?

 Below condition can check that.

 static bool __need_flush_quota(struct f2fs_sb_info *sbi)
 {
 ...
if (is_sbi_flag_set(sbi, SBI_QUOTA_NEED_FLUSH))
return true;
if (get_pages(sbi, F2FS_DIRTY_QDATA))
return true;
 ...
 }

 static int f2fs_dquot_mark_dquot_dirty(struct dquot *dquot)
 {
 ...
ret = dquot_mark_dquot_dirty(dquot);

/* if we are using journalled quota */
if (is_journalled_quota(sbi))
set_sbi_flag(sbi, SBI_QUOTA_NEED_FLUSH);
 ...
 }
>>>
>>> Okay, then, could you please run the above stress test to reproduce this?
>>
>> Sure, let me try this case and fix it.
>>
>> Could you check other patches in mailing list, and test them instead?
> 
> With the below change, the test result is much better for now.
> Let me know, if you have further concern.
> 
> ---
>  fs/f2fs/checkpoint.c | 6 ++
>  fs/f2fs/super.c  | 4 +++-
>  2 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
> index a1facfbfc5c7..b111c6201023 100644
> --- a/fs/f2fs/checkpoint.c
> +++ b/fs/f2fs/checkpoint.c
> @@ -,6 +,8 @@ static int block_operations(struct f2fs_sb_info *sbi)
>  
>  retry_flush_quotas:
>   if (__need_flush_quota(sbi)) {
> + int locked;
> +
>   if (++cnt > DEFAULT_RETRY_QUOTA_FLUSH_COUNT) {
>   set_sbi_flag(sbi, SBI_QUOTA_SKIP_FLUSH);
>   f2fs_lock_all(sbi);
> @@ -1118,7 +1120,11 @@ static int block_operations(struct f2fs_sb_info *sbi)
>   }
>   clear_sbi_flag(sbi, SBI_QUOTA_NEED_FLUSH);
>  
> + /* only failed during mount/umount/freeze/quotactl */
> + locked = down_read_trylock(>sb->s_umount);
>   f2fs_quota_sync(sbi->sb, -1);
> + if (locked)
> + up_read(>sb->s_umount);
>   }
>  
>   f2fs_lock_all(sbi);
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> index a28c245b1288..b39f60d57120 100644
> --- a/fs/f2fs/super.c
> +++ b/fs/f2fs/super.c
> @@ -1706,6 +1706,7 @@ static ssize_t f2fs_quota_read(struct super_block *sb, 
> int type, char *data,
>   congestion_wait(BLK_RW_ASYNC, HZ/50);
>   goto repeat;
>   }
> + set_sbi_flag(F2FS_SB(sb), SBI_QUOTA_NEED_REPAIR);
>   return PTR_ERR(page);
>   }
>  
> @@ -1717,6 +1718,7 @@ static ssize_t f2fs_quota_read(struct super_block *sb, 
> int type, 

Re: [f2fs-dev] [PATCH] f2fs: remove request_list check in is_idle()

2018-10-16 Thread Jaegeuk Kim
On 10/17, Chao Yu wrote:
> On 2018/10/17 10:19, Jaegeuk Kim wrote:
> > On 10/17, Chao Yu wrote:
> >> Hi Jaegeuk, Jens,
> >>
> >> On 2018/10/17 3:23, Jaegeuk Kim wrote:
> >>> Thanks Jens,
> >>>
> >>> On top of the patch killing the dead code, I wrote another one to detect
> >>> the idle time by the internal account logic like below. IMHO, it'd be
> >>> better to decouple f2fs with other layers, if possible.
> >>>
> >>> Thanks,
> >>>
> >>> >From 85daf5190671b3d98ef779bdea77b4a046658708 Mon Sep 17 00:00:00 2001
> >>> From: Jaegeuk Kim 
> >>> Date: Tue, 16 Oct 2018 10:20:53 -0700
> >>> Subject: [PATCH] f2fs: account read IOs and use IO counts for is_idle
> >>>
> >>> This patch adds issued read IO counts which is under block layer.
> >>
> >> I think the problem here is, in android environment, there is multiple
> >> kinds of filesystems used in one device (emmc or ufs), so, if f2fs only be
> >> aware of IOs from itself, still we can face IO conflict between f2fs and
> >> other filesystem (ext4 used in system/... partition), it is possible when
> >> we are trigger GC/discard intensively such as in gc_urgent mode, user can
> >> be stuck when loading data from system partition.
> > 
> > IMO, we don't need to worry about gc_urgent too much, since it will be set 
> > at
> 
> For most of the time, it's okay, in android idle time (IIRC, in middle
> night, charging, screen shutdowns for a while) can be broken by user who
> has bad habit that using cell phone in middle night randomly then user
> operation can be stuck due to IO busy storage.

It's only triggered once a day. And, I don't think /system is only giving
IOs at that moment.

> 
> > the given android idle time. Other than that, I think it won't trigger GC 
> > and
> > discard too frequently which would be able to affect /system operations.
> 
> How about considering low-end products? since performance of storage device
> is real bad, handling discard/read IO in /data can affect /system operations.
> 
> > The new interface would be nice to have, but give more complexity on 
> > backporting
> > work in old kernels and newer as well across the layers.
> 
> Yes, that's a problem.
> 
> BTW, we just encounter 'GC not work' problem in product, I guess that's due
> to is_idle()'s problem, let's try to confirm the root cause.
> 
> Thanks,
> 
> > 
> >>
> >> So I guess the better way to avoid this is to export IO busy status in
> >> block layer to filesystem, in f2fs, we can provider effective service in
> >> background thread at real idle time, and will not impact user.
> >>
> >> Any thoughts?
> >>
> >> Thanks,
> >>
> >>>
> >>> Signed-off-by: Jaegeuk Kim 
> >>> ---
> >>>  fs/f2fs/data.c  | 24 ++--
> >>>  fs/f2fs/debug.c |  7 ++-
> >>>  fs/f2fs/f2fs.h  | 22 --
> >>>  3 files changed, 44 insertions(+), 9 deletions(-)
> >>>
> >>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> >>> index 8952f2d610a6..5fdc8d751f19 100644
> >>> --- a/fs/f2fs/data.c
> >>> +++ b/fs/f2fs/data.c
> >>> @@ -52,6 +52,23 @@ static bool __is_cp_guaranteed(struct page *page)
> >>>   return false;
> >>>  }
> >>>  
> >>> +static enum count_type __read_io_type(struct page *page)
> >>> +{
> >>> + struct address_space *mapping = page->mapping;
> >>> +
> >>> + if (mapping) {
> >>> + struct inode *inode = mapping->host;
> >>> + struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
> >>> +
> >>> + if (inode->i_ino == F2FS_META_INO(sbi))
> >>> + return F2FS_RD_META;
> >>> +
> >>> + if (inode->i_ino == F2FS_NODE_INO(sbi))
> >>> + return F2FS_RD_NODE;
> >>> + }
> >>> + return F2FS_RD_DATA;
> >>> +}
> >>> +
> >>>  /* postprocessing steps for read bios */
> >>>  enum bio_post_read_step {
> >>>   STEP_INITIAL = 0,
> >>> @@ -82,6 +99,7 @@ static void __read_end_io(struct bio *bio)
> >>>   } else {
> >>>   SetPageUptodate(page);
> >>>   }
> >>> + dec_page_count(F2FS_P_SB(page), __read_io_type(page));
> >>>   unlock_page(page);
> >>>   }
> >>>   if (bio->bi_private)
> >>> @@ -464,8 +482,8 @@ int f2fs_submit_page_bio(struct f2fs_io_info *fio)
> >>>  
> >>>   __submit_bio(fio->sbi, bio, fio->type);
> >>>  
> >>> - if (!is_read_io(fio->op))
> >>> - inc_page_count(fio->sbi, WB_DATA_TYPE(fio->page));
> >>> + inc_page_count(fio->sbi, is_read_io(fio->op) ?
> >>> + __read_io_type(page): WB_DATA_TYPE(fio->page));
> >>>   return 0;
> >>>  }
> >>>  
> >>> @@ -595,6 +613,7 @@ static int f2fs_submit_page_read(struct inode *inode, 
> >>> struct page *page,
> >>>   }
> >>>   ClearPageError(page);
> >>>   __submit_bio(F2FS_I_SB(inode), bio, DATA);
> >>> + inc_page_count(F2FS_I_SB(inode), F2FS_RD_DATA);
> >>>   return 0;
> >>>  }
> >>>  
> >>> @@ -1593,6 +1612,7 @@ static int f2fs_mpage_readpages(struct 
> >>> address_space *mapping,
> >>>   if (bio_add_page(bio, page, blocksize, 0) < blocksize)
> >>>   goto submit_and_realloc;
> >>>  
> 

Re: [f2fs-dev] [PATCH v2] f2fs: clear cold data flag if IO is not counted

2018-10-16 Thread Chao Yu
On 2018/10/17 10:34, Jaegeuk Kim wrote:
> This reverts commit 66110abc4c931f879d70e83e1281f891699364bf.
> 
> If we clear the cold data flag out of the writeback flow, we can miscount
> -1 by end_io.
> 
> Balancing F2FS Async:
>  - IO (CP:1, Data:   -1, Flush: (   001), Discard: (   ...
> 
> GC thread:  IRQ
> - move_data_page()
>  - set_page_dirty()
>   - clear_cold_data()
> - f2fs_write_end_io()
>  - type = WB_DATA_TYPE(page);
>here, we get wrong type
>  - dec_page_count(sbi, type);
>  - f2fs_wait_on_page_writeback()
> 
> Cc: 
> Signed-off-by: Jaegeuk Kim 

Reviewed-by: Chao Yu 

Thanks,



___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


Re: [f2fs-dev] [PATCH] f2fs: remove request_list check in is_idle()

2018-10-16 Thread Chao Yu
On 2018/10/17 10:19, Jaegeuk Kim wrote:
> On 10/17, Chao Yu wrote:
>> Hi Jaegeuk, Jens,
>>
>> On 2018/10/17 3:23, Jaegeuk Kim wrote:
>>> Thanks Jens,
>>>
>>> On top of the patch killing the dead code, I wrote another one to detect
>>> the idle time by the internal account logic like below. IMHO, it'd be
>>> better to decouple f2fs with other layers, if possible.
>>>
>>> Thanks,
>>>
>>> >From 85daf5190671b3d98ef779bdea77b4a046658708 Mon Sep 17 00:00:00 2001
>>> From: Jaegeuk Kim 
>>> Date: Tue, 16 Oct 2018 10:20:53 -0700
>>> Subject: [PATCH] f2fs: account read IOs and use IO counts for is_idle
>>>
>>> This patch adds issued read IO counts which is under block layer.
>>
>> I think the problem here is, in android environment, there is multiple
>> kinds of filesystems used in one device (emmc or ufs), so, if f2fs only be
>> aware of IOs from itself, still we can face IO conflict between f2fs and
>> other filesystem (ext4 used in system/... partition), it is possible when
>> we are trigger GC/discard intensively such as in gc_urgent mode, user can
>> be stuck when loading data from system partition.
> 
> IMO, we don't need to worry about gc_urgent too much, since it will be set at

For most of the time, it's okay, in android idle time (IIRC, in middle
night, charging, screen shutdowns for a while) can be broken by user who
has bad habit that using cell phone in middle night randomly then user
operation can be stuck due to IO busy storage.

> the given android idle time. Other than that, I think it won't trigger GC and
> discard too frequently which would be able to affect /system operations.

How about considering low-end products? since performance of storage device
is real bad, handling discard/read IO in /data can affect /system operations.

> The new interface would be nice to have, but give more complexity on 
> backporting
> work in old kernels and newer as well across the layers.

Yes, that's a problem.

BTW, we just encounter 'GC not work' problem in product, I guess that's due
to is_idle()'s problem, let's try to confirm the root cause.

Thanks,

> 
>>
>> So I guess the better way to avoid this is to export IO busy status in
>> block layer to filesystem, in f2fs, we can provider effective service in
>> background thread at real idle time, and will not impact user.
>>
>> Any thoughts?
>>
>> Thanks,
>>
>>>
>>> Signed-off-by: Jaegeuk Kim 
>>> ---
>>>  fs/f2fs/data.c  | 24 ++--
>>>  fs/f2fs/debug.c |  7 ++-
>>>  fs/f2fs/f2fs.h  | 22 --
>>>  3 files changed, 44 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>>> index 8952f2d610a6..5fdc8d751f19 100644
>>> --- a/fs/f2fs/data.c
>>> +++ b/fs/f2fs/data.c
>>> @@ -52,6 +52,23 @@ static bool __is_cp_guaranteed(struct page *page)
>>> return false;
>>>  }
>>>  
>>> +static enum count_type __read_io_type(struct page *page)
>>> +{
>>> +   struct address_space *mapping = page->mapping;
>>> +
>>> +   if (mapping) {
>>> +   struct inode *inode = mapping->host;
>>> +   struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
>>> +
>>> +   if (inode->i_ino == F2FS_META_INO(sbi))
>>> +   return F2FS_RD_META;
>>> +
>>> +   if (inode->i_ino == F2FS_NODE_INO(sbi))
>>> +   return F2FS_RD_NODE;
>>> +   }
>>> +   return F2FS_RD_DATA;
>>> +}
>>> +
>>>  /* postprocessing steps for read bios */
>>>  enum bio_post_read_step {
>>> STEP_INITIAL = 0,
>>> @@ -82,6 +99,7 @@ static void __read_end_io(struct bio *bio)
>>> } else {
>>> SetPageUptodate(page);
>>> }
>>> +   dec_page_count(F2FS_P_SB(page), __read_io_type(page));
>>> unlock_page(page);
>>> }
>>> if (bio->bi_private)
>>> @@ -464,8 +482,8 @@ int f2fs_submit_page_bio(struct f2fs_io_info *fio)
>>>  
>>> __submit_bio(fio->sbi, bio, fio->type);
>>>  
>>> -   if (!is_read_io(fio->op))
>>> -   inc_page_count(fio->sbi, WB_DATA_TYPE(fio->page));
>>> +   inc_page_count(fio->sbi, is_read_io(fio->op) ?
>>> +   __read_io_type(page): WB_DATA_TYPE(fio->page));
>>> return 0;
>>>  }
>>>  
>>> @@ -595,6 +613,7 @@ static int f2fs_submit_page_read(struct inode *inode, 
>>> struct page *page,
>>> }
>>> ClearPageError(page);
>>> __submit_bio(F2FS_I_SB(inode), bio, DATA);
>>> +   inc_page_count(F2FS_I_SB(inode), F2FS_RD_DATA);
>>> return 0;
>>>  }
>>>  
>>> @@ -1593,6 +1612,7 @@ static int f2fs_mpage_readpages(struct address_space 
>>> *mapping,
>>> if (bio_add_page(bio, page, blocksize, 0) < blocksize)
>>> goto submit_and_realloc;
>>>  
>>> +   inc_page_count(F2FS_I_SB(inode), F2FS_RD_DATA);
>>> ClearPageError(page);
>>> last_block_in_bio = block_nr;
>>> goto next_page;
>>> diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
>>> index 026e10f30889..139b4d5c83d5 100644
>>> --- a/fs/f2fs/debug.c
>>> +++ b/fs/f2fs/debug.c
>>> 

Re: [f2fs-dev] [PATCH] f2fs: clear cold data flag if IO is not counted

2018-10-16 Thread Jaegeuk Kim
On 10/17, Chao Yu wrote:
> On 2018/10/16 11:10, Jaegeuk Kim wrote:
> > On 10/16, Chao Yu wrote:
> >> On 2018/10/16 7:08, Jaegeuk Kim wrote:
> >>> On 10/15, Chao Yu wrote:
>  On 2018/10/11 5:22, Jaegeuk Kim wrote:
> > If we clear the cold data flag out of the writeback flow, we can 
> > miscount
> > -1 by end_io.
> 
>  I didn't get it, which count do you mean?
> >>>
> >>> It's the number of dirty pages.
> >>>
> >>> Balancing F2FS Async:
> >>>   - IO (CP:1, Data:   -1, Flush: (   001), Discard: (   0 
> >>> 129304)) cmd:0 undiscard:   0
> 
> Better to add such info in commit message. :)
> 
> >>>
> >>
> >> So I guess the race should be:
> >>
> >> GC thread: IRQ
> >> - move_data_page()
> >>  - set_page_dirty()
> >>   - clear_cold_data()
> >>- f2fs_write_end_io()
> >> - type = WB_DATA_TYPE(page);
> >>   here, we get wrong type
> >> - dec_page_count(sbi, type);
> >>  - f2fs_wait_on_page_writeback()
> >>
> >> How about relocate wait_writeback & set_page_dirty to avoid above race:
> > 
> > Well, wait_on_stable_page() doesn't guarantee its end_io. So, I'm not sure
> > this is the only case.
> 
> Yes, you're right, I missed that case.
> 
> Can you use git-revert to generate the patch? so we can remain original
> commit info for better backward tracking.
> 
> How do you think pick up original patch I submitted:
> 
> https://lkml.org/lkml/2018/7/27/415

Seems that works.

Thanks,

> 
> Thanks,
> 
> > 
> >>
> >> move_data_page()
> >>
> >> retry:
> >>f2fs_wait_on_page_writeback(page, DATA, true);
> >>set_page_dirty(page);
> >>
> >> Thanks,
> >>
> 
>  Thanks,
> 
> >
> > Signed-off-by: Jaegeuk Kim 
> > ---
> >  fs/f2fs/data.c | 4 
> >  1 file changed, 4 deletions(-)
> >
> > diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> > index 29a9d3b8f709..4102799b5558 100644
> > --- a/fs/f2fs/data.c
> > +++ b/fs/f2fs/data.c
> > @@ -2636,10 +2636,6 @@ static int f2fs_set_data_page_dirty(struct page 
> > *page)
> > if (!PageUptodate(page))
> > SetPageUptodate(page);
> >  
> > -   /* don't remain PG_checked flag which was set during GC */
> > -   if (is_cold_data(page))
> > -   clear_cold_data(page);
> > -
> > if (f2fs_is_atomic_file(inode) && 
> > !f2fs_is_commit_atomic_write(inode)) {
> > if (!IS_ATOMIC_WRITTEN_PAGE(page)) {
> > f2fs_register_inmem_page(inode, page);
> >
> >>>
> >>> .
> >>>
> > 
> > .
> > 


___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


Re: [f2fs-dev] [PATCH v2] f2fs: clear cold data flag if IO is not counted

2018-10-16 Thread Jaegeuk Kim
This reverts commit 66110abc4c931f879d70e83e1281f891699364bf.

If we clear the cold data flag out of the writeback flow, we can miscount
-1 by end_io.

Balancing F2FS Async:
 - IO (CP:1, Data:   -1, Flush: (   001), Discard: (   ...

GC thread:  IRQ
- move_data_page()
 - set_page_dirty()
  - clear_cold_data()
- f2fs_write_end_io()
 - type = WB_DATA_TYPE(page);
   here, we get wrong type
 - dec_page_count(sbi, type);
 - f2fs_wait_on_page_writeback()

Cc: 
Signed-off-by: Jaegeuk Kim 
---
 fs/f2fs/data.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 3f272c18fb61..0d0b4dd55b04 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2650,10 +2650,6 @@ static int f2fs_set_data_page_dirty(struct page *page)
if (!PageUptodate(page))
SetPageUptodate(page);
 
-   /* don't remain PG_checked flag which was set during GC */
-   if (is_cold_data(page))
-   clear_cold_data(page);
-
if (f2fs_is_atomic_file(inode) && !f2fs_is_commit_atomic_write(inode)) {
if (!IS_ATOMIC_WRITTEN_PAGE(page)) {
f2fs_register_inmem_page(inode, page);
-- 
2.19.0.605.g01d371f741-goog



___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


Re: [f2fs-dev] [PATCH] f2fs: remove request_list check in is_idle()

2018-10-16 Thread Jaegeuk Kim
On 10/17, Chao Yu wrote:
> Hi Jaegeuk, Jens,
> 
> On 2018/10/17 3:23, Jaegeuk Kim wrote:
> > Thanks Jens,
> > 
> > On top of the patch killing the dead code, I wrote another one to detect
> > the idle time by the internal account logic like below. IMHO, it'd be
> > better to decouple f2fs with other layers, if possible.
> > 
> > Thanks,
> > 
> >>From 85daf5190671b3d98ef779bdea77b4a046658708 Mon Sep 17 00:00:00 2001
> > From: Jaegeuk Kim 
> > Date: Tue, 16 Oct 2018 10:20:53 -0700
> > Subject: [PATCH] f2fs: account read IOs and use IO counts for is_idle
> > 
> > This patch adds issued read IO counts which is under block layer.
> 
> I think the problem here is, in android environment, there is multiple
> kinds of filesystems used in one device (emmc or ufs), so, if f2fs only be
> aware of IOs from itself, still we can face IO conflict between f2fs and
> other filesystem (ext4 used in system/... partition), it is possible when
> we are trigger GC/discard intensively such as in gc_urgent mode, user can
> be stuck when loading data from system partition.

IMO, we don't need to worry about gc_urgent too much, since it will be set at
the given android idle time. Other than that, I think it won't trigger GC and
discard too frequently which would be able to affect /system operations.
The new interface would be nice to have, but give more complexity on backporting
work in old kernels and newer as well across the layers.

> 
> So I guess the better way to avoid this is to export IO busy status in
> block layer to filesystem, in f2fs, we can provider effective service in
> background thread at real idle time, and will not impact user.
> 
> Any thoughts?
> 
> Thanks,
> 
> > 
> > Signed-off-by: Jaegeuk Kim 
> > ---
> >  fs/f2fs/data.c  | 24 ++--
> >  fs/f2fs/debug.c |  7 ++-
> >  fs/f2fs/f2fs.h  | 22 --
> >  3 files changed, 44 insertions(+), 9 deletions(-)
> > 
> > diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> > index 8952f2d610a6..5fdc8d751f19 100644
> > --- a/fs/f2fs/data.c
> > +++ b/fs/f2fs/data.c
> > @@ -52,6 +52,23 @@ static bool __is_cp_guaranteed(struct page *page)
> > return false;
> >  }
> >  
> > +static enum count_type __read_io_type(struct page *page)
> > +{
> > +   struct address_space *mapping = page->mapping;
> > +
> > +   if (mapping) {
> > +   struct inode *inode = mapping->host;
> > +   struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
> > +
> > +   if (inode->i_ino == F2FS_META_INO(sbi))
> > +   return F2FS_RD_META;
> > +
> > +   if (inode->i_ino == F2FS_NODE_INO(sbi))
> > +   return F2FS_RD_NODE;
> > +   }
> > +   return F2FS_RD_DATA;
> > +}
> > +
> >  /* postprocessing steps for read bios */
> >  enum bio_post_read_step {
> > STEP_INITIAL = 0,
> > @@ -82,6 +99,7 @@ static void __read_end_io(struct bio *bio)
> > } else {
> > SetPageUptodate(page);
> > }
> > +   dec_page_count(F2FS_P_SB(page), __read_io_type(page));
> > unlock_page(page);
> > }
> > if (bio->bi_private)
> > @@ -464,8 +482,8 @@ int f2fs_submit_page_bio(struct f2fs_io_info *fio)
> >  
> > __submit_bio(fio->sbi, bio, fio->type);
> >  
> > -   if (!is_read_io(fio->op))
> > -   inc_page_count(fio->sbi, WB_DATA_TYPE(fio->page));
> > +   inc_page_count(fio->sbi, is_read_io(fio->op) ?
> > +   __read_io_type(page): WB_DATA_TYPE(fio->page));
> > return 0;
> >  }
> >  
> > @@ -595,6 +613,7 @@ static int f2fs_submit_page_read(struct inode *inode, 
> > struct page *page,
> > }
> > ClearPageError(page);
> > __submit_bio(F2FS_I_SB(inode), bio, DATA);
> > +   inc_page_count(F2FS_I_SB(inode), F2FS_RD_DATA);
> > return 0;
> >  }
> >  
> > @@ -1593,6 +1612,7 @@ static int f2fs_mpage_readpages(struct address_space 
> > *mapping,
> > if (bio_add_page(bio, page, blocksize, 0) < blocksize)
> > goto submit_and_realloc;
> >  
> > +   inc_page_count(F2FS_I_SB(inode), F2FS_RD_DATA);
> > ClearPageError(page);
> > last_block_in_bio = block_nr;
> > goto next_page;
> > diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
> > index 026e10f30889..139b4d5c83d5 100644
> > --- a/fs/f2fs/debug.c
> > +++ b/fs/f2fs/debug.c
> > @@ -55,6 +55,9 @@ static void update_general_status(struct f2fs_sb_info 
> > *sbi)
> > si->max_vw_cnt = atomic_read(>max_vw_cnt);
> > si->nr_wb_cp_data = get_pages(sbi, F2FS_WB_CP_DATA);
> > si->nr_wb_data = get_pages(sbi, F2FS_WB_DATA);
> > +   si->nr_rd_data = get_pages(sbi, F2FS_RD_DATA);
> > +   si->nr_rd_node = get_pages(sbi, F2FS_RD_NODE);
> > +   si->nr_rd_meta = get_pages(sbi, F2FS_RD_META);
> > if (SM_I(sbi) && SM_I(sbi)->fcc_info) {
> > si->nr_flushed =
> > atomic_read(_I(sbi)->fcc_info->issued_flush);
> > @@ -371,7 +374,9 @@ static int stat_show(struct seq_file *s, void *v)
> > 

[f2fs-dev] [PATCH] mkfs.f2fs: support formating large size file in 32-bits platform

2018-10-16 Thread Chao Yu
In 32-bits platform, {f,}stat on a large size file during mkfs, it will
cause EOVERFLOW error, this patch fixes to add macro definition
_FILE_OFFSET_BITS to avoid that error.

Signed-off-by: Chao Yu 
---
 lib/libf2fs.c| 1 +
 mkfs/f2fs_format_utils.c | 4 
 2 files changed, 5 insertions(+)

diff --git a/lib/libf2fs.c b/lib/libf2fs.c
index 8bda1fed082c..529f1caa4144 100644
--- a/lib/libf2fs.c
+++ b/lib/libf2fs.c
@@ -7,6 +7,7 @@
  * Dual licensed under the GPL or LGPL version 2 licenses.
  */
 #define _LARGEFILE64_SOURCE
+#define _FILE_OFFSET_BITS 64
 
 #include 
 #include 
diff --git a/mkfs/f2fs_format_utils.c b/mkfs/f2fs_format_utils.c
index 5b9cc0c6ac69..e620190df4b7 100644
--- a/mkfs/f2fs_format_utils.c
+++ b/mkfs/f2fs_format_utils.c
@@ -16,6 +16,10 @@
 #define _GNU_SOURCE
 #endif
 
+#ifndef _FILE_OFFSET_BITS
+#define _FILE_OFFSET_BITS 64
+#endif
+
 #include 
 
 #include 
-- 
2.18.0.rc1



___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


Re: [f2fs-dev] [PATCH] f2fs: clear cold data flag if IO is not counted

2018-10-16 Thread Chao Yu
On 2018/10/16 11:10, Jaegeuk Kim wrote:
> On 10/16, Chao Yu wrote:
>> On 2018/10/16 7:08, Jaegeuk Kim wrote:
>>> On 10/15, Chao Yu wrote:
 On 2018/10/11 5:22, Jaegeuk Kim wrote:
> If we clear the cold data flag out of the writeback flow, we can miscount
> -1 by end_io.

 I didn't get it, which count do you mean?
>>>
>>> It's the number of dirty pages.
>>>
>>> Balancing F2FS Async:
>>>   - IO (CP:1, Data:   -1, Flush: (   001), Discard: (   0 
>>> 129304)) cmd:0 undiscard:   0

Better to add such info in commit message. :)

>>>
>>
>> So I guess the race should be:
>>
>> GC thread:   IRQ
>> - move_data_page()
>>  - set_page_dirty()
>>   - clear_cold_data()
>>  - f2fs_write_end_io()
>>   - type = WB_DATA_TYPE(page);
>> here, we get wrong type
>>   - dec_page_count(sbi, type);
>>  - f2fs_wait_on_page_writeback()
>>
>> How about relocate wait_writeback & set_page_dirty to avoid above race:
> 
> Well, wait_on_stable_page() doesn't guarantee its end_io. So, I'm not sure
> this is the only case.

Yes, you're right, I missed that case.

Can you use git-revert to generate the patch? so we can remain original
commit info for better backward tracking.

How do you think pick up original patch I submitted:

https://lkml.org/lkml/2018/7/27/415

Thanks,

> 
>>
>> move_data_page()
>>
>> retry:
>>  f2fs_wait_on_page_writeback(page, DATA, true);
>>  set_page_dirty(page);
>>
>> Thanks,
>>

 Thanks,

>
> Signed-off-by: Jaegeuk Kim 
> ---
>  fs/f2fs/data.c | 4 
>  1 file changed, 4 deletions(-)
>
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index 29a9d3b8f709..4102799b5558 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -2636,10 +2636,6 @@ static int f2fs_set_data_page_dirty(struct page 
> *page)
>   if (!PageUptodate(page))
>   SetPageUptodate(page);
>  
> - /* don't remain PG_checked flag which was set during GC */
> - if (is_cold_data(page))
> - clear_cold_data(page);
> -
>   if (f2fs_is_atomic_file(inode) && !f2fs_is_commit_atomic_write(inode)) {
>   if (!IS_ATOMIC_WRITTEN_PAGE(page)) {
>   f2fs_register_inmem_page(inode, page);
>
>>>
>>> .
>>>
> 
> .
> 



___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


Re: [f2fs-dev] [PATCH] f2fs: remove request_list check in is_idle()

2018-10-16 Thread Chao Yu
Hi Jaegeuk, Jens,

On 2018/10/17 3:23, Jaegeuk Kim wrote:
> Thanks Jens,
> 
> On top of the patch killing the dead code, I wrote another one to detect
> the idle time by the internal account logic like below. IMHO, it'd be
> better to decouple f2fs with other layers, if possible.
> 
> Thanks,
> 
>>From 85daf5190671b3d98ef779bdea77b4a046658708 Mon Sep 17 00:00:00 2001
> From: Jaegeuk Kim 
> Date: Tue, 16 Oct 2018 10:20:53 -0700
> Subject: [PATCH] f2fs: account read IOs and use IO counts for is_idle
> 
> This patch adds issued read IO counts which is under block layer.

I think the problem here is, in android environment, there is multiple
kinds of filesystems used in one device (emmc or ufs), so, if f2fs only be
aware of IOs from itself, still we can face IO conflict between f2fs and
other filesystem (ext4 used in system/... partition), it is possible when
we are trigger GC/discard intensively such as in gc_urgent mode, user can
be stuck when loading data from system partition.

So I guess the better way to avoid this is to export IO busy status in
block layer to filesystem, in f2fs, we can provider effective service in
background thread at real idle time, and will not impact user.

Any thoughts?

Thanks,

> 
> Signed-off-by: Jaegeuk Kim 
> ---
>  fs/f2fs/data.c  | 24 ++--
>  fs/f2fs/debug.c |  7 ++-
>  fs/f2fs/f2fs.h  | 22 --
>  3 files changed, 44 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index 8952f2d610a6..5fdc8d751f19 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -52,6 +52,23 @@ static bool __is_cp_guaranteed(struct page *page)
>   return false;
>  }
>  
> +static enum count_type __read_io_type(struct page *page)
> +{
> + struct address_space *mapping = page->mapping;
> +
> + if (mapping) {
> + struct inode *inode = mapping->host;
> + struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
> +
> + if (inode->i_ino == F2FS_META_INO(sbi))
> + return F2FS_RD_META;
> +
> + if (inode->i_ino == F2FS_NODE_INO(sbi))
> + return F2FS_RD_NODE;
> + }
> + return F2FS_RD_DATA;
> +}
> +
>  /* postprocessing steps for read bios */
>  enum bio_post_read_step {
>   STEP_INITIAL = 0,
> @@ -82,6 +99,7 @@ static void __read_end_io(struct bio *bio)
>   } else {
>   SetPageUptodate(page);
>   }
> + dec_page_count(F2FS_P_SB(page), __read_io_type(page));
>   unlock_page(page);
>   }
>   if (bio->bi_private)
> @@ -464,8 +482,8 @@ int f2fs_submit_page_bio(struct f2fs_io_info *fio)
>  
>   __submit_bio(fio->sbi, bio, fio->type);
>  
> - if (!is_read_io(fio->op))
> - inc_page_count(fio->sbi, WB_DATA_TYPE(fio->page));
> + inc_page_count(fio->sbi, is_read_io(fio->op) ?
> + __read_io_type(page): WB_DATA_TYPE(fio->page));
>   return 0;
>  }
>  
> @@ -595,6 +613,7 @@ static int f2fs_submit_page_read(struct inode *inode, 
> struct page *page,
>   }
>   ClearPageError(page);
>   __submit_bio(F2FS_I_SB(inode), bio, DATA);
> + inc_page_count(F2FS_I_SB(inode), F2FS_RD_DATA);
>   return 0;
>  }
>  
> @@ -1593,6 +1612,7 @@ static int f2fs_mpage_readpages(struct address_space 
> *mapping,
>   if (bio_add_page(bio, page, blocksize, 0) < blocksize)
>   goto submit_and_realloc;
>  
> + inc_page_count(F2FS_I_SB(inode), F2FS_RD_DATA);
>   ClearPageError(page);
>   last_block_in_bio = block_nr;
>   goto next_page;
> diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
> index 026e10f30889..139b4d5c83d5 100644
> --- a/fs/f2fs/debug.c
> +++ b/fs/f2fs/debug.c
> @@ -55,6 +55,9 @@ static void update_general_status(struct f2fs_sb_info *sbi)
>   si->max_vw_cnt = atomic_read(>max_vw_cnt);
>   si->nr_wb_cp_data = get_pages(sbi, F2FS_WB_CP_DATA);
>   si->nr_wb_data = get_pages(sbi, F2FS_WB_DATA);
> + si->nr_rd_data = get_pages(sbi, F2FS_RD_DATA);
> + si->nr_rd_node = get_pages(sbi, F2FS_RD_NODE);
> + si->nr_rd_meta = get_pages(sbi, F2FS_RD_META);
>   if (SM_I(sbi) && SM_I(sbi)->fcc_info) {
>   si->nr_flushed =
>   atomic_read(_I(sbi)->fcc_info->issued_flush);
> @@ -371,7 +374,9 @@ static int stat_show(struct seq_file *s, void *v)
>   seq_printf(s, "  - Inner Struct Count: tree: %d(%d), node: 
> %d\n",
>   si->ext_tree, si->zombie_tree, si->ext_node);
>   seq_puts(s, "\nBalancing F2FS Async:\n");
> - seq_printf(s, "  - IO (CP: %4d, Data: %4d, Flush: (%4d %4d 
> %4d), "
> + seq_printf(s, "  - IO_R (Data: %4d, Node: %4d, Meta: %4d\n",
> +si->nr_rd_data, si->nr_rd_node, si->nr_rd_meta);
> + seq_printf(s, "  - IO_W (CP: %4d, Data: %4d, Flush: (%4d %4d 
> %4d), "
>   

Re: [f2fs-dev] [PATCH] f2fs: remove request_list check in is_idle()

2018-10-16 Thread Chao Yu
On 2018/10/17 0:20, Jens Axboe wrote:
> On 10/16/18 10:06 AM, Chao Yu wrote:
>> Hi Jens,
>>
>> On 2018-10-16 22:34, Jens Axboe wrote:
>>> This doesn't work on stacked devices, and it doesn't work on
>>> blk-mq devices. The request_list is only used on legacy, which
>>> we don't have much of anymore, and soon won't have any of.
>>>
>>> Kill the check.
>>
>> In order to avoid conflicting with user IO, GC and Discard thread try to be
>> aware of IO status of device by is_idle(), if previous implementation doesn't
>> work, do we have any other existing method to get such status? Or do you have
>> any suggestion on this?
> 
> This particular patch just kills code that won't yield any useful
> information on stacked devices or blk-mq. As those are the only
> ones that will exist shortly, it's dead.

Yes, anyway, I'm okay with that change, please add:

Reviewed-by: Chao Yu 

> 
> For blk-mq, you could do a busy tag iteration. Something like this.
> That should be done separately though, in essence the existing code
> is dead/useless.

Thanks for given code below, let's discuss this on Jaegeuk's patch. :)

Thanks,

> 
> 
> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
> index 4254e74c1446..823f04209e8c 100644
> --- a/block/blk-mq-tag.c
> +++ b/block/blk-mq-tag.c
> @@ -403,6 +403,7 @@ void blk_mq_queue_tag_busy_iter(struct request_queue *q, 
> busy_iter_fn *fn,
>   }
>   blk_queue_exit(q);
>  }
> +EXPORT_SYMBOL_GPL(blk_mq_queue_tag_busy_iter);
>  
>  static int bt_alloc(struct sbitmap_queue *bt, unsigned int depth,
>   bool round_robin, int node)
> diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
> index 61deab0b5a5a..5af7ff94b400 100644
> --- a/block/blk-mq-tag.h
> +++ b/block/blk-mq-tag.h
> @@ -33,8 +33,6 @@ extern int blk_mq_tag_update_depth(struct blk_mq_hw_ctx 
> *hctx,
>   struct blk_mq_tags **tags,
>   unsigned int depth, bool can_grow);
>  extern void blk_mq_tag_wakeup_all(struct blk_mq_tags *tags, bool);
> -void blk_mq_queue_tag_busy_iter(struct request_queue *q, busy_iter_fn *fn,
> - void *priv);
>  
>  static inline struct sbq_wait_state *bt_wait_ptr(struct sbitmap_queue *bt,
>struct blk_mq_hw_ctx *hctx)
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index 58778992eca4..c3a2a7fe3f88 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -1347,18 +1347,6 @@ static inline void f2fs_update_time(struct 
> f2fs_sb_info *sbi, int type)
>   sbi->last_time[type] = jiffies;
>  }
>  
> -static inline bool f2fs_time_over(struct f2fs_sb_info *sbi, int type)
> -{
> - unsigned long interval = sbi->interval_time[type] * HZ;
> -
> - return time_after(jiffies, sbi->last_time[type] + interval);
> -}
> -
> -static inline bool is_idle(struct f2fs_sb_info *sbi)
> -{
> - return f2fs_time_over(sbi, REQ_TIME);
> -}
> -
>  /*
>   * Inline functions
>   */
> @@ -2948,6 +2936,7 @@ void f2fs_destroy_segment_manager_caches(void);
>  int f2fs_rw_hint_to_seg_type(enum rw_hint hint);
>  enum rw_hint f2fs_io_type_to_rw_hint(struct f2fs_sb_info *sbi,
>   enum page_type type, enum temp_type temp);
> +bool f2fs_is_idle(struct f2fs_sb_info *sbi);
>  
>  /*
>   * checkpoint.c
> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> index 5c8d00422237..fb4152527cf1 100644
> --- a/fs/f2fs/gc.c
> +++ b/fs/f2fs/gc.c
> @@ -83,7 +83,7 @@ static int gc_thread_func(void *data)
>   if (!mutex_trylock(>gc_mutex))
>   goto next;
>  
> - if (!is_idle(sbi)) {
> + if (!f2fs_is_idle(sbi)) {
>   increase_sleep_time(gc_th, _ms);
>   mutex_unlock(>gc_mutex);
>   goto next;
> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> index 30779aaa9dba..a02c5ecfb2e4 100644
> --- a/fs/f2fs/segment.c
> +++ b/fs/f2fs/segment.c
> @@ -11,7 +11,7 @@
>  #include 
>  #include 
>  #include 
> -#include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -33,6 +33,37 @@ static struct kmem_cache *discard_cmd_slab;
>  static struct kmem_cache *sit_entry_set_slab;
>  static struct kmem_cache *inmem_entry_slab;
>  
> +static bool f2fs_time_over(struct f2fs_sb_info *sbi, int type)
> +{
> + unsigned long interval = sbi->interval_time[type] * HZ;
> +
> + return time_after(jiffies, sbi->last_time[type] + interval);
> +}
> +
> +static void is_idle_fn(struct blk_mq_hw_ctx *hctx, struct request *req,
> +void *data, bool reserved)
> +{
> + unsigned int *cnt = data;
> +
> + (*cnt)++;
> +}
> +
> +bool f2fs_is_idle(struct f2fs_sb_info *sbi)
> +{
> + struct block_device *bdev = sbi->sb->s_bdev;
> + struct request_queue *q = bdev_get_queue(bdev);
> +
> + if (q->mq_ops) {
> + unsigned int cnt = 0;
> +
> + blk_mq_queue_tag_busy_iter(q, is_idle_fn, );
> + if (cnt)
> + return 

Re: [f2fs-dev] [PATCH] f2fs: remove request_list check in is_idle()

2018-10-16 Thread Jens Axboe
On 10/16/18 1:31 PM, Jaegeuk Kim wrote:
> On 10/16, Jens Axboe wrote:
>> On 10/16/18 1:23 PM, Jaegeuk Kim wrote:
>>> Thanks Jens,
>>>
>>> On top of the patch killing the dead code, I wrote another one to detect
>>> the idle time by the internal account logic like below. IMHO, it'd be
>>> better to decouple f2fs with other layers, if possible.
>>
>> I agree, that's what got us into trouble in the first place. Can I add
>> your acked-by or review-by to the other patch, then?
> 
> I queued the patch in f2fs for -next with SOB, so I guess block for -next 
> should
> be fine to assume this patch is in. Can I?

Yes, that's totally fine as well, thanks.

-- 
Jens Axboe



___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


Re: [f2fs-dev] [PATCH] f2fs: remove request_list check in is_idle()

2018-10-16 Thread Jaegeuk Kim
On 10/16, Jens Axboe wrote:
> On 10/16/18 1:23 PM, Jaegeuk Kim wrote:
> > Thanks Jens,
> > 
> > On top of the patch killing the dead code, I wrote another one to detect
> > the idle time by the internal account logic like below. IMHO, it'd be
> > better to decouple f2fs with other layers, if possible.
> 
> I agree, that's what got us into trouble in the first place. Can I add
> your acked-by or review-by to the other patch, then?

I queued the patch in f2fs for -next with SOB, so I guess block for -next should
be fine to assume this patch is in. Can I?

> 
> -- 
> Jens Axboe


___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


Re: [f2fs-dev] [PATCH] f2fs: remove request_list check in is_idle()

2018-10-16 Thread Jens Axboe
On 10/16/18 1:23 PM, Jaegeuk Kim wrote:
> Thanks Jens,
> 
> On top of the patch killing the dead code, I wrote another one to detect
> the idle time by the internal account logic like below. IMHO, it'd be
> better to decouple f2fs with other layers, if possible.

I agree, that's what got us into trouble in the first place. Can I add
your acked-by or review-by to the other patch, then?

-- 
Jens Axboe



___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


Re: [f2fs-dev] [PATCH] f2fs: remove request_list check in is_idle()

2018-10-16 Thread Jaegeuk Kim
Thanks Jens,

On top of the patch killing the dead code, I wrote another one to detect
the idle time by the internal account logic like below. IMHO, it'd be
better to decouple f2fs with other layers, if possible.

Thanks,

>From 85daf5190671b3d98ef779bdea77b4a046658708 Mon Sep 17 00:00:00 2001
From: Jaegeuk Kim 
Date: Tue, 16 Oct 2018 10:20:53 -0700
Subject: [PATCH] f2fs: account read IOs and use IO counts for is_idle

This patch adds issued read IO counts which is under block layer.

Signed-off-by: Jaegeuk Kim 
---
 fs/f2fs/data.c  | 24 ++--
 fs/f2fs/debug.c |  7 ++-
 fs/f2fs/f2fs.h  | 22 --
 3 files changed, 44 insertions(+), 9 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 8952f2d610a6..5fdc8d751f19 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -52,6 +52,23 @@ static bool __is_cp_guaranteed(struct page *page)
return false;
 }
 
+static enum count_type __read_io_type(struct page *page)
+{
+   struct address_space *mapping = page->mapping;
+
+   if (mapping) {
+   struct inode *inode = mapping->host;
+   struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
+
+   if (inode->i_ino == F2FS_META_INO(sbi))
+   return F2FS_RD_META;
+
+   if (inode->i_ino == F2FS_NODE_INO(sbi))
+   return F2FS_RD_NODE;
+   }
+   return F2FS_RD_DATA;
+}
+
 /* postprocessing steps for read bios */
 enum bio_post_read_step {
STEP_INITIAL = 0,
@@ -82,6 +99,7 @@ static void __read_end_io(struct bio *bio)
} else {
SetPageUptodate(page);
}
+   dec_page_count(F2FS_P_SB(page), __read_io_type(page));
unlock_page(page);
}
if (bio->bi_private)
@@ -464,8 +482,8 @@ int f2fs_submit_page_bio(struct f2fs_io_info *fio)
 
__submit_bio(fio->sbi, bio, fio->type);
 
-   if (!is_read_io(fio->op))
-   inc_page_count(fio->sbi, WB_DATA_TYPE(fio->page));
+   inc_page_count(fio->sbi, is_read_io(fio->op) ?
+   __read_io_type(page): WB_DATA_TYPE(fio->page));
return 0;
 }
 
@@ -595,6 +613,7 @@ static int f2fs_submit_page_read(struct inode *inode, 
struct page *page,
}
ClearPageError(page);
__submit_bio(F2FS_I_SB(inode), bio, DATA);
+   inc_page_count(F2FS_I_SB(inode), F2FS_RD_DATA);
return 0;
 }
 
@@ -1593,6 +1612,7 @@ static int f2fs_mpage_readpages(struct address_space 
*mapping,
if (bio_add_page(bio, page, blocksize, 0) < blocksize)
goto submit_and_realloc;
 
+   inc_page_count(F2FS_I_SB(inode), F2FS_RD_DATA);
ClearPageError(page);
last_block_in_bio = block_nr;
goto next_page;
diff --git a/fs/f2fs/debug.c b/fs/f2fs/debug.c
index 026e10f30889..139b4d5c83d5 100644
--- a/fs/f2fs/debug.c
+++ b/fs/f2fs/debug.c
@@ -55,6 +55,9 @@ static void update_general_status(struct f2fs_sb_info *sbi)
si->max_vw_cnt = atomic_read(>max_vw_cnt);
si->nr_wb_cp_data = get_pages(sbi, F2FS_WB_CP_DATA);
si->nr_wb_data = get_pages(sbi, F2FS_WB_DATA);
+   si->nr_rd_data = get_pages(sbi, F2FS_RD_DATA);
+   si->nr_rd_node = get_pages(sbi, F2FS_RD_NODE);
+   si->nr_rd_meta = get_pages(sbi, F2FS_RD_META);
if (SM_I(sbi) && SM_I(sbi)->fcc_info) {
si->nr_flushed =
atomic_read(_I(sbi)->fcc_info->issued_flush);
@@ -371,7 +374,9 @@ static int stat_show(struct seq_file *s, void *v)
seq_printf(s, "  - Inner Struct Count: tree: %d(%d), node: 
%d\n",
si->ext_tree, si->zombie_tree, si->ext_node);
seq_puts(s, "\nBalancing F2FS Async:\n");
-   seq_printf(s, "  - IO (CP: %4d, Data: %4d, Flush: (%4d %4d 
%4d), "
+   seq_printf(s, "  - IO_R (Data: %4d, Node: %4d, Meta: %4d\n",
+  si->nr_rd_data, si->nr_rd_node, si->nr_rd_meta);
+   seq_printf(s, "  - IO_W (CP: %4d, Data: %4d, Flush: (%4d %4d 
%4d), "
"Discard: (%4d %4d)) cmd: %4d undiscard:%4u\n",
   si->nr_wb_cp_data, si->nr_wb_data,
   si->nr_flushing, si->nr_flushed,
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 156e6cd2c812..5c80eca194b5 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -950,6 +950,9 @@ enum count_type {
F2FS_DIRTY_IMETA,
F2FS_WB_CP_DATA,
F2FS_WB_DATA,
+   F2FS_RD_DATA,
+   F2FS_RD_NODE,
+   F2FS_RD_META,
NR_COUNT_TYPE,
 };
 
@@ -1392,11 +1395,6 @@ static inline unsigned int f2fs_time_to_wait(struct 
f2fs_sb_info *sbi,
return wait_ms;
 }
 
-static inline bool is_idle(struct f2fs_sb_info *sbi, int type)
-{
-   return f2fs_time_over(sbi, type);
-}
-
 /*
  * Inline functions
  */
@@ -1787,7 +1785,9 @@ static inline void 

Re: [f2fs-dev] [PATCH] f2fs: remove request_list check in is_idle()

2018-10-16 Thread Jens Axboe
On 10/16/18 10:06 AM, Chao Yu wrote:
> Hi Jens,
> 
> On 2018-10-16 22:34, Jens Axboe wrote:
>> This doesn't work on stacked devices, and it doesn't work on
>> blk-mq devices. The request_list is only used on legacy, which
>> we don't have much of anymore, and soon won't have any of.
>>
>> Kill the check.
> 
> In order to avoid conflicting with user IO, GC and Discard thread try to be
> aware of IO status of device by is_idle(), if previous implementation doesn't
> work, do we have any other existing method to get such status? Or do you have
> any suggestion on this?

This particular patch just kills code that won't yield any useful
information on stacked devices or blk-mq. As those are the only
ones that will exist shortly, it's dead.

For blk-mq, you could do a busy tag iteration. Something like this.
That should be done separately though, in essence the existing code
is dead/useless.


diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 4254e74c1446..823f04209e8c 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -403,6 +403,7 @@ void blk_mq_queue_tag_busy_iter(struct request_queue *q, 
busy_iter_fn *fn,
}
blk_queue_exit(q);
 }
+EXPORT_SYMBOL_GPL(blk_mq_queue_tag_busy_iter);
 
 static int bt_alloc(struct sbitmap_queue *bt, unsigned int depth,
bool round_robin, int node)
diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
index 61deab0b5a5a..5af7ff94b400 100644
--- a/block/blk-mq-tag.h
+++ b/block/blk-mq-tag.h
@@ -33,8 +33,6 @@ extern int blk_mq_tag_update_depth(struct blk_mq_hw_ctx *hctx,
struct blk_mq_tags **tags,
unsigned int depth, bool can_grow);
 extern void blk_mq_tag_wakeup_all(struct blk_mq_tags *tags, bool);
-void blk_mq_queue_tag_busy_iter(struct request_queue *q, busy_iter_fn *fn,
-   void *priv);
 
 static inline struct sbq_wait_state *bt_wait_ptr(struct sbitmap_queue *bt,
 struct blk_mq_hw_ctx *hctx)
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 58778992eca4..c3a2a7fe3f88 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1347,18 +1347,6 @@ static inline void f2fs_update_time(struct f2fs_sb_info 
*sbi, int type)
sbi->last_time[type] = jiffies;
 }
 
-static inline bool f2fs_time_over(struct f2fs_sb_info *sbi, int type)
-{
-   unsigned long interval = sbi->interval_time[type] * HZ;
-
-   return time_after(jiffies, sbi->last_time[type] + interval);
-}
-
-static inline bool is_idle(struct f2fs_sb_info *sbi)
-{
-   return f2fs_time_over(sbi, REQ_TIME);
-}
-
 /*
  * Inline functions
  */
@@ -2948,6 +2936,7 @@ void f2fs_destroy_segment_manager_caches(void);
 int f2fs_rw_hint_to_seg_type(enum rw_hint hint);
 enum rw_hint f2fs_io_type_to_rw_hint(struct f2fs_sb_info *sbi,
enum page_type type, enum temp_type temp);
+bool f2fs_is_idle(struct f2fs_sb_info *sbi);
 
 /*
  * checkpoint.c
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 5c8d00422237..fb4152527cf1 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -83,7 +83,7 @@ static int gc_thread_func(void *data)
if (!mutex_trylock(>gc_mutex))
goto next;
 
-   if (!is_idle(sbi)) {
+   if (!f2fs_is_idle(sbi)) {
increase_sleep_time(gc_th, _ms);
mutex_unlock(>gc_mutex);
goto next;
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 30779aaa9dba..a02c5ecfb2e4 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
@@ -33,6 +33,37 @@ static struct kmem_cache *discard_cmd_slab;
 static struct kmem_cache *sit_entry_set_slab;
 static struct kmem_cache *inmem_entry_slab;
 
+static bool f2fs_time_over(struct f2fs_sb_info *sbi, int type)
+{
+   unsigned long interval = sbi->interval_time[type] * HZ;
+
+   return time_after(jiffies, sbi->last_time[type] + interval);
+}
+
+static void is_idle_fn(struct blk_mq_hw_ctx *hctx, struct request *req,
+  void *data, bool reserved)
+{
+   unsigned int *cnt = data;
+
+   (*cnt)++;
+}
+
+bool f2fs_is_idle(struct f2fs_sb_info *sbi)
+{
+   struct block_device *bdev = sbi->sb->s_bdev;
+   struct request_queue *q = bdev_get_queue(bdev);
+
+   if (q->mq_ops) {
+   unsigned int cnt = 0;
+
+   blk_mq_queue_tag_busy_iter(q, is_idle_fn, );
+   if (cnt)
+   return false;
+   }
+
+   return f2fs_time_over(sbi, REQ_TIME);
+}
+
 static unsigned long __reverse_ulong(unsigned char *str)
 {
unsigned long tmp = 0;
@@ -511,7 +542,7 @@ void f2fs_balance_fs_bg(struct f2fs_sb_info *sbi)
else
f2fs_build_free_nids(sbi, false, false);
 
-   if (!is_idle(sbi) &&
+   if (!f2fs_is_idle(sbi) &&

Re: [f2fs-dev] [PATCH] f2fs: remove request_list check in is_idle()

2018-10-16 Thread Chao Yu
Hi Jens,

On 2018-10-16 22:34, Jens Axboe wrote:
> This doesn't work on stacked devices, and it doesn't work on
> blk-mq devices. The request_list is only used on legacy, which
> we don't have much of anymore, and soon won't have any of.
> 
> Kill the check.

In order to avoid conflicting with user IO, GC and Discard thread try to be
aware of IO status of device by is_idle(), if previous implementation doesn't
work, do we have any other existing method to get such status? Or do you have
any suggestion on this?

Thanks,

> 
> Cc: Jaegeuk Kim 
> Cc: linux-f2fs-devel@lists.sourceforge.net
> Signed-off-by: Jens Axboe 
> ---
>  fs/f2fs/f2fs.h | 7 ---
>  1 file changed, 7 deletions(-)
> 
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index abf925664d9c..58778992eca4 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -1356,13 +1356,6 @@ static inline bool f2fs_time_over(struct f2fs_sb_info 
> *sbi, int type)
>  
>  static inline bool is_idle(struct f2fs_sb_info *sbi)
>  {
> - struct block_device *bdev = sbi->sb->s_bdev;
> - struct request_queue *q = bdev_get_queue(bdev);
> - struct request_list *rl = >root_rl;
> -
> - if (rl->count[BLK_RW_SYNC] || rl->count[BLK_RW_ASYNC])
> - return false;
> -
>   return f2fs_time_over(sbi, REQ_TIME);
>  }
>  
> 


___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


[f2fs-dev] [PATCH] f2fs: remove request_list check in is_idle()

2018-10-16 Thread Jens Axboe
This doesn't work on stacked devices, and it doesn't work on
blk-mq devices. The request_list is only used on legacy, which
we don't have much of anymore, and soon won't have any of.

Kill the check.

Cc: Jaegeuk Kim 
Cc: linux-f2fs-devel@lists.sourceforge.net
Signed-off-by: Jens Axboe 
---
 fs/f2fs/f2fs.h | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index abf925664d9c..58778992eca4 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1356,13 +1356,6 @@ static inline bool f2fs_time_over(struct f2fs_sb_info 
*sbi, int type)
 
 static inline bool is_idle(struct f2fs_sb_info *sbi)
 {
-   struct block_device *bdev = sbi->sb->s_bdev;
-   struct request_queue *q = bdev_get_queue(bdev);
-   struct request_list *rl = >root_rl;
-
-   if (rl->count[BLK_RW_SYNC] || rl->count[BLK_RW_ASYNC])
-   return false;
-
return f2fs_time_over(sbi, REQ_TIME);
 }
 
-- 
2.17.1

-- 
Jens Axboe



___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


Re: [f2fs-dev] [PATCH v2] f2fs: cleanup dirty pages if recover failed

2018-10-16 Thread Chao Yu
On 2018/10/12 18:49, Sheng Yong wrote:
> During recover, we will try to create new dentries for inodes with
> dentry_mark. But if the parent is missing (e.g. killed by fsck),
> recover will break. But those recovered dirty pages are not cleanup.
> This will hit f2fs_bug_on:
> 
> [   53.519566] F2FS-fs (loop0): Found nat_bits in checkpoint
> [   53.539354] F2FS-fs (loop0): recover_inode: ino = 5, name = file, inline = 
> 3
> [   53.539402] F2FS-fs (loop0): recover_dentry: ino = 5, name = file, dir = 
> 0, err = -2
> [   53.545760] F2FS-fs (loop0): Cannot recover all fsync data errno=-2
> [   53.546105] F2FS-fs (loop0): access invalid blkaddr:4294967295
> [   53.546171] WARNING: CPU: 1 PID: 1798 at fs/f2fs/checkpoint.c:163 
> f2fs_is_valid_blkaddr+0x26c/0x320
> [   53.546174] Modules linked in:
> [   53.546183] CPU: 1 PID: 1798 Comm: mount Not tainted 4.19.0-rc2+ #1
> [   53.546186] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS 
> VirtualBox 12/01/2006
> [   53.546191] RIP: 0010:f2fs_is_valid_blkaddr+0x26c/0x320
> [   53.546195] Code: 85 bb 00 00 00 48 89 df 88 44 24 07 e8 ad a8 db ff 48 8b 
> 3b 44 89 e1 48 c7 c2 40 03 72 a9 48 c7 c6 e0 01 72 a9 e8 84 3c ff ff <0f> 0b 
> 0f b6 44 24 07 e9 8a 00 00 00 48 8d bf 38 01 00 00 e8 7c a8
> [   53.546201] RSP: 0018:88006c067768 EFLAGS: 00010282
> [   53.546208] RAX:  RBX: 880068844200 RCX: 
> a83e1a33
> [   53.546211] RDX:  RSI: 0008 RDI: 
> 88006d51e590
> [   53.546215] RBP: 0005 R08: ed000daa3cb3 R09: 
> ed000daa3cb3
> [   53.546218] R10: 0001 R11: ed000daa3cb2 R12: 
> 
> [   53.546221] R13: 88006a1f8000 R14: 0200 R15: 
> 0009
> [   53.546226] FS:  7fb2f3646840() GS:88006d50() 
> knlGS:
> [   53.546229] CS:  0010 DS:  ES:  CR0: 80050033
> [   53.546234] CR2: 7f0fd77f0008 CR3: 687e6002 CR4: 
> 000206e0
> [   53.546237] DR0:  DR1:  DR2: 
> 
> [   53.546240] DR3:  DR6: fffe0ff0 DR7: 
> 0400
> [   53.546242] Call Trace:
> [   53.546248]  f2fs_submit_page_bio+0x95/0x740
> [   53.546253]  read_node_page+0x161/0x1e0
> [   53.546271]  ? truncate_node+0x650/0x650
> [   53.546283]  ? add_to_page_cache_lru+0x12c/0x170
> [   53.546288]  ? pagecache_get_page+0x262/0x2d0
> [   53.546292]  __get_node_page+0x200/0x660
> [   53.546302]  f2fs_update_inode_page+0x4a/0x160
> [   53.546306]  f2fs_write_inode+0x86/0xb0
> [   53.546317]  __writeback_single_inode+0x49c/0x620
> [   53.546322]  writeback_single_inode+0xe4/0x1e0
> [   53.546326]  sync_inode_metadata+0x93/0xd0
> [   53.546330]  ? sync_inode+0x10/0x10
> [   53.546342]  ? do_raw_spin_unlock+0xed/0x100
> [   53.546347]  f2fs_sync_inode_meta+0xe0/0x130
> [   53.546351]  f2fs_fill_super+0x287d/0x2d10
> [   53.546367]  ? vsnprintf+0x742/0x7a0
> [   53.546372]  ? f2fs_commit_super+0x180/0x180
> [   53.546379]  ? up_write+0x20/0x40
> [   53.546385]  ? set_blocksize+0x5f/0x140
> [   53.546391]  ? f2fs_commit_super+0x180/0x180
> [   53.546402]  mount_bdev+0x181/0x200
> [   53.546406]  mount_fs+0x94/0x180
> [   53.546411]  vfs_kern_mount+0x6c/0x1e0
> [   53.546415]  do_mount+0xe5e/0x1510
> [   53.546420]  ? fs_reclaim_release+0x9/0x30
> [   53.546424]  ? copy_mount_string+0x20/0x20
> [   53.546428]  ? fs_reclaim_acquire+0xd/0x30
> [   53.546435]  ? __might_sleep+0x2c/0xc0
> [   53.546440]  ? ___might_sleep+0x53/0x170
> [   53.546453]  ? __might_fault+0x4c/0x60
> [   53.546468]  ? _copy_from_user+0x95/0xa0
> [   53.546474]  ? memdup_user+0x39/0x60
> [   53.546478]  ksys_mount+0x88/0xb0
> [   53.546482]  __x64_sys_mount+0x5d/0x70
> [   53.546495]  do_syscall_64+0x65/0x130
> [   53.546503]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [   53.547639] ---[ end trace b804d1ea2fec893e ]---
> 
> So if recover fails, we need to drop all recovered data.
> 
> Signed-off-by: Sheng Yong 

Reviewed-by: Chao Yu 

Thanks,



___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


Re: [f2fs-dev] [PATCH 1/4] mkfs.f2fs: Added missing statements related to error checking.

2018-10-16 Thread Chao Yu
On 2018/10/12 6:20, Sotirios-Efstathios Maneas wrote:
> The following patch adds a few missing statements related to error checking.
> 
> Signed-off-by: Sotirios-Efstathios Maneas 
> ---
>  mkfs/f2fs_format.c   | 10 --
>  mkfs/f2fs_format_utils.c |  5 +
>  2 files changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/mkfs/f2fs_format.c b/mkfs/f2fs_format.c
> index 9b0a0d1..a2685cb 100644
> --- a/mkfs/f2fs_format.c
> +++ b/mkfs/f2fs_format.c
> @@ -991,12 +991,16 @@ static int f2fs_write_super_block(void)
>   u_int8_t *zero_buff;
>  
>   zero_buff = calloc(F2FS_BLKSIZE, 1);
> + if (zero_buff == NULL) {

I think it will be good to keep the coding style:

if (!zero_buff)

> + MSG(1, "\tError: Calloc Failed for super_blk_zero_buf!!!\n");
> + return -1;
> + }
>  
>   memcpy(zero_buff + F2FS_SUPER_OFFSET, sb, sizeof(*sb));
>   DBG(1, "\tWriting super block, at offset 0x%08x\n", 0);
>   for (index = 0; index < 2; index++) {
>   if (dev_write_block(zero_buff, index)) {
> - MSG(1, "\tError: While while writing supe_blk "
> + MSG(1, "\tError: While while writing super_blk "

This typo should be fixed in another patch.

>   "on disk!!! index : %d\n", index);
>   free(zero_buff);
>   return -1;
> @@ -1021,8 +1025,10 @@ static int f2fs_discard_obsolete_dnode(void)
>   return 0;
>  
>   raw_node = calloc(sizeof(struct f2fs_node), 1);
> - if (!raw_node)
> + if (raw_node == NULL) {

Coding style.

> + MSG(1, "\tError: Calloc Failed for discard_raw_node!!!\n");
>   return -1;
> + }
>  
>   /* avoid power-off-recovery based on roll-forward policy */
>   offset = get_sb(main_blkaddr);
> diff --git a/mkfs/f2fs_format_utils.c b/mkfs/f2fs_format_utils.c
> index bf9ffbd..36b363a 100644
> --- a/mkfs/f2fs_format_utils.c
> +++ b/mkfs/f2fs_format_utils.c
> @@ -51,6 +51,11 @@ static int trim_device(int i)
>   int fd = dev->fd;
>  
>   stat_buf = malloc(sizeof(struct stat));
> + if (stat_buf == NULL) {

Ditto.

How about fixing all error handling of calloc/malloc from differnet .c
files in one patch?

Thanks,

> + MSG(1, "\tError: Malloc Failed for trim_stat_buf!!!\n");
> + return -1;
> + }
> +
>   if (fstat(fd, stat_buf) < 0 ) {
>   MSG(1, "\tError: Failed to get the device stat!!!\n");
>   free(stat_buf);
> 



___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel