[f2fs-dev] [PATCH] f2fs: optimize the way of traversing free_nid_bitmap
We call scan_free_nid_bits only when there aren't many free nids left, which means the marked bits in free_nid_bitmap are supposed to be few; using find_next_bit_le is more efficient in that case. According to my tests, using find_next_bit_le instead of test_bit_le cuts the traversal time down to one third of the original.

Signed-off-by: Fan li <fanofcode...@samsung.com>
---
 fs/f2fs/node.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index fef5c68..d234c6e 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1955,6 +1955,7 @@ static void scan_free_nid_bits(struct f2fs_sb_info *sbi)
 	struct curseg_info *curseg = CURSEG_I(sbi, CURSEG_HOT_DATA);
 	struct f2fs_journal *journal = curseg->journal;
 	unsigned int i, idx;
+	nid_t nid;
 
 	down_read(&nm_i->nat_tree_lock);
 
@@ -1964,10 +1965,10 @@ static void scan_free_nid_bits(struct f2fs_sb_info *sbi)
 		if (!nm_i->free_nid_count[i])
 			continue;
 		for (idx = 0; idx < NAT_ENTRY_PER_BLOCK; idx++) {
-			nid_t nid;
-
-			if (!test_bit_le(idx, nm_i->free_nid_bitmap[i]))
-				continue;
+			idx = find_next_bit_le(nm_i->free_nid_bitmap[i],
+						NAT_ENTRY_PER_BLOCK, idx);
+			if (idx >= NAT_ENTRY_PER_BLOCK)
+				break;
 
 			nid = i * NAT_ENTRY_PER_BLOCK + idx;
 			add_free_nid(sbi, nid, true);
@@ -1980,7 +1981,6 @@ static void scan_free_nid_bits(struct f2fs_sb_info *sbi)
 	down_read(&curseg->journal_rwsem);
 	for (i = 0; i < nats_in_cursum(journal); i++) {
 		block_t addr;
-		nid_t nid;
 
 		addr = le32_to_cpu(nat_in_journal(journal, i).block_addr);
 		nid = le32_to_cpu(nid_in_journal(journal, i));
-- 
2.7.4
RE: [f2fs-dev] [PATCH] f2fs: keep scanning until enough free nids are acquired
> -----Original Message-----
> From: Jaegeuk Kim [mailto:jaeg...@kernel.org]
> Sent: Tuesday, November 07, 2017 11:41 AM
> To: Fan Li
> Cc: 'Chao Yu'; 'Chao Yu'; linux-kernel@vger.kernel.org;
> linux-f2fs-de...@lists.sourceforge.net
> Subject: Re: [f2fs-dev] [PATCH] f2fs: keep scanning until enough free nids
> are acquired
>
> Hi,
>
> I merged this patch after fixing some broken format. Could you please check
> your email configuration?
>
> Thanks,

Sorry to bother you with such a trivial problem. The mail configuration is
fine, but I used a wrong way to copy the text this time; it won't happen
again.

> On 11/07, Fan Li wrote:
> > In the current version, after scan_free_nid_bits, the scan is over if
> > nid_cnt[FREE_NID] != 0.
> > In most cases, there are still free nids in the free list during the
> > scan, and scan_free_nid_bits usually can't increase nid_cnt[FREE_NID].
> > This causes __build_free_nids to be called many times without solving
> > the shortage of free nids. This patch fixes that.
> >
> > Signed-off-by: Fan li <fanofcode...@samsung.com>
> > ---
> >  fs/f2fs/node.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> > index 3d0d1be..5cef118 100644
> > --- a/fs/f2fs/node.c
> > +++ b/fs/f2fs/node.c
> > @@ -2010,7 +2010,7 @@ static void __build_free_nids(struct f2fs_sb_info *sbi, bool sync, bool mount)
> >  		/* try to find free nids in free_nid_bitmap */
> >  		scan_free_nid_bits(sbi);
> >
> > -		if (nm_i->nid_cnt[FREE_NID])
> > +		if (nm_i->nid_cnt[FREE_NID] >= NAT_ENTRY_PER_BLOCK)
> >  			return;
> >  	}
> >
> > --
> > 2.7.4
[f2fs-dev] [PATCH] f2fs: keep scanning until enough free nids are acquired
In the current version, after scan_free_nid_bits, the scan is over if
nid_cnt[FREE_NID] != 0.
In most cases, there are still free nids in the free list during the
scan, and scan_free_nid_bits usually can't increase nid_cnt[FREE_NID].
This causes __build_free_nids to be called many times without solving
the shortage of free nids. This patch fixes that.

Signed-off-by: Fan li <fanofcode...@samsung.com>
---
 fs/f2fs/node.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 3d0d1be..5cef118 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -2010,7 +2010,7 @@ static void __build_free_nids(struct f2fs_sb_info *sbi, bool sync, bool mount)
 		/* try to find free nids in free_nid_bitmap */
 		scan_free_nid_bits(sbi);
 
-		if (nm_i->nid_cnt[FREE_NID])
+		if (nm_i->nid_cnt[FREE_NID] >= NAT_ENTRY_PER_BLOCK)
 			return;
 	}
-- 
2.7.4
RE: [f2fs-dev] [PATCH RESEND] f2fs: modify the procedure of scan free nid
> -----Original Message-----
> From: Chao Yu [mailto:c...@kernel.org]
> Sent: Friday, November 03, 2017 9:16 PM
> To: Fan Li; 'Chao Yu'; 'Jaegeuk Kim'
> Cc: linux-kernel@vger.kernel.org; linux-f2fs-de...@lists.sourceforge.net
> Subject: Re: [f2fs-dev] [PATCH RESEND] f2fs: modify the procedure of scan
> free nid
>
> On 2017/11/3 18:29, Fan Li wrote:
> >
> >> -----Original Message-----
> >> From: Chao Yu [mailto:yuch...@huawei.com]
> >> Sent: Friday, November 03, 2017 4:54 PM
> >> To: Fan Li; 'Chao Yu'; 'Jaegeuk Kim'
> >> Cc: linux-kernel@vger.kernel.org;
> >> linux-f2fs-de...@lists.sourceforge.net
> >> Subject: Re: [f2fs-dev] [PATCH RESEND] f2fs: modify the procedure of
> >> scan free nid
> >>
> >> On 2017/11/3 15:31, Fan Li wrote:
> >>> In the current version, we preserve 8 pages of nat blocks as free
> >>> nids; we build bitmaps for them and use them to allocate nids until
> >>> their number drops below NAT_ENTRY_PER_BLOCK.
> >>>
> >>> After that, we have a problem: scan_free_nid_bits will scan the same
> >>> 8 pages trying to find more free nids, but in most cases the free
> >>> nids in these bitmaps are already in the free list, so scanning them
> >>> won't get us any new nids.
> >>> Furthermore, after scan_free_nid_bits, the scan is over if
> >>> nid_cnt[FREE_NID] != 0.
> >>> This causes us to scan the same pages over and over again, and no
> >>> new free nids are found until nid_cnt[FREE_NID] == 0. As the number
> >>> of scanned pages increases, the problem grows worse.
> >>>
> >>> This patch marks the range where new free nids could exist and keeps
> >>> scanning for free nids until nid_cnt[FREE_NID] >= NAT_ENTRY_PER_BLOCK.
> >>> The new variable first_scan_block marks the start of the range; it
> >>> is initialized with NEW_ADDR, which means all free nids before
> >>> next_scan_nid are already in the free list. next_scan_nid is used as
> >>> the end of the range, since all free nids scanned in
> >>> scan_free_nid_bits must be smaller than next_scan_nid.
> >>
> >> Thinking it over again, IMO, we can add a variable stating the total
> >> count of free nids in the bitmap; if there is no free nid, just skip
> >> scanning all the existing bitmaps.
> >>
> >> And if there are only a few free nids scattered in the bitmap, the
> >> cost will be limited, because we will skip scanning
> >> nm_i::free_nid_bitmap if nm_i::free_nid_count is zero. Once we find
> >> one free nid, let's skip out.
> >>
> >> Since there shouldn't be very heavy overhead for the CPU while
> >> traveling nm_i::nat_block_bitmap, I expect the change below could be
> >> simpler to maintain while having the same effect.
> >>
> >> What do you think?
> >>
> >
> > I think if you want this to work, checking total_bitmap_free_nid may
> > not be sufficient.
> > The problem this patch presents is that even when all the free nids
> > are already in the free list, we still scan all the pages.
> > The scan proceeds once the free nid count is below NAT_ENTRY_PER_BLOCK.
> > So in most cases, there are still free nids in the bitmap during the
> > scan, and the current code checks every one of them to see if they are
> > actually in the free list.
> > Only checking total_bitmap_free_nid == 0 won't take this overhead away.
>
> Oh, you could see that we have added free_nid_count for each NAT block's
> free_nid_bitmap; before scanning the bitmap, we will make sure there is
> at least one free nid.
>
> scan_free_nid_bits()
> 	for (i = 0; i < nm_i->nat_blocks; i++) {
> 		if (!test_bit_le(i, nm_i->nat_block_bitmap))
> 			continue;
> 		if (!nm_i->free_nid_count[i])
> 			continue;

Do you mean free_nid_count here? I thought free_nid_count only represents
how many nats are marked free in the bitmap of one block.
To my understanding, even when a nat is already in the free list, it still
has a bit marked as free in free_nid_bitmap and a count in free_nid_count.
That means that if free_nid_count != 0 and there are marked bits in the
bitmap, the free nats in this block could still all be in the free list.

The purpose of the scan is to find new nats and add them to the free list;
going through the nats that are already in the free list isn't what we
want. And in xfstests, in most cases where scan_free_nid_bits runs, all
free nats are indeed in the free list.

> 		for (idx = 0; idx < NAT_ENTRY_PER_BLOCK; i
RE: [f2fs-dev] [PATCH RESEND] f2fs: modify the procedure of scan free nid
> -----Original Message-----
> From: Chao Yu [mailto:yuch...@huawei.com]
> Sent: Friday, November 03, 2017 4:54 PM
> To: Fan Li; 'Chao Yu'; 'Jaegeuk Kim'
> Cc: linux-kernel@vger.kernel.org; linux-f2fs-de...@lists.sourceforge.net
> Subject: Re: [f2fs-dev] [PATCH RESEND] f2fs: modify the procedure of scan
> free nid
>
> On 2017/11/3 15:31, Fan Li wrote:
> > In the current version, we preserve 8 pages of nat blocks as free nids;
> > we build bitmaps for them and use them to allocate nids until their
> > number drops below NAT_ENTRY_PER_BLOCK.
> >
> > After that, we have a problem: scan_free_nid_bits will scan the same
> > 8 pages trying to find more free nids, but in most cases the free nids
> > in these bitmaps are already in the free list, so scanning them won't
> > get us any new nids.
> > Furthermore, after scan_free_nid_bits, the scan is over if
> > nid_cnt[FREE_NID] != 0.
> > This causes us to scan the same pages over and over again, and no new
> > free nids are found until nid_cnt[FREE_NID] == 0. As the number of
> > scanned pages increases, the problem grows worse.
> >
> > This patch marks the range where new free nids could exist and keeps
> > scanning for free nids until nid_cnt[FREE_NID] >= NAT_ENTRY_PER_BLOCK.
> > The new variable first_scan_block marks the start of the range; it is
> > initialized with NEW_ADDR, which means all free nids before
> > next_scan_nid are already in the free list. next_scan_nid is used as
> > the end of the range, since all free nids scanned in
> > scan_free_nid_bits must be smaller than next_scan_nid.
>
> Thinking it over again, IMO, we can add a variable stating the total count
> of free nids in the bitmap; if there is no free nid, just skip scanning
> all the existing bitmaps.
>
> And if there are only a few free nids scattered in the bitmap, the cost
> will be limited, because we will skip scanning nm_i::free_nid_bitmap if
> nm_i::free_nid_count is zero. Once we find one free nid, let's skip out.
>
> Since there shouldn't be very heavy overhead for the CPU while traveling
> nm_i::nat_block_bitmap, I expect the change below could be simpler to
> maintain while having the same effect.
>
> What do you think?
>

I think if you want this to work, checking total_bitmap_free_nid may not
be sufficient.
The problem this patch presents is that even when all the free nids are
already in the free list, we still scan all the pages.
The scan proceeds once the free nid count is below NAT_ENTRY_PER_BLOCK.
So in most cases, there are still free nids in the bitmap during the scan,
and the current code checks every one of them to see if they are actually
in the free list.
Only checking total_bitmap_free_nid == 0 won't take this overhead away.

I considered a lot of ways to fix this problem before I submitted this
patch. One of my ideas is quite similar to yours, but I use
"if (total_bitmap_free_nid == nm_i->nid_cnt[FREE_NID])" to decide whether
to skip or not.
If you insist, I can submit this simpler one instead, but some follow-up
improvements would be unavailable, for example, using a smaller
granularity for tracking the last-scanned position, as we talked about.

I know sometimes I can be obsessed with performance; I usually choose the
faster way over simpler ones. If you think it's too much, please tell me;
I'm sure we can find some middle ground.

Thank you.

> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index cb3f10bc8723..238d95e89dec 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -729,6 +729,7 @@ struct f2fs_nm_info {
> 	unsigned char (*free_nid_bitmap)[NAT_ENTRY_BITMAP_SIZE];
> 	unsigned char *nat_block_bitmap;
> 	unsigned short *free_nid_count;	/* free nid count of NAT block */
> +	unsigned int total_bitmap_free_nid;	/* total free nid count in bitmap */
>
> 	/* for checkpoint */
> 	char *nat_bitmap;		/* NAT bitmap pointer */
> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> index fef5c68886b1..e4861908a396 100644
> --- a/fs/f2fs/node.c
> +++ b/fs/f2fs/node.c
> @@ -1911,10 +1911,13 @@ static void update_free_nid_bitmap(struct f2fs_sb_info *sbi, nid_t nid,
> 	else
> 		__clear_bit_le(nid_ofs, nm_i->free_nid_bitmap[nat_ofs]);
>
> -	if (set)
> +	if (set) {
> 		nm_i->free_nid_count[nat_ofs]++;
> -	else if (!build)
> +		nm_i->total_bitmap_free_nid++;
> +	} else if (!build) {
> 		nm_i->free_nid_count[nat_ofs]--;
> +		nm_i->total_bitmap_free_nid--;
> +	}
> }
>
> static void scan_nat_page(struct f2fs_sb_info *sbi,
> @@ -1958,6 +1961,9 @@ static void scan_free_nid_bits(struct f2fs_sb_info *sbi)
>
> 	down_read(&nm_i->nat_tree_lock);
>
> +
[f2fs-dev] [PATCH RESEND] f2fs: modify the procedure of scan free nid
In the current version, we preserve 8 pages of nat blocks as free nids;
we build bitmaps for them and use them to allocate nids until their
number drops below NAT_ENTRY_PER_BLOCK.

After that, we have a problem: scan_free_nid_bits will scan the same
8 pages trying to find more free nids, but in most cases the free nids
in these bitmaps are already in the free list, so scanning them won't
get us any new nids.
Furthermore, after scan_free_nid_bits, the scan is over if
nid_cnt[FREE_NID] != 0.
This causes us to scan the same pages over and over again, and no new
free nids are found until nid_cnt[FREE_NID] == 0. As the number of
scanned pages increases, the problem grows worse.

This patch marks the range where new free nids could exist and keeps
scanning for free nids until nid_cnt[FREE_NID] >= NAT_ENTRY_PER_BLOCK.
The new variable first_scan_block marks the start of the range; it is
initialized with NEW_ADDR, which means all free nids before
next_scan_nid are already in the free list. next_scan_nid is used as
the end of the range, since all free nids scanned in
scan_free_nid_bits must be smaller than next_scan_nid.

Signed-off-by: Fan li <fanofcode...@samsung.com>
---
 fs/f2fs/f2fs.h |  1 +
 fs/f2fs/node.c | 42 +++++++++++++++++++++++++++++++++++-------
 2 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index e0ef31c..ae1cf91 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -705,6 +705,7 @@ struct f2fs_nm_info {
 	nid_t max_nid;			/* maximum possible node ids */
 	nid_t available_nids;		/* # of available node ids */
 	nid_t next_scan_nid;		/* the next nid to be scanned */
+	block_t first_scan_block;	/* the first NAT block to be scanned */
 	unsigned int ram_thresh;	/* control the memory footprint */
 	unsigned int ra_nid_pages;	/* # of nid pages to be readaheaded */
 	unsigned int dirty_nats_ratio;	/* control dirty nats ratio threshold */
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 3d0d1be..f921e0c 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1812,7 +1812,7 @@ static bool add_free_nid(struct f2fs_sb_info *sbi, nid_t nid, bool build)
 	struct f2fs_nm_info *nm_i = NM_I(sbi);
 	struct free_nid *i, *e;
 	struct nat_entry *ne;
-	int err = -EINVAL;
+	int need_free = 1;
 	bool ret = false;
 
 	/* 0 nid should not be used */
@@ -1863,13 +1863,25 @@ static bool add_free_nid(struct f2fs_sb_info *sbi, nid_t nid, bool build)
 		}
 	}
 	ret = true;
-	err = __insert_free_nid(sbi, i, FREE_NID);
+	need_free = __insert_free_nid(sbi, i, FREE_NID);
 err_out:
 	spin_unlock(&nm_i->nid_list_lock);
 	radix_tree_preload_end();
 err:
-	if (err)
+	if (need_free)
 		kmem_cache_free(free_nid_slab, i);
+	/*
+	 * For a nid that should be free but is not in the free
+	 * structure, update the scan range in hope of adding
+	 * it in the next scan.
+	 */
+	if (!ret || need_free < 0) {
+		block_t tmp_block = NAT_BLOCK_OFFSET(nid);
+
+		if (tmp_block < nm_i->first_scan_block)
+			nm_i->first_scan_block = tmp_block;
+	}
+
 	return ret;
 }
 
@@ -1950,10 +1962,17 @@ static void scan_free_nid_bits(struct f2fs_sb_info *sbi)
 	struct curseg_info *curseg = CURSEG_I(sbi, CURSEG_HOT_DATA);
 	struct f2fs_journal *journal = curseg->journal;
 	unsigned int i, idx;
+	unsigned int max_blocks = NAT_BLOCK_OFFSET(nm_i->next_scan_nid);
 
-	down_read(&nm_i->nat_tree_lock);
+	/* every free nid in blocks scanned previously is in the free list */
+	if (nm_i->first_scan_block == NEW_ADDR)
+		return;
 
-	for (i = 0; i < nm_i->nat_blocks; i++) {
+	if (max_blocks == 0)
+		max_blocks = nm_i->nat_blocks;
+
+	down_read(&nm_i->nat_tree_lock);
+	for (i = nm_i->first_scan_block; i < max_blocks; i++) {
 		if (!test_bit_le(i, nm_i->nat_block_bitmap))
 			continue;
 		if (!nm_i->free_nid_count[i])
@@ -1967,10 +1986,13 @@ static void scan_free_nid_bits(struct f2fs_sb_info *sbi)
 			nid = i * NAT_ENTRY_PER_BLOCK + idx;
 			add_free_nid(sbi, nid, true);
 
-			if (nm_i->nid_cnt[FREE_NID] >= MAX_FREE_NIDS)
+			if (nm_i->nid_cnt[FREE_NID] >= MAX_FREE_NIDS) {
+				nm_i->first_scan_block = i;
 				goto out;
+			}
 		}
 	}
+	nm_i->first_scan_block = NEW_ADDR;
 out:
 	down_read(&curseg->journal_rwsem);
 	for (i = 0; i < nats_in_cursum(journal); i++) {
@@ -2010,7 +2032,7 @@ static void __build_free_nids(struct f2fs_sb_info *sbi, bool sync, bool mount)
 		/* try to find free nids in free_nid
[f2fs-dev] [PATCH RESEND] f2fs: modify the procedure of scan free nid
In current version, we preserve 8 pages of nat blocks as free nids, we build bitmaps for it and use them to allocate nids until its number drops below NAT_ENTRY_PER_BLOCK. After that, we have a problem, scan_free_nid_bits will scan the same 8 pages trying to find more free nids, but in most cases the free nids in these bitmaps are already in free list, scan them won't get us any new nids. Further more, after scan_free_nid_bits, the scan is over if nid_cnt[FREE_NID] != 0. It causes that we scan the same pages over and over again, and no new free nids are found until nid_cnt[FREE_NID]==0. While the scanned pages increase, the problem grows worse. This patch mark the range where new free nids could exist and keep scan for free nids until nid_cnt[FREE_NID] >= NAT_ENTRY_PER_BLOCK. The new vairable first_scan_block marks the start of the range, it's initialized with NEW_ADDR, which means all free nids before next_scan_nid are already in free list; and use next_scan_nid as the end of the range since all free nids which are scanned in scan_free_nid_bits must be smaller next_scan_nid. 
Signed-off-by: Fan li
---
 fs/f2fs/f2fs.h |  1 +
 fs/f2fs/node.c | 42 +++---
 2 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index e0ef31c..ae1cf91 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -705,6 +705,7 @@ struct f2fs_nm_info {
 	nid_t max_nid;			/* maximum possible node ids */
 	nid_t available_nids;		/* # of available node ids */
 	nid_t next_scan_nid;		/* the next nid to be scanned */
+	block_t first_scan_block;	/* the first NAT block to be scanned */
 	unsigned int ram_thresh;	/* control the memory footprint */
 	unsigned int ra_nid_pages;	/* # of nid pages to be readaheaded */
 	unsigned int dirty_nats_ratio;	/* control dirty nats ratio threshold */
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 3d0d1be..f921e0c 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1812,7 +1812,7 @@ static bool add_free_nid(struct f2fs_sb_info *sbi, nid_t nid, bool build)
 	struct f2fs_nm_info *nm_i = NM_I(sbi);
 	struct free_nid *i, *e;
 	struct nat_entry *ne;
-	int err = -EINVAL;
+	int need_free = 1;
 	bool ret = false;

 	/* 0 nid should not be used */
@@ -1863,13 +1863,25 @@ static bool add_free_nid(struct f2fs_sb_info *sbi, nid_t nid, bool build)
 		}
 	}
 	ret = true;
-	err = __insert_free_nid(sbi, i, FREE_NID);
+	need_free = __insert_free_nid(sbi, i, FREE_NID);
 err_out:
 	spin_unlock(&nm_i->nid_list_lock);
 	radix_tree_preload_end();
 err:
-	if (err)
+	if (need_free)
 		kmem_cache_free(free_nid_slab, i);
+	/*
+	 * For nid that should be free but not in the free
+	 * structure, update the scan range in hope of adding
+	 * it in the next scan.
+	 */
+	if (!ret || need_free < 0) {
+		block_t tmp_block = NAT_BLOCK_OFFSET(nid);
+
+		if (tmp_block < nm_i->first_scan_block)
+			nm_i->first_scan_block = tmp_block;
+	}
+
 	return ret;
 }
@@ -1950,10 +1962,17 @@ static void scan_free_nid_bits(struct f2fs_sb_info *sbi)
 	struct curseg_info *curseg = CURSEG_I(sbi, CURSEG_HOT_DATA);
 	struct f2fs_journal *journal = curseg->journal;
 	unsigned int i, idx;
+	unsigned int max_blocks = NAT_BLOCK_OFFSET(nm_i->next_scan_nid);

-	down_read(&nm_i->nat_tree_lock);
+	/* every free nid in blocks scanned previously is in the free list */
+	if (nm_i->first_scan_block == NEW_ADDR)
+		return;

-	for (i = 0; i < nm_i->nat_blocks; i++) {
+	if (max_blocks == 0)
+		max_blocks = nm_i->nat_blocks;
+
+	down_read(&nm_i->nat_tree_lock);
+	for (i = nm_i->first_scan_block; i < max_blocks; i++) {
 		if (!test_bit_le(i, nm_i->nat_block_bitmap))
 			continue;
 		if (!nm_i->free_nid_count[i])
@@ -1967,10 +1986,13 @@ static void scan_free_nid_bits(struct f2fs_sb_info *sbi)
 			nid = i * NAT_ENTRY_PER_BLOCK + idx;
 			add_free_nid(sbi, nid, true);

-			if (nm_i->nid_cnt[FREE_NID] >= MAX_FREE_NIDS)
+			if (nm_i->nid_cnt[FREE_NID] >= MAX_FREE_NIDS) {
+				nm_i->first_scan_block = i;
 				goto out;
+			}
 		}
 	}
+	nm_i->first_scan_block = NEW_ADDR;
 out:
 	down_read(&curseg->journal_rwsem);
 	for (i = 0; i < nats_in_cursum(journal); i++) {
@@ -2010,7 +2032,7 @@ static void __build_free_nids(struct f2fs_sb_info *sbi, bool sync, bool mount)

 		/* try to find free nids in free_nid
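The scan-range idea above can be modeled with a small userspace C sketch (purely illustrative: the toy geometry, the NO_RESCAN sentinel standing in for NEW_ADDR, and the names scan_free_bits/first_scan_block-as-a-global are assumptions of this sketch, not f2fs code):

```c
#include <assert.h>
#include <limits.h>

#define NAT_BLOCKS        4        /* toy geometry, not the real layout */
#define ENTRIES_PER_BLOCK 8
#define NO_RESCAN         UINT_MAX /* plays the role of the NEW_ADDR sentinel */

unsigned first_scan_block = NO_RESCAN;

/*
 * Visit free-nid bitmap blocks in [first_scan_block, max_blocks) and
 * return how many set bits (candidate free nids) were seen; the kernel
 * would call add_free_nid() for each.  A completed pass resets
 * first_scan_block to the sentinel, so the next call returns at once.
 */
unsigned scan_free_bits(const unsigned char bitmap[NAT_BLOCKS][ENTRIES_PER_BLOCK],
                        unsigned max_blocks)
{
        unsigned found = 0;

        if (first_scan_block == NO_RESCAN)
                return 0;       /* every earlier free nid is already listed */

        for (unsigned i = first_scan_block; i < max_blocks; i++)
                for (unsigned idx = 0; idx < ENTRIES_PER_BLOCK; idx++)
                        if (bitmap[i][idx])
                                found++;

        first_scan_block = NO_RESCAN;   /* whole candidate range rescanned */
        return found;
}
```

Compared with unconditionally rescanning every NAT block, repeated calls cost nothing until something (add_free_nid in the real patch) lowers first_scan_block again.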
[f2fs-dev] [PATCH] f2fs: save a multiplication for last_nid calculation
Use a slightly easier way to calculate last_nid.

Signed-off-by: Fan li <fanofcode...@samsung.com>
---
 fs/f2fs/node.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 7834097..55ab330 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -2642,7 +2642,7 @@ static inline void load_free_nid_bitmap(struct f2fs_sb_info *sbi)
 		__set_bit_le(i, nm_i->nat_block_bitmap);

 		nid = i * NAT_ENTRY_PER_BLOCK;
-		last_nid = (i + 1) * NAT_ENTRY_PER_BLOCK;
+		last_nid = nid + NAT_ENTRY_PER_BLOCK;

 		spin_lock(&NM_I(sbi)->nid_list_lock);
 		for (; nid < last_nid; nid++)
--
2.7.4
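The change is a plain strength reduction; a tiny standalone check makes the equivalence explicit (the concrete NAT_ENTRY_PER_BLOCK value below is only an illustrative assumption):

```c
#include <assert.h>

/* 455 would be 4096 / sizeof(struct f2fs_nat_entry); used here only to
 * make the arithmetic concrete. */
#define NAT_ENTRY_PER_BLOCK 455

/* Old form: one extra multiplication per NAT block. */
unsigned last_nid_old(unsigned i)
{
        return (i + 1) * NAT_ENTRY_PER_BLOCK;
}

/* New form: reuse nid = i * NAT_ENTRY_PER_BLOCK and add once. */
unsigned last_nid_new(unsigned i)
{
        unsigned nid = i * NAT_ENTRY_PER_BLOCK;
        return nid + NAT_ENTRY_PER_BLOCK;
}
```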
RE: [f2fs-dev] [PATCH] f2fs: modify the procedure of scan free nid
> -----Original Message-----
> From: Chao Yu [mailto:c...@kernel.org]
> Sent: Wednesday, November 01, 2017 8:47 PM
> To: Fan Li; 'Jaegeuk Kim'
> Cc: linux-kernel@vger.kernel.org; linux-f2fs-de...@lists.sourceforge.net
> Subject: Re: [f2fs-dev] [PATCH] f2fs: modify the procedure of scan free nid
>
> On 2017/11/1 18:03, Fan Li wrote:
> >
> >> -----Original Message-----
> >> From: Chao Yu [mailto:c...@kernel.org]
> >> Sent: Tuesday, October 31, 2017 10:32 PM
> >> To: Fan Li; 'Jaegeuk Kim'
> >> Cc: linux-kernel@vger.kernel.org;
> >> linux-f2fs-de...@lists.sourceforge.net
> >> Subject: Re: [f2fs-dev] [PATCH] f2fs: modify the procedure of scan
> >> free nid
> >>
> >> On 2017/10/31 21:37, Fan Li wrote:
> >>> In current version, we preserve 8 pages of nat blocks as free nids,
> >>> build bitmaps for it and use them to allocate nids until its number
> >>> drops below NAT_ENTRY_PER_BLOCK.
> >>>
> >>> After that, we have a problem, scan_free_nid_bits will scan the same
> >>> 8 pages trying to find more free nids, but in most cases the free
> >>> nids in these bitmaps are already in free list, scan them won't get
> >>> us any new nids.
> >>> Further more, after scan_free_nid_bits, the search is over if
> >>> nid_cnt[FREE_NID] != 0.
> >>> It causes that we scan the same pages over and over again, yet no
> >>> new free nids are found until nid_cnt[FREE_NID]==0.
> >>>
> >>> This patch mark the range where new free nids could exist and keep
> >>> scan for free nids until nid_cnt[FREE_NID] >= NAT_ENTRY_PER_BLOCK.
> >>> The new vairable first_scan_block marks the start of the range, it's
> >>> initialized with NEW_ADDR, which means all free nids before
> >>> next_scan_nid are already in free list; and use next_scan_nid as the
> >>> end of the range since all free nids which are scanned must be
> >>> smaller next_scan_nid.
> >>>
> >>> Signed-off-by: Fan li <fanofcode...@samsung.com>
> >>> ---
> >>>  fs/f2fs/f2fs.h |  1 +
> >>>  fs/f2fs/node.c | 30 ++
> >>>  2 files changed, 27 insertions(+), 4 deletions(-)
> >>>
> >>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index e0ef31c..ae1cf91
> >>> 100644
> >>> --- a/fs/f2fs/f2fs.h
> >>> +++ b/fs/f2fs/f2fs.h
> >>> @@ -705,6 +705,7 @@ struct f2fs_nm_info {
> >>>  	nid_t max_nid;			/* maximum possible node ids */
> >>>  	nid_t available_nids;		/* # of available node ids */
> >>>  	nid_t next_scan_nid;		/* the next nid to be scanned */
> >>> +	block_t first_scan_block;	/* the first NAT block to be scanned */
> >>
> >> As we are traveling bitmap, so how about using smaller granularity for
> >> tracking last-scanned-position. like:
> >>
> >> unsigned next_bitmap_pos; ?
> >>
> > Yes, I think it's a good idea, but original code scans nids by blocks,
> > if I change that, I need to change some other details too, and before that,
> > I want to make sure this idea of patch is right.
> > I also have some ideas about it, if that's OK, I tend to submit other
> > patches to implement them.
> >
> >>>  	unsigned int ram_thresh;	/* control the memory footprint */
> >>>  	unsigned int ra_nid_pages;	/* # of nid pages to be readaheaded */
> >>>  	unsigned int dirty_nats_ratio;	/* control dirty nats ratio threshold */
> >>> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c index 3d0d1be..7834097
> >>> 100644
> >>> --- a/fs/f2fs/node.c
> >>> +++ b/fs/f2fs/node.c
> >>> @@ -1950,10 +1950,23 @@ static void scan_free_nid_bits(struct
> >>> f2fs_sb_info *sbi)
> >>>  	struct curseg_info *curseg = CURSEG_I(sbi, CURSEG_HOT_DATA);
> >>>  	struct f2fs_journal *journal = curseg->journal;
> >>>  	unsigned int i, idx;
> >>> +	unsigned int max_blocks = NAT_BLOCK_OFFSET(nm_i->next_scan_nid);
> >>>
> >>> -	down_read(&nm_i->nat_tree_lock);
> >>> +	/* every free nid in blocks scanned previously is in the free list */
> >>> +	if (nm_i->first_scan_block == NEW_ADDR)
> >>
> >> How about using nm_i->max_nid as no more free nids in bitmap?
> >>
> > For now, I use the block as the unit of variable first_scan_block, for
> >
RE: [f2fs-dev] [PATCH] f2fs: modify the procedure of scan free nid
> -----Original Message-----
> From: Chao Yu [mailto:c...@kernel.org]
> Sent: Tuesday, October 31, 2017 10:32 PM
> To: Fan Li; 'Jaegeuk Kim'
> Cc: linux-kernel@vger.kernel.org; linux-f2fs-de...@lists.sourceforge.net
> Subject: Re: [f2fs-dev] [PATCH] f2fs: modify the procedure of scan free nid
>
> On 2017/10/31 21:37, Fan Li wrote:
> > In current version, we preserve 8 pages of nat blocks as free nids,
> > build bitmaps for it and use them to allocate nids until its number
> > drops below NAT_ENTRY_PER_BLOCK.
> >
> > After that, we have a problem, scan_free_nid_bits will scan the same
> > 8 pages trying to find more free nids, but in most cases the free nids
> > in these bitmaps are already in free list, scan them won't get us any
> > new nids.
> > Further more, after scan_free_nid_bits, the search is over if
> > nid_cnt[FREE_NID] != 0.
> > It causes that we scan the same pages over and over again, yet no new
> > free nids are found until nid_cnt[FREE_NID]==0.
> >
> > This patch mark the range where new free nids could exist and keep
> > scan for free nids until nid_cnt[FREE_NID] >= NAT_ENTRY_PER_BLOCK.
> > The new vairable first_scan_block marks the start of the range, it's
> > initialized with NEW_ADDR, which means all free nids before
> > next_scan_nid are already in free list; and use next_scan_nid as the
> > end of the range since all free nids which are scanned must be smaller
> > next_scan_nid.
> >
> > Signed-off-by: Fan li <fanofcode...@samsung.com>
> > ---
> >  fs/f2fs/f2fs.h |  1 +
> >  fs/f2fs/node.c | 30 ++
> >  2 files changed, 27 insertions(+), 4 deletions(-)
> >
> > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index e0ef31c..ae1cf91
> > 100644
> > --- a/fs/f2fs/f2fs.h
> > +++ b/fs/f2fs/f2fs.h
> > @@ -705,6 +705,7 @@ struct f2fs_nm_info {
> >  	nid_t max_nid;			/* maximum possible node ids */
> >  	nid_t available_nids;		/* # of available node ids */
> >  	nid_t next_scan_nid;		/* the next nid to be scanned */
> > +	block_t first_scan_block;	/* the first NAT block to be scanned */
>
> As we are traveling bitmap, so how about using smaller granularity for
> tracking last-scanned-position. like:
>
> unsigned next_bitmap_pos; ?
>
Yes, I think it's a good idea, but original code scans nids by blocks,
if I change that, I need to change some other details too, and before that,
I want to make sure this idea of patch is right.
I also have some ideas about it, if that's OK, I tend to submit other
patches to implement them.

> >  	unsigned int ram_thresh;	/* control the memory footprint */
> >  	unsigned int ra_nid_pages;	/* # of nid pages to be readaheaded */
> >  	unsigned int dirty_nats_ratio;	/* control dirty nats ratio threshold */
> > diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c index 3d0d1be..7834097
> > 100644
> > --- a/fs/f2fs/node.c
> > +++ b/fs/f2fs/node.c
> > @@ -1950,10 +1950,23 @@ static void scan_free_nid_bits(struct f2fs_sb_info *sbi)
> >  	struct curseg_info *curseg = CURSEG_I(sbi, CURSEG_HOT_DATA);
> >  	struct f2fs_journal *journal = curseg->journal;
> >  	unsigned int i, idx;
> > +	unsigned int max_blocks = NAT_BLOCK_OFFSET(nm_i->next_scan_nid);
> >
> > -	down_read(&nm_i->nat_tree_lock);
> > +	/* every free nid in blocks scanned previously is in the free list */
> > +	if (nm_i->first_scan_block == NEW_ADDR)
>
> How about using nm_i->max_nid as no more free nids in bitmap?
>
For now, I use the block as the unit of variable first_scan_block, for
the same reason above, I tend to change it in another patch.

> > +		return;
> >
> > -	for (i = 0; i < nm_i->nat_blocks; i++) {
> > +	/*
> > +	 * TODO: "next_scan_nid == 0" means after searching every nat block,
> > +	 * we still can't find enough free nids, there may not be any
> > +	 * more nid left to be found, we should stop at somewhere
> > +	 * instead of going through these all over again.
> > +	 */
> > +	if (max_blocks == 0)
> > +		max_blocks = nm_i->nat_blocks;
> > +
> > +	down_read(&nm_i->nat_tree_lock);
> > +	for (i = nm_i->first_scan_block; i < max_blocks; i++) {
>
> Free nids could be set free after nodes were truncated & checkpoint, if we
> start from first_scan_block, we will miss some free nids.
>
This is the part I'm not sure. To my understanding, after nodes were
truncated, the nats will be cached as dirty nats, the IS
[f2fs-dev] [PATCH] f2fs: modify the procedure of scan free nid
In the current version, we preserve 8 pages of NAT blocks as free nids, build bitmaps for them, and use them to allocate nids until their number drops below NAT_ENTRY_PER_BLOCK.

After that we have a problem: scan_free_nid_bits will scan the same 8 pages trying to find more free nids, but in most cases the free nids in these bitmaps are already in the free list, so scanning them won't get us any new nids. Furthermore, after scan_free_nid_bits, the search is over if nid_cnt[FREE_NID] != 0. As a result we scan the same pages over and over again, yet no new free nids are found until nid_cnt[FREE_NID] == 0.

This patch marks the range where new free nids could exist and keeps scanning for free nids until nid_cnt[FREE_NID] >= NAT_ENTRY_PER_BLOCK. The new variable first_scan_block marks the start of the range; it is initialized to NEW_ADDR, which means all free nids before next_scan_nid are already in the free list. next_scan_nid is used as the end of the range, since all free nids which are scanned must be smaller than next_scan_nid.
Signed-off-by: Fan li <fanofcode...@samsung.com>
---
 fs/f2fs/f2fs.h |  1 +
 fs/f2fs/node.c | 30 ++
 2 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index e0ef31c..ae1cf91 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -705,6 +705,7 @@ struct f2fs_nm_info {
 	nid_t max_nid;			/* maximum possible node ids */
 	nid_t available_nids;		/* # of available node ids */
 	nid_t next_scan_nid;		/* the next nid to be scanned */
+	block_t first_scan_block;	/* the first NAT block to be scanned */
 	unsigned int ram_thresh;	/* control the memory footprint */
 	unsigned int ra_nid_pages;	/* # of nid pages to be readaheaded */
 	unsigned int dirty_nats_ratio;	/* control dirty nats ratio threshold */
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 3d0d1be..7834097 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1950,10 +1950,23 @@ static void scan_free_nid_bits(struct f2fs_sb_info *sbi)
 	struct curseg_info *curseg = CURSEG_I(sbi, CURSEG_HOT_DATA);
 	struct f2fs_journal *journal = curseg->journal;
 	unsigned int i, idx;
+	unsigned int max_blocks = NAT_BLOCK_OFFSET(nm_i->next_scan_nid);

-	down_read(&nm_i->nat_tree_lock);
+	/* every free nid in blocks scanned previously is in the free list */
+	if (nm_i->first_scan_block == NEW_ADDR)
+		return;

-	for (i = 0; i < nm_i->nat_blocks; i++) {
+	/*
+	 * TODO: "next_scan_nid == 0" means after searching every nat block,
+	 * we still can't find enough free nids, there may not be any
+	 * more nid left to be found, we should stop at somewhere
+	 * instead of going through these all over again.
+	 */
+	if (max_blocks == 0)
+		max_blocks = nm_i->nat_blocks;
+
+	down_read(&nm_i->nat_tree_lock);
+	for (i = nm_i->first_scan_block; i < max_blocks; i++) {
 		if (!test_bit_le(i, nm_i->nat_block_bitmap))
 			continue;
 		if (!nm_i->free_nid_count[i])
@@ -1967,10 +1980,13 @@ static void scan_free_nid_bits(struct f2fs_sb_info *sbi)
 			nid = i * NAT_ENTRY_PER_BLOCK + idx;
 			add_free_nid(sbi, nid, true);

-			if (nm_i->nid_cnt[FREE_NID] >= MAX_FREE_NIDS)
+			if (nm_i->nid_cnt[FREE_NID] >= MAX_FREE_NIDS) {
+				nm_i->first_scan_block = i;
 				goto out;
+			}
 		}
 	}
+	nm_i->first_scan_block = NEW_ADDR;
 out:
 	down_read(&curseg->journal_rwsem);
 	for (i = 0; i < nats_in_cursum(journal); i++) {
@@ -2010,7 +2026,7 @@ static void __build_free_nids(struct f2fs_sb_info *sbi, bool sync, bool mount)

 		/* try to find free nids in free_nid_bitmap */
 		scan_free_nid_bits(sbi);

-		if (nm_i->nid_cnt[FREE_NID])
+		if (nm_i->nid_cnt[FREE_NID] >= NAT_ENTRY_PER_BLOCK)
 			return;
 	}
@@ -2163,6 +2179,7 @@ int try_to_free_nids(struct f2fs_sb_info *sbi, int nr_shrink)
 	struct f2fs_nm_info *nm_i = NM_I(sbi);
 	struct free_nid *i, *next;
 	int nr = nr_shrink;
+	nid_t min_nid = nm_i->max_nid;

 	if (nm_i->nid_cnt[FREE_NID] <= MAX_FREE_NIDS)
 		return 0;
@@ -2176,11 +2193,15 @@ int try_to_free_nids(struct f2fs_sb_info *sbi, int nr_shrink)
 			nm_i->nid_cnt[FREE_NID] <= MAX_FREE_NIDS)
 			break;

+		if (i->nid < min_nid)
+			min_nid = i->nid;
 		__remove_free_nid(sbi, i, FREE_NID);
 		kmem_cache_free(free_nid_slab, i);
RE: [f2fs-dev] [PATCH] f2fs: add a function to move nid
> -----Original Message-----
> From: Jaegeuk Kim [mailto:jaeg...@kernel.org]
> Sent: Monday, October 30, 2017 5:30 PM
> To: Chao Yu
> Cc: Fan Li; 'Chao Yu'; linux-kernel@vger.kernel.org;
> linux-f2fs-de...@lists.sourceforge.net
> Subject: Re: [f2fs-dev] [PATCH] f2fs: add a function to move nid
>
> On 10/28, Chao Yu wrote:
> > On 2017/10/28 19:03, Fan Li wrote:
> > > This patch add a new function to move nid from one state to another.
> > > Move operation is heavily used, by adding a new function for it we
> > > can cut down some branches from several flow.
> > >
> > > Signed-off-by: Fan li <fanofcode...@samsung.com>
> > > ---
> > >  fs/f2fs/node.c | 51 ++-
> > >  1 file changed, 30 insertions(+), 21 deletions(-)
> > >
> > > diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c index ac629d6..8116b50
> > > --- a/fs/f2fs/node.c
> > > +++ b/fs/f2fs/node.c
> > > @@ -1763,15 +1763,13 @@ static struct free_nid
> > > *__lookup_free_nid_list(struct f2fs_nm_info *nm_i, }
> > >
> > >  static int __insert_free_nid(struct f2fs_sb_info *sbi,
> > > -			struct free_nid *i, enum nid_state state, bool new)
> > > +			struct free_nid *i, enum nid_state state)
> > >  {
> > >  	struct f2fs_nm_info *nm_i = NM_I(sbi);
> > >
> > > -	if (new) {
> > > -		int err = radix_tree_insert(&nm_i->free_nid_root, i->nid, i);
> > > -		if (err)
> > > -			return err;
> > > -	}
> > > +	int err = radix_tree_insert(&nm_i->free_nid_root, i->nid, i);
> > > +	if (err)
> > > +		return err;
> > >
> > >  	f2fs_bug_on(sbi, state != i->state);
> > >  	nm_i->nid_cnt[state]++;
> > > @@ -1781,7 +1779,7 @@ static int __insert_free_nid(struct
> > > f2fs_sb_info *sbi, }
> > >
> > >  static void __remove_free_nid(struct f2fs_sb_info *sbi,
> > > -			struct free_nid *i, enum nid_state state, bool reuse)
> > > +			struct free_nid *i, enum nid_state state)
> > >  {
> > >  	struct f2fs_nm_info *nm_i = NM_I(sbi);
> > >
> > > @@ -1789,8 +1787,23 @@ static void __remove_free_nid(struct f2fs_sb_info *sbi,
> > >  	nm_i->nid_cnt[state]--;
> > >  	if (state == FREE_NID)
> > >  		list_del(&i->list);
> > > -	if (!reuse)
> > > -		radix_tree_delete(&nm_i->free_nid_root, i->nid);
> > > +	radix_tree_delete(&nm_i->free_nid_root, i->nid);
> > > +}
> > > +
> > > +static void __move_free_nid(struct f2fs_sb_info *sbi, struct free_nid *i,
> > > +		enum nid_state org_state, enum nid_state dst_state)
> > > +{
> > > +	struct f2fs_nm_info *nm_i = NM_I(sbi);
> > > +
> > > +	f2fs_bug_on(sbi, org_state != i->state);
> > > +	i->state = dst_state;
> > > +	nm_i->nid_cnt[org_state]--;
> > > +	nm_i->nid_cnt[dst_state]++;
> > > +
> > > +	if (org_state == FREE_NID)
> >
> > Welcome back to community, Fan. :)
> >
> > Minor, how about "if (dst_stat == PREALLOC_NID)" ?
>
> How about this?
>
> 	switch (dst_state) {
> 	case PREALLOC_NID:
> 		list_del(&i->list);
> 		break;
> 	case FREE_NID:
> 		list_add_tail(&i->list, &nm_i->free_nid_list);
> 		break;
> 	default:
> 		BUG_ON(1);
> 	};
>
I had thought about it, the reason why I finally chose the current version
is because "dst_state == PREALLOC_NID" implies that all states other than
PREALLOC_NID are in a list, so we need to call list_del.
Of course it's true now, but in the future if we add more states which
aren't in a list, it would be harder for us to track this branch to modify
it since it's not related to FREE_NID explicitly.
However the difference is trivial, if you think it's not a big deal, it's
fine by me.

> >
> > Otherwise this patch looks good to me, anyway, you can add:
> >
> > Reviewed-by: Chao Yu <yuch...@huawei.com>
> >
> > Thanks,
> >
> > > +		list_del(&i->list);
> > > +	else if (dst_state == FREE_NID)
> > > +		list_add_tail(&i->list, &nm_i->free_nid_list);
> > >  }
> > >
> > >  /* return if the nid is recognized as free */
> > > @@ -1850,7 +1863,7 @@ static bool add_free_nid(struct f2fs_sb_info *sbi, nid_t nid, bool build)
> > >  	}
> > >
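The shape of the helper under discussion can be modeled in a few lines of standalone C (an illustrative sketch only: free_nid_model, on_list, and the global nid_cnt[] are stand-ins invented here for the kernel's struct free_nid, list membership, and nm_i->nid_cnt[]):

```c
#include <assert.h>

enum nid_state { FREE_NID, PREALLOC_NID, MAX_NID_STATE };

int nid_cnt[MAX_NID_STATE];

struct free_nid_model {
        enum nid_state state;
        int on_list;            /* stands in for list_del()/list_add_tail() */
};

/* One state change plus the matching counter and list updates, mirroring
 * the structure of the proposed __move_free_nid(). */
void move_free_nid(struct free_nid_model *i,
                   enum nid_state org, enum nid_state dst)
{
        /* kernel version asserts org == i->state via f2fs_bug_on() here */
        i->state = dst;
        nid_cnt[org]--;
        nid_cnt[dst]++;

        if (org == FREE_NID)
                i->on_list = 0;         /* list_del(&i->list) */
        else if (dst == FREE_NID)
                i->on_list = 1;         /* list_add_tail(&i->list, ...) */
}
```

Either branch layout in the thread (if/else keyed on org_state vs. a switch on dst_state) leaves this model's invariants the same; the debate is only about which condition reads better as states are added.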
[f2fs-dev] [PATCH] f2fs: optimize __update_nat_bits
Make three modifications to __update_nat_bits:
1. Move the code that handles the nat entry with nid 0 out of the loop;
   such a nat entry only needs to be handled once, at the beginning.
2. Use "nat_index == 0" instead of "start_nid == 0" to decide whether this
   is the first nat block; it is better not to assume that @start_nid is
   the first nid of the nat block it belongs to.
3. Use "if (nat_blk->entries[i].block_addr != NULL_ADDR)" to compare
   block_addr against the constant explicitly, so the code stays correct
   even if the value of NULL_ADDR changes.

Signed-off-by: Fan li <fanofcode...@samsung.com>
---
 fs/f2fs/node.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index ac629d6..b97a031 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -2407,15 +2407,17 @@ static void __update_nat_bits(struct f2fs_sb_info *sbi, nid_t start_nid,
 	unsigned int nat_index = start_nid / NAT_ENTRY_PER_BLOCK;
 	struct f2fs_nat_block *nat_blk = page_address(page);
 	int valid = 0;
-	int i;
+	int i = 0;
 
 	if (!enabled_nat_bits(sbi, NULL))
 		return;
 
-	for (i = 0; i < NAT_ENTRY_PER_BLOCK; i++) {
-		if (start_nid == 0 && i == 0)
-			valid++;
-		if (nat_blk->entries[i].block_addr)
+	if (nat_index == 0) {
+		valid = 1;
+		i = 1;
+	}
+	for (; i < NAT_ENTRY_PER_BLOCK; i++) {
+		if (nat_blk->entries[i].block_addr != NULL_ADDR)
 			valid++;
 	}
 	if (valid == 0) {
--
2.7.4
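The effect of the rewritten loop is easy to check outside the kernel. Below is a stand-alone sketch with toy types and a toy NAT_ENTRY_PER_BLOCK (names mirror the patch, but none of this is the real f2fs code): entry 0 of nat block 0 is treated as always valid, and the remaining entries are counted against NULL_ADDR, exactly the split the patch introduces.

```c
#include <assert.h>
#include <stdint.h>

#define NAT_ENTRY_PER_BLOCK 8	/* toy value; the kernel value differs */
#define NULL_ADDR 0

struct nat_entry { uint32_t block_addr; };

/* New-style count: nat block 0 reserves entry 0 for nid 0, which is
 * considered valid unconditionally; every other entry is compared
 * against NULL_ADDR explicitly. */
static int count_valid(const struct nat_entry *blk, unsigned int nat_index)
{
	int valid = 0, i = 0;

	if (nat_index == 0) {	/* nid 0 lives in the first nat block only */
		valid = 1;
		i = 1;
	}
	for (; i < NAT_ENTRY_PER_BLOCK; i++)
		if (blk[i].block_addr != NULL_ADDR)
			valid++;
	return valid;
}
```

Hoisting the nid-0 special case out of the loop removes one branch per iteration while leaving the count unchanged for both the first and every later nat block.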
[f2fs-dev] [PATCH] f2fs: add a function to move nid
This patch adds a new function to move a nid from one state to another.
The move operation is heavily used; by adding a new function for it we
can cut some branches out of several flows.

Signed-off-by: Fan li <fanofcode...@samsung.com>
---
 fs/f2fs/node.c | 51 ++++++++++++++++++++++++++++-----------------------
 1 file changed, 30 insertions(+), 21 deletions(-)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index ac629d6..8116b50
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1763,15 +1763,13 @@ static struct free_nid *__lookup_free_nid_list(struct f2fs_nm_info *nm_i,
 }
 
 static int __insert_free_nid(struct f2fs_sb_info *sbi,
-			struct free_nid *i, enum nid_state state, bool new)
+			struct free_nid *i, enum nid_state state)
 {
 	struct f2fs_nm_info *nm_i = NM_I(sbi);
 
-	if (new) {
-		int err = radix_tree_insert(&nm_i->free_nid_root, i->nid, i);
-		if (err)
-			return err;
-	}
+	int err = radix_tree_insert(&nm_i->free_nid_root, i->nid, i);
+	if (err)
+		return err;
 
 	f2fs_bug_on(sbi, state != i->state);
 	nm_i->nid_cnt[state]++;
@@ -1781,7 +1779,7 @@ static int __insert_free_nid(struct f2fs_sb_info *sbi,
 }
 
 static void __remove_free_nid(struct f2fs_sb_info *sbi,
-			struct free_nid *i, enum nid_state state, bool reuse)
+			struct free_nid *i, enum nid_state state)
 {
 	struct f2fs_nm_info *nm_i = NM_I(sbi);
 
@@ -1789,8 +1787,23 @@ static void __remove_free_nid(struct f2fs_sb_info *sbi,
 	nm_i->nid_cnt[state]--;
 	if (state == FREE_NID)
 		list_del(&i->list);
-	if (!reuse)
-		radix_tree_delete(&nm_i->free_nid_root, i->nid);
+	radix_tree_delete(&nm_i->free_nid_root, i->nid);
+}
+
+static void __move_free_nid(struct f2fs_sb_info *sbi, struct free_nid *i,
+			enum nid_state org_state, enum nid_state dst_state)
+{
+	struct f2fs_nm_info *nm_i = NM_I(sbi);
+
+	f2fs_bug_on(sbi, org_state != i->state);
+	i->state = dst_state;
+	nm_i->nid_cnt[org_state]--;
+	nm_i->nid_cnt[dst_state]++;
+
+	if (org_state == FREE_NID)
+		list_del(&i->list);
+	else if (dst_state == FREE_NID)
+		list_add_tail(&i->list, &nm_i->free_nid_list);
 }
 
 /* return if the nid is recognized as free */
@@ -1850,7 +1863,7 @@ static bool add_free_nid(struct f2fs_sb_info *sbi, nid_t nid, bool build)
 		}
 	}
 	ret = true;
-	err = __insert_free_nid(sbi, i, FREE_NID, true);
+	err = __insert_free_nid(sbi, i, FREE_NID);
 err_out:
 	spin_unlock(&nm_i->nid_list_lock);
 	radix_tree_preload_end();
@@ -1869,7 +1882,7 @@ static void remove_free_nid(struct f2fs_sb_info *sbi, nid_t nid)
 	spin_lock(&nm_i->nid_list_lock);
 	i = __lookup_free_nid_list(nm_i, nid);
 	if (i && i->state == FREE_NID) {
-		__remove_free_nid(sbi, i, FREE_NID, false);
+		__remove_free_nid(sbi, i, FREE_NID);
 		need_free = true;
 	}
 	spin_unlock(&nm_i->nid_list_lock);
@@ -2080,9 +2093,7 @@ bool alloc_nid(struct f2fs_sb_info *sbi, nid_t *nid)
 				struct free_nid, list);
 		*nid = i->nid;
 
-		__remove_free_nid(sbi, i, FREE_NID, true);
-		i->state = PREALLOC_NID;
-		__insert_free_nid(sbi, i, PREALLOC_NID, false);
+		__move_free_nid(sbi, i, FREE_NID, PREALLOC_NID);
 		nm_i->available_nids--;
 
 		update_free_nid_bitmap(sbi, *nid, false, false);
@@ -2108,7 +2119,7 @@ void alloc_nid_done(struct f2fs_sb_info *sbi, nid_t nid)
 	spin_lock(&nm_i->nid_list_lock);
 	i = __lookup_free_nid_list(nm_i, nid);
 	f2fs_bug_on(sbi, !i);
-	__remove_free_nid(sbi, i, PREALLOC_NID, false);
+	__remove_free_nid(sbi, i, PREALLOC_NID);
 	spin_unlock(&nm_i->nid_list_lock);
 
 	kmem_cache_free(free_nid_slab, i);
@@ -2131,12 +2142,10 @@ void alloc_nid_failed(struct f2fs_sb_info *sbi, nid_t nid)
 	f2fs_bug_on(sbi, !i);
 
 	if (!available_free_memory(sbi, FREE_NIDS)) {
-		__remove_free_nid(sbi, i, PREALLOC_NID, false);
+		__remove_free_nid(sbi, i, PREALLOC_NID);
 		need_free = true;
 	} else {
-		__remove_free_nid(sbi, i, PREALLOC_NID, true);
-		i->state = FREE_NID;
-		__insert_free_nid(sbi, i, FREE_NID, false);
+		__move_free_nid(sbi, i, PREALLOC_NID, FREE_NID);
 	}
 	nm_i->available_nids++;
@@ -2167,7 +2176,7 @@ int try_to_free_nids(struct f2fs_sb_info *sbi, int nr_shrink)
 			nm_i->nid_cnt[FREE_NID] <= MAX_FREE_NIDS)
 			break;
 
-		__remove_free_nid(sbi, i, FREE_NID, false);
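The invariant behind __move_free_nid (only FREE_NID entries sit on the free list, and the per-state counters move in lockstep with the state field) can be modeled in plain C. This is a toy sketch, not the kernel code: the list membership is reduced to a flag, and the structure names merely echo the patch.

```c
#include <assert.h>

/* Toy model of the nid manager's two states and their counters. */
enum nid_state { FREE_NID, PREALLOC_NID, MAX_NID_STATE };

struct free_nid { enum nid_state state; int on_list; };
struct nm_info  { int nid_cnt[MAX_NID_STATE]; };

static void move_free_nid(struct nm_info *nm_i, struct free_nid *i,
			  enum nid_state org, enum nid_state dst)
{
	assert(org == i->state);	/* f2fs_bug_on() in the patch */
	i->state = dst;
	nm_i->nid_cnt[org]--;
	nm_i->nid_cnt[dst]++;

	if (org == FREE_NID)		/* list_del(&i->list) */
		i->on_list = 0;
	else if (dst == FREE_NID)	/* list_add_tail(&i->list, ...) */
		i->on_list = 1;
}
```

Folding the remove/insert pair into one helper means the radix tree is never touched on a move, which is exactly what made the old remove-then-insert sequence need its "reuse"/"new" flags.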
[f2fs-dev] [PATCH] f2fs: fix bugs and simplify codes of f2fs_fiemap
Fix bugs:
1. len could be updated incorrectly when start+len is beyond isize.
2. If there is a hole consisting of more than two blocks, it could
   fail to add the FIEMAP_EXTENT_LAST flag for the last extent.
3. If there is an extent beyond isize, then when we search extents in
   a range that ends at isize, it will also return the extent beyond
   isize, which is outside the requested range.

Signed-off-by: Fan li <fanofcode...@samsung.com>
---
 fs/f2fs/data.c | 80 ++++++++++++++++++--------------------------------
 1 file changed, 27 insertions(+), 53 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 5c43b2d..d67c599 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -783,7 +783,6 @@ int f2fs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 	loff_t isize = i_size_read(inode);
 	u64 logical = 0, phys = 0, size = 0;
 	u32 flags = 0;
-	bool past_eof = false, whole_file = false;
 	int ret = 0;
 
 	ret = fiemap_check_flags(fieinfo, FIEMAP_FLAG_SYNC);
@@ -797,17 +796,18 @@ int f2fs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 	}
 
 	mutex_lock(&inode->i_mutex);
+	if (start >= isize)
+		goto out;
 
-	if (len >= isize) {
-		whole_file = true;
-		len = isize;
-	}
+	if (start + len > isize)
+		len = isize - start;
 
 	if (logical_to_blk(inode, len) == 0)
 		len = blk_to_logical(inode, 1);
 
 	start_blk = logical_to_blk(inode, start);
 	last_blk = logical_to_blk(inode, start + len - 1);
+
 next:
 	memset(&map_bh, 0, sizeof(struct buffer_head));
 	map_bh.b_size = len;
@@ -819,59 +819,33 @@ next:
 
 	/* HOLE */
 	if (!buffer_mapped(&map_bh)) {
-		start_blk++;
-
-		if (!past_eof && blk_to_logical(inode, start_blk) >= isize)
-			past_eof = 1;
-
-		if (past_eof && size) {
-			flags |= FIEMAP_EXTENT_LAST;
-			ret = fiemap_fill_next_extent(fieinfo, logical,
-					phys, size, flags);
-		} else if (size) {
-			ret = fiemap_fill_next_extent(fieinfo, logical,
-					phys, size, flags);
-			size = 0;
-		}
+		/* Go through holes until past the EOF */
+		if (blk_to_logical(inode, start_blk++) < isize)
+			goto prep_next;
+		/*
+		 * Finding a hole beyond isize means there are no more
+		 * extents. Note that the premise is that filesystems don't
+		 * punch holes beyond isize and keep size unchanged.
+		 */
+		flags |= FIEMAP_EXTENT_LAST;
+	}
 
-		/* if we have holes up to/past EOF then we're done */
-		if (start_blk > last_blk || past_eof || ret)
-			goto out;
-	} else {
-		if (start_blk > last_blk && !whole_file) {
-			ret = fiemap_fill_next_extent(fieinfo, logical,
-					phys, size, flags);
-			goto out;
-		}
+	if (size)
+		ret = fiemap_fill_next_extent(fieinfo, logical,
+				phys, size, flags);
 
-		/*
-		 * if size != 0 then we know we already have an extent
-		 * to add, so add it.
-		 */
-		if (size) {
-			ret = fiemap_fill_next_extent(fieinfo, logical,
-					phys, size, flags);
-			if (ret)
-				goto out;
-		}
+	if (start_blk > last_blk || ret)
+		goto out;
 
-		logical = blk_to_logical(inode, start_blk);
-		phys = blk_to_logical(inode, map_bh.b_blocknr);
-		size = map_bh.b_size;
-		flags = 0;
-		if (buffer_unwritten(&map_bh))
-			flags = FIEMAP_EXTENT_UNWRITTEN;
+	logical = blk_to_logical(inode, start_blk);
+	phys = blk_to_logical(inode, map_bh.b_blocknr);
+	size = map_bh.b_size;
+	flags = 0;
+	if (buffer_unwritten(&map_bh))
+		flags = FIEMAP_EXTENT_UNWRITTEN;
 
-		start_blk += logical_to_blk(inode, size);
+	start_blk += logical_to_blk(inode, size);
 
-		/*
-		 * If we are past the EOF, then we need to make sure as
-		 * soon as we find a hole that the last extent we found
-		 * is marked with FIEMAP_EXTENT_LAST
-		 */
-		if (!past_eof && logical + size >= isize)
-			past_eof = true;
-	}
+prep_next:
 	cond_resched();
 	if (fatal_signal_pending(current))
 		ret = -EINTR;
--
1.7.9.5
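The start/len normalization at the top of the rewritten f2fs_fiemap can be exercised on its own. A hedged sketch of just that clamping step (plain integers instead of loff_t/inode, and the helper name clamp_fiemap_range is made up):

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a fiemap request [start, start+len) against i_size the way the
 * patch does: a request starting at or past EOF maps nothing, and a
 * request crossing EOF is truncated at EOF. Returns the clamped length,
 * with 0 meaning "nothing to map". */
static uint64_t clamp_fiemap_range(uint64_t start, uint64_t len,
				   uint64_t isize)
{
	if (start >= isize)
		return 0;
	if (start + len > isize)
		len = isize - start;
	return len;
}
```

This is what fixes bugs 1 and 3 from the commit message: len is only shortened relative to the requested start, and extents past isize can never fall inside the clamped range.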
[f2fs-dev] [PATCH] f2fs: optimize the flow of f2fs_map_blocks
Check map->m_len right after it changes, to avoid an unnecessary update
of dnode_of_data.

Signed-off-by: Fan li <fanofcode...@samsung.com>
---
 fs/f2fs/data.c | 69 ++++++++++++++++++++++++++++-----------------------
 1 file changed, 36 insertions(+), 33 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 90a2ffe..8fa4560 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -573,6 +573,7 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
 	int err = 0, ofs = 1;
 	struct extent_info ei;
 	bool allocated = false;
+	block_t blkaddr;
 
 	map->m_len = 0;
 	map->m_flags = 0;
@@ -636,6 +637,9 @@ int f2fs_map_blocks(struct inode *inode, struct f2fs_map_blocks *map,
 	pgofs++;
 
 get_next:
+	if (map->m_len >= maxblocks)
+		goto sync_out;
+
 	if (dn.ofs_in_node >= end_offset) {
 		if (allocated)
 			sync_inode_page(&dn);
@@ -653,44 +657,43 @@ get_next:
 		end_offset = ADDRS_PER_PAGE(dn.node_page, F2FS_I(inode));
 	}
 
-	if (maxblocks > map->m_len) {
-		block_t blkaddr = datablock_addr(dn.node_page, dn.ofs_in_node);
+	blkaddr = datablock_addr(dn.node_page, dn.ofs_in_node);
 
-		if (blkaddr == NEW_ADDR || blkaddr == NULL_ADDR) {
-			if (create) {
-				if (unlikely(f2fs_cp_error(sbi))) {
-					err = -EIO;
-					goto sync_out;
-				}
-				err = __allocate_data_block(&dn);
-				if (err)
-					goto sync_out;
-				allocated = true;
-				map->m_flags |= F2FS_MAP_NEW;
-				blkaddr = dn.data_blkaddr;
-			} else {
-				/*
-				 * we only merge preallocated unwritten blocks
-				 * for fiemap.
-				 */
-				if (flag != F2FS_GET_BLOCK_FIEMAP ||
-						blkaddr != NEW_ADDR)
-					goto sync_out;
+	if (blkaddr == NEW_ADDR || blkaddr == NULL_ADDR) {
+		if (create) {
+			if (unlikely(f2fs_cp_error(sbi))) {
+				err = -EIO;
+				goto sync_out;
 			}
+			err = __allocate_data_block(&dn);
+			if (err)
+				goto sync_out;
+			allocated = true;
+			map->m_flags |= F2FS_MAP_NEW;
+			blkaddr = dn.data_blkaddr;
+		} else {
+			/*
+			 * we only merge preallocated unwritten blocks
+			 * for fiemap.
+			 */
+			if (flag != F2FS_GET_BLOCK_FIEMAP ||
+					blkaddr != NEW_ADDR)
+				goto sync_out;
 		}
+	}
 
-		/* Give more consecutive addresses for the readahead */
-		if ((map->m_pblk != NEW_ADDR &&
-				blkaddr == (map->m_pblk + ofs)) ||
-				(map->m_pblk == NEW_ADDR &&
-				blkaddr == NEW_ADDR)) {
-			ofs++;
-			dn.ofs_in_node++;
-			pgofs++;
-			map->m_len++;
-			goto get_next;
-		}
+	/* Give more consecutive addresses for the readahead */
+	if ((map->m_pblk != NEW_ADDR &&
+			blkaddr == (map->m_pblk + ofs)) ||
+			(map->m_pblk == NEW_ADDR &&
+			blkaddr == NEW_ADDR)) {
+		ofs++;
+		dn.ofs_in_node++;
+		pgofs++;
+		map->m_len++;
+		goto get_next;
 	}
+
 sync_out:
 	if (allocated)
 		sync_inode_page(&dn);
--
1.7.9.5
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
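The control-flow change amounts to checking the accumulated length at the top of each get_next iteration, before the dnode is touched again. A simplified model of the merge loop over a plain array of block addresses (every name here is a stand-in, not the kernel code):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Merge consecutive block addresses starting at blks[0], stopping at
 * maxblocks. The bound check sits at the top of the loop, mirroring the
 * patch's "get_next:" rewrite, so no per-block work happens once the
 * mapping is already full. */
static unsigned int map_consecutive(const uint32_t *blks, size_t n,
				    unsigned int maxblocks)
{
	unsigned int len, ofs = 1;
	uint32_t pblk;
	size_t pos = 0;

	if (n == 0 || maxblocks == 0)
		return 0;
	pblk = blks[pos++];
	len = 1;
	while (1) {
		if (len >= maxblocks)	/* bound check first, as in the patch */
			break;
		if (pos >= n)
			break;
		if (blks[pos] != pblk + ofs)	/* not consecutive: stop */
			break;
		ofs++; pos++; len++;
	}
	return len;
}
```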
[f2fs-dev] [PATCH] f2fs: fix to reset variable correctly
f2fs_map_blocks will set m_flags and m_len to 0, so we don't need to
reset m_flags ourselves, but we do have to reset m_len to the correct
value before using it again.

Signed-off-by: Fan li <fanofcode...@samsung.com>
---
 fs/f2fs/file.c |    9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index 9949d0f..2c8050c 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -1700,7 +1700,6 @@ static int f2fs_defragment_range(struct f2fs_sb_info *sbi,
 	}
 
 	map.m_lblk = pg_start;
-	map.m_len = pg_end - pg_start;
 
 	/*
 	 * lookup mapping info in dnode page cache, skip defragmenting if all
@@ -1708,14 +1707,13 @@ static int f2fs_defragment_range(struct f2fs_sb_info *sbi,
 	 * in logical blocks.
 	 */
 	while (map.m_lblk < pg_end) {
-		map.m_flags = 0;
+		map.m_len = pg_end - map.m_lblk;
 		err = f2fs_map_blocks(inode, &map, 0, F2FS_GET_BLOCK_READ);
 		if (err)
 			goto out;
 
 		if (!(map.m_flags & F2FS_MAP_FLAGS)) {
 			map.m_lblk++;
-			map.m_len--;
 			continue;
 		}
 
@@ -1726,7 +1724,6 @@ static int f2fs_defragment_range(struct f2fs_sb_info *sbi,
 		blk_end = map.m_pblk + map.m_len;
 
 		map.m_lblk += map.m_len;
-		map.m_len = pg_end - map.m_lblk;
 	}
 
 	if (!fragmented)
@@ -1752,14 +1749,13 @@ static int f2fs_defragment_range(struct f2fs_sb_info *sbi,
 		int cnt = 0;
 
 do_map:
-		map.m_flags = 0;
+		map.m_len = pg_end - map.m_lblk;
 		err = f2fs_map_blocks(inode, &map, 0, F2FS_GET_BLOCK_READ);
 		if (err)
 			goto clear_out;
 
 		if (!(map.m_flags & F2FS_MAP_FLAGS)) {
 			map.m_lblk++;
-			map.m_len--;
 			continue;
 		}
 
@@ -1784,7 +1780,6 @@ do_map:
 		}
 
 		map.m_lblk = idx;
-		map.m_len = pg_end - idx;
 		if (idx < pg_end && cnt < blk_per_seg)
 			goto do_map;
--
1.7.9.5
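Since f2fs_map_blocks() zeroes m_len on entry, the fixed loop derives it from the cursor before every call instead of decrementing it by hand. A minimal sketch of that per-iteration recomputation (toy struct, and prepare_map is a hypothetical helper name):

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for struct f2fs_map_blocks: only the cursor and the
 * length matter for this illustration. */
struct f2fs_map { uint64_t m_lblk, m_len; };

/* What the fixed loop does at the top of each iteration: the remaining
 * length is always pg_end minus the current cursor, so it stays correct
 * no matter how the callee mangled m_len on the previous call. */
static void prepare_map(struct f2fs_map *map, uint64_t pg_end)
{
	map->m_len = pg_end - map->m_lblk;
}
```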
[f2fs-dev] [PATCH] f2fs: optimize __find_rev_next_bit
1. Skip __reverse_ulong if the bitmap word is empty.
2. Reduce branches and code.
According to my test, the performance of this new version is 5% higher
on an empty bitmap of 64 bytes, and remains about the same in the worst
scenario.

Signed-off-by: Fan li <fanofcode...@samsung.com>
---
 fs/f2fs/segment.c | 46 ++++++++++++++++------------------------------
 1 file changed, 18 insertions(+), 28 deletions(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index f77b325..efbf6b5 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -86,6 +86,7 @@ static inline unsigned long __reverse_ffs(unsigned long word)
 /*
  * __find_rev_next(_zero)_bit is copied from lib/find_next_bit.c because
  * f2fs_set_bit makes MSB and LSB reversed in a byte.
+ * @size must be an integral multiple of the size of unsigned long.
  * Example:
  *                             MSB <--> LSB
  *   f2fs_set_bit(0, bitmap) => 1000 0000
@@ -95,47 +96,36 @@ static unsigned long __find_rev_next_bit(const unsigned long *addr,
 			unsigned long size, unsigned long offset)
 {
 	const unsigned long *p = addr + BIT_WORD(offset);
-	unsigned long result = offset & ~(BITS_PER_LONG - 1);
+	unsigned long result = size;
 	unsigned long tmp;
 
 	if (offset >= size)
 		return size;
 
-	size -= result;
+	size -= (offset & ~(BITS_PER_LONG - 1));
 	offset %= BITS_PER_LONG;
-	if (!offset)
-		goto aligned;
-
-	tmp = __reverse_ulong((unsigned char *)p);
-	tmp &= ~0UL >> offset;
-
-	if (size < BITS_PER_LONG)
-		goto found_first;
-	if (tmp)
-		goto found_middle;
+	while (1) {
+		if (*p == 0)
+			goto pass;
 
-	size -= BITS_PER_LONG;
-	result += BITS_PER_LONG;
-	p++;
-aligned:
-	while (size & ~(BITS_PER_LONG-1)) {
 		tmp = __reverse_ulong((unsigned char *)p);
+
+		tmp &= ~0UL >> offset;
+		if (size < BITS_PER_LONG)
+			tmp &= (~0UL << (BITS_PER_LONG - size));
 		if (tmp)
-			goto found_middle;
-		result += BITS_PER_LONG;
+			goto found;
+pass:
+		if (size <= BITS_PER_LONG)
+			break;
 		size -= BITS_PER_LONG;
+		offset = 0;
 		p++;
 	}
-	if (!size)
-		return result;
-
-	tmp = __reverse_ulong((unsigned char *)p);
-found_first:
-	tmp &= (~0UL << (BITS_PER_LONG - size));
-	if (!tmp)		/* Are any bits set? */
-		return result + size;	/* Nope. */
-found_middle:
-	return result + __reverse_ffs(tmp);
+	return result;
+found:
+	return result - size + __reverse_ffs(tmp);
 }
 
 static unsigned long __find_rev_next_zero_bit(const unsigned long *addr,
--
1.7.9.5
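The key optimization is the "if (*p == 0)" fast path: an all-zero word is skipped before any byte-reversal or masking work is done. A simplified MSB-first scanner over ordinary uint64_t words shows the same shape (this is an illustration, not the kernel function; there is no offset handling here):

```c
#include <assert.h>
#include <stdint.h>

/* Return the MSB-first index of the first set bit, or nwords * 64 if
 * none is set. Bit 0 of word 0 is the word's most significant bit,
 * matching the reversed order __find_rev_next_bit searches in. */
static unsigned long find_first_set_msb(const uint64_t *words,
					unsigned long nwords)
{
	unsigned long w, b;

	for (w = 0; w < nwords; w++) {
		if (words[w] == 0)	/* fast path: skip empty words */
			continue;
		for (b = 0; b < 64; b++)
			if (words[w] & (1ULL << (63 - b)))	/* MSB first */
				return w * 64 + b;
	}
	return nwords * 64;
}
```

On a mostly empty bitmap the scan degenerates to one compare per word, which is where the reported speedup on an empty 64-byte bitmap comes from.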
RE: [f2fs-dev] [PATCH] f2fs: refactor __find_rev_next_{zero}_bit
In the bitmaps of f2fs, bytes are sorted in ascending order, but bits within
a byte are sorted in descending order. That seems to be the reason why
__reverse_ulong is needed. May I ask why the f2fs bitmaps apply such an
order?

> -Original Message-
> From: Jaegeuk Kim [mailto:jaeg...@kernel.org]
> Sent: Friday, October 23, 2015 12:36 AM
> To: Chao Yu
> Cc: linux-fsde...@vger.kernel.org; linux-kernel@vger.kernel.org;
> linux-f2fs-de...@lists.sourceforge.net
> Subject: Re: [f2fs-dev] [PATCH] f2fs: refactor __find_rev_next_{zero}_bit
>
> Hi Chao,
>
> On Thu, Oct 22, 2015 at 07:24:01PM +0800, Chao Yu wrote:
> > Hi Jaegeuk,
> >
> > > -Original Message-
> > > From: Jaegeuk Kim [mailto:jaeg...@kernel.org]
> > > Sent: Thursday, October 22, 2015 6:30 AM
> > > To: linux-kernel@vger.kernel.org; linux-fsde...@vger.kernel.org;
> > > linux-f2fs-de...@lists.sourceforge.net
> > > Subject: Re: [f2fs-dev] [PATCH] f2fs: refactor
> > > __find_rev_next_{zero}_bit
> > >
> > > On Wed, Oct 21, 2015 at 02:16:43PM -0700, Jaegeuk Kim wrote:
> > > > This patch refactors __find_rev_next_{zero}_bit which was disabled
> > > > previously due to bugs.
> >
> > Actually I didn't know the history here, could you please explain the
> > details of the bugs? Then I can do the review.
>
> When I disabled the existing functions by:
>
> commit e19ef527aa32f057710ec842fe656bffc263b0bb
> Author: Jaegeuk Kim
> Date:   Mon May 18 11:45:15 2015 -0700
>
>     f2fs: avoid buggy functions
>
> I found a mismatch between the results of the existing __find_rev_next_bit
> and f2fs_test_bit. I remember it happened on __find_rev_next_bit sometimes
> under xfstests runs which include trim_fs.
> (I didn't get an error from __find_rev_next_zero_bit though.)
>
> Since I had no time to investigate the bug in more detail at that time, I
> just disabled it. But now I'm trying to refactor the flow to fix this.
>
> Thanks,
>
> >
> > Thanks,
> >
> > >
> > > >
> > > > Signed-off-by: Jaegeuk Kim
> > > > ---
> > > >  fs/f2fs/segment.c | 106 +-
> > > >  1 file changed, 49 insertions(+), 57 deletions(-)
> > > >
> > > > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> > > > index f37c212..50afc93 100644
> > > > --- a/fs/f2fs/segment.c
> > > > +++ b/fs/f2fs/segment.c
> > > > @@ -29,6 +29,21 @@ static struct kmem_cache *discard_entry_slab;
> > > >  static struct kmem_cache *sit_entry_set_slab;
> > > >  static struct kmem_cache *inmem_entry_slab;
> > > >
> > > > +static unsigned long __reverse_ulong(unsigned char *str)
> > > > +{
> > > > +	unsigned long tmp = 0;
> > > > +	int shift = 24, idx = 0;
> > > > +
> > > > +#if BITS_PER_LONG == 64
> > > > +	shift = 56;
> > > > +#endif
> > > > +	while (shift >= 0) {
> > > > +		tmp |= (unsigned long)str[idx++] << shift;
> > > > +		shift -= BITS_PER_BYTE;
> > > > +	}
> > > > +	return tmp;
> > > > +}
> > > > +
> > > >  /*
> > > >   * __reverse_ffs is copied from include/asm-generic/bitops/__ffs.h since
> > > >   * MSB and LSB are reversed in a byte by f2fs_set_bit.
> > > > @@ -38,27 +53,31 @@ static inline unsigned long __reverse_ffs(unsigned long word)
> > > >  	int num = 0;
> > > >
> > > >  #if BITS_PER_LONG == 64
> > > > -	if ((word & 0xffffffff) == 0) {
> > > > +	if ((word & 0xffffffff00000000) == 0)
> > >
> > > I fixed the sparse warning by modifying:
> > > 	if ((word & 0xffffffff00000000UL) == 0)
> > >
> > > Thanks,
> > >
> > > >  		num += 32;
> > > > +	else
> > > >  		word >>= 32;
> > > > -	}
> > > >  #endif
> > > > -	if ((word & 0xffff) == 0) {
> > > > +	if ((word & 0xffff0000) == 0)
> > > >  		num += 16;
> > > > +	else
> > > >  		word >>= 16;
> > > > -	}
> > > > -	if ((word & 0xff) == 0) {
> > > > +
> > > > +	if ((word & 0xff00) == 0)
> > > >  		num += 8;
> > > > +	else
> > > >  		word >>= 8;
> > > > -	}
> > > > +
> > > >  	if ((word & 0xf0) == 0)
> > > >  		num += 4;
> > > >  	else
> > > >  		word >>= 4;
> > > > +
> > > >  	if ((word & 0xc) == 0)
> > > >  		num += 2;
> > > >  	else
> > > >  		word >>= 2;
> > > > +
> > > >  	if ((word & 0x2) == 0)
> > > >  		num += 1;
> > > >  	return num;
> > > > @@ -68,26 +87,16 @@ static inline unsigned long __reverse_ffs(unsigned long word)
> > > >   * __find_rev_next(_zero)_bit is copied from lib/find_next_bit.c because
> > > >   * f2fs_set_bit makes MSB and LSB reversed in a byte.
> > > >   * Example:
> > > > - *                             LSB <--> MSB
> > > > - *   f2fs_set_bit(0, bitmap) => 0000 0001
> > > > - *   f2fs_set_bit(7, bitmap) => 1000 0000
> > > > + *                             MSB <--> LSB
> > > > + *   f2fs_set_bit(0, bitmap) => 1000 0000
> > > > + *   f2fs_set_bit(7, bitmap)
[f2fs-dev] [PATCH] f2fs: merge pages with the same sync_mode flag
Previously f2fs submits most write requests using WRITE_SYNC, but
f2fs_write_data_pages submits the last write requests with the sync_mode
flag its callers pass. This causes a performance problem, since contiguous
pages with different sync flags can't be merged by the cfq I/O scheduler
(thanks to Chao Yu for pointing this out), and synchronous requests often
take more time.

This patch makes the following modifications to DATA writebacks:

1. Every page is written back using the sync mode the caller passes.
2. Only pages with the same sync mode can be merged into one bio request.

These changes are restricted to DATA pages; other types of writebacks are
modified to remain synchronous.

In my test with tiotest, f2fs sequential write performance improves by about
7%-10%, and this patch has no obvious impact on other performance tests.

From e942aa6336f650542f9e873404ca7c3c5e175a72 Mon Sep 17 00:00:00 2001
From: Fan Li <fanofcode...@samsung.com>
Date: Mon, 9 Dec 2013 11:25:17 +0800
Subject: [PATCH] merge pages with the same sync_mode flag

Signed-off-by: Fan Li <fanofcode...@samsung.com>
---
 fs/f2fs/data.c    | 18 ++
 fs/f2fs/f2fs.h    |  8 +---
 fs/f2fs/gc.c      |  6 +-
 fs/f2fs/segment.c | 27 +--
 4 files changed, 37 insertions(+), 22 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 4e2fc09..96f0a92 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -194,8 +194,9 @@ void f2fs_submit_page_mbio(struct f2fs_sb_info *sbi, struct page *page,
 	if (!is_read_io(rw))
 		inc_page_count(sbi, F2FS_WRITEBACK);

-	if (io->bio && io->last_block_in_bio != blk_addr - 1)
-		__submit_merged_bio(sbi, io, type, true, rw);
+	if (io->bio && (io->last_block_in_bio != blk_addr - 1 ||
+			io->rw_flag != rw))
+		__submit_merged_bio(sbi, io, type, false, io->rw_flag);
 alloc_new:
 	if (io->bio == NULL) {
 		bio_blocks = MAX_BIO_BLOCKS(max_hw_blocks(sbi));
@@ -203,6 +204,7 @@ alloc_new:
 		io->bio->bi_sector = SECTOR_FROM_BLOCK(sbi, blk_addr);
 		io->bio->bi_end_io = is_read_io(rw) ?
 					f2fs_read_end_io : f2fs_write_end_io;
+		io->rw_flag = rw;
 		/*
 		 * The end_io will be assigned at the sumbission phase.
 		 * Until then, let bio_add_page() merge consecutive IOs as much
@@ -212,7 +214,7 @@ alloc_new:

 	if (bio_add_page(io->bio, page, PAGE_CACHE_SIZE, 0) <
 							PAGE_CACHE_SIZE) {
-		__submit_merged_bio(sbi, io, type, true, rw);
+		__submit_merged_bio(sbi, io, type, false, rw);
 		goto alloc_new;
 	}

@@ -642,7 +644,7 @@ static int f2fs_read_data_pages(struct file *file,
 	return mpage_readpages(mapping, pages, nr_pages, get_data_block_ro);
 }

-int do_write_data_page(struct page *page)
+int do_write_data_page(struct page *page, struct writeback_control *wbc)
 {
 	struct inode *inode = page->mapping->host;
 	block_t old_blk_addr, new_blk_addr;
@@ -670,10 +672,10 @@ int do_write_data_page(struct page *page)
 			!is_cold_data(page) &&
 			need_inplace_update(inode))) {
 		rewrite_data_page(F2FS_SB(inode->i_sb), page,
-						old_blk_addr);
+						old_blk_addr, wbc);
 	} else {
 		write_data_page(inode, page, &dn,
-				old_blk_addr, &new_blk_addr);
+				old_blk_addr, &new_blk_addr, wbc);
 		update_extent_cache(new_blk_addr, &dn);
 	}
 out_writepage:
@@ -720,10 +722,10 @@ write:
 	if (S_ISDIR(inode->i_mode)) {
 		dec_page_count(sbi, F2FS_DIRTY_DENTS);
 		inode_dec_dirty_dents(inode);
-		err = do_write_data_page(page);
+		err = do_write_data_page(page, wbc);
 	} else {
 		f2fs_lock_op(sbi);
-		err = do_write_data_page(page);
+		err = do_write_data_page(page, wbc);
 		f2fs_unlock_op(sbi);
 		need_balance_fs = true;
 	}
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 10eca02..783f39d 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -368,6 +368,7 @@ enum page_type {
 struct f2fs_bio_info {
 	struct bio *bio;		/* bios to merge */
 	sector_t last_block_in_bio;	/* last block number */
+	int rw_flag;			/* rw flag for all pages */
 	struct mutex io_mutex;		/* mutex for bio */
 };

@@ -1098,8 +1099,9 @@ void write_meta_page(struct f2fs_sb_info *, struct page *);
 void write_node_page(struct f2fs_sb_info *, struct page *, unsigned int,
[f2fs-dev] [PATCH V2] f2fs: change the method of calculating the number of summary blocks
npages_for_summary_flush uses (SUMMARY_SIZE + 1) as the size of an
f2fs_summary while its actual size is SUMMARY_SIZE. So the result is
sometimes bigger than the actual number by one, which prevents the
checkpoint from being written to disk contiguously, and sometimes summary
blocks can't be compacted as they should be.

Besides, when writing summary blocks into pages, if the remaining space in a
page isn't big enough for one f2fs_summary, it will be left unused; the
current code doesn't seem to take that into account.

Signed-off-by: Fan Li <fanofcode...@samsung.com>
---
 fs/f2fs/segment.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 487af61..e40cdc4 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -279,9 +279,8 @@ static void __add_sum_entry(struct f2fs_sb_info *sbi, int type,
  */
 int npages_for_summary_flush(struct f2fs_sb_info *sbi)
 {
-	int total_size_bytes = 0;
 	int valid_sum_count = 0;
-	int i, sum_space;
+	int i, sum_in_page;

 	for (i = CURSEG_HOT_DATA; i <= CURSEG_COLD_DATA; i++) {
 		if (sbi->ckpt->alloc_type[i] == SSR)
@@ -290,13 +289,12 @@ int npages_for_summary_flush(struct f2fs_sb_info *sbi)
 		valid_sum_count += curseg_blkoff(sbi, i);
 	}

-	total_size_bytes = valid_sum_count * (SUMMARY_SIZE + 1)
-			+ sizeof(struct nat_journal) + 2
-			+ sizeof(struct sit_journal) + 2;
-	sum_space = PAGE_CACHE_SIZE - SUM_FOOTER_SIZE;
-	if (total_size_bytes < sum_space)
+	sum_in_page = (PAGE_CACHE_SIZE - 2 * SUM_JOURNAL_SIZE -
+			SUM_FOOTER_SIZE) / SUMMARY_SIZE;
+	if (valid_sum_count <= sum_in_page)
 		return 1;
-	else if (total_size_bytes < 2 * sum_space)
+	else if ((valid_sum_count - sum_in_page) <=
+		(PAGE_CACHE_SIZE - SUM_FOOTER_SIZE) / SUMMARY_SIZE)
 		return 2;
 	return 3;
 }
--
1.7.9.5
RE: [f2fs-dev] [PATCH] f2fs: change the method of calculating the number of summary blocks
> Hi,
>
> 2013-10-28 (Mon), 08:54 +0000, Fan Li:
> > "There is a HTML error in the previous email, so I send this one. If you
> > already received this before, please ignore it. Sorry for the
> > inconvenience."
> >
> > This patch changes the method of calculating the number of summary
> > blocks in the function npages_for_summary_flush.
> > npages_for_summary_flush uses (SUMMARY_SIZE + 1) as the size of an
> > f2fs_summary while the actual size is just SUMMARY_SIZE.
> > As a result, occasionally the return value of npages_for_summary_flush
> > will be bigger than the actual number by one, and the checkpoint won't
> > be written contiguously to disk.
> > Besides, struct f2fs_summary can't be split across two pages, so it can
> > take more space than the sum of its size; the current version doesn't
> > seem to take that into account.
>
> Again. Please write a description not to exceed 80 columns.
>

Sorry, didn't know. Here's the short version:

npages_for_summary_flush uses (SUMMARY_SIZE + 1) as the size of an
f2fs_summary while its actual size is SUMMARY_SIZE. So the result is
sometimes bigger than the actual number by one, which prevents the
checkpoint from being written to disk contiguously, and sometimes summary
blocks can't be compacted as they should be.
Besides, when writing summary blocks into pages, if the remaining space in a
page isn't big enough for one f2fs_summary, it will be left unused; the
current code doesn't seem to take that into account.

Signed-off-by: Fan Li <fanofcode...@samsung.com>
---
 fs/f2fs/segment.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 487af61..e40cdc4 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -279,9 +279,8 @@ static void __add_sum_entry(struct f2fs_sb_info *sbi, int type,
  */
 int npages_for_summary_flush(struct f2fs_sb_info *sbi)
 {
-	int total_size_bytes = 0;
 	int valid_sum_count = 0;
-	int i, sum_space;
+	int i, sum_in_page;

 	for (i = CURSEG_HOT_DATA; i <= CURSEG_COLD_DATA; i++) {
 		if (sbi->ckpt->alloc_type[i] == SSR)
@@ -290,13 +289,12 @@ int npages_for_summary_flush(struct f2fs_sb_info *sbi)
 		valid_sum_count += curseg_blkoff(sbi, i);
 	}

-	total_size_bytes = valid_sum_count * (SUMMARY_SIZE + 1)
-			+ sizeof(struct nat_journal) + 2
-			+ sizeof(struct sit_journal) + 2;
-	sum_space = PAGE_CACHE_SIZE - SUM_FOOTER_SIZE;
-	if (total_size_bytes < sum_space)
+	sum_in_page = (PAGE_CACHE_SIZE - 2 * SUM_JOURNAL_SIZE -
+			SUM_FOOTER_SIZE) / SUMMARY_SIZE;
+	if (valid_sum_count <= sum_in_page)
 		return 1;
-	else if (total_size_bytes < 2 * sum_space)
+	else if ((valid_sum_count - sum_in_page) <=
+		(PAGE_CACHE_SIZE - SUM_FOOTER_SIZE) / SUMMARY_SIZE)
 		return 2;
 	return 3;
 }
--
1.7.9.5
[f2fs-dev] [PATCH] f2fs: change the method of calculating the number of summary blocks
"There is a HTML error in the previous email, so I send this one. If you
already received this before, please ignore it. Sorry for the
inconvenience."

This patch changes the method of calculating the number of summary blocks
in the function npages_for_summary_flush.

npages_for_summary_flush uses (SUMMARY_SIZE + 1) as the size of an
f2fs_summary while the actual size is just SUMMARY_SIZE. As a result,
occasionally the return value of npages_for_summary_flush will be bigger
than the actual number by one, and the checkpoint won't be written
contiguously to disk.

Besides, struct f2fs_summary can't be split across two pages, so it can
take more space than the sum of its size; the current version doesn't seem
to take that into account.

Signed-off-by: Fan Li <fanofcode...@samsung.com>
---
 fs/f2fs/segment.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 487af61..ba2930f 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -279,9 +279,8 @@ static void __add_sum_entry(struct f2fs_sb_info *sbi, int type,
  */
 int npages_for_summary_flush(struct f2fs_sb_info *sbi)
 {
-	int total_size_bytes = 0;
 	int valid_sum_count = 0;
-	int i, sum_space;
+	int i, sum_in_page;

 	for (i = CURSEG_HOT_DATA; i <= CURSEG_COLD_DATA; i++) {
 		if (sbi->ckpt->alloc_type[i] == SSR)
@@ -290,13 +289,12 @@ int npages_for_summary_flush(struct f2fs_sb_info *sbi)
 		valid_sum_count += curseg_blkoff(sbi, i);
 	}

-	total_size_bytes = valid_sum_count * (SUMMARY_SIZE + 1)
-			+ sizeof(struct nat_journal) + 2
-			+ sizeof(struct sit_journal) + 2;
-	sum_space = PAGE_CACHE_SIZE - SUM_FOOTER_SIZE;
-	if (total_size_bytes < sum_space)
+	sum_in_page = (PAGE_CACHE_SIZE - 2*SUM_JOURNAL_SIZE - SUM_FOOTER_SIZE)/
+			SUMMARY_SIZE;
+	if (valid_sum_count <= sum_in_page)
 		return 1;
-	else if (total_size_bytes < 2 * sum_space)
+	else if (valid_sum_count-sum_in_page <=
+		(PAGE_CACHE_SIZE-SUM_FOOTER_SIZE)/SUMMARY_SIZE)
 		return 2;
 	return 3;
 }
--
1.7.9.5