On Thu, Nov 03, 2016 at 05:50:34PM +0800, Chao Yu wrote:
> On 2016/11/3 1:23, Jaegeuk Kim wrote:
> > On Wed, Nov 02, 2016 at 03:34:32PM +0800, Chao Yu wrote:
> >> Hi Jaegeuk,
> >>
> >> On 2016/10/21 10:28, Jaegeuk Kim wrote:
> >>> This patch replaces the copied code with the original generic function.
> >>
> >> Do we plan to make further enhancements inside
> >> f2fs_set_page_dirty_nobuffers? If not, it would be better to revert
> >> fe76b796fc5194cc3d57265002e3a748566d073f, since we don't need to wrap
> >> __set_page_dirty_nobuffers.
> > 
> > Urg. I was confused about something here.
> > Please ignore this patch. I won't merge it.
> 
> Why? Isn't __set_page_dirty_nobuffers a better fit for f2fs' non-buffer
> management?

A while ago, when I tried to improve performance on pmem, I found that
__set_page_dirty_buffers() slightly improved the bandwidth compared to
__set_page_dirty_nobuffers().

Going by the comment below, written above __set_page_dirty_nobuffers(), it
seems I got that gain by adopting the "top-down" approach instead of
"bottom-up", which, I guess, avoids lock contention. I couldn't investigate it
in depth, though.

/*
 * For address_spaces which do not use buffers.  Just tag the page as dirty in
 * its radix tree.
 *
 * This is also used when a single buffer is being dirtied: we want to set the
 * page dirty in that case, but not all the buffers.  This is a "bottom-up"
 * dirtying, whereas __set_page_dirty_buffers() is a "top-down" dirtying.
 *
 * The caller must ensure this doesn't race with truncation.  Most will simply
 * hold the page lock, but e.g. zap_pte_range() calls with the page mapped and
 * the pte lock held, which also locks out truncation.
 */
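
To make the distinction in that comment concrete, here is a toy userspace model
in C. It is only a sketch: struct toy_page, TOY_BUFFERS_PER_PAGE and the toy_*
helpers are made-up names for illustration, not kernel code, and the model has
no locking at all, so it says nothing about the contention I am guessing at.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical layout: e.g. four 1KB blocks backing one 4KB page. */
#define TOY_BUFFERS_PER_PAGE 4

/* Toy stand-in for a page with attached buffers; NOT the kernel's struct page. */
struct toy_page {
	bool page_dirty;
	bool buffer_dirty[TOY_BUFFERS_PER_PAGE];
};

/* "Top-down": the page is dirtied as a whole, so every buffer is dirtied too. */
void toy_dirty_top_down(struct toy_page *p)
{
	for (size_t i = 0; i < TOY_BUFFERS_PER_PAGE; i++)
		p->buffer_dirty[i] = true;
	p->page_dirty = true;
}

/* "Bottom-up": a single buffer is dirtied and only the page flag follows. */
void toy_dirty_bottom_up(struct toy_page *p, size_t idx)
{
	if (idx >= TOY_BUFFERS_PER_PAGE)
		return;	/* ignore out-of-range buffer indexes */
	p->buffer_dirty[idx] = true;
	p->page_dirty = true;
}

/* Count how many buffers of the toy page are currently dirty. */
size_t toy_count_dirty_buffers(const struct toy_page *p)
{
	size_t n = 0;

	for (size_t i = 0; i < TOY_BUFFERS_PER_PAGE; i++)
		if (p->buffer_dirty[i])
			n++;
	return n;
}
```

In this model both paths end with the page flagged dirty; they differ only in
whether the other buffers of the page are touched.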

So I measured the performance again with fxmark on a ramdisk (8 cores, DWAL,
bufferedio case). I got 2683158 works/sec with "top-down" versus 2512609
works/sec with "bottom-up".

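For scale, the gap between the two fxmark numbers above works out to roughly
6.8%. A minimal C sketch of that arithmetic follows; the helper name
relative_gain_percent is hypothetical and is not part of fxmark or f2fs.

```c
#include <stdint.h>

/*
 * Relative gain of `faster` over `slower`, in percent.
 * Illustrative helper only; returns 0.0 when `slower` is zero so that the
 * division is always defined.
 */
double relative_gain_percent(uint64_t faster, uint64_t slower)
{
	if (slower == 0)
		return 0.0;
	return ((double)faster - (double)slower) * 100.0 / (double)slower;
}
```

Calling relative_gain_percent(2683158, 2512609) gives the "top-down" gain over
"bottom-up" for this single run; I have not repeated the run, so treat it as a
rough figure.
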
Thanks,

> 
> Thanks,
> 
> > 
> >> BTW, does the original patch make memory cgroup functionality problematic?
> > 
> > I don't think there is a problem, since I just copied
> > __set_page_dirty_buffers() except for the page_has_buffers parts.
> > 
> > Thank you for pointing this out. :)
> > 
> >>
> >> Thanks,
> >>
> >>>
> >>> Signed-off-by: Jaegeuk Kim <jaeg...@kernel.org>
> >>> ---
> >>>  fs/f2fs/data.c | 29 -----------------------------
> >>>  fs/f2fs/f2fs.h |  6 +++++-
> >>>  2 files changed, 5 insertions(+), 30 deletions(-)
> >>>
> >>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> >>> index 68edb47..3954315 100644
> >>> --- a/fs/f2fs/data.c
> >>> +++ b/fs/f2fs/data.c
> >>> @@ -1801,35 +1801,6 @@ int f2fs_release_page(struct page *page, gfp_t wait)
> >>>   return 1;
> >>>  }
> >>>  
> >>> -/*
> >>> - * This was copied from __set_page_dirty_buffers which gives higher performance
> >>> - * in very high speed storages. (e.g., pmem)
> >>> - */
> >>> -void f2fs_set_page_dirty_nobuffers(struct page *page)
> >>> -{
> >>> - struct address_space *mapping = page->mapping;
> >>> - unsigned long flags;
> >>> -
> >>> - if (unlikely(!mapping))
> >>> -         return;
> >>> -
> >>> - spin_lock(&mapping->private_lock);
> >>> - lock_page_memcg(page);
> >>> - SetPageDirty(page);
> >>> - spin_unlock(&mapping->private_lock);
> >>> -
> >>> - spin_lock_irqsave(&mapping->tree_lock, flags);
> >>> - WARN_ON_ONCE(!PageUptodate(page));
> >>> - account_page_dirtied(page, mapping);
> >>> - radix_tree_tag_set(&mapping->page_tree,
> >>> -                 page_index(page), PAGECACHE_TAG_DIRTY);
> >>> - spin_unlock_irqrestore(&mapping->tree_lock, flags);
> >>> - unlock_page_memcg(page);
> >>> -
> >>> - __mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
> >>> - return;
> >>> -}
> >>> -
> >>>  static int f2fs_set_data_page_dirty(struct page *page)
> >>>  {
> >>>   struct address_space *mapping = page->mapping;
> >>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> >>> index 168f939..b66a04c 100644
> >>> --- a/fs/f2fs/f2fs.h
> >>> +++ b/fs/f2fs/f2fs.h
> >>> @@ -1960,6 +1960,11 @@ static inline unsigned long f2fs_find_next_bit(const void *addr,
> >>>   return find_next_bit(addr, size, offset + 2);
> >>>  }
> >>>  
> >>> +static inline void f2fs_set_page_dirty_nobuffers(struct page *page)
> >>> +{
> >>> + __set_page_dirty_nobuffers(page);
> >>> +}
> >>> +
> >>>  #define get_inode_mode(i) \
> >>>   ((is_inode_flag_set(i, FI_ACL_MODE)) ? \
> >>>    (F2FS_I(i)->i_acl_mode) : ((i)->i_mode))
> >>> @@ -2200,7 +2205,6 @@ struct page *get_new_data_page(struct inode *, struct page *, pgoff_t, bool);
> >>>  int do_write_data_page(struct f2fs_io_info *);
> >>>  int f2fs_map_blocks(struct inode *, struct f2fs_map_blocks *, int, int);
> >>>  int f2fs_fiemap(struct inode *inode, struct fiemap_extent_info *, u64, u64);
> >>> -void f2fs_set_page_dirty_nobuffers(struct page *);
> >>>  void f2fs_invalidate_page(struct page *, unsigned int, unsigned int);
> >>>  int f2fs_release_page(struct page *, gfp_t);
> >>>  #ifdef CONFIG_MIGRATION
> >>>
> > 
> > .
> > 

_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
