Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-08-06 Thread Gioh Kim



On 2014-08-02 at 1:04 AM, Peter Zijlstra wrote:
> On Fri, Aug 01, 2014 at 05:24:59PM +0200, Jan Kara wrote:
>
> >   OK, makes sense. But then if there's heavy IO going on, anything that has
> > IO pending on it is pinned and IO completion can easily take something
> > close to a second or more. So meeting subsecond deadlines may be tough even
> > for ordinary data pages under heavy load, even more so for metadata where
> > there are further constraints. OTOH phones aren't usually IO bound so in
> > practice it needn't be so bad ;).
>
> Yeah, typically phones are not IO bound :-)
>
> > So if it is sub-second unless someone
> > loads the storage, then that sounds doable even for metadata. But we'll
> > need to attach ->migratepage callback to blkdev pages and at least in ext4
> > case teach it how to move pages tracked by the journal.
>
> Right, making it possible at all is of course much preferred over not
> possible, regardless of timeliness :-)
>
> > > Sadly it's not only mobile devices that excel in crappy hardware, there's
> > > plenty desktop stuff that could use this too, like some of the v4l
> > > devices iirc.
> >   Yeah, but in such usecases the guarantees we can offer for completion of
> > migration are even more vague :(.
>
> Yeah, let's start by making it possible, after that we can maybe look at
> making it better, who knows.

Is my patch applicable? Or what do I have to do now?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-08-01 Thread Peter Zijlstra
On Fri, Aug 01, 2014 at 05:24:59PM +0200, Jan Kara wrote:

>   OK, makes sense. But then if there's heavy IO going on, anything that has
> IO pending on it is pinned and IO completion can easily take something
> close to a second or more. So meeting subsecond deadlines may be tough even
> for ordinary data pages under heavy load, even more so for metadata where
> there are further constraints. OTOH phones aren't usually IO bound so in
> practice it needn't be so bad ;). 

Yeah, typically phones are not IO bound :-)

> So if it is sub-second unless someone
> loads the storage, then that sounds doable even for metadata. But we'll
> need to attach ->migratepage callback to blkdev pages and at least in ext4
> case teach it how to move pages tracked by the journal.

Right, making it possible at all is of course much preferred over not
possible, regardless of timeliness :-)

> > Sadly it's not only mobile devices that excel in crappy hardware, there's
> > plenty desktop stuff that could use this too, like some of the v4l
> > devices iirc.
>   Yeah, but in such usecases the guarantees we can offer for completion of
> migration are even more vague :(.

Yeah, let's start by making it possible, after that we can maybe look at
making it better, who knows.




Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-08-01 Thread Jan Kara
On Fri 01-08-14 15:36:18, Peter Zijlstra wrote:
> On Fri, Aug 01, 2014 at 11:57:00AM +0200, Jan Kara wrote:
> > So the question really is how hard a guarantee do you need that a page in
> > movable zone is really movable. Or better in what timeframe should it be
> > movable? It may be possible to make e.g. migratepage callback for ext4
> > blkdev pages which will handle migration of pages that are just idly
> > sitting in a journal waiting to be committed. That may be reasonably doable
> > although it won't be perfect. Or we may just decide it's not worth the
> > bother and allocate all blkdev pages from unmovable zone...
> 
> So the point of CMA is to cater to those (arguably broken) devices that
> do not have scatter gather IO, and these include things like the camera
> device on your phone.
> 
> Previously (and possibly currently) your android Linux kernel will
> simply preallocate a massive physically linear chunk of memory and
> assign it to the camera hardware and not use it at all.
> 
> This is a terrible waste for most of the time people aren't running
> their camera app at all. So the point is to allow usage of the memory,
> but upon request be able to 'immediately' clear it through
> migration/writeback.
> 
> So we should be fairly 'quick' in making the memory available,
> definitely sub second timeframes.
  OK, makes sense. But then if there's heavy IO going on, anything that has
IO pending on it is pinned and IO completion can easily take something
close to a second or more. So meeting subsecond deadlines may be tough even
for ordinary data pages under heavy load, even more so for metadata where
there are further constraints. OTOH phones aren't usually IO bound so in
practice it needn't be so bad ;). So if it is sub-second unless someone
loads the storage, then that sounds doable even for metadata. But we'll
need to attach ->migratepage callback to blkdev pages and at least in ext4
case teach it how to move pages tracked by the journal.
 
> Sadly it's not only mobile devices that excel in crappy hardware, there's
> plenty desktop stuff that could use this too, like some of the v4l
> devices iirc.
  Yeah, but in such usecases the guarantees we can offer for completion of
migration are even more vague :(.

Honza
-- 
Jan Kara 
SUSE Labs, CR


Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-08-01 Thread Peter Zijlstra
On Fri, Aug 01, 2014 at 11:57:00AM +0200, Jan Kara wrote:
> So the question really is how hard a guarantee do you need that a page in
> movable zone is really movable. Or better in what timeframe should it be
> movable? It may be possible to make e.g. migratepage callback for ext4
> blkdev pages which will handle migration of pages that are just idly
> sitting in a journal waiting to be committed. That may be reasonably doable
> although it won't be perfect. Or we may just decide it's not worth the
> bother and allocate all blkdev pages from unmovable zone...

So the point of CMA is to cater to those (arguably broken) devices that
do not have scatter gather IO, and these include things like the camera
device on your phone.

Previously (and possibly currently) your android Linux kernel will
simply preallocate a massive physically linear chunk of memory and
assign it to the camera hardware and not use it at all.

This is a terrible waste for most of the time people aren't running
their camera app at all. So the point is to allow usage of the memory,
but upon request be able to 'immediately' clear it through
migration/writeback.

So we should be fairly 'quick' in making the memory available,
definitely sub second timeframes.


Sadly it's not only mobile devices that excel in crappy hardware, there's
plenty desktop stuff that could use this too, like some of the v4l
devices iirc.
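
The reclaim-on-demand idea described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: the page states, `model_alloc_contig()` and its parameters are hypothetical names invented for illustration, and "migration" is reduced to simply handing the page over. What it shows is that one pinned page inside the requested range is enough to defeat the whole contiguous allocation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page states for a userspace model of a CMA-style region. */
enum page_state {
    PAGE_FREE,      /* unused */
    PAGE_MOVABLE,   /* in use, but contents can be migrated elsewhere */
    PAGE_PINNED,    /* in use and cannot be migrated right now */
    PAGE_DEVICE     /* handed over to the device */
};

/*
 * Try to claim pages [start, start + count) of the region for a device.
 * Returns false and leaves the region untouched if any page in the range
 * cannot be vacated.
 */
static bool model_alloc_contig(enum page_state *region, size_t len,
                               size_t start, size_t count)
{
    size_t i;

    if (start > len || count > len - start)
        return false;                 /* range falls outside the region */

    /* First pass: refuse if anything in the range cannot be vacated. */
    for (i = start; i < start + count; i++)
        if (region[i] == PAGE_PINNED || region[i] == PAGE_DEVICE)
            return false;

    /* Second pass: "migrate" movable pages away and claim the range. */
    for (i = start; i < start + count; i++)
        region[i] = PAGE_DEVICE;

    return true;
}
```

In this model a long-held superblock buffer page would correspond to a `PAGE_PINNED` entry that never changes state while the filesystem stays mounted.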




Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-08-01 Thread Jan Kara
On Fri 01-08-14 10:06:40, Gioh Kim wrote:
> The function path is as follows:
> 
> [   97.868304] [<8011a750>] (drop_buffers+0xfc/0x168) from [<8011bc64>] 
> (try_to_free_buffers+0x50/0xbc)
> [   97.877457] [<8011bc64>] (try_to_free_buffers+0x50/0xbc) from [<80121e40>] 
> (blkdev_releasepage+0x38/0x48)
> [   97.887093] [<80121e40>] (blkdev_releasepage+0x38/0x48) from [<800add8c>] 
> (try_to_release_page+0x40/0x5c)
> [   97.896728] [<800add8c>] (try_to_release_page+0x40/0x5c) from [<800bd9bc>] 
> (shrink_page_list+0x508/0x8a4)
> [   97.906334] [<800bd9bc>] (shrink_page_list+0x508/0x8a4) from [<800bde5c>] 
> (reclaim_clean_pages_from_list+0x104/0x148)
> [   97.917017] [<800bde5c>] (reclaim_clean_pages_from_list+0x104/0x148) from 
> [<800b5dec>] (alloc_contig_range+0x114/0x2dc)
> [   97.927856] [<800b5dec>] (alloc_contig_range+0x114/0x2dc) from 
> [<802f6c04>] (dma_alloc_from_contiguous+0x8c/0x14c)
> [   97.938264] [<802f6c04>] (dma_alloc_from_contiguous+0x8c/0x14c) from 
> [<80017b6c>] (__alloc_from_contiguous+0x34/0xc0)
> [   97.948926] [<80017b6c>] (__alloc_from_contiguous+0x34/0xc0) from 
> [<80017d40>] (__dma_alloc+0xc4/0x2a0)
> [   97.958362] [<80017d40>] (__dma_alloc+0xc4/0x2a0) from [<8001803c>] 
> (arm_dma_alloc+0x80/0x98)
> [   97.966916] [<8001803c>] (arm_dma_alloc+0x80/0x98) from [<7f6ea188>] 
> (cma_test_probe+0xe0/0x1f0 [drv])
  OK, this makes more sense to me. But also as Joonsoo Kim pointed out
even if we go into the migration path, we will end up calling
try_to_free_buffers() because blkdev pages are one of those which use
fallback_migrate_page() as their ->migratepage callback.
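
To illustrate why try_to_free_buffers() gives up on a page whose buffer is still held, here is a minimal userspace model of the busy check. The struct layout and the `model_*` functions are simplified assumptions for illustration, not the kernel's definitions; as far as I recall, the real buffer_busy() tests b_count together with the dirty and lock state bits, which is what the model mimics.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model; field names echo the kernel's but this is not kernel code. */
struct buffer_head {
    int b_count;                      /* references held, e.g. via sbi->s_sbh */
    bool dirty;
    bool locked;
    struct buffer_head *b_this_page;  /* circular list of buffers on one page */
};

/* A buffer counts as busy if it is referenced, dirty, or locked. */
static bool model_buffer_busy(const struct buffer_head *bh)
{
    return bh->b_count > 0 || bh->dirty || bh->locked;
}

/* Returns true only if every buffer on the page could be dropped. */
static bool model_drop_buffers(struct buffer_head *head)
{
    struct buffer_head *bh = head;

    do {
        if (model_buffer_busy(bh))
            return false;             /* one busy buffer keeps the whole page */
        bh = bh->b_this_page;
    } while (bh != head);

    return true;
}
```

A superblock buffer obtained with sb_bread() and not released until unmount corresponds to a `b_count` that never drops to zero, so the model's drop always fails for that page.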

Now regarding your quest to make all pages in the movable zone really
movable - you are going to have a hard time achieving that for blkdev pages.
E.g. when a metadata buffer is part of a running transaction, it will be
pinned in memory until that transaction commits which easily takes seconds.
And for a busy metadata buffer there's no guarantee that after that
transaction commits the buffer isn't already part of the newly started
transaction. So these buffers may be effectively unmovable while someone
writes to the filesystem.

So the question really is how hard a guarantee do you need that a page in
movable zone is really movable. Or better in what timeframe should it be
movable? It may be possible to make e.g. migratepage callback for ext4
blkdev pages which will handle migration of pages that are just idly
sitting in a journal waiting to be committed. That may be reasonably doable
although it won't be perfect. Or we may just decide it's not worth the
bother and allocate all blkdev pages from unmovable zone...

Honza

-- 
Jan Kara 
SUSE Labs, CR


Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-08-01 Thread Jan Kara
On Fri 01-08-14 17:34:46, Joonsoo Kim wrote:
> On Thu, Jul 31, 2014 at 02:21:14PM +0200, Jan Kara wrote:
> > On Thu 31-07-14 09:37:15, Gioh Kim wrote:
> > > 
> > > 
> > > On 2014-07-31 at 9:03 AM, Jan Kara wrote:
> > > >On Thu 31-07-14 08:54:40, Gioh Kim wrote:
> > > >>On 2014-07-30 at 7:11 PM, Jan Kara wrote:
> > > >>>On Wed 30-07-14 16:44:24, Gioh Kim wrote:
> > > On 2014-07-22 at 6:38 PM, Jan Kara wrote:
> > > >On Tue 22-07-14 09:30:05, Peter Zijlstra wrote:
> > > >>On Tue, Jul 22, 2014 at 02:18:47PM +0900, Gioh Kim wrote:
> > > >>>Hello,
> > > >>>
> > > >>>This patch try to solve problem that a long-lasting page cache of
> > > >>>ext4 superblock disturbs page migration.
> > > >>>
> > > >>>I've been testing CMA feature on my ARM-based platform
> > > >>>and found some pages for page caches cannot be migrated.
> > > >>>Some of them are page caches of superblock of ext4 filesystem.
> > > >>>
> > > >>>Current ext4 reads superblock with sb_bread(). sb_bread() 
> > > >>>allocates page
> > > >>>from movable area. But the problem is that ext4 hold the page until
> > > >>>it is unmounted. If root filesystem is ext4 the page cannot be 
> > > >>>migrated forever.
> > > >>>
> > > >>>I introduce a new API for allocating page from non-movable area.
> > > >>>It is useful for ext4 and others that want to hold page cache for 
> > > >>>a long time.
> > > >>
> > > >>There's no word on why you can't teach ext4 to still migrate that 
> > > >>page.
> > > >>For all I know it might be impossible, but at least mention why.
> > > 
> > > I am very sorry for lacking of details.
> > > 
> > > In ext4_fill_super() the buffer-head of superblock is stored in 
> > > sbi->s_sbh.
> > > The page belongs to the buffer-head is allocated from movable area.
> > > To migrate the page the buffer-head should be released via brelse().
> > > But brelse() is not called until unmount.
> > > >>>   Hum, I don't see where in the code do we check buffer_head use 
> > > >>> count. Can
> > > >>>you please point me? Thanks.
> > > >>
> > > >>Filesystem code does not check buffer_head use count.  sb_bread() 
> > > >>returns
> > > >>the buffer_head that is included in bh_lru and has non-zero use count.
> > > >>You can see the bh_lru code in buffer.c: __find_get_block() and
> > > >>lookup_bh_lru().  bh_lru_install() inserts the buffer_head into the
> > > >>bh_lru().  It first calls get_bh() to increase the use count and insert
> > > >>bh into the lru array.
> > > >>
> > > >>The buffer_head use count is non-zero until brelse() is called.
> > > >   So I probably didn't phrase the question precisely enough. What I was
> > > >asking about is where exactly *migration* code checks buffer use count?
> > > >Because as I'm looking at buffer_migrate_page() we lock the buffers on a
> > > >migrated page but we don't look at buffer use counts... So it seems to me
> > > >that migration of a page with buffers should succeed even if buffer head
> > > >has an elevated use count. Now I think that it *should* check the buffer
> > > >use counts (it is dangerous to migrate buffers someone holds reference 
> > > >to)
> > > >but I just cannot find that place. Or does CMA use some other migration
> > > >function for buffer pages than buffer_migrate_page()?
> > > 
> > > CMA allocation function is cma_alloc().
> > > Function flow is alloc_contig_range() -> __alloc_contig_migrate_range() 
> > > -> migrate_pages -> unmap_and_move
> > > -> __unmap_and_move -> try_to_free_buffers -> drop_buffers -> buffer_busy.
> > > 
> > > The buffer_busy() is checking b_count.
> > > If buffer is busy buffer-cache cannot be removed.
> > > So the page that includes buffer_head and the page that is referred to by
> > > buffer_head are not movable.
> > > 
> > > Is this what you need?
> >   Yes, this is what I was asking about. Thanks! But as I'm looking into
> > __unmap_and_move() it calls try_to_free_buffers() only if page->mapping ==
> > NULL. As the comment before that test states, this can happen only for swap
> > cache (not our case) or for pagecache pages that were truncated and not yet
> > fully cleaned up. But superblock page cannot really be truncated. So I
> > somewhat doubt you can hit the above path for a page holding superblock...
> 
> Hello,
> 
> Although page->mapping != NULL, mapping->a_ops->migratepage could be
> NULL. This is the case of block_device. See def_blk_aops in
> fs/block_dev.c. In this case, fallback_migrate_page() is called and
> then try_to_release_page() and try_to_free_buffers() would be called.
  Aaah, right! Finally I understand what happens and why I couldn't see
buffer_migrate_page() being called for blkdev buffers. I didn't realize
blkdev mappings end up with NULL ->migratepage callback. Thanks a lot for
clearing this up.

Honza
-- 
Jan Kara 
SUSE Labs, CR

Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-08-01 Thread Joonsoo Kim
On Thu, Jul 31, 2014 at 02:21:14PM +0200, Jan Kara wrote:
> On Thu 31-07-14 09:37:15, Gioh Kim wrote:
> > 
> > 
> > On 2014-07-31 at 9:03 AM, Jan Kara wrote:
> > >On Thu 31-07-14 08:54:40, Gioh Kim wrote:
> > >>On 2014-07-30 at 7:11 PM, Jan Kara wrote:
> > >>>On Wed 30-07-14 16:44:24, Gioh Kim wrote:
> > On 2014-07-22 at 6:38 PM, Jan Kara wrote:
> > >On Tue 22-07-14 09:30:05, Peter Zijlstra wrote:
> > >>On Tue, Jul 22, 2014 at 02:18:47PM +0900, Gioh Kim wrote:
> > >>>Hello,
> > >>>
> > >>>This patch try to solve problem that a long-lasting page cache of
> > >>>ext4 superblock disturbs page migration.
> > >>>
> > >>>I've been testing CMA feature on my ARM-based platform
> > >>>and found some pages for page caches cannot be migrated.
> > >>>Some of them are page caches of superblock of ext4 filesystem.
> > >>>
> > >>>Current ext4 reads superblock with sb_bread(). sb_bread() allocates 
> > >>>page
> > >>>from movable area. But the problem is that ext4 hold the page until
> > >>>it is unmounted. If root filesystem is ext4 the page cannot be 
> > >>>migrated forever.
> > >>>
> > >>>I introduce a new API for allocating page from non-movable area.
> > >>>It is useful for ext4 and others that want to hold page cache for a 
> > >>>long time.
> > >>
> > >>There's no word on why you can't teach ext4 to still migrate that 
> > >>page.
> > >>For all I know it might be impossible, but at least mention why.
> > 
> > I am very sorry for lacking of details.
> > 
> > In ext4_fill_super() the buffer-head of superblock is stored in 
> > sbi->s_sbh.
> > The page belongs to the buffer-head is allocated from movable area.
> > To migrate the page the buffer-head should be released via brelse().
> > But brelse() is not called until unmount.
> > >>>   Hum, I don't see where in the code do we check buffer_head use count. 
> > >>> Can
> > >>>you please point me? Thanks.
> > >>
> > >>Filesystem code does not check buffer_head use count.  sb_bread() returns
> > >>the buffer_head that is included in bh_lru and has non-zero use count.
> > >>You can see the bh_lru code in buffer.c: __find_get_block() and
> > >>lookup_bh_lru().  bh_lru_install() inserts the buffer_head into the
> > >>bh_lru().  It first calls get_bh() to increase the use count and insert
> > >>bh into the lru array.
> > >>
> > >>The buffer_head use count is non-zero until brelse() is called.
> > >   So I probably didn't phrase the question precisely enough. What I was
> > >asking about is where exactly *migration* code checks buffer use count?
> > >Because as I'm looking at buffer_migrate_page() we lock the buffers on a
> > >migrated page but we don't look at buffer use counts... So it seems to me
> > >that migration of a page with buffers should succeed even if buffer head
> > >has an elevated use count. Now I think that it *should* check the buffer
> > >use counts (it is dangerous to migrate buffers someone holds reference to)
> > >but I just cannot find that place. Or does CMA use some other migration
> > >function for buffer pages than buffer_migrate_page()?
> > 
> > CMA allocation function is cma_alloc().
> > Function flow is alloc_contig_range() -> __alloc_contig_migrate_range() -> 
> > migrate_pages -> unmap_and_move
> > -> __unmap_and_move -> try_to_free_buffers -> drop_buffers -> buffer_busy.
> > 
> > The buffer_busy() is checking b_count.
> > If buffer is busy buffer-cache cannot be removed.
> > So the page that includes buffer_head and the page that is referred to by
> > buffer_head are not movable.
> > 
> > Is this what you need?
>   Yes, this is what I was asking about. Thanks! But as I'm looking into
> __unmap_and_move() it calls try_to_free_buffers() only if page->mapping ==
> NULL. As the comment before that test states, this can happen only for swap
> cache (not our case) or for pagecache pages that were truncated and not yet
> fully cleaned up. But superblock page cannot really be truncated. So I
> somewhat doubt you can hit the above path for a page holding superblock...

Hello,

Although page->mapping != NULL, mapping->a_ops->migratepage could be
NULL. This is the case of block_device. See def_blk_aops in
fs/block_dev.c. In this case, fallback_migrate_page() is called and
then try_to_release_page() and try_to_free_buffers() would be called.

Thanks.
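
The dispatch described in this message can be sketched as a small userspace model. Everything below is a simplified stand-in written for illustration: the structs are not the kernel's definitions, the `model_*` names are hypothetical, and -1 merely stands in for the kernel's error return. The point it makes is that a page with a valid mapping but a NULL ->migratepage callback lands in the fallback path, where a buffer with an elevated use count blocks the move.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-ins; these are NOT the kernel definitions. */
struct buffer_head {
    int b_count;                     /* reference count held by users */
};

struct page;

struct address_space_operations {
    /* Left NULL to model the block-device mapping discussed in the thread. */
    int (*migratepage)(struct page *newpage, struct page *page);
};

struct page {
    const struct address_space_operations *a_ops; /* models page->mapping->a_ops */
    struct buffer_head *bh;                       /* models the attached buffer */
    bool moved;
};

/* Models try_to_free_buffers(): refuses while the buffer is referenced. */
static bool model_try_to_free_buffers(struct page *page)
{
    if (page->bh && page->bh->b_count > 0)
        return false;                /* busy buffer: cannot be dropped */
    page->bh = NULL;
    return true;
}

/* Models fallback_migrate_page(): buffers must be released before the move. */
static int model_fallback_migrate_page(struct page *newpage, struct page *page)
{
    if (page->bh && !model_try_to_free_buffers(page))
        return -1;                   /* stands in for a "try again" error */
    newpage->moved = true;
    return 0;
}

/* Models the dispatch: use ->migratepage if set, otherwise the fallback. */
static int model_move_to_new_page(struct page *newpage, struct page *page)
{
    if (page->a_ops && page->a_ops->migratepage)
        return page->a_ops->migratepage(newpage, page);
    return model_fallback_migrate_page(newpage, page);
}
```

With `migratepage` left NULL and `b_count` at 1, `model_move_to_new_page()` fails; once the count drops to 0 the same call succeeds, which mirrors the behaviour observed for the ext4 superblock page.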


Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-08-01 Thread Jan Kara
On Fri 01-08-14 10:06:40, Gioh Kim wrote:
> The function path is as follows:
>
> [   97.868304] [<8011a750>] (drop_buffers+0xfc/0x168) from [<8011bc64>] (try_to_free_buffers+0x50/0xbc)
> [   97.877457] [<8011bc64>] (try_to_free_buffers+0x50/0xbc) from [<80121e40>] (blkdev_releasepage+0x38/0x48)
> [   97.887093] [<80121e40>] (blkdev_releasepage+0x38/0x48) from [<800add8c>] (try_to_release_page+0x40/0x5c)
> [   97.896728] [<800add8c>] (try_to_release_page+0x40/0x5c) from [<800bd9bc>] (shrink_page_list+0x508/0x8a4)
> [   97.906334] [<800bd9bc>] (shrink_page_list+0x508/0x8a4) from [<800bde5c>] (reclaim_clean_pages_from_list+0x104/0x148)
> [   97.917017] [<800bde5c>] (reclaim_clean_pages_from_list+0x104/0x148) from [<800b5dec>] (alloc_contig_range+0x114/0x2dc)
> [   97.927856] [<800b5dec>] (alloc_contig_range+0x114/0x2dc) from [<802f6c04>] (dma_alloc_from_contiguous+0x8c/0x14c)
> [   97.938264] [<802f6c04>] (dma_alloc_from_contiguous+0x8c/0x14c) from [<80017b6c>] (__alloc_from_contiguous+0x34/0xc0)
> [   97.948926] [<80017b6c>] (__alloc_from_contiguous+0x34/0xc0) from [<80017d40>] (__dma_alloc+0xc4/0x2a0)
> [   97.958362] [<80017d40>] (__dma_alloc+0xc4/0x2a0) from [<8001803c>] (arm_dma_alloc+0x80/0x98)
> [   97.966916] [<8001803c>] (arm_dma_alloc+0x80/0x98) from [<7f6ea188>] (cma_test_probe+0xe0/0x1f0 [drv])
  OK, this makes more sense to me. But also, as Joonsoo Kim pointed out,
even if we go into the migration path, we will end up calling
try_to_free_buffers() because blkdev pages are one of those which use
fallback_migrate_page() as their ->migratepage callback.
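
The dispatch described here can be modelled in plain userspace C. This is only a sketch under stated assumptions: page_model, aops_model and every model_* function are hypothetical stand-ins, not the kernel's types or code, and the error codes simply follow the usual -EBUSY/-EAGAIN convention. It shows why a mapping with a NULL ->migratepage callback ends up depending on whether the attached buffers can be released.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for struct page. */
struct page_model {
    bool dirty;          /* page has unwritten data */
    bool has_buffers;    /* buffer_heads are attached */
    bool buffers_busy;   /* some attached buffer is referenced, dirty or locked */
};

typedef int (*migratepage_fn)(struct page_model *newpage,
                              struct page_model *page);

/* Hypothetical stand-in for address_space_operations. */
struct aops_model {
    migratepage_fn migratepage;   /* NULL when the mapping has no callback */
};

/* Plain copy of the page state; always succeeds in this model. */
static int model_migrate_page(struct page_model *newpage,
                              struct page_model *page)
{
    *newpage = *page;
    return 0;
}

/* Stand-in for try_to_release_page(): fails while any buffer is busy. */
static bool model_try_to_release_page(struct page_model *page)
{
    if (page->buffers_busy)
        return false;
    page->has_buffers = false;
    return true;
}

/* Generic path taken when the mapping provides no ->migratepage callback. */
static int model_fallback_migrate_page(struct page_model *newpage,
                                       struct page_model *page)
{
    if (page->dirty)
        return -EBUSY;    /* simplification: the model never writes back */
    if (page->has_buffers && !model_try_to_release_page(page))
        return -EAGAIN;   /* buffers could not be dropped */
    return model_migrate_page(newpage, page);
}

/* Use the mapping's callback if it has one, otherwise fall back. */
static int model_move_to_new_page(const struct aops_model *a_ops,
                                  struct page_model *newpage,
                                  struct page_model *page)
{
    assert(a_ops != NULL);
    if (a_ops->migratepage != NULL)
        return a_ops->migratepage(newpage, page);
    return model_fallback_migrate_page(newpage, page);
}
```

In this model a page whose buffers are busy fails with -EAGAIN for as long as the reference is held, which matches the behaviour reported for the superblock page.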

Now regarding your quest to make all pages in the movable zone really
movable - you are going to have a hard time achieving that for blkdev pages.
E.g. when a metadata buffer is part of a running transaction, it will be
pinned in memory until that transaction commits, which easily takes seconds.
And for a busy metadata buffer there's no guarantee that after that
transaction commits the buffer isn't already part of the newly started
transaction. So these buffers may be effectively unmovable while someone
writes to the filesystem.
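
A toy model of that pinning, assuming a buffer can belong to at most one running and one committing transaction at a time. The jbuf_model structure and the model_* helpers are hypothetical and do not reflect the journal's real data structures; the sketch only illustrates how a buffer that is modified again during a commit stays pinned after that commit finishes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of one journalled metadata buffer. */
struct jbuf_model {
    bool in_running;      /* modified in the currently running transaction */
    bool in_committing;   /* part of the transaction being committed */
};

/* The filesystem modifies the buffer under the running transaction. */
static void model_journal_write(struct jbuf_model *b)
{
    assert(b != NULL);
    b->in_running = true;
}

/* Commit starts: the running transaction becomes the committing one. */
static void model_commit_start(struct jbuf_model *b)
{
    assert(b != NULL);
    b->in_committing = b->in_running;
    b->in_running = false;
}

/* Commit finishes: the committing transaction lets go of the buffer. */
static void model_commit_finish(struct jbuf_model *b)
{
    assert(b != NULL);
    b->in_committing = false;
}

/* The buffer cannot be moved while any transaction still tracks it. */
static bool model_is_pinned(const struct jbuf_model *b)
{
    return b->in_running || b->in_committing;
}
```

A write that arrives between model_commit_start() and model_commit_finish() leaves the buffer pinned by the next transaction, so under a steady stream of writes it is never observed unpinned.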

So the question really is how hard a guarantee you need that a page in
the movable zone is really movable. Or better, in what timeframe should it be
movable? It may be possible to make e.g. a ->migratepage callback for ext4
blkdev pages which will handle migration of pages that are just idly
sitting in a journal waiting to be committed. That may be reasonably doable
although it won't be perfect. Or we may just decide it's not worth the
bother and allocate all blkdev pages from the unmovable zone...

Honza

-- 
Jan Kara j...@suse.cz
SUSE Labs, CR


Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-08-01 Thread Peter Zijlstra
On Fri, Aug 01, 2014 at 11:57:00AM +0200, Jan Kara wrote:
> So the question really is how hard a guarantee you need that a page in
> the movable zone is really movable. Or better, in what timeframe should it be
> movable? It may be possible to make e.g. a ->migratepage callback for ext4
> blkdev pages which will handle migration of pages that are just idly
> sitting in a journal waiting to be committed. That may be reasonably doable
> although it won't be perfect. Or we may just decide it's not worth the
> bother and allocate all blkdev pages from the unmovable zone...

So the point of CMA is to cater to those (arguably broken) devices that
do not have scatter gather IO, and these include things like the camera
device on your phone.

Previously (and possibly currently) your Android Linux kernel will
simply preallocate a massive physically linear chunk of memory and
assign it to the camera hardware and not use it at all.

This is a terrible waste for most of the time people aren't running
their camera app at all. So the point is to allow usage of the memory,
but upon request be able to 'immediately' clear it through
migration/writeback.

So we should be fairly 'quick' in making the memory available,
definitely sub second timeframes.
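
A rough userspace sketch of what "clearing" a contiguous range means, assuming each page is either free, movable, or pinned. The page_state enum and model_alloc_contig_range() are hypothetical names for illustration and are much simpler than the kernel's alloc_contig_range(); the point is only that a single pinned page is enough to fail the whole request.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-page state for the model. */
enum page_state { PAGE_FREE, PAGE_MOVABLE, PAGE_PINNED };

/*
 * Try to claim pages[start .. start + count). Movable pages are treated
 * as migrated away (marked free). A pinned page makes the whole request
 * fail with -EBUSY and leaves the range untouched.
 */
static int model_alloc_contig_range(enum page_state *pages, size_t npages,
                                    size_t start, size_t count)
{
    size_t i;

    assert(pages != NULL);
    if (start > npages || count > npages - start)
        return -EINVAL;

    /* First pass: refuse the request if anything in the range is pinned. */
    for (i = start; i < start + count; i++)
        if (pages[i] == PAGE_PINNED)
            return -EBUSY;

    /* Second pass: "migrate" the movable pages out of the range. */
    for (i = start; i < start + count; i++)
        pages[i] = PAGE_FREE;
    return 0;
}
```

In a real system the caller would retry or pick another range, and how long a page stays pinned decides whether the request can complete within the sub-second target.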


Sadly it's not only mobile devices that excel in crappy hardware, there's
plenty of desktop stuff that could use this too, like some of the v4l
devices iirc.




Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-08-01 Thread Jan Kara
On Fri 01-08-14 15:36:18, Peter Zijlstra wrote:
> On Fri, Aug 01, 2014 at 11:57:00AM +0200, Jan Kara wrote:
> > So the question really is how hard a guarantee you need that a page in
> > the movable zone is really movable. Or better, in what timeframe should it be
> > movable? It may be possible to make e.g. a ->migratepage callback for ext4
> > blkdev pages which will handle migration of pages that are just idly
> > sitting in a journal waiting to be committed. That may be reasonably doable
> > although it won't be perfect. Or we may just decide it's not worth the
> > bother and allocate all blkdev pages from the unmovable zone...
>
> So the point of CMA is to cater to those (arguably broken) devices that
> do not have scatter gather IO, and these include things like the camera
> device on your phone.
>
> Previously (and possibly currently) your Android Linux kernel will
> simply preallocate a massive physically linear chunk of memory and
> assign it to the camera hardware and not use it at all.
>
> This is a terrible waste for most of the time people aren't running
> their camera app at all. So the point is to allow usage of the memory,
> but upon request be able to 'immediately' clear it through
> migration/writeback.
>
> So we should be fairly 'quick' in making the memory available,
> definitely sub second timeframes.
  OK, makes sense. But then if there's heavy IO going on, anything that has
IO pending on it is pinned and IO completion can easily take something
close to a second or more. So meeting subsecond deadlines may be tough even
for ordinary data pages under heavy load, even more so for metadata where
there are further constraints. OTOH phones aren't usually IO bound so in
practice it needn't be so bad ;). So if it is sub-second unless someone
loads the storage, then that sounds doable even for metadata. But we'll
need to attach a ->migratepage callback to blkdev pages and, at least in the
ext4 case, teach it how to move pages tracked by the journal.

> Sadly it's not only mobile devices that excel in crappy hardware, there's
> plenty of desktop stuff that could use this too, like some of the v4l
> devices iirc.
  Yeah, but in such use cases the guarantees we can offer for completion of
migration are even more vague :(.

Honza
-- 
Jan Kara j...@suse.cz
SUSE Labs, CR


Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-08-01 Thread Peter Zijlstra
On Fri, Aug 01, 2014 at 05:24:59PM +0200, Jan Kara wrote:

>   OK, makes sense. But then if there's heavy IO going on, anything that has
> IO pending on it is pinned and IO completion can easily take something
> close to a second or more. So meeting subsecond deadlines may be tough even
> for ordinary data pages under heavy load, even more so for metadata where
> there are further constraints. OTOH phones aren't usually IO bound so in
> practice it needn't be so bad ;).

Yeah, typically phones are not IO bound :-)

> So if it is sub-second unless someone
> loads the storage, then that sounds doable even for metadata. But we'll
> need to attach ->migratepage callback to blkdev pages and at least in ext4
> case teach it how to move pages tracked by the journal.

Right, making it possible at all is of course much preferred over not
possible, regardless of timeliness :-)

> > Sadly its not only mobile devices that excel in crappy hardware, there's
> > plenty desktop stuff that could use this too, like some of the v4l
> > devices iirc.
>   Yeah, but in such usecases the guarantees we can offer for completion of
> migration are even more vague :(.

Yeah, let's start by making it possible; after that we can maybe look at
making it better, who knows.




Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-31 Thread Gioh Kim



On 2014-08-01 at 9:07 AM, Gioh Kim wrote:

On 2014-07-31 at 9:21 PM, Jan Kara wrote:

On Thu 31-07-14 09:37:15, Gioh Kim wrote:

On 2014-07-31 at 9:03 AM, Jan Kara wrote:

On Thu 31-07-14 08:54:40, Gioh Kim wrote:

On 2014-07-30 at 7:11 PM, Jan Kara wrote:

On Wed 30-07-14 16:44:24, Gioh Kim wrote:

On 2014-07-22 at 6:38 PM, Jan Kara wrote:

On Tue 22-07-14 09:30:05, Peter Zijlstra wrote:

On Tue, Jul 22, 2014 at 02:18:47PM +0900, Gioh Kim wrote:

Hello,

This patch tries to solve the problem that a long-lasting page cache of
the ext4 superblock disturbs page migration.

I've been testing the CMA feature on my ARM-based platform
and found that some page-cache pages cannot be migrated.
Some of them are page caches of the superblock of an ext4 filesystem.

Currently ext4 reads the superblock with sb_bread(). sb_bread() allocates the
page from the movable area. But the problem is that ext4 holds the page until
it is unmounted. If the root filesystem is ext4, the page can never be migrated.

I introduce a new API for allocating a page from the non-movable area.
It is useful for ext4 and others that want to hold page cache for a long time.


There's no word on why you can't teach ext4 to still migrate that page.
For all I know it might be impossible, but at least mention why.


I am very sorry for the lack of details.

In ext4_fill_super() the buffer-head of the superblock is stored in sbi->s_sbh.
The page that belongs to the buffer-head is allocated from the movable area.
To migrate the page, the buffer-head should be released via brelse().
But brelse() is not called until unmount.

   Hum, I don't see where in the code do we check buffer_head use count. Can
you please point me? Thanks.


Filesystem code does not check the buffer_head use count.  sb_bread() returns
a buffer_head that is included in bh_lru and has a non-zero use count.
You can see the bh_lru code in buffer.c: __find_get_block() and
lookup_bh_lru().  bh_lru_install() inserts the buffer_head into
bh_lru.  It first calls get_bh() to increase the use count and then inserts
the bh into the lru array.

The buffer_head use count is non-zero until brelse() is called.
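
The reference counting described here can be sketched in userspace C. This is a simplified model, not kernel code: bh_model, model_fill_super() and model_put_super() are hypothetical names, and the real use count is an atomic_t rather than a plain int. It shows that a reference taken at mount time keeps the count non-zero until the matching brelse() at unmount.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct buffer_head. */
struct bh_model {
    int b_count;   /* use count; the kernel uses an atomic_t */
};

/* Take a reference, as get_bh() does in the kernel. */
static void model_get_bh(struct bh_model *bh)
{
    assert(bh != NULL);
    bh->b_count++;
}

/* Drop a reference, as brelse() does; a NULL pointer is tolerated. */
static void model_brelse(struct bh_model *bh)
{
    if (bh != NULL && bh->b_count > 0)
        bh->b_count--;
}

/* Hypothetical mount-time helper: read the superblock buffer and keep it. */
static struct bh_model *model_fill_super(struct bh_model *sb_buf)
{
    model_get_bh(sb_buf);   /* reference handed to the caller of sb_bread() */
    return sb_buf;          /* kept in sbi->s_sbh until unmount */
}

/* Hypothetical unmount-time helper: the only place the reference is dropped. */
static void model_put_super(struct bh_model *s_sbh)
{
    model_brelse(s_sbh);
}
```

Between model_fill_super() and model_put_super() the count never reaches zero, which for a root filesystem means for the whole uptime of the system.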

   So I probably didn't phrase the question precisely enough. What I was
asking about is where exactly *migration* code checks buffer use count?
Because as I'm looking at buffer_migrate_page() we lock the buffers on a
migrated page but we don't look at buffer use counts... So it seems to me
that migration of a page with buffers should succeed even if buffer head
has an elevated use count. Now I think that it *should* check the buffer
use counts (it is dangerous to migrate buffers someone holds reference to)
but I just cannot find that place. Or does CMA use some other migration
function for buffer pages than buffer_migrate_page()?


CMA allocation function is cma_alloc().
Function flow is alloc_contig_range() -> __alloc_contig_migrate_range() -> 
migrate_pages -> unmap_and_move
-> __unmap_and_move -> try_to_free_buffers -> drop_buffers -> buffer_busy.

buffer_busy() checks b_count.
If a buffer is busy, the buffer cache cannot be removed.
So the page that includes the buffer_head and the page that is referred to by
the buffer_head are not movable.
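
A minimal userspace sketch of that check, assuming a buffer counts as busy when it is referenced, dirty, or locked. The bh_model structure, the MODEL_BH_* flag values and the model_* functions are hypothetical simplifications, not the kernel's definitions. The walk over the circular per-page list shows why one held buffer_head keeps the whole page from being released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag values are illustrative, not the kernel's BH_* bit numbers. */
enum { MODEL_BH_DIRTY = 1, MODEL_BH_LOCK = 2 };

/* Simplified userspace stand-in for struct buffer_head. */
struct bh_model {
    int b_count;                    /* use count */
    unsigned int b_state;           /* dirty and lock flags */
    struct bh_model *b_this_page;   /* circular list of buffers on one page */
};

/* A buffer is busy if someone holds a reference or it is dirty or locked. */
static bool model_buffer_busy(const struct bh_model *bh)
{
    return bh->b_count != 0 ||
           (bh->b_state & (MODEL_BH_DIRTY | MODEL_BH_LOCK)) != 0;
}

/*
 * Walk every buffer attached to a page. Returns true only if none of
 * them is busy, i.e. the buffers could be detached from the page.
 */
static bool model_drop_buffers(struct bh_model *head)
{
    struct bh_model *bh = head;

    assert(head != NULL);
    do {
        if (model_buffer_busy(bh))
            return false;   /* one busy buffer pins the whole page */
        bh = bh->b_this_page;
    } while (bh != head);
    return true;
}
```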

Is this what you need?

   Yes, this is what I was asking about. Thanks! But as I'm looking into
__unmap_and_move() it calls try_to_free_buffers() only if page->mapping ==
NULL. As the comment before that test states, this can happen only for swap
cache (not our case) or for pagecache pages that were truncated and not yet
fully cleaned up. But superblock page cannot really be truncated. So I
somewhat doubt you can hit the above path for a page holding superblock...


I printed the address of busy buffer_head in drop_buffers() that is called by 
try_to_free_buffers().
And I printed the address of sb buffer_head.
They were the same.

I'm going to check page->mapping.


I'm very sorry. It's my fault.

The function path is as follows:

[   97.868304] [<8011a750>] (drop_buffers+0xfc/0x168) from [<8011bc64>] 
(try_to_free_buffers+0x50/0xbc)
[   97.877457] [<8011bc64>] (try_to_free_buffers+0x50/0xbc) from [<80121e40>] 
(blkdev_releasepage+0x38/0x48)
[   97.887093] [<80121e40>] (blkdev_releasepage+0x38/0x48) from [<800add8c>] 
(try_to_release_page+0x40/0x5c)
[   97.896728] [<800add8c>] (try_to_release_page+0x40/0x5c) from [<800bd9bc>] 
(shrink_page_list+0x508/0x8a4)
[   97.906334] [<800bd9bc>] (shrink_page_list+0x508/0x8a4) from [<800bde5c>] 
(reclaim_clean_pages_from_list+0x104/0x148)
[   97.917017] [<800bde5c>] (reclaim_clean_pages_from_list+0x104/0x148) from 
[<800b5dec>] (alloc_contig_range+0x114/0x2dc)
[   97.927856] [<800b5dec>] (alloc_contig_range+0x114/0x2dc) from [<802f6c04>] 
(dma_alloc_from_contiguous+0x8c/0x14c)
[   97.938264] [<802f6c04>] (dma_alloc_from_contiguous+0x8c/0x14c) from 
[<80017b6c>] (__alloc_from_contiguous+0x34/0xc0)
[   97.948926] [<80017b6c>] (__alloc_from_contiguous+0x34/0xc0) from 
[<80017d40>] (__dma_alloc+0xc4/0x2a0)
[   97.958362] [<80017d40>] (__dma_alloc+0xc4/0x2a0) from [<8001803c>] 
(arm_dma_alloc+0x80/0x98)
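[   97.966916] [<8001803c>] (arm_dma_alloc+0x80/0x98) from [<7f6ea188>] 
(cma_test_probe+0xe0/0x1f0 [drv])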

Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-31 Thread Gioh Kim



On 2014-07-31 at 9:21 PM, Jan Kara wrote:

On Thu 31-07-14 09:37:15, Gioh Kim wrote:

On 2014-07-31 at 9:03 AM, Jan Kara wrote:

On Thu 31-07-14 08:54:40, Gioh Kim wrote:

On 2014-07-30 at 7:11 PM, Jan Kara wrote:

On Wed 30-07-14 16:44:24, Gioh Kim wrote:

On 2014-07-22 at 6:38 PM, Jan Kara wrote:

On Tue 22-07-14 09:30:05, Peter Zijlstra wrote:

On Tue, Jul 22, 2014 at 02:18:47PM +0900, Gioh Kim wrote:

Hello,

This patch tries to solve the problem that a long-lasting page cache of
the ext4 superblock disturbs page migration.

I've been testing the CMA feature on my ARM-based platform
and found that some page-cache pages cannot be migrated.
Some of them are page caches of the superblock of an ext4 filesystem.

Currently ext4 reads the superblock with sb_bread(). sb_bread() allocates the
page from the movable area. But the problem is that ext4 holds the page until
it is unmounted. If the root filesystem is ext4, the page can never be migrated.

I introduce a new API for allocating a page from the non-movable area.
It is useful for ext4 and others that want to hold page cache for a long time.


There's no word on why you can't teach ext4 to still migrate that page.
For all I know it might be impossible, but at least mention why.


I am very sorry for the lack of details.

In ext4_fill_super() the buffer-head of the superblock is stored in sbi->s_sbh.
The page that belongs to the buffer-head is allocated from the movable area.
To migrate the page, the buffer-head should be released via brelse().
But brelse() is not called until unmount.

   Hum, I don't see where in the code do we check buffer_head use count. Can
you please point me? Thanks.


Filesystem code does not check the buffer_head use count.  sb_bread() returns
a buffer_head that is included in bh_lru and has a non-zero use count.
You can see the bh_lru code in buffer.c: __find_get_block() and
lookup_bh_lru().  bh_lru_install() inserts the buffer_head into
bh_lru.  It first calls get_bh() to increase the use count and then inserts
the bh into the lru array.

The buffer_head use count is non-zero until brelse() is called.

   So I probably didn't phrase the question precisely enough. What I was
asking about is where exactly *migration* code checks buffer use count?
Because as I'm looking at buffer_migrate_page() we lock the buffers on a
migrated page but we don't look at buffer use counts... So it seems to me
that migration of a page with buffers should succeed even if buffer head
has an elevated use count. Now I think that it *should* check the buffer
use counts (it is dangerous to migrate buffers someone holds reference to)
but I just cannot find that place. Or does CMA use some other migration
function for buffer pages than buffer_migrate_page()?


CMA allocation function is cma_alloc().
Function flow is alloc_contig_range() -> __alloc_contig_migrate_range() -> 
migrate_pages -> unmap_and_move
-> __unmap_and_move -> try_to_free_buffers -> drop_buffers -> buffer_busy.

buffer_busy() checks b_count.
If a buffer is busy, the buffer cache cannot be removed.
So the page that includes the buffer_head and the page that is referred to by
the buffer_head are not movable.

Is this what you need?

   Yes, this is what I was asking about. Thanks! But as I'm looking into
__unmap_and_move() it calls try_to_free_buffers() only if page->mapping ==
NULL. As the comment before that test states, this can happen only for swap
cache (not our case) or for pagecache pages that were truncated and not yet
fully cleaned up. But superblock page cannot really be truncated. So I
somewhat doubt you can hit the above path for a page holding superblock...


I printed the address of busy buffer_head in drop_buffers() that is called by 
try_to_free_buffers().
And I printed the address of sb buffer_head.
They were the same.

I'm going to check page->mapping.




Honza




Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-31 Thread Jan Kara
On Thu 31-07-14 09:37:15, Gioh Kim wrote:
> 
> 
> On 2014-07-31 at 9:03 AM, Jan Kara wrote:
> >On Thu 31-07-14 08:54:40, Gioh Kim wrote:
> >>On 2014-07-30 at 7:11 PM, Jan Kara wrote:
> >>>On Wed 30-07-14 16:44:24, Gioh Kim wrote:
> On 2014-07-22 at 6:38 PM, Jan Kara wrote:
> >On Tue 22-07-14 09:30:05, Peter Zijlstra wrote:
> >>On Tue, Jul 22, 2014 at 02:18:47PM +0900, Gioh Kim wrote:
> >>>Hello,
> >>>
> >>>This patch tries to solve the problem that a long-lasting page cache of
> >>>the ext4 superblock disturbs page migration.
> >>>
> >>>I've been testing the CMA feature on my ARM-based platform
> >>>and found that some page-cache pages cannot be migrated.
> >>>Some of them are page caches of the superblock of an ext4 filesystem.
> >>>
> >>>Currently ext4 reads the superblock with sb_bread(). sb_bread() allocates
> >>>the page from the movable area. But the problem is that ext4 holds the page
> >>>until it is unmounted. If the root filesystem is ext4, the page can never
> >>>be migrated.
> >>>
> >>>I introduce a new API for allocating a page from the non-movable area.
> >>>It is useful for ext4 and others that want to hold page cache for a
> >>>long time.
> >>
> >>There's no word on why you can't teach ext4 to still migrate that page.
> >>For all I know it might be impossible, but at least mention why.
> 
> I am very sorry for the lack of details.
> 
> In ext4_fill_super() the buffer-head of the superblock is stored in
> sbi->s_sbh.
> The page that belongs to the buffer-head is allocated from the movable area.
> To migrate the page, the buffer-head should be released via brelse().
> But brelse() is not called until unmount.
> >>>   Hum, I don't see where in the code do we check buffer_head use count.
> >>>Can you please point me? Thanks.
> >>
> >>Filesystem code does not check the buffer_head use count.  sb_bread() returns
> >>a buffer_head that is included in bh_lru and has a non-zero use count.
> >>You can see the bh_lru code in buffer.c: __find_get_block() and
> >>lookup_bh_lru().  bh_lru_install() inserts the buffer_head into
> >>bh_lru.  It first calls get_bh() to increase the use count and then inserts
> >>the bh into the lru array.
> >>
> >>The buffer_head use count is non-zero until brelse() is called.
> >   So I probably didn't phrase the question precisely enough. What I was
> >asking about is where exactly *migration* code checks buffer use count?
> >Because as I'm looking at buffer_migrate_page() we lock the buffers on a
> >migrated page but we don't look at buffer use counts... So it seems to me
> >that migration of a page with buffers should succeed even if buffer head
> >has an elevated use count. Now I think that it *should* check the buffer
> >use counts (it is dangerous to migrate buffers someone holds reference to)
> >but I just cannot find that place. Or does CMA use some other migration
> >function for buffer pages than buffer_migrate_page()?
> 
> CMA allocation function is cma_alloc().
> Function flow is alloc_contig_range() -> __alloc_contig_migrate_range() -> 
> migrate_pages -> unmap_and_move
> -> __unmap_and_move -> try_to_free_buffers -> drop_buffers -> buffer_busy.
> 
> buffer_busy() checks b_count.
> If a buffer is busy, the buffer cache cannot be removed.
> So the page that includes the buffer_head and the page that is referred to by
> the buffer_head are not movable.
> 
> Is this what you need?
  Yes, this is what I was asking about. Thanks! But as I'm looking into
__unmap_and_move() it calls try_to_free_buffers() only if page->mapping ==
NULL. As the comment before that test states, this can happen only for swap
cache (not our case) or for pagecache pages that were truncated and not yet
fully cleaned up. But superblock page cannot really be truncated. So I
somewhat doubt you can hit the above path for a page holding superblock...

Honza
-- 
Jan Kara 
SUSE Labs, CR


Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-31 Thread Gioh Kim



2014-08-01 오전 9:07, Gioh Kim 쓴 글:



2014-07-31 오후 9:21, Jan Kara 쓴 글:

On Thu 31-07-14 09:37:15, Gioh Kim wrote:



2014-07-31 오전 9:03, Jan Kara 쓴 글:

On Thu 31-07-14 08:54:40, Gioh Kim wrote:

2014-07-30 오후 7:11, Jan Kara 쓴 글:

On Wed 30-07-14 16:44:24, Gioh Kim wrote:

2014-07-22 오후 6:38, Jan Kara 쓴 글:

On Tue 22-07-14 09:30:05, Peter Zijlstra wrote:

On Tue, Jul 22, 2014 at 02:18:47PM +0900, Gioh Kim wrote:

Hello,

This patch try to solve problem that a long-lasting page cache of
ext4 superblock disturbs page migration.

I've been testing CMA feature on my ARM-based platform
and found some pages for page caches cannot be migrated.
Some of them are page caches of superblock of ext4 filesystem.

Current ext4 reads superblock with sb_bread(). sb_bread() allocates page

from movable area. But the problem is that ext4 hold the page until

it is unmounted. If root filesystem is ext4 the page cannot be migrated forever.

I introduce a new API for allocating page from non-movable area.
It is useful for ext4 and others that want to hold page cache for a long time.


There's no word on why you can't teach ext4 to still migrate that page.
For all I know it might be impossible, but at least mention why.


I am very sorry for lacking of details.

In ext4_fill_super() the buffer-head of superblock is stored in sbi-s_sbh.
The page belongs to the buffer-head is allocated from movable area.
To migrate the page the buffer-head should be released via brelse().
But brelse() is not called until unmount.

   Hum, I don't see where in the code do we check buffer_head use count. Can
you please point me? Thanks.


Filesystem code does not check buffer_head use count.  sb_bread() returns
the buffer_head that is included in bh_lru and has non-zero use count.
You can see the bh_lru code in buffer.c: __find_get_clock() and
lookup_bh_lru().  bh_lru_install() inserts the buffer_head into the
bh_lru().  It first calls get_bh() to increase the use count and insert
bh into the lru array.

The buffer_head use count is non-zero until brelse() is called.

   So I probably didn't phrase the question precisely enough. What I was
asking about is where exactly *migration* code checks buffer use count?
Because as I'm looking at buffer_migrate_page() we lock the buffers on a
migrated page but we don't look at buffer use counts... So it seems to me
that migration of a page with buffers should succeed even if buffer head
has an elevated use count. Now I think that it *should* check the buffer
use counts (it is dangerous to migrate buffers someone holds reference to)
but I just cannot find that place. Or does CMA use some other migration
function for buffer pages than buffer_migrate_page()?


CMA allocation function is cma_alloc().
Function flow is alloc_contig_range() - __alloc_contig_migrate_range() - 
migrate_pages - unmap_and_move
- __unmap_and_move - try_to_free_buffers - drop_buffers - buffer_busy.

The buffer_busy() is checking b_count.
If buffer is busy buffer-cache cannot be removed.
So the page that includes buffer_head and the page that is refered by
buffer_head are not movable.

Is this what you need?

   Yes, this is what I was asking about. Thanks! But as I'm looking into
__unmap_and_move() it calls try_to_free_buffers() only if page-mapping ==
NULL. As the comment before that test states, this can happen only for swap
cache (not our case) or for pagecache pages that were truncated and not yet
fully cleaned up. But superblock page cannot really be truncated. So I
somewhat doubt you can hit the above path for a page holding superblock...


I printed the address of the busy buffer_head in drop_buffers(), which is called by
try_to_free_buffers().
I also printed the address of the sb buffer_head.
They were the same.

I'm going to check page->mapping.


I'm very sorry, that was my mistake.

The actual call path is as follows:

[   97.868304] [8011a750] (drop_buffers+0xfc/0x168) from [8011bc64] (try_to_free_buffers+0x50/0xbc)
[   97.877457] [8011bc64] (try_to_free_buffers+0x50/0xbc) from [80121e40] (blkdev_releasepage+0x38/0x48)
[   97.887093] [80121e40] (blkdev_releasepage+0x38/0x48) from [800add8c] (try_to_release_page+0x40/0x5c)
[   97.896728] [800add8c] (try_to_release_page+0x40/0x5c) from [800bd9bc] (shrink_page_list+0x508/0x8a4)
[   97.906334] [800bd9bc] (shrink_page_list+0x508/0x8a4) from [800bde5c] (reclaim_clean_pages_from_list+0x104/0x148)
[   97.917017] [800bde5c] (reclaim_clean_pages_from_list+0x104/0x148) from [800b5dec] (alloc_contig_range+0x114/0x2dc)
[   97.927856] [800b5dec] (alloc_contig_range+0x114/0x2dc) from [802f6c04] (dma_alloc_from_contiguous+0x8c/0x14c)
[   97.938264] [802f6c04] (dma_alloc_from_contiguous+0x8c/0x14c) from [80017b6c] (__alloc_from_contiguous+0x34/0xc0)
[   97.948926] [80017b6c] (__alloc_from_contiguous+0x34/0xc0) from [80017d40] (__dma_alloc+0xc4/0x2a0)
[   97.958362] [80017d40] (__dma_alloc+0xc4/0x2a0) from [8001803c] (arm_dma_alloc+0x80/0x98)
[   97.966916] [8001803c] (arm_dma_alloc+0x80/0x98) from 

Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-30 Thread Gioh Kim



On 2014-07-31 at 9:03 AM, Jan Kara wrote:

On Thu 31-07-14 08:54:40, Gioh Kim wrote:

On 2014-07-30 at 7:11 PM, Jan Kara wrote:

On Wed 30-07-14 16:44:24, Gioh Kim wrote:

On 2014-07-22 at 6:38 PM, Jan Kara wrote:

On Tue 22-07-14 09:30:05, Peter Zijlstra wrote:

On Tue, Jul 22, 2014 at 02:18:47PM +0900, Gioh Kim wrote:

Hello,

This patch try to solve problem that a long-lasting page cache of
ext4 superblock disturbs page migration.

I've been testing CMA feature on my ARM-based platform
and found some pages for page caches cannot be migrated.
Some of them are page caches of superblock of ext4 filesystem.

Current ext4 reads superblock with sb_bread(). sb_bread() allocates page
from movable area. But the problem is that ext4 hold the page until
it is unmounted. If root filesystem is ext4 the page cannot be migrated forever.

I introduce a new API for allocating page from non-movable area.
It is useful for ext4 and others that want to hold page cache for a long time.


There's no word on why you can't teach ext4 to still migrate that page.
For all I know it might be impossible, but at least mention why.


I am very sorry for lacking of details.

In ext4_fill_super() the buffer-head of superblock is stored in sbi->s_sbh.
The page belongs to the buffer-head is allocated from movable area.
To migrate the page the buffer-head should be released via brelse().
But brelse() is not called until unmount.

   Hum, I don't see where in the code do we check buffer_head use count. Can
you please point me? Thanks.


Filesystem code does not check the buffer_head use count.  sb_bread() returns
a buffer_head that is cached in bh_lru and has a non-zero use count.
You can see the bh_lru code in buffer.c: __find_get_block() and
lookup_bh_lru().  bh_lru_install() inserts the buffer_head into bh_lru:
it first calls get_bh() to increase the use count and then inserts
bh into the LRU array.

The buffer_head use count is non-zero until brelse() is called.

   So I probably didn't phrase the question precisely enough. What I was
asking about is where exactly *migration* code checks buffer use count?
Because as I'm looking at buffer_migrate_page() we lock the buffers on a
migrated page but we don't look at buffer use counts... So it seems to me
that migration of a page with buffers should succeed even if buffer head
has an elevated use count. Now I think that it *should* check the buffer
use counts (it is dangerous to migrate buffers someone holds reference to)
but I just cannot find that place. Or does CMA use some other migration
function for buffer pages than buffer_migrate_page()?


CMA allocation function is cma_alloc().
Function flow is alloc_contig_range() -> __alloc_contig_migrate_range() -> 
migrate_pages -> unmap_and_move
-> __unmap_and_move -> try_to_free_buffers -> drop_buffers -> buffer_busy.

buffer_busy() checks b_count.
If a buffer is busy, the buffer-cache cannot be removed.
So the page that holds the buffer_head and the page that is referred to by
the buffer_head are not movable.

Is this what you need?



Honza




Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-30 Thread Jan Kara
On Thu 31-07-14 08:54:40, Gioh Kim wrote:
> On 2014-07-30 at 7:11 PM, Jan Kara wrote:
> >On Wed 30-07-14 16:44:24, Gioh Kim wrote:
> >>On 2014-07-22 at 6:38 PM, Jan Kara wrote:
> >>>On Tue 22-07-14 09:30:05, Peter Zijlstra wrote:
> >>>>On Tue, Jul 22, 2014 at 02:18:47PM +0900, Gioh Kim wrote:
> >>>>>Hello,
> >>>>>
> >>>>>This patch try to solve problem that a long-lasting page cache of
> >>>>>ext4 superblock disturbs page migration.
> >>>>>
> >>>>>I've been testing CMA feature on my ARM-based platform
> >>>>>and found some pages for page caches cannot be migrated.
> >>>>>Some of them are page caches of superblock of ext4 filesystem.
> >>>>>
> >>>>>Current ext4 reads superblock with sb_bread(). sb_bread() allocates page
> >>>>>from movable area. But the problem is that ext4 hold the page until
> >>>>>it is unmounted. If root filesystem is ext4 the page cannot be migrated
> >>>>>forever.
> >>>>>
> >>>>>I introduce a new API for allocating page from non-movable area.
> >>>>>It is useful for ext4 and others that want to hold page cache for a long
> >>>>>time.
> >>>>
> >>>>There's no word on why you can't teach ext4 to still migrate that page.
> >>>>For all I know it might be impossible, but at least mention why.
> >>
> >>I am very sorry for lacking of details.
> >>
> >>In ext4_fill_super() the buffer-head of superblock is stored in sbi->s_sbh.
> >>The page belongs to the buffer-head is allocated from movable area.
> >>To migrate the page the buffer-head should be released via brelse().
> >>But brelse() is not called until unmount.
> >   Hum, I don't see where in the code do we check buffer_head use count. Can
> >you please point me? Thanks.
> 
> Filesystem code does not check the buffer_head use count.  sb_bread() returns
> a buffer_head that is cached in bh_lru and has a non-zero use count.
> You can see the bh_lru code in buffer.c: __find_get_block() and
> lookup_bh_lru().  bh_lru_install() inserts the buffer_head into bh_lru:
> it first calls get_bh() to increase the use count and then inserts
> bh into the LRU array.
> 
> The buffer_head use count is non-zero until brelse() is called.
  So I probably didn't phrase the question precisely enough. What I was
asking about is where exactly *migration* code checks buffer use count?
Because as I'm looking at buffer_migrate_page() we lock the buffers on a
migrated page but we don't look at buffer use counts... So it seems to me
that migration of a page with buffers should succeed even if buffer head
has an elevated use count. Now I think that it *should* check the buffer
use counts (it is dangerous to migrate buffers someone holds reference to)
but I just cannot find that place. Or does CMA use some other migration
function for buffer pages than buffer_migrate_page()?

Honza
-- 
Jan Kara 
SUSE Labs, CR


Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-30 Thread Gioh Kim



On 2014-07-30 at 7:11 PM, Jan Kara wrote:

On Wed 30-07-14 16:44:24, Gioh Kim wrote:

On 2014-07-22 at 6:38 PM, Jan Kara wrote:

On Tue 22-07-14 09:30:05, Peter Zijlstra wrote:

On Tue, Jul 22, 2014 at 02:18:47PM +0900, Gioh Kim wrote:

Hello,

This patch try to solve problem that a long-lasting page cache of
ext4 superblock disturbs page migration.

I've been testing CMA feature on my ARM-based platform
and found some pages for page caches cannot be migrated.
Some of them are page caches of superblock of ext4 filesystem.

Current ext4 reads superblock with sb_bread(). sb_bread() allocates page
from movable area. But the problem is that ext4 hold the page until
it is unmounted. If root filesystem is ext4 the page cannot be migrated forever.

I introduce a new API for allocating page from non-movable area.
It is useful for ext4 and others that want to hold page cache for a long time.


There's no word on why you can't teach ext4 to still migrate that page.
For all I know it might be impossible, but at least mention why.


I am very sorry for lacking of details.

In ext4_fill_super() the buffer-head of superblock is stored in sbi->s_sbh.
The page belongs to the buffer-head is allocated from movable area.
To migrate the page the buffer-head should be released via brelse().
But brelse() is not called until unmount.

   Hum, I don't see where in the code do we check buffer_head use count. Can
you please point me? Thanks.


Filesystem code does not check the buffer_head use count.
sb_bread() returns a buffer_head that is cached in bh_lru and has a non-zero
use count.
You can see the bh_lru code in buffer.c: __find_get_block() and lookup_bh_lru().
bh_lru_install() inserts the buffer_head into bh_lru:
it first calls get_bh() to increase the use count and then inserts bh into the
LRU array.

The buffer_head use count is non-zero until brelse() is called.




   It doesn't seem to be worth the effort to make that page movable to me
(it's reasonably doable since superblock buffer isn't accessed in *that*
many places but single movable page doesn't seem like a good tradeoff for
the complexity).

But this made me look into the migration code and it isn't completely clear
to me what makes the migration code decide that sb buffer isn't movable? We
seem to be locking the buffers before moving the underlying page but we
don't do any reference or state checks on the buffers... That seems to be
assuming that noone looks at bh->b_data without holding buffer lock. That
is likely true for ordinary data but definitely not true for metadata
buffers (i.e., buffers for pages from block device mappings).


The sb buffer is not movable because it is not released.
sb_bread increase the reference counter of buffer-head so that
the page of the buffer-head cannot be movable.

sb_bread allocates page from movable area but it is not movable until the
reference counter of the buffer-head becomes zero.
There is no lock for the buffer but the reference counter acts like lock.

   OK, but why do you care about a single page (of at most handful if you
have more filesystems) which isn't movable? That shouldn't make a big
difference to compaction...


Even a single page can make CMA migration fail.




Actually it is strange that ext4 keeps buffer-head in superblock
structure until unmount (it can be long time) I thinks the buffer-head
should be released immediately like fat_fill_super() did.  I believe
there is a reason to keep buffer-head so that I suggest this patch.

   We don't copy some data from the superblock to other structure so from
time to time we need to look e.g. at feature bits within superblock buffer.
Historically we were updating numbers of free blocks and inodes in the
superblock with each allocation but we don't do that anymore because it
scales poorly. So there is no fundamental reason for keeping sb buffer
pinned anymore. Just someone would have to rewrite the code to copy some
pieces of data from the buffer to some other structure and use it there.


I hope so.



Honza




Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-30 Thread Gioh Kim



On 2014-07-30 at 7:19 PM, Peter Zijlstra wrote:

On Wed, Jul 30, 2014 at 12:11:43PM +0200, Jan Kara wrote:

sb_bread allocates page from movable area but it is not movable until the
reference counter of the buffer-head becomes zero.
There is no lock for the buffer but the reference counter acts like lock.

   OK, but why do you care about a single page (of at most handful if you
have more filesystems) which isn't movable? That shouldn't make a big
difference to compaction...


The thing is, CMA _must_ be able to clear all the pages in its range,
otherwise its broken.

So placing nonmovable pages in a movable block utterly wrecks that.


YES. Even a single page can make CMA migration fail.



Now, Ted said that there's more effectively pinned stuff from
filesystems (and I imagine those would be things like the root inode
etc.) and those would equally wreck this..

But Gioh didn't mention any of that.. he should I suppose.


Thank you for letting me know.

I suspected there was more pinned stuff but I didn't know what it was.
I tried CMA migration but it failed even after I moved the sb page-cache to the
non-movable area.
So I just guessed that more things are pinned.
I am a newbie and not familiar with filesystem code.

Of course, all of the pinned stuff should be moved to the non-movable area.






Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-30 Thread Peter Zijlstra
On Wed, Jul 30, 2014 at 12:11:43PM +0200, Jan Kara wrote:
> > sb_bread allocates page from movable area but it is not movable until the
> > reference counter of the buffer-head becomes zero.
> > There is no lock for the buffer but the reference counter acts like lock.
>   OK, but why do you care about a single page (of at most handful if you
> have more filesystems) which isn't movable? That shouldn't make a big
> difference to compaction...

The thing is, CMA _must_ be able to clear all the pages in its range,
otherwise it's broken.

So placing nonmovable pages in a movable block utterly wrecks that.

Now, Ted said that there's more effectively pinned stuff from
filesystems (and I imagine those would be things like the root inode
etc.) and those would equally wreck this..

But Gioh didn't mention any of that.. he should I suppose.
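
The all-or-nothing nature of a CMA range, as described above, can be modelled
in a few lines of user-space C. This is a sketch under stated assumptions: the
page states and the model_* functions are hypothetical and only illustrate why
one pinned page fails a whole contiguous request; they are not the kernel's
alloc_contig_range() implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_page_state {
    MODEL_PAGE_FREE,
    MODEL_PAGE_MOVABLE,   /* in use, but can be migrated elsewhere */
    MODEL_PAGE_PINNED     /* in use and held by a long-lived reference */
};

/* Returns the index of the first pinned page in [start, start + count),
 * or -1 when nothing in the range blocks migration. */
static long model_first_blocker(const enum model_page_state *pages,
                                size_t start, size_t count)
{
    size_t i;

    assert(pages != NULL);
    for (i = start; i < start + count; i++)
        if (pages[i] == MODEL_PAGE_PINNED)
            return (long)i;
    return -1;
}

/* All-or-nothing: the range is handed out only if every page can be cleared. */
static bool model_alloc_contig_range(enum model_page_state *pages,
                                     size_t start, size_t count)
{
    size_t i;

    if (model_first_blocker(pages, start, count) >= 0)
        return false;                 /* one pinned page fails the request */
    for (i = start; i < start + count; i++)
        pages[i] = MODEL_PAGE_FREE;   /* movable contents migrated away */
    return true;
}
```

In this model a four-page request containing one pinned page fails and leaves
the range untouched, while a request over a sub-range without pinned pages
succeeds.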




Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-30 Thread Jan Kara
On Wed 30-07-14 16:44:24, Gioh Kim wrote:
> On 2014-07-22 at 6:38 PM, Jan Kara wrote:
> >On Tue 22-07-14 09:30:05, Peter Zijlstra wrote:
> >>On Tue, Jul 22, 2014 at 02:18:47PM +0900, Gioh Kim wrote:
> >>>Hello,
> >>>
> >>>This patch try to solve problem that a long-lasting page cache of
> >>>ext4 superblock disturbs page migration.
> >>>
> >>>I've been testing CMA feature on my ARM-based platform
> >>>and found some pages for page caches cannot be migrated.
> >>>Some of them are page caches of superblock of ext4 filesystem.
> >>>
> >>>Current ext4 reads superblock with sb_bread(). sb_bread() allocates page
> >>>from movable area. But the problem is that ext4 hold the page until
> >>>it is unmounted. If root filesystem is ext4 the page cannot be migrated 
> >>>forever.
> >>>
> >>>I introduce a new API for allocating page from non-movable area.
> >>>It is useful for ext4 and others that want to hold page cache for a long 
> >>>time.
> >>
> >>There's no word on why you can't teach ext4 to still migrate that page.
> >>For all I know it might be impossible, but at least mention why.
> 
> I am very sorry for lacking of details.
> 
> In ext4_fill_super() the buffer-head of superblock is stored in sbi->s_sbh.
> The page belongs to the buffer-head is allocated from movable area.
> To migrate the page the buffer-head should be released via brelse().
> But brelse() is not called until unmount.
  Hum, I don't see where in the code do we check buffer_head use count. Can
you please point me? Thanks.

> >   It doesn't seem to be worth the effort to make that page movable to me
> >(it's reasonably doable since superblock buffer isn't accessed in *that*
> >many places but single movable page doesn't seem like a good tradeoff for
> >the complexity).
> >
> >But this made me look into the migration code and it isn't completely clear
> >to me what makes the migration code decide that sb buffer isn't movable? We
> >seem to be locking the buffers before moving the underlying page but we
> >don't do any reference or state checks on the buffers... That seems to be
> >assuming that noone looks at bh->b_data without holding buffer lock. That
> >is likely true for ordinary data but definitely not true for metadata
> >buffers (i.e., buffers for pages from block device mappings).
> 
> The sb buffer is not movable because it is not released.
> sb_bread increase the reference counter of buffer-head so that
> the page of the buffer-head cannot be movable.
> 
> sb_bread allocates page from movable area but it is not movable until the
> reference counter of the buffer-head becomes zero.
> There is no lock for the buffer but the reference counter acts like lock.
  OK, but why do you care about a single page (of at most handful if you
have more filesystems) which isn't movable? That shouldn't make a big
difference to compaction...

> Actually it is strange that ext4 keeps buffer-head in superblock
> structure until unmount (it can be long time) I thinks the buffer-head
> should be released immediately like fat_fill_super() did.  I believe
> there is a reason to keep buffer-head so that I suggest this patch.
  We don't copy some data from the superblock to other structure so from
time to time we need to look e.g. at feature bits within superblock buffer.
Historically we were updating numbers of free blocks and inodes in the
superblock with each allocation but we don't do that anymore because it
scales poorly. So there is no fundamental reason for keeping sb buffer
pinned anymore. Just someone would have to rewrite the code to copy some
pieces of data from the buffer to some other structure and use it there.
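
The rewrite suggested above (copy the needed fields out of the buffer, then
drop the reference) might look roughly like the following user-space sketch.
Everything here is an assumption made for illustration: struct model_bh,
struct model_sb_features and the model_* functions are hypothetical, and the
byte offsets 0x5C, 0x60 and 0x64 are taken to be the little-endian feature
words of the on-disk superblock. The sketch also ignores that ext4 writes the
superblock back through this buffer, so a real change would be more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space stand-in for a buffer_head that maps the superblock. */
struct model_bh {
    int b_count;
    const unsigned char *b_data;
};

/* In-memory copy of the fields that are consulted after mount. */
struct model_sb_features {
    uint32_t compat;
    uint32_t incompat;
    uint32_t ro_compat;
};

/* Reads a little-endian 32-bit value from raw bytes. */
static uint32_t model_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Mirrors brelse(): drop one reference. */
static void model_brelse(struct model_bh *bh)
{
    assert(bh->b_count > 0);
    bh->b_count--;
}

/* Copies the feature words out of the buffer, then releases the buffer so
 * that nothing keeps its page pinned. */
static void model_cache_features_and_release(struct model_bh *bh,
                                             struct model_sb_features *out)
{
    out->compat    = model_le32(bh->b_data + 0x5C);  /* assumed offsets */
    out->incompat  = model_le32(bh->b_data + 0x60);
    out->ro_compat = model_le32(bh->b_data + 0x64);
    model_brelse(bh);
}
```

After the call, later feature checks would read the copy in
struct model_sb_features, and the buffer's use count is back to zero.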

Honza
-- 
Jan Kara 
SUSE Labs, CR


Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-30 Thread Kyungmin Park
Adding Marek & Tomasz,


On Wed, Jul 30, 2014 at 4:44 PM, Gioh Kim  wrote:
>
>
> On 2014-07-22 at 6:38 PM, Jan Kara wrote:
>
>> On Tue 22-07-14 09:30:05, Peter Zijlstra wrote:
>>>
>>> On Tue, Jul 22, 2014 at 02:18:47PM +0900, Gioh Kim wrote:

>>>> Hello,
>>>>
>>>> This patch try to solve problem that a long-lasting page cache of
>>>> ext4 superblock disturbs page migration.
>>>>
>>>> I've been testing CMA feature on my ARM-based platform
>>>> and found some pages for page caches cannot be migrated.
>>>> Some of them are page caches of superblock of ext4 filesystem.
>>>>
>>>> Current ext4 reads superblock with sb_bread(). sb_bread() allocates page
>>>> from movable area. But the problem is that ext4 hold the page until
>>>> it is unmounted. If root filesystem is ext4 the page cannot be migrated
>>>> forever.
>>>>
>>>> I introduce a new API for allocating page from non-movable area.
>>>> It is useful for ext4 and others that want to hold page cache for a long
>>>> time.
>>>
>>>
>>> There's no word on why you can't teach ext4 to still migrate that page.
>>> For all I know it might be impossible, but at least mention why.
>
>
> I am very sorry for lacking of details.
>
> In ext4_fill_super() the buffer-head of superblock is stored in sbi->s_sbh.
> The page belongs to the buffer-head is allocated from movable area.
> To migrate the page the buffer-head should be released via brelse().
> But brelse() is not called until unmount.
>
> For example, fat_fill_super() reads superblock via sb_bread()
> and release it via brelse() immediately. Therefore the page that stores
> superblock can be migrated.
>
>
>
>
>>It doesn't seem to be worth the effort to make that page movable to me
>> (it's reasonably doable since superblock buffer isn't accessed in *that*
>> many places but single movable page doesn't seem like a good tradeoff for
>> the complexity).
>>
>> But this made me look into the migration code and it isn't completely clear
>> to me what makes the migration code decide that sb buffer isn't movable? We
>> seem to be locking the buffers before moving the underlying page but we
>> don't do any reference or state checks on the buffers... That seems to be
>> assuming that no one looks at bh->b_data without holding buffer lock. That
>> is likely true for ordinary data but definitely not true for metadata
>> buffers (i.e., buffers for pages from block device mappings).
>
We hit similar issues and added similar workaround code.

Thank you,
Kyungmin Park
>
> The sb buffer is not movable because it is not released.
> sb_bread increase the reference counter of buffer-head so that
> the page of the buffer-head cannot be movable.
>
> sb_bread allocates page from movable area but it is not movable until the
> reference counter of the buffer-head becomes zero.
> There is no lock for the buffer but the reference counter acts like lock.
>
> Actually it is strange that ext4 keeps buffer-head in superblock structure
> until unmount (it can be long time)
> I thinks the buffer-head should be released immediately like
> fat_fill_super() did.
> I believe there is a reason to keep buffer-head so that I suggest this
> patch.
>
>
>
>
>>
>> Added linux-mm to CC to enlighten me a bit ;)
>>
>> Honza
>>
>


Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-30 Thread Gioh Kim



On 2014-07-27 at 10:01 AM, Theodore Ts'o wrote:

Gioh,

As a follow-up, if you want some further discussions about why these
patches should be accepted, it would be good to get some hard data
about why keeping the ext4 superblock pinned is causing such a
problem for page migration.  Can you give us more details about what
the impact is of not having these patches?  And how it compares to
other data structures which are currently allocated in the movable
area and tend to be pinned effectively indefinitely?

Thanks,

- Ted



I am very sorry for the late reply. I couldn't access the network for a week.

sb_bread() allocates the page from the movable area, but the reference count of
the buffer-head that manages the page must be zero before the page can be
migrated.
Therefore brelse() should be called immediately after sb_bread(), as
fat_fill_super() does.
But ext4 calls brelse() only when the superblock is unmounted,
so the page is not movable until unmount.
CMA and memory hotplug try to move the page, but they fail.

If ext4 needs to keep the buffer-cache of the superblock until unmount,
it should allocate the page from the non-movable area (because it can be held
for a long time).
This patch tries to do that.

I also sent an email to Jan Kara. Please refer to it.

Thank you for your kindness.
Please let me know if you need any more information.


Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-30 Thread Gioh Kim



On 2014-07-22 at 6:38 PM, Jan Kara wrote:

On Tue 22-07-14 09:30:05, Peter Zijlstra wrote:

On Tue, Jul 22, 2014 at 02:18:47PM +0900, Gioh Kim wrote:

Hello,

This patch try to solve problem that a long-lasting page cache of
ext4 superblock disturbs page migration.

I've been testing CMA feature on my ARM-based platform
and found some pages for page caches cannot be migrated.
Some of them are page caches of superblock of ext4 filesystem.

Current ext4 reads superblock with sb_bread(). sb_bread() allocates page
from movable area. But the problem is that ext4 hold the page until
it is unmounted. If root filesystem is ext4 the page cannot be migrated forever.

I introduce a new API for allocating page from non-movable area.
It is useful for ext4 and others that want to hold page cache for a long time.


There's no word on why you can't teach ext4 to still migrate that page.
For all I know it might be impossible, but at least mention why.


I am very sorry for the lack of detail.

In ext4_fill_super() the buffer-head of the superblock is stored in sbi->s_sbh.
The page that belongs to the buffer-head is allocated from the movable area.
To migrate the page, the buffer-head must be released via brelse().
But brelse() is not called until unmount.

For example, fat_fill_super() reads the superblock via sb_bread()
and releases it via brelse() immediately. Therefore the page that stores the
superblock can be migrated.




   It doesn't seem to be worth the effort to make that page movable to me
(it's reasonably doable since superblock buffer isn't accessed in *that*
many places but single movable page doesn't seem like a good tradeoff for
the complexity).

But this made me look into the migration code and it isn't completely clear
to me what makes the migration code decide that sb buffer isn't movable? We
seem to be locking the buffers before moving the underlying page but we
don't do any reference or state checks on the buffers... That seems to be
assuming that noone looks at bh->b_data without holding buffer lock. That
is likely true for ordinary data but definitely not true for metadata
buffers (i.e., buffers for pages from block device mappings).


The sb buffer is not movable because it is not released.
sb_bread() increases the reference counter of the buffer-head, so
the page of the buffer-head cannot be moved.

sb_bread() allocates the page from the movable area, but the page is not
movable until the reference counter of the buffer-head becomes zero.
There is no lock on the buffer, but the reference counter acts like a lock.

Actually it is strange that ext4 keeps the buffer-head in the superblock
structure until unmount (which can be a long time).
I think the buffer-head should be released immediately, as fat_fill_super()
does.
I assume there is a reason to keep the buffer-head, which is why I suggest
this patch.





Added linux-mm to CC to enlighten me a bit ;)

Honza




Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-30 Thread Gioh Kim



2014-07-27 오전 10:01, Theodore Ts'o 쓴 글:

Gioh,

As follow up, if you want some further discussions about why these
patches should be accepted, it would be good to get some hard data
about why the keeping the ext4 superblock pinned is causing such a
problem for page migation.  Can you give us more details about what
the impact is of not having these patches?  And how it compres to
other data structures which are currently allocated in the moveable
area and tend to be pinned effectively indefinitely?

Thanks,

- Ted



I am very sorry for the late reply. I couldn't access the network for a week.

sb_bread() allocates the page from the movable area, but the reference count of the
buffer-head that manages the page must be zero before the page can be migrated.
Therefore brelse() should be called immediately after sb_bread(), as in
fat_fill_super().
But ext4 calls brelse() only when the superblock is unmounted,
so the page cannot be moved until unmount.
CMA and memory hotplug try to move the page, but they fail.

If ext4 needs to keep the buffer-cache of the superblock until unmount,
it should allocate the page from the non-movable area (because it can be held for a long
time).
This patch tries to do that.

I also sent an email to Jan Kara. Please refer to it.

Thank you for your kindness.
Please let me know if you need any more information.


Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-30 Thread Kyungmin Park
Adding Marek & Tomasz,


On Wed, Jul 30, 2014 at 4:44 PM, Gioh Kim gioh@lge.com wrote:


 On 2014-07-22 at 6:38 PM, Jan Kara wrote:

 On Tue 22-07-14 09:30:05, Peter Zijlstra wrote:

 On Tue, Jul 22, 2014 at 02:18:47PM +0900, Gioh Kim wrote:

 Hello,

 This patch try to solve problem that a long-lasting page cache of
 ext4 superblock disturbs page migration.

 I've been testing CMA feature on my ARM-based platform
 and found some pages for page caches cannot be migrated.
 Some of them are page caches of superblock of ext4 filesystem.

 Current ext4 reads superblock with sb_bread(). sb_bread() allocates page
 from movable area. But the problem is that ext4 hold the page until
 it is unmounted. If root filesystem is ext4 the page cannot be migrated
 forever.

 I introduce a new API for allocating page from non-movable area.
 It is useful for ext4 and others that want to hold page cache for a long
 time.


 There's no word on why you can't teach ext4 to still migrate that page.
 For all I know it might be impossible, but at least mention why.


 I am very sorry for lacking of details.

 In ext4_fill_super() the buffer-head of superblock is stored in sbi->s_sbh.
 The page belongs to the buffer-head is allocated from movable area.
 To migrate the page the buffer-head should be released via brelse().
 But brelse() is not called until unmount.

 For example, fat_fill_super() reads superblock via sb_bread()
 and release it via brelse() immediately. Therefore the page that stores
 superblock can be migrated.




It doesn't seem to be worth the effort to make that page movable to me
 (it's reasonably doable since superblock buffer isn't accessed in *that*
 many places but single movable page doesn't seem like a good tradeoff for
 the complexity).

 But this made me look into the migration code and it isn't completely
 clear
 to me what makes the migration code decide that sb buffer isn't movable?
 We
 seem to be locking the buffers before moving the underlying page but we
 don't do any reference or state checks on the buffers... That seems to be
 assuming that noone looks at bh->b_data without holding buffer lock. That
 is likely true for ordinary data but definitely not true for metadata
 buffers (i.e., buffers for pages from block device mappings).

We hit similar issues and added similar workaround code.

Thank you,
Kyungmin Park

 The sb buffer is not movable because it is not released.
 sb_bread increase the reference counter of buffer-head so that
 the page of the buffer-head cannot be movable.

 sb_bread allocates page from movable area but it is not movable until the
 reference counter of the buffer-head becomes zero.
 There is no lock for the buffer but the reference counter acts like lock.

 Actually it is strange that ext4 keeps buffer-head in superblock structure
 until unmount (it can be long time)
 I thinks the buffer-head should be released immediately like
 fat_fill_super() did.
 I believe there is a reason to keep buffer-head so that I suggest this
 patch.





 Added linux-mm to CC to enlighten me a bit ;)

 Honza


 --
 To unsubscribe, send a message with 'unsubscribe linux-mm' in
 the body to majord...@kvack.org.  For more info on Linux MM,
 see: http://www.linux-mm.org/ .


Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-30 Thread Jan Kara
On Wed 30-07-14 16:44:24, Gioh Kim wrote:
> On 2014-07-22 at 6:38 PM, Jan Kara wrote:
> > On Tue 22-07-14 09:30:05, Peter Zijlstra wrote:
> > > On Tue, Jul 22, 2014 at 02:18:47PM +0900, Gioh Kim wrote:
> > > > Hello,
> > > >
> > > > This patch try to solve problem that a long-lasting page cache of
> > > > ext4 superblock disturbs page migration.
> > > >
> > > > I've been testing CMA feature on my ARM-based platform
> > > > and found some pages for page caches cannot be migrated.
> > > > Some of them are page caches of superblock of ext4 filesystem.
> > > >
> > > > Current ext4 reads superblock with sb_bread(). sb_bread() allocates page
> > > > from movable area. But the problem is that ext4 hold the page until
> > > > it is unmounted. If root filesystem is ext4 the page cannot be migrated
> > > > forever.
> > > >
> > > > I introduce a new API for allocating page from non-movable area.
> > > > It is useful for ext4 and others that want to hold page cache for a long
> > > > time.
> > >
> > > There's no word on why you can't teach ext4 to still migrate that page.
> > > For all I know it might be impossible, but at least mention why.
>
> I am very sorry for lacking of details.
>
> In ext4_fill_super() the buffer-head of superblock is stored in sbi->s_sbh.
> The page belongs to the buffer-head is allocated from movable area.
> To migrate the page the buffer-head should be released via brelse().
> But brelse() is not called until unmount.
  Hum, I don't see where in the code do we check buffer_head use count. Can
you please point me? Thanks.

> > It doesn't seem to be worth the effort to make that page movable to me
> > (it's reasonably doable since superblock buffer isn't accessed in *that*
> > many places but single movable page doesn't seem like a good tradeoff for
> > the complexity).
> >
> > But this made me look into the migration code and it isn't completely clear
> > to me what makes the migration code decide that sb buffer isn't movable? We
> > seem to be locking the buffers before moving the underlying page but we
> > don't do any reference or state checks on the buffers... That seems to be
> > assuming that no one looks at bh->b_data without holding buffer lock. That
> > is likely true for ordinary data but definitely not true for metadata
> > buffers (i.e., buffers for pages from block device mappings).
>
> The sb buffer is not movable because it is not released.
> sb_bread increase the reference counter of buffer-head so that
> the page of the buffer-head cannot be movable.
>
> sb_bread allocates page from movable area but it is not movable until the
> reference counter of the buffer-head becomes zero.
> There is no lock for the buffer but the reference counter acts like lock.
  OK, but why do you care about a single page (or at most a handful if you
have more filesystems) which isn't movable? That shouldn't make a big
difference to compaction...

> Actually it is strange that ext4 keeps buffer-head in superblock
> structure until unmount (it can be long time) I thinks the buffer-head
> should be released immediately like fat_fill_super() did.  I believe
> there is a reason to keep buffer-head so that I suggest this patch.
  We don't copy some data from the superblock to other structure so from
time to time we need to look e.g. at feature bits within superblock buffer.
Historically we were updating numbers of free blocks and inodes in the
superblock with each allocation but we don't do that anymore because it
scales poorly. So there is no fundamental reason for keeping sb buffer
pinned anymore. Just someone would have to rewrite the code to copy some
pieces of data from the buffer to some other structure and use it there.
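
For illustration only, the "copy some pieces of data from the buffer to
some other structure" idea could look roughly like the userspace sketch
below. This is not ext4 code: struct model_buf, struct model_sb_info, the
model_* helpers and the byte offsets are all made up for the example, and
the sketch ignores everything needed to write the superblock back.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a buffer: a use count plus raw block data. */
struct model_buf {
    int refcount;
    uint8_t data[64];
};

/* Hypothetical in-memory copy of the few fields needed after mount. */
struct model_sb_info {
    uint32_t feature_compat;
    uint32_t feature_incompat;
};

/* Byte offsets invented for this sketch; NOT the ext4 on-disk layout. */
enum { MODEL_OFF_COMPAT = 0, MODEL_OFF_INCOMPAT = 4 };

static void model_get(struct model_buf *b) { b->refcount++; }
static void model_put(struct model_buf *b) { b->refcount--; }

/* Take a reference, copy what is needed, drop the reference right away. */
static void model_fill_super(struct model_buf *b, struct model_sb_info *sbi)
{
    model_get(b);
    memcpy(&sbi->feature_compat, b->data + MODEL_OFF_COMPAT,
           sizeof(sbi->feature_compat));
    memcpy(&sbi->feature_incompat, b->data + MODEL_OFF_INCOMPAT,
           sizeof(sbi->feature_incompat));
    model_put(b);
}

/* Later feature checks read the copy, so the buffer need not stay pinned. */
static int model_has_incompat(const struct model_sb_info *sbi, uint32_t mask)
{
    return (sbi->feature_incompat & mask) != 0;
}
```

The point of the sketch is only that the use count is back at zero when
model_fill_super() returns, so nothing keeps the backing page pinned.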

Honza
-- 
Jan Kara j...@suse.cz
SUSE Labs, CR


Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-30 Thread Peter Zijlstra
On Wed, Jul 30, 2014 at 12:11:43PM +0200, Jan Kara wrote:
> > sb_bread allocates page from movable area but it is not movable until the
> > reference counter of the buffer-head becomes zero.
> > There is no lock for the buffer but the reference counter acts like lock.
>   OK, but why do you care about a single page (or at most a handful if you
> have more filesystems) which isn't movable? That shouldn't make a big
> difference to compaction...

The thing is, CMA _must_ be able to clear all the pages in its range,
otherwise it's broken.

So placing nonmovable pages in a movable block utterly wrecks that.
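
A toy model of why one pinned page is enough, as a userspace sketch. The
page states and the model_* functions are invented for illustration and
do not correspond to the real alloc_contig_range() implementation; the
sketch only shows the all-or-nothing property of a contiguous range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page states for the model. */
enum model_page_state { MODEL_FREE, MODEL_MOVABLE, MODEL_PINNED };

/* Index of the first page that blocks the range, or -1 if none does. */
static long model_first_blocker(const enum model_page_state *pages,
                                size_t start, size_t count)
{
    for (size_t i = start; i < start + count; i++)
        if (pages[i] == MODEL_PINNED)
            return (long)i;
    return -1;
}

/*
 * All-or-nothing: the range is handed out only if every page in it can
 * be vacated. Movable pages are "migrated" by marking them free here.
 */
static bool model_alloc_contig(enum model_page_state *pages,
                               size_t start, size_t count)
{
    if (model_first_blocker(pages, start, count) >= 0)
        return false;           /* a single pinned page fails the range */
    for (size_t i = start; i < start + count; i++)
        pages[i] = MODEL_FREE;
    return true;
}
```

With eight movable pages and only page 5 pinned, a request for pages 0..7
fails, while a request for pages 0..4 still succeeds.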

Now, Ted said that there's more effectively pinned stuff from
filesystems (and I imagine those would be things like the root inode
etc.) and those would equally wreck this..

But Gioh didn't mention any of that.. he should I suppose.




Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-30 Thread Gioh Kim



On 2014-07-30 at 7:19 PM, Peter Zijlstra wrote:

On Wed, Jul 30, 2014 at 12:11:43PM +0200, Jan Kara wrote:

sb_bread allocates page from movable area but it is not movable until the
reference counter of the buffer-head becomes zero.
There is no lock for the buffer but the reference counter acts like lock.

   OK, but why do you care about a single page (of at most handful if you
have more filesystems) which isn't movable? That shouldn't make a big
difference to compaction...


The thing is, CMA _must_ be able to clear all the pages in its range,
otherwise its broken.

So placing nonmovable pages in a movable block utterly wrecks that.


YES. Even a single page can make CMA migration fail.



Now, Ted said that there's more effectively pinned stuff from
filesystems (and I imagine those would be things like the root inode
etc.) and those would equally wreck this..

But Gioh didn't mention any of that.. he should I suppose.


Thanks for letting me know.

I thought there was more pinned stuff, but I didn't know what it was.
I tried CMA migration, but it failed even after I moved the sb page-cache to the
non-movable area.
So I just guessed there was more pinned stuff.
I am a newbie and not familiar with filesystem code.

Of course, all of the pinned stuff should be moved to the non-movable area.






Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-30 Thread Gioh Kim



On 2014-07-30 at 7:11 PM, Jan Kara wrote:

On Wed 30-07-14 16:44:24, Gioh Kim wrote:

On 2014-07-22 at 6:38 PM, Jan Kara wrote:

On Tue 22-07-14 09:30:05, Peter Zijlstra wrote:

On Tue, Jul 22, 2014 at 02:18:47PM +0900, Gioh Kim wrote:

Hello,

This patch try to solve problem that a long-lasting page cache of
ext4 superblock disturbs page migration.

I've been testing CMA feature on my ARM-based platform
and found some pages for page caches cannot be migrated.
Some of them are page caches of superblock of ext4 filesystem.

Current ext4 reads superblock with sb_bread(). sb_bread() allocates page

from movable area. But the problem is that ext4 hold the page until

it is unmounted. If root filesystem is ext4 the page cannot be migrated forever.

I introduce a new API for allocating page from non-movable area.
It is useful for ext4 and others that want to hold page cache for a long time.


There's no word on why you can't teach ext4 to still migrate that page.
For all I know it might be impossible, but at least mention why.


I am very sorry for lacking of details.

In ext4_fill_super() the buffer-head of superblock is stored in sbi->s_sbh.
The page belongs to the buffer-head is allocated from movable area.
To migrate the page the buffer-head should be released via brelse().
But brelse() is not called until unmount.

   Hum, I don't see where in the code do we check buffer_head use count. Can
you please point me? Thanks.


Filesystem code does not check the buffer_head use count.
sb_bread() returns a buffer_head that is included in bh_lru and has a non-zero
use count.
You can see the bh_lru code in buffer.c: __find_get_block() and lookup_bh_lru().
bh_lru_install() inserts the buffer_head into bh_lru.
It first calls get_bh() to increase the use count and then inserts the bh into the lru
array.

The buffer_head use count stays non-zero until brelse() is called.




   It doesn't seem to be worth the effort to make that page movable to me
(it's reasonably doable since superblock buffer isn't accessed in *that*
many places but single movable page doesn't seem like a good tradeoff for
the complexity).

But this made me look into the migration code and it isn't completely clear
to me what makes the migration code decide that sb buffer isn't movable? We
seem to be locking the buffers before moving the underlying page but we
don't do any reference or state checks on the buffers... That seems to be
assuming that noone looks at bh->b_data without holding buffer lock. That
is likely true for ordinary data but definitely not true for metadata
buffers (i.e., buffers for pages from block device mappings).


The sb buffer is not movable because it is not released.
sb_bread increase the reference counter of buffer-head so that
the page of the buffer-head cannot be movable.

sb_bread allocates page from movable area but it is not movable until the
reference counter of the buffer-head becomes zero.
There is no lock for the buffer but the reference counter acts like lock.

   OK, but why do you care about a single page (of at most handful if you
have more filesystems) which isn't movable? That shouldn't make a big
difference to compaction...


Even a single page can make CMA migration fail.




Actually it is strange that ext4 keeps buffer-head in superblock
structure until unmount (it can be long time) I thinks the buffer-head
should be released immediately like fat_fill_super() did.  I believe
there is a reason to keep buffer-head so that I suggest this patch.

   We don't copy some data from the superblock to other structure so from
time to time we need to look e.g. at feature bits within superblock buffer.
Historically we were updating numbers of free blocks and inodes in the
superblock with each allocation but we don't do that anymore because it
scales poorly. So there is no fundamental reason for keeping sb buffer
pinned anymore. Just someone would have to rewrite the code to copy some
pieces of data from the buffer to some other structure and use it there.


I hope so.



Honza




Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-30 Thread Jan Kara
On Thu 31-07-14 08:54:40, Gioh Kim wrote:
> On 2014-07-30 at 7:11 PM, Jan Kara wrote:
> > On Wed 30-07-14 16:44:24, Gioh Kim wrote:
> > > On 2014-07-22 at 6:38 PM, Jan Kara wrote:
> > > > On Tue 22-07-14 09:30:05, Peter Zijlstra wrote:
> > > > > On Tue, Jul 22, 2014 at 02:18:47PM +0900, Gioh Kim wrote:
> > > > > > Hello,
> > > > > >
> > > > > > This patch try to solve problem that a long-lasting page cache of
> > > > > > ext4 superblock disturbs page migration.
> > > > > >
> > > > > > I've been testing CMA feature on my ARM-based platform
> > > > > > and found some pages for page caches cannot be migrated.
> > > > > > Some of them are page caches of superblock of ext4 filesystem.
> > > > > >
> > > > > > Current ext4 reads superblock with sb_bread(). sb_bread() allocates page
> > > > > > from movable area. But the problem is that ext4 hold the page until
> > > > > > it is unmounted. If root filesystem is ext4 the page cannot be migrated
> > > > > > forever.
> > > > > >
> > > > > > I introduce a new API for allocating page from non-movable area.
> > > > > > It is useful for ext4 and others that want to hold page cache for a long
> > > > > > time.
> > > > >
> > > > > There's no word on why you can't teach ext4 to still migrate that page.
> > > > > For all I know it might be impossible, but at least mention why.
> > >
> > > I am very sorry for lacking of details.
> > >
> > > In ext4_fill_super() the buffer-head of superblock is stored in sbi->s_sbh.
> > > The page belongs to the buffer-head is allocated from movable area.
> > > To migrate the page the buffer-head should be released via brelse().
> > > But brelse() is not called until unmount.
> > Hum, I don't see where in the code do we check buffer_head use count. Can
> > you please point me? Thanks.
>
> Filesystem code does not check buffer_head use count.  sb_bread() returns
> the buffer_head that is included in bh_lru and has non-zero use count.
> You can see the bh_lru code in buffer.c: __find_get_block() and
> lookup_bh_lru().  bh_lru_install() inserts the buffer_head into the
> bh_lru.  It first calls get_bh() to increase the use count and insert
> bh into the lru array.
>
> The buffer_head use count is non-zero until brelse() is called.
  So I probably didn't phrase the question precisely enough. What I was
asking about is where exactly *migration* code checks buffer use count?
Because as I'm looking at buffer_migrate_page() we lock the buffers on a
migrated page but we don't look at buffer use counts... So it seems to me
that migration of a page with buffers should succeed even if buffer head
has an elevated use count. Now I think that it *should* check the buffer
use counts (it is dangerous to migrate buffers someone holds reference to)
but I just cannot find that place. Or does CMA use some other migration
function for buffer pages than buffer_migrate_page()?

Honza
-- 
Jan Kara j...@suse.cz
SUSE Labs, CR


Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-30 Thread Gioh Kim



On 2014-07-31 at 9:03 AM, Jan Kara wrote:

On Thu 31-07-14 08:54:40, Gioh Kim wrote:

On 2014-07-30 at 7:11 PM, Jan Kara wrote:

On Wed 30-07-14 16:44:24, Gioh Kim wrote:

On 2014-07-22 at 6:38 PM, Jan Kara wrote:

On Tue 22-07-14 09:30:05, Peter Zijlstra wrote:

On Tue, Jul 22, 2014 at 02:18:47PM +0900, Gioh Kim wrote:

Hello,

This patch try to solve problem that a long-lasting page cache of
ext4 superblock disturbs page migration.

I've been testing CMA feature on my ARM-based platform
and found some pages for page caches cannot be migrated.
Some of them are page caches of superblock of ext4 filesystem.

Current ext4 reads superblock with sb_bread(). sb_bread() allocates page

from movable area. But the problem is that ext4 hold the page until

it is unmounted. If root filesystem is ext4 the page cannot be migrated forever.

I introduce a new API for allocating page from non-movable area.
It is useful for ext4 and others that want to hold page cache for a long time.


There's no word on why you can't teach ext4 to still migrate that page.
For all I know it might be impossible, but at least mention why.


I am very sorry for lacking of details.

In ext4_fill_super() the buffer-head of superblock is stored in sbi->s_sbh.
The page belongs to the buffer-head is allocated from movable area.
To migrate the page the buffer-head should be released via brelse().
But brelse() is not called until unmount.

   Hum, I don't see where in the code do we check buffer_head use count. Can
you please point me? Thanks.


Filesystem code does not check buffer_head use count.  sb_bread() returns
the buffer_head that is included in bh_lru and has non-zero use count.
You can see the bh_lru code in buffer.c: __find_get_block() and
lookup_bh_lru().  bh_lru_install() inserts the buffer_head into the
bh_lru().  It first calls get_bh() to increase the use count and insert
bh into the lru array.

The buffer_head use count is non-zero until brelse() is called.

   So I probably didn't phrase the question precisely enough. What I was
asking about is where exactly *migration* code checks buffer use count?
Because as I'm looking at buffer_migrate_page() we lock the buffers on a
migrated page but we don't look at buffer use counts... So it seems to me
that migration of a page with buffers should succeed even if buffer head
has an elevated use count. Now I think that it *should* check the buffer
use counts (it is dangerous to migrate buffers someone holds reference to)
but I just cannot find that place. Or does CMA use some other migration
function for buffer pages than buffer_migrate_page()?


The CMA allocation function is cma_alloc().
The call flow is alloc_contig_range() -> __alloc_contig_migrate_range() ->
migrate_pages() -> unmap_and_move()
-> __unmap_and_move() -> try_to_free_buffers() -> drop_buffers() -> buffer_busy().

buffer_busy() checks b_count.
If the buffer is busy, the buffer-cache cannot be removed.
So the page that contains the buffer_head and the page that is referred to by the
buffer_head are not movable.

Is this what you need?
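
As a rough userspace model of the last two steps of that flow (an
assumption-laden sketch, not the kernel source): struct model_bh and the
model_* functions are hypothetical names, and the dirty/locked flags are
included on the assumption that the kernel check looks at buffer state as
well as b_count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct buffer_head; not the kernel definition. */
struct model_bh {
    int b_count;                  /* use count: get_bh() +1, brelse() -1 */
    bool dirty;
    bool locked;
    struct model_bh *b_this_page; /* circular list of buffers on one page */
};

/* Idea behind buffer_busy(): a held, dirty or locked buffer is busy. */
static bool model_buffer_busy(const struct model_bh *bh)
{
    return bh->b_count > 0 || bh->dirty || bh->locked;
}

/*
 * Idea behind drop_buffers(): walk every buffer attached to the page;
 * one busy buffer is enough to keep all of them attached, which is what
 * makes the page unmovable in this model.
 */
static bool model_can_drop_buffers(const struct model_bh *head)
{
    const struct model_bh *bh = head;

    do {
        if (model_buffer_busy(bh))
            return false;
        bh = bh->b_this_page;
    } while (bh != head);
    return true;
}
```

In this model a superblock buffer whose b_count stays at 1 from mount to
unmount makes model_can_drop_buffers() fail for its page the whole time.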



Honza




Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-26 Thread Theodore Ts'o
Gioh,

As follow up, if you want some further discussions about why these
patches should be accepted, it would be good to get some hard data
about why keeping the ext4 superblock pinned is causing such a
problem for page migration.  Can you give us more details about what
the impact is of not having these patches?  And how it compares to
other data structures which are currently allocated in the movable
area and tend to be pinned effectively indefinitely?

Thanks,

- Ted




Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-22 Thread Jan Kara
On Tue 22-07-14 09:30:05, Peter Zijlstra wrote:
> On Tue, Jul 22, 2014 at 02:18:47PM +0900, Gioh Kim wrote:
> > Hello,
> > 
> > This patch try to solve problem that a long-lasting page cache of
> > ext4 superblock disturbs page migration.
> > 
> > I've been testing CMA feature on my ARM-based platform
> > and found some pages for page caches cannot be migrated.
> > Some of them are page caches of superblock of ext4 filesystem.
> > 
> > Current ext4 reads superblock with sb_bread(). sb_bread() allocates page
> > from movable area. But the problem is that ext4 hold the page until
> > it is unmounted. If root filesystem is ext4 the page cannot be migrated 
> > forever.
> > 
> > I introduce a new API for allocating page from non-movable area.
> > It is useful for ext4 and others that want to hold page cache for a long 
> > time.
> 
> There's no word on why you can't teach ext4 to still migrate that page.
> For all I know it might be impossible, but at least mention why.
  It doesn't seem to be worth the effort to make that page movable to me
(it's reasonably doable since superblock buffer isn't accessed in *that*
many places but single movable page doesn't seem like a good tradeoff for
the complexity).

But this made me look into the migration code and it isn't completely clear
to me what makes the migration code decide that sb buffer isn't movable? We
seem to be locking the buffers before moving the underlying page but we
don't do any reference or state checks on the buffers... That seems to be
assuming that no one looks at bh->b_data without holding buffer lock. That
is likely true for ordinary data but definitely not true for metadata
buffers (i.e., buffers for pages from block device mappings).

Added linux-mm to CC to enlighten me a bit ;)

Honza
-- 
Jan Kara 
SUSE Labs, CR


Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-22 Thread Theodore Ts'o
On Tue, Jul 22, 2014 at 09:30:05AM +0200, Peter Zijlstra wrote:
> > I introduce a new API for allocating page from non-movable area.
> > It is useful for ext4 and others that want to hold page cache for a long 
> > time.
> 
> There's no word on why you can't teach ext4 to still migrate that page.
> For all I know it might be impossible, but at least mention why.

In theory we might be able to do it, but it's only a single 4k page,
and we'd have to add RCU locking all over the place in order to be
able to switch out the superblock structure, since we reference it all
over the place inside fs/ext4.  The question I'd ask is is it worth
it.

I suspect the bigger deal is that there are all sorts of inodes and
dentries which are effectively pinned and thus impossible to migrate.
This probably locks down many more pages (by a factor of at least 10 or
20), and I'd think that's something you would be much more interested
in fixing.

   - Ted


Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-22 Thread Peter Zijlstra
On Tue, Jul 22, 2014 at 02:18:47PM +0900, Gioh Kim wrote:
> Hello,
> 
> This patch try to solve problem that a long-lasting page cache of
> ext4 superblock disturbs page migration.
> 
> I've been testing CMA feature on my ARM-based platform
> and found some pages for page caches cannot be migrated.
> Some of them are page caches of superblock of ext4 filesystem.
> 
> Current ext4 reads superblock with sb_bread(). sb_bread() allocates page
> from movable area. But the problem is that ext4 hold the page until
> it is unmounted. If root filesystem is ext4 the page cannot be migrated 
> forever.
> 
> I introduce a new API for allocating page from non-movable area.
> It is useful for ext4 and others that want to hold page cache for a long time.

There's no word on why you can't teach ext4 to still migrate that page.
For all I know it might be impossible, but at least mention why.






Re: [PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-22 Thread Jan Kara
On Tue 22-07-14 09:30:05, Peter Zijlstra wrote:
 On Tue, Jul 22, 2014 at 02:18:47PM +0900, Gioh Kim wrote:
  Hello,
  
  This patch try to solve problem that a long-lasting page cache of
  ext4 superblock disturbs page migration.
  
  I've been testing CMA feature on my ARM-based platform
  and found some pages for page caches cannot be migrated.
  Some of them are page caches of superblock of ext4 filesystem.
  
  Current ext4 reads superblock with sb_bread(). sb_bread() allocates page
  from movable area. But the problem is that ext4 hold the page until
  it is unmounted. If root filesystem is ext4 the page cannot be migrated 
  forever.
  
  I introduce a new API for allocating page from non-movable area.
  It is useful for ext4 and others that want to hold page cache for a long 
  time.
 
 There's no word on why you can't teach ext4 to still migrate that page.
 For all I know it might be impossible, but at least mention why.
  It doesn't seem to be worth the effort to make that page movable to me
(it's reasonably doable since superblock buffer isn't accessed in *that*
many places but single movable page doesn't seem like a good tradeoff for
the complexity).

But this made me look into the migration code, and it isn't completely
clear to me what makes the migration code decide that the sb buffer isn't
movable. We seem to be locking the buffers before moving the underlying
page, but we don't do any reference or state checks on the buffers... That
seems to assume that no one looks at bh->b_data without holding the buffer
lock. That is likely true for ordinary data but definitely not true for
metadata buffers (i.e., buffers for pages from block device mappings).
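
To make the concern concrete, here is a tiny userspace toy model (not
kernel code; every toy_* name, the struct layout and the "checked"
variant are made up for illustration). It assumes a holder that cached the
data pointer without taking the buffer lock, and shows that a move which
only locks the buffer leaves that holder with a stale pointer:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_BLOCK_SIZE 16

/* Toy stand-in for a buffer head; the fields are assumptions for this sketch. */
struct toy_bh {
    bool locked;    /* buffer lock */
    int refcount;   /* holders that may still dereference data */
    char *data;     /* plays the role of bh->b_data */
};

/* Lock, copy, repoint, unlock. No reference or state check is made. */
bool toy_migrate_unchecked(struct toy_bh *bh, char *new_page)
{
    assert(bh != NULL && new_page != NULL);
    if (bh->locked)
        return false;           /* somebody else holds the lock */
    bh->locked = true;
    memcpy(new_page, bh->data, TOY_BLOCK_SIZE);
    bh->data = new_page;        /* old pointers are now stale */
    bh->locked = false;
    return true;
}

/* Hypothetical stricter variant: refuse to move a referenced buffer. */
bool toy_migrate_checked(struct toy_bh *bh, char *new_page)
{
    assert(bh != NULL && new_page != NULL);
    if (bh->refcount > 0)
        return false;           /* a holder may read data without the lock */
    return toy_migrate_unchecked(bh, new_page);
}
```

With refcount set to 1 and a pointer cached from data beforehand, the
checked variant refuses the move, while the unchecked one succeeds and
the cached pointer no longer equals data. Whether the real migration
path behaves like either variant is exactly the open question here.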

Added linux-mm to CC to enlighten me a bit ;)

Honza
-- 
Jan Kara j...@suse.cz
SUSE Labs, CR


[PATCH 0/2] new API to allocate buffer-cache for superblock in non-movable area

2014-07-21 Thread Gioh Kim
Hello,

This patchset tries to solve the problem that a long-lived page cache
page holding the ext4 superblock prevents page migration.

I've been testing the CMA feature on my ARM-based platform
and found that some page cache pages cannot be migrated.
Some of them are the page cache pages of the ext4 superblock.

Currently ext4 reads the superblock with sb_bread(), which allocates the
page from the movable area. The problem is that ext4 holds the page until
the filesystem is unmounted. If the root filesystem is ext4, the page can
never be migrated.

I introduce a new API for allocating a page from the non-movable area.
It is useful for ext4 and others that want to hold a page cache page for
a long time.

I have 2 patches:

1. Patch 1/2: introduce a new API that creates page cache from the non-movable area
2. Patch 2/2: have ext4 use the new API to read the superblock
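
As a rough illustration of the idea only (a userspace toy model, not the
code in the patches; every toy_* name and the flag value are made up for
this sketch), the difference between the two read paths comes down to
whether the allocation asks for movable memory:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit; it is NOT the kernel's __GFP_MOVABLE value. */
#define TOY_GFP_MOVABLE 0x1u

enum toy_area { TOY_AREA_UNMOVABLE, TOY_AREA_MOVABLE };

struct toy_page {
    enum toy_area area; /* where the allocator placed the page */
    int pins;           /* long-lived holders, e.g. a mounted filesystem */
};

/* Place the page according to the movable bit in the request. */
struct toy_page toy_alloc_page(unsigned int gfp)
{
    struct toy_page page = { TOY_AREA_UNMOVABLE, 0 };

    if (gfp & TOY_GFP_MOVABLE)
        page.area = TOY_AREA_MOVABLE;
    return page;
}

/* Models the current behaviour: superblock page lands in the movable area. */
struct toy_page toy_read_sb_movable(void)
{
    struct toy_page page = toy_alloc_page(TOY_GFP_MOVABLE);

    page.pins = 1;      /* held until unmount */
    return page;
}

/* Models the proposed behaviour: same read, but from the non-movable area. */
struct toy_page toy_read_sb_unmovable(void)
{
    struct toy_page page = toy_alloc_page(0);

    page.pins = 1;      /* still held until unmount */
    return page;
}

/* A pinned page only gets in the way if it sits in the movable area. */
bool toy_blocks_migration(const struct toy_page *page)
{
    assert(page != NULL);
    return page->area == TOY_AREA_MOVABLE && page->pins > 0;
}
```

In this model toy_blocks_migration() is true for the page returned by
toy_read_sb_movable() and false for the one returned by
toy_read_sb_unmovable(): the superblock stays pinned either way, it just
no longer sits where migration needs to move pages out.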

This patchset is based on linux-next-20140717.

Thanks a lot.

Gioh Kim (2):
 fs/buffer.c: allocate buffer cache from non-movable area
 ext4: allocate buffer-cache for superblock in non-movable area

 fs/buffer.c                 |   39 ---
 fs/ext4/super.c             |    6 +++---
 include/linux/buffer_head.h |    8 
 3 files changed, 43 insertions(+), 10 deletions(-)


