RE: [PATCH 02/15] mm: page_alloc: update migrate type of pages on pcp when isolating

2012-02-03 Thread Marek Szyprowski
Hello,

On Thursday, February 02, 2012 8:53 PM Michał Nazarewicz wrote:

  On Tue, Jan 31, 2012 at 05:23:59PM +0100, Marek Szyprowski wrote:
  Pages, which have incorrect migrate type on free finally
  causes pageblock migration type change from MIGRATE_CMA to MIGRATE_MOVABLE.
 
 On Thu, 02 Feb 2012 13:47:29 +0100, Mel Gorman m...@csn.ul.ie wrote:
  I'm not quite seeing this. In free_hot_cold_page(), the pageblock
  type is checked so the page private should be set to MIGRATE_CMA or
  MIGRATE_ISOLATE for the CMA area. It's not clear how this can change a
  pageblock to MIGRATE_MOVABLE in error.
 
 Here's what I think may happen:
 
 When drain_all_pages() is called, __free_one_page() is called for each page on
 pcp list with migrate type deducted from page_private() which is MIGRATE_CMA.
 This result in the page being put on MIGRATE_CMA freelist even though its
 pageblock's migrate type is MIGRATE_ISOLATE.
 
 When allocation happens and pcp list is empty, rmqueue_bulk() will get 
 executed
 with migratetype argument set to MIGRATE_MOVABLE.  It calls __rmqueue() to 
 grab
 some pages and because the page described above is on MIGRATE_CMA freelist it
 may be returned back to rmqueue_bulk().
 
 But, pageblock's migrate type is not MIGRATE_CMA but MIGRATE_ISOLATE, so the
 following code:
 
 #ifdef CONFIG_CMA
   if (is_pageblock_cma(page))
   set_page_private(page, MIGRATE_CMA);
   else
 #endif
   set_page_private(page, migratetype);
 
 will set it's private to MIGRATE_MOVABLE and in the end the page lands back
 on MIGRATE_MOVABLE pcp list but this time with page_private == MIGRATE_MOVABLE
 and not MIGRATE_CMA.
 
 One more drain_all_pages() (which may happen since alloc_contig_range() calls
 set_migratetype_isolate() for each block) and next __rmqueue_fallback() may
 convert the whole pageblock to MIGRATE_MOVABLE.
 
 I know, this sounds crazy and improbable, but I couldn't find an easier path
 to destruction.  As you pointed, once the page is allocated, 
 free_hot_cold_page()
 will do the right thing by reading pageblock's migrate type.
 
 Marek is currently experimenting with various patches including the following
 change:
 
 #ifdef CONFIG_CMA
  int mt = get_pageblock_migratetype(page);
  if (is_migrate_cma(mt) || mt == MIGRATE_ISOLATE)
  set_page_private(page, mt);
  else
 #endif
  set_page_private(page, migratetype);
 
 As a matter of fact, if __rmqueue() was changed to return migrate type of the
 freelist it took page from, we could avoid this get_pageblock_migratetype() 
 all
 together.  For now, however, I'd rather not go that way just yet -- I'll be 
 happy
 to dig into it once CMA gets merged.

After this and some other changes I'm unable to reproduce that issue. I did a 
whole
night tests and it still works fine, so it looks that it has been finally 
solved.
I will post v20 patchset soon :)

Best regards
-- 
Marek Szyprowski
Samsung Poland RD Center


--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 02/15] mm: page_alloc: update migrate type of pages on pcp when isolating

2012-02-03 Thread Mel Gorman
On Thu, Feb 02, 2012 at 08:53:25PM +0100, Michal Nazarewicz wrote:
 On Tue, Jan 31, 2012 at 05:23:59PM +0100, Marek Szyprowski wrote:
 Pages, which have incorrect migrate type on free finally
 causes pageblock migration type change from MIGRATE_CMA to MIGRATE_MOVABLE.
 
 On Thu, 02 Feb 2012 13:47:29 +0100, Mel Gorman m...@csn.ul.ie wrote:
 I'm not quite seeing this. In free_hot_cold_page(), the pageblock
 type is checked so the page private should be set to MIGRATE_CMA or
 MIGRATE_ISOLATE for the CMA area. It's not clear how this can change a
 pageblock to MIGRATE_MOVABLE in error.
 
 Here's what I think may happen:
 
 When drain_all_pages() is called, __free_one_page() is called for each page on
 pcp list with migrate type deducted from page_private() which is MIGRATE_CMA.
 This result in the page being put on MIGRATE_CMA freelist even though its
 pageblock's migrate type is MIGRATE_ISOLATE.
 

Ok, although it will only be allocated for MIGRATE_CMA-compatible
requests so it is not a disaster.

 When allocation happens and pcp list is empty, rmqueue_bulk() will get 
 executed
 with migratetype argument set to MIGRATE_MOVABLE.  It calls __rmqueue() to 
 grab
 some pages and because the page described above is on MIGRATE_CMA freelist it
 may be returned back to rmqueue_bulk().
 

This will allocate the page from a pageblock we are trying to isolate
pages from, but only for a movable page that can still be migrated. It
does mean that CMA is doing more work than it should of course and
the problem also impacts memory hot-remove. It's worse for memory
hot-remove because potentially an UNMOVABLE page was allocated from
a MIGRATE_ISOLATE pageblock.

 But, pageblock's migrate type is not MIGRATE_CMA but MIGRATE_ISOLATE, so the
 following code:
 
 #ifdef CONFIG_CMA
   if (is_pageblock_cma(page))
   set_page_private(page, MIGRATE_CMA);
   else
 #endif
   set_page_private(page, migratetype);
 
 will set it's private to MIGRATE_MOVABLE and in the end the page lands back
 on MIGRATE_MOVABLE pcp list but this time with page_private == MIGRATE_MOVABLE
 and not MIGRATE_CMA.
 
 One more drain_all_pages() (which may happen since alloc_contig_range() calls
 set_migratetype_isolate() for each block) and next __rmqueue_fallback() may
 convert the whole pageblock to MIGRATE_MOVABLE.
 
 I know, this sounds crazy and improbable, but I couldn't find an easier path
 to destruction.  As you pointed, once the page is allocated, 
 free_hot_cold_page()
 will do the right thing by reading pageblock's migrate type.
 

Ok, it's crazy but the problem is there.

 Marek is currently experimenting with various patches including the following
 change:
 
 #ifdef CONFIG_CMA
 int mt = get_pageblock_migratetype(page);
 if (is_migrate_cma(mt) || mt == MIGRATE_ISOLATE)
 set_page_private(page, mt);
 else
 #endif
 set_page_private(page, migratetype);
 
 As a matter of fact, if __rmqueue() was changed to return migrate type of the
 freelist it took page from, we could avoid this get_pageblock_migratetype() 
 all
 together.  For now, however, I'd rather not go that way just yet -- I'll be 
 happy
 to dig into it once CMA gets merged.
 

Ok, thanks for persisting with this.

-- 
Mel Gorman
SUSE Labs
--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 02/15] mm: page_alloc: update migrate type of pages on pcp when isolating

2012-02-02 Thread Mel Gorman
On Tue, Jan 31, 2012 at 05:23:59PM +0100, Marek Szyprowski wrote:
   +   page = pfn_to_page(pfn);
   +   if (PageBuddy(page)) {
   +   pfn += 1  page_order(page);
   +   } else if (page_count(page) == 0) {
   +   set_page_private(page, MIGRATE_ISOLATE);
   +   ++pfn;
   
   This is dangerous for two reasons. If the page_count is 0, it could
   be because the page is in the process of being freed and is not
   necessarily on the per-cpu lists yet and you cannot be sure if the
   contents of page-private are important. Second, there is nothing to
   prevent another CPU allocating this page from its per-cpu list while
   the private field is getting updated from here which might lead to
   some interesting races.
   
   I recognise that what you are trying to do is respond to Gilad's
   request that you really check if an IPI here is necessary. I think what
   you need to do is check if a page with a count of 0 is encountered
   and if it is, then a draining of the per-cpu lists is necessary. To
   address Gilad's concerns, be sure to only this this once per attempt at
   CMA rather than for every page encountered with a count of 0 to avoid a
   storm of IPIs.
  
   It's actually more then that.
  
   This is the same issue that I first fixed with a change to 
   free_pcppages_bulk()
   function[1].  At the time of positing, you said you'd like me to try and 
   find
   a different solution which would not involve paying the price of calling
   get_pageblock_migratetype().  Later I also realised that this solution is
   not enough.
  
   [1] http://article.gmane.org/gmane.linux.kernel.mm/70314
  
  
  Yes. I had forgotten the history but looking at that patch again,
  I would reach the conclusion that this was adding a new call to
  get_pageblock_migratetype() in the bulk free path. That would affect
  everybody whether they were using CMA or not.
 
 This will be a bit ugly, but we can also use that code and compile it 
 conditionally
 when CMA has been enabled.

That would also be very unfortunate because it means enabling CMA incurs
a performance cost to everyone whether they use CMA or not. For ARM,
this may not be a problem but it would be for other arches if they
wanted to use CMA or if it ever became part of a distro contig.

 Pages, which have incorrect migrate type on free finally
 causes pageblock migration type change from MIGRATE_CMA to MIGRATE_MOVABLE.

I'm not quite seeing this. In free_hot_cold_page(), the pageblock
type is checked so the page private should be set to MIGRATE_CMA or
MIGRATE_ISOLATE for the CMA area. It's not clear how this can change a
pageblock to MIGRATE_MOVABLE in error. If it turns out that you
absolutely have to call get_pageblock_migratetype() from
free_pcppages_bulk() and my alternative suggestion did not work out then
document all these issues in a comment when putting the call under
CONFIG_CMA so that it is not forgotten.

-- 
Mel Gorman
SUSE Labs
--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 02/15] mm: page_alloc: update migrate type of pages on pcp when isolating

2012-02-02 Thread Michal Nazarewicz

On Tue, Jan 31, 2012 at 05:23:59PM +0100, Marek Szyprowski wrote:

Pages, which have incorrect migrate type on free finally
causes pageblock migration type change from MIGRATE_CMA to MIGRATE_MOVABLE.


On Thu, 02 Feb 2012 13:47:29 +0100, Mel Gorman m...@csn.ul.ie wrote:

I'm not quite seeing this. In free_hot_cold_page(), the pageblock
type is checked so the page private should be set to MIGRATE_CMA or
MIGRATE_ISOLATE for the CMA area. It's not clear how this can change a
pageblock to MIGRATE_MOVABLE in error.


Here's what I think may happen:

When drain_all_pages() is called, __free_one_page() is called for each page on
pcp list with migrate type deducted from page_private() which is MIGRATE_CMA.
This result in the page being put on MIGRATE_CMA freelist even though its
pageblock's migrate type is MIGRATE_ISOLATE.

When allocation happens and pcp list is empty, rmqueue_bulk() will get executed
with migratetype argument set to MIGRATE_MOVABLE.  It calls __rmqueue() to grab
some pages and because the page described above is on MIGRATE_CMA freelist it
may be returned back to rmqueue_bulk().

But, pageblock's migrate type is not MIGRATE_CMA but MIGRATE_ISOLATE, so the
following code:

#ifdef CONFIG_CMA
if (is_pageblock_cma(page))
set_page_private(page, MIGRATE_CMA);
else
#endif
set_page_private(page, migratetype);

will set it's private to MIGRATE_MOVABLE and in the end the page lands back
on MIGRATE_MOVABLE pcp list but this time with page_private == MIGRATE_MOVABLE
and not MIGRATE_CMA.

One more drain_all_pages() (which may happen since alloc_contig_range() calls
set_migratetype_isolate() for each block) and next __rmqueue_fallback() may
convert the whole pageblock to MIGRATE_MOVABLE.

I know, this sounds crazy and improbable, but I couldn't find an easier path
to destruction.  As you pointed, once the page is allocated, 
free_hot_cold_page()
will do the right thing by reading pageblock's migrate type.

Marek is currently experimenting with various patches including the following
change:

#ifdef CONFIG_CMA
int mt = get_pageblock_migratetype(page);
if (is_migrate_cma(mt) || mt == MIGRATE_ISOLATE)
set_page_private(page, mt);
else
#endif
set_page_private(page, migratetype);

As a matter of fact, if __rmqueue() was changed to return migrate type of the
freelist it took page from, we could avoid this get_pageblock_migratetype() all
together.  For now, however, I'd rather not go that way just yet -- I'll be 
happy
to dig into it once CMA gets merged.

--
Best regards, _ _
.o. | Liege of Serenely Enlightened Majesty of  o' \,=./ `o
..o | Computer Science,  Michał “mina86” Nazarewicz(o o)
ooo +email/xmpp: m...@google.com--ooO--(_)--Ooo--
--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 02/15] mm: page_alloc: update migrate type of pages on pcp when isolating

2012-01-31 Thread Marek Szyprowski
Hello,

On Monday, January 30, 2012 5:15 PM Mel Gorman wrote:

 On Mon, Jan 30, 2012 at 04:41:22PM +0100, Michal Nazarewicz wrote:
  On Mon, 30 Jan 2012 12:15:22 +0100, Mel Gorman m...@csn.ul.ie wrote:

(snipped)

  + page = pfn_to_page(pfn);
  + if (PageBuddy(page)) {
  + pfn += 1  page_order(page);
  + } else if (page_count(page) == 0) {
  + set_page_private(page, MIGRATE_ISOLATE);
  + ++pfn;
  
  This is dangerous for two reasons. If the page_count is 0, it could
  be because the page is in the process of being freed and is not
  necessarily on the per-cpu lists yet and you cannot be sure if the
  contents of page-private are important. Second, there is nothing to
  prevent another CPU allocating this page from its per-cpu list while
  the private field is getting updated from here which might lead to
  some interesting races.
  
  I recognise that what you are trying to do is respond to Gilad's
  request that you really check if an IPI here is necessary. I think what
  you need to do is check if a page with a count of 0 is encountered
  and if it is, then a draining of the per-cpu lists is necessary. To
  address Gilad's concerns, be sure to only this this once per attempt at
  CMA rather than for every page encountered with a count of 0 to avoid a
  storm of IPIs.
 
  It's actually more then that.
 
  This is the same issue that I first fixed with a change to 
  free_pcppages_bulk()
  function[1].  At the time of positing, you said you'd like me to try and 
  find
  a different solution which would not involve paying the price of calling
  get_pageblock_migratetype().  Later I also realised that this solution is
  not enough.
 
  [1] http://article.gmane.org/gmane.linux.kernel.mm/70314
 
 
 Yes. I had forgotten the history but looking at that patch again,
 I would reach the conclusion that this was adding a new call to
 get_pageblock_migratetype() in the bulk free path. That would affect
 everybody whether they were using CMA or not.

This will be a bit ugly, but we can also use that code and compile it 
conditionally
when CMA has been enabled. Pages, which have incorrect migrate type on free 
finally
causes pageblock migration type change from MIGRATE_CMA to MIGRATE_MOVABLE. 
This is
not a problem for non-CMA case where only pageblocks with MIGRATE_MOVABLE 
migration
type are being isolated.

Best regards
-- 
Marek Szyprowski
Samsung Poland RD Center



--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 02/15] mm: page_alloc: update migrate type of pages on pcp when isolating

2012-01-30 Thread Mel Gorman
On Thu, Jan 26, 2012 at 10:00:44AM +0100, Marek Szyprowski wrote:
 From: Michal Nazarewicz min...@mina86.com
 
 This commit changes set_migratetype_isolate() so that it updates
 migrate type of pages on pcp list which is saved in their
 page_private.
 
 Signed-off-by: Michal Nazarewicz min...@mina86.com
 Signed-off-by: Marek Szyprowski m.szyprow...@samsung.com
 ---
  include/linux/page-isolation.h |6 ++
  mm/page_alloc.c|1 +
  mm/page_isolation.c|   24 
  3 files changed, 31 insertions(+), 0 deletions(-)
 
 diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
 index 051c1b1..8c02c2b 100644
 --- a/include/linux/page-isolation.h
 +++ b/include/linux/page-isolation.h
 @@ -27,6 +27,12 @@ extern int
  test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn);
  
  /*
 + * Check all pages in pageblock, find the ones on pcp list, and set
 + * their page_private to MIGRATE_ISOLATE.
 + */
 +extern void update_pcp_isolate_block(unsigned long pfn);
 +
 +/*
   * Internal funcs.Changes pageblock's migrate type.
   * Please use make_pagetype_isolated()/make_pagetype_movable().
   */
 diff --git a/mm/page_alloc.c b/mm/page_alloc.c
 index e1c5656..70709e7 100644
 --- a/mm/page_alloc.c
 +++ b/mm/page_alloc.c
 @@ -5465,6 +5465,7 @@ out:
   if (!ret) {
   set_pageblock_migratetype(page, MIGRATE_ISOLATE);
   move_freepages_block(zone, page, MIGRATE_ISOLATE);
 + update_pcp_isolate_block(pfn);
   }
  
   spin_unlock_irqrestore(zone-lock, flags);
 diff --git a/mm/page_isolation.c b/mm/page_isolation.c
 index 4ae42bb..9ea2f6e 100644
 --- a/mm/page_isolation.c
 +++ b/mm/page_isolation.c
 @@ -139,3 +139,27 @@ int test_pages_isolated(unsigned long start_pfn, 
 unsigned long end_pfn)
   spin_unlock_irqrestore(zone-lock, flags);
   return ret ? 0 : -EBUSY;
  }
 +
 +/* must hold zone-lock */
 +void update_pcp_isolate_block(unsigned long pfn)
 +{
 + unsigned long end_pfn = pfn + pageblock_nr_pages;
 + struct page *page;
 +
 + while (pfn  end_pfn) {
 + if (!pfn_valid_within(pfn)) {
 + ++pfn;
 + continue;
 + }
 +

There is a potential problem here that you need to be aware of.
set_pageblock_migratetype() is called from start_isolate_page_range().
I do not think there is a guarantee that pfn + pageblock_nr_pages is
not in a different block of MAX_ORDER_NR_PAGES. If that is right then
your options are to add a check like this;

if ((pfn  (MAX_ORDER_NR_PAGES - 1)) == 0  !pfn_valid(pfn))
break;

or else ensure that end_pfn is always MAX_ORDER_NR_PAGES aligned and in
the same block as pfn and relying on the caller to have called
pfn_valid.

 + page = pfn_to_page(pfn);
 + if (PageBuddy(page)) {
 + pfn += 1  page_order(page);
 + } else if (page_count(page) == 0) {
 + set_page_private(page, MIGRATE_ISOLATE);
 + ++pfn;

This is dangerous for two reasons. If the page_count is 0, it could
be because the page is in the process of being freed and is not
necessarily on the per-cpu lists yet and you cannot be sure if the
contents of page-private are important. Second, there is nothing to
prevent another CPU allocating this page from its per-cpu list while
the private field is getting updated from here which might lead to
some interesting races.

I recognise that what you are trying to do is respond to Gilad's
request that you really check if an IPI here is necessary. I think what
you need to do is check if a page with a count of 0 is encountered
and if it is, then a draining of the per-cpu lists is necessary. To
address Gilad's concerns, be sure to only this this once per attempt at
CMA rather than for every page encountered with a count of 0 to avoid a
storm of IPIs.

 + } else {
 + ++pfn;
 + }
 + }
 +}

-- 
Mel Gorman
SUSE Labs
--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 02/15] mm: page_alloc: update migrate type of pages on pcp when isolating

2012-01-30 Thread Michal Nazarewicz

On Mon, 30 Jan 2012 12:15:22 +0100, Mel Gorman m...@csn.ul.ie wrote:


On Thu, Jan 26, 2012 at 10:00:44AM +0100, Marek Szyprowski wrote:

From: Michal Nazarewicz min...@mina86.com
@@ -139,3 +139,27 @@ int test_pages_isolated(unsigned long start_pfn, unsigned 
long end_pfn)
spin_unlock_irqrestore(zone-lock, flags);
return ret ? 0 : -EBUSY;
 }
+
+/* must hold zone-lock */
+void update_pcp_isolate_block(unsigned long pfn)
+{
+   unsigned long end_pfn = pfn + pageblock_nr_pages;
+   struct page *page;
+
+   while (pfn  end_pfn) {
+   if (!pfn_valid_within(pfn)) {
+   ++pfn;
+   continue;
+   }
+


On Mon, 30 Jan 2012 12:15:22 +0100, Mel Gorman m...@csn.ul.ie wrote:

There is a potential problem here that you need to be aware of.
set_pageblock_migratetype() is called from start_isolate_page_range().
I do not think there is a guarantee that pfn + pageblock_nr_pages is
not in a different block of MAX_ORDER_NR_PAGES. If that is right then
your options are to add a check like this;

if ((pfn  (MAX_ORDER_NR_PAGES - 1)) == 0  !pfn_valid(pfn))
break;

or else ensure that end_pfn is always MAX_ORDER_NR_PAGES aligned and in
the same block as pfn and relying on the caller to have called
pfn_valid.


pfn = round_down(pfn, pageblock_nr_pages);
end_pfn = pfn + pageblock_nr_pages;

should do the trick as well, right?  move_freepages_block() seem to be
doing the same thing.


+   page = pfn_to_page(pfn);
+   if (PageBuddy(page)) {
+   pfn += 1  page_order(page);
+   } else if (page_count(page) == 0) {
+   set_page_private(page, MIGRATE_ISOLATE);
+   ++pfn;


This is dangerous for two reasons. If the page_count is 0, it could
be because the page is in the process of being freed and is not
necessarily on the per-cpu lists yet and you cannot be sure if the
contents of page-private are important. Second, there is nothing to
prevent another CPU allocating this page from its per-cpu list while
the private field is getting updated from here which might lead to
some interesting races.

I recognise that what you are trying to do is respond to Gilad's
request that you really check if an IPI here is necessary. I think what
you need to do is check if a page with a count of 0 is encountered
and if it is, then a draining of the per-cpu lists is necessary. To
address Gilad's concerns, be sure to only this this once per attempt at
CMA rather than for every page encountered with a count of 0 to avoid a
storm of IPIs.


It's actually more then that.

This is the same issue that I first fixed with a change to free_pcppages_bulk()
function[1].  At the time of positing, you said you'd like me to try and find
a different solution which would not involve paying the price of calling
get_pageblock_migratetype().  Later I also realised that this solution is
not enough.

[1] http://article.gmane.org/gmane.linux.kernel.mm/70314

My next attempt was to run drain PCP list while holding zone-lock[2], but that
quickly proven to be broken approach when Marek started testing it on an SMP
system.

[2] http://article.gmane.org/gmane.linux.kernel.mm/72016

This patch is yet another attempt of solving this old issue.  Even though it has
a potential race condition we came to conclusion that the actual chances of
causing any problems are slim.  Various stress tests did not, in fact, show
the race to be an issue.

The problem is that if a page is on a PCP list, and it's underlaying pageblocks'
migrate type is changed to MIGRATE_ISOLATE, the page (i) will still remain on 
PCP
list and thus someone can allocate it, and (ii) when removed from PCP list, the
page will be put on freelist of migrate type it had prior to change.

(i) is actually not such a big issue since the next thing that happens after
isolation is migration so all the pages will get freed.  (ii) is actual problem
and if [1] is not an acceptable solution I really don't have a good fix for 
that.

One things that comes to mind is calling drain_all_pages() prior to acquiring
zone-lock in set_migratetype_isolate().  This is however prone to races since
after the drain and before the zone-lock is acquired, pages might get moved
back to PCP list.

Draining PCP list after acquiring zone-lock is not possible because
smp_call_function_many() cannot be called with interrupts disabled, and changing
spin_lock_irqsave() to spin_lock() followed by local_irq_save() causes a dead
lock (that's what [2] attempted to do).

Any suggestions are welcome!


+   } else {
+   ++pfn;
+   }
+   }
+}


--
Best regards, _ _
.o. | Liege of Serenely Enlightened Majesty of  o' \,=./ `o
..o | Computer Science,  Michał “mina86” Nazarewicz(o o)
ooo +email/xmpp: m...@google.com--ooO--(_)--Ooo--
--
To unsubscribe 

Re: [PATCH 02/15] mm: page_alloc: update migrate type of pages on pcp when isolating

2012-01-30 Thread Mel Gorman
On Mon, Jan 30, 2012 at 04:41:22PM +0100, Michal Nazarewicz wrote:
 On Mon, 30 Jan 2012 12:15:22 +0100, Mel Gorman m...@csn.ul.ie wrote:
 
 On Thu, Jan 26, 2012 at 10:00:44AM +0100, Marek Szyprowski wrote:
 From: Michal Nazarewicz min...@mina86.com
 @@ -139,3 +139,27 @@ int test_pages_isolated(unsigned long start_pfn, 
 unsigned long end_pfn)
 spin_unlock_irqrestore(zone-lock, flags);
 return ret ? 0 : -EBUSY;
  }
 +
 +/* must hold zone-lock */
 +void update_pcp_isolate_block(unsigned long pfn)
 +{
 +   unsigned long end_pfn = pfn + pageblock_nr_pages;
 +   struct page *page;
 +
 +   while (pfn  end_pfn) {
 +   if (!pfn_valid_within(pfn)) {
 +   ++pfn;
 +   continue;
 +   }
 +
 
 On Mon, 30 Jan 2012 12:15:22 +0100, Mel Gorman m...@csn.ul.ie wrote:
 There is a potential problem here that you need to be aware of.
 set_pageblock_migratetype() is called from start_isolate_page_range().
 I do not think there is a guarantee that pfn + pageblock_nr_pages is
 not in a different block of MAX_ORDER_NR_PAGES. If that is right then
 your options are to add a check like this;
 
 if ((pfn  (MAX_ORDER_NR_PAGES - 1)) == 0  !pfn_valid(pfn))
  break;
 
 or else ensure that end_pfn is always MAX_ORDER_NR_PAGES aligned and in
 the same block as pfn and relying on the caller to have called
 pfn_valid.
 
   pfn = round_down(pfn, pageblock_nr_pages);
   end_pfn = pfn + pageblock_nr_pages;
 
 should do the trick as well, right?  move_freepages_block() seem to be
 doing the same thing.
 

That would also do it the trick.

 +   page = pfn_to_page(pfn);
 +   if (PageBuddy(page)) {
 +   pfn += 1  page_order(page);
 +   } else if (page_count(page) == 0) {
 +   set_page_private(page, MIGRATE_ISOLATE);
 +   ++pfn;
 
 This is dangerous for two reasons. If the page_count is 0, it could
 be because the page is in the process of being freed and is not
 necessarily on the per-cpu lists yet and you cannot be sure if the
 contents of page-private are important. Second, there is nothing to
 prevent another CPU allocating this page from its per-cpu list while
 the private field is getting updated from here which might lead to
 some interesting races.
 
 I recognise that what you are trying to do is respond to Gilad's
 request that you really check if an IPI here is necessary. I think what
 you need to do is check if a page with a count of 0 is encountered
 and if it is, then a draining of the per-cpu lists is necessary. To
 address Gilad's concerns, be sure to only this this once per attempt at
 CMA rather than for every page encountered with a count of 0 to avoid a
 storm of IPIs.
 
 It's actually more then that.
 
 This is the same issue that I first fixed with a change to 
 free_pcppages_bulk()
 function[1].  At the time of positing, you said you'd like me to try and find
 a different solution which would not involve paying the price of calling
 get_pageblock_migratetype().  Later I also realised that this solution is
 not enough.
 
 [1] http://article.gmane.org/gmane.linux.kernel.mm/70314
 

Yes. I had forgotten the history but looking at that patch again,
I would reach the conclusion that this was adding a new call to
get_pageblock_migratetype() in the bulk free path. That would affect
everybody whether they were using CMA or not.

 My next attempt was to run drain PCP list while holding zone-lock[2], but 
 that
 quickly proven to be broken approach when Marek started testing it on an SMP
 system.
 
 [2] http://article.gmane.org/gmane.linux.kernel.mm/72016
 
 This patch is yet another attempt of solving this old issue.  Even though it 
 has
 a potential race condition we came to conclusion that the actual chances of
 causing any problems are slim.  Various stress tests did not, in fact, show
 the race to be an issue.
 

It is a really small race. To cause a problem CPU 1 must find a page
with count 0, CPU 2 must then allocate the page and set page-private
before CPU 1 overwrites that value but it's there.

 The problem is that if a page is on a PCP list, and it's underlaying 
 pageblocks'
 migrate type is changed to MIGRATE_ISOLATE, the page (i) will still remain on 
 PCP
 list and thus someone can allocate it, and (ii) when removed from PCP list, 
 the
 page will be put on freelist of migrate type it had prior to change.
 
 (i) is actually not such a big issue since the next thing that happens after
 isolation is migration so all the pages will get freed.  (ii) is actual 
 problem
 and if [1] is not an acceptable solution I really don't have a good fix for 
 that.
 
 One things that comes to mind is calling drain_all_pages() prior to acquiring
 zone-lock in set_migratetype_isolate().  This is however prone to races since
 after the drain and before the zone-lock is acquired, pages might get moved
 back to PCP list.
 
 Draining PCP list after acquiring zone-lock is not possible because
 

[PATCH 02/15] mm: page_alloc: update migrate type of pages on pcp when isolating

2012-01-26 Thread Marek Szyprowski
From: Michal Nazarewicz min...@mina86.com

This commit changes set_migratetype_isolate() so that it updates
migrate type of pages on pcp list which is saved in their
page_private.

Signed-off-by: Michal Nazarewicz min...@mina86.com
Signed-off-by: Marek Szyprowski m.szyprow...@samsung.com
---
 include/linux/page-isolation.h |6 ++
 mm/page_alloc.c|1 +
 mm/page_isolation.c|   24 
 3 files changed, 31 insertions(+), 0 deletions(-)

diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 051c1b1..8c02c2b 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -27,6 +27,12 @@ extern int
 test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn);
 
 /*
+ * Check all pages in pageblock, find the ones on pcp list, and set
+ * their page_private to MIGRATE_ISOLATE.
+ */
+extern void update_pcp_isolate_block(unsigned long pfn);
+
+/*
  * Internal funcs.Changes pageblock's migrate type.
  * Please use make_pagetype_isolated()/make_pagetype_movable().
  */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e1c5656..70709e7 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5465,6 +5465,7 @@ out:
if (!ret) {
set_pageblock_migratetype(page, MIGRATE_ISOLATE);
move_freepages_block(zone, page, MIGRATE_ISOLATE);
+   update_pcp_isolate_block(pfn);
}
 
spin_unlock_irqrestore(zone-lock, flags);
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 4ae42bb..9ea2f6e 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -139,3 +139,27 @@ int test_pages_isolated(unsigned long start_pfn, unsigned 
long end_pfn)
spin_unlock_irqrestore(zone-lock, flags);
return ret ? 0 : -EBUSY;
 }
+
+/* must hold zone-lock */
+void update_pcp_isolate_block(unsigned long pfn)
+{
+   unsigned long end_pfn = pfn + pageblock_nr_pages;
+   struct page *page;
+
+   while (pfn  end_pfn) {
+   if (!pfn_valid_within(pfn)) {
+   ++pfn;
+   continue;
+   }
+
+   page = pfn_to_page(pfn);
+   if (PageBuddy(page)) {
+   pfn += 1  page_order(page);
+   } else if (page_count(page) == 0) {
+   set_page_private(page, MIGRATE_ISOLATE);
+   ++pfn;
+   } else {
+   ++pfn;
+   }
+   }
+}
-- 
1.7.1.569.g6f426

--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html