Re: [PATCH 02/15] mm: page_alloc: update migrate type of pages on pcp when isolating

2012-02-03 Thread Mel Gorman
On Thu, Feb 02, 2012 at 08:53:25PM +0100, Michal Nazarewicz wrote:
> >On Tue, Jan 31, 2012 at 05:23:59PM +0100, Marek Szyprowski wrote:
> >>Pages, which have incorrect migrate type on free finally
> >>causes pageblock migration type change from MIGRATE_CMA to MIGRATE_MOVABLE.
> 
> On Thu, 02 Feb 2012 13:47:29 +0100, Mel Gorman  wrote:
> >I'm not quite seeing this. In free_hot_cold_page(), the pageblock
> >type is checked so the page private should be set to MIGRATE_CMA or
> >MIGRATE_ISOLATE for the CMA area. It's not clear how this can change a
> >pageblock to MIGRATE_MOVABLE in error.
> 
> Here's what I think may happen:
> 
> When drain_all_pages() is called, __free_one_page() is called for each page on
> pcp list with migrate type deducted from page_private() which is MIGRATE_CMA.
> This result in the page being put on MIGRATE_CMA freelist even though its
> pageblock's migrate type is MIGRATE_ISOLATE.
> 

Ok, although it will only be allocated for MIGRATE_CMA-compatible
requests so it is not a disaster.

> When allocation happens and pcp list is empty, rmqueue_bulk() will get 
> executed
> with migratetype argument set to MIGRATE_MOVABLE.  It calls __rmqueue() to 
> grab
> some pages and because the page described above is on MIGRATE_CMA freelist it
> may be returned back to rmqueue_bulk().
> 

This will allocate the page from a pageblock we are trying to isolate
pages from, but only for a movable page that can still be migrated. It
does mean that CMA is doing more work than it should of course and
the problem also impacts memory hot-remove. It's worse for memory
hot-remove because potentially an UNMOVABLE page was allocated from
a MIGRATE_ISOLATE pageblock.

> But, pageblock's migrate type is not MIGRATE_CMA but MIGRATE_ISOLATE, so the
> following code:
> 
> #ifdef CONFIG_CMA
>   if (is_pageblock_cma(page))
>   set_page_private(page, MIGRATE_CMA);
>   else
> #endif
>   set_page_private(page, migratetype);
> 
> will set it's private to MIGRATE_MOVABLE and in the end the page lands back
> on MIGRATE_MOVABLE pcp list but this time with page_private == MIGRATE_MOVABLE
> and not MIGRATE_CMA.
> 
> One more drain_all_pages() (which may happen since alloc_contig_range() calls
> set_migratetype_isolate() for each block) and next __rmqueue_fallback() may
> convert the whole pageblock to MIGRATE_MOVABLE.
> 
> I know, this sounds crazy and improbable, but I couldn't find an easier path
> to destruction.  As you pointed, once the page is allocated, 
> free_hot_cold_page()
> will do the right thing by reading pageblock's migrate type.
> 

Ok, it's crazy but the problem is there.

> Marek is currently experimenting with various patches including the following
> change:
> 
> #ifdef CONFIG_CMA
> int mt = get_pageblock_migratetype(page);
> if (is_migrate_cma(mt) || mt == MIGRATE_ISOLATE)
> set_page_private(page, mt);
> else
> #endif
> set_page_private(page, migratetype);
> 
> As a matter of fact, if __rmqueue() was changed to return migrate type of the
> freelist it took page from, we could avoid this get_pageblock_migratetype() 
> all
> together.  For now, however, I'd rather not go that way just yet -- I'll be 
> happy
> to dig into it once CMA gets merged.
> 

Ok, thanks for persisting with this.

-- 
Mel Gorman
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 02/15] mm: page_alloc: update migrate type of pages on pcp when isolating

2012-02-03 Thread Marek Szyprowski
Hello,

On Thursday, February 02, 2012 8:53 PM Michał Nazarewicz wrote:

> > On Tue, Jan 31, 2012 at 05:23:59PM +0100, Marek Szyprowski wrote:
> >> Pages, which have incorrect migrate type on free finally
> >> causes pageblock migration type change from MIGRATE_CMA to MIGRATE_MOVABLE.
> 
> On Thu, 02 Feb 2012 13:47:29 +0100, Mel Gorman  wrote:
> > I'm not quite seeing this. In free_hot_cold_page(), the pageblock
> > type is checked so the page private should be set to MIGRATE_CMA or
> > MIGRATE_ISOLATE for the CMA area. It's not clear how this can change a
> > pageblock to MIGRATE_MOVABLE in error.
> 
> Here's what I think may happen:
> 
> When drain_all_pages() is called, __free_one_page() is called for each page on
> pcp list with migrate type deducted from page_private() which is MIGRATE_CMA.
> This result in the page being put on MIGRATE_CMA freelist even though its
> pageblock's migrate type is MIGRATE_ISOLATE.
> 
> When allocation happens and pcp list is empty, rmqueue_bulk() will get 
> executed
> with migratetype argument set to MIGRATE_MOVABLE.  It calls __rmqueue() to 
> grab
> some pages and because the page described above is on MIGRATE_CMA freelist it
> may be returned back to rmqueue_bulk().
> 
> But, pageblock's migrate type is not MIGRATE_CMA but MIGRATE_ISOLATE, so the
> following code:
> 
> #ifdef CONFIG_CMA
>   if (is_pageblock_cma(page))
>   set_page_private(page, MIGRATE_CMA);
>   else
> #endif
>   set_page_private(page, migratetype);
> 
> will set it's private to MIGRATE_MOVABLE and in the end the page lands back
> on MIGRATE_MOVABLE pcp list but this time with page_private == MIGRATE_MOVABLE
> and not MIGRATE_CMA.
> 
> One more drain_all_pages() (which may happen since alloc_contig_range() calls
> set_migratetype_isolate() for each block) and next __rmqueue_fallback() may
> convert the whole pageblock to MIGRATE_MOVABLE.
> 
> I know, this sounds crazy and improbable, but I couldn't find an easier path
> to destruction.  As you pointed, once the page is allocated, 
> free_hot_cold_page()
> will do the right thing by reading pageblock's migrate type.
> 
> Marek is currently experimenting with various patches including the following
> change:
> 
> #ifdef CONFIG_CMA
>  int mt = get_pageblock_migratetype(page);
>  if (is_migrate_cma(mt) || mt == MIGRATE_ISOLATE)
>  set_page_private(page, mt);
>  else
> #endif
>  set_page_private(page, migratetype);
> 
> As a matter of fact, if __rmqueue() was changed to return migrate type of the
> freelist it took page from, we could avoid this get_pageblock_migratetype() 
> all
> together.  For now, however, I'd rather not go that way just yet -- I'll be 
> happy
> to dig into it once CMA gets merged.

After this and some other changes I'm unable to reproduce that issue. I did a 
whole
night tests and it still works fine, so it looks that it has been finally 
solved.
I will post v20 patchset soon :)

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center


--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 02/15] mm: page_alloc: update migrate type of pages on pcp when isolating

2012-02-02 Thread Michal Nazarewicz

On Tue, Jan 31, 2012 at 05:23:59PM +0100, Marek Szyprowski wrote:

Pages, which have incorrect migrate type on free finally
causes pageblock migration type change from MIGRATE_CMA to MIGRATE_MOVABLE.


On Thu, 02 Feb 2012 13:47:29 +0100, Mel Gorman  wrote:

I'm not quite seeing this. In free_hot_cold_page(), the pageblock
type is checked so the page private should be set to MIGRATE_CMA or
MIGRATE_ISOLATE for the CMA area. It's not clear how this can change a
pageblock to MIGRATE_MOVABLE in error.


Here's what I think may happen:

When drain_all_pages() is called, __free_one_page() is called for each page on
pcp list with migrate type deducted from page_private() which is MIGRATE_CMA.
This result in the page being put on MIGRATE_CMA freelist even though its
pageblock's migrate type is MIGRATE_ISOLATE.

When allocation happens and pcp list is empty, rmqueue_bulk() will get executed
with migratetype argument set to MIGRATE_MOVABLE.  It calls __rmqueue() to grab
some pages and because the page described above is on MIGRATE_CMA freelist it
may be returned back to rmqueue_bulk().

But, pageblock's migrate type is not MIGRATE_CMA but MIGRATE_ISOLATE, so the
following code:

#ifdef CONFIG_CMA
if (is_pageblock_cma(page))
set_page_private(page, MIGRATE_CMA);
else
#endif
set_page_private(page, migratetype);

will set it's private to MIGRATE_MOVABLE and in the end the page lands back
on MIGRATE_MOVABLE pcp list but this time with page_private == MIGRATE_MOVABLE
and not MIGRATE_CMA.

One more drain_all_pages() (which may happen since alloc_contig_range() calls
set_migratetype_isolate() for each block) and next __rmqueue_fallback() may
convert the whole pageblock to MIGRATE_MOVABLE.

I know, this sounds crazy and improbable, but I couldn't find an easier path
to destruction.  As you pointed, once the page is allocated, 
free_hot_cold_page()
will do the right thing by reading pageblock's migrate type.

Marek is currently experimenting with various patches including the following
change:

#ifdef CONFIG_CMA
int mt = get_pageblock_migratetype(page);
if (is_migrate_cma(mt) || mt == MIGRATE_ISOLATE)
set_page_private(page, mt);
else
#endif
set_page_private(page, migratetype);

As a matter of fact, if __rmqueue() was changed to return migrate type of the
freelist it took page from, we could avoid this get_pageblock_migratetype() all
together.  For now, however, I'd rather not go that way just yet -- I'll be 
happy
to dig into it once CMA gets merged.

--
Best regards, _ _
.o. | Liege of Serenely Enlightened Majesty of  o' \,=./ `o
..o | Computer Science,  Michał “mina86” Nazarewicz(o o)
ooo +--ooO--(_)--Ooo--
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 02/15] mm: page_alloc: update migrate type of pages on pcp when isolating

2012-02-02 Thread Mel Gorman
On Tue, Jan 31, 2012 at 05:23:59PM +0100, Marek Szyprowski wrote:
> > > >>+   page = pfn_to_page(pfn);
> > > >>+   if (PageBuddy(page)) {
> > > >>+   pfn += 1 << page_order(page);
> > > >>+   } else if (page_count(page) == 0) {
> > > >>+   set_page_private(page, MIGRATE_ISOLATE);
> > > >>+   ++pfn;
> > > >
> > > >This is dangerous for two reasons. If the page_count is 0, it could
> > > >be because the page is in the process of being freed and is not
> > > >necessarily on the per-cpu lists yet and you cannot be sure if the
> > > >contents of page->private are important. Second, there is nothing to
> > > >prevent another CPU allocating this page from its per-cpu list while
> > > >the private field is getting updated from here which might lead to
> > > >some interesting races.
> > > >
> > > >I recognise that what you are trying to do is respond to Gilad's
> > > >request that you really check if an IPI here is necessary. I think what
> > > >you need to do is check if a page with a count of 0 is encountered
> > > >and if it is, then a draining of the per-cpu lists is necessary. To
> > > >address Gilad's concerns, be sure to only this this once per attempt at
> > > >CMA rather than for every page encountered with a count of 0 to avoid a
> > > >storm of IPIs.
> > >
> > > It's actually more then that.
> > >
> > > This is the same issue that I first fixed with a change to 
> > > free_pcppages_bulk()
> > > function[1].  At the time of positing, you said you'd like me to try and 
> > > find
> > > a different solution which would not involve paying the price of calling
> > > get_pageblock_migratetype().  Later I also realised that this solution is
> > > not enough.
> > >
> > > [1] http://article.gmane.org/gmane.linux.kernel.mm/70314
> > >
> > 
> > Yes. I had forgotten the history but looking at that patch again,
> > I would reach the conclusion that this was adding a new call to
> > get_pageblock_migratetype() in the bulk free path. That would affect
> > everybody whether they were using CMA or not.
> 
> This will be a bit ugly, but we can also use that code and compile it 
> conditionally
> when CMA has been enabled.

That would also be very unfortunate because it means enabling CMA incurs
a performance cost to everyone whether they use CMA or not. For ARM,
this may not be a problem but it would be for other arches if they
wanted to use CMA or if it ever became part of a distro contig.

> Pages, which have incorrect migrate type on free finally
> causes pageblock migration type change from MIGRATE_CMA to MIGRATE_MOVABLE.

I'm not quite seeing this. In free_hot_cold_page(), the pageblock
type is checked so the page private should be set to MIGRATE_CMA or
MIGRATE_ISOLATE for the CMA area. It's not clear how this can change a
pageblock to MIGRATE_MOVABLE in error. If it turns out that you
absolutely have to call get_pageblock_migratetype() from
free_pcppages_bulk() and my alternative suggestion did not work out then
document all these issues in a comment when putting the call under
CONFIG_CMA so that it is not forgotten.

-- 
Mel Gorman
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 02/15] mm: page_alloc: update migrate type of pages on pcp when isolating

2012-01-31 Thread Marek Szyprowski
Hello,

On Monday, January 30, 2012 5:15 PM Mel Gorman wrote:

> On Mon, Jan 30, 2012 at 04:41:22PM +0100, Michal Nazarewicz wrote:
> > On Mon, 30 Jan 2012 12:15:22 +0100, Mel Gorman  wrote:

(snipped)

> > >>+ page = pfn_to_page(pfn);
> > >>+ if (PageBuddy(page)) {
> > >>+ pfn += 1 << page_order(page);
> > >>+ } else if (page_count(page) == 0) {
> > >>+ set_page_private(page, MIGRATE_ISOLATE);
> > >>+ ++pfn;
> > >
> > >This is dangerous for two reasons. If the page_count is 0, it could
> > >be because the page is in the process of being freed and is not
> > >necessarily on the per-cpu lists yet and you cannot be sure if the
> > >contents of page->private are important. Second, there is nothing to
> > >prevent another CPU allocating this page from its per-cpu list while
> > >the private field is getting updated from here which might lead to
> > >some interesting races.
> > >
> > >I recognise that what you are trying to do is respond to Gilad's
> > >request that you really check if an IPI here is necessary. I think what
> > >you need to do is check if a page with a count of 0 is encountered
> > >and if it is, then a draining of the per-cpu lists is necessary. To
> > >address Gilad's concerns, be sure to only this this once per attempt at
> > >CMA rather than for every page encountered with a count of 0 to avoid a
> > >storm of IPIs.
> >
> > It's actually more then that.
> >
> > This is the same issue that I first fixed with a change to 
> > free_pcppages_bulk()
> > function[1].  At the time of positing, you said you'd like me to try and 
> > find
> > a different solution which would not involve paying the price of calling
> > get_pageblock_migratetype().  Later I also realised that this solution is
> > not enough.
> >
> > [1] http://article.gmane.org/gmane.linux.kernel.mm/70314
> >
> 
> Yes. I had forgotten the history but looking at that patch again,
> I would reach the conclusion that this was adding a new call to
> get_pageblock_migratetype() in the bulk free path. That would affect
> everybody whether they were using CMA or not.

This will be a bit ugly, but we can also use that code and compile it 
conditionally
when CMA has been enabled. Pages, which have incorrect migrate type on free 
finally
causes pageblock migration type change from MIGRATE_CMA to MIGRATE_MOVABLE. 
This is
not a problem for non-CMA case where only pageblocks with MIGRATE_MOVABLE 
migration
type are being isolated.

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center



--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 02/15] mm: page_alloc: update migrate type of pages on pcp when isolating

2012-01-30 Thread Mel Gorman
On Mon, Jan 30, 2012 at 04:41:22PM +0100, Michal Nazarewicz wrote:
> On Mon, 30 Jan 2012 12:15:22 +0100, Mel Gorman  wrote:
> 
> >On Thu, Jan 26, 2012 at 10:00:44AM +0100, Marek Szyprowski wrote:
> >>From: Michal Nazarewicz 
> >>@@ -139,3 +139,27 @@ int test_pages_isolated(unsigned long start_pfn, 
> >>unsigned long end_pfn)
> >>spin_unlock_irqrestore(&zone->lock, flags);
> >>return ret ? 0 : -EBUSY;
> >> }
> >>+
> >>+/* must hold zone->lock */
> >>+void update_pcp_isolate_block(unsigned long pfn)
> >>+{
> >>+   unsigned long end_pfn = pfn + pageblock_nr_pages;
> >>+   struct page *page;
> >>+
> >>+   while (pfn < end_pfn) {
> >>+   if (!pfn_valid_within(pfn)) {
> >>+   ++pfn;
> >>+   continue;
> >>+   }
> >>+
> 
> On Mon, 30 Jan 2012 12:15:22 +0100, Mel Gorman  wrote:
> >There is a potential problem here that you need to be aware of.
> >set_pageblock_migratetype() is called from start_isolate_page_range().
> >I do not think there is a guarantee that pfn + pageblock_nr_pages is
> >not in a different block of MAX_ORDER_NR_PAGES. If that is right then
> >your options are to add a check like this;
> >
> >if ((pfn & (MAX_ORDER_NR_PAGES - 1)) == 0 && !pfn_valid(pfn))
> > break;
> >
> >or else ensure that end_pfn is always MAX_ORDER_NR_PAGES aligned and in
> >the same block as pfn and relying on the caller to have called
> >pfn_valid.
> 
>   pfn = round_down(pfn, pageblock_nr_pages);
>   end_pfn = pfn + pageblock_nr_pages;
> 
> should do the trick as well, right?  move_freepages_block() seem to be
> doing the same thing.
> 

That would also do it the trick.

> >>+   page = pfn_to_page(pfn);
> >>+   if (PageBuddy(page)) {
> >>+   pfn += 1 << page_order(page);
> >>+   } else if (page_count(page) == 0) {
> >>+   set_page_private(page, MIGRATE_ISOLATE);
> >>+   ++pfn;
> >
> >This is dangerous for two reasons. If the page_count is 0, it could
> >be because the page is in the process of being freed and is not
> >necessarily on the per-cpu lists yet and you cannot be sure if the
> >contents of page->private are important. Second, there is nothing to
> >prevent another CPU allocating this page from its per-cpu list while
> >the private field is getting updated from here which might lead to
> >some interesting races.
> >
> >I recognise that what you are trying to do is respond to Gilad's
> >request that you really check if an IPI here is necessary. I think what
> >you need to do is check if a page with a count of 0 is encountered
> >and if it is, then a draining of the per-cpu lists is necessary. To
> >address Gilad's concerns, be sure to only this this once per attempt at
> >CMA rather than for every page encountered with a count of 0 to avoid a
> >storm of IPIs.
> 
> It's actually more then that.
> 
> This is the same issue that I first fixed with a change to 
> free_pcppages_bulk()
> function[1].  At the time of positing, you said you'd like me to try and find
> a different solution which would not involve paying the price of calling
> get_pageblock_migratetype().  Later I also realised that this solution is
> not enough.
> 
> [1] http://article.gmane.org/gmane.linux.kernel.mm/70314
> 

Yes. I had forgotten the history but looking at that patch again,
I would reach the conclusion that this was adding a new call to
get_pageblock_migratetype() in the bulk free path. That would affect
everybody whether they were using CMA or not.

> My next attempt was to run drain PCP list while holding zone->lock[2], but 
> that
> quickly proven to be broken approach when Marek started testing it on an SMP
> system.
> 
> [2] http://article.gmane.org/gmane.linux.kernel.mm/72016
> 
> This patch is yet another attempt of solving this old issue.  Even though it 
> has
> a potential race condition we came to conclusion that the actual chances of
> causing any problems are slim.  Various stress tests did not, in fact, show
> the race to be an issue.
> 

It is a really small race. To cause a problem CPU 1 must find a page
with count 0, CPU 2 must then allocate the page and set page->private
before CPU 1 overwrites that value but it's there.

> The problem is that if a page is on a PCP list, and it's underlaying 
> pageblocks'
> migrate type is changed to MIGRATE_ISOLATE, the page (i) will still remain on 
> PCP
> list and thus someone can allocate it, and (ii) when removed from PCP list, 
> the
> page will be put on freelist of migrate type it had prior to change.
> 
> (i) is actually not such a big issue since the next thing that happens after
> isolation is migration so all the pages will get freed.  (ii) is actual 
> problem
> and if [1] is not an acceptable solution I really don't have a good fix for 
> that.
> 
> One things that comes to mind is calling drain_all_pages() prior to acquiring
> zone->lock in set_migratetype_isolate().  This is however prone to races since
> after the drain and before th

Re: [PATCH 02/15] mm: page_alloc: update migrate type of pages on pcp when isolating

2012-01-30 Thread Michal Nazarewicz

On Mon, 30 Jan 2012 12:15:22 +0100, Mel Gorman  wrote:


On Thu, Jan 26, 2012 at 10:00:44AM +0100, Marek Szyprowski wrote:

From: Michal Nazarewicz 
@@ -139,3 +139,27 @@ int test_pages_isolated(unsigned long start_pfn, unsigned 
long end_pfn)
spin_unlock_irqrestore(&zone->lock, flags);
return ret ? 0 : -EBUSY;
 }
+
+/* must hold zone->lock */
+void update_pcp_isolate_block(unsigned long pfn)
+{
+   unsigned long end_pfn = pfn + pageblock_nr_pages;
+   struct page *page;
+
+   while (pfn < end_pfn) {
+   if (!pfn_valid_within(pfn)) {
+   ++pfn;
+   continue;
+   }
+


On Mon, 30 Jan 2012 12:15:22 +0100, Mel Gorman  wrote:

There is a potential problem here that you need to be aware of.
set_pageblock_migratetype() is called from start_isolate_page_range().
I do not think there is a guarantee that pfn + pageblock_nr_pages is
not in a different block of MAX_ORDER_NR_PAGES. If that is right then
your options are to add a check like this;

if ((pfn & (MAX_ORDER_NR_PAGES - 1)) == 0 && !pfn_valid(pfn))
break;

or else ensure that end_pfn is always MAX_ORDER_NR_PAGES aligned and in
the same block as pfn and relying on the caller to have called
pfn_valid.


pfn = round_down(pfn, pageblock_nr_pages);
end_pfn = pfn + pageblock_nr_pages;

should do the trick as well, right?  move_freepages_block() seem to be
doing the same thing.


+   page = pfn_to_page(pfn);
+   if (PageBuddy(page)) {
+   pfn += 1 << page_order(page);
+   } else if (page_count(page) == 0) {
+   set_page_private(page, MIGRATE_ISOLATE);
+   ++pfn;


This is dangerous for two reasons. If the page_count is 0, it could
be because the page is in the process of being freed and is not
necessarily on the per-cpu lists yet and you cannot be sure if the
contents of page->private are important. Second, there is nothing to
prevent another CPU allocating this page from its per-cpu list while
the private field is getting updated from here which might lead to
some interesting races.

I recognise that what you are trying to do is respond to Gilad's
request that you really check if an IPI here is necessary. I think what
you need to do is check if a page with a count of 0 is encountered
and if it is, then a draining of the per-cpu lists is necessary. To
address Gilad's concerns, be sure to only this this once per attempt at
CMA rather than for every page encountered with a count of 0 to avoid a
storm of IPIs.


It's actually more then that.

This is the same issue that I first fixed with a change to free_pcppages_bulk()
function[1].  At the time of positing, you said you'd like me to try and find
a different solution which would not involve paying the price of calling
get_pageblock_migratetype().  Later I also realised that this solution is
not enough.

[1] http://article.gmane.org/gmane.linux.kernel.mm/70314

My next attempt was to run drain PCP list while holding zone->lock[2], but that
quickly proven to be broken approach when Marek started testing it on an SMP
system.

[2] http://article.gmane.org/gmane.linux.kernel.mm/72016

This patch is yet another attempt of solving this old issue.  Even though it has
a potential race condition we came to conclusion that the actual chances of
causing any problems are slim.  Various stress tests did not, in fact, show
the race to be an issue.

The problem is that if a page is on a PCP list, and it's underlaying pageblocks'
migrate type is changed to MIGRATE_ISOLATE, the page (i) will still remain on 
PCP
list and thus someone can allocate it, and (ii) when removed from PCP list, the
page will be put on freelist of migrate type it had prior to change.

(i) is actually not such a big issue since the next thing that happens after
isolation is migration so all the pages will get freed.  (ii) is actual problem
and if [1] is not an acceptable solution I really don't have a good fix for 
that.

One things that comes to mind is calling drain_all_pages() prior to acquiring
zone->lock in set_migratetype_isolate().  This is however prone to races since
after the drain and before the zone->lock is acquired, pages might get moved
back to PCP list.

Draining PCP list after acquiring zone->lock is not possible because
smp_call_function_many() cannot be called with interrupts disabled, and changing
spin_lock_irqsave() to spin_lock() followed by local_irq_save() causes a dead
lock (that's what [2] attempted to do).

Any suggestions are welcome!


+   } else {
+   ++pfn;
+   }
+   }
+}


--
Best regards, _ _
.o. | Liege of Serenely Enlightened Majesty of  o' \,=./ `o
..o | Computer Science,  Michał “mina86” Nazarewicz(o o)
ooo +--ooO--(_)--Ooo--
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
th

Re: [PATCH 02/15] mm: page_alloc: update migrate type of pages on pcp when isolating

2012-01-30 Thread Mel Gorman
On Thu, Jan 26, 2012 at 10:00:44AM +0100, Marek Szyprowski wrote:
> From: Michal Nazarewicz 
> 
> This commit changes set_migratetype_isolate() so that it updates
> migrate type of pages on pcp list which is saved in their
> page_private.
> 
> Signed-off-by: Michal Nazarewicz 
> Signed-off-by: Marek Szyprowski 
> ---
>  include/linux/page-isolation.h |6 ++
>  mm/page_alloc.c|1 +
>  mm/page_isolation.c|   24 
>  3 files changed, 31 insertions(+), 0 deletions(-)
> 
> diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
> index 051c1b1..8c02c2b 100644
> --- a/include/linux/page-isolation.h
> +++ b/include/linux/page-isolation.h
> @@ -27,6 +27,12 @@ extern int
>  test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn);
>  
>  /*
> + * Check all pages in pageblock, find the ones on pcp list, and set
> + * their page_private to MIGRATE_ISOLATE.
> + */
> +extern void update_pcp_isolate_block(unsigned long pfn);
> +
> +/*
>   * Internal funcs.Changes pageblock's migrate type.
>   * Please use make_pagetype_isolated()/make_pagetype_movable().
>   */
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index e1c5656..70709e7 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5465,6 +5465,7 @@ out:
>   if (!ret) {
>   set_pageblock_migratetype(page, MIGRATE_ISOLATE);
>   move_freepages_block(zone, page, MIGRATE_ISOLATE);
> + update_pcp_isolate_block(pfn);
>   }
>  
>   spin_unlock_irqrestore(&zone->lock, flags);
> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> index 4ae42bb..9ea2f6e 100644
> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -139,3 +139,27 @@ int test_pages_isolated(unsigned long start_pfn, 
> unsigned long end_pfn)
>   spin_unlock_irqrestore(&zone->lock, flags);
>   return ret ? 0 : -EBUSY;
>  }
> +
> +/* must hold zone->lock */
> +void update_pcp_isolate_block(unsigned long pfn)
> +{
> + unsigned long end_pfn = pfn + pageblock_nr_pages;
> + struct page *page;
> +
> + while (pfn < end_pfn) {
> + if (!pfn_valid_within(pfn)) {
> + ++pfn;
> + continue;
> + }
> +

There is a potential problem here that you need to be aware of.
set_pageblock_migratetype() is called from start_isolate_page_range().
I do not think there is a guarantee that pfn + pageblock_nr_pages is
not in a different block of MAX_ORDER_NR_PAGES. If that is right then
your options are to add a check like this;

if ((pfn & (MAX_ORDER_NR_PAGES - 1)) == 0 && !pfn_valid(pfn))
break;

or else ensure that end_pfn is always MAX_ORDER_NR_PAGES aligned and in
the same block as pfn and relying on the caller to have called
pfn_valid.

> + page = pfn_to_page(pfn);
> + if (PageBuddy(page)) {
> + pfn += 1 << page_order(page);
> + } else if (page_count(page) == 0) {
> + set_page_private(page, MIGRATE_ISOLATE);
> + ++pfn;

This is dangerous for two reasons. If the page_count is 0, it could
be because the page is in the process of being freed and is not
necessarily on the per-cpu lists yet and you cannot be sure if the
contents of page->private are important. Second, there is nothing to
prevent another CPU allocating this page from its per-cpu list while
the private field is getting updated from here which might lead to
some interesting races.

I recognise that what you are trying to do is respond to Gilad's
request that you really check if an IPI here is necessary. I think what
you need to do is check if a page with a count of 0 is encountered
and if it is, then a draining of the per-cpu lists is necessary. To
address Gilad's concerns, be sure to only this this once per attempt at
CMA rather than for every page encountered with a count of 0 to avoid a
storm of IPIs.

> + } else {
> + ++pfn;
> + }
> + }
> +}

-- 
Mel Gorman
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 02/15] mm: page_alloc: update migrate type of pages on pcp when isolating

2012-01-26 Thread Marek Szyprowski
From: Michal Nazarewicz 

This commit changes set_migratetype_isolate() so that it updates
migrate type of pages on pcp list which is saved in their
page_private.

Signed-off-by: Michal Nazarewicz 
Signed-off-by: Marek Szyprowski 
---
 include/linux/page-isolation.h |6 ++
 mm/page_alloc.c|1 +
 mm/page_isolation.c|   24 
 3 files changed, 31 insertions(+), 0 deletions(-)

diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 051c1b1..8c02c2b 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -27,6 +27,12 @@ extern int
 test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn);
 
 /*
+ * Check all pages in pageblock, find the ones on pcp list, and set
+ * their page_private to MIGRATE_ISOLATE.
+ */
+extern void update_pcp_isolate_block(unsigned long pfn);
+
+/*
  * Internal funcs.Changes pageblock's migrate type.
  * Please use make_pagetype_isolated()/make_pagetype_movable().
  */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e1c5656..70709e7 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5465,6 +5465,7 @@ out:
if (!ret) {
set_pageblock_migratetype(page, MIGRATE_ISOLATE);
move_freepages_block(zone, page, MIGRATE_ISOLATE);
+   update_pcp_isolate_block(pfn);
}
 
spin_unlock_irqrestore(&zone->lock, flags);
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 4ae42bb..9ea2f6e 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -139,3 +139,27 @@ int test_pages_isolated(unsigned long start_pfn, unsigned 
long end_pfn)
spin_unlock_irqrestore(&zone->lock, flags);
return ret ? 0 : -EBUSY;
 }
+
+/* must hold zone->lock */
+void update_pcp_isolate_block(unsigned long pfn)
+{
+   unsigned long end_pfn = pfn + pageblock_nr_pages;
+   struct page *page;
+
+   while (pfn < end_pfn) {
+   if (!pfn_valid_within(pfn)) {
+   ++pfn;
+   continue;
+   }
+
+   page = pfn_to_page(pfn);
+   if (PageBuddy(page)) {
+   pfn += 1 << page_order(page);
+   } else if (page_count(page) == 0) {
+   set_page_private(page, MIGRATE_ISOLATE);
+   ++pfn;
+   } else {
+   ++pfn;
+   }
+   }
+}
-- 
1.7.1.569.g6f426

--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html