Re: question on NUMA page migration

2012-10-21 Thread Ingo Molnar

* Rik van Riel  wrote:

> On 10/20/2012 10:39 PM, Ni zhan Chen wrote:
> >On 10/19/2012 11:53 PM, Rik van Riel wrote:
> >>Hi Andrea, Peter,
> >>
> >>I have a question on page refcounting in your NUMA
> >>page migration code.
> >>
> >>In Peter's case, I wonder why you introduce a new
> >>MIGRATE_FAULT migration mode. If the normal page
> >>migration / compaction logic can do without taking
> >>an extra reference count, why does your code need it?
> >
> >Hi Rik van Riel,
> >
> >This is which part of codes? Why I can't find MIGRATE_FAULT in latest
> >v3.7-rc2?
> 
> It is in tip.git in the numa/core branch.

The Git access URI is:

  git pull  git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git numa/core
  git clone git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git numa/core

Thanks,

Ingo

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: question on NUMA page migration

2012-10-21 Thread Ingo Molnar

* Rik van Riel  wrote:

> On 10/19/2012 09:23 PM, Ingo Molnar wrote:
> >
> >* Rik van Riel  wrote:
> >
> >>On 10/19/2012 01:53 PM, Peter Zijlstra wrote:
> >>>On Fri, 2012-10-19 at 13:13 -0400, Rik van Riel wrote:
> >>
> Another alternative might be to do the put_page inside
> do_prot_none_numa().  That would be analogous to do_wp_page
> disposing of the old page for the caller.
> >>>
> >>>It'd have to be inside migrate_misplaced_page(), can't do before
> >>>isolate_lru_page() or the page might disappear. Doing it after is
> >>>(obviously) too late.
> >>
> >>Keeping an extra refcount on the page might _still_
> >>result in it disappearing from the process by some
> >>other means, in-between you grabbing the refcount
> >>and invoking migration of the page.
> >>
> I am not real happy about NUMA migration introducing its own
> migration mode...
> >>>
> >>>You didn't seem to mind too much earlier, but I can remove it if you
> >>>want.
> >>
> >>Could have been reviewing fatigue :)
> >
> >:-)
> >
> >>And yes, it would have been nice to not have a special
> >>migration mode for sched/numa.
> >>
> >>Speaking of, when do you guys plan to submit a (cleaned up)
> >>version of the sched/numa patch series for review on lkml?
> >
> >Which commit(s) worry you specifically?
> 
> One of them would be the commit that introduces MIGRATE_FAULT.

MIGRATE_FAULT is still used.

> Adding it in one patch, and removing it into a next just makes
> things harder on the reviewers.

Yes.

> 03a040f6c17ab81659579ba6abe267c0562097e4

It's still used:

comet:~/tip> git grep MIGRATE_FAULT
include/linux/migrate_mode.h: * MIGRATE_FAULT called from the fault path to 
migrate-on-fault for mempolicy
include/linux/migrate_mode.h:   MIGRATE_FAULT,
mm/migrate.c:   if (mode != MIGRATE_ASYNC && mode != MIGRATE_FAULT) {
mm/migrate.c:   if (mode == MIGRATE_FAULT) {
mm/migrate.c:* MIGRATE_FAULT has an extra reference on the page and
mm/migrate.c:   if ((mode == MIGRATE_ASYNC || mode == MIGRATE_FAULT) && head &&
mm/migrate.c:   if (mode != MIGRATE_ASYNC && mode != MIGRATE_FAULT)
mm/migrate.c:   if (!force || mode == MIGRATE_ASYNC || mode == 
MIGRATE_FAULT)
mm/migrate.c:   ret = __unmap_and_move(page, newpage, 0, 0, MIGRATE_FAULT);

> If the changesets with NUMA syscalls are still in your tree's 
> history, they should not be submitted as part of the patch 
> series, either.

No, the syscalls have not been there for quite some time.

If you have trouble with any specific commit, please quote the 
specific SHA1 so that I can have a look and resolve any specific 
concerns.

Otherwise, lets continue with the integration?

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: question on NUMA page migration

2012-10-21 Thread Ingo Molnar

* Rik van Riel r...@redhat.com wrote:

 On 10/19/2012 09:23 PM, Ingo Molnar wrote:
 
 * Rik van Riel r...@redhat.com wrote:
 
 On 10/19/2012 01:53 PM, Peter Zijlstra wrote:
 On Fri, 2012-10-19 at 13:13 -0400, Rik van Riel wrote:
 
 Another alternative might be to do the put_page inside
 do_prot_none_numa().  That would be analogous to do_wp_page
 disposing of the old page for the caller.
 
 It'd have to be inside migrate_misplaced_page(), can't do before
 isolate_lru_page() or the page might disappear. Doing it after is
 (obviously) too late.
 
 Keeping an extra refcount on the page might _still_
 result in it disappearing from the process by some
 other means, in-between you grabbing the refcount
 and invoking migration of the page.
 
 I am not real happy about NUMA migration introducing its own
 migration mode...
 
 You didn't seem to mind too much earlier, but I can remove it if you
 want.
 
 Could have been reviewing fatigue :)
 
 :-)
 
 And yes, it would have been nice to not have a special
 migration mode for sched/numa.
 
 Speaking of, when do you guys plan to submit a (cleaned up)
 version of the sched/numa patch series for review on lkml?
 
 Which commit(s) worry you specifically?
 
 One of them would be the commit that introduces MIGRATE_FAULT.

MIGRATE_FAULT is still used.

 Adding it in one patch, and removing it into a next just makes
 things harder on the reviewers.

Yes.

 03a040f6c17ab81659579ba6abe267c0562097e4

It's still used:

comet:~/tip git grep MIGRATE_FAULT
include/linux/migrate_mode.h: * MIGRATE_FAULT called from the fault path to 
migrate-on-fault for mempolicy
include/linux/migrate_mode.h:   MIGRATE_FAULT,
mm/migrate.c:   if (mode != MIGRATE_ASYNC  mode != MIGRATE_FAULT) {
mm/migrate.c:   if (mode == MIGRATE_FAULT) {
mm/migrate.c:* MIGRATE_FAULT has an extra reference on the page and
mm/migrate.c:   if ((mode == MIGRATE_ASYNC || mode == MIGRATE_FAULT)  head 
mm/migrate.c:   if (mode != MIGRATE_ASYNC  mode != MIGRATE_FAULT)
mm/migrate.c:   if (!force || mode == MIGRATE_ASYNC || mode == 
MIGRATE_FAULT)
mm/migrate.c:   ret = __unmap_and_move(page, newpage, 0, 0, MIGRATE_FAULT);

 If the changesets with NUMA syscalls are still in your tree's 
 history, they should not be submitted as part of the patch 
 series, either.

No, the syscalls have not been there for quite some time.

If you have trouble with any specific commit, please quote the 
specific SHA1 so that I can have a look and resolve any specific 
concerns.

Otherwise, lets continue with the integration?

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: question on NUMA page migration

2012-10-21 Thread Ingo Molnar

* Rik van Riel r...@redhat.com wrote:

 On 10/20/2012 10:39 PM, Ni zhan Chen wrote:
 On 10/19/2012 11:53 PM, Rik van Riel wrote:
 Hi Andrea, Peter,
 
 I have a question on page refcounting in your NUMA
 page migration code.
 
 In Peter's case, I wonder why you introduce a new
 MIGRATE_FAULT migration mode. If the normal page
 migration / compaction logic can do without taking
 an extra reference count, why does your code need it?
 
 Hi Rik van Riel,
 
 This is which part of codes? Why I can't find MIGRATE_FAULT in latest
 v3.7-rc2?
 
 It is in tip.git in the numa/core branch.

The Git access URI is:

  git pull  git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git numa/core
  git clone git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git numa/core

Thanks,

Ingo

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: question on NUMA page migration

2012-10-20 Thread Rik van Riel

On 10/20/2012 10:39 PM, Ni zhan Chen wrote:

On 10/19/2012 11:53 PM, Rik van Riel wrote:

Hi Andrea, Peter,

I have a question on page refcounting in your NUMA
page migration code.

In Peter's case, I wonder why you introduce a new
MIGRATE_FAULT migration mode. If the normal page
migration / compaction logic can do without taking
an extra reference count, why does your code need it?


Hi Rik van Riel,

This is which part of codes? Why I can't find MIGRATE_FAULT in latest
v3.7-rc2?


It is in tip.git in the numa/core branch.


--
All rights reversed
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: question on NUMA page migration

2012-10-20 Thread Ni zhan Chen

On 10/19/2012 11:53 PM, Rik van Riel wrote:

Hi Andrea, Peter,

I have a question on page refcounting in your NUMA
page migration code.

In Peter's case, I wonder why you introduce a new
MIGRATE_FAULT migration mode. If the normal page
migration / compaction logic can do without taking
an extra reference count, why does your code need it?


Hi Rik van Riel,

This is which part of codes? Why I can't find MIGRATE_FAULT in latest 
v3.7-rc2?


Regards,
Chen



In Andrea's case, we have a comment suggesting an
extra refcount is needed, immediately followed by
a put_page:

/*
 * Pin the head subpage at least until the first
 * __isolate_lru_page succeeds (__isolate_lru_page pins it
 * again when it succeeds). If we unpin before
 * __isolate_lru_page successd, the page could be freed and
 * reallocated out from under us. Thus our previous checks on
 * the page, and the split_huge_page, would be worthless.
 *
 * We really only need to do this if "ret > 0" but it doesn't
 * hurt to do it unconditionally as nobody can reference
 * "page" anymore after this and so we can avoid an "if (ret >
 * 0)" branch here.
 */
put_page(page);

This also confuses me.

If we do not need the extra refcount (and I do not
understand why NUMA migrate-on-fault needs one more
refcount than normal page migration), we can get
rid of the MIGRATE_FAULT mode.

If we do need the extra refcount, why is normal
page migration safe? :)



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: question on NUMA page migration

2012-10-20 Thread Rik van Riel

On 10/19/2012 09:23 PM, Ingo Molnar wrote:


* Rik van Riel  wrote:


On 10/19/2012 01:53 PM, Peter Zijlstra wrote:

On Fri, 2012-10-19 at 13:13 -0400, Rik van Riel wrote:



Another alternative might be to do the put_page inside
do_prot_none_numa().  That would be analogous to do_wp_page
disposing of the old page for the caller.


It'd have to be inside migrate_misplaced_page(), can't do before
isolate_lru_page() or the page might disappear. Doing it after is
(obviously) too late.


Keeping an extra refcount on the page might _still_
result in it disappearing from the process by some
other means, in-between you grabbing the refcount
and invoking migration of the page.


I am not real happy about NUMA migration introducing its own
migration mode...


You didn't seem to mind too much earlier, but I can remove it if you
want.


Could have been reviewing fatigue :)


:-)


And yes, it would have been nice to not have a special
migration mode for sched/numa.

Speaking of, when do you guys plan to submit a (cleaned up)
version of the sched/numa patch series for review on lkml?


Which commit(s) worry you specifically?


One of them would be the commit that introduces MIGRATE_FAULT.
Adding it in one patch, and removing it into a next just makes
things harder on the reviewers.

03a040f6c17ab81659579ba6abe267c0562097e4


If the changesets with NUMA syscalls are still in your tree's
history, they should not be submitted as part of the patch
series, either.
--
All rights reversed
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: question on NUMA page migration

2012-10-20 Thread Rik van Riel

On 10/19/2012 09:23 PM, Ingo Molnar wrote:


* Rik van Riel r...@redhat.com wrote:


On 10/19/2012 01:53 PM, Peter Zijlstra wrote:

On Fri, 2012-10-19 at 13:13 -0400, Rik van Riel wrote:



Another alternative might be to do the put_page inside
do_prot_none_numa().  That would be analogous to do_wp_page
disposing of the old page for the caller.


It'd have to be inside migrate_misplaced_page(), can't do before
isolate_lru_page() or the page might disappear. Doing it after is
(obviously) too late.


Keeping an extra refcount on the page might _still_
result in it disappearing from the process by some
other means, in-between you grabbing the refcount
and invoking migration of the page.


I am not real happy about NUMA migration introducing its own
migration mode...


You didn't seem to mind too much earlier, but I can remove it if you
want.


Could have been reviewing fatigue :)


:-)


And yes, it would have been nice to not have a special
migration mode for sched/numa.

Speaking of, when do you guys plan to submit a (cleaned up)
version of the sched/numa patch series for review on lkml?


Which commit(s) worry you specifically?


One of them would be the commit that introduces MIGRATE_FAULT.
Adding it in one patch, and removing it into a next just makes
things harder on the reviewers.

03a040f6c17ab81659579ba6abe267c0562097e4


If the changesets with NUMA syscalls are still in your tree's
history, they should not be submitted as part of the patch
series, either.
--
All rights reversed
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: question on NUMA page migration

2012-10-20 Thread Ni zhan Chen

On 10/19/2012 11:53 PM, Rik van Riel wrote:

Hi Andrea, Peter,

I have a question on page refcounting in your NUMA
page migration code.

In Peter's case, I wonder why you introduce a new
MIGRATE_FAULT migration mode. If the normal page
migration / compaction logic can do without taking
an extra reference count, why does your code need it?


Hi Rik van Riel,

This is which part of codes? Why I can't find MIGRATE_FAULT in latest 
v3.7-rc2?


Regards,
Chen



In Andrea's case, we have a comment suggesting an
extra refcount is needed, immediately followed by
a put_page:

/*
 * Pin the head subpage at least until the first
 * __isolate_lru_page succeeds (__isolate_lru_page pins it
 * again when it succeeds). If we unpin before
 * __isolate_lru_page successd, the page could be freed and
 * reallocated out from under us. Thus our previous checks on
 * the page, and the split_huge_page, would be worthless.
 *
 * We really only need to do this if ret  0 but it doesn't
 * hurt to do it unconditionally as nobody can reference
 * page anymore after this and so we can avoid an if (ret 
 * 0) branch here.
 */
put_page(page);

This also confuses me.

If we do not need the extra refcount (and I do not
understand why NUMA migrate-on-fault needs one more
refcount than normal page migration), we can get
rid of the MIGRATE_FAULT mode.

If we do need the extra refcount, why is normal
page migration safe? :)



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: question on NUMA page migration

2012-10-20 Thread Rik van Riel

On 10/20/2012 10:39 PM, Ni zhan Chen wrote:

On 10/19/2012 11:53 PM, Rik van Riel wrote:

Hi Andrea, Peter,

I have a question on page refcounting in your NUMA
page migration code.

In Peter's case, I wonder why you introduce a new
MIGRATE_FAULT migration mode. If the normal page
migration / compaction logic can do without taking
an extra reference count, why does your code need it?


Hi Rik van Riel,

This is which part of codes? Why I can't find MIGRATE_FAULT in latest
v3.7-rc2?


It is in tip.git in the numa/core branch.


--
All rights reversed
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: question on NUMA page migration

2012-10-19 Thread Ingo Molnar

* Rik van Riel  wrote:

> On 10/19/2012 01:53 PM, Peter Zijlstra wrote:
> >On Fri, 2012-10-19 at 13:13 -0400, Rik van Riel wrote:
> 
> >>Another alternative might be to do the put_page inside
> >>do_prot_none_numa().  That would be analogous to do_wp_page
> >>disposing of the old page for the caller.
> >
> >It'd have to be inside migrate_misplaced_page(), can't do before
> >isolate_lru_page() or the page might disappear. Doing it after is
> >(obviously) too late.
> 
> Keeping an extra refcount on the page might _still_
> result in it disappearing from the process by some
> other means, in-between you grabbing the refcount
> and invoking migration of the page.
> 
> >>I am not real happy about NUMA migration introducing its own
> >>migration mode...
> >
> >You didn't seem to mind too much earlier, but I can remove it if you
> >want.
> 
> Could have been reviewing fatigue :)

:-)

> And yes, it would have been nice to not have a special
> migration mode for sched/numa.
> 
> Speaking of, when do you guys plan to submit a (cleaned up)
> version of the sched/numa patch series for review on lkml?

Which commit(s) worry you specifically?

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: question on NUMA page migration

2012-10-19 Thread Rik van Riel

On 10/19/2012 01:53 PM, Peter Zijlstra wrote:

On Fri, 2012-10-19 at 13:13 -0400, Rik van Riel wrote:



Another alternative might be to do the put_page inside
do_prot_none_numa().  That would be analogous to do_wp_page
disposing of the old page for the caller.


It'd have to be inside migrate_misplaced_page(), can't do before
isolate_lru_page() or the page might disappear. Doing it after is
(obviously) too late.


Keeping an extra refcount on the page might _still_
result in it disappearing from the process by some
other means, in-between you grabbing the refcount
and invoking migration of the page.


I am not real happy about NUMA migration introducing its own
migration mode...


You didn't seem to mind too much earlier, but I can remove it if you
want.


Could have been reviewing fatigue :)

And yes, it would have been nice to not have a special
migration mode for sched/numa.

Speaking of, when do you guys plan to submit a (cleaned up)
version of the sched/numa patch series for review on lkml?

--
All rights reversed
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: question on NUMA page migration

2012-10-19 Thread Peter Zijlstra
On Fri, 2012-10-19 at 13:13 -0400, Rik van Riel wrote:

> Would it make sense to have the normal page migration code always
> work with the extra refcount, so we do not have to introduce a new
> MIGRATE_FAULT migration mode?
> 
> On the other hand, compaction does not take the extra reference...

Right, it appears to not do this, it gets pages from the pfn and
zone->lock and the isolate_lru_page() call is the first reference.

> Another alternative might be to do the put_page inside
> do_prot_none_numa().  That would be analogous to do_wp_page
> disposing of the old page for the caller.

It'd have to be inside migrate_misplaced_page(), can't do before
isolate_lru_page() or the page might disappear. Doing it after is
(obviously) too late.

> I am not real happy about NUMA migration introducing its own
> migration mode...

You didn't seem to mind too much earlier, but I can remove it if you
want.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: question on NUMA page migration

2012-10-19 Thread Rik van Riel

On 10/19/2012 12:39 PM, Peter Zijlstra wrote:

On Fri, 2012-10-19 at 11:53 -0400, Rik van Riel wrote:


If we do need the extra refcount, why is normal
page migration safe? :)


Its mostly a matter of how convoluted you make the code, regular page
migration is about as bad as you can get

Normal does:

   follow_page(FOLL_GET) +1

   isolate_lru_page() +1

   put_page() -1

ending up with a page with a single reference (for anon, or one extra
each for the mapping and buffer).


Would it make sense to have the normal page migration code always
work with the extra refcount, so we do not have to introduce a new
MIGRATE_FAULT migration mode?

On the other hand, compaction does not take the extra reference...

Another alternative might be to do the put_page inside
do_prot_none_numa().  That would be analogous to do_wp_page
disposing of the old page for the caller.

I am not real happy about NUMA migration introducing its own
migration mode...

--
All rights reversed
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: question on NUMA page migration

2012-10-19 Thread Peter Zijlstra
On Fri, 2012-10-19 at 11:53 -0400, Rik van Riel wrote:
> 
> If we do need the extra refcount, why is normal
> page migration safe? :) 

Its mostly a matter of how convoluted you make the code, regular page
migration is about as bad as you can get

Normal does:

  follow_page(FOLL_GET) +1

  isolate_lru_page() +1

  put_page() -1

ending up with a page with a single reference (for anon, or one extra
each for the mapping and buffer).

And while I suppose I could do a put_page() in migrate_misplaced_page()
that makes the function frob the page-count depending on the return
value.

I always try and avoid conditional locks/refs, therefore the code ends
up doing:

  page = vm_normal_page()
  if (page) {
get_page()

migrate_misplaced_page()

put_page()
  }


where migrate_misplaced_page() does isolate_lru_page()/putback_lru_page,
and this leaves the page-count invariant.

We got a ref, therefore we must put a ref, is easier than we got a ref
and must put except when...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


question on NUMA page migration

2012-10-19 Thread Rik van Riel

Hi Andrea, Peter,

I have a question on page refcounting in your NUMA
page migration code.

In Peter's case, I wonder why you introduce a new
MIGRATE_FAULT migration mode. If the normal page
migration / compaction logic can do without taking
an extra reference count, why does your code need it?

In Andrea's case, we have a comment suggesting an
extra refcount is needed, immediately followed by
a put_page:

/*
 * Pin the head subpage at least until the first
 * __isolate_lru_page succeeds (__isolate_lru_page pins it
 * again when it succeeds). If we unpin before
 * __isolate_lru_page successd, the page could be freed and
 * reallocated out from under us. Thus our previous checks on
 * the page, and the split_huge_page, would be worthless.
 *
 * We really only need to do this if "ret > 0" but it doesn't
 * hurt to do it unconditionally as nobody can reference
 * "page" anymore after this and so we can avoid an "if (ret >
 * 0)" branch here.
 */
put_page(page);

This also confuses me.

If we do not need the extra refcount (and I do not
understand why NUMA migrate-on-fault needs one more
refcount than normal page migration), we can get
rid of the MIGRATE_FAULT mode.

If we do need the extra refcount, why is normal
page migration safe? :)

--
All rights reversed
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


question on NUMA page migration

2012-10-19 Thread Rik van Riel

Hi Andrea, Peter,

I have a question on page refcounting in your NUMA
page migration code.

In Peter's case, I wonder why you introduce a new
MIGRATE_FAULT migration mode. If the normal page
migration / compaction logic can do without taking
an extra reference count, why does your code need it?

In Andrea's case, we have a comment suggesting an
extra refcount is needed, immediately followed by
a put_page:

/*
 * Pin the head subpage at least until the first
 * __isolate_lru_page succeeds (__isolate_lru_page pins it
 * again when it succeeds). If we unpin before
 * __isolate_lru_page successd, the page could be freed and
 * reallocated out from under us. Thus our previous checks on
 * the page, and the split_huge_page, would be worthless.
 *
 * We really only need to do this if ret  0 but it doesn't
 * hurt to do it unconditionally as nobody can reference
 * page anymore after this and so we can avoid an if (ret 
 * 0) branch here.
 */
put_page(page);

This also confuses me.

If we do not need the extra refcount (and I do not
understand why NUMA migrate-on-fault needs one more
refcount than normal page migration), we can get
rid of the MIGRATE_FAULT mode.

If we do need the extra refcount, why is normal
page migration safe? :)

--
All rights reversed
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: question on NUMA page migration

2012-10-19 Thread Peter Zijlstra
On Fri, 2012-10-19 at 11:53 -0400, Rik van Riel wrote:
 
 If we do need the extra refcount, why is normal
 page migration safe? :) 

Its mostly a matter of how convoluted you make the code, regular page
migration is about as bad as you can get

Normal does:

  follow_page(FOLL_GET) +1

  isolate_lru_page() +1

  put_page() -1

ending up with a page with a single reference (for anon, or one extra
each for the mapping and buffer).

And while I suppose I could do a put_page() in migrate_misplaced_page()
that makes the function frob the page-count depending on the return
value.

I always try and avoid conditional locks/refs, therefore the code ends
up doing:

  page = vm_normal_page()
  if (page) {
get_page()

migrate_misplaced_page()

put_page()
  }


where migrate_misplaced_page() does isolate_lru_page()/putback_lru_page,
and this leaves the page-count invariant.

We got a ref, therefore we must put a ref, is easier than we got a ref
and must put except when...
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: question on NUMA page migration

2012-10-19 Thread Rik van Riel

On 10/19/2012 12:39 PM, Peter Zijlstra wrote:

On Fri, 2012-10-19 at 11:53 -0400, Rik van Riel wrote:


If we do need the extra refcount, why is normal
page migration safe? :)


Its mostly a matter of how convoluted you make the code, regular page
migration is about as bad as you can get

Normal does:

   follow_page(FOLL_GET) +1

   isolate_lru_page() +1

   put_page() -1

ending up with a page with a single reference (for anon, or one extra
each for the mapping and buffer).


Would it make sense to have the normal page migration code always
work with the extra refcount, so we do not have to introduce a new
MIGRATE_FAULT migration mode?

On the other hand, compaction does not take the extra reference...

Another alternative might be to do the put_page inside
do_prot_none_numa().  That would be analogous to do_wp_page
disposing of the old page for the caller.

I am not real happy about NUMA migration introducing its own
migration mode...

--
All rights reversed
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: question on NUMA page migration

2012-10-19 Thread Peter Zijlstra
On Fri, 2012-10-19 at 13:13 -0400, Rik van Riel wrote:

 Would it make sense to have the normal page migration code always
 work with the extra refcount, so we do not have to introduce a new
 MIGRATE_FAULT migration mode?
 
 On the other hand, compaction does not take the extra reference...

Right, it appears to not do this, it gets pages from the pfn and
zone-lock and the isolate_lru_page() call is the first reference.

 Another alternative might be to do the put_page inside
 do_prot_none_numa().  That would be analogous to do_wp_page
 disposing of the old page for the caller.

It'd have to be inside migrate_misplaced_page(), can't do before
isolate_lru_page() or the page might disappear. Doing it after is
(obviously) too late.

 I am not real happy about NUMA migration introducing its own
 migration mode...

You didn't seem to mind too much earlier, but I can remove it if you
want.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: question on NUMA page migration

2012-10-19 Thread Rik van Riel

On 10/19/2012 01:53 PM, Peter Zijlstra wrote:

On Fri, 2012-10-19 at 13:13 -0400, Rik van Riel wrote:



Another alternative might be to do the put_page inside
do_prot_none_numa().  That would be analogous to do_wp_page
disposing of the old page for the caller.


It'd have to be inside migrate_misplaced_page(), can't do before
isolate_lru_page() or the page might disappear. Doing it after is
(obviously) too late.


Keeping an extra refcount on the page might _still_
result in it disappearing from the process by some
other means, in-between you grabbing the refcount
and invoking migration of the page.


I am not real happy about NUMA migration introducing its own
migration mode...


You didn't seem to mind too much earlier, but I can remove it if you
want.


Could have been reviewing fatigue :)

And yes, it would have been nice to not have a special
migration mode for sched/numa.

Speaking of, when do you guys plan to submit a (cleaned up)
version of the sched/numa patch series for review on lkml?

--
All rights reversed
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: question on NUMA page migration

2012-10-19 Thread Ingo Molnar

* Rik van Riel r...@redhat.com wrote:

 On 10/19/2012 01:53 PM, Peter Zijlstra wrote:
 On Fri, 2012-10-19 at 13:13 -0400, Rik van Riel wrote:
 
 Another alternative might be to do the put_page inside
 do_prot_none_numa().  That would be analogous to do_wp_page
 disposing of the old page for the caller.
 
 It'd have to be inside migrate_misplaced_page(), can't do before
 isolate_lru_page() or the page might disappear. Doing it after is
 (obviously) too late.
 
 Keeping an extra refcount on the page might _still_
 result in it disappearing from the process by some
 other means, in-between you grabbing the refcount
 and invoking migration of the page.
 
 I am not real happy about NUMA migration introducing its own
 migration mode...
 
 You didn't seem to mind too much earlier, but I can remove it if you
 want.
 
 Could have been reviewing fatigue :)

:-)

 And yes, it would have been nice to not have a special
 migration mode for sched/numa.
 
 Speaking of, when do you guys plan to submit a (cleaned up)
 version of the sched/numa patch series for review on lkml?

Which commit(s) worry you specifically?

Thanks,

Ingo
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/