Re: [PATCH 1/3] mm: return NUMA_NO_NODE in local_memory_node if zonelists are not setup

2014-02-24 Thread Christoph Lameter
On Fri, 21 Feb 2014, Nishanth Aravamudan wrote:

 I added two calls to local_memory_node(), I *think* both are necessary,
 but am willing to be corrected.

 One is in map_cpu_to_node() and one is in start_secondary(). The
 start_secondary() path is fine, AFAICT, as we are up  running at that
 point. But in [the renamed function] update_numa_cpu_node() which is
 used by hotplug, we get called from do_init_bootmem(), which is before
 the zonelists are setup.

 I think both calls are necessary because I believe the
 arch_update_cpu_topology() is used for supporting firmware-driven
 home-noding, which does not invoke start_secondary() again (the
 processor is already running, we're just updating the topology in that
 situation).

 Then again, I could special-case the do_init_bootmem callpath, which is
 only called at kernel init time?

Well taht looks to be simpler.

  I do agree that calling local_memory_node() too early then trying to
  fudge around the consequences seems rather wrong.

 If the answer is to simply not call local_memory_node() early, I'll
 submit a patch to at least add a comment, as there's nothing in the code
 itself to prevent this from happening and is guaranteed to oops.

Ok.

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 1/3] mm: return NUMA_NO_NODE in local_memory_node if zonelists are not setup

2014-02-24 Thread Nishanth Aravamudan
On 24.02.2014 [13:43:31 -0600], Christoph Lameter wrote:
 On Fri, 21 Feb 2014, Nishanth Aravamudan wrote:
 
  I added two calls to local_memory_node(), I *think* both are necessary,
  but am willing to be corrected.
 
  One is in map_cpu_to_node() and one is in start_secondary(). The
  start_secondary() path is fine, AFAICT, as we are up  running at that
  point. But in [the renamed function] update_numa_cpu_node() which is
  used by hotplug, we get called from do_init_bootmem(), which is before
  the zonelists are setup.
 
  I think both calls are necessary because I believe the
  arch_update_cpu_topology() is used for supporting firmware-driven
  home-noding, which does not invoke start_secondary() again (the
  processor is already running, we're just updating the topology in that
  situation).
 
  Then again, I could special-case the do_init_bootmem callpath, which is
  only called at kernel init time?
 
 Well taht looks to be simpler.

Ok, I'll work on this.

   I do agree that calling local_memory_node() too early then trying to
   fudge around the consequences seems rather wrong.
 
  If the answer is to simply not call local_memory_node() early, I'll
  submit a patch to at least add a comment, as there's nothing in the code
  itself to prevent this from happening and is guaranteed to oops.
 
 Ok.

Thanks!
-Nish

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 1/3] mm: return NUMA_NO_NODE in local_memory_node if zonelists are not setup

2014-02-21 Thread Andrew Morton
On Thu, 20 Feb 2014 10:28:47 -0800 Nishanth Aravamudan 
n...@linux.vnet.ibm.com wrote:

 On 20.02.2014 [10:05:39 -0600], Christoph Lameter wrote:
  On Wed, 19 Feb 2014, Nishanth Aravamudan wrote:
  
   We can call local_memory_node() before the zonelists are setup. In that
   case, first_zones_zonelist() will not set zone and the reference to
   zone-node will Oops. Catch this case, and, since we presumably running
   very early, just return that any node will do.
  
  Really? Isnt there some way to avoid this call if zonelists are not setup
  yet?
 
 How do I best determine if zonelists aren't setup yet?
 
 The call-path in question (after my series is applied) is:
 
 arch/powerpc/kernel/setup_64.c::setup_arch -
   arch/powerpc/mm/numa.c::do_init_bootmem() -
   cpu_numa_callback() -
   numa_setup_cpu() -
   map_cpu_to_node() -
   update_numa_cpu_node() -
   set_cpu_numa_mem()
 
 and setup_arch() is called before build_all_zonelists(NULL, NULL) in
 start_kernel(). This seemed like the most reasonable path, as it's used
 on hotplug as well.
 

But the call to local_memory_node() you added was in start_secondary(),
which isn't in that trace.

I do agree that calling local_memory_node() too early then trying to
fudge around the consequences seems rather wrong.

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 1/3] mm: return NUMA_NO_NODE in local_memory_node if zonelists are not setup

2014-02-21 Thread Nishanth Aravamudan
On 21.02.2014 [14:42:03 -0800], Andrew Morton wrote:
 On Thu, 20 Feb 2014 10:28:47 -0800 Nishanth Aravamudan 
 n...@linux.vnet.ibm.com wrote:
 
  On 20.02.2014 [10:05:39 -0600], Christoph Lameter wrote:
   On Wed, 19 Feb 2014, Nishanth Aravamudan wrote:
   
We can call local_memory_node() before the zonelists are setup. In that
case, first_zones_zonelist() will not set zone and the reference to
zone-node will Oops. Catch this case, and, since we presumably running
very early, just return that any node will do.
   
   Really? Isnt there some way to avoid this call if zonelists are not setup
   yet?
  
  How do I best determine if zonelists aren't setup yet?
  
  The call-path in question (after my series is applied) is:
  
  arch/powerpc/kernel/setup_64.c::setup_arch -
  arch/powerpc/mm/numa.c::do_init_bootmem() -
  cpu_numa_callback() -
  numa_setup_cpu() -
  map_cpu_to_node() -
  update_numa_cpu_node() -
  set_cpu_numa_mem()
  
  and setup_arch() is called before build_all_zonelists(NULL, NULL) in
  start_kernel(). This seemed like the most reasonable path, as it's used
  on hotplug as well.
  
 
 But the call to local_memory_node() you added was in start_secondary(),
 which isn't in that trace.

I added two calls to local_memory_node(), I *think* both are necessary,
but am willing to be corrected.

One is in map_cpu_to_node() and one is in start_secondary(). The
start_secondary() path is fine, AFAICT, as we are up  running at that
point. But in [the renamed function] update_numa_cpu_node() which is
used by hotplug, we get called from do_init_bootmem(), which is before
the zonelists are setup.

I think both calls are necessary because I believe the
arch_update_cpu_topology() is used for supporting firmware-driven
home-noding, which does not invoke start_secondary() again (the
processor is already running, we're just updating the topology in that
situation).

Then again, I could special-case the do_init_bootmem callpath, which is
only called at kernel init time?

 I do agree that calling local_memory_node() too early then trying to
 fudge around the consequences seems rather wrong.

If the answer is to simply not call local_memory_node() early, I'll
submit a patch to at least add a comment, as there's nothing in the code
itself to prevent this from happening and is guaranteed to oops.

Thanks,
Nish

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 1/3] mm: return NUMA_NO_NODE in local_memory_node if zonelists are not setup

2014-02-20 Thread Christoph Lameter
On Wed, 19 Feb 2014, Nishanth Aravamudan wrote:

 We can call local_memory_node() before the zonelists are setup. In that
 case, first_zones_zonelist() will not set zone and the reference to
 zone-node will Oops. Catch this case, and, since we presumably running
 very early, just return that any node will do.

Really? Isnt there some way to avoid this call if zonelists are not setup
yet?
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 1/3] mm: return NUMA_NO_NODE in local_memory_node if zonelists are not setup

2014-02-20 Thread Nishanth Aravamudan
On 20.02.2014 [10:05:39 -0600], Christoph Lameter wrote:
 On Wed, 19 Feb 2014, Nishanth Aravamudan wrote:
 
  We can call local_memory_node() before the zonelists are setup. In that
  case, first_zones_zonelist() will not set zone and the reference to
  zone-node will Oops. Catch this case, and, since we presumably running
  very early, just return that any node will do.
 
 Really? Isnt there some way to avoid this call if zonelists are not setup
 yet?

How do I best determine if zonelists aren't setup yet?

The call-path in question (after my series is applied) is:

arch/powerpc/kernel/setup_64.c::setup_arch -
arch/powerpc/mm/numa.c::do_init_bootmem() -
cpu_numa_callback() -
numa_setup_cpu() -
map_cpu_to_node() -
update_numa_cpu_node() -
set_cpu_numa_mem()

and setup_arch() is called before build_all_zonelists(NULL, NULL) in
start_kernel(). This seemed like the most reasonable path, as it's used
on hotplug as well.

I'm open to suggestsions!

Thanks,
Nish

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 1/3] mm: return NUMA_NO_NODE in local_memory_node if zonelists are not setup

2014-02-19 Thread Nishanth Aravamudan
We can call local_memory_node() before the zonelists are setup. In that
case, first_zones_zonelist() will not set zone and the reference to
zone-node will Oops. Catch this case, and, since we presumably running
very early, just return that any node will do.

Signed-off-by: Nishanth Aravamudan n...@linux.vnet.ibm.com
Cc: Andrew Morton a...@linux-foundation.org
Cc: Michal Hocko mho...@suse.cz
Cc: Mel Gorman mgor...@suse.de
Cc: linux...@kvack.org
Cc: linux-ker...@vger.kernel.org
Cc: Christoph Lameter c...@linux.com
Cc: David Rientjes rient...@google.com
Cc: Joonsoo Kim iamjoonsoo@lge.com
Cc: Ben Herrenschmidt b...@kernel.crashing.org
Cc: Anton Blanchard an...@samba.org

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e3758a0..5de4337 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3650,6 +3650,8 @@ int local_memory_node(int node)
   gfp_zone(GFP_KERNEL),
   NULL,
   zone);
+   if (!zone)
+   return NUMA_NO_NODE;
return zone-node;
 }
 #endif

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 1/3] mm: return NUMA_NO_NODE in local_memory_node if zonelists are not setup

2014-02-19 Thread Nishanth Aravamudan
[ Grr, sorry for not including you originally Andrew, if this ends up
being ok with others, it will probably need to go through your tree. ]

On 19.02.2014 [15:17:14 -0800], Nishanth Aravamudan wrote:
 We can call local_memory_node() before the zonelists are setup. In that
 case, first_zones_zonelist() will not set zone and the reference to
 zone-node will Oops. Catch this case, and, since we presumably running
 very early, just return that any node will do.
 
 Signed-off-by: Nishanth Aravamudan n...@linux.vnet.ibm.com
 Cc: Andrew Morton a...@linux-foundation.org
 Cc: Michal Hocko mho...@suse.cz
 Cc: Mel Gorman mgor...@suse.de
 Cc: linux...@kvack.org
 Cc: linux-ker...@vger.kernel.org
 Cc: Christoph Lameter c...@linux.com
 Cc: David Rientjes rient...@google.com
 Cc: Joonsoo Kim iamjoonsoo@lge.com
 Cc: Ben Herrenschmidt b...@kernel.crashing.org
 Cc: Anton Blanchard an...@samba.org
 
 diff --git a/mm/page_alloc.c b/mm/page_alloc.c
 index e3758a0..5de4337 100644
 --- a/mm/page_alloc.c
 +++ b/mm/page_alloc.c
 @@ -3650,6 +3650,8 @@ int local_memory_node(int node)
  gfp_zone(GFP_KERNEL),
  NULL,
  zone);
 + if (!zone)
 + return NUMA_NO_NODE;
   return zone-node;
  }
  #endif

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev