Re: [PATCH 1/3] mm: return NUMA_NO_NODE in local_memory_node if zonelists are not setup
On Fri, 21 Feb 2014, Nishanth Aravamudan wrote: I added two calls to local_memory_node(), I *think* both are necessary, but am willing to be corrected. One is in map_cpu_to_node() and one is in start_secondary(). The start_secondary() path is fine, AFAICT, as we are up running at that point. But in [the renamed function] update_numa_cpu_node() which is used by hotplug, we get called from do_init_bootmem(), which is before the zonelists are setup. I think both calls are necessary because I believe the arch_update_cpu_topology() is used for supporting firmware-driven home-noding, which does not invoke start_secondary() again (the processor is already running, we're just updating the topology in that situation). Then again, I could special-case the do_init_bootmem callpath, which is only called at kernel init time? Well taht looks to be simpler. I do agree that calling local_memory_node() too early then trying to fudge around the consequences seems rather wrong. If the answer is to simply not call local_memory_node() early, I'll submit a patch to at least add a comment, as there's nothing in the code itself to prevent this from happening and is guaranteed to oops. Ok. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 1/3] mm: return NUMA_NO_NODE in local_memory_node if zonelists are not setup
On 24.02.2014 [13:43:31 -0600], Christoph Lameter wrote: On Fri, 21 Feb 2014, Nishanth Aravamudan wrote: I added two calls to local_memory_node(), I *think* both are necessary, but am willing to be corrected. One is in map_cpu_to_node() and one is in start_secondary(). The start_secondary() path is fine, AFAICT, as we are up running at that point. But in [the renamed function] update_numa_cpu_node() which is used by hotplug, we get called from do_init_bootmem(), which is before the zonelists are setup. I think both calls are necessary because I believe the arch_update_cpu_topology() is used for supporting firmware-driven home-noding, which does not invoke start_secondary() again (the processor is already running, we're just updating the topology in that situation). Then again, I could special-case the do_init_bootmem callpath, which is only called at kernel init time? Well taht looks to be simpler. Ok, I'll work on this. I do agree that calling local_memory_node() too early then trying to fudge around the consequences seems rather wrong. If the answer is to simply not call local_memory_node() early, I'll submit a patch to at least add a comment, as there's nothing in the code itself to prevent this from happening and is guaranteed to oops. Ok. Thanks! -Nish ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 1/3] mm: return NUMA_NO_NODE in local_memory_node if zonelists are not setup
On Thu, 20 Feb 2014 10:28:47 -0800 Nishanth Aravamudan n...@linux.vnet.ibm.com wrote: On 20.02.2014 [10:05:39 -0600], Christoph Lameter wrote: On Wed, 19 Feb 2014, Nishanth Aravamudan wrote: We can call local_memory_node() before the zonelists are setup. In that case, first_zones_zonelist() will not set zone and the reference to zone-node will Oops. Catch this case, and, since we presumably running very early, just return that any node will do. Really? Isnt there some way to avoid this call if zonelists are not setup yet? How do I best determine if zonelists aren't setup yet? The call-path in question (after my series is applied) is: arch/powerpc/kernel/setup_64.c::setup_arch - arch/powerpc/mm/numa.c::do_init_bootmem() - cpu_numa_callback() - numa_setup_cpu() - map_cpu_to_node() - update_numa_cpu_node() - set_cpu_numa_mem() and setup_arch() is called before build_all_zonelists(NULL, NULL) in start_kernel(). This seemed like the most reasonable path, as it's used on hotplug as well. But the call to local_memory_node() you added was in start_secondary(), which isn't in that trace. I do agree that calling local_memory_node() too early then trying to fudge around the consequences seems rather wrong. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 1/3] mm: return NUMA_NO_NODE in local_memory_node if zonelists are not setup
On 21.02.2014 [14:42:03 -0800], Andrew Morton wrote: On Thu, 20 Feb 2014 10:28:47 -0800 Nishanth Aravamudan n...@linux.vnet.ibm.com wrote: On 20.02.2014 [10:05:39 -0600], Christoph Lameter wrote: On Wed, 19 Feb 2014, Nishanth Aravamudan wrote: We can call local_memory_node() before the zonelists are setup. In that case, first_zones_zonelist() will not set zone and the reference to zone-node will Oops. Catch this case, and, since we presumably running very early, just return that any node will do. Really? Isnt there some way to avoid this call if zonelists are not setup yet? How do I best determine if zonelists aren't setup yet? The call-path in question (after my series is applied) is: arch/powerpc/kernel/setup_64.c::setup_arch - arch/powerpc/mm/numa.c::do_init_bootmem() - cpu_numa_callback() - numa_setup_cpu() - map_cpu_to_node() - update_numa_cpu_node() - set_cpu_numa_mem() and setup_arch() is called before build_all_zonelists(NULL, NULL) in start_kernel(). This seemed like the most reasonable path, as it's used on hotplug as well. But the call to local_memory_node() you added was in start_secondary(), which isn't in that trace. I added two calls to local_memory_node(), I *think* both are necessary, but am willing to be corrected. One is in map_cpu_to_node() and one is in start_secondary(). The start_secondary() path is fine, AFAICT, as we are up running at that point. But in [the renamed function] update_numa_cpu_node() which is used by hotplug, we get called from do_init_bootmem(), which is before the zonelists are setup. I think both calls are necessary because I believe the arch_update_cpu_topology() is used for supporting firmware-driven home-noding, which does not invoke start_secondary() again (the processor is already running, we're just updating the topology in that situation). Then again, I could special-case the do_init_bootmem callpath, which is only called at kernel init time? I do agree that calling local_memory_node() too early then trying to fudge around the consequences seems rather wrong. If the answer is to simply not call local_memory_node() early, I'll submit a patch to at least add a comment, as there's nothing in the code itself to prevent this from happening and is guaranteed to oops. Thanks, Nish ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 1/3] mm: return NUMA_NO_NODE in local_memory_node if zonelists are not setup
On Wed, 19 Feb 2014, Nishanth Aravamudan wrote: We can call local_memory_node() before the zonelists are setup. In that case, first_zones_zonelist() will not set zone and the reference to zone-node will Oops. Catch this case, and, since we presumably running very early, just return that any node will do. Really? Isnt there some way to avoid this call if zonelists are not setup yet? ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 1/3] mm: return NUMA_NO_NODE in local_memory_node if zonelists are not setup
On 20.02.2014 [10:05:39 -0600], Christoph Lameter wrote: On Wed, 19 Feb 2014, Nishanth Aravamudan wrote: We can call local_memory_node() before the zonelists are setup. In that case, first_zones_zonelist() will not set zone and the reference to zone-node will Oops. Catch this case, and, since we presumably running very early, just return that any node will do. Really? Isnt there some way to avoid this call if zonelists are not setup yet? How do I best determine if zonelists aren't setup yet? The call-path in question (after my series is applied) is: arch/powerpc/kernel/setup_64.c::setup_arch - arch/powerpc/mm/numa.c::do_init_bootmem() - cpu_numa_callback() - numa_setup_cpu() - map_cpu_to_node() - update_numa_cpu_node() - set_cpu_numa_mem() and setup_arch() is called before build_all_zonelists(NULL, NULL) in start_kernel(). This seemed like the most reasonable path, as it's used on hotplug as well. I'm open to suggestsions! Thanks, Nish ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 1/3] mm: return NUMA_NO_NODE in local_memory_node if zonelists are not setup
We can call local_memory_node() before the zonelists are setup. In that case, first_zones_zonelist() will not set zone and the reference to zone-node will Oops. Catch this case, and, since we presumably running very early, just return that any node will do. Signed-off-by: Nishanth Aravamudan n...@linux.vnet.ibm.com Cc: Andrew Morton a...@linux-foundation.org Cc: Michal Hocko mho...@suse.cz Cc: Mel Gorman mgor...@suse.de Cc: linux...@kvack.org Cc: linux-ker...@vger.kernel.org Cc: Christoph Lameter c...@linux.com Cc: David Rientjes rient...@google.com Cc: Joonsoo Kim iamjoonsoo@lge.com Cc: Ben Herrenschmidt b...@kernel.crashing.org Cc: Anton Blanchard an...@samba.org diff --git a/mm/page_alloc.c b/mm/page_alloc.c index e3758a0..5de4337 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -3650,6 +3650,8 @@ int local_memory_node(int node) gfp_zone(GFP_KERNEL), NULL, zone); + if (!zone) + return NUMA_NO_NODE; return zone-node; } #endif ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 1/3] mm: return NUMA_NO_NODE in local_memory_node if zonelists are not setup
[ Grr, sorry for not including you originally Andrew, if this ends up being ok with others, it will probably need to go through your tree. ] On 19.02.2014 [15:17:14 -0800], Nishanth Aravamudan wrote: We can call local_memory_node() before the zonelists are setup. In that case, first_zones_zonelist() will not set zone and the reference to zone-node will Oops. Catch this case, and, since we presumably running very early, just return that any node will do. Signed-off-by: Nishanth Aravamudan n...@linux.vnet.ibm.com Cc: Andrew Morton a...@linux-foundation.org Cc: Michal Hocko mho...@suse.cz Cc: Mel Gorman mgor...@suse.de Cc: linux...@kvack.org Cc: linux-ker...@vger.kernel.org Cc: Christoph Lameter c...@linux.com Cc: David Rientjes rient...@google.com Cc: Joonsoo Kim iamjoonsoo@lge.com Cc: Ben Herrenschmidt b...@kernel.crashing.org Cc: Anton Blanchard an...@samba.org diff --git a/mm/page_alloc.c b/mm/page_alloc.c index e3758a0..5de4337 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -3650,6 +3650,8 @@ int local_memory_node(int node) gfp_zone(GFP_KERNEL), NULL, zone); + if (!zone) + return NUMA_NO_NODE; return zone-node; } #endif ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev