Hi, Srivatsa

Thanks for your reply :)

On 04/03/2014 04:50 PM, Srivatsa S. Bhat wrote:
[snip]
>
> Now, the interesting thing to note here is that, if CPU0's node was already
> set as node0, *nothing* should go wrong, since it's just a redundant update.
> However, if CPU0's original node mapping was something different, or if
> node0 doesn't even exist in the machine, then the system can crash.

Using printk, I confirmed that all CPUs belonged to node 1 at the very
beginning, and things started behaving strangely after the wrong update...

>
> Have you verified that CPU0's node mapping is different from node 0?
> That is, boot the kernel with "numa=debug" in the kernel command line and
> it will print out the cpu-to-node associativity during boot. That way you
> can figure out what was the original associativity that was set. This will
> confirm the theory that the hypervisor sent a redundant update, but because
> of the weird pre-allocation using kzalloc that we do inside
> arch_update_cpu_topology(), we wrongly updated CPU0's mapping as CPU0 <->
> Node0.

The associativity must have changed, otherwise we would not continue with
the update, and I confirmed that an empty updates[] does show up inside
arch_update_cpu_topology().

What I am not sure about is whether this is legal; being notified of a
change when nothing actually changes sounds weird... However, even if it
is legal, a check here still makes sense IMHO.

Regards,
Michael Wang

>
>
> Regards,
> Srivatsa S. Bhat
>
>> Thus we should stop the updating in such cases, this patch will achieve
>> this and fix the issue.
>>
>> CC: Benjamin Herrenschmidt <b...@kernel.crashing.org>
>> CC: Paul Mackerras <pau...@samba.org>
>> CC: Nathan Fontenot <nf...@linux.vnet.ibm.com>
>> CC: Stephen Rothwell <s...@canb.auug.org.au>
>> CC: Andrew Morton <a...@linux-foundation.org>
>> CC: Robert Jennings <r...@linux.vnet.ibm.com>
>> CC: Jesse Larrew <jlar...@linux.vnet.ibm.com>
>> CC: "Srivatsa S. Bhat" <srivatsa.b...@linux.vnet.ibm.com>
>> CC: Alistair Popple <alist...@popple.id.au>
>> Signed-off-by: Michael Wang <wang...@linux.vnet.ibm.com>
>> ---
>>  arch/powerpc/mm/numa.c | 9 +++++++++
>>  1 file changed, 9 insertions(+)
>>
>> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
>> index 30a42e2..6757690 100644
>> --- a/arch/powerpc/mm/numa.c
>> +++ b/arch/powerpc/mm/numa.c
>> @@ -1591,6 +1591,14 @@ int arch_update_cpu_topology(void)
>>  		cpu = cpu_last_thread_sibling(cpu);
>>  	}
>>  
>> +	/*
>> +	 * The 'cpu_associativity_changes_mask' could be cleared if
>> +	 * all the cpus it indicates won't change their node, in
>> +	 * which case the 'updated_cpus' will be empty.
>> +	 */
>> +	if (!cpumask_weight(&updated_cpus))
>> +		goto out;
>> +
>>  	stop_machine(update_cpu_topology, &updates[0], &updated_cpus);
>>  
>>  	/*
>> @@ -1612,6 +1620,7 @@ int arch_update_cpu_topology(void)
>>  		changed = 1;
>>  	}
>>  
>> +out:
>>  	kfree(updates);
>>  	return changed;
>>  }
>>
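
For anyone following the thread, the failure mode discussed above can be reproduced in a small user-space model. This is only a sketch, not the kernel code: `struct topology_update`, `cpu_to_node_map`, `mask_weight()` and `apply_updates()` are hypothetical stand-ins, `calloc()` plays the role of kzalloc(), and the final loop is a simplified assumption about how the zero-filled first slot of updates[] ends up being consumed when no CPU actually changed its node.

```c
#include <stdlib.h>

#define NR_CPUS 4

/* Hypothetical stand-in for the kernel's per-CPU update record. */
struct topology_update {
	int cpu;
	int old_nid;
	int new_nid;
};

/* Toy cpu-to-node table: every CPU starts on node 1, as in the report. */
static int cpu_to_node_map[NR_CPUS] = { 1, 1, 1, 1 };

/* Count set bits in a toy CPU mask (stand-in for cpumask_weight()). */
static int mask_weight(unsigned int mask)
{
	int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/*
 * Simplified model of the update path. 'changes_mask' marks CPUs the
 * hypervisor reported, 'new_nids' holds the node reported for each CPU,
 * and 'guard' enables the empty-mask check proposed in the patch.
 * Returns 1 if any mapping was written, 0 otherwise.
 */
static int apply_updates(unsigned int changes_mask, const int *new_nids,
			 int guard)
{
	struct topology_update *updates;
	unsigned int updated_cpus = 0;
	int i = 0, n, changed = 0;

	/* Zero-filled pre-allocation: slot 0 reads as cpu 0 -> node 0. */
	updates = calloc(NR_CPUS, sizeof(*updates));
	if (!updates)
		return 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(changes_mask & (1u << cpu)))
			continue;
		/* Redundant notification: the node did not really change. */
		if (new_nids[cpu] == cpu_to_node_map[cpu])
			continue;
		updates[i].cpu = cpu;
		updates[i].old_nid = cpu_to_node_map[cpu];
		updates[i].new_nid = new_nids[cpu];
		updated_cpus |= 1u << cpu;
		i++;
	}

	/* The proposed check: nothing recorded, so nothing to apply. */
	if (guard && !mask_weight(updated_cpus))
		goto out;

	/* Assumption: the first slot is always consumed, even if unused. */
	n = i ? i : 1;
	for (int k = 0; k < n; k++) {
		cpu_to_node_map[updates[k].cpu] = updates[k].new_nid;
		changed = 1;
	}

out:
	free(updates);
	return changed;
}
```

In this model, reporting CPU0 with an unchanged node (every entry of `new_nids` set to 1, `changes_mask` 0x1) leaves CPU0 on node 1 when `guard` is 1, while with `guard` set to 0 the zero-filled slot is applied and CPU0 ends up mapped to node 0.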