On Wed, 26 Jan 2022 13:24:10 +0800 Gavin Shan <gs...@redhat.com> wrote:
> The default CPU-to-NUMA association is given by mc->get_default_cpu_node_id()
> when it isn't provided explicitly. However, the CPU topology isn't fully
> considered in the default association and it causes CPU topology broken
> warnings on booting Linux guest.
>
> For example, the following warning messages are observed when the Linux guest
> is booted with the following command lines.
>
>   /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
>   -accel kvm -machine virt,gic-version=host \
>   -cpu host \
>   -smp 6,sockets=2,cores=3,threads=1 \
>   -m 1024M,slots=16,maxmem=64G \
>   -object memory-backend-ram,id=mem0,size=128M \
>   -object memory-backend-ram,id=mem1,size=128M \
>   -object memory-backend-ram,id=mem2,size=128M \
>   -object memory-backend-ram,id=mem3,size=128M \
>   -object memory-backend-ram,id=mem4,size=128M \
>   -object memory-backend-ram,id=mem4,size=384M \
>   -numa node,nodeid=0,memdev=mem0 \
>   -numa node,nodeid=1,memdev=mem1 \
>   -numa node,nodeid=2,memdev=mem2 \
>   -numa node,nodeid=3,memdev=mem3 \
>   -numa node,nodeid=4,memdev=mem4 \
>   -numa node,nodeid=5,memdev=mem5
>          :
>   alternatives: patching kernel code
>   BUG: arch topology borken
>   the CLS domain not a subset of the MC domain
>   <the above error log repeats>
>   BUG: arch topology borken
>   the DIE domain not a subset of the NODE domain
>
> With current implementation of mc->get_default_cpu_node_id(), CPU#0 to CPU#5
> are associated with NODE#0 to NODE#5 separately. That's incorrect because
> CPU#0/1/2 should be associated with same NUMA node because they're seated
> in same socket.
>
> This fixes the issue by considering the socket when default CPU-to-NUMA
> is given. With this applied, no more CPU topology broken warnings are seen
> from the Linux guest. The 6 CPUs are associated with NODE#0/1, but there are
> no CPUs associated with NODE#2/3/4/5.

From a migration point of view it looks fine to me, and it doesn't need a
compat knob, since NUMA data (on virt-arm) is only used to construct ACPI
tables (and we don't version those unless something is broken by it).

> Signed-off-by: Gavin Shan <gs...@redhat.com>
> ---
>  hw/arm/virt.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 141350bf21..b4a95522d3 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -2499,7 +2499,7 @@ virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
>
>  static int64_t virt_get_default_cpu_node_id(const MachineState *ms, int idx)
>  {
> -    return idx % ms->numa_state->num_nodes;
> +    return idx / (ms->smp.dies * ms->smp.clusters * ms->smp.cores * ms->smp.threads);

I'd like the ARM folks to confirm whether the above is correct (i.e. that the
socket is the NUMA node boundary), and also whether the above topo vars could
have odd values. Don't look at the horribly complicated x86 as an example, but
it showed that vendors could stash pretty much anything there, so we should
consider it here as well and maybe forbid that in the smp virt-arm parser.

>  }
>
>  static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
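
For anyone following along, here is a minimal standalone sketch of the two
defaults being compared. It is not QEMU code: struct smp_topo and both
function names are hypothetical stand-ins for the ms->smp fields and the
one-line bodies in the hunk above, and dies=1, clusters=1 for the example
command line are my assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the ms->smp fields used by the patch. */
struct smp_topo {
    unsigned sockets, dies, clusters, cores, threads;
};

/* Old default: round-robin CPUs over the NUMA nodes. */
int64_t node_id_round_robin(int idx, int num_nodes)
{
    assert(num_nodes > 0);
    return idx % num_nodes;
}

/* Patched default: the node ID is the index of the CPU's socket. */
int64_t node_id_by_socket(const struct smp_topo *t, int idx)
{
    unsigned cpus_per_socket = t->dies * t->clusters * t->cores * t->threads;

    /* A zero in any topology field would make the division undefined. */
    assert(cpus_per_socket > 0);
    return idx / cpus_per_socket;
}
```

With sockets=2, cores=3, threads=1 and six NUMA nodes, the round-robin version
gives CPU#0..5 the nodes 0..5, while the per-socket version gives 0,0,0,1,1,1,
which matches the commit message. Note that the per-socket result is not
bounded by the number of NUMA nodes, so whether it can exceed that number
depends on the -numa configuration.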