Re: [libvirt] [Qemu-devel] Exposing and calculating CPU APIC IDs (was Re: [RFC 1/3] target-i386: moving registers of vmstate from cpu_exec_init() to x86_cpu_realizefn())
On Thu, 2014-02-13 at 10:44 +0100, Igor Mammedov wrote: On Thu, 13 Feb 2014 14:14:08 +0800 Chen Fan chen.fan.f...@cn.fujitsu.com wrote: On Tue, 2014-01-21 at 11:10 +0100, Andreas Färber wrote: Am 21.01.2014 10:51, schrieb Chen Fan: On Tue, 2014-01-21 at 10:31 +0100, Igor Mammedov wrote: On Tue, 21 Jan 2014 15:12:45 +0800 Chen Fan chen.fan.f...@cn.fujitsu.com wrote: On Mon, 2014-01-20 at 13:29 +0100, Igor Mammedov wrote: On Fri, 17 Jan 2014 17:13:55 -0200 Eduardo Habkost ehabk...@redhat.com wrote: On Wed, Jan 15, 2014 at 03:37:04PM +0100, Igor Mammedov wrote: I recall there were objections to it since APIC ID contains topology information and it's not trivial for user to get it right. The last idea that was discussed to fix it was not expose APIC ID to user but rather introduce QOM hierarchy like: /machine/node/N/socket/X/core/Y/thread/Z and use it in user interface as a means to specify an arbitrary CPU and let QEMU calculate APIC ID based on this path. But nobody took on implementing it yet. We're taking so long to get a decent interface implemented, that part of me is considering exposing the APIC ID directly like suggested before, and requiring libvirt to calculate topology-aware APIC IDs[1] to properly implement CPU hotplug (and possibly for other tasks). If you are speaking about 'qemu will core dump with -smp 254, sockets=2, cores=3, threads=2' http://patchwork.ozlabs.org/patch/301272/ bug then it's limitation of ACPI implementation, I'm going to refactor it to use full APIC ids instead of using bitmap, so that we won't ever run into issue regardless of cpu supported CPU count. Another part of me is hoping that the libvirt developers ask us to please not do that, so I can use it as argument against exposing the APIC IDs directly the next time we discuss this. :) why not try your /machine/node/N/socket/X/core/Y/thread/Z idea first. It will benefit not only cpu hotplug but also '-numa' and topology description in general. have there been any plan/model of the idea? Need to add a new option to qemu command? I suppose we can start with internal default implementation first. one way could be 1. let machine prebuild empty QOM tree /machine/node/N/socket/X/core/Y/thread/Z 2. add node, socket, core, thread properties to CPU and link CPU into respective link created by #1 Thanks, I hope I can take some time to make some patches to implement it. Please give us a few hours to reply. :) /machine/node seems too broad a term to me. You can't prebuild the full tree, you can only prepare the nodes. core[Y]/thread[Z] was previously discussed as syntax. The important part to decide on will be what is going to be child and what link. Has anyone played with the Intel Quark platform for instance? (Galileo board or upcoming Edison card) On a regular mainboard, we would have socket[X] as a linkx86_64-cpu, which might point to a childcpu /machine/memory-node[W]/cpu[X]. But if we do so we can't reassign it to another memory node - acceptable? With Quark (or Qseven modules etc.) there would be a container object rather than the /machine itself that has a childi386-cpu instead of a linki386-cpu. I guess the memory nodes could still be on the /machine though. The other point of discussion between Anthony and me was whether core[Y] should be a link or child, same for thread. I believe a child is better as it enforces that unrealizing the CPU will unrealize all its cores and all its threads in the future. More issues may pop up when thinking about it longer than a few minutes. But yes, we need to start investigating this, and so far I had other priorities like getting the CPUState mess I created cleaned up. Hi, Igor, Andreas, In addition, I want to know what way user could use to specify an arbitrary CPU if using /machine/node/N/socket/X/core/Y/thread/Z idea? -device qemu64,socket=X,core=Y,thread=Z? or add a new optional command line? Definitely not another CLU option. I see a couple of options, 1. as you suggest with additional 'numa=N' so that QEMU could know where to attach a new CPU. 2. add 'parent' like option tied to linkcpu property and specify full QOM path on CLI: -device cpufoo,parent=/machine/node[N]/socket[X]/... Hi, Igor, Currently, we know, after adding an arbitrary CPU then do migration, on target, there will be not aware that which CPU have been added. in order to notify target of the cpu topo, can we specify full QOM path that you mentioned 2th point on target? if we can simply make smp n + 1 work as well at target to be better, but target how to know the cpu topo on source side? Thanks, Chen Regards, Andreas
Re: [libvirt] [Qemu-devel] Exposing and calculating CPU APIC IDs (was Re: [RFC 1/3] target-i386: moving registers of vmstate from cpu_exec_init() to x86_cpu_realizefn())
On Mon, 17 Feb 2014 18:24:09 +0800 Chen Fan chen.fan.f...@cn.fujitsu.com wrote: On Thu, 2014-02-13 at 10:44 +0100, Igor Mammedov wrote: On Thu, 13 Feb 2014 14:14:08 +0800 Chen Fan chen.fan.f...@cn.fujitsu.com wrote: On Tue, 2014-01-21 at 11:10 +0100, Andreas Färber wrote: Am 21.01.2014 10:51, schrieb Chen Fan: On Tue, 2014-01-21 at 10:31 +0100, Igor Mammedov wrote: On Tue, 21 Jan 2014 15:12:45 +0800 Chen Fan chen.fan.f...@cn.fujitsu.com wrote: On Mon, 2014-01-20 at 13:29 +0100, Igor Mammedov wrote: On Fri, 17 Jan 2014 17:13:55 -0200 Eduardo Habkost ehabk...@redhat.com wrote: On Wed, Jan 15, 2014 at 03:37:04PM +0100, Igor Mammedov wrote: I recall there were objections to it since APIC ID contains topology information and it's not trivial for user to get it right. The last idea that was discussed to fix it was not expose APIC ID to user but rather introduce QOM hierarchy like: /machine/node/N/socket/X/core/Y/thread/Z and use it in user interface as a means to specify an arbitrary CPU and let QEMU calculate APIC ID based on this path. But nobody took on implementing it yet. We're taking so long to get a decent interface implemented, that part of me is considering exposing the APIC ID directly like suggested before, and requiring libvirt to calculate topology-aware APIC IDs[1] to properly implement CPU hotplug (and possibly for other tasks). If you are speaking about 'qemu will core dump with -smp 254, sockets=2, cores=3, threads=2' http://patchwork.ozlabs.org/patch/301272/ bug then it's limitation of ACPI implementation, I'm going to refactor it to use full APIC ids instead of using bitmap, so that we won't ever run into issue regardless of cpu supported CPU count. Another part of me is hoping that the libvirt developers ask us to please not do that, so I can use it as argument against exposing the APIC IDs directly the next time we discuss this. :) why not try your /machine/node/N/socket/X/core/Y/thread/Z idea first. It will benefit not only cpu hotplug but also '-numa' and topology description in general. have there been any plan/model of the idea? Need to add a new option to qemu command? I suppose we can start with internal default implementation first. one way could be 1. let machine prebuild empty QOM tree /machine/node/N/socket/X/core/Y/thread/Z 2. add node, socket, core, thread properties to CPU and link CPU into respective link created by #1 Thanks, I hope I can take some time to make some patches to implement it. Please give us a few hours to reply. :) /machine/node seems too broad a term to me. You can't prebuild the full tree, you can only prepare the nodes. core[Y]/thread[Z] was previously discussed as syntax. The important part to decide on will be what is going to be child and what link. Has anyone played with the Intel Quark platform for instance? (Galileo board or upcoming Edison card) On a regular mainboard, we would have socket[X] as a linkx86_64-cpu, which might point to a childcpu /machine/memory-node[W]/cpu[X]. But if we do so we can't reassign it to another memory node - acceptable? With Quark (or Qseven modules etc.) there would be a container object rather than the /machine itself that has a childi386-cpu instead of a linki386-cpu. I guess the memory nodes could still be on the /machine though. The other point of discussion between Anthony and me was whether core[Y] should be a link or child, same for thread. I believe a child is better as it enforces that unrealizing the CPU will unrealize all its cores and all its threads in the future. More issues may pop up when thinking about it longer than a few minutes. But yes, we need to start investigating this, and so far I had other priorities like getting the CPUState mess I created cleaned up. Hi, Igor, Andreas, In addition, I want to know what way user could use to specify an arbitrary CPU if using /machine/node/N/socket/X/core/Y/thread/Z idea? -device qemu64,socket=X,core=Y,thread=Z? or add a new optional command line? Definitely not another CLU option. I see a couple of options, 1. as you suggest with additional 'numa=N' so that QEMU could know where to attach a new CPU. 2. add 'parent' like option tied to linkcpu property and specify full QOM path on CLI: -device cpufoo,parent=/machine/node[N]/socket[X]/... Hi, Igor, Currently, we know, after adding an arbitrary CPU then do migration, on target, there will be not aware that which CPU have been added. in order to notify target of the cpu topo, can we specify full QOM path that you
Re: [libvirt] [Qemu-devel] Exposing and calculating CPU APIC IDs (was Re: [RFC 1/3] target-i386: moving registers of vmstate from cpu_exec_init() to x86_cpu_realizefn())
On Thu, 13 Feb 2014 14:14:08 +0800 Chen Fan chen.fan.f...@cn.fujitsu.com wrote: On Tue, 2014-01-21 at 11:10 +0100, Andreas Färber wrote: Am 21.01.2014 10:51, schrieb Chen Fan: On Tue, 2014-01-21 at 10:31 +0100, Igor Mammedov wrote: On Tue, 21 Jan 2014 15:12:45 +0800 Chen Fan chen.fan.f...@cn.fujitsu.com wrote: On Mon, 2014-01-20 at 13:29 +0100, Igor Mammedov wrote: On Fri, 17 Jan 2014 17:13:55 -0200 Eduardo Habkost ehabk...@redhat.com wrote: On Wed, Jan 15, 2014 at 03:37:04PM +0100, Igor Mammedov wrote: I recall there were objections to it since APIC ID contains topology information and it's not trivial for user to get it right. The last idea that was discussed to fix it was not expose APIC ID to user but rather introduce QOM hierarchy like: /machine/node/N/socket/X/core/Y/thread/Z and use it in user interface as a means to specify an arbitrary CPU and let QEMU calculate APIC ID based on this path. But nobody took on implementing it yet. We're taking so long to get a decent interface implemented, that part of me is considering exposing the APIC ID directly like suggested before, and requiring libvirt to calculate topology-aware APIC IDs[1] to properly implement CPU hotplug (and possibly for other tasks). If you are speaking about 'qemu will core dump with -smp 254, sockets=2, cores=3, threads=2' http://patchwork.ozlabs.org/patch/301272/ bug then it's limitation of ACPI implementation, I'm going to refactor it to use full APIC ids instead of using bitmap, so that we won't ever run into issue regardless of cpu supported CPU count. Another part of me is hoping that the libvirt developers ask us to please not do that, so I can use it as argument against exposing the APIC IDs directly the next time we discuss this. :) why not try your /machine/node/N/socket/X/core/Y/thread/Z idea first. It will benefit not only cpu hotplug but also '-numa' and topology description in general. have there been any plan/model of the idea? Need to add a new option to qemu command? I suppose we can start with internal default implementation first. one way could be 1. let machine prebuild empty QOM tree /machine/node/N/socket/X/core/Y/thread/Z 2. add node, socket, core, thread properties to CPU and link CPU into respective link created by #1 Thanks, I hope I can take some time to make some patches to implement it. Please give us a few hours to reply. :) /machine/node seems too broad a term to me. You can't prebuild the full tree, you can only prepare the nodes. core[Y]/thread[Z] was previously discussed as syntax. The important part to decide on will be what is going to be child and what link. Has anyone played with the Intel Quark platform for instance? (Galileo board or upcoming Edison card) On a regular mainboard, we would have socket[X] as a linkx86_64-cpu, which might point to a childcpu /machine/memory-node[W]/cpu[X]. But if we do so we can't reassign it to another memory node - acceptable? With Quark (or Qseven modules etc.) there would be a container object rather than the /machine itself that has a childi386-cpu instead of a linki386-cpu. I guess the memory nodes could still be on the /machine though. The other point of discussion between Anthony and me was whether core[Y] should be a link or child, same for thread. I believe a child is better as it enforces that unrealizing the CPU will unrealize all its cores and all its threads in the future. More issues may pop up when thinking about it longer than a few minutes. But yes, we need to start investigating this, and so far I had other priorities like getting the CPUState mess I created cleaned up. Hi, Igor, Andreas, In addition, I want to know what way user could use to specify an arbitrary CPU if using /machine/node/N/socket/X/core/Y/thread/Z idea? -device qemu64,socket=X,core=Y,thread=Z? or add a new optional command line? Definitely not another CLU option. I see a couple of options, 1. as you suggest with additional 'numa=N' so that QEMU could know where to attach a new CPU. 2. add 'parent' like option tied to linkcpu property and specify full QOM path on CLI: -device cpufoo,parent=/machine/node[N]/socket[X]/... Thanks, Chen Regards, Andreas -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [Qemu-devel] Exposing and calculating CPU APIC IDs (was Re: [RFC 1/3] target-i386: moving registers of vmstate from cpu_exec_init() to x86_cpu_realizefn())
On Tue, 21 Jan 2014 11:10:30 +0100 Andreas Färber afaer...@suse.de wrote: Am 21.01.2014 10:51, schrieb Chen Fan: On Tue, 2014-01-21 at 10:31 +0100, Igor Mammedov wrote: On Tue, 21 Jan 2014 15:12:45 +0800 Chen Fan chen.fan.f...@cn.fujitsu.com wrote: On Mon, 2014-01-20 at 13:29 +0100, Igor Mammedov wrote: On Fri, 17 Jan 2014 17:13:55 -0200 Eduardo Habkost ehabk...@redhat.com wrote: On Wed, Jan 15, 2014 at 03:37:04PM +0100, Igor Mammedov wrote: I recall there were objections to it since APIC ID contains topology information and it's not trivial for user to get it right. The last idea that was discussed to fix it was not expose APIC ID to user but rather introduce QOM hierarchy like: /machine/node/N/socket/X/core/Y/thread/Z and use it in user interface as a means to specify an arbitrary CPU and let QEMU calculate APIC ID based on this path. But nobody took on implementing it yet. We're taking so long to get a decent interface implemented, that part of me is considering exposing the APIC ID directly like suggested before, and requiring libvirt to calculate topology-aware APIC IDs[1] to properly implement CPU hotplug (and possibly for other tasks). If you are speaking about 'qemu will core dump with -smp 254, sockets=2, cores=3, threads=2' http://patchwork.ozlabs.org/patch/301272/ bug then it's limitation of ACPI implementation, I'm going to refactor it to use full APIC ids instead of using bitmap, so that we won't ever run into issue regardless of cpu supported CPU count. Another part of me is hoping that the libvirt developers ask us to please not do that, so I can use it as argument against exposing the APIC IDs directly the next time we discuss this. :) why not try your /machine/node/N/socket/X/core/Y/thread/Z idea first. It will benefit not only cpu hotplug but also '-numa' and topology description in general. have there been any plan/model of the idea? Need to add a new option to qemu command? I suppose we can start with internal default implementation first. one way could be 1. let machine prebuild empty QOM tree /machine/node/N/socket/X/core/Y/thread/Z 2. add node, socket, core, thread properties to CPU and link CPU into respective link created by #1 Thanks, I hope I can take some time to make some patches to implement it. Please give us a few hours to reply. :) /machine/node seems too broad a term to me. You can't prebuild the full tree, you can only prepare the nodes. core[Y]/thread[Z] was previously discussed as syntax. The important part to decide on will be what is going to be child and what link. Has anyone played with the Intel Quark platform for instance? (Galileo board or upcoming Edison card) On a regular mainboard, we would have socket[X] as a linkx86_64-cpu, which might point to a childcpu /machine/memory-node[W]/cpu[X]. But if we do so we can't reassign it to another memory node - acceptable? With Quark (or Qseven modules etc.) there would be a container object rather than the /machine itself that has a childi386-cpu instead of a linki386-cpu. I guess the memory nodes could still be on the /machine though. The other point of discussion between Anthony and me was whether core[Y] should be a link or child, same for thread. I believe a child is better as it enforces that unrealizing the CPU will unrealize all its cores and all its threads in the future. In terms of parent/child relationship, I guess we are not going to come up with uniform design, since boards could differ very much in that aspect. I was rather thinking in terms of providing stable/uniform CLI/QMP NUMA interface using QOM tree. At startup we potentially have cpu topology information and set of NUMA nodes, so we could pre-build containers up to the point where CPU threads are attached and pre-create empty linksCPU and fill them later with actual CPU threads. More issues may pop up when thinking about it longer than a few minutes. But yes, we need to start investigating this, and so far I had other priorities like getting the CPUState mess I created cleaned up. Regards, Andreas -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [Qemu-devel] Exposing and calculating CPU APIC IDs (was Re: [RFC 1/3] target-i386: moving registers of vmstate from cpu_exec_init() to x86_cpu_realizefn())
On Tue, 2014-01-21 at 11:10 +0100, Andreas Färber wrote: Am 21.01.2014 10:51, schrieb Chen Fan: On Tue, 2014-01-21 at 10:31 +0100, Igor Mammedov wrote: On Tue, 21 Jan 2014 15:12:45 +0800 Chen Fan chen.fan.f...@cn.fujitsu.com wrote: On Mon, 2014-01-20 at 13:29 +0100, Igor Mammedov wrote: On Fri, 17 Jan 2014 17:13:55 -0200 Eduardo Habkost ehabk...@redhat.com wrote: On Wed, Jan 15, 2014 at 03:37:04PM +0100, Igor Mammedov wrote: I recall there were objections to it since APIC ID contains topology information and it's not trivial for user to get it right. The last idea that was discussed to fix it was not expose APIC ID to user but rather introduce QOM hierarchy like: /machine/node/N/socket/X/core/Y/thread/Z and use it in user interface as a means to specify an arbitrary CPU and let QEMU calculate APIC ID based on this path. But nobody took on implementing it yet. We're taking so long to get a decent interface implemented, that part of me is considering exposing the APIC ID directly like suggested before, and requiring libvirt to calculate topology-aware APIC IDs[1] to properly implement CPU hotplug (and possibly for other tasks). If you are speaking about 'qemu will core dump with -smp 254, sockets=2, cores=3, threads=2' http://patchwork.ozlabs.org/patch/301272/ bug then it's limitation of ACPI implementation, I'm going to refactor it to use full APIC ids instead of using bitmap, so that we won't ever run into issue regardless of cpu supported CPU count. Another part of me is hoping that the libvirt developers ask us to please not do that, so I can use it as argument against exposing the APIC IDs directly the next time we discuss this. :) why not try your /machine/node/N/socket/X/core/Y/thread/Z idea first. It will benefit not only cpu hotplug but also '-numa' and topology description in general. have there been any plan/model of the idea? Need to add a new option to qemu command? I suppose we can start with internal default implementation first. one way could be 1. let machine prebuild empty QOM tree /machine/node/N/socket/X/core/Y/thread/Z 2. add node, socket, core, thread properties to CPU and link CPU into respective link created by #1 Thanks, I hope I can take some time to make some patches to implement it. Please give us a few hours to reply. :) /machine/node seems too broad a term to me. You can't prebuild the full tree, you can only prepare the nodes. core[Y]/thread[Z] was previously discussed as syntax. The important part to decide on will be what is going to be child and what link. Has anyone played with the Intel Quark platform for instance? (Galileo board or upcoming Edison card) On a regular mainboard, we would have socket[X] as a linkx86_64-cpu, which might point to a childcpu /machine/memory-node[W]/cpu[X]. But if we do so we can't reassign it to another memory node - acceptable? With Quark (or Qseven modules etc.) there would be a container object rather than the /machine itself that has a childi386-cpu instead of a linki386-cpu. I guess the memory nodes could still be on the /machine though. The other point of discussion between Anthony and me was whether core[Y] should be a link or child, same for thread. I believe a child is better as it enforces that unrealizing the CPU will unrealize all its cores and all its threads in the future. More issues may pop up when thinking about it longer than a few minutes. But yes, we need to start investigating this, and so far I had other priorities like getting the CPUState mess I created cleaned up. Hi, Igor, Andreas, In addition, I want to know what way user could use to specify an arbitrary CPU if using /machine/node/N/socket/X/core/Y/thread/Z idea? -device qemu64,socket=X,core=Y,thread=Z? or add a new optional command line? Thanks, Chen Regards, Andreas -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [Qemu-devel] Exposing and calculating CPU APIC IDs (was Re: [RFC 1/3] target-i386: moving registers of vmstate from cpu_exec_init() to x86_cpu_realizefn())
On Tue, 21 Jan 2014 15:12:45 +0800 Chen Fan chen.fan.f...@cn.fujitsu.com wrote: On Mon, 2014-01-20 at 13:29 +0100, Igor Mammedov wrote: On Fri, 17 Jan 2014 17:13:55 -0200 Eduardo Habkost ehabk...@redhat.com wrote: On Wed, Jan 15, 2014 at 03:37:04PM +0100, Igor Mammedov wrote: On Wed, 15 Jan 2014 20:24:01 +0800 Chen Fan chen.fan.f...@cn.fujitsu.com wrote: On Tue, 2014-01-14 at 11:40 +0100, Igor Mammedov wrote: On Tue, 14 Jan 2014 17:27:20 +0800 Chen Fan chen.fan.f...@cn.fujitsu.com wrote: the intend of this patch is to register cpu vmstates with apic id instead of cpu index, due to the property setting of apic_id is behind the cpu initialization. so we move the registers of cpu vmstate from cpu_exec_init() to x86_cpu_realizefn() to let the set apicid as the parameter. Signed-off-by: Chen Fan chen.fan.f...@cn.fujitsu.com --- exec.c| 5 + target-i386/cpu.c | 9 + 2 files changed, 14 insertions(+) diff --git a/exec.c b/exec.c index 7e49e8e..9be5855 100644 --- a/exec.c +++ b/exec.c @@ -438,7 +438,9 @@ CPUState *qemu_get_cpu(int index) void cpu_exec_init(CPUArchState *env) { CPUState *cpu = ENV_GET_CPU(env); +#if !defined(TARGET_I386) CPUClass *cc = CPU_GET_CLASS(cpu); +#endif CPUState *some_cpu; int cpu_index; @@ -460,6 +462,8 @@ void cpu_exec_init(CPUArchState *env) #if defined(CONFIG_USER_ONLY) cpu_list_unlock(); #endif + +#if !defined(TARGET_I386) if (qdev_get_vmsd(DEVICE(cpu)) == NULL) { vmstate_register(NULL, cpu_index, vmstate_cpu_common, cpu); } @@ -472,6 +476,7 @@ void cpu_exec_init(CPUArchState *env) if (cc-vmsd != NULL) { vmstate_register(NULL, cpu_index, cc-vmsd, cpu); } +#endif /* !TARGET_I386 */ } #if defined(TARGET_HAS_ICE) diff --git a/target-i386/cpu.c b/target-i386/cpu.c index 967529a..dada6f6 100644 --- a/target-i386/cpu.c +++ b/target-i386/cpu.c @@ -2552,6 +2552,7 @@ static void x86_cpu_apic_realize(X86CPU *cpu, Error **errp) static void x86_cpu_realizefn(DeviceState *dev, Error **errp) { CPUState *cs = CPU(dev); +CPUClass *cc = CPU_GET_CLASS(cs); X86CPU *cpu = X86_CPU(dev); X86CPUClass *xcc = X86_CPU_GET_CLASS(dev); CPUX86State *env = cpu-env; @@ -2615,6 +2616,14 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp) cpu_reset(cs); xcc-parent_realize(dev, local_err); + +if (qdev_get_vmsd(DEVICE(cs)) == NULL) { +vmstate_register(NULL, env-cpuid_apic_id, vmstate_cpu_common, cs); +} + +if (cc-vmsd != NULL) { +vmstate_register(NULL, env-cpuid_apic_id, cc-vmsd, cs); +} how about doing it in common CPUclass.realize() you can use get_arch_id() for getting CPU id, it returns cpu_index by default and apic_id for target-i386. Thanks for your kind suggestion, does this mean we can directly move vmstate_register to cpu_common_realizefn()? yep, that is a gist of it. There is more to the issue with discontinuous CPUs, a lot of code still use cpu_index and the way it's allocated is not compatible with discontinuous CPUs, so this issue should be fixed as well. Also you propose to use a raw apic id with CLI/user interface. I recall there were objections to it since APIC ID contains topology information and it's not trivial for user to get it right. The last idea that was discussed to fix it was not expose APIC ID to user but rather introduce QOM hierarchy like: /machine/node/N/socket/X/core/Y/thread/Z and use it in user interface as a means to specify an arbitrary CPU and let QEMU calculate APIC ID based on this path. But nobody took on implementing it yet. We're taking so long to get a decent interface implemented, that part of me is considering exposing the APIC ID directly like suggested before, and requiring libvirt to calculate topology-aware APIC IDs[1] to properly implement CPU hotplug (and possibly for other tasks). If you are speaking about 'qemu will core dump with -smp 254, sockets=2, cores=3, threads=2' http://patchwork.ozlabs.org/patch/301272/ bug then it's limitation of ACPI implementation, I'm going to refactor it to use full APIC ids instead of using bitmap, so that we won't ever run into issue regardless of cpu supported CPU count. Another part of me is hoping that the libvirt developers ask us to please not
Re: [libvirt] [Qemu-devel] Exposing and calculating CPU APIC IDs (was Re: [RFC 1/3] target-i386: moving registers of vmstate from cpu_exec_init() to x86_cpu_realizefn())
On Tue, 2014-01-21 at 10:31 +0100, Igor Mammedov wrote: On Tue, 21 Jan 2014 15:12:45 +0800 Chen Fan chen.fan.f...@cn.fujitsu.com wrote: On Mon, 2014-01-20 at 13:29 +0100, Igor Mammedov wrote: On Fri, 17 Jan 2014 17:13:55 -0200 Eduardo Habkost ehabk...@redhat.com wrote: On Wed, Jan 15, 2014 at 03:37:04PM +0100, Igor Mammedov wrote: On Wed, 15 Jan 2014 20:24:01 +0800 Chen Fan chen.fan.f...@cn.fujitsu.com wrote: On Tue, 2014-01-14 at 11:40 +0100, Igor Mammedov wrote: On Tue, 14 Jan 2014 17:27:20 +0800 Chen Fan chen.fan.f...@cn.fujitsu.com wrote: the intend of this patch is to register cpu vmstates with apic id instead of cpu index, due to the property setting of apic_id is behind the cpu initialization. so we move the registers of cpu vmstate from cpu_exec_init() to x86_cpu_realizefn() to let the set apicid as the parameter. Signed-off-by: Chen Fan chen.fan.f...@cn.fujitsu.com --- exec.c| 5 + target-i386/cpu.c | 9 + 2 files changed, 14 insertions(+) diff --git a/exec.c b/exec.c index 7e49e8e..9be5855 100644 --- a/exec.c +++ b/exec.c @@ -438,7 +438,9 @@ CPUState *qemu_get_cpu(int index) void cpu_exec_init(CPUArchState *env) { CPUState *cpu = ENV_GET_CPU(env); +#if !defined(TARGET_I386) CPUClass *cc = CPU_GET_CLASS(cpu); +#endif CPUState *some_cpu; int cpu_index; @@ -460,6 +462,8 @@ void cpu_exec_init(CPUArchState *env) #if defined(CONFIG_USER_ONLY) cpu_list_unlock(); #endif + +#if !defined(TARGET_I386) if (qdev_get_vmsd(DEVICE(cpu)) == NULL) { vmstate_register(NULL, cpu_index, vmstate_cpu_common, cpu); } @@ -472,6 +476,7 @@ void cpu_exec_init(CPUArchState *env) if (cc-vmsd != NULL) { vmstate_register(NULL, cpu_index, cc-vmsd, cpu); } +#endif /* !TARGET_I386 */ } #if defined(TARGET_HAS_ICE) diff --git a/target-i386/cpu.c b/target-i386/cpu.c index 967529a..dada6f6 100644 --- a/target-i386/cpu.c +++ b/target-i386/cpu.c @@ -2552,6 +2552,7 @@ static void x86_cpu_apic_realize(X86CPU *cpu, Error **errp) static void x86_cpu_realizefn(DeviceState *dev, Error **errp) { CPUState *cs = CPU(dev); +CPUClass *cc = CPU_GET_CLASS(cs); X86CPU *cpu = X86_CPU(dev); X86CPUClass *xcc = X86_CPU_GET_CLASS(dev); CPUX86State *env = cpu-env; @@ -2615,6 +2616,14 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp) cpu_reset(cs); xcc-parent_realize(dev, local_err); + +if (qdev_get_vmsd(DEVICE(cs)) == NULL) { +vmstate_register(NULL, env-cpuid_apic_id, vmstate_cpu_common, cs); +} + +if (cc-vmsd != NULL) { +vmstate_register(NULL, env-cpuid_apic_id, cc-vmsd, cs); +} how about doing it in common CPUclass.realize() you can use get_arch_id() for getting CPU id, it returns cpu_index by default and apic_id for target-i386. Thanks for your kind suggestion, does this mean we can directly move vmstate_register to cpu_common_realizefn()? yep, that is a gist of it. There is more to the issue with discontinuous CPUs, a lot of code still use cpu_index and the way it's allocated is not compatible with discontinuous CPUs, so this issue should be fixed as well. Also you propose to use a raw apic id with CLI/user interface. I recall there were objections to it since APIC ID contains topology information and it's not trivial for user to get it right. The last idea that was discussed to fix it was not expose APIC ID to user but rather introduce QOM hierarchy like: /machine/node/N/socket/X/core/Y/thread/Z and use it in user interface as a means to specify an arbitrary CPU and let QEMU calculate APIC ID based on this path. But nobody took on implementing it yet. We're taking so long to get a decent interface implemented, that part of me is considering exposing the APIC ID directly like suggested before, and requiring libvirt to calculate topology-aware APIC IDs[1] to properly implement CPU hotplug (and possibly for other tasks). If you are speaking about 'qemu will core dump with -smp 254, sockets=2, cores=3, threads=2' http://patchwork.ozlabs.org/patch/301272/ bug then it's limitation of ACPI implementation, I'm going to refactor it to use full APIC ids
Re: [libvirt] [Qemu-devel] Exposing and calculating CPU APIC IDs (was Re: [RFC 1/3] target-i386: moving registers of vmstate from cpu_exec_init() to x86_cpu_realizefn())
Am 21.01.2014 10:51, schrieb Chen Fan: On Tue, 2014-01-21 at 10:31 +0100, Igor Mammedov wrote: On Tue, 21 Jan 2014 15:12:45 +0800 Chen Fan chen.fan.f...@cn.fujitsu.com wrote: On Mon, 2014-01-20 at 13:29 +0100, Igor Mammedov wrote: On Fri, 17 Jan 2014 17:13:55 -0200 Eduardo Habkost ehabk...@redhat.com wrote: On Wed, Jan 15, 2014 at 03:37:04PM +0100, Igor Mammedov wrote: I recall there were objections to it since APIC ID contains topology information and it's not trivial for user to get it right. The last idea that was discussed to fix it was not expose APIC ID to user but rather introduce QOM hierarchy like: /machine/node/N/socket/X/core/Y/thread/Z and use it in user interface as a means to specify an arbitrary CPU and let QEMU calculate APIC ID based on this path. But nobody took on implementing it yet. We're taking so long to get a decent interface implemented, that part of me is considering exposing the APIC ID directly like suggested before, and requiring libvirt to calculate topology-aware APIC IDs[1] to properly implement CPU hotplug (and possibly for other tasks). If you are speaking about 'qemu will core dump with -smp 254, sockets=2, cores=3, threads=2' http://patchwork.ozlabs.org/patch/301272/ bug then it's limitation of ACPI implementation, I'm going to refactor it to use full APIC ids instead of using bitmap, so that we won't ever run into issue regardless of cpu supported CPU count. Another part of me is hoping that the libvirt developers ask us to please not do that, so I can use it as argument against exposing the APIC IDs directly the next time we discuss this. :) why not try your /machine/node/N/socket/X/core/Y/thread/Z idea first. It will benefit not only cpu hotplug but also '-numa' and topology description in general. have there been any plan/model of the idea? Need to add a new option to qemu command? I suppose we can start with internal default implementation first. one way could be 1. let machine prebuild empty QOM tree /machine/node/N/socket/X/core/Y/thread/Z 2. add node, socket, core, thread properties to CPU and link CPU into respective link created by #1 Thanks, I hope I can take some time to make some patches to implement it. Please give us a few hours to reply. :) /machine/node seems too broad a term to me. You can't prebuild the full tree, you can only prepare the nodes. core[Y]/thread[Z] was previously discussed as syntax. The important part to decide on will be what is going to be child and what link. Has anyone played with the Intel Quark platform for instance? (Galileo board or upcoming Edison card) On a regular mainboard, we would have socket[X] as a linkx86_64-cpu, which might point to a childcpu /machine/memory-node[W]/cpu[X]. But if we do so we can't reassign it to another memory node - acceptable? With Quark (or Qseven modules etc.) there would be a container object rather than the /machine itself that has a childi386-cpu instead of a linki386-cpu. I guess the memory nodes could still be on the /machine though. The other point of discussion between Anthony and me was whether core[Y] should be a link or child, same for thread. I believe a child is better as it enforces that unrealizing the CPU will unrealize all its cores and all its threads in the future. More issues may pop up when thinking about it longer than a few minutes. But yes, we need to start investigating this, and so far I had other priorities like getting the CPUState mess I created cleaned up. Regards, Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list