Re: [PATCH RESEND v2 2/2] target/i386/kvm: get and put AMD pmu registers

2023-07-02 Thread Like Xu
On Wed, Jun 21, 2023 at 9:39 AM Dongli Zhang  wrote:
>
> The QEMU side calls kvm_get_msrs() to save the pmu registers from the KVM
> side to QEMU, and calls kvm_put_msrs() to store the pmu registers back to
> the KVM side.
>
> However, only the Intel gp/fixed/global pmu registers are involved. There
> is not any implementation for AMD pmu registers. The
> 'has_architectural_pmu_version' and 'num_architectural_pmu_gp_counters' are
> calculated at kvm_arch_init_vcpu() via cpuid(0xa). This does not work for
> AMD. Before AMD PerfMonV2, the number of gp registers is decided based on
> the CPU version.

Updating the relevant documentation to clarify this part of the deficiency
would be a good first step.

>
> This patch is to add the support for AMD version=1 pmu, to get and put AMD
> pmu registers. Otherwise, there will be a bug:

AMD version=1 ?

AMD does not have a version 1; its versioning starts directly at 2, perhaps
because of x86 compatibility. AMD also does not have the so-called
architectural pmu, so has_architectural_pmu_version may need to be renamed
for AMD.

It might be more helpful to add similar support for AMD PerfMonV2.

>
> 1. The VM resets (e.g., via QEMU system_reset or VM kdump/kexec) while it
> is running "perf top". The pmu registers are not disabled gracefully.
>
> 2. Although the x86_cpu_reset() resets many registers to zero, the
> kvm_put_msrs() does not put AMD pmu registers to KVM side. As a result,
> some pmu events are still enabled at the KVM side.

I agree that we should have done that, especially if guest pmu is enabled
on the AMD platforms.

>
> 3. The KVM pmc_speculative_in_use() always returns true so that the events
> will not be reclaimed. The kvm_pmc->perf_event is still active.
>
> 4. After the reboot, the VM kernel reports below error:
>
> [0.092011] Performance Events: Fam17h+ core perfctr, Broken BIOS 
> detected, complain to your hardware vendor.
> [0.092023] [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 
> c0010200 is 530076)
>
> 5. In a worse case, the active kvm_pmc->perf_event is still able to
> inject unknown NMIs randomly to the VM kernel.
>
> [...] Uhhuh. NMI received for unknown reason 30 on CPU 0.
>
> The patch is to fix the issue by resetting AMD pmu registers during the
> reset.

I'm not sure if the qemu_reset or VM kexec will necessarily trigger
kvm::amd_pmu_reset().
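
For reference, the value in the quoted "[Firmware Bug]" line can be decoded
by hand. The sketch below is only an illustration, assuming the usual AMD
PERF_CTL bit layout (event select in bits 7:0 and 35:32, unit mask in 15:8,
USR/OS/INT/EN at bits 16/17/20/22). The struct and function names are made up
for this mail and exist in neither QEMU nor KVM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed AMD PERF_CTL bit positions (not taken from the patch). */
#define AMD_EVTSEL_USR  (1ULL << 16)  /* count while in user mode */
#define AMD_EVTSEL_OS   (1ULL << 17)  /* count while in kernel mode */
#define AMD_EVTSEL_INT  (1ULL << 20)  /* raise an interrupt on overflow */
#define AMD_EVTSEL_EN   (1ULL << 22)  /* counter enabled */

/* Hypothetical container for the decoded fields. */
struct amd_evtsel {
    uint32_t event;      /* 12-bit event select */
    uint8_t  unit_mask;
    bool usr, os, intr, en;
};

static struct amd_evtsel amd_decode_evtsel(uint64_t val)
{
    struct amd_evtsel f;

    /* Event select is split: low byte in 7:0, high nibble in 35:32. */
    f.event = (uint32_t)(val & 0xff) | (uint32_t)(((val >> 32) & 0xf) << 8);
    f.unit_mask = (uint8_t)((val >> 8) & 0xff);
    f.usr  = (val & AMD_EVTSEL_USR) != 0;
    f.os   = (val & AMD_EVTSEL_OS) != 0;
    f.intr = (val & AMD_EVTSEL_INT) != 0;
    f.en   = (val & AMD_EVTSEL_EN) != 0;
    return f;
}
```

Under that layout, 0x530076 decodes to event 0x76 with USR, OS, INT and EN all
set, i.e. a counter that is still enabled and still asks for an interrupt on
overflow, which would be consistent with the unknown NMIs reported above.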

>
> Cc: Joe Jin 
> Cc: Like Xu 
> Signed-off-by: Dongli Zhang 
> ---
>  target/i386/cpu.h |  5 +++
>  target/i386/kvm/kvm.c | 83 +--
>  2 files changed, 86 insertions(+), 2 deletions(-)
>
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index cd047e0410..b8ba72e87a 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -471,6 +471,11 @@ typedef enum X86Seg {
>  #define MSR_CORE_PERF_GLOBAL_CTRL   0x38f
>  #define MSR_CORE_PERF_GLOBAL_OVF_CTRL   0x390
>
> +#define MSR_K7_EVNTSEL0 0xc001
> +#define MSR_K7_PERFCTR0 0xc0010004
> +#define MSR_F15H_PERF_CTL0  0xc0010200
> +#define MSR_F15H_PERF_CTR0  0xc0010201
> +
>  #define MSR_MC0_CTL 0x400
>  #define MSR_MC0_STATUS  0x401
>  #define MSR_MC0_ADDR0x402
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index bf4136fa1b..a0f7273dad 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -2084,6 +2084,32 @@ int kvm_arch_init_vcpu(CPUState *cs)
>  }
>  }
>
> +/*
> + * If KVM_CAP_PMU_CAPABILITY is not supported, there is no way to
> + * disable the AMD pmu virtualization.
> + *
> + * If KVM_CAP_PMU_CAPABILITY is supported, kvm_state->pmu_cap_disabled
> + * indicates the KVM side has already disabled the pmu virtualization.
> + */
> +if (IS_AMD_CPU(env) && !cs->kvm_state->pmu_cap_disabled) {
> +int64_t family;
> +
> +family = (env->cpuid_version >> 8) & 0xf;
> +if (family == 0xf) {
> +family += (env->cpuid_version >> 20) & 0xff;
> +}
> +
> +if (family >= 6) {
> +has_architectural_pmu_version = 1;
> +
> +if (env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_PERFCORE) {
> +num_architectural_pmu_gp_counters = 6;

Please make the code a little more readable with some macro definitions.

#define AMD64_NUM_COUNTERS 4
#define AMD64_NUM_COUNTERS_CORE 6
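
Something along these lines is what I have in mind. It is a standalone sketch
of the same logic as the hunk above, not a drop-in replacement: the two helper
names are invented here, and the PERFCORE feature bit is passed in as a plain
bool instead of being read from env->features[].

```c
#include <stdbool.h>
#include <stdint.h>

#define AMD64_NUM_COUNTERS       4
#define AMD64_NUM_COUNTERS_CORE  6

/* Hypothetical helper: display family from CPUID.1:EAX. */
static uint32_t amd_cpu_family(uint32_t cpuid_version)
{
    uint32_t family = (cpuid_version >> 8) & 0xf;

    /* The extended family field only applies when the base family is 0xf. */
    if (family == 0xf) {
        family += (cpuid_version >> 20) & 0xff;
    }
    return family;
}

/* Hypothetical helper: number of gp counters, 0 if no PMU is assumed. */
static unsigned amd_num_gp_counters(uint32_t cpuid_version, bool has_perfctr_core)
{
    if (amd_cpu_family(cpuid_version) < 6) {
        return 0;
    }
    return has_perfctr_core ? AMD64_NUM_COUNTERS_CORE : AMD64_NUM_COUNTERS;
}
```

For example, a cpuid_version of 0x00800f12 gives family 0x17, and with
PERFCORE set the helper reports 6 counters.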

> +} else {
> +num_architectural_pmu_gp_counters = 4;
> +}
> +}
> +}
> +
>  cpu_x86_cpuid(env, 0x8000, 0, , , , );
>
>  for (i = 0x800

Re: [PATCH RESEND v2 1/2] target/i386/kvm: introduce 'pmu-cap-disabled' to set KVM_PMU_CAP_DISABLE

2023-07-02 Thread Like Xu
On Wed, Jun 21, 2023 at 9:39 AM Dongli Zhang  wrote:
>
> The "perf stat" at the VM side still works even we set "-cpu host,-pmu" in
> the QEMU command line. That is, neither "-cpu host,-pmu" nor "-cpu EPYC"
> could disable the pmu virtualization in an AMD environment.
>
> We still see below at VM kernel side ...
>
> [0.510611] Performance Events: Fam17h+ core perfctr, AMD PMU driver.
>
> ... although we expect something like below.
>
> [0.596381] Performance Events: PMU not available due to virtualization, 
> using software events only.
> [0.600972] NMI watchdog: Perf NMI watchdog permanently disabled
>
> This is because the AMD pmu (v1) does not rely on cpuid to decide if the
> pmu virtualization is supported.
>
> We introduce a new property 'pmu-cap-disabled' for KVM accel to set
> KVM_PMU_CAP_DISABLE if KVM_CAP_PMU_CAPABILITY is supported. Only x86 host
> is supported because currently KVM uses KVM_CAP_PMU_CAPABILITY only for
> x86.

We may check cpu->enable_pmu when creating the first CPU or a BSP one
(before it gets running) and then choose whether to disable guest pmu using
vm ioctl KVM_CAP_PMU_CAPABILITY. Introducing a new property is hard to
justify if there are other options.
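
To make the suggestion concrete, here is a rough sketch of the decision only,
with no real ioctl involved. The enum, the function name and the
created_vcpus parameter are all hypothetical. I am assuming that
KVM_PMU_CAP_DISABLE has to be applied before any vCPU exists, which matches
what the v1 changelog below describes.

```c
#include <stdbool.h>

/* Hypothetical outcomes of the "should we disable the guest pmu" check. */
enum pmu_cap_action {
    PMU_CAP_LEAVE_ENABLED,  /* "pmu" is on, nothing to do */
    PMU_CAP_DISABLE,        /* issue KVM_PMU_CAP_DISABLE on the VM fd */
    PMU_CAP_UNSUPPORTED,    /* KVM lacks KVM_CAP_PMU_CAPABILITY */
    PMU_CAP_TOO_LATE        /* a vCPU already exists (assumed restriction) */
};

static enum pmu_cap_action pmu_cap_decide(bool enable_pmu, bool has_pmu_cap,
                                          unsigned created_vcpus)
{
    if (enable_pmu) {
        return PMU_CAP_LEAVE_ENABLED;
    }
    if (!has_pmu_cap) {
        return PMU_CAP_UNSUPPORTED;
    }
    if (created_vcpus > 0) {
        return PMU_CAP_TOO_LATE;
    }
    return PMU_CAP_DISABLE;
}
```

The point is that the existing per-CPU "pmu" property already carries the
information, so the accel-level property would not be needed for this path.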

>
> Cc: Joe Jin 
> Cc: Like Xu 
> Signed-off-by: Dongli Zhang 
> ---
> Changed since v1:
> - In version 1 we did not introduce the new property. We ioctl
>   KVM_PMU_CAP_DISABLE only before the creation of the 1st vcpu. We had
>   introduced a helper function to do this job before creating the 1st
>   KVM vcpu in v1.
>
>  accel/kvm/kvm-all.c  |  1 +
>  include/sysemu/kvm_int.h |  1 +
>  qemu-options.hx  |  7 ++
>  target/i386/kvm/kvm.c| 46 
>  4 files changed, 55 insertions(+)
>
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index 7679f397ae..238098e991 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -3763,6 +3763,7 @@ static void kvm_accel_instance_init(Object *obj)
>  s->xen_version = 0;
>  s->xen_gnttab_max_frames = 64;
>  s->xen_evtchn_max_pirq = 256;
> +s->pmu_cap_disabled = false;
>  }
>
>  /**
> diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
> index 511b42bde5..cbbe08ec54 100644
> --- a/include/sysemu/kvm_int.h
> +++ b/include/sysemu/kvm_int.h
> @@ -123,6 +123,7 @@ struct KVMState
>  uint32_t xen_caps;
>  uint16_t xen_gnttab_max_frames;
>  uint16_t xen_evtchn_max_pirq;
> +bool pmu_cap_disabled;
>  };
>
>  void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml,
> diff --git a/qemu-options.hx b/qemu-options.hx
> index b57489d7ca..1976c0ca3e 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -187,6 +187,7 @@ DEF("accel", HAS_ARG, QEMU_OPTION_accel,
>  "tb-size=n (TCG translation block cache size)\n"
>  "dirty-ring-size=n (KVM dirty ring GFN count, default 
> 0)\n"
>  "
> notify-vmexit=run|internal-error|disable,notify-window=n (enable notify VM 
> exit and set notify window, x86 only)\n"
> +"pmu-cap-disabled=true|false (disable 
> KVM_CAP_PMU_CAPABILITY, x86 only, default false)\n"
>  "thread=single|multi (enable multi-threaded TCG)\n", 
> QEMU_ARCH_ALL)
>  SRST
>  ``-accel name[,prop=value[,...]]``
> @@ -254,6 +255,12 @@ SRST
>  open up for a specified of time (i.e. notify-window).
>  Default: notify-vmexit=run,notify-window=0.
>
> +``pmu-cap-disabled=true|false``
> +When the KVM accelerator is used, it controls whether to disable the
> +KVM_CAP_PMU_CAPABILITY via KVM_PMU_CAP_DISABLE. When disabled, the
> +PMU virtualization is disabled at the KVM module side. This is for
> +x86 host only.
> +
>  ERST
>
>  DEF("smp", HAS_ARG, QEMU_OPTION_smp,
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index de531842f6..bf4136fa1b 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -129,6 +129,7 @@ static bool has_msr_ucode_rev;
>  static bool has_msr_vmx_procbased_ctls2;
>  static bool has_msr_perf_capabs;
>  static bool has_msr_pkrs;
> +static bool has_pmu_cap;
>
>  static uint32_t has_architectural_pmu_version;
>  static uint32_t num_architectural_pmu_gp_counters;
> @@ -2767,6 +2768,23 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>  }
>  }
>
> +has_pmu_cap = kvm_check_extension(s, KVM_CAP_PMU_CAPABILITY);
> +
> +if (s->pmu_cap_disabled) {
> +if (has_pmu_cap) {
> +

Re: [PATCH v2 0/2] target/i386/kvm: fix two svm pmu virtualization bugs

2023-06-19 Thread Like Xu

I think we've been stuck here too long. Sorry Dongli.

+zhenyu, could you get someone to follow up on this, or I will start working on 
that.


On 9/1/2023 9:19 am, Dongli Zhang wrote:

Ping?

About [PATCH v2 2/2], the bad thing is that the customer will not be able to
notice the issue, that is, the "Broken BIOS detected" in dmesg, immediately.

As a result, the customer VM may panic randomly at any time in the future (once
the issue is encountered) if "/proc/sys/kernel/unknown_nmi_panic" is enabled.

Thank you very much!

Dongli Zhang

On 12/19/22 06:45, Dongli Zhang wrote:

Can I get feedback for this patchset, especially the [PATCH v2 2/2]?

About the [PATCH v2 2/2], currently the issue impacts the usage of PMUs on AMD
VM, especially the below case:

1. Enable panic on nmi.
2. Use perf to monitor the performance of VM. Although without a test, I think
the nmi watchdog has the same effect.
3. A sudden system reset, or a kernel panic (kdump/kexec).
4. After reboot, there will be random unknown NMI.
5. Unfortunately, the "panic on nmi" may panic the VM randomly at any time.

Thank you very much!

Dongli Zhang

On 12/1/22 16:22, Dongli Zhang wrote:

This patchset is to fix two svm pmu virtualization bugs, x86 only.

version 1:
https://lore.kernel.org/all/20221119122901.2469-1-dongli.zh...@oracle.com/

1. The 1st bug is that "-cpu,-pmu" cannot disable svm pmu virtualization.

To use "-cpu EPYC" or "-cpu host,-pmu" cannot disable the pmu
virtualization. There is still below at the VM linux side ...

[0.510611] Performance Events: Fam17h+ core perfctr, AMD PMU driver.

... although we expect something like below.

[0.596381] Performance Events: PMU not available due to virtualization, 
using software events only.
[0.600972] NMI watchdog: Perf NMI watchdog permanently disabled

The 1st patch has introduced a new x86 only accel/kvm property
"pmu-cap-disabled=true" to disable the pmu virtualization via
KVM_PMU_CAP_DISABLE.

I considered 'KVM_X86_SET_MSR_FILTER' initially before patchset v1.
Since both KVM_X86_SET_MSR_FILTER and KVM_PMU_CAP_DISABLE are VM ioctl. I
finally used the latter because it is easier to use.


2. The 2nd bug is that un-reclaimed perf events (after QEMU system_reset)
at the KVM side may inject random unwanted/unknown NMIs to the VM.

The svm pmu registers are not reset during QEMU system_reset.

(1). The VM resets (e.g., via QEMU system_reset or VM kdump/kexec) while it
is running "perf top". The pmu registers are not disabled gracefully.

(2). Although the x86_cpu_reset() resets many registers to zero, the
kvm_put_msrs() does not put AMD pmu registers to KVM side. As a result,
some pmu events are still enabled at the KVM side.

(3). The KVM pmc_speculative_in_use() always returns true so that the events
will not be reclaimed. The kvm_pmc->perf_event is still active.

(4). After the reboot, the VM kernel reports below error:

[0.092011] Performance Events: Fam17h+ core perfctr, Broken BIOS detected, 
complain to your hardware vendor.
[0.092023] [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 
c0010200 is 530076)

(5). In a worse case, the active kvm_pmc->perf_event is still able to
inject unknown NMIs randomly to the VM kernel.

[...] Uhhuh. NMI received for unknown reason 30 on CPU 0.

The 2nd patch is to fix the issue by resetting AMD pmu registers as well as
Intel registers.


This patchset does not cover PerfMonV2, until the below patchset is merged
into the KVM side.

[PATCH v3 0/8] KVM: x86: Add AMD Guest PerfMonV2 PMU support
https://lore.kernel.org/all/2022102645.82001-1-lik...@tencent.com/


Dongli Zhang (2):
   target/i386/kvm: introduce 'pmu-cap-disabled' to set KVM_PMU_CAP_DISABLE
   target/i386/kvm: get and put AMD pmu registers

  accel/kvm/kvm-all.c  |   1 +
  include/sysemu/kvm_int.h |   1 +
  qemu-options.hx  |   7 +++
  target/i386/cpu.h|   5 ++
  target/i386/kvm/kvm.c| 129 +-
  5 files changed, 141 insertions(+), 2 deletions(-)

Thank you very much!

Dongli Zhang









Re: [PATCH 0/3] kvm: fix two svm pmu virtualization bugs

2022-11-20 Thread Like Xu

On 19/11/2022 8:28 pm, Dongli Zhang wrote:

This patchset is to fix two svm pmu virtualization bugs.

1. The 1st bug is that "-cpu,-pmu" cannot disable svm pmu virtualization.

To use "-cpu EPYC" or "-cpu host,-pmu" cannot disable the pmu
virtualization. There is still below at the VM linux side ...


Many QEMU vendor forks already have similar fixes, and
thanks for bringing this issue back to the mainline.



[0.510611] Performance Events: Fam17h+ core perfctr, AMD PMU driver.

... although we expect something like below.

[0.596381] Performance Events: PMU not available due to virtualization, 
using software events only.
[0.600972] NMI watchdog: Perf NMI watchdog permanently disabled

The patch 1-2 is to disable the pmu virtualization via KVM_PMU_CAP_DISABLE
if the per-vcpu "pmu" property is disabled.

I considered 'KVM_X86_SET_MSR_FILTER' initially.
Since both KVM_X86_SET_MSR_FILTER and KVM_PMU_CAP_DISABLE are VM ioctl. I
finally used the latter because it is easier to use.


2. The 2nd bug is that un-reclaimed perf events (after QEMU system_reset)
at the KVM side may inject random unwanted/unknown NMIs to the VM.

The svm pmu registers are not reset during QEMU system_reset.

(1). The VM resets (e.g., via QEMU system_reset or VM kdump/kexec) while it
is running "perf top". The pmu registers are not disabled gracefully.

(2). Although the x86_cpu_reset() resets many registers to zero, the
kvm_put_msrs() does not put AMD pmu registers to KVM side. As a result,
some pmu events are still enabled at the KVM side.

(3). The KVM pmc_speculative_in_use() always returns true so that the events
will not be reclaimed. The kvm_pmc->perf_event is still active.


I'm not sure whether you're saying KVM is doing something wrong. I don't think
so, because KVM cannot sense a system_reset defined by QEMU or other user space.
AMD's vPMC will continue to be enabled (if it was enabled before), generating
PMI injection into the guest, and the newly started guest doesn't realize the
counter is still enabled, which floods the error log.



(4). After the reboot, the VM kernel reports below error:

[0.092011] Performance Events: Fam17h+ core perfctr, Broken BIOS detected, 
complain to your hardware vendor.
[0.092023] [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 
c0010200 is 530076)

(5). In a worse case, the active kvm_pmc->perf_event is still able to
inject unknown NMIs randomly to the VM kernel.

[...] Uhhuh. NMI received for unknown reason 30 on CPU 0.

The patch 3 is to fix the issue by resetting AMD pmu registers as well as
Intel registers.


This fix idea looks good. It does require syncing the newly changed device
state from QEMU to KVM.
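
As a rough illustration of what "syncing" means here: on reset, user space
would need to write zero to every AMD event-select and counter MSR. The sketch
below only builds that list. The MSR addresses are the values I remember from
the Linux msr-index.h and should be treated as assumptions; struct msr_entry
and the function name are invented for this example and are not the QEMU
kvm_msr_entry_add() interface.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed MSR addresses (from memory of Linux msr-index.h). */
#define MSR_K7_EVNTSEL0     0xc0010000u
#define MSR_K7_PERFCTR0     0xc0010004u
#define MSR_F15H_PERF_CTL0  0xc0010200u
#define MSR_F15H_PERF_CTR0  0xc0010201u

#define AMD64_NUM_COUNTERS       4
#define AMD64_NUM_COUNTERS_CORE  6

/* Hypothetical (index, value) pair to be pushed to KVM. */
struct msr_entry {
    uint32_t index;
    uint64_t data;
};

/*
 * Fill 'out' with zeroed control/counter pairs and return the number of
 * entries written, or 0 if 'cap' is too small. With PERFCORE the CTL/CTR
 * registers are interleaved (stride 2); the legacy K7 ones are contiguous.
 */
static size_t amd_pmu_reset_entries(bool perfctr_core,
                                    struct msr_entry *out, size_t cap)
{
    uint32_t ctl_base = perfctr_core ? MSR_F15H_PERF_CTL0 : MSR_K7_EVNTSEL0;
    uint32_t ctr_base = perfctr_core ? MSR_F15H_PERF_CTR0 : MSR_K7_PERFCTR0;
    uint32_t stride = perfctr_core ? 2 : 1;
    size_t n = perfctr_core ? AMD64_NUM_COUNTERS_CORE : AMD64_NUM_COUNTERS;

    if (cap < 2 * n) {
        return 0;
    }
    for (size_t i = 0; i < n; i++) {
        out[2 * i].index = ctl_base + stride * (uint32_t)i;
        out[2 * i].data = 0;      /* event select cleared: counter disabled */
        out[2 * i + 1].index = ctr_base + stride * (uint32_t)i;
        out[2 * i + 1].data = 0;  /* counter value cleared */
    }
    return 2 * n;
}
```

Once the event selects are written back as zero, KVM has no reason to keep
the stale perf_event alive for the rebooted guest.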





This patchset does not cover PerfMonV2, until the below patchset
is merged into the KVM side.

[PATCH v3 0/8] KVM: x86: Add AMD Guest PerfMonV2 PMU support
https://lore.kernel.org/all/2022102645.82001-1-lik...@tencent.com/


Dongli Zhang (3):
   kvm: introduce a helper before creating the 1st vcpu
   i386: kvm: disable KVM_CAP_PMU_CAPABILITY if "pmu" is disabled
   target/i386/kvm: get and put AMD pmu registers

  accel/kvm/kvm-all.c|   7 ++-
  include/sysemu/kvm.h   |   2 +
  target/arm/kvm64.c |   4 ++
  target/i386/cpu.h  |   5 +++
  target/i386/kvm/kvm.c  | 104 +++-
  target/mips/kvm.c  |   4 ++
  target/ppc/kvm.c   |   4 ++
  target/riscv/kvm.c |   4 ++
  target/s390x/kvm/kvm.c |   4 ++
  9 files changed, 134 insertions(+), 4 deletions(-)

Thank you very much!

Dongli Zhang







Re: [PATCH] i386: Disable BTS and PEBS

2022-07-20 Thread Like Xu

On 20/7/2022 2:53 am, Sean Christopherson wrote:

On Tue, Jul 19, 2022, Paolo Bonzini wrote:

On 7/18/22 22:12, Sean Christopherson wrote:

On Mon, Jul 18, 2022, Paolo Bonzini wrote:

This needs to be fixed in the kernel because old QEMU/new KVM is supported.


I can't object to adding a quirk for this since KVM is breaking userspace, but
on the KVM side we really need to stop "sanitizing" userspace inputs unless it
puts the host at risk, because inevitably it leads to needing a quirk.


The problem is not the sanitizing, it's that userspace literally cannot know
that this needs to be done because the feature bits are "backwards" (1 =
unavailable).


Yes, the bits being inverted contributed to KVM not providing a way for
userspace to enumerate PEBS and BTS support, but lack of enumeration is a
separate issue.

If KVM had simply ignored invalid guest state from the get-go, then userspace
would never have gained a dependency on KVM sanitizing guest state.  The fact
that KVM didn't enumerate support in any way is an orthogonal problem.  To play
nice with older userspace, KVM will need to add a quirk to restore the
sanitizing code, but that doesn't solve the enumeration issue.  And vice versa,
solving the enumeration problem doesn't magically fix old userspace.


The right way to fix it is probably to use feature MSRs and, by default,
leave the features marked as unavailable.  I'll think it through and post a
patch tomorrow for both KVM and QEMU (to enable PEBS).


Try to help:

KVM already has MSR_IA32_PERF_CAPABILITIES as a feature MSR (to enable
LBR/PEBS), and KVM_CAP_PMU_CAPABILITY as a VM ioctl extension for
model-specific crappiness.



Yeah, lack of CPUID bits is annoying.






Re: [PATCH] i386: Disable BTS and PEBS

2022-07-17 Thread Like Xu

On 18/7/2022 11:22 am, Zhenzhong Duan wrote:

Since the KVM commit below, KVM has hidden BTS as it's not supported yet.
b9181c8ef356 ("KVM: x86/pmu: Avoid exposing Intel BTS feature")

After below KVM commit, it gave control of MSR_IA32_MISC_ENABLES to userspace.
9fc222967a39 ("KVM: x86: Give host userspace full control of 
MSR_IA32_MISC_ENABLES")

So qemu takes the responsibility to hide BTS.
Without fix, we get below warning in guest kernel:

[] unchecked MSR access error: WRMSR to 0x1d9 (tried to write 
0x01c0) at rIP: 0xaa070644 (native_write_msr+0x4/0x20)
[] Call Trace:
[]  
[]  intel_pmu_enable_bts+0x5d/0x70
[]  bts_event_add+0x77/0x90
[]  event_sched_in.isra.135+0x99/0x1e0

Tested-by: Xiangfei Ma 
Signed-off-by: Zhenzhong Duan 
---
  target/i386/cpu.h | 6 --
  target/i386/kvm/kvm.c | 4 
  2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 82004b65b944..8a83d0995c66 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -434,8 +434,10 @@ typedef enum X86Seg {
  
  #define MSR_IA32_MISC_ENABLE0x1a0

  /* Indicates good rep/movs microcode on some processors: */
-#define MSR_IA32_MISC_ENABLE_DEFAULT1
-#define MSR_IA32_MISC_ENABLE_MWAIT  (1ULL << 18)
+#define MSR_IA32_MISC_ENABLE_DEFAULT  1
+#define MSR_IA32_MISC_ENABLE_BTS_UNAVAIL  (1ULL << 11)
+#define MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL (1ULL << 12)
+#define MSR_IA32_MISC_ENABLE_MWAIT(1ULL << 18)
  
  #define MSR_MTRRphysBase(reg)   (0x200 + 2 * (reg))

  #define MSR_MTRRphysMask(reg)   (0x200 + 2 * (reg) + 1)
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index f148a6d52fa4..002e0520dd76 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -2180,6 +2180,10 @@ void kvm_arch_reset_vcpu(X86CPU *cpu)
  {
  CPUX86State *env = &cpu->env;
  
+/* Disable BTS feature which is unsupported on KVM */

+env->msr_ia32_misc_enable |= MSR_IA32_MISC_ENABLE_BTS_UNAVAIL;
+env->msr_ia32_misc_enable |= MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL;


Would it be more readable to group msr_ia32_misc_enable code into this function:

static void x86_cpu_reset(DeviceState *dev)

and, why disable PEBS (we need it at least for "-cpu host,migratable=no") ?

Also, the behavior of MISC_ENABLE_EMON is also inconsistent with "pmu=off".
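
To illustrate the point about PEBS: since these bits have inverted sense
(1 = unavailable), the reset value could be derived from what the guest is
actually allowed to see instead of setting both bits unconditionally. This is
only a sketch. The function and its two bool parameters are hypothetical, and
how QEMU would decide them (e.g. for "-cpu host,migratable=no") is left open.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSR_IA32_MISC_ENABLE_DEFAULT       1ULL
#define MSR_IA32_MISC_ENABLE_BTS_UNAVAIL   (1ULL << 11)
#define MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL  (1ULL << 12)

/* Hypothetical helper: compute the MISC_ENABLE value used at vCPU reset. */
static uint64_t misc_enable_reset_value(bool expose_bts, bool expose_pebs)
{
    uint64_t val = MSR_IA32_MISC_ENABLE_DEFAULT;

    /* Inverted sense: a set bit tells the guest the feature is missing. */
    if (!expose_bts) {
        val |= MSR_IA32_MISC_ENABLE_BTS_UNAVAIL;
    }
    if (!expose_pebs) {
        val |= MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL;
    }
    return val;
}
```

With both features hidden this yields 0x1801, which is what the hunk above
hard-codes; hiding only BTS yields 0x801.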


+
  env->xcr0 = 1;
  if (kvm_irqchip_in_kernel()) {
  env->mp_state = cpu_is_bsp(cpu) ? KVM_MP_STATE_RUNNABLE :




Re: [PATCH v5 0/2] Enable legacy LBR support for guest

2022-02-09 Thread Like Xu

Hi Weijiang,

On 23/1/2022 12:11 am, Yang Weijiang wrote:

KVM legacy LBR patches have been merged in kernel 5.12, this patchset
is to expose the feature to guest from the perf capability MSR. Qemu can
add LBR format in cpu option to achieve it, e.g., -cpu host,lbr-fmt=0x5,


Some older Intel CPUs may have lbr-fmt=LBR_FORMAT_32 (which is 0). Would
you help verify that KVM is supported on these platforms? If so, how do we
enable guest LBR from the QEMU side, with -cpu host,lbr-fmt=0x0?


the format should match host value in IA32_PERF_CAPABILITIES.

Note, KVM legacy LBR solution accelerates guest perf performance by LBR MSR
passthrough so it requires guest cpu model matches that of host's, i.e.,


Would you help add live migration support across host/guest CPU models when
hosts at both ends have the same number of LBR entries and the same lbr-fmt ?

Thanks,
Like Xu


only -cpu host is supported.

Change in v5:
1. This patchset is rebased on tip : 6621441db5
2. No functional change since v4.




[PATCH] target/i386/cpu: Use the KVM reported value for the number of ASIDs

2021-07-15 Thread Like Xu
From: Like Xu 

If KVM is enabled, use the supported number of address space identifiers
(ASIDs) by the CPUID Fn8000_000A_EBX instead of hard-coding it to 0x10.

Signed-off-by: Like Xu 
---
 target/i386/cpu.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 48b55ebd0a..959c4425a4 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5523,7 +5523,13 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, 
uint32_t count,
 case 0x800A:
 if (env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_SVM) {
 *eax = 0x0001; /* SVM Revision */
-*ebx = 0x0010; /* nr of ASIDs */
+/* nr of ASIDs */
+if (kvm_enabled()) {
+*ebx = kvm_arch_get_supported_cpuid(cs->kvm_state,
+0x800A, 0, R_EBX);
+} else {
+*ebx = 0x0010;
+}
 *ecx = 0;
 *edx = env->features[FEAT_SVM]; /* optional features */
 } else {
-- 
2.32.0




[PATCH v3 2/2] target/i386: add "-cpu, lbr-fmt=*" support to enable guest LBR

2021-05-07 Thread Like Xu
The last branch recording (LBR) is a performance monitor unit (PMU)
feature on Intel processors that records a running trace of the most
recent branches taken by the processor in the LBR stack. The QEMU
could configure whether it's enabled or not for each guest via CLI.

The LBR feature would be enabled on the guest if:
- the KVM is enabled and the PMU is enabled and,
- the msr-based-feature IA32_PERF_CAPABILITIES is supported on KVM and,
- the supported returned value for lbr_fmt from this msr is not zero and,
- the requested guest vcpu model does support FEAT_1_ECX.CPUID_EXT_PDCM,
- the user-provided lbr-fmt value should not violate its bitmask (0x3f)
  and it should be the same as the host lbr_fmt value or just use the
  QEMU option "-cpu host,migratable=no" to enable guest LBR.
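
The rules above can be condensed into one standalone check. The following is
a sketch of the intended behavior, not the code in the diff: it assumes KVM
is enabled, models "lbr-fmt not set" with the same ~PERF_CAP_LBR_FMT sentinel
the patch uses, and takes a hypothetical default_fmt argument standing for
whatever env->features[FEAT_PERF_CAPABILITIES] holds before the user override
(0 for plain "-cpu host", the host value for "migratable=no"). The enum and
function names are made up.

```c
#include <stdbool.h>
#include <stdint.h>

#define PERF_CAP_LBR_FMT  0x3fULL
#define LBR_FMT_UNSET     (~PERF_CAP_LBR_FMT)  /* sentinel: not given by user */

enum lbr_result {
    LBR_DISABLED,
    LBR_ENABLED,
    LBR_ERR_BITMASK,   /* user value has bits outside PERF_CAP_LBR_FMT */
    LBR_ERR_NO_PMU,    /* LBR requested but pmu=off */
    LBR_ERR_MISMATCH   /* requested format differs from the host format */
};

static enum lbr_result lbr_check(uint64_t user_fmt, uint64_t default_fmt,
                                 uint64_t host_fmt, bool pmu_on)
{
    uint64_t requested = default_fmt & PERF_CAP_LBR_FMT;

    /* An explicit user setting overrides the default, even if it is 0. */
    if (user_fmt != LBR_FMT_UNSET) {
        if ((user_fmt & PERF_CAP_LBR_FMT) != user_fmt) {
            return LBR_ERR_BITMASK;
        }
        requested = user_fmt;
    }
    if (requested == 0) {
        return LBR_DISABLED;
    }
    if (!pmu_on) {
        return LBR_ERR_NO_PMU;
    }
    if (requested != host_fmt) {
        return LBR_ERR_MISMATCH;
    }
    return LBR_ENABLED;
}
```

Keeping "unset" distinct from 0 is what lets "lbr-fmt=0" disable LBR even
when the default would otherwise have enabled it.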

Signed-off-by: Like Xu 
---
v2-v3 Changelog:
- Add a new generic property macro to validate its bitmask;
- Differentiate "lbr-fmt=0" from "lbr-fmt not set";
- Do what the user asked for whenever possible;
- Treat mismatch or violation as an error rather than warning;

Testcases for a lbr-fmt=5 host:

 "-cpu host" --> "Disable LBR"
 "-cpu host,lbr-fmt=0" --> "Disable LBR"
 "-cpu host,lbr-fmt=5" --> "Enable LBR"
 "-cpu host,lbr-fmt=6" --> "Error out, lbr mismatch"
 "-cpu host,lbr-fmt=0xff" --> "Error out, bitmask violation"
 "-cpu host,migratable=no" --> "Enable LBR"
 "-cpu host,migratable=no,lbr-fmt=0" --> "Disable LBR"
 "-cpu host,migratable=no,lbr-fmt=5" --> "Enable LBR"
 "-cpu host,migratable=no,lbr-fmt=6" --> "Error out, lbr mismatch"
 "-cpu host,migratable=no,lbr-fmt=0xff" --> "Error out, bitmask violation"

 target/i386/cpu.c | 39 +++
 target/i386/cpu.h | 10 ++
 2 files changed, 49 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index ad99cad0e7..d03306179a 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6748,6 +6748,41 @@ static void x86_cpu_realizefn(DeviceState *dev, Error 
**errp)
 goto out;
 }
 
+/*
+ * Override env->features[FEAT_PERF_CAPABILITIES]
+ * with explicit user-provided settings.
+ */
+if (cpu->lbr_fmt != ~PERF_CAP_LBR_FMT) {
+if ((cpu->lbr_fmt & PERF_CAP_LBR_FMT) != cpu->lbr_fmt) {
+error_setg(errp, "invalid lbr-fmt");
+return;
+}
+env->features[FEAT_PERF_CAPABILITIES] &= ~PERF_CAP_LBR_FMT;
+env->features[FEAT_PERF_CAPABILITIES] |= cpu->lbr_fmt;
+}
+
+/*
+ * We can always validate env->features[FEAT_PERF_CAPABILITIES],
+ * no matter how it was initialized:
+ */
+uint64_t requested_lbr_fmt =
+env->features[FEAT_PERF_CAPABILITIES] & PERF_CAP_LBR_FMT;
+if (requested_lbr_fmt && kvm_enabled()) {
+uint64_t host_perf_cap =
+x86_cpu_get_supported_feature_word(FEAT_PERF_CAPABILITIES, false);
+uint64_t host_lbr_fmt = host_perf_cap & PERF_CAP_LBR_FMT;
+if (!cpu->enable_pmu) {
+error_setg(errp, "vPMU: LBR is unsupported without pmu=on");
+return;
+}
+if (requested_lbr_fmt != host_lbr_fmt) {
+error_setg(errp, "vPMU: the lbr-fmt value (0x%lx) mismatches "
+"the host supported value (0x%lx).",
+requested_lbr_fmt, host_lbr_fmt);
+return;
+}
+}
+
 x86_cpu_filter_features(cpu, cpu->check_cpuid || cpu->enforce_cpuid);
 
 if (cpu->enforce_cpuid && x86_cpu_have_filtered_features(cpu)) {
@@ -7150,6 +7185,9 @@ static void x86_cpu_initfn(Object *obj)
 object_property_add_alias(obj, "sse4_1", obj, "sse4.1");
 object_property_add_alias(obj, "sse4_2", obj, "sse4.2");
 
+cpu->lbr_fmt = ~PERF_CAP_LBR_FMT;
+object_property_add_alias(obj, "lbr_fmt", obj, "lbr-fmt");
+
 if (xcc->model) {
 x86_cpu_load_model(cpu, xcc->model);
 }
@@ -7300,6 +7338,7 @@ static Property x86_cpu_properties[] = {
 #endif
 DEFINE_PROP_INT32("node-id", X86CPU, node_id, CPU_UNSET_NUMA_NODE_ID),
 DEFINE_PROP_BOOL("pmu", X86CPU, enable_pmu, false),
+DEFINE_PROP_BITMASK_UINT64("lbr-fmt", X86CPU, lbr_fmt, PERF_CAP_LBR_FMT),
 
 DEFINE_PROP_UINT32("hv-spinlocks", X86CPU, hyperv_spinlock_attempts,
HYPERV_SPINLOCK_NEVER_NOTIFY),
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 1bc300ce85..bab394e18e 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -354,6 +354,7 @@ typedef enum X86Seg {
 #define ARCH_CAP_TSX_CTRL_MSR  (1<<7)
 
 

[PATCH v3 1/2] qdev-properties: Add a new macro to validate bitmask for setter

2021-05-07 Thread Like Xu
The new generic DEFINE_PROP_BITMASK_UINT64 could be used to ensure
that a user-provided property value complies with its bitmask rule
and the default value is recommended to be set in instance_init().
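
The bitmask rule itself is small enough to show in isolation. The sketch
below is a simplified stand-in for the property setter, without the Visitor
and Error plumbing; the function name is hypothetical. Unlike the setter in
the diff, it validates before storing, so a rejected value leaves the field
untouched. That ordering is a choice made for this example only.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical setter: accept 'value' only if it has no bits outside
 * 'bitmask'. Returns true and updates *field on success.
 */
static bool set_bitmask_u64(uint64_t *field, uint64_t value, uint64_t bitmask)
{
    if (value & ~bitmask) {
        return false;  /* would violate the bitmask, keep the old value */
    }
    *field = value;
    return true;
}
```

With a bitmask of 0x3f, a value of 0x5 is accepted and 0xff is rejected.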

Signed-off-by: Like Xu 
---
 hw/core/qdev-properties.c| 19 +++
 include/hw/qdev-properties.h | 12 
 include/qapi/qmp/qerror.h|  3 +++
 3 files changed, 34 insertions(+)

diff --git a/hw/core/qdev-properties.c b/hw/core/qdev-properties.c
index 50f40949f5..3784d3b30d 100644
--- a/hw/core/qdev-properties.c
+++ b/hw/core/qdev-properties.c
@@ -428,6 +428,25 @@ const PropertyInfo qdev_prop_int64 = {
 .set_default_value = qdev_propinfo_set_default_value_int,
 };
 
+static void set_bitmask_uint64(Object *obj, Visitor *v, const char *name,
+  void *opaque, Error **errp)
+{
+Property *prop = opaque;
+uint64_t *ptr = object_field_prop_ptr(obj, prop);
+
+visit_type_uint64(v, name, ptr, errp);
+
+if (*ptr & ~prop->bitmask) {
+error_setg(errp, QERR_INVALID_BITMASK_VALUE, name, prop->bitmask);
+}
+}
+
+const PropertyInfo qdev_prop_bitmask_uint64 = {
+.name  = "int64",
+.get   = get_uint64,
+.set   = set_bitmask_uint64,
+};
+
 /* --- string --- */
 
 static void release_string(Object *obj, const char *name, void *opaque)
diff --git a/include/hw/qdev-properties.h b/include/hw/qdev-properties.h
index 0ef97d60ce..42f0112e14 100644
--- a/include/hw/qdev-properties.h
+++ b/include/hw/qdev-properties.h
@@ -17,6 +17,7 @@ struct Property {
 const PropertyInfo *info;
 ptrdiff_toffset;
 uint8_t  bitnr;
+uint64_t bitmask;
 bool set_default;
 union {
 int64_t i;
@@ -53,6 +54,7 @@ extern const PropertyInfo qdev_prop_uint16;
 extern const PropertyInfo qdev_prop_uint32;
 extern const PropertyInfo qdev_prop_int32;
 extern const PropertyInfo qdev_prop_uint64;
+extern const PropertyInfo qdev_prop_bitmask_uint64;
 extern const PropertyInfo qdev_prop_int64;
 extern const PropertyInfo qdev_prop_size;
 extern const PropertyInfo qdev_prop_string;
@@ -102,6 +104,16 @@ extern const PropertyInfo qdev_prop_link;
 .set_default = true, \
 .defval.u= (bool)_defval)
 
+/**
+ * The DEFINE_PROP_BITMASK_UINT64 could be used to ensure that
+ * a user-provided value complies with certain bitmask rule and
+ * the default value is recommended to be set in instance_init().
+ */
+#define DEFINE_PROP_BITMASK_UINT64(_name, _state, _field, _bitmask)   \
+DEFINE_PROP(_name, _state, _field, qdev_prop_bitmask_uint64, uint64_t, \
+.bitmask= (_bitmask), \
+.set_default = false)
+
 #define PROP_ARRAY_LEN_PREFIX "len-"
 
 /**
diff --git a/include/qapi/qmp/qerror.h b/include/qapi/qmp/qerror.h
index 596fce0c54..aab7902760 100644
--- a/include/qapi/qmp/qerror.h
+++ b/include/qapi/qmp/qerror.h
@@ -68,4 +68,7 @@
 #define QERR_UNSUPPORTED \
 "this feature or command is not currently supported"
 
+#define QERR_INVALID_BITMASK_VALUE \
+"the requested value for '%s' violates its bitmask '0x%lx'"
+
 #endif /* QERROR_H */
-- 
2.30.2




Re: [PATCH v2] target/i386: add "-cpu, lbr-fmt=*" support to enable guest LBR

2021-04-29 Thread Like Xu

Hi Eduardo,

Thanks for your detailed comments.

On 2021/4/29 5:19, Eduardo Habkost wrote:

On Tue, Apr 27, 2021 at 04:09:48PM +0800, Like Xu wrote:

The last branch recording (LBR) is a performance monitor unit (PMU)
feature on Intel processors that records a running trace of the most
recent branches taken by the processor in the LBR stack. The QEMU
could configure whether it's enabled or not for each guest via CLI.

The LBR feature would be enabled on the guest if:
- the KVM is enabled and the PMU is enabled and,
- the msr-based-feature IA32_PERF_CAPABILITIES is supported on KVM and,
- the supported returned value for lbr_fmt from this msr is not zero and,
- the requested guest vcpu model does support FEAT_1_ECX.CPUID_EXT_PDCM,
- the configured lbr-fmt value is the same as the host lbr_fmt value
   OR use the QEMU option "-cpu host,migratable=no".


I don't understand why "migratable" matters here.  "migratable"
is just a convenience property to get better defaults when using
"-cpu host".  I don't know why it would change the lbr-fmt
validation rules.


Your comments below help me understand why we introduced "migratable"
and I'll follow it.





Signed-off-by: Like Xu 
---


A changelog explaining what you changed since v1 would have been
useful here.


Sorry for the inconvenience.




  target/i386/cpu.c | 34 ++
  target/i386/cpu.h | 10 ++
  target/i386/kvm/kvm.c | 10 --
  3 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index ad99cad0e7..9c8e54aa6f 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6623,6 +6623,10 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool 
verbose)
  }
  
  for (w = 0; w < FEATURE_WORDS; w++) {

+if (w == FEAT_PERF_CAPABILITIES) {
+continue;
+}
+


Why exactly is this necessary?  I expected to be completely OK to
call mark_unavailable_features() multiple times for the same
FeatureWord.



OK.


If there's a reason why this is necessary, I suggest adding a
comment explaining why.


  uint64_t host_feat =
  x86_cpu_get_supported_feature_word(w, false);
  uint64_t requested_features = env->features[w];
@@ -6630,6 +6634,27 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool 
verbose)
  mark_unavailable_features(cpu, w, unavailable_features, prefix);
  }
  
+uint64_t host_perf_cap =

+x86_cpu_get_supported_feature_word(FEAT_PERF_CAPABILITIES, false);
+if (!cpu->lbr_fmt && !cpu->migratable) {
+cpu->lbr_fmt = host_perf_cap & PERF_CAP_LBR_FMT;


"migratable=no" is not a request to override explicit user
settings.  This is why we have the ~env->user_features masking
inside x86_cpu_expand_features() when initializing
env->features[].

In either case, I don't understand why you need the lines above.
"migratable=no" should already trigger the x86_cpu_get_supported_feature_word()
loop inside x86_cpu_expand_features(), and it should initialize
env->features[FEAT_PERF_CAPABILITIES] with the host value.  Isn't
that code working for you?
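
To illustrate the masking described above, here is a minimal sketch. It is
not the actual QEMU code: FeatureWord and expand_from_host() are names made
up for this example, and a single 64-bit word stands in for env->features[w]
and env->user_features[w].

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for one QEMU feature word. */
typedef struct {
    uint64_t features;      /* bits currently enabled for the guest */
    uint64_t user_features; /* bits the user set explicitly (on or off) */
} FeatureWord;

/*
 * Turn on every host-supported bit, except the bits the user already
 * decided on: those keep whatever value the user asked for.
 */
static void expand_from_host(FeatureWord *fw, uint64_t host_supported)
{
    fw->features |= host_supported & ~fw->user_features;
}
```

With user_features covering the LBR_FMT bits, a user-supplied lbr-fmt=0
would stay 0 even when the host reports a non-zero format.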



+if (cpu->lbr_fmt) {
+info_report("vPMU: The value of lbr-fmt has been adjusted "
+"to 0x%lx and guest LBR is enabled.",
+host_perf_cap & PERF_CAP_LBR_FMT);




From your other message:


(I'm assuming your examples are for a lbr-fmt=5 host)


"-cpu host,migratable=no" --> "Enable guest LBR and show warning"


Enabling guest LBR in this case is 100% OK, isn't it?  I don't
think you need to show a warning.



"-cpu host,migratable=no,lbr-fmt=0" --> "Enable guest LBR and show warning"


Why?  In this case, we should do what the user asked for whenever
possible, and the user is explicitly asking lbr-fmt to be 0.


"-cpu host,migratable=no,lbr-fmt=5" --> "Enable guest LBR"


Looks OK.


"-cpu host,migratable=no,lbr-fmt=6" --> "Disable guest LBR and show warning"


Makes sense to me[1].



+}
+} else {
+uint64_t requested_lbr_fmt = cpu->lbr_fmt & PERF_CAP_LBR_FMT;
+if (requested_lbr_fmt && kvm_enabled()) {




From your other message:



"-cpu host,lbr-fmt=0" --> "Disable guest LBR"


Makes sense to me.  I understand this as a confirmation that it's
OK to have a guest/host mismatch if guest LBR=0.


"-cpu host,lbr-fmt=5" --> "Enable guest LBR"


Makes sense to me.


"-cpu host,lbr-fmt=6" --> "Disable guest LBR and show warning"


Makes sense to me[1].


[1] As long as "show warning" becomes "fatal error" if enforce=1.
 mark_unavailable_features() should make sure this happens.

 Or maybe we should make this

Re: [PATCH RESEND 1/2] target/i386: add "-cpu, lbr-fmt=*" support to enable guest LBR

2021-04-27 Thread Like Xu

Hi Eduardo,

On 2021/4/24 5:20, Eduardo Habkost wrote:

Hi,

Sorry for missing the previous submission of this series, and
thanks for resubmitting.


Long time no see and thanks for your comments.



On Fri, Apr 23, 2021 at 10:20:36AM +0800, Like Xu wrote:

The last branch recording (LBR) is a performance monitor unit (PMU)
feature on Intel processors that records a running trace of the most
recent branches taken by the processor in the LBR stack. The QEMU
could configure whether it's enabled or not for each guest via CLI.

The LBR feature would be enabled on the guest if:
- the KVM is enabled and the PMU is enabled and,
- the msr-based-feature IA32_PERF_CAPABILITIES is supported on KVM and,
- the supported returned value for lbr_fmt from this msr is not zero and,
- the requested guest vcpu model does support FEAT_1_ECX.CPUID_EXT_PDCM,
- the configured lbr-fmt value is the same as the host lbr_fmt value
   or use the QEMU option "-cpu host,migratable=no".

Cc: Eduardo Habkost 
Cc: Paolo Bonzini 
Signed-off-by: Like Xu 
---
  target/i386/cpu.c | 16 
  target/i386/cpu.h | 10 ++
  target/i386/kvm/kvm.c |  5 +++--
  3 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index ad99cad0e7..eee6da3ad8 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6627,6 +6627,13 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool 
verbose)
  x86_cpu_get_supported_feature_word(w, false);
  uint64_t requested_features = env->features[w];
  uint64_t unavailable_features = requested_features & ~host_feat;
+if (kvm_enabled() && w == FEAT_PERF_CAPABILITIES &&


If this block of code should run only once, why is this inside
the loop in the first place?

I suggest following the same pattern used for intel-pt flags and
moving it outside the loop.


Sure, mark_unavailable_features() will skip the check for
feature_word(FEAT_PERF_CAPABILITIES) and avoid double checking.




+(requested_features & PERF_CAP_LBR_FMT)) {


What exactly is supposed to happen if the VCPU is configured with
LBR_FMT=0 and the host has LBR_FMT != 0 ?


If the VCPU is configured with LBR_FMT=0 and the host has LBR_FMT != 0,
the guest LBR will be enabled if "migratable=no" and
will be disabled if "migratable=yes" by default.

Some test cases and expected results can be listed as:

"-cpu host,lbr-fmt=0" --> "Disable guest LBR"
"-cpu host,lbr-fmt=5" --> "Enable guest LBR"
"-cpu host,lbr-fmt=6" --> "Disable guest LBR and show warning"

"-cpu host,migratable=no" --> "Enable guest LBR and show warning"
"-cpu host,migratable=no,lbr-fmt=0" --> "Enable guest LBR and show warning"
"-cpu host,migratable=no,lbr-fmt=5" --> "Enable guest LBR"
"-cpu host,migratable=no,lbr-fmt=6" --> "Disable guest LBR and show warning"

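The table above can be written down as a small model. This is only a sketch
of the v2 logic under stated assumptions: KVM is enabled, the PMU is enabled,
and LbrResult and resolve_lbr_fmt() are hypothetical names that do not exist
in QEMU. As in the table, an unset lbr-fmt and an explicit lbr-fmt=0 cannot be
told apart here.

```c
#include <stdbool.h>
#include <stdint.h>

#define PERF_CAP_LBR_FMT 0x3f

typedef struct {
    uint8_t lbr_fmt; /* effective guest lbr-fmt; 0 means guest LBR is disabled */
    bool message;    /* true if an info/warning message would be reported */
} LbrResult;

/* Model of the lbr-fmt handling, assuming KVM and the guest PMU are enabled. */
static LbrResult resolve_lbr_fmt(uint8_t requested, bool migratable,
                                 uint64_t host_perf_cap)
{
    uint8_t host_fmt = host_perf_cap & PERF_CAP_LBR_FMT;
    LbrResult r = { .lbr_fmt = requested & PERF_CAP_LBR_FMT, .message = false };

    if (!r.lbr_fmt && !migratable) {
        /* Nothing requested and migratable=no: take the host format. */
        r.lbr_fmt = host_fmt;
        r.message = host_fmt != 0;
    } else if (r.lbr_fmt && r.lbr_fmt != host_fmt) {
        /* A non-zero request that differs from the host disables guest LBR. */
        r.lbr_fmt = 0;
        r.message = true;
    }
    return r;
}
```

For a host with lbr-fmt=5, resolve_lbr_fmt(6, true, 5) yields lbr_fmt 0 with a
message, and resolve_lbr_fmt(0, false, 5) yields lbr_fmt 5 with a message.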


If it shouldn't be an error, then the new kvm_exact_match_flags
field added in patch 2/2 becomes hard to reuse, and easy to
misuse (there's no code documentation indicating that a mismatch
is allowed if the requested bits are all zero).  In that case,
maybe patch 2/2 could be dropped by now.



Let us drop the patch 2/2 and please help review the new version:

https://lore.kernel.org/qemu-devel/20210427080948.439432-1-like...@linux.intel.com/


If it should be an error, this patch and 2/2 don't seem correct.
If correcting that, I also suggest reversing the patch order in
the series, so this whole block of code doesn't even need to be
added in the first place.



+if ((host_feat & PERF_CAP_LBR_FMT) !=
+(requested_features & PERF_CAP_LBR_FMT)) {
+unavailable_features |= PERF_CAP_LBR_FMT;
+}
+}
  mark_unavailable_features(cpu, w, unavailable_features, prefix);
  }
  
@@ -6734,6 +6741,14 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)

  }
  }
  
+if (cpu->lbr_fmt) {

+if (!cpu->enable_pmu) {
+error_setg(errp, "LBR is unsupported since guest PMU is 
disabled.");
+return;
+}
+env->features[FEAT_PERF_CAPABILITIES] |= cpu->lbr_fmt;
+}
+
  /* mwait extended info: needed for Core compatibility */
  /* We always wake on interrupt even if host does not have the capability 
*/
  cpu->mwait.ecx |= CPUID_MWAIT_EMX | CPUID_MWAIT_IBE;
@@ -7300,6 +7315,7 @@ static Property x86_cpu_properties[] = {
  #endif
  DEFINE_PROP_INT32("node-id", X86CPU, node_id, CPU_UNSET_NUMA_NODE_ID),
  DEFINE_PROP_BOOL("pmu", X86CPU, enable_pmu, false),
+DEFINE_PROP_UINT8("lbr-fmt", X86CPU, lbr_fmt, 0),
  
  DEFINE_PROP_UINT32("hv-spinlocks", X86CPU, hyperv_spinloc

[PATCH v2] target/i386: add "-cpu, lbr-fmt=*" support to enable guest LBR

2021-04-27 Thread Like Xu
The last branch recording (LBR) is a performance monitor unit (PMU)
feature on Intel processors that records a running trace of the most
recent branches taken by the processor in the LBR stack. The QEMU
could configure whether it's enabled or not for each guest via CLI.

The LBR feature would be enabled on the guest if:
- the KVM is enabled and the PMU is enabled and,
- the msr-based-feature IA32_PERF_CAPABILITIES is supported on KVM and,
- the supported returned value for lbr_fmt from this msr is not zero and,
- the requested guest vcpu model does support FEAT_1_ECX.CPUID_EXT_PDCM,
- the configured lbr-fmt value is the same as the host lbr_fmt value
  OR use the QEMU option "-cpu host,migratable=no".

Signed-off-by: Like Xu 
---
 target/i386/cpu.c | 34 ++
 target/i386/cpu.h | 10 ++
 target/i386/kvm/kvm.c | 10 --
 3 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index ad99cad0e7..9c8e54aa6f 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6623,6 +6623,10 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool 
verbose)
 }
 
 for (w = 0; w < FEATURE_WORDS; w++) {
+if (w == FEAT_PERF_CAPABILITIES) {
+continue;
+}
+
 uint64_t host_feat =
 x86_cpu_get_supported_feature_word(w, false);
 uint64_t requested_features = env->features[w];
@@ -6630,6 +6634,27 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool 
verbose)
 mark_unavailable_features(cpu, w, unavailable_features, prefix);
 }
 
+uint64_t host_perf_cap =
+x86_cpu_get_supported_feature_word(FEAT_PERF_CAPABILITIES, false);
+if (!cpu->lbr_fmt && !cpu->migratable) {
+cpu->lbr_fmt = host_perf_cap & PERF_CAP_LBR_FMT;
+if (cpu->lbr_fmt) {
+info_report("vPMU: The value of lbr-fmt has been adjusted "
+"to 0x%lx and guest LBR is enabled.",
+host_perf_cap & PERF_CAP_LBR_FMT);
+}
+} else {
+uint64_t requested_lbr_fmt = cpu->lbr_fmt & PERF_CAP_LBR_FMT;
+if (requested_lbr_fmt && kvm_enabled()) {
+if (requested_lbr_fmt != (host_perf_cap & PERF_CAP_LBR_FMT)) {
+cpu->lbr_fmt = 0;
+warn_report("vPMU: The supported lbr-fmt value on the host "
+"is 0x%lx and guest LBR is disabled.",
+host_perf_cap & PERF_CAP_LBR_FMT);
+}
+}
+}
+
 if ((env->features[FEAT_7_0_EBX] & CPUID_7_0_EBX_INTEL_PT) &&
 kvm_enabled()) {
 KVMState *s = CPU(cpu)->kvm_state;
@@ -6734,6 +6759,14 @@ static void x86_cpu_realizefn(DeviceState *dev, Error 
**errp)
 }
 }
 
+if (cpu->lbr_fmt) {
+if (!cpu->enable_pmu) {
+error_setg(errp, "LBR is unsupported since guest PMU is 
disabled.");
+return;
+}
+env->features[FEAT_PERF_CAPABILITIES] |= cpu->lbr_fmt;
+}
+
 /* mwait extended info: needed for Core compatibility */
 /* We always wake on interrupt even if host does not have the capability */
 cpu->mwait.ecx |= CPUID_MWAIT_EMX | CPUID_MWAIT_IBE;
@@ -7300,6 +7333,7 @@ static Property x86_cpu_properties[] = {
 #endif
 DEFINE_PROP_INT32("node-id", X86CPU, node_id, CPU_UNSET_NUMA_NODE_ID),
 DEFINE_PROP_BOOL("pmu", X86CPU, enable_pmu, false),
+DEFINE_PROP_UINT8("lbr-fmt", X86CPU, lbr_fmt, 0),
 
 DEFINE_PROP_UINT32("hv-spinlocks", X86CPU, hyperv_spinlock_attempts,
HYPERV_SPINLOCK_NEVER_NOTIFY),
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 570f916878..b12c879fc4 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -354,6 +354,7 @@ typedef enum X86Seg {
 #define ARCH_CAP_TSX_CTRL_MSR  (1<<7)
 
 #define MSR_IA32_PERF_CAPABILITIES  0x345
+#define PERF_CAP_LBR_FMT  0x3f
 
 #define MSR_IA32_TSX_CTRL  0x122
 #define MSR_IA32_TSCDEADLINE0x6e0
@@ -1726,6 +1727,15 @@ struct X86CPU {
  */
 bool enable_pmu;
 
+/*
+ * Configure LBR_FMT bits on IA32_PERF_CAPABILITIES MSR.
+ * This can't be enabled by default yet because it doesn't have
+ * ABI stability guarantees, as it is only allowed to pass all
+ * LBR_FMT bits returned by kvm_arch_get_supported_msr_feature()
+ * (that depends on host CPU and kernel capabilities) to the guest.
+ */
+uint8_t lbr_fmt;
+
 /* LMCE support can be enabled/disabled via cpu option 'lmce=on/off'. It is
  * disabled by default to avoid breaking migration between QEMU with
  * different LMCE configurations.
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 7fe9f52710..aa926984ae 100644
--- a/target/i386/k

Re: [PATCH v2] hw/i386: Expand the range of CPU topologies between smp and maxcpus

2021-04-26 Thread Like Xu

On 2021/4/26 21:30, Daniel P. Berrangé wrote:

On Mon, Apr 26, 2021 at 10:08:52AM +0800, caodon...@kingsoft.com wrote:

Change the criteria for the initial CPU topology and maxcpus, user can
have more settings


Can you provide a better explanation of why this is needed? What
valid usage scenario is blocked by the current check?

AFAICT, it partially reverts an intentional change made several
years ago in:


   commit bc1fb850a31468ac4976f3895f01a6d981e06d0a
   Author: Igor Mammedov 
   Date:   Thu Sep 13 13:06:01 2018 +0200

 vl.c deprecate incorrect CPUs topology
 
 -smp [cpus],sockets/cores/threads[,maxcpus] should describe topology

 so that total number of logical CPUs [sockets * cores * threads]
 would be equal to [maxcpus], however historically we didn't have
 such check in QEMU and it is possible to start VM with an invalid
 topology.
 Deprecate invalid options combination so we can make sure that
 the topology VM started with is always correct in the future.
 Users with an invalid sockets/cores/threads/maxcpus values should
 fix their CLI to make sure that
[sockets * cores * threads] == [maxcpus]




Another helpful commit would be:

commit c4332cd1dcf2964c23893ab4c0bf8d774e42a3cf
Author: Igor Mammedov 
Date:   Fri Sep 11 09:32:02 2020 -0400

smp: drop support for deprecated (invalid topologies)

it was deprecated since 3.1

Support for invalid topologies is removed, the user must ensure
that topologies described with -smp include all possible cpus,
i.e. (sockets * cores * threads) == maxcpus or QEMU will
exit with error.


So is the following statement correct:

When we explicitly set the topology, we must ensure that the combination 
(sockets/dies/cores/threads/maxcpus) is always valid. If we need hotplug 
testing, we can only use something like "-smp 1,maxcpus=4" since 3.1.


?
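
For reference, the rule quoted from the two commits can be sketched as a
single predicate. This is an illustration only: smp_topology_is_valid() is a
made-up name, and the sketch does not model how QEMU fills in values that are
left out of -smp.

```c
#include <stdbool.h>

/*
 * The explicit topology is valid only if it accounts for every possible
 * CPU, i.e. sockets * dies * cores * threads == maxcpus.
 */
static bool smp_topology_is_valid(unsigned sockets, unsigned dies,
                                  unsigned cores, unsigned threads,
                                  unsigned maxcpus)
{
    /* Widen before multiplying so large values cannot wrap around. */
    unsigned long long total =
        (unsigned long long)sockets * dies * cores * threads;

    return total == maxcpus;
}
```

Under this rule an explicit 1 socket * 1 die * 2 cores * 2 threads with
maxcpus=4 passes, while the same topology with maxcpus=8 does not.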






Signed-off-by: Dongli Cao 
---
hw/i386/pc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 8a84b25..ef2e819 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -751,7 +751,7 @@ void pc_smp_parse(MachineState *ms, QemuOpts *opts)
  exit(1);
  }

-if (sockets * dies * cores * threads != ms->smp.max_cpus) {
+if (sockets * dies * cores * threads > ms->smp.max_cpus) {
  error_report("Invalid CPU topology deprecated: "
   "sockets (%u) * dies (%u) * cores (%u) * threads (%u) 
"
   "!= maxcpus (%u)",


This is


--
1.8.3.1









caodon...@kingsoft.com




Regards,
Daniel






[PATCH RESEND 1/2] target/i386: add "-cpu, lbr-fmt=*" support to enable guest LBR

2021-04-22 Thread Like Xu
The last branch recording (LBR) is a performance monitor unit (PMU)
feature on Intel processors that records a running trace of the most
recent branches taken by the processor in the LBR stack. The QEMU
could configure whether it's enabled or not for each guest via CLI.

The LBR feature would be enabled on the guest if:
- the KVM is enabled and the PMU is enabled and,
- the msr-based-feature IA32_PERF_CAPABILITIES is supported on KVM and,
- the supported returned value for lbr_fmt from this msr is not zero and,
- the requested guest vcpu model does support FEAT_1_ECX.CPUID_EXT_PDCM,
- the configured lbr-fmt value is the same as the host lbr_fmt value
  or use the QEMU option "-cpu host,migratable=no".

Cc: Eduardo Habkost 
Cc: Paolo Bonzini 
Signed-off-by: Like Xu 
---
 target/i386/cpu.c | 16 
 target/i386/cpu.h | 10 ++
 target/i386/kvm/kvm.c |  5 +++--
 3 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index ad99cad0e7..eee6da3ad8 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6627,6 +6627,13 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool 
verbose)
 x86_cpu_get_supported_feature_word(w, false);
 uint64_t requested_features = env->features[w];
 uint64_t unavailable_features = requested_features & ~host_feat;
+if (kvm_enabled() && w == FEAT_PERF_CAPABILITIES &&
+(requested_features & PERF_CAP_LBR_FMT)) {
+if ((host_feat & PERF_CAP_LBR_FMT) !=
+(requested_features & PERF_CAP_LBR_FMT)) {
+unavailable_features |= PERF_CAP_LBR_FMT;
+}
+}
 mark_unavailable_features(cpu, w, unavailable_features, prefix);
 }
 
@@ -6734,6 +6741,14 @@ static void x86_cpu_realizefn(DeviceState *dev, Error 
**errp)
 }
 }
 
+if (cpu->lbr_fmt) {
+if (!cpu->enable_pmu) {
+error_setg(errp, "LBR is unsupported since guest PMU is 
disabled.");
+return;
+}
+env->features[FEAT_PERF_CAPABILITIES] |= cpu->lbr_fmt;
+}
+
 /* mwait extended info: needed for Core compatibility */
 /* We always wake on interrupt even if host does not have the capability */
 cpu->mwait.ecx |= CPUID_MWAIT_EMX | CPUID_MWAIT_IBE;
@@ -7300,6 +7315,7 @@ static Property x86_cpu_properties[] = {
 #endif
 DEFINE_PROP_INT32("node-id", X86CPU, node_id, CPU_UNSET_NUMA_NODE_ID),
 DEFINE_PROP_BOOL("pmu", X86CPU, enable_pmu, false),
+DEFINE_PROP_UINT8("lbr-fmt", X86CPU, lbr_fmt, 0),
 
 DEFINE_PROP_UINT32("hv-spinlocks", X86CPU, hyperv_spinlock_attempts,
HYPERV_SPINLOCK_NEVER_NOTIFY),
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 570f916878..b12c879fc4 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -354,6 +354,7 @@ typedef enum X86Seg {
 #define ARCH_CAP_TSX_CTRL_MSR  (1<<7)
 
 #define MSR_IA32_PERF_CAPABILITIES  0x345
+#define PERF_CAP_LBR_FMT  0x3f
 
 #define MSR_IA32_TSX_CTRL  0x122
 #define MSR_IA32_TSCDEADLINE0x6e0
@@ -1726,6 +1727,15 @@ struct X86CPU {
  */
 bool enable_pmu;
 
+/*
+ * Configure LBR_FMT bits on IA32_PERF_CAPABILITIES MSR.
+ * This can't be enabled by default yet because it doesn't have
+ * ABI stability guarantees, as it is only allowed to pass all
+ * LBR_FMT bits returned by kvm_arch_get_supported_msr_feature()
+ * (that depends on host CPU and kernel capabilities) to the guest.
+ */
+uint8_t lbr_fmt;
+
 /* LMCE support can be enabled/disabled via cpu option 'lmce=on/off'. It is
  * disabled by default to avoid breaking migration between QEMU with
  * different LMCE configurations.
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 7fe9f52710..4d842d32a6 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -2732,8 +2732,9 @@ static void kvm_msr_entry_add_perf(X86CPU *cpu, 
FeatureWordArray f)
MSR_IA32_PERF_CAPABILITIES);
 
 if (kvm_perf_cap) {
-kvm_msr_entry_add(cpu, MSR_IA32_PERF_CAPABILITIES,
-kvm_perf_cap & f[FEAT_PERF_CAPABILITIES]);
+kvm_perf_cap = cpu->migratable ?
+(kvm_perf_cap & f[FEAT_PERF_CAPABILITIES]) : kvm_perf_cap;
+kvm_msr_entry_add(cpu, MSR_IA32_PERF_CAPABILITIES, kvm_perf_cap);
 }
 }
 
-- 
2.30.2
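
The kvm_msr_entry_add_perf() hunk above boils down to one selection. A
minimal sketch, assuming plain integers in place of the X86CPU and
FeatureWordArray arguments; perf_cap_for_guest() is a hypothetical helper
name, not a QEMU function.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * With migratable=yes only the bits also present in the guest's feature
 * word are written to the MSR; with migratable=no the value reported by
 * KVM is passed through unchanged.
 */
static uint64_t perf_cap_for_guest(uint64_t kvm_perf_cap,
                                   uint64_t guest_feat, bool migratable)
{
    return migratable ? (kvm_perf_cap & guest_feat) : kvm_perf_cap;
}
```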




[PATCH RESEND 2/2] target/i386: add kvm_exact_match_flags to FeatureWordInfo

2021-04-22 Thread Like Xu
Instead of hardcoding the PERF_CAPABILITIES rules in this loop,
this could become a FeatureWordInfo field. It would be very useful
for other features like intel-pt, where we need some bits to match
the host bits too.

Suggested-by: Eduardo Habkost 
Signed-off-by: Like Xu 
---
 target/i386/cpu.c | 21 +++--
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index eee6da3ad8..56a486b498 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -708,6 +708,8 @@ typedef struct FeatureWordInfo {
 uint64_t migratable_flags; /* Feature flags known to be migratable */
 /* Features that shouldn't be auto-enabled by "-cpu host" */
 uint64_t no_autoenable_flags;
+/* Bits that must match host exactly when using KVM */
+uint64_t kvm_exact_match_flags;
 } FeatureWordInfo;
 
 static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
@@ -1147,6 +1149,11 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] 
= {
 .msr = {
 .index = MSR_IA32_PERF_CAPABILITIES,
 },
+/*
+ * KVM is not able to emulate a VCPU with LBR_FMT different
+ * from the host, so LBR_FMT must match the host exactly.
+ */
+.kvm_exact_match_flags = PERF_CAP_LBR_FMT,
 },
 
 [FEAT_VMX_PROCBASED_CTLS] = {
@@ -6623,16 +6630,18 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool 
verbose)
 }
 
 for (w = 0; w < FEATURE_WORDS; w++) {
+FeatureWordInfo *fi = &feature_word_info[w];
+uint64_t match_flags = fi->kvm_exact_match_flags;
 uint64_t host_feat =
 x86_cpu_get_supported_feature_word(w, false);
 uint64_t requested_features = env->features[w];
 uint64_t unavailable_features = requested_features & ~host_feat;
-if (kvm_enabled() && w == FEAT_PERF_CAPABILITIES &&
-(requested_features & PERF_CAP_LBR_FMT)) {
-if ((host_feat & PERF_CAP_LBR_FMT) !=
-(requested_features & PERF_CAP_LBR_FMT)) {
-unavailable_features |= PERF_CAP_LBR_FMT;
-}
+if (kvm_enabled() && match_flags) {
+uint64_t mismatches = (requested_features & match_flags) &&
+(requested_features ^ host_feat) & match_flags;
+mark_unavailable_features(cpu, w,
+mismatches, "feature doesn't match host");
+unavailable_features &= ~match_flags;
 }
 mark_unavailable_features(cpu, w, unavailable_features, prefix);
 }
-- 
2.30.2
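
The exact-match rule this patch is aiming for can be sketched on its own as
follows. This is a hedged illustration, not the code in the diff:
exact_match_mismatches() is a made-up helper, and it returns the full mask of
differing bits (an explicit test followed by a bitwise AND) rather than a 0/1
truth value.

```c
#include <stdint.h>

/*
 * Return the bits inside match_flags where the request differs from the
 * host. If nothing inside match_flags was requested, a mismatch with the
 * host is tolerated and no bits are reported.
 */
static uint64_t exact_match_mismatches(uint64_t requested, uint64_t host,
                                       uint64_t match_flags)
{
    if (!(requested & match_flags)) {
        return 0;
    }
    return (requested ^ host) & match_flags;
}
```

For example, requesting LBR_FMT 6 on a host with LBR_FMT 5 and
match_flags 0x3f reports bits 0x3, while requesting 0 reports nothing.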




Re: [RESEND][BUG FIX HELP] QEMU main thread endlessly hangs in __ppoll()

2021-03-04 Thread Like Xu

Hi John,

Thanks for your comment.

On 2021/3/5 7:53, John Snow wrote:

On 2/28/21 9:39 PM, Like Xu wrote:

Hi Genius,

I am a user of QEMU v4.2.0 and am stuck on an interesting bug, which may 
still exist in the mainline.

Thanks in advance to heroes who can take a look and share understanding.



Do you have a test case that reproduces on 5.2? It'd be nice to know if it 
was still a problem in the latest source tree or not.


We narrowed down the source of the bug, which basically came from
the following qmp usage:

{'execute': 'human-monitor-command', 'arguments':{ 'command-line': 
'drive_del replication0' } }


One of the test cases is the COLO usage (docs/colo-proxy.txt).

This issue is sporadic; the probability may be about 1/15 for an I/O-heavy guest.

I believe it's reproducible on 5.2 and the latest tree.



--js


The QEMU main thread hangs endlessly while handling the QMP command:
{'execute': 'human-monitor-command', 'arguments':{ 'command-line': 
'drive_del replication0' } }

and the call trace looks like:

#0 0x7f3c22045bf6 in __ppoll (fds=0x555611328410, nfds=1, 
timeout=, timeout@entry=0x7ffc56c66db0,

sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:44
#1 0x55561021f415 in ppoll (__ss=0x0, __timeout=0x7ffc56c66db0, 
__nfds=, __fds=)

at /usr/include/x86_64-linux-gnu/bits/poll2.h:77
#2 qemu_poll_ns (fds=, nfds=, 
timeout=) at util/qemu-timer.c:348
#3 0x555610221430 in aio_poll (ctx=ctx@entry=0x5556113010f0, 
blocking=blocking@entry=true) at util/aio-posix.c:669
#4 0x55561019268d in bdrv_do_drained_begin (poll=true, 
ignore_bds_parents=false, parent=0x0, recursive=false,

bs=0x55561138b0a0) at block/io.c:430
#5 bdrv_do_drained_begin (bs=0x55561138b0a0, recursive=, 
parent=0x0, ignore_bds_parents=,

poll=) at block/io.c:396
#6 0x55561017b60b in quorum_del_child (bs=0x55561138b0a0, 
child=0x7f36dc0ce380, errp=)

at block/quorum.c:1063
#7 0x55560ff5836b in qmp_x_blockdev_change (parent=0x555612373120 
"colo-disk0", has_child=,
child=0x5556112df3e0 "children.1", has_node=, node=0x0, 
errp=0x7ffc56c66f98) at blockdev.c:4494
#8 0x5556100f8f57 in qmp_marshal_x_blockdev_change (args=out>, ret=, errp=0x7ffc56c67018)

at qapi/qapi-commands-block-core.c:1538
#9 0x5556101d8290 in do_qmp_dispatch (errp=0x7ffc56c67010, 
allow_oob=, request=,

cmds=0x5556109c69a0 ) at qapi/qmp-dispatch.c:132
#10 qmp_dispatch (cmds=0x5556109c69a0 , request=out>, allow_oob=)

at qapi/qmp-dispatch.c:175
#11 0x5556100d4c4d in monitor_qmp_dispatch (mon=0x5556113a6f40, 
req=) at monitor/qmp.c:145
#12 0x5556100d5437 in monitor_qmp_bh_dispatcher (data=out>) at monitor/qmp.c:234
#13 0x55561021dbec in aio_bh_call (bh=0x5556112164b0) at 
util/async.c:117

#14 aio_bh_poll (ctx=ctx@entry=0x5556112151b0) at util/async.c:117
#15 0x5556102212c4 in aio_dispatch (ctx=0x5556112151b0) at 
util/aio-posix.c:459
#16 0x55561021dab2 in aio_ctx_dispatch (source=, 
callback=, user_data=)

at util/async.c:260
#17 0x7f3c22302fbd in g_main_context_dispatch () from 
/lib/x86_64-linux-gnu/libglib-2.0.so.0

#18 0x555610220358 in glib_pollfds_poll () at util/main-loop.c:219
#19 os_host_main_loop_wait (timeout=) at util/main-loop.c:242
#20 main_loop_wait (nonblocking=) at util/main-loop.c:518
#21 0x55560ff600fe in main_loop () at vl.c:1814
#22 0x55560fddbce9 in main (argc=, argv=out>, envp=) at vl.c:4503


We found that the thread loops endlessly on this line of 
block/io.c:bdrv_do_drained_begin():

 BDRV_POLL_WHILE(bs, bdrv_drain_poll_top_level(bs, recursive, parent));
and it turns out that bdrv_drain_poll() always gets true from:
- bdrv_parent_drained_poll(bs, ignore_parent, ignore_bds_parents)
- AND atomic_read(&bs->in_flight)

I personally think this is a deadlock issue in the QEMU block layer
(as we know, there are some FIXME comments in the related code, such as the 
block permission update).

Any comments are welcome and appreciated.

---
thx,likexu








Re: [PATCH v2 1/2] target/i386: add "-cpu, lbr-fmt=*" support to enable guest LBR

2021-02-28 Thread Like Xu

Hi Paolo & Eduardo,

Do you have any comments on the QEMU LBR enabling patches?

https://lore.kernel.org/qemu-devel/20210201045453.240258-1-like...@linux.intel.com/

On 2021/2/1 12:54, Like Xu wrote:

The last branch recording (LBR) is a performance monitor unit (PMU)
feature on Intel processors that records a running trace of the most
recent branches taken by the processor in the LBR stack. The QEMU
could configure whether it's enabled or not for each guest via CLI.

The LBR feature would be enabled on the guest if:
- the KVM is enabled and the PMU is enabled and,
- the msr-based-feature IA32_PERF_CAPABILITIES is supported on KVM and,
- the supported returned value for lbr_fmt from this msr is not zero and,
- the requested guest vcpu model does support FEAT_1_ECX.CPUID_EXT_PDCM,
- the configured lbr-fmt value is the same as the host lbr_fmt value
   or use the QEMU option "-cpu host,migratable=no".

Cc: Eduardo Habkost 
Cc: Paolo Bonzini 
Signed-off-by: Like Xu 
---
  target/i386/cpu.c | 16 
  target/i386/cpu.h | 10 ++
  target/i386/kvm/kvm.c |  5 +++--
  3 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index ae89024d36..80a5d3f0c2 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6504,6 +6504,13 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool 
verbose)
  x86_cpu_get_supported_feature_word(w, false);
  uint64_t requested_features = env->features[w];
  uint64_t unavailable_features = requested_features & ~host_feat;
+if (kvm_enabled() && w == FEAT_PERF_CAPABILITIES &&
+(requested_features & PERF_CAP_LBR_FMT)) {
+if ((host_feat & PERF_CAP_LBR_FMT) !=
+(requested_features & PERF_CAP_LBR_FMT)) {
+unavailable_features |= PERF_CAP_LBR_FMT;
+}
+}
  mark_unavailable_features(cpu, w, unavailable_features, prefix);
  }
  
@@ -6611,6 +6618,14 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)

  }
  }
  
+if (cpu->lbr_fmt) {

+if (!cpu->enable_pmu) {
+error_setg(errp, "LBR is unsupported since guest PMU is 
disabled.");
+return;
+}
+env->features[FEAT_PERF_CAPABILITIES] |= cpu->lbr_fmt;
+}
+
  /* mwait extended info: needed for Core compatibility */
  /* We always wake on interrupt even if host does not have the capability 
*/
  cpu->mwait.ecx |= CPUID_MWAIT_EMX | CPUID_MWAIT_IBE;
@@ -7184,6 +7199,7 @@ static Property x86_cpu_properties[] = {
  #endif
  DEFINE_PROP_INT32("node-id", X86CPU, node_id, CPU_UNSET_NUMA_NODE_ID),
  DEFINE_PROP_BOOL("pmu", X86CPU, enable_pmu, false),
+DEFINE_PROP_UINT8("lbr-fmt", X86CPU, lbr_fmt, 0),
  
  DEFINE_PROP_UINT32("hv-spinlocks", X86CPU, hyperv_spinlock_attempts,

 HYPERV_SPINLOCK_NEVER_NOTIFY),
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index d23a5b340a..64320bced2 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -354,6 +354,7 @@ typedef enum X86Seg {
  #define ARCH_CAP_TSX_CTRL_MSR (1<<7)
  
  #define MSR_IA32_PERF_CAPABILITIES  0x345

+#define PERF_CAP_LBR_FMT  0x3f
  
  #define MSR_IA32_TSX_CTRL		0x122

  #define MSR_IA32_TSCDEADLINE0x6e0
@@ -1709,6 +1710,15 @@ struct X86CPU {
   */
  bool enable_pmu;
  
+/*

+ * Configure LBR_FMT bits on IA32_PERF_CAPABILITIES MSR.
+ * This can't be enabled by default yet because it doesn't have
+ * ABI stability guarantees, as it is only allowed to pass all
+ * LBR_FMT bits returned by kvm_arch_get_supported_msr_feature()
+ * (that depends on host CPU and kernel capabilities) to the guest.
+ */
+uint8_t lbr_fmt;
+
  /* LMCE support can be enabled/disabled via cpu option 'lmce=on/off'. It 
is
   * disabled by default to avoid breaking migration between QEMU with
   * different LMCE configurations.
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 6dc1ee052d..49745efb78 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -2705,8 +2705,9 @@ static void kvm_msr_entry_add_perf(X86CPU *cpu, 
FeatureWordArray f)
 MSR_IA32_PERF_CAPABILITIES);
  
  if (kvm_perf_cap) {

-kvm_msr_entry_add(cpu, MSR_IA32_PERF_CAPABILITIES,
-kvm_perf_cap & f[FEAT_PERF_CAPABILITIES]);
+kvm_perf_cap = cpu->migratable ?
+(kvm_perf_cap & f[FEAT_PERF_CAPABILITIES]) : kvm_perf_cap;
+kvm_msr_entry_add(cpu, MSR_IA32_PERF_CAPABILITIES, kvm_perf_cap);
  }
  }
  






[RESEND][BUG FIX HELP] QEMU main thread endlessly hangs in __ppoll()

2021-02-28 Thread Like Xu

Hi Genius,

I am a user of QEMU v4.2.0 and am stuck on an interesting bug, which may still 
exist in the mainline.

Thanks in advance to heroes who can take a look and share understanding.

The QEMU main thread hangs endlessly while handling the QMP command:
{'execute': 'human-monitor-command', 'arguments':{ 'command-line': 
'drive_del replication0' } }

and the call trace looks like:

#0 0x7f3c22045bf6 in __ppoll (fds=0x555611328410, nfds=1, 
timeout=, timeout@entry=0x7ffc56c66db0,

sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:44
#1 0x55561021f415 in ppoll (__ss=0x0, __timeout=0x7ffc56c66db0, 
__nfds=, __fds=)

at /usr/include/x86_64-linux-gnu/bits/poll2.h:77
#2 qemu_poll_ns (fds=, nfds=, 
timeout=) at util/qemu-timer.c:348
#3 0x555610221430 in aio_poll (ctx=ctx@entry=0x5556113010f0, 
blocking=blocking@entry=true) at util/aio-posix.c:669
#4 0x55561019268d in bdrv_do_drained_begin (poll=true, 
ignore_bds_parents=false, parent=0x0, recursive=false,

bs=0x55561138b0a0) at block/io.c:430
#5 bdrv_do_drained_begin (bs=0x55561138b0a0, recursive=, 
parent=0x0, ignore_bds_parents=,

poll=) at block/io.c:396
#6 0x55561017b60b in quorum_del_child (bs=0x55561138b0a0, 
child=0x7f36dc0ce380, errp=)

at block/quorum.c:1063
#7 0x55560ff5836b in qmp_x_blockdev_change (parent=0x555612373120 
"colo-disk0", has_child=,
child=0x5556112df3e0 "children.1", has_node=, node=0x0, 
errp=0x7ffc56c66f98) at blockdev.c:4494
#8 0x5556100f8f57 in qmp_marshal_x_blockdev_change (args=out>, ret=, errp=0x7ffc56c67018)

at qapi/qapi-commands-block-core.c:1538
#9 0x5556101d8290 in do_qmp_dispatch (errp=0x7ffc56c67010, 
allow_oob=, request=,

cmds=0x5556109c69a0 ) at qapi/qmp-dispatch.c:132
#10 qmp_dispatch (cmds=0x5556109c69a0 , request=out>, allow_oob=)

at qapi/qmp-dispatch.c:175
#11 0x5556100d4c4d in monitor_qmp_dispatch (mon=0x5556113a6f40, 
req=) at monitor/qmp.c:145
#12 0x5556100d5437 in monitor_qmp_bh_dispatcher (data=) 
at monitor/qmp.c:234
#13 0x55561021dbec in aio_bh_call (bh=0x5556112164b0) at 
util/async.c:117

#14 aio_bh_poll (ctx=ctx@entry=0x5556112151b0) at util/async.c:117
#15 0x5556102212c4 in aio_dispatch (ctx=0x5556112151b0) at 
util/aio-posix.c:459
#16 0x55561021dab2 in aio_ctx_dispatch (source=, 
callback=, user_data=)

at util/async.c:260
#17 0x7f3c22302fbd in g_main_context_dispatch () from 
/lib/x86_64-linux-gnu/libglib-2.0.so.0

#18 0x555610220358 in glib_pollfds_poll () at util/main-loop.c:219
#19 os_host_main_loop_wait (timeout=) at util/main-loop.c:242
#20 main_loop_wait (nonblocking=) at util/main-loop.c:518
#21 0x55560ff600fe in main_loop () at vl.c:1814
#22 0x55560fddbce9 in main (argc=, argv=, 
envp=) at vl.c:4503


We found that the thread loops endlessly on this line of 
block/io.c:bdrv_do_drained_begin():

BDRV_POLL_WHILE(bs, bdrv_drain_poll_top_level(bs, recursive, parent));
and it turns out that bdrv_drain_poll() always gets true from:
- bdrv_parent_drained_poll(bs, ignore_parent, ignore_bds_parents)
- AND atomic_read(&bs->in_flight)
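
A toy model of the polling condition described above may help show why the
loop never exits. It is not QEMU code: Node, drain_poll(), drain_bounded() and
the two step callbacks are hypothetical names, and the sketch assumes that
either condition alone is enough to keep the poll returning true. The
iteration bound exists only so that the example terminates.

```c
#include <stdbool.h>

typedef struct {
    unsigned in_flight; /* requests still outstanding on this node */
    bool parent_busy;   /* a parent still reports pending activity */
} Node;

/* True while the node cannot be considered drained yet. */
static bool drain_poll(const Node *n)
{
    return n->parent_busy || n->in_flight > 0;
}

/* One event-loop step that completes a single outstanding request. */
static void complete_one(Node *n)
{
    if (n->in_flight > 0) {
        n->in_flight--;
    }
}

/* One event-loop step in which no request ever completes. */
static void no_progress(Node *n)
{
    (void)n;
}

/*
 * Poll until the node is drained, giving up after max_iters steps.
 * Returns true if the node drained, false if the bound was hit.
 */
static bool drain_bounded(Node *n, void (*step)(Node *), unsigned max_iters)
{
    for (unsigned i = 0; drain_poll(n); i++) {
        if (i == max_iters) {
            return false;
        }
        step(n);
    }
    return true;
}
```

If nothing ever lowers in_flight (the no_progress case), an unbounded version
of this loop spins forever, which matches the hang seen in the trace.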

I personally think this is a deadlock issue in the QEMU block layer
(as we know, there are some FIXME comments in the related code, such as the 
block permission update).

Any comments are welcome and appreciated.

---
thx,likexu



[PATCH v2 1/2] target/i386: add "-cpu, lbr-fmt=*" support to enable guest LBR

2021-01-31 Thread Like Xu
The last branch recording (LBR) is a performance monitor unit (PMU)
feature on Intel processors that records a running trace of the most
recent branches taken by the processor in the LBR stack. QEMU can
configure via the CLI whether it is enabled for each guest.

The LBR feature is enabled on the guest if:
- KVM is enabled and the PMU is enabled, and
- the MSR-based feature IA32_PERF_CAPABILITIES is supported by KVM, and
- the lbr_fmt value that this MSR reports as supported is not zero, and
- the requested guest vCPU model supports FEAT_1_ECX.CPUID_EXT_PDCM, and
- the configured lbr-fmt value is the same as the host lbr_fmt value,
  or the QEMU option "-cpu host,migratable=no" is used.

Cc: Eduardo Habkost 
Cc: Paolo Bonzini 
Signed-off-by: Like Xu 
---
 target/i386/cpu.c | 16 
 target/i386/cpu.h | 10 ++
 target/i386/kvm/kvm.c |  5 +++--
 3 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index ae89024d36..80a5d3f0c2 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6504,6 +6504,13 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool 
verbose)
 x86_cpu_get_supported_feature_word(w, false);
 uint64_t requested_features = env->features[w];
 uint64_t unavailable_features = requested_features & ~host_feat;
+if (kvm_enabled() && w == FEAT_PERF_CAPABILITIES &&
+(requested_features & PERF_CAP_LBR_FMT)) {
+if ((host_feat & PERF_CAP_LBR_FMT) !=
+(requested_features & PERF_CAP_LBR_FMT)) {
+unavailable_features |= PERF_CAP_LBR_FMT;
+}
+}
 mark_unavailable_features(cpu, w, unavailable_features, prefix);
 }
 
@@ -6611,6 +6618,14 @@ static void x86_cpu_realizefn(DeviceState *dev, Error 
**errp)
 }
 }
 
+if (cpu->lbr_fmt) {
+if (!cpu->enable_pmu) {
+error_setg(errp, "LBR is unsupported since guest PMU is disabled.");
+return;
+}
+env->features[FEAT_PERF_CAPABILITIES] |= cpu->lbr_fmt;
+}
+
 /* mwait extended info: needed for Core compatibility */
 /* We always wake on interrupt even if host does not have the capability */
 cpu->mwait.ecx |= CPUID_MWAIT_EMX | CPUID_MWAIT_IBE;
@@ -7184,6 +7199,7 @@ static Property x86_cpu_properties[] = {
 #endif
 DEFINE_PROP_INT32("node-id", X86CPU, node_id, CPU_UNSET_NUMA_NODE_ID),
 DEFINE_PROP_BOOL("pmu", X86CPU, enable_pmu, false),
+DEFINE_PROP_UINT8("lbr-fmt", X86CPU, lbr_fmt, 0),
 
 DEFINE_PROP_UINT32("hv-spinlocks", X86CPU, hyperv_spinlock_attempts,
HYPERV_SPINLOCK_NEVER_NOTIFY),
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index d23a5b340a..64320bced2 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -354,6 +354,7 @@ typedef enum X86Seg {
 #define ARCH_CAP_TSX_CTRL_MSR  (1<<7)
 
 #define MSR_IA32_PERF_CAPABILITIES  0x345
+#define PERF_CAP_LBR_FMT  0x3f
 
 #define MSR_IA32_TSX_CTRL  0x122
 #define MSR_IA32_TSCDEADLINE0x6e0
@@ -1709,6 +1710,15 @@ struct X86CPU {
  */
 bool enable_pmu;
 
+/*
+ * Configure LBR_FMT bits on IA32_PERF_CAPABILITIES MSR.
+ * This can't be enabled by default yet because it doesn't have
+ * ABI stability guarantees, as it is only allowed to pass all
+ * LBR_FMT bits returned by kvm_arch_get_supported_msr_feature()
+ * (that depends on host CPU and kernel capabilities) to the guest.
+ */
+uint8_t lbr_fmt;
+
 /* LMCE support can be enabled/disabled via cpu option 'lmce=on/off'. It is
  * disabled by default to avoid breaking migration between QEMU with
  * different LMCE configurations.
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 6dc1ee052d..49745efb78 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -2705,8 +2705,9 @@ static void kvm_msr_entry_add_perf(X86CPU *cpu, 
FeatureWordArray f)
MSR_IA32_PERF_CAPABILITIES);
 
 if (kvm_perf_cap) {
-kvm_msr_entry_add(cpu, MSR_IA32_PERF_CAPABILITIES,
-kvm_perf_cap & f[FEAT_PERF_CAPABILITIES]);
+kvm_perf_cap = cpu->migratable ?
+(kvm_perf_cap & f[FEAT_PERF_CAPABILITIES]) : kvm_perf_cap;
+kvm_msr_entry_add(cpu, MSR_IA32_PERF_CAPABILITIES, kvm_perf_cap);
 }
 }
 
-- 
2.29.2
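
The enabling conditions listed in the commit message, together with the
kvm_msr_entry_add_perf() hunk, can be sketched as two small standalone
functions. This is one reading of the patch under stated assumptions, not
QEMU code: LbrConfig, lbr_would_be_enabled() and guest_perf_cap() are
hypothetical names, and the inputs that QEMU keeps in X86CPU and KVM state
are flattened into a single struct. Only PERF_CAP_LBR_FMT (0x3f) is taken
from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PERF_CAP_LBR_FMT 0x3fULL  /* LBR_FMT field, as defined in the patch */

/* Hypothetical flattened view of the inputs; not a QEMU structure. */
typedef struct LbrConfig {
    bool kvm_enabled;
    bool pmu_enabled;
    bool pdcm;               /* guest model has CPUID.1:ECX.PDCM */
    bool migratable;         /* false models "-cpu host,migratable=no" */
    uint64_t host_perf_cap;  /* IA32_PERF_CAPABILITIES from KVM, 0 if unsupported */
    uint8_t lbr_fmt;         /* value of the "lbr-fmt" property */
} LbrConfig;

/* Applies the commit-message conditions in order. */
static bool lbr_would_be_enabled(const LbrConfig *c)
{
    uint64_t host_fmt;

    assert(c != NULL);
    host_fmt = c->host_perf_cap & PERF_CAP_LBR_FMT;

    if (!c->kvm_enabled || !c->pmu_enabled || !c->pdcm || host_fmt == 0) {
        return false;
    }
    if (!c->migratable) {
        return true;  /* the host value is passed through unfiltered */
    }
    return (c->lbr_fmt & PERF_CAP_LBR_FMT) == host_fmt;
}

/* Mirrors the value selection in the kvm_msr_entry_add_perf() hunk. */
static uint64_t guest_perf_cap(const LbrConfig *c, uint64_t requested)
{
    assert(c != NULL);
    return c->migratable ? (c->host_perf_cap & requested) : c->host_perf_cap;
}
```

For a host reporting 0x2005, a migratable guest with lbr-fmt=0x5 qualifies
and the guest MSR is the masked value, while lbr-fmt=0x4 is rejected unless
migratable is off, in which case the guest MSR is the full host value.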




[PATCH v2 2/2] target/i386: add kvm_exact_match_flags to FeatureWordInfo

2021-01-31 Thread Like Xu
Eduardo has a suggestion: instead of hardcoding the
PERF_CAPABILITIES rules in this loop, this could become a
FeatureWordInfo field. It would be very useful for other
features like intel-pt, where we need some bits to match the
host too.

Suggested-by: Eduardo Habkost 
Signed-off-by: Like Xu 
---
 target/i386/cpu.c | 21 +++--
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 80a5d3f0c2..8eaa5879ea 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -708,6 +708,8 @@ typedef struct FeatureWordInfo {
 uint64_t migratable_flags; /* Feature flags known to be migratable */
 /* Features that shouldn't be auto-enabled by "-cpu host" */
 uint64_t no_autoenable_flags;
+/* Bits that must match host exactly when using KVM */
+uint64_t kvm_exact_match_flags;
 } FeatureWordInfo;
 
 static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
@@ -1147,6 +1149,11 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] 
= {
 .msr = {
 .index = MSR_IA32_PERF_CAPABILITIES,
 },
+/*
+ * KVM is not able to emulate a VCPU with LBR_FMT different
+ * from the host, so LBR_FMT must match the host exactly.
+ */
+.kvm_exact_match_flags = PERF_CAP_LBR_FMT,
 },
 
 [FEAT_VMX_PROCBASED_CTLS] = {
@@ -6500,16 +6507,18 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool 
verbose)
 }
 
 for (w = 0; w < FEATURE_WORDS; w++) {
+FeatureWordInfo *fi = &feature_word_info[w];
+uint64_t match_flags = fi->kvm_exact_match_flags;
 uint64_t host_feat =
 x86_cpu_get_supported_feature_word(w, false);
 uint64_t requested_features = env->features[w];
 uint64_t unavailable_features = requested_features & ~host_feat;
-if (kvm_enabled() && w == FEAT_PERF_CAPABILITIES &&
-(requested_features & PERF_CAP_LBR_FMT)) {
-if ((host_feat & PERF_CAP_LBR_FMT) !=
-(requested_features & PERF_CAP_LBR_FMT)) {
-unavailable_features |= PERF_CAP_LBR_FMT;
-}
+if (kvm_enabled() && match_flags) {
+uint64_t mismatches = (requested_features & match_flags) &&
+(requested_features ^ host_feat) & match_flags;
+mark_unavailable_features(cpu, w,
+mismatches, "feature doesn't match host");
+unavailable_features &= ~match_flags;
 }
 mark_unavailable_features(cpu, w, unavailable_features, prefix);
 }
-- 
2.29.2
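
One detail in the x86_cpu_filter_features() hunk may be worth checking: as
posted, the two terms of "mismatches" are joined with "&&", and in C a
logical AND evaluates to 0 or 1, so the value passed to
mark_unavailable_features() would be at most bit 0 rather than the set of
differing bits. Whether that was intended is not stated in the thread. The
standalone sketch below contrasts the two forms; both function names are
hypothetical and this is not the code that was merged.

```c
#include <stdint.h>

/*
 * Bits inside match_flags that the guest requested differently from the
 * host. Returns 0 when nothing inside match_flags was requested at all.
 */
static uint64_t exact_match_mismatches(uint64_t requested, uint64_t host,
                                       uint64_t match_flags)
{
    if ((requested & match_flags) == 0) {
        return 0;
    }
    return (requested ^ host) & match_flags;
}

/*
 * The same two terms joined with a logical AND, as in the posted hunk.
 * The result is a truth value (0 or 1), not a bit mask.
 */
static uint64_t exact_match_mismatches_logical(uint64_t requested,
                                               uint64_t host,
                                               uint64_t match_flags)
{
    return (requested & match_flags) && ((requested ^ host) & match_flags);
}
```

With requested = 0x06, host = 0x05 and match_flags = 0x3f, the first form
returns 0x03 (bits 0 and 1 differ) and the second returns 1.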




Re: [PATCH 4/5 v4] KVM: VMX: Fill in conforming vmx_x86_ops via macro

2020-11-09 Thread Like Xu

Hi Krish,

On 2020/11/10 9:23, Krish Sadhukhan wrote:

@@ -1192,7 +1192,7 @@ void vmx_set_host_fs_gs(struct vmcs_host_state *host, u16 
fs_sel, u16 gs_sel,
}
  }
  
-void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)

+void vmx_prepare_guest_switch(struct kvm_vcpu *vcpu)


What do you think of renaming it to

void vmx_prepare_switch_for_guest(struct kvm_vcpu *vcpu);

?

Thanks,
Like Xu


  {
struct vcpu_vmx *vmx = to_vmx(vcpu);
struct vmcs_host_state *host_state;

@@ -311,7 +311,7 @@ void vmx_vcpu_load_vmcs(struct kvm_vcpu *vcpu, int cpu,
  int allocate_vpid(void);
  void free_vpid(int vpid);
  void vmx_set_constant_host_state(struct vcpu_vmx *vmx);
-void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu);
+void vmx_prepare_guest_switch(struct kvm_vcpu *vcpu);
  void vmx_set_host_fs_gs(struct vmcs_host_state *host, u16 fs_sel, u16 gs_sel,
unsigned long fs_base, unsigned long gs_base);
  int vmx_get_cpl(struct kvm_vcpu *vcpu);





Re: [Qemu-devel PATCH v2] target/i386: add "-cpu,lbr-fmt=*" support to enable guest LBR

2020-09-29 Thread Like Xu

Hi Eduardo,

On 2020/9/30 1:38, Eduardo Habkost wrote:

(CCing the people from the thread, as kvm_exact_match_flags would
be useful for INTEL_PT_IP_LIP)

On Tue, Sep 29, 2020 at 02:12:17PM +0800, Like Xu wrote:

The last branch recording (LBR) is a performance monitor unit (PMU)
feature on Intel processors that records a running trace of the most
recent branches taken by the processor in the LBR stack. QEMU can
configure via the CLI whether it is enabled for each guest.

The LBR feature is enabled on the guest if:
- KVM is enabled and the PMU is enabled, and
- the MSR-based feature IA32_PERF_CAPABILITIES is supported by KVM, and
- the lbr_fmt value that this MSR reports as supported is not zero, and
- the requested guest vCPU model supports FEAT_1_ECX.CPUID_EXT_PDCM, and
- the configured lbr-fmt value is the same as the host lbr_fmt value.

Cc: Eduardo Habkost 
Cc: Paolo Bonzini 
Signed-off-by: Like Xu 


The approach below looks better, thanks!  Only one problem below,
with a few suggestions and questions:


---
  target/i386/cpu.c | 16 
  target/i386/cpu.h | 10 ++
  2 files changed, 26 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 3ffd877dd5..b10344be01 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6461,6 +6461,13 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool 
verbose)
  x86_cpu_get_supported_feature_word(w, false);
  uint64_t requested_features = env->features[w];
  uint64_t unavailable_features = requested_features & ~host_feat;
+if (w == FEAT_PERF_CAPABILITIES &&
+(requested_features & PERF_CAP_LBR_FMT)) {
+if ((host_feat & PERF_CAP_LBR_FMT) !=
+(requested_features & PERF_CAP_LBR_FMT)) {
+unavailable_features |= PERF_CAP_LBR_FMT;
+}
+}


This looks correct, but needs to be conditional on kvm_enabled().

I also have a suggestion: instead of hardcoding the
PERF_CAPABILITIES rules in this loop, this could become a
FeatureWordInfo field.  It would be very useful for other
features like intel-pt, where we need some bits to match the host
too.


The idea looks good to me.



Could you please check if the following patch works?

Signed-off-by: Eduardo Habkost 
---
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index b10344be010..d4107dcc026 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -704,6 +704,8 @@ typedef struct FeatureWordInfo {
  uint64_t migratable_flags; /* Feature flags known to be migratable */
  /* Features that shouldn't be auto-enabled by "-cpu host" */
  uint64_t no_autoenable_flags;
+/* Bits that must match host exactly when using KVM */
+uint64_t kvm_exact_match_flags;
  } FeatureWordInfo;
  
  static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {

@@ -1143,6 +1145,11 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] 
= {
  .msr = {
  .index = MSR_IA32_PERF_CAPABILITIES,
  },
+/*
+ * KVM is not able to emulate a VCPU with LBR_FMT different
+ * from the host, so LBR_FMT must match the host exactly.
+ */
+.kvm_exact_match_flags = PERF_CAP_LBR_FMT,
  },
  
  [FEAT_VMX_PROCBASED_CTLS] = {

@@ -6457,16 +6464,15 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool 
verbose)
  }
  
  for (w = 0; w < FEATURE_WORDS; w++) {

+FeatureWordInfo *fi = &feature_word_info[w];
  uint64_t host_feat =
  x86_cpu_get_supported_feature_word(w, false);
  uint64_t requested_features = env->features[w];
  uint64_t unavailable_features = requested_features & ~host_feat;
-if (w == FEAT_PERF_CAPABILITIES &&
-(requested_features & PERF_CAP_LBR_FMT)) {
-if ((host_feat & PERF_CAP_LBR_FMT) !=
-(requested_features & PERF_CAP_LBR_FMT)) {
-unavailable_features |= PERF_CAP_LBR_FMT;
-}
+if (kvm_enabled()) {
+uint64_t mismatches = (requested_features ^ host_feat) &
+  fi->kvm_exact_match_flags;
+mark_unavailable_features(cpu, w, mismatches, "feature doesn't match host");
  }
  mark_unavailable_features(cpu, w, unavailable_features, prefix);
  }
   
---


I may refine this part in this way:

for (w = 0; w < FEATURE_WORDS; w++) {
FeatureWordInfo *fi = &feature_word_info[w];
uint64_t match_flags = fi->kvm_exact_match_flags;
uint64_t host_feat =
x86_cpu_get_supported_feature_word(w, false);
uint64_t requested_features = env->features[w];
uint64_t unavailable_features = requested_features & ~host_feat;
if (kvm_enabled() && match_flags) {
uint64_t mismatches = (requested_features & match_flags) &a

[Qemu-devel PATCH v2] target/i386: add "-cpu, lbr-fmt=*" support to enable guest LBR

2020-09-29 Thread Like Xu
The last branch recording (LBR) is a performance monitor unit (PMU)
feature on Intel processors that records a running trace of the most
recent branches taken by the processor in the LBR stack. QEMU can
configure via the CLI whether it is enabled for each guest.

The LBR feature is enabled on the guest if:
- KVM is enabled and the PMU is enabled, and
- the MSR-based feature IA32_PERF_CAPABILITIES is supported by KVM, and
- the lbr_fmt value that this MSR reports as supported is not zero, and
- the requested guest vCPU model supports FEAT_1_ECX.CPUID_EXT_PDCM, and
- the configured lbr-fmt value is the same as the host lbr_fmt value.

Cc: Eduardo Habkost 
Cc: Paolo Bonzini 
Signed-off-by: Like Xu 
---
 target/i386/cpu.c | 16 
 target/i386/cpu.h | 10 ++
 2 files changed, 26 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 3ffd877dd5..b10344be01 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6461,6 +6461,13 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool 
verbose)
 x86_cpu_get_supported_feature_word(w, false);
 uint64_t requested_features = env->features[w];
 uint64_t unavailable_features = requested_features & ~host_feat;
+if (w == FEAT_PERF_CAPABILITIES &&
+(requested_features & PERF_CAP_LBR_FMT)) {
+if ((host_feat & PERF_CAP_LBR_FMT) !=
+(requested_features & PERF_CAP_LBR_FMT)) {
+unavailable_features |= PERF_CAP_LBR_FMT;
+}
+}
 mark_unavailable_features(cpu, w, unavailable_features, prefix);
 }
 
@@ -6533,6 +6540,14 @@ static void x86_cpu_realizefn(DeviceState *dev, Error 
**errp)
 }
 }
 
+if (cpu->lbr_fmt) {
+if (!cpu->enable_pmu) {
+error_setg(errp, "LBR is unsupported since guest PMU is disabled.");
+return;
+}
+env->features[FEAT_PERF_CAPABILITIES] |= cpu->lbr_fmt;
+}
+
 /* mwait extended info: needed for Core compatibility */
 /* We always wake on interrupt even if host does not have the capability */
 cpu->mwait.ecx |= CPUID_MWAIT_EMX | CPUID_MWAIT_IBE;
@@ -7157,6 +7172,7 @@ static Property x86_cpu_properties[] = {
 #endif
 DEFINE_PROP_INT32("node-id", X86CPU, node_id, CPU_UNSET_NUMA_NODE_ID),
 DEFINE_PROP_BOOL("pmu", X86CPU, enable_pmu, false),
+DEFINE_PROP_UINT8("lbr-fmt", X86CPU, lbr_fmt, 0),
 
 DEFINE_PROP_UINT32("hv-spinlocks", X86CPU, hyperv_spinlock_attempts,
HYPERV_SPINLOCK_NEVER_NOTIFY),
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index f519d2bfd4..c1cf8b7160 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -357,6 +357,7 @@ typedef enum X86Seg {
 #define ARCH_CAP_TSX_CTRL_MSR  (1<<7)
 
 #define MSR_IA32_PERF_CAPABILITIES  0x345
+#define PERF_CAP_LBR_FMT  0x3f
 
 #define MSR_IA32_TSX_CTRL  0x122
 #define MSR_IA32_TSCDEADLINE0x6e0
@@ -1701,6 +1702,15 @@ struct X86CPU {
  */
 bool enable_pmu;
 
+/*
+ * Configure LBR_FMT bits on IA32_PERF_CAPABILITIES MSR.
+ * This can't be enabled by default yet because it doesn't have
+ * ABI stability guarantees, as it is only allowed to pass all
+ * LBR_FMT bits returned by kvm_arch_get_supported_msr_feature()
+ * (that depends on host CPU and kernel capabilities) to the guest.
+ */
+uint8_t lbr_fmt;
+
 /* LMCE support can be enabled/disabled via cpu option 'lmce=on/off'. It is
  * disabled by default to avoid breaking migration between QEMU with
  * different LMCE configurations.
-- 
2.21.3




[PATCH] target/i386: add -cpu,lbr=true support to enable guest LBR

2020-07-26 Thread Like Xu
The LBR feature is enabled on the guest if:
- KVM is enabled and the PMU is enabled, and
- the MSR-based feature IA32_PERF_CAPABILITIES is supported, and
- the lbr_fmt value that this MSR reports as supported is not zero.

The LBR feature is disabled on the guest if:
- the MSR-based feature IA32_PERF_CAPABILITIES is unsupported, OR
- QEMU sets the IA32_PERF_CAPABILITIES MSR feature without lbr_fmt values, OR
- the requested guest vCPU model doesn't support PDCM.

Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Cc: "Michael S. Tsirkin" 
Cc: Marcel Apfelbaum 
Cc: Marcelo Tosatti 
Cc: qemu-devel@nongnu.org
Signed-off-by: Like Xu 
---
 hw/i386/pc.c  |  1 +
 target/i386/cpu.c | 24 ++--
 target/i386/cpu.h |  2 ++
 target/i386/kvm.c |  7 ++-
 4 files changed, 31 insertions(+), 3 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 3d419d5991..857aff75bb 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -318,6 +318,7 @@ GlobalProperty pc_compat_1_5[] = {
 { "Nehalem-" TYPE_X86_CPU, "min-level", "2" },
 { "virtio-net-pci", "any_layout", "off" },
 { TYPE_X86_CPU, "pmu", "on" },
+{ TYPE_X86_CPU, "lbr", "on" },
 { "i440FX-pcihost", "short_root_bus", "0" },
 { "q35-pcihost", "short_root_bus", "0" },
 };
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 588f32e136..c803994887 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1142,8 +1142,8 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = 
{
 [FEAT_PERF_CAPABILITIES] = {
 .type = MSR_FEATURE_WORD,
 .feat_names = {
-NULL, NULL, NULL, NULL,
-NULL, NULL, NULL, NULL,
+"lbr-fmt-bit-0", "lbr-fmt-bit-1", "lbr-fmt-bit-2", "lbr-fmt-bit-3",
+"lbr-fmt-bit-4", "lbr-fmt-bit-5", NULL, NULL,
 NULL, NULL, NULL, NULL,
 NULL, "full-width-write", NULL, NULL,
 NULL, NULL, NULL, NULL,
@@ -4224,6 +4224,12 @@ static bool lmce_supported(void)
 return !!(mce_cap & MCG_LMCE_P);
 }
 
+static inline bool lbr_supported(void)
+{
+return kvm_enabled() && (kvm_arch_get_supported_msr_feature(kvm_state,
+MSR_IA32_PERF_CAPABILITIES) & PERF_CAP_LBR_FMT);
+}
+
 #define CPUID_MODEL_ID_SZ 48
 
 /**
@@ -4327,6 +4333,9 @@ static void max_x86_cpu_initfn(Object *obj)
 }
 
 object_property_set_bool(OBJECT(cpu), "pmu", true, &error_abort);
+if (lbr_supported()) {
+object_property_set_bool(OBJECT(cpu), "lbr", true, &error_abort);
+}
 }
 
 static const TypeInfo max_x86_cpu_type_info = {
@@ -5535,6 +5544,10 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, 
uint32_t count,
 }
 if (!cpu->enable_pmu) {
 *ecx &= ~CPUID_EXT_PDCM;
+if (cpu->enable_lbr) {
+warn_report("LBR is unsupported since guest PMU is disabled.");
+exit(1);
+}
 }
 break;
 case 2:
@@ -6553,6 +6566,12 @@ static void x86_cpu_realizefn(DeviceState *dev, Error 
**errp)
 }
 }
 
+if (!cpu->max_features && cpu->enable_lbr &&
+!(env->features[FEAT_1_ECX] & CPUID_EXT_PDCM)) {
+warn_report("requested vcpu model doesn't support PDCM for LBR.");
+exit(1);
+}
+
 if (cpu->ucode_rev == 0) {
 /* The default is the same as KVM's.  */
 if (IS_AMD_CPU(env)) {
@@ -7187,6 +7206,7 @@ static Property x86_cpu_properties[] = {
 #endif
 DEFINE_PROP_INT32("node-id", X86CPU, node_id, CPU_UNSET_NUMA_NODE_ID),
 DEFINE_PROP_BOOL("pmu", X86CPU, enable_pmu, false),
+DEFINE_PROP_BOOL("lbr", X86CPU, enable_lbr, false),
 
 DEFINE_PROP_UINT32("hv-spinlocks", X86CPU, hyperv_spinlock_attempts,
HYPERV_SPINLOCK_NEVER_RETRY),
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index e1a5c174dc..a059913e26 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -357,6 +357,7 @@ typedef enum X86Seg {
 #define ARCH_CAP_TSX_CTRL_MSR  (1<<7)
 
 #define MSR_IA32_PERF_CAPABILITIES  0x345
+#define PERF_CAP_LBR_FMT  0x3f
 
 #define MSR_IA32_TSX_CTRL  0x122
 #define MSR_IA32_TSCDEADLINE0x6e0
@@ -1702,6 +1703,7 @@ struct X86CPU {
  * capabilities) directly to the guest.
  */
 bool enable_pmu;
+bool enable_lbr;
 
 /* LMCE support can be enabled/disabled via cpu option 'lmce=on/off'. It is
  * disabled by default to avoid breaking migration between QEMU with
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index b8455c89ed..feb33d5472 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -2690,8 +26

Re: [PATCH 1/2] migration/colo: fix typo in the COLO Framework module

2020-06-14 Thread Like Xu

On 2020/6/15 9:36, Zhanghailiang wrote:

Hi Like,

Please check this patch. It seems that you didn't use the git format-patch
command to generate it, so it is in the wrong format.


I rebased the patch on top of commit
7d3660e79830a069f1848bb4fa1cdf8f666424fb,
and hope it helps.



Thanks,
Hailiang


From 15c19be9be07598d4264a4a84b85d4efa79bff9d Mon Sep 17 00:00:00 2001
From: Like Xu 
Date: Mon, 15 Jun 2020 10:10:57 +0800
Subject: [PATCH 1/2] migration/colo: fix typo in the COLO Framework module

Cc: Hailiang Zhang 
Signed-off-by: Like Xu 
---
 docs/COLO-FT.txt | 8 
 migration/colo.c | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/COLO-FT.txt b/docs/COLO-FT.txt
index c8e1740935..fdc0207cff 100644
--- a/docs/COLO-FT.txt
+++ b/docs/COLO-FT.txt
@@ -10,7 +10,7 @@ See the COPYING file in the top-level directory.
 This document gives an overview of COLO's design and how to use it.

 == Background ==
-Virtual machine (VM) replication is a well known technique for providing
+Virtual machine (VM) replication is a well-known technique for providing
 application-agnostic software-implemented hardware fault tolerance,
 also known as "non-stop service".

@@ -103,7 +103,7 @@ Primary side.

 COLO Proxy:
 Delivers packets to Primary and Secondary, and then compare the responses from
-both side. Then decide whether to start a checkpoint according to some rules.
+both sides. Then decide whether to start a checkpoint according to some rules.
 Please refer to docs/colo-proxy.txt for more information.

 Note:
@@ -146,12 +146,12 @@ in test procedure.

 == Test procedure ==
 Note: Here we are running both instances on the same host for testing,
-change the IP Addresses if you want to run it on two hosts. Initally
+change the IP Addresses if you want to run it on two hosts. Initially
 127.0.0.1 is the Primary Host and 127.0.0.2 is the Secondary Host.

 == Startup qemu ==
 1. Primary:
-Note: Initally, $imagefolder/primary.qcow2 needs to be copied to all hosts.
+Note: Initially, $imagefolder/primary.qcow2 needs to be copied to all hosts.
 You don't need to change any IP's here, because 0.0.0.0 listens on any
 interface. The chardev's with 127.0.0.1 IP's loopback to the local qemu
 instance.
diff --git a/migration/colo.c b/migration/colo.c
index ea7d1e9d4e..80788d46b5 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -632,7 +632,7 @@ out:
 /*
  * It is safe to unregister notifier after failover finished.
  * Besides, colo_delay_timer and colo_checkpoint_sem can't be
- * released befor unregister notifier, or there will be use-after-free
+ * released before unregister notifier, or there will be use-after-free
  * error.
  */
 colo_compare_unregister_notifier(_compare_notifier);
--
2.21.3





[PATCH 2/2] migration/colo/net: fix typo in the COLO Proxy module

2020-06-14 Thread Like Xu
Cc: Zhang Chen 
Cc: Li Zhijian 
Signed-off-by: Like Xu 
---
 docs/colo-proxy.txt | 4 ++--
 net/colo-compare.c  | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/colo-proxy.txt b/docs/colo-proxy.txt
index fa1cef0278..0bbd6f720a 100644
--- a/docs/colo-proxy.txt
+++ b/docs/colo-proxy.txt
@@ -21,7 +21,7 @@ and filter-rewriter compose the COLO-proxy.
 == Architecture ==
 
 COLO-Proxy is based on qemu netfilter and it's a plugin for qemu netfilter
-(except colo-compare). It keep Secondary VM connect normally to
+(except colo-compare). It keeps Secondary VM connect normally to
 client and compare packets sent by PVM with sent by SVM.
 If the packet difference, notify COLO-frame to do checkpoint and send
 all primary packet has queued. Otherwise just send the queued primary
@@ -94,7 +94,7 @@ Redirect Server Filter --> COLO-Compare
 COLO-compare receive primary guest packet then
 waiting secondary redirect packet to compare it.
 If packet same,send queued primary packet and clear
-queued secondary packet, Otherwise send primary packet
+queued secondary packet, otherwise send primary packet
 and do checkpoint.
 
 COLO-Compare --> Another Redirector Filter
diff --git a/net/colo-compare.c b/net/colo-compare.c
index c07e7c1c09..3efc61c777 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -658,7 +658,7 @@ static void colo_compare_packet(CompareState *s, Connection 
*conn,
 g_queue_remove(>secondary_list, result->data);
 } else {
 /*
- * If one packet arrive late, the secondary_list or
+ * If one packet arrives late, the secondary_list or
  * primary_list will be empty, so we can't compare it
  * until next comparison. If the packets in the list are
  * timeout, it will trigger a checkpoint request.
@@ -1296,7 +1296,7 @@ static void colo_compare_finalize(Object *obj)
 }
 }
 
-/* Release all unhandled packets after compare thead exited */
+/* Release all unhandled packets after compare thread exited */
 g_queue_foreach(>conn_list, colo_flush_packets, s);
 
 g_queue_clear(>conn_list);
-- 
2.21.3




[PATCH 1/2] migration/colo: fix typo in the COLO Framework module

2020-06-14 Thread Like Xu
Cc: Hailiang Zhang 
Signed-off-by: Like Xu 
---
 docs/COLO-FT.txt | 8 
 migration/colo.c | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/COLO-FT.txt b/docs/COLO-FT.txt
index c8e1740935..fdc0207cff 100644
--- a/docs/COLO-FT.txt
+++ b/docs/COLO-FT.txt
@@ -10,7 +10,7 @@ See the COPYING file in the top-level directory.
 This document gives an overview of COLO's design and how to use it.
 
 == Background ==
-Virtual machine (VM) replication is a well known technique for providing
+Virtual machine (VM) replication is a well-known technique for providing
 application-agnostic software-implemented hardware fault tolerance,
 also known as "non-stop service".
 
@@ -103,7 +103,7 @@ Primary side.
 
 COLO Proxy:
 Delivers packets to Primary and Secondary, and then compare the responses from
-both side. Then decide whether to start a checkpoint according to some rules.
+both sides. Then decide whether to start a checkpoint according to some rules.
 Please refer to docs/colo-proxy.txt for more information.
 
 Note:
@@ -146,12 +146,12 @@ in test procedure.
 
 == Test procedure ==
 Note: Here we are running both instances on the same host for testing,
-change the IP Addresses if you want to run it on two hosts. Initally
+change the IP Addresses if you want to run it on two hosts. Initially
 127.0.0.1 is the Primary Host and 127.0.0.2 is the Secondary Host.
 
 == Startup qemu ==
 1. Primary:
-Note: Initally, $imagefolder/primary.qcow2 needs to be copied to all hosts.
+Note: Initially, $imagefolder/primary.qcow2 needs to be copied to all hosts.
 You don't need to change any IP's here, because 0.0.0.0 listens on any
 interface. The chardev's with 127.0.0.1 IP's loopback to the local qemu
 instance.
diff --git a/migration/colo.c b/migration/colo.c
index ea7d1e9d4e..80788d46b5 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -632,7 +632,7 @@ out:
 /*
  * It is safe to unregister notifier after failover finished.
  * Besides, colo_delay_timer and colo_checkpoint_sem can't be
- * released befor unregister notifier, or there will be use-after-free
+ * released before unregister notifier, or there will be use-after-free
  * error.
  */
 colo_compare_unregister_notifier(_compare_notifier);
-- 
2.21.3




[Qemu-devel] [PATCH 2/2] target/i386: add -cpu, lbr=true support to enable guest LBR

2020-06-13 Thread Like Xu
The LBR feature is enabled on the guest if:
- KVM is enabled and the PMU is enabled, and
- the MSR-based feature IA32_PERF_CAPABILITIES is supported, and
- the lbr_fmt value that this MSR reports as supported is not zero.

The LBR feature is disabled on the guest if:
- the MSR-based feature IA32_PERF_CAPABILITIES is unsupported, OR
- QEMU sets the IA32_PERF_CAPABILITIES MSR feature without lbr_fmt values, OR
- the requested guest vCPU model doesn't support PDCM.

Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Cc: "Michael S. Tsirkin" 
Cc: Marcel Apfelbaum 
Cc: Marcelo Tosatti 
Cc: qemu-devel@nongnu.org
Signed-off-by: Like Xu 
---
 hw/i386/pc.c  |  1 +
 target/i386/cpu.c | 25 +++--
 target/i386/cpu.h |  2 ++
 target/i386/kvm.c |  7 ++-
 4 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 2128f3d6fe..8d8d42a8ea 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -316,6 +316,7 @@ GlobalProperty pc_compat_1_5[] = {
 { "Nehalem-" TYPE_X86_CPU, "min-level", "2" },
 { "virtio-net-pci", "any_layout", "off" },
 { TYPE_X86_CPU, "pmu", "on" },
+{ TYPE_X86_CPU, "lbr", "on" },
 { "i440FX-pcihost", "short_root_bus", "0" },
 { "q35-pcihost", "short_root_bus", "0" },
 };
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index e47c9d1604..262a2595fa 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1142,8 +1142,8 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = 
{
 [FEAT_PERF_CAPABILITIES] = {
 .type = MSR_FEATURE_WORD,
 .feat_names = {
-NULL, NULL, NULL, NULL,
-NULL, NULL, NULL, NULL,
+"lbr-fmt-bit-0", "lbr-fmt-bit-1", "lbr-fmt-bit-2", "lbr-fmt-bit-3",
+"lbr-fmt-bit-4", "lbr-fmt-bit-5", NULL, NULL,
 NULL, NULL, NULL, NULL,
 NULL, "full-width-write", NULL, NULL,
 NULL, NULL, NULL, NULL,
@@ -4187,6 +4187,13 @@ static bool lmce_supported(void)
 return !!(mce_cap & MCG_LMCE_P);
 }
 
+static inline bool lbr_supported(void)
+{
+return kvm_enabled() && (PERF_CAP_LBR_FMT &
+kvm_arch_get_supported_msr_feature(kvm_state,
+   MSR_IA32_PERF_CAPABILITIES));
+}
+
 #define CPUID_MODEL_ID_SZ 48
 
 /**
@@ -4290,6 +4297,9 @@ static void max_x86_cpu_initfn(Object *obj)
 }
 
 object_property_set_bool(OBJECT(cpu), true, "pmu", &error_abort);
+if (lbr_supported()) {
+object_property_set_bool(OBJECT(cpu), true, "lbr", &error_abort);
+}
 }
 
 static const TypeInfo max_x86_cpu_type_info = {
@@ -5510,6 +5520,10 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, 
uint32_t count,
 }
 if (!cpu->enable_pmu) {
 *ecx &= ~CPUID_EXT_PDCM;
+if (cpu->enable_lbr) {
+warn_report("LBR is unsupported since guest PMU is disabled.");
+exit(1);
+}
 }
 break;
 case 2:
@@ -6528,6 +6542,12 @@ static void x86_cpu_realizefn(DeviceState *dev, Error 
**errp)
 }
 }
 
+if (!cpu->max_features && cpu->enable_lbr &&
+!(env->features[FEAT_1_ECX] & CPUID_EXT_PDCM)) {
+warn_report("requested vcpu model doesn't support PDCM for LBR.");
+exit(1);
+}
+
 if (cpu->ucode_rev == 0) {
 /* The default is the same as KVM's.  */
 if (IS_AMD_CPU(env)) {
@@ -7165,6 +7185,7 @@ static Property x86_cpu_properties[] = {
 #endif
 DEFINE_PROP_INT32("node-id", X86CPU, node_id, CPU_UNSET_NUMA_NODE_ID),
 DEFINE_PROP_BOOL("pmu", X86CPU, enable_pmu, false),
+DEFINE_PROP_BOOL("lbr", X86CPU, enable_lbr, false),
 
 DEFINE_PROP_UINT32("hv-spinlocks", X86CPU, hyperv_spinlock_attempts,
HYPERV_SPINLOCK_NEVER_RETRY),
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index fad2f874bd..e5f65e9b0c 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -357,6 +357,7 @@ typedef enum X86Seg {
 #define ARCH_CAP_TSX_CTRL_MSR  (1<<7)
 
 #define MSR_IA32_PERF_CAPABILITIES  0x345
+#define PERF_CAP_LBR_FMT  0x3f
 
 #define MSR_IA32_TSX_CTRL  0x122
 #define MSR_IA32_TSCDEADLINE0x6e0
@@ -1686,6 +1687,7 @@ struct X86CPU {
  * capabilities) directly to the guest.
  */
 bool enable_pmu;
+bool enable_lbr;
 
 /* LMCE support can be enabled/disabled via cpu option 'lmce=on/off'. It is
  * disabled by default to avoid breaking migration between QEMU with
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 9be6f76b2c..524ae86b0c 100644
--- a/target/i386/k

[Qemu-devel] [PATCH 1/2] target/i386: define a new MSR based feature word - FEAT_PERF_CAPABILITIES

2020-06-13 Thread Like Xu
The Perfmon and Debug Capability MSR named IA32_PERF_CAPABILITIES is
a feature-enumerating MSR, which for now only enumerates the full-width
write feature (via bit 13). This bit indicates that the processor supports
the IA32_A_PMCx interface for updating bits 32 and above of IA32_PMCx.

The existence of MSR IA32_PERF_CAPABILITIES is enumerated by CPUID.1:ECX[15].
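
The relationship between the two enumeration bits can be sketched as a few
standalone helpers. This is an illustration under stated assumptions, not
the QEMU implementation: the function names and PERF_CAP_FULL_WRITE are
hypothetical, while the bit positions (CPUID.1:ECX[15] for PDCM and bit 13
for full-width write) come from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID_EXT_PDCM       (1U << 15)    /* CPUID.1:ECX[15] */
#define PERF_CAP_FULL_WRITE  (1ULL << 13)  /* hypothetical name for bit 13 */

/* With the guest PMU disabled, PDCM is hidden from CPUID.1:ECX. */
static uint32_t filter_cpuid_1_ecx(uint32_t ecx, bool enable_pmu)
{
    return enable_pmu ? ecx : (ecx & ~CPUID_EXT_PDCM);
}

/*
 * Models the feature dependency: without PDCM, no bit of
 * IA32_PERF_CAPABILITIES is visible to the guest.
 */
static uint64_t visible_perf_capabilities(uint32_t ecx, uint64_t perf_cap)
{
    return (ecx & CPUID_EXT_PDCM) ? perf_cap : 0;
}

/* True if the guest may use the full-width IA32_A_PMCx interface. */
static bool has_full_width_write(uint32_t ecx, uint64_t perf_cap)
{
    return (visible_perf_capabilities(ecx, perf_cap) & PERF_CAP_FULL_WRITE) != 0;
}
```

So a guest started without the PMU loses PDCM, and with it every bit of the
capabilities MSR, including full-width write.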

Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Cc: Marcelo Tosatti 
Cc: qemu-devel@nongnu.org
Signed-off-by: Like Xu 
Message-Id: <20200529074347.124619-5-like...@linux.intel.com>
Signed-off-by: Paolo Bonzini 
---
 target/i386/cpu.c | 23 +++
 target/i386/cpu.h |  3 +++
 target/i386/kvm.c | 20 
 3 files changed, 46 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 02065e35d4..e47c9d1604 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1139,6 +1139,22 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] 
= {
 .index = MSR_IA32_CORE_CAPABILITY,
 },
 },
+[FEAT_PERF_CAPABILITIES] = {
+.type = MSR_FEATURE_WORD,
+.feat_names = {
+NULL, NULL, NULL, NULL,
+NULL, NULL, NULL, NULL,
+NULL, NULL, NULL, NULL,
+NULL, "full-width-write", NULL, NULL,
+NULL, NULL, NULL, NULL,
+NULL, NULL, NULL, NULL,
+NULL, NULL, NULL, NULL,
+NULL, NULL, NULL, NULL,
+},
+.msr = {
+.index = MSR_IA32_PERF_CAPABILITIES,
+},
+},
 
 [FEAT_VMX_PROCBASED_CTLS] = {
 .type = MSR_FEATURE_WORD,
@@ -1316,6 +1332,10 @@ static FeatureDep feature_dependencies[] = {
 .from = { FEAT_7_0_EDX, CPUID_7_0_EDX_CORE_CAPABILITY },
 .to = { FEAT_CORE_CAPABILITY,   ~0ull },
 },
+{
+.from = { FEAT_1_ECX, CPUID_EXT_PDCM },
+.to = { FEAT_PERF_CAPABILITIES,   ~0ull },
+},
 {
 .from = { FEAT_1_ECX,   CPUID_EXT_VMX },
 .to = { FEAT_VMX_PROCBASED_CTLS,~0ull },
@@ -5488,6 +5508,9 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
 *ebx |= (cs->nr_cores * cs->nr_threads) << 16;
 *edx |= CPUID_HT;
 }
+if (!cpu->enable_pmu) {
+*ecx &= ~CPUID_EXT_PDCM;
+}
 break;
 case 2:
 /* cache info: needed for Pentium Pro compatibility */
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 408392dbf6..fad2f874bd 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -356,6 +356,8 @@ typedef enum X86Seg {
 #define MSR_IA32_ARCH_CAPABILITIES  0x10a
 #define ARCH_CAP_TSX_CTRL_MSR  (1<<7)
 
+#define MSR_IA32_PERF_CAPABILITIES  0x345
+
 #define MSR_IA32_TSX_CTRL  0x122
 #define MSR_IA32_TSCDEADLINE  0x6e0
 
@@ -529,6 +531,7 @@ typedef enum FeatureWord {
 FEAT_XSAVE_COMP_HI, /* CPUID[EAX=0xd,ECX=0].EDX */
 FEAT_ARCH_CAPABILITIES,
 FEAT_CORE_CAPABILITY,
+FEAT_PERF_CAPABILITIES,
 FEAT_VMX_PROCBASED_CTLS,
 FEAT_VMX_SECONDARY_CTLS,
 FEAT_VMX_PINBASED_CTLS,
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 34f838728d..9be6f76b2c 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -106,6 +106,7 @@ static bool has_msr_core_capabs;
 static bool has_msr_vmx_vmfunc;
 static bool has_msr_ucode_rev;
 static bool has_msr_vmx_procbased_ctls2;
+static bool has_msr_perf_capabs;
 
 static uint32_t has_architectural_pmu_version;
 static uint32_t num_architectural_pmu_gp_counters;
@@ -2027,6 +2028,9 @@ static int kvm_get_supported_msrs(KVMState *s)
 case MSR_IA32_CORE_CAPABILITY:
 has_msr_core_capabs = true;
 break;
+case MSR_IA32_PERF_CAPABILITIES:
+has_msr_perf_capabs = true;
+break;
 case MSR_IA32_VMX_VMFUNC:
 has_msr_vmx_vmfunc = true;
 break;
@@ -2643,6 +2647,18 @@ static void kvm_msr_entry_add_vmx(X86CPU *cpu, FeatureWordArray f)
   VMCS12_MAX_FIELD_INDEX << 1);
 }
 
+static void kvm_msr_entry_add_perf(X86CPU *cpu, FeatureWordArray f)
+{
+uint64_t kvm_perf_cap =
+kvm_arch_get_supported_msr_feature(kvm_state,
+   MSR_IA32_PERF_CAPABILITIES);
+
+if (kvm_perf_cap) {
+kvm_msr_entry_add(cpu, MSR_IA32_PERF_CAPABILITIES,
+kvm_perf_cap & f[FEAT_PERF_CAPABILITIES]);
+}
+}
+
 static int kvm_buf_set_msrs(X86CPU *cpu)
 {
 int ret = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_MSRS, cpu->kvm_msr_buf);
@@ -2675,6 +2691,10 @@ static void kvm_init_msrs(X86CPU *cpu)
   env->features[FEAT_CORE_CAPABILITY]);
 }
 
+if (has_msr_perf_capabs && cpu->enable_pmu) {
+kvm_msr_entry_add_perf(cpu, env->features);
+}
+
 if (has_msr_ucode_rev) {
 kvm_msr_entry_add(cpu, MSR_IA32_UCODE_REV, cpu->ucode_rev);
 }
-- 
2.21.3
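
The masking that this patch performs before programming the guest MSR can be
sketched as standalone C. This is only an illustration, assuming the bit
positions given in the commit message (CPUID.1:ECX[15] and
IA32_PERF_CAPABILITIES bit 13); guest_perf_capabilities() and its parameters
are hypothetical names and are not part of QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as described in the commit message. */
#define CPUID_EXT_PDCM_BIT        (1u << 15)   /* CPUID.1:ECX[15] */
#define PERF_CAP_FULL_WIDTH_WRITE (1ull << 13) /* IA32_PERF_CAPABILITIES[13] */

/*
 * Hypothetical helper: compute the value a VMM would write to the guest's
 * IA32_PERF_CAPABILITIES. The MSR is only meaningful when the PMU is
 * enabled and CPUID.1:ECX.PDCM is set; otherwise nothing is exposed.
 */
static uint64_t guest_perf_capabilities(uint32_t cpuid_1_ecx, bool enable_pmu,
                                        uint64_t host_supported,
                                        uint64_t requested)
{
    uint64_t value;

    if (!enable_pmu || !(cpuid_1_ecx & CPUID_EXT_PDCM_BIT)) {
        return 0;
    }

    /* Expose only the bits that are both supported and requested. */
    value = host_supported & requested;

    /* Sanity check: the guest never sees a bit the host lacks. */
    assert((value & ~host_supported) == 0);
    return value;
}
```

With the PMU disabled, or with PDCM clear, the helper returns 0, which matches
the intent of clearing CPUID_EXT_PDCM when enable_pmu is false.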




[Qemu-devel PATCH] target/i386: define a new MSR based feature word - FEAT_PERF_CAPABILITIES

2020-05-29 Thread Like Xu
The Perfmon and Debug Capability MSR, IA32_PERF_CAPABILITIES, is a
feature-enumerating MSR. For now it only enumerates the full-width write
feature (bit 13), which indicates that the processor supports the IA32_A_PMCx
interface for updating bits 32 and above of IA32_PMCx.

The existence of MSR IA32_PERF_CAPABILITIES is enumerated by CPUID.1:ECX[15].

Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Cc: Marcelo Tosatti 
Cc: qemu-devel@nongnu.org
Signed-off-by: Like Xu 
---
 target/i386/cpu.c | 29 +
 target/i386/cpu.h |  3 +++
 target/i386/kvm.c | 20 
 3 files changed, 52 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 3733d9a279..be56966bb0 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1139,6 +1139,22 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
 .index = MSR_IA32_CORE_CAPABILITY,
 },
 },
+[FEAT_PERF_CAPABILITIES] = {
+.type = MSR_FEATURE_WORD,
+.feat_names = {
+NULL, NULL, NULL, NULL,
+NULL, NULL, NULL, NULL,
+NULL, NULL, NULL, NULL,
+NULL, "full-width-write", NULL, NULL,
+NULL, NULL, NULL, NULL,
+NULL, NULL, NULL, NULL,
+NULL, NULL, NULL, NULL,
+NULL, NULL, NULL, NULL,
+},
+.msr = {
+.index = MSR_IA32_PERF_CAPABILITIES,
+},
+},
 
 [FEAT_VMX_PROCBASED_CTLS] = {
 .type = MSR_FEATURE_WORD,
@@ -1316,6 +1332,10 @@ static FeatureDep feature_dependencies[] = {
 .from = { FEAT_7_0_EDX, CPUID_7_0_EDX_CORE_CAPABILITY },
 .to = { FEAT_CORE_CAPABILITY,   ~0ull },
 },
+{
+.from = { FEAT_1_ECX, CPUID_EXT_PDCM },
+.to = { FEAT_PERF_CAPABILITIES,   ~0ull },
+},
 {
 .from = { FEAT_1_ECX,   CPUID_EXT_VMX },
 .to = { FEAT_VMX_PROCBASED_CTLS,~0ull },
@@ -5488,6 +5508,9 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
 *ebx |= (cs->nr_cores * cs->nr_threads) << 16;
 *edx |= CPUID_HT;
 }
+if (!cpu->enable_pmu) {
+*ecx &= ~CPUID_EXT_PDCM;
+}
 break;
 case 2:
 /* cache info: needed for Pentium Pro compatibility */
@@ -6505,6 +6528,12 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
 }
 }
 
+if (kvm_enabled() && cpu->enable_pmu &&
+(kvm_arch_get_supported_cpuid(kvm_state, 1, 0, R_ECX) &
+ CPUID_EXT_PDCM)) {
+env->features[FEAT_1_ECX] |= CPUID_EXT_PDCM;
+}
+
 if (cpu->ucode_rev == 0) {
 /* The default is the same as KVM's.  */
 if (IS_AMD_CPU(env)) {
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 408392dbf6..fad2f874bd 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -356,6 +356,8 @@ typedef enum X86Seg {
 #define MSR_IA32_ARCH_CAPABILITIES  0x10a
 #define ARCH_CAP_TSX_CTRL_MSR  (1<<7)
 
+#define MSR_IA32_PERF_CAPABILITIES  0x345
+
 #define MSR_IA32_TSX_CTRL  0x122
 #define MSR_IA32_TSCDEADLINE  0x6e0
 
@@ -529,6 +531,7 @@ typedef enum FeatureWord {
 FEAT_XSAVE_COMP_HI, /* CPUID[EAX=0xd,ECX=0].EDX */
 FEAT_ARCH_CAPABILITIES,
 FEAT_CORE_CAPABILITY,
+FEAT_PERF_CAPABILITIES,
 FEAT_VMX_PROCBASED_CTLS,
 FEAT_VMX_SECONDARY_CTLS,
 FEAT_VMX_PINBASED_CTLS,
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 34f838728d..9be6f76b2c 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -106,6 +106,7 @@ static bool has_msr_core_capabs;
 static bool has_msr_vmx_vmfunc;
 static bool has_msr_ucode_rev;
 static bool has_msr_vmx_procbased_ctls2;
+static bool has_msr_perf_capabs;
 
 static uint32_t has_architectural_pmu_version;
 static uint32_t num_architectural_pmu_gp_counters;
@@ -2027,6 +2028,9 @@ static int kvm_get_supported_msrs(KVMState *s)
 case MSR_IA32_CORE_CAPABILITY:
 has_msr_core_capabs = true;
 break;
+case MSR_IA32_PERF_CAPABILITIES:
+has_msr_perf_capabs = true;
+break;
 case MSR_IA32_VMX_VMFUNC:
 has_msr_vmx_vmfunc = true;
 break;
@@ -2643,6 +2647,18 @@ static void kvm_msr_entry_add_vmx(X86CPU *cpu, FeatureWordArray f)
   VMCS12_MAX_FIELD_INDEX << 1);
 }
 
+static void kvm_msr_entry_add_perf(X86CPU *cpu, FeatureWordArray f)
+{
+uint64_t kvm_perf_cap =
+kvm_arch_get_supported_msr_feature(kvm_state,
+   MSR_IA32_PERF_CAPABILITIES);
+
+if (kvm_perf_cap) {
+kvm_msr_entry_add(cpu, MSR_IA32_PERF_CAPABILITIES,
+kvm_perf_cap & f[FEAT_PERF_CAPABILITIES]);
+}
+}
+
 static int kvm_buf_set_msrs(X86CPU *cpu)
 {
 int ret = kvm_vcpu_ioctl(CPU

Re: [PATCH] i386/cpu: Expand MAX_FIXED_COUNTERS from 3 to 4 to for Icelake

2020-03-26 Thread Like Xu

On 2020/3/27 2:48, Paolo Bonzini wrote:

On 17/03/20 06:54, Like Xu wrote:

In the Intel SDM, "Table 18-2. Association of Fixed-Function
Performance Counters with Architectural Performance Events",
we may have a new fixed counter 'TOPDOWN.SLOTS' (since Icelake),
which counts the number of available slots for an unhalted
logical processor. Check commit 6017608936 in the kernel tree.

Signed-off-by: Like Xu 
---
  target/i386/cpu.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 576f309bbf..ec2b67d425 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1185,7 +1185,7 @@ typedef struct {
  #define CPU_NB_REGS CPU_NB_REGS32
  #endif
  
-#define MAX_FIXED_COUNTERS 3
+#define MAX_FIXED_COUNTERS 4
  #define MAX_GP_COUNTERS (MSR_IA32_PERF_STATUS - MSR_P6_EVNTSEL0)
  
  #define TARGET_INSN_START_EXTRA_WORDS 1




Hi Like, the problem with this patch is that it breaks live migration;
the vmstate_msr_architectural_pmu record hardcodes MAX_FIXED_COUNTERS as
the number of registers.

So it's more complicated, you need to add a new subsection (following
vmstate_msr_architectural_pmu) and transmit it only if the 4th counter
is nonzero (instead of the more complicated check in pmu_enable_needed).
  Just to be safe, I'd make the new subsection hold 16 counters and bump
MAX_FIXED_COUNTERS to 16.
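
The "transmit only if needed" rule suggested above can be sketched as a
standalone predicate. This is a minimal sketch under stated assumptions: it
does not use QEMU's VMState API, struct pmu_state and
extra_fixed_counters_needed() are hypothetical names, and it generalizes the
suggestion slightly by checking every counter beyond the legacy three rather
than only the 4th.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LEGACY_FIXED_COUNTERS 3   /* count hardcoded in the existing record */
#define MAX_FIXED_COUNTERS    16  /* headroom suggested in the review */

/* Hypothetical stand-in for the relevant part of the CPU state. */
struct pmu_state {
    uint64_t fixed_counters[MAX_FIXED_COUNTERS];
};

/*
 * Return true if any fixed counter beyond the legacy three is nonzero.
 * A migration stream would carry the extra subsection only in that case,
 * so a stream from a guest that never touched the new counters keeps the
 * old format.
 */
static bool extra_fixed_counters_needed(const struct pmu_state *s)
{
    assert(s != NULL);

    for (size_t i = LEGACY_FIXED_COUNTERS; i < MAX_FIXED_COUNTERS; i++) {
        if (s->fixed_counters[i] != 0) {
            return true;
        }
    }
    return false;
}
```

Keeping the legacy record at three counters and gating the rest on this kind
of predicate is what lets older destinations accept the stream unchanged.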


The new MAX_FIXED_COUNTERS value looks good to me,
and let me follow up on this live migration issue.

Thanks,
Like Xu



Thanks,

Paolo







Re: [PATCH] i386/cpu: Expand MAX_FIXED_COUNTERS from 3 to 4 to for Icelake

2020-03-26 Thread Like Xu

Anyone to help review this change?

Thanks,
Like Xu

On 2020/3/17 13:54, Like Xu wrote:

In the Intel SDM, "Table 18-2. Association of Fixed-Function
Performance Counters with Architectural Performance Events",
we may have a new fixed counter 'TOPDOWN.SLOTS' (since Icelake),
which counts the number of available slots for an unhalted
logical processor. Check commit 6017608936 in the kernel tree.

Signed-off-by: Like Xu 
---
  target/i386/cpu.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 576f309bbf..ec2b67d425 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1185,7 +1185,7 @@ typedef struct {
  #define CPU_NB_REGS CPU_NB_REGS32
  #endif
  
-#define MAX_FIXED_COUNTERS 3
+#define MAX_FIXED_COUNTERS 4
  #define MAX_GP_COUNTERS (MSR_IA32_PERF_STATUS - MSR_P6_EVNTSEL0)
  
  #define TARGET_INSN_START_EXTRA_WORDS 1







[PATCH] i386/cpu: Expand MAX_FIXED_COUNTERS from 3 to 4 to for Icelake

2020-03-16 Thread Like Xu
In the Intel SDM, "Table 18-2. Association of Fixed-Function
Performance Counters with Architectural Performance Events",
we may have a new fixed counter 'TOPDOWN.SLOTS' (since Icelake),
which counts the number of available slots for an unhalted
logical processor. Check commit 6017608936 in the kernel tree.

Signed-off-by: Like Xu 
---
 target/i386/cpu.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 576f309bbf..ec2b67d425 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1185,7 +1185,7 @@ typedef struct {
 #define CPU_NB_REGS CPU_NB_REGS32
 #endif
 
-#define MAX_FIXED_COUNTERS 3
+#define MAX_FIXED_COUNTERS 4
 #define MAX_GP_COUNTERS (MSR_IA32_PERF_STATUS - MSR_P6_EVNTSEL0)
 
 #define TARGET_INSN_START_EXTRA_WORDS 1
-- 
2.21.1




Re: Difference between 'current_machine' vs MACHINE(qdev_get_machine())

2020-01-09 Thread Like Xu

On 2020/1/9 20:01, Paolo Bonzini wrote:

On 09/01/20 12:23, Philippe Mathieu-Daudé wrote:



     current_machine =
         MACHINE(object_new_with_class(OBJECT_CLASS(machine_class)));
     object_property_add_child(object_get_root(), "machine",
                               OBJECT(current_machine), &error_abort);

The bigger user of 'current_machine' is the accel/KVM code.

Recently in a0628599f..cc7d44c2e0 "Replace global smp variables with
machine smp properties" we started to use MACHINE(qdev_get_machine()).

qdev_get_machine() resolves the machine in the QOM composition tree.
I am confused by this comment:

   /* qdev_get_machine() can return something that's not TYPE_MACHINE
    * if this is one of the user-only emulators; in that case there's
    * no need to check the ignore_memory_transaction_failures board flag.
    */

Following a0628599f..cc7d44c2e0, a5e0b33119 uses 'current_machine' again.

What are the differences between the two forms, and when should we use one
or the other (or can we use a single one)? Can this break user-only mode?


I would always use MACHINE(qdev_get_machine()), especially outside
vl.c.  Ideally, current_machine would be static within vl.c or even
unused outside the object_property_add_child() that you quote above.

Most of the time, as I noticed from a quick grep, we actually want to
access the accelerator, not the machine, so we could add a
qemu_get_accelerator() wrapper that does
MACHINE(qdev_get_machine())->accelerator.

Paolo



I prefer to use MACHINE(qdev_get_machine()) wherever possible.

However, qdev_get_machine() would return a non-TYPE_MACHINE object if:
- qdev_get_machine() is called before vl.c runs
"object_property_add_child(object_get_root(), "machine",
OBJECT(current_machine), &error_abort);";

- or it is called in a context built with '#ifdef CONFIG_USER_ONLY'.

Thanks,
Like Xu




Re: [Qemu-devel] [PATCH 1/3] pc: Fix error message on die-id validation

2019-08-18 Thread Like Xu

On 2019/8/16 21:49, Eduardo Habkost wrote:

On Fri, Aug 16, 2019 at 09:04:16AM +0800, Like Xu wrote:

Hi,

On 2019/8/16 2:38, Eduardo Habkost wrote:

The error message for die-id range validation is incorrect.  Example:

$ qemu-system-x86_64 -smp 1,sockets=6,maxcpus=6 \
  -device qemu64-x86_64-cpu,socket-id=1,die-id=1,core-id=0,thread-id=0
qemu-system-x86_64: -device 
qemu64-x86_64-cpu,socket-id=1,die-id=1,core-id=0,thread-id=0: \
  Invalid CPU die-id: 1 must be in range 0:5

The actual range for die-id in this example is 0:0.


There is one die per socket by default.

The case sockets=6 means there are 6 dies by default
and the range for die-id is 0:5.



I don't understand why you say that.  die-id is supposed to identify
a die inside a socket.  The code below is already checking for
(cpu->die_id > pcms->smp_dies - 1) because of that.



You're right about this.
Sorry for the mess I made while adding die topology support.





Fix the error message to use smp_dies and print the correct range.

Signed-off-by: Eduardo Habkost 
---
   hw/i386/pc.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 549c437050..24b03bb49c 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2412,7 +2412,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
   return;
   } else if (cpu->die_id > pcms->smp_dies - 1) {
   error_setg(errp, "Invalid CPU die-id: %u must be in range 0:%u",
-   cpu->die_id, max_socket);
+   cpu->die_id, pcms->smp_dies - 1);
   return;
   }
   if (cpu->core_id < 0) {
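
The range check and the corrected message from the patch above can be
reproduced in standalone C. This is a sketch only: die_id_in_range() and its
buffer-based error reporting are hypothetical stand-ins for the
pc_cpu_pre_plug() logic and error_setg(), and dies_per_socket plays the role
of pcms->smp_dies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: a die-id identifies a die inside one socket, so the
 * valid range is 0..dies_per_socket-1, independent of the socket count.
 * On failure, the message reports that same range.
 */
static bool die_id_in_range(unsigned die_id, unsigned dies_per_socket,
                            char *err, size_t err_len)
{
    assert(dies_per_socket >= 1);

    if (die_id > dies_per_socket - 1) {
        snprintf(err, err_len,
                 "Invalid CPU die-id: %u must be in range 0:%u",
                 die_id, dies_per_socket - 1);
        return false;
    }
    return true;
}
```

For the "-smp 1,sockets=6,maxcpus=6" example there is one die per socket, so
die-id=1 is rejected and the reported range is 0:0.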










Re: [Qemu-devel] [PATCH 1/3] pc: Fix error message on die-id validation

2019-08-15 Thread Like Xu

Hi,

On 2019/8/16 2:38, Eduardo Habkost wrote:

The error message for die-id range validation is incorrect.  Example:

   $ qemu-system-x86_64 -smp 1,sockets=6,maxcpus=6 \
 -device qemu64-x86_64-cpu,socket-id=1,die-id=1,core-id=0,thread-id=0
   qemu-system-x86_64: -device 
qemu64-x86_64-cpu,socket-id=1,die-id=1,core-id=0,thread-id=0: \
 Invalid CPU die-id: 1 must be in range 0:5

The actual range for die-id in this example is 0:0.


There is one die per socket by default.

The case sockets=6 means there are 6 dies by default
and the range for die-id is 0:5.



Fix the error message to use smp_dies and print the correct range.

Signed-off-by: Eduardo Habkost 
---
  hw/i386/pc.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 549c437050..24b03bb49c 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2412,7 +2412,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
  return;
  } else if (cpu->die_id > pcms->smp_dies - 1) {
  error_setg(errp, "Invalid CPU die-id: %u must be in range 0:%u",
-   cpu->die_id, max_socket);
+   cpu->die_id, pcms->smp_dies - 1);
  return;
  }
  if (cpu->core_id < 0) {






Re: [Qemu-devel] [PATCH for 4.1?] includes: remove stale [smp|max]_cpus externs

2019-07-11 Thread Like Xu

On 2019/7/11 21:05, Alex Bennée wrote:

Commit a5e0b3311 removed these in favour of querying machine
properties. Remove the extern declarations as well.

Signed-off-by: Alex Bennée 
Cc: Like Xu 


Reviewed-by: Like Xu 


---
  include/sysemu/sysemu.h | 2 --
  1 file changed, 2 deletions(-)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 984c439ac9..e70edf7c1c 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -103,8 +103,6 @@ extern const char *keyboard_layout;
  extern int win2k_install_hack;
  extern int alt_grab;
  extern int ctrl_grab;
-extern int smp_cpus;
-extern unsigned int max_cpus;
  extern int cursor_hide;
  extern int graphic_rotate;
  extern int no_quit;






Re: [Qemu-devel] [PATCH v3 05/10] hw/riscv: Replace global smp variables with machine smp properties

2019-06-20 Thread Like Xu

On 2019/6/20 22:52, Eduardo Habkost wrote:

On Sun, May 19, 2019 at 04:54:23AM +0800, Like Xu wrote:

The global smp variables in riscv are replaced with smp machine properties.

Where a variable is used widely in a function, a local variable of the same
name is introduced in the declarations; where it is used only once, it is
replaced on the spot. No semantic changes.

Signed-off-by: Like Xu 
---
  hw/riscv/sifive_e.c| 6 --
  hw/riscv/sifive_plic.c | 3 +++
  hw/riscv/sifive_u.c| 6 --
  hw/riscv/spike.c   | 2 ++
  hw/riscv/virt.c| 1 +
  5 files changed, 14 insertions(+), 4 deletions(-)


This was incomplete, I had to apply the following fixup.

Signed-off-by: Eduardo Habkost 


Reviewed-by: Like Xu 


---
  hw/riscv/spike.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/hw/riscv/spike.c b/hw/riscv/spike.c
index 9e95f2c13c..d91d49dcae 100644
--- a/hw/riscv/spike.c
+++ b/hw/riscv/spike.c
@@ -172,6 +172,7 @@ static void spike_board_init(MachineState *machine)
  MemoryRegion *main_mem = g_new(MemoryRegion, 1);
  MemoryRegion *mask_rom = g_new(MemoryRegion, 1);
  int i;
+unsigned int smp_cpus = machine->smp.cpus;
  
  /* Initialize SOC */

  object_initialize_child(OBJECT(machine), "soc", &s->soc, sizeof(s->soc),






[Qemu-devel] [PATCH v4 1/3] target/i386: Add CPUID.1F generation support for multi-dies PCMachine

2019-06-19 Thread Like Xu
CPUID.1F, Intel's V2 Extended Topology Enumeration Leaf, is exposed when
the guest is configured with multiple software-visible dies within each
package. Per Intel's SDM, leaf 0x1f is a superset of leaf 0xb, so it can be
generated by almost the same code as 0xb, apart from the die_offset setting.

If the number of dies per package is greater than 1, cpuid_min_level is
adjusted to 0x1f regardless of whether the host supports CPUID.1F.
Likewise, CPUID.1F is not exposed if env->nr_dies < 2.

Suggested-by: Eduardo Habkost 
Signed-off-by: Like Xu 
---
 target/i386/cpu.c | 41 +
 target/i386/cpu.h |  1 +
 target/i386/kvm.c | 12 
 3 files changed, 54 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 88908a6373..efcbe6a2b2 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -4439,6 +4439,42 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
 *ecx |= CPUID_TOPOLOGY_LEVEL_INVALID;
 }
 
+assert(!(*eax & ~0x1f));
+*ebx &= 0xffff; /* The count doesn't need to be reliable. */
+break;
+case 0x1F:
+/* V2 Extended Topology Enumeration Leaf */
+if (env->nr_dies < 2) {
+*eax = *ebx = *ecx = *edx = 0;
+break;
+}
+
+*ecx = count & 0xff;
+*edx = cpu->apic_id;
+switch (count) {
+case 0:
+*eax = apicid_core_offset(env->nr_dies, cs->nr_cores,
+cs->nr_threads);
+*ebx = cs->nr_threads;
+*ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
+break;
+case 1:
+*eax = apicid_die_offset(env->nr_dies, cs->nr_cores,
+   cs->nr_threads);
+*ebx = cs->nr_cores * cs->nr_threads;
+*ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
+break;
+case 2:
+*eax = apicid_pkg_offset(env->nr_dies, cs->nr_cores,
+   cs->nr_threads);
+*ebx = env->nr_dies * cs->nr_cores * cs->nr_threads;
+*ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
+break;
+default:
+*eax = 0;
+*ebx = 0;
+*ecx |= CPUID_TOPOLOGY_LEVEL_INVALID;
+}
 assert(!(*eax & ~0x1f));
 *ebx &= 0xffff; /* The count doesn't need to be reliable. */
 break;
@@ -5116,6 +5152,11 @@ static void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
 x86_cpu_adjust_level(cpu, &cpu->env.cpuid_min_level, 0x14);
 }
 
+/* CPU topology with multi-dies support requires CPUID[0x1F] */
+if (env->nr_dies > 1) {
+x86_cpu_adjust_level(cpu, &env->cpuid_min_level, 0x1F);
+}
+
 /* SVM requires CPUID[0x8000000A] */
 if (env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_SVM) {
 x86_cpu_adjust_level(cpu, &env->cpuid_min_xlevel, 0x8000000A);
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 46dd81f6b7..eec6e4b7b7 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -726,6 +726,7 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS];
 #define CPUID_TOPOLOGY_LEVEL_INVALID  (0U << 8)
 #define CPUID_TOPOLOGY_LEVEL_SMT  (1U << 8)
 #define CPUID_TOPOLOGY_LEVEL_CORE (2U << 8)
+#define CPUID_TOPOLOGY_LEVEL_DIE  (5U << 8)
 
 /* MSR Feature Bits */
 #define MSR_ARCH_CAP_RDCL_NO  (1U << 0)
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 6899061b4e..5deb4248ac 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -1080,6 +1080,10 @@ int kvm_arch_init_vcpu(CPUState *cs)
 }
 break;
 }
+case 0x1f:
+if (env->nr_dies < 2) {
+break;
+}
 case 4:
 case 0xb:
 case 0xd:
@@ -1087,6 +1091,11 @@ int kvm_arch_init_vcpu(CPUState *cs)
 if (i == 0xd && j == 64) {
 break;
 }
+
+if (i == 0x1f && j == 64) {
+break;
+}
+
 c->function = i;
 c->flags = KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
 c->index = j;
@@ -1098,6 +1107,9 @@ int kvm_arch_init_vcpu(CPUState *cs)
 if (i == 0xb && !(c->ecx & 0xff00)) {
 break;
 }
+if (i == 0x1f && !(c->ecx & 0xff00)) {
+break;
+}
 if (i == 0xd && c->eax == 0) {
 continue;
 }
-- 
2.21.0
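
The EAX values reported for the three CPUID.1F sub-leaves in the patch above
are bit offsets into the APIC ID. The layout can be sketched in standalone C
as follows. This is an illustration under assumptions, not QEMU's topology
code: struct topo, bits_for_count() and make_apic_id() are hypothetical names,
and each level is assumed to occupy just enough bits to hold its largest ID.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits needed to represent IDs 0..count-1. */
static unsigned bits_for_count(unsigned count)
{
    unsigned bits = 0;

    assert(count >= 1);
    for (count -= 1; count != 0; count >>= 1) {
        bits++;
    }
    return bits;
}

/* Hypothetical topology description. */
struct topo {
    unsigned dies;     /* dies per package */
    unsigned cores;    /* cores per die */
    unsigned threads;  /* threads per core */
};

/* Shift that skips the SMT field (sub-leaf 0, SMT level). */
static unsigned core_offset(const struct topo *t)
{
    return bits_for_count(t->threads);
}

/* Shift that skips the SMT and core fields (sub-leaf 1, core level). */
static unsigned die_offset(const struct topo *t)
{
    return core_offset(t) + bits_for_count(t->cores);
}

/* Shift that skips SMT, core and die fields (sub-leaf 2, die level). */
static unsigned pkg_offset(const struct topo *t)
{
    return die_offset(t) + bits_for_count(t->dies);
}

/* Compose an APIC ID from the per-level IDs. */
static uint32_t make_apic_id(const struct topo *t, unsigned pkg,
                             unsigned die, unsigned core, unsigned smt)
{
    return ((uint32_t)pkg << pkg_offset(t)) |
           ((uint32_t)die << die_offset(t)) |
           ((uint32_t)core << core_offset(t)) |
           smt;
}
```

With 2 dies, 4 cores per die and 2 threads per core, the offsets come out as
1, 3 and 4, so a guest can recover the die ID by shifting and masking its
APIC ID.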




[Qemu-devel] [PATCH v4 3/3] vl.c: Add -smp, dies=* command line support and update doc

2019-06-19 Thread Like Xu
For the PC target, this patch lets users configure the number of dies per
package on the command line, for example "-smp dies=2,cores=4".

The parsing rules of the new cpu-topology model follow the same restrictions
and logic as the legacy socket/core/thread model, especially when computing
missing values.
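
The missing-value computation with the extra dies factor can be sketched as a
standalone function. It follows the branch structure of the diff below, but
it is a simplified illustration: struct smp_topo and smp_fill_missing() are
hypothetical names, errors are returned instead of calling exit(), the
"!= maxcpus" warning is omitted, and an omitted dies value is assumed to mean
one die per socket.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the -smp values; 0 means "not given". */
struct smp_topo {
    unsigned cpus, sockets, dies, cores, threads, max_cpus;
};

/*
 * Fill in missing values, preferring sockets over cores over threads.
 * maxcpus_opt is the user's maxcpus value, or 0 if it was omitted.
 * Returns 0 on success and -1 if the topology is inconsistent.
 */
static int smp_fill_missing(struct smp_topo *t, unsigned maxcpus_opt)
{
    assert(t != NULL);

    /* Assumption: an omitted dies value means one die per socket. */
    if (t->dies == 0) {
        t->dies = 1;
    }

    if (t->cpus == 0 || t->sockets == 0) {
        t->cores = t->cores > 0 ? t->cores : 1;
        t->threads = t->threads > 0 ? t->threads : 1;
        if (t->cpus == 0) {
            t->sockets = t->sockets > 0 ? t->sockets : 1;
            t->cpus = t->cores * t->threads * t->dies * t->sockets;
        } else {
            unsigned max = maxcpus_opt ? maxcpus_opt : t->cpus;
            t->sockets = max / (t->cores * t->threads * t->dies);
        }
    } else if (t->cores == 0) {
        t->threads = t->threads > 0 ? t->threads : 1;
        t->cores = t->cpus / (t->sockets * t->dies * t->threads);
        t->cores = t->cores > 0 ? t->cores : 1;
    } else if (t->threads == 0) {
        t->threads = t->cpus / (t->cores * t->dies * t->sockets);
        t->threads = t->threads > 0 ? t->threads : 1;
    } else if (t->sockets * t->dies * t->cores * t->threads < t->cpus) {
        return -1; /* topology too small for the requested cpus */
    }

    t->max_cpus = maxcpus_opt ? maxcpus_opt : t->cpus;
    if (t->max_cpus < t->cpus) {
        return -1; /* maxcpus must be equal to or greater than cpus */
    }
    if (t->sockets * t->dies * t->cores * t->threads > t->max_cpus) {
        return -1; /* topology larger than maxcpus */
    }
    return 0;
}
```

For "-smp dies=2,cores=4" the sketch derives threads=1, sockets=1 and cpus=8,
which is the product of all four levels.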

Signed-off-by: Like Xu 
---
 hw/i386/pc.c| 30 +-
 qemu-options.hx | 17 +
 vl.c|  3 +++
 3 files changed, 29 insertions(+), 21 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 092bd10d4d..2ed1b3f8de 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1548,9 +1548,12 @@ static void pc_new_cpu(PCMachineState *pcms, int64_t apic_id, Error **errp)
  */
 void pc_smp_parse(MachineState *ms, QemuOpts *opts)
 {
+PCMachineState *pcms = PC_MACHINE(ms);
+
 if (opts) {
 unsigned cpus= qemu_opt_get_number(opts, "cpus", 0);
 unsigned sockets = qemu_opt_get_number(opts, "sockets", 0);
+unsigned dies = qemu_opt_get_number(opts, "dies", 1);
 unsigned cores   = qemu_opt_get_number(opts, "cores", 0);
 unsigned threads = qemu_opt_get_number(opts, "threads", 0);
 
@@ -1560,24 +1563,24 @@ void pc_smp_parse(MachineState *ms, QemuOpts *opts)
 threads = threads > 0 ? threads : 1;
 if (cpus == 0) {
 sockets = sockets > 0 ? sockets : 1;
-cpus = cores * threads * sockets;
+cpus = cores * threads * dies * sockets;
 } else {
 ms->smp.max_cpus =
 qemu_opt_get_number(opts, "maxcpus", cpus);
-sockets = ms->smp.max_cpus / (cores * threads);
+sockets = ms->smp.max_cpus / (cores * threads * dies);
 }
 } else if (cores == 0) {
 threads = threads > 0 ? threads : 1;
-cores = cpus / (sockets * threads);
+cores = cpus / (sockets * dies * threads);
 cores = cores > 0 ? cores : 1;
 } else if (threads == 0) {
-threads = cpus / (cores * sockets);
+threads = cpus / (cores * dies * sockets);
 threads = threads > 0 ? threads : 1;
-} else if (sockets * cores * threads < cpus) {
+} else if (sockets * dies * cores * threads < cpus) {
 error_report("cpu topology: "
- "sockets (%u) * cores (%u) * threads (%u) < "
+ "sockets (%u) * dies (%u) * cores (%u) * threads (%u) < "
  "smp_cpus (%u)",
- sockets, cores, threads, cpus);
+ sockets, dies, cores, threads, cpus);
 exit(1);
 }
 
@@ -1589,26 +1592,27 @@ void pc_smp_parse(MachineState *ms, QemuOpts *opts)
 exit(1);
 }
 
-if (sockets * cores * threads > ms->smp.max_cpus) {
+if (sockets * dies * cores * threads > ms->smp.max_cpus) {
 error_report("cpu topology: "
- "sockets (%u) * cores (%u) * threads (%u) > "
+ "sockets (%u) * dies (%u) * cores (%u) * threads (%u) > "
  "maxcpus (%u)",
- sockets, cores, threads,
+ sockets, dies, cores, threads,
  ms->smp.max_cpus);
 exit(1);
 }
 
-if (sockets * cores * threads != ms->smp.max_cpus) {
+if (sockets * dies * cores * threads != ms->smp.max_cpus) {
 warn_report("Invalid CPU topology deprecated: "
-"sockets (%u) * cores (%u) * threads (%u) "
+"sockets (%u) * dies (%u) * cores (%u) * threads (%u) "
 "!= maxcpus (%u)",
-sockets, cores, threads,
+sockets, dies, cores, threads,
 ms->smp.max_cpus);
 }
 
 ms->smp.cpus = cpus;
 ms->smp.cores = cores;
 ms->smp.threads = threads;
+pcms->smp_dies = dies;
 }
 
 if (ms->smp.cpus > 1) {
diff --git a/qemu-options.hx b/qemu-options.hx
index 0d8beb4afd..a5b314a448 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -138,25 +138,26 @@ no incompatible TCG features have been enabled (e.g. icount/replay).
 ETEXI
 
 DEF("smp", HAS_ARG, QEMU_OPTION_smp,
-"-smp [cpus=]n[,maxcpus=cpus][,cores=cores][,threads=threads][,sockets=sockets]\n"
+"-smp [cpus=]n[,maxcpus=cpus][,cores=cores][,threads=threads][,dies=dies][,sockets=sockets]\n"
 "set the number of CPUs to 'n' [default=1]\n"
 "maxcpus= maxi

[Qemu-devel] [PATCH v4 0/3] Introduce cpu die topology and enable CPUID.1F for i386

2019-06-19 Thread Like Xu
This patch series introduces a new cpu topology level, 'die', for PCMachine.
It extends the virtual cpu topology to the socket/die/core/thread model and
allows setting the number of dies per socket via the -smp qemu option.

For i386, it upgrades the APIC-ID generation and reversion functions and
exposes a new leaf, CPUID.1F, which is a preferred superset of leaf 0BH.
The CPUID.1F spec is in the latest Intel SDM, 3-190 Vol 2A.

A guest system can discover the multi-die/package topology through CPUID.1F,
and its benefit is primarily for _reporting_ of the guest CPU topology.
Multi-die/package support in the guest kernel has no impact on its
cache topology, NUMA topology, Linux scheduler, or system performance.

==changelog==

v4:

- base commit: 22fa84da on github.com/ehabkost/qemu.git:machine-next
- refine comments for pc_smp_parse()
- remove the use of cpu->enable_cpuid_0x1f
- apply new logic for cpuid_min_level adjustment and drop the legacy one
- refine the way of MachineState casting in pc_smp_parse()
- [QUEUED] move test_topo_bits to the previous patch for bisectability

v3: https://patchwork.kernel.org/cover/10989013/

- add a MachineClass::smp_parse function pointer
- place the PC-specific function inside hw/i386/pc.c
- introduce die_id in a separate patch with default value 0
- set env->nr_dies in pc_new_cpu() and pc_cpu_pre_plug()
- fix a circular dependency between target/i386/cpu.c and hw/i386/pc.c
- fix cpu->die_id check in pc_cpu_pre_plug()
- Based on "[PATCH v3 00/10] Refactor cpu topo into machine properties"
- Rebase to commit 219dca61ebf41625831d4f96a720852baf44b762

v2: https://patchwork.kernel.org/cover/10953191/

- Enable cpu die-level topology only for PCMachine and X86CPU
- Minimize cpuid.0.eax to the setting value actually used by guest
- Update cmd line -smp docs for die-level configurations
- Refactoring topo-bit tests for x86_apicid_from_cpu_idx() with nr_dies
- Based on "[PATCH v3 00/10] Refactor cpu topo into machine properties"
- Rebase to commit 2259637b95bef3116cc262459271de08e038cc66

v1: https://patchwork.kernel.org/cover/10876667/

Like Xu (3):
  target/i386: Add CPUID.1F generation support for multi-dies PCMachine
  machine: Refactor smp_parse() in vl.c as MachineClass::smp_parse()
  vl.c: Add -smp, dies=* command line support and update doc

 hw/core/machine.c| 76 
 hw/i386/pc.c | 83 
 include/hw/boards.h  |  5 +++
 include/hw/i386/pc.h |  1 +
 qemu-options.hx  | 17 -
 target/i386/cpu.c| 41 ++
 target/i386/cpu.h|  1 +
 target/i386/kvm.c| 12 +++
 vl.c | 78 +++--
 9 files changed, 233 insertions(+), 81 deletions(-)

-- 
2.21.0




[Qemu-devel] [PATCH v4 2/3] machine: Refactor smp_parse() in vl.c as MachineClass::smp_parse()

2019-06-19 Thread Like Xu
To make smp_parse() more flexible and extensible, a smp_parse function
pointer is added to MachineClass that machine types can override.

The generic smp_parse() code in vl.c is moved to hw/core/machine.c and
becomes the default implementation of MachineClass::smp_parse. A PC-specific
function called pc_smp_parse() is added to hw/i386/pc.c; in this patch it
behaves the same as the default one.
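
The override mechanism described above is the usual C pattern of a function
pointer in a class structure with a default set by the base class. A minimal
sketch, assuming simplified stand-in types: struct machine_class,
struct machine_state and all function names here are hypothetical and are not
the QEMU QOM types.

```c
#include <assert.h>
#include <stddef.h>

enum parser_id { PARSER_NONE, PARSER_GENERIC, PARSER_PC };

/* Hypothetical stand-ins for MachineState and MachineClass. */
struct machine_state {
    unsigned cpus;
    enum parser_id parsed_by;
};

struct machine_class {
    void (*smp_parse)(struct machine_state *ms, unsigned cpus_opt);
};

/* Default implementation, installed by the base class. */
static void generic_smp_parse(struct machine_state *ms, unsigned cpus_opt)
{
    ms->cpus = cpus_opt > 0 ? cpus_opt : 1;
    ms->parsed_by = PARSER_GENERIC;
}

/* Machine-specific variant; here it only tags who did the parsing. */
static void pc_like_smp_parse(struct machine_state *ms, unsigned cpus_opt)
{
    generic_smp_parse(ms, cpus_opt);
    ms->parsed_by = PARSER_PC;
}

static void base_class_init(struct machine_class *mc)
{
    mc->smp_parse = generic_smp_parse;
}

static void pc_like_class_init(struct machine_class *mc)
{
    base_class_init(mc);
    mc->smp_parse = pc_like_smp_parse; /* override the default */
}

/* Callers go through the pointer and never name a concrete parser. */
static void run_smp_parse(const struct machine_class *mc,
                          struct machine_state *ms, unsigned cpus_opt)
{
    assert(mc != NULL && mc->smp_parse != NULL);
    mc->smp_parse(ms, cpus_opt);
}
```

Because callers only use the pointer, a later patch can change the
machine-specific parser without touching the generic code path.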

Suggested-by: Eduardo Habkost 
Signed-off-by: Like Xu 
Reviewed-by: Eduardo Habkost 
---
 hw/core/machine.c| 76 ++
 hw/i386/pc.c | 79 
 include/hw/boards.h  |  5 +++
 include/hw/i386/pc.h |  1 +
 vl.c | 75 ++---
 5 files changed, 163 insertions(+), 73 deletions(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index 8b8d263afe..36a838f1cb 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -11,6 +11,9 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/option.h"
+#include "qapi/qmp/qerror.h"
+#include "sysemu/replay.h"
 #include "qemu/units.h"
 #include "hw/boards.h"
 #include "qapi/error.h"
@@ -728,6 +731,78 @@ void machine_set_cpu_numa_node(MachineState *machine,
 }
 }
 
+static void smp_parse(MachineState *ms, QemuOpts *opts)
+{
+if (opts) {
+unsigned cpus= qemu_opt_get_number(opts, "cpus", 0);
+unsigned sockets = qemu_opt_get_number(opts, "sockets", 0);
+unsigned cores   = qemu_opt_get_number(opts, "cores", 0);
+unsigned threads = qemu_opt_get_number(opts, "threads", 0);
+
+/* compute missing values, prefer sockets over cores over threads */
+if (cpus == 0 || sockets == 0) {
+cores = cores > 0 ? cores : 1;
+threads = threads > 0 ? threads : 1;
+if (cpus == 0) {
+sockets = sockets > 0 ? sockets : 1;
+cpus = cores * threads * sockets;
+} else {
+ms->smp.max_cpus =
+qemu_opt_get_number(opts, "maxcpus", cpus);
+sockets = ms->smp.max_cpus / (cores * threads);
+}
+} else if (cores == 0) {
+threads = threads > 0 ? threads : 1;
+cores = cpus / (sockets * threads);
+cores = cores > 0 ? cores : 1;
+} else if (threads == 0) {
+threads = cpus / (cores * sockets);
+threads = threads > 0 ? threads : 1;
+} else if (sockets * cores * threads < cpus) {
+error_report("cpu topology: "
+ "sockets (%u) * cores (%u) * threads (%u) < "
+ "smp_cpus (%u)",
+ sockets, cores, threads, cpus);
+exit(1);
+}
+
+ms->smp.max_cpus =
+qemu_opt_get_number(opts, "maxcpus", cpus);
+
+if (ms->smp.max_cpus < cpus) {
+error_report("maxcpus must be equal to or greater than smp");
+exit(1);
+}
+
+if (sockets * cores * threads > ms->smp.max_cpus) {
+error_report("cpu topology: "
+ "sockets (%u) * cores (%u) * threads (%u) > "
+ "maxcpus (%u)",
+ sockets, cores, threads,
+ ms->smp.max_cpus);
+exit(1);
+}
+
+if (sockets * cores * threads != ms->smp.max_cpus) {
+warn_report("Invalid CPU topology deprecated: "
+"sockets (%u) * cores (%u) * threads (%u) "
+"!= maxcpus (%u)",
+sockets, cores, threads,
+ms->smp.max_cpus);
+}
+
+ms->smp.cpus = cpus;
+ms->smp.cores = cores;
+ms->smp.threads = threads;
+}
+
+if (ms->smp.cpus > 1) {
+Error *blocker = NULL;
+error_setg(&blocker, QERR_REPLAY_NOT_SUPPORTED, "smp");
+replay_add_blocker(blocker);
+}
+}
+
 static void machine_class_init(ObjectClass *oc, void *data)
 {
 MachineClass *mc = MACHINE_CLASS(oc);
@@ -735,6 +810,7 @@ static void machine_class_init(ObjectClass *oc, void *data)
 /* Default 128 MB as guest ram size */
 mc->default_ram_size = 128 * MiB;
 mc->rom_file_has_mr = true;
+mc->smp_parse = smp_parse;
 
 /* numa node memory size aligned on 8MB by default.
  * On Linux, each node's border has to be 8MB aligned
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 7b8c9caed6..092bd10d4d 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -79,6 +79,8 @@
 #include "hw/i386/intel_iommu.h"
 #include "hw/net/ne2000-isa.h"
 #include &quo

Re: [Qemu-devel] [PATCH v3 7/9] target/i386: Support multi-dies when host doesn't support CPUID.1F

2019-06-19 Thread Like Xu

On 2019/6/20 7:36, Eduardo Habkost wrote:

On Wed, Jun 19, 2019 at 04:15:46PM -0300, Eduardo Habkost wrote:

On Wed, Jun 12, 2019 at 04:41:02PM +0800, Like Xu wrote:

In guest CPUID generation process, the cpuid_min_level would be adjusted to
the maximum passed value for basic CPUID configuration and it should not be
restricted by the limited value returned from cpu_x86_cpuid(). After the basic
cpu_x86_cpuid() loop is finished, the cpuid_0_entry.eax needs to be configured
again by the last adjusted cpuid_min_level value.

If a user wants to expose CPUID.1F by passing dies > 1 for any reason without
host support, a per-cpu smp topology warning will appear but it's not blocked.
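The level adjustment described above is a monotonic "raise to at least" update. A minimal sketch, assuming hypothetical names (struct sketch_cpu, sketch_raise_min_level, sketch_expand_level) that only illustrate the idea and are not the QEMU helpers themselves:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU CPUID level bookkeeping. */
struct sketch_cpu {
    uint32_t cpuid_min_level; /* highest basic leaf some feature requires */
    unsigned nr_dies;         /* dies per package requested by the user */
};

/* Raise *min to value if it is currently lower; never lower it. */
static void sketch_raise_min_level(uint32_t *min, uint32_t value)
{
    assert(min != NULL);
    if (*min < value) {
        *min = value;
    }
}

/* Leaf 0x1F only has to be reachable when more than one die is configured. */
static uint32_t sketch_expand_level(struct sketch_cpu *cpu)
{
    if (cpu->nr_dies > 1) {
        sketch_raise_min_level(&cpu->cpuid_min_level, 0x1F);
    }
    return cpu->cpuid_min_level;
}
```

Because the update never lowers the level, a CPU model that already requires a higher leaf is left untouched.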

Signed-off-by: Like Xu 


This code doesn't look at host CPUID at all, as far as I can see.
Isn't it simpler to just make cpuid_x86_cpuid() return the
correct data?


I suggest the following change instead.

Signed-off-by: Eduardo Habkost 


Hi Eduardo,

Your code is more reasonable and concise than mine on this
so let's not break cpuid_x86_cpuid().

I'll remove the use of enable_cpuid_0x1f in the next version. Should I also
resend the patch series "Refactor cpu topo into machine properties",
since the rebase fixes may otherwise distract you?



---
  target/i386/cpu.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 6db38e145b..d05a224092 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5152,6 +5152,10 @@ static void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
  x86_cpu_adjust_level(cpu, &cpu->env.cpuid_min_level, 0x14);
  }
  
+if (env->nr_dies > 1) {
+x86_cpu_adjust_level(cpu, &env->cpuid_min_level, 0x1F);
+}
+
  /* SVM requires CPUID[0x8000000A] */
  if (env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_SVM) {
  x86_cpu_adjust_level(cpu, &env->cpuid_min_xlevel, 0x8000000A);






Re: [Qemu-devel] [PATCH v3 0/9] Introduce cpu die topology and enable CPUID.1F for i386

2019-06-18 Thread Like Xu

Ping for timely review.

On 2019/6/12 16:40, Like Xu wrote:

Multi-chip packaging technology allows integration of multiple cores in one die
and multiple dies in one single package, for example Intel CLX-AP or AMD EPYC.

This patch series extends the CPU topology to the socket/die/core/thread model,
allowing the number of dies per socket to be set on the -smp QEMU command line.
For i386, it updates the APIC ID generation and reverse-mapping functions and
exposes a new leaf, CPUID.1F, which is a preferred superset of leaf 0BH. The
CPUID.1F spec is on https://software.intel.com/en-us/articles/intel-sdm, 3-190 Vol 2A.

E.g. we use -smp 4,dies=2,cores=2,threads=1 to run a multi-die guest and
check the raw cpuid data; the expected output from the guest is as follows:
0x0000001f 0x00: eax=0x00000000 ebx=0x00000001 ecx=0x00000100 edx=0x00000002
0x0000001f 0x01: eax=0x00000001 ebx=0x00000002 ecx=0x00000201 edx=0x00000001
0x0000001f 0x02: eax=0x00000002 ebx=0x00000004 ecx=0x00000502 edx=0x00000003
0x0000001f 0x03: eax=0x00000000 ebx=0x00000000 ecx=0x00000003 edx=0x00000001

The guest system can discover the multi-die/package topology through CPUID.1F,
and its benefit is primarily for _reporting_ of the (virtual) CPU topology.
Multi-die/package support in the guest kernel has no impact on its
cache topology, NUMA topology, Linux scheduler, or system performance.

==changelog==

v3:

- add a MachineClass::smp_parse function pointer
- place the PC-specific function inside hw/i386/pc.c
- introduce die_id in a separate patch with default value 0
- set env->nr_dies in pc_new_cpu() and pc_cpu_pre_plug()
- fix a circular dependency between target/i386/cpu.c and hw/i386/pc.c
- fix cpu->die_id check in pc_cpu_pre_plug()
- Based on "[PATCH v3 00/10] Refactor cpu topo into machine properties"
- Rebase to commit 219dca61ebf41625831d4f96a720852baf44b762

v2: https://patchwork.kernel.org/cover/10953191/

- Enable cpu die-level topology only for PCMachine and X86CPU
- Minimize cpuid.0.eax to the setting value actually used by guest
- Update cmd line -smp docs for die-level configurations
- Refactoring topo-bit tests for x86_apicid_from_cpu_idx() with nr_dies
- Based on "[PATCH v3 00/10] Refactor cpu topo into machine properties"
- Rebase to commit 2259637b95bef3116cc262459271de08e038cc66

v1: https://patchwork.kernel.org/cover/10876667/

Like Xu (9):
   i386: Add die-level cpu topology to x86CPU on PCMachine
   hw/i386: Adjust nr_dies with configured smp_dies for PCMachine
   i386/cpu: Consolidate die-id validity in smp context
   i386: Update new x86_apicid parsing rules with die_offset support
   tests/x86-cpuid: Update testcases in test_topo_bits() with multiple dies
   i386/cpu: Add CPUID.1F generation support for multi-dies PCMachine
   target/i386: Support multi-dies when host doesn't support CPUID.1F
   machine: Refactor smp_parse() in vl.c as MachineClass::smp_parse()
   vl.c: Add -smp, dies=* command line support and update doc

  hmp.c  |   3 +
  hw/core/machine.c  |  89 ++
  hw/i386/pc.c   | 148 -
  include/hw/boards.h|   5 ++
  include/hw/i386/pc.h   |   3 +
  include/hw/i386/topology.h |  76 +--
  qapi/misc.json |   6 +-
  qemu-options.hx|  17 +++--
  target/i386/cpu.c  |  53 +++--
  target/i386/cpu.h  |   7 ++
  target/i386/kvm.c  |  36 -
  tests/test-x86-cpuid.c |  84 +++--
  vl.c   |  78 ++-
  13 files changed, 438 insertions(+), 167 deletions(-)






[Qemu-devel] [PATCH v3 4/9] i386: Update new x86_apicid parsing rules with die_offset support

2019-06-12 Thread Like Xu
In the new sockets/dies/cores/threads model, the APIC ID of a logical CPU
also encodes die-level information of the guest CPU topology, so
x86_apicid_from_cpu_idx() needs to be refactored to take the #dies value,
as do the apicid_*_offset() helpers.

To keep semantic compatibility, the legacy pkg_offset, which helps to
generate CPUIDs such as 0x3 for L3 cache, should be mapped to die_offset.
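As an illustration of how a die field can be slotted between the core and package fields of the APIC ID, here is a minimal self-contained sketch. All sketch_* names are hypothetical, and the layout (smt | core | die | pkg, low to high, each field just wide enough for its count) is an assumption modelled on the description above, not a copy of the helpers in include/hw/i386/topology.h:

```c
#include <assert.h>
#include <stdint.h>

/* Bits needed to encode IDs 0..count-1 (0 when count is 0 or 1). */
static unsigned sketch_bits_for(unsigned count)
{
    unsigned bits = 0;

    if (count == 0) {
        return 0;
    }
    for (count -= 1; count != 0; count >>= 1) {
        bits++;
    }
    return bits;
}

/* Field layout, low to high: smt | core | die | pkg. */
static unsigned sketch_core_offset(unsigned threads)
{
    return sketch_bits_for(threads);
}

static unsigned sketch_die_offset(unsigned cores, unsigned threads)
{
    return sketch_core_offset(threads) + sketch_bits_for(cores);
}

static unsigned sketch_pkg_offset(unsigned dies, unsigned cores,
                                  unsigned threads)
{
    return sketch_die_offset(cores, threads) + sketch_bits_for(dies);
}

/* Map a linear CPU index to an APIC ID for the given topology. */
static uint32_t sketch_apicid_from_cpu_idx(unsigned dies, unsigned cores,
                                           unsigned threads,
                                           unsigned cpu_index)
{
    unsigned smt_id, core_id, die_id, pkg_id;

    assert(dies > 0 && cores > 0 && threads > 0);
    smt_id = cpu_index % threads;
    core_id = cpu_index / threads % cores;
    die_id = cpu_index / (cores * threads) % dies;
    pkg_id = cpu_index / (dies * cores * threads);

    return ((uint32_t)pkg_id << sketch_pkg_offset(dies, cores, threads)) |
           ((uint32_t)die_id << sketch_die_offset(cores, threads)) |
           ((uint32_t)core_id << sketch_core_offset(threads)) |
           smt_id;
}
```

With one die per package the die field has width zero, so the die offset and the package offset coincide, which is what keeps the legacy layout unchanged.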

Signed-off-by: Like Xu 
---
 hw/i386/pc.c   | 29 ++-
 include/hw/i386/topology.h | 76 +++---
 target/i386/cpu.c  | 13 ---
 3 files changed, 81 insertions(+), 37 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 6e774c6c8e..b4dbd1064d 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -933,10 +933,11 @@ void enable_compat_apic_id_mode(void)
 static uint32_t x86_cpu_apic_id_from_index(MachineState *ms,
unsigned int cpu_index)
 {
+PCMachineState *pcms = PC_MACHINE(ms);
 uint32_t correct_id;
 static bool warned;
 
-correct_id = x86_apicid_from_cpu_idx(ms->smp.cores,
+correct_id = x86_apicid_from_cpu_idx(pcms->smp_dies, ms->smp.cores,
  ms->smp.threads, cpu_index);
 if (compat_apic_id_mode) {
 if (cpu_index != correct_id && !warned && !qtest_enabled()) {
@@ -2355,18 +2356,21 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 topo.die_id = cpu->die_id;
 topo.core_id = cpu->core_id;
 topo.smt_id = cpu->thread_id;
-cpu->apic_id = apicid_from_topo_ids(smp_cores, smp_threads, &topo);
+cpu->apic_id = apicid_from_topo_ids(pcms->smp_dies, smp_cores,
+smp_threads, &topo);
 }
 
 cpu_slot = pc_find_cpu_slot(MACHINE(pcms), cpu->apic_id, &idx);
 if (!cpu_slot) {
 MachineState *ms = MACHINE(pcms);
 
-x86_topo_ids_from_apicid(cpu->apic_id, smp_cores, smp_threads, &topo);
-error_setg(errp, "Invalid CPU [socket: %u, core: %u, thread: %u] with"
-  " APIC ID %" PRIu32 ", valid index range 0:%d",
-   topo.pkg_id, topo.core_id, topo.smt_id, cpu->apic_id,
-   ms->possible_cpus->len - 1);
+x86_topo_ids_from_apicid(cpu->apic_id, pcms->smp_dies,
+ smp_cores, smp_threads, &topo);
+error_setg(errp,
+"Invalid CPU [socket: %u, die: %u, core: %u, thread: %u] with"
+" APIC ID %" PRIu32 ", valid index range 0:%d",
+topo.pkg_id, topo.die_id, topo.core_id, topo.smt_id,
+cpu->apic_id, ms->possible_cpus->len - 1);
 return;
 }
 
@@ -2382,7 +2386,8 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 /* TODO: move socket_id/core_id/thread_id checks into x86_cpu_realizefn()
  * once -smp refactoring is complete and there will be CPU private
  * CPUState::nr_cores and CPUState::nr_threads fields instead of globals */
-x86_topo_ids_from_apicid(cpu->apic_id, smp_cores, smp_threads, &topo);
+x86_topo_ids_from_apicid(cpu->apic_id, pcms->smp_dies,
+ smp_cores, smp_threads, &topo);
 if (cpu->socket_id != -1 && cpu->socket_id != topo.pkg_id) {
 error_setg(errp, "property socket-id: %u doesn't match set apic-id:"
 " 0x%x (socket-id: %u)", cpu->socket_id, cpu->apic_id, topo.pkg_id);
@@ -2679,10 +2684,12 @@ pc_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
 static int64_t pc_get_default_cpu_node_id(const MachineState *ms, int idx)
 {
X86CPUTopoInfo topo;
+   PCMachineState *pcms = PC_MACHINE(ms);
 
assert(idx < ms->possible_cpus->len);
x86_topo_ids_from_apicid(ms->possible_cpus->cpus[idx].arch_id,
-ms->smp.cores, ms->smp.threads, &topo);
+pcms->smp_dies, ms->smp.cores,
+ms->smp.threads, &topo);
return topo.pkg_id % nb_numa_nodes;
 }
 
@@ -2690,6 +2697,7 @@ static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
 {
 int i;
 unsigned int max_cpus = ms->smp.max_cpus;
+PCMachineState *pcms = PC_MACHINE(ms);
 
 if (ms->possible_cpus) {
 /*
@@ -2710,7 +2718,8 @@ static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
 ms->possible_cpus->cpus[i].vcpus_count = 1;
 ms->possible_cpus->cpus[i].arch_id = x86_cpu_apic_id_from_index(ms, i);
 x86_topo_ids_from_apicid(ms->possible_cpus->cpus[i].arch_id,
- ms->smp.cores, ms->smp.threads, &topo);
+ pcms->smp_dies, ms->smp.cores,
+ ms->smp.threads, &topo);
 ms->possible_cpus->cpus[i].props.has_socket_id = true;
 ms->possib

[Qemu-devel] [PATCH v3 9/9] vl.c: Add -smp, dies=* command line support and update doc

2019-06-12 Thread Like Xu
For the PC target, users can configure the number of dies per package
via the command line with this patch, such as "-smp dies=2,cores=4".

The parsing rules of the new cpu-topology model obey the same restrictions and
logic as the legacy socket/core/thread model, in particular when computing
missing values.
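The missing-value rules can be sketched in isolation as below. This is a simplified illustration under stated assumptions: struct sketch_smp and sketch_smp_fill() are hypothetical names, 0 stands for "option not given" (dies is expected to be defaulted to 1 by the caller), errors are returned instead of calling exit(), and the deprecation warning for a product that differs from maxcpus is left out.

```c
#include <assert.h>

/* Hypothetical container for the -smp values; 0 means "not given". */
struct sketch_smp {
    unsigned cpus, sockets, dies, cores, threads, maxcpus;
};

/*
 * Fill in missing values, preferring sockets over cores over threads.
 * Returns 0 on success, -1 if the topology is inconsistent.
 */
static int sketch_smp_fill(struct sketch_smp *s)
{
    assert(s->dies > 0);

    if (s->cpus == 0 || s->sockets == 0) {
        s->cores = s->cores > 0 ? s->cores : 1;
        s->threads = s->threads > 0 ? s->threads : 1;
        if (s->cpus == 0) {
            s->sockets = s->sockets > 0 ? s->sockets : 1;
            s->cpus = s->cores * s->threads * s->dies * s->sockets;
        } else {
            unsigned max = s->maxcpus > 0 ? s->maxcpus : s->cpus;
            s->sockets = max / (s->cores * s->threads * s->dies);
        }
    } else if (s->cores == 0) {
        s->threads = s->threads > 0 ? s->threads : 1;
        s->cores = s->cpus / (s->sockets * s->dies * s->threads);
        s->cores = s->cores > 0 ? s->cores : 1;
    } else if (s->threads == 0) {
        s->threads = s->cpus / (s->cores * s->dies * s->sockets);
        s->threads = s->threads > 0 ? s->threads : 1;
    } else if (s->sockets * s->dies * s->cores * s->threads < s->cpus) {
        return -1;
    }

    if (s->maxcpus == 0) {
        s->maxcpus = s->cpus;
    }
    if (s->maxcpus < s->cpus ||
        s->sockets * s->dies * s->cores * s->threads > s->maxcpus) {
        return -1;
    }
    return 0;
}
```

For "-smp dies=2,cores=4" this yields one socket, one thread per core and eight CPUs in total.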

Signed-off-by: Like Xu 
---
 hw/i386/pc.c| 32 ++--
 qemu-options.hx | 17 +
 vl.c|  3 +++
 3 files changed, 30 insertions(+), 22 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 63b44bd2bd..8a5da4f0c1 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1543,10 +1543,13 @@ static void pc_new_cpu(PCMachineState *pcms, int64_t apic_id, Error **errp)
 
 void pc_smp_parse(MachineState *ms, QemuOpts *opts)
 {
-/* copy it from legacy smp_parse() in vl.c */
+PCMachineState *pcms = (PCMachineState *)
+object_dynamic_cast(OBJECT(ms), TYPE_PC_MACHINE);
+
 if (opts) {
 unsigned cpus= qemu_opt_get_number(opts, "cpus", 0);
 unsigned sockets = qemu_opt_get_number(opts, "sockets", 0);
+unsigned dies = qemu_opt_get_number(opts, "dies", 1);
 unsigned cores   = qemu_opt_get_number(opts, "cores", 0);
 unsigned threads = qemu_opt_get_number(opts, "threads", 0);
 
@@ -1556,24 +1559,24 @@ void pc_smp_parse(MachineState *ms, QemuOpts *opts)
 threads = threads > 0 ? threads : 1;
 if (cpus == 0) {
 sockets = sockets > 0 ? sockets : 1;
-cpus = cores * threads * sockets;
+cpus = cores * threads * dies * sockets;
 } else {
 ms->smp.max_cpus =
 qemu_opt_get_number(opts, "maxcpus", cpus);
-sockets = ms->smp.max_cpus / (cores * threads);
+sockets = ms->smp.max_cpus / (cores * threads * dies);
 }
 } else if (cores == 0) {
 threads = threads > 0 ? threads : 1;
-cores = cpus / (sockets * threads);
+cores = cpus / (sockets * dies * threads);
 cores = cores > 0 ? cores : 1;
 } else if (threads == 0) {
-threads = cpus / (cores * sockets);
+threads = cpus / (cores * dies * sockets);
 threads = threads > 0 ? threads : 1;
-} else if (sockets * cores * threads < cpus) {
+} else if (sockets * dies * cores * threads < cpus) {
 error_report("cpu topology: "
- "sockets (%u) * cores (%u) * threads (%u) < "
+ "sockets (%u) * dies (%u) * cores (%u) * threads (%u) < "
  "smp_cpus (%u)",
- sockets, cores, threads, cpus);
+ sockets, dies, cores, threads, cpus);
 exit(1);
 }
 
@@ -1585,26 +1588,27 @@ void pc_smp_parse(MachineState *ms, QemuOpts *opts)
 exit(1);
 }
 
-if (sockets * cores * threads > ms->smp.max_cpus) {
+if (sockets * dies * cores * threads > ms->smp.max_cpus) {
 error_report("cpu topology: "
- "sockets (%u) * cores (%u) * threads (%u) > "
+ "sockets (%u) * dies (%u) * cores (%u) * threads (%u) > "
  "maxcpus (%u)",
- sockets, cores, threads,
+ sockets, dies, cores, threads,
  ms->smp.max_cpus);
 exit(1);
 }
 
-if (sockets * cores * threads != ms->smp.max_cpus) {
+if (sockets * dies * cores * threads != ms->smp.max_cpus) {
 warn_report("Invalid CPU topology deprecated: "
-"sockets (%u) * cores (%u) * threads (%u) "
+"sockets (%u) * dies (%u) * cores (%u) * threads (%u) "
 "!= maxcpus (%u)",
-sockets, cores, threads,
+sockets, dies, cores, threads,
 ms->smp.max_cpus);
 }
 
 ms->smp.cpus = cpus;
 ms->smp.cores = cores;
 ms->smp.threads = threads;
+pcms->smp_dies = dies;
 }
 
 if (ms->smp.cpus > 1) {
diff --git a/qemu-options.hx b/qemu-options.hx
index 0d8beb4afd..a5b314a448 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -138,25 +138,26 @@ no incompatible TCG features have been enabled (e.g. icount/replay).
 ETEXI
 
 DEF("smp", HAS_ARG, QEMU_OPTION_smp,
-"-smp [cpus=]n[,maxcpus=cpus][,cores=cores][,threads=threads][,sockets=sockets]\n"
+"-smp [cpus=]n[,maxcpus=cpus][,cores=cores][,threads=threads][,dies=dies][,sockets=sockets]\n"
 " 

[Qemu-devel] [PATCH v3 6/9] i386/cpu: Add CPUID.1F generation support for multi-dies PCMachine

2019-06-12 Thread Like Xu
CPUID.1F, Intel's V2 Extended Topology Enumeration Leaf, is exposed when
the guest is configured with multiple software-visible dies within
each package. Per Intel's SDM, leaf 0x1f is a superset of 0xb, so it
can be generated by almost the same code as 0xb, apart from the die_offset
setting.

If the number of dies per package is less than 2, QEMU will not
expose CPUID.1F regardless of whether the host supports CPUID.1F.
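To make the per-subleaf register layout concrete, here is a self-contained sketch of how the four subleaves could be filled for a dies/cores/threads topology. The sketch_* names, the struct, and the width helper are hypothetical; the level-type values (1 = SMT, 2 = core, 5 = die) and the "all zeros when dies < 2" rule follow the patch below.

```c
#include <assert.h>
#include <stdint.h>

/* Level-type values placed in ECX[15:8], as used in the patch. */
enum { LEVEL_INVALID = 0, LEVEL_SMT = 1, LEVEL_CORE = 2, LEVEL_DIE = 5 };

struct sketch_regs {
    uint32_t eax, ebx, ecx, edx;
};

/* Bits needed to encode IDs 0..count-1. */
static unsigned sketch_width(unsigned count)
{
    unsigned bits = 0;

    assert(count > 0);
    for (count -= 1; count != 0; count >>= 1) {
        bits++;
    }
    return bits;
}

/* Fill one subleaf of leaf 0x1F for a dies/cores/threads topology. */
static struct sketch_regs sketch_cpuid_1f(unsigned dies, unsigned cores,
                                          unsigned threads,
                                          uint32_t apic_id, uint32_t subleaf)
{
    struct sketch_regs r = { 0, 0, 0, 0 };
    unsigned core_off, die_off, pkg_off;

    if (dies < 2) {
        return r;               /* leaf is not exposed: all zeros */
    }
    core_off = sketch_width(threads);
    die_off = core_off + sketch_width(cores);
    pkg_off = die_off + sketch_width(dies);

    r.ecx = subleaf & 0xff;
    r.edx = apic_id;
    switch (subleaf) {
    case 0:                     /* SMT level: shift to reach the core ID */
        r.eax = core_off;
        r.ebx = threads;
        r.ecx |= LEVEL_SMT << 8;
        break;
    case 1:                     /* core level: shift to reach the die ID */
        r.eax = die_off;
        r.ebx = cores * threads;
        r.ecx |= LEVEL_CORE << 8;
        break;
    case 2:                     /* die level: shift to reach the package ID */
        r.eax = pkg_off;
        r.ebx = dies * cores * threads;
        r.ecx |= LEVEL_DIE << 8;
        break;
    default:                    /* past the last level */
        r.ecx |= LEVEL_INVALID << 8;
        break;
    }
    r.ebx &= 0xffff;            /* only EBX[15:0] is defined */
    return r;
}
```

For dies=2, cores=2, threads=1 the die-level subleaf comes out as eax=2, ebx=4, ecx=0x502.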

Signed-off-by: Like Xu 
---
 target/i386/cpu.c | 37 +
 target/i386/cpu.h |  4 
 target/i386/kvm.c | 12 
 3 files changed, 53 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 09e20a2c3b..127aff74a6 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -4437,6 +4437,42 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
 *ecx |= CPUID_TOPOLOGY_LEVEL_INVALID;
 }
 
+assert(!(*eax & ~0x1f));
+*ebx &= 0xffff; /* The count doesn't need to be reliable. */
+break;
+case 0x1F:
+/* V2 Extended Topology Enumeration Leaf */
+if (env->nr_dies < 2 || !cpu->enable_cpuid_0x1f) {
+*eax = *ebx = *ecx = *edx = 0;
+break;
+}
+
+*ecx = count & 0xff;
+*edx = cpu->apic_id;
+switch (count) {
+case 0:
+*eax = apicid_core_offset(env->nr_dies, cs->nr_cores,
+cs->nr_threads);
+*ebx = cs->nr_threads;
+*ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
+break;
+case 1:
+*eax = apicid_die_offset(env->nr_dies, cs->nr_cores,
+   cs->nr_threads);
+*ebx = cs->nr_cores * cs->nr_threads;
+*ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
+break;
+case 2:
+*eax = apicid_pkg_offset(env->nr_dies, cs->nr_cores,
+   cs->nr_threads);
+*ebx = env->nr_dies * cs->nr_cores * cs->nr_threads;
+*ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
+break;
+default:
+*eax = 0;
+*ebx = 0;
+*ecx |= CPUID_TOPOLOGY_LEVEL_INVALID;
+}
 assert(!(*eax & ~0x1f));
 *ebx &= 0xffff; /* The count doesn't need to be reliable. */
 break;
@@ -5890,6 +5926,7 @@ static Property x86_cpu_properties[] = {
 DEFINE_PROP_BOOL("full-cpuid-auto-level", X86CPU, full_cpuid_auto_level, true),
 DEFINE_PROP_STRING("hv-vendor-id", X86CPU, hyperv_vendor_id),
 DEFINE_PROP_BOOL("cpuid-0xb", X86CPU, enable_cpuid_0xb, true),
+DEFINE_PROP_BOOL("cpuid-0x1f", X86CPU, enable_cpuid_0x1f, true),
 DEFINE_PROP_BOOL("lmce", X86CPU, enable_lmce, false),
 DEFINE_PROP_BOOL("l3-cache", X86CPU, enable_l3_cache, true),
 DEFINE_PROP_BOOL("kvm-no-smi-migration", X86CPU, kvm_no_smi_migration,
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 69495f0a8a..0434dfb62a 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -726,6 +726,7 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS];
 #define CPUID_TOPOLOGY_LEVEL_INVALID  (0U << 8)
 #define CPUID_TOPOLOGY_LEVEL_SMT  (1U << 8)
 #define CPUID_TOPOLOGY_LEVEL_CORE (2U << 8)
+#define CPUID_TOPOLOGY_LEVEL_DIE  (5U << 8)
 
 /* MSR Feature Bits */
 #define MSR_ARCH_CAP_RDCL_NO(1U << 0)
@@ -1444,6 +1445,9 @@ struct X86CPU {
 /* Compatibility bits for old machine types: */
 bool enable_cpuid_0xb;
 
+/* V2 Compatibility bits for old machine types: */
+bool enable_cpuid_0x1f;
+
 /* Enable auto level-increase for all CPUID leaves */
 bool full_cpuid_auto_level;
 
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 3b29ce5c0d..9b4da9b265 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -1081,6 +1081,10 @@ int kvm_arch_init_vcpu(CPUState *cs)
 }
 break;
 }
+case 0x1f:
+if (env->nr_dies < 2 || !cpu->enable_cpuid_0x1f) {
+break;
+}
 case 4:
 case 0xb:
 case 0xd:
@@ -1088,6 +1092,11 @@ int kvm_arch_init_vcpu(CPUState *cs)
 if (i == 0xd && j == 64) {
 break;
 }
+
+if (i == 0x1f && j == 64) {
+break;
+}
+
 c->function = i;
 c->flags = KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
 c->index = j;
@@ -1099,6 +1108,9 @@ int kvm_arch_init_vcpu(CPUState *cs)
 if (i == 0xb && !(c->ecx & 0xff00)) {
 break;
 }
+if (i == 0x1f && !(c->ecx & 0xff00)) {
+break;
+}
 if (i == 0xd && c->eax == 0) {
 continue;
 }
-- 
2.21.0




[Qemu-devel] [PATCH v3 5/9] tests/x86-cpuid: Update testcases in test_topo_bits() with multiple dies

2019-06-12 Thread Like Xu
The corresponding topo_bits tests are updated to support die configurations.

Signed-off-by: Like Xu 
---
 tests/test-x86-cpuid.c | 84 ++
 1 file changed, 45 insertions(+), 39 deletions(-)

diff --git a/tests/test-x86-cpuid.c b/tests/test-x86-cpuid.c
index ff225006e4..1942287f33 100644
--- a/tests/test-x86-cpuid.c
+++ b/tests/test-x86-cpuid.c
@@ -28,74 +28,80 @@
 
 static void test_topo_bits(void)
 {
-/* simple tests for 1 thread per core, 1 core per socket */
-g_assert_cmpuint(apicid_smt_width(1, 1), ==, 0);
-g_assert_cmpuint(apicid_core_width(1, 1), ==, 0);
+/* simple tests for 1 thread per core, 1 core per die, 1 die per package */
+g_assert_cmpuint(apicid_smt_width(1, 1, 1), ==, 0);
+g_assert_cmpuint(apicid_core_width(1, 1, 1), ==, 0);
+g_assert_cmpuint(apicid_die_width(1, 1, 1), ==, 0);
 
-g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 1, 0), ==, 0);
-g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 1, 1), ==, 1);
-g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 1, 2), ==, 2);
-g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 1, 3), ==, 3);
+g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 1, 1, 0), ==, 0);
+g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 1, 1, 1), ==, 1);
+g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 1, 1, 2), ==, 2);
+g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 1, 1, 3), ==, 3);
 
 
 /* Test field width calculation for multiple values
  */
-g_assert_cmpuint(apicid_smt_width(1, 2), ==, 1);
-g_assert_cmpuint(apicid_smt_width(1, 3), ==, 2);
-g_assert_cmpuint(apicid_smt_width(1, 4), ==, 2);
+g_assert_cmpuint(apicid_smt_width(1, 1, 2), ==, 1);
+g_assert_cmpuint(apicid_smt_width(1, 1, 3), ==, 2);
+g_assert_cmpuint(apicid_smt_width(1, 1, 4), ==, 2);
 
-g_assert_cmpuint(apicid_smt_width(1, 14), ==, 4);
-g_assert_cmpuint(apicid_smt_width(1, 15), ==, 4);
-g_assert_cmpuint(apicid_smt_width(1, 16), ==, 4);
-g_assert_cmpuint(apicid_smt_width(1, 17), ==, 5);
+g_assert_cmpuint(apicid_smt_width(1, 1, 14), ==, 4);
+g_assert_cmpuint(apicid_smt_width(1, 1, 15), ==, 4);
+g_assert_cmpuint(apicid_smt_width(1, 1, 16), ==, 4);
+g_assert_cmpuint(apicid_smt_width(1, 1, 17), ==, 5);
 
 
-g_assert_cmpuint(apicid_core_width(30, 2), ==, 5);
-g_assert_cmpuint(apicid_core_width(31, 2), ==, 5);
-g_assert_cmpuint(apicid_core_width(32, 2), ==, 5);
-g_assert_cmpuint(apicid_core_width(33, 2), ==, 6);
+g_assert_cmpuint(apicid_core_width(1, 30, 2), ==, 5);
+g_assert_cmpuint(apicid_core_width(1, 31, 2), ==, 5);
+g_assert_cmpuint(apicid_core_width(1, 32, 2), ==, 5);
+g_assert_cmpuint(apicid_core_width(1, 33, 2), ==, 6);
 
+g_assert_cmpuint(apicid_die_width(1, 30, 2), ==, 0);
+g_assert_cmpuint(apicid_die_width(2, 30, 2), ==, 1);
+g_assert_cmpuint(apicid_die_width(3, 30, 2), ==, 2);
+g_assert_cmpuint(apicid_die_width(4, 30, 2), ==, 2);
 
 /* build a weird topology and see if IDs are calculated correctly
  */
 
 /* This will use 2 bits for thread ID and 3 bits for core ID
  */
-g_assert_cmpuint(apicid_smt_width(6, 3), ==, 2);
-g_assert_cmpuint(apicid_core_width(6, 3), ==, 3);
-g_assert_cmpuint(apicid_pkg_offset(6, 3), ==, 5);
+g_assert_cmpuint(apicid_smt_width(1, 6, 3), ==, 2);
+g_assert_cmpuint(apicid_core_offset(1, 6, 3), ==, 2);
+g_assert_cmpuint(apicid_die_offset(1, 6, 3), ==, 5);
+g_assert_cmpuint(apicid_pkg_offset(1, 6, 3), ==, 5);
 
-g_assert_cmpuint(x86_apicid_from_cpu_idx(6, 3, 0), ==, 0);
-g_assert_cmpuint(x86_apicid_from_cpu_idx(6, 3, 1), ==, 1);
-g_assert_cmpuint(x86_apicid_from_cpu_idx(6, 3, 2), ==, 2);
+g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 6, 3, 0), ==, 0);
+g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 6, 3, 1), ==, 1);
+g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 6, 3, 2), ==, 2);
 
-g_assert_cmpuint(x86_apicid_from_cpu_idx(6, 3, 1 * 3 + 0), ==,
+g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 6, 3, 1 * 3 + 0), ==,
  (1 << 2) | 0);
-g_assert_cmpuint(x86_apicid_from_cpu_idx(6, 3, 1 * 3 + 1), ==,
+g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 6, 3, 1 * 3 + 1), ==,
  (1 << 2) | 1);
-g_assert_cmpuint(x86_apicid_from_cpu_idx(6, 3, 1 * 3 + 2), ==,
+g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 6, 3, 1 * 3 + 2), ==,
  (1 << 2) | 2);
 
-g_assert_cmpuint(x86_apicid_from_cpu_idx(6, 3, 2 * 3 + 0), ==,
+g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 6, 3, 2 * 3 + 0), ==,
  (2 << 2) | 0);
-g_assert_cmpuint(x86_apicid_from_cpu_idx(6, 3, 2 * 3 + 1), ==,
+g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 6, 3, 2 * 3 + 1), ==,
  (2 << 2) | 1);
-g_assert_cmpuint(x86_apicid_from_cpu_idx(6, 3, 2 * 3 + 2), ==,
+g_assert_cmpuint(x86_apicid_from_cpu_idx(1, 6, 3, 2 * 3 + 2), ==,
  (2 <&

[Qemu-devel] [PATCH v3 0/9] Introduce cpu die topology and enable CPUID.1F for i386

2019-06-12 Thread Like Xu
Multi-chip packaging technology allows integration of multiple cores in one die
and multiple dies in one single package, for example Intel CLX-AP or AMD EPYC.

This patch series extends the CPU topology to the socket/die/core/thread model,
allowing the number of dies per socket to be set on the -smp QEMU command line.
For i386, it updates the APIC ID generation and reverse-mapping functions and
exposes a new leaf, CPUID.1F, which is a preferred superset of leaf 0BH. The
CPUID.1F spec is on https://software.intel.com/en-us/articles/intel-sdm, 3-190 Vol 2A.

E.g. we use -smp 4,dies=2,cores=2,threads=1 to run a multi-die guest and
check the raw cpuid data; the expected output from the guest is as follows:
0x0000001f 0x00: eax=0x00000000 ebx=0x00000001 ecx=0x00000100 edx=0x00000002
0x0000001f 0x01: eax=0x00000001 ebx=0x00000002 ecx=0x00000201 edx=0x00000001
0x0000001f 0x02: eax=0x00000002 ebx=0x00000004 ecx=0x00000502 edx=0x00000003
0x0000001f 0x03: eax=0x00000000 ebx=0x00000000 ecx=0x00000003 edx=0x00000001

The guest system can discover the multi-die/package topology through CPUID.1F,
and its benefit is primarily for _reporting_ of the (virtual) CPU topology.
Multi-die/package support in the guest kernel has no impact on its
cache topology, NUMA topology, Linux scheduler, or system performance.
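For readers checking the raw cpuid output by hand, the register fields of one CPUID.1F subleaf can be unpacked as in the sketch below. The sketch_* names are hypothetical; the field positions (EAX[4:0] shift, EBX[15:0] logical processor count, ECX[15:8] level type, ECX[7:0] subleaf index) follow the leaf description in the Intel SDM.

```c
#include <assert.h>
#include <stdint.h>

struct sketch_level {
    unsigned shift;       /* EAX[4:0]: bits to shift the x2APIC ID right */
    unsigned nr_logical;  /* EBX[15:0]: logical processors at this level */
    unsigned level_type;  /* ECX[15:8]: 1 = SMT, 2 = core, 5 = die */
    unsigned level_index; /* ECX[7:0]: echoes the subleaf number */
};

/* Unpack the raw registers of one CPUID.1F subleaf. */
static struct sketch_level sketch_decode_1f(uint32_t eax, uint32_t ebx,
                                            uint32_t ecx)
{
    struct sketch_level l;

    l.shift = eax & 0x1f;
    l.nr_logical = ebx & 0xffff;
    l.level_type = (ecx >> 8) & 0xff;
    l.level_index = ecx & 0xff;
    return l;
}

/* Human-readable name for the level types used in this series. */
static const char *sketch_level_name(unsigned level_type)
{
    switch (level_type) {
    case 1:
        return "smt";
    case 2:
        return "core";
    case 5:
        return "die";
    default:
        return "invalid";
    }
}

/* ID of the next-higher topology level that contains this logical CPU. */
static uint32_t sketch_parent_id(uint32_t x2apic_id,
                                 const struct sketch_level *l)
{
    assert(l->shift < 32);
    return x2apic_id >> l->shift;
}
```

Applied to the subleaf 2 row of the example, this reports a die level covering four logical processors, with a shift of two bits to reach the package ID.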

==changelog==

v3:

- add a MachineClass::smp_parse function pointer
- place the PC-specific function inside hw/i386/pc.c
- introduce die_id in a separate patch with default value 0
- set env->nr_dies in pc_new_cpu() and pc_cpu_pre_plug()
- fix a circular dependency between target/i386/cpu.c and hw/i386/pc.c
- fix cpu->die_id check in pc_cpu_pre_plug()
- Based on "[PATCH v3 00/10] Refactor cpu topo into machine properties"
- Rebase to commit 219dca61ebf41625831d4f96a720852baf44b762

v2: https://patchwork.kernel.org/cover/10953191/

- Enable cpu die-level topology only for PCMachine and X86CPU
- Minimize cpuid.0.eax to the setting value actually used by guest
- Update cmd line -smp docs for die-level configurations
- Refactoring topo-bit tests for x86_apicid_from_cpu_idx() with nr_dies
- Based on "[PATCH v3 00/10] Refactor cpu topo into machine properties"
- Rebase to commit 2259637b95bef3116cc262459271de08e038cc66

v1: https://patchwork.kernel.org/cover/10876667/

Like Xu (9):
  i386: Add die-level cpu topology to x86CPU on PCMachine
  hw/i386: Adjust nr_dies with configured smp_dies for PCMachine
  i386/cpu: Consolidate die-id validity in smp context
  i386: Update new x86_apicid parsing rules with die_offset support
  tests/x86-cpuid: Update testcases in test_topo_bits() with multiple dies
  i386/cpu: Add CPUID.1F generation support for multi-dies PCMachine
  target/i386: Support multi-dies when host doesn't support CPUID.1F
  machine: Refactor smp_parse() in vl.c as MachineClass::smp_parse()
  vl.c: Add -smp, dies=* command line support and update doc

 hmp.c  |   3 +
 hw/core/machine.c  |  89 ++
 hw/i386/pc.c   | 148 -
 include/hw/boards.h|   5 ++
 include/hw/i386/pc.h   |   3 +
 include/hw/i386/topology.h |  76 +--
 qapi/misc.json |   6 +-
 qemu-options.hx|  17 +++--
 target/i386/cpu.c  |  53 +++--
 target/i386/cpu.h  |   7 ++
 target/i386/kvm.c  |  36 -
 tests/test-x86-cpuid.c |  84 +++--
 vl.c   |  78 ++-
 13 files changed, 438 insertions(+), 167 deletions(-)

-- 
2.21.0




[Qemu-devel] [PATCH v3 1/9] i386: Add die-level cpu topology to x86CPU on PCMachine

2019-06-12 Thread Like Xu
The die level, as the first PC-specific cpu topology level, is added to the
legacy cpu topology model, which implicitly has one die per package and in
which only the numbers of sockets/cores/threads are configurable.

In the new model with die-level support, the total number of logical
processors (including offline) on board will be calculated as:

 #cpus = #sockets * #dies * #cores * #threads

and considering compatibility, the default value for #dies would be
initialized to one in x86_cpu_initfn() and pc_machine_initfn().
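A tiny sketch of the formula and of why defaulting #dies to one keeps legacy configurations unchanged; struct sketch_topo and both helpers are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stddef.h>

struct sketch_topo {
    unsigned sockets, dies, cores, threads;
};

/* A legacy topology never mentions dies, so it gets one die per package. */
static struct sketch_topo sketch_legacy_topo(unsigned sockets, unsigned cores,
                                             unsigned threads)
{
    struct sketch_topo t = { sockets, 1, cores, threads };
    return t;
}

/* #cpus = #sockets * #dies * #cores * #threads */
static unsigned sketch_total_cpus(const struct sketch_topo *t)
{
    assert(t != NULL);
    return t->sockets * t->dies * t->cores * t->threads;
}
```

With dies fixed at 1 the product reduces to the old sockets * cores * threads value.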

Signed-off-by: Like Xu 
---
 hw/i386/pc.c | 9 +++--
 include/hw/i386/pc.h | 2 ++
 target/i386/cpu.c| 1 +
 target/i386/cpu.h| 2 ++
 4 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 12c1e08b85..9e9a42f007 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2308,9 +2308,13 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 return;
 }
 
-/* if APIC ID is not set, set it based on socket/core/thread properties */
+/*
+ * If APIC ID is not set,
+ * set it based on socket/die/core/thread properties.
+ */
 if (cpu->apic_id == UNASSIGNED_APIC_ID) {
-int max_socket = (ms->smp.max_cpus - 1) / smp_threads / smp_cores;
+int max_socket = (ms->smp.max_cpus - 1) /
+smp_threads / smp_cores / pcms->smp_dies;
 
 if (cpu->socket_id < 0) {
 error_setg(errp, "CPU socket-id is not set");
@@ -2620,6 +2624,7 @@ static void pc_machine_initfn(Object *obj)
 pcms->smbus_enabled = true;
 pcms->sata_enabled = true;
 pcms->pit_enabled = true;
+pcms->smp_dies = 1;
 
 pc_system_flash_create(pcms);
 }
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index b260262640..fae9217e34 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -24,6 +24,7 @@
  * PCMachineState:
  * @acpi_dev: link to ACPI PM device that performs ACPI hotplug handling
  * @boot_cpus: number of present VCPUs
+ * @smp_dies: number of dies per one package
  */
 struct PCMachineState {
 /*< private >*/
@@ -59,6 +60,7 @@ struct PCMachineState {
 bool apic_xrupt_override;
 unsigned apic_id_limit;
 uint16_t boot_cpus;
+unsigned smp_dies;
 
 /* NUMA information: */
 uint64_t numa_nodes;
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 23119699de..a16be205fe 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5619,6 +5619,7 @@ static void x86_cpu_initfn(Object *obj)
 CPUX86State *env = &cpu->env;
 FeatureWord w;
 
+env->nr_dies = 1;
 cpu_set_cpustate_pointers(cpu);
 
 object_property_add(obj, "family", "int",
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index edad6e1efb..5daa2eeafa 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1349,6 +1349,8 @@ typedef struct CPUX86State {
 uint64_t xss;
 
 TPRAccess tpr_access_type;
+
+unsigned nr_dies;
 } CPUX86State;
 
 struct kvm_msrs;
-- 
2.21.0




[Qemu-devel] [PATCH v3 2/9] hw/i386: Adjust nr_dies with configured smp_dies for PCMachine

2019-06-12 Thread Like Xu
To support multi-die configurations on PCMachine, the best place to
set CPUX86State->nr_dies from the requested PCMachineState->smp_dies is in
pc_new_cpu() and pc_cpu_pre_plug(). pc_new_cpu() is refactored accordingly
and its redundant parameter "const char *typename" is removed.

Suggested-by: Eduardo Habkost 
Signed-off-by: Like Xu 
---
 hw/i386/pc.c | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 9e9a42f007..af2e95a1b9 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1520,12 +1520,16 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int level)
 }
 }
 
-static void pc_new_cpu(const char *typename, int64_t apic_id, Error **errp)
+static void pc_new_cpu(PCMachineState *pcms, int64_t apic_id, Error **errp)
 {
 Object *cpu = NULL;
 Error *local_err = NULL;
+CPUX86State *env = NULL;
 
-cpu = object_new(typename);
+cpu = object_new(MACHINE(pcms)->cpu_type);
+
+env = &X86_CPU(cpu)->env;
+env->nr_dies = pcms->smp_dies;
 
 object_property_set_uint(cpu, apic_id, "apic-id", &local_err);
 object_property_set_bool(cpu, true, "realized", &local_err);
@@ -1551,7 +1555,7 @@ void pc_hot_add_cpu(MachineState *ms, const int64_t id, Error **errp)
 return;
 }
 
-pc_new_cpu(ms->cpu_type, apic_id, &local_err);
+pc_new_cpu(PC_MACHINE(ms), apic_id, &local_err);
 if (local_err) {
 error_propagate(errp, local_err);
 return;
@@ -1576,8 +1580,7 @@ void pc_cpus_init(PCMachineState *pcms)
  ms->smp.max_cpus - 1) + 1;
 possible_cpus = mc->possible_cpu_arch_ids(ms);
 for (i = 0; i < ms->smp.cpus; i++) {
-pc_new_cpu(possible_cpus->cpus[i].type, possible_cpus->cpus[i].arch_id,
-   &error_fatal);
+pc_new_cpu(pcms, possible_cpus->cpus[i].arch_id, &error_fatal);
 }
 }
 
@@ -2297,6 +2300,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 CPUArchId *cpu_slot;
 X86CPUTopoInfo topo;
 X86CPU *cpu = X86_CPU(dev);
+CPUX86State *env = &cpu->env;
 MachineState *ms = MACHINE(hotplug_dev);
 PCMachineState *pcms = PC_MACHINE(hotplug_dev);
 unsigned int smp_cores = ms->smp.cores;
@@ -2308,6 +2312,8 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 return;
 }
 
+env->nr_dies = pcms->smp_dies;
+
 /*
  * If APIC ID is not set,
  * set it based on socket/die/core/thread properties.
-- 
2.21.0




[Qemu-devel] [PATCH v3 8/9] machine: Refactor smp_parse() in vl.c as MachineClass::smp_parse()

2019-06-12 Thread Like Xu
To make smp_parse() more flexible and extensible, a smp_parse function
pointer is added to MachineClass that machine types can override.

The generic smp_parse() code in vl.c is moved to hw/core/machine.c and
becomes the default implementation of MachineClass::smp_parse. A PC-specific
function called pc_smp_parse() is added to hw/i386/pc.c; in this patch it
behaves identically to the default one.
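The override mechanism is the usual "function pointer in the class record" pattern. A stripped-down sketch, assuming hypothetical types and names (struct sketch_machine_class, sketch_pc_smp_parse, and so on) rather than QEMU's QOM machinery:

```c
#include <assert.h>
#include <stddef.h>

struct sketch_machine;

/* Hypothetical class record: one hook, overridable per machine type. */
struct sketch_machine_class {
    void (*smp_parse)(struct sketch_machine *m, unsigned requested_cpus);
};

struct sketch_machine {
    const struct sketch_machine_class *klass;
    unsigned cpus;
    unsigned dies;
};

/* Generic default: knows nothing about dies. */
static void sketch_default_smp_parse(struct sketch_machine *m,
                                     unsigned requested_cpus)
{
    m->cpus = requested_cpus > 0 ? requested_cpus : 1;
}

/* PC-flavoured override: reuses the default, then adds its own state. */
static void sketch_pc_smp_parse(struct sketch_machine *m,
                                unsigned requested_cpus)
{
    sketch_default_smp_parse(m, requested_cpus);
    m->dies = 1;
}

static const struct sketch_machine_class sketch_generic_class = {
    sketch_default_smp_parse
};
static const struct sketch_machine_class sketch_pc_class = {
    sketch_pc_smp_parse
};

/* Callers dispatch through the class and never name a concrete parser. */
static void sketch_machine_parse(struct sketch_machine *m,
                                 unsigned requested_cpus)
{
    assert(m->klass != NULL && m->klass->smp_parse != NULL);
    m->klass->smp_parse(m, requested_cpus);
}
```

The caller stays unchanged when a machine type later grows extra topology levels; only the hook installed in its class record differs.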

Suggested-by: Eduardo Habkost 
Signed-off-by: Like Xu 
---
 hw/core/machine.c| 77 
 hw/i386/pc.c | 76 +++
 include/hw/boards.h  |  5 +++
 include/hw/i386/pc.h |  1 +
 vl.c | 75 ++
 5 files changed, 161 insertions(+), 73 deletions(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index 9eeba448ed..d58a684abf 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -11,6 +11,9 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/option.h"
+#include "qapi/qmp/qerror.h"
+#include "sysemu/replay.h"
 #include "qemu/units.h"
 #include "hw/boards.h"
 #include "qapi/error.h"
@@ -722,6 +725,79 @@ void machine_set_cpu_numa_node(MachineState *machine,
 }
 }
 
+static void smp_parse(MachineState *ms, QemuOpts *opts)
+{
+/* copy it from legacy smp_parse() in vl.c */
+if (opts) {
+unsigned cpus= qemu_opt_get_number(opts, "cpus", 0);
+unsigned sockets = qemu_opt_get_number(opts, "sockets", 0);
+unsigned cores   = qemu_opt_get_number(opts, "cores", 0);
+unsigned threads = qemu_opt_get_number(opts, "threads", 0);
+
+/* compute missing values, prefer sockets over cores over threads */
+if (cpus == 0 || sockets == 0) {
+cores = cores > 0 ? cores : 1;
+threads = threads > 0 ? threads : 1;
+if (cpus == 0) {
+sockets = sockets > 0 ? sockets : 1;
+cpus = cores * threads * sockets;
+} else {
+ms->smp.max_cpus =
+qemu_opt_get_number(opts, "maxcpus", cpus);
+sockets = ms->smp.max_cpus / (cores * threads);
+}
+} else if (cores == 0) {
+threads = threads > 0 ? threads : 1;
+cores = cpus / (sockets * threads);
+cores = cores > 0 ? cores : 1;
+} else if (threads == 0) {
+threads = cpus / (cores * sockets);
+threads = threads > 0 ? threads : 1;
+} else if (sockets * cores * threads < cpus) {
+error_report("cpu topology: "
+ "sockets (%u) * cores (%u) * threads (%u) < "
+ "smp_cpus (%u)",
+ sockets, cores, threads, cpus);
+exit(1);
+}
+
+ms->smp.max_cpus =
+qemu_opt_get_number(opts, "maxcpus", cpus);
+
+if (ms->smp.max_cpus < cpus) {
+error_report("maxcpus must be equal to or greater than smp");
+exit(1);
+}
+
+if (sockets * cores * threads > ms->smp.max_cpus) {
+error_report("cpu topology: "
+ "sockets (%u) * cores (%u) * threads (%u) > "
+ "maxcpus (%u)",
+ sockets, cores, threads,
+ ms->smp.max_cpus);
+exit(1);
+}
+
+if (sockets * cores * threads != ms->smp.max_cpus) {
+warn_report("Invalid CPU topology deprecated: "
+"sockets (%u) * cores (%u) * threads (%u) "
+"!= maxcpus (%u)",
+sockets, cores, threads,
+ms->smp.max_cpus);
+}
+
+ms->smp.cpus = cpus;
+ms->smp.cores = cores;
+ms->smp.threads = threads;
+}
+
+if (ms->smp.cpus > 1) {
+Error *blocker = NULL;
+error_setg(&blocker, QERR_REPLAY_NOT_SUPPORTED, "smp");
+replay_add_blocker(blocker);
+}
+}
+
 static void machine_class_init(ObjectClass *oc, void *data)
 {
 MachineClass *mc = MACHINE_CLASS(oc);
@@ -729,6 +805,7 @@ static void machine_class_init(ObjectClass *oc, void *data)
 /* Default 128 MB as guest ram size */
 mc->default_ram_size = 128 * MiB;
 mc->rom_file_has_mr = true;
+mc->smp_parse = smp_parse;
 
 /* numa node memory size aligned on 8MB by default.
  * On Linux, each node's border has to be 8MB aligned
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index b4dbd1064d..63b44bd2bd 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -78,6 +78,8 @@
 #include "hw/i386/intel_iommu.h"
 #include "hw/net/n

[Qemu-devel] [PATCH v3 3/9] i386/cpu: Consolidate die-id validity in smp context

2019-06-12 Thread Like Xu
The field die_id (default as 0) and has_die_id are introduced to X86CPU.
Following the legacy smp check rules, the die_id validity is added to
the same contexts as legacy smp variables such as hmp_hotpluggable_cpus(),
machine_set_cpu_numa_node(), cpu_slot_to_string() and pc_cpu_pre_plug().

Acked-by: Dr. David Alan Gilbert 
Signed-off-by: Like Xu 
---
 hmp.c  |  3 +++
 hw/core/machine.c  | 12 
 hw/i386/pc.c   | 14 ++
 include/hw/i386/topology.h |  2 ++
 qapi/misc.json |  6 --
 target/i386/cpu.c  |  2 ++
 target/i386/cpu.h  |  1 +
 7 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/hmp.c b/hmp.c
index be5e345c6f..b567c86628 100644
--- a/hmp.c
+++ b/hmp.c
@@ -3113,6 +3113,9 @@ void hmp_hotpluggable_cpus(Monitor *mon, const QDict *qdict)
 if (c->has_socket_id) {
 monitor_printf(mon, "socket-id: \"%" PRIu64 "\"\n", c->socket_id);
 }
+if (c->has_die_id) {
+monitor_printf(mon, "die-id: \"%" PRIu64 "\"\n", c->die_id);
+}
 if (c->has_core_id) {
 monitor_printf(mon, "core-id: \"%" PRIu64 "\"\n", c->core_id);
 }
diff --git a/hw/core/machine.c b/hw/core/machine.c
index f1a0f45f9c..9eeba448ed 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -679,6 +679,11 @@ void machine_set_cpu_numa_node(MachineState *machine,
 return;
 }
 
+if (props->has_die_id && !slot->props.has_die_id) {
+error_setg(errp, "die-id is not supported");
+return;
+}
+
 /* skip slots with explicit mismatch */
 if (props->has_thread_id && props->thread_id != slot->props.thread_id) {
 continue;
@@ -688,6 +693,10 @@ void machine_set_cpu_numa_node(MachineState *machine,
 continue;
 }
 
+if (props->has_die_id && props->die_id != slot->props.die_id) {
+continue;
+}
+
 if (props->has_socket_id && props->socket_id != slot->props.socket_id) {
 continue;
 }
@@ -945,6 +954,9 @@ static char *cpu_slot_to_string(const CPUArchId *cpu)
 if (cpu->props.has_socket_id) {
 g_string_append_printf(s, "socket-id: %"PRId64, cpu->props.socket_id);
 }
+if (cpu->props.has_die_id) {
+g_string_append_printf(s, "die-id: %"PRId64, cpu->props.die_id);
+}
 if (cpu->props.has_core_id) {
 if (s->len) {
 g_string_append_printf(s, ", ");
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index af2e95a1b9..6e774c6c8e 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2329,6 +2329,10 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 error_setg(errp, "Invalid CPU socket-id: %u must be in range 0:%u",
cpu->socket_id, max_socket);
 return;
+} else if (cpu->die_id > pcms->smp_dies - 1) {
+error_setg(errp, "Invalid CPU die-id: %u must be in range 0:%u",
+   cpu->die_id, pcms->smp_dies - 1);
+return;
 }
 if (cpu->core_id < 0) {
 error_setg(errp, "CPU core-id is not set");
@@ -2348,6 +2352,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 }
 
 topo.pkg_id = cpu->socket_id;
+topo.die_id = cpu->die_id;
 topo.core_id = cpu->core_id;
 topo.smt_id = cpu->thread_id;
 cpu->apic_id = apicid_from_topo_ids(smp_cores, smp_threads, &topo);
@@ -2385,6 +2390,13 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 }
 cpu->socket_id = topo.pkg_id;
 
+if (cpu->die_id != -1 && cpu->die_id != topo.die_id) {
+error_setg(errp, "property die-id: %u doesn't match set apic-id:"
+" 0x%x (die-id: %u)", cpu->die_id, cpu->apic_id, topo.die_id);
+return;
+}
+cpu->die_id = topo.die_id;
+
 if (cpu->core_id != -1 && cpu->core_id != topo.core_id) {
 error_setg(errp, "property core-id: %u doesn't match set apic-id:"
 " 0x%x (core-id: %u)", cpu->core_id, cpu->apic_id, topo.core_id);
@@ -2701,6 +2713,8 @@ static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
  ms->smp.cores, ms->smp.threads, &topo);
 ms->possible_cpus->cpus[i].props.has_socket_id = true;
 ms->possible_cpus->cpus[i].props.socket_id = topo.pkg_id;
+ms->possible_cpus->cpus[i].props.has_die_id = true;
+ms->possible_cpus->cpus[i].props.die_id = topo.die_id;
 ms->possible_cpu

[Qemu-devel] [PATCH v3 7/9] target/i386: Support multi-dies when host doesn't support CPUID.1F

2019-06-12 Thread Like Xu
During guest CPUID generation, cpuid_min_level is adjusted to the highest
valid basic CPUID leaf that was actually configured, and it should not be
restricted by the limit value returned from cpu_x86_cpuid(). After the basic
cpu_x86_cpuid() loop finishes, cpuid_0_entry.eax needs to be configured
again using the final adjusted cpuid_min_level value.

If a user wants to expose CPUID.1F by passing dies > 1 for any reason without
host support, a per-cpu smp topology warning appears but the setup is not
blocked.
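
A minimal sketch of the adjustment described above, assuming a simplified
entry layout: find the highest basic leaf whose registers are not all zero,
then write that value back into leaf 0's EAX. The DemoCpuidEntry type and the
demo_* helpers are hypothetical and much simpler than struct kvm_cpuid_entry2.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one CPUID entry. */
typedef struct DemoCpuidEntry {
    uint32_t function;              /* leaf number */
    uint32_t eax, ebx, ecx, edx;    /* register values for that leaf */
} DemoCpuidEntry;

/* Highest leaf number whose registers are not all zero. */
static uint32_t demo_max_valid_leaf(const DemoCpuidEntry *e, size_t n)
{
    uint32_t max_leaf = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if ((e[i].eax | e[i].ebx | e[i].ecx | e[i].edx) &&
            e[i].function > max_leaf) {
            max_leaf = e[i].function;
        }
    }
    return max_leaf;
}

/* Re-configure leaf 0 so that EAX reports the adjusted maximum leaf. */
static void demo_fixup_leaf0(DemoCpuidEntry *e, size_t n)
{
    uint32_t max_leaf = demo_max_valid_leaf(e, n);
    size_t i;

    for (i = 0; i < n; i++) {
        if (e[i].function == 0) {
            e[i].eax = max_leaf;
        }
    }
}
```

The point of the second pass is that leaf 0 is filled in before the loop
knows which higher leaves end up populated.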

Signed-off-by: Like Xu 
---
 target/i386/kvm.c | 24 ++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 9b4da9b265..8bf1604d2b 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -931,12 +931,12 @@ int kvm_arch_init_vcpu(CPUState *cs)
 struct kvm_cpuid_entry2 *c;
 uint32_t signature[3];
 int kvm_base = KVM_CPUID_SIGNATURE;
-int r;
+int r, cpuid_0_entry, cpuid_min_level;
 Error *local_err = NULL;
 
 memset(&cpuid_data, 0, sizeof(cpuid_data));
 
-cpuid_i = 0;
+cpuid_i = cpuid_0_entry = cpuid_min_level = 0;
 
 r = kvm_arch_set_tsc_khz(cs);
 if (r < 0) {
@@ -1050,6 +1050,12 @@ int kvm_arch_init_vcpu(CPUState *cs)
 
 cpu_x86_cpuid(env, 0, 0, &limit, &unused, &unused, &unused);
 
+/* Allow 0x1f setting regardless of kvm support if nr_dies > 1 */
+if (limit < 0x1f && env->nr_dies > 1 && cpu->enable_cpuid_0x1f) {
+limit = env->cpuid_level = env->cpuid_min_level = 0x1f;
+warn_report("CPU topology: the CPUID.1F isn't supported on the host.");
+}
+
 for (i = 0; i <= limit; i++) {
 if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
 fprintf(stderr, "unsupported level value: 0x%x\n", limit);
@@ -1151,8 +1157,22 @@ int kvm_arch_init_vcpu(CPUState *cs)
 cpu_x86_cpuid(env, i, 0, &c->eax, &c->ebx, &c->ecx, &c->edx);
 break;
 }
+
+/* Remember the index of cpuid.0 leaf for reconfiguration. */
+cpuid_0_entry = (i == 0) ? (cpuid_i - 1) : cpuid_0_entry;
+
+/* Adjust cpuid_min_level to the maximum index of valid basic cpuids. */
+cpuid_min_level =
+((c->eax | c->ebx | c->ecx | c->edx | c->flags | c->index) &&
+(i > cpuid_min_level)) ? i : cpuid_min_level;
 }
 
+env->cpuid_level = env->cpuid_min_level = cpuid_min_level;
+
+/* Reconfigure cpuid_0_eax value to follow CPUID.0 instruction spec.*/
+c = &cpuid_data.entries[cpuid_0_entry];
+cpu_x86_cpuid(env, 0, 0, &c->eax, &c->ebx, &c->ecx, &c->edx);
+
 if (limit >= 0x0a) {
 uint32_t eax, edx;
 
-- 
2.21.0




Re: [Qemu-devel] [PATCH v2 1/5] target/i386: Add cpu die-level topology support for X86CPU

2019-06-10 Thread Like Xu

On 2019/6/6 11:32, Eduardo Habkost wrote:

On Tue, May 21, 2019 at 12:50:52AM +0800, Like Xu wrote:

Die-level, the first PC-specific cpu topology level, is added to the
legacy cpu topology model, which only covers sockets/cores/threads.

In the new model with die-level support, the total number of logical
processors (including offline) on board will be calculated as:

  #cpus = #sockets * #dies * #cores * #threads

and considering compatibility, the default value for #dies is 1.
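
The formula above can be sketched as a small helper. Treating a dies value of
0 as "not specified" and falling back to 1 is an assumption made for this
illustration; demo_total_cpus() is a hypothetical name.

```c
/*
 * Total logical processors (including offline ones) for a topology.
 * A dies value of 0 is treated as "not specified" and defaults to 1,
 * which keeps the result identical to the legacy three-level model.
 */
static unsigned demo_total_cpus(unsigned sockets, unsigned dies,
                                unsigned cores, unsigned threads)
{
    if (dies == 0) {
        dies = 1;
    }
    return sockets * dies * cores * threads;
}
```

For example, 2 sockets with 4 cores and 2 threads give 16 CPUs without dies,
and 32 CPUs once each socket carries 2 dies.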

A new set of die-related variables are added in smp context and the
CPUX86State.nr_dies is assigned in x86_cpu_initfn() from PCMachineState.

Signed-off-by: Like Xu 
---
  hw/i386/pc.c   | 3 +++
  include/hw/i386/pc.h   | 2 ++
  include/hw/i386/topology.h | 2 ++
  qapi/misc.json | 6 --
  target/i386/cpu.c  | 9 +
  target/i386/cpu.h  | 3 +++
  6 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 896c22e32e..83ab53c814 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2341,6 +2341,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
  
  topo.pkg_id = cpu->socket_id;

  topo.core_id = cpu->core_id;
+topo.die_id = cpu->die_id;
  topo.smt_id = cpu->thread_id;
  cpu->apic_id = apicid_from_topo_ids(smp_cores, smp_threads, &topo);
  }
@@ -2692,6 +2693,8 @@ static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
   ms->smp.cores, ms->smp.threads, &topo);
  ms->possible_cpus->cpus[i].props.has_socket_id = true;
  ms->possible_cpus->cpus[i].props.socket_id = topo.pkg_id;
+ms->possible_cpus->cpus[i].props.has_die_id = true;
+ms->possible_cpus->cpus[i].props.die_id = topo.die_id;
  ms->possible_cpus->cpus[i].props.has_core_id = true;
  ms->possible_cpus->cpus[i].props.core_id = topo.core_id;
  ms->possible_cpus->cpus[i].props.has_thread_id = true;
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index ce3c22951e..b5faf2ede9 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -24,6 +24,7 @@
   * PCMachineState:
   * @acpi_dev: link to ACPI PM device that performs ACPI hotplug handling
   * @boot_cpus: number of present VCPUs
+ * @smp_dies: number of dies per one package
   */
  struct PCMachineState {
  /*< private >*/
@@ -59,6 +60,7 @@ struct PCMachineState {
  bool apic_xrupt_override;
  unsigned apic_id_limit;
  uint16_t boot_cpus;
+unsigned smp_dies;
  
  /* NUMA information: */

  uint64_t numa_nodes;
diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index 1ebaee0f76..7f80498eb3 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -47,6 +47,7 @@ typedef uint32_t apic_id_t;
  
  typedef struct X86CPUTopoInfo {

  unsigned pkg_id;
+unsigned die_id;


Isn't it better to add this field only on patch 4/5?


  unsigned core_id;
  unsigned smt_id;
  } X86CPUTopoInfo;
@@ -130,6 +131,7 @@ static inline void x86_topo_ids_from_apicid(apic_id_t apicid,
  topo->core_id = (apicid >> apicid_core_offset(nr_cores, nr_threads)) &
 ~(0xFFFFFFFFUL << apicid_core_width(nr_cores, nr_threads));
  topo->pkg_id = apicid >> apicid_pkg_offset(nr_cores, nr_threads);
+topo->die_id = -1;


Why are you setting die_id = -1 here?


Hi Eduardo, thanks for your comments and support.

Would it be better to introduce all die-related variables,
including has_die_id/nr_dies/cpu->die_id/topo.die_id/smp_dies, in one
patch, for consistency checking and easier backporting?


In this case the default value for topo->die_id would be 0 (that is,
one die per package) with has_die_id = false. Is that acceptable to you?




If die_id isn't valid yet, isn't it better to keep has_die_id =
false at pc_possible_cpu_arch_ids() above, and set has_die_id =
true only on patch 4/5?


  }
  
  /* Make APIC ID for the CPU 'cpu_index'

diff --git a/qapi/misc.json b/qapi/misc.json
index 8b3ca4fdd3..cd236c89b3 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -2924,10 +2924,11 @@
  #
  # @node-id: NUMA node ID the CPU belongs to
  # @socket-id: socket number within node/board the CPU belongs to
-# @core-id: core number within socket the CPU belongs to
+# @die-id: die number within node/board the CPU belongs to (Since 4.1)
+# @core-id: core number within die the CPU belongs to
  # @thread-id: thread number within core the CPU belongs to
  #
-# Note: currently there are 4 properties that could be present
+# Note: currently there are 5 properties that could be present
  # but management should be prepared to pass through other
  # properties with device_add command to allow for future
  # interface extension. This also requires the filed names to be kept in
@@ -2938,6 +2939,7 @@
  { 'struct': 'CpuInstancePro

Re: [Qemu-devel] [Qemu-ppc] [PATCH v3 04/10] hw/ppc: Replace global smp variables with machine smp properties

2019-06-06 Thread Like Xu

On 2019/6/6 16:20, Greg Kurz wrote:

On Thu, 6 Jun 2019 13:07:32 +1000
David Gibson  wrote:


On Wed, Jun 05, 2019 at 11:54:56PM -0300, Eduardo Habkost wrote:

On Wed, Jun 05, 2019 at 11:52:41PM -0300, Eduardo Habkost wrote:

On Sun, May 19, 2019 at 04:54:22AM +0800, Like Xu wrote:

The global smp variables in ppc are replaced with smp machine properties.

A local variable of the same name would be introduced in the declaration
phase if it's used widely in the context OR replace it on the spot if it's
only used once. No semantic changes.

Signed-off-by: Like Xu 


Any objections from the ppc maintainers to queueing this through
the Machine Core tree?


Oops, CCing the ppc maintainers.


No objection here.

Acked-by: David Gibson 



Just one nit...

[...]


diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index ee24212765..c9ffe9786c 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -231,6 +231,8 @@ static void rtas_ibm_get_system_parameter(PowerPCCPU *cpu,
target_ulong args,
uint32_t nret, target_ulong rets)
  {
+MachineState *ms = MACHINE(qdev_get_machine());


rtas_ibm_get_system_parameter() has a SpaprMachineState *spapr argument, no
need to rely on qdev_get_machine().


I will fix it in the next (rebased) version. Thank you, Greg.



But this can be fixed in a followup patch I guess. Not worth holding the
patchset because of that.


+unsigned int max_cpus = ms->smp.max_cpus;
  target_ulong parameter = rtas_ld(args, 0);
  target_ulong buffer = rtas_ld(args, 1);
  target_ulong length = rtas_ld(args, 2);
@@ -244,7 +246,7 @@ static void rtas_ibm_get_system_parameter(PowerPCCPU *cpu,
"MaxPlatProcs=%d",
max_cpus,
current_machine->ram_size / MiB,
-  smp_cpus,
+  ms->smp.cpus,
max_cpus);
  ret = sysparm_st(buffer, length, param_val, strlen(param_val) + 1);
  g_free(param_val);
   
   









[Qemu-devel] [QUESTION] How to reduce network latency to improve netperf TCP_RR drastically?

2019-06-04 Thread Like Xu

Hi Michael,

At https://www.linux-kvm.org/page/NetworkingTodo, there is an entry for 
network latency saying:


---
reduce networking latency:
 allow handling short packets from softirq or VCPU context
 Plan:
   We are going through the scheduler 3 times
   (could be up to 5 if softirqd is involved)
   Consider RX: host irq -> io thread -> VCPU thread ->
   guest irq -> guest thread.
   This adds a lot of latency.
   We can cut it by some 1.5x if we do a bit of work
   either in the VCPU or softirq context.
 Testing: netperf TCP RR - should be improved drastically
  netperf TCP STREAM guest to host - no regression
 Contact: MST
---

I am trying to contribute to improving netperf TCP_RR.
Could you please share more ideas, plans, or implementation details to 
make it happen?


Thanks,
Like Xu



Re: [Qemu-devel] [PATCH v3 00/10] Refactor cpu topo into machine properties

2019-05-28 Thread Like Xu

Ping for [PATCH v3 00/10] Refactor cpu topo into machine properties.

On 2019/5/26 21:51, Like Xu wrote:

On 2019/5/19 4:54, Like Xu wrote:

This patch series turns the existing cores/threads/sockets into machine
properties and gets rid of the global smp_* variables currently in use.

The purpose of getting rid of the globals is to disentangle layer violations.
Let's do it one step at a time: replace each smp_foo using as few
qdev_get_machine() calls as possible, and defer other related refactoring.



Hi Eduardo & Igor,
Do you have any comments on this new version of CpuTopology refactoring?

With this patch series in place, we can move on to reviewing [Qemu-devel] 
[PATCH v2 0/5] Introduce cpu die topology and enable CPUID.1F for i386.


Thanks,
Like Xu


==changelog==

v3:

- rephrase commit messages
- s/of/of present/ for CpuTopology comment
- drop redundant arguments such as cpu_type
- use ms instead of macs in migration context
- rebase to commit 1b46b4daa6











Re: [Qemu-devel] [PATCH v2 2/5] i386/cpu: Consolidate die-id validity in smp context

2019-05-27 Thread Like Xu

On 2019/5/22 1:12, Dr. David Alan Gilbert wrote:

* Like Xu (like...@linux.intel.com) wrote:

Following the legacy smp check rules, the die_id validity is added to
the same contexts as legacy smp variables such as hmp_hotpluggable_cpus(),
machine_set_cpu_numa_node(), cpu_slot_to_string() and pc_cpu_pre_plug().

Signed-off-by: Like Xu 
---
  hmp.c |  3 +++
  hw/core/machine.c | 12 
  hw/i386/pc.c  | 11 +++
  3 files changed, 26 insertions(+)

diff --git a/hmp.c b/hmp.c
index 56a3ed7375..7deb7b7226 100644
--- a/hmp.c
+++ b/hmp.c
@@ -3112,6 +3112,9 @@ void hmp_hotpluggable_cpus(Monitor *mon, const QDict *qdict)
  if (c->has_socket_id) {
  monitor_printf(mon, "socket-id: \"%" PRIu64 "\"\n", c->socket_id);
  }
+if (c->has_die_id) {
+monitor_printf(mon, "die-id: \"%" PRIu64 "\"\n", c->die_id);
+}
  if (c->has_core_id) {
  monitor_printf(mon, "core-id: \"%" PRIu64 "\"\n", c->core_id);
  }
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 5d046a43e3..5116429732 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -659,6 +659,11 @@ void machine_set_cpu_numa_node(MachineState *machine,
  return;
  }
  
+if (props->has_die_id && !slot->props.has_die_id) {

+error_setg(errp, "die-id is not supported");
+return;
+}
+
  /* skip slots with explicit mismatch */
  if (props->has_thread_id && props->thread_id != slot->props.thread_id) {
  continue;
@@ -668,6 +673,10 @@ void machine_set_cpu_numa_node(MachineState *machine,
  continue;
  }
  
+if (props->has_die_id && props->die_id != slot->props.die_id) {

+continue;
+}
+
  if (props->has_socket_id && props->socket_id != slot->props.socket_id) {
  continue;
  }
@@ -925,6 +934,9 @@ static char *cpu_slot_to_string(const CPUArchId *cpu)
  if (cpu->props.has_socket_id) {
  g_string_append_printf(s, "socket-id: %"PRId64, cpu->props.socket_id);
  }
+if (cpu->props.has_die_id) {
+g_string_append_printf(s, "die-id: %"PRId64, cpu->props.die_id);
+}
  if (cpu->props.has_core_id) {
  if (s->len) {
  g_string_append_printf(s, ", ");
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 83ab53c814..00be2463af 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2321,6 +2321,10 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
  error_setg(errp, "Invalid CPU socket-id: %u must be in range 0:%u",
 cpu->socket_id, max_socket);
  return;
+} else if (cpu->die_id > max_socket) {
+error_setg(errp, "Invalid CPU die-id: %u must be in range 0:%u",
+   cpu->die_id, max_socket);
+return;


Can you explain why the die_id is related to max_socket?
I'd assumed you could have a 2 socket system where each socket has 4
dies.


Dr. David, thanks for your comments, and sorry for the slow reply.

You're right about this and the check rule for cpu->die_id in 
pc_cpu_pre_plug() should be:


"else if (cpu->die_id > (pcms->smp_dies - 1))"

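To make the corrected rule concrete, here is a small sketch of the range
check, assuming the die-id is validated against the number of dies per socket
rather than against the highest socket index. demo_die_id_valid() is a
hypothetical helper, not a function from the patch.

```c
#include <stdbool.h>

/*
 * A die-id is valid when it lies in [0, dies_per_socket - 1].
 * The socket count plays no role here: a 2-socket machine with 4 dies
 * per socket accepts die-ids 0..3 on either socket.
 */
static bool demo_die_id_valid(int die_id, unsigned dies_per_socket)
{
    return die_id >= 0 &&
           dies_per_socket > 0 &&
           (unsigned)die_id < dies_per_socket;
}
```

With 2 sockets and 4 dies each, die-id 3 passes this check, whereas
comparing against max_socket (which is 1 in that setup) would reject it.
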


However, for the HMP side of it:


Acked-by: Dr. David Alan Gilbert 


  }
  if (cpu->core_id < 0) {
  error_setg(errp, "CPU core-id is not set");
@@ -2378,6 +2382,13 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
  }
  cpu->socket_id = topo.pkg_id;
  
+if (cpu->die_id != -1 && cpu->die_id != topo.die_id) {

+error_setg(errp, "property die-id: %u doesn't match set apic-id:"
+" 0x%x (die-id: %u)", cpu->die_id, cpu->apic_id, topo.die_id);
+return;
+}
+cpu->die_id = topo.die_id;
+
  if (cpu->core_id != -1 && cpu->core_id != topo.core_id) {
  error_setg(errp, "property core-id: %u doesn't match set apic-id:"
  " 0x%x (core-id: %u)", cpu->core_id, cpu->apic_id, topo.core_id);
--
2.21.0


--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK







Re: [Qemu-devel] [PATCH v3 00/10] Refactor cpu topo into machine properties

2019-05-26 Thread Like Xu

On 2019/5/19 4:54, Like Xu wrote:

This patch series turns the existing cores/threads/sockets into machine
properties and gets rid of the global smp_* variables currently in use.

The purpose of getting rid of the globals is to disentangle layer violations.
Let's do it one step at a time: replace each smp_foo using as few
qdev_get_machine() calls as possible, and defer other related refactoring.
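
The pattern can be sketched roughly as follows, assuming the caller already
holds a pointer to the machine object. DemoMachine and
demo_threads_per_socket() are hypothetical; the real series works on
MachineState and its smp member.

```c
/* Hypothetical stand-in for MachineState with its smp properties. */
typedef struct DemoMachine {
    struct {
        unsigned cpus, cores, threads, max_cpus;
    } smp;
} DemoMachine;

/*
 * Before the refactoring, code like this read the file-scope globals
 * smp_cores and smp_threads directly. Afterwards the values come from
 * the machine object; locals of the same name keep the body unchanged.
 */
static unsigned demo_threads_per_socket(const DemoMachine *ms)
{
    unsigned smp_cores = ms->smp.cores;
    unsigned smp_threads = ms->smp.threads;

    return smp_cores * smp_threads;
}
```

Keeping the local names identical to the former globals is what allows the
change to be made without touching the rest of each function.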



Hi Eduardo & Igor,
Do you have any comments on this new version of CpuTopology refactoring?

With this patch series in place, we can move on to reviewing [Qemu-devel] 
[PATCH v2 0/5] Introduce cpu die topology and enable CPUID.1F for i386.


Thanks,
Like Xu


==changelog==

v3:

- rephrase commit messages
- s/of/of present/ for CpuTopology comment
- drop redundant arguments such as cpu_type
- use ms instead of macs in migration context
- rebase to commit 1b46b4daa6







[Qemu-devel] [PATCH v2 4/5] i386/cpu: Update apicid parsing rules and topo-bit tests for dies

2019-05-21 Thread Like Xu
On Intel MCP (Multi-chip packaging) platforms, the APIC ID of a logical CPU
encodes die-level topology information, so x86_apicid_from_cpu_idx()
should be refactored to take a virtual nr_dies, as should apicid_*_offset().

To maintain semantic consistency, the pkg_offset that helps to generate
CPUIDs such as 0x3 for L3 cache is mapped to die_offset from this commit on.

The corresponding topo_bits tests are updated to test die configurations.
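
As a sketch of what the die-aware layout looks like, the code below packs
[pkg | die | core | smt] into an APIC ID, giving each field just enough bits
for its count. The field order and the demo_* names are assumptions made for
illustration; they are modelled on, but not copied from, topology.h.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits needed to represent IDs in the range [0, count - 1]. */
static unsigned demo_bitwidth_for_count(unsigned count)
{
    unsigned width = 0;

    assert(count >= 1);
    for (count -= 1; count; count >>= 1) {
        width++;
    }
    return width;
}

/* Hypothetical counterpart of X86CPUTopoInfo with a die level. */
typedef struct DemoTopo {
    unsigned pkg_id, die_id, core_id, smt_id;
} DemoTopo;

/* Pack the topology IDs into one APIC ID, lowest field first. */
static uint32_t demo_apicid_from_topo(unsigned nr_dies, unsigned nr_cores,
                                      unsigned nr_threads, const DemoTopo *t)
{
    unsigned core_off = demo_bitwidth_for_count(nr_threads);
    unsigned die_off = core_off + demo_bitwidth_for_count(nr_cores);
    unsigned pkg_off = die_off + demo_bitwidth_for_count(nr_dies);

    return ((uint32_t)t->pkg_id << pkg_off) |
           ((uint32_t)t->die_id << die_off) |
           ((uint32_t)t->core_id << core_off) |
           (uint32_t)t->smt_id;
}
```

With nr_dies = 1 the die field is zero bits wide, so the result matches the
legacy socket/core/thread encoding.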

Signed-off-by: Like Xu 
---
 hw/i386/pc.c   | 38 +++--
 include/hw/i386/topology.h | 76 --
 target/i386/cpu.c  | 13 +++---
 tests/test-x86-cpuid.c | 84 --
 4 files changed, 133 insertions(+), 78 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 00be2463af..e498334cbc 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -935,10 +935,11 @@ void enable_compat_apic_id_mode(void)
 static uint32_t x86_cpu_apic_id_from_index(MachineState *ms,
unsigned int cpu_index)
 {
+PCMachineState *pcms = PC_MACHINE(ms);
 uint32_t correct_id;
 static bool warned;
 
-correct_id = x86_apicid_from_cpu_idx(ms->smp.cores,
+correct_id = x86_apicid_from_cpu_idx(pcms->smp_dies, ms->smp.cores,
  ms->smp.threads, cpu_index);
 if (compat_apic_id_mode) {
 if (cpu_index != correct_id && !warned && !qtest_enabled()) {
@@ -2303,6 +2304,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 PCMachineState *pcms = PC_MACHINE(hotplug_dev);
 unsigned int smp_cores = ms->smp.cores;
 unsigned int smp_threads = ms->smp.threads;
+unsigned int smp_dies = pcms->smp_dies;
 
 if(!object_dynamic_cast(OBJECT(cpu), ms->cpu_type)) {
 error_setg(errp, "Invalid CPU type, expected cpu type: '%s'",
@@ -2310,9 +2312,13 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 return;
 }
 
-/* if APIC ID is not set, set it based on socket/core/thread properties */
+/*
+ * If APIC ID is not set,
+ * set it based on socket/die/core/thread properties.
+ */
 if (cpu->apic_id == UNASSIGNED_APIC_ID) {
-int max_socket = (ms->smp.max_cpus - 1) / smp_threads / smp_cores;
+int max_socket = (ms->smp.max_cpus - 1) /
+smp_threads / smp_cores / pcms->smp_dies;
 
 if (cpu->socket_id < 0) {
 error_setg(errp, "CPU socket-id is not set");
@@ -2347,18 +2353,21 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 topo.core_id = cpu->core_id;
 topo.die_id = cpu->die_id;
 topo.smt_id = cpu->thread_id;
-cpu->apic_id = apicid_from_topo_ids(smp_cores, smp_threads, &topo);
+cpu->apic_id = apicid_from_topo_ids(smp_dies, smp_cores,
+smp_threads, &topo);
 }
 
 cpu_slot = pc_find_cpu_slot(MACHINE(pcms), cpu->apic_id, &idx);
 if (!cpu_slot) {
 MachineState *ms = MACHINE(pcms);
 
-x86_topo_ids_from_apicid(cpu->apic_id, smp_cores, smp_threads, &topo);
-error_setg(errp, "Invalid CPU [socket: %u, core: %u, thread: %u] with"
-  " APIC ID %" PRIu32 ", valid index range 0:%d",
-   topo.pkg_id, topo.core_id, topo.smt_id, cpu->apic_id,
-   ms->possible_cpus->len - 1);
+x86_topo_ids_from_apicid(cpu->apic_id, smp_dies,
+ smp_cores, smp_threads, &topo);
+error_setg(errp,
+"Invalid CPU [socket: %u, die: %u, core: %u, thread: %u] with"
+" APIC ID %" PRIu32 ", valid index range 0:%d",
+topo.pkg_id, topo.die_id, topo.core_id, topo.smt_id,
+cpu->apic_id, ms->possible_cpus->len - 1);
 return;
 }
 
@@ -2374,7 +2383,8 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 /* TODO: move socket_id/core_id/thread_id checks into x86_cpu_realizefn()
  * once -smp refactoring is complete and there will be CPU private
  * CPUState::nr_cores and CPUState::nr_threads fields instead of globals */
-x86_topo_ids_from_apicid(cpu->apic_id, smp_cores, smp_threads, &topo);
+x86_topo_ids_from_apicid(cpu->apic_id, smp_dies,
+ smp_cores, smp_threads, &topo);
 if (cpu->socket_id != -1 && cpu->socket_id != topo.pkg_id) {
 error_setg(errp, "property socket-id: %u doesn't match set apic-id:"
 " 0x%x (socket-id: %u)", cpu->socket_id, cpu->apic_id, 
topo.pkg_id);
@@ -2670,10 +2680,12 @@ pc_cpu_index_to_props(MachineState *ms, unsigned 
cpu_index)
 static int64_t pc_get_default_cpu_node_id(const MachineState *ms, int idx)
 {
X86CPUTopoInfo topo;
+   PCMachineState *pcms = PC_MACHINE(m

[Qemu-devel] [PATCH v2 5/5] target/i386: Add CPUID.1F generation support for multi-die PCMachine

2019-05-21 Thread Like Xu
CPUID.1F, Intel's V2 Extended Topology Enumeration Leaf, is exposed when a
guest is configured with multiple software-visible dies within each package.
Per Intel's SDM, leaf 0x1f is a superset of leaf 0xb, so it can be generated
by almost the same code as 0xb, apart from the die_offset setting.

If the number of dies per package is less than 2, QEMU will not expose
CPUID.1F regardless of whether the host supports it, and in any case
cpuid.0.eax stores the maximum input value for **guest** basic CPUID.

If users do want to expose CPUID.1F by passing dies > 1 for simulation without
host support, an smp topology warning is emitted but the setup is not blocked.
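
The sub-leaf layout can be sketched as below. The level-type values (SMT = 1,
CORE = 2, DIE = 5) are the ones defined in this patch's cpu.h hunk; the
DemoLeaf1f type and the demo_* helpers are hypothetical, and EDX (the APIC
ID) is left out to keep the sketch short.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_LEVEL_INVALID 0u
#define DEMO_LEVEL_SMT     1u
#define DEMO_LEVEL_CORE    2u
#define DEMO_LEVEL_DIE     5u

typedef struct DemoLeaf1f {
    uint32_t eax;   /* shift to reach the next-level ID */
    uint32_t ebx;   /* logical processors at this level */
    uint32_t ecx;   /* sub-leaf number | level type << 8 */
} DemoLeaf1f;

/* Number of bits needed to represent IDs in the range [0, count - 1]. */
static unsigned demo_width(unsigned count)
{
    unsigned width = 0;

    assert(count >= 1);
    for (count -= 1; count; count >>= 1) {
        width++;
    }
    return width;
}

static DemoLeaf1f demo_cpuid_1f(uint32_t subleaf, unsigned dies,
                                unsigned cores, unsigned threads)
{
    DemoLeaf1f r = { 0, 0, 0 };
    uint32_t type = DEMO_LEVEL_INVALID;
    unsigned core_off, die_off, pkg_off;

    if (dies < 2) {
        return r;   /* leaf is not exposed: all registers read as zero */
    }

    core_off = demo_width(threads);
    die_off = core_off + demo_width(cores);
    pkg_off = die_off + demo_width(dies);

    switch (subleaf) {
    case 0:
        r.eax = core_off;
        r.ebx = threads;
        type = DEMO_LEVEL_SMT;
        break;
    case 1:
        r.eax = die_off;
        r.ebx = cores * threads;
        type = DEMO_LEVEL_CORE;
        break;
    case 2:
        r.eax = pkg_off;
        r.ebx = dies * cores * threads;
        type = DEMO_LEVEL_DIE;
        break;
    default:
        break;      /* invalid level: eax = ebx = 0 */
    }
    r.ecx = (subleaf & 0xff) | (type << 8);
    r.ebx &= 0xffff;
    return r;
}
```

Compared with leaf 0xb, the only structural difference is the extra die
sub-leaf between the core and package levels.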

Signed-off-by: Like Xu 
---
 target/i386/cpu.c | 37 +
 target/i386/cpu.h |  4 
 target/i386/kvm.c | 30 --
 3 files changed, 69 insertions(+), 2 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 3222bd3254..cd6c9933c3 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -4417,6 +4417,42 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
 *ecx |= CPUID_TOPOLOGY_LEVEL_INVALID;
 }
 
+assert(!(*eax & ~0x1f));
+*ebx &= 0xffff; /* The count doesn't need to be reliable. */
+break;
+case 0x1F:
+/* V2 Extended Topology Enumeration Leaf */
+if (env->nr_dies < 2 || !cpu->enable_cpuid_0x1f) {
+*eax = *ebx = *ecx = *edx = 0;
+break;
+}
+
+*ecx = count & 0xff;
+*edx = cpu->apic_id;
+switch (count) {
+case 0:
+*eax = apicid_core_offset(env->nr_dies, cs->nr_cores,
+cs->nr_threads);
+*ebx = cs->nr_threads;
+*ecx |= CPUID_TOPOLOGY_LEVEL_SMT;
+break;
+case 1:
+*eax = apicid_die_offset(env->nr_dies, cs->nr_cores,
+   cs->nr_threads);
+*ebx = cs->nr_cores * cs->nr_threads;
+*ecx |= CPUID_TOPOLOGY_LEVEL_CORE;
+break;
+case 2:
+*eax = apicid_pkg_offset(env->nr_dies, cs->nr_cores,
+   cs->nr_threads);
+*ebx = env->nr_dies * cs->nr_cores * cs->nr_threads;
+*ecx |= CPUID_TOPOLOGY_LEVEL_DIE;
+break;
+default:
+*eax = 0;
+*ebx = 0;
+*ecx |= CPUID_TOPOLOGY_LEVEL_INVALID;
+}
 assert(!(*eax & ~0x1f));
 *ebx &= 0xffff; /* The count doesn't need to be reliable. */
 break;
@@ -5864,6 +5900,7 @@ static Property x86_cpu_properties[] = {
 DEFINE_PROP_BOOL("full-cpuid-auto-level", X86CPU, full_cpuid_auto_level, true),
 DEFINE_PROP_STRING("hv-vendor-id", X86CPU, hyperv_vendor_id),
 DEFINE_PROP_BOOL("cpuid-0xb", X86CPU, enable_cpuid_0xb, true),
+DEFINE_PROP_BOOL("cpuid-0x1f", X86CPU, enable_cpuid_0x1f, true),
 DEFINE_PROP_BOOL("lmce", X86CPU, enable_lmce, false),
 DEFINE_PROP_BOOL("l3-cache", X86CPU, enable_l3_cache, true),
 DEFINE_PROP_BOOL("kvm-no-smi-migration", X86CPU, kvm_no_smi_migration,
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index d5f2a60ff5..9b54c646e7 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -735,6 +735,7 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS];
 #define CPUID_TOPOLOGY_LEVEL_INVALID  (0U << 8)
 #define CPUID_TOPOLOGY_LEVEL_SMT  (1U << 8)
 #define CPUID_TOPOLOGY_LEVEL_CORE (2U << 8)
+#define CPUID_TOPOLOGY_LEVEL_DIE  (5U << 8)
 
 /* MSR Feature Bits */
 #define MSR_ARCH_CAP_RDCL_NO(1U << 0)
@@ -1455,6 +1456,9 @@ struct X86CPU {
 /* Compatibility bits for old machine types: */
 bool enable_cpuid_0xb;
 
+/* V2 Compatibility bits for old machine types: */
+bool enable_cpuid_0x1f;
+
 /* Enable auto level-increase for all CPUID leaves */
 bool full_cpuid_auto_level;
 
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 3b29ce5c0d..d8b8bd5c9e 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -931,12 +931,12 @@ int kvm_arch_init_vcpu(CPUState *cs)
 struct kvm_cpuid_entry2 *c;
 uint32_t signature[3];
 int kvm_base = KVM_CPUID_SIGNATURE;
-int r;
+int r, cpuid_0_entry, cpuid_min_level;
 Error *local_err = NULL;
 
 memset(&cpuid_data, 0, sizeof(cpuid_data));
 
-cpuid_i = 0;
+cpuid_i = cpuid_0_entry = cpuid_min_level = 0;
 
 r = kvm_arch_set_tsc_khz(cs);
 if (r < 0) {
@@ -1050,6 +1050,11 @@ int kvm_arch_init_vcpu(CPUState *cs)
 
 cpu_x86_cpuid(env, 0, 0, &limit, &unused, &unused, &unused);
 
+if (limit < 0x1f && env->nr_dies > 1 && cpu->enable_cpuid_0x1f) {
+limit = env->cpuid_level = env->cpuid_min_level = 0x1f;
+

[Qemu-devel] [PATCH v2 3/5] vl.c: Add -smp, dies=* command line support and update -smp doc

2019-05-21 Thread Like Xu
With this patch, users of the PC target can configure the number of dies per
package on the command line, for example "-smp dies=2,cores=4".

A new PC-specific pc_smp_parse() is introduced; to keep the interface
consistent, the legacy smp_parse() is renamed to __smp_parse().

The parsing rules of the new cpu-topology model obey the same restrictions and
logic as the legacy socket/core/thread model, especially when computing
missing values.
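
A rough sketch of how missing values might be computed once dies are part of
the model, following the "prefer sockets over cores over threads" rule of the
legacy parser. The DemoSmpOpts type and demo_pc_smp_fill() are hypothetical,
a value of 0 stands for "not given", and maxcpus handling is left out on
purpose.

```c
/* Hypothetical holder for the parsed -smp options; 0 means "not given". */
typedef struct DemoSmpOpts {
    unsigned cpus, sockets, dies, cores, threads;
} DemoSmpOpts;

/*
 * Fill in missing values, preferring sockets over cores over threads.
 * Returns 0 on success, -1 if the topology is smaller than cpus.
 */
static int demo_pc_smp_fill(DemoSmpOpts *o)
{
    o->dies = o->dies > 0 ? o->dies : 1;

    if (o->cpus == 0 || o->sockets == 0) {
        o->cores = o->cores > 0 ? o->cores : 1;
        o->threads = o->threads > 0 ? o->threads : 1;
        if (o->cpus == 0) {
            o->sockets = o->sockets > 0 ? o->sockets : 1;
            o->cpus = o->sockets * o->dies * o->cores * o->threads;
        } else {
            o->sockets = o->cpus / (o->dies * o->cores * o->threads);
        }
    } else if (o->cores == 0) {
        o->threads = o->threads > 0 ? o->threads : 1;
        o->cores = o->cpus / (o->sockets * o->dies * o->threads);
        o->cores = o->cores > 0 ? o->cores : 1;
    } else if (o->threads == 0) {
        o->threads = o->cpus / (o->sockets * o->dies * o->cores);
        o->threads = o->threads > 0 ? o->threads : 1;
    } else if (o->sockets * o->dies * o->cores * o->threads < o->cpus) {
        return -1;
    }
    return 0;
}
```

Under these assumptions, "-smp dies=2,cores=4" resolves to 1 socket,
1 thread per core, and 8 CPUs in total.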

Signed-off-by: Like Xu 
---
 qemu-options.hx | 17 +-
 vl.c| 89 -
 2 files changed, 97 insertions(+), 9 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 5daa5a8fb0..7fad5b50ff 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -138,25 +138,26 @@ no incompatible TCG features have been enabled (e.g. 
icount/replay).
 ETEXI
 
 DEF("smp", HAS_ARG, QEMU_OPTION_smp,
-"-smp 
[cpus=]n[,maxcpus=cpus][,cores=cores][,threads=threads][,sockets=sockets]\n"
+"-smp 
[cpus=]n[,maxcpus=cpus][,cores=cores][,threads=threads][,dies=dies][,sockets=sockets]\n"
 "set the number of CPUs to 'n' [default=1]\n"
 "maxcpus= maximum number of total cpus, including\n"
 "offline CPUs for hotplug, etc\n"
-"cores= number of CPU cores on one socket\n"
+"cores= number of CPU cores on one socket (for PC, it's on 
one die)\n"
 "threads= number of threads on one CPU core\n"
+"dies= number of CPU dies on one socket (for PC only)\n"
 "sockets= number of discrete sockets in the system\n",
 QEMU_ARCH_ALL)
 STEXI
-@item -smp 
[cpus=]@var{n}[,cores=@var{cores}][,threads=@var{threads}][,sockets=@var{sockets}][,maxcpus=@var{maxcpus}]
+@item -smp 
[cpus=]@var{n}[,cores=@var{cores}][,threads=@var{threads}][,dies=dies][,sockets=@var{sockets}][,maxcpus=@var{maxcpus}]
 @findex -smp
 Simulate an SMP system with @var{n} CPUs. On the PC target, up to 255
 CPUs are supported. On Sparc32 target, Linux limits the number of usable CPUs
 to 4.
-For the PC target, the number of @var{cores} per socket, the number
-of @var{threads} per cores and the total number of @var{sockets} can be
-specified. Missing values will be computed. If any on the three values is
-given, the total number of CPUs @var{n} can be omitted. @var{maxcpus}
-specifies the maximum number of hotpluggable CPUs.
+For the PC target, the number of @var{cores} per die, the number of 
@var{threads}
+per cores, the number of @var{dies} per packages and the total number of
+@var{sockets} can be specified. Missing values will be computed.
+If any on the three values is given, the total number of CPUs @var{n} can be 
omitted.
+@var{maxcpus} specifies the maximum number of hotpluggable CPUs.
 ETEXI
 
 DEF("numa", HAS_ARG, QEMU_OPTION_numa,
diff --git a/vl.c b/vl.c
index 8d92e2d209..66b577f447 100644
--- a/vl.c
+++ b/vl.c
@@ -63,6 +63,7 @@ int main(int argc, char **argv)
 #include "sysemu/watchdog.h"
 #include "hw/firmware/smbios.h"
 #include "hw/acpi/acpi.h"
+#include "hw/i386/pc.h"
 #include "hw/xen/xen.h"
 #include "hw/qdev.h"
 #include "hw/loader.h"
@@ -1248,6 +1249,9 @@ static QemuOptsList qemu_smp_opts = {
 }, {
 .name = "sockets",
 .type = QEMU_OPT_NUMBER,
+}, {
+.name = "dies",
+.type = QEMU_OPT_NUMBER,
 }, {
 .name = "cores",
 .type = QEMU_OPT_NUMBER,
@@ -1262,7 +1266,7 @@ static QemuOptsList qemu_smp_opts = {
 },
 };
 
-static void smp_parse(QemuOpts *opts)
+static void __smp_parse(QemuOpts *opts)
 {
 if (opts) {
 unsigned cpus= qemu_opt_get_number(opts, "cpus", 0);
@@ -1334,6 +1338,89 @@ static void smp_parse(QemuOpts *opts)
 }
 }
 
+static void pc_smp_parse(QemuOpts *opts)
+{
+PCMachineState *pcms = (PCMachineState *)
+object_dynamic_cast(OBJECT(current_machine), TYPE_PC_MACHINE);
+
+unsigned cpus= qemu_opt_get_number(opts, "cpus", 0);
+unsigned sockets = qemu_opt_get_number(opts, "sockets", 0);
+unsigned dies = qemu_opt_get_number(opts, "dies", 1);
+unsigned cores   = qemu_opt_get_number(opts, "cores", 0);
+unsigned threads = qemu_opt_get_number(opts, "threads", 0);
+
+/* compute missing values, prefer sockets over cores over threads */
+if (cpus == 0 || sockets == 0) {
+cores = cores > 0 ? cores : 1;
+threads = threads > 0 ? threads : 1;
+if (cpus == 0) {
+sockets = sockets > 0 ? sockets : 1;
+cpus = cores * threads * dies * sockets;
+} else {
+current_machine->smp.max_cpus

[Qemu-devel] [PATCH v2 0/5] Introduce cpu die topology and enable CPUID.1F for i386

2019-05-21 Thread Like Xu
Multi-chip packaging technology allows integration of multiple cores in one die
and multiple dies in one single package, for example Intel CLX-AP or AMD EPYC.

This kind of integration is enabled by high-performance, heterogeneous,
multi-die interconnect technology, and is more cost-effective.
QEMU and guests may take advantage of a multi-die host for purposes such as
guest placement or energy efficiency management.

This patch series extends the CPU topology to the socket/die/core/thread model,
allowing the number of dies per socket to be set with the -smp QEMU option. For
i386, it upgrades the APIC ID generation and reverse-mapping functions and
exposes a new leaf, CPUID.1F, which is a preferred superset of leaf 0BH. The
CPUID.1F spec is at https://software.intel.com/en-us/articles/intel-sdm,
3-190 Vol 2A.

For example, running an MCP KVM guest with -smp 4,dies=2,cores=2,threads=1
and dumping the raw cpuid data inside the guest is expected to show:
0x0000001f 0x00: eax=0x00000000 ebx=0x00000001 ecx=0x00000100 edx=0x00000002
0x0000001f 0x01: eax=0x00000001 ebx=0x00000002 ecx=0x00000201 edx=0x00000001
0x0000001f 0x02: eax=0x00000002 ebx=0x00000004 ecx=0x00000502 edx=0x00000003
0x0000001f 0x03: eax=0x00000000 ebx=0x00000000 ecx=0x00000003 edx=0x00000001
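
The eax values in subleaves 0 to 2 above (0, 1, 2) are the shift counts that
separate one topology level from the next in the APIC ID. A minimal sketch of
how such a layout might be derived from the -smp values follows; the names
bits_for_count(), apicid_layout_for() and apicid_from_ids() are hypothetical
and are not the helpers in include/hw/i386/topology.h.

```c
#include <stdint.h>

/* Number of bits needed to encode `count` distinct IDs (0 for count <= 1). */
static unsigned bits_for_count(unsigned count)
{
    unsigned bits = 0;

    if (count == 0) {
        return 0;
    }
    for (count -= 1; count != 0; count >>= 1) {
        bits++;
    }
    return bits;
}

/* Bit offsets of each field inside the APIC ID; the SMT field starts at 0. */
struct apicid_layout {
    unsigned core_offset;
    unsigned die_offset;
    unsigned pkg_offset;
};

static struct apicid_layout apicid_layout_for(unsigned dies, unsigned cores,
                                              unsigned threads)
{
    struct apicid_layout l;

    l.core_offset = bits_for_count(threads);
    l.die_offset = l.core_offset + bits_for_count(cores);
    l.pkg_offset = l.die_offset + bits_for_count(dies);
    return l;
}

/* Pack the per-level IDs into one APIC ID according to the layout. */
static uint32_t apicid_from_ids(const struct apicid_layout *l, unsigned pkg,
                                unsigned die, unsigned core, unsigned smt)
{
    return ((uint32_t)pkg << l->pkg_offset) |
           ((uint32_t)die << l->die_offset) |
           ((uint32_t)core << l->core_offset) |
           (uint32_t)smt;
}
```

For dies=2,cores=2,threads=1 the sketch gives offsets 0, 1 and 2 for the core,
die and package fields, which lines up with the eax column of the example.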

==changelog==

v2:

- Enable cpu die-level topology only for PCMachine and X86CPU
- Minimize cpuid.0.eax to the setting value actually used by guest
- Update the command-line -smp docs for die-level configurations
- Refactoring topo-bit tests for x86_apicid_from_cpu_idx() with nr_dies
- Based on "[PATCH v3 00/10] Refactor cpu topo into machine properties"
- Rebase to commit 2259637b95bef3116cc262459271de08e038cc66

v1: https://patchwork.kernel.org/cover/10876667/

Like Xu (5):
  target/i386: Add cpu die-level topology support for X86CPU
  i386/cpu: Consolidate die-id validity in smp context
  vl.c: Add -smp, dies=* command line support and update -smp doc
  i386/cpu: Update apicid parsing rules and topo-bit tests for dies
  target/i386: Add CPUID.1F generation support for multi-die PCMachine

 hmp.c  |  3 ++
 hw/core/machine.c  | 12 +
 hw/i386/pc.c   | 52 +-
 include/hw/i386/pc.h   |  2 +
 include/hw/i386/topology.h | 76 +++-
 qapi/misc.json |  6 ++-
 qemu-options.hx| 17 
 target/i386/cpu.c  | 59 ++---
 target/i386/cpu.h  |  7 +++
 target/i386/kvm.c  | 30 -
 tests/test-x86-cpuid.c | 84 ++-
 vl.c   | 89 +-
 12 files changed, 347 insertions(+), 90 deletions(-)

-- 
2.21.0




[Qemu-devel] [PATCH v2 1/5] target/i386: Add cpu die-level topology support for X86CPU

2019-05-21 Thread Like Xu
The die level, the first PC-specific cpu topology level, is added to the
legacy cpu topology model, which only covers sockets/cores/threads.

In the new model with die-level support, the total number of logical
processors (including offline) on board will be calculated as:

 #cpus = #sockets * #dies * #cores * #threads

and considering compatibility, the default value for #dies is 1.
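
As an illustration of the formula, the sketch below computes the total and
splits a linear CPU index back into per-level IDs. The names struct topo_ids,
total_logical_cpus() and topo_ids_from_index() are hypothetical and only serve
this example; cores and threads are assumed to be non-zero.

```c
/* Per-level IDs of one logical CPU (illustrative type, not from QEMU). */
struct topo_ids {
    unsigned pkg_id;
    unsigned die_id;
    unsigned core_id;
    unsigned smt_id;
};

/* #cpus = #sockets * #dies * #cores * #threads, with #dies defaulting to 1. */
static unsigned total_logical_cpus(unsigned sockets, unsigned dies,
                                   unsigned cores, unsigned threads)
{
    return sockets * (dies ? dies : 1) * cores * threads;
}

/* Split a linear CPU index into IDs; threads vary fastest, sockets slowest. */
static struct topo_ids topo_ids_from_index(unsigned idx, unsigned dies,
                                           unsigned cores, unsigned threads)
{
    struct topo_ids ids;

    if (dies == 0) {
        dies = 1;
    }
    ids.smt_id = idx % threads;
    ids.core_id = idx / threads % cores;
    ids.die_id = idx / (cores * threads) % dies;
    ids.pkg_id = idx / (dies * cores * threads);
    return ids;
}
```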

A new set of die-related variables is added to the smp context, and
CPUX86State.nr_dies is assigned in x86_cpu_initfn() from PCMachineState.

Signed-off-by: Like Xu 
---
 hw/i386/pc.c   | 3 +++
 include/hw/i386/pc.h   | 2 ++
 include/hw/i386/topology.h | 2 ++
 qapi/misc.json | 6 --
 target/i386/cpu.c  | 9 +
 target/i386/cpu.h  | 3 +++
 6 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 896c22e32e..83ab53c814 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2341,6 +2341,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 
 topo.pkg_id = cpu->socket_id;
 topo.core_id = cpu->core_id;
+topo.die_id = cpu->die_id;
 topo.smt_id = cpu->thread_id;
 cpu->apic_id = apicid_from_topo_ids(smp_cores, smp_threads, &topo);
 }
@@ -2692,6 +2693,8 @@ static const CPUArchIdList 
*pc_possible_cpu_arch_ids(MachineState *ms)
  ms->smp.cores, ms->smp.threads, &topo);
 ms->possible_cpus->cpus[i].props.has_socket_id = true;
 ms->possible_cpus->cpus[i].props.socket_id = topo.pkg_id;
+ms->possible_cpus->cpus[i].props.has_die_id = true;
+ms->possible_cpus->cpus[i].props.die_id = topo.die_id;
 ms->possible_cpus->cpus[i].props.has_core_id = true;
 ms->possible_cpus->cpus[i].props.core_id = topo.core_id;
 ms->possible_cpus->cpus[i].props.has_thread_id = true;
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index ce3c22951e..b5faf2ede9 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -24,6 +24,7 @@
  * PCMachineState:
  * @acpi_dev: link to ACPI PM device that performs ACPI hotplug handling
  * @boot_cpus: number of present VCPUs
+ * @smp_dies: number of dies per one package
  */
 struct PCMachineState {
 /*< private >*/
@@ -59,6 +60,7 @@ struct PCMachineState {
 bool apic_xrupt_override;
 unsigned apic_id_limit;
 uint16_t boot_cpus;
+unsigned smp_dies;
 
 /* NUMA information: */
 uint64_t numa_nodes;
diff --git a/include/hw/i386/topology.h b/include/hw/i386/topology.h
index 1ebaee0f76..7f80498eb3 100644
--- a/include/hw/i386/topology.h
+++ b/include/hw/i386/topology.h
@@ -47,6 +47,7 @@ typedef uint32_t apic_id_t;
 
 typedef struct X86CPUTopoInfo {
 unsigned pkg_id;
+unsigned die_id;
 unsigned core_id;
 unsigned smt_id;
 } X86CPUTopoInfo;
@@ -130,6 +131,7 @@ static inline void x86_topo_ids_from_apicid(apic_id_t 
apicid,
 topo->core_id = (apicid >> apicid_core_offset(nr_cores, nr_threads)) &
~(0xFFFFFFFFUL << apicid_core_width(nr_cores, nr_threads));
 topo->pkg_id = apicid >> apicid_pkg_offset(nr_cores, nr_threads);
+topo->die_id = -1;
 }
 
 /* Make APIC ID for the CPU 'cpu_index'
diff --git a/qapi/misc.json b/qapi/misc.json
index 8b3ca4fdd3..cd236c89b3 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -2924,10 +2924,11 @@
 #
 # @node-id: NUMA node ID the CPU belongs to
 # @socket-id: socket number within node/board the CPU belongs to
-# @core-id: core number within socket the CPU belongs to
+# @die-id: die number within node/board the CPU belongs to (Since 4.1)
+# @core-id: core number within die the CPU belongs to
 # @thread-id: thread number within core the CPU belongs to
 #
-# Note: currently there are 4 properties that could be present
+# Note: currently there are 5 properties that could be present
 # but management should be prepared to pass through other
 # properties with device_add command to allow for future
 # interface extension. This also requires the filed names to be kept in
@@ -2938,6 +2939,7 @@
 { 'struct': 'CpuInstanceProperties',
   'data': { '*node-id': 'int',
 '*socket-id': 'int',
+'*die-id': 'int',
 '*core-id': 'int',
 '*thread-id': 'int'
   }
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 9a93dd8be7..9bd35b4965 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -55,6 +55,7 @@
 #include "hw/xen/xen.h"
 #include "hw/i386/apic_internal.h"
 #include "hw/boards.h"
+#include "hw/i386/pc.h"
 #endif
 
 #include "disas/capstone.h"
@@ -5595,7 +5596,13 @@ static void x86_cpu_initfn(Object *obj)
 X86CPUClass *xcc = X86_CPU_GET_CLASS(obj);
 CPUX86State *env = &cpu->env;
 FeatureWord w;
+#ifndef CONFIG_USER_ONLY
+MachineState *machine = MACHINE(qdev_get_machine());
+PCMachineState *pcms = (PC

[Qemu-devel] [PATCH v2 2/5] i386/cpu: Consolidate die-id validity in smp context

2019-05-21 Thread Like Xu
Following the legacy smp check rules, die_id validity checks are added in
the same contexts as the legacy smp variables, such as hmp_hotpluggable_cpus(),
machine_set_cpu_numa_node(), cpu_slot_to_string() and pc_cpu_pre_plug().
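
A minimal sketch of the kind of check this adds at pre-plug time is given
below. check_die_id() is a hypothetical helper written for this note, and the
range bound it uses (dies_per_pkg - 1) is an assumption of the sketch rather
than something taken from the diff.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Validate a user-supplied die-id against the value derived from the
 * APIC ID. A die_id of -1 means "not set" and is filled in.
 * Returns 0 on success, -1 on failure with a message written to err.
 */
static int check_die_id(int *die_id, int dies_per_pkg, int die_id_from_apicid,
                        char *err, size_t errlen)
{
    if (*die_id == -1) {
        *die_id = die_id_from_apicid;   /* inherit the derived value */
        return 0;
    }
    if (*die_id < 0 || *die_id > dies_per_pkg - 1) {
        snprintf(err, errlen, "Invalid CPU die-id: %d must be in range 0:%d",
                 *die_id, dies_per_pkg - 1);
        return -1;
    }
    if (*die_id != die_id_from_apicid) {
        snprintf(err, errlen, "property die-id: %d doesn't match (die-id: %d)",
                 *die_id, die_id_from_apicid);
        return -1;
    }
    return 0;
}
```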

Signed-off-by: Like Xu 
---
 hmp.c |  3 +++
 hw/core/machine.c | 12 
 hw/i386/pc.c  | 11 +++
 3 files changed, 26 insertions(+)

diff --git a/hmp.c b/hmp.c
index 56a3ed7375..7deb7b7226 100644
--- a/hmp.c
+++ b/hmp.c
@@ -3112,6 +3112,9 @@ void hmp_hotpluggable_cpus(Monitor *mon, const QDict 
*qdict)
 if (c->has_socket_id) {
 monitor_printf(mon, "socket-id: \"%" PRIu64 "\"\n", 
c->socket_id);
 }
+if (c->has_die_id) {
+monitor_printf(mon, "die-id: \"%" PRIu64 "\"\n", c->die_id);
+}
 if (c->has_core_id) {
 monitor_printf(mon, "core-id: \"%" PRIu64 "\"\n", c->core_id);
 }
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 5d046a43e3..5116429732 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -659,6 +659,11 @@ void machine_set_cpu_numa_node(MachineState *machine,
 return;
 }
 
+if (props->has_die_id && !slot->props.has_die_id) {
+error_setg(errp, "die-id is not supported");
+return;
+}
+
 /* skip slots with explicit mismatch */
 if (props->has_thread_id && props->thread_id != slot->props.thread_id) 
{
 continue;
@@ -668,6 +673,10 @@ void machine_set_cpu_numa_node(MachineState *machine,
 continue;
 }
 
+if (props->has_die_id && props->die_id != slot->props.die_id) {
+continue;
+}
+
 if (props->has_socket_id && props->socket_id != slot->props.socket_id) 
{
 continue;
 }
@@ -925,6 +934,9 @@ static char *cpu_slot_to_string(const CPUArchId *cpu)
 if (cpu->props.has_socket_id) {
 g_string_append_printf(s, "socket-id: %"PRId64, cpu->props.socket_id);
 }
+if (cpu->props.has_die_id) {
+g_string_append_printf(s, "die-id: %"PRId64, cpu->props.die_id);
+}
 if (cpu->props.has_core_id) {
 if (s->len) {
 g_string_append_printf(s, ", ");
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 83ab53c814..00be2463af 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2321,6 +2321,10 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 error_setg(errp, "Invalid CPU socket-id: %u must be in range 0:%u",
cpu->socket_id, max_socket);
 return;
+} else if (cpu->die_id > max_socket) {
+error_setg(errp, "Invalid CPU die-id: %u must be in range 0:%u",
+   cpu->die_id, max_socket);
+return;
 }
 if (cpu->core_id < 0) {
 error_setg(errp, "CPU core-id is not set");
@@ -2378,6 +2382,13 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 }
 cpu->socket_id = topo.pkg_id;
 
+if (cpu->die_id != -1 && cpu->die_id != topo.die_id) {
+error_setg(errp, "property die-id: %u doesn't match set apic-id:"
+" 0x%x (die-id: %u)", cpu->die_id, cpu->apic_id, topo.die_id);
+return;
+}
+cpu->die_id = topo.die_id;
+
 if (cpu->core_id != -1 && cpu->core_id != topo.core_id) {
 error_setg(errp, "property core-id: %u doesn't match set apic-id:"
 " 0x%x (core-id: %u)", cpu->core_id, cpu->apic_id, topo.core_id);
-- 
2.21.0




[Qemu-devel] [PATCH v3 08/10] hw/arm: Replace global smp variables with machine smp properties

2019-05-19 Thread Like Xu
The global smp variables in arm are replaced with smp machine properties.
The init_cpus() and *_create_rpu() are refactored to pass MachineState.

Where a global is used widely within a function, a local variable of the
same name is introduced among the declarations; where it is used only once,
it is replaced on the spot. No semantic changes.
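
The two replacement styles can be sketched as follows. The types and names
here (DemoMachineState, DEMO_SOC_NUM_CPUS, demo_soc_*) are stand-ins invented
for the example and are much smaller than the real QEMU structures.

```c
/* Stand-in types for illustration only. */
struct DemoCpuTopology {
    unsigned cpus;
};

struct DemoMachineState {
    struct DemoCpuTopology smp;
};

#define DEMO_SOC_NUM_CPUS 4u    /* hypothetical per-SoC limit */

/* Single use: read ms->smp.cpus on the spot. */
static unsigned demo_soc_cpus_to_init(const struct DemoMachineState *ms)
{
    return ms->smp.cpus < DEMO_SOC_NUM_CPUS ? ms->smp.cpus
                                            : DEMO_SOC_NUM_CPUS;
}

/* Several uses: a local named like the old global keeps the body unchanged. */
static int demo_soc_realize(const struct DemoMachineState *ms,
                            unsigned *realized)
{
    unsigned int smp_cpus = ms->smp.cpus;
    unsigned int i;

    if (smp_cpus > DEMO_SOC_NUM_CPUS) {
        return -1;              /* more CPUs requested than the SoC has */
    }
    *realized = 0;
    for (i = 0; i < smp_cpus; i++) {
        (*realized)++;
    }
    return 0;
}
```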

Signed-off-by: Like Xu 
Reviewed-by: Alistair Francis 
---
 hw/arm/fsl-imx6.c  |  6 +-
 hw/arm/fsl-imx6ul.c|  6 +-
 hw/arm/fsl-imx7.c  |  7 +--
 hw/arm/highbank.c  |  1 +
 hw/arm/mcimx6ul-evk.c  |  2 +-
 hw/arm/mcimx7d-sabre.c |  2 +-
 hw/arm/raspi.c |  4 ++--
 hw/arm/realview.c  |  1 +
 hw/arm/sabrelite.c |  2 +-
 hw/arm/vexpress.c  | 16 ++--
 hw/arm/virt.c  |  8 +++-
 hw/arm/xlnx-zynqmp.c   | 16 ++--
 target/arm/cpu.c   |  8 +++-
 13 files changed, 56 insertions(+), 23 deletions(-)

diff --git a/hw/arm/fsl-imx6.c b/hw/arm/fsl-imx6.c
index 7b7b97f74c..ed772d5bd9 100644
--- a/hw/arm/fsl-imx6.c
+++ b/hw/arm/fsl-imx6.c
@@ -23,6 +23,7 @@
 #include "qapi/error.h"
 #include "qemu-common.h"
 #include "hw/arm/fsl-imx6.h"
+#include "hw/boards.h"
 #include "sysemu/sysemu.h"
 #include "chardev/char.h"
 #include "qemu/error-report.h"
@@ -33,11 +34,12 @@
 
 static void fsl_imx6_init(Object *obj)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 FslIMX6State *s = FSL_IMX6(obj);
 char name[NAME_SIZE];
 int i;
 
-for (i = 0; i < MIN(smp_cpus, FSL_IMX6_NUM_CPUS); i++) {
+for (i = 0; i < MIN(ms->smp.cpus, FSL_IMX6_NUM_CPUS); i++) {
 snprintf(name, NAME_SIZE, "cpu%d", i);
 object_initialize_child(obj, name, &s->cpu[i], sizeof(s->cpu[i]),
 "cortex-a9-" TYPE_ARM_CPU, &error_abort, NULL);
@@ -93,9 +95,11 @@ static void fsl_imx6_init(Object *obj)
 
 static void fsl_imx6_realize(DeviceState *dev, Error **errp)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 FslIMX6State *s = FSL_IMX6(dev);
 uint16_t i;
 Error *err = NULL;
+unsigned int smp_cpus = ms->smp.cpus;
 
 if (smp_cpus > FSL_IMX6_NUM_CPUS) {
 error_setg(errp, "%s: Only %d CPUs are supported (%d requested)",
diff --git a/hw/arm/fsl-imx6ul.c b/hw/arm/fsl-imx6ul.c
index 4b56bfa8d1..74b8ecbbb6 100644
--- a/hw/arm/fsl-imx6ul.c
+++ b/hw/arm/fsl-imx6ul.c
@@ -21,6 +21,7 @@
 #include "qemu-common.h"
 #include "hw/arm/fsl-imx6ul.h"
 #include "hw/misc/unimp.h"
+#include "hw/boards.h"
 #include "sysemu/sysemu.h"
 #include "qemu/error-report.h"
 
@@ -28,11 +29,12 @@
 
 static void fsl_imx6ul_init(Object *obj)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 FslIMX6ULState *s = FSL_IMX6UL(obj);
 char name[NAME_SIZE];
 int i;
 
-for (i = 0; i < MIN(smp_cpus, FSL_IMX6UL_NUM_CPUS); i++) {
+for (i = 0; i < MIN(ms->smp.cpus, FSL_IMX6UL_NUM_CPUS); i++) {
 snprintf(name, NAME_SIZE, "cpu%d", i);
 object_initialize_child(obj, name, &s->cpu[i], sizeof(s->cpu[i]),
 "cortex-a7-" TYPE_ARM_CPU, &error_abort, NULL);
@@ -156,10 +158,12 @@ static void fsl_imx6ul_init(Object *obj)
 
 static void fsl_imx6ul_realize(DeviceState *dev, Error **errp)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 FslIMX6ULState *s = FSL_IMX6UL(dev);
 int i;
 qemu_irq irq;
 char name[NAME_SIZE];
+unsigned int smp_cpus = ms->smp.cpus;
 
 if (smp_cpus > FSL_IMX6UL_NUM_CPUS) {
 error_setg(errp, "%s: Only %d CPUs are supported (%d requested)",
diff --git a/hw/arm/fsl-imx7.c b/hw/arm/fsl-imx7.c
index 7663ad6861..71cc414de6 100644
--- a/hw/arm/fsl-imx7.c
+++ b/hw/arm/fsl-imx7.c
@@ -23,6 +23,7 @@
 #include "qemu-common.h"
 #include "hw/arm/fsl-imx7.h"
 #include "hw/misc/unimp.h"
+#include "hw/boards.h"
 #include "sysemu/sysemu.h"
 #include "qemu/error-report.h"
 
@@ -30,12 +31,12 @@
 
 static void fsl_imx7_init(Object *obj)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 FslIMX7State *s = FSL_IMX7(obj);
 char name[NAME_SIZE];
 int i;
 
-
-for (i = 0; i < MIN(smp_cpus, FSL_IMX7_NUM_CPUS); i++) {
+for (i = 0; i < MIN(ms->smp.cpus, FSL_IMX7_NUM_CPUS); i++) {
 snprintf(name, NAME_SIZE, "cpu%d", i);
 object_initialize_child(obj, name, &s->cpu[i], sizeof(s->cpu[i]),
 ARM_CPU_TYPE_NAME("cortex-a7"), &error_abort,
@@ -155,11 +156,13 @@ static void fsl_imx7_init(Object *obj)
 
 static void fsl_imx7_realize(DeviceState *dev, Error **errp)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 FslIMX7State *s = FSL_IMX7(dev);
 Object *o;
 int i;
 qemu_irq irq;
 char name[NAME_SIZE];
+unsigned int smp_cpus

[Qemu-devel] [PATCH v3 02/10] machine: Refactor smp-related call chains to pass MachineState

2019-05-19 Thread Like Xu
To get rid of the global smp_* variables currently in use, functions that
read them should receive MachineState as an incoming parameter, which also
lets some redundant parameters be dropped. This is applied to the legacy
smbios_*(), *_machine_reset(), hot_add_cpu() and mips *_create_cpu().
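
The shape of the change can be sketched with a reset hook that used to take
no arguments and now receives the machine. ToyMachine, ToyMachineClass and the
toy_* functions are hypothetical names made up for this illustration.

```c
#include <stddef.h>

/* Illustrative stand-ins; the real MachineState/MachineClass are larger. */
typedef struct ToyMachine {
    unsigned smp_cpus;      /* would be ms->smp.cpus in the series */
    unsigned cpus_reset;    /* bookkeeping for the example */
} ToyMachine;

typedef struct ToyMachineClass {
    /* Was: void (*reset)(void); the hook now receives the machine. */
    void (*reset)(ToyMachine *ms);
} ToyMachineClass;

/* The hook reads the CPU count from its parameter, not from a global. */
static void toy_machine_reset(ToyMachine *ms)
{
    unsigned i;

    ms->cpus_reset = 0;
    for (i = 0; i < ms->smp_cpus; i++) {
        ms->cpus_reset++;
    }
}

/* The caller already holds the machine, so it simply passes it along. */
static void toy_system_reset(const ToyMachineClass *mc, ToyMachine *ms)
{
    if (mc->reset != NULL) {
        mc->reset(ms);
    }
}
```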

Suggested-by: Igor Mammedov 
Signed-off-by: Like Xu 
Reviewed-by: Alistair Francis 
---
 hw/arm/virt.c|  2 +-
 hw/hppa/machine.c|  2 +-
 hw/i386/acpi-build.c |  2 +-
 hw/i386/pc.c |  9 -
 hw/mips/mips_malta.c | 22 +++---
 hw/ppc/pnv.c |  3 +--
 hw/ppc/spapr.c   |  3 +--
 hw/s390x/s390-virtio-ccw.c   |  6 +++---
 hw/smbios/smbios.c   | 26 +++---
 include/hw/boards.h  |  4 ++--
 include/hw/firmware/smbios.h |  5 +++--
 include/hw/i386/pc.h |  2 +-
 qmp.c|  2 +-
 vl.c |  2 +-
 14 files changed, 46 insertions(+), 44 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 5331ab71e2..6b2f2e96d3 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1342,7 +1342,7 @@ static void virt_build_smbios(VirtMachineState *vms)
 vmc->smbios_old_sys_ver ? "1.0" : mc->name, false,
 true, SMBIOS_ENTRY_POINT_30);
 
-smbios_get_tables(NULL, 0, &smbios_tables, &smbios_tables_len,
+smbios_get_tables(MACHINE(vms), NULL, 0, &smbios_tables, &smbios_tables_len,
   &smbios_anchor, &smbios_anchor_len);
 
 if (smbios_anchor) {
diff --git a/hw/hppa/machine.c b/hw/hppa/machine.c
index d1b1d3caa4..416e67bab1 100644
--- a/hw/hppa/machine.c
+++ b/hw/hppa/machine.c
@@ -240,7 +240,7 @@ static void machine_hppa_init(MachineState *machine)
 cpu[0]->env.gr[21] = smp_cpus;
 }
 
-static void hppa_machine_reset(void)
+static void hppa_machine_reset(MachineState *ms)
 {
 int i;
 
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index b4ec14e349..c8e47e5713 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -187,7 +187,7 @@ static void acpi_get_pm_info(AcpiPmInfo *pm)
 pm->pcihp_io_len = 0;
 
 assert(obj);
-init_common_fadt_data(obj, &pm->fadt);
+init_common_fadt_data(machine, obj, &pm->fadt);
 if (piix) {
 /* w2k requires FADT(rev1) or it won't boot, keep PC compatible */
 pm->fadt.rev = 1;
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index d98b737b8f..9bcd867ea3 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -962,7 +962,7 @@ static void pc_build_smbios(PCMachineState *pcms)
 /* tell smbios about cpuid version and features */
 smbios_set_cpuid(cpu->env.cpuid_version, cpu->env.features[FEAT_1_EDX]);
 
-smbios_tables = smbios_get_table_legacy(&smbios_tables_len);
+smbios_tables = smbios_get_table_legacy(ms, &smbios_tables_len);
 if (smbios_tables) {
 fw_cfg_add_bytes(pcms->fw_cfg, FW_CFG_SMBIOS_ENTRIES,
  smbios_tables, smbios_tables_len);
@@ -979,7 +979,7 @@ static void pc_build_smbios(PCMachineState *pcms)
 array_count++;
 }
 }
-smbios_get_tables(mem_array, array_count,
+smbios_get_tables(ms, mem_array, array_count,
   &smbios_tables, &smbios_tables_len,
   &smbios_anchor, &smbios_anchor_len);
 g_free(mem_array);
@@ -1534,9 +1534,8 @@ static void pc_new_cpu(const char *typename, int64_t 
apic_id, Error **errp)
 error_propagate(errp, local_err);
 }
 
-void pc_hot_add_cpu(const int64_t id, Error **errp)
+void pc_hot_add_cpu(MachineState *ms, const int64_t id, Error **errp)
 {
-MachineState *ms = MACHINE(qdev_get_machine());
 int64_t apic_id = x86_cpu_apic_id_from_index(id);
 Error *local_err = NULL;
 
@@ -2622,7 +2621,7 @@ static void pc_machine_initfn(Object *obj)
 pc_system_flash_create(pcms);
 }
 
-static void pc_machine_reset(void)
+static void pc_machine_reset(MachineState *machine)
 {
 CPUState *cs;
 X86CPU *cpu;
diff --git a/hw/mips/mips_malta.c b/hw/mips/mips_malta.c
index 439665ab45..5fe9512c24 100644
--- a/hw/mips/mips_malta.c
+++ b/hw/mips/mips_malta.c
@@ -1124,15 +1124,15 @@ static void main_cpu_reset(void *opaque)
 }
 }
 
-static void create_cpu_without_cps(const char *cpu_type,
+static void create_cpu_without_cps(MachineState *ms,
qemu_irq *cbus_irq, qemu_irq *i8259_irq)
 {
 CPUMIPSState *env;
 MIPSCPU *cpu;
 int i;
 
-for (i = 0; i < smp_cpus; i++) {
-cpu = MIPS_CPU(cpu_create(cpu_type));
+for (i = 0; i < ms->smp.cpus; i++) {
+cpu = MIPS_CPU(cpu_create(ms->cpu_type));
 
 /* Init internal devices */
 cpu_mips_irq_init_cpu(cpu);
@@ -1146,7 +1146,7 @@ static void create_cpu_without_cps(const char *cpu_type,
 *cbus_irq = env->irq[4];
 }
 
-static void create_cps(MaltaState *s, const char *cpu_type,
+static void create_cps(MachineState *ms, MaltaState *s,

[Qemu-devel] [PATCH v3 10/10] vl.c: Replace smp global variables with smp machine properties

2019-05-19 Thread Like Xu
The global smp variables in vl.c are completely replaced with machine
properties.

From this commit on, smp_cpus/smp_cores/smp_threads/max_cpus are deprecated
and only the machine properties within MachineState are applied and enabled.
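
For reference, the three consistency checks that smp_parse() applies to
max_cpus in the hunk below can be summarised in a small standalone sketch.
The enum and function names are hypothetical; the real code reports through
error_report()/warn_report() and exits on errors.

```c
/* Outcome of the max_cpus checks (illustrative enum, not from QEMU). */
enum topo_check {
    TOPO_OK,
    TOPO_WARN_NOT_EQUAL_MAXCPUS,    /* deprecated, but still accepted */
    TOPO_ERR_MAXCPUS_LT_CPUS,
    TOPO_ERR_TOPOLOGY_GT_MAXCPUS
};

static enum topo_check check_smp_topology(unsigned cpus, unsigned sockets,
                                          unsigned cores, unsigned threads,
                                          unsigned max_cpus)
{
    unsigned product = sockets * cores * threads;

    if (max_cpus < cpus) {
        return TOPO_ERR_MAXCPUS_LT_CPUS;        /* maxcpus must be >= smp */
    }
    if (product > max_cpus) {
        return TOPO_ERR_TOPOLOGY_GT_MAXCPUS;    /* topology exceeds maxcpus */
    }
    if (product != max_cpus) {
        return TOPO_WARN_NOT_EQUAL_MAXCPUS;     /* warned about only */
    }
    return TOPO_OK;
}
```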

Signed-off-by: Like Xu 
Reviewed-by: Alistair Francis 
---
 vl.c | 53 ++---
 1 file changed, 26 insertions(+), 27 deletions(-)

diff --git a/vl.c b/vl.c
index 15d519e371..a700c93c77 100644
--- a/vl.c
+++ b/vl.c
@@ -162,10 +162,6 @@ static Chardev **serial_hds;
 Chardev *parallel_hds[MAX_PARALLEL_PORTS];
 int win2k_install_hack = 0;
 int singlestep = 0;
-int smp_cpus;
-unsigned int max_cpus;
-int smp_cores = 1;
-int smp_threads = 1;
 int acpi_enabled = 1;
 int no_hpet = 0;
 int fd_bootchk = 1;
@@ -1282,8 +1278,9 @@ static void smp_parse(QemuOpts *opts)
 sockets = sockets > 0 ? sockets : 1;
 cpus = cores * threads * sockets;
 } else {
-max_cpus = qemu_opt_get_number(opts, "maxcpus", cpus);
-sockets = max_cpus / (cores * threads);
+current_machine->smp.max_cpus =
+qemu_opt_get_number(opts, "maxcpus", cpus);
+sockets = current_machine->smp.max_cpus / (cores * threads);
 }
 } else if (cores == 0) {
 threads = threads > 0 ? threads : 1;
@@ -1300,34 +1297,37 @@ static void smp_parse(QemuOpts *opts)
 exit(1);
 }
 
-max_cpus = qemu_opt_get_number(opts, "maxcpus", cpus);
+current_machine->smp.max_cpus =
+qemu_opt_get_number(opts, "maxcpus", cpus);
 
-if (max_cpus < cpus) {
+if (current_machine->smp.max_cpus < cpus) {
 error_report("maxcpus must be equal to or greater than smp");
 exit(1);
 }
 
-if (sockets * cores * threads > max_cpus) {
+if (sockets * cores * threads > current_machine->smp.max_cpus) {
 error_report("cpu topology: "
  "sockets (%u) * cores (%u) * threads (%u) > "
  "maxcpus (%u)",
- sockets, cores, threads, max_cpus);
+ sockets, cores, threads,
+ current_machine->smp.max_cpus);
 exit(1);
 }
 
-if (sockets * cores * threads != max_cpus) {
+if (sockets * cores * threads != current_machine->smp.max_cpus) {
 warn_report("Invalid CPU topology deprecated: "
 "sockets (%u) * cores (%u) * threads (%u) "
 "!= maxcpus (%u)",
-sockets, cores, threads, max_cpus);
+sockets, cores, threads,
+current_machine->smp.max_cpus);
 }
 
-smp_cpus = cpus;
-smp_cores = cores;
-smp_threads = threads;
+current_machine->smp.cpus = cpus;
+current_machine->smp.cores = cores;
+current_machine->smp.threads = threads;
 }
 
-if (smp_cpus > 1) {
+if (current_machine->smp.cpus > 1) {
 Error *blocker = NULL;
 error_setg(&blocker, QERR_REPLAY_NOT_SUPPORTED, "smp");
 replay_add_blocker(blocker);
@@ -4128,26 +4128,25 @@ int main(int argc, char **argv, char **envp)
 machine_class->default_cpus = machine_class->default_cpus ?: 1;
 
 /* default to machine_class->default_cpus */
-smp_cpus = machine_class->default_cpus;
-max_cpus = machine_class->default_cpus;
+current_machine->smp.cpus = machine_class->default_cpus;
+current_machine->smp.max_cpus = machine_class->default_cpus;
+current_machine->smp.cores = 1;
+current_machine->smp.threads = 1;
 
 smp_parse(qemu_opts_find(qemu_find_opts("smp-opts"), NULL));
 
-current_machine->smp.cpus = smp_cpus;
-current_machine->smp.max_cpus = max_cpus;
-current_machine->smp.cores = smp_cores;
-current_machine->smp.threads = smp_threads;
-
 /* sanity-check smp_cpus and max_cpus against machine_class */
-if (smp_cpus < machine_class->min_cpus) {
+if (current_machine->smp.cpus < machine_class->min_cpus) {
 error_report("Invalid SMP CPUs %d. The min CPUs "
- "supported by machine '%s' is %d", smp_cpus,
+ "supported by machine '%s' is %d",
+ current_machine->smp.cpus,
  machine_class->name, machine_class->min_cpus);
 exit(1);
 }
-if (max_cpus > machine_class->max_cpus) {
+if (current_machine->smp.max_cpus > machine_class->max_cpus) {
 error_report("Invalid SMP CPUs %d. The max CPUs "
- "supported by machine '%s' is %d", max_cpus,
+ "supported by machine '%s' is %d",
+ current_machine->smp.max_cpus,
  machine_class->name, machine_class->max_cpus);
 exit(1);
 }
-- 
2.21.0




[Qemu-devel] [PATCH v3 03/10] general: Replace global smp variables with smp machine properties

2019-05-19 Thread Like Xu
In general, a function can obtain the MachineState reference through its call
chain or, less preferably, via qdev_get_machine() in !CONFIG_USER_ONLY mode.

Where a global is used widely within a function, a local variable of the
same name is introduced among the declarations to limit churn; where it is
used only once, it is replaced on the spot. No semantic changes.

Signed-off-by: Like Xu 
Reviewed-by: Alistair Francis 
---
 accel/kvm/kvm-all.c  |  4 ++--
 backends/hostmem.c   |  6 --
 cpus.c   |  6 --
 exec.c   |  3 ++-
 gdbstub.c|  4 
 hw/cpu/core.c|  4 +++-
 migration/postcopy-ram.c |  8 +++-
 numa.c   |  1 +
 target/openrisc/sys_helper.c |  6 +-
 tcg/tcg.c| 13 -
 10 files changed, 44 insertions(+), 11 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 524c4ddfbd..f8ef39d845 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -1533,8 +1533,8 @@ static int kvm_init(MachineState *ms)
 const char *name;
 int num;
 } num_cpus[] = {
-{ "SMP",  smp_cpus },
-{ "hotpluggable", max_cpus },
+{ "SMP",  ms->smp.cpus },
+{ "hotpluggable", ms->smp.max_cpus },
 { NULL, }
 }, *nc = num_cpus;
 int soft_vcpus_limit, hard_vcpus_limit;
diff --git a/backends/hostmem.c b/backends/hostmem.c
index 04baf479a1..463102aa15 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -222,6 +222,7 @@ static void host_memory_backend_set_prealloc(Object *obj, 
bool value,
 {
 Error *local_err = NULL;
 HostMemoryBackend *backend = MEMORY_BACKEND(obj);
+MachineState *ms = MACHINE(qdev_get_machine());
 
 if (backend->force_prealloc) {
 if (value) {
@@ -241,7 +242,7 @@ static void host_memory_backend_set_prealloc(Object *obj, 
bool value,
 void *ptr = memory_region_get_ram_ptr(&backend->mr);
 uint64_t sz = memory_region_size(&backend->mr);
 
-os_mem_prealloc(fd, ptr, sz, smp_cpus, &local_err);
+os_mem_prealloc(fd, ptr, sz, ms->smp.cpus, &local_err);
 if (local_err) {
 error_propagate(errp, local_err);
 return;
@@ -311,6 +312,7 @@ host_memory_backend_memory_complete(UserCreatable *uc, 
Error **errp)
 {
 HostMemoryBackend *backend = MEMORY_BACKEND(uc);
 HostMemoryBackendClass *bc = MEMORY_BACKEND_GET_CLASS(uc);
+MachineState *ms = MACHINE(qdev_get_machine());
 Error *local_err = NULL;
 void *ptr;
 uint64_t sz;
@@ -375,7 +377,7 @@ host_memory_backend_memory_complete(UserCreatable *uc, 
Error **errp)
  */
 if (backend->prealloc) {
 os_mem_prealloc(memory_region_get_fd(&backend->mr), ptr, sz,
-smp_cpus, &local_err);
+ms->smp.cpus, &local_err);
 if (local_err) {
 goto out;
 }
diff --git a/cpus.c b/cpus.c
index e58e7ab0f6..b49db3604a 100644
--- a/cpus.c
+++ b/cpus.c
@@ -2068,8 +2068,10 @@ static void qemu_dummy_start_vcpu(CPUState *cpu)
 
 void qemu_init_vcpu(CPUState *cpu)
 {
-cpu->nr_cores = smp_cores;
-cpu->nr_threads = smp_threads;
+MachineState *ms = MACHINE(qdev_get_machine());
+
+cpu->nr_cores = ms->smp.cores;
+cpu->nr_threads =  ms->smp.threads;
 cpu->stopped = true;
 
 if (!cpu->as) {
diff --git a/exec.c b/exec.c
index 4e734770c2..2744df648c 100644
--- a/exec.c
+++ b/exec.c
@@ -1871,6 +1871,7 @@ static void *file_ram_alloc(RAMBlock *block,
 bool truncate,
 Error **errp)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 void *area;
 
 block->page_size = qemu_fd_getpagesize(fd);
@@ -1927,7 +1928,7 @@ static void *file_ram_alloc(RAMBlock *block,
 }
 
 if (mem_prealloc) {
-os_mem_prealloc(fd, area, memory, smp_cpus, errp);
+os_mem_prealloc(fd, area, memory, ms->smp.cpus, errp);
 if (errp && *errp) {
 qemu_ram_munmap(fd, area, memory);
 return NULL;
diff --git a/gdbstub.c b/gdbstub.c
index d54abd17cc..dba37df2e9 100644
--- a/gdbstub.c
+++ b/gdbstub.c
@@ -30,6 +30,7 @@
 #include "sysemu/sysemu.h"
 #include "exec/gdbstub.h"
 #include "hw/cpu/cluster.h"
+#include "hw/boards.h"
 #endif
 
 #define MAX_PACKET_LENGTH 4096
@@ -1159,6 +1160,9 @@ static int gdb_handle_vcont(GDBState *s, const char *p)
 CPU_FOREACH(cpu) {
 max_cpus = max_cpus <= cpu->cpu_index ? cpu->cpu_index + 1 : max_cpus;
 }
+#else
+MachineState *ms = MACHINE(qdev_get_machine());
+unsigned int max_cpus = ms->smp.max_cpus;
 #endif
 /* uninitialised CPUs stay 0 */
 newstates = g_new0(char, max_cpus);
diff --git a/hw/cpu/core.c b/hw/cpu/core.c
index 7e42e2c87a..be2c7e10f7 100644
--- a/hw/cpu/core.c
+++ b/hw/cpu/core.c
@@ -11

[Qemu-devel] [PATCH v3 05/10] hw/riscv: Replace global smp variables with machine smp properties

2019-05-19 Thread Like Xu
The global smp variables in riscv are replaced with smp machine properties.

Where a global is used widely within a function, a local variable of the
same name is introduced among the declarations; where it is used only once,
it is replaced on the spot. No semantic changes.

Signed-off-by: Like Xu 
---
 hw/riscv/sifive_e.c| 6 --
 hw/riscv/sifive_plic.c | 3 +++
 hw/riscv/sifive_u.c| 6 --
 hw/riscv/spike.c   | 2 ++
 hw/riscv/virt.c| 1 +
 5 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/hw/riscv/sifive_e.c b/hw/riscv/sifive_e.c
index b1cd11363c..ae86a63c04 100644
--- a/hw/riscv/sifive_e.c
+++ b/hw/riscv/sifive_e.c
@@ -137,6 +137,7 @@ static void riscv_sifive_e_init(MachineState *machine)
 
 static void riscv_sifive_e_soc_init(Object *obj)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 SiFiveESoCState *s = RISCV_E_SOC(obj);
 
 object_initialize_child(obj, "cpus", &s->cpus,
@@ -144,12 +145,13 @@ static void riscv_sifive_e_soc_init(Object *obj)
 &error_abort, NULL);
 object_property_set_str(OBJECT(&s->cpus), SIFIVE_E_CPU, "cpu-type",
 &error_abort);
-object_property_set_int(OBJECT(&s->cpus), smp_cpus, "num-harts",
+object_property_set_int(OBJECT(&s->cpus), ms->smp.cpus, "num-harts",
 &error_abort);
 }
 
 static void riscv_sifive_e_soc_realize(DeviceState *dev, Error **errp)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 const struct MemmapEntry *memmap = sifive_e_memmap;
 
 SiFiveESoCState *s = RISCV_E_SOC(dev);
@@ -179,7 +181,7 @@ static void riscv_sifive_e_soc_realize(DeviceState *dev, 
Error **errp)
 SIFIVE_E_PLIC_CONTEXT_STRIDE,
 memmap[SIFIVE_E_PLIC].size);
 sifive_clint_create(memmap[SIFIVE_E_CLINT].base,
-memmap[SIFIVE_E_CLINT].size, smp_cpus,
+memmap[SIFIVE_E_CLINT].size, ms->smp.cpus,
 SIFIVE_SIP_BASE, SIFIVE_TIMECMP_BASE, SIFIVE_TIME_BASE);
 sifive_mmio_emulate(sys_mem, "riscv.sifive.e.aon",
 memmap[SIFIVE_E_AON].base, memmap[SIFIVE_E_AON].size);
diff --git a/hw/riscv/sifive_plic.c b/hw/riscv/sifive_plic.c
index 07a032d93d..d4010a1f39 100644
--- a/hw/riscv/sifive_plic.c
+++ b/hw/riscv/sifive_plic.c
@@ -23,6 +23,7 @@
 #include "qemu/error-report.h"
 #include "hw/sysbus.h"
 #include "hw/pci/msi.h"
+#include "hw/boards.h"
 #include "target/riscv/cpu.h"
 #include "sysemu/sysemu.h"
 #include "hw/riscv/sifive_plic.h"
@@ -438,6 +439,8 @@ static void sifive_plic_irq_request(void *opaque, int irq, int level)
 
 static void sifive_plic_realize(DeviceState *dev, Error **errp)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
+unsigned int smp_cpus = ms->smp.cpus;
 SiFivePLICState *plic = SIFIVE_PLIC(dev);
 int i;
 
diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c
index 5ecc47cea3..43bf256946 100644
--- a/hw/riscv/sifive_u.c
+++ b/hw/riscv/sifive_u.c
@@ -321,13 +321,14 @@ static void riscv_sifive_u_init(MachineState *machine)
 
 static void riscv_sifive_u_soc_init(Object *obj)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 SiFiveUSoCState *s = RISCV_U_SOC(obj);
 
 object_initialize_child(obj, "cpus", &s->cpus, sizeof(s->cpus),
 TYPE_RISCV_HART_ARRAY, &error_abort, NULL);
 object_property_set_str(OBJECT(&s->cpus), SIFIVE_U_CPU, "cpu-type",
 &error_abort);
-object_property_set_int(OBJECT(&s->cpus), smp_cpus, "num-harts",
+object_property_set_int(OBJECT(&s->cpus), ms->smp.cpus, "num-harts",
 &error_abort);
 
 sysbus_init_child_obj(obj, "gem", &s->gem, sizeof(s->gem),
@@ -336,6 +337,7 @@ static void riscv_sifive_u_soc_init(Object *obj)
 
 static void riscv_sifive_u_soc_realize(DeviceState *dev, Error **errp)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 SiFiveUSoCState *s = RISCV_U_SOC(dev);
 const struct MemmapEntry *memmap = sifive_u_memmap;
 MemoryRegion *system_memory = get_system_memory();
@@ -371,7 +373,7 @@ static void riscv_sifive_u_soc_realize(DeviceState *dev, Error **errp)
 sifive_uart_create(system_memory, memmap[SIFIVE_U_UART1].base,
 serial_hd(1), qdev_get_gpio_in(DEVICE(s->plic), SIFIVE_U_UART1_IRQ));
 sifive_clint_create(memmap[SIFIVE_U_CLINT].base,
-memmap[SIFIVE_U_CLINT].size, smp_cpus,
+memmap[SIFIVE_U_CLINT].size, ms->smp.cpus,
 SIFIVE_SIP_BASE, SIFIVE_TIMECMP_BASE, SIFIVE_TIME_BASE);
 
 for (i = 0; i < SIFIVE_U_PLIC_NUM_SOURCES; i++) {
diff --git a/hw/riscv/spike.c b/hw/riscv/spike.c
index 2a000a5800..6a747ff22e 100644
--- a/hw/riscv/spike.c
+++ b/hw/riscv/spike.c
@@ -171,6 +171,7 @@ static void spike_v1_10_0_board_init(MachineState *machine)
 MemoryRegion *main_mem = g_new(MemoryRegion, 1);
 MemoryRegion *

[Qemu-devel] [PATCH v3 06/10] hw/s390x: Replace global smp variables with machine smp properties

2019-05-19 Thread Like Xu
The global smp variables in s390x are replaced with smp machine properties.

If a variable is used widely within a function, a local variable of the
same name is introduced among the declarations; if it is used only once,
it is replaced on the spot. No semantic changes.

Signed-off-by: Like Xu 
---
 hw/s390x/s390-virtio-ccw.c | 3 ++-
 hw/s390x/sclp.c| 2 +-
 target/s390x/cpu.c | 3 +++
 target/s390x/excp_helper.c | 5 +
 4 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index ed1fe7a93e..692ad6e372 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -83,7 +83,7 @@ static void s390_init_cpus(MachineState *machine)
 /* initialize possible_cpus */
 mc->possible_cpu_arch_ids(machine);
 
-for (i = 0; i < smp_cpus; i++) {
+for (i = 0; i < machine->smp.cpus; i++) {
 s390x_new_cpu(machine->cpu_type, i, &error_fatal);
 }
 }
@@ -410,6 +410,7 @@ static CpuInstanceProperties s390_cpu_index_to_props(MachineState *ms,
 static const CPUArchIdList *s390_possible_cpu_arch_ids(MachineState *ms)
 {
 int i;
+unsigned int max_cpus = ms->smp.max_cpus;
 
 if (ms->possible_cpus) {
 g_assert(ms->possible_cpus && ms->possible_cpus->len == max_cpus);
diff --git a/hw/s390x/sclp.c b/hw/s390x/sclp.c
index 4510a800cb..fac7c3bb6c 100644
--- a/hw/s390x/sclp.c
+++ b/hw/s390x/sclp.c
@@ -64,7 +64,7 @@ static void read_SCP_info(SCLPDevice *sclp, SCCB *sccb)
 prepare_cpu_entries(sclp, read_info->entries, &cpu_count);
 read_info->entries_cpu = cpu_to_be16(cpu_count);
 read_info->offset_cpu = cpu_to_be16(offsetof(ReadInfo, entries));
-read_info->highest_cpu = cpu_to_be16(max_cpus - 1);
+read_info->highest_cpu = cpu_to_be16(machine->smp.max_cpus - 1);
 
 read_info->ibc_val = cpu_to_be32(s390_get_ibc_val());
 
diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
index b1df63d82c..f1e5c0d9c3 100644
--- a/target/s390x/cpu.c
+++ b/target/s390x/cpu.c
@@ -37,6 +37,7 @@
 #include "hw/qdev-properties.h"
 #ifndef CONFIG_USER_ONLY
 #include "hw/hw.h"
+#include "hw/boards.h"
 #include "sysemu/arch_init.h"
 #include "sysemu/sysemu.h"
 #endif
@@ -193,6 +194,8 @@ static void s390_cpu_realizefn(DeviceState *dev, Error **errp)
 }
 
 #if !defined(CONFIG_USER_ONLY)
+MachineState *ms = MACHINE(qdev_get_machine());
+unsigned int max_cpus = ms->smp.max_cpus;
 if (cpu->env.core_id >= max_cpus) {
 error_setg(&err, "Unable to add CPU with core-id: %" PRIu32
", maximum core-id: %d", cpu->env.core_id,
diff --git a/target/s390x/excp_helper.c b/target/s390x/excp_helper.c
index 3a467b72c5..1c6938effc 100644
--- a/target/s390x/excp_helper.c
+++ b/target/s390x/excp_helper.c
@@ -31,6 +31,7 @@
 #ifndef CONFIG_USER_ONLY
 #include "sysemu/sysemu.h"
 #include "hw/s390x/s390_flic.h"
+#include "hw/boards.h"
 #endif
 
 void QEMU_NORETURN tcg_s390_program_interrupt(CPUS390XState *env, uint32_t code,
@@ -300,6 +301,10 @@ static void do_ext_interrupt(CPUS390XState *env)
 g_assert(cpu_addr < S390_MAX_CPUS);
 lowcore->cpu_addr = cpu_to_be16(cpu_addr);
 clear_bit(cpu_addr, env->emergency_signals);
+#ifndef CONFIG_USER_ONLY
+MachineState *ms = MACHINE(qdev_get_machine());
+unsigned int max_cpus = ms->smp.max_cpus;
+#endif
 if (bitmap_empty(env->emergency_signals, max_cpus)) {
 env->pending_int &= ~INTERRUPT_EMERGENCY_SIGNAL;
 }
-- 
2.21.0




[Qemu-devel] [PATCH v3 04/10] hw/ppc: Replace global smp variables with machine smp properties

2019-05-19 Thread Like Xu
The global smp variables in ppc are replaced with smp machine properties.

If a variable is used widely within a function, a local variable of the
same name is introduced among the declarations; if it is used only once,
it is replaced on the spot. No semantic changes.

Signed-off-by: Like Xu 
---
 hw/ppc/e500.c |  3 +++
 hw/ppc/mac_newworld.c |  3 ++-
 hw/ppc/mac_oldworld.c |  3 ++-
 hw/ppc/pnv.c  |  6 --
 hw/ppc/prep.c |  4 ++--
 hw/ppc/spapr.c| 34 ++
 hw/ppc/spapr_rtas.c   |  4 +++-
 7 files changed, 42 insertions(+), 15 deletions(-)

diff --git a/hw/ppc/e500.c b/hw/ppc/e500.c
index beb2efd694..5e42e5a059 100644
--- a/hw/ppc/e500.c
+++ b/hw/ppc/e500.c
@@ -307,6 +307,7 @@ static int ppce500_load_device_tree(PPCE500MachineState *pms,
 bool dry_run)
 {
 MachineState *machine = MACHINE(pms);
+unsigned int smp_cpus = machine->smp.cpus;
 const PPCE500MachineClass *pmc = PPCE500_MACHINE_GET_CLASS(pms);
 CPUPPCState *env = first_cpu->env_ptr;
 int ret = -1;
@@ -734,6 +735,7 @@ static DeviceState *ppce500_init_mpic_qemu(PPCE500MachineState *pms,
 SysBusDevice *s;
 int i, j, k;
 MachineState *machine = MACHINE(pms);
+unsigned int smp_cpus = machine->smp.cpus;
 const PPCE500MachineClass *pmc = PPCE500_MACHINE_GET_CLASS(pms);
 
 dev = qdev_create(NULL, TYPE_OPENPIC);
@@ -846,6 +848,7 @@ void ppce500_init(MachineState *machine)
 struct boot_info *boot_info;
 int dt_size;
 int i;
+unsigned int smp_cpus = machine->smp.cpus;
 /* irq num for pin INTA, INTB, INTC and INTD is 1, 2, 3 and
  * 4 respectively */
 unsigned int pci_irq_nrs[PCI_NUM_PINS] = {1, 2, 3, 4};
diff --git a/hw/ppc/mac_newworld.c b/hw/ppc/mac_newworld.c
index 02d8559621..257b26ee24 100644
--- a/hw/ppc/mac_newworld.c
+++ b/hw/ppc/mac_newworld.c
@@ -135,6 +135,7 @@ static void ppc_core99_init(MachineState *machine)
 DeviceState *dev, *pic_dev;
 hwaddr nvram_addr = 0xFFF04000;
 uint64_t tbfreq;
+unsigned int smp_cpus = machine->smp.cpus;
 
 linux_boot = (kernel_filename != NULL);
 
@@ -464,7 +465,7 @@ static void ppc_core99_init(MachineState *machine)
 sysbus_mmio_map(s, 1, CFG_ADDR + 2);
 
 fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, (uint16_t)smp_cpus);
-fw_cfg_add_i16(fw_cfg, FW_CFG_MAX_CPUS, (uint16_t)max_cpus);
+fw_cfg_add_i16(fw_cfg, FW_CFG_MAX_CPUS, (uint16_t)machine->smp.max_cpus);
 fw_cfg_add_i64(fw_cfg, FW_CFG_RAM_SIZE, (uint64_t)ram_size);
 fw_cfg_add_i16(fw_cfg, FW_CFG_MACHINE_ID, machine_arch);
 fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ADDR, kernel_base);
diff --git a/hw/ppc/mac_oldworld.c b/hw/ppc/mac_oldworld.c
index 460cbc7923..1968f05a6c 100644
--- a/hw/ppc/mac_oldworld.c
+++ b/hw/ppc/mac_oldworld.c
@@ -99,6 +99,7 @@ static void ppc_heathrow_init(MachineState *machine)
 DeviceState *dev, *pic_dev;
 BusState *adb_bus;
 int bios_size;
+unsigned int smp_cpus = machine->smp.cpus;
 uint16_t ppc_boot_device;
 DriveInfo *hd[MAX_IDE_BUS * MAX_IDE_DEVS];
 void *fw_cfg;
@@ -322,7 +323,7 @@ static void ppc_heathrow_init(MachineState *machine)
 sysbus_mmio_map(s, 1, CFG_ADDR + 2);
 
 fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, (uint16_t)smp_cpus);
-fw_cfg_add_i16(fw_cfg, FW_CFG_MAX_CPUS, (uint16_t)max_cpus);
+fw_cfg_add_i16(fw_cfg, FW_CFG_MAX_CPUS, (uint16_t)machine->smp.max_cpus);
 fw_cfg_add_i64(fw_cfg, FW_CFG_RAM_SIZE, (uint64_t)ram_size);
 fw_cfg_add_i16(fw_cfg, FW_CFG_MACHINE_ID, ARCH_HEATHROW);
 fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ADDR, kernel_base);
diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 1e8c505936..3bb15338de 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -678,7 +678,8 @@ static void pnv_init(MachineState *machine)
 object_property_add_child(OBJECT(pnv), chip_name, chip, &error_fatal);
 object_property_set_int(chip, PNV_CHIP_HWID(i), "chip-id",
 &error_fatal);
-object_property_set_int(chip, smp_cores, "nr-cores", &error_fatal);
+object_property_set_int(chip, machine->smp.cores,
+"nr-cores", &error_fatal);
 object_property_set_bool(chip, true, "realized", &error_fatal);
 }
 g_free(chip_typename);
@@ -1134,6 +1135,7 @@ static void pnv_chip_instance_init(Object *obj)
 
 static void pnv_chip_core_realize(PnvChip *chip, Error **errp)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 Error *error = NULL;
 PnvChipClass *pcc = PNV_CHIP_GET_CLASS(chip);
 const char *typename = pnv_chip_core_typename(chip);
@@ -1168,7 +1170,7 @@ static void pnv_chip_core_realize(PnvChip *chip, Error **errp)
 snprintf(core_name, sizeof(core_name), "core[%d]", core_hwid);
 object_property_add_child(OBJECT(chip), core_name, OBJECT(pnv_core),
   &error_fatal);
-object_property_set_int(

[Qemu-devel] [PATCH v3 07/10] hw/i386: Replace global smp variables with machine smp properties

2019-05-19 Thread Like Xu
The global smp variables in i386 are replaced with smp machine properties.
To avoid calling qdev_get_machine() as much as possible, some related functions
for ACPI data generation are refactored to take a MachineState argument.

If a variable is used widely within a function, a local variable of the
same name is introduced among the declarations; if it is used only once,
it is replaced on the spot. No semantic changes.
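
Passing MachineState down the call chain, instead of looking the machine
up inside each helper, could look roughly like the sketch below. It is an
illustration only: the structs are reduced stand-ins, the sketch_* names
are hypothetical, and SKETCH_FORCE_APIC_CLUSTER_BIT is a placeholder for
the ACPI_FADT_F_FORCE_APIC_CLUSTER_MODEL constant used in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-ins for the QEMU types; only the fields used below. */
typedef struct CpuTopology {
    unsigned int cpus;
    unsigned int max_cpus;
} CpuTopology;

typedef struct MachineState {
    CpuTopology smp;
} MachineState;

/* Placeholder bit position, assumed for this sketch only. */
#define SKETCH_FORCE_APIC_CLUSTER_BIT 18

/*
 * Leaf helper: receives the machine from its caller, so it needs neither
 * a global max_cpus nor a machine lookup of its own.
 */
static unsigned int sketch_fadt_flags(const MachineState *ms)
{
    return (ms->smp.max_cpus > 8) ?
           (1u << SKETCH_FORCE_APIC_CLUSTER_BIT) : 0;
}

/*
 * Entry point: it already holds a MachineState pointer and simply hands
 * it down to the helper.
 */
static unsigned int sketch_acpi_build(const MachineState *machine)
{
    assert(machine != NULL);
    return sketch_fadt_flags(machine);
}
```

Helpers that have no caller able to supply the pointer, such as the
kvmvapic callbacks in this patch, still fall back to qdev_get_machine().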

Signed-off-by: Like Xu 
---
 hw/i386/acpi-build.c  | 11 +++
 hw/i386/kvmvapic.c|  7 +--
 hw/i386/pc.c  | 24 +++-
 hw/i386/xen/xen-hvm.c |  4 
 target/i386/cpu.c |  4 +++-
 5 files changed, 34 insertions(+), 16 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index c8e47e5713..eb41af04ce 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -45,6 +45,7 @@
 #include "sysemu/tpm.h"
 #include "hw/acpi/tpm.h"
 #include "hw/acpi/vmgenid.h"
+#include "hw/boards.h"
 #include "sysemu/tpm_backend.h"
 #include "hw/timer/mc146818rtc_regs.h"
 #include "hw/mem/memory-device.h"
@@ -126,7 +127,8 @@ typedef struct FwCfgTPMConfig {
 uint8_t tpmppi_version;
 } QEMU_PACKED FwCfgTPMConfig;
 
-static void init_common_fadt_data(Object *o, AcpiFadtData *data)
+static void init_common_fadt_data(MachineState *ms, Object *o,
+  AcpiFadtData *data)
 {
 uint32_t io = object_property_get_uint(o, ACPI_PM_PROP_PM_IO_BASE, NULL);
 AmlAddressSpace as = AML_AS_SYSTEM_IO;
@@ -142,7 +144,8 @@ static void init_common_fadt_data(Object *o, AcpiFadtData *data)
  * CPUs for more than 8 CPUs, "Clustered Logical" mode has to be
  * used
  */
-((max_cpus > 8) ? (1 << ACPI_FADT_F_FORCE_APIC_CLUSTER_MODEL) : 0),
+((ms->smp.max_cpus > 8) ?
+(1 << ACPI_FADT_F_FORCE_APIC_CLUSTER_MODEL) : 0),
 .int_model = 1 /* Multiple APIC */,
 .rtc_century = RTC_CENTURY,
 .plvl2_lat = 0xfff /* C2 state not supported */,
@@ -176,7 +179,7 @@ static Object *object_resolve_type_unambiguous(const char *typename)
 return o;
 }
 
-static void acpi_get_pm_info(AcpiPmInfo *pm)
+static void acpi_get_pm_info(MachineState *machine, AcpiPmInfo *pm)
 {
 Object *piix = object_resolve_type_unambiguous(TYPE_PIIX4_PM);
 Object *lpc = object_resolve_type_unambiguous(TYPE_ICH9_LPC_DEVICE);
@@ -2629,7 +2632,7 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
 AcpiSlicOem slic_oem = { .id = NULL, .table_id = NULL };
 Object *vmgenid_dev;
 
-acpi_get_pm_info(&pm);
+acpi_get_pm_info(machine, &pm);
 acpi_get_misc_info(&misc);
 acpi_get_pci_holes(&pci_hole, &pci_hole64);
 acpi_get_slic_oem(&slic_oem);
diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
index 70f6f26a94..3fce704613 100644
--- a/hw/i386/kvmvapic.c
+++ b/hw/i386/kvmvapic.c
@@ -17,6 +17,7 @@
 #include "sysemu/kvm.h"
 #include "hw/i386/apic_internal.h"
 #include "hw/sysbus.h"
+#include "hw/boards.h"
 #include "tcg/tcg.h"
 
 #define VAPIC_IO_PORT   0x7e
@@ -441,11 +442,12 @@ static void do_patch_instruction(CPUState *cs, run_on_cpu_data data)
 
 static void patch_instruction(VAPICROMState *s, X86CPU *cpu, target_ulong ip)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 CPUState *cs = CPU(cpu);
 VAPICHandlers *handlers;
 PatchInfo *info;
 
-if (smp_cpus == 1) {
+if (ms->smp.cpus == 1) {
 handlers = &s->rom_state.up;
 } else {
 handlers = &s->rom_state.mp;
@@ -746,6 +748,7 @@ static void do_vapic_enable(CPUState *cs, run_on_cpu_data data)
 static void kvmvapic_vm_state_change(void *opaque, int running,
  RunState state)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 VAPICROMState *s = opaque;
 uint8_t *zero;
 
@@ -754,7 +757,7 @@ static void kvmvapic_vm_state_change(void *opaque, int running,
 }
 
 if (s->state == VAPIC_ACTIVE) {
-if (smp_cpus == 1) {
+if (ms->smp.cpus == 1) {
 run_on_cpu(first_cpu, do_vapic_enable, RUN_ON_CPU_HOST_PTR(s));
 } else {
 zero = g_malloc0(s->rom_state.vapic_size);
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 9bcd867ea3..896c22e32e 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -932,12 +932,14 @@ void enable_compat_apic_id_mode(void)
  * no concept of "CPU index", and the NUMA tables on fw_cfg need the APIC ID of
  * all CPUs up to max_cpus.
  */
-static uint32_t x86_cpu_apic_id_from_index(unsigned int cpu_index)
+static uint32_t x86_cpu_apic_id_from_index(MachineState *ms,
+   unsigned int cpu_index)
 {
 uint32_t correct_id;
 static bool warned;
 
-correct_id = x86_apicid_from_cpu_idx(smp_cores, smp_threads, cpu_index);
+correct_id = x86_apicid

[Qemu-devel] [PATCH v3 09/10] hw: Replace global smp variables with MachineState for all remaining archs

2019-05-19 Thread Like Xu
The global smp variables in alpha/hppa/mips/openrisc/sparc*/xtensa codes
are replaced with smp properties from MachineState.

If a variable is used widely within a function, a local variable of the
same name is introduced among the declarations; if it is used only once,
it is replaced on the spot. No semantic changes.

Signed-off-by: Like Xu 
Reviewed-by: Alistair Francis 
---
 hw/alpha/dp264.c   | 1 +
 hw/hppa/machine.c  | 2 ++
 hw/mips/boston.c   | 2 +-
 hw/mips/mips_malta.c   | 2 ++
 hw/openrisc/openrisc_sim.c | 1 +
 hw/sparc/sun4m.c   | 2 ++
 hw/sparc64/sun4u.c | 4 ++--
 hw/xtensa/sim.c| 2 +-
 hw/xtensa/xtfpga.c | 1 +
 9 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/hw/alpha/dp264.c b/hw/alpha/dp264.c
index 0347eb897c..9dfb835013 100644
--- a/hw/alpha/dp264.c
+++ b/hw/alpha/dp264.c
@@ -63,6 +63,7 @@ static void clipper_init(MachineState *machine)
 char *palcode_filename;
 uint64_t palcode_entry, palcode_low, palcode_high;
 uint64_t kernel_entry, kernel_low, kernel_high;
+unsigned int smp_cpus = machine->smp.cpus;
 
 /* Create up to 4 cpus.  */
 memset(cpus, 0, sizeof(cpus));
diff --git a/hw/hppa/machine.c b/hw/hppa/machine.c
index 416e67bab1..662838d83b 100644
--- a/hw/hppa/machine.c
+++ b/hw/hppa/machine.c
@@ -72,6 +72,7 @@ static void machine_hppa_init(MachineState *machine)
 MemoryRegion *ram_region;
 MemoryRegion *cpu_region;
 long i;
+unsigned int smp_cpus = machine->smp.cpus;
 
 ram_size = machine->ram_size;
 
@@ -242,6 +243,7 @@ static void machine_hppa_init(MachineState *machine)
 
 static void hppa_machine_reset(MachineState *ms)
 {
+unsigned int smp_cpus = ms->smp.cpus;
 int i;
 
 qemu_devices_reset();
diff --git a/hw/mips/boston.c b/hw/mips/boston.c
index a8b29f62f5..ccbfac54ef 100644
--- a/hw/mips/boston.c
+++ b/hw/mips/boston.c
@@ -460,7 +460,7 @@ static void boston_mach_init(MachineState *machine)
 
 object_property_set_str(OBJECT(s->cps), machine->cpu_type, "cpu-type",
 );
-object_property_set_int(OBJECT(s->cps), smp_cpus, "num-vp", );
+object_property_set_int(OBJECT(s->cps), machine->smp.cpus, "num-vp", );
 object_property_set_bool(OBJECT(s->cps), true, "realized", );
 
 if (err != NULL) {
diff --git a/hw/mips/mips_malta.c b/hw/mips/mips_malta.c
index 5fe9512c24..ead5976d1a 100644
--- a/hw/mips/mips_malta.c
+++ b/hw/mips/mips_malta.c
@@ -1095,6 +1095,8 @@ static int64_t load_kernel (void)
 
 static void malta_mips_config(MIPSCPU *cpu)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
+unsigned int smp_cpus = ms->smp.cpus;
 CPUMIPSState *env = &cpu->env;
 CPUState *cs = CPU(cpu);
 
diff --git a/hw/openrisc/openrisc_sim.c b/hw/openrisc/openrisc_sim.c
index 0a906d815e..8d828e78ee 100644
--- a/hw/openrisc/openrisc_sim.c
+++ b/hw/openrisc/openrisc_sim.c
@@ -131,6 +131,7 @@ static void openrisc_sim_init(MachineState *machine)
 qemu_irq *cpu_irqs[2];
 qemu_irq serial_irq;
 int n;
+unsigned int smp_cpus = machine->smp.cpus;
 
 for (n = 0; n < smp_cpus; n++) {
 cpu = OPENRISC_CPU(cpu_create(machine->cpu_type));
diff --git a/hw/sparc/sun4m.c b/hw/sparc/sun4m.c
index 07d126aea8..5c3739f2ef 100644
--- a/hw/sparc/sun4m.c
+++ b/hw/sparc/sun4m.c
@@ -852,6 +852,8 @@ static void sun4m_hw_init(const struct sun4m_hwdef *hwdef,
 FWCfgState *fw_cfg;
 DeviceState *dev;
 SysBusDevice *s;
+unsigned int smp_cpus = machine->smp.cpus;
+unsigned int max_cpus = machine->smp.max_cpus;
 
 /* init CPUs */
 for(i = 0; i < smp_cpus; i++) {
diff --git a/hw/sparc64/sun4u.c b/hw/sparc64/sun4u.c
index 399f2d73c8..0807f274bf 100644
--- a/hw/sparc64/sun4u.c
+++ b/hw/sparc64/sun4u.c
@@ -678,8 +678,8 @@ static void sun4uv_init(MemoryRegion *address_space_mem,
 &FW_CFG_IO(dev)->comb_iomem);
 
 fw_cfg = FW_CFG(dev);
-fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, (uint16_t)smp_cpus);
-fw_cfg_add_i16(fw_cfg, FW_CFG_MAX_CPUS, (uint16_t)max_cpus);
+fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, (uint16_t)machine->smp.cpus);
+fw_cfg_add_i16(fw_cfg, FW_CFG_MAX_CPUS, (uint16_t)machine->smp.max_cpus);
 fw_cfg_add_i64(fw_cfg, FW_CFG_RAM_SIZE, (uint64_t)ram_size);
 fw_cfg_add_i16(fw_cfg, FW_CFG_MACHINE_ID, hwdef->machine_id);
 fw_cfg_add_i64(fw_cfg, FW_CFG_KERNEL_ADDR, kernel_entry);
diff --git a/hw/xtensa/sim.c b/hw/xtensa/sim.c
index 12c7437398..a4eef76fbc 100644
--- a/hw/xtensa/sim.c
+++ b/hw/xtensa/sim.c
@@ -60,7 +60,7 @@ static void xtensa_sim_init(MachineState *machine)
 const char *kernel_filename = machine->kernel_filename;
 int n;
 
-for (n = 0; n < smp_cpus; n++) {
+for (n = 0; n < machine->smp.cpus; n++) {
 cpu = XTENSA_CPU(cpu_create(machine->cpu_type));
 env = &cpu->env;
 
diff --git a/h

[Qemu-devel] [PATCH v3 00/10] Refactor cpu topo into machine properties

2019-05-19 Thread Like Xu
This patch series turns the existing cores/threads/sockets values into
machine properties and gets rid of the global smp_* variables that
currently hold them.

Getting rid of the globals helps disentangle layer violations. It is done
one step at a time: each smp_foo is replaced with the machine property,
using as few qdev_get_machine() calls as possible, and other related
refactoring efforts are deferred.

==changelog==

v3:

- rephrase commit messages
- s/of/of present/ for CpuTopology comment
- drop redundant arguments such as cpu_type
- use ms instead of macs in migration context
- rebase to commit 1b46b4daa6

v2: https://patchwork.ozlabs.org/cover/1095727/

- pass MachineState via call chain with trivial fixups
- replace smp_cpus directly at places if it's only used once
- s/topo/smp/ and drop smp_ prefix inside CpuTopology structure
- expand commit messages to explain what each patch does
- fix Patchew build failure for xen usage
- use macs rather than ms in migration context for MigrationState
- cleanup unrelated and redundant changes
- split OpenRISC and RISC-V related patches

v1: https://patchwork.kernel.org/cover/10876667/

Like Xu (10):
  hw/boards: Add struct CpuTopology to MachineState
  machine: Refactor smp-related call chains to pass MachineState
  general: Replace global smp variables with smp machine properties
  hw/ppc: Replace global smp variables with machine smp properties
  hw/riscv: Replace global smp variables with machine smp properties
  hw/s390x: Replace global smp variables with machine smp properties
  hw/i386: Replace global smp variables with machine smp properties
  hw/arm: Replace global smp variables with machine smp properties
  hw: Replace global smp variables with MachineState for all remaining archs
  vl.c: Replace smp global variables with smp machine properties

 accel/kvm/kvm-all.c  |  4 +--
 backends/hostmem.c   |  6 +++--
 cpus.c   |  6 +++--
 exec.c   |  3 ++-
 gdbstub.c|  4 +++
 hw/alpha/dp264.c |  1 +
 hw/arm/fsl-imx6.c|  6 -
 hw/arm/fsl-imx6ul.c  |  6 -
 hw/arm/fsl-imx7.c|  7 +++--
 hw/arm/highbank.c|  1 +
 hw/arm/mcimx6ul-evk.c|  2 +-
 hw/arm/mcimx7d-sabre.c   |  2 +-
 hw/arm/raspi.c   |  4 +--
 hw/arm/realview.c|  1 +
 hw/arm/sabrelite.c   |  2 +-
 hw/arm/vexpress.c| 16 +++-
 hw/arm/virt.c| 10 ++--
 hw/arm/xlnx-zynqmp.c | 16 +++-
 hw/cpu/core.c|  4 ++-
 hw/hppa/machine.c|  4 ++-
 hw/i386/acpi-build.c | 13 ++
 hw/i386/kvmvapic.c   |  7 +++--
 hw/i386/pc.c | 33 ++--
 hw/i386/xen/xen-hvm.c|  4 +++
 hw/mips/boston.c |  2 +-
 hw/mips/mips_malta.c | 24 +
 hw/openrisc/openrisc_sim.c   |  1 +
 hw/ppc/e500.c|  3 +++
 hw/ppc/mac_newworld.c|  3 ++-
 hw/ppc/mac_oldworld.c|  3 ++-
 hw/ppc/pnv.c |  9 ---
 hw/ppc/prep.c|  4 +--
 hw/ppc/spapr.c   | 37 ++
 hw/ppc/spapr_rtas.c  |  4 ++-
 hw/riscv/sifive_e.c  |  6 +++--
 hw/riscv/sifive_plic.c   |  3 +++
 hw/riscv/sifive_u.c  |  6 +++--
 hw/riscv/spike.c |  2 ++
 hw/riscv/virt.c  |  1 +
 hw/s390x/s390-virtio-ccw.c   |  9 ---
 hw/s390x/sclp.c  |  2 +-
 hw/smbios/smbios.c   | 26 +++
 hw/sparc/sun4m.c |  2 ++
 hw/sparc64/sun4u.c   |  4 +--
 hw/xtensa/sim.c  |  2 +-
 hw/xtensa/xtfpga.c   |  1 +
 include/hw/boards.h  | 19 --
 include/hw/firmware/smbios.h |  5 ++--
 include/hw/i386/pc.h |  2 +-
 migration/postcopy-ram.c |  8 +-
 numa.c   |  1 +
 qmp.c|  2 +-
 target/arm/cpu.c |  8 +-
 target/i386/cpu.c|  4 ++-
 target/openrisc/sys_helper.c |  6 -
 target/s390x/cpu.c   |  3 +++
 target/s390x/excp_helper.c   |  5 
 tcg/tcg.c| 13 +-
 vl.c | 50 +++-
 59 files changed, 301 insertions(+), 141 deletions(-)

-- 
2.21.0




[Qemu-devel] [PATCH v3 01/10] hw/boards: Add struct CpuTopology to MachineState

2019-05-19 Thread Like Xu
The cpu topology structure CpuTopology is added to MachineState, and its
members are initialized from the legacy global smp variables.

From this commit on, code in system emulation mode is supposed to use the
cpu topology values from MachineState instead of the global ones defined
in vl.c. There is no semantic change.
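
As a rough standalone model of this transitional step, the sketch below
mirrors the legacy globals into the new structure. The globals and struct
fields follow the names in this patch; machine_sync_smp_from_globals() is
a hypothetical helper invented for the example (the patch does the
assignments inline in main()), and MachineState is reduced to the one
field that matters here.

```c
#include <assert.h>

/* Legacy globals, modelled on the ones vl.c still defines at this point. */
static int smp_cpus = 1;
static unsigned int max_cpus = 1;
static int smp_cores = 1;
static int smp_threads = 1;

typedef struct CpuTopology {
    unsigned int cpus;      /* present logical processors */
    unsigned int cores;     /* cores in one package */
    unsigned int threads;   /* threads in one core */
    unsigned int max_cpus;  /* maximum logical processors */
} CpuTopology;

/* Reduced stand-in for the QEMU type. */
typedef struct MachineState {
    CpuTopology smp;
} MachineState;

/*
 * Hypothetical helper: copy the parsed globals into the machine so that
 * later code can read ms->smp.* instead of the globals.
 */
static void machine_sync_smp_from_globals(MachineState *ms)
{
    assert(ms != 0);
    ms->smp.cpus = (unsigned int)smp_cpus;
    ms->smp.max_cpus = max_cpus;
    ms->smp.cores = (unsigned int)smp_cores;
    ms->smp.threads = (unsigned int)smp_threads;
}
```

During the series both copies exist side by side; the globals are only
removed by the last patch, once every reader has been converted.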

Suggested-by: Igor Mammedov 
Suggested-by: Eduardo Habkost 
Signed-off-by: Like Xu 
Reviewed-by: Alistair Francis 
---
 include/hw/boards.h | 15 +++
 vl.c|  5 +
 2 files changed, 20 insertions(+)

diff --git a/include/hw/boards.h b/include/hw/boards.h
index 6f7916f88f..bc23b5db1d 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -230,6 +230,20 @@ typedef struct DeviceMemoryState {
 MemoryRegion mr;
 } DeviceMemoryState;
 
+/**
+ * CpuTopology:
+ * @cpus: the number of present logical processors on the machine
+ * @cores: the number of cores in one package
+ * @threads: the number of threads in one core
+ * @max_cpus: the maximum number of logical processors on the machine
+ */
+typedef struct CpuTopology {
+unsigned int cpus;
+unsigned int cores;
+unsigned int threads;
+unsigned int max_cpus;
+} CpuTopology;
+
 /**
  * MachineState:
  */
@@ -272,6 +286,7 @@ struct MachineState {
 const char *cpu_type;
 AccelState *accelerator;
 CPUArchIdList *possible_cpus;
+CpuTopology smp;
 struct NVDIMMState *nvdimms_state;
 };
 
diff --git a/vl.c b/vl.c
index c8ca9ff6ff..40b006577b 100644
--- a/vl.c
+++ b/vl.c
@@ -4133,6 +4133,11 @@ int main(int argc, char **argv, char **envp)
 
 smp_parse(qemu_opts_find(qemu_find_opts("smp-opts"), NULL));
 
+current_machine->smp.cpus = smp_cpus;
+current_machine->smp.max_cpus = max_cpus;
+current_machine->smp.cores = smp_cores;
+current_machine->smp.threads = smp_threads;
+
 /* sanity-check smp_cpus and max_cpus against machine_class */
 if (smp_cpus < machine_class->min_cpus) {
 error_report("Invalid SMP CPUs %d. The min CPUs "
-- 
2.21.0




Re: [Qemu-devel] [PATCH v2 00/10] refactor cpu topo into machine properties

2019-05-09 Thread Like Xu

On 2019/5/6 16:33, Like Xu wrote:

This patch series turns the existing cores/threads/sockets values into
machine properties and gets rid of the global smp_* variables that
currently hold them.

Getting rid of the globals helps disentangle layer violations. It is done
one step at a time: each smp_foo is replaced with the machine property,
using as few qdev_get_machine() calls as possible, and other related
refactoring efforts are deferred.



It looks like the changelog is missing and here it is:

==changelog==

v2:
- pass MachineState via call chain with trivial fixups
- replace smp_cpus directly at places if it's only used once
- s/topo/smp/ and drop smp_ prefix inside CpuTopology structure
- expand commit messages to explain what each patch does
- fix Patchew build failure for xen usage
- use macs rather than ms in migration context for MigrationState
- cleanup unrelated and redundant changes
- split OpenRISC and RISC-V related patches

v1: https://patchwork.kernel.org/cover/10876667/


Like Xu (10):
   hw/boards: add struct CpuTopology to MachineState
   cpu/topology: related call chains refactoring to pass MachineState
   cpu/topology: replace global smp variables by MachineState in general path
   cpu/topology: add uncommon arch support for smp machine properties
   cpu/topology: add hw/ppc support for smp machine properties
   cpu/topology: add hw/riscv support for smp machine properties
   cpu/topology: add hw/s390x support for smp machine properties
   cpu/topology: add hw/i386 support for smp machine properties
   cpu/topology: add hw/arm support for smp machine properties
   cpu/topology: replace smp global variables with smp machine properties

  accel/kvm/kvm-all.c  |  4 ++--
  backends/hostmem.c   |  6 --
  cpus.c   |  6 --
  exec.c   |  3 ++-
  gdbstub.c|  4 
  hw/alpha/dp264.c |  1 +
  hw/arm/fsl-imx6.c|  6 +-
  hw/arm/fsl-imx6ul.c  |  6 +-
  hw/arm/fsl-imx7.c|  7 +--
  hw/arm/highbank.c|  1 +
  hw/arm/mcimx6ul-evk.c|  2 +-
  hw/arm/mcimx7d-sabre.c   |  2 +-
  hw/arm/raspi.c   |  4 ++--
  hw/arm/realview.c|  1 +
  hw/arm/sabrelite.c   |  2 +-
  hw/arm/vexpress.c| 16 --
  hw/arm/virt.c| 10 +++--
  hw/arm/xlnx-zynqmp.c | 16 --
  hw/cpu/core.c|  4 +++-
  hw/hppa/machine.c|  4 +++-
  hw/i386/acpi-build.c | 13 +++-
  hw/i386/kvmvapic.c   |  7 +--
  hw/i386/pc.c | 33 -
  hw/i386/xen/xen-hvm.c|  4 
  hw/mips/boston.c |  2 +-
  hw/mips/mips_malta.c | 23 +++-
  hw/openrisc/openrisc_sim.c   |  1 +
  hw/ppc/e500.c|  3 +++
  hw/ppc/mac_newworld.c|  3 ++-
  hw/ppc/mac_oldworld.c|  3 ++-
  hw/ppc/pnv.c |  9 
  hw/ppc/prep.c|  4 ++--
  hw/ppc/spapr.c   | 37 +++-
  hw/ppc/spapr_rtas.c  |  4 +++-
  hw/riscv/sifive_e.c  |  6 --
  hw/riscv/sifive_plic.c   |  3 +++
  hw/riscv/sifive_u.c  |  6 --
  hw/riscv/spike.c |  2 ++
  hw/riscv/virt.c  |  1 +
  hw/s390x/s390-virtio-ccw.c   |  9 
  hw/s390x/sclp.c  |  2 +-
  hw/smbios/smbios.c   | 26 +--
  hw/sparc/sun4m.c |  2 ++
  hw/sparc64/sun4u.c   |  4 ++--
  hw/xtensa/sim.c  |  2 +-
  hw/xtensa/xtfpga.c   |  1 +
  include/hw/boards.h  | 19 +++--
  include/hw/firmware/smbios.h |  5 +++--
  include/hw/i386/pc.h |  2 +-
  migration/postcopy-ram.c |  8 ++-
  numa.c   |  1 +
  qmp.c|  2 +-
  target/arm/cpu.c |  8 ++-
  target/i386/cpu.c|  4 +++-
  target/openrisc/sys_helper.c |  6 +-
  target/s390x/cpu.c   |  3 +++
  target/s390x/excp_helper.c   |  5 +
  tcg/tcg.c| 13 +++-
  vl.c | 50 
  59 files changed, 301 insertions(+), 140 deletions(-)






[Qemu-devel] [PATCH v2 10/10] cpu/topology: replace smp global variables with smp machine properties

2019-05-06 Thread Like Xu
At the end of this smp refactoring series, the global smp variables are
removed and only the smp machine properties remain in use.
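
The consistency checks that smp_parse() applies to the topology values in
this patch can be modelled roughly as below. This is a simplified sketch
under stated assumptions: smp_check() and the SmpCheck enum are
hypothetical, and the function returns a code where the real code calls
error_report() and exit(1), or warn_report().

```c
#include <assert.h>

/* Hypothetical result codes for this sketch. */
typedef enum SmpCheck {
    SMP_OK,
    SMP_WARN_TOPOLOGY_MISMATCH,  /* sockets*cores*threads != maxcpus */
    SMP_ERR_MAXCPUS_TOO_SMALL,   /* maxcpus < cpus */
    SMP_ERR_TOPOLOGY_TOO_LARGE   /* sockets*cores*threads > maxcpus */
} SmpCheck;

/*
 * Apply the three checks in the same order as the diff: the two hard
 * errors first, then the deprecation warning.
 */
static SmpCheck smp_check(unsigned int cpus, unsigned int sockets,
                          unsigned int cores, unsigned int threads,
                          unsigned int max_cpus)
{
    unsigned int total = sockets * cores * threads;

    if (max_cpus < cpus) {
        return SMP_ERR_MAXCPUS_TOO_SMALL;
    }
    if (total > max_cpus) {
        return SMP_ERR_TOPOLOGY_TOO_LARGE;
    }
    if (total != max_cpus) {
        return SMP_WARN_TOPOLOGY_MISMATCH;
    }
    return SMP_OK;
}
```

After the patch, the same comparisons read current_machine->smp.max_cpus
rather than the removed max_cpus global; the conditions themselves are
unchanged.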

Signed-off-by: Like Xu 
---
 vl.c | 53 ++---
 1 file changed, 26 insertions(+), 27 deletions(-)

diff --git a/vl.c b/vl.c
index 34f05b2..a3e426c 100644
--- a/vl.c
+++ b/vl.c
@@ -162,10 +162,6 @@ static Chardev **serial_hds;
 Chardev *parallel_hds[MAX_PARALLEL_PORTS];
 int win2k_install_hack = 0;
 int singlestep = 0;
-int smp_cpus;
-unsigned int max_cpus;
-int smp_cores = 1;
-int smp_threads = 1;
 int acpi_enabled = 1;
 int no_hpet = 0;
 int fd_bootchk = 1;
@@ -1282,8 +1278,9 @@ static void smp_parse(QemuOpts *opts)
 sockets = sockets > 0 ? sockets : 1;
 cpus = cores * threads * sockets;
 } else {
-max_cpus = qemu_opt_get_number(opts, "maxcpus", cpus);
-sockets = max_cpus / (cores * threads);
+current_machine->smp.max_cpus =
+qemu_opt_get_number(opts, "maxcpus", cpus);
+sockets = current_machine->smp.max_cpus / (cores * threads);
 }
 } else if (cores == 0) {
 threads = threads > 0 ? threads : 1;
@@ -1300,34 +1297,37 @@ static void smp_parse(QemuOpts *opts)
 exit(1);
 }
 
-max_cpus = qemu_opt_get_number(opts, "maxcpus", cpus);
+current_machine->smp.max_cpus =
+qemu_opt_get_number(opts, "maxcpus", cpus);
 
-if (max_cpus < cpus) {
+if (current_machine->smp.max_cpus < cpus) {
 error_report("maxcpus must be equal to or greater than smp");
 exit(1);
 }
 
-if (sockets * cores * threads > max_cpus) {
+if (sockets * cores * threads > current_machine->smp.max_cpus) {
 error_report("cpu topology: "
  "sockets (%u) * cores (%u) * threads (%u) > "
  "maxcpus (%u)",
- sockets, cores, threads, max_cpus);
+ sockets, cores, threads,
+ current_machine->smp.max_cpus);
 exit(1);
 }
 
-if (sockets * cores * threads != max_cpus) {
+if (sockets * cores * threads != current_machine->smp.max_cpus) {
 warn_report("Invalid CPU topology deprecated: "
 "sockets (%u) * cores (%u) * threads (%u) "
 "!= maxcpus (%u)",
-sockets, cores, threads, max_cpus);
+sockets, cores, threads,
+current_machine->smp.max_cpus);
 }
 
-smp_cpus = cpus;
-smp_cores = cores;
-smp_threads = threads;
+current_machine->smp.cpus = cpus;
+current_machine->smp.cores = cores;
+current_machine->smp.threads = threads;
 }
 
-if (smp_cpus > 1) {
+if (current_machine->smp.cpus > 1) {
 Error *blocker = NULL;
 error_setg(&blocker, QERR_REPLAY_NOT_SUPPORTED, "smp");
 replay_add_blocker(blocker);
@@ -4094,26 +4094,25 @@ int main(int argc, char **argv, char **envp)
 machine_class->default_cpus = machine_class->default_cpus ?: 1;
 
 /* default to machine_class->default_cpus */
-smp_cpus = machine_class->default_cpus;
-max_cpus = machine_class->default_cpus;
+current_machine->smp.cpus = machine_class->default_cpus;
+current_machine->smp.max_cpus = machine_class->default_cpus;
+current_machine->smp.cores = 1;
+current_machine->smp.threads = 1;
 
 smp_parse(qemu_opts_find(qemu_find_opts("smp-opts"), NULL));
 
-current_machine->smp.cpus = smp_cpus;
-current_machine->smp.max_cpus = max_cpus;
-current_machine->smp.cores = smp_cores;
-current_machine->smp.threads = smp_threads;
-
 /* sanity-check smp_cpus and max_cpus against machine_class */
-if (smp_cpus < machine_class->min_cpus) {
+if (current_machine->smp.cpus < machine_class->min_cpus) {
 error_report("Invalid SMP CPUs %d. The min CPUs "
- "supported by machine '%s' is %d", smp_cpus,
+ "supported by machine '%s' is %d",
+ current_machine->smp.cpus,
  machine_class->name, machine_class->min_cpus);
 exit(1);
 }
-if (max_cpus > machine_class->max_cpus) {
+if (current_machine->smp.max_cpus > machine_class->max_cpus) {
 error_report("Invalid SMP CPUs %d. The max CPUs "
- "supported by machine '%s' is %d", max_cpus,
+ "supported by machine '%s' is %d",
+ current_machine->smp.max_cpus,
  machine_class->name, machine_class->max_cpus);
 exit(1);
 }
-- 
1.8.3.1
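
The topology checks in the smp_parse() hunk above can be read as a small pure function. The following is only a sketch under assumptions: the `CpuTopology` layout, the `TopoCheck` enum and `check_topology()` are hypothetical names made up for illustration (only the field names cpus/cores/threads/max_cpus come from the patch), and the real code reports errors with error_report()/warn_report() instead of returning a status.

```c
/* Hypothetical stand-in for the smp fields the patch reads via
 * current_machine->smp; the real struct lives in include/hw/boards.h. */
typedef struct CpuTopology {
    unsigned int cpus;
    unsigned int cores;
    unsigned int threads;
    unsigned int max_cpus;
} CpuTopology;

/* Hypothetical result codes, one per branch of the checks in smp_parse(). */
typedef enum {
    TOPO_OK,
    TOPO_WARN_MISMATCH,        /* sockets * cores * threads != maxcpus */
    TOPO_ERR_MAXCPUS_LT_CPUS,  /* maxcpus must be >= cpus */
    TOPO_ERR_EXCEEDS_MAXCPUS   /* sockets * cores * threads > maxcpus */
} TopoCheck;

/* Apply the checks in the same order as the patched code. */
static TopoCheck check_topology(const CpuTopology *smp, unsigned int sockets)
{
    unsigned int total = sockets * smp->cores * smp->threads;

    if (smp->max_cpus < smp->cpus) {
        return TOPO_ERR_MAXCPUS_LT_CPUS;
    }
    if (total > smp->max_cpus) {
        return TOPO_ERR_EXCEEDS_MAXCPUS;
    }
    if (total != smp->max_cpus) {
        return TOPO_WARN_MISMATCH;
    }
    return TOPO_OK;
}
```

Only the first two outcomes are fatal in the patch; the mismatch case merely emits a deprecation warning and parsing continues.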




[Qemu-devel] [PATCH v2 08/10] cpu/topology: add hw/i386 support for smp machine properties

2019-05-06 Thread Like Xu
Following the replacement rules, the global smp variables in i386
are replaced with smp machine properties. To avoid calling qdev_get_machine()
as much as possible, the related functions for ACPI data generation, as well
as init_cpus(), are refactored to take a MachineState parameter. No semantic
changes.

Signed-off-by: Like Xu 
---
 hw/arm/vexpress.c |  4 ++--
 hw/i386/acpi-build.c  | 13 -
 hw/i386/kvmvapic.c|  7 +--
 hw/i386/pc.c  | 24 +++-
 hw/i386/xen/xen-hvm.c |  4 
 target/i386/cpu.c |  4 +++-
 6 files changed, 37 insertions(+), 19 deletions(-)

diff --git a/hw/arm/vexpress.c b/hw/arm/vexpress.c
index d8634f3..19273a2 100644
--- a/hw/arm/vexpress.c
+++ b/hw/arm/vexpress.c
@@ -377,8 +377,8 @@ static void a15_daughterboard_init(const VexpressMachineState *vms,
 memory_region_add_subregion(sysmem, 0x8000, ram);
 
 /* 0x2c00 A15MPCore private memory region (GIC) */
-init_cpus(cpu_type, TYPE_A15MPCORE_PRIV, 0x2c00, pic, vms->secure,
-  vms->virt);
+init_cpus(machine, cpu_type, TYPE_A15MPCORE_PRIV,
+  0x2c00, pic, vms->secure, vms->virt);
 
 /* A15 daughterboard peripherals: */
 
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 416da31..29adc17 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -44,6 +44,7 @@
 #include "sysemu/tpm.h"
 #include "hw/acpi/tpm.h"
 #include "hw/acpi/vmgenid.h"
+#include "hw/boards.h"
 #include "sysemu/tpm_backend.h"
 #include "hw/timer/mc146818rtc_regs.h"
 #include "hw/mem/memory-device.h"
@@ -125,7 +126,8 @@ typedef struct FwCfgTPMConfig {
 uint8_t tpmppi_version;
 } QEMU_PACKED FwCfgTPMConfig;
 
-static void init_common_fadt_data(Object *o, AcpiFadtData *data)
+static void init_common_fadt_data(MachineState *ms, Object *o,
+  AcpiFadtData *data)
 {
 uint32_t io = object_property_get_uint(o, ACPI_PM_PROP_PM_IO_BASE, NULL);
 AmlAddressSpace as = AML_AS_SYSTEM_IO;
@@ -141,7 +143,8 @@ static void init_common_fadt_data(Object *o, AcpiFadtData *data)
  * CPUs for more than 8 CPUs, "Clustered Logical" mode has to be
  * used
  */
-((max_cpus > 8) ? (1 << ACPI_FADT_F_FORCE_APIC_CLUSTER_MODEL) : 0),
+((ms->smp.max_cpus > 8) ?
+(1 << ACPI_FADT_F_FORCE_APIC_CLUSTER_MODEL) : 0),
 .int_model = 1 /* Multiple APIC */,
 .rtc_century = RTC_CENTURY,
 .plvl2_lat = 0xfff /* C2 state not supported */,
@@ -164,7 +167,7 @@ static void init_common_fadt_data(Object *o, AcpiFadtData *data)
 *data = fadt;
 }
 
-static void acpi_get_pm_info(AcpiPmInfo *pm)
+static void acpi_get_pm_info(MachineState *machine, AcpiPmInfo *pm)
 {
 Object *piix = piix4_pm_find();
 Object *lpc = ich9_lpc_find();
@@ -174,7 +177,7 @@ static void acpi_get_pm_info(AcpiPmInfo *pm)
 pm->pcihp_io_base = 0;
 pm->pcihp_io_len = 0;
 
-init_common_fadt_data(obj, &pm->fadt);
+init_common_fadt_data(machine, obj, &pm->fadt);
 if (piix) {
 /* w2k requires FADT(rev1) or it won't boot, keep PC compatible */
 pm->fadt.rev = 1;
@@ -2617,7 +2620,7 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
 AcpiSlicOem slic_oem = { .id = NULL, .table_id = NULL };
 Object *vmgenid_dev;
 
-acpi_get_pm_info(&pm);
+acpi_get_pm_info(machine, &pm);
 acpi_get_misc_info(&misc);
 acpi_get_pci_holes(&pci_hole, &pci_hole64);
 acpi_get_slic_oem(&slic_oem);
diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
index 70f6f26..3fce704 100644
--- a/hw/i386/kvmvapic.c
+++ b/hw/i386/kvmvapic.c
@@ -17,6 +17,7 @@
 #include "sysemu/kvm.h"
 #include "hw/i386/apic_internal.h"
 #include "hw/sysbus.h"
+#include "hw/boards.h"
 #include "tcg/tcg.h"
 
 #define VAPIC_IO_PORT   0x7e
@@ -441,11 +442,12 @@ static void do_patch_instruction(CPUState *cs, run_on_cpu_data data)
 
 static void patch_instruction(VAPICROMState *s, X86CPU *cpu, target_ulong ip)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 CPUState *cs = CPU(cpu);
 VAPICHandlers *handlers;
 PatchInfo *info;
 
-if (smp_cpus == 1) {
+if (ms->smp.cpus == 1) {
 handlers = &s->rom_state.up;
 } else {
 handlers = &s->rom_state.mp;
@@ -746,6 +748,7 @@ static void do_vapic_enable(CPUState *cs, run_on_cpu_data data)
 static void kvmvapic_vm_state_change(void *opaque, int running,
  RunState state)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 VAPICROMState *s = opaque;
 uint8_t *zero;
 
@@ -754,7 +757,7 @@ static void kvmvapic_vm_state_change(void *opaque, int running,
 }
 
 if (s->state == VAPIC_ACTIVE) {
-if (smp_cpus == 1) {
+if (ms->smp.cpus == 1) {
 run_on_c

[Qemu-devel] [PATCH v2 00/10] refactor cpu topo into machine properties

2019-05-06 Thread Like Xu
This patch series turns the existing cores/threads/sockets values into machine
properties and gets rid of the global smp_* variables currently used for them.

The purpose of getting rid of the globals is to disentangle layer violations.
We do it one step at a time: each smp_foo is replaced while calling
qdev_get_machine() as few times as possible, and other related refactoring
efforts are deferred.
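
As a rough illustration of what "machine properties" means here, the sketch below embeds the topology in the machine object and fills in the defaults the way the vl.c hunk of the last patch does (cpus and max_cpus from the machine class default, cores and threads set to 1). The struct layouts and the helper name `machine_set_default_topology()` are assumptions for illustration only, not the definitions from include/hw/boards.h.

```c
/* Hypothetical, reduced versions of the structures the series introduces. */
typedef struct CpuTopology {
    unsigned int cpus;
    unsigned int cores;
    unsigned int threads;
    unsigned int max_cpus;
} CpuTopology;

typedef struct MachineState {
    CpuTopology smp;   /* replaces the smp_cpus/smp_cores/... globals */
} MachineState;

/* Hypothetical helper mirroring the defaults set before smp_parse() runs. */
static void machine_set_default_topology(MachineState *ms,
                                         unsigned int default_cpus)
{
    if (default_cpus == 0) {
        default_cpus = 1;   /* same effect as `default_cpus ?: 1` in vl.c */
    }
    ms->smp.cpus = default_cpus;
    ms->smp.max_cpus = default_cpus;
    ms->smp.cores = 1;
    ms->smp.threads = 1;
}
```

Once the values live in the machine object, any code that already holds a MachineState pointer can read them without touching a global.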

Like Xu (10):
  hw/boards: add struct CpuTopology to MachineState
  cpu/topology: related call chains refactoring to pass MachineState
  cpu/topology: replace global smp variables by MachineState in general path
  cpu/topology: add uncommon arch support for smp machine properties
  cpu/topology: add hw/ppc support for smp machine properties
  cpu/topology: add hw/riscv support for smp machine properties
  cpu/topology: add hw/s390x support for smp machine properties
  cpu/topology: add hw/i386 support for smp machine properties
  cpu/topology: add hw/arm support for smp machine properties
  cpu/topology: replace smp global variables with smp machine properties

 accel/kvm/kvm-all.c  |  4 ++--
 backends/hostmem.c   |  6 --
 cpus.c   |  6 --
 exec.c   |  3 ++-
 gdbstub.c|  4 
 hw/alpha/dp264.c |  1 +
 hw/arm/fsl-imx6.c|  6 +-
 hw/arm/fsl-imx6ul.c  |  6 +-
 hw/arm/fsl-imx7.c|  7 +--
 hw/arm/highbank.c|  1 +
 hw/arm/mcimx6ul-evk.c|  2 +-
 hw/arm/mcimx7d-sabre.c   |  2 +-
 hw/arm/raspi.c   |  4 ++--
 hw/arm/realview.c|  1 +
 hw/arm/sabrelite.c   |  2 +-
 hw/arm/vexpress.c| 16 --
 hw/arm/virt.c| 10 +++--
 hw/arm/xlnx-zynqmp.c | 16 --
 hw/cpu/core.c|  4 +++-
 hw/hppa/machine.c|  4 +++-
 hw/i386/acpi-build.c | 13 +++-
 hw/i386/kvmvapic.c   |  7 +--
 hw/i386/pc.c | 33 -
 hw/i386/xen/xen-hvm.c|  4 
 hw/mips/boston.c |  2 +-
 hw/mips/mips_malta.c | 23 +++-
 hw/openrisc/openrisc_sim.c   |  1 +
 hw/ppc/e500.c|  3 +++
 hw/ppc/mac_newworld.c|  3 ++-
 hw/ppc/mac_oldworld.c|  3 ++-
 hw/ppc/pnv.c |  9 
 hw/ppc/prep.c|  4 ++--
 hw/ppc/spapr.c   | 37 +++-
 hw/ppc/spapr_rtas.c  |  4 +++-
 hw/riscv/sifive_e.c  |  6 --
 hw/riscv/sifive_plic.c   |  3 +++
 hw/riscv/sifive_u.c  |  6 --
 hw/riscv/spike.c |  2 ++
 hw/riscv/virt.c  |  1 +
 hw/s390x/s390-virtio-ccw.c   |  9 
 hw/s390x/sclp.c  |  2 +-
 hw/smbios/smbios.c   | 26 +--
 hw/sparc/sun4m.c |  2 ++
 hw/sparc64/sun4u.c   |  4 ++--
 hw/xtensa/sim.c  |  2 +-
 hw/xtensa/xtfpga.c   |  1 +
 include/hw/boards.h  | 19 +++--
 include/hw/firmware/smbios.h |  5 +++--
 include/hw/i386/pc.h |  2 +-
 migration/postcopy-ram.c |  8 ++-
 numa.c   |  1 +
 qmp.c|  2 +-
 target/arm/cpu.c |  8 ++-
 target/i386/cpu.c|  4 +++-
 target/openrisc/sys_helper.c |  6 +-
 target/s390x/cpu.c   |  3 +++
 target/s390x/excp_helper.c   |  5 +
 tcg/tcg.c| 13 +++-
 vl.c | 50 
 59 files changed, 301 insertions(+), 140 deletions(-)

-- 
1.8.3.1




[Qemu-devel] [PATCH v2 05/10] cpu/topology: add hw/ppc support for smp machine properties

2019-05-06 Thread Like Xu
Following the replacement rules, the global smp variables in ppc
are replaced with smp machine properties. No semantic changes.

Signed-off-by: Like Xu 
---
 hw/ppc/e500.c |  3 +++
 hw/ppc/mac_newworld.c |  3 ++-
 hw/ppc/mac_oldworld.c |  3 ++-
 hw/ppc/pnv.c  |  6 --
 hw/ppc/prep.c |  4 ++--
 hw/ppc/spapr.c| 34 ++
 hw/ppc/spapr_rtas.c   |  4 +++-
 7 files changed, 42 insertions(+), 15 deletions(-)

diff --git a/hw/ppc/e500.c b/hw/ppc/e500.c
index beb2efd..5e42e5a 100644
--- a/hw/ppc/e500.c
+++ b/hw/ppc/e500.c
@@ -307,6 +307,7 @@ static int ppce500_load_device_tree(PPCE500MachineState *pms,
 bool dry_run)
 {
 MachineState *machine = MACHINE(pms);
+unsigned int smp_cpus = machine->smp.cpus;
 const PPCE500MachineClass *pmc = PPCE500_MACHINE_GET_CLASS(pms);
 CPUPPCState *env = first_cpu->env_ptr;
 int ret = -1;
@@ -734,6 +735,7 @@ static DeviceState *ppce500_init_mpic_qemu(PPCE500MachineState *pms,
 SysBusDevice *s;
 int i, j, k;
 MachineState *machine = MACHINE(pms);
+unsigned int smp_cpus = machine->smp.cpus;
 const PPCE500MachineClass *pmc = PPCE500_MACHINE_GET_CLASS(pms);
 
 dev = qdev_create(NULL, TYPE_OPENPIC);
@@ -846,6 +848,7 @@ void ppce500_init(MachineState *machine)
 struct boot_info *boot_info;
 int dt_size;
 int i;
+unsigned int smp_cpus = machine->smp.cpus;
 /* irq num for pin INTA, INTB, INTC and INTD is 1, 2, 3 and
  * 4 respectively */
 unsigned int pci_irq_nrs[PCI_NUM_PINS] = {1, 2, 3, 4};
diff --git a/hw/ppc/mac_newworld.c b/hw/ppc/mac_newworld.c
index 02d8559..257b26e 100644
--- a/hw/ppc/mac_newworld.c
+++ b/hw/ppc/mac_newworld.c
@@ -135,6 +135,7 @@ static void ppc_core99_init(MachineState *machine)
 DeviceState *dev, *pic_dev;
 hwaddr nvram_addr = 0xFFF04000;
 uint64_t tbfreq;
+unsigned int smp_cpus = machine->smp.cpus;
 
 linux_boot = (kernel_filename != NULL);
 
@@ -464,7 +465,7 @@ static void ppc_core99_init(MachineState *machine)
 sysbus_mmio_map(s, 1, CFG_ADDR + 2);
 
 fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, (uint16_t)smp_cpus);
-fw_cfg_add_i16(fw_cfg, FW_CFG_MAX_CPUS, (uint16_t)max_cpus);
+fw_cfg_add_i16(fw_cfg, FW_CFG_MAX_CPUS, (uint16_t)machine->smp.max_cpus);
 fw_cfg_add_i64(fw_cfg, FW_CFG_RAM_SIZE, (uint64_t)ram_size);
 fw_cfg_add_i16(fw_cfg, FW_CFG_MACHINE_ID, machine_arch);
 fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ADDR, kernel_base);
diff --git a/hw/ppc/mac_oldworld.c b/hw/ppc/mac_oldworld.c
index 460cbc7..1968f05 100644
--- a/hw/ppc/mac_oldworld.c
+++ b/hw/ppc/mac_oldworld.c
@@ -99,6 +99,7 @@ static void ppc_heathrow_init(MachineState *machine)
 DeviceState *dev, *pic_dev;
 BusState *adb_bus;
 int bios_size;
+unsigned int smp_cpus = machine->smp.cpus;
 uint16_t ppc_boot_device;
 DriveInfo *hd[MAX_IDE_BUS * MAX_IDE_DEVS];
 void *fw_cfg;
@@ -322,7 +323,7 @@ static void ppc_heathrow_init(MachineState *machine)
 sysbus_mmio_map(s, 1, CFG_ADDR + 2);
 
 fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, (uint16_t)smp_cpus);
-fw_cfg_add_i16(fw_cfg, FW_CFG_MAX_CPUS, (uint16_t)max_cpus);
+fw_cfg_add_i16(fw_cfg, FW_CFG_MAX_CPUS, (uint16_t)machine->smp.max_cpus);
 fw_cfg_add_i64(fw_cfg, FW_CFG_RAM_SIZE, (uint64_t)ram_size);
 fw_cfg_add_i16(fw_cfg, FW_CFG_MACHINE_ID, ARCH_HEATHROW);
 fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ADDR, kernel_base);
diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 1e8c505..3bb1533 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -678,7 +678,8 @@ static void pnv_init(MachineState *machine)
 object_property_add_child(OBJECT(pnv), chip_name, chip, &error_fatal);
 object_property_set_int(chip, PNV_CHIP_HWID(i), "chip-id",
 &error_fatal);
-object_property_set_int(chip, smp_cores, "nr-cores", &error_fatal);
+object_property_set_int(chip, machine->smp.cores,
+"nr-cores", &error_fatal);
 object_property_set_bool(chip, true, "realized", _fatal);
 }
 g_free(chip_typename);
@@ -1134,6 +1135,7 @@ static void pnv_chip_instance_init(Object *obj)
 
 static void pnv_chip_core_realize(PnvChip *chip, Error **errp)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 Error *error = NULL;
 PnvChipClass *pcc = PNV_CHIP_GET_CLASS(chip);
 const char *typename = pnv_chip_core_typename(chip);
@@ -1168,7 +1170,7 @@ static void pnv_chip_core_realize(PnvChip *chip, Error **errp)
 snprintf(core_name, sizeof(core_name), "core[%d]", core_hwid);
 object_property_add_child(OBJECT(chip), core_name, OBJECT(pnv_core),
   &error_fatal);
-object_property_set_int(OBJECT(pnv_core), smp_threads, "nr-threads",
+object_property_set_int(OBJECT(pnv_core), ms->smp.threads, "nr-threads",
  

[Qemu-devel] [PATCH v2 03/10] cpu/topology: replace global smp variables by MachineState in general path

2019-05-06 Thread Like Xu
Basically, a context can obtain the MachineState reference either through
its call chain or, less preferably, via qdev_get_machine() in
!CONFIG_USER_ONLY mode.

Where the global is used several times, a local variable of the same name
is introduced among the declarations to keep the change small; where it is
used only once in the context, it is replaced on the spot. No semantic changes.
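
The two replacement styles described above can be sketched as follows. This is a minimal illustration, not code from the series: the reduced `MachineState`/`CpuTopology` layouts and both function names are hypothetical.

```c
/* Hypothetical, reduced versions of the structures used by the series. */
typedef struct CpuTopology {
    unsigned int cpus;
    unsigned int cores;
    unsigned int threads;
    unsigned int max_cpus;
} CpuTopology;

typedef struct MachineState {
    CpuTopology smp;
} MachineState;

/* Style 1: the old global is used several times, so a local variable of
 * the same name is declared once and the rest of the body stays as is. */
static unsigned int sum_cpu_indexes(const MachineState *ms)
{
    unsigned int smp_cpus = ms->smp.cpus;   /* shadows the former global */
    unsigned int i, sum = 0;

    for (i = 0; i < smp_cpus; i++) {
        sum += i;
    }
    return sum;
}

/* Style 2: the value is used only once, so it is replaced on the spot. */
static unsigned int prealloc_thread_count(const MachineState *ms)
{
    return ms->smp.cpus;
}
```

Style 1 keeps the diff to a single added line per function, which is why it dominates in the board-init hunks; style 2 avoids an extra local where there is nothing to gain from one.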

Signed-off-by: Like Xu 
---
 accel/kvm/kvm-all.c  |  4 ++--
 backends/hostmem.c   |  6 --
 cpus.c   |  6 --
 exec.c   |  3 ++-
 gdbstub.c|  4 
 hw/cpu/core.c|  4 +++-
 migration/postcopy-ram.c |  8 +++-
 numa.c   |  1 +
 target/openrisc/sys_helper.c |  6 +-
 tcg/tcg.c| 13 -
 10 files changed, 44 insertions(+), 11 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 524c4dd..f8ef39d 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -1533,8 +1533,8 @@ static int kvm_init(MachineState *ms)
 const char *name;
 int num;
 } num_cpus[] = {
-{ "SMP",  smp_cpus },
-{ "hotpluggable", max_cpus },
+{ "SMP",  ms->smp.cpus },
+{ "hotpluggable", ms->smp.max_cpus },
 { NULL, }
 }, *nc = num_cpus;
 int soft_vcpus_limit, hard_vcpus_limit;
diff --git a/backends/hostmem.c b/backends/hostmem.c
index 04baf47..463102a 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -222,6 +222,7 @@ static void host_memory_backend_set_prealloc(Object *obj, bool value,
 {
 Error *local_err = NULL;
 HostMemoryBackend *backend = MEMORY_BACKEND(obj);
+MachineState *ms = MACHINE(qdev_get_machine());
 
 if (backend->force_prealloc) {
 if (value) {
@@ -241,7 +242,7 @@ static void host_memory_backend_set_prealloc(Object *obj, bool value,
 void *ptr = memory_region_get_ram_ptr(&backend->mr);
 uint64_t sz = memory_region_size(&backend->mr);
 
-os_mem_prealloc(fd, ptr, sz, smp_cpus, &local_err);
+os_mem_prealloc(fd, ptr, sz, ms->smp.cpus, &local_err);
 if (local_err) {
 error_propagate(errp, local_err);
 return;
@@ -311,6 +312,7 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
 {
 HostMemoryBackend *backend = MEMORY_BACKEND(uc);
 HostMemoryBackendClass *bc = MEMORY_BACKEND_GET_CLASS(uc);
+MachineState *ms = MACHINE(qdev_get_machine());
 Error *local_err = NULL;
 void *ptr;
 uint64_t sz;
@@ -375,7 +377,7 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
  */
 if (backend->prealloc) {
 os_mem_prealloc(memory_region_get_fd(&backend->mr), ptr, sz,
-smp_cpus, &local_err);
+ms->smp.cpus, &local_err);
 if (local_err) {
 goto out;
 }
diff --git a/cpus.c b/cpus.c
index e58e7ab..b49db36 100644
--- a/cpus.c
+++ b/cpus.c
@@ -2068,8 +2068,10 @@ static void qemu_dummy_start_vcpu(CPUState *cpu)
 
 void qemu_init_vcpu(CPUState *cpu)
 {
-cpu->nr_cores = smp_cores;
-cpu->nr_threads = smp_threads;
+MachineState *ms = MACHINE(qdev_get_machine());
+
+cpu->nr_cores = ms->smp.cores;
+cpu->nr_threads =  ms->smp.threads;
 cpu->stopped = true;
 
 if (!cpu->as) {
diff --git a/exec.c b/exec.c
index 4e73477..2744df6 100644
--- a/exec.c
+++ b/exec.c
@@ -1871,6 +1871,7 @@ static void *file_ram_alloc(RAMBlock *block,
 bool truncate,
 Error **errp)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 void *area;
 
 block->page_size = qemu_fd_getpagesize(fd);
@@ -1927,7 +1928,7 @@ static void *file_ram_alloc(RAMBlock *block,
 }
 
 if (mem_prealloc) {
-os_mem_prealloc(fd, area, memory, smp_cpus, errp);
+os_mem_prealloc(fd, area, memory, ms->smp.cpus, errp);
 if (errp && *errp) {
 qemu_ram_munmap(fd, area, memory);
 return NULL;
diff --git a/gdbstub.c b/gdbstub.c
index d54abd1..dba37df 100644
--- a/gdbstub.c
+++ b/gdbstub.c
@@ -30,6 +30,7 @@
 #include "sysemu/sysemu.h"
 #include "exec/gdbstub.h"
 #include "hw/cpu/cluster.h"
+#include "hw/boards.h"
 #endif
 
 #define MAX_PACKET_LENGTH 4096
@@ -1159,6 +1160,9 @@ static int gdb_handle_vcont(GDBState *s, const char *p)
 CPU_FOREACH(cpu) {
 max_cpus = max_cpus <= cpu->cpu_index ? cpu->cpu_index + 1 : max_cpus;
 }
+#else
+MachineState *ms = MACHINE(qdev_get_machine());
+unsigned int max_cpus = ms->smp.max_cpus;
 #endif
 /* uninitialised CPUs stay 0 */
 newstates = g_new0(char, max_cpus);
diff --git a/hw/cpu/core.c b/hw/cpu/core.c
index 7e42e2c..be2c7e1 100644
--- a/hw/cpu/core.c
+++ b/hw/cpu/core.c
@@ -11,6 +11,7 @@
 #include "qapi/vi

[Qemu-devel] [PATCH v2 09/10] cpu/topology: add hw/arm support for smp machine properties

2019-05-06 Thread Like Xu
Following the replacement rules, the global smp variables in arm
are replaced with smp machine properties. The init_cpus() and
xlnx_zynqmp_create_rpu() are refactored to pass MachineState.
No semantic changes.

Signed-off-by: Like Xu 
---
 hw/arm/fsl-imx6.c  |  6 +-
 hw/arm/fsl-imx6ul.c|  6 +-
 hw/arm/fsl-imx7.c  |  7 +--
 hw/arm/highbank.c  |  1 +
 hw/arm/mcimx6ul-evk.c  |  2 +-
 hw/arm/mcimx7d-sabre.c |  2 +-
 hw/arm/raspi.c |  4 ++--
 hw/arm/realview.c  |  1 +
 hw/arm/sabrelite.c |  2 +-
 hw/arm/vexpress.c  | 12 
 hw/arm/virt.c  |  8 +++-
 hw/arm/xlnx-zynqmp.c   | 16 ++--
 target/arm/cpu.c   |  8 +++-
 13 files changed, 54 insertions(+), 21 deletions(-)

diff --git a/hw/arm/fsl-imx6.c b/hw/arm/fsl-imx6.c
index 7b7b97f..ed772d5 100644
--- a/hw/arm/fsl-imx6.c
+++ b/hw/arm/fsl-imx6.c
@@ -23,6 +23,7 @@
 #include "qapi/error.h"
 #include "qemu-common.h"
 #include "hw/arm/fsl-imx6.h"
+#include "hw/boards.h"
 #include "sysemu/sysemu.h"
 #include "chardev/char.h"
 #include "qemu/error-report.h"
@@ -33,11 +34,12 @@
 
 static void fsl_imx6_init(Object *obj)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 FslIMX6State *s = FSL_IMX6(obj);
 char name[NAME_SIZE];
 int i;
 
-for (i = 0; i < MIN(smp_cpus, FSL_IMX6_NUM_CPUS); i++) {
+for (i = 0; i < MIN(ms->smp.cpus, FSL_IMX6_NUM_CPUS); i++) {
 snprintf(name, NAME_SIZE, "cpu%d", i);
 object_initialize_child(obj, name, &s->cpu[i], sizeof(s->cpu[i]),
 "cortex-a9-" TYPE_ARM_CPU, &error_abort, NULL);
@@ -93,9 +95,11 @@ static void fsl_imx6_init(Object *obj)
 
 static void fsl_imx6_realize(DeviceState *dev, Error **errp)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 FslIMX6State *s = FSL_IMX6(dev);
 uint16_t i;
 Error *err = NULL;
+unsigned int smp_cpus = ms->smp.cpus;
 
 if (smp_cpus > FSL_IMX6_NUM_CPUS) {
 error_setg(errp, "%s: Only %d CPUs are supported (%d requested)",
diff --git a/hw/arm/fsl-imx6ul.c b/hw/arm/fsl-imx6ul.c
index 4b56bfa..74b8ecb 100644
--- a/hw/arm/fsl-imx6ul.c
+++ b/hw/arm/fsl-imx6ul.c
@@ -21,6 +21,7 @@
 #include "qemu-common.h"
 #include "hw/arm/fsl-imx6ul.h"
 #include "hw/misc/unimp.h"
+#include "hw/boards.h"
 #include "sysemu/sysemu.h"
 #include "qemu/error-report.h"
 
@@ -28,11 +29,12 @@
 
 static void fsl_imx6ul_init(Object *obj)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 FslIMX6ULState *s = FSL_IMX6UL(obj);
 char name[NAME_SIZE];
 int i;
 
-for (i = 0; i < MIN(smp_cpus, FSL_IMX6UL_NUM_CPUS); i++) {
+for (i = 0; i < MIN(ms->smp.cpus, FSL_IMX6UL_NUM_CPUS); i++) {
 snprintf(name, NAME_SIZE, "cpu%d", i);
 object_initialize_child(obj, name, &s->cpu[i], sizeof(s->cpu[i]),
 "cortex-a7-" TYPE_ARM_CPU, &error_abort, NULL);
@@ -156,10 +158,12 @@ static void fsl_imx6ul_init(Object *obj)
 
 static void fsl_imx6ul_realize(DeviceState *dev, Error **errp)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 FslIMX6ULState *s = FSL_IMX6UL(dev);
 int i;
 qemu_irq irq;
 char name[NAME_SIZE];
+unsigned int smp_cpus = ms->smp.cpus;
 
 if (smp_cpus > FSL_IMX6UL_NUM_CPUS) {
 error_setg(errp, "%s: Only %d CPUs are supported (%d requested)",
diff --git a/hw/arm/fsl-imx7.c b/hw/arm/fsl-imx7.c
index 7663ad6..71cc414 100644
--- a/hw/arm/fsl-imx7.c
+++ b/hw/arm/fsl-imx7.c
@@ -23,6 +23,7 @@
 #include "qemu-common.h"
 #include "hw/arm/fsl-imx7.h"
 #include "hw/misc/unimp.h"
+#include "hw/boards.h"
 #include "sysemu/sysemu.h"
 #include "qemu/error-report.h"
 
@@ -30,12 +31,12 @@
 
 static void fsl_imx7_init(Object *obj)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 FslIMX7State *s = FSL_IMX7(obj);
 char name[NAME_SIZE];
 int i;
 
-
-for (i = 0; i < MIN(smp_cpus, FSL_IMX7_NUM_CPUS); i++) {
+for (i = 0; i < MIN(ms->smp.cpus, FSL_IMX7_NUM_CPUS); i++) {
 snprintf(name, NAME_SIZE, "cpu%d", i);
 object_initialize_child(obj, name, &s->cpu[i], sizeof(s->cpu[i]),
 ARM_CPU_TYPE_NAME("cortex-a7"), &error_abort,
@@ -155,11 +156,13 @@ static void fsl_imx7_init(Object *obj)
 
 static void fsl_imx7_realize(DeviceState *dev, Error **errp)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 FslIMX7State *s = FSL_IMX7(dev);
 Object *o;
 int i;
 qemu_irq irq;
 char name[NAME_SIZE];
+unsigned int smp_cpus = ms->smp.cpus;
 
 if (smp_cpus > FSL_IMX7_NUM_CPUS) {
 error_setg(errp, "%s: Only %d CPUs are supported (%d requested)",
diff --git a/hw/a

[Qemu-devel] [PATCH v2 02/10] cpu/topology: related call chains refactoring to pass MachineState

2019-05-06 Thread Like Xu
It's recommended to access smp variables via a MachineState
passed in as a parameter. This approach is applied to the legacy
smbios_*_tables*(), *_machine_reset(), *_hot_add_cpu()
and related *_create_cpu() functions so that later patches can use
the smp variables there.
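
The signature change for the reset hooks can be pictured with the sketch below. It is illustrative only: `MachineResetFn`, `demo_machine_reset()`, `run_reset()` and the `cpus_reset` counter are hypothetical names, and the reduced struct layouts are assumptions rather than the real definitions.

```c
/* Hypothetical, reduced versions of the structures used by the series. */
typedef struct CpuTopology {
    unsigned int cpus;
    unsigned int cores;
    unsigned int threads;
    unsigned int max_cpus;
} CpuTopology;

typedef struct MachineState {
    CpuTopology smp;
} MachineState;

/* After the refactoring a reset hook receives the machine explicitly,
 * instead of taking no arguments and reading a global. */
typedef void (*MachineResetFn)(MachineState *ms);

/* Records how many CPUs the demo hook visited (illustration only). */
static unsigned int cpus_reset;

static void demo_machine_reset(MachineState *ms)
{
    unsigned int i;

    cpus_reset = 0;
    for (i = 0; i < ms->smp.cpus; i++) {
        cpus_reset++;   /* a real hook would reset CPU i here */
    }
}

/* The caller already owns the MachineState, so it simply hands it on. */
static void run_reset(MachineResetFn reset, MachineState *ms)
{
    reset(ms);
}
```

Passing the pointer down the call chain is what lets the later patches in the series read ms->smp.* in these functions without an extra qdev_get_machine() lookup.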

Suggested-by: Igor Mammedov 
Signed-off-by: Like Xu 
---
 hw/arm/virt.c|  2 +-
 hw/hppa/machine.c|  2 +-
 hw/i386/pc.c |  9 -
 hw/mips/mips_malta.c | 21 +++--
 hw/ppc/pnv.c |  3 +--
 hw/ppc/spapr.c   |  3 +--
 hw/s390x/s390-virtio-ccw.c   |  6 +++---
 hw/smbios/smbios.c   | 26 +++---
 include/hw/boards.h  |  4 ++--
 include/hw/firmware/smbios.h |  5 +++--
 include/hw/i386/pc.h |  2 +-
 qmp.c|  2 +-
 vl.c |  2 +-
 13 files changed, 45 insertions(+), 42 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 16ba67f..1b02ba4 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1290,7 +1290,7 @@ static void virt_build_smbios(VirtMachineState *vms)
 vmc->smbios_old_sys_ver ? "1.0" : mc->name, false,
 true, SMBIOS_ENTRY_POINT_30);
 
-smbios_get_tables(NULL, 0, &smbios_tables, &smbios_tables_len,
+smbios_get_tables(MACHINE(vms), NULL, 0, &smbios_tables, &smbios_tables_len,
   &smbios_anchor, &smbios_anchor_len);
 
 if (smbios_anchor) {
diff --git a/hw/hppa/machine.c b/hw/hppa/machine.c
index d1b1d3c..416e67b 100644
--- a/hw/hppa/machine.c
+++ b/hw/hppa/machine.c
@@ -240,7 +240,7 @@ static void machine_hppa_init(MachineState *machine)
 cpu[0]->env.gr[21] = smp_cpus;
 }
 
-static void hppa_machine_reset(void)
+static void hppa_machine_reset(MachineState *ms)
 {
 int i;
 
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index d98b737..9bcd867 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -962,7 +962,7 @@ static void pc_build_smbios(PCMachineState *pcms)
 /* tell smbios about cpuid version and features */
 smbios_set_cpuid(cpu->env.cpuid_version, cpu->env.features[FEAT_1_EDX]);
 
-smbios_tables = smbios_get_table_legacy(&smbios_tables_len);
+smbios_tables = smbios_get_table_legacy(ms, &smbios_tables_len);
 if (smbios_tables) {
 fw_cfg_add_bytes(pcms->fw_cfg, FW_CFG_SMBIOS_ENTRIES,
  smbios_tables, smbios_tables_len);
@@ -979,7 +979,7 @@ static void pc_build_smbios(PCMachineState *pcms)
 array_count++;
 }
 }
-smbios_get_tables(mem_array, array_count,
+smbios_get_tables(ms, mem_array, array_count,
   &smbios_tables, &smbios_tables_len,
   &smbios_anchor, &smbios_anchor_len);
 g_free(mem_array);
@@ -1534,9 +1534,8 @@ static void pc_new_cpu(const char *typename, int64_t apic_id, Error **errp)
 error_propagate(errp, local_err);
 }
 
-void pc_hot_add_cpu(const int64_t id, Error **errp)
+void pc_hot_add_cpu(MachineState *ms, const int64_t id, Error **errp)
 {
-MachineState *ms = MACHINE(qdev_get_machine());
 int64_t apic_id = x86_cpu_apic_id_from_index(id);
 Error *local_err = NULL;
 
@@ -2622,7 +2621,7 @@ static void pc_machine_initfn(Object *obj)
 pc_system_flash_create(pcms);
 }
 
-static void pc_machine_reset(void)
+static void pc_machine_reset(MachineState *machine)
 {
 CPUState *cs;
 X86CPU *cpu;
diff --git a/hw/mips/mips_malta.c b/hw/mips/mips_malta.c
index 439665a..534e705 100644
--- a/hw/mips/mips_malta.c
+++ b/hw/mips/mips_malta.c
@@ -1124,14 +1124,14 @@ static void main_cpu_reset(void *opaque)
 }
 }
 
-static void create_cpu_without_cps(const char *cpu_type,
+static void create_cpu_without_cps(MachineState *ms, const char *cpu_type,
qemu_irq *cbus_irq, qemu_irq *i8259_irq)
 {
 CPUMIPSState *env;
 MIPSCPU *cpu;
 int i;
 
-for (i = 0; i < smp_cpus; i++) {
+for (i = 0; i < ms->smp.cpus; i++) {
 cpu = MIPS_CPU(cpu_create(cpu_type));
 
 /* Init internal devices */
@@ -1146,7 +1146,7 @@ static void create_cpu_without_cps(const char *cpu_type,
 *cbus_irq = env->irq[4];
 }
 
-static void create_cps(MaltaState *s, const char *cpu_type,
+static void create_cps(MachineState *ms, MaltaState *s, const char *cpu_type,
qemu_irq *cbus_irq, qemu_irq *i8259_irq)
 {
 Error *err = NULL;
@@ -1155,7 +1155,7 @@ static void create_cps(MaltaState *s, const char *cpu_type,
 qdev_set_parent_bus(DEVICE(s->cps), sysbus_get_default());
 
 object_property_set_str(OBJECT(s->cps), cpu_type, "cpu-type", &err);
-object_property_set_int(OBJECT(s->cps), smp_cpus, "num-vp", &err);
+object_property_set_int(OBJECT(s->cps), ms->smp.cpus, "num-vp", &err);
 object_property_set_bool(OBJECT(s->cps), true, "realized", &err);
 if (err != NULL) {
 error_report("%s", error_get_pretty(err));
@@ -1168,13 +1168,14 @@ static void create_cps(MaltaState *

[Qemu-devel] [PATCH v2 04/10] cpu/topology: add uncommon arch support for smp machine properties

2019-05-06 Thread Like Xu
Following the replacement rules, the global smp variables in
hppa/mips/openrisc/sparc*/xtensa are replaced with smp machine properties.
No semantic changes.

Signed-off-by: Like Xu 
---
 hw/alpha/dp264.c   | 1 +
 hw/hppa/machine.c  | 2 ++
 hw/mips/boston.c   | 2 +-
 hw/mips/mips_malta.c   | 2 ++
 hw/openrisc/openrisc_sim.c | 1 +
 hw/sparc/sun4m.c   | 2 ++
 hw/sparc64/sun4u.c | 4 ++--
 hw/xtensa/sim.c| 2 +-
 hw/xtensa/xtfpga.c | 1 +
 9 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/hw/alpha/dp264.c b/hw/alpha/dp264.c
index 0347eb8..9dfb835 100644
--- a/hw/alpha/dp264.c
+++ b/hw/alpha/dp264.c
@@ -63,6 +63,7 @@ static void clipper_init(MachineState *machine)
 char *palcode_filename;
 uint64_t palcode_entry, palcode_low, palcode_high;
 uint64_t kernel_entry, kernel_low, kernel_high;
+unsigned int smp_cpus = machine->smp.cpus;
 
 /* Create up to 4 cpus.  */
 memset(cpus, 0, sizeof(cpus));
diff --git a/hw/hppa/machine.c b/hw/hppa/machine.c
index 416e67b..662838d 100644
--- a/hw/hppa/machine.c
+++ b/hw/hppa/machine.c
@@ -72,6 +72,7 @@ static void machine_hppa_init(MachineState *machine)
 MemoryRegion *ram_region;
 MemoryRegion *cpu_region;
 long i;
+unsigned int smp_cpus = machine->smp.cpus;
 
 ram_size = machine->ram_size;
 
@@ -242,6 +243,7 @@ static void machine_hppa_init(MachineState *machine)
 
 static void hppa_machine_reset(MachineState *ms)
 {
+unsigned int smp_cpus = ms->smp.cpus;
 int i;
 
 qemu_devices_reset();
diff --git a/hw/mips/boston.c b/hw/mips/boston.c
index a8b29f6..ccbfac5 100644
--- a/hw/mips/boston.c
+++ b/hw/mips/boston.c
@@ -460,7 +460,7 @@ static void boston_mach_init(MachineState *machine)
 
 object_property_set_str(OBJECT(s->cps), machine->cpu_type, "cpu-type",
 &err);
-object_property_set_int(OBJECT(s->cps), smp_cpus, "num-vp", &err);
+object_property_set_int(OBJECT(s->cps), machine->smp.cpus, "num-vp", &err);
 object_property_set_bool(OBJECT(s->cps), true, "realized", &err);
 
 if (err != NULL) {
diff --git a/hw/mips/mips_malta.c b/hw/mips/mips_malta.c
index 534e705..70ff98b 100644
--- a/hw/mips/mips_malta.c
+++ b/hw/mips/mips_malta.c
@@ -1095,6 +1095,8 @@ static int64_t load_kernel (void)
 
 static void malta_mips_config(MIPSCPU *cpu)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
+unsigned int smp_cpus = ms->smp.cpus;
 CPUMIPSState *env = &cpu->env;
 CPUState *cs = CPU(cpu);
 
diff --git a/hw/openrisc/openrisc_sim.c b/hw/openrisc/openrisc_sim.c
index 7d3b734..c84b9af 100644
--- a/hw/openrisc/openrisc_sim.c
+++ b/hw/openrisc/openrisc_sim.c
@@ -131,6 +131,7 @@ static void openrisc_sim_init(MachineState *machine)
 qemu_irq *cpu_irqs[2];
 qemu_irq serial_irq;
 int n;
+unsigned int smp_cpus = machine->smp.cpus;
 
 for (n = 0; n < smp_cpus; n++) {
 cpu = OPENRISC_CPU(cpu_create(machine->cpu_type));
diff --git a/hw/sparc/sun4m.c b/hw/sparc/sun4m.c
index ca1e382..43e3434 100644
--- a/hw/sparc/sun4m.c
+++ b/hw/sparc/sun4m.c
@@ -853,6 +853,8 @@ static void sun4m_hw_init(const struct sun4m_hwdef *hwdef,
 unsigned int num_vsimms;
 DeviceState *dev;
 SysBusDevice *s;
+unsigned int smp_cpus = machine->smp.cpus;
+unsigned int max_cpus = machine->smp.max_cpus;
 
 /* init CPUs */
 for(i = 0; i < smp_cpus; i++) {
diff --git a/hw/sparc64/sun4u.c b/hw/sparc64/sun4u.c
index 399f2d7..0807f27 100644
--- a/hw/sparc64/sun4u.c
+++ b/hw/sparc64/sun4u.c
@@ -678,8 +678,8 @@ static void sun4uv_init(MemoryRegion *address_space_mem,
 &FW_CFG_IO(dev)->comb_iomem);
 
 fw_cfg = FW_CFG(dev);
-fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, (uint16_t)smp_cpus);
-fw_cfg_add_i16(fw_cfg, FW_CFG_MAX_CPUS, (uint16_t)max_cpus);
+fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, (uint16_t)machine->smp.cpus);
+fw_cfg_add_i16(fw_cfg, FW_CFG_MAX_CPUS, (uint16_t)machine->smp.max_cpus);
 fw_cfg_add_i64(fw_cfg, FW_CFG_RAM_SIZE, (uint64_t)ram_size);
 fw_cfg_add_i16(fw_cfg, FW_CFG_MACHINE_ID, hwdef->machine_id);
 fw_cfg_add_i64(fw_cfg, FW_CFG_KERNEL_ADDR, kernel_entry);
diff --git a/hw/xtensa/sim.c b/hw/xtensa/sim.c
index 12c7437..a4eef76 100644
--- a/hw/xtensa/sim.c
+++ b/hw/xtensa/sim.c
@@ -60,7 +60,7 @@ static void xtensa_sim_init(MachineState *machine)
 const char *kernel_filename = machine->kernel_filename;
 int n;
 
-for (n = 0; n < smp_cpus; n++) {
+for (n = 0; n < machine->smp.cpus; n++) {
 cpu = XTENSA_CPU(cpu_create(machine->cpu_type));
 env = &cpu->env;
 
diff --git a/hw/xtensa/xtfpga.c b/hw/xtensa/xtfpga.c
index e05ef75..f7f3e11 100644
--- a/hw/xtensa/xtfpga.c
+++ b/hw/xtensa/xtfpga.c
@@ -238,6 +238,7 @@ static void xtfpga_init(const XtfpgaBoardDesc *board, MachineState *machine)
 const un

[Qemu-devel] [PATCH v2 06/10] cpu/topology: add hw/riscv support for smp machine properties

2019-05-06 Thread Like Xu
Following the replacement rules, the global smp variables in riscv
are replaced with smp machine properties. No semantic changes.

Signed-off-by: Like Xu 
---
 hw/riscv/sifive_e.c| 6 --
 hw/riscv/sifive_plic.c | 3 +++
 hw/riscv/sifive_u.c| 6 --
 hw/riscv/spike.c   | 2 ++
 hw/riscv/virt.c| 1 +
 5 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/hw/riscv/sifive_e.c b/hw/riscv/sifive_e.c
index b1cd113..ae86a63 100644
--- a/hw/riscv/sifive_e.c
+++ b/hw/riscv/sifive_e.c
@@ -137,6 +137,7 @@ static void riscv_sifive_e_init(MachineState *machine)
 
 static void riscv_sifive_e_soc_init(Object *obj)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 SiFiveESoCState *s = RISCV_E_SOC(obj);
 
 object_initialize_child(obj, "cpus", &s->cpus,
@@ -144,12 +145,13 @@ static void riscv_sifive_e_soc_init(Object *obj)
 &error_abort, NULL);
 object_property_set_str(OBJECT(&s->cpus), SIFIVE_E_CPU, "cpu-type",
 &error_abort);
-object_property_set_int(OBJECT(&s->cpus), smp_cpus, "num-harts",
+object_property_set_int(OBJECT(&s->cpus), ms->smp.cpus, "num-harts",
 &error_abort);
 }
 
 static void riscv_sifive_e_soc_realize(DeviceState *dev, Error **errp)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 const struct MemmapEntry *memmap = sifive_e_memmap;
 
 SiFiveESoCState *s = RISCV_E_SOC(dev);
@@ -179,7 +181,7 @@ static void riscv_sifive_e_soc_realize(DeviceState *dev, Error **errp)
 SIFIVE_E_PLIC_CONTEXT_STRIDE,
 memmap[SIFIVE_E_PLIC].size);
 sifive_clint_create(memmap[SIFIVE_E_CLINT].base,
-memmap[SIFIVE_E_CLINT].size, smp_cpus,
+memmap[SIFIVE_E_CLINT].size, ms->smp.cpus,
 SIFIVE_SIP_BASE, SIFIVE_TIMECMP_BASE, SIFIVE_TIME_BASE);
 sifive_mmio_emulate(sys_mem, "riscv.sifive.e.aon",
 memmap[SIFIVE_E_AON].base, memmap[SIFIVE_E_AON].size);
diff --git a/hw/riscv/sifive_plic.c b/hw/riscv/sifive_plic.c
index 07a032d..d4010a1 100644
--- a/hw/riscv/sifive_plic.c
+++ b/hw/riscv/sifive_plic.c
@@ -23,6 +23,7 @@
 #include "qemu/error-report.h"
 #include "hw/sysbus.h"
 #include "hw/pci/msi.h"
+#include "hw/boards.h"
 #include "target/riscv/cpu.h"
 #include "sysemu/sysemu.h"
 #include "hw/riscv/sifive_plic.h"
@@ -438,6 +439,8 @@ static void sifive_plic_irq_request(void *opaque, int irq, 
int level)
 
 static void sifive_plic_realize(DeviceState *dev, Error **errp)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
+unsigned int smp_cpus = ms->smp.cpus;
 SiFivePLICState *plic = SIFIVE_PLIC(dev);
 int i;
 
diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c
index 5ecc47c..43bf256 100644
--- a/hw/riscv/sifive_u.c
+++ b/hw/riscv/sifive_u.c
@@ -321,13 +321,14 @@ static void riscv_sifive_u_init(MachineState *machine)
 
 static void riscv_sifive_u_soc_init(Object *obj)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 SiFiveUSoCState *s = RISCV_U_SOC(obj);
 
 object_initialize_child(obj, "cpus", &s->cpus, sizeof(s->cpus),
 TYPE_RISCV_HART_ARRAY, &error_abort, NULL);
 object_property_set_str(OBJECT(&s->cpus), SIFIVE_U_CPU, "cpu-type",
 &error_abort);
-object_property_set_int(OBJECT(&s->cpus), smp_cpus, "num-harts",
+object_property_set_int(OBJECT(&s->cpus), ms->smp.cpus, "num-harts",
 &error_abort);
 
 sysbus_init_child_obj(obj, "gem", &s->gem, sizeof(s->gem),
@@ -336,6 +337,7 @@ static void riscv_sifive_u_soc_init(Object *obj)
 
 static void riscv_sifive_u_soc_realize(DeviceState *dev, Error **errp)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 SiFiveUSoCState *s = RISCV_U_SOC(dev);
 const struct MemmapEntry *memmap = sifive_u_memmap;
 MemoryRegion *system_memory = get_system_memory();
@@ -371,7 +373,7 @@ static void riscv_sifive_u_soc_realize(DeviceState *dev, 
Error **errp)
 sifive_uart_create(system_memory, memmap[SIFIVE_U_UART1].base,
 serial_hd(1), qdev_get_gpio_in(DEVICE(s->plic), SIFIVE_U_UART1_IRQ));
 sifive_clint_create(memmap[SIFIVE_U_CLINT].base,
-memmap[SIFIVE_U_CLINT].size, smp_cpus,
+memmap[SIFIVE_U_CLINT].size, ms->smp.cpus,
 SIFIVE_SIP_BASE, SIFIVE_TIMECMP_BASE, SIFIVE_TIME_BASE);
 
 for (i = 0; i < SIFIVE_U_PLIC_NUM_SOURCES; i++) {
diff --git a/hw/riscv/spike.c b/hw/riscv/spike.c
index 2a000a5..6a747ff 100644
--- a/hw/riscv/spike.c
+++ b/hw/riscv/spike.c
@@ -171,6 +171,7 @@ static void spike_v1_10_0_board_init(MachineState *machine)
 MemoryRegion *main_mem = g_new(MemoryRegion, 1);
 MemoryRegion *mask_rom = g_new(MemoryRegion, 1);
 int i;
+unsigned int smp_cpus = machine->smp.cpus;
 
 /* Initialize SOC */
 object_initialize_child(OBJE

[Qemu-devel] [PATCH v2 07/10] cpu/topology: add hw/s390x support for smp machine properties

2019-05-06 Thread Like Xu
Following the replace rules, the global smp variables in s390x
are replaced with smp machine properties. No semantic changes.

Signed-off-by: Like Xu 
---
 hw/s390x/s390-virtio-ccw.c | 3 ++-
 hw/s390x/sclp.c| 2 +-
 target/s390x/cpu.c | 3 +++
 target/s390x/excp_helper.c | 5 +
 4 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index ed1fe7a..692ad6e 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -83,7 +83,7 @@ static void s390_init_cpus(MachineState *machine)
 /* initialize possible_cpus */
 mc->possible_cpu_arch_ids(machine);
 
-for (i = 0; i < smp_cpus; i++) {
+for (i = 0; i < machine->smp.cpus; i++) {
 s390x_new_cpu(machine->cpu_type, i, &error_fatal);
 }
 }
@@ -410,6 +410,7 @@ static CpuInstanceProperties 
s390_cpu_index_to_props(MachineState *ms,
 static const CPUArchIdList *s390_possible_cpu_arch_ids(MachineState *ms)
 {
 int i;
+unsigned int max_cpus = ms->smp.max_cpus;
 
 if (ms->possible_cpus) {
 g_assert(ms->possible_cpus && ms->possible_cpus->len == max_cpus);
diff --git a/hw/s390x/sclp.c b/hw/s390x/sclp.c
index 4510a80..fac7c3b 100644
--- a/hw/s390x/sclp.c
+++ b/hw/s390x/sclp.c
@@ -64,7 +64,7 @@ static void read_SCP_info(SCLPDevice *sclp, SCCB *sccb)
 prepare_cpu_entries(sclp, read_info->entries, &cpu_count);
 read_info->entries_cpu = cpu_to_be16(cpu_count);
 read_info->offset_cpu = cpu_to_be16(offsetof(ReadInfo, entries));
-read_info->highest_cpu = cpu_to_be16(max_cpus - 1);
+read_info->highest_cpu = cpu_to_be16(machine->smp.max_cpus - 1);
 
 read_info->ibc_val = cpu_to_be32(s390_get_ibc_val());
 
diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
index b58ef0a..0601c2e 100644
--- a/target/s390x/cpu.c
+++ b/target/s390x/cpu.c
@@ -37,6 +37,7 @@
 #include "hw/qdev-properties.h"
 #ifndef CONFIG_USER_ONLY
 #include "hw/hw.h"
+#include "hw/boards.h"
 #include "sysemu/arch_init.h"
 #include "sysemu/sysemu.h"
 #endif
@@ -193,6 +194,8 @@ static void s390_cpu_realizefn(DeviceState *dev, Error 
**errp)
 }
 
 #if !defined(CONFIG_USER_ONLY)
+MachineState *ms = MACHINE(qdev_get_machine());
+unsigned int max_cpus = ms->smp.max_cpus;
 if (cpu->env.core_id >= max_cpus) {
 error_setg(&err, "Unable to add CPU with core-id: %" PRIu32
", maximum core-id: %d", cpu->env.core_id,
diff --git a/target/s390x/excp_helper.c b/target/s390x/excp_helper.c
index f84bfb1..77833e9 100644
--- a/target/s390x/excp_helper.c
+++ b/target/s390x/excp_helper.c
@@ -31,6 +31,7 @@
 #ifndef CONFIG_USER_ONLY
 #include "sysemu/sysemu.h"
 #include "hw/s390x/s390_flic.h"
+#include "hw/boards.h"
 #endif
 
 void QEMU_NORETURN tcg_s390_program_interrupt(CPUS390XState *env, uint32_t 
code,
@@ -279,6 +280,10 @@ static void do_ext_interrupt(CPUS390XState *env)
 g_assert(cpu_addr < S390_MAX_CPUS);
 lowcore->cpu_addr = cpu_to_be16(cpu_addr);
 clear_bit(cpu_addr, env->emergency_signals);
+#ifndef CONFIG_USER_ONLY
+MachineState *ms = MACHINE(qdev_get_machine());
+unsigned int max_cpus = ms->smp.max_cpus;
+#endif
 if (bitmap_empty(env->emergency_signals, max_cpus)) {
 env->pending_int &= ~INTERRUPT_EMERGENCY_SIGNAL;
 }
-- 
1.8.3.1




[Qemu-devel] [PATCH v2 01/10] hw/boards: add struct CpuTopology to MachineState

2019-05-06 Thread Like Xu
To allow the global smp variables to be removed arch by arch in a
bisect-friendly way, struct CpuTopology is introduced and initialized
with copies of them; no semantic changes.

Suggested-by: Igor Mammedov 
Suggested-by: Eduardo Habkost 
Signed-off-by: Like Xu 
---
 include/hw/boards.h | 15 +++
 vl.c|  5 +
 2 files changed, 20 insertions(+)

diff --git a/include/hw/boards.h b/include/hw/boards.h
index 6f7916f..dc89c6d 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -231,6 +231,20 @@ typedef struct DeviceMemoryState {
 } DeviceMemoryState;
 
 /**
+ * CpuTopology:
+ * @cpus: the number of logical processors on the machine
+ * @cores: the number of cores in one package
+ * @threads: the number of threads in one core
+ * @max_cpus: the maximum number of logical processors on the machine
+ */
+typedef struct CpuTopology {
+unsigned int cpus;
+unsigned int cores;
+unsigned int threads;
+unsigned int max_cpus;
+} CpuTopology;
+
+/**
  * MachineState:
  */
 struct MachineState {
@@ -272,6 +286,7 @@ struct MachineState {
 const char *cpu_type;
 AccelState *accelerator;
 CPUArchIdList *possible_cpus;
+CpuTopology smp;
 struct NVDIMMState *nvdimms_state;
 };
 
diff --git a/vl.c b/vl.c
index d9fea0a..43fd247 100644
--- a/vl.c
+++ b/vl.c
@@ -4099,6 +4099,11 @@ int main(int argc, char **argv, char **envp)
 
 smp_parse(qemu_opts_find(qemu_find_opts("smp-opts"), NULL));
 
+current_machine->smp.cpus = smp_cpus;
+current_machine->smp.max_cpus = max_cpus;
+current_machine->smp.cores = smp_cores;
+current_machine->smp.threads = smp_threads;
+
 /* sanity-check smp_cpus and max_cpus against machine_class */
 if (smp_cpus < machine_class->min_cpus) {
 error_report("Invalid SMP CPUs %d. The min CPUs "
-- 
1.8.3.1
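
The bisect-friendly step in the patch above can be sketched in isolation.
This is only an illustration under stated assumptions: ToyMachineState,
toy_machine_copy_smp() and toy_hotpluggable_cpus() are hypothetical names
that do not exist in QEMU, and the global values are made up. Only the
CpuTopology field layout and the four assignments are taken from the patch.

```c
/* Stand-ins for the legacy globals filled in by smp_parse(); values are made up. */
static unsigned int smp_cpus = 4;
static unsigned int smp_cores = 2;
static unsigned int smp_threads = 2;
static unsigned int max_cpus = 8;

/* Same field layout as the CpuTopology struct added by the patch. */
typedef struct CpuTopology {
    unsigned int cpus;
    unsigned int cores;
    unsigned int threads;
    unsigned int max_cpus;
} CpuTopology;

/* Hypothetical, heavily reduced machine state: only the new member. */
typedef struct ToyMachineState {
    CpuTopology smp;
} ToyMachineState;

/* Mirror the globals into the per-machine struct, as the vl.c hunk does. */
static void toy_machine_copy_smp(ToyMachineState *ms)
{
    ms->smp.cpus = smp_cpus;
    ms->smp.max_cpus = max_cpus;
    ms->smp.cores = smp_cores;
    ms->smp.threads = smp_threads;
}

/* A converted consumer reads ms->smp.* and never touches the globals. */
static unsigned int toy_hotpluggable_cpus(const ToyMachineState *ms)
{
    return ms->smp.max_cpus - ms->smp.cpus;
}
```

Because both copies hold the same values during the transition, each
architecture can switch from the globals to ms->smp.* in its own commit
without changing behavior.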




[Qemu-devel] [PATCH] hw/arm/fsl-imx: move cpus initialization to realize time after smp_cpus check

2019-04-30 Thread Like Xu
If the "smp_cpus > FSL_IMX6_NUM_CPUS" check trips in *_realize(), there was
no need to initialize the CPUs in *_init(). It is therefore better to create
all CPUs after the validity check in *_realize(). This also gathers the uses
of the global variable smp_cpus in one place, which eases maintenance.

Suggested-by: Igor Mammedov 
Signed-off-by: Like Xu 
---
 hw/arm/fsl-imx6.c   | 13 +++--
 hw/arm/fsl-imx6ul.c | 12 ++--
 hw/arm/fsl-imx7.c   | 15 +++
 3 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/hw/arm/fsl-imx6.c b/hw/arm/fsl-imx6.c
index 7b7b97f..14015a1 100644
--- a/hw/arm/fsl-imx6.c
+++ b/hw/arm/fsl-imx6.c
@@ -37,12 +37,6 @@ static void fsl_imx6_init(Object *obj)
 char name[NAME_SIZE];
 int i;
 
-for (i = 0; i < MIN(smp_cpus, FSL_IMX6_NUM_CPUS); i++) {
-snprintf(name, NAME_SIZE, "cpu%d", i);
-object_initialize_child(obj, name, &s->cpu[i], sizeof(s->cpu[i]),
-"cortex-a9-" TYPE_ARM_CPU, &error_abort, NULL);
-}
-
 sysbus_init_child_obj(obj, "a9mpcore", &s->a9mpcore, sizeof(s->a9mpcore),
   TYPE_A9MPCORE_PRIV);
 
@@ -95,6 +89,7 @@ static void fsl_imx6_realize(DeviceState *dev, Error **errp)
 {
 FslIMX6State *s = FSL_IMX6(dev);
 uint16_t i;
+char name[NAME_SIZE];
 Error *err = NULL;
 
 if (smp_cpus > FSL_IMX6_NUM_CPUS) {
@@ -103,6 +98,12 @@ static void fsl_imx6_realize(DeviceState *dev, Error **errp)
 return;
 }
 
+for (i = 0; i < MIN(smp_cpus, FSL_IMX6_NUM_CPUS); i++) {
+snprintf(name, NAME_SIZE, "cpu%d", i);
+object_initialize_child(OBJECT(dev), name, &s->cpu[i],
+sizeof(s->cpu[i]), "cortex-a9-" TYPE_ARM_CPU, &error_abort, NULL);
+}
+
 for (i = 0; i < smp_cpus; i++) {
 
 /* On uniprocessor, the CBAR is set to 0 */
diff --git a/hw/arm/fsl-imx6ul.c b/hw/arm/fsl-imx6ul.c
index 4b56bfa..7f30eb7 100644
--- a/hw/arm/fsl-imx6ul.c
+++ b/hw/arm/fsl-imx6ul.c
@@ -32,12 +32,6 @@ static void fsl_imx6ul_init(Object *obj)
 char name[NAME_SIZE];
 int i;
 
-for (i = 0; i < MIN(smp_cpus, FSL_IMX6UL_NUM_CPUS); i++) {
-snprintf(name, NAME_SIZE, "cpu%d", i);
-object_initialize_child(obj, name, &s->cpu[i], sizeof(s->cpu[i]),
-"cortex-a7-" TYPE_ARM_CPU, &error_abort, NULL);
-}
-
 /*
  * A7MPCORE
  */
@@ -167,6 +161,12 @@ static void fsl_imx6ul_realize(DeviceState *dev, Error 
**errp)
 return;
 }
 
+for (i = 0; i < MIN(smp_cpus, FSL_IMX6UL_NUM_CPUS); i++) {
+snprintf(name, NAME_SIZE, "cpu%d", i);
+object_initialize_child(OBJECT(dev), name, &s->cpu[i],
+sizeof(s->cpu[i]), "cortex-a7-" TYPE_ARM_CPU, &error_abort, NULL);
+}
+
 for (i = 0; i < smp_cpus; i++) {
 Object *o = OBJECT(&s->cpu[i]);
 
diff --git a/hw/arm/fsl-imx7.c b/hw/arm/fsl-imx7.c
index 7663ad6..2580348 100644
--- a/hw/arm/fsl-imx7.c
+++ b/hw/arm/fsl-imx7.c
@@ -34,14 +34,6 @@ static void fsl_imx7_init(Object *obj)
 char name[NAME_SIZE];
 int i;
 
-
-for (i = 0; i < MIN(smp_cpus, FSL_IMX7_NUM_CPUS); i++) {
-snprintf(name, NAME_SIZE, "cpu%d", i);
-object_initialize_child(obj, name, &s->cpu[i], sizeof(s->cpu[i]),
-ARM_CPU_TYPE_NAME("cortex-a7"), &error_abort,
-NULL);
-}
-
 /*
  * A7MPCORE
  */
@@ -167,6 +159,13 @@ static void fsl_imx7_realize(DeviceState *dev, Error 
**errp)
 return;
 }
 
+for (i = 0; i < MIN(smp_cpus, FSL_IMX7_NUM_CPUS); i++) {
+snprintf(name, NAME_SIZE, "cpu%d", i);
+object_initialize_child(OBJECT(dev), name, &s->cpu[i],
+sizeof(s->cpu[i]), ARM_CPU_TYPE_NAME("cortex-a7"),
+&error_abort, NULL);
+}
+
 for (i = 0; i < smp_cpus; i++) {
 o = OBJECT(&s->cpu[i]);
 
-- 
1.8.3.1
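
The ordering argued for in the commit message above (check the CPU count
first, create the CPUs only afterwards) can be sketched without any QEMU
types. ToySoC, TOY_SOC_NUM_CPUS, toy_soc_init() and toy_soc_realize() are
hypothetical names used only for this illustration; they stand in for the
FSL_IMX*_NUM_CPUS limits and the *_init()/*_realize() pairs touched by
the patch.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_SOC_NUM_CPUS 4   /* hypothetical per-SoC limit */

typedef struct ToySoC {
    bool cpu_created[TOY_SOC_NUM_CPUS];
    size_t num_created;
} ToySoC;

/* init: set up only state that does not depend on the requested CPU count. */
static void toy_soc_init(ToySoC *s)
{
    for (size_t i = 0; i < TOY_SOC_NUM_CPUS; i++) {
        s->cpu_created[i] = false;
    }
    s->num_created = 0;
}

/*
 * realize: validate first, then create. Returns false on error, in which
 * case no CPU has been created and there is nothing to undo.
 */
static bool toy_soc_realize(ToySoC *s, unsigned int requested_cpus)
{
    if (requested_cpus > TOY_SOC_NUM_CPUS) {
        return false;
    }
    for (unsigned int i = 0; i < requested_cpus; i++) {
        s->cpu_created[i] = true;
        s->num_created++;
    }
    return true;
}
```

With the check placed before the loop, a rejected configuration leaves the
object exactly as init left it.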




Re: [Qemu-devel] [PATCH 2/9] cpu/topology: add general support for machine properties

2019-04-30 Thread Like Xu

On 2019/4/4 22:25, Igor Mammedov wrote:

On Fri, 29 Mar 2019 16:48:38 +0800
Like Xu  wrote:







diff --git a/cpus.c b/cpus.c
index e83f72b..834a697 100644
--- a/cpus.c
+++ b/cpus.c
@@ -2067,6 +2067,10 @@ static void qemu_dummy_start_vcpu(CPUState *cpu)
  
  void qemu_init_vcpu(CPUState *cpu)

  {
+MachineState *ms = MACHINE(qdev_get_machine());
+unsigned int smp_cores = ms->topo.smp_cores;
+unsigned int smp_threads = ms->topo.smp_threads;


(***)
For one, it will probably crash *-user builds; secondly, the purpose of
getting rid of the smp_foo globals is to disentangle layer violations,
not to replace them with another global (qdev_get_machine()).


I am happy to follow this rule on cpu-topo refactoring work, but 
sometimes calling qdev_get_machine() is inevitable.




What should be done is to make nr_cores/nr_threads properties and set
them from the parent object that creates the CPUs. The point is that a CPU
shouldn't reach outside itself to fish out the data bits it needs; it's the
responsibility of the creator to feed the needed properties to the CPU
being created.

This kind of refactoring probably deserves its own series and should precede
-smp refactoring as it doesn't depend on CpuTopology at all.



The division of responsibility for this case (refactoring
qemu_init_vcpu) seems to be a poisonous apple.


The prerequisite for setting cpu->nr_cores / cpu->nr_threads from the
parent is that the CPU has been created, so if any step during
initialization needs this topology information, it will use the default
values from cpu_common_initfn() instead of the user-configured parameters.


We may not want to repeat those assignments using the new
values. What do you think, Igor?
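
The ordering concern described above can be shown with a small sketch.
ToyCPU and the toy_* functions are hypothetical and are not QEMU code; the
only detail taken from the discussion is that the init function sets
default values before the parent has a chance to set the real ones.

```c
typedef struct ToyCPU {
    int nr_cores;
    int nr_threads;
    int threads_seen_at_init;   /* what init-time code observed */
} ToyCPU;

/* Instance init: only defaults are available at this point. */
static void toy_cpu_initfn(ToyCPU *cpu)
{
    cpu->nr_cores = 1;
    cpu->nr_threads = 1;
    /* Anything computed here is based on the defaults. */
    cpu->threads_seen_at_init = cpu->nr_cores * cpu->nr_threads;
}

/* The parent sets the user-configured values after the CPU exists. */
static void toy_parent_set_topology(ToyCPU *cpu, int cores, int threads)
{
    cpu->nr_cores = cores;
    cpu->nr_threads = threads;
}

/* Realize-time code sees the values the parent supplied. */
static int toy_cpu_realize(const ToyCPU *cpu)
{
    return cpu->nr_cores * cpu->nr_threads;
}
```

In this sketch the value cached during init stays at 1 even after the
parent sets 2 cores and 2 threads, while realize-time code sees 4. Code
that needs the topology would therefore have to run at realize time or
later, or be repeated once the properties are set.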






Re: [Qemu-devel] [PATCH 0/4] Remove some qdev_get_machine() calls from CONFIG_USER_ONLY

2019-04-26 Thread Like Xu

On 2019/4/26 4:00, Eduardo Habkost wrote:

This series moves some qdev code outside qdev.o, so it can be
compiled only in CONFIG_SOFTMMU.

The code being moved includes two qdev_get_machine() calls, so
this will make it easier to move qdev_get_machine() to
CONFIG_SOFTMMU later.

After this series, there's one remaining qdev_get_machine() call
that seems more difficult to remove:

 static void device_set_realized(Object *obj, bool value, Error **errp)
 {
 /* [...] */
 if (!obj->parent) {
 gchar *name = g_strdup_printf("device[%d]", unattached_count++);

 object_property_add_child(container_get(qdev_get_machine(),
 "/unattached"),
   name, obj, &error_abort);
 unattached_parent = true;
 g_free(name);
 }
 /* [...] */
 }



I may have an experimental patch to fix the device_set_realized() issue:

1. in qdev_get_machine():
replace
dev = container_get(object_get_root(), "/machine");
with
dev = object_resolve_path("/machine", NULL);

2. in device_set_realized():

Use
Object *container = qdev_get_machine() ?
qdev_get_machine() : object_get_root();
and pass it to
object_property_add_child(
container_get(container, "/unattached"),
name, obj, &error_abort);

With this fix, qdev_get_machine() returns the "/machine" object
(or NULL) rather than a confusing "/container" object.

We could continue to use qdev_get_machine() in system emulation mode
while getting rid of its surprising side effect, as Markus said.

The return value of qdev_get_machine() in user-only mode
is the same object returned by object_get_root(),
so no semantic changes.
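
The difference between a pure lookup and a get-or-create helper, and the
fallback to the root object proposed in step 2, can be sketched with a toy
object tree. Everything here (ToyObject, toy_resolve_child(),
toy_container_get(), toy_unattached_parent(), the fixed-size pool) is a
hypothetical stand-in and not the QOM API; it only mirrors the behavior
described above, where the lookup returns NULL instead of creating a
container.

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX_CHILDREN 4
#define TOY_POOL_SIZE 8

typedef struct ToyObject {
    const char *name;
    const char *type;
    struct ToyObject *children[TOY_MAX_CHILDREN];
    size_t num_children;
} ToyObject;

/* Fixed pool so the sketch needs no dynamic allocation. */
static ToyObject toy_pool[TOY_POOL_SIZE];
static size_t toy_pool_used;

/* Pure lookup: returns NULL when the child is missing, creates nothing. */
static ToyObject *toy_resolve_child(ToyObject *parent, const char *name)
{
    for (size_t i = 0; i < parent->num_children; i++) {
        if (strcmp(parent->children[i]->name, name) == 0) {
            return parent->children[i];
        }
    }
    return NULL;
}

/* Get-or-create: silently adds a "container" child when it is missing. */
static ToyObject *toy_container_get(ToyObject *parent, const char *name)
{
    ToyObject *child = toy_resolve_child(parent, name);

    if (!child && parent->num_children < TOY_MAX_CHILDREN &&
        toy_pool_used < TOY_POOL_SIZE) {
        child = &toy_pool[toy_pool_used++];
        child->name = name;
        child->type = "container";
        child->num_children = 0;
        parent->children[parent->num_children++] = child;
    }
    return child;
}

/* Parent for unattached devices: the machine if present, else the root. */
static ToyObject *toy_unattached_parent(ToyObject *root)
{
    ToyObject *machine = toy_resolve_child(root, "machine");

    return toy_container_get(machine ? machine : root, "unattached");
}
```

In the sketch, looking up "machine" on an empty root leaves the tree
untouched, so a later caller cannot mistake an implicitly created
container for a real machine object.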



This one is tricky because on system emulation mode it needs
"/machine" to already exist, but in user-only mode it needs to
implicitly create a "/machine" container.

Eduardo Habkost (4):
   machine: Move gpio code to hw/core/gpio.c
   move qdev hotplug code to qdev-hotplug.c
   qdev: Don't compile hotplug code in user-mode emulation
   qdev-hotplug: Don't check type of qdev_get_machine()

  hw/core/bus.c|  11 --
  hw/core/gpio.c   | 206 
  hw/core/qdev-hotplug-stubs.c |  44 +++
  hw/core/qdev-hotplug.c   |  64 ++
  hw/core/qdev.c   | 219 ---
  hw/core/Makefile.objs|   5 +-
  tests/Makefile.include   |   3 +-
  7 files changed, 320 insertions(+), 232 deletions(-)
  create mode 100644 hw/core/gpio.c
  create mode 100644 hw/core/qdev-hotplug-stubs.c
  create mode 100644 hw/core/qdev-hotplug.c






Re: [Qemu-devel] [PATCH v3 2/2] core/qdev: refactor qdev_get_machine() with type assertion

2019-04-24 Thread Like Xu

On 2019/4/25 1:21, Eduardo Habkost wrote:

On Tue, Apr 23, 2019 at 03:59:31PM +0800, Like Xu wrote:

On 2019/4/18 1:10, Eduardo Habkost wrote:

On Wed, Apr 17, 2019 at 07:14:10AM +0200, Markus Armbruster wrote:

Eduardo Habkost  writes:


On Mon, Apr 15, 2019 at 03:59:45PM +0800, Like Xu wrote:

To avoid the misuse of qdev_get_machine() if machine hasn't been created yet,
this patch uses qdev_get_machine_uncheck() for obj-common (share with user-only
mode) and adds type assertion to qdev_get_machine() in system-emulation mode.

Suggested-by: Igor Mammedov 
Signed-off-by: Like Xu 


Reviewed-by: Eduardo Habkost 

I'm queueing the series on machine-next, thanks!


Hold your horses, please.

I dislike the name qdev_get_machine_uncheck().  I could live with
qdev_get_machine_unchecked().

However, I doubt this is the right approach.

The issue at hand is undisciplined creation of QOM object /machine.

This patch adds an assertion "undisciplined creation of /machine didn't
create crap", but only in some places.

I think we should never create /machine as (surprising!) side effect of
qdev_get_machine().  Create it explicitly instead, and have
qdev_get_machine() use object_resolve_path("/machine", NULL) to get it.
Look ma, no side effects.


OK, I'm dropping this one while we discuss it.

I really miss a good explanation why qdev_get_machine_unchecked()
needs to exist.  When exactly do we want /machine to exist but
not be TYPE_MACHINE?  Why?


AFAICT, there is no such "/machine" that is not of type TYPE_MACHINE.

The original qdev_get_machine() would always return a "/container" object in
user-only mode, where there is no TYPE_MACHINE object.


I'm confused.  Both qdev_get_machine() and
qdev_get_machine_unchecked() still return the object at
"/machine".  On softmmu, /machine will be of type TYPE_MACHINE.
On user-only, /machine will be of type "container".




In system emulation mode, it returns the same "/container" object at the
beginning, until we initialize and add a TYPE_MACHINE object to the
"/container" as a child and it would return OBJECT(current_machine)
for later usages.

The starting point is to avoid using the legacy qdev_get_machine()
in system emulation mode when we haven't added the "/machine" object.
As a result, we introduced type checking assertions to avoid premature
invocations.


I believe Markus is suggesting that avoiding unwanted side
effects is even better than doing type checking after it's
already too late.  In other words, it doesn't make sense to call
container_get("/machine") on system emulation mode.


I agree.






In this proposal, the qdev_get_machine_unchecked() is only used
in user-only mode, part of which shares with system emulation mode
(such as device_set_realized, cpu_common_realizefn). The new
qdev_get_machine() is only used in system emulation mode and type checking
assertion does reduce the irrational use of this function (if any in the
future).


This part confuses me as well.  qdev_get_machine_unchecked() is
used in both user-only and softmmu, isn't it?  Thus we can't say it
is only used in user-only mode.


You're right about this.



I think we all agree that qdev_get_machine() should eventually be
available in softmmu only.


I think we need to make it happen to avoid calling qdev_get_machine()
in user-only mode.



But I don't think we agree when it would be appropriate to call
qdev_get_machine_unchecked() instead of qdev_get_machine().

On both examples in your patch, the code checks for TYPE_MACHINE
immediately after calling qdev_get_machine_unchecked().  If that
code is only useful in softmmu mode, when would anybody want to
call qdev_get_machine_unchecked() in user-only mode?




We all agree to use this qdev_get_machine() as little as possible
and this patch could make future clean up work easier.



Once the expectations and use cases are explained, we can choose
a better name for qdev_get_machine_unchecked() and document it
properly.










Re: [Qemu-devel] [PATCH v3 2/2] core/qdev: refactor qdev_get_machine() with type assertion

2019-04-23 Thread Like Xu

On 2019/4/18 1:10, Eduardo Habkost wrote:

On Wed, Apr 17, 2019 at 07:14:10AM +0200, Markus Armbruster wrote:

Eduardo Habkost  writes:


On Mon, Apr 15, 2019 at 03:59:45PM +0800, Like Xu wrote:

To avoid the misuse of qdev_get_machine() if machine hasn't been created yet,
this patch uses qdev_get_machine_uncheck() for obj-common (share with user-only
mode) and adds type assertion to qdev_get_machine() in system-emulation mode.

Suggested-by: Igor Mammedov 
Signed-off-by: Like Xu 


Reviewed-by: Eduardo Habkost 

I'm queueing the series on machine-next, thanks!


Hold your horses, please.

I dislike the name qdev_get_machine_uncheck().  I could live with
qdev_get_machine_unchecked().

However, I doubt this is the right approach.

The issue at hand is undisciplined creation of QOM object /machine.

This patch adds an assertion "undisciplined creation of /machine didn't
create crap", but only in some places.

I think we should never create /machine as (surprising!) side effect of
qdev_get_machine().  Create it explicitly instead, and have
qdev_get_machine() use object_resolve_path("/machine", NULL) to get it.
Look ma, no side effects.


OK, I'm dropping this one while we discuss it.

I really miss a good explanation why qdev_get_machine_unchecked()
needs to exist.  When exactly do we want /machine to exist but
not be TYPE_MACHINE?  Why?


AFAICT, there is no such "/machine" that is not of type TYPE_MACHINE.

The original qdev_get_machine() would always return a "/container"
object in user-only mode, where there is no TYPE_MACHINE object.


In system emulation mode, it returns the same "/container" object at the
beginning, until we initialize and add a TYPE_MACHINE object to the
"/container" as a child and it would return OBJECT(current_machine)
for later usages.

The starting point is to avoid using the legacy qdev_get_machine()
in system emulation mode when we haven't added the "/machine" object.
As a result, we introduced type checking assertions to avoid premature 
invocations.


In this proposal, the qdev_get_machine_unchecked() is only used
in user-only mode, part of which shares with system emulation mode
(such as device_set_realized, cpu_common_realizefn). The new 
qdev_get_machine() is only used in system emulation mode and type 
checking assertion does reduce the irrational use of this function (if 
any in the future).


We all agree to use this qdev_get_machine() as little as possible
and this patch could make future clean up work easier.



Once the expectations and use cases are explained, we can choose
a better name for qdev_get_machine_unchecked() and document it
properly.






Re: [Qemu-devel] [PATCH 3/9] cpu/topology: add uncommon arch support for smp machine properties

2019-04-16 Thread Like Xu

On 2019/4/8 20:54, Igor Mammedov wrote:

On Fri, 29 Mar 2019 16:48:39 +0800
Like Xu  wrote:

here should be a commit message explaining what patch does
in more detail.



Signed-off-by: Like Xu 


Generic note: try not to call qdev_get_machine() every time
you replace smp_cpus or other variables. It's often possible
to pass MachineState via the call chain with trivial fixups.


Hi Igor,

I have some doubts about this comment after a few attempts.

I'm not sure whether this idea applies to all qdev_get_machine()
usages in the tree or just to this smp-touch-only patch.

Undoing the calls to qdev_get_machine() by modifying the incoming
parameters takes a lot of effort on hook overrides.

The implementation of qdev_get_machine() could hardly be simpler,
and it doesn't seem to add much overhead compared with passing
parameters on the stack.
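
The two styles being weighed here can be put side by side in a sketch.
ToyMachine, toy_global_machine and the two toy_count_cpus_* helpers are
hypothetical names for illustration only; toy_global_machine merely stands
in for whatever a global accessor such as qdev_get_machine() would reach.

```c
typedef struct ToyMachine {
    unsigned int smp_cpus;
} ToyMachine;

/* Stand-in for the single machine object a global accessor returns. */
static ToyMachine *toy_global_machine;

/* Style 1: the helper reaches out to the global on its own. */
static unsigned int toy_count_cpus_global(void)
{
    return toy_global_machine->smp_cpus;
}

/* Style 2: the caller already holds the machine and passes it down. */
static unsigned int toy_count_cpus_param(const ToyMachine *ms)
{
    return ms->smp_cpus;
}
```

The second style needs a signature change in every function on the call
chain, which is the effort mentioned above, but the helper then works on
any machine instance handed to it and has no hidden dependency on global
state.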




---
  hw/alpha/dp264.c | 1 +
  hw/hppa/machine.c| 4 
  hw/mips/boston.c | 1 +
  hw/mips/mips_malta.c | 9 +
  hw/sparc/sun4m.c | 2 ++
  hw/sparc64/sun4u.c   | 2 ++
  hw/xtensa/sim.c  | 1 +
  hw/xtensa/xtfpga.c   | 1 +
  8 files changed, 21 insertions(+)

diff --git a/hw/alpha/dp264.c b/hw/alpha/dp264.c
index 0347eb8..ee5d432 100644
--- a/hw/alpha/dp264.c
+++ b/hw/alpha/dp264.c
@@ -63,6 +63,7 @@ static void clipper_init(MachineState *machine)
  char *palcode_filename;
  uint64_t palcode_entry, palcode_low, palcode_high;
  uint64_t kernel_entry, kernel_low, kernel_high;
+unsigned int smp_cpus = machine->topo.smp_cpus;
  
  /* Create up to 4 cpus.  */

  memset(cpus, 0, sizeof(cpus));
diff --git a/hw/hppa/machine.c b/hw/hppa/machine.c
index d1b1d3c..f652891 100644
--- a/hw/hppa/machine.c
+++ b/hw/hppa/machine.c
@@ -16,6 +16,7 @@
  #include "hw/ide.h"
  #include "hw/timer/i8254.h"
  #include "hw/char/serial.h"
+#include "hw/boards.h"
  #include "hppa_sys.h"
  #include "qemu/units.h"
  #include "qapi/error.h"
@@ -72,6 +73,7 @@ static void machine_hppa_init(MachineState *machine)
  MemoryRegion *ram_region;
  MemoryRegion *cpu_region;
  long i;
+unsigned int smp_cpus = machine->topo.smp_cpus;

I'd prefer to replace smp_cpus with machine->topo.smp_cpus
directly at the places where it's used, as that makes the affected sites
more visible in the patch.
And use a local smp_cpus only in places where using machine->topo.smp_cpus
makes the code less readable.
(but it's just a personal preference so I don't insist on it)

  
  ram_size = machine->ram_size;
  
@@ -242,7 +244,9 @@ static void machine_hppa_init(MachineState *machine)
  
  static void hppa_machine_reset(void)

  {
+MachineState *ms = MACHINE(qdev_get_machine());
  int i;
+unsigned int smp_cpus = ms->topo.smp_cpus;


***)
It would be better to pass MachineState as an argument to
hppa_machine_reset(); a patch to do so should go before this one.

A quick look shows only 3 overrides (hppa, pc, pnv) and one caller,
so I'd rather fix it than call qdev_get_machine() unnecessarily.

  
  qemu_devices_reset();
  
diff --git a/hw/mips/boston.c b/hw/mips/boston.c

index e5bab3c..7752c10 100644
--- a/hw/mips/boston.c
+++ b/hw/mips/boston.c
@@ -434,6 +434,7 @@ static void boston_mach_init(MachineState *machine)
  DriveInfo *hd[6];
  Chardev *chr;
  int fw_size, fit_err;
+unsigned int smp_cpus = machine->topo.smp_cpus;
  bool is_64b;
  
  if ((machine->ram_size % GiB) ||

diff --git a/hw/mips/mips_malta.c b/hw/mips/mips_malta.c
index 439665a..d595375 100644
--- a/hw/mips/mips_malta.c
+++ b/hw/mips/mips_malta.c
@@ -1095,6 +1095,8 @@ static int64_t load_kernel (void)
  
  static void malta_mips_config(MIPSCPU *cpu)

  {
+MachineState *ms = MACHINE(qdev_get_machine());
+unsigned int smp_cpus = ms->topo.smp_cpus;
  CPUMIPSState *env = &cpu->env;
  CPUState *cs = CPU(cpu);

this one is also called from reset, so the same [***] applies here too.

  
@@ -1127,9 +1129,11 @@ static void main_cpu_reset(void *opaque)

  static void create_cpu_without_cps(const char *cpu_type,
 qemu_irq *cbus_irq, qemu_irq *i8259_irq)
  {
+MachineState *ms = MACHINE(qdev_get_machine());

the caller has access to MachineState, so pass it down the call chain all the way


  CPUMIPSState *env;
  MIPSCPU *cpu;
  int i;
+unsigned int smp_cpus = ms->topo.smp_cpus;
  
  for (i = 0; i < smp_cpus; i++) {

  cpu = MIPS_CPU(cpu_create(cpu_type));
@@ -1149,7 +1153,9 @@ static void create_cpu_without_cps(const char *cpu_type,
  static void create_cps(MaltaState *s, const char *cpu_type,
 qemu_irq *cbus_irq, qemu_irq *i8259_irq)
  {
+MachineState *ms = MACHINE(qdev_get_machine());

ditto


  Error *err = NULL;
+unsigned int smp_cpus = ms->topo.smp_cpus;
  
  s->cps = MIPS_CPS(object_new(TYPE_MIPS_CPS));

  qdev_set_parent_bus(DEVICE(s->cps), sysbus_get_default());
@@ -1171,6 +1177,9

[Qemu-devel] [PATCH v3 1/2] vl.c: refactor current_machine as non-global variable

2019-04-15 Thread Like Xu
This patch makes the remaining dozen or so uses of the global
current_machine outside vl.c use qdev_get_machine() instead,
and then makes current_machine local to vl.c instead of global.

Suggested-by: Peter Maydell 
Signed-off-by: Like Xu 
---
 accel/kvm/kvm-all.c | 6 --
 device-hotplug.c| 3 ++-
 device_tree.c   | 3 ++-
 exec.c  | 6 --
 hw/ppc/spapr_rtas.c | 3 ++-
 include/hw/boards.h | 1 -
 migration/savevm.c  | 9 ++---
 qmp.c   | 3 ++-
 target/i386/kvm.c   | 3 ++-
 target/ppc/kvm.c| 3 ++-
 vl.c| 4 ++--
 11 files changed, 28 insertions(+), 16 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 241db49..d103de2 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -140,7 +140,8 @@ static const KVMCapabilityInfo kvm_required_capabilites[] = 
{
 
 int kvm_get_max_memslots(void)
 {
-KVMState *s = KVM_STATE(current_machine->accelerator);
+MachineState *ms = MACHINE(qdev_get_machine());
+KVMState *s = KVM_STATE(ms->accelerator);
 
 return s->nr_slots;
 }
@@ -1519,7 +1520,8 @@ static int kvm_max_vcpu_id(KVMState *s)
 
 bool kvm_vcpu_id_is_valid(int vcpu_id)
 {
-KVMState *s = KVM_STATE(current_machine->accelerator);
+MachineState *ms = MACHINE(qdev_get_machine());
+KVMState *s = KVM_STATE(ms->accelerator);
 return vcpu_id >= 0 && vcpu_id < kvm_max_vcpu_id(s);
 }
 
diff --git a/device-hotplug.c b/device-hotplug.c
index 6153259..d31c1f8 100644
--- a/device-hotplug.c
+++ b/device-hotplug.c
@@ -37,6 +37,7 @@
 
 static DriveInfo *add_init_drive(const char *optstr)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 Error *err = NULL;
 DriveInfo *dinfo;
 QemuOpts *opts;
@@ -46,7 +47,7 @@ static DriveInfo *add_init_drive(const char *optstr)
 if (!opts)
 return NULL;
 
-mc = MACHINE_GET_CLASS(current_machine);
+mc = MACHINE_GET_CLASS(ms);
 dinfo = drive_new(opts, mc->block_default_type, &err);
 if (err) {
 error_report_err(err);
diff --git a/device_tree.c b/device_tree.c
index f8b46b3..3294ef6 100644
--- a/device_tree.c
+++ b/device_tree.c
@@ -459,6 +459,7 @@ int qemu_fdt_setprop_phandle(void *fdt, const char 
*node_path,
 
 uint32_t qemu_fdt_alloc_phandle(void *fdt)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 static int phandle = 0x0;
 
 /*
@@ -466,7 +467,7 @@ uint32_t qemu_fdt_alloc_phandle(void *fdt)
  * which phandle id to start allocating phandles.
  */
 if (!phandle) {
-phandle = machine_phandle_start(current_machine);
+phandle = machine_phandle_start(ms);
 }
 
 if (!phandle) {
diff --git a/exec.c b/exec.c
index 6ab62f4..15ff2b1 100644
--- a/exec.c
+++ b/exec.c
@@ -1969,10 +1969,11 @@ static unsigned long last_ram_page(void)
 
 static void qemu_ram_setup_dump(void *addr, ram_addr_t size)
 {
+MachineState *ms = MACHINE(qdev_get_machine());
 int ret;
 
 /* Use MADV_DONTDUMP, if user doesn't want the guest memory in the core */
-if (!machine_dump_guest_core(current_machine)) {
+if (!machine_dump_guest_core(ms)) {
 ret = qemu_madvise(addr, size, QEMU_MADV_DONTDUMP);
 if (ret) {
 perror("qemu_madvise");
@@ -2094,7 +2095,8 @@ size_t qemu_ram_pagesize_largest(void)
 
 static int memory_try_enable_merging(void *addr, size_t len)
 {
-if (!machine_mem_merge(current_machine)) {
+MachineState *ms = MACHINE(qdev_get_machine());
+if (!machine_mem_merge(ms)) {
 /* disabled by the user */
 return 0;
 }
diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index 24c45b1..51e320d 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -231,6 +231,7 @@ static void rtas_ibm_get_system_parameter(PowerPCCPU *cpu,
   target_ulong args,
   uint32_t nret, target_ulong rets)
 {
+MachineState *ms = MACHINE(spapr);
 target_ulong parameter = rtas_ld(args, 0);
 target_ulong buffer = rtas_ld(args, 1);
 target_ulong length = rtas_ld(args, 2);
@@ -243,7 +244,7 @@ static void rtas_ibm_get_system_parameter(PowerPCCPU *cpu,
   "DesProcs=%d,"
   "MaxPlatProcs=%d",
   max_cpus,
-  current_machine->ram_size / MiB,
+  ms->ram_size / MiB,
   smp_cpus,
   max_cpus);
 ret = sysparm_st(buffer, length, param_val, strlen(param_val) + 1);
diff --git a/include/hw/boards.h b/include/hw/boards.h
index e231860..1d598c8 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -58,7 +58,6 @@ void memory_region_allocate_system_memory(MemoryRegion *mr, 
Object *owner,
 OBJECT_CLASS_CHECK(MachineClas

[Qemu-devel] [PATCH v3 2/2] core/qdev: refactor qdev_get_machine() with type assertion

2019-04-15 Thread Like Xu
To avoid the misuse of qdev_get_machine() if machine hasn't been created yet,
this patch uses qdev_get_machine_uncheck() for obj-common (share with user-only
mode) and adds type assertion to qdev_get_machine() in system-emulation mode.

Suggested-by: Igor Mammedov 
Signed-off-by: Like Xu 
---
 hw/core/qdev.c | 16 +---
 include/hw/qdev-core.h |  1 +
 qom/cpu.c  |  5 +++--
 3 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index f9b6efe..8232216 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -223,7 +223,7 @@ HotplugHandler *qdev_get_machine_hotplug_handler(DeviceState *dev)
 {
 MachineState *machine;
 MachineClass *mc;
-Object *m_obj = qdev_get_machine();
+Object *m_obj = qdev_get_machine_uncheck();
 
 if (object_dynamic_cast(m_obj, TYPE_MACHINE)) {
 machine = MACHINE(m_obj);
@@ -815,7 +815,7 @@ static void device_set_realized(Object *obj, bool value, Error **errp)
 if (!obj->parent) {
 gchar *name = g_strdup_printf("device[%d]", unattached_count++);
 
-object_property_add_child(container_get(qdev_get_machine(),
+object_property_add_child(container_get(qdev_get_machine_uncheck(),
 "/unattached"),
  name, obj, &error_abort);
 unattached_parent = true;
@@ -1095,7 +1095,7 @@ void device_reset(DeviceState *dev)
 }
 }
 
-Object *qdev_get_machine(void)
+Object *qdev_get_machine_uncheck(void)
 {
 static Object *dev;
 
@@ -1106,6 +1106,16 @@ Object *qdev_get_machine(void)
 return dev;
 }
 
+Object *qdev_get_machine(void)
+{
+static Object *dev;
+
+dev = qdev_get_machine_uncheck();
+assert(object_dynamic_cast(dev, TYPE_MACHINE) != NULL);
+
+return dev;
+}
+
 static const TypeInfo device_type_info = {
 .name = TYPE_DEVICE,
 .parent = TYPE_OBJECT,
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index 33ed3b8..e7c6a5a 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -429,6 +429,7 @@ const struct VMStateDescription *qdev_get_vmsd(DeviceState *dev);
 
 const char *qdev_fw_name(DeviceState *dev);
 
+Object *qdev_get_machine_uncheck(void);
 Object *qdev_get_machine(void);
 
 /* FIXME: make this a link<> */
diff --git a/qom/cpu.c b/qom/cpu.c
index a8d2958..bb877d5 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -325,9 +325,10 @@ static void cpu_common_parse_features(const char *typename, char *features,
 static void cpu_common_realizefn(DeviceState *dev, Error **errp)
 {
 CPUState *cpu = CPU(dev);
-Object *machine = qdev_get_machine();
+Object *machine = qdev_get_machine_uncheck();
 
-/* qdev_get_machine() can return something that's not TYPE_MACHINE
+/*
+ * qdev_get_machine_uncheck() can return something that's not TYPE_MACHINE
  * if this is one of the user-only emulators; in that case there's
  * no need to check the ignore_memory_transaction_failures board flag.
  */
-- 
1.8.3.1
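
The checked/unchecked accessor split described in the patch above can be
illustrated outside QEMU. The following is only a minimal sketch under stated
assumptions: DemoObject, demo_dynamic_cast(), demo_get_machine_uncheck() and
demo_get_machine() are hypothetical stand-ins invented for this example, the
cast is an exact type-name match with no inheritance, and none of it is QEMU's
actual QOM implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a QOM object: it only carries a type name. */
typedef struct DemoObject {
    const char *type_name;
} DemoObject;

#define DEMO_TYPE_MACHINE   "machine"
#define DEMO_TYPE_CONTAINER "container"

/* Whatever currently sits at the "machine" slot; may not be a machine. */
static DemoObject *demo_root;

/* Simplified dynamic cast: exact type-name match, no class hierarchy. */
static DemoObject *demo_dynamic_cast(DemoObject *obj, const char *type_name)
{
    if (obj != NULL && strcmp(obj->type_name, type_name) == 0) {
        return obj;
    }
    return NULL;
}

/* Unchecked accessor: callers must test the type themselves. */
static DemoObject *demo_get_machine_uncheck(void)
{
    return demo_root;
}

/* Checked accessor: aborts if the slot does not hold a real machine. */
static DemoObject *demo_get_machine(void)
{
    DemoObject *obj = demo_get_machine_uncheck();

    assert(demo_dynamic_cast(obj, DEMO_TYPE_MACHINE) != NULL);
    return obj;
}
```

In this sketch, code that may run without a machine (the user-only case in the
patch) would call demo_get_machine_uncheck() and test the result with
demo_dynamic_cast(), while code that requires a machine would call
demo_get_machine() and fail early if it is called too soon.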




[Qemu-devel] [PATCH v3 0/2] vl.c: make current_machine as non-global variable

2019-04-15 Thread Like Xu
This series makes the remaining dozen or so uses of the global
current_machine outside vl.c use qdev_get_machine() instead,
and then makes current_machine local to vl.c instead of global.

With the type assertion in qdev_get_machine(), it becomes hard to
misuse this function before the machine has been created.
For obj-common cases, qdev_get_machine_uncheck() is used instead,
with no semantic change.

---
Changes in v3:
- add TYPE_MACHINE assertion for qdev_get_machine() usage
- apply qdev_get_machine_uncheck() for obj-common usage
Changes in v2:
- make the variable current_machine "static" (Thomas Huth)

Like Xu (2):
  vl.c: refactor current_machine as non-global variable
  core/qdev: refactor qdev_get_machine() with type assertion

 accel/kvm/kvm-all.c|  6 --
 device-hotplug.c   |  3 ++-
 device_tree.c  |  3 ++-
 exec.c |  6 --
 hw/core/qdev.c | 16 +---
 hw/ppc/spapr_rtas.c|  3 ++-
 include/hw/boards.h|  1 -
 include/hw/qdev-core.h |  1 +
 migration/savevm.c |  9 ++---
 qmp.c  |  3 ++-
 qom/cpu.c  |  5 +++--
 target/i386/kvm.c  |  3 ++-
 target/ppc/kvm.c   |  3 ++-
 vl.c   |  4 ++--
 14 files changed, 45 insertions(+), 21 deletions(-)

-- 
1.8.3.1




Re: [Qemu-devel] [PATCH 0/9] refactor cpu topo into machine properties

2019-04-08 Thread Like Xu

On 2019/4/8 21:26, Igor Mammedov wrote:

On Thu, 4 Apr 2019 11:26:09 +0800
Like Xu  wrote:


On 2019/3/29 18:21, Igor Mammedov wrote:

On Fri, 29 Mar 2019 16:48:36 +0800
Like Xu  wrote:
   

This patch series turns the existing cores/threads/sockets into machine
properties and gets rid of the global variables they currently use.

Thanks for looking into it!
It's a long overdue and much desired conversion (albeit a naive one,
but this series is a good starting point). I'll go over your patches
next week with comments and concrete suggestions on how to implement
particular things.


Hi Igor, any comments and suggestions on smp machine properties
in this patch considering we may add die topology for PCMachine as an
extension?


I've looked at several patches and that's it for this series.
Most of the comments apply to the patches I've not reviewed as well.


Hi Igor, thanks for your comments, time and patience.

I'll try to fix them in the next version ASAP.



   

Like Xu (9):
cpu/topology: add struct CpuTopology to MachineState
cpu/topology: add general support for machine properties
cpu/topology: add uncommon arch support for smp machine properties
cpu/topology: add ARM support for smp machine properties
cpu/topology: add i386 support for smp machine properties
cpu/topology: add PPC support for smp machine properties
cpu/topology: add riscv support for smp machine properties
cpu/topology: add s390x support for smp machine properties
cpu/topology: replace smp global variables with machine properties

   accel/kvm/kvm-all.c  |  3 +++
   backends/hostmem.c   |  4 
   cpus.c   |  4 
   exec.c   |  2 ++
   gdbstub.c|  7 ++-
   hw/alpha/dp264.c |  1 +
   hw/arm/fsl-imx6.c|  5 +
   hw/arm/fsl-imx6ul.c  |  5 +
   hw/arm/fsl-imx7.c|  5 +
   hw/arm/highbank.c|  1 +
   hw/arm/mcimx6ul-evk.c|  1 +
   hw/arm/mcimx7d-sabre.c   |  3 +++
   hw/arm/raspi.c   |  2 ++
   hw/arm/realview.c|  1 +
   hw/arm/sabrelite.c   |  1 +
   hw/arm/vexpress.c|  3 +++
   hw/arm/virt.c|  7 +++
   hw/arm/xlnx-zynqmp.c |  7 +++
   hw/cpu/core.c|  3 +++
   hw/hppa/machine.c|  4 
   hw/i386/acpi-build.c |  3 +++
   hw/i386/kvmvapic.c   |  5 +
   hw/i386/pc.c | 12 +++
   hw/mips/boston.c |  1 +
   hw/mips/mips_malta.c |  9 +
   hw/openrisc/openrisc_sim.c   |  1 +
   hw/ppc/e500.c|  3 +++
   hw/ppc/mac_newworld.c|  2 ++
   hw/ppc/mac_oldworld.c|  2 ++
   hw/ppc/pnv.c |  3 +++
   hw/ppc/prep.c|  2 ++
   hw/ppc/spapr.c   | 29 ++
   hw/ppc/spapr_rtas.c  |  3 +++
   hw/riscv/sifive_e.c  |  4 
   hw/riscv/sifive_plic.c   |  3 +++
   hw/riscv/sifive_u.c  |  4 
   hw/riscv/spike.c |  2 ++
   hw/riscv/virt.c  |  1 +
   hw/s390x/s390-virtio-ccw.c   |  2 ++
   hw/s390x/sclp.c  |  1 +
   hw/smbios/smbios.c   | 11 ++
   hw/sparc/sun4m.c |  2 ++
   hw/sparc64/sun4u.c   |  2 ++
   hw/xtensa/sim.c  |  1 +
   hw/xtensa/xtfpga.c   |  1 +
   include/hw/arm/virt.h|  2 +-
   include/hw/boards.h  |  8 
   include/sysemu/sysemu.h  |  2 +-
   migration/postcopy-ram.c |  7 +++
   numa.c   |  1 +
   target/arm/cpu.c |  7 +++
   target/i386/cpu.c|  4 
   target/openrisc/sys_helper.c |  5 +
   target/s390x/cpu.c   |  3 +++
   target/s390x/excp_helper.c   |  6 ++
   tcg/tcg.c| 15 ++
   vl.c | 48 

   57 files changed, 261 insertions(+), 25 deletions(-)
  



   










