Re: [PATCH v3 5/7] KVM: PPC: reimplements LOAD_VSX/STORE_VSX instruction mmio emulation with analyse_instr() input

2018-05-22 Thread Simon Guo
Hi Paul,
On Tue, May 22, 2018 at 07:41:51PM +1000, Paul Mackerras wrote:
> On Mon, May 21, 2018 at 01:24:24PM +0800, wei.guo.si...@gmail.com wrote:
> > From: Simon Guo 
> > 
> > This patch reimplements LOAD_VSX/STORE_VSX instruction MMIO emulation with
> > analyse_instr() input. It utilizes the VSX_FPCONV/VSX_SPLAT/SIGNEXT flags
> > exported by analyse_instr() and handles them accordingly.
> > 
> > When emulating a VSX store, the VSX register needs to be flushed first so
> > that the correct register value can be retrieved before writing to MMIO memory.
> 
> When I tested this patch set with the MMIO emulation test program I
> have, I got a host crash on the first test that used a VSX instruction
> with a register number >= 32, that is, a VMX register.  The crash was
> that it hit the BUG() at line 1193 of arch/powerpc/kvm/powerpc.c.
> 
> The reason it hit the BUG() is that vcpu->arch.io_gpr was 0xa3.
> What's happening here is that analyse_instr gives register numbers
> in the range 32 - 63 for VSX instructions which access VMX registers.
> When 35 is ORed with 0x80 (KVM_MMIO_REG_VSX) we get 0xa3.
> 
> The old code didn't pass the high bit of the register number to
> kvmppc_handle_vsx_load/store, but instead passed it via the
> vcpu->arch.mmio_vsx_tx_sx_enabled field.  With your patch set we still
> set and use that field, so the patch below on top of your patches is
> the quick fix.  Ideally we would get rid of that field and just use
> the high (0x20) bit of the register number instead, but that can be
> cleaned up later.
> 
> If you like, I will fold the patch below into this patch and push the
> series to my kvm-ppc-next branch.
> 
> Paul.
Sorry my tests missed this kind of case. Please go ahead and fold the patch
as you suggested.  Thanks for pointing it out.

If you like, I can do the clean-up work. If I understand correctly,
we need to expand io_gpr from u8 to u16 so that the register number can
use 6 bits while leaving room for the other register flag bits.

BR,
- Simon

> ---
> diff --git a/arch/powerpc/kvm/emulate_loadstore.c b/arch/powerpc/kvm/emulate_loadstore.c
> index 0165fcd..afde788 100644
> --- a/arch/powerpc/kvm/emulate_loadstore.c
> +++ b/arch/powerpc/kvm/emulate_loadstore.c
> @@ -242,8 +242,8 @@ int kvmppc_emulate_loadstore(struct kvm_vcpu *vcpu)
>   }
>  
>   emulated = kvmppc_handle_vsx_load(run, vcpu,
> - KVM_MMIO_REG_VSX|op.reg, io_size_each,
> - 1, op.type & SIGNEXT);
> + KVM_MMIO_REG_VSX | (op.reg & 0x1f),
> + io_size_each, 1, op.type & SIGNEXT);
>   break;
>   }
>  #endif
> @@ -363,7 +363,7 @@ int kvmppc_emulate_loadstore(struct kvm_vcpu *vcpu)
>   }
>  
>   emulated = kvmppc_handle_vsx_store(run, vcpu,
> - op.reg, io_size_each, 1);
> + op.reg & 0x1f, io_size_each, 1);
>   break;
>   }
>  #endif


Re: [PATCH v3 5/7] KVM: PPC: reimplements LOAD_VSX/STORE_VSX instruction mmio emulation with analyse_instr() input

2018-05-22 Thread Paul Mackerras
On Mon, May 21, 2018 at 01:24:24PM +0800, wei.guo.si...@gmail.com wrote:
> From: Simon Guo 
> 
> This patch reimplements LOAD_VSX/STORE_VSX instruction MMIO emulation with
> analyse_instr() input. It utilizes the VSX_FPCONV/VSX_SPLAT/SIGNEXT flags
> exported by analyse_instr() and handles them accordingly.
> 
> When emulating a VSX store, the VSX register needs to be flushed first so
> that the correct register value can be retrieved before writing to MMIO memory.

When I tested this patch set with the MMIO emulation test program I
have, I got a host crash on the first test that used a VSX instruction
with a register number >= 32, that is, a VMX register.  The crash was
that it hit the BUG() at line 1193 of arch/powerpc/kvm/powerpc.c.

The reason it hit the BUG() is that vcpu->arch.io_gpr was 0xa3.
What's happening here is that analyse_instr gives register numbers
in the range 32 - 63 for VSX instructions which access VMX registers.
When 35 is ORed with 0x80 (KVM_MMIO_REG_VSX) we get 0xa3.

The old code didn't pass the high bit of the register number to
kvmppc_handle_vsx_load/store, but instead passed it via the
vcpu->arch.mmio_vsx_tx_sx_enabled field.  With your patch set we still
set and use that field, so the patch below on top of your patches is
the quick fix.  Ideally we would get rid of that field and just use
the high (0x20) bit of the register number instead, but that can be
cleaned up later.

If you like, I will fold the patch below into this patch and push the
series to my kvm-ppc-next branch.

Paul.
---
diff --git a/arch/powerpc/kvm/emulate_loadstore.c b/arch/powerpc/kvm/emulate_loadstore.c
index 0165fcd..afde788 100644
--- a/arch/powerpc/kvm/emulate_loadstore.c
+++ b/arch/powerpc/kvm/emulate_loadstore.c
@@ -242,8 +242,8 @@ int kvmppc_emulate_loadstore(struct kvm_vcpu *vcpu)
}
 
emulated = kvmppc_handle_vsx_load(run, vcpu,
-   KVM_MMIO_REG_VSX|op.reg, io_size_each,
-   1, op.type & SIGNEXT);
+   KVM_MMIO_REG_VSX | (op.reg & 0x1f),
+   io_size_each, 1, op.type & SIGNEXT);
break;
}
 #endif
@@ -363,7 +363,7 @@ int kvmppc_emulate_loadstore(struct kvm_vcpu *vcpu)
}
 
emulated = kvmppc_handle_vsx_store(run, vcpu,
-   op.reg, io_size_each, 1);
+   op.reg & 0x1f, io_size_each, 1);
break;
}
 #endif


[PATCH v3 5/7] KVM: PPC: reimplements LOAD_VSX/STORE_VSX instruction mmio emulation with analyse_instr() input

2018-05-21 Thread wei . guo . simon
From: Simon Guo 

This patch reimplements LOAD_VSX/STORE_VSX instruction MMIO emulation with
analyse_instr() input. It utilizes the VSX_FPCONV/VSX_SPLAT/SIGNEXT flags
exported by analyse_instr() and handles them accordingly.

When emulating a VSX store, the VSX register needs to be flushed first so
that the correct register value can be retrieved before writing to MMIO memory.

Suggested-by: Paul Mackerras 
Signed-off-by: Simon Guo 
---
 arch/powerpc/kvm/emulate_loadstore.c | 227 ++-
 1 file changed, 91 insertions(+), 136 deletions(-)

diff --git a/arch/powerpc/kvm/emulate_loadstore.c b/arch/powerpc/kvm/emulate_loadstore.c
index 5d38f95..ed73497 100644
--- a/arch/powerpc/kvm/emulate_loadstore.c
+++ b/arch/powerpc/kvm/emulate_loadstore.c
@@ -158,6 +158,54 @@ int kvmppc_emulate_loadstore(struct kvm_vcpu *vcpu)
 
break;
 #endif
+#ifdef CONFIG_VSX
+   case LOAD_VSX: {
+   int io_size_each;
+
+   if (op.vsx_flags & VSX_CHECK_VEC) {
+   if (kvmppc_check_altivec_disabled(vcpu))
+   return EMULATE_DONE;
+   } else {
+   if (kvmppc_check_vsx_disabled(vcpu))
+   return EMULATE_DONE;
+   }
+
+   if (op.vsx_flags & VSX_FPCONV)
+   vcpu->arch.mmio_sp64_extend = 1;
+
+   if (op.element_size == 8)  {
+   if (op.vsx_flags & VSX_SPLAT)
+   vcpu->arch.mmio_vsx_copy_type =
+   KVMPPC_VSX_COPY_DWORD_LOAD_DUMP;
+   else
+   vcpu->arch.mmio_vsx_copy_type =
+   KVMPPC_VSX_COPY_DWORD;
+   } else if (op.element_size == 4) {
+   if (op.vsx_flags & VSX_SPLAT)
+   vcpu->arch.mmio_vsx_copy_type =
+   KVMPPC_VSX_COPY_WORD_LOAD_DUMP;
+   else
+   vcpu->arch.mmio_vsx_copy_type =
+   KVMPPC_VSX_COPY_WORD;
+   } else
+   break;
+
+   if (size < op.element_size) {
+   /* precision convert case: lxsspx, etc */
+   vcpu->arch.mmio_vsx_copy_nums = 1;
+   io_size_each = size;
+   } else { /* lxvw4x, lxvd2x, etc */
+   vcpu->arch.mmio_vsx_copy_nums =
+   size/op.element_size;
+   io_size_each = op.element_size;
+   }
+
+   emulated = kvmppc_handle_vsx_load(run, vcpu,
+   KVM_MMIO_REG_VSX|op.reg, io_size_each,
+   1, op.type & SIGNEXT);
+   break;
+   }
+#endif
case STORE:
/* if need byte reverse, op.val has been reversed by
 * analyse_instr().
@@ -193,6 +241,49 @@ int kvmppc_emulate_loadstore(struct kvm_vcpu *vcpu)
 
break;
 #endif
+#ifdef CONFIG_VSX
+   case STORE_VSX: {
+   int io_size_each;
+
+   if (op.vsx_flags & VSX_CHECK_VEC) {
+   if (kvmppc_check_altivec_disabled(vcpu))
+   return EMULATE_DONE;
+   } else {
+   if (kvmppc_check_vsx_disabled(vcpu))
+   return EMULATE_DONE;
+   }
+
+   if (vcpu->kvm->arch.kvm_ops->giveup_ext)
+   vcpu->kvm->arch.kvm_ops->giveup_ext(vcpu,
+   MSR_VSX);
+
+   if (op.vsx_flags & VSX_FPCONV)
+   vcpu->arch.mmio_sp64_extend = 1;
+
+   if (op.element_size == 8)
+   vcpu->arch.mmio_vsx_copy_type =
+   KVMPPC_VSX_COPY_DWORD;
+   else if (op.element_size == 4)
+   vcpu->arch.mmio_vsx_copy_type =
+   KVMPPC_VSX_COPY_WORD;
+   else
+   break;
+
+   if (size < op.element_size) {
+   /* precise conversion case, like stxsspx */
+   vcpu->arch.mmio_vsx_copy_nums = 1;