Re: [PATCH 2/8] KVM: x86 emulator: use aligned variants of SSE register ops

2012-09-05, Avi Kivity
On 09/04/2012 03:51 PM, Mathias Krause wrote:
> On Tue, Sep 4, 2012 at 2:13 PM, Avi Kivity a...@redhat.com wrote:
>> On 09/04/2012 03:09 PM, Avi Kivity wrote:
>>> On 08/30/2012 02:30 AM, Mathias Krause wrote:
>>>> As the compiler ensures that the memory operand is always aligned
>>>> to a 16-byte memory location,
>>>
>>> I'm not sure it does.  Is V4SI aligned?  Do we use alignof() to
>>> propagate the alignment to the vcpu allocation code?
>
> I checked that too by introducing a dummy char member in struct operand
> that would have misaligned vec_val but, indeed, the compiler ensured
> it's still 16-byte aligned.

Ok.

>
>> We actually do.  But please rebase the series against next; I got some
>> conflicts while applying.
>
> If "next" means kvm/next
> (i.e. git://git.kernel.org/pub/scm/virt/kvm/kvm.git#next) here, the
> whole series applies cleanly for me.
> HEAD in kvm/next is 9a78197 ("KVM: x86: remove unused variable from
> kvm_task_switch()") here. The series was built against kvm/next back
> when a81aba1 ("KVM: VMX: Ignore segment G and D bits when considering
> whether we can virtualize") was HEAD of this branch, though.
>
> Could you please retry and show me the conflicts you get?

I tried again and it applies cleanly now, so it must have been a user
error earlier.

All applied, thanks.

-- 
error compiling committee.c: too many arguments to function


Re: [PATCH 2/8] KVM: x86 emulator: use aligned variants of SSE register ops

2012-09-04, Avi Kivity
On 08/30/2012 02:30 AM, Mathias Krause wrote:
> As the compiler ensures that the memory operand is always aligned
> to a 16-byte memory location,

I'm not sure it does.  Is V4SI aligned?  Do we use alignof() to
propagate the alignment to the vcpu allocation code?

> use the aligned variant of MOVDQ for
> read_sse_reg() and write_sse_reg().
>
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index 1451cff..5a0fee1 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -909,23 +909,23 @@ static void read_sse_reg(struct x86_emulate_ctxt *ctxt, sse128_t *data, int reg)
>  {
>  	ctxt->ops->get_fpu(ctxt);
>  	switch (reg) {
> -	case 0: asm("movdqu %%xmm0, %0" : "=m"(*data)); break;
> -	case 1: asm("movdqu %%xmm1, %0" : "=m"(*data)); break;
> -	case 2: asm("movdqu %%xmm2, %0" : "=m"(*data)); break;
> -	case 3: asm("movdqu %%xmm3, %0" : "=m"(*data)); break;
> -	case 4: asm("movdqu %%xmm4, %0" : "=m"(*data)); break;
> -	case 5: asm("movdqu %%xmm5, %0" : "=m"(*data)); break;
> -	case 6: asm("movdqu %%xmm6, %0" : "=m"(*data)); break;
> -	case 7: asm("movdqu %%xmm7, %0" : "=m"(*data)); break;
> +	case 0: asm("movdqa %%xmm0, %0" : "=m"(*data)); break;
> +	case 1: asm("movdqa %%xmm1, %0" : "=m"(*data)); break;
> +	case 2: asm("movdqa %%xmm2, %0" : "=m"(*data)); break;
> +	case 3: asm("movdqa %%xmm3, %0" : "=m"(*data)); break;
> +	case 4: asm("movdqa %%xmm4, %0" : "=m"(*data)); break;
> +	case 5: asm("movdqa %%xmm5, %0" : "=m"(*data)); break;
> +	case 6: asm("movdqa %%xmm6, %0" : "=m"(*data)); break;
> +	case 7: asm("movdqa %%xmm7, %0" : "=m"(*data)); break;
>  #ifdef CONFIG_X86_64
> -	case 8: asm("movdqu %%xmm8, %0" : "=m"(*data)); break;
> -	case 9: asm("movdqu %%xmm9, %0" : "=m"(*data)); break;
> -	case 10: asm("movdqu %%xmm10, %0" : "=m"(*data)); break;
> -	case 11: asm("movdqu %%xmm11, %0" : "=m"(*data)); break;
> -	case 12: asm("movdqu %%xmm12, %0" : "=m"(*data)); break;
> -	case 13: asm("movdqu %%xmm13, %0" : "=m"(*data)); break;
> -	case 14: asm("movdqu %%xmm14, %0" : "=m"(*data)); break;
> -	case 15: asm("movdqu %%xmm15, %0" : "=m"(*data)); break;
> +	case 8: asm("movdqa %%xmm8, %0" : "=m"(*data)); break;
> +	case 9: asm("movdqa %%xmm9, %0" : "=m"(*data)); break;
> +	case 10: asm("movdqa %%xmm10, %0" : "=m"(*data)); break;
> +	case 11: asm("movdqa %%xmm11, %0" : "=m"(*data)); break;
> +	case 12: asm("movdqa %%xmm12, %0" : "=m"(*data)); break;
> +	case 13: asm("movdqa %%xmm13, %0" : "=m"(*data)); break;
> +	case 14: asm("movdqa %%xmm14, %0" : "=m"(*data)); break;
> +	case 15: asm("movdqa %%xmm15, %0" : "=m"(*data)); break;
>  #endif
>  	default: BUG();


The vmexit cost dominates any win here by several orders of magnitude.
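As an aside on the alignment question above, here is a minimal,
stand-alone sketch of the property being relied on (the type name is
illustrative, not taken from the kernel sources): GCC gives a vector
type declared with vector_size(16) a 16-byte alignment, which is
exactly what the alignment-requiring movdqa form needs, and that can
be checked at build time:

/* Illustrative only: a 16-byte GCC vector type plus a C11 build-time
 * check that it indeed carries 16-byte alignment. */
typedef unsigned int v4si_demo __attribute__((vector_size(16)));

_Static_assert(__alignof__(v4si_demo) == 16,
	       "a vector_size(16) type should be 16-byte aligned");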


-- 
error compiling committee.c: too many arguments to function


Re: [PATCH 2/8] KVM: x86 emulator: use aligned variants of SSE register ops

2012-09-04, Avi Kivity
On 09/04/2012 03:09 PM, Avi Kivity wrote:
> On 08/30/2012 02:30 AM, Mathias Krause wrote:
>> As the compiler ensures that the memory operand is always aligned
>> to a 16-byte memory location,
>
> I'm not sure it does.  Is V4SI aligned?  Do we use alignof() to
> propagate the alignment to the vcpu allocation code?

We actually do.  But please rebase the series against next; I got some
conflicts while applying.
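(The "we actually do" refers to propagating the operand's alignment to
the vcpu allocation. As a rough, purely illustrative userspace analogue
of that pattern, with hypothetical names rather than KVM's actual
allocation code, the allocator is handed the containing structure's
alignment so the embedded 16-byte member stays aligned:)

/* Hypothetical userspace analogue: pass the containing structure's
 * alignment to the allocator so the embedded 16-byte vector member is
 * properly aligned wherever the object ends up. */
#include <stdalign.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef unsigned int sse128_demo __attribute__((vector_size(16)));

struct vcpu_demo {
	char state[24];        /* arbitrary leading members                 */
	sse128_demo vec_val;   /* needs 16-byte alignment for movdqa access */
};

int main(void)
{
	/* aligned_alloc() takes the alignment explicitly; a kernel slab
	 * cache can likewise be created with the structure's alignment. */
	struct vcpu_demo *v = aligned_alloc(alignof(struct vcpu_demo),
					    sizeof(struct vcpu_demo));
	if (!v)
		return 1;
	printf("alignof(struct vcpu_demo) = %zu\n", alignof(struct vcpu_demo));
	printf("vec_val misalignment      = %zu\n",
	       (size_t)((uintptr_t)&v->vec_val % 16));
	free(v);
	return 0;
}

With the allocation aligned this way, the movdqa in the quoted patch
never faults on the operand.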


-- 
error compiling committee.c: too many arguments to function


Re: [PATCH 2/8] KVM: x86 emulator: use aligned variants of SSE register ops

2012-09-04, Mathias Krause
On Tue, Sep 4, 2012 at 2:13 PM, Avi Kivity a...@redhat.com wrote:
> On 09/04/2012 03:09 PM, Avi Kivity wrote:
>> On 08/30/2012 02:30 AM, Mathias Krause wrote:
>>> As the compiler ensures that the memory operand is always aligned
>>> to a 16-byte memory location,
>>
>> I'm not sure it does.  Is V4SI aligned?  Do we use alignof() to
>> propagate the alignment to the vcpu allocation code?

I checked that too by introducing a dummy char member in struct operand
that would have misaligned vec_val but, indeed, the compiler ensured
it's still 16-byte aligned.
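(A stand-alone version of that check could look like the following;
struct and member names are stand-ins for the real struct operand:)

#include <stddef.h>
#include <stdio.h>

typedef unsigned int sse128_demo __attribute__((vector_size(16)));

struct operand_demo {
	char dummy;          /* deliberately placed to try to misalign ... */
	sse128_demo vec_val; /* ... but the compiler pads, keeping this
				member's offset a multiple of 16.         */
};

int main(void)
{
	printf("offsetof(vec_val)    = %zu\n",
	       offsetof(struct operand_demo, vec_val));
	printf("alignof(sse128_demo) = %zu\n", __alignof__(sse128_demo));
	return 0;
}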


> We actually do.  But please rebase the series against next; I got some
> conflicts while applying.

If "next" means kvm/next
(i.e. git://git.kernel.org/pub/scm/virt/kvm/kvm.git#next) here, the
whole series applies cleanly for me.
HEAD in kvm/next is 9a78197 ("KVM: x86: remove unused variable from
kvm_task_switch()") here. The series was built against kvm/next back
when a81aba1 ("KVM: VMX: Ignore segment G and D bits when considering
whether we can virtualize") was HEAD of this branch, though.

Could you please retry and show me the conflicts you get?


Regards,
Mathias