Re: [RFC PATCH] KVM: arm/arm64: Don't let userspace update CNTVOFF once guest is running

2015-06-25 Thread Jan Kiszka
On 2015-06-25 11:25, Claudio Fontana wrote:
> On 25.06.2015 11:10, Peter Maydell wrote:
>> On 25 June 2015 at 09:59, Claudio Fontana  wrote:
>>> Once the VM is created, I think QEMU should not request kvm to
>>> change the virtual offset of the VM anymore: maybe an unexpected
>>> consequence of QEMU's target-arm/kvm64.c::kvm_arch_put_registers ?
>>
>> Hmm. In general we assume that we can:
>>  * stop the VM
>>  * read all the guest system registers
>>  * write those values back again
>>  * restart the VM
>>
>> if we need to. Is that what's happening here, or are we doing
>> something odder?
>>
>> -- PMM
>>
> 
> What I guess could be happening by looking at the code in linux
> 
> virt/kvm/arm/arch_timer.c::kvm_arm_timer_set_reg
> 
> is that QEMU tries to set the KVM_REG_ARM_TIMER_CNT register from exactly the 
> previous value,
> but just because of the fact that the set function is called, cntvoff is 
> updated,
> since the value provided by the user is apparently assumed to be _relative_ 
> to the physical timer.
> 
> This is apparent to me in the code in that function which says:
> 
> case KVM_REG_ARM_TIMER_CNT: {
> /* ... */
> u64 cntvoff = kvm_phys_timer_read() - value;
> /* ... */
> }
> 
> And this is matched by the corresponding get function kvm_arm_timer_get_reg 
> where it says:
> 
> case KVM_REG_ARM_TIMER_CNT:
>return kvm_phys_timer_read() - vcpu->kvm->arch.timer.cntvoff;
> 
> The time difference between when the GET is issued by QEMU and when the PUT 
> is issued then would account for the difference.

QEMU has the concept of write-back levels: KVM_PUT_RUNTIME_STATE,
KVM_PUT_RESET_STATE and KVM_PUT_FULL_STATE. I suspect this registers is
just sorted into the wrong category, thus written as part of the
RUNTIME_STATE. We had such bug patterns during the x86 maturing phase as
well.

Jan



signature.asc
Description: OpenPGP digital signature


Re: [PATCH v3 4/4] KVM: x86: Add support for local interrupt requests from userspace

2015-06-25 Thread Steve Rutherford
> However, why is the roundtrip to userspace necessary?  Could you pass
> the extint index directly as an argument to KVM_INTERRUPT?  It's
> backwards-compatible, because KVM_INTERRUPT so far could not be used
> together with an in-kernel LAPIC.  If you could do that, you could also
> avoid the new userspace_extint_available field.

Implemented a basic version of this, and ran into some potential
issues with this strategy. Supporting PIC masking/unmasking by the
CPU/APIC means that either:
A) PIC interrupts need to be bufferable in the kernel (with some way
   of comparing priorities).
B) the APIC state needs to be read in order to fetch the bit as to
   whether or not the PIC is being masked (which I believe can be done
   from userspace via the APIC state ioctl).
C) something hacky that doesn't conform to the PIC spec but still
   happens to boot common OSes (like buffering the interrupts and
   injecting them in the order of arrival (which is wrong)).

Steve
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3] arm/arm64: speed up spinlocks and atomic ops

2015-06-25 Thread Andrew Jones
spinlock torture tests made it clear that checking mmu_enabled()
every time we call spin_lock is a bad idea. As most tests will
want the MMU enabled the entire time, then we can inline a light
weight "nobody disabled the mmu" check, and bail out early.

Adding a light weight inlined check vs. a compile-time flag
suggested by Paolo.

Signed-off-by: Andrew Jones 
---
 lib/arm/asm/mmu-api.h | 7 ++-
 lib/arm/mmu.c | 9 +++--
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/lib/arm/asm/mmu-api.h b/lib/arm/asm/mmu-api.h
index 12fdc57918c27..b2fc900a30add 100644
--- a/lib/arm/asm/mmu-api.h
+++ b/lib/arm/asm/mmu-api.h
@@ -1,7 +1,12 @@
 #ifndef __ASMARM_MMU_API_H_
 #define __ASMARM_MMU_API_H_
 extern pgd_t *mmu_idmap;
-extern bool mmu_enabled(void);
+extern unsigned int mmu_disabled_cpu_count;
+extern bool __mmu_enabled(void);
+static inline bool mmu_enabled(void)
+{
+   return mmu_disabled_cpu_count == 0 || __mmu_enabled();
+}
 extern void mmu_enable(pgd_t *pgtable);
 extern void mmu_disable(void);
 extern void mmu_enable_idmap(void);
diff --git a/lib/arm/mmu.c b/lib/arm/mmu.c
index ad558e177dd1c..e661d4f26e4b7 100644
--- a/lib/arm/mmu.c
+++ b/lib/arm/mmu.c
@@ -15,8 +15,9 @@ extern unsigned long etext;
 pgd_t *mmu_idmap;
 
 static cpumask_t mmu_disabled_cpumask;
+unsigned int mmu_disabled_cpu_count;
 
-bool mmu_enabled(void)
+bool __mmu_enabled(void)
 {
int cpu = current_thread_info()->cpu;
 
@@ -30,7 +31,9 @@ void mmu_enable(pgd_t *pgtable)
 
asm_mmu_enable(__pa(pgtable));
flush_tlb_all();
-   cpumask_clear_cpu(cpu, &mmu_disabled_cpumask);
+
+   if (cpumask_test_and_clear_cpu(cpu, &mmu_disabled_cpumask))
+   --mmu_disabled_cpu_count;
 }
 
 extern void asm_mmu_disable(void);
@@ -39,6 +42,8 @@ void mmu_disable(void)
int cpu = current_thread_info()->cpu;
 
cpumask_set_cpu(cpu, &mmu_disabled_cpumask);
+   ++mmu_disabled_cpu_count;
+
asm_mmu_disable();
 }
 
-- 
2.4.3

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] arm/arm64: Introduce mmu_disable

2015-06-25 Thread Andrew Jones
Allow unit test cpus to disable the MMU. Why not? We want the
test framework to be as flexible as possible. Callers will have
to deal with the cache coherency fallout... Cache flush support
is still forthcoming to the framework though.

Signed-off-by: Andrew Jones 
---
 arm/cstart.S  | 9 +
 arm/cstart64.S| 8 
 lib/arm/asm/mmu-api.h | 1 +
 lib/arm/mmu.c | 8 
 4 files changed, 26 insertions(+)

diff --git a/arm/cstart.S b/arm/cstart.S
index 2b7f8f3d200ef..3943867d2f219 100644
--- a/arm/cstart.S
+++ b/arm/cstart.S
@@ -159,6 +159,15 @@ asm_mmu_enable:
 
mov pc, lr
 
+.globl asm_mmu_disable
+asm_mmu_disable:
+   /* SCTLR */
+   mrc p15, 0, r0, c1, c0, 0
+   bic r0, #CR_M
+   mcr p15, 0, r0, c1, c0, 0
+   isb
+   mov pc, lr
+
 /*
  * Vector stubs
  * Simplified version of the Linux kernel implementation
diff --git a/arm/cstart64.S b/arm/cstart64.S
index cdda13c17af9e..44cff32d0f18e 100644
--- a/arm/cstart64.S
+++ b/arm/cstart64.S
@@ -146,6 +146,14 @@ asm_mmu_enable:
 
ret
 
+.globl asm_mmu_disable
+asm_mmu_disable:
+   mrs x0, sctlr_el1
+   bic x0, x0, SCTLR_EL1_M
+   msr sctlr_el1, x0
+   isb
+   ret
+
 /*
  * Vectors
  * Adapted from arch/arm64/kernel/entry.S
diff --git a/lib/arm/asm/mmu-api.h b/lib/arm/asm/mmu-api.h
index 68dc707d67241..c46c4b08b14cc 100644
--- a/lib/arm/asm/mmu-api.h
+++ b/lib/arm/asm/mmu-api.h
@@ -4,6 +4,7 @@ extern pgd_t *mmu_idmap;
 extern bool mmu_enabled(void);
 extern void mmu_set_enabled(void);
 extern void mmu_enable(pgd_t *pgtable);
+extern void mmu_disable(void);
 extern void mmu_enable_idmap(void);
 extern void mmu_init_io_sect(pgd_t *pgtable, unsigned long virt_offset);
 extern void mmu_set_range_sect(pgd_t *pgtable, unsigned long virt_offset,
diff --git a/lib/arm/mmu.c b/lib/arm/mmu.c
index 732000a8eb088..5966b408cb455 100644
--- a/lib/arm/mmu.c
+++ b/lib/arm/mmu.c
@@ -35,6 +35,14 @@ void mmu_enable(pgd_t *pgtable)
mmu_set_enabled();
 }
 
+extern void asm_mmu_disable(void);
+void mmu_disable(void)
+{
+   struct thread_info *ti = current_thread_info();
+   cpumask_clear_cpu(ti->cpu, &mmu_enabled_cpumask);
+   asm_mmu_disable();
+}
+
 void mmu_set_range_ptes(pgd_t *pgtable, unsigned long virt_offset,
unsigned long phys_start, unsigned long phys_end,
pgprot_t prot)
-- 
2.4.3

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/3] arm/arm64: drop mmu_set_enabled

2015-06-25 Thread Andrew Jones
The mmu is enabled automatically for all cpus, they must disable it
themselves if they don't want it on. Switch from managing a cpumask
of enabled cpus to one of disabled cpus. This allows us to remove
the mmu_set_enabled call from secondary_cinit, and the function all
together.

Signed-off-by: Andrew Jones 
---
 lib/arm/asm/mmu-api.h |  1 -
 lib/arm/mmu.c | 21 ++---
 lib/arm/smp.c |  1 -
 3 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/lib/arm/asm/mmu-api.h b/lib/arm/asm/mmu-api.h
index c46c4b08b14cc..12fdc57918c27 100644
--- a/lib/arm/asm/mmu-api.h
+++ b/lib/arm/asm/mmu-api.h
@@ -2,7 +2,6 @@
 #define __ASMARM_MMU_API_H_
 extern pgd_t *mmu_idmap;
 extern bool mmu_enabled(void);
-extern void mmu_set_enabled(void);
 extern void mmu_enable(pgd_t *pgtable);
 extern void mmu_disable(void);
 extern void mmu_enable_idmap(void);
diff --git a/lib/arm/mmu.c b/lib/arm/mmu.c
index 5966b408cb455..ad558e177dd1c 100644
--- a/lib/arm/mmu.c
+++ b/lib/arm/mmu.c
@@ -14,32 +14,31 @@ extern unsigned long etext;
 
 pgd_t *mmu_idmap;
 
-static cpumask_t mmu_enabled_cpumask;
+static cpumask_t mmu_disabled_cpumask;
+
 bool mmu_enabled(void)
 {
-   struct thread_info *ti = current_thread_info();
-   return cpumask_test_cpu(ti->cpu, &mmu_enabled_cpumask);
-}
+   int cpu = current_thread_info()->cpu;
 
-void mmu_set_enabled(void)
-{
-   struct thread_info *ti = current_thread_info();
-   cpumask_set_cpu(ti->cpu, &mmu_enabled_cpumask);
+   return !cpumask_test_cpu(cpu, &mmu_disabled_cpumask);
 }
 
 extern void asm_mmu_enable(phys_addr_t pgtable);
 void mmu_enable(pgd_t *pgtable)
 {
+   int cpu = current_thread_info()->cpu;
+
asm_mmu_enable(__pa(pgtable));
flush_tlb_all();
-   mmu_set_enabled();
+   cpumask_clear_cpu(cpu, &mmu_disabled_cpumask);
 }
 
 extern void asm_mmu_disable(void);
 void mmu_disable(void)
 {
-   struct thread_info *ti = current_thread_info();
-   cpumask_clear_cpu(ti->cpu, &mmu_enabled_cpumask);
+   int cpu = current_thread_info()->cpu;
+
+   cpumask_set_cpu(cpu, &mmu_disabled_cpumask);
asm_mmu_disable();
 }
 
diff --git a/lib/arm/smp.c b/lib/arm/smp.c
index f389ba6598faa..ca435dcd5f4a2 100644
--- a/lib/arm/smp.c
+++ b/lib/arm/smp.c
@@ -23,7 +23,6 @@ secondary_entry_fn secondary_cinit(void)
secondary_entry_fn entry;
 
thread_info_init(ti, 0);
-   mmu_set_enabled();
 
/*
 * Save secondary_data.entry locally to avoid opening a race
-- 
2.4.3

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/3] arm/arm64: rework mmu_enabled, for spinlock speedup

2015-06-25 Thread Andrew Jones
Andrew Jones (3):
  arm/arm64: Introduce mmu_disable
  arm/arm64: drop mmu_set_enabled
  arm/arm64: speed up spinlocks and atomic ops

 arm/cstart.S  |  9 +
 arm/cstart64.S|  8 
 lib/arm/asm/mmu-api.h |  9 +++--
 lib/arm/mmu.c | 32 ++--
 lib/arm/smp.c |  1 -
 5 files changed, 46 insertions(+), 13 deletions(-)

-- 
2.4.3

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH v5] i386: Introduce ARAT CPU feature

2015-06-25 Thread Eduardo Habkost
On Sun, Jun 07, 2015 at 11:15:08AM +0200, Jan Kiszka wrote:
> From: Jan Kiszka 
> 
> ARAT signals that the APIC timer does not stop in power saving states.
> As our APICs are emulated, it's fine to expose this feature to guests,
> at least when asking for KVM host features or with CPU types that
> include the flag. The exact model number that introduced the feature is
> not known, but reports can be found that it's at least available since
> Sandy Bridge.
> 
> Signed-off-by: Jan Kiszka 

Reviewed-by: Eduardo Habkost 

Applied to the x86 tree. Thanks!

-- 
Eduardo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC] vhost: add ioctl to query nregions upper limit

2015-06-25 Thread Igor Mammedov
On Wed, 24 Jun 2015 17:08:56 +0200
"Michael S. Tsirkin"  wrote:

> On Wed, Jun 24, 2015 at 04:52:29PM +0200, Igor Mammedov wrote:
> > On Wed, 24 Jun 2015 16:17:46 +0200
> > "Michael S. Tsirkin"  wrote:
> > 
> > > On Wed, Jun 24, 2015 at 04:07:27PM +0200, Igor Mammedov wrote:
> > > > On Wed, 24 Jun 2015 15:49:27 +0200
> > > > "Michael S. Tsirkin"  wrote:
> > > > 
> > > > > Userspace currently simply tries to give vhost as many regions
> > > > > as it happens to have, but you only have the mem table
> > > > > when you have initialized a large part of VM, so graceful
> > > > > failure is very hard to support.
> > > > > 
> > > > > The result is that userspace tends to fail catastrophically.
> > > > > 
> > > > > Instead, add a new ioctl so userspace can find out how much
> > > > > kernel supports, up front. This returns a positive value that
> > > > > we commit to.
> > > > > 
> > > > > Also, document our contract with legacy userspace: when
> > > > > running on an old kernel, you get -1 and you can assume at
> > > > > least 64 slots.  Since 0 value's left unused, let's make that
> > > > > mean that the current userspace behaviour (trial and error)
> > > > > is required, just in case we want it back.
> > > > > 
> > > > > Signed-off-by: Michael S. Tsirkin 
> > > > > Cc: Igor Mammedov 
> > > > > Cc: Paolo Bonzini 
> > > > > ---
> > > > >  include/uapi/linux/vhost.h | 17 -
> > > > >  drivers/vhost/vhost.c  |  5 +
> > > > >  2 files changed, 21 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/include/uapi/linux/vhost.h
> > > > > b/include/uapi/linux/vhost.h index ab373191..f71fa6d 100644
> > > > > --- a/include/uapi/linux/vhost.h
> > > > > +++ b/include/uapi/linux/vhost.h
> > > > > @@ -80,7 +80,7 @@ struct vhost_memory {
> > > > >   * Allows subsequent call to VHOST_OWNER_SET to succeed. */
> > > > >  #define VHOST_RESET_OWNER _IO(VHOST_VIRTIO, 0x02)
> > > > >  
> > > > > -/* Set up/modify memory layout */
> > > > > +/* Set up/modify memory layout: see also
> > > > > VHOST_GET_MEM_MAX_NREGIONS below. */ #define
> > > > > VHOST_SET_MEM_TABLE   _IOW(VHOST_VIRTIO, 0x03, struct
> > > > > vhost_memory) /* Write logging setup. */
> > > > > @@ -127,6 +127,21 @@ struct vhost_memory {
> > > > >  /* Set eventfd to signal an error */
> > > > >  #define VHOST_SET_VRING_ERR _IOW(VHOST_VIRTIO, 0x22, struct
> > > > > vhost_vring_file) 
> > > > > +/* Query upper limit on nregions in VHOST_SET_MEM_TABLE
> > > > > arguments.
> > > > > + * Returns:
> > > > > + *   0 < value <= MAX_INT - gives the upper limit,
> > > > > higher values will fail
> > > > > + *   0 - there's no static limit: try and see if it
> > > > > works
> > > > > + *   -1 - on failure
> > > > > + */
> > > > > +#define VHOST_GET_MEM_MAX_NREGIONS   _IO(VHOST_VIRTIO, 0x23)
> > > > > +
> > > > > +/* Returned by VHOST_GET_MEM_MAX_NREGIONS to mean there's no
> > > > > static limit:
> > > > > + * try and it'll work if you are lucky. */
> > > > > +#define VHOST_MEM_MAX_NREGIONS_NONE 0
> > > > is it needed? we always have a limit,
> > > > or don't have IOCTL => -1 => old try and see way
> > > > 
> > > > > +/* We support at least as many nregions in
> > > > > VHOST_SET_MEM_TABLE:
> > > > > + * for use on legacy kernels without
> > > > > VHOST_GET_MEM_MAX_NREGIONS support. */ +#define
> > > > > VHOST_MEM_MAX_NREGIONS_DEFAULT 64
> > > > ^^^ not used below,
> > > > if it's for legacy then perhaps s/DEFAULT/LEGACY/ 
> > > 
> > > The assumption was that userspace detecting old kernels will just
> > > use 64, this means we do want a flag to get the old way.
> > > 
> > > OTOH if you won't think it's useful, let me know.
> > this header will be synced into QEMU's tree so that we could use
> > this define there, isn't it? IMHO then _LEGACY is more exact
> > description of macro.
> > 
> > As for 0 return value, -1 is just fine for detecting old kernels
> > (i.e. try and see if it works), so 0 looks unnecessary but it
> > doesn't in any way hurt either. For me limit or -1 is enough to try
> > fix userspace.
> 
> OK.
> Do you want to try now before I do v2?

I've just tried, idea to check limit is unusable in this case.
here is a link to a patch that implements it:
https://github.com/imammedo/qemu/commits/vhost_slot_limit_check

slots count is changing dynamically depending on used devices
and more importantly guest OS could change slots count during
its runtime when during managing devices it could trigger
repartitioning of current memory table as device's memory regions
mapped into address space.

That leads to 2 different values of used slots at guest startup
time and after guest booted or after hotplug.

I my case guest could be started with max 58 DIMMs coldplugged,
but after boot 3 more slots are freed and it's possible to hotadd
3 more DIMMs. That however leads to the guest that can't be migrated
to since by QEMU design all hotplugged devices should be present
at target's startup time i.e. 60 DIMMs total and that obviously
goes above vhost limit

Re: [PATCH 2/3] arm/arm64: speed up spinlocks and atomic ops

2015-06-25 Thread Andrew Jones
On Thu, Jun 25, 2015 at 06:23:48PM +0200, Paolo Bonzini wrote:
> 
> 
> On 25/06/2015 18:12, Andrew Jones wrote:
> > spinlock torture tests made it clear that checking mmu_enabled()
> > every time we call spin_lock is a bad idea. As most tests will
> > want the MMU enabled the entire time, then just hard code
> > mmu_enabled() to true. Tests that want to play with the MMU can
> > be compiled with CONFIG_MAY_DISABLE_MMU to get the actual check
> > back.
> 
> This doesn't work if you compile mmu.o just once.  Can you make
> something like
> 
> static inline bool mmu_enabled(void)
> {
>   return disabled_mmu_cpu_count == 0 || __mmu_enabled();
> }
> 
> ...
> 
> bool __mmu_enabled(void)
> {
>   struct thread_info *ti = current_thread_info();
>   return cpumask_test_cpu(ti->cpu, &mmu_enabled_cpumask);
> }
> 
> ?

Agreed. But I might as well actually add the support for disabling
the mmu, if I'm going to add yet another variable dependant on it.
We should drop this patch from this series, and I'll submit another
series of a few patches that
  - introduce mmu_disable
  - switch to assuming the mmu is enabled, and then manage disabled
state instead
  - optimize mmu_enabled()

Thanks,
drew

> 
> Paolo
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] arm/arm64: speed up spinlocks and atomic ops

2015-06-25 Thread Paolo Bonzini


On 25/06/2015 18:12, Andrew Jones wrote:
> spinlock torture tests made it clear that checking mmu_enabled()
> every time we call spin_lock is a bad idea. As most tests will
> want the MMU enabled the entire time, then just hard code
> mmu_enabled() to true. Tests that want to play with the MMU can
> be compiled with CONFIG_MAY_DISABLE_MMU to get the actual check
> back.

This doesn't work if you compile mmu.o just once.  Can you make
something like

static inline bool mmu_enabled(void)
{
return disabled_mmu_cpu_count == 0 || __mmu_enabled();
}

...

bool __mmu_enabled(void)
{
struct thread_info *ti = current_thread_info();
return cpumask_test_cpu(ti->cpu, &mmu_enabled_cpumask);
}

?

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] arm/arm64: spinlocks: fix memory barriers

2015-06-25 Thread Andrew Jones
It shouldn't be necessary to use a barrier on the way into
spin_lock. We'll be focused on a single address until we get
it (exclusively) set, and then we'll do a barrier on the way
out. Also, it does make sense to do a barrier on the way in
to spin_unlock, i.e. ensure what we did in the critical section
is ordered wrt to what we do outside it, before we announce that
we're outside.

Signed-off-by: Andrew Jones 
---
 lib/arm/spinlock.c   | 8 
 lib/arm64/spinlock.c | 5 ++---
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/lib/arm/spinlock.c b/lib/arm/spinlock.c
index 3b023ceaebf71..116ea5d7db930 100644
--- a/lib/arm/spinlock.c
+++ b/lib/arm/spinlock.c
@@ -7,10 +7,9 @@ void spin_lock(struct spinlock *lock)
 {
u32 val, fail;
 
-   dmb();
-
if (!mmu_enabled()) {
lock->v = 1;
+   smp_mb();
return;
}
 
@@ -25,11 +24,12 @@ void spin_lock(struct spinlock *lock)
: "r" (&lock->v)
: "cc" );
} while (fail);
-   dmb();
+
+   smp_mb();
 }
 
 void spin_unlock(struct spinlock *lock)
 {
+   smp_mb();
lock->v = 0;
-   dmb();
 }
diff --git a/lib/arm64/spinlock.c b/lib/arm64/spinlock.c
index 68b68b75ba60d..a3907f03cacda 100644
--- a/lib/arm64/spinlock.c
+++ b/lib/arm64/spinlock.c
@@ -13,10 +13,9 @@ void spin_lock(struct spinlock *lock)
 {
u32 val, fail;
 
-   smp_mb();
-
if (!mmu_enabled()) {
lock->v = 1;
+   smp_mb();
return;
}
 
@@ -35,9 +34,9 @@ void spin_lock(struct spinlock *lock)
 
 void spin_unlock(struct spinlock *lock)
 {
+   smp_mb();
if (mmu_enabled())
asm volatile("stlrh wzr, [%0]" :: "r" (&lock->v));
else
lock->v = 0;
-   smp_mb();
 }
-- 
2.4.3

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/3] arm/arm64: tcg_baremetal_tests inspired patches

2015-06-25 Thread Andrew Jones
While porting the test in virtualopensystems' tcg_baremetal_tests
to kvm-unit-tests, I found a couple places to improve the spinlock
implementation. I also added a way to build a single test that
doesn't necessary have an entry in the makefile, which should be handy
for experimental tests.

Andrew Jones (3):
  arm/arm64: spinlocks: fix memory barriers
  arm/arm64: allow building a single test
  arm/arm64: speed up spinlocks and atomic ops

 config/config-arm-common.mak | 6 ++
 lib/arm/asm/mmu-api.h| 4 
 lib/arm/mmu.c| 3 +++
 lib/arm/spinlock.c   | 8 
 lib/arm64/spinlock.c | 5 ++---
 5 files changed, 19 insertions(+), 7 deletions(-)

-- 
2.4.3

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/3] arm/arm64: allow building a single test

2015-06-25 Thread Andrew Jones
This is mostly useful for building new tests that don't yet (and
may never) have entries in the makefiles (config-arm*.mak). Of course
it can be used to build tests that do have entries as well, in order
to avoid building all tests, if the plan is to run just the one.

Just do 'make TEST=some-test' to use it, where "some-test" matches
the name of the source file, i.e. arm/some-test.c

Signed-off-by: Andrew Jones 
---
 config/config-arm-common.mak | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/config/config-arm-common.mak b/config/config-arm-common.mak
index 314261ef60cf7..2ad29c115f946 100644
--- a/config/config-arm-common.mak
+++ b/config/config-arm-common.mak
@@ -12,6 +12,11 @@ endif
 tests-common = \
$(TEST_DIR)/selftest.flat
 
+ifneq ($(TEST),)
+   tests = $(TEST_DIR)/$(TEST).flat
+   tests-common =
+endif
+
 all: test_cases
 
 ##
@@ -69,4 +74,5 @@ generated_files = $(asm-offsets)
 
 test_cases: $(generated_files) $(tests-common) $(tests)
 
+$(TEST_DIR)/$(TEST).elf: $(cstart.o) $(TEST_DIR)/$(TEST).o
 $(TEST_DIR)/selftest.elf: $(cstart.o) $(TEST_DIR)/selftest.o
-- 
2.4.3

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/3] arm/arm64: speed up spinlocks and atomic ops

2015-06-25 Thread Andrew Jones
spinlock torture tests made it clear that checking mmu_enabled()
every time we call spin_lock is a bad idea. As most tests will
want the MMU enabled the entire time, then just hard code
mmu_enabled() to true. Tests that want to play with the MMU can
be compiled with CONFIG_MAY_DISABLE_MMU to get the actual check
back.

Signed-off-by: Andrew Jones 
---
 lib/arm/asm/mmu-api.h | 4 
 lib/arm/mmu.c | 3 +++
 2 files changed, 7 insertions(+)

diff --git a/lib/arm/asm/mmu-api.h b/lib/arm/asm/mmu-api.h
index 68dc707d67241..1a4d91163c398 100644
--- a/lib/arm/asm/mmu-api.h
+++ b/lib/arm/asm/mmu-api.h
@@ -1,7 +1,11 @@
 #ifndef __ASMARM_MMU_API_H_
 #define __ASMARM_MMU_API_H_
 extern pgd_t *mmu_idmap;
+#ifdef CONFIG_MAY_DISABLE_MMU
 extern bool mmu_enabled(void);
+#else
+#define mmu_enabled() (1)
+#endif
 extern void mmu_set_enabled(void);
 extern void mmu_enable(pgd_t *pgtable);
 extern void mmu_enable_idmap(void);
diff --git a/lib/arm/mmu.c b/lib/arm/mmu.c
index 732000a8eb088..405717b6332bf 100644
--- a/lib/arm/mmu.c
+++ b/lib/arm/mmu.c
@@ -15,11 +15,14 @@ extern unsigned long etext;
 pgd_t *mmu_idmap;
 
 static cpumask_t mmu_enabled_cpumask;
+
+#ifdef CONFIG_MAY_DISABLE_MMU
 bool mmu_enabled(void)
 {
struct thread_info *ti = current_thread_info();
return cpumask_test_cpu(ti->cpu, &mmu_enabled_cpumask);
 }
+#endif
 
 void mmu_set_enabled(void)
 {
-- 
2.4.3

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



Re: [v4 08/16] KVM: kvm-vfio: User API for IRQ forwarding

2015-06-25 Thread Alex Williamson
On Thu, 2015-06-25 at 09:37 +, Wu, Feng wrote:
> 
> > -Original Message-
> > From: Joerg Roedel [mailto:j...@8bytes.org]
> > Sent: Wednesday, June 24, 2015 11:46 PM
> > To: Alex Williamson
> > Cc: Wu, Feng; Eric Auger; Avi Kivity; kvm@vger.kernel.org;
> > linux-ker...@vger.kernel.org; pbonz...@redhat.com; mtosa...@redhat.com
> > Subject: Re: [v4 08/16] KVM: kvm-vfio: User API for IRQ forwarding
> > 
> > On Thu, Jun 18, 2015 at 02:04:08PM -0600, Alex Williamson wrote:
> > > There are plenty of details to be filled in,
> > 
> > I also need to fill plenty of details in my head first, so here are some
> > suggestions based on my current understanding. Please don't hesitate to
> > correct me if where I got something wrong.
> > 
> > So first I totally agree that the handling of PI/non-PI configurations
> > should be transparent to user-space.
> 
> After thinking about this a bit more, I recall that why I used user-space
> to trigger the IRTE update for posted-interrupts, here is the reason:
> 
> Let's take MSI for an example:
> When guest updates the MSI configuration, here is the code path in
> QEMU and KVM:
> 
> vfio_update_msi() --> vfio_update_kvm_msi_virq() -->
> kvm_irqchip_update_msi_route() --> kvm_update_routing_entry() -->
> kvm_irqchip_commit_routes() --> kvm_irqchip_commit_routes() -->
> KVM_SET_GSI_ROUTING --> kvm_set_irq_routing()
> 
> It will finally go to kvm_set_irq_routing() in KVM, there are two problem:
> 1. It use RCU in this function, it is hard to find which entry in the irq 
> routing
>   table is being updated.
> 2. Even we find the updated entry, it is hard to find the associated assigned
>   device with this irq routing entry.
> 
> So I used a VFIO API to notify KVM the updated MSI/MSIx configuration and
> the associated assigned devices. I think we need to find a way to address
> the above two issues before going forward. Alex, what is your opinion?

So the trouble is that QEMU vfio updates a single MSI vector, but that
just updates a single entry within a whole table of routes, then the
whole table is pushed to KVM.  But in kvm_set_irq_routing() we have
access to both the new and the old tables, so we do have the ability to
detect the change.  We can therefore detect which GSI changed and
cross-reference that to KVMs irqfds.  If we have an irqfd that matches
the GSI then we have all the information we need, right?  We can use the
eventfd_ctx of the irqfd to call into the IRQ bypass manager if we need
to.  If it's an irqfd that's already enabled for bypass then we may
already have the data we need to tweak the PI config.

Yes, I agree it's more difficult, but it doesn't appear to be
impossible, right?  Thanks,

Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[no subject]

2015-06-25 Thread Mrs.Alice Walton
my name is Mrs. Alice Walton,i have a charity proposal for you
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] arm64/kvm: Add generic v8 KVM target

2015-06-25 Thread Peter Maydell
On 25 June 2015 at 14:44, Marc Zyngier  wrote:
> It should always be possible to emulate a "known" CPU on a generic host,
> and it should be able to migrate. The case we can't migrate is when we
> let the guest be generic (which I guess should really be unknown, and
> not generic).
>
> So if the user specify "-cpu cortex-a57" on the command line, the guest
> should be able to migrate from an A72 to an A53. if the user specified
> "-cpu host", the resulting guest won't be able to migrate.
>
> Does it make sense?

Yes. We've always said "-cpu host" won't be cross-host migratable
from a QEMU point of view.

-- PMM
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] arm64/kvm: Add generic v8 KVM target

2015-06-25 Thread Marc Zyngier
On 25/06/15 13:40, Marc Zyngier wrote:
> On 25/06/15 13:30, Christoffer Dall wrote:
>> On Wed, Jun 24, 2015 at 10:32:38AM +0100, Marc Zyngier wrote:
>>> Hi Christoffer,
>>>
>>> On 24/06/15 09:51, Christoffer Dall wrote:
 On Wed, Jun 24, 2015 at 09:29:56AM +0100, Marc Zyngier wrote:
> On 22/06/15 09:44, Peter Maydell wrote:
>> On 17 June 2015 at 10:00, Suzuki K. Poulose  
>> wrote:
>>> From: "Suzuki K. Poulose" 
>>>
>>> This patch adds a generic ARM v8 KVM target cpu type for use
>>> by the new CPUs which eventualy ends up using the common sys_reg
>>> table. For backward compatibility the existing targets have been
>>> preserved. Any new target CPU that can be covered by generic v8
>>> sys_reg tables should make use of the new generic target.
>>
>> How do you intend this to work for cross-host migration?
>
> It is not meant to work for cross migration at all.
>
>> Is the idea that the kernel guarantees that "generic" looks
>> 100% the same to the guest regardless of host hardware? I'm
>> not sure that can be made to work, given impdef differences
>> in ID register values, bp/wp registers, and so on.
>>
>> Given that, it seems to me that we still need to provide
>> KVM_ARM_TARGET_$THISCPU defines so userspace can request
>> a specific guest CPU flavour; so what does this patch
>> provide that isn't already provided by just having userspace
>> query for the "preferred" CPU type as it does already?
>
> The way I see this working is that a "generic" CPU cannot be migrated
> (because we don't know anything about it). If it can be identified as a
> known (non generic) implementation, then we can migrate it.
>
 Concretely, how should this work?  Be enforced by userspace or should we
 deny certain SET_ONE_REG operations from working on this target?
>>>
>>> I can definitely see MIDR overriding failing on a generic CPU. Also, you
>>> shouldn't be able to select a generic CPU if we know about the physical CPU.
>>>
>>
>> If I am migrating from an A57 to an A53, but the VM was started as
>> "generic CPU" then we rely on the user discovering that this is not
>> supported when trying to set the MIDR from userspace in KVM?
> 
> A57 to A53 is probably a bad example because we actively support both.
> So let's replace your A57 with an A72. With this patch, the A72 would be
> reported as "generic CPU", and MIDR access would fail on the A53.

Having thought about this a bit more...

It should always be possible to emulate a "known" CPU on a generic host,
and it should be able to migrate. The case we can't migrate is when we
let the guest be generic (which I guess should really be unknown, and
not generic).

So if the user specify "-cpu cortex-a57" on the command line, the guest
should be able to migrate from an A72 to an A53. if the user specified
"-cpu host", the resulting guest won't be able to migrate.

Does it make sense?

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] arm64/kvm: Add generic v8 KVM target

2015-06-25 Thread Marc Zyngier
On 25/06/15 13:30, Christoffer Dall wrote:
> On Wed, Jun 24, 2015 at 10:32:38AM +0100, Marc Zyngier wrote:
>> Hi Christoffer,
>>
>> On 24/06/15 09:51, Christoffer Dall wrote:
>>> On Wed, Jun 24, 2015 at 09:29:56AM +0100, Marc Zyngier wrote:
 On 22/06/15 09:44, Peter Maydell wrote:
> On 17 June 2015 at 10:00, Suzuki K. Poulose  
> wrote:
>> From: "Suzuki K. Poulose" 
>>
>> This patch adds a generic ARM v8 KVM target cpu type for use
>> by the new CPUs which eventualy ends up using the common sys_reg
>> table. For backward compatibility the existing targets have been
>> preserved. Any new target CPU that can be covered by generic v8
>> sys_reg tables should make use of the new generic target.
>
> How do you intend this to work for cross-host migration?

 It is not meant to work for cross migration at all.

> Is the idea that the kernel guarantees that "generic" looks
> 100% the same to the guest regardless of host hardware? I'm
> not sure that can be made to work, given impdef differences
> in ID register values, bp/wp registers, and so on.
>
> Given that, it seems to me that we still need to provide
> KVM_ARM_TARGET_$THISCPU defines so userspace can request
> a specific guest CPU flavour; so what does this patch
> provide that isn't already provided by just having userspace
> query for the "preferred" CPU type as it does already?

 The way I see this working is that a "generic" CPU cannot be migrated
 (because we don't know anything about it). If it can be identified as a
 known (non generic) implementation, then we can migrate it.

>>> Concretely, how should this work?  Be enforced by userspace or should we
>>> deny certain SET_ONE_REG operations from working on this target?
>>
>> I can definitely see MIDR overriding failing on a generic CPU. Also, you
>> shouldn't be able to select a generic CPU if we know about the physical CPU.
>>
> 
> If I am migrating from an A57 to an A53, but the VM was started as
> "generic CPU" then we rely on the user discovering that this is not
> supported when trying to set the MIDR from userspace in KVM?

A57 to A53 is probably a bad example because we actively support both.
So let's replace your A57 with an A72. With this patch, the A72 would be
reported as "generic CPU", and MIDR access would fail on the A53.

Admittedly, this is a bit late. We could also refuse to instantiate a
"generic CPU" on A53, but that's not much better timing wise.

Not sure we can do much better though.

M.
-- 
Jazz is not dead. It just smells funny...
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] arm64/kvm: Add generic v8 KVM target

2015-06-25 Thread Christoffer Dall
On Wed, Jun 24, 2015 at 10:32:38AM +0100, Marc Zyngier wrote:
> Hi Christoffer,
> 
> On 24/06/15 09:51, Christoffer Dall wrote:
> > On Wed, Jun 24, 2015 at 09:29:56AM +0100, Marc Zyngier wrote:
> >> On 22/06/15 09:44, Peter Maydell wrote:
> >>> On 17 June 2015 at 10:00, Suzuki K. Poulose  
> >>> wrote:
>  From: "Suzuki K. Poulose" 
> 
>  This patch adds a generic ARM v8 KVM target cpu type for use
>  by the new CPUs which eventualy ends up using the common sys_reg
>  table. For backward compatibility the existing targets have been
>  preserved. Any new target CPU that can be covered by generic v8
>  sys_reg tables should make use of the new generic target.
> >>>
> >>> How do you intend this to work for cross-host migration?
> >>
> >> It is not meant to work for cross migration at all.
> >>
> >>> Is the idea that the kernel guarantees that "generic" looks
> >>> 100% the same to the guest regardless of host hardware? I'm
> >>> not sure that can be made to work, given impdef differences
> >>> in ID register values, bp/wp registers, and so on.
> >>>
> >>> Given that, it seems to me that we still need to provide
> >>> KVM_ARM_TARGET_$THISCPU defines so userspace can request
> >>> a specific guest CPU flavour; so what does this patch
> >>> provide that isn't already provided by just having userspace
> >>> query for the "preferred" CPU type as it does already?
> >>
> >> The way I see this working is that a "generic" CPU cannot be migrated
> >> (because we don't know anything about it). If it can be identified as a
> >> known (non generic) implementation, then we can migrate it.
> >>
> > Concretely, how should this work?  Be enforced by userspace or should we
> > deny certain SET_ONE_REG operations from working on this target?
> 
> I can definitely see MIDR overriding failing on a generic CPU. Also, you
> shouldn't be able to select a generic CPU if we know about the physical CPU.
> 

If I am migrating from an A57 to an A53, but the VM was started as
"generic CPU" then we rely on the user discovering that this is not
supported when trying to set the MIDR from userspace in KVM?

> > 
> > Also, can we imagine any scenario where the generic CPU cannot me
> > modeled for a VM on a specific piece of hardware (current or future)?
> 
> What is the definition of a generic CPU here? In the above, generic
> really means "Unknown", so I can't immediately see what it would mean to
> model this.
> 
ok, good way to phrase it, I suppose that's a non-issue then.

Thanks,
-Christoffer
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v6 10/12] KVM: arm64: guest debug, HW assisted debug support

2015-06-25 Thread Alex Bennée

Christoffer Dall  writes:

> On Thu, Jun 25, 2015 at 07:38:33AM +0100, Alex Bennée wrote:
>> 
>> Christoffer Dall  writes:
>> 
>> > On Fri, Jun 19, 2015 at 01:23:48PM +0100, Alex Bennée wrote:
>> >> This adds support for userspace to control the HW debug registers for
>> >> guest debug. In the debug ioctl we copy the IMPDEF defined number of

>> >>  void kvm_arm_clear_debug(struct kvm_vcpu *vcpu)
>> >>  {
>> >> - if (vcpu->guest_debug)
>> >> + if (vcpu->guest_debug) {
>> >>   restore_guest_debug_regs(vcpu);
>> >> +
>> >> + /*
>> >> +  * If we were using HW debug we need to restore the
>> >> +  * debug_ptr to the guest debug state.
>> >> +  */
>> >> + if (vcpu->guest_debug & KVM_GUESTDBG_USE_HW)
>> >> + vcpu->arch.debug_ptr = &vcpu->arch.vcpu_debug_state;
>> >
>> > I still think this would be more cleanly done in the setup_debug
>> > function, but ok:
>> 
>> I don't follow, setup_debug is called before we enter KVM. It's pretty
>> light when no debugging is being done so this ensure we leave state how
>> we would like it when we stop debugging.
>> 
>> I can move it to an else leg in setup if you really want.
>> 
> I just feel like whenever you enter the guest you setup the state you
> want for your guest and then when reading the code you never have to
> worry about "did I set the pointer back correctly last time it exited",
> but thinking about your response, I guess that's an extra store on each
> world-switch, so theoretically that may be a bit more overhead (on top
> of the hundreds other stores and spinlocks we take and stuff).

The setup/clear() calls are tightly paired around the KVM_RUN ioctl code
without any obvious exit points.

Are there any cases you can escape the ioctl code flow? I notice irq's
are re-enabled so I guess a suitably determined irq function could
change the return address or mess around with guest_debug.

> If you prefer, leave it like this, but consider adding a
> BUG_ON(!guest_debugging && debug_ptr != &vcpu->arch.vcpu_debug_state) in
> the setup function...

The clear_debug() code would end up being a fairly sparse piece of code
without it ;-)

> I'm probably being paranoid.

A little paranoia goes a long way in kernel mode ;-)

>
> -Christoffer

-- 
Alex Bennée
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 01/11] kvm/x86: move Hyper-V MSR's/hypercall code into hyperv.c file

2015-06-25 Thread Denis V. Lunev

On 22/06/15 19:04, Denis V. Lunev wrote:

From: Andrey Smetanin 

This patch introduces Hyper-V related source code file - hyperv.c and
per vm and per vcpu hyperv context structures.
All Hyper-V MSR's and hypercall code moved into hyperv.c.
All hyper-v kvm/vcpu fields moved into appropriate hyperv context
structures.
Copyrights and authors information copied from x86.c to hyperv.c.

Signed-off-by: Andrey Smetanin 
Signed-off-by: Denis V. Lunev 
CC: Paolo Bonzini 
CC: Gleb Natapov 


Paolo, Gleb,

we are very sensitive on the first patch. Could you
suggest which tree we should be based on?
For now I think that we should use

https://git.kernel.org/cgit/virt/kvm/kvm.git/

but the branch here could also matters.

Den
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 01/11] kvm/x86: move Hyper-V MSR's/hypercall code into hyperv.c file

2015-06-25 Thread Paolo Bonzini


On 25/06/2015 12:25, Denis V. Lunev wrote:
> 
> Paolo, Gleb,
> 
> we are very sensitive on the first patch. Could you
> suggest which tree we should be based on?
> For now I think that we should use
> 
> https://git.kernel.org/cgit/virt/kvm/kvm.git/
> 
> but the branch here could also matters.

You can use Linus's tree right now, but in general you should use branch
"next" in that tree.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3.16.y-ckt 31/71] MIPS: KVM: Do not sign extend on unsigned MMIO load

2015-06-25 Thread Luis Henriques
3.16.7-ckt14 -stable review patch.  If anyone has any objections, please let me 
know.

--

From: Nicholas Mc Guire 

commit ed9244e6c534612d2b5ae47feab2f55a0d4b4ced upstream.

Fix possible unintended sign extension in unsigned MMIO loads by casting
to uint16_t in the case of mmio_needed != 2.

Signed-off-by: Nicholas Mc Guire 
Reviewed-by: James Hogan 
Tested-by: James Hogan 
Cc: Gleb Natapov 
Cc: Paolo Bonzini 
Cc: kvm@vger.kernel.org
Cc: linux-m...@linux-mips.org
Cc: linux-ker...@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/9985/
Signed-off-by: Ralf Baechle 
[ luis: backported to 3.16:
  - file rename: emulate.c -> kvm_mips_emul.c ]
Signed-off-by: Luis Henriques 
---
 arch/mips/kvm/kvm_mips_emul.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/kvm/kvm_mips_emul.c b/arch/mips/kvm/kvm_mips_emul.c
index 2071472bc3c4..18b4e2fdae33 100644
--- a/arch/mips/kvm/kvm_mips_emul.c
+++ b/arch/mips/kvm/kvm_mips_emul.c
@@ -2130,7 +2130,7 @@ kvm_mips_complete_mmio_load(struct kvm_vcpu *vcpu, struct 
kvm_run *run)
if (vcpu->mmio_needed == 2)
*gpr = *(int16_t *) run->mmio.data;
else
-   *gpr = *(int16_t *) run->mmio.data;
+   *gpr = *(uint16_t *)run->mmio.data;
 
break;
case 1:
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [v4 08/16] KVM: kvm-vfio: User API for IRQ forwarding

2015-06-25 Thread Wu, Feng


> -Original Message-
> From: Joerg Roedel [mailto:j...@8bytes.org]
> Sent: Wednesday, June 24, 2015 11:46 PM
> To: Alex Williamson
> Cc: Wu, Feng; Eric Auger; Avi Kivity; kvm@vger.kernel.org;
> linux-ker...@vger.kernel.org; pbonz...@redhat.com; mtosa...@redhat.com
> Subject: Re: [v4 08/16] KVM: kvm-vfio: User API for IRQ forwarding
> 
> On Thu, Jun 18, 2015 at 02:04:08PM -0600, Alex Williamson wrote:
> > There are plenty of details to be filled in,
> 
> I also need to fill plenty of details in my head first, so here are some
> suggestions based on my current understanding. Please don't hesitate to
> correct me if where I got something wrong.
> 
> So first I totally agree that the handling of PI/non-PI configurations
> should be transparent to user-space.

After thinking about this a bit more, I recall that why I used user-space
to trigger the IRTE update for posted-interrupts, here is the reason:

Let's take MSI for an example:
When guest updates the MSI configuration, here is the code path in
QEMU and KVM:

vfio_update_msi() --> vfio_update_kvm_msi_virq() -->
kvm_irqchip_update_msi_route() --> kvm_update_routing_entry() -->
kvm_irqchip_commit_routes() --> kvm_irqchip_commit_routes() -->
KVM_SET_GSI_ROUTING --> kvm_set_irq_routing()

It will finally go to kvm_set_irq_routing() in KVM, there are two problem:
1. It use RCU in this function, it is hard to find which entry in the irq 
routing
  table is being updated.
2. Even we find the updated entry, it is hard to find the associated assigned
  device with this irq routing entry.

So I used a VFIO API to notify KVM the updated MSI/MSIx configuration and
the associated assigned devices. I think we need to find a way to address
the above two issues before going forward. Alex, what is your opinion?
Thanks a lot!

Thanks,
Feng


> 
> I read a bit through the VT-d spec, and my understanding of posted
> interrupts so far is that:
> 
>   1) Each VCPU gets a PI-Descriptor with its pending Posted
>  Interrupts. This descriptor needs to be updated when a VCPU
>  is migrated to another PCPU and should thus be under control
>  of KVM.
> 
>  This is similar to the vAPIC backing page in the AMD version
>  of this, except that the PCPU routing information is stored
>  somewhere else on AMD.
> 
>   2) As long as the VCPU runs the IRTEs are configured for
>  posting, when the VCPU goes to sleep the old remapped entry is
>  established again. So when the VCPU sleeps the interrupt
>  would get routed to VFIO and forwarded through the eventfd.
> 
>  This would be different to the AMD version, where we have a
>  running bit. When this is clear the IOMMU will trigger an event
>  in its event-log. This might need special handling in VFIO
>  ('might' because VFIO does not need to forward the interrupt,
>   it just needs to make sure the VCPU wakes up).
> 
>  Please correct me if my understanding of the Intel version is
>  wrong.
> 
> So most of the data structures the IOMMU reads for this need to be
> updated from KVM code (either x86-generic or AMD/Intel specific code),
> as KVM has the information about VCPU load/unload and the IRQ routing.
> 
> What KVM needs from VFIO are the informations about the physical
> interrupts, and it makes total sense to attach them as metadata to the
> eventfd.
> 
> But the problems start at how this metadata should look like. It would
> be good to have some generic description, but not sure if this is
> possible. Otherwise this metadata would need to be requested by VFIO
> from the IOMMU driver and passed on to KVM, which it then passes back to
> the IOMMU driver. Or something like that.
> 
> 
> 
>   Joerg

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH] KVM: arm/arm64: Don't let userspace update CNTVOFF once guest is running

2015-06-25 Thread Claudio Fontana
On 25.06.2015 11:10, Peter Maydell wrote:
> On 25 June 2015 at 09:59, Claudio Fontana  wrote:
>> Once the VM is created, I think QEMU should not request kvm to
>> change the virtual offset of the VM anymore: maybe an unexpected
>> consequence of QEMU's target-arm/kvm64.c::kvm_arch_put_registers ?
> 
> Hmm. In general we assume that we can:
>  * stop the VM
>  * read all the guest system registers
>  * write those values back again
>  * restart the VM
> 
> if we need to. Is that what's happening here, or are we doing
> something odder?
> 
> -- PMM
> 

What I guess could be happening by looking at the code in linux

virt/kvm/arm/arch_timer.c::kvm_arm_timer_set_reg

is that QEMU tries to set the KVM_REG_ARM_TIMER_CNT register from exactly the 
previous value,
but just because of the fact that the set function is called, cntvoff is 
updated,
since the value provided by the user is apparently assumed to be _relative_ to 
the physical timer.

This is apparent to me in the code in that function which says:

case KVM_REG_ARM_TIMER_CNT: {
/* ... */
u64 cntvoff = kvm_phys_timer_read() - value;
/* ... */
}

And this is matched by the corresponding get function kvm_arm_timer_get_reg 
where it says:

case KVM_REG_ARM_TIMER_CNT:
   return kvm_phys_timer_read() - vcpu->kvm->arch.timer.cntvoff;

The time difference between when the GET is issued by QEMU and when the PUT is 
issued then would account for the difference.

Thanks,

Claudio

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH] KVM: arm/arm64: Don't let userspace update CNTVOFF once guest is running

2015-06-25 Thread Peter Maydell
On 25 June 2015 at 09:59, Claudio Fontana  wrote:
> Once the VM is created, I think QEMU should not request kvm to
> change the virtual offset of the VM anymore: maybe an unexpected
> consequence of QEMU's target-arm/kvm64.c::kvm_arch_put_registers ?

Hmm. In general we assume that we can:
 * stop the VM
 * read all the guest system registers
 * write those values back again
 * restart the VM

if we need to. Is that what's happening here, or are we doing
something odder?

-- PMM
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH] KVM: arm/arm64: Don't let userspace update CNTVOFF once guest is running

2015-06-25 Thread Claudio Fontana
Hi Christoffer,

On 25.06.2015 10:04, Christoffer Dall wrote:
> On Wed, Jun 24, 2015 at 03:54:57PM +0100, Marc Zyngier wrote:
>> Userspace is allowed to set the guest's view of CNTVCT, which turns
>> into setting CNTVOFF for the whole VM. One thing userspace is not supposed
>> to do is to update that register while the guest is running. Time will
>> either move forward (best case) or backward (really bad idea). Either way,
>> this shouldn't happen.
>>
>> This patch prevents userspace from touching CNTVOFF as soon as a vcpu
>> has been started. This ensures that time will keep monotonically increase.
>>
>> Signed-off-by: Marc Zyngier 
>> ---
>>
>> QEMU seems to trigger this at boot time, and I have no idea why it does so.
>> It would be good to find out, hence the RFC tag.
> 
> Is this at kernel boot time you see this, or at system startup time?
> 
> IIRC, QEMU creates a throw-away VM with the default CPU target time,
> reads out all the system registers to get the KVM reset values of those,
> then creates the real VM, and feeds back in all the system register
> reset values, as a method for QEMU and KVM to be in sync about the reset
> state of the machine.  If we do this, and include CNTVCT, then that
> would probably trigger this, but the VCPU really shouldn't have been run
> at that time...
> 
> We should prevent userspace from fiddling with this register post VCPU
> start regardless, but yes, it would be good to find out why this is
> happening in the first place.
> 
> How did you notice this and does it manifest itself in some user-visible
> ugliness?
> 
> Thanks,
> -Christoffer
> 


You can read the whole history here:

https://groups.google.com/forum/#!topic/osv-dev/2w101csH65E

It causes clock-related bugs with time jumping backward when relying on the 
virtual counter register in the guest, whenever a cpu is booted (primary, 
secondary via PSCI), and actually whenever the monitor is used to stop, info 
registers etc.

Once the VM is created, I think QEMU should not request kvm to change the 
virtual offset of the VM anymore: maybe an unexpected consequence of QEMU's 
target-arm/kvm64.c::kvm_arch_put_registers ?

Thanks,

Claudio






--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH] KVM: arm/arm64: Don't let userspace update CNTVOFF once guest is running

2015-06-25 Thread Marc Zyngier
Hi Christoffer,

On 25/06/15 09:04, Christoffer Dall wrote:
> On Wed, Jun 24, 2015 at 03:54:57PM +0100, Marc Zyngier wrote:
>> Userspace is allowed to set the guest's view of CNTVCT, which turns
>> into setting CNTVOFF for the whole VM. One thing userspace is not supposed
>> to do is to update that register while the guest is running. Time will
>> either move forward (best case) or backward (really bad idea). Either way,
>> this shouldn't happen.
>>
>> This patch prevents userspace from touching CNTVOFF as soon as a vcpu
>> has been started. This ensures that time will keep monotonically increase.
>>
>> Signed-off-by: Marc Zyngier 
>> ---
>>
>> QEMU seems to trigger this at boot time, and I have no idea why it does so.
>> It would be good to find out, hence the RFC tag.
> 
> Is this at kernel boot time you see this, or at system startup time?

When the kernel boots. The OSV guys also have seen this (time going
backward), and this patch seems to have solved their problem.

> IIRC, QEMU creates a throw-away VM with the default CPU target time,
> reads out all the system registers to get the KVM reset values of those,
> then creates the real VM, and feeds back in all the system register
> reset values, as a method for QEMU and KVM to be in sync about the reset
> state of the machine.  If we do this, and include CNTVCT, then that
> would probably trigger this, but the VCPU really shouldn't have been run
> at that time...

I definitely see this process happening, but that doesn't seem to be the
problem.

> We should prevent userspace from fiddling with this register post VCPU
> start regardless, but yes, it would be good to find out why this is
> happening in the first place.

My main problem is that I cannot easily correlate what is happening in
the guest at that time with something touching CNTVCT. That's just odd.

> How did you notice this and does it manifest itself in some user-visible
> ugliness?

It was first reported by the OSV guys, and I then spotted some
interesting hrtimers taking too long at guest boot time. Putting some
traces quickly identified the issue.

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [RFC 0/6] KVM: arm/arm64: gsi routing support

2015-06-25 Thread Pavel Fedin
 Hi!

> But personally I would prefer we
> check irqchip routing entries have flat mapping, ie gsi = irqchip.pin
> since in any case we don't want/expect the userspace to play with that.

 Why? On x86 userspace perfectly can play with it. We can imagine some very new 
qemu version in
future which has all arch-specific kludges like "direct mapping" removed and 
relying fully on
routing which it sets up from scratch, in the same way as x86 qemu does this. 
Or we can imagine some
new, legacy-free hypervisor implementation. The gsi == irqchip.pin limitation 
is just what we have
now by default, for backwards compatibility. But by design we were never 
obliged to stick to this.
 Well, it's just MHO...

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH] KVM: arm/arm64: Don't let userspace update CNTVOFF once guest is running

2015-06-25 Thread Christoffer Dall
On Wed, Jun 24, 2015 at 03:54:57PM +0100, Marc Zyngier wrote:
> Userspace is allowed to set the guest's view of CNTVCT, which turns
> into setting CNTVOFF for the whole VM. One thing userspace is not supposed
> to do is to update that register while the guest is running. Time will
> either move forward (best case) or backward (really bad idea). Either way,
> this shouldn't happen.
> 
> This patch prevents userspace from touching CNTVOFF as soon as a vcpu
> has been started. This ensures that time will keep monotonically increase.
> 
> Signed-off-by: Marc Zyngier 
> ---
> 
> QEMU seems to trigger this at boot time, and I have no idea why it does so.
> It would be good to find out, hence the RFC tag.

Is this at kernel boot time you see this, or at system startup time?

IIRC, QEMU creates a throw-away VM with the default CPU target time,
reads out all the system registers to get the KVM reset values of those,
then creates the real VM, and feeds back in all the system register
reset values, as a method for QEMU and KVM to be in sync about the reset
state of the machine.  If we do this, and include CNTVCT, then that
would probably trigger this, but the VCPU really shouldn't have been run
at that time...

We should prevent userspace from fiddling with this register post VCPU
start regardless, but yes, it would be good to find out why this is
happening in the first place.

How did you notice this and does it manifest itself in some user-visible
ugliness?

Thanks,
-Christoffer

> 
>  virt/kvm/arm/arch_timer.c | 21 +++--
>  1 file changed, 19 insertions(+), 2 deletions(-)
> 
> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> index 98c95f2..660f348 100644
> --- a/virt/kvm/arm/arch_timer.c
> +++ b/virt/kvm/arm/arch_timer.c
> @@ -220,9 +220,26 @@ int kvm_arm_timer_set_reg(struct kvm_vcpu *vcpu, u64 
> regid, u64 value)
>   case KVM_REG_ARM_TIMER_CTL:
>   timer->cntv_ctl = value;
>   break;
> - case KVM_REG_ARM_TIMER_CNT:
> - vcpu->kvm->arch.timer.cntvoff = kvm_phys_timer_read() - value;
> + case KVM_REG_ARM_TIMER_CNT: {
> + struct kvm_vcpu *tmp;
> + int i;
> + u64 cntvoff = kvm_phys_timer_read() - value;
> +
> + /*
> +  * If any of the vcpus has started, don't update
> +  * CNTVOFF. Userspace is severely brain damaged.
> +  */
> + kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
> + if (tmp->arch.has_run_once) {
> + kvm_debug("Won't set CNTVOFF to %llx (%llx)\n",
> +   cntvoff,
> +   vcpu->kvm->arch.timer.cntvoff);
> + return -1;
> + }
> + }
> + vcpu->kvm->arch.timer.cntvoff = cntvoff;
>   break;
> + }
>   case KVM_REG_ARM_TIMER_CVAL:
>   timer->cntv_cval = value;
>   break;
> -- 
> 2.1.4
> 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v6 10/12] KVM: arm64: guest debug, HW assisted debug support

2015-06-25 Thread Christoffer Dall
On Thu, Jun 25, 2015 at 07:38:33AM +0100, Alex Bennée wrote:
> 
> Christoffer Dall  writes:
> 
> > On Fri, Jun 19, 2015 at 01:23:48PM +0100, Alex Bennée wrote:
> >> This adds support for userspace to control the HW debug registers for
> >> guest debug. In the debug ioctl we copy the IMPDEF defined number of
> >
> > s/defined//
> >
> >> registers into a new register set called host_debug_state. There is now
> >> a new vcpu parameter called debug_ptr which selects which register set
> >> is to copied into the real registers when world switch occurs.
> >
> > But this patch doesn't seem to add the debug_ptr field?
> 
> Oops, yes the comment belongs to the previous patch.
> 
> >
> > s/to//
> >
> >> 
> >> I've moved some helper functions into the hw_breakpoint.h header for
> >> re-use.
> >> 
> >> As with single step we need to tweak the guest registers to enable the
> >> exceptions so we need to save and restore those bits.
> >> 
> >> Two new capabilities have been added to the KVM_EXTENSION ioctl to allow
> >> userspace to query the number of hardware break and watch points
> >> available on the host hardware.
> >> 
> >> Signed-off-by: Alex Bennée 
> >> 
> >> ---
> >> v2
> >>- switched to C setup
> >>- replace host debug registers directly into context
> >>- minor tweak to api docs
> >>- setup right register for debug
> >>- add FAR_EL2 to debug exit structure
> >>- add support for trapping debug register access
> >> v3
> >>- remove stray trace statement
> >>- fix spacing around operators (various)
> >>- clean-up usage of trap_debug
> >>- introduce debug_ptr, replace excessive memcpy stuff
> >>- don't use memcpy in ioctl, just assign
> >>- update cap ioctl documentation
> >>- reword a number comments
> >>- rename host_debug_state->external_debug_state
> >> v4
> >>- use the new u32/u64 split debug_ptr approach
> >>- fix some wording/comments
> >> v5
> >>- don't set MDSCR_EL1.KDE (not needed)
> >> v6
> >>- update wording given change in commentary
> >>- KVM_GUESTDBG_USE_HW_BP->KVM_GUESTDBG_USE_HW
> >> ---
> >>  Documentation/virtual/kvm/api.txt  |  7 ++-
> >>  arch/arm/kvm/arm.c |  7 +++
> >>  arch/arm64/include/asm/hw_breakpoint.h | 12 +++
> >>  arch/arm64/include/asm/kvm_host.h  |  6 +-
> >>  arch/arm64/kernel/hw_breakpoint.c  | 12 ---
> >>  arch/arm64/kvm/debug.c | 37 
> >> +-
> >>  arch/arm64/kvm/handle_exit.c   |  6 ++
> >>  arch/arm64/kvm/reset.c | 12 +++
> >>  include/uapi/linux/kvm.h   |  2 ++
> >>  9 files changed, 82 insertions(+), 19 deletions(-)
> >> 
> >> diff --git a/Documentation/virtual/kvm/api.txt 
> >> b/Documentation/virtual/kvm/api.txt
> >> index 33c8143..ada57df 100644
> >> --- a/Documentation/virtual/kvm/api.txt
> >> +++ b/Documentation/virtual/kvm/api.txt
> >> @@ -2668,7 +2668,7 @@ The top 16 bits of the control field are 
> >> architecture specific control
> >>  flags which can include the following:
> >>  
> >>- KVM_GUESTDBG_USE_SW_BP: using software breakpoints [x86, arm64]
> >> -  - KVM_GUESTDBG_USE_HW_BP: using hardware breakpoints [x86, s390]
> >> +  - KVM_GUESTDBG_USE_HW_BP: using hardware breakpoints [x86, s390, 
> >> arm64]
> >>- KVM_GUESTDBG_INJECT_DB: inject DB type exception [x86]
> >>- KVM_GUESTDBG_INJECT_BP: inject BP type exception [x86]
> >>- KVM_GUESTDBG_EXIT_PENDING:  trigger an immediate guest exit [s390]
> >> @@ -2683,6 +2683,11 @@ updated to the correct (supplied) values.
> >>  The second part of the structure is architecture specific and
> >>  typically contains a set of debug registers.
> >>  
> >> +For arm64 the number of debug registers is implementation defined and
> >> +can be determined by querying the KVM_CAP_GUEST_DEBUG_HW_BPS and
> >> +KVM_CAP_GUEST_DEBUG_HW_WPS capabilities which return a positive number
> >> +indicating the number of supported registers.
> >> +
> >>  When debug events exit the main run loop with the reason
> >>  KVM_EXIT_DEBUG with the kvm_debug_exit_arch part of the kvm_run
> >>  structure containing architecture specific debug information.
> >> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> >> index 0d17c7b..60c4045 100644
> >> --- a/arch/arm/kvm/arm.c
> >> +++ b/arch/arm/kvm/arm.c
> >> @@ -307,6 +307,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
> >>  
> >>  #define KVM_GUESTDBG_VALID_MASK (KVM_GUESTDBG_ENABLE |\
> >>KVM_GUESTDBG_USE_SW_BP | \
> >> +  KVM_GUESTDBG_USE_HW | \
> >>KVM_GUESTDBG_SINGLESTEP)
> >>  
> >>  /**
> >> @@ -327,6 +328,12 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct 
> >> kvm_vcpu *vcpu,
> >>  
> >>if (dbg->control & KVM_GUESTDBG_ENABLE) {
> >>vcpu->guest_debug = dbg->control;
> >> +
> >> +  /* Hardware assisted Break and Watch points */
> >> +

Re: [PATCH v6 09/12] KVM: arm64: introduce vcpu->arch.debug_ptr

2015-06-25 Thread Christoffer Dall
On Thu, Jun 25, 2015 at 07:32:27AM +0100, Alex Bennée wrote:
> 
> Christoffer Dall  writes:
> 
> > On Fri, Jun 19, 2015 at 01:23:47PM +0100, Alex Bennée wrote:
> >> This introduces a level of indirection for the debug registers. Instead
> >> of using the sys_regs[] directly we store registers in a structure in
> >> the vcpu. As we are no longer tied to the layout of the sys_regs[] we
> >> can make the copies size appropriate for control and value registers.
> >> 
> >> This also entails updating the sys_regs code to access this new
> >> structure. Instead of passing a register index we now pass an offset
> >> into the kvm_guest_debug_arch structure.
> >> 
> >> We also need to ensure the GET/SET_ONE_REG ioctl operations store the
> >> registers in their correct location.
> >> 
> >> Signed-off-by: Alex Bennée 
> >> 
> >> ---
> >> v6:
> >>   - fix up some ws issues
> >>   - correct clobber info
> >>   - re-word commentary in kvm_host.h
> >>   - fix endian access issues for aarch32 fields
> >>   - revert all KVM_GET/SET_ONE_REG to 64bit (also see ABI update)
> >> ---
> 
> >>  
> >> +/* Used when AArch32 kernels trap to mapped debug registers */
> >> +static inline bool trap_debug32(struct kvm_vcpu *vcpu,
> >> +  const struct sys_reg_params *p,
> >> +  const struct sys_reg_desc *rd)
> >> +{
> >> +  __u32 *r = (__u32 *) ((void * )&vcpu->arch.vcpu_debug_state + rd->reg);
> >
> > This still looks like something that's asking for BE trouble.  Why not
> > access the register as a __u64 as it is and then only special-case it
> > somehow for the XVR thingy...  Perhaps a separate function, see below.
> >
> >> +  if (p->is_write) {
> >> +  *r = *vcpu_reg(vcpu, p->Rt);
> >> +  vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
> >> +  } else {
> >> +  *vcpu_reg(vcpu, p->Rt) = *r;
> >> +  }
> >> +
> >> +  return true;
> >> +}
> >> +
> >> +static inline bool trap_debug64(struct kvm_vcpu *vcpu,
> >> +  const struct sys_reg_params *p,
> >> +  const struct sys_reg_desc *rd)
> >> +{
> >> +  __u64 *r = (__u64 *) ((void * )&vcpu->arch.vcpu_debug_state + rd->reg);
> >> +  if (p->is_write) {
> >> +  *r = *vcpu_reg(vcpu, p->Rt);
> >> +  vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
> >> +  } else {
> >> +  *vcpu_reg(vcpu, p->Rt) = *r;
> >> +  }
> >> +
> >> +  return true;
> >> +}
> >> +
> >> +static inline void reset_debug64(struct kvm_vcpu *vcpu, const struct 
> >> sys_reg_desc *rd)
> >> +{
> >> +  __u64 *r = (__u64 *) ((void * )&vcpu->arch.vcpu_debug_state + rd->reg);
> >> +  *r = rd->val;
> >> +}
> >> +
> >>  static void reset_amair_el1(struct kvm_vcpu *vcpu, const struct 
> >> sys_reg_desc *r)
> >>  {
> >>u64 amair;
> >> @@ -240,16 +277,20 @@ static void reset_mpidr(struct kvm_vcpu *vcpu, const 
> >> struct sys_reg_desc *r)
> >>  #define DBG_BCR_BVR_WCR_WVR_EL1(n)
> >> \
> >>/* DBGBVRn_EL1 */   \
> >>{ Op0(0b10), Op1(0b000), CRn(0b), CRm((n)), Op2(0b100), \
> >> -trap_debug_regs, reset_val, (DBGBVR0_EL1 + (n)), 0 }, \
> >> +trap_debug64, reset_debug64,  \
> >> +offsetof(struct kvm_guest_debug_arch, dbg_bvr[(n)]), 0 }, \
> >>/* DBGBCRn_EL1 */   \
> >>{ Op0(0b10), Op1(0b000), CRn(0b), CRm((n)), Op2(0b101), \
> >> -trap_debug_regs, reset_val, (DBGBCR0_EL1 + (n)), 0 }, \
> >> +trap_debug64, reset_debug64,  \
> >> +offsetof(struct kvm_guest_debug_arch, dbg_bcr[(n)]), 0},  \
> >>/* DBGWVRn_EL1 */   \
> >>{ Op0(0b10), Op1(0b000), CRn(0b), CRm((n)), Op2(0b110), \
> >> -trap_debug_regs, reset_val, (DBGWVR0_EL1 + (n)), 0 }, \
> >> +trap_debug64, reset_debug64,  \
> >> +offsetof(struct kvm_guest_debug_arch, dbg_wvr[(n)]), 0 }, \
> >>/* DBGWCRn_EL1 */   \
> >>{ Op0(0b10), Op1(0b000), CRn(0b), CRm((n)), Op2(0b111), \
> >> -trap_debug_regs, reset_val, (DBGWCR0_EL1 + (n)), 0 }
> >> +trap_debug64, reset_debug64,  \
> >> +offsetof(struct kvm_guest_debug_arch, dbg_wcr[(n)]), 0}
> >>  
> >>  /*
> >>   * Architected system registers.
> >> @@ -502,42 +543,51 @@ static bool trap_dbgidr(struct kvm_vcpu *vcpu,
> >>}
> >>  }
> >>  
> >> -static bool trap_debug32(struct kvm_vcpu *vcpu,
> >> -   const struct sys_reg_params *p,
> >> -   const struct sys_reg_desc *r)
> >> -{
> >> -  if (p->is_write) {
> >> -  vcpu_cp14(vcpu, r->reg) = *vcpu_reg(vcpu, p->Rt);
> >> -  vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
> >> -  } else {
> >> -  *vcpu_reg(vcpu, p->Rt) = vcpu_c