Re: [PATCH 10/11] arm64/kvm: context-switch ptrauth registers

2017-08-01 Thread Christoffer Dall
On Tue, Aug 01, 2017 at 03:26:07PM +0100, Mark Rutland wrote:
> On Tue, Aug 01, 2017 at 01:00:14PM +0200, Christoffer Dall wrote:
> > On Wed, Jul 19, 2017 at 05:01:31PM +0100, Mark Rutland wrote:
> > > When pointer authentication is supported, a guest may wish to use it.
> > > This patch adds the necessary KVM infrastructure for this to work, with
> > > a semi-lazy context switch of the pointer auth state.
> > > 
> > > When we schedule a vcpu, we disable guest usage of pointer
> > > authentication instructions and accesses to the keys. While these are
> > > disabled, we avoid context-switching the keys. When we trap the guest
> > > trying to use pointer authentication functionality, we change to eagerly
> > > context-switching the keys, and enable the feature. The next time the
> > > vcpu is scheduled out/in, we start again.
> > > 
> > > This is sufficient for systems with uniform pointer authentication
> > > support. For systems with mismatched support, it will be necessary to
> > 
> > What is mismatched support?  You mean systems where one CPU has ptrauth
> > and another one doesn't (or if they both have it but in different ways)?
> 
> Both! Any case where the support is not uniform across all CPUs.
> 
> A CPU can implement address auth and/or generic auth, and either may use
> an architected algorithm or an IMP DEF algorithm.
> 
> Even if all CPUs report an IMP DEF algorithm, the particular algorithm
> may differ across CPUs.
> 
> > > hide the feature from the guest's view of the ID registers.
> > 
> > I think the work Drew is pondering to allow userspace more control of
> > what we emulate to the guest can semi-automatically take care of this as
> > well.
> 
> I'll take a look.
> 
> [...]
> 
> > > +static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu)
> > > +{
> > > + kvm_arm_vcpu_ptrauth_disable(vcpu);
> > > +}
> > > +
> > 
> > why sched_in and not vcpu_load?  (vcpu_load happens whenever you're
> > returning from userspace for example, and not only when you've been
> > scheduled away and are coming back).
> 
> I think this is the result of going searching for similar lazy context
> switching, and stumbling on some (fairly different) code on x86.
> 
> It looks like I can use vcpu_load() instead, so I'll give that a shot
> for v2.
> 
> > And why do we want to 'reset' the behavior of KVM when the host
> > schedules a VCPU thread?
> > 
> > If I understand the logic correctly, what you're establishing with the
> > appraoch of initially trapping use of ptrauth is to avoid
> > saving/restoring the state if the guest dosn't use ptrauth, but then if
> > the guest starts using it, it's likely to keep using it, and therefore
> > we start saving/restoring the registers.
> 
> Yes.
> 
> > Why is the host's decision to schedule something else on this physical
> > CPU a good indication that we should no longer save/restore these
> > registers for the VCPU?
> 
> I guess it's not.
> 
> Initially, I was followed the same approach we take for fpsimd, leaving
> the feature trapped until we take a shallow trap to hyp. Handing all the
> sysreg traps in hyp was unwieldy, so I moved that down to the kernel
> proper. That meant I couldn't enable the trap when switching from host
> to guest, and doing so in the regular context switch seemed like the
> next best option.
> 
> > Wouldn't it make more sense to have a flag on the VCPU, and
> > potentially a counter, so that if you switch X times without ever
> > touching the registers again, you can stop saving/restoring the state,
> > if that's even something we care about?
> 
> Something like that could make more sense. It should be fairly simple to
> implement a decaying counter to determine when to trap.
> 
> I'd steered clear of optimising the lazy heuristic as I'm testing with
> models, and I don't have numbers that would be representative of real
> HW.
> 
> > Another thing that comes to mind; does the kernel running a KVM's VCPU
> > thread (and handling softirqs, ISRs, etc.) ever use the ptrauth system,
> > or does that only happen when we go to userspace? 
> 
> Today, only userspace uses pointer auth, and the kernel does not.
> However, in-kernel usage is on the cards.
> 
> > If the latter, we could handle the save/restore entirely in
> > vcpu_put/vcpu_load instead. I don't mind picking up that bit as part
> > of my ongoing optimization work later though, if you're eager to get
> > these patches merged.
> 
> I'd avoided that so far, since it would be undone when in-kernel usage
> is implemented. If you prefer, I can implement that for now.
> 
> [...]
> 

I think we should just do a simple flag for now, and once we have
hardware and can measure things, we can add more advanced support like a
decaying counter or a doing this at vcpu_put/vcpu_load instead.

I can then deal with the headache of making sure this performs well on
systems that don't have the hardware support once things are merged,
because I'm looking at that already.

> > > +/*
> > > + * Handle the 

Re: [RFC PATCH v2 11/19] genirq: Document vcpu_info usage for per-CPU interrupts

2017-08-01 Thread Christoffer Dall
On Tue, Aug 01, 2017 at 05:15:39PM +0100, Marc Zyngier wrote:
> On 17/07/17 15:27, Christoffer Dall wrote:
> > It is currently unclear how to set the VCPU affinity for an interrupt
> > which is of the per-CPU kind, since the Linux irq_data structure
> 
> very minor nit (which I should have noticed before, my bad): we have two
> kinds of per-CPU interrupts in the kernel. percpu, and percpu_devid. For
> the percpu variety, the IRQ number represents a single interrupt line,
> delivered to a given CPU (and non migrate-able). The percpu_devid
> variant represents multiple interrupts, each assigned to one CPU.
> 
> It'd be nice if the commit message (and the comment below) reflected
> this distinction.
> 
> > describes the state for multiple interrupts when the interrupt is a
> > per-CPU interrupt, one for each physical CPU on the system.  Since each
> > such interrupt can be associated with different VCPUs or none at all,
> > associating a single VCPU state with such an interrupt does not capture
> > the necessary semantics.
> > 
> > The implementers of irq_set_affinity are the Intel and AMD IOMMUs, and
> > the ARM GIC irqchip.  The Intel and AMD callers do not appear to use
> > per-CPU interrupts, and the ARM GIC implementation only checks the
> > pointer against NULL vs. non-NULL.
> > 
> > Therefore, simply update the function documentation to explain the
> > expected use in the context of per-CPU interrupts, allowing future
> > changes or additions to irqchip implementers to do the right thing.
> > 
> > This allows us to set the VCPU affinity for the virtual timer interrupt
> > in KVM/ARM, which is a per-CPU (PPI) interrupt.
> > 
> > Cc: Thomas Gleixner 
> > Cc: Marc Zyngier 
> > Signed-off-by: Christoffer Dall 
> > ---
> >  kernel/irq/manage.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
> > index 5624b2d..050b9f6 100644
> > --- a/kernel/irq/manage.c
> > +++ b/kernel/irq/manage.c
> > @@ -381,7 +381,8 @@ int irq_select_affinity_usr(unsigned int irq)
> >  /**
> >   * irq_set_vcpu_affinity - Set vcpu affinity for the interrupt
> >   * @irq: interrupt number to set affinity
> > - * @vcpu_info: vCPU specific data
> > + * @vcpu_info: vCPU specific data or pointer to a percpu array of vCPU
> > + * specific data for per-CPU interrupts
> >   *
> >   * This function uses the vCPU specific data to set the vCPU
> >   * affinity for an irq. The vCPU specific data is passed from
> > 
> 
> Otherwise a worthy clarification of the feature.
> 
Thanks, I'll try to clarify based on the info below.

Linux IRQ abstractions; I know nothing.

-Christoffer
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [RFC PATCH v2 11/19] genirq: Document vcpu_info usage for per-CPU interrupts

2017-08-01 Thread Marc Zyngier
On 17/07/17 15:27, Christoffer Dall wrote:
> It is currently unclear how to set the VCPU affinity for an interrupt
> which is of the per-CPU kind, since the Linux irq_data structure

very minor nit (which I should have noticed before, my bad): we have two
kinds of per-CPU interrupts in the kernel. percpu, and percpu_devid. For
the percpu variety, the IRQ number represents a single interrupt line,
delivered to a given CPU (and non migrate-able). The percpu_devid
variant represents multiple interrupts, each assigned to one CPU.

It'd be nice if the commit message (and the comment below) reflected
this distinction.

> describes the state for multiple interrupts when the interrupt is a
> per-CPU interrupt, one for each physical CPU on the system.  Since each
> such interrupt can be associated with different VCPUs or none at all,
> associating a single VCPU state with such an interrupt does not capture
> the necessary semantics.
> 
> The implementers of irq_set_affinity are the Intel and AMD IOMMUs, and
> the ARM GIC irqchip.  The Intel and AMD callers do not appear to use
> per-CPU interrupts, and the ARM GIC implementation only checks the
> pointer against NULL vs. non-NULL.
> 
> Therefore, simply update the function documentation to explain the
> expected use in the context of per-CPU interrupts, allowing future
> changes or additions to irqchip implementers to do the right thing.
> 
> This allows us to set the VCPU affinity for the virtual timer interrupt
> in KVM/ARM, which is a per-CPU (PPI) interrupt.
> 
> Cc: Thomas Gleixner 
> Cc: Marc Zyngier 
> Signed-off-by: Christoffer Dall 
> ---
>  kernel/irq/manage.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
> index 5624b2d..050b9f6 100644
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -381,7 +381,8 @@ int irq_select_affinity_usr(unsigned int irq)
>  /**
>   *   irq_set_vcpu_affinity - Set vcpu affinity for the interrupt
>   *   @irq: interrupt number to set affinity
> - *   @vcpu_info: vCPU specific data
> + *   @vcpu_info: vCPU specific data or pointer to a percpu array of vCPU
> + *   specific data for per-CPU interrupts
>   *
>   *   This function uses the vCPU specific data to set the vCPU
>   *   affinity for an irq. The vCPU specific data is passed from
> 

Otherwise a worthy clarification of the feature.

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [RFC PATCH v2 06/19] KVM: arm/arm64: Make timer_arm and timer_disarm helpers more generic

2017-08-01 Thread Marc Zyngier
On 01/08/17 15:57, Christoffer Dall wrote:
> On Tue, Aug 01, 2017 at 03:10:59PM +0100, Marc Zyngier wrote:
>> On 17/07/17 15:27, Christoffer Dall wrote:
>>> We are about to add an additional soft timer to the arch timer state for
>>> a VCPU and would like to be able to reuse the functions to program and
>>> cancel a timer, so we make them slightly more generic and rename to make
>>> it more clear that these functions work on soft timers and not the
>>> hardware resource that this code is managing.
>>>
>>> Signed-off-by: Christoffer Dall 
>>> ---
>>>  virt/kvm/arm/arch_timer.c | 33 -
>>>  1 file changed, 16 insertions(+), 17 deletions(-)
>>>
>>> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
>>> index 8e89d63..871d8ae 100644
>>> --- a/virt/kvm/arm/arch_timer.c
>>> +++ b/virt/kvm/arm/arch_timer.c
>>> @@ -56,26 +56,22 @@ u64 kvm_phys_timer_read(void)
>>> return timecounter->cc->read(timecounter->cc);
>>>  }
>>>  
>>> -static bool timer_is_armed(struct arch_timer_cpu *timer)
>>> +static bool soft_timer_is_armed(struct arch_timer_cpu *timer)
>>>  {
>>> return timer->armed;
>>>  }
>>>  
>>> -/* timer_arm: as in "arm the timer", not as in ARM the company */
>>> -static void timer_arm(struct arch_timer_cpu *timer, u64 ns)
>>> +static void soft_timer_start(struct hrtimer *hrt, u64 ns)
>>
>> I find it a bit confusing that the soft_timer_* functions operate on
>> different object types: arch_timer_cpu for soft_timer_is_armed(), and
>> hrtimer for soft_timer_start. Is there anything that prevents us from
>> keeping arch_timer_cpu for all of them?
>>
> 
> The problem is now we have two timers for a VCPU, one which is shared
> between the two timer contexts (physical and virtual) used when the
> process goes to sleep with a programmed timer, and one used to exit the
> guest for the physical timer emulation.  So I either need to provide
> some sideband information or use the hrtimer pointer.

Yes, I came to the same conclusion at while reading patch 8.

> I tried to address this in later patches by renaming timer_is_armed() to
> bg_timer_is_armed(), but I'm open to better suggestions.

Bah, never mind.

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [RFC PATCH v2 04/38] KVM: arm/arm64: Check if nested virtualization is in use

2017-08-01 Thread Christoffer Dall
On Tue, Aug 01, 2017 at 10:07:40AM -0400, Jintack Lim wrote:
> On Sun, Jul 30, 2017 at 3:59 PM, Christoffer Dall  wrote:
> > On Tue, Jul 18, 2017 at 11:58:30AM -0500, Jintack Lim wrote:
> >> Nested virtualizaion is in use only if all three conditions are met:
> >> - The architecture supports nested virtualization.
> >> - The kernel parameter is set.
> >> - The userspace uses nested virtualiztion feature.
> >>
> >> Signed-off-by: Jintack Lim 
> >> ---
> >>  arch/arm/include/asm/kvm_host.h   | 11 +++
> >>  arch/arm64/include/asm/kvm_host.h |  2 ++
> >>  arch/arm64/kvm/nested.c   | 17 +
> >>  virt/kvm/arm/arm.c|  4 
> >>  4 files changed, 34 insertions(+)
> >>
> >> diff --git a/arch/arm/include/asm/kvm_host.h 
> >> b/arch/arm/include/asm/kvm_host.h
> >> index 00b0f97..7e9e6c8 100644
> >> --- a/arch/arm/include/asm/kvm_host.h
> >> +++ b/arch/arm/include/asm/kvm_host.h
> >> @@ -303,4 +303,15 @@ static inline int __init kvmarm_nested_cfg(char *buf)
> >>  {
> >>   return 0;
> >>  }
> >> +
> >> +static inline int init_nested_virt(void)
> >> +{
> >> + return 0;
> >> +}
> >> +
> >> +static inline bool nested_virt_in_use(struct kvm_vcpu *vcpu)
> >> +{
> >> + return false;
> >> +}
> >> +
> >>  #endif /* __ARM_KVM_HOST_H__ */
> >> diff --git a/arch/arm64/include/asm/kvm_host.h 
> >> b/arch/arm64/include/asm/kvm_host.h
> >> index 6df0c7c..86d4b6c 100644
> >> --- a/arch/arm64/include/asm/kvm_host.h
> >> +++ b/arch/arm64/include/asm/kvm_host.h
> >> @@ -387,5 +387,7 @@ static inline void __cpu_init_stage2(void)
> >>  }
> >>
> >>  int __init kvmarm_nested_cfg(char *buf);
> >> +int init_nested_virt(void);
> >> +bool nested_virt_in_use(struct kvm_vcpu *vcpu);
> >>
> >>  #endif /* __ARM64_KVM_HOST_H__ */
> >> diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
> >> index 79f38da..9a05c76 100644
> >> --- a/arch/arm64/kvm/nested.c
> >> +++ b/arch/arm64/kvm/nested.c
> >> @@ -24,3 +24,20 @@ int __init kvmarm_nested_cfg(char *buf)
> >>  {
> >>   return strtobool(buf, _param);
> >>  }
> >> +
> >> +int init_nested_virt(void)
> >> +{
> >> + if (nested_param && cpus_have_const_cap(ARM64_HAS_NESTED_VIRT))
> >> + kvm_info("Nested virtualization is supported\n");
> >> +
> >> + return 0;
> >> +}
> >> +
> >> +bool nested_virt_in_use(struct kvm_vcpu *vcpu)
> >> +{
> >> + if (nested_param && cpus_have_const_cap(ARM64_HAS_NESTED_VIRT)
> >> + && test_bit(KVM_ARM_VCPU_NESTED_VIRT, vcpu->arch.features))
> >> + return true;
> >
> > you could initialize a bool in init_nested_virt which you then check
> > here to avoid duplicating the logic.
> 
> I can make a bool to check the kernel param and the capability. The
> third one is per VM given by the userspace, so we don't know it when
> we initialize the host hypervisor. We can potentially have a bool in
> kvm_vcpu_arch or kvm_arch to cache the whole three conditions, if that
> sounds ok.
> 

Yes, that sounds good to me.

Thanks,
-Christoffer
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [RFC PATCH v2 06/19] KVM: arm/arm64: Make timer_arm and timer_disarm helpers more generic

2017-08-01 Thread Christoffer Dall
On Tue, Aug 01, 2017 at 03:10:59PM +0100, Marc Zyngier wrote:
> On 17/07/17 15:27, Christoffer Dall wrote:
> > We are about to add an additional soft timer to the arch timer state for
> > a VCPU and would like to be able to reuse the functions to program and
> > cancel a timer, so we make them slightly more generic and rename to make
> > it more clear that these functions work on soft timers and not the
> > hardware resource that this code is managing.
> > 
> > Signed-off-by: Christoffer Dall 
> > ---
> >  virt/kvm/arm/arch_timer.c | 33 -
> >  1 file changed, 16 insertions(+), 17 deletions(-)
> > 
> > diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> > index 8e89d63..871d8ae 100644
> > --- a/virt/kvm/arm/arch_timer.c
> > +++ b/virt/kvm/arm/arch_timer.c
> > @@ -56,26 +56,22 @@ u64 kvm_phys_timer_read(void)
> > return timecounter->cc->read(timecounter->cc);
> >  }
> >  
> > -static bool timer_is_armed(struct arch_timer_cpu *timer)
> > +static bool soft_timer_is_armed(struct arch_timer_cpu *timer)
> >  {
> > return timer->armed;
> >  }
> >  
> > -/* timer_arm: as in "arm the timer", not as in ARM the company */
> > -static void timer_arm(struct arch_timer_cpu *timer, u64 ns)
> > +static void soft_timer_start(struct hrtimer *hrt, u64 ns)
> 
> I find it a bit confusing that the soft_timer_* functions operate on
> different object types: arch_timer_cpu for soft_timer_is_armed(), and
> hrtimer for soft_timer_start. Is there anything that prevents us from
> keeping arch_timer_cpu for all of them?
> 

The problem is now we have two timers for a VCPU, one which is shared
between the two timer contexts (physical and virtual) used when the
process goes to sleep with a programmed timer, and one used to exit the
guest for the physical timer emulation.  So I either need to provide
some sideband information or use the hrtimer pointer.

I tried to address this in later patches by renaming timer_is_armed() to
bg_timer_is_armed(), but I'm open to better suggestions.


> >  {
> > -   timer->armed = true;
> > -   hrtimer_start(>timer, ktime_add_ns(ktime_get(), ns),
> > +   hrtimer_start(hrt, ktime_add_ns(ktime_get(), ns),
> >   HRTIMER_MODE_ABS);
> >  }
> >  
> > -static void timer_disarm(struct arch_timer_cpu *timer)
> > +static void soft_timer_cancel(struct hrtimer *hrt, struct work_struct 
> > *work)
> >  {
> > -   if (timer_is_armed(timer)) {
> > -   hrtimer_cancel(>timer);
> > -   cancel_work_sync(>expired);
> > -   timer->armed = false;
> > -   }
> > +   hrtimer_cancel(hrt);
> > +   if (work)
> > +   cancel_work_sync(work);
> >  }
> >  
> >  static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
> > @@ -271,7 +267,7 @@ static void kvm_timer_emulate(struct kvm_vcpu *vcpu,
> > return;
> >  
> > /*  The timer has not yet expired, schedule a background timer */
> > -   timer_arm(timer, kvm_timer_compute_delta(timer_ctx));
> > +   soft_timer_start(>timer, kvm_timer_compute_delta(timer_ctx));
> >  }
> >  
> >  /*
> > @@ -285,7 +281,7 @@ void kvm_timer_schedule(struct kvm_vcpu *vcpu)
> > struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
> > struct arch_timer_context *ptimer = vcpu_ptimer(vcpu);
> >  
> > -   BUG_ON(timer_is_armed(timer));
> > +   BUG_ON(soft_timer_is_armed(timer));
> >  
> > /*
> >  * No need to schedule a background timer if any guest timer has
> > @@ -306,13 +302,16 @@ void kvm_timer_schedule(struct kvm_vcpu *vcpu)
> >  * The guest timers have not yet expired, schedule a background timer.
> >  * Set the earliest expiration time among the guest timers.
> >  */
> > -   timer_arm(timer, kvm_timer_earliest_exp(vcpu));
> > +   timer->armed = true;
> > +   soft_timer_start(>timer, kvm_timer_earliest_exp(vcpu));
> >  }
> >  
> >  void kvm_timer_unschedule(struct kvm_vcpu *vcpu)
> >  {
> > struct arch_timer_cpu *timer = >arch.timer_cpu;
> > -   timer_disarm(timer);
> > +
> > +   soft_timer_cancel(>timer, >expired);
> > +   timer->armed = false;
> >  }
> >  
> >  static void kvm_timer_flush_hwstate_vgic(struct kvm_vcpu *vcpu)
> > @@ -448,7 +447,7 @@ void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu)
> >  * This is to cancel the background timer for the physical timer
> >  * emulation if it is set.
> >  */
> > -   timer_disarm(timer);
> > +   soft_timer_cancel(>timer, >expired);
> >  
> > /*
> >  * The guest could have modified the timer registers or the timer
> > @@ -615,7 +614,7 @@ void kvm_timer_vcpu_terminate(struct kvm_vcpu *vcpu)
> > struct arch_timer_cpu *timer = >arch.timer_cpu;
> > struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
> >  
> > -   timer_disarm(timer);
> > +   soft_timer_cancel(>timer, >expired);
> > kvm_vgic_unmap_phys_irq(vcpu, vtimer->irq.irq);
> >  }
> >  
> > 
> 
> Otherwise looks good to me.
> 
Thanks!
-Christoffer
___
kvmarm 

Re: [PATCH 10/11] arm64/kvm: context-switch ptrauth registers

2017-08-01 Thread Will Deacon
On Tue, Aug 01, 2017 at 03:26:07PM +0100, Mark Rutland wrote:
> On Tue, Aug 01, 2017 at 01:00:14PM +0200, Christoffer Dall wrote:
> > On Wed, Jul 19, 2017 at 05:01:31PM +0100, Mark Rutland wrote:
> > > When pointer authentication is supported, a guest may wish to use it.
> > > This patch adds the necessary KVM infrastructure for this to work, with
> > > a semi-lazy context switch of the pointer auth state.
> > > 
> > > When we schedule a vcpu, we disable guest usage of pointer
> > > authentication instructions and accesses to the keys. While these are
> > > disabled, we avoid context-switching the keys. When we trap the guest
> > > trying to use pointer authentication functionality, we change to eagerly
> > > context-switching the keys, and enable the feature. The next time the
> > > vcpu is scheduled out/in, we start again.
> > > 
> > > This is sufficient for systems with uniform pointer authentication
> > > support. For systems with mismatched support, it will be necessary to
> > 
> > What is mismatched support?  You mean systems where one CPU has ptrauth
> > and another one doesn't (or if they both have it but in different ways)?
> 
> Both! Any case where the support is not uniform across all CPUs.
> 
> A CPU can implement address auth and/or generic auth, and either may use
> an architected algorithm or an IMP DEF algorithm.
> 
> Even if all CPUs report an IMP DEF algorithm, the particular algorithm
> may differ across CPUs.

I know you don't like it, but I think we should resort to MIDR at that point
because otherwise IMP DEF algorithms will never be used by Linux and people
will complain.

Will
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH 10/11] arm64/kvm: context-switch ptrauth registers

2017-08-01 Thread Mark Rutland
On Tue, Aug 01, 2017 at 01:00:14PM +0200, Christoffer Dall wrote:
> On Wed, Jul 19, 2017 at 05:01:31PM +0100, Mark Rutland wrote:
> > When pointer authentication is supported, a guest may wish to use it.
> > This patch adds the necessary KVM infrastructure for this to work, with
> > a semi-lazy context switch of the pointer auth state.
> > 
> > When we schedule a vcpu, we disable guest usage of pointer
> > authentication instructions and accesses to the keys. While these are
> > disabled, we avoid context-switching the keys. When we trap the guest
> > trying to use pointer authentication functionality, we change to eagerly
> > context-switching the keys, and enable the feature. The next time the
> > vcpu is scheduled out/in, we start again.
> > 
> > This is sufficient for systems with uniform pointer authentication
> > support. For systems with mismatched support, it will be necessary to
> 
> What is mismatched support?  You mean systems where one CPU has ptrauth
> and another one doesn't (or if they both have it but in different ways)?

Both! Any case where the support is not uniform across all CPUs.

A CPU can implement address auth and/or generic auth, and either may use
an architected algorithm or an IMP DEF algorithm.

Even if all CPUs report an IMP DEF algorithm, the particular algorithm
may differ across CPUs.

> > hide the feature from the guest's view of the ID registers.
> 
> I think the work Drew is pondering to allow userspace more control of
> what we emulate to the guest can semi-automatically take care of this as
> well.

I'll take a look.

[...]

> > +static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu)
> > +{
> > +   kvm_arm_vcpu_ptrauth_disable(vcpu);
> > +}
> > +
> 
> why sched_in and not vcpu_load?  (vcpu_load happens whenever you're
> returning from userspace for example, and not only when you've been
> scheduled away and are coming back).

I think this is the result of going searching for similar lazy context
switching, and stumbling on some (fairly different) code on x86.

It looks like I can use vcpu_load() instead, so I'll give that a shot
for v2.

> And why do we want to 'reset' the behavior of KVM when the host
> schedules a VCPU thread?
> 
> If I understand the logic correctly, what you're establishing with the
> appraoch of initially trapping use of ptrauth is to avoid
> saving/restoring the state if the guest dosn't use ptrauth, but then if
> the guest starts using it, it's likely to keep using it, and therefore
> we start saving/restoring the registers.

Yes.

> Why is the host's decision to schedule something else on this physical
> CPU a good indication that we should no longer save/restore these
> registers for the VCPU?

I guess it's not.

Initially, I was followed the same approach we take for fpsimd, leaving
the feature trapped until we take a shallow trap to hyp. Handing all the
sysreg traps in hyp was unwieldy, so I moved that down to the kernel
proper. That meant I couldn't enable the trap when switching from host
to guest, and doing so in the regular context switch seemed like the
next best option.

> Wouldn't it make more sense to have a flag on the VCPU, and
> potentially a counter, so that if you switch X times without ever
> touching the registers again, you can stop saving/restoring the state,
> if that's even something we care about?

Something like that could make more sense. It should be fairly simple to
implement a decaying counter to determine when to trap.

I'd steered clear of optimising the lazy heuristic as I'm testing with
models, and I don't have numbers that would be representative of real
HW.

> Another thing that comes to mind; does the kernel running a KVM's VCPU
> thread (and handling softirqs, ISRs, etc.) ever use the ptrauth system,
> or does that only happen when we go to userspace? 

Today, only userspace uses pointer auth, and the kernel does not.
However, in-kernel usage is on the cards.

> If the latter, we could handle the save/restore entirely in
> vcpu_put/vcpu_load instead. I don't mind picking up that bit as part
> of my ongoing optimization work later though, if you're eager to get
> these patches merged.

I'd avoided that so far, since it would be undone when in-kernel usage
is implemented. If you prefer, I can implement that for now.

[...]

> > +/*
> > + * Handle the guest trying to use a ptrauth instruction, or trying to 
> > access a
> > + * ptrauth register.
> > + */
> > +void kvm_arm_vcpu_ptrauth_trap(struct kvm_vcpu *vcpu)
> > +{
> > +   if (cpus_have_const_cap(ARM64_HAS_ADDRESS_AUTH) &&
> > +   cpus_have_const_cap(ARM64_HAS_GENERIC_AUTH)) {
> > +   kvm_arm_vcpu_ptrauth_enable(vcpu);
> > +   } else {
> > +   kvm_inject_undefined(vcpu);
> 
> How can we trap here without having ARM64_HAS_[ADDRESS,GENERIC]_AUTH ?

We'll trap if the current CPU supports one of the two (with an
architected algorithm), but some other CPU does not (or uses an IMP DEF
algorithm). Note that we're checking that all 

Re: [RFC PATCH v2 20/38] KVM: arm64: Handle eret instruction traps

2017-08-01 Thread Jintack Lim
On Sun, Jul 30, 2017 at 4:00 PM, Christoffer Dall  wrote:
> On Tue, Jul 18, 2017 at 11:58:46AM -0500, Jintack Lim wrote:
>> When HCR.NV bit is set, eret instructions trap to EL2 with EC code 0x1A.
>> Emulate eret instructions by setting pc and pstate.
>
> It may be worth noting in the commit message that this is all we have to
> do, because the rest of the logic will then discover that the mode could
> change from virtual EL2 to EL1 and will setup the hw registers etc. when
> changing modes.

Makes sense. I'll write it up in the commit message.

>
>>
>> Note that the current exception level is always the virtual EL2, since
>> we set HCR_EL2.NV bit only when entering the virtual EL2. So, we take
>> spsr and elr states from the virtual _EL2 registers.
>>
>> Signed-off-by: Jintack Lim 
>> ---
>>  arch/arm64/include/asm/esr.h |  1 +
>>  arch/arm64/kvm/handle_exit.c | 16 
>>  arch/arm64/kvm/trace.h   | 21 +
>>  3 files changed, 38 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
>> index e7d8e28..210fde6 100644
>> --- a/arch/arm64/include/asm/esr.h
>> +++ b/arch/arm64/include/asm/esr.h
>> @@ -43,6 +43,7 @@
>>  #define ESR_ELx_EC_HVC64 (0x16)
>>  #define ESR_ELx_EC_SMC64 (0x17)
>>  #define ESR_ELx_EC_SYS64 (0x18)
>> +#define ESR_ELx_EC_ERET  (0x1A)
>>  /* Unallocated EC: 0x19 - 0x1E */
>>  #define ESR_ELx_EC_IMP_DEF   (0x1f)
>>  #define ESR_ELx_EC_IABT_LOW  (0x20)
>> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
>> index 17d8a16..9259881 100644
>> --- a/arch/arm64/kvm/handle_exit.c
>> +++ b/arch/arm64/kvm/handle_exit.c
>> @@ -147,6 +147,21 @@ static int kvm_handle_unknown_ec(struct kvm_vcpu *vcpu, 
>> struct kvm_run *run)
>>   return 1;
>>  }
>>
>> +static int kvm_handle_eret(struct kvm_vcpu *vcpu, struct kvm_run *run)
>> +{
>> + trace_kvm_nested_eret(vcpu, vcpu_el2_sreg(vcpu, ELR_EL2),
>> +   vcpu_el2_sreg(vcpu, SPSR_EL2));
>> +
>> + /*
>> +  * Note that the current exception level is always the virtual EL2,
>> +  * since we set HCR_EL2.NV bit only when entering the virtual EL2.
>> +  */
>> + *vcpu_pc(vcpu) = vcpu_el2_sreg(vcpu, ELR_EL2);
>> + *vcpu_cpsr(vcpu) = vcpu_el2_sreg(vcpu, SPSR_EL2);
>> +
>> + return 1;
>> +}
>> +
>>  static exit_handle_fn arm_exit_handlers[] = {
>>   [0 ... ESR_ELx_EC_MAX]  = kvm_handle_unknown_ec,
>>   [ESR_ELx_EC_WFx]= kvm_handle_wfx,
>> @@ -160,6 +175,7 @@ static int kvm_handle_unknown_ec(struct kvm_vcpu *vcpu, 
>> struct kvm_run *run)
>>   [ESR_ELx_EC_HVC64]  = handle_hvc,
>>   [ESR_ELx_EC_SMC64]  = handle_smc,
>>   [ESR_ELx_EC_SYS64]  = kvm_handle_sys_reg,
>> + [ESR_ELx_EC_ERET]   = kvm_handle_eret,
>>   [ESR_ELx_EC_IABT_LOW]   = kvm_handle_guest_abort,
>>   [ESR_ELx_EC_DABT_LOW]   = kvm_handle_guest_abort,
>>   [ESR_ELx_EC_SOFTSTP_LOW]= kvm_handle_guest_debug,
>> diff --git a/arch/arm64/kvm/trace.h b/arch/arm64/kvm/trace.h
>> index 7c86cfb..5f40987 100644
>> --- a/arch/arm64/kvm/trace.h
>> +++ b/arch/arm64/kvm/trace.h
>> @@ -187,6 +187,27 @@
>>   TP_printk("vcpu: %p, inject exception to vEL2: ESR_EL2 0x%lx, vector: 
>> 0x%016lx",
>> __entry->vcpu, __entry->esr_el2, __entry->pc)
>>  );
>> +
>> +TRACE_EVENT(kvm_nested_eret,
>> + TP_PROTO(struct kvm_vcpu *vcpu, unsigned long elr_el2,
>> +  unsigned long spsr_el2),
>> + TP_ARGS(vcpu, elr_el2, spsr_el2),
>> +
>> + TP_STRUCT__entry(
>> + __field(struct kvm_vcpu *,  vcpu)
>> + __field(unsigned long,  elr_el2)
>> + __field(unsigned long,  spsr_el2)
>> + ),
>> +
>> + TP_fast_assign(
>> + __entry->vcpu = vcpu;
>> + __entry->elr_el2 = elr_el2;
>> + __entry->spsr_el2 = spsr_el2;
>> + ),
>> +
>> + TP_printk("vcpu: %p, eret to elr_el2: 0x%016lx, with spsr_el2: 
>> 0x%08lx",
>> +   __entry->vcpu, __entry->elr_el2, __entry->spsr_el2)
>> +);
>>  #endif /* _TRACE_ARM64_KVM_H */
>>
>>  #undef TRACE_INCLUDE_PATH
>> --
>> 1.9.1
>>
>
> Otherwise this patch looks good.
>
> Thanks,
> -Christoffer
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [RFC PATCH v2 06/19] KVM: arm/arm64: Make timer_arm and timer_disarm helpers more generic

2017-08-01 Thread Marc Zyngier
On 17/07/17 15:27, Christoffer Dall wrote:
> We are about to add an additional soft timer to the arch timer state for
> a VCPU and would like to be able to reuse the functions to program and
> cancel a timer, so we make them slightly more generic and rename to make
> it more clear that these functions work on soft timers and not the
> hardware resource that this code is managing.
> 
> Signed-off-by: Christoffer Dall 
> ---
>  virt/kvm/arm/arch_timer.c | 33 -
>  1 file changed, 16 insertions(+), 17 deletions(-)
> 
> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> index 8e89d63..871d8ae 100644
> --- a/virt/kvm/arm/arch_timer.c
> +++ b/virt/kvm/arm/arch_timer.c
> @@ -56,26 +56,22 @@ u64 kvm_phys_timer_read(void)
>   return timecounter->cc->read(timecounter->cc);
>  }
>  
> -static bool timer_is_armed(struct arch_timer_cpu *timer)
> +static bool soft_timer_is_armed(struct arch_timer_cpu *timer)
>  {
>   return timer->armed;
>  }
>  
> -/* timer_arm: as in "arm the timer", not as in ARM the company */
> -static void timer_arm(struct arch_timer_cpu *timer, u64 ns)
> +static void soft_timer_start(struct hrtimer *hrt, u64 ns)

I find it a bit confusing that the soft_timer_* functions operate on
different object types: arch_timer_cpu for soft_timer_is_armed(), and
hrtimer for soft_timer_start. Is there anything that prevents us from
keeping arch_timer_cpu for all of them?

>  {
> - timer->armed = true;
> - hrtimer_start(>timer, ktime_add_ns(ktime_get(), ns),
> + hrtimer_start(hrt, ktime_add_ns(ktime_get(), ns),
> HRTIMER_MODE_ABS);
>  }
>  
> -static void timer_disarm(struct arch_timer_cpu *timer)
> +static void soft_timer_cancel(struct hrtimer *hrt, struct work_struct *work)
>  {
> - if (timer_is_armed(timer)) {
> - hrtimer_cancel(>timer);
> - cancel_work_sync(>expired);
> - timer->armed = false;
> - }
> + hrtimer_cancel(hrt);
> + if (work)
> + cancel_work_sync(work);
>  }
>  
>  static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
> @@ -271,7 +267,7 @@ static void kvm_timer_emulate(struct kvm_vcpu *vcpu,
>   return;
>  
>   /*  The timer has not yet expired, schedule a background timer */
> - timer_arm(timer, kvm_timer_compute_delta(timer_ctx));
> + soft_timer_start(>timer, kvm_timer_compute_delta(timer_ctx));
>  }
>  
>  /*
> @@ -285,7 +281,7 @@ void kvm_timer_schedule(struct kvm_vcpu *vcpu)
>   struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
>   struct arch_timer_context *ptimer = vcpu_ptimer(vcpu);
>  
> - BUG_ON(timer_is_armed(timer));
> + BUG_ON(soft_timer_is_armed(timer));
>  
>   /*
>* No need to schedule a background timer if any guest timer has
> @@ -306,13 +302,16 @@ void kvm_timer_schedule(struct kvm_vcpu *vcpu)
>* The guest timers have not yet expired, schedule a background timer.
>* Set the earliest expiration time among the guest timers.
>*/
> - timer_arm(timer, kvm_timer_earliest_exp(vcpu));
> + timer->armed = true;
> + soft_timer_start(>timer, kvm_timer_earliest_exp(vcpu));
>  }
>  
>  void kvm_timer_unschedule(struct kvm_vcpu *vcpu)
>  {
>   struct arch_timer_cpu *timer = >arch.timer_cpu;
> - timer_disarm(timer);
> +
> + soft_timer_cancel(>timer, >expired);
> + timer->armed = false;
>  }
>  
>  static void kvm_timer_flush_hwstate_vgic(struct kvm_vcpu *vcpu)
> @@ -448,7 +447,7 @@ void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu)
>* This is to cancel the background timer for the physical timer
>* emulation if it is set.
>*/
> - timer_disarm(timer);
> + soft_timer_cancel(>timer, >expired);
>  
>   /*
>* The guest could have modified the timer registers or the timer
> @@ -615,7 +614,7 @@ void kvm_timer_vcpu_terminate(struct kvm_vcpu *vcpu)
>   struct arch_timer_cpu *timer = >arch.timer_cpu;
>   struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
>  
> - timer_disarm(timer);
> + soft_timer_cancel(>timer, >expired);
>   kvm_vgic_unmap_phys_irq(vcpu, vtimer->irq.irq);
>  }
>  
> 

Otherwise looks good to me.

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [RFC PATCH v2 08/38] KVM: arm64: Add EL2 special registers to vcpu context

2017-08-01 Thread Jintack Lim
On Sun, Jul 30, 2017 at 3:59 PM, Christoffer Dall  wrote:
> On Tue, Jul 18, 2017 at 11:58:34AM -0500, Jintack Lim wrote:
>> To support the virtual EL2 execution, we need to maintain the EL2
>> special registers such as SPSR_EL2, ELR_EL2 and SP_EL2 in vcpu context.
>>
>> Note that SP_EL2 is not accessible in EL2, so we don't need a trap
>> handler for this register.
>
> Actually, it's not accessible *in the MRS/MSR instruction* but it is of
> course accessible as the current stack pointer (which is why you need
> the state, but not the trap handler).

That is correct. I'll fix the commit message.

>
> Otherwise, the patch looks good.

Thanks!

>
> Thanks,
> -Christoffer
>
>>
>> Signed-off-by: Jintack Lim 
>> ---
>>  arch/arm64/include/asm/kvm_host.h | 12 
>>  arch/arm64/include/asm/sysreg.h   |  4 
>>  arch/arm64/kvm/sys_regs.c | 38 
>> +-
>>  arch/arm64/kvm/sys_regs.h |  8 
>>  4 files changed, 57 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/kvm_host.h 
>> b/arch/arm64/include/asm/kvm_host.h
>> index 1dc4ed6..57dccde 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -171,6 +171,15 @@ enum vcpu_sysreg {
>>   NR_SYS_REGS /* Nothing after this line! */
>>  };
>>
>> +enum el2_special_regs {
>> + __INVALID_EL2_SPECIAL_REG__,
>> + SPSR_EL2,   /* Saved Program Status Register (EL2) */
>> + ELR_EL2,/* Exception Link Register (EL2) */
>> + SP_EL2, /* Stack Pointer (EL2) */
>> +
>> + NR_EL2_SPECIAL_REGS
>> +};
>> +
>>  /* 32bit mapping */
>>  #define c0_MPIDR (MPIDR_EL1 * 2) /* MultiProcessor ID Register */
>>  #define c0_CSSELR(CSSELR_EL1 * 2)/* Cache Size Selection Register */
>> @@ -218,6 +227,8 @@ struct kvm_cpu_context {
>>   u64 sys_regs[NR_SYS_REGS];
>>   u32 copro[NR_COPRO_REGS];
>>   };
>> +
>> + u64 el2_special_regs[NR_EL2_SPECIAL_REGS];
>>  };
>>
>>  typedef struct kvm_cpu_context kvm_cpu_context_t;
>> @@ -307,6 +318,7 @@ struct kvm_vcpu_arch {
>>
>>  #define vcpu_gp_regs(v)  (&(v)->arch.ctxt.gp_regs)
>>  #define vcpu_sys_reg(v,r)((v)->arch.ctxt.sys_regs[(r)])
>> +#define vcpu_el2_sreg(v,r)   ((v)->arch.ctxt.el2_special_regs[(r)])
>>  /*
>>   * CP14 and CP15 live in the same array, as they are backed by the
>>   * same system registers.
>> diff --git a/arch/arm64/include/asm/sysreg.h 
>> b/arch/arm64/include/asm/sysreg.h
>> index 9277c4a..98c32ef 100644
>> --- a/arch/arm64/include/asm/sysreg.h
>> +++ b/arch/arm64/include/asm/sysreg.h
>> @@ -268,6 +268,8 @@
>>
>>  #define SYS_DACR32_EL2   sys_reg(3, 4, 3, 0, 0)
>>
>> +#define SYS_SPSR_EL2 sys_reg(3, 4, 4, 0, 0)
>> +#define SYS_ELR_EL2  sys_reg(3, 4, 4, 0, 1)
>>  #define SYS_SP_EL1   sys_reg(3, 4, 4, 1, 0)
>>
>>  #define SYS_IFSR32_EL2   sys_reg(3, 4, 5, 0, 1)
>> @@ -332,6 +334,8 @@
>>  #define SYS_CNTVOFF_EL2  sys_reg(3, 4, 14, 0, 3)
>>  #define SYS_CNTHCTL_EL2  sys_reg(3, 4, 14, 1, 0)
>>
>> +#define SYS_SP_EL2   sys_reg(3, 6, 4, 1, 0)
>> +
>>  /* Common SCTLR_ELx flags. */
>>  #define SCTLR_ELx_EE(1 << 25)
>>  #define SCTLR_ELx_I  (1 << 12)
>> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
>> index 1568f8b..2b3ed70 100644
>> --- a/arch/arm64/kvm/sys_regs.c
>> +++ b/arch/arm64/kvm/sys_regs.c
>> @@ -900,15 +900,33 @@ static inline void access_rw(struct sys_reg_params *p, 
>> u64 *sysreg)
>>   *sysreg = p->regval;
>>  }
>>
>> +static u64 *get_special_reg(struct kvm_vcpu *vcpu, struct sys_reg_params *p)
>> +{
>> + u64 reg = sys_reg(p->Op0, p->Op1, p->CRn, p->CRm, p->Op2);
>> +
>> + switch (reg) {
>> + case SYS_SP_EL1:
>> + return >arch.ctxt.gp_regs.sp_el1;
>> + case SYS_ELR_EL2:
>> + return _el2_sreg(vcpu, ELR_EL2);
>> + case SYS_SPSR_EL2:
>> + return _el2_sreg(vcpu, SPSR_EL2);
>> + default:
>> + return NULL;
>> + };
>> +}
>> +
>>  static bool trap_el2_regs(struct kvm_vcpu *vcpu,
>>struct sys_reg_params *p,
>>const struct sys_reg_desc *r)
>>  {
>> - /* SP_EL1 is NOT maintained in sys_regs array */
>> - if (sys_reg(p->Op0, p->Op1, p->CRn, p->CRm, p->Op2) == SYS_SP_EL1)
>> - access_rw(p, >arch.ctxt.gp_regs.sp_el1);
>> - else
>> - access_rw(p, _sys_reg(vcpu, r->reg));
>> + u64 *sys_reg;
>> +
>> + sys_reg = get_special_reg(vcpu, p);
>> + if (!sys_reg)
>> + sys_reg = _sys_reg(vcpu, r->reg);
>> +
>> + access_rw(p, sys_reg);
>>
>>   return true;
>>  }
>> @@ -1116,6 +1134,8 @@ static bool trap_el2_regs(struct kvm_vcpu *vcpu,
>>
>>   { SYS_DESC(SYS_DACR32_EL2), NULL, reset_unknown, DACR32_EL2 },
>>
>> +

Re: [RFC PATCH v2 04/38] KVM: arm/arm64: Check if nested virtualization is in use

2017-08-01 Thread Jintack Lim
On Sun, Jul 30, 2017 at 3:59 PM, Christoffer Dall  wrote:
> On Tue, Jul 18, 2017 at 11:58:30AM -0500, Jintack Lim wrote:
>> Nested virtualizaion is in use only if all three conditions are met:
>> - The architecture supports nested virtualization.
>> - The kernel parameter is set.
>> - The userspace uses nested virtualiztion feature.
>>
>> Signed-off-by: Jintack Lim 
>> ---
>>  arch/arm/include/asm/kvm_host.h   | 11 +++
>>  arch/arm64/include/asm/kvm_host.h |  2 ++
>>  arch/arm64/kvm/nested.c   | 17 +
>>  virt/kvm/arm/arm.c|  4 
>>  4 files changed, 34 insertions(+)
>>
>> diff --git a/arch/arm/include/asm/kvm_host.h 
>> b/arch/arm/include/asm/kvm_host.h
>> index 00b0f97..7e9e6c8 100644
>> --- a/arch/arm/include/asm/kvm_host.h
>> +++ b/arch/arm/include/asm/kvm_host.h
>> @@ -303,4 +303,15 @@ static inline int __init kvmarm_nested_cfg(char *buf)
>>  {
>>   return 0;
>>  }
>> +
>> +static inline int init_nested_virt(void)
>> +{
>> + return 0;
>> +}
>> +
>> +static inline bool nested_virt_in_use(struct kvm_vcpu *vcpu)
>> +{
>> + return false;
>> +}
>> +
>>  #endif /* __ARM_KVM_HOST_H__ */
>> diff --git a/arch/arm64/include/asm/kvm_host.h 
>> b/arch/arm64/include/asm/kvm_host.h
>> index 6df0c7c..86d4b6c 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -387,5 +387,7 @@ static inline void __cpu_init_stage2(void)
>>  }
>>
>>  int __init kvmarm_nested_cfg(char *buf);
>> +int init_nested_virt(void);
>> +bool nested_virt_in_use(struct kvm_vcpu *vcpu);
>>
>>  #endif /* __ARM64_KVM_HOST_H__ */
>> diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
>> index 79f38da..9a05c76 100644
>> --- a/arch/arm64/kvm/nested.c
>> +++ b/arch/arm64/kvm/nested.c
>> @@ -24,3 +24,20 @@ int __init kvmarm_nested_cfg(char *buf)
>>  {
>>   return strtobool(buf, _param);
>>  }
>> +
>> +int init_nested_virt(void)
>> +{
>> + if (nested_param && cpus_have_const_cap(ARM64_HAS_NESTED_VIRT))
>> + kvm_info("Nested virtualization is supported\n");
>> +
>> + return 0;
>> +}
>> +
>> +bool nested_virt_in_use(struct kvm_vcpu *vcpu)
>> +{
>> + if (nested_param && cpus_have_const_cap(ARM64_HAS_NESTED_VIRT)
>> + && test_bit(KVM_ARM_VCPU_NESTED_VIRT, vcpu->arch.features))
>> + return true;
>
> you could initialize a bool in init_nested_virt which you then check
> here to avoid duplicating the logic.

I can make a bool to check the kernel param and the capability. The
third one is per VM given by the userspace, so we don't know it when
we initialize the host hypervisor. We can potentially have a bool in
kvm_vcpu_arch or kvm_arch to cache the whole three conditions, if that
sounds ok.

>
>> +
>> + return false;
>> +}
>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
>> index 1c1c772..36aae3a 100644
>> --- a/virt/kvm/arm/arm.c
>> +++ b/virt/kvm/arm/arm.c
>> @@ -1478,6 +1478,10 @@ int kvm_arch_init(void *opaque)
>>   if (err)
>>   goto out_err;
>>
>> + err = init_nested_virt();
>> + if (err)
>> + return err;
>> +
>>   err = init_subsystems();
>>   if (err)
>>   goto out_hyp;
>> --
>> 1.9.1
>>
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [RFC PATCH v2 02/38] KVM: arm/arm64: Enable nested virtualization via command-line

2017-08-01 Thread Jintack Lim
On Sun, Jul 30, 2017 at 3:59 PM, Christoffer Dall  wrote:
> On Tue, Jul 18, 2017 at 11:58:28AM -0500, Jintack Lim wrote:
>> Add a new kernel parameter(kvm-arm.nested) to enable KVM/ARM nested
>> virtualization support. This kernel parameter on arm architecture is
>> ignored since nested virtualization is not supported on arm.
>>
>> Note that this kernel parameter will not have any impact until nested
>> virtualization support is completed. Just add this parameter first to
>> use it when implementing nested virtualization support.
>>
>> Signed-off-by: Jintack Lim 
>> ---
>>  Documentation/admin-guide/kernel-parameters.txt |  4 
>>  arch/arm/include/asm/kvm_host.h |  4 
>>  arch/arm64/include/asm/kvm_host.h   |  2 ++
>>  arch/arm64/kvm/Makefile |  2 ++
>>  arch/arm64/kvm/nested.c | 26 
>> +
>>  virt/kvm/arm/arm.c  |  2 ++
>>  6 files changed, 40 insertions(+)
>>  create mode 100644 arch/arm64/kvm/nested.c
>>
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt 
>> b/Documentation/admin-guide/kernel-parameters.txt
>> index aa8341e..8fb152d 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -1841,6 +1841,10 @@
>>   [KVM,ARM] Trap guest accesses to GICv3 common
>>   system registers
>>
>> + kvm-arm.nested=
>> + [KVM,ARM] Allow nested virtualization in KVM/ARM.
>> + Default is 0 (disabled)
>
> We may want to say "on systems that support it" or something like that
> here as well.
>

Sounds good! Thanks.

>> +
>>   kvm-intel.ept=  [KVM,Intel] Disable extended page tables
>>   (virtualized MMU) support on capable Intel chips.
>>   Default is 1 (enabled)
>> diff --git a/arch/arm/include/asm/kvm_host.h 
>> b/arch/arm/include/asm/kvm_host.h
>> index 127e2dd..00b0f97 100644
>> --- a/arch/arm/include/asm/kvm_host.h
>> +++ b/arch/arm/include/asm/kvm_host.h
>> @@ -299,4 +299,8 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
>>  int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
>>  struct kvm_device_attr *attr);
>>
>> +static inline int __init kvmarm_nested_cfg(char *buf)
>> +{
>> + return 0;
>> +}
>>  #endif /* __ARM_KVM_HOST_H__ */
>> diff --git a/arch/arm64/include/asm/kvm_host.h 
>> b/arch/arm64/include/asm/kvm_host.h
>> index 0c4fd1f..dcc4df8 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -386,4 +386,6 @@ static inline void __cpu_init_stage2(void)
>> "PARange is %d bits, unsupported configuration!", parange);
>>  }
>>
>> +int __init kvmarm_nested_cfg(char *buf);
>> +
>>  #endif /* __ARM64_KVM_HOST_H__ */
>> diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
>> index 5d98100..f513047 100644
>> --- a/arch/arm64/kvm/Makefile
>> +++ b/arch/arm64/kvm/Makefile
>> @@ -35,3 +35,5 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic-debug.o
>>  kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/irqchip.o
>>  kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/arch_timer.o
>>  kvm-$(CONFIG_KVM_ARM_PMU) += $(KVM)/arm/pmu.o
>> +
>> +kvm-$(CONFIG_KVM_ARM_HOST) += nested.o
>> diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
>> new file mode 100644
>> index 000..79f38da
>> --- /dev/null
>> +++ b/arch/arm64/kvm/nested.c
>> @@ -0,0 +1,26 @@
>> +/*
>> + * Copyright (C) 2017 - Columbia University and Linaro Ltd.
>> + * Author: Jintack Lim 
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program.  If not, see .
>> + */
>> +
>> +#include 
>> +#include 
>> +
>> +static bool nested_param;
>> +
>> +int __init kvmarm_nested_cfg(char *buf)
>> +{
>> + return strtobool(buf, _param);
>> +}
>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
>> index a39a1e1..1c1c772 100644
>> --- a/virt/kvm/arm/arm.c
>> +++ b/virt/kvm/arm/arm.c
>> @@ -67,6 +67,8 @@
>>
>>  static DEFINE_PER_CPU(unsigned char, kvm_arm_hardware_enabled);
>>
>> +early_param("kvm-arm.nested", kvmarm_nested_cfg);
>> +
>>  static void kvm_arm_set_running_vcpu(struct kvm_vcpu *vcpu)
>>  {
>>   BUG_ON(preemptible());
>> --
>> 1.9.1
>>
___
kvmarm mailing list

Re: [RFC PATCH v2 04/38] KVM: arm/arm64: Check if nested virtualization is in use

2017-08-01 Thread Jintack Lim
On Sun, Jul 30, 2017 at 3:59 PM, Christoffer Dall  wrote:
> On Tue, Jul 18, 2017 at 11:58:30AM -0500, Jintack Lim wrote:
>> Nested virtualizaion is in use only if all three conditions are met:
>> - The architecture supports nested virtualization.
>> - The kernel parameter is set.
>> - The userspace uses nested virtualiztion feature.
>>
>> Signed-off-by: Jintack Lim 
>> ---
>>  arch/arm/include/asm/kvm_host.h   | 11 +++
>>  arch/arm64/include/asm/kvm_host.h |  2 ++
>>  arch/arm64/kvm/nested.c   | 17 +
>>  virt/kvm/arm/arm.c|  4 
>>  4 files changed, 34 insertions(+)
>>
>> diff --git a/arch/arm/include/asm/kvm_host.h 
>> b/arch/arm/include/asm/kvm_host.h
>> index 00b0f97..7e9e6c8 100644
>> --- a/arch/arm/include/asm/kvm_host.h
>> +++ b/arch/arm/include/asm/kvm_host.h
>> @@ -303,4 +303,15 @@ static inline int __init kvmarm_nested_cfg(char *buf)
>>  {
>>   return 0;
>>  }
>> +
>> +static inline int init_nested_virt(void)
>> +{
>> + return 0;
>> +}
>> +
>> +static inline bool nested_virt_in_use(struct kvm_vcpu *vcpu)
>> +{
>> + return false;
>> +}
>> +
>>  #endif /* __ARM_KVM_HOST_H__ */
>> diff --git a/arch/arm64/include/asm/kvm_host.h 
>> b/arch/arm64/include/asm/kvm_host.h
>> index 6df0c7c..86d4b6c 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -387,5 +387,7 @@ static inline void __cpu_init_stage2(void)
>>  }
>>
>>  int __init kvmarm_nested_cfg(char *buf);
>> +int init_nested_virt(void);
>> +bool nested_virt_in_use(struct kvm_vcpu *vcpu);
>>
>>  #endif /* __ARM64_KVM_HOST_H__ */
>> diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
>> index 79f38da..9a05c76 100644
>> --- a/arch/arm64/kvm/nested.c
>> +++ b/arch/arm64/kvm/nested.c
>> @@ -24,3 +24,20 @@ int __init kvmarm_nested_cfg(char *buf)
>>  {
>>   return strtobool(buf, _param);
>>  }
>> +
>> +int init_nested_virt(void)
>> +{
>> + if (nested_param && cpus_have_const_cap(ARM64_HAS_NESTED_VIRT))
>> + kvm_info("Nested virtualization is supported\n");
>> +
>> + return 0;
>> +}
>> +
>> +bool nested_virt_in_use(struct kvm_vcpu *vcpu)
>> +{
>> + if (nested_param && cpus_have_const_cap(ARM64_HAS_NESTED_VIRT)
>> + && test_bit(KVM_ARM_VCPU_NESTED_VIRT, vcpu->arch.features))
>> + return true;
>> +
>> + return false;
>> +}
>
> after reading through a lot of your patches, I feel like vm_has_el2()
> would be a more elegant name, but it's not a strict requirement to
> change it.

I think it's a nice name. Let me think about it :)

>
>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
>> index 1c1c772..36aae3a 100644
>> --- a/virt/kvm/arm/arm.c
>> +++ b/virt/kvm/arm/arm.c
>> @@ -1478,6 +1478,10 @@ int kvm_arch_init(void *opaque)
>>   if (err)
>>   goto out_err;
>>
>> + err = init_nested_virt();
>> + if (err)
>> + return err;
>> +
>>   err = init_subsystems();
>>   if (err)
>>   goto out_hyp;
>> --
>> 1.9.1
>>
>
> Thanks,
> -Christoffer
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v2 15/16] KVM: arm64: Save ESR_EL2 on guest SError

2017-08-01 Thread Christoffer Dall
On Fri, Jul 28, 2017 at 03:10:18PM +0100, James Morse wrote:
> When we exit a guest due to an SError the vcpu fault info isn't updated
> with the ESR. Today this is only done for traps.
>
> The v8.2 RAS Extensions define ISS values for SError. Update the vcpu's
> fault_info with the ESR on SError so that handle_exit() can determine
> if this was a RAS SError and decode its severity.
>
> Signed-off-by: James Morse 

Reviewed-by: Christoffer Dall 

> ---
>  arch/arm64/kvm/hyp/switch.c | 15 ---
>  1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> index 945e79c641c4..c6f17c7675ad 100644
> --- a/arch/arm64/kvm/hyp/switch.c
> +++ b/arch/arm64/kvm/hyp/switch.c
> @@ -226,13 +226,20 @@ static bool __hyp_text __translate_far_to_hpfar(u64 
> far, u64 *hpfar)
>   return true;
>  }
>
> +static void __hyp_text __populate_fault_info_esr(struct kvm_vcpu *vcpu)
> +{
> + vcpu->arch.fault.esr_el2 = read_sysreg_el2(esr);
> +}
> +
>  static bool __hyp_text __populate_fault_info(struct kvm_vcpu *vcpu)
>  {
> - u64 esr = read_sysreg_el2(esr);
> - u8 ec = ESR_ELx_EC(esr);
> + u8 ec;
> + u64 esr;
>   u64 hpfar, far;
>
> - vcpu->arch.fault.esr_el2 = esr;
> + __populate_fault_info_esr(vcpu);
> + esr = vcpu->arch.fault.esr_el2;
> + ec = ESR_ELx_EC(esr);
>
>   if (ec != ESR_ELx_EC_DABT_LOW && ec != ESR_ELx_EC_IABT_LOW)
>   return true;
> @@ -321,6 +328,8 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
>   */
>   if (exit_code == ARM_EXCEPTION_TRAP && !__populate_fault_info(vcpu))
>   goto again;
> + else if (ARM_EXCEPTION_CODE(exit_code) == ARM_EXCEPTION_EL1_SERROR)
> + __populate_fault_info_esr(vcpu);
>
>   if (static_branch_unlikely(_v2_cpuif_trap) &&
>  exit_code == ARM_EXCEPTION_TRAP) {
> --
> 2.13.2
>
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [RFC PATCH v2 05/19] KVM: arm/arm64: Check that system supports split eoi/deactivate

2017-08-01 Thread Christoffer Dall
On Tue, Aug 01, 2017 at 01:37:14PM +0100, Marc Zyngier wrote:
> On 01/08/17 13:26, Christoffer Dall wrote:
> > On Tue, Aug 01, 2017 at 12:35:23PM +0100, Marc Zyngier wrote:
> >> On 17/07/17 15:27, Christoffer Dall wrote:
> >>> Some systems without proper firmware and/or hardware description data
> >>> don't support the split EOI and deactivate operation.
> >>>
> >>> On such systems, we cannot leave the physical interrupt active after the
> >>> timer handler on the host has run, so we cannot support KVM with an
> >>> in-kernel GIC with the timer changes we about to introduce.
> >>>
> >>> This patch makes sure that trying to initialize the KVM GIC code will
> >>> fail on such systems.
> >>>
> >>> Cc: Marc Zyngier 
> >>> Signed-off-by: Christoffer Dall 
> >>> ---
> >>>  drivers/irqchip/irq-gic.c | 12 +---
> >>>  1 file changed, 9 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
> >>> index 090991f..b7e4fed 100644
> >>> --- a/drivers/irqchip/irq-gic.c
> >>> +++ b/drivers/irqchip/irq-gic.c
> >>> @@ -1391,7 +1391,8 @@ int gic_of_init_child(struct device *dev, struct 
> >>> gic_chip_data **gic, int irq)
> >>>   return 0;
> >>>  }
> >>>  
> >>> -static void __init gic_of_setup_kvm_info(struct device_node *node)
> >>> +static void __init gic_of_setup_kvm_info(struct device_node *node,
> >>> +  bool supports_deactivate)
> >>
> >> Ouch, nasty. This shadows the static key which is also called
> >> supports_deactivate...
> >>
> > 
> > oh, yeah, that's a trap waiting to happen.
> > 
> >>>  {
> >>>   int ret;
> >>>   struct resource *vctrl_res = _v2_kvm_info.vctrl;
> >>> @@ -1411,6 +1412,9 @@ static void __init gic_of_setup_kvm_info(struct 
> >>> device_node *node)
> >>>   if (ret)
> >>>   return;
> >>>  
> >>> + if (!supports_deactivate)
> >>> + return;
> >>> +
> >>>   gic_set_kvm_info(_v2_kvm_info);
> >>
> >> Speaking of which, the static key should already be initialized, so this
> >> could actually read:
> >>
> >>if (static_key_true(_deactivate))
> >>gic_set_kvm_info(_v2_kvm_info);
> >>
> >>>  }
> >>>  
> >>> @@ -1419,6 +1423,7 @@ gic_of_init(struct device_node *node, struct 
> >>> device_node *parent)
> >>>  {
> >>>   struct gic_chip_data *gic;
> >>>   int irq, ret;
> >>> + bool has_eoimode;
> >>>  
> >>>   if (WARN_ON(!node))
> >>>   return -ENODEV;
> >>> @@ -1436,7 +1441,8 @@ gic_of_init(struct device_node *node, struct 
> >>> device_node *parent)
> >>>* Disable split EOI/Deactivate if either HYP is not available
> >>>* or the CPU interface is too small.
> >>>*/
> >>> - if (gic_cnt == 0 && !gic_check_eoimode(node, >raw_cpu_base))
> >>> + has_eoimode = gic_check_eoimode(node, >raw_cpu_base);
> >>> + if (gic_cnt == 0 && !has_eoimode)
> >>>   static_key_slow_dec(_deactivate);
> >>>  
> >>>   ret = __gic_init_bases(gic, -1, >fwnode);
> >>> @@ -1447,7 +1453,7 @@ gic_of_init(struct device_node *node, struct 
> >>> device_node *parent)
> >>>  
> >>>   if (!gic_cnt) {
> >>>   gic_init_physaddr(node);
> >>> - gic_of_setup_kvm_info(node);
> >>> + gic_of_setup_kvm_info(node, has_eoimode);
> >>>   }
> >>>  
> >>>   if (parent) {
> >>>
> >>
> >> and we shouldn't need any of this. What do you think?
> >>
> > 
> > I wasn't exactly sure if gic_cnt > 0 && !gic_check_eiomode() could then
> > end up registering the KVM info when we shouldn't.
> > 
> > If that's not a concern, I'm happy to rework this.
> 
> I think it should be fine. gic_cnt is incremented each time we find a
> GIC, and we'll only register the KVM info when we discover the first one
> (while gic_cnt is still zero).
> 
> Also, nobody is mad enough to have multiple GICs these days (cough...).
> 

ok, I'll rework it then.

Thanks,
-Christoffer
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v2 14/16] KVM: arm64: Take pending SErrors on entry to the guest

2017-08-01 Thread Christoffer Dall
Hi James,

On Fri, Jul 28, 2017 at 03:10:17PM +0100, James Morse wrote:
> SErrors due to RAS are either taken as an SError, or deferred because
> of an Error Synchronization Barrier (ESB). Systems that support the RAS
> extensions are very likely to have firmware-first handling of these
> errors, taking all SErrors to EL3.
> 
> Add {I,}ESB support to KVM and be prepared to handle any resulting SError
> if we are notified directly. (i.e. no firmware-first handling). Do this
> for the cases where we can take the SError instead of deferring it.

Sorry, I forgot how this works again.  If we do have firmware-first, are
we sure that the firmware doesn't just emulate the SError to EL2,
resulting in it looking the same whether or not we have firmware-first?
(perhaps the firmware-first thing clearly defines what should happen
when the firmware detects an SError?)

> 
> With VHE KVM is covered by the host's setting of SCTLR_EL1.IESB: unmask
> SError when entering a guest. This will hyp-panic if there was an SError
> pending during world switch (and we don't have firmware first). Make
> sure this only happens when its KVM's 'fault' by adding an ESB to

nit: it's

> __kvm_call_hyp().
> 
> On systems without the RAS extensions a pending SError triggered by KVM's
> world switch will no longer be blamed on the guest, causing a panic
> instead.
> 
> Signed-off-by: James Morse 
> ---
>  arch/arm64/include/asm/assembler.h |  1 +
>  arch/arm64/kvm/hyp.S   |  1 +
>  arch/arm64/kvm/hyp/entry.S | 18 ++
>  3 files changed, 20 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/assembler.h 
> b/arch/arm64/include/asm/assembler.h
> index e2bb551f59f7..e440fba6d0fe 100644
> --- a/arch/arm64/include/asm/assembler.h
> +++ b/arch/arm64/include/asm/assembler.h
> @@ -29,6 +29,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  
>   .macro save_and_disable_daif, flags
> diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
> index 952f6cb9cf72..e96a5f6afecd 100644
> --- a/arch/arm64/kvm/hyp.S
> +++ b/arch/arm64/kvm/hyp.S
> @@ -40,6 +40,7 @@
>   * arch/arm64/kernel/hyp_stub.S.
>   */
>  ENTRY(__kvm_call_hyp)
> + esb

are esb instructions ignored on earlier versions than ARMv8.2?  (I
couldn't easily determine this from the spec)

So this is not necessarily the world switch path we're entering (as
hinted in the commit message), but we could also be doing TLB
maintenance.  Is this still going to work in that case?

>  alternative_if_not ARM64_HAS_VIRT_HOST_EXTN
>   hvc #0
>   ret
> diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
> index 12ee62d6d410..cec18df5a324 100644
> --- a/arch/arm64/kvm/hyp/entry.S
> +++ b/arch/arm64/kvm/hyp/entry.S
> @@ -49,6 +49,21 @@
>   ldp x29, lr,  [\ctxt, #CPU_XREG_OFFSET(29)]
>  .endm
>  
> +/* We have an implicit esb if we have VHE and IESB. */
> +.macro kvm_explicit_esb
> + alternative_if_not ARM64_HAS_RAS_EXTN
> + b   998f
> + alternative_else_nop_endif
> + alternative_if_not ARM64_HAS_VIRT_HOST_EXTN
> + esb
> + b   998f
> + alternative_else_nop_endif
> + alternative_if_not ARM64_HAS_IESB
> + esb
> + alternative_else_nop_endif
> +998:
> +.endm
> +
>  /*
>   * u64 __guest_enter(struct kvm_vcpu *vcpu,
>   *struct kvm_cpu_context *host_ctxt);
> @@ -85,6 +100,9 @@ ENTRY(__guest_enter)
>   ldr x18,  [x18, #CPU_XREG_OFFSET(18)]
>  
>   // Do not touch any register after this!
> +
> + enable_serror   // Don't defer an IESB SError
> + kvm_explicit_esb
>   eret
>  ENDPROC(__guest_enter)
>  
> -- 
> 2.13.2
> 

Thanks,
-Christoffer
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [RFC PATCH v2 05/19] KVM: arm/arm64: Check that system supports split eoi/deactivate

2017-08-01 Thread Marc Zyngier
On 01/08/17 13:26, Christoffer Dall wrote:
> On Tue, Aug 01, 2017 at 12:35:23PM +0100, Marc Zyngier wrote:
>> On 17/07/17 15:27, Christoffer Dall wrote:
>>> Some systems without proper firmware and/or hardware description data
>>> don't support the split EOI and deactivate operation.
>>>
>>> On such systems, we cannot leave the physical interrupt active after the
>>> timer handler on the host has run, so we cannot support KVM with an
>>> in-kernel GIC with the timer changes we about to introduce.
>>>
>>> This patch makes sure that trying to initialize the KVM GIC code will
>>> fail on such systems.
>>>
>>> Cc: Marc Zyngier 
>>> Signed-off-by: Christoffer Dall 
>>> ---
>>>  drivers/irqchip/irq-gic.c | 12 +---
>>>  1 file changed, 9 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
>>> index 090991f..b7e4fed 100644
>>> --- a/drivers/irqchip/irq-gic.c
>>> +++ b/drivers/irqchip/irq-gic.c
>>> @@ -1391,7 +1391,8 @@ int gic_of_init_child(struct device *dev, struct 
>>> gic_chip_data **gic, int irq)
>>> return 0;
>>>  }
>>>  
>>> -static void __init gic_of_setup_kvm_info(struct device_node *node)
>>> +static void __init gic_of_setup_kvm_info(struct device_node *node,
>>> +bool supports_deactivate)
>>
>> Ouch, nasty. This shadows the static key which is also called
>> supports_deactivate...
>>
> 
> oh, yeah, that's a trap waiting to happen.
> 
>>>  {
>>> int ret;
>>> struct resource *vctrl_res = _v2_kvm_info.vctrl;
>>> @@ -1411,6 +1412,9 @@ static void __init gic_of_setup_kvm_info(struct 
>>> device_node *node)
>>> if (ret)
>>> return;
>>>  
>>> +   if (!supports_deactivate)
>>> +   return;
>>> +
>>> gic_set_kvm_info(_v2_kvm_info);
>>
>> Speaking of which, the static key should already be initialized, so this
>> could actually read:
>>
>>  if (static_key_true(_deactivate))
>>  gic_set_kvm_info(_v2_kvm_info);
>>
>>>  }
>>>  
>>> @@ -1419,6 +1423,7 @@ gic_of_init(struct device_node *node, struct 
>>> device_node *parent)
>>>  {
>>> struct gic_chip_data *gic;
>>> int irq, ret;
>>> +   bool has_eoimode;
>>>  
>>> if (WARN_ON(!node))
>>> return -ENODEV;
>>> @@ -1436,7 +1441,8 @@ gic_of_init(struct device_node *node, struct 
>>> device_node *parent)
>>>  * Disable split EOI/Deactivate if either HYP is not available
>>>  * or the CPU interface is too small.
>>>  */
>>> -   if (gic_cnt == 0 && !gic_check_eoimode(node, >raw_cpu_base))
>>> +   has_eoimode = gic_check_eoimode(node, >raw_cpu_base);
>>> +   if (gic_cnt == 0 && !has_eoimode)
>>> static_key_slow_dec(_deactivate);
>>>  
>>> ret = __gic_init_bases(gic, -1, >fwnode);
>>> @@ -1447,7 +1453,7 @@ gic_of_init(struct device_node *node, struct 
>>> device_node *parent)
>>>  
>>> if (!gic_cnt) {
>>> gic_init_physaddr(node);
>>> -   gic_of_setup_kvm_info(node);
>>> +   gic_of_setup_kvm_info(node, has_eoimode);
>>> }
>>>  
>>> if (parent) {
>>>
>>
>> and we shouldn't need any of this. What do you think?
>>
> 
> I wasn't exactly sure if gic_cnt > 0 && !gic_check_eiomode() could then
> end up registering the KVM info when we shouldn't.
> 
> If that's not a concern, I'm happy to rework this.

I think it should be fine. gic_cnt is incremented each time we find a
GIC, and we'll only register the KVM info when we discover the first one
(while gic_cnt is still zero).

Also, nobody is mad enough to have multiple GICs these days (cough...).

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [RFC PATCH v2 05/19] KVM: arm/arm64: Check that system supports split eoi/deactivate

2017-08-01 Thread Christoffer Dall
On Tue, Aug 01, 2017 at 12:35:23PM +0100, Marc Zyngier wrote:
> On 17/07/17 15:27, Christoffer Dall wrote:
> > Some systems without proper firmware and/or hardware description data
> > don't support the split EOI and deactivate operation.
> > 
> > On such systems, we cannot leave the physical interrupt active after the
> > timer handler on the host has run, so we cannot support KVM with an
> > in-kernel GIC with the timer changes we about to introduce.
> > 
> > This patch makes sure that trying to initialize the KVM GIC code will
> > fail on such systems.
> > 
> > Cc: Marc Zyngier 
> > Signed-off-by: Christoffer Dall 
> > ---
> >  drivers/irqchip/irq-gic.c | 12 +---
> >  1 file changed, 9 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
> > index 090991f..b7e4fed 100644
> > --- a/drivers/irqchip/irq-gic.c
> > +++ b/drivers/irqchip/irq-gic.c
> > @@ -1391,7 +1391,8 @@ int gic_of_init_child(struct device *dev, struct 
> > gic_chip_data **gic, int irq)
> > return 0;
> >  }
> >  
> > -static void __init gic_of_setup_kvm_info(struct device_node *node)
> > +static void __init gic_of_setup_kvm_info(struct device_node *node,
> > +bool supports_deactivate)
> 
> Ouch, nasty. This shadows the static key which is also called
> supports_deactivate...
> 

oh, yeah, that's a trap waiting to happen.

> >  {
> > int ret;
> > struct resource *vctrl_res = _v2_kvm_info.vctrl;
> > @@ -1411,6 +1412,9 @@ static void __init gic_of_setup_kvm_info(struct 
> > device_node *node)
> > if (ret)
> > return;
> >  
> > +   if (!supports_deactivate)
> > +   return;
> > +
> > gic_set_kvm_info(_v2_kvm_info);
> 
> Speaking of which, the static key should already be initialized, so this
> could actually read:
> 
>   if (static_key_true(_deactivate))
>   gic_set_kvm_info(_v2_kvm_info);
> 
> >  }
> >  
> > @@ -1419,6 +1423,7 @@ gic_of_init(struct device_node *node, struct 
> > device_node *parent)
> >  {
> > struct gic_chip_data *gic;
> > int irq, ret;
> > +   bool has_eoimode;
> >  
> > if (WARN_ON(!node))
> > return -ENODEV;
> > @@ -1436,7 +1441,8 @@ gic_of_init(struct device_node *node, struct 
> > device_node *parent)
> >  * Disable split EOI/Deactivate if either HYP is not available
> >  * or the CPU interface is too small.
> >  */
> > -   if (gic_cnt == 0 && !gic_check_eoimode(node, >raw_cpu_base))
> > +   has_eoimode = gic_check_eoimode(node, >raw_cpu_base);
> > +   if (gic_cnt == 0 && !has_eoimode)
> > static_key_slow_dec(_deactivate);
> >  
> > ret = __gic_init_bases(gic, -1, >fwnode);
> > @@ -1447,7 +1453,7 @@ gic_of_init(struct device_node *node, struct 
> > device_node *parent)
> >  
> > if (!gic_cnt) {
> > gic_init_physaddr(node);
> > -   gic_of_setup_kvm_info(node);
> > +   gic_of_setup_kvm_info(node, has_eoimode);
> > }
> >  
> > if (parent) {
> > 
> 
> and we shouldn't need any of this. What do you think?
> 

I wasn't exactly sure if gic_cnt > 0 && !gic_check_eiomode() could then
end up registering the KVM info when we shouldn't.

If that's not a concern, I'm happy to rework this.

Thanks,
-Christoffer
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [RFC PATCH v2 05/19] KVM: arm/arm64: Check that system supports split eoi/deactivate

2017-08-01 Thread Marc Zyngier
On 17/07/17 15:27, Christoffer Dall wrote:
> Some systems without proper firmware and/or hardware description data
> don't support the split EOI and deactivate operation.
> 
> On such systems, we cannot leave the physical interrupt active after the
> timer handler on the host has run, so we cannot support KVM with an
> in-kernel GIC with the timer changes we about to introduce.
> 
> This patch makes sure that trying to initialize the KVM GIC code will
> fail on such systems.
> 
> Cc: Marc Zyngier 
> Signed-off-by: Christoffer Dall 
> ---
>  drivers/irqchip/irq-gic.c | 12 +---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
> index 090991f..b7e4fed 100644
> --- a/drivers/irqchip/irq-gic.c
> +++ b/drivers/irqchip/irq-gic.c
> @@ -1391,7 +1391,8 @@ int gic_of_init_child(struct device *dev, struct 
> gic_chip_data **gic, int irq)
>   return 0;
>  }
>  
> -static void __init gic_of_setup_kvm_info(struct device_node *node)
> +static void __init gic_of_setup_kvm_info(struct device_node *node,
> +  bool supports_deactivate)

Ouch, nasty. This shadows the static key which is also called
supports_deactivate...

>  {
>   int ret;
>   struct resource *vctrl_res = _v2_kvm_info.vctrl;
> @@ -1411,6 +1412,9 @@ static void __init gic_of_setup_kvm_info(struct 
> device_node *node)
>   if (ret)
>   return;
>  
> + if (!supports_deactivate)
> + return;
> +
>   gic_set_kvm_info(_v2_kvm_info);

Speaking of which, the static key should already be initialized, so this
could actually read:

if (static_key_true(_deactivate))
gic_set_kvm_info(_v2_kvm_info);

>  }
>  
> @@ -1419,6 +1423,7 @@ gic_of_init(struct device_node *node, struct 
> device_node *parent)
>  {
>   struct gic_chip_data *gic;
>   int irq, ret;
> + bool has_eoimode;
>  
>   if (WARN_ON(!node))
>   return -ENODEV;
> @@ -1436,7 +1441,8 @@ gic_of_init(struct device_node *node, struct 
> device_node *parent)
>* Disable split EOI/Deactivate if either HYP is not available
>* or the CPU interface is too small.
>*/
> - if (gic_cnt == 0 && !gic_check_eoimode(node, >raw_cpu_base))
> + has_eoimode = gic_check_eoimode(node, >raw_cpu_base);
> + if (gic_cnt == 0 && !has_eoimode)
>   static_key_slow_dec(_deactivate);
>  
>   ret = __gic_init_bases(gic, -1, >fwnode);
> @@ -1447,7 +1453,7 @@ gic_of_init(struct device_node *node, struct 
> device_node *parent)
>  
>   if (!gic_cnt) {
>   gic_init_physaddr(node);
> - gic_of_setup_kvm_info(node);
> + gic_of_setup_kvm_info(node, has_eoimode);
>   }
>  
>   if (parent) {
> 

and we shouldn't need any of this. What do you think?

M.
-- 
Jazz is not dead. It just smells funny...
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [RFC PATCH v2 38/38] KVM: arm64: Respect the virtual CPTR_EL2.TCPAC setting

2017-08-01 Thread Christoffer Dall
On Tue, Aug 01, 2017 at 07:03:35AM -0400, Jintack Lim wrote:
> Hi Christoffer,
> 
> On Mon, Jul 31, 2017 at 8:59 AM, Christoffer Dall  wrote:
> > On Tue, Jul 18, 2017 at 11:59:04AM -0500, Jintack Lim wrote:
> >> Forward CPACR_EL1 traps to the virtual EL2 if virtual CPTR_EL2 is
> >> configured to trap CPACR_EL1 accesses from EL1.
> >>
> >> This is for recursive nested virtualization.
> >>
> >> Signed-off-by: Jintack Lim 
> >> ---
> >>  arch/arm64/kvm/sys_regs.c | 5 +
> >>  1 file changed, 5 insertions(+)
> >>
> >> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> >> index 6f67666..ba2966d 100644
> >> --- a/arch/arm64/kvm/sys_regs.c
> >> +++ b/arch/arm64/kvm/sys_regs.c
> >> @@ -1091,6 +1091,11 @@ static bool access_cpacr(struct kvm_vcpu *vcpu,
> >>   if (el12_reg(p) && forward_nv_traps(vcpu))
> >>   return kvm_inject_nested_sync(vcpu, kvm_vcpu_get_hsr(vcpu));
> >>
> >> + /* Forward this trap to the virtual EL2 if CPTR_EL2.TCPAC is set*/
> >> + if (!el12_reg(p) && !vcpu_mode_el2(vcpu) &&
> >> + (vcpu_sys_reg(vcpu, CPTR_EL2) & CPTR_EL2_TCPAC))
> >> + return kvm_inject_nested_sync(vcpu, kvm_vcpu_get_hsr(vcpu));
> >> +
> >
> > I'm trying to understand what should happen if the VM is in EL1 and
> > accesses CPACR_EL12, but the guest hypervisor did not set
> > CPTR_EL2.TCPAC, why would we get here, and if there's a good reason why
> 
> I guess what you meant is HCR_EL2.NV bit?
> 

No, HCR_EL2.NV is set, then we obviously get here, due to traps on _EL12
registers.

But if that wasn't the case (that's the time you'd be avaluating this
if-statement), then you're checking as part of the if-statement if the
virtual CPTR_EL2.TCPAC is set.  My question is, if the virtual
CPTR_EL2.TCPAC is not set, why would the physical one be set, which must
be the case if we're running this code, right?

> > we god here, is the EL12 access not supposed to undef at EL1 as opposed

I obviously meant *got* here.

> > to actually work, like it seems your code does when it doesn't take the
> > branch?
> 
> IIUC, we need to have this logic
> 
> if (el12_reg() && virtual HCR_EL2.NV == 0)
>inject_undef();
> 
> This is a good point, and should be applied for all traps controlled by NV 
> bit.
> 

Yes, but can this ever happen?

Thanks,
-Christoffer
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH 10/11] arm64/kvm: context-switch ptrauth registers

2017-08-01 Thread Christoffer Dall
On Wed, Jul 19, 2017 at 05:01:31PM +0100, Mark Rutland wrote:
> When pointer authentication is supported, a guest may wish to use it.
> This patch adds the necessary KVM infrastructure for this to work, with
> a semi-lazy context switch of the pointer auth state.
> 
> When we schedule a vcpu, we disable guest usage of pointer
> authentication instructions and accesses to the keys. While these are
> disabled, we avoid context-switching the keys. When we trap the guest
> trying to use pointer authentication functionality, we change to eagerly
> context-switching the keys, and enable the feature. The next time the
> vcpu is scheduled out/in, we start again.
> 
> This is sufficient for systems with uniform pointer authentication
> support. For systems with mismatched support, it will be necessary to

What is mismatched support?  You mean systems where one CPU has ptrauth
and another one doesn't (or if they both have it but in different ways)?

> hide the feature from the guest's view of the ID registers.

I think the work Drew is pondering to allow userspace more control of
what we emulate to the guest can semi-automatically take care of this as
well.

> 
> Signed-off-by: Mark Rutland 
> Cc: Christoffer Dall 
> Cc: Marc Zyngier 
> Cc: kvmarm@lists.cs.columbia.edu
> ---
>  arch/arm64/include/asm/kvm_host.h | 23 +-
>  arch/arm64/include/asm/kvm_hyp.h  |  7 +++
>  arch/arm64/kvm/handle_exit.c  | 21 +
>  arch/arm64/kvm/hyp/Makefile   |  1 +
>  arch/arm64/kvm/hyp/ptrauth-sr.c   | 91 
> +++
>  arch/arm64/kvm/hyp/switch.c   |  4 ++
>  arch/arm64/kvm/sys_regs.c | 32 ++
>  7 files changed, 178 insertions(+), 1 deletion(-)
>  create mode 100644 arch/arm64/kvm/hyp/ptrauth-sr.c
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h 
> b/arch/arm64/include/asm/kvm_host.h
> index 0d7c3dd..f97defa 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -135,6 +135,18 @@ enum vcpu_sysreg {
>   PMSWINC_EL0,/* Software Increment Register */
>   PMUSERENR_EL0,  /* User Enable Register */
>  
> + /* Pointer Authentication Registers */
> + APIAKEYLO_EL1,
> + APIAKEYHI_EL1,
> + APIBKEYLO_EL1,
> + APIBKEYHI_EL1,
> + APDAKEYLO_EL1,
> + APDAKEYHI_EL1,
> + APDBKEYLO_EL1,
> + APDBKEYHI_EL1,
> + APGAKEYLO_EL1,
> + APGAKEYHI_EL1,
> +
>   /* 32bit specific registers. Keep them at the end of the range */
>   DACR32_EL2, /* Domain Access Control Register */
>   IFSR32_EL2, /* Instruction Fault Status Register */
> @@ -368,10 +380,19 @@ static inline void __cpu_init_hyp_mode(phys_addr_t 
> pgd_ptr,
>   __kvm_call_hyp((void *)pgd_ptr, hyp_stack_ptr, vector_ptr);
>  }
>  
> +void kvm_arm_vcpu_ptrauth_enable(struct kvm_vcpu *vcpu);
> +void kvm_arm_vcpu_ptrauth_disable(struct kvm_vcpu *vcpu);
> +void kvm_arm_vcpu_ptrauth_trap(struct kvm_vcpu *vcpu);
> +
>  static inline void kvm_arch_hardware_unsetup(void) {}
>  static inline void kvm_arch_sync_events(struct kvm *kvm) {}
>  static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
> -static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
> +
> +static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu)
> +{
> + kvm_arm_vcpu_ptrauth_disable(vcpu);
> +}
> +

why sched_in and not vcpu_load?  (vcpu_load happens whenever you're
returning from userspace for example, and not only when you've been
scheduled away and are coming back).

And why do we want to 'reset' the behavior of KVM when the host
schedules a VCPU thread?

If I understand the logic correctly, what you're establishing with the
appraoch of initially trapping use of ptrauth is to avoid
saving/restoring the state if the guest dosn't use ptrauth, but then if
the guest starts using it, it's likely to keep using it, and therefore
we start saving/restoring the registers.

Why is the host's decision to schedule something else on this physical
CPU a good indication that we should no longer save/restore these
registers for the VCPU?  Wouldn't it make more sense to have a flag on
the VCPU, and potentially a counter, so that if you switch X times
without ever touching the registers again, you can stop saving/restoring
the state, if that's even something we care about?

Another thing that comes to mind; does the kernel running a KVM's VCPU
thread (and handling softirqs, ISRs, etc.) ever use the ptrauth system,
or does that only happen when we go to userspace?  If the latter, we
could handle the save/restore entirely in vcpu_put/vcpu_load instead. I
don't mind picking up that bit as part of my ongoing optimization work
later though, if you're eager to get these patches merged.

>  static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
>  
>  void kvm_arm_init_debug(void);
> diff --git a/arch/arm64/include/asm/kvm_hyp.h 
> 

Re: [PATCH 09/11] arm64/kvm: preserve host HCR_EL2 value

2017-08-01 Thread Christoffer Dall
On Wed, Jul 19, 2017 at 05:01:30PM +0100, Mark Rutland wrote:
> When restoring HCR_EL2 for the host, KVM uses HCR_HOST_VHE_FLAGS, which
> is a constant value. This works today, as the host HCR_EL2 value is
> always the same, but this will get in the way of supporting extensions
> that require HCR_EL2 bits to be set conditionally for the host.
> 
> To allow such features to work without KVM having to explicitly handle
> every possible host feature combination, this patch has KVM save/restore
> the host HCR when switching to/from a guest HCR.
> 
> For __{activate,deactivate}_traps(), the HCR save/restore is made common
> across the !VHE and VHE paths. As the host and guest HCR values must
> have E2H set when VHE is in use, register redirection should always be
> in effect at EL2, and this change should not adversely affect the VHE
> code.
> 
> For the hyp TLB maintenance code, __tlb_switch_to_host_vhe() is updated
> to toggle the TGE bit with a RMW sequence, as we already do in
> __tlb_switch_to_guest_vhe().
> 
> The now unused HCR_HOST_VHE_FLAGS definition is removed.
> 
> Signed-off-by: Mark Rutland 
> Cc: Christoffer Dall 
> Cc: Marc Zyngier 
> Cc: kvmarm@lists.cs.columbia.edu
> ---
>  arch/arm64/include/asm/kvm_arm.h  | 1 -
>  arch/arm64/include/asm/kvm_host.h | 5 -
>  arch/arm64/kvm/hyp/switch.c   | 5 +++--
>  arch/arm64/kvm/hyp/tlb.c  | 6 +-
>  4 files changed, 12 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_arm.h 
> b/arch/arm64/include/asm/kvm_arm.h
> index c1267e8..7b9c898 100644
> --- a/arch/arm64/include/asm/kvm_arm.h
> +++ b/arch/arm64/include/asm/kvm_arm.h
> @@ -84,7 +84,6 @@
>HCR_AMO | HCR_SWIO | HCR_TIDCP | HCR_RW)
>  #define HCR_VIRT_EXCP_MASK (HCR_VSE | HCR_VI | HCR_VF)
>  #define HCR_INT_OVERRIDE   (HCR_FMO | HCR_IMO)
> -#define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H)
>  
>  /* TCR_EL2 Registers bits */
>  #define TCR_EL2_RES1 ((1 << 31) | (1 << 23))
> diff --git a/arch/arm64/include/asm/kvm_host.h 
> b/arch/arm64/include/asm/kvm_host.h
> index d686300..0d7c3dd 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -198,10 +198,13 @@ struct kvm_cpu_context {
>  struct kvm_vcpu_arch {
>   struct kvm_cpu_context ctxt;
>  
> - /* HYP configuration */
> + /* Guest HYP configuration */
>   u64 hcr_el2;
>   u32 mdcr_el2;
>  
> + /* Host HYP configuration */
> + u64 host_hcr_el2;
> +
>   /* Exception Information */
>   struct kvm_vcpu_fault_info fault;
>  
> diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> index 945e79c..6108813 100644
> --- a/arch/arm64/kvm/hyp/switch.c
> +++ b/arch/arm64/kvm/hyp/switch.c
> @@ -71,6 +71,8 @@ static void __hyp_text __activate_traps(struct kvm_vcpu 
> *vcpu)
>  {
>   u64 val;
>  
> + vcpu->arch.host_hcr_el2 = read_sysreg(hcr_el2);
> +
>   /*
>* We are about to set CPTR_EL2.TFP to trap all floating point
>* register accesses to EL2, however, the ARM ARM clearly states that
> @@ -110,7 +112,6 @@ static void __hyp_text __deactivate_traps_vhe(void)
>   MDCR_EL2_TPMS;
>  
>   write_sysreg(mdcr_el2, mdcr_el2);
> - write_sysreg(HCR_HOST_VHE_FLAGS, hcr_el2);
>   write_sysreg(CPACR_EL1_FPEN, cpacr_el1);
>   write_sysreg(vectors, vbar_el1);
>  }
> @@ -123,7 +124,6 @@ static void __hyp_text __deactivate_traps_nvhe(void)
>   mdcr_el2 |= MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT;
>  
>   write_sysreg(mdcr_el2, mdcr_el2);
> - write_sysreg(HCR_RW, hcr_el2);
>   write_sysreg(CPTR_EL2_DEFAULT, cptr_el2);
>  }
>  
> @@ -145,6 +145,7 @@ static void __hyp_text __deactivate_traps(struct kvm_vcpu 
> *vcpu)
>   __deactivate_traps_arch()();
>   write_sysreg(0, hstr_el2);
>   write_sysreg(0, pmuserenr_el0);
> + write_sysreg(vcpu->arch.host_hcr_el2, hcr_el2);
>  }
>  
>  static void __hyp_text __activate_vm(struct kvm_vcpu *vcpu)
> diff --git a/arch/arm64/kvm/hyp/tlb.c b/arch/arm64/kvm/hyp/tlb.c
> index 73464a9..c2b0680 100644
> --- a/arch/arm64/kvm/hyp/tlb.c
> +++ b/arch/arm64/kvm/hyp/tlb.c
> @@ -49,12 +49,16 @@ static hyp_alternate_select(__tlb_switch_to_guest,
>  
>  static void __hyp_text __tlb_switch_to_host_vhe(struct kvm *kvm)
>  {
> + u64 val;
> +
>   /*
>* We're done with the TLB operation, let's restore the host's
>* view of HCR_EL2.
>*/
>   write_sysreg(0, vttbr_el2);
> - write_sysreg(HCR_HOST_VHE_FLAGS, hcr_el2);
> + val = read_sysreg(hcr_el2);
> + val |= HCR_TGE;
> + write_sysreg(val, hcr_el2);
>  }
>  
>  static void __hyp_text __tlb_switch_to_host_nvhe(struct kvm *kvm)
> -- 
> 1.9.1
> 

Reviewed-by: Christoffer Dall 
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu