Re: [PATCH v2 26/28] arm64/sve: Add documentation

2017-10-06 Thread Dave Martin
On Fri, Oct 06, 2017 at 04:43:43PM +0100, Szabolcs Nagy wrote:
> On 31/08/17 18:00, Dave Martin wrote:
> > +9.  System runtime configuration
> > +
> > +
> > +* To mitigate the ABI impact of expansion of the signal frame, a policy
> > +  mechanism is provided for administrators, distro maintainers and 
> > developers
> > +  to set the default vector length for userspace processes:
> > +
> > +/proc/cpu/sve_default_vector_length
> 
> 
> elsewhere in the patch series i see
> 
> /proc/sys/abi/sve_default_vector_length
> 
> is this supposed to be the same?

Good spot, thanks!

/proc/cpu/ was the old location: they should both say /proc/sys/abi/.
I'll fix it.
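
For completeness, setting the default from a small helper or init program is
just a write to that file. A minimal sketch, assuming the /proc/sys/abi/ path
above and a value expressed in bytes (both are assumptions for illustration,
not documentation):

#include <stdio.h>

/* Illustrative only: set the default SVE vector length for new processes.
 * Assumes /proc/sys/abi/sve_default_vector_length and a value in bytes. */
int main(void)
{
	FILE *f = fopen("/proc/sys/abi/sve_default_vector_length", "w");

	if (!f)
		return 1;
	fprintf(f, "%d\n", 32);		/* 32 bytes == 256-bit vectors */
	return fclose(f) ? 1 : 0;
}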

---Dave


Re: [PATCH v6 6/7] KVM: arm64: allow get exception information from userspace

2017-10-06 Thread James Morse
Hi gengdongjiu,

On 27/09/17 12:07, gengdongjiu wrote:
> On 2017/9/23 0:51, James Morse wrote:
>> If this wasn't a firmware-first notification, then you're right KVM hands the
>> guest an asynchronous external abort. This could be considered a bug in KVM (we
>> can discuss with Marc and Christoffer what it should do), but:
>>
>> I'm not sure what scenario you could see this in: surely all your
>> CPU:external-aborts are taken to EL3 by SCR_EL3.EA and become firmware-first
>> notifications. So they should always be claimed by APEI.

> Yes, if it is firmware-first we should not have such an issue.

[...]

>> What you may be seeing is some awkwardness with the change in the SError ESR
>> with v8.2. Previously the VSE mechanism injected an impdef SError, (but they
>> were all impdef so it didn't matter).
>> With VSESR_EL2 KVM has to specify one, and all-zeros is a bad choice as this
>> means 'classified as a RAS error ... unknown!'.

>> I have a patch in the upcoming SError/RAS series that changes KVM's
>> virtual-abort code to specify an impdef ESR for this path.

https://www.spinics.net/lists/arm-kernel/msg609589.html


> Earlier, I remember that Marc and you suggested I specify an impdef ESR (set
> vsesr_el2) in userspace,

If Qemu/kvmtool wants to emulate a machine that notifies the guest about
firmware-first RAS Errors using SError/SEI, it needs to decide when to send
these SError and what ESR to specify.


> now do you want to specify an impdef ESR in KVM instead of userspace?

No, that patch is just trying to fix up the existing cases where KVM already
injects an SError. I'm just trying to keep the behaviour the same:

Before the RAS Extensions the guest would always get an impdef SError ESR.
After the RAS Extensions KVM has to pick an ESR, as otherwise the guest gets the
hardware-reset value of VSESR_EL2. On the FVP this is all-zeros, which is a RAS
encoding. That patch changes it to still be an impdef SError ESR.


> if KVM sets vsesr_el2, does user-space still need to specify it?

I think we need a KVM CAP API to do this. With that patch it can be wired into
pend_guest_serror(), which takes the ESR to make pending if the CPU supports it.

It's not clear to me whether it's useful for user-space to make an SError
pending even if it can't set the ESR.
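
To make that concrete, the user-space side of such an interface could look
roughly like the sketch below. The capability, ioctl and structure names are
placeholders made up for illustration, not an existing KVM API:

#include <linux/kvm.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Placeholder names and values only: not a real KVM interface. */
#define KVM_CAP_ARM_INJECT_SERROR	9999
struct kvm_arm_serror {
	__u64 esr;	/* value for VSESR_EL2, if the CPU has it */
	__u32 has_esr;	/* 0: just make an SError pending */
	__u32 pad;
};
#define KVM_ARM_INJECT_SERROR	_IOW(KVMIO, 0xff, struct kvm_arm_serror)

static int inject_serror(int vm_fd, int vcpu_fd, uint64_t esr)
{
	struct kvm_arm_serror ev = { .esr = esr, .has_esr = 1 };

	/* Only usable if the host advertises the (made-up) capability. */
	if (ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_ARM_INJECT_SERROR) <= 0)
		return -1;

	return ioctl(vcpu_fd, KVM_ARM_INJECT_SERROR, &ev);
}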

[...]

>> Because the SEI notification depends on v8.2 I'd like to get the SError/RAS
>> series posted (currently re-testing), then I'll pick up enough of the patches
>> you've posted for a consolidated version of the series, and we can take the
>> discussion from there.

> James, that is great; we can make a consolidated version of the series.

We need to get some of the wider issues fixed first; the big ugly one is
memory_failure_queue() not being NMI safe. (This isn't a problem for SEA, as it
would only become re-entrant if the kernel text was corrupt.) It is a problem
for SEI and SDEI.


>> I'd still like to know what your firmware does if the normal-world believes
>> it has masked physical-SError and you want to hand it an SEI notification.

Aha, thanks for this:

> firstly, the physical SError that happened at EL2/EL1/EL0 cannot be
> masked if SCR_EL3.EA is set.

For the CPU, true: it can't be masked, because the CPU can deliver the SError to EL3.

What about software? Code at EL1 and EL2 each has a PSTATE.A bit it may have
set. HCR_EL2.VSE respects EL1's PSTATE.A ... the question is: does your firmware
respect the PSTATE.A value of the exception level that SErrors are routed to?


> when trapping to EL3, the firmware will record the error into APEI CPER by
> reading the ERR* RAS registers.
> 
> (1) if HCR_EL2.TEA is set to 1 and the exception came from EL0 or EL1, the
> firmware knows this

HCR_EL2.TEA covers synchronous-external-aborts. For SError you need to check
HCR_EL2.AMO. Some crazy hypervisor may set one and not the other.


> SError came from the guest OS: copy elr_el3 to elr_el2, copy ESR_EL3 to
> ESR_EL2.

The EC value in the ESR encodes whether the exception came from the current or
a lower exception level; you need to re-encode this for EL2 if the exception
came from EL2.


> if the SError exception comes from guest EL0 or EL1, set ELR_EL3 to
> VBAR_EL2 + 0x580 (one EL2 SEI entry point),
> 
> execute "ERET", then jump to the EL2 hypervisor.
>
> (2) if the SError exception comes from the EL2 hypervisor, copy elr_el3 to
> elr_el2, copy ESR_EL3 to ESR_EL2,
> set ELR_EL3 to VBAR_EL2 + 0x380 (one EL2 SEI entry point),
> 
> execute "ERET", then jump to the EL2 hypervisor.

This SError came from EL2. You _must_ check that SPSR_EL3.A is clear before
returning to the EL2 SError vector.

EL2 believes it has masked SError; it does this because it can't handle one
right now. If your firmware jumps in anyway, it's game over.

We mask SError in entry.S when we take an exception and when we return from an
exception. This is so that we can read/write the ELR/SPSR without them changing
under our feet. If your firmware overwrites these values, we've lost them and
can never return to the context we interrupted.
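
Pulling the points above together, here is a minimal, hypothetical sketch of
the EL3 delegation logic being described. All accessor names are placeholders,
and this is not your firmware's actual code:

#include <stdint.h>

/* Placeholder system-register accessors for the sketch. */
extern uint64_t read_spsr_el3(void);
extern uint64_t read_elr_el3(void);
extern uint64_t read_esr_el3(void);
extern uint64_t read_hcr_el2(void);
extern uint64_t read_vbar_el2(void);
extern void write_elr_el2(uint64_t v);
extern void write_esr_el2(uint64_t v);
extern void write_spsr_el2(uint64_t v);
extern void write_elr_el3(uint64_t v);
extern int serror_came_from_el2(void);

#define HCR_AMO		(1UL << 5)	/* HCR_EL2.AMO */
#define SPSR_A_BIT	(1UL << 8)	/* PSTATE.A as saved in SPSR_EL3 */

static int el3_delegate_serror_to_el2(void)
{
	uint64_t spsr_el3 = read_spsr_el3();

	/* SError routing to EL2 is governed by HCR_EL2.AMO, not TEA. */
	if (!(read_hcr_el2() & HCR_AMO))
		return -1;		/* EL2 did not ask for SError */

	/* Never return into EL2's SError vector while EL2 has PSTATE.A set. */
	if (serror_came_from_el2() && (spsr_el3 & SPSR_A_BIT))
		return -1;		/* EL2 can't take it now: defer the error */

	write_elr_el2(read_elr_el3());
	write_esr_el2(read_esr_el3());	/* may need the EC re-encoded for EL2 */
	write_spsr_el2(spsr_el3);

	/* 0x580: SError from a lower EL (AArch64); 0x380: SError from EL2 itself. */
	if (serror_came_from_el2())
		write_elr_el3(read_vbar_el2() + 0x380);
	else
		write_elr_el3(read_vbar_el2() + 0x580);

	/* ... fix up SPSR_EL3 for the ERET into EL2, then ERET ... */
	return 0;
}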


>

Re: 答复: [PATCH v6 6/7] KVM: arm64: allow get exception information from userspace

2017-10-06 Thread James Morse
Hi gengdongjiu,

On 25/09/17 16:13, gengdongjiu wrote:
> On 2017/9/23 0:39, James Morse wrote:
>> On 18/09/17 14:36, gengdongjiu wrote:
>>> On 2017/9/14 21:00, James Morse wrote:
 On 13/09/17 08:32, gengdongjiu wrote:
>>> Then for "BUS_MCEERR_AO", how do we distinguish between an asynchronous data
>>> access (SError) and a PCIe AER error?
>>
>> How would userspace get one of these memory errors for a PCIe error?
> 
> Seems OK.
> For now I only add support for host SEI and SEA virtualization. I have not
> considered the PCIe error case much;
> maybe we need to consider that afterwards.

Any PCIe AER error should be handled by the device driver. If user-space is
interacting with the device-driver directly, it's up to them to come up with a
way of reporting errors.


>>> In user space, we can check the si_code: if it is
>>> "BUS_MCEERR_AR", we use the SEA notification type for the guest; if it is
>>> "BUS_MCEERR_AO", we use the SEI notification type for the guest.
>>> Because there are only two values for si_code ("BUS_MCEERR_AR" and
>>> BUS_MCEERR_AO), in which case would we use the GSIV (IRQ) notification type?
>>
>> This is for Qemu/kvmtool to decide, it depends on what sort of machine 
>> they are emulating.
>>
>> For example, the physical machine's memory-controller may notify the 
>> CPU about memory errors by triggering SError trapped to EL3, or with a 
>> dedicated FIQ, also routed to EL3. By the time this gets to the host 
>> kernel the distinction doesn't matter. The host has handled the error.
>>
>> For a guest, your memory-controller is effectively the host kernel. It
>> will give you a BUS_MCEERR_AO signal for any guest memory that is
>> affected, and a BUS_MCEERR_AR if the guest directly accesses a page of
>> affected memory.
>>
>> What Qemu/kvmtool do with this is up to them. If they're emulating a
>> machine with no RAS features, print an error and exit.
>>
>> Otherwise BUS_MCEERR_AR could be notified as one of the flavours of 
>> IRQ, unless the affected vcpu has interrupts masked, in which case an 
>> SEA notification gives you some NMI-like behaviour.

> Thanks for the explanation.
> Now that the SEA notification can provide NMI-like behaviour, how about we use
> it for BUS_MCEERR_AR to avoid checking the interrupt-mask status?

... yes, that's the suggestion. Qemu/kvmtool can take MCEERR_AR from KVM and
SET_ONE_REG the vcpu into taking a synchronous-external-abort instead.

(but there is trouble ahead where this suggestion unravels)


> Even if the guest OS does not support the SEA notification, it is still a
> valid guest Synchronous-external-abort

When it comes from KVM, yes.

If you think of Linux+KVM's stage-2 translation as your guest's memory
controller, it effectively guarantees to:
1. 'interrupt' userspace with MCEERR_AO when it detects an error,
2. contain it (by unmapping and marking the page PG_poison), then
3. give you a synchronous external abort with MCEERR_AR if the guest touches the
contained memory.

This is fine when the MCEERR_AR came from KVM; what we need to find out is
whether this will still be true for some future kernel code that sends
MCEERR_AR...
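
For what it's worth, here is a rough sketch of the user-space side being
discussed: a SIGBUS handler (installed with sigaction() and SA_SIGINFO) that
looks at si_code and maps the faulting host VA back to a guest IPA, as comes
up again further down. The mem_region table and its fields are made up for
the example:

#include <signal.h>
#include <stdint.h>

/* Hypothetical VMM memory-slot table: host VA range -> guest IPA base. */
struct mem_region { uint64_t hva_base, ipa_base, size; };
extern struct mem_region regions[];
extern int nr_regions;

static void vmm_sigbus_handler(int sig, siginfo_t *si, void *ctx)
{
	uint64_t hva = (uint64_t)(uintptr_t)si->si_addr;
	int i;

	for (i = 0; i < nr_regions; i++) {
		struct mem_region *r = &regions[i];
		uint64_t ipa;

		if (hva < r->hva_base || hva >= r->hva_base + r->size)
			continue;

		ipa = r->ipa_base + (hva - r->hva_base);
		if (si->si_code == BUS_MCEERR_AR) {
			/* The guest consumed poisoned memory: e.g. build CPER
			 * records for 'ipa' and inject a synchronous external
			 * abort into the vcpu. */
		} else if (si->si_code == BUS_MCEERR_AO) {
			/* Error detected but not consumed: e.g. use a polled
			 * or IRQ GHES notification, or just log it. */
		}
		(void)ipa;
		return;
	}
	/* Not guest memory: restore the default handler and re-raise. */
}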


>> For BUS_MCEERR_AO you could use SEI, IRQ or polled notification. My 
>> choice would be IRQ, as you can't know if the guest supports SEI and 
>> it would be a shame to
> 
> How about we first check whether user space can specify the virtual SError
> Exception Syndrome (i.e. whether there is a vsesr_el2 register)?
> If it can, use the SEI notification, otherwise use the IRQ notification.
> The advantage is that SEI can pass more error information than IRQ if the
> Syndrome can be specified.

If this is for APEI firmware-first, it's the severity in the CPER records that
matters, so the SEI/IRQ choice doesn't matter if the guest supports both.

But it's unlikely the guest supports both (v4.14 doesn't).
If you pick IRQ and the guest doesn't support it, nothing happens: the guest
never takes the IRQ. You take the memory error as an MCEERR_AR some time later.

If you pick SEI and the guest doesn't support it, the guest dies of an unknown
SError.

I would pick POLLed by default, as this would let the Qemu/kvmtool code that
does this work be portable between x86 and arm64.

[...]

>>> If we call memory_failure(), it can translate the PA to a host
>>> VA, then deliver the host VA to Qemu.
>>
>> Yes, this is how it works for any user-space process as two processes 
>> sharing the same page may map it in different locations.
>>
>>
>>> Qemu can translate the host VA to the IPA, so we rely on memory_failure()
>>> to get the IPA.
>>
>> Yes. I don't see why this is a problem: The kernel isn't going to pass 
>> RAS events into the guest, so it never needs to know the IPA.
>>
>> Instead we notify user-space about ranges of memory affected by 
>> memory_failure(), KVM's user-space isn't a special case here.
>>
>> As you point out, if Qemu wants to notify the guest it can calculate 
>> the IPA and either use CPER for firmware-first, or in the future, 

Re: [PATCH v2 26/28] arm64/sve: Add documentation

2017-10-06 Thread Szabolcs Nagy
On 31/08/17 18:00, Dave Martin wrote:
> +9.  System runtime configuration
> +
> +
> +* To mitigate the ABI impact of expansion of the signal frame, a policy
> +  mechanism is provided for administrators, distro maintainers and developers
> +  to set the default vector length for userspace processes:
> +
> +/proc/cpu/sve_default_vector_length


elsewhere in the patch series i see

/proc/sys/abi/sve_default_vector_length

is this supposed to be the same?



[PATCH v4 24/26] KVM: arm/arm64: GICv4: Enable VLPI support

2017-10-06 Thread Marc Zyngier
All it takes is the has_v4 flag to be set in gic_kvm_info
as well as "kvm-arm.vgic_v4_enable=1" being passed on the
command line for GICv4 to be enabled in KVM.

Acked-by: Christoffer Dall 
Signed-off-by: Marc Zyngier 
---
 Documentation/admin-guide/kernel-parameters.txt |  4 
 virt/kvm/arm/vgic/vgic-v3.c | 14 ++
 2 files changed, 18 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 05496622b4ef..93c8fff399eb 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1874,6 +1874,10 @@
[KVM,ARM] Trap guest accesses to GICv3 common
system registers
 
+   kvm-arm.vgic_v4_enable=
+   [KVM,ARM] Allow use of GICv4 for direct injection of
+   LPIs.
+
kvm-intel.ept=  [KVM,Intel] Disable extended page tables
(virtualized MMU) support on capable Intel chips.
Default is 1 (enabled)
diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c
index 96ea597db0e7..405733678c2f 100644
--- a/virt/kvm/arm/vgic/vgic-v3.c
+++ b/virt/kvm/arm/vgic/vgic-v3.c
@@ -24,6 +24,7 @@
 static bool group0_trap;
 static bool group1_trap;
 static bool common_trap;
+static bool gicv4_enable;
 
 void vgic_v3_set_underflow(struct kvm_vcpu *vcpu)
 {
@@ -459,6 +460,12 @@ static int __init early_common_trap_cfg(char *buf)
 }
 early_param("kvm-arm.vgic_v3_common_trap", early_common_trap_cfg);
 
+static int __init early_gicv4_enable(char *buf)
+{
+   return strtobool(buf, &gicv4_enable);
+}
+early_param("kvm-arm.vgic_v4_enable", early_gicv4_enable);
+
 /**
  * vgic_v3_probe - probe for a GICv3 compatible interrupt controller in DT
  * @node:  pointer to the DT node
@@ -478,6 +485,13 @@ int vgic_v3_probe(const struct gic_kvm_info *info)
kvm_vgic_global_state.can_emulate_gicv2 = false;
kvm_vgic_global_state.ich_vtr_el2 = ich_vtr_el2;
 
+   /* GICv4 support? */
+   if (info->has_v4) {
+   kvm_vgic_global_state.has_gicv4 = gicv4_enable;
+   kvm_info("GICv4 support %sabled\n",
+gicv4_enable ? "en" : "dis");
+   }
+
if (!info->vcpu.start) {
kvm_info("GICv3: no GICV resource entry\n");
kvm_vgic_global_state.vcpu_base = 0;
-- 
2.14.1



[PATCH v4 21/26] KVM: arm/arm64: GICv4: Hook vPE scheduling into vgic flush/sync

2017-10-06 Thread Marc Zyngier
The redistributor needs to be told which vPE is about to be run,
and tells us whether there is any pending VLPI on exit.

Let's add the scheduling calls to the vgic flush/sync functions,
allowing the VLPIs to be delivered to the guest.

Signed-off-by: Marc Zyngier 
---
 virt/kvm/arm/vgic/vgic-v4.c | 39 +++
 virt/kvm/arm/vgic/vgic.c|  4 
 virt/kvm/arm/vgic/vgic.h|  2 ++
 3 files changed, 45 insertions(+)

diff --git a/virt/kvm/arm/vgic/vgic-v4.c b/virt/kvm/arm/vgic/vgic-v4.c
index 5ee69aec990a..68a2ef6df161 100644
--- a/virt/kvm/arm/vgic/vgic-v4.c
+++ b/virt/kvm/arm/vgic/vgic-v4.c
@@ -130,6 +130,45 @@ void vgic_v4_teardown(struct kvm *kvm)
its_vm->vpes = NULL;
 }
 
+int vgic_v4_sync_hwstate(struct kvm_vcpu *vcpu)
+{
+   if (!vgic_supports_direct_msis(vcpu->kvm))
+   return 0;
+
+   return its_schedule_vpe(&vcpu->arch.vgic_cpu.vgic_v3.its_vpe, false);
+}
+
+int vgic_v4_flush_hwstate(struct kvm_vcpu *vcpu)
+{
+   int irq = vcpu->arch.vgic_cpu.vgic_v3.its_vpe.irq;
+   int err;
+
+   if (!vgic_supports_direct_msis(vcpu->kvm))
+   return 0;
+
+   /*
+* Before making the VPE resident, make sure the redistributor
+* corresponding to our current CPU expects us here. See the
+* doc in drivers/irqchip/irq-gic-v4.c to understand how this
+* turns into a VMOVP command at the ITS level.
+*/
+   err = irq_set_affinity(irq, cpumask_of(smp_processor_id()));
+   if (err)
+   return err;
+
+   err = its_schedule_vpe(&vcpu->arch.vgic_cpu.vgic_v3.its_vpe, true);
+   if (err)
+   return err;
+
+   /*
+* Now that the VPE is resident, let's get rid of a potential
+* doorbell interrupt that would still be pending.
+*/
+   err = irq_set_irqchip_state(irq, IRQCHIP_STATE_PENDING, false);
+
+   return err;
+}
+
 static struct vgic_its *vgic_get_its(struct kvm *kvm,
 struct kvm_kernel_irq_routing_entry 
*irq_entry)
 {
diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index 0691a2250949..71ca0ff1b35a 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -709,6 +709,8 @@ void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu)
 {
struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 
+   WARN_ON(vgic_v4_sync_hwstate(vcpu));
+
/* An empty ap_list_head implies used_lrs == 0 */
if (list_empty(&vcpu->arch.vgic_cpu.ap_list_head))
return;
@@ -721,6 +723,8 @@ void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu)
 /* Flush our emulation state into the GIC hardware before entering the guest. 
*/
 void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu)
 {
+   WARN_ON(vgic_v4_flush_hwstate(vcpu));
+
/*
 * If there are no virtual interrupts active or pending for this
 * VCPU, then there is no work to do and we can bail out without
diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
index 1bd2e28ac097..1c65750892d7 100644
--- a/virt/kvm/arm/vgic/vgic.h
+++ b/virt/kvm/arm/vgic/vgic.h
@@ -245,5 +245,7 @@ int update_lpi_config(struct kvm *kvm, struct vgic_irq *irq,
 bool vgic_supports_direct_msis(struct kvm *kvm);
 int vgic_v4_init(struct kvm *kvm);
 void vgic_v4_teardown(struct kvm *kvm);
+int vgic_v4_sync_hwstate(struct kvm_vcpu *vcpu);
+int vgic_v4_flush_hwstate(struct kvm_vcpu *vcpu);
 
 #endif
-- 
2.14.1



[PATCH v4 18/26] KVM: arm/arm64: GICv4: Use pending_last as a scheduling hint

2017-10-06 Thread Marc Zyngier
When a vPE exits, the pending_last flag is set when there are
pending VLPIs stored in the pending table. Similarly, we set
this flag when a doorbell interrupt fires, as it indicates the
same condition.

Let's update kvm_vgic_vcpu_pending_irq() to account for that
flag as well, making a vcpu runnable when set.

Acked-by: Christoffer Dall 
Signed-off-by: Marc Zyngier 
---
 virt/kvm/arm/vgic/vgic.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index 9d557efd1f7d..0691a2250949 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -769,6 +769,9 @@ int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)
if (!vcpu->kvm->arch.vgic.enabled)
return false;
 
+   if (vcpu->arch.vgic_cpu.vgic_v3.its_vpe.pending_last)
+   return true;
+
spin_lock(&vgic_cpu->ap_list_lock);
 
list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {
-- 
2.14.1



[PATCH v4 22/26] KVM: arm/arm64: GICv4: Enable virtual cpuif if VLPIs can be delivered

2017-10-06 Thread Marc Zyngier
In order for VLPIs to be delivered to the guest, we must make
sure that the cpuif is always enabled, irrespective of the
presence of virtual interrupts in the LRs.

Acked-by: Christoffer Dall 
Signed-off-by: Marc Zyngier 
---
 virt/kvm/arm/hyp/vgic-v3-sr.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/arm/hyp/vgic-v3-sr.c b/virt/kvm/arm/hyp/vgic-v3-sr.c
index 91728faa13fd..f5c3d6d7019e 100644
--- a/virt/kvm/arm/hyp/vgic-v3-sr.c
+++ b/virt/kvm/arm/hyp/vgic-v3-sr.c
@@ -258,7 +258,8 @@ void __hyp_text __vgic_v3_save_state(struct kvm_vcpu *vcpu)
cpu_if->vgic_ap1r[0] = __vgic_v3_read_ap1rn(0);
}
} else {
-   if (static_branch_unlikely(&vgic_v3_cpuif_trap))
+   if (static_branch_unlikely(&vgic_v3_cpuif_trap) ||
+   cpu_if->its_vpe.its_vm)
write_gicreg(0, ICH_HCR_EL2);
 
cpu_if->vgic_elrsr = 0x;
@@ -337,9 +338,11 @@ void __hyp_text __vgic_v3_restore_state(struct kvm_vcpu 
*vcpu)
/*
 * If we need to trap system registers, we must write
 * ICH_HCR_EL2 anyway, even if no interrupts are being
-* injected,
+* injected. Same thing if GICv4 is used, as VLPI
+* delivery is gated by ICH_HCR_EL2.En.
 */
-   if (static_branch_unlikely(&vgic_v3_cpuif_trap))
+   if (static_branch_unlikely(&vgic_v3_cpuif_trap) ||
+   cpu_if->its_vpe.its_vm)
write_gicreg(cpu_if->vgic_hcr, ICH_HCR_EL2);
}
 
-- 
2.14.1



[PATCH v4 26/26] KVM: arm/arm64: GICv4: Theory of operations

2017-10-06 Thread Marc Zyngier
Yet another braindump so I can free some cells...

Signed-off-by: Marc Zyngier 
---
 virt/kvm/arm/vgic/vgic-v4.c | 67 +
 1 file changed, 67 insertions(+)

diff --git a/virt/kvm/arm/vgic/vgic-v4.c b/virt/kvm/arm/vgic/vgic-v4.c
index 68a2ef6df161..b87806fea554 100644
--- a/virt/kvm/arm/vgic/vgic-v4.c
+++ b/virt/kvm/arm/vgic/vgic-v4.c
@@ -22,6 +22,73 @@
 
 #include "vgic.h"
 
+/*
+ * How KVM uses GICv4 (insert rude comments here):
+ *
+ * The vgic-v4 layer acts as a bridge between several entities:
+ * - The GICv4 ITS representation offered by the ITS driver
+ * - VFIO, which is in charge of the PCI endpoint
+ * - The virtual ITS, which is the only thing the guest sees
+ *
+ * The configuration of VLPIs is triggered by a callback from VFIO,
+ * instructing KVM that a PCI device has been configured to deliver
+ * MSIs to a vITS.
+ *
+ * kvm_vgic_v4_set_forwarding() is thus called with the routing entry,
+ * and this is used to find the corresponding vITS data structures
+ * (ITS instance, device, event and irq) using a process that is
+ * extremely similar to the injection of an MSI.
+ *
+ * At this stage, we can link the guest's view of an LPI (uniquely
+ * identified by the routing entry) and the host irq, using the GICv4
+ * driver mapping operation. Should the mapping succeed, we've then
+ * successfully upgraded the guest's LPI to a VLPI. We can then start
+ * with updating GICv4's view of the property table and generating an
+ * INValidation in order to kickstart the delivery of this VLPI to the
+ * guest directly, without software intervention. Well, almost.
+ *
+ * When the PCI endpoint is deconfigured, this operation is reversed
+ * with VFIO calling kvm_vgic_v4_unset_forwarding().
+ *
+ * Once the VLPI has been mapped, it needs to follow any change the
+ * guest performs on its LPI through the vITS. For that, a number of
+ * command handlers have hooks to communicate these changes to the HW:
+ * - Any invalidation triggers a call to its_prop_update_vlpi()
+ * - The INT command results in a irq_set_irqchip_state(), which
+ *   generates an INT on the corresponding VLPI.
+ * - The CLEAR command results in a irq_set_irqchip_state(), which
+ *   generates an CLEAR on the corresponding VLPI.
+ * - DISCARD translates into an unmap, similar to a call to
+ *   kvm_vgic_v4_unset_forwarding().
+ * - MOVI is translated by an update of the existing mapping, changing
+ *   the target vcpu, resulting in a VMOVI being generated.
+ * - MOVALL is translated by a string of mapping updates (similar to
+ *   the handling of MOVI). MOVALL is horrible.
+ *
+ * Note that a DISCARD/MAPTI sequence emitted from the guest without
+ * reprogramming the PCI endpoint after MAPTI does not result in a
+ * VLPI being mapped, as there is no callback from VFIO (the guest
+ * will get the interrupt via the normal SW injection). Fixing this is
+ * not trivial, and requires some horrible messing with the VFIO
+ * internals. Not fun. Don't do that.
+ *
+ * Then there is the scheduling. Each time a vcpu is about to run on a
+ * physical CPU, KVM must tell the corresponding redistributor about
+ * it. And if we've migrated our vcpu from one CPU to another, we must
+ * tell the ITS (so that the messages reach the right redistributor).
+ * This is done in two steps: first issue a irq_set_affinity() on the
+ * irq corresponding to the vcpu, then call its_schedule_vpe(). You
+ * must be in a non-preemptible context. On exit, another call to
+ * its_schedule_vpe() tells the redistributor that we're done with the
+ * vcpu.
+ *
+ * Finally, the doorbell handling: Each vcpu is allocated an interrupt
+ * which will fire each time a VLPI is made pending whilst the vcpu is
+ * not running. Each time the vcpu gets blocked, the doorbell
+ * interrupt gets enabled. When the vcpu is unblocked (for whatever
+ * reason), the doorbell interrupt is disabled.
+ */
+
 static irqreturn_t vgic_v4_doorbell_handler(int irq, void *info)
 {
struct kvm_vcpu *vcpu = info;
-- 
2.14.1



[PATCH v4 23/26] KVM: arm/arm64: GICv4: Prevent a VM using GICv4 from being saved

2017-10-06 Thread Marc Zyngier
The GICv4 architecture doesn't make it easy for save/restore to
work, as it doesn't give any guarantee that the pending state
is written into the pending table.

So let's not take any chances, and let's return an error if
we encounter any LPI that has the HW bit set.

Signed-off-by: Marc Zyngier 
---
 virt/kvm/arm/vgic/vgic-its.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index f434748439ee..01aa4d9d405e 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -1987,6 +1987,15 @@ static int vgic_its_save_itt(struct vgic_its *its, 
struct its_device *device)
list_for_each_entry(ite, &device->itt_head, ite_list) {
gpa_t gpa = base + ite->event_id * ite_esz;
 
+   /*
+* If an LPI carries the HW bit, this means that this
+* interrupt is controlled by GICv4, and we do not
+* have direct access to that state. Let's simply fail
+* the save operation...
+*/
+   if (ite->irq->hw)
+   return -EINVAL;
+
ret = vgic_its_save_ite(its, device, ite, gpa, ite_esz);
if (ret)
return ret;
-- 
2.14.1



[PATCH v4 25/26] KVM: arm/arm64: GICv4: Prevent heterogenous systems from using GICv4

2017-10-06 Thread Marc Zyngier
The GICv4 architecture doesn't prevent CPUs implementing GICv4 from
cohabiting with CPUs limited to GICv3 in the same system.

This is mad (the scheduler would have to be made aware of the v4
capability), and we're certainly not going to support this any
time soon. So let's check that all online CPUs are GICv4 capable,
and disable the functionality if not.

Signed-off-by: Marc Zyngier 
---
 include/linux/irqchip/arm-gic-v3.h |  2 ++
 virt/kvm/arm/vgic/vgic-v3.c| 27 +--
 2 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/include/linux/irqchip/arm-gic-v3.h 
b/include/linux/irqchip/arm-gic-v3.h
index 1ea576c8126f..dfa4a51643d6 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -532,6 +532,8 @@
 #define ICH_VTR_SEIS_MASK  (1 << ICH_VTR_SEIS_SHIFT)
 #define ICH_VTR_A3V_SHIFT  21
 #define ICH_VTR_A3V_MASK   (1 << ICH_VTR_A3V_SHIFT)
+#define ICH_VTR_nV4_SHIFT  20
+#define ICH_VTR_nV4_MASK   (1 << ICH_VTR_nV4_SHIFT)
 
 #define ICC_IAR1_EL1_SPURIOUS  0x3ff
 
diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c
index 405733678c2f..da99df50a818 100644
--- a/virt/kvm/arm/vgic/vgic-v3.c
+++ b/virt/kvm/arm/vgic/vgic-v3.c
@@ -466,9 +466,19 @@ static int __init early_gicv4_enable(char *buf)
 }
 early_param("kvm-arm.vgic_v4_enable", early_gicv4_enable);
 
+static void vgic_check_v4_cpuif(void *param)
+{
+   u32 ich_vtr_el2 = kvm_call_hyp(__vgic_v3_get_ich_vtr_el2);
+   bool *v4 = param;
+
+   *v4 = !(ich_vtr_el2 & ICH_VTR_nV4_MASK);
+   if (!*v4)
+   kvm_info("CPU%d is not GICv4 capable\n", smp_processor_id());
+}
+
 /**
  * vgic_v3_probe - probe for a GICv3 compatible interrupt controller in DT
- * @node:  pointer to the DT node
+ * @info:  pointer to the firmware-agnostic GIC information
  *
  * Returns 0 if a GICv3 has been found, returns an error code otherwise
  */
@@ -485,8 +495,21 @@ int vgic_v3_probe(const struct gic_kvm_info *info)
kvm_vgic_global_state.can_emulate_gicv2 = false;
kvm_vgic_global_state.ich_vtr_el2 = ich_vtr_el2;
 
-   /* GICv4 support? */
+   /*
+* GICv4 support? We need to check on all CPUs in case of some
+* extremely creative form of big-little brain damage...
+*/
if (info->has_v4) {
+   int cpu;
+
+   for_each_online_cpu(cpu) {
+   bool enable;
+
+   smp_call_function_single(cpu, vgic_check_v4_cpuif,
+&enable, 1);
+   gicv4_enable = gicv4_enable && enable;
+   }
+
kvm_vgic_global_state.has_gicv4 = gicv4_enable;
kvm_info("GICv4 support %sabled\n",
 gicv4_enable ? "en" : "dis");
-- 
2.14.1



[PATCH v4 11/26] KVM: arm/arm64: GICv4: Unmap VLPI when freeing an LPI

2017-10-06 Thread Marc Zyngier
When freeing an LPI (on a DISCARD command, for example), we need
to unmap the VLPI down to the physical ITS level.

Acked-by: Christoffer Dall 
Signed-off-by: Marc Zyngier 
---
 virt/kvm/arm/vgic/vgic-its.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index b2a678d131d0..c9b1c0967426 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -628,8 +628,12 @@ static void its_free_ite(struct kvm *kvm, struct its_ite 
*ite)
list_del(&ite->ite_list);
 
/* This put matches the get in vgic_add_lpi. */
-   if (ite->irq)
+   if (ite->irq) {
+   if (ite->irq->hw)
+   WARN_ON(its_unmap_vlpi(ite->irq->host_irq));
+
vgic_put_irq(kvm, ite->irq);
+   }
 
kfree(ite);
 }
-- 
2.14.1



[PATCH v4 17/26] KVM: arm/arm64: GICv4: Propagate VLPI properties at map time

2017-10-06 Thread Marc Zyngier
When the VLPI gets mapped, it must inherit the configuration of
the LPI configured at the vITS level. For that purpose, let's make
update_lpi_config globally available and call it just after
having performed the VLPI map operation.

Acked-by: Christoffer Dall 
Signed-off-by: Marc Zyngier 
---
 virt/kvm/arm/vgic/vgic-its.c | 6 ++
 virt/kvm/arm/vgic/vgic-v4.c  | 2 ++
 virt/kvm/arm/vgic/vgic.h | 2 ++
 3 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index eb72eb027060..f434748439ee 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -37,8 +37,6 @@
 static int vgic_its_save_tables_v0(struct vgic_its *its);
 static int vgic_its_restore_tables_v0(struct vgic_its *its);
 static int vgic_its_commit_v0(struct vgic_its *its);
-static int update_lpi_config(struct kvm *kvm, struct vgic_irq *irq,
-struct kvm_vcpu *filter_vcpu, bool needs_inv);
 
 /*
  * Creates a new (reference to a) struct vgic_irq for a given LPI.
@@ -272,8 +270,8 @@ static struct its_collection *find_collection(struct 
vgic_its *its, int coll_id)
  * If filter_vcpu is not NULL, applies only if the IRQ is targeting this
  * VCPU. Unconditionally applies if filter_vcpu is NULL.
  */
-static int update_lpi_config(struct kvm *kvm, struct vgic_irq *irq,
-struct kvm_vcpu *filter_vcpu, bool needs_inv)
+int update_lpi_config(struct kvm *kvm, struct vgic_irq *irq,
+ struct kvm_vcpu *filter_vcpu, bool needs_inv)
 {
u64 propbase = GICR_PROPBASER_ADDRESS(kvm->arch.vgic.propbaser);
u8 prop;
diff --git a/virt/kvm/arm/vgic/vgic-v4.c b/virt/kvm/arm/vgic/vgic-v4.c
index ba1dd3162eba..b79a0450bb1c 100644
--- a/virt/kvm/arm/vgic/vgic-v4.c
+++ b/virt/kvm/arm/vgic/vgic-v4.c
@@ -147,6 +147,8 @@ int kvm_vgic_v4_set_forwarding(struct kvm *kvm, int virq,
irq->hw = true;
irq->host_irq   = virq;
 
+   /* Force the property update and invalidate */
+   update_lpi_config(kvm, irq, NULL, true);
 out:
mutex_unlock(&its->its_lock);
return ret;
diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
index c4105f613f57..1bd2e28ac097 100644
--- a/virt/kvm/arm/vgic/vgic.h
+++ b/virt/kvm/arm/vgic/vgic.h
@@ -239,6 +239,8 @@ static inline int vgic_v3_max_apr_idx(struct kvm_vcpu *vcpu)
 int vgic_its_resolve_lpi(struct kvm *kvm, struct vgic_its *its,
 u32 devid, u32 eventid, struct vgic_irq **irq);
 struct vgic_its *vgic_msi_to_its(struct kvm *kvm, struct kvm_msi *msi);
+int update_lpi_config(struct kvm *kvm, struct vgic_irq *irq,
+ struct kvm_vcpu *filter_vcpu, bool needs_inv);
 
 bool vgic_supports_direct_msis(struct kvm *kvm);
 int vgic_v4_init(struct kvm *kvm);
-- 
2.14.1



[PATCH v4 20/26] KVM: arm/arm64: GICv4: Use the doorbell interrupt as an unblocking source

2017-10-06 Thread Marc Zyngier
The doorbell interrupt is only useful if the vcpu is blocked on WFI.
In all other cases, receiving a doorbell interrupt is just a waste
of cycles.

So let's only enable the doorbell if a vcpu is getting blocked,
and disable it when it is unblocked. This is very similar to
what we're doing for the background timer.

Reviewed-by: Christoffer Dall 
Signed-off-by: Marc Zyngier 
---
 include/kvm/arm_vgic.h  |  3 +++
 virt/kvm/arm/arm.c  |  2 ++
 virt/kvm/arm/vgic/vgic-v4.c | 18 ++
 3 files changed, 23 insertions(+)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 2f750c770bf2..8c896540a72c 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -381,4 +381,7 @@ int kvm_vgic_v4_set_forwarding(struct kvm *kvm, int irq,
 int kvm_vgic_v4_unset_forwarding(struct kvm *kvm, int irq,
 struct kvm_kernel_irq_routing_entry 
*irq_entry);
 
+void kvm_vgic_v4_enable_doorbell(struct kvm_vcpu *vcpu);
+void kvm_vgic_v4_disable_doorbell(struct kvm_vcpu *vcpu);
+
 #endif /* __KVM_ARM_VGIC_H */
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 8388c1cc23f6..092a4ec03e57 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -316,11 +316,13 @@ int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
 void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
 {
kvm_timer_schedule(vcpu);
+   kvm_vgic_v4_enable_doorbell(vcpu);
 }
 
 void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu)
 {
kvm_timer_unschedule(vcpu);
+   kvm_vgic_v4_disable_doorbell(vcpu);
 }
 
 int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
diff --git a/virt/kvm/arm/vgic/vgic-v4.c b/virt/kvm/arm/vgic/vgic-v4.c
index 2c250ee37878..5ee69aec990a 100644
--- a/virt/kvm/arm/vgic/vgic-v4.c
+++ b/virt/kvm/arm/vgic/vgic-v4.c
@@ -235,3 +235,21 @@ int kvm_vgic_v4_unset_forwarding(struct kvm *kvm, int virq,
mutex_unlock(&its->its_lock);
return ret;
 }
+
+void kvm_vgic_v4_enable_doorbell(struct kvm_vcpu *vcpu)
+{
+   if (vgic_supports_direct_msis(vcpu->kvm)) {
+   int irq = vcpu->arch.vgic_cpu.vgic_v3.its_vpe.irq;
+   if (irq)
+   enable_irq(irq);
+   }
+}
+
+void kvm_vgic_v4_disable_doorbell(struct kvm_vcpu *vcpu)
+{
+   if (vgic_supports_direct_msis(vcpu->kvm)) {
+   int irq = vcpu->arch.vgic_cpu.vgic_v3.its_vpe.irq;
+   if (irq)
+   disable_irq(irq);
+   }
+}
-- 
2.14.1



[PATCH v4 09/26] KVM: arm/arm64: GICv4: Wire mapping/unmapping of VLPIs in VFIO irq bypass

2017-10-06 Thread Marc Zyngier
Let's use the irq bypass mechanism introduced for platform device
interrupts to intercept the virtual PCIe endpoint configuration
and establish our LPI->VLPI mapping.

Signed-off-by: Marc Zyngier 
---
 include/kvm/arm_vgic.h  |   8 
 virt/kvm/arm/arm.c  |   6 ++-
 virt/kvm/arm/vgic/vgic-v4.c | 104 
 3 files changed, 116 insertions(+), 2 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 7eeb6c2a2f9c..2f750c770bf2 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -373,4 +373,12 @@ int kvm_vgic_setup_default_irq_routing(struct kvm *kvm);
 
 int kvm_vgic_set_owner(struct kvm_vcpu *vcpu, unsigned int intid, void *owner);
 
+struct kvm_kernel_irq_routing_entry;
+
+int kvm_vgic_v4_set_forwarding(struct kvm *kvm, int irq,
+  struct kvm_kernel_irq_routing_entry *irq_entry);
+
+int kvm_vgic_v4_unset_forwarding(struct kvm *kvm, int irq,
+struct kvm_kernel_irq_routing_entry 
*irq_entry);
+
 #endif /* __KVM_ARM_VGIC_H */
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 5d5218ecd547..8388c1cc23f6 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -1462,7 +1462,8 @@ int kvm_arch_irq_bypass_add_producer(struct 
irq_bypass_consumer *cons,
struct kvm_kernel_irqfd *irqfd =
container_of(cons, struct kvm_kernel_irqfd, consumer);
 
-   return 0;
+   return kvm_vgic_v4_set_forwarding(irqfd->kvm, prod->irq,
+ &irqfd->irq_entry);
 }
 void kvm_arch_irq_bypass_del_producer(struct irq_bypass_consumer *cons,
  struct irq_bypass_producer *prod)
@@ -1470,7 +1471,8 @@ void kvm_arch_irq_bypass_del_producer(struct 
irq_bypass_consumer *cons,
struct kvm_kernel_irqfd *irqfd =
container_of(cons, struct kvm_kernel_irqfd, consumer);
 
-   return;
+   kvm_vgic_v4_unset_forwarding(irqfd->kvm, prod->irq,
+&irqfd->irq_entry);
 }
 
 void kvm_arch_irq_bypass_stop(struct irq_bypass_consumer *cons)
diff --git a/virt/kvm/arm/vgic/vgic-v4.c b/virt/kvm/arm/vgic/vgic-v4.c
index c794f0cef09e..ba1dd3162eba 100644
--- a/virt/kvm/arm/vgic/vgic-v4.c
+++ b/virt/kvm/arm/vgic/vgic-v4.c
@@ -81,3 +81,107 @@ void vgic_v4_teardown(struct kvm *kvm)
its_vm->nr_vpes = 0;
its_vm->vpes = NULL;
 }
+
+static struct vgic_its *vgic_get_its(struct kvm *kvm,
+struct kvm_kernel_irq_routing_entry 
*irq_entry)
+{
+   struct kvm_msi msi  = (struct kvm_msi) {
+   .address_lo = irq_entry->msi.address_lo,
+   .address_hi = irq_entry->msi.address_hi,
+   .data   = irq_entry->msi.data,
+   .flags  = irq_entry->msi.flags,
+   .devid  = irq_entry->msi.devid,
+   };
+
+   /*
+* Get a reference on the LPI. If NULL, this is not a valid
+* translation for any of our vITSs.
+*/
+   return vgic_msi_to_its(kvm, &msi);
+}
+
+int kvm_vgic_v4_set_forwarding(struct kvm *kvm, int virq,
+  struct kvm_kernel_irq_routing_entry *irq_entry)
+{
+   struct vgic_its *its;
+   struct vgic_irq *irq;
+   struct its_vlpi_map map;
+   int ret;
+
+   if (!vgic_supports_direct_msis(kvm))
+   return 0;
+
+   /*
+* Get the ITS, and escape early on error (not a valid
+* doorbell for any of our vITSs).
+*/
+   its = vgic_get_its(kvm, irq_entry);
+   if (IS_ERR(its))
+   return 0;
+
+   mutex_lock(&its->its_lock);
+
+   /* Perform then actual DevID/EventID -> LPI translation. */
+   ret = vgic_its_resolve_lpi(kvm, its, irq_entry->msi.devid,
+  irq_entry->msi.data, &irq);
+   if (ret)
+   goto out;
+
+   /*
+* Emit the mapping request. If it fails, the ITS probably
+* isn't v4 compatible, so let's silently bail out. Holding
+* the ITS lock should ensure that nothing can modify the
+* target vcpu.
+*/
+   map = (struct its_vlpi_map) {
+   .vm = &kvm->arch.vgic.its_vm,
+   .vpe= 
&irq->target_vcpu->arch.vgic_cpu.vgic_v3.its_vpe,
+   .vintid = irq->intid,
+   .db_enabled = true,
+   };
+
+   ret = its_map_vlpi(virq, &map);
+   if (ret)
+   goto out;
+
+   irq->hw = true;
+   irq->host_irq   = virq;
+
+out:
+   mutex_unlock(&its->its_lock);
+   return ret;
+}
+
+int kvm_vgic_v4_unset_forwarding(struct kvm *kvm, int virq,
+struct kvm_kernel_irq_routing_entry *irq_entry)
+{
+   struct vgic_its *its;
+   struct vgic_irq *irq;
+   int ret;
+
+   if (!vgic_supports_direct_msis(kvm))
+   return

[PATCH v4 13/26] KVM: arm/arm64: GICv4: Handle CLEAR applied to a VLPI

2017-10-06 Thread Marc Zyngier
Handling CLEAR is pretty easy. Just ask the ITS driver to clear
the corresponding pending bit (which will turn into a CLEAR
command on the physical side).

Acked-by: Christoffer Dall 
Signed-off-by: Marc Zyngier 
---
 virt/kvm/arm/vgic/vgic-its.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index 42ffb9084bb7..5778b50911e8 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -1070,6 +1070,10 @@ static int vgic_its_cmd_handle_clear(struct kvm *kvm, 
struct vgic_its *its,
 
ite->irq->pending_latch = false;
 
+   if (ite->irq->hw)
+   return irq_set_irqchip_state(ite->irq->host_irq,
+IRQCHIP_STATE_PENDING, false);
+
return 0;
 }
 
-- 
2.14.1



[PATCH v4 19/26] KVM: arm/arm64: GICv4: Add doorbell interrupt handling

2017-10-06 Thread Marc Zyngier
When a vPE is not running, a VLPI being made pending results in a
doorbell interrupt being delivered. Let's handle this interrupt
and update the pending_last flag that indicates that VLPIs are
pending. The corresponding vcpu is also kicked into action.

Special care is taken not to allow the doorbell to be
enabled at request time (this is controlled separately), and
to make the disabling of the interrupt non-lazy.

Signed-off-by: Marc Zyngier 
---
 virt/kvm/arm/vgic/vgic-v4.c | 48 +
 1 file changed, 48 insertions(+)

diff --git a/virt/kvm/arm/vgic/vgic-v4.c b/virt/kvm/arm/vgic/vgic-v4.c
index b79a0450bb1c..2c250ee37878 100644
--- a/virt/kvm/arm/vgic/vgic-v4.c
+++ b/virt/kvm/arm/vgic/vgic-v4.c
@@ -16,11 +16,23 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 
 #include "vgic.h"
 
+static irqreturn_t vgic_v4_doorbell_handler(int irq, void *info)
+{
+   struct kvm_vcpu *vcpu = info;
+
+   vcpu->arch.vgic_cpu.vgic_v3.its_vpe.pending_last = true;
+   kvm_make_request(KVM_REQ_IRQ_PENDING, vcpu);
+   kvm_vcpu_kick(vcpu);
+
+   return IRQ_HANDLED;
+}
+
 /**
  * vgic_v4_init - Initialize the GICv4 data structures
  * @kvm:   Pointer to the VM being initialized
@@ -60,6 +72,33 @@ int vgic_v4_init(struct kvm *kvm)
return ret;
}
 
+   kvm_for_each_vcpu(i, vcpu, kvm) {
+   int irq = dist->its_vm.vpes[i]->irq;
+
+   /*
+* Don't automatically enable the doorbell, as we're
+* flipping it back and forth when the vcpu gets
+* blocked. Also disable the lazy disabling, as the
+* doorbell could kick us out of the guest too
+* early...
+*/
+   irq_set_status_flags(irq, IRQ_NOAUTOEN | IRQ_DISABLE_UNLAZY);
+   ret = request_irq(irq, vgic_v4_doorbell_handler,
+ 0, "vcpu", vcpu);
+   if (ret) {
+   kvm_err("failed to allocate vcpu IRQ%d\n", irq);
+   /*
+* Trick: adjust the number of vpes so we know
+* how many to nuke on teardown...
+*/
+   dist->its_vm.nr_vpes = i;
+   break;
+   }
+   }
+
+   if (ret)
+   vgic_v4_teardown(kvm);
+
return ret;
 }
 
@@ -72,10 +111,19 @@ int vgic_v4_init(struct kvm *kvm)
 void vgic_v4_teardown(struct kvm *kvm)
 {
struct its_vm *its_vm = &kvm->arch.vgic.its_vm;
+   int i;
 
if (!its_vm->vpes)
return;
 
+   for (i = 0; i < its_vm->nr_vpes; i++) {
+   struct kvm_vcpu *vcpu = kvm_get_vcpu(kvm, i);
+   int irq = its_vm->vpes[i]->irq;
+
+   irq_clear_status_flags(irq, IRQ_NOAUTOEN | IRQ_DISABLE_UNLAZY);
+   free_irq(irq, vcpu);
+   }
+
its_free_vcpu_irqs(its_vm);
kfree(its_vm->vpes);
its_vm->nr_vpes = 0;
-- 
2.14.1



[PATCH v4 15/26] KVM: arm/arm64: GICv4: Propagate property updates to VLPIs

2017-10-06 Thread Marc Zyngier
Upon updating a property, we propagate it all the way to the physical
ITS, and ask for an INV command to be executed there.

Acked-by: Christoffer Dall 
Signed-off-by: Marc Zyngier 
---
 virt/kvm/arm/vgic/vgic-its.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index 0b7e648e7a0c..2e77c7c83942 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -296,6 +296,9 @@ static int update_lpi_config(struct kvm *kvm, struct 
vgic_irq *irq,
spin_unlock(&irq->irq_lock);
}
 
+   if (irq->hw)
+   return its_prop_update_vlpi(irq->host_irq, prop, true);
+
return 0;
 }
 
-- 
2.14.1



[PATCH v4 14/26] KVM: arm/arm64: GICv4: Handle MOVALL applied to a vPE

2017-10-06 Thread Marc Zyngier
The current implementation of MOVALL doesn't allow us to call
into the core ITS code as we hold a number of spinlocks.

Let's try a method used in other parts of the code, where we copy
the intids of the candidate interrupts, and then do whatever
we need to do with them outside of the critical section.

This allows us to move the interrupts one by one, at the expense
of a bit of CPU time. Who cares? MOVALL is such a stupid command
anyway...

Signed-off-by: Marc Zyngier 
---
 virt/kvm/arm/vgic/vgic-its.c | 19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index 5778b50911e8..0b7e648e7a0c 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -1148,11 +1148,12 @@ static int vgic_its_cmd_handle_invall(struct kvm *kvm, 
struct vgic_its *its,
 static int vgic_its_cmd_handle_movall(struct kvm *kvm, struct vgic_its *its,
  u64 *its_cmd)
 {
-   struct vgic_dist *dist = &kvm->arch.vgic;
u32 target1_addr = its_cmd_get_target_addr(its_cmd);
u32 target2_addr = its_cmd_mask_field(its_cmd, 3, 16, 32);
struct kvm_vcpu *vcpu1, *vcpu2;
struct vgic_irq *irq;
+   u32 *intids;
+   int irq_count, i;
 
if (target1_addr >= atomic_read(&kvm->online_vcpus) ||
target2_addr >= atomic_read(&kvm->online_vcpus))
@@ -1164,19 +1165,19 @@ static int vgic_its_cmd_handle_movall(struct kvm *kvm, 
struct vgic_its *its,
vcpu1 = kvm_get_vcpu(kvm, target1_addr);
vcpu2 = kvm_get_vcpu(kvm, target2_addr);
 
-   spin_lock(&dist->lpi_list_lock);
+   irq_count = vgic_copy_lpi_list(vcpu1, &intids);
+   if (irq_count < 0)
+   return irq_count;
 
-   list_for_each_entry(irq, &dist->lpi_list_head, lpi_list) {
-   spin_lock(&irq->irq_lock);
+   for (i = 0; i < irq_count; i++) {
+   irq = vgic_get_irq(kvm, NULL, intids[i]);
 
-   if (irq->target_vcpu == vcpu1)
-   irq->target_vcpu = vcpu2;
+   update_affinity(irq, vcpu2);
 
-   spin_unlock(&irq->irq_lock);
+   vgic_put_irq(kvm, irq);
}
 
-   spin_unlock(&dist->lpi_list_lock);
-
+   kfree(intids);
return 0;
 }
 
-- 
2.14.1



[PATCH v4 10/26] KVM: arm/arm64: GICv4: Handle INT command applied to a VLPI

2017-10-06 Thread Marc Zyngier
If the guest issues an INT command targeting a VLPI, let's
call into the irq_set_irqchip_state() helper to make it pending
on the physical side.

This works just as well if userspace decides to inject an interrupt
using the normal userspace API...

Acked-by: Christoffer Dall 
Signed-off-by: Marc Zyngier 
---
 virt/kvm/arm/vgic/vgic-its.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index 89768d2b6a91..b2a678d131d0 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -578,6 +578,10 @@ static int vgic_its_trigger_msi(struct kvm *kvm, struct 
vgic_its *its,
if (err)
return err;
 
+   if (irq->hw)
+   return irq_set_irqchip_state(irq->host_irq,
+IRQCHIP_STATE_PENDING, true);
+
spin_lock(&irq->irq_lock);
irq->pending_latch = true;
vgic_queue_irq_unlock(kvm, irq);
-- 
2.14.1



[PATCH v4 12/26] KVM: arm/arm64: GICv4: Propagate affinity changes to the physical ITS

2017-10-06 Thread Marc Zyngier
When the guest issues an affinity change, we need to tell the physical
ITS that we're now targeting a new vcpu.  This is done by extracting
the current mapping, updating the target, and reapplying the mapping.

Signed-off-by: Marc Zyngier 
---
 virt/kvm/arm/vgic/vgic-its.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index c9b1c0967426..42ffb9084bb7 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -337,11 +337,25 @@ static int vgic_copy_lpi_list(struct kvm_vcpu *vcpu, u32 
**intid_ptr)
 
 static int update_affinity(struct vgic_irq *irq, struct kvm_vcpu *vcpu)
 {
+   int ret = 0;
+
spin_lock(&irq->irq_lock);
irq->target_vcpu = vcpu;
spin_unlock(&irq->irq_lock);
 
-   return 0;
+   if (irq->hw) {
+   struct its_vlpi_map map;
+
+   ret = its_get_vlpi(irq->host_irq, &map);
+   if (ret)
+   return ret;
+
+   map.vpe = &vcpu->arch.vgic_cpu.vgic_v3.its_vpe;
+
+   ret = its_map_vlpi(irq->host_irq, &map);
+   }
+
+   return ret;
 }
 
 /*
-- 
2.14.1



[PATCH v4 16/26] KVM: arm/arm64: GICv4: Handle INVALL applied to a vPE

2017-10-06 Thread Marc Zyngier
Since we now update the properties one LPI at a time, there is no
need to perform an INV each time we read one. Instead, we rely
on the final VINVALL that gets sent to the ITS to do the work.

Acked-by: Christoffer Dall 
Signed-off-by: Marc Zyngier 
---
 virt/kvm/arm/vgic/vgic-its.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index 2e77c7c83942..eb72eb027060 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -38,7 +38,7 @@ static int vgic_its_save_tables_v0(struct vgic_its *its);
 static int vgic_its_restore_tables_v0(struct vgic_its *its);
 static int vgic_its_commit_v0(struct vgic_its *its);
 static int update_lpi_config(struct kvm *kvm, struct vgic_irq *irq,
-struct kvm_vcpu *filter_vcpu);
+struct kvm_vcpu *filter_vcpu, bool needs_inv);
 
 /*
  * Creates a new (reference to a) struct vgic_irq for a given LPI.
@@ -106,7 +106,7 @@ static struct vgic_irq *vgic_add_lpi(struct kvm *kvm, u32 
intid,
 * However we only have those structs for mapped IRQs, so we read in
 * the respective config data from memory here upon mapping the LPI.
 */
-   ret = update_lpi_config(kvm, irq, NULL);
+   ret = update_lpi_config(kvm, irq, NULL, false);
if (ret)
return ERR_PTR(ret);
 
@@ -273,7 +273,7 @@ static struct its_collection *find_collection(struct 
vgic_its *its, int coll_id)
  * VCPU. Unconditionally applies if filter_vcpu is NULL.
  */
 static int update_lpi_config(struct kvm *kvm, struct vgic_irq *irq,
-struct kvm_vcpu *filter_vcpu)
+struct kvm_vcpu *filter_vcpu, bool needs_inv)
 {
u64 propbase = GICR_PROPBASER_ADDRESS(kvm->arch.vgic.propbaser);
u8 prop;
@@ -297,7 +297,7 @@ static int update_lpi_config(struct kvm *kvm, struct 
vgic_irq *irq,
}
 
if (irq->hw)
-   return its_prop_update_vlpi(irq->host_irq, prop, true);
+   return its_prop_update_vlpi(irq->host_irq, prop, needs_inv);
 
return 0;
 }
@@ -1096,7 +1096,7 @@ static int vgic_its_cmd_handle_inv(struct kvm *kvm, 
struct vgic_its *its,
if (!ite)
return E_ITS_INV_UNMAPPED_INTERRUPT;
 
-   return update_lpi_config(kvm, ite->irq, NULL);
+   return update_lpi_config(kvm, ite->irq, NULL, true);
 }
 
 /*
@@ -1131,12 +1131,15 @@ static int vgic_its_cmd_handle_invall(struct kvm *kvm, 
struct vgic_its *its,
irq = vgic_get_irq(kvm, NULL, intids[i]);
if (!irq)
continue;
-   update_lpi_config(kvm, irq, vcpu);
+   update_lpi_config(kvm, irq, vcpu, false);
vgic_put_irq(kvm, irq);
}
 
kfree(intids);
 
+   if (vcpu->arch.vgic_cpu.vgic_v3.its_vpe.its_vm)
+   its_invall_vpe(&vcpu->arch.vgic_cpu.vgic_v3.its_vpe);
+
return 0;
 }
 
-- 
2.14.1



[PATCH v4 01/26] KVM: arm/arm64: register irq bypass consumer on ARM/ARM64

2017-10-06 Thread Marc Zyngier
From: Eric Auger 

This patch selects IRQ_BYPASS_MANAGER and HAVE_KVM_IRQ_BYPASS
configs for ARM/ARM64.

kvm_arch_has_irq_bypass() is now implemented and returns true.
As a consequence the irq bypass consumer will be registered for
ARM/ARM64 with the forwarding callbacks:

- stop/start: halt/resume guest execution
- add/del_producer: set/unset forwarding at vgic/irqchip level

We don't have any actual support yet, so nothing gets actually
forwarded.

Signed-off-by: Eric Auger 
[maz: dropped the DEOI stuff for the time being in order to
  reduce the dependency chain, amended commit message]
Signed-off-by: Marc Zyngier 
---
 arch/arm/kvm/Kconfig   |  3 +++
 arch/arm64/kvm/Kconfig |  3 +++
 virt/kvm/arm/arm.c | 40 
 3 files changed, 46 insertions(+)

diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
index 90d0176fb30d..4e2b192a030a 100644
--- a/arch/arm/kvm/Kconfig
+++ b/arch/arm/kvm/Kconfig
@@ -3,6 +3,7 @@
 #
 
 source "virt/kvm/Kconfig"
+source "virt/lib/Kconfig"
 
 menuconfig VIRTUALIZATION
bool "Virtualization"
@@ -35,6 +36,8 @@ config KVM
select HAVE_KVM_IRQCHIP
select HAVE_KVM_IRQ_ROUTING
select HAVE_KVM_MSI
+   select IRQ_BYPASS_MANAGER
+   select HAVE_KVM_IRQ_BYPASS
depends on ARM_VIRT_EXT && ARM_LPAE && ARM_ARCH_TIMER
---help---
  Support hosting virtualized guest machines.
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index 52cb7ad9b2fd..7e0d6e63cc71 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -3,6 +3,7 @@
 #
 
 source "virt/kvm/Kconfig"
+source "virt/lib/Kconfig"
 
 menuconfig VIRTUALIZATION
bool "Virtualization"
@@ -35,6 +36,8 @@ config KVM
select HAVE_KVM_MSI
select HAVE_KVM_IRQCHIP
select HAVE_KVM_IRQ_ROUTING
+   select IRQ_BYPASS_MANAGER
+   select HAVE_KVM_IRQ_BYPASS
---help---
  Support hosting virtualized guest machines.
  We don't support KVM with 16K page tables yet, due to the multiple
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index b9f68e4add71..7ac4ecba9ebd 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -27,6 +27,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 
@@ -1449,6 +1451,44 @@ struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, 
unsigned long mpidr)
return NULL;
 }
 
+bool kvm_arch_has_irq_bypass(void)
+{
+   return true;
+}
+
+int kvm_arch_irq_bypass_add_producer(struct irq_bypass_consumer *cons,
+ struct irq_bypass_producer *prod)
+{
+   struct kvm_kernel_irqfd *irqfd =
+   container_of(cons, struct kvm_kernel_irqfd, consumer);
+
+   return 0;
+}
+void kvm_arch_irq_bypass_del_producer(struct irq_bypass_consumer *cons,
+ struct irq_bypass_producer *prod)
+{
+   struct kvm_kernel_irqfd *irqfd =
+   container_of(cons, struct kvm_kernel_irqfd, consumer);
+
+   return;
+}
+
+void kvm_arch_irq_bypass_stop(struct irq_bypass_consumer *cons)
+{
+   struct kvm_kernel_irqfd *irqfd =
+   container_of(cons, struct kvm_kernel_irqfd, consumer);
+
+   kvm_arm_halt_guest(irqfd->kvm);
+}
+
+void kvm_arch_irq_bypass_start(struct irq_bypass_consumer *cons)
+{
+   struct kvm_kernel_irqfd *irqfd =
+   container_of(cons, struct kvm_kernel_irqfd, consumer);
+
+   kvm_arm_resume_guest(irqfd->kvm);
+}
+
 /**
  * Initialize Hyp-mode and memory mappings on all CPUs.
  */
-- 
2.14.1



[PATCH v4 00/26] KVM/ARM: Add support for GICv4

2017-10-06 Thread Marc Zyngier
This series implements full support for GICv4 in KVM, bringing direct
injection of MSIs to arm and arm64, assuming you have the right
hardware (which is quite unlikely).

To get an idea of the design, I'd recommend you start with commit
7954907bedaf as well as patch #26, which try to shed some light on the
approach that I've taken. And before that, please digest some of the
GICv3/GICv4 architecture documentation[1] (less than 800 pages!). Once
you feel reasonably insane, you'll be in the right mood to read the
code (no, I didn't bother changing the above, I already hit LWN/QotW
once).

Now that the low-level irqchip code has been merged, what we have here
is only the KVM-specific part. I've lifted two of Eric's patches
(having slightly changed one of them) as they form the base of this
series.

The stack has been *very lightly* tested on an arm64 model, with a PCI
virtio block device passed from the host to a guest (using kvmtool and
Jean-Philippe Brucker's excellent VFIO support patches[3]), as well as
on a Huawei D05 box using an Intel I350 Ethernet card and passing a VF
into the guest. So far, things are solid enough (although some HW need
workarounds that I'll address separately).

I've pushed out a branch based on 4.14-rc3 containing the dependencies:

git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git 
kvm-arm64/gicv4-kvm

* From v3:
   - Lots of cleanups following Christoffer's review
   - Got rid of all of the irqchip code (well, merged it)
   - Picked the minimal set of patches from Eric to get this working
   - Prevent VM save if using GICv4
   - Rebased on top of 4.14-rc3

* From v2:
  - Lots of fixes all over the map (the doorbell code was amazingly
wrong, both at the GICv4 level and in KVM)
  - KVM GICv4 enablement is now gated by a command-line option and
defaults to off
  - Now properly deals with ordering of vITS creation
  - Some debugging features
  - More documentation so that I can forget what this is all about
  - Huawei D05 quirks

* From v1:
  - The bulk of the 30-something initial patches have seen countless
bugs being fixed, and some key data structures have been subtly
tweaked (or killed altogether). They are still quite similar to
what I had in v1 though.
  - The whole KVM code is brand new and as I said above, only lightly
tested.
  - Collected a bunch of R-bs from Thomas and Eric (many thanks, guys).

[1] 
https://static.docs.arm.com/ihi0069/c/IHI0069C_gic_architecture_specification.pdf
[2] http://www.spinics.net/lists/kvm/msg151463.html
[3] http://www.spinics.net/lists/kvm/msg151823.html

Eric Auger (2):
  KVM: arm/arm64: register irq bypass consumer on ARM/ARM64
  KVM: arm/arm64: vgic: restructure kvm_vgic_(un)map_phys_irq

Marc Zyngier (24):
  KVM: arm: Select ARM_GIC_V3 and ARM_GIC_V3_ITS
  KVM: arm/arm64: vgic: Move kvm_vgic_destroy call around
  KVM: arm/arm64: vITS: Add MSI translation helpers
  KVM: arm/arm64: vITS: Add a helper to update the affinity of an LPI
  KVM: arm/arm64: GICv4: Add property field and per-VM predicate
  KVM: arm/arm64: GICv4: Add init/teardown of the per-VM vPE irq domain
  KVM: arm/arm64: GICv4: Wire mapping/unmapping of VLPIs in VFIO irq
bypass
  KVM: arm/arm64: GICv4: Handle INT command applied to a VLPI
  KVM: arm/arm64: GICv4: Unmap VLPI when freeing an LPI
  KVM: arm/arm64: GICv4: Propagate affinity changes to the physical ITS
  KVM: arm/arm64: GICv4: Handle CLEAR applied to a VLPI
  KVM: arm/arm64: GICv4: Handle MOVALL applied to a vPE
  KVM: arm/arm64: GICv4: Propagate property updates to VLPIs
  KVM: arm/arm64: GICv4: Handle INVALL applied to a vPE
  KVM: arm/arm64: GICv4: Propagate VLPI properties at map time
  KVM: arm/arm64: GICv4: Use pending_last as a scheduling hint
  KVM: arm/arm64: GICv4: Add doorbell interrupt handling
  KVM: arm/arm64: GICv4: Use the doorbell interrupt as an unblocking
source
  KVM: arm/arm64: GICv4: Hook vPE scheduling into vgic flush/sync
  KVM: arm/arm64: GICv4: Enable virtual cpuif if VLPIs can be delivered
  KVM: arm/arm64: GICv4: Prevent a VM using GICv4 from being saved
  KVM: arm/arm64: GICv4: Enable VLPI support
  KVM: arm/arm64: GICv4: Prevent heterogenous systems from using GICv4
  KVM: arm/arm64: GICv4: Theory of operations

 Documentation/admin-guide/kernel-parameters.txt |   4 +
 arch/arm/kvm/Kconfig|   5 +
 arch/arm/kvm/Makefile   |   1 +
 arch/arm64/kvm/Kconfig  |   3 +
 arch/arm64/kvm/Makefile |   1 +
 include/kvm/arm_vgic.h  |  41 ++-
 include/linux/irqchip/arm-gic-v3.h  |   2 +
 virt/kvm/arm/arch_timer.c   |  24 +-
 virt/kvm/arm/arm.c  |  48 +++-
 virt/kvm/arm/hyp/vgic-v3-sr.c   |   9 +-
 virt/kvm/arm/vgic/vgic-init.c   |   9 +
 virt/kvm/arm/vgic/vgic-its.c| 195 -
 virt/kvm/arm/vgic/vgic-mmio-v3.c 

[PATCH v4 04/26] KVM: arm/arm64: vgic: Move kvm_vgic_destroy call around

2017-10-06 Thread Marc Zyngier
The way we call kvm_vgic_destroy is a bit bizarre. We call it
*after* having freed the vcpus, which sort of defeats the point
of cleaning up things before that point.

Let's move kvm_vgic_destroy towards the beginning of kvm_arch_destroy_vm,
which seems more sensible.

Acked-by: Christoffer Dall 
Signed-off-by: Marc Zyngier 
---
 virt/kvm/arm/arm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 7ac4ecba9ebd..5d5218ecd547 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -177,6 +177,8 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
 {
int i;
 
+   kvm_vgic_destroy(kvm);
+
free_percpu(kvm->arch.last_vcpu_ran);
kvm->arch.last_vcpu_ran = NULL;
 
@@ -186,8 +188,6 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
kvm->vcpus[i] = NULL;
}
}
-
-   kvm_vgic_destroy(kvm);
 }
 
 int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
-- 
2.14.1



[PATCH v4 05/26] KVM: arm/arm64: vITS: Add MSI translation helpers

2017-10-06 Thread Marc Zyngier
The whole MSI injection process is fairly monolithic. An MSI write
gets turned into an injected LPI in one swift go. But this is actually
a more fine-grained process:

- First, a virtual ITS gets selected using the doorbell address
- Then the DevID/EventID pair gets translated into an LPI
- Finally the LPI is injected

Since the GICv4 code needs the first two steps in order to match
an IRQ routing entry to an LPI, let's expose them as helpers,
and refactor the existing code to use them.
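
For illustration only (not part of this patch): a GICv4-side user could
chain the two helpers along these lines, mirroring what the refactored
vgic_its_inject_msi()/vgic_its_trigger_msi() below end up doing. The
caller name is made up:

	/* Sketch: turn a routed MSI into the vgic_irq backing its LPI */
	static int example_resolve_routed_msi(struct kvm *kvm, struct kvm_msi *msi,
					      struct vgic_irq **irq)
	{
		struct vgic_its *its;
		int ret;

		/* Step 1: select the vITS from the doorbell address */
		its = vgic_msi_to_its(kvm, msi);
		if (IS_ERR(its))
			return PTR_ERR(its);

		/* Step 2: translate the DevID/EventID pair into an LPI */
		mutex_lock(&its->its_lock);
		ret = vgic_its_resolve_lpi(kvm, its, msi->devid, msi->data, irq);
		mutex_unlock(&its->its_lock);

		return ret;
	}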

Reviewed-by: Christoffer Dall 
Signed-off-by: Marc Zyngier 
---
 virt/kvm/arm/vgic/vgic-its.c | 93 +---
 virt/kvm/arm/vgic/vgic.h |  4 ++
 2 files changed, 57 insertions(+), 40 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index f51c1e1b3f70..475dffd0a44c 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -503,15 +503,8 @@ static unsigned long vgic_mmio_read_its_idregs(struct kvm 
*kvm,
return 0;
 }
 
-/*
- * Find the target VCPU and the LPI number for a given devid/eventid pair
- * and make this IRQ pending, possibly injecting it.
- * Must be called with the its_lock mutex held.
- * Returns 0 on success, a positive error value for any ITS mapping
- * related errors and negative error values for generic errors.
- */
-static int vgic_its_trigger_msi(struct kvm *kvm, struct vgic_its *its,
-   u32 devid, u32 eventid)
+int vgic_its_resolve_lpi(struct kvm *kvm, struct vgic_its *its,
+u32 devid, u32 eventid, struct vgic_irq **irq)
 {
struct kvm_vcpu *vcpu;
struct its_ite *ite;
@@ -530,26 +523,60 @@ static int vgic_its_trigger_msi(struct kvm *kvm, struct 
vgic_its *its,
if (!vcpu->arch.vgic_cpu.lpis_enabled)
return -EBUSY;
 
-   spin_lock(&ite->irq->irq_lock);
-   ite->irq->pending_latch = true;
-   vgic_queue_irq_unlock(kvm, ite->irq);
-
+   *irq = ite->irq;
return 0;
 }
 
-static struct vgic_io_device *vgic_get_its_iodev(struct kvm_io_device *dev)
+struct vgic_its *vgic_msi_to_its(struct kvm *kvm, struct kvm_msi *msi)
 {
+   u64 address;
+   struct kvm_io_device *kvm_io_dev;
struct vgic_io_device *iodev;
 
-   if (dev->ops != &kvm_io_gic_ops)
-   return NULL;
+   if (!vgic_has_its(kvm))
+   return ERR_PTR(-ENODEV);
 
-   iodev = container_of(dev, struct vgic_io_device, dev);
+   if (!(msi->flags & KVM_MSI_VALID_DEVID))
+   return ERR_PTR(-EINVAL);
+
+   address = (u64)msi->address_hi << 32 | msi->address_lo;
+
+   kvm_io_dev = kvm_io_bus_get_dev(kvm, KVM_MMIO_BUS, address);
+   if (!kvm_io_dev)
+   return ERR_PTR(-EINVAL);
 
+   if (kvm_io_dev->ops != &kvm_io_gic_ops)
+   return ERR_PTR(-EINVAL);
+
+   iodev = container_of(kvm_io_dev, struct vgic_io_device, dev);
if (iodev->iodev_type != IODEV_ITS)
-   return NULL;
+   return ERR_PTR(-EINVAL);
 
-   return iodev;
+   return iodev->its;
+}
+
+/*
+ * Find the target VCPU and the LPI number for a given devid/eventid pair
+ * and make this IRQ pending, possibly injecting it.
+ * Must be called with the its_lock mutex held.
+ * Returns 0 on success, a positive error value for any ITS mapping
+ * related errors and negative error values for generic errors.
+ */
+static int vgic_its_trigger_msi(struct kvm *kvm, struct vgic_its *its,
+   u32 devid, u32 eventid)
+{
+   struct vgic_irq *irq = NULL;
+   int err;
+
+   err = vgic_its_resolve_lpi(kvm, its, devid, eventid, &irq);
+   if (err)
+   return err;
+
+   spin_lock(&irq->irq_lock);
+   irq->pending_latch = true;
+   vgic_queue_irq_unlock(kvm, irq);
+
+   return 0;
 }
 
 /*
@@ -560,30 +587,16 @@ static struct vgic_io_device *vgic_get_its_iodev(struct 
kvm_io_device *dev)
  */
 int vgic_its_inject_msi(struct kvm *kvm, struct kvm_msi *msi)
 {
-   u64 address;
-   struct kvm_io_device *kvm_io_dev;
-   struct vgic_io_device *iodev;
+   struct vgic_its *its;
int ret;
 
-   if (!vgic_has_its(kvm))
-   return -ENODEV;
-
-   if (!(msi->flags & KVM_MSI_VALID_DEVID))
-   return -EINVAL;
-
-   address = (u64)msi->address_hi << 32 | msi->address_lo;
-
-   kvm_io_dev = kvm_io_bus_get_dev(kvm, KVM_MMIO_BUS, address);
-   if (!kvm_io_dev)
-   return -EINVAL;
+   its = vgic_msi_to_its(kvm, msi);
+   if (IS_ERR(its))
+   return PTR_ERR(its);
 
-   iodev = vgic_get_its_iodev(kvm_io_dev);
-   if (!iodev)
-   return -EINVAL;
-
-   mutex_lock(&iodev->its->its_lock);
-   ret = vgic_its_trigger_msi(kvm, iodev->its, msi->devid, msi->data);
-   mutex_unlock(&iodev->its->its_lock);
+   mutex_lock(&its->its_lock);
+   ret = vgic_its_trigger_msi(kvm, its, msi->devid, msi-

[PATCH v4 07/26] KVM: arm/arm64: GICv4: Add property field and per-VM predicate

2017-10-06 Thread Marc Zyngier
Add a new has_gicv4 field in the global VGIC state that indicates
whether the HW is GICv4 capable, as well as a per-VM predicate
indicating if there is a possibility for a VM to support direct
injection (the above being true and the VM having an ITS).
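
Later patches can then gate the direct-injection paths on this
predicate; purely as an illustration (not part of this patch):

	if (!vgic_supports_direct_msis(kvm))
		return 0;	/* stick to the software-injected LPI path */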

Signed-off-by: Marc Zyngier 
---
 include/kvm/arm_vgic.h   | 3 +++
 virt/kvm/arm/vgic/vgic-mmio-v3.c | 5 +
 virt/kvm/arm/vgic/vgic.h | 2 ++
 3 files changed, 10 insertions(+)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 53f631bdec75..ba9fb450aa1b 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -73,6 +73,9 @@ struct vgic_global {
/* Only needed for the legacy KVM_CREATE_IRQCHIP */
boolcan_emulate_gicv2;
 
+   /* Hardware has GICv4? */
+   boolhas_gicv4;
+
/* GIC system register CPU interface */
struct static_key_false gicv3_cpuif;
 
diff --git a/virt/kvm/arm/vgic/vgic-mmio-v3.c b/virt/kvm/arm/vgic/vgic-mmio-v3.c
index 408ef06638fc..f87fd17b2eb9 100644
--- a/virt/kvm/arm/vgic/vgic-mmio-v3.c
+++ b/virt/kvm/arm/vgic/vgic-mmio-v3.c
@@ -54,6 +54,11 @@ bool vgic_has_its(struct kvm *kvm)
return dist->has_its;
 }
 
+bool vgic_supports_direct_msis(struct kvm *kvm)
+{
+   return kvm_vgic_global_state.has_gicv4 && vgic_has_its(kvm);
+}
+
 static unsigned long vgic_mmio_read_v3_misc(struct kvm_vcpu *vcpu,
gpa_t addr, unsigned int len)
 {
diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
index 31b70326b966..e67ccb6a6250 100644
--- a/virt/kvm/arm/vgic/vgic.h
+++ b/virt/kvm/arm/vgic/vgic.h
@@ -240,4 +240,6 @@ int vgic_its_resolve_lpi(struct kvm *kvm, struct vgic_its 
*its,
 u32 devid, u32 eventid, struct vgic_irq **irq);
 struct vgic_its *vgic_msi_to_its(struct kvm *kvm, struct kvm_msi *msi);
 
+bool vgic_supports_direct_msis(struct kvm *kvm);
+
 #endif
-- 
2.14.1



[PATCH v4 03/26] KVM: arm: Select ARM_GIC_V3 and ARM_GIC_V3_ITS

2017-10-06 Thread Marc Zyngier
The GICv4 support introduces a hard dependency between the KVM
core and the ITS infrastructure. arm64 already selects it at
the architecture level, but 32bit doesn't. In order to avoid
littering the kernel with #ifdefs, let's just select the whole
of the GICv3 support code.

You know you want it.

Acked-by: Christoffer Dall 
Signed-off-by: Marc Zyngier 
---
 arch/arm/kvm/Kconfig | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
index 4e2b192a030a..52b50af04e3b 100644
--- a/arch/arm/kvm/Kconfig
+++ b/arch/arm/kvm/Kconfig
@@ -23,6 +23,8 @@ config KVM
select PREEMPT_NOTIFIERS
select ANON_INODES
select ARM_GIC
+   select ARM_GIC_V3
+   select ARM_GIC_V3_ITS
select HAVE_KVM_CPU_RELAX_INTERCEPT
select HAVE_KVM_ARCH_TLB_FLUSH_ALL
select KVM_MMIO
-- 
2.14.1



[PATCH v4 02/26] KVM: arm/arm64: vgic: restructure kvm_vgic_(un)map_phys_irq

2017-10-06 Thread Marc Zyngier
From: Eric Auger 

We want to reuse the core of the map/unmap functions for IRQ
forwarding. Let's move the computation of the hwirq into
kvm_vgic_map_phys_irq and pass the Linux IRQ as a parameter.
The host_irq is added to struct vgic_irq.

We introduce kvm_vgic_map/unmap_irq which take a struct vgic_irq
handle as a parameter.

Signed-off-by: Eric Auger 
---
 include/kvm/arm_vgic.h|  8 ---
 virt/kvm/arm/arch_timer.c | 24 +--
 virt/kvm/arm/vgic/vgic.c  | 60 +++
 3 files changed, 51 insertions(+), 41 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 34dba516ef24..53f631bdec75 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -116,6 +116,7 @@ struct vgic_irq {
bool hw;/* Tied to HW IRQ */
struct kref refcount;   /* Used for LPIs */
u32 hwintid;/* HW INTID number */
+   unsigned int host_irq;  /* linux irq corresponding to hwintid */
union {
u8 targets; /* GICv2 target VCPUs mask */
u32 mpidr;  /* GICv3 target VCPU */
@@ -307,9 +308,10 @@ void kvm_vgic_init_cpu_hardware(void);
 
 int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
bool level, void *owner);
-int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq);
-int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, unsigned int virt_irq);
-bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu, unsigned int virt_irq);
+int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq,
+ u32 vintid);
+int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, unsigned int vintid);
+bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu, unsigned int vintid);
 
 int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
 
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 8e89d63005c7..b24e2f7b363a 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -647,9 +647,6 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
 {
struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
-   struct irq_desc *desc;
-   struct irq_data *data;
-   int phys_irq;
int ret;
 
if (timer->enabled)
@@ -667,26 +664,7 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
return -EINVAL;
}
 
-   /*
-* Find the physical IRQ number corresponding to the host_vtimer_irq
-*/
-   desc = irq_to_desc(host_vtimer_irq);
-   if (!desc) {
-   kvm_err("%s: no interrupt descriptor\n", __func__);
-   return -EINVAL;
-   }
-
-   data = irq_desc_get_irq_data(desc);
-   while (data->parent_data)
-   data = data->parent_data;
-
-   phys_irq = data->hwirq;
-
-   /*
-* Tell the VGIC that the virtual interrupt is tied to a
-* physical interrupt. We do that once per VCPU.
-*/
-   ret = kvm_vgic_map_phys_irq(vcpu, vtimer->irq.irq, phys_irq);
+   ret = kvm_vgic_map_phys_irq(vcpu, host_vtimer_irq, vtimer->irq.irq);
if (ret)
return ret;
 
diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index fed717e07938..9d557efd1f7d 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -17,6 +17,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include "vgic.h"
 
@@ -403,38 +405,66 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, 
unsigned int intid,
return 0;
 }
 
-int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, u32 virt_irq, u32 phys_irq)
+/* @irq->irq_lock must be held */
+static int kvm_vgic_map_irq(struct kvm_vcpu *vcpu, struct vgic_irq *irq,
+   unsigned int host_irq)
 {
-   struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, virt_irq);
+   struct irq_desc *desc;
+   struct irq_data *data;
 
-   BUG_ON(!irq);
-
-   spin_lock(&irq->irq_lock);
+   /*
+* Find the physical IRQ number corresponding to @host_irq
+*/
+   desc = irq_to_desc(host_irq);
+   if (!desc) {
+   kvm_err("%s: no interrupt descriptor\n", __func__);
+   return -EINVAL;
+   }
+   data = irq_desc_get_irq_data(desc);
+   while (data->parent_data)
+   data = data->parent_data;
 
irq->hw = true;
-   irq->hwintid = phys_irq;
+   irq->host_irq = host_irq;
+   irq->hwintid = data->hwirq;
+   return 0;
+}
+
+/* @irq->irq_lock must be held */
+static inline void kvm_vgic_unmap_irq(struct vgic_irq *irq)
+{
+   irq->hw = false;
+   irq->hwintid = 0;
+}
+
+int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq,
+ u32 vintid)
+{
+   struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, v

[PATCH v4 08/26] KVM: arm/arm64: GICv4: Add init/teardown of the per-VM vPE irq domain

2017-10-06 Thread Marc Zyngier
In order to control the GICv4 view of virtual CPUs, we rely
on an irqdomain allocated for that purpose. Let's add a couple
of helpers to that effect.

At the same time, the vgic data structures gain new fields to
track all this... erm... wonderful stuff.

The way we hook into the vgic init is slightly convoluted. We
need the vgic to be initialized (in order to guarantee that
the number of vcpus is now fixed), and we must have a vITS
(otherwise this is all very pointless). So we end up calling
the init from both vgic_init and vgic_its_create.
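
Purely as a sketch (this is not the actual content of the new file,
which is quoted below): the init/teardown helpers essentially wire each
vcpu's its_vpe into the per-VM its_vm and hand the result to the GICv4
layer, assuming the its_alloc_vcpu_irqs()/its_free_vcpu_irqs()
interface from the already merged irqchip code:

	int vgic_v4_init(struct kvm *kvm)
	{
		struct vgic_dist *dist = &kvm->arch.vgic;
		struct kvm_vcpu *vcpu;
		int i, nr_vcpus = atomic_read(&kvm->online_vcpus);

		if (dist->its_vm.vpes)		/* already initialized */
			return 0;

		dist->its_vm.vpes = kcalloc(nr_vcpus, sizeof(*dist->its_vm.vpes),
					    GFP_KERNEL);
		if (!dist->its_vm.vpes)
			return -ENOMEM;
		dist->its_vm.nr_vpes = nr_vcpus;

		kvm_for_each_vcpu(i, vcpu, kvm)
			dist->its_vm.vpes[i] = &vcpu->arch.vgic_cpu.vgic_v3.its_vpe;

		/* Let the GICv4 layer allocate the per-vPE doorbells */
		return its_alloc_vcpu_irqs(&dist->its_vm);
	}

	void vgic_v4_teardown(struct kvm *kvm)
	{
		struct its_vm *its_vm = &kvm->arch.vgic.its_vm;

		if (!its_vm->vpes)
			return;

		its_free_vcpu_irqs(its_vm);
		kfree(its_vm->vpes);
		its_vm->vpes = NULL;
		its_vm->nr_vpes = 0;
	}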

Signed-off-by: Marc Zyngier 
---
 arch/arm/kvm/Makefile |  1 +
 arch/arm64/kvm/Makefile   |  1 +
 include/kvm/arm_vgic.h| 19 ++
 virt/kvm/arm/vgic/vgic-init.c |  9 +
 virt/kvm/arm/vgic/vgic-its.c  |  8 +
 virt/kvm/arm/vgic/vgic-v4.c   | 83 +++
 virt/kvm/arm/vgic/vgic.h  |  2 ++
 7 files changed, 123 insertions(+)
 create mode 100644 virt/kvm/arm/vgic/vgic-v4.c

diff --git a/arch/arm/kvm/Makefile b/arch/arm/kvm/Makefile
index d9beee652d36..0a1dd2cdb928 100644
--- a/arch/arm/kvm/Makefile
+++ b/arch/arm/kvm/Makefile
@@ -31,6 +31,7 @@ obj-y += $(KVM)/arm/vgic/vgic-init.o
 obj-y += $(KVM)/arm/vgic/vgic-irqfd.o
 obj-y += $(KVM)/arm/vgic/vgic-v2.o
 obj-y += $(KVM)/arm/vgic/vgic-v3.o
+obj-y += $(KVM)/arm/vgic/vgic-v4.o
 obj-y += $(KVM)/arm/vgic/vgic-mmio.o
 obj-y += $(KVM)/arm/vgic/vgic-mmio-v2.o
 obj-y += $(KVM)/arm/vgic/vgic-mmio-v3.o
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index 5d9810086c25..c30fd388ef80 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -26,6 +26,7 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic-init.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic-irqfd.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic-v2.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic-v3.o
+kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic-v4.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic-mmio.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic-mmio-v2.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic-mmio-v3.o
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index ba9fb450aa1b..7eeb6c2a2f9c 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -26,6 +26,8 @@
 #include 
 #include 
 
+#include 
+
 #define VGIC_V3_MAX_CPUS   255
 #define VGIC_V2_MAX_CPUS   8
 #define VGIC_NR_IRQS_LEGACY 256
@@ -236,6 +238,15 @@ struct vgic_dist {
 
/* used by vgic-debug */
struct vgic_state_iter *iter;
+
+   /*
+* GICv4 ITS per-VM data, containing the IRQ domain, the VPE
+* array, the property table pointer as well as allocation
+* data. This essentially ties the Linux IRQ core and ITS
+* together, and avoids leaking KVM's data structures anywhere
+* else.
+*/
+   struct its_vm   its_vm;
 };
 
 struct vgic_v2_cpu_if {
@@ -254,6 +265,14 @@ struct vgic_v3_cpu_if {
u32 vgic_ap0r[4];
u32 vgic_ap1r[4];
u64 vgic_lr[VGIC_V3_MAX_LRS];
+
+   /*
+* GICv4 ITS per-VPE data, containing the doorbell IRQ, the
+* pending table pointer, the its_vm pointer and a few other
+* HW specific things. As for the its_vm structure, this is
+* linking the Linux IRQ subsystem and the ITS together.
+*/
+   struct its_vpe  its_vpe;
 };
 
 struct vgic_cpu {
diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
index 5801261f3add..c2749387615b 100644
--- a/virt/kvm/arm/vgic/vgic-init.c
+++ b/virt/kvm/arm/vgic/vgic-init.c
@@ -285,6 +285,12 @@ int vgic_init(struct kvm *kvm)
if (ret)
goto out;
 
+   if (vgic_has_its(kvm) && vgic_supports_direct_msis(kvm)) {
+   ret = vgic_v4_init(kvm);
+   if (ret)
+   goto out;
+   }
+
kvm_for_each_vcpu(i, vcpu, kvm)
kvm_vgic_vcpu_enable(vcpu);
 
@@ -320,6 +326,9 @@ static void kvm_vgic_dist_destroy(struct kvm *kvm)
 
kfree(dist->spis);
dist->nr_spis = 0;
+
+   if (vgic_supports_direct_msis(kvm))
+   vgic_v4_teardown(kvm);
 }
 
 void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index 8ee03f1e89fc..89768d2b6a91 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -1603,6 +1603,14 @@ static int vgic_its_create(struct kvm_device *dev, u32 
type)
if (!its)
return -ENOMEM;
 
+   if (vgic_initialized(dev->kvm)) {
+   int ret = vgic_v4_init(dev->kvm);
+   if (ret) {
+   kfree(its);
+   return ret;
+   }
+   }
+   
mutex_init(&its->its_lock);
mutex_init(&its->cmd_lock);
 
diff --git a/virt/kvm/arm/vgic/vgic-v4.c b/virt/kvm/arm/vgic/vgic-v4.c
new file mode 100644
i

[PATCH v4 06/26] KVM: arm/arm64: vITS: Add a helper to update the affinity of an LPI

2017-10-06 Thread Marc Zyngier
In order to help integrate the vITS code with GICv4, let's add
a new helper that deals with updating the affinity of an LPI,
which will later be augmented with super duper extra GICv4
goodness.

Signed-off-by: Marc Zyngier 
---
 virt/kvm/arm/vgic/vgic-its.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index 475dffd0a44c..8ee03f1e89fc 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -335,6 +335,15 @@ static int vgic_copy_lpi_list(struct kvm_vcpu *vcpu, u32 
**intid_ptr)
return i;
 }
 
+static int update_affinity(struct vgic_irq *irq, struct kvm_vcpu *vcpu)
+{
+   spin_lock(&irq->irq_lock);
+   irq->target_vcpu = vcpu;
+   spin_unlock(&irq->irq_lock);
+
+   return 0;
+}
+
 /*
  * Promotes the ITS view of affinity of an ITTE (which redistributor this LPI
  * is targeting) to the VGIC's view, which deals with target VCPUs.
@@ -349,10 +358,7 @@ static void update_affinity_ite(struct kvm *kvm, struct 
its_ite *ite)
return;
 
vcpu = kvm_get_vcpu(kvm, ite->collection->target_addr);
-
-   spin_lock(&ite->irq->irq_lock);
-   ite->irq->target_vcpu = vcpu;
-   spin_unlock(&ite->irq->irq_lock);
+   update_affinity(ite->irq, vcpu);
 }
 
 /*
@@ -693,11 +699,7 @@ static int vgic_its_cmd_handle_movi(struct kvm *kvm, 
struct vgic_its *its,
ite->collection = collection;
vcpu = kvm_get_vcpu(kvm, collection->target_addr);
 
-   spin_lock(&ite->irq->irq_lock);
-   ite->irq->target_vcpu = vcpu;
-   spin_unlock(&ite->irq->irq_lock);
-
-   return 0;
+   return update_affinity(ite->irq, vcpu);
 }
 
 /*
-- 
2.14.1



Re: [PATCH v2 11/28] arm64/sve: Core task context handling

2017-10-06 Thread Catalin Marinas
On Fri, Oct 06, 2017 at 04:15:28PM +0100, Dave P Martin wrote:
> On Fri, Oct 06, 2017 at 02:36:40PM +0100, Catalin Marinas wrote:
> > On Fri, Oct 06, 2017 at 02:10:09PM +0100, Dave P Martin wrote:
> > > On Thu, Oct 05, 2017 at 12:28:35PM +0100, Catalin Marinas wrote:
> > > > On Tue, Oct 03, 2017 at 12:33:03PM +0100, Dave P Martin wrote:
> > > > > TIF_FOREIGN_FPSTATE's meaning is expanded to cover SVE, but otherwise
> > > > > unchanged:
> > > > > 
> > > > >  * If a task is running and !TIF_FOREIGN_FPSTATE, then the CPU
> > > > >registers of the CPU the task is running on contain the 
> > > > > authoritative
> > > > >FPSIMD/SVE state of the task.  The backing memory may be stale.
> > > > > 
> > > > >  * Otherwise (i.e., task not running, or task running and
> > > > >TIF_FOREIGN_FPSTATE), the task's FPSIMD/SVE backing memory is
> > > > >authoritative.  If additionally per_cpu(fpsimd_last_state,
> > > > >task->fpsimd_state.cpu) == &task->fpsimd_state.cpu, then
> > > > >task->fpsimd_state.cpu's registers are also up to date for task, 
> > > > > but
> > > > >not authoritative: the current FPSIMD/SVE state may be read from
> > > > >them, but they must not be written.
> > > > >  
> > > > > The FPSIMD/SVE backing memory is selected by TIF_SVE:
> > > > > 
> > > > >  * TIF_SVE set: Zn (incorporating Vn in bits[127:0]), Pn and FFR are
> > > > >stored in task->thread.sve_state, formatted appropriately for 
> > > > > vector
> > > > >length task->thread.sve_vl.  task->thread.sve_state must point to a
> > > > >valid buffer at least sve_state_size(task) bytes in size.
> > 
> > "Zn [...] stored in  task->thread.sve_state" - is this still true with
> > the changes you proposed? I guess even without these changes, you have
> > situations where the hardware regs are out of sync with sve_state (see
> > more below).
> 
> I guess I need to tweak the wording here.
> 
> TIF_SVE says where the vector state should be loaded/stored from,
> but does not say whether the data is up to date in memory, or when
> it should be loaded/stored.
> 
> The latter is described by a cocktail of different things including
> which bit of kernel code we are executing if any, whether the task
> is running/stopped etc., TIF_FOREIGN_FPSTATE,
> task->thread.fpsimd_state.cpu and per_cpu(fpsimd_last_state).
> 
> Does this make any better sense of my code below?

Yes, it looks fine.

-- 
Catalin


Re: [PATCH v2 11/28] arm64/sve: Core task context handling

2017-10-06 Thread Dave Martin
On Fri, Oct 06, 2017 at 02:36:40PM +0100, Catalin Marinas wrote:
> On Fri, Oct 06, 2017 at 02:10:09PM +0100, Dave P Martin wrote:
> > On Thu, Oct 05, 2017 at 12:28:35PM +0100, Catalin Marinas wrote:
> > > On Tue, Oct 03, 2017 at 12:33:03PM +0100, Dave P Martin wrote:
> > > > TIF_FOREIGN_FPSTATE's meaning is expanded to cover SVE, but otherwise
> > > > unchanged:
> > > > 
> > > >  * If a task is running and !TIF_FOREIGN_FPSTATE, then the CPU
> > > >registers of the CPU the task is running on contain the authoritative
> > > >FPSIMD/SVE state of the task.  The backing memory may be stale.
> > > > 
> > > >  * Otherwise (i.e., task not running, or task running and
> > > >TIF_FOREIGN_FPSTATE), the task's FPSIMD/SVE backing memory is
> > > >authoritative.  If additionally per_cpu(fpsimd_last_state,
> > > >task->fpsimd_state.cpu) == &task->fpsimd_state.cpu, then
> > > >task->fpsimd_state.cpu's registers are also up to date for task, but
> > > >not authoritative: the current FPSIMD/SVE state may be read from
> > > >them, but they must not be written.
> > > >  
> > > > The FPSIMD/SVE backing memory is selected by TIF_SVE:
> > > > 
> > > >  * TIF_SVE set: Zn (incorporating Vn in bits[127:0]), Pn and FFR are
> > > >stored in task->thread.sve_state, formatted appropriately for vector
> > > >length task->thread.sve_vl.  task->thread.sve_state must point to a
> > > >valid buffer at least sve_state_size(task) bytes in size.
> 
> "Zn [...] stored in  task->thread.sve_state" - is this still true with
> the changes you proposed? I guess even without these changes, you have
> situations where the hardware regs are out of sync with sve_state (see
> more below).

I guess I need to tweak the wording here.

TIF_SVE says where the vector state should be loaded/stored from,
but does not say whether the data is up to date in memory, or when
it should be loaded/stored.

The latter is described by a cocktail of different things including
which bit of kernel code we are executing if any, whether the task
is running/stopped etc., TIF_FOREIGN_FPSTATE,
task->thread.fpsimd_state.cpu and per_cpu(fpsimd_last_state).
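
Purely as an illustration of the fpsimd_last_state part of that
cocktail (this is not code from the series, just the invariant written
out as a helper one could imagine in fpsimd.c):

	/* May the CPU registers last loaded for @task still be trusted? */
	static bool fpsimd_regs_up_to_date(struct task_struct *task)
	{
		struct fpsimd_state *st = &task->thread.fpsimd_state;

		return per_cpu(fpsimd_last_state, st->cpu) == st;
	}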


Does this make any better sense of my code below?

> 
> > > >  * TIF_SVE clear: Vn are stored in task->fpsimd_state; Zn[max : 128] are
> > > >logically zero[*] but not stored anywhere; Pn, FFR are not stored and
> > > >have unspecified values from userspace's point of view.
> > > >task->thread.sve_state does not need to be non-null, valid or any
> > > >particular size: it must not be dereferenced.
> > > > 
> > > >In practice I don't exploit the "unspecifiedness" much.  The Zn high
> > > >bits, Pn and FFR are all zeroed when setting TIF_SVE again:
> > > >sve_alloc() is the common path for this.
> > > > 
> > > >  * FPSR and FPCR are always stored in task->fpsimd_state irrespective of
> > > >whether TIF_SVE is clear or set, since these are not vector length
> > > >dependent.
> [...]
> > > Just wondering, as an optimisation for do_sve_acc() - instead of
> > > sve_alloc() and fpsimd_to_sve(), can we not force the loading of the
> > > FPSIMD regs on the return to user via TIF_FOREIGN_FPSTATE? This would
> > > ensure the zeroing of the top SVE bits and we only need to allocate the
> > > SVE state on the saving path. This means enabling SVE for user and
> > > setting TIF_SVE without having the backing storage allocated.
> > 
> > Currently the set of places where the "TIF_SVE implies sve_state valid"
> > assumption is applied is not very constrained, so while your suggestion
> > is reasonable I'd rather not mess with it just now, if possible.
> > 
> > 
> > But we can do this (which is what my current fixup has):
> > 
> > el0_sve_acc:
> > enable_dbg_and_irq
> > // ...
> > bl  do_sve_acc
> > b   ret_to_user
> > 
> > void do_sve_acc(unsigned int esr, struct pt_regs *regs)
> > {
> > /* Even if we chose not to use SVE, the hardware could still trap: */
> > if (unlikely(!system_supports_sve()) || WARN_ON(is_compat_task())) {
> > force_signal_inject(SIGILL, ILL_ILLOPC, regs, 0);
> > return;
> > }
> > 
> > sve_alloc(current);
> > 
> > local_bh_disable();
> > if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE)) {
> > task_fpsimd_load(); /* flushes high Zn bits as a side-effect */
> > sve_flush_pregs();
> > } else {
> > sve_flush_all(); /* flush all the SVE bits in-place */
> > }
> > 
> > if (test_and_set_thread_flag(TIF_SVE))
> > WARN_ON(1); /* SVE access shouldn't have trapped */
> > local_bh_enable();
> > }
> > 
> > where sve_flush_all() zeroes all the high Zn bits via a series of
> > MOV Vn, Vn instructions, and also zeroes Pn and FFR.  sve_flush_pregs()
> > just does the latter.
> 
> This looks fine to me but I added a comment above. IIUC, we can now have
> TIF_SVE set while sve_state contains stale data. I don't see an issue

Re: [PATCH v1 2/2] kvm: arm64: handle single-step of userspace mmio instructions

2017-10-06 Thread Marc Zyngier
On 06/10/17 15:26, Julien Thierry wrote:
> 
> 
> On 06/10/17 15:00, Marc Zyngier wrote:
>> On 06/10/17 14:47, Alex Bennée wrote:
>>>
>>> Marc Zyngier  writes:
>>>
 On 06/10/17 13:37, Marc Zyngier wrote:
> On 06/10/17 12:39, Alex Bennée wrote:
>> The system state of KVM when using userspace emulation is not complete
>> until we return into KVM_RUN. To handle mmio related updates we wait
>> until they have been committed and then schedule our KVM_EXIT_DEBUG.
>>
>> I've introduced a new function kvm_arm_maybe_return_debug() to wrap up
>> the differences between arm/arm64 which is currently null for arm.
>>
>> Signed-off-by: Alex Bennée 
>> ---
>>   arch/arm/include/asm/kvm_host.h   |  2 ++
>>   arch/arm64/include/asm/kvm_host.h |  1 +
>>   arch/arm64/kvm/debug.c| 21 +
>>   arch/arm64/kvm/handle_exit.c  |  9 +++--
>>   virt/kvm/arm/arm.c|  2 +-
>>   virt/kvm/arm/mmio.c   |  3 ++-
>>   6 files changed, 30 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/kvm_host.h 
>> b/arch/arm/include/asm/kvm_host.h
>> index 4a879f6ff13b..aec943f6d123 100644
>> --- a/arch/arm/include/asm/kvm_host.h
>> +++ b/arch/arm/include/asm/kvm_host.h
>> @@ -285,6 +285,8 @@ static inline void kvm_arm_init_debug(void) {}
>>   static inline void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) {}
>>   static inline void kvm_arm_clear_debug(struct kvm_vcpu *vcpu) {}
>>   static inline void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu) {}
>> +static inline int  kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu,
>> +struct kvm_run *run) {}
>>
>>   int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
>> struct kvm_device_attr *attr);
>> diff --git a/arch/arm64/include/asm/kvm_host.h 
>> b/arch/arm64/include/asm/kvm_host.h
>> index e923b58606e2..fa67d21662f6 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -369,6 +369,7 @@ void kvm_arm_init_debug(void);
>>   void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
>>   void kvm_arm_clear_debug(struct kvm_vcpu *vcpu);
>>   void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu);
>> +int  kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu, struct kvm_run 
>> *run);
>>   int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
>> struct kvm_device_attr *attr);
>>   int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
>> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
>> index dbadfaf850a7..a10a18c55c87 100644
>> --- a/arch/arm64/kvm/debug.c
>> +++ b/arch/arm64/kvm/debug.c
>> @@ -221,3 +221,24 @@ void kvm_arm_clear_debug(struct kvm_vcpu *vcpu)
>>  }
>>  }
>>   }
>> +
>> +
>> +/*
>> + * When KVM has successfully emulated the instruction we might want to
>> + * return we a KVM_EXIT_DEBUG. We can only do this once the emulation
>> + * is complete though so for userspace emulations we have to wait
>> + * until we have re-entered KVM.
>> + *
>> + * Return > 0 to return to guest, 0 (and set exit_reason) on proper
>> + * exit to userspace.
>> + */
>> +
>> +int kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu, struct kvm_run 
>> *run)
>> +{
>> +if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) {
>> +run->exit_reason = KVM_EXIT_DEBUG;
>> +run->debug.arch.hsr = ESR_ELx_EC_SOFTSTP_LOW << 
>> ESR_ELx_EC_SHIFT;
>> +return 0;
>> +}
>> +return 1;
>> +}
>> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
>> index c918d291cb58..7b04f59217bf 100644
>> --- a/arch/arm64/kvm/handle_exit.c
>> +++ b/arch/arm64/kvm/handle_exit.c
>> @@ -202,13 +202,10 @@ static int handle_trap_exceptions(struct kvm_vcpu 
>> *vcpu, struct kvm_run *run)
>>  handled = exit_handler(vcpu, run);
>>  }
>>
>> -if (handled && (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)) {
>> -handled = 0;
>> -run->exit_reason = KVM_EXIT_DEBUG;
>> -run->debug.arch.hsr = ESR_ELx_EC_SOFTSTP_LOW << 
>> ESR_ELx_EC_SHIFT;
>> -}
>> +if (handled)
>> +return kvm_arm_maybe_return_debug(vcpu, run);
>>
>> -return handled;
>> +return 0;
>>   }
>>
>>   /*
>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
>> index b9f68e4add71..3d28fe2daa26 100644
>> --- a/virt/kvm/arm/arm.c
>> +++ b/virt/kvm/arm/arm.c
>> @@ -623,7 +623,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, 

Re: [PATCH v1 2/2] kvm: arm64: handle single-step of userspace mmio instructions

2017-10-06 Thread Julien Thierry



On 06/10/17 14:45, Alex Bennée wrote:


Julien Thierry  writes:


On 06/10/17 12:39, Alex Bennée wrote:

The system state of KVM when using userspace emulation is not complete
until we return into KVM_RUN. To handle mmio related updates we wait
until they have been committed and then schedule our KVM_EXIT_DEBUG.

I've introduced a new function kvm_arm_maybe_return_debug() to wrap up
the differences between arm/arm64 which is currently null for arm.

Signed-off-by: Alex Bennée 
---
   arch/arm/include/asm/kvm_host.h   |  2 ++
   arch/arm64/include/asm/kvm_host.h |  1 +
   arch/arm64/kvm/debug.c| 21 +
   arch/arm64/kvm/handle_exit.c  |  9 +++--
   virt/kvm/arm/arm.c|  2 +-
   virt/kvm/arm/mmio.c   |  3 ++-
   6 files changed, 30 insertions(+), 8 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 4a879f6ff13b..aec943f6d123 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -285,6 +285,8 @@ static inline void kvm_arm_init_debug(void) {}
   static inline void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) {}
   static inline void kvm_arm_clear_debug(struct kvm_vcpu *vcpu) {}
   static inline void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu) {}
+static inline int  kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu,
+   struct kvm_run *run) {}



This function should return 1.


So I did ponder making this a bool, returning true if we need to exit
and testing in v/k/a/arm.c exit leg rather than in the mmio handler.

At the moment it mirrors the existing exit logic which follows -1 err, 0
return, >0 handled. But as I mentioned in the cover letter this fell
down a bit when dealing with the mmio case.



Hmmm, my main issue is that this version doesn't have a return 
statement, which probably triggers a gcc warning with ARCH=arm and also 
might cause arm (32bit) kvm to exit upon handling mmio return when we 
don't want to.
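
i.e. the 32-bit stub would need to keep the "handled" convention,
something like (sketch):

	static inline int kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu,
						     struct kvm_run *run)
	{
		/* No single-step support here: always resume the guest */
		return 1;
	}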


Otherwise, I also wondered about using a bool here. But following the 
pre-existing logic makes sense to me (but I have no strong feeling about 
it).





   int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
   struct kvm_device_attr *attr);
diff --git a/arch/arm64/include/asm/kvm_host.h 
b/arch/arm64/include/asm/kvm_host.h
index e923b58606e2..fa67d21662f6 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -369,6 +369,7 @@ void kvm_arm_init_debug(void);
   void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
   void kvm_arm_clear_debug(struct kvm_vcpu *vcpu);
   void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu);
+int  kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu, struct kvm_run *run);


I feel the name could be a little bit more explicit:

kvm_arm_trap_need_step_debug, kvm_arm_trap_step_return_debug,
kvm_arm_trap_need_return_debug.


I wanted to keep the debug suffix so that's fine although I'm not so
sure trap is correct because on the tail end of mmio emulation are we
still trapping?

Maybe kvm_arm_step_emulated_debug?



Yes, sounds good.

Thanks,


At least, I think it would be nice that the name reflect that this
check is meant for emulated instructions.

Otherwise:

Reviewed-by: Julien Thierry 

Thanks,



--
Alex Bennée



--
Julien Thierry


Re: [PATCH v1 2/2] kvm: arm64: handle single-step of userspace mmio instructions

2017-10-06 Thread Julien Thierry



On 06/10/17 15:00, Marc Zyngier wrote:

On 06/10/17 14:47, Alex Bennée wrote:


Marc Zyngier  writes:


On 06/10/17 13:37, Marc Zyngier wrote:

On 06/10/17 12:39, Alex Bennée wrote:

The system state of KVM when using userspace emulation is not complete
until we return into KVM_RUN. To handle mmio related updates we wait
until they have been committed and then schedule our KVM_EXIT_DEBUG.

I've introduced a new function kvm_arm_maybe_return_debug() to wrap up
the differences between arm/arm64 which is currently null for arm.

Signed-off-by: Alex Bennée 
---
  arch/arm/include/asm/kvm_host.h   |  2 ++
  arch/arm64/include/asm/kvm_host.h |  1 +
  arch/arm64/kvm/debug.c| 21 +
  arch/arm64/kvm/handle_exit.c  |  9 +++--
  virt/kvm/arm/arm.c|  2 +-
  virt/kvm/arm/mmio.c   |  3 ++-
  6 files changed, 30 insertions(+), 8 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 4a879f6ff13b..aec943f6d123 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -285,6 +285,8 @@ static inline void kvm_arm_init_debug(void) {}
  static inline void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) {}
  static inline void kvm_arm_clear_debug(struct kvm_vcpu *vcpu) {}
  static inline void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu) {}
+static inline int  kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu,
+   struct kvm_run *run) {}

  int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
   struct kvm_device_attr *attr);
diff --git a/arch/arm64/include/asm/kvm_host.h 
b/arch/arm64/include/asm/kvm_host.h
index e923b58606e2..fa67d21662f6 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -369,6 +369,7 @@ void kvm_arm_init_debug(void);
  void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
  void kvm_arm_clear_debug(struct kvm_vcpu *vcpu);
  void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu);
+int  kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu, struct kvm_run *run);
  int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
   struct kvm_device_attr *attr);
  int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
index dbadfaf850a7..a10a18c55c87 100644
--- a/arch/arm64/kvm/debug.c
+++ b/arch/arm64/kvm/debug.c
@@ -221,3 +221,24 @@ void kvm_arm_clear_debug(struct kvm_vcpu *vcpu)
}
}
  }
+
+
+/*
+ * When KVM has successfully emulated the instruction we might want to
+ * return we a KVM_EXIT_DEBUG. We can only do this once the emulation
+ * is complete though so for userspace emulations we have to wait
+ * until we have re-entered KVM.
+ *
+ * Return > 0 to return to guest, 0 (and set exit_reason) on proper
+ * exit to userspace.
+ */
+
+int kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu, struct kvm_run *run)
+{
+   if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) {
+   run->exit_reason = KVM_EXIT_DEBUG;
+   run->debug.arch.hsr = ESR_ELx_EC_SOFTSTP_LOW << 
ESR_ELx_EC_SHIFT;
+   return 0;
+   }
+   return 1;
+}
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index c918d291cb58..7b04f59217bf 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -202,13 +202,10 @@ static int handle_trap_exceptions(struct kvm_vcpu *vcpu, 
struct kvm_run *run)
handled = exit_handler(vcpu, run);
}

-   if (handled && (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)) {
-   handled = 0;
-   run->exit_reason = KVM_EXIT_DEBUG;
-   run->debug.arch.hsr = ESR_ELx_EC_SOFTSTP_LOW << 
ESR_ELx_EC_SHIFT;
-   }
+   if (handled)
+   return kvm_arm_maybe_return_debug(vcpu, run);

-   return handled;
+   return 0;
  }

  /*
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index b9f68e4add71..3d28fe2daa26 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -623,7 +623,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct 
kvm_run *run)

if (run->exit_reason == KVM_EXIT_MMIO) {
ret = kvm_handle_mmio_return(vcpu, vcpu->run);
-   if (ret)
+   if (ret < 1)
return ret;
}

diff --git a/virt/kvm/arm/mmio.c b/virt/kvm/arm/mmio.c
index b6e715fd3c90..e43e3bd6222f 100644
--- a/virt/kvm/arm/mmio.c
+++ b/virt/kvm/arm/mmio.c
@@ -117,7 +117,8 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu, struct 
kvm_run *run)
vcpu_set_reg(vcpu, vcpu->arch.mmio_decode.rt, data);
}

-   return 0;
+   /* If debugging in effect we may need to return now */
+   return kvm_arm_maybe_return_debug(vcpu, run);


Ah, that's how you do it. OK. Then the patch splitting is wrong, because
everything is broken after patch #1.


Actually

Re: [PATCH v1 2/2] kvm: arm64: handle single-step of userspace mmio instructions

2017-10-06 Thread Marc Zyngier
On 06/10/17 14:47, Alex Bennée wrote:
> 
> Marc Zyngier  writes:
> 
>> On 06/10/17 13:37, Marc Zyngier wrote:
>>> On 06/10/17 12:39, Alex Bennée wrote:
 The system state of KVM when using userspace emulation is not complete
 until we return into KVM_RUN. To handle mmio related updates we wait
 until they have been committed and then schedule our KVM_EXIT_DEBUG.

 I've introduced a new function kvm_arm_maybe_return_debug() to wrap up
 the differences between arm/arm64 which is currently null for arm.

 Signed-off-by: Alex Bennée 
 ---
  arch/arm/include/asm/kvm_host.h   |  2 ++
  arch/arm64/include/asm/kvm_host.h |  1 +
  arch/arm64/kvm/debug.c| 21 +
  arch/arm64/kvm/handle_exit.c  |  9 +++--
  virt/kvm/arm/arm.c|  2 +-
  virt/kvm/arm/mmio.c   |  3 ++-
  6 files changed, 30 insertions(+), 8 deletions(-)

 diff --git a/arch/arm/include/asm/kvm_host.h 
 b/arch/arm/include/asm/kvm_host.h
 index 4a879f6ff13b..aec943f6d123 100644
 --- a/arch/arm/include/asm/kvm_host.h
 +++ b/arch/arm/include/asm/kvm_host.h
 @@ -285,6 +285,8 @@ static inline void kvm_arm_init_debug(void) {}
  static inline void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) {}
  static inline void kvm_arm_clear_debug(struct kvm_vcpu *vcpu) {}
  static inline void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu) {}
 +static inline int  kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu,
 +  struct kvm_run *run) {}

  int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
   struct kvm_device_attr *attr);
 diff --git a/arch/arm64/include/asm/kvm_host.h 
 b/arch/arm64/include/asm/kvm_host.h
 index e923b58606e2..fa67d21662f6 100644
 --- a/arch/arm64/include/asm/kvm_host.h
 +++ b/arch/arm64/include/asm/kvm_host.h
 @@ -369,6 +369,7 @@ void kvm_arm_init_debug(void);
  void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
  void kvm_arm_clear_debug(struct kvm_vcpu *vcpu);
  void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu);
 +int  kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu, struct kvm_run 
 *run);
  int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
   struct kvm_device_attr *attr);
  int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
 diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
 index dbadfaf850a7..a10a18c55c87 100644
 --- a/arch/arm64/kvm/debug.c
 +++ b/arch/arm64/kvm/debug.c
 @@ -221,3 +221,24 @@ void kvm_arm_clear_debug(struct kvm_vcpu *vcpu)
}
}
  }
 +
 +
 +/*
 + * When KVM has successfully emulated the instruction we might want to
 + * return we a KVM_EXIT_DEBUG. We can only do this once the emulation
 + * is complete though so for userspace emulations we have to wait
 + * until we have re-entered KVM.
 + *
 + * Return > 0 to return to guest, 0 (and set exit_reason) on proper
 + * exit to userspace.
 + */
 +
 +int kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu, struct kvm_run *run)
 +{
 +  if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) {
 +  run->exit_reason = KVM_EXIT_DEBUG;
 +  run->debug.arch.hsr = ESR_ELx_EC_SOFTSTP_LOW << 
 ESR_ELx_EC_SHIFT;
 +  return 0;
 +  }
 +  return 1;
 +}
 diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
 index c918d291cb58..7b04f59217bf 100644
 --- a/arch/arm64/kvm/handle_exit.c
 +++ b/arch/arm64/kvm/handle_exit.c
 @@ -202,13 +202,10 @@ static int handle_trap_exceptions(struct kvm_vcpu 
 *vcpu, struct kvm_run *run)
handled = exit_handler(vcpu, run);
}

 -  if (handled && (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)) {
 -  handled = 0;
 -  run->exit_reason = KVM_EXIT_DEBUG;
 -  run->debug.arch.hsr = ESR_ELx_EC_SOFTSTP_LOW << 
 ESR_ELx_EC_SHIFT;
 -  }
 +  if (handled)
 +  return kvm_arm_maybe_return_debug(vcpu, run);

 -  return handled;
 +  return 0;
  }

  /*
 diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
 index b9f68e4add71..3d28fe2daa26 100644
 --- a/virt/kvm/arm/arm.c
 +++ b/virt/kvm/arm/arm.c
 @@ -623,7 +623,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, 
 struct kvm_run *run)

if (run->exit_reason == KVM_EXIT_MMIO) {
ret = kvm_handle_mmio_return(vcpu, vcpu->run);
 -  if (ret)
 +  if (ret < 1)
return ret;
}

 diff --git a/virt/kvm/arm/mmio.c b/virt/kvm/arm/mmio.c
 index b6e715fd3c90..e43e3bd6222f 100644
 --- a/virt/kvm/arm/mmio.c
 +++ b/virt/kvm/arm/mmio.c
 @@ -117,7 +117,8 @@ int kvm_handle_mmio_retu

Re: [PATCH v1 2/2] kvm: arm64: handle single-step of userspace mmio instructions

2017-10-06 Thread Alex Bennée

Marc Zyngier  writes:

> On 06/10/17 13:37, Marc Zyngier wrote:
>> On 06/10/17 12:39, Alex Bennée wrote:
>>> The system state of KVM when using userspace emulation is not complete
>>> until we return into KVM_RUN. To handle mmio related updates we wait
>>> until they have been committed and then schedule our KVM_EXIT_DEBUG.
>>>
>>> I've introduced a new function kvm_arm_maybe_return_debug() to wrap up
>>> the differences between arm/arm64 which is currently null for arm.
>>>
>>> Signed-off-by: Alex Bennée 
>>> ---
>>>  arch/arm/include/asm/kvm_host.h   |  2 ++
>>>  arch/arm64/include/asm/kvm_host.h |  1 +
>>>  arch/arm64/kvm/debug.c| 21 +
>>>  arch/arm64/kvm/handle_exit.c  |  9 +++--
>>>  virt/kvm/arm/arm.c|  2 +-
>>>  virt/kvm/arm/mmio.c   |  3 ++-
>>>  6 files changed, 30 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/arch/arm/include/asm/kvm_host.h 
>>> b/arch/arm/include/asm/kvm_host.h
>>> index 4a879f6ff13b..aec943f6d123 100644
>>> --- a/arch/arm/include/asm/kvm_host.h
>>> +++ b/arch/arm/include/asm/kvm_host.h
>>> @@ -285,6 +285,8 @@ static inline void kvm_arm_init_debug(void) {}
>>>  static inline void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) {}
>>>  static inline void kvm_arm_clear_debug(struct kvm_vcpu *vcpu) {}
>>>  static inline void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu) {}
>>> +static inline int  kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu,
>>> +   struct kvm_run *run) {}
>>>
>>>  int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
>>>struct kvm_device_attr *attr);
>>> diff --git a/arch/arm64/include/asm/kvm_host.h 
>>> b/arch/arm64/include/asm/kvm_host.h
>>> index e923b58606e2..fa67d21662f6 100644
>>> --- a/arch/arm64/include/asm/kvm_host.h
>>> +++ b/arch/arm64/include/asm/kvm_host.h
>>> @@ -369,6 +369,7 @@ void kvm_arm_init_debug(void);
>>>  void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
>>>  void kvm_arm_clear_debug(struct kvm_vcpu *vcpu);
>>>  void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu);
>>> +int  kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu, struct kvm_run 
>>> *run);
>>>  int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
>>>struct kvm_device_attr *attr);
>>>  int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
>>> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
>>> index dbadfaf850a7..a10a18c55c87 100644
>>> --- a/arch/arm64/kvm/debug.c
>>> +++ b/arch/arm64/kvm/debug.c
>>> @@ -221,3 +221,24 @@ void kvm_arm_clear_debug(struct kvm_vcpu *vcpu)
>>> }
>>> }
>>>  }
>>> +
>>> +
>>> +/*
>>> + * When KVM has successfully emulated the instruction we might want to
>>> + * return we a KVM_EXIT_DEBUG. We can only do this once the emulation
>>> + * is complete though so for userspace emulations we have to wait
>>> + * until we have re-entered KVM.
>>> + *
>>> + * Return > 0 to return to guest, 0 (and set exit_reason) on proper
>>> + * exit to userspace.
>>> + */
>>> +
>>> +int kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu, struct kvm_run *run)
>>> +{
>>> +   if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) {
>>> +   run->exit_reason = KVM_EXIT_DEBUG;
>>> +   run->debug.arch.hsr = ESR_ELx_EC_SOFTSTP_LOW << 
>>> ESR_ELx_EC_SHIFT;
>>> +   return 0;
>>> +   }
>>> +   return 1;
>>> +}
>>> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
>>> index c918d291cb58..7b04f59217bf 100644
>>> --- a/arch/arm64/kvm/handle_exit.c
>>> +++ b/arch/arm64/kvm/handle_exit.c
>>> @@ -202,13 +202,10 @@ static int handle_trap_exceptions(struct kvm_vcpu 
>>> *vcpu, struct kvm_run *run)
>>> handled = exit_handler(vcpu, run);
>>> }
>>>
>>> -   if (handled && (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)) {
>>> -   handled = 0;
>>> -   run->exit_reason = KVM_EXIT_DEBUG;
>>> -   run->debug.arch.hsr = ESR_ELx_EC_SOFTSTP_LOW << 
>>> ESR_ELx_EC_SHIFT;
>>> -   }
>>> +   if (handled)
>>> +   return kvm_arm_maybe_return_debug(vcpu, run);
>>>
>>> -   return handled;
>>> +   return 0;
>>>  }
>>>
>>>  /*
>>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
>>> index b9f68e4add71..3d28fe2daa26 100644
>>> --- a/virt/kvm/arm/arm.c
>>> +++ b/virt/kvm/arm/arm.c
>>> @@ -623,7 +623,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, 
>>> struct kvm_run *run)
>>>
>>> if (run->exit_reason == KVM_EXIT_MMIO) {
>>> ret = kvm_handle_mmio_return(vcpu, vcpu->run);
>>> -   if (ret)
>>> +   if (ret < 1)
>>> return ret;
>>> }
>>>
>>> diff --git a/virt/kvm/arm/mmio.c b/virt/kvm/arm/mmio.c
>>> index b6e715fd3c90..e43e3bd6222f 100644
>>> --- a/virt/kvm/arm/mmio.c
>>> +++ b/virt/kvm/arm/mmio.c
>>> @@ -117,7 +117,8 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu, 
>>> struct kvm_run *run)
>>> vcpu_set_reg(vcpu, vcpu->arch.mmio_decode.rt, data);
>>> }
>>>

Re: [PATCH v1 2/2] kvm: arm64: handle single-step of userspace mmio instructions

2017-10-06 Thread Alex Bennée

Julien Thierry  writes:

> On 06/10/17 12:39, Alex Bennée wrote:
>> The system state of KVM when using userspace emulation is not complete
>> until we return into KVM_RUN. To handle mmio related updates we wait
>> until they have been committed and then schedule our KVM_EXIT_DEBUG.
>>
>> I've introduced a new function kvm_arm_maybe_return_debug() to wrap up
>> the differences between arm/arm64 which is currently null for arm.
>>
>> Signed-off-by: Alex Bennée 
>> ---
>>   arch/arm/include/asm/kvm_host.h   |  2 ++
>>   arch/arm64/include/asm/kvm_host.h |  1 +
>>   arch/arm64/kvm/debug.c| 21 +
>>   arch/arm64/kvm/handle_exit.c  |  9 +++--
>>   virt/kvm/arm/arm.c|  2 +-
>>   virt/kvm/arm/mmio.c   |  3 ++-
>>   6 files changed, 30 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/kvm_host.h 
>> b/arch/arm/include/asm/kvm_host.h
>> index 4a879f6ff13b..aec943f6d123 100644
>> --- a/arch/arm/include/asm/kvm_host.h
>> +++ b/arch/arm/include/asm/kvm_host.h
>> @@ -285,6 +285,8 @@ static inline void kvm_arm_init_debug(void) {}
>>   static inline void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) {}
>>   static inline void kvm_arm_clear_debug(struct kvm_vcpu *vcpu) {}
>>   static inline void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu) {}
>> +static inline int  kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu,
>> +struct kvm_run *run) {}
>>
>
> This function should return 1.

So I did ponder making this a bool, returning true if we need to exit
and testing in v/k/a/arm.c exit leg rather than in the mmio handler.

At the moment it mirrors the existing exit logic which follows -1 err, 0
return, >0 handled. But as I mentioned in the cover letter this fell
down a bit when dealing with the mmio case.
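
For the record, the bool variant would have looked roughly like this
(sketch only, not what this patch does):

	/* true == exit to userspace with KVM_EXIT_DEBUG */
	static bool kvm_arm_step_emulated_debug(struct kvm_vcpu *vcpu,
						struct kvm_run *run)
	{
		if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) {
			run->exit_reason = KVM_EXIT_DEBUG;
			run->debug.arch.hsr = ESR_ELx_EC_SOFTSTP_LOW << ESR_ELx_EC_SHIFT;
			return true;
		}
		return false;
	}

with the check done in the kvm_arch_vcpu_ioctl_run() exit leg once the
mmio return has been committed, rather than inside
kvm_handle_mmio_return() itself.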

>
>>   int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
>> struct kvm_device_attr *attr);
>> diff --git a/arch/arm64/include/asm/kvm_host.h 
>> b/arch/arm64/include/asm/kvm_host.h
>> index e923b58606e2..fa67d21662f6 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -369,6 +369,7 @@ void kvm_arm_init_debug(void);
>>   void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
>>   void kvm_arm_clear_debug(struct kvm_vcpu *vcpu);
>>   void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu);
>> +int  kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu, struct kvm_run *run);
>
> I feel the name could be a little bit more explicit:
>
> kvm_arm_trap_need_step_debug, kvm_arm_trap_step_return_debug,
> kvm_arm_trap_need_return_debug.

I wanted to keep the debug suffix so that's fine although I'm not so
sure trap is correct because on the tail end of mmio emulation are we
still trapping?

Maybe kvm_arm_step_emulated_debug?

> At least, I think it would be nice that the name reflect that this
> check is meant for emulated instructions.
>
> Otherwise:
>
> Reviewed-by: Julien Thierry 
>
> Thanks,


--
Alex Bennée


Re: [PATCH v2 11/28] arm64/sve: Core task context handling

2017-10-06 Thread Catalin Marinas
On Fri, Oct 06, 2017 at 02:10:09PM +0100, Dave P Martin wrote:
> On Thu, Oct 05, 2017 at 12:28:35PM +0100, Catalin Marinas wrote:
> > On Tue, Oct 03, 2017 at 12:33:03PM +0100, Dave P Martin wrote:
> > > TIF_FOREIGN_FPSTATE's meaning is expanded to cover SVE, but otherwise
> > > unchanged:
> > > 
> > >  * If a task is running and !TIF_FOREIGN_FPSTATE, then the CPU
> > >registers of the CPU the task is running on contain the authoritative
> > >FPSIMD/SVE state of the task.  The backing memory may be stale.
> > > 
> > >  * Otherwise (i.e., task not running, or task running and
> > >TIF_FOREIGN_FPSTATE), the task's FPSIMD/SVE backing memory is
> > >authoritative.  If additionally per_cpu(fpsimd_last_state,
> > >task->fpsimd_state.cpu) == &task->fpsimd_state.cpu, then
> > >task->fpsimd_state.cpu's registers are also up to date for task, but
> > >not authoritative: the current FPSIMD/SVE state may be read from
> > >them, but they must not be written.
> > >  
> > > The FPSIMD/SVE backing memory is selected by TIF_SVE:
> > > 
> > >  * TIF_SVE set: Zn (incorporating Vn in bits[127:0]), Pn and FFR are
> > >stored in task->thread.sve_state, formatted appropriately for vector
> > >length task->thread.sve_vl.  task->thread.sve_state must point to a
> > >valid buffer at least sve_state_size(task) bytes in size.

"Zn [...] stored in  task->thread.sve_state" - is this still true with
the changes you proposed? I guess even without these changes, you have
situations where the hardware regs are out of sync with sve_state (see
more below).

> > >  * TIF_SVE clear: Vn are stored in task->fpsimd_state; Zn[max : 128] are
> > >logically zero[*] but not stored anywhere; Pn, FFR are not stored and
> > >have unspecified values from userspace's point of view.
> > >task->thread.sve_state does not need to be non-null, valid or any
> > >particular size: it must not be dereferenced.
> > > 
> > >In practice I don't exploit the "unspecifiedness" much.  The Zn high
> > >bits, Pn and FFR are all zeroed when setting TIF_SVE again:
> > >sve_alloc() is the common path for this.
> > > 
> > >  * FPSR and FPCR are always stored in task->fpsimd_state irrespective of
> > >whether TIF_SVE is clear or set, since these are not vector length
> > >dependent.
[...]
> > Just wondering, as an optimisation for do_sve_acc() - instead of
> > sve_alloc() and fpsimd_to_sve(), can we not force the loading of the
> > FPSIMD regs on the return to user via TIF_FOREIGN_FPSTATE? This would
> > ensure the zeroing of the top SVE bits and we only need to allocate the
> > SVE state on the saving path. This means enabling SVE for user and
> > setting TIF_SVE without having the backing storage allocated.
> 
> Currently the set of places where the "TIF_SVE implies sve_state valid"
> assumption is applied is not very constrained, so while your suggestion
> is reasonable I'd rather not mess with it just now, if possible.
> 
> 
> But we can do this (which is what my current fixup has):
> 
> el0_sve_acc:
>   enable_dbg_and_irq
>   // ...
>   bl  do_sve_acc
>   b   ret_to_user
> 
> void do_sve_acc(unsigned int esr, struct pt_regs *regs)
> {
>   /* Even if we chose not to use SVE, the hardware could still trap: */
>   if (unlikely(!system_supports_sve()) || WARN_ON(is_compat_task())) {
>   force_signal_inject(SIGILL, ILL_ILLOPC, regs, 0);
>   return;
>   }
> 
>   sve_alloc(current);
> 
>   local_bh_disable();
>   if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE)) {
>   task_fpsimd_load(); /* flushes high Zn bits as a side-effect */
>   sve_flush_pregs();
>   } else {
>   sve_flush_all(); /* flush all the SVE bits in-place */
>   }
> 
>   if (test_and_set_thread_flag(TIF_SVE))
>   WARN_ON(1); /* SVE access shouldn't have trapped */
>   local_bh_enable();
> }
> 
> where sve_flush_all() zeroes all the high Zn bits via a series of
> MOV Vn, Vn instructions, and also zeroes Pn and FFR.  sve_flush_pregs()
> just does the latter.

This looks fine to me but I added a comment above. IIUC, we can now have
TIF_SVE set while sve_state contains stale data. I don't see an issue
given that every time you enter the kernel from user space you have
TIF_SVE set and the sve_state storage out of sync. Maybe tweak the
TIF_SVE description above slightly.
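
To make the register-vs-memory rule above concrete, a minimal
illustrative helper (hypothetical, not something the series adds) could
look like this:

#include <linux/sched.h>
#include <linux/thread_info.h>

/*
 * Illustration only: the CPU registers are authoritative for a task's
 * FPSIMD/SVE state iff the task is currently running on a CPU and
 * TIF_FOREIGN_FPSTATE is clear.  Otherwise the backing memory
 * (fpsimd_state or sve_state, selected by TIF_SVE) is authoritative.
 */
static bool fp_regs_are_authoritative(struct task_struct *task)
{
	return task_curr(task) &&
	       !test_tsk_thread_flag(task, TIF_FOREIGN_FPSTATE);
}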

-- 
Catalin


Re: [PATCH v1 1/2] KVM: arm64: handle single-stepping trapped instructions

2017-10-06 Thread Julien Thierry



On 06/10/17 12:39, Alex Bennée wrote:

If we are using guest debug to single-step the guest we need to ensure
we exit after emulating the instruction. This only affects
instructions completely emulated by the kernel. For userspace emulated
instructions we need to exit and return to complete the emulation.

We fake debug.arch.hsr to contain ESR_ELx_EC_SOFTSTP_LOW so QEMU knows
it was a single-step event (and without altering the userspace ABI).

Signed-off-by: Alex Bennée 


Reviewed-by: Julien Thierry 

Thanks,

--
Julien Thierry


Re: [PATCH v1 2/2] kvm: arm64: handle single-step of userspace mmio instructions

2017-10-06 Thread Julien Thierry



On 06/10/17 12:39, Alex Bennée wrote:

The system state of KVM when using userspace emulation is not complete
until we return into KVM_RUN. To handle mmio related updates we wait
until they have been committed and then schedule our KVM_EXIT_DEBUG.

I've introduced a new function kvm_arm_maybe_return_debug() to wrap up
the differences between arm/arm64 which is currently null for arm.

Signed-off-by: Alex Bennée 
---
  arch/arm/include/asm/kvm_host.h   |  2 ++
  arch/arm64/include/asm/kvm_host.h |  1 +
  arch/arm64/kvm/debug.c| 21 +
  arch/arm64/kvm/handle_exit.c  |  9 +++--
  virt/kvm/arm/arm.c|  2 +-
  virt/kvm/arm/mmio.c   |  3 ++-
  6 files changed, 30 insertions(+), 8 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 4a879f6ff13b..aec943f6d123 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -285,6 +285,8 @@ static inline void kvm_arm_init_debug(void) {}
  static inline void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) {}
  static inline void kvm_arm_clear_debug(struct kvm_vcpu *vcpu) {}
  static inline void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu) {}
+static inline int  kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu,
+   struct kvm_run *run) {}
  


This function should return 1.
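
A minimal sketch of the 32-bit stub with that fix applied (illustrative
only, mirroring the arm64 convention of returning 1 when no debug exit
is needed):

static inline int kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu,
					     struct kvm_run *run)
{
	/* 32-bit arm never needs to force a debug exit here */
	return 1;
}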


  int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
   struct kvm_device_attr *attr);
diff --git a/arch/arm64/include/asm/kvm_host.h 
b/arch/arm64/include/asm/kvm_host.h
index e923b58606e2..fa67d21662f6 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -369,6 +369,7 @@ void kvm_arm_init_debug(void);
  void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
  void kvm_arm_clear_debug(struct kvm_vcpu *vcpu);
  void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu);
+int  kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu, struct kvm_run *run);


I feel the name could be a little bit more explicit:

kvm_arm_trap_need_step_debug, kvm_arm_trap_step_return_debug, 
kvm_arm_trap_need_return_debug.


At least, I think it would be nice that the name reflect that this check 
is meant for emulated instructions.


Otherwise:

Reviewed-by: Julien Thierry 

Thanks,

--
Julien Thierry


Re: [PATCH v2 11/28] arm64/sve: Core task context handling

2017-10-06 Thread Dave Martin
On Thu, Oct 05, 2017 at 12:28:35PM +0100, Catalin Marinas wrote:
> On Tue, Oct 03, 2017 at 12:33:03PM +0100, Dave P Martin wrote:
> > On Wed, Sep 13, 2017 at 07:33:25AM -0700, Catalin Marinas wrote:
> > > On Thu, Aug 31, 2017 at 06:00:43PM +0100, Dave P Martin wrote:
> > > > +/*
> > > > + * Handle SVE state across fork():
> > > > + *
> > > > + * dst and src must not end up with aliases of the same sve_state.
> > > > + * Because a task cannot fork except in a syscall, we can discard SVE
> > > > + * state for dst here, so long as we take care to retain the FPSIMD
> > > > + * subset of the state if SVE is in use.  Reallocation of the SVE state
> > > > + * will be deferred until dst tries to use SVE.
> > > > + */
> > > > +void fpsimd_dup_sve(struct task_struct *dst, struct task_struct const 
> > > > *src)
> > > > +{
> > > > +   if (test_and_clear_tsk_thread_flag(dst, TIF_SVE)) {
> > > > +   WARN_ON(dst->mm && !in_syscall(task_pt_regs(dst)));
> > > > +   sve_to_fpsimd(dst);
> > > > +   }
> > > > +
> > > > +   dst->thread.sve_state = NULL;
> > > > +}
> > > 
> > > I first thought the thread flags are not visible in dst yet since
> > > dup_task_struct() calls arch_dup_task_struct() before
> > > setup_thread_stack(). However, at the end of the last year we enabled
> > > CONFIG_THREAD_INFO_IN_TASK_STRUCT. But I don't particularly like relying
> > > on this.
> > > 
> > > Anyway, IIUC we don't need sve_to_fpsimd() here. The
> > > arch_dup_task_struct() already called fpsimd_preserve_current_state()
> > > for src, so the FPSIMD state (which we care about) is transferred during
> > > the *dst = *src assignment. So you'd only need the last statement,
> > > possibly with a different function name like fpsimd_erase_sve (and maybe
> > > make the function static inline in the header).
> > 
> > Regarding the intended meanings of the thread flags, does this help?
> > 
> > --8<--
> > 
> > TIF_FOREIGN_FPSTATE's meaning is expanded to cover SVE, but otherwise
> > unchanged:
> > 
> >  * If a task is running and !TIF_FOREIGN_FPSTATE, then the CPU
> >registers of the CPU the task is running on contain the authoritative
> >FPSIMD/SVE state of the task.  The backing memory may be stale.
> > 
> >  * Otherwise (i.e., task not running, or task running and
> >TIF_FOREIGN_FPSTATE), the task's FPSIMD/SVE backing memory is
> >authoritative.  If additionally per_cpu(fpsimd_last_state,
> >task->fpsimd_state.cpu) == &task->fpsimd_state.cpu, then
> >task->fpsimd_state.cpu's registers are also up to date for task, but
> >not authoritative: the current FPSIMD/SVE state may be read from
> >them, but they must not be written.
> >  
> > The FPSIMD/SVE backing memory is selected by TIF_SVE:
> > 
> >  * TIF_SVE set: Zn (incorporating Vn in bits[127:0]), Pn and FFR are
> >stored in task->thread.sve_state, formatted appropriately for vector
> >length task->thread.sve_vl.  task->thread.sve_state must point to a
> >valid buffer at least sve_state_size(task) bytes in size.
> > 
> >  * TIF_SVE clear: Vn are stored in task->fpsimd_state; Zn[max : 128] are
> >logically zero[*] but not stored anywhere; Pn, FFR are not stored and
> >have unspecified values from userspace's point of view.
> >task->thread.sve_state does not need to be non-null, valid or any
> >particular size: it must not be dereferenced.
> > 
> >In practice I don't exploit the "unspecifiedness" much.  The Zn high
> >bits, Pn and FFR are all zeroed when setting TIF_SVE again:
> >sve_alloc() is the common path for this.
> > 
> >  * FPSR and FPCR are always stored in task->fpsimd_state irrespective of
> >whether TIF_SVE is clear or set, since these are not vector length
> >dependent.
> 
> This looks fine. I think we need to make sure (with a warning) that
> task_fpsimd_save() (and probably load) is always called in a
> non-preemptible context. 
> 
> > [*] theoretically unspecified, which is what I've written in sve.txt.
> > However, on any FPSIMD Vn write the upper bits of the corresponding Zn
> > must become logically zero in order to conform to the SVE programmer's
> > model.  It's not feasible to track which Vn have been written
> > individually because that would involve trapping every FPSIMD insn until
> > all possible Vn have been written.  So the kernel ensures that the Zn
> > high bits become zero.
> > 
> > Maybe we should just guarantee "zero-or-preserved" behaviour for
> > userspace.  This may close down some future optimisation opportunities,
> > but maybe it's better to be simple.
> 
> Does it work if we leave it as "unspecified" in the document but just do
> zero-or-preserved in the kernel code?

Sure, that's the behaviour today in effect.

I'll leave the documentation unchanged, then we can take advantage of
this flexibility later if it proves to be useful.

> Just wondering, as an optimisation for do_sve_acc() - instead of
> sve_alloc() and fpsimd_to_sve(), can we not

Re: [PATCH v1 1/2] KVM: arm64: handle single-stepping trapped instructions

2017-10-06 Thread Marc Zyngier
On 06/10/17 13:34, Julien Thierry wrote:
> 
> 
> On 06/10/17 13:27, Marc Zyngier wrote:
>> On 06/10/17 12:39, Alex Bennée wrote:
>>> If we are using guest debug to single-step the guest we need to ensure
>>> we exit after emulating the instruction. This only affects
>>> instructions completely emulated by the kernel. For userspace emulated
>>> instructions we need to exit and return to complete the emulation.
>>>
>>> We fake debug.arch.hsr to contain ESR_ELx_EC_SOFTSTP_LOW so QEMU knows
>>> it was a single-step event (and without altering the userspace ABI).
>>>
>>> Signed-off-by: Alex Bennée 
>>> ---
>>>   arch/arm64/kvm/handle_exit.c | 48 
>>> +++-
>>>   1 file changed, 34 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
>>> index 7debb74843a0..c918d291cb58 100644
>>> --- a/arch/arm64/kvm/handle_exit.c
>>> +++ b/arch/arm64/kvm/handle_exit.c
>>> @@ -178,6 +178,39 @@ static exit_handle_fn kvm_get_exit_handler(struct 
>>> kvm_vcpu *vcpu)
>>> return arm_exit_handlers[hsr_ec];
>>>   }
>>>   
>>> +/*
>>> + * When handling traps we need to ensure exit the guest if we
>>> + * completely emulated the instruction while single-stepping. Stuff to
>>> + * be emulated in userspace needs to complete that first.
>>> + */
>>> +
>>> +static int handle_trap_exceptions(struct kvm_vcpu *vcpu, struct kvm_run 
>>> *run)
>>> +{
>>> +   int handled;
>>> +
>>> +   /*
>>> +* See ARM ARM B1.14.1: "Hyp traps on instructions
>>> +* that fail their condition code check"
>>> +*/
>>> +   if (!kvm_condition_valid(vcpu)) {
>>> +   kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
>>> +   handled = 1;
>>> +   } else {
>>> +   exit_handle_fn exit_handler;
>>> +
>>> +   exit_handler = kvm_get_exit_handler(vcpu);
>>> +   handled = exit_handler(vcpu, run);
>>> +   }
>>> +
>>> +   if (handled && (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)) {
>>> +   handled = 0;
>>> +   run->exit_reason = KVM_EXIT_DEBUG;
>>> +   run->debug.arch.hsr = ESR_ELx_EC_SOFTSTP_LOW << 
>>> ESR_ELx_EC_SHIFT;
>>> +   }
>>
>> Doesn't this break an MMIO read? The registers haven't been updated yet,
>> and the debugger may not see the right thing...
>>
>> How about something like:
>>
>>  if (handled && (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)) {
>>  if (run->exit_reason == KVM_EXIT_MMIO)
>>  kvm_handle_mmio_return(vcpu, run);
>>  [...]
>>  }
>>
>> Or am I missing something?
> 
> If the MMIO was not handled by the kernel, exit_handler will return 0, 
> so handled will be false and we won't pretend we have a debug exception 
> (but will still return to userland with KVM_EXIT_MMIO).
> 
> I think the second patch takes care of properly handling single step for 
> userland MMIO.

Indeed, I was just confused. We do have a kvm_handle_mmio_return in the
vgic emulation, so it is all taken care of at this stage. Blimey.

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...


Re: [PATCH v1 2/2] kvm: arm64: handle single-step of userspace mmio instructions

2017-10-06 Thread Marc Zyngier
On 06/10/17 13:37, Marc Zyngier wrote:
> On 06/10/17 12:39, Alex Bennée wrote:
>> The system state of KVM when using userspace emulation is not complete
>> until we return into KVM_RUN. To handle mmio related updates we wait
>> until they have been committed and then schedule our KVM_EXIT_DEBUG.
>>
>> I've introduced a new function kvm_arm_maybe_return_debug() to wrap up
>> the differences between arm/arm64 which is currently null for arm.
>>
>> Signed-off-by: Alex Bennée 
>> ---
>>  arch/arm/include/asm/kvm_host.h   |  2 ++
>>  arch/arm64/include/asm/kvm_host.h |  1 +
>>  arch/arm64/kvm/debug.c| 21 +
>>  arch/arm64/kvm/handle_exit.c  |  9 +++--
>>  virt/kvm/arm/arm.c|  2 +-
>>  virt/kvm/arm/mmio.c   |  3 ++-
>>  6 files changed, 30 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/kvm_host.h 
>> b/arch/arm/include/asm/kvm_host.h
>> index 4a879f6ff13b..aec943f6d123 100644
>> --- a/arch/arm/include/asm/kvm_host.h
>> +++ b/arch/arm/include/asm/kvm_host.h
>> @@ -285,6 +285,8 @@ static inline void kvm_arm_init_debug(void) {}
>>  static inline void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) {}
>>  static inline void kvm_arm_clear_debug(struct kvm_vcpu *vcpu) {}
>>  static inline void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu) {}
>> +static inline int  kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu,
>> +struct kvm_run *run) {}
>>  
>>  int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
>> struct kvm_device_attr *attr);
>> diff --git a/arch/arm64/include/asm/kvm_host.h 
>> b/arch/arm64/include/asm/kvm_host.h
>> index e923b58606e2..fa67d21662f6 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -369,6 +369,7 @@ void kvm_arm_init_debug(void);
>>  void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
>>  void kvm_arm_clear_debug(struct kvm_vcpu *vcpu);
>>  void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu);
>> +int  kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu, struct kvm_run *run);
>>  int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
>> struct kvm_device_attr *attr);
>>  int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
>> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
>> index dbadfaf850a7..a10a18c55c87 100644
>> --- a/arch/arm64/kvm/debug.c
>> +++ b/arch/arm64/kvm/debug.c
>> @@ -221,3 +221,24 @@ void kvm_arm_clear_debug(struct kvm_vcpu *vcpu)
>>  }
>>  }
>>  }
>> +
>> +
>> +/*
>> + * When KVM has successfully emulated the instruction we might want to
>> + * return we a KVM_EXIT_DEBUG. We can only do this once the emulation
>> + * is complete though so for userspace emulations we have to wait
>> + * until we have re-entered KVM.
>> + *
>> + * Return > 0 to return to guest, 0 (and set exit_reason) on proper
>> + * exit to userspace.
>> + */
>> +
>> +int kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu, struct kvm_run *run)
>> +{
>> +if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) {
>> +run->exit_reason = KVM_EXIT_DEBUG;
>> +run->debug.arch.hsr = ESR_ELx_EC_SOFTSTP_LOW << 
>> ESR_ELx_EC_SHIFT;
>> +return 0;
>> +}
>> +return 1;
>> +}
>> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
>> index c918d291cb58..7b04f59217bf 100644
>> --- a/arch/arm64/kvm/handle_exit.c
>> +++ b/arch/arm64/kvm/handle_exit.c
>> @@ -202,13 +202,10 @@ static int handle_trap_exceptions(struct kvm_vcpu 
>> *vcpu, struct kvm_run *run)
>>  handled = exit_handler(vcpu, run);
>>  }
>>  
>> -if (handled && (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)) {
>> -handled = 0;
>> -run->exit_reason = KVM_EXIT_DEBUG;
>> -run->debug.arch.hsr = ESR_ELx_EC_SOFTSTP_LOW << 
>> ESR_ELx_EC_SHIFT;
>> -}
>> +if (handled)
>> +return kvm_arm_maybe_return_debug(vcpu, run);
>>  
>> -return handled;
>> +return 0;
>>  }
>>  
>>  /*
>> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
>> index b9f68e4add71..3d28fe2daa26 100644
>> --- a/virt/kvm/arm/arm.c
>> +++ b/virt/kvm/arm/arm.c
>> @@ -623,7 +623,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, 
>> struct kvm_run *run)
>>  
>>  if (run->exit_reason == KVM_EXIT_MMIO) {
>>  ret = kvm_handle_mmio_return(vcpu, vcpu->run);
>> -if (ret)
>> +if (ret < 1)
>>  return ret;
>>  }
>>  
>> diff --git a/virt/kvm/arm/mmio.c b/virt/kvm/arm/mmio.c
>> index b6e715fd3c90..e43e3bd6222f 100644
>> --- a/virt/kvm/arm/mmio.c
>> +++ b/virt/kvm/arm/mmio.c
>> @@ -117,7 +117,8 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu, struct 
>> kvm_run *run)
>>  vcpu_set_reg(vcpu, vcpu->arch.mmio_decode.rt, data);
>>  }
>>  
>> -return 0;
>> +/* If debugging in effect we may need to return now */
>> +return kvm_arm_mayb

Re: [PATCH v1 1/2] KVM: arm64: handle single-stepping trapped instructions

2017-10-06 Thread Julien Thierry



On 06/10/17 13:27, Marc Zyngier wrote:

On 06/10/17 12:39, Alex Bennée wrote:

If we are using guest debug to single-step the guest we need to ensure
we exit after emulating the instruction. This only affects
instructions completely emulated by the kernel. For userspace emulated
instructions we need to exit and return to complete the emulation.

We fake debug.arch.hsr to contain ESR_ELx_EC_SOFTSTP_LOW so QEMU knows
it was a single-step event (and without altering the userspace ABI).

Signed-off-by: Alex Bennée 
---
  arch/arm64/kvm/handle_exit.c | 48 +++-
  1 file changed, 34 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 7debb74843a0..c918d291cb58 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -178,6 +178,39 @@ static exit_handle_fn kvm_get_exit_handler(struct kvm_vcpu 
*vcpu)
return arm_exit_handlers[hsr_ec];
  }
  
+/*

+ * When handling traps we need to ensure exit the guest if we
+ * completely emulated the instruction while single-stepping. Stuff to
+ * be emulated in userspace needs to complete that first.
+ */
+
+static int handle_trap_exceptions(struct kvm_vcpu *vcpu, struct kvm_run *run)
+{
+   int handled;
+
+   /*
+* See ARM ARM B1.14.1: "Hyp traps on instructions
+* that fail their condition code check"
+*/
+   if (!kvm_condition_valid(vcpu)) {
+   kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
+   handled = 1;
+   } else {
+   exit_handle_fn exit_handler;
+
+   exit_handler = kvm_get_exit_handler(vcpu);
+   handled = exit_handler(vcpu, run);
+   }
+
+   if (handled && (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)) {
+   handled = 0;
+   run->exit_reason = KVM_EXIT_DEBUG;
+   run->debug.arch.hsr = ESR_ELx_EC_SOFTSTP_LOW << 
ESR_ELx_EC_SHIFT;
+   }


Doesn't this break an MMIO read? The registers haven't been updated yet,
and the debugger may not see the right thing...

How about something like:

if (handled && (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)) {
if (run->exit_reason == KVM_EXIT_MMIO)
kvm_handle_mmio_return(vcpu, run);
[...]
}

Or am I missing something?


If the MMIO was not handled by the kernel, exit_handler will return 0, 
so handled will be false and we won't pretend we have a debug exception 
(but will still return to userland with KVM_EXIT_MMIO).


I think the second patch takes care of properly handling single step for 
userland MMIO.


--
Julien Thierry


Re: [PATCH v1 2/2] kvm: arm64: handle single-step of userspace mmio instructions

2017-10-06 Thread Marc Zyngier
On 06/10/17 12:39, Alex Bennée wrote:
> The system state of KVM when using userspace emulation is not complete
> until we return into KVM_RUN. To handle mmio related updates we wait
> until they have been committed and then schedule our KVM_EXIT_DEBUG.
> 
> I've introduced a new function kvm_arm_maybe_return_debug() to wrap up
> the differences between arm/arm64 which is currently null for arm.
> 
> Signed-off-by: Alex Bennée 
> ---
>  arch/arm/include/asm/kvm_host.h   |  2 ++
>  arch/arm64/include/asm/kvm_host.h |  1 +
>  arch/arm64/kvm/debug.c| 21 +
>  arch/arm64/kvm/handle_exit.c  |  9 +++--
>  virt/kvm/arm/arm.c|  2 +-
>  virt/kvm/arm/mmio.c   |  3 ++-
>  6 files changed, 30 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index 4a879f6ff13b..aec943f6d123 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -285,6 +285,8 @@ static inline void kvm_arm_init_debug(void) {}
>  static inline void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arm_clear_debug(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu) {}
> +static inline int  kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu,
> + struct kvm_run *run) {}
>  
>  int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
>  struct kvm_device_attr *attr);
> diff --git a/arch/arm64/include/asm/kvm_host.h 
> b/arch/arm64/include/asm/kvm_host.h
> index e923b58606e2..fa67d21662f6 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -369,6 +369,7 @@ void kvm_arm_init_debug(void);
>  void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
>  void kvm_arm_clear_debug(struct kvm_vcpu *vcpu);
>  void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu);
> +int  kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu, struct kvm_run *run);
>  int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
>  struct kvm_device_attr *attr);
>  int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
> index dbadfaf850a7..a10a18c55c87 100644
> --- a/arch/arm64/kvm/debug.c
> +++ b/arch/arm64/kvm/debug.c
> @@ -221,3 +221,24 @@ void kvm_arm_clear_debug(struct kvm_vcpu *vcpu)
>   }
>   }
>  }
> +
> +
> +/*
> + * When KVM has successfully emulated the instruction we might want to
> + * return we a KVM_EXIT_DEBUG. We can only do this once the emulation
> + * is complete though so for userspace emulations we have to wait
> + * until we have re-entered KVM.
> + *
> + * Return > 0 to return to guest, 0 (and set exit_reason) on proper
> + * exit to userspace.
> + */
> +
> +int kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu, struct kvm_run *run)
> +{
> + if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) {
> + run->exit_reason = KVM_EXIT_DEBUG;
> + run->debug.arch.hsr = ESR_ELx_EC_SOFTSTP_LOW << 
> ESR_ELx_EC_SHIFT;
> + return 0;
> + }
> + return 1;
> +}
> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> index c918d291cb58..7b04f59217bf 100644
> --- a/arch/arm64/kvm/handle_exit.c
> +++ b/arch/arm64/kvm/handle_exit.c
> @@ -202,13 +202,10 @@ static int handle_trap_exceptions(struct kvm_vcpu 
> *vcpu, struct kvm_run *run)
>   handled = exit_handler(vcpu, run);
>   }
>  
> - if (handled && (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)) {
> - handled = 0;
> - run->exit_reason = KVM_EXIT_DEBUG;
> - run->debug.arch.hsr = ESR_ELx_EC_SOFTSTP_LOW << 
> ESR_ELx_EC_SHIFT;
> - }
> + if (handled)
> + return kvm_arm_maybe_return_debug(vcpu, run);
>  
> - return handled;
> + return 0;
>  }
>  
>  /*
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index b9f68e4add71..3d28fe2daa26 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -623,7 +623,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct 
> kvm_run *run)
>  
>   if (run->exit_reason == KVM_EXIT_MMIO) {
>   ret = kvm_handle_mmio_return(vcpu, vcpu->run);
> - if (ret)
> + if (ret < 1)
>   return ret;
>   }
>  
> diff --git a/virt/kvm/arm/mmio.c b/virt/kvm/arm/mmio.c
> index b6e715fd3c90..e43e3bd6222f 100644
> --- a/virt/kvm/arm/mmio.c
> +++ b/virt/kvm/arm/mmio.c
> @@ -117,7 +117,8 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu, struct 
> kvm_run *run)
>   vcpu_set_reg(vcpu, vcpu->arch.mmio_decode.rt, data);
>   }
>  
> - return 0;
> + /* If debugging in effect we may need to return now */
> + return kvm_arm_maybe_return_debug(vcpu, run);

Ah, that's how you do it. OK. Then the patch splitting is wrong, because
everything is broken after pat

Re: [PATCH v1 1/2] KVM: arm64: handle single-stepping trapped instructions

2017-10-06 Thread Marc Zyngier
On 06/10/17 12:39, Alex Bennée wrote:
> If we are using guest debug to single-step the guest we need to ensure
> we exit after emulating the instruction. This only affects
> instructions completely emulated by the kernel. For userspace emulated
> instructions we need to exit and return to complete the emulation.
> 
> We fake debug.arch.hsr to contain ESR_ELx_EC_SOFTSTP_LOW so QEMU knows
> it was a single-step event (and without altering the userspace ABI).
> 
> Signed-off-by: Alex Bennée 
> ---
>  arch/arm64/kvm/handle_exit.c | 48 
> +++-
>  1 file changed, 34 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> index 7debb74843a0..c918d291cb58 100644
> --- a/arch/arm64/kvm/handle_exit.c
> +++ b/arch/arm64/kvm/handle_exit.c
> @@ -178,6 +178,39 @@ static exit_handle_fn kvm_get_exit_handler(struct 
> kvm_vcpu *vcpu)
>   return arm_exit_handlers[hsr_ec];
>  }
>  
> +/*
> + * When handling traps we need to ensure exit the guest if we
> + * completely emulated the instruction while single-stepping. Stuff to
> + * be emulated in userspace needs to complete that first.
> + */
> +
> +static int handle_trap_exceptions(struct kvm_vcpu *vcpu, struct kvm_run *run)
> +{
> + int handled;
> +
> + /*
> +  * See ARM ARM B1.14.1: "Hyp traps on instructions
> +  * that fail their condition code check"
> +  */
> + if (!kvm_condition_valid(vcpu)) {
> + kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
> + handled = 1;
> + } else {
> + exit_handle_fn exit_handler;
> +
> + exit_handler = kvm_get_exit_handler(vcpu);
> + handled = exit_handler(vcpu, run);
> + }
> +
> + if (handled && (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)) {
> + handled = 0;
> + run->exit_reason = KVM_EXIT_DEBUG;
> + run->debug.arch.hsr = ESR_ELx_EC_SOFTSTP_LOW << 
> ESR_ELx_EC_SHIFT;
> + }

Doesn't this break an MMIO read? The registers haven't been updated yet,
and the debugger may not see the right thing...

How about something like:

if (handled && (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)) {
if (run->exit_reason == KVM_EXIT_MMIO)
kvm_handle_mmio_return(vcpu, run);
[...]
}

Or am I missing something?

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...


[PATCH v1 0/2] KVM: arm64: single step emulation instructions

2017-10-06 Thread Alex Bennée
Hi Julien,

Here are the proposed patches for KVM support. Feel free to merge them
together if you want.

There are two minor wrinkles.

The first is that we fake the HSR_EC value to keep QEMU happy. I don't
think this is a major problem because we aren't dealing with a guest
register, just the mechanism by which guest debug learns what happened.
In this case we did still single-step, even if an exception wasn't
actually involved.

The second is that we have to slightly munge the handling of
kvm_handle_mmio_return's return value so the loop exits properly on
error or 0.
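
For reference on the first point, here is a rough, hypothetical
userspace-side sketch (not taken from QEMU or kvmtool) of how a VMM
would request single-stepping and recognise the faked ESR on the
resulting debug exit. The EC constants are local copies of the values
in the kernel's esr.h, since they are not exported through the uapi
headers:

#include <stdbool.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

#define EC_SHIFT	26	/* ESR_ELx_EC_SHIFT */
#define EC_SOFTSTP_LOW	0x32	/* ESR_ELx_EC_SOFTSTP_LOW */

/* Ask KVM to single-step the vcpu; vcpu_fd is an open vcpu file descriptor. */
static int enable_single_step(int vcpu_fd)
{
	struct kvm_guest_debug dbg = {
		.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_SINGLESTEP,
	};

	return ioctl(vcpu_fd, KVM_SET_GUEST_DEBUG, &dbg);
}

/* After KVM_RUN returns, was this the single-step debug exit faked above? */
static bool is_single_step_exit(struct kvm_run *run)
{
	return run->exit_reason == KVM_EXIT_DEBUG &&
	       (run->debug.arch.hsr >> EC_SHIFT) == EC_SOFTSTP_LOW;
}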

Alex Bennée (2):
  KVM: arm64: handle single-stepping trapped instructions
  kvm: arm64: handle single-step of userspace mmio instructions

 arch/arm/include/asm/kvm_host.h   |  2 ++
 arch/arm64/include/asm/kvm_host.h |  1 +
 arch/arm64/kvm/debug.c| 21 ++
 arch/arm64/kvm/handle_exit.c  | 45 +++
 virt/kvm/arm/arm.c|  2 +-
 virt/kvm/arm/mmio.c   |  3 ++-
 6 files changed, 58 insertions(+), 16 deletions(-)

-- 
2.14.1



[PATCH v1 2/2] kvm: arm64: handle single-step of userspace mmio instructions

2017-10-06 Thread Alex Bennée
The system state of KVM when using userspace emulation is not complete
until we return into KVM_RUN. To handle mmio related updates we wait
until they have been committed and then schedule our KVM_EXIT_DEBUG.

I've introduced a new function kvm_arm_maybe_return_debug() to wrap up
the differences between arm/arm64 which is currently null for arm.

Signed-off-by: Alex Bennée 
---
 arch/arm/include/asm/kvm_host.h   |  2 ++
 arch/arm64/include/asm/kvm_host.h |  1 +
 arch/arm64/kvm/debug.c| 21 +
 arch/arm64/kvm/handle_exit.c  |  9 +++--
 virt/kvm/arm/arm.c|  2 +-
 virt/kvm/arm/mmio.c   |  3 ++-
 6 files changed, 30 insertions(+), 8 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 4a879f6ff13b..aec943f6d123 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -285,6 +285,8 @@ static inline void kvm_arm_init_debug(void) {}
 static inline void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arm_clear_debug(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu) {}
+static inline int  kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu,
+   struct kvm_run *run) {}
 
 int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
   struct kvm_device_attr *attr);
diff --git a/arch/arm64/include/asm/kvm_host.h 
b/arch/arm64/include/asm/kvm_host.h
index e923b58606e2..fa67d21662f6 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -369,6 +369,7 @@ void kvm_arm_init_debug(void);
 void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
 void kvm_arm_clear_debug(struct kvm_vcpu *vcpu);
 void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu);
+int  kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu, struct kvm_run *run);
 int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
   struct kvm_device_attr *attr);
 int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
index dbadfaf850a7..a10a18c55c87 100644
--- a/arch/arm64/kvm/debug.c
+++ b/arch/arm64/kvm/debug.c
@@ -221,3 +221,24 @@ void kvm_arm_clear_debug(struct kvm_vcpu *vcpu)
}
}
 }
+
+
+/*
+ * When KVM has successfully emulated the instruction we might want to
+ * return we a KVM_EXIT_DEBUG. We can only do this once the emulation
+ * is complete though so for userspace emulations we have to wait
+ * until we have re-entered KVM.
+ *
+ * Return > 0 to return to guest, 0 (and set exit_reason) on proper
+ * exit to userspace.
+ */
+
+int kvm_arm_maybe_return_debug(struct kvm_vcpu *vcpu, struct kvm_run *run)
+{
+   if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) {
+   run->exit_reason = KVM_EXIT_DEBUG;
+   run->debug.arch.hsr = ESR_ELx_EC_SOFTSTP_LOW << 
ESR_ELx_EC_SHIFT;
+   return 0;
+   }
+   return 1;
+}
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index c918d291cb58..7b04f59217bf 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -202,13 +202,10 @@ static int handle_trap_exceptions(struct kvm_vcpu *vcpu, 
struct kvm_run *run)
handled = exit_handler(vcpu, run);
}
 
-   if (handled && (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)) {
-   handled = 0;
-   run->exit_reason = KVM_EXIT_DEBUG;
-   run->debug.arch.hsr = ESR_ELx_EC_SOFTSTP_LOW << 
ESR_ELx_EC_SHIFT;
-   }
+   if (handled)
+   return kvm_arm_maybe_return_debug(vcpu, run);
 
-   return handled;
+   return 0;
 }
 
 /*
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index b9f68e4add71..3d28fe2daa26 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -623,7 +623,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct 
kvm_run *run)
 
if (run->exit_reason == KVM_EXIT_MMIO) {
ret = kvm_handle_mmio_return(vcpu, vcpu->run);
-   if (ret)
+   if (ret < 1)
return ret;
}
 
diff --git a/virt/kvm/arm/mmio.c b/virt/kvm/arm/mmio.c
index b6e715fd3c90..e43e3bd6222f 100644
--- a/virt/kvm/arm/mmio.c
+++ b/virt/kvm/arm/mmio.c
@@ -117,7 +117,8 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu, struct 
kvm_run *run)
vcpu_set_reg(vcpu, vcpu->arch.mmio_decode.rt, data);
}
 
-   return 0;
+   /* If debugging in effect we may need to return now */
+   return kvm_arm_maybe_return_debug(vcpu, run);
 }
 
 static int decode_hsr(struct kvm_vcpu *vcpu, bool *is_write, int *len)
-- 
2.14.1
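
To illustrate the ordering the patch above relies on, here is a
hypothetical VMM run-loop fragment (the helper names are invented for
the example, not real QEMU/kvmtool functions): the MMIO read is
completed by userspace first, and only on the KVM_RUN re-entry that
commits the data does the single-step KVM_EXIT_DEBUG come back.

#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Stand-in for the VMM's device model: pretend the device reads as zero. */
static void emulate_mmio_read(__u64 addr, void *data, __u32 len)
{
	(void)addr;
	memset(data, 0, len);
}

/* Stand-in for the VMM's debugger plumbing. */
static void report_single_step(__u32 hsr)
{
	(void)hsr;
}

static void run_vcpu(int vcpu_fd, struct kvm_run *run)
{
	for (;;) {
		if (ioctl(vcpu_fd, KVM_RUN, 0) < 0)
			return;

		switch (run->exit_reason) {
		case KVM_EXIT_MMIO:
			if (!run->mmio.is_write) {
				/*
				 * Fill in the data; it only reaches the guest
				 * register once we re-enter KVM_RUN.
				 */
				emulate_mmio_read(run->mmio.phys_addr,
						  run->mmio.data,
						  run->mmio.len);
			}
			break;
		case KVM_EXIT_DEBUG:
			/*
			 * With single-step enabled this arrives on the
			 * re-entry that committed the MMIO data, so the
			 * debugger sees up-to-date registers.
			 */
			report_single_step(run->debug.arch.hsr);
			break;
		default:
			return;
		}
	}
}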



[PATCH v1 1/2] KVM: arm64: handle single-stepping trapped instructions

2017-10-06 Thread Alex Bennée
If we are using guest debug to single-step the guest we need to ensure
we exit after emulating the instruction. This only affects
instructions completely emulated by the kernel. For userspace emulated
instructions we need to exit and return to complete the emulation.

We fake debug.arch.hsr to contain ESR_ELx_EC_SOFTSTP_LOW so QEMU knows
it was a single-step event (and without altering the userspace ABI).

Signed-off-by: Alex Bennée 
---
 arch/arm64/kvm/handle_exit.c | 48 +++-
 1 file changed, 34 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 7debb74843a0..c918d291cb58 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -178,6 +178,39 @@ static exit_handle_fn kvm_get_exit_handler(struct kvm_vcpu 
*vcpu)
return arm_exit_handlers[hsr_ec];
 }
 
+/*
+ * When handling traps we need to ensure exit the guest if we
+ * completely emulated the instruction while single-stepping. Stuff to
+ * be emulated in userspace needs to complete that first.
+ */
+
+static int handle_trap_exceptions(struct kvm_vcpu *vcpu, struct kvm_run *run)
+{
+   int handled;
+
+   /*
+* See ARM ARM B1.14.1: "Hyp traps on instructions
+* that fail their condition code check"
+*/
+   if (!kvm_condition_valid(vcpu)) {
+   kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
+   handled = 1;
+   } else {
+   exit_handle_fn exit_handler;
+
+   exit_handler = kvm_get_exit_handler(vcpu);
+   handled = exit_handler(vcpu, run);
+   }
+
+   if (handled && (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)) {
+   handled = 0;
+   run->exit_reason = KVM_EXIT_DEBUG;
+   run->debug.arch.hsr = ESR_ELx_EC_SOFTSTP_LOW << 
ESR_ELx_EC_SHIFT;
+   }
+
+   return handled;
+}
+
 /*
  * Return > 0 to return to guest, < 0 on error, 0 (and set exit_reason) on
  * proper exit to userspace.
@@ -185,8 +218,6 @@ static exit_handle_fn kvm_get_exit_handler(struct kvm_vcpu 
*vcpu)
 int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
   int exception_index)
 {
-   exit_handle_fn exit_handler;
-
if (ARM_SERROR_PENDING(exception_index)) {
u8 hsr_ec = ESR_ELx_EC(kvm_vcpu_get_hsr(vcpu));
 
@@ -214,18 +245,7 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
kvm_inject_vabt(vcpu);
return 1;
case ARM_EXCEPTION_TRAP:
-   /*
-* See ARM ARM B1.14.1: "Hyp traps on instructions
-* that fail their condition code check"
-*/
-   if (!kvm_condition_valid(vcpu)) {
-   kvm_skip_instr(vcpu, kvm_vcpu_trap_il_is32bit(vcpu));
-   return 1;
-   }
-
-   exit_handler = kvm_get_exit_handler(vcpu);
-
-   return exit_handler(vcpu, run);
+   return handle_trap_exceptions(vcpu, run);
case ARM_EXCEPTION_HYP_GONE:
/*
 * EL2 has been reset to the hyp-stub. This happens when a guest
-- 
2.14.1
