RE: [Qemu-devel] live migration vs device assignment (motivation)

2015-12-28 Thread Pavel Fedin
 Hello!

> A dedicated IRQ per device for something that is a system wide event
> sounds like a waste.  I don't understand why a spec change is strictly
> required, we only need to support this with the specific virtual bridge
> used by QEMU, so I think that a vendor specific capability will do.
> Once this works well in the field, a PCI spec ECN might make sense
> to standardise the capability.

 I have been following your discussion for some time and decided to jump in...
 So far, we want to have some kind of mailbox to notify the guest about
migration. So what about a dedicated "PCI device" for
this purpose? Some kind of "migration controller". This:
a) is perhaps easier to implement than a capability, since we don't need to push anything
to the PCI spec.
b) could easily be made friendly to Windows, because no bus
code has to be touched at all. It would rely only on
the drivers' ability to communicate with each other (I guess that should be possible
in Windows, shouldn't it?)
c) does not need to steal resources (BARs, IRQs, etc.) from the actual devices.
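
 To make the idea a bit more concrete, here is a minimal sketch of how a guest
driver for such a "migration controller" might fan an event out to other
drivers. Everything in it is hypothetical: the register layout, the event codes
and the migctl_* names are invented for illustration and are not defined by
QEMU or by the PCI spec. The device registers are modelled as plain memory so
the sketch stays self-contained.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register block of a "migration controller" device.
 * On real hardware it would sit behind a BAR; here it is plain memory. */
struct migctl_regs {
	uint32_t event;	/* written by the host: pending event code */
	uint32_t ack;	/* written by the guest: last event it handled */
};

enum {
	MIGCTL_EV_NONE    = 0,
	MIGCTL_EV_PREPARE = 1,	/* migration is about to start */
	MIGCTL_EV_DONE    = 2,	/* guest now runs on the destination */
};

/* Callback through which another driver learns about migration events. */
typedef void (*migctl_listener)(uint32_t event, void *opaque);

#define MIGCTL_MAX_LISTENERS 8

struct migctl {
	volatile struct migctl_regs *regs;
	migctl_listener listeners[MIGCTL_MAX_LISTENERS];
	void *opaque[MIGCTL_MAX_LISTENERS];
	unsigned int nr_listeners;
};

/* Drivers of assigned devices subscribe here; no bus code is involved. */
bool migctl_subscribe(struct migctl *mc, migctl_listener fn, void *opaque)
{
	if (mc->nr_listeners == MIGCTL_MAX_LISTENERS)
		return false;
	mc->listeners[mc->nr_listeners] = fn;
	mc->opaque[mc->nr_listeners] = opaque;
	mc->nr_listeners++;
	return true;
}

/* Interrupt handler body: notify every listener, then acknowledge. */
void migctl_handle_irq(struct migctl *mc)
{
	uint32_t ev = mc->regs->event;

	if (ev == MIGCTL_EV_NONE)
		return;
	for (unsigned int i = 0; i < mc->nr_listeners; i++)
		mc->listeners[i](ev, mc->opaque[i]);
	mc->regs->ack = ev;	/* tells the host that all listeners have run */
}

/* Example listener: records the last event it was told about. */
void migctl_record_event(uint32_t event, void *opaque)
{
	*(uint32_t *)opaque = event;
}
```

 The only interrupt and the only registers belong to the controller itself, so
the assigned devices keep all of their own BARs and IRQs, which is point c).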

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [RFC PATCH 2/5] KVM: add KVM_EXIT_MSR exit reason and capability.

2015-12-22 Thread Pavel Fedin
 Hello!

> > 1. Is there any real need to distinguish between KVM_EXIT_MSR_WRITE and
> KVM_EXIT_MSR_AFTER_WRITE ? IMHO from userland's point of view these are the 
> same.
> 
> Indeed.  Perhaps the kernel can set .handled to true to let userspace
> know it already took care of it, instead of introducing yet another
> exit_reason.  The field would need to be marked in/out, then.

 I'm not sure you need even this. Anyway, particular MSRs are
function-specific, and if you're emulating an MSR in userspace,
then, I believe, you know the function behind it. And it's IMHO safe to just
know that SynIC MSRs have some extra handling in the
kernel. And I believe this has no direct impact on userland's behavior.
 But you know the details better.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




RE: [RFC PATCH 2/5] KVM: add KVM_EXIT_MSR exit reason and capability.

2015-12-22 Thread Pavel Fedin
 Hello!

> It has: unlike the scenario that was the original motivation for Peter's
> patches, where the userspace wanted to handle register accesses
> which the kernel *didn't*, in case of SynIC the userspace wants to do
> something about MSR accesses *only* if the kernel *also* handles them.

 Well... I believe, that qemu knows if we are instantiating SynIC. And, if we 
are, it knows that the kernel will do something about
it. Otherwise these registers don't exist, and, by the way, the guest is not 
expected to touch them, is it?

> I guess that was the reason why Paolo suggested an extra exit_reason,
> and I think .handled field can be used to pass that information instead.

[skip]

> But the proposed use of .handled costs basically nothing, and it may
> prove useful in general (as a consistency proof, if anything).

 Well... Maybe... So, I'm OK with it.
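
 A minimal sketch of how userspace might consume an in/out .handled field,
including the consistency check mentioned above. The struct is only a local
mirror of the 'msr' member from the RFC, treating .handled as in/out as
suggested in this thread. MSR_USER_ONLY, struct vm_state and on_msr_write() are
hypothetical names made up for this example, not part of any existing API.

```c
#include <stdint.h>

#define KVM_EXIT_MSR_UNHANDLED 0
#define KVM_EXIT_MSR_HANDLED   1

/* Local mirror of the 'msr' member from the RFC, for illustration only. */
struct msr_exit {
	uint32_t index;
	uint64_t data;
	uint64_t type;
	uint8_t  handled;	/* treated as in/out in this sketch */
};

#define MSR_HV_SCONTROL 0x40000080u	/* HV_X64_MSR_SCONTROL */
#define MSR_USER_ONLY   0x4b4d0001u	/* hypothetical, emulated only in userspace */

/* Hypothetical userspace state. */
struct vm_state {
	uint64_t scontrol_shadow;
	uint64_t user_msr;
};

enum msr_result { MSR_OK, MSR_UNKNOWN, MSR_INCONSISTENT };

enum msr_result on_msr_write(struct msr_exit *e, struct vm_state *vm)
{
	switch (e->index) {
	case MSR_HV_SCONTROL:
		/* Notification only: the kernel must already have emulated it. */
		if (e->handled != KVM_EXIT_MSR_HANDLED)
			return MSR_INCONSISTENT;
		vm->scontrol_shadow = e->data;
		return MSR_OK;
	case MSR_USER_ONLY:
		/* Emulated purely in userspace, so report it as handled. */
		vm->user_msr = e->data;
		e->handled = KVM_EXIT_MSR_HANDLED;
		return MSR_OK;
	default:
		/* Unknown to both sides: leave it unhandled. */
		e->handled = KVM_EXIT_MSR_UNHANDLED;
		return MSR_UNKNOWN;
	}
}
```

 With this shape a single exit reason is enough: the value of .handled on entry
is what distinguishes "the kernel already did it" from "please emulate this".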

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




RE: [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit

2015-12-21 Thread Pavel Fedin
 Hello!

> Yes, we can use KVM_EXIT_REG_IO/MSR_IO for Hyper-V SynIC MSR changes
> and can even use only one MSR value. So the union inside struct
> kvm_hyperv_exit is excessive.
> 
> But we still need Vcpu exit to handle VMBus hypercalls by QEMU to
> emulate VMBus devices inside QEMU.
> 
> And currently we are going to extend struct kvm_hyperv_exit
> to store Hyper-V VMBus hypercall parameters.

 Hm... Hypercalls, you say?
 We already have KVM_EXIT_HYPERCALL. Documentation says it's currently unused. 
Is it a leftover from ia64 KVM? Could we reuse it for
the purpose?

> but could we replace Hyper-V VMBus hypercall and its parameters
> by KVM_EXIT_REG_IO/MSR_IO too?

 It depends. Can i read about these hypercalls somewhere? Is there any 
documentation?

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




RE: [RFC PATCH 2/5] KVM: add KVM_EXIT_MSR exit reason and capability.

2015-12-21 Thread Pavel Fedin
or vmx. Also, note that if the 
> kvm
> +module's ignore_msrs flag is set then KVM_EXIT_MSR_READ and 
> KVM_EXIT_MSR_WRITE
> +will not be generated, and unhandled MSR accesses will simply be ignored and
> +the guest re-entered immediately.
> +
> 
>  8. Other capabilities.
>  --
> @@ -3726,3 +3783,11 @@ In order to use SynIC, it has to be activated
> by setting this
>  capability via KVM_ENABLE_CAP ioctl on the vcpu fd. Note that this
>  will disable the use of APIC hardware virtualization even if supported
>  by the CPU, as it's incompatible with SynIC auto-EOI behavior.
> +
> +8.3 KVM_CAP_MSR_EXITS
> +
> +Architectures: x86 (vmx-only)
> +
> +This capability, if KVM_CHECK_EXTENSION indicates that it is available, means
> +that the kernel implements the KVM_CAP_ENABLE_MSR_EXITS and
> +KVM_CAP_DISABLE_MSR_EXITS capabilities for VMs.
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 6e32f7599081..431fd1ec0d06 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -199,6 +199,9 @@ struct kvm_hyperv_exit {
>  #define KVM_EXIT_S390_STSI        25
>  #define KVM_EXIT_IOAPIC_EOI       26
>  #define KVM_EXIT_HYPERV           27
> +#define KVM_EXIT_MSR_READ         28
> +#define KVM_EXIT_MSR_WRITE        29
> +#define KVM_EXIT_MSR_AFTER_WRITE  30
> 
>  /* For KVM_EXIT_INTERNAL_ERROR */
>  /* Emulate instruction failed. */
> @@ -355,6 +358,18 @@ struct kvm_run {
>   } eoi;
>   /* KVM_EXIT_HYPERV */
>   struct kvm_hyperv_exit hyperv;
> + /*
> + * KVM_EXIT_MSR_READ, KVM_EXIT_MSR_WRITE,
> + * KVM_EXIT_MSR_AFTER_WRITE
> + */
> + struct {
> +     __u32 index;    /* i.e. ecx; out */
> +     __u64 data;     /* out (wrmsr) / in (rdmsr) */
> +     __u64 type;     /* out */
> +#define KVM_EXIT_MSR_UNHANDLED 0
> +#define KVM_EXIT_MSR_HANDLED   1
> +     __u8 handled;   /* in */
> + } msr;
>   /* Fix the size of the union. */
>   char padding[256];
>   };
> @@ -849,6 +864,9 @@ struct kvm_ppc_smmu_info {
>  #define KVM_CAP_SPLIT_IRQCHIP 121
>  #define KVM_CAP_IOEVENTFD_ANY_LENGTH 122
>  #define KVM_CAP_HYPERV_SYNIC 123
> +#define KVM_CAP_MSR_EXITS 124
> +#define KVM_CAP_DISABLE_MSR_EXITS 125
> +#define KVM_CAP_ENABLE_MSR_EXITS 126
> 
>  #ifdef KVM_CAP_IRQ_ROUTING

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




RE: [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit

2015-12-21 Thread Pavel Fedin
 Hello!

> >   It depends. Can i read about these hypercalls somewhere? Is there any 
> > documentation?
> I don't know about a documentation, but you can look at the code of
> Hyper-V hypercall handling inside KVM:
> 
> https://github.com/torvalds/linux/blob/master/arch/x86/kvm/hyperv.c#L346

 Aha, I see, so the vmmcall CPU instruction is employed. Well, I believe this fits
very well into the semantics of KVM_EXIT_HYPERCALL,
because it's a true hypercall.

> The code simply decodes hypercall parameters from vcpu registers then
> handle hypercall code in switch and encode return code inside vcpu
> registers. Probably encode and decode of hypercall parameters/return
> code can be done in QEMU so we need only some exit with parameter that
> this is Hyper-V hypercall and probably KVM_EXIT_HYPERCALL is good for it.

 Or you could even reuse the whole structure, it has all you need:

	__u64 nr;       /* Reserved for x86, other architectures can use it, for example ARM "hvc #nr" */
	__u64 args[6];  /* rax, rbx, rcx, rdx, rdi, rsi */
	__u64 ret;
	__u32 longmode; /* longmode; other architectures (like ARM64) can also make sense of it */

 Or you could put in struct kvm_regs instead of args and ret, and allow the 
userspace to manipulate it.
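
 A minimal sketch of what a userspace handler for such an exit might look
like. The struct is a local mirror of the 'hypercall' member of struct
kvm_run. The call numbers, the status code and handle_hypercall() are
hypothetical and only there for illustration; in particular, dispatching on
'nr' and truncating the result outside long mode are assumptions of this
sketch, not anything Hyper-V or KVM defines.

```c
#include <stdint.h>

/* Local mirror of the 'hypercall' member of struct kvm_run. */
struct hypercall_exit {
	uint64_t nr;
	uint64_t args[6];
	uint64_t ret;
	uint32_t longmode;
	uint32_t pad;
};

/* Hypothetical call numbers and status code, for illustration only. */
#define HC_NR_ECHO        1	/* returns args[0] */
#define HC_NR_ADD         2	/* returns args[0] + args[1] */
#define HC_STATUS_INVALID UINT64_MAX

/* Fill in hc->ret; the kernel would copy it back to the guest on re-entry. */
void handle_hypercall(struct hypercall_exit *hc)
{
	uint64_t ret;

	switch (hc->nr) {
	case HC_NR_ECHO:
		ret = hc->args[0];
		break;
	case HC_NR_ADD:
		ret = hc->args[0] + hc->args[1];
		break;
	default:
		hc->ret = HC_STATUS_INVALID;
		return;
	}

	/* Assumption: outside long mode the guest sees only 32 bits. */
	if (!hc->longmode)
		ret &= 0xffffffffULL;
	hc->ret = ret;
}
```

 All of the decoding and encoding of parameters stays in userspace here; the
kernel only has to fill the structure on exit and read 'ret' on re-entry.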

> But KVM_EXIT_HYPERCALL is not used inside KVM/QEMU so requires
> implementation.

 I guess your hypercalls to be introduced using KVM_EXIT_HYPERV are also not 
used inside qemu so require implementation :)

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




RE: [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit

2015-12-20 Thread Pavel Fedin
 Hello!

 Replying to everything in one message.

> > As far as i understand this code, KVM_EXIT_HYPERV is called when one
> > of three MSRs are accessed. But, shouldn't we have implemented
> > instead something more generic, like KVM_EXIT_REG_IO, which would
> > work similar to KVM_EXIT_PIO or KVM_EXIT_MMIO, but carry register
> > code and value?
> 
> Yes, we considered that.  There were actually patches for this as well.

 Ah, i missed them, what a pity. There are lots of patches, i don't review them 
all. Actually i have noticed the change only after
it appeared in linux-next.

>  However, in this case the register is still emulated in the kernel, and
> userspace just gets informed of the new value.

 I see, but this doesn't change the semantics. All we need to do is tell the
userland that "a register has been written".
In addition to this, we could do whatever we want, including caching the data
in the kernel, using it in the kernel, and processing reads in the
kernel.

> If we do get that, we will just rename KVM_EXIT_HYPERV to
> KVM_EXIT_MSR_ACCESS, and KVM_EXIT_HYPERV_SYNIC to
> KVM_EXIT_MSR_HYPERV_SYNIC, and struct kvm_hyperv_exit to kvm_msr_exit.

 Actually, I see this in a more generic way, something like:

	/* KVM_EXIT_REG_ACCESS */
	struct {
		__u64 reg;
		__u64 data;
		__u8  is_write;
	} mmio;
 
 'data' and 'is_write' are self-explanatory; 'reg' would be a generalized
register code, the same as used for KVM_(GET|SET)_ONE_REG:
 - for ARM64: ARM64_SYS_REG(op0, op1, crn, crm, op2) - see
http://lxr.free-electrons.com/source/arch/arm64/include/uapi/asm/kvm.h#L189
 - for x86  : to be defined (I know, we don't use ..._ONE_REG operations here
yet), like X86_MSR_REG(id), where the macro itself is:

#define X86_MSR_REG(id) (KVM_REG_X86 | KVM_REG_X86_MSR | \
			 KVM_REG_SIZE_U64 | (id))

 - for other architectures: to be defined in a similar way, once needed.
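
 A minimal sketch of how such an encoding might work out in practice. The
KVM_REG_ARCH_MASK, KVM_REG_X86 and KVM_REG_SIZE_U64 values are the ones from
include/uapi/linux/kvm.h. KVM_REG_X86_MSR does not exist in the uapi headers;
its bit position here is picked arbitrarily (just above the 32-bit MSR index),
and the two helper functions are equally hypothetical.

```c
#include <stdint.h>

/* Values as found in include/uapi/linux/kvm.h. */
#define KVM_REG_ARCH_MASK 0xff00000000000000ULL
#define KVM_REG_X86       0x2000000000000000ULL
#define KVM_REG_SIZE_U64  0x0030000000000000ULL

/* Hypothetical register class; the bit position is an arbitrary choice. */
#define KVM_REG_X86_MSR   (1ULL << 32)

#define X86_MSR_REG(id) \
	(KVM_REG_X86 | KVM_REG_X86_MSR | KVM_REG_SIZE_U64 | \
	 (uint64_t)(uint32_t)(id))

/* Nonzero if 'reg' names an x86 MSR in this encoding. */
int reg_is_x86_msr(uint64_t reg)
{
	return (reg & KVM_REG_ARCH_MASK) == KVM_REG_X86 &&
	       (reg & KVM_REG_X86_MSR) != 0;
}

/* Recover the MSR index (what the guest had in ecx). */
uint32_t reg_msr_index(uint64_t reg)
{
	return (uint32_t)reg;
}
```

 Userspace would then compare the 'reg' field of the exit against
X86_MSR_REG(...) values exactly as it would against ARM64_SYS_REG(...) on
ARM64.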

> On brief inspection of Andrey's patch (I have not been following
> closely) it looks like the kvm_hyperv_exit struct that's returned to
> userspace contains more data (control, evt_page, and msg_page fields)
> than simply the value of the MSR, so would the desired SynIC exit fit
> into a general-purpose exit for MSR emulation?

 I have looked at the code too, and these three fields are nothing more than
the values of the respective MSRs:

	case HV_X64_MSR_SCONTROL:
		synic->control = data;
		if (!host)
			synic_exit(synic, msr);
		break;



	case HV_X64_MSR_SIEFP:
		if (data & HV_SYNIC_SIEFP_ENABLE)
			if (kvm_clear_guest(vcpu->kvm,
					    data & PAGE_MASK, PAGE_SIZE)) {
				ret = 1;
				break;
			}
		synic->evt_page = data;
		if (!host)
			synic_exit(synic, msr);
		break;
	case HV_X64_MSR_SIMP:
		if (data & HV_SYNIC_SIMP_ENABLE)
			if (kvm_clear_guest(vcpu->kvm,
					    data & PAGE_MASK, PAGE_SIZE)) {
				ret = 1;
				break;
			}
		synic->msg_page = data;
		if (!host)
			synic_exit(synic, msr);
		break;

 So, every time one of these three MSRs is written, we get a vmexit with the values
of all three registers, and that's all. We could
easily have 'synic_exit(synic, msr, data)' in all three cases, and I think the
userspace could easily deal with the proposed
KVM_EXIT_REG_ACCESS, just caching these values internally if needed.
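
 A minimal sketch of that userspace-side caching. The exit payload follows the
struct proposed above; for simplicity 'reg' carries the bare MSR index here,
which is an assumption of this sketch. struct synic_cache and
synic_cache_update() are hypothetical names.

```c
#include <stdint.h>

/* Hyper-V SynIC MSR numbers. */
#define HV_X64_MSR_SCONTROL 0x40000080u
#define HV_X64_MSR_SIEFP    0x40000082u
#define HV_X64_MSR_SIMP     0x40000083u

/* Exit payload as proposed above; 'reg' holds the bare MSR index here. */
struct reg_access_exit {
	uint64_t reg;
	uint64_t data;
	uint8_t  is_write;
};

/* Userspace copy of the three values that kvm_hyperv_exit carries today. */
struct synic_cache {
	uint64_t control;
	uint64_t evt_page;
	uint64_t msg_page;
};

/* Returns 1 if the exit updated the cache, 0 if it was not a SynIC write. */
int synic_cache_update(struct synic_cache *c, const struct reg_access_exit *e)
{
	if (!e->is_write)
		return 0;

	switch (e->reg) {
	case HV_X64_MSR_SCONTROL:
		c->control = e->data;
		return 1;
	case HV_X64_MSR_SIEFP:
		c->evt_page = e->data;
		return 1;
	case HV_X64_MSR_SIMP:
		c->msg_page = e->data;
		return 1;
	default:
		return 0;
	}
}
```

 Each exit carries one register, and the other two values are simply whatever
userspace remembered from earlier exits.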

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




RE: [PATCH v6] arm/arm64: KVM: Detect vGIC presence at runtime

2015-12-18 Thread Pavel Fedin
 Hello!

> > This patch does not touch any virtual timer code, suggesting that timer
> > hardware is actually in place. Normally on boards in question it is true,
> > however since vGIC is missing, it is impossible to correctly utilize
> > interrupts from the virtual timer. Since virtual timer handling is in
> active redevelopment now, handling it in userspace is out of scope at
> > the moment. The guest is currently suggested to use some memory-mapped
> > timer which can be emulated in userspace.
> 
> Not sure I understand this paragraph.  Either drop it or just say "The
> architected timers are not supported without the in-kernel vGIC."

 Ok, I'll repost with a changed message. But, just to let you know, with this
(http://www.spinics.net/lists/kvm/msg124539.html) the
notice about the architected timer loses its relevance entirely. So, I'll just drop
it.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




RE: [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit

2015-12-18 Thread Pavel Fedin
 Hello!

 I realize that it's perhaps too late, because the patches are already in
linux-next, but I have one concern... Maybe it's not too
late...

 I dislike implementing architecture-dependent exit code where we could 
implement an architecture-independent one.

 As far as I understand this code, KVM_EXIT_HYPERV is raised when one of three
MSRs is accessed. But shouldn't we have implemented
something more generic instead, like KVM_EXIT_REG_IO, which would work similarly
to KVM_EXIT_PIO or KVM_EXIT_MMIO, but carry a register
code and value?

 This would allow us to solve the same task which we have done here, but this
solution would be reusable for other devices and other
architectures. What if in future we have more system registers to emulate in
userspace?

 I write this because at one point I suggested a similar thing for ARM64 (but I
never actually wrote it), to emulate the physical CP15
timer. And it would require exactly the same capability - processing some trapped
system register accesses in userspace.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




RE: [PATCH v4 0/4] KVM: arm64: BUG FIX: Correctly handle zero register transfers

2015-12-07 Thread Pavel Fedin
 Hello!

> I messed up the "load into xzr" test royally in the last attached patch.
> It was quite wrong.

 Yes, because "mov %0, xzr" is not trapped.

> I have now tested
> 
>  asm volatile(
>  "str %3, [%1]\n\t"
>  "ldr wzr, [%1]\n\t"
>  "str wzr, [%2]\n\t"
>  "ldr %0, [%2]\n\t"
>  :"=r"(val):"r"(addr), "r"(addr2), "r"(0x):"memory");
> report("mmio: 'ldr wzr' check: read 0x%08lx", val != 0x, val);
> 
> which passes

 I guess i forgot to mention that both addr and addr2 have to be MMIO 
registers. If they are plain memory, then of course everything
will work because they are not trapped.

> Anyway, I
> probably won't clean this test up and post it. I don't think we really
> need to add it as a regression test, unless others disagree and would
> like to see it added.

 Considering how difficult it was to find this problem, and how tricky and
non-obvious it is, I would ask you to add this test. Especially
considering you've already written it. At least it will serve as a reminder
about the problem.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




RE: [PATCH v4 0/4] KVM: arm64: BUG FIX: Correctly handle zero register transfers

2015-12-07 Thread Pavel Fedin
 Hello!

> But, if Pavel doesn't
> mind trying them out on his system, then it'd be good to know if they
> reproduce there. I'd like to find out if it's a test case problem or
> something else strange going on with environments.

Does not build, applied to master:
--- cut ---
aarch64-unknown-linux-gnu-gcc  -std=gnu99 -ffreestanding -Wextra -O2 -I lib -I 
lib/libfdt -g -MMD -MF arm/.xzr-test.d -Wall
-fomit-frame-pointer  -fno-stack-protector -c -o arm/xzr-test.o 
arm/xzr-test.c
arm/xzr-test.c: In function 'check_xzr_sysreg':
arm/xzr-test.c:13:2: warning: implicit declaration of function 'mmu_disable' 
[-Wimplicit-function-declaration]
  mmu_disable(); /* Tell KVM to set HCR_TVM for this VCPU */
  ^
aarch64-unknown-linux-gnu-gcc  -std=gnu99 -ffreestanding -Wextra -O2 -I lib -I 
lib/libfdt -g -MMD -MF arm/.xzr-test.d -Wall
-fomit-frame-pointer  -fno-stack-protector   -nostdlib -o arm/xzr-test.elf \
-Wl,-T,arm/flat.lds,--build-id=none,-Ttext=4008 \
arm/xzr-test.o arm/cstart64.o lib/libcflat.a lib/libfdt/libfdt.a 
/usr/lib/gcc/aarch64-unknown-linux-gnu/4.9.0/libgcc.a
lib/arm/libeabi.a
arm/xzr-test.o: In function `check_xzr_sysreg':
/cygdrive/d/Projects/kvm-unit-tests/arm/xzr-test.c:13: undefined reference to 
`mmu_disable'
--- cut ---

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




RE: [PATCH v4 0/4] KVM: arm64: BUG FIX: Correctly handle zero register transfers

2015-12-07 Thread Pavel Fedin
 Hello!

> FYI, I tried writing test cases for this issue with kvm-unit-tests. The
> issue didn't reproduce for me. It's quite possible my test cases are
> flawed, so I'm not making any claims about the validity of the series

 This is indeed very interesting, so i'll take a look at it.
 For now I've only taken a quick glance at the code, and I have at least
one suggestion. Could you happen to have sp == 0 in
check_xzr_sysreg()? In this case it will magically work.
 Also, you could try to write a test which tries to overwrite xzr. Something 
like:

volatile int *addr1;
volatile int *addr2;

asm volatile("str %3, [%1]\n\t"
	     "ldr wzr, [%1]\n\t"
	     "str wzr, [%2]\n\t"
	     "ldr %0, [%2]\n\t"
	     :"=r"(res):"r"(addr1), "r"(addr2), "r"(some_nonzero_val):"memory");

 Then check for res == some_nonzero_val. If they are equal, you've got the bug 
:)
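
 Why a nonzero result or a zero stack pointer matters can be shown with a small
model of the register lookup. This is only a sketch under a simplifying
assumption: the saved guest state is modelled as 32 slots, with slot 31 being
where the stack pointer is stored, so that using the Rt field of a trapped
load/store as a plain index picks up the stack pointer. The names below are
hypothetical and are not the kernel's accessors.

```c
#include <stdint.h>

/* Simplified model of saved guest state: slots 0..30 are x0..x30 and
 * slot 31 is where the stack pointer happens to be stored. */
struct guest_regs {
	uint64_t slot[32];
};

/* Buggy accessor: uses the Rt field of a trapped load/store as a plain
 * index, so Rt == 31 returns the stack pointer. */
uint64_t reg_read_buggy(const struct guest_regs *r, unsigned int rt)
{
	return r->slot[rt & 31];
}

/* Fixed accessors: Rt == 31 means xzr/wzr, which reads as zero and
 * discards writes. */
uint64_t reg_read_fixed(const struct guest_regs *r, unsigned int rt)
{
	return rt == 31 ? 0 : r->slot[rt & 31];
}

void reg_write_fixed(struct guest_regs *r, unsigned int rt, uint64_t val)
{
	if (rt != 31)
		r->slot[rt & 31] = val;
}

/* Value a device would see for a trapped 32-bit "str wzr, [mmio]". */
uint32_t emulate_str_w(const struct guest_regs *r, unsigned int rt,
		       uint64_t (*read_reg)(const struct guest_regs *,
					    unsigned int))
{
	return (uint32_t)read_reg(r, rt);
}
```

 With a zero stack pointer both accessors return 0 and the bug is invisible,
which is why a test has to make sure that slot holds something nonzero.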

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




RE: [PATCH v4 0/4] KVM: arm64: BUG FIX: Correctly handle zero register transfers

2015-12-07 Thread Pavel Fedin
 Hello!

> FYI, I tried writing test cases for this issue with kvm-unit-tests. The
> issue didn't reproduce for me. It's quite possible my test cases are
> flawed

 Indeed they are, a very little thing fell through again... :)
 It's not just SP, it's SP_EL0. And you never initialize it to anything because 
your code always runs in kernel mode, so it's just
zero, so you get your zero.
 But if you add a little thing in the beginning of your main():

asm volatile("msr sp_el0, %0" : : "r" (0xDEADC0DE0BADC0DE));

 then you have it:
--- cut ---
[root@thunderx-2 kvm-unit-tests]# ./arm-run arm/xzr-test.flat -smp 2
qemu-system-aarch64 -machine virt,accel=kvm:tcg,gic-version=host -cpu host 
-device virtio-serial-device -device
virtconsole,chardev=ctd -chardev testdev,id=ctd -display none -serial stdio 
-kernel arm/xzr-test.flat -smp 2
PASS: mmio: sanity check: read 0x
FAIL: mmio: 'str wzr' check: read 0x0badc0de
vm_setup_vq: virtqueue 0 already setup! base=0xa003e00
chr_testdev_init: chr-testdev: can't init virtqueues
--- cut ---

 Here i run only MMIO test, because i could not compile sysreg one, so i simply 
commented it out.

 P.S. Could you also apply something like the following to arm/run:
--- cut ---
arm/run | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arm/run b/arm/run
index 662a856..3890c8c 100755
--- a/arm/run
+++ b/arm/run
@@ -33,7 +33,11 @@ if $qemu $M -chardev testdev,id=id -initrd . 2>&1 \
exit 2
 fi
 
-M='-machine virt,accel=kvm:tcg'
+if $qemu $M,? 2>&1 | grep gic-version > /dev/null; then
+   GIC='gic-version=host,'
+fi
+
+M="-machine virt,${GIC}accel=kvm:tcg"
 chr_testdev='-device virtio-serial-device'
 chr_testdev+=' -device virtconsole,chardev=ctd -chardev testdev,id=ctd'
--- cut ---

 Without it qemu does not work on GICv3-only hardware, like my board, because 
it defaults to gic-version=2. I don't post the patch
on the mailing lists, because in order to be able to post this 5-liner i'll 
need to go through the formal approval procedure at my
company, and i just don't want to bother for a single small fix. :) Will do as 
a "Reported-by:".

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




[PATCH v7 2/6] KVM: arm/arm64: Move endianness conversion out of vgic_attr_regs_access()

2015-12-07 Thread Pavel Fedin
mmio_data_read() and mmio_data_write(), originally used in this function,
are limited only to 32 bits. We are going to refactor this code and
eventually let it do 64-bit I/O for vGICv3. Therefore, our first step is
to get rid of this limitation.

We open up these inlines, which consist of endianness conversion and
masking. Masking is not used here (the mask is set to ~0), so we just
move out the remaining endianness conversion.

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 virt/kvm/arm/vgic-v2-emul.c | 20 
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/virt/kvm/arm/vgic-v2-emul.c b/virt/kvm/arm/vgic-v2-emul.c
index 1390797..959b9c6 100644
--- a/virt/kvm/arm/vgic-v2-emul.c
+++ b/virt/kvm/arm/vgic-v2-emul.c
@@ -663,7 +663,7 @@ static const struct vgic_io_range vgic_cpu_ranges[] = {
 
 static int vgic_attr_regs_access(struct kvm_device *dev,
 struct kvm_device_attr *attr,
-u32 *reg, bool is_write)
+__le32 *data, bool is_write)
 {
const struct vgic_io_range *r = NULL, *ranges;
phys_addr_t offset;
@@ -671,7 +671,6 @@ static int vgic_attr_regs_access(struct kvm_device *dev,
struct kvm_vcpu *vcpu, *tmp_vcpu;
struct vgic_dist *vgic;
struct kvm_exit_mmio mmio;
-   u32 data;
 
offset = attr->attr & KVM_DEV_ARM_VGIC_OFFSET_MASK;
cpuid = (attr->attr & KVM_DEV_ARM_VGIC_CPUID_MASK) >>
@@ -693,9 +692,7 @@ static int vgic_attr_regs_access(struct kvm_device *dev,
 
mmio.len = 4;
mmio.is_write = is_write;
-   mmio.data = &data;
-   if (is_write)
-   mmio_data_write(&mmio, ~0, *reg);
+   mmio.data = data;
switch (attr->group) {
case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
mmio.phys_addr = vgic->vgic_dist_base + offset;
@@ -743,9 +740,6 @@ static int vgic_attr_regs_access(struct kvm_device *dev,
offset -= r->base;
r->handle_mmio(vcpu, &mmio, offset);
 
-   if (!is_write)
-   *reg = mmio_data_read(&mmio, ~0);
-
ret = 0;
 out_vgic_unlock:
spin_unlock(&vgic->lock);
@@ -778,11 +772,13 @@ static int vgic_v2_set_attr(struct kvm_device *dev,
case KVM_DEV_ARM_VGIC_GRP_CPU_REGS: {
u32 __user *uaddr = (u32 __user *)(long)attr->addr;
u32 reg;
+   __le32 data;
 
if (get_user(reg, uaddr))
return -EFAULT;
 
-   return vgic_attr_regs_access(dev, attr, &reg, true);
+   data = cpu_to_le32(reg);
+   return vgic_attr_regs_access(dev, attr, &data, true);
}
 
}
@@ -803,12 +799,12 @@ static int vgic_v2_get_attr(struct kvm_device *dev,
case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
case KVM_DEV_ARM_VGIC_GRP_CPU_REGS: {
u32 __user *uaddr = (u32 __user *)(long)attr->addr;
-   u32 reg = 0;
+   __le32 data = 0;
 
-   ret = vgic_attr_regs_access(dev, attr, &reg, false);
+   ret = vgic_attr_regs_access(dev, attr, &data, false);
if (ret)
return ret;
-   return put_user(reg, uaddr);
+   return put_user(le32_to_cpu(data), uaddr);
}
 
}
-- 
2.4.4
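
The division of labour this patch sets up (callers convert, the accessor only
moves little-endian data) can be sketched in userspace as follows. This is an
illustration only: struct le32, to_le32() and from_le32() are hypothetical
stand-ins for the kernel's __le32, cpu_to_le32() and le32_to_cpu(), and the
toy_* functions merely imitate the shape of the refactored code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Byte buffer holding a 32-bit value in little-endian order, standing in
 * for the kernel's __le32 in this userspace sketch. */
struct le32 {
	uint8_t b[4];
};

/* Stand-in for cpu_to_le32(); correct on any host byte order. */
struct le32 to_le32(uint32_t v)
{
	struct le32 out = { { (uint8_t)v, (uint8_t)(v >> 8),
			      (uint8_t)(v >> 16), (uint8_t)(v >> 24) } };
	return out;
}

/* Stand-in for le32_to_cpu(). */
uint32_t from_le32(struct le32 v)
{
	return (uint32_t)v.b[0] | ((uint32_t)v.b[1] << 8) |
	       ((uint32_t)v.b[2] << 16) | ((uint32_t)v.b[3] << 24);
}

/* Toy device register kept in little-endian form, as MMIO data would be. */
static struct le32 toy_reg;

/* Shaped like the refactored accessor: moves raw little-endian data only. */
void toy_regs_access(struct le32 *data, bool is_write)
{
	if (is_write)
		toy_reg = *data;
	else
		*data = toy_reg;
}

/* Callers own the conversion, mirroring the set_attr/get_attr changes. */
void toy_set_attr(uint32_t reg)
{
	struct le32 data = to_le32(reg);

	toy_regs_access(&data, true);
}

uint32_t toy_get_attr(void)
{
	struct le32 data = to_le32(0);

	toy_regs_access(&data, false);
	return from_le32(data);
}
```

Because the accessor no longer interprets the value, widening it to 64 bits
later only requires the callers to pick a different conversion.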



[PATCH v7 5/6] KVM: arm64: Introduce find_reg_by_id()

2015-12-07 Thread Pavel Fedin
In order to implement vGICv3 CPU interface access, we will need to
perform table lookup of system registers. We would need both
index_to_params() and find_reg() exported for that purpose, but instead
we export a single function which combines them both.

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
Reviewed-by: Andre Przywara <andre.przyw...@arm.com>
---
 arch/arm64/kvm/sys_regs.c | 22 +++---
 arch/arm64/kvm/sys_regs.h |  4 
 2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index d2650e8..8c4b671 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1276,6 +1276,17 @@ static bool index_to_params(u64 id, struct 
sys_reg_params *params)
}
 }
 
+const struct sys_reg_desc *find_reg_by_id(u64 id,
+ struct sys_reg_params *params,
+ const struct sys_reg_desc table[],
+ unsigned int num)
+{
+   if (!index_to_params(id, params))
+   return NULL;
+
+   return find_reg(params, table, num);
+}
+
 /* Decode an index value, and find the sys_reg_desc entry. */
 static const struct sys_reg_desc *index_to_sys_reg_desc(struct kvm_vcpu *vcpu,
u64 id)
@@ -1403,10 +1414,8 @@ static int get_invariant_sys_reg(u64 id, void __user 
*uaddr)
struct sys_reg_params params;
const struct sys_reg_desc *r;
 
-   if (!index_to_params(id, &params))
-   return -ENOENT;
-
-   r = find_reg(&params, invariant_sys_regs, ARRAY_SIZE(invariant_sys_regs));
+   r = find_reg_by_id(id, &params, invariant_sys_regs,
+  ARRAY_SIZE(invariant_sys_regs));
if (!r)
return -ENOENT;
 
@@ -1420,9 +1429,8 @@ static int set_invariant_sys_reg(u64 id, void __user 
*uaddr)
int err;
u64 val = 0; /* Make sure high bits are 0 for 32-bit regs */
 
-   if (!index_to_params(id, &params))
-   return -ENOENT;
-   r = find_reg(&params, invariant_sys_regs, ARRAY_SIZE(invariant_sys_regs));
+   r = find_reg_by_id(id, &params, invariant_sys_regs,
+  ARRAY_SIZE(invariant_sys_regs));
if (!r)
return -ENOENT;
 
diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
index dbbb01c..9c6ffd0 100644
--- a/arch/arm64/kvm/sys_regs.h
+++ b/arch/arm64/kvm/sys_regs.h
@@ -136,6 +136,10 @@ static inline int cmp_sys_reg(const struct sys_reg_desc 
*i1,
return i1->Op2 - i2->Op2;
 }
 
+const struct sys_reg_desc *find_reg_by_id(u64 id,
+ struct sys_reg_params *params,
+ const struct sys_reg_desc table[],
+ unsigned int num);
 
 #define Op0(_x).Op0 = _x
 #define Op1(_x).Op1 = _x
-- 
2.4.4
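
The decode-then-look-up pattern that find_reg_by_id() wraps can be sketched
outside the kernel as follows. This is a simplified illustration, not the
kernel code: the sketch_* names are hypothetical, the ID packing keeps only the
Op0/Op1/CRn/CRm/Op2 fields (a real KVM register ID also carries architecture,
size and register-class bits), and the lookup is a plain linear scan.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct reg_params {
	uint8_t Op0, Op1, CRn, CRm, Op2;
};

struct reg_desc {
	struct reg_params p;
	const char *name;
};

/* Simplified ID packing for this sketch only. */
#define SKETCH_REG_ID(op0, op1, crn, crm, op2) \
	(((uint64_t)(op0) << 14) | ((uint64_t)(op1) << 11) | \
	 ((uint64_t)(crn) << 7) | ((uint64_t)(crm) << 3) | (uint64_t)(op2))

/* Stand-in for index_to_params(): unpack the ID, rejecting stray high bits. */
bool sketch_index_to_params(uint64_t id, struct reg_params *params)
{
	if (id >> 16)
		return false;
	params->Op0 = (id >> 14) & 0x3;
	params->Op1 = (id >> 11) & 0x7;
	params->CRn = (id >> 7) & 0xf;
	params->CRm = (id >> 3) & 0xf;
	params->Op2 = id & 0x7;
	return true;
}

/* Stand-in for find_reg(): linear scan for a matching encoding. */
const struct reg_desc *sketch_find_reg(const struct reg_params *params,
				       const struct reg_desc table[],
				       unsigned int num)
{
	for (unsigned int i = 0; i < num; i++) {
		const struct reg_params *p = &table[i].p;

		if (p->Op0 == params->Op0 && p->Op1 == params->Op1 &&
		    p->CRn == params->CRn && p->CRm == params->CRm &&
		    p->Op2 == params->Op2)
			return &table[i];
	}
	return NULL;
}

/* The combined helper, in the shape the patch introduces. */
const struct reg_desc *sketch_find_reg_by_id(uint64_t id,
					     struct reg_params *params,
					     const struct reg_desc table[],
					     unsigned int num)
{
	if (!sketch_index_to_params(id, params))
		return NULL;

	return sketch_find_reg(params, table, num);
}
```

A caller only has to test the result against NULL, which is what lets both
invariant_sys_reg accessors collapse their two checks into one.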



[PATCH v7 3/6] KVM: arm/arm64: Refactor vGIC attributes handling code

2015-12-07 Thread Pavel Fedin
Separate all implementation-independent code in vgic_attr_regs_access()
and move it to vgic.c. This will allow reusing this code for the vGICv3
implementation.

vcpu lookup is left where it originally was, because the vGICv3 API will
expect an affinity ID instead of a vCPU index, therefore it will be done
differently. Also, the vcpu pointer has a backpointer to kvm, so 'dev' was
replaced with 'vcpu'.

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 virt/kvm/arm/vgic-v2-emul.c | 120 +++-
 virt/kvm/arm/vgic.c |  57 +
 virt/kvm/arm/vgic.h |   3 ++
 3 files changed, 88 insertions(+), 92 deletions(-)

diff --git a/virt/kvm/arm/vgic-v2-emul.c b/virt/kvm/arm/vgic-v2-emul.c
index 959b9c6..8e769c6 100644
--- a/virt/kvm/arm/vgic-v2-emul.c
+++ b/virt/kvm/arm/vgic-v2-emul.c
@@ -661,38 +661,24 @@ static const struct vgic_io_range vgic_cpu_ranges[] = {
},
 };
 
-static int vgic_attr_regs_access(struct kvm_device *dev,
-struct kvm_device_attr *attr,
-__le32 *data, bool is_write)
+static int vgic_v2_attr_regs_access(struct kvm_device *dev,
+   struct kvm_device_attr *attr,
+   __le32 *data, bool is_write)
 {
-   const struct vgic_io_range *r = NULL, *ranges;
+   const struct vgic_io_range *ranges;
phys_addr_t offset;
-   int ret, cpuid, c;
-   struct kvm_vcpu *vcpu, *tmp_vcpu;
-   struct vgic_dist *vgic;
+   struct kvm_vcpu *vcpu;
+   int cpuid;
+   struct vgic_dist *vgic = &dev->kvm->arch.vgic;
struct kvm_exit_mmio mmio;
 
offset = attr->attr & KVM_DEV_ARM_VGIC_OFFSET_MASK;
cpuid = (attr->attr & KVM_DEV_ARM_VGIC_CPUID_MASK) >>
KVM_DEV_ARM_VGIC_CPUID_SHIFT;
 
-   mutex_lock(&dev->kvm->lock);
-
-   ret = vgic_init(dev->kvm);
-   if (ret)
-   goto out;
-
-   if (cpuid >= atomic_read(&dev->kvm->online_vcpus)) {
-   ret = -EINVAL;
-   goto out;
-   }
+   if (cpuid >= atomic_read(&dev->kvm->online_vcpus))
+   return -EINVAL;
 
-   vcpu = kvm_get_vcpu(dev->kvm, cpuid);
-   vgic = &dev->kvm->arch.vgic;
-
-   mmio.len = 4;
-   mmio.is_write = is_write;
-   mmio.data = data;
switch (attr->group) {
case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
mmio.phys_addr = vgic->vgic_dist_base + offset;
@@ -703,49 +689,16 @@ static int vgic_attr_regs_access(struct kvm_device *dev,
ranges = vgic_cpu_ranges;
break;
default:
-   BUG();
+   return -ENXIO;
}
-   r = vgic_find_range(ranges, 4, offset);
 
-   if (unlikely(!r || !r->handle_mmio)) {
-   ret = -ENXIO;
-   goto out;
-   }
-
-
-   spin_lock(&vgic->lock);
-
-   /*
-* Ensure that no other VCPU is running by checking the vcpu->cpu
-* field.  If no other VPCUs are running we can safely access the VGIC
-* state, because even if another VPU is run after this point, that
-* VCPU will not touch the vgic state, because it will block on
-* getting the vgic->lock in kvm_vgic_sync_hwstate().
-*/
-   kvm_for_each_vcpu(c, tmp_vcpu, dev->kvm) {
-   if (unlikely(tmp_vcpu->cpu != -1)) {
-   ret = -EBUSY;
-   goto out_vgic_unlock;
-   }
-   }
-
-   /*
-* Move all pending IRQs from the LRs on all VCPUs so the pending
-* state can be properly represented in the register state accessible
-* through this API.
-*/
-   kvm_for_each_vcpu(c, tmp_vcpu, dev->kvm)
-   vgic_unqueue_irqs(tmp_vcpu);
+   vcpu = kvm_get_vcpu(dev->kvm, cpuid);
 
-   offset -= r->base;
-   r->handle_mmio(vcpu, &mmio, offset);
+   mmio.len = 4;
+   mmio.is_write = is_write;
+   mmio.data = data;
 
-   ret = 0;
-out_vgic_unlock:
-   spin_unlock(&vgic->lock);
-out:
-   mutex_unlock(&dev->kvm->lock);
-   return ret;
+   return vgic_attr_regs_access(vcpu, ranges, &mmio, offset);
 }
 
 static int vgic_v2_create(struct kvm_device *dev, u32 type)
@@ -761,55 +714,38 @@ static void vgic_v2_destroy(struct kvm_device *dev)
 static int vgic_v2_set_attr(struct kvm_device *dev,
struct kvm_device_attr *attr)
 {
+   u32 __user *uaddr = (u32 __user *)(long)attr->addr;
+   u32 reg;
+   __le32 data;
int ret;
 
ret = vgic_set_common_attr(dev, attr);
if (ret != -ENXIO)
return ret;
 
-   switch (attr->group) {
-   case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
-   case KVM_DEV_ARM_VGIC_GRP_CPU_REGS: {
-   u32 __user *uaddr = (u32 __user *)(long)attr->addr;
- 

[PATCH v7 1/6] KVM: arm/arm64: Add VGICv3 save/restore API documentation

2015-12-07 Thread Pavel Fedin
From: Christoffer Dall <christoffer.d...@linaro.org>

Factor out the GICv3-specific documentation into a separate
documentation file.  Add description for how to access distributor,
redistributor, and CPU interface registers for GICv3 in this new file.

Acked-by: Peter Maydell <peter.mayd...@linaro.org>
Acked-by: Marc Zyngier <marc.zyng...@arm.com>
Signed-off-by: Christoffer Dall <christoffer.d...@linaro.org>
Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 Documentation/virtual/kvm/devices/arm-vgic-v3.txt | 116 ++
 Documentation/virtual/kvm/devices/arm-vgic.txt|  21 +---
 2 files changed, 120 insertions(+), 17 deletions(-)
 create mode 100644 Documentation/virtual/kvm/devices/arm-vgic-v3.txt

diff --git a/Documentation/virtual/kvm/devices/arm-vgic-v3.txt b/Documentation/virtual/kvm/devices/arm-vgic-v3.txt
new file mode 100644
index 0000000..24e2f6b
--- /dev/null
+++ b/Documentation/virtual/kvm/devices/arm-vgic-v3.txt
@@ -0,0 +1,116 @@
+ARM Virtual Generic Interrupt Controller v3 and later (VGICv3)
+==============================================================
+
+
+Device types supported:
+  KVM_DEV_TYPE_ARM_VGIC_V3 ARM Generic Interrupt Controller v3.0
+
+Only one VGIC instance may be instantiated through this API.  The created VGIC
+will act as the VM interrupt controller, requiring emulated user-space devices
+to inject interrupts to the VGIC instead of directly to CPUs.  It is not
+possible to create both a GICv3 and GICv2 on the same VM.
+
+Creating a guest GICv3 device requires a host GICv3 as well.
+
+Groups:
+  KVM_DEV_ARM_VGIC_GRP_ADDR
+  Attributes:
+KVM_VGIC_V3_ADDR_TYPE_DIST (rw, 64-bit)
+  Base address in the guest physical address space of the GICv3 distributor
+  register mappings. Only valid for KVM_DEV_TYPE_ARM_VGIC_V3.
+  This address needs to be 64K aligned and the region covers 64 KByte.
+
+KVM_VGIC_V3_ADDR_TYPE_REDIST (rw, 64-bit)
+  Base address in the guest physical address space of the GICv3
+  redistributor register mappings. There are two 64K pages for each
+  VCPU and all of the redistributor pages are contiguous.
+  Only valid for KVM_DEV_TYPE_ARM_VGIC_V3.
+  This address needs to be 64K aligned.
+
+
+  KVM_DEV_ARM_VGIC_GRP_DIST_REGS
+  KVM_DEV_ARM_VGIC_GRP_REDIST_REGS
+  Attributes:
+The attr field of kvm_device_attr encodes two values:
+bits:     | 63   ....  32  |  31   ....    0 |
+values:   |      mpidr     |      offset     |
+
+All distributor regs are (rw, 64-bit).
+
+KVM_DEV_ARM_VGIC_GRP_DIST_REGS accesses the main distributor registers.
+KVM_DEV_ARM_VGIC_GRP_REDIST_REGS accesses the redistributor of the CPU
+specified by the mpidr.
+
+The offset is relative to the "[Re]Distributor base address" as defined
+in the GICv3/4 specs.  Getting or setting such a register has the same
+effect as reading or writing the register on real hardware, and the mpidr
+field is used to specify which redistributor is accessed.  The mpidr is
+ignored for the distributor.
+
+The mpidr encoding is based on the affinity information in the
+architecture defined MPIDR, and the field is encoded as follows:
+  | 63 .... 56 | 55 .... 48 | 47 .... 40 | 39 .... 32 |
+  |    Aff3    |    Aff2    |    Aff1    |    Aff0    |
+
+Note that distributor fields are not banked, but return the same value
+regardless of the mpidr used to access the register.
+  Limitations:
+- Priorities are not implemented, and registers are RAZ/WI
+  Errors:
+-ENXIO: Getting or setting this register is not yet supported
+-EBUSY: One or more VCPUs are running
+
+
+  KVM_DEV_ARM_VGIC_CPU_SYSREGS
+  Attributes:
+The attr field of kvm_device_attr encodes two values:
+bits:     | 63  ....  32 | 31  ....  16 | 15  ....  0 |
+values:   |    mpidr     |     RES      |    instr    |
+
+The mpidr field encodes the CPU ID based on the affinity information in the
+architecture defined MPIDR, and the field is encoded as follows:
+  | 63 .... 56 | 55 .... 48 | 47 .... 40 | 39 .... 32 |
+  |    Aff3    |    Aff2    |    Aff1    |    Aff0    |
+KVM_DEV_ARM_VGIC_SYSREG() macro is provided for building register ID.
+
+The instr field encodes the system register to access based on the fields
+defined in the A64 instruction set encoding for system register access
+(RES means the bits are reserved for future use and should be zero):
+
+  | 15 ... 14 | 13 ... 11 | 10 ... 7 | 6 ... 3 | 2 ... 0 |
+  |   Op 0    |    Op1    |    CRn   |   CRm   |   Op2   |
+
+All system regs accessed through this API are (rw, 64-bit).
+
+KVM_DEV_ARM_VGIC_CPU_SYSREGS accesses the CPU interface registers for the
+CPU specified by the mpidr field.
+
+
+  Limitations:
+- Priorities are not implemented, and registers are RAZ/WI
+  Errors:
+-ENXIO: Getting or setting this registe

[PATCH v7 6/6] KVM: arm64: Implement vGICv3 CPU interface access

2015-12-07 Thread Pavel Fedin
The access size is always 64 bits. Since CPU interface state affects only
a single vCPU, no vGIC locking is done, which avoids code duplication. The
code only makes sure that the vCPU is not running.

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 arch/arm64/include/uapi/asm/kvm.h  |  14 ++-
 arch/arm64/mm/mmap.c   |   2 +-
 include/linux/irqchip/arm-gic-v3.h |  18 ++-
 virt/kvm/arm/vgic-v3-emul.c| 224 -
 4 files changed, 251 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 98bd047..ca32fe5 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -179,14 +179,14 @@ struct kvm_arch_memory_slot {
KVM_REG_ARM64_SYSREG_ ## n ## _MASK)
 
 #define __ARM64_SYS_REG(op0,op1,crn,crm,op2) \
-   (KVM_REG_ARM64 | KVM_REG_ARM64_SYSREG | \
-   ARM64_SYS_REG_SHIFT_MASK(op0, OP0) | \
+   (ARM64_SYS_REG_SHIFT_MASK(op0, OP0) | \
ARM64_SYS_REG_SHIFT_MASK(op1, OP1) | \
ARM64_SYS_REG_SHIFT_MASK(crn, CRN) | \
ARM64_SYS_REG_SHIFT_MASK(crm, CRM) | \
ARM64_SYS_REG_SHIFT_MASK(op2, OP2))
 
-#define ARM64_SYS_REG(...) (__ARM64_SYS_REG(__VA_ARGS__) | KVM_REG_SIZE_U64)
+#define ARM64_SYS_REG(...) (__ARM64_SYS_REG(__VA_ARGS__) | KVM_REG_ARM64 | \
+   KVM_REG_SIZE_U64 | KVM_REG_ARM64_SYSREG)
 
 #define KVM_REG_ARM_TIMER_CTL  ARM64_SYS_REG(3, 3, 14, 3, 1)
 #define KVM_REG_ARM_TIMER_CNT  ARM64_SYS_REG(3, 3, 14, 3, 2)
@@ -204,6 +204,14 @@ struct kvm_arch_memory_slot {
 #define KVM_DEV_ARM_VGIC_GRP_CTRL  4
 #define   KVM_DEV_ARM_VGIC_CTRL_INIT   0
 #define KVM_DEV_ARM_VGIC_GRP_REDIST_REGS 5
+#define KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS 6
+#define   KVM_DEV_ARM_VGIC_SYSREG_MASK (KVM_REG_ARM64_SYSREG_OP0_MASK | \
+KVM_REG_ARM64_SYSREG_OP1_MASK | \
+KVM_REG_ARM64_SYSREG_CRN_MASK | \
+KVM_REG_ARM64_SYSREG_CRM_MASK | \
+KVM_REG_ARM64_SYSREG_OP2_MASK)
+#define   KVM_DEV_ARM_VGIC_SYSREG(op0, op1, crn, crm, op2) \
+   __ARM64_SYS_REG(op0, op1, crn, crm, op2)
 
 /* KVM_IRQ_LINE irq field index values */
 #define KVM_ARM_IRQ_TYPE_SHIFT 24
diff --git a/arch/arm64/mm/mmap.c b/arch/arm64/mm/mmap.c
index af461b9..e59a75a 100644
--- a/arch/arm64/mm/mmap.c
+++ b/arch/arm64/mm/mmap.c
@@ -51,7 +51,7 @@ unsigned long arch_mmap_rnd(void)
 {
unsigned long rnd;
 
-ifdef CONFIG_COMPAT
+#ifdef CONFIG_COMPAT
if (test_thread_flag(TIF_32BIT))
rnd = (unsigned long)get_random_int() % (1 << mmap_rnd_compat_bits);
else
diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
index 53fd894..bff3eee 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -259,8 +259,14 @@
 /*
  * CPU interface registers
  */
-#define ICC_CTLR_EL1_EOImode_drop_dir  (0U << 1)
-#define ICC_CTLR_EL1_EOImode_drop  (1U << 1)
+#define ICC_CTLR_EL1_CBPR_SHIFT0
+#define ICC_CTLR_EL1_EOImode_SHIFT 1
+#define ICC_CTLR_EL1_EOImode_drop_dir  (0U << ICC_CTLR_EL1_EOImode_SHIFT)
+#define ICC_CTLR_EL1_EOImode_drop  (1U << ICC_CTLR_EL1_EOImode_SHIFT)
+#define ICC_CTLR_EL1_PRIbits_MASK  (7U << 8)
+#define ICC_CTLR_EL1_IDbits_MASK   (7U << 11)
+#define ICC_CTLR_EL1_SEIS  (1U << 14)
+#define ICC_CTLR_EL1_A3V   (1U << 15)
 #define ICC_SRE_EL1_SRE(1U << 0)
 
 /*
@@ -285,6 +291,14 @@
 
 #define ICH_VMCR_CTLR_SHIFT0
 #define ICH_VMCR_CTLR_MASK (0x21f << ICH_VMCR_CTLR_SHIFT)
+#define ICH_VMCR_ENG0_SHIFT0
+#define ICH_VMCR_ENG0  (1 << ICH_VMCR_ENG0_SHIFT)
+#define ICH_VMCR_ENG1_SHIFT1
+#define ICH_VMCR_ENG1  (1 << ICH_VMCR_ENG1_SHIFT)
+#define ICH_VMCR_CBPR_SHIFT4
+#define ICH_VMCR_CBPR  (1 << ICH_VMCR_CBPR_SHIFT)
+#define ICH_VMCR_EOIM_SHIFT9
+#define ICH_VMCR_EOIM  (1 << ICH_VMCR_EOIM_SHIFT)
 #define ICH_VMCR_BPR1_SHIFT18
 #define ICH_VMCR_BPR1_MASK (7 << ICH_VMCR_BPR1_SHIFT)
 #define ICH_VMCR_BPR0_SHIFT21
diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c
index d9d644c..8cae803 100644
--- a/virt/kvm/arm/vgic-v3-emul.c
+++ b/virt/kvm/arm/vgic-v3-emul.c
@@ -48,6 +48,7 @@
 #include 
 #include 
 
+#include "sys_regs.h"
 #include "vgic.h"
 
 static bool handle_mmio_rao_wi(struct kvm_vcpu *vcpu,
@@ -991,6 +992,219 @@ void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg)
vgic_kick_vcpus(vcpu->kvm);
 }
 
+static bool access_gic_ctlr(struct kvm_vcp

[PATCH v7 4/6] KVM: arm64: Implement vGICv3 distributor and redistributor access from userspace

2015-12-07 Thread Pavel Fedin
The access is done similarly to vGICv2, using
KVM_DEV_ARM_VGIC_GRP_DIST_REGS and KVM_DEV_ARM_VGIC_GRP_REDIST_REGS
with the KVM_SET_DEVICE_ATTR and KVM_GET_DEVICE_ATTR ioctls.

The access size for vGICv3 is 64 bits, and vgic_attr_regs_access() is
fixed to support this. The trick with vgic_v3_get_reg_size() is necessary
because most GICv3 registers are actually 32-bit, and their accessors do
not distinguish between the lower and upper words (offset & 3). Accessing
these registers with len == 8 would cause rollover: a write would
overwrite the lower word with the upper one (which would normally be 0),
and a read would duplicate the same word in both halves.
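
For illustration only (not part of the patch), the rollover described above
can be reproduced with a small userspace toy model. Everything named toy_*
is hypothetical, and the assumption that a 64-bit access is split into two
32-bit accesses with the low word first is mine; the sketch only mirrors
the behaviour the commit message describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy 32-bit register; stands in for any 32-bit GICv3 register. */
static uint32_t toy_reg;

/*
 * Toy 32-bit accessor. Like the handlers described above, it does not
 * care which word of a wider access it is handed ('offset' is unused).
 */
static void toy_access_word(uint32_t offset, uint32_t *word, bool is_write)
{
    (void)offset;
    if (is_write)
        toy_reg = *word;
    else
        *word = toy_reg;
}

/*
 * Perform an access of 'len' bytes (4 or 8) as one or two 32-bit
 * accesses, low word first, and return the resulting 64-bit value.
 */
static uint64_t toy_access(uint32_t offset, uint64_t val, unsigned int len,
                           bool is_write)
{
    uint32_t lo = (uint32_t)val;
    uint32_t hi = (uint32_t)(val >> 32);

    assert(len == 4 || len == 8);

    toy_access_word(offset, &lo, is_write);
    if (len == 8)
        toy_access_word(offset + 4, &hi, is_write);
    else
        hi = 0;

    return ((uint64_t)hi << 32) | lo;
}
```

With len == 8 a write of 0x12345678 leaves the toy register at 0, and a
read returns the same word in both halves; with len == 4 both directions
behave as expected, which is why the size has to be chosen per register.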

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 arch/arm64/include/uapi/asm/kvm.h  |   1 +
 include/linux/irqchip/arm-gic-v3.h |   1 +
 virt/kvm/arm/vgic-v3-emul.c| 112 -
 virt/kvm/arm/vgic.c|   4 +-
 4 files changed, 102 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 2d4ca4b..98bd047 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -203,6 +203,7 @@ struct kvm_arch_memory_slot {
 #define KVM_DEV_ARM_VGIC_GRP_NR_IRQS   3
 #define KVM_DEV_ARM_VGIC_GRP_CTRL  4
 #define   KVM_DEV_ARM_VGIC_CTRL_INIT   0
+#define KVM_DEV_ARM_VGIC_GRP_REDIST_REGS 5
 
 /* KVM_IRQ_LINE irq field index values */
 #define KVM_ARM_IRQ_TYPE_SHIFT 24
diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
index c9ae0c6..53fd894 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -43,6 +43,7 @@
 #define GICD_IGRPMODR  0x0D00
 #define GICD_NSACR 0x0E00
 #define GICD_IROUTER   0x6000
+#define GICD_IROUTER1019   0x7FD8
 #define GICD_IDREGS0xFFD0
 #define GICD_PIDR2 0xFFE8
 
diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c
index e661e7f..d9d644c 100644
--- a/virt/kvm/arm/vgic-v3-emul.c
+++ b/virt/kvm/arm/vgic-v3-emul.c
@@ -39,6 +39,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -990,6 +991,77 @@ void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg)
vgic_kick_vcpus(vcpu->kvm);
 }
 
+static u32 vgic_v3_get_reg_size(u32 group, u32 offset)
+{
+   switch (group) {
+   case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
+   if (offset >= GICD_IROUTER && offset <= GICD_IROUTER1019)
+   return 8;
+   else
+   return 4;
+   break;
+
+   case KVM_DEV_ARM_VGIC_GRP_REDIST_REGS:
+   if ((offset == GICR_TYPER) ||
+   (offset >= GICR_SETLPIR && offset <= GICR_INVALLR))
+   return 8;
+   else
+   return 4;
+   break;
+
+   default:
+   BUG();
+   }
+}
+
+static int vgic_v3_attr_regs_access(struct kvm_device *dev,
+   struct kvm_device_attr *attr,
+   u64 *reg, bool is_write)
+{
+   const struct vgic_io_range *ranges;
+   phys_addr_t offset;
+   struct kvm_vcpu *vcpu;
+   u64 cpuid;
+   struct vgic_dist *vgic = &dev->kvm->arch.vgic;
+   struct kvm_exit_mmio mmio;
+   __le64 data;
+   int ret;
+
+   offset = attr->attr & KVM_DEV_ARM_VGIC_OFFSET_MASK;
+   cpuid = attr->attr >> KVM_DEV_ARM_VGIC_CPUID_SHIFT;
+
+   /* Convert affinity ID from our packed to normal form */
+   cpuid = (cpuid & 0x00ffffff) | ((cpuid & 0xff000000) << 8);
+   vcpu = kvm_mpidr_to_vcpu(dev->kvm, cpuid);
+   if (!vcpu)
+   return -EINVAL;
+
+   switch (attr->group) {
+   case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
+   mmio.phys_addr = vgic->vgic_dist_base + offset;
+   ranges = vgic_v3_dist_ranges;
+   break;
+   case KVM_DEV_ARM_VGIC_GRP_REDIST_REGS:
+   mmio.phys_addr = vgic->vgic_redist_base + offset;
+   ranges = vgic_redist_ranges;
+   break;
+   default:
+   return -ENXIO;
+   }
+
+   data = cpu_to_le64(*reg);
+
+   mmio.len = vgic_v3_get_reg_size(attr->group, offset);
+   mmio.is_write = is_write;
+   mmio.data = &data;
+   mmio.private = vcpu; /* Redistributor handlers expect this */
+
+   ret = vgic_attr_regs_access(vcpu, ranges, &mmio, offset);
+
+   *reg = le64_to_cpu(data);
+   return ret;
+}
+
 static int vgic_v3_create(struct kvm_device *dev, u32 type)
 {
return kvm_vgic_create(dev->kvm, type);
@@ -1003,42 +1075,45 @@ static void vgic_v3_destroy(struct kvm_device *dev)
 static int vgic_v3_set_attr(struct kvm_device *dev,
  

[PATCH v7 0/6] KVM: arm64: Implement API for vGICv3 live migration

2015-12-07 Thread Pavel Fedin
This patchset adds necessary userspace API in order to support vGICv3 live
migration. GICv3 registers are accessed using device attribute ioctls,
similar to GICv2.
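
As a rough illustration of the attr layout described in the documentation
patch of this series, the sketch below composes the 64-bit attr value in
plain C. The toy_* helper names are hypothetical and are not kernel or
QEMU API; only the bit positions are taken from the documentation and from
the architectural MPIDR layout (Aff3 in bits 39..32).

```c
#include <assert.h>
#include <stdint.h>

/* Pack four affinity bytes into the 32-bit "mpidr" field, Aff3 on top. */
static uint32_t toy_pack_affinity(uint8_t aff3, uint8_t aff2,
                                  uint8_t aff1, uint8_t aff0)
{
    return ((uint32_t)aff3 << 24) | ((uint32_t)aff2 << 16) |
           ((uint32_t)aff1 << 8) | aff0;
}

/* Distributor/redistributor groups: mpidr in bits 63..32, offset in 31..0. */
static uint64_t toy_mmio_attr(uint32_t packed_aff, uint32_t offset)
{
    return ((uint64_t)packed_aff << 32) | offset;
}

/* instr field: Op0[15:14] Op1[13:11] CRn[10:7] CRm[6:3] Op2[2:0]. */
static uint16_t toy_sysreg_instr(unsigned int op0, unsigned int op1,
                                 unsigned int crn, unsigned int crm,
                                 unsigned int op2)
{
    assert(op0 < 4 && op1 < 8 && crn < 16 && crm < 16 && op2 < 8);
    return (uint16_t)((op0 << 14) | (op1 << 11) | (crn << 7) |
                      (crm << 3) | op2);
}

/* Expand the packed affinity into the architectural MPIDR layout. */
static uint64_t toy_affinity_to_mpidr(uint32_t packed_aff)
{
    return ((uint64_t)packed_aff & 0x00ffffffu) |
           (((uint64_t)packed_aff & 0xff000000u) << 8);
}
```

For example, GICD_IROUTER (offset 0x6000) accessed with affinity 1.2.3.4
gives attr 0x0102030400006000, and ICC_CTLR_EL1 (3, 0, 12, 12, 4) gives an
instr field of 0xc664.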

v6 => v7:
- Rebased on top of linux-next of 07.12.2015, dropped an unnecessary part

v5 => v6:
- Rebased on top of linux-next of 23.11.2015
- Use original API documentation patch, with minor changes only.
- Quit reusing KVM_DEV_ARM_VGIC_CPUID_MASK, do not touch vGICv2 API at all.
- Fixed some issues reported by the new checkpatch

v4 => v5:
- Adapted to new API by Peter Maydell, Marc Zyngier and Christoffer Dall.
  Acked-by's on the documentation were dropped, just in case, because I
  slightly adjusted it. Additionally, I merged all doc updates into one
  patch.

v3 => v4:
- Split pure refactoring from anything else
- Documentation brought up to date
- Cleaned up 'mmio' structure usage in vgic_attr_regs_access(),
  use call_range_handler() for 64-bit access handling
- Rebased on new linux-next

v2 => v3:
- KVM_DEV_ARM_VGIC_CPUID_MASK enlarged to 20 bits, allowing more than 256
  CPUs.
- Bug fix: Correctly set mmio->private, necessary for redistributor access.
- Added accessors for ICC_AP0R and ICC_AP1R registers
- Rebased on new linux-next

v1 => v2:
- Do not use generic register get/set API for CPU interface, use only
  device attributes.
- Introduce size specifier for distributor and redistributor register
  accesses, do not assume size any more.
- Lots of refactoring and extraction of reusable code.
- Added forgotten documentation

Christoffer Dall (1):
  KVM: arm/arm64: Add VGICv3 save/restore API documentation

Pavel Fedin (5):
  KVM: arm/arm64: Move endianness conversion out of
vgic_attr_regs_access()
  KVM: arm/arm64: Refactor vGIC attributes handling code
  KVM: arm64: Implement vGICv3 distributor and redistributor access from
userspace
  KVM: arm64: Introduce find_reg_by_id()
  KVM: arm64: Implement vGICv3 CPU interface access

 Documentation/virtual/kvm/devices/arm-vgic-v3.txt | 116 
 Documentation/virtual/kvm/devices/arm-vgic.txt|  21 +-
 arch/arm64/include/uapi/asm/kvm.h |  15 +-
 arch/arm64/kvm/sys_regs.c |  22 +-
 arch/arm64/kvm/sys_regs.h |   4 +
 arch/arm64/mm/mmap.c  |   2 +-
 include/linux/irqchip/arm-gic-v3.h|  19 +-
 virt/kvm/arm/vgic-v2-emul.c   | 124 ++--
 virt/kvm/arm/vgic-v3-emul.c   | 334 +-
 virt/kvm/arm/vgic.c   |  57 
 virt/kvm/arm/vgic.h   |   3 +
 11 files changed, 577 insertions(+), 140 deletions(-)
 create mode 100644 Documentation/virtual/kvm/devices/arm-vgic-v3.txt

-- 
2.4.4



[PATCH v3 1/4] KVM: arm64: Correctly handle zero register during MMIO

2015-12-04 Thread Pavel Fedin
On ARM64, register index 31 corresponds to both the zero register and SP.
However, all memory access instructions use ZR as the transfer register.
SP is used only as a base register in indirect memory addressing, or by
register-register arithmetic, which cannot be trapped here.

Correct emulation is achieved by introducing new register accessor
functions, which can do special handling for reg_num == 31. These new
accessors intentionally do not rely on the old vcpu_reg() on ARM64,
because it is to be removed. Since the affected code is shared by both ARM
flavours, implementations of these accessors are also added to the ARM32
code.

This patch fixes an MMIO register being set to a random value (actually
SP) instead of zero by something like:

 *((volatile int *)reg) = 0;

Compilers tend to generate "str wzr, [xx]" here.
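
For readers without the kernel tree at hand, the intended semantics can be
modelled in userspace as below. This is only a sketch: struct toy_regs and
the toy_* functions are hypothetical stand-ins for the guest register file
and for the accessors added by this patch, not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for the guest register file: x0..x30 followed by SP. */
struct toy_regs {
    uint64_t regs[31];
    uint64_t sp;
};

/* Read a transfer register; index 31 is the zero register, not SP. */
static inline uint64_t toy_get_reg(const struct toy_regs *r, uint8_t reg_num)
{
    assert(reg_num <= 31);      /* Rt is a 5-bit field */
    return (reg_num == 31) ? 0 : r->regs[reg_num];
}

/* Write a transfer register; writes to index 31 are discarded. */
static inline void toy_set_reg(struct toy_regs *r, uint8_t reg_num,
                               uint64_t val)
{
    assert(reg_num <= 31);
    if (reg_num != 31)
        r->regs[reg_num] = val;
}
```

The point of going through accessors instead of a pointer into the array is
that index 31 never reaches the storage that sits right after regs[30].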

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
Reviewed-by: Marc Zyngier <marc.zyng...@arm.com>
---
 arch/arm/include/asm/kvm_emulate.h   | 12 
 arch/arm/kvm/mmio.c  |  5 +++--
 arch/arm64/include/asm/kvm_emulate.h | 13 +
 3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/kvm_emulate.h b/arch/arm/include/asm/kvm_emulate.h
index a9c80a2..b7ff32e 100644
--- a/arch/arm/include/asm/kvm_emulate.h
+++ b/arch/arm/include/asm/kvm_emulate.h
@@ -28,6 +28,18 @@
 unsigned long *vcpu_reg(struct kvm_vcpu *vcpu, u8 reg_num);
 unsigned long *vcpu_spsr(struct kvm_vcpu *vcpu);
 
+static inline unsigned long vcpu_get_reg(const struct kvm_vcpu *vcpu,
+u8 reg_num)
+{
+   return *vcpu_reg(vcpu, reg_num);
+}
+
+static inline void vcpu_set_reg(const struct kvm_vcpu *vcpu, u8 reg_num,
+   unsigned long val)
+{
+   *vcpu_reg(vcpu, reg_num) = val;
+}
+
 bool kvm_condition_valid(struct kvm_vcpu *vcpu);
 void kvm_skip_instr(struct kvm_vcpu *vcpu, bool is_wide_instr);
 void kvm_inject_undefined(struct kvm_vcpu *vcpu);
diff --git a/arch/arm/kvm/mmio.c b/arch/arm/kvm/mmio.c
index 974b1c6..3a10c9f 100644
--- a/arch/arm/kvm/mmio.c
+++ b/arch/arm/kvm/mmio.c
@@ -115,7 +115,7 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu, struct kvm_run *run)
trace_kvm_mmio(KVM_TRACE_MMIO_READ, len, run->mmio.phys_addr,
   data);
data = vcpu_data_host_to_guest(vcpu, data, len);
-   *vcpu_reg(vcpu, vcpu->arch.mmio_decode.rt) = data;
+   vcpu_set_reg(vcpu, vcpu->arch.mmio_decode.rt, data);
}
 
return 0;
@@ -186,7 +186,8 @@ int io_mem_abort(struct kvm_vcpu *vcpu, struct kvm_run *run,
rt = vcpu->arch.mmio_decode.rt;
 
if (is_write) {
-   data = vcpu_data_guest_to_host(vcpu, *vcpu_reg(vcpu, rt), len);
+   data = vcpu_data_guest_to_host(vcpu, vcpu_get_reg(vcpu, rt),
+  len);
 
trace_kvm_mmio(KVM_TRACE_MMIO_WRITE, len, fault_ipa, data);
mmio_write_buf(data_buf, len, data);
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 3ca894e..5a182af 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -109,6 +109,19 @@ static inline unsigned long *vcpu_reg(const struct kvm_vcpu *vcpu, u8 reg_num)
return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.regs[reg_num];
 }
 
+static inline unsigned long vcpu_get_reg(const struct kvm_vcpu *vcpu,
+u8 reg_num)
+{
+   return (reg_num == 31) ? 0 : vcpu_gp_regs(vcpu)->regs.regs[reg_num];
+}
+
+static inline void vcpu_set_reg(struct kvm_vcpu *vcpu, u8 reg_num,
+   unsigned long val)
+{
+   if (reg_num != 31)
+   vcpu_gp_regs(vcpu)->regs.regs[reg_num] = val;
+}
+
 /* Get vcpu SPSR for current mode */
 static inline unsigned long *vcpu_spsr(const struct kvm_vcpu *vcpu)
 {
-- 
2.4.4



[PATCH v3 3/4] KVM: arm64: Correctly handle zero register in system register accesses

2015-12-04 Thread Pavel Fedin
System register accesses also use the zero register for Rt == 31, so
using vcpu_reg() there also yields the SP value instead. This patch makes
them use the new accessors introduced earlier in this series. Since the
register value is no longer directly associated with storage inside the
vCPU context structure, we introduce dedicated storage for it in
struct sys_reg_params.
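
A minimal userspace sketch of the resulting data flow is given below. The
dispatcher shape is an assumption on my side (the corresponding hunk is cut
off in this archive), and all toy_* names are hypothetical: the value is
loaded into the params structure before the handler runs, the handler only
touches p->regval, and the value is written back to the transfer register
only for reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins for the vCPU register file and struct sys_reg_params. */
struct toy_vcpu {
    uint64_t regs[31];
    uint64_t sp;
};

struct toy_params {
    uint8_t Rt;
    bool is_write;
    uint64_t regval;    /* dedicated storage for the transferred value */
};

static uint64_t toy_get_reg(const struct toy_vcpu *v, uint8_t n)
{
    assert(n <= 31);
    return (n == 31) ? 0 : v->regs[n];
}

static void toy_set_reg(struct toy_vcpu *v, uint8_t n, uint64_t val)
{
    assert(n <= 31);
    if (n != 31)
        v->regs[n] = val;
}

/* Hypothetical handler: writes are ignored, reads return a fixed value. */
static bool toy_access_fixed(struct toy_params *p)
{
    if (!p->is_write)
        p->regval = 1u << 3;
    return true;
}

/* Hypothetical dispatcher: the handler never sees the register file. */
static bool toy_dispatch(struct toy_vcpu *v, struct toy_params *p,
                         bool (*handler)(struct toy_params *))
{
    bool ok;

    p->regval = toy_get_reg(v, p->Rt);
    ok = handler(p);
    if (ok && !p->is_write)
        toy_set_reg(v, p->Rt, p->regval);
    return ok;
}
```

A read with Rt == 31 leaves SP untouched, while a read with any other Rt
lands in the corresponding general purpose register.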

This refactoring also gets rid of the "massive hack" in kvm_handle_cp_64().

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 arch/arm64/kvm/sys_regs.c| 88 ++--
 arch/arm64/kvm/sys_regs.h|  4 +-
 arch/arm64/kvm/sys_regs_generic_v8.c |  2 +-
 3 files changed, 46 insertions(+), 48 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 545a72a..425f1f6 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -97,18 +97,17 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu,
  struct sys_reg_params *p,
  const struct sys_reg_desc *r)
 {
-   unsigned long val;
bool was_enabled = vcpu_has_cache_enabled(vcpu);
 
BUG_ON(!p->is_write);
 
-   val = *vcpu_reg(vcpu, p->Rt);
if (!p->is_aarch32) {
-   vcpu_sys_reg(vcpu, r->reg) = val;
+   vcpu_sys_reg(vcpu, r->reg) = p->regval;
} else {
if (!p->is_32bit)
-   vcpu_cp15_64_high(vcpu, r->reg) = val >> 32;
-   vcpu_cp15_64_low(vcpu, r->reg) = val & 0xffffffffUL;
+   vcpu_cp15_64_high(vcpu, r->reg) =
+   upper_32_bits(p->regval);
+   vcpu_cp15_64_low(vcpu, r->reg) = lower_32_bits(p->regval);
}
 
kvm_toggle_cache(vcpu, was_enabled);
@@ -125,13 +124,10 @@ static bool access_gic_sgi(struct kvm_vcpu *vcpu,
   struct sys_reg_params *p,
   const struct sys_reg_desc *r)
 {
-   u64 val;
-
if (!p->is_write)
return read_from_write_only(vcpu, p);
 
-   val = *vcpu_reg(vcpu, p->Rt);
-   vgic_v3_dispatch_sgi(vcpu, val);
+   vgic_v3_dispatch_sgi(vcpu, p->regval);
 
return true;
 }
@@ -153,7 +149,7 @@ static bool trap_oslsr_el1(struct kvm_vcpu *vcpu,
if (p->is_write) {
return ignore_write(vcpu, p);
} else {
-   *vcpu_reg(vcpu, p->Rt) = (1 << 3);
+   p->regval = (1 << 3);
return true;
}
 }
@@ -167,7 +163,7 @@ static bool trap_dbgauthstatus_el1(struct kvm_vcpu *vcpu,
} else {
u32 val;
asm volatile("mrs %0, dbgauthstatus_el1" : "=r" (val));
-   *vcpu_reg(vcpu, p->Rt) = val;
+   p->regval = val;
return true;
}
 }
@@ -204,13 +200,13 @@ static bool trap_debug_regs(struct kvm_vcpu *vcpu,
const struct sys_reg_desc *r)
 {
if (p->is_write) {
-   vcpu_sys_reg(vcpu, r->reg) = *vcpu_reg(vcpu, p->Rt);
+   vcpu_sys_reg(vcpu, r->reg) = p->regval;
vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
} else {
-   *vcpu_reg(vcpu, p->Rt) = vcpu_sys_reg(vcpu, r->reg);
+   p->regval = vcpu_sys_reg(vcpu, r->reg);
}
 
-   trace_trap_reg(__func__, r->reg, p->is_write, *vcpu_reg(vcpu, p->Rt));
+   trace_trap_reg(__func__, r->reg, p->is_write, p->regval);
 
return true;
 }
@@ -228,7 +224,7 @@ static inline void reg_to_dbg(struct kvm_vcpu *vcpu,
  struct sys_reg_params *p,
  u64 *dbg_reg)
 {
-   u64 val = *vcpu_reg(vcpu, p->Rt);
+   u64 val = p->regval;
 
if (p->is_32bit) {
val &= 0xffffffffUL;
@@ -243,12 +239,9 @@ static inline void dbg_to_reg(struct kvm_vcpu *vcpu,
  struct sys_reg_params *p,
  u64 *dbg_reg)
 {
-   u64 val = *dbg_reg;
-
+   p->regval = *dbg_reg;
if (p->is_32bit)
-   val &= 0xffffffffUL;
-
-   *vcpu_reg(vcpu, p->Rt) = val;
+   p->regval &= 0xffffffffUL;
 }
 
 static inline bool trap_bvr(struct kvm_vcpu *vcpu,
@@ -697,10 +690,10 @@ static bool trap_dbgidr(struct kvm_vcpu *vcpu,
u64 pfr = read_system_reg(SYS_ID_AA64PFR0_EL1);
u32 el3 = !!cpuid_feature_extract_field(pfr, ID_AA64PFR0_EL3_SHIFT);
 
-   *vcpu_reg(vcpu, p->Rt) = ((((dfr >> ID_AA64DFR0_WRPS_SHIFT) & 0xf) << 28) |
- (((dfr >> ID_AA64DFR0_BRPS_SHIFT) & 0xf) << 24) |
- (((dfr >> ID_AA64DFR0_CTX_CMPS_SHIF

[PATCH v3 2/4] KVM: arm64: Remove const from struct sys_reg_params

2015-12-04 Thread Pavel Fedin
Further rework will introduce dedicated storage for the transfer register
value in struct sys_reg_params. Before doing this, we have to remove the
'const' modifiers from it in all accessor functions and their callers.

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 arch/arm64/kvm/sys_regs.c| 36 ++--
 arch/arm64/kvm/sys_regs.h|  4 ++--
 arch/arm64/kvm/sys_regs_generic_v8.c |  2 +-
 3 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 87a64e8..545a72a 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -78,7 +78,7 @@ static u32 get_ccsidr(u32 csselr)
  * See note at ARMv7 ARM B1.14.4 (TL;DR: S/W ops are not easily virtualized).
  */
 static bool access_dcsw(struct kvm_vcpu *vcpu,
-   const struct sys_reg_params *p,
+   struct sys_reg_params *p,
const struct sys_reg_desc *r)
 {
if (!p->is_write)
@@ -94,7 +94,7 @@ static bool access_dcsw(struct kvm_vcpu *vcpu,
  * sys_regs and leave it in complete control of the caches.
  */
 static bool access_vm_reg(struct kvm_vcpu *vcpu,
- const struct sys_reg_params *p,
+ struct sys_reg_params *p,
  const struct sys_reg_desc *r)
 {
unsigned long val;
@@ -122,7 +122,7 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu,
  * for both AArch64 and AArch32 accesses.
  */
 static bool access_gic_sgi(struct kvm_vcpu *vcpu,
-  const struct sys_reg_params *p,
+  struct sys_reg_params *p,
   const struct sys_reg_desc *r)
 {
u64 val;
@@ -137,7 +137,7 @@ static bool access_gic_sgi(struct kvm_vcpu *vcpu,
 }
 
 static bool trap_raz_wi(struct kvm_vcpu *vcpu,
-   const struct sys_reg_params *p,
+   struct sys_reg_params *p,
const struct sys_reg_desc *r)
 {
if (p->is_write)
@@ -147,7 +147,7 @@ static bool trap_raz_wi(struct kvm_vcpu *vcpu,
 }
 
 static bool trap_oslsr_el1(struct kvm_vcpu *vcpu,
-  const struct sys_reg_params *p,
+  struct sys_reg_params *p,
   const struct sys_reg_desc *r)
 {
if (p->is_write) {
@@ -159,7 +159,7 @@ static bool trap_oslsr_el1(struct kvm_vcpu *vcpu,
 }
 
 static bool trap_dbgauthstatus_el1(struct kvm_vcpu *vcpu,
-  const struct sys_reg_params *p,
+  struct sys_reg_params *p,
   const struct sys_reg_desc *r)
 {
if (p->is_write) {
@@ -200,7 +200,7 @@ static bool trap_dbgauthstatus_el1(struct kvm_vcpu *vcpu,
  *   now use the debug registers.
  */
 static bool trap_debug_regs(struct kvm_vcpu *vcpu,
-   const struct sys_reg_params *p,
+   struct sys_reg_params *p,
const struct sys_reg_desc *r)
 {
if (p->is_write) {
@@ -225,7 +225,7 @@ static bool trap_debug_regs(struct kvm_vcpu *vcpu,
  * hyp.S code switches between host and guest values in future.
  */
 static inline void reg_to_dbg(struct kvm_vcpu *vcpu,
- const struct sys_reg_params *p,
+ struct sys_reg_params *p,
  u64 *dbg_reg)
 {
u64 val = *vcpu_reg(vcpu, p->Rt);
@@ -240,7 +240,7 @@ static inline void reg_to_dbg(struct kvm_vcpu *vcpu,
 }
 
 static inline void dbg_to_reg(struct kvm_vcpu *vcpu,
- const struct sys_reg_params *p,
+ struct sys_reg_params *p,
  u64 *dbg_reg)
 {
u64 val = *dbg_reg;
@@ -252,7 +252,7 @@ static inline void dbg_to_reg(struct kvm_vcpu *vcpu,
 }
 
 static inline bool trap_bvr(struct kvm_vcpu *vcpu,
-   const struct sys_reg_params *p,
+   struct sys_reg_params *p,
const struct sys_reg_desc *rd)
 {
u64 *dbg_reg = &vcpu->arch.vcpu_debug_state.dbg_bvr[rd->reg];
@@ -294,7 +294,7 @@ static inline void reset_bvr(struct kvm_vcpu *vcpu,
 }
 
 static inline bool trap_bcr(struct kvm_vcpu *vcpu,
-   const struct sys_reg_params *p,
+   struct sys_reg_params *p,
const struct sys_reg_desc *rd)
 {
u64 *dbg_reg = &vcpu->arch.vcpu_debug_state.dbg_bcr[rd->reg];
@@ -337,7 +337,7 @@ static inline void reset_bcr(struct kvm_vcpu *vcpu,
 }
 
 static inline bool trap_wvr(struct kvm_vcpu *vcpu,
-   const struct sys_reg_params *p,
+   struct sys_reg_params *p,
const struct sys_reg_desc *rd)
 {
u64 *dbg_reg = >arch.vcpu_debug_state.dbg_wvr[rd-&

[PATCH v3 4/4] KVM: arm64: Get rid of old vcpu_reg()

2015-12-04 Thread Pavel Fedin
Using the old-style vcpu_reg() accessor has proven to be inappropriate and
unsafe on ARM64. This patch converts the remaining users to the new
accessors and completely removes vcpu_reg() on ARM64.

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 arch/arm/kvm/psci.c  | 20 ++--
 arch/arm64/include/asm/kvm_emulate.h | 11 +++
 arch/arm64/kvm/handle_exit.c |  2 +-
 3 files changed, 14 insertions(+), 19 deletions(-)

diff --git a/arch/arm/kvm/psci.c b/arch/arm/kvm/psci.c
index 0b55696..a9b3b90 100644
--- a/arch/arm/kvm/psci.c
+++ b/arch/arm/kvm/psci.c
@@ -75,7 +75,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
unsigned long context_id;
phys_addr_t target_pc;
 
-   cpu_id = *vcpu_reg(source_vcpu, 1) & MPIDR_HWID_BITMASK;
+   cpu_id = vcpu_get_reg(source_vcpu, 1) & MPIDR_HWID_BITMASK;
if (vcpu_mode_is_32bit(source_vcpu))
cpu_id &= ~((u32) 0);
 
@@ -94,8 +94,8 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
return PSCI_RET_INVALID_PARAMS;
}
 
-   target_pc = *vcpu_reg(source_vcpu, 2);
-   context_id = *vcpu_reg(source_vcpu, 3);
+   target_pc = vcpu_get_reg(source_vcpu, 2);
+   context_id = vcpu_get_reg(source_vcpu, 3);
 
kvm_reset_vcpu(vcpu);
 
@@ -114,7 +114,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
 * NOTE: We always update r0 (or x0) because for PSCI v0.1
 * the general puspose registers are undefined upon CPU_ON.
 */
-   *vcpu_reg(vcpu, 0) = context_id;
+   vcpu_set_reg(vcpu, 0, context_id);
vcpu->arch.power_off = false;
smp_mb();   /* Make sure the above is visible */
 
@@ -134,8 +134,8 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
struct kvm *kvm = vcpu->kvm;
struct kvm_vcpu *tmp;
 
-   target_affinity = *vcpu_reg(vcpu, 1);
-   lowest_affinity_level = *vcpu_reg(vcpu, 2);
+   target_affinity = vcpu_get_reg(vcpu, 1);
+   lowest_affinity_level = vcpu_get_reg(vcpu, 2);
 
/* Determine target affinity mask */
target_affinity_mask = psci_affinity_mask(lowest_affinity_level);
@@ -209,7 +209,7 @@ int kvm_psci_version(struct kvm_vcpu *vcpu)
 static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
 {
int ret = 1;
-   unsigned long psci_fn = *vcpu_reg(vcpu, 0) & ~((u32) 0);
+   unsigned long psci_fn = vcpu_get_reg(vcpu, 0) & ~((u32) 0);
unsigned long val;
 
switch (psci_fn) {
@@ -273,13 +273,13 @@ static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
break;
}
 
-   *vcpu_reg(vcpu, 0) = val;
+   vcpu_set_reg(vcpu, 0, val);
return ret;
 }
 
 static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
 {
-   unsigned long psci_fn = *vcpu_reg(vcpu, 0) & ~((u32) 0);
+   unsigned long psci_fn = vcpu_get_reg(vcpu, 0) & ~((u32) 0);
unsigned long val;
 
switch (psci_fn) {
@@ -295,7 +295,7 @@ static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
break;
}
 
-   *vcpu_reg(vcpu, 0) = val;
+   vcpu_set_reg(vcpu, 0, val);
return 1;
 }
 
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 5a182af..25a4021 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -100,15 +100,10 @@ static inline void vcpu_set_thumb(struct kvm_vcpu *vcpu)
 }
 
 /*
- * vcpu_reg should always be passed a register number coming from a
- * read of ESR_EL2. Otherwise, it may give the wrong result on AArch32
- * with banked registers.
+ * vcpu_get_reg and vcpu_set_reg should always be passed a register number
+ * coming from a read of ESR_EL2. Otherwise, it may give the wrong result on
+ * AArch32 with banked registers.
  */
-static inline unsigned long *vcpu_reg(const struct kvm_vcpu *vcpu, u8 reg_num)
-{
-   return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.regs[reg_num];
-}
-
 static inline unsigned long vcpu_get_reg(const struct kvm_vcpu *vcpu,
 u8 reg_num)
 {
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 68a0759..15f0477 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -37,7 +37,7 @@ static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
 {
int ret;
 
-   trace_kvm_hvc_arm64(*vcpu_pc(vcpu), *vcpu_reg(vcpu, 0),
+   trace_kvm_hvc_arm64(*vcpu_pc(vcpu), vcpu_get_reg(vcpu, 0),
kvm_vcpu_hvc_get_imm(vcpu));
 
ret = kvm_psci_call(vcpu);
-- 
2.4.4



[PATCH v3 0/4] KVM: arm64: BUG FIX: Correctly handle zero register transfers

2015-12-04 Thread Pavel Fedin
The ARM64 CPU has a zero register, which is read-only and always reads
as 0. However, KVM currently mistakes it for SP (because Rt == 31, and
in struct user_pt_regs the 'regs' array is followed by SP), so an
invalid value is read, or SP is even corrupted on write.

The problem was discovered by performing the operation

 *((volatile int *)reg) = 0;

which compiles to "str wzr, [xx]" and resulted in strange values being
written.
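
The aliasing described above can be illustrated with a small user-space
model. This is only a sketch: the struct mirrors the field order the
cover letter describes ('regs' array followed by SP), and the names
model_pt_regs, slot_offset and index31_aliases_sp are hypothetical, not
kernel API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct user_pt_regs: 31 general-purpose
 * registers (x0..x30) immediately followed by SP. */
struct model_pt_regs {
    uint64_t regs[31];
    uint64_t sp;
    uint64_t pc;
    uint64_t pstate;
};

/* Byte offset that a raw, unchecked regs[reg_num] access would touch. */
static size_t slot_offset(unsigned int reg_num)
{
    return offsetof(struct model_pt_regs, regs) + reg_num * sizeof(uint64_t);
}

/* Nonzero when index 31 lands exactly on the SP field. */
static int index31_aliases_sp(void)
{
    return slot_offset(31) == offsetof(struct model_pt_regs, sp);
}
```

With this layout an unchecked access using Rt == 31 reads or writes the
SP slot, which matches the symptom reported here.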

v2 => v3:
- Brought back some const modifiers in unaffected functions

v1 => v2:
- Changed type of transfer value to u64 and store it directly in
  struct sys_reg_params instead of a pointer
- Use lower_32_bits()/upper_32_bits() where appropriate
- Fixed wrong usage of 'Rt' instead of 'Rt2' in kvm_handle_cp_64(),
  overlooked in v1
- Do not write value back when reading

Pavel Fedin (4):
  KVM: arm64: Correctly handle zero register during MMIO
  KVM: arm64: Remove const from struct sys_reg_params
  KVM: arm64: Correctly handle zero register in system register accesses
  KVM: arm64: Get rid of old vcpu_reg()

 arch/arm/include/asm/kvm_emulate.h   |  12
 arch/arm/kvm/mmio.c                  |   5 +-
 arch/arm/kvm/psci.c                  |  20 +++---
 arch/arm64/include/asm/kvm_emulate.h |  18 +++--
 arch/arm64/kvm/handle_exit.c         |   2 +-
 arch/arm64/kvm/sys_regs.c            | 124 +--
 arch/arm64/kvm/sys_regs.h            |   8 +--
 arch/arm64/kvm/sys_regs_generic_v8.c |   4 +-
 8 files changed, 106 insertions(+), 87 deletions(-)

-- 
2.4.4



RE: [PATCH v2 0/4] KVM: arm64: BUG FIX: Correctly handle zero register transfers

2015-12-04 Thread Pavel Fedin
 Hello!

> Thanks a lot for respining this quickly. I just had a few minor
> comments, so this is almost ready to go. If you can fix that

 Damn, the rest of the reviews got stuck somewhere and arrived later, so I've
just sent v3 without the wrap fix. Will correct it.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




[PATCH v4 0/4] KVM: arm64: BUG FIX: Correctly handle zero register transfers

2015-12-04 Thread Pavel Fedin
The ARM64 CPU has a zero register, which is read-only and always reads
as 0. However, KVM currently mistakes it for SP (because Rt == 31, and
in struct user_pt_regs the 'regs' array is followed by SP), so an
invalid value is read, or SP is even corrupted on write.

The problem was discovered by performing the operation

 *((volatile int *)reg) = 0;

which compiles to "str wzr, [xx]" and resulted in strange values being
written.

v3 => v4:
- Unwrapped assignment in patch 0003

v2 => v3:
- Brought back some const modifiers in unaffected functions

v1 => v2:
- Changed type of transfer value to u64 and store it directly in
  struct sys_reg_params instead of a pointer
- Use lower_32_bits()/upper_32_bits() where appropriate
- Fixed wrong usage of 'Rt' instead of 'Rt2' in kvm_handle_cp_64(),
  overlooked in v1
- Do not write value back when reading
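
The 'Rt' versus 'Rt2' item concerns 64-bit coprocessor accesses, where
one 64-bit value travels in two 32-bit guest registers. A minimal sketch
of that merge and split follows, assuming Rt carries the low half and
Rt2 the high half. lower_32_bits()/upper_32_bits() are re-declared here
as plain functions so the example builds outside the kernel, and
merge_rt_rt2/split_to_rt_rt2 are hypothetical names.

```c
#include <stdint.h>

/* User-space equivalents of the kernel helpers of the same name. */
static uint32_t lower_32_bits(uint64_t v) { return (uint32_t)v; }
static uint32_t upper_32_bits(uint64_t v) { return (uint32_t)(v >> 32); }

/* Guest write: build the 64-bit transfer value from two registers. */
static uint64_t merge_rt_rt2(uint64_t rt, uint64_t rt2)
{
    return (rt & 0xffffffffULL) | (rt2 << 32);
}

/* Guest read: split the 64-bit value back into the two registers. */
static void split_to_rt_rt2(uint64_t regval, uint64_t *rt, uint64_t *rt2)
{
    *rt = lower_32_bits(regval);
    *rt2 = upper_32_bits(regval);
}
```

Passing Rt where Rt2 is meant would duplicate the low half into both
positions, which is the kind of slip that changelog item refers to.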

Pavel Fedin (4):
  KVM: arm64: Correctly handle zero register during MMIO
  KVM: arm64: Remove const from struct sys_reg_params
  KVM: arm64: Correctly handle zero register in system register accesses
  KVM: arm64: Get rid of old vcpu_reg()

 arch/arm/include/asm/kvm_emulate.h   |  12
 arch/arm/kvm/mmio.c                  |   5 +-
 arch/arm/kvm/psci.c                  |  20 +++---
 arch/arm64/include/asm/kvm_emulate.h |  18 +++--
 arch/arm64/kvm/handle_exit.c         |   2 +-
 arch/arm64/kvm/sys_regs.c            | 123 +--
 arch/arm64/kvm/sys_regs.h            |   8 +--
 arch/arm64/kvm/sys_regs_generic_v8.c |   4 +-
 8 files changed, 105 insertions(+), 87 deletions(-)

-- 
2.4.4



[PATCH v4 2/4] KVM: arm64: Remove const from struct sys_reg_params

2015-12-04 Thread Pavel Fedin
Further rework is going to introduce dedicated storage for the transfer
register value in struct sys_reg_params. Before doing this, we have to
remove the 'const' modifiers from it in all accessor functions and their
callers.

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 arch/arm64/kvm/sys_regs.c            | 36 ++++++++++++++++++------------------
 arch/arm64/kvm/sys_regs.h            |  4 ++--
 arch/arm64/kvm/sys_regs_generic_v8.c |  2 +-
 3 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 87a64e8..545a72a 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -78,7 +78,7 @@ static u32 get_ccsidr(u32 csselr)
  * See note at ARMv7 ARM B1.14.4 (TL;DR: S/W ops are not easily virtualized).
  */
 static bool access_dcsw(struct kvm_vcpu *vcpu,
-   const struct sys_reg_params *p,
+   struct sys_reg_params *p,
const struct sys_reg_desc *r)
 {
if (!p->is_write)
@@ -94,7 +94,7 @@ static bool access_dcsw(struct kvm_vcpu *vcpu,
  * sys_regs and leave it in complete control of the caches.
  */
 static bool access_vm_reg(struct kvm_vcpu *vcpu,
- const struct sys_reg_params *p,
+ struct sys_reg_params *p,
  const struct sys_reg_desc *r)
 {
unsigned long val;
@@ -122,7 +122,7 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu,
  * for both AArch64 and AArch32 accesses.
  */
 static bool access_gic_sgi(struct kvm_vcpu *vcpu,
-  const struct sys_reg_params *p,
+  struct sys_reg_params *p,
   const struct sys_reg_desc *r)
 {
u64 val;
@@ -137,7 +137,7 @@ static bool access_gic_sgi(struct kvm_vcpu *vcpu,
 }
 
 static bool trap_raz_wi(struct kvm_vcpu *vcpu,
-   const struct sys_reg_params *p,
+   struct sys_reg_params *p,
const struct sys_reg_desc *r)
 {
if (p->is_write)
@@ -147,7 +147,7 @@ static bool trap_raz_wi(struct kvm_vcpu *vcpu,
 }
 
 static bool trap_oslsr_el1(struct kvm_vcpu *vcpu,
-  const struct sys_reg_params *p,
+  struct sys_reg_params *p,
   const struct sys_reg_desc *r)
 {
if (p->is_write) {
@@ -159,7 +159,7 @@ static bool trap_oslsr_el1(struct kvm_vcpu *vcpu,
 }
 
 static bool trap_dbgauthstatus_el1(struct kvm_vcpu *vcpu,
-  const struct sys_reg_params *p,
+  struct sys_reg_params *p,
   const struct sys_reg_desc *r)
 {
if (p->is_write) {
@@ -200,7 +200,7 @@ static bool trap_dbgauthstatus_el1(struct kvm_vcpu *vcpu,
  *   now use the debug registers.
  */
 static bool trap_debug_regs(struct kvm_vcpu *vcpu,
-   const struct sys_reg_params *p,
+   struct sys_reg_params *p,
const struct sys_reg_desc *r)
 {
if (p->is_write) {
@@ -225,7 +225,7 @@ static bool trap_debug_regs(struct kvm_vcpu *vcpu,
  * hyp.S code switches between host and guest values in future.
  */
 static inline void reg_to_dbg(struct kvm_vcpu *vcpu,
- const struct sys_reg_params *p,
+ struct sys_reg_params *p,
  u64 *dbg_reg)
 {
u64 val = *vcpu_reg(vcpu, p->Rt);
@@ -240,7 +240,7 @@ static inline void reg_to_dbg(struct kvm_vcpu *vcpu,
 }
 
 static inline void dbg_to_reg(struct kvm_vcpu *vcpu,
- const struct sys_reg_params *p,
+ struct sys_reg_params *p,
  u64 *dbg_reg)
 {
u64 val = *dbg_reg;
@@ -252,7 +252,7 @@ static inline void dbg_to_reg(struct kvm_vcpu *vcpu,
 }
 
 static inline bool trap_bvr(struct kvm_vcpu *vcpu,
-   const struct sys_reg_params *p,
+   struct sys_reg_params *p,
const struct sys_reg_desc *rd)
 {
u64 *dbg_reg = &vcpu->arch.vcpu_debug_state.dbg_bvr[rd->reg];
@@ -294,7 +294,7 @@ static inline void reset_bvr(struct kvm_vcpu *vcpu,
 }
 
 static inline bool trap_bcr(struct kvm_vcpu *vcpu,
-   const struct sys_reg_params *p,
+   struct sys_reg_params *p,
const struct sys_reg_desc *rd)
 {
u64 *dbg_reg = &vcpu->arch.vcpu_debug_state.dbg_bcr[rd->reg];
@@ -337,7 +337,7 @@ static inline void reset_bcr(struct kvm_vcpu *vcpu,
 }
 
 static inline bool trap_wvr(struct kvm_vcpu *vcpu,
-   const struct sys_reg_params *p,
+   struct sys_reg_params *p,
const struct sys_reg_desc *rd)
 {
u64 *dbg_reg = &vcpu->arch.vcpu_debug_state.dbg_wvr[rd-&

[PATCH v4 1/4] KVM: arm64: Correctly handle zero register during MMIO

2015-12-04 Thread Pavel Fedin
On ARM64, register index 31 corresponds to both the zero register and SP.
However, all memory access instructions use ZR as the transfer register.
SP is used only as a base register in indirect memory addressing, or by
register-register arithmetic, which cannot be trapped here.

Correct emulation is achieved by introducing new register accessor
functions, which apply special handling for reg_num == 31. These new
accessors intentionally do not rely on the old vcpu_reg() on ARM64,
because it is to be removed. Since the affected code is shared by both
ARM flavours, implementations of these accessors are also added to the
ARM32 code.

This patch fixes an MMIO register being set to a random value (actually
SP) instead of zero by something like:

 *((volatile int *)reg) = 0;

Compilers tend to generate "str wzr, [xx]" here.
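
The accessor behaviour described above can be tried outside the kernel
with a small model. This is a sketch under stated assumptions, not the
patch itself: model_vcpu, model_get_reg, model_set_reg and
model_mmio_store_data are hypothetical names, and the register file is
reduced to x0..x30 followed by SP.

```c
#include <stdint.h>

/* Hypothetical model of the guest register file: x0..x30, then SP. */
struct model_vcpu {
    uint64_t regs[31];
    uint64_t sp;
};

/* Index 31 as a transfer register means the zero register: reads give 0.
 * reg_num is assumed to be a 5-bit field, so it never exceeds 31. */
static uint64_t model_get_reg(const struct model_vcpu *vcpu, uint8_t reg_num)
{
    return (reg_num == 31) ? 0 : vcpu->regs[reg_num];
}

/* Writes to the zero register are discarded; SP is never touched. */
static void model_set_reg(struct model_vcpu *vcpu, uint8_t reg_num,
                          uint64_t val)
{
    if (reg_num != 31)
        vcpu->regs[reg_num] = val;
}

/* Value an emulated 32-bit MMIO store would send for "str wN, [xM]". */
static uint32_t model_mmio_store_data(const struct model_vcpu *vcpu,
                                      uint8_t rt)
{
    return (uint32_t)model_get_reg(vcpu, rt);
}
```

In this model a store with rt == 31 always yields zero regardless of the
SP contents, and a load into register 31 leaves SP unchanged.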

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
Reviewed-by: Marc Zyngier <marc.zyng...@arm.com>
---
 arch/arm/include/asm/kvm_emulate.h   | 12 ++++++++++++
 arch/arm/kvm/mmio.c                  |  5 +++--
 arch/arm64/include/asm/kvm_emulate.h | 13 +++++++++++++
 3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/kvm_emulate.h b/arch/arm/include/asm/kvm_emulate.h
index a9c80a2..b7ff32e 100644
--- a/arch/arm/include/asm/kvm_emulate.h
+++ b/arch/arm/include/asm/kvm_emulate.h
@@ -28,6 +28,18 @@
 unsigned long *vcpu_reg(struct kvm_vcpu *vcpu, u8 reg_num);
 unsigned long *vcpu_spsr(struct kvm_vcpu *vcpu);
 
+static inline unsigned long vcpu_get_reg(const struct kvm_vcpu *vcpu,
+u8 reg_num)
+{
+   return *vcpu_reg(vcpu, reg_num);
+}
+
+static inline void vcpu_set_reg(const struct kvm_vcpu *vcpu, u8 reg_num,
+   unsigned long val)
+{
+   *vcpu_reg(vcpu, reg_num) = val;
+}
+
 bool kvm_condition_valid(struct kvm_vcpu *vcpu);
 void kvm_skip_instr(struct kvm_vcpu *vcpu, bool is_wide_instr);
 void kvm_inject_undefined(struct kvm_vcpu *vcpu);
diff --git a/arch/arm/kvm/mmio.c b/arch/arm/kvm/mmio.c
index 974b1c6..3a10c9f 100644
--- a/arch/arm/kvm/mmio.c
+++ b/arch/arm/kvm/mmio.c
@@ -115,7 +115,7 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu, struct kvm_run *run)
trace_kvm_mmio(KVM_TRACE_MMIO_READ, len, run->mmio.phys_addr,
   data);
data = vcpu_data_host_to_guest(vcpu, data, len);
-   *vcpu_reg(vcpu, vcpu->arch.mmio_decode.rt) = data;
+   vcpu_set_reg(vcpu, vcpu->arch.mmio_decode.rt, data);
}
 
return 0;
@@ -186,7 +186,8 @@ int io_mem_abort(struct kvm_vcpu *vcpu, struct kvm_run *run,
rt = vcpu->arch.mmio_decode.rt;
 
if (is_write) {
-   data = vcpu_data_guest_to_host(vcpu, *vcpu_reg(vcpu, rt), len);
+   data = vcpu_data_guest_to_host(vcpu, vcpu_get_reg(vcpu, rt),
+  len);
 
trace_kvm_mmio(KVM_TRACE_MMIO_WRITE, len, fault_ipa, data);
mmio_write_buf(data_buf, len, data);
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 3ca894e..5a182af 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -109,6 +109,19 @@ static inline unsigned long *vcpu_reg(const struct kvm_vcpu *vcpu, u8 reg_num)
return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.regs[reg_num];
 }
 
+static inline unsigned long vcpu_get_reg(const struct kvm_vcpu *vcpu,
+u8 reg_num)
+{
+   return (reg_num == 31) ? 0 : vcpu_gp_regs(vcpu)->regs.regs[reg_num];
+}
+
+static inline void vcpu_set_reg(struct kvm_vcpu *vcpu, u8 reg_num,
+   unsigned long val)
+{
+   if (reg_num != 31)
+   vcpu_gp_regs(vcpu)->regs.regs[reg_num] = val;
+}
+
 /* Get vcpu SPSR for current mode */
 static inline unsigned long *vcpu_spsr(const struct kvm_vcpu *vcpu)
 {
-- 
2.4.4



[PATCH v4 4/4] KVM: arm64: Get rid of old vcpu_reg()

2015-12-04 Thread Pavel Fedin
Using the old-style vcpu_reg() accessor has proven to be inappropriate
and unsafe on ARM64. This patch converts the remaining users to the new
accessors and completely removes vcpu_reg() on ARM64.
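
Several converted call sites keep the expression "& ~((u32) 0)" after
switching to vcpu_get_reg(). As a worked example of what that mask does
on a 64-bit register value, here is a stand-alone sketch; the u32
typedef is a stand-in for the kernel type and psci_fn_from_reg is a
hypothetical helper name.

```c
#include <stdint.h>

typedef uint32_t u32;   /* stand-in for the kernel's u32 */

/* Same masking expression as in the PSCI code, applied to 64 bits. */
static uint64_t psci_fn_from_reg(uint64_t reg0)
{
    /* ~((u32) 0) is 0xffffffff; it zero-extends when widened to
     * 64 bits, so only the low 32 bits of reg0 survive. */
    return reg0 & ~((u32) 0);
}
```

So any garbage in the upper half of the register is dropped before the
function ID is compared in the switch statement.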

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
Reviewed-by: Marc Zyngier <marc.zyng...@arm.com>
---
 arch/arm/kvm/psci.c                  | 20 ++++++++++----------
 arch/arm64/include/asm/kvm_emulate.h | 11 +++--------
 arch/arm64/kvm/handle_exit.c         |  2 +-
 3 files changed, 14 insertions(+), 19 deletions(-)

diff --git a/arch/arm/kvm/psci.c b/arch/arm/kvm/psci.c
index 0b55696..a9b3b90 100644
--- a/arch/arm/kvm/psci.c
+++ b/arch/arm/kvm/psci.c
@@ -75,7 +75,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
unsigned long context_id;
phys_addr_t target_pc;
 
-   cpu_id = *vcpu_reg(source_vcpu, 1) & MPIDR_HWID_BITMASK;
+   cpu_id = vcpu_get_reg(source_vcpu, 1) & MPIDR_HWID_BITMASK;
if (vcpu_mode_is_32bit(source_vcpu))
cpu_id &= ~((u32) 0);
 
@@ -94,8 +94,8 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
return PSCI_RET_INVALID_PARAMS;
}
 
-   target_pc = *vcpu_reg(source_vcpu, 2);
-   context_id = *vcpu_reg(source_vcpu, 3);
+   target_pc = vcpu_get_reg(source_vcpu, 2);
+   context_id = vcpu_get_reg(source_vcpu, 3);
 
kvm_reset_vcpu(vcpu);
 
@@ -114,7 +114,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
 * NOTE: We always update r0 (or x0) because for PSCI v0.1
 * the general puspose registers are undefined upon CPU_ON.
 */
-   *vcpu_reg(vcpu, 0) = context_id;
+   vcpu_set_reg(vcpu, 0, context_id);
vcpu->arch.power_off = false;
smp_mb();   /* Make sure the above is visible */
 
@@ -134,8 +134,8 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
struct kvm *kvm = vcpu->kvm;
struct kvm_vcpu *tmp;
 
-   target_affinity = *vcpu_reg(vcpu, 1);
-   lowest_affinity_level = *vcpu_reg(vcpu, 2);
+   target_affinity = vcpu_get_reg(vcpu, 1);
+   lowest_affinity_level = vcpu_get_reg(vcpu, 2);
 
/* Determine target affinity mask */
target_affinity_mask = psci_affinity_mask(lowest_affinity_level);
@@ -209,7 +209,7 @@ int kvm_psci_version(struct kvm_vcpu *vcpu)
 static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
 {
int ret = 1;
-   unsigned long psci_fn = *vcpu_reg(vcpu, 0) & ~((u32) 0);
+   unsigned long psci_fn = vcpu_get_reg(vcpu, 0) & ~((u32) 0);
unsigned long val;
 
switch (psci_fn) {
@@ -273,13 +273,13 @@ static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
break;
}
 
-   *vcpu_reg(vcpu, 0) = val;
+   vcpu_set_reg(vcpu, 0, val);
return ret;
 }
 
 static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
 {
-   unsigned long psci_fn = *vcpu_reg(vcpu, 0) & ~((u32) 0);
+   unsigned long psci_fn = vcpu_get_reg(vcpu, 0) & ~((u32) 0);
unsigned long val;
 
switch (psci_fn) {
@@ -295,7 +295,7 @@ static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
break;
}
 
-   *vcpu_reg(vcpu, 0) = val;
+   vcpu_set_reg(vcpu, 0, val);
return 1;
 }
 
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 5a182af..25a4021 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -100,15 +100,10 @@ static inline void vcpu_set_thumb(struct kvm_vcpu *vcpu)
 }
 
 /*
- * vcpu_reg should always be passed a register number coming from a
- * read of ESR_EL2. Otherwise, it may give the wrong result on AArch32
- * with banked registers.
+ * vcpu_get_reg and vcpu_set_reg should always be passed a register number
+ * coming from a read of ESR_EL2. Otherwise, it may give the wrong result on
+ * AArch32 with banked registers.
  */
-static inline unsigned long *vcpu_reg(const struct kvm_vcpu *vcpu, u8 reg_num)
-{
-   return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.regs[reg_num];
-}
-
 static inline unsigned long vcpu_get_reg(const struct kvm_vcpu *vcpu,
 u8 reg_num)
 {
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 68a0759..15f0477 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -37,7 +37,7 @@ static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
 {
int ret;
 
-   trace_kvm_hvc_arm64(*vcpu_pc(vcpu), *vcpu_reg(vcpu, 0),
+   trace_kvm_hvc_arm64(*vcpu_pc(vcpu), vcpu_get_reg(vcpu, 0),
kvm_vcpu_hvc_get_imm(vcpu));
 
ret = kvm_psci_call(vcpu);
-- 
2.4.4



[PATCH v4 3/4] KVM: arm64: Correctly handle zero register in system register accesses

2015-12-04 Thread Pavel Fedin
System register accesses also use the zero register for Rt == 31, so
going through vcpu_reg() there likewise yields the SP value instead.
This patch makes them use the new accessors introduced earlier in this
series. Since the register value is no longer directly associated with
storage inside the vCPU context structure, we introduce dedicated
storage for it in struct sys_reg_params.

This refactoring also gets rid of the "massive hack" in kvm_handle_cp_64().
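
The data flow with dedicated storage can be sketched in user space as
follows. Everything here is a simplified assumption for illustration:
model_vcpu, model_params, model_emulate and model_trap_oslsr are
hypothetical names, and the real struct sys_reg_params has more fields
than shown.

```c
#include <stdbool.h>
#include <stdint.h>

struct model_vcpu {
    uint64_t regs[31];  /* x0..x30 */
    uint64_t sp;
};

/* Hypothetical cut-down struct sys_reg_params with its own storage. */
struct model_params {
    uint8_t Rt;
    bool is_write;
    uint64_t regval;
};

typedef bool (*model_access_fn)(struct model_vcpu *, struct model_params *);

static uint64_t model_get_reg(const struct model_vcpu *vcpu, uint8_t n)
{
    return (n == 31) ? 0 : vcpu->regs[n];
}

static void model_set_reg(struct model_vcpu *vcpu, uint8_t n, uint64_t val)
{
    if (n != 31)
        vcpu->regs[n] = val;
}

/* Handler in the style of trap_oslsr_el1: reads return 1 << 3. */
static bool model_trap_oslsr(struct model_vcpu *vcpu, struct model_params *p)
{
    (void)vcpu;
    if (!p->is_write)
        p->regval = 1 << 3;
    return true;        /* writes are ignored */
}

/* Load regval before a guest write, store it back after a guest read. */
static bool model_emulate(struct model_vcpu *vcpu, struct model_params *p,
                          model_access_fn access)
{
    if (p->is_write)
        p->regval = model_get_reg(vcpu, p->Rt);
    if (!access(vcpu, p))
        return false;
    if (!p->is_write)
        model_set_reg(vcpu, p->Rt, p->regval);
    return true;
}
```

Handlers only ever see p->regval, so the zero-register special case
lives in one place, the dispatcher, instead of in every handler.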

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
Reviewed-by: Marc Zyngier <marc.zyng...@arm.com>
---
 arch/arm64/kvm/sys_regs.c            | 87 +---
 arch/arm64/kvm/sys_regs.h            |  4 +-
 arch/arm64/kvm/sys_regs_generic_v8.c |  2 +-
 3 files changed, 45 insertions(+), 48 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 545a72a..d2650e8 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -97,18 +97,16 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu,
  struct sys_reg_params *p,
  const struct sys_reg_desc *r)
 {
-   unsigned long val;
bool was_enabled = vcpu_has_cache_enabled(vcpu);
 
BUG_ON(!p->is_write);
 
-   val = *vcpu_reg(vcpu, p->Rt);
if (!p->is_aarch32) {
-   vcpu_sys_reg(vcpu, r->reg) = val;
+   vcpu_sys_reg(vcpu, r->reg) = p->regval;
} else {
if (!p->is_32bit)
-   vcpu_cp15_64_high(vcpu, r->reg) = val >> 32;
-   vcpu_cp15_64_low(vcpu, r->reg) = val & 0xffffffffUL;
+   vcpu_cp15_64_high(vcpu, r->reg) = upper_32_bits(p->regval);
+   vcpu_cp15_64_low(vcpu, r->reg) = lower_32_bits(p->regval);
}
 
kvm_toggle_cache(vcpu, was_enabled);
@@ -125,13 +123,10 @@ static bool access_gic_sgi(struct kvm_vcpu *vcpu,
   struct sys_reg_params *p,
   const struct sys_reg_desc *r)
 {
-   u64 val;
-
if (!p->is_write)
return read_from_write_only(vcpu, p);
 
-   val = *vcpu_reg(vcpu, p->Rt);
-   vgic_v3_dispatch_sgi(vcpu, val);
+   vgic_v3_dispatch_sgi(vcpu, p->regval);
 
return true;
 }
@@ -153,7 +148,7 @@ static bool trap_oslsr_el1(struct kvm_vcpu *vcpu,
if (p->is_write) {
return ignore_write(vcpu, p);
} else {
-   *vcpu_reg(vcpu, p->Rt) = (1 << 3);
+   p->regval = (1 << 3);
return true;
}
 }
@@ -167,7 +162,7 @@ static bool trap_dbgauthstatus_el1(struct kvm_vcpu *vcpu,
} else {
u32 val;
asm volatile("mrs %0, dbgauthstatus_el1" : "=r" (val));
-   *vcpu_reg(vcpu, p->Rt) = val;
+   p->regval = val;
return true;
}
 }
@@ -204,13 +199,13 @@ static bool trap_debug_regs(struct kvm_vcpu *vcpu,
const struct sys_reg_desc *r)
 {
if (p->is_write) {
-   vcpu_sys_reg(vcpu, r->reg) = *vcpu_reg(vcpu, p->Rt);
+   vcpu_sys_reg(vcpu, r->reg) = p->regval;
vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
} else {
-   *vcpu_reg(vcpu, p->Rt) = vcpu_sys_reg(vcpu, r->reg);
+   p->regval = vcpu_sys_reg(vcpu, r->reg);
}
 
-   trace_trap_reg(__func__, r->reg, p->is_write, *vcpu_reg(vcpu, p->Rt));
+   trace_trap_reg(__func__, r->reg, p->is_write, p->regval);
 
return true;
 }
@@ -228,7 +223,7 @@ static inline void reg_to_dbg(struct kvm_vcpu *vcpu,
  struct sys_reg_params *p,
  u64 *dbg_reg)
 {
-   u64 val = *vcpu_reg(vcpu, p->Rt);
+   u64 val = p->regval;
 
if (p->is_32bit) {
val &= 0xffffffffUL;
@@ -243,12 +238,9 @@ static inline void dbg_to_reg(struct kvm_vcpu *vcpu,
  struct sys_reg_params *p,
  u64 *dbg_reg)
 {
-   u64 val = *dbg_reg;
-
+   p->regval = *dbg_reg;
if (p->is_32bit)
-   val &= 0xffffffffUL;
-
-   *vcpu_reg(vcpu, p->Rt) = val;
+   p->regval &= 0xffffffffUL;
 }
 
 static inline bool trap_bvr(struct kvm_vcpu *vcpu,
@@ -697,10 +689,10 @@ static bool trap_dbgidr(struct kvm_vcpu *vcpu,
u64 pfr = read_system_reg(SYS_ID_AA64PFR0_EL1);
u32 el3 = !!cpuid_feature_extract_field(pfr, ID_AA64PFR0_EL3_SHIFT);
 
-   *vcpu_reg(vcpu, p->Rt) = ((((dfr >> ID_AA64DFR0_WRPS_SHIFT) & 0xf) << 28) |
- (((dfr >> ID_AA64DFR0_BRPS_SHIFT) & 0xf) << 24) |
- (((dfr &g

[PATCH v2 4/4] KVM: arm64: Get rid of old vcpu_reg()

2015-12-04 Thread Pavel Fedin
Using the old-style vcpu_reg() accessor has proven to be inappropriate
and unsafe on ARM64. This patch converts the remaining users to the new
accessors and completely removes vcpu_reg() on ARM64.

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 arch/arm/kvm/psci.c                  | 20 ++++++++++----------
 arch/arm64/include/asm/kvm_emulate.h | 11 +++--------
 arch/arm64/kvm/handle_exit.c         |  2 +-
 3 files changed, 14 insertions(+), 19 deletions(-)

diff --git a/arch/arm/kvm/psci.c b/arch/arm/kvm/psci.c
index 0b55696..a9b3b90 100644
--- a/arch/arm/kvm/psci.c
+++ b/arch/arm/kvm/psci.c
@@ -75,7 +75,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
unsigned long context_id;
phys_addr_t target_pc;
 
-   cpu_id = *vcpu_reg(source_vcpu, 1) & MPIDR_HWID_BITMASK;
+   cpu_id = vcpu_get_reg(source_vcpu, 1) & MPIDR_HWID_BITMASK;
if (vcpu_mode_is_32bit(source_vcpu))
cpu_id &= ~((u32) 0);
 
@@ -94,8 +94,8 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
return PSCI_RET_INVALID_PARAMS;
}
 
-   target_pc = *vcpu_reg(source_vcpu, 2);
-   context_id = *vcpu_reg(source_vcpu, 3);
+   target_pc = vcpu_get_reg(source_vcpu, 2);
+   context_id = vcpu_get_reg(source_vcpu, 3);
 
kvm_reset_vcpu(vcpu);
 
@@ -114,7 +114,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
 * NOTE: We always update r0 (or x0) because for PSCI v0.1
 * the general puspose registers are undefined upon CPU_ON.
 */
-   *vcpu_reg(vcpu, 0) = context_id;
+   vcpu_set_reg(vcpu, 0, context_id);
vcpu->arch.power_off = false;
smp_mb();   /* Make sure the above is visible */
 
@@ -134,8 +134,8 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
struct kvm *kvm = vcpu->kvm;
struct kvm_vcpu *tmp;
 
-   target_affinity = *vcpu_reg(vcpu, 1);
-   lowest_affinity_level = *vcpu_reg(vcpu, 2);
+   target_affinity = vcpu_get_reg(vcpu, 1);
+   lowest_affinity_level = vcpu_get_reg(vcpu, 2);
 
/* Determine target affinity mask */
target_affinity_mask = psci_affinity_mask(lowest_affinity_level);
@@ -209,7 +209,7 @@ int kvm_psci_version(struct kvm_vcpu *vcpu)
 static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
 {
int ret = 1;
-   unsigned long psci_fn = *vcpu_reg(vcpu, 0) & ~((u32) 0);
+   unsigned long psci_fn = vcpu_get_reg(vcpu, 0) & ~((u32) 0);
unsigned long val;
 
switch (psci_fn) {
@@ -273,13 +273,13 @@ static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
break;
}
 
-   *vcpu_reg(vcpu, 0) = val;
+   vcpu_set_reg(vcpu, 0, val);
return ret;
 }
 
 static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
 {
-   unsigned long psci_fn = *vcpu_reg(vcpu, 0) & ~((u32) 0);
+   unsigned long psci_fn = vcpu_get_reg(vcpu, 0) & ~((u32) 0);
unsigned long val;
 
switch (psci_fn) {
@@ -295,7 +295,7 @@ static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
break;
}
 
-   *vcpu_reg(vcpu, 0) = val;
+   vcpu_set_reg(vcpu, 0, val);
return 1;
 }
 
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 5a182af..25a4021 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -100,15 +100,10 @@ static inline void vcpu_set_thumb(struct kvm_vcpu *vcpu)
 }
 
 /*
- * vcpu_reg should always be passed a register number coming from a
- * read of ESR_EL2. Otherwise, it may give the wrong result on AArch32
- * with banked registers.
+ * vcpu_get_reg and vcpu_set_reg should always be passed a register number
+ * coming from a read of ESR_EL2. Otherwise, it may give the wrong result on
+ * AArch32 with banked registers.
  */
-static inline unsigned long *vcpu_reg(const struct kvm_vcpu *vcpu, u8 reg_num)
-{
-   return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.regs[reg_num];
-}
-
 static inline unsigned long vcpu_get_reg(const struct kvm_vcpu *vcpu,
 u8 reg_num)
 {
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 68a0759..15f0477 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -37,7 +37,7 @@ static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
 {
int ret;
 
-   trace_kvm_hvc_arm64(*vcpu_pc(vcpu), *vcpu_reg(vcpu, 0),
+   trace_kvm_hvc_arm64(*vcpu_pc(vcpu), vcpu_get_reg(vcpu, 0),
kvm_vcpu_hvc_get_imm(vcpu));
 
ret = kvm_psci_call(vcpu);
-- 
2.4.4



[PATCH v2 0/4] KVM: arm64: BUG FIX: Correctly handle zero register transfers

2015-12-04 Thread Pavel Fedin
The ARM64 CPU has a zero register, which is read-only and always reads
as 0. However, KVM currently mistakes it for SP (because Rt == 31, and
in struct user_pt_regs the 'regs' array is followed by SP), so an
invalid value is read, or SP is even corrupted on write.

The problem was discovered by performing the operation

 *((volatile int *)reg) = 0;

which compiles to "str wzr, [xx]" and resulted in strange values being
written.

v1 => v2:
- Changed type of transfer value to u64 and store it directly in
  struct sys_reg_params instead of a pointer
- Use lower_32_bits()/upper_32_bits() where appropriate
- Fixed wrong usage of 'Rt' instead of 'Rt2' in kvm_handle_cp_64(),
  overlooked in v1
- Do not write value back when reading

Pavel Fedin (4):
  KVM: arm64: Correctly handle zero register during MMIO
  KVM: arm64: Remove const from struct sys_reg_params
  KVM: arm64: Correctly handle zero register in system register accesses
  KVM: arm64: Get rid of old vcpu_reg()

 arch/arm/include/asm/kvm_emulate.h   |  12
 arch/arm/kvm/mmio.c                  |   5 +-
 arch/arm/kvm/psci.c                  |  20 +++---
 arch/arm64/include/asm/kvm_emulate.h |  18 +++--
 arch/arm64/kvm/handle_exit.c         |   2 +-
 arch/arm64/kvm/sys_regs.c            | 126 +--
 arch/arm64/kvm/sys_regs.h            |  16 ++---
 arch/arm64/kvm/sys_regs_generic_v8.c |   4 +-
 8 files changed, 111 insertions(+), 92 deletions(-)

-- 
2.4.4



[PATCH v2 3/4] KVM: arm64: Correctly handle zero register in system register accesses

2015-12-04 Thread Pavel Fedin
System register accesses also use the zero register for Rt == 31, so
going through vcpu_reg() there likewise yields the SP value instead.
This patch makes them use the new accessors introduced earlier in this
series. Since the register value is no longer directly associated with
storage inside the vCPU context structure, we introduce dedicated
storage for it in struct sys_reg_params.

This refactoring also gets rid of the "massive hack" in kvm_handle_cp_64().

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 arch/arm64/kvm/sys_regs.c            | 88 ++--
 arch/arm64/kvm/sys_regs.h            |  4 +-
 arch/arm64/kvm/sys_regs_generic_v8.c |  2 +-
 3 files changed, 46 insertions(+), 48 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index e5f024e..7c9cd64 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -97,18 +97,17 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu,
  struct sys_reg_params *p,
  const struct sys_reg_desc *r)
 {
-   unsigned long val;
bool was_enabled = vcpu_has_cache_enabled(vcpu);
 
BUG_ON(!p->is_write);
 
-   val = *vcpu_reg(vcpu, p->Rt);
if (!p->is_aarch32) {
-   vcpu_sys_reg(vcpu, r->reg) = val;
+   vcpu_sys_reg(vcpu, r->reg) = p->regval;
} else {
if (!p->is_32bit)
-   vcpu_cp15_64_high(vcpu, r->reg) = val >> 32;
-   vcpu_cp15_64_low(vcpu, r->reg) = val & 0xffffffffUL;
+   vcpu_cp15_64_high(vcpu, r->reg) =
+   upper_32_bits(p->regval);
+   vcpu_cp15_64_low(vcpu, r->reg) = lower_32_bits(p->regval);
}
 
kvm_toggle_cache(vcpu, was_enabled);
@@ -125,13 +124,10 @@ static bool access_gic_sgi(struct kvm_vcpu *vcpu,
   struct sys_reg_params *p,
   const struct sys_reg_desc *r)
 {
-   u64 val;
-
if (!p->is_write)
return read_from_write_only(vcpu, p);
 
-   val = *vcpu_reg(vcpu, p->Rt);
-   vgic_v3_dispatch_sgi(vcpu, val);
+   vgic_v3_dispatch_sgi(vcpu, p->regval);
 
return true;
 }
@@ -153,7 +149,7 @@ static bool trap_oslsr_el1(struct kvm_vcpu *vcpu,
if (p->is_write) {
return ignore_write(vcpu, p);
} else {
-   *vcpu_reg(vcpu, p->Rt) = (1 << 3);
+   p->regval = (1 << 3);
return true;
}
 }
@@ -167,7 +163,7 @@ static bool trap_dbgauthstatus_el1(struct kvm_vcpu *vcpu,
} else {
u32 val;
asm volatile("mrs %0, dbgauthstatus_el1" : "=r" (val));
-   *vcpu_reg(vcpu, p->Rt) = val;
+   p->regval = val;
return true;
}
 }
@@ -204,13 +200,13 @@ static bool trap_debug_regs(struct kvm_vcpu *vcpu,
const struct sys_reg_desc *r)
 {
if (p->is_write) {
-   vcpu_sys_reg(vcpu, r->reg) = *vcpu_reg(vcpu, p->Rt);
+   vcpu_sys_reg(vcpu, r->reg) = p->regval;
vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
} else {
-   *vcpu_reg(vcpu, p->Rt) = vcpu_sys_reg(vcpu, r->reg);
+   p->regval = vcpu_sys_reg(vcpu, r->reg);
}
 
-   trace_trap_reg(__func__, r->reg, p->is_write, *vcpu_reg(vcpu, p->Rt));
+   trace_trap_reg(__func__, r->reg, p->is_write, p->regval);
 
return true;
 }
@@ -228,7 +224,7 @@ static inline void reg_to_dbg(struct kvm_vcpu *vcpu,
  struct sys_reg_params *p,
  u64 *dbg_reg)
 {
-   u64 val = *vcpu_reg(vcpu, p->Rt);
+   u64 val = p->regval;
 
if (p->is_32bit) {
val &= 0xffffffffUL;
@@ -243,12 +239,9 @@ static inline void dbg_to_reg(struct kvm_vcpu *vcpu,
  struct sys_reg_params *p,
  u64 *dbg_reg)
 {
-   u64 val = *dbg_reg;
-
+   p->regval = *dbg_reg;
if (p->is_32bit)
-   val &= 0xffffffffUL;
-
-   *vcpu_reg(vcpu, p->Rt) = val;
+   p->regval &= 0xffffffffUL;
 }
 
 static inline bool trap_bvr(struct kvm_vcpu *vcpu,
@@ -697,10 +690,10 @@ static bool trap_dbgidr(struct kvm_vcpu *vcpu,
u64 pfr = read_system_reg(SYS_ID_AA64PFR0_EL1);
u32 el3 = !!cpuid_feature_extract_field(pfr, ID_AA64PFR0_EL3_SHIFT);
 
-   *vcpu_reg(vcpu, p->Rt) = ((((dfr >> ID_AA64DFR0_WRPS_SHIFT) & 0xf) << 28) |
- (((dfr >> ID_AA64DFR0_BRPS_SHIFT) & 0xf) << 24) |
- (((dfr >> ID_AA64DFR0_CTX_CMPS_SHIF

[PATCH v2 1/4] KVM: arm64: Correctly handle zero register during MMIO

2015-12-04 Thread Pavel Fedin
On ARM64, register index 31 corresponds to both the zero register and SP.
However, all memory access instructions use ZR as the transfer register.
SP is used only as a base register in indirect memory addressing, or by
register-register arithmetic, which cannot be trapped here.

Correct emulation is achieved by introducing new register accessor
functions, which apply special handling for reg_num == 31. These new
accessors intentionally do not rely on the old vcpu_reg() on ARM64,
because it is to be removed. Since the affected code is shared by both
ARM flavours, implementations of these accessors are also added to the
ARM32 code.

This patch fixes an MMIO register being set to a random value (actually
SP) instead of zero by something like:

 *((volatile int *)reg) = 0;

Compilers tend to generate "str wzr, [xx]" here.

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
Reviewed-by: Marc Zyngier <marc.zyng...@arm.com>
---
 arch/arm/include/asm/kvm_emulate.h   | 12 ++++++++++++
 arch/arm/kvm/mmio.c                  |  5 +++--
 arch/arm64/include/asm/kvm_emulate.h | 13 +++++++++++++
 3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/kvm_emulate.h b/arch/arm/include/asm/kvm_emulate.h
index a9c80a2..b7ff32e 100644
--- a/arch/arm/include/asm/kvm_emulate.h
+++ b/arch/arm/include/asm/kvm_emulate.h
@@ -28,6 +28,18 @@
 unsigned long *vcpu_reg(struct kvm_vcpu *vcpu, u8 reg_num);
 unsigned long *vcpu_spsr(struct kvm_vcpu *vcpu);
 
+static inline unsigned long vcpu_get_reg(const struct kvm_vcpu *vcpu,
+u8 reg_num)
+{
+   return *vcpu_reg(vcpu, reg_num);
+}
+
+static inline void vcpu_set_reg(const struct kvm_vcpu *vcpu, u8 reg_num,
+   unsigned long val)
+{
+   *vcpu_reg(vcpu, reg_num) = val;
+}
+
 bool kvm_condition_valid(struct kvm_vcpu *vcpu);
 void kvm_skip_instr(struct kvm_vcpu *vcpu, bool is_wide_instr);
 void kvm_inject_undefined(struct kvm_vcpu *vcpu);
diff --git a/arch/arm/kvm/mmio.c b/arch/arm/kvm/mmio.c
index 974b1c6..3a10c9f 100644
--- a/arch/arm/kvm/mmio.c
+++ b/arch/arm/kvm/mmio.c
@@ -115,7 +115,7 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu, struct kvm_run *run)
trace_kvm_mmio(KVM_TRACE_MMIO_READ, len, run->mmio.phys_addr,
   data);
data = vcpu_data_host_to_guest(vcpu, data, len);
-   *vcpu_reg(vcpu, vcpu->arch.mmio_decode.rt) = data;
+   vcpu_set_reg(vcpu, vcpu->arch.mmio_decode.rt, data);
}
 
return 0;
@@ -186,7 +186,8 @@ int io_mem_abort(struct kvm_vcpu *vcpu, struct kvm_run *run,
rt = vcpu->arch.mmio_decode.rt;
 
if (is_write) {
-   data = vcpu_data_guest_to_host(vcpu, *vcpu_reg(vcpu, rt), len);
+   data = vcpu_data_guest_to_host(vcpu, vcpu_get_reg(vcpu, rt),
+  len);
 
trace_kvm_mmio(KVM_TRACE_MMIO_WRITE, len, fault_ipa, data);
mmio_write_buf(data_buf, len, data);
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 3ca894e..5a182af 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -109,6 +109,19 @@ static inline unsigned long *vcpu_reg(const struct kvm_vcpu *vcpu, u8 reg_num)
return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.regs[reg_num];
 }
 
+static inline unsigned long vcpu_get_reg(const struct kvm_vcpu *vcpu,
+u8 reg_num)
+{
+   return (reg_num == 31) ? 0 : vcpu_gp_regs(vcpu)->regs.regs[reg_num];
+}
+
+static inline void vcpu_set_reg(struct kvm_vcpu *vcpu, u8 reg_num,
+   unsigned long val)
+{
+   if (reg_num != 31)
+   vcpu_gp_regs(vcpu)->regs.regs[reg_num] = val;
+}
+
 /* Get vcpu SPSR for current mode */
 static inline unsigned long *vcpu_spsr(const struct kvm_vcpu *vcpu)
 {
-- 
2.4.4

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 2/4] KVM: arm64: Remove const from struct sys_reg_params

2015-12-04 Thread Pavel Fedin
Further rework is going to introduce dedicated storage for the transfer
register value in struct sys_reg_params. Before doing this we have to
remove all 'const' modifiers from it.

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 arch/arm64/kvm/sys_regs.c| 38 ++--
 arch/arm64/kvm/sys_regs.h| 12 ++--
 arch/arm64/kvm/sys_regs_generic_v8.c |  2 +-
 3 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 87a64e8..e5f024e 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -78,7 +78,7 @@ static u32 get_ccsidr(u32 csselr)
  * See note at ARMv7 ARM B1.14.4 (TL;DR: S/W ops are not easily virtualized).
  */
 static bool access_dcsw(struct kvm_vcpu *vcpu,
-   const struct sys_reg_params *p,
+   struct sys_reg_params *p,
const struct sys_reg_desc *r)
 {
if (!p->is_write)
@@ -94,7 +94,7 @@ static bool access_dcsw(struct kvm_vcpu *vcpu,
  * sys_regs and leave it in complete control of the caches.
  */
 static bool access_vm_reg(struct kvm_vcpu *vcpu,
- const struct sys_reg_params *p,
+ struct sys_reg_params *p,
  const struct sys_reg_desc *r)
 {
unsigned long val;
@@ -122,7 +122,7 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu,
  * for both AArch64 and AArch32 accesses.
  */
 static bool access_gic_sgi(struct kvm_vcpu *vcpu,
-  const struct sys_reg_params *p,
+  struct sys_reg_params *p,
   const struct sys_reg_desc *r)
 {
u64 val;
@@ -137,7 +137,7 @@ static bool access_gic_sgi(struct kvm_vcpu *vcpu,
 }
 
 static bool trap_raz_wi(struct kvm_vcpu *vcpu,
-   const struct sys_reg_params *p,
+   struct sys_reg_params *p,
const struct sys_reg_desc *r)
 {
if (p->is_write)
@@ -147,7 +147,7 @@ static bool trap_raz_wi(struct kvm_vcpu *vcpu,
 }
 
 static bool trap_oslsr_el1(struct kvm_vcpu *vcpu,
-  const struct sys_reg_params *p,
+  struct sys_reg_params *p,
   const struct sys_reg_desc *r)
 {
if (p->is_write) {
@@ -159,7 +159,7 @@ static bool trap_oslsr_el1(struct kvm_vcpu *vcpu,
 }
 
 static bool trap_dbgauthstatus_el1(struct kvm_vcpu *vcpu,
-  const struct sys_reg_params *p,
+  struct sys_reg_params *p,
   const struct sys_reg_desc *r)
 {
if (p->is_write) {
@@ -200,7 +200,7 @@ static bool trap_dbgauthstatus_el1(struct kvm_vcpu *vcpu,
  *   now use the debug registers.
  */
 static bool trap_debug_regs(struct kvm_vcpu *vcpu,
-   const struct sys_reg_params *p,
+   struct sys_reg_params *p,
const struct sys_reg_desc *r)
 {
if (p->is_write) {
@@ -225,7 +225,7 @@ static bool trap_debug_regs(struct kvm_vcpu *vcpu,
  * hyp.S code switches between host and guest values in future.
  */
 static inline void reg_to_dbg(struct kvm_vcpu *vcpu,
- const struct sys_reg_params *p,
+ struct sys_reg_params *p,
  u64 *dbg_reg)
 {
u64 val = *vcpu_reg(vcpu, p->Rt);
@@ -240,7 +240,7 @@ static inline void reg_to_dbg(struct kvm_vcpu *vcpu,
 }
 
 static inline void dbg_to_reg(struct kvm_vcpu *vcpu,
- const struct sys_reg_params *p,
+ struct sys_reg_params *p,
  u64 *dbg_reg)
 {
u64 val = *dbg_reg;
@@ -252,7 +252,7 @@ static inline void dbg_to_reg(struct kvm_vcpu *vcpu,
 }
 
 static inline bool trap_bvr(struct kvm_vcpu *vcpu,
-   const struct sys_reg_params *p,
+   struct sys_reg_params *p,
const struct sys_reg_desc *rd)
 {
u64 *dbg_reg = &vcpu->arch.vcpu_debug_state.dbg_bvr[rd->reg];
@@ -294,7 +294,7 @@ static inline void reset_bvr(struct kvm_vcpu *vcpu,
 }
 
 static inline bool trap_bcr(struct kvm_vcpu *vcpu,
-   const struct sys_reg_params *p,
+   struct sys_reg_params *p,
const struct sys_reg_desc *rd)
 {
u64 *dbg_reg = &vcpu->arch.vcpu_debug_state.dbg_bcr[rd->reg];
@@ -337,7 +337,7 @@ static inline void reset_bcr(struct kvm_vcpu *vcpu,
 }
 
 static inline bool trap_wvr(struct kvm_vcpu *vcpu,
-   const struct sys_reg_params *p,
+   struct sys_reg_params *p,
const struct sys_reg_desc *rd)
 {
u64 *dbg_reg = &vcpu->arch.vcpu_debug_state.dbg_wvr[rd->reg];
@@ -380,7 +380,7 @@ st

RE: [PATCH v2 2/4] KVM: arm64: Remove const from struct sys_reg_params

2015-12-04 Thread Pavel Fedin
 Hello!

> I think you are being a bit overzealous here, and a few const can
> legitimately be kept, see below.

 :) Yes, i've just commanded "search and replace" to the editor. Fixing...

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




RE: [PATCH] KVM: arm/arm64: Revert to old way of checking for device mapping in stage2_flush_ptes().

2015-12-03 Thread Pavel Fedin
 Hello!

> > > Cc: sta...@vger.kernel.org
> > > Fixes: e6fab5442345 ("ARM/arm64: KVM: test properly for a PTE's uncachedness")
> > >
> >
> > That commit is not in a release yet, so no need for cc stable
> [...]
> 
> But it is cc'd to stable, so unless it is going to be nacked at review
> stage, any subsequent fixes should also be cc'd.

 Sorry guys for messing things up a bit, but the affected commit is actually in 
the stable branch (4.4-rc3), so i decided to Cc: stable, just in case, because 
the breakage is quite big IMHO.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia



RE: [PATCH 2/3] KVM: arm64: Correctly handle zero register in system register accesses

2015-12-03 Thread Pavel Fedin
 Hello!

> > diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> > index 87a64e8..a667228 100644
> > --- a/arch/arm64/kvm/sys_regs.c
> > +++ b/arch/arm64/kvm/sys_regs.c
> > @@ -102,7 +102,7 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu,
> >
> > BUG_ON(!p->is_write);
> >
> > -   val = *vcpu_reg(vcpu, p->Rt);
> > +   val = *p->val;
> 
> Why does it have to be a pointer? You could just have "val = p->val" if
> you carried the actual value instead of a pointer to the stack variable
> holding that value.

 There's only one argument for the pointer approach. This refactoring is 
actually based on my vGICv3 live migration API patch set:
http://www.spinics.net/lists/kvm/msg124205.html
http://www.spinics.net/lists/kvm/msg124202.html

 It's simply more convenient to use a pointer for exchange with userspace, see 
vgic_v3_cpu_regs_access() and its callers. I wouldn't
like to refactor the code again. What's your opinion on this?
 And of course i'll fix up the rest.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




RE: [PATCH 2/3] KVM: arm64: Correctly handle zero register in system register accesses

2015-12-03 Thread Pavel Fedin
 Hello!

> > It's simply more convenient to use a pointer for exchange with
> > userspace, see vgic_v3_cpu_regs_access() and callers. I wouldn't like
> > to refactor the code again. What's your opinion on this?
> 
> I still don't think this is a good idea. You can still store the value
> as an integer in vgic_v3_cpu_regs_access(), and check the write property
> to do the writeback on read. Which is the same thing I asked for in this
> patch.

 Started doing this and found one more (big) reason against it. All sysreg 
handlers take 'const struct sys_reg_params *', and so do their
callers, and their callers... This 'const' is all over the code, and it would 
take a separate huge patch to un-const'ify all of it.
Is it worth that?
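
 For illustration only (not part of the series): a minimal standalone sketch of
the approach suggested in the quoted text, i.e. carrying the transfer value as
an integer inside the params structure and writing it back to the guest
register after a read. All toy_* names and the 'regval' field are hypothetical
stand-ins, not the kernel's actual definitions. It also shows why the 'const'
matters: the handler has to store into the params on a read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy guest register file: x0..x30 followed by SP (assumed layout). */
struct toy_vcpu {
	uint64_t regs[31];
	uint64_t sp;
};

/* Hypothetical params structure carrying the value, not a pointer. */
struct toy_sys_reg_params {
	uint8_t  Rt;
	bool     is_write;
	uint64_t regval;
};

typedef bool (*toy_handler)(struct toy_vcpu *, struct toy_sys_reg_params *);

/* Index 31 is the zero register here: reads as 0, writes are dropped. */
static uint64_t toy_get_reg(const struct toy_vcpu *v, uint8_t n)
{
	return (n == 31) ? 0 : v->regs[n];
}

static void toy_set_reg(struct toy_vcpu *v, uint8_t n, uint64_t val)
{
	if (n != 31)
		v->regs[n] = val;
}

/* Dispatcher: fetch Rt before a write, write Rt back after a read. */
static bool toy_emulate_sysreg(struct toy_vcpu *v,
			       struct toy_sys_reg_params *p, toy_handler fn)
{
	assert(p->Rt <= 31);
	if (p->is_write)
		p->regval = toy_get_reg(v, p->Rt);
	if (!fn(v, p))
		return false;
	if (!p->is_write)
		toy_set_reg(v, p->Rt, p->regval);
	return true;
}

/* Read-only register that always reads as (1 << 3); writes are ignored. */
static bool toy_trap_oslsr(struct toy_vcpu *v, struct toy_sys_reg_params *p)
{
	(void)v;
	if (!p->is_write)
		p->regval = 1 << 3;	/* needs a non-const params */
	return true;
}

/* Plain read/write register backed by a shadow copy. */
static uint64_t toy_shadow_sysreg;

static bool toy_trap_store(struct toy_vcpu *v, struct toy_sys_reg_params *p)
{
	(void)v;
	if (p->is_write)
		toy_shadow_sysreg = p->regval;
	else
		p->regval = toy_shadow_sysreg;
	return true;
}
```

 With this shape the handlers never touch the register file directly, so the
Rt == 31 special case lives in exactly one place (the dispatcher).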

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




RE: [PATCH] KVM: arm/arm64: Revert to old way of checking for device mapping in stage2_flush_ptes().

2015-12-03 Thread Pavel Fedin
 Hello!

> >> I think your analysis is correct, but does that not apply to both instances?
> >
> >  No no, another one is correct, since it operates on real PFN (at least
> > looks like so). I have verified my fix against the original problem (crash
> > on Exynos5410 without generic timer), and it still works fine there.
> >
> 
> I don't think so. Regardless of whether you are manipulating HYP
> mappings or stage-2 mappings, the physical address is always the
> output, not the input of the translation, so addr is always either a
> virtual address or a intermediate physical address, whereas
> pfn_valid() operates on host physical addresses.

 Yes, you are right. I have reviewed this more carefully, and indeed, 
unmap_range() is also called by unmap_stage2_range(), so addr can be either an 
IPA or a real PA.
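
 For illustration only: a toy model of the point made in the quoted text, that
a device check must look at the output of the translation (the PFN stored in
the PTE) and not at the input address being walked. The RAM window, the
toy_pte layout and all toy_* helpers are hypothetical; the real kernel
helpers differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT    12
/* Hypothetical host RAM window, expressed in page frame numbers. */
#define TOY_RAM_START_PFN 0x80000ULL
#define TOY_RAM_END_PFN   0xC0000ULL

/* Models pfn_valid(): true only for host physical frames backed by RAM. */
static bool toy_pfn_valid(uint64_t pfn)
{
	return pfn >= TOY_RAM_START_PFN && pfn < TOY_RAM_END_PFN;
}

/* Models a "device" test as "not host RAM" (an assumption of this sketch). */
static bool toy_is_device_pfn(uint64_t pfn)
{
	return !toy_pfn_valid(pfn);
}

/* Toy stage-2 entry: maps some input address to a host frame. */
struct toy_pte {
	uint64_t out_pfn;
	bool     present;
};

/* Wrong: feeds the input address (an IPA) to a host-PFN check. */
static bool toy_needs_flush_by_addr(const struct toy_pte *pte, uint64_t ipa)
{
	assert(pte != 0);
	return pte->present && !toy_is_device_pfn(ipa >> TOY_PAGE_SHIFT);
}

/* Right: uses the output frame recorded in the entry itself. */
static bool toy_needs_flush_by_pte(const struct toy_pte *pte)
{
	assert(pte != 0);
	return pte->present && !toy_is_device_pfn(pte->out_pfn);
}
```

 In this model, guest RAM placed at IPA 0x40000000 but backed by host frame
0x90000 is misclassified as a device by the address-based check, so its cache
flush would be skipped.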

> OK. I will follow up with a patch, as Christoffer requested. I'd
> appreciate it if you could test to see if it also fixes the current
> issue, and the original arch timer issue.

 I have just made the same patch, and am currently testing it on all my boards. 
I'll also test it on my ARM64 too, just in case. I was about to finish the 
testing and send the patch in maybe one or two hours.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




[PATCH 0/3] KVM: arm64: BUG FIX: Correctly handle zero register transfers

2015-12-03 Thread Pavel Fedin
ARM64 CPUs have a zero register, which is read-only and always reads as 0.
However, KVM currently misinterprets it as SP (because Rt == 31, and in
struct user_pt_regs the 'regs' array is followed by SP), resulting in an
invalid value being read, or even SP corruption on write.

The problem has been discovered by performing an operation

 *((volatile int *)reg) = 0;

which compiles as "str xzr, [xx]", and resulted in strange values being
written.
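
For illustration only, the aliasing described above can be reproduced in a
standalone sketch. The structure below is a simplified stand-in for struct
user_pt_regs, and every toy_* name is hypothetical; only the "index 31 is
special" logic mirrors the accessors introduced by this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in: x0..x30, immediately followed by SP. */
struct toy_pt_regs {
	uint64_t regs[31];
	uint64_t sp;
	uint64_t pc;
};

/*
 * Old-style accessor: returns the address of slot reg_num counted from
 * the start of regs[]. For reg_num == 31 that address is the sp field.
 */
static uint64_t *toy_vcpu_reg(struct toy_pt_regs *r, uint8_t reg_num)
{
	assert(reg_num <= 31);
	return (uint64_t *)((char *)r + offsetof(struct toy_pt_regs, regs) +
			    (size_t)reg_num * sizeof(uint64_t));
}

/* New-style accessors: index 31 means the zero register, never SP. */
static uint64_t toy_vcpu_get_reg(const struct toy_pt_regs *r, uint8_t reg_num)
{
	assert(reg_num <= 31);
	return (reg_num == 31) ? 0 : r->regs[reg_num];
}

static void toy_vcpu_set_reg(struct toy_pt_regs *r, uint8_t reg_num,
			     uint64_t val)
{
	assert(reg_num <= 31);
	if (reg_num != 31)
		r->regs[reg_num] = val;
}
```

With sp preset to some value, reading through toy_vcpu_reg(&r, 31) returns
that value, while toy_vcpu_get_reg(&r, 31) returns 0 and
toy_vcpu_set_reg(&r, 31, ...) leaves sp untouched.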

Pavel Fedin (3):
  KVM: arm64: Correctly handle zero register during MMIO
  KVM: arm64: Correctly handle zero register in system register accesses
  KVM: arm64: Get rid of old vcpu_reg()

 arch/arm/include/asm/kvm_emulate.h   | 12 ++
 arch/arm/kvm/mmio.c  |  5 ++-
 arch/arm/kvm/psci.c  | 20 -
 arch/arm64/include/asm/kvm_emulate.h | 18 +---
 arch/arm64/kvm/handle_exit.c |  2 +-
 arch/arm64/kvm/sys_regs.c| 79 
 arch/arm64/kvm/sys_regs.h|  4 +-
 arch/arm64/kvm/sys_regs_generic_v8.c |  2 +-
 8 files changed, 85 insertions(+), 57 deletions(-)

-- 
2.4.4



[PATCH 1/3] KVM: arm64: Correctly handle zero register during MMIO

2015-12-03 Thread Pavel Fedin
On ARM64, register index 31 corresponds to both the zero register and SP.
However, all memory access instructions use ZR as the transfer register. SP
is used only as a base register in indirect memory addressing, or by
register-register arithmetic, which cannot be trapped here.

Correct emulation is achieved by introducing new register accessor
functions, which can do special handling for reg_num == 31. These new
accessors intentionally do not rely on old vcpu_reg() on ARM64, because
it is to be removed. Since the affected code is shared by both ARM
flavours, implementations of these accessors are also added to ARM32 code.

This patch fixes an MMIO register being set to a random value (actually SP)
instead of zero by something like:

 *((volatile int *)reg) = 0;

Compilers tend to generate "str wzr, [xx]" here.

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 arch/arm/include/asm/kvm_emulate.h   | 12 
 arch/arm/kvm/mmio.c  |  5 +++--
 arch/arm64/include/asm/kvm_emulate.h | 13 +
 3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/kvm_emulate.h b/arch/arm/include/asm/kvm_emulate.h
index a9c80a2..b7ff32e 100644
--- a/arch/arm/include/asm/kvm_emulate.h
+++ b/arch/arm/include/asm/kvm_emulate.h
@@ -28,6 +28,18 @@
 unsigned long *vcpu_reg(struct kvm_vcpu *vcpu, u8 reg_num);
 unsigned long *vcpu_spsr(struct kvm_vcpu *vcpu);
 
+static inline unsigned long vcpu_get_reg(const struct kvm_vcpu *vcpu,
+u8 reg_num)
+{
+   return *vcpu_reg(vcpu, reg_num);
+}
+
+static inline void vcpu_set_reg(const struct kvm_vcpu *vcpu, u8 reg_num,
+   unsigned long val)
+{
+   *vcpu_reg(vcpu, reg_num) = val;
+}
+
 bool kvm_condition_valid(struct kvm_vcpu *vcpu);
 void kvm_skip_instr(struct kvm_vcpu *vcpu, bool is_wide_instr);
 void kvm_inject_undefined(struct kvm_vcpu *vcpu);
diff --git a/arch/arm/kvm/mmio.c b/arch/arm/kvm/mmio.c
index 974b1c6..3a10c9f 100644
--- a/arch/arm/kvm/mmio.c
+++ b/arch/arm/kvm/mmio.c
@@ -115,7 +115,7 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu, struct kvm_run *run)
trace_kvm_mmio(KVM_TRACE_MMIO_READ, len, run->mmio.phys_addr,
   data);
data = vcpu_data_host_to_guest(vcpu, data, len);
-   *vcpu_reg(vcpu, vcpu->arch.mmio_decode.rt) = data;
+   vcpu_set_reg(vcpu, vcpu->arch.mmio_decode.rt, data);
}
 
return 0;
@@ -186,7 +186,8 @@ int io_mem_abort(struct kvm_vcpu *vcpu, struct kvm_run *run,
rt = vcpu->arch.mmio_decode.rt;
 
if (is_write) {
-   data = vcpu_data_guest_to_host(vcpu, *vcpu_reg(vcpu, rt), len);
+   data = vcpu_data_guest_to_host(vcpu, vcpu_get_reg(vcpu, rt),
+  len);
 
trace_kvm_mmio(KVM_TRACE_MMIO_WRITE, len, fault_ipa, data);
mmio_write_buf(data_buf, len, data);
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 3ca894e..5a182af 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -109,6 +109,19 @@ static inline unsigned long *vcpu_reg(const struct kvm_vcpu *vcpu, u8 reg_num)
return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.regs[reg_num];
 }
 
+static inline unsigned long vcpu_get_reg(const struct kvm_vcpu *vcpu,
+u8 reg_num)
+{
+   return (reg_num == 31) ? 0 : vcpu_gp_regs(vcpu)->regs.regs[reg_num];
+}
+
+static inline void vcpu_set_reg(struct kvm_vcpu *vcpu, u8 reg_num,
+   unsigned long val)
+{
+   if (reg_num != 31)
+   vcpu_gp_regs(vcpu)->regs.regs[reg_num] = val;
+}
+
 /* Get vcpu SPSR for current mode */
 static inline unsigned long *vcpu_spsr(const struct kvm_vcpu *vcpu)
 {
-- 
2.4.4



[PATCH 2/3] KVM: arm64: Correctly handle zero register in system register accesses

2015-12-03 Thread Pavel Fedin
System register accesses also use the zero register for Rt == 31, and
therefore using it will also result in getting the SP value instead. This
patch makes them use the new accessors introduced by the previous patch
as well.

Additionally, get rid of the "massive hack" in kvm_handle_cp_64().

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 arch/arm64/kvm/sys_regs.c| 79 
 arch/arm64/kvm/sys_regs.h|  4 +-
 arch/arm64/kvm/sys_regs_generic_v8.c |  2 +-
 3 files changed, 46 insertions(+), 39 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 87a64e8..a667228 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -102,7 +102,7 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu,
 
BUG_ON(!p->is_write);
 
-   val = *vcpu_reg(vcpu, p->Rt);
+   val = *p->val;
if (!p->is_aarch32) {
vcpu_sys_reg(vcpu, r->reg) = val;
} else {
@@ -125,13 +125,10 @@ static bool access_gic_sgi(struct kvm_vcpu *vcpu,
   const struct sys_reg_params *p,
   const struct sys_reg_desc *r)
 {
-   u64 val;
-
if (!p->is_write)
return read_from_write_only(vcpu, p);
 
-   val = *vcpu_reg(vcpu, p->Rt);
-   vgic_v3_dispatch_sgi(vcpu, val);
+   vgic_v3_dispatch_sgi(vcpu, *p->val);
 
return true;
 }
@@ -153,7 +150,7 @@ static bool trap_oslsr_el1(struct kvm_vcpu *vcpu,
if (p->is_write) {
return ignore_write(vcpu, p);
} else {
-   *vcpu_reg(vcpu, p->Rt) = (1 << 3);
+   *p->val = (1 << 3);
return true;
}
 }
@@ -167,7 +164,7 @@ static bool trap_dbgauthstatus_el1(struct kvm_vcpu *vcpu,
} else {
u32 val;
asm volatile("mrs %0, dbgauthstatus_el1" : "=r" (val));
-   *vcpu_reg(vcpu, p->Rt) = val;
+   *p->val = val;
return true;
}
 }
@@ -204,13 +201,13 @@ static bool trap_debug_regs(struct kvm_vcpu *vcpu,
const struct sys_reg_desc *r)
 {
if (p->is_write) {
-   vcpu_sys_reg(vcpu, r->reg) = *vcpu_reg(vcpu, p->Rt);
+   vcpu_sys_reg(vcpu, r->reg) = *p->val;
vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
} else {
-   *vcpu_reg(vcpu, p->Rt) = vcpu_sys_reg(vcpu, r->reg);
+   *p->val = vcpu_sys_reg(vcpu, r->reg);
}
 
-   trace_trap_reg(__func__, r->reg, p->is_write, *vcpu_reg(vcpu, p->Rt));
+   trace_trap_reg(__func__, r->reg, p->is_write, *p->val);
 
return true;
 }
@@ -228,7 +225,7 @@ static inline void reg_to_dbg(struct kvm_vcpu *vcpu,
  const struct sys_reg_params *p,
  u64 *dbg_reg)
 {
-   u64 val = *vcpu_reg(vcpu, p->Rt);
+   u64 val = *p->val;
 
if (p->is_32bit) {
val &= 0xffffffffUL;
@@ -248,7 +245,7 @@ static inline void dbg_to_reg(struct kvm_vcpu *vcpu,
if (p->is_32bit)
val &= 0xffffffffUL;
 
-   *vcpu_reg(vcpu, p->Rt) = val;
+   *p->val = val;
 }
 
 static inline bool trap_bvr(struct kvm_vcpu *vcpu,
@@ -697,10 +694,10 @@ static bool trap_dbgidr(struct kvm_vcpu *vcpu,
u64 pfr = read_system_reg(SYS_ID_AA64PFR0_EL1);
u32 el3 = !!cpuid_feature_extract_field(pfr, ID_AA64PFR0_EL3_SHIFT);
 
-   *vcpu_reg(vcpu, p->Rt) = ((((dfr >> ID_AA64DFR0_WRPS_SHIFT) & 0xf) << 28) |
- (((dfr >> ID_AA64DFR0_BRPS_SHIFT) & 0xf) << 24) |
- (((dfr >> ID_AA64DFR0_CTX_CMPS_SHIFT) & 0xf) << 20) |
- (6 << 16) | (el3 << 14) | (el3 << 12));
+   *p->val = ((((dfr >> ID_AA64DFR0_WRPS_SHIFT) & 0xf) << 28) |
+  (((dfr >> ID_AA64DFR0_BRPS_SHIFT) & 0xf) << 24) |
+  (((dfr >> ID_AA64DFR0_CTX_CMPS_SHIFT) & 0xf) << 20) |
+  (6 << 16) | (el3 << 14) | (el3 << 12));
return true;
}
 }
@@ -710,10 +707,10 @@ static bool trap_debug32(struct kvm_vcpu *vcpu,
 const struct sys_reg_desc *r)
 {
if (p->is_write) {
-   vcpu_cp14(vcpu, r->reg) = *vcpu_reg(vcpu, p->Rt);
+   vcpu_cp14(vcpu, r->reg) = *p->val;
vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
} else {
-   *vcpu_reg(vcpu, p->Rt) = vcpu_cp14(vcpu, r->reg);
+   *p->val = vcpu_cp14(vcpu, r->reg

[PATCH 3/3] KVM: arm64: Get rid of old vcpu_reg()

2015-12-03 Thread Pavel Fedin
Using the old-style vcpu_reg() accessor has proven to be inappropriate and
unsafe on ARM64. This patch fixes the remaining users and completely
removes vcpu_reg() on ARM64.

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 arch/arm/kvm/psci.c  | 20 ++--
 arch/arm64/include/asm/kvm_emulate.h | 11 +++
 arch/arm64/kvm/handle_exit.c |  2 +-
 3 files changed, 14 insertions(+), 19 deletions(-)

diff --git a/arch/arm/kvm/psci.c b/arch/arm/kvm/psci.c
index 0b55696..a9b3b90 100644
--- a/arch/arm/kvm/psci.c
+++ b/arch/arm/kvm/psci.c
@@ -75,7 +75,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
unsigned long context_id;
phys_addr_t target_pc;
 
-   cpu_id = *vcpu_reg(source_vcpu, 1) & MPIDR_HWID_BITMASK;
+   cpu_id = vcpu_get_reg(source_vcpu, 1) & MPIDR_HWID_BITMASK;
if (vcpu_mode_is_32bit(source_vcpu))
cpu_id &= ~((u32) 0);
 
@@ -94,8 +94,8 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
return PSCI_RET_INVALID_PARAMS;
}
 
-   target_pc = *vcpu_reg(source_vcpu, 2);
-   context_id = *vcpu_reg(source_vcpu, 3);
+   target_pc = vcpu_get_reg(source_vcpu, 2);
+   context_id = vcpu_get_reg(source_vcpu, 3);
 
kvm_reset_vcpu(vcpu);
 
@@ -114,7 +114,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
 * NOTE: We always update r0 (or x0) because for PSCI v0.1
 * the general puspose registers are undefined upon CPU_ON.
 */
-   *vcpu_reg(vcpu, 0) = context_id;
+   vcpu_set_reg(vcpu, 0, context_id);
vcpu->arch.power_off = false;
smp_mb();   /* Make sure the above is visible */
 
@@ -134,8 +134,8 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
struct kvm *kvm = vcpu->kvm;
struct kvm_vcpu *tmp;
 
-   target_affinity = *vcpu_reg(vcpu, 1);
-   lowest_affinity_level = *vcpu_reg(vcpu, 2);
+   target_affinity = vcpu_get_reg(vcpu, 1);
+   lowest_affinity_level = vcpu_get_reg(vcpu, 2);
 
/* Determine target affinity mask */
target_affinity_mask = psci_affinity_mask(lowest_affinity_level);
@@ -209,7 +209,7 @@ int kvm_psci_version(struct kvm_vcpu *vcpu)
 static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
 {
int ret = 1;
-   unsigned long psci_fn = *vcpu_reg(vcpu, 0) & ~((u32) 0);
+   unsigned long psci_fn = vcpu_get_reg(vcpu, 0) & ~((u32) 0);
unsigned long val;
 
switch (psci_fn) {
@@ -273,13 +273,13 @@ static int kvm_psci_0_2_call(struct kvm_vcpu *vcpu)
break;
}
 
-   *vcpu_reg(vcpu, 0) = val;
+   vcpu_set_reg(vcpu, 0, val);
return ret;
 }
 
 static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
 {
-   unsigned long psci_fn = *vcpu_reg(vcpu, 0) & ~((u32) 0);
+   unsigned long psci_fn = vcpu_get_reg(vcpu, 0) & ~((u32) 0);
unsigned long val;
 
switch (psci_fn) {
@@ -295,7 +295,7 @@ static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
break;
}
 
-   *vcpu_reg(vcpu, 0) = val;
+   vcpu_set_reg(vcpu, 0, val);
return 1;
 }
 
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 5a182af..25a4021 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -100,15 +100,10 @@ static inline void vcpu_set_thumb(struct kvm_vcpu *vcpu)
 }
 
 /*
- * vcpu_reg should always be passed a register number coming from a
- * read of ESR_EL2. Otherwise, it may give the wrong result on AArch32
- * with banked registers.
+ * vcpu_get_reg and vcpu_set_reg should always be passed a register number
+ * coming from a read of ESR_EL2. Otherwise, it may give the wrong result on
+ * AArch32 with banked registers.
  */
-static inline unsigned long *vcpu_reg(const struct kvm_vcpu *vcpu, u8 reg_num)
-{
-   return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.regs[reg_num];
-}
-
 static inline unsigned long vcpu_get_reg(const struct kvm_vcpu *vcpu,
 u8 reg_num)
 {
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 68a0759..15f0477 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -37,7 +37,7 @@ static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
 {
int ret;
 
-   trace_kvm_hvc_arm64(*vcpu_pc(vcpu), *vcpu_reg(vcpu, 0),
+   trace_kvm_hvc_arm64(*vcpu_pc(vcpu), vcpu_get_reg(vcpu, 0),
kvm_vcpu_hvc_get_imm(vcpu));
 
ret = kvm_psci_call(vcpu);
-- 
2.4.4



RE: [PATCH 0/3] KVM: arm64: BUG FIX: Correctly handle zero register transfers

2015-12-03 Thread Pavel Fedin
 Hello!

> > The problem has been discovered by performing an operation
> >
> >  *((volatile int *)reg) = 0;
> >
> > which compiles as "str xzr, [xx]", and resulted in strange values being
> > written.
> 
> Interesting find. Which compiler is that?

$ aarch64-linux-gnu-gcc --version
aarch64-linux-gnu-gcc (Linaro GCC 2014.11) 4.9.3 20141031 (prerelease)
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

 This is from my colleague, who actually hit the bug with his driver. And i can 
reproduce the issue with a different compiler version
using the following small testcase:
--- cut ---
p.fedin@fedinw7x64 /cygdrive/d/Projects/Test
$ cat test.c
volatile int *addr;

int test_val(int val)
{
	*addr = val;
}

int test_zero(void)
{
	*addr = 0;
}

p.fedin@fedinw7x64 /cygdrive/d/Projects/Test
$ aarch64-unknown-linux-gnu-gcc -O2 -c test.c

p.fedin@fedinw7x64 /cygdrive/d/Projects/Test
$ aarch64-unknown-linux-gnu-objdump -d test.o

test.o: file format elf64-littleaarch64


Disassembly of section .text:

0000000000000000 <test_val>:
   0:   2a0003e2    mov     w2, w0
   4:   2a0103e0    mov     w0, w1
   8:   90000001    adrp    x1, 8 <test_val+0x8>
   c:   f9400021    ldr     x1, [x1]
  10:   b9000022    str     w2, [x1]
  14:   d65f03c0    ret

0000000000000018 <test_zero>:
  18:   90000001    adrp    x1, 8 <test_val+0x8>
  1c:   f9400021    ldr     x1, [x1]
  20:   b900003f    str     wzr, [x1]
  24:   d65f03c0    ret

p.fedin@fedinw7x64 /cygdrive/d/Projects/Test
$ aarch64-unknown-linux-gnu-gcc --version
aarch64-unknown-linux-gnu-gcc (GCC) 4.9.0
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
--- cut ---

 Isn't it legitimate to write from ZR to MMIO register?
 Another potential case is in our vgic-v3-switch.S:

msr_s   ICH_HCR_EL2, xzr

 It's only because this is KVM code that we have never hit this problem so far. 
Somebody could write such a thing in some other place,
with some other register, which would be executed by KVM, and... boo...

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




RE: [PATCH v2 0/3] Introduce MSI hardware mapping for VFIO

2015-12-03 Thread Pavel Fedin
 Hello!

> I like that you're making this transparent
> for the user, but at the same time, directly calling function pointers
> through the msi_domain_ops is quite ugly.

 Do you mean dereferencing info->ops->vfio_map() in .c code? I can introduce 
some wrappers in include/linux/msi.h like 
msi_domain_vfio_map()/msi_domain_vfio_unmap(), this would not conceptually 
change anything.
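
 For illustration only: a standalone sketch of what such wrappers could look
like. The callback signatures, the 'ctx' argument, the -ENODEV fallback and
all toy_* names are assumptions of this sketch; only the idea of hiding
info->ops->vfio_map() behind msi_domain_vfio_map()/msi_domain_vfio_unmap()
style helpers comes from the text above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_msi_domain_info;

/* Hypothetical ops table with the two optional callbacks. */
struct toy_msi_domain_ops {
	int  (*vfio_map)(struct toy_msi_domain_info *info, void *ctx);
	void (*vfio_unmap)(struct toy_msi_domain_info *info, void *ctx);
};

struct toy_msi_domain_info {
	const struct toy_msi_domain_ops *ops;
};

/* Wrapper: callers never dereference the function pointer themselves. */
static inline int toy_msi_domain_vfio_map(struct toy_msi_domain_info *info,
					  void *ctx)
{
	if (!info || !info->ops || !info->ops->vfio_map)
		return -ENODEV;	/* callback not provided by this irqchip */
	return info->ops->vfio_map(info, ctx);
}

static inline void toy_msi_domain_vfio_unmap(struct toy_msi_domain_info *info,
					     void *ctx)
{
	if (info && info->ops && info->ops->vfio_unmap)
		info->ops->vfio_unmap(info, ctx);
}

/* Example callbacks: ctx points to an int counting active mappings. */
static int toy_count_map(struct toy_msi_domain_info *info, void *ctx)
{
	(void)info;
	assert(ctx != NULL);
	++*(int *)ctx;
	return 0;
}

static void toy_count_unmap(struct toy_msi_domain_info *info, void *ctx)
{
	(void)info;
	assert(ctx != NULL);
	--*(int *)ctx;
}
```

 The wrappers only centralize the NULL checks and hide the indirection; as
said above, they do not change anything conceptually.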

>  There needs to be a real interface there that isn't specific to vfio.

 Hm... What else is going to use this?
 Actually, in my implementation the only thing specific to vfio is using struct 
vfio_iommu_driver_ops. This is because we have to perform MSI mapping for all 
"vfio domains" registered for this container. At least this is how original 
type1 driver works.
 Can anybody explain to me what these "vfio domains" are? From the code it looks 
like we can have several IOMMU instances belonging to one VFIO container, and 
in this case one IOMMU == one "vfio domain". So is my understanding correct 
that a "vfio domain" is an IOMMU instance?
 And here come completely different ideas...
 First of all, can anybody explain, why do i perform all mappings on per-IOMMU 
basis, not on per-device basis? AFAIK at least ARM SMMU knows about "stream 
IDs", and therefore it should be capable of distinguishing between different 
devices. So can i have per-device mapping? This would make things much simpler.
 So:
 Idea 1: do per-device mappings. In this case i don't have to track down which 
devices belong to which group and which IOMMU...
 Idea 2: What if we indeed simply simulate x86 behavior? What if we just do 1:1 
mapping for MSI register when IOMMU is initialized and forget about it, so that 
MSI messages are guaranteed to reach the host? Or would this mean that we would 
have to do 1:1 mapping for the whole address range? Looks like [1] tried to do 
something similar, with address reservation.
 Idea 3: Is a single device guaranteed to correspond to a single "vfio domain" 
(and, as a consequence, to a single IOMMU)? In this case it's very easy to 
decouple the interface introduced by 0002 of my series from vfio, and pass just 
a raw struct iommu_domain * without any driver_ops. The irqchip code would then 
only need iommu_map() and iommu_unmap(), with no calling back into the vfio layer.

Cc'ed to authors of all mentioned series

> [1] http://www.spinics.net/lists/kvm/msg121669.html,
> http://www.spinics.net/lists/kvm/msg121662.html
> [2] http://www.spinics.net/lists/kvm/msg119236.html

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




RE: BUG ALERT: ARM32 KVM does not work in 4.4-rc3

2015-12-02 Thread Pavel Fedin
 Hello!

> > My project involves ARM64, but from time to time i also test ARM32
> > KVM. I have discovered that it stopped working in 4.4-rc3. The same
> > virtual machine works perfectly under current kvmarm/next, but gets
> > stuck at random point under 4.4-rc3 from linux-stable. I'm not sure
> > that i have time to investigate this quickly, but i'll post some new
> > information as soon as i get it

[skip]

> So until you bisect it to an exact commit and configuration, I declare
> the alert over. ;-)

 Just in case, to make sure you don't miss it: i have found the problem, and 
it's just good luck that it works on some machines.
Unreliably, BTW. The problem is that the code checks guest physical addresses 
(IPA) against the host memory map; the fix is here:
http://www.spinics.net/lists/kvm/msg124561.html

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




RE: [PATCH] KVM: arm/arm64: Revert to old way of checking for device mapping in stage2_flush_ptes().

2015-12-02 Thread Pavel Fedin
 Hello!

> > diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> > index 7dace90..51ad98f 100644
> > --- a/arch/arm/kvm/mmu.c
> > +++ b/arch/arm/kvm/mmu.c
> > @@ -310,7 +310,8 @@ static void stage2_flush_ptes(struct kvm *kvm, pmd_t *pmd,
> >
> > pte = pte_offset_kernel(pmd, addr);
> > do {
> > -   if (!pte_none(*pte) && !kvm_is_device_pfn(__phys_to_pfn(addr)))
> > +   if (!pte_none(*pte) &&
> > +   (pte_val(*pte) & PAGE_S2_DEVICE) != PAGE_S2_DEVICE)
> 
> I think your analysis is correct, but does that not apply to both instances?

 No, the other one is correct, since it operates on a real PFN (at least it
looks that way). I have verified my fix against the original problem (crash on
Exynos5410 without generic timer), and it still works fine there.

> And instead of reverting, could we fix this properly instead?

 Of course, I'm not against alternative approaches; feel free to propose one.
I've just suggested what I could to fix things quickly. I'm no expert in KVM
memory management yet. After all, this is what mailing lists are for.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




RE: BUG ALERT: ARM32 KVM does not work in 4.4-rc3

2015-12-01 Thread Pavel Fedin
 Hello!

> The same kernel is used both as a guest and a host with v4.4-rc3.
> 
> So until you bisect it to an exact commit and configuration, I declare
> the alert over. ;-)

 By now I have also tried it on another machine, and it works there too. It
looks like the problem is triggered only on some particular hardware. I'll try
to figure this out.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




RE: BUG ALERT: ARM32 KVM does not work in 4.4-rc3

2015-12-01 Thread Pavel Fedin
 Hello!

> > My project involves ARM64, but from time to time i also test ARM32
> > KVM. I have discovered that it stopped working in 4.4-rc3. The same
> > virtual machine works perfectly under current kvmarm/next, but gets
> > stuck at random point under 4.4-rc3 from linux-stable. I'm not sure
> > that i have time to investigate this quickly, but i'll post some new
> > information as soon as i get it

[skip]

> So until you bisect it to an exact commit and configuration, I declare
> the alert over. ;-)

 The commit in question is e6fab54423450d699a09ec2b899473a541f61971
("ARM/arm64: KVM: test properly for a PTE's uncachedness").
Reverting it fixes the problem.
 Inspecting the guest in qemu shows that the CPU gets stuck at PC = 0x0C with
LR = 0x10. So I quickly guessed that it might have to do with caching, and my
first guess was correct. The guest crashes in this state very early; sometimes
it cannot even fully print "Uncompressing kernel".
 The machine which reproduces it is a custom Samsung out-of-tree board. I'll
investigate further in order to determine exactly how the commit causes harm.
I know that it passed review and testing, and I was involved too. Perhaps the
board's code is at fault, however.

Cc'ed to others involved.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia



[PATCH 3/3] KVM: arm/arm64: Decouple virtual timer from vGIC

2015-12-01 Thread Pavel Fedin
Remove the dependency on vgic_initialized() and use the newly introduced
infrastructure to send interrupts via userspace if the vGIC is not being
used.

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 arch/arm/kvm/arm.c        |  8 +-------
 virt/kvm/arm/arch_timer.c | 23 +++++++++++++----------
 2 files changed, 14 insertions(+), 17 deletions(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 6392a5b..e729068 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -468,13 +468,7 @@ static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
return ret;
}
 
-   /*
-* Enable the arch timers only if we have an in-kernel VGIC
-* and it has been properly initialized, since we cannot handle
-* interrupts from the virtual timer with a userspace gic.
-*/
-   if (irqchip_in_kernel(kvm) && vgic_initialized(kvm))
-   kvm_timer_enable(kvm);
+   kvm_timer_enable(kvm);
 
return 0;
 }
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 69bca18..90c91b0 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -128,15 +128,17 @@ static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level)
int ret;
struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
 
-   BUG_ON(!vgic_initialized(vcpu->kvm));
-
timer->irq.level = new_level;
trace_kvm_timer_update_irq(vcpu->vcpu_id, timer->map->virt_irq,
   timer->irq.level);
-   ret = kvm_vgic_inject_mapped_irq(vcpu->kvm, vcpu->vcpu_id,
-timer->map,
-timer->irq.level);
-   WARN_ON(ret);
+   if (irqchip_in_kernel(vcpu->kvm)) {
+   ret = kvm_vgic_inject_mapped_irq(vcpu->kvm, vcpu->vcpu_id,
+timer->map,
+timer->irq.level);
+   WARN_ON(ret);
+   } else {
+   vcpu->irq = &timer->irq;
+   }
 }
 
 /*
@@ -149,12 +151,12 @@ static void kvm_timer_update_state(struct kvm_vcpu *vcpu)
 
/*
 * If userspace modified the timer registers via SET_ONE_REG before
-* the vgic was initialized, we mustn't set the timer->irq.level value
+* the timer was initialized, we mustn't set the timer->irq.level value
 * because the guest would never see the interrupt.  Instead wait
 * until we call this function from kvm_timer_flush_hwstate.
 */
-   if (!vgic_initialized(vcpu->kvm))
-   return;
+   if (!vcpu->kvm->arch.timer.enabled)
+   return;
 
if (kvm_timer_should_fire(vcpu) != timer->irq.level)
kvm_timer_update_irq(vcpu, !timer->irq.level);
@@ -237,7 +239,8 @@ void kvm_timer_flush_hwstate(struct kvm_vcpu *vcpu)
* to ensure that hardware interrupts from the timer triggers a guest
* exit.
*/
-   if (timer->irq.level || kvm_vgic_map_is_active(vcpu, timer->map))
+   if (timer->irq.level || (irqchip_in_kernel(vcpu->kvm) &&
+kvm_vgic_map_is_active(vcpu, timer->map)))
phys_active = true;
else
phys_active = false;
-- 
2.4.4
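
The routing decision made by this patch can be illustrated outside the kernel.
What follows is a minimal sketch, not the kernel code: every model_* name is a
hypothetical stand-in, and the vGIC injection call is replaced by a simple
counter. It only shows that a timer level change goes to the in-kernel vGIC
when one exists, is otherwise parked in the per-vCPU slot for userspace, and
is ignored entirely until the timer has been enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kvm_irq_level. */
struct model_irq_level {
    unsigned int irq;    /* raw IRQ number */
    unsigned int level;  /* current line level */
};

/* Hypothetical, heavily reduced view of a vCPU. */
struct model_vcpu {
    bool irqchip_in_kernel;                 /* is a vGIC in use? */
    bool timer_enabled;                     /* set once the timer is enabled */
    struct model_irq_level timer_irq;       /* the timer's own line state */
    const struct model_irq_level *to_user;  /* slot consumed on exit */
    unsigned int vgic_injections;           /* counts in-kernel deliveries */
};

/* Record the new level and route it to the vGIC or to userspace. */
static void model_timer_update_irq(struct model_vcpu *vcpu, bool new_level)
{
    assert(vcpu != NULL);
    vcpu->timer_irq.level = new_level;
    if (vcpu->irqchip_in_kernel)
        vcpu->vgic_injections++;           /* stands in for the vGIC call */
    else
        vcpu->to_user = &vcpu->timer_irq;  /* userspace irqchip gets an exit */
}

/* Only act when the timer is enabled and the line level actually changes. */
static void model_timer_update_state(struct model_vcpu *vcpu, bool should_fire)
{
    if (!vcpu->timer_enabled)
        return;
    if (should_fire != (vcpu->timer_irq.level != 0))
        model_timer_update_irq(vcpu, !vcpu->timer_irq.level);
}
```

Calling model_timer_update_state() repeatedly with an unchanged should_fire
value does nothing, which matches the level-change semantics the series relies
on.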



[PATCH 2/3] KVM: Documentation: Document KVM_EXIT_IRQ

2015-12-01 Thread Pavel Fedin
Add documentation for the new exit code.

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 Documentation/virtual/kvm/api.txt | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 092ee9f..d8aae4c 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -3331,6 +3331,20 @@ the userspace IOAPIC should process the EOI and retrigger the interrupt if
 it is still asserted.  Vector is the LAPIC interrupt vector for which the
 EOI was received.
 
+   /* KVM_EXIT_IRQ */
+   struct kvm_irq_level irq;
+
+Indicates that an interrupt has occurred and is to be processed by an irqchip
+implemented in userspace. irq.irq specifies the raw IRQ number, and irq.level
+is to be interpreted according to the interrupt type:
+ For level-triggered interrupts irq.level is set to the new level of the line,
+  and the exit happens upon level change.
+ For edge-triggered interrupts irq.level is set to the active level of the line
+  (low or high), and the exit happens when the line is pulsed.
+
+CPU-private interrupts (like per-CPU timers) belong to the vCPU where the exit
+happened.
+
/* Fix the size of the union. */
char padding[256];
};
-- 
2.4.4
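
To make the level/edge distinction above more concrete, here is a minimal
sketch of how a userspace irqchip might consume the reported value. It is an
illustration under stated assumptions: struct model_irqchip and both functions
are hypothetical and do not correspond to any real QEMU or KVM structure, and
the "level" argument stands for the level value delivered with the exit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_IRQS 64

/* Hypothetical userspace irqchip state. */
struct model_irqchip {
    bool edge_triggered[MODEL_NR_IRQS]; /* configured trigger mode per line */
    bool pending[MODEL_NR_IRQS];        /* interrupt waiting for the guest */
};

/*
 * Handle one reported interrupt:
 *  - level-triggered line: the value is the new line level, so the pending
 *    state simply tracks it;
 *  - edge-triggered line: the value is the active level of the pulse, so
 *    every report latches the interrupt until the guest acknowledges it.
 */
static void model_irqchip_report(struct model_irqchip *chip,
                                 uint32_t irq, uint32_t level)
{
    assert(irq < MODEL_NR_IRQS);
    if (chip->edge_triggered[irq])
        chip->pending[irq] = true;
    else
        chip->pending[irq] = (level != 0);
}

/* Guest acknowledged the interrupt: only edge-triggered lines are cleared. */
static void model_irqchip_ack(struct model_irqchip *chip, uint32_t irq)
{
    assert(irq < MODEL_NR_IRQS);
    if (chip->edge_triggered[irq])
        chip->pending[irq] = false;
}
```

A level-triggered line stays pending after an acknowledge until a later report
lowers it, which is why the exit has to happen on every level change.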



[PATCH 1/3] KVM: Introduce KVM_EXIT_IRQ

2015-12-01 Thread Pavel Fedin
This exit code means that this vCPU wants to inject an interrupt using
a userspace-emulated controller.

IRQs are signalled by adding pending interrupt descriptors to the vcpu
structure. For simplicity, we currently reserve only one pointer for a
single interrupt, which will be used by the ARM virtual timer code. This
can be extended in the future if necessary.

The interface is designed to be as arch-agnostic as possible.
Therefore, it has the IRQ number and level as parameters (encoded in
struct kvm_irq_level).

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 arch/arm/kvm/arm.c       | 6 ++++++
 include/linux/kvm_host.h | 7 +++++++
 include/uapi/linux/kvm.h | 3 +++
 3 files changed, 16 insertions(+)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 66f90c1..6392a5b 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -585,6 +585,12 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
if (signal_pending(current)) {
ret = -EINTR;
run->exit_reason = KVM_EXIT_INTR;
+   } else if (vcpu->irq) {
+   ret = 0;
+   run->exit_reason = KVM_EXIT_IRQ;
+   run->irq.irq = vcpu->irq->irq;
+   run->irq.level = vcpu->irq->level;
+   vcpu->irq = NULL;
}
 
if (ret <= 0 || need_new_vmid_gen(vcpu->kvm) ||
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index c923350..93f59c5 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -281,6 +281,13 @@ struct kvm_vcpu {
} spin_loop;
 #endif
bool preempted;
+
+   /*
+* IRQ pending to the userspace on this CPU.
+* Currently we support only one slot, used only by ARM architecture.
+*/
+   const struct kvm_irq_level *irq;
+
struct kvm_vcpu_arch arch;
 };
 
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 03f3618..a717a9b 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -184,6 +184,7 @@ struct kvm_s390_skeys {
 #define KVM_EXIT_SYSTEM_EVENT 24
 #define KVM_EXIT_S390_STSI25
 #define KVM_EXIT_IOAPIC_EOI   26
+#define KVM_EXIT_IRQ  27
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
@@ -338,6 +339,8 @@ struct kvm_run {
struct {
__u8 vector;
} eoi;
+   /* KVM_EXIT_IRQ */
+   struct kvm_irq_level irq;
/* Fix the size of the union. */
char padding[256];
};
-- 
2.4.4
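
The single-slot mechanism described in the commit message can be modelled in a
few lines of plain C. This is only a sketch under stated assumptions: the
model_* types and functions are hypothetical stand-ins for struct kvm_vcpu,
struct kvm_run and struct kvm_irq_level, and MODEL_EXIT_IRQ merely mirrors the
value proposed for KVM_EXIT_IRQ in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_EXIT_NONE 0
#define MODEL_EXIT_IRQ  27  /* mirrors the KVM_EXIT_IRQ value proposed above */

struct model_irq_level {
    uint32_t irq;    /* raw IRQ number */
    uint32_t level;  /* new line level */
};

struct model_run {
    uint32_t exit_reason;
    struct model_irq_level irq;
};

struct model_vcpu {
    const struct model_irq_level *irq;  /* single pending slot, NULL if empty */
};

/* Producer side: e.g. timer code posts its interrupt descriptor. */
static void model_post_irq(struct model_vcpu *vcpu,
                           const struct model_irq_level *desc)
{
    assert(desc != NULL);
    vcpu->irq = desc;
}

/*
 * Consumer side, called on the way out to userspace. Copies the pending
 * descriptor into the shared run structure and empties the slot.
 * Returns 1 if an exit to userspace was requested, 0 otherwise.
 */
static int model_check_irq_exit(struct model_vcpu *vcpu, struct model_run *run)
{
    if (vcpu->irq == NULL)
        return 0;

    run->exit_reason = MODEL_EXIT_IRQ;
    run->irq.irq = vcpu->irq->irq;
    run->irq.level = vcpu->irq->level;
    vcpu->irq = NULL;
    return 1;
}
```

Because there is only one slot, a second descriptor posted before the exit
would overwrite the first; this is the limitation the commit message refers to
when it says the scheme can be extended later.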



[PATCH 0/3] Add support for handling IRQs in userspace

2015-12-01 Thread Pavel Fedin
This patch series introduces the ability to handle IRQs in userspace. This is
currently necessary for ARM KVM in order to be able to use the virtual CP15
timer without an in-kernel irqchip. It allows KVM to be used on machines with
either a broken vGIC or a custom interrupt controller, like the Raspberry Pi 2.

The API is designed to be as architecture-agnostic as possible. Currently it
supports only a single IRQ, but it can easily be extended to accommodate more.

Pavel Fedin (3):
  KVM: Introduce KVM_EXIT_IRQ
  KVM: Documentation: Document KVM_EXIT_IRQ
  KVM: arm/arm64: Decouple virtual timer from vGIC

 Documentation/virtual/kvm/api.txt | 14 ++++++++++++++
 arch/arm/kvm/arm.c                | 14 +++++++-------
 include/linux/kvm_host.h          |  7 +++++++
 include/uapi/linux/kvm.h          |  3 +++
 virt/kvm/arm/arch_timer.c         | 23 +++++++++++++----------
 5 files changed, 44 insertions(+), 17 deletions(-)

-- 
2.4.4



[PATCH v6] arm/arm64: KVM: Detect vGIC presence at runtime

2015-12-01 Thread Pavel Fedin
Before commit 662d9715840aef44dcb573b0f9fab9e8319c868a
("arm/arm64: KVM: Kill CONFIG_KVM_ARM_{VGIC,TIMER}") it was possible to
compile the kernel without vGIC and vTimer support. The commit message
mentions the possibility of detecting vGIC support at runtime, but this
has never been implemented.

This patch introduces a runtime check, restoring the lost functionality.
It again allows KVM to be used on hardware without a vGIC. The interrupt
controller has to be emulated in userspace in this case.

An -ENODEV return code from the probe function means there is no GIC at all.
-ENXIO happens when, for example, there is a GIC node in the device tree,
but it does not specify vGIC resources. Normally this means that the vGIC
hardware is defunct. Any other error code is still treated as a full stop
because it might indicate some really serious problem.

This patch does not touch any virtual timer code, assuming that the timer
hardware is actually in place. Normally this is true on the boards in
question; however, since the vGIC is missing, it is impossible to correctly
utilize interrupts from the virtual timer. Since virtual timer handling is
under active redevelopment now, handling it in userspace is out of scope at
the moment. For now, the guest is advised to use some memory-mapped timer
which can be emulated in userspace.

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
v5 => v6:
- KVM_CAP_IRQFD patch also dropped, causing many problems on PowerPC and
  S390
- Rebased on top of 4.3-rc3

v4 => v5:
- Tested on top of kvmarm/next
- Dropped already applied part
- Fixed minor checkpatch issues

v3 => v4:
- Revert back to using switch on kvm_vgic_hyp_init() return code. I decided
  to leave 'vgic_present = false' statement because it helps to understand
  the code.

v2 => v3:
- Improved commit messages, added references to commits where the respective
  functionality was broken
- Explicitly specify that the solution currently affects only vGIC and has
  nothing to do with timer.
- Fixed code style according to previous notes
- Removed ARM64 save/restore patch introduced in v2 because it was already
  obsolete for linux-next
- Modify KVM_CAP_IRQFD handling in correct place

v1 => v2:
- Do not use defensive approach in patch 0001. Use correct conditions in
  callers instead
- Added ARM64-specific code, without which attempt to run a VM ends in a
  HYP crash because of unset vGIC save/restore function pointers
---
 arch/arm/kvm/arm.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index e06fd29..66f90c1 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -61,6 +61,8 @@ static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
 static u8 kvm_next_vmid;
 static DEFINE_SPINLOCK(kvm_vmid_lock);
 
+static bool vgic_present;
+
 static void kvm_arm_set_running_vcpu(struct kvm_vcpu *vcpu)
 {
BUG_ON(preemptible());
@@ -132,7 +134,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
kvm->arch.vmid_gen = 0;
 
/* The maximum number of VCPUs is limited by the host's GIC model */
-   kvm->arch.max_vcpus = kvm_vgic_get_max_vcpus();
+   kvm->arch.max_vcpus = vgic_present ?
+   kvm_vgic_get_max_vcpus() : KVM_MAX_VCPUS;
 
return ret;
 out_free_stage2_pgd:
@@ -172,6 +175,8 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
int r;
switch (ext) {
case KVM_CAP_IRQCHIP:
+   r = vgic_present;
+   break;
case KVM_CAP_IOEVENTFD:
case KVM_CAP_DEVICE_CTRL:
case KVM_CAP_USER_MEMORY:
@@ -913,6 +918,8 @@ static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm,
 
switch (dev_id) {
case KVM_ARM_DEVICE_VGIC_V2:
+   if (!vgic_present)
+   return -ENXIO;
return kvm_vgic_addr(kvm, type, &dev_addr->addr, true);
default:
return -ENODEV;
@@ -927,6 +934,8 @@ long kvm_arch_vm_ioctl(struct file *filp,
 
switch (ioctl) {
case KVM_CREATE_IRQCHIP: {
+   if (!vgic_present)
+   return -ENXIO;
return kvm_vgic_create(kvm, KVM_DEV_TYPE_ARM_VGIC_V2);
}
case KVM_ARM_SET_DEVICE_ADDR: {
@@ -1111,8 +1120,17 @@ static int init_hyp_mode(void)
 * Init HYP view of VGIC
 */
err = kvm_vgic_hyp_init();
-   if (err)
+   switch (err) {
+   case 0:
+   vgic_present = true;
+   break;
+   case -ENODEV:
+   case -ENXIO:
+   vgic_present = false;
+   break;
+   default:
goto out_free_context;
+   }
 
/*
 * Init HYP architected timer support
-- 
2.4.4
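
The error-code policy explained in the commit message can be summarised as a
small decision function. The sketch below is a standalone illustration, not
the kernel implementation: the enum and both functions are hypothetical names,
and only the errno values (ENODEV, ENXIO) are taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum model_probe_action {
    MODEL_PROBE_CONTINUE,  /* keep initializing, with or without a vGIC */
    MODEL_PROBE_ABORT      /* unexpected error: give up */
};

/* Classify the vGIC probe result the way the switch in the patch does. */
static enum model_probe_action
model_classify_vgic_probe(int err, bool *vgic_present)
{
    assert(vgic_present != NULL);
    switch (err) {
    case 0:
        *vgic_present = true;
        return MODEL_PROBE_CONTINUE;
    case -ENODEV:  /* no GIC at all */
    case -ENXIO:   /* GIC node exists, but no usable vGIC resources */
        *vgic_present = false;
        return MODEL_PROBE_CONTINUE;
    default:       /* anything else is treated as a hard failure */
        return MODEL_PROBE_ABORT;
    }
}

/* Mirrors the guarded ioctl paths: refuse irqchip creation without a vGIC. */
static int model_create_irqchip(bool vgic_present)
{
    return vgic_present ? 0 : -ENXIO;
}
```

Treating only two specific codes as "no vGIC" keeps genuine failures, such as
an allocation error during probing, from being silently downgraded.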



[PATCH] KVM: arm/arm64: Revert to old way of checking for device mapping in stage2_flush_ptes().

2015-12-01 Thread Pavel Fedin
This function takes stage-II physical addresses (a.k.a. IPA) as input, not
real physical addresses. This causes kvm_is_device_pfn() to return wrong
values, depending on how closely the guest and host memory maps match. This
results in completely broken KVM on some boards. The problem has been
caught on Samsung proprietary hardware.

Cc: sta...@vger.kernel.org
Fixes: e6fab5442345 ("ARM/arm64: KVM: test properly for a PTE's uncachedness")

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 arch/arm/kvm/mmu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 7dace90..51ad98f 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -310,7 +310,8 @@ static void stage2_flush_ptes(struct kvm *kvm, pmd_t *pmd,
 
pte = pte_offset_kernel(pmd, addr);
do {
-   if (!pte_none(*pte) && !kvm_is_device_pfn(__phys_to_pfn(addr)))
+   if (!pte_none(*pte) &&
+   (pte_val(*pte) & PAGE_S2_DEVICE) != PAGE_S2_DEVICE)
kvm_flush_dcache_pte(*pte);
} while (pte++, addr += PAGE_SIZE, addr != end);
 }
-- 
2.4.4
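
Why feeding an IPA into a host-side check goes wrong can be shown with a toy
memory map. The following is a minimal sketch under loudly stated assumptions:
the host RAM range, the MODEL_S2_DEVICE attribute mask and all function names
are invented for illustration and do not match the real kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12

/* Hypothetical host layout: RAM occupies 0x80000000..0xBFFFFFFF only. */
#define MODEL_HOST_RAM_START 0x80000000ULL
#define MODEL_HOST_RAM_END   0xC0000000ULL

/* Hypothetical stage-2 attribute mask marking a device mapping. */
#define MODEL_S2_DEVICE (0x3ULL << 2)

/* Host-side test: a PFN counts as a device if it is not host RAM. */
static bool model_is_device_pfn(uint64_t pfn)
{
    uint64_t pa = pfn << MODEL_PAGE_SHIFT;
    return pa < MODEL_HOST_RAM_START || pa >= MODEL_HOST_RAM_END;
}

/* Broken variant: applies the host-side test to a guest IPA. */
static bool model_needs_flush_by_ipa(uint64_t ipa)
{
    return !model_is_device_pfn(ipa >> MODEL_PAGE_SHIFT);
}

/* Variant used by the patch: decide from the attributes stored in the PTE. */
static bool model_needs_flush_by_pte(uint64_t pte)
{
    return (pte & MODEL_S2_DEVICE) != MODEL_S2_DEVICE;
}
```

With this layout, guest RAM placed at IPA 0x40000000 is misclassified as a
device by the IPA-based variant and never flushed, while guest RAM that
happens to sit at 0x80000000 is handled correctly by accident. That is
consistent with the bug showing up only on some boards.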



RE: [PATCH v5 2/2] KVM: Make KVM_CAP_IRQFD dependent on KVM_CAP_IRQCHIP

2015-12-01 Thread Pavel Fedin
 Hello!

> >  b) I simply drop it as it is, because current qemu knows about the
> > dependency and does not try to use irqfd without irqchip, because there's
> > simply no use for them. But, well, perhaps there would be an exception in
> > vhost, i don't remember testing it.
> Wouldn't an irqfd emulation cover vhost?

 I've just tested, and no, it does not cause any problems with qemu. It
correctly detects that the whole thing is not running and falls back to not
using vhost. This is the output from my qemu:
--- cut ---
2015-12-01T11:03:16.135724Z qemu-system-arm: Error binding guest notifier: 11
2015-12-01T11:03:16.135849Z qemu-system-arm: unable to start vhost net: 11: falling back on userspace virtio
--- cut ---

 So, to summarize: we just drop this patch, and only patch 1 remains.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia


> -----Original Message-----
> From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On Behalf Of Cornelia Huck
> Sent: Monday, November 30, 2015 5:38 PM
> To: Pavel Fedin
> Cc: kvm...@lists.cs.columbia.edu; kvm@vger.kernel.org; 'Marc Zyngier'; 'Christoffer Dall'; 'Gleb Natapov'; 'Paolo Bonzini'
> Subject: Re: [PATCH v5 2/2] KVM: Make KVM_CAP_IRQFD dependent on KVM_CAP_IRQCHIP
> 
> On Mon, 30 Nov 2015 15:41:20 +0300
> Pavel Fedin <p.fe...@samsung.com> wrote:
> 
> >  Hello!
> >
> > > >  Thank you for the note, i didn't know about irqchip-specific
> > > > capability codes. There's the same issue with PowerPC, now i
> > > > understand why there's no KVM_CAP_IRQCHIP for them. Because they have
> > > > KVM_CAP_IRQ_MPIC and KVM_CAP_IRQ_XICS, similar to S390.
> > > >  But isn't it just weird? I understand that perhaps we have some real
> > > > need to distinguish between different irqchip types, but shouldn't the
> > > > kernel also publish KVM_CAP_IRQCHIP, which stands just for "we support
> > > > some irqchip virtualization"?
> > > >  May be we should just add this for PowerPC and S390, to make things
> > > > less ambiguous?
> > >
> > > Note that we explicitly need to _enable_ the s390 cap (for
> > > compatibility). I'd need to recall the exact details but I came to the
> > > conclusion back than that I could not simply enable KVM_CAP_IRQCHIP for
> > > s390 (and current qemu would fail to enable the s390 cap if we started
> > > advertising KVM_CAP_IRQCHIP now).
> >
> >  OMG... I've looked at the code, what a mess...
> >  If i was implementing this, i'd simply introduce kvm_vm_enable_cap(s,
> > KVM_CAP_IRQCHIP, 0), which would be allowed to fail with -ENOSYS, so that
> > backwards compatibility is kept and an existing API is reused... But,
> > well, it's already impossible to unscramble an egg... :)
> >  Ok, i think in current situation we could choose one of these ways (both
> > are based on the fact that it's obvious that irqfd require IRQCHIP).
> >  a) I look for an alternate way to report KVM_CAP_IRQFD dynamically, and
> > maybe PowerPC and S390 follow this way.
> 
> The thing is: _when_ can you report KVM_CAP_IRQFD? It obviously
> requires an irqchip; but if you need some configuration/enablement
> beforehand, you'll get different values depending on when you retrieve
> the cap. So does KVM_CAP_IRQFD mean "irqfds are available in principle"
> or "everything has been setup for usage of irqfds"? I'd assume the
> former.
> 
> >  b) I simply drop it as it is, because current qemu knows about the
> > dependency and does not try to use irqfd without irqchip, because there's
> > simply no use for them. But, well, perhaps there would be an exception in
> > vhost, i don't remember testing it.
> 
> Wouldn't an irqfd emulation cover vhost?
> 
> >  So what shall we do?
> 



RE: BUG ALERT: ARM32 KVM does not work in 4.4-rc3

2015-12-01 Thread Pavel Fedin
Hello!

> -----Original Message-----
> From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On Behalf Of Pavel Fedin
> Sent: Tuesday, December 01, 2015 1:03 PM
> To: 'Marc Zyngier'; kvm...@lists.cs.columbia.edu; kvm@vger.kernel.org
> Cc: 'Ard Biesheuvel'; christoffer.d...@linaro.org
> Subject: RE: BUG ALERT: ARM32 KVM does not work in 4.4-rc3
> 
>  Hello!
> 
> > > My project involves ARM64, but from time to time i also test ARM32
> > > KVM. I have discovered that it stopped working in 4.4-rc3. The same
> > > virtual machine works perfectly under current kvmarm/next, but gets
> > > stuck at random point under 4.4-rc3 from linux-stable. I'm not sure
> > > that i have time to investigate this quickly, but i'll post some new
> > > information as soon as i get it
> 
> [skip]
> 
> > So until you bisect it to an exact commit and configuration, I declare
> > the alert over. ;-)
> 
>  The commit in question is e6fab54423450d699a09ec2b899473a541f61971
> ("ARM/arm64: KVM: test properly for a PTE's uncachedness").
> Reverting it fixes the problem.
>  Study in qemu shows that the CPU gets stuck at PC = 0x0C with LR = 0x10. So
> i quickly decided that it might have to do with caching, and my first hit was
> correct. The guest crashes in this state very early, sometimes it even cannot
> fully print "Uncompressing kernel".
>  The machine which reproduces it is custom Samsung's out-of-tree board. I'll
> investigate it further in order to determine how exactly the commit could
> harm. I know that it passed reviews and testing, and i was involved too.
> Perhaps it's board's code fault, however.
> 
> Cc'ed to others involved.

 The problem seems to be triggered by ARCH_SPARSEMEM_ENABLE. My top-secret
machine uses it, while more widespread platforms like vexpress and Exynos
don't.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




BUG ALERT: ARM32 KVM does not work in 4.4-rc3

2015-11-30 Thread Pavel Fedin
 Hello!

 My project involves ARM64, but from time to time I also test ARM32 KVM. I have
discovered that it stopped working in 4.4-rc3. The same virtual machine works
perfectly under current kvmarm/next, but gets stuck at a random point under
4.4-rc3 from linux-stable.
 I'm not sure that I have time to investigate this quickly, but I'll post new
information as soon as I get it.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia





RE: [PATCH v5 2/2] KVM: Make KVM_CAP_IRQFD dependent on KVM_CAP_IRQCHIP

2015-11-30 Thread Pavel Fedin
 Hello!

> > case KVM_CAP_INTERNAL_ERROR_DATA:
> >  #ifdef CONFIG_HAVE_KVM_MSI
> > case KVM_CAP_SIGNAL_MSI:
> > +   /* Fallthrough */
> >  #endif
> > +   case KVM_CAP_CHECK_EXTENSION_VM:
> > +   return 1;
> >  #ifdef CONFIG_HAVE_KVM_IRQFD
> > case KVM_CAP_IRQFD:
> > case KVM_CAP_IRQFD_RESAMPLE:
> > +   return kvm_vm_ioctl_check_extension(kvm, KVM_CAP_IRQCHIP);
> 
> This won't work for s390, as it doesn't have KVM_CAP_IRQCHIP but
> KVM_CAP_S390_IRQCHIP (which needs to be enabled).

 Thank you for the note, I didn't know about irqchip-specific capability codes.
There's the same issue with PowerPC; now I understand why there's no
KVM_CAP_IRQCHIP for them: they have KVM_CAP_IRQ_MPIC and KVM_CAP_IRQ_XICS,
similar to S390.
 But isn't it just weird? I understand that perhaps we have some real need to
distinguish between different irqchip types, but shouldn't the kernel also
publish KVM_CAP_IRQCHIP, which stands just for "we support some irqchip
virtualization"?
 Maybe we should just add this for PowerPC and S390, to make things less
ambiguous?

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia



[PATCH v5 0/2] KVM: arm/arm64: Allow to use KVM without in-kernel irqchip

2015-11-30 Thread Pavel Fedin
This patch set brings back functionality which was broken in v4.0.
Unfortunately, it is currently impossible to take advantage of the virtual
architected timer in this case; therefore a guest running in such a
restricted mode has to use some memory-mapped timer. But it is still
better than nothing.

Patch 0002 needs to be verified on the PowerPC architecture, because I've
got the impression that KVM_CAP_IRQCHIP has been forgotten there.

v4 => v5:
- Tested on top of kvmarm/next
- Dropped already applied part
- Fixed minor checkpatch issues

v3 => v4:
- Revert back to using switch on kvm_vgic_hyp_init() return code. I decided
  to leave 'vgic_present = false' statement because it helps to understand
  the code.

v2 => v3:
- Improved commit messages, added references to commits where the respective
  functionality was broken
- Explicitly specify that the solution currently affects only vGIC and has
  nothing to do with timer.
- Fixed code style according to previous notes
- Removed ARM64 save/restore patch introduced in v2 because it was already
  obsolete for linux-next
- Modify KVM_CAP_IRQFD handling in correct place

v1 => v2:
- Do not use defensive approach in patch 0001. Use correct conditions in
  callers instead
- Added ARM64-specific code, without which attempt to run a VM ends in a
  HYP crash because of unset vGIC save/restore function pointers



Pavel Fedin (2):
  arm/arm64: KVM: Detect vGIC presence at runtime
  KVM: Make KVM_CAP_IRQFD dependent on KVM_CAP_IRQCHIP

 arch/arm/kvm/arm.c  | 22 ++++++++++++++++++++--
 virt/kvm/kvm_main.c |  6 ++++--
 2 files changed, 24 insertions(+), 4 deletions(-)

-- 
2.4.4



[PATCH v5 1/2] arm/arm64: KVM: Detect vGIC presence at runtime

2015-11-30 Thread Pavel Fedin
Before commit 662d9715840aef44dcb573b0f9fab9e8319c868a
("arm/arm64: KVM: Kill CONFIG_KVM_ARM_{VGIC,TIMER}") it was possible to
compile the kernel without vGIC and vTimer support. The commit message
mentions the possibility of detecting vGIC support at runtime, but this
has never been implemented.

This patch introduces a runtime check, restoring the lost functionality.
It again allows KVM to be used on hardware without a vGIC. The interrupt
controller has to be emulated in userspace in this case.

An -ENODEV return code from the probe function means there is no GIC at all.
-ENXIO happens when, for example, there is a GIC node in the device tree,
but it does not specify vGIC resources. Normally this means that the vGIC
hardware is defunct. Any other error code is still treated as a full stop
because it might indicate some really serious problem.

This patch does not touch any virtual timer code, assuming that the timer
hardware is actually in place. Normally this is true on the boards in
question; however, since the vGIC is missing, it is impossible to correctly
utilize interrupts from the virtual timer. Since virtual timer handling is
under active redevelopment now, handling it in userspace is out of scope at
the moment. For now, the guest is advised to use some memory-mapped timer
which can be emulated in userspace.

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 arch/arm/kvm/arm.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index eab83b2..d581756 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -61,6 +61,8 @@ static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
 static u8 kvm_next_vmid;
 static DEFINE_SPINLOCK(kvm_vmid_lock);
 
+static bool vgic_present;
+
 static void kvm_arm_set_running_vcpu(struct kvm_vcpu *vcpu)
 {
BUG_ON(preemptible());
@@ -132,7 +134,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
kvm->arch.vmid_gen = 0;
 
/* The maximum number of VCPUs is limited by the host's GIC model */
-   kvm->arch.max_vcpus = kvm_vgic_get_max_vcpus();
+   kvm->arch.max_vcpus = vgic_present ?
+   kvm_vgic_get_max_vcpus() : KVM_MAX_VCPUS;
 
return ret;
 out_free_stage2_pgd:
@@ -172,6 +175,8 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
int r;
switch (ext) {
case KVM_CAP_IRQCHIP:
+   r = vgic_present;
+   break;
case KVM_CAP_IOEVENTFD:
case KVM_CAP_DEVICE_CTRL:
case KVM_CAP_USER_MEMORY:
@@ -918,6 +923,8 @@ static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm,
 
switch (dev_id) {
case KVM_ARM_DEVICE_VGIC_V2:
+   if (!vgic_present)
+   return -ENXIO;
return kvm_vgic_addr(kvm, type, &dev_addr->addr, true);
default:
return -ENODEV;
@@ -932,6 +939,8 @@ long kvm_arch_vm_ioctl(struct file *filp,
 
switch (ioctl) {
case KVM_CREATE_IRQCHIP: {
+   if (!vgic_present)
+   return -ENXIO;
return kvm_vgic_create(kvm, KVM_DEV_TYPE_ARM_VGIC_V2);
}
case KVM_ARM_SET_DEVICE_ADDR: {
@@ -1116,8 +1125,17 @@ static int init_hyp_mode(void)
 * Init HYP view of VGIC
 */
err = kvm_vgic_hyp_init();
-   if (err)
+   switch (err) {
+   case 0:
+   vgic_present = true;
+   break;
+   case -ENODEV:
+   case -ENXIO:
+   vgic_present = false;
+   break;
+   default:
goto out_free_context;
+   }
 
/*
 * Init HYP architected timer support
-- 
2.4.4

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v5 2/2] KVM: Make KVM_CAP_IRQFD dependent on KVM_CAP_IRQCHIP

2015-11-30 Thread Pavel Fedin
ARM, at least, is now able to determine at runtime whether the machine has
virtualization support for an irqchip. Obviously, irqfd requires an
irqchip.

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 virt/kvm/kvm_main.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 7873d6d..a057d5e 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2716,13 +2716,15 @@ static long kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
case KVM_CAP_INTERNAL_ERROR_DATA:
 #ifdef CONFIG_HAVE_KVM_MSI
case KVM_CAP_SIGNAL_MSI:
+   /* Fallthrough */
 #endif
+   case KVM_CAP_CHECK_EXTENSION_VM:
+   return 1;
 #ifdef CONFIG_HAVE_KVM_IRQFD
case KVM_CAP_IRQFD:
case KVM_CAP_IRQFD_RESAMPLE:
+   return kvm_vm_ioctl_check_extension(kvm, KVM_CAP_IRQCHIP);
 #endif
-   case KVM_CAP_CHECK_EXTENSION_VM:
-   return 1;
 #ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
case KVM_CAP_IRQ_ROUTING:
return KVM_MAX_IRQ_ROUTES;
-- 
2.4.4



RE: [PATCH v5 2/2] KVM: Make KVM_CAP_IRQFD dependent on KVM_CAP_IRQCHIP

2015-11-30 Thread Pavel Fedin
 Hello!

> >  Thank you for the note, i didn't know about irqchip-specific capability codes. There's the
> > same issue with PowerPC, now i understand why there's no KVM_CAP_IRQCHIP for them. Because
> > they have KVM_CAP_IRQ_MPIC and KVM_CAP_IRQ_XICS, similar to S390.
> >  But isn't it just weird? I understand that perhaps we have some real need to distinguish
> > between different irqchip types, but shouldn't the kernel also publish KVM_CAP_IRQCHIP,
> > which stands just for "we support some irqchip virtualization"?
> >  Maybe we should just add this for PowerPC and S390, to make things less ambiguous?
> 
> Note that we explicitly need to _enable_ the s390 cap (for
> compatibility). I'd need to recall the exact details but I came to the
> conclusion back then that I could not simply enable KVM_CAP_IRQCHIP for
> s390 (and current qemu would fail to enable the s390 cap if we started
> advertising KVM_CAP_IRQCHIP now).

 OMG... I've looked at the code, what a mess...
 If I were implementing this, I'd simply introduce kvm_vm_enable_cap(s, KVM_CAP_IRQCHIP, 0),
which would be allowed to fail with -ENOSYS, so that backwards compatibility is kept and an
existing API is reused... But, well, it's already impossible to unscramble an egg... :)
 Ok, I think in the current situation we could choose one of these ways (both are based on
the fact that irqfd obviously requires IRQCHIP):
 a) I look for an alternate way to report KVM_CAP_IRQFD dynamically, and maybe PowerPC and
S390 follow this way.
 b) I simply drop it as it is, because current qemu knows about the dependency and does not
try to use irqfd without irqchip, because there's simply no use for it. But, well, perhaps
there would be an exception in vhost, I don't remember testing it.
 So what shall we do?

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




RE: [PATCH v5 2/2] KVM: Make KVM_CAP_IRQFD dependent on KVM_CAP_IRQCHIP

2015-11-30 Thread Pavel Fedin
 Hello!

> >  b) I simply drop it as it is, because current qemu knows about the dependency and does
> > not try to use irqfd without irqchip, because there's simply no use for them. But, well,
> > perhaps there would be an exception in vhost, i don't remember testing it.
> 
> Wouldn't an irqfd emulation cover vhost?

 Of course it would. At least it should, but perhaps will need some minor 
tweaks.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




RE: [PATCH v3 15/16] KVM: arm64: implement MSI injection in ITS emulation

2015-11-25 Thread Pavel Fedin
.
> @@ -812,6 +861,19 @@ static int vits_cmd_handle_movall(struct kvm *kvm, u64 *its_cmd)
>   return 0;
>  }
> 
> +/* The INT command injects the LPI associated with that DevID/EvID pair. */
> +static int vits_cmd_handle_int(struct kvm *kvm, u64 *its_cmd)
> +{
> + struct kvm_msi msi = {
> + .data = its_cmd_get_id(its_cmd),
> + .devid = its_cmd_get_deviceid(its_cmd),
> + .flags = KVM_MSI_VALID_DEVID,
> + };
> +
> + vits_inject_msi(kvm, &msi);
> + return 0;
> +}
> +
>  /*
>   * This function is called with both the ITS and the distributor lock dropped,
>   * so the actual command handlers must take the respective locks when needed.
> @@ -846,6 +908,9 @@ static int vits_handle_command(struct kvm_vcpu *vcpu, u64 *its_cmd)
>   case GITS_CMD_MOVALL:
>   ret = vits_cmd_handle_movall(vcpu->kvm, its_cmd);
>   break;
> + case GITS_CMD_INT:
> + ret = vits_cmd_handle_int(vcpu->kvm, its_cmd);
> + break;
>   case GITS_CMD_INV:
>   ret = vits_cmd_handle_inv(vcpu->kvm, its_cmd);
>   break;
> diff --git a/virt/kvm/arm/its-emul.h b/virt/kvm/arm/its-emul.h
> index 830524a..95e56a7 100644
> --- a/virt/kvm/arm/its-emul.h
> +++ b/virt/kvm/arm/its-emul.h
> @@ -36,6 +36,8 @@ void vgic_enable_lpis(struct kvm_vcpu *vcpu);
>  int vits_init(struct kvm *kvm);
>  void vits_destroy(struct kvm *kvm);
> 
> +int vits_inject_msi(struct kvm *kvm, struct kvm_msi *msi);
> +
>  bool vits_queue_lpis(struct kvm_vcpu *vcpu);
>  void vits_unqueue_lpi(struct kvm_vcpu *vcpu, int irq);
> 
> diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c
> index f482e34..90f3628 100644
> --- a/virt/kvm/arm/vgic-v3-emul.c
> +++ b/virt/kvm/arm/vgic-v3-emul.c
> @@ -944,6 +944,7 @@ void vgic_v3_init_emulation(struct kvm *kvm)
>   dist->vm_ops.init_model = vgic_v3_init_model;
>   dist->vm_ops.destroy_model = vgic_v3_destroy_model;
>   dist->vm_ops.map_resources = vgic_v3_map_resources;
> + dist->vm_ops.inject_msi = vits_inject_msi;
>   dist->vm_ops.queue_lpis = vits_queue_lpis;
>   dist->vm_ops.unqueue_lpi = vits_unqueue_lpi;
> 
> --
> 2.5.1

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




[PATCH v6 2/7] KVM: arm/arm64: Move endianness conversion out of vgic_attr_regs_access()

2015-11-24 Thread Pavel Fedin
mmio_data_read() and mmio_data_write(), originally used in this function,
are limited only to 32 bits. We are going to refactor this code and
eventually let it do 64-bit I/O for vGICv3. Therefore, our first step is
to get rid of this limitation.

We open up these inlines, which consist of endianness conversion and
masking. Masking is not used here (the mask is set to ~0), so we just
move out the remaining endianness conversion.
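
The effect of moving the conversion to the callers can be illustrated outside the kernel. This is a minimal sketch under stated assumptions: sketch_cpu_to_le32() and sketch_le32_to_cpu() are portable stand-ins for the kernel's cpu_to_le32() and le32_to_cpu(), and sketch_access() with its one-register "device" is a hypothetical replacement for vgic_attr_regs_access(), which only ever sees little-endian bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Portable stand-in for cpu_to_le32(): store the value as LE bytes. */
static void sketch_cpu_to_le32(uint32_t value, uint8_t out[4])
{
	out[0] = (uint8_t)(value & 0xff);
	out[1] = (uint8_t)((value >> 8) & 0xff);
	out[2] = (uint8_t)((value >> 16) & 0xff);
	out[3] = (uint8_t)((value >> 24) & 0xff);
}

/* Portable stand-in for le32_to_cpu(): assemble the value from LE bytes. */
static uint32_t sketch_le32_to_cpu(const uint8_t in[4])
{
	return (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
	       ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}

/* Toy one-register device; like the MMIO handlers it stores LE bytes. */
static uint8_t sketch_reg[4];

/* Hypothetical access routine: no conversion happens in here any more. */
static void sketch_access(uint8_t data[4], int is_write)
{
	if (is_write)
		memcpy(sketch_reg, data, 4);
	else
		memcpy(data, sketch_reg, 4);
}

/* The callers own the conversion, as in the set_attr/get_attr paths. */
static void sketch_set_attr(uint32_t reg)
{
	uint8_t data[4];

	sketch_cpu_to_le32(reg, data);
	sketch_access(data, 1);
}

static uint32_t sketch_get_attr(void)
{
	uint8_t data[4] = { 0 };

	sketch_access(data, 0);
	return sketch_le32_to_cpu(data);
}
```

Because the access routine only moves bytes, widening it to 64 bits later should not require touching any conversion logic inside it.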

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 virt/kvm/arm/vgic-v2-emul.c | 20 
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/virt/kvm/arm/vgic-v2-emul.c b/virt/kvm/arm/vgic-v2-emul.c
index 1390797..959b9c6 100644
--- a/virt/kvm/arm/vgic-v2-emul.c
+++ b/virt/kvm/arm/vgic-v2-emul.c
@@ -663,7 +663,7 @@ static const struct vgic_io_range vgic_cpu_ranges[] = {
 
 static int vgic_attr_regs_access(struct kvm_device *dev,
 struct kvm_device_attr *attr,
-u32 *reg, bool is_write)
+__le32 *data, bool is_write)
 {
const struct vgic_io_range *r = NULL, *ranges;
phys_addr_t offset;
@@ -671,7 +671,6 @@ static int vgic_attr_regs_access(struct kvm_device *dev,
struct kvm_vcpu *vcpu, *tmp_vcpu;
struct vgic_dist *vgic;
struct kvm_exit_mmio mmio;
-   u32 data;
 
offset = attr->attr & KVM_DEV_ARM_VGIC_OFFSET_MASK;
cpuid = (attr->attr & KVM_DEV_ARM_VGIC_CPUID_MASK) >>
@@ -693,9 +692,7 @@ static int vgic_attr_regs_access(struct kvm_device *dev,
 
mmio.len = 4;
mmio.is_write = is_write;
-   mmio.data = &data;
-   if (is_write)
-   mmio_data_write(&mmio, ~0, *reg);
+   mmio.data = data;
switch (attr->group) {
case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
mmio.phys_addr = vgic->vgic_dist_base + offset;
@@ -743,9 +740,6 @@ static int vgic_attr_regs_access(struct kvm_device *dev,
offset -= r->base;
r->handle_mmio(vcpu, &mmio, offset);
 
-   if (!is_write)
-   *reg = mmio_data_read(&mmio, ~0);
-
ret = 0;
 out_vgic_unlock:
spin_unlock(&vgic->lock);
@@ -778,11 +772,13 @@ static int vgic_v2_set_attr(struct kvm_device *dev,
case KVM_DEV_ARM_VGIC_GRP_CPU_REGS: {
u32 __user *uaddr = (u32 __user *)(long)attr->addr;
u32 reg;
+   __le32 data;
 
if (get_user(reg, uaddr))
return -EFAULT;
 
-   return vgic_attr_regs_access(dev, attr, &reg, true);
+   data = cpu_to_le32(reg);
+   return vgic_attr_regs_access(dev, attr, &data, true);
}
 
}
@@ -803,12 +799,12 @@ static int vgic_v2_get_attr(struct kvm_device *dev,
case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
case KVM_DEV_ARM_VGIC_GRP_CPU_REGS: {
u32 __user *uaddr = (u32 __user *)(long)attr->addr;
-   u32 reg = 0;
+   __le32 data = 0;
 
-   ret = vgic_attr_regs_access(dev, attr, &reg, false);
+   ret = vgic_attr_regs_access(dev, attr, &data, false);
if (ret)
return ret;
-   return put_user(reg, uaddr);
+   return put_user(le32_to_cpu(data), uaddr);
}
 
}
-- 
2.4.4



[PATCH v6 7/7] KVM: arm64: Implement vGICv3 CPU interface access

2015-11-24 Thread Pavel Fedin
The access size is always 64 bits. Since CPU interface state affects
only a single vCPU, no vGIC locking is done, which avoids code
duplication. We only make sure that the vCPU is not running.
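
For illustration, here is how a userspace caller might build the attribute value for this group. It is a rough sketch, not the uapi header: the shifts follow the instr layout given in the API documentation patch of this series (Op0 in bits 15..14, Op1 in 13..11, CRn in 10..7, CRm in 6..3, Op2 in 2..0, packed mpidr in bits 63..32), and the sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions of the 16-bit instr value, per the documented layout. */
#define SKETCH_OP0_SHIFT 14
#define SKETCH_OP1_SHIFT 11
#define SKETCH_CRN_SHIFT 7
#define SKETCH_CRM_SHIFT 3
#define SKETCH_OP2_SHIFT 0

/* Hypothetical equivalent of KVM_DEV_ARM_VGIC_SYSREG(op0, op1, crn, crm, op2). */
static uint64_t sketch_vgic_sysreg(unsigned int op0, unsigned int op1,
				   unsigned int crn, unsigned int crm,
				   unsigned int op2)
{
	return ((uint64_t)(op0 & 0x3) << SKETCH_OP0_SHIFT) |
	       ((uint64_t)(op1 & 0x7) << SKETCH_OP1_SHIFT) |
	       ((uint64_t)(crn & 0xf) << SKETCH_CRN_SHIFT) |
	       ((uint64_t)(crm & 0xf) << SKETCH_CRM_SHIFT) |
	       ((uint64_t)(op2 & 0x7) << SKETCH_OP2_SHIFT);
}

/* Combine the packed affinity (Aff3..Aff0, 32 bits) with the instr field. */
static uint64_t sketch_cpu_sysreg_attr(uint32_t packed_mpidr, uint64_t instr)
{
	return ((uint64_t)packed_mpidr << 32) | (instr & 0xffff);
}
```

For example, the encoding op0=3, op1=0, CRn=12, CRm=12, op2=4 (which should correspond to ICC_CTLR_EL1, but treat the register name as an assumption) gives an instr value of 0xC664.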

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 arch/arm64/include/uapi/asm/kvm.h  |  14 ++-
 include/linux/irqchip/arm-gic-v3.h |  18 ++-
 virt/kvm/arm/vgic-v3-emul.c| 232 -
 3 files changed, 258 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 98bd047..ca32fe5 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -179,14 +179,14 @@ struct kvm_arch_memory_slot {
KVM_REG_ARM64_SYSREG_ ## n ## _MASK)
 
 #define __ARM64_SYS_REG(op0,op1,crn,crm,op2) \
-   (KVM_REG_ARM64 | KVM_REG_ARM64_SYSREG | \
-   ARM64_SYS_REG_SHIFT_MASK(op0, OP0) | \
+   (ARM64_SYS_REG_SHIFT_MASK(op0, OP0) | \
ARM64_SYS_REG_SHIFT_MASK(op1, OP1) | \
ARM64_SYS_REG_SHIFT_MASK(crn, CRN) | \
ARM64_SYS_REG_SHIFT_MASK(crm, CRM) | \
ARM64_SYS_REG_SHIFT_MASK(op2, OP2))
 
-#define ARM64_SYS_REG(...) (__ARM64_SYS_REG(__VA_ARGS__) | KVM_REG_SIZE_U64)
+#define ARM64_SYS_REG(...) (__ARM64_SYS_REG(__VA_ARGS__) | KVM_REG_ARM64 | \
+   KVM_REG_SIZE_U64 | KVM_REG_ARM64_SYSREG)
 
 #define KVM_REG_ARM_TIMER_CTL  ARM64_SYS_REG(3, 3, 14, 3, 1)
 #define KVM_REG_ARM_TIMER_CNT  ARM64_SYS_REG(3, 3, 14, 3, 2)
@@ -204,6 +204,14 @@ struct kvm_arch_memory_slot {
 #define KVM_DEV_ARM_VGIC_GRP_CTRL  4
 #define   KVM_DEV_ARM_VGIC_CTRL_INIT   0
 #define KVM_DEV_ARM_VGIC_GRP_REDIST_REGS 5
+#define KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS 6
+#define   KVM_DEV_ARM_VGIC_SYSREG_MASK (KVM_REG_ARM64_SYSREG_OP0_MASK | \
+KVM_REG_ARM64_SYSREG_OP1_MASK | \
+KVM_REG_ARM64_SYSREG_CRN_MASK | \
+KVM_REG_ARM64_SYSREG_CRM_MASK | \
+KVM_REG_ARM64_SYSREG_OP2_MASK)
+#define   KVM_DEV_ARM_VGIC_SYSREG(op0, op1, crn, crm, op2) \
+   __ARM64_SYS_REG(op0, op1, crn, crm, op2)
 
 /* KVM_IRQ_LINE irq field index values */
 #define KVM_ARM_IRQ_TYPE_SHIFT 24
diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
index 7e9f9d5..dfd2bed 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -261,8 +261,14 @@
 /*
  * CPU interface registers
  */
-#define ICC_CTLR_EL1_EOImode_drop_dir  (0U << 1)
-#define ICC_CTLR_EL1_EOImode_drop  (1U << 1)
+#define ICC_CTLR_EL1_CBPR_SHIFT0
+#define ICC_CTLR_EL1_EOImode_SHIFT 1
+#define ICC_CTLR_EL1_EOImode_drop_dir  (0U << ICC_CTLR_EL1_EOImode_SHIFT)
+#define ICC_CTLR_EL1_EOImode_drop  (1U << ICC_CTLR_EL1_EOImode_SHIFT)
+#define ICC_CTLR_EL1_PRIbits_MASK  (7U << 8)
+#define ICC_CTLR_EL1_IDbits_MASK   (7U << 11)
+#define ICC_CTLR_EL1_SEIS  (1U << 14)
+#define ICC_CTLR_EL1_A3V   (1U << 15)
 #define ICC_SRE_EL1_SRE(1U << 0)
 
 /*
@@ -287,6 +293,14 @@
 
 #define ICH_VMCR_CTLR_SHIFT0
 #define ICH_VMCR_CTLR_MASK (0x21f << ICH_VMCR_CTLR_SHIFT)
+#define ICH_VMCR_ENG0_SHIFT0
+#define ICH_VMCR_ENG0  (1 << ICH_VMCR_ENG0_SHIFT)
+#define ICH_VMCR_ENG1_SHIFT1
+#define ICH_VMCR_ENG1  (1 << ICH_VMCR_ENG1_SHIFT)
+#define ICH_VMCR_CBPR_SHIFT4
+#define ICH_VMCR_CBPR  (1 << ICH_VMCR_CBPR_SHIFT)
+#define ICH_VMCR_EOIM_SHIFT9
+#define ICH_VMCR_EOIM  (1 << ICH_VMCR_EOIM_SHIFT)
 #define ICH_VMCR_BPR1_SHIFT18
 #define ICH_VMCR_BPR1_MASK (7 << ICH_VMCR_BPR1_SHIFT)
 #define ICH_VMCR_BPR0_SHIFT21
diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c
index d9d644c..113c386 100644
--- a/virt/kvm/arm/vgic-v3-emul.c
+++ b/virt/kvm/arm/vgic-v3-emul.c
@@ -48,6 +48,7 @@
 #include 
 #include 
 
+#include "sys_regs.h"
 #include "vgic.h"
 
 static bool handle_mmio_rao_wi(struct kvm_vcpu *vcpu,
@@ -991,6 +992,227 @@ void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg)
vgic_kick_vcpus(vcpu->kvm);
 }
 
+static bool access_gic_ctlr(struct kvm_vcpu *vcpu,
+   const struct sys_reg_params *p,
+   const struct sys_reg_desc *r)
+{
+   u64 val;
+   struct vgic_v3_cpu_if *vgicv3 = &vcpu->arch.vgic_cpu.vgic_v3;
+
+   if (p->is_write) {
+   val = *p->val;
+
+   vgicv3->vgic_vmcr &= ~(ICH_VMCR_CBPR|ICH_VMCR_EOIM);
+   vgicv3->vgic_vmcr |= (val << (ICH_VMCR_CBPR_SHIFT -
+   

[PATCH v6 6/7] KVM: arm64: Introduce find_reg_by_id()

2015-11-24 Thread Pavel Fedin
In order to implement vGICv3 CPU interface access, we will need to
perform table lookup of system registers. We would need both
index_to_params() and find_reg() exported for that purpose, but instead
we export a single function which combines them both.
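
The combined lookup can be sketched as a standalone toy. Everything here is an assumption made for illustration: the sketch_* types are simplified stand-ins for sys_reg_params and sys_reg_desc, the decode rule only accepts a bare 16-bit encoding, and the table entries are made-up names. The only point is the shape: decode the ID, then binary-search a sorted table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct sketch_params { uint8_t op0, op1, crn, crm, op2; };
struct sketch_desc { struct sketch_params enc; const char *name; };

/* Toy decode: accept only IDs that fit into the 16-bit encoding field. */
static bool sketch_index_to_params(uint64_t id, struct sketch_params *p)
{
	if (id >> 16)
		return false;
	p->op0 = (id >> 14) & 0x3;
	p->op1 = (id >> 11) & 0x7;
	p->crn = (id >> 7) & 0xf;
	p->crm = (id >> 3) & 0xf;
	p->op2 = id & 0x7;
	return true;
}

/* bsearch() comparator: the key is a params struct, elements are descs. */
static int sketch_cmp(const void *key, const void *elem)
{
	const struct sketch_params *k = key;
	const struct sketch_desc *d = elem;

	if (k->op0 != d->enc.op0)
		return k->op0 - d->enc.op0;
	if (k->op1 != d->enc.op1)
		return k->op1 - d->enc.op1;
	if (k->crn != d->enc.crn)
		return k->crn - d->enc.crn;
	if (k->crm != d->enc.crm)
		return k->crm - d->enc.crm;
	return k->op2 - d->enc.op2;
}

static const struct sketch_desc *sketch_find_reg(const struct sketch_params *p,
						 const struct sketch_desc table[],
						 size_t num)
{
	return bsearch(p, table, num, sizeof(table[0]), sketch_cmp);
}

/* The single exported entry point: decode and look up in one call. */
static const struct sketch_desc *sketch_find_reg_by_id(uint64_t id,
						       struct sketch_params *p,
						       const struct sketch_desc table[],
						       size_t num)
{
	if (!sketch_index_to_params(id, p))
		return NULL;
	return sketch_find_reg(p, table, num);
}

/* Must stay sorted by (op0, op1, crn, crm, op2) for bsearch() to work. */
static const struct sketch_desc sketch_table[] = {
	{ { 3, 0, 4, 6, 0 }, "REG_A" },
	{ { 3, 0, 12, 12, 4 }, "REG_B" },
};
```

A caller then gets NULL both for an undecodable ID and for a register missing from the table, which matches how the two callers in the diff map both cases to -ENOENT.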

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
Reviewed-by: Andre Przywara <andre.przyw...@arm.com>
---
 arch/arm64/kvm/sys_regs.c | 22 +++---
 arch/arm64/kvm/sys_regs.h |  4 
 2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 5001cc8..d7ac611 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1276,6 +1276,17 @@ static bool index_to_params(u64 id, struct sys_reg_params *params)
}
 }
 
+const struct sys_reg_desc *find_reg_by_id(u64 id,
+ struct sys_reg_params *params,
+ const struct sys_reg_desc table[],
+ unsigned int num)
+{
+   if (!index_to_params(id, params))
+   return NULL;
+
+   return find_reg(params, table, num);
+}
+
 /* Decode an index value, and find the sys_reg_desc entry. */
 static const struct sys_reg_desc *index_to_sys_reg_desc(struct kvm_vcpu *vcpu,
u64 id)
@@ -1403,10 +1414,8 @@ static int get_invariant_sys_reg(u64 id, void __user *uaddr)
struct sys_reg_params params;
const struct sys_reg_desc *r;
 
-   if (!index_to_params(id, &params))
-   return -ENOENT;
-
-   r = find_reg(&params, invariant_sys_regs, ARRAY_SIZE(invariant_sys_regs));
+   r = find_reg_by_id(id, &params, invariant_sys_regs,
+  ARRAY_SIZE(invariant_sys_regs));
if (!r)
return -ENOENT;
 
@@ -1420,9 +1429,8 @@ static int set_invariant_sys_reg(u64 id, void __user *uaddr)
int err;
u64 val = 0; /* Make sure high bits are 0 for 32-bit regs */
 
-   if (!index_to_params(id, &params))
-   return -ENOENT;
-   r = find_reg(&params, invariant_sys_regs, ARRAY_SIZE(invariant_sys_regs));
+   r = find_reg_by_id(id, &params, invariant_sys_regs,
+  ARRAY_SIZE(invariant_sys_regs));
if (!r)
return -ENOENT;
 
diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
index 3267518..0646108 100644
--- a/arch/arm64/kvm/sys_regs.h
+++ b/arch/arm64/kvm/sys_regs.h
@@ -136,6 +136,10 @@ static inline int cmp_sys_reg(const struct sys_reg_desc *i1,
return i1->Op2 - i2->Op2;
 }
 
+const struct sys_reg_desc *find_reg_by_id(u64 id,
+ struct sys_reg_params *params,
+ const struct sys_reg_desc table[],
+ unsigned int num);
 
 #define Op0(_x).Op0 = _x
 #define Op1(_x).Op1 = _x
-- 
2.4.4



[PATCH v6 4/7] KVM: arm64: Implement vGICv3 distributor and redistributor access from userspace

2015-11-24 Thread Pavel Fedin
The access is done similarly to vGICv2, using
KVM_DEV_ARM_VGIC_GRP_DIST_REGS and KVM_DEV_ARM_VGIC_GRP_REDIST_REGS
with the KVM_SET_DEVICE_ATTR and KVM_GET_DEVICE_ATTR ioctls.

The access size for vGICv3 is 64 bits, and vgic_attr_regs_access() is
fixed to support this. The trick with vgic_v3_get_reg_size() is necessary
because most GICv3 registers are actually 32-bit, and their accessors
do not distinguish between lower and upper words (offset & 3). Accessing
these registers with len == 8 would cause rollover: a write would
overwrite the lower word with the upper one (which would normally
be 0), and a read would duplicate the same word in both halves.
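
The rollover on reads can be reproduced with a toy model. This is only a sketch under a loud assumption: sketch_read32_handler() is a hypothetical 32-bit accessor that picks each byte by (i & 3), which is my simplified reading of "do not distinguish between lower and upper words"; the real handlers are more involved. The distributor offsets are the GICD_IROUTER and GICD_IROUTER1019 values used by this patch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_GICD_IROUTER     0x6000
#define SKETCH_GICD_IROUTER1019 0x7FD8

/* Size selection for distributor registers: only GICD_IROUTERn is 64-bit. */
static uint32_t sketch_dist_reg_size(uint32_t offset)
{
	if (offset >= SKETCH_GICD_IROUTER && offset <= SKETCH_GICD_IROUTER1019)
		return 8;
	return 4;
}

/*
 * Toy 32-bit read accessor: byte i of the result comes from register
 * byte (i & 3), so bytes 4..7 wrap around to bytes 0..3.
 */
static uint64_t sketch_read32_handler(uint32_t reg, uint32_t len)
{
	uint64_t out = 0;
	uint32_t i;

	assert(len <= 8);
	for (i = 0; i < len; i++)
		out |= (uint64_t)((reg >> (8 * (i & 3))) & 0xff) << (8 * i);
	return out;
}
```

Reading a 32-bit register with len == 8 in this model yields the same word in both halves, while len == 4 returns the expected value. That is the reason for selecting the length per register instead of always using 8.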

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 arch/arm64/include/uapi/asm/kvm.h  |   1 +
 include/linux/irqchip/arm-gic-v3.h |   1 +
 virt/kvm/arm/vgic-v3-emul.c| 112 -
 virt/kvm/arm/vgic.c|   4 +-
 4 files changed, 102 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index 2d4ca4b..98bd047 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -203,6 +203,7 @@ struct kvm_arch_memory_slot {
 #define KVM_DEV_ARM_VGIC_GRP_NR_IRQS   3
 #define KVM_DEV_ARM_VGIC_GRP_CTRL  4
 #define   KVM_DEV_ARM_VGIC_CTRL_INIT   0
+#define KVM_DEV_ARM_VGIC_GRP_REDIST_REGS 5
 
 /* KVM_IRQ_LINE irq field index values */
 #define KVM_ARM_IRQ_TYPE_SHIFT 24
diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
index 95388a7..7e9f9d5 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -43,6 +43,7 @@
 #define GICD_IGRPMODR  0x0D00
 #define GICD_NSACR 0x0E00
 #define GICD_IROUTER   0x6000
+#define GICD_IROUTER1019   0x7FD8
 #define GICD_IDREGS0xFFD0
 #define GICD_PIDR2 0xFFE8
 
diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c
index e661e7f..d9d644c 100644
--- a/virt/kvm/arm/vgic-v3-emul.c
+++ b/virt/kvm/arm/vgic-v3-emul.c
@@ -39,6 +39,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -990,6 +991,77 @@ void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg)
vgic_kick_vcpus(vcpu->kvm);
 }
 
+static u32 vgic_v3_get_reg_size(u32 group, u32 offset)
+{
+   switch (group) {
+   case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
+   if (offset >= GICD_IROUTER && offset <= GICD_IROUTER1019)
+   return 8;
+   else
+   return 4;
+   break;
+
+   case KVM_DEV_ARM_VGIC_GRP_REDIST_REGS:
+   if ((offset == GICR_TYPER) ||
+   (offset >= GICR_SETLPIR && offset <= GICR_INVALLR))
+   return 8;
+   else
+   return 4;
+   break;
+
+   default:
+   BUG();
+   }
+}
+
+static int vgic_v3_attr_regs_access(struct kvm_device *dev,
+   struct kvm_device_attr *attr,
+   u64 *reg, bool is_write)
+{
+   const struct vgic_io_range *ranges;
+   phys_addr_t offset;
+   struct kvm_vcpu *vcpu;
+   u64 cpuid;
+   struct vgic_dist *vgic = &dev->kvm->arch.vgic;
+   struct kvm_exit_mmio mmio;
+   __le64 data;
+   int ret;
+
+   offset = attr->attr & KVM_DEV_ARM_VGIC_OFFSET_MASK;
+   cpuid = attr->attr >> KVM_DEV_ARM_VGIC_CPUID_SHIFT;
+
+   /* Convert affinity ID from our packed to normal form */
+   cpuid = (cpuid & 0x00ffffff) | ((cpuid & 0xff000000) << 8);
+   vcpu = kvm_mpidr_to_vcpu(dev->kvm, cpuid);
+   if (!vcpu)
+   return -EINVAL;
+
+   switch (attr->group) {
+   case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
+   mmio.phys_addr = vgic->vgic_dist_base + offset;
+   ranges = vgic_v3_dist_ranges;
+   break;
+   case KVM_DEV_ARM_VGIC_GRP_REDIST_REGS:
+   mmio.phys_addr = vgic->vgic_redist_base + offset;
+   ranges = vgic_redist_ranges;
+   break;
+   default:
+   return -ENXIO;
+   }
+
+   data = cpu_to_le64(*reg);
+
+   mmio.len = vgic_v3_get_reg_size(attr->group, offset);
+   mmio.is_write = is_write;
+   mmio.data = &data;
+   mmio.private = vcpu; /* Redistributor handlers expect this */
+
+   ret = vgic_attr_regs_access(vcpu, ranges, &mmio, offset);
+
+   *reg = le64_to_cpu(data);
+   return ret;
+}
+
 static int vgic_v3_create(struct kvm_device *dev, u32 type)
 {
return kvm_vgic_create(dev->kvm, type);
@@ -1003,42 +1075,45 @@ static void vgic_v3_destroy(struct kvm_device *dev)
 static int vgic_v3_set_attr(struct kvm_device *dev,
  

[PATCH v6 0/7] KVM: arm64: Implement API for vGICv3 live migration

2015-11-24 Thread Pavel Fedin
This patchset adds necessary userspace API in order to support vGICv3 live
migration. GICv3 registers are accessed using device attribute ioctls,
similar to GICv2.

v5 => v6:
- Rebased on top of linux-next of 23.11.2015
- Use original API documentation patch, with minor changes only.
- Quit reusing KVM_DEV_ARM_VGIC_CPUID_MASK, do not touch vGICv2 API at all.
- Fixed some issues reported by the new checkpatch

v4 => v5:
- Adapted to new API by Peter Maydell, Marc Zyngier and Christoffer Dall.
  Acked-by's on the documentation were dropped, just in case, because I
  slightly adjusted it. Additionally, I merged all doc updates into one
  patch.

v3 => v4:
- Split pure refactoring from anything else
- Documentation brought up to date
- Cleaned up 'mmio' structure usage in vgic_attr_regs_access(),
  use call_range_handler() for 64-bit access handling
- Rebased on new linux-next

v2 => v3:
- KVM_DEV_ARM_VGIC_CPUID_MASK enlarged to 20 bits, allowing more than 256
  CPUs.
- Bug fix: Correctly set mmio->private, necessary for redistributor access.
- Added accessors for ICC_AP0R and ICC_AP1R registers
- Rebased on new linux-next

v1 => v2:
- Do not use generic register get/set API for CPU interface, use only
  device attributes.
- Introduce size specifier for distributor and redistributor register
  accesses, do not assume size any more.
- Lots of refactoring and extraction of reusable code.
- Added forgotten documentation

Christoffer Dall (1):
  KVM: arm/arm64: Add VGICv3 save/restore API documentation

Pavel Fedin (6):
  KVM: arm/arm64: Move endianness conversion out of
vgic_attr_regs_access()
  KVM: arm/arm64: Refactor vGIC attributes handling code
  KVM: arm64: Implement vGICv3 distributor and redistributor access from
userspace
  KVM: arm64: Refactor system register handlers
  KVM: arm64: Introduce find_reg_by_id()
  KVM: arm64: Implement vGICv3 CPU interface access

 Documentation/virtual/kvm/devices/arm-vgic-v3.txt | 116 
 Documentation/virtual/kvm/devices/arm-vgic.txt|  21 +-
 arch/arm64/include/uapi/asm/kvm.h |  15 +-
 arch/arm64/kvm/sys_regs.c |  83 +++---
 arch/arm64/kvm/sys_regs.h |   8 +-
 arch/arm64/kvm/sys_regs_generic_v8.c  |   2 +-
 include/linux/irqchip/arm-gic-v3.h|  19 +-
 virt/kvm/arm/vgic-v2-emul.c   | 124 ++--
 virt/kvm/arm/vgic-v3-emul.c   | 342 +-
 virt/kvm/arm/vgic.c   |  57 
 virt/kvm/arm/vgic.h   |   3 +
 11 files changed, 616 insertions(+), 174 deletions(-)
 create mode 100644 Documentation/virtual/kvm/devices/arm-vgic-v3.txt

-- 
2.4.4



[PATCH v6 1/7] KVM: arm/arm64: Add VGICv3 save/restore API documentation

2015-11-24 Thread Pavel Fedin
From: Christoffer Dall <christoffer.d...@linaro.org>

Factor out the GICv3-specific documentation into a separate
documentation file.  Add description for how to access distributor,
redistributor, and CPU interface registers for GICv3 in this new file.
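
As an illustration of the attr layout this documentation describes for the distributor and redistributor groups (packed mpidr in bits 63..32, register offset in bits 31..0), the value could be built roughly as follows. This is a sketch, not part of the patch: the sketch_* helpers are hypothetical, and the input is assumed to be an MPIDR_EL1-style value with Aff0 in bits 7..0, Aff1 in 15..8, Aff2 in 23..16 and Aff3 in 39..32.

```c
#include <assert.h>
#include <stdint.h>

/* Pack Aff3..Aff0 of an MPIDR value into 32 contiguous bits. */
static uint32_t sketch_pack_affinity(uint64_t mpidr)
{
	return (uint32_t)((mpidr & 0x00ffffffULL) |
			  ((mpidr >> 8) & 0xff000000ULL));
}

/* Build kvm_device_attr.attr for a (re)distributor register access. */
static uint64_t sketch_regs_attr(uint64_t mpidr, uint32_t offset)
{
	return ((uint64_t)sketch_pack_affinity(mpidr) << 32) | offset;
}
```

For the distributor group the mpidr part is ignored by the kernel according to the text in this patch, so passing 0 there should be fine.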

Acked-by: Peter Maydell <peter.mayd...@linaro.org>
Acked-by: Marc Zyngier <marc.zyng...@arm.com>
Signed-off-by: Christoffer Dall <christoffer.d...@linaro.org>
Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 Documentation/virtual/kvm/devices/arm-vgic-v3.txt | 116 ++
 Documentation/virtual/kvm/devices/arm-vgic.txt|  21 +---
 2 files changed, 120 insertions(+), 17 deletions(-)
 create mode 100644 Documentation/virtual/kvm/devices/arm-vgic-v3.txt

diff --git a/Documentation/virtual/kvm/devices/arm-vgic-v3.txt b/Documentation/virtual/kvm/devices/arm-vgic-v3.txt
new file mode 100644
index 000..24e2f6b
--- /dev/null
+++ b/Documentation/virtual/kvm/devices/arm-vgic-v3.txt
@@ -0,0 +1,116 @@
+ARM Virtual Generic Interrupt Controller v3 and later (VGICv3)
+==============================================================
+
+
+Device types supported:
+  KVM_DEV_TYPE_ARM_VGIC_V3 ARM Generic Interrupt Controller v3.0
+
+Only one VGIC instance may be instantiated through this API.  The created VGIC
+will act as the VM interrupt controller, requiring emulated user-space devices
+to inject interrupts to the VGIC instead of directly to CPUs.  It is not
+possible to create both a GICv3 and GICv2 on the same VM.
+
+Creating a guest GICv3 device requires a host GICv3 as well.
+
+Groups:
+  KVM_DEV_ARM_VGIC_GRP_ADDR
+  Attributes:
+KVM_VGIC_V3_ADDR_TYPE_DIST (rw, 64-bit)
+  Base address in the guest physical address space of the GICv3 distributor
+  register mappings. Only valid for KVM_DEV_TYPE_ARM_VGIC_V3.
+  This address needs to be 64K aligned and the region covers 64 KByte.
+
+KVM_VGIC_V3_ADDR_TYPE_REDIST (rw, 64-bit)
+  Base address in the guest physical address space of the GICv3
+  redistributor register mappings. There are two 64K pages for each
+  VCPU and all of the redistributor pages are contiguous.
+  Only valid for KVM_DEV_TYPE_ARM_VGIC_V3.
+  This address needs to be 64K aligned.
+
+
+  KVM_DEV_ARM_VGIC_GRP_DIST_REGS
+  KVM_DEV_ARM_VGIC_GRP_REDIST_REGS
+  Attributes:
+The attr field of kvm_device_attr encodes two values:
+bits:     | 63   ....   32 | 31   ....    0 |
+values:   |     mpidr      |     offset     |
+
+All distributor regs are (rw, 64-bit).
+
+KVM_DEV_ARM_VGIC_GRP_DIST_REGS accesses the main distributor registers.
+KVM_DEV_ARM_VGIC_GRP_REDIST_REGS accesses the redistributor of the CPU
+specified by the mpidr.
+
+The offset is relative to the "[Re]Distributor base address" as defined
+in the GICv3/4 specs.  Getting or setting such a register has the same
+effect as reading or writing the register on real hardware, and the mpidr
+field is used to specify which redistributor is accessed.  The mpidr is
+ignored for the distributor.
+
+The mpidr encoding is based on the affinity information in the
+architecture defined MPIDR, and the field is encoded as follows:
+  | 63 .... 56 | 55 .... 48 | 47 .... 40 | 39 .... 32 |
+  |    Aff3    |    Aff2    |    Aff1    |    Aff0    |
+
+Note that distributor fields are not banked, but return the same value
+regardless of the mpidr used to access the register.
+  Limitations:
+- Priorities are not implemented, and registers are RAZ/WI
+  Errors:
+-ENXIO: Getting or setting this register is not yet supported
+-EBUSY: One or more VCPUs are running
+
+
+  KVM_DEV_ARM_VGIC_CPU_SYSREGS
+  Attributes:
+The attr field of kvm_device_attr encodes two values:
+bits:     | 63   ....   32 | 31  ....  16 | 15  ....  0 |
+values:   |     mpidr      |     RES      |    instr    |
+
+The mpidr field encodes the CPU ID based on the affinity information in the
+architecture defined MPIDR, and the field is encoded as follows:
+  | 63 .... 56 | 55 .... 48 | 47 .... 40 | 39 .... 32 |
+  |    Aff3    |    Aff2    |    Aff1    |    Aff0    |
+KVM_DEV_ARM_VGIC_SYSREG() macro is provided for building register ID.
+
+The instr field encodes the system register to access based on the fields
+defined in the A64 instruction set encoding for system register access
+(RES means the bits are reserved for future use and should be zero):
+
+  | 15 ... 14 | 13 ... 11 | 10 ... 7 | 6 ... 3 | 2 ... 0 |
+  |   Op 0    |    Op1    |    CRn   |   CRm   |   Op2   |
+
+All system regs accessed through this API are (rw, 64-bit).
+
+KVM_DEV_ARM_VGIC_CPU_SYSREGS accesses the CPU interface registers for the
+CPU specified by the mpidr field.
+
+
+  Limitations:
+- Priorities are not implemented, and registers are RAZ/WI
+  Errors:
+-ENXIO: Getting or setting this registe

[PATCH v6 5/7] KVM: arm64: Refactor system register handlers

2015-11-24 Thread Pavel Fedin
Replace Rt with a data pointer in struct sys_reg_params. This allows
reusing the system register handling code in the implementation of the
vGICv3 CPU interface access API. Additionally, get rid of the "massive
hack" in kvm_handle_cp_64().
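
The benefit of carrying a pointer instead of a register index can be shown with a toy handler. This is a minimal sketch with hypothetical names: sketch_sys_reg_params keeps only the two fields that matter here, and the "guest registers" and the emulated register are plain variables. The same handler serves a trap path, which points val at a guest register slot, and a userspace path, which points it at a caller-supplied buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for struct sys_reg_params after the refactoring. */
struct sketch_sys_reg_params {
	uint64_t *val;	/* where the handler reads or writes the value */
	bool is_write;
};

static uint64_t sketch_sys_reg;		/* toy emulated system register */
static uint64_t sketch_gprs[31];	/* toy guest general purpose registers */

/* One handler, unaware of where the data pointer comes from. */
static bool sketch_access_reg(const struct sketch_sys_reg_params *p)
{
	if (p->is_write)
		sketch_sys_reg = *p->val;
	else
		*p->val = sketch_sys_reg;
	return true;
}

/* Trap path: the value lives in guest register Rt. */
static bool sketch_handle_trap(unsigned int rt, bool is_write)
{
	struct sketch_sys_reg_params p;

	assert(rt < 31);
	p.val = &sketch_gprs[rt];
	p.is_write = is_write;
	return sketch_access_reg(&p);
}

/* Userspace path: the value lives in a buffer supplied by the caller. */
static bool sketch_handle_user(uint64_t *user_val, bool is_write)
{
	struct sketch_sys_reg_params p = { .val = user_val, .is_write = is_write };

	return sketch_access_reg(&p);
}
```

With an Rt index in the params, the userspace path would first have to stage the value in a guest register; with a pointer it does not need to touch guest state at all.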

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 arch/arm64/kvm/sys_regs.c| 61 +---
 arch/arm64/kvm/sys_regs.h|  4 +--
 arch/arm64/kvm/sys_regs_generic_v8.c |  2 +-
 3 files changed, 32 insertions(+), 35 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 87a64e8..5001cc8 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -102,7 +102,7 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu,
 
BUG_ON(!p->is_write);
 
-   val = *vcpu_reg(vcpu, p->Rt);
+   val = *p->val;
if (!p->is_aarch32) {
vcpu_sys_reg(vcpu, r->reg) = val;
} else {
@@ -125,13 +125,10 @@ static bool access_gic_sgi(struct kvm_vcpu *vcpu,
   const struct sys_reg_params *p,
   const struct sys_reg_desc *r)
 {
-   u64 val;
-
if (!p->is_write)
return read_from_write_only(vcpu, p);
 
-   val = *vcpu_reg(vcpu, p->Rt);
-   vgic_v3_dispatch_sgi(vcpu, val);
+   vgic_v3_dispatch_sgi(vcpu, *p->val);
 
return true;
 }
@@ -153,7 +150,7 @@ static bool trap_oslsr_el1(struct kvm_vcpu *vcpu,
if (p->is_write) {
return ignore_write(vcpu, p);
} else {
-   *vcpu_reg(vcpu, p->Rt) = (1 << 3);
+   *p->val = (1 << 3);
return true;
}
 }
@@ -167,7 +164,7 @@ static bool trap_dbgauthstatus_el1(struct kvm_vcpu *vcpu,
} else {
u32 val;
asm volatile("mrs %0, dbgauthstatus_el1" : "=r" (val));
-   *vcpu_reg(vcpu, p->Rt) = val;
+   *p->val = val;
return true;
}
 }
@@ -204,13 +201,13 @@ static bool trap_debug_regs(struct kvm_vcpu *vcpu,
const struct sys_reg_desc *r)
 {
if (p->is_write) {
-   vcpu_sys_reg(vcpu, r->reg) = *vcpu_reg(vcpu, p->Rt);
+   vcpu_sys_reg(vcpu, r->reg) = *p->val;
vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
} else {
-   *vcpu_reg(vcpu, p->Rt) = vcpu_sys_reg(vcpu, r->reg);
+   *p->val = vcpu_sys_reg(vcpu, r->reg);
}
 
-   trace_trap_reg(__func__, r->reg, p->is_write, *vcpu_reg(vcpu, p->Rt));
+   trace_trap_reg(__func__, r->reg, p->is_write, *p->val);
 
return true;
 }
@@ -228,7 +225,7 @@ static inline void reg_to_dbg(struct kvm_vcpu *vcpu,
  const struct sys_reg_params *p,
  u64 *dbg_reg)
 {
-   u64 val = *vcpu_reg(vcpu, p->Rt);
+   u64 val = *p->val;
 
if (p->is_32bit) {
val &= 0xffffffffUL;
@@ -248,7 +245,7 @@ static inline void dbg_to_reg(struct kvm_vcpu *vcpu,
if (p->is_32bit)
val &= 0xffffffffUL;
 
-   *vcpu_reg(vcpu, p->Rt) = val;
+   *p->val = val;
 }
 
 static inline bool trap_bvr(struct kvm_vcpu *vcpu,
@@ -697,10 +694,10 @@ static bool trap_dbgidr(struct kvm_vcpu *vcpu,
u64 pfr = read_system_reg(SYS_ID_AA64PFR0_EL1);
u32 el3 = !!cpuid_feature_extract_field(pfr, ID_AA64PFR0_EL3_SHIFT);
 
-   *vcpu_reg(vcpu, p->Rt) = ((((dfr >> ID_AA64DFR0_WRPS_SHIFT) & 0xf) << 28) |
- (((dfr >> ID_AA64DFR0_BRPS_SHIFT) & 0xf) << 24) |
- (((dfr >> ID_AA64DFR0_CTX_CMPS_SHIFT) & 0xf) << 20) |
- (6 << 16) | (el3 << 14) | (el3 << 12));
+   *p->val = ((((dfr >> ID_AA64DFR0_WRPS_SHIFT) & 0xf) << 28) |
+  (((dfr >> ID_AA64DFR0_BRPS_SHIFT) & 0xf) << 24) |
+  (((dfr >> ID_AA64DFR0_CTX_CMPS_SHIFT) & 0xf) << 20) |
+  (6 << 16) | (el3 << 14) | (el3 << 12));
return true;
}
 }
@@ -710,10 +707,10 @@ static bool trap_debug32(struct kvm_vcpu *vcpu,
 const struct sys_reg_desc *r)
 {
if (p->is_write) {
-   vcpu_cp14(vcpu, r->reg) = *vcpu_reg(vcpu, p->Rt);
+   vcpu_cp14(vcpu, r->reg) = *p->val;
vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
} else {
-   *vcpu_reg(vcpu, p->Rt) = vcpu_cp14(vcpu, r->reg);
+   *p->val = vcpu_cp14(vcpu, r->reg);
}
 
return true;
@@ -740

[PATCH v6 3/7] KVM: arm/arm64: Refactor vGIC attributes handling code

2015-11-24 Thread Pavel Fedin
Separate all implementation-independent code in vgic_attr_regs_access()
and move it to vgic.c. This allows reusing this code for the vGICv3
implementation.

The vcpu lookup is left where it originally was, because the vGICv3 API
will expect an affinity ID instead of a vCPU index, so it will be done
differently. Also, the vcpu pointer has a backpointer to kvm, so 'dev' was
replaced with 'vcpu'.

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 virt/kvm/arm/vgic-v2-emul.c | 120 +++-
 virt/kvm/arm/vgic.c |  57 +
 virt/kvm/arm/vgic.h |   3 ++
 3 files changed, 88 insertions(+), 92 deletions(-)

diff --git a/virt/kvm/arm/vgic-v2-emul.c b/virt/kvm/arm/vgic-v2-emul.c
index 959b9c6..8e769c6 100644
--- a/virt/kvm/arm/vgic-v2-emul.c
+++ b/virt/kvm/arm/vgic-v2-emul.c
@@ -661,38 +661,24 @@ static const struct vgic_io_range vgic_cpu_ranges[] = {
},
 };
 
-static int vgic_attr_regs_access(struct kvm_device *dev,
-struct kvm_device_attr *attr,
-__le32 *data, bool is_write)
+static int vgic_v2_attr_regs_access(struct kvm_device *dev,
+   struct kvm_device_attr *attr,
+   __le32 *data, bool is_write)
 {
-   const struct vgic_io_range *r = NULL, *ranges;
+   const struct vgic_io_range *ranges;
phys_addr_t offset;
-   int ret, cpuid, c;
-   struct kvm_vcpu *vcpu, *tmp_vcpu;
-   struct vgic_dist *vgic;
+   struct kvm_vcpu *vcpu;
+   int cpuid;
+   struct vgic_dist *vgic = &dev->kvm->arch.vgic;
struct kvm_exit_mmio mmio;
 
offset = attr->attr & KVM_DEV_ARM_VGIC_OFFSET_MASK;
cpuid = (attr->attr & KVM_DEV_ARM_VGIC_CPUID_MASK) >>
KVM_DEV_ARM_VGIC_CPUID_SHIFT;
 
-   mutex_lock(&dev->kvm->lock);
-
-   ret = vgic_init(dev->kvm);
-   if (ret)
-   goto out;
-
-   if (cpuid >= atomic_read(&dev->kvm->online_vcpus)) {
-   ret = -EINVAL;
-   goto out;
-   }
+   if (cpuid >= atomic_read(&dev->kvm->online_vcpus))
+   return -EINVAL;
 
-   vcpu = kvm_get_vcpu(dev->kvm, cpuid);
-   vgic = &dev->kvm->arch.vgic;
-
-   mmio.len = 4;
-   mmio.is_write = is_write;
-   mmio.data = data;
switch (attr->group) {
case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
mmio.phys_addr = vgic->vgic_dist_base + offset;
@@ -703,49 +689,16 @@ static int vgic_attr_regs_access(struct kvm_device *dev,
ranges = vgic_cpu_ranges;
break;
default:
-   BUG();
+   return -ENXIO;
}
-   r = vgic_find_range(ranges, 4, offset);
 
-   if (unlikely(!r || !r->handle_mmio)) {
-   ret = -ENXIO;
-   goto out;
-   }
-
-
-   spin_lock(&vgic->lock);
-
-   /*
-* Ensure that no other VCPU is running by checking the vcpu->cpu
-* field.  If no other VPCUs are running we can safely access the VGIC
-* state, because even if another VPU is run after this point, that
-* VCPU will not touch the vgic state, because it will block on
-* getting the vgic->lock in kvm_vgic_sync_hwstate().
-*/
-   kvm_for_each_vcpu(c, tmp_vcpu, dev->kvm) {
-   if (unlikely(tmp_vcpu->cpu != -1)) {
-   ret = -EBUSY;
-   goto out_vgic_unlock;
-   }
-   }
-
-   /*
-* Move all pending IRQs from the LRs on all VCPUs so the pending
-* state can be properly represented in the register state accessible
-* through this API.
-*/
-   kvm_for_each_vcpu(c, tmp_vcpu, dev->kvm)
-   vgic_unqueue_irqs(tmp_vcpu);
+   vcpu = kvm_get_vcpu(dev->kvm, cpuid);
 
-   offset -= r->base;
-   r->handle_mmio(vcpu, &mmio, offset);
+   mmio.len = 4;
+   mmio.is_write = is_write;
+   mmio.data = data;
 
-   ret = 0;
-out_vgic_unlock:
-   spin_unlock(&vgic->lock);
-out:
-   mutex_unlock(&dev->kvm->lock);
-   return ret;
+   return vgic_attr_regs_access(vcpu, ranges, &mmio, offset);
 }
 
 static int vgic_v2_create(struct kvm_device *dev, u32 type)
@@ -761,55 +714,38 @@ static void vgic_v2_destroy(struct kvm_device *dev)
 static int vgic_v2_set_attr(struct kvm_device *dev,
struct kvm_device_attr *attr)
 {
+   u32 __user *uaddr = (u32 __user *)(long)attr->addr;
+   u32 reg;
+   __le32 data;
int ret;
 
ret = vgic_set_common_attr(dev, attr);
if (ret != -ENXIO)
return ret;
 
-   switch (attr->group) {
-   case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
-   case KVM_DEV_ARM_VGIC_GRP_CPU_REGS: {
-   u32 __user *uaddr = (u32 __user *)(long)attr->addr;
- 

[PATCH v2 1/3] vfio: Introduce map and unmap operations

2015-11-24 Thread Pavel Fedin
These new functions allow direct mapping and unmapping of addresses on the
given IOMMU. They will be used for mapping MSI hardware.

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 drivers/vfio/vfio_iommu_type1.c | 29 +
 include/linux/vfio.h|  4 +++-
 2 files changed, 32 insertions(+), 1 deletion(-)
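
As a rough illustration only (not part of this patch), the way a caller
might consume optional map/unmap callbacks can be sketched in plain C.
The names iommu_ops_sketch, sketch_map, demo_map and SKETCH_PAGE_SHIFT
are hypothetical stand-ins, and a 4 KiB page size is assumed. The sketch
shows two points taken from this series: a missing .map callback is
reported as -EINVAL, and a page count is converted to bytes with a shift.

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical, simplified stand-in for struct vfio_iommu_driver_ops. */
struct iommu_ops_sketch {
	int  (*map)(void *data, unsigned long iova, unsigned long paddr,
		    long npage, int prot);
	void (*unmap)(void *data, unsigned long iova, long npage);
};

/* Convert a page count to a byte length, as the unmap path does. */
static size_t npage_to_bytes(long npage)
{
	return (size_t)npage << SKETCH_PAGE_SHIFT;
}

/* Call the backend's map callback, or fail if the backend has none. */
static int sketch_map(const struct iommu_ops_sketch *ops, void *data,
		      unsigned long iova, unsigned long paddr,
		      long npage, int prot)
{
	if (!ops->map)
		return -EINVAL;
	return ops->map(data, iova, paddr, npage, prot);
}

/* Trivial backend: records the mapped length in *data and succeeds. */
static int demo_map(void *data, unsigned long iova, unsigned long paddr,
		    long npage, int prot)
{
	(void)iova;
	(void)paddr;
	(void)prot;
	*(size_t *)data = npage_to_bytes(npage);
	return 0;
}
```

Keeping the callbacks optional means backends other than type1 do not
have to implement them; callers only need the NULL check shown above.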

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 59d47cb..17506eb 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -558,6 +558,33 @@ unwind:
return ret;
 }
 
+static int vfio_iommu_type1_map(void *iommu_data, dma_addr_t iova,
+ phys_addr_t paddr, long npage, int prot)
+{
+   struct vfio_iommu *iommu = iommu_data;
+   int ret;
+
+   mutex_lock(&iommu->lock);
+   ret = vfio_iommu_map(iommu, iova, paddr >> PAGE_SHIFT, npage, prot);
+   mutex_unlock(&iommu->lock);
+
+   return ret;
+}
+
+static void vfio_iommu_type1_unmap(void *iommu_data, dma_addr_t iova,
+  long npage)
+{
+   struct vfio_iommu *iommu = iommu_data;
+   struct vfio_domain *d;
+
+   mutex_lock(&iommu->lock);
+
+   list_for_each_entry_reverse(d, &iommu->domain_list, next)
+   iommu_unmap(d->domain, iova, npage << PAGE_SHIFT);
+
+   mutex_unlock(&iommu->lock);
+}
+
 static int vfio_dma_do_map(struct vfio_iommu *iommu,
   struct vfio_iommu_type1_dma_map *map)
 {
@@ -1046,6 +1073,8 @@ static const struct vfio_iommu_driver_ops vfio_iommu_driver_ops_type1 = {
.ioctl  = vfio_iommu_type1_ioctl,
.attach_group   = vfio_iommu_type1_attach_group,
.detach_group   = vfio_iommu_type1_detach_group,
+   .map= vfio_iommu_type1_map,
+   .unmap  = vfio_iommu_type1_unmap,
 };
 
 static int __init vfio_iommu_type1_init(void)
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 610a86a..061038a 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -75,7 +75,9 @@ struct vfio_iommu_driver_ops {
struct iommu_group *group);
void(*detach_group)(void *iommu_data,
struct iommu_group *group);
-
+   int (*map)(void *iommu_data, dma_addr_t iova,
+  phys_addr_t paddr, long npage, int prot);
+   void(*unmap)(void *iommu_data, dma_addr_t iova, long npage);
 };
 
 extern int vfio_register_iommu_driver(const struct vfio_iommu_driver_ops *ops);
-- 
2.4.4



[PATCH v2 0/3] Introduce MSI hardware mapping for VFIO

2015-11-24 Thread Pavel Fedin
On some architectures (e.g. ARM64), if a device is behind an IOMMU and
is being mapped by VFIO, it is also necessary to add mappings for the MSI
translation register for interrupts to work. This series implements the
necessary API to do this, and makes use of this API for GICv3 ITS on
ARM64.

v1 => v2:
- Added dependency on CONFIG_GENERIC_MSI_IRQ_DOMAIN in some parts of the
  code; this should fix the build without this option

Pavel Fedin (3):
  vfio: Introduce map and unmap operations
  gicv3, its: Introduce VFIO map and unmap operations
  vfio: Introduce generic MSI mapping operations

 drivers/irqchip/irq-gic-v3-its.c   |  31 ++
 drivers/vfio/pci/vfio_pci_intrs.c  |  11 
 drivers/vfio/vfio.c| 116 +
 drivers/vfio/vfio_iommu_type1.c|  29 ++
 include/linux/irqchip/arm-gic-v3.h |   2 +
 include/linux/msi.h|  12 
 include/linux/vfio.h   |  17 +-
 7 files changed, 217 insertions(+), 1 deletion(-)

-- 
2.4.4



[PATCH v2 2/3] gicv3, its: Introduce VFIO map and unmap operations

2015-11-24 Thread Pavel Fedin
These new functions use the supplied IOMMU in order to map and unmap MSI
translation register(s).

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 drivers/irqchip/irq-gic-v3-its.c   | 31 +++
 include/linux/irqchip/arm-gic-v3.h |  2 ++
 include/linux/msi.h| 12 
 3 files changed, 45 insertions(+)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index e23d1d1..b97dfd7 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -1257,8 +1258,38 @@ out:
return 0;
 }
 
+#if IS_ENABLED(CONFIG_VFIO)
+
+static int its_vfio_map(struct irq_domain *domain,
+   const struct vfio_iommu_driver_ops *ops,
+   void *iommu_data)
+{
+   struct msi_domain_info *msi_info = msi_get_domain_info(domain);
+   struct its_node *its = msi_info->data;
+   u64 addr = its->phys_base + GIC_V3_ITS_CONTROL_SIZE;
+
+   return ops->map(iommu_data, addr, addr, 1, IOMMU_READ|IOMMU_WRITE);
+}
+
+static void its_vfio_unmap(struct irq_domain *domain,
+  const struct vfio_iommu_driver_ops *ops,
+  void *iommu_data)
+{
+   struct msi_domain_info *msi_info = msi_get_domain_info(domain);
+   struct its_node *its = msi_info->data;
+   u64 addr = its->phys_base + GIC_V3_ITS_CONTROL_SIZE;
+
+   ops->unmap(iommu_data, addr, 1);
+}
+
+#endif
+
 static struct msi_domain_ops its_msi_domain_ops = {
.msi_prepare= its_msi_prepare,
+#if IS_ENABLED(CONFIG_VFIO)
+   .vfio_map   = its_vfio_map,
+   .vfio_unmap = its_vfio_unmap,
+#endif
 };
 
 static int its_irq_gic_domain_alloc(struct irq_domain *domain,
diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
index bff3eee..dfd2bed 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -241,6 +241,8 @@
 #define GITS_BASER_TYPE_RESERVED6  6
 #define GITS_BASER_TYPE_RESERVED7  7
 
+#define GIC_V3_ITS_CONTROL_SIZE0x1
+
 /*
  * ITS commands
  */
diff --git a/include/linux/msi.h b/include/linux/msi.h
index f71a25e..48faea9 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -155,6 +155,8 @@ void arch_restore_msi_irqs(struct pci_dev *dev);
 void default_teardown_msi_irqs(struct pci_dev *dev);
 void default_restore_msi_irqs(struct pci_dev *dev);
 
+struct vfio_iommu_driver_ops;
+
 struct msi_controller {
struct module *owner;
struct device *dev;
@@ -189,6 +191,8 @@ struct msi_domain_info;
  * @msi_finish:Optional callbacl to finalize the allocation
  * @set_desc:  Set the msi descriptor for an interrupt
  * @handle_error:  Optional error handler if the allocation fails
+ * @vfio_map:  Map the MSI hardware for VFIO
+ * @vfio_unmap:Unmap the MSI hardware for VFIO
  *
  * @get_hwirq, @msi_init and @msi_free are callbacks used by
  * msi_create_irq_domain() and related interfaces
@@ -218,6 +222,14 @@ struct msi_domain_ops {
struct msi_desc *desc);
int (*handle_error)(struct irq_domain *domain,
struct msi_desc *desc, int error);
+#if IS_ENABLED(CONFIG_VFIO)
+   int (*vfio_map)(struct irq_domain *domain,
+   const struct vfio_iommu_driver_ops *ops,
+   void *iommu_data);
+   void(*vfio_unmap)(struct irq_domain *domain,
+ const struct vfio_iommu_driver_ops *ops,
+ void *iommu_data);
+#endif
 };
 
 /**
-- 
2.4.4



[PATCH v2 3/3] vfio: Introduce generic MSI mapping operations

2015-11-24 Thread Pavel Fedin
These operations are used in order to map and unmap MSI translation
registers for the device, allowing it to send MSIs to the host while being
mapped via IOMMU.

Usage of MSI controllers is tracked on a per-device basis using reference
counting. An MSI controller remains mapped as long as there's at least one
device referring to it using MSI.

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 drivers/vfio/pci/vfio_pci_intrs.c |  11 
 drivers/vfio/vfio.c   | 116 ++
 include/linux/vfio.h  |  13 +
 3 files changed, 140 insertions(+)
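
The reference counting described above can be sketched in user-space C
as follows. This is only an illustration of the idea, not the code from
this patch: msi_mapping, container_sketch, device_map_msi and
device_unmap_msi are hypothetical names, a plain int replaces struct
kref, and locking is left out. The counters maps_done and unmaps_done
stand in for the calls into the MSI controller's map and unmap hooks.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct vfio_msi: one entry per controller. */
struct msi_mapping {
	int refcount;
	const void *domain;		/* identifies the MSI controller */
	struct msi_mapping *next;
};

/* Hypothetical stand-in for the per-container MSI list. */
struct container_sketch {
	struct msi_mapping *msi_list;
	int maps_done;			/* hardware map operations issued */
	int unmaps_done;		/* hardware unmap operations issued */
};

/* Map the controller on first use, otherwise just take a reference. */
static int device_map_msi(struct container_sketch *c, const void *domain)
{
	struct msi_mapping *m;

	for (m = c->msi_list; m; m = m->next) {
		if (m->domain == domain) {
			m->refcount++;
			return 0;
		}
	}

	m = malloc(sizeof(*m));
	if (!m)
		return -1;

	m->refcount = 1;
	m->domain = domain;
	m->next = c->msi_list;
	c->msi_list = m;
	c->maps_done++;
	return 0;
}

/* Drop a reference; unmap only when the last user goes away. */
static void device_unmap_msi(struct container_sketch *c, const void *domain)
{
	struct msi_mapping **pp, *m;

	for (pp = &c->msi_list; (m = *pp) != NULL; pp = &m->next) {
		if (m->domain != domain)
			continue;
		if (--m->refcount == 0) {
			*pp = m->next;
			free(m);
			c->unmaps_done++;
		}
		return;
	}
}
```

With two devices sharing one controller, the first map call performs the
mapping and the second only bumps the count; the mapping is torn down on
the second unmap call.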

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 3b3ba15..3c8be59 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -259,12 +259,19 @@ static int vfio_msi_enable(struct vfio_pci_device *vdev, int nvec, bool msix)
if (!vdev->ctx)
return -ENOMEM;
 
+   ret = vfio_device_map_msi(>dev);
+   if (ret) {
+   kfree(vdev->ctx);
+   return ret;
+   }
+
if (msix) {
int i;
 
vdev->msix = kzalloc(nvec * sizeof(struct msix_entry),
 GFP_KERNEL);
if (!vdev->msix) {
+   vfio_device_unmap_msi(>dev);
kfree(vdev->ctx);
return -ENOMEM;
}
@@ -277,6 +284,7 @@ static int vfio_msi_enable(struct vfio_pci_device *vdev, int nvec, bool msix)
if (ret > 0)
pci_disable_msix(pdev);
kfree(vdev->msix);
+   vfio_device_unmap_msi(>dev);
kfree(vdev->ctx);
return ret;
}
@@ -285,6 +293,7 @@ static int vfio_msi_enable(struct vfio_pci_device *vdev, int nvec, bool msix)
if (ret < nvec) {
if (ret > 0)
pci_disable_msi(pdev);
+   vfio_device_unmap_msi(>dev);
kfree(vdev->ctx);
return ret;
}
@@ -413,6 +422,8 @@ static void vfio_msi_disable(struct vfio_pci_device *vdev, bool msix)
} else
pci_disable_msi(pdev);
 
+   vfio_device_unmap_msi(>dev);
+
vdev->irq_type = VFIO_PCI_NUM_IRQS;
vdev->num_ctx = 0;
kfree(vdev->ctx);
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index de632da..37d99f5 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -21,9 +21,11 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -63,6 +65,8 @@ struct vfio_container {
struct vfio_iommu_driver*iommu_driver;
void*iommu_data;
boolnoiommu;
+   struct list_headmsi_list;
+   struct mutexmsi_lock;
 };
 
 struct vfio_unbound_dev {
@@ -97,6 +101,13 @@ struct vfio_device {
void*device_data;
 };
 
+struct vfio_msi {
+   struct kref kref;
+   struct list_headmsi_next;
+   struct vfio_container   *container;
+   struct irq_domain   *domain;
+};
+
 #ifdef CONFIG_VFIO_NOIOMMU
 static bool noiommu __read_mostly;
 module_param_named(enable_unsafe_noiommu_support,
@@ -882,6 +893,109 @@ void *vfio_device_data(struct vfio_device *device)
 }
 EXPORT_SYMBOL_GPL(vfio_device_data);
 
+#ifdef CONFIG_GENERIC_MSI_IRQ_DOMAIN
+
+int vfio_device_map_msi(struct device *dev)
+{
+   struct irq_domain *msi_domain = dev_get_msi_domain(dev);
+   struct msi_domain_info *info;
+   struct vfio_device *device;
+   struct vfio_container *container;
+   struct vfio_msi *vmsi;
+   int ret;
+
+   if (!msi_domain)
+   return 0;
+   info = msi_domain->host_data;
+   if (!info->ops->vfio_map)
+   return 0;
+
+   device = dev_get_drvdata(dev);
+   container = device->group->container;
+
+   if (!container->iommu_driver->ops->map)
+   return -EINVAL;
+
+   mutex_lock(&container->msi_lock);
+
+   list_for_each_entry(vmsi, &container->msi_list, msi_next) {
+   if (vmsi->domain == msi_domain) {
+   kref_get(&vmsi->kref);
+   mutex_unlock(&container->msi_lock);
+   return 0;
+   }
+   }
+
+   vmsi = kmalloc(sizeof(*vmsi), GFP_KERNEL);
+   if (!vmsi) {
+   mutex_unlock(&container->msi_lock);
+   return -ENOMEM;
+   }
+
+   ret = info->ops->vfio_map(msi_domain, container->iommu_driver->ops,
+ container->iommu_da

[PATCH 2/3] gicv3, its: Introduce VFIO map and unmap operations

2015-11-23 Thread Pavel Fedin
These new functions use the supplied IOMMU in order to map and unmap MSI
translation register(s).

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 drivers/irqchip/irq-gic-v3-its.c   | 31 +++
 include/linux/irqchip/arm-gic-v3.h |  2 ++
 include/linux/msi.h| 12 
 3 files changed, 45 insertions(+)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index e23d1d1..b97dfd7 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -1257,8 +1258,38 @@ out:
return 0;
 }
 
+#if IS_ENABLED(CONFIG_VFIO)
+
+static int its_vfio_map(struct irq_domain *domain,
+   const struct vfio_iommu_driver_ops *ops,
+   void *iommu_data)
+{
+   struct msi_domain_info *msi_info = msi_get_domain_info(domain);
+   struct its_node *its = msi_info->data;
+   u64 addr = its->phys_base + GIC_V3_ITS_CONTROL_SIZE;
+
+   return ops->map(iommu_data, addr, addr, 1, IOMMU_READ|IOMMU_WRITE);
+}
+
+static void its_vfio_unmap(struct irq_domain *domain,
+  const struct vfio_iommu_driver_ops *ops,
+  void *iommu_data)
+{
+   struct msi_domain_info *msi_info = msi_get_domain_info(domain);
+   struct its_node *its = msi_info->data;
+   u64 addr = its->phys_base + GIC_V3_ITS_CONTROL_SIZE;
+
+   ops->unmap(iommu_data, addr, 1);
+}
+
+#endif
+
 static struct msi_domain_ops its_msi_domain_ops = {
.msi_prepare= its_msi_prepare,
+#if IS_ENABLED(CONFIG_VFIO)
+   .vfio_map   = its_vfio_map,
+   .vfio_unmap = its_vfio_unmap,
+#endif
 };
 
 static int its_irq_gic_domain_alloc(struct irq_domain *domain,
diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
index c9ae0c6..95388a7 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -240,6 +240,8 @@
 #define GITS_BASER_TYPE_RESERVED6  6
 #define GITS_BASER_TYPE_RESERVED7  7
 
+#define GIC_V3_ITS_CONTROL_SIZE0x1
+
 /*
  * ITS commands
  */
diff --git a/include/linux/msi.h b/include/linux/msi.h
index f71a25e..48faea9 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -155,6 +155,8 @@ void arch_restore_msi_irqs(struct pci_dev *dev);
 void default_teardown_msi_irqs(struct pci_dev *dev);
 void default_restore_msi_irqs(struct pci_dev *dev);
 
+struct vfio_iommu_driver_ops;
+
 struct msi_controller {
struct module *owner;
struct device *dev;
@@ -189,6 +191,8 @@ struct msi_domain_info;
  * @msi_finish:Optional callbacl to finalize the allocation
  * @set_desc:  Set the msi descriptor for an interrupt
  * @handle_error:  Optional error handler if the allocation fails
+ * @vfio_map:  Map the MSI hardware for VFIO
+ * @vfio_unmap:Unmap the MSI hardware for VFIO
  *
  * @get_hwirq, @msi_init and @msi_free are callbacks used by
  * msi_create_irq_domain() and related interfaces
@@ -218,6 +222,14 @@ struct msi_domain_ops {
struct msi_desc *desc);
int (*handle_error)(struct irq_domain *domain,
struct msi_desc *desc, int error);
+#if IS_ENABLED(CONFIG_VFIO)
+   int (*vfio_map)(struct irq_domain *domain,
+   const struct vfio_iommu_driver_ops *ops,
+   void *iommu_data);
+   void(*vfio_unmap)(struct irq_domain *domain,
+ const struct vfio_iommu_driver_ops *ops,
+ void *iommu_data);
+#endif
 };
 
 /**
-- 
2.4.4



[PATCH 1/3] vfio: Introduce map and unmap operations

2015-11-23 Thread Pavel Fedin
These new functions allow direct mapping and unmapping of addresses on the
given IOMMU. They will be used for mapping MSI hardware.

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 drivers/vfio/vfio_iommu_type1.c | 29 +
 include/linux/vfio.h|  4 +++-
 2 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 59d47cb..17506eb 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -558,6 +558,33 @@ unwind:
return ret;
 }
 
+static int vfio_iommu_type1_map(void *iommu_data, dma_addr_t iova,
+ phys_addr_t paddr, long npage, int prot)
+{
+   struct vfio_iommu *iommu = iommu_data;
+   int ret;
+
+   mutex_lock(&iommu->lock);
+   ret = vfio_iommu_map(iommu, iova, paddr >> PAGE_SHIFT, npage, prot);
+   mutex_unlock(&iommu->lock);
+
+   return ret;
+}
+
+static void vfio_iommu_type1_unmap(void *iommu_data, dma_addr_t iova,
+  long npage)
+{
+   struct vfio_iommu *iommu = iommu_data;
+   struct vfio_domain *d;
+
+   mutex_lock(&iommu->lock);
+
+   list_for_each_entry_reverse(d, &iommu->domain_list, next)
+   iommu_unmap(d->domain, iova, npage << PAGE_SHIFT);
+
+   mutex_unlock(&iommu->lock);
+}
+
 static int vfio_dma_do_map(struct vfio_iommu *iommu,
   struct vfio_iommu_type1_dma_map *map)
 {
@@ -1046,6 +1073,8 @@ static const struct vfio_iommu_driver_ops vfio_iommu_driver_ops_type1 = {
.ioctl  = vfio_iommu_type1_ioctl,
.attach_group   = vfio_iommu_type1_attach_group,
.detach_group   = vfio_iommu_type1_detach_group,
+   .map= vfio_iommu_type1_map,
+   .unmap  = vfio_iommu_type1_unmap,
 };
 
 static int __init vfio_iommu_type1_init(void)
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 610a86a..061038a 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -75,7 +75,9 @@ struct vfio_iommu_driver_ops {
struct iommu_group *group);
void(*detach_group)(void *iommu_data,
struct iommu_group *group);
-
+   int (*map)(void *iommu_data, dma_addr_t iova,
+  phys_addr_t paddr, long npage, int prot);
+   void(*unmap)(void *iommu_data, dma_addr_t iova, long npage);
 };
 
 extern int vfio_register_iommu_driver(const struct vfio_iommu_driver_ops *ops);
-- 
2.4.4



[PATCH 0/3] Introduce MSI hardware mapping for VFIO

2015-11-23 Thread Pavel Fedin
On some architectures (e.g. ARM64), if a device is behind an IOMMU and
is being mapped by VFIO, it is also necessary to add mappings for the MSI
translation register for interrupts to work. This series implements the
necessary API to do this, and makes use of this API for GICv3 ITS on
ARM64.

Pavel Fedin (3):
  vfio: Introduce map and unmap operations
  gicv3, its: Introduce VFIO map and unmap operations
  vfio: Introduce generic MSI mapping operations

 drivers/irqchip/irq-gic-v3-its.c   |  31 ++
 drivers/vfio/pci/vfio_pci_intrs.c  |  11 
 drivers/vfio/vfio.c| 112 +
 drivers/vfio/vfio_iommu_type1.c|  29 ++
 include/linux/irqchip/arm-gic-v3.h |   2 +
 include/linux/msi.h|  12 
 include/linux/vfio.h   |   6 +-
 7 files changed, 202 insertions(+), 1 deletion(-)

-- 
2.4.4



RE: [PATCH v2] ARM/arm64: KVM: test properly for a PTE's uncachedness

2015-11-10 Thread Pavel Fedin
 Hello!

 Tested-by: Pavel Fedin <p.fe...@samsung.com>

 Personally, I have a small concern about this way of testing. I know many
ports of the kernel to proprietary systems, and they tend to have drivers
which just deal with hardcoded physical memory regions on their own, without
even registering them in the kernel.
 OTOH:
 1. KVM is not meant to be hacked this way, as far as I can understand.
 2. Maintainers, I believe, would say: "Then all problems are problems of
the authors of those ports".
 3. Actually, this does not invent anything new, but reuses an approach
already used in other parts of the code. And this part is what I personally
like.
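
 To illustrate the masking problem described in the quoted commit message
below, here is a minimal sketch in plain C. It uses only the two stage-2
type values quoted there; S2_ATTR_MASK and both helper names are
hypothetical and exist only for this example, and the real PTE layout
(shifts, permission bits) is ignored.

```c
#include <stdbool.h>

/* Stage-2 memory type values as quoted in the commit message. */
#define MT_S2_NORMAL		0xfUL
#define MT_S2_DEVICE_nGnRE	0x1UL

/* Hypothetical mask covering the whole attribute field (illustration). */
#define S2_ATTR_MASK		0xfUL

/* Flawed pattern: mask with the constant, then compare against it. */
static bool flawed_is_device(unsigned long attr)
{
	return (attr & MT_S2_DEVICE_nGnRE) == MT_S2_DEVICE_nGnRE;
}

/* Compare the complete attribute field against the device value. */
static bool exact_is_device(unsigned long attr)
{
	return (attr & S2_ATTR_MASK) == MT_S2_DEVICE_nGnRE;
}
```

 Since 0xf has bit 0 set, flawed_is_device() also reports normal memory as
a device mapping, while exact_is_device() does not.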

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia

> -Original Message-
> From: Ard Biesheuvel [mailto:ard.biesheu...@linaro.org]
> Sent: Tuesday, November 10, 2015 12:48 PM
> To: Christoffer Dall; Marc Zyngier; KVM devel mailing list; 
> kvm...@lists.cs.columbia.edu
> Cc: p.fe...@samsung.com; Ard Biesheuvel
> Subject: Re: [PATCH v2] ARM/arm64: KVM: test properly for a PTE's uncachedness
> 
> (adding lists)
> 
> On 10 November 2015 at 10:45, Ard Biesheuvel <ard.biesheu...@linaro.org> 
> wrote:
> > Hi all,
> >
> > I wonder if this is a better way to address the problem. It looks at
> > the nature of the memory rather than the nature of the mapping, which
> > is probably a more reliable indicator of whether cache maintenance is
> > required when performing the unmap.
> >
> >
> > ---8<
> > The open coded tests for checking whether a PTE maps a page as
> > uncached use a flawed 'pte_val(xxx) & CONST != CONST' pattern,
> > which is not guaranteed to work since the type of a mapping is
> > not a set of mutually exclusive bits
> >
> > For HYP mappings, the type is an index into the MAIR table (i.e, the
> > index itself does not contain any information whatsoever about the
> > type of the mapping), and for stage-2 mappings it is a bit field where
> > normal memory and device types are defined as follows:
> >
> > #define MT_S2_NORMAL            0xf
> > #define MT_S2_DEVICE_nGnRE      0x1
> >
> > I.e., masking *and* comparing with the latter matches on the former,
> > and we have been getting lucky merely because the S2 device mappings
> > also have the PTE_UXN bit set, or we would misidentify memory mappings
> > as device mappings.
> >
> > Since the unmap_range() code path (which contains one instance of the
> > flawed test) is used both for HYP mappings and stage-2 mappings, and
> > considering the difference between the two, it is non-trivial to fix
> > this by rewriting the tests in place, as it would involve passing
> > down the type of mapping through all the functions.
> >
> > However, since HYP mappings and stage-2 mappings both deal with host
> > physical addresses, we can simply check whether the mapping is backed
> > by memory that is managed by the host kernel, and only perform the
> > D-cache maintenance if this is the case.
> >
> > Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
> > ---
> >  arch/arm/kvm/mmu.c | 15 +++
> >  1 file changed, 7 insertions(+), 8 deletions(-)
> >
> > diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> > index 6984342da13d..7dace909d5cf 100644
> > --- a/arch/arm/kvm/mmu.c
> > +++ b/arch/arm/kvm/mmu.c
> > @@ -98,6 +98,11 @@ static void kvm_flush_dcache_pud(pud_t pud)
> > __kvm_flush_dcache_pud(pud);
> >  }
> >
> > +static bool kvm_is_device_pfn(unsigned long pfn)
> > +{
> > +   return !pfn_valid(pfn);
> > +}
> > +
> >  /**
> >   * stage2_dissolve_pmd() - clear and flush huge PMD entry
> >   * @kvm:   pointer to kvm structure.
> > @@ -213,7 +218,7 @@ static void unmap_ptes(struct kvm *kvm, pmd_t *pmd,
> > kvm_tlb_flush_vmid_ipa(kvm, addr);
> >
> > /* No need to invalidate the cache for device 
> > mappings */
> > -   if ((pte_val(old_pte) & PAGE_S2_DEVICE) != 
> > PAGE_S2_DEVICE)
> > +   if (!kvm_is_device_pfn(__phys_to_pfn(addr)))
> > kvm_flush_dcache_pte(old_pte);
> >
> > put_page(virt_to_page(pte));
> > @@ -305,8 +310,7 @@ static void stage2_flush_ptes(struct kvm *kvm, pmd_t 
> > *pmd,
> >
> > pte = pte_offset_kernel(pmd, addr);
> > do {
> > -   if (!pte_none(*pte) &&
> > -   (pte_val(

RE: [PATCH] ARM/arm64: KVM: test properly for a PTE's uncachedness

2015-11-08 Thread Pavel Fedin
 Hello!

 I have tested this patch; it also fixes the crash on Exynos5410 and is indeed
a better approach.

Tested-by: Pavel Fedin <p.fe...@samsung.com>

CC'ed general KVM mailing list too.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia


> -Original Message-
> From: kvmarm-boun...@lists.cs.columbia.edu 
> [mailto:kvmarm-boun...@lists.cs.columbia.edu] On
> Behalf Of Ard Biesheuvel
> Sent: Friday, November 06, 2015 2:43 PM
> To: linux-arm-ker...@lists.infradead.org; kvm...@lists.cs.columbia.edu; 
> marc.zyng...@arm.com;
> christoffer.d...@linaro.org
> Cc: Ard Biesheuvel
> Subject: [PATCH] ARM/arm64: KVM: test properly for a PTE's uncachedness
> 
> The open coded tests for checking whether a PTE maps a page as
> uncached use a flawed 'pte_val(xxx) & CONST != CONST' pattern,
> which is not guaranteed to work since the type of a mapping is an
> index into the MAIR table, not a set of mutually exclusive bits.
> 
> Considering that, on arm64, the S2 type definitions use the following
> MAIR indexes
> 
> #define MT_S2_NORMAL            0xf
> #define MT_S2_DEVICE_nGnRE      0x1
> 
> we have been getting lucky merely because the S2 device mappings also
> have the PTE_UXN bit set, which means that a device PTE still does not
> equal a normal PTE after masking with the former type.
> 
> Instead, implement proper checking against the MAIR indexes that are
> known to define uncached memory attributes.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org>
> ---
>  arch/arm/include/asm/kvm_mmu.h   | 11 +++
>  arch/arm/kvm/mmu.c   |  5 ++---
>  arch/arm64/include/asm/kvm_mmu.h | 12 
>  3 files changed, 25 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index 405aa1883307..422973835d41 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -279,6 +279,17 @@ static inline void __kvm_extend_hypmap(pgd_t 
> *boot_hyp_pgd,
>  pgd_t *merged_hyp_pgd,
>  unsigned long hyp_idmap_start) { }
> 
> +static inline bool __kvm_pte_is_uncached(pte_t pte)
> +{
> + switch (pte_val(pte) & L_PTE_MT_MASK) {
> + case L_PTE_MT_UNCACHED:
> + case L_PTE_MT_BUFFERABLE:
> + case L_PTE_MT_DEV_SHARED:
> + return true;
> + }
> + return false;
> +}
> +
>  #endif   /* !__ASSEMBLY__ */
> 
>  #endif /* __ARM_KVM_MMU_H__ */
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 6984342da13d..eb9a06e3dbee 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -213,7 +213,7 @@ static void unmap_ptes(struct kvm *kvm, pmd_t *pmd,
>   kvm_tlb_flush_vmid_ipa(kvm, addr);
> 
>   /* No need to invalidate the cache for device mappings 
> */
> - if ((pte_val(old_pte) & PAGE_S2_DEVICE) != 
> PAGE_S2_DEVICE)
> + if (!__kvm_pte_is_uncached(old_pte))
>   kvm_flush_dcache_pte(old_pte);
> 
>   put_page(virt_to_page(pte));
> @@ -305,8 +305,7 @@ static void stage2_flush_ptes(struct kvm *kvm, pmd_t *pmd,
> 
>   pte = pte_offset_kernel(pmd, addr);
>   do {
> - if (!pte_none(*pte) &&
> - (pte_val(*pte) & PAGE_S2_DEVICE) != PAGE_S2_DEVICE)
> + if (!pte_none(*pte) && !__kvm_pte_is_uncached(*pte))
>   kvm_flush_dcache_pte(*pte);
>   } while (pte++, addr += PAGE_SIZE, addr != end);
>  }
> diff --git a/arch/arm64/include/asm/kvm_mmu.h 
> b/arch/arm64/include/asm/kvm_mmu.h
> index 61505676d085..5806f412a47a 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -302,5 +302,17 @@ static inline void __kvm_extend_hypmap(pgd_t 
> *boot_hyp_pgd,
>   merged_hyp_pgd[idmap_idx] = __pgd(__pa(boot_hyp_pgd) | PMD_TYPE_TABLE);
>  }
> 
> +static inline bool __kvm_pte_is_uncached(pte_t pte)
> +{
> + switch (pte_val(pte) & PTE_ATTRINDX_MASK) {
> + case PTE_ATTRINDX(MT_DEVICE_nGnRnE):
> + case PTE_ATTRINDX(MT_DEVICE_nGnRE):
> + case PTE_ATTRINDX(MT_DEVICE_GRE):
> + case PTE_ATTRINDX(MT_NORMAL_NC):
> + return true;
> + }
> + return false;
> +}
> +
>  #endif /* __ASSEMBLY__ */
>  #endif /* __ARM64_KVM_MMU_H__ */
> --
> 1.9.1
> 
> ___
> kvmarm mailing list
> kvm...@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm



RE: [PATCH] KVM: arm: Fix crash in free_hyp_pgds() if timer initialization fails

2015-11-06 Thread Pavel Fedin
 Hello!

> > diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> > index 7b42012..839dd970 100644
> > --- a/arch/arm/kvm/mmu.c
> > +++ b/arch/arm/kvm/mmu.c
> > @@ -213,7 +213,10 @@ static void unmap_ptes(struct kvm *kvm, pmd_t *pmd,
> > kvm_tlb_flush_vmid_ipa(kvm, addr);
> >
> > /* No need to invalidate the cache for device mappings 
> > */
> > -   if ((pte_val(old_pte) & PAGE_S2_DEVICE) != 
> > PAGE_S2_DEVICE)
> > +   if (((pte_val(old_pte) & PAGE_S2_DEVICE)
> > +!= PAGE_S2_DEVICE) &&
> > +   ((pte_val(old_pte) & PAGE_HYP_DEVICE)
> > +!= PAGE_HYP_DEVICE))
> > kvm_flush_dcache_pte(old_pte);
> >
> > put_page(virt_to_page(pte));
> > --
> > 2.4.4
> >
> 
> Did you check if PAGE_HYP_DEVICE can mean something sane on a stage-2
> page table entry and vice versa?

 I tried to. The chain of macros and variables is complicated enough that I
cannot be 200% sure, but anyway PAGE_HYP_DEVICE (as well as PAGE_S2_DEVICE)
includes PROT_PTE_DEVICE, so this is definitely a device mapping.
 I even tried to construct a mask in order to make a single check for only
the DEVICE flags but, to make things even less understandable and
predictable, the same code is reused by ARM64 with different bitfields. So I
thought that it would be more reliable just to add a second test.

> 
> Also, the commit message and formatting here is horrible, see this
> reworked version:

[skip]

 It's OK. To tell the truth, the commit message is not that important
to me, but I know that sometimes these changes require good elaboration,
so I included as much information as possible, together with the crash
backtrace. I've seen something like this in "git log" before.
 Could you give me some directions on how to write better messages? As for
formatting, IIRC I adhere to the "75 chars per line" rule, and always
(well, almost, unless I forget to do so ;) ) run checkpatch.

> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 6984342..f0c3aef 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -206,18 +206,20 @@ static void unmap_ptes(struct kvm *kvm, pmd_t *pmd,
> 
>   start_pte = pte = pte_offset_kernel(pmd, addr);
>   do {
> - if (!pte_none(*pte)) {
> - pte_t old_pte = *pte;
> + if (pte_none(*pte))
> + continue;
> 
> - kvm_set_pte(pte, __pte(0));
> - kvm_tlb_flush_vmid_ipa(kvm, addr);
> + pte_t old_pte = *pte;
> 
> - /* No need to invalidate the cache for device mappings */
> - if ((pte_val(old_pte) & PAGE_S2_DEVICE) != PAGE_S2_DEVICE)
> - kvm_flush_dcache_pte(old_pte);
> + kvm_set_pte(pte, __pte(0));
> + kvm_tlb_flush_vmid_ipa(kvm, addr);
> 
> - put_page(virt_to_page(pte));
> - }
> + /* No need to invalidate the cache for device mappings */
> + if ((pte_val(old_pte) & PAGE_S2_DEVICE) != PAGE_S2_DEVICE &&
> + (pte_val(old_pte) & PAGE_HYP_DEVICE) != PAGE_HYP_DEVICE)
> + kvm_flush_dcache_pte(old_pte);
> +
> + put_page(virt_to_page(pte));
>   } while (pte++, addr += PAGE_SIZE, addr != end);
> 
>   if (kvm_pte_table_empty(kvm, start_pte))
> --

 I see you inverted the pte_none() check, and now the kbuild bot complains
about "mixed declarations and code".
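
 One way to keep the early "continue" without the warning is to declare the
variable at the top of the loop body, before any statement. A minimal sketch
follows; ex_pte_t and ex_unmap_entries() are hypothetical stand-ins and not
the kernel's types or functions.

```c
#include <stddef.h>

/* Hypothetical stand-in for a page table entry; 0 means "none". */
typedef unsigned long ex_pte_t;

/*
 * Clears every populated entry and returns how many were cleared.
 * old_pte is declared at the top of the loop body, before any statement,
 * so no declaration follows the early "continue".
 */
static size_t ex_unmap_entries(ex_pte_t *table, size_t n)
{
	size_t i, cleared = 0;

	for (i = 0; i < n; i++) {
		ex_pte_t old_pte = table[i];

		if (old_pte == 0)
			continue;

		table[i] = 0;
		cleared++;
	}

	return cleared;
}
```

 Reading the entry before testing it costs nothing here, and the declaration
order stays acceptable to compilers run with -Wdeclaration-after-statement.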

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




RE: [PATCH] KVM: arm: Fix crash in free_hyp_pgds() if timer initialization fails

2015-11-06 Thread Pavel Fedin
 Hello!

> > > Did you check if PAGE_HYP_DEVICE can mean something sane on a stage-2
> > > page table entry and vice versa?
> >
> >  I tried to, the chain of macros and variables is complicated enough not to
> > get 200% sure, but anyway PAGE_HYP_DEVICE (as well as PAGE_S2_DEVICE)
> > includes PROT_PTE_DEVICE, so this is definitely device.
> >  I even tried to construct some mask in order to make a single check for
> > only DEVICE flags, but, to make things even less understandable and
> > predictable, the same code with different bitfields is reused by ARM64.
> > So, i thought that it will be more reliable just to add a second test.
> 
> The thing I want to avoid is PAGE_HYP_DEVICE covering some normal S2
> mapping, which we *should* flush but that we now end up ignoring?  That
> doesn't sound like it can be the case because the device bit is the same
> bit for both types of page tables, correct?

 Yes, this is exactly what I think. If the DEVICE bit is set, then it is
device memory of some kind and it doesn't need flushing.

 Or, in order to be 200% sure, we could modify the whole unmapping logic to
carry over a flag telling whether we are removing normal or HYP mappings. But
wouldn't this be much more complicated?

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




RE: [PATCH] KVM: arm: Fix crash in free_hyp_pgds() if timer initialization fails

2015-11-06 Thread Pavel Fedin
 Hello!

> >> The thing I want to avoid is PAGE_HYP_DEVICE covering some normal S2
> >> mapping, which we *should* flush but that we now end up ignoring?  That
> >> doesn't sound like it can be the case because the device bit is the same
> >> bit for both types of page tables, correct?
> >
> >  Yes, this is exactly what i think. If DEVICE bit is set, then it's somehow
> > device memory and it doesn't need flushing.
> >
> >  Or, in order to be 200% sure, we could modify the whole unmapping logic to
> > carry over a flag, telling whether we are removing normal or HYP mappings.
> > But wouldn't this be much more complicated?
> 
> We could do without that complexity. Also, the test itself is wrong (see
> Ard's patch that was posted this morning for the real fix).

 Good. I saw it and will test it on Monday. Indeed, this is better than my
approach, and it is what I actually wanted to do, but I didn't study the thing
deeply enough to implement it.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




RE: [PATCH v4 0/3] KVM: arm/arm64: Clean up some obsolete code

2015-11-04 Thread Pavel Fedin
 Hello!

> Actually, I seem to have been just incredibly unlucky with my test
> cycles, because I eventually reproduced the bug without your patches.

 Or lucky, without "un" :)

> I'm going to take this version of the series because that's what I
> reviewed and tested.

 It's OK. As I wrote, v5 is actually no different from v4, just with 0001 split
up for bisection. Making it was useful because it helped me to make sure once
again that I haven't messed anything up.

> Sorry for the noise.

 It's OK, thank you very much for putting effort into testing and cooperation.
 You know, since we are talking about this... This definitely has something to
do with the reset, and... it looks like nobody resets the vGIC/vTimer unless
the userland does it explicitly by resetting every register by hand.
 I know there is no global "reset" function for the whole VM. But at least we
have a reset ioctl for the vCPU. What if we hook up vGIC/vTimer there, and
reset at least the per-CPU objects (CPU interface + redistributor + timer) at
this point?

 P.S. I've seen your PULL, and it is missing a little thing that could be good
for 4.4 too. I've recently fixed one more bug, which reproduces on
CP15-timer-less boards like Exynos:
http://www.spinics.net/lists/kvm/msg122746.html. Just making sure that you
don't miss it.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




[PATCH v5 2/4] KVM: arm/arm64: Replace lr_used with elrsr

2015-11-03 Thread Pavel Fedin
Since commit ae705930fca6322600690df9dc1c7d0516145a93 ("arm/arm64: KVM:
Keep elrsr/aisr in sync with software model") lr_used is completely
redundant, because together with lr_used we also update elrsr. This makes it
easy to replace lr_used with elrsr, inverting all conditions (because in
elrsr '1' means 'free').

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 include/kvm/arm_vgic.h |  3 ---
 virt/kvm/arm/vgic-v2.c |  1 +
 virt/kvm/arm/vgic-v3.c |  1 +
 virt/kvm/arm/vgic.c| 37 +
 4 files changed, 15 insertions(+), 27 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index c74dc7b..3936bf8 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -305,9 +305,6 @@ struct vgic_cpu {
unsigned long   *active_shared;
unsigned long   *pend_act_shared;
 
-   /* Bitmap of used/free list registers */
-   DECLARE_BITMAP(lr_used, VGIC_V2_MAX_LRS);
-
/* Number of list registers on this CPU */
int nr_lr;
 
diff --git a/virt/kvm/arm/vgic-v2.c b/virt/kvm/arm/vgic-v2.c
index 8d7b04d..c0f5d7f 100644
--- a/virt/kvm/arm/vgic-v2.c
+++ b/virt/kvm/arm/vgic-v2.c
@@ -158,6 +158,7 @@ static void vgic_v2_enable(struct kvm_vcpu *vcpu)
 * anyway.
 */
vcpu->arch.vgic_cpu.vgic_v2.vgic_vmcr = 0;
+   vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr = ~0;
 
/* Get the show on the road... */
vcpu->arch.vgic_cpu.vgic_v2.vgic_hcr = GICH_HCR_EN;
diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
index 7dd5d62..92003cb 100644
--- a/virt/kvm/arm/vgic-v3.c
+++ b/virt/kvm/arm/vgic-v3.c
@@ -193,6 +193,7 @@ static void vgic_v3_enable(struct kvm_vcpu *vcpu)
 * anyway.
 */
vgic_v3->vgic_vmcr = 0;
+   vgic_v3->vgic_elrsr = ~0;
 
/*
 * If we are emulating a GICv3, we do it in an non-GICv2-compatible
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 54233e0..265a410 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -108,6 +108,7 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu);
 static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu);
 static struct vgic_lr vgic_get_lr(const struct kvm_vcpu *vcpu, int lr);
 static void vgic_set_lr(struct kvm_vcpu *vcpu, int lr, struct vgic_lr lr_desc);
+static u64 vgic_get_elrsr(struct kvm_vcpu *vcpu);
 static struct irq_phys_map *vgic_irq_map_search(struct kvm_vcpu *vcpu,
int virt_irq);
 static int compute_pending_for_cpu(struct kvm_vcpu *vcpu);
@@ -691,9 +692,11 @@ bool vgic_handle_cfg_reg(u32 *reg, struct kvm_exit_mmio *mmio,
 void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
 {
struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+   u64 elrsr = vgic_get_elrsr(vcpu);
+   unsigned long *elrsr_ptr = u64_to_bitmask(&elrsr);
int i;
 
-   for_each_set_bit(i, vgic_cpu->lr_used, vgic_cpu->nr_lr) {
+   for_each_clear_bit(i, elrsr_ptr, vgic_cpu->nr_lr) {
struct vgic_lr lr = vgic_get_lr(vcpu, i);
 
/*
@@ -1098,7 +1101,6 @@ static inline void vgic_enable(struct kvm_vcpu *vcpu)
 
 static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu)
 {
-   struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
struct vgic_lr vlr = vgic_get_lr(vcpu, lr_nr);
 
/*
@@ -1112,7 +1114,6 @@ static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu)
 
vlr.state = 0;
vgic_set_lr(vcpu, lr_nr, vlr);
-   clear_bit(lr_nr, vgic_cpu->lr_used);
vgic_sync_lr_elrsr(vcpu, lr_nr, vlr);
 }
 
@@ -1127,10 +1128,11 @@ static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu)
  */
 static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu)
 {
-   struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+   u64 elrsr = vgic_get_elrsr(vcpu);
+   unsigned long *elrsr_ptr = u64_to_bitmask(&elrsr);
int lr;
 
-   for_each_set_bit(lr, vgic_cpu->lr_used, vgic->nr_lr) {
+   for_each_clear_bit(lr, elrsr_ptr, vgic->nr_lr) {
struct vgic_lr vlr = vgic_get_lr(vcpu, lr);
 
if (!vgic_irq_is_enabled(vcpu, vlr.irq)) {
@@ -1187,8 +1189,9 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
  */
 bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
 {
-   struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+   u64 elrsr = vgic_get_elrsr(vcpu);
+   unsigned long *elrsr_ptr = u64_to_bitmask(&elrsr);
struct vgic_lr vlr;
int lr;
 
@@ -1200,7 +1203,7 @@ bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
kvm_debug("Queue IRQ%d\n", irq);
 
/* Do we have an active interrupt for the same CPUID? */
-   for_each_set_bit(lr, vgic_cpu->lr_used, vgic_cpu->nr_lr) {
+   for_e

[PATCH v5 1/4] KVM: arm/arm64: Remove vgic_irq_lr_map

2015-11-03 Thread Pavel Fedin
Currently we use vgic_irq_lr_map in order to track which LRs hold which
IRQs, and lr_used bitmap in order to track which LRs are used or free.

vgic_irq_lr_map is actually used in only one place, for the piggy-back
optimization, and can easily be replaced by iteration over lr_used.
Therefore we remove it in preparation for the introduction of LPI support.
Once LPIs are added, the number of IRQs will grow to at least 16384, while
numbers from 1024 to 8192 are never going to be used. Keeping a per-IRQ map
would then be a huge waste of memory.

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 include/kvm/arm_vgic.h |  3 ---
 virt/kvm/arm/vgic.c| 18 +++---
 2 files changed, 3 insertions(+), 18 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 8065801..c74dc7b 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -295,9 +295,6 @@ struct vgic_v3_cpu_if {
 };
 
 struct vgic_cpu {
-   /* per IRQ to LR mapping */
-   u8  *vgic_irq_lr_map;
-
/* Pending/active/both interrupts on this VCPU */
DECLARE_BITMAP(pending_percpu, VGIC_NR_PRIVATE_IRQS);
DECLARE_BITMAP(active_percpu, VGIC_NR_PRIVATE_IRQS);
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index d4669eb..54233e0 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1113,7 +1113,6 @@ static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu)
vlr.state = 0;
vgic_set_lr(vcpu, lr_nr, vlr);
clear_bit(lr_nr, vgic_cpu->lr_used);
-   vgic_cpu->vgic_irq_lr_map[irq] = LR_EMPTY;
vgic_sync_lr_elrsr(vcpu, lr_nr, vlr);
 }
 
@@ -1200,14 +1199,11 @@ bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
 
kvm_debug("Queue IRQ%d\n", irq);
 
-   lr = vgic_cpu->vgic_irq_lr_map[irq];
-
/* Do we have an active interrupt for the same CPUID? */
-   if (lr != LR_EMPTY) {
+   for_each_set_bit(lr, vgic_cpu->lr_used, vgic_cpu->nr_lr) {
vlr = vgic_get_lr(vcpu, lr);
-   if (vlr.source == sgi_source_id) {
+   if (vlr.irq == irq && vlr.source == sgi_source_id) {
kvm_debug("LR%d piggyback for IRQ%d\n", lr, vlr.irq);
-   BUG_ON(!test_bit(lr, vgic_cpu->lr_used));
vgic_queue_irq_to_lr(vcpu, irq, lr, vlr);
return true;
}
@@ -1220,7 +1216,6 @@ bool vgic_queue_irq(struct kvm_vcpu *vcpu, u8 sgi_source_id, int irq)
return false;
 
kvm_debug("LR%d allocated for IRQ%d %x\n", lr, irq, sgi_source_id);
-   vgic_cpu->vgic_irq_lr_map[irq] = lr;
set_bit(lr, vgic_cpu->lr_used);
 
vlr.irq = irq;
@@ -1484,7 +1479,6 @@ static void __kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu)
clear_bit(lr, vgic_cpu->lr_used);
 
BUG_ON(vlr.irq >= dist->nr_irqs);
-   vgic_cpu->vgic_irq_lr_map[vlr.irq] = LR_EMPTY;
}
 
/* Check if we still have something up our sleeve... */
@@ -1912,12 +1906,10 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
kfree(vgic_cpu->pending_shared);
kfree(vgic_cpu->active_shared);
kfree(vgic_cpu->pend_act_shared);
-   kfree(vgic_cpu->vgic_irq_lr_map);
vgic_destroy_irq_phys_map(vcpu->kvm, &vgic_cpu->irq_phys_map_list);
vgic_cpu->pending_shared = NULL;
vgic_cpu->active_shared = NULL;
vgic_cpu->pend_act_shared = NULL;
-   vgic_cpu->vgic_irq_lr_map = NULL;
 }
 
 static int vgic_vcpu_init_maps(struct kvm_vcpu *vcpu, int nr_irqs)
@@ -1928,18 +1920,14 @@ static int vgic_vcpu_init_maps(struct kvm_vcpu *vcpu, int nr_irqs)
vgic_cpu->pending_shared = kzalloc(sz, GFP_KERNEL);
vgic_cpu->active_shared = kzalloc(sz, GFP_KERNEL);
vgic_cpu->pend_act_shared = kzalloc(sz, GFP_KERNEL);
-   vgic_cpu->vgic_irq_lr_map = kmalloc(nr_irqs, GFP_KERNEL);
 
if (!vgic_cpu->pending_shared
|| !vgic_cpu->active_shared
-   || !vgic_cpu->pend_act_shared
-   || !vgic_cpu->vgic_irq_lr_map) {
+   || !vgic_cpu->pend_act_shared) {
kvm_vgic_vcpu_destroy(vcpu);
return -ENOMEM;
}
 
-   memset(vgic_cpu->vgic_irq_lr_map, LR_EMPTY, nr_irqs);
-
/*
 * Store the number of LRs per vcpu, so we don't have to go
 * all the way to the distributor structure to find out. Only
-- 
2.4.4



[PATCH v5 0/4] KVM: arm/arm64: Clean up some obsolete code

2015-11-03 Thread Pavel Fedin
The current KVM code has lots of old redundancies, which can be cleaned up.
This patchset is actually a better alternative to
http://www.spinics.net/lists/arm-kernel/msg430726.html, because it allows us
to keep piggy-backed LRs. The idea is based on the fact that our code also
maintains LR state in elrsr, and this information is enough to track LR
usage.
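
As a rough illustration of that idea (not the kernel code itself), the sketch
below finds a used LR for the piggy-back case and a free LR for allocation,
using nothing but an elrsr-style bitmask in which a set bit means "free". The
struct ex_lr type and the ex_* helpers are hypothetical simplifications.

```c
#include <assert.h>
#include <stdint.h>

#define EX_MAX_LRS 64

/* Hypothetical, simplified list register description. */
struct ex_lr {
	int irq;
	int source;
};

/*
 * Returns the index of a used LR holding (irq, source), or -1.
 * In elrsr a set bit means "free", so used LRs are the clear bits.
 */
static int ex_find_used_lr(uint64_t elrsr, const struct ex_lr *lrs,
			   int nr_lr, int irq, int source)
{
	int lr;

	assert(nr_lr >= 0 && nr_lr <= EX_MAX_LRS);

	for (lr = 0; lr < nr_lr; lr++) {
		if (elrsr & (UINT64_C(1) << lr))
			continue;	/* free LR, nothing to match */
		if (lrs[lr].irq == irq && lrs[lr].source == source)
			return lr;
	}

	return -1;
}

/* Returns the first free LR (first set bit in elrsr), or -1 if none. */
static int ex_find_free_lr(uint64_t elrsr, int nr_lr)
{
	int lr;

	assert(nr_lr >= 0 && nr_lr <= EX_MAX_LRS);

	for (lr = 0; lr < nr_lr; lr++)
		if (elrsr & (UINT64_C(1) << lr))
			return lr;

	return -1;
}
```

Because the polarity is inverted relative to a "used" bitmap, every former
"for each set bit" loop becomes a "for each clear bit" loop, and the initial
state of the mask has to be all ones.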

In case of problems this series can be applied partially; each patch is
a complete refactoring step on its own.

Thanks to Andre Przywara for pinpointing some 4.3+ specifics.

This version has been tested on SMDK5410 development board
(Exynos5410 SoC).

v4 => v5:
- Split up the first patch into two, for simpler bisection.

v3 => v4:
- Reordered changes for purpose of better understanding and bisection. All
  changes related to vgic_retire_lr() are gathered in one patch now.

v2 => v3:
- Removed two unused variables in __kvm_vgic_flush_hwstate(), overlooked
  leftover from v1.

v1 => v2:
- Rebased to kvmarm/next of 23.10.2015.
- Do not use vgic_retire_lr() for initializing ELRSR bitmask, because now
  it also handles pushback of PENDING state, use direct initialization
  instead (copied from Andre's patchset).
- Took more care with vgic_retire_lr(), which deserves its own patch.

Pavel Fedin (4):
  KVM: arm/arm64: Remove vgic_irq_lr_map
  KVM: arm/arm64: Replace lr_used with elrsr
  KVM: arm/arm64: Clean up vgic_retire_lr() and surroundings
  KVM: arm/arm64: Merge vgic_set_lr() and vgic_sync_lr_elrsr()

 include/kvm/arm_vgic.h |   7 
 virt/kvm/arm/vgic-v2.c |   6 +--
 virt/kvm/arm/vgic-v3.c |   6 +--
 virt/kvm/arm/vgic.c| 104 +
 4 files changed, 29 insertions(+), 94 deletions(-)

-- 
2.4.4



[PATCH v5 4/4] KVM: arm/arm64: Merge vgic_set_lr() and vgic_sync_lr_elrsr()

2015-11-03 Thread Pavel Fedin
Now we see that vgic_set_lr() and vgic_sync_lr_elrsr() are always used
together. Merge them into one function, saving a second vgic_ops
dereference every time.

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 include/kvm/arm_vgic.h |  1 -
 virt/kvm/arm/vgic-v2.c |  5 -
 virt/kvm/arm/vgic-v3.c |  5 -
 virt/kvm/arm/vgic.c| 14 ++
 4 files changed, 2 insertions(+), 23 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 3936bf8..f62addc 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -112,7 +112,6 @@ struct vgic_vmcr {
 struct vgic_ops {
struct vgic_lr  (*get_lr)(const struct kvm_vcpu *, int);
void(*set_lr)(struct kvm_vcpu *, int, struct vgic_lr);
-   void(*sync_lr_elrsr)(struct kvm_vcpu *, int, struct vgic_lr);
u64 (*get_elrsr)(const struct kvm_vcpu *vcpu);
u64 (*get_eisr)(const struct kvm_vcpu *vcpu);
void(*clear_eisr)(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/arm/vgic-v2.c b/virt/kvm/arm/vgic-v2.c
index c0f5d7f..ff02f08 100644
--- a/virt/kvm/arm/vgic-v2.c
+++ b/virt/kvm/arm/vgic-v2.c
@@ -79,11 +79,7 @@ static void vgic_v2_set_lr(struct kvm_vcpu *vcpu, int lr,
lr_val |= (lr_desc.source << GICH_LR_PHYSID_CPUID_SHIFT);
 
vcpu->arch.vgic_cpu.vgic_v2.vgic_lr[lr] = lr_val;
-}
 
-static void vgic_v2_sync_lr_elrsr(struct kvm_vcpu *vcpu, int lr,
- struct vgic_lr lr_desc)
-{
if (!(lr_desc.state & LR_STATE_MASK))
vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr |= (1ULL << lr);
else
@@ -167,7 +163,6 @@ static void vgic_v2_enable(struct kvm_vcpu *vcpu)
 static const struct vgic_ops vgic_v2_ops = {
.get_lr = vgic_v2_get_lr,
.set_lr = vgic_v2_set_lr,
-   .sync_lr_elrsr  = vgic_v2_sync_lr_elrsr,
.get_elrsr  = vgic_v2_get_elrsr,
.get_eisr   = vgic_v2_get_eisr,
.clear_eisr = vgic_v2_clear_eisr,
diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
index 92003cb..487d635 100644
--- a/virt/kvm/arm/vgic-v3.c
+++ b/virt/kvm/arm/vgic-v3.c
@@ -112,11 +112,7 @@ static void vgic_v3_set_lr(struct kvm_vcpu *vcpu, int lr,
}
 
vcpu->arch.vgic_cpu.vgic_v3.vgic_lr[LR_INDEX(lr)] = lr_val;
-}
 
-static void vgic_v3_sync_lr_elrsr(struct kvm_vcpu *vcpu, int lr,
- struct vgic_lr lr_desc)
-{
if (!(lr_desc.state & LR_STATE_MASK))
vcpu->arch.vgic_cpu.vgic_v3.vgic_elrsr |= (1U << lr);
else
@@ -212,7 +208,6 @@ static void vgic_v3_enable(struct kvm_vcpu *vcpu)
 static const struct vgic_ops vgic_v3_ops = {
.get_lr = vgic_v3_get_lr,
.set_lr = vgic_v3_set_lr,
-   .sync_lr_elrsr  = vgic_v3_sync_lr_elrsr,
.get_elrsr  = vgic_v3_get_elrsr,
.get_eisr   = vgic_v3_get_eisr,
.clear_eisr = vgic_v3_clear_eisr,
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 96e45f3..fe451d4 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1032,12 +1032,6 @@ static void vgic_set_lr(struct kvm_vcpu *vcpu, int lr,
vgic_ops->set_lr(vcpu, lr, vlr);
 }
 
-static void vgic_sync_lr_elrsr(struct kvm_vcpu *vcpu, int lr,
-  struct vgic_lr vlr)
-{
-   vgic_ops->sync_lr_elrsr(vcpu, lr, vlr);
-}
-
 static inline u64 vgic_get_elrsr(struct kvm_vcpu *vcpu)
 {
return vgic_ops->get_elrsr(vcpu);
@@ -1100,7 +1094,6 @@ static void vgic_retire_lr(int lr_nr, struct kvm_vcpu *vcpu)
 
vlr.state = 0;
vgic_set_lr(vcpu, lr_nr, vlr);
-   vgic_sync_lr_elrsr(vcpu, lr_nr, vlr);
 }
 
 /*
@@ -1162,7 +1155,6 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
}
 
vgic_set_lr(vcpu, lr_nr, vlr);
-   vgic_sync_lr_elrsr(vcpu, lr_nr, vlr);
 }
 
 /*
@@ -1340,8 +1332,6 @@ static int process_queued_irq(struct kvm_vcpu *vcpu,
vlr.hwirq = 0;
vgic_set_lr(vcpu, lr, vlr);
 
-   vgic_sync_lr_elrsr(vcpu, lr, vlr);
-
return pending;
 }
 
@@ -1442,8 +1432,6 @@ static void __kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu)
bool level_pending;
 
level_pending = vgic_process_maintenance(vcpu);
-   elrsr = vgic_get_elrsr(vcpu);
-   elrsr_ptr = u64_to_bitmask(&elrsr);
 
/* Deal with HW interrupts, and clear mappings for empty LRs */
for (lr = 0; lr < vgic->nr_lr; lr++) {
@@ -1454,6 +1442,8 @@ static void __kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu)
}
 
/* Check if we still have something up our sleeve... */
+   elrsr = vgic_get_elrsr(vcpu);
+   elrsr_ptr = u64_to_bitmask(&elrsr);
pending = find_first_zero_bit(elrsr_ptr, vgic->nr_lr);
if (level_pending || pending < vgic->nr_lr)
   

[PATCH v5 3/4] KVM: arm/arm64: Clean up vgic_retire_lr() and surroundings

2015-11-03 Thread Pavel Fedin
1. Remove unnecessary 'irq' argument, because irq number can be retrieved
   from the LR.
2. Since commit cff9211eb1a1f58ce7f5a2d596b617928fd4be0e ("arm/arm64: KVM:
   Fix arch timer behavior for disabled interrupts") LR_STATE_PENDING is
   queued back by vgic_retire_lr() itself. Also, it clears vlr.state
   itself. Therefore, we remove the same, now duplicated, check with all
   accompanying bit manipulations from vgic_unqueue_irqs().
3. vgic_retire_lr() is always accompanied by vgic_irq_clear_queued(). Since
   it already does more than just clearing the LR, move
   vgic_irq_clear_queued() inside of it.

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 virt/kvm/arm/vgic.c | 37 ++---
 1 file changed, 10 insertions(+), 27 deletions(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 265a410..96e45f3 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -105,7 +105,7 @@
 #include "vgic.h"
 
 static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu);
-static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu);
+static void vgic_retire_lr(int lr_nr, struct kvm_vcpu *vcpu);
 static struct vgic_lr vgic_get_lr(const struct kvm_vcpu *vcpu, int lr);
 static void vgic_set_lr(struct kvm_vcpu *vcpu, int lr, struct vgic_lr lr_desc);
 static u64 vgic_get_elrsr(struct kvm_vcpu *vcpu);
@@ -717,30 +717,14 @@ void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
 * interrupt then move the active state to the
 * distributor tracking bit.
 */
-   if (lr.state & LR_STATE_ACTIVE) {
+   if (lr.state & LR_STATE_ACTIVE)
vgic_irq_set_active(vcpu, lr.irq);
-   lr.state &= ~LR_STATE_ACTIVE;
-   }
 
/*
 * Reestablish the pending state on the distributor and the
-* CPU interface.  It may have already been pending, but that
-* is fine, then we are only setting a few bits that were
-* already set.
+* CPU interface and mark the LR as free for other use.
 */
-   if (lr.state & LR_STATE_PENDING) {
-   vgic_dist_irq_set_pending(vcpu, lr.irq);
-   lr.state &= ~LR_STATE_PENDING;
-   }
-
-   vgic_set_lr(vcpu, i, lr);
-
-   /*
-* Mark the LR as free for other use.
-*/
-   BUG_ON(lr.state & LR_STATE_MASK);
-   vgic_retire_lr(i, lr.irq, vcpu);
-   vgic_irq_clear_queued(vcpu, lr.irq);
+   vgic_retire_lr(i, vcpu);
 
/* Finally update the VGIC state. */
vgic_update_state(vcpu->kvm);
@@ -1099,16 +1083,18 @@ static inline void vgic_enable(struct kvm_vcpu *vcpu)
vgic_ops->enable(vcpu);
 }
 
-static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu)
+static void vgic_retire_lr(int lr_nr, struct kvm_vcpu *vcpu)
 {
struct vgic_lr vlr = vgic_get_lr(vcpu, lr_nr);
 
+   vgic_irq_clear_queued(vcpu, vlr.irq);
+
/*
 * We must transfer the pending state back to the distributor before
 * retiring the LR, otherwise we may loose edge-triggered interrupts.
 */
if (vlr.state & LR_STATE_PENDING) {
-   vgic_dist_irq_set_pending(vcpu, irq);
+   vgic_dist_irq_set_pending(vcpu, vlr.irq);
vlr.hwirq = 0;
}
 
@@ -1135,11 +1121,8 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu)
for_each_clear_bit(lr, elrsr_ptr, vgic->nr_lr) {
struct vgic_lr vlr = vgic_get_lr(vcpu, lr);
 
-   if (!vgic_irq_is_enabled(vcpu, vlr.irq)) {
-   vgic_retire_lr(lr, vlr.irq, vcpu);
-   if (vgic_irq_is_queued(vcpu, vlr.irq))
-   vgic_irq_clear_queued(vcpu, vlr.irq);
-   }
+   if (!vgic_irq_is_enabled(vcpu, vlr.irq))
+   vgic_retire_lr(lr, vcpu);
}
 }
 
-- 
2.4.4



RE: [PATCH v4 0/3] KVM: arm/arm64: Clean up some obsolete code

2015-11-03 Thread Pavel Fedin
 Hello!

>  By this time i'll make a very minimal version of patch 0001, for you to
> test it. If we have problems with current 0001, which we cannot solve
> quickly, we could stick to that version then, which will provide the
> necessary changes to plug in LPIs, yet with minimal changes (it will only
> remove vgic_irq_lr_map).
>  I guess i should have done it before. Or, i could even respin v5, with
> current 0001 split up. This should make it easier to bisect the problem.

 So, I have just sent v5; the conditions are the same as before. It is OK to
stop at any point, and you should actually be able to easily throw away 0003
and apply just 1, 2 and 4. The minimum needed for introducing LPIs is 0001.
 You can also stick to v4 if the problem is not triggered by its first
patch, if you prefer a shorter commit log.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




RE: [PATCH v4 0/3] KVM: arm/arm64: Clean up some obsolete code

2015-11-02 Thread Pavel Fedin
 Hello!

> I ran this through my test scripts and I'm now quite sure that there's
> some breakage in here.
> 
> One of my tests is running two VMs in parallel, each booting up, running
> hackbench, and then doing reboot (from within the guest), and just
> repeating like that.
> 
> I've run your patches in the above config 100 times, and every time,
> the rebooting VMs got stuck before 50 reboots.
> 
> Without these patches, I could run the above config 100 times, and every
> time, the rebooting VMs passed 200 reboots.

 Huh, the description looks like some problem with
vgic_retire_disabled_irqs(). By the way, who calls it during reboot? The only
call I see is in vgic_handle_enable_reg(), which obviously just processes
emulated register accesses...
 The only thing I know is that in the case of GICv2 the userland resets the
vGIC manually by resetting each register to its default value (therefore all
ENABLERs are set to 0). At least qemu does this; I'm not sure about kvmtool.
In the case of vGICv3 nobody can do this, because there is no API to set
registers yet. So, could we be rebooting with interrupts enabled or something
like that?
 So: what kind of container are you running, and what vGIC version? Does this
problem reproduce with both vGICv2 and vGICv3?

 In the meantime I'll make a very minimal version of patch 0001 for you to
test. If we have problems with the current 0001 which we cannot solve quickly,
we could stick to that version, which will provide the changes necessary to
plug in LPIs with minimal modifications (it will only remove vgic_irq_lr_map).
 I guess I should have done it before. Or I could even respin v5, with the
current 0001 split up. This should make it easier to bisect the problem.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




RE: [PATCH v5 0/7] KVM: arm64: Implement API for vGICv3 live migration

2015-10-28 Thread Pavel Fedin
 Hello!

> > v4 => v5:
> > - Adapted to new API by Peter Maydell, Marc Zyngier and Christoffer Dall.
> >   Acked-by's on the documentation were dropped, just in case, because i
> >   slightly adjusted it. Additionally, i merged all doc updates into one
> >   patch.
> 
> Could you tell us what you changed in the doc patch from the version
> that got sent out with the acks, please?

 Sorry, completely forgot to answer this one...

 The major differences / things to review are:

1. Both GICv2 and GICv3 use the same value of KVM_DEV_ARM_VGIC_CPUID_MASK,
which is extended to 32 bits. So the GICv2 attribute layout also looks like:
--- cut ---
bits:     | 63   ....   32 | 31   ....    0 |
values:   |   vcpu_index   |     offset     |
--- cut ---

2. The KVM_DEV_ARM_VGIC_CPU_SYSREGS documentation originally says (in the
error code description):
--- cut ---
-EBUSY: One or more VCPUs are running
--- cut ---
 while my version says "VCPU is running". Since this is the CPU interface, it
does not affect other CPUs, so for simplicity I check only the current vCPU in
my code.
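
 For reference, the attribute layout from item 1 can be sketched like this.
The EX_* macros and ex_* helpers are hypothetical names used only for
illustration; they are not the constants from the KVM headers.

```c
#include <stdint.h>

/*
 * Layout described in item 1: vcpu_index in bits 63..32, offset in
 * bits 31..0. The names below are hypothetical.
 */
#define EX_VCPU_SHIFT   32
#define EX_OFFSET_MASK  UINT64_C(0xffffffff)

/* Packs a vCPU index and a register offset into one 64-bit attribute. */
static uint64_t ex_make_attr(uint32_t vcpu_index, uint32_t offset)
{
	return ((uint64_t)vcpu_index << EX_VCPU_SHIFT) | offset;
}

/* Extracts the vCPU index from the upper 32 bits. */
static uint32_t ex_attr_vcpu(uint64_t attr)
{
	return (uint32_t)(attr >> EX_VCPU_SHIFT);
}

/* Extracts the register offset from the lower 32 bits. */
static uint32_t ex_attr_offset(uint64_t attr)
{
	return (uint32_t)(attr & EX_OFFSET_MASK);
}
```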

 That's all. Maybe I'm just too careful about fundamentals...

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




RE: [PATCH v3 2/3] KVM: arm/arm64: Merge vgic_set_lr() and vgic_sync_lr_elrsr()

2015-10-27 Thread Pavel Fedin
 Hello!

> > --- cut ---
> > Additionally, remove unnecessary vgic_set_lr() and LR_STATE_PENDING check
> > in vgic_unqueue_irqs(), because all these things are now done by the
> > following vgic_retire_lr().
> > --- cut ---
> 
> This does not explain the question I'm raising.
> 
> After applying this patch, and before applying your next patch,
> unqueueing an IRQ will not restore the pending state on the
> distributor, but just throw that piece of state away

 It will restore the state and not throw it away.
 I guess I was just not clear enough and you misunderstood me. This check in
vgic_unqueue_irqs() is redundant to begin with.
Let's look at the current vgic_retire_lr():
https://git.kernel.org/cgit/linux/kernel/git/kvmarm/kvmarm.git/tree/virt/kvm/arm/vgic.c?h=next#n1099
 It has already done the LR_STATE_PENDING check and pushback by itself since
cff9211eb1a1f58ce7f5a2d596b617928fd4be0e (it's your commit, BTW), so this
check:
https://git.kernel.org/cgit/linux/kernel/git/kvmarm/kvmarm.git/tree/virt/kvm/arm/vgic.c?h=next#n728
 is already redundant. So this is actually a separate change, and perhaps it
was my mistake to squash it in.

> which breaks bisectability and makes it impossible to understand the logic
> by looking at this commit in isolation.

 Would this be better understood if I made this particular refactoring a
separate commit, with a better explanation?

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia



[PATCH v4 1/3] KVM: arm/arm64: Optimize away redundant LR tracking

2015-10-27 Thread Pavel Fedin
Currently we use vgic_irq_lr_map to track which LRs hold which IRQs, and
the lr_used bitmap to track which LRs are used or free.

vgic_irq_lr_map is actually used only for the piggy-back optimization, and
can easily be replaced by iteration over lr_used. This is good because in
the future, when LPI support is introduced, the number of IRQs will grow to
at least 16384, while numbers from 1024 to 8192 are never going to be used.
This would be a huge waste of memory.

In turn, lr_used has also been completely redundant since
ae705930fca6322600690df9dc1c7d0516145a93 ("arm/arm64: KVM: Keep elrsr/aisr
in sync with software model"), because we update elrsr together with
lr_used. This allows us to replace lr_used with elrsr easily, inverting all
conditions (because in elrsr '1' means 'free').
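
The inverted sense can be shown with a small standalone sketch. This is not
the kernel code: struct toy_vgic and the toy_* helpers are hypothetical names
used only for illustration. A set bit in elrsr means the LR is free, so "used"
LRs are found by looking for clear bits:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-VCPU interface state. */
struct toy_vgic {
	uint64_t elrsr;	/* bit n set => list register n is empty (free) */
	int nr_lr;	/* number of implemented list registers (<= 64) */
};

/* Mark an LR as holding an interrupt: clear its "empty" bit. */
static void toy_mark_used(struct toy_vgic *v, int lr)
{
	assert(lr >= 0 && lr < v->nr_lr);
	v->elrsr &= ~(1ULL << lr);
}

/* Mark an LR as free again: set its "empty" bit. */
static void toy_mark_free(struct toy_vgic *v, int lr)
{
	assert(lr >= 0 && lr < v->nr_lr);
	v->elrsr |= 1ULL << lr;
}

/* Count used LRs by counting the clear bits below nr_lr. */
static int toy_count_used(const struct toy_vgic *v)
{
	int lr, used = 0;

	for (lr = 0; lr < v->nr_lr; lr++)
		if (!(v->elrsr & (1ULL << lr)))
			used++;
	return used;
}

/* Return the first free LR (first set bit), or nr_lr if all are used. */
static int toy_first_free(const struct toy_vgic *v)
{
	int lr;

	for (lr = 0; lr < v->nr_lr; lr++)
		if (v->elrsr & (1ULL << lr))
			return lr;
	return v->nr_lr;
}
```

Starting from elrsr = ~0 (all LRs free), which is also what the enable hooks
in this patch set up, no separate "used" bitmap is needed in the sketch.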

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 include/kvm/arm_vgic.h |  6 --
 virt/kvm/arm/vgic-v2.c |  1 +
 virt/kvm/arm/vgic-v3.c |  1 +
 virt/kvm/arm/vgic.c| 53 ++
 4 files changed, 17 insertions(+), 44 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 8065801..3936bf8 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -295,9 +295,6 @@ struct vgic_v3_cpu_if {
 };
 
 struct vgic_cpu {
-   /* per IRQ to LR mapping */
-   u8  *vgic_irq_lr_map;
-
/* Pending/active/both interrupts on this VCPU */
DECLARE_BITMAP(pending_percpu, VGIC_NR_PRIVATE_IRQS);
DECLARE_BITMAP(active_percpu, VGIC_NR_PRIVATE_IRQS);
@@ -308,9 +305,6 @@ struct vgic_cpu {
unsigned long   *active_shared;
unsigned long   *pend_act_shared;
 
-   /* Bitmap of used/free list registers */
-   DECLARE_BITMAP(lr_used, VGIC_V2_MAX_LRS);
-
/* Number of list registers on this CPU */
int nr_lr;
 
diff --git a/virt/kvm/arm/vgic-v2.c b/virt/kvm/arm/vgic-v2.c
index 8d7b04d..c0f5d7f 100644
--- a/virt/kvm/arm/vgic-v2.c
+++ b/virt/kvm/arm/vgic-v2.c
@@ -158,6 +158,7 @@ static void vgic_v2_enable(struct kvm_vcpu *vcpu)
 * anyway.
 */
vcpu->arch.vgic_cpu.vgic_v2.vgic_vmcr = 0;
+   vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr = ~0;
 
/* Get the show on the road... */
vcpu->arch.vgic_cpu.vgic_v2.vgic_hcr = GICH_HCR_EN;
diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
index 7dd5d62..92003cb 100644
--- a/virt/kvm/arm/vgic-v3.c
+++ b/virt/kvm/arm/vgic-v3.c
@@ -193,6 +193,7 @@ static void vgic_v3_enable(struct kvm_vcpu *vcpu)
 * anyway.
 */
vgic_v3->vgic_vmcr = 0;
+   vgic_v3->vgic_elrsr = ~0;
 
/*
 * If we are emulating a GICv3, we do it in an non-GICv2-compatible
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index d4669eb..265a410 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -108,6 +108,7 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu);
 static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu);
 static struct vgic_lr vgic_get_lr(const struct kvm_vcpu *vcpu, int lr);
 static void vgic_set_lr(struct kvm_vcpu *vcpu, int lr, struct vgic_lr lr_desc);
+static u64 vgic_get_elrsr(struct kvm_vcpu *vcpu);
 static struct irq_phys_map *vgic_irq_map_search(struct kvm_vcpu *vcpu,
int virt_irq);
 static int compute_pending_for_cpu(struct kvm_vcpu *vcpu);
@@ -691,9 +692,11 @@ bool vgic_handle_cfg_reg(u32 *reg, struct kvm_exit_mmio *mmio,
 void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
 {
struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+   u64 elrsr = vgic_get_elrsr(vcpu);
+   unsigned long *elrsr_ptr = u64_to_bitmask(&elrsr);
int i;
 
-   for_each_set_bit(i, vgic_cpu->lr_used, vgic_cpu->nr_lr) {
+   for_each_clear_bit(i, elrsr_ptr, vgic_cpu->nr_lr) {
struct vgic_lr lr = vgic_get_lr(vcpu, i);
 
/*
@@ -1098,7 +1101,6 @@ static inline void vgic_enable(struct kvm_vcpu *vcpu)
 
 static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu)
 {
-   struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
struct vgic_lr vlr = vgic_get_lr(vcpu, lr_nr);
 
/*
@@ -1112,8 +1114,6 @@ static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu)
 
vlr.state = 0;
vgic_set_lr(vcpu, lr_nr, vlr);
-   clear_bit(lr_nr, vgic_cpu->lr_used);
-   vgic_cpu->vgic_irq_lr_map[irq] = LR_EMPTY;
vgic_sync_lr_elrsr(vcpu, lr_nr, vlr);
 }
 
@@ -1128,10 +1128,11 @@ static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu)
  */
 static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu)
 {
-   struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+   u64 elrsr = vgic_get_elrsr(vcpu);
+   unsigned long *elrsr_ptr = u64_to_bitmask(&elrsr);
int lr;
 
-   for_each_set_bit(lr, vgic_cpu->lr_used, vgic->nr_lr) {
+   for_each_clear_bit(lr,

[PATCH v4 2/3] KVM: arm/arm64: Clean up vgic_retire_lr() and surroundings

2015-10-27 Thread Pavel Fedin
1. Remove the unnecessary 'irq' argument, because the IRQ number can be
   retrieved from the LR.
2. Since cff9211eb1a1f58ce7f5a2d596b617928fd4be0e
   ("arm/arm64: KVM: Fix arch timer behavior for disabled interrupts")
   LR_STATE_PENDING is queued back by vgic_retire_lr() itself. It also
   clears vlr.state itself. Therefore, we remove the same, now duplicated,
   check with all accompanying bit manipulations from vgic_unqueue_irqs().
3. vgic_retire_lr() is always accompanied by vgic_irq_clear_queued(). Since
   it already does more than just clearing the LR, move
   vgic_irq_clear_queued() inside of it.

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 virt/kvm/arm/vgic.c | 37 ++---
 1 file changed, 10 insertions(+), 27 deletions(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 265a410..96e45f3 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -105,7 +105,7 @@
 #include "vgic.h"
 
 static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu);
-static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu);
+static void vgic_retire_lr(int lr_nr, struct kvm_vcpu *vcpu);
 static struct vgic_lr vgic_get_lr(const struct kvm_vcpu *vcpu, int lr);
 static void vgic_set_lr(struct kvm_vcpu *vcpu, int lr, struct vgic_lr lr_desc);
 static u64 vgic_get_elrsr(struct kvm_vcpu *vcpu);
@@ -717,30 +717,14 @@ void vgic_unqueue_irqs(struct kvm_vcpu *vcpu)
 * interrupt then move the active state to the
 * distributor tracking bit.
 */
-   if (lr.state & LR_STATE_ACTIVE) {
+   if (lr.state & LR_STATE_ACTIVE)
vgic_irq_set_active(vcpu, lr.irq);
-   lr.state &= ~LR_STATE_ACTIVE;
-   }
 
/*
 * Reestablish the pending state on the distributor and the
-* CPU interface.  It may have already been pending, but that
-* is fine, then we are only setting a few bits that were
-* already set.
+* CPU interface and mark the LR as free for other use.
 */
-   if (lr.state & LR_STATE_PENDING) {
-   vgic_dist_irq_set_pending(vcpu, lr.irq);
-   lr.state &= ~LR_STATE_PENDING;
-   }
-
-   vgic_set_lr(vcpu, i, lr);
-
-   /*
-* Mark the LR as free for other use.
-*/
-   BUG_ON(lr.state & LR_STATE_MASK);
-   vgic_retire_lr(i, lr.irq, vcpu);
-   vgic_irq_clear_queued(vcpu, lr.irq);
+   vgic_retire_lr(i, vcpu);
 
/* Finally update the VGIC state. */
vgic_update_state(vcpu->kvm);
@@ -1099,16 +1083,18 @@ static inline void vgic_enable(struct kvm_vcpu *vcpu)
vgic_ops->enable(vcpu);
 }
 
-static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu)
+static void vgic_retire_lr(int lr_nr, struct kvm_vcpu *vcpu)
 {
struct vgic_lr vlr = vgic_get_lr(vcpu, lr_nr);
 
+   vgic_irq_clear_queued(vcpu, vlr.irq);
+
/*
 * We must transfer the pending state back to the distributor before
 * retiring the LR, otherwise we may loose edge-triggered interrupts.
 */
if (vlr.state & LR_STATE_PENDING) {
-   vgic_dist_irq_set_pending(vcpu, irq);
+   vgic_dist_irq_set_pending(vcpu, vlr.irq);
vlr.hwirq = 0;
}
 
@@ -1135,11 +1121,8 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu)
for_each_clear_bit(lr, elrsr_ptr, vgic->nr_lr) {
struct vgic_lr vlr = vgic_get_lr(vcpu, lr);
 
-   if (!vgic_irq_is_enabled(vcpu, vlr.irq)) {
-   vgic_retire_lr(lr, vlr.irq, vcpu);
-   if (vgic_irq_is_queued(vcpu, vlr.irq))
-   vgic_irq_clear_queued(vcpu, vlr.irq);
-   }
+   if (!vgic_irq_is_enabled(vcpu, vlr.irq))
+   vgic_retire_lr(lr, vcpu);
}
 }
 
-- 
2.4.4



[PATCH v4 3/3] KVM: arm/arm64: Merge vgic_set_lr() and vgic_sync_lr_elrsr()

2015-10-27 Thread Pavel Fedin
Now we see that vgic_set_lr() and vgic_sync_lr_elrsr() are always used
together. Merge them into one function, which saves a second vgic_ops
dereference on every call.
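
The resulting shape can be sketched in a standalone form as follows. This is
not the kernel code: the toy_* names, the fixed number of LRs and the state
mask value are assumptions made for illustration, and clearing the bit in the
non-empty case is inferred from the "'1' means 'free'" convention. The point is
only that the setter stores the LR and refreshes its elrsr bit in one call:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values, standing in for LR_STATE_MASK and the LR count. */
#define TOY_LR_STATE_MASK 3u
#define TOY_MAX_LRS       4

/* Toy list register descriptor. */
struct toy_lr {
	int irq;
	unsigned int state;
};

/* Toy CPU interface: the LRs plus the "empty LR" status bitmap. */
struct toy_cpu_if {
	struct toy_lr lr[TOY_MAX_LRS];
	uint64_t elrsr;	/* bit n set => LR n is empty */
};

/*
 * Store the LR and keep elrsr in sync in the same call, so callers
 * cannot forget the second step.
 */
static void toy_set_lr(struct toy_cpu_if *c, int n, struct toy_lr desc)
{
	assert(n >= 0 && n < TOY_MAX_LRS);

	c->lr[n] = desc;

	if (!(desc.state & TOY_LR_STATE_MASK))
		c->elrsr |= 1ULL << n;		/* no state left: LR is free */
	else
		c->elrsr &= ~(1ULL << n);	/* LR holds an interrupt */
}
```

In this sketch every caller that writes an LR gets a consistent elrsr for
free, which mirrors why the separate sync calls are dropped below.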

Signed-off-by: Pavel Fedin <p.fe...@samsung.com>
---
 include/kvm/arm_vgic.h |  1 -
 virt/kvm/arm/vgic-v2.c |  5 -
 virt/kvm/arm/vgic-v3.c |  5 -
 virt/kvm/arm/vgic.c| 14 ++
 4 files changed, 2 insertions(+), 23 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 3936bf8..f62addc 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -112,7 +112,6 @@ struct vgic_vmcr {
 struct vgic_ops {
struct vgic_lr  (*get_lr)(const struct kvm_vcpu *, int);
void(*set_lr)(struct kvm_vcpu *, int, struct vgic_lr);
-   void(*sync_lr_elrsr)(struct kvm_vcpu *, int, struct vgic_lr);
u64 (*get_elrsr)(const struct kvm_vcpu *vcpu);
u64 (*get_eisr)(const struct kvm_vcpu *vcpu);
void(*clear_eisr)(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/arm/vgic-v2.c b/virt/kvm/arm/vgic-v2.c
index c0f5d7f..ff02f08 100644
--- a/virt/kvm/arm/vgic-v2.c
+++ b/virt/kvm/arm/vgic-v2.c
@@ -79,11 +79,7 @@ static void vgic_v2_set_lr(struct kvm_vcpu *vcpu, int lr,
lr_val |= (lr_desc.source << GICH_LR_PHYSID_CPUID_SHIFT);
 
vcpu->arch.vgic_cpu.vgic_v2.vgic_lr[lr] = lr_val;
-}
 
-static void vgic_v2_sync_lr_elrsr(struct kvm_vcpu *vcpu, int lr,
- struct vgic_lr lr_desc)
-{
if (!(lr_desc.state & LR_STATE_MASK))
vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr |= (1ULL << lr);
else
@@ -167,7 +163,6 @@ static void vgic_v2_enable(struct kvm_vcpu *vcpu)
 static const struct vgic_ops vgic_v2_ops = {
.get_lr = vgic_v2_get_lr,
.set_lr = vgic_v2_set_lr,
-   .sync_lr_elrsr  = vgic_v2_sync_lr_elrsr,
.get_elrsr  = vgic_v2_get_elrsr,
.get_eisr   = vgic_v2_get_eisr,
.clear_eisr = vgic_v2_clear_eisr,
diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
index 92003cb..487d635 100644
--- a/virt/kvm/arm/vgic-v3.c
+++ b/virt/kvm/arm/vgic-v3.c
@@ -112,11 +112,7 @@ static void vgic_v3_set_lr(struct kvm_vcpu *vcpu, int lr,
}
 
vcpu->arch.vgic_cpu.vgic_v3.vgic_lr[LR_INDEX(lr)] = lr_val;
-}
 
-static void vgic_v3_sync_lr_elrsr(struct kvm_vcpu *vcpu, int lr,
- struct vgic_lr lr_desc)
-{
if (!(lr_desc.state & LR_STATE_MASK))
vcpu->arch.vgic_cpu.vgic_v3.vgic_elrsr |= (1U << lr);
else
@@ -212,7 +208,6 @@ static void vgic_v3_enable(struct kvm_vcpu *vcpu)
 static const struct vgic_ops vgic_v3_ops = {
.get_lr = vgic_v3_get_lr,
.set_lr = vgic_v3_set_lr,
-   .sync_lr_elrsr  = vgic_v3_sync_lr_elrsr,
.get_elrsr  = vgic_v3_get_elrsr,
.get_eisr   = vgic_v3_get_eisr,
.clear_eisr = vgic_v3_clear_eisr,
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 96e45f3..fe451d4 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1032,12 +1032,6 @@ static void vgic_set_lr(struct kvm_vcpu *vcpu, int lr,
vgic_ops->set_lr(vcpu, lr, vlr);
 }
 
-static void vgic_sync_lr_elrsr(struct kvm_vcpu *vcpu, int lr,
-  struct vgic_lr vlr)
-{
-   vgic_ops->sync_lr_elrsr(vcpu, lr, vlr);
-}
-
 static inline u64 vgic_get_elrsr(struct kvm_vcpu *vcpu)
 {
return vgic_ops->get_elrsr(vcpu);
@@ -1100,7 +1094,6 @@ static void vgic_retire_lr(int lr_nr, struct kvm_vcpu *vcpu)
 
vlr.state = 0;
vgic_set_lr(vcpu, lr_nr, vlr);
-   vgic_sync_lr_elrsr(vcpu, lr_nr, vlr);
 }
 
 /*
@@ -1162,7 +1155,6 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
}
 
vgic_set_lr(vcpu, lr_nr, vlr);
-   vgic_sync_lr_elrsr(vcpu, lr_nr, vlr);
 }
 
 /*
@@ -1340,8 +1332,6 @@ static int process_queued_irq(struct kvm_vcpu *vcpu,
vlr.hwirq = 0;
vgic_set_lr(vcpu, lr, vlr);
 
-   vgic_sync_lr_elrsr(vcpu, lr, vlr);
-
return pending;
 }
 
@@ -1442,8 +1432,6 @@ static void __kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu)
bool level_pending;
 
level_pending = vgic_process_maintenance(vcpu);
-   elrsr = vgic_get_elrsr(vcpu);
-   elrsr_ptr = u64_to_bitmask(&elrsr);
 
/* Deal with HW interrupts, and clear mappings for empty LRs */
for (lr = 0; lr < vgic->nr_lr; lr++) {
@@ -1454,6 +1442,8 @@ static void __kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu)
}
 
/* Check if we still have something up our sleeve... */
+   elrsr = vgic_get_elrsr(vcpu);
+   elrsr_ptr = u64_to_bitmask(&elrsr);
pending = find_first_zero_bit(elrsr_ptr, vgic->nr_lr);
if (level_pending || pending < vgic->nr_lr)
   
