KVM memory slots limit on powerpc

2015-09-04 Thread Thomas Huth

 Hi all,

now that we get memory hotplugging for the spapr machine on qemu-ppc,
too, it seems like we easily can hit the amount of KVM-internal memory
slots now ("#define KVM_USER_MEM_SLOTS 32" in
arch/powerpc/include/asm/kvm_host.h). For example, start
qemu-system-ppc64 with a couple of "-device secondary-vga" and "-m
4G,slots=32,maxmem=40G" and then try to hot-plug all 32 DIMMs ... and
you'll see that it aborts way earlier already.

The x86 code already increased the amount of KVM_USER_MEM_SLOTS to 509
already (+3 internal slots = 512) ... maybe we should now increase the
amount of slots on powerpc, too? Since we don't use internal slots on
POWER, would 512 be a good value? Or would less be sufficient, too?

 Thomas
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


KVM memory slots limit on powerpc

2015-09-04 Thread Thomas Huth

 Hi all,

now that we get memory hotplugging for the spapr machine on qemu-ppc,
too, it seems like we easily can hit the amount of KVM-internal memory
slots now ("#define KVM_USER_MEM_SLOTS 32" in
arch/powerpc/include/asm/kvm_host.h). For example, start
qemu-system-ppc64 with a couple of "-device secondary-vga" and "-m
4G,slots=32,maxmem=40G" and then try to hot-plug all 32 DIMMs ... and
you'll see that it aborts way earlier already.

The x86 code already increased the amount of KVM_USER_MEM_SLOTS to 509
already (+3 internal slots = 512) ... maybe we should now increase the
amount of slots on powerpc, too? Since we don't use internal slots on
POWER, would 512 be a good value? Or would less be sufficient, too?

 Thomas
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: KVM memory slots limit on powerpc

2015-09-04 Thread Christian Borntraeger
Am 04.09.2015 um 11:35 schrieb Thomas Huth:
> 
>  Hi all,
> 
> now that we get memory hotplugging for the spapr machine on qemu-ppc,
> too, it seems like we easily can hit the amount of KVM-internal memory
> slots now ("#define KVM_USER_MEM_SLOTS 32" in
> arch/powerpc/include/asm/kvm_host.h). For example, start
> qemu-system-ppc64 with a couple of "-device secondary-vga" and "-m
> 4G,slots=32,maxmem=40G" and then try to hot-plug all 32 DIMMs ... and
> you'll see that it aborts way earlier already.
> 
> The x86 code already increased the amount of KVM_USER_MEM_SLOTS to 509
> already (+3 internal slots = 512) ... maybe we should now increase the
> amount of slots on powerpc, too? Since we don't use internal slots on
> POWER, would 512 be a good value? Or would less be sufficient, too?

When you are at it, the s390 value should also be increased I guess.

Christian

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-ppc] KVM memory slots limit on powerpc

2015-09-04 Thread Thomas Huth
On 04/09/15 12:07, Christian Borntraeger wrote:
> Am 04.09.2015 um 12:04 schrieb Alexander Graf:
>>
>>
>> On 04.09.15 11:59, Christian Borntraeger wrote:
>>> Am 04.09.2015 um 11:35 schrieb Thomas Huth:

  Hi all,

 now that we get memory hotplugging for the spapr machine on qemu-ppc,
 too, it seems like we easily can hit the amount of KVM-internal memory
 slots now ("#define KVM_USER_MEM_SLOTS 32" in
 arch/powerpc/include/asm/kvm_host.h). For example, start
 qemu-system-ppc64 with a couple of "-device secondary-vga" and "-m
 4G,slots=32,maxmem=40G" and then try to hot-plug all 32 DIMMs ... and
 you'll see that it aborts way earlier already.

 The x86 code already increased the amount of KVM_USER_MEM_SLOTS to 509
 already (+3 internal slots = 512) ... maybe we should now increase the
 amount of slots on powerpc, too? Since we don't use internal slots on
 POWER, would 512 be a good value? Or would less be sufficient, too?
>>>
>>> When you are at it, the s390 value should also be increased I guess.
>>
>> That constant defines the array size for the memslot array in struct kvm
>> which in turn again gets allocated by kzalloc, so it's pinned kernel
>> memory that is physically contiguous. Doing big allocations can turn
>> into problems during runtime.
>>
>> So maybe there is another way? Can we extend the memslot array size
>> dynamically somehow? Allocate it separately? How much memory does the
>> memslot array use up with 512 entries?
> 
> Maybe some rcu protected scheme that doubles the amount of memslots for
> each overrun? Yes, that would be good and even reduce the footprint for
> systems with only a small number of memslots.

Seems like Alex Williamson already posted a patchset for growable
memslots a couple of years ago:

 http://www.spinics.net/lists/kvm/msg50491.html

But I didn't quite spot the result in that thread why it never has been
included upstream. Alex (W.), do you remember the outcome?

 Thomas


--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvmtool Makefile: relax arm test

2015-09-04 Thread Riku Voipio
On 4 September 2015 at 13:10, Andre Przywara  wrote:
> Hi Riku,
>
> On 03/09/15 12:20, riku.voi...@linaro.org wrote:
>> From: Riku Voipio 
>>
>> Currently Makefile accepts only armv7l.* When building kvmtool under 32bit
>> personality on Aarch64 machines, uname -m reports "armv8l", so build fails.
>> We expect doing 32bit arm builds in Aarch64 to become standard the same way
>> people do i386 builds on x86_64 machines.
>>
>> Make the sed test a little more greedy so armv8l becomes acceptable.
>>
>> Signed-off-by: Riku Voipio 
>
> The patch looks OK to me, I just wonder how you do the actual build
> within the linux32 environment?
> Do you have an arm cross compiler installed and set CROSS_COMPILE? Or is
> there a magic compiler (driver) which uses uname -m as well?
> And what would be the difference to setting ARCH=arm as well? Just
> convenience?

It's just an arm32 chroot, with an native arm32 compiler. The chroot
is on an arm64 machine since these tend to be much faster than arm32
hardware.

It would of course be possible to set ARCH=arm, but that would mean
some ifdefs in the Debian packaging, since the same build rule should
work for all architectures.

Riku
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvmtool Makefile: relax arm test

2015-09-04 Thread Andre Przywara
Hi Riku,

On 04/09/15 11:52, Riku Voipio wrote:
> On 4 September 2015 at 13:10, Andre Przywara  wrote:
>> Hi Riku,
>>
>> On 03/09/15 12:20, riku.voi...@linaro.org wrote:
>>> From: Riku Voipio 
>>>
>>> Currently Makefile accepts only armv7l.* When building kvmtool under 32bit
>>> personality on Aarch64 machines, uname -m reports "armv8l", so build fails.
>>> We expect doing 32bit arm builds in Aarch64 to become standard the same way
>>> people do i386 builds on x86_64 machines.
>>>
>>> Make the sed test a little more greedy so armv8l becomes acceptable.
>>>
>>> Signed-off-by: Riku Voipio 
>>
>> The patch looks OK to me, I just wonder how you do the actual build
>> within the linux32 environment?
>> Do you have an arm cross compiler installed and set CROSS_COMPILE? Or is
>> there a magic compiler (driver) which uses uname -m as well?
>> And what would be the difference to setting ARCH=arm as well? Just
>> convenience?
> 
> It's just an arm32 chroot, with an native arm32 compiler. The chroot
> is on an arm64 machine since these tend to be much faster than arm32
> hardware.

Oh right, a chroot, didn't think about the obvious ;-)
Also it applies to 64-bit kernels with 32-bit root filesystems, I think.
So:

Acked-by: Andre Przywara 

Cheers,
Andre.

> 
> It would of course be possible to set ARCH=arm, but that would mean
> some ifdefs in the Debian packaging, since the same build rule should
> work for all architectures.
> 
> Riku
> 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [PATCH v2 07/18] nvdimm: reserve address range for NVDIMM

2015-09-04 Thread Igor Mammedov
On Fri, 14 Aug 2015 22:52:00 +0800
Xiao Guangrong  wrote:

> NVDIMM reserves all the free range above 4G to do:
> - Persistent Memory (PMEM) mapping
> - implement NVDIMM ACPI device _DSM method
> 
> Signed-off-by: Xiao Guangrong 
> ---
>  hw/i386/pc.c   | 12 ++--
>  hw/mem/nvdimm/pc-nvdimm.c  | 13 +
>  include/hw/mem/pc-nvdimm.h |  1 +
>  3 files changed, 24 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 7661ea9..41af6ea 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -64,6 +64,7 @@
>  #include "hw/pci/pci_host.h"
>  #include "acpi-build.h"
>  #include "hw/mem/pc-dimm.h"
> +#include "hw/mem/pc-nvdimm.h"
>  #include "qapi/visitor.h"
>  #include "qapi-visit.h"
>  
> @@ -1302,6 +1303,7 @@ FWCfgState *pc_memory_init(MachineState *machine,
>  MemoryRegion *ram_below_4g, *ram_above_4g;
>  FWCfgState *fw_cfg;
>  PCMachineState *pcms = PC_MACHINE(machine);
> +ram_addr_t offset;
>  
>  assert(machine->ram_size == below_4g_mem_size + above_4g_mem_size);
>  
> @@ -1339,6 +1341,8 @@ FWCfgState *pc_memory_init(MachineState *machine,
>  exit(EXIT_FAILURE);
>  }
>  
> +offset = 0x1ULL + above_4g_mem_size;
> +
>  /* initialize hotplug memory address space */
>  if (guest_info->has_reserved_memory &&
>  (machine->ram_size < machine->maxram_size)) {
> @@ -1358,8 +1362,7 @@ FWCfgState *pc_memory_init(MachineState *machine,
>  exit(EXIT_FAILURE);
>  }
>  
> -pcms->hotplug_memory.base =
> -ROUND_UP(0x1ULL + above_4g_mem_size, 1ULL << 30);
> +pcms->hotplug_memory.base = ROUND_UP(offset, 1ULL << 30);
>  
>  if (pcms->enforce_aligned_dimm) {
>  /* size hotplug region assuming 1G page max alignment per slot */
> @@ -1377,8 +1380,13 @@ FWCfgState *pc_memory_init(MachineState *machine,
> "hotplug-memory", hotplug_mem_size);
>  memory_region_add_subregion(system_memory, pcms->hotplug_memory.base,
>  >hotplug_memory.mr);
> +
> +offset = pcms->hotplug_memory.base + hotplug_mem_size;
>  }
>  
> + /* all the space left above 4G is reserved for NVDIMM. */
> +pc_nvdimm_reserve_range(offset);
I'd drop 'offset' in this patch and just use:
  foo(pcms->hotplug_memory.base + hotplug_mem_size)

> +
>  /* Initialize PC system firmware */
>  pc_system_firmware_init(rom_memory, guest_info->isapc_ram_fw);
>  
> diff --git a/hw/mem/nvdimm/pc-nvdimm.c b/hw/mem/nvdimm/pc-nvdimm.c
> index a53d235..7a270a8 100644
> --- a/hw/mem/nvdimm/pc-nvdimm.c
> +++ b/hw/mem/nvdimm/pc-nvdimm.c
> @@ -24,6 +24,19 @@
>  
>  #include "hw/mem/pc-nvdimm.h"
>  
> +#define PAGE_SIZE  (1UL << 12)
> +
> +static struct nvdimms_info {
> +ram_addr_t current_addr;
> +} nvdimms_info;
no globals please, so far it looks like pcms->hotplug_memory
so add asimmilar nvdimm_memory field to PCMachineState

> +
> +/* the address range [offset, ~0ULL) is reserved for NVDIMM. */
> +void pc_nvdimm_reserve_range(ram_addr_t offset)
do you plan to reuse this function, if not then just inline it at call site

> +{
> +offset = ROUND_UP(offset, PAGE_SIZE);
I'd suggest round up to 1Gb as we do with mem hotplug

> +nvdimms_info.current_addr = offset;
> +}
> +
>  static char *get_file(Object *obj, Error **errp)
>  {
>  PCNVDIMMDevice *nvdimm = PC_NVDIMM(obj);
> diff --git a/include/hw/mem/pc-nvdimm.h b/include/hw/mem/pc-nvdimm.h
> index 51152b8..8601e9b 100644
> --- a/include/hw/mem/pc-nvdimm.h
> +++ b/include/hw/mem/pc-nvdimm.h
> @@ -28,4 +28,5 @@ typedef struct PCNVDIMMDevice {
>  #define PC_NVDIMM(obj) \
>  OBJECT_CHECK(PCNVDIMMDevice, (obj), TYPE_PC_NVDIMM)
>  
> +void pc_nvdimm_reserve_range(ram_addr_t offset);
>  #endif

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH kvmtool] Make static libc and guest-init functionality optional.

2015-09-04 Thread Dimitri John Ledkov
If one typically only boots full disk-images, one wouldn't necessaraly
want to statically link glibc, for the guest-init feature of the
kvmtool. As statically linked glibc triggers haevy security
maintainance.

Signed-off-by: Dimitri John Ledkov 
---
 Makefile| 11 ++-
 builtin-run.c   |  7 +++
 builtin-setup.c |  7 +++
 3 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/Makefile b/Makefile
index 1534e6f..42a629a 100644
--- a/Makefile
+++ b/Makefile
@@ -34,8 +34,6 @@ bindir_SQ = $(subst ','\'',$(bindir))
 PROGRAM:= lkvm
 PROGRAM_ALIAS := vm
 
-GUEST_INIT := guest/init
-
 OBJS   += builtin-balloon.o
 OBJS   += builtin-debug.o
 OBJS   += builtin-help.o
@@ -279,8 +277,12 @@ ifeq ($(LTO),1)
endif
 endif
 
-ifneq ($(call try-build,$(SOURCE_STATIC),,-static),y)
-$(error No static libc found. Please install glibc-static package.)
+ifeq ($(call try-build,$(SOURCE_STATIC),,-static),y)
+   CFLAGS  += -DCONFIG_HAS_LIBC
+   GUEST_INIT := guest/init
+   GUEST_OBJS = guest/guest_init.o
+else
+   NOTFOUND+= static-libc
 endif
 
 ifeq (y,$(ARCH_WANT_LIBFDT))
@@ -356,7 +358,6 @@ c_flags = -Wp,-MD,$(depfile) $(CFLAGS)
 # $(OTHEROBJS) are things that do not get substituted like this.
 #
 STATIC_OBJS = $(patsubst %.o,%.static.o,$(OBJS) $(OBJS_STATOPT))
-GUEST_OBJS = guest/guest_init.o
 
 $(PROGRAM)-static:  $(STATIC_OBJS) $(OTHEROBJS) $(GUEST_INIT)
$(E) "  LINK" $@
diff --git a/builtin-run.c b/builtin-run.c
index 1ee75ad..0f67471 100644
--- a/builtin-run.c
+++ b/builtin-run.c
@@ -59,8 +59,13 @@ static int  kvm_run_wrapper;
 
 bool do_debug_print = false;
 
+#ifdef CONFIG_HAS_LIBC
 extern char _binary_guest_init_start;
 extern char _binary_guest_init_size;
+#else
+static char _binary_guest_init_start=0;
+static char _binary_guest_init_size=0;
+#endif
 
 static const char * const run_usage[] = {
"lkvm run [] []",
@@ -354,6 +359,8 @@ static int kvm_setup_guest_init(struct kvm *kvm)
char *data;
 
/* Setup /virt/init */
+   if (!_binary_guest_init_size)
+   die("Guest init not compiled");
size = (size_t)&_binary_guest_init_size;
data = (char *)&_binary_guest_init_start;
snprintf(tmp, PATH_MAX, "%s%s/virt/init", kvm__get_dir(), rootfs);
diff --git a/builtin-setup.c b/builtin-setup.c
index 8b45c56..d77e5e0 100644
--- a/builtin-setup.c
+++ b/builtin-setup.c
@@ -16,8 +16,13 @@
 #include 
 #include 
 
+#ifdef CONFIG_HAS_LIBC
 extern char _binary_guest_init_start;
 extern char _binary_guest_init_size;
+#else
+static char _binary_guest_init_start=0;
+static char _binary_guest_init_size=0;
+#endif
 
 static const char *instance_name;
 
@@ -131,6 +136,8 @@ static int copy_init(const char *guestfs_name)
int fd, ret;
char *data;
 
+   if (!_binary_guest_init_size)
+   die("Guest init not compiled");
size = (size_t)&_binary_guest_init_size;
data = (char *)&_binary_guest_init_start;
snprintf(path, PATH_MAX, "%s%s/virt/init", kvm__get_dir(), 
guestfs_name);
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 3/5] KVM: arm64: Refactor system register handlers

2015-09-04 Thread Pavel Fedin
Replace Rt with data pointer in struct sys_reg_params. This will allow to
reuse system register handling code in implementation of vGICv3 CPU
interface access API. Additionally, got rid of "massive hack"
in kvm_handle_cp_64().

Signed-off-by: Pavel Fedin 
---
 arch/arm64/kvm/sys_regs.c| 61 +---
 arch/arm64/kvm/sys_regs.h|  4 +--
 arch/arm64/kvm/sys_regs_generic_v8.c |  2 +-
 3 files changed, 32 insertions(+), 35 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index b41607d..fe6b517 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -102,7 +102,7 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu,
 
BUG_ON(!p->is_write);
 
-   val = *vcpu_reg(vcpu, p->Rt);
+   val = *p->val;
if (!p->is_aarch32) {
vcpu_sys_reg(vcpu, r->reg) = val;
} else {
@@ -125,13 +125,10 @@ static bool access_gic_sgi(struct kvm_vcpu *vcpu,
   const struct sys_reg_params *p,
   const struct sys_reg_desc *r)
 {
-   u64 val;
-
if (!p->is_write)
return read_from_write_only(vcpu, p);
 
-   val = *vcpu_reg(vcpu, p->Rt);
-   vgic_v3_dispatch_sgi(vcpu, val);
+   vgic_v3_dispatch_sgi(vcpu, *p->val);
 
return true;
 }
@@ -153,7 +150,7 @@ static bool trap_oslsr_el1(struct kvm_vcpu *vcpu,
if (p->is_write) {
return ignore_write(vcpu, p);
} else {
-   *vcpu_reg(vcpu, p->Rt) = (1 << 3);
+   *p->val = (1 << 3);
return true;
}
 }
@@ -167,7 +164,7 @@ static bool trap_dbgauthstatus_el1(struct kvm_vcpu *vcpu,
} else {
u32 val;
asm volatile("mrs %0, dbgauthstatus_el1" : "=r" (val));
-   *vcpu_reg(vcpu, p->Rt) = val;
+   *p->val = val;
return true;
}
 }
@@ -204,13 +201,13 @@ static bool trap_debug_regs(struct kvm_vcpu *vcpu,
const struct sys_reg_desc *r)
 {
if (p->is_write) {
-   vcpu_sys_reg(vcpu, r->reg) = *vcpu_reg(vcpu, p->Rt);
+   vcpu_sys_reg(vcpu, r->reg) = *p->val;
vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
} else {
-   *vcpu_reg(vcpu, p->Rt) = vcpu_sys_reg(vcpu, r->reg);
+   *p->val = vcpu_sys_reg(vcpu, r->reg);
}
 
-   trace_trap_reg(__func__, r->reg, p->is_write, *vcpu_reg(vcpu, p->Rt));
+   trace_trap_reg(__func__, r->reg, p->is_write, *p->val);
 
return true;
 }
@@ -228,7 +225,7 @@ static inline void reg_to_dbg(struct kvm_vcpu *vcpu,
  const struct sys_reg_params *p,
  u64 *dbg_reg)
 {
-   u64 val = *vcpu_reg(vcpu, p->Rt);
+   u64 val = *p->val;
 
if (p->is_32bit) {
val &= 0xUL;
@@ -248,7 +245,7 @@ static inline void dbg_to_reg(struct kvm_vcpu *vcpu,
if (p->is_32bit)
val &= 0xUL;
 
-   *vcpu_reg(vcpu, p->Rt) = val;
+   *p->val = val;
 }
 
 static inline bool trap_bvr(struct kvm_vcpu *vcpu,
@@ -704,10 +701,10 @@ static bool trap_dbgidr(struct kvm_vcpu *vcpu,
u64 pfr = read_cpuid(ID_AA64PFR0_EL1);
u32 el3 = !!((pfr >> 12) & 0xf);
 
-   *vcpu_reg(vcpu, p->Rt) = dfr >> 20) & 0xf) << 28) |
- (((dfr >> 12) & 0xf) << 24) |
- (((dfr >> 28) & 0xf) << 20) |
- (6 << 16) | (el3 << 14) | (el3 << 
12));
+   *p->val = dfr >> 20) & 0xf) << 28) |
+  (((dfr >> 12) & 0xf) << 24) |
+  (((dfr >> 28) & 0xf) << 20) |
+  (6 << 16) | (el3 << 14) | (el3 << 12));
return true;
}
 }
@@ -717,10 +714,10 @@ static bool trap_debug32(struct kvm_vcpu *vcpu,
 const struct sys_reg_desc *r)
 {
if (p->is_write) {
-   vcpu_cp14(vcpu, r->reg) = *vcpu_reg(vcpu, p->Rt);
+   vcpu_cp14(vcpu, r->reg) = *p->val;
vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
} else {
-   *vcpu_reg(vcpu, p->Rt) = vcpu_cp14(vcpu, r->reg);
+   *p->val = vcpu_cp14(vcpu, r->reg);
}
 
return true;
@@ -747,12 +744,12 @@ static inline bool trap_xvr(struct kvm_vcpu *vcpu,
u64 val = *dbg_reg;
 
val &= 0xUL;
-   val |= *vcpu_reg(vcpu, p->Rt) << 32;
+   val |= *p->val << 32;
*dbg_reg = val;
 
vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
} else {
-   *vcpu_reg(vcpu, p->Rt) = *dbg_reg >> 32;
+   *p->val = *dbg_reg >> 32;
}
 
trace_trap_reg(__func__, rd->reg, p->is_write, 

[PATCH v3 5/5] KVM: arm64: Implement vGICv3 CPU interface access

2015-09-04 Thread Pavel Fedin
The access is done similar to GICv2, using KVM_DEV_ARM_VGIC_GRP_CPU_REGS
group, however attribute ID encodes corresponding system register. Access
size is always 64 bits.

Since CPU interface state actually affects only a single vCPU, no vGIC
locking is done. Just made sure that the vCPU is not running.

Signed-off-by: Pavel Fedin 
---
 Documentation/virtual/kvm/devices/arm-vgic.txt |  38 +++-
 arch/arm64/include/uapi/asm/kvm.h  |   7 +
 include/linux/irqchip/arm-gic-v3.h |  18 +-
 virt/kvm/arm/vgic-v3-emul.c| 244 +
 4 files changed, 303 insertions(+), 4 deletions(-)

diff --git a/Documentation/virtual/kvm/devices/arm-vgic.txt 
b/Documentation/virtual/kvm/devices/arm-vgic.txt
index 03f640f..518b634 100644
--- a/Documentation/virtual/kvm/devices/arm-vgic.txt
+++ b/Documentation/virtual/kvm/devices/arm-vgic.txt
@@ -88,7 +88,7 @@ Groups:
 -ENODEV: Getting or setting this register is not yet supported
 -EBUSY: One or more VCPUs are running
 
-  KVM_DEV_ARM_VGIC_GRP_CPU_REGS
+  KVM_DEV_ARM_VGIC_GRP_CPU_REGS (vGICv2)
   Attributes:
 The attr field of kvm_device_attr encodes two values:
 bits: | 63     52 | 51 ..  32  |  31   0 |
@@ -116,11 +116,45 @@ Groups:
 
   Limitations:
 - Priorities are not implemented, and registers are RAZ/WI
-- Currently only implemented for KVM_DEV_TYPE_ARM_VGIC_V2.
   Errors:
 -ENODEV: Getting or setting this register is not yet supported
 -EBUSY: One or more VCPUs are running
 
+  KVM_DEV_ARM_VGIC_GRP_CPU_REGS (vGICv3)
+  Attributes:
+The attr field of kvm_device_attr encodes the following values:
+bits:   | 63 .. 56 | 55 .. 48 | 47 ... 40 | 39 .. 32 | 31 .. 0 |
+values: |   arch   |   size   | reserved  |  cpu id  |  reg id |
+
+All CPU interface regs are (rw, 64-bit). The only supported size value is
+KVM_REG_SIZE_U64.
+
+Arch, size and reg id fields actually encode system register to be
+accessed. Normally these values are obtained using  ARM64_SYS_REG() macro.
+Getting or setting such a register has the same effect as reading or
+writing the register on the actual hardware.
+
+The Active Priorities Registers AP0Rn and AP1Rn are implementation defined,
+so we set a fixed format for our implementation that fits with the model of
+a "GICv3 implementation without the security extensions" which we present
+to the guest. This interface always exposes four register APR[0-3]
+describing the maximum possible 128 preemption levels. The semantics of the
+register indicates if any interrupts in a given preemption level are in the
+active state by setting the corresponding bit.
+
+Thus, preemption level X has one or more active interrupts if and only if:
+
+  APRn[X mod 32] == 0b1,  where n = X / 32
+
+Bits for undefined preemption levels are RAZ/WI.
+
+  Limitations:
+- Priorities are not implemented, and registers are RAZ/WI
+  Errors:
+-ENODEV: Getting or setting this register is not yet supported
+-EBUSY: One or more VCPUs are running
+
+
   KVM_DEV_ARM_VGIC_GRP_NR_IRQS
   Attributes:
 A value describing the number of interrupts (SGI, PPI and SPI) for
diff --git a/arch/arm64/include/uapi/asm/kvm.h 
b/arch/arm64/include/uapi/asm/kvm.h
index 249954f..7d37ccd 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -201,6 +201,13 @@ struct kvm_arch_memory_slot {
 #define   KVM_DEV_ARM_VGIC_CPUID_MASK  (0xfULL << 
KVM_DEV_ARM_VGIC_CPUID_SHIFT)
 #define   KVM_DEV_ARM_VGIC_OFFSET_SHIFT0
 #define   KVM_DEV_ARM_VGIC_OFFSET_MASK (0xULL << 
KVM_DEV_ARM_VGIC_OFFSET_SHIFT)
+#define   KVM_DEV_ARM_VGIC_REG_MASK(KVM_REG_SIZE_MASK | \
+KVM_REG_ARM64_SYSREG_OP0_MASK | \
+KVM_REG_ARM64_SYSREG_OP1_MASK | \
+KVM_REG_ARM64_SYSREG_CRN_MASK | \
+KVM_REG_ARM64_SYSREG_CRM_MASK | \
+KVM_REG_ARM64_SYSREG_OP2_MASK)
+
 #define KVM_DEV_ARM_VGIC_GRP_NR_IRQS   3
 #define KVM_DEV_ARM_VGIC_GRP_CTRL  4
 #define   KVM_DEV_ARM_VGIC_CTRL_INIT   0
diff --git a/include/linux/irqchip/arm-gic-v3.h 
b/include/linux/irqchip/arm-gic-v3.h
index 9eeeb95..dbc5c49 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -259,8 +259,14 @@
 /*
  * CPU interface registers
  */
-#define ICC_CTLR_EL1_EOImode_drop_dir  (0U << 1)
-#define ICC_CTLR_EL1_EOImode_drop  (1U << 1)
+#define ICC_CTLR_EL1_CBPR_SHIFT0
+#define ICC_CTLR_EL1_EOImode_SHIFT 1
+#define ICC_CTLR_EL1_EOImode_drop_dir  (0U << ICC_CTLR_EL1_EOImode_SHIFT)
+#define ICC_CTLR_EL1_EOImode_drop  (1U << ICC_CTLR_EL1_EOImode_SHIFT)
+#define ICC_CTLR_EL1_PRIbits_MASK  (7U << 8)
+#define ICC_CTLR_EL1_IDbits_MASK   (7U << 11)
+#define 

[PATCH v3 4/5] KVM: arm64: Introduce find_reg_by_id()

2015-09-04 Thread Pavel Fedin
In order to implement vGICv3 CPU interface access, we will need to perform
table lookup of system registers. We would need both index_to_params() and
find_reg() exported for that purpose, but instead we export a single
function which combines them both.

Signed-off-by: Pavel Fedin 
---
 arch/arm64/kvm/sys_regs.c | 22 +++---
 arch/arm64/kvm/sys_regs.h |  4 
 2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index fe6b517..21403fa 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1283,6 +1283,17 @@ static bool index_to_params(u64 id, struct 
sys_reg_params *params)
}
 }
 
+const struct sys_reg_desc *find_reg_by_id(u64 id,
+ struct sys_reg_params *params,
+ const struct sys_reg_desc table[],
+ unsigned int num)
+{
+   if (!index_to_params(id, params))
+   return NULL;
+
+   return find_reg(params, table, num);
+}
+
 /* Decode an index value, and find the sys_reg_desc entry. */
 static const struct sys_reg_desc *index_to_sys_reg_desc(struct kvm_vcpu *vcpu,
u64 id)
@@ -1410,10 +1421,8 @@ static int get_invariant_sys_reg(u64 id, void __user 
*uaddr)
struct sys_reg_params params;
const struct sys_reg_desc *r;
 
-   if (!index_to_params(id, ))
-   return -ENOENT;
-
-   r = find_reg(, invariant_sys_regs, 
ARRAY_SIZE(invariant_sys_regs));
+   r = find_reg_by_id(id, , invariant_sys_regs,
+  ARRAY_SIZE(invariant_sys_regs));
if (!r)
return -ENOENT;
 
@@ -1427,9 +1436,8 @@ static int set_invariant_sys_reg(u64 id, void __user 
*uaddr)
int err;
u64 val = 0; /* Make sure high bits are 0 for 32-bit regs */
 
-   if (!index_to_params(id, ))
-   return -ENOENT;
-   r = find_reg(, invariant_sys_regs, 
ARRAY_SIZE(invariant_sys_regs));
+   r = find_reg_by_id(id, , invariant_sys_regs,
+  ARRAY_SIZE(invariant_sys_regs));
if (!r)
return -ENOENT;
 
diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
index 3267518..0646108 100644
--- a/arch/arm64/kvm/sys_regs.h
+++ b/arch/arm64/kvm/sys_regs.h
@@ -136,6 +136,10 @@ static inline int cmp_sys_reg(const struct sys_reg_desc 
*i1,
return i1->Op2 - i2->Op2;
 }
 
+const struct sys_reg_desc *find_reg_by_id(u64 id,
+ struct sys_reg_params *params,
+ const struct sys_reg_desc table[],
+ unsigned int num);
 
 #define Op0(_x).Op0 = _x
 #define Op1(_x).Op1 = _x
-- 
2.4.4

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 1/5] KVM: arm/arm64: Refactor vGIC attributes handling code

2015-09-04 Thread Pavel Fedin
Separate all implementation-independent code in vgic_attr_regs_access()
and move it to vgic.c. This will allow to reuse this code for vGICv3
implementation.

Signed-off-by: Pavel Fedin 
---
 virt/kvm/arm/vgic-v2-emul.c | 126 +---
 virt/kvm/arm/vgic.c |  77 +++
 virt/kvm/arm/vgic.h |   4 ++
 3 files changed, 107 insertions(+), 100 deletions(-)

diff --git a/virt/kvm/arm/vgic-v2-emul.c b/virt/kvm/arm/vgic-v2-emul.c
index 1390797..557c5a6 100644
--- a/virt/kvm/arm/vgic-v2-emul.c
+++ b/virt/kvm/arm/vgic-v2-emul.c
@@ -661,97 +661,38 @@ static const struct vgic_io_range vgic_cpu_ranges[] = {
},
 };
 
-static int vgic_attr_regs_access(struct kvm_device *dev,
-struct kvm_device_attr *attr,
-u32 *reg, bool is_write)
+static int vgic_v2_attr_regs_access(struct kvm_device *dev,
+   struct kvm_device_attr *attr,
+   __le32 *data, bool is_write)
 {
-   const struct vgic_io_range *r = NULL, *ranges;
+   const struct vgic_io_range *ranges;
phys_addr_t offset;
-   int ret, cpuid, c;
-   struct kvm_vcpu *vcpu, *tmp_vcpu;
-   struct vgic_dist *vgic;
+   int cpuid;
+   struct vgic_dist *vgic = >kvm->arch.vgic;
struct kvm_exit_mmio mmio;
-   u32 data;
 
offset = attr->attr & KVM_DEV_ARM_VGIC_OFFSET_MASK;
cpuid = (attr->attr & KVM_DEV_ARM_VGIC_CPUID_MASK) >>
KVM_DEV_ARM_VGIC_CPUID_SHIFT;
 
-   mutex_lock(>kvm->lock);
-
-   ret = vgic_init(dev->kvm);
-   if (ret)
-   goto out;
-
-   if (cpuid >= atomic_read(>kvm->online_vcpus)) {
-   ret = -EINVAL;
-   goto out;
-   }
-
-   vcpu = kvm_get_vcpu(dev->kvm, cpuid);
-   vgic = >kvm->arch.vgic;
-
-   mmio.len = 4;
-   mmio.is_write = is_write;
-   mmio.data = 
-   if (is_write)
-   mmio_data_write(, ~0, *reg);
switch (attr->group) {
case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
-   mmio.phys_addr = vgic->vgic_dist_base + offset;
+   mmio.phys_addr = vgic->vgic_dist_base;
ranges = vgic_dist_ranges;
break;
case KVM_DEV_ARM_VGIC_GRP_CPU_REGS:
-   mmio.phys_addr = vgic->vgic_cpu_base + offset;
+   mmio.phys_addr = vgic->vgic_cpu_base;
ranges = vgic_cpu_ranges;
break;
default:
-   BUG();
+   return -ENXIO;
}
-   r = vgic_find_range(ranges, 4, offset);
 
-   if (unlikely(!r || !r->handle_mmio)) {
-   ret = -ENXIO;
-   goto out;
-   }
-
-
-   spin_lock(>lock);
-
-   /*
-* Ensure that no other VCPU is running by checking the vcpu->cpu
-* field.  If no other VPCUs are running we can safely access the VGIC
-* state, because even if another VPU is run after this point, that
-* VCPU will not touch the vgic state, because it will block on
-* getting the vgic->lock in kvm_vgic_sync_hwstate().
-*/
-   kvm_for_each_vcpu(c, tmp_vcpu, dev->kvm) {
-   if (unlikely(tmp_vcpu->cpu != -1)) {
-   ret = -EBUSY;
-   goto out_vgic_unlock;
-   }
-   }
-
-   /*
-* Move all pending IRQs from the LRs on all VCPUs so the pending
-* state can be properly represented in the register state accessible
-* through this API.
-*/
-   kvm_for_each_vcpu(c, tmp_vcpu, dev->kvm)
-   vgic_unqueue_irqs(tmp_vcpu);
-
-   offset -= r->base;
-   r->handle_mmio(vcpu, , offset);
-
-   if (!is_write)
-   *reg = mmio_data_read(, ~0);
+   mmio.is_write = is_write;
+   mmio.data = data;
 
-   ret = 0;
-out_vgic_unlock:
-   spin_unlock(>lock);
-out:
-   mutex_unlock(>kvm->lock);
-   return ret;
+   return vgic_attr_regs_access(dev, ranges, , offset, sizeof(data),
+cpuid);
 }
 
 static int vgic_v2_create(struct kvm_device *dev, u32 type)
@@ -767,53 +708,38 @@ static void vgic_v2_destroy(struct kvm_device *dev)
 static int vgic_v2_set_attr(struct kvm_device *dev,
struct kvm_device_attr *attr)
 {
+   u32 __user *uaddr = (u32 __user *)(long)attr->addr;
+   u32 reg;
+   __le32 data;
int ret;
 
ret = vgic_set_common_attr(dev, attr);
if (ret != -ENXIO)
return ret;
 
-   switch (attr->group) {
-   case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
-   case KVM_DEV_ARM_VGIC_GRP_CPU_REGS: {
-   u32 __user *uaddr = (u32 __user *)(long)attr->addr;
-   u32 reg;
-
-   if (get_user(reg, uaddr))
-   return -EFAULT;
-
-   return 

[PATCH v3 0/5] KVM: arm64: Implement API for vGICv3 live migration

2015-09-04 Thread Pavel Fedin
This patchset adds necessary userspace API in order to support vGICv3 live
migration. GICv3 registers are accessed using device attribute ioctls,
similar to GICv2.

V2 => V3:
- KVM_DEV_ARM_VGIC_CPUID_MASK enlarged to 20 bits, allowing more than 256
  CPUs.
- Bug fix: Correctly set mmio->private, necessary for redistributor access.
- Added accessors for ICC_AP0R and ICC_AP1R registers
- Rebased on new linux-next

v1 => v2:
- Do not use generic register get/set API for CPU interface, use only
  device attributes.
- Introduce size specifier for distributor and redistributor register
  accesses, do not assume size any more.
- Lots of refactor and reusable code extraction.
- Added forgotten documentation

Pavel Fedin (5):
  KVM: arm/arm64: Refactor vGIC attributes handling code
  KVM: arm64: Implement vGICv3 distributor and redistributor access from
userspace
  KVM: arm64: Refactor system register handlers
  KVM: arm64: Introduce find_reg_by_id()
  KVM: arm64: Implement vGICv3 CPU interface access

 Documentation/virtual/kvm/devices/arm-vgic.txt |  73 +-
 arch/arm64/include/uapi/asm/kvm.h  |  11 +-
 arch/arm64/kvm/sys_regs.c  |  83 +++---
 arch/arm64/kvm/sys_regs.h  |   8 +-
 arch/arm64/kvm/sys_regs_generic_v8.c   |   2 +-
 include/linux/irqchip/arm-gic-v3.h |  18 +-
 virt/kvm/arm/vgic-v2-emul.c| 126 ++---
 virt/kvm/arm/vgic-v3-emul.c| 339 -
 virt/kvm/arm/vgic.c|  78 ++
 virt/kvm/arm/vgic.h|   4 +
 10 files changed, 577 insertions(+), 165 deletions(-)

-- 
2.4.4

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 2/5] KVM: arm64: Implement vGICv3 distributor and redistributor access from userspace

2015-09-04 Thread Pavel Fedin
The access is done similar to vGICv2, using KVM_DEV_ARM_VGIC_GRP_DIST_REGS
and KVM_DEV_ARM_VGIC_GRP_REDIST_REGS with KVM_SET_DEVICE_ATTR and
KVM_GET_DEVICE_ATTR ioctls. Since GICv3 can handle large number of CPUs,
KVM_DEV_ARM_VGIC_CPUID_MASK has been extended to 20 bits. This is enough
for 1048576 CPUs.

Some registers are 64-bit wide according to the specification.
KVM_DEV_ARM_VGIC_64BIT flag is introduced, allowing to perform full 64-bit
accesses.

Signed-off-by: Pavel Fedin 
---
 Documentation/virtual/kvm/devices/arm-vgic.txt | 35 --
 arch/arm64/include/uapi/asm/kvm.h  |  4 +-
 virt/kvm/arm/vgic-v3-emul.c| 95 ++
 virt/kvm/arm/vgic.c|  1 +
 4 files changed, 116 insertions(+), 19 deletions(-)

diff --git a/Documentation/virtual/kvm/devices/arm-vgic.txt 
b/Documentation/virtual/kvm/devices/arm-vgic.txt
index 3fb9054..03f640f 100644
--- a/Documentation/virtual/kvm/devices/arm-vgic.txt
+++ b/Documentation/virtual/kvm/devices/arm-vgic.txt
@@ -43,10 +43,13 @@ Groups:
   KVM_DEV_ARM_VGIC_GRP_DIST_REGS
   Attributes:
 The attr field of kvm_device_attr encodes two values:
-bits: | 63     40 | 39 ..  32  |  31   0 |
-values:   |reserved   |   cpu id   |  offset |
+bits: |  63  | 62 .. 52 | 51 ..  32  |  31   0 |
+values:   | size | reserved |   cpu id   |  offset |
 
 All distributor regs are (rw, 32-bit)
+For GICv3 some regs are also (rw, 64-bit) according to the specification.
+In order to perform full 64-bit access 'size' bit should be set to 1.
+KVM_DEV_ARM_VGIC_64BIT flag value is provided for this purpose.
 
 The offset is relative to the "Distributor base address" as defined in the
 GICv2 specs.  Getting or setting such a register has the same effect as
@@ -54,9 +57,33 @@ Groups:
 specified with cpu id field.  Note that most distributor fields are not
 banked, but return the same value regardless of the cpu id used to access
 the register.
+
+  Limitations:
+- Priorities are not implemented, and registers are RAZ/WI
+  Errors:
+-ENODEV: Getting or setting this register is not yet supported
+-EBUSY: One or more VCPUs are running
+
+  KVM_DEV_ARM_VGIC_GRP_REDIST_REGS
+  Attributes:
+The attr field of kvm_device_attr encodes two values:
+bits: |  63  | 62 .. 52 | 51 ..  32  |  31   0 |
+values:   | size | reserved |   cpu id   |  offset |
+
+All redistributor regs are (rw, 32-bit)
+For GICv3 some regs are also (rw, 64-bit) according to the specification.
+In order to perform full 64-bit access 'size' bit should be set to 1.
+KVM_DEV_ARM_VGIC_64BIT flag value is provided for this purpose.
+
+The offset is relative to the "Redistributor base address" as defined in
+the GICv3 specs.  Getting or setting such a register has the same effect as
+reading or writing the register on the actual hardware from the cpu
+specified with cpu id field.  Note that most distributor fields are not
+banked, but return the same value regardless of the cpu id used to access
+the register.
+
   Limitations:
 - Priorities are not implemented, and registers are RAZ/WI
-- Currently only implemented for KVM_DEV_TYPE_ARM_VGIC_V2.
   Errors:
 -ENODEV: Getting or setting this register is not yet supported
 -EBUSY: One or more VCPUs are running
@@ -64,7 +91,7 @@ Groups:
   KVM_DEV_ARM_VGIC_GRP_CPU_REGS
   Attributes:
 The attr field of kvm_device_attr encodes two values:
-bits: | 63     40 | 39 ..  32  |  31   0 |
+bits: | 63     52 | 51 ..  32  |  31   0 |
 values:   |reserved   |   cpu id   |  offset |
 
 All CPU interface regs are (rw, 32-bit)
diff --git a/arch/arm64/include/uapi/asm/kvm.h 
b/arch/arm64/include/uapi/asm/kvm.h
index 0cd7b59..249954f 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -196,13 +196,15 @@ struct kvm_arch_memory_slot {
 #define KVM_DEV_ARM_VGIC_GRP_ADDR  0
 #define KVM_DEV_ARM_VGIC_GRP_DIST_REGS 1
 #define KVM_DEV_ARM_VGIC_GRP_CPU_REGS  2
+#define   KVM_DEV_ARM_VGIC_64BIT   (1ULL << 63)
 #define   KVM_DEV_ARM_VGIC_CPUID_SHIFT 32
-#define   KVM_DEV_ARM_VGIC_CPUID_MASK  (0xffULL << 
KVM_DEV_ARM_VGIC_CPUID_SHIFT)
+#define   KVM_DEV_ARM_VGIC_CPUID_MASK  (0xfULL << 
KVM_DEV_ARM_VGIC_CPUID_SHIFT)
 #define   KVM_DEV_ARM_VGIC_OFFSET_SHIFT0
 #define   KVM_DEV_ARM_VGIC_OFFSET_MASK (0xULL << 
KVM_DEV_ARM_VGIC_OFFSET_SHIFT)
 #define KVM_DEV_ARM_VGIC_GRP_NR_IRQS   3
 #define KVM_DEV_ARM_VGIC_GRP_CTRL  4
 #define   KVM_DEV_ARM_VGIC_CTRL_INIT   0
+#define KVM_DEV_ARM_VGIC_GRP_REDIST_REGS 5
 
 /* KVM_IRQ_LINE irq field index values */
 #define KVM_ARM_IRQ_TYPE_SHIFT 24
diff --git a/virt/kvm/arm/vgic-v3-emul.c b/virt/kvm/arm/vgic-v3-emul.c
index e661e7f..8bda714 100644
--- 

Re: [PATCH] kvmtool Makefile: relax arm test

2015-09-04 Thread Andre Przywara
Hi Riku,

On 03/09/15 12:20, riku.voi...@linaro.org wrote:
> From: Riku Voipio 
> 
> Currently Makefile accepts only armv7l.* When building kvmtool under 32bit
> personality on Aarch64 machines, uname -m reports "armv8l", so build fails.
> We expect doing 32bit arm builds in Aarch64 to become standard the same way
> people do i386 builds on x86_64 machines.
> 
> Make the sed test a little more greedy so armv8l becomes acceptable.
> 
> Signed-off-by: Riku Voipio 

The patch looks OK to me, I just wonder how you do the actual build
within the linux32 environment?
Do you have an arm cross compiler installed and set CROSS_COMPILE? Or is
there a magic compiler (driver) which uses uname -m as well?
And what would be the difference to setting ARCH=arm as well? Just
convenience?

Cheers,
Andre.

> ---
>  Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/Makefile b/Makefile
> index 1534e6f..7b17d52 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -103,7 +103,7 @@ OBJS  += hw/i8042.o
>  
>  # Translate uname -m into ARCH string
>  ARCH ?= $(shell uname -m | sed -e s/i.86/i386/ -e s/ppc.*/powerpc/ \
> -   -e s/armv7.*/arm/ -e s/aarch64.*/arm64/ -e s/mips64/mips/)
> +   -e s/armv.*/arm/ -e s/aarch64.*/arm64/ -e s/mips64/mips/)
>  
>  ifeq ($(ARCH),i386)
>   ARCH := x86
> 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: KVM memory slots limit on powerpc

2015-09-04 Thread Christian Borntraeger
Am 04.09.2015 um 11:35 schrieb Thomas Huth:
> 
>  Hi all,
> 
> now that we get memory hotplugging for the spapr machine on qemu-ppc,
> too, it seems like we easily can hit the amount of KVM-internal memory
> slots now ("#define KVM_USER_MEM_SLOTS 32" in
> arch/powerpc/include/asm/kvm_host.h). For example, start
> qemu-system-ppc64 with a couple of "-device secondary-vga" and "-m
> 4G,slots=32,maxmem=40G" and then try to hot-plug all 32 DIMMs ... and
> you'll see that it aborts way earlier already.
> 
> The x86 code already increased the amount of KVM_USER_MEM_SLOTS to 509
> already (+3 internal slots = 512) ... maybe we should now increase the
> amount of slots on powerpc, too? Since we don't use internal slots on
> POWER, would 512 be a good value? Or would less be sufficient, too?

When you are at it, the s390 value should also be increased I guess.

Christian

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-ppc] KVM memory slots limit on powerpc

2015-09-04 Thread Christian Borntraeger
Am 04.09.2015 um 12:04 schrieb Alexander Graf:
> 
> 
> On 04.09.15 11:59, Christian Borntraeger wrote:
>> Am 04.09.2015 um 11:35 schrieb Thomas Huth:
>>>
>>>  Hi all,
>>>
>>> now that we get memory hotplugging for the spapr machine on qemu-ppc,
>>> too, it seems like we easily can hit the amount of KVM-internal memory
>>> slots now ("#define KVM_USER_MEM_SLOTS 32" in
>>> arch/powerpc/include/asm/kvm_host.h). For example, start
>>> qemu-system-ppc64 with a couple of "-device secondary-vga" and "-m
>>> 4G,slots=32,maxmem=40G" and then try to hot-plug all 32 DIMMs ... and
>>> you'll see that it aborts way earlier already.
>>>
>>> The x86 code already increased the amount of KVM_USER_MEM_SLOTS to 509
>>> already (+3 internal slots = 512) ... maybe we should now increase the
>>> amount of slots on powerpc, too? Since we don't use internal slots on
>>> POWER, would 512 be a good value? Or would less be sufficient, too?
>>
>> When you are at it, the s390 value should also be increased I guess.
> 
> That constant defines the array size for the memslot array in struct kvm
> which in turn again gets allocated by kzalloc, so it's pinned kernel
> memory that is physically contiguous. Doing big allocations can turn
> into problems during runtime.
> 
> So maybe there is another way? Can we extend the memslot array size
> dynamically somehow? Allocate it separately? How much memory does the
> memslot array use up with 512 entries?

Maybe some rcu protected scheme that doubles the amount of memslots for
each overrun? Yes, that would be good and even reduce the footprint for
systems with only a small number of memslots.

Christian

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-ppc] KVM memory slots limit on powerpc

2015-09-04 Thread Christian Borntraeger
Am 04.09.2015 um 12:04 schrieb Alexander Graf:
> 
> 
> On 04.09.15 11:59, Christian Borntraeger wrote:
>> Am 04.09.2015 um 11:35 schrieb Thomas Huth:
>>>
>>>  Hi all,
>>>
>>> now that we get memory hotplugging for the spapr machine on qemu-ppc,
>>> too, it seems like we easily can hit the amount of KVM-internal memory
>>> slots now ("#define KVM_USER_MEM_SLOTS 32" in
>>> arch/powerpc/include/asm/kvm_host.h). For example, start
>>> qemu-system-ppc64 with a couple of "-device secondary-vga" and "-m
>>> 4G,slots=32,maxmem=40G" and then try to hot-plug all 32 DIMMs ... and
>>> you'll see that it aborts way earlier already.
>>>
>>> The x86 code already increased the amount of KVM_USER_MEM_SLOTS to 509
>>> already (+3 internal slots = 512) ... maybe we should now increase the
>>> amount of slots on powerpc, too? Since we don't use internal slots on
>>> POWER, would 512 be a good value? Or would less be sufficient, too?
>>
>> When you are at it, the s390 value should also be increased I guess.
> 
> That constant defines the array size for the memslot array in struct kvm
> which in turn again gets allocated by kzalloc, so it's pinned kernel
> memory that is physically contiguous. Doing big allocations can turn
> into problems during runtime.
> 
> So maybe there is another way? Can we extend the memslot array size
> dynamically somehow? Allocate it separately? How much memory does the
> memslot array use up with 512 entries?

Maybe some rcu protected scheme that doubles the amount of memslots for
each overrun? Yes, that would be good and even reduce the footprint for
systems with only a small number of memslots.

Christian

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-ppc] KVM memory slots limit on powerpc

2015-09-04 Thread Alexander Graf


On 04.09.15 11:59, Christian Borntraeger wrote:
> Am 04.09.2015 um 11:35 schrieb Thomas Huth:
>>
>>  Hi all,
>>
>> now that we get memory hotplugging for the spapr machine on qemu-ppc,
>> too, it seems like we easily can hit the amount of KVM-internal memory
>> slots now ("#define KVM_USER_MEM_SLOTS 32" in
>> arch/powerpc/include/asm/kvm_host.h). For example, start
>> qemu-system-ppc64 with a couple of "-device secondary-vga" and "-m
>> 4G,slots=32,maxmem=40G" and then try to hot-plug all 32 DIMMs ... and
>> you'll see that it aborts way earlier already.
>>
>> The x86 code already increased the amount of KVM_USER_MEM_SLOTS to 509
>> already (+3 internal slots = 512) ... maybe we should now increase the
>> amount of slots on powerpc, too? Since we don't use internal slots on
>> POWER, would 512 be a good value? Or would less be sufficient, too?
> 
> When you are at it, the s390 value should also be increased I guess.

That constant defines the array size for the memslot array in struct kvm
which in turn again gets allocated by kzalloc, so it's pinned kernel
memory that is physically contiguous. Doing big allocations can turn
into problems during runtime.

So maybe there is another way? Can we extend the memslot array size
dynamically somehow? Allocate it separately? How much memory does the
memslot array use up with 512 entries?


Alex
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-ppc] KVM memory slots limit on powerpc

2015-09-04 Thread Thomas Huth
On 04/09/15 12:07, Christian Borntraeger wrote:
> Am 04.09.2015 um 12:04 schrieb Alexander Graf:
>>
>>
>> On 04.09.15 11:59, Christian Borntraeger wrote:
>>> Am 04.09.2015 um 11:35 schrieb Thomas Huth:

  Hi all,

 now that we get memory hotplugging for the spapr machine on qemu-ppc,
 too, it seems like we easily can hit the amount of KVM-internal memory
 slots now ("#define KVM_USER_MEM_SLOTS 32" in
 arch/powerpc/include/asm/kvm_host.h). For example, start
 qemu-system-ppc64 with a couple of "-device secondary-vga" and "-m
 4G,slots=32,maxmem=40G" and then try to hot-plug all 32 DIMMs ... and
 you'll see that it aborts way earlier already.

 The x86 code already increased the amount of KVM_USER_MEM_SLOTS to 509
 already (+3 internal slots = 512) ... maybe we should now increase the
 amount of slots on powerpc, too? Since we don't use internal slots on
 POWER, would 512 be a good value? Or would less be sufficient, too?
>>>
>>> When you are at it, the s390 value should also be increased I guess.
>>
>> That constant defines the array size for the memslot array in struct kvm
>> which in turn again gets allocated by kzalloc, so it's pinned kernel
>> memory that is physically contiguous. Doing big allocations can turn
>> into problems during runtime.
>>
>> So maybe there is another way? Can we extend the memslot array size
>> dynamically somehow? Allocate it separately? How much memory does the
>> memslot array use up with 512 entries?
> 
> Maybe some rcu protected scheme that doubles the amount of memslots for
> each overrun? Yes, that would be good and even reduce the footprint for
> systems with only a small number of memslots.

Seems like Alex Williamson already posted a patchset for growable
memslots a couple of years ago:

 http://www.spinics.net/lists/kvm/msg50491.html

But I didn't quite spot the result in that thread why it never has been
included upstream. Alex (W.), do you remember the outcome?

 Thomas


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-ppc] KVM memory slots limit on powerpc

2015-09-04 Thread Benjamin Herrenschmidt
On Fri, 2015-09-04 at 12:28 +0200, Thomas Huth wrote:

> > Maybe some rcu protected scheme that doubles the amount of memslots
> > for
> > each overrun? Yes, that would be good and even reduce the footprint
> > for
> > systems with only a small number of memslots.
> 
> Seems like Alex Williamson already posted a patchset for growable
> memslots a couple of years ago:
> 
>  http://www.spinics.net/lists/kvm/msg50491.html
> 
> But I didn't quite spot the result in that thread why it never has 
> been
> included upstream. Alex (W.), do you remember the outcome?

Isn't the memslot array *already* protected by RCU anyway ?

Cheers,
Ben.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-ppc] KVM memory slots limit on powerpc

2015-09-04 Thread Benjamin Herrenschmidt
On Fri, 2015-09-04 at 12:28 +0200, Thomas Huth wrote:

> > Maybe some rcu protected scheme that doubles the amount of memslots
> > for
> > each overrun? Yes, that would be good and even reduce the footprint
> > for
> > systems with only a small number of memslots.
> 
> Seems like Alex Williamson already posted a patchset for growable
> memslots a couple of years ago:
> 
>  http://www.spinics.net/lists/kvm/msg50491.html
> 
> But I didn't quite spot the result in that thread why it never has 
> been
> included upstream. Alex (W.), do you remember the outcome?

Isn't the memslot array *already* protected by RCU anyway ?

Cheers,
Ben.

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [kvm-unit-tests PATCH 2/2] arm/arm64 config: Fix arch_clean rule

2015-09-04 Thread Andrew Jones
On Wed, Sep 02, 2015 at 11:25:26AM +0200, Alexander Spyridakis wrote:
> Properly clean any generated object and binary files after a 'make clean',
> this fixes an issue when trying to reconfigure between arm and arm64.

Are you also running configure (with the opposite arch selected) after
'make clean'? If not, then that could be the source of your problems.
Anyway, please describe the issues you're seeing because I don't see
what this patch is doing that isn't already being done. The lines this
patch adds are already there. See the arm_clean target in
config/config-arm-common.mak.

Thanks,
drew

> 
> Signed-off-by: Alexander Spyridakis 
> ---
>  config/config-arm.mak   | 2 ++
>  config/config-arm64.mak | 3 ++-
>  2 files changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/config/config-arm.mak b/config/config-arm.mak
> index ae6c2e7..68fab62 100644
> --- a/config/config-arm.mak
> +++ b/config/config-arm.mak
> @@ -21,3 +21,5 @@ tests =
>  include config/config-arm-common.mak
>  
>  arch_clean: arm_clean
> + $(RM) $(TEST_DIR)/*.o $(TEST_DIR)/*.flat $(TEST_DIR)/*.elf \
> + $(TEST_DIR)/.*.d lib/arm/.*.d
> diff --git a/config/config-arm64.mak b/config/config-arm64.mak
> index d61b703..a0bc1b3 100644
> --- a/config/config-arm64.mak
> +++ b/config/config-arm64.mak
> @@ -17,4 +17,5 @@ tests =
>  include config/config-arm-common.mak
>  
>  arch_clean: arm_clean
> - $(RM) lib/arm64/.*.d
> + $(RM) $(TEST_DIR)/*.o $(TEST_DIR)/*.flat $(TEST_DIR)/*.elf \
> + $(TEST_DIR)/.*.d lib/arm64/.*.d
> -- 
> 2.1.4
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/9] KVM: Add kvm_arch_vcpu_{un}blocking callbacks

2015-09-04 Thread Christoffer Dall
On Fri, Sep 04, 2015 at 03:50:08PM +0200, Eric Auger wrote:
> Hi Christoffer,
> On 08/30/2015 03:54 PM, Christoffer Dall wrote:
> > Some times it is useful for architecture implementations of KVM to know
> > when the VCPU thread is about to block or when it comes back from
> > blocking (arm/arm64 needs to know this to properly implement timers, for
> > example).
> what about vcpu_sleep()? Is that callback specific to kvm_vcpu_block
> function entry/exit points or is it more generic? The question also
> applies to future halt/resume functions
> 
For ARM, This should be called when we're about to block in a situation
where timer interrupts could affect our sleep state, which would not be
the case for vcpu_sleep, which unconditionally puts the vcpu to sleep
based on other conditions.

I believe that any case where you care about incoming interrupts are
covered by the semantics of kvm_vcpu_block, and therefore these hooks
should only be called by kvm_vcpu_block.

Thanks,
-Christoffer
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 1/5] KVM: arm/arm64: Refactor vGIC attributes handling code

2015-09-04 Thread Andre Przywara
Hi Pavel,

On 02/09/15 09:09, Pavel Fedin wrote:
> Separate all implementation-independent code in vgic_attr_regs_access()
> and move it to vgic.c. This will allow to reuse this code for vGICv3
> implementation.
> 
> Signed-off-by: Pavel Fedin 
> ---
>  virt/kvm/arm/vgic-v2-emul.c | 126 
> +---
>  virt/kvm/arm/vgic.c |  77 +++
>  virt/kvm/arm/vgic.h |   4 ++
>  3 files changed, 107 insertions(+), 100 deletions(-)
> 
> diff --git a/virt/kvm/arm/vgic-v2-emul.c b/virt/kvm/arm/vgic-v2-emul.c
> index 1390797..557c5a6 100644
> --- a/virt/kvm/arm/vgic-v2-emul.c
> +++ b/virt/kvm/arm/vgic-v2-emul.c
> @@ -661,97 +661,38 @@ static const struct vgic_io_range vgic_cpu_ranges[] = {
>   },
>  };
>  
> -static int vgic_attr_regs_access(struct kvm_device *dev,
> -  struct kvm_device_attr *attr,
> -  u32 *reg, bool is_write)
> +static int vgic_v2_attr_regs_access(struct kvm_device *dev,
> + struct kvm_device_attr *attr,
> + __le32 *data, bool is_write)
>  {
> - const struct vgic_io_range *r = NULL, *ranges;
> + const struct vgic_io_range *ranges;
>   phys_addr_t offset;
> - int ret, cpuid, c;
> - struct kvm_vcpu *vcpu, *tmp_vcpu;
> - struct vgic_dist *vgic;
> + int cpuid;
> + struct vgic_dist *vgic = >kvm->arch.vgic;
>   struct kvm_exit_mmio mmio;
> - u32 data;
>  
>   offset = attr->attr & KVM_DEV_ARM_VGIC_OFFSET_MASK;
>   cpuid = (attr->attr & KVM_DEV_ARM_VGIC_CPUID_MASK) >>
>   KVM_DEV_ARM_VGIC_CPUID_SHIFT;
>  
> - mutex_lock(>kvm->lock);
> -
> - ret = vgic_init(dev->kvm);
> - if (ret)
> - goto out;
> -
> - if (cpuid >= atomic_read(>kvm->online_vcpus)) {
> - ret = -EINVAL;
> - goto out;
> - }
> -
> - vcpu = kvm_get_vcpu(dev->kvm, cpuid);
> - vgic = >kvm->arch.vgic;
> -
> - mmio.len = 4;
> - mmio.is_write = is_write;
> - mmio.data = 
> - if (is_write)
> - mmio_data_write(, ~0, *reg);
>   switch (attr->group) {
>   case KVM_DEV_ARM_VGIC_GRP_DIST_REGS:
> - mmio.phys_addr = vgic->vgic_dist_base + offset;
> + mmio.phys_addr = vgic->vgic_dist_base;
>   ranges = vgic_dist_ranges;
>   break;
>   case KVM_DEV_ARM_VGIC_GRP_CPU_REGS:
> - mmio.phys_addr = vgic->vgic_cpu_base + offset;
> + mmio.phys_addr = vgic->vgic_cpu_base;
>   ranges = vgic_cpu_ranges;
>   break;
>   default:
> - BUG();
> + return -ENXIO;
>   }
> - r = vgic_find_range(ranges, 4, offset);
>  
> - if (unlikely(!r || !r->handle_mmio)) {
> - ret = -ENXIO;
> - goto out;
> - }
> -
> -
> - spin_lock(>lock);
> -
> - /*
> -  * Ensure that no other VCPU is running by checking the vcpu->cpu
> -  * field.  If no other VPCUs are running we can safely access the VGIC
> -  * state, because even if another VPU is run after this point, that
> -  * VCPU will not touch the vgic state, because it will block on
> -  * getting the vgic->lock in kvm_vgic_sync_hwstate().
> -  */
> - kvm_for_each_vcpu(c, tmp_vcpu, dev->kvm) {
> - if (unlikely(tmp_vcpu->cpu != -1)) {
> - ret = -EBUSY;
> - goto out_vgic_unlock;
> - }
> - }
> -
> - /*
> -  * Move all pending IRQs from the LRs on all VCPUs so the pending
> -  * state can be properly represented in the register state accessible
> -  * through this API.
> -  */
> - kvm_for_each_vcpu(c, tmp_vcpu, dev->kvm)
> - vgic_unqueue_irqs(tmp_vcpu);
> -
> - offset -= r->base;
> - r->handle_mmio(vcpu, , offset);
> -
> - if (!is_write)
> - *reg = mmio_data_read(, ~0);
> + mmio.is_write = is_write;
> + mmio.data = data;
>  
> - ret = 0;
> -out_vgic_unlock:
> - spin_unlock(>lock);
> -out:
> - mutex_unlock(>kvm->lock);
> - return ret;
> + return vgic_attr_regs_access(dev, ranges, , offset, sizeof(data),
> +  cpuid);

Isn't the len parameter redundant here? I see that you don't initialize
mmio.len (which is a bit scary, btw), so can't you just use that field?

>  }
>  
>  static int vgic_v2_create(struct kvm_device *dev, u32 type)
> @@ -767,53 +708,38 @@ static void vgic_v2_destroy(struct kvm_device *dev)
>  static int vgic_v2_set_attr(struct kvm_device *dev,
>   struct kvm_device_attr *attr)
>  {
> + u32 __user *uaddr = (u32 __user *)(long)attr->addr;
> + u32 reg;
> + __le32 data;

That (and other parts of this patch) sneak in some endianness handling,
which I'd like to be mentioned in the commit message, but preferably be
in a separate patch. The commit message 

Re: [Qemu-ppc] KVM memory slots limit on powerpc

2015-09-04 Thread Thomas Huth
On 04/09/15 12:04, Alexander Graf wrote:
> 
> 
> On 04.09.15 11:59, Christian Borntraeger wrote:
>> Am 04.09.2015 um 11:35 schrieb Thomas Huth:
>>>
>>>  Hi all,
>>>
>>> now that we get memory hotplugging for the spapr machine on qemu-ppc,
>>> too, it seems like we easily can hit the amount of KVM-internal memory
>>> slots now ("#define KVM_USER_MEM_SLOTS 32" in
>>> arch/powerpc/include/asm/kvm_host.h). For example, start
>>> qemu-system-ppc64 with a couple of "-device secondary-vga" and "-m
>>> 4G,slots=32,maxmem=40G" and then try to hot-plug all 32 DIMMs ... and
>>> you'll see that it aborts way earlier already.
>>>
>>> The x86 code already increased the amount of KVM_USER_MEM_SLOTS to 509
>>> already (+3 internal slots = 512) ... maybe we should now increase the
>>> amount of slots on powerpc, too? Since we don't use internal slots on
>>> POWER, would 512 be a good value? Or would less be sufficient, too?
>>
>> When you are at it, the s390 value should also be increased I guess.
> 
> That constant defines the array size for the memslot array in struct kvm
> which in turn again gets allocated by kzalloc, so it's pinned kernel
> memory that is physically contiguous. Doing big allocations can turn
> into problems during runtime.

FWIW, I've just checked sizeof(struct kvm) with the current ppc64 kernel
build from master branch, and it is 34144 bytes.
So on a system that is using PAGE_SIZE = 64kB, there should be plenty of
space left before we're getting into trouble.

And even assuming the worst case, that we're on a system which still
uses PAGE_SIZE = 4kB, the last page of the 34144 bytes is only filled
with 1376 bytes, leaving 2720 bytes free right now.
sizeof(struct kvm_memory_slot) is 48 bytes right now on powerpc, and you
need two additional bytes per entry for the id_to_index array in
struct kvm_memslots, i.e. we need 50 additional bytes per entry on ppc.
That means we could increase KVM_USER_MEM_SLOTS by 2720 / 50 = 54
entries without getting into further trouble.

I think we should leave some more additional bytes left in that last
4k page of the struct kvm region, ... so what about increasing
KVM_USER_MEM_SLOTS to 32 + 48 = 80 now (instead of 32 + 54 = 86) to
ease the memslot situation at least a little bit 'till we figured out
a really final solution like growable memslots?

 Thomas

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] powerpc/e500: move qemu machine spec together with the rest

2015-09-04 Thread Laurentiu Tudor
This way we get rid of an entire file with mostly
duplicated code plus a Kconfig option that you always
had to take care to check it in order for kvm to work.

Signed-off-by: Laurentiu Tudor 
---
 arch/powerpc/platforms/85xx/Kconfig   | 15 -
 arch/powerpc/platforms/85xx/Makefile  |  1 -
 arch/powerpc/platforms/85xx/corenet_generic.c |  1 +
 arch/powerpc/platforms/85xx/qemu_e500.c   | 85 ---
 4 files changed, 1 insertion(+), 101 deletions(-)
 delete mode 100644 arch/powerpc/platforms/85xx/qemu_e500.c

diff --git a/arch/powerpc/platforms/85xx/Kconfig 
b/arch/powerpc/platforms/85xx/Kconfig
index 97915fe..0c1ce10 100644
--- a/arch/powerpc/platforms/85xx/Kconfig
+++ b/arch/powerpc/platforms/85xx/Kconfig
@@ -249,21 +249,6 @@ config MVME2500
 
 endif # PPC32
 
-config PPC_QEMU_E500
-   bool "QEMU generic e500 platform"
-   select DEFAULT_UIMAGE
-   help
- This option enables support for running as a QEMU guest using
- QEMU's generic e500 machine.  This is not required if you're
- using a QEMU machine that targets a specific board, such as
- mpc8544ds.
-
- Unlike most e500 boards that target a specific CPU, this
- platform works with any e500-family CPU that QEMU supports.
- Thus, you'll need to make sure CONFIG_PPC_E500MC is set or
- unset based on the emulated CPU (or actual host CPU in the case
- of KVM).
-
 config CORENET_GENERIC
bool "Freescale CoreNet Generic"
select DEFAULT_UIMAGE
diff --git a/arch/powerpc/platforms/85xx/Makefile 
b/arch/powerpc/platforms/85xx/Makefile
index 1fe7fb9..33fbfb8 100644
--- a/arch/powerpc/platforms/85xx/Makefile
+++ b/arch/powerpc/platforms/85xx/Makefile
@@ -29,6 +29,5 @@ obj-$(CONFIG_SOCRATES)+= socrates.o socrates_fpga_pic.o
 obj-$(CONFIG_KSI8560)+= ksi8560.o
 obj-$(CONFIG_XES_MPC85xx) += xes_mpc85xx.o
 obj-$(CONFIG_GE_IMP3A)   += ge_imp3a.o
-obj-$(CONFIG_PPC_QEMU_E500) += qemu_e500.o
 obj-$(CONFIG_SGY_CTS1000) += sgy_cts1000.o
 obj-$(CONFIG_MVME2500)   += mvme2500.o
diff --git a/arch/powerpc/platforms/85xx/corenet_generic.c 
b/arch/powerpc/platforms/85xx/corenet_generic.c
index bd839dc..85078ee 100644
--- a/arch/powerpc/platforms/85xx/corenet_generic.c
+++ b/arch/powerpc/platforms/85xx/corenet_generic.c
@@ -159,6 +159,7 @@ static const char * const boards[] __initconst = {
"fsl,T1042RDB",
"fsl,T1042RDB_PI",
"keymile,kmcoge4",
+   "fsl,qemu-e500",
NULL
 };
 
diff --git a/arch/powerpc/platforms/85xx/qemu_e500.c 
b/arch/powerpc/platforms/85xx/qemu_e500.c
deleted file mode 100644
index 8ad2fe6..000
--- a/arch/powerpc/platforms/85xx/qemu_e500.c
+++ /dev/null
@@ -1,85 +0,0 @@
-/*
- * Paravirt target for a generic QEMU e500 machine
- *
- * This is intended to be a flexible device-tree-driven platform, not fixed
- * to a particular piece of hardware or a particular spec of virtual hardware,
- * beyond the assumption of an e500-family CPU.  Some things are still 
hardcoded
- * here, such as MPIC, but this is a limitation of the current code rather than
- * an interface contract with QEMU.
- *
- * Copyright 2012 Freescale Semiconductor Inc.
- *
- * This program is free software; you can redistribute  it and/or modify it
- * under  the terms of  the GNU General  Public License as published by the
- * Free Software Foundation;  either version 2 of the  License, or (at your
- * option) any later version.
- */
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include "smp.h"
-#include "mpc85xx.h"
-
-void __init qemu_e500_pic_init(void)
-{
-   struct mpic *mpic;
-   unsigned int flags = MPIC_BIG_ENDIAN | MPIC_SINGLE_DEST_CPU |
-   MPIC_ENABLE_COREINT;
-
-   mpic = mpic_alloc(NULL, 0, flags, 0, 256, " OpenPIC  ");
-
-   BUG_ON(mpic == NULL);
-   mpic_init(mpic);
-}
-
-static void __init qemu_e500_setup_arch(void)
-{
-   ppc_md.progress("qemu_e500_setup_arch()", 0);
-
-   fsl_pci_assign_primary();
-   swiotlb_detect_4g();
-#if defined(CONFIG_FSL_PCI) && defined(CONFIG_ZONE_DMA32)
-   /*
-* Inbound windows don't cover the full lower 4 GiB
-* due to conflicts with PCICSRBAR and outbound windows,
-* so limit the DMA32 zone to 2 GiB, to allow consistent
-* allocations to succeed.
-*/
-   limit_zone_pfn(ZONE_DMA32, 1UL << (31 - PAGE_SHIFT));
-#endif
-   mpc85xx_smp_init();
-}
-
-/*
- * Called very early, device-tree isn't unflattened
- */
-static int __init qemu_e500_probe(void)
-{
-   unsigned long root = of_get_flat_dt_root();
-
-   return !!of_flat_dt_is_compatible(root, "fsl,qemu-e500");
-}
-
-machine_arch_initcall(qemu_e500, mpc85xx_common_publish_devices);
-
-define_machine(qemu_e500) {
-   .name   = "QEMU e500",
-   .probe  = qemu_e500_probe,
-   .setup_arch 

Re: [PATCH 1/9] KVM: Add kvm_arch_vcpu_{un}blocking callbacks

2015-09-04 Thread Eric Auger
Hi Christoffer,
On 08/30/2015 03:54 PM, Christoffer Dall wrote:
> Some times it is useful for architecture implementations of KVM to know
> when the VCPU thread is about to block or when it comes back from
> blocking (arm/arm64 needs to know this to properly implement timers, for
> example).
what about vcpu_sleep()? Is that callback specific to kvm_vcpu_block
function entry/exit points or is it more generic? The question also
applies to future halt/resume functions

Thanks

Eric
> 
> Therefore provide a generic architecture callback function in line with
> what we do elsewhere for KVM generic-arch interactions.
> 
> Signed-off-by: Christoffer Dall 
> ---
>  arch/arm/include/asm/kvm_host.h | 3 +++
>  arch/arm64/include/asm/kvm_host.h   | 3 +++
>  arch/mips/include/asm/kvm_host.h| 2 ++
>  arch/powerpc/include/asm/kvm_host.h | 2 ++
>  arch/s390/include/asm/kvm_host.h| 2 ++
>  arch/x86/include/asm/kvm_host.h | 3 +++
>  include/linux/kvm_host.h| 2 ++
>  virt/kvm/kvm_main.c | 3 +++
>  8 files changed, 20 insertions(+)
> 
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index dcba0fa..86fcf6e 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -236,4 +236,7 @@ static inline void kvm_arm_setup_debug(struct kvm_vcpu 
> *vcpu) {}
>  static inline void kvm_arm_clear_debug(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu) {}
>  
> +static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
> +static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
> +
>  #endif /* __ARM_KVM_HOST_H__ */
> diff --git a/arch/arm64/include/asm/kvm_host.h 
> b/arch/arm64/include/asm/kvm_host.h
> index 415938d..dd143f5 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -257,4 +257,7 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
>  void kvm_arm_clear_debug(struct kvm_vcpu *vcpu);
>  void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu);
>  
> +static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
> +static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
> +
>  #endif /* __ARM64_KVM_HOST_H__ */
> diff --git a/arch/mips/include/asm/kvm_host.h 
> b/arch/mips/include/asm/kvm_host.h
> index e8c8d9d..58f0f4d 100644
> --- a/arch/mips/include/asm/kvm_host.h
> +++ b/arch/mips/include/asm/kvm_host.h
> @@ -845,5 +845,7 @@ static inline void kvm_arch_flush_shadow_memslot(struct 
> kvm *kvm,
>   struct kvm_memory_slot *slot) {}
>  static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
> +static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
> +static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
>  
>  #endif /* __MIPS_KVM_HOST_H__ */
> diff --git a/arch/powerpc/include/asm/kvm_host.h 
> b/arch/powerpc/include/asm/kvm_host.h
> index d91f65b..179f9a7 100644
> --- a/arch/powerpc/include/asm/kvm_host.h
> +++ b/arch/powerpc/include/asm/kvm_host.h
> @@ -702,5 +702,7 @@ static inline void kvm_arch_memslots_updated(struct kvm 
> *kvm, struct kvm_memslot
>  static inline void kvm_arch_flush_shadow_all(struct kvm *kvm) {}
>  static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
>  static inline void kvm_arch_exit(void) {}
> +static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
> +static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
>  
>  #endif /* __POWERPC_KVM_HOST_H__ */
> diff --git a/arch/s390/include/asm/kvm_host.h 
> b/arch/s390/include/asm/kvm_host.h
> index 3024acb..04a97df 100644
> --- a/arch/s390/include/asm/kvm_host.h
> +++ b/arch/s390/include/asm/kvm_host.h
> @@ -640,5 +640,7 @@ static inline void kvm_arch_memslots_updated(struct kvm 
> *kvm, struct kvm_memslot
>  static inline void kvm_arch_flush_shadow_all(struct kvm *kvm) {}
>  static inline void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
>   struct kvm_memory_slot *slot) {}
> +static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
> +static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
>  
>  #endif
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 2a7f5d7..26c4086 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1202,4 +1202,7 @@ int __x86_set_memory_region(struct kvm *kvm,
>  int x86_set_memory_region(struct kvm *kvm,
> const struct kvm_userspace_memory_region *mem);
>  
> +static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
> +static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
> +
>  #endif /* _ASM_X86_KVM_HOST_H */
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 9564fd7..87d7be6 100644
> --- a/include/linux/kvm_host.h

[PATCH 1/2] arm/arm64: KVM: vgic: Move active state handling to flush_hwstate

2015-09-04 Thread Christoffer Dall
We currently set the physical active state only when we *inject* a new
pending virtual interrupt, but this is actually not correct, because we
could have been preempted and run something else on the system that
resets the active state to clear.  This causes us to run the VM with the
timer set to fire, but without setting the physical active state.

The solution is to always check the LR configurations, and we if have a
mapped interrupt in the LR in either the pending or active state
(virtual), then set the physical active state.

Acked-by: Marc Zyngier 
Signed-off-by: Christoffer Dall 
---
 virt/kvm/arm/vgic.c | 42 ++
 1 file changed, 26 insertions(+), 16 deletions(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 9eb489a..6bd1c9b 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1144,26 +1144,11 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, 
int irq,
struct irq_phys_map *map;
map = vgic_irq_map_search(vcpu, irq);
 
-   /*
-* If we have a mapping, and the virtual interrupt is
-* being injected, then we must set the state to
-* active in the physical world. Otherwise the
-* physical interrupt will fire and the guest will
-* exit before processing the virtual interrupt.
-*/
if (map) {
-   int ret;
-
-   BUG_ON(!map->active);
vlr.hwirq = map->phys_irq;
vlr.state |= LR_HW;
vlr.state &= ~LR_EOI_INT;
 
-   ret = irq_set_irqchip_state(map->irq,
-   IRQCHIP_STATE_ACTIVE,
-   true);
-   WARN_ON(ret);
-
/*
 * Make sure we're not going to sample this
 * again, as a HW-backed interrupt cannot be
@@ -1255,7 +1240,7 @@ static void __kvm_vgic_flush_hwstate(struct kvm_vcpu 
*vcpu)
struct vgic_cpu *vgic_cpu = >arch.vgic_cpu;
struct vgic_dist *dist = >kvm->arch.vgic;
unsigned long *pa_percpu, *pa_shared;
-   int i, vcpu_id;
+   int i, vcpu_id, lr, ret;
int overflow = 0;
int nr_shared = vgic_nr_shared_irqs(dist);
 
@@ -1310,6 +1295,31 @@ epilog:
 */
clear_bit(vcpu_id, dist->irq_pending_on_cpu);
}
+
+   for (lr = 0; lr < vgic->nr_lr; lr++) {
+   struct vgic_lr vlr;
+
+   if (!test_bit(lr, vgic_cpu->lr_used))
+   continue;
+
+   vlr = vgic_get_lr(vcpu, lr);
+
+   /*
+* If we have a mapping, and the virtual interrupt is
+* presented to the guest (as pending or active), then we must
+* set the state to active in the physical world. See
+* Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt.
+*/
+   if (vlr.state & LR_HW) {
+   struct irq_phys_map *map;
+   map = vgic_irq_map_search(vcpu, vlr.irq);
+
+   ret = irq_set_irqchip_state(map->irq,
+   IRQCHIP_STATE_ACTIVE,
+   true);
+   WARN_ON(ret);
+   }
+   }
 }
 
 static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
-- 
2.1.2.330.g565301e.dirty

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-ppc] KVM memory slots limit on powerpc

2015-09-04 Thread Alex Williamson
On Fri, 2015-09-04 at 12:28 +0200, Thomas Huth wrote:
> On 04/09/15 12:07, Christian Borntraeger wrote:
> > Am 04.09.2015 um 12:04 schrieb Alexander Graf:
> >>
> >>
> >> On 04.09.15 11:59, Christian Borntraeger wrote:
> >>> Am 04.09.2015 um 11:35 schrieb Thomas Huth:
> 
>   Hi all,
> 
>  now that we get memory hotplugging for the spapr machine on qemu-ppc,
>  too, it seems like we easily can hit the amount of KVM-internal memory
>  slots now ("#define KVM_USER_MEM_SLOTS 32" in
>  arch/powerpc/include/asm/kvm_host.h). For example, start
>  qemu-system-ppc64 with a couple of "-device secondary-vga" and "-m
>  4G,slots=32,maxmem=40G" and then try to hot-plug all 32 DIMMs ... and
>  you'll see that it aborts way earlier already.
> 
>  The x86 code already increased the amount of KVM_USER_MEM_SLOTS to 509
>  already (+3 internal slots = 512) ... maybe we should now increase the
>  amount of slots on powerpc, too? Since we don't use internal slots on
>  POWER, would 512 be a good value? Or would less be sufficient, too?
> >>>
> >>> When you are at it, the s390 value should also be increased I guess.
> >>
> >> That constant defines the array size for the memslot array in struct kvm
> >> which in turn again gets allocated by kzalloc, so it's pinned kernel
> >> memory that is physically contiguous. Doing big allocations can turn
> >> into problems during runtime.
> >>
> >> So maybe there is another way? Can we extend the memslot array size
> >> dynamically somehow? Allocate it separately? How much memory does the
> >> memslot array use up with 512 entries?
> > 
> > Maybe some rcu protected scheme that doubles the amount of memslots for
> > each overrun? Yes, that would be good and even reduce the footprint for
> > systems with only a small number of memslots.
> 
> Seems like Alex Williamson already posted a patchset for growable
> memslots a couple of years ago:
> 
>  http://www.spinics.net/lists/kvm/msg50491.html
> 
> But I didn't quite spot the result in that thread why it never has been
> included upstream. Alex (W.), do you remember the outcome?

IIRC it was simply thought to be more complicated than necessary.  Once
we started caching memory slot misses, the O(N) search of a larger fixed
array didn't seem to bother anyone, so I abandoned the weight-balanced
tree, which I think still had some re-balancing issues.  Please run with
the code and make it work if it's useful now.  Thanks,

Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] arm/arm64: KVM: arch timer: Reset CNTV_CTL to 0

2015-09-04 Thread Christoffer Dall
Provide a better quality of implementation and be architecture compliant
on ARMv7 for the architected timer by resetting the CNTV_CTL to 0 on
reset of the timer, and call kvm_timer_update_state(vcpu) at the same
time, ensuring the timer output is not asserted after, for example, a
PSCI system reset.

This change alone fixes the UEFI reset issue reported by Laszlo back in
February.

Cc: Laszlo Ersek 
Cc: Ard Biesheuvel 
Cc: Drew Jones 
Cc: Wei Huang 
Cc: Peter Maydell 
Reviewed-by: Marc Zyngier 
Signed-off-by: Christoffer Dall 
---
 virt/kvm/arm/arch_timer.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 76e38d2..48c6e1a 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -200,6 +200,14 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
timer->irq = irq;
 
/*
+* The bits in CNTV_CTL are architecturally reset to UNKNOWN for ARMv8
+* and to 0 for ARMv7.  We provide an implementation that always
+* resets the timer to be disabled and unmasked and is compliant with
+* the ARMv7 architecture.
+*/
+   timer->cntv_ctl = 0;
+
+   /*
 * Tell the VGIC that the virtual interrupt is tied to a
 * physical interrupt. We do that once per VCPU.
 */
-- 
2.1.2.330.g565301e.dirty

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/2] arm/arm64: KVM: Fix arthictected timer issues

2015-09-04 Thread Christoffer Dall
These two patches fix two separate issues with the architected timer and
the corresponding interrupt injection to VMs on KVM/ARM.

The first patch fixes an issue introduced with the active timer state
switching series recently merged for v4.3, which could cause a guest to
loop without progress if another VCPU is run on the same physical CPU
and preempts the original VCPU while the guest is running the ISR for
the timer interrupt.

The second patch resets the architected timer's control register to zero
on system reset, ensuring that interrupts are not injected when a system
resets.  This fixes a long-standing issue with UEFI, where soft reset
initiated from within UEFI prevented the system from booting again.

Christoffer Dall (2):
  arm/arm64: KVM: vgic: Move active state handling to flush_hwstate
  arm/arm64: KVM: arch timer: Reset CNTV_CTL to 0

 virt/kvm/arm/arch_timer.c |  8 
 virt/kvm/arm/vgic.c   | 42 ++
 2 files changed, 34 insertions(+), 16 deletions(-)

-- 
2.1.2.330.g565301e.dirty

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] arm/arm64: KVM: arch timer: Reset CNTV_CTL to 0

2015-09-04 Thread Christoffer Dall
On Fri, Sep 04, 2015 at 04:24:39PM +0200, Christoffer Dall wrote:
> Provide a better quality of implementation and be architecture compliant
> on ARMv7 for the architected timer by resetting the CNTV_CTL to 0 on
> reset of the timer, and call kvm_timer_update_state(vcpu) at the same
> time, ensuring the timer output is not asserted after, for example, a
> PSCI system reset.

forgot to remove the bit about kvm_timer_update_state(vcpu) which is no
longer valid when these patches are merged before the rework series.

Marc, if you're otherwise happy with this patch, can you fix this up at
commit time?

Thanks,
-Christoffer

> 
> This change alone fixes the UEFI reset issue reported by Laszlo back in
> February.
> 
> Cc: Laszlo Ersek 
> Cc: Ard Biesheuvel 
> Cc: Drew Jones 
> Cc: Wei Huang 
> Cc: Peter Maydell 
> Reviewed-by: Marc Zyngier 
> Signed-off-by: Christoffer Dall 
> ---
>  virt/kvm/arm/arch_timer.c | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
> index 76e38d2..48c6e1a 100644
> --- a/virt/kvm/arm/arch_timer.c
> +++ b/virt/kvm/arm/arch_timer.c
> @@ -200,6 +200,14 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
>   timer->irq = irq;
>  
>   /*
> +  * The bits in CNTV_CTL are architecturally reset to UNKNOWN for ARMv8
> +  * and to 0 for ARMv7.  We provide an implementation that always
> +  * resets the timer to be disabled and unmasked and is compliant with
> +  * the ARMv7 architecture.
> +  */
> + timer->cntv_ctl = 0;
> +
> + /*
>* Tell the VGIC that the virtual interrupt is tied to a
>* physical interrupt. We do that once per VCPU.
>*/
> -- 
> 2.1.2.330.g565301e.dirty
> 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-ppc] KVM memory slots limit on powerpc

2015-09-04 Thread Thomas Huth
On 04/09/15 12:04, Alexander Graf wrote:
> 
> 
> On 04.09.15 11:59, Christian Borntraeger wrote:
>> Am 04.09.2015 um 11:35 schrieb Thomas Huth:
>>>
>>>  Hi all,
>>>
>>> now that we get memory hotplugging for the spapr machine on qemu-ppc,
>>> too, it seems like we easily can hit the amount of KVM-internal memory
>>> slots now ("#define KVM_USER_MEM_SLOTS 32" in
>>> arch/powerpc/include/asm/kvm_host.h). For example, start
>>> qemu-system-ppc64 with a couple of "-device secondary-vga" and "-m
>>> 4G,slots=32,maxmem=40G" and then try to hot-plug all 32 DIMMs ... and
>>> you'll see that it aborts way earlier already.
>>>
>>> The x86 code already increased the amount of KVM_USER_MEM_SLOTS to 509
>>> already (+3 internal slots = 512) ... maybe we should now increase the
>>> amount of slots on powerpc, too? Since we don't use internal slots on
>>> POWER, would 512 be a good value? Or would less be sufficient, too?
>>
>> When you are at it, the s390 value should also be increased I guess.
> 
> That constant defines the array size for the memslot array in struct kvm
> which in turn again gets allocated by kzalloc, so it's pinned kernel
> memory that is physically contiguous. Doing big allocations can turn
> into problems during runtime.

FWIW, I've just checked sizeof(struct kvm) with the current ppc64 kernel
build from master branch, and it is 34144 bytes.
So on a system that is using PAGE_SIZE = 64kB, there should be plenty of
space left before we're getting into trouble.

And even assuming the worst case, that we're on a system which still
uses PAGE_SIZE = 4kB, the last page of the 34144 bytes is only filled
with 1376 bytes, leaving 2720 bytes free right now.
sizeof(struct kvm_memory_slot) is 48 bytes right now on powerpc, and you
need two additional bytes per entry for the id_to_index array in
struct kvm_memslots, i.e. we need 50 additional bytes per entry on ppc.
That means we could increase KVM_USER_MEM_SLOTS by 2720 / 50 = 54
entries without getting into further trouble.

I think we should leave some more additional bytes left in that last
4k page of the struct kvm region, ... so what about increasing
KVM_USER_MEM_SLOTS to 32 + 48 = 80 now (instead of 32 + 54 = 86) to
ease the memslot situation at least a little bit 'till we figured out
a really final solution like growable memslots?

 Thomas

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [kvm-unit-tests PATCH 2/2] arm/arm64 config: Fix arch_clean rule

2015-09-04 Thread Andrew Jones
On Fri, Sep 04, 2015 at 03:48:35PM +0200, Alexander Spyridakis wrote:
> On 4 September 2015 at 12:48, Andrew Jones  wrote:
> > Are you also running configure (with the opposite arch selected) after
> > 'make clean'? If not, then that could be the source of your problems.
> > Anyway, please describe the issues you're seeing because I don't see
> > what this patch is doing that isn't already being done. The lines this
> > patch adds are already there. See the arm_clean target in
> > config/config-arm-common.mak.
> 
> Steps to reproduce my issue:
> > ./configure --arch=arm --cross-prefix=arm-linux-gnueabihf-
> > make
> > ./configure --arch=arm64 --cross-prefix=aarch64-linux-gnu-
> > make clean && make
> > arm/selftest.o: error adding symbols: File in wrong format

This doesn't reproduce for me. I did the following, and it worked
fine.

make distclean
./configure --arch=arm --cross-prefix=arm-linux-gnu-
make
./configure --arch=arm64 --cross-prefix=aarch64-linux-gnu-
make clean && make


But anyway I would suggest you do your 'make clean' or even a
'make distclean' *before* running configure the second time.
Otherwise you'll leave old object files around from the previously
configured arch. I.e. you want 'make clean' to apply to the
currently built config, not the new (not yet built) config.

drew

> 
> I would expect that after 'make clean', the object and binary files
> are removed. Instead they are not and I have to manually remove them
> before rebuilding. Running 'make clean' before and/or after
> reconfiguring still produces the same issue for me.
> 
> Thanks.
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] arm/arm64: KVM: arch timer: Reset CNTV_CTL to 0

2015-09-04 Thread Marc Zyngier
On 04/09/15 15:47, Christoffer Dall wrote:
> On Fri, Sep 04, 2015 at 04:24:39PM +0200, Christoffer Dall wrote:
>> Provide a better quality of implementation and be architecture compliant
>> on ARMv7 for the architected timer by resetting the CNTV_CTL to 0 on
>> reset of the timer, and call kvm_timer_update_state(vcpu) at the same
>> time, ensuring the timer output is not asserted after, for example, a
>> PSCI system reset.
> 
> forgot to remove the bit about kvm_timer_update_state(vcpu) which is no
> longer valid when these patches are merged before the rework series.
> 
> Marc, if you're otherwise happy with this patch, can you fix this up at
> commit time?

No problem, I can fix the commit message on applying the patch.

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [kvm-unit-tests PATCH 2/2] arm/arm64 config: Fix arch_clean rule

2015-09-04 Thread Alexander Spyridakis
On 4 September 2015 at 12:48, Andrew Jones  wrote:
> The lines this
> patch adds are already there. See the arm_clean target in
> config/config-arm-common.mak.

I see your point now. Maybe then "arm_clean" should be changed in
"arch_clean" in config-arm-common.mak?

Regards.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/5] KVM: arm64: Implement vGICv3 distributor and redistributor access from userspace

2015-09-04 Thread Andre Przywara
Hi Pavel,

On 04/09/15 13:40, Pavel Fedin wrote:
> The access is done similar to vGICv2, using KVM_DEV_ARM_VGIC_GRP_DIST_REGS
> and KVM_DEV_ARM_VGIC_GRP_REDIST_REGS with KVM_SET_DEVICE_ATTR and
> KVM_GET_DEVICE_ATTR ioctls. Since GICv3 can handle large number of CPUs,
> KVM_DEV_ARM_VGIC_CPUID_MASK has been extended to 20 bits. This is enough
> for 1048576 CPUs.

I guess the 20 bits come from 8 bits for Aff2 and Aff1 and 4-bits for
Aff0? If so, please mention this. But I am not sure we should limit the
cpu index in this public API to something as low 20 bits. Since this
mask is GIC specific, we could push the size bit into offset and use the
full upper 32 bits for cpuid, or at least 28 bits plus 4 reserved.

> 
> Some registers are 64-bit wide according to the specification.
> KVM_DEV_ARM_VGIC_64BIT flag is introduced, allowing to perform full 64-bit
> accesses.
> 
> Signed-off-by: Pavel Fedin 
> ---
>  Documentation/virtual/kvm/devices/arm-vgic.txt | 35 --
>  arch/arm64/include/uapi/asm/kvm.h  |  4 +-
>  virt/kvm/arm/vgic-v3-emul.c| 95 
> ++
>  virt/kvm/arm/vgic.c|  1 +
>  4 files changed, 116 insertions(+), 19 deletions(-)
> 
> diff --git a/Documentation/virtual/kvm/devices/arm-vgic.txt 
> b/Documentation/virtual/kvm/devices/arm-vgic.txt
> index 3fb9054..03f640f 100644
> --- a/Documentation/virtual/kvm/devices/arm-vgic.txt
> +++ b/Documentation/virtual/kvm/devices/arm-vgic.txt
> @@ -43,10 +43,13 @@ Groups:
>KVM_DEV_ARM_VGIC_GRP_DIST_REGS
>Attributes:
>  The attr field of kvm_device_attr encodes two values:
> -bits: | 63     40 | 39 ..  32  |  31   0 |
> -values:   |reserved   |   cpu id   |  offset |
> +bits: |  63  | 62 .. 52 | 51 ..  32  |  31   0 |
> +values:   | size | reserved |   cpu id   |  offset |
>  
>  All distributor regs are (rw, 32-bit)
> +For GICv3 some regs are also (rw, 64-bit) according to the specification.

That sounds contradictory to me. What about:
All registers can be accessed by using 32-bit accesses, some registers
also by 64-bit reads/writes according to the specification.

> +In order to perform full 64-bit access 'size' bit should be set to 1.
> +KVM_DEV_ARM_VGIC_64BIT flag value is provided for this purpose.
>  
>  The offset is relative to the "Distributor base address" as defined in 
> the
>  GICv2 specs.  Getting or setting such a register has the same effect as
> @@ -54,9 +57,33 @@ Groups:
>  specified with cpu id field.  Note that most distributor fields are not
>  banked, but return the same value regardless of the cpu id used to access
>  the register.
> +
> +  Limitations:
> +- Priorities are not implemented, and registers are RAZ/WI
> +  Errors:
> +-ENODEV: Getting or setting this register is not yet supported

Isn't that actually -ENXIO in the code? I see that this is just copy &
paste, but it should be fixed in either case.

> +-EBUSY: One or more VCPUs are running
> +
> +  KVM_DEV_ARM_VGIC_GRP_REDIST_REGS
> +  Attributes:
> +The attr field of kvm_device_attr encodes two values:
> +bits: |  63  | 62 .. 52 | 51 ..  32  |  31   0 |
> +values:   | size | reserved |   cpu id   |  offset |
> +
> +All redistributor regs are (rw, 32-bit)
> +For GICv3 some regs are also (rw, 64-bit) according to the specification.
> +In order to perform full 64-bit access 'size' bit should be set to 1.
> +KVM_DEV_ARM_VGIC_64BIT flag value is provided for this purpose.
> +
> +The offset is relative to the "Redistributor base address" as defined in
> +the GICv3 specs.  Getting or setting such a register has the same effect 
> as
> +reading or writing the register on the actual hardware from the cpu
> +specified with cpu id field.  Note that most distributor fields are not
> +banked, but return the same value regardless of the cpu id used to access
> +the register.
> +
>Limitations:
>  - Priorities are not implemented, and registers are RAZ/WI
> -- Currently only implemented for KVM_DEV_TYPE_ARM_VGIC_V2.
>Errors:
>  -ENODEV: Getting or setting this register is not yet supported
>  -EBUSY: One or more VCPUs are running
> @@ -64,7 +91,7 @@ Groups:
>KVM_DEV_ARM_VGIC_GRP_CPU_REGS
>Attributes:
>  The attr field of kvm_device_attr encodes two values:
> -bits: | 63     40 | 39 ..  32  |  31   0 |
> +bits: | 63     52 | 51 ..  32  |  31   0 |
>  values:   |reserved   |   cpu id   |  offset |
>  
>  All CPU interface regs are (rw, 32-bit)
> diff --git a/arch/arm64/include/uapi/asm/kvm.h 
> b/arch/arm64/include/uapi/asm/kvm.h
> index 0cd7b59..249954f 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -196,13 +196,15 @@ struct kvm_arch_memory_slot {
>  #define 

Re: [PATCH 12/14] arm64: Check for selected granule support

2015-09-04 Thread Catalin Marinas
On Wed, Sep 02, 2015 at 12:19:07PM +0200, Ard Biesheuvel wrote:
> On 2 September 2015 at 11:48, Ard Biesheuvel  
> wrote:
> > Couldn't we allocate some flag bits in the Image header to communicate
> > the page size to the bootloader?
> 
> Something like this perhaps?
> 
> 8<---
> diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
> index 7d9d3c2286b2..13a8aaa9a6e9 100644
> --- a/Documentation/arm64/booting.txt
> +++ b/Documentation/arm64/booting.txt
> @@ -104,7 +104,8 @@ Header notes:
>  - The flags field (introduced in v3.17) is a little-endian 64-bit field
>composed as follows:
>Bit 0:   Kernel endianness.  1 if BE, 0 if LE.
> -  Bits 1-63:   Reserved.
> +  Bits 1-2:Kernel page size.   0=unspecified, 1=4K, 2=16K, 3=64K
> +  Bits 3-63:   Reserved.
> 
>  - When image_size is zero, a bootloader should attempt to keep as much
>memory as possible free for use by the kernel immediately after the
> diff --git a/arch/arm64/kernel/image.h b/arch/arm64/kernel/image.h
> index 8fae0756e175..5def289bda84 100644
> --- a/arch/arm64/kernel/image.h
> +++ b/arch/arm64/kernel/image.h
> @@ -47,7 +47,9 @@
>  #define __HEAD_FLAG_BE 0
>  #endif
> 
> -#define __HEAD_FLAGS   (__HEAD_FLAG_BE << 0)
> +#define __HEAD_FLAG_PAGE_SIZE ((PAGE_SHIFT - 10) / 2)
> +
> +#define __HEAD_FLAGS   (__HEAD_FLAG_BE << 0) | (__HEAD_FLAG_PAGE_SIZE << 1)

I'm fine with this.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [kvm-unit-tests PATCH 2/2] arm/arm64 config: Fix arch_clean rule

2015-09-04 Thread Alexander Spyridakis
On 4 September 2015 at 16:05, Andrew Jones  wrote:
> This doesn't reproduce for me. I did the following, and it worked
> fine.
>
> make distclean
> ./configure --arch=arm --cross-prefix=arm-linux-gnu-
> make
> ./configure --arch=arm64 --cross-prefix=aarch64-linux-gnu-
> make clean && make

This is very troubling then as with the same commands the binaries are
still there. I need to check what's wrong with my setup, although I
managed to have the same issue on a newly created docker container...

Thanks for trying to replicate, I will investigate more.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [kvm-unit-tests PATCH 2/2] arm/arm64 config: Fix arch_clean rule

2015-09-04 Thread Alexander Spyridakis
On 4 September 2015 at 12:48, Andrew Jones  wrote:
> Are you also running configure (with the opposite arch selected) after
> 'make clean'? If not, then that could be the source of your problems.
> Anyway, please describe the issues you're seeing because I don't see
> what this patch is doing that isn't already being done. The lines this
> patch adds are already there. See the arm_clean target in
> config/config-arm-common.mak.

Steps to reproduce my issue:
> ./configure --arch=arm --cross-prefix=arm-linux-gnueabihf-
> make
> ./configure --arch=arm64 --cross-prefix=aarch64-linux-gnu-
> make clean && make
> arm/selftest.o: error adding symbols: File in wrong format

I would expect that after 'make clean', the object and binary files
are removed. Instead they are not and I have to manually remove them
before rebuilding. Running 'make clean' before and/or after
reconfiguring still produces the same issue for me.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [kvm-unit-tests PATCH 2/2] arm/arm64 config: Fix arch_clean rule

2015-09-04 Thread Peter Maydell
On 4 September 2015 at 15:05, Andrew Jones  wrote:
> But anyway I would suggest you do your 'make clean' or even a
> 'make distclean' *before* running configure the second time.
> Otherwise you'll leave old object files around from the previously
> configured arch. I.e. you want 'make clean' to apply to the
> currently built config, not the new (not yet built) config.

Separate object directories is usually the best approach if
you regularly want to build multiple configurations IMHO...

thanks
-- PMM
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-devel] [kvm-unit-tests PATCH 2/2] arm/arm64 config: Fix arch_clean rule

2015-09-04 Thread Andrew Jones
On Fri, Sep 04, 2015 at 03:53:46PM +0200, Alexander Spyridakis wrote:
> On 4 September 2015 at 12:48, Andrew Jones  wrote:
> > The lines this
> > patch adds are already there. See the arm_clean target in
> > config/config-arm-common.mak.
> 
> I see your point now. Maybe then "arm_clean" should be changed in
> "arch_clean" in config-arm-common.mak?

No, the dependency chain for 'make clean' is

clean
  arch_clean (this either arm's or arm64's)
arm_clean (this is for both arm and arm64)

> 
> Regards.
> 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Qemu-ppc] KVM memory slots limit on powerpc

2015-09-04 Thread Alex Williamson
On Fri, 2015-09-04 at 12:28 +0200, Thomas Huth wrote:
> On 04/09/15 12:07, Christian Borntraeger wrote:
> > Am 04.09.2015 um 12:04 schrieb Alexander Graf:
> >>
> >>
> >> On 04.09.15 11:59, Christian Borntraeger wrote:
> >>> Am 04.09.2015 um 11:35 schrieb Thomas Huth:
> 
>   Hi all,
> 
>  now that we get memory hotplugging for the spapr machine on qemu-ppc,
>  too, it seems like we easily can hit the amount of KVM-internal memory
>  slots now ("#define KVM_USER_MEM_SLOTS 32" in
>  arch/powerpc/include/asm/kvm_host.h). For example, start
>  qemu-system-ppc64 with a couple of "-device secondary-vga" and "-m
>  4G,slots=32,maxmem=40G" and then try to hot-plug all 32 DIMMs ... and
>  you'll see that it aborts way earlier already.
> 
>  The x86 code already increased the amount of KVM_USER_MEM_SLOTS to 509
>  already (+3 internal slots = 512) ... maybe we should now increase the
>  amount of slots on powerpc, too? Since we don't use internal slots on
>  POWER, would 512 be a good value? Or would less be sufficient, too?
> >>>
> >>> When you are at it, the s390 value should also be increased I guess.
> >>
> >> That constant defines the array size for the memslot array in struct kvm
> >> which in turn again gets allocated by kzalloc, so it's pinned kernel
> >> memory that is physically contiguous. Doing big allocations can turn
> >> into problems during runtime.
> >>
> >> So maybe there is another way? Can we extend the memslot array size
> >> dynamically somehow? Allocate it separately? How much memory does the
> >> memslot array use up with 512 entries?
> > 
> > Maybe some rcu protected scheme that doubles the amount of memslots for
> > each overrun? Yes, that would be good and even reduce the footprint for
> > systems with only a small number of memslots.
> 
> Seems like Alex Williamson already posted a patchset for growable
> memslots a couple of years ago:
> 
>  http://www.spinics.net/lists/kvm/msg50491.html
> 
> But I didn't quite spot the result in that thread why it never has been
> included upstream. Alex (W.), do you remember the outcome?

IIRC it was simply thought to be more complicated than necessary.  Once
we started caching memory slot misses, the O(N) search of a larger fixed
array didn't seem to bother anyone, so I abandoned the weight-balanced
tree, which I think still had some re-balancing issues.  Please run with
the code and make it work if it's useful now.  Thanks,

Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 3/5] KVM: arm64: Refactor system register handlers

2015-09-04 Thread Andre Przywara
Hi Pavel,

On 04/09/15 13:40, Pavel Fedin wrote:
> Replace Rt with data pointer in struct sys_reg_params. This will allow to
> reuse system register handling code in implementation of vGICv3 CPU
> interface access API. Additionally, got rid of "massive hack"
> in kvm_handle_cp_64().
> 
> Signed-off-by: Pavel Fedin 
> ---
>  arch/arm64/kvm/sys_regs.c| 61 
> +---
>  arch/arm64/kvm/sys_regs.h|  4 +--
>  arch/arm64/kvm/sys_regs_generic_v8.c |  2 +-
>  3 files changed, 32 insertions(+), 35 deletions(-)
> 
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index b41607d..fe6b517 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -102,7 +102,7 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu,
>  
>   BUG_ON(!p->is_write);
>  
> - val = *vcpu_reg(vcpu, p->Rt);
> + val = *p->val;
>   if (!p->is_aarch32) {
>   vcpu_sys_reg(vcpu, r->reg) = val;
>   } else {
> @@ -125,13 +125,10 @@ static bool access_gic_sgi(struct kvm_vcpu *vcpu,
>  const struct sys_reg_params *p,
>  const struct sys_reg_desc *r)
>  {
> - u64 val;
> -
>   if (!p->is_write)
>   return read_from_write_only(vcpu, p);
>  
> - val = *vcpu_reg(vcpu, p->Rt);
> - vgic_v3_dispatch_sgi(vcpu, val);
> + vgic_v3_dispatch_sgi(vcpu, *p->val);
>  
>   return true;
>  }
> @@ -153,7 +150,7 @@ static bool trap_oslsr_el1(struct kvm_vcpu *vcpu,
>   if (p->is_write) {
>   return ignore_write(vcpu, p);
>   } else {
> - *vcpu_reg(vcpu, p->Rt) = (1 << 3);
> + *p->val = (1 << 3);
>   return true;
>   }
>  }
> @@ -167,7 +164,7 @@ static bool trap_dbgauthstatus_el1(struct kvm_vcpu *vcpu,
>   } else {
>   u32 val;
>   asm volatile("mrs %0, dbgauthstatus_el1" : "=r" (val));
> - *vcpu_reg(vcpu, p->Rt) = val;
> + *p->val = val;
>   return true;
>   }
>  }
> @@ -204,13 +201,13 @@ static bool trap_debug_regs(struct kvm_vcpu *vcpu,
>   const struct sys_reg_desc *r)
>  {
>   if (p->is_write) {
> - vcpu_sys_reg(vcpu, r->reg) = *vcpu_reg(vcpu, p->Rt);
> + vcpu_sys_reg(vcpu, r->reg) = *p->val;
>   vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
>   } else {
> - *vcpu_reg(vcpu, p->Rt) = vcpu_sys_reg(vcpu, r->reg);
> + *p->val = vcpu_sys_reg(vcpu, r->reg);
>   }
>  
> - trace_trap_reg(__func__, r->reg, p->is_write, *vcpu_reg(vcpu, p->Rt));
> + trace_trap_reg(__func__, r->reg, p->is_write, *p->val);
>  
>   return true;
>  }
> @@ -228,7 +225,7 @@ static inline void reg_to_dbg(struct kvm_vcpu *vcpu,
> const struct sys_reg_params *p,
> u64 *dbg_reg)
>  {
> - u64 val = *vcpu_reg(vcpu, p->Rt);
> + u64 val = *p->val;
>  
>   if (p->is_32bit) {
>   val &= 0xUL;
> @@ -248,7 +245,7 @@ static inline void dbg_to_reg(struct kvm_vcpu *vcpu,
>   if (p->is_32bit)
>   val &= 0xUL;
>  
> - *vcpu_reg(vcpu, p->Rt) = val;
> + *p->val = val;
>  }
>  
>  static inline bool trap_bvr(struct kvm_vcpu *vcpu,
> @@ -704,10 +701,10 @@ static bool trap_dbgidr(struct kvm_vcpu *vcpu,
>   u64 pfr = read_cpuid(ID_AA64PFR0_EL1);
>   u32 el3 = !!((pfr >> 12) & 0xf);
>  
> - *vcpu_reg(vcpu, p->Rt) = dfr >> 20) & 0xf) << 28) |
> -   (((dfr >> 12) & 0xf) << 24) |
> -   (((dfr >> 28) & 0xf) << 20) |
> -   (6 << 16) | (el3 << 14) | (el3 << 
> 12));
> + *p->val = dfr >> 20) & 0xf) << 28) |
> +(((dfr >> 12) & 0xf) << 24) |
> +(((dfr >> 28) & 0xf) << 20) |
> +(6 << 16) | (el3 << 14) | (el3 << 12));
>   return true;
>   }
>  }
> @@ -717,10 +714,10 @@ static bool trap_debug32(struct kvm_vcpu *vcpu,
>const struct sys_reg_desc *r)
>  {
>   if (p->is_write) {
> - vcpu_cp14(vcpu, r->reg) = *vcpu_reg(vcpu, p->Rt);
> + vcpu_cp14(vcpu, r->reg) = *p->val;
>   vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
>   } else {
> - *vcpu_reg(vcpu, p->Rt) = vcpu_cp14(vcpu, r->reg);
> + *p->val = vcpu_cp14(vcpu, r->reg);
>   }
>  
>   return true;
> @@ -747,12 +744,12 @@ static inline bool trap_xvr(struct kvm_vcpu *vcpu,
>   u64 val = *dbg_reg;
>  
>   val &= 0xUL;
> - val |= *vcpu_reg(vcpu, p->Rt) << 32;
> + val |= *p->val << 32;
>   *dbg_reg = val;
>  
>   vcpu->arch.debug_flags |= KVM_ARM64_DEBUG_DIRTY;
>   } else {
> - 

RE: [PATCH v2 1/5] KVM: arm/arm64: Refactor vGIC attributes handling code

2015-09-04 Thread Pavel Fedin
 Hello!

> Isn't the len parameter redundant here? I see that you don't initialize
> mmio.len (which is a bit scary, btw), so can't you just use that field?

 This was because of split below. I did not know about call_range_handler(), 
and now i will redo
this.

> That (and other parts of this patch) sneak in some endianness handling,
> which I'd like to be mentioned in the commit message, but preferably be
> in a separate patch. The commit message here talks only about refactoring.

 These come from mmio_data_read() and mmio_data_write() in original 
vgic_attr_regs_access().
These inlines cannot be used with arbitrary data length, so i opened them up 
(they contain
endianness conversion plus masking which isn't used in our case) and moved 
endianness conversion to
load/store part.
 If i make this a separate patch, it will be two lines patch. Does it worth 
that? In the next respin
i'd better add this explanation to commit message. Would it be OK?

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 4/5] KVM: arm64: Introduce find_reg_by_id()

2015-09-04 Thread Andre Przywara
On 04/09/15 13:40, Pavel Fedin wrote:
> In order to implement vGICv3 CPU interface access, we will need to perform
> table lookup of system registers. We would need both index_to_params() and
> find_reg() exported for that purpose, but instead we export a single
> function which combines them both.
> 
> Signed-off-by: Pavel Fedin 

Reviewed-by: Andre Przywara 

> ---
>  arch/arm64/kvm/sys_regs.c | 22 +++---
>  arch/arm64/kvm/sys_regs.h |  4 
>  2 files changed, 19 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index fe6b517..21403fa 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -1283,6 +1283,17 @@ static bool index_to_params(u64 id, struct 
> sys_reg_params *params)
>   }
>  }
>  
> +const struct sys_reg_desc *find_reg_by_id(u64 id,
> +   struct sys_reg_params *params,
> +   const struct sys_reg_desc table[],
> +   unsigned int num)
> +{
> + if (!index_to_params(id, params))
> + return NULL;
> +
> + return find_reg(params, table, num);
> +}
> +
>  /* Decode an index value, and find the sys_reg_desc entry. */
>  static const struct sys_reg_desc *index_to_sys_reg_desc(struct kvm_vcpu 
> *vcpu,
>   u64 id)
> @@ -1410,10 +1421,8 @@ static int get_invariant_sys_reg(u64 id, void __user 
> *uaddr)
>   struct sys_reg_params params;
>   const struct sys_reg_desc *r;
>  
> - if (!index_to_params(id, ))
> - return -ENOENT;
> -
> - r = find_reg(, invariant_sys_regs, 
> ARRAY_SIZE(invariant_sys_regs));
> + r = find_reg_by_id(id, , invariant_sys_regs,
> +ARRAY_SIZE(invariant_sys_regs));
>   if (!r)
>   return -ENOENT;
>  
> @@ -1427,9 +1436,8 @@ static int set_invariant_sys_reg(u64 id, void __user 
> *uaddr)
>   int err;
>   u64 val = 0; /* Make sure high bits are 0 for 32-bit regs */
>  
> - if (!index_to_params(id, ))
> - return -ENOENT;
> - r = find_reg(, invariant_sys_regs, 
> ARRAY_SIZE(invariant_sys_regs));
> + r = find_reg_by_id(id, , invariant_sys_regs,
> +ARRAY_SIZE(invariant_sys_regs));
>   if (!r)
>   return -ENOENT;
>  
> diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
> index 3267518..0646108 100644
> --- a/arch/arm64/kvm/sys_regs.h
> +++ b/arch/arm64/kvm/sys_regs.h
> @@ -136,6 +136,10 @@ static inline int cmp_sys_reg(const struct sys_reg_desc 
> *i1,
>   return i1->Op2 - i2->Op2;
>  }
>  
> +const struct sys_reg_desc *find_reg_by_id(u64 id,
> +   struct sys_reg_params *params,
> +   const struct sys_reg_desc table[],
> +   unsigned int num);
>  
>  #define Op0(_x)  .Op0 = _x
>  #define Op1(_x)  .Op1 = _x
> 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v3 2/5] KVM: arm64: Implement vGICv3 distributor and redistributor access from userspace

2015-09-04 Thread Pavel Fedin
 Hello!

> > The access is done similar to vGICv2, using KVM_DEV_ARM_VGIC_GRP_DIST_REGS
> > and KVM_DEV_ARM_VGIC_GRP_REDIST_REGS with KVM_SET_DEVICE_ATTR and
> > KVM_GET_DEVICE_ATTR ioctls. Since GICv3 can handle large number of CPUs,
> > KVM_DEV_ARM_VGIC_CPUID_MASK has been extended to 20 bits. This is enough
> > for 1048576 CPUs.
> 
> I guess the 20 bits come from 8 bits for Aff2 and Aff1 and 4-bits for
> Aff0? If so, please mention this. But I am not sure we should limit the
> cpu index in this public API

 The documentation is confusing. It is actually not the affinity ID, but just 
number from 0 to N.
See http://lxr.free-electrons.com/source/include/linux/kvm_host.h#L427 - this 
is the function which
gets this "id" and returns vcpu structure.
 So, we can have up to 1 << 20 = 1048576 CPUs. Enough?
 If you are OK with this, i can post a separate patch to fix the documentation, 
or include it in
respin.
 
> we could push the size bit into offset and use the
> full upper 32 bits for cpuid, or at least 28 bits plus 4 reserved.

 Actually 20 bits comes from ARM64_SYS_REG() macro, which i reuse for 
KVM_DEV_ARM_VGIC_GRP_CPU_REGS:
--- cut ---
bits:   | 63 .. 56 | 55 .. 48 | 47 ... 40 | 39 .. 32 | 31 .. 0 |
values: |   arch   |   size   | reserved  |  cpu id  |  reg id |
--- cut ---
 arch = KVM_REG_ARM64 and size = KVM_REG_SIZE_U64. I decided not to invent own 
macro.


Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/2] arm/arm64: KVM: Fix arthictected timer issues

2015-09-04 Thread Marc Zyngier
Hi Christoffer,

On 04/09/15 15:24, Christoffer Dall wrote:
> These two patches fix two separate issues with the architected timer and
> the corresponding interrupt injection to VMs on KVM/ARM.
> 
> The first patch fixes an issue introduced with the active timer state
> switching series recently merged for v4.3, which could cause a guest to
> loop without progress if another VCPU is run on the same physical CPU
> and preempts the original VCPU while the guest is running the ISR for
> the timer interrupt.
> 
> The second patch resets the architected timer's control register to zero
> on system reset, ensuring that interrupts are not injected when a system
> resets.  This fixes a long-standing issue with UEFI, where soft reset
> initiated from within UEFI prevented the system from booting again.

Thanks for respinning these patches. I've queued them in our -next
queue. I'll send this to Paolo some time next week, once they get some
hammering.

M.
-- 
Jazz is not dead. It just smells funny...
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/2] arm/arm64: KVM: Fix arthictected timer issues

2015-09-04 Thread Christoffer Dall
On Fri, Sep 04, 2015 at 04:35:20PM +0100, Marc Zyngier wrote:
> Hi Christoffer,
> 
> On 04/09/15 15:24, Christoffer Dall wrote:
> > These two patches fix two separate issues with the architected timer and
> > the corresponding interrupt injection to VMs on KVM/ARM.
> > 
> > The first patch fixes an issue introduced with the active timer state
> > switching series recently merged for v4.3, which could cause a guest to
> > loop without progress if another VCPU is run on the same physical CPU
> > and preempts the original VCPU while the guest is running the ISR for
> > the timer interrupt.
> > 
> > The second patch resets the architected timer's control register to zero
> > on system reset, ensuring that interrupts are not injected when a system
> > resets.  This fixes a long-standing issue with UEFI, where soft reset
> > initiated from within UEFI prevented the system from booting again.
> 
> Thanks for respinning these patches. I've queued them in our -next
> queue. I'll send this to Paolo some time next week, once they get some
> hammering.
> 
Awesome, fyi I left these patches running in a loop with reboots and
running hackbench on 2 simultaneous VMs with 4 vcpus each on X-gene for
100 iterations, and left the guest and host run cyclictest for ~20
minutes without issues.

-Christoffer
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] KVM: arm/arm64: Fix PSCI affinity info return value for non valid cores

2015-09-04 Thread Mark Rutland
On Fri, Sep 04, 2015 at 04:06:24PM +0100, Alexander Spyridakis wrote:
> If a guest requests the affinity info for a non-existing vCPU we need to
> properly return an error, instead of erroneously reporting an off state.
> 
> Signed-off-by: Alexander Spyridakis 
> Signed-off-by: Alvise Rigo 

This looks correct to me per the PSCI 0.2 spec:

Acked-by: Mark Rutland 

Was this spotted by inspection, or do you have some client for which
this is important?

Thanks,
Mark.

> ---
>  arch/arm/kvm/psci.c | 12 
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm/kvm/psci.c b/arch/arm/kvm/psci.c
> index 4b94b51..ad6f642 100644
> --- a/arch/arm/kvm/psci.c
> +++ b/arch/arm/kvm/psci.c
> @@ -126,7 +126,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu 
> *source_vcpu)
>  
>  static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
>  {
> - int i;
> + int i, matching_cpus = 0;
>   unsigned long mpidr;
>   unsigned long target_affinity;
>   unsigned long target_affinity_mask;
> @@ -151,12 +151,16 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct 
> kvm_vcpu *vcpu)
>*/
>   kvm_for_each_vcpu(i, tmp, kvm) {
>   mpidr = kvm_vcpu_get_mpidr_aff(tmp);
> - if (((mpidr & target_affinity_mask) == target_affinity) &&
> - !tmp->arch.pause) {
> - return PSCI_0_2_AFFINITY_LEVEL_ON;
> + if ((mpidr & target_affinity_mask) == target_affinity) {
> + matching_cpus++;
> + if (!tmp->arch.pause)
> + return PSCI_0_2_AFFINITY_LEVEL_ON;
>   }
>   }
>  
> + if (!matching_cpus)
> + return PSCI_RET_INVALID_PARAMS;
> +
>   return PSCI_0_2_AFFINITY_LEVEL_OFF;
>  }
>  
> -- 
> 2.5.1
> 
> ___
> kvmarm mailing list
> kvm...@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
> 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] KVM: arm/arm64: Fix PSCI affinity info return value for non valid cores

2015-09-04 Thread Alexander Spyridakis
If a guest requests the affinity info for a non-existing vCPU we need to
properly return an error, instead of erroneously reporting an off state.

Signed-off-by: Alexander Spyridakis 
Signed-off-by: Alvise Rigo 
---
 arch/arm/kvm/psci.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/arm/kvm/psci.c b/arch/arm/kvm/psci.c
index 4b94b51..ad6f642 100644
--- a/arch/arm/kvm/psci.c
+++ b/arch/arm/kvm/psci.c
@@ -126,7 +126,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu 
*source_vcpu)
 
 static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
 {
-   int i;
+   int i, matching_cpus = 0;
unsigned long mpidr;
unsigned long target_affinity;
unsigned long target_affinity_mask;
@@ -151,12 +151,16 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct 
kvm_vcpu *vcpu)
 */
kvm_for_each_vcpu(i, tmp, kvm) {
mpidr = kvm_vcpu_get_mpidr_aff(tmp);
-   if (((mpidr & target_affinity_mask) == target_affinity) &&
-   !tmp->arch.pause) {
-   return PSCI_0_2_AFFINITY_LEVEL_ON;
+   if ((mpidr & target_affinity_mask) == target_affinity) {
+   matching_cpus++;
+   if (!tmp->arch.pause)
+   return PSCI_0_2_AFFINITY_LEVEL_ON;
}
}
 
+   if (!matching_cpus)
+   return PSCI_RET_INVALID_PARAMS;
+
return PSCI_0_2_AFFINITY_LEVEL_OFF;
 }
 
-- 
2.5.1

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 5/5] KVM: arm64: Implement vGICv3 CPU interface access

2015-09-04 Thread Andre Przywara
Hi Pavel,

On 04/09/15 13:40, Pavel Fedin wrote:
> The access is done similar to GICv2, using KVM_DEV_ARM_VGIC_GRP_CPU_REGS
> group, however attribute ID encodes corresponding system register. Access
> size is always 64 bits.

Why is this? Actually all registers in the CPU interface (except the w/o
SGI registers) are 32 bits and in the pending 32-bit GICv3 support
series[1] this is exploited by using MRC/MCR accesses.
The only thing that is pure 64-bit is the MRS/MSR _instruction_ in
Aarch64, which always takes a x register.
So can you model the register size according to the spec and allow
32-bit accesses from userland?

> Since CPU interface state actually affects only a single vCPU, no vGIC
> locking is done. Just made sure that the vCPU is not running.
> 
> Signed-off-by: Pavel Fedin 
> ---
>  Documentation/virtual/kvm/devices/arm-vgic.txt |  38 +++-
>  arch/arm64/include/uapi/asm/kvm.h  |   7 +
>  include/linux/irqchip/arm-gic-v3.h |  18 +-
>  virt/kvm/arm/vgic-v3-emul.c| 244 
> +
>  4 files changed, 303 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/virtual/kvm/devices/arm-vgic.txt 
> b/Documentation/virtual/kvm/devices/arm-vgic.txt
> index 03f640f..518b634 100644
> --- a/Documentation/virtual/kvm/devices/arm-vgic.txt
> +++ b/Documentation/virtual/kvm/devices/arm-vgic.txt
> @@ -88,7 +88,7 @@ Groups:
>  -ENODEV: Getting or setting this register is not yet supported
>  -EBUSY: One or more VCPUs are running
> 
> -  KVM_DEV_ARM_VGIC_GRP_CPU_REGS
> +  KVM_DEV_ARM_VGIC_GRP_CPU_REGS (vGICv2)
>Attributes:
>  The attr field of kvm_device_attr encodes two values:
>  bits: | 63     52 | 51 ..  32  |  31   0 |
> @@ -116,11 +116,45 @@ Groups:
> 
>Limitations:
>  - Priorities are not implemented, and registers are RAZ/WI
> -- Currently only implemented for KVM_DEV_TYPE_ARM_VGIC_V2.
>Errors:
>  -ENODEV: Getting or setting this register is not yet supported
>  -EBUSY: One or more VCPUs are running
> 
> +  KVM_DEV_ARM_VGIC_GRP_CPU_REGS (vGICv3)
> +  Attributes:
> +The attr field of kvm_device_attr encodes the following values:
> +bits:   | 63 .. 56 | 55 .. 48 | 47 ... 40 | 39 .. 32 | 31 .. 0 |
> +values: |   arch   |   size   | reserved  |  cpu id  |  reg id |
> +
> +All CPU interface regs are (rw, 64-bit). The only supported size value is
> +KVM_REG_SIZE_U64.
> +
> +Arch, size and reg id fields actually encode system register to be
> +accessed. Normally these values are obtained using  ARM64_SYS_REG() 
> macro.
> +Getting or setting such a register has the same effect as reading or
> +writing the register on the actual hardware.
> +
> +The Active Priorities Registers AP0Rn and AP1Rn are implementation 
> defined,
> +so we set a fixed format for our implementation that fits with the model 
> of
> +a "GICv3 implementation without the security extensions" which we present
> +to the guest. This interface always exposes four register APR[0-3]
> +describing the maximum possible 128 preemption levels. The semantics of 
> the
> +register indicates if any interrupts in a given preemption level are in 
> the
> +active state by setting the corresponding bit.
> +
> +Thus, preemption level X has one or more active interrupts if and only 
> if:
> +
> +  APRn[X mod 32] == 0b1,  where n = X / 32
> +
> +Bits for undefined preemption levels are RAZ/WI.
> +
> +  Limitations:
> +- Priorities are not implemented, and registers are RAZ/WI
> +  Errors:
> +-ENODEV: Getting or setting this register is not yet supported

The code uses -ENXIO.

> +-EBUSY: One or more VCPUs are running
> +
> +
>KVM_DEV_ARM_VGIC_GRP_NR_IRQS
>Attributes:
>  A value describing the number of interrupts (SGI, PPI and SPI) for
> diff --git a/arch/arm64/include/uapi/asm/kvm.h 
> b/arch/arm64/include/uapi/asm/kvm.h
> index 249954f..7d37ccd 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -201,6 +201,13 @@ struct kvm_arch_memory_slot {
>  #define   KVM_DEV_ARM_VGIC_CPUID_MASK  (0xfULL << 
> KVM_DEV_ARM_VGIC_CPUID_SHIFT)
>  #define   KVM_DEV_ARM_VGIC_OFFSET_SHIFT0
>  #define   KVM_DEV_ARM_VGIC_OFFSET_MASK (0xULL << 
> KVM_DEV_ARM_VGIC_OFFSET_SHIFT)
> +#define   KVM_DEV_ARM_VGIC_REG_MASK(KVM_REG_SIZE_MASK | \
> +KVM_REG_ARM64_SYSREG_OP0_MASK | \
> +KVM_REG_ARM64_SYSREG_OP1_MASK | \
> +KVM_REG_ARM64_SYSREG_CRN_MASK | \
> +KVM_REG_ARM64_SYSREG_CRM_MASK | \
> +KVM_REG_ARM64_SYSREG_OP2_MASK)
> +
>  #define KVM_DEV_ARM_VGIC_GRP_NR_IRQS   3
>  #define KVM_DEV_ARM_VGIC_GRP_CTRL  4
>  #define   KVM_DEV_ARM_VGIC_CTRL_INIT   0
> diff --git 

Re: [PATCH 6/9] arm/arm64: KVM: Add mapped interrupts documentation

2015-09-04 Thread Christoffer Dall
On Thu, Sep 03, 2015 at 05:56:26PM +0200, Eric Auger wrote:
> Hi Christoffer,
> On 09/03/2015 05:23 PM, Marc Zyngier wrote:
> > On 30/08/15 14:54, Christoffer Dall wrote:
> >> Mapped interrupts on arm/arm64 is a tricky concept and the way we deal
> >> with them is not apparently easy to understand by reading various specs.
> >>
> >> Therefore, add a proper documentation file explaining the flow and
> >> rationale of the behavior of the vgic.
> >>
> >> Some of this text was contributed by Marc Zyngier.
> >>
> >> Signed-off-by: Christoffer Dall 
> >> ---
> >>  Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt | 59 
> >> ++
> >>  1 file changed, 59 insertions(+)
> >>  create mode 100644 Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
> >>
> >> diff --git a/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt 
> >> b/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
> >> new file mode 100644
> >> index 000..49e1357
> >> --- /dev/null
> >> +++ b/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
> >> @@ -0,0 +1,59 @@
> >> +KVM/ARM VGIC Mapped Interrupts
> >> +==
> >> +
> >> +Setting the Physical Active State for Edge vs. Level Triggered IRQs
> >> +---
> >> +
> >> +Mapped non-shared interrupts injected to a guest should always mark the
> >> +interrupt as active on the physical distributor.
> When injecting the virtual IRQ associated to the mapped=forwarded IRQ
> (see next comment), the host must not deactivate the physical IRQ so
> that its active state remains?

almost, but this is not the case for the timer, where we set the active
state manually, I'll attempt a reword.

> >> +
> >> +The reasoning for level-triggered interrupts:
> >> +For level-triggered interrupts, we have to mark the interrupt as active
> >> +on the physical distributor,
> to leave the interrupt as active? I have the impression you talk about
> shared IRQ here where the HW would not have any impact on the physical
> distributor state? The physical IRQ can be pending+active too?
>  because otherwise, as the line remains

I'm talking about any level-triggered interrupt where you handle the
interrupt in the guest.  In that case, if you keep deactivating the
interrupt on the host the guest will never make progress.

This is not specific to the timer I think?

Yes, the physical IRQ can be pending+active, but the key is that the
host doesn't deactivate the IRQ.

> >> +asserted, the guest will never execute because the host will keep taking
> >> +interrupts.  As soon as the guest deactivates the interrupt, the
> >> +physical line is sampled by the hardware again and the host takes a new
> >> +interrupt if the physical line is still asserted.
> >> +
> >> +The reasoning for edge-triggered interrupts:
> >> +For edge-triggered interrupts, if we set the HW bit in the LR we also
> >> +have to mark the interrupt as active on the physical distributor.  If we
> >> +don't set the physical active bit and the interrupt hits again before
> >> +the guest has deactivated the interrupt, the interrupt goes to the host,
> >> +which cannot set the state to ACTIVE+PENDING in the LR, because that is
> >> +not supported when setting the HW bit in the LR.
> >> +
> >> +An alternative could be to not use HW bit at all, and inject
> >> +edge-triggered interrupts from a physical assigned device as pure
> >> +virtual interrupts, but that would potentially slow down handling of the
> >> +interrupt in the guest, because a physical interrupt occurring in the
> >> +middle of the guest ISR would preempt the guest for the host to handle
> >> +the interrupt.
> > 
> > It would be worth mentioning that this is valid for PPIs and SPIs. LPIs
> > do not have an Active state (they are either Pending or not), so we'll
> > have to deal with edge interrupts as you just described at some point.
> > Other architectures do something similar, I'd expect.
> > 
> >> +
> >> +
> >> +Life Cycle for Forwarded Physical Interrupts
> >> +
> >> +
> >> +By forwarded physical interrupts we mean interrupts presented to a guest
> >> +representing a real HW event originally signaled to the host as a
> > 
> > s/signaled/signalled/
> > 
> >> +physical interrupt
> is it always true for the timer? sometimes isn't it a SW counter that
> expires and upon that event you inject the virtual IRQ with HW bit set?
>  and injecting this as a virtual interrupt with the HW

well you restore the timer state at the same time, so you setup the
hardware exactly as if the timer device raised a physical interrupt
while running the VM.  (in fact that's why we set the active state as
well, because we re-program the arch-timer to assert the line when we
enter the guest).

> >> +bit set in the LR.
> another definition of a forwarded/mapped physical IRQ is a physical IRQ
> that is deactivated by the guest and not by the host.

I was deliberately 

Re: [PATCH 6/9] arm/arm64: KVM: Add mapped interrupts documentation

2015-09-04 Thread Christoffer Dall
On Thu, Sep 03, 2015 at 04:23:04PM +0100, Marc Zyngier wrote:
> On 30/08/15 14:54, Christoffer Dall wrote:
> > Mapped interrupts on arm/arm64 is a tricky concept and the way we deal
> > with them is not apparently easy to understand by reading various specs.
> > 
> > Therefore, add a proper documentation file explaining the flow and
> > rationale of the behavior of the vgic.
> > 
> > Some of this text was contributed by Marc Zyngier.
> > 
> > Signed-off-by: Christoffer Dall 
> > ---
> >  Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt | 59 
> > ++
> >  1 file changed, 59 insertions(+)
> >  create mode 100644 Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
> > 
> > diff --git a/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt 
> > b/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
> > new file mode 100644
> > index 000..49e1357
> > --- /dev/null
> > +++ b/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
> > @@ -0,0 +1,59 @@
> > +KVM/ARM VGIC Mapped Interrupts
> > +==
> > +
> > +Setting the Physical Active State for Edge vs. Level Triggered IRQs
> > +---
> > +
> > +Mapped non-shared interrupts injected to a guest should always mark the
> > +interrupt as active on the physical distributor.
> > +
> > +The reasoning for level-triggered interrupts:
> > +For level-triggered interrupts, we have to mark the interrupt as active
> > +on the physical distributor, because otherwise, as the line remains
> > +asserted, the guest will never execute because the host will keep taking
> > +interrupts.  As soon as the guest deactivates the interrupt, the
> > +physical line is sampled by the hardware again and the host takes a new
> > +interrupt if the physical line is still asserted.
> > +
> > +The reasoning for edge-triggered interrupts:
> > +For edge-triggered interrupts, if we set the HW bit in the LR we also
> > +have to mark the interrupt as active on the physical distributor.  If we
> > +don't set the physical active bit and the interrupt hits again before
> > +the guest has deactivated the interrupt, the interrupt goes to the host,
> > +which cannot set the state to ACTIVE+PENDING in the LR, because that is
> > +not supported when setting the HW bit in the LR.
> > +
> > +An alternative could be to not use HW bit at all, and inject
> > +edge-triggered interrupts from a physical assigned device as pure
> > +virtual interrupts, but that would potentially slow down handling of the
> > +interrupt in the guest, because a physical interrupt occurring in the
> > +middle of the guest ISR would preempt the guest for the host to handle
> > +the interrupt.
> 
> It would be worth mentioning that this is valid for PPIs and SPIs. LPIs
> do not have an Active state (they are either Pending or not), so we'll
> have to deal with edge interrupts as you just described at some point.
> Other architectures do something similar, I'd expect.
> 
> > +
> > +
> > +Life Cycle for Forwarded Physical Interrupts
> > +
> > +
> > +By forwarded physical interrupts we mean interrupts presented to a guest
> > +representing a real HW event originally signaled to the host as a
> 
> s/signaled/signalled/
> 
> > +physical interrupt and injecting this as a virtual interrupt with the HW
> > +bit set in the LR.
> > +
> > +The state of such an interrupt is managed in the following way:
> > +
> > +  - LR.Pending must be set when the interrupt is first injected, because 
> > this
> > +is the only way the GICV interface is going to present it to the guest.
> > +  - LR.Pending will stay set as long as the guest has not acked the 
> > interrupt.
> > +  - LR.Pending transitions to LR.Active on read of IAR, as expected.
> > +  - On EOI, the *physical distributor* active bit gets cleared, but the
> > +LR.Active is left untouched - it looks like the GIC can only clear a
> > +single bit (either the virtual active, or the physical one).
> > +  - This means we cannot trust LR.Active to find out about the state of the
> > +interrupt, and we definitely need to look at the distributor version.
> > +
> > +Consequently, when we context switch the state of a VCPU with forwarded
> > +physical interrupts, we must context switch set pending *or* active bits 
> > in the
> > +LR for that VCPU until the guest has deactivated the physical interrupt, 
> > and
> > +then clear the corresponding bits in the LR.  If we ever set an LR to 
> > pending or
> > +mapped when switching in a VCPU for a forwarded physical interrupt, we 
> > must also
> > +set the active state on the *physical distributor*.
> > 
> 
> I wonder if it may be worth adding a small example with the timer,
> because it is not immediately obvious why the interrupt would fire on
> and on without putting the generating device in the picture...
> 
Yes, probably.

I'll try to work both yours and Eric's comments into 

Re: [PATCH 5/9] arm/arm64: KVM: Use appropriate define in VGIC reset code

2015-09-04 Thread Eric Auger
On 08/30/2015 03:54 PM, Christoffer Dall wrote:
> We currently initialize the SGIs to be enabled in the VGIC code, but we
> use the VGIC_NR_PPIS define for this purpose, instead of the the more
s/the//

Eric
> natural VGIC_NR_SGIS.  Change this slightly confusing use of the
> defines.
> 
> Note: This should have no functional change, as both names are defined
> to the number 16.
> 
> Signed-off-by: Christoffer Dall 
> ---
>  virt/kvm/arm/vgic.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
> index 0ba92d3..8299c24 100644
> --- a/virt/kvm/arm/vgic.c
> +++ b/virt/kvm/arm/vgic.c
> @@ -2099,7 +2099,7 @@ int vgic_init(struct kvm *kvm)
>   }
>  
>   for (i = 0; i < dist->nr_irqs; i++) {
> - if (i < VGIC_NR_PPIS)
> + if (i < VGIC_NR_SGIS)
>   vgic_bitmap_set_irq_val(>irq_enabled,
>   vcpu->vcpu_id, i, 1);
>   if (i < VGIC_NR_PRIVATE_IRQS)
> 

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] KVM: arm/arm64: Fix PSCI affinity info return value for non valid cores

2015-09-04 Thread Alexander Spyridakis
On 4 September 2015 at 17:54, Marc Zyngier  wrote:
> Seems sound to me. Out of curiosity, how was this found?

Hello Mark,

This issue came up when testing some basic bare metal tests with KVM,
which originally were meant for the multi-threading TCG effort
(https://git.virtualopensystems.com/dev/tcg_baremetal_tests.git).
While shutting down the system the test relies on PSCI to find out if
all secondary cores are powered-off by incrementally polling PSCI for
every cpu id. In the case of QEMU/TCG everything was fine, but with
KVM there was no error response for non-existing cores, and so the
system could not shut down properly.

Kind regards.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v3 3/5] KVM: arm64: Refactor system register handlers

2015-09-04 Thread Pavel Fedin
 Hello!

> Mmmh, not sure about this, but I guess u64 would be more precise here?
> With the Rt2 handling below we rely on this being 64 bits anyway. But I
> guess that breaks with assigning vcpu_reg later and since this is arm64
> only, long is always 64 bits?

 It is because ' params.val = '. And params.val is u_long * because of: 
'params.val =
vcpu_reg(vcpu, Rt)'. vcpu_reg() is declared as unsigned long * :
http://lxr.free-electrons.com/source/arch/arm64/include/asm/kvm_emulate.h#L102
 Initially i declared params.val as u64 *, but got warning in 'params.val = 
vcpu_reg(vcpu, Rt)'. So
i decided to change it to u_long * instead of having casts here and there. Do 
you like casts?

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v3 5/5] KVM: arm64: Implement vGICv3 CPU interface access

2015-09-04 Thread Pavel Fedin
 Hello!

> So can you model the register size according to the spec and allow
> 32-bit accesses from userland?

 I can. But i'll have to invent my own macro for encoding register IDs into the 
attribute, as well
as drop reusing index_to_params(). Will it be OK?
 Upside: i can further extend "cpu id" (actually CPU index) field to 31 bit. :)

 It's friday evening here, work week is almost over, i'll read your replies 2 
days later on monday.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 1/5] KVM: arm/arm64: Refactor vGIC attributes handling code

2015-09-04 Thread Andre Przywara
Hi,

On 04/09/15 16:11, Pavel Fedin wrote:
>  Hello!
> 
>> Isn't the len parameter redundant here? I see that you don't initialize
>> mmio.len (which is a bit scary, btw), so can't you just use that field?
> 
>  This was because of split below. I did not know about call_range_handler(), 
> and now i will redo
> this.
> 
>> That (and other parts of this patch) sneak in some endianness handling,
>> which I'd like to be mentioned in the commit message, but preferably be
>> in a separate patch. The commit message here talks only about refactoring.
> 
>  These come from mmio_data_read() and mmio_data_write() in original 
> vgic_attr_regs_access().
> These inlines cannot be used with arbitrary data length, so i opened them up 
> (they contain
> endianness conversion plus masking which isn't used in our case) and moved 
> endianness conversion to
> load/store part.
>  If i make this a separate patch, it will be two lines patch. Does it worth 
> that? In the next respin
> i'd better add this explanation to commit message. Would it be OK?

>From a review (and later bisecting) point of view separate patches would
be better. Ideally the refactoring does not introduce any change except
code moving around.

Cheers,
Andre.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/9] arm/arm64: KVM: Add mapped interrupts documentation

2015-09-04 Thread Marc Zyngier
On 04/09/15 16:57, Christoffer Dall wrote:
> On Thu, Sep 03, 2015 at 04:23:04PM +0100, Marc Zyngier wrote:
>> On 30/08/15 14:54, Christoffer Dall wrote:
>>> +
>>> +
>>> +Life Cycle for Forwarded Physical Interrupts
>>> +
>>> +
>>> +By forwarded physical interrupts we mean interrupts presented to a guest
>>> +representing a real HW event originally signaled to the host as a
>>
>> s/signaled/signalled/
>>
> Actaully this was my first version as well, but aspell told me it was
> spelled signaled.
> 
> Turns out it's mostly acceptable to use both spellings:
> 
> http://www.merriam-webster.com/dictionary/signaled

I stand corrected! :-)

M.
-- 
Jazz is not dead. It just smells funny...
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/9] arm/arm64: KVM: Add mapped interrupts documentation

2015-09-04 Thread Christoffer Dall
On Thu, Sep 03, 2015 at 04:23:04PM +0100, Marc Zyngier wrote:
> On 30/08/15 14:54, Christoffer Dall wrote:
> > Mapped interrupts on arm/arm64 is a tricky concept and the way we deal
> > with them is not apparently easy to understand by reading various specs.
> > 
> > Therefore, add a proper documentation file explaining the flow and
> > rationale of the behavior of the vgic.
> > 
> > Some of this text was contributed by Marc Zyngier.
> > 
> > Signed-off-by: Christoffer Dall 
> > ---
> >  Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt | 59 
> > ++
> >  1 file changed, 59 insertions(+)
> >  create mode 100644 Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
> > 
> > diff --git a/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt 
> > b/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
> > new file mode 100644
> > index 000..49e1357
> > --- /dev/null
> > +++ b/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
> > @@ -0,0 +1,59 @@
> > +KVM/ARM VGIC Mapped Interrupts
> > +==
> > +
> > +Setting the Physical Active State for Edge vs. Level Triggered IRQs
> > +---
> > +
> > +Mapped non-shared interrupts injected to a guest should always mark the
> > +interrupt as active on the physical distributor.
> > +
> > +The reasoning for level-triggered interrupts:
> > +For level-triggered interrupts, we have to mark the interrupt as active
> > +on the physical distributor, because otherwise, as the line remains
> > +asserted, the guest will never execute because the host will keep taking
> > +interrupts.  As soon as the guest deactivates the interrupt, the
> > +physical line is sampled by the hardware again and the host takes a new
> > +interrupt if the physical line is still asserted.
> > +
> > +The reasoning for edge-triggered interrupts:
> > +For edge-triggered interrupts, if we set the HW bit in the LR we also
> > +have to mark the interrupt as active on the physical distributor.  If we
> > +don't set the physical active bit and the interrupt hits again before
> > +the guest has deactivated the interrupt, the interrupt goes to the host,
> > +which cannot set the state to ACTIVE+PENDING in the LR, because that is
> > +not supported when setting the HW bit in the LR.
> > +
> > +An alternative could be to not use HW bit at all, and inject
> > +edge-triggered interrupts from a physical assigned device as pure
> > +virtual interrupts, but that would potentially slow down handling of the
> > +interrupt in the guest, because a physical interrupt occurring in the
> > +middle of the guest ISR would preempt the guest for the host to handle
> > +the interrupt.
> 
> It would be worth mentioning that this is valid for PPIs and SPIs. LPIs
> do not have an Active state (they are either Pending or not), so we'll
> have to deal with edge interrupts as you just described at some point.
> Other architectures do something similar, I'd expect.
> 
> > +
> > +
> > +Life Cycle for Forwarded Physical Interrupts
> > +
> > +
> > +By forwarded physical interrupts we mean interrupts presented to a guest
> > +representing a real HW event originally signaled to the host as a
> 
> s/signaled/signalled/
> 
Actaully this was my first version as well, but aspell told me it was
spelled signaled.

Turns out it's mostly acceptable to use both spellings:

http://www.merriam-webster.com/dictionary/signaled

-Christoffer
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 5/8] arm/arm64: KVM: Use appropriate define in VGIC reset code

2015-09-04 Thread Christoffer Dall
We currently initialize the SGIs to be enabled in the VGIC code, but we
use the VGIC_NR_PPIS define for this purpose, instead of the the more
natural VGIC_NR_SGIS.  Change this slightly confusing use of the
defines.

Note: This should have no functional change, as both names are defined
to the number 16.

Acked-by: Marc Zyngier 
Signed-off-by: Christoffer Dall 
---
 virt/kvm/arm/vgic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index e606f78..9ed8d53 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -2109,7 +2109,7 @@ int vgic_init(struct kvm *kvm)
}
 
for (i = 0; i < dist->nr_irqs; i++) {
-   if (i < VGIC_NR_PPIS)
+   if (i < VGIC_NR_SGIS)
vgic_bitmap_set_irq_val(>irq_enabled,
vcpu->vcpu_id, i, 1);
if (i < VGIC_NR_PRIVATE_IRQS)
-- 
2.1.2.330.g565301e.dirty

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 3/8] arm/arm64: KVM: vgic: Factor out level irq processing on guest exit

2015-09-04 Thread Christoffer Dall
Currently vgic_process_maintenance() processes dealing with a completed
level-triggered interrupt directly, but we are soon going to reuse this
logic for level-triggered mapped interrupts with the HW bit set, so
move this logic into a separate static function.

Probably the most scary part of this commit is convincing yourself that
the current flow is safe compared to the old one.  In the following I
try to list the changes and why they are harmless:

  Move vgic_irq_clear_queued after kvm_notify_acked_irq:
Harmless because the effect of clearing the queued flag wrt.
kvm_set_irq is only that vgic_update_irq_pending does not set the
pending bit on the emulated CPU interface or in the pending_on_cpu
bitmask, but we set this in __kvm_vgic_sync_hwstate later on if the
level is stil high.

  Move vgic_set_lr before kvm_notify_acked_irq:
Also, harmless because the LR are cpu-local operations and
kvm_notify_acked only affects the dist

  Move vgic_dist_irq_clear_soft_pend after kvm_notify_acked_irq:
Also harmless because it's just a bit which is cleared and altering
the line state does not affect this bit.

Reviewed-by: Marc Zyngier 
Signed-off-by: Christoffer Dall 
---
 virt/kvm/arm/vgic.c | 88 ++---
 1 file changed, 50 insertions(+), 38 deletions(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 6bd1c9b..fe0e5db 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1322,12 +1322,56 @@ epilog:
}
 }
 
+static int process_level_irq(struct kvm_vcpu *vcpu, int lr, struct vgic_lr vlr)
+{
+   int level_pending = 0;
+
+   vlr.state = 0;
+   vlr.hwirq = 0;
+   vgic_set_lr(vcpu, lr, vlr);
+
+   /*
+* If the IRQ was EOIed (called from vgic_process_maintenance) or it
+* went from active to non-active (called from vgic_sync_hwirq) it was
+* also ACKed and we we therefore assume we can clear the soft pending
+* state (should it had been set) for this interrupt.
+*
+* Note: if the IRQ soft pending state was set after the IRQ was
+* acked, it actually shouldn't be cleared, but we have no way of
+* knowing that unless we start trapping ACKs when the soft-pending
+* state is set.
+*/
+   vgic_dist_irq_clear_soft_pend(vcpu, vlr.irq);
+
+   /*
+* Tell the gic to start sampling the line of this interrupt again.
+*/
+   vgic_irq_clear_queued(vcpu, vlr.irq);
+
+   /* Any additional pending interrupt? */
+   if (vgic_dist_irq_get_level(vcpu, vlr.irq)) {
+   vgic_cpu_irq_set(vcpu, vlr.irq);
+   level_pending = 1;
+   } else {
+   vgic_dist_irq_clear_pending(vcpu, vlr.irq);
+   vgic_cpu_irq_clear(vcpu, vlr.irq);
+   }
+
+   /*
+* Despite being EOIed, the LR may not have
+* been marked as empty.
+*/
+   vgic_sync_lr_elrsr(vcpu, lr, vlr);
+
+   return level_pending;
+}
+
 static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
 {
u32 status = vgic_get_interrupt_status(vcpu);
struct vgic_dist *dist = >kvm->arch.vgic;
-   bool level_pending = false;
struct kvm *kvm = vcpu->kvm;
+   int level_pending = 0;
 
kvm_debug("STATUS = %08x\n", status);
 
@@ -1342,54 +1386,22 @@ static bool vgic_process_maintenance(struct kvm_vcpu 
*vcpu)
 
for_each_set_bit(lr, eisr_ptr, vgic->nr_lr) {
struct vgic_lr vlr = vgic_get_lr(vcpu, lr);
-   WARN_ON(vgic_irq_is_edge(vcpu, vlr.irq));
 
-   spin_lock(>lock);
-   vgic_irq_clear_queued(vcpu, vlr.irq);
+   WARN_ON(vgic_irq_is_edge(vcpu, vlr.irq));
WARN_ON(vlr.state & LR_STATE_MASK);
-   vlr.state = 0;
-   vgic_set_lr(vcpu, lr, vlr);
 
-   /*
-* If the IRQ was EOIed it was also ACKed and we we
-* therefore assume we can clear the soft pending
-* state (should it had been set) for this interrupt.
-*
-* Note: if the IRQ soft pending state was set after
-* the IRQ was acked, it actually shouldn't be
-* cleared, but we have no way of knowing that unless
-* we start trapping ACKs when the soft-pending state
-* is set.
-*/
-   vgic_dist_irq_clear_soft_pend(vcpu, vlr.irq);
 
/*
 * kvm_notify_acked_irq calls kvm_set_irq()
-* to reset the IRQ level. Need to release the
-* lock for kvm_set_irq to grab it.
+* to reset the 

[PATCH v2 8/8] arm/arm64: KVM: Support edge-triggered forwarded interrupts

2015-09-04 Thread Christoffer Dall
We mark edge-triggered interrupts with the HW bit set as queued to
prevent the VGIC code from injecting LRs with both the Active and
Pending bits set at the same time while also setting the HW bit,
because the hardware does not support this.

However, this means that we must also clear the queued flag when we sync
back a LR where the state on the physical distributor went from active
to inactive because the guest deactivated the interrupt.  At this point
we must also check if the interrupt is pending on the distributor, and
tell the VGIC to queue it again if it is.

Since these actions on the sync path are extremely close to those for
level-triggered interrupts, rename process_level_irq to
process_queued_irq, allowing it to cater for both cases.

Signed-off-by: Christoffer Dall 
---
 virt/kvm/arm/vgic.c | 40 ++--
 1 file changed, 22 insertions(+), 18 deletions(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index f4ea950..5942ce9 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1322,13 +1322,10 @@ epilog:
}
 }
 
-static int process_level_irq(struct kvm_vcpu *vcpu, int lr, struct vgic_lr vlr)
+static int process_queued_irq(struct kvm_vcpu *vcpu,
+  int lr, struct vgic_lr vlr)
 {
-   int level_pending = 0;
-
-   vlr.state = 0;
-   vlr.hwirq = 0;
-   vgic_set_lr(vcpu, lr, vlr);
+   int pending = 0;
 
/*
 * If the IRQ was EOIed (called from vgic_process_maintenance) or it
@@ -1344,26 +1341,35 @@ static int process_level_irq(struct kvm_vcpu *vcpu, int 
lr, struct vgic_lr vlr)
vgic_dist_irq_clear_soft_pend(vcpu, vlr.irq);
 
/*
-* Tell the gic to start sampling the line of this interrupt again.
+* Tell the gic to start sampling this interrupt again.
 */
vgic_irq_clear_queued(vcpu, vlr.irq);
 
/* Any additional pending interrupt? */
-   if (vgic_dist_irq_get_level(vcpu, vlr.irq)) {
-   vgic_cpu_irq_set(vcpu, vlr.irq);
-   level_pending = 1;
+   if (vgic_irq_is_edge(vcpu, vlr.irq)) {
+   BUG_ON(!(vlr.state & LR_HW));
+   pending = vgic_dist_irq_is_pending(vcpu, vlr.irq);
} else {
-   vgic_dist_irq_clear_pending(vcpu, vlr.irq);
-   vgic_cpu_irq_clear(vcpu, vlr.irq);
+   if (vgic_dist_irq_get_level(vcpu, vlr.irq)) {
+   vgic_cpu_irq_set(vcpu, vlr.irq);
+   pending = 1;
+   } else {
+   vgic_dist_irq_clear_pending(vcpu, vlr.irq);
+   vgic_cpu_irq_clear(vcpu, vlr.irq);
+   }
}
 
/*
 * Despite being EOIed, the LR may not have
 * been marked as empty.
 */
+   vlr.state = 0;
+   vlr.hwirq = 0;
+   vgic_set_lr(vcpu, lr, vlr);
+
vgic_sync_lr_elrsr(vcpu, lr, vlr);
 
-   return level_pending;
+   return pending;
 }
 
 static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
@@ -1400,7 +1406,7 @@ static bool vgic_process_maintenance(struct kvm_vcpu 
*vcpu)
 vlr.irq - VGIC_NR_PRIVATE_IRQS);
 
spin_lock(>lock);
-   level_pending |= process_level_irq(vcpu, lr, vlr);
+   level_pending |= process_queued_irq(vcpu, lr, vlr);
spin_unlock(>lock);
}
}
@@ -1422,7 +1428,7 @@ static bool vgic_process_maintenance(struct kvm_vcpu 
*vcpu)
 /*
  * Save the physical active state, and reset it to inactive.
  *
- * Return true if there's a pending level triggered interrupt line to queue.
+ * Return true if there's a pending forwarded interrupt to queue.
  */
 static bool vgic_sync_hwirq(struct kvm_vcpu *vcpu, int lr, struct vgic_lr vlr)
 {
@@ -1456,9 +1462,7 @@ static bool vgic_sync_hwirq(struct kvm_vcpu *vcpu, int 
lr, struct vgic_lr vlr)
return false;
}
 
-   /* Mapped edge-triggered interrupts not yet supported. */
-   WARN_ON(vgic_irq_is_edge(vcpu, vlr.irq));
-   return process_level_irq(vcpu, lr, vlr);
+   return process_queued_irq(vcpu, lr, vlr);
 }
 
 /* Sync back the VGIC state after a guest run */
-- 
2.1.2.330.g565301e.dirty

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 2/8] arm/arm64: KVM: arch_timer: Only schedule soft timer on vcpu_block

2015-09-04 Thread Christoffer Dall
We currently schedule a soft timer every time we exit the guest if the
timer did not expire while running the guest.  This is really not
necessary, because the only work we do in the timer work function is to
kick the vcpu.

Kicking the vcpu does two things:
(1) If the vpcu thread is on a waitqueue, make it runnable and remove it
from the waitqueue.
(2) If the vcpu is running on a different physical CPU from the one
doing the kick, it sends a reschedule IPI.

The second case cannot happen, because the soft timer is only ever
scheduled when the vcpu is not running.  The first case is only relevant
when the vcpu thread is on a waitqueue, which is only the case when the
vcpu thread has called kvm_vcpu_block().

Therefore, we only need to make sure a timer is scheduled for
kvm_vcpu_block(), which we do by encapsulating all calls to
kvm_vcpu_block() with kvm_timer_{un}schedule calls.

Additionally, we only schedule a soft timer if the timer is enabled and
unmasked, since it is useless otherwise.

Note that theoretically userspace can use the SET_ONE_REG interface to
change registers that should cause the timer to fire, even if the vcpu
is blocked without a scheduled timer, but this case was not supported
before this patch and we leave it for future work for now.

Signed-off-by: Christoffer Dall 
---
 arch/arm/include/asm/kvm_host.h   |  3 --
 arch/arm/kvm/arm.c| 10 +
 arch/arm64/include/asm/kvm_host.h |  3 --
 include/kvm/arm_arch_timer.h  |  2 +
 virt/kvm/arm/arch_timer.c | 91 ++-
 5 files changed, 72 insertions(+), 37 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 86fcf6e..dcba0fa 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -236,7 +236,4 @@ static inline void kvm_arm_setup_debug(struct kvm_vcpu 
*vcpu) {}
 static inline void kvm_arm_clear_debug(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu) {}
 
-static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
-static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
-
 #endif /* __ARM_KVM_HOST_H__ */
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index ce404a5..bdf8871 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -271,6 +271,16 @@ int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
return kvm_timer_should_fire(vcpu);
 }
 
+void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
+{
+   kvm_timer_schedule(vcpu);
+}
+
+void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu)
+{
+   kvm_timer_unschedule(vcpu);
+}
+
 int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 {
/* Force users to call KVM_ARM_VCPU_INIT */
diff --git a/arch/arm64/include/asm/kvm_host.h 
b/arch/arm64/include/asm/kvm_host.h
index dd143f5..415938d 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -257,7 +257,4 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
 void kvm_arm_clear_debug(struct kvm_vcpu *vcpu);
 void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu);
 
-static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
-static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
-
 #endif /* __ARM64_KVM_HOST_H__ */
diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index e1e4d7c..ef14cc1 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -71,5 +71,7 @@ u64 kvm_arm_timer_get_reg(struct kvm_vcpu *, u64 regid);
 int kvm_arm_timer_set_reg(struct kvm_vcpu *, u64 regid, u64 value);
 
 bool kvm_timer_should_fire(struct kvm_vcpu *vcpu);
+void kvm_timer_schedule(struct kvm_vcpu *vcpu);
+void kvm_timer_unschedule(struct kvm_vcpu *vcpu);
 
 #endif
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 48c6e1a..7991537 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -111,14 +111,21 @@ static enum hrtimer_restart kvm_timer_expire(struct 
hrtimer *hrt)
return HRTIMER_NORESTART;
 }
 
+static bool kvm_timer_irq_can_fire(struct kvm_vcpu *vcpu)
+{
+   struct arch_timer_cpu *timer = >arch.timer_cpu;
+
+   return !(timer->cntv_ctl & ARCH_TIMER_CTRL_IT_MASK) &&
+   (timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE) &&
+   !kvm_vgic_get_phys_irq_active(timer->map);
+}
+
 bool kvm_timer_should_fire(struct kvm_vcpu *vcpu)
 {
struct arch_timer_cpu *timer = >arch.timer_cpu;
cycle_t cval, now;
 
-   if ((timer->cntv_ctl & ARCH_TIMER_CTRL_IT_MASK) ||
-   !(timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE) ||
-   kvm_vgic_get_phys_irq_active(timer->map))
+   if (!kvm_timer_irq_can_fire(vcpu))
return false;
 
cval = timer->cntv_cval;
@@ -127,24 +134,61 @@ bool kvm_timer_should_fire(struct kvm_vcpu *vcpu)
return cval <= now;
 }
 
-/**
- * kvm_timer_flush_hwstate - prepare to move the virt 

[PATCH v2 0/8] Rework architected timer and forwarded IRQs handling

2015-09-04 Thread Christoffer Dall
The architected timer integration with the VGIC had some shortcomings in
that certain guest operations weren't fully supported.

This series tries to address these problems in providing level-triggered
semantics for the arch timer and VGIC integration and seeks to clarify
the behavior when setting/clearing the active state on the physical
distributor.

It also fixes a few other bugs in the VGIC code and finally adds support
for edge-triggered forwarded interrupts.

The edge-triggered forwarded interrupts code is untested but probably
better to clearly do something wrong and raise a warning.

Changes since v1:
 - Sent out bug fixes for active state and UEFI reset as separate
   patches.
 - Fixed various spelling nits
 - Rewrote proposed documentation file trying to address Eric's and
   Marc's comments
 - Rewrote kvm_timer_update_irq and kvm_timer_update_state according to
   Marc's suggestion (thanks!)
 - Added additional patch to support edge-triggered forwarded
   interrupts.

Christoffer Dall (8):
  KVM: Add kvm_arch_vcpu_{un}blocking callbacks
  arm/arm64: KVM: arch_timer: Only schedule soft timer on vcpu_block
  arm/arm64: KVM: vgic: Factor out level irq processing on guest exit
  arm/arm64: KVM: Implement GICD_ICFGR as RO for PPIs
  arm/arm64: KVM: Use appropriate define in VGIC reset code
  arm/arm64: KVM: Add forwarded physical interrupts documentation
  arm/arm64: KVM: Rework the arch timer to use level-triggered semantics
  arm/arm64: KVM: Support edge-triggered forwarded interrupts

 Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt | 181 +
 arch/arm/kvm/arm.c |  21 ++-
 arch/mips/include/asm/kvm_host.h   |   2 +
 arch/powerpc/include/asm/kvm_host.h|   2 +
 arch/s390/include/asm/kvm_host.h   |   2 +
 arch/x86/include/asm/kvm_host.h|   3 +
 include/kvm/arm_arch_timer.h   |   4 +-
 include/kvm/arm_vgic.h |   3 -
 include/linux/kvm_host.h   |   2 +
 virt/kvm/arm/arch_timer.c  | 150 +++--
 virt/kvm/arm/vgic.c| 163 +--
 virt/kvm/kvm_main.c|   3 +
 12 files changed, 398 insertions(+), 138 deletions(-)
 create mode 100644 Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt

-- 
2.1.2.330.g565301e.dirty

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 1/8] KVM: Add kvm_arch_vcpu_{un}blocking callbacks

2015-09-04 Thread Christoffer Dall
Some times it is useful for architecture implementations of KVM to know
when the VCPU thread is about to block or when it comes back from
blocking (arm/arm64 needs to know this to properly implement timers, for
example).

Therefore provide a generic architecture callback function in line with
what we do elsewhere for KVM generic-arch interactions.

Reviewed-by: Marc Zyngier 
Signed-off-by: Christoffer Dall 
---
 arch/arm/include/asm/kvm_host.h | 3 +++
 arch/arm64/include/asm/kvm_host.h   | 3 +++
 arch/mips/include/asm/kvm_host.h| 2 ++
 arch/powerpc/include/asm/kvm_host.h | 2 ++
 arch/s390/include/asm/kvm_host.h| 2 ++
 arch/x86/include/asm/kvm_host.h | 3 +++
 include/linux/kvm_host.h| 2 ++
 virt/kvm/kvm_main.c | 3 +++
 8 files changed, 20 insertions(+)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index dcba0fa..86fcf6e 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -236,4 +236,7 @@ static inline void kvm_arm_setup_debug(struct kvm_vcpu 
*vcpu) {}
 static inline void kvm_arm_clear_debug(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu) {}
 
+static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
+static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
+
 #endif /* __ARM_KVM_HOST_H__ */
diff --git a/arch/arm64/include/asm/kvm_host.h 
b/arch/arm64/include/asm/kvm_host.h
index 415938d..dd143f5 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -257,4 +257,7 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
 void kvm_arm_clear_debug(struct kvm_vcpu *vcpu);
 void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu);
 
+static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
+static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
+
 #endif /* __ARM64_KVM_HOST_H__ */
diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h
index e8c8d9d..58f0f4d 100644
--- a/arch/mips/include/asm/kvm_host.h
+++ b/arch/mips/include/asm/kvm_host.h
@@ -845,5 +845,7 @@ static inline void kvm_arch_flush_shadow_memslot(struct kvm 
*kvm,
struct kvm_memory_slot *slot) {}
 static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
+static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
+static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
 
 #endif /* __MIPS_KVM_HOST_H__ */
diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index d91f65b..179f9a7 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -702,5 +702,7 @@ static inline void kvm_arch_memslots_updated(struct kvm 
*kvm, struct kvm_memslot
 static inline void kvm_arch_flush_shadow_all(struct kvm *kvm) {}
 static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
 static inline void kvm_arch_exit(void) {}
+static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
+static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
 
 #endif /* __POWERPC_KVM_HOST_H__ */
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 3024acb..04a97df 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -640,5 +640,7 @@ static inline void kvm_arch_memslots_updated(struct kvm 
*kvm, struct kvm_memslot
 static inline void kvm_arch_flush_shadow_all(struct kvm *kvm) {}
 static inline void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
struct kvm_memory_slot *slot) {}
+static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
+static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
 
 #endif
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 2a7f5d7..26c4086 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1202,4 +1202,7 @@ int __x86_set_memory_region(struct kvm *kvm,
 int x86_set_memory_region(struct kvm *kvm,
  const struct kvm_userspace_memory_region *mem);
 
+static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
+static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
+
 #endif /* _ASM_X86_KVM_HOST_H */
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 9564fd7..87d7be6 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -619,6 +619,8 @@ int kvm_vcpu_write_guest(struct kvm_vcpu *vcpu, gpa_t gpa, 
const void *data,
 void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn);
 
 void kvm_vcpu_block(struct kvm_vcpu *vcpu);
+void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu);
+void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu);
 void kvm_vcpu_kick(struct kvm_vcpu *vcpu);
 int 

[PATCH v2 4/8] arm/arm64: KVM: Implement GICD_ICFGR as RO for PPIs

2015-09-04 Thread Christoffer Dall
The GICD_ICFGR allows the bits for the SGIs and PPIs to be read only.
We currently simulate this behavior by writing a hardcoded value to the
register for the SGIs and PPIs on every write of these bits to the
register (ignoring what the guest actually wrote), and by writing the
same value as the reset value to the register.

This is a bit counter-intuitive, as the register is RO for these bits,
and we can just implement it that way, allowing us to control the value
of the bits purely in the reset code.

Reviewed-by: Marc Zyngier 
Signed-off-by: Christoffer Dall 
---
 virt/kvm/arm/vgic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index fe0e5db..e606f78 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -655,7 +655,7 @@ bool vgic_handle_cfg_reg(u32 *reg, struct kvm_exit_mmio 
*mmio,
ACCESS_READ_VALUE | ACCESS_WRITE_VALUE);
if (mmio->is_write) {
if (offset < 8) {
-   *reg = ~0U; /* Force PPIs/SGIs to 1 */
+   /* Ignore writes to read-only SGI and PPI bits */
return false;
}
 
-- 
2.1.2.330.g565301e.dirty

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] kvm: fix maybe-uninitialized compiler warning

2015-09-04 Thread Clemens Gruber
Fixes the following compiler warning, occuring with GCC 5.2.0:

arch/x86/kvm/mmu.c: In function ‘handle_mmio_page_fault_common’:
arch/x86/kvm/mmu.c:3332:9: warning: ‘leaf’ may be used uninitialized in
this
function [-Wmaybe-uninitialized]
   while (root >= leaf) {
 ^
arch/x86/kvm/mmu.c:3304:12: note: ‘leaf’ was declared here
  int root, leaf;

Signed-off-by: Clemens Gruber 
---
 arch/x86/kvm/mmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index fb16a8e..0af5a39 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3301,7 +3301,7 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, u64 
addr, u64 *sptep)
 {
struct kvm_shadow_walk_iterator iterator;
u64 sptes[PT64_ROOT_LEVEL], spte = 0ull;
-   int root, leaf;
+   int root, leaf=0;
bool reserved = false;
 
if (!VALID_PAGE(vcpu->arch.mmu.root_hpa))
-- 
2.5.1

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 7/8] arm/arm64: KVM: Rework the arch timer to use level-triggered semantics

2015-09-04 Thread Christoffer Dall
The arch timer currently uses edge-triggered semantics in the sense that
the line is never sampled by the vgic and lowering the line from the
timer to the vgic doesn't have any affect on the pending state of
virtual interrupts in the vgic.  This means that we do not support a
guest with the otherwise valid behavior of (1) disable interrupts (2)
enable the timer (3) disable the timer (4) enable interrupts.  Such a
guest would validly not expect to see any interrupts on real hardware,
but will see interrupts on KVM.

This patches fixes this shortcoming through the following series of
changes.

First, we change the flow of the timer/vgic sync/flush operations.  Now
the timer is always flushed/synced before the vgic, because the vgic
samples the state of the timer output.  This has the implication that we
move the timer operations in to non-preempible sections, but that is
fine after the previous commit getting rid of hrtimer schedules on every
entry/exit.

Second, we change the internal behavior of the timer, letting the timer
keep track of its previous output state, and only lower/raise the line
to the vgic when the state changes.  Note that in theory this could have
been accomplished more simply by signalling the vgic every time the
state *potentially* changed, but we don't want to be hitting the vgic
more often than necessary.

Third, we get rid of the use of the map->active field in the vgic and
instead simply set the interrupt as active on the physical distributor
whenever we signal a mapped interrupt to the guest, and we reset the
active state when we sync back the HW state from the vgic.

Fourth, and finally, we now initialize the timer PPIs (and all the other
unused PPIs for now), to be level-triggered, and modify the sync code to
sample the line state on HW sync and re-inject a new interrupt if it is
still pending at that time.

Signed-off-by: Christoffer Dall 
---
 arch/arm/kvm/arm.c   | 11 ++--
 include/kvm/arm_arch_timer.h |  2 +-
 include/kvm/arm_vgic.h   |  3 --
 virt/kvm/arm/arch_timer.c| 65 +-
 virt/kvm/arm/vgic.c  | 67 +++-
 5 files changed, 78 insertions(+), 70 deletions(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index bdf8871..102a4aa 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -561,9 +561,9 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct 
kvm_run *run)
 
if (ret <= 0 || need_new_vmid_gen(vcpu->kvm)) {
local_irq_enable();
+   kvm_timer_sync_hwstate(vcpu);
kvm_vgic_sync_hwstate(vcpu);
preempt_enable();
-   kvm_timer_sync_hwstate(vcpu);
continue;
}
 
@@ -608,12 +608,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct 
kvm_run *run)
kvm_guest_exit();
trace_kvm_exit(kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu));
 
+   /*
+* We must sync the timer state before the vgic state so that
+* the vgic can properly sample the updated state of the
+* interrupt line.
+*/
+   kvm_timer_sync_hwstate(vcpu);
+
kvm_vgic_sync_hwstate(vcpu);
 
preempt_enable();
 
-   kvm_timer_sync_hwstate(vcpu);
-
ret = handle_exit(vcpu, run, ret);
}
 
diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index ef14cc1..1800227 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -51,7 +51,7 @@ struct arch_timer_cpu {
boolarmed;
 
/* Timer IRQ */
-   const struct kvm_irq_level  *irq;
+   struct kvm_irq_levelirq;
 
/* VGIC mapping */
struct irq_phys_map *map;
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index d901f1a..99011a0 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -163,7 +163,6 @@ struct irq_phys_map {
u32 virt_irq;
u32 phys_irq;
u32 irq;
-   boolactive;
 };
 
 struct irq_phys_map_entry {
@@ -358,8 +357,6 @@ int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
 struct irq_phys_map *kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu,
   int virt_irq, int irq);
 int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
-bool kvm_vgic_get_phys_irq_active(struct irq_phys_map *map);
-void kvm_vgic_set_phys_irq_active(struct irq_phys_map *map, bool active);
 
 #define irqchip_in_kernel(k)   (!!((k)->arch.vgic.in_kernel))
 #define vgic_initialized(k)(!!((k)->arch.vgic.nr_cpus))
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 

[PATCH v2 6/8] arm/arm64: KVM: Add forwarded physical interrupts documentation

2015-09-04 Thread Christoffer Dall
Forwarded physical interrupts on arm/arm64 is a tricky concept and the
way we deal with them is not apparently easy to understand by reading
various specs.

Therefore, add a proper documentation file explaining the flow and
rationale of the behavior of the vgic.

Some of this text was contributed by Marc Zyngier and edited by me.
Omissions and errors are all mine.

Signed-off-by: Christoffer Dall 
---
 Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt | 181 +
 1 file changed, 181 insertions(+)
 create mode 100644 Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt

diff --git a/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt 
b/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
new file mode 100644
index 000..24b6f28
--- /dev/null
+++ b/Documentation/virtual/kvm/arm/vgic-mapped-irqs.txt
@@ -0,0 +1,181 @@
+KVM/ARM VGIC Forwarded Physical Interrupts
+==
+
+The KVM/ARM code implements software support for the ARM Generic
+Interrupt Controller's (GIC's) hardware support for virtualization by
+allowing software to inject virtual interrupts to a VM, which the guest
+OS sees as regular interrupts.  The code is famously known as the VGIC.
+
+Some of these virtual interrupts, however, correspond to physical
+interrupts from real physical devices.  One example could be the
+architected timer, which itself supports virtualization, and therefore
+lets a guest OS program the hardware device directly to raise an
+interrupt at some point in time.  When such an interrupt is raised, the
+host OS initially handles the interrupt and must somehow signal this
+event as a virtual interrupt to the guest.  Another example could be a
+passthrough device, where the physical interrupts are initially handled
+by the host, but the device driver for the device lives in the guest OS
+and KVM must therefore somehow inject a virtual interrupt on behalf of
+the physical one to the guest OS.
+
+These virtual interrupts corresponding to a physical interrupt on the
+host are called forwarded physical interrupts, but are also sometimes
+referred to as 'virtualized physical interrupts' and 'mapped interrupts'.
+
+Forwarded physical interrupts are handled slightly differently compared
+to virtual interrupts generated purely by a software emulated device.
+
+
+The HW bit
+--
+Virtual interrupts are signalled to the guest by programming the List
+Registers (LRs) on the GIC before running a VCPU.  The LR is programmed
+with the virtual IRQ number and the state of the interrupt (Pending,
+Active, or Pending+Active).  When the guest ACKs and EOIs a virtual
+interrupt, the LR state moves from Pending to Active, and finally to
+inactive.
+
+The LRs include an extra bit, called the HW bit.  When this bit is set,
+KVM must also program an additional field in the LR, the physical IRQ
+number, to link the virtual with the physical IRQ.
+
+When the HW bit is set, KVM must EITHER set the Pending OR the Active
+bit, never both at the same time.
+
+Setting the HW bit causes the hardware to deactivate the physical
+interrupt on the physical distributor when the guest deactivates the
+corresponding virtual interrupt.
+
+
+Forwarded Physical Interrupts Life Cycle
+
+
+The state of forwarded physical interrupts is managed in the following way:
+
+  - The physical interrupt is acked by the host, and becomes active on
+the physical distributor (*).
+  - KVM sets the LR.Pending bit, because this is the only way the GICV
+interface is going to present it to the guest.
+  - LR.Pending will stay set as long as the guest has not acked the interrupt.
+  - LR.Pending transitions to LR.Active on the guest read of the IAR, as
+expected.
+  - On guest EOI, the *physical distributor* active bit gets cleared,
+but the LR.Active is left untouched (set).
+  - KVM clears the LR when on VM exits when the physical distributor
+active state has been cleared.
+
+(*): The host handling is slightly more complicated.  For some devices
+(shared), KVM directly sets the active state on the physical distributor
+before entering the guest, and for some devices (non-shared) the host
+configures the GIC such that it does not deactivate the interrupt on
+host EOIs, but only performs a priority drop allowing the GIC to receive
+other interrupts and leaves the interrupt in the active state on the
+physical distributor.
+
+
+Forwarded Edge and Level Triggered PPIs and SPIs
+
+Forwarded physical interrupts injected should always be active on the
+physical distributor when injected to a guest.
+
+Level-triggered interrupts will keep the interrupt line to the GIC
+asserted, typically until the guest programs the device to deassert the
+line.  This means that the interrupt will remain pending on the physical
+distributor until the guest has reprogrammed the device.  Since we
+always run the VM 

Re: [PATCH] KVM: arm/arm64: Fix PSCI affinity info return value for non valid cores

2015-09-04 Thread Marc Zyngier
On 04/09/15 16:06, Alexander Spyridakis wrote:
> If a guest requests the affinity info for a non-existing vCPU we need to
> properly return an error, instead of erroneously reporting an off state.
> 
> Signed-off-by: Alexander Spyridakis 
> Signed-off-by: Alvise Rigo 

Seems sound to me. Out of curiosity, how was this found?

Reviewed-by: Marc Zyngier 

M.
-- 
Jazz is not dead. It just smells funny...
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 2/3] KVM: dynamic halt-polling

2015-09-04 Thread Wanpeng Li

Hi Paolo,
On 9/3/15 10:07 PM, Wanpeng Li wrote:

[...]
  static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
  {
if (kvm_arch_vcpu_runnable(vcpu)) {
@@ -1928,7 +1962,8 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
  {
ktime_t start, cur;
DEFINE_WAIT(wait);
-   bool waited = false;
+   bool polled = false, waited = false;
+   u64 poll_ns = 0, wait_ns = 0, block_ns = 0;
  
  	start = cur = ktime_get();

if (vcpu->halt_poll_ns) {
@@ -1940,11 +1975,16 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 * arrives.
 */
if (kvm_vcpu_check_block(vcpu) < 0) {
+   polled = true;
++vcpu->stat.halt_successful_poll;
-   goto out;
+   break;
}
cur = ktime_get();
} while (single_task_running() && ktime_before(cur, stop));
+
+   poll_ns = ktime_to_ns(cur) - ktime_to_ns(start);
+   if (polled)
+   goto out;



Please move poll_ns caculation under if() when you applied, as I 
explained in reply to v6.


Regards,
Wanpeng Li
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v2 2/5] KVM: arm64: Implement vGICv3 distributor and redistributor access from userspace

2015-09-04 Thread Pavel Fedin
 Hello!

> > +  KVM_DEV_ARM_VGIC_GRP_REDIST_REGS
> > +  Attributes:
> > +The attr field of kvm_device_attr encodes two values:
> > +bits: |  63  | 62  ..  40 | 39 ..  32  |  31   0 |
> > +values:   | size |  reserved  |   cpu id   |  offset |
> 
> We should avoid imposing an accidental limit on the maximum
> number of CPUs in the userspace API. GICv3 doesn't have a
> limit at 256 CPUs

 Ops, my fault, forgot. :(
 However, it seems to be very simple. "cpu id" is actually an index, not a real 
affinity ID (see 
http://lxr.free-electrons.com/source/include/linux/kvm_host.h#L427). Would it 
be OK just to enlarge KVM_DEV_ARM_VGIC_CPUID_MASK?

bits: |  63  | 62 ..  32 |  31   0 |
values:   | size |  cpu id   |  offset |

 I think 31 bits is more than enough for CPU index.
 And, since id is actually an index, may be we should fix up docs?

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html