from:"Will Deacon"

Re: [PATCH 1/2] Add a rudimentary manpage

2015-12-22 Thread Will Deacon

Hi Andre, of Phoronix fame,

On Tue, Dec 22, 2015 at 02:00:46PM +, Andre Przywara wrote:
> The kvmtool documentation is somewhat lacking, also it is not easily
> accessible when living in the source tree only.
> Add a good ol' manpage to document at least the basic commands and
> their options.
> This level of documentation matches the one that is already there in
> the Documentation directory and should be subject to extension.
> 
> Signed-off-by: Andre Przywara 
> ---
>  Documentation/kvmtool.1 | 222 
> 
>  1 file changed, 222 insertions(+)
>  create mode 100644 Documentation/kvmtool.1
> 
> diff --git a/Documentation/kvmtool.1 b/Documentation/kvmtool.1
> new file mode 100644
> index 000..aecb2dc
> --- /dev/null
> +++ b/Documentation/kvmtool.1
> @@ -0,0 +1,222 @@
> +.\" Manpage for kvmtool
> +.\" Copyright (C) 2015 by Andre Przywara 
> +.TH kvmtool 1 "11 Nov 2015" "0.1" "kvmtool man page"
> +.SH NAME
> +kvmtool \- running KVM guests
> +.SH SYNOPSIS
> +lkvm COMMAND [ARGS]
> +.SH DESCRIPTION
> +kvmtool is a userland tool for creating and controlling KVM guests.
> +.SH "KVMTOOL COMMANDS"
> +.sp
> +.PP
> +.B run -k  ...

You seem to be inconsistent with your synopses for each command. That is,
here you just have -k  ... , but later you have things like

> +.B debug --all|--name  [--dump] [--nmi ] [--sysrq ]

which describes all of the possible options to debug. I think I prefer
this latter way, so could you make all of the commands look like that,
please?

Will
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v6 02/21] KVM: ARM64: Define PMU data structure for each vcpu

2015-12-08 Thread Will Deacon

On Tue, Dec 08, 2015 at 01:37:14PM +, Marc Zyngier wrote:
> On 08/12/15 12:47, Shannon Zhao wrote:
> > From: Shannon Zhao 
> > 
> > Here we plan to support virtual PMU for guest by full software
> > emulation, so define some basic structs and functions preparing for
> > futher steps. Define struct kvm_pmc for performance monitor counter and
> > struct kvm_pmu for performance monitor unit for each vcpu. According to
> > ARMv8 spec, the PMU contains at most 32(ARMV8_MAX_COUNTERS) counters.
> > 
> > Since this only supports ARM64 (or PMUv3), add a separate config symbol
> > for it.
> > 
> > Signed-off-by: Shannon Zhao 
> > ---
> >  arch/arm64/include/asm/kvm_host.h |  2 ++
> >  arch/arm64/kvm/Kconfig|  8 
> >  include/kvm/arm_pmu.h | 40 
> > +++
> >  3 files changed, 50 insertions(+)
> >  create mode 100644 include/kvm/arm_pmu.h

[...]

> > diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
> > index a5272c0..66da9a2 100644
> > --- a/arch/arm64/kvm/Kconfig
> > +++ b/arch/arm64/kvm/Kconfig
> > @@ -36,6 +36,7 @@ config KVM
> > select HAVE_KVM_EVENTFD
> > select HAVE_KVM_IRQFD
> > select KVM_ARM_VGIC_V3
> > +   select KVM_ARM_PMU
> 
> What if HW_PERF_EVENTS is not selected? Also, selecting HW_PERF_EVENTS
> is not enough, and you probably need PERF_EVENTS as well, So this should
> probably read:
> 
>   select KVM_ARM_PMU if (HW_PERF_EVENTS && PERF_EVENTS)

HW_PERF_EVENTS depends on ARM_PMU which in turn depends on PERF_EVENTS.

Will
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] kvmtool: virtio-net: fix VIRTIO_NET_F_MRG_RXBUF usage in rx thread

2015-11-26 Thread Will Deacon

When merging virtio-net buffers using the VIRTIO_NET_F_MRG_RXBUF feature,
the first buffer added to the used ring should indicate the total number
of buffers used to hold the packet. Unfortunately, kvmtool has a number
of issues when constructing these merged buffers:

  - Commit 5131332e3f1a ("kvmtool: convert net backend to support
bi-endianness") introduced a strange loop counter, which resulted in
hdr->num_buffers being set redundantly the first time round

  - When adding the buffers to the ring, we actually add them one-by-one,
allowing the guest to see the header before we've inserted the rest
of the data buffers...

  - ... which is made worse because we non-atomically increment the
num_buffers count in the header each time we insert a new data buffer

Consequently, the guest quickly becomes confused in its net rx code and
the whole thing grinds to a halt. This is easily exemplified by trying
to boot a root filesystem over NFS, which seldom succeeds.

This patch resolves the issues by allowing us to insert items into the
used ring without updating the index. Once the full payload has been
added and num_buffers corresponds to the total size, we *then* publish
the buffers to the guest.

Cc: Marc Zyngier <marc.zyng...@arm.com>
Cc: Sasha Levin <sasha.le...@oracle.com>
Signed-off-by: Will Deacon <will.dea...@arm.com>
---

Without this patch, I can't boot to a prompt using NFS over virtio-net.

 include/kvm/virtio.h |  2 ++
 virtio/core.c| 32 +---
 virtio/net.c | 31 +++
 3 files changed, 42 insertions(+), 23 deletions(-)

diff --git a/include/kvm/virtio.h b/include/kvm/virtio.h
index 768ee9668d44..8324ba7d38be 100644
--- a/include/kvm/virtio.h
+++ b/include/kvm/virtio.h
@@ -112,6 +112,8 @@ static inline bool virt_queue__available(struct virt_queue 
*vq)
return virtio_guest_to_host_u16(vq, vq->vring.avail->idx) != 
vq->last_avail_idx;
 }
 
+void virt_queue__used_idx_advance(struct virt_queue *queue, u16 jump);
+struct vring_used_elem * virt_queue__set_used_elem_no_update(struct virt_queue 
*queue, u32 head, u32 len, u16 offset);
 struct vring_used_elem *virt_queue__set_used_elem(struct virt_queue *queue, 
u32 head, u32 len);
 
 bool virtio_queue__should_signal(struct virt_queue *vq);
diff --git a/virtio/core.c b/virtio/core.c
index 3b6e4d7cd045..d6ac289d450e 100644
--- a/virtio/core.c
+++ b/virtio/core.c
@@ -21,22 +21,17 @@ const char* virtio_trans_name(enum virtio_trans trans)
return "unknown";
 }
 
-struct vring_used_elem *virt_queue__set_used_elem(struct virt_queue *queue, 
u32 head, u32 len)
+void virt_queue__used_idx_advance(struct virt_queue *queue, u16 jump)
 {
-   struct vring_used_elem *used_elem;
u16 idx = virtio_guest_to_host_u16(queue, queue->vring.used->idx);
 
-   used_elem   = >vring.used->ring[idx % queue->vring.num];
-   used_elem->id   = virtio_host_to_guest_u32(queue, head);
-   used_elem->len  = virtio_host_to_guest_u32(queue, len);
-
/*
 * Use wmb to assure that used elem was updated with head and len.
 * We need a wmb here since we can't advance idx unless we're ready
 * to pass the used element to the guest.
 */
wmb();
-   idx++;
+   idx += jump;
queue->vring.used->idx = virtio_host_to_guest_u16(queue, idx);
 
/*
@@ -45,6 +40,29 @@ struct vring_used_elem *virt_queue__set_used_elem(struct 
virt_queue *queue, u32
 * an updated idx.
 */
wmb();
+}
+
+struct vring_used_elem *
+virt_queue__set_used_elem_no_update(struct virt_queue *queue, u32 head,
+   u32 len, u16 offset)
+{
+   struct vring_used_elem *used_elem;
+   u16 idx = virtio_guest_to_host_u16(queue, queue->vring.used->idx);
+
+   idx += offset;
+   used_elem   = >vring.used->ring[idx % queue->vring.num];
+   used_elem->id   = virtio_host_to_guest_u32(queue, head);
+   used_elem->len  = virtio_host_to_guest_u32(queue, len);
+
+   return used_elem;
+}
+
+struct vring_used_elem *virt_queue__set_used_elem(struct virt_queue *queue, 
u32 head, u32 len)
+{
+   struct vring_used_elem *used_elem;
+
+   used_elem = virt_queue__set_used_elem_no_update(queue, head, len, 0);
+   virt_queue__used_idx_advance(queue, 1);
 
return used_elem;
 }
diff --git a/virtio/net.c b/virtio/net.c
index 6d1be657735e..ffa83ab3b549 100644
--- a/virtio/net.c
+++ b/virtio/net.c
@@ -80,14 +80,12 @@ static void virtio_net_fix_tx_hdr(struct virtio_net_hdr 
*hdr, struct net_dev *nd
hdr->csum_offset= virtio_guest_to_host_u16(>vdev, 
hdr->csum_offset);
 }
 
-static void virtio_net_fix_rx_hdr(struct virtio_net_hdr_mrg_rxbuf *hdr, struct 
net_dev *ndev)
+static void virtio_net_fix_rx_hdr(struct virtio_net_hdr *hdr, struct net_dev

Re: [PATCH 0/7] kvmtool: Cleanup kernel loading

2015-11-18 Thread Will Deacon

On Wed, Nov 18, 2015 at 10:29:30AM +, Andre Przywara wrote:
> On 02/11/15 14:58, Will Deacon wrote:
> > On Fri, Oct 30, 2015 at 06:26:53PM +, Andre Przywara wrote:
> >> this series cleans up kvmtool's kernel loading functionality a bit.
> >> It has been broken out of a previous series I sent [1] and contains
> >> just the cleanup and bug fix parts, which should be less controversial
> >> and thus easier to merge ;-)
> >> I will resend the pipe loading part later on as a separate series.
> >>
> >> The first patch properly abstracts kernel loading to move
> >> responsibility into each architecture's code. It removes quite some
> >> ugly code from the generic kvm.c file.
> >> The later patches address the naive usage of read(2) to, well, read
> >> data from files. Doing this without coping with the subtleties of
> >> the UNIX read semantics (returning with less or none data read is not
> >> an error) can provoke hard to debug failures.
> >> So these patches make use of the existing and one new wrapper function
> >> to make sure we read everything we actually wanted to.
> >> The last patch moves the ARM kernel loading code into the proper
> >> location to be in line with the other architectures.
> >>
> >> Please have a look and give some comments!
> > 
> > Looks good to me, but I'd like to see some comments from some mips/ppc/x86
> > people on the changes you're making over there.
> 
> Sounds reasonable, but no answers yet.
> 
> Can you take at least patch 1 and 2 meanwhile, preferably 6 and 7 (the
> ARM parts) also if you are OK with it?
> I have other patches that depend on 1/7 and 2/7, so having them upstream
> would help me and reduce further dependency churn.
> I am happy to resend the remaining patches for further discussion later.

We let them sit on the list for a while with no comments, so I just pushed
out your series. If a bug report shows up, then we can always revert the
offending patch if there's no quick fix.

Will
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/7] kvmtool: Cleanup kernel loading

2015-11-18 Thread Will Deacon

On Wed, Nov 18, 2015 at 10:29:30AM +, Andre Przywara wrote:
> On 02/11/15 14:58, Will Deacon wrote:
> > On Fri, Oct 30, 2015 at 06:26:53PM +, Andre Przywara wrote:
> >> this series cleans up kvmtool's kernel loading functionality a bit.
> >> It has been broken out of a previous series I sent [1] and contains
> >> just the cleanup and bug fix parts, which should be less controversial
> >> and thus easier to merge ;-)
> >> I will resend the pipe loading part later on as a separate series.
> >>
> >> The first patch properly abstracts kernel loading to move
> >> responsibility into each architecture's code. It removes quite some
> >> ugly code from the generic kvm.c file.
> >> The later patches address the naive usage of read(2) to, well, read
> >> data from files. Doing this without coping with the subtleties of
> >> the UNIX read semantics (returning with less or none data read is not
> >> an error) can provoke hard to debug failures.
> >> So these patches make use of the existing and one new wrapper function
> >> to make sure we read everything we actually wanted to.
> >> The last patch moves the ARM kernel loading code into the proper
> >> location to be in line with the other architectures.
> >>
> >> Please have a look and give some comments!
> > 
> > Looks good to me, but I'd like to see some comments from some mips/ppc/x86
> > people on the changes you're making over there.
> 
> Sounds reasonable, but no answers yet.
> 
> Can you take at least patch 1 and 2 meanwhile, preferably 6 and 7 (the
> ARM parts) also if you are OK with it?
> I have other patches that depend on 1/7 and 2/7, so having them upstream
> would help me and reduce further dependency churn.
> I am happy to resend the remaining patches for further discussion later.

We let them sit on the list for a while with no comments, so I just pushed
out your series. If a bug report shows up, then we can always revert the
offending patch if there's no quick fix.

Will
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/2] arm64: KVM: Add workaround for Cortex-A57 erratum 834220

2015-11-17 Thread Will Deacon

Hi Marc,

On Mon, Nov 16, 2015 at 10:28:18AM +, Marc Zyngier wrote:
> Cortex-A57 parts up to r1p2 can misreport Stage 2 translation faults
> when a Stage 1 permission fault or device alignment fault should
> have been reported.
> 
> This patch implements the workaround (which is to validate that the
> Stage-1 translation actually succeeds) by using code patching.
> 
> Signed-off-by: Marc Zyngier <marc.zyng...@arm.com>
> ---
>  arch/arm64/Kconfig  | 21 +
>  arch/arm64/include/asm/cpufeature.h |  3 ++-
>  arch/arm64/kernel/cpu_errata.c  |  9 +
>  arch/arm64/kvm/hyp.S|  6 ++
>  4 files changed, 38 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 9ac16a4..746d985 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -316,6 +316,27 @@ config ARM64_ERRATUM_832075
>  
> If unsure, say Y.
>  
> +config ARM64_ERRATUM_834220
> + bool "Cortex-A57: 834220: Stage 2 translation fault might be 
> incorrectly reported in presence of a Stage 1 fault"
> + depends on KVM
> + default y
> + help
> +   This option adds an alternative code sequence to work around ARM
> +   erratum 834220 on Cortex-A57 parts up to r1p2.
> +
> +   Affected Cortex-A57 parts might report a Stage 2 translation
> +   fault as a the result of a Stage 1 fault for load crossing a

s/as a the/as the/
s/for load/for a load/

> +   page boundary when there is a permission or device memory
> +   alignment fault at Stage 1 and a translation fault at Stage 2.
> +
> +   The workaround is to verify that the Stage-1 translation

Consistency between "Stage 1" and "Stage-1".

> +   doesn't generate a fault before handling the Stage-2 fault.

Same here.

> +   Please note that this does not necessarily enable the workaround,
> +   as it depends on the alternative framework, which will only patch
> +   the kernel if an affected CPU is detected.
> +
> +   If unsure, say Y.
> +
>  config ARM64_ERRATUM_845719
>   bool "Cortex-A53: 845719: a load might read incorrect data"
>   depends on COMPAT
> diff --git a/arch/arm64/include/asm/cpufeature.h 
> b/arch/arm64/include/asm/cpufeature.h
> index 11d5bb0f..52722ee 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -29,8 +29,9 @@
>  #define ARM64_HAS_PAN4
>  #define ARM64_HAS_LSE_ATOMICS5
>  #define ARM64_WORKAROUND_CAVIUM_231546
> +#define ARM64_WORKAROUND_834220  7
>  
> -#define ARM64_NCAPS  7
> +#define ARM64_NCAPS  8
>  
>  #ifndef __ASSEMBLY__
>  
> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> index 24926f2..feb6b4e 100644
> --- a/arch/arm64/kernel/cpu_errata.c
> +++ b/arch/arm64/kernel/cpu_errata.c
> @@ -75,6 +75,15 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
>  (1 << MIDR_VARIANT_SHIFT) | 2),
>   },
>  #endif
> +#ifdef CONFIG_ARM64_ERRATUM_834220
> + {
> + /* Cortex-A57 r0p0 - r1p2 */
> + .desc = "ARM erratum 834220",
> + .capability = ARM64_WORKAROUND_834220,
> + MIDR_RANGE(MIDR_CORTEX_A57, 0x00,
> +(1 << MIDR_VARIANT_SHIFT) | 2),
> + },
> +#endif
>  #ifdef CONFIG_ARM64_ERRATUM_845719
>   {
>   /* Cortex-A53 r0p[01234] */
> diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
> index 1599701..ff2e038 100644
> --- a/arch/arm64/kvm/hyp.S
> +++ b/arch/arm64/kvm/hyp.S
> @@ -1015,9 +1015,15 @@ el1_trap:
>   b.ne1f  // Not an abort we care about
>  
>   /* This is an abort. Check for permission fault */
> +alternative_if_not ARM64_WORKAROUND_834220
>   and x2, x1, #ESR_ELx_FSC_TYPE
>   cmp x2, #FSC_PERM
>   b.ne1f  // Not a permission fault
> +alternative_else
> + nop // Use the permission fault path to
> + nop // check for a valid S1 translation,
> + nop // regardless of the ESR value.
> +alternative_endif

With the cosmetic changes:

  Reviewed-by: Will Deacon <will.dea...@arm.com>

Can you cc stable as well, please?

Will
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/2] kvmtool: assume dead vcpus are paused too

2015-11-05 Thread Will Deacon

On Wed, Nov 04, 2015 at 06:51:12PM -0500, Sasha Levin wrote:
> On 11/04/2015 06:51 AM, Will Deacon wrote:
> > +   mutex_lock(_lock);
> > +
> > +   /* The kvm->cpus array contains a null pointer in the last location */
> > +   for (i = 0; ; i++) {
> > +   if (kvm->cpus[i])
> > +   pthread_kill(kvm->cpus[i]->thread, SIGKVMEXIT);
> > +   else
> > +   break;
> > +   }
> > +
> > +   kvm__continue(kvm);
> 
> In this scenario: if we grabbed pause_lock, signaled vcpu0 to exit, and it did
> before we called kvm__continue(), we won't end up releasing pause_lock, which
> might cause a lockup later, no?

Hmm, yeah, maybe that should be an explicit mutex_unlock rather than a
call to kvm__continue.

Will
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/2] kvmtool: assume dead vcpus are paused too

2015-11-04 Thread Will Deacon

Hi Sasha,

[adding the kvm list; not sure why it was dropped]

On Wed, Oct 28, 2015 at 01:34:25PM +, Will Deacon wrote:
> On Wed, Oct 28, 2015 at 09:00:07AM -0400, Sasha Levin wrote:
> > On 10/28/2015 08:27 AM, Will Deacon wrote:
> > > On Tue, Oct 27, 2015 at 10:08:44PM -0400, Sasha Levin wrote:
> > >> > On 10/27/2015 12:08 PM, Will Deacon wrote:
> > >>>> > >>   for (i = 0; i < kvm->nrcpus; i++)
> > >>>>> > >> > -  pthread_kill(kvm->cpus[i]->thread, SIGKVMPAUSE);
> > >>>>> > >> > +  if (kvm->cpus[i]->is_running)
> > >>>>> > >> > +  pthread_kill(kvm->cpus[i]->thread, 
> > >>>>> > >> > SIGKVMPAUSE);
> > >>>>> > >> > +  else
> > >>>>> > >> > +  paused_vcpus++;
> > >>> > > What serialises the test of kvm->cpus[i]->is_running against a 
> > >>> > > concurrent
> > >>> > > SIGKVMEXIT?
> > >> > 
> > >> > If there's a pause signal pending we'd still end up marking it as 
> > >> > paused
> > >> > and do the whole process even though the vcpu is actually dead, so 
> > >> > while
> > >> > we can race with SIGKVMEXIT, it'll just mean we'd go through the whole
> > >> > pausing procedure even though the vcpu is dead.
> > >> > 
> > >> > Or did you mean something else?
> > > I couldn't convince myself there was an issue here (hence the question ;),
> > > but I was wondering about something like:
> > > 
> > >   1. The VCPUn thread takes SIGKVMEXIT (e.g. due to kvm_cpu__reboot)
> > >   2. The IPC thread handles a pause event and reads 
> > > kvm->cpus[n]->is_running
> > >  as true
> > >   3. VCPUn sets kvm->cpus[n]->is_running to false
> > >   4. VCPUn exits
> > >   5. The IPC thread trues to pthread_kill on an exited thread
> > > 
> > > am I missing some obvious synchronisation here?
> > 
> > Can we take two signals in parallel? (I'm really not sure). If yes, than
> > what you've described is a problem (and has been for a while). If not,
> > then no :)
> 
> Regardless, isn't the pause event coming from polling the IPC file
> descriptor as opposed to a signal?

It looks like lkvm is currently SEGVing on exit due to the original
issue. Digging deeper, it looks like we should be taking the pause_lock
for the SIGKVMEXIT case too, so that we can serialise access to is_running.

What do you think about the patch below?

Will

--->8

diff --git a/hw/i8042.c b/hw/i8042.c
index 3801e20a675d..288b7d1108ac 100644
--- a/hw/i8042.c
+++ b/hw/i8042.c
@@ -163,7 +163,7 @@ static void kbd_write_command(struct kvm *kvm, u8 val)
state.mode &= ~MODE_DISABLE_AUX;
break;
case I8042_CMD_SYSTEM_RESET:
-   kvm_cpu__reboot(kvm);
+   kvm__reboot(kvm);
break;
default:
break;
diff --git a/include/kvm/kvm-cpu.h b/include/kvm/kvm-cpu.h
index aa0cb543f8fb..c4c9cca449eb 100644
--- a/include/kvm/kvm-cpu.h
+++ b/include/kvm/kvm-cpu.h
@@ -12,7 +12,6 @@ void kvm_cpu__reset_vcpu(struct kvm_cpu *vcpu);
 void kvm_cpu__setup_cpuid(struct kvm_cpu *vcpu);
 void kvm_cpu__enable_singlestep(struct kvm_cpu *vcpu);
 void kvm_cpu__run(struct kvm_cpu *vcpu);
-void kvm_cpu__reboot(struct kvm *kvm);
 int kvm_cpu__start(struct kvm_cpu *cpu);
 bool kvm_cpu__handle_exit(struct kvm_cpu *vcpu);
 int kvm_cpu__get_endianness(struct kvm_cpu *vcpu);
diff --git a/include/kvm/kvm.h b/include/kvm/kvm.h
index 37155dbc132b..a474dae6c2cf 100644
--- a/include/kvm/kvm.h
+++ b/include/kvm/kvm.h
@@ -93,6 +93,7 @@ int kvm__register_mmio(struct kvm *kvm, u64 phys_addr, u64 
phys_addr_len, bool c
   void (*mmio_fn)(struct kvm_cpu *vcpu, u64 addr, u8 
*data, u32 len, u8 is_write, void *ptr),
void *ptr);
 bool kvm__deregister_mmio(struct kvm *kvm, u64 phys_addr);
+void kvm__reboot(struct kvm *kvm);
 void kvm__pause(struct kvm *kvm);
 void kvm__continue(struct kvm *kvm);
 void kvm__notify_paused(void);
diff --git a/kvm-cpu.c b/kvm-cpu.c
index ad4441b1d96c..3e2037e3ccb3 100644
--- a/kvm-cpu.c
+++ b/kvm-cpu.c
@@ -45,10 +45,8 @@ void kvm_cpu__run(struct kvm_cpu *vcpu)
 static void kvm_cpu_signal_handler(int signum)
 {
if (signum == SIGKVMEXIT) {
-   if (current_kvm_cpu && current_kvm_cpu->is_running) {
+   if (current_kvm_cpu && current_kvm_cpu->is_running)
curre

Re: [PATCH 1/2] Makefile: allow overriding CFLAGS on the command line

2015-11-04 Thread Will Deacon

On Wed, Nov 04, 2015 at 12:02:23PM +0200, Riku Voipio wrote:
> On 30 October 2015 at 19:20, Andre Przywara  wrote:
> > When a Makefile variable is set on the make command line, all
> > Makefile-internal assignments to that very variable are _ignored_.
> > Since we add quite some essential values to CFLAGS internally,
> > specifying some CFLAGS on the command line will usually break the
> > build (and not fix any include file problems you hoped to overcome
> > with that).
> > Somewhat against intuition GNU make provides the "override" directive
> > to change this behavior; with that assignments in the Makefile get
> > _appended_ to the value given on the command line. [1]
> >
> > Change any internal assignments to use that directive, so that a user
> > can use:
> > $ make CFLAGS=/path/to/my/include/dir
> > to teach kvmtool about non-standard header file locations (helpful
> > for cross-compilation) or to tweak other compiler options.
> >
> > Signed-off-by: Andre Przywara 
> >
> > [1] 
> > https://www.gnu.org/software/make/manual/html_node/Override-Directive.html
> > ---
> >  Makefile | 15 +++
> >  1 file changed, 7 insertions(+), 8 deletions(-)
> >
> > diff --git a/Makefile b/Makefile
> > index f8f7cc4..77a7c9f 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -15,9 +15,7 @@ include config/utilities.mak
> >  include config/feature-tests.mak
> >
> >  CC := $(CROSS_COMPILE)gcc
> > -CFLAGS :=
> >  LD := $(CROSS_COMPILE)ld
> > -LDFLAGS:=
> 
> This breaks builds of debian packages as dpkg-buildpackage sets LDFLAGS
> to something unsuitable for guest init.

Thanks for the report, Riku. I'll revert this patch while we rethink how
to support user-supplied toolchain flags.

Will
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] KVM/arm: kernel low level debug support for ARM32 virtual platforms

2015-11-03 Thread Will Deacon

On Tue, Nov 03, 2015 at 09:44:52AM -0800, Mario Smarduch wrote:
> On 11/3/2015 8:33 AM, Christopher Covington wrote:
> > On 11/02/2015 06:51 PM, Mario Smarduch wrote:
> >>this is a re-post from couple weeks ago, please take time to review 
> >> this 
> >> simple patch which simplifies DEBUG_LL and prevents kernel crash on 
> >> virtual 
> >> platforms.
> >>
> >> Before this patch DEBUG_LL for 'dummy virtual machine':
> >>
> >> ( ) Kernel low-level debugging via EmbeddedICE DCC channel
> >> ( ) Kernel low-level debug output via semihosting I/O
> >> ( ) Kernel low-level debugging via 8250 UART
> >> ( ) Kernel low-level debugging via ARM Ltd PL01x Primecell
> >>
> >> In summary if debug uart is not emulated kernel crashes.
> >> And once you pass that hurdle, uart physical/virtual addresses are unknown.
> >> DEBUG_LL comes in handy on many occasions and should be somewhat 
> >> intuitive to use like it is for physical platforms. For virtual platforms
> >> user may start daubting the host and get into a bigger mess.
> >>
> >> After this patch is applied user gets:
> >>
> >> (X) Kernel low-level debugging on QEMU Virtual Platform
> >> ( ) Kernel low-level debugging on Kvmtool Virtual Platform
> >>. above repeated 
> >>
> >> The virtual addresses selected follow arm reference models, high in 
> >> vmalloc 
> >> section with high mem enabled and guest running with >= 1GB of memory. The 
> >> offset is leftover from arm reference models.
> > 
> > Which model? It doesn't appear to match the vexpress AEM/RTSM/FVP/whatever
> > which used 0x1c09 for UART0.
> 
> I recall QEMU virt model had it's own physical address map, for sure I saw the
> virtio-mmio regions assigned in some ARM document. Peter would you know?
> 
> As far as kvmtool I'm not sure, currently PC1 COM1 port is used? Andre will 
> that
> stay fixed?

We make absolutely no guarantees about the memory map provided by kvmtool.

Will
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/7] kvmtool: Cleanup kernel loading

2015-11-02 Thread Will Deacon

On Fri, Oct 30, 2015 at 06:26:53PM +, Andre Przywara wrote:
> Hi,

Hello Andre,

> this series cleans up kvmtool's kernel loading functionality a bit.
> It has been broken out of a previous series I sent [1] and contains
> just the cleanup and bug fix parts, which should be less controversial
> and thus easier to merge ;-)
> I will resend the pipe loading part later on as a separate series.
> 
> The first patch properly abstracts kernel loading to move
> responsibility into each architecture's code. It removes quite some
> ugly code from the generic kvm.c file.
> The later patches address the naive usage of read(2) to, well, read
> data from files. Doing this without coping with the subtleties of
> the UNIX read semantics (returning with less or none data read is not
> an error) can provoke hard to debug failures.
> So these patches make use of the existing and one new wrapper function
> to make sure we read everything we actually wanted to.
> The last patch moves the ARM kernel loading code into the proper
> location to be in line with the other architectures.
> 
> Please have a look and give some comments!

Looks good to me, but I'd like to see some comments from some mips/ppc/x86
people on the changes you're making over there.

Will
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/7] kvmtool: Cleanup kernel loading

2015-11-02 Thread Will Deacon

On Fri, Oct 30, 2015 at 06:26:53PM +, Andre Przywara wrote:
> Hi,

Hello Andre,

> this series cleans up kvmtool's kernel loading functionality a bit.
> It has been broken out of a previous series I sent [1] and contains
> just the cleanup and bug fix parts, which should be less controversial
> and thus easier to merge ;-)
> I will resend the pipe loading part later on as a separate series.
> 
> The first patch properly abstracts kernel loading to move
> responsibility into each architecture's code. It removes quite some
> ugly code from the generic kvm.c file.
> The later patches address the naive usage of read(2) to, well, read
> data from files. Doing this without coping with the subtleties of
> the UNIX read semantics (returning with less or none data read is not
> an error) can provoke hard to debug failures.
> So these patches make use of the existing and one new wrapper function
> to make sure we read everything we actually wanted to.
> The last patch moves the ARM kernel loading code into the proper
> location to be in line with the other architectures.
> 
> Please have a look and give some comments!

Looks good to me, but I'd like to see some comments from some mips/ppc/x86
people on the changes you're making over there.

Will
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC PATCH] vfio/type1: Do not support IOMMUs that allow bypass

2015-10-29 Thread Will Deacon

On Tue, Oct 27, 2015 at 10:00:11AM -0600, Alex Williamson wrote:
> On Tue, 2015-10-27 at 15:40 +, Will Deacon wrote:
> > On Fri, Oct 16, 2015 at 09:51:22AM -0600, Alex Williamson wrote:
> > > Would it be possible to add iommu_domain_geometry support to arm-smmu.c?
> > > In addition to this test to verify that DMA cannot bypass the IOMMU, I'd
> > > eventually like to pass the aperture information out through the VFIO
> > > API.  Thanks,
> > 
> > The slight snag here is that we don't know which SMMU in the system the
> > domain is attached to at the point when the geometry is queried, so I
> > can't give you an upper bound on the aperture. For example, if there is
> > an SMMU with a 32-bit input address and another with a 48-bit input
> > address.
> > 
> > We could play the same horrible games that we do with the pgsize bitmap,
> > and truncate some global aperture everytime we probe an SMMU device, but
> > I'd really like to have fewer hacks like that if possible. The root of
> > the problem is still that domains are allocated for a bus, rather than
> > an IOMMU instance.
> 
> Yes, Intel VT-d has this issue as well.  In theory we can have
> heterogeneous IOMMU hardware units (DRHDs) in a system and the upper
> bound of the geometry could be diminished if we add a less capable DRHD
> into the domain.  I suspect this is more a theoretical problem than a
> practical one though as we're typically mixing similar DRHDs and I think
> we're still capable of 39-bit addressing in the least capable version
> per the spec.
> 
> In any case, I really want to start testing geometry.force_aperture,
> even if we're not yet comfortable to expose the IOMMU limits to the
> user.  The vfio type1 shouldn't be enabled at all for underlying
> hardware that allows DMA bypass.  Thanks,

Ok, I'll put it on my list of things to look at under the assumption that
the actual aperture limits don't need to be accurate as long as DMA to
an arbitrary unmapped address always faults.

Will
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC PATCH] vfio/type1: Do not support IOMMUs that allow bypass

2015-10-29 Thread Will Deacon

On Thu, Oct 29, 2015 at 06:42:10PM +, Robin Murphy wrote:
> On 29/10/15 18:28, Will Deacon wrote:
> >On Tue, Oct 27, 2015 at 10:00:11AM -0600, Alex Williamson wrote:
> >>On Tue, 2015-10-27 at 15:40 +, Will Deacon wrote:
> >>>On Fri, Oct 16, 2015 at 09:51:22AM -0600, Alex Williamson wrote:
> >>>>Would it be possible to add iommu_domain_geometry support to arm-smmu.c?
> >>>>In addition to this test to verify that DMA cannot bypass the IOMMU, I'd
> >>>>eventually like to pass the aperture information out through the VFIO
> >>>>API.  Thanks,
> >>>
> >>>The slight snag here is that we don't know which SMMU in the system the
> >>>domain is attached to at the point when the geometry is queried, so I
> >>>can't give you an upper bound on the aperture. For example, if there is
> >>>an SMMU with a 32-bit input address and another with a 48-bit input
> >>>address.
> >>>
> >>>We could play the same horrible games that we do with the pgsize bitmap,
> >>>and truncate some global aperture everytime we probe an SMMU device, but
> >>>I'd really like to have fewer hacks like that if possible. The root of
> >>>the problem is still that domains are allocated for a bus, rather than
> >>>an IOMMU instance.
> >>
> >>Yes, Intel VT-d has this issue as well.  In theory we can have
> >>heterogeneous IOMMU hardware units (DRHDs) in a system and the upper
> >>bound of the geometry could be diminished if we add a less capable DRHD
> >>into the domain.  I suspect this is more a theoretical problem than a
> >>practical one though as we're typically mixing similar DRHDs and I think
> >>we're still capable of 39-bit addressing in the least capable version
> >>per the spec.
> >>
> >>In any case, I really want to start testing geometry.force_aperture,
> >>even if we're not yet comfortable to expose the IOMMU limits to the
> >>user.  The vfio type1 shouldn't be enabled at all for underlying
> >>hardware that allows DMA bypass.  Thanks,
> >
> >Ok, I'll put it on my list of things to look at under the assumption that
> >the actual aperture limits don't need to be accurate as long as DMA to
> >an arbitrary unmapped address always faults.
> 
> I'm pretty sure we'd only ever set the aperture to the full input address
> range anyway (since we're not a GART-type thing), in which case we should
> only need to worry about unmatched streams that don't hit in a domain at
> all. Doesn't the disable_bypass option already cover that? (FWIW I hacked it
> up for v2 a while back, too[0]).

Well, the "full input address range" is tricky when you have multiple SMMU
instances with different input address sizes. I can do something similar
to the pgsize_bitmap.

I also don't think the disable_bypass option is what we're after -- this
is about devices attached to a VFIO domain that can still mysteriously
bypass the SMMU for some ranges AFAIU (and shouldn't be an issue for ARM).

Will
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] vfio/type1: handle case where IOMMU does not support PAGE_SIZE size

2015-10-29 Thread Will Deacon

On Thu, Oct 29, 2015 at 01:59:45PM +, Eric Auger wrote:
> Current vfio_pgsize_bitmap code hides the supported IOMMU page
> sizes smaller than PAGE_SIZE. As a result, in case the IOMMU
> does not support PAGE_SIZE page, the alignment check on map/unmap
> is done with larger page sizes, if any. This can fail although
> mapping could be done with pages smaller than PAGE_SIZE.
> 
> This patch modifies vfio_pgsize_bitmap implementation so that,
> in case the IOMMU supports page sizes smaller than PAGE_HOST
> we pretend PAGE_HOST is supported and hide sub-PAGE_HOST sizes.
> That way the user will be able to map/unmap buffers whose size/
> start address is aligned with PAGE_HOST. Pinning code uses that
> granularity while iommu driver can use the sub-PAGE_HOST size
> to map the buffer.
> 
> Signed-off-by: Eric Auger <eric.au...@linaro.org>
> Signed-off-by: Alex Williamson <alex.william...@redhat.com>
> 
> ---
> 
> This was tested on AMD Seattle with 64kB page host. ARM MMU 401
> currently expose 4kB, 2MB and 1GB page support. With a 64kB page host,
> the map/unmap check is done against 2MB. Some alignment check fail
> so VFIO_IOMMU_MAP_DMA fail while we could map using 4kB IOMMU page
> size.
> 
> RFC -> PATCH v1:
> - move all modifications in vfio_pgsize_bitmap following Alex'
>   suggestion to expose a fake PAGE_HOST support
> - restore WARN_ON's
> ---
>  drivers/vfio/vfio_iommu_type1.c | 15 ++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 57d8c37..cee504a 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -403,13 +403,26 @@ static void vfio_remove_dma(struct vfio_iommu *iommu, 
> struct vfio_dma *dma)
>  static unsigned long vfio_pgsize_bitmap(struct vfio_iommu *iommu)
>  {
>   struct vfio_domain *domain;
> - unsigned long bitmap = PAGE_MASK;
> + unsigned long bitmap = ULONG_MAX;
>  
>   mutex_lock(>lock);
>   list_for_each_entry(domain, >domain_list, next)
>   bitmap &= domain->domain->ops->pgsize_bitmap;
>   mutex_unlock(>lock);
>  
> + /*
> +  * In case the IOMMU supports page sizes smaller than PAGE_HOST
> +  * we pretend PAGE_HOST is supported and hide sub-PAGE_HOST sizes.
> +  * That way the user will be able to map/unmap buffers whose size/
> +  * start address is aligned with PAGE_HOST. Pinning code uses that
> +  * granularity while iommu driver can use the sub-PAGE_HOST size
> +  * to map the buffer.
> +  */
> + if (bitmap & ~PAGE_MASK) {
> + bitmap &= PAGE_MASK;
> + bitmap |= PAGE_SIZE;
> + }
> +

s/PAGE_HOST/PAGE_SIZE/g (in the cover-letter too) and then I think this
looks good:

  Acked-by: Will Deacon <will.dea...@arm.com>

Will
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC] vfio/type1: handle case where IOMMU does not support PAGE_SIZE size

2015-10-28 Thread Will Deacon

On Wed, Oct 28, 2015 at 01:12:45PM +, Eric Auger wrote:
> Current vfio_pgsize_bitmap code hides the supported IOMMU page
> sizes smaller than PAGE_SIZE. As a result, in case the IOMMU
> does not support PAGE_SIZE page, the alignment check on map/unmap
> is done with larger page sizes, if any. This can fail although
> mapping could be done with pages smaller than PAGE_SIZE.
> 
> vfio_pgsize_bitmap is modified to expose the IOMMU page sizes,
> supported by all domains, even those smaller than PAGE_SIZE. The
> alignment check on map is performed against PAGE_SIZE if the minimum
> IOMMU size is less than PAGE_SIZE or against the min page size greater
> than PAGE_SIZE.
> 
> Signed-off-by: Eric Auger 
> 
> ---
> 
> This was tested on AMD Seattle with 64kB page host. ARM MMU 401
> currently expose 4kB, 2MB and 1GB page support. With a 64kB page host,
> the map/unmap check is done against 2MB. Some alignment check fail
> so VFIO_IOMMU_MAP_DMA fail while we could map using 4kB IOMMU page
> size.
> ---
>  drivers/vfio/vfio_iommu_type1.c | 25 +++--
>  1 file changed, 11 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 57d8c37..13fb974 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -403,7 +403,7 @@ static void vfio_remove_dma(struct vfio_iommu *iommu, 
> struct vfio_dma *dma)
>  static unsigned long vfio_pgsize_bitmap(struct vfio_iommu *iommu)
>  {
>   struct vfio_domain *domain;
> - unsigned long bitmap = PAGE_MASK;
> + unsigned long bitmap = ULONG_MAX;
>  
>   mutex_lock(>lock);
>   list_for_each_entry(domain, >domain_list, next)
> @@ -416,20 +416,18 @@ static unsigned long vfio_pgsize_bitmap(struct 
> vfio_iommu *iommu)
>  static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
>struct vfio_iommu_type1_dma_unmap *unmap)
>  {
> - uint64_t mask;
>   struct vfio_dma *dma;
>   size_t unmapped = 0;
>   int ret = 0;
> + unsigned int min_pagesz = __ffs(vfio_pgsize_bitmap(iommu));
> + unsigned int requested_alignment = (min_pagesz < PAGE_SIZE) ?
> + PAGE_SIZE : min_pagesz;

max_t ?

>  
> - mask = ((uint64_t)1 << __ffs(vfio_pgsize_bitmap(iommu))) - 1;
> -
> - if (unmap->iova & mask)
> + if (!IS_ALIGNED(unmap->iova, requested_alignment))
>   return -EINVAL;
> - if (!unmap->size || unmap->size & mask)
> + if (!unmap->size || !IS_ALIGNED(unmap->size, requested_alignment))
>   return -EINVAL;
>  
> - WARN_ON(mask & PAGE_MASK);
> -
>   mutex_lock(>lock);
>  
>   /*
> @@ -553,25 +551,24 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
>   size_t size = map->size;
>   long npage;
>   int ret = 0, prot = 0;
> - uint64_t mask;
>   struct vfio_dma *dma;
>   unsigned long pfn;
> + unsigned int min_pagesz = __ffs(vfio_pgsize_bitmap(iommu));
> + unsigned int requested_alignment = (min_pagesz < PAGE_SIZE) ?
> + PAGE_SIZE : min_pagesz;

Same code here. Perhaps you need another function to get the alignment?

Otherwise, this looks pretty straightforward to me; iommu_map will take
care of splitting up the requests to the IOMMU driver so they are in the
right chunks.

Will
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC] vfio/type1: handle case where IOMMU does not support PAGE_SIZE size

2015-10-28 Thread Will Deacon

On Wed, Oct 28, 2015 at 10:27:28AM -0600, Alex Williamson wrote:
> On Wed, 2015-10-28 at 13:12 +, Eric Auger wrote:
> > diff --git a/drivers/vfio/vfio_iommu_type1.c 
> > b/drivers/vfio/vfio_iommu_type1.c
> > index 57d8c37..13fb974 100644
> > --- a/drivers/vfio/vfio_iommu_type1.c
> > +++ b/drivers/vfio/vfio_iommu_type1.c
> > @@ -403,7 +403,7 @@ static void vfio_remove_dma(struct vfio_iommu *iommu, 
> > struct vfio_dma *dma)
> >  static unsigned long vfio_pgsize_bitmap(struct vfio_iommu *iommu)
> >  {
> > struct vfio_domain *domain;
> > -   unsigned long bitmap = PAGE_MASK;
> > +   unsigned long bitmap = ULONG_MAX;
> 
> Isn't this and removing the WARN_ON()s the only real change in this
> patch?  The rest looks like conversion to use IS_ALIGNED and the
> following test, that I don't really understand...
> 
> >  
> > mutex_lock(>lock);
> > list_for_each_entry(domain, >domain_list, next)
> > @@ -416,20 +416,18 @@ static unsigned long vfio_pgsize_bitmap(struct 
> > vfio_iommu *iommu)
> >  static int vfio_dma_do_unmap(struct vfio_iommu *iommu,
> >  struct vfio_iommu_type1_dma_unmap *unmap)
> >  {
> > -   uint64_t mask;
> > struct vfio_dma *dma;
> > size_t unmapped = 0;
> > int ret = 0;
> > +   unsigned int min_pagesz = __ffs(vfio_pgsize_bitmap(iommu));
> > +   unsigned int requested_alignment = (min_pagesz < PAGE_SIZE) ?
> > +   PAGE_SIZE : min_pagesz;
> 
> This one.  If we're going to support sub-PAGE_SIZE mappings, why do we
> care to cap alignment at PAGE_SIZE?

Eric can clarify, but I think the intention here is to have VFIO continue
doing things in PAGE_SIZE chunks precisely so that we don't have to rework
all of the pinning code etc. The IOMMU API can then deal with the smaller
page size.

> > -   mask = ((uint64_t)1 << __ffs(vfio_pgsize_bitmap(iommu))) - 1;
> > -
> > -   if (unmap->iova & mask)
> > +   if (!IS_ALIGNED(unmap->iova, requested_alignment))
> > return -EINVAL;
> > -   if (!unmap->size || unmap->size & mask)
> > +   if (!unmap->size || !IS_ALIGNED(unmap->size, requested_alignment))
> > return -EINVAL;
> >  
> > -   WARN_ON(mask & PAGE_MASK);
> > -
> > mutex_lock(>lock);
> >  
> > /*
> > @@ -553,25 +551,24 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
> > size_t size = map->size;
> > long npage;
> > int ret = 0, prot = 0;
> > -   uint64_t mask;
> > struct vfio_dma *dma;
> > unsigned long pfn;
> > +   unsigned int min_pagesz = __ffs(vfio_pgsize_bitmap(iommu));
> > +   unsigned int requested_alignment = (min_pagesz < PAGE_SIZE) ?
> > +   PAGE_SIZE : min_pagesz;
> >  
> > /* Verify that none of our __u64 fields overflow */
> > if (map->size != size || map->vaddr != vaddr || map->iova != iova)
> > return -EINVAL;
> >  
> > -   mask = ((uint64_t)1 << __ffs(vfio_pgsize_bitmap(iommu))) - 1;
> > -
> > -   WARN_ON(mask & PAGE_MASK);
> > -
> > /* READ/WRITE from device perspective */
> > if (map->flags & VFIO_DMA_MAP_FLAG_WRITE)
> > prot |= IOMMU_WRITE;
> > if (map->flags & VFIO_DMA_MAP_FLAG_READ)
> > prot |= IOMMU_READ;
> >  
> > -   if (!prot || !size || (size | iova | vaddr) & mask)
> > +   if (!prot || !size ||
> > +   !IS_ALIGNED(size | iova | vaddr, requested_alignment))
> > return -EINVAL;
> >  
> > /* Don't allow IOVA or virtual address wrap */
> 
> This is mostly ignoring the problems with sub-PAGE_SIZE mappings.  For
> instance, we can only pin on PAGE_SIZE and therefore we only do
> accounting on PAGE_SIZE, so if the user does 4K mappings across your 64K
> page, that page gets pinned and accounted 16 times.  Are we going to
> tell users that their locked memory limit needs to be 16x now?  The rest
> of the code would need an audit as well to see what other sub-page bugs
> might be hiding.  Thanks,

I don't see that. The pinning all happens the same in VFIO, which can
then happily pass a 64k region to iommu_map. iommu_map will then call
->map in 4k chunks on the IOMMU driver ops.

Will
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC] vfio/type1: handle case where IOMMU does not support PAGE_SIZE size

2015-10-28 Thread Will Deacon

On Wed, Oct 28, 2015 at 06:48:41PM +0100, Eric Auger wrote:
> On 10/28/2015 06:37 PM, Alex Williamson wrote:
> > Ok, so with hopefully correcting my understand of what this does, isn't
> > this effectively the same:
> > 
> > diff --git a/drivers/vfio/vfio_iommu_type1.c 
> > b/drivers/vfio/vfio_iommu_type1.c
> > index 57d8c37..7db4f5a 100644
> > --- a/drivers/vfio/vfio_iommu_type1.c
> > +++ b/drivers/vfio/vfio_iommu_type1.c
> > @@ -403,13 +403,19 @@ static void vfio_remove_dma(struct vfio_iommu *iommu, 
> > stru
> >  static unsigned long vfio_pgsize_bitmap(struct vfio_iommu *iommu)
> >  {
> > struct vfio_domain *domain;
> > -   unsigned long bitmap = PAGE_MASK;
> > +   unsigned long bitmap = ULONG_MAX;
> >  
> > mutex_lock(>lock);
> > list_for_each_entry(domain, >domain_list, next)
> > bitmap &= domain->domain->ops->pgsize_bitmap;
> > mutex_unlock(>lock);
> >  
> > +   /* Some comment about how the IOMMU API splits requests */
> > +   if (bitmap & ~PAGE_MASK) {
> > +   bitmap &= PAGE_MASK;
> > +   bitmap |= PAGE_SIZE;
> > +   }
> > +
> > return bitmap;
> >  }
> Yes, to me it is indeed the same
> >  
> > This would also expose to the user that we're accepting PAGE_SIZE, which
> > we weren't before, so it was not quite right to just let them do it
> > anyway.  I don't think we even need to get rid of the WARN_ONs, do we?
> > Thanks,
> 
> The end-user might be afraid of those latter. Personally I would get rid
> of them but that's definitively up to you.

I think Alex's point is that the WARN_ON's won't trigger with this patch,
because he clears those lower bits in the bitmap.

Will
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Network hangs when communicating with host

2015-10-27 Thread Will Deacon

[apologies for the delay -- I've been off for a week and am catching up
 on email]

On Tue, Oct 20, 2015 at 09:58:51AM -0400, Sasha Levin wrote:
> On 10/20/2015 09:42 AM, Dmitry Vyukov wrote:
> > I now have another issue. My binary fails to mmap a file within lkvm
> > sandbox. The same binary works fine on host and in qemu. I've added
> > strace into sandbox script, and here is the output:
> > 
> > [pid   837] openat(AT_FDCWD, "syzkaller-shm048878722", O_RDWR|O_CLOEXEC) = 5
> > [pid   837] mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_SHARED, 5,
> > 0) = -1 EINVAL (Invalid argument)
> > 
> > I don't see anything that can potentially cause EINVAL here. Is it
> > possible that lkvm somehow affects kernel behavior here?
> > 
> > I run lkvm as:
> > 
> > $ taskset 1 /kvmtool/lkvm sandbox --disk syz-0 --mem=2048 --cpus=2
> > --kernel /arch/x86/boot/bzImage --network mode=user --sandbox
> > /workdir/kvm/syz-0.sh
> 
> It's possible that something in the virtio-9p layer is broken. I'll give
> it a look in the evening.

I ended up with the patch below, but it's really ugly and I didn't get
round to posting it.

Will

--->8

>From 7cbcdfef1b9f094db4bf75676f22339f3164e103 Mon Sep 17 00:00:00 2001
From: Will Deacon <will.dea...@arm.com>
Date: Fri, 17 Apr 2015 17:31:36 +0100
Subject: [PATCH] kvmtool: virtio-net: fix VIRTIO_NET_F_MRG_RXBUF usage in rx
 thread

When merging virtio-net buffers using the VIRTIO_NET_F_MRG_RXBUF feature,
the first buffer added to the used ring should indicate the total number
of buffers used to hold the packet. Unfortunately, kvmtool has a number
of issues when constructing these merged buffers:

  - Commit 5131332e3f1a ("kvmtool: convert net backend to support
bi-endianness") introduced a strange loop counter, which resulted in
hdr->num_buffers being set redundantly the first time round

  - When adding the buffers to the ring, we actually add them one-by-one,
allowing the guest to see the header before we've inserted the rest
of the data buffers...

  - ... which is made worse because we non-atomically increment the
num_buffers count in the header each time we insert a new data buffer

Consequently, the guest quickly becomes confused in its net rx code and
the whole thing grinds to a halt. This is easily exemplified by trying
to boot a root filesystem over NFS, which seldom succeeds.

This patch resolves the issues by allowing us to insert items into the
used ring without updating the index. Once the full payload has been
added and num_buffers corresponds to the total size, we *then* publish
the buffers to the guest.

Cc: Marc Zyngier <marc.zyng...@arm.com>
Cc: Sasha Levin <sasha.le...@oracle.com>
Signed-off-by: Will Deacon <will.dea...@arm.com>
---
 include/kvm/virtio.h |  2 ++
 virtio/core.c| 32 +---
 virtio/net.c | 31 +++
 3 files changed, 42 insertions(+), 23 deletions(-)

diff --git a/include/kvm/virtio.h b/include/kvm/virtio.h
index 768ee9668d44..8324ba7d38be 100644
--- a/include/kvm/virtio.h
+++ b/include/kvm/virtio.h
@@ -112,6 +112,8 @@ static inline bool virt_queue__available(struct virt_queue 
*vq)
return virtio_guest_to_host_u16(vq, vq->vring.avail->idx) != 
vq->last_avail_idx;
 }
 
+void virt_queue__used_idx_advance(struct virt_queue *queue, u16 jump);
+struct vring_used_elem * virt_queue__set_used_elem_no_update(struct virt_queue 
*queue, u32 head, u32 len, u16 offset);
 struct vring_used_elem *virt_queue__set_used_elem(struct virt_queue *queue, 
u32 head, u32 len);
 
 bool virtio_queue__should_signal(struct virt_queue *vq);
diff --git a/virtio/core.c b/virtio/core.c
index 3b6e4d7cd045..d6ac289d450e 100644
--- a/virtio/core.c
+++ b/virtio/core.c
@@ -21,22 +21,17 @@ const char* virtio_trans_name(enum virtio_trans trans)
return "unknown";
 }
 
-struct vring_used_elem *virt_queue__set_used_elem(struct virt_queue *queue, 
u32 head, u32 len)
+void virt_queue__used_idx_advance(struct virt_queue *queue, u16 jump)
 {
-   struct vring_used_elem *used_elem;
u16 idx = virtio_guest_to_host_u16(queue, queue->vring.used->idx);
 
-   used_elem   = >vring.used->ring[idx % queue->vring.num];
-   used_elem->id   = virtio_host_to_guest_u32(queue, head);
-   used_elem->len  = virtio_host_to_guest_u32(queue, len);
-
/*
 * Use wmb to assure that used elem was updated with head and len.
 * We need a wmb here since we can't advance idx unless we're ready
 * to pass the used element to the guest.
 */
wmb();
-   idx++;
+   idx += jump;
queue->vring.used->idx = virtio_host_to_guest_u16(queue, idx);
 
/*
@@ -45,6 +40,29 @@ struct vring_used_elem *virt_queue__set_used_elem(struct 
virt_queue *queue

Re: [RFC PATCH] vfio/type1: Do not support IOMMUs that allow bypass

2015-10-27 Thread Will Deacon

On Fri, Oct 16, 2015 at 09:51:22AM -0600, Alex Williamson wrote:
> On Fri, 2015-10-16 at 16:03 +0200, Eric Auger wrote:
> > Hi Alex,
> > On 10/15/2015 10:52 PM, Alex Williamson wrote:
> > > We can only provide isolation if DMA is forced through the IOMMU
> > > aperture.  Don't allow type1 to be used if this is not the case.
> > > 
> > > Signed-off-by: Alex Williamson 
> > > ---
> > > 
> > > Eric, I see a number of IOMMU drivers enable this, do the ones you
> > > care about for ARM set geometry.force_aperture?  Thanks,
> > I am currently using arm-smmu.c. I don't see force_aperture being set.
> 
> Hi Will,

Hi Alex,

> Would it be possible to add iommu_domain_geometry support to arm-smmu.c?
> In addition to this test to verify that DMA cannot bypass the IOMMU, I'd
> eventually like to pass the aperture information out through the VFIO
> API.  Thanks,

The slight snag here is that we don't know which SMMU in the system the
domain is attached to at the point when the geometry is queried, so I
can't give you an upper bound on the aperture. For example, if there is
an SMMU with a 32-bit input address and another with a 48-bit input
address.

We could play the same horrible games that we do with the pgsize bitmap,
and truncate some global aperture everytime we probe an SMMU device, but
I'd really like to have fewer hacks like that if possible. The root of
the problem is still that domains are allocated for a bus, rather than
an IOMMU instance.

Will
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/3] kvmtool: don't overwrite /virt/init

2015-10-27 Thread Will Deacon

Hi Oleg,

On Thu, Oct 22, 2015 at 05:59:21PM +0200, Oleg Nesterov wrote:
> On top of "[PATCH 0/3] kvmtool: tiny init fox x86_64" I sent.
> 
> I simply can't understand why kvmtool always overwrites /virt/init.
> This doesn't look right, especially because you can't pass "init="
> kernel argument without another patch "[PATCH] kvmtool/run: append
> cfg.kernel_cmdline at the end of real_cmdline" I sent. And so far
> it is not clear to me if you are going to accept that patch or not.

Sorry, I was off last week and didn't have email access. I'm going through
my inbox now and your patches look pretty good to me. I'll apply/test and
then push out if I don't see any issues.

Thanks!

Will
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [kvmtool] Add a link to the lwn.net article

2015-10-06 Thread Will Deacon

Hi Sven,

On Sat, Oct 03, 2015 at 10:42:55PM +1000, Sven Dowideit wrote:
> Signed-off-by: Sven Dowideit 
> ---
>  README | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/README b/README
> index 6667f23..5501f05 100644
> --- a/README
> +++ b/README
> @@ -97,6 +97,9 @@ project:
>  
>  http://thread.gmane.org/gmane.linux.kernel/962051/focus=962620
>  
> +Another detailed example can be found in the lwn.net article:
> +
> +http://lwn.net/Articles/658511/

Currently, this link requires an lwn subscription but I'm happy to take
this patch once it becomes freely available later this week.

Will
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH kvmtool] Skip a few messages by default: command line args; flat binary; earlyprintk.

2015-10-06 Thread Will Deacon

On Wed, Sep 30, 2015 at 05:11:15PM +0100, Andre Przywara wrote:
> On 29/09/15 17:59, Dimitri John Ledkov wrote:
> > The partial command line args & earlyprintk=serial are still enabled
> > in the debug mode. Warning that a flat binary kernel image is attemped
> > to be loaded is completely dropped. These are not that informative,
> > once one is past intial debugging, and only polute the console.
> > 
> > Signed-off-by: Dimitri John Ledkov 
> > ---
> >  builtin-run.c | 10 ++
> >  kvm.c |  1 -
> >  x86/kvm.c |  8 ++--
> >  3 files changed, 12 insertions(+), 7 deletions(-)
> > 
> > diff --git a/builtin-run.c b/builtin-run.c
> > index e0c8732..8edbf88 100644
> > --- a/builtin-run.c
> > +++ b/builtin-run.c
> > @@ -613,10 +613,12 @@ static struct kvm *kvm_cmd_run_init(int argc, const 
> > char **argv)
> >  
> > kvm->cfg.real_cmdline = real_cmdline;
> >  
> > -   printf("  # %s run -k %s -m %Lu -c %d --name %s\n", KVM_BINARY_NAME,
> > -   kvm->cfg.kernel_filename,
> > -   (unsigned long long)kvm->cfg.ram_size / 1024 / 1024,
> > -   kvm->cfg.nrcpus, kvm->cfg.guest_name);
> > +   if (do_debug_print) {
> > +   printf("  # %s run -k %s -m %Lu -c %d --name %s\n", 
> > KVM_BINARY_NAME,
> > +  kvm->cfg.kernel_filename,
> > +  (unsigned long long)kvm->cfg.ram_size / 1024 / 1024,
> > +  kvm->cfg.nrcpus, kvm->cfg.guest_name);
> > +   }
> 
> I like the general idea. In fact I have this very patch (among others)
> in my tree too. I applied similar guarding to other messages as well
> (mostly those that only show up on ARM, but also the "ended normally"
> message). Like any good UNIX tool kvmtool should keep quiet if it has
> nothing worthwhile to say ;-)
> But looking at it more closely, I see that there is pr_debug() defined
> doing that "if (do_debug_print)" already. The only issue is that is
> prints source line information, which is not really useful here. But
> then again there does not seem to be any user of it?
> 
> So what about the following:
> - We avoid printing pr_info() messages in the default case. Looking at
> its current users in the tree this information is not really useful for
> normal users. We enable pr_info() output only if do_debug_print is
> enabled or introduce another command line flag (--verbose?) for that.
> - We check each user of pr_info() to see whether this information is
> actually "informational" or whether it should be converted to pr_warn.
> - We change the above line to use pr_info instead of printf.
> - We fix the EOL mayhem we have atm while at it.
> 
> If you don't mind I will give this a try later this week.

Sounds good to me.

Will
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] kvmtool: don't rely on $HOME

2015-09-18 Thread Will Deacon

On Fri, Sep 18, 2015 at 11:51:37AM +0100, Riku Voipio wrote:
> On 17 September 2015 at 18:53, Will Deacon <will.dea...@arm.com> wrote:
> > On Thu, Sep 17, 2015 at 03:03:15PM +0100, Alban Crequy wrote:
> >> kvm__set_dir() called in main() and kvm__get_dir() rely on $HOME. But in
> >> some environments (such as starting lkvm through systemd-run), $HOME is
> >> undefined. This causes bind() to use a socket path containing "(null)"
> >> and this fails. The current code does not check errors returned by
> >> realpath().
> >>
> >> Symptoms:
> >>
> >> | bind: No such file or directory
> >> |   Error: Failed adding socket to epoll
> >> |  Warning: Failed init: kvm_ipc__init
> >> |
> >> |  Fatal: Initialisation failed
> >>
> >> This bug was first reported on https://github.com/coreos/rkt/issues/1393
> >>
> >> Instead of using "$HOME/.lkvm/" (i.e. "/root/.lkvm/"), this patch uses
> >> "/var/lib/lkvm/". This also improve the error reporting by printing the
> >> socket filename.
> >
> > Hmm, but that requires lkvm to be run with sufficient privileges to
> > write to /var/lib, which I don't think is generally the case. I think we
> > have a few options:
> >
> >   (1) Try /var/lib/lkvm if $HOME is NULL
> >   (2) Use an alternative environment variable for the pid prefix
> >   (3) Add a --pid command line option for the pidfile
> >   (4) ???
> >
> > Any preferences? What do other projects do?
> 
> The right place to put a pid file would be $XDG_RUNTIME_DIR aka
> /run/user/$UID/. System services write their pidfiles and sockets to
> plain /run

Ok, that certainly sounds like the right things to do for temporary
structures then. Thanks.

> The place where to put the rootfs structure created in builtin-setup.c
> is more complicated. Is the  created rootfs epheremal? Then sticking
> under /run(user/UID) is fine.

Currently, the filesystem persists across multiple invocations of lkvm,
so I can imagine somebody being surprised if what they thought was
persistent was lost over something like a reboot.

Will
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] kvmtool: don't rely on $HOME

2015-09-17 Thread Will Deacon

On Thu, Sep 17, 2015 at 03:03:15PM +0100, Alban Crequy wrote:
> kvm__set_dir() called in main() and kvm__get_dir() rely on $HOME. But in
> some environments (such as starting lkvm through systemd-run), $HOME is
> undefined. This causes bind() to use a socket path containing "(null)"
> and this fails. The current code does not check errors returned by
> realpath().
> 
> Symptoms:
> 
> | bind: No such file or directory
> |   Error: Failed adding socket to epoll
> |  Warning: Failed init: kvm_ipc__init
> |
> |  Fatal: Initialisation failed
> 
> This bug was first reported on https://github.com/coreos/rkt/issues/1393
> 
> Instead of using "$HOME/.lkvm/" (i.e. "/root/.lkvm/"), this patch uses
> "/var/lib/lkvm/". This also improve the error reporting by printing the
> socket filename.

Hmm, but that requires lkvm to be run with sufficient privileges to
write to /var/lib, which I don't think is generally the case. I think we
have a few options:

  (1) Try /var/lib/lkvm if $HOME is NULL
  (2) Use an alternative environment variable for the pid prefix
  (3) Add a --pid command line option for the pidfile
  (4) ???

Any preferences? What do other projects do?

Will
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 kvmtool] Make static libc and guest-init functionality optional.

2015-09-15 Thread Will Deacon

Hi Dmitri,

On Fri, Sep 11, 2015 at 03:40:00PM +0100, Dimitri John Ledkov wrote:
> If one typically only boots full disk-images, one wouldn't necessaraly
> want to statically link glibc, for the guest-init feature of the
> kvmtool. As statically linked glibc triggers haevy security
> maintainance.
> 
> Signed-off-by: Dimitri John Ledkov <dimitri.j.led...@intel.com>
> ---
>  Changes since v1:
>  - rename CONFIG_HAS_LIBC to CONFIG_GUEST_INIT for clarity
>  - use more ifdefs, instead of runtime check of _binary_guest_init_size==0

The idea looks good, but I think we can tidy some of this up at the same
time by moving all the guest_init code in builtin_setup.c.

How about the patch below?

Will

--->8

>From cdce942c1a3a04635065a7972ca4e21386664756 Mon Sep 17 00:00:00 2001
From: Dimitri John Ledkov <dimitri.j.led...@intel.com>
Date: Fri, 11 Sep 2015 15:40:00 +0100
Subject: [PATCH] Make static libc and guest-init functionality optional.

If one typically only boots full disk-images, one wouldn't necessaraly
want to statically link glibc, for the guest-init feature of the
kvmtool. As statically linked glibc triggers haevy security
maintainance.

Signed-off-by: Dimitri John Ledkov <dimitri.j.led...@intel.com>
[will: moved all the guest_init handling into builtin_setup.c]
Signed-off-by: Will Deacon <will.dea...@arm.com>
---
 Makefile| 12 +++-
 builtin-run.c   | 29 +
 builtin-setup.c | 19 ++-
 include/kvm/builtin-setup.h |  1 +
 4 files changed, 23 insertions(+), 38 deletions(-)

diff --git a/Makefile b/Makefile
index 7b17d529d13b..f1701aa7b8ec 100644
--- a/Makefile
+++ b/Makefile
@@ -34,8 +34,6 @@ bindir_SQ = $(subst ','\'',$(bindir))
 PROGRAM:= lkvm
 PROGRAM_ALIAS := vm
 
-GUEST_INIT := guest/init
-
 OBJS   += builtin-balloon.o
 OBJS   += builtin-debug.o
 OBJS   += builtin-help.o
@@ -279,8 +277,13 @@ ifeq ($(LTO),1)
endif
 endif
 
-ifneq ($(call try-build,$(SOURCE_STATIC),,-static),y)
-$(error No static libc found. Please install glibc-static package.)
+ifeq ($(call try-build,$(SOURCE_STATIC),,-static),y)
+   CFLAGS  += -DCONFIG_GUEST_INIT
+   GUEST_INIT  := guest/init
+   GUEST_OBJS  = guest/guest_init.o
+else
+   $(warning No static libc found. Skipping guest init)
+   NOTFOUND+= static-libc
 endif
 
 ifeq (y,$(ARCH_WANT_LIBFDT))
@@ -356,7 +359,6 @@ c_flags = -Wp,-MD,$(depfile) $(CFLAGS)
 # $(OTHEROBJS) are things that do not get substituted like this.
 #
 STATIC_OBJS = $(patsubst %.o,%.static.o,$(OBJS) $(OBJS_STATOPT))
-GUEST_OBJS = guest/guest_init.o
 
 $(PROGRAM)-static:  $(STATIC_OBJS) $(OTHEROBJS) $(GUEST_INIT)
$(E) "  LINK" $@
diff --git a/builtin-run.c b/builtin-run.c
index 1ee75ad3f010..e0c87329e52b 100644
--- a/builtin-run.c
+++ b/builtin-run.c
@@ -59,9 +59,6 @@ static int  kvm_run_wrapper;
 
 bool do_debug_print = false;
 
-extern char _binary_guest_init_start;
-extern char _binary_guest_init_size;
-
 static const char * const run_usage[] = {
"lkvm run [] []",
NULL
@@ -345,30 +342,6 @@ void kvm_run_help(void)
usage_with_options(run_usage, options);
 }
 
-static int kvm_setup_guest_init(struct kvm *kvm)
-{
-   const char *rootfs = kvm->cfg.custom_rootfs_name;
-   char tmp[PATH_MAX];
-   size_t size;
-   int fd, ret;
-   char *data;
-
-   /* Setup /virt/init */
-   size = (size_t)&_binary_guest_init_size;
-   data = (char *)&_binary_guest_init_start;
-   snprintf(tmp, PATH_MAX, "%s%s/virt/init", kvm__get_dir(), rootfs);
-   remove(tmp);
-   fd = open(tmp, O_CREAT | O_WRONLY, 0755);
-   if (fd < 0)
-   die("Fail to setup %s", tmp);
-   ret = xwrite(fd, data, size);
-   if (ret < 0)
-   die("Fail to setup %s", tmp);
-   close(fd);
-
-   return 0;
-}
-
 static int kvm_run_set_sandbox(struct kvm *kvm)
 {
const char *guestfs_name = kvm->cfg.custom_rootfs_name;
@@ -631,7 +604,7 @@ static struct kvm *kvm_cmd_run_init(int argc, const char 
**argv)
 
if (!kvm->cfg.no_dhcp)
strcat(real_cmdline, "  ip=dhcp");
-   if (kvm_setup_guest_init(kvm))
+   if (kvm_setup_guest_init(kvm->cfg.custom_rootfs_name))
die("Failed to setup init for guest.");
}
} else if (!strstr(real_cmdline, "root=")) {
diff --git a/builtin-setup.c b/builtin-setup.c
index 8b45c5645ad4..40fef15dbbe4 100644
--- a/builtin-setup.c
+++ b/builtin-setup.c
@@ -16,9 +16,6 @@
 #include 
 #include 
 
-extern char _binary_guest_init_start;
-extern char _binary_guest_init_size;
-
 static const char *instance_name;
 
 static const char * const set

[PATCH] KVM: arm64: remove all traces of the ThumbEE registers

2015-09-15 Thread Will Deacon

Although the ThumbEE registers and traps were present in earlier
versions of the v8 architecture, it was retrospectively removed and so
we can do the same.

Cc: Marc Zyngier <marc.zyng...@arm.com>
Signed-off-by: Will Deacon <will.dea...@arm.com>
---
 arch/arm64/include/asm/kvm_arm.h |  1 -
 arch/arm64/include/asm/kvm_asm.h |  4 +---
 arch/arm64/kvm/hyp.S | 22 --
 arch/arm64/kvm/sys_regs.c|  7 ---
 4 files changed, 5 insertions(+), 29 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 7605e095217f..cdc9888134e5 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -168,7 +168,6 @@
 #define VTTBR_VMID_MASK  (UL(0xFF) << VTTBR_VMID_SHIFT)
 
 /* Hyp System Trap Register */
-#define HSTR_EL2_TTEE  (1 << 16)
 #define HSTR_EL2_T(x)  (1 << x)
 
 /* Hyp Coproccessor Trap Register Shifts */
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 67fa0de3d483..5e377101f919 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -53,9 +53,7 @@
 #defineIFSR32_EL2  25  /* Instruction Fault Status Register */
 #defineFPEXC32_EL2 26  /* Floating-Point Exception Control 
Register */
 #defineDBGVCR32_EL227  /* Debug Vector Catch Register */
-#defineTEECR32_EL1 28  /* ThumbEE Configuration Register */
-#defineTEEHBR32_EL129  /* ThumbEE Handler Base Register */
-#defineNR_SYS_REGS 30
+#defineNR_SYS_REGS 28
 
 /* 32bit mapping */
 #define c0_MPIDR   (MPIDR_EL1 * 2) /* MultiProcessor ID Register */
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index 6addf97aadb3..c4016d411f4a 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -433,20 +433,13 @@
mrs x5, ifsr32_el2
stp x4, x5, [x3]
 
-   skip_fpsimd_state x8, 3f
+   skip_fpsimd_state x8, 2f
mrs x6, fpexc32_el2
str x6, [x3, #16]
-3:
-   skip_debug_state x8, 2f
+2:
+   skip_debug_state x8, 1f
mrs x7, dbgvcr32_el2
str x7, [x3, #24]
-2:
-   skip_tee_state x8, 1f
-
-   add x3, x2, #CPU_SYSREG_OFFSET(TEECR32_EL1)
-   mrs x4, teecr32_el1
-   mrs x5, teehbr32_el1
-   stp x4, x5, [x3]
 1:
 .endm
 
@@ -466,16 +459,9 @@
msr dacr32_el2, x4
msr ifsr32_el2, x5
 
-   skip_debug_state x8, 2f
+   skip_debug_state x8, 1f
ldr x7, [x3, #24]
msr dbgvcr32_el2, x7
-2:
-   skip_tee_state x8, 1f
-
-   add x3, x2, #CPU_SYSREG_OFFSET(TEECR32_EL1)
-   ldp x4, x5, [x3]
-   msr teecr32_el1, x4
-   msr teehbr32_el1, x5
 1:
 .endm
 
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index b41607d270ac..6c35e49757d8 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -539,13 +539,6 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ Op0(0b10), Op1(0b000), CRn(0b0111), CRm(0b1110), Op2(0b110),
  trap_dbgauthstatus_el1 },
 
-   /* TEECR32_EL1 */
-   { Op0(0b10), Op1(0b010), CRn(0b), CRm(0b), Op2(0b000),
- NULL, reset_val, TEECR32_EL1, 0 },
-   /* TEEHBR32_EL1 */
-   { Op0(0b10), Op1(0b010), CRn(0b0001), CRm(0b), Op2(0b000),
- NULL, reset_val, TEEHBR32_EL1, 0 },
-
/* MDCCSR_EL1 */
{ Op0(0b10), Op1(0b011), CRn(0b), CRm(0b0001), Op2(0b000),
  trap_raz_wi },
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] KVM: arm64: add workaround for Cortex-A57 erratum #852523

2015-09-14 Thread Will Deacon

When restoring the system register state for an AArch32 guest at EL2,
writes to DACR32_EL2 may not be correctly synchronised by Cortex-A57,
which can lead to the guest effectively running with junk in the DACR
and running into unexpected domain faults.

This patch works around the issue by re-ordering our restoration of the
AArch32 register aliases so that they happen before the AArch64 system
registers. Ensuring that the registers are restored in this order
guarantees that they will be correctly synchronised by the core.

Cc: <sta...@vger.kernel.org>
Cc: Marc Zyngier <marc.zyng...@arm.com>
Signed-off-by: Will Deacon <will.dea...@arm.com>
---
 arch/arm64/kvm/hyp.S | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index 3c4f641451bb..c4016d411f4a 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -739,6 +739,9 @@ ENTRY(__kvm_vcpu_run)
// Guest context
add x2, x0, #VCPU_CONTEXT
 
+   // We must restore the 32-bit state before the sysregs, thanks
+   // to Cortex-A57 erratum #852523.
+   restore_guest_32bit_state
bl __restore_sysregs
 
skip_debug_state x3, 1f
@@ -746,7 +749,6 @@ ENTRY(__kvm_vcpu_run)
kern_hyp_va x3
bl  __restore_debug
 1:
-   restore_guest_32bit_state
restore_guest_regs
 
// That's it, no more messing around.
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] KVM: arm64: add workaround for Cortex-A57 erratum #852523

2015-09-14 Thread Will Deacon

On Mon, Sep 14, 2015 at 04:46:28PM +0100, Marc Zyngier wrote:
> On 14/09/15 16:06, Will Deacon wrote:
> > When restoring the system register state for an AArch32 guest at EL2,
> > writes to DACR32_EL2 may not be correctly synchronised by Cortex-A57,
> > which can lead to the guest effectively running with junk in the DACR
> > and running into unexpected domain faults.
> > 
> > This patch works around the issue by re-ordering our restoration of the
> > AArch32 register aliases so that they happen before the AArch64 system
> > registers. Ensuring that the registers are restored in this order
> > guarantees that they will be correctly synchronised by the core.
> > 
> > Cc: <sta...@vger.kernel.org>
> > Cc: Marc Zyngier <marc.zyng...@arm.com>
> > Signed-off-by: Will Deacon <will.dea...@arm.com>
> 
> Reviewed-by: Marc Zyngier <marc.zyng...@arm.com>
> 
> I'll queue that together with the next batch of fixes.

Great, thanks Marc.

Will
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] kvmtool Makefile: relax arm test

2015-09-10 Thread Will Deacon

On Thu, Sep 10, 2015 at 06:45:59AM +0100, Riku Voipio wrote:
> On 4 September 2015 at 14:06, Andre Przywara  wrote:
> > On 04/09/15 11:52, Riku Voipio wrote:
> >> On 4 September 2015 at 13:10, Andre Przywara  
> >> wrote:
> >>> On 03/09/15 12:20, riku.voi...@linaro.org wrote:
>  From: Riku Voipio 
> 
>  Currently Makefile accepts only armv7l.* When building kvmtool under 
>  32bit
>  personality on Aarch64 machines, uname -m reports "armv8l", so build 
>  fails.
>  We expect doing 32bit arm builds in Aarch64 to become standard the same 
>  way
>  people do i386 builds on x86_64 machines.
> 
>  Make the sed test a little more greedy so armv8l becomes acceptable.
> 
>  Signed-off-by: Riku Voipio 
> >>>
> >>> The patch looks OK to me, I just wonder how you do the actual build
> >>> within the linux32 environment?
> >>> Do you have an arm cross compiler installed and set CROSS_COMPILE? Or is
> >>> there a magic compiler (driver) which uses uname -m as well?
> >>> And what would be the difference to setting ARCH=arm as well? Just
> >>> convenience?
> >>
> >> It's just an arm32 chroot, with an native arm32 compiler. The chroot
> >> is on an arm64 machine since these tend to be much faster than arm32
> >> hardware.
> >
> > Oh right, a chroot, didn't think about the obvious ;-)
> > Also it applies to 64-bit kernels with 32-bit root filesystems, I think.
> > So:
> >
> > Acked-by: Andre Przywara 
> 
> Ping?

Applied and pushed, thanks.

In future, it's best to Cc me if you want to make sure stuff doesn't
get missed :)

Cheers,

Will
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[GIT PULL] Late arm64 KVM fix for 4.2

2015-08-27 Thread Will Deacon

Hi Linus,

I appreciate that it's extremely late in the cycle, but we've uncovered
a nasty bug in the arm64 KVM code which allows a badly behaved 32-bit
guest to bring down the host. The fix is simple (it's what I believe we
call a brown paper bag bug) and I don't think it makes sense to sit on
this, particularly as Russell ended up triggering this rather than just
somebody noticing a potential problem by inspection.

Usually arm64 KVM changes would go via Paolo's tree, but he's on holiday
at the moment and the deal is that anything urgent gets shuffled via
the arch trees, so here it is. Please pull.

Cheers,

Will

---8

The following changes since commit c13dcf9f2d6f5f06ef1bf79ec456df614c5e058b:

  Linux 4.2-rc8 (2015-08-23 20:52:59 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git tags/arm64-fixes

for you to fetch changes up to 126c69a0bd0e441bf6766a5d9bf20de011be9f68:

  arm64: KVM: Fix host crash when injecting a fault into a 32bit guest 
(2015-08-27 16:16:55 +0100)


Urgent arm64 KVM fix for 4.2:

Fix arm64 KVM issue when injecting an abort into a 32-bit guest, which
would lead to an illegal exception return at EL2 and a subsequent host
crash.


Marc Zyngier (1):
  arm64: KVM: Fix host crash when injecting a fault into a 32bit guest

 arch/arm64/kvm/inject_fault.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH kvmtool] kvm__emulate_io: Don't call br_read_unlock() twice on IO error

2015-08-07 Thread Will Deacon

On Thu, Aug 06, 2015 at 07:39:44PM +0100, Josh Triplett wrote:
 The IO error path in kvm__emulate_io would call br_read_unlock(), then
 goto error, which would call br_read_unlock() again.  Refactor the
 control flow to have only one exit path and one call to
 br_read_unlock().

Thanks, Josh. I applied all three patches!

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCHv3 1/2] kvmtool: Introduce downscript option for virtio-net

2015-07-24 Thread Will Deacon

On Tue, Jul 21, 2015 at 10:44:31AM +0100, Andre Przywara wrote:
 thanks for the rework, that looks good to me, some minor nits below.
 Not sure if that requires a new version, maybe Will can fix that up when
 he applies it.

Nah, please send a new version with these points addressed.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 00/12] kvmtool: Improve portability

2015-07-17 Thread Will Deacon

On Fri, Jul 17, 2015 at 05:02:06PM +0100, Andre Przywara wrote:
 Hi,

Hi Andre,

 this is a collection of patches to bring kvmtool closer to standards
 compliance (with standards not necessarily meaning GNU only).
 With all those patches applied, you can compile kvmtool with newer
 C standards, with clang and against musl-libc.

Most of this looks good to me, thanks! I'll leave it to stew for a few
days on the list, but I'm planning to merge it all apart from the two
*printf clang hacks and the PAGE_SIZE patch which Szabolcs commented on.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 06/12] kvm__set_dir(): avoid variable arguments call

2015-07-17 Thread Will Deacon

On Fri, Jul 17, 2015 at 05:02:12PM +0100, Andre Przywara wrote:
 The clang compiler by default dislikes non-literal format strings
 in *printf functions, so it complains about kvm__set_dir() in kvm.c.
 Instead of suppressing this warning, lets change the code to avoid
 that unneeded var_args detour and use a literal format string.

Why does clang moan about this? The current code looks perfectly alright
to me.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 07/12] util/util.c: avoid clang error on vsnprintf

2015-07-17 Thread Will Deacon

On Fri, Jul 17, 2015 at 05:02:13PM +0100, Andre Przywara wrote:
 clang by default doesn't seem to like printf calls with non-literal
 format strings.
 Add the proper pragma to disable this warning in the report function
 to make kvmtool compile with clang. Despite its GCC name, clang also
 accepts this.
 
 Signed-off-by: Andre Przywara andre.przyw...@arm.com
 ---
  util/util.c | 1 +
  1 file changed, 1 insertion(+)
 
 diff --git a/util/util.c b/util/util.c
 index 1877105..825da3f 100644
 --- a/util/util.c
 +++ b/util/util.c
 @@ -10,6 +10,7 @@
  #include sys/stat.h
  #include sys/statfs.h
  
 +#pragma GCC diagnostic ignored -Wformat-nonliteral

Urgh! I think we need to figure out a better way to keep clang happy in
this regard, if we decide that we care about building with it.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 10/13] arm64: Add support for running Linux in EL2 mode

2015-07-16 Thread Will Deacon

On Wed, Jul 08, 2015 at 05:19:13PM +0100, Marc Zyngier wrote:
 With the ARMv8.1 VHE, the architecture is able to (almost) transparently
 run the kernel at EL2, despite being written for EL1.
 
 This patch takes care of the almost part, mostly preventing the kernel
 from dropping from EL2 to EL1, and setting up the HYP configuration.
 
 Signed-off-by: Marc Zyngier marc.zyng...@arm.com
 ---
  arch/arm64/kernel/head.S | 23 ++-
  1 file changed, 22 insertions(+), 1 deletion(-)
 
 diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
 index c0ff3ce..a179747 100644
 --- a/arch/arm64/kernel/head.S
 +++ b/arch/arm64/kernel/head.S
 @@ -29,6 +29,7 @@
  #include asm/asm-offsets.h
  #include asm/cache.h
  #include asm/cputype.h
 +#include asm/kvm_mmu.h
  #include asm/memory.h
  #include asm/thread_info.h
  #include asm/pgtable-hwdef.h
 @@ -481,8 +482,16 @@ CPU_LE(  bic x0, x0, #(3  24)  )   // 
 Clear the EE and E0E bits for EL1
   isb
   ret
  
 + /* Check for VHE being present */
 +2:   mrs x2, id_aa64mmfr1_el1
 + ubfxx2, x2, #8, #4
 +
   /* Hyp configuration. */
 -2:   mov x0, #(1  31)  // 64-bit EL1
 + mov x0, #HCR_RW // 64-bit EL1
 + cbz x2, set_hcr
 + orr x0, x0, #HCR_TGE// Enable Host Extensions
 + orr x0, x0, #HCR_E2H
 +set_hcr:
   msr hcr_el2, x0
  
   /* Generic timers. */
 @@ -522,6 +531,9 @@ CPU_LE(   movkx0, #0x30d0, lsl #16)   // 
 Clear EE and E0E on LE systems
  
   /* Coprocessor traps. */
   mov x0, #0x33ff

These bits are RES0 with VHE enabled, afaict.

 + cbz x2, set_cptr
 + orr x0, x0, #(3  20)  // Don't trap FP
 +set_cptr:
   msr cptr_el2, x0// Disable copro. traps to EL2
  
  #ifdef CONFIG_COMPAT
 @@ -531,6 +543,15 @@ CPU_LE(  movkx0, #0x30d0, lsl #16)   // 
 Clear EE and E0E on LE systems
   /* Stage-2 translation */
   msr vttbr_el2, xzr
  
 + cbz x2, install_el2_stub
 +
 + setup_vtcr x4, x5
 +
 + mov w20, #BOOT_CPU_MODE_EL2 // This CPU booted in EL2

You do this in install_el2_stub as well -- can you move it above the cbz?

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 11/13] arm64: Panic when VHE and non VHE CPUs coexist

2015-07-16 Thread Will Deacon

On Wed, Jul 08, 2015 at 05:19:14PM +0100, Marc Zyngier wrote:
 Having both VHE and non-VHE capable CPUs in the same system
 is likely to be a recipe for disaster.
 
 If the boot CPU has VHE, but a secondary is not, we won't be
 able to downgrade and run the kernel at EL1. Add CPU hotplug
 to the mix, and this produces a terrifying mess.
 
 Let's solve the problem once and for all. If you mix VHE and
 non-VHE CPUs in the same system, you deserve to loose, and this
 patch makes sure you don't get a chance.
 
 This is implemented by storing the kernel execution level in
 a global variable. Secondaries will park themselves in a
 WFI loop if they observe a mismatch. Also, the primary CPU
 will detect that the secondary CPU has died on a mismatched
 execution level. Panic will follow.
 
 Signed-off-by: Marc Zyngier marc.zyng...@arm.com
 ---
  arch/arm64/include/asm/virt.h | 15 +++
  arch/arm64/kernel/head.S  | 15 +++
  arch/arm64/kernel/smp.c   |  4 
  3 files changed, 34 insertions(+)
 
 diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
 index 9f22dd6..8e246f7 100644
 --- a/arch/arm64/include/asm/virt.h
 +++ b/arch/arm64/include/asm/virt.h
 @@ -36,6 +36,11 @@
   */
  extern u32 __boot_cpu_mode[2];
  
 +/*
 + * __run_cpu_mode records the mode the boot CPU uses for the kernel.
 + */
 +extern u32 __run_cpu_mode;
 +
  void __hyp_set_vectors(phys_addr_t phys_vector_base);
  phys_addr_t __hyp_get_vectors(void);
  
 @@ -60,6 +65,16 @@ static inline bool is_kernel_in_hyp_mode(void)
   return el == CurrentEL_EL2;
  }
  
 +static inline bool is_kernel_mode_mismatched(void)
 +{
 + u64 el;
 + u32 mode;
 +
 + asm(mrs %0, CurrentEL : =r (el));
 + mode = ACCESS_ONCE(__run_cpu_mode);
 + return el != mode;

Why the temporary 'mode' variable?

 +}
 +
  /* The section containing the hypervisor text */
  extern char __hyp_text_start[];
  extern char __hyp_text_end[];
 diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
 index a179747..318e69f 100644
 --- a/arch/arm64/kernel/head.S
 +++ b/arch/arm64/kernel/head.S
 @@ -578,7 +578,20 @@ ENTRY(set_cpu_boot_mode_flag)
  1:   str w20, [x1]   // This CPU has booted in EL1
   dmb sy
   dc  ivac, x1// Invalidate potentially stale 
 cache line
 + adr_l   x1, __run_cpu_mode
 + ldr w0, [x1]
 + mrs x20, CurrentEL

Why x20?

 + str w20, [x1]
 + dmb sy
 + dc  ivac, x1// Invalidate potentially stale 
 cache line

Can we stick __run_cpu_mode and __boot_cpu_mode into a struct in the same
cacheline and avoid the extra maintenance?

 + cbz x0, skip_el_check

w0?

 + cmp x0, x20

w0, w20?

 + bne mismatched_el
 +skip_el_check:
   ret
 +mismatched_el:
 + wfi
 + b   mismatched_el
  ENDPROC(set_cpu_boot_mode_flag)
  
  /*
 @@ -593,6 +606,8 @@ ENDPROC(set_cpu_boot_mode_flag)
  ENTRY(__boot_cpu_mode)
   .long   BOOT_CPU_MODE_EL2
   .long   BOOT_CPU_MODE_EL1
 +ENTRY(__run_cpu_mode)
 + .long   0
   .popsection
  
  #ifdef CONFIG_SMP
 diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
 index 695801a..b467f51 100644
 --- a/arch/arm64/kernel/smp.c
 +++ b/arch/arm64/kernel/smp.c
 @@ -52,6 +52,7 @@
  #include asm/sections.h
  #include asm/tlbflush.h
  #include asm/ptrace.h
 +#include asm/virt.h
  
  #define CREATE_TRACE_POINTS
  #include trace/events/ipi.h
 @@ -112,6 +113,9 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
   pr_crit(CPU%u: failed to come online\n, cpu);
   ret = -EIO;
   }
 +
 + if (is_kernel_mode_mismatched())
 + panic(CPU%u: incompatible execution level, cpu);

Might be useful to print the incompatible values, if possible.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 03/13] arm64: Add ARM64_HAS_VIRT_HOST_EXTN feature

2015-07-16 Thread Will Deacon

On Wed, Jul 08, 2015 at 05:19:06PM +0100, Marc Zyngier wrote:
 Add a new ARM64_HAS_VIRT_HOST_EXTN features to indicate that the
 CPU has the ARMv8,1 VHE capability.
 
 This will be used to trigger kernel patching in KVM.
 
 Signed-off-by: Marc Zyngier marc.zyng...@arm.com

Acked-by: Will Deacon will.dea...@arm.com

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 01/13] arm/arm64: Add new is_kernel_in_hyp_mode predicate

2015-07-16 Thread Will Deacon

On Wed, Jul 08, 2015 at 05:19:04PM +0100, Marc Zyngier wrote:
 With ARMv8.1 VHE extension, it will be possible to run the kernel
 at EL2 (aka HYP mode). In order for the kernel to easily find out
 where it is running, add a new predicate that returns whether or
 not the kernel is in HYP mode.
 
 For completeness, the 32bit code also get such a predicate (always
 returning false) so that code common to both architecture (timers,
 KVM) can use it transparently.
 
 Signed-off-by: Marc Zyngier marc.zyng...@arm.com
 ---
  arch/arm/include/asm/virt.h   |  5 +
  arch/arm64/include/asm/virt.h | 10 ++
  2 files changed, 15 insertions(+)

[...]

 +static inline bool is_kernel_in_hyp_mode(void)
 +{
 + u64 el;
 +
 + asm(mrs %0, CurrentEL : =r (el));
 + return el == CurrentEL_EL2;

Missing mask?

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 01/18] ARM64: Move PMU register related defines to asm/pmu.h

2015-07-08 Thread Will Deacon

On Mon, Jul 06, 2015 at 03:17:31AM +0100, shannon.z...@linaro.org wrote:
 From: Shannon Zhao shannon.z...@linaro.org
 
 To use the ARMv8 PMU related register defines from the KVM code,
 we move the relevant definitions to asm/pmu.h header file.
 
 Signed-off-by: Anup Patel anup.pa...@linaro.org
 Signed-off-by: Shannon Zhao shannon.z...@linaro.org
 ---
  arch/arm64/include/asm/pmu.h   | 45 
 ++
  arch/arm64/kernel/perf_event.c | 35 
  2 files changed, 45 insertions(+), 35 deletions(-)

Whilst I'm ok with the idea, we're currently in the process of moving the
ARM and arm64 PMU drivers out into drivers/perf/, so this is likely to
conflict horribly with that work.

Christoffer/Marc: if you end up wanting to queue this for 4.3, give me a
shout and I'll try to create a stable branch you can use as a base (but
it's likely to contain a whole heap of stuff you don't want...).

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3 08/11] KVM: arm: implement dirty bit mechanism for debug registers

2015-07-08 Thread Will Deacon

On Wed, Jul 08, 2015 at 11:50:22AM +0100, Zhichao Huang wrote:
 Are you happy with this?:

You miss the reserved breakpoint, I think.
I also still don't understand why this is preferable to trapping.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3 08/11] KVM: arm: implement dirty bit mechanism for debug registers

2015-07-07 Thread Will Deacon

On Tue, Jul 07, 2015 at 11:06:57AM +0100, Zhichao Huang wrote:
 Chazy and me are talking about how to reduce the saving/restoring
 overhead for debug registers.
 We want to add a state in hw_breakpoint.c to indicate whether the host
 enable any hwbrpts or not (might export a fuction that kvm can call),
 then we can read this state from memory instead of reading from real
 hardware registers, and to decide whether we need a world switch or
 not.
 Does it acceptable?

Maybe, hard to tell without the code. There are obvious races to deal with
if you use variables to indicate whether resources are in use -- why not
just trap debug access from the host as well? Then you could keep track of
the owner in kvm and trap accesses from everybody else.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v7 09/11] KVM: arm64: guest debug, HW assisted debug support

2015-07-06 Thread Will Deacon

On Fri, Jul 03, 2015 at 05:07:41PM +0100, Alex Bennée wrote:
 Will Deacon will.dea...@arm.com writes:
  On Thu, Jul 02, 2015 at 02:50:33PM +0100, Alex Bennée wrote:
  Are you happy with this?:
 
  [...]
 
  +/**
  + * kvm_arch_dev_ioctl_check_extension
  + *
  + * We currently assume that the number of HW registers is uniform
  + * across all CPUs (see cpuinfo_sanity_check).
  + */
   int kvm_arch_dev_ioctl_check_extension(long ext)
   {
  int r;
  @@ -64,6 +71,12 @@ int kvm_arch_dev_ioctl_check_extension(long ext)
  case KVM_CAP_ARM_EL1_32BIT:
  r = cpu_has_32bit_el1();
  break;
  +   case KVM_CAP_GUEST_DEBUG_HW_BPS:
  +   r = hw_breakpoint_slots(TYPE_INST);
  +   break;
  +   case KVM_CAP_GUEST_DEBUG_HW_WPS:
  +   r  = hw_breakpoint_slots(TYPE_DATA);
  +   break;
 
  Whilst I much prefer this code, it actually adds an unwanted dependency
  on PERF_EVENTS that I didn't think about to start with. Sorry to keep
  messing you about -- I guess your original patch is the best thing after
  all.
 
 Everything looks to be in hw_breakpoint.[ch] which does depend on
 CONFIG_HAVE_HW_BREAKPOINT which depends on PERF_EVENTS to be built.
 However the previous code depended on this behaviour as well.

I think your original approach (of sticking stuff in the header file) works
regardless of the CONFIG option, no?

 It would seem weird to enable guest debug using HW debug registers to
 debug the guest yet not allowing the host kernel to use them? Of course
 this is the only code they would share as all the magic of guest
 debugging is already mostly there for dirty guest handling.
 
 I'm not familiar with Kconfig but it looks like this is all part of
 arm64 defconfig. Are people really going to want to disable PERF_EVENTS
 but still debug their guests with HW support?

Then it's your call. I just find the host dependency on perf a bit weird
to get guest debug working (especially as the dependency is completely
fake because we don't use any perf infrastructure at all).

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v7 09/11] KVM: arm64: guest debug, HW assisted debug support

2015-07-03 Thread Will Deacon

Hi Alex,

On Thu, Jul 02, 2015 at 02:50:33PM +0100, Alex Bennée wrote:
 Are you happy with this?:

[...]

 +/**
 + * kvm_arch_dev_ioctl_check_extension
 + *
 + * We currently assume that the number of HW registers is uniform
 + * across all CPUs (see cpuinfo_sanity_check).
 + */
  int kvm_arch_dev_ioctl_check_extension(long ext)
  {
 int r;
 @@ -64,6 +71,12 @@ int kvm_arch_dev_ioctl_check_extension(long ext)
 case KVM_CAP_ARM_EL1_32BIT:
 r = cpu_has_32bit_el1();
 break;
 +   case KVM_CAP_GUEST_DEBUG_HW_BPS:
 +   r = hw_breakpoint_slots(TYPE_INST);
 +   break;
 +   case KVM_CAP_GUEST_DEBUG_HW_WPS:
 +   r  = hw_breakpoint_slots(TYPE_DATA);
 +   break;

Whilst I much prefer this code, it actually adds an unwanted dependency
on PERF_EVENTS that I didn't think about to start with. Sorry to keep
messing you about -- I guess your original patch is the best thing after
all.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v7 09/11] KVM: arm64: guest debug, HW assisted debug support

2015-07-02 Thread Will Deacon

Hi Alex,

On Wed, Jul 01, 2015 at 07:29:01PM +0100, Alex Bennée wrote:
 This adds support for userspace to control the HW debug registers for
 guest debug. In the debug ioctl we copy an IMPDEF registers into a new
 register set called host_debug_state.
 
 We use the recently introduced vcpu parameter debug_ptr to select which
 register set is copied into the real registers when world switch occurs.
 
 I've made some helper functions from hw_breakpoint.c more widely
 available for re-use.
 
 As with single step we need to tweak the guest registers to enable the
 exceptions so we need to save and restore those bits.
 
 Two new capabilities have been added to the KVM_EXTENSION ioctl to allow
 userspace to query the number of hardware break and watch points
 available on the host hardware.
 
 Signed-off-by: Alex Bennée alex.ben...@linaro.org
 Reviewed-by: Christoffer Dall christoffer.d...@linaro.org

[...]

 diff --git a/arch/arm64/kernel/hw_breakpoint.c 
 b/arch/arm64/kernel/hw_breakpoint.c
 index e7d934d..ac07f2a 100644
 --- a/arch/arm64/kernel/hw_breakpoint.c
 +++ b/arch/arm64/kernel/hw_breakpoint.c
 @@ -50,13 +50,13 @@ static int core_num_brps;
  static int core_num_wrps;
 
  /* Determine number of BRP registers available. */
 -static int get_num_brps(void)
 +int get_num_brps(void)
  {
 return ((read_cpuid(ID_AA64DFR0_EL1)  12)  0xf) + 1;
  }
 
  /* Determine number of WRP registers available. */
 -static int get_num_wrps(void)
 +int get_num_wrps(void)
  {
 return ((read_cpuid(ID_AA64DFR0_EL1)  20)  0xf) + 1;
  }

Sorry, just noticed this, but we already have a public interface for
figuring these numbers out as required by perf. Can't you use
hw_breakpoint_slots(...) instead of exposing these internal helpers?

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2] kvmtool: Add parameter to specifiy number of threads in thread_pool

2015-06-30 Thread Will Deacon

Hi Andreas,

On Mon, Jun 29, 2015 at 12:43:33PM +0100, Andreas Herrmann wrote:
 With current code the number of threads added to the thread_pool
 equals number of online CPUs. So on cn78xx we usually have 48 threads
 per guest just for the thread_pool. IMHO this is overkill for guests
 that just have a few vCPUs and/or if a guest is pinned to a subset of
 host CPUs. E.g.
 
  # numactl -C 4,5,7,8 ./lkvm run -c 2 -m 256 -k paravirt -d rootfs.ext3 ...
  # ps -La | grep threadpool-work | wc -l
  48
 
 Don't change default behaviour (for sake of compatibility) but
 introduce a new parameter (-t or --threads) that allows to specify
 number of threads to be created for the thread_pool:
 
  # numactl -C 4,5,7,8 ./lkvm run -c 2 -m 256 --threads 4 -k paravirt -d ...
  # ps -La | grep threadpool-work | wc -l
  4
 
 Signed-off-by: Andreas Herrmann andreas.herrm...@caviumnetworks.com
 ---
  builtin-run.c|6 ++
  include/kvm/kvm-config.h |1 +
  util/threadpool.c|2 +-
  3 files changed, 8 insertions(+), 1 deletion(-)
 
 New in v2: paramter must be in range [1, number of online cpus]
 otherwise the default (number of online cpus) will be used.

I thought some more about this and started to wonder whether we can make
the threadpool self-balancing.

Given the fairly restricted nature of kvmtool's threading model, could
we not start with a small fixed number of threads (but no more than the
number of physical CPUs) and then create new threads on demand when we
detect that there is backlog in the job queue and there are spare CPUs?

I don't think it should be too hard, especially if we ignore the problem
of destroying idle threads and it removes the need for a magic tunable
on the cmdline.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v4 10/10] arm: use new irqchip parameter to create different vGIC types

2015-06-30 Thread Will Deacon

On Fri, Jun 26, 2015 at 02:16:18PM +0100, Andre Przywara wrote:
 Currently we unconditionally create a virtual GICv2 in the guest.
 Add a --irqchip= parameter to let the user specify a different GIC
 type for the guest, when omitting this parameter it still defaults to
 --irqchip=gicv2.
 For now the only other supported type is --irqchip=gicv3
 
 Signed-off-by: Andre Przywara andre.przyw...@arm.com
 ---
  arm/aarch64/arm-cpu.c|  2 +-
  arm/gic.c| 17 +
  arm/include/arm-common/kvm-config-arch.h |  9 -
  arm/kvm.c|  2 +-
  4 files changed, 27 insertions(+), 3 deletions(-)
 
 diff --git a/arm/aarch64/arm-cpu.c b/arm/aarch64/arm-cpu.c
 index f702b9e..3dc8ea3 100644
 --- a/arm/aarch64/arm-cpu.c
 +++ b/arm/aarch64/arm-cpu.c
 @@ -12,7 +12,7 @@
  static void generate_fdt_nodes(void *fdt, struct kvm *kvm, u32 gic_phandle)
  {
   int timer_interrupts[4] = {13, 14, 11, 10};
 - gic__generate_fdt_nodes(fdt, gic_phandle, IRQCHIP_GICV2);
 + gic__generate_fdt_nodes(fdt, gic_phandle, kvm-cfg.arch.irqchip);
   timer__generate_fdt_nodes(fdt, kvm, timer_interrupts);
  }
  
 diff --git a/arm/gic.c b/arm/gic.c
 index efe4b42..ff56de7 100644
 --- a/arm/gic.c
 +++ b/arm/gic.c
 @@ -22,6 +22,23 @@ static int gic_fd = -1;
  static u64 gic_redists_base;
  static u64 gic_redists_size;
  
 +int irqchip_parser(const struct option *opt, const char *arg, int unset)
 +{
 + enum irqchip_type *type = opt-value;
 +
 + *type = IRQCHIP_GICV2;

Pointless assignment?

 + if (!strcmp(arg, gicv2)) {
 + *type = IRQCHIP_GICV2;
 + } else if (!strcmp(arg, gicv3)) {
 + *type = IRQCHIP_GICV3;
 + } else {
 + fprintf(stderr, irqchip: unknown type \%s\\n, arg);
 + return -1;
 + }
 +
 + return 0;
 +}
 +
  static int gic__create_device(struct kvm *kvm, enum irqchip_type type)
  {
   int err;
 diff --git a/arm/include/arm-common/kvm-config-arch.h 
 b/arm/include/arm-common/kvm-config-arch.h
 index a8ebd94..9529881 100644
 --- a/arm/include/arm-common/kvm-config-arch.h
 +++ b/arm/include/arm-common/kvm-config-arch.h
 @@ -8,8 +8,11 @@ struct kvm_config_arch {
   unsigned intforce_cntfrq;
   boolvirtio_trans_pci;
   boolaarch32_guest;
 + enum irqchip_type irqchip;
  };
  
 +int irqchip_parser(const struct option *opt, const char *arg, int unset);
 +
  #define OPT_ARCH_RUN(pfx, cfg)   
 \
   pfx,
 \
   ARM_OPT_ARCH_RUN(cfg)   
 \
 @@ -21,6 +24,10 @@ struct kvm_config_arch {
updated to program CNTFRQ correctly*),   
 \
   OPT_BOOLEAN('\0', force-pci, (cfg)-virtio_trans_pci,
 \
   Force virtio devices to use PCI as their default  
 \
 - transport),
 + transport),   
 \
 +OPT_CALLBACK('\0', irqchip, (cfg)-irqchip,   
 \
 +  [gicv2|gicv3],   \
 +  type of interrupt controller to emulate in the guest,
 \
 +  irqchip_parser, NULL),

What happens if I don't pass this option at all?

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v4 07/10] limit number of VCPUs on demand

2015-06-30 Thread Will Deacon

On Fri, Jun 26, 2015 at 02:16:15PM +0100, Andre Przywara wrote:
 Currently the ARM GIC checks the number of VCPUs against a fixed
 limit, which is GICv2 specific. Don't pretend we know better than the
 kernel and let's get rid of that explicit check.
 Instead be more relaxed about KVM_CREATE_VCPU failing with EINVAL,
 which is the way the kernel communicates having reached a VCPU limit.
 If we see this and have at least brought up one VCPU already
 successfully, then don't panic, but limit the number of VCPUs instead.
 
 Signed-off-by: Andre Przywara andre.przyw...@arm.com
 ---
  arm/gic.c | 6 --
  arm/kvm-cpu.c | 7 ++-
  kvm-cpu.c | 7 +++
  3 files changed, 13 insertions(+), 7 deletions(-)
 
 diff --git a/arm/gic.c b/arm/gic.c
 index 99f0d2b..05f85a2 100644
 --- a/arm/gic.c
 +++ b/arm/gic.c
 @@ -84,12 +84,6 @@ int gic__create(struct kvm *kvm)
  {
   int err;
  
 - if (kvm-nrcpus  GIC_MAX_CPUS) {
 - pr_warning(%d CPUS greater than maximum of %d -- truncating\n,
 - kvm-nrcpus, GIC_MAX_CPUS);
 - kvm-nrcpus = GIC_MAX_CPUS;
 - }
 -
   /* Try the new way first, and fallback on legacy method otherwise */
   err = gic__create_device(kvm);
   if (err)
 diff --git a/arm/kvm-cpu.c b/arm/kvm-cpu.c
 index 7780251..b2fd6ed 100644
 --- a/arm/kvm-cpu.c
 +++ b/arm/kvm-cpu.c
 @@ -51,8 +51,13 @@ struct kvm_cpu *kvm_cpu__arch_init(struct kvm *kvm, 
 unsigned long cpu_id)
   return NULL;
  
   vcpu-vcpu_fd = ioctl(kvm-vm_fd, KVM_CREATE_VCPU, cpu_id);
 - if (vcpu-vcpu_fd  0)
 + if (vcpu-vcpu_fd  0) {
 + if (errno == EINVAL) {
 + free(vcpu);
 + return NULL;
 + }

Hmm, but EINVAL can mean all sorts of other failures too, surely? I'm
not against removing the nrcpus check, but I think we should die if ioctls
start failing rather than silently ignore them.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] kvmtool: Add parameter to specifiy number of threads in thread_pool

2015-06-29 Thread Will Deacon

On Mon, Jun 29, 2015 at 08:45:44AM +0100, Andreas Herrmann wrote:
 
 With current code the number of threads added to the thread_pool
 equals number of online CPUs. Thus on an OcteonIII cn78xx system we
 usually have 48 threads per guest just for the thread_pool. IMHO this
 is overkill for guests that just have a few vCPUs and/or if a guest is
 pinned to a subset of host CPUs. E.g.
 
  # numactl -C 4,5,7,8 ./lkvm run -c 2 -m 256 -k paravirt -d rootfs.ext3 ...
  # ps -La | grep threadpool-work | wc -l
  48
 
 Don't change default behaviour (for sake of compatibility) but
 introduce a new parameter (-t or --threads) that allows to specify
 number of threads to be created for the thread_pool:
 
  # numactl -C 4,5,7,8 ./lkvm run -c 2 -m 256 --threads 4 -k paravirt -d ...
  # ps -La | grep threadpool-work | wc -l
  4

We should probably bound this on some minimum value. I assume things go
pear-shaped if you pass --threads 1 (or 0, or -1)?

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] kvmtool: don't use PCI config space IRQ line field

2015-06-29 Thread Will Deacon

Hi Andre,

On Thu, Jun 18, 2015 at 06:19:53PM +0100, Andre Przywara wrote:
 I am tempted to remove shmem, since it's broken:
 a) there is no upstream driver, only some out-of-tree uio driver module
 in some Github repo

Right, but that's the same for qemu and we've already made the jump of
merging the driver, so I don't think that's a good argument for throwing
it out of the tree.

 b) the PCI device BARs do not match what QEMU implements and what the
 uio driver expects (IO BAR vs. MMIO BAR)

In what way? A quick look suggests that kvmtool is at least aligned with
said github repo.

 c) there is (at least one) bug in kvmtool (easily fixed, though)

Care to elaborate?

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3 06/10] arm: simplify MMIO dispatching

2015-06-24 Thread Will Deacon

On Wed, Jun 24, 2015 at 02:30:05PM +0100, Andre Przywara wrote:
 do you want me to respin the whole series to address the remaining minor
 comments in the last four patches or do you want to take patch 01-06
 already (which I think Marc has already agreed upon)?
 Then I would just send an updated version of the remaining patches.

Yes, please. It's easier to keep track if you just repost the series with
the changes and acks.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/2] powerpc: Define the hcall opcodes return values we need

2015-06-19 Thread Will Deacon

On Fri, Jun 19, 2015 at 08:21:00AM +0100, Michael Ellerman wrote:
 Now that we don't have the kernel header on hand, just define the
 minimum set of hcall opcodes and return values we need in order to
 build.

Thanks Michael!

I pushed both of these out.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/2] powerpc: Define the hcall opcodes return values we need

2015-06-19 Thread Will Deacon

On Fri, Jun 19, 2015 at 08:21:00AM +0100, Michael Ellerman wrote:
 Now that we don't have the kernel header on hand, just define the
 minimum set of hcall opcodes and return values we need in order to
 build.

Thanks Michael!

I pushed both of these out.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/3] powerpc: implement barrier primitives

2015-06-18 Thread Will Deacon

On Thu, Jun 18, 2015 at 10:11:58AM +0100, Michael Ellerman wrote:
 On Wed, 2015-06-17 at 11:15 +0100, Will Deacon wrote:
  On Wed, Jun 17, 2015 at 10:43:48AM +0100, Andre Przywara wrote:
   Instead of referring to the Linux header including the barrier
   macros, copy over the rather simple implementation for the PowerPC
   barrier instructions kvmtool uses. This fixes build for powerpc.
   
   Signed-off-by: Andre Przywara andre.przyw...@arm.com
   ---
   Hi,
   
   I just took what kvmtool seems to have used before, I actually have
   no idea if sync is the right instruction or lwsync would do.
   Would be nice if some people with PowerPC knowledge could comment.
  
  I *think* we can use lwsync for rmb and wmb, but would want confirmation
  from a ppc guy before making that change!
 
 Ugh, memory barriers :)

I prefer to call them Job Security :)

 You probably can use lwsync, assuming you're only ordering cacheable vs
 cacheable.
 
 But, lwsync has given us pain in the past[1], so I'd be happier if you just 
 used
 sync.

No probs. I pushed Andre's original patch.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/3] powerpc: implement barrier primitives

2015-06-18 Thread Will Deacon

On Thu, Jun 18, 2015 at 10:11:58AM +0100, Michael Ellerman wrote:
 On Wed, 2015-06-17 at 11:15 +0100, Will Deacon wrote:
  On Wed, Jun 17, 2015 at 10:43:48AM +0100, Andre Przywara wrote:
   Instead of referring to the Linux header including the barrier
   macros, copy over the rather simple implementation for the PowerPC
   barrier instructions kvmtool uses. This fixes build for powerpc.
   
   Signed-off-by: Andre Przywara andre.przyw...@arm.com
   ---
   Hi,
   
   I just took what kvmtool seems to have used before, I actually have
   no idea if sync is the right instruction or lwsync would do.
   Would be nice if some people with PowerPC knowledge could comment.
  
  I *think* we can use lwsync for rmb and wmb, but would want confirmation
  from a ppc guy before making that change!
 
 Ugh, memory barriers :)

I prefer to call them Job Security :)

 You probably can use lwsync, assuming you're only ordering cacheable vs
 cacheable.
 
 But, lwsync has given us pain in the past[1], so I'd be happier if you just 
 used
 sync.

No probs. I pushed Andre's original patch.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 4/5] kvmtool: Save datamatch as little endian in {add,del}_event

2015-06-17 Thread Will Deacon

On Wed, Jun 17, 2015 at 08:17:49AM +0100, Andreas Herrmann wrote:
 On Tue, Jun 16, 2015 at 06:17:14PM +0100, Will Deacon wrote:
  On Mon, Jun 15, 2015 at 12:49:45PM +0100, Andreas Herrmann wrote:
   W/o dedicated endianess it's impossible to find reliably a match
   e.g. in kernel/virt/kvm/eventfd.c ioeventfd_in_range.
  
  Hmm, but shouldn't this be the endianness of the guest, rather than just
  forcing things to little-endian?
 
 With my patch and following adaption to
 ioeventfd_in_range (in virt/kvm/eventfd.c):

[...]

 But now I see, w/o a correponding kernel change the patch shouldn't
 be merged.

Digging a bit deeper, I think it's up to the architecture KVM backend
(in the kernel) to present the mmio buffer to core kvm in the host
endianness.

For example, on ARM, we honour the endianness of the vcpu in
vcpu_data_guest_to_host when we populate the buffer for kvm_io_bus_write
(which is what ends up in the ioeventfd code).

I couldn't find equivalent code for MIPs, but I may have been looking in
the wrong place.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 3/3] powerpc: add hvcall.h header from Linux

2015-06-17 Thread Will Deacon

On Wed, Jun 17, 2015 at 10:43:50AM +0100, Andre Przywara wrote:
 The powerpc code uses some PAPR hypercalls, of which we need the
 hypercall number. Copy the macro definition parts from the kernel's
 (private) hvcall.h file and remove the extra tricks formerly used
 to be able to include this header file directly.
 
 Signed-off-by: Andre Przywara andre.przyw...@arm.com
 ---
 Hi,
 
 I copied most of the Linux header, without removing
 definitions that kvmtool doesn't use. That should make updates
 easier. If people would prefer a bespoke header, let me know.

I'd rather just #define the stuff we need now that we're outside of the
kernel source tree.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 3/3] powerpc: add hvcall.h header from Linux

2015-06-17 Thread Will Deacon

On Wed, Jun 17, 2015 at 10:43:50AM +0100, Andre Przywara wrote:
 The powerpc code uses some PAPR hypercalls, of which we need the
 hypercall number. Copy the macro definition parts from the kernel's
 (private) hvcall.h file and remove the extra tricks formerly used
 to be able to include this header file directly.
 
 Signed-off-by: Andre Przywara andre.przyw...@arm.com
 ---
 Hi,
 
 I copied most of the Linux header, without removing
 definitions that kvmtool doesn't use. That should make updates
 easier. If people would prefer a bespoke header, let me know.

I'd rather just #define the stuff we need now that we're outside of the
kernel source tree.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 1/3] powerpc: implement barrier primitives

2015-06-17 Thread Will Deacon

On Wed, Jun 17, 2015 at 10:43:48AM +0100, Andre Przywara wrote:
 Instead of referring to the Linux header including the barrier
 macros, copy over the rather simple implementation for the PowerPC
 barrier instructions kvmtool uses. This fixes build for powerpc.
 
 Signed-off-by: Andre Przywara andre.przyw...@arm.com
 ---
 Hi,
 
 I just took what kvmtool seems to have used before, I actually have
 no idea if sync is the right instruction or lwsync would do.
 Would be nice if some people with PowerPC knowledge could comment.

I *think* we can use lwsync for rmb and wmb, but would want confirmation
from a ppc guy before making that change!

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 4/5] kvmtool: Save datamatch as little endian in {add,del}_event

2015-06-16 Thread Will Deacon

On Mon, Jun 15, 2015 at 12:49:45PM +0100, Andreas Herrmann wrote:
 W/o dedicated endianess it's impossible to find reliably a match
 e.g. in kernel/virt/kvm/eventfd.c ioeventfd_in_range.

Hmm, but shouldn't this be the endianness of the guest, rather than just
forcing things to little-endian?

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 01/11] KVM: arm: plug guest debug exploit

2015-06-16 Thread Will Deacon

On Sun, Jun 14, 2015 at 05:13:05PM +0100, zichao wrote:
 I and marc are talking about how to plug the guest debug exploit in an
 easier way.
 
 I remembered that you mentioned disabling monitor mode had proven to be
 extremely fragile in practice on 32-bit ARM SoCs, what if I save/restore
 the debug monitor mode on each switch between the guest and the host,
 would it be acceptable?

If you're just referring to DBGDSCRext, then you could give it a go, but
you'll certainly want to predicate any writes to that register on whether
or not hw_breakpoint managed to reset the debug regs on the host.

Like I said, accessing these registers always worries me, so I'd really
avoid it in KVM if you can. If not, you'll need to do extensive testing
on a bunch of platforms with and without the presence of external debug.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 0/5] kvmtool: Misc fixes

2015-06-16 Thread Will Deacon

On Mon, Jun 15, 2015 at 12:49:41PM +0100, Andreas Herrmann wrote:
 Following some patches to fix misc issues found when testing the
 standalone kvmtool version.
 
 Please apply.

All applied, apart from the ioeventfd patch which I'm not sure about.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] kvmtool: don't use PCI config space IRQ line field

2015-06-16 Thread Will Deacon

On Mon, Jun 15, 2015 at 11:45:38AM +0100, Andre Przywara wrote:
 On 06/05/2015 05:41 PM, Will Deacon wrote:
  On Thu, Jun 04, 2015 at 04:20:45PM +0100, Andre Przywara wrote:
  In PCI config space there is an interrupt line field (offset 0x3f),
  which is used to initially communicate the IRQ line number from
  firmware to the OS. _Hardware_ should never use this information,
  as the OS is free to write any information in there.
  But kvmtool uses this number when it triggers IRQs in the guest,
  which fails starting with Linux 3.19-rc1, where the PCI layer starts
  writing the virtual IRQ number in there.
 
  Fix that by storing the IRQ number in a separate field in
  struct virtio_pci, which is independent from the PCI config space
  and cannot be influenced by the guest.
  This fixes ARM/ARM64 guests using PCI with newer kernels.
 
  Signed-off-by: Andre Przywara andre.przyw...@arm.com
  ---
   include/kvm/virtio-pci.h | 8 
   virtio/pci.c | 9 ++---
   2 files changed, 14 insertions(+), 3 deletions(-)
 
  diff --git a/include/kvm/virtio-pci.h b/include/kvm/virtio-pci.h
  index c795ce7..b70cadd 100644
  --- a/include/kvm/virtio-pci.h
  +++ b/include/kvm/virtio-pci.h
  @@ -30,6 +30,14 @@ struct virtio_pci {
 u8  isr;
 u32 features;
   
  +  /*
  +   * We cannot rely on the INTERRUPT_LINE byte in the config space once
  +   * we have run guest code, as the OS is allowed to use that field
  +   * as a scratch pad to communicate between driver and PCI layer.
  +   * So store our legacy interrupt line number in here for internal use.
  +   */
  +  u8  legacy_irq_line;
  +
 /* MSI-X */
 u16 config_vector;
 u32 config_gsi;
  diff --git a/virtio/pci.c b/virtio/pci.c
  index 7556239..e17e5a9 100644
  --- a/virtio/pci.c
  +++ b/virtio/pci.c
  @@ -141,7 +141,7 @@ static bool virtio_pci__io_in(struct ioport *ioport, 
  struct kvm_cpu *vcpu, u16 p
 break;
 case VIRTIO_PCI_ISR:
 ioport__write8(data, vpci-isr);
  -  kvm__irq_line(kvm, vpci-pci_hdr.irq_line, VIRTIO_IRQ_LOW);
  +  kvm__irq_line(kvm, vpci-legacy_irq_line, VIRTIO_IRQ_LOW);
 vpci-isr = VIRTIO_IRQ_LOW;
 break;
 default:
  @@ -299,7 +299,7 @@ int virtio_pci__signal_vq(struct kvm *kvm, struct 
  virtio_device *vdev, u32 vq)
 kvm__irq_trigger(kvm, vpci-gsis[vq]);
 } else {
 vpci-isr = VIRTIO_IRQ_HIGH;
  -  kvm__irq_trigger(kvm, vpci-pci_hdr.irq_line);
  +  kvm__irq_trigger(kvm, vpci-legacy_irq_line);
 }
 return 0;
   }
  @@ -323,7 +323,7 @@ int virtio_pci__signal_config(struct kvm *kvm, struct 
  virtio_device *vdev)
 kvm__irq_trigger(kvm, vpci-config_gsi);
 } else {
 vpci-isr = VIRTIO_PCI_ISR_CONFIG;
  -  kvm__irq_trigger(kvm, vpci-pci_hdr.irq_line);
  +  kvm__irq_trigger(kvm, vpci-legacy_irq_line);
 }
   
 return 0;
  @@ -422,6 +422,9 @@ int virtio_pci__init(struct kvm *kvm, void *dev, 
  struct virtio_device *vdev,
 if (r  0)
 goto free_msix_mmio;
   
  +  /* save the IRQ that device__register() has allocated */
  +  vpci-legacy_irq_line = vpci-pci_hdr.irq_line;
  
  I'd rather we used the container_of trick that we do for virtio-mmio
  devices when assigning the irq in device__register. Then we can avoid
  this line completely.
 
 Not completely sure I get what you mean, I take it you want to assign
 legacy_irq_line in pci__assign_irq() directly (where the IRQ number is
 allocated).
 But this function is PCI generic code and is used by the VESA
 framebuffer and the shmem device on x86 as well. For those devices
 dev_hdr is not part of a struct virtio_pci, so we can't do container_of
 to assign the legacy_irq_line here directly.
 Admittedly this fix should apply to the other two users as well, but
 VESA does not use interrupts and pci-shmem is completely broken anyway,
 so I didn't bother to fix it in this regard.
 Would it be justified to provide an IRQ number field in struct
 device_header to address all users?
 
 Or what am I missing here?

If VESA and shmem are broken, they should either be fixed or removed.

If you fix them, then we could have separate virtual buses for virtio-pci
and emulated-pci (or whatever you want to call it). We could also have
a separate bus for passthrough-devices too.

However, that's quite a lot of work for a bug-fix, so I guess the easiest
thing is to extend your current hack to cover VESA and shmem too.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] kvmtool: don't use PCI config space IRQ line field

2015-06-05 Thread Will Deacon

On Thu, Jun 04, 2015 at 04:20:45PM +0100, Andre Przywara wrote:
 In PCI config space there is an interrupt line field (offset 0x3f),
 which is used to initially communicate the IRQ line number from
 firmware to the OS. _Hardware_ should never use this information,
 as the OS is free to write any information in there.
 But kvmtool uses this number when it triggers IRQs in the guest,
 which fails starting with Linux 3.19-rc1, where the PCI layer starts
 writing the virtual IRQ number in there.
 
 Fix that by storing the IRQ number in a separate field in
 struct virtio_pci, which is independent from the PCI config space
 and cannot be influenced by the guest.
 This fixes ARM/ARM64 guests using PCI with newer kernels.
 
 Signed-off-by: Andre Przywara andre.przyw...@arm.com
 ---
  include/kvm/virtio-pci.h | 8 
  virtio/pci.c | 9 ++---
  2 files changed, 14 insertions(+), 3 deletions(-)
 
 diff --git a/include/kvm/virtio-pci.h b/include/kvm/virtio-pci.h
 index c795ce7..b70cadd 100644
 --- a/include/kvm/virtio-pci.h
 +++ b/include/kvm/virtio-pci.h
 @@ -30,6 +30,14 @@ struct virtio_pci {
   u8  isr;
   u32 features;
  
 + /*
 +  * We cannot rely on the INTERRUPT_LINE byte in the config space once
 +  * we have run guest code, as the OS is allowed to use that field
 +  * as a scratch pad to communicate between driver and PCI layer.
 +  * So store our legacy interrupt line number in here for internal use.
 +  */
 + u8  legacy_irq_line;
 +
   /* MSI-X */
   u16 config_vector;
   u32 config_gsi;
 diff --git a/virtio/pci.c b/virtio/pci.c
 index 7556239..e17e5a9 100644
 --- a/virtio/pci.c
 +++ b/virtio/pci.c
 @@ -141,7 +141,7 @@ static bool virtio_pci__io_in(struct ioport *ioport, 
 struct kvm_cpu *vcpu, u16 p
   break;
   case VIRTIO_PCI_ISR:
   ioport__write8(data, vpci-isr);
 - kvm__irq_line(kvm, vpci-pci_hdr.irq_line, VIRTIO_IRQ_LOW);
 + kvm__irq_line(kvm, vpci-legacy_irq_line, VIRTIO_IRQ_LOW);
   vpci-isr = VIRTIO_IRQ_LOW;
   break;
   default:
 @@ -299,7 +299,7 @@ int virtio_pci__signal_vq(struct kvm *kvm, struct 
 virtio_device *vdev, u32 vq)
   kvm__irq_trigger(kvm, vpci-gsis[vq]);
   } else {
   vpci-isr = VIRTIO_IRQ_HIGH;
 - kvm__irq_trigger(kvm, vpci-pci_hdr.irq_line);
 + kvm__irq_trigger(kvm, vpci-legacy_irq_line);
   }
   return 0;
  }
 @@ -323,7 +323,7 @@ int virtio_pci__signal_config(struct kvm *kvm, struct 
 virtio_device *vdev)
   kvm__irq_trigger(kvm, vpci-config_gsi);
   } else {
   vpci-isr = VIRTIO_PCI_ISR_CONFIG;
 - kvm__irq_trigger(kvm, vpci-pci_hdr.irq_line);
 + kvm__irq_trigger(kvm, vpci-legacy_irq_line);
   }
  
   return 0;
 @@ -422,6 +422,9 @@ int virtio_pci__init(struct kvm *kvm, void *dev, struct 
 virtio_device *vdev,
   if (r  0)
   goto free_msix_mmio;
  
 + /* save the IRQ that device__register() has allocated */
 + vpci-legacy_irq_line = vpci-pci_hdr.irq_line;

I'd rather we used the container_of trick that we do for virtio-mmio
devices when assigning the irq in device__register. Then we can avoid
this line completely.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: stand-alone kvmtool

2015-06-03 Thread Will Deacon

On Mon, Feb 23, 2015 at 05:23:58PM +, Pekka Enberg wrote:
 On 2/18/15 5:50 PM, Will Deacon wrote:
  Thanks for doing this. Since it looks unlikely that kvmtool will ever be
  merged back into the kernel tree, it makes sense to cut the dependency
  in my opinion.
 
 I am certainly OK with a standalone repository which preserves the 
 history. Will, would you like to take over the proposed new repository 
 and put it somewhere on git.kernel.org, perhaps?

Finally got around to this with Andre's help, so I'll post a separate
mail advertising it.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[ANNOUNCE] Stand-alone kvmtool repository

2015-06-03 Thread Will Deacon

Hi all,

Thanks to a tonne of help (and friendly nagging!) from Andre, I'm
pleased to announce a stand-alone repository for kvmtool:

  git://git.kernel.org/pub/scm/linux/kernel/git/will/kvmtool.git

This is a follow-up from a previous thread[1] where we concluded that
this was the right way to go. As requested, the original git history is
preserved but is *not* bisectable since we don't include the kernel tree
in the history. For a bisectable history, the original github repo[2]
should be used instead.

For now, it's living in my area of kernel.org since (a) I already had a
fork there and (b) the patch traffic is pretty low for the project. If
there is sufficient development interest, I can move it somewhere more
appropriate for shared maintenance (e.g. under virt/kvm/).

Anyway, please give it a spin and send patches when it falls over ;)

Will

[1] https://lkml.org/lkml/2015/2/23/393
[2] git://github.com/penberg/linux-kvm.git
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 09/11] KVM: arm: disable debug mode if we don't actually need it.

2015-06-01 Thread Will Deacon

On Sun, May 31, 2015 at 05:27:10AM +0100, Zhichao Huang wrote:
 Until now we enable debug mode all the time even if we don't
 actually need it.
 
 Inspired by the implementation in arm64, disable debug mode if
 we don't need it. And then we are able to reduce unnecessary
 registers saving/restoring when the debug mode is disabled.

I'm terrified about this patch. Enabling monitor mode has proven to be
*extremely* fragile in practice on 32-bit ARM SoCs, so trying to do this
morei often makes me very nervous.

 Signed-off-by: Zhichao Huang zhichao.hu...@linaro.org
 ---
  arch/arm/kernel/hw_breakpoint.c | 55 
 ++---
  1 file changed, 46 insertions(+), 9 deletions(-)
 
 diff --git a/arch/arm/kernel/hw_breakpoint.c b/arch/arm/kernel/hw_breakpoint.c
 index dc7d0a9..1d27563 100644
 --- a/arch/arm/kernel/hw_breakpoint.c
 +++ b/arch/arm/kernel/hw_breakpoint.c
 @@ -266,8 +266,7 @@ static int enable_monitor_mode(void)
   }
  
   /* Check that the write made it through. */
 - ARM_DBG_READ(c0, c1, 0, dscr);
 - if (!(dscr  ARM_DSCR_MDBGEN)) {
 + if (!monitor_mode_enabled()) {
   pr_warn_once(Failed to enable monitor mode on CPU %d.\n,
   smp_processor_id());
   return -EPERM;

Ok, this hunk is harmless :)

 @@ -277,6 +276,43 @@ out:
   return 0;
  }
  
 +static int disable_monitor_mode(void)
 +{
 + u32 dscr;
 +
 + ARM_DBG_READ(c0, c1, 0, dscr);
 +
 + /* If monitor mode is already disabled, just return. */
 + if (!(dscr  ARM_DSCR_MDBGEN))
 + goto out;
 +
 + /* Write to the corresponding DSCR. */
 + switch (get_debug_arch()) {
 + case ARM_DEBUG_ARCH_V6:
 + case ARM_DEBUG_ARCH_V6_1:
 + ARM_DBG_WRITE(c0, c1, 0, (dscr  ~ARM_DSCR_MDBGEN));
 + break;
 + case ARM_DEBUG_ARCH_V7_ECP14:
 + case ARM_DEBUG_ARCH_V7_1:
 + case ARM_DEBUG_ARCH_V8:
 + ARM_DBG_WRITE(c0, c2, 2, (dscr  ~ARM_DSCR_MDBGEN));
 + isb();
 + break;
 + default:
 + return -ENODEV;
 + }
 +
 + /* Check that the write made it through. */
 + if (monitor_mode_enabled()) {
 + pr_warn_once(Failed to disable monitor mode on CPU %d.\n,
 + smp_processor_id());
 + return -EPERM;
 + }
 +
 +out:
 + return 0;
 +}

I'm not comfortable with this. enable_monitor_mode has precisly one caller
[reset_ctrl_regs] which goes to some lengths to get the system into a
well-defined state. On top of that, the whole thing is run with an undef
hook registered because there isn't an architected way to discover whether
or not DBGSWENABLE is driven low.

Why exactly do you need this? Can you not trap guest debug accesses some
other way?

  int hw_breakpoint_slots(int type)
  {
   if (!debug_arch_supported())
 @@ -338,6 +374,8 @@ int arch_install_hw_breakpoint(struct perf_event *bp)
   int i, max_slots, ctrl_base, val_base;
   u32 addr, ctrl;
  
 + enable_monitor_mode();
 +
   addr = info-address;
   ctrl = encode_ctrl_reg(info-ctrl) | 0x1;
  
 @@ -430,6 +468,8 @@ void arch_uninstall_hw_breakpoint(struct perf_event *bp)
  
   /* Reset the control register. */
   write_wb_reg(base + i, 0);
 +
 + disable_monitor_mode();

My previous concerns notwithstanding, shouldn't this be refcounted?

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v5 10/12] KVM: arm64: guest debug, HW assisted debug support

2015-06-01 Thread Will Deacon

Hi Alex,

On Fri, May 29, 2015 at 10:30:26AM +0100, Alex Bennée wrote:
 This adds support for userspace to control the HW debug registers for
 guest debug. In the debug ioctl we copy the IMPDEF defined number of
 registers into a new register set called host_debug_state. There is now
 a new vcpu parameter called debug_ptr which selects which register set
 is to copied into the real registers when world switch occurs.
 
 I've moved some helper functions into the hw_breakpoint.h header for
 re-use.
 
 As with single step we need to tweak the guest registers to enable the
 exceptions so we need to save and restore those bits.
 
 Two new capabilities have been added to the KVM_EXTENSION ioctl to allow
 userspace to query the number of hardware break and watch points
 available on the host hardware.
 
 Signed-off-by: Alex Bennée alex.ben...@linaro.org

[...]

 diff --git a/Documentation/virtual/kvm/api.txt 
 b/Documentation/virtual/kvm/api.txt
 index 33c8143..ada57df 100644
 --- a/Documentation/virtual/kvm/api.txt
 +++ b/Documentation/virtual/kvm/api.txt
 @@ -2668,7 +2668,7 @@ The top 16 bits of the control field are architecture 
 specific control
  flags which can include the following:
  
- KVM_GUESTDBG_USE_SW_BP: using software breakpoints [x86, arm64]
 -  - KVM_GUESTDBG_USE_HW_BP: using hardware breakpoints [x86, s390]
 +  - KVM_GUESTDBG_USE_HW_BP: using hardware breakpoints [x86, s390, arm64]
- KVM_GUESTDBG_INJECT_DB: inject DB type exception [x86]
- KVM_GUESTDBG_INJECT_BP: inject BP type exception [x86]
- KVM_GUESTDBG_EXIT_PENDING:  trigger an immediate guest exit [s390]
 @@ -2683,6 +2683,11 @@ updated to the correct (supplied) values.
  The second part of the structure is architecture specific and
  typically contains a set of debug registers.
  
 +For arm64 the number of debug registers is implementation defined and
 +can be determined by querying the KVM_CAP_GUEST_DEBUG_HW_BPS and
 +KVM_CAP_GUEST_DEBUG_HW_WPS capabilities which return a positive number
 +indicating the number of supported registers.
 +
  When debug events exit the main run loop with the reason
  KVM_EXIT_DEBUG with the kvm_debug_exit_arch part of the kvm_run
  structure containing architecture specific debug information.
 diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
 index 0d17c7b..6df47c1 100644
 --- a/arch/arm/kvm/arm.c
 +++ b/arch/arm/kvm/arm.c
 @@ -307,6 +307,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
  
  #define KVM_GUESTDBG_VALID_MASK (KVM_GUESTDBG_ENABLE |\
   KVM_GUESTDBG_USE_SW_BP | \
 + KVM_GUESTDBG_USE_HW_BP | \
   KVM_GUESTDBG_SINGLESTEP)
  
  /**
 @@ -327,6 +328,12 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu 
 *vcpu,
  
   if (dbg-control  KVM_GUESTDBG_ENABLE) {
   vcpu-guest_debug = dbg-control;
 +
 + /* Hardware assisted Break and Watch points */
 + if (vcpu-guest_debug  KVM_GUESTDBG_USE_HW_BP) {
 + vcpu-arch.external_debug_state = dbg-arch;
 + }
 +
   } else {
   /* If not enabled clear all flags */
   vcpu-guest_debug = 0;
 diff --git a/arch/arm64/include/asm/hw_breakpoint.h 
 b/arch/arm64/include/asm/hw_breakpoint.h
 index 52b484b..c450552 100644
 --- a/arch/arm64/include/asm/hw_breakpoint.h
 +++ b/arch/arm64/include/asm/hw_breakpoint.h
 @@ -130,6 +130,18 @@ static inline void ptrace_hw_copy_thread(struct 
 task_struct *task)
  }
  #endif
  
 +/* Determine number of BRP registers available. */
 +static inline int get_num_brps(void)
 +{
 + return ((read_cpuid(ID_AA64DFR0_EL1)  12)  0xf) + 1;
 +}
 +
 +/* Determine number of WRP registers available. */
 +static inline int get_num_wrps(void)
 +{
 + return ((read_cpuid(ID_AA64DFR0_EL1)  20)  0xf) + 1;
 +}

I'm fine with moving these into the header file, but we should probably
revisit this at some point so that we use a shadow value to indicate the
minimum number of registers across the system for e.g. big.LITTLE platforms
that don't have uniform debug resources.

  extern struct pmu perf_ops_bp;
  
  #endif   /* __KERNEL__ */
 diff --git a/arch/arm64/include/asm/kvm_host.h 
 b/arch/arm64/include/asm/kvm_host.h
 index e5040b6..498d4f7 100644
 --- a/arch/arm64/include/asm/kvm_host.h
 +++ b/arch/arm64/include/asm/kvm_host.h
 @@ -113,12 +113,13 @@ struct kvm_vcpu_arch {
  
   /*
* For debugging the guest we need to keep a set of debug
 -  * registers which can override the guests own debug state
 +  * registers which can override the guest's own debug state
* while being used. These are set via the KVM_SET_GUEST_DEBUG
* ioctl.
*/
   struct kvm_guest_debug_arch *debug_ptr;
   struct kvm_guest_debug_arch vcpu_debug_state;
 + struct kvm_guest_debug_arch external_debug_state;
  
   /* Pointer to host CPU context */
   kvm_cpu_context_t *host_cpu_context;
 diff

Re: stand-alone kvmtool

2015-02-25 Thread Will Deacon

On Mon, Feb 23, 2015 at 05:23:58PM +, Pekka Enberg wrote:
 Hi,

Hi Pekka,

Sorry for the delay, I've been away from email for a few days.

 On 2/18/15 5:50 PM, Will Deacon wrote:
  Thanks for doing this. Since it looks unlikely that kvmtool will ever be
  merged back into the kernel tree, it makes sense to cut the dependency
  in my opinion.
 
 I am certainly OK with a standalone repository which preserves the 
 history. Will, would you like to take over the proposed new repository 
 and put it somewhere on git.kernel.org, perhaps?

Sure thing. I'll sync-up with Andre and reply back here when we've got
something sensible.

One extra point that I don't think has been mentioned in this thread so
far is that separating kvmtool from the kernel sources is likely a
prerequisite for distribution packaging. Once we've got something sorted
out, I'll poke some friendly local debian developers and see if they can't
package it up for us.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: stand-alone kvmtool

2015-02-18 Thread Will Deacon

Hi Andre,

Thanks for doing this. Since it looks unlikely that kvmtool will ever be
merged back into the kernel tree, it makes sense to cut the dependency
in my opinion.

On Fri, Feb 13, 2015 at 10:39:33AM +, Andre Przywara wrote:
 as I found it increasingly inconvenient to use kvmtool[1] as part of a
 Linux repository, I decided to give it a go and make it a stand-alone
 project. So I filtered all the respective commits, adjusted the paths in
 there (while keeping authorship and commit date, of course) and then
 added the missing bits to let it compile without a kernel tree nearby.
 The result is now available on:
 
 git://linux-arm.org/kvmtool.git
 http://linux-arm.org/kvmtool.git
 
 You can simply check it out, type make and use ./lkvm run for a quick
 test. So far I briefly tested x86-64, arm and arm64, the later two were
 also cross-compiled. For sure there are rough edges in there (for
 instance copying a few non-uapi header files into), but I deem it worthy
 enough to get some public comments.
 For me that also fixed some nasty warnings about libfdt, which now are
 gone due it using your system library version of it.
 I also managed to get rid of the libc-i386-dev dependency when compiling
 for x86-64, but that still needs to be cleaned up and thus is not in the
 current HEAD.
 I haven't got around to compile-test the other supported architectures,
 but supporting them should be as easy as copying over the uapi kvm.h
 header file (see the respective ARM commit). Contributions (and tests!)
 are welcome.
 
 Please give it a go and tell me what you think. I don't want to fork the
 project, so I am happy if someone official picks it up.

In which case, it's probably best to post the patches for review rather
than just point me at your git repo!

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH -v3 0/6] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest

2015-02-09 Thread Will Deacon

Hi Rik,

On Mon, Feb 09, 2015 at 04:04:38PM +, r...@redhat.com wrote:
 Apologies to Catalin and Will for not fixing up ARM. I am not
 familiar with ARM assembly, and not sure how to pass a constant
 argument to a function from assembly code on ARM :)

It's a bit of a faff getting enum values into asm -- we actually have to
duplicate the definitions using #defines to get at the constants. Perhaps it
would be cleaner to leave context_tracking_user_{enter,exit} intact as C
wrappers around context_tracking_{enter,exit} passing the appropriate
constant? That way we don't actually need to change the arch code at all.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] kvmtool: don't use PCI config space IRQ line field

2015-02-06 Thread Will Deacon

On Fri, Feb 06, 2015 at 07:02:25PM +, Peter Maydell wrote:
 On 6 February 2015 at 18:55, Will Deacon will.dea...@arm.com wrote:
  On Wed, Feb 04, 2015 at 03:39:50PM +, Andre Przywara wrote:
  In PCI config space there is an interrupt line field (offset 0x3f),
  which is used to initially communicate the IRQ line number from
  firmware to the OS. _Hardware_ should never use this information,
  as the OS is free to write any information in there.
 
  Is this true even with probe-only? I appreciate that this isn't a BAR,
  but it still feels odd for Linux to write this in that case.
 
 The hardware (model) shouldn't be doing anything with the value
 in this register anyway, so I think this change to kvmtool is
 correct regardless of Linux's behaviour.

Well, kvmtool is also pretending to be firmware in this case, which is why
it passes things like probe-only and PSCI nodes.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] kvmtool: don't use PCI config space IRQ line field

2015-02-06 Thread Will Deacon

On Wed, Feb 04, 2015 at 03:39:50PM +, Andre Przywara wrote:
 In PCI config space there is an interrupt line field (offset 0x3f),
 which is used to initially communicate the IRQ line number from
 firmware to the OS. _Hardware_ should never use this information,
 as the OS is free to write any information in there.

Is this true even with probe-only? I appreciate that this isn't a BAR,
but it still feels odd for Linux to write this in that case.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 03/11] kvmtool: AArch{32,64}: use KVM_CREATE_DEVICE co to instanciate the GIC

2015-01-26 Thread Will Deacon

On Fri, Jan 23, 2015 at 04:35:02PM +, Andre Przywara wrote:
 From: Marc Zyngier marc.zyng...@arm.com
 
 As of 3.14, KVM/arm supports the creation/configuration of the GIC through
 a more generic device API, which is now the preferred way to do so.
 
 Plumb the new API in, and allow the old code to be used as a fallback.
 
 [Andre: Rename some functions on the way to differentiate between
 creation and initialisation more clearly.]
 
 Signed-off-by: Marc Zyngier marc.zyng...@arm.com
 Signed-off-by: Andre Przywara andre.przyw...@arm.com
 ---
  tools/kvm/arm/gic.c|   60 
 
  tools/kvm/arm/include/arm-common/gic.h |2 +-
  tools/kvm/arm/kvm.c|6 ++--
  3 files changed, 57 insertions(+), 11 deletions(-)
 
 diff --git a/tools/kvm/arm/gic.c b/tools/kvm/arm/gic.c
 index 5d8cbe6..ce5f7fa 100644
 --- a/tools/kvm/arm/gic.c
 +++ b/tools/kvm/arm/gic.c
 @@ -7,7 +7,41 @@
  #include linux/byteorder.h
  #include linux/kvm.h
  
 -int gic__init_irqchip(struct kvm *kvm)
 +static int gic_fd = -1;
 +
 +static int gic__create_device(struct kvm *kvm)
 +{
 + int err;
 + u64 cpu_if_addr = ARM_GIC_CPUI_BASE;
 + u64 dist_addr = ARM_GIC_DIST_BASE;
 + struct kvm_create_device gic_device = {
 + .type   = KVM_DEV_TYPE_ARM_VGIC_V2,
 + };
 + struct kvm_device_attr cpu_if_attr = {
 + .group  = KVM_DEV_ARM_VGIC_GRP_ADDR,
 + .attr   = KVM_VGIC_V2_ADDR_TYPE_CPU,
 + .addr   = (u64)(unsigned long)cpu_if_addr,
 + };
 + struct kvm_device_attr dist_attr = {
 + .group  = KVM_DEV_ARM_VGIC_GRP_ADDR,
 + .attr   = KVM_VGIC_V2_ADDR_TYPE_DIST,
 + .addr   = (u64)(unsigned long)dist_addr,
 + };
 +
 + err = ioctl(kvm-vm_fd, KVM_CREATE_DEVICE, gic_device);
 + if (err)
 + return err;
 +
 + gic_fd = gic_device.fd;
 +
 + err = ioctl(gic_fd, KVM_SET_DEVICE_ATTR, cpu_if_attr);
 + if (err)
 + return err;
 +
 + return ioctl(gic_fd, KVM_SET_DEVICE_ATTR, dist_attr);
 +}
 +
 +static int gic__create_irqchip(struct kvm *kvm)
  {
   int err;
   struct kvm_arm_device_addr gic_addr[] = {
 @@ -23,12 +57,6 @@ int gic__init_irqchip(struct kvm *kvm)
   }
   };
  
 - if (kvm-nrcpus  GIC_MAX_CPUS) {
 - pr_warning(%d CPUS greater than maximum of %d -- truncating\n,
 - kvm-nrcpus, GIC_MAX_CPUS);
 - kvm-nrcpus = GIC_MAX_CPUS;
 - }
 -
   err = ioctl(kvm-vm_fd, KVM_CREATE_IRQCHIP);
   if (err)
   return err;
 @@ -41,6 +69,24 @@ int gic__init_irqchip(struct kvm *kvm)
   return err;
  }
  
 +int gic__create(struct kvm *kvm)
 +{
 + int err;
 +
 + if (kvm-nrcpus  GIC_MAX_CPUS) {
 + pr_warning(%d CPUS greater than maximum of %d -- truncating\n,
 + kvm-nrcpus, GIC_MAX_CPUS);
 + kvm-nrcpus = GIC_MAX_CPUS;
 + }
 +
 + /* Try the new way first, and fallback on legacy method otherwise */
 + err = gic__create_device(kvm);
 + if (err)
 + err = gic__create_irqchip(kvm);

This fallback doesn't look safe to me:

  - gic_fd might remain initialised
  - What does the kernel vgic driver do if you've already done
a successful KVM_CREATE_DEVICE and then try to use the legacy method?

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 11/11] kvmtool: add command line parameter to instantiate a vGICv3

2015-01-26 Thread Will Deacon

On Fri, Jan 23, 2015 at 04:35:10PM +, Andre Przywara wrote:
 Add the command line parameter --gicv3 to request GICv3 emulation
 in the kernel. Connect that to the already existing GICv3 code.
 
 Signed-off-by: Andre Przywara andre.przyw...@arm.com
 ---
  tools/kvm/arm/aarch64/arm-cpu.c|5 -
  .../kvm/arm/aarch64/include/kvm/kvm-config-arch.h  |4 +++-
  tools/kvm/arm/gic.c|   14 ++
  tools/kvm/arm/include/arm-common/kvm-config-arch.h |1 +
  tools/kvm/arm/kvm-cpu.c|2 +-
  tools/kvm/arm/kvm.c|3 ++-
  6 files changed, 25 insertions(+), 4 deletions(-)
 
 diff --git a/tools/kvm/arm/aarch64/arm-cpu.c b/tools/kvm/arm/aarch64/arm-cpu.c
 index a70d6bb..46d6d6a 100644
 --- a/tools/kvm/arm/aarch64/arm-cpu.c
 +++ b/tools/kvm/arm/aarch64/arm-cpu.c
 @@ -12,7 +12,10 @@
  static void generate_fdt_nodes(void *fdt, struct kvm *kvm, u32 gic_phandle)
  {
   int timer_interrupts[4] = {13, 14, 11, 10};
 - gic__generate_fdt_nodes(fdt, gic_phandle, KVM_DEV_TYPE_ARM_VGIC_V2);
 + gic__generate_fdt_nodes(fdt, gic_phandle,
 + kvm-cfg.arch.gicv3 ?
 + KVM_DEV_TYPE_ARM_VGIC_V3 :
 + KVM_DEV_TYPE_ARM_VGIC_V2);
   timer__generate_fdt_nodes(fdt, kvm, timer_interrupts);
  }
  
 diff --git a/tools/kvm/arm/aarch64/include/kvm/kvm-config-arch.h 
 b/tools/kvm/arm/aarch64/include/kvm/kvm-config-arch.h
 index 89860ae..106e52f 100644
 --- a/tools/kvm/arm/aarch64/include/kvm/kvm-config-arch.h
 +++ b/tools/kvm/arm/aarch64/include/kvm/kvm-config-arch.h
 @@ -3,7 +3,9 @@
  
  #define ARM_OPT_ARCH_RUN(cfg)
 \
   OPT_BOOLEAN('\0', aarch32, (cfg)-aarch32_guest, \
 - Run AArch32 guest),
 + Run AArch32 guest),   \
 + OPT_BOOLEAN('\0', gicv3, (cfg)-gicv3,   \
 + Use a GICv3 interrupt controller in the guest),

On a GICv3-capable system, why would I *not* want to enable this option?
In other words, could we make this the default behaviour on systems that
support it, and if you need an override then it should be something like
--force-gicv2.

Or am I missing a key piece of the puzzle?

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] kvmtool: virtio-9p: Convert EMFILE error at the server to ENFILE for the guest

2015-01-16 Thread Will Deacon

On Fri, Jan 16, 2015 at 11:30:10AM +, Suzuki K. Poulose wrote:
 From: Suzuki K. Poulose suzuki.poul...@arm.com
 
 If an open at the 9p server(host) fails with EMFILE (Too many open files for
 the process), we should return ENFILE(too many open files in the system) to
 the guest to indicate the actual status within the guest.
 
 This was uncovered during LTP, where getdtablesize01 fails to open the maximum
 number-open-files.
 
 getdtablesize010  TINFO  :  Maximum number of files a process can have 
 opened is 1024
 getdtablesize010  TINFO  :  Checking with the value returned by 
 getrlimit...RLIMIT_NOFILE
 getdtablesize011  TPASS  :  got correct dtablesize, value is 1024
 getdtablesize010  TINFO  :  Checking Max num of files that can be opened 
 by a process.Should be: RLIMIT_NOFILE - 1
 getdtablesize012  TFAIL  :  getdtablesize01.c:102: 974 != 1023
 
 For a more practial impact:
 
  # ./getdtablesize01 
 [1] 1834
  getdtablesize010  TINFO  :  Maximum number of files a process can have 
 opened is 1024
  getdtablesize010  TINFO  :  Checking with the value returned by 
 getrlimit...RLIMIT_NOFILE
  getdtablesize011  TPASS  :  got correct dtablesize, value is 1024
  getdtablesize010  TINFO  :  Checking Max num of files that can be opened 
 by a process.Should be: RLIMIT_NOFILE - 1
  getdtablesize012  TFAIL  :  getdtablesize01.c:102: 974 != 1023
  [--- Modified to sleep indefinitely, without closing the files --- ]
 
  # ls
  bash: /bin/ls: Too many open files
 
 That gives a wrong error message for the bash, when getdtablesize01 has 
 exhausted the system
 wide limits, giving false indicators.
 
 With the fix, we get :
 
  # ls
  bash: /bin/ls: Too many open files in system

I doubt this is affecting anybody in practice, but:

  Acked-by: Will Deacon will.dea...@arm.com

Will

 Signed-off-by: Suzuki K. Poulose suzuki.poul...@arm.com
 ---
  tools/kvm/virtio/9p.c |4 
  1 file changed, 4 insertions(+)
 
 diff --git a/tools/kvm/virtio/9p.c b/tools/kvm/virtio/9p.c
 index 9073a1e..b24c0f2 100644
 --- a/tools/kvm/virtio/9p.c
 +++ b/tools/kvm/virtio/9p.c
 @@ -152,6 +152,10 @@ static void virtio_p9_error_reply(struct p9_dev *p9dev,
  {
   u16 tag;
  
 + /* EMFILE at server implies ENFILE for the VM */
 + if (err == EMFILE)
 + err = ENFILE;
 +
   pdu-write_offset = VIRTIO_9P_HDR_LEN;
   virtio_p9_pdu_writef(pdu, d, err);
   *outlen = pdu-write_offset;
 -- 
 1.7.9.5
 
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC PATCH] kvmtool: remove 8250 IRQ line reset on device_init

2015-01-12 Thread Will Deacon

On Fri, Jan 09, 2015 at 03:58:19PM +, Andre Przywara wrote:
 Currently we reset the KVM interrupt line on initializing the 8250
 serial device emulation.
 For ARM this creates a problem where we use the in-kernel IRQ chip
 before having fully initialized it. But with the new kernel interface
 we cannot finish the GIC initialization before we know the number of
 used IRQs, so we have to wait until all devices have been created and
 initialized.
 Since the in-kernel GIC emulation resets the IRQ line anyway and also
 QEMU gets away without resetting it, the easiest solution is to drop
 the IRQ line reset.
 
 Signed-off-by: Andre Przywara andre.przyw...@arm.com
 ---
 Hi Pekka,
 
 this patch is an easy fix for our problems with Marc's kvmtool tree
 and 3.19-rc (setting number of IRQs after the GIC has been used
 already). I see that this basically reverts:
 
  commit 2ca9e37193ca5f5df5726e0061dc749829295435
  Author: Pekka Enberg penb...@kernel.org
  Date:   Sun Jan 23 11:49:39 2011 +0200
  
  kvm,8250: Fix device initial state
  
  This patch fixes 8250 device initial state for registers and IRQ based
  on what Qemu does.
 
 Do you (or does anyone) know of the issue that this patch fixed?
 This is four years old and from what I see QEMU does no longer the
 mentioned IRQ reset(?).
 Reworking kvmtool to avoid this issue sound rather painful.

I'm fine with this from an ARM point-of-view.

  Acked-by: Will Deacon will.dea...@arm.com

Have you tested this on x86?

Will


  tools/kvm/hw/serial.c |1 -
  1 file changed, 1 deletion(-)
 
 diff --git a/tools/kvm/hw/serial.c b/tools/kvm/hw/serial.c
 index 270e6182..2f19ba8 100644
 --- a/tools/kvm/hw/serial.c
 +++ b/tools/kvm/hw/serial.c
 @@ -406,7 +406,6 @@ static int serial8250__device_init(struct kvm *kvm, 
 struct serial8250_device *de
  
   ioport__map_irq(dev-irq);
   r = ioport__register(kvm, dev-iobase, serial8250_ops, 8, dev);
 - kvm__irq_line(kvm, dev-irq, 0);
  
   return r;
  }
 -- 
 1.7.9.5
 
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 2/5] kvmtool: replace GIC specific IRQ type #defines

2014-12-17 Thread Will Deacon

On Wed, Dec 17, 2014 at 11:14:44AM +, Andre Przywara wrote:
 We had GIC specific defines for the IRQ type identifiers in kvmtool.
 But In fact the specification of being a level or edge interrupt
 is quite generic, with the GIC binding using the generic Linux
 defines.
 So lets replace the GIC specific names in favour of the general
 defines used in Linux.
 
 Signed-off-by: Andre Przywara andre.przyw...@arm.com
 ---
  tools/kvm/arm/fdt.c|2 +-
  tools/kvm/arm/include/arm-common/gic.h |5 -
  tools/kvm/arm/pci.c|2 +-
  tools/kvm/arm/timer.c  |8 
  tools/kvm/include/kvm/fdt.h|9 +
  5 files changed, 15 insertions(+), 11 deletions(-)
 
 diff --git a/tools/kvm/arm/fdt.c b/tools/kvm/arm/fdt.c
 index 4a33846..24f030f 100644
 --- a/tools/kvm/arm/fdt.c
 +++ b/tools/kvm/arm/fdt.c
 @@ -79,7 +79,7 @@ static void generate_irq_prop(void *fdt, u8 irq)
   u32 irq_prop[] = {
   cpu_to_fdt32(GIC_FDT_IRQ_TYPE_SPI),
   cpu_to_fdt32(irq - GIC_SPI_IRQ_BASE),
 - cpu_to_fdt32(GIC_FDT_IRQ_FLAGS_EDGE_LO_HI),
 + cpu_to_fdt32(IRQ_TYPE_EDGE_RISING),
   };
  
   _FDT(fdt_property(fdt, interrupts, irq_prop, sizeof(irq_prop)));
 diff --git a/tools/kvm/arm/include/arm-common/gic.h 
 b/tools/kvm/arm/include/arm-common/gic.h
 index 850edc7..5a36f2c 100644
 --- a/tools/kvm/arm/include/arm-common/gic.h
 +++ b/tools/kvm/arm/include/arm-common/gic.h
 @@ -10,11 +10,6 @@
  #define GIC_FDT_IRQ_TYPE_SPI 0
  #define GIC_FDT_IRQ_TYPE_PPI 1
  
 -#define GIC_FDT_IRQ_FLAGS_EDGE_LO_HI 1
 -#define GIC_FDT_IRQ_FLAGS_EDGE_HI_LO 2
 -#define GIC_FDT_IRQ_FLAGS_LEVEL_HI   4
 -#define GIC_FDT_IRQ_FLAGS_LEVEL_LO   8
 -
  #define GIC_FDT_IRQ_PPI_CPU_SHIFT8
  #define GIC_FDT_IRQ_PPI_CPU_MASK (0xff  GIC_FDT_IRQ_PPI_CPU_SHIFT)
  
 diff --git a/tools/kvm/arm/pci.c b/tools/kvm/arm/pci.c
 index 9f4dabc..99a8130 100644
 --- a/tools/kvm/arm/pci.c
 +++ b/tools/kvm/arm/pci.c
 @@ -87,7 +87,7 @@ void pci__generate_fdt_nodes(void *fdt, u32 gic_phandle)
   .gic_irq = {
   .type   = cpu_to_fdt32(GIC_FDT_IRQ_TYPE_SPI),
   .num= cpu_to_fdt32(irq - GIC_SPI_IRQ_BASE),
 - .flags  = 
 cpu_to_fdt32(GIC_FDT_IRQ_FLAGS_EDGE_LO_HI),
 + .flags  = cpu_to_fdt32(IRQ_TYPE_EDGE_RISING),
   },
   };
  
 diff --git a/tools/kvm/arm/timer.c b/tools/kvm/arm/timer.c
 index 209251e..29991da 100644
 --- a/tools/kvm/arm/timer.c
 +++ b/tools/kvm/arm/timer.c
 @@ -15,19 +15,19 @@ void timer__generate_fdt_nodes(void *fdt, struct kvm 
 *kvm, int *irqs)
   u32 irq_prop[] = {
   cpu_to_fdt32(GIC_FDT_IRQ_TYPE_PPI),
   cpu_to_fdt32(irqs[0]),
 - cpu_to_fdt32(cpu_mask | GIC_FDT_IRQ_FLAGS_EDGE_LO_HI),
 + cpu_to_fdt32(cpu_mask | IRQ_TYPE_EDGE_RISING),
  
   cpu_to_fdt32(GIC_FDT_IRQ_TYPE_PPI),
   cpu_to_fdt32(irqs[1]),
 - cpu_to_fdt32(cpu_mask | GIC_FDT_IRQ_FLAGS_EDGE_LO_HI),
 + cpu_to_fdt32(cpu_mask | IRQ_TYPE_EDGE_RISING),
  
   cpu_to_fdt32(GIC_FDT_IRQ_TYPE_PPI),
   cpu_to_fdt32(irqs[2]),
 - cpu_to_fdt32(cpu_mask | GIC_FDT_IRQ_FLAGS_EDGE_LO_HI),
 + cpu_to_fdt32(cpu_mask | IRQ_TYPE_EDGE_RISING),
  
   cpu_to_fdt32(GIC_FDT_IRQ_TYPE_PPI),
   cpu_to_fdt32(irqs[3]),
 - cpu_to_fdt32(cpu_mask | GIC_FDT_IRQ_FLAGS_EDGE_LO_HI),
 + cpu_to_fdt32(cpu_mask | IRQ_TYPE_EDGE_RISING),
   };
  
   _FDT(fdt_begin_node(fdt, timer));
 diff --git a/tools/kvm/include/kvm/fdt.h b/tools/kvm/include/kvm/fdt.h
 index 19f95ac..dee9a71 100644
 --- a/tools/kvm/include/kvm/fdt.h
 +++ b/tools/kvm/include/kvm/fdt.h
 @@ -7,6 +7,15 @@
  
  #define FDT_MAX_SIZE 0x1
  
 +/* Those definitions are generic FDT values for specifying IRQ
 + * types and are used in the Linux kernel internally as well as in
 + * the dts files and their documentation.
 + */
 +#define IRQ_TYPE_EDGE_RISING 1
 +#define IRQ_TYPE_EDGE_FALLING2
 +#define IRQ_TYPE_LEVEL_HIGH  4
 +#define IRQ_TYPE_LEVEL_LOW   8

Any chance we can keep this as an enum, please? That matches that the kernel
uses internally, and allows you to take a specific type instead of a u32
for the device-tree generating functions.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 5/5] kvmtool: remove warning about bzImage on non-x86 architectures

2014-12-17 Thread Will Deacon

On Wed, Dec 17, 2014 at 11:14:47AM +, Andre Przywara wrote:
 Among the architectures supported by kvmtool, only x86 defines a
 bzImage format. So we shouldn't bother users of other architectures
 with a message about something that cannot work.
 Make the bzImage check dependent on compiling for x86.
 
 Signed-off-by: Andre Przywara andre.przyw...@arm.com
 ---
  tools/kvm/kvm.c |5 -
  1 file changed, 4 insertions(+), 1 deletion(-)
 
 diff --git a/tools/kvm/kvm.c b/tools/kvm/kvm.c
 index e1b9f6c..21817c3 100644
 --- a/tools/kvm/kvm.c
 +++ b/tools/kvm/kvm.c
 @@ -380,12 +380,15 @@ bool kvm__load_kernel(struct kvm *kvm, const char 
 *kernel_filename,
   die(%s is not an initrd, initrd_filename);
   }
  
 +#ifdef CONFIG_X86
   ret = load_bzimage(kvm, fd_kernel, fd_initrd, kernel_cmdline);
  
   if (ret)
   goto found_kernel;
  
 - pr_warning(%s is not a bzImage. Trying to load it as a flat 
 binary..., kernel_filename);
 + pr_warning(%s is not a bzImage. Trying to load it as an ELF or flat 
 binary...,
 +kernel_filename);
 +#endif

I don't think you need to change the string (at least, that seems to be an
unrelated change).

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/4] kvmtool: ARM: allow level interrupts in device tree

2014-12-11 Thread Will Deacon

On Thu, Dec 11, 2014 at 04:30:33PM +, Andre Przywara wrote:
 Currently we describe every interrupt for each device in the FDT
 as being edge triggered.
 Add a parameter to the irq property generation to allow devices to
 specify their interrupts as level triggered if needed.
 
 Signed-off-by: Andre Przywara andre.przyw...@arm.com
 ---
  tools/kvm/arm/fdt.c|6 +++---
  tools/kvm/hw/serial.c  |5 +++--
  tools/kvm/include/kvm/ioport.h |4 +++-
  tools/kvm/ioport.c |6 --
  tools/kvm/virtio/mmio.c|5 +++--
  5 files changed, 16 insertions(+), 10 deletions(-)

[...]

 @@ -71,7 +72,8 @@ static void generate_ioport_fdt_node(void *fdt,
  static void generate_ioport_fdt_node(void *fdt,
struct device_header *dev_hdr,
void (*generate_irq_prop)(void *fdt,
 -u8 irq))
 +u8 irq,
 +u32 irq_type))
  {
   die(Unable to generate device tree nodes without libfdt\n);
  }
 diff --git a/tools/kvm/virtio/mmio.c b/tools/kvm/virtio/mmio.c
 index 3a2bd62..28b0651 100644
 --- a/tools/kvm/virtio/mmio.c
 +++ b/tools/kvm/virtio/mmio.c
 @@ -233,7 +233,8 @@ static void virtio_mmio_mmio_callback(struct kvm_cpu 
 *vcpu,
  static void generate_virtio_mmio_fdt_node(void *fdt,
 struct device_header *dev_hdr,
 void (*generate_irq_prop)(void *fdt,
 - u8 irq))
 + u8 irq,
 + u32 type))
  {
   char dev_name[DEVICE_NAME_MAX_LEN];
   struct virtio_mmio *vmmio = container_of(dev_hdr,
 @@ -250,7 +251,7 @@ static void generate_virtio_mmio_fdt_node(void *fdt,
   _FDT(fdt_begin_node(fdt, dev_name));
   _FDT(fdt_property_string(fdt, compatible, virtio,mmio));
   _FDT(fdt_property(fdt, reg, reg_prop, sizeof(reg_prop)));
 - generate_irq_prop(fdt, vmmio-irq);
 + generate_irq_prop(fdt, vmmio-irq, GIC_FDT_IRQ_FLAGS_EDGE_LO_HI);

This is a GIC-specific #define in arch-agnostic code. I think we should have
a new enum type for describing edge and level interrupts, then just use
that instead.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v4 1/3] KVM: arm/arm64: unset CONFIG_HAVE_KVM_IRQCHIP

2014-11-24 Thread Will Deacon

On Sun, Nov 23, 2014 at 05:56:58PM +, Eric Auger wrote:
 CONFIG_HAVE_KVM_IRQCHIP is needed to support IRQ routing (along
 with irq_comm.c and irqchip.c usage). This is not the case for
 arm/arm64 currently.
 
 This patch unsets the flag for both arm and arm64.
 
 Signed-off-by: Eric Auger eric.au...@linaro.org
 ---

The arm64 change is fine by me. I assume this will go via the kvm tree
eventually?

  Acked-by: Will Deacon will.dea...@arm.com

Will

  arch/arm/kvm/Kconfig   | 2 --
  arch/arm64/kvm/Kconfig | 1 -
  2 files changed, 3 deletions(-)
 
 diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
 index 466bd29..9f581b1 100644
 --- a/arch/arm/kvm/Kconfig
 +++ b/arch/arm/kvm/Kconfig
 @@ -55,7 +55,6 @@ config KVM_ARM_MAX_VCPUS
  config KVM_ARM_VGIC
   bool KVM support for Virtual GIC
   depends on KVM_ARM_HOST  OF
 - select HAVE_KVM_IRQCHIP
   default y
   ---help---
 Adds support for a hardware assisted, in-kernel GIC emulation.
 @@ -63,7 +62,6 @@ config KVM_ARM_VGIC
  config KVM_ARM_TIMER
   bool KVM support for Architected Timers
   depends on KVM_ARM_VGIC  ARM_ARCH_TIMER
 - select HAVE_KVM_IRQCHIP
   default y
   ---help---
 Adds support for the Architected Timers in virtual machines
 diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
 index 8ba85e9..279e1a0 100644
 --- a/arch/arm64/kvm/Kconfig
 +++ b/arch/arm64/kvm/Kconfig
 @@ -50,7 +50,6 @@ config KVM_ARM_MAX_VCPUS
  config KVM_ARM_VGIC
   bool
   depends on KVM_ARM_HOST  OF
 - select HAVE_KVM_IRQCHIP
   ---help---
 Adds support for a hardware assisted, in-kernel GIC emulation.
  
 -- 
 1.9.1
 
 
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v5 3/4] kvmtool: Handle exit reason KVM_EXIT_SYSTEM_EVENT

2014-10-03 Thread Will Deacon

On Wed, Oct 01, 2014 at 11:34:54AM +0100, Anup Patel wrote:
 The KVM_EXIT_SYSTEM_EVENT exit reason was added to define
 architecture independent system-wide events for a Guest.
 
 Currently, it is used by in-kernel PSCI-0.2 emulation of
 KVM ARM/ARM64 to inform user space about PSCI SYSTEM_OFF
 or PSCI SYSTEM_RESET request.
 
 For now, we simply treat all system-wide guest events as
 shutdown request in KVMTOOL.
 
 Signed-off-by: Pranavkumar Sawargaonkar pranavku...@linaro.org
 Signed-off-by: Anup Patel anup.pa...@linaro.org
 Reviewed-by: Andre Przywara andre.przyw...@arm.com
 ---
  tools/kvm/kvm-cpu.c |   21 +
  1 file changed, 21 insertions(+)
 
 diff --git a/tools/kvm/kvm-cpu.c b/tools/kvm/kvm-cpu.c
 index ee0a8ec..5180039 100644
 --- a/tools/kvm/kvm-cpu.c
 +++ b/tools/kvm/kvm-cpu.c
 @@ -160,6 +160,27 @@ int kvm_cpu__start(struct kvm_cpu *cpu)
   goto exit_kvm;
   case KVM_EXIT_SHUTDOWN:
   goto exit_kvm;
 + case KVM_EXIT_SYSTEM_EVENT:
 + /*
 +  * Print the type of system event and
 +  * treat all system events as shutdown request.
 +  */
 + switch (cpu-kvm_run-system_event.type) {
 + case KVM_SYSTEM_EVENT_SHUTDOWN:
 + printf(  # Info: shutdown system event\n);
 + goto exit_kvm;
 + case KVM_SYSTEM_EVENT_RESET:
 + printf(  # Info: reset system event\n);
 + printf(  # Info: KVMTOOL does not support VM 
 reset\n);
 + printf(  # Info: please re-launch the VM 
 manually\n);

Can you use pr_info here instead?

 + goto exit_kvm;
 + default:
 + printf(  # Warning: unknown system event 
 type=%d\n,
 +cpu-kvm_run-system_event.type);

and pr_warning here? (be sure to drop the '\n's).

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v5 0/4] kvmtool: ARM/ARM64: Misc updates

2014-10-03 Thread Will Deacon

On Wed, Oct 01, 2014 at 11:34:51AM +0100, Anup Patel wrote:
 This patchset updates KVMTOOL to use some of the features
 supported by Linux-3.16 KVM ARM/ARM64, such as:
 
 1. Target CPU == Host using KVM_ARM_PREFERRED_TARGET vm ioctl
 2. Target CPU type Potenza for using KVMTOOL on X-Gene
 3. PSCI v0.2 support for Aarch32 and Aarch64 guest
 4. System event exit reason
 
 Changes since v4:
 - Avoid using magic '0' target for kvm arm generic target
 - Added comment for why we need Potenza target in KVMTOOL

Please can you send a v5 addressing my minor comment and adding Andre's
reviewed-by tags?

Thanks,

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] kvm-vfio: do not use module_init

2014-09-24 Thread Will Deacon

On Wed, Sep 24, 2014 at 12:29:09PM +0100, Paolo Bonzini wrote:
 /me got confused between the kernel and QEMU.  In the kernel, you can
 only have one module_init function, and it will prevent unloading the
 module unless you also have the corresponding module_exit function.

Happy for you to take the blame, but I think this one's my fault!

 So, commit 80ce1639727e (KVM: VFIO: register kvm_device_ops dynamically,
 2014-09-02) broke unloading of the kvm module, by adding a module_init
 function and no module_exit.

I forget kvm builds as a module for other architectures (ie. not arm/arm64).

 Repair it by making kvm_vfio_ops_init weak, and checking it in
 kvm_init.

Hehe, if only there was a kconfig option for kvm-vfio.c...

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] kvm-vfio: do not use module_init

2014-09-24 Thread Will Deacon

On Wed, Sep 24, 2014 at 12:45:53PM +0100, Paolo Bonzini wrote:
 Il 24/09/2014 13:44, Will Deacon ha scritto:
  On Wed, Sep 24, 2014 at 12:29:09PM +0100, Paolo Bonzini wrote:
  /me got confused between the kernel and QEMU.  In the kernel, you can
  only have one module_init function, and it will prevent unloading the
  module unless you also have the corresponding module_exit function.
  
  Happy for you to take the blame, but I think this one's my fault!
 
 That's why you were CCed! ;)
 
  So, commit 80ce1639727e (KVM: VFIO: register kvm_device_ops dynamically,
  2014-09-02) broke unloading of the kvm module, by adding a module_init
  function and no module_exit.
  
  I forget kvm builds as a module for other architectures (ie. not arm/arm64).
  
  Repair it by making kvm_vfio_ops_init weak, and checking it in
  kvm_init.
  
  Hehe, if only there was a kconfig option for kvm-vfio.c...
 
 Yeah, I was tempted to put it back.  What do you think?

I think it's nicer than the __weak symbol and I don't see a downside to
having the option (the reason for removing it was lack of users, which
lasted a couple of hours).

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2] kvm-vfio: do not use module_init

2014-09-24 Thread Will Deacon

On Wed, Sep 24, 2014 at 12:56:20PM +0100, Paolo Bonzini wrote:
 /me got confused between the kernel and QEMU.  In the kernel, you can
 only have one module_init function, and it will prevent unloading the
 module unless you also have the corresponding module_exit function.
 
 So, commit 80ce1639727e (KVM: VFIO: register kvm_device_ops dynamically,
 2014-09-02) broke unloading of the kvm module, by adding a module_init
 function and no module_exit.
 
 Repair it by making kvm_vfio_ops_init weak, and checking it in
 kvm_init.
 
 Cc: Will Deacon will.dea...@arm.com
 Cc: Gleb Natapov g...@kernel.org
 Cc: Alex Williamson alex.william...@redhat.com
 Fixes: 80ce1639727e9d38729c34f162378508c307ca25
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
  virt/kvm/kvm_main.c | 9 +
  virt/kvm/vfio.c | 4 ++--
  virt/kvm/vfio.h | 6 ++
  3 files changed, 17 insertions(+), 2 deletions(-)
  create mode 100644 virt/kvm/vfio.h
 
 diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
 index db57363cc287..ad3439835adb 100644
 --- a/virt/kvm/kvm_main.c
 +++ b/virt/kvm/kvm_main.c
 @@ -58,6 +58,10 @@
  #include coalesced_mmio.h
  #include async_pf.h
  
 +#ifdef CONFIG_KVM_VFIO
 +#include vfio.h
 +#endif

Can't you include this unconditionally, then have the #ifdef in the header
and a static inline kvm_vfio_ops_init that returns 0?

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3] kvm-vfio: do not use module_init

2014-09-24 Thread Will Deacon

On Wed, Sep 24, 2014 at 01:09:28PM +0100, Paolo Bonzini wrote:
 /me got confused between the kernel and QEMU.  In the kernel, you can
 only have one module_init function, and it will prevent unloading the
 module unless you also have the corresponding module_exit function.
 
 So, commit 80ce1639727e (KVM: VFIO: register kvm_device_ops dynamically,
 2014-09-02) broke unloading of the kvm module, by adding a module_init
 function and no module_exit.
 
 Repair it by making kvm_vfio_ops_init weak, and checking it in
 kvm_init.
 
 Cc: Will Deacon will.dea...@arm.com
 Cc: Gleb Natapov g...@kernel.org
 Cc: Alex Williamson alex.william...@redhat.com
 Fixes: 80ce1639727e9d38729c34f162378508c307ca25
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
  virt/kvm/kvm_main.c |  4 
  virt/kvm/vfio.c |  4 ++--
  virt/kvm/vfio.h | 13 +
  3 files changed, 19 insertions(+), 2 deletions(-)
  create mode 100644 virt/kvm/vfio.h

Looks good to me, thanks for fixing this:

  Acked-by: Will Deacon will.dea...@arm.com

Will

 diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
 index db57363cc287..499db0977f3c 100644
 --- a/virt/kvm/kvm_main.c
 +++ b/virt/kvm/kvm_main.c
 @@ -57,6 +57,7 @@
  
  #include coalesced_mmio.h
  #include async_pf.h
 +#include vfio.h
  
  #define CREATE_TRACE_POINTS
  #include trace/events/kvm.h
 @@ -3226,6 +3227,9 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned 
 vcpu_align,
   goto out_undebugfs;
   }
  
 + r = kvm_vfio_ops_init();
 + WARN_ON(r);
 +
   return 0;
  
  out_undebugfs:
 diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
 index bb11b36ee8a2..281e7cf2b8e5 100644
 --- a/virt/kvm/vfio.c
 +++ b/virt/kvm/vfio.c
 @@ -18,6 +18,7 @@
  #include linux/slab.h
  #include linux/uaccess.h
  #include linux/vfio.h
 +#include vfio.h
  
  struct kvm_vfio_group {
   struct list_head node;
 @@ -278,8 +279,7 @@ static int kvm_vfio_create(struct kvm_device *dev, u32 
 type)
   return 0;
  }
  
 -static int __init kvm_vfio_ops_init(void)
 +int kvm_vfio_ops_init(void)
  {
   return kvm_register_device_ops(kvm_vfio_ops, KVM_DEV_TYPE_VFIO);
  }
 -module_init(kvm_vfio_ops_init);
 diff --git a/virt/kvm/vfio.h b/virt/kvm/vfio.h
 new file mode 100644
 index ..92eac75d6b62
 --- /dev/null
 +++ b/virt/kvm/vfio.h
 @@ -0,0 +1,13 @@
 +#ifndef __KVM_VFIO_H
 +#define __KVM_VFIO_H
 +
 +#ifdef CONFIG_KVM_VFIO
 +int kvm_vfio_ops_init(void);
 +#else
 +static inline int kvm_vfio_ops_init(void)
 +{
 + return 0;
 +}
 +#endif
 +
 +#endif
 -- 
 1.8.3.1
 
 
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCHv7 01/26] iommu/arm-smmu: change IOMMU_EXEC to IOMMU_NOEXEC

2014-09-23 Thread Will Deacon

Hi Antonios,

On Tue, Sep 23, 2014 at 03:46:00PM +0100, Antonios Motakis wrote:
 Exposing the XN flag of the SMMU driver as IOMMU_NOEXEC instead of
 IOMMU_EXEC makes it enforceable, since for IOMMUs that don't support
 the XN flag pages will always be executable.
 
 Signed-off-by: Antonios Motakis a.mota...@virtualopensystems.com
 ---
  drivers/iommu/arm-smmu.c | 9 +
  include/linux/iommu.h| 2 +-
  2 files changed, 6 insertions(+), 5 deletions(-)

[...]

 diff --git a/include/linux/iommu.h b/include/linux/iommu.h
 index 20f9a52..e1a644c 100644
 --- a/include/linux/iommu.h
 +++ b/include/linux/iommu.h
 @@ -27,7 +27,7 @@
  #define IOMMU_READ   (1  0)
  #define IOMMU_WRITE  (1  1)
  #define IOMMU_CACHE  (1  2) /* DMA cache coherency */
 -#define IOMMU_EXEC   (1  3)
 +#define IOMMU_NOEXEC (1  3)

This hunk needs to be a separate patch merged by Joerg before I can take the
arm-smmu part (which looks fine).

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3 0/4] Make kvm_device_ops registration dynamic

2014-09-11 Thread Will Deacon

Hi Paolo,

On Tue, Sep 02, 2014 at 10:27:32AM +0100, Will Deacon wrote:
 Hi all,
 
 This is version 3 of the patches originally posted here:
 
   v1: http://www.spinics.net/lists/kvm-arm/msg10219.html
   v2: http://www.spinics.net/lists/kvm/msg105197.html
 
 Changes since v2 include:
 
   - Rebased onto 3.17-rc* (the vgic code changed a lot!)
   - Added relevant acks
 
 The mpic, flic and xics are still not ported over, as I don't want to
 risk breaking those devices (it's not clear at which point they need to
 be registered).

Any further comments on this lot?

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH] kvm tools: arm: remove register accessor macros now that they are in uapi

2014-09-05 Thread Will Deacon

On Fri, Aug 29, 2014 at 02:00:24PM +0100, Will Deacon wrote:
 The kernel now exposes register accessor macros in the uapi/ headers
 for arm and arm64, so use those instead (and avoid the compile failure
 from the duplicate definitions).
 
 Signed-off-by: Will Deacon will.dea...@arm.com
 ---
 
 Pekka -- please take this as a fix, since merging the 3.16 sources has
  caused some breakage for ARM. Cheers!

Ping? It would be great to get the build failure fixed on master.

Cheers,

Will

  tools/kvm/arm/aarch32/kvm-cpu.c | 15 +--
  tools/kvm/arm/aarch64/kvm-cpu.c | 15 ---
  2 files changed, 1 insertion(+), 29 deletions(-)
 
 diff --git a/tools/kvm/arm/aarch32/kvm-cpu.c b/tools/kvm/arm/aarch32/kvm-cpu.c
 index 464b473dc936..95fb1da5ba3d 100644
 --- a/tools/kvm/arm/aarch32/kvm-cpu.c
 +++ b/tools/kvm/arm/aarch32/kvm-cpu.c
 @@ -7,25 +7,12 @@
  #define ARM_CORE_REG(x)  (KVM_REG_ARM | KVM_REG_SIZE_U32 | 
 KVM_REG_ARM_CORE | \
KVM_REG_ARM_CORE_REG(x))
  
 -#define ARM_CP15_REG_SHIFT_MASK(x,n) \
 - (((x)  KVM_REG_ARM_ ## n ## _SHIFT)  KVM_REG_ARM_ ## n ## _MASK)
 -
 -#define __ARM_CP15_REG(op1,crn,crm,op2)  \
 - (KVM_REG_ARM | KVM_REG_SIZE_U32 |   \
 -  (15  KVM_REG_ARM_COPROC_SHIFT)   |   \
 -  ARM_CP15_REG_SHIFT_MASK(op1, OPC1) |   \
 -  ARM_CP15_REG_SHIFT_MASK(crn, 32_CRN)   |   \
 -  ARM_CP15_REG_SHIFT_MASK(crm, CRM)  |   \
 -  ARM_CP15_REG_SHIFT_MASK(op2, 32_OPC2))
 -
 -#define ARM_CP15_REG(...)__ARM_CP15_REG(__VA_ARGS__)
 -
  unsigned long kvm_cpu__get_vcpu_mpidr(struct kvm_cpu *vcpu)
  {
   struct kvm_one_reg reg;
   u32 mpidr;
  
 - reg.id = ARM_CP15_REG(ARM_CPU_ID, ARM_CPU_ID_MPIDR);
 + reg.id = ARM_CP15_REG32(ARM_CPU_ID, ARM_CPU_ID_MPIDR);
   reg.addr = (u64)(unsigned long)mpidr;
   if (ioctl(vcpu-vcpu_fd, KVM_GET_ONE_REG, reg)  0)
   die(KVM_GET_ONE_REG failed (get_mpidr vcpu%ld, vcpu-cpu_id);
 diff --git a/tools/kvm/arm/aarch64/kvm-cpu.c b/tools/kvm/arm/aarch64/kvm-cpu.c
 index 71a2a3a7789d..1b293748efd6 100644
 --- a/tools/kvm/arm/aarch64/kvm-cpu.c
 +++ b/tools/kvm/arm/aarch64/kvm-cpu.c
 @@ -15,21 +15,6 @@
  #define ARM64_CORE_REG(x)(KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \
KVM_REG_ARM_CORE | KVM_REG_ARM_CORE_REG(x))
  
 -#define ARM64_SYS_REG_SHIFT_MASK(x,n)\
 - (((x)  KVM_REG_ARM64_SYSREG_ ## n ## _SHIFT) \
 -  KVM_REG_ARM64_SYSREG_ ## n ## _MASK)
 -
 -#define __ARM64_SYS_REG(op0,op1,crn,crm,op2) \
 - (KVM_REG_ARM64 | KVM_REG_SIZE_U64   |   \
 -  KVM_REG_ARM64_SYSREG   |   \
 -  ARM64_SYS_REG_SHIFT_MASK(op0, OP0) |   \
 -  ARM64_SYS_REG_SHIFT_MASK(op1, OP1) |   \
 -  ARM64_SYS_REG_SHIFT_MASK(crn, CRN) |   \
 -  ARM64_SYS_REG_SHIFT_MASK(crm, CRM) |   \
 -  ARM64_SYS_REG_SHIFT_MASK(op2, OP2))
 -
 -#define ARM64_SYS_REG(...)   __ARM64_SYS_REG(__VA_ARGS__)
 -
  unsigned long kvm_cpu__get_vcpu_mpidr(struct kvm_cpu *vcpu)
  {
   struct kvm_one_reg reg;
 -- 
 2.1.0
 
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v3 0/4] Make kvm_device_ops registration dynamic

2014-09-02 Thread Will Deacon

Hi all,

This is version 3 of the patches originally posted here:

  v1: http://www.spinics.net/lists/kvm-arm/msg10219.html
  v2: http://www.spinics.net/lists/kvm/msg105197.html

Changes since v2 include:

  - Rebased onto 3.17-rc* (the vgic code changed a lot!)
  - Added relevant acks

The mpic, flic and xics are still not ported over, as I don't want to
risk breaking those devices (it's not clear at which point they need to
be registered).

Thanks,

Will

---8

Cornelia Huck (1):
  KVM: s390: register flic ops dynamically

Will Deacon (3):
  KVM: device: add simple registration mechanism for kvm_device_ops
  KVM: ARM: vgic: register kvm_device_ops dynamically
  KVM: VFIO: register kvm_device_ops dynamically

 arch/s390/kvm/kvm-s390.c |   3 +-
 arch/s390/kvm/kvm-s390.h |   1 +
 include/linux/kvm_host.h |   4 +-
 include/uapi/linux/kvm.h |  22 +--
 virt/kvm/arm/vgic.c  | 157 ---
 virt/kvm/kvm_main.c  |  57 +
 virt/kvm/vfio.c  |  22 ---
 7 files changed, 142 insertions(+), 124 deletions(-)

-- 
2.1.0

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v3 3/4] KVM: s390: register flic ops dynamically

2014-09-02 Thread Will Deacon

From: Cornelia Huck cornelia.h...@de.ibm.com

Using the new kvm_register_device_ops() interface makes us get rid of
an #ifdef in common code.

Cc: Gleb Natapov g...@kernel.org
Cc: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com
Signed-off-by: Will Deacon will.dea...@arm.com
---
 arch/s390/kvm/kvm-s390.c | 3 ++-
 arch/s390/kvm/kvm-s390.h | 1 +
 include/linux/kvm_host.h | 1 -
 virt/kvm/kvm_main.c  | 4 
 4 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index ce81eb2ab76a..63793641393c 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -130,7 +130,8 @@ void kvm_arch_check_processor_compat(void *rtn)
 
 int kvm_arch_init(void *opaque)
 {
-   return 0;
+   /* Register floating interrupt controller interface. */
+   return kvm_register_device_ops(kvm_flic_ops, KVM_DEV_TYPE_FLIC);
 }
 
 void kvm_arch_exit(void)
diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
index 3862fa2cefe0..04e90538c37e 100644
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -228,6 +228,7 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *vcpu);
 int psw_extint_disabled(struct kvm_vcpu *vcpu);
 void kvm_s390_destroy_adapters(struct kvm *kvm);
 int kvm_s390_si_ext_call_pending(struct kvm_vcpu *vcpu);
+extern struct kvm_device_ops kvm_flic_ops;
 
 /* implemented in guestdbg.c */
 void kvm_s390_backup_guest_per_regs(struct kvm_vcpu *vcpu);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index e9a069ddcb2b..73f796ad0d0b 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1073,7 +1073,6 @@ int kvm_register_device_ops(struct kvm_device_ops *ops, 
u32 type);
 extern struct kvm_device_ops kvm_mpic_ops;
 extern struct kvm_device_ops kvm_xics_ops;
 extern struct kvm_device_ops kvm_vfio_ops;
-extern struct kvm_device_ops kvm_flic_ops;
 
 #ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 6aaa44c1506e..78b25a44db53 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2272,10 +2272,6 @@ static struct kvm_device_ops 
*kvm_device_ops_table[KVM_DEV_TYPE_MAX] = {
 #ifdef CONFIG_KVM_VFIO
[KVM_DEV_TYPE_VFIO] = kvm_vfio_ops,
 #endif
-
-#ifdef CONFIG_S390
-   [KVM_DEV_TYPE_FLIC] = kvm_flic_ops,
-#endif
 };
 
 int kvm_register_device_ops(struct kvm_device_ops *ops, u32 type)
-- 
2.1.0

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v3 1/4] KVM: device: add simple registration mechanism for kvm_device_ops

2014-09-02 Thread Will Deacon

kvm_ioctl_create_device currently has knowledge of all the device types
and their associated ops. This is fairly inflexible when adding support
for new in-kernel device emulations, so move what we currently have out
into a table, which can support dynamic registration of ops by new
drivers for virtual hardware.

Cc: Alex Williamson alex.william...@redhat.com
Cc: Alex Graf ag...@suse.de
Cc: Gleb Natapov g...@kernel.org
Cc: Paolo Bonzini pbonz...@redhat.com
Cc: Marc Zyngier marc.zyng...@arm.com
Acked-by: Cornelia Huck cornelia.h...@de.ibm.com
Reviewed-by: Christoffer Dall christoffer.d...@linaro.org
Signed-off-by: Will Deacon will.dea...@arm.com
---
 include/linux/kvm_host.h |  1 +
 include/uapi/linux/kvm.h | 22 +++-
 virt/kvm/kvm_main.c  | 65 
 3 files changed, 55 insertions(+), 33 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index a4c33b34fe3f..ca105f919f0b 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1068,6 +1068,7 @@ struct kvm_device_ops {
 void kvm_device_get(struct kvm_device *dev);
 void kvm_device_put(struct kvm_device *dev);
 struct kvm_device *kvm_device_from_filp(struct file *filp);
+int kvm_register_device_ops(struct kvm_device_ops *ops, u32 type);
 
 extern struct kvm_device_ops kvm_mpic_ops;
 extern struct kvm_device_ops kvm_xics_ops;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index cf3a2ff440e4..5541024dd5d0 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -947,15 +947,25 @@ struct kvm_device_attr {
__u64   addr;   /* userspace address of attr data */
 };
 
-#define KVM_DEV_TYPE_FSL_MPIC_20   1
-#define KVM_DEV_TYPE_FSL_MPIC_42   2
-#define KVM_DEV_TYPE_XICS  3
-#define KVM_DEV_TYPE_VFIO  4
 #define  KVM_DEV_VFIO_GROUP1
 #define   KVM_DEV_VFIO_GROUP_ADD   1
 #define   KVM_DEV_VFIO_GROUP_DEL   2
-#define KVM_DEV_TYPE_ARM_VGIC_V2   5
-#define KVM_DEV_TYPE_FLIC  6
+
+enum kvm_device_type {
+   KVM_DEV_TYPE_FSL_MPIC_20= 1,
+#define KVM_DEV_TYPE_FSL_MPIC_20   KVM_DEV_TYPE_FSL_MPIC_20
+   KVM_DEV_TYPE_FSL_MPIC_42,
+#define KVM_DEV_TYPE_FSL_MPIC_42   KVM_DEV_TYPE_FSL_MPIC_42
+   KVM_DEV_TYPE_XICS,
+#define KVM_DEV_TYPE_XICS  KVM_DEV_TYPE_XICS
+   KVM_DEV_TYPE_VFIO,
+#define KVM_DEV_TYPE_VFIO  KVM_DEV_TYPE_VFIO
+   KVM_DEV_TYPE_ARM_VGIC_V2,
+#define KVM_DEV_TYPE_ARM_VGIC_V2   KVM_DEV_TYPE_ARM_VGIC_V2
+   KVM_DEV_TYPE_FLIC,
+#define KVM_DEV_TYPE_FLIC  KVM_DEV_TYPE_FLIC
+   KVM_DEV_TYPE_MAX,
+};
 
 /*
  * ioctls for VM fds
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 33712fb26eb1..81de6768d2c2 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2259,44 +2259,55 @@ struct kvm_device *kvm_device_from_filp(struct file 
*filp)
return filp-private_data;
 }
 
-static int kvm_ioctl_create_device(struct kvm *kvm,
-  struct kvm_create_device *cd)
-{
-   struct kvm_device_ops *ops = NULL;
-   struct kvm_device *dev;
-   bool test = cd-flags  KVM_CREATE_DEVICE_TEST;
-   int ret;
-
-   switch (cd-type) {
+static struct kvm_device_ops *kvm_device_ops_table[KVM_DEV_TYPE_MAX] = {
 #ifdef CONFIG_KVM_MPIC
-   case KVM_DEV_TYPE_FSL_MPIC_20:
-   case KVM_DEV_TYPE_FSL_MPIC_42:
-   ops = kvm_mpic_ops;
-   break;
+   [KVM_DEV_TYPE_FSL_MPIC_20]  = kvm_mpic_ops,
+   [KVM_DEV_TYPE_FSL_MPIC_42]  = kvm_mpic_ops,
 #endif
+
 #ifdef CONFIG_KVM_XICS
-   case KVM_DEV_TYPE_XICS:
-   ops = kvm_xics_ops;
-   break;
+   [KVM_DEV_TYPE_XICS] = kvm_xics_ops,
 #endif
+
 #ifdef CONFIG_KVM_VFIO
-   case KVM_DEV_TYPE_VFIO:
-   ops = kvm_vfio_ops;
-   break;
+   [KVM_DEV_TYPE_VFIO] = kvm_vfio_ops,
 #endif
+
 #ifdef CONFIG_KVM_ARM_VGIC
-   case KVM_DEV_TYPE_ARM_VGIC_V2:
-   ops = kvm_arm_vgic_v2_ops;
-   break;
+   [KVM_DEV_TYPE_ARM_VGIC_V2]  = kvm_arm_vgic_v2_ops,
 #endif
+
 #ifdef CONFIG_S390
-   case KVM_DEV_TYPE_FLIC:
-   ops = kvm_flic_ops;
-   break;
+   [KVM_DEV_TYPE_FLIC] = kvm_flic_ops,
 #endif
-   default:
+};
+
+int kvm_register_device_ops(struct kvm_device_ops *ops, u32 type)
+{
+   if (type = ARRAY_SIZE(kvm_device_ops_table))
+   return -ENOSPC;
+
+   if (kvm_device_ops_table[type] != NULL)
+   return -EEXIST;
+
+   kvm_device_ops_table[type] = ops;
+   return 0;
+}
+
+static int kvm_ioctl_create_device(struct kvm *kvm,
+  struct kvm_create_device *cd)
+{
+   struct kvm_device_ops *ops = NULL;
+   struct kvm_device *dev;
+   bool test = cd-flags  KVM_CREATE_DEVICE_TEST

[PATCH v3 2/4] KVM: ARM: vgic: register kvm_device_ops dynamically

2014-09-02 Thread Will Deacon

Now that we have a dynamic means to register kvm_device_ops, use that
for the ARM VGIC, instead of relying on the static table.

Cc: Gleb Natapov g...@kernel.org
Cc: Paolo Bonzini pbonz...@redhat.com
Acked-by: Marc Zyngier marc.zyng...@arm.com
Reviewed-by: Christoffer Dall christoffer.d...@linaro.org
Signed-off-by: Will Deacon will.dea...@arm.com
---
 include/linux/kvm_host.h |   1 -
 virt/kvm/arm/vgic.c  | 157 ---
 virt/kvm/kvm_main.c  |   4 --
 3 files changed, 79 insertions(+), 83 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index ca105f919f0b..e9a069ddcb2b 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1073,7 +1073,6 @@ int kvm_register_device_ops(struct kvm_device_ops *ops, 
u32 type);
 extern struct kvm_device_ops kvm_mpic_ops;
 extern struct kvm_device_ops kvm_xics_ops;
 extern struct kvm_device_ops kvm_vfio_ops;
-extern struct kvm_device_ops kvm_arm_vgic_v2_ops;
 extern struct kvm_device_ops kvm_flic_ops;
 
 #ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 73eba793b17f..3ee3ce06bbec 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1522,83 +1522,6 @@ int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu)
return 0;
 }
 
-static void vgic_init_maintenance_interrupt(void *info)
-{
-   enable_percpu_irq(vgic-maint_irq, 0);
-}
-
-static int vgic_cpu_notify(struct notifier_block *self,
-  unsigned long action, void *cpu)
-{
-   switch (action) {
-   case CPU_STARTING:
-   case CPU_STARTING_FROZEN:
-   vgic_init_maintenance_interrupt(NULL);
-   break;
-   case CPU_DYING:
-   case CPU_DYING_FROZEN:
-   disable_percpu_irq(vgic-maint_irq);
-   break;
-   }
-
-   return NOTIFY_OK;
-}
-
-static struct notifier_block vgic_cpu_nb = {
-   .notifier_call = vgic_cpu_notify,
-};
-
-static const struct of_device_id vgic_ids[] = {
-   { .compatible = arm,cortex-a15-gic, .data = vgic_v2_probe, },
-   { .compatible = arm,gic-v3, .data = vgic_v3_probe, },
-   {},
-};
-
-int kvm_vgic_hyp_init(void)
-{
-   const struct of_device_id *matched_id;
-   int (*vgic_probe)(struct device_node *,const struct vgic_ops **,
- const struct vgic_params **);
-   struct device_node *vgic_node;
-   int ret;
-
-   vgic_node = of_find_matching_node_and_match(NULL,
-   vgic_ids, matched_id);
-   if (!vgic_node) {
-   kvm_err(error: no compatible GIC node found\n);
-   return -ENODEV;
-   }
-
-   vgic_probe = matched_id-data;
-   ret = vgic_probe(vgic_node, vgic_ops, vgic);
-   if (ret)
-   return ret;
-
-   ret = request_percpu_irq(vgic-maint_irq, vgic_maintenance_handler,
-vgic, kvm_get_running_vcpus());
-   if (ret) {
-   kvm_err(Cannot register interrupt %d\n, vgic-maint_irq);
-   return ret;
-   }
-
-   ret = __register_cpu_notifier(vgic_cpu_nb);
-   if (ret) {
-   kvm_err(Cannot register vgic CPU notifier\n);
-   goto out_free_irq;
-   }
-
-   /* Callback into for arch code for setup */
-   vgic_arch_setup(vgic);
-
-   on_each_cpu(vgic_init_maintenance_interrupt, NULL, 1);
-
-   return 0;
-
-out_free_irq:
-   free_percpu_irq(vgic-maint_irq, kvm_get_running_vcpus());
-   return ret;
-}
-
 /**
  * kvm_vgic_init - Initialize global VGIC state before running any VCPUs
  * @kvm: pointer to the kvm struct
@@ -2062,7 +1985,7 @@ static int vgic_create(struct kvm_device *dev, u32 type)
return kvm_vgic_create(dev-kvm);
 }
 
-struct kvm_device_ops kvm_arm_vgic_v2_ops = {
+static struct kvm_device_ops kvm_arm_vgic_v2_ops = {
.name = kvm-arm-vgic,
.create = vgic_create,
.destroy = vgic_destroy,
@@ -2070,3 +1993,81 @@ struct kvm_device_ops kvm_arm_vgic_v2_ops = {
.get_attr = vgic_get_attr,
.has_attr = vgic_has_attr,
 };
+
+static void vgic_init_maintenance_interrupt(void *info)
+{
+   enable_percpu_irq(vgic-maint_irq, 0);
+}
+
+static int vgic_cpu_notify(struct notifier_block *self,
+  unsigned long action, void *cpu)
+{
+   switch (action) {
+   case CPU_STARTING:
+   case CPU_STARTING_FROZEN:
+   vgic_init_maintenance_interrupt(NULL);
+   break;
+   case CPU_DYING:
+   case CPU_DYING_FROZEN:
+   disable_percpu_irq(vgic-maint_irq);
+   break;
+   }
+
+   return NOTIFY_OK;
+}
+
+static struct notifier_block vgic_cpu_nb = {
+   .notifier_call = vgic_cpu_notify,
+};
+
+static const struct of_device_id vgic_ids[] = {
+   { .compatible = arm,cortex-a15-gic, .data = vgic_v2_probe, },
+   { .compatible = arm,gic-v3, .data

[PATCH v3 4/4] KVM: VFIO: register kvm_device_ops dynamically

2014-09-02 Thread Will Deacon

Now that we have a dynamic means to register kvm_device_ops, use that
for the VFIO kvm device, instead of relying on the static table.

This is achieved by a module_init call to register the ops with KVM.

Cc: Gleb Natapov g...@kernel.org
Cc: Paolo Bonzini pbonz...@redhat.com
Acked-by: Alex Williamson alex.william...@redhat.com
Signed-off-by: Will Deacon will.dea...@arm.com
---
 include/linux/kvm_host.h |  1 -
 virt/kvm/kvm_main.c  |  4 
 virt/kvm/vfio.c  | 22 +++---
 3 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 73f796ad0d0b..55a1618a19dc 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1072,7 +1072,6 @@ int kvm_register_device_ops(struct kvm_device_ops *ops, 
u32 type);
 
 extern struct kvm_device_ops kvm_mpic_ops;
 extern struct kvm_device_ops kvm_xics_ops;
-extern struct kvm_device_ops kvm_vfio_ops;
 
 #ifdef CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 78b25a44db53..e9a703eca8c1 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2268,10 +2268,6 @@ static struct kvm_device_ops 
*kvm_device_ops_table[KVM_DEV_TYPE_MAX] = {
 #ifdef CONFIG_KVM_XICS
[KVM_DEV_TYPE_XICS] = kvm_xics_ops,
 #endif
-
-#ifdef CONFIG_KVM_VFIO
-   [KVM_DEV_TYPE_VFIO] = kvm_vfio_ops,
-#endif
 };
 
 int kvm_register_device_ops(struct kvm_device_ops *ops, u32 type)
diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
index ba1a93f935c7..bb11b36ee8a2 100644
--- a/virt/kvm/vfio.c
+++ b/virt/kvm/vfio.c
@@ -246,6 +246,16 @@ static void kvm_vfio_destroy(struct kvm_device *dev)
kfree(dev); /* alloc by kvm_ioctl_create_device, free by .destroy */
 }
 
+static int kvm_vfio_create(struct kvm_device *dev, u32 type);
+
+static struct kvm_device_ops kvm_vfio_ops = {
+   .name = kvm-vfio,
+   .create = kvm_vfio_create,
+   .destroy = kvm_vfio_destroy,
+   .set_attr = kvm_vfio_set_attr,
+   .has_attr = kvm_vfio_has_attr,
+};
+
 static int kvm_vfio_create(struct kvm_device *dev, u32 type)
 {
struct kvm_device *tmp;
@@ -268,10 +278,8 @@ static int kvm_vfio_create(struct kvm_device *dev, u32 
type)
return 0;
 }
 
-struct kvm_device_ops kvm_vfio_ops = {
-   .name = kvm-vfio,
-   .create = kvm_vfio_create,
-   .destroy = kvm_vfio_destroy,
-   .set_attr = kvm_vfio_set_attr,
-   .has_attr = kvm_vfio_has_attr,
-};
+static int __init kvm_vfio_ops_init(void)
+{
+   return kvm_register_device_ops(kvm_vfio_ops, KVM_DEV_TYPE_VFIO);
+}
+module_init(kvm_vfio_ops_init);
-- 
2.1.0

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] kvm tools: arm: remove register accessor macros now that they are in uapi

2014-08-29 Thread Will Deacon

The kernel now exposes register accessor macros in the uapi/ headers
for arm and arm64, so use those instead (and avoid the compile failure
from the duplicate definitions).

Signed-off-by: Will Deacon will.dea...@arm.com
---

Pekka -- please take this as a fix, since merging the 3.16 sources has
 caused some breakage for ARM. Cheers!

 tools/kvm/arm/aarch32/kvm-cpu.c | 15 +--
 tools/kvm/arm/aarch64/kvm-cpu.c | 15 ---
 2 files changed, 1 insertion(+), 29 deletions(-)

diff --git a/tools/kvm/arm/aarch32/kvm-cpu.c b/tools/kvm/arm/aarch32/kvm-cpu.c
index 464b473dc936..95fb1da5ba3d 100644
--- a/tools/kvm/arm/aarch32/kvm-cpu.c
+++ b/tools/kvm/arm/aarch32/kvm-cpu.c
@@ -7,25 +7,12 @@
 #define ARM_CORE_REG(x)(KVM_REG_ARM | KVM_REG_SIZE_U32 | 
KVM_REG_ARM_CORE | \
 KVM_REG_ARM_CORE_REG(x))
 
-#define ARM_CP15_REG_SHIFT_MASK(x,n)   \
-   (((x)  KVM_REG_ARM_ ## n ## _SHIFT)  KVM_REG_ARM_ ## n ## _MASK)
-
-#define __ARM_CP15_REG(op1,crn,crm,op2)\
-   (KVM_REG_ARM | KVM_REG_SIZE_U32 |   \
-(15  KVM_REG_ARM_COPROC_SHIFT)   |   \
-ARM_CP15_REG_SHIFT_MASK(op1, OPC1) |   \
-ARM_CP15_REG_SHIFT_MASK(crn, 32_CRN)   |   \
-ARM_CP15_REG_SHIFT_MASK(crm, CRM)  |   \
-ARM_CP15_REG_SHIFT_MASK(op2, 32_OPC2))
-
-#define ARM_CP15_REG(...)  __ARM_CP15_REG(__VA_ARGS__)
-
 unsigned long kvm_cpu__get_vcpu_mpidr(struct kvm_cpu *vcpu)
 {
struct kvm_one_reg reg;
u32 mpidr;
 
-   reg.id = ARM_CP15_REG(ARM_CPU_ID, ARM_CPU_ID_MPIDR);
+   reg.id = ARM_CP15_REG32(ARM_CPU_ID, ARM_CPU_ID_MPIDR);
reg.addr = (u64)(unsigned long)mpidr;
if (ioctl(vcpu-vcpu_fd, KVM_GET_ONE_REG, reg)  0)
die(KVM_GET_ONE_REG failed (get_mpidr vcpu%ld, vcpu-cpu_id);
diff --git a/tools/kvm/arm/aarch64/kvm-cpu.c b/tools/kvm/arm/aarch64/kvm-cpu.c
index 71a2a3a7789d..1b293748efd6 100644
--- a/tools/kvm/arm/aarch64/kvm-cpu.c
+++ b/tools/kvm/arm/aarch64/kvm-cpu.c
@@ -15,21 +15,6 @@
 #define ARM64_CORE_REG(x)  (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | \
 KVM_REG_ARM_CORE | KVM_REG_ARM_CORE_REG(x))
 
-#define ARM64_SYS_REG_SHIFT_MASK(x,n)  \
-   (((x)  KVM_REG_ARM64_SYSREG_ ## n ## _SHIFT) \
-KVM_REG_ARM64_SYSREG_ ## n ## _MASK)
-
-#define __ARM64_SYS_REG(op0,op1,crn,crm,op2)   \
-   (KVM_REG_ARM64 | KVM_REG_SIZE_U64   |   \
-KVM_REG_ARM64_SYSREG   |   \
-ARM64_SYS_REG_SHIFT_MASK(op0, OP0) |   \
-ARM64_SYS_REG_SHIFT_MASK(op1, OP1) |   \
-ARM64_SYS_REG_SHIFT_MASK(crn, CRN) |   \
-ARM64_SYS_REG_SHIFT_MASK(crm, CRM) |   \
-ARM64_SYS_REG_SHIFT_MASK(op2, OP2))
-
-#define ARM64_SYS_REG(...) __ARM64_SYS_REG(__VA_ARGS__)
-
 unsigned long kvm_cpu__get_vcpu_mpidr(struct kvm_cpu *vcpu)
 {
struct kvm_one_reg reg;
-- 
2.1.0

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 1/4] kvmtool: ARM: Use KVM_ARM_PREFERRED_TARGET vm ioctl to determine target cpu

2014-08-29 Thread Will Deacon

On Fri, Aug 29, 2014 at 10:10:52AM +0100, Andre Przywara wrote:
 (resent, that was the wrong account before ...)

Aha, and now your true identity has been revealed to all! Nice try Andre...
or should I say, Rienhard?

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/5] kvmtool: ARM64: Fix compile error for aarch64

2014-08-28 Thread Will Deacon

On Thu, Aug 28, 2014 at 10:56:29AM +0100, Pekka Enberg wrote:
 On 08/07/2014 12:12 PM, Will Deacon wrote:
  Ok. Pekka, could you merge in 3.16 to the kvmtool master branch please?
  You'll need my patch below to resolve some ARM build fallout.
 
 Done.

Thanks, Pekka! We'll give it a spin.

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC 4/5] vfio: amba: VFIO support for AMBA devices

2014-08-26 Thread Will Deacon

Hi Antonios,

On Fri, Aug 22, 2014 at 10:01:27AM +0100, Antonios Motakis wrote:
 Add support for discovering AMBA devices with VFIO and handle them
 similarly to Linux platform devices.

[...]

 +static struct amba_id pl330_ids[] = {
 + { 0, 0 },
 +};
 +
 +MODULE_DEVICE_TABLE(amba, pl330_ids);
 +
 +static struct amba_driver vfio_amba_driver = {
 + .probe = vfio_amba_probe,
 + .remove = vfio_amba_remove,
 + .id_table = pl330_ids,
 + .drv = {
 + .name = vfio-amba,
 + .owner = THIS_MODULE,
 + },
 +};

I don't understand what you're doing with the IDs here. What's the point in
the empty list?

This also raises a larger question about whether or not it's safe to allow
device passthrough of arbitrary platform devices with VFIO. In the absence
of a bus/device standard like PCI, I really think this should be in opt-in
decision, where certain platform drivers can declare that their device can
be safely used with passthrough.

Thoughts?

Will
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

1 2 3 4 >

1 - 100 of 399 matches

Mail list logo