Re: [PATCH 00/16] implement vNVDIMM

2015-07-02 Thread Xiao Guangrong



On 07/02/2015 02:17 PM, Michael S. Tsirkin wrote:
 On Wed, Jul 01, 2015 at 10:50:16PM +0800, Xiao Guangrong wrote:
   hw/acpi/aml-build.c |   32 +-
   hw/i386/acpi-build.c|9 +-
   hw/i386/acpi-dsdt.dsl   |2 +-
   hw/i386/pc.c|   11 +-
   hw/mem/Makefile.objs|1 +
   hw/mem/pc-nvdimm.c  | 1040 +++
   include/hw/acpi/aml-build.h |5 +-
   include/hw/mem/pc-nvdimm.h  |   56 +++
   8 files changed, 1149 insertions(+), 7 deletions(-)
   create mode 100644 hw/mem/pc-nvdimm.c
   create mode 100644 include/hw/mem/pc-nvdimm.h
 
 Given the amount of code, this is definitely not 2.4 material.
 Maybe others will have the time to review it before this, but
 in any case please remember to repost after 2.4 is out.

I see, thanks for your reminder, Michael!
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PULL] virtio/vhost: cross endian support

2015-07-02 Thread Michael S. Tsirkin
On Wed, Jul 01, 2015 at 12:02:50PM -0700, Linus Torvalds wrote:
 On Wed, Jul 1, 2015 at 2:31 AM, Michael S. Tsirkin m...@redhat.com wrote:
  virtio/vhost: cross endian support
 
 Ugh. Does this really have to be dynamic?
 
 Can't virtio do the sane thing, and just use a _fixed_ endianness?
 
 Doing a unconditional byte swap is faster and simpler than the crazy
 conditionals. That's true regardless of endianness, but gets to be
 even more so if the fixed endianness is little-endian, since BE is
 not-so-slowly fading from the world.
 
Linus

Yea, well - support for legacy BE guests on the new LE hosts is
exactly the motivation for this.

I dislike it too, but there are two redeeming properties that
made me merge this:

1.  It's a trivial amount of code: since we wrap host/guest accesses
anyway, almost all of it is well hidden from drivers.

2.  Sane platforms would never set flags like VHOST_CROSS_ENDIAN_LEGACY -
and when it's clear, there's zero overhead (at some point it was
tested by compiling with and without the patches, got the same
stripped binary).

Maybe we could create a Kconfig symbol to enforce point (2): prevent
people from enabling it e.g. on x86. I will look into this - but it can
be done by a patch on top, so I think this can be merged as is.

Or do you know of someone using a kernel with all config options enabled
indiscriminately?

Thanks,

-- 
MST


RE: [PATCH 1/7] KVM: api: add kvm_irq_routing_extended_msi

2015-07-02 Thread Pavel Fedin
 Hello!

 -Original Message-
 From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On Behalf 
 Of Eric Auger
 Sent: Monday, June 29, 2015 6:37 PM
 To: eric.au...@st.com; eric.au...@linaro.org; 
 linux-arm-ker...@lists.infradead.org;
 marc.zyng...@arm.com; christoffer.d...@linaro.org; andre.przyw...@arm.com;
 kvm...@lists.cs.columbia.edu; kvm@vger.kernel.org
 Cc: linux-ker...@vger.kernel.org; patc...@linaro.org; p.fe...@samsung.com; 
 pbonz...@redhat.com
 Subject: [PATCH 1/7] KVM: api: add kvm_irq_routing_extended_msi
 
 On ARM, the MSI msg (address and data) comes along with
 out-of-band device ID information. The device ID encodes the device
 that composes the MSI msg. Let's create a new routing entry type,
 dubbed KVM_IRQ_ROUTING_EXTENDED_MSI and use the __u32 pad space
 to convey the device ID.
 
 Signed-off-by: Eric Auger eric.au...@linaro.org
 
 ---
 
 RFC - PATCH
 - remove kvm_irq_routing_extended_msi and use union instead
 ---
  Documentation/virtual/kvm/api.txt | 9 -
  include/uapi/linux/kvm.h  | 6 +-
  2 files changed, 13 insertions(+), 2 deletions(-)
 
 diff --git a/Documentation/virtual/kvm/api.txt 
 b/Documentation/virtual/kvm/api.txt
 index d20fd94..6426ae9 100644
 --- a/Documentation/virtual/kvm/api.txt
 +++ b/Documentation/virtual/kvm/api.txt
 @@ -1414,7 +1414,10 @@ struct kvm_irq_routing_entry {
   __u32 gsi;
   __u32 type;
   __u32 flags;
 - __u32 pad;
 + union {
 + __u32 pad;
 + __u32 devid;
 + };
   union {
   struct kvm_irq_routing_irqchip irqchip;
   struct kvm_irq_routing_msi msi;

 devid is really part of the MSI data. Shouldn't it therefore be a member of
struct kvm_irq_routing_msi? That struct also has a reserved pad field.

 @@ -1427,6 +1430,10 @@ struct kvm_irq_routing_entry {
  #define KVM_IRQ_ROUTING_IRQCHIP 1
  #define KVM_IRQ_ROUTING_MSI 2
  #define KVM_IRQ_ROUTING_S390_ADAPTER 3
 +#define KVM_IRQ_ROUTING_EXTENDED_MSI 4
 +
 +In case of KVM_IRQ_ROUTING_EXTENDED_MSI routing type, devid is used to convey
 +the device ID.
 
  No flags are specified so far, the corresponding field must be set to zero.

What if we use a KVM_MSI_VALID_DEVID flag instead of the new
KVM_IRQ_ROUTING_EXTENDED_MSI definition? I believe this would make the API
more consistent and introduce fewer new definitions.

 
 diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
 index 2a23705..8484681 100644
 --- a/include/uapi/linux/kvm.h
 +++ b/include/uapi/linux/kvm.h
 @@ -841,12 +841,16 @@ struct kvm_irq_routing_s390_adapter {
  #define KVM_IRQ_ROUTING_IRQCHIP 1
  #define KVM_IRQ_ROUTING_MSI 2
  #define KVM_IRQ_ROUTING_S390_ADAPTER 3
 +#define KVM_IRQ_ROUTING_EXTENDED_MSI 4
 
  struct kvm_irq_routing_entry {
   __u32 gsi;
   __u32 type;
   __u32 flags;
 - __u32 pad;
 + union {
 + __u32 pad;
 + __u32 devid;
 + };
   union {
   struct kvm_irq_routing_irqchip irqchip;
   struct kvm_irq_routing_msi msi;
 --
 1.9.1
 

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia



Re: [PULL] virtio/vhost: cross endian support

2015-07-02 Thread Michael S. Tsirkin
On Wed, Jul 01, 2015 at 12:03:59PM -0700, Linus Torvalds wrote:
 On Wed, Jul 1, 2015 at 12:02 PM, Linus Torvalds
 torva...@linux-foundation.org wrote:
 
  Doing a unconditional byte swap is faster and simpler than the crazy
  conditionals.
 
 Unconditional endianness not only makes for simpler and faster code,
 it also ends up being easier to debug and add things like type
 annotations for sparse.
 
 Linus

At least this last one is well covered by these patches: this uses
separate sparse types so all accesses are statically verified by sparse
to use the correct accessor.

-- 
MST


RE: [PATCH 7/7] KVM: arm: implement kvm_set_msi by gsi direct mapping

2015-07-02 Thread Pavel Fedin
 Hello!

 -Original Message-
 From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On Behalf 
 Of Eric Auger
 Sent: Monday, June 29, 2015 6:37 PM
 To: eric.au...@st.com; eric.au...@linaro.org; 
 linux-arm-ker...@lists.infradead.org;
 marc.zyng...@arm.com; christoffer.d...@linaro.org; andre.przyw...@arm.com;
 kvm...@lists.cs.columbia.edu; kvm@vger.kernel.org
 Cc: linux-ker...@vger.kernel.org; patc...@linaro.org; p.fe...@samsung.com; 
 pbonz...@redhat.com
 Subject: [PATCH 7/7] KVM: arm: implement kvm_set_msi by gsi direct mapping
 
 If the ITS modality is not available, let's simply support MSI
 injection by transforming the MSI.data into an SPI ID.
 
 This becomes possible to use KVM_SIGNAL_MSI ioctl for arm too.
 
 Signed-off-by: Eric Auger eric.au...@linaro.org
 ---
  arch/arm/kvm/Kconfig | 1 +
  virt/kvm/arm/vgic.c  | 5 +
  2 files changed, 6 insertions(+)
 
 diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
 index 151e710..0f58baf 100644
 --- a/arch/arm/kvm/Kconfig
 +++ b/arch/arm/kvm/Kconfig
 @@ -31,6 +31,7 @@ config KVM
   select KVM_VFIO
   select HAVE_KVM_EVENTFD
   select HAVE_KVM_IRQFD
 + select HAVE_KVM_MSI
   select HAVE_KVM_IRQCHIP
   select HAVE_KVM_IRQ_ROUTING
   depends on ARM_VIRT_EXT && ARM_LPAE && ARM_ARCH_TIMER
 diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
 index 0b4c48c..b3c10dc 100644
 --- a/virt/kvm/arm/vgic.c
 +++ b/virt/kvm/arm/vgic.c
 @@ -2314,6 +2314,11 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
   return kvm->arch.vgic.vm_ops.inject_msi(kvm, msi);
   else
   return -ENODEV;
 + case KVM_IRQ_ROUTING_MSI:
 + if (kvm->arch.vgic.vm_ops.inject_msi)
 + return -EINVAL;
 + else
 + return kvm_vgic_inject_irq(kvm, 0, e->msi.data, level);

 Given the API change I suggest (using the KVM_MSI_VALID_DEVID flag), we could
get rid of all these if()s here. Just forward all parameters to the vGIC
implementation code and let it do its own checks.

   default:
   return -EINVAL;
   }
 --
 1.9.1
 

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




Re: [PATCH 00/16] implement vNVDIMM

2015-07-02 Thread Michael S. Tsirkin
On Wed, Jul 01, 2015 at 10:50:16PM +0800, Xiao Guangrong wrote:
  hw/acpi/aml-build.c |   32 +-
  hw/i386/acpi-build.c|9 +-
  hw/i386/acpi-dsdt.dsl   |2 +-
  hw/i386/pc.c|   11 +-
  hw/mem/Makefile.objs|1 +
  hw/mem/pc-nvdimm.c  | 1040 +++
  include/hw/acpi/aml-build.h |5 +-
  include/hw/mem/pc-nvdimm.h  |   56 +++
  8 files changed, 1149 insertions(+), 7 deletions(-)
  create mode 100644 hw/mem/pc-nvdimm.c
  create mode 100644 include/hw/mem/pc-nvdimm.h

Given the amount of code, this is definitely not 2.4 material.
Maybe others will have the time to review it before this, but
in any case please remember to repost after 2.4 is out.

Thanks!

 -- 
 2.1.0


Re: [PULL] virtio/vhost: cross endian support

2015-07-02 Thread Greg Kurz
On Thu, 2 Jul 2015 08:01:28 +0200
Michael S. Tsirkin m...@redhat.com wrote:

 On Wed, Jul 01, 2015 at 12:02:50PM -0700, Linus Torvalds wrote:
  On Wed, Jul 1, 2015 at 2:31 AM, Michael S. Tsirkin m...@redhat.com wrote:
   virtio/vhost: cross endian support
  
  Ugh. Does this really have to be dynamic?
  
  Can't virtio do the sane thing, and just use a _fixed_ endianness?
  
  Doing a unconditional byte swap is faster and simpler than the crazy
  conditionals. That's true regardless of endianness, but gets to be
  even more so if the fixed endianness is little-endian, since BE is
  not-so-slowly fading from the world.
  
 Linus
 
 Yea, well - support for legacy BE guests on the new LE hosts is
 exactly the motivation for this.
 
 I dislike it too, but there are two redeeming properties that
 made me merge this:
 
 1.  It's a trivial amount of code: since we wrap host/guest accesses
 anyway, almost all of it is well hidden from drivers.
 
 2.  Sane platforms would never set flags like VHOST_CROSS_ENDIAN_LEGACY -
 and when it's clear, there's zero overhead (as some point it was
 tested by compiling with and without the patches, got the same
 stripped binary).
 
 Maybe we could create a Kconfig symbol to enforce point (2): prevent
 people from enabling it e.g. on x86. I will look into this - but it can
 be done by a patch on top, so I think this can be merged as is.
 

This cross-endian *oddity* is targeting PowerPC book3s_64 processors... I
am not aware of any other users. Maybe create a symbol that would
be only selected by PPC_BOOK3S_64 ?


 Or do you know of someone using kernel with all config options enabled
 undiscriminately?
 
 Thanks,
 



Re: [PATCH] MAINTAINERS: separate section for s390 virtio drivers

2015-07-02 Thread Christian Borntraeger
Am 01.07.2015 um 17:15 schrieb Cornelia Huck:
 The s390-specific virtio drivers have probably more to do with virtio
 than with kvm today; let's move them out into a separate section to
 reflect this and to be able to add relevant mailing lists.
 
 CC: Christian Borntraeger borntrae...@de.ibm.com
 Signed-off-by: Cornelia Huck cornelia.h...@de.ibm.com

Acked-by: Christian Borntraeger borntrae...@de.ibm.com

 ---
  MAINTAINERS | 10 +-
  1 file changed, 9 insertions(+), 1 deletion(-)
 
 diff --git a/MAINTAINERS b/MAINTAINERS
 index 246d9d8..fca5c00 100644
 --- a/MAINTAINERS
 +++ b/MAINTAINERS
 @@ -5766,7 +5766,6 @@ S:  Supported
  F:   Documentation/s390/kvm.txt
  F:   arch/s390/include/asm/kvm*
  F:   arch/s390/kvm/
 -F:   drivers/s390/kvm/
 
  KERNEL VIRTUAL MACHINE (KVM) FOR ARM
  M:   Christoffer Dall christoffer.d...@linaro.org
 @@ -10671,6 +10670,15 @@ F:   drivers/block/virtio_blk.c
  F:   include/linux/virtio_*.h
  F:   include/uapi/linux/virtio_*.h
 
 +VIRTIO DRIVERS FOR S390
 +M:   Christian Borntraeger borntrae...@de.ibm.com
 +M:   Cornelia Huck cornelia.h...@de.ibm.com
 +L:   linux-s...@vger.kernel.org
 +L:   virtualizat...@lists.linux-foundation.org
 +L:   kvm@vger.kernel.org
 +S:   Supported
 +F:   drivers/s390/kvm/
 +
  VIRTIO HOST (VHOST)
  M:   Michael S. Tsirkin m...@redhat.com
  L:   kvm@vger.kernel.org
 



[PATCH] arm/run: don't enable KVM if system can't do it

2015-07-02 Thread Alex Bennée
As ARM (and no doubt other systems) can also run tests in pure TCG mode
we might as well not bother enabling accel=kvm if we aren't on a real
ARM based system. This prevents us seeing ugly warning messages when
testing TCG.

Signed-off-by: Alex Bennée alex.ben...@linaro.org
---
 arm/run | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arm/run b/arm/run
index 662a856..2bdb4be 100755
--- a/arm/run
+++ b/arm/run
@@ -33,7 +33,13 @@ if $qemu $M -chardev testdev,id=id -initrd . 2>&1 \
exit 2
 fi
 
-M='-machine virt,accel=kvm:tcg'
+host=`uname -m | sed -e 's/arm.*/arm/'`
+if [ ${host} = arm ] || [ ${host} = aarch64 ]; then
+M='-machine virt,accel=kvm:tcg'
+else
+M='-machine virt,accel=tcg'
+fi
+
 chr_testdev='-device virtio-serial-device'
 chr_testdev+=' -device virtconsole,chardev=ctd -chardev testdev,id=ctd'
 
-- 
2.4.5



Re: [PATCH v3 0/2] vhost: support more than 64 memory regions

2015-07-02 Thread Michael S. Tsirkin
On Wed, Jul 01, 2015 at 11:07:08AM +0200, Igor Mammedov wrote:
 changes since v2:
   * drop cache patches for now as suggested
   * add max_mem_regions module parameter instead of unconditionally
 increasing limit
   * drop bsearch patch since it's already queued

I get non-trivial conflicts with this - could you rebase it
so it applies to my tree please?

 References to previous versions:
 v2: https://lkml.org/lkml/2015/6/17/276
 v1: http://www.spinics.net/lists/kvm/msg117654.html
 
 Series allows to tweak vhost's memory regions count limit.
 
 It fixes VM crashing on memory hotplug due to vhost refusing
 accepting more than 64 memory regions with max_mem_regions
 set to more than 262 slots in default QEMU configuration.
 
 Igor Mammedov (2):
   vhost: extend memory regions allocation to vmalloc
   vhost: add max_mem_regions module parameter
 
  drivers/vhost/vhost.c | 30 +++---
  1 file changed, 23 insertions(+), 7 deletions(-)
 
 -- 
 1.8.3.1


Re: [PATCH 14/16] nvdimm: support NFIT_CMD_GET_CONFIG_SIZE function

2015-07-02 Thread Stefan Hajnoczi
On Wed, Jul 01, 2015 at 10:50:30PM +0800, Xiao Guangrong wrote:
 +static uint32_t dsm_cmd_config_size(struct dsm_buffer *in, struct dsm_out *out)
 +{
 +GSList *list = get_nvdimm_built_list();
 +PCNVDIMMDevice *nvdimm = get_nvdimm_device_by_handle(list, in->handle);
 +uint32_t status = NFIT_STATUS_NON_EXISTING_MEM_DEV;
 +
 +if (!nvdimm) {
 +goto exit;
 +}
 +
 +status = NFIT_STATUS_SUCCESS;
 +out->cmd_config_size.config_size = nvdimm->config_data_size;
 +out->cmd_config_size.max_xfer = max_xfer_config_size();

cpu_to_*() missing?

It should be possible to emulate NVDIMMs for a x86_64 guest on a
big-endian host, for example.


Re: [Qemu-devel] [PATCH 00/16] implement vNVDIMM

2015-07-02 Thread Michael S. Tsirkin
On Thu, Jul 02, 2015 at 09:31:23AM +0100, Stefan Hajnoczi wrote:
 On Thu, Jul 02, 2015 at 02:34:05PM +0800, Xiao Guangrong wrote:
  On 07/02/2015 02:17 PM, Michael S. Tsirkin wrote:
  On Wed, Jul 01, 2015 at 10:50:16PM +0800, Xiao Guangrong wrote:
hw/acpi/aml-build.c |   32 +-
hw/i386/acpi-build.c|9 +-
hw/i386/acpi-dsdt.dsl   |2 +-
hw/i386/pc.c|   11 +-
hw/mem/Makefile.objs|1 +
hw/mem/pc-nvdimm.c  | 1040 +++
include/hw/acpi/aml-build.h |5 +-
include/hw/mem/pc-nvdimm.h  |   56 +++
8 files changed, 1149 insertions(+), 7 deletions(-)
create mode 100644 hw/mem/pc-nvdimm.c
create mode 100644 include/hw/mem/pc-nvdimm.h
  
  Given the amount of code, this is definitely not 2.4 material.
  Maybe others will have the time to review it before this, but
  in any case please remember to repost after 2.4 is out.
  
  I see, thanks for your reminder, Michael!
 
 I will review the series now.
 
 Here is the QEMU release schedule:
 http://qemu-project.org/Planning/2.4
 
 Hard freeze - 7 July
 
 QEMU 2.4 release - 4 August
 
 It could be merged into a maintainer's tree when the -next branches are
 opened (it's up to each maintainer but for the block and net trees I do
 that at hard freeze time).

Absolutely, but I'm not sure I'll do a next tree this time around.

-- 
MST


RE: [PATCH 1/7] KVM: api: add kvm_irq_routing_extended_msi

2015-07-02 Thread Pavel Fedin
 Hello!

 What if we use KVM_MSI_VALID_DEVID flag instead of new 
 KVM_IRQ_ROUTING_EXTENDED_MSI
 definition? I
 believe this would make an API more consistent and introduce less new 
 definitions.

 I have just found one more flaw in your implementation. If you take a look at
irqfd_wakeup()...
--- cut ---
	/* An event has been signaled, inject an interrupt */
	if (irq.type == KVM_IRQ_ROUTING_MSI)
		kvm_set_msi(&irq, kvm, KVM_USERSPACE_IRQ_SOURCE_ID, 1,
				false);
	else
		schedule_work(&irqfd->inject);
--- cut ---
 You apparently missed KVM_IRQ_ROUTING_EXTENDED_MSI here, as well as in
irqfd_update(). But if you accept my API proposal, this becomes irrelevant.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




Re: [PULL] virtio/vhost: cross endian support

2015-07-02 Thread Michael S. Tsirkin
On Thu, Jul 02, 2015 at 11:12:56AM +0200, Greg Kurz wrote:
 On Thu, 2 Jul 2015 08:01:28 +0200
 Michael S. Tsirkin m...@redhat.com wrote:
 
  On Wed, Jul 01, 2015 at 12:02:50PM -0700, Linus Torvalds wrote:
   On Wed, Jul 1, 2015 at 2:31 AM, Michael S. Tsirkin m...@redhat.com 
   wrote:
virtio/vhost: cross endian support
   
   Ugh. Does this really have to be dynamic?
   
   Can't virtio do the sane thing, and just use a _fixed_ endianness?
   
   Doing a unconditional byte swap is faster and simpler than the crazy
   conditionals. That's true regardless of endianness, but gets to be
   even more so if the fixed endianness is little-endian, since BE is
   not-so-slowly fading from the world.
   
  Linus
  
  Yea, well - support for legacy BE guests on the new LE hosts is
  exactly the motivation for this.
  
  I dislike it too, but there are two redeeming properties that
  made me merge this:
  
  1.  It's a trivial amount of code: since we wrap host/guest accesses
  anyway, almost all of it is well hidden from drivers.
  
  2.  Sane platforms would never set flags like VHOST_CROSS_ENDIAN_LEGACY -
  and when it's clear, there's zero overhead (as some point it was
  tested by compiling with and without the patches, got the same
  stripped binary).
  
  Maybe we could create a Kconfig symbol to enforce point (2): prevent
  people from enabling it e.g. on x86. I will look into this - but it can
  be done by a patch on top, so I think this can be merged as is.
  
 
 This cross-endian *oddity* is targeting PowerPC book3s_64 processors... I
 am not aware of any other users. Maybe create a symbol that would
 be only selected by PPC_BOOK3S_64 ?

I think some ARM systems are trying to support cross-endian
configurations as well.

Besides that, yes, this is more or less what I had in mind.

 
  Or do you know of someone using kernel with all config options enabled
  undiscriminately?
  
  Thanks,
  


Re: [Qemu-devel] [PATCH 00/16] implement vNVDIMM

2015-07-02 Thread Paolo Bonzini


On 02/07/2015 11:20, Stefan Hajnoczi wrote:
  Currently, the NVDIMM driver has been merged into upstream Linux Kernel and
  this patchset tries to enable it in virtualization field
 
 From a device model perspective, have you checked whether it makes sense
 to integrate nvdimms into the pc-dimm and hostmem code that is used for
 memory hotplug and NUMA?
 
 The NVDIMM device in your patches is a completely new TYPE_DEVICE so it
 doesn't share any interfaces or code with existing memory devices.
 Maybe that is the right solution here because NVDIMMs have different
 characteristics, but I'm not sure.

The hostmem code should definitely be shared, e.g. by adding a new
file property to the memory-backend-file class.  ivshmem can also use
it---CCing Marc-André.

I don't know about the pc-dimm devices.  If the NVDIMM devices can do
_OST and can be hotplugged, then the answer is probably yes.

Paolo


Re: [Qemu-devel] [PATCH 00/16] implement vNVDIMM

2015-07-02 Thread Stefan Hajnoczi
On Thu, Jul 02, 2015 at 02:34:05PM +0800, Xiao Guangrong wrote:
 On 07/02/2015 02:17 PM, Michael S. Tsirkin wrote:
 On Wed, Jul 01, 2015 at 10:50:16PM +0800, Xiao Guangrong wrote:
   hw/acpi/aml-build.c |   32 +-
   hw/i386/acpi-build.c|9 +-
   hw/i386/acpi-dsdt.dsl   |2 +-
   hw/i386/pc.c|   11 +-
   hw/mem/Makefile.objs|1 +
   hw/mem/pc-nvdimm.c  | 1040 +++
   include/hw/acpi/aml-build.h |5 +-
   include/hw/mem/pc-nvdimm.h  |   56 +++
   8 files changed, 1149 insertions(+), 7 deletions(-)
   create mode 100644 hw/mem/pc-nvdimm.c
   create mode 100644 include/hw/mem/pc-nvdimm.h
 
 Given the amount of code, this is definitely not 2.4 material.
 Maybe others will have the time to review it before this, but
 in any case please remember to repost after 2.4 is out.
 
 I see, thanks for your reminder, Michael!

I will review the series now.

Here is the QEMU release schedule:
http://qemu-project.org/Planning/2.4

Hard freeze - 7 July

QEMU 2.4 release - 4 August

It could be merged into a maintainer's tree when the -next branches are
opened (it's up to each maintainer but for the block and net trees I do
that at hard freeze time).


Re: [PATCH v7 09/11] KVM: arm64: guest debug, HW assisted debug support

2015-07-02 Thread Will Deacon
Hi Alex,

On Wed, Jul 01, 2015 at 07:29:01PM +0100, Alex Bennée wrote:
 This adds support for userspace to control the HW debug registers for
 guest debug. In the debug ioctl we copy an IMPDEF registers into a new
 register set called host_debug_state.
 
 We use the recently introduced vcpu parameter debug_ptr to select which
 register set is copied into the real registers when world switch occurs.
 
 I've made some helper functions from hw_breakpoint.c more widely
 available for re-use.
 
 As with single step we need to tweak the guest registers to enable the
 exceptions so we need to save and restore those bits.
 
 Two new capabilities have been added to the KVM_EXTENSION ioctl to allow
 userspace to query the number of hardware break and watch points
 available on the host hardware.
 
 Signed-off-by: Alex Bennée alex.ben...@linaro.org
 Reviewed-by: Christoffer Dall christoffer.d...@linaro.org

[...]

 diff --git a/arch/arm64/kernel/hw_breakpoint.c 
 b/arch/arm64/kernel/hw_breakpoint.c
 index e7d934d..ac07f2a 100644
 --- a/arch/arm64/kernel/hw_breakpoint.c
 +++ b/arch/arm64/kernel/hw_breakpoint.c
 @@ -50,13 +50,13 @@ static int core_num_brps;
  static int core_num_wrps;
 
  /* Determine number of BRP registers available. */
 -static int get_num_brps(void)
 +int get_num_brps(void)
  {
 return ((read_cpuid(ID_AA64DFR0_EL1) >> 12) & 0xf) + 1;
  }
 
  /* Determine number of WRP registers available. */
 -static int get_num_wrps(void)
 +int get_num_wrps(void)
  {
 return ((read_cpuid(ID_AA64DFR0_EL1) >> 20) & 0xf) + 1;
  }

Sorry, just noticed this, but we already have a public interface for
figuring these numbers out as required by perf. Can't you use
hw_breakpoint_slots(...) instead of exposing these internal helpers?

Will


[PATCH v3] KVM: PPC: Book3S HV: Implement dynamic micro-threading on POWER8

2015-07-02 Thread Paul Mackerras
This builds on the ability to run more than one vcore on a physical
core by using the micro-threading (split-core) modes of the POWER8
chip.  Previously, only vcores from the same VM could be run together,
and (on POWER8) only if they had just one thread per core.  With the
ability to split the core on guest entry and unsplit it on guest exit,
we can run up to 8 vcpu threads from up to 4 different VMs, and we can
run multiple vcores with 2 or 4 vcpus per vcore.

Dynamic micro-threading is only available if the static configuration
of the cores is whole-core mode (unsplit), and only on POWER8.

To manage this, we introduce a new kvm_split_mode struct which is
shared across all of the subcores in the core, with a pointer in the
paca on each thread.  In addition we extend the core_info struct to
have information on each subcore.  When deciding whether to add a
vcore to the set already on the core, we now have two possibilities:
(a) piggyback the vcore onto an existing subcore, or (b) start a new
subcore.

Currently, when any vcpu needs to exit the guest and switch to host
virtual mode, we interrupt all the threads in all subcores and switch
the core back to whole-core mode.  It may be possible in future to
allow some of the subcores to keep executing in the guest while
subcore 0 switches to the host, but that is not implemented in this
patch.

This adds a module parameter called dynamic_mt_modes which controls
which micro-threading (split-core) modes the code will consider, as a
bitmap.  In other words, if it is 0, no micro-threading mode is
considered; if it is 2, only 2-way micro-threading is considered; if
it is 4, only 4-way, and if it is 6, both 2-way and 4-way
micro-threading mode will be considered.  The default is 6.

With this, we now have secondary threads which are the primary thread
for their subcore and therefore need to do the MMU switch.  These
threads will need to be started even if they have no vcpu to run, so
we use the vcore pointer in the PACA rather than the vcpu pointer to
trigger them.

It is now possible for thread 0 to find that an exit has been
requested before it gets to switch the subcore state to the guest.  In
that case we haven't added the guest's timebase offset to the
timebase, so we need to be careful not to subtract the offset in the
guest exit path.  In fact we just skip the whole path that switches
back to host context, since we haven't switched to the guest context.

Signed-off-by: Paul Mackerras pau...@samba.org
---
v3: Rename MAX_THREADS to MAX_SMT_THREADS to avoid a compile warning

 arch/powerpc/include/asm/kvm_book3s_asm.h |  20 ++
 arch/powerpc/include/asm/kvm_host.h   |   3 +
 arch/powerpc/kernel/asm-offsets.c |   7 +
 arch/powerpc/kvm/book3s_hv.c  | 367 ++
 arch/powerpc/kvm/book3s_hv_builtin.c  |  25 +-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S   | 113 +++--
 6 files changed, 473 insertions(+), 62 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s_asm.h 
b/arch/powerpc/include/asm/kvm_book3s_asm.h
index 5bdfb5d..57d5dfe 100644
--- a/arch/powerpc/include/asm/kvm_book3s_asm.h
+++ b/arch/powerpc/include/asm/kvm_book3s_asm.h
@@ -25,6 +25,12 @@
 #define XICS_MFRR  0xc
 #define XICS_IPI   2   /* interrupt source # for IPIs */
 
+/* Maximum number of threads per physical core */
+#define MAX_SMT_THREADS 8
+
+/* Maximum number of subcores per physical core */
+#define MAX_SUBCORES   4
+
 #ifdef __ASSEMBLY__
 
 #ifdef CONFIG_KVM_BOOK3S_HANDLER
@@ -65,6 +71,19 @@ kvmppc_resume_\intno:
 
 #else  /*__ASSEMBLY__ */
 
+struct kvmppc_vcore;
+
+/* Struct used for coordinating micro-threading (split-core) mode changes */
+struct kvm_split_mode {
+   unsigned long   rpr;
+   unsigned long   pmmar;
+   unsigned long   ldbar;
+   u8  subcore_size;
+   u8  do_nap;
+   u8  napped[MAX_SMT_THREADS];
+   struct kvmppc_vcore *master_vcs[MAX_SUBCORES];
+};
+
 /*
  * This struct goes in the PACA on 64-bit processors.  It is used
  * to store host state that needs to be saved when we enter a guest
@@ -100,6 +119,7 @@ struct kvmppc_host_state {
u64 host_spurr;
u64 host_dscr;
u64 dec_expires;
+   struct kvm_split_mode *kvm_split_mode;
 #endif
 #ifdef CONFIG_PPC_BOOK3S_64
u64 cfar;
diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 2b74490..80eb29a 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -302,6 +302,9 @@ struct kvmppc_vcore {
 #define VCORE_EXIT_MAP(vc) ((vc)->entry_exit_map >> 8)
 #define VCORE_IS_EXITING(vc)   (VCORE_EXIT_MAP(vc) != 0)
 
+/* This bit is used when a vcore exit is triggered from outside the vcore */
+#define VCORE_EXIT_REQ 0x1
+
 /*
  * Values for vcore_state.
  * Note that these are arranged such that lower values
diff --git 

[PATCH v3] KVM: PPC: Book3S HV: Implement dynamic micro-threading on POWER8

2015-07-02 Thread Paul Mackerras
This builds on the ability to run more than one vcore on a physical
core by using the micro-threading (split-core) modes of the POWER8
chip.  Previously, only vcores from the same VM could be run together,
and (on POWER8) only if they had just one thread per core.  With the
ability to split the core on guest entry and unsplit it on guest exit,
we can run up to 8 vcpu threads from up to 4 different VMs, and we can
run multiple vcores with 2 or 4 vcpus per vcore.

Dynamic micro-threading is only available if the static configuration
of the cores is whole-core mode (unsplit), and only on POWER8.

To manage this, we introduce a new kvm_split_mode struct which is
shared across all of the subcores in the core, with a pointer in the
paca on each thread.  In addition we extend the core_info struct to
have information on each subcore.  When deciding whether to add a
vcore to the set already on the core, we now have two possibilities:
(a) piggyback the vcore onto an existing subcore, or (b) start a new
subcore.
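
As a rough illustration of the thread-count arithmetic behind the piggyback-or-new-subcore choice, here is a minimal user-space sketch. Only MAX_SMT_THREADS and MAX_SUBCORES come from the patch; the helper names are hypothetical, and the sketch assumes the threads of a core are divided evenly among its subcores. The real decision in book3s_hv.c takes more state into account than this.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SMT_THREADS	8	/* threads per physical core (from the patch) */
#define MAX_SUBCORES	4	/* subcores per physical core (from the patch) */

/*
 * Hypothetical helper: hardware threads available to each subcore,
 * assuming an even split. Returns 0 for an unsupported subcore count.
 */
int threads_per_subcore(int n_subcores)
{
	assert(n_subcores > 0);
	if (n_subcores != 1 && n_subcores != 2 && n_subcores != MAX_SUBCORES)
		return 0;
	return MAX_SMT_THREADS / n_subcores;
}

/*
 * Hypothetical helper: can a vcore that needs vcore_threads threads
 * run inside a single subcore when the core is split n_subcores ways?
 */
bool vcore_fits_subcore(int n_subcores, int vcore_threads)
{
	int avail = threads_per_subcore(n_subcores);

	return avail > 0 && vcore_threads > 0 && vcore_threads <= avail;
}
```

Under these assumptions a 4-way split leaves 2 threads per subcore and a 2-way split leaves 4, which matches the "2 or 4 vcpus per vcore" figures in the description.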

Currently, when any vcpu needs to exit the guest and switch to host
virtual mode, we interrupt all the threads in all subcores and switch
the core back to whole-core mode.  It may be possible in future to
allow some of the subcores to keep executing in the guest while
subcore 0 switches to the host, but that is not implemented in this
patch.

This adds a module parameter called dynamic_mt_modes which controls
which micro-threading (split-core) modes the code will consider, as a
bitmap.  In other words, if it is 0, no micro-threading mode is
considered; if it is 2, only 2-way micro-threading is considered; if
it is 4, only 4-way, and if it is 6, both 2-way and 4-way
micro-threading mode will be considered.  The default is 6.
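
The bitmap encoding described above can be sketched in user space as follows. The parameter name dynamic_mt_modes is from the patch; the helper mt_mode_allowed is hypothetical and only illustrates how the values 0, 2, 4 and 6 would be interpreted.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: is n-way micro-threading enabled in the bitmap?
 * Per the description, value 2 selects 2-way and value 4 selects 4-way,
 * so the mode number doubles as its own bit mask.
 */
bool mt_mode_allowed(unsigned int dynamic_mt_modes, unsigned int n_way)
{
	assert(n_way == 2 || n_way == 4);	/* only the modes named in the text */
	return (dynamic_mt_modes & n_way) != 0;
}
```

With the default of 6, both mt_mode_allowed(6, 2) and mt_mode_allowed(6, 4) hold; with 0, neither does.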

With this, we now have secondary threads which are the primary thread
for their subcore and therefore need to do the MMU switch.  These
threads will need to be started even if they have no vcpu to run, so
we use the vcore pointer in the PACA rather than the vcpu pointer to
trigger them.

It is now possible for thread 0 to find that an exit has been
requested before it gets to switch the subcore state to the guest.  In
that case we haven't added the guest's timebase offset to the
timebase, so we need to be careful not to subtract the offset in the
guest exit path.  In fact we just skip the whole path that switches
back to host context, since we haven't switched to the guest context.

Signed-off-by: Paul Mackerras pau...@samba.org
---
v3: Rename MAX_THREADS to MAX_SMT_THREADS to avoid a compile warning

 arch/powerpc/include/asm/kvm_book3s_asm.h |  20 ++
 arch/powerpc/include/asm/kvm_host.h   |   3 +
 arch/powerpc/kernel/asm-offsets.c |   7 +
 arch/powerpc/kvm/book3s_hv.c  | 367 ++
 arch/powerpc/kvm/book3s_hv_builtin.c  |  25 +-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S   | 113 +++--
 6 files changed, 473 insertions(+), 62 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s_asm.h b/arch/powerpc/include/asm/kvm_book3s_asm.h
index 5bdfb5d..57d5dfe 100644
--- a/arch/powerpc/include/asm/kvm_book3s_asm.h
+++ b/arch/powerpc/include/asm/kvm_book3s_asm.h
@@ -25,6 +25,12 @@
 #define XICS_MFRR  0xc
 #define XICS_IPI   2   /* interrupt source # for IPIs */
 
+/* Maximum number of threads per physical core */
+#define MAX_SMT_THREADS	8
+
+/* Maximum number of subcores per physical core */
+#define MAX_SUBCORES   4
+
 #ifdef __ASSEMBLY__
 
 #ifdef CONFIG_KVM_BOOK3S_HANDLER
@@ -65,6 +71,19 @@ kvmppc_resume_\intno:
 
 #else  /*__ASSEMBLY__ */
 
+struct kvmppc_vcore;
+
+/* Struct used for coordinating micro-threading (split-core) mode changes */
+struct kvm_split_mode {
+   unsigned long   rpr;
+   unsigned long   pmmar;
+   unsigned long   ldbar;
+   u8  subcore_size;
+   u8  do_nap;
+   u8  napped[MAX_SMT_THREADS];
+   struct kvmppc_vcore *master_vcs[MAX_SUBCORES];
+};
+
 /*
  * This struct goes in the PACA on 64-bit processors.  It is used
  * to store host state that needs to be saved when we enter a guest
@@ -100,6 +119,7 @@ struct kvmppc_host_state {
u64 host_spurr;
u64 host_dscr;
u64 dec_expires;
+   struct kvm_split_mode *kvm_split_mode;
 #endif
 #ifdef CONFIG_PPC_BOOK3S_64
u64 cfar;
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 2b74490..80eb29a 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -302,6 +302,9 @@ struct kvmppc_vcore {
 #define VCORE_EXIT_MAP(vc)	((vc)->entry_exit_map >> 8)
 #define VCORE_IS_EXITING(vc)   (VCORE_EXIT_MAP(vc) != 0)
 
+/* This bit is used when a vcore exit is triggered from outside the vcore */
+#define VCORE_EXIT_REQ 0x1
+
 /*
  * Values for vcore_state.
  * Note that these are arranged such that lower values
diff --git 

Re: [Qemu-devel] [PATCH 00/16] implement vNVDIMM

2015-07-02 Thread Stefan Hajnoczi
On Wed, Jul 01, 2015 at 10:50:16PM +0800, Xiao Guangrong wrote:
> == Background ==
> NVDIMM (A Non-Volatile Dual In-line Memory Module) is going to be supported
> on Intel's platform. They are discovered via ACPI and configured by _DSM
> method of NVDIMM device in ACPI. There has some supporting documents which
> can be found at:
> ACPI 6: http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf
> NVDIMM Namespace: http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf
> DSM Interface Example: http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
> Driver Writer's Guide: http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf
>
> Currently, the NVDIMM driver has been merged into upstream Linux Kernel and
> this patchset tries to enable it in virtualization field

From a device model perspective, have you checked whether it makes sense
to integrate nvdimms into the pc-dimm and hostmem code that is used for
memory hotplug and NUMA?

The NVDIMM device in your patches is a completely new TYPE_DEVICE so it
doesn't share any interfaces or code with existing memory devices.
Maybe that is the right solution here because NVDIMMs have different
characteristics, but I'm not sure.




[RFC 04/17] VFIO: pci: initialize vfio_device_external_ops

2015-07-02 Thread Eric Auger
Signed-off-by: Eric Auger eric.au...@linaro.org

---

v6: creation
---
 drivers/vfio/pci/vfio_pci.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 964ad57..1e48125 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -929,6 +929,7 @@ static const struct vfio_device_ops vfio_pci_ops = {
.write  = vfio_pci_write,
.mmap   = vfio_pci_mmap,
.request= vfio_pci_request,
+   .external_ops   = NULL,
 };
 
 static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
-- 
1.9.1



Re: [RFC 16/17] KVM: eventfd: add irq bypass consumer management

2015-07-02 Thread Paolo Bonzini


On 02/07/2015 15:17, Eric Auger wrote:
> This patch adds the registration/unregistration of an
> irq_bypass_consumer on irqfd assignment/deassignment.
>
> Signed-off-by: Eric Auger eric.au...@linaro.org
> ---
>  virt/kvm/eventfd.c | 22 +++---
>  1 file changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
> index f3da161..425a47b 100644
> --- a/virt/kvm/eventfd.c
> +++ b/virt/kvm/eventfd.c
> @@ -34,6 +34,7 @@
>  #include <linux/srcu.h>
>  #include <linux/slab.h>
>  #include <linux/seqlock.h>
> +#include <linux/irqbypass.h>
>  #include <trace/events/kvm.h>
>  
>  #include <kvm/iodev.h>
> @@ -93,6 +94,7 @@ struct _irqfd {
>  	struct list_head list;
>  	poll_table pt;
>  	struct work_struct shutdown;
> +	struct irq_bypass_consumer *cons;
>  };
>  
>  static struct workqueue_struct *irqfd_cleanup_wq;
> @@ -429,7 +431,21 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
>  	 */
>  	fdput(f);
>  
> -	/* irq_bypass_register_consumer(); */
> +	irqfd->cons = kzalloc(sizeof(struct irq_bypass_consumer),
> +			      GFP_KERNEL);

Apart from the struct embedding technique I suggested in patch 12, this
looks very reasonable.  Thanks!

Paolo
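
Patch 12 is not part of this excerpt, so the following is only a guess at what the embedding technique refers to: the common kernel idiom of embedding one struct inside another and recovering the outer struct with container_of(), instead of allocating the inner struct separately. The struct layouts and the helper name below are simplified, hypothetical stand-ins, written for user space.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified, hypothetical stand-ins for the structures in the patch. */
struct consumer {
	int gsi;
};

struct irqfd {
	int id;
	struct consumer cons;	/* embedded: no separate allocation to fail or free */
};

/* Recover the enclosing irqfd from a pointer to its embedded consumer. */
struct irqfd *irqfd_from_consumer(struct consumer *cons)
{
	assert(cons != NULL);
	return container_of(cons, struct irqfd, cons);
}
```

With this layout the kzalloc()/kfree() pair and the -ENOMEM path for the consumer would no longer be needed, since the consumer lives and dies with the irqfd.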

> +	if (!irqfd->cons) {
> +		ret = -ENOMEM;
> +		goto fail;
> +	}
> +	irqfd->cons->token = (void *)irqfd->eventfd;
> +	irqfd->cons->gsi = irqfd->gsi;
> +	irqfd->cons->kvm = kvm;
> +	irqfd->cons->add_producer = kvm_arch_add_producer;
> +	irqfd->cons->del_producer = kvm_arch_del_producer;
> +	irqfd->cons->stop_consumer = kvm_arch_stop_consumer;
> +	irqfd->cons->resume_consumer = kvm_arch_resume_consumer;
> +	ret = irq_bypass_register_consumer(irqfd->cons);
> +	WARN_ON(ret);
>  
>  	return 0;
>  
> @@ -530,8 +546,6 @@ kvm_irqfd_deassign(struct kvm *kvm, struct kvm_irqfd *args)
>  	struct _irqfd *irqfd, *tmp;
>  	struct eventfd_ctx *eventfd;
>  
> -	/* irq_bypass_unregister_consumer() */
> -
>  	eventfd = eventfd_ctx_fdget(args->fd);
>  	if (IS_ERR(eventfd))
>  		return PTR_ERR(eventfd);
> @@ -550,6 +564,8 @@ kvm_irqfd_deassign(struct kvm *kvm, struct kvm_irqfd *args)
>  			irqfd->irq_entry.type = 0;
>  			write_seqcount_end(&irqfd->irq_entry_sc);
>  			irqfd_deactivate(irqfd);
> +			irq_bypass_unregister_consumer(irqfd->cons);
> +			kfree(irqfd->cons);
>  		}
>  	}
>  
 


Re: [PATCH] arm/run: don't enable KVM if system can't do it

2015-07-02 Thread Andrew Jones
On Thu, Jul 02, 2015 at 03:45:17PM +0200, Paolo Bonzini wrote:
 
 
> On 02/07/2015 13:51, Andrew Jones wrote:
> > 4) I recently mentioned[*] it might be nice to add a '-force-tcg' type
> >    of arm/run command line option, allowing tcg to be used even if
> >    it's possible to use kvm. Adding that at the same time would be
> >    nice.
>
> Can you just use --no-kvm?  It is equivalent to -machine accel=tcg,

Sounds perfect. Thanks!

> and it overrides previous -machine accel=foo options.
>
> Paolo
>
> ps: I also share the yay feeling, of course!


Re: [RFC 16/17] KVM: eventfd: add irq bypass consumer management

2015-07-02 Thread Eric Auger
On 07/02/2015 03:42 PM, Paolo Bonzini wrote:
 
 
> On 02/07/2015 15:17, Eric Auger wrote:
>> This patch adds the registration/unregistration of an
>> irq_bypass_consumer on irqfd assignment/deassignment.
>>
>> Signed-off-by: Eric Auger eric.au...@linaro.org
>> ---
>>  virt/kvm/eventfd.c | 22 +++---
>>  1 file changed, 19 insertions(+), 3 deletions(-)
>>
>> diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
>> index f3da161..425a47b 100644
>> --- a/virt/kvm/eventfd.c
>> +++ b/virt/kvm/eventfd.c
>> @@ -34,6 +34,7 @@
>>  #include <linux/srcu.h>
>>  #include <linux/slab.h>
>>  #include <linux/seqlock.h>
>> +#include <linux/irqbypass.h>
>>  #include <trace/events/kvm.h>
>>  
>>  #include <kvm/iodev.h>
>> @@ -93,6 +94,7 @@ struct _irqfd {
>>  	struct list_head list;
>>  	poll_table pt;
>>  	struct work_struct shutdown;
>> +	struct irq_bypass_consumer *cons;
>>  };
>>  
>>  static struct workqueue_struct *irqfd_cleanup_wq;
>> @@ -429,7 +431,21 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
>>  	 */
>>  	fdput(f);
>>  
>> -	/* irq_bypass_register_consumer(); */
>> +	irqfd->cons = kzalloc(sizeof(struct irq_bypass_consumer),
>> +			      GFP_KERNEL);
>
> Apart from the struct embedding technique I suggested in patch 12, this
> looks very reasonable.  Thanks!

Hi Paolo,

thanks for the swift feedback. I will respin shortly with the advised
embedding technique.

Best Regards

Eric
 
> Paolo
>
>> +	if (!irqfd->cons) {
>> +		ret = -ENOMEM;
>> +		goto fail;
>> +	}
>> +	irqfd->cons->token = (void *)irqfd->eventfd;
>> +	irqfd->cons->gsi = irqfd->gsi;
>> +	irqfd->cons->kvm = kvm;
>> +	irqfd->cons->add_producer = kvm_arch_add_producer;
>> +	irqfd->cons->del_producer = kvm_arch_del_producer;
>> +	irqfd->cons->stop_consumer = kvm_arch_stop_consumer;
>> +	irqfd->cons->resume_consumer = kvm_arch_resume_consumer;
>> +	ret = irq_bypass_register_consumer(irqfd->cons);
>> +	WARN_ON(ret);
>>  
>>  	return 0;
>>  
>> @@ -530,8 +546,6 @@ kvm_irqfd_deassign(struct kvm *kvm, struct kvm_irqfd *args)
>>  	struct _irqfd *irqfd, *tmp;
>>  	struct eventfd_ctx *eventfd;
>>  
>> -	/* irq_bypass_unregister_consumer() */
>> -
>>  	eventfd = eventfd_ctx_fdget(args->fd);
>>  	if (IS_ERR(eventfd))
>>  		return PTR_ERR(eventfd);
>> @@ -550,6 +564,8 @@ kvm_irqfd_deassign(struct kvm *kvm, struct kvm_irqfd *args)
>>  			irqfd->irq_entry.type = 0;
>>  			write_seqcount_end(&irqfd->irq_entry_sc);
>>  			irqfd_deactivate(irqfd);
>> +			irq_bypass_unregister_consumer(irqfd->cons);
>> +			kfree(irqfd->cons);
>>  		}
>>  	}
>>  




Re: [PATCH] arm/run: don't enable KVM if system can't do it

2015-07-02 Thread Andrew Jones
On Thu, Jul 02, 2015 at 02:17:18PM +0100, Alex Bennée wrote:
 
> Andrew Jones drjo...@redhat.com writes:
>
> > On Thu, Jul 02, 2015 at 12:05:31PM +0100, Alex Bennée wrote:
> >> As ARM (and no doubt other systems) can also run tests in pure TCG mode
> >> we might as well not bother enabling accel=kvm if we aren't on a real
> >> ARM based system. This prevents us seeing ugly warning messages when
> >> testing TCG.
> >
> > First,
> > YAY! We're getting contributions to kvm-unit-tests/arm!
>
> :-) well so far I've been noodling about looking at it for KVM Guest
> Debug testing. I've a hideous branch on github that attempts to test
> exercise the debug register trapping code. However that falls down as I
> really need to find an easy way of attaching GDB to the qemu-gdb stub
> while the test is running.
>
> However with the TCG multi-thread work coming up I certainly see the
> need to exercise QEMU in a way that the internal TCG test code might
> have trouble with.
>
>
> >>
> >> Signed-off-by: Alex Bennée alex.ben...@linaro.org
> >> ---
> >>  arm/run | 8 +++-
> >>  1 file changed, 7 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/arm/run b/arm/run
> >> index 662a856..2bdb4be 100755
> >> --- a/arm/run
> >> +++ b/arm/run
> >> @@ -33,7 +33,13 @@ if $qemu $M -chardev testdev,id=id -initrd . 2>&1 \
> >>  	exit 2
> >>  fi
> >>  
> >> -M='-machine virt,accel=kvm:tcg'
> >> +host=`uname -m | sed -e 's/arm.*/arm/'`
> >> +if [ ${host} = arm ] || [ ${host} = aarch64 ]; then
> >> +M='-machine virt,accel=kvm:tcg'
> >> +else
> >> +M='-machine virt,accel=tcg'
> >> +fi
> >
> > I think this is a good idea, although I had actually left that warning
> > on purpose. Originally, the plan was for these unit tests to be kvm
> > specific. If they could be developed with the aid of tcg, and even used
> > to test tcg, then fine, but running them on tcg should always complain,
> > in order to make sure that the test output clearly showed that it had
> > not been running on kvm. Developing unit tests for tcg is also a good
> > idea though, and there's really no reason not to share this framework.
> >
> > So, for this patch I'd prefer we do a few things differently;
> >
> > 1) we should be able to integrate this new condition with the
> >    arm64 must use '-cpu host' with kvm condition that is lower down.
> >    And, let's just make this $HOST variable one that ./configure
> >    prepares, allowing that arm64 condition to s/$(arch)/$HOST/ and
> >    avoiding the need to duplicate the sed -e 's/arm.*/arm/'
>
> Yeah makes sense.
>
>
> > 2) we might as well do something like
> >
> >    M='-machine virt'
> >    if using-kvm
> >      M+=',accel=kvm'
> >    else
> >      M+=',accel=tcg'
> >    fi
> >
> >    now, since we don't want to use the accel fallback feature anymore
> >
> > 3) outputting which one we're using might still be nice, otherwise
> >    one must inspect the qemu command line in the logs to find out
> >
> > 4) I recently mentioned[*] it might be nice to add a '-force-tcg' type
> >    of arm/run command line option, allowing tcg to be used even if
> >    it's possible to use kvm. Adding that at the same time would be
> >    nice.
>
> Would it also be useful for other arches? Does run-tests.sh pass 

Maybe someday, so we might as well add it there. As long as it allows
current command lines to keep working as they have, then why not.

 
> > 5) we use tabs for indentation in arm/run, and only bother with the
> >    variable's {}, if necessary
>
> My shell quoting was rusty. I think $(host) was calling the host command
> for some reason.

Yes, $(cmd) executes cmd. ${var} is correct, but only necessary if you're
substituting a substring. For example

X=FOO
echo ${X}_BAR will echo FOO_BAR, but echo $X_BAR will echo whatever
the variable X_BAR is. It's not necessary to use the {} in most cases
though, space and some other characters, like /, automatically end
the variable name.

 
 
> > 6) we should post patches with [kvm-unit-tests PATCH] to avoid
> >    confusion with other kvm postings. (I screwed that up on my
> >    last two postings...).
>
> /me ponders if he can just config git for that.

You can. Add

[format]
subjectprefix = kvm-unit-tests PATCH

to your kvm-unit-tests/.git/config. I just hadn't bothered until now...

 
> I'll patch the readme ;-)

Contributing code !AND! updating the readme! Double YAY!

Thanks,
drew


Re: [PATCH 9/9] qemu/kvm: kvm hyper-v based guest crash event handling

2015-07-02 Thread Paolo Bonzini


On 02/07/2015 15:19, Andrey Smetanin wrote:
> > > +    if (has_msr_hv_crash) {
> > > +        env->msr_hv_crash_ctl = HV_X64_MSR_CRASH_CTL_NOTIFY;
> >
> > The value is always host-defined, so I think it doesn't need a field in
> > CPUX86State.  On the other hand, this:
> The kernel just works with that value; the kernel doesn't set it up.
> User space is allowed to set up this MSR if the QEMU option hv-crash is
> on. So the code env->msr_hv_crash_ctl = HV_X64_MSR_CRASH_CTL_NOTIFY;
> sets up the MSR in user space at CPU reset. When the CPU sets up its
> registers, these MSR values are uploaded into the kernel.
>
> Anyway, we need code that initially sets up the crash ctl MSR with the
> value HV_X64_MSR_CRASH_CTL_NOTIFY, and I think that code should be in
> user space. Any objections?

Yes, that's correct.

What I'm saying, is that the value can be hard-coded and doesn't need a
field in CPUX86State.  If you want to leave the field that's also okay,
but even then it should not be part of the migration state.

Paolo


[RFC 16/17] KVM: eventfd: add irq bypass consumer management

2015-07-02 Thread Eric Auger
This patch adds the registration/unregistration of an
irq_bypass_consumer on irqfd assignment/deassignment.

Signed-off-by: Eric Auger eric.au...@linaro.org
---
 virt/kvm/eventfd.c | 22 +++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index f3da161..425a47b 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -34,6 +34,7 @@
 #include <linux/srcu.h>
 #include <linux/slab.h>
 #include <linux/seqlock.h>
+#include <linux/irqbypass.h>
 #include <trace/events/kvm.h>
 
 #include <kvm/iodev.h>
@@ -93,6 +94,7 @@ struct _irqfd {
 	struct list_head list;
 	poll_table pt;
 	struct work_struct shutdown;
+	struct irq_bypass_consumer *cons;
 };
 
 static struct workqueue_struct *irqfd_cleanup_wq;
@@ -429,7 +431,21 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
 	 */
 	fdput(f);
 
-	/* irq_bypass_register_consumer(); */
+	irqfd->cons = kzalloc(sizeof(struct irq_bypass_consumer),
+			      GFP_KERNEL);
+	if (!irqfd->cons) {
+		ret = -ENOMEM;
+		goto fail;
+	}
+	irqfd->cons->token = (void *)irqfd->eventfd;
+	irqfd->cons->gsi = irqfd->gsi;
+	irqfd->cons->kvm = kvm;
+	irqfd->cons->add_producer = kvm_arch_add_producer;
+	irqfd->cons->del_producer = kvm_arch_del_producer;
+	irqfd->cons->stop_consumer = kvm_arch_stop_consumer;
+	irqfd->cons->resume_consumer = kvm_arch_resume_consumer;
+	ret = irq_bypass_register_consumer(irqfd->cons);
+	WARN_ON(ret);
 
 	return 0;
 
@@ -530,8 +546,6 @@ kvm_irqfd_deassign(struct kvm *kvm, struct kvm_irqfd *args)
 	struct _irqfd *irqfd, *tmp;
 	struct eventfd_ctx *eventfd;
 
-	/* irq_bypass_unregister_consumer() */
-
 	eventfd = eventfd_ctx_fdget(args->fd);
 	if (IS_ERR(eventfd))
 		return PTR_ERR(eventfd);
@@ -550,6 +564,8 @@ kvm_irqfd_deassign(struct kvm *kvm, struct kvm_irqfd *args)
 			irqfd->irq_entry.type = 0;
 			write_seqcount_end(&irqfd->irq_entry_sc);
 			irqfd_deactivate(irqfd);
+			irq_bypass_unregister_consumer(irqfd->cons);
+			kfree(irqfd->cons);
 		}
 	}
 
-- 
1.9.1



Re: [PATCH 9/9] qemu/kvm: kvm hyper-v based guest crash event handling

2015-07-02 Thread Andrey Smetanin
On Wed, 2015-07-01 at 17:07 +0200, Paolo Bonzini wrote:
 
> On 30/06/2015 13:33, Denis V. Lunev wrote:
> >
> > +static int kvm_arch_handle_hv_crash(CPUState *cs)
> > +{
> > +    X86CPU *cpu = X86_CPU(cs);
> > +    CPUX86State *env = &cpu->env;
> > +
> > +    /* Mark that Hyper-v guest crash occurred */
> > +    env->hv_crash_occurred = 1;
>
> This need not be a hv crash.  You can add crash_occurred to CPUState
> directly, and set it in qemu_system_guest_panicked:
>
>     if (current_cpu) {
>         current_cpu->crash_occurred = true;
>     }
>
> Then you would add two subsections: one for crash_occurred in exec.c
> (attached to vmstate_cpu_common), one for hyperv crash params in
> target-i386/machine.c.
>
> This also gives an idea about splitting the patch: first the
> introduction of qemu_system_guest_panicked and crash_occurred, second
> the Hyper-V specific bits.
>
> > +    if (cpu->hyperv_crash) {
> > +        c->edx |= HV_X64_GUEST_CRASH_MSR_AVAILABLE;
> > +        has_msr_hv_crash = true;
>
> You can only set this to true if the kernel also supports the MSRs.
>
> > +    }
> > +
> >      c = &cpuid_data.entries[cpuid_i++];
> >      c->function = HYPERV_CPUID_ENLIGHTMENT_INFO;
> >      if (cpu->hyperv_relaxed_timing) {
> > @@ -761,6 +767,10 @@ void kvm_arch_reset_vcpu(X86CPU *cpu)
> >      } else {
> >          env->mp_state = KVM_MP_STATE_RUNNABLE;
> >      }
> > +    if (has_msr_hv_crash) {
> > +        env->msr_hv_crash_ctl = HV_X64_MSR_CRASH_CTL_NOTIFY;
>
> The value is always host-defined, so I think it doesn't need a field in
> CPUX86State.  On the other hand, this:
The kernel just works with that value; the kernel doesn't set it up.
User space is allowed to set up this MSR if the QEMU option hv-crash is
on. So the code env->msr_hv_crash_ctl = HV_X64_MSR_CRASH_CTL_NOTIFY;
sets up the MSR in user space at CPU reset. When the CPU sets up its
registers, these MSR values are uploaded into the kernel.

Anyway, we need code that initially sets up the crash ctl MSR with the
value HV_X64_MSR_CRASH_CTL_NOTIFY, and I think that code should be in
user space. Any objections?
 
 
> > +static bool hyperv_crash_enable_needed(void *opaque)
> > +{
> > +    X86CPU *cpu = opaque;
> > +    CPUX86State *env = &cpu->env;
> > +
> > +    return (env->msr_hv_crash_ctl & HV_X64_MSR_CRASH_CTL_CONTENTS) ?
> > +            true : false;
> > +}
> > +
>
> can just check if any of the params fields is nonzero.
 
If we set up the crash ctl MSR from user space, we need to migrate it.
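
For reference, the alternative quoted above (checking whether any of the params fields is nonzero) could be sketched roughly as follows. The function name, the array argument and the count of five parameters are assumptions made for illustration; none of them is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CRASH_PARAMS 5	/* assumed number of crash parameter registers */

/*
 * Hypothetical predicate: the migration subsection is needed only if
 * at least one crash parameter holds a nonzero value.
 */
bool crash_params_needed(const uint64_t params[CRASH_PARAMS])
{
	assert(params != NULL);
	for (size_t i = 0; i < CRASH_PARAMS; i++) {
		if (params[i] != 0)
			return true;
	}
	return false;
}
```

A predicate of this shape depends only on guest-written state, so it would not require the control MSR value itself to be part of the migration stream.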

> Thanks,
>
> Paolo
>
> > +    env->hv_crash_occurred = 0;
> > +}




[RFC 15/17] KVM: arm/arm64: implement IRQ bypass consumer functions

2015-07-02 Thread Eric Auger
- kvm_arch_add_producer: perform VGIC/irqchip settings for forwarding
- kvm_arch_del_producer: same for the inverse operation
- kvm_arch_stop_consumer: halt guest execution
- kvm_arch_resume_consumer: resume guest execution
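
To show how a caller might drive these four callbacks, here is a small user-space sketch. The struct layouts, the demo callbacks and the connect/disconnect helpers are all hypothetical, and the ordering used (halt the guest, change forwarding, resume) is an assumption rather than something this patch states.

```c
#include <assert.h>
#include <stddef.h>

struct producer {
	int irq;
};

/* Hypothetical, simplified consumer: callbacks plus the state they touch. */
struct consumer {
	int gsi;
	int halted;		/* 1 while the guest is stopped */
	int forwarded_irq;	/* -1 when no producer is attached */
	void (*add_producer)(struct consumer *cons, struct producer *prod);
	void (*del_producer)(struct consumer *cons, struct producer *prod);
	void (*stop_consumer)(struct consumer *cons);
	void (*resume_consumer)(struct consumer *cons);
};

void demo_add_producer(struct consumer *cons, struct producer *prod)
{
	assert(cons->halted);	/* assumed: forwarding changes only while halted */
	cons->forwarded_irq = prod->irq;
}

void demo_del_producer(struct consumer *cons, struct producer *prod)
{
	(void)prod;
	assert(cons->halted);
	cons->forwarded_irq = -1;
}

void demo_stop(struct consumer *cons)   { cons->halted = 1; }
void demo_resume(struct consumer *cons) { cons->halted = 0; }

void consumer_init(struct consumer *cons, int gsi)
{
	cons->gsi = gsi;
	cons->halted = 0;
	cons->forwarded_irq = -1;
	cons->add_producer = demo_add_producer;
	cons->del_producer = demo_del_producer;
	cons->stop_consumer = demo_stop;
	cons->resume_consumer = demo_resume;
}

/* Assumed ordering: stop the consumer, attach the producer, resume. */
void connect_producer(struct consumer *cons, struct producer *prod)
{
	assert(cons != NULL && prod != NULL);
	cons->stop_consumer(cons);
	cons->add_producer(cons, prod);
	cons->resume_consumer(cons);
}

/* Assumed ordering: stop the consumer, detach the producer, resume. */
void disconnect_producer(struct consumer *cons, struct producer *prod)
{
	assert(cons != NULL && prod != NULL);
	cons->stop_consumer(cons);
	cons->del_producer(cons, prod);
	cons->resume_consumer(cons);
}
```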

Signed-off-by: Eric Auger eric.au...@linaro.org
---
 arch/arm/kvm/arm.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 4be6715..f9b9b1e 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -1146,6 +1146,28 @@ struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr)
 	return NULL;
 }
 
+void kvm_arch_add_producer(struct irq_bypass_consumer *cons,
+			   struct irq_bypass_producer *prod)
+{
+	kvm_vgic_set_forward(cons->kvm, prod->irq, cons->gsi);
+}
+void kvm_arch_del_producer(struct irq_bypass_consumer *cons,
+			   struct irq_bypass_producer *prod)
+{
+	kvm_vgic_unset_forward(cons->kvm, prod->irq, cons->gsi,
+			       prod->active);
+}
+
+void kvm_arch_stop_consumer(struct irq_bypass_consumer *cons)
+{
+	kvm_arm_halt_guest(cons->kvm);
+}
+
+void kvm_arch_resume_consumer(struct irq_bypass_consumer *cons)
+{
+	kvm_arm_resume_guest(cons->kvm);
+}
+
 /**
  * Initialize Hyp-mode and memory mappings on all CPUs.
  */
-- 
1.9.1



[PATCH v4 2/2] vhost: add max_mem_regions module parameter

2015-07-02 Thread Igor Mammedov
It became possible to use a larger number of memory
slots, which memory hotplug uses for
registering hotplugged memory.
However, QEMU crashes if it is used with more than ~60
pc-dimm devices and vhost-net enabled, since the host kernel's
vhost-net module refuses to accept more than 64
memory regions.

Allow tweaking the limit via the max_mem_regions module parameter,
with the default value set to 64 slots.
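
The effect of the limit can be sketched in user space as follows. The function check_nregions and its explicit limit argument are hypothetical; only the default of 64 and the -E2BIG result come from the patch.

```c
#include <assert.h>
#include <errno.h>

#define DEFAULT_MAX_MEM_REGIONS 64	/* default value from the patch */

/*
 * Hypothetical user-space mirror of the limit check: returns 0 if a
 * memory map with nregions entries is accepted, -E2BIG if it has more
 * entries than the configured limit.
 */
int check_nregions(unsigned int nregions, unsigned int limit)
{
	assert(limit > 0);	/* a zero limit would reject every non-empty map */
	if (nregions > limit)
		return -E2BIG;
	return 0;
}
```

With the default, a map of exactly 64 regions is still accepted and 65 is the first count rejected; raising the limit moves that boundary without any other code change.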

Signed-off-by: Igor Mammedov imamm...@redhat.com
---
 drivers/vhost/vhost.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 6488011..9a68e2e 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -29,8 +29,12 @@
 
 #include "vhost.h"
 
+static ushort max_mem_regions = 64;
+module_param(max_mem_regions, ushort, 0444);
+MODULE_PARM_DESC(max_mem_regions,
+	"Maximum number of memory regions in memory map. (default: 64)");
+
 enum {
-	VHOST_MEMORY_MAX_NREGIONS = 64,
 	VHOST_MEMORY_F_LOG = 0x1,
 };
 
@@ -696,7 +700,7 @@ static long vhost_set_memory(struct vhost_dev *d, struct vhost_memory __user *m)
 		return -EFAULT;
 	if (mem.padding)
 		return -EOPNOTSUPP;
-	if (mem.nregions > VHOST_MEMORY_MAX_NREGIONS)
+	if (mem.nregions > max_mem_regions)
 		return -E2BIG;
 	newmem = vhost_kvzalloc(size + mem.nregions * sizeof(*m->regions));
 	if (!newmem)
-- 
1.8.3.1



[PATCH v4 1/2] vhost: extend memory regions allocation to vmalloc

2015-07-02 Thread Igor Mammedov
With a large number of memory regions we could end up with
high-order allocations, and kmalloc could fail if the
host is under memory pressure.
Considering that the memory regions array is used on the hot path,
try harder to allocate using kmalloc and, if that fails, fall back
to vmalloc.
That is still better than failing vhost_set_memory() and
crashing the guest when new memory is hotplugged
into it.
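
The try-one-allocator-then-fall-back pattern described above can be sketched in user space as follows. All names here are hypothetical, and the two allocators are passed in as function pointers purely so the fallback path can be exercised; this is not the kernel helper from the patch.

```c
#include <assert.h>
#include <stdlib.h>

typedef void *(*alloc_fn)(size_t size);

/*
 * Try the preferred allocator first and fall back to the second one
 * if it fails. Returns NULL only if both fail, so callers can use a
 * plain NULL check.
 */
void *alloc_with_fallback(size_t size, alloc_fn preferred, alloc_fn fallback)
{
	void *p;

	assert(preferred != NULL && fallback != NULL);
	p = preferred(size);
	if (!p)
		p = fallback(size);
	return p;
}

/* Hypothetical allocators used only for demonstration. */
void *always_fails(size_t size)
{
	(void)size;
	return NULL;
}

void *zeroed_alloc(size_t size)
{
	return calloc(1, size);
}
```

One design point worth noting: this sketch reports failure as NULL, which pairs with a caller that tests for NULL. A helper that reports failure as an error-encoded pointer instead would need a matching error-pointer check in its caller.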

I'll still look at a QEMU-side solution to reduce the number of
memory regions it feeds to vhost to make things even better,
but it doesn't hurt for the kernel to behave smarter and not
crash older QEMUs, which could use a large number of memory
regions.

Signed-off-by: Igor Mammedov imamm...@redhat.com
---
 drivers/vhost/vhost.c | 20 
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 71bb468..6488011 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -544,7 +544,7 @@ void vhost_dev_cleanup(struct vhost_dev *dev, bool locked)
 		fput(dev->log_file);
 	dev->log_file = NULL;
 	/* No one will access memory at this point */
-	kfree(dev->memory);
+	kvfree(dev->memory);
 	dev->memory = NULL;
 	WARN_ON(!list_empty(&dev->work_list));
 	if (dev->worker) {
@@ -674,6 +674,18 @@ static int vhost_memory_reg_sort_cmp(const void *p1, const void *p2)
 	return 0;
 }
 
+static void *vhost_kvzalloc(unsigned long size)
+{
+	void *n = kzalloc(size, GFP_KERNEL | __GFP_NOWARN | __GFP_REPEAT);
+
+	if (!n) {
+		n = vzalloc(size);
+		if (!n)
+			return ERR_PTR(-ENOMEM);
+	}
+	return n;
+}
+
 static long vhost_set_memory(struct vhost_dev *d, struct vhost_memory __user *m)
 {
 	struct vhost_memory mem, *newmem, *oldmem;
@@ -686,7 +698,7 @@ static long vhost_set_memory(struct vhost_dev *d, struct vhost_memory __user *m)
 		return -EOPNOTSUPP;
 	if (mem.nregions > VHOST_MEMORY_MAX_NREGIONS)
 		return -E2BIG;
-	newmem = kmalloc(size + mem.nregions * sizeof *m->regions, GFP_KERNEL);
+	newmem = vhost_kvzalloc(size + mem.nregions * sizeof(*m->regions));
 	if (!newmem)
 		return -ENOMEM;
 
@@ -700,7 +712,7 @@ static long vhost_set_memory(struct vhost_dev *d, struct vhost_memory __user *m)
 		vhost_memory_reg_sort_cmp, NULL);
 
 	if (!memory_access_ok(d, newmem, 0)) {
-		kfree(newmem);
+		kvfree(newmem);
 		return -EFAULT;
 	}
 	oldmem = d->memory;
@@ -712,7 +724,7 @@ static long vhost_set_memory(struct vhost_dev *d, struct vhost_memory __user *m)
 		d->vqs[i]->memory = newmem;
 		mutex_unlock(&d->vqs[i]->mutex);
 	}
-	kfree(oldmem);
+	kvfree(oldmem);
 	return 0;
 }
 
-- 
1.8.3.1



[RFC 17/17] VFIO: platform: add irq bypass producer management

2015-07-02 Thread Eric Auger
This patch adds irq_bypass_producer registration/unregistration.
The VFIO producer callbacks are populated as follows:
- stop/resume producer: simply disable/enable the host irq
- add/del consumer: basically set the automasked flag to false/true
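
The state changes listed above can be modelled in user space like this. The struct and function names are hypothetical; the sketch tracks only the two flags that the description mentions and leaves out the active-state handling found in the actual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two pieces of state the callbacks touch. */
struct producer_state {
	bool irq_enabled;	/* is the host irq line enabled? */
	bool automasked;	/* is the irq masked automatically when it fires? */
};

void stop_producer(struct producer_state *s)
{
	assert(s != NULL);
	s->irq_enabled = false;	/* stands in for disabling the host irq */
}

void resume_producer(struct producer_state *s)
{
	assert(s != NULL);
	s->irq_enabled = true;	/* stands in for enabling the host irq */
}

void add_consumer(struct producer_state *s)
{
	assert(s != NULL);
	s->automasked = false;	/* forwarding active: no automasking */
}

void del_consumer(struct producer_state *s)
{
	assert(s != NULL);
	s->automasked = true;	/* back to the automasked scheme */
}
```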

The vfio_platform_device pointer is passed as producer opaque.

We also cache the device handle in vfio_platform_device. This
makes it possible to easily retrieve the vfio_device at registration.

Signed-off-by: Eric Auger eric.au...@linaro.org
---
 drivers/vfio/platform/vfio_platform_common.c  |  2 +
 drivers/vfio/platform/vfio_platform_irq.c | 83 +++
 drivers/vfio/platform/vfio_platform_private.h |  2 +
 3 files changed, 87 insertions(+)

diff --git a/drivers/vfio/platform/vfio_platform_common.c 
b/drivers/vfio/platform/vfio_platform_common.c
index 9acfca6..12d4540 100644
--- a/drivers/vfio/platform/vfio_platform_common.c
+++ b/drivers/vfio/platform/vfio_platform_common.c
@@ -546,6 +546,8 @@ int vfio_platform_probe_common(struct vfio_platform_device 
*vdev,
if (!vdev)
return -EINVAL;
 
+   vdev-dev = dev;
+
group = iommu_group_get(dev);
if (!group) {
pr_err(VFIO: No IOMMU group for device %s\n, vdev-name);
diff --git a/drivers/vfio/platform/vfio_platform_irq.c 
b/drivers/vfio/platform/vfio_platform_irq.c
index f6d83ed..0061e6e 100644
--- a/drivers/vfio/platform/vfio_platform_irq.c
+++ b/drivers/vfio/platform/vfio_platform_irq.c
@@ -20,6 +20,7 @@
 #include linux/types.h
 #include linux/vfio.h
 #include linux/irq.h
+#include linux/irqbypass.h
 
 #include vfio_platform_private.h
 
@@ -185,6 +186,70 @@ static irqreturn_t vfio_handler(int irq, void *dev_id)
return ret;
 }
 
+static void vfio_platform_stop_producer(struct irq_bypass_producer *prod)
+{
+   pr_info("%s disable %d\n", __func__, prod->irq);
+   disable_irq(prod->irq);
+}
+
+static void vfio_platform_resume_producer(struct irq_bypass_producer *prod)
+{
+   pr_info("%s enable %d\n", __func__, prod->irq);
+   enable_irq(prod->irq);
+}
+
+static void vfio_platform_add_consumer(struct irq_bypass_producer *prod,
+  struct irq_bypass_consumer *cons)
+{
+   int i, ret;
+   struct vfio_platform_device *vdev =
+   (struct vfio_platform_device *)prod->opaque;
+
+   pr_info("%s irq=%d gsi =%d\n", __func__, prod->irq, cons->gsi);
+
+   for (i = 0; i < vdev->num_irqs; i++) {
+   if (vdev->irqs[i].prod == prod)
+   break;
+   }
+   WARN_ON(i == vdev->num_irqs);
+
+   //TODO
+   /*
+* If the IRQ is active at irqchip level or VFIO (auto)masked,
+* the host IRQ is already under injection in the guest and
+* it is not safe to change the forwarding state at that
+* stage.
+* It is not possible to differentiate user-space masking
+* from auto-masking, leading to possible false detection of
+* active state.
+*/
+   prod->active = vfio_external_is_active(prod->vdev, i, 0, 0);
+
+   ret = vfio_external_set_automasked(prod->vdev, i, 0, 0, false);
+   WARN_ON(ret);
+}
+
+static void vfio_platform_del_consumer(struct irq_bypass_producer *prod,
+  struct irq_bypass_consumer *cons)
+{
+   int i;
+   struct vfio_platform_device *vdev =
+   (struct vfio_platform_device *)prod->opaque;
+
+   pr_info("%s irq=%d gsi =%d\n", __func__, prod->irq, cons->gsi);
+
+   for (i = 0; i < vdev->num_irqs; i++) {
+   if (vdev->irqs[i].prod == prod)
+   break;
+   }
+   WARN_ON(i == vdev->num_irqs);
+
+   if (prod->active)
+   vfio_external_mask(prod->vdev, i, 0, 0);
+
+   vfio_external_set_automasked(prod->vdev, i, 0, 0, true);
+}
+
 static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
int fd, irq_handler_t handler)
 {
@@ -192,8 +257,11 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
struct eventfd_ctx *trigger;
int ret;
 
+
if (irq->trigger) {
free_irq(irq->hwirq, irq);
+   irq_bypass_unregister_producer(irq->prod);
+   kfree(irq->prod);
kfree(irq->name);
eventfd_ctx_put(irq->trigger);
irq->trigger = NULL;
@@ -225,6 +293,21 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
return ret;
}
 
+   irq->prod = kzalloc(sizeof(struct irq_bypass_producer),
+   GFP_KERNEL);
+   if (!irq->prod)
+   return -ENOMEM;
+   irq->prod->token = (void *)trigger;
+   irq->prod->irq = irq->hwirq;
+   irq->prod->vdev = vfio_device_get_from_dev(vdev->dev);
+   irq->prod->opaque = (void *)vdev;
+   irq->prod->add_consumer = vfio_platform_add_consumer;
+   irq->prod->del_consumer = 

[RFC 11/17] VFIO: platform: select IRQ_BYPASS_MANAGER

2015-07-02 Thread Eric Auger
Select IRQ_BYPASS_MANAGER when CONFIG_VFIO_PLATFORM is set

Signed-off-by: Eric Auger eric.au...@linaro.org
---
 drivers/vfio/platform/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/vfio/platform/Kconfig b/drivers/vfio/platform/Kconfig
index bb30128..c2f3dce 100644
--- a/drivers/vfio/platform/Kconfig
+++ b/drivers/vfio/platform/Kconfig
@@ -2,6 +2,7 @@ config VFIO_PLATFORM
tristate "VFIO support for platform devices"
depends on VFIO && EVENTFD && (ARM || ARM64)
select VFIO_VIRQFD
+   select IRQ_BYPASS_MANAGER
help
  Support for platform devices with VFIO. This is required to make
  use of platform devices present on the system using the VFIO
-- 
1.9.1


[RFC 14/17] KVM: arm/arm64: vgic: forwarding control

2015-07-02 Thread Eric Auger
Implements kvm_vgic_[set|unset]_forward.

Handle low-level VGIC programming: physical IRQ/guest IRQ mapping,
list register cleanup, VGIC state machine. Also interacts with
the irqchip.

Signed-off-by: Eric Auger eric.au...@linaro.org

---

bypass rfc:
- rename kvm_arch_{set|unset}_forward into
  kvm_vgic_{set|unset}_forward. Remove __KVM_HAVE_ARCH_HALT_GUEST.
  The function is bound to be called by ARM code only.

v4 -> v5:
- fix arm64 compilation issues, ie. also defines
  __KVM_HAVE_ARCH_HALT_GUEST for arm64

v3 -> v4:
- code originally located in kvm_vfio_arm.c
- kvm_arch_vfio_{set|unset}_forward renamed into
  kvm_arch_{set|unset}_forward
- split into 2 functions (set/unset) since unset does not fail anymore
- unset can be invoked at whatever time. Extra care is taken to handle
  transition in VGIC state machine, LR cleanup, ...

v2 -> v3:
- renaming of kvm_arch_set_fwd_state into kvm_arch_vfio_set_forward
- takes a bool arg instead of kvm_fwd_irq_action enum
- removal of KVM_VFIO_IRQ_CLEANUP
- platform device check now happens here
- more precise errors returned
- irq_eoi handled externally to this patch (VGIC)
- correct enable_irq bug done twice
- reword the commit message
- correct check of platform_bus_type
- use raw_spin_lock_irqsave and check the validity of the handler
---
 include/kvm/arm_vgic.h |   7 ++
 virt/kvm/arm/vgic.c| 195 +
 2 files changed, 202 insertions(+)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 5d47d60..93b379f 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -353,6 +353,13 @@ int vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
 bool vgic_get_phys_irq_active(struct irq_phys_map *map);
 void vgic_set_phys_irq_active(struct irq_phys_map *map, bool active);
 
+int kvm_vgic_set_forward(struct kvm *kvm,
+unsigned int host_irq, unsigned int guest_irq);
+
+void kvm_vgic_unset_forward(struct kvm *kvm,
+   unsigned int host_irq, unsigned int guest_irq,
+   bool *active);
+
 #define irqchip_in_kernel(k)   (!!((k)->arch.vgic.in_kernel))
 #define vgic_initialized(k)(!!((k)->arch.vgic.nr_cpus))
 #define vgic_ready(k)  ((k)->arch.vgic.ready)
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index eef35d9..9efc839 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -2402,3 +2402,198 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
 {
return 0;
 }
+
+/**
+ * kvm_vgic_set_forward - Set IRQ forwarding
+ *
+ * @kvm: handle to the VM
+ * @host_irq: physical IRQ number
+ * @guest_irq: virtual IRQ number
+ *
+ * This function is supposed to be called only if the IRQ
+ * is not in progress: ie. not active at GIC level and not
+ * currently under injection in the KVM. The physical IRQ must
+ * also be disabled and the guest must have been exited and
+ * prevented from being re-entered.
+ */
+int kvm_vgic_set_forward(struct kvm *kvm,
+unsigned int host_irq,
+unsigned int guest_irq)
+{
+   struct irq_desc *desc = irq_to_desc(host_irq);
+   struct irq_phys_map *map = NULL;
+   struct irq_data *d;
+   unsigned long flags;
+   struct kvm_vcpu *vcpu = kvm_get_vcpu(kvm, 0);
+   int spi_id = guest_irq + VGIC_NR_PRIVATE_IRQS;
+   struct vgic_dist *dist = &kvm->arch.vgic;
+
+   kvm_debug("%s host_irq=%d guest_irq=%d\n",
+   __func__, host_irq, guest_irq);
+
+   if (!vcpu)
+   return 0;
+
+   spin_lock(&dist->lock);
+
+   raw_spin_lock_irqsave(&desc->lock, flags);
+   d = &desc->irq_data;
+   irqd_set_irq_forwarded(d);
+   /*
+* next physical IRQ will be handled as forwarded
+* by the host (priority drop only)
+*/
+
+   raw_spin_unlock_irqrestore(&desc->lock, flags);
+
+   /*
+* need to release the dist spin_lock here since
+* vgic_map_phys_irq can sleep
+*/
+   spin_unlock(&dist->lock);
+   map = vgic_map_phys_irq(vcpu, spi_id, host_irq, false);
+   /*
+* next guest_irq injection will be considered as
+* forwarded and next flush will program LR
+* without maintenance IRQ but with HW bit set
+*/
+   return !map;
+}
+
+/**
+ * kvm_vgic_unset_forward - Unset IRQ forwarding
+ *
+ * @kvm: handle to the VM
+ * @host_irq: physical IRQ number
+ * @guest_irq: virtual IRQ number
+ * @active: returns whether the physical IRQ is active
+ *
+ * This function must be called when the host_irq is disabled
+ * and guest has been exited and prevented from being re-entered.
+ *
+ */
+void kvm_vgic_unset_forward(struct kvm *kvm,
+   unsigned int host_irq,
+   unsigned int guest_irq,
+   bool *active)
+{
+   struct kvm_vcpu *vcpu = kvm_get_vcpu(kvm, 0);
+   struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+   

[RFC 13/17] KVM: introduce kvm_arch functions for IRQ bypass

2015-07-02 Thread Eric Auger
This patch introduces
- kvm_arch_add_producer
- kvm_arch_del_producer
- kvm_arch_stop_consumer
- kvm_arch_resume_consumer

They make it possible to specialize the KVM IRQ bypass consumer.

Signed-off-by: Eric Auger eric.au...@linaro.org
---
 include/linux/kvm_host.h | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 9564fd7..8e981e9 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -24,6 +24,7 @@
 #include <linux/err.h>
 #include <linux/irqflags.h>
 #include <linux/context_tracking.h>
+#include <linux/irqbypass.h>
 #include <asm/signal.h>
 
 #include <linux/kvm.h>
@@ -1133,5 +1134,31 @@ static inline void kvm_vcpu_set_dy_eligible(struct kvm_vcpu *vcpu, bool val)
 {
 }
 #endif /* CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT */
+
+#ifdef CONFIG_IRQ_BYPASS_MANAGER
+
+void kvm_arch_add_producer(struct irq_bypass_consumer *,
+  struct irq_bypass_producer *);
+void kvm_arch_del_producer(struct irq_bypass_consumer *,
+  struct irq_bypass_producer *);
+void kvm_arch_stop_consumer(struct irq_bypass_consumer *);
+void kvm_arch_resume_consumer(struct irq_bypass_consumer *);
+
+#else
+void kvm_arch_add_producer(struct irq_bypass_consumer *,
+  struct irq_bypass_producer *)
+{
+}
+void kvm_arch_del_producer(struct irq_bypass_consumer *,
+  struct irq_bypass_producer *)
+{
+}
+void kvm_arch_stop_consumer(struct irq_bypass_consumer *)
+{
+}
+void kvm_arch_resume_consumer(struct irq_bypass_consumer *)
+{
+}
+#endif /* CONFIG_IRQ_BYPASS_MANAGER */
 #endif
 
-- 
1.9.1


[RFC 07/17] KVM: arm: rename pause into power_off

2015-07-02 Thread Eric Auger
The kvm_vcpu_arch pause field is renamed into power_off to prepare
for the introduction of a new pause field.

Signed-off-by: Eric Auger eric.au...@linaro.org

v4 -> v5:
- fix compilation issue on arm64 (add power_off field in kvm_host.h)
---
 arch/arm/include/asm/kvm_host.h   |  4 ++--
 arch/arm/kvm/arm.c| 10 +-
 arch/arm/kvm/psci.c   | 10 +-
 arch/arm64/include/asm/kvm_host.h |  4 ++--
 4 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index e896d2c..304004d 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -129,8 +129,8 @@ struct kvm_vcpu_arch {
 * here.
 */
 
-   /* Don't run the guest on this vcpu */
-   bool pause;
+   /* vcpu power-off state */
+   bool power_off;
 
/* IO related fields */
struct kvm_decode mmio_decode;
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index bcdf799..7537e68 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -475,7 +475,7 @@ static void vcpu_pause(struct kvm_vcpu *vcpu)
 {
wait_queue_head_t *wq = kvm_arch_vcpu_wq(vcpu);
 
-   wait_event_interruptible(*wq, !vcpu->arch.pause);
+   wait_event_interruptible(*wq, !vcpu->arch.power_off);
 }
 
 static int kvm_vcpu_initialized(struct kvm_vcpu *vcpu)
@@ -525,7 +525,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 
update_vttbr(vcpu->kvm);
 
-   if (vcpu->arch.pause)
+   if (vcpu->arch.power_off)
vcpu_pause(vcpu);
 
/*
@@ -766,12 +766,12 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu,
vcpu_reset_hcr(vcpu);
 
/*
-* Handle the start in power-off case by marking the VCPU as paused.
+* Handle the start in power-off case.
 */
if (test_bit(KVM_ARM_VCPU_POWER_OFF, vcpu->arch.features))
-   vcpu->arch.pause = true;
+   vcpu->arch.power_off = true;
else
-   vcpu->arch.pause = false;
+   vcpu->arch.power_off = false;
 
return 0;
 }
diff --git a/arch/arm/kvm/psci.c b/arch/arm/kvm/psci.c
index 4b94b51..134971a 100644
--- a/arch/arm/kvm/psci.c
+++ b/arch/arm/kvm/psci.c
@@ -63,7 +63,7 @@ static unsigned long kvm_psci_vcpu_suspend(struct kvm_vcpu *vcpu)
 
 static void kvm_psci_vcpu_off(struct kvm_vcpu *vcpu)
 {
-   vcpu->arch.pause = true;
+   vcpu->arch.power_off = true;
 }
 
 static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
@@ -87,7 +87,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
 */
if (!vcpu)
return PSCI_RET_INVALID_PARAMS;
-   if (!vcpu->arch.pause) {
+   if (!vcpu->arch.power_off) {
if (kvm_psci_version(source_vcpu) != KVM_ARM_PSCI_0_1)
return PSCI_RET_ALREADY_ON;
else
@@ -115,7 +115,7 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
 * the general puspose registers are undefined upon CPU_ON.
 */
*vcpu_reg(vcpu, 0) = context_id;
-   vcpu->arch.pause = false;
+   vcpu->arch.power_off = false;
smp_mb();   /* Make sure the above is visible */
 
wq = kvm_arch_vcpu_wq(vcpu);
@@ -152,7 +152,7 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
kvm_for_each_vcpu(i, tmp, kvm) {
mpidr = kvm_vcpu_get_mpidr_aff(tmp);
if (((mpidr & target_affinity_mask) == target_affinity) &&
-   !tmp->arch.pause) {
+   !tmp->arch.power_off) {
return PSCI_0_2_AFFINITY_LEVEL_ON;
}
}
@@ -175,7 +175,7 @@ static void kvm_prepare_system_event(struct kvm_vcpu *vcpu, u32 type)
 * re-initialized.
 */
kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
-   tmp->arch.pause = true;
+   tmp->arch.power_off = true;
kvm_vcpu_kick(tmp);
}
 
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 2709db2..009da6b 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -122,8 +122,8 @@ struct kvm_vcpu_arch {
 * here.
 */
 
-   /* Don't run the guest */
-   bool pause;
+   /* vcpu power-off state */
+   bool power_off;
 
/* IO related fields */
struct kvm_decode mmio_decode;
-- 
1.9.1


[RFC 06/17] VFIO: add vfio_external_{mask|is_active|set_automasked}

2015-07-02 Thread Eric Auger
Introduces 3 new external functions aimed at doing actions
on VFIO devices:
- mask VFIO IRQ
- get the active status of VFIO IRQ (active at interrupt
  controller level or masked by the level-sensitive automasking).
- change the automasked property and switch the IRQ handler
  (between automasked/ non automasked)

Their implementation is based on bus specific callbacks.

Note there is no way to discriminate between user-space
masking and automasked handler masking. As a consequence, is_active
will return true in case the IRQ was masked by the user-space.
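
The bus-specific dispatch described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: every `demo_` name, the `struct demo_device` layout and the `demo_platform_mask` callback are hypothetical stand-ins and not the VFIO structures. Only the pattern "call the callback if the bus provides one, otherwise return -ENXIO" is taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the bus-specific external ops table. */
struct demo_external_ops {
	int (*mask)(void *device_data, unsigned index,
		    unsigned start, unsigned count);
};

/* Hypothetical stand-in for a VFIO device; not the kernel struct. */
struct demo_device {
	const struct demo_external_ops *external_ops; /* may be NULL */
	void *device_data;                            /* bus-private data */
};

/* Example bus callback: records the masked IRQ index in device_data. */
static int demo_platform_mask(void *device_data, unsigned index,
			      unsigned start, unsigned count)
{
	unsigned *last_masked = device_data;

	(void)start;
	(void)count;
	*last_masked = index;
	return 0;
}

/* Generic entry point: dispatch to the bus callback when one exists. */
static int demo_external_mask(struct demo_device *vdev, unsigned index,
			      unsigned start, unsigned count)
{
	assert(vdev != NULL);
	if (vdev->external_ops && vdev->external_ops->mask)
		return vdev->external_ops->mask(vdev->device_data,
						index, start, count);
	return -ENXIO; /* this bus does not implement the operation */
}
```

A caller that receives -ENXIO knows the bus driver offers no such hook, which differs from a failure reported by the hook itself.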

Signed-off-by: Eric Auger eric.au...@linaro.org

---

v5 -> v6:
- implementation now uses external ops
- prototype changed (index, start, count) and returns int

V4: creation
---
 drivers/vfio/vfio.c  | 39 +++
 include/linux/vfio.h | 16 
 2 files changed, 55 insertions(+)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 2fb29df..af6901e 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1527,6 +1527,45 @@ long vfio_external_check_extension(struct vfio_group *group, unsigned long arg)
 }
 EXPORT_SYMBOL_GPL(vfio_external_check_extension);
 
+int vfio_external_mask(struct vfio_device *vdev, unsigned index,
+   unsigned start, unsigned count)
+{
+   if (vdev->ops->external_ops &&
+   vdev->ops->external_ops->mask)
+   return vdev->ops->external_ops->mask(vdev->device_data,
+index, start, count);
+   else
+   return -ENXIO;
+}
+EXPORT_SYMBOL_GPL(vfio_external_mask);
+
+int vfio_external_is_active(struct vfio_device *vdev, unsigned index,
+unsigned start, unsigned count)
+{
+   if (vdev->ops->external_ops &&
+   vdev->ops->external_ops->is_active)
+   return vdev->ops->external_ops->is_active(vdev->device_data,
+ index, start, count);
+   else
+   return -ENXIO;
+}
+EXPORT_SYMBOL_GPL(vfio_external_is_active);
+
+int vfio_external_set_automasked(struct vfio_device *vdev,
+ unsigned index, unsigned start,
+ unsigned count, bool automasked)
+{
+   if (vdev->ops->external_ops &&
+   vdev->ops->external_ops->set_automasked)
+   return vdev->ops->external_ops->set_automasked(
+   vdev->device_data,
+   index, start,
+   count, automasked);
+   else
+   return -ENXIO;
+}
+EXPORT_SYMBOL_GPL(vfio_external_set_automasked);
+
 /**
  * Module/class support
  */
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index d79e8a9..31d3c95 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -107,6 +107,22 @@ extern int vfio_external_user_iommu_id(struct vfio_group *group);
 extern long vfio_external_check_extension(struct vfio_group *group,
  unsigned long arg);
 
+extern int vfio_external_mask(struct vfio_device *vdev, unsigned index,
+  unsigned start, unsigned count);
+/*
+ * returns whether the VFIO IRQ is active:
+ * true if not yet deactivated at interrupt controller level or if
+ * automasked (level sensitive IRQ). Unfortunately there is no way to
+ * discriminate between handler auto-masking and user-space masking
+ */
+extern int vfio_external_is_active(struct vfio_device *vdev,
+   unsigned index, unsigned start,
+   unsigned count);
+
+extern int vfio_external_set_automasked(struct vfio_device *vdev,
+unsigned index, unsigned start,
+unsigned count, bool automasked);
+
 struct pci_dev;
 #ifdef CONFIG_EEH
 extern void vfio_spapr_pci_eeh_open(struct pci_dev *pdev);
-- 
1.9.1


[RFC 10/17] KVM: arm: select IRQ_BYPASS_MANAGER

2015-07-02 Thread Eric Auger
Select IRQ_BYPASS_MANAGER when CONFIG_KVM is set

Signed-off-by: Eric Auger eric.au...@linaro.org
---
 arch/arm/kvm/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
index bfb915d..7d38d25 100644
--- a/arch/arm/kvm/Kconfig
+++ b/arch/arm/kvm/Kconfig
@@ -31,6 +31,7 @@ config KVM
select KVM_VFIO
select HAVE_KVM_EVENTFD
select HAVE_KVM_IRQFD
+   select IRQ_BYPASS_MANAGER
depends on ARM_VIRT_EXT && ARM_LPAE && ARM_ARCH_TIMER
---help---
  Support hosting virtualized guest machines.
-- 
1.9.1


[RFC 12/17] irq: bypass: Extend skeleton for ARM forwarding control

2015-07-02 Thread Eric Auger
- [add,del]_[consumer,producer] updated to take both the consumer and
  producer handles. This is needed to combine info from both,
  typically to link the source irq owned by the producer with the gsi
  owned by the consumer (forwarded IRQ setup).
- new functions are added: [stop,resume]_[consumer, producer]. Those are
  needed for forwarding since the state change requires intermingling
  actions at the consumer and the producer.
- On handshake, we now call connect and disconnect, which implement the
  more complex sequence.
- new fields are added on the producer side: linux irq, vfio_device handle,
  active, which reflects whether the source is active (at interrupt
  controller level or at VFIO level, i.e. automasked), and finally an
  opaque pointer which will be used to point to the vfio_platform_device
  in this series.
- new fields on the consumer side: the kvm handle, the gsi

Integration of posted interrupt series will help to refine those choices
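
The ordering that connect imposes (stop both sides, link them, then resume in reverse order) can be sketched in userspace C. This is a simplified illustration, not the manager code: the `seq_` types, the callbacks and the log are hypothetical, and only the call order and the "skip a missing callback" behaviour mirror the connect function in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct seq_cons;

/* Hypothetical, reduced producer/consumer with optional callbacks. */
struct seq_prod {
	void (*stop)(struct seq_prod *);
	void (*resume)(struct seq_prod *);
	void (*add_consumer)(struct seq_prod *, struct seq_cons *);
};

struct seq_cons {
	void (*stop)(struct seq_cons *);
	void (*resume)(struct seq_cons *);
	void (*add_producer)(struct seq_cons *, struct seq_prod *);
};

/* Call log, so the resulting order can be inspected. */
static char seq_log[32];

static void seq_note(const char *step)
{
	assert(strlen(seq_log) + strlen(step) < sizeof(seq_log));
	strcat(seq_log, step);
}

static void seq_prod_stop(struct seq_prod *p) { (void)p; seq_note("ps "); }
static void seq_prod_resume(struct seq_prod *p) { (void)p; seq_note("pr "); }
static void seq_prod_add(struct seq_prod *p, struct seq_cons *c)
{
	(void)p; (void)c; seq_note("pa ");
}
static void seq_cons_stop(struct seq_cons *c) { (void)c; seq_note("cs "); }
static void seq_cons_resume(struct seq_cons *c) { (void)c; seq_note("cr "); }
static void seq_cons_add(struct seq_cons *c, struct seq_prod *p)
{
	(void)c; (void)p; seq_note("ca ");
}

/* Stop producer then consumer, link both, resume consumer then producer. */
static void seq_connect(struct seq_prod *prod, struct seq_cons *cons)
{
	if (prod->stop)
		prod->stop(prod);
	if (cons->stop)
		cons->stop(cons);
	if (prod->add_consumer)
		prod->add_consumer(prod, cons);
	if (cons->add_producer)
		cons->add_producer(cons, prod);
	if (cons->resume)
		cons->resume(cons);
	if (prod->resume)
		prod->resume(prod);
}
```

Stopping the producer first means no new physical IRQ can arrive while the consumer side is being halted and reconfigured.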

Signed-off-by: Eric Auger eric.au...@linaro.org

---

- connect/disconnect could become a cb too. For forwarding it may make
  sense to have failure at connection: this would happen when the physical
  IRQ is either active at irqchip level or VFIO masked. This means some
  of the cb should return an error and this error management could be
  prod/cons specific. Where to attach the connect/disconnect cb: to the
  cons or prod, to both?
- Hence may be sensible to do the list_add only if connect returns 0
- disconnect would not be allowed to fail.
---
 include/linux/irqbypass.h | 26 ++---
 kernel/irq/bypass.c   | 48 +++
 2 files changed, 67 insertions(+), 7 deletions(-)

diff --git a/include/linux/irqbypass.h b/include/linux/irqbypass.h
index 718508e..591ae3f 100644
--- a/include/linux/irqbypass.h
+++ b/include/linux/irqbypass.h
@@ -3,17 +3,37 @@
 
 #include <linux/list.h>
 
+struct vfio_device;
+struct irq_bypass_consumer;
+struct kvm;
+
 struct irq_bypass_producer {
struct list_head node;
void *token;
-   /* TBD */
+   unsigned int irq; /* host physical irq */
+   struct vfio_device *vdev; /* vfio device that requested irq */
+   /* is irq active at irqchip or VFIO masked? */
+   bool active;
+   void *opaque;
+   void (*stop_producer)(struct irq_bypass_producer *);
+   void (*resume_producer)(struct irq_bypass_producer *);
+   void (*add_consumer)(struct irq_bypass_producer *,
+struct irq_bypass_consumer *);
+   void (*del_consumer)(struct irq_bypass_producer *,
+struct irq_bypass_consumer *);
 };
 
 struct irq_bypass_consumer {
struct list_head node;
void *token;
-   void (*add_producer)(struct irq_bypass_producer *);
-   void (*del_producer)(struct irq_bypass_producer *);
+   unsigned int gsi;   /* the guest gsi */
+   struct kvm *kvm;
+   void (*stop_consumer)(struct irq_bypass_consumer *);
+   void (*resume_consumer)(struct irq_bypass_consumer *);
+   void (*add_producer)(struct irq_bypass_consumer *,
+struct irq_bypass_producer *);
+   void (*del_producer)(struct irq_bypass_consumer *,
+struct irq_bypass_producer *);
 };
 
 int irq_bypass_register_producer(struct irq_bypass_producer *);
diff --git a/kernel/irq/bypass.c b/kernel/irq/bypass.c
index 5d0f92b..fb31fef 100644
--- a/kernel/irq/bypass.c
+++ b/kernel/irq/bypass.c
@@ -19,6 +19,46 @@ static LIST_HEAD(producers);
 static LIST_HEAD(consumers);
 static DEFINE_MUTEX(lock);
 
+/* lock must be held when calling connect */
+static void connect(struct irq_bypass_producer *prod,
+   struct irq_bypass_consumer *cons)
+{
+   pr_info(" %s prod(%d) -> cons(%d)\n",
+   __func__, prod->irq, cons->gsi);
+   if (prod->stop_producer)
+   prod->stop_producer(prod);
+   if (cons->stop_consumer)
+   cons->stop_consumer(cons);
+   if (prod->add_consumer)
+   prod->add_consumer(prod, cons);
+   if (cons->add_producer)
+   cons->add_producer(cons, prod);
+   if (cons->resume_consumer)
+   cons->resume_consumer(cons);
+   if (prod->resume_producer)
+   prod->resume_producer(prod);
+}
+
+/* lock must be held when calling disconnect */
+static void disconnect(struct irq_bypass_producer *prod,
+  struct irq_bypass_consumer *cons)
+{
+   pr_info(" %s prod(%d) -> cons(%d)\n",
+   __func__, prod->irq, cons->gsi);
+   if (prod->stop_producer)
+   prod->stop_producer(prod);
+   if (cons->stop_consumer)
+   cons->stop_consumer(cons);
+   if (cons->del_producer)
+   cons->del_producer(cons, prod);
+   if (prod->del_consumer)
+   prod->del_consumer(prod, cons);
+   if (cons->resume_consumer)
+   cons->resume_consumer(cons);
+   if 

[RFC 00/17] ARM IRQ forward control based on IRQ bypass manager

2015-07-02 Thread Eric Auger
This series allows setting up ARM IRQ forwarding between a VFIO platform
device physical IRQ and a guest virtual IRQ.

The setting is coordinated by the prototype IRQ bypass manager.
This kernel integration now seems preferred to the previous kvm-vfio device
user API:
- [RFC v6 00/16] KVM-VFIO IRQ forward control
  (https://lkml.org/lkml/2015/4/13/353).

Some rationale can be found in IRQ bypass manager thread:
https://lkml.org/lkml/2015/6/29/268

The principle is that the VFIO platform driver registers a producer struct
on VFIO_IRQ_SET_ACTION_TRIGGER while KVM irqfd registers a consumer struct
on the irqfd assignment. This leads to a handshake based on matching the
eventfd context (used as token).

When either the producer or the consumer module disappears, it is
unregistered and the link is disconnected.
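
The token-based handshake can be pictured with a small userspace C sketch. It is a simplification under stated assumptions: the `demo_` structures, the fixed-size arrays and the function names are hypothetical, whereas the prototype manager keeps linked lists under a mutex and invokes callbacks on a match. The sketch only shows that whichever side registers second finds its peer by comparing tokens, and that unregistering breaks the link.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX 8

struct demo_consumer;

/* Hypothetical, reduced producer: token plus the host IRQ. */
struct demo_producer {
	void *token;                 /* match key, e.g. an eventfd context */
	int irq;
	struct demo_consumer *peer;  /* NULL while unconnected */
};

/* Hypothetical, reduced consumer: token plus the guest gsi. */
struct demo_consumer {
	void *token;
	int gsi;
	struct demo_producer *peer;
};

static struct demo_producer *demo_producers[DEMO_MAX];
static struct demo_consumer *demo_consumers[DEMO_MAX];

static int demo_register_producer(struct demo_producer *prod)
{
	int i, slot = -1;

	assert(prod != NULL && prod->token != NULL);
	for (i = 0; i < DEMO_MAX && slot < 0; i++)
		if (!demo_producers[i])
			slot = i;
	if (slot < 0)
		return -1;   /* registry full */
	demo_producers[slot] = prod;
	prod->peer = NULL;

	/* Handshake: look for an unconnected consumer with the same token. */
	for (i = 0; i < DEMO_MAX; i++) {
		struct demo_consumer *cons = demo_consumers[i];

		if (cons && !cons->peer && cons->token == prod->token) {
			prod->peer = cons;
			cons->peer = prod;
			break;
		}
	}
	return 0;
}

static int demo_register_consumer(struct demo_consumer *cons)
{
	int i, slot = -1;

	assert(cons != NULL && cons->token != NULL);
	for (i = 0; i < DEMO_MAX && slot < 0; i++)
		if (!demo_consumers[i])
			slot = i;
	if (slot < 0)
		return -1;
	demo_consumers[slot] = cons;
	cons->peer = NULL;

	for (i = 0; i < DEMO_MAX; i++) {
		struct demo_producer *prod = demo_producers[i];

		if (prod && !prod->peer && prod->token == cons->token) {
			cons->peer = prod;
			prod->peer = cons;
			break;
		}
	}
	return 0;
}

/* Unregistration drops the entry and disconnects the link, if any. */
static void demo_unregister_producer(struct demo_producer *prod)
{
	int i;

	for (i = 0; i < DEMO_MAX; i++)
		if (demo_producers[i] == prod)
			demo_producers[i] = NULL;
	if (prod->peer) {
		prod->peer->peer = NULL;
		prod->peer = NULL;
	}
}
```

Because the match is symmetric, neither VFIO nor KVM has to know in which order user space sets up the trigger and the irqfd.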

Structure of the series:
[1-6] Modifications in the VFIO (platform) driver to prepare for dynamic
  switch between automasked/masked mode
[7-8] Introduce halt/resume guest capability
[9] irq bypass manager proto as sent by Alex
[10-17] Adaptations to support forwarding on top of IRQ bypass manager

Dependencies:
1- [PATCH 00/10] arm/arm64: KVM: Active interrupt state switching for
   shared devices (http://www.spinics.net/lists/kvm/msg117411.html)
2- RFC ARM: Forwarding physical interrupts to a guest VM
   (http://lwn.net/Articles/603514/)
3- IRQ bypass manager proto: https://lkml.org/lkml/2015/6/29/268
4- [RFC v2 0/4] chip/vgic adaptations for forwarded irq
   (http://lists.infradead.org/pipermail/linux-arm-kernel/2015-February/323183.html)

All those pieces can be found at:
https://git.linaro.org/people/eric.auger/linux.git/shortlog/refs/heads/v4.2-rc1-bypass-fwd

More background on ARM IRQ forwarding in the text below and at
http://www.linux-kvm.org/images/a/a8/01x04-ARMdevice.pdf.

A forwarded IRQ is deactivated by the guest and not by the host.
When the guest deactivates the associated virtual IRQ, the interrupt
controller automatically completes the physical IRQ. Obviously
this requires some HW support in the interrupt controller. This is
the case for ARM GICv2.

The direct benefit is that, for a level sensitive IRQ, a VM exit
can be avoided on forwarded IRQ completion.

When the IRQ is forwarded, the VFIO platform driver does not need to
mask the physical IRQ anymore before signaling the eventfd. Indeed
genirq lowers the running priority, which lets any other physical IRQ
fire, but not that one.

Besides, the injection is still based on irqfd triggering. The only
impact on the irqfd process is that resamplefd is not called anymore on
virtual IRQ completion, since deactivation is not trapped by KVM.

This was tested on Calxeda Midway, assigning the xgmac main IRQ.

kvm-vfio v6 -> rfc based on IRQ bypass manager
(see previous history in https://lkml.org/lkml/2015/4/13/353).

Best Regards

Eric


Alex Williamson (1):
  bypass: IRQ bypass manager proto by Alex

Eric Auger (16):
  VFIO: platform: test forwarded state when selecting IRQ handler
  VFIO: platform: single handler using function pointer
  VFIO: Introduce vfio_device_external_ops
  VFIO: pci: initialize vfio_device_external_ops
  VFIO: platform: implement vfio_device_external_ops callbacks
  VFIO: add vfio_external_{mask|is_active|set_automasked}
  KVM: arm: rename pause into power_off
  kvm: arm/arm64: implement kvm_arm_[halt,resume]_guest
  KVM: arm: select IRQ_BYPASS_MANAGER
  VFIO: platform: select IRQ_BYPASS_MANAGER
  irq: bypass: Extend skeleton for ARM forwarding control
  KVM: introduce kvm_arch functions for IRQ bypass
  KVM: arm/arm64: vgic: forwarding control
  KVM: arm/arm64: implement IRQ bypass consumer functions
  KVM: eventfd: add irq bypass consumer management
  VFIO: platform: add irq bypass producer management

 arch/arm/include/asm/kvm_host.h   |   5 +-
 arch/arm/kvm/Kconfig  |   1 +
 arch/arm/kvm/arm.c|  60 +++-
 arch/arm/kvm/psci.c   |  10 +-
 arch/arm64/include/asm/kvm_host.h |   3 +
 arch/x86/kvm/Kconfig  |   1 +
 drivers/vfio/pci/Kconfig  |   1 +
 drivers/vfio/pci/vfio_pci.c   |   1 +
 drivers/vfio/pci/vfio_pci_intrs.c |   6 +
 drivers/vfio/platform/Kconfig |   1 +
 drivers/vfio/platform/vfio_platform_common.c  |   9 ++
 drivers/vfio/platform/vfio_platform_irq.c | 160 -
 drivers/vfio/platform/vfio_platform_private.h |  14 ++
 drivers/vfio/vfio.c   |  39 ++
 include/kvm/arm_vgic.h|   7 +
 include/linux/irqbypass.h |  43 ++
 include/linux/kvm_host.h  |  27 
 include/linux/vfio.h  |  34 +
 kernel/irq/Kconfig|   3 +
 kernel/irq/Makefile   |   1 +
 kernel/irq/bypass.c   | 156 +
 virt/kvm/arm/vgic.c   

[RFC 09/17] bypass: IRQ bypass manager proto by Alex

2015-07-02 Thread Eric Auger
From: Alex Williamson alex.william...@redhat.com

There are plenty of details to be filled in, but I think the basics
look something like the code below.  The IRQ bypass manager just
defines a pair of structures, one for interrupt producers and one for
interrupt consumers.  I'm certain that we'll need more callbacks than
I've defined below, but figuring out what those should be for the best
abstraction is the hardest part of this idea.  The manager provides both
registration and de-registration interfaces for both types of objects
and keeps lists for each, protected by a lock.  The manager doesn't even
really need to know what the match token is, but I assume for our
purposes it will be an eventfd_ctx.

On the vfio side, the producer struct would be embedded in the
vfio_pci_irq_ctx struct.  KVM would probably embed the consumer struct
in _irqfd.  As I've coded below, the IRQ bypass manager calls the
consumer callbacks, so the producer struct would need fields or
callbacks to provide the consumer the info it needs.  AIUI the Posted
Interrupt model, VFIO only needs to provide data to the consumer.  For
IRQ Forwarding, I think the producer needs to be informed when bypass is
active to model the incoming interrupt as edge vs level.

I've prototyped the base IRQ bypass manager here as static, but I don't
see any reason it couldn't be a module that's loaded by dependency when
either vfio-pci or kvm-intel is loaded (or other producer/consumer
objects).

Is this a reasonable starting point to craft the additional fields and
callbacks and interaction of who calls who that we need to support
Posted Interrupts and IRQ Forwarding?  Is the AMD version of this still
alive?  Thanks,

Alex
---
 arch/x86/kvm/Kconfig  |   1 +
 drivers/vfio/pci/Kconfig  |   1 +
 drivers/vfio/pci/vfio_pci_intrs.c |   6 ++
 include/linux/irqbypass.h |  23 
 kernel/irq/Kconfig|   3 +
 kernel/irq/Makefile   |   1 +
 kernel/irq/bypass.c   | 116 ++
 virt/kvm/eventfd.c|   4 ++
 8 files changed, 155 insertions(+)
 create mode 100644 include/linux/irqbypass.h
 create mode 100644 kernel/irq/bypass.c

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index d8a1d56..86d0d77 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -61,6 +61,7 @@ config KVM_INTEL
depends on KVM
# for perf_guest_get_msrs():
depends on CPU_SUP_INTEL
+   select IRQ_BYPASS_MANAGER
---help---
  Provides support for KVM on Intel processors equipped with the VT
  extensions.
diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
index 579d83b..02912f1 100644
--- a/drivers/vfio/pci/Kconfig
+++ b/drivers/vfio/pci/Kconfig
@@ -2,6 +2,7 @@ config VFIO_PCI
tristate "VFIO support for PCI devices"
depends on VFIO && PCI && EVENTFD
select VFIO_VIRQFD
+   select IRQ_BYPASS_MANAGER
help
  Support for the PCI VFIO bus driver.  This is required to make
  use of PCI drivers using the VFIO framework.
diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index 1f577b4..4e053be 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -181,6 +181,7 @@ static int vfio_intx_set_signal(struct vfio_pci_device *vdev, int fd)
 
if (vdev->ctx[0].trigger) {
free_irq(pdev->irq, vdev);
+   /* irq_bypass_unregister_producer(); */
kfree(vdev->ctx[0].name);
eventfd_ctx_put(vdev->ctx[0].trigger);
vdev->ctx[0].trigger = NULL;
@@ -214,6 +215,8 @@ static int vfio_intx_set_signal(struct vfio_pci_device *vdev, int fd)
return ret;
}
 
+   /* irq_bypass_register_producer(); */
+
/*
 * INTx disable will stick across the new irq setup,
 * disable_irq won't.
@@ -319,6 +322,7 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
 
if (vdev->ctx[vector].trigger) {
free_irq(irq, vdev->ctx[vector].trigger);
+   /* irq_bypass_unregister_producer(); */
kfree(vdev->ctx[vector].name);
eventfd_ctx_put(vdev->ctx[vector].trigger);
vdev->ctx[vector].trigger = NULL;
@@ -360,6 +364,8 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
return ret;
}
 
+   /* irq_bypass_register_producer(); */
+
vdev->ctx[vector].trigger = trigger;
 
return 0;
diff --git a/include/linux/irqbypass.h b/include/linux/irqbypass.h
new file mode 100644
index 000..718508e
--- /dev/null
+++ b/include/linux/irqbypass.h
@@ -0,0 +1,23 @@
+#ifndef IRQBYPASS_H
+#define IRQBYPASS_H
+
+#include <linux/list.h>
+
+struct irq_bypass_producer {
+   struct list_head node;
+   void *token;
+   /* TBD */
+};
+
+struct irq_bypass_consumer {
+   struct list_head 

[RFC 02/17] VFIO: platform: single handler using function pointer

2015-07-02 Thread Eric Auger
A single handler is now registered whatever the use case, automasked
or not. A function pointer is set according to the desired behavior
and the handler calls this function.

The irq lock is taken/released in the root handler. eventfd_signal can
be called in regions not allowed to sleep.
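
The "one root handler, behavior selected through a function pointer" idea can be sketched in userspace C as follows. This is an illustration under stated assumptions: the `demo_` names are hypothetical, a boolean stands in for the per-IRQ spinlock, and a counter stands in for eventfd_signal. Only the structure (lock taken in the root handler, specific handler called with the lock held) follows the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_irqreturn { DEMO_IRQ_NONE, DEMO_IRQ_HANDLED };

/* Hypothetical per-IRQ context; not the vfio_platform_irq struct. */
struct demo_irq_ctx {
	bool locked;     /* stand-in for the per-IRQ spinlock */
	bool masked;
	int signalled;   /* stand-in for eventfd_signal() calls */
	enum demo_irqreturn (*handler)(struct demo_irq_ctx *);
};

/* Automasked flavour: mask on first hit, ignore hits while masked. */
static enum demo_irqreturn demo_automasked(struct demo_irq_ctx *ctx)
{
	assert(ctx->locked);   /* must run under the root handler's lock */
	if (ctx->masked)
		return DEMO_IRQ_NONE;
	ctx->masked = true;
	ctx->signalled++;
	return DEMO_IRQ_HANDLED;
}

/* Non-automasked flavour: always signal. */
static enum demo_irqreturn demo_plain(struct demo_irq_ctx *ctx)
{
	assert(ctx->locked);
	ctx->signalled++;
	return DEMO_IRQ_HANDLED;
}

/* Root handler: the only one registered; takes the lock and dispatches. */
static enum demo_irqreturn demo_root_handler(struct demo_irq_ctx *ctx)
{
	enum demo_irqreturn ret;

	assert(ctx != NULL && ctx->handler != NULL && !ctx->locked);
	ctx->locked = true;
	ret = ctx->handler(ctx);
	ctx->locked = false;
	return ret;
}
```

Switching between the two behaviors then amounts to updating the `handler` pointer, with no need to free and re-request the interrupt.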

Signed-off-by: Eric Auger eric.au...@linaro.org

---

v4: creation
---
 drivers/vfio/platform/vfio_platform_irq.c | 21 +++--
 drivers/vfio/platform/vfio_platform_private.h |  1 +
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
index 132bb3f..8eb65c1 100644
--- a/drivers/vfio/platform/vfio_platform_irq.c
+++ b/drivers/vfio/platform/vfio_platform_irq.c
@@ -147,11 +147,8 @@ static int vfio_platform_set_irq_unmask(struct vfio_platform_device *vdev,
 static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id)
 {
 	struct vfio_platform_irq *irq_ctx = dev_id;
-	unsigned long flags;
 	int ret = IRQ_NONE;
 
-	spin_lock_irqsave(&irq_ctx->lock, flags);
-
 	if (!irq_ctx->masked) {
 		ret = IRQ_HANDLED;
 
@@ -160,8 +157,6 @@ static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id)
 		irq_ctx->masked = true;
 	}
 
-	spin_unlock_irqrestore(&irq_ctx->lock, flags);
-
 	if (ret == IRQ_HANDLED)
 		eventfd_signal(irq_ctx->trigger, 1);
 
@@ -177,6 +172,19 @@ static irqreturn_t vfio_irq_handler(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 
+static irqreturn_t vfio_handler(int irq, void *dev_id)
+{
+	struct vfio_platform_irq *irq_ctx = dev_id;
+	unsigned long flags;
+	irqreturn_t ret;
+
+	spin_lock_irqsave(&irq_ctx->lock, flags);
+	ret = irq_ctx->handler(irq, dev_id);
+	spin_unlock_irqrestore(&irq_ctx->lock, flags);
+
+	return ret;
+}
+
 static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
 			    int fd, irq_handler_t handler)
 {
@@ -206,9 +214,10 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
 	}
 
 	irq->trigger = trigger;
+	irq->handler = handler;
 
 	irq_set_status_flags(irq->hwirq, IRQ_NOAUTOEN);
-	ret = request_irq(irq->hwirq, handler, 0, irq->name, irq);
+	ret = request_irq(irq->hwirq, vfio_handler, 0, irq->name, irq);
 	if (ret) {
 		kfree(irq->name);
 		eventfd_ctx_put(trigger);
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
index 1c9b3d5..413f575 100644
--- a/drivers/vfio/platform/vfio_platform_private.h
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -37,6 +37,7 @@ struct vfio_platform_irq {
 	spinlock_t		lock;
 	struct virqfd		*unmask;
 	struct virqfd		*mask;
+	irqreturn_t (*handler)(int irq, void *dev_id);
 };
 
 struct vfio_platform_region {
-- 
1.9.1
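
For readers following the series, the dispatch pattern this patch introduces can be sketched in plain userspace C. This is only an illustration under stated assumptions: every sketch_* name is hypothetical, the spinlock is modelled by a boolean flag, and a counter stands in for eventfd_signal(). It shows one root handler taking the lock and calling whichever flavour is currently installed through the function pointer.

```c
#include <stdbool.h>

/* Userspace stand-ins; all sketch_* names are hypothetical, not kernel API. */
typedef enum { SKETCH_IRQ_NONE, SKETCH_IRQ_HANDLED } sketch_irqreturn_t;

struct sketch_irq {
	bool locked;		/* models irq_ctx->lock being held */
	bool ran_under_lock;	/* records whether the last flavour ran locked */
	bool masked;
	int signalled;		/* counts what eventfd_signal() would deliver */
	sketch_irqreturn_t (*handler)(struct sketch_irq *irq);
};

/* Automasked flavour: deliver once, then stay quiet until unmasked. */
static sketch_irqreturn_t sketch_automasked_handler(struct sketch_irq *irq)
{
	irq->ran_under_lock = irq->locked;
	if (irq->masked)
		return SKETCH_IRQ_NONE;
	irq->masked = true;
	irq->signalled++;
	return SKETCH_IRQ_HANDLED;
}

/* Plain flavour: always signal. */
static sketch_irqreturn_t sketch_plain_handler(struct sketch_irq *irq)
{
	irq->ran_under_lock = irq->locked;
	irq->signalled++;
	return SKETCH_IRQ_HANDLED;
}

/* Root handler: the only entry point that would be registered. */
static sketch_irqreturn_t sketch_root_handler(struct sketch_irq *irq)
{
	sketch_irqreturn_t ret;

	irq->locked = true;		/* spin_lock_irqsave() in the patch */
	ret = irq->handler(irq);	/* dispatch through the pointer */
	irq->locked = false;		/* spin_unlock_irqrestore() */
	return ret;
}
```

Because the flavour is chosen through the pointer, it can be swapped later without re-registering the interrupt, which appears to be what the later patches in the series rely on.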



[RFC 08/17] kvm: arm/arm64: implement kvm_arm_[halt,resume]_guest

2015-07-02 Thread Eric Auger
On halt, the guest is forced to exit and prevented from being
re-entered. This is synchronous.

Those two operations will be needed for IRQ forwarding setting.

Signed-off-by: Eric Auger eric.au...@linaro.org

---

RFC:
- rename the function, which becomes static
- remove __KVM_HAVE_ARCH_HALT_GUEST

v4 -> v5: add arm64 support
- also defines __KVM_HAVE_ARCH_HALT_GUEST for arm64
- add pause field
---
 arch/arm/include/asm/kvm_host.h   |  3 +++
 arch/arm/kvm/arm.c| 32 +---
 arch/arm64/include/asm/kvm_host.h |  3 +++
 3 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 304004d..899ae27 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -132,6 +132,9 @@ struct kvm_vcpu_arch {
 	/* vcpu power-off state */
 	bool power_off;
 
+	/* Don't run the guest */
+	bool pause;
+
 	/* IO related fields */
 	struct kvm_decode mmio_decode;
 
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 7537e68..4be6715 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -471,11 +471,36 @@ bool kvm_arch_intc_initialized(struct kvm *kvm)
 	return vgic_initialized(kvm);
 }
 
+static void kvm_arm_halt_guest(struct kvm *kvm)
+{
+	int i;
+	struct kvm_vcpu *vcpu;
+
+	kvm_for_each_vcpu(i, vcpu, kvm)
+		vcpu->arch.pause = true;
+	force_vm_exit(cpu_all_mask);
+}
+
+static void kvm_arm_resume_guest(struct kvm *kvm)
+{
+	int i;
+	struct kvm_vcpu *vcpu;
+
+	kvm_for_each_vcpu(i, vcpu, kvm) {
+		wait_queue_head_t *wq = kvm_arch_vcpu_wq(vcpu);
+
+		vcpu->arch.pause = false;
+		wake_up_interruptible(wq);
+	}
+}
+
+
 static void vcpu_pause(struct kvm_vcpu *vcpu)
 {
 	wait_queue_head_t *wq = kvm_arch_vcpu_wq(vcpu);
 
-	wait_event_interruptible(*wq, !vcpu->arch.power_off);
+	wait_event_interruptible(*wq, ((!vcpu->arch.power_off) &&
+				       (!vcpu->arch.pause)));
 }
 
 static int kvm_vcpu_initialized(struct kvm_vcpu *vcpu)
@@ -525,7 +550,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 
 		update_vttbr(vcpu->kvm);
 
-		if (vcpu->arch.power_off)
+		if (vcpu->arch.power_off || vcpu->arch.pause)
 			vcpu_pause(vcpu);
 
 		/*
@@ -551,7 +576,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 			run->exit_reason = KVM_EXIT_INTR;
 		}
 
-		if (ret <= 0 || need_new_vmid_gen(vcpu->kvm)) {
+		if (ret <= 0 || need_new_vmid_gen(vcpu->kvm) ||
+			vcpu->arch.pause) {
 			local_irq_enable();
 			preempt_enable();
 			kvm_vgic_sync_hwstate(vcpu);
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 009da6b..69e3785 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -125,6 +125,9 @@ struct kvm_vcpu_arch {
 	/* vcpu power-off state */
 	bool power_off;
 
+	/* Don't run the guest */
+	bool pause;
+
 	/* IO related fields */
 	struct kvm_decode mmio_decode;
 
-- 
1.9.1
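
The run condition this patch changes can be modelled in a few lines of userspace C. This is a minimal sketch, assuming a plain array of vcpus and no real scheduling; the sketch_* names are hypothetical. The actual patch additionally forces a VM exit on halt and wakes each vcpu's wait queue on resume, which the comments only point at.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two flags kept in struct kvm_vcpu_arch. */
struct sketch_vcpu {
	bool power_off;
	bool pause;
};

/*
 * A vcpu may enter the guest only when it is neither powered off nor
 * paused; this mirrors the condition handed to wait_event_interruptible().
 */
static bool sketch_vcpu_may_run(const struct sketch_vcpu *vcpu)
{
	return !vcpu->power_off && !vcpu->pause;
}

/* Halt: flag every vcpu. The real code then forces all of them to exit. */
static void sketch_halt_guest(struct sketch_vcpu *vcpus, size_t n)
{
	for (size_t i = 0; i < n; i++)
		vcpus[i].pause = true;
}

/* Resume: clear the flag. The real code also wakes each vcpu's wait queue. */
static void sketch_resume_guest(struct sketch_vcpu *vcpus, size_t n)
{
	for (size_t i = 0; i < n; i++)
		vcpus[i].pause = false;
}
```

Note that resuming does not override power_off: a vcpu that was powered off before the halt stays blocked afterwards.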



Re: copy_huge_page: unable to handle kernel NULL pointer dereference at 0000000000000008

2015-07-02 Thread Andrey Korolyov
 But you are very appositely mistaken: copy_huge_page() used to make
 the same mistake, and Dave Hansen fixed it back in v3.13, but the fix
 never went to the stable trees.

 commit 30b0a105d9f7141e4cbf72ae5511832457d89788
 Author: Dave Hansen dave.han...@linux.intel.com
 Date:   Thu Nov 21 14:31:58 2013 -0800

 mm: thp: give transparent hugepage code a separate copy_page

 Right now, the migration code in migrate_page_copy() uses 
 copy_huge_page()
 for hugetlbfs and thp pages:

if (PageHuge(page) || PageTransHuge(page))
 copy_huge_page(newpage, page);

 So, yay for code reuse.  But:

   void copy_huge_page(struct page *dst, struct page *src)
   {
 struct hstate *h = page_hstate(src);

 and a non-hugetlbfs page has no page_hstate().  This works 99% of the
 time because page_hstate() determines the hstate from the page order
 alone.  Since the page order of a THP page matches the default hugetlbfs
 page order, it works.

 But, if you change the default huge page size on the boot command-line
 (say default_hugepagesz=1G), then we might not even *have* a 2MB hstate
 so page_hstate() returns null and copy_huge_page() oopses pretty fast
 since copy_huge_page() dereferences the hstate:

   void copy_huge_page(struct page *dst, struct page *src)
   {
 struct hstate *h = page_hstate(src);
 if (unlikely(pages_per_huge_page(h) > MAX_ORDER_NR_PAGES)) {
   ...

 Mel noticed that the migration code is really the only user of these
 functions.  This moves all the copy code over to migrate.c and makes
 copy_huge_page() work for THP by checking for it explicitly.

 I believe the bug was introduced in commit b32967ff101a ("mm: numa: Add
 THP migration for the NUMA working set scanning fault case")

 [a...@linux-foundation.org: fix coding-style and comment text, per Naoya 
 Horiguchi]
 Signed-off-by: Dave Hansen dave.han...@linux.intel.com
 Acked-by: Mel Gorman mgor...@suse.de
 Reviewed-by: Naoya Horiguchi n-horigu...@ah.jp.nec.com
 Cc: Hillf Danton dhi...@gmail.com
 Cc: Andrea Arcangeli aarca...@redhat.com
 Tested-by: Dave Jiang dave.ji...@intel.com
 Signed-off-by: Andrew Morton a...@linux-foundation.org
 Signed-off-by: Linus Torvalds torva...@linux-foundation.org


 Thanks, the issue is fixed on 3.10 with trivial patch modification.

Ping? 3.10 still misses that..
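
The failure mode described in the quoted commit message can be modelled in userspace C. This is a simplified sketch, not the actual mm/migrate.c code: the sketch_* types and helpers are hypothetical, and the hstate table is reduced to a list of page orders. It shows why a THP page must not be sized through the hstate table when no 2MB hstate is registered.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: one entry per registered hugetlbfs page size. */
struct sketch_hstate {
	unsigned int order;
};

struct sketch_page {
	bool is_hugetlb;	/* what PageHuge() would report */
	unsigned int order;	/* compound page order */
};

/* Looks up the hstate by order alone, returning NULL when none matches. */
static const struct sketch_hstate *
sketch_page_hstate(const struct sketch_hstate *hstates, size_t n,
		   const struct sketch_page *page)
{
	for (size_t i = 0; i < n; i++)
		if (hstates[i].order == page->order)
			return &hstates[i];
	return NULL;
}

/* Number of base pages to copy, or -1 if a hugetlbfs page has no hstate. */
static long sketch_nr_pages_to_copy(const struct sketch_hstate *hstates,
				    size_t n, const struct sketch_page *page)
{
	if (page->is_hugetlb) {
		const struct sketch_hstate *h =
			sketch_page_hstate(hstates, n, page);

		return h ? 1L << h->order : -1;
	}
	/* THP: size comes from the page itself, never from an hstate. */
	return 1L << page->order;
}
```

With only a 1G hstate registered (order 18 with 4K base pages), the lookup for an order-9 THP page returns NULL, which is the pointer the unfixed copy_huge_page() went on to dereference.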


[RFC 05/17] VFIO: platform: implement vfio_device_external_ops callbacks

2015-07-02 Thread Eric Auger
This patch adds the implementation for the 3 external callbacks of the
vfio_device_external_ops struct, namely mask, is_active and
set_automasked. Also vfio_device_ops and vfio_device_external_ops are
set accordingly.

Signed-off-by: Eric Auger eric.au...@linaro.org

---

v6: creation
---
 drivers/vfio/platform/vfio_platform_common.c  |  7 
 drivers/vfio/platform/vfio_platform_irq.c | 49 +++
 drivers/vfio/platform/vfio_platform_private.h | 11 ++
 3 files changed, 67 insertions(+)

diff --git a/drivers/vfio/platform/vfio_platform_common.c b/drivers/vfio/platform/vfio_platform_common.c
index e43efb5..9acfca6 100644
--- a/drivers/vfio/platform/vfio_platform_common.c
+++ b/drivers/vfio/platform/vfio_platform_common.c
@@ -520,6 +520,12 @@ static int vfio_platform_mmap(void *device_data, struct vm_area_struct *vma)
 	return -EINVAL;
 }
 
+static struct vfio_device_external_ops vfio_platform_external_ops = {
+	.mask		= vfio_platform_external_mask,
+	.is_active	= vfio_platform_external_is_active,
+	.set_automasked	= vfio_platform_external_set_automasked,
+};
+
 static const struct vfio_device_ops vfio_platform_ops = {
 	.name		= "vfio-platform",
 	.open		= vfio_platform_open,
@@ -528,6 +534,7 @@ static const struct vfio_device_ops vfio_platform_ops = {
 	.read		= vfio_platform_read,
 	.write		= vfio_platform_write,
 	.mmap		= vfio_platform_mmap,
+	.external_ops	= &vfio_platform_external_ops
 };
 
 int vfio_platform_probe_common(struct vfio_platform_device *vdev,
diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
index 8eb65c1..f6d83ed 100644
--- a/drivers/vfio/platform/vfio_platform_irq.c
+++ b/drivers/vfio/platform/vfio_platform_irq.c
@@ -231,6 +231,55 @@ static int vfio_set_trigger(struct vfio_platform_device *vdev, int index,
 	return 0;
 }
 
+int vfio_platform_external_mask(void *device_data, unsigned index,
+				unsigned start, unsigned count)
+{
+	struct vfio_platform_device *vdev = device_data;
+
+	vfio_platform_mask(&vdev->irqs[index]);
+	return 0;
+}
+
+int vfio_platform_external_is_active(void *device_data, unsigned index,
+				     unsigned start, unsigned count)
+{
+	unsigned long flags;
+	struct vfio_platform_device *vdev = device_data;
+	struct vfio_platform_irq *irq = &vdev->irqs[index];
+	bool active, masked, outstanding;
+	int ret;
+
+	spin_lock_irqsave(&irq->lock, flags);
+
+	ret = irq_get_irqchip_state(irq->hwirq, IRQCHIP_STATE_ACTIVE, &active);
+	BUG_ON(ret);
+	masked = irq->masked;
+	outstanding = active || masked;
+
+	spin_unlock_irqrestore(&irq->lock, flags);
+	return outstanding;
+}
+
+int vfio_platform_external_set_automasked(void *device_data, unsigned index,
+					  unsigned start, unsigned count,
+					  bool automasked)
+{
+	unsigned long flags;
+	struct vfio_platform_device *vdev = device_data;
+	struct vfio_platform_irq *irq = &vdev->irqs[index];
+
+	spin_lock_irqsave(&irq->lock, flags);
+	if (automasked) {
+		irq->flags |= VFIO_IRQ_INFO_AUTOMASKED;
+		irq->handler = vfio_automasked_irq_handler;
+	} else {
+		irq->flags &= ~VFIO_IRQ_INFO_AUTOMASKED;
+		irq->handler = vfio_irq_handler;
+	}
+	spin_unlock_irqrestore(&irq->lock, flags);
+	return 0;
+}
+
 static int vfio_platform_set_irq_trigger(struct vfio_platform_device *vdev,
 					 unsigned index, unsigned start,
 					 unsigned count, uint32_t flags,
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
index 413f575..5f46c68 100644
--- a/drivers/vfio/platform/vfio_platform_private.h
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -90,4 +90,15 @@ extern int vfio_platform_set_irqs_ioctl(struct vfio_platform_device *vdev,
 					unsigned start, unsigned count,
 					void *data);
 
+extern int vfio_platform_external_mask(void *device_data, unsigned index,
+  unsigned start, unsigned count);
+extern int vfio_platform_external_is_active(void *device_data,
+   unsigned index, unsigned start,
+   unsigned count);
+extern int vfio_platform_external_set_automasked(void *device_data,
+unsigned index,
+unsigned start,
+unsigned count,
+bool 

[RFC 03/17] VFIO: Introduce vfio_device_external_ops

2015-07-02 Thread Eric Auger
New bus callbacks are introduced. They correspond to external
functions. To avoid messing up the main vfio_device_ops
struct, a new vfio_device_external_ops struct is introduced.

Signed-off-by: Eric Auger eric.au...@linaro.org

---

v6: creation
---
 include/linux/vfio.h | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index ddb4409..d79e8a9 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -19,6 +19,23 @@
 #include <uapi/linux/vfio.h>
 
 /**
+ * struct vfio_device_external_ops - VFIO bus driver device callbacks
+ * used as external API
+ * @mask: mask any IRQ defined by triplet
+ * @is_active: returns whether any IRQ defined by triplet is active
+ * @set_automasked: sets the automasked flag of triplet's IRQ
+ */
+struct vfio_device_external_ops  {
+	int (*mask)(void *device_data, unsigned index, unsigned start,
+		    unsigned count);
+	int (*is_active)(void *device_data, unsigned index, unsigned start,
+			 unsigned count);
+	int (*set_automasked)(void *device_data, unsigned index,
+			      unsigned start, unsigned count,
+			      bool automasked);
+};
+
+/**
  * struct vfio_device_ops - VFIO bus driver device callbacks
  *
  * @open: Called when userspace creates new file descriptor for device
@@ -42,6 +59,7 @@ struct vfio_device_ops {
 			 unsigned long arg);
 	int	(*mmap)(void *device_data, struct vm_area_struct *vma);
 	void	(*request)(void *device_data, unsigned int count);
+	struct vfio_device_external_ops *external_ops;
 };
 
 extern int vfio_add_group_dev(struct device *dev,
-- 
1.9.1
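
How a caller might consume the new optional ops table can be sketched in userspace C. This is only an illustration: the sketch_* structs mirror the shape of the patch but are hypothetical, and the NULL checks with an -ENODEV return are an assumption about how a caller could treat bus drivers that do not provide external ops (the patch itself does not say).

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the callback table added by the patch. */
struct sketch_external_ops {
	int (*mask)(void *device_data, unsigned index, unsigned start,
		    unsigned count);
	int (*is_active)(void *device_data, unsigned index, unsigned start,
			 unsigned count);
	int (*set_automasked)(void *device_data, unsigned index,
			      unsigned start, unsigned count, bool automasked);
};

/* Main ops struct: the external table is reached through one pointer. */
struct sketch_device_ops {
	const char *name;
	const struct sketch_external_ops *external_ops;
};

/* Demo callback: device_data is assumed to point at a "masked" flag. */
static int sketch_demo_mask(void *device_data, unsigned index, unsigned start,
			    unsigned count)
{
	(void)index; (void)start; (void)count;
	*(bool *)device_data = true;
	return 0;
}

static const struct sketch_external_ops sketch_demo_external_ops = {
	.mask = sketch_demo_mask,
};

/* Caller side: tolerate drivers that leave the table or a callback unset. */
static int sketch_external_mask(const struct sketch_device_ops *ops,
				void *device_data, unsigned index)
{
	if (!ops->external_ops || !ops->external_ops->mask)
		return -ENODEV;
	return ops->external_ops->mask(device_data, index, 0, 1);
}
```

Keeping the external callbacks behind a single pointer leaves the main ops struct unchanged for bus drivers that do not need them, which matches the motivation given in the commit message.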



Re: [PATCH] arm/run: don't enable KVM if system can't do it

2015-07-02 Thread Paolo Bonzini


On 02/07/2015 13:51, Andrew Jones wrote:
 4) I recently mentioned[*] it might be nice to add a '-force-tcg' type
of arm/run command line option, allowing tcg to be used even if
it's possible to use kvm. Adding that at the same time would be
nice.

Can you just use --no-kvm?  It is equivalent to -machine accel=tcg,
and it overrides previous -machine accel=foo options.

Paolo

ps: I also share the yay feeling, of course!


[RFC 01/17] VFIO: platform: test forwarded state when selecting IRQ handler

2015-07-02 Thread Eric Auger
In case the IRQ is forwarded, the VFIO platform IRQ handler does not
need to disable the IRQ anymore.

When setting the IRQ handler we now also test the forwarded state. In
case the IRQ is forwarded we select the vfio_irq_handler.

Signed-off-by: Eric Auger eric.au...@linaro.org

---

v3 -> v4:
- change title

v2 -> v3:
- forwarded state was tested in the handler. Now the forwarded state
  is tested before setting the handler. This definitely limits
  how dynamically the forwarded state can change, but I don't think
  there is a use case where we need to change the state at any time.

Conflicts:
drivers/vfio/platform/vfio_platform_irq.c
---
 drivers/vfio/platform/vfio_platform_irq.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/platform/vfio_platform_irq.c b/drivers/vfio/platform/vfio_platform_irq.c
index 88bba57..132bb3f 100644
--- a/drivers/vfio/platform/vfio_platform_irq.c
+++ b/drivers/vfio/platform/vfio_platform_irq.c
@@ -229,8 +229,13 @@ static int vfio_platform_set_irq_trigger(struct vfio_platform_device *vdev,
 {
 	struct vfio_platform_irq *irq = &vdev->irqs[index];
 	irq_handler_t handler;
+	struct irq_data *d;
+	bool is_forwarded;
 
-	if (vdev->irqs[index].flags & VFIO_IRQ_INFO_AUTOMASKED)
+	d = irq_get_irq_data(irq->hwirq);
+	is_forwarded = irqd_irq_forwarded(d);
+
+	if (vdev->irqs[index].flags & VFIO_IRQ_INFO_AUTOMASKED && !is_forwarded)
 		handler = vfio_automasked_irq_handler;
 	else
 		handler = vfio_irq_handler;
-- 
1.9.1
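
The selection rule in this patch boils down to a small truth table, sketched here in userspace C. The sketch_* names are hypothetical and the flag value is a placeholder rather than the real VFIO_IRQ_INFO_AUTOMASKED constant; only the logic is taken from the diff above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit, not the real uapi value of VFIO_IRQ_INFO_AUTOMASKED. */
#define SKETCH_IRQ_INFO_AUTOMASKED (1u << 0)

enum sketch_handler {
	SKETCH_HANDLER_PLAIN,		/* stands for vfio_irq_handler */
	SKETCH_HANDLER_AUTOMASKED,	/* stands for vfio_automasked_irq_handler */
};

/*
 * The automasked handler is chosen only when the IRQ is flagged
 * automasked AND is not forwarded; a forwarded IRQ always gets the
 * plain handler because it no longer needs to be disabled on delivery.
 */
static enum sketch_handler sketch_select_handler(uint32_t flags,
						 bool is_forwarded)
{
	if ((flags & SKETCH_IRQ_INFO_AUTOMASKED) && !is_forwarded)
		return SKETCH_HANDLER_AUTOMASKED;
	return SKETCH_HANDLER_PLAIN;
}
```

Since the choice is made once when the trigger is set, a later change of the forwarded state would not be picked up, which is the limitation the v2 -> v3 changelog acknowledges.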



Re: [RFC 13/17] KVM: introduce kvm_arch functions for IRQ bypass

2015-07-02 Thread Paolo Bonzini


On 02/07/2015 15:17, Eric Auger wrote:
 +#ifdef CONFIG_IRQ_BYPASS_MANAGER

Please use a separate symbol CONFIG_KVM_HAVE_IRQ_BYPASS.

 +void kvm_arch_add_producer(struct irq_bypass_consumer *,
 +struct irq_bypass_producer *);

add_irq_bypass_producer, and so on below.

Paolo

 +void kvm_arch_del_producer(struct irq_bypass_consumer *,
 +struct irq_bypass_producer *);
 +void kvm_arch_stop_consumer(struct irq_bypass_consumer *);
 +void kvm_arch_resume_consumer(struct irq_bypass_consumer *);
 +
 +#else
 +void kvm_arch_add_producer(struct irq_bypass_consumer *,
 +struct irq_bypass_producer *)
 +{
 +}
 +void kvm_arch_del_producer(struct irq_bypass_consumer *,
 +struct irq_bypass_producer *)
 +{
 +}
 +void kvm_arch_stop_consumer(struct irq_bypass_consumer *)
 +{
 +}
 +void kvm_arch_resume_consumer(struct irq_bypass_consumer *)
 +{
 +}


Re: [PATCH] arm/run: don't enable KVM if system can't do it

2015-07-02 Thread Andrew Jones
On Thu, Jul 02, 2015 at 12:05:31PM +0100, Alex Bennée wrote:
 As ARM (and no doubt other systems) can also run tests in pure TCG mode
 we might as well not bother enabling accel=kvm if we aren't on a real
 ARM based system. This prevents us seeing ugly warning messages when
 testing TCG.

First,
YAY! We're getting contributions to kvm-unit-tests/arm!

 
 Signed-off-by: Alex Bennée alex.ben...@linaro.org
 ---
  arm/run | 8 +++-
  1 file changed, 7 insertions(+), 1 deletion(-)
 
 diff --git a/arm/run b/arm/run
 index 662a856..2bdb4be 100755
 --- a/arm/run
 +++ b/arm/run
 @@ -33,7 +33,13 @@ if $qemu $M -chardev testdev,id=id -initrd . 2>&1 \
   exit 2
  fi
  
 -M='-machine virt,accel=kvm:tcg'
 +host=`uname -m | sed -e 's/arm.*/arm/'`
 +if [ ${host} = arm ] || [ ${host} = aarch64 ]; then
 +M='-machine virt,accel=kvm:tcg'
 +else
 +M='-machine virt,accel=tcg'
 +fi

I think this is a good idea, although I had actually left that warning
on purpose. Originally, the plan was for these unit tests to be kvm
specific. If they could be developed with the aid of tcg, and even used
to test tcg, then fine, but running them on tcg should always complain,
in order to make sure that the test output clearly showed that it had
not been running on kvm. Developing unit tests for tcg is also a good
idea though, and there's really no reason not to share this framework.

So, for this patch I'd prefer we do a few things differently;

1) we should be able to integrate this new condition with the
   "arm64 must use '-cpu host' with kvm" condition that is lower down.
   And, let's just make this $HOST variable one that ./configure
   prepares, allowing that arm64 condition to s/$(arch)/$HOST/ and
   avoiding the need to duplicate the sed -e 's/arm.*/arm/'

2) we might as well do something like

   M='-machine virt'
   if using-kvm
 M+=',accel=kvm'
   else
 M+=',accel=tcg'
   fi

   now, since we don't want to use the accel fallback feature anymore

3) outputting which one we're using might still be nice, otherwise
   one must inspect the qemu command line in the logs to find out

4) I recently mentioned[*] it might be nice to add a '-force-tcg' type
   of arm/run command line option, allowing tcg to be used even if
   it's possible to use kvm. Adding that at the same time would be
   nice.

5) we use tabs for indentation in arm/run, and only bother with the
   variable's {}, if necessary

6) we should post patches with [kvm-unit-tests PATCH] to avoid
   confusion with other kvm postings. (I screwed that up on my
   last two postings...).

Thanks!
drew

[*] https://lists.gnu.org/archive/html/qemu-devel/2015-06/msg07514.html

 +
  chr_testdev='-device virtio-serial-device'
  chr_testdev+=' -device virtconsole,chardev=ctd -chardev testdev,id=ctd'
  
 -- 
 2.4.5
 


[PATCH v4 0/2] vhost: support more than 64 memory regions

2015-07-02 Thread Igor Mammedov
changes since v3:
  * rebased on top of vhost-next branch
changes since v2:
  * drop cache patches for now as suggested
  * add max_mem_regions module parameter instead of unconditionally
increasing limit
  * drop bsearch patch since it's already queued

References to previous versions:
v2: https://lkml.org/lkml/2015/6/17/276
v1: http://www.spinics.net/lists/kvm/msg117654.html

This series makes vhost's limit on the number of memory regions tunable.

It fixes VM crashes on memory hotplug caused by vhost refusing to
accept more than 64 memory regions, with max_mem_regions
set to more than 262 slots in the default QEMU configuration.

Igor Mammedov (2):
  vhost: extend memory regions allocation to vmalloc
  vhost: add max_mem_regions module parameter

 drivers/vhost/vhost.c | 28 ++--
 1 file changed, 22 insertions(+), 6 deletions(-)

-- 
1.8.3.1



Re: [PATCH] arm/run: don't enable KVM if system can't do it

2015-07-02 Thread Alex Bennée

Andrew Jones drjo...@redhat.com writes:

 On Thu, Jul 02, 2015 at 12:05:31PM +0100, Alex Bennée wrote:
 As ARM (and no doubt other systems) can also run tests in pure TCG mode
 we might as well not bother enabling accel=kvm if we aren't on a real
 ARM based system. This prevents us seeing ugly warning messages when
 testing TCG.

 First,
 YAY! We're getting contributions to kvm-unit-tests/arm!

:-) well so far I've been noodling about looking at it for KVM Guest
Debug testing. I've a hideous branch on github that attempts to
exercise the debug register trapping code. However that falls down as I
really need to find an easy way of attaching GDB to the qemu-gdb stub
while the test is running.

However with the TCG multi-thread work coming up I certainly see the
need to exercise QEMU in a way that the internal TCG test code might
have trouble with.


 
 Signed-off-by: Alex Bennée alex.ben...@linaro.org
 ---
  arm/run | 8 +++-
  1 file changed, 7 insertions(+), 1 deletion(-)
 
 diff --git a/arm/run b/arm/run
 index 662a856..2bdb4be 100755
 --- a/arm/run
 +++ b/arm/run
 @@ -33,7 +33,13 @@ if $qemu $M -chardev testdev,id=id -initrd . 2>&1 \
  exit 2
  fi
  
 -M='-machine virt,accel=kvm:tcg'
 +host=`uname -m | sed -e 's/arm.*/arm/'`
 +if [ ${host} = arm ] || [ ${host} = aarch64 ]; then
 +M='-machine virt,accel=kvm:tcg'
 +else
 +M='-machine virt,accel=tcg'
 +fi

 I think this is a good idea, although I had actually left that warning
 on purpose. Originally, the plan was for these unit tests to be kvm
 specific. If they could be developed with the aid of tcg, and even used
 to test tcg, then fine, but running them on tcg should always complain,
 in order to make sure that the test output clearly showed that it had
 not been running on kvm. Developing unit tests for tcg is also a good
 idea though, and there's really no reason not to share this framework.

 So, for this patch I'd prefer we do a few things differently;

 1) we should be able to integrate this new condition with the
arm64 must use '-cpu host' with kvm condition that is lower down.
And, let's just make this $HOST variable one that ./configure
prepares, allowing that arm64 condition to s/$(arch)/$HOST/ and
avoiding the need to duplicate the sed -e 's/arm.*/arm/'

Yeah makes sense.


 2) we might as well do something like

M='-machine virt'
if using-kvm
  M+=',accel=kvm'
else
  M+=',accel=tcg'
fi

now, since we don't want to use the accel fallback feature anymore

 3) outputting which one we're using might still be nice, otherwise
one must inspect the qemu command line in the logs to find out

 4) I recently mentioned[*] it might be nice to add a '-force-tcg' type
of arm/run command line option, allowing tcg to be used even if
it's possible to use kvm. Adding that at the same time would be
nice.

Would it also be useful for other arches? Does run-tests.sh pass 

 5) we use tabs for indentation in arm/run, and only bother with the
variable's {}, if necessary

My shell quoting was rusty. I think $(host) was calling the host command
for some reason.


 6) we should post patches with [kvm-unit-tests PATCH] to avoid
confusion with other kvm postings. (I screwed that up on my
last two postings...).

/me ponders if he can just config git for that.

I'll patch the readme ;-)


 Thanks!
 drew

 [*] https://lists.gnu.org/archive/html/qemu-devel/2015-06/msg07514.html

 +
  chr_testdev='-device virtio-serial-device'
  chr_testdev+=' -device virtconsole,chardev=ctd -chardev testdev,id=ctd'
  
 -- 
 2.4.5
 

-- 
Alex Bennée


Re: [RFC 12/17] irq: bypass: Extend skeleton for ARM forwarding control

2015-07-02 Thread Paolo Bonzini


On 02/07/2015 15:17, Eric Auger wrote:
 - new fields are added on producer side: linux irq, vfio_device handle,
   active which reflects whether the source is active (at interrupt
   controller level or at VFIO level - automasked -) and finally an
   opaque pointer which will be used to point to the vfio_platform_device
   in this series.

Linux IRQ and active should be okay.  As to the vfio_device handle, you
should link it from the vfio_platform_device instead.  And for the
vfio_platform_device, you can link it from the vfio_platform_irq instead.

Once you've done this, embed the irq_bypass_producer struct in the
vfio_platform_irq struct; in the new kvm_arch_* functions, go back to
the vfio_platform_irq struct via container_of.  From there you can
retrieve pointers to the vfio_platform_device and the vfio_device.

 - new fields on consumer side: the kvm handle, the gsi

You do not need to add these.  Instead, add the kvm handle to irqfd
only.  Like above, embed the irq_bypass_consumer struct in the irqfd
struct; in the new kvm_arch_* functions, go back to the
vfio_platform_irq struct via container_of.

Paolo
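
The embedding approach suggested in this review can be sketched in userspace C. This is a minimal illustration, not the series' code: the sketch_* structs are hypothetical, and sketch_container_of is a simplified version of the kernel's container_of macro without its type checking. It shows how a callback that only receives the embedded producer can get back to the enclosing IRQ struct and, from there, to the device.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's container_of(), no type checking. */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical cut-down versions of the structs discussed in the review. */
struct sketch_producer {
	void *token;
	int irq;
};

struct sketch_platform_device {
	const char *name;
};

struct sketch_platform_irq {
	int hwirq;
	struct sketch_platform_device *vdev;	/* back-link to the device */
	struct sketch_producer producer;	/* embedded, not pointed to */
};

/* Given only the embedded producer, recover the enclosing IRQ struct. */
static struct sketch_platform_irq *
sketch_irq_from_producer(struct sketch_producer *producer)
{
	return sketch_container_of(producer, struct sketch_platform_irq,
				   producer);
}
```

With this layout the generic producer struct needs no opaque pointer or device handle of its own; everything device specific is reachable from the container.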


Re: [PATCH v7 09/11] KVM: arm64: guest debug, HW assisted debug support

2015-07-02 Thread Alex Bennée

Will Deacon will.dea...@arm.com writes:

Are you happy with this?:

Subject: [PATCH v8 09/11] KVM: arm64: guest debug, HW assisted debug support

This adds support for userspace to control the HW debug registers for
guest debug. In the debug ioctl we copy an IMPDEF number of registers
into a new register set called host_debug_state.

We use the recently introduced vcpu parameter debug_ptr to select which
register set is copied into the real registers when world switch occurs.

I've made some helper functions from hw_breakpoint.c more widely
available for re-use.

As with single step we need to tweak the guest registers to enable the
exceptions so we need to save and restore those bits.

Two new capabilities have been added to the KVM_EXTENSION ioctl to allow
userspace to query the number of hardware break and watch points
available on the host hardware.

Signed-off-by: Alex Bennée alex.ben...@linaro.org
Reviewed-by: Christoffer Dall christoffer.d...@linaro.org

---
v2
   - switched to C setup
   - replace host debug registers directly into context
   - minor tweak to api docs
   - setup right register for debug
   - add FAR_EL2 to debug exit structure
   - add support for trapping debug register access
v3
   - remove stray trace statement
   - fix spacing around operators (various)
   - clean-up usage of trap_debug
   - introduce debug_ptr, replace excessive memcpy stuff
   - don't use memcpy in ioctl, just assign
   - update cap ioctl documentation
   - reword a number comments
   - rename host_debug_state->external_debug_state
v4
   - use the new u32/u64 split debug_ptr approach
   - fix some wording/comments
v5
   - don't set MDSCR_EL1.KDE (not needed)
v6
   - update wording given change in commentary
   - KVM_GUESTDBG_USE_HW_BP->KVM_GUESTDBG_USE_HW
v7
   - fix merge conflicts from ioctl move to guest.c
   - use kvm_arm_reset_debug_ptr to reset ptr
   - a BUG_ON() test has been added to trap failure to reset debug_ptr
   - debugging->debug in kvm_host.h comment
   - s/defined// s/to// in commit msg
   - rm ref to introducing debug_ptr in commit msg
   - add r-b tag
v8
   - use hw_breakpoint_slots() instead
---
 Documentation/virtual/kvm/api.txt |  7 ++-
 arch/arm64/include/asm/kvm_host.h |  6 +-
 arch/arm64/kvm/debug.c| 40 ++-
 arch/arm64/kvm/guest.c|  7 +++
 arch/arm64/kvm/handle_exit.c  |  6 ++
 arch/arm64/kvm/reset.c| 13 +
 arch/arm64/kvm/sys_regs.c |  3 ---
 include/uapi/linux/kvm.h  |  2 ++
 8 files changed, 74 insertions(+), 10 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 33c8143..ada57df 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -2668,7 +2668,7 @@ The top 16 bits of the control field are architecture specific control
 flags which can include the following:
 
   - KVM_GUESTDBG_USE_SW_BP: using software breakpoints [x86, arm64]
-  - KVM_GUESTDBG_USE_HW_BP: using hardware breakpoints [x86, s390]
+  - KVM_GUESTDBG_USE_HW_BP: using hardware breakpoints [x86, s390, arm64]
   - KVM_GUESTDBG_INJECT_DB: inject DB type exception [x86]
   - KVM_GUESTDBG_INJECT_BP: inject BP type exception [x86]
   - KVM_GUESTDBG_EXIT_PENDING:  trigger an immediate guest exit [s390]
@@ -2683,6 +2683,11 @@ updated to the correct (supplied) values.
 The second part of the structure is architecture specific and
 typically contains a set of debug registers.
 
+For arm64 the number of debug registers is implementation defined and
+can be determined by querying the KVM_CAP_GUEST_DEBUG_HW_BPS and
+KVM_CAP_GUEST_DEBUG_HW_WPS capabilities which return a positive number
+indicating the number of supported registers.
+
 When debug events exit the main run loop with the reason
 KVM_EXIT_DEBUG with the kvm_debug_exit_arch part of the kvm_run
 structure containing architecture specific debug information.
diff --git a/arch/arm64/include/asm/kvm_host.h 
b/arch/arm64/include/asm/kvm_host.h
index 461d288..6c745e0 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -116,13 +116,17 @@ struct kvm_vcpu_arch {
 * debugging the guest from the host and to maintain separate host and
 * guest state during world switches. vcpu_debug_state are the debug
 * registers of the vcpu as the guest sees them.  host_debug_state are
-* the host registers which are saved and restored during world switches.
+* the host registers which are saved and restored during
+* world switches. external_debug_state contains the debug
+* values we want to debug the guest. This is set via the
+* KVM_SET_GUEST_DEBUG ioctl.
 *
 * debug_ptr points to the set of debug registers that should be loaded
 * onto the hardware when running the guest.
 */
struct kvm_guest_debug_arch *debug_ptr;
struct 

Re: [PATCH 7/7] KVM: arm: implement kvm_set_msi by gsi direct mapping

2015-07-02 Thread Eric Auger
Hi Pavel,
On 07/02/2015 09:53 AM, Pavel Fedin wrote:
  Hello!
 
 -Original Message-
 From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On Behalf 
 Of Eric Auger
 Sent: Monday, June 29, 2015 6:37 PM
 To: eric.au...@st.com; eric.au...@linaro.org; 
 linux-arm-ker...@lists.infradead.org;
 marc.zyng...@arm.com; christoffer.d...@linaro.org; andre.przyw...@arm.com;
 kvm...@lists.cs.columbia.edu; kvm@vger.kernel.org
 Cc: linux-ker...@vger.kernel.org; patc...@linaro.org; p.fe...@samsung.com; 
 pbonz...@redhat.com
 Subject: [PATCH 7/7] KVM: arm: implement kvm_set_msi by gsi direct mapping

 If the ITS modality is not available, let's simply support MSI
 injection by transforming the MSI.data into an SPI ID.

 This makes it possible to use the KVM_SIGNAL_MSI ioctl for arm too.

 Signed-off-by: Eric Auger eric.au...@linaro.org
 ---
  arch/arm/kvm/Kconfig | 1 +
  virt/kvm/arm/vgic.c  | 5 +
  2 files changed, 6 insertions(+)

 diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
 index 151e710..0f58baf 100644
 --- a/arch/arm/kvm/Kconfig
 +++ b/arch/arm/kvm/Kconfig
 @@ -31,6 +31,7 @@ config KVM
  select KVM_VFIO
  select HAVE_KVM_EVENTFD
  select HAVE_KVM_IRQFD
 +select HAVE_KVM_MSI
  select HAVE_KVM_IRQCHIP
  select HAVE_KVM_IRQ_ROUTING
  depends on ARM_VIRT_EXT && ARM_LPAE && ARM_ARCH_TIMER
 diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
 index 0b4c48c..b3c10dc 100644
 --- a/virt/kvm/arm/vgic.c
 +++ b/virt/kvm/arm/vgic.c
 @@ -2314,6 +2314,11 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
          return kvm->arch.vgic.vm_ops.inject_msi(kvm, &msi);
      else
          return -ENODEV;
 +case KVM_IRQ_ROUTING_MSI:
 +    if (kvm->arch.vgic.vm_ops.inject_msi)
 +        return -EINVAL;
 +    else
 +        return kvm_vgic_inject_irq(kvm, 0, e->msi.data, level);
 
  Given the API change I suggest (using the KVM_MSI_VALID_DEVID flag), we could get
 rid of all these if()'s here. Just forward all parameters to the vGIC
 implementation code and let it do its checks.
I don't understand this comment. What is used here is the kernel struct
(struct kvm_kernel_irq_routing_entry), not the user one
(kvm_irq_routing_entry), and the kernel struct does not have the flags field.
That is another reason, I think, to keep using the type, for homogeneity.
Note that in the kernel struct the devid is passed in
kvm_extended_msi, as you suggested for the user-space struct.

Thanks

Eric
 
  default:
  return -EINVAL;
  }
 --
 1.9.1

 
 Kind regards,
 Pavel Fedin
 Expert Engineer
 Samsung Electronics Research center Russia
 
 



Re: [PATCH 1/7] KVM: api: add kvm_irq_routing_extended_msi

2015-07-02 Thread Andre Przywara
Hi Eric,

On 02/07/15 15:49, Eric Auger wrote:
 Hi Pavel,
 On 07/02/2015 09:26 AM, Pavel Fedin wrote:
  Hello!

 -Original Message-
 From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On 
 Behalf Of Eric Auger
 Sent: Monday, June 29, 2015 6:37 PM
 To: eric.au...@st.com; eric.au...@linaro.org; 
 linux-arm-ker...@lists.infradead.org;
 marc.zyng...@arm.com; christoffer.d...@linaro.org; andre.przyw...@arm.com;
 kvm...@lists.cs.columbia.edu; kvm@vger.kernel.org
 Cc: linux-ker...@vger.kernel.org; patc...@linaro.org; p.fe...@samsung.com; 
 pbonz...@redhat.com
 Subject: [PATCH 1/7] KVM: api: add kvm_irq_routing_extended_msi

 On ARM, the MSI msg (address and data) comes along with
 out-of-band device ID information. The device ID encodes the device
 that composes the MSI msg. Let's create a new routing entry type,
 dubbed KVM_IRQ_ROUTING_EXTENDED_MSI and use the __u32 pad space
 to convey the device ID.

 Signed-off-by: Eric Auger eric.au...@linaro.org

 ---

 RFC - PATCH
 - remove kvm_irq_routing_extended_msi and use union instead
 ---
  Documentation/virtual/kvm/api.txt | 9 -
  include/uapi/linux/kvm.h  | 6 +-
  2 files changed, 13 insertions(+), 2 deletions(-)

 diff --git a/Documentation/virtual/kvm/api.txt 
 b/Documentation/virtual/kvm/api.txt
 index d20fd94..6426ae9 100644
 --- a/Documentation/virtual/kvm/api.txt
 +++ b/Documentation/virtual/kvm/api.txt
 @@ -1414,7 +1414,10 @@ struct kvm_irq_routing_entry {
 __u32 gsi;
 __u32 type;
 __u32 flags;
 -   __u32 pad;
 +   union {
 +   __u32 pad;
 +   __u32 devid;
 +   };
 union {
 struct kvm_irq_routing_irqchip irqchip;
 struct kvm_irq_routing_msi msi;

  devid is actually a part of MSI bunch. Shouldn't it be a part of struct 
 kvm_irq_routing_msi then?
 It also has reserved pad.
 Well this makes sense to me to associate the devid to the msi and put
 devid in the pad field of struct kvm_irq_routing_msi.
 
 André, Christoffer, would you agree on this change? - I would like to
 avoid doing/undoing things ;-) -

Yes, that makes sense to me. TBH I haven't had a closer look at the
patches yet, but clearly devid belongs in struct kvm_irq_routing_msi.


 @@ -1427,6 +1430,10 @@ struct kvm_irq_routing_entry {
  #define KVM_IRQ_ROUTING_IRQCHIP 1
  #define KVM_IRQ_ROUTING_MSI 2
  #define KVM_IRQ_ROUTING_S390_ADAPTER 3
 +#define KVM_IRQ_ROUTING_EXTENDED_MSI 4
 +
 +In case of KVM_IRQ_ROUTING_EXTENDED_MSI routing type, devid is used to 
 convey
 +the device ID.

  No flags are specified so far, the corresponding field must be set to zero.

 What if we use KVM_MSI_VALID_DEVID flag instead of new 
 KVM_IRQ_ROUTING_EXTENDED_MSI definition? I
 believe this would make an API more consistent and introduce less new 
 definitions.
 do you mean using type == KVM_IRQ_ROUTING_MSI and flag ==
 KVM_MSI_VALID_DEVID? Not sure this is simpler/clearer. s390 paved the
 way for new routing entry types. I add a new one here.

I tend to agree with Pavel's solution. When hacking IRQ routing support
into kvmtool I saw that it's nasty being forced to differentiate between
the two MSI routing types. Actually userland should be able to query the
kernel about what kind of routing it requires. Also there is the issue
that we must _not_ set the flag on x86, since that breaks older kernels
(due to that check that Eric removes in 3/7).
So from my point of view the cleanest solution would be to always use
KVM_IRQ_ROUTING_MSI, and add the device ID if the kernel needs it (true
for ITS guests, false for GICv2M, x86, ...)
I am looking for a clever solution for this now.
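
To make the idea concrete, here is a minimal userland-side sketch of a single MSI routing type that carries the device ID only when the kernel needs it. Everything prefixed sketch_/SKETCH_ is hypothetical: the structs are local stand-ins for the layout discussed in this thread (devid sharing the pad word of the MSI routing struct), the flag value is an assumption, and how userland learns that the kernel wants a device ID is left open here, as it is in the thread.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT the definitions from <linux/kvm.h>. */
#define SKETCH_IRQ_ROUTING_MSI  2          /* same value as KVM_IRQ_ROUTING_MSI in the patch */
#define SKETCH_MSI_VALID_DEVID  (1u << 0)  /* assumed flag value */

struct sketch_routing_msi {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
	union {
		uint32_t pad;    /* legacy users leave this zero */
		uint32_t devid;  /* only meaningful with SKETCH_MSI_VALID_DEVID */
	};
};

struct sketch_routing_entry {
	uint32_t gsi;
	uint32_t type;
	uint32_t flags;
	uint32_t pad;
	union {
		struct sketch_routing_msi msi;
		uint32_t pad[8];
	} u;
};

/* Fill one MSI route; the device ID and flag are set only on request. */
static void sketch_fill_msi_route(struct sketch_routing_entry *e, uint32_t gsi,
				  uint64_t addr, uint32_t data,
				  int kernel_needs_devid, uint32_t devid)
{
	memset(e, 0, sizeof(*e));
	e->gsi = gsi;
	e->type = SKETCH_IRQ_ROUTING_MSI;
	e->u.msi.address_lo = (uint32_t)addr;
	e->u.msi.address_hi = (uint32_t)(addr >> 32);
	e->u.msi.data = data;
	if (kernel_needs_devid) {
		e->flags = SKETCH_MSI_VALID_DEVID;
		e->u.msi.devid = devid;
	}
}
```

With this shape, callers that never pass a device ID (GICv2M, x86) keep producing exactly the entries they produce today, with flags and pad left at zero.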

Cheers,
Andre.

 
 Another solution may be to use new KVM_IRQ_ROUTING_EXTENDED_MSI type and
 add struct kvm_msi ext_msi in kvm_irq_routing_entry union. It is 8 words
 as well. But most probably this is even uglier.

 
 Let's see if this thread is heading to a consensus...
 
 Best Regards
 
 Eric


 diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
 index 2a23705..8484681 100644
 --- a/include/uapi/linux/kvm.h
 +++ b/include/uapi/linux/kvm.h
 @@ -841,12 +841,16 @@ struct kvm_irq_routing_s390_adapter {
  #define KVM_IRQ_ROUTING_IRQCHIP 1
  #define KVM_IRQ_ROUTING_MSI 2
  #define KVM_IRQ_ROUTING_S390_ADAPTER 3
 +#define KVM_IRQ_ROUTING_EXTENDED_MSI 4

  struct kvm_irq_routing_entry {
 __u32 gsi;
 __u32 type;
 __u32 flags;
 -   __u32 pad;
 +   union {
 +   __u32 pad;
 +   __u32 devid;
 +   };
 union {
 struct kvm_irq_routing_irqchip irqchip;
 struct kvm_irq_routing_msi msi;
 --
 1.9.1


 Kind regards,
 Pavel Fedin
 Expert Engineer
 Samsung Electronics Research center Russia

 

Re: [PATCH 1/7] KVM: api: add kvm_irq_routing_extended_msi

2015-07-02 Thread Eric Auger
Hi Andre,
On 07/02/2015 05:14 PM, Andre Przywara wrote:
 Hi Eric,
 
 On 02/07/15 15:49, Eric Auger wrote:
 Hi Pavel,
 On 07/02/2015 09:26 AM, Pavel Fedin wrote:
  Hello!

 -Original Message-
 From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On 
 Behalf Of Eric Auger
 Sent: Monday, June 29, 2015 6:37 PM
 To: eric.au...@st.com; eric.au...@linaro.org; 
 linux-arm-ker...@lists.infradead.org;
 marc.zyng...@arm.com; christoffer.d...@linaro.org; andre.przyw...@arm.com;
 kvm...@lists.cs.columbia.edu; kvm@vger.kernel.org
 Cc: linux-ker...@vger.kernel.org; patc...@linaro.org; p.fe...@samsung.com; 
 pbonz...@redhat.com
 Subject: [PATCH 1/7] KVM: api: add kvm_irq_routing_extended_msi

 On ARM, the MSI msg (address and data) comes along with
 out-of-band device ID information. The device ID encodes the device
 that composes the MSI msg. Let's create a new routing entry type,
 dubbed KVM_IRQ_ROUTING_EXTENDED_MSI and use the __u32 pad space
 to convey the device ID.

 Signed-off-by: Eric Auger eric.au...@linaro.org

 ---

 RFC - PATCH
 - remove kvm_irq_routing_extended_msi and use union instead
 ---
  Documentation/virtual/kvm/api.txt | 9 -
  include/uapi/linux/kvm.h  | 6 +-
  2 files changed, 13 insertions(+), 2 deletions(-)

 diff --git a/Documentation/virtual/kvm/api.txt 
 b/Documentation/virtual/kvm/api.txt
 index d20fd94..6426ae9 100644
 --- a/Documentation/virtual/kvm/api.txt
 +++ b/Documentation/virtual/kvm/api.txt
 @@ -1414,7 +1414,10 @@ struct kvm_irq_routing_entry {
__u32 gsi;
__u32 type;
__u32 flags;
 -  __u32 pad;
 +  union {
 +  __u32 pad;
 +  __u32 devid;
 +  };
union {
struct kvm_irq_routing_irqchip irqchip;
struct kvm_irq_routing_msi msi;

  devid is actually a part of MSI bunch. Shouldn't it be a part of struct 
 kvm_irq_routing_msi then?
 It also has reserved pad.
 Well this makes sense to me to associate the devid to the msi and put
 devid in the pad field of struct kvm_irq_routing_msi.

 André, Christoffer, would you agree on this change? - I would like to
 avoid doing/undoing things ;-) -
 
 Yes, that makes sense to me. TBH I haven't had a closer look at the
 patches yet, but clearly devid belongs into struct kvm_irq_routing_msi.
thanks for your quick reply.
OK so let's go with that change.
 

 @@ -1427,6 +1430,10 @@ struct kvm_irq_routing_entry {
  #define KVM_IRQ_ROUTING_IRQCHIP 1
  #define KVM_IRQ_ROUTING_MSI 2
  #define KVM_IRQ_ROUTING_S390_ADAPTER 3
 +#define KVM_IRQ_ROUTING_EXTENDED_MSI 4
 +
 +In case of KVM_IRQ_ROUTING_EXTENDED_MSI routing type, devid is used to 
 convey
 +the device ID.

  No flags are specified so far, the corresponding field must be set to 
 zero.

 What if we use KVM_MSI_VALID_DEVID flag instead of new 
 KVM_IRQ_ROUTING_EXTENDED_MSI definition? I
 believe this would make an API more consistent and introduce less new 
 definitions.
 do you mean using type == KVM_IRQ_ROUTING_MSI and flag ==
 KVM_MSI_VALID_DEVID? Not sure this is simpler/clearer. s390 paved the
 way for new routing entry types. I add a new one here.
 
 I tend to agree with Pavel's solution. When hacking IRQ routing support
 into kvmtool I saw that it's nasty being forced to differentiate between
 the two MSI routing types. Actually userland should be able to query the
 kernel about what kind of routing it requires. Also there is the issue
 that we must _not_ set the flag on x86, since that breaks older kernels
 (due to that check that Eric removes in 3/7).
 So from my point of view the cleanest solution would be to always use
 KVM_IRQ_ROUTING_MSI, and add the device ID if the kernel needs it (true
 for ITS guests, false for GICv2M, x86, ...)
 I am looking for a clever solution for this now.
OK thanks for sharing. I need some more time to study qemu code too.

- Eric

 
 Cheers,
 Andre.
 

 Another solution may be to use new KVM_IRQ_ROUTING_EXTENDED_MSI type and
 add struct kvm_msi ext_msi in kvm_irq_routing_entry union. It is 8 words
 as well. But most probably this is even uglier.
 

 Let's see if this thread is heading to a consensus...

 Best Regards

 Eric


 diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
 index 2a23705..8484681 100644
 --- a/include/uapi/linux/kvm.h
 +++ b/include/uapi/linux/kvm.h
 @@ -841,12 +841,16 @@ struct kvm_irq_routing_s390_adapter {
  #define KVM_IRQ_ROUTING_IRQCHIP 1
  #define KVM_IRQ_ROUTING_MSI 2
  #define KVM_IRQ_ROUTING_S390_ADAPTER 3
 +#define KVM_IRQ_ROUTING_EXTENDED_MSI 4

  struct kvm_irq_routing_entry {
__u32 gsi;
__u32 type;
__u32 flags;
 -  __u32 pad;
 +  union {
 +  __u32 pad;
 +  __u32 devid;
 +  };
union {
struct kvm_irq_routing_irqchip irqchip;
struct kvm_irq_routing_msi msi;
 --
 1.9.1


 Kind regards,
 

Re: [PATCH 1/7] KVM: api: add kvm_irq_routing_extended_msi

2015-07-02 Thread Eric Auger
On 07/02/2015 05:39 PM, Pavel Fedin wrote:
  Hello!
 
 OK thanks for sharing. I need some more time to study qemu code too.
 
  I am currently working on supporting this in qemu. Not ready yet, need some 
 time. But, with API i
 suggest, things are really much-much simpler.

OK so both of you say the same thing. Will respin accordingly

Eric
 
 Kind regards,
 Pavel Fedin
 Expert Engineer
 Samsung Electronics Research center Russia
 
 



Re: [PATCH 1/7] KVM: api: add kvm_irq_routing_extended_msi

2015-07-02 Thread Eric Auger
Hi Pavel,
On 07/02/2015 09:26 AM, Pavel Fedin wrote:
  Hello!
 
 -Original Message-
 From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On Behalf 
 Of Eric Auger
 Sent: Monday, June 29, 2015 6:37 PM
 To: eric.au...@st.com; eric.au...@linaro.org; 
 linux-arm-ker...@lists.infradead.org;
 marc.zyng...@arm.com; christoffer.d...@linaro.org; andre.przyw...@arm.com;
 kvm...@lists.cs.columbia.edu; kvm@vger.kernel.org
 Cc: linux-ker...@vger.kernel.org; patc...@linaro.org; p.fe...@samsung.com; 
 pbonz...@redhat.com
 Subject: [PATCH 1/7] KVM: api: add kvm_irq_routing_extended_msi

 On ARM, the MSI msg (address and data) comes along with
 out-of-band device ID information. The device ID encodes the device
 that composes the MSI msg. Let's create a new routing entry type,
 dubbed KVM_IRQ_ROUTING_EXTENDED_MSI and use the __u32 pad space
 to convey the device ID.

 Signed-off-by: Eric Auger eric.au...@linaro.org

 ---

 RFC - PATCH
 - remove kvm_irq_routing_extended_msi and use union instead
 ---
  Documentation/virtual/kvm/api.txt | 9 -
  include/uapi/linux/kvm.h  | 6 +-
  2 files changed, 13 insertions(+), 2 deletions(-)

 diff --git a/Documentation/virtual/kvm/api.txt 
 b/Documentation/virtual/kvm/api.txt
 index d20fd94..6426ae9 100644
 --- a/Documentation/virtual/kvm/api.txt
 +++ b/Documentation/virtual/kvm/api.txt
 @@ -1414,7 +1414,10 @@ struct kvm_irq_routing_entry {
  __u32 gsi;
  __u32 type;
  __u32 flags;
 -__u32 pad;
 +union {
 +__u32 pad;
 +__u32 devid;
 +};
  union {
  struct kvm_irq_routing_irqchip irqchip;
  struct kvm_irq_routing_msi msi;
 
  devid is actually a part of the MSI bunch. Shouldn't it be a part of struct
 kvm_irq_routing_msi then?
 It also has a reserved pad.
Well, it makes sense to me to associate the devid with the MSI and put
devid in the pad field of struct kvm_irq_routing_msi.

André, Christoffer, would you agree on this change? - I would like to
avoid doing/undoing things ;-) -

 
 @@ -1427,6 +1430,10 @@ struct kvm_irq_routing_entry {
  #define KVM_IRQ_ROUTING_IRQCHIP 1
  #define KVM_IRQ_ROUTING_MSI 2
  #define KVM_IRQ_ROUTING_S390_ADAPTER 3
 +#define KVM_IRQ_ROUTING_EXTENDED_MSI 4
 +
 +In case of KVM_IRQ_ROUTING_EXTENDED_MSI routing type, devid is used to 
 convey
 +the device ID.

  No flags are specified so far, the corresponding field must be set to zero.
 
 What if we use a KVM_MSI_VALID_DEVID flag instead of the new
 KVM_IRQ_ROUTING_EXTENDED_MSI definition? I
 believe this would make the API more consistent and introduce fewer new
 definitions.
Do you mean using type == KVM_IRQ_ROUTING_MSI and flag ==
KVM_MSI_VALID_DEVID? I am not sure this is simpler/clearer. s390 paved the
way for new routing entry types. I add a new one here.

Another solution may be to use new KVM_IRQ_ROUTING_EXTENDED_MSI type and
add struct kvm_msi ext_msi in kvm_irq_routing_entry union. It is 8 words
as well. But most probably this is even uglier.
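
As a quick sanity check of the "8 words" remark, here is a sketch that assumes struct kvm_msi has the layout of the uapi header of that time (four __u32 fields followed by 16 bytes of padding). The struct below is a local copy under a hypothetical name, not the real header:

```c
#include <stdint.h>

/* Local mirror of the assumed struct kvm_msi layout; the name is hypothetical. */
struct sketch_kvm_msi {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
	uint32_t flags;
	uint8_t  pad[16];
};

/* Size of the routing entry union in 32-bit words, per the remark above. */
#define SKETCH_ROUTING_UNION_WORDS 8

/* Fails to compile if the MSI struct would not fill the union exactly. */
_Static_assert(sizeof(struct sketch_kvm_msi) ==
	       SKETCH_ROUTING_UNION_WORDS * sizeof(uint32_t),
	       "kvm_msi-like struct must be 8 words");
```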

Let's see if this thread is heading to a consensus...

Best Regards

Eric
 

 diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
 index 2a23705..8484681 100644
 --- a/include/uapi/linux/kvm.h
 +++ b/include/uapi/linux/kvm.h
 @@ -841,12 +841,16 @@ struct kvm_irq_routing_s390_adapter {
  #define KVM_IRQ_ROUTING_IRQCHIP 1
  #define KVM_IRQ_ROUTING_MSI 2
  #define KVM_IRQ_ROUTING_S390_ADAPTER 3
 +#define KVM_IRQ_ROUTING_EXTENDED_MSI 4

  struct kvm_irq_routing_entry {
  __u32 gsi;
  __u32 type;
  __u32 flags;
 -__u32 pad;
 +union {
 +__u32 pad;
 +__u32 devid;
 +};
  union {
  struct kvm_irq_routing_irqchip irqchip;
  struct kvm_irq_routing_msi msi;
 --
 1.9.1

 
 Kind regards,
 Pavel Fedin
 Expert Engineer
 Samsung Electronics Research center Russia
 



Re: [PATCH 1/7] KVM: api: add kvm_irq_routing_extended_msi

2015-07-02 Thread Eric Auger
On 07/02/2015 10:41 AM, Pavel Fedin wrote:
  Hello!
 
 What if we use KVM_MSI_VALID_DEVID flag instead of new 
 KVM_IRQ_ROUTING_EXTENDED_MSI
 definition? I
 believe this would make an API more consistent and introduce less new 
 definitions.
 
  I have just found one more flaw in your implementation. If you take a look 
 at irqfd_wakeup()...
 --- cut ---
   /* An event has been signaled, inject an interrupt */
   if (irq.type == KVM_IRQ_ROUTING_MSI)
           kvm_set_msi(&irq, kvm, KVM_USERSPACE_IRQ_SOURCE_ID, 1,
                       false);
   else
           schedule_work(&irqfd->inject);
 --- cut ---
  You apparently missed KVM_IRQ_ROUTING_EXTENDED_MSI here, as well as in 
 irqfd_update(). But, if you
 accept my API proposal, this becomes irrelevant.
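
If a separate routing type were kept, one way to avoid missing a call site would be to route every such check through a single predicate. A minimal sketch follows; the helper name is hypothetical and the type values are copied from the patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in the patch under review. */
#define SKETCH_IRQ_ROUTING_IRQCHIP       1
#define SKETCH_IRQ_ROUTING_MSI           2
#define SKETCH_IRQ_ROUTING_S390_ADAPTER  3
#define SKETCH_IRQ_ROUTING_EXTENDED_MSI  4

/* Hypothetical helper: true for every routing type that is injected as an MSI. */
static bool sketch_route_is_msi(uint32_t type)
{
	return type == SKETCH_IRQ_ROUTING_MSI ||
	       type == SKETCH_IRQ_ROUTING_EXTENDED_MSI;
}
```

irqfd_wakeup() and irqfd_update() would then test the predicate on irq.type rather than comparing against KVM_IRQ_ROUTING_MSI directly.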

Hi Pavel,

Thanks for spotting this bug. Whatever the user API choice, I will
respin shortly, fixing this plus the one reported by André.

Thanks for the review.

Best Regards

Eric


 
 Kind regards,
 Pavel Fedin
 Expert Engineer
 Samsung Electronics Research center Russia
 
 



RE: [PATCH 7/7] KVM: arm: implement kvm_set_msi by gsi direct mapping

2015-07-02 Thread Pavel Fedin
 Hello!

   Given the API change I suggest (using the KVM_MSI_VALID_DEVID flag), we could get
  rid of all these if()'s here. Just forward all parameters to the vGIC
  implementation code and let it do its checks.
 I don't understand this comment. Here this is the kernel struct that is
 used (struct kvm_kernel_irq_routing_entry) and not the user one
 (kvm_irq_routing_entry). The kernel struct does not have the flag field.

  Easy. ARM code can always use struct kvm_extended_msi, and the flags can go
into this structure.

 Another reason I think to keep using the type for homogeneity.

 Homogeneity is perfect IMHO.
 If that would be simpler for you, I could post a patch for this, which I made
on top of your series.
Sorry, I don't have time to respin the whole thing, I am busy with the qemu
GICv3 fight :)

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




RE: [PATCH 1/7] KVM: api: add kvm_irq_routing_extended_msi

2015-07-02 Thread Pavel Fedin
 Hello!

 OK thanks for sharing. I need some more time to study qemu code too.

 I am currently working on supporting this in qemu. It is not ready yet and
needs some time. But with the API I suggest, things are really much, much simpler.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




Re: [PATCH v3 0/2] arm/arm64: KVM: Optimize arm64 fp/simd, saves 30-50% on exits

2015-07-02 Thread Christoffer Dall
On Thu, Jul 02, 2015 at 10:49:03AM -0700, Mario Smarduch wrote:
 On 07/01/2015 02:49 AM, Christoffer Dall wrote:
  On Wed, Jun 24, 2015 at 05:04:10PM -0700, Mario Smarduch wrote:
  Currently we save/restore fp/simd on each exit. The first patch optimizes arm64
  save/restore: we only do so on guest access. hackbench and
  several lmbench tests show anywhere from 30% to above 50% optimization
  achieved.
 
  In second patch 32-bit handler is updated to keep exit handling consistent
  with 64-bit code.
  
  30-50% of what?  The overhead or overall performance?
 
 Yes, so considering all exits to Host KVM anywhere from 30 to 50%
 didn't require an fp/simd switch.
 
 Anything else you would like to see added here?

No, I'm good with them.  Marc is handling the tree these days so I'll
leave it up to him if we want to adjust patch 1 or what to do.

Thanks!
-Christoffer


RE: [RFC 09/17] bypass: IRQ bypass manager proto by Alex

2015-07-02 Thread Wu, Feng


 -Original Message-
 From: Eric Auger [mailto:eric.au...@linaro.org]
 Sent: Thursday, July 02, 2015 9:17 PM
 To: eric.au...@st.com; eric.au...@linaro.org;
 linux-arm-ker...@lists.infradead.org; kvm...@lists.cs.columbia.edu;
 kvm@vger.kernel.org; christoffer.d...@linaro.org; marc.zyng...@arm.com;
 alex.william...@redhat.com; pbonz...@redhat.com; avi.kiv...@gmail.com;
 mtosa...@redhat.com; Wu, Feng; j...@8bytes.org;
 b.rey...@virtualopensystems.com
 Cc: linux-ker...@vger.kernel.org; patc...@linaro.org
 Subject: [RFC 09/17] bypass: IRQ bypass manager proto by Alex
 
 From: Alex Williamson alex.william...@redhat.com
 
 There are plenty of details to be filled in, but I think the basics
 look something like the code below.  The IRQ bypass manager just
 defines a pair of structures, one for interrupt producers and one for
 interrupt consumers.  I'm certain that we'll need more callbacks than
 I've defined below, but figuring out what those should be for the best
 abstraction is the hardest part of this idea.  The manager provides both
 registration and de-registration interfaces for both types of objects
 and keeps lists for each, protected by a lock.  The manager doesn't even
 really need to know what the match token is, but I assume for our
 purposes it will be an eventfd_ctx.
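
A rough userspace sketch of that matching logic, to fix ideas. It is not the code from this patch: the sketch_ names are made up, the lists are plain singly linked lists, and the lock that the description above calls for is left out to keep the example short.

```c
#include <stddef.h>

struct sketch_producer {
	struct sketch_producer *next;
	void *token;                /* opaque match token, e.g. an eventfd context */
};

struct sketch_consumer {
	struct sketch_consumer *next;
	void *token;
	int matched;                /* how many producers were handed to us */
	void (*add_producer)(struct sketch_consumer *, struct sketch_producer *);
};

static struct sketch_producer *sketch_producers;
static struct sketch_consumer *sketch_consumers;

/* Example callback: just count the match. */
static void sketch_count_match(struct sketch_consumer *c, struct sketch_producer *p)
{
	(void)p;
	c->matched++;
}

/* Add a producer, then notify every registered consumer with the same token. */
static void sketch_register_producer(struct sketch_producer *p)
{
	struct sketch_consumer *c;

	p->next = sketch_producers;
	sketch_producers = p;
	for (c = sketch_consumers; c; c = c->next)
		if (c->token == p->token && c->add_producer)
			c->add_producer(c, p);
}

/* Add a consumer, then hand it every registered producer with the same token. */
static void sketch_register_consumer(struct sketch_consumer *c)
{
	struct sketch_producer *p;

	c->next = sketch_consumers;
	sketch_consumers = c;
	for (p = sketch_producers; p; p = p->next)
		if (p->token == c->token && c->add_producer)
			c->add_producer(c, p);
}
```

The manager only compares tokens for equality, which is why it does not need to know what the token actually is.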
 
 On the vfio side, the producer struct would be embedded in the
 vfio_pci_irq_ctx struct.  KVM would probably embed the consumer struct
 in _irqfd.  As I've coded below, the IRQ bypass manager calls the
 consumer callbacks, so the producer struct would need fields or
 callbacks to provide the consumer the info it needs.  AIUI the Posted
 Interrupt model, VFIO only needs to provide data to the consumer.  For
 IRQ Forwarding, I think the producer needs to be informed when bypass is
 active to model the incoming interrupt as edge vs level.
 
 I've prototyped the base IRQ bypass manager here as static, but I don't
 see any reason it couldn't be a module that's loaded by dependency when
 either vfio-pci or kvm-intel is loaded (or other producer/consumer
 objects).
 
 Is this a reasonable starting point to craft the additional fields and
 callbacks and interaction of who calls who that we need to support
 Posted Interrupts and IRQ Forwarding?  Is the AMD version of this still
 alive?  Thanks,
 
 Alex

In fact, I have also implemented an RFC patch for this new framework. I am
wondering whether we can discuss all the requirements for irq forwarding and
posted interrupts, and make this a separate patchset as a general
layer? Then we can continue to push the arch-specific stuff; that would be
clearer and easier.

Thanks,
Feng

 ---
  arch/x86/kvm/Kconfig  |   1 +
  drivers/vfio/pci/Kconfig  |   1 +
  drivers/vfio/pci/vfio_pci_intrs.c |   6 ++
  include/linux/irqbypass.h |  23 
  kernel/irq/Kconfig|   3 +
  kernel/irq/Makefile   |   1 +
  kernel/irq/bypass.c   | 116 ++
  virt/kvm/eventfd.c|   4 ++
  8 files changed, 155 insertions(+)
  create mode 100644 include/linux/irqbypass.h
  create mode 100644 kernel/irq/bypass.c
 
 diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
 index d8a1d56..86d0d77 100644
 --- a/arch/x86/kvm/Kconfig
 +++ b/arch/x86/kvm/Kconfig
 @@ -61,6 +61,7 @@ config KVM_INTEL
   depends on KVM
   # for perf_guest_get_msrs():
   depends on CPU_SUP_INTEL
 + select IRQ_BYPASS_MANAGER
   ---help---
 Provides support for KVM on Intel processors equipped with the VT
 extensions.
 diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
 index 579d83b..02912f1 100644
 --- a/drivers/vfio/pci/Kconfig
 +++ b/drivers/vfio/pci/Kconfig
 @@ -2,6 +2,7 @@ config VFIO_PCI
   tristate "VFIO support for PCI devices"
   depends on VFIO && PCI && EVENTFD
   select VFIO_VIRQFD
 + select IRQ_BYPASS_MANAGER
   help
 Support for the PCI VFIO bus driver.  This is required to make
 use of PCI drivers using the VFIO framework.
 diff --git a/drivers/vfio/pci/vfio_pci_intrs.c 
 b/drivers/vfio/pci/vfio_pci_intrs.c
 index 1f577b4..4e053be 100644
 --- a/drivers/vfio/pci/vfio_pci_intrs.c
 +++ b/drivers/vfio/pci/vfio_pci_intrs.c
 @@ -181,6 +181,7 @@ static int vfio_intx_set_signal(struct vfio_pci_device
 *vdev, int fd)
 
   if (vdev->ctx[0].trigger) {
   free_irq(pdev->irq, vdev);
 + /* irq_bypass_unregister_producer(); */
   kfree(vdev->ctx[0].name);
   eventfd_ctx_put(vdev->ctx[0].trigger);
   vdev->ctx[0].trigger = NULL;
 @@ -214,6 +215,8 @@ static int vfio_intx_set_signal(struct vfio_pci_device
 *vdev, int fd)
   return ret;
   }
 
 + /* irq_bypass_register_producer(); */
 +
   /*
* INTx disable will stick across the new irq setup,
* disable_irq won't.
 @@ -319,6 +322,7 @@ static int vfio_msi_set_vector_signal(struct
 vfio_pci_device *vdev,
 
   if 

RE: [RFC 12/17] irq: bypass: Extend skeleton for ARM forwarding control

2015-07-02 Thread Wu, Feng


 -Original Message-
 From: Paolo Bonzini [mailto:pbonz...@redhat.com]
 Sent: Thursday, July 02, 2015 9:41 PM
 To: Eric Auger; eric.au...@st.com; linux-arm-ker...@lists.infradead.org;
 kvm...@lists.cs.columbia.edu; kvm@vger.kernel.org;
 christoffer.d...@linaro.org; marc.zyng...@arm.com;
 alex.william...@redhat.com; avi.kiv...@gmail.com; mtosa...@redhat.com;
 Wu, Feng; j...@8bytes.org; b.rey...@virtualopensystems.com
 Cc: linux-ker...@vger.kernel.org; patc...@linaro.org
 Subject: Re: [RFC 12/17] irq: bypass: Extend skeleton for ARM forwarding
 control
 
 
 
 On 02/07/2015 15:17, Eric Auger wrote:
  - new fields are added on producer side: linux irq, vfio_device handle,
active which reflects whether the source is active (at interrupt
controller level or at VFIO level - automasked -) and finally an
opaque pointer which will be used to point to the vfio_platform_device
in this series.
 
 Linux IRQ and active should be okay.  As to the vfio_device handle, you
 should link it from the vfio_platform_device instead.  And for the
 vfio_platform_device, you can link it from the vfio_platform_irq instead.
 
 Once you've done this, embed the irq_bypass_producer struct in the
 vfio_platform_irq struct; in the new kvm_arch_* functions, go back to
 the vfio_platform_irq struct via container_of.  From there you can
 retrieve pointers to the vfio_platform_device and the vfio_device.
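
A minimal sketch of the embed-and-container_of pattern described above. The struct layouts and sketch_ names are invented for illustration (they are not the real vfio_platform_irq or irq_bypass_producer definitions), and container_of is spelled out with offsetof so that the example builds outside the kernel:

```c
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of(), without type checking. */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct sketch_bypass_producer {
	void *token;
	unsigned int irq;
};

struct sketch_platform_device {
	const char *name;
};

struct sketch_platform_irq {
	int hwirq;
	struct sketch_platform_device *vdev;      /* link to the owning device */
	struct sketch_bypass_producer producer;   /* embedded, not pointed to */
};

/* What an arch callback would do when it is handed only the producer. */
static struct sketch_platform_irq *
sketch_irq_from_producer(struct sketch_bypass_producer *prod)
{
	return sketch_container_of(prod, struct sketch_platform_irq, producer);
}
```

Because the producer is embedded rather than pointed to, the generic producer struct needs no vfio-specific fields; the callback recovers the enclosing object and follows its links from there.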
 
  - new fields on consumer side: the kvm handle, the gsi
 
 You do not need to add these.  Instead, add the kvm handle to irqfd
 only.  Like above, embed the irq_bypass_consumer struct in the irqfd
 struct; in the new kvm_arch_* functions, go back to the
 vfio_platform_irq struct via container_of.
 

I also need the gsi field here: for posted interrupts, I need both 'gsi' and
'irq' to update the IRTE.

Thanks,
Feng


 Paolo


RE: [RFC 12/17] irq: bypass: Extend skeleton for ARM forwarding control

2015-07-02 Thread Wu, Feng


 -Original Message-
 From: Eric Auger [mailto:eric.au...@linaro.org]
 Sent: Thursday, July 02, 2015 9:17 PM
 To: eric.au...@st.com; eric.au...@linaro.org;
 linux-arm-ker...@lists.infradead.org; kvm...@lists.cs.columbia.edu;
 kvm@vger.kernel.org; christoffer.d...@linaro.org; marc.zyng...@arm.com;
 alex.william...@redhat.com; pbonz...@redhat.com; avi.kiv...@gmail.com;
 mtosa...@redhat.com; Wu, Feng; j...@8bytes.org;
 b.rey...@virtualopensystems.com
 Cc: linux-ker...@vger.kernel.org; patc...@linaro.org
 Subject: [RFC 12/17] irq: bypass: Extend skeleton for ARM forwarding control
 
 - [add,del]_[consumer,producer] updated to take both the consumer and
   producer handles. This is needed to combine info from both,
   typically to link the source irq owned by the producer with the gsi
   owned by the consumer (forwarded IRQ setup).
 - new functions are added: [stop,resume]_[consumer, producer]. Those are
   needed for forwarding since the state change requires intermingling
   actions at consumer and producer.
 - On handshake, we now call connect, disconnect which features the more
   complex sequence.
 - new fields are added on producer side: linux irq, vfio_device handle,
   active which reflects whether the source is active (at interrupt
   controller level or at VFIO level - automasked -) and finally an
   opaque pointer which will be used to point to the vfio_platform_device
   in this series.
 - new fields on consumer side: the kvm handle, the gsi
 
 Integration of posted interrupt series will help to refine those choices

On the PI side, I need another field, as below:

struct irq_bypass_consumer {
   struct list_head node;
   void *token;
+  unsigned irq; /* got from producer when registered */
   void (*add_producer)(struct irq_bypass_producer *,
struct irq_bypass_consumer *);
   void (*del_producer)(struct irq_bypass_producer *,
struct irq_bypass_consumer *);
+  void (*update)(struct irq_bypass_consumer *);
};

'update' is used to update the IRTE, while 'irq' is initialized at
registration and is used to find the right IRTE.

Thanks,
Feng

 
 Signed-off-by: Eric Auger eric.au...@linaro.org
 
 ---
 
 - connect/disconnect could become a cb too. For forwarding it may make
   sense to have failure at connection: this would happen when the physical
   IRQ is either active at irqchip level or VFIO masked. This means some
   of the cb should return an error and this error management could be
   prod/cons specific. Where to attach the connect/disconnect cb: to the
   cons or prod, to both?
 - Hence may be sensible to do the list_add only if connect returns 0
 - disconnect would not be allowed to fail.
 ---
  include/linux/irqbypass.h | 26 ++---
  kernel/irq/bypass.c   | 48
 +++
  2 files changed, 67 insertions(+), 7 deletions(-)
 
 diff --git a/include/linux/irqbypass.h b/include/linux/irqbypass.h
 index 718508e..591ae3f 100644
 --- a/include/linux/irqbypass.h
 +++ b/include/linux/irqbypass.h
 @@ -3,17 +3,37 @@
 
  #include <linux/list.h>
 
 +struct vfio_device;
 +struct irq_bypass_consumer;
 +struct kvm;
 +
  struct irq_bypass_producer {
   struct list_head node;
   void *token;
 - /* TBD */
 + unsigned int irq; /* host physical irq */
 + struct vfio_device *vdev; /* vfio device that requested irq */
 + /* is irq active at irqchip or VFIO masked? */
 + bool active;
 + void *opaque;
 + void (*stop_producer)(struct irq_bypass_producer *);
 + void (*resume_producer)(struct irq_bypass_producer *);
 + void (*add_consumer)(struct irq_bypass_producer *,
 +  struct irq_bypass_consumer *);
 + void (*del_consumer)(struct irq_bypass_producer *,
 +  struct irq_bypass_consumer *);
  };
 
  struct irq_bypass_consumer {
   struct list_head node;
   void *token;
 - void (*add_producer)(struct irq_bypass_producer *);
 - void (*del_producer)(struct irq_bypass_producer *);
 + unsigned int gsi;   /* the guest gsi */
 + struct kvm *kvm;
 + void (*stop_consumer)(struct irq_bypass_consumer *);
 + void (*resume_consumer)(struct irq_bypass_consumer *);
 + void (*add_producer)(struct irq_bypass_consumer *,
 +  struct irq_bypass_producer *);
 + void (*del_producer)(struct irq_bypass_consumer *,
 +  struct irq_bypass_producer *);
  };
 
  int irq_bypass_register_producer(struct irq_bypass_producer *);
 diff --git a/kernel/irq/bypass.c b/kernel/irq/bypass.c
 index 5d0f92b..fb31fef 100644
 --- a/kernel/irq/bypass.c
 +++ b/kernel/irq/bypass.c
 @@ -19,6 +19,46 @@ static LIST_HEAD(producers);
  static LIST_HEAD(consumers);
  static DEFINE_MUTEX(lock);
 
 +/* lock must be held when calling connect */
 +static void connect(struct irq_bypass_producer *prod,
 + struct 

Re: [PATCH 7/7] KVM: arm: implement kvm_set_msi by gsi direct mapping

2015-07-02 Thread Eric Auger
Hi Andre,
On 07/02/2015 07:10 PM, Andre Przywara wrote:
 Hi Eric,
 
 On 29/06/15 16:37, Eric Auger wrote:
 If the ITS modality is not available, let's simply support MSI
 injection by transforming the MSI.data into an SPI ID.

 This becomes possible to use KVM_SIGNAL_MSI ioctl for arm too.

 Signed-off-by: Eric Auger eric.au...@linaro.org
 ---
  arch/arm/kvm/Kconfig | 1 +
  virt/kvm/arm/vgic.c  | 5 +
  2 files changed, 6 insertions(+)

 diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
 index 151e710..0f58baf 100644
 --- a/arch/arm/kvm/Kconfig
 +++ b/arch/arm/kvm/Kconfig
 @@ -31,6 +31,7 @@ config KVM
  select KVM_VFIO
  select HAVE_KVM_EVENTFD
  select HAVE_KVM_IRQFD
 +select HAVE_KVM_MSI
  select HAVE_KVM_IRQCHIP
  select HAVE_KVM_IRQ_ROUTING
  depends on ARM_VIRT_EXT && ARM_LPAE && ARM_ARCH_TIMER
 diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
 index 0b4c48c..b3c10dc 100644
 --- a/virt/kvm/arm/vgic.c
 +++ b/virt/kvm/arm/vgic.c
 @@ -2314,6 +2314,11 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry 
 *e,
  return kvm->arch.vgic.vm_ops.inject_msi(kvm, msi);
  else
  return -ENODEV;
 +case KVM_IRQ_ROUTING_MSI:
 +if (kvm->arch.vgic.vm_ops.inject_msi)
 +return -EINVAL;
 +else
 +return kvm_vgic_inject_irq(kvm, 0, e->msi.data, level);
 
 If you add:
 
 static int vgic_v2m_inject_msi(struct kvm *kvm, struct kvm_msi *msi)
 {
   return kvm_vgic_inject_irq(kvm, 0, msi->data, 1);
 }
 
 to vgic-v2-emul.c and wire it up accordingly, you can simplify the above
 kvm_set_msi, getting rid of all those extra case handling.
 This also helps merging KVM_IRQ_ROUTING_MSI and the extended case.
 
 I have hacked this up and it seems to work for me.

OK thanks I will respin either today or on monday.

Best Regards

Eric
 
 Cheers,
 Andre.
 



RE: [RFC 12/17] irq: bypass: Extend skeleton for ARM forwarding control

2015-07-02 Thread Wu, Feng


 -Original Message-
 From: Wu, Feng
 Sent: Friday, July 03, 2015 10:20 AM
 To: Paolo Bonzini; Eric Auger; eric.au...@st.com;
 linux-arm-ker...@lists.infradead.org; kvm...@lists.cs.columbia.edu;
 kvm@vger.kernel.org; christoffer.d...@linaro.org; marc.zyng...@arm.com;
 alex.william...@redhat.com; avi.kiv...@gmail.com; mtosa...@redhat.com;
 j...@8bytes.org; b.rey...@virtualopensystems.com
 Cc: linux-ker...@vger.kernel.org; patc...@linaro.org; Wu, Feng
 Subject: RE: [RFC 12/17] irq: bypass: Extend skeleton for ARM forwarding
 control
 
 
 
  -Original Message-
  From: Paolo Bonzini [mailto:pbonz...@redhat.com]
  Sent: Thursday, July 02, 2015 9:41 PM
  To: Eric Auger; eric.au...@st.com; linux-arm-ker...@lists.infradead.org;
  kvm...@lists.cs.columbia.edu; kvm@vger.kernel.org;
  christoffer.d...@linaro.org; marc.zyng...@arm.com;
  alex.william...@redhat.com; avi.kiv...@gmail.com; mtosa...@redhat.com;
  Wu, Feng; j...@8bytes.org; b.rey...@virtualopensystems.com
  Cc: linux-ker...@vger.kernel.org; patc...@linaro.org
  Subject: Re: [RFC 12/17] irq: bypass: Extend skeleton for ARM forwarding
  control
 
 
 
  On 02/07/2015 15:17, Eric Auger wrote:
   - new fields are added on producer side: linux irq, vfio_device handle,
 active which reflects whether the source is active (at interrupt
 controller level or at VFIO level - automasked -) and finally an
 opaque pointer which will be used to point to the vfio_platform_device
 in this series.
 
  Linux IRQ and active should be okay.  As to the vfio_device handle, you
  should link it from the vfio_platform_device instead.  And for the
  vfio_platform_device, you can link it from the vfio_platform_irq instead.
 
  Once you've done this, embed the irq_bypass_producer struct in the
  vfio_platform_irq struct; in the new kvm_arch_* functions, go back to
  the vfio_platform_irq struct via container_of.  From there you can
  retrieve pointers to the vfio_platform_device and the vfio_device.
 
   - new fields on consumer side: the kvm handle, the gsi
 
  You do not need to add these.  Instead, add the kvm handle to irqfd
  only.  Like above, embed the irq_bypass_consumer struct in the irqfd
  struct; in the new kvm_arch_* functions, go back to the
  vfio_platform_irq struct via container_of.
 
 
 I also need the gsi field here, for posted-interrupts, I need 'gsi', 'irq' to
 update the IRTE.

Oh... we can get gsi from irq_bypass_consumer -> _irqfd -> gsi, so it
is not needed in irq_bypass_consumer. Got it! :)
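
The embedding Paolo describes can be sketched in plain userspace C as
below. This is only an illustration under assumptions: struct
irqfd_sketch and consumer_to_gsi() are hypothetical stand-ins (not the
real _irqfd layout), and container_of() is re-implemented here with
offsetof() rather than taken from the kernel headers.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, trimmed-down consumer: no gsi or kvm fields needed. */
struct irq_bypass_consumer {
	void *token;
};

/* Hypothetical stand-in for KVM's irqfd structure. */
struct irqfd_sketch {
	int gsi;                              /* guest gsi lives here */
	void *kvm;                            /* opaque kvm handle */
	struct irq_bypass_consumer consumer;  /* embedded, not pointed to */
};

/* Recover the gsi from a bare consumer pointer handed to a callback. */
static int consumer_to_gsi(struct irq_bypass_consumer *cons)
{
	struct irqfd_sketch *irqfd;

	assert(cons != NULL);
	irqfd = container_of(cons, struct irqfd_sketch, consumer);
	return irqfd->gsi;
}
```

Because the consumer is embedded rather than referenced, the callback
needs no extra fields in irq_bypass_consumer to reach the gsi or the
kvm handle.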

Thanks,
Feng

 
 Thanks,
 Feng
 
 
  Paolo


Re: [RFC 09/17] bypass: IRQ bypass manager proto by Alex

2015-07-02 Thread Eric Auger
Hi Feng,
On 07/03/2015 04:16 AM, Wu, Feng wrote:
 
 
 -Original Message-
 From: Eric Auger [mailto:eric.au...@linaro.org]
 Sent: Thursday, July 02, 2015 9:17 PM
 To: eric.au...@st.com; eric.au...@linaro.org;
 linux-arm-ker...@lists.infradead.org; kvm...@lists.cs.columbia.edu;
 kvm@vger.kernel.org; christoffer.d...@linaro.org; marc.zyng...@arm.com;
 alex.william...@redhat.com; pbonz...@redhat.com; avi.kiv...@gmail.com;
 mtosa...@redhat.com; Wu, Feng; j...@8bytes.org;
 b.rey...@virtualopensystems.com
 Cc: linux-ker...@vger.kernel.org; patc...@linaro.org
 Subject: [RFC 09/17] bypass: IRQ bypass manager proto by Alex

 From: Alex Williamson alex.william...@redhat.com

 There are plenty of details to be filled in, but I think the basics
 looks something like the code below.  The IRQ bypass manager just
 defines a pair of structures, one for interrupt producers and one for
 interrupt consumers.  I'm certain that we'll need more callbacks than
 I've defined below, but figuring out what those should be for the best
 abstraction is the hardest part of this idea.  The manager provides both
 registration and de-registration interfaces for both types of objects
 and keeps lists for each, protected by a lock.  The manager doesn't even
 really need to know what the match token is, but I assume for our
 purposes it will be an eventfd_ctx.

 On the vfio side, the producer struct would be embedded in the
 vfio_pci_irq_ctx struct.  KVM would probably embed the consumer struct
 in _irqfd.  As I've coded below, the IRQ bypass manager calls the
 consumer callbacks, so the producer struct would need fields or
 callbacks to provide the consumer the info it needs.  AIUI the Posted
 Interrupt model, VFIO only needs to provide data to the consumer.  For
 IRQ Forwarding, I think the producer needs to be informed when bypass is
 active to model the incoming interrupt as edge vs level.

 I've prototyped the base IRQ bypass manager here as static, but I don't
 see any reason it couldn't be a module that's loaded by dependency when
 either vfio-pci or kvm-intel is loaded (or other producer/consumer
 objects).

 Is this a reasonable starting point to craft the additional fields and
 callbacks and interaction of who calls who that we need to support
 Posted Interrupts and IRQ Forwarding?  Is the AMD version of this still
 alive?  Thanks,

 Alex
 
 In fact, I have also implemented an RFC patch for this new framework. I am
 wondering whether we can discuss all the requirements for irq forwarding and
 posted interrupts, and make it a separate patchset as a general
 layer? Then we can continue to push the arch-specific stuff, which would be
 clearer and easier.

Sure. I intend to respin today according to Paolo's directives and I
will put common patches in a separate series. Let's see next week with
Alex how he prefers things to be handled.

Best Regards

Eric


 
 Thanks,
 Feng
 
 ---
  arch/x86/kvm/Kconfig  |   1 +
  drivers/vfio/pci/Kconfig  |   1 +
  drivers/vfio/pci/vfio_pci_intrs.c |   6 ++
  include/linux/irqbypass.h |  23 
  kernel/irq/Kconfig|   3 +
  kernel/irq/Makefile   |   1 +
  kernel/irq/bypass.c   | 116
 ++
  virt/kvm/eventfd.c|   4 ++
  8 files changed, 155 insertions(+)
  create mode 100644 include/linux/irqbypass.h
  create mode 100644 kernel/irq/bypass.c

 diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
 index d8a1d56..86d0d77 100644
 --- a/arch/x86/kvm/Kconfig
 +++ b/arch/x86/kvm/Kconfig
 @@ -61,6 +61,7 @@ config KVM_INTEL
  depends on KVM
  # for perf_guest_get_msrs():
  depends on CPU_SUP_INTEL
 +select IRQ_BYPASS_MANAGER
  ---help---
Provides support for KVM on Intel processors equipped with the VT
extensions.
 diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
 index 579d83b..02912f1 100644
 --- a/drivers/vfio/pci/Kconfig
 +++ b/drivers/vfio/pci/Kconfig
 @@ -2,6 +2,7 @@ config VFIO_PCI
  tristate "VFIO support for PCI devices"
  depends on VFIO && PCI && EVENTFD
  select VFIO_VIRQFD
 +select IRQ_BYPASS_MANAGER
  help
Support for the PCI VFIO bus driver.  This is required to make
use of PCI drivers using the VFIO framework.
 diff --git a/drivers/vfio/pci/vfio_pci_intrs.c 
 b/drivers/vfio/pci/vfio_pci_intrs.c
 index 1f577b4..4e053be 100644
 --- a/drivers/vfio/pci/vfio_pci_intrs.c
 +++ b/drivers/vfio/pci/vfio_pci_intrs.c
 @@ -181,6 +181,7 @@ static int vfio_intx_set_signal(struct vfio_pci_device
 *vdev, int fd)

  if (vdev->ctx[0].trigger) {
  free_irq(pdev->irq, vdev);
 +/* irq_bypass_unregister_producer(); */
  kfree(vdev->ctx[0].name);
  eventfd_ctx_put(vdev->ctx[0].trigger);
  vdev->ctx[0].trigger = NULL;
 @@ -214,6 +215,8 @@ static int vfio_intx_set_signal(struct vfio_pci_device
 *vdev, int fd)
  return ret;
  }

 +/* 

Re: [PATCH 7/7] KVM: arm: implement kvm_set_msi by gsi direct mapping

2015-07-02 Thread Andre Przywara
Hi Eric,

On 29/06/15 16:37, Eric Auger wrote:
 If the ITS modality is not available, let's simply support MSI
 injection by transforming the MSI.data into an SPI ID.
 
 This becomes possible to use KVM_SIGNAL_MSI ioctl for arm too.
 
 Signed-off-by: Eric Auger eric.au...@linaro.org
 ---
  arch/arm/kvm/Kconfig | 1 +
  virt/kvm/arm/vgic.c  | 5 +
  2 files changed, 6 insertions(+)
 
 diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
 index 151e710..0f58baf 100644
 --- a/arch/arm/kvm/Kconfig
 +++ b/arch/arm/kvm/Kconfig
 @@ -31,6 +31,7 @@ config KVM
   select KVM_VFIO
   select HAVE_KVM_EVENTFD
   select HAVE_KVM_IRQFD
 + select HAVE_KVM_MSI
   select HAVE_KVM_IRQCHIP
   select HAVE_KVM_IRQ_ROUTING
   depends on ARM_VIRT_EXT && ARM_LPAE && ARM_ARCH_TIMER
 diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
 index 0b4c48c..b3c10dc 100644
 --- a/virt/kvm/arm/vgic.c
 +++ b/virt/kvm/arm/vgic.c
 @@ -2314,6 +2314,11 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
   return kvm->arch.vgic.vm_ops.inject_msi(kvm, msi);
   else
   return -ENODEV;
 + case KVM_IRQ_ROUTING_MSI:
 + if (kvm->arch.vgic.vm_ops.inject_msi)
 + return -EINVAL;
 + else
 + return kvm_vgic_inject_irq(kvm, 0, e->msi.data, level);

If you add:

static int vgic_v2m_inject_msi(struct kvm *kvm, struct kvm_msi *msi)
{
return kvm_vgic_inject_irq(kvm, 0, msi->data, 1);
}

to vgic-v2-emul.c and wire it up accordingly, you can simplify the above
kvm_set_msi, getting rid of all those extra case handling.
This also helps merging KVM_IRQ_ROUTING_MSI and the extended case.

I have hacked this up and it seems to work for me.
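
To illustrate the shape of that simplification, here is a rough
userspace sketch. All names in it (vm_sketch, vm_ops_sketch, set_msi,
inject_irq) are hypothetical stand-ins and not the real vgic
structures; the only point is that once every emulation backend
provides an inject_msi op, the caller needs a single path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct kvm_msi_sketch { uint32_t data; };
struct vm_sketch;

/* Hypothetical per-model ops table, mirroring vm_ops.inject_msi. */
struct vm_ops_sketch {
	int (*inject_msi)(struct vm_sketch *vm, struct kvm_msi_sketch *msi);
};

struct vm_sketch {
	struct vm_ops_sketch ops;
	uint32_t last_irq;   /* records what was injected, for the demo */
	int last_level;
};

/* Stand-in for kvm_vgic_inject_irq(kvm, cpuid, irq_num, level). */
static int inject_irq(struct vm_sketch *vm, int cpuid, uint32_t irq, int level)
{
	(void)cpuid;
	vm->last_irq = irq;
	vm->last_level = level;
	return 0;
}

/* GICv2-style backend: the MSI data is used directly as the SPI ID. */
static int v2m_inject_msi(struct vm_sketch *vm, struct kvm_msi_sketch *msi)
{
	return inject_irq(vm, 0, msi->data, 1);
}

/* One path for all routing cases once each backend wires up the op. */
static int set_msi(struct vm_sketch *vm, struct kvm_msi_sketch *msi)
{
	assert(vm != NULL && msi != NULL);
	if (!vm->ops.inject_msi)
		return -ENODEV;
	return vm->ops.inject_msi(vm, msi);
}
```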

Cheers,
Andre.


Re: [PATCH v3 0/2] arm/arm64: KVM: Optimize arm64 fp/simd, saves 30-50% on exits

2015-07-02 Thread Mario Smarduch
On 07/01/2015 02:49 AM, Christoffer Dall wrote:
 On Wed, Jun 24, 2015 at 05:04:10PM -0700, Mario Smarduch wrote:
 Currently we save/restore fp/simd on each exit. The first patch optimizes arm64
 save/restore so that we only do so on guest access. hackbench and
 several lmbench tests show anywhere from 30% to above 50% optimization
 achieved.

 In second patch 32-bit handler is updated to keep exit handling consistent
 with 64-bit code.
 
 30-50% of what?  The overhead or overall performance?

Yes, so considering all exits to Host KVM anywhere from 30 to 50%
didn't require an fp/simd switch.
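
As a rough model of what is being counted (a sketch only, with
hypothetical names; the real code is assembly in hyp.S), the lazy
scheme arms a trap on guest entry and only pays for an fp/simd switch
on exits where the guest actually touched those registers:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-vcpu bookkeeping for the lazy fp/simd scheme. */
struct vcpu_sketch {
	bool fp_trap_enabled;       /* guest fp/simd access traps to host */
	bool guest_fp_loaded;       /* guest fp/simd registers are live */
	unsigned long fp_switches;  /* exits that needed a save/restore */
};

/* Guest entry: arm the trap instead of eagerly loading fp/simd state. */
static void enter_guest(struct vcpu_sketch *v)
{
	assert(v != NULL);
	v->fp_trap_enabled = true;
	v->guest_fp_loaded = false;
}

/* First guest fp/simd access traps once; later accesses run untrapped. */
static void guest_fp_access(struct vcpu_sketch *v)
{
	if (v->fp_trap_enabled) {
		v->fp_trap_enabled = false;
		v->guest_fp_loaded = true;
	}
}

/* Exit: switch fp/simd state only if the guest actually used it. */
static void exit_guest(struct vcpu_sketch *v)
{
	if (v->guest_fp_loaded) {
		v->fp_switches++;
		v->guest_fp_loaded = false;
	}
}
```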

Anything else you'd like to see added here?
 

 Changes since v1:
 - Addressed Marc's comments
 - Verified optimization improvements with lmbench and hackbench, updated 
   commit message

 Changes since v2:
 - only for patch 2/2
   - Reworked trapping to vfp access handler

 Changes since v3:
 - Only for patch 2/2
   - Removed load_vcpu in switch_to_guest_vfp per Marc's comment
   - Got another chance to replace an unreferenced label with a comment


 Mario Smarduch (2):
   Optimize arm64 skip 30-50% vfp/simd save/restore on exits
   keep arm vfp/simd exit handling consistent with arm64

  arch/arm/kvm/interrupts.S|   14 +++-
  arch/arm64/include/asm/kvm_arm.h |5 -
  arch/arm64/kvm/hyp.S |   46 
 +++---
  3 files changed, 55 insertions(+), 10 deletions(-)

 -- 
 1.7.9.5




Re: [Qemu-devel] [PATCH 00/16] implement vNVDIMM

2015-07-02 Thread Paolo Bonzini


On 02/07/2015 20:01, Xiao Guangrong wrote:
 
 Thanks for your review, Stefan and Paolo!
 
 On 07/02/2015 05:52 PM, Paolo Bonzini wrote:


 On 02/07/2015 11:20, Stefan Hajnoczi wrote:
 Currently, the NVDIMM driver has been merged into upstream Linux
 Kernel and
 this patchset tries to enable it in virtualization field

  From a device model perspective, have you checked whether it makes
 sense
 to integrate nvdimms into the pc-dimm and hostmem code that is used for
 memory hotplug and NUMA?

 The NVDIMM device in your patches is a completely new TYPE_DEVICE so it
 doesn't share any interfaces or code with existing memory devices.
 Maybe that is the right solution here because NVDIMMs have different
 characteristics, but I'm not sure.

 The hostmem code should definitely be shared, e.g. by adding a new
 file property to the memory-backend-file class.  ivshmem can also use
 it---CCing Marc-André.
 
 However, file-based memory used by NVDIMM is special: it divides the file
 into two parts, one part used as PMEM and another part used to store
 the NVDIMM's configuration data.
 
 Maybe we can introduce end-reserved property to reserve specified size
 at the end of the file. Or create a new class type based on
 memory-backend-file (named nvdimm-backend-file) class to hide this magic
 thing?

I need to read the code then. :)

Paolo


Re: [Qemu-devel] [PATCH 00/16] implement vNVDIMM

2015-07-02 Thread Xiao Guangrong


Thanks for your review, Stefan and Paolo!

On 07/02/2015 05:52 PM, Paolo Bonzini wrote:



On 02/07/2015 11:20, Stefan Hajnoczi wrote:

Currently, the NVDIMM driver has been merged into upstream Linux Kernel and
this patchset tries to enable it in virtualization field


 From a device model perspective, have you checked whether it makes sense
to integrate nvdimms into the pc-dimm and hostmem code that is used for
memory hotplug and NUMA?

The NVDIMM device in your patches is a completely new TYPE_DEVICE so it
doesn't share any interfaces or code with existing memory devices.
Maybe that is the right solution here because NVDIMMs have different
characteristics, but I'm not sure.


The hostmem code should definitely be shared, e.g. by adding a new
file property to the memory-backend-file class.  ivshmem can also use
it---CCing Marc-André.


However, file-based memory used by NVDIMM is special: it divides the file
into two parts, one part used as PMEM and another part used to store
the NVDIMM's configuration data.

Maybe we can introduce end-reserved property to reserve specified size
at the end of the file. Or create a new class type based on
memory-backend-file (named nvdimm-backend-file) class to hide this magic
thing?
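
To make the end-reserved idea concrete, here is a minimal sketch of
the split it implies. The struct and function names are hypothetical
(nothing here is existing QEMU API); it only assumes that PMEM starts
at offset 0 and the configuration data sits in the reserved tail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of how the backing file is carved up. */
struct nvdimm_layout {
	uint64_t pmem_offset;    /* PMEM starts at the beginning of the file */
	uint64_t pmem_size;
	uint64_t config_offset;  /* config data occupies the file tail */
	uint64_t config_size;
};

/* Returns 0 on success, -1 if the reservation leaves no room for PMEM. */
static int nvdimm_split(uint64_t file_size, uint64_t end_reserved,
			struct nvdimm_layout *out)
{
	assert(out != NULL);
	if (end_reserved >= file_size)
		return -1;

	out->pmem_offset = 0;
	out->pmem_size = file_size - end_reserved;
	out->config_offset = out->pmem_size;
	out->config_size = end_reserved;
	return 0;
}
```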



I don't know about the pc-dimm devices.  If the NVDIMM devices can do
_OST and can be hotplugged, then the answer is probably yes.


_OST is not needed for NVDIMM.

NVDIMM is completely different from the DIMM memory device in ACPI: it has a
different HID, method objects, memory range detection, device organization,
etc. So I prefer introducing a new device class for NVDIMM.

For hotplug, NVDIMM and DIMM can share some logic, e.g. free-address-range
management and slot management (but the new object initialization in ACPI is
completely different); we can abstract these operations as a common part.

NUMA detection also differs between NVDIMM and DIMM:
NVDIMM needs to report its NUMA affinity in the SPA table. But they can share
some common functions, I think.

BTW, I am going to implement vNVDIMM hotplug once the Linux NVDIMM driver
supports it.




Re: [Qemu-devel] [PATCH 14/16] nvdimm: support NFIT_CMD_GET_CONFIG_SIZE function

2015-07-02 Thread Xiao Guangrong



On 07/02/2015 05:23 PM, Stefan Hajnoczi wrote:

On Wed, Jul 01, 2015 at 10:50:30PM +0800, Xiao Guangrong wrote:

+static uint32_t dsm_cmd_config_size(struct dsm_buffer *in, struct dsm_out *out)
+{
+GSList *list = get_nvdimm_built_list();
+PCNVDIMMDevice *nvdimm = get_nvdimm_device_by_handle(list, in->handle);
+uint32_t status = NFIT_STATUS_NON_EXISTING_MEM_DEV;
+
+if (!nvdimm) {
+goto exit;
+}
+
+status = NFIT_STATUS_SUCCESS;
+out->cmd_config_size.config_size = nvdimm->config_data_size;
+out->cmd_config_size.max_xfer = max_xfer_config_size();


cpu_to_*() missing?

It should be possible to emulate NVDIMMs for a x86_64 guest on a
big-endian host, for example.


Indeed, will fix it in the next version, thank you for pointing it out.
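
For reference, the fix amounts to writing guest-visible fields in a
fixed byte order instead of host order. A minimal, host-independent
sketch follows; store_le32() and load_le32() are illustrative helpers
written for this example, not the QEMU cpu_to_le32() macro itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write v into buf in little-endian order, whatever the host order is. */
static void store_le32(uint8_t *buf, uint32_t v)
{
	assert(buf != NULL);
	buf[0] = (uint8_t)(v & 0xff);
	buf[1] = (uint8_t)((v >> 8) & 0xff);
	buf[2] = (uint8_t)((v >> 16) & 0xff);
	buf[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Read a little-endian 32-bit value back into host order. */
static uint32_t load_le32(const uint8_t *buf)
{
	assert(buf != NULL);
	return (uint32_t)buf[0] |
	       ((uint32_t)buf[1] << 8) |
	       ((uint32_t)buf[2] << 16) |
	       ((uint32_t)buf[3] << 24);
}
```

On a little-endian host these helpers produce the same bytes as a
plain store; on a big-endian host they are what keeps the x86_64 guest
view correct.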


[PATCH 3/12] kvm: add hyper-v crash msrs values

2015-07-02 Thread Denis V. Lunev
From: Andrey Smetanin asmeta...@virtuozzo.com

Added Hyper-V crash msrs values - HV_X64_MSR_CRASH*.

Signed-off-by: Andrey Smetanin asmeta...@virtuozzo.com
Signed-off-by: Denis V. Lunev d...@openvz.org
Reviewed-by: Peter Hornyack peterhorny...@google.com
CC: Paolo Bonzini pbonz...@redhat.com
CC: Gleb Natapov g...@kernel.org
---
 arch/x86/include/uapi/asm/hyperv.h | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/include/uapi/asm/hyperv.h 
b/arch/x86/include/uapi/asm/hyperv.h
index ce6068d..8fba544 100644
--- a/arch/x86/include/uapi/asm/hyperv.h
+++ b/arch/x86/include/uapi/asm/hyperv.h
@@ -199,6 +199,17 @@
 #define HV_X64_MSR_STIMER3_CONFIG              0x400000B6
 #define HV_X64_MSR_STIMER3_COUNT               0x400000B7
 
+/* Hyper-V guest crash notification MSR's */
+#define HV_X64_MSR_CRASH_P0                    0x40000100
+#define HV_X64_MSR_CRASH_P1                    0x40000101
+#define HV_X64_MSR_CRASH_P2                    0x40000102
+#define HV_X64_MSR_CRASH_P3                    0x40000103
+#define HV_X64_MSR_CRASH_P4                    0x40000104
+#define HV_X64_MSR_CRASH_CTL                   0x40000105
+#define HV_X64_MSR_CRASH_CTL_NOTIFY            (1ULL << 63)
+#define HV_X64_MSR_CRASH_PARAMS                \
+               (1 + (HV_X64_MSR_CRASH_P4 - HV_X64_MSR_CRASH_P0))
+
 #define HV_X64_MSR_HYPERCALL_ENABLE            0x00000001
 #define HV_X64_MSR_HYPERCALL_PAGE_ADDRESS_SHIFT        12
 #define HV_X64_MSR_HYPERCALL_PAGE_ADDRESS_MASK \
-- 
2.1.4



[PATCH 1/12] kvm/x86: move Hyper-V MSR's/hypercall code into hyperv.c file

2015-07-02 Thread Denis V. Lunev
From: Andrey Smetanin asmeta...@virtuozzo.com

This patch introduces a Hyper-V related source code file, hyperv.c, and
per-vm and per-vcpu hyperv context structures.
All Hyper-V MSR and hypercall code is moved into hyperv.c.
All Hyper-V kvm/vcpu fields are moved into the appropriate hyperv context
structures. Copyright and author information is copied from x86.c
to hyperv.c.

Signed-off-by: Andrey Smetanin asmeta...@virtuozzo.com
Signed-off-by: Denis V. Lunev d...@openvz.org
Reviewed-by: Peter Hornyack peterhorny...@google.com
CC: Paolo Bonzini pbonz...@redhat.com
CC: Gleb Natapov g...@kernel.org
---
 arch/x86/include/asm/kvm_host.h |  20 ++-
 arch/x86/kvm/Makefile   |   4 +-
 arch/x86/kvm/hyperv.c   | 307 
 arch/x86/kvm/hyperv.h   |  32 +
 arch/x86/kvm/lapic.h|   2 +-
 arch/x86/kvm/x86.c  | 265 +-
 arch/x86/kvm/x86.h  |   5 +
 7 files changed, 366 insertions(+), 269 deletions(-)
 create mode 100644 arch/x86/kvm/hyperv.c
 create mode 100644 arch/x86/kvm/hyperv.h

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c7fa57b..78616aa 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -358,6 +358,11 @@ struct kvm_mtrr {
struct list_head head;
 };
 
+/* Hyper-V per vcpu emulation context */
+struct kvm_vcpu_hv {
+   u64 hv_vapic;
+};
+
 struct kvm_vcpu_arch {
/*
 * rip and regs accesses must go through
@@ -514,8 +519,7 @@ struct kvm_vcpu_arch {
/* used for guest single stepping over the given code position */
unsigned long singlestep_rip;
 
-   /* fields used by HYPER-V emulation */
-   u64 hv_vapic;
+   struct kvm_vcpu_hv hyperv;
 
cpumask_var_t wbinvd_dirty_mask;
 
@@ -586,6 +590,13 @@ struct kvm_apic_map {
struct kvm_lapic *logical_map[16][16];
 };
 
+/* Hyper-V emulation context */
+struct kvm_hv {
+   u64 hv_guest_os_id;
+   u64 hv_hypercall;
+   u64 hv_tsc_page;
+};
+
 struct kvm_arch {
unsigned int n_used_mmu_pages;
unsigned int n_requested_mmu_pages;
@@ -643,10 +654,7 @@ struct kvm_arch {
/* reads protected by irq_srcu, writes by irq_lock */
struct hlist_head mask_notifier_list;
 
-   /* fields used by HYPER-V emulation */
-   u64 hv_guest_os_id;
-   u64 hv_hypercall;
-   u64 hv_tsc_page;
+   struct kvm_hv hyperv;
 
#ifdef CONFIG_KVM_MMU_AUDIT
int audit_point;
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index 67d215c..a1ff508 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -12,7 +12,9 @@ kvm-y += $(KVM)/kvm_main.o 
$(KVM)/coalesced_mmio.o \
 kvm-$(CONFIG_KVM_ASYNC_PF) += $(KVM)/async_pf.o
 
 kvm-y  += x86.o mmu.o emulate.o i8259.o irq.o lapic.o \
-  i8254.o ioapic.o irq_comm.o cpuid.o pmu.o mtrr.o
+  i8254.o ioapic.o irq_comm.o cpuid.o pmu.o mtrr.o \
+  hyperv.o
+
 kvm-$(CONFIG_KVM_DEVICE_ASSIGNMENT)+= assigned-dev.o iommu.o
 kvm-intel-y+= vmx.o pmu_intel.o
 kvm-amd-y  += svm.o pmu_amd.o
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
new file mode 100644
index 0000000..2b49f10
--- /dev/null
+++ b/arch/x86/kvm/hyperv.c
@@ -0,0 +1,307 @@
+/*
+ * KVM Microsoft Hyper-V emulation
+ *
+ * derived from arch/x86/kvm/x86.c
+ *
+ * Copyright (C) 2006 Qumranet, Inc.
+ * Copyright (C) 2008 Qumranet, Inc.
+ * Copyright IBM Corporation, 2008
+ * Copyright 2010 Red Hat, Inc. and/or its affiliates.
+ * Copyright (C) 2015 Andrey Smetanin asmeta...@virtuozzo.com
+ *
+ * Authors:
+ *   Avi Kivity   a...@qumranet.com
+ *   Yaniv Kamay  ya...@qumranet.com
+ *   Amit Shah    amit.s...@qumranet.com
+ *   Ben-Ami Yassour ben...@il.ibm.com
+ *   Andrey Smetanin asmeta...@virtuozzo.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "x86.h"
+#include "lapic.h"
+#include "hyperv.h"
+
+#include <linux/kvm_host.h>
+#include <trace/events/kvm.h>
+
+#include "trace.h"
+
+static bool kvm_hv_msr_partition_wide(u32 msr)
+{
+   bool r = false;
+
+   switch (msr) {
+   case HV_X64_MSR_GUEST_OS_ID:
+   case HV_X64_MSR_HYPERCALL:
+   case HV_X64_MSR_REFERENCE_TSC:
+   case HV_X64_MSR_TIME_REF_COUNT:
+   r = true;
+   break;
+   }
+
+   return r;
+}
+
+static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data)
+{
+   struct kvm *kvm = vcpu->kvm;
+   struct kvm_hv *hv = &kvm->arch.hyperv;
+
+   switch (msr) {
+   case HV_X64_MSR_GUEST_OS_ID:
+   hv->hv_guest_os_id = data;
+   /* setting guest os id to zero disables hypercall page */
+   if (!hv->hv_guest_os_id)
+   hv->hv_hypercall = 

[PATCH 5/12] kvm: added KVM_REQ_HV_CRASH value to notify qemu about hyper-v crash

2015-07-02 Thread Denis V. Lunev
From: Andrey Smetanin asmeta...@virtuozzo.com

Added KVM_REQ_HV_CRASH, a vcpu request used to notify user space (QEMU)
about a Hyper-V crash.

Signed-off-by: Andrey Smetanin asmeta...@virtuozzo.com
Signed-off-by: Denis V. Lunev d...@openvz.org
Reviewed-by: Peter Hornyack peterhorny...@google.com
CC: Paolo Bonzini pbonz...@redhat.com
CC: Gleb Natapov g...@kernel.org
---
 include/linux/kvm_host.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 2b2edf1..a377e00 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -139,6 +139,7 @@ static inline bool is_error_page(struct page *page)
 #define KVM_REQ_DISABLE_IBS   24
 #define KVM_REQ_APIC_PAGE_RELOAD  25
 #define KVM_REQ_SMI   26
+#define KVM_REQ_HV_CRASH  27
 
 #define KVM_USERSPACE_IRQ_SOURCE_ID0
 #define KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID   1
-- 
2.1.4



[PATCH 2/12] kvm: introduce vcpu_debug = kvm_debug + vcpu context

2015-07-02 Thread Denis V. Lunev
From: Andrey Smetanin asmeta...@virtuozzo.com

vcpu_debug is a useful macro like kvm_debug, but it additionally
includes the vcpu context in its output.

Signed-off-by: Andrey Smetanin asmeta...@virtuozzo.com
Signed-off-by: Denis V. Lunev d...@openvz.org
Reviewed-by: Peter Hornyack peterhorny...@google.com
CC: Paolo Bonzini pbonz...@redhat.com
CC: Gleb Natapov g...@kernel.org
---
 include/linux/kvm_host.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 9564fd7..2b2edf1 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -424,6 +424,9 @@ struct kvm {
 #define vcpu_unimpl(vcpu, fmt, ...)\
kvm_pr_unimpl("vcpu%i " fmt, (vcpu)->vcpu_id, ## __VA_ARGS__)
 
+#define vcpu_debug(vcpu, fmt, ...) \
+   kvm_debug("vcpu%i " fmt, (vcpu)->vcpu_id, ## __VA_ARGS__)
+
 static inline struct kvm_vcpu *kvm_get_vcpu(struct kvm *kvm, int i)
 {
smp_rmb();
-- 
2.1.4



[PATCH 6/12] kvm/x86: mark hyper-v crash msrs as partition wide

2015-07-02 Thread Denis V. Lunev
From: Andrey Smetanin asmeta...@virtuozzo.com

Hyper-V crash msrs are per vm, not per vcpu, so mark them
as partition wide.

Signed-off-by: Andrey Smetanin asmeta...@virtuozzo.com
Signed-off-by: Denis V. Lunev d...@openvz.org
Reviewed-by: Peter Hornyack peterhorny...@google.com
CC: Paolo Bonzini pbonz...@redhat.com
CC: Gleb Natapov g...@kernel.org
---
 arch/x86/kvm/hyperv.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 2b49f10..af83c96 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -39,6 +39,8 @@ static bool kvm_hv_msr_partition_wide(u32 msr)
case HV_X64_MSR_HYPERCALL:
case HV_X64_MSR_REFERENCE_TSC:
case HV_X64_MSR_TIME_REF_COUNT:
+   case HV_X64_MSR_CRASH_CTL:
+   case HV_X64_MSR_CRASH_P0 ... HV_X64_MSR_CRASH_P4:
r = true;
break;
}
-- 
2.1.4



[PATCH v4 0/12] HyperV equivalent of pvpanic driver

2015-07-02 Thread Denis V. Lunev

Windows 2012 guests can notify the hypervisor about a guest crash
(Windows bugcheck, i.e. BSOD) by writing specific Hyper-V msrs. This patch set
handles these MSRs in KVM and sends a notification to user space, which
allows QEMU/libvirt to gather a Windows guest crash dump.

The idea is to provide functionality equal to pvpanic device without
QEMU guest agent for Windows.

The idea is borrowed from Linux HyperV bus driver and validated against
Windows 2k12.

Changes from v3:
* remove unused HV_X64_MSR_CRASH_CTL_NOTIFY
* added documentation section about KVM_SYSTEM_EVENT_CRASH
* allow only supported values inside crash ctl msr
* qemu: split patch into generic crash handling patches and hyperv specific
* qemu: skip migration of crash ctl msr value

Changes from v2:
* forbid modification of crash ctl msr by guest
* qemu_system_guest_panicked usage in pvpanic and s390x
* hyper-v crash handler move from generic kvm to i386
* hyper-v crash handler: skip fetching crash msrs, just mark crash occurred
* sync with linux-next 20150629
* patch 11 squashed to patch 10
* patch 9 squashed to patch 7

Changes from v1:
* hyperv code move to hyperv.c
* added read handlers of crash data msrs
* added per vm and per cpu hyperv context structures
* added saving crash msrs inside qemu cpu state
* added qemu fetch and update of crash msrs
* added qemu crash msrs store in cpu state and its migration

Signed-off-by: Andrey Smetanin asmeta...@virtuozzo.com
Signed-off-by: Denis V. Lunev d...@openvz.org
CC: Gleb Natapov g...@kernel.org
CC: Paolo Bonzini pbonz...@redhat.com


[PATCH 4/12] kvm/x86: added hyper-v crash msrs into kvm hyperv context

2015-07-02 Thread Denis V. Lunev
From: Andrey Smetanin asmeta...@virtuozzo.com

Added crash variables to the kvm Hyper-V context as storage
for the Hyper-V crash msrs.

Signed-off-by: Andrey Smetanin asmeta...@virtuozzo.com
Signed-off-by: Denis V. Lunev d...@openvz.org
Reviewed-by: Peter Hornyack peterhorny...@google.com
CC: Paolo Bonzini pbonz...@redhat.com
CC: Gleb Natapov g...@kernel.org
---
 arch/x86/include/asm/kvm_host.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 78616aa..697c1f3 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -595,6 +595,10 @@ struct kvm_hv {
 	u64 hv_guest_os_id;
 	u64 hv_hypercall;
 	u64 hv_tsc_page;
+
+	/* Hyper-v based guest crash (NT kernel bugcheck) parameters */
+	u64 hv_crash_param[HV_X64_MSR_CRASH_PARAMS];
+	u64 hv_crash_ctl;
 };
 
 struct kvm_arch {
-- 
2.1.4



[PATCH 7/12] kvm/x86: added hyper-v crash data and ctl msr's get/set'ers

2015-07-02 Thread Denis V. Lunev
From: Andrey Smetanin asmeta...@virtuozzo.com

Add getters and setters for the Hyper-V crash MSRs (HV_X64_MSR_CRASH*),
both data and control. Userspace should check that these MSRs are
available by testing the KVM_CAP_HYPERV_MSR_CRASH capability.

User space is allowed to set up the Hyper-V crash ctl MSR.
This MSR should be set to the HV_X64_MSR_CRASH_CTL_NOTIFY
value so that the Hyper-V guest knows it can send crash data to the host.
However, the Hyper-V guest reports a crash event by writing
the same HV_X64_MSR_CRASH_CTL_NOTIFY value into the crash ctl MSR.
Since both user space and the guest write the same value into the ctl MSR,
this patch distinguishes the moment of an actual guest crash
by checking the host-initiated flag from the MSR info. The patch also
prevents modification of the crash ctl MSR by the guest.
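
The host/guest distinction described above can be sketched as a small
user-space model. This is only an illustration under stated assumptions:
demo_hv, demo_set_crash_ctl and the crash_requested flag are hypothetical
stand-ins (the flag stands in for raising KVM_REQ_HV_CRASH); it is not the
kernel code from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same bit position as HV_X64_MSR_CRASH_CTL_NOTIFY in this series (bit 63). */
#define DEMO_CRASH_CTL_NOTIFY (1ULL << 63)

/* Hypothetical stand-in for the per-VM Hyper-V context. */
struct demo_hv {
	uint64_t crash_ctl;	/* value programmed by user space */
	bool crash_requested;	/* stands in for KVM_REQ_HV_CRASH */
};

/*
 * Model of a write to the crash ctl MSR. A host-initiated write stores
 * the masked value. A guest write never changes the stored value; it only
 * raises the crash request when the NOTIFY bit is set.
 */
static int demo_set_crash_ctl(struct demo_hv *hv, uint64_t data, bool host)
{
	assert(hv != NULL);

	if (host)
		hv->crash_ctl = data & DEMO_CRASH_CTL_NOTIFY;

	if (!host && (data & DEMO_CRASH_CTL_NOTIFY))
		hv->crash_requested = true;

	return 0;
}
```

In this model a host write of NOTIFY only arms the MSR, while the same
value written by the guest is what signals the crash.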

Signed-off-by: Andrey Smetanin asmeta...@virtuozzo.com
Signed-off-by: Denis V. Lunev d...@openvz.org
Reviewed-by: Peter Hornyack peterhorny...@google.com
CC: Paolo Bonzini pbonz...@redhat.com
CC: Gleb Natapov g...@kernel.org
---
 arch/x86/kvm/hyperv.c| 74 ++--
 arch/x86/kvm/hyperv.h|  2 +-
 arch/x86/kvm/x86.c   |  8 +-
 include/uapi/linux/kvm.h |  1 +
 4 files changed, 80 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index af83c96..a8160d2 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -48,7 +48,63 @@ static bool kvm_hv_msr_partition_wide(u32 msr)
 	return r;
 }
 
-static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data)
+static int kvm_hv_msr_get_crash_data(struct kvm_vcpu *vcpu,
+				     u32 index, u64 *pdata)
+{
+	struct kvm_hv *hv = &vcpu->kvm->arch.hyperv;
+
+	if (WARN_ON_ONCE(index >= ARRAY_SIZE(hv->hv_crash_param)))
+		return -EINVAL;
+
+	*pdata = hv->hv_crash_param[index];
+	return 0;
+}
+
+static int kvm_hv_msr_get_crash_ctl(struct kvm_vcpu *vcpu, u64 *pdata)
+{
+	struct kvm_hv *hv = &vcpu->kvm->arch.hyperv;
+
+	*pdata = hv->hv_crash_ctl;
+	return 0;
+}
+
+static int kvm_hv_msr_set_crash_ctl(struct kvm_vcpu *vcpu, u64 data, bool host)
+{
+	struct kvm_hv *hv = &vcpu->kvm->arch.hyperv;
+
+	if (host)
+		hv->hv_crash_ctl = data & HV_X64_MSR_CRASH_CTL_NOTIFY;
+
+	if (!host && (data & HV_X64_MSR_CRASH_CTL_NOTIFY)) {
+
+		vcpu_debug(vcpu, "hv crash (0x%llx 0x%llx 0x%llx 0x%llx 0x%llx)\n",
+			  hv->hv_crash_param[0],
+			  hv->hv_crash_param[1],
+			  hv->hv_crash_param[2],
+			  hv->hv_crash_param[3],
+			  hv->hv_crash_param[4]);
+
+		/* Send notification about crash to user space */
+		kvm_make_request(KVM_REQ_HV_CRASH, vcpu);
+	}
+
+	return 0;
+}
+
+static int kvm_hv_msr_set_crash_data(struct kvm_vcpu *vcpu,
+				     u32 index, u64 data)
+{
+	struct kvm_hv *hv = &vcpu->kvm->arch.hyperv;
+
+	if (WARN_ON_ONCE(index >= ARRAY_SIZE(hv->hv_crash_param)))
+		return -EINVAL;
+
+	hv->hv_crash_param[index] = data;
+	return 0;
+}
+
+static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data,
+			     bool host)
 {
 	struct kvm *kvm = vcpu->kvm;
 	struct kvm_hv *hv = &kvm->arch.hyperv;
@@ -101,6 +157,12 @@ static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data)
 		mark_page_dirty(kvm, gfn);
 		break;
 	}
+	case HV_X64_MSR_CRASH_P0 ... HV_X64_MSR_CRASH_P4:
+		return kvm_hv_msr_set_crash_data(vcpu,
+						 msr - HV_X64_MSR_CRASH_P0,
+						 data);
+	case HV_X64_MSR_CRASH_CTL:
+		return kvm_hv_msr_set_crash_ctl(vcpu, data, host);
 	default:
 		vcpu_unimpl(vcpu, "Hyper-V uhandled wrmsr: 0x%x data 0x%llx\n",
 			    msr, data);
@@ -173,6 +235,12 @@ static int kvm_hv_get_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
 	case HV_X64_MSR_REFERENCE_TSC:
 		data = hv->hv_tsc_page;
 		break;
+	case HV_X64_MSR_CRASH_P0 ... HV_X64_MSR_CRASH_P4:
+		return kvm_hv_msr_get_crash_data(vcpu,
+						 msr - HV_X64_MSR_CRASH_P0,
+						 pdata);
+	case HV_X64_MSR_CRASH_CTL:
+		return kvm_hv_msr_get_crash_ctl(vcpu, pdata);
 	default:
 		vcpu_unimpl(vcpu, "Hyper-V unhandled rdmsr: 0x%x\n", msr);
 		return 1;
@@ -217,13 +285,13 @@ static int kvm_hv_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
 	return 0;
 }
 
-int kvm_hv_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data)
+int kvm_hv_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data, bool host)
 {
 	if (kvm_hv_msr_partition_wide(msr)) {
 		int r;
 

[PATCH 01/12] kvm/x86: move Hyper-V MSR's/hypercall code into hyperv.c file

2015-07-02 Thread Denis V. Lunev
From: Andrey Smetanin asmeta...@virtuozzo.com

This patch introduces a Hyper-V related source file, hyperv.c, and
per-VM and per-vCPU Hyper-V context structures.
All Hyper-V MSR and hypercall code is moved into hyperv.c.
All Hyper-V kvm/vcpu fields are moved into the appropriate Hyper-V context
structures. Copyright and author information is copied from x86.c
to hyperv.c.

Signed-off-by: Andrey Smetanin asmeta...@virtuozzo.com
Signed-off-by: Denis V. Lunev d...@openvz.org
Reviewed-by: Peter Hornyack peterhorny...@google.com
CC: Paolo Bonzini pbonz...@redhat.com
CC: Gleb Natapov g...@kernel.org
---
 arch/x86/include/asm/kvm_host.h |  20 ++-
 arch/x86/kvm/Makefile   |   4 +-
 arch/x86/kvm/hyperv.c   | 307 
 arch/x86/kvm/hyperv.h   |  32 +
 arch/x86/kvm/lapic.h|   2 +-
 arch/x86/kvm/x86.c  | 265 +-
 arch/x86/kvm/x86.h  |   5 +
 7 files changed, 366 insertions(+), 269 deletions(-)
 create mode 100644 arch/x86/kvm/hyperv.c
 create mode 100644 arch/x86/kvm/hyperv.h

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c7fa57b..78616aa 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -358,6 +358,11 @@ struct kvm_mtrr {
 	struct list_head head;
 };
 
+/* Hyper-V per vcpu emulation context */
+struct kvm_vcpu_hv {
+	u64 hv_vapic;
+};
+
 struct kvm_vcpu_arch {
 	/*
 	 * rip and regs accesses must go through
@@ -514,8 +519,7 @@ struct kvm_vcpu_arch {
 	/* used for guest single stepping over the given code position */
 	unsigned long singlestep_rip;
 
-	/* fields used by HYPER-V emulation */
-	u64 hv_vapic;
+	struct kvm_vcpu_hv hyperv;
 
 	cpumask_var_t wbinvd_dirty_mask;
 
@@ -586,6 +590,13 @@ struct kvm_apic_map {
 	struct kvm_lapic *logical_map[16][16];
 };
 
+/* Hyper-V emulation context */
+struct kvm_hv {
+	u64 hv_guest_os_id;
+	u64 hv_hypercall;
+	u64 hv_tsc_page;
+};
+
 struct kvm_arch {
 	unsigned int n_used_mmu_pages;
 	unsigned int n_requested_mmu_pages;
@@ -643,10 +654,7 @@ struct kvm_arch {
 	/* reads protected by irq_srcu, writes by irq_lock */
 	struct hlist_head mask_notifier_list;
 
-	/* fields used by HYPER-V emulation */
-	u64 hv_guest_os_id;
-	u64 hv_hypercall;
-	u64 hv_tsc_page;
+	struct kvm_hv hyperv;
 
 	#ifdef CONFIG_KVM_MMU_AUDIT
 	int audit_point;
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index 67d215c..a1ff508 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -12,7 +12,9 @@ kvm-y			+= $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o \
 kvm-$(CONFIG_KVM_ASYNC_PF)	+= $(KVM)/async_pf.o
 
 kvm-y			+= x86.o mmu.o emulate.o i8259.o irq.o lapic.o \
-			   i8254.o ioapic.o irq_comm.o cpuid.o pmu.o mtrr.o
+			   i8254.o ioapic.o irq_comm.o cpuid.o pmu.o mtrr.o \
+			   hyperv.o
+
 kvm-$(CONFIG_KVM_DEVICE_ASSIGNMENT)	+= assigned-dev.o iommu.o
 kvm-intel-y		+= vmx.o pmu_intel.o
 kvm-amd-y		+= svm.o pmu_amd.o
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
new file mode 100644
index 0000000..2b49f10
--- /dev/null
+++ b/arch/x86/kvm/hyperv.c
@@ -0,0 +1,307 @@
+/*
+ * KVM Microsoft Hyper-V emulation
+ *
+ * derived from arch/x86/kvm/x86.c
+ *
+ * Copyright (C) 2006 Qumranet, Inc.
+ * Copyright (C) 2008 Qumranet, Inc.
+ * Copyright IBM Corporation, 2008
+ * Copyright 2010 Red Hat, Inc. and/or its affiliates.
+ * Copyright (C) 2015 Andrey Smetanin asmeta...@virtuozzo.com
+ *
+ * Authors:
+ *   Avi Kivity   a...@qumranet.com
+ *   Yaniv Kamay  ya...@qumranet.com
+ *   Amit Shah    amit.s...@qumranet.com
+ *   Ben-Ami Yassour ben...@il.ibm.com
+ *   Andrey Smetanin asmeta...@virtuozzo.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "x86.h"
+#include "lapic.h"
+#include "hyperv.h"
+
+#include <linux/kvm_host.h>
+#include <trace/events/kvm.h>
+
+#include "trace.h"
+
+static bool kvm_hv_msr_partition_wide(u32 msr)
+{
+	bool r = false;
+
+	switch (msr) {
+	case HV_X64_MSR_GUEST_OS_ID:
+	case HV_X64_MSR_HYPERCALL:
+	case HV_X64_MSR_REFERENCE_TSC:
+	case HV_X64_MSR_TIME_REF_COUNT:
+		r = true;
+		break;
+	}
+
+	return r;
+}
+
+static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data)
+{
+	struct kvm *kvm = vcpu->kvm;
+	struct kvm_hv *hv = &kvm->arch.hyperv;
+
+	switch (msr) {
+	case HV_X64_MSR_GUEST_OS_ID:
+		hv->hv_guest_os_id = data;
+		/* setting guest os id to zero disables hypercall page */
+		if (!hv->hv_guest_os_id)
+			hv->hv_hypercall &= 

[PATCH 08/12] kvm/x86: add sending hyper-v crash notification to user space

2015-07-02 Thread Denis V. Lunev
From: Andrey Smetanin asmeta...@virtuozzo.com

The notification is sent by exiting the vcpu to user space
if KVM_REQ_HV_CRASH is set for the vcpu. On exit to user space,
the kvm_run structure contains a system_event with type
KVM_SYSTEM_EVENT_CRASH to signal that a guest crash occurred.
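
On the user-space side, the exit described above could be dispatched roughly
as follows. This is a sketch under stated assumptions: the demo_run
structure, the DEMO_ constants and the demo_action values are hypothetical
stand-ins so that the example builds without <linux/kvm.h>; only the event
type numbers 1 to 3 are taken from this patch, and DEMO_EXIT_SYSTEM_EVENT is
an arbitrary placeholder value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder exit reason; the real value comes from <linux/kvm.h>. */
#define DEMO_EXIT_SYSTEM_EVENT		100

/* Event types as numbered in this patch. */
#define DEMO_SYSTEM_EVENT_SHUTDOWN	1
#define DEMO_SYSTEM_EVENT_RESET		2
#define DEMO_SYSTEM_EVENT_CRASH		3

/* Hypothetical policy decisions a VMM might take. */
enum demo_action {
	DEMO_IGNORE,
	DEMO_SHUTDOWN,
	DEMO_RESET,
	DEMO_DUMP_AND_STOP,
};

/* Minimal model of the part of kvm_run that matters here. */
struct demo_run {
	uint32_t exit_reason;
	struct {
		uint32_t type;
		uint64_t flags;
	} system_event;
};

/* Map a system event exit to an action; anything else is ignored. */
static enum demo_action demo_handle_exit(const struct demo_run *run)
{
	assert(run != NULL);

	if (run->exit_reason != DEMO_EXIT_SYSTEM_EVENT)
		return DEMO_IGNORE;

	switch (run->system_event.type) {
	case DEMO_SYSTEM_EVENT_SHUTDOWN:
		return DEMO_SHUTDOWN;
	case DEMO_SYSTEM_EVENT_RESET:
		return DEMO_RESET;
	case DEMO_SYSTEM_EVENT_CRASH:
		/* e.g. gather a memory dump, then stop or reset the VM */
		return DEMO_DUMP_AND_STOP;
	default:
		return DEMO_IGNORE;
	}
}
```

Whether a crash leads to a dump, a reset or nothing at all remains a
user-space policy decision; the kernel only reports the event.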

Signed-off-by: Andrey Smetanin asmeta...@virtuozzo.com
Signed-off-by: Denis V. Lunev d...@openvz.org
Reviewed-by: Peter Hornyack peterhorny...@google.com
CC: Paolo Bonzini pbonz...@redhat.com
CC: Gleb Natapov g...@kernel.org
---
 Documentation/virtual/kvm/api.txt | 5 +
 arch/x86/kvm/x86.c| 6 ++
 include/uapi/linux/kvm.h  | 1 +
 3 files changed, 12 insertions(+)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index a7926a9..a4ebcb7 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -3277,6 +3277,7 @@ should put the acknowledged interrupt vector into the 'epr' field.
 		struct {
 #define KVM_SYSTEM_EVENT_SHUTDOWN       1
 #define KVM_SYSTEM_EVENT_RESET          2
+#define KVM_SYSTEM_EVENT_CRASH          3
 			__u32 type;
 			__u64 flags;
 		} system_event;
@@ -3296,6 +3297,10 @@ Valid values for 'type' are:
   KVM_SYSTEM_EVENT_RESET -- the guest has requested a reset of the VM.
    As with SHUTDOWN, userspace can choose to ignore the request, or
    to schedule the reset to occur in the future and may call KVM_RUN again.
+  KVM_SYSTEM_EVENT_CRASH -- the guest crash occurred and the guest
+   has requested a crash condition maintenance. Userspace can choose
+   to ignore the request, or to gather VM memory core dump and/or
+   reset/shutdown of the VM.
 
 		/* Fix the size of the union. */
 		char padding[256];
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b4c2767..28e79c0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6265,6 +6265,12 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 			vcpu_scan_ioapic(vcpu);
 		if (kvm_check_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu))
 			kvm_vcpu_reload_apic_access_page(vcpu);
+		if (kvm_check_request(KVM_REQ_HV_CRASH, vcpu)) {
+			vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
+			vcpu->run->system_event.type = KVM_SYSTEM_EVENT_CRASH;
+			r = 0;
+			goto out;
+		}
 	}
 
 	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 5da4ca3..c8c6b8b 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -317,6 +317,7 @@ struct kvm_run {
 		struct {
 #define KVM_SYSTEM_EVENT_SHUTDOWN       1
 #define KVM_SYSTEM_EVENT_RESET          2
+#define KVM_SYSTEM_EVENT_CRASH          3
 			__u32 type;
 			__u64 flags;
 		} system_event;
-- 
2.1.4



[PATCH 12/12] qemu/kvm/x86: hyper-v crash msrs set/get'ers and migration

2015-07-02 Thread Denis V. Lunev
From: Andrey Smetanin asmeta...@virtuozzo.com

KVM Hyper-V based guests can notify the hypervisor about
a guest crash by writing to the Hyper-V crash MSRs.
This patch handles and migrates the HV_X64_MSR_CRASH_P0-P4 and
HV_X64_MSR_CRASH_CTL MSRs. The user can enable these MSRs with the
'hv-crash' option.
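
The gating implied above (the feature is advertised to the guest only when
the user asked for 'hv-crash' and the kernel reports the capability) can be
sketched as a pure function. This is an illustration, not the QEMU code:
demo_hv_features_edx and its parameters are hypothetical names, and only the
bit position is taken from HV_X64_GUEST_CRASH_MSR_AVAILABLE in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit position of HV_X64_GUEST_CRASH_MSR_AVAILABLE in this patch. */
#define DEMO_GUEST_CRASH_MSR_AVAILABLE (1u << 10)

static_assert(DEMO_GUEST_CRASH_MSR_AVAILABLE == 0x400u, "bit 10");

/*
 * Return the EDX feature word with the crash MSR bit set only if both
 * the user option and the kernel capability are present.
 */
static uint32_t demo_hv_features_edx(uint32_t edx, bool hv_crash_opt,
				     bool kernel_has_cap)
{
	if (hv_crash_opt && kernel_has_cap)
		edx |= DEMO_GUEST_CRASH_MSR_AVAILABLE;

	return edx;
}
```

A guest that does not see the bit has no reason to touch the crash MSRs, so
leaving the option off keeps the previous behaviour.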

Signed-off-by: Andrey Smetanin asmeta...@virtuozzo.com
Signed-off-by: Denis V. Lunev d...@openvz.org
CC: Paolo Bonzini pbonz...@redhat.com
CC: Andreas Färber afaer...@suse.de
---
 linux-headers/asm-x86/hyperv.h | 13 +
 linux-headers/linux/kvm.h  |  1 +
 target-i386/cpu-qom.h  |  1 +
 target-i386/cpu.c  |  1 +
 target-i386/cpu.h  |  2 ++
 target-i386/kvm.c  | 27 +++
 target-i386/machine.c  | 26 ++
 7 files changed, 71 insertions(+)

diff --git a/linux-headers/asm-x86/hyperv.h b/linux-headers/asm-x86/hyperv.h
index ce6068d..5f88dc7 100644
--- a/linux-headers/asm-x86/hyperv.h
+++ b/linux-headers/asm-x86/hyperv.h
@@ -108,6 +108,8 @@
 #define HV_X64_HYPERCALL_PARAMS_XMM_AVAILABLE		(1 << 4)
 /* Support for a virtual guest idle state is available */
 #define HV_X64_GUEST_IDLE_STATE_AVAILABLE		(1 << 5)
+/* Guest crash data handler available */
+#define HV_X64_GUEST_CRASH_MSR_AVAILABLE		(1 << 10)
 
 /*
  * Implementation recommendations. Indicates which behaviors the hypervisor
@@ -199,6 +201,17 @@
 #define HV_X64_MSR_STIMER3_CONFIG		0x400000B6
 #define HV_X64_MSR_STIMER3_COUNT		0x400000B7
 
+/* Hyper-V guest crash notification MSR's */
+#define HV_X64_MSR_CRASH_P0			0x40000100
+#define HV_X64_MSR_CRASH_P1			0x40000101
+#define HV_X64_MSR_CRASH_P2			0x40000102
+#define HV_X64_MSR_CRASH_P3			0x40000103
+#define HV_X64_MSR_CRASH_P4			0x40000104
+#define HV_X64_MSR_CRASH_CTL			0x40000105
+#define HV_X64_MSR_CRASH_CTL_NOTIFY		(1ULL << 63)
+#define HV_X64_MSR_CRASH_PARAMS		\
+		(1 + (HV_X64_MSR_CRASH_P4 - HV_X64_MSR_CRASH_P0))
+
 #define HV_X64_MSR_HYPERCALL_ENABLE		0x00000001
 #define HV_X64_MSR_HYPERCALL_PAGE_ADDRESS_SHIFT	12
 #define HV_X64_MSR_HYPERCALL_PAGE_ADDRESS_MASK	\
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 409be37..efe720e 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -818,6 +818,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_DISABLE_QUIRKS 116
 #define KVM_CAP_X86_SMM 117
 #define KVM_CAP_MULTI_ADDRESS_SPACE 118
+#define KVM_CAP_HYPERV_MSR_CRASH 119
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/target-i386/cpu-qom.h b/target-i386/cpu-qom.h
index 7a4fddd..c35b624 100644
--- a/target-i386/cpu-qom.h
+++ b/target-i386/cpu-qom.h
@@ -89,6 +89,7 @@ typedef struct X86CPU {
 bool hyperv_relaxed_timing;
 int hyperv_spinlock_attempts;
 bool hyperv_time;
+bool hyperv_crash;
 bool check_cpuid;
 bool enforce_cpuid;
 bool expose_kvm;
diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 36b07f9..04a8408 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -3117,6 +3117,7 @@ static Property x86_cpu_properties[] = {
     DEFINE_PROP_BOOL("hv-relaxed", X86CPU, hyperv_relaxed_timing, false),
     DEFINE_PROP_BOOL("hv-vapic", X86CPU, hyperv_vapic, false),
     DEFINE_PROP_BOOL("hv-time", X86CPU, hyperv_time, false),
+    DEFINE_PROP_BOOL("hv-crash", X86CPU, hyperv_crash, false),
     DEFINE_PROP_BOOL("check", X86CPU, check_cpuid, false),
     DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, false),
     DEFINE_PROP_BOOL("kvm", X86CPU, expose_kvm, true),
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index 603aaf0..6c2352a 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -21,6 +21,7 @@
 
 #include "config.h"
 #include "qemu-common.h"
+#include <asm/hyperv.h>
 
 #ifdef TARGET_X86_64
 #define TARGET_LONG_BITS 64
@@ -904,6 +905,7 @@ typedef struct CPUX86State {
 uint64_t msr_hv_guest_os_id;
 uint64_t msr_hv_vapic;
 uint64_t msr_hv_tsc;
+uint64_t msr_hv_crash_prm[HV_X64_MSR_CRASH_PARAMS];
 
 /* exception/interrupt handling */
 int error_code;
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index daced5c..f3456af 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -79,6 +79,7 @@ static int lm_capable_kernel;
 static bool has_msr_hv_hypercall;
 static bool has_msr_hv_vapic;
 static bool has_msr_hv_tsc;
+static bool has_msr_hv_crash;
 static bool has_msr_mtrr;
 static bool has_msr_xss;
 
@@ -515,6 +516,12 @@ int kvm_arch_init_vcpu(CPUState *cs)
             c->eax |= 0x200;
             has_msr_hv_tsc = true;
         }
+        if (cpu->hyperv_crash &&
+            kvm_check_extension(cs->kvm_state, KVM_CAP_HYPERV_MSR_CRASH) > 0) {
+            c->edx |= HV_X64_GUEST_CRASH_MSR_AVAILABLE;
+            has_msr_hv_crash = true;
+        }
+
         c = &cpuid_data.entries[cpuid_i++];
         c->function = 

[PATCH v5 0/12] HyperV equivalent of pvpanic driver

2015-07-02 Thread Denis V. Lunev
Windows 2012 guests can notify the hypervisor about a guest crash
(a Windows bugcheck, i.e. a BSOD) by writing to specific Hyper-V MSRs. This
series makes KVM handle these MSRs and send a notification to user space, which
allows QEMU/libvirt to gather a crash dump of the Windows guest.

The idea is to provide functionality equivalent to the pvpanic device without
requiring a QEMU guest agent for Windows.

The idea is borrowed from the Linux Hyper-V bus driver and was validated against
Windows 2k12.

Changes from v4:
* fixed typo in the email address of Andreas Färber afaer...@suse.de;
  my vim behaves strangely on lines with extended German characters

Changes from v3:
* remove unused HV_X64_MSR_CRASH_CTL_NOTIFY
* added documentation section about KVM_SYSTEM_EVENT_CRASH
* allow only supported values inside crash ctl msr
* qemu: split patch into generic crash handling patches and hyperv specific
* qemu: skip migration of crash ctl msr value

Changes from v2:
* forbid modification of the crash ctl msr by the guest
* qemu_system_guest_panicked usage in pvpanic and s390x
* hyper-v crash handler moved from generic kvm to i386
* hyper-v crash handler: skip fetching crash msrs, just mark that a crash occurred
* sync with linux-next 20150629
* patch 11 squashed to patch 10
* patch 9 squashed to patch 7

Changes from v1:
* hyperv code move to hyperv.c
* added read handlers of crash data msrs
* added per vm and per cpu hyperv context structures
* added saving crash msrs inside qemu cpu state
* added qemu fetch and update of crash msrs
* added qemu crash msrs store in cpu state and its migration

Signed-off-by: Andrey Smetanin asmeta...@virtuozzo.com
Signed-off-by: Denis V. Lunev d...@openvz.org
CC: Gleb Natapov g...@kernel.org
CC: Paolo Bonzini pbonz...@redhat.com


[PATCH 02/12] kvm: introduce vcpu_debug = kvm_debug + vcpu context

2015-07-02 Thread Denis V. Lunev
From: Andrey Smetanin asmeta...@virtuozzo.com

vcpu_debug is a useful macro like kvm_debug, but it additionally
includes the vcpu context in its output.
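
The prefixing technique can be tried out in user space with a small
variadic macro. This is only a sketch: demo_vcpu and demo_vcpu_debug are
hypothetical names, the output goes through snprintf into a caller-supplied
buffer instead of the kernel log, and the `##__VA_ARGS__` form is a GNU
extension (accepted by gcc and clang), as in the kernel macro.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct kvm_vcpu; only the id is modelled. */
struct demo_vcpu {
	int vcpu_id;
};

/*
 * Format a message prefixed with "vcpu<N> " into buf. The prefix is
 * glued onto the caller's format by string literal concatenation, so
 * fmt must be a string literal.
 */
#define demo_vcpu_debug(buf, len, vcpu, fmt, ...)			\
	snprintf((buf), (len), "vcpu%i " fmt, (vcpu)->vcpu_id, ##__VA_ARGS__)
```

Because the prefix is added at compile time, callers write their format
string exactly as they would for the unprefixed helper.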

Signed-off-by: Andrey Smetanin asmeta...@virtuozzo.com
Signed-off-by: Denis V. Lunev d...@openvz.org
Reviewed-by: Peter Hornyack peterhorny...@google.com
CC: Paolo Bonzini pbonz...@redhat.com
CC: Gleb Natapov g...@kernel.org
---
 include/linux/kvm_host.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 9564fd7..2b2edf1 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -424,6 +424,9 @@ struct kvm {
 #define vcpu_unimpl(vcpu, fmt, ...)					\
 	kvm_pr_unimpl("vcpu%i " fmt, (vcpu)->vcpu_id, ## __VA_ARGS__)
 
+#define vcpu_debug(vcpu, fmt, ...)					\
+	kvm_debug("vcpu%i " fmt, (vcpu)->vcpu_id, ## __VA_ARGS__)
+
 static inline struct kvm_vcpu *kvm_get_vcpu(struct kvm *kvm, int i)
 {
 	smp_rmb();
-- 
2.1.4



Re: [PATCH 05/10] KVM: arm/arm64: vgic: Relax vgic_can_sample_irq for edge IRQs

2015-07-02 Thread Christoffer Dall
On Wed, Jul 01, 2015 at 07:18:40PM +0100, Marc Zyngier wrote:
> On 01/07/15 12:58, Christoffer Dall wrote:
>> On Wed, Jul 01, 2015 at 10:17:52AM +0100, Marc Zyngier wrote:
>>> On 30/06/15 21:19, Christoffer Dall wrote:
>>>> On Mon, Jun 08, 2015 at 06:04:00PM +0100, Marc Zyngier wrote:
>>>>> We only set the irq_queued flag for level interrupts, meaning
>>>>> that !vgic_irq_is_queued(vcpu, irq) is a good enough predicate
>>>>> for all interrupts.
>>>>>
>>>>> This will allow us to inject edge HW interrupts, for which the
>>>>> state ACTIVE+PENDING is not allowed.
>>>>
>>>> I don't understand this; ACTIVE+PENDING is allowed for edge interrupts.
>>>> Do you mean that if we set the HW bit in the LR, then we are linking to
>>>> an HW interrupt where we don't allow that to be ACTIVE+PENDING on the HW
>>>> GIC side?
>>>>
>>>> Why is this relevant here?  I feel like I'm missing context.
>>>
>>> I've probably taken a shortcut here - bear with me while I'm trying to
>>> explain the issue.
>>>
>>> For HW interrupts, we shouldn't even try to use the state bits in the
>>> LR, because that state is contained in the physical distributor. Setting
>>> the HW bit really means "there is something going on at the distributor
>>> level, just go there".
>>
>> ok, so by HW interrupts you mean virtual interrupts with the HW bit in
>> the LR set, correct?
>
> Yes, sorry.
>
>>> If we were to inject an ACTIVE+PENDING interrupt at the LR level, we'd
>>> basically lose the second interrupt because that state is simply not
>>> considered.
>>
>> Huh?  Which second interrupt.  I looked at the spec and it says don't
>> use the state bits for HW interrupts, so isn't it simply not supported
>> to set these bits at all and that's it?
>
> I managed to confuse myself reading the same bit. It says (GICv3 spec):
>
> "A hypervisor must only use the pending and active state for software
> originated interrupts, which are typically associated with virtual
> devices, or SGIs."
>
> That's the PENDING+ACTIVE state, and not the pending and active bits
> like I read it initially.
>
> Now consider the following scenario:
>
> - We inject a virtual edge interrupt
> - We mark the corresponding physical interrupt as active.
> - Queue interrupt in an LR
> - Resume vcpu
>
> Now, we inject another edge interrupt, the vcpu exits for whatever
> reason, and the previously injected interrupt is still active.
>
> The normal vGIC flow would be to mark the interrupt as ACTIVE+PENDING in
> the LR, and resume the vcpu. But the above states that this is invalid
> for HW generated interrupts.

Right, ok, so we must resample the pending state even for an
edge-triggered interrupt once it's EOIed, because we cannot put it in
the LR despite it being pending on the physical distributor?

Incidentally, we do not need to set the EOI_INT bit, because when the
guest EOIs the interrupt, it will also deactivate it on the physical
distributor and the hardware will then take the pending physical
interrupt, we will handle it in the host, etc. etc.

If we had a different *shared* edge-triggered device than the timer,
wouldn't we then also need to capture the physical distributor's pending
state along with the state of the device, unless we assume that, upon
restoring the state for the device, we can count on the device to
produce another rising/falling edge to trigger the interrupt again? (I
assume the line would always go high for a level-triggered interrupt in
this case).

>>> So the trick we're using is to only inject the active interrupt, and
>>> prevent anything else from being injected until we can confirm that the
>>> active state has been cleared at the physical level.
>>>
>>> Does it make any sense?
>>
>> Sort of, but what I don't understand now is how the guest ever sees the
>> interrupt then.  If we always inject the virtual interrupt by setting
>> the active state on the physical distributor, and we can't inject this
>> as active+pending, and the guest doesn't see the state in the LR, then
>> how does this ever raise a virtual interrupt and how does the guest see
>> an interrupt which is only PENDING so that it can ack it etc. etc.?
>>
>> Maybe I don't fully understand how the HW bit works after all...
>
> The way the spec is written is slightly misleading. But the gist of it
> is that we still signal the guest using the PENDING bit in the LR, and
> switch the LR as usual. It is just that we can't use the PENDING+ACTIVE
> state (apparently, this can lead to a double deactivation).
>
> Not sure the above makes sense. Beer time, I suppose.

It does make sense, I just had to sleep on it and see the code as a
whole instead of trying to understand it by just looking at this patch
individually.

Thanks,
-Christoffer
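
The claim in the quoted commit message, that !vgic_irq_is_queued() is a good
enough predicate once the queued flag is only ever set for level
interrupts, can be checked with a tiny model. This is a sketch under
assumptions: demo_irq and the demo_ functions are hypothetical, and the
"old" predicate (edge, or not queued) is an assumed form of the check
before the change, not code quoted in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-interrupt state; field names are illustrative only. */
struct demo_irq {
	bool is_edge;
	bool queued;	/* assumed invariant: only set for level interrupts */
};

/* Assumed earlier predicate: edge interrupts can always be sampled. */
static bool demo_can_sample_old(const struct demo_irq *irq)
{
	return irq->is_edge || !irq->queued;
}

/* Relaxed predicate: rely on the queued flag alone. */
static bool demo_can_sample_new(const struct demo_irq *irq)
{
	return !irq->queued;
}

/* True if both predicates agree on every state allowed by the invariant. */
static bool demo_predicates_agree(void)
{
	for (int edge = 0; edge <= 1; edge++) {
		for (int queued = 0; queued <= 1; queued++) {
			struct demo_irq irq = { edge != 0, queued != 0 };

			if (irq.is_edge && irq.queued)
				continue;	/* excluded by the invariant */
			if (demo_can_sample_old(&irq) != demo_can_sample_new(&irq))
				return false;
		}
	}
	return true;
}
```

The only state in which the two predicates differ is an edge interrupt with
the queued flag set, which is the case the relaxed predicate is meant to
block once HW edge interrupts start using the flag.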


Re: [PATCH 07/12] kvm/x86: added hyper-v crash data and ctl msr's get/set'ers

2015-07-02 Thread Paolo Bonzini


On 02/07/2015 18:07, Denis V. Lunev wrote:
> From: Andrey Smetanin asmeta...@virtuozzo.com
>
> Added hyper-v crash msr's(HV_X64_MSR_CRASH*) data and control
> geters and setters. Userspace should check that such msr's
> available by check of KVM_CAP_HYPERV_MSR_CRASH capability.

It should use the existing KVM_GET_SUPPORTED_MSRS infrastructure.  See
emulated_msrs where other Hyper-V MSRs are listed.

Paolo


[PATCH 8/12] kvm/x86: add sending hyper-v crash notification to user space

2015-07-02 Thread Denis V. Lunev
From: Andrey Smetanin asmeta...@virtuozzo.com

Sending of notification is done by exiting vcpu to user space
if KVM_REQ_HV_CRASH is enabled for vcpu. At exit to user space
the kvm_run structure contains system_event with type
KVM_SYSTEM_EVENT_CRASH to notify about guest crash occured.

Signed-off-by: Andrey Smetanin asmeta...@virtuozzo.com
Signed-off-by: Denis V. Lunev d...@openvz.org
Reviewed-by: Peter Hornyack peterhorny...@google.com
CC: Paolo Bonzini pbonz...@redhat.com
CC: Gleb Natapov g...@kernel.org
---
 Documentation/virtual/kvm/api.txt | 5 +
 arch/x86/kvm/x86.c| 6 ++
 include/uapi/linux/kvm.h  | 1 +
 3 files changed, 12 insertions(+)

diff --git a/Documentation/virtual/kvm/api.txt 
b/Documentation/virtual/kvm/api.txt
index a7926a9..a4ebcb7 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -3277,6 +3277,7 @@ should put the acknowledged interrupt vector into the 
'epr' field.
struct {
 #define KVM_SYSTEM_EVENT_SHUTDOWN   1
 #define KVM_SYSTEM_EVENT_RESET  2
+#define KVM_SYSTEM_EVENT_CRASH  3
__u32 type;
__u64 flags;
} system_event;
@@ -3296,6 +3297,10 @@ Valid values for 'type' are:
   KVM_SYSTEM_EVENT_RESET -- the guest has requested a reset of the VM.
As with SHUTDOWN, userspace can choose to ignore the request, or
to schedule the reset to occur in the future and may call KVM_RUN again.
+  KVM_SYSTEM_EVENT_CRASH -- the guest crash occurred and the guest
+   has requested a crash condition maintenance. Userspace can choose
+   to ignore the request, or to gather VM memory core dump and/or
+   reset/shutdown of the VM.
 
/* Fix the size of the union. */
char padding[256];
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b4c2767..28e79c0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6265,6 +6265,12 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
vcpu_scan_ioapic(vcpu);
if (kvm_check_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu))
kvm_vcpu_reload_apic_access_page(vcpu);
+   if (kvm_check_request(KVM_REQ_HV_CRASH, vcpu)) {
+   vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
+   vcpu->run->system_event.type = KVM_SYSTEM_EVENT_CRASH;
+   r = 0;
+   goto out;
+   }
}
 
if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 5da4ca3..c8c6b8b 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -317,6 +317,7 @@ struct kvm_run {
struct {
 #define KVM_SYSTEM_EVENT_SHUTDOWN   1
 #define KVM_SYSTEM_EVENT_RESET  2
+#define KVM_SYSTEM_EVENT_CRASH  3
__u32 type;
__u64 flags;
} system_event;
-- 
2.1.4
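
A rough sketch of how a user space VMM might dispatch on the new event type is given below. It is only an illustration: struct demo_system_event stands in for the system_event member of struct kvm_run, and the demo_action values and demo_handle_system_event() are hypothetical names, not part of KVM or QEMU. The three KVM_SYSTEM_EVENT_* values are the ones from the patch above.

```c
#include <stdint.h>

/* Event types as defined by the patch above. */
#define KVM_SYSTEM_EVENT_SHUTDOWN 1
#define KVM_SYSTEM_EVENT_RESET    2
#define KVM_SYSTEM_EVENT_CRASH    3

/* Hypothetical stand-in for kvm_run.system_event. */
struct demo_system_event {
    uint32_t type;
    uint64_t flags;
};

/* Hypothetical policy outcomes a VMM could choose from. */
enum demo_action {
    DEMO_ACTION_SHUTDOWN,
    DEMO_ACTION_RESET,
    DEMO_ACTION_DUMP_AND_PAUSE,
    DEMO_ACTION_UNKNOWN
};

/* Map a system event reported by the kernel to a VMM action. */
enum demo_action demo_handle_system_event(const struct demo_system_event *ev)
{
    switch (ev->type) {
    case KVM_SYSTEM_EVENT_SHUTDOWN:
        return DEMO_ACTION_SHUTDOWN;
    case KVM_SYSTEM_EVENT_RESET:
        return DEMO_ACTION_RESET;
    case KVM_SYSTEM_EVENT_CRASH:
        /* User space may also ignore the request; pausing is one option. */
        return DEMO_ACTION_DUMP_AND_PAUSE;
    default:
        return DEMO_ACTION_UNKNOWN;
    }
}
```

Per the api.txt hunk, user space is free to ignore the crash request, so the pause-and-dump choice here is just one possible policy.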



[PATCH 03/12] kvm: add hyper-v crash msrs values

2015-07-02 Thread Denis V. Lunev
From: Andrey Smetanin asmeta...@virtuozzo.com

Added Hyper-V crash msrs values - HV_X64_MSR_CRASH*.

Signed-off-by: Andrey Smetanin asmeta...@virtuozzo.com
Signed-off-by: Denis V. Lunev d...@openvz.org
Reviewed-by: Peter Hornyack peterhorny...@google.com
CC: Paolo Bonzini pbonz...@redhat.com
CC: Gleb Natapov g...@kernel.org
---
 arch/x86/include/uapi/asm/hyperv.h | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/include/uapi/asm/hyperv.h b/arch/x86/include/uapi/asm/hyperv.h
index ce6068d..8fba544 100644
--- a/arch/x86/include/uapi/asm/hyperv.h
+++ b/arch/x86/include/uapi/asm/hyperv.h
@@ -199,6 +199,17 @@
 #define HV_X64_MSR_STIMER3_CONFIG              0x400000B6
 #define HV_X64_MSR_STIMER3_COUNT               0x400000B7
 
+/* Hyper-V guest crash notification MSR's */
+#define HV_X64_MSR_CRASH_P0                    0x40000100
+#define HV_X64_MSR_CRASH_P1                    0x40000101
+#define HV_X64_MSR_CRASH_P2                    0x40000102
+#define HV_X64_MSR_CRASH_P3                    0x40000103
+#define HV_X64_MSR_CRASH_P4                    0x40000104
+#define HV_X64_MSR_CRASH_CTL                   0x40000105
+#define HV_X64_MSR_CRASH_CTL_NOTIFY            (1ULL << 63)
+#define HV_X64_MSR_CRASH_PARAMS                \
+   (1 + (HV_X64_MSR_CRASH_P4 - HV_X64_MSR_CRASH_P0))
+
 #define HV_X64_MSR_HYPERCALL_ENABLE            0x00000001
 #define HV_X64_MSR_HYPERCALL_PAGE_ADDRESS_SHIFT        12
 #define HV_X64_MSR_HYPERCALL_PAGE_ADDRESS_MASK \
-- 
2.1.4
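
To make the arithmetic behind HV_X64_MSR_CRASH_PARAMS and the notify bit concrete, here is a small sketch. The MSR numbers are assumed to be the 0x40000100..0x40000105 range used for the Hyper-V crash MSRs; demo_crash_param_index() and demo_crash_notify_requested() are hypothetical helpers written for illustration, not functions from the series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed MSR numbers for the Hyper-V crash registers. */
#define HV_X64_MSR_CRASH_P0         0x40000100
#define HV_X64_MSR_CRASH_P4         0x40000104
#define HV_X64_MSR_CRASH_CTL        0x40000105
#define HV_X64_MSR_CRASH_CTL_NOTIFY (1ULL << 63)
#define HV_X64_MSR_CRASH_PARAMS \
    (1 + (HV_X64_MSR_CRASH_P4 - HV_X64_MSR_CRASH_P0))

/* Hypothetical helper: index of a crash parameter MSR, or -1 if out of range. */
int demo_crash_param_index(uint32_t msr)
{
    if (msr < HV_X64_MSR_CRASH_P0 || msr > HV_X64_MSR_CRASH_P4)
        return -1;
    return (int)(msr - HV_X64_MSR_CRASH_P0);
}

/* Hypothetical helper: true if the guest set the notify bit in CRASH_CTL. */
bool demo_crash_notify_requested(uint64_t ctl)
{
    return (ctl & HV_X64_MSR_CRASH_CTL_NOTIFY) != 0;
}
```

Because P0..P4 are consecutive, the parameter count works out to five and the MSR number minus P0 can serve directly as an array index.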



[PATCH 06/12] kvm/x86: mark hyper-v crash msrs as partition wide

2015-07-02 Thread Denis V. Lunev
From: Andrey Smetanin asmeta...@virtuozzo.com

Hyper-V crash MSRs are per VM, not per vcpu, so mark them
as partition wide.

Signed-off-by: Andrey Smetanin asmeta...@virtuozzo.com
Signed-off-by: Denis V. Lunev d...@openvz.org
Reviewed-by: Peter Hornyack peterhorny...@google.com
CC: Paolo Bonzini pbonz...@redhat.com
CC: Gleb Natapov g...@kernel.org
---
 arch/x86/kvm/hyperv.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 2b49f10..af83c96 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -39,6 +39,8 @@ static bool kvm_hv_msr_partition_wide(u32 msr)
case HV_X64_MSR_HYPERCALL:
case HV_X64_MSR_REFERENCE_TSC:
case HV_X64_MSR_TIME_REF_COUNT:
+   case HV_X64_MSR_CRASH_CTL:
+   case HV_X64_MSR_CRASH_P0 ... HV_X64_MSR_CRASH_P4:
r = true;
break;
}
-- 
2.1.4



[PATCH 09/12] qemu: added qemu_system_guest_panicked() - generic guest panic handler

2015-07-02 Thread Denis V. Lunev
From: Andrey Smetanin asmeta...@virtuozzo.com

There are pieces of guest panic handling code that can be shared
in one generic function.

Signed-off-by: Andrey Smetanin asmeta...@virtuozzo.com
Signed-off-by: Denis V. Lunev d...@openvz.org
CC: Paolo Bonzini pbonz...@redhat.com
CC: Andreas Färber afaer...@suse.de
---
 hw/misc/pvpanic.c       |  3 +--
 include/sysemu/sysemu.h |  1 +
 target-s390x/kvm.c      | 11 ++-
 vl.c                    |  6 ++
 4 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/hw/misc/pvpanic.c b/hw/misc/pvpanic.c
index 994f8af..3709488 100644
--- a/hw/misc/pvpanic.c
+++ b/hw/misc/pvpanic.c
@@ -41,8 +41,7 @@ static void handle_event(int event)
 }
 
 if (event & PVPANIC_PANICKED) {
-qapi_event_send_guest_panicked(GUEST_PANIC_ACTION_PAUSE, error_abort);
-vm_stop(RUN_STATE_GUEST_PANICKED);
+qemu_system_guest_panicked();
 return;
 }
 }
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index df80951..70164c9 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -68,6 +68,7 @@ int qemu_reset_requested_get(void);
 void qemu_system_killed(int signal, pid_t pid);
 void qemu_devices_reset(void);
 void qemu_system_reset(bool report);
+void qemu_system_guest_panicked(void);
 
 void qemu_add_exit_notifier(Notifier *notify);
 void qemu_remove_exit_notifier(Notifier *notify);
diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c
index 135111a..e5bd3ef 100644
--- a/target-s390x/kvm.c
+++ b/target-s390x/kvm.c
@@ -1796,13 +1796,6 @@ static bool is_special_wait_psw(CPUState *cs)
 return cs->kvm_run->psw_addr == 0xfffUL;
 }
 
-static void guest_panicked(void)
-{
-qapi_event_send_guest_panicked(GUEST_PANIC_ACTION_PAUSE,
-   error_abort);
-vm_stop(RUN_STATE_GUEST_PANICKED);
-}
-
 static void unmanageable_intercept(S390CPU *cpu, const char *str, int pswoffset)
 {
 CPUState *cs = CPU(cpu);
@@ -1811,7 +1804,7 @@ static void unmanageable_intercept(S390CPU *cpu, const char *str, int pswoffset)
  str, cs->cpu_index, ldq_phys(cs->as, cpu->env.psa + pswoffset),
  ldq_phys(cs->as, cpu->env.psa + pswoffset + 8));
 s390_cpu_halt(cpu);
-guest_panicked();
+qemu_system_guest_panicked();
 }
 
 static int handle_intercept(S390CPU *cpu)
@@ -1844,7 +1837,7 @@ static int handle_intercept(S390CPU *cpu)
 if (is_special_wait_psw(cs)) {
 qemu_system_shutdown_request();
 } else {
-guest_panicked();
+qemu_system_guest_panicked();
 }
 }
 r = EXCP_HALTED;
diff --git a/vl.c b/vl.c
index 69ad90c..38eee1f 100644
--- a/vl.c
+++ b/vl.c
@@ -1721,6 +1721,12 @@ void qemu_system_reset(bool report)
 cpu_synchronize_all_post_reset();
 }
 
+void qemu_system_guest_panicked(void)
+{
+qapi_event_send_guest_panicked(GUEST_PANIC_ACTION_PAUSE, error_abort);
+vm_stop(RUN_STATE_GUEST_PANICKED);
+}
+
 void qemu_system_reset_request(void)
 {
 if (no_reboot) {
-- 
2.1.4



[PATCH 11/12] qemu: add crash_occurred flag into CPUState

2015-07-02 Thread Denis V. Lunev
From: Andrey Smetanin asmeta...@virtuozzo.com

The CPUState->crash_occurred value marks that a guest crash
occurred. This value is added to the cpu common
migration subsections.

Signed-off-by: Andrey Smetanin asmeta...@virtuozzo.com
Signed-off-by: Denis V. Lunev d...@openvz.org
CC: Paolo Bonzini pbonz...@redhat.com
CC: Andreas Färber afaer...@suse.de
---
 exec.c            | 19 +++
 include/qom/cpu.h |  1 +
 vl.c              |  3 +++
 3 files changed, 23 insertions(+)

diff --git a/exec.c b/exec.c
index f7883d2..adf49e8 100644
--- a/exec.c
+++ b/exec.c
@@ -465,6 +465,24 @@ static const VMStateDescription vmstate_cpu_common_exception_index = {
 }
 };
 
+static bool cpu_common_crash_occurred_needed(void *opaque)
+{
+CPUState *cpu = opaque;
+
+return cpu->crash_occurred != 0;
+}
+
+static const VMStateDescription vmstate_cpu_common_crash_occurred = {
+.name = "cpu_common/crash_occurred",
+.version_id = 1,
+.minimum_version_id = 1,
+.needed = cpu_common_crash_occurred_needed,
+.fields = (VMStateField[]) {
+VMSTATE_UINT32(crash_occurred, CPUState),
+VMSTATE_END_OF_LIST()
+}
+};
+
 const VMStateDescription vmstate_cpu_common = {
 .name = "cpu_common",
 .version_id = 1,
@@ -478,6 +496,7 @@ const VMStateDescription vmstate_cpu_common = {
 },
 .subsections = (const VMStateDescription*[]) {
 &vmstate_cpu_common_exception_index,
+&vmstate_cpu_common_crash_occurred,
 NULL
 }
 };
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 39f0f19..f559a69 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -263,6 +263,7 @@ struct CPUState {
 bool created;
 bool stop;
 bool stopped;
+uint32_t crash_occurred;
 volatile sig_atomic_t exit_request;
 uint32_t interrupt_request;
 int singlestep_enabled;
diff --git a/vl.c b/vl.c
index 38eee1f..9e0aee5 100644
--- a/vl.c
+++ b/vl.c
@@ -1723,6 +1723,9 @@ void qemu_system_reset(bool report)
 
 void qemu_system_guest_panicked(void)
 {
+if (current_cpu) {
+current_cpu->crash_occurred = 1;
+}
 qapi_event_send_guest_panicked(GUEST_PANIC_ACTION_PAUSE, error_abort);
 vm_stop(RUN_STATE_GUEST_PANICKED);
 }
-- 
2.1.4
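
The effect of the .needed callback can be pictured with a toy model: a subsection is only put on the migration stream when its callback returns true, so a guest that never crashed produces no extra data. The types and functions below (demo_cpu, demo_subsection, demo_count_sent) are hypothetical simplifications and not the QEMU vmstate API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for CPUState. */
struct demo_cpu {
    uint32_t crash_occurred;
};

/* Hypothetical, reduced stand-in for a VMStateDescription subsection. */
struct demo_subsection {
    const char *name;
    bool (*needed)(void *opaque);
};

/* Mirrors cpu_common_crash_occurred_needed() from the patch. */
static bool demo_crash_occurred_needed(void *opaque)
{
    struct demo_cpu *cpu = opaque;

    return cpu->crash_occurred != 0;
}

static const struct demo_subsection demo_crash_subsection = {
    .name = "cpu_common/crash_occurred",
    .needed = demo_crash_occurred_needed,
};

/* Count how many subsections of a NULL-terminated list would be sent. */
size_t demo_count_sent(const struct demo_subsection *const *subs, void *opaque)
{
    size_t sent = 0;

    for (; *subs != NULL; subs++) {
        if ((*subs)->needed(opaque)) {
            sent++;
        }
    }
    return sent;
}
```

In this model the crash_occurred subsection is skipped while the flag is zero and sent once qemu_system_guest_panicked() has set it.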



[PATCH 10/12] qemu/kvm: added kvm system event crash handler

2015-07-02 Thread Denis V. Lunev
From: Andrey Smetanin asmeta...@virtuozzo.com

The KVM kernel module can report guest crash events. This patch calls
the appropriate handler for such an event, which is recognized by the
KVM_SYSTEM_EVENT_CRASH type of system event.

Signed-off-by: Andrey Smetanin asmeta...@virtuozzo.com
Signed-off-by: Denis V. Lunev d...@openvz.org
CC: Paolo Bonzini pbonz...@redhat.com
CC: Andreas Färber afaer...@suse.de
---
 kvm-all.c                 | 4 ++++
 linux-headers/linux/kvm.h | 1 +
 2 files changed, 5 insertions(+)

diff --git a/kvm-all.c b/kvm-all.c
index 53e01d4..7a959b6 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1844,6 +1844,10 @@ int kvm_cpu_exec(CPUState *cpu)
 qemu_system_reset_request();
 ret = EXCP_INTERRUPT;
 break;
+case KVM_SYSTEM_EVENT_CRASH:
+qemu_system_guest_panicked();
+ret = 0;
+break;
 default:
 DPRINTF("kvm_arch_handle_exit\n");
 ret = kvm_arch_handle_exit(cpu, run);
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index fad9e5c..409be37 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -317,6 +317,7 @@ struct kvm_run {
struct {
 #define KVM_SYSTEM_EVENT_SHUTDOWN   1
 #define KVM_SYSTEM_EVENT_RESET  2
+#define KVM_SYSTEM_EVENT_CRASH  3
__u32 type;
__u64 flags;
} system_event;
-- 
2.1.4




[PATCH 04/12] kvm/x86: added hyper-v crash msrs into kvm hyperv context

2015-07-02 Thread Denis V. Lunev
From: Andrey Smetanin asmeta...@virtuozzo.com

Added hv crash variables to the kvm Hyper-V context as storage
for the Hyper-V crash MSRs.

Signed-off-by: Andrey Smetanin asmeta...@virtuozzo.com
Signed-off-by: Denis V. Lunev d...@openvz.org
Reviewed-by: Peter Hornyack peterhorny...@google.com
CC: Paolo Bonzini pbonz...@redhat.com
CC: Gleb Natapov g...@kernel.org
---
 arch/x86/include/asm/kvm_host.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 78616aa..697c1f3 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -595,6 +595,10 @@ struct kvm_hv {
u64 hv_guest_os_id;
u64 hv_hypercall;
u64 hv_tsc_page;
+
+   /* Hyper-v based guest crash (NT kernel bugcheck) parameters */
+   u64 hv_crash_param[HV_X64_MSR_CRASH_PARAMS];
+   u64 hv_crash_ctl;
 };
 
 struct kvm_arch {
-- 
2.1.4



[PATCH 05/12] kvm: added KVM_REQ_HV_CRASH value to notify qemu about hyper-v crash

2015-07-02 Thread Denis V. Lunev
From: Andrey Smetanin asmeta...@virtuozzo.com

Added KVM_REQ_HV_CRASH, a vcpu request used to notify user space (QEMU)
about a Hyper-V crash.

Signed-off-by: Andrey Smetanin asmeta...@virtuozzo.com
Signed-off-by: Denis V. Lunev d...@openvz.org
Reviewed-by: Peter Hornyack peterhorny...@google.com
CC: Paolo Bonzini pbonz...@redhat.com
CC: Gleb Natapov g...@kernel.org
---
 include/linux/kvm_host.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 2b2edf1..a377e00 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -139,6 +139,7 @@ static inline bool is_error_page(struct page *page)
 #define KVM_REQ_DISABLE_IBS   24
 #define KVM_REQ_APIC_PAGE_RELOAD  25
 #define KVM_REQ_SMI   26
+#define KVM_REQ_HV_CRASH  27
 
 #define KVM_USERSPACE_IRQ_SOURCE_ID        0
 #define KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID   1
-- 
2.1.4



Re: [PATCH 12/12] qemu/kvm/x86: hyper-v crash msrs set/get'ers and migration

2015-07-02 Thread Paolo Bonzini

On 02/07/2015 18:07, Denis V. Lunev wrote:
 +if (cpu->hyperv_crash &&
 +kvm_check_extension(cs->kvm_state, KVM_CAP_HYPERV_MSR_CRASH) > 0) {
 +c->edx |= HV_X64_GUEST_CRASH_MSR_AVAILABLE;
 +has_msr_hv_crash = true;
 +}
 +
 +

Please patch kvm_get_supported_msrs instead of adding a capability.  The
QEMU parts are otherwise okay.

Paolo

