Re: [PATCH v11 2/3] x86, apicv: add virtual x2apic support

2013-01-22 Thread Gleb Natapov
On Mon, Jan 21, 2013 at 08:16:18PM -0200, Marcelo Tosatti wrote:
 On Mon, Jan 21, 2013 at 11:34:20PM +0200, Gleb Natapov wrote:
  On Mon, Jan 21, 2013 at 07:21:13PM -0200, Marcelo Tosatti wrote:
   On Mon, Jan 21, 2013 at 10:21:14PM +0200, Gleb Natapov wrote:
  }
  +
  +   vcpu->arch.apic_base = value;
 
 Simpler to have
 
 if (apic_x2apic_mode(apic)) {
   ...
   kvm_x86_ops->set_virtual_x2apic_mode(vcpu, true);
 } else {
   kvm_x86_ops->set_virtual_x2apic_mode(vcpu, false);
 }
 
This will not work during cpu init. That was discussed on one of
the previous iterations of the patch. When this code is called during
vcpu init vmcs is not loaded yet so set_virtual_x2apic_mode() cannot
write into it.
   
   Are you saying that the logic to write on bit value change is due to 
   ordering with cpu init or that the callback is at the wrong place?
   
  The logic is because of ordering with cpu init.
 
 OK. Still must move this conditional callback after assignment of apic_base.
 
Sure.
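
The ordering agreed on above can be illustrated with a small standalone sketch. It is not the kernel code: the struct, the counter standing in for the VMCS write, and all demo_ names are assumptions made for the example. The callback runs only when the x2APIC enable bit (bit 10 of the APIC base MSR) actually changes, and only after apic_base holds the new value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 10 of the APIC base MSR is the x2APIC enable bit. */
#define DEMO_X2APIC_ENABLE (1ULL << 10)

/* Hypothetical stand-in for the vcpu state touched by kvm_lapic_set_base(). */
struct demo_vcpu {
    uint64_t apic_base;
    bool virtual_x2apic;   /* what the callback last programmed */
    int callback_calls;    /* how often the callback ran */
};

/* Stand-in for kvm_x86_ops->set_virtual_x2apic_mode(). */
static void demo_set_virtual_x2apic_mode(struct demo_vcpu *vcpu, bool set)
{
    vcpu->virtual_x2apic = set;
    vcpu->callback_calls++;
}

static void demo_set_base(struct demo_vcpu *vcpu, uint64_t value)
{
    /* Record whether the enable bit flips before overwriting apic_base. */
    bool changed = ((vcpu->apic_base ^ value) & DEMO_X2APIC_ENABLE) != 0;

    vcpu->apic_base = value;

    /* The callback runs after the assignment and only on a real change, so
     * an initial write with the bit clear never reaches it. */
    if (changed)
        demo_set_virtual_x2apic_mode(vcpu, (value & DEMO_X2APIC_ENABLE) != 0);
}
```

Writing the same value twice triggers the callback at most once, which is the property that keeps the vcpu-init path away from the not-yet-loaded vmcs.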

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v11 2/3] x86, apicv: add virtual x2apic support

2013-01-22 Thread Gleb Natapov
On Mon, Jan 21, 2013 at 10:33:46PM -0200, Marcelo Tosatti wrote:
  The question is, why is intercept for EOI MSR address (0x80B) not being
  disabled here, while TPR is? I don't see intercept disabled by other
  patches either.
 
 Point still valid: why intercept for EOI MSR address not being disabled?
Yang sent two versions of the third patch. The second one disables the
intercept for the EOI MSR. Ignore the first one.

--
Gleb.


Re: [QEMU PATCH v4 2/3] virtio-net: introduce a new macaddr control

2013-01-22 Thread Amos Kong
On Mon, Jan 21, 2013 at 05:08:26PM +0100, Stefan Hajnoczi wrote:
 On Sat, Jan 19, 2013 at 09:54:27AM +0800, ak...@redhat.com wrote:
  @@ -350,6 +351,18 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd,
       struct virtio_net_ctrl_mac mac_data;
       size_t s;
   
  +    if (cmd == VIRTIO_NET_CTRL_MAC_ADDR_SET) {
  +        if (iov_size(iov, iov_cnt) != ETH_ALEN) {
  +            return VIRTIO_NET_ERR;
  +        }
  +        s = iov_to_buf(iov, iov_cnt, 0, &n->mac, sizeof(n->mac));
  +        if (s != sizeof(n->mac)) {
  +            return VIRTIO_NET_ERR;
  +        }

 
 Since iov_size() was checked before iov_to_buf(), we never hit this
 error.  And if we did, n->mac would be trashed (i.e. error handling is
 not complete).

You are right.
iov_size() computes the size by summing iov[].iov_len over all elements,
so the first check is enough.
 
 I think assert(s == sizeof(n->mac)) is more appropriate.
 Also, please change ETH_ALEN to sizeof(n->mac) to make the relationship
 between the check and the copy clear.
 

Will update this patch.

 if (cmd == VIRTIO_NET_CTRL_MAC_ADDR_SET) {
     if (iov_size(iov, iov_cnt) != sizeof(n->mac)) {
         return VIRTIO_NET_ERR;
     }
     s = iov_to_buf(iov, iov_cnt, 0, &n->mac, sizeof(n->mac));
     assert(s == sizeof(n->mac));
     qemu_format_nic_info_str(&n->nic->nc, n->mac);
     return VIRTIO_NET_OK;
 }
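
Why the size check makes the later error path unreachable can be shown with a standalone sketch. The demo_ helpers below are simplified, hypothetical re-implementations of the idea behind iov_size() and iov_to_buf() (offset 0 only); they are not QEMU's functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

#define DEMO_ETH_ALEN 6

/* Total number of bytes described by the scatter/gather list. */
static size_t demo_iov_size(const struct iovec *iov, unsigned int iov_cnt)
{
    size_t total = 0;
    for (unsigned int i = 0; i < iov_cnt; i++)
        total += iov[i].iov_len;
    return total;
}

/* Copy up to `bytes` bytes from the start of the list into buf and return
 * the number of bytes actually copied. */
static size_t demo_iov_to_buf(const struct iovec *iov, unsigned int iov_cnt,
                              void *buf, size_t bytes)
{
    size_t done = 0;
    for (unsigned int i = 0; i < iov_cnt && done < bytes; i++) {
        size_t len = iov[i].iov_len;
        if (len > bytes - done)
            len = bytes - done;
        memcpy((unsigned char *)buf + done, iov[i].iov_base, len);
        done += len;
    }
    return done;
}

/* Returns 0 on success, -1 if the payload is not exactly one MAC address. */
static int demo_set_mac(unsigned char mac[DEMO_ETH_ALEN],
                        const struct iovec *iov, unsigned int iov_cnt)
{
    size_t s;

    if (demo_iov_size(iov, iov_cnt) != DEMO_ETH_ALEN)
        return -1;                 /* mac is left untouched on this path */
    s = demo_iov_to_buf(iov, iov_cnt, mac, DEMO_ETH_ALEN);
    assert(s == DEMO_ETH_ALEN);    /* cannot fail after the size check */
    (void)s;
    return 0;
}
```

Because the copy can only come up short when the list holds fewer bytes than requested, rejecting a wrong total size first means the destination is never partially overwritten.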

 Stefan


RE: [PATCH v11 2/3] x86, apicv: add virtual x2apic support

2013-01-22 Thread Zhang, Yang Z
Marcelo Tosatti wrote on 2013-01-22:
 On Wed, Jan 16, 2013 at 06:21:11PM +0800, Yang Zhang wrote:
 From: Yang Zhang yang.z.zh...@intel.com
 
 basically to benefit from apicv, we need to enable virtualized x2apic mode.
 Currently, we only enable it when guest is really using x2apic.
 
 Also, clear MSR bitmap for corresponding x2apic MSRs when guest enabled
 x2apic:
 0x800 - 0x8ff: no read intercept for apicv register virtualization,
                except APIC ID and TMCCT which need software's assistance to
                get right value.
 
 Signed-off-by: Kevin Tian kevin.t...@intel.com
 Signed-off-by: Yang Zhang yang.z.zh...@intel.com
 ---
  arch/x86/include/asm/kvm_host.h |    1 +
  arch/x86/include/asm/vmx.h      |    1 +
  arch/x86/kvm/lapic.c            |   20 ++--
  arch/x86/kvm/lapic.h            |    5 +
  arch/x86/kvm/svm.c              |    6 +
  arch/x86/kvm/vmx.c              |  204 +++
  6 files changed, 209 insertions(+), 28 deletions(-)
 diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
 index c431b33..35aa8e6 100644
 --- a/arch/x86/include/asm/kvm_host.h
 +++ b/arch/x86/include/asm/kvm_host.h
 @@ -697,6 +697,7 @@ struct kvm_x86_ops {
      void (*enable_nmi_window)(struct kvm_vcpu *vcpu);
      void (*enable_irq_window)(struct kvm_vcpu *vcpu);
      void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
 +    void (*set_virtual_x2apic_mode)(struct kvm_vcpu *vcpu, bool set);
      int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
      int (*get_tdp_level)(void);
      u64 (*get_mt_mask)(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio);
 diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
 index 44c3f7e..0a54df0 100644
 --- a/arch/x86/include/asm/vmx.h
 +++ b/arch/x86/include/asm/vmx.h
 @@ -139,6 +139,7 @@
  #define SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES 0x0001
  #define SECONDARY_EXEC_ENABLE_EPT               0x0002
  #define SECONDARY_EXEC_RDTSCP                   0x0008
 +#define SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE   0x0010
  #define SECONDARY_EXEC_ENABLE_VPID              0x0020
  #define SECONDARY_EXEC_WBINVD_EXITING           0x0040
  #define SECONDARY_EXEC_UNRESTRICTED_GUEST       0x0080
 diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
 index 0664c13..f39aee3 100644
 --- a/arch/x86/kvm/lapic.c
 +++ b/arch/x86/kvm/lapic.c
 @@ -140,11 +140,6 @@ static inline int apic_enabled(struct kvm_lapic *apic)
      (LVT_MASK | APIC_MODE_MASK | APIC_INPUT_POLARITY | \
       APIC_LVT_REMOTE_IRR | APIC_LVT_LEVEL_TRIGGER)
 -static inline int apic_x2apic_mode(struct kvm_lapic *apic)
 -{
 -    return apic->vcpu->arch.apic_base & X2APIC_ENABLE;
 -}
 -
  static inline int kvm_apic_id(struct kvm_lapic *apic)
  {
      return (kvm_apic_get_reg(apic, APIC_ID) >> 24) & 0xff;
 @@ -1323,12 +1318,17 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
      if (!kvm_vcpu_is_bsp(apic->vcpu))
          value &= ~MSR_IA32_APICBASE_BSP;
 -    vcpu->arch.apic_base = value;
 -    if (apic_x2apic_mode(apic)) {
 -        u32 id = kvm_apic_id(apic);
 -        u32 ldr = ((id >> 4) << 16) | (1 << (id & 0xf));
 -        kvm_apic_set_ldr(apic, ldr);
 +    if ((vcpu->arch.apic_base ^ value) & X2APIC_ENABLE) {
 +        if (value & X2APIC_ENABLE) {
 +            u32 id = kvm_apic_id(apic);
 +            u32 ldr = ((id >> 4) << 16) | (1 << (id & 0xf));
 +            kvm_apic_set_ldr(apic, ldr);
 +            kvm_x86_ops->set_virtual_x2apic_mode(vcpu, true);
 +        } else
 +            kvm_x86_ops->set_virtual_x2apic_mode(vcpu, false);
      }
 +
 +    vcpu->arch.apic_base = value;
 
 Simpler to have
 
 if (apic_x2apic_mode(apic)) {
   ...
   kvm_x86_ops->set_virtual_x2apic_mode(vcpu, true);
 } else {
   kvm_x86_ops->set_virtual_x2apic_mode(vcpu, false);
 }
 
 Also it must be done after assignment of vcpu->arch.apic_base (this
 patch has vcpu->arch.apic_base being read from the
 ->set_virtual_x2apic_mode() path).
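
The hunk under review derives the x2APIC logical destination register (LDR) from the APIC ID when x2APIC mode is switched on. As a worked example of that expression, here is a standalone sketch; the function name is made up for illustration and the code is independent of the kernel sources.

```c
#include <stdint.h>

/* Logical x2APIC ID derived from the physical APIC ID: bits 31:16 carry the
 * cluster number (id >> 4), bits 15:0 a one-hot mask selecting the position
 * inside the cluster (id & 0xf). */
static uint32_t demo_x2apic_ldr(uint32_t id)
{
    return ((id >> 4) << 16) | (1u << (id & 0xf));
}
```

For APIC ID 0x25 the cluster is 2 and the position is 5, so the result is 0x00020020; APIC ID 0 gives 0x00000001.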
 
 +static void vmx_set_msr_bitmap(struct kvm_vcpu *vcpu)
 +{
 +    unsigned long *msr_bitmap;
 +
 +    if (apic_x2apic_mode(vcpu->arch.apic))
 
 vcpu->arch.apic can be NULL.
Actually, calling apic_x2apic_mode() to decide whether to use the x2apic MSR
bitmap is wrong. The vCPU may be using x2apic without the virtual x2apic mode
bit being set, because TPR shadow is not enabled or the irqchip is not in the
kernel. Checking the virtual x2apic mode bit in the vmcs directly should be a
better choice. How about the following code:

static void vmx_set_msr_bitmap(struct kvm_vcpu *vcpu)
{
    unsigned long *msr_bitmap;

    if (vmcs_read32(SECONDARY_VM_EXEC_CONTROL) &
        SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE)
        if (is_long_mode(vcpu))
            msr_bitmap = vmx_msr_bitmap_longmode_x2apic;
        else
            msr_bitmap = vmx_msr_bitmap_legacy_x2apic;
    else
        if (is_long_mode(vcpu))

Re: [Qemu-devel] [PATCH for-1.4 01/12] kvm: Add fake KVM_FEATURE_CLOCKSOURCE_STABLE_BIT for builds without KVM

2013-01-22 Thread Eduardo Habkost
On Tue, Jan 22, 2013 at 05:59:14AM +0100, Andreas Färber wrote:
 Am 22.01.2013 02:43, schrieb Marcelo Tosatti:
  On Thu, Jan 17, 2013 at 06:59:27PM -0200, Eduardo Habkost wrote:
  Signed-off-by: Eduardo Habkost ehabk...@redhat.com
  ---
  Cc: kvm@vger.kernel.org
  Cc: Michael S. Tsirkin m...@redhat.com
  Cc: Gleb Natapov g...@redhat.com
  Cc: Marcelo Tosatti mtosa...@redhat.com
  ---
   include/sysemu/kvm.h | 1 +
   1 file changed, 1 insertion(+)
 
  diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
  index 6bdd513..22acf91 100644
  --- a/include/sysemu/kvm.h
  +++ b/include/sysemu/kvm.h
  @@ -36,6 +36,7 @@
   #define KVM_FEATURE_ASYNC_PF 0
   #define KVM_FEATURE_STEAL_TIME   0
   #define KVM_FEATURE_PV_EOI   0
  +#define KVM_FEATURE_CLOCKSOURCE_STABLE_BIT 0
   #endif
   
   extern int kvm_allowed;
  -- 
  1.7.11.7
 
  
  ACK
 
 BTW is it the general strategy to add these as needed for new patches?
 Or should we add all current ones and mandate adding such dummy
 definitions when new ones get introduced via linux-headers/ update?

I meant to include all existing bits in a single patch previously, but
somehow I missed the KVM_FEATURE_CLOCKSOURCE_STABLE_BIT definition when
looking at the kernel header.

It would be nice to automatically refresh the fake-defines list when
updating linux-headers, but I won't mind if we update the list (pulling
all existing bits again) only when QEMU starts using a define that is
missing.

-- 
Eduardo


Re: [QEMU PATCH v4 1/3] virtio-net: remove layout assumptions for ctrl vq

2013-01-22 Thread Amos Kong
On Mon, Jan 21, 2013 at 05:03:30PM +0100, Stefan Hajnoczi wrote:
 On Sat, Jan 19, 2013 at 09:54:26AM +0800, ak...@redhat.com wrote:
  From: Michael S. Tsirkin m...@redhat.com
  
  Virtio-net code makes assumption about virtqueue descriptor layout
  (e.g. sg[0] is the header, sg[1] is the data buffer).
  
  This patch makes code not rely on the layout of descriptors.
  
  Signed-off-by: Michael S. Tsirkin m...@redhat.com
  Signed-off-by: Amos Kong ak...@redhat.com
  ---
   hw/virtio-net.c | 128 
  
   1 file changed, 74 insertions(+), 54 deletions(-)
  
  diff --git a/hw/virtio-net.c b/hw/virtio-net.c
  index 3bb01b1..113e194 100644
  --- a/hw/virtio-net.c
  +++ b/hw/virtio-net.c
  @@ -315,44 +315,44 @@ static void virtio_net_set_features(VirtIODevice *vdev, uint32_t features)
   }
   
   static int virtio_net_handle_rx_mode(VirtIONet *n, uint8_t cmd,
  -                                     VirtQueueElement *elem)
  +                                     struct iovec *iov, unsigned int iov_cnt)
   {
       uint8_t on;
  +    size_t s;
   
  -    if (elem->out_num != 2 || elem->out_sg[1].iov_len != sizeof(on)) {
  -        error_report("virtio-net ctrl invalid rx mode command");
  -        exit(1);
  +    s = iov_to_buf(iov, iov_cnt, 0, &on, sizeof(on));
  +    if (s != sizeof(on)) {
  +        return VIRTIO_NET_ERR;
       }
   
  -    on = ldub_p(elem->out_sg[1].iov_base);
  -
  -    if (cmd == VIRTIO_NET_CTRL_RX_MODE_PROMISC)
  +    if (cmd == VIRTIO_NET_CTRL_RX_MODE_PROMISC) {
           n->promisc = on;
  -    else if (cmd == VIRTIO_NET_CTRL_RX_MODE_ALLMULTI)
  +    } else if (cmd == VIRTIO_NET_CTRL_RX_MODE_ALLMULTI) {
           n->allmulti = on;
  -    else if (cmd == VIRTIO_NET_CTRL_RX_MODE_ALLUNI)
  +    } else if (cmd == VIRTIO_NET_CTRL_RX_MODE_ALLUNI) {
           n->alluni = on;
  -    else if (cmd == VIRTIO_NET_CTRL_RX_MODE_NOMULTI)
  +    } else if (cmd == VIRTIO_NET_CTRL_RX_MODE_NOMULTI) {
           n->nomulti = on;
  -    else if (cmd == VIRTIO_NET_CTRL_RX_MODE_NOUNI)
  +    } else if (cmd == VIRTIO_NET_CTRL_RX_MODE_NOUNI) {
           n->nouni = on;
  -    else if (cmd == VIRTIO_NET_CTRL_RX_MODE_NOBCAST)
  +    } else if (cmd == VIRTIO_NET_CTRL_RX_MODE_NOBCAST) {
           n->nobcast = on;
  -    else
  +    } else {
           return VIRTIO_NET_ERR;
  +    }
   
       return VIRTIO_NET_OK;
   }
   
   static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd,
  -                                 VirtQueueElement *elem)
  +                                 struct iovec *iov, unsigned int iov_cnt)
   {
       struct virtio_net_ctrl_mac mac_data;
  +    size_t s;
   
  -    if (cmd != VIRTIO_NET_CTRL_MAC_TABLE_SET || elem->out_num != 3 ||
  -        elem->out_sg[1].iov_len < sizeof(mac_data) ||
  -        elem->out_sg[2].iov_len < sizeof(mac_data))
  +    if (cmd != VIRTIO_NET_CTRL_MAC_TABLE_SET) {
           return VIRTIO_NET_ERR;
  +    }
   
       n->mac_table.in_use = 0;
       n->mac_table.first_multi = 0;
  @@ -360,54 +360,71 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd,
       n->mac_table.multi_overflow = 0;
       memset(n->mac_table.macs, 0, MAC_TABLE_ENTRIES * ETH_ALEN);
   
  -    mac_data.entries = ldl_p(elem->out_sg[1].iov_base);
  +    s = iov_to_buf(iov, iov_cnt, 0, &mac_data.entries,
  +                   sizeof(mac_data.entries));

Hi Stefan, can we adjust the endianness after each iov_to_buf() copy?


diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index 72d7857..0088d6c 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -321,6 +321,7 @@ static int virtio_net_handle_rx_mode(VirtIONet *n, uint8_t cmd,
     size_t s;
 
     s = iov_to_buf(iov, iov_cnt, 0, &on, sizeof(on));
+    on = ldub_p(&on);
     if (s != sizeof(on)) {
         return VIRTIO_NET_ERR;
     }
@@ -362,7 +363,7 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd,
 
     s = iov_to_buf(iov, iov_cnt, 0, &mac_data.entries,
                    sizeof(mac_data.entries));
-
+    mac_data.entries = ldl_p(&mac_data.entries);
     if (s != sizeof(mac_data.entries)) {
         return VIRTIO_NET_ERR;
     }
@@ -389,7 +390,7 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd,
 
     s = iov_to_buf(iov, iov_cnt, 0, &mac_data.entries,
                    sizeof(mac_data.entries));
-
+    mac_data.entries = ldl_p(&mac_data.entries);
     if (s != sizeof(mac_data.entries)) {
         return VIRTIO_NET_ERR;
     }
@@ -421,6 +422,7 @@ static int virtio_net_handle_vlan_table(VirtIONet *n, uint8_t cmd,
     size_t s;
 
     s = iov_to_buf(iov, iov_cnt, 0, &vid, sizeof(vid));
+    vid = lduw_p(&vid);
     if (s != sizeof(vid)) {
         return VIRTIO_NET_ERR;
     }
@@ -458,6 +460,8 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
         iov = elem.out_sg;
         iov_cnt = elem.out_num;
         s = iov_to_buf(iov, iov_cnt, 0, &ctrl, sizeof(ctrl));
+        ctrl.class = ldub_p(&ctrl.class);
+        ctrl.cmd = 

Re: [QEMU PATCH v4 1/3] virtio-net: remove layout assumptions for ctrl vq

2013-01-22 Thread Stefan Hajnoczi
On Tue, Jan 22, 2013 at 10:38:14PM +0800, Amos Kong wrote:
 On Mon, Jan 21, 2013 at 05:03:30PM +0100, Stefan Hajnoczi wrote:
  On Sat, Jan 19, 2013 at 09:54:26AM +0800, ak...@redhat.com wrote:
   From: Michael S. Tsirkin m...@redhat.com
   
   Virtio-net code makes assumption about virtqueue descriptor layout
   (e.g. sg[0] is the header, sg[1] is the data buffer).
   
   This patch makes code not rely on the layout of descriptors.
   
   Signed-off-by: Michael S. Tsirkin m...@redhat.com
   Signed-off-by: Amos Kong ak...@redhat.com
   ---
    hw/virtio-net.c | 128 
   
    1 file changed, 74 insertions(+), 54 deletions(-)
   
   diff --git a/hw/virtio-net.c b/hw/virtio-net.c
   index 3bb01b1..113e194 100644
   --- a/hw/virtio-net.c
   +++ b/hw/virtio-net.c
   @@ -315,44 +315,44 @@ static void virtio_net_set_features(VirtIODevice *vdev, uint32_t features)
    }
    
    static int virtio_net_handle_rx_mode(VirtIONet *n, uint8_t cmd,
   -                                     VirtQueueElement *elem)
   +                                     struct iovec *iov, unsigned int iov_cnt)
    {
        uint8_t on;
   +    size_t s;
    
   -    if (elem->out_num != 2 || elem->out_sg[1].iov_len != sizeof(on)) {
   -        error_report("virtio-net ctrl invalid rx mode command");
   -        exit(1);
   +    s = iov_to_buf(iov, iov_cnt, 0, &on, sizeof(on));
   +    if (s != sizeof(on)) {
   +        return VIRTIO_NET_ERR;
        }
    
   -    on = ldub_p(elem->out_sg[1].iov_base);
   -
   -    if (cmd == VIRTIO_NET_CTRL_RX_MODE_PROMISC)
   +    if (cmd == VIRTIO_NET_CTRL_RX_MODE_PROMISC) {
            n->promisc = on;
   -    else if (cmd == VIRTIO_NET_CTRL_RX_MODE_ALLMULTI)
   +    } else if (cmd == VIRTIO_NET_CTRL_RX_MODE_ALLMULTI) {
            n->allmulti = on;
   -    else if (cmd == VIRTIO_NET_CTRL_RX_MODE_ALLUNI)
   +    } else if (cmd == VIRTIO_NET_CTRL_RX_MODE_ALLUNI) {
            n->alluni = on;
   -    else if (cmd == VIRTIO_NET_CTRL_RX_MODE_NOMULTI)
   +    } else if (cmd == VIRTIO_NET_CTRL_RX_MODE_NOMULTI) {
            n->nomulti = on;
   -    else if (cmd == VIRTIO_NET_CTRL_RX_MODE_NOUNI)
   +    } else if (cmd == VIRTIO_NET_CTRL_RX_MODE_NOUNI) {
            n->nouni = on;
   -    else if (cmd == VIRTIO_NET_CTRL_RX_MODE_NOBCAST)
   +    } else if (cmd == VIRTIO_NET_CTRL_RX_MODE_NOBCAST) {
            n->nobcast = on;
   -    else
   +    } else {
            return VIRTIO_NET_ERR;
   +    }
    
        return VIRTIO_NET_OK;
    }
    
    static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd,
   -                                 VirtQueueElement *elem)
   +                                 struct iovec *iov, unsigned int iov_cnt)
    {
        struct virtio_net_ctrl_mac mac_data;
   +    size_t s;
    
   -    if (cmd != VIRTIO_NET_CTRL_MAC_TABLE_SET || elem->out_num != 3 ||
   -        elem->out_sg[1].iov_len < sizeof(mac_data) ||
   -        elem->out_sg[2].iov_len < sizeof(mac_data))
   +    if (cmd != VIRTIO_NET_CTRL_MAC_TABLE_SET) {
            return VIRTIO_NET_ERR;
   +    }
    
        n->mac_table.in_use = 0;
        n->mac_table.first_multi = 0;
   @@ -360,54 +360,71 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd,
        n->mac_table.multi_overflow = 0;
        memset(n->mac_table.macs, 0, MAC_TABLE_ENTRIES * ETH_ALEN);
    
   -    mac_data.entries = ldl_p(elem->out_sg[1].iov_base);
   +    s = iov_to_buf(iov, iov_cnt, 0, &mac_data.entries,
   +                   sizeof(mac_data.entries));
 
 Hi Stefan, can we adjust the endianness after each iov_to_buf() copy?

Yes.

It's only necessary for uint16_t and larger types since a single byte
cannot be swapped (so ldub_p() is not needed).
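
To illustrate the point about byte order, here is a standalone sketch. The demo_ helpers are hypothetical and are not QEMU's ldub_p()/lduw_p()/ldl_p(); they only show that interpreting the same raw bytes as a 16-bit or 32-bit value depends on the chosen byte order, while a single byte reads the same either way.

```c
#include <stdint.h>

/* Interpret two raw bytes as a little-endian 16-bit value. */
static uint16_t demo_load_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Interpret the same two raw bytes as a big-endian 16-bit value. */
static uint16_t demo_load_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Interpret four raw bytes as a little-endian 32-bit value. */
static uint32_t demo_load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* A single byte has only one possible byte order, so no conversion exists. */
static uint8_t demo_load_u8(const uint8_t *p)
{
    return p[0];
}
```

With the bytes 0x34 0x12, the little-endian 16-bit load yields 0x1234 and the big-endian one 0x3412, which is why only the multi-byte fields copied out of the iovec need a conversion step.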

Stefan


Re: KVM call agenda for 2013-01-22

2013-01-22 Thread Juan Quintela
Juan Quintela quint...@redhat.com wrote:
 Hi

 Please send in any agenda topics you are interested in.

As there are no topics, there is no call today.

See you next week.

Later, Juan.


[QEMU PATCH v5 3/3] virtio-net: rename ctrl rx commands

2013-01-22 Thread Amos Kong
This patch makes rx commands consistent with specification.

Signed-off-by: Amos Kong ak...@redhat.com
---
 hw/virtio-net.c |   14 +++---
 hw/virtio-net.h |   14 +++---
 2 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index acef5a5..ac4434e 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -326,17 +326,17 @@ static int virtio_net_handle_rx_mode(VirtIONet *n, uint8_t cmd,
         return VIRTIO_NET_ERR;
     }
 
-    if (cmd == VIRTIO_NET_CTRL_RX_MODE_PROMISC) {
+    if (cmd == VIRTIO_NET_CTRL_RX_PROMISC) {
         n->promisc = on;
-    } else if (cmd == VIRTIO_NET_CTRL_RX_MODE_ALLMULTI) {
+    } else if (cmd == VIRTIO_NET_CTRL_RX_ALLMULTI) {
         n->allmulti = on;
-    } else if (cmd == VIRTIO_NET_CTRL_RX_MODE_ALLUNI) {
+    } else if (cmd == VIRTIO_NET_CTRL_RX_ALLUNI) {
         n->alluni = on;
-    } else if (cmd == VIRTIO_NET_CTRL_RX_MODE_NOMULTI) {
+    } else if (cmd == VIRTIO_NET_CTRL_RX_NOMULTI) {
         n->nomulti = on;
-    } else if (cmd == VIRTIO_NET_CTRL_RX_MODE_NOUNI) {
+    } else if (cmd == VIRTIO_NET_CTRL_RX_NOUNI) {
         n->nouni = on;
-    } else if (cmd == VIRTIO_NET_CTRL_RX_MODE_NOBCAST) {
+    } else if (cmd == VIRTIO_NET_CTRL_RX_NOBCAST) {
         n->nobcast = on;
     } else {
         return VIRTIO_NET_ERR;
@@ -473,7 +473,7 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
         iov_discard_front(&iov, &iov_cnt, sizeof(ctrl));
         if (s != sizeof(ctrl)) {
             status = VIRTIO_NET_ERR;
-        } else if (ctrl.class == VIRTIO_NET_CTRL_RX_MODE) {
+        } else if (ctrl.class == VIRTIO_NET_CTRL_RX) {
             status = virtio_net_handle_rx_mode(n, ctrl.cmd, iov, iov_cnt);
         } else if (ctrl.class == VIRTIO_NET_CTRL_MAC) {
             status = virtio_net_handle_mac(n, ctrl.cmd, iov, iov_cnt);
diff --git a/hw/virtio-net.h b/hw/virtio-net.h
index 1ec632f..c0bb284 100644
--- a/hw/virtio-net.h
+++ b/hw/virtio-net.h
@@ -99,13 +99,13 @@ typedef uint8_t virtio_net_ctrl_ack;
  * 0 and 1 are supported with the VIRTIO_NET_F_CTRL_RX feature.
  * Commands 2-5 are added with VIRTIO_NET_F_CTRL_RX_EXTRA.
  */
-#define VIRTIO_NET_CTRL_RX_MODE    0
- #define VIRTIO_NET_CTRL_RX_MODE_PROMISC      0
- #define VIRTIO_NET_CTRL_RX_MODE_ALLMULTI     1
- #define VIRTIO_NET_CTRL_RX_MODE_ALLUNI       2
- #define VIRTIO_NET_CTRL_RX_MODE_NOMULTI      3
- #define VIRTIO_NET_CTRL_RX_MODE_NOUNI        4
- #define VIRTIO_NET_CTRL_RX_MODE_NOBCAST      5
+#define VIRTIO_NET_CTRL_RX    0
+ #define VIRTIO_NET_CTRL_RX_PROMISC      0
+ #define VIRTIO_NET_CTRL_RX_ALLMULTI     1
+ #define VIRTIO_NET_CTRL_RX_ALLUNI       2
+ #define VIRTIO_NET_CTRL_RX_NOMULTI      3
+ #define VIRTIO_NET_CTRL_RX_NOUNI        4
+ #define VIRTIO_NET_CTRL_RX_NOBCAST      5
 
 /*
  * Control the MAC
-- 
1.7.1



[QEMU PATCH v5 1/3] virtio-net: remove layout assumptions for ctrl vq

2013-01-22 Thread Amos Kong
From: Michael S. Tsirkin m...@redhat.com

Virtio-net code makes assumption about virtqueue descriptor layout
(e.g. sg[0] is the header, sg[1] is the data buffer).

This patch makes code not rely on the layout of descriptors.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
Signed-off-by: Amos Kong ak...@redhat.com
---
 hw/virtio-net.c |  129 ---
 1 files changed, 75 insertions(+), 54 deletions(-)

diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index 3bb01b1..af1f3a1 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -315,44 +315,44 @@ static void virtio_net_set_features(VirtIODevice *vdev, uint32_t features)
 }
 
 static int virtio_net_handle_rx_mode(VirtIONet *n, uint8_t cmd,
-                                     VirtQueueElement *elem)
+                                     struct iovec *iov, unsigned int iov_cnt)
 {
     uint8_t on;
+    size_t s;
 
-    if (elem->out_num != 2 || elem->out_sg[1].iov_len != sizeof(on)) {
-        error_report("virtio-net ctrl invalid rx mode command");
-        exit(1);
+    s = iov_to_buf(iov, iov_cnt, 0, &on, sizeof(on));
+    if (s != sizeof(on)) {
+        return VIRTIO_NET_ERR;
     }
 
-    on = ldub_p(elem->out_sg[1].iov_base);
-
-    if (cmd == VIRTIO_NET_CTRL_RX_MODE_PROMISC)
+    if (cmd == VIRTIO_NET_CTRL_RX_MODE_PROMISC) {
         n->promisc = on;
-    else if (cmd == VIRTIO_NET_CTRL_RX_MODE_ALLMULTI)
+    } else if (cmd == VIRTIO_NET_CTRL_RX_MODE_ALLMULTI) {
         n->allmulti = on;
-    else if (cmd == VIRTIO_NET_CTRL_RX_MODE_ALLUNI)
+    } else if (cmd == VIRTIO_NET_CTRL_RX_MODE_ALLUNI) {
         n->alluni = on;
-    else if (cmd == VIRTIO_NET_CTRL_RX_MODE_NOMULTI)
+    } else if (cmd == VIRTIO_NET_CTRL_RX_MODE_NOMULTI) {
         n->nomulti = on;
-    else if (cmd == VIRTIO_NET_CTRL_RX_MODE_NOUNI)
+    } else if (cmd == VIRTIO_NET_CTRL_RX_MODE_NOUNI) {
         n->nouni = on;
-    else if (cmd == VIRTIO_NET_CTRL_RX_MODE_NOBCAST)
+    } else if (cmd == VIRTIO_NET_CTRL_RX_MODE_NOBCAST) {
         n->nobcast = on;
-    else
+    } else {
         return VIRTIO_NET_ERR;
+    }
 
     return VIRTIO_NET_OK;
 }
 
 static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd,
-                                 VirtQueueElement *elem)
+                                 struct iovec *iov, unsigned int iov_cnt)
 {
     struct virtio_net_ctrl_mac mac_data;
+    size_t s;
 
-    if (cmd != VIRTIO_NET_CTRL_MAC_TABLE_SET || elem->out_num != 3 ||
-        elem->out_sg[1].iov_len < sizeof(mac_data) ||
-        elem->out_sg[2].iov_len < sizeof(mac_data))
+    if (cmd != VIRTIO_NET_CTRL_MAC_TABLE_SET) {
         return VIRTIO_NET_ERR;
+    }
 
     n->mac_table.in_use = 0;
     n->mac_table.first_multi = 0;
@@ -360,54 +360,72 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd,
     n->mac_table.multi_overflow = 0;
     memset(n->mac_table.macs, 0, MAC_TABLE_ENTRIES * ETH_ALEN);
 
-    mac_data.entries = ldl_p(elem->out_sg[1].iov_base);
+    s = iov_to_buf(iov, iov_cnt, 0, &mac_data.entries,
+                   sizeof(mac_data.entries));
+    mac_data.entries = ldl_p(&mac_data.entries);
+    if (s != sizeof(mac_data.entries)) {
+        return VIRTIO_NET_ERR;
+    }
+    iov_discard_front(&iov, &iov_cnt, s);
 
-    if (sizeof(mac_data.entries) +
-        (mac_data.entries * ETH_ALEN) > elem->out_sg[1].iov_len)
+    if (mac_data.entries * ETH_ALEN > iov_size(iov, iov_cnt)) {
         return VIRTIO_NET_ERR;
+    }
 
     if (mac_data.entries <= MAC_TABLE_ENTRIES) {
-        memcpy(n->mac_table.macs, elem->out_sg[1].iov_base + sizeof(mac_data),
-               mac_data.entries * ETH_ALEN);
+        s = iov_to_buf(iov, iov_cnt, 0, n->mac_table.macs,
+                       mac_data.entries * ETH_ALEN);
+        if (s != mac_data.entries * ETH_ALEN) {
+            return VIRTIO_NET_ERR;
+        }
         n->mac_table.in_use += mac_data.entries;
     } else {
         n->mac_table.uni_overflow = 1;
     }
 
+    iov_discard_front(&iov, &iov_cnt, mac_data.entries * ETH_ALEN);
+
     n->mac_table.first_multi = n->mac_table.in_use;
 
-    mac_data.entries = ldl_p(elem->out_sg[2].iov_base);
+    s = iov_to_buf(iov, iov_cnt, 0, &mac_data.entries,
+                   sizeof(mac_data.entries));
+    mac_data.entries = ldl_p(&mac_data.entries);
+    if (s != sizeof(mac_data.entries)) {
+        return VIRTIO_NET_ERR;
+    }
+
+    iov_discard_front(&iov, &iov_cnt, s);
 
-    if (sizeof(mac_data.entries) +
-        (mac_data.entries * ETH_ALEN) > elem->out_sg[2].iov_len)
+    if (mac_data.entries * ETH_ALEN != iov_size(iov, iov_cnt)) {
         return VIRTIO_NET_ERR;
+    }
 
-    if (mac_data.entries) {
-        if (n->mac_table.in_use + mac_data.entries <= MAC_TABLE_ENTRIES) {
-            memcpy(n->mac_table.macs + (n->mac_table.in_use * ETH_ALEN),
-                   elem->out_sg[2].iov_base + sizeof(mac_data),
-                   mac_data.entries * ETH_ALEN);
-            n->mac_table.in_use += mac_data.entries;
-        } else {
-

Re: [Qemu-devel] [PATCH for-1.4 04/12] kvm: Create kvm_arch_vcpu_id() function

2013-01-22 Thread Eduardo Habkost
On Mon, Jan 21, 2013 at 07:35:22AM -0700, Eric Blake wrote:
 On 01/21/2013 06:14 AM, Andreas Färber wrote:
  glibc is already responsible for converting the 'unsigned long
  int' of the user declaration back into the 'unsigned int' that the
  kernel expects for the second argument.  The third argument (when
  present), is generally treated as a pointer (of size appropriate
  for the architecture).  Although there _might_ be an ioctl that
  uses it directly as an integer instead of dereferencing it as a
  pointer, those would be the exceptions to the rule.
  
  So ... do we have a conclusion what to put into the commit message? :)
  
  It looks to me as if kvm-all.c:kvm_vm_ioctl() is using void*. I like
  unsigned long but maybe uintptr_t would be more correct then?
 
 uintptr_t feels more correct - the 3rd (vararg) argument through the
 ioctl() syscall is always retrieved using the same size as void*.

Actually, sys_ioctl() always retrieves it as unsigned long, but
nothing prevents the arch-specific syscall entry code from
translating a different type to unsigned long before calling
sys_ioctl().

So I guess the only guarantee we have is the Linux ioctl(2) man page,
which says: "The third argument is an untyped pointer to memory. It's
traditionally char *argp (from the days before void * was valid C), and
will be so named for this discussion."

That said, I plan to change the code to cast the argument to (void*) in
the next version.

 
  
  Or should kvm_vm_ioctl() be fixed to use something else instead?
  Eric's int would be a semantic change for the 64-bit platforms, no?
 
 My discussion about 'int' vs. 'unsigned long' was in regards to the
 second argument KVM_CREATE_VCPU, which your patch does not change
 (perhaps my fault for jumping in on a conversation mid-thread without
 actually reading your original patch, which I have now done).  That is,
 KVM_CREATE_VCPU as a constant is always 32 bits (kernel constraint),
 widened out to unsigned long when passed to the glibc function (due to
 the glibc signature disagreeing with POSIX), then narrowed back down to
 32 bits when forwarded to the kernel syscall.
 
 Meanwhile, your patch is fixing the third argument from 'int' to a wider
 type, which is necessary for passing that value through varargs when the
 receiving end will retrieve the same argument via a void* variable.

I am confident that unsigned long will work properly on all
architectures we care about today, but I also don't know if this is
documented and guaranteed to work on all architectures. Passing an
argument of the documented type (void*) sounds like the right thing to
do.
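
A minimal sketch of the plan described above, assuming a variadic wrapper that always retrieves its optional argument as a void *. The demo_ functions are hypothetical stand-ins, not QEMU's kvm_vm_ioctl() or the real ioctl(); they only show an integer id being passed as a pointer-sized argument so that caller and callee agree on the vararg type.

```c
#include <stdarg.h>
#include <stdint.h>

/* Hypothetical ioctl-style wrapper: the optional argument is always read
 * back as a void *, so callers must pass a pointer-sized value. */
static uintptr_t demo_ioctl(int request, ...)
{
    va_list ap;
    void *arg;

    (void)request;
    va_start(ap, request);
    arg = va_arg(ap, void *);
    va_end(ap);
    return (uintptr_t)arg;   /* echo the argument back for inspection */
}

/* Pass an integer id through the variadic call with an explicit cast to
 * void *, going through uintptr_t to keep the conversion well defined on
 * common platforms. */
static uintptr_t demo_create_vcpu(unsigned long vcpu_id)
{
    return demo_ioctl(0, (void *)(uintptr_t)vcpu_id);
}
```

Passing a plain int here instead would leave the callee reading a wider type than the caller pushed, which is the mismatch the cast is meant to avoid.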

-- 
Eduardo


[QEMU PATCH v5 0/3] virtio-net: fix of ctrl commands

2013-01-22 Thread Amos Kong
Currently the virtio-net code relies on the layout of descriptors.
This patchset removes those assumptions and introduces a control
command to set the MAC address. The last patch is a trivial rename.

V2: check guest's iov_len
V3: fix of migration compatibility
    make mac field in config space read-only when new feature is acked
V4: add fix of descriptor layout assumptions, trivial rename
V5: fix endianness after iov_to_buf copy

Amos Kong (2):
  virtio-net: introduce a new macaddr control
  virtio-net: rename ctrl rx commands

Michael S. Tsirkin (1):
  virtio-net: remove layout assumptions for ctrl vq

 hw/pc_piix.c|4 ++
 hw/virtio-net.c |  142 +-
 hw/virtio-net.h |   26 +++
 3 files changed, 108 insertions(+), 64 deletions(-)



windows 2008 guest causing rcu_sched to emit NMI

2013-01-22 Thread Andrey Korolyov
Hi,

The problem described in the title happens under heavy I/O pressure on
the host. Without idle=poll the trace is almost always the same and
involves mwait; with idle=poll and nohz=off the RIP varies from time to
time. At the previous hang it was tg_throttle_down, rather than
test_ti_thread_flag as in the attached trace. Both clocksource drivers I
tried, hpet and tsc, reproduce the problem with equal probability. The
VMs are pinned to one of the two NUMA sets on a two-head machine, meaning
the emulator thread and each vcpu thread has its own cpuset cgroup with
'0-5,12-17' or '6-11,18-23'. I'd appreciate any suggestions on what to try.


dmesg2.txt.gz
Description: GNU Zip compressed data


[RFC] KVM/arm64, take #3

2013-01-22 Thread Marc Zyngier
Guys,

I've once more updated the branches for KVM/arm64

- kvm-arm/pre-arm64: kvm-arm-master as of today + the cleanup branch +
some basic perf support

- arm64/soc-armv8-model: Catalin Marinas' arm64 branch

- arm64/psci: Implementation of PSCI for the above

- arm64/perf: host/guest discrimination

- kvm-arm64/kvm-prereq: a bunch of random bits that KVM/arm requires to
compile on arm64.

- kvm-arm64/kvm-prereq-merged: all the above, plus Mark Rutland's timer
rework.

- kvm-arm64/kvm: KVM/arm64 itself, and the only branch you should use
unless you're completely hatstand.

All that is at:
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git

You'll also need Will Deacon's KVM Tool port:
git://git.kernel.org/pub/scm/linux/kernel/git/will/kvmtool.git kvmtool/arm


A few random notes:
- If you're using the Foundation Model, use the provided DTS for your
host kernel (arch/arm64/boot/dts/foundation-v8.dts).
- The only supported models are the AEMv8 and the Foundation models. If
you're using something else and have any issue, first reproduce it with
one of the supported implementations.

What's new:
- Rebased on 3.8-rc4
- Resynced with kvm-arm-master
- More 32bit fixes (ThumbEE, check for lack of 32bit support in HW)
- Some basic perf support

Enjoy,

M.
-- 
Jazz is not dead. It just smells funny...



Re: [PATCH v2] kvm tools: remove redundant if condition

2013-01-22 Thread Pekka Enberg
On Sat, Jan 19, 2013 at 12:27 PM, Cong Ding ding...@gmail.com wrote:
 On Sat, Jan 19, 2013 at 10:58:33AM +0200, Pekka Enberg wrote:
 On Wed, Jan 16, 2013 at 6:52 PM, Cong Ding ding...@gmail.com wrote:
  After we check (state.kcount != 0), state.kcount has to be 0 in all the
  else branches.
 
  Signed-off-by: Cong Ding ding...@gmail.com
  ---
   tools/kvm/hw/i8042.c |2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)
 
  diff --git a/tools/kvm/hw/i8042.c b/tools/kvm/hw/i8042.c
  index 9f8be6a..9035732 100644
  --- a/tools/kvm/hw/i8042.c
  +++ b/tools/kvm/hw/i8042.c
  @@ -189,7 +189,7 @@ static u32 kbd_read_data(void)
  state.mcount--;
  kvm__irq_line(state.kvm, AUX_IRQ, 0);
  kbd_update_irq();
  -   } else if (state.kcount == 0) {
  +   } else {
  i = state.kread - 1;
  if (i < 0)
  i = QUEUE_SIZE;

 This doesn't look right. The 'kcount' field is an int so the value can
 be negative.
 But the former check is state.kcount != 0 as I described in the commit
 message. Notice the difference between variable names in the if conditions:
 the first one is kcount, the second one is mcount, and the third one is the
 same as the first one, kcount.

 Ok, the original code is
 if (state.kcount != 0) {
     /* do something when (state.kcount != 0) */
 } else if (state.mcount > 0) {
     /* do something when (state.kcount == 0 && state.mcount > 0) */
 } else if (state.kcount == 0) {
     /* do something when (state.kcount == 0 && state.mcount <= 0) */
 }
 For the third branch, it runs when state.kcount == 0 and state.mcount <= 0,
 so it's not necessary to check state.kcount == 0 again.

Right you are. Applied, thanks!
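
The argument above can be checked mechanically. The following is a minimal sketch, not the kvm tool code: the function and enum names are hypothetical, and the branch bodies are reduced to return values. It compares the chain before and after the patch, including negative counts, which was the original concern.

```c
#include <assert.h>

/* Hypothetical labels for the three branches of kbd_read_data(). */
enum sketch_branch { SKETCH_KBD, SKETCH_AUX, SKETCH_LAST, SKETCH_NONE };

/* Chain as it was before the patch. */
static enum sketch_branch pick_before(int kcount, int mcount)
{
    if (kcount != 0)
        return SKETCH_KBD;
    else if (mcount > 0)
        return SKETCH_AUX;
    else if (kcount == 0)   /* always true here: kcount != 0 was handled above */
        return SKETCH_LAST;
    return SKETCH_NONE;     /* unreachable */
}

/* Chain after the patch: the last test becomes a plain else. */
static enum sketch_branch pick_after(int kcount, int mcount)
{
    if (kcount != 0)
        return SKETCH_KBD;
    else if (mcount > 0)
        return SKETCH_AUX;
    else
        return SKETCH_LAST;
}

/* Returns 1 when both chains agree for every pair in [-limit, limit]. */
static int chains_agree(int limit)
{
    assert(limit >= 0);
    for (int k = -limit; k <= limit; k++)
        for (int m = -limit; m <= limit; m++)
            if (pick_before(k, m) != pick_after(k, m))
                return 0;
    return 1;
}
```

A negative kcount is nonzero, so it is caught by the first branch in both versions and never reaches the last one.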


[PATCH for-1.4 qom-cpu 2/9] target-i386: kvm: Set vcpu_id to APIC ID instead of CPU index

2013-01-22 Thread Eduardo Habkost
The CPU ID in KVM is supposed to be the APIC ID, so change the
KVM_CREATE_VCPU call to match it. The current behavior didn't break
anything yet because today the APIC ID is assumed to be equal to the CPU
index, but this won't be true in the future.

Signed-off-by: Eduardo Habkost ehabk...@redhat.com
Reviewed-by: Marcelo Tosatti mtosa...@redhat.com
---
Cc: kvm@vger.kernel.org
Cc: Michael S. Tsirkin m...@redhat.com
Cc: Gleb Natapov g...@redhat.com
Cc: Marcelo Tosatti mtosa...@redhat.com

Changes v2:
 - Change only i386 code (kvm_arch_vcpu_id())

Changes v3:
 - Get CPUState as argument instead of CPUArchState
---
 target-i386/kvm.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 5f3f789..c440809 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -411,9 +411,10 @@ static void cpu_update_state(void *opaque, int running, RunState state)
 }
 }
 
-unsigned long kvm_arch_vcpu_id(CPUState *cpu)
+unsigned long kvm_arch_vcpu_id(CPUState *cs)
 {
-return cpu->cpu_index;
+X86CPU *cpu = X86_CPU(cs);
+return cpu->env.cpuid_apic_id;
 }
 
 int kvm_arch_init_vcpu(CPUState *cs)
-- 
1.8.1
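
To illustrate why the APIC ID may stop matching the CPU index, here is a rough sketch under stated assumptions. It is not QEMU code: the function names are hypothetical, and it assumes the common x86 layout where the APIC ID packs package, core and thread numbers into bit fields whose widths are rounded up to a power of two.

```c
#include <assert.h>

/* Number of bits needed to encode the values 0..count-1. */
static unsigned sketch_bits_for_count(unsigned count)
{
    unsigned bits = 0;

    assert(count > 0);
    while ((1u << bits) < count)
        bits++;
    return bits;
}

/*
 * Hypothetical mapping from a contiguous CPU index to an APIC ID built
 * from (package, core, thread) bit fields.
 */
static unsigned sketch_apic_id(unsigned cpu_index, unsigned cores,
                               unsigned threads)
{
    unsigned thread_bits = sketch_bits_for_count(threads);
    unsigned core_bits = sketch_bits_for_count(cores);
    unsigned thread = cpu_index % threads;
    unsigned core = (cpu_index / threads) % cores;
    unsigned pkg = cpu_index / (threads * cores);

    return (pkg << (core_bits + thread_bits)) | (core << thread_bits) | thread;
}
```

With power-of-two core and thread counts the two numbers coincide. With three cores per package and one thread per core, CPU index 3 (package 1, core 0) would get APIC ID 4 in this layout, which is the kind of case where passing cpu_index to KVM_CREATE_VCPU would no longer be correct.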



[PATCH for-1.4 qom-cpu 1/9] kvm: Create kvm_arch_vcpu_id() function

2013-01-22 Thread Eduardo Habkost
This will allow each architecture to define how the VCPU ID is set on
the KVM_CREATE_VCPU ioctl call.

Signed-off-by: Eduardo Habkost ehabk...@redhat.com
---
Cc: kvm@vger.kernel.org
Cc: Michael S. Tsirkin m...@redhat.com
Cc: Gleb Natapov g...@redhat.com
Cc: Marcelo Tosatti mtosa...@redhat.com

Changes v2:
 - Get CPUState as argument instead of CPUArchState

Changes v3:
 - Convert KVM_CREATE_VCPU ioctl() argument to void*, so
   the argument type matches the type expected by kvm_vm_ioctl()
---
 include/sysemu/kvm.h | 3 +++
 kvm-all.c| 2 +-
 target-i386/kvm.c| 5 +
 target-ppc/kvm.c | 5 +
 target-s390x/kvm.c   | 5 +
 5 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 22acf91..384ee66 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -196,6 +196,9 @@ int kvm_arch_init(KVMState *s);
 
 int kvm_arch_init_vcpu(CPUState *cpu);
 
+/* Returns VCPU ID to be used on KVM_CREATE_VCPU ioctl() */
+unsigned long kvm_arch_vcpu_id(CPUState *cpu);
+
 void kvm_arch_reset_vcpu(CPUState *cpu);
 
 int kvm_arch_on_sigbus_vcpu(CPUState *cpu, int code, void *addr);
diff --git a/kvm-all.c b/kvm-all.c
index 6278d61..363a358 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -222,7 +222,7 @@ int kvm_init_vcpu(CPUState *cpu)
 
 DPRINTF("kvm_init_vcpu\n");
 
-ret = kvm_vm_ioctl(s, KVM_CREATE_VCPU, cpu->cpu_index);
+ret = kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)kvm_arch_vcpu_id(cpu));
 if (ret < 0) {
 DPRINTF("kvm_create_vcpu failed\n");
 goto err;
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 3acff40..5f3f789 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -411,6 +411,11 @@ static void cpu_update_state(void *opaque, int running, RunState state)
 }
 }
 
+unsigned long kvm_arch_vcpu_id(CPUState *cpu)
+{
+return cpu->cpu_index;
+}
+
 int kvm_arch_init_vcpu(CPUState *cs)
 {
 struct {
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 2f4f068..2c64c63 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -384,6 +384,11 @@ static inline void kvm_fixup_page_sizes(PowerPCCPU *cpu)
 
 #endif /* !defined (TARGET_PPC64) */
 
+unsigned long kvm_arch_vcpu_id(CPUState *cpu)
+{
+return cpu->cpu_index;
+}
+
 int kvm_arch_init_vcpu(CPUState *cs)
 {
 PowerPCCPU *cpu = POWERPC_CPU(cs);
diff --git a/target-s390x/kvm.c b/target-s390x/kvm.c
index add6a58..99deddf 100644
--- a/target-s390x/kvm.c
+++ b/target-s390x/kvm.c
@@ -76,6 +76,11 @@ int kvm_arch_init(KVMState *s)
 return 0;
 }
 
+unsigned long kvm_arch_vcpu_id(CPUState *cpu)
+{
+return cpu->cpu_index;
+}
+
 int kvm_arch_init_vcpu(CPUState *cpu)
 {
 int ret = 0;
-- 
1.8.1



Re: [Qemu-devel] [PATCH for-1.4 01/12] kvm: Add fake KVM_FEATURE_CLOCKSOURCE_STABLE_BIT for builds without KVM

2013-01-22 Thread Marcelo Tosatti
On Tue, Jan 22, 2013 at 05:59:14AM +0100, Andreas Färber wrote:
 Am 22.01.2013 02:43, schrieb Marcelo Tosatti:
  On Thu, Jan 17, 2013 at 06:59:27PM -0200, Eduardo Habkost wrote:
  Signed-off-by: Eduardo Habkost ehabk...@redhat.com
  ---
  Cc: kvm@vger.kernel.org
  Cc: Michael S. Tsirkin m...@redhat.com
  Cc: Gleb Natapov g...@redhat.com
  Cc: Marcelo Tosatti mtosa...@redhat.com
  ---
   include/sysemu/kvm.h | 1 +
   1 file changed, 1 insertion(+)
 
  diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
  index 6bdd513..22acf91 100644
  --- a/include/sysemu/kvm.h
  +++ b/include/sysemu/kvm.h
  @@ -36,6 +36,7 @@
   #define KVM_FEATURE_ASYNC_PF 0
   #define KVM_FEATURE_STEAL_TIME   0
   #define KVM_FEATURE_PV_EOI   0
  +#define KVM_FEATURE_CLOCKSOURCE_STABLE_BIT 0
   #endif
   
   extern int kvm_allowed;
  -- 
  1.7.11.7
 
  
  ACK
 
 BTW is it the general strategy to add these as needed for new patches?
 Or should we add all current ones and mandate adding such dummy
 definitions when new ones get introduced via linux-headers/ update?

It's a good idea to update the sync scripts to automatically create the
dummy ones, I suppose (there is no proactive strategy).



Re: [PATCH 2/2] x86, apicv: Add Posted Interrupt supporting

2013-01-22 Thread Marcelo Tosatti
On Thu, Dec 13, 2012 at 03:29:40PM +0800, Yang Zhang wrote:
 From: Yang Zhang yang.z.zh...@intel.com
 
 Posted Interrupt allows APIC interrupts to be injected into the guest directly
 without any vmexit.
 
 - When delivering an interrupt to the guest, if the target vcpu is running,
   update the Posted-interrupt requests bitmap and send a notification event
   to the vcpu. Then the vcpu will handle this interrupt automatically,
   without any software involvement.
 
 - If the target vcpu is not running or there is already a notification event
   pending in the vcpu, do nothing. The interrupt will be handled by
   the next vm entry.
 
 Signed-off-by: Yang Zhang yang.z.zh...@intel.com
 ---

snip

 +static void pi_handler(void)
 +{
 + ;
 +}
 +
 +static int vmx_has_posted_interrupt(struct kvm_vcpu *vcpu)
 +{
 + return irqchip_in_kernel(vcpu->kvm) && enable_apicv_pi;
 +}
 +
 +static int vmx_send_nv(struct kvm_vcpu *vcpu,
 + int vector)
 +{
 + struct vcpu_vmx *vmx = to_vmx(vcpu);
 +
 + pi_set_pir(vector, vmx->pi);

Section 29.6 Posted interrupt processing:

No other agent can read or write a PIR bit (or groups of bits) between
the time it is read (to determine what to OR into VIRR) and when it is
cleared.

 + if (!pi_test_and_set_on(vmx->pi) && (vcpu->mode == IN_GUEST_MODE)) {
 + apic->send_IPI_mask(get_cpu_mask(vcpu->cpu), POSTED_INTR_VECTOR);
 + return 1;
 + }
 + return 0;

What is the purpose of the outstanding-notification bit? At first glance, its
use as a lock for the PIR posted-interrupt bits is limited because it's cleared
in step 3, before PIR is cleared. If it were cleared after step 5, then
software could

if (!pi_test_and_set_on(vmx->pi)) {
    pi_set_pir(vector, vmx->pi);
    apic->send_IPI_mask(get_cpu_mask(vcpu->cpu), POSTED_INTR_VECTOR);
}

Does this mean software has to read PIR _and_ the outstanding-notification
bit to know when it's possible to set bits in PIR + send IPI?

Or is it really cleared after step 5?



Re: [PATCH v11 2/3] x86, apicv: add virtual x2apic support

2013-01-22 Thread Marcelo Tosatti
On Tue, Jan 22, 2013 at 05:55:53PM +0200, Gleb Natapov wrote:
 On Tue, Jan 22, 2013 at 12:21:47PM +, Zhang, Yang Z wrote:
   +static void vmx_set_msr_bitmap(struct kvm_vcpu *vcpu)
   +{
   +unsigned long *msr_bitmap;
   +
   +if (apic_x2apic_mode(vcpu->arch.apic))
   
   vcpu->arch.apic can be NULL.
  Actually, calling apic_x2apic_mode to check whether to use the x2apic msr
  bitmap is wrong. The VCPU may use x2apic without the virtual x2apic mode bit
  being set, because TPR shadow is not enabled or the irqchip is not in kernel.
  Checking the virtual x2apic mode bit in the vmcs directly should be a better
  choice. How about the following code:
  
 If TPR shadow is not enabled, vmx_msr_bitmap_.*x2apic bitmap will have x2apic
 MSRs intercepted.

And what is the problem? APIC register virt depends on TPR shadow.



[PATCH][v2] KVM: PPC: add paravirt idle loop for 64-bit book E

2013-01-22 Thread Stuart Yoder
From: Stuart Yoder stuart.yo...@freescale.com

Signed-off-by: Stuart Yoder stuart.yo...@freescale.com
---

-v2
   -macro'ized loop in idle_book3e.S to avoid code
    duplication, paravirt loop is now in idle_book3e.S

 arch/powerpc/kernel/epapr_hcalls.S |2 ++
 arch/powerpc/kernel/idle_book3e.S  |   30 --
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/epapr_hcalls.S b/arch/powerpc/kernel/epapr_hcalls.S
index 62c0dc2..9f1ebf7 100644
--- a/arch/powerpc/kernel/epapr_hcalls.S
+++ b/arch/powerpc/kernel/epapr_hcalls.S
@@ -17,6 +17,7 @@
 #include <asm/asm-compat.h>
 #include <asm/asm-offsets.h>
 
+#ifndef CONFIG_PPC64
 /* epapr_ev_idle() was derived from e500_idle() */
 _GLOBAL(epapr_ev_idle)
CURRENT_THREAD_INFO(r3, r1)
@@ -42,6 +43,7 @@ epapr_ev_idle_start:
 * _TLF_NAPPING.
 */
b   idle_loop
+#endif
 
 /* Hypercall entry point. Will be patched with device tree instructions. */
 .global epapr_hypercall_start
diff --git a/arch/powerpc/kernel/idle_book3e.S b/arch/powerpc/kernel/idle_book3e.S
index 4c7cb400..e1c9acd 100644
--- a/arch/powerpc/kernel/idle_book3e.S
+++ b/arch/powerpc/kernel/idle_book3e.S
@@ -16,11 +16,13 @@
 #include <asm/ppc-opcode.h>
 #include <asm/processor.h>
 #include <asm/thread_info.h>
+#include <asm/epapr_hcalls.h>
 
 /* 64-bit version only for now */
 #ifdef CONFIG_PPC64
 
-_GLOBAL(book3e_idle)
+.macro BOOK3E_IDLE name loop
+_GLOBAL(\name)
/* Save LR for later */
mflr    r0
std r0,16(r1)
@@ -67,7 +69,31 @@ _GLOBAL(book3e_idle)
 
/* We can now re-enable hard interrupts and go to sleep */
wrteei  1
-1: PPC_WAIT(0)
+   \loop
+
+.endm
+
+.macro BOOK3E_IDLE_LOOP
+1:
+   PPC_WAIT(0)
b   1b
+.endm
+
+.macro EPAPR_EV_IDLE_LOOP
+idle_loop:
+   LOAD_REG_IMMEDIATE(r11, EV_HCALL_TOKEN(EV_IDLE))
+
+.global epapr_ev_idle_start
+epapr_ev_idle_start:
+   li  r3, -1
+   nop
+   nop
+   nop
+   b   idle_loop
+.endm
+
+BOOK3E_IDLE epapr_ev_idle, EPAPR_EV_IDLE_LOOP
+
+BOOK3E_IDLE book3e_idle BOOK3E_IDLE_LOOP
 
 #endif /* CONFIG_PPC64 */
-- 
1.7.9.7




Re: [PATCH v2 8/8] KVM: x86 emulator: convert a few freestanding emulations to fastop

2013-01-22 Thread Marcelo Tosatti
Missing Signed-off-by.

On Sat, Jan 19, 2013 at 07:51:57PM +0200, Avi Kivity wrote:
 ---
  arch/x86/kvm/emulate.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)



RE: [PATCH v11 2/3] x86, apicv: add virtual x2apic support

2013-01-22 Thread Zhang, Yang Z
Marcelo Tosatti wrote on 2013-01-23:
 On Tue, Jan 22, 2013 at 05:55:53PM +0200, Gleb Natapov wrote:
 On Tue, Jan 22, 2013 at 12:21:47PM +, Zhang, Yang Z wrote:
 +static void vmx_set_msr_bitmap(struct kvm_vcpu *vcpu)
 +{
 + unsigned long *msr_bitmap;
 +
 + if (apic_x2apic_mode(vcpu->arch.apic))
 
 vcpu->arch.apic can be NULL.
 Actually, calling apic_x2apic_mode to check whether to use the x2apic msr
 bitmap is wrong. The VCPU may use x2apic without the virtual x2apic mode
 bit being set, because TPR shadow is not enabled or the irqchip is not in
 kernel. Checking the virtual x2apic mode bit in the vmcs directly should be
 a better choice. How about the following code:
 
 If TPR shadow is not enabled, vmx_msr_bitmap_.*x2apic bitmap will have
 x2apic MSRs intercepted.
Right. So checking the virtual x2apic mode bit also covers the TPR shadow
check. Otherwise, we would need two checks: one for the APIC mode and one for
TPR shadow.

 And what is the problem? APIC register virt depends on TPR shadow.
No problem. The new implementation is more reasonable: it needs no additional
TPR shadow check and does not touch apic_base.



Best regards,
Yang




RE: [PATCH v11 3/3] x86, apicv: add virtual interrupt delivery support

2013-01-22 Thread Zhang, Yang Z
Gleb Natapov wrote on 2013-01-21:
 On Mon, Jan 21, 2013 at 12:49:01AM +, Zhang, Yang Z wrote:
 Gleb Natapov wrote on 2013-01-20:
 On Thu, Jan 17, 2013 at 01:26:03AM +, Zhang, Yang Z wrote:
 Previous patch is stale. Resend the new patch. The only change is
 clear EOI and SELF-IPI reg in msr bitmap when vid is enabled.
 
 
 @@ -340,6 +325,8 @@ static inline int apic_find_highest_irr(struct kvm_lapic *apic)
  {
int result;
 +  /* Note that irr_pending is just a hint. It will be always
 +   * true with virtual interrupt delivery enabled. */
 This is not correct format for multi-line comments.
 Sure, will correct it here and below.
 
 +static void vmx_check_ioapic_entry(struct kvm_vcpu *vcpu,
 + struct kvm_lapic_irq *irq)
 +{
 +  struct kvm_lapic **dst;
 +  struct kvm_apic_map *map;
 +  unsigned long bitmap = 1;
 +  int i;
 +
 +  rcu_read_lock();
 +  map = rcu_dereference(vcpu->kvm->arch.apic_map);
 +
 +  if (unlikely(!map)) {
 +  set_eoi_exitmap_one(vcpu, irq->vector);
 +  goto out;
 +  }
 +
 +  if (irq->dest_mode == 0) { /* physical mode */
 +  if (irq->delivery_mode == APIC_DM_LOWEST ||
 +  irq->dest_id == 0xff) {
 +  set_eoi_exitmap_one(vcpu, irq->vector);
 +  goto out;
 +  }
 +  dst = &map->phys_map[irq->dest_id & 0xff];
 +  } else {
 +  u32 mda = irq->dest_id << (32 - map->ldr_bits);
 +
 +  dst = map->logical_map[apic_cluster_id(map, mda)];
 +
 +  bitmap = apic_logical_id(map, mda);
 +  }
 +
 +  for_each_set_bit(i, &bitmap, 16) {
 +  if (!dst[i])
 +  continue;
 +  if (dst[i]->vcpu == vcpu) {
 +  set_eoi_exitmap_one(vcpu, irq->vector);
 +  break;
 +  }
 +  }
 +
 +out:
 +  rcu_read_unlock();
 +}
 The logic in this function belongs to lapic code. The only thing
 that is specific to vmx in the function is setting of the bit in
 vmx->eoi_exit_bitmap, but since eoi_exit_bitmap is calculated and
 loaded during same vcpu entry we do not need vmx->eoi_exit_bitmap at
 all. Declare it on a stack in vmx_update_eoi_exitmap() and pass it to
 set_eoi_exitmap() and vmx_load_eoi_exitmap().
 IIRC, this logic was in lapic before v7, and you suggested moving the
 whole function into vmx code. So is it better to move it back to the lapic file?
 
 IIRC I suggested to call it only from vmx, not move it there. Before
 that the calculation was done even with vid disabled and only result was
 ignored. With current logic KVM_REQ_EOIBITMAP will be set only with vid
 enabled so the calculation will not be done needlessly.
 
 
 @@ -115,6 +116,42 @@ static void update_handled_vectors(struct kvm_ioapic *ioapic)
smp_wmb();
  }
 +void set_eoi_exitmap(struct kvm_vcpu *vcpu)
 +{
 This function is exported from the file and need to have more unique
 name. kvm_ioapic_calculate_eoi_exitmap() for instance.
 Ok.
 
 @@ -156,6 +193,7 @@ static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val)
if (e->fields.trig_mode == IOAPIC_LEVEL_TRIG
  && ioapic->irr & (1 << index))
ioapic_service(ioapic, index);
 +  ioapic_update_eoi_exitmap(ioapic->kvm);
 ioapic_write_indirect() is called under ioapic->lock,
 ioapic_update_eoi_exitmap() takes the same lock. Have you tested the
 code?
 ioapic_update_eoi_exitmap doesn't take any lock.
 
 Sorry. You are correct. Confused between different functions.
 
 I will do a full testing for every patch before sending out. It covers
 both windows and Linux guest.
 
 We are getting close so please test with userspace irq chip too.
Thanks for your suggestion to test with userspace irqchip. I found some issues
and will modify the logic:
As we know, APICv depends on TPR shadow. But TPR shadow is per VM (it will be
disabled when the VM uses a userspace irq chip), which means APICv is also per
VM. In the current implementation, we use the global variable enable_apicv_reg
to check whether APICv is used by the target vcpu. This is wrong. Instead, we
should read the VMCS to see whether the bit is set.

Best regards,
Yang




RE: [PATCH 2/2] x86, apicv: Add Posted Interrupt supporting

2013-01-22 Thread Zhang, Yang Z
Marcelo Tosatti wrote on 2013-01-23:
 On Thu, Dec 13, 2012 at 03:29:40PM +0800, Yang Zhang wrote:
 From: Yang Zhang yang.z.zh...@intel.com
 
 Posted Interrupt allows APIC interrupts to be injected into the guest directly
 without any vmexit.
 
 - When delivering an interrupt to the guest, if the target vcpu is running,
   update the Posted-interrupt requests bitmap and send a notification event
   to the vcpu. Then the vcpu will handle this interrupt automatically,
   without any software involvement.
 - If the target vcpu is not running or there is already a notification event
   pending in the vcpu, do nothing. The interrupt will be handled by
   the next vm entry.
 Signed-off-by: Yang Zhang yang.z.zh...@intel.com
 ---
 
 snip
 
 +static void pi_handler(void)
 +{
 +;
 +}
 +
 +static int vmx_has_posted_interrupt(struct kvm_vcpu *vcpu)
 +{
 +return irqchip_in_kernel(vcpu->kvm) && enable_apicv_pi;
 +}
 +
 +static int vmx_send_nv(struct kvm_vcpu *vcpu,
 +int vector)
 +{
 +struct vcpu_vmx *vmx = to_vmx(vcpu);
 +
 +pi_set_pir(vector, vmx->pi);
 
 Section 29.6 Posted interrupt processing:
 
 No other agent can read or write a PIR bit (or groups of bits) between
 the time it is read (to determine what to OR into VIRR) and when it is
 cleared.
This means hardware can ensure that the read and clear operations are atomic,
for example by using a locked cmpxchg.

 +if (!pi_test_and_set_on(vmx->pi) && (vcpu->mode == IN_GUEST_MODE)) {
 +apic->send_IPI_mask(get_cpu_mask(vcpu->cpu), POSTED_INTR_VECTOR);
 +return 1;
 +}
 +return 0;
 
 What is the purpose of the outstanding-notification bit? At first glance, its
 use as a lock for the PIR posted-interrupt bits is limited because it's cleared
 in step 3, before PIR is cleared. If it were cleared after step 5, then
 software could
 
   if (!pi_test_and_set_on(vmx->pi)) {
       pi_set_pir(vector, vmx->pi);
       apic->send_IPI_mask(get_cpu_mask(vcpu->cpu), POSTED_INTR_VECTOR);
   }
 
 Does this mean software has to read PIR _and_ the outstanding-notification
 bit to know when it's possible to set bits in PIR + send IPI?
There is no limitation on software setting bits in PIR. Software can set PIR
bits unconditionally with a locked operation.
Software must read the ON bit to check whether the IPI is needed. If the ON
bit is set, a notification event has already been sent but not yet acked by
the target cpu, so there is no need to send it again.

 
 Or is it really cleared after step 5?
No, it is in step 3.
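
The sender-side protocol described in this reply can be sketched in portable C. This is only an illustration under stated assumptions: the struct layout and the sketch_* names are hypothetical, C11 atomics stand in for the kernel's locked bit operations, and the IPI itself is reduced to a boolean return value.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the posted-interrupt descriptor. */
struct pi_desc_sketch {
    _Atomic uint32_t pir[8];   /* 256 posted-interrupt request bits */
    atomic_bool on;            /* outstanding-notification (ON) bit */
};

/* Set a PIR bit unconditionally with an atomic OR. */
static void sketch_set_pir(struct pi_desc_sketch *pi, unsigned vector)
{
    assert(vector < 256);
    atomic_fetch_or(&pi->pir[vector / 32], UINT32_C(1) << (vector % 32));
}

/* Set ON and return its previous value. */
static bool sketch_test_and_set_on(struct pi_desc_sketch *pi)
{
    return atomic_exchange(&pi->on, true);
}

/*
 * Post one vector. Returns true when the caller should send the
 * notification IPI: ON was clear before and the target is in guest mode.
 */
static bool sketch_post_interrupt(struct pi_desc_sketch *pi, unsigned vector,
                                  bool in_guest_mode)
{
    sketch_set_pir(pi, vector);
    return !sketch_test_and_set_on(pi) && in_guest_mode;
}
```

In this sketch a second interrupt posted while ON is still set only updates PIR; no second notification is requested, which matches the "already sent but not acked" case above.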


Best regards,
Yang



[PATCH v12 rebased 6/8] introduce a new qom device to deal with panicked event

2013-01-22 Thread Hu Tao
If the target is x86/x86_64, the guest's kernel will write 0x01 to the
port KVM_PV_EVENT_PORT when it panics. This patch introduces a new
qom device kvm_pv_ioport to listen on this I/O port and handle the panicked
event according to panicked_action's value. The possible actions are:
1. emit QEVENT_GUEST_PANICKED only
2. emit QEVENT_GUEST_PANICKED and pause the guest
3. emit QEVENT_GUEST_PANICKED and poweroff the guest
4. emit QEVENT_GUEST_PANICKED and reset the guest

I/O ports do not work for some targets (for example, s390). For such a
target you can implement another qom device and include its code in
pv_event.c.

Note: if we emit QEVENT_GUEST_PANICKED only, and the management
application does not receive this event (the management application may
not be running when the event is emitted), it won't know the
guest has panicked.

Signed-off-by: Wen Congyang we...@cn.fujitsu.com
Signed-off-by: Hu Tao hu...@cn.fujitsu.com
---
 hw/kvm/Makefile.objs |   2 +-
 hw/kvm/pv_event.c| 197 +++
 hw/pc_piix.c |   5 ++
 include/sysemu/kvm.h |   2 +
 kvm-stub.c   |   4 ++
 5 files changed, 209 insertions(+), 1 deletion(-)
 create mode 100644 hw/kvm/pv_event.c

diff --git a/hw/kvm/Makefile.objs b/hw/kvm/Makefile.objs
index f620d7f..cf93199 100644
--- a/hw/kvm/Makefile.objs
+++ b/hw/kvm/Makefile.objs
@@ -1 +1 @@
-obj-$(CONFIG_KVM) += clock.o apic.o i8259.o ioapic.o i8254.o pci-assign.o
+obj-$(CONFIG_KVM) += clock.o apic.o i8259.o ioapic.o i8254.o pci-assign.o pv_event.o
diff --git a/hw/kvm/pv_event.c b/hw/kvm/pv_event.c
new file mode 100644
index 000..f32f82e
--- /dev/null
+++ b/hw/kvm/pv_event.c
@@ -0,0 +1,197 @@
+/*
+ * QEMU KVM support, paravirtual event device
+ *
+ * Copyright Fujitsu, Corp. 2012
+ *
+ * Authors:
+ * Wen Congyang we...@cn.fujitsu.com
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include <linux/kvm_para.h>
+#include <asm/kvm_para.h>
+#include "qapi/qmp/qobject.h"
+#include "qapi/qmp/qjson.h"
+#include "monitor/monitor.h"
+#include "sysemu/sysemu.h"
+#include "sysemu/kvm.h"
+
+/* Possible values for action parameter. */
+#define PANICKED_REPORT 1   /* emit QEVENT_GUEST_PANICKED only */
+#define PANICKED_PAUSE  2   /* emit QEVENT_GUEST_PANICKED and pause VM */
+#define PANICKED_POWEROFF   3   /* emit QEVENT_GUEST_PANICKED and quit VM */
+#define PANICKED_RESET  4   /* emit QEVENT_GUEST_PANICKED and reset VM */
+
+#define PV_EVENT_DRIVER "kvm_pv_event"
+
+struct PVEventAction {
+char *panicked_action;
+int panicked_action_value;
+};
+
+#define DEFINE_PV_EVENT_PROPERTIES(_state, _conf)   \
+DEFINE_PROP_STRING("panicked_action", _state, _conf.panicked_action)
+
+static void panicked_mon_event(const char *action)
+{
+QObject *data;
+
+data = qobject_from_jsonf("{ 'action': %s }", action);
+monitor_protocol_event(QEVENT_GUEST_PANICKED, data);
+qobject_decref(data);
+}
+
+static void panicked_perform_action(uint32_t panicked_action)
+{
+switch (panicked_action) {
+case PANICKED_REPORT:
+panicked_mon_event("report");
+break;
+
+case PANICKED_PAUSE:
+panicked_mon_event("pause");
+vm_stop(RUN_STATE_GUEST_PANICKED);
+break;
+
+case PANICKED_POWEROFF:
+panicked_mon_event("poweroff");
+qemu_system_shutdown_request();
+break;
+
+case PANICKED_RESET:
+panicked_mon_event("reset");
+qemu_system_reset_request();
+break;
+}
+}
+
+static uint64_t supported_event(void)
+{
+return 1 << KVM_PV_FEATURE_PANICKED;
+}
+
+static void handle_event(int event, struct PVEventAction *conf)
+{
+if (event == KVM_PV_EVENT_PANICKED) {
+panicked_perform_action(conf->panicked_action_value);
+}
+}
+
+static int pv_event_init(struct PVEventAction *conf)
+{
+if (!conf->panicked_action) {
+conf->panicked_action_value = PANICKED_REPORT;
+} else if (strcasecmp(conf->panicked_action, "none") == 0) {
+conf->panicked_action_value = PANICKED_REPORT;
+} else if (strcasecmp(conf->panicked_action, "pause") == 0) {
+conf->panicked_action_value = PANICKED_PAUSE;
+} else if (strcasecmp(conf->panicked_action, "poweroff") == 0) {
+conf->panicked_action_value = PANICKED_POWEROFF;
+} else if (strcasecmp(conf->panicked_action, "reset") == 0) {
+conf->panicked_action_value = PANICKED_RESET;
+} else {
+return -1;
+}
+
+return 0;
+}
+
+#if defined(KVM_PV_EVENT_PORT)
+
+#include "hw/isa.h"
+
+typedef struct {
+ISADevice dev;
+struct PVEventAction conf;
+MemoryRegion ioport;
+} PVIOPortState;
+
+static uint64_t pv_io_read(void *opaque, hwaddr addr, unsigned size)
+{
+return supported_event();
+}
+
+static void pv_io_write(void *opaque, hwaddr addr, uint64_t val,
+unsigned size)
+{
+PVIOPortState *s = opaque;
+
+handle_event(val, &s->conf);
+}
+

[PATCH v12 rebased 8/8] pv event: add document to describe the usage

2013-01-22 Thread Hu Tao
Signed-off-by: Hu Tao hu...@cn.fujitsu.com
---
 docs/pv-event.txt | 17 +
 1 file changed, 17 insertions(+)
 create mode 100644 docs/pv-event.txt

diff --git a/docs/pv-event.txt b/docs/pv-event.txt
new file mode 100644
index 000..ac9e7fa
--- /dev/null
+++ b/docs/pv-event.txt
@@ -0,0 +1,17 @@
+KVM PV EVENT
+
+
+kvm pv event allows guest OS to notify host OS of some events, for
+example, guest panic. Currently, there is one event supported, that
+is, guest panic. More events can be added later.
+
+By default, kvm pv event is disabled. In order to enable it, you have
+to specify enable_pv_event=on for -machine command line option, along
+with -global kvm_pv_event.panicked_action to specify the action taken
+when panic event has occurred. Available panic actions are: none,
+pause, poweroff and reset. The following is an example:
+
+  qemu-system-x86_64 -enable-kvm -machine pc-0.12,enable_pv_event=on \
+-global kvm_pv_event.panicked_action=pause other options
+
+kvm pv event needs kvm support.
-- 
1.8.0.1.240.ge8a1f5a



[PATCH v12 rebased 1/8] preserve cpu runstate

2013-01-22 Thread Hu Tao
This patch enables preservation of cpu runstate during save/load vm.
So when a vm is restored from snapshot, the cpu runstate is restored,
too.

See following example:

# save two vms: one is running, the other is paused
(qemu) info status
VM status: running
(qemu) savevm running
(qemu) stop
(qemu) info status
VM status: paused
(qemu) savevm paused

# restore the one running
(qemu) info status
VM status: paused
(qemu) loadvm running
(qemu) info status
VM status: running

# restore the one paused
(qemu) loadvm paused
(qemu) info status
VM status: paused
(qemu) cont
(qemu) info status
VM status: running
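
The monitor session above boils down to "a snapshot remembers the run state it was taken in". A reduced sketch of that idea follows. It is an illustration only: the sketch_* types and functions are hypothetical, the state set is cut down to what the example needs, and the real patch stores the state through register_savevm() rather than in a struct.

```c
#include <assert.h>

/* Hypothetical, reduced set of run states; QEMU's RunState has more. */
enum sketch_run_state {
    SKETCH_PRELAUNCH,
    SKETCH_RUNNING,
    SKETCH_PAUSED,
    SKETCH_RESTORE_VM
};

struct sketch_vm {
    enum sketch_run_state current;
};

/* What a snapshot would carry: one byte holding the state at save time. */
struct sketch_snapshot {
    unsigned char run_state;
};

/* Stands in for savevm: record the state the VM was in when saved. */
static struct sketch_snapshot sketch_savevm(const struct sketch_vm *vm)
{
    struct sketch_snapshot snap = { (unsigned char)vm->current };
    return snap;
}

/* Stands in for loadvm: stop the VM, then return to the recorded state. */
static void sketch_loadvm(struct sketch_vm *vm,
                          const struct sketch_snapshot *snap)
{
    assert(snap->run_state <= SKETCH_RESTORE_VM);
    vm->current = SKETCH_RESTORE_VM;                      /* like vm_stop() */
    vm->current = (enum sketch_run_state)snap->run_state; /* like load_run_state() */
}
```

Loading the "running" snapshot while paused resumes the VM, and loading the "paused" snapshot while running leaves it paused, as in the session above.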


Signed-off-by: Hu Tao hu...@cn.fujitsu.com
---
 include/sysemu/sysemu.h |  2 ++
 migration.c |  6 +-
 monitor.c   |  5 ++---
 savevm.c|  1 +
 vl.c| 34 ++
 5 files changed, 40 insertions(+), 8 deletions(-)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 337ce7d..7a69fde 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -19,6 +19,8 @@ extern uint8_t qemu_uuid[];
 int qemu_uuid_parse(const char *str, uint8_t *uuid);
 #define UUID_FMT 
%02hhx%02hhx%02hhx%02hhx-%02hhx%02hhx-%02hhx%02hhx-%02hhx%02hhx-%02hhx%02hhx%02hhx%02hhx%02hhx%02hhx
 
+void save_run_state(void);
+void load_run_state(void);
 bool runstate_check(RunState state);
 void runstate_set(RunState new_state);
 int runstate_is_running(void);
diff --git a/migration.c b/migration.c
index 77c1971..f96cfd6 100644
--- a/migration.c
+++ b/migration.c
@@ -108,11 +108,7 @@ static void process_incoming_migration_co(void *opaque)
 /* Make sure all file formats flush their mutable metadata */
 bdrv_invalidate_cache_all();
 
-if (autostart) {
-vm_start();
-} else {
-runstate_set(RUN_STATE_PAUSED);
-}
+load_run_state();
 }
 
 static void enter_migration_coroutine(void *opaque)
diff --git a/monitor.c b/monitor.c
index 20bd19b..9381ed0 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2059,13 +2059,12 @@ void qmp_closefd(const char *fdname, Error **errp)
 
 static void do_loadvm(Monitor *mon, const QDict *qdict)
 {
-int saved_vm_running  = runstate_is_running();
 const char *name = qdict_get_str(qdict, "name");
 
 vm_stop(RUN_STATE_RESTORE_VM);
 
-if (load_vmstate(name) == 0 && saved_vm_running) {
-vm_start();
+if (load_vmstate(name) == 0) {
+load_run_state();
 }
 }
 
diff --git a/savevm.c b/savevm.c
index 304d1ef..10f1d56 100644
--- a/savevm.c
+++ b/savevm.c
@@ -2112,6 +2112,7 @@ void do_savevm(Monitor *mon, const QDict *qdict)
 }
 
 saved_vm_running = runstate_is_running();
+save_run_state();
 vm_stop(RUN_STATE_SAVE_VM);
 
 memset(sn, 0, sizeof(*sn));
diff --git a/vl.c b/vl.c
index 4ee1302..b0bcf1e 100644
--- a/vl.c
+++ b/vl.c
@@ -520,6 +520,7 @@ static int default_driver_check(QemuOpts *opts, void *opaque)
 /* QEMU state */
 
 static RunState current_run_state = RUN_STATE_PRELAUNCH;
+static RunState saved_run_state = RUN_STATE_PRELAUNCH;
 
 typedef struct {
 RunState from;
@@ -543,6 +544,7 @@ static const RunStateTransition runstate_transitions_def[] = {
 { RUN_STATE_PAUSED, RUN_STATE_FINISH_MIGRATE },
 
 { RUN_STATE_POSTMIGRATE, RUN_STATE_RUNNING },
+{ RUN_STATE_POSTMIGRATE, RUN_STATE_PAUSED },
 { RUN_STATE_POSTMIGRATE, RUN_STATE_FINISH_MIGRATE },
 
 { RUN_STATE_PRELAUNCH, RUN_STATE_RUNNING },
@@ -553,6 +555,7 @@ static const RunStateTransition runstate_transitions_def[] = {
 { RUN_STATE_FINISH_MIGRATE, RUN_STATE_POSTMIGRATE },
 
 { RUN_STATE_RESTORE_VM, RUN_STATE_RUNNING },
+{ RUN_STATE_RESTORE_VM, RUN_STATE_PAUSED },
 
 { RUN_STATE_RUNNING, RUN_STATE_DEBUG },
 { RUN_STATE_RUNNING, RUN_STATE_INTERNAL_ERROR },
@@ -582,11 +585,39 @@ static const RunStateTransition runstate_transitions_def[] = {
 
 static bool runstate_valid_transitions[RUN_STATE_MAX][RUN_STATE_MAX];
 
+void save_run_state(void)
+{
+saved_run_state = current_run_state;
+}
+
+void load_run_state(void)
+{
+if (saved_run_state == RUN_STATE_RUNNING) {
+vm_start();
+} else if (!runstate_check(saved_run_state)) {
+runstate_set(saved_run_state);
+} else {
+; /* leave unchanged */
+}
+}
+
 bool runstate_check(RunState state)
 {
 return current_run_state == state;
 }
 
+static void runstate_save(QEMUFile *f, void *opaque)
+{
+qemu_put_byte(f, saved_run_state);
+}
+
+static int runstate_load(QEMUFile *f, void *opaque, int version_id)
+{
+saved_run_state = qemu_get_byte(f);
+
+return 0;
+}
+
 static void runstate_init(void)
 {
 const RunStateTransition *p;
@@ -596,6 +627,9 @@ static void runstate_init(void)
 for (p = &runstate_transitions_def[0]; p->from != RUN_STATE_MAX; p++) {
 runstate_valid_transitions[p->from][p->to] = true;
 }
+
+register_savevm(NULL, "runstate", 0, 1,
+runstate_save, runstate_load, NULL);
 }
 
 /* This function will 

[PATCH v12 rebased 2/8] start vm after resetting it

2013-01-22 Thread Hu Tao
From: Wen Congyang we...@cn.fujitsu.com

The guest should run after it is reset, but it does not if its
previous state was RUN_STATE_INTERNAL_ERROR or RUN_STATE_SHUTDOWN.

So do not set the runstate to RUN_STATE_PAUSED when resetting the guest:
the runstate now changes from RUN_STATE_INTERNAL_ERROR or
RUN_STATE_SHUTDOWN to RUN_STATE_RUNNING (not RUN_STATE_PAUSED).

Signed-off-by: Wen Congyang we...@cn.fujitsu.com
---
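Note for readers of the vl.c hunks in this patch: QEMU checks every
run-state change against a table of permitted (from, to) pairs, and
the patch swaps two entries so that a reset can go straight to running.
The following is only a rough sketch of that lookup. It uses a reduced
state set and hypothetical names (State, transitions_init,
transition_allowed), not the real QEMU symbols.

```c
#include <assert.h>
#include <stdbool.h>

/* Reduced, illustrative state set; not the full QEMU RunState enum. */
typedef enum {
    ST_INTERNAL_ERROR,
    ST_SHUTDOWN,
    ST_PAUSED,
    ST_RUNNING,
    ST_MAX
} State;

typedef struct {
    State from;
    State to;
} Transition;

/* Entries as they look after this patch: both error-like states may
 * move directly to running. ST_MAX terminates the list. */
static const Transition transitions_def[] = {
    { ST_INTERNAL_ERROR, ST_RUNNING },
    { ST_SHUTDOWN,       ST_RUNNING },
    { ST_PAUSED,         ST_RUNNING },
    { ST_RUNNING,        ST_PAUSED },
    { ST_MAX,            ST_MAX },
};

static bool valid[ST_MAX][ST_MAX];

/* Expand the list into a lookup matrix, in the spirit of runstate_init(). */
static void transitions_init(void)
{
    const Transition *p;

    for (p = &transitions_def[0]; p->from != ST_MAX; p++) {
        valid[p->from][p->to] = true;
    }
}

/* Return true if moving from one state to the other is permitted. */
static bool transition_allowed(State from, State to)
{
    assert(from < ST_MAX && to < ST_MAX);
    return valid[from][to];
}
```

With this table, a reset from the internal-error or shutdown state may
start the guest, while the old "go to paused" edge is no longer listed.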
 include/block/block.h | 2 ++
 qmp.c | 2 +-
 vl.c  | 7 ---
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index ffd1936..5e82ccb 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -366,6 +366,8 @@ void bdrv_disable_copy_on_read(BlockDriverState *bs);
 void bdrv_set_in_use(BlockDriverState *bs, int in_use);
 int bdrv_in_use(BlockDriverState *bs);
 
+void iostatus_bdrv_it(void *opaque, BlockDriverState *bs);
+
 #ifdef CONFIG_LINUX_AIO
 int raw_get_aio_fd(BlockDriverState *bs);
 #else
diff --git a/qmp.c b/qmp.c
index 55b056b..5f1bed1 100644
--- a/qmp.c
+++ b/qmp.c
@@ -130,7 +130,7 @@ SpiceInfo *qmp_query_spice(Error **errp)
 };
 #endif
 
-static void iostatus_bdrv_it(void *opaque, BlockDriverState *bs)
+void iostatus_bdrv_it(void *opaque, BlockDriverState *bs)
 {
 bdrv_iostatus_reset(bs);
 }
diff --git a/vl.c b/vl.c
index b0bcf1e..1d2edaa 100644
--- a/vl.c
+++ b/vl.c
@@ -534,7 +534,7 @@ static const RunStateTransition runstate_transitions_def[] = {
 { RUN_STATE_INMIGRATE, RUN_STATE_RUNNING },
 { RUN_STATE_INMIGRATE, RUN_STATE_PAUSED },
 
-{ RUN_STATE_INTERNAL_ERROR, RUN_STATE_PAUSED },
+{ RUN_STATE_INTERNAL_ERROR, RUN_STATE_RUNNING },
 { RUN_STATE_INTERNAL_ERROR, RUN_STATE_FINISH_MIGRATE },
 
 { RUN_STATE_IO_ERROR, RUN_STATE_RUNNING },
@@ -569,7 +569,7 @@ static const RunStateTransition runstate_transitions_def[] = {
 
 { RUN_STATE_SAVE_VM, RUN_STATE_RUNNING },
 
-{ RUN_STATE_SHUTDOWN, RUN_STATE_PAUSED },
+{ RUN_STATE_SHUTDOWN, RUN_STATE_RUNNING },
 { RUN_STATE_SHUTDOWN, RUN_STATE_FINISH_MIGRATE },
 
 { RUN_STATE_DEBUG, RUN_STATE_SUSPENDED },
@@ -1951,7 +1951,8 @@ static bool main_loop_should_exit(void)
 resume_all_vcpus();
 if (runstate_check(RUN_STATE_INTERNAL_ERROR) ||
 runstate_check(RUN_STATE_SHUTDOWN)) {
-runstate_set(RUN_STATE_PAUSED);
+bdrv_iterate(iostatus_bdrv_it, NULL);
+vm_start();
 }
 }
 if (qemu_wakeup_requested()) {
-- 
1.8.0.1.240.ge8a1f5a



[PATCH v12 rebased 7/8] allow the user to disable pv event support

2013-01-22 Thread Hu Tao
Signed-off-by: Wen Congyang we...@cn.fujitsu.com
Signed-off-by: Hu Tao hu...@cn.fujitsu.com
---
 hw/pc_piix.c| 9 -
 qemu-options.hx | 3 ++-
 vl.c| 4 
 3 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index fed6ccf..507c98b 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -44,6 +44,7 @@
 #include "exec/memory.h"
 #include "exec/address-spaces.h"
 #include "cpu.h"
+#include "qemu/config-file.h"
 #ifdef CONFIG_XEN
 #  include <xen/hvm/hvm_info_table.h>
 #endif
@@ -86,6 +87,8 @@ static void pc_init1(MemoryRegion *system_memory,
 MemoryRegion *pci_memory;
 MemoryRegion *rom_memory;
 void *fw_cfg = NULL;
+QemuOptsList *list = qemu_find_opts("machine");
+bool enable_pv_event = false;
 
 pc_cpus_init(cpu_model);
 pc_acpi_init("acpi-dsdt.aml");
@@ -218,7 +221,11 @@ static void pc_init1(MemoryRegion *system_memory,
 pc_pci_device_init(pci_bus);
 }
 
-if (kvm_enabled()) {
+if (list && !QTAILQ_EMPTY(&list->head)) {
+enable_pv_event = qemu_opt_get_bool(QTAILQ_FIRST(&list->head),
+"enable_pv_event", false);
+}
+if (kvm_enabled() && enable_pv_event) {
 kvm_pv_event_init(isa_bus);
 }
 }
diff --git a/qemu-options.hx b/qemu-options.hx
index 4e2b499..7522f4a 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -35,7 +35,8 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
 "kernel_irqchip=on|off controls accelerated irqchip support\n"
 "kvm_shadow_mem=size of KVM shadow MMU\n"
 "dump-guest-core=on|off include guest memory in a core dump (default=on)\n"
-"mem-merge=on|off controls memory merge support (default: on)\n",
+"mem-merge=on|off controls memory merge support (default: on)\n"
+"enable_pv_event=on|off controls pv event support (default: off)\n",
 QEMU_ARCH_ALL)
 STEXI
 @item -machine [type=]@var{name}[,prop=@var{value}[,...]]
diff --git a/vl.c b/vl.c
index 5aae03f..aa15b23 100644
--- a/vl.c
+++ b/vl.c
@@ -424,6 +424,10 @@ static QemuOptsList qemu_machine_opts = {
 .name = "usb",
 .type = QEMU_OPT_BOOL,
 .help = "Set on/off to enable/disable usb",
+}, {
+.name = "enable_pv_event",
+.type = QEMU_OPT_BOOL,
+.help = "handle pv event"
 },
 { /* End of list */ }
 },
-- 
1.8.0.1.240.ge8a1f5a



[PATCH v12 rebased 5/8] add a new qevent: QEVENT_GUEST_PANICKED

2013-01-22 Thread Hu Tao
This event will be emitted when the guest panics.

Signed-off-by: Wen Congyang we...@cn.fujitsu.com
---
 include/monitor/monitor.h | 1 +
 monitor.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h
index 87fb49c..4006905 100644
--- a/include/monitor/monitor.h
+++ b/include/monitor/monitor.h
@@ -45,6 +45,7 @@ typedef enum MonitorEvent {
 QEVENT_WAKEUP,
 QEVENT_BALLOON_CHANGE,
 QEVENT_SPICE_MIGRATE_COMPLETED,
+QEVENT_GUEST_PANICKED,
 
 /* Add to 'monitor_event_names' array in monitor.c when
  * defining new events here */
diff --git a/monitor.c b/monitor.c
index 9381ed0..61beeb4 100644
--- a/monitor.c
+++ b/monitor.c
@@ -463,6 +463,7 @@ static const char *monitor_event_names[] = {
 [QEVENT_WAKEUP] = "WAKEUP",
 [QEVENT_BALLOON_CHANGE] = "BALLOON_CHANGE",
 [QEVENT_SPICE_MIGRATE_COMPLETED] = "SPICE_MIGRATE_COMPLETED",
+[QEVENT_GUEST_PANICKED] = "GUEST_PANICKED",
 };
 QEMU_BUILD_BUG_ON(ARRAY_SIZE(monitor_event_names) != QEVENT_MAX)
 
-- 
1.8.0.1.240.ge8a1f5a



[PATCH v12 rebased] kvm: notify host when the guest is panicked

2013-01-22 Thread Hu Tao
When a guest runs on Xen, the host can tell that the guest has panicked.
KVM has no such feature yet.

Another purpose of this feature is to let a management application (for
example, libvirt) dump the guest automatically when it panics. If the
management application does not do so, the user can still take a dump by
hand once he sees that the guest has panicked.

There are three ways to implement this feature:
1. use vmcall
2. use an I/O port
3. use virtio-serial

We have decided to avoid touching the hypervisor. I chose the I/O port
because:
1. it is easier to implement
2. it does not depend on any virtual device
3. it can work while the kernel is still starting

Signed-off-by: Wen Congyang we...@cn.fujitsu.com
Signed-off-by: Hu Tao hu...@cn.fujitsu.com
---
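A rough user-space sketch of the guest-side flow with the I/O port
approach may help reviewers. The port number, feature bit and event
value are the ones defined in this series. Everything else is an
assumption for illustration: fake_inl()/fake_outl() stand in for real
port I/O, pv_notify_panic() is a hypothetical name, and the "read the
feature word, then write the event" order is inferred from the helper
names, not copied from the notifier hunk (which is cut off in this
archive).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the headers added by this series. */
#define KVM_PV_EVENT_PORT        0x505UL
#define KVM_PV_FEATURE_PANICKED  0   /* bit index in the feature word */
#define KVM_PV_EVENT_PANICKED    1   /* value written to report a panic */

/* Hypothetical stand-ins for inl()/outl() so the sketch runs in user
 * space. A real guest would perform port I/O instead. */
static uint32_t fake_host_features = 1u << KVM_PV_FEATURE_PANICKED;
static uint32_t fake_last_event;

static uint32_t fake_inl(unsigned long port)
{
    assert(port == KVM_PV_EVENT_PORT);
    return fake_host_features;
}

static void fake_outl(uint32_t value, unsigned long port)
{
    assert(port == KVM_PV_EVENT_PORT);
    fake_last_event = value;
}

/* Assumed flow: read the feature word and report the panic only if the
 * host advertises the "panicked" bit. Returns true if an event was sent. */
static bool pv_notify_panic(bool pv_event_enabled)
{
    uint32_t features;

    if (!pv_event_enabled) {
        return false;           /* e.g. booted with no-pv-event */
    }

    features = fake_inl(KVM_PV_EVENT_PORT);
    if (!(features & (1u << KVM_PV_FEATURE_PANICKED))) {
        return false;           /* host does not support the event */
    }

    fake_outl(KVM_PV_EVENT_PANICKED, KVM_PV_EVENT_PORT);
    return true;
}
```

Nothing here needs a virtual device or a hypercall, which is the point
of choosing the port: a single read and a single write are enough.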
 arch/ia64/kvm/irq.h  | 19 +
 arch/powerpc/include/asm/kvm_para.h  | 18 
 arch/s390/include/asm/kvm_para.h | 19 +
 arch/x86/include/asm/kvm_para.h  | 20 ++
 arch/x86/include/uapi/asm/kvm_para.h |  2 ++
 arch/x86/kernel/kvm.c| 53 
 include/linux/kvm_para.h | 18 
 include/uapi/linux/kvm_para.h|  6 
 kernel/panic.c   |  4 +++
 9 files changed, 159 insertions(+)

diff --git a/arch/ia64/kvm/irq.h b/arch/ia64/kvm/irq.h
index c0785a7..b3870f8 100644
--- a/arch/ia64/kvm/irq.h
+++ b/arch/ia64/kvm/irq.h
@@ -30,4 +30,23 @@ static inline int irqchip_in_kernel(struct kvm *kvm)
return 1;
 }
 
+static inline int kvm_arch_pv_event_init(void)
+{
+   return 0;
+}
+
+static inline unsigned int kvm_arch_pv_features(void)
+{
+   return 0;
+}
+
+static inline void kvm_arch_pv_eject_event(unsigned int event)
+{
+}
+
+static inline bool kvm_arch_pv_event_enabled(void)
+{
+   return false;
+}
+
 #endif
diff --git a/arch/powerpc/include/asm/kvm_para.h b/arch/powerpc/include/asm/kvm_para.h
index 2b11965..17dd013 100644
--- a/arch/powerpc/include/asm/kvm_para.h
+++ b/arch/powerpc/include/asm/kvm_para.h
@@ -144,4 +144,22 @@ static inline bool kvm_check_and_clear_guest_paused(void)
return false;
 }
 
+static inline int kvm_arch_pv_event_init(void)
+{
+   return 0;
+}
+
+static inline unsigned int kvm_arch_pv_features(void)
+{
+   return 0;
+}
+
+static inline void kvm_arch_pv_eject_event(unsigned int event)
+{
+}
+
+static inline bool kvm_arch_pv_event_enabled(void)
+{
+   return false;
+}
 #endif /* __POWERPC_KVM_PARA_H__ */
diff --git a/arch/s390/include/asm/kvm_para.h b/arch/s390/include/asm/kvm_para.h
index e0f8423..81d87ec 100644
--- a/arch/s390/include/asm/kvm_para.h
+++ b/arch/s390/include/asm/kvm_para.h
@@ -154,4 +154,23 @@ static inline bool kvm_check_and_clear_guest_paused(void)
return false;
 }
 
+static inline int kvm_arch_pv_event_init(void)
+{
+   return 0;
+}
+
+static inline unsigned int kvm_arch_pv_features(void)
+{
+   return 0;
+}
+
+static inline void kvm_arch_pv_eject_event(unsigned int event)
+{
+}
+
+static inline bool kvm_arch_pv_event_enabled(void)
+{
+   return false;
+}
+
 #endif /* __S390_KVM_PARA_H */
diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
index 5ed1f161..c3f2ca8 100644
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -133,4 +133,24 @@ static inline void kvm_disable_steal_time(void)
 }
 #endif
 
+static inline int kvm_arch_pv_event_init(void)
+{
+   if (!request_region(KVM_PV_EVENT_PORT, 4, "KVM_PV_EVENT"))
+   return -1;
+
+   return 0;
+}
+
+static inline unsigned int kvm_arch_pv_features(void)
+{
+   return inl(KVM_PV_EVENT_PORT);
+}
+
+static inline void kvm_arch_pv_eject_event(unsigned int event)
+{
+   outl(event, KVM_PV_EVENT_PORT);
+}
+
+bool kvm_arch_pv_event_enabled(void);
+
 #endif /* _ASM_X86_KVM_PARA_H */
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 06fdbd9..c15ef33 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -96,5 +96,7 @@ struct kvm_vcpu_pv_apf_data {
 #define KVM_PV_EOI_ENABLED KVM_PV_EOI_MASK
 #define KVM_PV_EOI_DISABLED 0x0
 
+#define KVM_PV_EVENT_PORT  (0x505UL)
+
 
 #endif /* _UAPI_ASM_X86_KVM_PARA_H */
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 9c2bd8b..0aa7b3e 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -73,6 +73,20 @@ static int parse_no_kvmclock_vsyscall(char *arg)
 
 early_param("no-kvmclock-vsyscall", parse_no_kvmclock_vsyscall);
 
+static int pv_event = 1;
+static int parse_no_pv_event(char *arg)
+{
+   pv_event = 0;
+   return 0;
+}
+
+bool kvm_arch_pv_event_enabled(void)
+{
+   return !!pv_event;
+}
+
+early_param("no-pv-event", parse_no_pv_event);
+
 static DEFINE_PER_CPU(struct kvm_vcpu_pv_apf_data, apf_reason) __aligned(64);
 static DEFINE_PER_CPU(struct kvm_steal_time, steal_time) __aligned(64);
 static int has_steal_clock = 0;
@@ -385,6 +399,17 @@ static struct 

[PATCH v12 rebased 0/8] pv event to notify host when the guest is panicked

2013-01-22 Thread Hu Tao
This series implements a new interface, kvm pv event, to notify the host when
certain events happen in the guest. Right now one event is supported: guest
panic.

Also, the cpu runstate is preserved across save/load vm and migration. Thus,
if the vm panics during migration, we can still find out by querying the
status of the vm on the destination host.
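
As a rough illustration of the runstate preservation: patch 1/8 writes
the saved state into the migration stream as a single byte and reads it
back on the other side. The sketch below shows that round trip only. The
Stream type and the state_save()/state_load() helpers are hypothetical
stand-ins for QEMUFile and the real handlers, and the state set is
reduced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Reduced, illustrative state set; not the full QEMU RunState enum. */
typedef enum {
    ST_PAUSED,
    ST_RUNNING,
    ST_GUEST_PANICKED
} State;

/* Hypothetical one-byte "stream" standing in for a QEMUFile. */
typedef struct {
    uint8_t byte;
    bool has_data;
} Stream;

/* Source side: record the state as a single byte. */
static void state_save(Stream *s, State saved)
{
    assert(s);
    s->byte = (uint8_t)saved;
    s->has_data = true;
}

/* Destination side: read the byte back. Returns 0 on success, -1 if the
 * stream carries no state. */
static int state_load(const Stream *s, State *out)
{
    assert(s && out);
    if (!s->has_data) {
        return -1;
    }
    *out = (State)s->byte;
    return 0;
}
```

A guest that panicked on the source therefore shows up as panicked on
the destination, instead of falling back to a default state.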

This version is a rebase with no code changes.

v12: http://lists.gnu.org/archive/html/qemu-devel/2012-12/msg01459.html
v11: http://lists.gnu.org/archive/html/qemu-devel/2012-10/msg04361.html

Hu Tao (7):
  save/load cpu runstate
  update kernel headers
  add a new runstate: RUN_STATE_GUEST_PANICKED
  add a new qevent: QEVENT_GUEST_PANICKED
  introduce a new qom device to deal with panicked event
  allow the user to disable pv event support
  pv event: add document to describe the usage

Wen Congyang (1):
  start vm after resetting it

 docs/pv-event.txt|  17 
 hw/kvm/Makefile.objs |   2 +-
 hw/kvm/pv_event.c| 197 +++
 hw/pc_piix.c |  12 +++
 include/block/block.h|   2 +
 include/monitor/monitor.h|   1 +
 include/sysemu/kvm.h |   2 +
 include/sysemu/sysemu.h  |   2 +
 kvm-stub.c   |   4 +
 linux-headers/asm-x86/kvm_para.h |   1 +
 linux-headers/linux/kvm_para.h   |   6 ++
 migration.c  |   7 +-
 monitor.c|   6 +-
 qapi-schema.json |   6 +-
 qemu-options.hx  |   3 +-
 qmp.c|   5 +-
 savevm.c |   1 +
 vl.c |  56 ++-
 18 files changed, 313 insertions(+), 17 deletions(-)
 create mode 100644 docs/pv-event.txt
 create mode 100644 hw/kvm/pv_event.c

-- 
1.8.0.1.240.ge8a1f5a



[PATCH v12 rebased 3/8] update kernel headers

2013-01-22 Thread Hu Tao
Update kernel headers to add pv event macros.

Signed-off-by: Wen Congyang we...@cn.fujitsu.com
Signed-off-by: Hu Tao hu...@cn.fujitsu.com
---
 linux-headers/asm-x86/kvm_para.h | 1 +
 linux-headers/linux/kvm_para.h   | 6 ++
 2 files changed, 7 insertions(+)

diff --git a/linux-headers/asm-x86/kvm_para.h b/linux-headers/asm-x86/kvm_para.h
index a1c3d72..781959a 100644
--- a/linux-headers/asm-x86/kvm_para.h
+++ b/linux-headers/asm-x86/kvm_para.h
@@ -96,5 +96,6 @@ struct kvm_vcpu_pv_apf_data {
 #define KVM_PV_EOI_ENABLED KVM_PV_EOI_MASK
 #define KVM_PV_EOI_DISABLED 0x0
 
+#define KVM_PV_EVENT_PORT  (0x505UL)
 
 #endif /* _ASM_X86_KVM_PARA_H */
diff --git a/linux-headers/linux/kvm_para.h b/linux-headers/linux/kvm_para.h
index 7bdcf93..f6be0bb 100644
--- a/linux-headers/linux/kvm_para.h
+++ b/linux-headers/linux/kvm_para.h
@@ -20,6 +20,12 @@
 #define KVM_HC_FEATURES 3
 #define KVM_HC_PPC_MAP_MAGIC_PAGE  4
 
+/* The bit of supported pv event */
+#define KVM_PV_FEATURE_PANICKED 0
+
+/* The pv event value */
+#define KVM_PV_EVENT_PANICKED  1
+
 /*
  * hypercalls use architecture specific
  */
-- 
1.8.0.1.240.ge8a1f5a



[PATCH v12 rebased 4/8] add a new runstate: RUN_STATE_GUEST_PANICKED

2013-01-22 Thread Hu Tao
The guest will be in this state after it has panicked.

If the guest panics during live migration, the runstate
RUN_STATE_GUEST_PANICKED will be transferred to the destination machine.

Signed-off-by: Wen Congyang we...@cn.fujitsu.com
Signed-off-by: Hu Tao hu...@cn.fujitsu.com
---
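For reference, the qmp.c hunk in this patch makes "cont" refuse to
resume a panicked guest until it has been reset, the same way it
already does for the internal-error and shutdown states. A minimal
sketch of that check, with a reduced state set and a hypothetical
helper name (reset_required), not the actual qmp_cont() code:

```c
#include <assert.h>
#include <stdbool.h>

/* Reduced, illustrative state set; not the full QEMU RunState enum. */
typedef enum {
    ST_INTERNAL_ERROR,
    ST_SHUTDOWN,
    ST_GUEST_PANICKED,
    ST_PAUSED,
    ST_RUNNING,
    ST_MAX
} State;

/* True when "cont" must be refused until the guest is reset. */
static bool reset_required(State s)
{
    assert(s < ST_MAX);
    switch (s) {
    case ST_INTERNAL_ERROR:
    case ST_SHUTDOWN:
    case ST_GUEST_PANICKED:
        return true;
    default:
        return false;
    }
}
```

A paused guest can still be continued, but a panicked one has to go
through a reset first.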
 migration.c  |  1 +
 qapi-schema.json |  6 +-
 qmp.c|  3 ++-
 vl.c | 11 ++-
 4 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/migration.c b/migration.c
index f96cfd6..2b51913 100644
--- a/migration.c
+++ b/migration.c
@@ -705,6 +705,7 @@ static void *buffered_file_thread(void *opaque)
 int64_t start_time, end_time;
 
 DPRINTF("done iterating\n");
+save_run_state();
 start_time = qemu_get_clock_ms(rt_clock);
 qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER);
 if (old_vm_running) {
diff --git a/qapi-schema.json b/qapi-schema.json
index 6d7252b..b49094b 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -174,11 +174,15 @@
 # @suspended: guest is suspended (ACPI S3)
 #
 # @watchdog: the watchdog action is configured to pause and has been triggered
+#
+# @guest-panicked: the panicked action is configured to pause and has been
+# triggered.
 ##
 { 'enum': 'RunState',
   'data': [ 'debug', 'inmigrate', 'internal-error', 'io-error', 'paused',
 'postmigrate', 'prelaunch', 'finish-migrate', 'restore-vm',
-'running', 'save-vm', 'shutdown', 'suspended', 'watchdog' ] }
+'running', 'save-vm', 'shutdown', 'suspended', 'watchdog',
+'guest-panicked' ] }
 
 ##
 # @SnapshotInfo
diff --git a/qmp.c b/qmp.c
index 5f1bed1..f5027f6 100644
--- a/qmp.c
+++ b/qmp.c
@@ -150,7 +150,8 @@ void qmp_cont(Error **errp)
 Error *local_err = NULL;
 
 if (runstate_check(RUN_STATE_INTERNAL_ERROR) ||
-   runstate_check(RUN_STATE_SHUTDOWN)) {
+runstate_check(RUN_STATE_SHUTDOWN) ||
+runstate_check(RUN_STATE_GUEST_PANICKED)) {
 error_set(errp, QERR_RESET_REQUIRED);
 return;
 } else if (runstate_check(RUN_STATE_SUSPENDED)) {
diff --git a/vl.c b/vl.c
index 1d2edaa..5aae03f 100644
--- a/vl.c
+++ b/vl.c
@@ -533,6 +533,7 @@ static const RunStateTransition runstate_transitions_def[] = {
 
 { RUN_STATE_INMIGRATE, RUN_STATE_RUNNING },
 { RUN_STATE_INMIGRATE, RUN_STATE_PAUSED },
+{ RUN_STATE_INMIGRATE, RUN_STATE_GUEST_PANICKED },
 
 { RUN_STATE_INTERNAL_ERROR, RUN_STATE_RUNNING },
 { RUN_STATE_INTERNAL_ERROR, RUN_STATE_FINISH_MIGRATE },
@@ -546,6 +547,7 @@ static const RunStateTransition runstate_transitions_def[] = {
 { RUN_STATE_POSTMIGRATE, RUN_STATE_RUNNING },
 { RUN_STATE_POSTMIGRATE, RUN_STATE_PAUSED },
 { RUN_STATE_POSTMIGRATE, RUN_STATE_FINISH_MIGRATE },
+{ RUN_STATE_POSTMIGRATE, RUN_STATE_GUEST_PANICKED },
 
 { RUN_STATE_PRELAUNCH, RUN_STATE_RUNNING },
 { RUN_STATE_PRELAUNCH, RUN_STATE_FINISH_MIGRATE },
@@ -556,6 +558,7 @@ static const RunStateTransition runstate_transitions_def[] = {
 
 { RUN_STATE_RESTORE_VM, RUN_STATE_RUNNING },
 { RUN_STATE_RESTORE_VM, RUN_STATE_PAUSED },
+{ RUN_STATE_RESTORE_VM, RUN_STATE_GUEST_PANICKED },
 
 { RUN_STATE_RUNNING, RUN_STATE_DEBUG },
 { RUN_STATE_RUNNING, RUN_STATE_INTERNAL_ERROR },
@@ -566,6 +569,7 @@ static const RunStateTransition runstate_transitions_def[] = {
 { RUN_STATE_RUNNING, RUN_STATE_SAVE_VM },
 { RUN_STATE_RUNNING, RUN_STATE_SHUTDOWN },
 { RUN_STATE_RUNNING, RUN_STATE_WATCHDOG },
+{ RUN_STATE_RUNNING, RUN_STATE_GUEST_PANICKED },
 
 { RUN_STATE_SAVE_VM, RUN_STATE_RUNNING },
 
@@ -580,6 +584,10 @@ static const RunStateTransition runstate_transitions_def[] = {
 { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
 { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
 
+{ RUN_STATE_GUEST_PANICKED, RUN_STATE_RUNNING },
+{ RUN_STATE_GUEST_PANICKED, RUN_STATE_PAUSED },
+{ RUN_STATE_GUEST_PANICKED, RUN_STATE_FINISH_MIGRATE },
+
 { RUN_STATE_MAX, RUN_STATE_MAX },
 };
 
@@ -1950,7 +1958,8 @@ static bool main_loop_should_exit(void)
 qemu_system_reset(VMRESET_REPORT);
 resume_all_vcpus();
 if (runstate_check(RUN_STATE_INTERNAL_ERROR) ||
-runstate_check(RUN_STATE_SHUTDOWN)) {
+runstate_check(RUN_STATE_SHUTDOWN) ||
+runstate_check(RUN_STATE_GUEST_PANICKED)) {
 bdrv_iterate(iostatus_bdrv_it, NULL);
 vm_start();
 }
-- 
1.8.0.1.240.ge8a1f5a
