Re: [Qemu-devel] [PATCH v3 1/5] target-ppc: add vector insert instructions

2016-08-18 Thread Rajalakshmi Srinivasaraghavan



On 08/16/2016 09:48 AM, David Gibson wrote:

On Thu, Aug 11, 2016 at 01:06:44PM +0530, Rajalakshmi Srinivasaraghavan wrote:

The following vector insert instructions are added from ISA 3.0.

vinsertb - Vector Insert Byte
vinserth - Vector Insert Halfword
vinsertw - Vector Insert Word
vinsertd - Vector Insert Doubleword

Signed-off-by: Rajalakshmi Srinivasaraghavan 
---
  target-ppc/helper.h |4 
  target-ppc/int_helper.c |   27 +++
  target-ppc/translate.c  |2 ++
  target-ppc/translate/vmx-impl.c |   32 
  target-ppc/translate/vmx-ops.c  |   18 +-
  5 files changed, 78 insertions(+), 5 deletions(-)

diff --git a/target-ppc/helper.h b/target-ppc/helper.h
index 93ac9e1..0923779 100644
--- a/target-ppc/helper.h
+++ b/target-ppc/helper.h
@@ -250,6 +250,10 @@ DEF_HELPER_2(vspltisw, void, avr, i32)
  DEF_HELPER_3(vspltb, void, avr, avr, i32)
  DEF_HELPER_3(vsplth, void, avr, avr, i32)
  DEF_HELPER_3(vspltw, void, avr, avr, i32)
+DEF_HELPER_3(vinsertb, void, avr, avr, i32)
+DEF_HELPER_3(vinserth, void, avr, avr, i32)
+DEF_HELPER_3(vinsertw, void, avr, avr, i32)
+DEF_HELPER_3(vinsertd, void, avr, avr, i32)
  DEF_HELPER_2(vupkhpx, void, avr, avr)
  DEF_HELPER_2(vupklpx, void, avr, avr)
  DEF_HELPER_2(vupkhsb, void, avr, avr)
diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
index 552b2e0..3f8e439 100644
--- a/target-ppc/int_helper.c
+++ b/target-ppc/int_helper.c
@@ -1792,6 +1792,33 @@ VSPLT(w, u32)
  #undef VSPLT
  #undef SPLAT_ELEMENT
  #undef _SPLAT_MASKED
+#if defined(HOST_WORDS_BIGENDIAN)
+#define VINSERT(suffix, element, index) \
+void helper_vinsert##suffix(ppc_avr_t *r, ppc_avr_t *b, uint32_t splat) \
+{   \
+ppc_avr_t result;   \
+result = *r;\
+memcpy([splat], >element[index],   \
+   sizeof(result.element[0]));  \
+*r = result;\

Using a temporary for the result means two extra full vector copies,
which seems unfortunate.  Couldn't you just use memmove() instead of
memcpy() to handle the overlapping cases?

If the registers r and b are same, using memove() can overwrite
some of the values depending on index(third arg).

+}
+#else
+#define VINSERT(suffix, element, index) \
+void helper_vinsert##suffix(ppc_avr_t *r, ppc_avr_t *b, uint32_t splat) \
+{   \
+ppc_avr_t result;   \
+result = *r;\
+uint32_t s = (ARRAY_SIZE(r->element) - index) - 1;  \

The logic with index seems a bit convoluted.  AFAICT the index is
always the least significant element of the most-significant half of
the vector b.  So for BE [8] should always be right and for
LE [8 - sizeof(r->element)].

Ack. One change in your comment for BE its u8[8 - sizeof(r->element)]
and LE its u8[8].




+uint32_t d = (16 - splat) - sizeof(r->element[0]);  \
+memcpy([d], >element[s], sizeof(result.element[0]));   \
+*r = result;\
+}
+#endif
+VINSERT(b, u8, 7)
+VINSERT(h, u16, 3)
+VINSERT(w, u32, 1)
+VINSERT(d, u64, 0)
+#undef VINSERT
  
  #define VSPLTI(suffix, element, splat_type) \

  void helper_vspltis##suffix(ppc_avr_t *r, uint32_t splat)   \
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index fc3d371..dbe952e 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -498,6 +498,8 @@ EXTRACT_HELPER(UIMM, 0, 16);
  EXTRACT_HELPER(SIMM5, 16, 5);
  /* 5 bits signed immediate value */
  EXTRACT_HELPER(UIMM5, 16, 5);
+/* 4 bits unsigned immediate value */
+EXTRACT_HELPER(UIMM4, 16, 4);
  /* Bit count */
  EXTRACT_HELPER(NB, 11, 5);
  /* Shift count */
diff --git a/target-ppc/translate/vmx-impl.c b/target-ppc/translate/vmx-impl.c
index ac78caf..f6a97ac 100644
--- a/target-ppc/translate/vmx-impl.c
+++ b/target-ppc/translate/vmx-impl.c
@@ -623,13 +623,45 @@ static void glue(gen_, name)(DisasContext *ctx)   
  \
  tcg_temp_free_ptr(rd);  \
  }
  
+#define GEN_VXFORM_UIMM_SPLAT(name, opc2, opc3, splat_max)  \

+static void glue(gen_, name)(DisasContext *ctx) \
+{   \
+TCGv_ptr rb, rd;\
+uint8_t uimm = 

Re: [Qemu-devel] [PATCH v2 0/8] nvdimm: hotplug support

2016-08-18 Thread Dan Williams
On Thu, Aug 18, 2016 at 8:46 PM, Xiao Guangrong
 wrote:
>
>
> On 08/19/2016 11:40 AM, Xiao Guangrong wrote:
>>
>>
>>
>> On 08/19/2016 02:54 AM, Vishal Verma wrote:
>>>
>>> On 08/18, Dan Williams wrote:

 [ adding Vishal who implemented the kernel side of nvdimm hotplug
 support ]

 On Thu, Aug 11, 2016 at 11:54 PM, Xiao Guangrong
  wrote:
>
> This patchset is against commit c597dc90fbcd6 (virtio-net: allow
> increasing
> rx queue siz) on pci branch of Michael's git tree and can be found at:
>   https://github.com/xiaogr/qemu.git nvdimm-hotplug-v2
>
> Changelog in v2:
>Fixed signed integer overflow pointed out by Stefan Hajnoczi
>
> This patchset enables nvdimm hotplug support, it is used as pc-dimm
> hotplug,
> for example, a new nvdimm device can be plugged as follows:
> object_add
> memory-backend-file,id=mem3,size=10G,mem-path=/home/eric/nvdimm3
> device_add nvdimm,id=nvdimm3,memdev=mem3
>
> and unplug it as follows:
> device_del nvdimm3
> object_del mem3


 Did you test this against the Linux NFIT hotplug support?  We just
 found that the Linux driver is not properly registering for ACPI0012
 event notification.  Is a notification sent on a 'device_add' event?
>>>
>>>
>>> I've just sent out a patch that should fix this:
>>> https://lists.01.org/pipermail/linux-nvdimm/2016-August/006637.html
>>>
>>
>> Interesting. I am using the kvm tree, queue branch, the top commit is
>> 8ff7b956471f:
>>   Merge tag 'kvm-s390-next-4.8-2' of
>> git://git.kernel.org/pub/scm/linux/kernel
>>
>> It works.
>
>
> It triggers 'notify' event, not 'device add'.

Ah, I missed that the notification handler gets registered via
acpi_device_install_notify_handler() so this
ACPI_DRIVER_ALL_NOTIFY_EVENTS is not needed.

Thank you for confirming.



Re: [Qemu-devel] [PATCH v2 0/8] nvdimm: hotplug support

2016-08-18 Thread Xiao Guangrong



On 08/19/2016 11:40 AM, Xiao Guangrong wrote:



On 08/19/2016 02:54 AM, Vishal Verma wrote:

On 08/18, Dan Williams wrote:

[ adding Vishal who implemented the kernel side of nvdimm hotplug support ]

On Thu, Aug 11, 2016 at 11:54 PM, Xiao Guangrong
 wrote:

This patchset is against commit c597dc90fbcd6 (virtio-net: allow increasing
rx queue siz) on pci branch of Michael's git tree and can be found at:
  https://github.com/xiaogr/qemu.git nvdimm-hotplug-v2

Changelog in v2:
   Fixed signed integer overflow pointed out by Stefan Hajnoczi

This patchset enables nvdimm hotplug support, it is used as pc-dimm hotplug,
for example, a new nvdimm device can be plugged as follows:
object_add memory-backend-file,id=mem3,size=10G,mem-path=/home/eric/nvdimm3
device_add nvdimm,id=nvdimm3,memdev=mem3

and unplug it as follows:
device_del nvdimm3
object_del mem3


Did you test this against the Linux NFIT hotplug support?  We just
found that the Linux driver is not properly registering for ACPI0012
event notification.  Is a notification sent on a 'device_add' event?


I've just sent out a patch that should fix this:
https://lists.01.org/pipermail/linux-nvdimm/2016-August/006637.html



Interesting. I am using the kvm tree, queue branch, the top commit is
8ff7b956471f:
  Merge tag 'kvm-s390-next-4.8-2' of git://git.kernel.org/pub/scm/linux/kernel

It works.


It triggers 'notify' event, not 'device add'.




Re: [Qemu-devel] [PATCH v2 0/8] nvdimm: hotplug support

2016-08-18 Thread Xiao Guangrong



On 08/19/2016 02:54 AM, Vishal Verma wrote:

On 08/18, Dan Williams wrote:

[ adding Vishal who implemented the kernel side of nvdimm hotplug support ]

On Thu, Aug 11, 2016 at 11:54 PM, Xiao Guangrong
 wrote:

This patchset is against commit c597dc90fbcd6 (virtio-net: allow increasing
rx queue siz) on pci branch of Michael's git tree and can be found at:
  https://github.com/xiaogr/qemu.git nvdimm-hotplug-v2

Changelog in v2:
   Fixed signed integer overflow pointed out by Stefan Hajnoczi

This patchset enables nvdimm hotplug support, it is used as pc-dimm hotplug,
for example, a new nvdimm device can be plugged as follows:
object_add memory-backend-file,id=mem3,size=10G,mem-path=/home/eric/nvdimm3
device_add nvdimm,id=nvdimm3,memdev=mem3

and unplug it as follows:
device_del nvdimm3
object_del mem3


Did you test this against the Linux NFIT hotplug support?  We just
found that the Linux driver is not properly registering for ACPI0012
event notification.  Is a notification sent on a 'device_add' event?


I've just sent out a patch that should fix this:
https://lists.01.org/pipermail/linux-nvdimm/2016-August/006637.html



Interesting. I am using the kvm tree, queue branch, the top commit is
8ff7b956471f:
  Merge tag 'kvm-s390-next-4.8-2' of git://git.kernel.org/pub/scm/linux/kernel

It works.




Re: [Qemu-devel] [PATCH RFC] msix_init: input params *_offset isn't the real one

2016-08-18 Thread Cao jin



On 08/18/2016 06:54 PM, Marcel Apfelbaum wrote:

On 08/10/2016 06:18 AM, Cao jin wrote:

The parameter table_offset & pba_offset is kind of confusing, they
shouldn't
include bir field.

Signed-off-by: Cao jin 
---



Hi,


According to the passed arguments, I guess all the callers of msix_init()
has the same feeling with me, but I am not quite sure about this, so,
RFC.

 hw/pci/msix.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/hw/pci/msix.c b/hw/pci/msix.c
index 0ec1cb1..3a16d83 100644
--- a/hw/pci/msix.c
+++ b/hw/pci/msix.c
@@ -264,8 +264,7 @@ int msix_init(struct PCIDevice *dev, unsigned
short nentries,
 if ((table_bar_nr == pba_bar_nr &&
  ranges_overlap(table_offset, table_size, pba_offset,
pba_size)) ||
 table_offset + table_size > memory_region_size(table_bar) ||
-pba_offset + pba_size > memory_region_size(pba_bar) ||
-(table_offset | pba_offset) & PCI_MSIX_FLAGS_BIRMASK) {
+pba_offset + pba_size > memory_region_size(pba_bar)) {
 return -EINVAL;


I think we should keep the '(table_offset | pba_offset) &
PCI_MSIX_FLAGS_BIRMASK)'
test since it is required by spec, please see: PCI Spec "6.8.2.4. Table
Offset/Table BIR for MSI-X"

  Table offset: ...The lower 3 Table BIR bits are masked off (set to
zero) by software
to form a 32-bit QWORD -aligned offset.

This function gets the offset parameters as the whole 32-BIT QWORD and
checks it does not collide
with the BIR offset.



Hi Marcel,
Thanks very much for pointing the accurate reference out.
I also checked how kernel code handle this field, it is just like 
you said.


Sorry for this noise.

Cao jin



 }

@@ -282,8 +281,8 @@ int msix_init(struct PCIDevice *dev, unsigned
short nentries,
 dev->msix_entries_nr = nentries;
 dev->msix_function_masked = true;

-pci_set_long(config + PCI_MSIX_TABLE, table_offset | table_bar_nr);
-pci_set_long(config + PCI_MSIX_PBA, pba_offset | pba_bar_nr);
+pci_set_long(config + PCI_MSIX_TABLE, (table_offset << 3) |
table_bar_nr);
+pci_set_long(config + PCI_MSIX_PBA, (pba_offset << 3) | pba_bar_nr);



Here is a similar issue. Your interpretation suggests we need to shift
left the offset
to make room for BIR, but I think current implementation looks at it
differently already
receiving the offset as a 32-bit QWORD and simply does not "look" to the
lower bits
implying them 0.

Thanks,
Marcel








Re: [Qemu-devel] [PATCH] e1000e: remove internal interrupt flag

2016-08-18 Thread Jason Wang



On 2016年08月19日 00:57, Paolo Bonzini wrote:


On 18/08/2016 16:15, Cao jin wrote:

Commit 66bf7d58 removed internal msi state flag E1000E_USE_MSI, E1000E_USE_MSIX
is not necessary too, remove it now. And interrupt flag field intr_state also
can be removed now.

CC: Dmitry Fleytman 
CC: Jason Wang 
CC: Markus Armbruster 
CC: Marcel Apfelbaum 
CC: Michael S. Tsirkin 
CC: Paolo Bonzini 
Signed-off-by: Cao jin 
---
  hw/net/e1000e.c | 8 +---
  1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/hw/net/e1000e.c b/hw/net/e1000e.c
index 82a7be1..72aad21 100644
--- a/hw/net/e1000e.c
+++ b/hw/net/e1000e.c
@@ -69,7 +69,6 @@ typedef struct E1000EState {
  uint16_t subsys_ven_used;
  uint16_t subsys_used;
  
-uint32_t intr_state;

  bool disable_vnet;
  
  E1000ECore core;

@@ -89,8 +88,6 @@ typedef struct E1000EState {
  #define E1000E_MSIX_TABLE   (0x)
  #define E1000E_MSIX_PBA (0x2000)
  
-#define E1000E_USE_MSIXBIT(0)

-
  static uint64_t
  e1000e_mmio_read(void *opaque, hwaddr addr, unsigned size)
  {
@@ -302,8 +299,6 @@ e1000e_init_msix(E1000EState *s)
  } else {
  if (!e1000e_use_msix_vectors(s, E1000E_MSIX_VEC_NUM)) {
  msix_uninit(d, >msix, >msix);
-} else {
-s->intr_state |= E1000E_USE_MSIX;
  }
  }
  }
@@ -311,7 +306,7 @@ e1000e_init_msix(E1000EState *s)
  static void
  e1000e_cleanup_msix(E1000EState *s)
  {
-if (s->intr_state & E1000E_USE_MSIX) {
+if (msix_enabled(PCI_DEVICE(s))) {
  e1000e_unuse_msix_vectors(s, E1000E_MSIX_VEC_NUM);
  msix_uninit(PCI_DEVICE(s), >msix, >msix);
  }
@@ -601,7 +596,6 @@ static const VMStateDescription e1000e_vmstate = {
  VMSTATE_MSIX(parent_obj, E1000EState),
  
  VMSTATE_UINT32(ioaddr, E1000EState),

-VMSTATE_UINT32(intr_state, E1000EState),
  VMSTATE_UINT32(core.rxbuf_min_shift, E1000EState),
  VMSTATE_UINT8(core.rx_desc_len, E1000EState),
  VMSTATE_UINT32_ARRAY(core.rxbuf_sizes, E1000EState,


If there's time to get this in 2.7, it would be good.  Jason?


Yes, will send a pull request with this today.



Reviewed-by: Paolo Bonzini 


Thanks and applied to -net.



Re: [Qemu-devel] [PATCH] slirp: fix segv when init failed

2016-08-18 Thread Jason Wang



On 2016年08月18日 21:44, Marc-André Lureau wrote:

Since commit f6c2e66ae8c8a, slirp uses an exit notifier to call
slirp_smb_cleanup. However, if init() failed, the notifier isn't added,
and removing it will fail:

==18447== Invalid write of size 8
==18447==at 0x7EF2B5: notifier_remove (notify.c:32)
==18447==by 0x48E80C: qemu_remove_exit_notifier (vl.c:2661)
==18447==by 0x6A2187: net_slirp_cleanup (slirp.c:134)
==18447==by 0x69419D: qemu_cleanup_net_client (net.c:338)
==18447==by 0x69445B: qemu_del_net_client (net.c:401)
==18447==by 0x6A2B81: net_slirp_init (slirp.c:366)
==18447==by 0x6A4241: net_init_slirp (slirp.c:865)
==18447==by 0x695C6D: net_client_init1 (net.c:1051)
==18447==by 0x695F6E: net_client_init (net.c:1108)
==18447==by 0x696DBA: net_init_netdev (net.c:1498)
==18447==by 0x7F1F99: qemu_opts_foreach (qemu-option.c:1116)
==18447==by 0x696E60: net_init_clients (net.c:1516)
==18447==  Address 0x0 is not stack'd, malloc'd or (recently) free'd

Signed-off-by: Marc-André Lureau 
---
  net/slirp.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/slirp.c b/net/slirp.c
index facc30e..b60893f 100644
--- a/net/slirp.c
+++ b/net/slirp.c
@@ -131,7 +131,9 @@ static void net_slirp_cleanup(NetClientState *nc)
  SlirpState *s = DO_UPCAST(SlirpState, nc, nc);
  
  slirp_cleanup(s->slirp);

-qemu_remove_exit_notifier(>exit_notifier);
+if (s->exit_notifier.notify) {
+qemu_remove_exit_notifier(>exit_notifier);
+}
  slirp_smb_cleanup(s);
  QTAILQ_REMOVE(_stacks, s, entry);
  }


Applied to -net for 2.7.

Thanks



[Qemu-devel] [PATCH 4/4] virtio-balloon: keep collecting stats on save/restore

2016-08-18 Thread Roman Kagan
Upon save/restore virtio-balloon stats acquisition stops.  The reason is
that the fact that the (only) virtqueue element is being used by QEMU is
not recorded anywhere on save, so upon restore it's not released to the
guest, making further progress impossible.

Saving the information about the used element would introduce unjustified
vmstate incompatibility.

So instead just make sure the element is pushed before save, leaving the
ball on the guest side.  For that, add vm state change handler to
virtio-ballon which would take care of pushing the element if there is
one.

Signed-off-by: Roman Kagan 
Cc: "Michael S. Tsirkin" 
---
 hw/virtio/virtio-balloon.c | 27 ++-
 include/hw/virtio/virtio-balloon.h |  1 +
 2 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index b56fecd..66a926a 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -88,10 +88,19 @@ static void balloon_stats_change_timer(VirtIOBalloon *s, 
int64_t secs)
 timer_mod(s->stats_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + secs * 
1000);
 }
 
+static void balloon_stats_push_elem(VirtIOBalloon *s)
+{
+VirtIODevice *vdev = VIRTIO_DEVICE(s);
+
+virtqueue_push(s->svq, s->stats_vq_elem, s->stats_vq_offset);
+virtio_notify(vdev, s->svq);
+g_free(s->stats_vq_elem);
+s->stats_vq_elem = NULL;
+}
+
 static void balloon_stats_poll_cb(void *opaque)
 {
 VirtIOBalloon *s = opaque;
-VirtIODevice *vdev = VIRTIO_DEVICE(s);
 
 if (!s->stats_vq_elem) {
 /* The guest hasn't sent the stats yet (either not enabled or we came
@@ -100,10 +109,7 @@ static void balloon_stats_poll_cb(void *opaque)
 return;
 }
 
-virtqueue_push(s->svq, s->stats_vq_elem, s->stats_vq_offset);
-virtio_notify(vdev, s->svq);
-g_free(s->stats_vq_elem);
-s->stats_vq_elem = NULL;
+balloon_stats_push_elem(s);
 }
 
 static void balloon_stats_get_all(Object *obj, Visitor *v, const char *name,
@@ -411,6 +417,15 @@ static int virtio_balloon_load_device(VirtIODevice *vdev, 
QEMUFile *f,
 return 0;
 }
 
+static void balloon_vm_state_change(void *opaque, int running, RunState state)
+{
+VirtIOBalloon *s = opaque;
+
+if (!running && s->stats_vq_elem) {
+balloon_stats_push_elem(s);
+}
+}
+
 static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
 {
 VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -433,6 +448,7 @@ static void virtio_balloon_device_realize(DeviceState *dev, 
Error **errp)
 s->dvq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
 s->svq = virtio_add_queue(vdev, 1, virtio_balloon_receive_stats);
 
+s->change = qemu_add_vm_change_state_handler(balloon_vm_state_change, s);
 reset_stats(s);
 }
 
@@ -441,6 +457,7 @@ static void virtio_balloon_device_unrealize(DeviceState 
*dev, Error **errp)
 VirtIODevice *vdev = VIRTIO_DEVICE(dev);
 VirtIOBalloon *s = VIRTIO_BALLOON(dev);
 
+qemu_del_vm_change_state_handler(s->change);
 balloon_stats_destroy_timer(s);
 qemu_remove_balloon_handler(s);
 virtio_cleanup(vdev);
diff --git a/include/hw/virtio/virtio-balloon.h 
b/include/hw/virtio/virtio-balloon.h
index 1ea13bd..d72ff7f 100644
--- a/include/hw/virtio/virtio-balloon.h
+++ b/include/hw/virtio/virtio-balloon.h
@@ -43,6 +43,7 @@ typedef struct VirtIOBalloon {
 int64_t stats_last_update;
 int64_t stats_poll_interval;
 uint32_t host_features;
+VMChangeStateEntry *change;
 } VirtIOBalloon;
 
 #endif
-- 
2.7.4




[Qemu-devel] [PATCH 0/4] virtio-balloon: assorted fixes

2016-08-18 Thread Roman Kagan
This patchset addresses a few problems discovered when analyzing aborts
of (an older version of) QEMU with backported commit
afd9096eb1882f23929f5b5c177898ed231bac66 "virtio: error out if guest
exceeds virtqueue size".  Those problems are present in master, too,
except that they don't trigger an abort and are thus not as easy to
notice.


Roman Kagan (4):
  virtio: assert on ->inuse underflow
  virtio-balloon: make stats virtqueue length 1
  virtio-balloon: don't restart stats timer in callback
  virtio-balloon: keep collecting stats on save/restore

Cc: "Michael S. Tsirkin" 

 hw/virtio/virtio-balloon.c | 49 +-
 hw/virtio/virtio.c |  3 ++-
 include/hw/virtio/virtio-balloon.h |  1 +
 3 files changed, 30 insertions(+), 23 deletions(-)

-- 
2.7.4




Re: [Qemu-devel] [PATCH v3] vfio : add aer process

2016-08-18 Thread Zhou Jie

ping

On 2016/8/15 10:53, Zhou Jie wrote:

ping

On 2016/8/2 11:57, Zhou Jie wrote:

During aer err occurs and resume do following to
protect device from being accessed.
1. Make config space read only.
2. Disable INTx/MSI Interrupt.
3. Do nothing for bar regions.

Signed-off-by: Zhou Jie 
---
v2-v3:
   1. Call init_completion() in vfio_pci_probe.
   2. Call reinit_completion() in vfio_pci_aer_err_detected.
   3. Remove unnecessary brackets.

v1-v2:
   1. Add aer process to vfio driver.

 drivers/vfio/pci/vfio_pci.c | 48
+
 drivers/vfio/pci/vfio_pci_private.h |  2 ++
 include/uapi/linux/vfio.h   |  2 ++
 3 files changed, 52 insertions(+)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index d624a52..4c246a1 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -648,6 +648,15 @@ static long vfio_pci_ioctl(void *device_data,
 struct vfio_pci_device *vdev = device_data;
 unsigned long minsz;

+if (vdev->aer_error_in_progress && (cmd == VFIO_DEVICE_SET_IRQS ||
+cmd == VFIO_DEVICE_RESET || cmd == VFIO_DEVICE_PCI_HOT_RESET)) {
+int ret;
+ret = wait_for_completion_interruptible(
+>aer_error_completion);
+if (ret)
+return ret;
+}
+
 if (cmd == VFIO_DEVICE_GET_INFO) {
 struct vfio_device_info info;

@@ -664,6 +673,10 @@ static long vfio_pci_ioctl(void *device_data,
 if (vdev->reset_works)
 info.flags |= VFIO_DEVICE_FLAGS_RESET;

+info.flags |= VFIO_DEVICE_FLAGS_AERPROCESS;
+if (vdev->aer_error_in_progress)
+info.flags |= VFIO_DEVICE_FLAGS_INAERPROCESS;
+
 info.num_regions = VFIO_PCI_NUM_REGIONS + vdev->num_regions;
 info.num_irqs = VFIO_PCI_NUM_IRQS;

@@ -1070,6 +1083,13 @@ static ssize_t vfio_pci_rw(void *device_data,
char __user *buf,

 switch (index) {
 case VFIO_PCI_CONFIG_REGION_INDEX:
+if (vdev->aer_error_in_progress && iswrite) {
+int ret;
+ret = wait_for_completion_interruptible(
+>aer_error_completion);
+if (ret)
+return ret;
+}
 return vfio_pci_config_rw(vdev, buf, count, ppos, iswrite);

 case VFIO_PCI_ROM_REGION_INDEX:
@@ -1228,6 +1248,7 @@ static int vfio_pci_probe(struct pci_dev *pdev,
const struct pci_device_id *id)
 vdev->irq_type = VFIO_PCI_NUM_IRQS;
 mutex_init(>igate);
 spin_lock_init(>irqlock);
+init_completion(>aer_error_completion);

 ret = vfio_add_group_dev(>dev, _pci_ops, vdev);
 if (ret) {
@@ -1300,6 +1321,11 @@ static pci_ers_result_t
vfio_pci_aer_err_detected(struct pci_dev *pdev,

 mutex_lock(>igate);

+vdev->aer_error_in_progress = true;
+reinit_completion(>aer_error_completion);
+vfio_pci_set_irqs_ioctl(vdev, VFIO_IRQ_SET_DATA_NONE |
+VFIO_IRQ_SET_ACTION_TRIGGER,
+vdev->irq_type, 0, 0, NULL);
 if (vdev->err_trigger)
 eventfd_signal(vdev->err_trigger, 1);

@@ -1310,8 +1336,30 @@ static pci_ers_result_t
vfio_pci_aer_err_detected(struct pci_dev *pdev,
 return PCI_ERS_RESULT_CAN_RECOVER;
 }

+static void vfio_pci_aer_resume(struct pci_dev *pdev)
+{
+struct vfio_pci_device *vdev;
+struct vfio_device *device;
+
+device = vfio_device_get_from_dev(>dev);
+if (device == NULL)
+return;
+
+vdev = vfio_device_data(device);
+if (vdev == NULL) {
+vfio_device_put(device);
+return;
+}
+
+vdev->aer_error_in_progress = false;
+complete_all(>aer_error_completion);
+
+vfio_device_put(device);
+}
+
 static const struct pci_error_handlers vfio_err_handlers = {
 .error_detected = vfio_pci_aer_err_detected,
+.resume = vfio_pci_aer_resume,
 };

 static struct pci_driver vfio_pci_driver = {
diff --git a/drivers/vfio/pci/vfio_pci_private.h
b/drivers/vfio/pci/vfio_pci_private.h
index 2128de8..7430d92 100644
--- a/drivers/vfio/pci/vfio_pci_private.h
+++ b/drivers/vfio/pci/vfio_pci_private.h
@@ -91,6 +91,8 @@ struct vfio_pci_device {
 boolhas_vga;
 boolneeds_reset;
 boolnointx;
+boolaer_error_in_progress;
+struct completionaer_error_completion;
 struct pci_saved_state*pci_saved_state;
 intrefcnt;
 struct eventfd_ctx*err_trigger;
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 255a211..59b9cf6 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -198,6 +198,8 @@ struct vfio_device_info {
 #define VFIO_DEVICE_FLAGS_PCI(1 << 1)/* vfio-pci device */
 #define VFIO_DEVICE_FLAGS_PLATFORM (1 << 2)/* vfio-platform
device */
 #define VFIO_DEVICE_FLAGS_AMBA  (1 << 3)/* vfio-amba device */
+#define VFIO_DEVICE_FLAGS_AERPROCESS  (1 << 4)/* support aer
error progress */
+#define VFIO_DEVICE_FLAGS_INAERPROCESS  (1 << 5)/* 

[Qemu-devel] [PATCH 3/4] virtio-balloon: don't restart stats timer in callback

2016-08-18 Thread Roman Kagan
There's no need to restart the stats timer in its callback.  If the
callback happens to run when there's nothing to do just do nothing and
return.

The timer is armed either in receive handler or initially when
periodic stats collection is enabled via QMP.

While at this, observe that the presence of ->stats_vq_elem is enough to
indicate there's work to do here, and drop the check for the stats
feature.

Signed-off-by: Roman Kagan 
Cc: "Michael S. Tsirkin" 
---
 hw/virtio/virtio-balloon.c | 13 -
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 0baf4b3..b56fecd 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -68,12 +68,6 @@ static inline void reset_stats(VirtIOBalloon *dev)
 for (i = 0; i < VIRTIO_BALLOON_S_NR; dev->stats[i++] = -1);
 }
 
-static bool balloon_stats_supported(const VirtIOBalloon *s)
-{
-VirtIODevice *vdev = VIRTIO_DEVICE(s);
-return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_STATS_VQ);
-}
-
 static bool balloon_stats_enabled(const VirtIOBalloon *s)
 {
 return s->stats_poll_interval > 0;
@@ -99,9 +93,10 @@ static void balloon_stats_poll_cb(void *opaque)
 VirtIOBalloon *s = opaque;
 VirtIODevice *vdev = VIRTIO_DEVICE(s);
 
-if (s->stats_vq_elem == NULL || !balloon_stats_supported(s)) {
-/* re-schedule */
-balloon_stats_change_timer(s, s->stats_poll_interval);
+if (!s->stats_vq_elem) {
+/* The guest hasn't sent the stats yet (either not enabled or we came
+ * too early), nothing to do.  Once the guest starts sending stats the
+ * timer will get armed in receive handler */
 return;
 }
 
-- 
2.7.4




Re: [Qemu-devel] [PATCH 5/7] nios2: Add periodic timer emulation

2016-08-18 Thread Marek Vasut
On 08/18/2016 11:49 AM, Dmitry Osipenko wrote:
> On 17.08.2016 23:19, Marek Vasut wrote:
>> On 08/16/2016 11:38 PM, Dmitry Osipenko wrote:
>>
>> [...]
>>
>> Well what is sane clock frequency for hardware which can have arbitrary
>> frequency configured in ?
>>
>
> You could set to the one that is used by "10M50 GHRD" patch for example.

 That doesn't sound right . I can set it to 1 (Hz) , that should make it
 slow enough to signalize that something is broken while providing valid
 number.

>>>
>>> I'm not sure about it. Explicitly showing that something is wrong would be 
>>> better.
>>>
 +static void altera_timer_realize(DeviceState *dev, Error **errp)
 +{
 +AlteraTimer *t = ALTERA_TIMER(dev);
 +SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
 +
 +assert(t->freq_hz != 0);
>>>
>>> If you would prefer to keep error'ing out, then I can suggest to add some
>>> verbose message instead of the assertion, like:
>>>
>>> if (!t->freq_hz) {
>>> error_setg(errp, "altera_timer: \"clock-frequency\" property " \
>>>  "must be provided");
>>> return;
>>> }
>>
>> That's better, thanks.
>>
>> btw breaking strings is frowned upon in linux/u-boot as it makes it
>> impossible to git grep for error message. Does the same apply to qemu?
>>
> 
> Actually, "altera_timer: " prefix isn't needed, as it should be already 
> included
> in the error message produced by by the error_setg(), so that string could be
> squeezed into one line. And, of course, it could be rephrased more concisely.

Even better, prefix dropped. Thanks

So what about patches 1..5 ? Anything I should change there ?

-- 
Best regards,
Marek Vasut



Re: [Qemu-devel] errno 13, fopen qemu trace file.

2016-08-18 Thread Nir Levy
Daniel, Thanks for your response.

But I have succeeded using simpletrace when building libvirt from source file 
into some/other/dir/install 

I am using Fedora22 which does not support lttng.
at the mather of fact I have installed lttng before and build the kernel 
modules (2.6.0) still no trace generated.
Michael Jeanson is working for supplying modules (2.8.0) for the next fedora 
(25)
fedora 23 and 24 do support systemtap but not over 22 on my kernel 4.2.8 (which 
we stick to in Asocs due to known performance degradation in later releases)

Regards.
Nir.


-Original Message-
From: Daniel P. Berrange [mailto:berra...@redhat.com] 
Sent: Thursday, August 18, 2016 4:11 PM
To: Nir Levy 
Cc: qemu-devel@nongnu.org; Stefan Hajnoczi 
Subject: Re: [Qemu-devel] errno 13, fopen qemu trace file.

On Thu, Aug 18, 2016 at 12:58:29PM +, Nir Levy wrote:
> Hello everybody,
> 
> I have a progress in tracing qemu,
> I add the thread and tag done for each kvm_ioctl, kvm_vm_ioctl, 
> kvm_vcpu_ioctl in purpose of investigating pure hypervisor activity and 
> delays on host.
> the kvm type print only for convenience.
> 
> for example:
> kvm_ioctl 3106435.230545 pid=11347 thread=11347 type=0xae03 arg=0x25 
> kvm_ioctl_done 3106435.230546 pid=11347 thread=11347 type=0xae03 
> arg=0x25 diff=1 (KVM_CHECK_EXTENSION) kvm_vcpu_ioctl 3106435.253930 
> pid=11347 thread=11354 cpu_index=0x2 type=0x4008ae9c 
> arg=0x56417e6cb4f0 kvm_vcpu_ioctl_done 3106435.253931 pid=11347 
> thread=11354 cpu_index=0x2 type=0x4008ae9c arg=0x56417e6cb4f0 diff=1 
> (KVM_X86_SETUP_MCE)
> 
> kvm_vm_ioctl 3106435.268896 pid=11347 thread=11347 type=0x4020ae46 
> arg=0x7ffed97cf9d0 kvm_vm_ioctl_done 3106435.269082 pid=11347 
> thread=11347 type=0x4020ae46 arg=0x7ffed97cf9d0 diff=186 
> (KVM_SET_USER_MEMORY_REGION)
> 
> I have notice KVM_RUN can take even seconds but that is probably low 
> priority tasks,(io workers probably) but this 186micro second on the main 
> qemu thread is suspicious and might cause application running over vm delays.
> 
> however when I have switch back to installed libvirtd 1.2.13.2 ( my 
> version has extended timeout for qemu debugging - ver 1.2.21)
> 
> 
> void st_set_trace_file_enabled(bool enable) {
> ...
> trace_fp = fopen(trace_file_name, "wb");
> 
> trace_fp returns null and errno is 13 access denied.
> In my installation I did not create cgroup,
> 
> If you have some hint for this issue it will be great.

Libvirt doesn't support use of the simpletrace backend. The security policy 
will prevent simpletrace from even creating its output file.

It is recommended to use a dynamic trace backend like dtrace/systemtap/ ltt-ust 
so that QEMU does not need to directly create trace output files itself.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|


[Qemu-devel] [PATCH 2/4] virtio-balloon: make stats virtqueue length 1

2016-08-18 Thread Roman Kagan
The protocol for virtio-balloon stats virtqueue doesn't allow more than
one element in the virtqueue.

So, instead of trying to compensate for guest misbehavior if it sends
new data before the slot has been released by the host, just define the
stats virtqueue length to 1 initially and rely on the generic virtio
code to handle overflows.

Signed-off-by: Roman Kagan 
Cc: "Michael S. Tsirkin" 
Cc: Ladi Prosek 
---
 hw/virtio/virtio-balloon.c | 9 +
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 5af429a..0baf4b3 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -262,13 +262,6 @@ static void virtio_balloon_receive_stats(VirtIODevice 
*vdev, VirtQueue *vq)
 goto out;
 }
 
-if (s->stats_vq_elem != NULL) {
-/* This should never happen if the driver follows the spec. */
-virtqueue_push(vq, s->stats_vq_elem, 0);
-virtio_notify(vdev, vq);
-g_free(s->stats_vq_elem);
-}
-
 s->stats_vq_elem = elem;
 
 /* Initialize the stats to get rid of any stale values.  This is only
@@ -443,7 +436,7 @@ static void virtio_balloon_device_realize(DeviceState *dev, 
Error **errp)
 
 s->ivq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
 s->dvq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
-s->svq = virtio_add_queue(vdev, 128, virtio_balloon_receive_stats);
+s->svq = virtio_add_queue(vdev, 1, virtio_balloon_receive_stats);
 
 reset_stats(s);
 }
-- 
2.7.4




[Qemu-devel] [PATCH 1/4] virtio: assert on ->inuse underflow

2016-08-18 Thread Roman Kagan
Make sure that ->inuse counter on virtqueue never goes negative.

This complements commit afd9096eb1882f23929f5b5c177898ed231bac66,
"virtio: error out if guest exceeds virtqueue size", which, due to
signed ->inuse comparison against unsigned ->vring.num, manifested a bug
in virtio-balloon where virtqueue_push() was called before the matching
virtqueu_pop(). [That problem will be addressed in followup patches].

Signed-off-by: Roman Kagan 
Cc: "Michael S. Tsirkin" 
Cc: Stefan Hajnoczi 
---
 hw/virtio/virtio.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 15ee3a7..7a57857 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -92,7 +92,7 @@ struct VirtQueue
 
 uint16_t queue_index;
 
-int inuse;
+unsigned int inuse;
 
 uint16_t vector;
 VirtIOHandleOutput handle_output;
@@ -290,6 +290,7 @@ void virtqueue_fill(VirtQueue *vq, const VirtQueueElement 
*elem,
 void virtqueue_flush(VirtQueue *vq, unsigned int count)
 {
 uint16_t old, new;
+assert(vq->inuse >= count);
 /* Make sure buffer is written before we update index. */
 smp_wmb();
 trace_virtqueue_flush(vq, count);
-- 
2.7.4




Re: [Qemu-devel] [PATCH v2] hw/vfio/platform: Add Qualcomm Technologies, Inc HIDMA device support

2016-08-18 Thread Alexander Graf

> On 18 Aug 2016, at 05:37, Auger Eric  wrote:
> 
> Hi Shanker,
> 
> Adding Alex in CC for (*)
> 
> On 14/08/2016 17:42, Shanker Donthineni wrote:
>> This patch introduces the Qualcomm Technologies, Inc HIDMA device and
>> allows passthrough the host HIDMA device to a guest machine using the
>> vfio-platform framework.
>> 
>> A platform device tree node is created for the guest machine which
>> contains the compat string 'qcom,hidma-1.0', mmio regions, active high
>> SPIs, and an optional property dma-coherent.
>> 
>> Signed-off-by: Vikram Sethi 
>> Signed-off-by: Shanker Donthineni 
>> ---
>> Changes sicnce v1:
>>  combined patches [v1 1/2] and [v1 2/2].
> Some general comments:
> - I preferred the previous series organization where we had the creation
> of the VFIO device first and its sysbus-fdt dynamic instantiation in a
> separate patch.
> 
> Peter requested sysbus-fdt stops growing and advised to split the fine
> into generic helpers and specific dt creation functions in separate
> files. This sounds the right moment to do it with looming new VFIO devices.
> 
> (*) Also I am now reconsidering the relevance of creating specific VFIO
> devices per compat string. At the begining of VFIO QEMU integration
> history we made that choice, advised by Alex (Graf), considering the
> QEMU VFIO device could not be generic. With a little more experience now
> we could see the specialization is currently done in the dt creation
> function (sysbus-fdt) and in the kernel reset module. So I would now
> advocate using a non abstract base VFIO device that could be
> instantiated passing its compat string as property. Creating
> hw/vfio/qcom-hidma.c and include/hw/vfio/vfio-qcom-hidma.h then would
> become useless. Alex, what is your feeling now?

Back when we set up the rule we were concerned with a few things that a generic 
sysbus devices can’t implement properly:

  * generic reset
  * power control
  * clock control

I guess the first two are mostly a matter of the in-kernel driver, not the user 
space one. Clock control hopefully isn’t much of a thing with real hardware ;).

So yes, if you can make a generic driver work, I’m not opposed.


Alex




[Qemu-devel] [PATCH] net/tap: Add qemu_name to ifup/ifdown callback script parameters

2016-08-18 Thread Anton Worshevsky
Add additional parameter to network configuration callback script calls
in launch_script(). External script can provide any exotic network
configuration for VM instance based on qemu_name as new param.

It's convenient to have only one ifup script without tricks with symlinks to it
for each VM instance on large VDS hosts.

Signed-off-by: Anton Worshevsky 
---
 net/tap.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/net/tap.c b/net/tap.c
index 6abb962..64a31fe 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -421,6 +421,9 @@ static void launch_script(const char *setup_script, const 
char *ifname,
 parg = args;
 *parg++ = (char *)setup_script;
 *parg++ = (char *)ifname;
+if (qemu_name) {
+*parg++ = (char *)qemu_name;
+}
 *parg = NULL;
 execv(setup_script, args);
 _exit(1);
-- 
1.7.4.4



Re: [Qemu-devel] Help: Does Qemu support virtio-pci for net-device and disk device?

2016-08-18 Thread Laine Stump

On 08/18/2016 08:43 AM, Kevin Zhao wrote:

Hi Laine,
Thanks :-) I also has a little questions below.

On 18 August 2016 at 01:00, Laine Stump  wrote:


On 08/17/2016 12:13 PM, Andrew Jones wrote:


On Wed, Aug 17, 2016 at 08:08:11PM +0800, Kevin Zhao wrote:


Hi all,
  Now I'm investigating net device hot plug and disk hotplug for
AArch64. For virtio , the default address is virtio-mmio. After Libvirt
1.3.5, user can explicitly specify the address-type to pci and so libvirt
will pass the virtio-pci parameters to the Qemu.
  Both my host and guest OS is Debian8, and Qemu version is 2.6.0.
Libvirt version is 1.3.5.
  For net-device, I change the address-type to pci, and libvirt pass
the
command below:
  -device
virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:0d:25:
25,bus=pci.2,addr=0x1

  After booting, the eth0 device disappear(eth0 occur when the
address
is virtio-mmio),
but I can find another net-device enp2s1, also it can't work for dhcp:
Running lspci: 02:01.0 Ethernet controller: Red Hat, Inc Virtio network
device
I'm not sure whether it worked.

  For disk device,* when I change the address-type to pci, the whole
qemu command is :*
https://paste.fedoraproject.org/409553/,  but the VM can not boot
successfully. Does Qemu not support device disk of virtio-pci in AArch64
just as it in X86_64?
  Thanks~Since I am not very familiar with Qemu, really looking
forward
to your response.

Best Regards,
Kevin Zhao


libvirt 1.3.5 is a bit old. Later versions no longer unconditionally add
the i82801b11 bridge, which was necessary to use PCI devices with the PCIe
host bridge mach-virt has. IMO, libvirt and qemu still have a long way to
go in order to configure a base/standard mach-virt PCIe machine.



Well, you can do it now, but you have to manually assign the PCI addresses
of devices (and if you want hotplug you need to live with Intel/TI-specific
PCIe controllers).


OK. It seems that Qemu will drop PCI for machine-virt and turning to PCIE
in the future.


This isn't a qemu issue, but a libvirt issue.


Do I need to do more for  Intel/TI-specific PCIe controllers?




 what do I
need to add in the guest XML or more?


As soon as my first set of patches is pushed, what you'll need to do is 
to add:




for each virtio device.

As soon as I've figured out a useful algorithm for adding the pcie 
controllers automatically, you won't need to do anything at all. (Keep 
in mind that any existing configurations will maintain their original 
config. If you want an existing configuration to be switched to using 
the new recommended (pcie) bus topology you will need to replace every 
instance of:




with a plain:



This will force libvirt to reassign PCI addresses for the devices.







1) If we want to support both PCIe devices and PCI, then things are messy.
Currently we propose dropping PCI support. mach-virt pretty much
exclusively uses virtio, which can be set to PCIe mode (virtio-1.0)



I have a libvirt patch just about ACKed for pushing upstream that will
automatically assign virtio-pci devices to a PCIe slot (if the qemu binary
supports virtio-1.0):

https://www.redhat.com/archives/libvir-list/2016-August/msg00852.html



What's the minimum version of  Qemu that support virito-1.0? Does Qemu 2.6
works?


qemu as old as 2.4.0 has the "--disable-legacy" option for virtio 
devices. I don't know if it was actually working properly back that far.




Also as  I see your patch for automatically assign virtio-pci to PCIE
slot,after it merged  I think thing will go much more easier.
Now I will manually add the slots and bus to pcie. Because I am not
familiar with it,  if it convenient, could you give me an available xml
file which PCIE disk and PCIE
net device can work for machine virt ?


I think Andrea sent an example.




Re: [Qemu-devel] Help: Does Qemu support virtio-pci for net-device and disk device?

2016-08-18 Thread Laine Stump

On 08/18/2016 08:10 AM, Marcel Apfelbaum wrote:

On 08/17/2016 08:00 PM, Laine Stump wrote:




 What I'm not sure about is whether we should always auto-add an extra
pcie-*-root to be sure a device can be hotplugged, or if we should admit
that 1

available slot isn't good enough for all situations, so we should
instead just leave it up to the user/management to manually add extra
ports if they think they'll want to hotplug something later.


Why not? Leaving 1 or 2 PCIe ports should be enough. On each port you
can hotplug a switch with several downstream ports. You can continue
nesting the switches up to depth of 6-7 I think.


When did qemu start supporting hotplug of pcie switch ports? My 
understanding is that in real hardware the only way this is supported is 
to plug in the entire "upstream+downstream+downstream+..." set as a 
single unit, since there is no mechanism for guest kernel to notify the 
upstream port that a new downstream port has been attached to it (or 
something like that; I'm vague on the details). From the other end, qemu 
can only hotplug a single PCI device at a time, so by the time we get to 
the point of plugging in the downstream ports, the upstream port is 
already cemented in place by the guest kernel.


I think that would be a really desirable feature though. Maybe qemu 
could queue up any downstream-ports which are pointing to an 
as-yet-nonexistent upstream-port id, then when the upstream-port with 
the proper id is finally attached, it could send the right magic to the 
guest (similar to the way it allows hotplugging all non-0 functions 
first, then takes action when function 0 is hotplugged).


If that was available, then yes, it would make perfect sense for libvirt 
to simply always make sure at least one empty pcie-*-port was available. 
If you have plans of doing something to support hotplugging a 
pcie-switch-* collection, then that's what I'll do.





Re: [Qemu-devel] Help: Does Qemu support virtio-pci for net-device and disk device?

2016-08-18 Thread Laine Stump

On 08/18/2016 03:41 AM, Andrew Jones wrote:

On Wed, Aug 17, 2016 at 01:00:05PM -0400, Laine Stump wrote:

On 08/17/2016 12:13 PM, Andrew Jones wrote:

On Wed, Aug 17, 2016 at 08:08:11PM +0800, Kevin Zhao wrote:

Hi all,
  Now I'm investigating net device hot plug and disk hotplug for
AArch64. For virtio , the default address is virtio-mmio. After Libvirt
1.3.5, user can explicitly specify the address-type to pci and so libvirt
will pass the virtio-pci parameters to the Qemu.
  Both my host and guest OS is Debian8, and Qemu version is 2.6.0.
Libvirt version is 1.3.5.
  For net-device, I change the address-type to pci, and libvirt pass the
command below:
  -device
virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:0d:25:25,bus=pci.2,addr=0x1

  After booting, the eth0 device disappear(eth0 occur when the address
is virtio-mmio),
but I can find another net-device enp2s1, also it can't work for dhcp:
Running lspci: 02:01.0 Ethernet controller: Red Hat, Inc Virtio network
device
I'm not sure whether it worked.

  For disk device,* when I change the address-type to pci, the whole
qemu command is :*
https://paste.fedoraproject.org/409553/,  but the VM can not boot
successfully. Does Qemu not support device disk of virtio-pci in AArch64
just as it in X86_64?
  Thanks~Since I am not very familiar with Qemu, really looking forward
to your response.

Best Regards,
Kevin Zhao

libvirt 1.3.5 is a bit old. Later versions no longer unconditionally add
the i82801b11 bridge, which was necessary to use PCI devices with the PCIe
host bridge mach-virt has. IMO, libvirt and qemu still have a long way to
go in order to configure a base/standard mach-virt PCIe machine.


Well, you can do it now, but you have to manually assign the PCI addresses
of devices (and if you want hotplug you need to live with Intel/TI-specific
PCIe controllers).




1) If we want to support both PCIe devices and PCI, then things are messy.
Currently we propose dropping PCI support. mach-virt pretty much
exclusively uses virtio, which can be set to PCIe mode (virtio-1.0)


I have a libvirt patch just about ACKed for pushing upstream that will
automatically assign virtio-pci devices to a PCIe slot (if the qemu binary
supports virtio-1.0):

https://www.redhat.com/archives/libvir-list/2016-August/msg00852.html

Separate patches do the same for the e1000e emulated network device (which
you probably don't care about) and the nec-usb-xhci (USB3) controller (more
useful):

https://www.redhat.com/archives/libvir-list/2016-August/msg00732.html



Thanks for the update Laine. This sounds great to me. With those patches
we can switch from virtio-mmio to virtio-pci easily, even if we're still
missing hotplug a bit longer. What limit do we have for the number of
devices, when we don't have any switches? I think I experimented once and
found it to be 7.


Theoretically you should be able to put something in each slot of 
pcie-root, and there are 31 slots (but slot 0x1f is always used by 
integrated devices, slot 1 is usually used by video, and slot 0x1D is 
usually used by a USB controller). Anyway, you should be able to get a 
lot more than 7 devices (one potential problem is that if you add a PCI 
controller that has a device plugged into it which requests IO port 
space, that will get chewed up very quickly. That's not an issue if you 
are connecting endpoints directly to pcie-root though.






Once these are in place, the only type of device of any consequence that I
can see still having no PCIe alternative is audio (even though only the
virgl video device is PCIe, libvirt has always assigned the primary video to
slot 1 on pcie-root anyway (although you shouldn't put a legacy PCI device
on a pcie-root-port or pcie-switch-downstream-port, it is acceptable to plug
it directly into pcie-root (as long as you know you won't need to hotplug
it).


2) root complex ports, switches (upstream/downstream ports) are currently
based on Intel parts. Marcel is thinking about creating generic models.


I say this every time it comes up, so just to be consistent: +1 :-)


3) libvirt needs to learn how to plug everything together, in proper PCIe
fashion, leaving holes for hotplug.


See above about virtio, although that doesn't cover the whole story. The
other part (which I'm working on right now) is that libvirt needs to
automatically add pcie-root-port, pcie-switch-upstream-port, and
pcie-switch-downstream-port devices as necessary. With the patches I
mentioned above, you still have to manually add enough pcie-*-port
controllers to the config, and then libvirt will plug the PCIe devices into
the right place. This is simple enough to do, but it does require
intervention.


OK, so we want this to support hotplug and eventually chain switches,
bumping our device limit up higher and higher. To what? I'm not sure,
I guess we're still limited by address space.


As long as the endpoint devices don't require IO port space, the 

Re: [Qemu-devel] [PATCH] e1000e: remove internal interrupt flag

2016-08-18 Thread Dmitry Fleytman


> On 18 Aug 2016, at 19:41 PM, Markus Armbruster  wrote:
> 
> Dmitry Fleytman  writes:
> 
>>> On 18 Aug 2016, at 17:15, Cao jin  wrote:
>>> 
>>> Commit 66bf7d58 removed internal msi state flag E1000E_USE_MSI, 
>>> E1000E_USE_MSIX
>>> is not necessary too, remove it now. And interrupt flag field intr_state 
>>> also
>>> can be removed now.
>>> 
>>> CC: Dmitry Fleytman 
>>> CC: Jason Wang 
>>> CC: Markus Armbruster 
>>> CC: Marcel Apfelbaum 
>>> CC: Michael S. Tsirkin 
>>> CC: Paolo Bonzini 
>>> Signed-off-by: Cao jin 
>>> ---
>>> hw/net/e1000e.c | 8 +---
>>> 1 file changed, 1 insertion(+), 7 deletions(-)
>>> 
>>> diff --git a/hw/net/e1000e.c b/hw/net/e1000e.c
>>> index 82a7be1..72aad21 100644
>>> --- a/hw/net/e1000e.c
>>> +++ b/hw/net/e1000e.c
>>> @@ -69,7 +69,6 @@ typedef struct E1000EState {
>>>uint16_t subsys_ven_used;
>>>uint16_t subsys_used;
>>> 
>>> -uint32_t intr_state;
>>>bool disable_vnet;
>>> 
>>>E1000ECore core;
>>> @@ -89,8 +88,6 @@ typedef struct E1000EState {
>>> #define E1000E_MSIX_TABLE   (0x)
>>> #define E1000E_MSIX_PBA (0x2000)
>>> 
>>> -#define E1000E_USE_MSIXBIT(0)
>>> -
>>> static uint64_t
>>> e1000e_mmio_read(void *opaque, hwaddr addr, unsigned size)
>>> {
>>> @@ -302,8 +299,6 @@ e1000e_init_msix(E1000EState *s)
>>>} else {
>>>if (!e1000e_use_msix_vectors(s, E1000E_MSIX_VEC_NUM)) {
>>>msix_uninit(d, >msix, >msix);
>>> -} else {
>>> -s->intr_state |= E1000E_USE_MSIX;
>> 
>> So you are removing error handling here. But what if e1000e_use_msix_vectors 
>> fails?
> 
> Before the patch, E1000E_USE_MSIX is set exactly when MSI-X was enabled
> successfully.  It's only use is in MSI-X cleanup (next hunk): cleanup is
> skipped when the flag isn't set.
> 
> The flag is pointless, because we can just as well test whether MSI-X is
> enabled directly.  That's what the patch does.

I see.

Acked-by: Dmitry Fleytman 

> 
>> 
>>>}
>>>}
>>> }
>>> @@ -311,7 +306,7 @@ e1000e_init_msix(E1000EState *s)
>>> static void
>>> e1000e_cleanup_msix(E1000EState *s)
>>> {
>>> -if (s->intr_state & E1000E_USE_MSIX) {
>>> +if (msix_enabled(PCI_DEVICE(s))) {
>>>e1000e_unuse_msix_vectors(s, E1000E_MSIX_VEC_NUM);
>>>msix_uninit(PCI_DEVICE(s), >msix, >msix);
>>>}
>>> @@ -601,7 +596,6 @@ static const VMStateDescription e1000e_vmstate = {
>>>VMSTATE_MSIX(parent_obj, E1000EState),
>>> 
>>>VMSTATE_UINT32(ioaddr, E1000EState),
>>> -   VMSTATE_UINT32(intr_state, E1000EState),
> 
> We can still do this, because as Paolo pointed out, no released version
> of QEMU had this device.
> 
>>>VMSTATE_UINT32(core.rxbuf_min_shift, E1000EState),
>>>VMSTATE_UINT8(core.rx_desc_len, E1000EState),
>>>VMSTATE_UINT32_ARRAY(core.rxbuf_sizes, E1000EState,
>>> -- 
>>> 2.1.0
> 
> Reviewed-by: Markus Armbruster 




[Qemu-devel] [PATCH 3/3] docker: debian-bootstrap.pre: print helpful message if DEB_ARCH/DEB_TYPE unset

2016-08-18 Thread Sascha Silbe
The debian-bootstrap image doesn't choose a default architecture and
distribution version, instead the user has to set both DEB_ARCH and
DEB_TYPE in the environment. Print a reasonably helpful message if
either of them isn't set instead of complaining about "qemu-" being
missing or erroring out because we cannot cd to the mirror URL.

Signed-off-by: Sascha Silbe 
---

I haven't figured out a good way to warn about qemu-user-* being
missing because EXECUTABLE isn't set. debian-bootstrap.pre runs before
docker.py copies the executable so I cannot check in
debian-bootstrap.pre whether the binfmt interpreter exists. The
EXECUTABLE environment variable needs to be set only when run via
make, so checking it in debian-bootstrap.pre is no good either. And an
additional docker-image-debian-bootstrap rule in the Makefile that
checks if EXECUTABLE is set would override the regular rule, not
enhance it.

 tests/docker/dockerfiles/debian-bootstrap.pre | 13 +
 1 file changed, 13 insertions(+)

diff --git a/tests/docker/dockerfiles/debian-bootstrap.pre 
b/tests/docker/dockerfiles/debian-bootstrap.pre
index 5d9c8d5..2ae363f 100755
--- a/tests/docker/dockerfiles/debian-bootstrap.pre
+++ b/tests/docker/dockerfiles/debian-bootstrap.pre
@@ -15,6 +15,19 @@ exit_and_skip()
 if [ -z $FAKEROOT ]; then
 echo "Please install fakeroot to enable bootstraping"
 exit_and_skip
+
+fi
+
+if [ -z "${DEB_ARCH}" ]; then
+echo "Please set DEB_ARCH to choose an architecture (e.g. armhf)"
+exit_and_skip
+
+fi
+
+if [ -z "${DEB_TYPE}" ]; then
+echo "Please set DEB_TYPE to a Debian archive name (e.g. testing)"
+exit_and_skip
+
 fi
 
 # We check in order for
-- 
1.9.1




[Qemu-devel] [PATCH 2/3] docker: avoid dependency on 'realpath' package

2016-08-18 Thread Sascha Silbe
The 'realpath' executable is shipped in a separate package that isn't
installed by default on some distros.

We already use 'readlink -e' (provided by GNU coreutils) in some other
part of the code, so let's settle for that instead.

Signed-off-by: Sascha Silbe 
---
Too bad there isn't a POSIX equivalent of this.

 tests/docker/Makefile.include | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/docker/Makefile.include b/tests/docker/Makefile.include
index 4f4707d..1b20db0 100644
--- a/tests/docker/Makefile.include
+++ b/tests/docker/Makefile.include
@@ -116,7 +116,7 @@ docker-run-%: docker-qemu-src
-e EXTRA_CONFIGURE_OPTS=$(EXTRA_CONFIGURE_OPTS) 
\
-e V=$V -e J=$J -e DEBUG=$(DEBUG)\
-e CCACHE_DIR=/var/tmp/ccache \
-   -v $$(realpath 
$(DOCKER_SRC_COPY)):/var/tmp/qemu:z$(COMMA)ro \
+   -v $$(readlink -e 
$(DOCKER_SRC_COPY)):/var/tmp/qemu:z$(COMMA)ro \
-v $(DOCKER_CCACHE_DIR):/var/tmp/ccache:z \
qemu:$(IMAGE) \
/var/tmp/qemu/run \
-- 
1.9.1




[Qemu-devel] [PATCH 0/3] docker tests fixes

2016-08-18 Thread Sascha Silbe
A couple of fixes for issues encountered while trying out the new
docker test support. The debian-bootstrap image still doesn't build
for me, but that's a problem for another day.

Thanks for the docker test support, BTW. The centos6 test came in
rather handy today for testing the glib < 2.30 compatibility code.

Sascha Silbe (3):
  docker.py: don't hang on large docker output
  docker: avoid dependency on 'realpath' package
  docker: debian-bootstrap.pre: print helpful message if
DEB_ARCH/DEB_TYPE unset

 tests/docker/Makefile.include |  2 +-
 tests/docker/docker.py|  9 ++---
 tests/docker/dockerfiles/debian-bootstrap.pre | 13 +
 3 files changed, 20 insertions(+), 4 deletions(-)

-- 
1.9.1




[Qemu-devel] [PATCH 1/3] docker.py: don't hang on large docker output

2016-08-18 Thread Sascha Silbe
Unlike Popen.communicate(), subprocess.call() doesn't read from the
stdout file descriptor. If the child process produces more output than
fits into the pipe buffer, it will block indefinitely.

If we don't intend to consume the output, just send it straight to
/dev/null to avoid this issue.

Signed-off-by: Sascha Silbe 
---
This fixes a hang for me when building the Ubuntu docker image (empty
docker image cache).

 tests/docker/docker.py | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/tests/docker/docker.py b/tests/docker/docker.py
index 222a105..efb2bf4 100755
--- a/tests/docker/docker.py
+++ b/tests/docker/docker.py
@@ -25,6 +25,10 @@ from tarfile import TarFile, TarInfo
 from StringIO import StringIO
 from shutil import copy, rmtree
 
+
+DEVNULL = open(os.devnull, 'wb')
+
+
 def _text_checksum(text):
 """Calculate a digest string unique to the text content"""
 return hashlib.sha1(text).hexdigest()
@@ -34,8 +38,7 @@ def _guess_docker_command():
 commands = [["docker"], ["sudo", "-n", "docker"]]
 for cmd in commands:
 if subprocess.call(cmd + ["images"],
-   stdout=subprocess.PIPE,
-   stderr=subprocess.PIPE) == 0:
+   stdout=DEVNULL, stderr=DEVNULL) == 0:
 return cmd
 commands_txt = "\n".join(["  " + " ".join(x) for x in commands])
 raise Exception("Cannot find working docker command. Tried:\n%s" % \
@@ -98,7 +101,7 @@ class Docker(object):
 
 def _do(self, cmd, quiet=True, infile=None, **kwargs):
 if quiet:
-kwargs["stdout"] = subprocess.PIPE
+kwargs["stdout"] = DEVNULL
 if infile:
 kwargs["stdin"] = infile
 return subprocess.call(self._command + cmd, **kwargs)
-- 
1.9.1




Re: [Qemu-devel] [PATCH v2 0/8] nvdimm: hotplug support

2016-08-18 Thread Vishal Verma
On 08/18, Dan Williams wrote:
> [ adding Vishal who implemented the kernel side of nvdimm hotplug support ]
> 
> On Thu, Aug 11, 2016 at 11:54 PM, Xiao Guangrong
>  wrote:
> > This patchset is against commit c597dc90fbcd6 (virtio-net: allow increasing
> > rx queue siz) on pci branch of Michael's git tree and can be found at:
> >   https://github.com/xiaogr/qemu.git nvdimm-hotplug-v2
> >
> > Changelog in v2:
> >Fixed signed integer overflow pointed out by Stefan Hajnoczi
> >
> > This patchset enables nvdimm hotplug support, it is used as pc-dimm hotplug,
> > for example, a new nvdimm device can be plugged as follows:
> > object_add memory-backend-file,id=mem3,size=10G,mem-path=/home/eric/nvdimm3
> > device_add nvdimm,id=nvdimm3,memdev=mem3
> >
> > and unplug it as follows:
> > device_del nvdimm3
> > object_del mem3
> 
> Did you test this against the Linux NFIT hotplug support?  We just
> found that the Linux driver is not properly registering for ACPI0012
> event notification.  Is a notification sent on a 'device_add' event?

I've just sent out a patch that should fix this:
https://lists.01.org/pipermail/linux-nvdimm/2016-August/006637.html



Re: [Qemu-devel] hw/arm/virt: vmstate-static-checker.py results

2016-08-18 Thread mar.krzeminski

W dniu 18.08.2016 o 21:05, Peter Maydell pisze:


On 18 August 2016 at 20:04, Dr. David Alan Gilbert  wrote:

Hmm, except there are two separate things with the name "xilinx_spi";
  vmstate_xilinx_spi in hw/ssi/xilinx_spi.c
which is the state for the "xlnx.xps-spi" (aka TYPE_XILINX_SPI) object.

and for added confusion:
   vmstate_m25p80 in hw/block/m25p80.c
which is the state for the "m25p80-generic" (aka TYPE_M25P80) object.
  also calls itself "xilinx_spi".

These went in a pair of Peter Crosthwaite commits at about the same time 4.5 
years
ago; I'm guessing it was just a copy-paste.

I think my preference would be to update the name for the m25p80 so it's
not got the clash; but it seems m25p80 contains definitions of about a zillion
flash devices all derived from the m25p80, so I think I'd have to try one of
them to see if the xilinx_spi name finds it's way onto the migration stream;
I suspect it doesn't.

Aha. Yeah, we should fix that. (I have a feeling the m25p80 devices are
all only for boards where we don't care about migration-compat, but I'm
not completely certain.)

I had a patch to change vmstate name in m25p80,
but I do not know why I have not send it.
I also believe that there is no board that could use migration of m25p80
devs (yet?).

Regards,
Marcin



thanks
-- PMM







Re: [Qemu-devel] hw/arm/virt: vmstate-static-checker.py results

2016-08-18 Thread Peter Maydell
On 18 August 2016 at 20:04, Dr. David Alan Gilbert  wrote:
> Hmm, except there are two separate things with the name "xilinx_spi";
>  vmstate_xilinx_spi in hw/ssi/xilinx_spi.c
> which is the state for the "xlnx.xps-spi" (aka TYPE_XILINX_SPI) object.
>
> and for added confusion:
>   vmstate_m25p80 in hw/block/m25p80.c
> which is the state for the "m25p80-generic" (aka TYPE_M25P80) object.
>  also calls itself "xilinx_spi".
>
> These went in a pair of Peter Crosthwaite commits at about the same time 4.5 
> years
> ago; I'm guessing it was just a copy-paste.
>
> I think my preference would be to update the name for the m25p80 so it's
> not got the clash; but it seems m25p80 contains definitions of about a zillion
> flash devices all derived from the m25p80, so I think I'd have to try one of
> them to see if the xilinx_spi name finds it's way onto the migration stream;
> I suspect it doesn't.

Aha. Yeah, we should fix that. (I have a feeling the m25p80 devices are
all only for boards where we don't care about migration-compat, but I'm
not completely certain.)

thanks
-- PMM



Re: [Qemu-devel] hw/arm/virt: vmstate-static-checker.py results

2016-08-18 Thread Dr. David Alan Gilbert
* Peter Maydell (peter.mayd...@linaro.org) wrote:
> On 18 August 2016 at 15:00, Andrew Jones  wrote:
> > We've recently started versioning mach-virt, v2.6 was the first versioned
> > release. As an effort to try and make sure we're doing things right, I
> > tried the vmstate-static-checker.py script. I compared a 2.6 machine
> > from a QEMU built from the v2.6.0 tag with a 2.6 machine from a QEMU
> > built from today's latest pull (5844365fe8). I see lots of errors. I have
> > no experience in this area, so I can't even state whether they're truly
> > a concern or not. I can say a few things;
> >
> >  1) Most of the errors look like the same problem. Something is wrong
> > with xilinx_spi state, which shows up everywhere. Here's an example
> >
> > Section "en25q64", Description "xilinx_spi": expected field 
> > "nonvolatile_cfg", got "cur_addr"; skipping rest
> 
> Well, something here is weird, because en25q64 and nonvolatile_cfg
> aren't part of xilinx_spi at all, they're in hw/block/m25p80.c.

Hmm, except there are two separate things with the name "xilinx_spi";
 vmstate_xilinx_spi in hw/ssi/xilinx_spi.c
which is the state for the "xlnx.xps-spi" (aka TYPE_XILINX_SPI) object.

and for added confusion:
  vmstate_m25p80 in hw/block/m25p80.c
which is the state for the "m25p80-generic" (aka TYPE_M25P80) object.
 also calls itself "xilinx_spi".

These went in a pair of Peter Crosthwaite commits at about the same time 4.5 
years
ago; I'm guessing it was just a copy-paste.

I think my preference would be to update the name for the m25p80 so it's
not got the clash; but it seems m25p80 contains definitions of about a zillion
flash devices all derived from the m25p80, so I think I'd have to try one of
them to see if the xilinx_spi name finds it's way onto the migration stream;
I suspect it doesn't.

Dave

> However we don't care about migration compatibility in the Xilinx
> boards at all, so the simple fix is just not to try to test them.
> Similarly, aspeed and imx are boards where we're not trying to
> preserve migration compat.
> 
> >  2) Several of the remaining problems are also present on a check of the
> > x86_64 pc-i440fx-2.6 machine type. To be precise
> >
> > Section "am53c974", Description "esp": expected field "cmdlen", got 
> > "cmdbuf"; skipping rest
> > Section "dc390", Description "esp": expected field "cmdlen", got "cmdbuf"; 
> > skipping rest
> > Section "e1000-82544gc", Description "e1000": expected field "tx.ipcss", 
> > got "tx.props.ipcss"; skipping rest
> > Section "e1000-82545em", Description "e1000": expected field "tx.ipcss", 
> > got "tx.props.ipcss"; skipping rest
> > Section "e1000", Description "e1000": expected field "tx.ipcss", got 
> > "tx.props.ipcss"; skipping rest
> > Section "esp", Description "esp": expected field "cmdlen", got "cmdbuf"; 
> > skipping rest
> > Section "rtl8139", Description "rtl8139": expected field "tally_counters", 
> > got "tally_counters.TxOk"; skipping rest
> 
> Looking at just the e1000 for an example, this is a false positive
> in your checker. In commit 093454e2 the struct we're putting the
> ipcss/ipcso/etc fields was moved, so:
> 
> -VMSTATE_UINT8(tx.ipcss, E1000State),
> -VMSTATE_UINT8(tx.ipcso, E1000State),
> -VMSTATE_UINT16(tx.ipcse, E1000State),
> -VMSTATE_UINT8(tx.tucss, E1000State),
> -VMSTATE_UINT8(tx.tucso, E1000State),
> -VMSTATE_UINT16(tx.tucse, E1000State),
> -VMSTATE_UINT32(tx.paylen, E1000State),
> -VMSTATE_UINT8(tx.hdr_len, E1000State),
> -VMSTATE_UINT16(tx.mss, E1000State),
> +VMSTATE_UINT8(tx.props.ipcss, E1000State),
> +VMSTATE_UINT8(tx.props.ipcso, E1000State),
> +VMSTATE_UINT16(tx.props.ipcse, E1000State),
> +VMSTATE_UINT8(tx.props.tucss, E1000State),
> +VMSTATE_UINT8(tx.props.tucso, E1000State),
> +VMSTATE_UINT16(tx.props.tucse, E1000State),
> +VMSTATE_UINT32(tx.props.paylen, E1000State),
> +VMSTATE_UINT8(tx.props.hdr_len, E1000State),
> +VMSTATE_UINT16(tx.props.mss, E1000State),
> 
> but the on-the-wire format doesn't include the names of the C struct
> fields so this isn't a migration break.
> 
> > x86 only has three additional messages, which look harmless to me
> >
> > Section "apic-common" does not exist in dest
> > Section "apic" does not exist in dest
> > Section "kvm-apic" does not exist in dest
> >
> >  3) I analyzed one error I saw, and see it should be fine, as the device
> > simply went from unmigratable to migratable (for TCG anyway)
> >
> > Section "arm-gicv3-common" Section "arm-gicv3-common" Description 
> > "arm_gicv3": minimum version error: 0 < 1
> 
> Yep, that should be fine.
> 
> thanks
> -- PMM
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



[Qemu-devel] [PATCH v2 1/2] glib: add compatibility implementation for g_dir_make_tmp()

2016-08-18 Thread Sascha Silbe
We're going to make use of g_dir_make_tmp() in test-logging. Provide a
compatibility implementation of it for glib < 2.30.

May behave differently in some edge cases (e.g. pattern only at the
end of the template, the file name is not part of the error message),
but good enough in practice.

Signed-off-by: Sascha Silbe 
---
v1→v2:
  - new patch

 include/glib-compat.h | 21 +
 1 file changed, 21 insertions(+)

diff --git a/include/glib-compat.h b/include/glib-compat.h
index 01aa7b3..de64ac2 100644
--- a/include/glib-compat.h
+++ b/include/glib-compat.h
@@ -48,6 +48,27 @@ static inline gint64 qemu_g_get_monotonic_time(void)
 gint g_poll_fixed(GPollFD *fds, guint nfds, gint timeout);
 #endif
 
+#if !GLIB_CHECK_VERSION(2, 30, 0)
+/* Not a 100% compatible implementation, but good enough for most
+ * cases. Placeholders are only supported at the end of the
+ * template. */
+static inline gchar *qemu_g_dir_make_tmp(gchar const *tmpl, GError **error)
+{
+gchar const *template = tmpl ? tmpl : ".XX";
+gchar *path = g_build_filename(g_get_tmp_dir(), template, NULL);
+
+if (mkdtemp(path) != NULL) {
+return path;
+}
+/* Error occurred, clean up. */
+g_set_error(error, G_FILE_ERROR, g_file_error_from_errno(errno),
+"mkdtemp() failed");
+g_free(path);
+return NULL;
+}
+#define g_dir_make_tmp(tmpl, error) qemu_g_dir_make_tmp(tmpl, error)
+#endif /* glib 2.30 */
+
 #if !GLIB_CHECK_VERSION(2, 31, 0)
 /* before glib-2.31, GMutex and GCond was dynamic-only (there was a separate
  * GStaticMutex, but it didn't work with condition variables).
-- 
1.9.1




[Qemu-devel] [PATCH v2 2/2] test-logging: don't hard-code paths in /tmp

2016-08-18 Thread Sascha Silbe
Since f6880b7f [qemu-log: support simple pid substitution for logs],
test-logging creates files with hard-coded names in /tmp. In the best
case, this prevents multiple developers from running "make check" on
the same machine. In the worst case, it allows for symlink attacks,
enabling an attacker to overwrite files that are writable to the
developer running "make check".

Instead of hard-coding the paths, create a temporary directory using
g_dir_make_tmp() and clean it up afterwards.

Fixes: f6880b7f ("qemu-log: support simple pid substitution for logs")
Signed-off-by: Sascha Silbe 
---
v1→v2:
  - Factor out g_build_filename() + qemu_set_log_filename() + g_free()
into helper set_log_path_tmp().
  - Replace rmtree() spawning "rm -rf" with rmdir_full() using glib
directory handling (non-recursive). Private to test-logging for
now.

 tests/test-logging.c | 48 +---
 1 file changed, 41 insertions(+), 7 deletions(-)

diff --git a/tests/test-logging.c b/tests/test-logging.c
index cdf13c6..a12585f 100644
--- a/tests/test-logging.c
+++ b/tests/test-logging.c
@@ -25,6 +25,7 @@
  */
 
 #include "qemu/osdep.h"
+#include 
 
 #include "qemu-common.h"
 #include "qapi/error.h"
@@ -86,24 +87,57 @@ static void test_parse_range(void)
 error_free_or_abort();
 }
 
-static void test_parse_path(void)
+static void set_log_path_tmp(char const *dir, char const *tpl, Error **errp)
 {
+gchar *file_path = g_build_filename(dir, tpl, NULL);
+
+qemu_set_log_filename(file_path, errp);
+g_free(file_path);
+}
+
+static void test_parse_path(gconstpointer data)
+{
+gchar const *tmp_path = data;
 Error *err = NULL;
 
-qemu_set_log_filename("/tmp/qemu.log", _abort);
-qemu_set_log_filename("/tmp/qemu-%d.log", _abort);
-qemu_set_log_filename("/tmp/qemu.log.%d", _abort);
+set_log_path_tmp(tmp_path, "qemu.log", _abort);
+set_log_path_tmp(tmp_path, "qemu-%d.log", _abort);
+set_log_path_tmp(tmp_path, "qemu.log.%d", _abort);
 
-qemu_set_log_filename("/tmp/qemu-%d%d.log", );
+set_log_path_tmp(tmp_path, "qemu-%d%d.log", );
 error_free_or_abort();
 }
 
+/* Remove a directory and all its entries (non-recursive). */
+static void rmdir_full(gchar const *root)
+{
+GDir *root_gdir = g_dir_open(root, 0, NULL);
+gchar const *entry_name;
+
+g_assert_nonnull(root_gdir);
+while ((entry_name = g_dir_read_name(root_gdir)) != NULL) {
+gchar *entry_path = g_build_filename(root, entry_name, NULL);
+g_assert(g_remove(entry_path) == 0);
+g_free(entry_path);
+}
+g_dir_close(root_gdir);
+g_assert(g_rmdir(root) == 0);
+}
+
 int main(int argc, char **argv)
 {
+gchar *tmp_path = g_dir_make_tmp("qemu-test-logging.XX", NULL);
+int rc;
+
 g_test_init(, , NULL);
+g_assert_nonnull(tmp_path);
 
 g_test_add_func("/logging/parse_range", test_parse_range);
-g_test_add_func("/logging/parse_path", test_parse_path);
+g_test_add_data_func("/logging/parse_path", tmp_path, test_parse_path);
 
-return g_test_run();
+rc = g_test_run();
+
+rmdir_full(tmp_path);
+g_free(tmp_path);
+return rc;
 }
-- 
1.9.1




[Qemu-devel] [PATCH for-v2.7 v2 0/2] test-logging: don't hard-code paths in /tmp

2016-08-18 Thread Sascha Silbe
This version should be good enough for inclusion in 2.7. I kept the
temporary directory removal function local to test-logging for now,
only cleaning up a single directory level. We can still factor it out
and make it more generic in the 2.8 cycle. For 2.7 I'd rather stick
with a minimal approach as it's very late in the cycle.

Tested successfully with the centos6 docker image. Apart from a hang
(tests/test-qga) and a race condition (tests/acpi-test-disk.raw
missing) that both happen even without my patches, it also works well
on Ubuntu 14.04.

Feel free to perform any additional fixup required to land this in
rc4; I might not be around again until Tuesday.

Sascha Silbe (2):
  glib: add compatibility implementation for g_dir_make_tmp()
  test-logging: don't hard-code paths in /tmp

 include/glib-compat.h | 21 +
 tests/test-logging.c  | 48 +---
 2 files changed, 62 insertions(+), 7 deletions(-)

-- 
1.9.1




Re: [Qemu-devel] [PATCH v17 0/9] 8bit AVR cores

2016-08-18 Thread Peter Maydell
On 18 August 2016 at 13:07, Michael Rolnik  wrote:
> This series of patches adds 8bit AVR cores to QEMU.
> All instruction, except BREAK/DES/SPM/SPMX, are implemented. Not fully tested 
> yet.
> However I was able to execute simple code with functions. e.g fibonacci 
> calculation.
> This series of patches include a non real, sample board.
> No fuses support yet. PC is set to 0 at reset.
>

First 3 patches look good to me; I put in a comment on one or
two of the other patches but mostly I intend to leave the
review of the instruction generation patches to Richard.

thanks
-- PMM



Re: [Qemu-devel] [PATCH v17 5/9] target-avr: adding AVR interrupt handling

2016-08-18 Thread Peter Maydell
On 18 August 2016 at 13:07, Michael Rolnik  wrote:
> Signed-off-by: Michael Rolnik 
> ---
>  target-avr/helper.c | 55 
> +
>  1 file changed, 55 insertions(+)
>
> diff --git a/target-avr/helper.c b/target-avr/helper.c
> index b48222d..8511fb7 100644
> --- a/target-avr/helper.c
> +++ b/target-avr/helper.c
> @@ -32,11 +32,66 @@
>  bool avr_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
>  {
>  bool ret = false;
> +CPUClass *cc = CPU_GET_CLASS(cs);
> +AVRCPU *cpu = AVR_CPU(cs);
> +CPUAVRState *env = >env;
> +
> +if (interrupt_request & CPU_INTERRUPT_RESET) {
> +if (cpu_interrupts_enabled(env)) {
> +cs->exception_index = EXCP_RESET;
> +cc->do_interrupt(cs);
> +
> +cs->interrupt_request &= ~CPU_INTERRUPT_RESET;
> +
> +ret = true;
> +}
> +}

Are you sure that you need to handle CPU_INTERRUPT_RESET here?
It looks to me like the code in cpu-exec.c should deal with it
for you.

> +if (interrupt_request & CPU_INTERRUPT_HARD) {
> +if (cpu_interrupts_enabled(env) && env->intsrc != 0) {
> +int index = ctz32(env->intsrc);
> +cs->exception_index = EXCP_INT(index);
> +cc->do_interrupt(cs);
> +
> +env->intsrc &= env->intsrc - 1; /* clear the interrupt */

I think clearing the env->intsrc bit should go in avr_cpu_do_interrupt().

> +cs->interrupt_request &= ~CPU_INTERRUPT_HARD;

I'm not sure what the interrupt model for this CPU is,
but other CPU models don't do this, so maybe you don't
want to either. (The usual model is that CPU_INTERRUPT_HARD
corresponds to an interrupt input signal to the CPU;
for instance on ARM it's the IRQ line. When the signal
goes high we call cpu_interrupt(cs, CPU_INTERRUPT_HARD)
which sets the bit, and when it goes low we call
cpu_reset_interrupt(cs, CPU_INTERRUPT_HARD) which clears
the bit.)

> +
> +ret = true;
> +}
> +}
>  return ret;
>  }
>
>  void avr_cpu_do_interrupt(CPUState *cs)
>  {
> +AVRCPU *cpu = AVR_CPU(cs);
> +CPUAVRState *env = >env;
> +
> +uint32_t ret = env->pc_w;
> +int vector = 0;
> +int size = avr_feature(env, AVR_FEATURE_JMP_CALL) ? 2 : 1;
> +int base = 0; /* TODO: where to get it */
> +
> +if (cs->exception_index == EXCP_RESET) {
> +vector = 0;
> +} else if (env->intsrc != 0) {
> +vector = ctz32(env->intsrc) + 1;
> +}

Should env->intsrc==0 really be treated like reset?
(If it's a can't-happen case then asserting would probably be good.)

> +
> +if (avr_feature(env, AVR_FEATURE_3_BYTE_PC)) {
> +cpu_stb_data(env, env->sp--, (ret & 0xff));
> +cpu_stb_data(env, env->sp--, (ret & 0x00ff00) >>  8);
> +cpu_stb_data(env, env->sp--, (ret & 0xff) >> 16);
> +} else if (avr_feature(env, AVR_FEATURE_2_BYTE_PC)) {
> +cpu_stb_data(env, env->sp--, (ret & 0xff));
> +cpu_stb_data(env, env->sp--, (ret & 0x00ff00) >>  8);
> +} else {
> +cpu_stb_data(env, env->sp--, (ret & 0xff));
> +}
> +
> +env->pc_w = base + vector * size;
> +env->sregI = 0; /* clear Global Interrupt Flag */
> +
> +cs->exception_index = -1;
>  }
>
>  int avr_cpu_memory_rw_debug(CPUState *cs, vaddr addr, uint8_t *buf,
> --
> 2.4.9 (Apple Git-60)

thanks
-- PMM



Re: [Qemu-devel] [RFC v6-based v1 0/5] refine mdev framework

2016-08-18 Thread Alex Williamson
On Thu, 18 Aug 2016 16:42:14 +0800
Dong Jia  wrote:

> On Wed, 17 Aug 2016 03:09:10 -0700
> Neo Jia  wrote:
> 
> > On Wed, Aug 17, 2016 at 04:58:14PM +0800, Dong Jia wrote:  
> > > On Tue, 16 Aug 2016 16:14:12 +0800
> > > Jike Song  wrote:
> > >   
> > > > 
> > > > This patchset is based on NVidia's "Add Mediated device support" 
> > > > series, version 6:
> > > > 
> > > > http://www.spinics.net/lists/kvm/msg136472.html
> > > > 
> > > > 
> > > > Background:
> > > > 
> > > > The patchset from NVidia introduced the Mediated Device support 
> > > > to
> > > > Linux/VFIO. With that series, one can create virtual devices 
> > > > (supporting
> > > > by underlying physical device and vendor driver), and assign 
> > > > them to
> > > > userspace like QEMU/KVM, in the same way as device assignment 
> > > > via VFIO.
> > > > 
> > > > Based on that, NVidia and Intel implemented their vGPU 
> > > > solutions, IBM
> > > > implemented its CCW pass-through.  However, there are 
> > > > limitations
> > > > imposed by current (v6 in particular) mdev framework: the mdev 
> > > > must be
> > > > represented as a PCI device, several vfio capabilities such as
> > > > sparse mmap are not possible, and so forth.
> > > > 
> > > > This series aims to address above limitations and simplify the 
> > > > implementation.
> > > > 
> > > > 
> > > > Key Changes:
> > > > 
> > > > - An independent "struct device" was introduced to 
> > > > parent_device, thus
> > > >   a hierarchy in driver core is formed with physical device, 
> > > > parent device
> > > >   and mdev device;
> > > > 
> > > > - Leveraging the mechanism and APIs provided by Linux driver 
> > > > core, it
> > > >   is now safe to remove all refcnts and locks;
> > > > 
> > > > - vfio_mpci (later renamed to vfio_mdev) was made BUS-agnostic: 
> > > > all
> > > >   PCI-specific logic was removed, accesses from userspace are 
> > > > now
> > > >   passed to vendor driver directly, thus guaranteed that full 
> > > > VFIO
> > > >   capabilities provided: e.g. dynamic regions, sparse mmap, 
> > > > etc.;
> > > > 
> > > >   With vfio_mdev being BUS-agnostic, it is enough to have only 
> > > > one
> > > >   driver for all mdev devices;  
> > > 
> > > Hi Jike:
> > > 
> > > I don't know what happened, but finding out which direction this will
> > > likely go seems my first priority now...  
> > 
> > Hi Dong,
> > 
> > Just want to let you know that we are preparing the v7 patches to 
> > incorporate
> > the latest review comments from Intel folks and Alex, for some changes in 
> > this
> > patch set also mentioned in the recent review are already queued up in the 
> > new
> > version.  
> Hi Neo,
> 
> Good to know this. :>
> 
> >   
> > > 
> > > I'd say, either with only the original mdev v6, or patched this series,
> > > vfio-ccw could live. But this series saves my work of mimicing the
> > > vfio-mpci code in my vfio-mccw driver. I like this incremental patches.  
> > 
> > Thanks for sharing your progress and good to know our current v6 solution 
> > works 
> > for you. We are still evaluating the vfio_mdev changes here as I still 
> > prefer to
> > share general VFIO pci handling inside a common VFIO PCI driver, and the
> > modularization will reduce the impact of future changes and potential 
> > regressions
> > cross architectures - between PCI and CCW.  
> If this is something that Alex and the Intel folks are fine with, I have
> no problem with this too. Thanks,

Overall, I like this a lot.  Creating a proper device hierarchy and
letting the driver core manage the references makes a lot of sense and
the reduction in code volume and complexity speaks for itself.  I like
how the PCI mdev layer goes away, we're not imposing arbitrary
restrictions on the vendor driver in an attempt to insert a common
layer.  We can add helpers for things that do end up being common as we
go.  Using devices rather than uuids for functions is a big
improvement.  I hope that Neo and Kirti will incorporate many of these
changes in their next revision.  Thanks for stepping in with this,

Alex



Re: [Qemu-devel] [PATCH v17 3/9] target-avr: adding a sample AVR board

2016-08-18 Thread Peter Maydell
On 18 August 2016 at 13:07, Michael Rolnik  wrote:
> Signed-off-by: Michael Rolnik 
> ---
>  MAINTAINERS  |   1 +
>  hw/avr/Makefile.objs |  21 ++
>  hw/avr/sample.c  | 112 
> +++
>  3 files changed, 134 insertions(+)
>  create mode 100644 hw/avr/Makefile.objs
>  create mode 100644 hw/avr/sample.c
>

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [Qemu-devel] [PATCH v17 6/9] target-avr: adding helpers for IN, OUT, SLEEP, WBR & unsupported instructions

2016-08-18 Thread Peter Maydell
On 18 August 2016 at 13:07, Michael Rolnik  wrote:
> Signed-off-by: Michael Rolnik 

> @@ -79,11 +80,11 @@ void avr_cpu_do_interrupt(CPUState *cs)
>
>  if (avr_feature(env, AVR_FEATURE_3_BYTE_PC)) {
>  cpu_stb_data(env, env->sp--, (ret & 0xff));
> -cpu_stb_data(env, env->sp--, (ret & 0x00ff00) >>  8);
> +cpu_stb_data(env, env->sp--, (ret & 0x00ff00) >> 8);
>  cpu_stb_data(env, env->sp--, (ret & 0xff) >> 16);
>  } else if (avr_feature(env, AVR_FEATURE_2_BYTE_PC)) {
>  cpu_stb_data(env, env->sp--, (ret & 0xff));
> -cpu_stb_data(env, env->sp--, (ret & 0x00ff00) >>  8);
> +cpu_stb_data(env, env->sp--, (ret & 0x00ff00) >> 8);
>  } else {
>  cpu_stb_data(env, env->sp--, (ret & 0xff));
>  }

These whitespace changes should be squashed into the commit where the
code was added.

thanks
-- PMM



Re: [Qemu-devel] [PATCH v17 2/9] target-avr: adding AVR CPU features/flavors

2016-08-18 Thread Peter Maydell
On 18 August 2016 at 13:07, Michael Rolnik  wrote:
> Signed-off-by: Michael Rolnik 

Reviewed-by: Peter Maydell 

thanks
-- PMM



Re: [Qemu-devel] errno 13, fopen qemu trace file.

2016-08-18 Thread Nir Levy
Hello Stefan,

you are right that those are initial latencies,
I just gave those example for highlighting my goals.
I have started reading the ../Virtual/kvm/api. Documentation this is from where 
I intent to learn the operation I have printed.
still I am stuck with Asocs' testing libvirtd not allowing me to open the trace 
file 
Regards,
Nir,

-Original Message-
From: Stefan Hajnoczi [mailto:stefa...@gmail.com] 
Sent: Thursday, August 18, 2016 4:22 PM
To: Nir Levy 
Cc: qemu-devel@nongnu.org
Subject: Re: errno 13, fopen qemu trace file.

On Thu, Aug 18, 2016 at 1:58 PM, Nir Levy  wrote:
> I have a progress in tracing qemu,
> I add the thread and tag done for each kvm_ioctl, kvm_vm_ioctl, 
> kvm_vcpu_ioctl in purpose of investigating pure hypervisor activity and 
> delays on host.
> the kvm type print only for convenience.
>
> for example:
>
> kvm_ioctl 3106435.230545 pid=11347 thread=11347 type=0xae03 arg=0x25
>
> kvm_ioctl_done 3106435.230546 pid=11347 thread=11347 type=0xae03 
> arg=0x25 diff=1 (KVM_CHECK_EXTENSION)
>
> kvm_vcpu_ioctl 3106435.253930 pid=11347 thread=11354 cpu_index=0x2 
> type=0x4008ae9c arg=0x56417e6cb4f0
>
> kvm_vcpu_ioctl_done 3106435.253931 pid=11347 thread=11354 
> cpu_index=0x2 type=0x4008ae9c arg=0x56417e6cb4f0 diff=1 
> (KVM_X86_SETUP_MCE)
>
> kvm_vm_ioctl 3106435.268896 pid=11347 thread=11347 type=0x4020ae46 
> arg=0x7ffed97cf9d0
>
> kvm_vm_ioctl_done 3106435.269082 pid=11347 thread=11347 
> type=0x4020ae46 arg=0x7ffed97cf9d0 diff=186 
> (KVM_SET_USER_MEMORY_REGION)
>
>
> I have notice KVM_RUN can take even seconds but that is probably low 
> priority tasks,(io workers probably)

Please read Linux Documentation/virtual/kvm/api.txt to learn about the ioctl 
calls.

KVM_RUN is *the* ioctl that executes guest code.  Unless a vcpu is halted we 
should be inside KVM_RUN, so spending time inside this ioctl is normal.

> but this 186micro second on the main qemu thread is suspicious and might 
> cause application running over vm delays.

By "186micro second" you are referring to KVM_SET_USER_MEMORY_REGION in the 
trace above.

Is this ioctl called in the critical path?  I doubt it since the 
KVM_X86_SETUP_MCE ioctl in your trace happens during initialization time from 
kvm_arch_init_vcpu() and is not in the critical path when the guest is running.

Why worry about latencies that do not affect running guests?

Stefan


Re: [Qemu-devel] [PATCH v17 1/9] target-avr: AVR cores support is added.

2016-08-18 Thread Peter Maydell
On 18 August 2016 at 13:07, Michael Rolnik  wrote:
> 1. basic CPU structure
> 2. registers
> 3. no instructions
> 4. saving sreg, rampD, rampX, rampY, rampD, eind in HW representation
>
> Signed-off-by: Michael Rolnik 

Reviewed-by: Peter Maydell 

(please include my reviewed-by tag in the commit message for this
patch if/when you send out future versions of the patchset,
unless you make changes to this patch. That way I know I don't
need to look at it again next time around.)

thanks
-- PMM



Re: [Qemu-devel] [PATCH v2 0/8] nvdimm: hotplug support

2016-08-18 Thread Dan Williams
[ adding Vishal who implemented the kernel side of nvdimm hotplug support ]

On Thu, Aug 11, 2016 at 11:54 PM, Xiao Guangrong
 wrote:
> This patchset is against commit c597dc90fbcd6 (virtio-net: allow increasing
> rx queue siz) on pci branch of Michael's git tree and can be found at:
>   https://github.com/xiaogr/qemu.git nvdimm-hotplug-v2
>
> Changelog in v2:
>Fixed signed integer overflow pointed out by Stefan Hajnoczi
>
> This patchset enables nvdimm hotplug support, it is used as pc-dimm hotplug,
> for example, a new nvdimm device can be plugged as follows:
> object_add memory-backend-file,id=mem3,size=10G,mem-path=/home/eric/nvdimm3
> device_add nvdimm,id=nvdimm3,memdev=mem3
>
> and unplug it as follows:
> device_del nvdimm3
> object_del mem3

Did you test this against the Linux NFIT hotplug support?  We just
found that the Linux driver is not properly registering for ACPI0012
event notification.  Is a notification sent on a 'device_add' event?



Re: [Qemu-devel] [PATCH 2/2] syscall.c: Redefine IFLA_* enums

2016-08-18 Thread Peter Maydell
On 17 August 2016 at 19:57, Michal Privoznik  wrote:
> On 17.08.2016 17:28, Laurent Vivier wrote:
>> Le 17/08/2016 à 15:49, Michal Privoznik a écrit :
>>> In 9c37146782 I've tried to fix a broken build with older
>>> linux-headers. However, I didn't do it properly. The solution
>>> implemented here is to grab the enums that caused the problem
>>> initially, and rename their values so that they are "QEMU_"
>>> prefixed. In order to guarantee matching values with actual
>>> enums from linux-headers, the enums are seeded with starting
>>> values from the original enums.
>>>
>>> Signed-off-by: Michal Privoznik 
>>
>> I don't think you need the  "QEMU_IFLA_XXX_UNSPEC = IFLA_XXX_UNSPEC"
>> part as IFLA_XXX_UNSPEC is always 0 and enums always start at 0.
>
> Correct, I just wanted to make it clear that these enums I'm introducing
> here are the same as IFLA_*. But I don't have a strong opinion on that,
> so whatever you prefer.

I think it's better without them, as then we're entirely
independent of whether the system headers define any of
these constants at all. Could you respin without those, please?

thanks
-- PMM



Re: [Qemu-devel] [PATCH] e1000e: remove internal interrupt flag

2016-08-18 Thread Markus Armbruster
Markus Armbruster  writes:

> Dmitry Fleytman  writes:
>
>>> On 18 Aug 2016, at 17:15, Cao jin  wrote:
[...]
>>> @@ -601,7 +596,6 @@ static const VMStateDescription e1000e_vmstate = {
>>> VMSTATE_MSIX(parent_obj, E1000EState),
>>> 
>>> VMSTATE_UINT32(ioaddr, E1000EState),
>>> -   VMSTATE_UINT32(intr_state, E1000EState),
>
> We can still do this, because as Paolo pointed out, no released version
> of QEMU had this device.

Which means we better get this into 2.7.0.  Thoughts?

>>> VMSTATE_UINT32(core.rxbuf_min_shift, E1000EState),
>>> VMSTATE_UINT8(core.rx_desc_len, E1000EState),
>>> VMSTATE_UINT32_ARRAY(core.rxbuf_sizes, E1000EState,
>>> -- 
>>> 2.1.0
>
> Reviewed-by: Markus Armbruster 



[Qemu-devel] travis builds: failing because of duff data in ccache cache?

2016-08-18 Thread Peter Maydell
Hi; since commit 4b887ae travis builds have been persistently
failing on one particular config with the error:
exec.o: could not read symbols: File truncated
trying to link the mipsn32-linux-user binary.

My theory is that the problem here is that:
 * for one build, the build host ran out of disk space or otherwise
   hiccupped, resulting in a truncated .o file
 * since travis saves the ccache cache across builds, the truncated .o
   file has persisted and now every build is going to fail the same
   way (until something gets committed that results in exec.c or one
   of its included headers changing)

It looks like there's a way to manually clear the cache, so that
seems like a good first step to see if it fixes things:
https://docs.travis-ci.com/user/caching/#Clearing-Caches

Could somebody with admin access to our travis config try this, please?

thanks
-- PMM



Re: [Qemu-devel] [PATCH] 9pfs: add check for relative path

2016-08-18 Thread Greg Kurz
On Thu, 11 Aug 2016 10:43:11 +0530
P J P  wrote:

> From: Prasad J Pandit 
> 
> At various places in 9pfs back-end, it creates full path by
> concatenating two path strings. It could lead to a path
> traversal issue if one of the parameter was a relative path.
> Add check to avoid it.
> 
> Reported-by: Felix Wilhelm 
> Signed-off-by: Prasad J Pandit 
> ---

With GDB attached to my QEMU, I could easily change the path names to have a
".." component just like a malicious guest would do, and cause the path 
traversal
issue to happen with renameat at least. And this patch indeed fixes the issue in
this case.

I'm still on holidays and I could not find time to do more testing and reviewing
though. But since the official linux 9p client does not send relative paths, I
don't expect any regression to happen if we apply the whole patch. I hence give:

Acked-by: Greg Kurz 

Since this is a serious security issue, I shall send a pull request for this
patch tomorrow so that it gets into rc4. Also Cc'ing stable in case we deliver
another 2.6.x release one day.

If someone disagrees, please speak up ASAP.

Aneesh ? Michael ? Peter ?

>  hw/9pfs/9p-local.c | 31 +++
>  1 file changed, 27 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/9pfs/9p-local.c b/hw/9pfs/9p-local.c
> index 3f271fc..c20331a 100644
> --- a/hw/9pfs/9p-local.c
> +++ b/hw/9pfs/9p-local.c
> @@ -493,6 +493,9 @@ static int local_mknod(FsContext *fs_ctx, V9fsPath 
> *dir_path,
>  char *buffer = NULL;
>  
>  v9fs_string_init();
> +if (strstr(name, "../")) {
> +return err;
> +}
>  v9fs_string_sprintf(, "%s/%s", dir_path->data, name);
>  path = fullname.data;
>  
> @@ -554,6 +557,9 @@ static int local_mkdir(FsContext *fs_ctx, V9fsPath 
> *dir_path,
>  char *buffer = NULL;
>  
>  v9fs_string_init();
> +if (strstr(name, "../")) {
> +return err;
> +}
>  v9fs_string_sprintf(, "%s/%s", dir_path->data, name);
>  path = fullname.data;
>  
> @@ -663,6 +669,9 @@ static int local_open2(FsContext *fs_ctx, V9fsPath 
> *dir_path, const char *name,
>  flags |= O_NOFOLLOW;
>  
>  v9fs_string_init();
> +if (strstr(name, "../")) {
> +return err;
> +}
>  v9fs_string_sprintf(, "%s/%s", dir_path->data, name);
>  path = fullname.data;
>  
> @@ -734,6 +743,9 @@ static int local_symlink(FsContext *fs_ctx, const char 
> *oldpath,
>  char *buffer = NULL;
>  
>  v9fs_string_init();
> +if (strstr(name, "../")) {
> +return err;
> +}
>  v9fs_string_sprintf(, "%s/%s", dir_path->data, name);
>  newpath = fullname.data;
>  
> @@ -830,11 +842,14 @@ out:
>  static int local_link(FsContext *ctx, V9fsPath *oldpath,
>V9fsPath *dirpath, const char *name)
>  {
> -int ret;
> +int ret = -1;
>  V9fsString newpath;
>  char *buffer, *buffer1;
>  
>  v9fs_string_init();
> +if (strstr(name, "../")) {
> +return ret;
> +}
>  v9fs_string_sprintf(, "%s/%s", dirpath->data, name);
>  
>  buffer = rpath(ctx, oldpath->data);
> @@ -1059,6 +1074,9 @@ static int local_lremovexattr(FsContext *ctx, V9fsPath 
> *fs_path,
>  static int local_name_to_path(FsContext *ctx, V9fsPath *dir_path,
>const char *name, V9fsPath *target)
>  {
> +if (strstr(name, "../")) {
> +return -1;
> +}
>  if (dir_path) {
>  v9fs_string_sprintf((V9fsString *)target, "%s/%s",
>  dir_path->data, name);
> @@ -1074,12 +1092,15 @@ static int local_renameat(FsContext *ctx, V9fsPath 
> *olddir,
>const char *old_name, V9fsPath *newdir,
>const char *new_name)
>  {
> -int ret;
> +int ret = -1;
>  V9fsString old_full_name, new_full_name;
>  
>  v9fs_string_init(_full_name);
>  v9fs_string_init(_full_name);
>  
> +if (strstr(old_name, "../") || strstr(new_name, "../")) {
> +return ret;
> +}
>  v9fs_string_sprintf(_full_name, "%s/%s", olddir->data, old_name);
>  v9fs_string_sprintf(_full_name, "%s/%s", newdir->data, new_name);
>  
> @@ -1092,12 +1113,14 @@ static int local_renameat(FsContext *ctx, V9fsPath 
> *olddir,
>  static int local_unlinkat(FsContext *ctx, V9fsPath *dir,
>const char *name, int flags)
>  {
> -int ret;
> +int ret = -1;
>  V9fsString fullname;
>  char *buffer;
>  
>  v9fs_string_init();
> -
> +if (strstr(name, "../")) {
> +return ret;
> +}
>  v9fs_string_sprintf(, "%s/%s", dir->data, name);
>  if (ctx->export_flags & V9FS_SM_MAPPED_FILE) {
>  if (flags == AT_REMOVEDIR) {




Re: [Qemu-devel] [PATCH] e1000e: remove internal interrupt flag

2016-08-18 Thread Paolo Bonzini


On 18/08/2016 16:15, Cao jin wrote:
> Commit 66bf7d58 removed internal msi state flag E1000E_USE_MSI, 
> E1000E_USE_MSIX
> is not necessary too, remove it now. And interrupt flag field intr_state also
> can be removed now.
> 
> CC: Dmitry Fleytman 
> CC: Jason Wang 
> CC: Markus Armbruster 
> CC: Marcel Apfelbaum 
> CC: Michael S. Tsirkin 
> CC: Paolo Bonzini 
> Signed-off-by: Cao jin 
> ---
>  hw/net/e1000e.c | 8 +---
>  1 file changed, 1 insertion(+), 7 deletions(-)
> 
> diff --git a/hw/net/e1000e.c b/hw/net/e1000e.c
> index 82a7be1..72aad21 100644
> --- a/hw/net/e1000e.c
> +++ b/hw/net/e1000e.c
> @@ -69,7 +69,6 @@ typedef struct E1000EState {
>  uint16_t subsys_ven_used;
>  uint16_t subsys_used;
>  
> -uint32_t intr_state;
>  bool disable_vnet;
>  
>  E1000ECore core;
> @@ -89,8 +88,6 @@ typedef struct E1000EState {
>  #define E1000E_MSIX_TABLE   (0x)
>  #define E1000E_MSIX_PBA (0x2000)
>  
> -#define E1000E_USE_MSIXBIT(0)
> -
>  static uint64_t
>  e1000e_mmio_read(void *opaque, hwaddr addr, unsigned size)
>  {
> @@ -302,8 +299,6 @@ e1000e_init_msix(E1000EState *s)
>  } else {
>  if (!e1000e_use_msix_vectors(s, E1000E_MSIX_VEC_NUM)) {
>  msix_uninit(d, >msix, >msix);
> -} else {
> -s->intr_state |= E1000E_USE_MSIX;
>  }
>  }
>  }
> @@ -311,7 +306,7 @@ e1000e_init_msix(E1000EState *s)
>  static void
>  e1000e_cleanup_msix(E1000EState *s)
>  {
> -if (s->intr_state & E1000E_USE_MSIX) {
> +if (msix_enabled(PCI_DEVICE(s))) {
>  e1000e_unuse_msix_vectors(s, E1000E_MSIX_VEC_NUM);
>  msix_uninit(PCI_DEVICE(s), >msix, >msix);
>  }
> @@ -601,7 +596,6 @@ static const VMStateDescription e1000e_vmstate = {
>  VMSTATE_MSIX(parent_obj, E1000EState),
>  
>  VMSTATE_UINT32(ioaddr, E1000EState),
> -VMSTATE_UINT32(intr_state, E1000EState),
>  VMSTATE_UINT32(core.rxbuf_min_shift, E1000EState),
>  VMSTATE_UINT8(core.rx_desc_len, E1000EState),
>  VMSTATE_UINT32_ARRAY(core.rxbuf_sizes, E1000EState,
> 

If there's time to get this in 2.7, it would be good.  Jason?

Reviewed-by: Paolo Bonzini 



Re: [Qemu-devel] [PATCH] 9pfs: add check for relative path

2016-08-18 Thread Greg Kurz
On Thu, 11 Aug 2016 14:27:15 +0800
Fam Zheng  wrote:

> On Wed, 08/10 23:17, no-re...@ec2-52-6-146-230.compute-1.amazonaws.com wrote:
> > Hi,
> > 
> > Your series failed automatic build test. Please find the testing commands 
> > and
> > their output below. If you have docker installed, you can probably 
> > reproduce it
> > locally.  
> 
> This may not relate to this patch. But some qtest did hang on the test 
> machine.
> 

The current qtest for 9p does not do anything, unfortunately, but testing
basic initialization so far... This patch has no effect here.

FWIW, I could successfully run make check with this patch and the current
master branch (commit 5844365fe8e5).

> [root@virtlab205 ~]# pgrep -af qtest
> 
> 12377 /bin/sh -c echo "GTESTER check-qtest-x86_64" &&
> QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 QTEST_QEMU_IMG=qemu-img
> MALLOC_PERTURB_=${MALLOC_PERTURB_:-$((RANDOM % 255 + 1))} gtester -k -q
> -m=quick tests/vhost-user-test tests/endianness-test tests/fdc-test
> tests/ide-test tests/ahci-test tests/hd-geo-test tests/boot-order-test
> tests/bios-tables-test tests/pxe-test tests/rtc-test tests/ipmi-kcs-test
> tests/ipmi-bt-test tests/i440fx-test tests/fw_cfg-test tests/drive_del-test
> tests/wdt_ib700-test tests/tco-test tests/e1000-test tests/e1000e-test
> tests/rtl8139-test tests/pcnet-test tests/eepro100-test tests/ne2000-test
> tests/nvme-test tests/ac97-test tests/es1370-test tests/virtio-net-test
> tests/virtio-balloon-test tests/virtio-blk-test tests/virtio-rng-test
> tests/virtio-scsi-test tests/virtio-serial-test tests/virtio-console-test
> tests/tpci200-test tests/ipoctal232-test tests/display-vga-test
> tests/intel-hda-test tests/ivshmem-test tests/vmxnet3-test tests/pvpanic-test
> tests/i82801b11-test tests/ioh3420-test tests/usb-hcd-ohci-test
> tests/usb-hcd-uhci-test tests/usb-hcd-ehci-test tests/usb-hcd-xhci-test
> tests/pc-cpu-test tests/q35-test tests/test-netfilter tests/test-filter-mirror
> tests/test-filter-redirector tests/postcopy-test tests/device-introspect-test
> tests/qom-test
> 
> 12381 x86_64-softmmu/qemu-system-x86_64 -qtest 
> unix:/tmp/qtest-6376.sock,nowait
> -qtest-log /dev/null -qmp unix:/tmp/qtest-6376.qmp,nowait -machine accel=qtest
> -display none -machine accel=tcg -m 512 -object
> memory-backend-file,id=mem,size=512M,mem-path=/tmp/vhost-test-4glolA,share=on
> -numa node,memdev=mem -chardev
> socket,id=chr-test,path=/tmp/vhost-test-4glolA/test.sock -netdev
> vhost-user,id=net0,chardev=chr-test,vhostforce -device
> virtio-net-pci,netdev=net0,romfile=./pc-bios/pxe-virtio.rom
> 
> I haven't looked any deeper.
> 
> Fam
> 
> > 
> > Subject: [Qemu-devel] [PATCH] 9pfs: add check for relative path
> > Message-id: 1470892391-4917-1-git-send-email-ppan...@redhat.com
> > Type: series
> > 
> > === TEST SCRIPT BEGIN ===
> > #!/bin/bash
> > set -e
> > git submodule update --init dtc
> > make J=8 docker-test-quick@centos6
> > 
> > # we need CURL DPRINTF patch
> > # 
> > http://patchew.org/QEMU/1470027888-24381-1-git-send-email-famz%40redhat.com/
> > #make J=8 docker-test-mingw@fedora
> > === TEST SCRIPT END ===
> > 
> > Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
> > Switched to a new branch 'test'
> > 4e97568 9pfs: add check for relative path
> > 
> > === OUTPUT BEGIN ===
> > Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 
> > 'dtc'
> > Cloning into 'dtc'...
> > Submodule path 'dtc': checked out '65cc4d2748a2c2e6f27f1cf39e07a5dbabd80ebf'
> >   BUILD centos6
> >   ARCHIVE qemu.tgz
> >   ARCHIVE dtc.tgz
> >   COPY RUNNER
> >   RUN test-quick in centos6
> > No C++ compiler available; disabling C++ specific optional code
> > Install prefix/tmp/qemu-test/src/tests/docker/install
> > BIOS directory/tmp/qemu-test/src/tests/docker/install/share/qemu
> > binary directory  /tmp/qemu-test/src/tests/docker/install/bin
> > library directory /tmp/qemu-test/src/tests/docker/install/lib
> > module directory  /tmp/qemu-test/src/tests/docker/install/lib/qemu
> > libexec directory /tmp/qemu-test/src/tests/docker/install/libexec
> > include directory /tmp/qemu-test/src/tests/docker/install/include
> > config directory  /tmp/qemu-test/src/tests/docker/install/etc
> > local state directory   /tmp/qemu-test/src/tests/docker/install/var
> > Manual directory  /tmp/qemu-test/src/tests/docker/install/share/man
> > ELF interp prefix /usr/gnemul/qemu-%M
> > Source path   /tmp/qemu-test/src
> > C compilercc
> > Host C compiler   cc
> > C++ compiler  
> > Objective-C compiler cc
> > ARFLAGS   rv
> > CFLAGS-O2 -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -pthread 
> > -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include   -g 
> > QEMU_CFLAGS   -I/usr/include/pixman-1-fPIE -DPIE -m64 -D_GNU_SOURCE 
> > -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes 
> > -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes 
> > -fno-strict-aliasing -fno-common  -Wendif-labels 

[Qemu-devel] [RFC] libvirt vGPU QEMU integration

2016-08-18 Thread Neo Jia
Hi libvirt experts,

I am starting this email thread to discuss the potential solution / proposal of
integrating vGPU support into libvirt for QEMU.

Some quick background, NVIDIA is implementing a VFIO based mediated device
framework to allow people to virtualize their devices without SR-IOV, for
example NVIDIA vGPU, and Intel KVMGT. Within this framework, we are reusing the
VFIO API to process the memory / interrupt as what QEMU does today with passthru
device.

The difference here is that we are introducing a set of new sysfs file for
virtual device discovery and life cycle management due to its virtual nature.

Here is the summary of the sysfs, when they will be created and how they should
be used:

1. Discover mediated device

As part of physical device initialization process, vendor driver will register
their physical devices, which will be used to create virtual device (mediated
device, aka mdev) to the mediated framework.

Then, the sysfs file "mdev_supported_types" will be available under the physical
device sysfs, and it will indicate the supported mdev and configuration for 
this 
particular physical device, and the content may change dynamically based on the
system's current configurations, so libvirt needs to query this file every time
before create a mdev.

Note: different vendors might have their own specific configuration sysfs as
well, if they don't have pre-defined types.

For example, we have a NVIDIA Tesla M60 on 86:00.0 here registered, and here is
NVIDIA specific configuration on an idle system.

For example, to query the "mdev_supported_types" on this Tesla M60:

cat /sys/bus/pci/devices/:86:00.0/mdev_supported_types
# vgpu_type_id, vgpu_type, max_instance, num_heads, frl_config, framebuffer,
max_resolution
11  ,"GRID M60-0B",  16,   2,  45, 512M,2560x1600
12  ,"GRID M60-0Q",  16,   2,  60, 512M,2560x1600
13  ,"GRID M60-1B",   8,   2,  45,1024M,2560x1600
14  ,"GRID M60-1Q",   8,   2,  60,1024M,2560x1600
15  ,"GRID M60-2B",   4,   2,  45,2048M,2560x1600
16  ,"GRID M60-2Q",   4,   4,  60,2048M,2560x1600
17  ,"GRID M60-4Q",   2,   4,  60,4096M,3840x2160
18  ,"GRID M60-8Q",   1,   4,  60,8192M,3840x2160

2. Create/destroy mediated device

Two sysfs files are available under the physical device sysfs path : mdev_create
and mdev_destroy

The syntax of creating a mdev is:

echo "$mdev_UUID:vendor_specific_argument_list" >
/sys/bus/pci/devices/.../mdev_create

The syntax of destroying a mdev is:

echo "$mdev_UUID:vendor_specific_argument_list" >
/sys/bus/pci/devices/.../mdev_destroy

The $mdev_UUID is a unique identifier for this mdev device to be created, and it
is unique per system.

For NVIDIA vGPU, we require a vGPU type identifier (shown as vgpu_type_id in
above Tesla M60 output), and a VM UUID to be passed as
"vendor_specific_argument_list".

If there is no vendor specific arguments required, either "$mdev_UUID" or
"$mdev_UUID:" will be acceptable as input syntax for the above two commands.

To create a M60-4Q device, libvirt needs to do:

echo "$mdev_UUID:vgpu_type_id=20,vm_uuid=$VM_UUID" >
/sys/bus/pci/devices/\:86\:00.0/mdev_create

Then, you will see a virtual device shows up at:

/sys/bus/mdev/devices/$mdev_UUID/

For NVIDIA, to create multiple virtual devices per VM, it has to be created
upfront before bringing any of them online.

Regarding error reporting and detection, on failure, write() to sysfs using fd
returns error code, and write to sysfs file through command prompt shows the
string corresponding to error code.

3. Start/stop mediated device

Under the virtual device sysfs, you will see a new "online" sysfs file.

you can do cat /sys/bus/mdev/devices/$mdev_UUID/online to get the current status
of this virtual device (0 or 1), and to start a virtual device or stop a 
virtual 
device you can do:

echo "1|0" > /sys/bus/mdev/devices/$mdev_UUID/online

libvirt needs to query the current state before changing state.

Note: if you have multiple devices, you need to write to the "online" file
individually.

For NVIDIA, if there are multiple mdev per VM, libvirt needs to bring all of
them "online" before starting QEMU.

4. Launch QEMU/VM

Pass the mdev sysfs path to QEMU as vfio-pci device:

-device vfio-pci,sysfsdev=/sys/bus/mdev/devices/$mdev_UUID,id=vgpu0

5. Shutdown sequence 

libvirt needs to shutdown the qemu, bring the virtual device offline, then 
destroy the
virtual device

6. VM Reset

No change or requirement for libvirt as this will be handled via VFIO reset API
and QEMU process will keep running as before.

7. Hot-plug

It optional for vendors to support hot-plug.

And it is same syntax to create a virtual device for hot-plug. 

For hot-unplug, after executing QEMU monitor "device del" command, libvirt needs
to write to "destroy" sysfs to 

Re: [Qemu-devel] [PATCH] e1000e: remove internal interrupt flag

2016-08-18 Thread Markus Armbruster
Dmitry Fleytman  writes:

>> On 18 Aug 2016, at 17:15, Cao jin  wrote:
>> 
>> Commit 66bf7d58 removed internal msi state flag E1000E_USE_MSI, 
>> E1000E_USE_MSIX
>> is not necessary too, remove it now. And interrupt flag field intr_state also
>> can be removed now.
>> 
>> CC: Dmitry Fleytman 
>> CC: Jason Wang 
>> CC: Markus Armbruster 
>> CC: Marcel Apfelbaum 
>> CC: Michael S. Tsirkin 
>> CC: Paolo Bonzini 
>> Signed-off-by: Cao jin 
>> ---
>> hw/net/e1000e.c | 8 +---
>> 1 file changed, 1 insertion(+), 7 deletions(-)
>> 
>> diff --git a/hw/net/e1000e.c b/hw/net/e1000e.c
>> index 82a7be1..72aad21 100644
>> --- a/hw/net/e1000e.c
>> +++ b/hw/net/e1000e.c
>> @@ -69,7 +69,6 @@ typedef struct E1000EState {
>> uint16_t subsys_ven_used;
>> uint16_t subsys_used;
>> 
>> -uint32_t intr_state;
>> bool disable_vnet;
>> 
>> E1000ECore core;
>> @@ -89,8 +88,6 @@ typedef struct E1000EState {
>> #define E1000E_MSIX_TABLE   (0x)
>> #define E1000E_MSIX_PBA (0x2000)
>> 
>> -#define E1000E_USE_MSIXBIT(0)
>> -
>> static uint64_t
>> e1000e_mmio_read(void *opaque, hwaddr addr, unsigned size)
>> {
>> @@ -302,8 +299,6 @@ e1000e_init_msix(E1000EState *s)
>> } else {
>> if (!e1000e_use_msix_vectors(s, E1000E_MSIX_VEC_NUM)) {
>> msix_uninit(d, >msix, >msix);
>> -} else {
>> -s->intr_state |= E1000E_USE_MSIX;
>
> So you are removing error handling here. But what if e1000e_use_msix_vectors 
> fails?

Before the patch, E1000E_USE_MSIX is set exactly when MSI-X was enabled
successfully.  It's only use is in MSI-X cleanup (next hunk): cleanup is
skipped when the flag isn't set.

The flag is pointless, because we can just as well test whether MSI-X is
enabled directly.  That's what the patch does.

>
>> }
>> }
>> }
>> @@ -311,7 +306,7 @@ e1000e_init_msix(E1000EState *s)
>> static void
>> e1000e_cleanup_msix(E1000EState *s)
>> {
>> -if (s->intr_state & E1000E_USE_MSIX) {
>> +if (msix_enabled(PCI_DEVICE(s))) {
>> e1000e_unuse_msix_vectors(s, E1000E_MSIX_VEC_NUM);
>> msix_uninit(PCI_DEVICE(s), >msix, >msix);
>> }
>> @@ -601,7 +596,6 @@ static const VMStateDescription e1000e_vmstate = {
>> VMSTATE_MSIX(parent_obj, E1000EState),
>> 
>> VMSTATE_UINT32(ioaddr, E1000EState),
>> -   VMSTATE_UINT32(intr_state, E1000EState),

We can still do this, because as Paolo pointed out, no released version
of QEMU had this device.

>> VMSTATE_UINT32(core.rxbuf_min_shift, E1000EState),
>> VMSTATE_UINT8(core.rx_desc_len, E1000EState),
>> VMSTATE_UINT32_ARRAY(core.rxbuf_sizes, E1000EState,
>> -- 
>> 2.1.0

Reviewed-by: Markus Armbruster 



[Qemu-devel] [Bug 1614609] [NEW] alphabetical order of monitor options

2016-08-18 Thread Kai Poeritz
Public bug reported:

Looking for the 'continue'/'resume' option I found this order that was not 
quite 'alphabetical'.
It had me overlook the 'cont' option at glance. Which is just a little 
impractical.

...
boot_set bootdevice -- define new values for the boot device list
change device filename [format [read-only-mode]] -- change a removable medium, 
optional format
chardev-add args -- add chardev
chardev-remove id -- remove chardev
client_migrate_info protocol hostname port tls-port cert-subject -- set 
migration information for remote display
closefd closefd name -- close a file descriptor previously passed via SCM rights
commit device|all -- commit changes to the disk images (if -snapshot is used) 
or backing files
cpu index -- set the default CPU
cpu-add id -- add cpu
c|cont  -- resume emulation
delvm tag|id -- delete a VM snapshot from its tag or id
...

I tested this list with 'sort' just to make sure and make a point:

$ cat Desktop/order-orig.txt 
boot_set
change
chardev-add
chardev-remove
client_migrate_info
closefd
commit
cpu
cpu-add
c|cont
delvm
$ cat Desktop/order-orig.txt | sort
boot_set
c|cont
change
chardev-add
chardev-remove
client_migrate_info
closefd
commit
cpu
cpu-add
delvm
$

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1614609

Title:
  alphabetical order of monitor options

Status in QEMU:
  New

Bug description:
  Looking for the 'continue'/'resume' option I found this order that was not 
quite 'alphabetical'.
  It had me overlook the 'cont' option at glance. Which is just a little 
impractical.

  ...
  boot_set bootdevice -- define new values for the boot device list
  change device filename [format [read-only-mode]] -- change a removable 
medium, optional format
  chardev-add args -- add chardev
  chardev-remove id -- remove chardev
  client_migrate_info protocol hostname port tls-port cert-subject -- set 
migration information for remote display
  closefd closefd name -- close a file descriptor previously passed via SCM 
rights
  commit device|all -- commit changes to the disk images (if -snapshot is used) 
or backing files
  cpu index -- set the default CPU
  cpu-add id -- add cpu
  c|cont  -- resume emulation
  delvm tag|id -- delete a VM snapshot from its tag or id
  ...

  I tested this list with 'sort' just to make sure and make a point:

  $ cat Desktop/order-orig.txt 
  boot_set
  change
  chardev-add
  chardev-remove
  client_migrate_info
  closefd
  commit
  cpu
  cpu-add
  c|cont
  delvm
  $ cat Desktop/order-orig.txt | sort
  boot_set
  c|cont
  change
  chardev-add
  chardev-remove
  client_migrate_info
  closefd
  commit
  cpu
  cpu-add
  delvm
  $

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1614609/+subscriptions



Re: [Qemu-devel] [PATCH 0/6] hypertrace: Lightweight guest-to-QEMU trace channel

2016-08-18 Thread Steven Rostedt
On Thu, 18 Aug 2016 11:54:24 +0100
Stefan Hajnoczi  wrote:

> Steven is working on a host/guest solution for trace-cmd.  It is also
> asynchronous.  No new paravirt hardware is needed and it makes me wonder
> whether the hypertrace PCI device is trying to solve the problem at the
> wrong layer.

Yes, and I'm currently working on it again. In fact, if any of you will
be at LinuxCon in Toronto next week, I'll be presenting[1] what I have
and what I plan to complete in the near future.

-- Steve

[1] 
https://lcccna2016.sched.org/event/7JWL/trace-cmd-virt-server-a-status-update-steven-rostedt-red-hat?iframe=no=i:100;=yes=no



Re: [Qemu-devel] QEMU TCG issue when executing UEFI

2016-08-18 Thread Ard Biesheuvel
(+ Leif)

Exec summary: strange QEMU bug triggered by RELEASE_GCC5 code, which
is caused by a spurious write to the NOR flash at runtime. The latter
is also a bug, in Tianocore.

On 18 August 2016 at 16:36, Peter Maydell  wrote:
> On 18 August 2016 at 15:15, Ard Biesheuvel  wrote:
>> On 18 August 2016 at 16:10, Peter Maydell  wrote:
>>> On 16 August 2016 at 13:08, Ard Biesheuvel  
>>> wrote:
 Bad ram pointer 0x54
 Aborted (core dumped)
>>>
>>> So the reason this happens is that get_page_addr_code() doesn't
>>> correctly handle the case of the memory region being a
>>> ROM that's not in ROMD mode. That is, the flash memory can
>>> be either in "reads map directly to guest memory" (normal)
>>> mode or "reads are MMIO to a device" (ROMD) mode. QEMU
>>> can't execute from devices, so the best case here would
>>> be that we print the "Sorry, we can't execute from a device"
>>> message and stop execution.
>>>
>>
>> So is there a spurious write somewhere that causes the ROM to switch
>> into ROMD mode? Because it executes happily from ROM (until it
>> doesn't, of course)
>
> The write that causes us to go into not-ROMD mode is in this block:
>
> 0x96ac:  cb000294  sub x20, x20, x0
> 0x96b0:  f9000a74  str x20, [x19, #16]
> 0x96b4:  9100627c  add x28, x19, #0x18 (24)
> 0x96b8:  b9400780  ldr w0, [x28, #4]
> 0x96bc:  35002cc0  cbnz w0, #+0x598 (addr 0x9c54)
>
> which is executed with
>
> PC=96ac  SP=4007f590
> X00=0160 X01=0095 X02=3031424e
> X03=1b40
> X04=00010b64 X05=0160 X06=0188
> X07=4007c268
> X08=000149a0 X09=4007fe58 X10=4007f793
> X11=0002
> X12=707fe07a X13=0002 X14=
> X15=
> X16= X17= X18=
> X19=000149a0
> X20=000149a0 X21=000149a0 X22=0001
> X23=0160
> X24=4007fa24 X25=4007fa38 X26=
> X27=00014840
> X28=000149a0 X29= X30=9364
> PSTATE=23c5 --C- EL1h
>
> so you write 0x14840 to address 0x149b0, which is in the flash.
>
> (This is the last TB we execute, because trying to find the
> next one hits the problem of the flash not being in ROMD mode.
> So it's the very last thing in the log if you run QEMU with
> -d in_asm,out_asm,exec,cpu,int -D /tmp/q.log)
>

OK, this rabbit hole goes pretty deep :-)

Normally, the uncompressed PE/COFF images in the NOR flash (the ones
that set up the MMU etc) are relocated at build time, so that they can
execute from the offset they end up at in the NOR image. The
relocation code sets the base address in the image header, and applies
all fixups in the .reloc PE/COFF section.

As it turns out, the LTO is so effective that it optimizes away all
absolute symbol references, leaving us with no .reloc section at all.
(i.e., the module turns out completely position independent, but
purely by accident). The runtime loader does not cope well with this,
and ends up writing to the NOR flash, triggering the issue above.

Thanks,
Ard.



Re: [Qemu-devel] [PATCH] atapi: allow 0 transfer bytes for read_cd command

2016-08-18 Thread John Snow



On 08/18/2016 05:48 AM, Hervé Poussineau wrote:

This fixes Windows NT4 startup when a cdrom is inserted.

Fixes: 9ef2e93f9b1888c7d0deb4a105149138e6ad2e98
Signed-off-by: Hervé Poussineau 
---
 hw/ide/atapi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/ide/atapi.c b/hw/ide/atapi.c
index 6189675..63312f2 100644
--- a/hw/ide/atapi.c
+++ b/hw/ide/atapi.c
@@ -1289,7 +1289,7 @@ static const struct AtapiCmd {
 [ 0xad ] = { cmd_read_dvd_structure,CHECK_READY },
 [ 0xbb ] = { cmd_set_speed, NONDATA },
 [ 0xbd ] = { cmd_mechanism_status,  0 },
-[ 0xbe ] = { cmd_read_cd,   CHECK_READY },
+[ 0xbe ] = { cmd_read_cd,   CHECK_READY | NONDATA },
 /* [1] handler detects and reports not ready condition itself */
 };




What's the exact nature of the problem? I intended the "NONDATA" flag to 
be used exclusively for commands that do not return ANY information 
except for status return codes:


/*
 * Commands flagged with NONDATA do not in any circumstances return
 * any data via ide_atapi_cmd_reply. These commands are exempt from
 * the normal byte_count_limit constraints.
 * See ATA8-ACS3 "7.21.5 Byte Count Limit"
 */
NONDATA = 0x04,

I wouldn't be comfortable applying it to a command that DID indeed 
return data under some circumstances... though you're right that if the 
command as delivered returns no data, the ATAPI layer is allowed to 
process that request absent byte_count_limit being programmed.


You may need to adjust near this line in ide_atapi_cmd:

 if (cmd->handler && !(cmd->flags & NONDATA)) {

to allow or disallow more commands as appropriate. It looks to me sadly 
as if there is no hard and fast rule available to tell which commands 
must set the BCL mandatorily ... and putting the check in the data 
transfer itself puts us at risk for not aborting the command early enough.


I'll try to address this post-KVM forum, if you don't solve it by then.

--js



Re: [Qemu-devel] [RFC 00/13] Live memory snapshot based on userfaultfd

2016-08-18 Thread Andrea Arcangeli
Hello everyone,

I've an aa.git tree uptodate on the master & userfault branch (master
includes other pending VM stuff, userfault branch only contains
userfault enhancements):

https://git.kernel.org/cgit/linux/kernel/git/andrea/aa.git/log/?h=userfault

I didn't have time to test KVM live memory snapshot on it yet as I'm
still working to improve it. Did anybody test it? However I'd be happy
to take any bugreports and quickly solve anything that isn't working
right with the shadow MMU.

I got positive report already for another usage of the uffd WP support:

https://medium.com/@MartinCracauer/generational-garbage-collection-write-barriers-write-protection-and-userfaultfd-2-8b0e796b8f7f

The last few things I'm working on to finish the WP support are:

1) pte_swp_mkuffd_wp equivalent of pte_swp_mksoft_dirty to mark in a
   vma->vm_flags with VM_UFFD_WP set, which swap entries were
   generated while the pte was wrprotected.

2) to avoid all false positives the equivalent of pte_mksoft_dirty is
   needed too... and that requires spare software bits on the pte
   which are available on x86. I considered also taking over the
   soft_dirty bit but then you couldn't do checkpoint restore of a
   JIT/to-native compiler that uses uffd WP support so it wasn't
   ideal. Perhaps it would be ok as an incremental patch to make the
   two options mutually exclusive to defer the arch changes that
   pte_mkuffd_wp would require for later.

3) prevent UFFDIO_ZEROPAGE if registering WP|MISSING or trigger a
   cow in userfaultfd_writeprotect.

4) WP selftest

In theory things should work ok already if the userland code is
tolerant against false positives through swap and after fork() and
KSM. For an usage like snapshotting false positives shouldn't be an
issue (it'll just run slower if you swap in the worst case), and point
3) above also isn't an issue because it's going to register into uffd
with WP only.

The current status includes:

1) WP support for anon (with false positives.. work in progress)

2) MISSING support for tmpfs and hugetlbfs

3) non cooperative support

Thanks,
Andrea



Re: [Qemu-devel] [PATCH] aarch64: use TSX for ldrex/strex

2016-08-18 Thread Richard Henderson

On 08/17/2016 11:41 AM, Richard Henderson wrote:

On 08/17/2016 10:58 AM, Emilio G. Cota wrote:

(2) that we should start a new TB upon encountering a load-exclusive, so
that we maximize the chance of the store-exclusive being a part of the same
TB and thus have *nothing* extra between the beginning and commit of the
transaction.


I don't know how to do this. If it's easy to do, please let me know how
(for aarch64 at least, since that's the target I'm using).


It's a simple matter of peeking at the next instruction.

One way is to partially decode the insn before advancing the PC.

 static void disas_a64_insn (CPUARMState *env, DisasContext *s, int num_insns)
 {
uint32_t insn = arm_ldl_code(env, s->pc, s->sctlr_b);
+
+   if (num_insns > 1 && (insn & xxx) == yyy) {
+   /* Start load-exclusive in a new TB.  */
+   s->is_jmp = DISAS_UPDATE;
+   return;
+   }
s->insn = insn;
s->pc += 4;
...


Alternately, store num_insns into DisasContext, and do pc -= 4 in 
disas_ldst_excl.


Actually, the mask check is the only really viable solution, and it needs to 
happen before we do the tcg_gen_insn_start thing.


A couple of other notes, as I've thought about this some more.

If the start and end of the transaction are not in the same TB, the likelihood 
of transaction failure should be very near 100%.  Consider:


  * TB with ldrex ends before the strex.

  * Since the next TB hasn't been built yet, we'll definitely go
through tb_find_physical, through the translator, and through
the tcg compiler.

(a) Which I think we can definitely assume will exhaust any
resources associated with the transaction.
(b) Which will abort the transaction,
(c) Which, with the current code, will retry N times, with
identical results, failing within the compiler each time,
(d) Which, with the current code, will single-step through
to the strex, as you saw.

  * Since we proceed to (d) the first time, we'll never succeed
to create the next TB, so we'll always iterate compilation N
times, resulting in the single-step.

This is probably the real slow-down that you see.

Therefore, we must abort any transaction when we exit tcg-generated code.  Both 
through cpu_exit_loop or through the tcg epilogue.  We should be able to use 
the software controlled bits associated with the abort to tell what kind of 
event lead to the abort.  However, we must bear in mind that (for both x86 and 
ppc at least) we only have an 8-bit abort code.  So we can't pass back a 
pointer, for instance.


We should think about what kinds of limitations we should accept for handling 
ll/sc via transactions.


  * How do we handle unpaired ldrexd / ldxp?  This is used by the compiler,
as it's the only way to perform a double-word atomic load.

This implies that we need some sort of counter, beyond which we stop
trying to succeed via transaction.

  * In order to make normal cmpxchg patterns work, we have to be able to
handle a branch within a ll/sc sequence.  Options:

* Less complex way is to build a TB, including branches, with a max
  of N insns along the branch-not-taken path, searching for the strex.
  But of course this fails to handle legitimate patterns for arm
  (and other ll/sc guests).

  However, gcc code generation will generally annotate the cmpxchg
  failure branch as not-taken, so perhaps this will work well enough
  in practice.

* More complex way is to build a TB, including branches, with a max
  of N insns along *all* paths, searching for the strex.  This runs
  into problems with, among other things, branches crossing pages.

* Most complex way is to somehow get all of the TBs built, and
  linked together, preferably before we even try executing
  (and failing the transaction in) the first TB.


r~



Re: [Qemu-devel] [PATCH] e1000e: remove internal interrupt flag

2016-08-18 Thread Dmitry Fleytman

> On 18 Aug 2016, at 17:15, Cao jin  wrote:
> 
> Commit 66bf7d58 removed internal msi state flag E1000E_USE_MSI, 
> E1000E_USE_MSIX
> is not necessary too, remove it now. And interrupt flag field intr_state also
> can be removed now.
> 
> CC: Dmitry Fleytman 
> CC: Jason Wang 
> CC: Markus Armbruster 
> CC: Marcel Apfelbaum 
> CC: Michael S. Tsirkin 
> CC: Paolo Bonzini 
> Signed-off-by: Cao jin 
> ---
> hw/net/e1000e.c | 8 +---
> 1 file changed, 1 insertion(+), 7 deletions(-)
> 
> diff --git a/hw/net/e1000e.c b/hw/net/e1000e.c
> index 82a7be1..72aad21 100644
> --- a/hw/net/e1000e.c
> +++ b/hw/net/e1000e.c
> @@ -69,7 +69,6 @@ typedef struct E1000EState {
> uint16_t subsys_ven_used;
> uint16_t subsys_used;
> 
> -uint32_t intr_state;
> bool disable_vnet;
> 
> E1000ECore core;
> @@ -89,8 +88,6 @@ typedef struct E1000EState {
> #define E1000E_MSIX_TABLE   (0x)
> #define E1000E_MSIX_PBA (0x2000)
> 
> -#define E1000E_USE_MSIXBIT(0)
> -
> static uint64_t
> e1000e_mmio_read(void *opaque, hwaddr addr, unsigned size)
> {
> @@ -302,8 +299,6 @@ e1000e_init_msix(E1000EState *s)
> } else {
> if (!e1000e_use_msix_vectors(s, E1000E_MSIX_VEC_NUM)) {
> msix_uninit(d, >msix, >msix);
> -} else {
> -s->intr_state |= E1000E_USE_MSIX;

So you are removing error handling here. But what if e1000e_use_msix_vectors 
fails?

> }
> }
> }
> @@ -311,7 +306,7 @@ e1000e_init_msix(E1000EState *s)
> static void
> e1000e_cleanup_msix(E1000EState *s)
> {
> -if (s->intr_state & E1000E_USE_MSIX) {
> +if (msix_enabled(PCI_DEVICE(s))) {
> e1000e_unuse_msix_vectors(s, E1000E_MSIX_VEC_NUM);
> msix_uninit(PCI_DEVICE(s), >msix, >msix);
> }
> @@ -601,7 +596,6 @@ static const VMStateDescription e1000e_vmstate = {
> VMSTATE_MSIX(parent_obj, E1000EState),
> 
> VMSTATE_UINT32(ioaddr, E1000EState),
> -VMSTATE_UINT32(intr_state, E1000EState),
> VMSTATE_UINT32(core.rxbuf_min_shift, E1000EState),
> VMSTATE_UINT8(core.rx_desc_len, E1000EState),
> VMSTATE_UINT32_ARRAY(core.rxbuf_sizes, E1000EState,
> -- 
> 2.1.0
> 
> 
> 



Re: [Qemu-devel] [PATCH v5] fpu: add mechanism to check for invalid long double formats

2016-08-18 Thread Peter Maydell
On 17 August 2016 at 01:14, Andrew Dutcher  wrote:
> All operations that take a floatx80 as an operand need to have their
> inputs checked for malformed encodings. In all of these cases, use the
> function floatx80_invalid_encoding to perform the check. If an invalid
> operand is found, raise an invalid operation exception, and then return
> either NaN (for fp-typed results) or the integer indefinite value (the
> minimum representable signed integer value, for int-typed results).
>
> For the non-quiet comparison operations, this touches adjacent code in
> order to pass style checks.
>
> Signed-off-by: Andrew Dutcher 

This version looks good -- thanks for your effort in working
through the code review process.

Reviewed-by: Peter Maydell 

and I'll make sure it gets committed once the trunk reopens
after the 2.7 release.

thanks
-- PMM



Re: [Qemu-devel] [PATCH] 9pfs: add check for relative path

2016-08-18 Thread Greg Kurz
On Thu, 11 Aug 2016 12:01:46 +0530
"Aneesh Kumar K.V"  wrote:

> P J P  writes:
> 
> > From: Prasad J Pandit 
> >
> > At various places in 9pfs back-end, it creates full path by
> > concatenating two path strings. It could lead to a path
> > traversal issue if one of the parameter was a relative path.
> > Add check to avoid it.
> >
> > Reported-by: Felix Wilhelm 
> > Signed-off-by: Prasad J Pandit   
> 
> I am not sure relative path names need to be completely disallowed. What

The official linux client does not send relative paths: all ".." path
components are resolved in the VFS layer IIUC.

> we need is to disallow the access outside export path. virtfs-proxy was
> done primarily to handle such things. 

My understanding is that the virtfs proxy was created to handle the case when
QEMU is running as non-root but root privilege is needed: to be able to call
lchown() or chmod() on any uid/gid when using the passthrough security model
for example.

> It does a chroot to the export path. 
> 

Yes it does since it is running as root and isn't supposed to access
anything outside the 9p mount path. But that doesn't help here since
the issue affects 9p-local, which is run by the QEMU process.

> 
> > ---
> >  hw/9pfs/9p-local.c | 31 +++
> >  1 file changed, 27 insertions(+), 4 deletions(-)
> >
> > diff --git a/hw/9pfs/9p-local.c b/hw/9pfs/9p-local.c
> > index 3f271fc..c20331a 100644
> > --- a/hw/9pfs/9p-local.c
> > +++ b/hw/9pfs/9p-local.c
> > @@ -493,6 +493,9 @@ static int local_mknod(FsContext *fs_ctx, V9fsPath 
> > *dir_path,
> >  char *buffer = NULL;
> >  
> >  v9fs_string_init();
> > +if (strstr(name, "../")) {
> > +return err;
> > +}
> >  v9fs_string_sprintf(, "%s/%s", dir_path->data, name);
> >  path = fullname.data;
> >  
> > @@ -554,6 +557,9 @@ static int local_mkdir(FsContext *fs_ctx, V9fsPath 
> > *dir_path,
> >  char *buffer = NULL;
> >  
> >  v9fs_string_init();
> > +if (strstr(name, "../")) {
> > +return err;
> > +}
> >  v9fs_string_sprintf(, "%s/%s", dir_path->data, name);
> >  path = fullname.data;
> >  
> > @@ -663,6 +669,9 @@ static int local_open2(FsContext *fs_ctx, V9fsPath 
> > *dir_path, const char *name,
> >  flags |= O_NOFOLLOW;
> >  
> >  v9fs_string_init();
> > +if (strstr(name, "../")) {
> > +return err;
> > +}
> >  v9fs_string_sprintf(, "%s/%s", dir_path->data, name);
> >  path = fullname.data;
> >  
> > @@ -734,6 +743,9 @@ static int local_symlink(FsContext *fs_ctx, const char 
> > *oldpath,
> >  char *buffer = NULL;
> >  
> >  v9fs_string_init();
> > +if (strstr(name, "../")) {
> > +return err;
> > +}
> >  v9fs_string_sprintf(, "%s/%s", dir_path->data, name);
> >  newpath = fullname.data;
> >  
> > @@ -830,11 +842,14 @@ out:
> >  static int local_link(FsContext *ctx, V9fsPath *oldpath,
> >V9fsPath *dirpath, const char *name)
> >  {
> > -int ret;
> > +int ret = -1;
> >  V9fsString newpath;
> >  char *buffer, *buffer1;
> >  
> >  v9fs_string_init();
> > +if (strstr(name, "../")) {
> > +return ret;
> > +}
> >  v9fs_string_sprintf(, "%s/%s", dirpath->data, name);
> >  
> >  buffer = rpath(ctx, oldpath->data);
> > @@ -1059,6 +1074,9 @@ static int local_lremovexattr(FsContext *ctx, 
> > V9fsPath *fs_path,
> >  static int local_name_to_path(FsContext *ctx, V9fsPath *dir_path,
> >const char *name, V9fsPath *target)
> >  {
> > +if (strstr(name, "../")) {
> > +return -1;
> > +}
> >  if (dir_path) {
> >  v9fs_string_sprintf((V9fsString *)target, "%s/%s",
> >  dir_path->data, name);
> > @@ -1074,12 +1092,15 @@ static int local_renameat(FsContext *ctx, V9fsPath 
> > *olddir,
> >const char *old_name, V9fsPath *newdir,
> >const char *new_name)
> >  {
> > -int ret;
> > +int ret = -1;
> >  V9fsString old_full_name, new_full_name;
> >  
> >  v9fs_string_init(_full_name);
> >  v9fs_string_init(_full_name);
> >  
> > +if (strstr(old_name, "../") || strstr(new_name, "../")) {
> > +return ret;
> > +}
> >  v9fs_string_sprintf(_full_name, "%s/%s", olddir->data, old_name);
> >  v9fs_string_sprintf(_full_name, "%s/%s", newdir->data, new_name);
> >  
> > @@ -1092,12 +1113,14 @@ static int local_renameat(FsContext *ctx, V9fsPath 
> > *olddir,
> >  static int local_unlinkat(FsContext *ctx, V9fsPath *dir,
> >const char *name, int flags)
> >  {
> > -int ret;
> > +int ret = -1;
> >  V9fsString fullname;
> >  char *buffer;
> >  
> >  v9fs_string_init();
> > -
> > +if (strstr(name, "../")) {
> > +return ret;
> > +}
> >  

Re: [Qemu-devel] [PATCH] test-logging: don't hard-code paths in /tmp

2016-08-18 Thread Sascha Silbe
Dear Peter,

Peter Maydell  writes:

> Are you planning to send a v2 of this patch? I was hoping we could
> fix the non-deleted logfiles for qemu 2.7.0 but it's getting a bit
> late in the cycle...

I'll try cooking up a version that's good enough for 2.7. I expected it
to land in 2.8 so I wasn't in a hurry.

Sascha
-- 
Softwareentwicklung Sascha Silbe, Niederhofenstraße 5/1, 71229 Leonberg
https://se-silbe.de/
USt-IdNr. DE281696641




Re: [Qemu-devel] [RFC PATCH v2 1/2] utils: Add helper to read arm MIDR_EL1 register

2016-08-18 Thread Peter Maydell
On 18 August 2016 at 15:46, Richard Henderson  wrote:
> On 08/18/2016 07:14 AM, Peter Maydell wrote:
>> While we're on the subject, can somebody explain to me why we
>> use ifuncs at all? I couldn't work out why it would be better than
>> just using a straightforward function pointer -- when I tried single
>> stepping through things the ifunc approach still seemed to indirect
>> through some table or other so it wasn't actually resolving to
>> a direct function call anyway.

> No reason, I suppose.
>
> It's particularly helpful for libraries, where we don't really want the
> overhead of the initialization when it's not used.

Ah, I see.

> But (1) we don't have many of these and (2) we really don't care *that* much
> about startup time.
>
> So a simple function pointer initialized by a constructor has the same
> effect.

That seems like it would be a worthwhile change since
(a) I think it's easier to understand than ifunc magic
(b) it means we don't unnecessarily restrict ourselves to a libc
with ifunc support (musl libc doesn't do ifuncs, for instance)

thanks
-- PMM



Re: [Qemu-devel] [RFC PATCH v2 1/2] utils: Add helper to read arm MIDR_EL1 register

2016-08-18 Thread Richard Henderson

On 08/18/2016 07:14 AM, Peter Maydell wrote:

On 18 August 2016 at 15:04, Richard Henderson  wrote:

or (2) ifunc


While we're on the subject, can somebody explain to me why we
use ifuncs at all? I couldn't work out why it would be better than
just using a straightforward function pointer -- when I tried single
stepping through things the ifunc approach still seemed to indirect
through some table or other so it wasn't actually resolving to
a direct function call anyway.


No reason, I suppose.

It's particularly helpful for libraries, where we don't really want the 
overhead of the initialization when it's not used.


But (1) we don't have many of these and (2) we really don't care *that* much 
about startup time.


So a simple function pointer initialized by a constructor has the same effect.


r~



Re: [Qemu-devel] QEMU TCG issue when executing UEFI

2016-08-18 Thread Peter Maydell
On 18 August 2016 at 15:15, Ard Biesheuvel  wrote:
> On 18 August 2016 at 16:10, Peter Maydell  wrote:
>> On 16 August 2016 at 13:08, Ard Biesheuvel  wrote:
>>> Bad ram pointer 0x54
>>> Aborted (core dumped)
>>
>> So the reason this happens is that get_page_addr_code() doesn't
>> correctly handle the case of the memory region being a
>> ROM that's not in ROMD mode. That is, the flash memory can
>> be either in "reads map directly to guest memory" (normal)
>> mode or "reads are MMIO to a device" (ROMD) mode. QEMU
>> can't execute from devices, so the best case here would
>> be that we print the "Sorry, we can't execute from a device"
>> message and stop execution.
>>
>
> So is there a spurious write somewhere that causes the ROM to switch
> into ROMD mode? Because it executes happily from ROM (until it
> doesn't, of course)

The write that causes us to go into not-ROMD mode is in this block:

0x96ac:  cb000294  sub x20, x20, x0
0x96b0:  f9000a74  str x20, [x19, #16]
0x96b4:  9100627c  add x28, x19, #0x18 (24)
0x96b8:  b9400780  ldr w0, [x28, #4]
0x96bc:  35002cc0  cbnz w0, #+0x598 (addr 0x9c54)

which is executed with

PC=96ac  SP=4007f590
X00=0160 X01=0095 X02=3031424e
X03=1b40
X04=00010b64 X05=0160 X06=0188
X07=4007c268
X08=000149a0 X09=4007fe58 X10=4007f793
X11=0002
X12=707fe07a X13=0002 X14=
X15=
X16= X17= X18=
X19=000149a0
X20=000149a0 X21=000149a0 X22=0001
X23=0160
X24=4007fa24 X25=4007fa38 X26=
X27=00014840
X28=000149a0 X29= X30=9364
PSTATE=23c5 --C- EL1h

so you write 0x14840 to address 0x149b0, which is in the flash.

(This is the last TB we execute, because trying to find the
next one hits the problem of the flash not being in ROMD mode.
So it's the very last thing in the log if you run QEMU with
-d in_asm,out_asm,exec,cpu,int -D /tmp/q.log)

thanks
-- PMM



Re: [Qemu-devel] QEMU TCG issue when executing UEFI

2016-08-18 Thread Peter Maydell
On 16 August 2016 at 13:08, Ard Biesheuvel  wrote:
> Bad ram pointer 0x54
> Aborted (core dumped)

So the reason this happens is that get_page_addr_code() doesn't
correctly handle the case of the memory region being a
ROM that's not in ROMD mode. That is, the flash memory can
be either in "reads map directly to guest memory" (normal)
mode or "reads are MMIO to a device" (ROMD) mode. QEMU
can't execute from devices, so the best case here would
be that we print the "Sorry, we can't execute from a device"
message and stop execution.

Treating the flash device's "return the current status"
bytes as code probably wasn't what you wanted to do anyway :-)

In more detail: when we call get_page_addr_code() for this
address, we notice that there is no TLB entry for it, and
so we call cpu_ldub_code() which is supposed to fill the TLB.
This ends up calling tlb_set_page_with_attrs(), which for a
not-RAM-not-ROMD MR will set the addend to 0 and then OR
TLB_MMIO into the address field (rather than setting the
addend to the right offset to get between the guest
address and the host RAM address). get_page_addr_code()
unfortunately then uses a different condition when it
distinguishes "is this an IO address we can't handle"
from "is this RAM", which means it takes the path for
"treat the addend as the offset between guest and host",
resulting in a completely bogus host address.

thanks
-- PMM



Re: [Qemu-devel] hw/arm/virt: vmstate-static-checker.py results

2016-08-18 Thread Peter Maydell
On 18 August 2016 at 15:00, Andrew Jones  wrote:
> We've recently started versioning mach-virt, v2.6 was the first versioned
> release. As an effort to try and make sure we're doing things right, I
> tried the vmstate-static-checker.py script. I compared a 2.6 machine
> from a QEMU built from the v2.6.0 tag with a 2.6 machine from a QEMU
> built from today's latest pull (5844365fe8). I see lots of errors. I have
> no experience in this area, so I can't even state whether they're truly
> a concern or not. I can say a few things;
>
>  1) Most of the errors look like the same problem. Something is wrong
> with xilinx_spi state, which shows up everywhere. Here's an example
>
> Section "en25q64", Description "xilinx_spi": expected field 
> "nonvolatile_cfg", got "cur_addr"; skipping rest

Well, something here is weird, because en25q64 and nonvolatile_cfg
aren't part of xilinx_spi at all, they're in hw/block/m25p80.c.

However we don't care about migration compatibility in the Xilinx
boards at all, so the simple fix is just not to try to test them.
Similarly, aspeed and imx are boards where we're not trying to
preserve migration compat.

>  2) Several of the remaining problems are also present on a check of the
> x86_64 pc-i440fx-2.6 machine type. To be precise
>
> Section "am53c974", Description "esp": expected field "cmdlen", got "cmdbuf"; 
> skipping rest
> Section "dc390", Description "esp": expected field "cmdlen", got "cmdbuf"; 
> skipping rest
> Section "e1000-82544gc", Description "e1000": expected field "tx.ipcss", got 
> "tx.props.ipcss"; skipping rest
> Section "e1000-82545em", Description "e1000": expected field "tx.ipcss", got 
> "tx.props.ipcss"; skipping rest
> Section "e1000", Description "e1000": expected field "tx.ipcss", got 
> "tx.props.ipcss"; skipping rest
> Section "esp", Description "esp": expected field "cmdlen", got "cmdbuf"; 
> skipping rest
> Section "rtl8139", Description "rtl8139": expected field "tally_counters", 
> got "tally_counters.TxOk"; skipping rest

Looking at just the e1000 for an example, this is a false positive
in your checker. In commit 093454e2 the struct we're putting the
ipcss/ipcso/etc fields was moved, so:

-VMSTATE_UINT8(tx.ipcss, E1000State),
-VMSTATE_UINT8(tx.ipcso, E1000State),
-VMSTATE_UINT16(tx.ipcse, E1000State),
-VMSTATE_UINT8(tx.tucss, E1000State),
-VMSTATE_UINT8(tx.tucso, E1000State),
-VMSTATE_UINT16(tx.tucse, E1000State),
-VMSTATE_UINT32(tx.paylen, E1000State),
-VMSTATE_UINT8(tx.hdr_len, E1000State),
-VMSTATE_UINT16(tx.mss, E1000State),
+VMSTATE_UINT8(tx.props.ipcss, E1000State),
+VMSTATE_UINT8(tx.props.ipcso, E1000State),
+VMSTATE_UINT16(tx.props.ipcse, E1000State),
+VMSTATE_UINT8(tx.props.tucss, E1000State),
+VMSTATE_UINT8(tx.props.tucso, E1000State),
+VMSTATE_UINT16(tx.props.tucse, E1000State),
+VMSTATE_UINT32(tx.props.paylen, E1000State),
+VMSTATE_UINT8(tx.props.hdr_len, E1000State),
+VMSTATE_UINT16(tx.props.mss, E1000State),

but the on-the-wire format doesn't include the names of the C struct
fields so this isn't a migration break.

> x86 only has three additional messages, which look harmless to me
>
> Section "apic-common" does not exist in dest
> Section "apic" does not exist in dest
> Section "kvm-apic" does not exist in dest
>
>  3) I analyzed one error I saw, and see it should be fine, as the device
> simply went from unmigratable to migratable (for TCG anyway)
>
> Section "arm-gicv3-common" Section "arm-gicv3-common" Description 
> "arm_gicv3": minimum version error: 0 < 1

Yep, that should be fine.

thanks
-- PMM



Re: [Qemu-devel] [Qemu-block] [PATCH] atapi: allow 0 transfer bytes for read_cd command

2016-08-18 Thread Kevin Wolf
Am 18.08.2016 um 11:48 hat Hervé Poussineau geschrieben:
> This fixes Windows NT4 startup when a cdrom is inserted.
> 
> Fixes: 9ef2e93f9b1888c7d0deb4a105149138e6ad2e98
> Signed-off-by: Hervé Poussineau 

Hm, which of the paths in cmd_read_cd() does this hit? Is it the one
that directly calls ide_atapi_cmd_ok() without doing anything?

I think adding NONDATA is okay, but we may need to add explicit
atapi_byte_count_limit() == 0 checks to those paths that do transfer
some data. At least at first sight I'm not sure that
ide_atapi_cmd_read() can handle this.

Kevin

>  hw/ide/atapi.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/ide/atapi.c b/hw/ide/atapi.c
> index 6189675..63312f2 100644
> --- a/hw/ide/atapi.c
> +++ b/hw/ide/atapi.c
> @@ -1289,7 +1289,7 @@ static const struct AtapiCmd {
>  [ 0xad ] = { cmd_read_dvd_structure,CHECK_READY },
>  [ 0xbb ] = { cmd_set_speed, NONDATA },
>  [ 0xbd ] = { cmd_mechanism_status,  0 },
> -[ 0xbe ] = { cmd_read_cd,   CHECK_READY },
> +[ 0xbe ] = { cmd_read_cd,   CHECK_READY | NONDATA },
>  /* [1] handler detects and reports not ready condition itself */
>  };



Re: [Qemu-devel] [PATCH 0/6] hypertrace: Lightweight guest-to-QEMU trace channel

2016-08-18 Thread Luiz Capitulino
On Thu, 18 Aug 2016 14:53:27 +0100
Stefan Hajnoczi  wrote:

> On Thu, Aug 18, 2016 at 12:22:18PM +0200, Lluís Vilanova wrote:
> > Stefan Hajnoczi writes:
> >   
> > > On Fri, Aug 05, 2016 at 06:59:23PM +0200, Lluís Vilanova wrote:  
> > >> The hypertrace channel allows guest code to emit events in QEMU (the 
> > >> host) using
> > >> its tracing infrastructure (see "docs/trace.txt"). This works in both 
> > >> 'system'
> > >> and 'user' modes. That is, hypertrace is to tracing, what hypercalls are 
> > >> to
> > >> system calls.
> > >> 
> > >> You can use this to emit an event on both guest and QEMU (host) traces 
> > >> to easily
> > >> synchronize or correlate them. You could also modify you guest's tracing 
> > >> system
> > >> to emit all events through the hypertrace channel, providing a unified 
> > >> and fully
> > >> synchronized trace log. Another use case is timing the performance of 
> > >> guest code
> > >> when optimizing TCG (QEMU traces have a timestamp).
> > >> 
> > >> See first commit for a full description.
> > >> 
> > >> Signed-off-by: Lluís Vilanova 
> > >> ---  
> >   
> > > CCing Steven Rostedt, Masami Hiramatsu, Luiz Capitulino, and LTTng folks
> > > who have all looked into host/guest tracing solutions.  
> > [...]
> > 
> > Oh, I wasn't aware of that. I'm certainly interested in collaborating.  
> 
> They are working on or have worked on different approaches to host/guest
> tracing.  Unfortunately there isn't an out-of-the-box solution as far as
> I know.

The ftrace solution is documented here:

 https://lists.nongnu.org/archive/html/qemu-devel/2016-03/msg00887.html

This traces the guest and host kernels. It supports merging the guest
and host traces. It's extremely low latency and has helped us to
find several spikes for real-time KVM (we're talking a few to
a dozen microseconds at most).

Now, our stack actually is:

 - Guest app
 - Guest kernel
 - Host kernel
 - QEMU

QEMU already has its own tracing (which I don't know how it works).
If I had to trace the guest app, I'd certainly start off by using
LTTng. Although, we'd have to write a tool to merge and orchestrate
(wooo, cloud buzzword!) all those traces (if that's what one wants).

> It would be nice if there was a documented host/guest tracing approach
> that didn't involve much manual setup and handled most use cases.

I'd volunteer to do it, although it will take me weeks to do it.



Re: [Qemu-devel] [RFC PATCH v2 1/2] utils: Add helper to read arm MIDR_EL1 register

2016-08-18 Thread Peter Maydell
On 18 August 2016 at 15:04, Richard Henderson  wrote:
> or (2) ifunc

While we're on the subject, can somebody explain to me why we
use ifuncs at all? I couldn't work out why it would be better than
just using a straightforward function pointer -- when I tried single
stepping through things the ifunc approach still seemed to indirect
through some table or other so it wasn't actually resolving to
a direct function call anyway.

thanks
-- PMM



Re: [Qemu-devel] QEMU TCG issue when executing UEFI

2016-08-18 Thread Ard Biesheuvel
On 18 August 2016 at 16:10, Peter Maydell  wrote:
> On 16 August 2016 at 13:08, Ard Biesheuvel  wrote:
>> Bad ram pointer 0x54
>> Aborted (core dumped)
>
> So the reason this happens is that get_page_addr_code() doesn't
> correctly handle the case of the memory region being a
> ROM that's not in ROMD mode. That is, the flash memory can
> be either in "reads map directly to guest memory" (normal)
> mode or "reads are MMIO to a device" (ROMD) mode. QEMU
> can't execute from devices, so the best case here would
> be that we print the "Sorry, we can't execute from a device"
> message and stop execution.
>

So is there a spurious write somewhere that causes the ROM to switch
into ROMD mode? Because it executes happily from ROM (until it
doesn't, of course)

> Treating the flash device's "return the current status"
> bytes as code probably wasn't what you wanted to do anyway :-)
>
> In more detail: when we call get_page_addr_code() for this
> address, we notice that there is no TLB entry for it, and
> so we call cpu_ldub_code() which is supposed to fill the TLB.
> This ends up calling tlb_set_page_with_attrs(), which for a
> not-RAM-not-ROMD MR will set the addend to 0 and then OR
> TLB_MMIO into the address field (rather than setting the
> addend to the right offset to get between the guest
> address and the host RAM address). get_page_addr_code()
> unfortunately then uses a different condition when it
> distinguishes "is this an IO address we can't handle"
> from "is this RAM", which means it takes the path for
> "treat the addend as the offset between guest and host",
> resulting in a completely bogus host address.
>

OK, so that sounds like something that should be fixable. But I still
don't understand if the guest code is doing anything wrong.



Re: [Qemu-devel] [PULL for-2.7 0/2] Block patches

2016-08-18 Thread Peter Maydell
On 18 August 2016 at 14:39, Stefan Hajnoczi  wrote:
> The following changes since commit 5844365fe8e5e4598222d276d2af54fd45c7e3d3:
>
>   Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into 
> staging (2016-08-18 10:56:41 +0100)
>
> are available in the git repository at:
>
>   git://github.com/stefanha/qemu.git tags/block-pull-request
>
> for you to fetch changes up to 156af3ac98da24f0155eed18ec546157436d6b2e:
>
>   block: fix possible reorder of flush operations (2016-08-18 14:36:49 +0100)

Applied, thanks.

-- PMM



[Qemu-devel] [PATCH] e1000e: remove internal interrupt flag

2016-08-18 Thread Cao jin
Commit 66bf7d58 removed internal msi state flag E1000E_USE_MSI, E1000E_USE_MSIX
is not necessary too, remove it now. And interrupt flag field intr_state also
can be removed now.

CC: Dmitry Fleytman 
CC: Jason Wang 
CC: Markus Armbruster 
CC: Marcel Apfelbaum 
CC: Michael S. Tsirkin 
CC: Paolo Bonzini 
Signed-off-by: Cao jin 
---
 hw/net/e1000e.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/hw/net/e1000e.c b/hw/net/e1000e.c
index 82a7be1..72aad21 100644
--- a/hw/net/e1000e.c
+++ b/hw/net/e1000e.c
@@ -69,7 +69,6 @@ typedef struct E1000EState {
 uint16_t subsys_ven_used;
 uint16_t subsys_used;
 
-uint32_t intr_state;
 bool disable_vnet;
 
 E1000ECore core;
@@ -89,8 +88,6 @@ typedef struct E1000EState {
 #define E1000E_MSIX_TABLE   (0x)
 #define E1000E_MSIX_PBA (0x2000)
 
-#define E1000E_USE_MSIXBIT(0)
-
 static uint64_t
 e1000e_mmio_read(void *opaque, hwaddr addr, unsigned size)
 {
@@ -302,8 +299,6 @@ e1000e_init_msix(E1000EState *s)
 } else {
 if (!e1000e_use_msix_vectors(s, E1000E_MSIX_VEC_NUM)) {
 msix_uninit(d, >msix, >msix);
-} else {
-s->intr_state |= E1000E_USE_MSIX;
 }
 }
 }
@@ -311,7 +306,7 @@ e1000e_init_msix(E1000EState *s)
 static void
 e1000e_cleanup_msix(E1000EState *s)
 {
-if (s->intr_state & E1000E_USE_MSIX) {
+if (msix_enabled(PCI_DEVICE(s))) {
 e1000e_unuse_msix_vectors(s, E1000E_MSIX_VEC_NUM);
 msix_uninit(PCI_DEVICE(s), >msix, >msix);
 }
@@ -601,7 +596,6 @@ static const VMStateDescription e1000e_vmstate = {
 VMSTATE_MSIX(parent_obj, E1000EState),
 
 VMSTATE_UINT32(ioaddr, E1000EState),
-VMSTATE_UINT32(intr_state, E1000EState),
 VMSTATE_UINT32(core.rxbuf_min_shift, E1000EState),
 VMSTATE_UINT8(core.rx_desc_len, E1000EState),
 VMSTATE_UINT32_ARRAY(core.rxbuf_sizes, E1000EState,
-- 
2.1.0






Re: [Qemu-devel] hw/arm/virt: vmstate-static-checker.py results

2016-08-18 Thread Andrew Jones
On Thu, Aug 18, 2016 at 04:00:14PM +0200, Andrew Jones wrote:
> Hi all,
> 
> We've recently started versioning mach-virt, v2.6 was the first versioned
> release. As an effort to try and make sure we're doing things right, I
> tried the vmstate-static-checker.py script. I compared a 2.6 machine
> from a QEMU built from the v2.6.0 tag with a 2.6 machine from a QEMU
> built from today's latest pull (5844365fe8). I see lots of errors. I have
> no experience in this area, so I can't even state whether they're truly
> a concern or not. I can say a few things;
> 
>  1) Most of the errors look like the same problem. Something is wrong
> with xilinx_spi state, which shows up everywhere. Here's an example
> 
> Section "en25q64", Description "xilinx_spi": expected field 
> "nonvolatile_cfg", got "cur_addr"; skipping rest
> 
>  2) Several of the remaining problems are also present on a check of the
> x86_64 pc-i440fx-2.6 machine type. To be precise
> 
> Section "am53c974", Description "esp": expected field "cmdlen", got "cmdbuf"; 
> skipping rest
> Section "dc390", Description "esp": expected field "cmdlen", got "cmdbuf"; 
> skipping rest
> Section "e1000-82544gc", Description "e1000": expected field "tx.ipcss", got 
> "tx.props.ipcss"; skipping rest
> Section "e1000-82545em", Description "e1000": expected field "tx.ipcss", got 
> "tx.props.ipcss"; skipping rest
> Section "e1000", Description "e1000": expected field "tx.ipcss", got 
> "tx.props.ipcss"; skipping rest
> Section "esp", Description "esp": expected field "cmdlen", got "cmdbuf"; 
> skipping rest
> Section "rtl8139", Description "rtl8139": expected field "tally_counters", 
> got "tally_counters.TxOk"; skipping rest
> 
> x86 only has three additional messages, which look harmless to me
> 
> Section "apic-common" does not exist in dest
> Section "apic" does not exist in dest
> Section "kvm-apic" does not exist in dest
> 
>  3) I analyzed one error I saw, and see it should be fine, as the device
> simply went from unmigratable to migratable (for TCG anyway)
> 
> Section "arm-gicv3-common" Section "arm-gicv3-common" Description 
> "arm_gicv3": minimum version error: 0 < 1
> 
> 
> Any help with this would be appreciated. I probably won't be looking into
> it myself, at least not any time soon. So, IOW, this mail is really just a
> bug report, not a progress report :-)
> 
> Steps I did and full output attached.

Sigh... forgot to actually attach. Am now.

> 
> Thanks,
> drew
> 

# Run with build of v2.6.0 tag
x86_64-softmmu/qemu-system-x86_64 -M pc-i440fx-2.6 -dump-vmstate x86_64.v2.6
aarch64-softmmu/qemu-system-aarch64 -M virt-2.6 -dump-vmstate aarch64.v2.6

# Run with latest QEMU build
x86_64-softmmu/qemu-system-x86_64 -M pc-i440fx-2.6 -dump-vmstate x86_64.latest
aarch64-softmmu/qemu-system-aarch64 -M virt-2.6 -dump-vmstate aarch64.latest

# x86_64 pc static checker output
../qemu/scripts/vmstate-static-checker.py -s x86_64.v2.6 -d x86_64.latest
Section "e1000-82544gc", Description "e1000": expected field "tx.ipcss", got 
"tx.props.ipcss"; skipping rest
Section "rtl8139", Description "rtl8139": expected field "tally_counters", got 
"tally_counters.TxOk"; skipping rest
Section "esp", Description "esp": expected field "cmdlen", got "cmdbuf"; 
skipping rest
Section "dc390", Description "esp": expected field "cmdlen", got "cmdbuf"; 
skipping rest
Section "e1000-82545em", Description "e1000": expected field "tx.ipcss", got 
"tx.props.ipcss"; skipping rest
Section "apic-common" does not exist in dest
Section "apic" does not exist in dest
Section "am53c974", Description "esp": expected field "cmdlen", got "cmdbuf"; 
skipping rest
Section "kvm-apic" does not exist in dest
Section "e1000", Description "e1000": expected field "tx.ipcss", got 
"tx.props.ipcss"; skipping rest

# AArch64 mach-virt static checker output
# Note1: The 'grep -v' is used to avoid over 100 lines like
#  Section "", Description "xilinx_spi": expected field 
"nonvolatile_cfg", got "cur_addr"; skipping rest
# which all seem to be the same issue. Unfiltered output below.
# Note2: I think the arm-gicv3-common error should be fine, as that
# device simply went from unmigratable to migratable (for TCG).
../qemu/scripts/vmstate-static-checker.py -s aarch64.v2.6 -d aarch64.latest | 
grep -v xilinx_spi
Section "e1000-82544gc", Description "e1000": expected field "tx.ipcss", got 
"tx.props.ipcss"; skipping rest
Section "aspeed.timer" Description "aspeed.timer": minimum version error: 1 < 2
Section "aspeed.timer" Description "aspeed.timer": version error: 1 > 0
Section "aspeed.timer", Description "aspeed.timer", Field "timer": missing 
description
Section "esp", Description "esp": expected field "cmdlen", got "cmdbuf"; 
skipping rest
Section "rtl8139", Description "rtl8139": expected field "tally_counters", got 
"tally_counters.TxOk"; skipping rest
Section "dc390", Description "esp": expected field "cmdlen", got "cmdbuf"; 
skipping rest
Section "imx.fec" Section "imx.fec" Description 

Re: [Qemu-devel] [RFC PATCH v2 1/2] utils: Add helper to read arm MIDR_EL1 register

2016-08-18 Thread Richard Henderson

On 08/18/2016 02:39 AM, Paolo Bonzini wrote:



On 18/08/2016 11:01, Vijay Kilari wrote:

On Thu, Aug 18, 2016 at 2:20 PM, Paolo Bonzini  wrote:



On 18/08/2016 09:56, Vijay Kilari wrote:

The get_aarch_cpu_id() has check " if (unlikely(!cpu_info_read)) ".
If we call get_aarch_cpu_id() from is_thunderx_pass2_cpu() which is
called from inside the loop, we will be adding one additional check.


On the other hand, you are making an assumption that the caller of
is_thunderx_pass2_cpu() calls get_aarch64_cpu_id() first, and not
documenting it anywhere.

And given that you shouldn't call _any_ function from inside such a hot
loop, your solution is inferior on both counts.


Yes, but I could not think of better way to get rid of this check.


bool need_aa64_prefetch = is_thunderx_pass2();
for (...) {
 if (need_aa64_prefetch) {
 ...
 }
}

The check on cpu_info_read is done just once.


Supposing a check is required at all, this is still inferior to either

(1) If completely outside the loop,

  if (is_thunderx_pass2()) {
  for (...)
...
  } else {
  for (...)
  }

or (2) ifunc, so that we only check once, not every invocation.


r~



[Qemu-devel] hw/arm/virt: vmstate-static-checker.py results

2016-08-18 Thread Andrew Jones
Hi all,

We've recently started versioning mach-virt, v2.6 was the first versioned
release. As an effort to try and make sure we're doing things right, I
tried the vmstate-static-checker.py script. I compared a 2.6 machine
from a QEMU built from the v2.6.0 tag with a 2.6 machine from a QEMU
built from today's latest pull (5844365fe8). I see lots of errors. I have
no experience in this area, so I can't even state whether they're truly
a concern or not. I can say a few things;

 1) Most of the errors look like the same problem. Something is wrong
with xilinx_spi state, which shows up everywhere. Here's an example

Section "en25q64", Description "xilinx_spi": expected field "nonvolatile_cfg", 
got "cur_addr"; skipping rest

 2) Several of the remaining problems are also present on a check of the
x86_64 pc-i440fx-2.6 machine type. To be precise

Section "am53c974", Description "esp": expected field "cmdlen", got "cmdbuf"; 
skipping rest
Section "dc390", Description "esp": expected field "cmdlen", got "cmdbuf"; 
skipping rest
Section "e1000-82544gc", Description "e1000": expected field "tx.ipcss", got 
"tx.props.ipcss"; skipping rest
Section "e1000-82545em", Description "e1000": expected field "tx.ipcss", got 
"tx.props.ipcss"; skipping rest
Section "e1000", Description "e1000": expected field "tx.ipcss", got 
"tx.props.ipcss"; skipping rest
Section "esp", Description "esp": expected field "cmdlen", got "cmdbuf"; 
skipping rest
Section "rtl8139", Description "rtl8139": expected field "tally_counters", got 
"tally_counters.TxOk"; skipping rest

x86 only has three additional messages, which look harmless to me

Section "apic-common" does not exist in dest
Section "apic" does not exist in dest
Section "kvm-apic" does not exist in dest

 3) I analyzed one error I saw, and see it should be fine, as the device
simply went from unmigratable to migratable (for TCG anyway)

Section "arm-gicv3-common" Section "arm-gicv3-common" Description "arm_gicv3": 
minimum version error: 0 < 1


Any help with this would be appreciated. I probably won't be looking into
it myself, at least not any time soon. So, IOW, this mail is really just a
bug report, not a progress report :-)

Steps I did and full output attached.

Thanks,
drew



Re: [Qemu-devel] [PATCH 0/6] hypertrace: Lightweight guest-to-QEMU trace channel

2016-08-18 Thread Stefan Hajnoczi
On Thu, Aug 18, 2016 at 12:22:18PM +0200, Lluís Vilanova wrote:
> Stefan Hajnoczi writes:
> 
> > On Fri, Aug 05, 2016 at 06:59:23PM +0200, Lluís Vilanova wrote:
> >> The hypertrace channel allows guest code to emit events in QEMU (the host) 
> >> using
> >> its tracing infrastructure (see "docs/trace.txt"). This works in both 
> >> 'system'
> >> and 'user' modes. That is, hypertrace is to tracing, what hypercalls are to
> >> system calls.
> >> 
> >> You can use this to emit an event on both guest and QEMU (host) traces to 
> >> easily
> >> synchronize or correlate them. You could also modify you guest's tracing 
> >> system
> >> to emit all events through the hypertrace channel, providing a unified and 
> >> fully
> >> synchronized trace log. Another use case is timing the performance of 
> >> guest code
> >> when optimizing TCG (QEMU traces have a timestamp).
> >> 
> >> See first commit for a full description.
> >> 
> >> Signed-off-by: Lluís Vilanova 
> >> ---
> 
> > CCing Steven Rostedt, Masami Hiramatsu, Luiz Capitulino, and LTTng folks
> > who have all looked into host/guest tracing solutions.
> [...]
> 
> Oh, I wasn't aware of that. I'm certainly interested in collaborating.

They are working on or have worked on different approaches to host/guest
tracing.  Unfortunately there isn't an out-of-the-box solution as far as
I know.

It would be nice if there was a documented host/guest tracing approach
that didn't involve much manual setup and handled most use cases.

Stefan


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH v5 13/14] nbd: Implement NBD_CMD_WRITE_ZEROES on server

2016-08-18 Thread Paolo Bonzini


On 18/08/2016 15:50, Vladimir Sementsov-Ogievskiy wrote:
> 
> from nbd proto.md:
> 
> "Finally, it SHOULD return |EPERM| if it receives a write or trim
> request on a read-only export."
> 
> And EROFS is not mentioned in proto.md
> 
> (however the same bug is in NBD_CMD_WRITE case.)

system_errno_to_nbd_errno takes care of fixing up the return value.

Paolo



Re: [Qemu-devel] [Qemu-arm] [PATCH v2] hw/vfio/platform: Add Qualcomm Technologies, Inc HIDMA device support

2016-08-18 Thread Sinan Kaya
On 8/18/2016 5:37 AM, Auger Eric wrote:
> Some general comments:
> - I preferred the previous series organization where we had the creation
> of the VFIO device first and its sysbus-fdt dynamic instantiation in a
> separate patch.
> 
> Peter requested sysbus-fdt stops growing and advised to split the fine
> into generic helpers and specific dt creation functions in separate
> files. This sounds the right moment to do it with looming new VFIO devices.
> 
> (*) Also I am now reconsidering the relevance of creating specific VFIO
> devices per compat string. At the begining of VFIO QEMU integration
> history we made that choice, advised by Alex (Graf), considering the
> QEMU VFIO device could not be generic. With a little more experience now
> we could see the specialization is currently done in the dt creation
> function (sysbus-fdt) and in the kernel reset module. So I would now
> advocate using a non abstract base VFIO device that could be
> instantiated passing its compat string as property. Creating
> hw/vfio/qcom-hidma.c and include/hw/vfio/vfio-qcom-hidma.h then would
> become useless. Alex, what is your feeling now?
> 
> I think the split of sysbus-fdt and (*) - if approoved -  shall be
> considered before introducing a new QEMU VFIO device. Are you willing to
> work on it?

I really like generic device approach. It would save a ton of burden from
developers.

If somebody needs a specialized device, they can do so. For simple objects
that just needs a compat string, it doesn't make sense to create a driver
every single time.

Similarly, there is generic PCI host driver in the Linux kernel that works
for systems that is CAM or ECAM compliant. If somebody needs to do more,
they can implement their own PCI host driver too. 

We are really looking for the same model.

I'm interested in virtualization of platform SATA and HIDMA devices. The
only difference between these are the compatible strings. 

Why submit a new SATA AHCI driver for QEMU just to set the compat string?

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux 
Foundation Collaborative Project.



Re: [Qemu-devel] Help: Does Qemu support virtio-pci for net-device and disk device?

2016-08-18 Thread Andrea Bolognani
On Thu, 2016-08-18 at 20:43 +0800, Kevin Zhao wrote:
> What's the minimum version of  Qemu that support virito-1.0?
> Does Qemu 2.6 works? 

2.6 definitely has virtio 1.0 support, however libvirt does
not yet allow you to control whether a device uses 0.9, 1.0
or both. The default for 2.6 should be both IIRC.

> Now I will manually add the slots and bus to pcie. Because
> I am not familiar with it,  if it convenient, could you give
> me an available xml file which PCIE disk and PCIE
> net device can work for machine virt ?

The XML you're looking for is at the end of this message.

Note that a Fedora 24 guest configured this way will not
boot at all if the machine type is virt-2.6; on the other
hand, an identically-configured RHEL 7.3 guest will boot
even with virt-2.6, but both the disk and the network
adapter will be legacy PCI instead of PCIe.



  abologna-f24
  f6d0428b-a034-4c4e-8ef2-f12f6aa9cab0
  2097152
  2097152
  4
  
hvm
/usr/share/AAVMF/AAVMF_CODE.fd
/var/lib/libvirt/qemu/nvram/abologna-f24_VARS.fd

  
  

  
  
  
  destroy
  restart
  restart
  
/usr/libexec/abologna-qemu-kvm

  
  
  
  



  
  
  


  
  
  


  


  
  
  
  
  


  


  


  
  
  

  


-- 
Andrea Bolognani / Red Hat / Virtualization



Re: [Qemu-devel] [PATCH v5 13/14] nbd: Implement NBD_CMD_WRITE_ZEROES on server

2016-08-18 Thread Vladimir Sementsov-Ogievskiy

On 19.07.2016 07:08, Eric Blake wrote:

Upstream NBD protocol recently added the ability to efficiently
write zeroes without having to send the zeroes over the wire,
along with a flag to control whether the client wants a hole.

Signed-off-by: Eric Blake 

---
v4: rebase, fix value for constant
v3: abandon NBD_CMD_CLOSE extension, rebase to use blk_pwrite_zeroes
---
  include/block/nbd.h |  8 ++--
  nbd/server.c| 42 --
  2 files changed, 46 insertions(+), 4 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index fc4426c..e23ef73 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -69,6 +69,7 @@ struct nbd_reply {
  #define NBD_FLAG_SEND_FUA   (1 << 3)/* Send FUA (Force Unit 
Access) */
  #define NBD_FLAG_ROTATIONAL (1 << 4)/* Use elevator algorithm - 
rotational media */
  #define NBD_FLAG_SEND_TRIM  (1 << 5)/* Send TRIM (discard) */
+#define NBD_FLAG_SEND_WRITE_ZEROES (1 << 6) /* Send WRITE_ZEROES */

  /* New-style handshake (global) flags, sent from server to client, and
 control what will happen during handshake phase. */
@@ -94,7 +95,8 @@ struct nbd_reply {
  #define NBD_REP_ERR_SHUTDOWNNBD_REP_ERR(7)  /* Server shutting down */

  /* Request flags, sent from client to server during transmission phase */
-#define NBD_CMD_FLAG_FUA(1 << 0)
+#define NBD_CMD_FLAG_FUA(1 << 0) /* 'force unit access' during write */
+#define NBD_CMD_FLAG_NO_HOLE(1 << 1) /* don't punch hole on zero run */

  /* Supported request types */
  enum {
@@ -102,7 +104,9 @@ enum {
  NBD_CMD_WRITE = 1,
  NBD_CMD_DISC = 2,
  NBD_CMD_FLUSH = 3,
-NBD_CMD_TRIM = 4
+NBD_CMD_TRIM = 4,
+/* 5 reserved for failed experiment NBD_CMD_CACHE */
+NBD_CMD_WRITE_ZEROES = 6,
  };

  #define NBD_DEFAULT_PORT  10809
diff --git a/nbd/server.c b/nbd/server.c
index 689636c..3a2fecb 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -610,7 +610,8 @@ static coroutine_fn int nbd_negotiate(NBDClientNewData 
*data)
  char buf[8 + 8 + 8 + 128];
  int rc;
  const uint16_t myflags = (NBD_FLAG_HAS_FLAGS | NBD_FLAG_SEND_TRIM |
-  NBD_FLAG_SEND_FLUSH | NBD_FLAG_SEND_FUA);
+  NBD_FLAG_SEND_FLUSH | NBD_FLAG_SEND_FUA |
+  NBD_FLAG_SEND_WRITE_ZEROES);
  bool oldStyle;
  size_t len;

@@ -1126,11 +1127,17 @@ static ssize_t nbd_co_receive_request(NBDRequest *req,
  rc = request->type == NBD_CMD_WRITE ? -ENOSPC : -EINVAL;
  goto out;
  }
-if (request->flags & ~NBD_CMD_FLAG_FUA) {
+if (request->flags & ~(NBD_CMD_FLAG_FUA | NBD_CMD_FLAG_NO_HOLE)) {
  LOG("unsupported flags (got 0x%x)", request->flags);
  rc = -EINVAL;
  goto out;
  }
+if (request->type != NBD_CMD_WRITE_ZEROES &&
+(request->flags & NBD_CMD_FLAG_NO_HOLE)) {
+LOG("unexpected flags (got 0x%x)", request->flags);
+rc = -EINVAL;
+goto out;
+}

  rc = 0;

@@ -1235,6 +1242,37 @@ static void nbd_trip(void *opaque)
  }
  break;

+case NBD_CMD_WRITE_ZEROES:
+TRACE("Request type is WRITE_ZEROES");
+
+if (exp->nbdflags & NBD_FLAG_READ_ONLY) {
+TRACE("Server is read-only, return error");
+reply.error = EROFS;


from nbd proto.md:

"Finally, it SHOULD return |EPERM| if it receives a write or trim 
request on a read-only export."


And EROFS is not mentioned in proto.md

(however the same bug is in NBD_CMD_WRITE case.)


+goto error_reply;
+}
+
+TRACE("Writing to device");
+
+flags = 0;
+if (request.flags & NBD_CMD_FLAG_FUA) {
+flags |= BDRV_REQ_FUA;
+}
+if (!(request.flags & NBD_CMD_FLAG_NO_HOLE)) {
+flags |= BDRV_REQ_MAY_UNMAP;
+}
+ret = blk_pwrite_zeroes(exp->blk, request.from + exp->dev_offset,
+request.len, flags);
+if (ret < 0) {
+LOG("writing to file failed");
+reply.error = -ret;
+goto error_reply;
+}
+
+if (nbd_co_send_reply(req, , 0) < 0) {
+goto out;
+}
+break;
+
  case NBD_CMD_DISC:
  /* unreachable, thanks to special case in nbd_co_receive_request() */
  abort();



--
Best regards,
Vladimir



[Qemu-devel] [PATCH] slirp: fix segv when init failed

2016-08-18 Thread Marc-André Lureau
Since commit f6c2e66ae8c8a, slirp uses an exit notifier to call
slirp_smb_cleanup. However, if init() failed, the notifier isn't added,
and removing it will fail:

==18447== Invalid write of size 8
==18447==at 0x7EF2B5: notifier_remove (notify.c:32)
==18447==by 0x48E80C: qemu_remove_exit_notifier (vl.c:2661)
==18447==by 0x6A2187: net_slirp_cleanup (slirp.c:134)
==18447==by 0x69419D: qemu_cleanup_net_client (net.c:338)
==18447==by 0x69445B: qemu_del_net_client (net.c:401)
==18447==by 0x6A2B81: net_slirp_init (slirp.c:366)
==18447==by 0x6A4241: net_init_slirp (slirp.c:865)
==18447==by 0x695C6D: net_client_init1 (net.c:1051)
==18447==by 0x695F6E: net_client_init (net.c:1108)
==18447==by 0x696DBA: net_init_netdev (net.c:1498)
==18447==by 0x7F1F99: qemu_opts_foreach (qemu-option.c:1116)
==18447==by 0x696E60: net_init_clients (net.c:1516)
==18447==  Address 0x0 is not stack'd, malloc'd or (recently) free'd

Signed-off-by: Marc-André Lureau 
---
 net/slirp.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/slirp.c b/net/slirp.c
index facc30e..b60893f 100644
--- a/net/slirp.c
+++ b/net/slirp.c
@@ -131,7 +131,9 @@ static void net_slirp_cleanup(NetClientState *nc)
 SlirpState *s = DO_UPCAST(SlirpState, nc, nc);
 
 slirp_cleanup(s->slirp);
-qemu_remove_exit_notifier(>exit_notifier);
+if (s->exit_notifier.notify) {
+qemu_remove_exit_notifier(>exit_notifier);
+}
 slirp_smb_cleanup(s);
 QTAILQ_REMOVE(_stacks, s, entry);
 }
-- 
2.9.0




Re: [Qemu-devel] [PATCH 0/6] hypertrace: Lightweight guest-to-QEMU trace channel

2016-08-18 Thread Luiz Capitulino
On Thu, 18 Aug 2016 11:54:24 +0100
Stefan Hajnoczi  wrote:

> On Fri, Aug 05, 2016 at 06:59:23PM +0200, Lluís Vilanova wrote:
> > The hypertrace channel allows guest code to emit events in QEMU (the host) 
> > using
> > its tracing infrastructure (see "docs/trace.txt"). This works in both 
> > 'system'
> > and 'user' modes. That is, hypertrace is to tracing, what hypercalls are to
> > system calls.
> > 
> > You can use this to emit an event on both guest and QEMU (host) traces to 
> > easily
> > synchronize or correlate them. You could also modify you guest's tracing 
> > system
> > to emit all events through the hypertrace channel, providing a unified and 
> > fully
> > synchronized trace log. Another use case is timing the performance of guest 
> > code
> > when optimizing TCG (QEMU traces have a timestamp).
> > 
> > See first commit for a full description.  

I hate the be the one asking the stupid questions, but what's
the problem this series solves? What are the use-cases? Why
existing solutions don't solve this problem? How does this
compares to existing solutions?

> This tracing approach has a high performance overhead, particularly for
> SMP guests where each trace event requires writing to the global control
> register.  All CPUs will be hammering this register (heavyweight vmexit)
> for each trace event.
> 
> I think the folks CCed on this email all take an asynchronous approach
> to avoid this performance overhead.  Synchronous means taking a VM exit
> for every event.  Asynchronous means writing trace data to a buffer and
> later interleaving guest data with host trace data.
> 
> LTTng Userspace Tracer is an example of the asynchronous approach.  The
> trace data buffers are in shared memory.  The LTTng process can grab
> buffers at appropriate times.
> 
> The ftrace virtio-serial approach has been to splice() the ftrace
> buffers, resulting in efficient I/O.

Yes. However, note that the virtio-serial device is only used to
transfer the tracing data to the host. It has no role in the
tracing process. Also, it's not required. I've been using the
networking and it works OK as long as your tracing data is not big.

> 
> Steven is working on a host/guest solution for trace-cmd.  It is also
> asynchronous.  No new paravirt hardware is needed and it makes me wonder
> whether the hypertrace PCI device is trying to solve the problem at the
> wrong layer.
> 
> If you want to play around with asynchronous tracing, you could start
> with trace/simple.c.  It has a trace buffer that is asynchronously
> written out to file by a dedicated "writer" thread.
> 
> The one case where hypertrace makes sense to me is for -user tracing.
> There QEMU can efficiently interleave guest and QEMU traces, although as
> mentioned in the patch, I don't think the SIGSEGV approach should be
> used.
> 
> I suggest stripping this series down to focus on -user.  Synchronous
> tracing is not a good approach for -system emulation.
> 
> Stefan




Re: [Qemu-devel] errno 13, fopen qemu trace file.

2016-08-18 Thread Daniel P. Berrange
On Thu, Aug 18, 2016 at 01:31:19PM +, Nir Levy wrote:
> Daniel, Thanks for your response.
> 
> But I have succeeded using simpletrace when building libvirt from source file 
> into some/other/dir/install

I can only assume you built libvirt so that qemu runs as root,
instead of an unprivileged user account, and/or have not enabled
selinux for libvirt.

> 
> I am using Fedora22 which does not support lttng.
> at the mather of fact I have installed lttng before and build the kernel 
> modules (2.6.0) still no trace generated.
> Michael Jeanson is working for supplying modules (2.8.0) for the next fedora 
> (25)
> fedora 23 and 24 do support systemtap but not over 22 on my kernel 4.2.8 
> (which we stick to in Asocs due to known performance degradation in later 
> releases)

systemtap is supported in all Fedora releases - if there's any
problems do file bugs against it, as its expected to work as
it is the supported tracing mechanism for Fedora with qemu and
libvirt.


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



[Qemu-devel] [PULL for-2.7 2/2] block: fix possible reorder of flush operations

2016-08-18 Thread Stefan Hajnoczi
From: "Denis V. Lunev" 

This patch reduce CPU usage of flush operations a bit. When we have one
flush completed we should kick only next operation. We should not start
all pending operations in the hope that they will go back to wait on
wait_queue.

Also there is a technical possibility that requests will get reordered
with the previous approach. After wakeup all requests are removed from
the wait queue. They become active and they are processed one-by-one
adding to the wait queue in the same order. Though new flush can arrive
while all requests are not put into the queue.

Signed-off-by: Denis V. Lunev 
Tested-by: Evgeny Yakovlev 
Signed-off-by: Evgeny Yakovlev 
Message-id: 1471457214-3994-3-git-send-email-...@openvz.org
CC: Stefan Hajnoczi 
CC: Fam Zheng 
CC: Kevin Wolf 
CC: Max Reitz 
Signed-off-by: Stefan Hajnoczi 
---
 block/io.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/io.c b/block/io.c
index 9c04086..420944d 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2358,7 +2358,8 @@ out:
 /* Notify any pending flushes that we have completed */
 bs->flushed_gen = current_gen;
 bs->active_flush_req = NULL;
-qemu_co_queue_restart_all(>flush_queue);
+/* Return value is ignored - it's ok if wait queue is empty */
+qemu_co_queue_next(>flush_queue);
 
 tracked_request_end();
 return ret;
-- 
2.7.4




[Qemu-devel] [PULL for-2.7 1/2] block: fix deadlock in bdrv_co_flush

2016-08-18 Thread Stefan Hajnoczi
From: Evgeny Yakovlev 

The following commit
commit 3ff2f67a7c24183fcbcfe1332e5223ac6f96438c
Author: Evgeny Yakovlev 
Date:   Mon Jul 18 22:39:52 2016 +0300
block: ignore flush requests when storage is clean
has introduced a regression.

There is a problem that it is still possible for 2 requests to execute
in non sequential fashion and sometimes this results in a deadlock
when bdrv_drain_one/all are called for BDS with such stalled requests.

1. Current flushed_gen and flush_started_gen is 1.
2. Request 1 enters bdrv_co_flush to with write_gen 1 (i.e. the same
   as flushed_gen). It gets past flushed_gen != flush_started_gen and
   sets flush_started_gen to 1 (again, the same it was before).
3. Request 1 yields somewhere before exiting bdrv_co_flush
4. Request 2 enters bdrv_co_flush with write_gen 2. It gets past
   flushed_gen != flush_started_gen and sets flush_started_gen to 2.
5. Request 2 runs to completion and sets flushed_gen to 2
6. Request 1 is resumed, runs to completion and sets flushed_gen to 1.
   However flush_started_gen is now 2.

>From here on out flushed_gen is always != to flush_started_gen and all
further requests will wait on flush_queue. This change replaces
flush_started_gen with an explicitly tracked active flush request.

Signed-off-by: Evgeny Yakovlev 
Signed-off-by: Denis V. Lunev 
Message-id: 1471457214-3994-2-git-send-email-...@openvz.org
CC: Stefan Hajnoczi 
CC: Fam Zheng 
CC: Kevin Wolf 
CC: Max Reitz 
Signed-off-by: Stefan Hajnoczi 
---
 block/io.c| 5 +++--
 include/block/block_int.h | 2 +-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/block/io.c b/block/io.c
index d5493ba..9c04086 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2283,11 +2283,11 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
 int current_gen = bs->write_gen;
 
 /* Wait until any previous flushes are completed */
-while (bs->flush_started_gen != bs->flushed_gen) {
+while (bs->active_flush_req != NULL) {
 qemu_co_queue_wait(>flush_queue);
 }
 
-bs->flush_started_gen = current_gen;
+bs->active_flush_req = 
 
 /* Write back all layers by calling one driver function */
 if (bs->drv->bdrv_co_flush) {
@@ -2357,6 +2357,7 @@ flush_parent:
 out:
 /* Notify any pending flushes that we have completed */
 bs->flushed_gen = current_gen;
+bs->active_flush_req = NULL;
 qemu_co_queue_restart_all(>flush_queue);
 
 tracked_request_end();
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 47665be..1e939de 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -443,8 +443,8 @@ struct BlockDriverState {
  note this is a reference count */
 
 CoQueue flush_queue;/* Serializing flush queue */
+BdrvTrackedRequest *active_flush_req; /* Flush request in flight */
 unsigned int write_gen; /* Current data generation */
-unsigned int flush_started_gen; /* Generation for which flush has started 
*/
 unsigned int flushed_gen;   /* Flushed write generation */
 
 BlockDriver *drv; /* NULL means no media */
-- 
2.7.4




[Qemu-devel] [PULL for-2.7 0/2] Block patches

2016-08-18 Thread Stefan Hajnoczi
The following changes since commit 5844365fe8e5e4598222d276d2af54fd45c7e3d3:

  Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into 
staging (2016-08-18 10:56:41 +0100)

are available in the git repository at:

  git://github.com/stefanha/qemu.git tags/block-pull-request

for you to fetch changes up to 156af3ac98da24f0155eed18ec546157436d6b2e:

  block: fix possible reorder of flush operations (2016-08-18 14:36:49 +0100)





Denis V. Lunev (1):
  block: fix possible reorder of flush operations

Evgeny Yakovlev (1):
  block: fix deadlock in bdrv_co_flush

 block/io.c| 8 +---
 include/block/block_int.h | 2 +-
 2 files changed, 6 insertions(+), 4 deletions(-)

-- 
2.7.4




Re: [Qemu-devel] errno 13, fopen qemu trace file.

2016-08-18 Thread Stefan Hajnoczi
On Thu, Aug 18, 2016 at 1:58 PM, Nir Levy  wrote:
> I have a progress in tracing qemu,
> I add the thread and tag done for each kvm_ioctl, kvm_vm_ioctl, kvm_vcpu_ioctl
> in purpose of investigating pure hypervisor activity and delays on host.
> the kvm type print only for convenience.
>
> for example:
>
> kvm_ioctl 3106435.230545 pid=11347 thread=11347 type=0xae03 arg=0x25
>
> kvm_ioctl_done 3106435.230546 pid=11347 thread=11347 type=0xae03 arg=0x25 
> diff=1 (KVM_CHECK_EXTENSION)
>
> kvm_vcpu_ioctl 3106435.253930 pid=11347 thread=11354 cpu_index=0x2 
> type=0x4008ae9c arg=0x56417e6cb4f0
>
> kvm_vcpu_ioctl_done 3106435.253931 pid=11347 thread=11354 cpu_index=0x2 
> type=0x4008ae9c arg=0x56417e6cb4f0 diff=1 (KVM_X86_SETUP_MCE)
>
> kvm_vm_ioctl 3106435.268896 pid=11347 thread=11347 type=0x4020ae46 
> arg=0x7ffed97cf9d0
>
> kvm_vm_ioctl_done 3106435.269082 pid=11347 thread=11347 type=0x4020ae46 
> arg=0x7ffed97cf9d0 diff=186 (KVM_SET_USER_MEMORY_REGION)
>
>
> I have notice KVM_RUN can take even seconds but that is probably low priority 
> tasks,(io workers probably)

Please read Linux Documentation/virtual/kvm/api.txt to learn about the
ioctl calls.

KVM_RUN is *the* ioctl that executes guest code.  Unless a vcpu is
halted we should be inside KVM_RUN, so spending time inside this ioctl
is normal.

> but this 186micro second on the main qemu thread is suspicious and might 
> cause application running over vm delays.

By "186micro second" you are referring to KVM_SET_USER_MEMORY_REGION
in the trace above.

Is this ioctl called in the critical path?  I doubt it since the
KVM_X86_SETUP_MCE ioctl in your trace happens during initialization
time from kvm_arch_init_vcpu() and is not in the critical path when
the guest is running.

Why worry about latencies that do not affect running guests?

Stefan



Re: [Qemu-devel] [PATCH 3/6] e1000e: fix for migration compatibility

2016-08-18 Thread Paolo Bonzini


On 18/08/2016 15:25, Cao jin wrote:
> 
> 
> On 08/18/2016 09:04 PM, Paolo Bonzini wrote:
>>
>>
>> On 18/08/2016 15:11, Cao jin wrote:
>>>
>>>
>>> On 08/18/2016 06:47 PM, Paolo Bonzini wrote:


 On 17/08/2016 16:39, Cao jin wrote:
> commit 66bf7d58 removed internal msi state flag E1000E_USE_MSI, but it
> is used by intr_state which exists in vmstate. Restore it for
> migration
> to older QEMU versions
>
> CC: Dmitry Fleytman 
> CC: Jason Wang 
> CC: Markus Armbruster 
> CC: Marcel Apfelbaum 
> CC: Michael S. Tsirkin 
> Signed-off-by: Cao jin 

 Not necessary.  No released version of QEMU had e1000e and lacked
 commit
 66bf7d58.

 Paolo

>>>
>>> Ok, then I will make this patch as separated one and send it out asap,
>>> so maybe it goes in 2.7
>>
>> It's not necessary at all; why would it be useful in 2.7?
>>
>> Paolo
>>
> 
> commit 66bf7d58I already removed E1000E_USE_MSI, so I think maybe I can
> send a patch to remove E1000E_USE_MSIX & intr_state, so there will be no
> migration compatibility issue

Please do, worst case it won't be accepted.

Paolo



Re: [Qemu-devel] [PATCH 3/6] e1000e: fix for migration compatibility

2016-08-18 Thread Cao jin



On 08/18/2016 09:04 PM, Paolo Bonzini wrote:



On 18/08/2016 15:11, Cao jin wrote:



On 08/18/2016 06:47 PM, Paolo Bonzini wrote:



On 17/08/2016 16:39, Cao jin wrote:

commit 66bf7d58 removed internal msi state flag E1000E_USE_MSI, but it
is used by intr_state which exists in vmstate. Restore it for migration
to older QEMU versions

CC: Dmitry Fleytman 
CC: Jason Wang 
CC: Markus Armbruster 
CC: Marcel Apfelbaum 
CC: Michael S. Tsirkin 
Signed-off-by: Cao jin 


Not necessary.  No released version of QEMU had e1000e and lacked commit
66bf7d58.

Paolo



Ok, then I will make this patch as separated one and send it out asap,
so maybe it goes in 2.7


It's not necessary at all; why would it be useful in 2.7?

Paolo



commit 66bf7d58I already removed E1000E_USE_MSI, so I think maybe I can 
send a patch to remove E1000E_USE_MSIX & intr_state, so there will be no 
migration compatibility issue






Re: [Qemu-devel] [PATCH] test-logging: don't hard-code paths in /tmp

2016-08-18 Thread Peter Maydell
On 15 August 2016 at 19:24, Sascha Silbe  wrote:
> Dear Peter,
>
> Peter Maydell  writes:
>
>> On 15 July 2016 at 17:24, Sascha Silbe  wrote:
> [...]
>>> Instead of hard-coding the paths, create a temporary directory using
>>> g_dir_make_tmp() and clean it up afterwards.
>>>
>>> Fixes: f6880b7f ("qemu-log: support simple pid substitution for logs")
>>> Signed-off-by: Sascha Silbe 
>>
>> Thanks for this patch -- I just noticed that the test was leaving
>> temporary files not cleaned up, which brought me to this patch
>> by searching the mail archives...
>
> I have totally forgotten about it. Would probably have remembered the
> next time "make check" failed on a shared machine. ;)

Are you planning to send a v2 of this patch? I was hoping we could
fix the non-deleted logfiles for qemu 2.7.0 but it's getting a bit
late in the cycle...

thanks
-- PMM



Re: [Qemu-devel] errno 13, fopen qemu trace file.

2016-08-18 Thread Daniel P. Berrange
On Thu, Aug 18, 2016 at 12:58:29PM +, Nir Levy wrote:
> Hello everybody,
> 
> I have a progress in tracing qemu,
> I add the thread and tag done for each kvm_ioctl, kvm_vm_ioctl, kvm_vcpu_ioctl
> in purpose of investigating pure hypervisor activity and delays on host.
> the kvm type print only for convenience.
> 
> for example:
> kvm_ioctl 3106435.230545 pid=11347 thread=11347 type=0xae03 arg=0x25
> kvm_ioctl_done 3106435.230546 pid=11347 thread=11347 type=0xae03 arg=0x25 
> diff=1 (KVM_CHECK_EXTENSION)
> kvm_vcpu_ioctl 3106435.253930 pid=11347 thread=11354 cpu_index=0x2 
> type=0x4008ae9c arg=0x56417e6cb4f0
> kvm_vcpu_ioctl_done 3106435.253931 pid=11347 thread=11354 cpu_index=0x2 
> type=0x4008ae9c arg=0x56417e6cb4f0 diff=1 (KVM_X86_SETUP_MCE)
> 
> kvm_vm_ioctl 3106435.268896 pid=11347 thread=11347 type=0x4020ae46 
> arg=0x7ffed97cf9d0
> kvm_vm_ioctl_done 3106435.269082 pid=11347 thread=11347 type=0x4020ae46 
> arg=0x7ffed97cf9d0 diff=186 (KVM_SET_USER_MEMORY_REGION)
> 
> I have notice KVM_RUN can take even seconds but that is probably low priority 
> tasks,(io workers probably)
> but this 186micro second on the main qemu thread is suspicious and might 
> cause application running over vm delays.
> 
> however when I have switch back to installed libvirtd 1.2.13.2
> ( my version has extended timeout for qemu debugging - ver 1.2.21)
> 
> 
> void st_set_trace_file_enabled(bool enable)
> {
> ...
> trace_fp = fopen(trace_file_name, "wb");
> 
> trace_fp returns null and errno is 13 access denied.
> In my installation I did not create cgroup,
> 
> If you have some hint for this issue it will be great.

Libvirt doesn't support use of the simpletrace backend. The security
policy will prevent simpletrace from even creating its output file.

It is recommended to use a dynamic trace backend like dtrace/systemtap/
ltt-ust so that QEMU does not need to directly create trace output
files itself.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



[Qemu-devel] [Bug 1614521] [NEW] -display accepts "none[a-z, 0-9]*" instead of 'none'

2016-08-18 Thread Kai Poeritz
Public bug reported:

When using the '-display' option the parameter 'none' is not the only
string that causes the behaviour of 'none'. I can use '-display
noneMICKEYMOUSE' and still have the none behaviour.

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1614521

Title:
  -display accepts "none[a-z,0-9]*" instead of 'none'

Status in QEMU:
  New

Bug description:
  When using the '-display' option the parameter 'none' is not the only
  string that causes the behaviour of 'none'. I can use '-display
  noneMICKEYMOUSE' and still have the none behaviour.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1614521/+subscriptions



Re: [Qemu-devel] [PATCH v4 0/2] trace: Simplify late initialization

2016-08-18 Thread Stefan Hajnoczi
On Fri, Aug 12, 2016 at 05:33:35PM +0200, Lluís Vilanova wrote:
> Removes the need for 'trace_events_dstate_init' and does a little cleanup in 
> how
> state values are modified (to avoid implicit conversions from bool).
> 
> Changes in v2
> =
> 
> * Fix late-init state value [Daniel P. Berrange].
> 
> Changes in v3
> =
> 
> * Avoid implicit conversions from bool to integers (second patch) [Daniel
>   P. Berrange].
> 
> Changes in v4
> =
> 
> * Correctly implement idempotent state changes.
> * Clarify when state changes are idempotent [Daniel P. Berrange].
> 
> Signed-off-by: Lluís Vilanova 
> ---
> 
> Lluís Vilanova (2):
>   trace: Remove 'trace_events_dstate_init'
>   trace: Avoid implicit bool->integer conversions
> 
> 
>  stubs/trace-control.c  |   22 --
>  trace/control-target.c |   34 --
>  trace/control.c|   23 ++-
>  trace/control.h|3 +++
>  trace/event-internal.h |1 +
>  5 files changed, 66 insertions(+), 17 deletions(-)
> 
> 
> To: qemu-devel@nongnu.org
> Cc: Stefan Hajnoczi 
> Cc: 'Daniel P. Berrange' 
> Cc: Paolo Bonzini 

The Travis CI system is reporting the following failure:

GTESTER check-qtest-aarch64
qemu-system-aarch64: ./trace/control-internal.h:77: 
trace_event_get_state_dynamic: Assertion `trace_event_get_state_static(ev)' 
failed.

https://travis-ci.org/stefanha/qemu/jobs/153235649

Please send a new revision that fixes the issue.


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH 3/6] e1000e: fix for migration compatibility

2016-08-18 Thread Paolo Bonzini


On 18/08/2016 15:11, Cao jin wrote:
> 
> 
> On 08/18/2016 06:47 PM, Paolo Bonzini wrote:
>>
>>
>> On 17/08/2016 16:39, Cao jin wrote:
>>> commit 66bf7d58 removed internal msi state flag E1000E_USE_MSI, but it
>>> is used by intr_state which exists in vmstate. Restore it for migration
>>> to older QEMU versions
>>>
>>> CC: Dmitry Fleytman 
>>> CC: Jason Wang 
>>> CC: Markus Armbruster 
>>> CC: Marcel Apfelbaum 
>>> CC: Michael S. Tsirkin 
>>> Signed-off-by: Cao jin 
>>
>> Not necessary.  No released version of QEMU had e1000e and lacked commit
>> 66bf7d58.
>>
>> Paolo
>>
> 
> Ok, then I will make this patch as separated one and send it out asap,
> so maybe it goes in 2.7

It's not necessary at all; why would it be useful in 2.7?

Paolo



Re: [Qemu-devel] [PATCH 3/6] e1000e: fix for migration compatibility

2016-08-18 Thread Cao jin



On 08/18/2016 06:47 PM, Paolo Bonzini wrote:



On 17/08/2016 16:39, Cao jin wrote:

commit 66bf7d58 removed internal msi state flag E1000E_USE_MSI, but it
is used by intr_state which exists in vmstate. Restore it for migration
to older QEMU versions

CC: Dmitry Fleytman 
CC: Jason Wang 
CC: Markus Armbruster 
CC: Marcel Apfelbaum 
CC: Michael S. Tsirkin 
Signed-off-by: Cao jin 


Not necessary.  No released version of QEMU had e1000e and lacked commit
66bf7d58.

Paolo



Ok, then I will make this patch as separated one and send it out asap, 
so maybe it goes in 2.7


--
Yours Sincerely,

Cao jin





[Qemu-devel] errno 13, fopen qemu trace file.

2016-08-18 Thread Nir Levy
Hello everybody,

I have a progress in tracing qemu,
I add the thread and tag done for each kvm_ioctl, kvm_vm_ioctl, kvm_vcpu_ioctl
in purpose of investigating pure hypervisor activity and delays on host.
the kvm type print only for convenience.

for example:
kvm_ioctl 3106435.230545 pid=11347 thread=11347 type=0xae03 arg=0x25
kvm_ioctl_done 3106435.230546 pid=11347 thread=11347 type=0xae03 arg=0x25 
diff=1 (KVM_CHECK_EXTENSION)
kvm_vcpu_ioctl 3106435.253930 pid=11347 thread=11354 cpu_index=0x2 
type=0x4008ae9c arg=0x56417e6cb4f0
kvm_vcpu_ioctl_done 3106435.253931 pid=11347 thread=11354 cpu_index=0x2 
type=0x4008ae9c arg=0x56417e6cb4f0 diff=1 (KVM_X86_SETUP_MCE)

kvm_vm_ioctl 3106435.268896 pid=11347 thread=11347 type=0x4020ae46 
arg=0x7ffed97cf9d0
kvm_vm_ioctl_done 3106435.269082 pid=11347 thread=11347 type=0x4020ae46 
arg=0x7ffed97cf9d0 diff=186 (KVM_SET_USER_MEMORY_REGION)

I have notice KVM_RUN can take even seconds but that is probably low priority 
tasks,(io workers probably)
but this 186micro second on the main qemu thread is suspicious and might cause 
application running over vm delays.

however when I have switch back to installed libvirtd 1.2.13.2
( my version has extended timeout for qemu debugging - ver 1.2.21)


void st_set_trace_file_enabled(bool enable)
{
...
trace_fp = fopen(trace_file_name, "wb");

trace_fp returns null and errno is 13 access denied.
In my installation I did not create cgroup,

If you have some hint for this issue it will be great.
regards.
Nir Levy
SW Engineer


Nir Levy
SW Engineer

Web: www.asocstech.com |
[cid:image001.jpg@01D1B599.5A2C9530]



Re: [Qemu-devel] Help: Does Qemu support virtio-pci for net-device and disk device?

2016-08-18 Thread Kevin Zhao
Hi All,
Thanks for your all kindly response. Really Great and helpful :-)

On 18 August 2016 at 20:30, Kevin Zhao  wrote:

> Hi Jones:
>Thanks~It is great that Qemu has been working on that :-)
>
> On 18 August 2016 at 00:13, Andrew Jones  wrote:
>
>> On Wed, Aug 17, 2016 at 08:08:11PM +0800, Kevin Zhao wrote:
>> > Hi all,
>> >  Now I'm investigating net device hot plug and disk hotplug for
>> > AArch64. For virtio , the default address is virtio-mmio. After Libvirt
>> > 1.3.5, user can explicitly specify the address-type to pci and so
>> libvirt
>> > will pass the virtio-pci parameters to the Qemu.
>> >  Both my host and guest OS is Debian8, and Qemu version is 2.6.0.
>> > Libvirt version is 1.3.5.
>> >  For net-device, I change the address-type to pci, and libvirt pass
>> the
>> > command below:
>> >  -device
>> > virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:0d:25:
>> 25,bus=pci.2,addr=0x1
>> >
>> >  After booting, the eth0 device disappear(eth0 occur when the
>> address
>> > is virtio-mmio),
>> > but I can find another net-device enp2s1, also it can't work for dhcp:
>> > Running lspci: 02:01.0 Ethernet controller: Red Hat, Inc Virtio network
>> > device
>> > I'm not sure whether it worked.
>> >
>> >  For disk device,* when I change the address-type to pci, the whole
>> > qemu command is :*
>> > https://paste.fedoraproject.org/409553/,  but the VM can not boot
>> > successfully. Does Qemu not support device disk of virtio-pci in AArch64
>> > just as it in X86_64?
>> >  Thanks~Since I am not very familiar with Qemu, really looking
>> forward
>> > to your response.
>> >
>> > Best Regards,
>> > Kevin Zhao
>>
>> libvirt 1.3.5 is a bit old. Later versions no longer unconditionally add
>> the i82801b11 bridge, which was necessary to use PCI devices with the PCIe
>> host bridge mach-virt has. IMO, libvirt and qemu still have a long way to
>> go in order to configure a base/standard mach-virt PCIe machine.
>>
>>
> Yeah, I am changing to libvirt 2.1.0, I find that I should use PCI by
> manually add
> the slots and bus to it.
>
> 1) If we want to support both PCIe devices and PCI, then things are messy.
>>Currently we propose dropping PCI support. mach-virt pretty much
>>exclusively uses virtio, which can be set to PCIe mode (virtio-1.0)
>> 2) root complex ports, switches (upstream/downstream ports) are currently
>>based on Intel parts. Marcel is thinking about creating generic models.
>> 3) libvirt needs to learn how to plug everything together, in proper PCIe
>>fashion, leaving holes for hotplug.
>> 4) Probably more... I forget all the different issues we discovered when
>>we started playing with this a few months ago.
>>
>> The good news is that x86 folk want all the same things for the q35 model.
>> mach-virt enthusiasts like us get to ride along pretty much for free.
>>
>> So, using virtio-pci with mach-virt and libvirt isn't possible right now,
>> not without manual changes to the XML. It might be nice to document how to
>> manually convert a guest, so developers who want to use virtio-pci don't
>> have to abandon libvirt. I'd have to look into that, or ask one of our
>> libvirt friends to help. Certainly the instructions would be for latest
>> libvirt though.
>>
>>
> As you said,  that means that I can use PCIe as the bus for disk and
> net-device.
> I will try using pcie in libvirt. I will try the newest version of
> libvirt.
> Do I need to change  to enable it in AArch64 ?
> The pcie bus will  be automatically assigned in libvirt ?
>

Just to remind that I made a mistake here, as Laine said now just manually
adding pcie
slots can work. He is going on push a patch to libvirt about automatically
assigment.

>
>
>> Finally, FWIW, with a guest kernel of 4.6.4-301.fc24.aarch64. The
>> following qemu command line works for me.
>> (notice the use of PCIe), and my network interface gets labeled enp0s1.
>>
>> $QEMU -machine virt-2.6,accel=kvm -cpu host \
>>  -m 1024 -smp 1 -nographic \
>>  -bios /usr/share/AAVMF/AAVMF_CODE.fd \
>>  -device ioh3420,bus=pcie.0,id=pcie.1,port=1,chassis=1 \
>>  -device ioh3420,bus=pcie.0,id=pcie.2,port=2,chassis=2 \
>>  -device 
>> virtio-scsi-pci,disable-modern=off,disable-legacy=on,bus=pcie.1,addr=00.0,id=scsi0
>> \
>>  -drive file=/home/drjones/.local/libvirt/images/fedora.qcow2,format
>> =qcow2,if=none,id=drive-scsi0-0-0-0 \
>>  -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-sc
>> si0-0-0-0,id=scsi0-0-0-0,bootindex=1 \
>>  -netdev user,id=hostnet0 \
>>  -device virtio-net-pci,disable-modern=off,disable-legacy=on,bus=pcie
>> .2,addr=00.0,netdev=hostnet0,id=net0
>>
>> I prefer always using virtio-scsi for the disk, but a similar command
>> line can be used for a virtio-blk-pci disk.
>>
>> OK great! Because in Openstack Nova ,AArch64 need to realize the hotplug
> only with
> the virtio bus, so investigate the virtio-pci. :-)
> Thanks again!
>
>
>> Thanks,
>> 

Re: [Qemu-devel] [PATCH v17 0/9] 8bit AVR cores

2016-08-18 Thread no-reply
Hi,

Your series seems to have some coding style problems. See output below for
more information:

Message-id: 1471522070-77598-1-git-send-email-mrol...@gmail.com
Subject: [Qemu-devel] [PATCH v17 0/9] 8bit AVR cores
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

# Useful git options
git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git show --no-patch --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag] 
patchew/1471522070-77598-1-git-send-email-mrol...@gmail.com -> 
patchew/1471522070-77598-1-git-send-email-mrol...@gmail.com
Switched to a new branch 'test'
294e7a4 target-avr: adding instruction decoder
3d36738 target-avr: instruction decoder generator
35bf67b target-avr: adding instruction translation
18e3870 target-avr: adding helpers for IN, OUT, SLEEP, WBR & unsupported 
instructions
3ccde73 target-avr: adding AVR interrupt handling
8786607 target-avr: adding instructions encodings
f48b5f2 target-avr: adding a sample AVR board
5a2840e target-avr: adding AVR CPU features/flavors
4e72990 target-avr: AVR cores support is added.

=== OUTPUT BEGIN ===
Checking PATCH 1/9: target-avr: AVR cores support is added
Checking PATCH 2/9: target-avr: adding AVR CPU features/flavors...
Checking PATCH 3/9: target-avr: adding a sample AVR board...
Checking PATCH 4/9: target-avr: adding instructions encodings...
Checking PATCH 5/9: target-avr: adding AVR interrupt handling...
Checking PATCH 6/9: target-avr: adding helpers for IN, OUT, SLEEP, WBR & 
unsupported instructions...
Checking PATCH 7/9: target-avr: adding instruction translation...
Checking PATCH 8/9: target-avr: instruction decoder generator...
ERROR: suspect code indent for conditional statements (4, 4)
#810: FILE: target-avr/cpugen/src/cpugen.cpp:440:
+if (argc != 2) {
[...]
+}

total: 1 errors, 0 warnings, 1225 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 9/9: target-avr: adding instruction decoder...
=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org

[Qemu-devel] [PATCH v17 1/9] target-avr: AVR cores support is added.

2016-08-18 Thread Michael Rolnik
1. basic CPU structure
2. registers
3. no instructions
4. saving sreg, rampD, rampX, rampY, rampD, eind in HW representation

Signed-off-by: Michael Rolnik 
---
 MAINTAINERS |   5 +
 arch_init.c |   2 +
 configure   |   5 +
 default-configs/avr-softmmu.mak |  21 +++
 include/disas/bfd.h |   6 +
 include/sysemu/arch_init.h  |   1 +
 target-avr/Makefile.objs|  24 
 target-avr/cpu-qom.h|  85 
 target-avr/cpu.c| 292 
 target-avr/cpu.h| 180 +
 target-avr/gdbstub.c|  86 
 target-avr/helper.c |  89 
 target-avr/helper.h |  22 +++
 target-avr/machine.c| 115 
 target-avr/translate.c  | 256 +++
 target-avr/translate.h  | 115 
 16 files changed, 1304 insertions(+)
 create mode 100644 default-configs/avr-softmmu.mak
 create mode 100644 target-avr/Makefile.objs
 create mode 100644 target-avr/cpu-qom.h
 create mode 100644 target-avr/cpu.c
 create mode 100644 target-avr/cpu.h
 create mode 100644 target-avr/gdbstub.c
 create mode 100644 target-avr/helper.c
 create mode 100644 target-avr/helper.h
 create mode 100644 target-avr/machine.c
 create mode 100644 target-avr/translate.c
 create mode 100644 target-avr/translate.h

diff --git a/MAINTAINERS b/MAINTAINERS
index b6fb84e..c793e90 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -110,6 +110,11 @@ F: disas/arm.c
 F: disas/arm-a64.cc
 F: disas/libvixl/
 
+AVR
+M: Michael Rolnik 
+S: Maintained
+F: target-avr/
+
 CRIS
 M: Edgar E. Iglesias 
 S: Maintained
diff --git a/arch_init.c b/arch_init.c
index fa05973..be6e6de 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -80,6 +80,8 @@ int graphic_depth = 32;
 #define QEMU_ARCH QEMU_ARCH_UNICORE32
 #elif defined(TARGET_TRICORE)
 #define QEMU_ARCH QEMU_ARCH_TRICORE
+#elif defined(TARGET_AVR)
+#define QEMU_ARCH QEMU_ARCH_AVR
 #endif
 
 const uint32_t arch_type = QEMU_ARCH;
diff --git a/configure b/configure
index 4b808f9..8ac60f0 100755
--- a/configure
+++ b/configure
@@ -5663,6 +5663,8 @@ case "$target_name" in
 bflt="yes"
 gdb_xml_files="aarch64-core.xml aarch64-fpu.xml arm-core.xml arm-vfp.xml 
arm-vfp3.xml arm-neon.xml"
   ;;
+  avr)
+  ;;
   cris)
   ;;
   lm32)
@@ -5861,6 +5863,9 @@ for i in $ARCH $TARGET_BASE_ARCH ; do
   disas_config "ARM_A64"
 fi
   ;;
+  avr)
+disas_config "AVR"
+  ;;
   cris)
 disas_config "CRIS"
   ;;
diff --git a/default-configs/avr-softmmu.mak b/default-configs/avr-softmmu.mak
new file mode 100644
index 000..003465d
--- /dev/null
+++ b/default-configs/avr-softmmu.mak
@@ -0,0 +1,21 @@
+#
+#  QEMU AVR CPU
+#
+#  Copyright (c) 2016 Michael Rolnik
+#
+#  This library is free software; you can redistribute it and/or
+#  modify it under the terms of the GNU Lesser General Public
+#  License as published by the Free Software Foundation; either
+#  version 2.1 of the License, or (at your option) any later version.
+#
+#  This library is distributed in the hope that it will be useful,
+#  but WITHOUT ANY WARRANTY; without even the implied warranty of
+#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+#  Lesser General Public License for more details.
+#
+#  You should have received a copy of the GNU Lesser General Public
+#  License along with this library; if not, see
+#  
+#
+
+# Default configuration for avr-softmmu
diff --git a/include/disas/bfd.h b/include/disas/bfd.h
index 8a3488c..8ccd108 100644
--- a/include/disas/bfd.h
+++ b/include/disas/bfd.h
@@ -213,6 +213,12 @@ enum bfd_architecture
 #define bfd_mach_m32r  0  /* backwards compatibility */
   bfd_arch_mn10200,/* Matsushita MN10200 */
   bfd_arch_mn10300,/* Matsushita MN10300 */
+  bfd_arch_avr,   /* Atmel AVR microcontrollers.  */
+#define bfd_mach_avr1  1
+#define bfd_mach_avr2  2
+#define bfd_mach_avr3  3
+#define bfd_mach_avr4  4
+#define bfd_mach_avr5  5
   bfd_arch_cris,   /* Axis CRIS */
 #define bfd_mach_cris_v0_v10   255
 #define bfd_mach_cris_v32  32
diff --git a/include/sysemu/arch_init.h b/include/sysemu/arch_init.h
index d690dfa..8c75777 100644
--- a/include/sysemu/arch_init.h
+++ b/include/sysemu/arch_init.h
@@ -23,6 +23,7 @@ enum {
 QEMU_ARCH_UNICORE32 = (1 << 14),
 QEMU_ARCH_MOXIE = (1 << 15),
 QEMU_ARCH_TRICORE = (1 << 16),
+QEMU_ARCH_AVR = (1 << 17),
 };
 
 extern const uint32_t arch_type;
diff --git a/target-avr/Makefile.objs b/target-avr/Makefile.objs
new file mode 100644
index 000..85f9261
--- /dev/null
+++ b/target-avr/Makefile.objs
@@ -0,0 +1,24 @@
+#
+#  QEMU AVR CPU
+#
+#  Copyright (c) 2016 Michael Rolnik
+#
+#  This 

[Qemu-devel] [PATCH v17 9/9] target-avr: adding instruction decoder

2016-08-18 Thread Michael Rolnik
Signed-off-by: Michael Rolnik 
---
 target-avr/Makefile.objs |   1 +
 target-avr/decode.c  | 693 +++
 target-avr/translate.c   |   2 +
 3 files changed, 696 insertions(+)
 create mode 100644 target-avr/decode.c

diff --git a/target-avr/Makefile.objs b/target-avr/Makefile.objs
index f4a82c4..b500989 100644
--- a/target-avr/Makefile.objs
+++ b/target-avr/Makefile.objs
@@ -21,5 +21,6 @@
 obj-y += translate.o cpu.o helper.o
 obj-y += gdbstub.o
 obj-y += translate-inst.o
+obj-y += decode.o
 obj-$(CONFIG_SOFTMMU) += machine.o
 
diff --git a/target-avr/decode.c b/target-avr/decode.c
new file mode 100644
index 000..243bab3
--- /dev/null
+++ b/target-avr/decode.c
@@ -0,0 +1,693 @@
+/*
+ * QEMU AVR CPU
+ *
+ * Copyright (c) 2016 Michael Rolnik
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see
+ * 
+ */
+
+#include 
+#include "translate.h"
+
+void avr_decode(uint32_t pc, uint32_t *l, uint32_t c, translate_function_t *t)
+{
+uint32_t opc = extract32(c, 0, 16);
+switch (opc & 0xd000) {
+case 0x: {
+switch (opc & 0x2c00) {
+case 0x: {
+switch (opc & 0x0300) {
+case 0x: {
+*l = 16;
+*t = _translate_NOP;
+break;
+}
+case 0x0100: {
+*l = 16;
+*t = _translate_MOVW;
+break;
+}
+case 0x0200: {
+*l = 16;
+*t = _translate_MULS;
+break;
+}
+case 0x0300: {
+switch (opc & 0x0088) {
+case 0x: {
+*l = 16;
+*t = _translate_MULSU;
+break;
+}
+case 0x0008: {
+*l = 16;
+*t = _translate_FMUL;
+break;
+}
+case 0x0080: {
+*l = 16;
+*t = _translate_FMULS;
+break;
+}
+case 0x0088: {
+*l = 16;
+*t = _translate_FMULSU;
+break;
+}
+}
+break;
+}
+}
+break;
+}
+case 0x0400: {
+*l = 16;
+*t = _translate_CPC;
+break;
+}
+case 0x0800: {
+*l = 16;
+*t = _translate_SBC;
+break;
+}
+case 0x0c00: {
+*l = 16;
+*t = _translate_ADD;
+break;
+}
+case 0x2000: {
+*l = 16;
+*t = _translate_AND;
+break;
+}
+case 0x2400: {
+*l = 16;
+*t = _translate_EOR;
+break;
+}
+case 0x2800: {
+*l = 16;
+*t = _translate_OR;
+break;
+}
+case 0x2c00: {
+*l = 16;
+*t = _translate_MOV;
+break;
+}
+}
+break;
+}
+case 0x1000: {
+switch (opc & 0x2000) {
+case 0x: {
+switch (opc & 0x0c00) {
+case 0x: {

Re: [Qemu-devel] Help: Does Qemu support virtio-pci for net-device and disk device?

2016-08-18 Thread Marcel Apfelbaum

On 08/17/2016 08:00 PM, Laine Stump wrote:

On 08/17/2016 12:13 PM, Andrew Jones wrote:

On Wed, Aug 17, 2016 at 08:08:11PM +0800, Kevin Zhao wrote:

Hi all,

[...]

Hi,



1) If we want to support both PCIe devices and PCI, then things are messy.
Currently we propose dropping PCI support. mach-virt pretty much
exclusively uses virtio, which can be set to PCIe mode (virtio-1.0)


I have a libvirt patch just about ACKed for pushing upstream that will 
automatically assign virtio-pci devices to a PCIe slot (if the qemu binary 
supports virtio-1.0):

https://www.redhat.com/archives/libvir-list/2016-August/msg00852.html

Separate patches do the same for the e1000e emulated network device (which you 
probably don't care about) and the nec-usb-xhci (USB3) controller (more useful):

https://www.redhat.com/archives/libvir-list/2016-August/msg00732.html

Once these are in place, the only type of device of any consequence that I can 
see still having no PCIe alternative is audio (even though only the virgl video 
device is PCIe, libvirt has always
assigned the primary video to slot 1 on pcie-root anyway (although you 
shouldn't put a legacy PCI device on a pcie-root-port or 
pcie-switch-downstream-port, it is acceptable to plug it directly into
pcie-root (as long as you know you won't need to hotplug it).



I agree, please don't allow plugging PCI devices into PCIe ports (root ports/downstream 
ports). Use pcie.0 bus as you mentioned or start a pci "hierarchy" with a 
dmi-pci and pci-pci bridge.


2) root complex ports, switches (upstream/downstream ports) are currently
based on Intel parts. Marcel is thinking about creating generic models.


I say this every time it comes up, so just to be consistent: +1 :-)



I plan do that for my "own selfish" reasons, one of them is to be able to 
specify
the IO/MEM range the guest firmware should assign to a PCIe port. But, of 
course,
I'll keep them neutral enough; I might need help with testing the mach-virt 
machines.


3) libvirt needs to learn how to plug everything together, in proper PCIe
fashion, leaving holes for hotplug.


See above about virtio, although that doesn't cover the whole story. The other 
part (which I'm working on right now) is that libvirt needs to automatically 
add pcie-root-port,
pcie-switch-upstream-port, and pcie-switch-downstream-port devices as 
necessary. With the patches I mentioned above, you still have to manually add 
enough pcie-*-port controllers to the config, and
then libvirt will plug the PCIe devices into the right place. This is simple 
enough to do, but it does require intervention.

As far as leaving holes for hotplug, there's actually still a bit of an open 
question there - with machinetypes that use only legacy-PCI, *all* slots are 
hotpluggable, and they're added 31 at a time,
so there was never any question about which slots were hotpluggable, and it 
would be very rare to end up with a configuration that had 0 free slots 
available for hotplug (actually libvirt would always
make sure there was at least one, but in practice there would be many more open 
slots). With PCIe-capable machinetypes that is changed, since the root complex 
(pcie-root) doesn't support hotplug, and
new slots are added 1 at a time (pcie-*-port) rather than 31 at a time. This 
means you have to really go out of your way if you want open slots for hotplug 
(and even if you want devices in the machine
at boot time to be hot-unpluggable).



I agree you need to take into account hotplug for Q35, by leaving some PCIe 
root ports available, but it shouldn't be too much trouble.


I'm still not sure just how far we need to go in this regard.  We've already 
decided that, unless manually set to an address on pcie-root by the 
user/management application, all PCI devices will be
auto-assigned to a slot that supports hotplug.


Good idea.

 What I'm not sure about is whether we should always auto-add an extra 
pcie-*-root to be sure a device can be hotplugged, or if we should admit that 1

available slot isn't good enough for all situations, so we should instead just 
leave it up to the user/management to manually add extra ports if they think 
they'll want to hotplug something later.


Why not? Leaving 1 or 2 PCIe ports should be enough. On each port you can 
hotplug a switch with several downstream ports. You can continue nesting the 
switches up to depth of 6-7 I think.

Thanks,
Marcel




[...]


Thanks,
drew








Re: [Qemu-devel] Help: Does Qemu support virtio-pci for net-device and disk device?

2016-08-18 Thread Kevin Zhao
Hi Laine,
Thanks :-) I also has a little questions below.

On 18 August 2016 at 01:00, Laine Stump  wrote:

> On 08/17/2016 12:13 PM, Andrew Jones wrote:
>
>> On Wed, Aug 17, 2016 at 08:08:11PM +0800, Kevin Zhao wrote:
>>
>>> Hi all,
>>>   Now I'm investigating net device hot plug and disk hotplug for
>>> AArch64. For virtio , the default address is virtio-mmio. After Libvirt
>>> 1.3.5, user can explicitly specify the address-type to pci and so libvirt
>>> will pass the virtio-pci parameters to the Qemu.
>>>   Both my host and guest OS is Debian8, and Qemu version is 2.6.0.
>>> Libvirt version is 1.3.5.
>>>   For net-device, I change the address-type to pci, and libvirt pass
>>> the
>>> command below:
>>>   -device
>>> virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:0d:25:
>>> 25,bus=pci.2,addr=0x1
>>>
>>>   After booting, the eth0 device disappear(eth0 occur when the
>>> address
>>> is virtio-mmio),
>>> but I can find another net-device enp2s1, also it can't work for dhcp:
>>> Running lspci: 02:01.0 Ethernet controller: Red Hat, Inc Virtio network
>>> device
>>> I'm not sure whether it worked.
>>>
>>>   For disk device,* when I change the address-type to pci, the whole
>>> qemu command is :*
>>> https://paste.fedoraproject.org/409553/,  but the VM can not boot
>>> successfully. Does Qemu not support device disk of virtio-pci in AArch64
>>> just as it in X86_64?
>>>   Thanks~Since I am not very familiar with Qemu, really looking
>>> forward
>>> to your response.
>>>
>>> Best Regards,
>>> Kevin Zhao
>>>
>> libvirt 1.3.5 is a bit old. Later versions no longer unconditionally add
>> the i82801b11 bridge, which was necessary to use PCI devices with the PCIe
>> host bridge mach-virt has. IMO, libvirt and qemu still have a long way to
>> go in order to configure a base/standard mach-virt PCIe machine.
>>
>
> Well, you can do it now, but you have to manually assign the PCI addresses
> of devices (and if you want hotplug you need to live with Intel/TI-specific
> PCIe controllers).

OK. It seems that Qemu will drop PCI for machine-virt and turning to PCIE
in the future.
Do I need to do more for  Intel/TI-specific PCIe controllers?  what do I
need to add in the guest XML or more?

>
>
>> 1) If we want to support both PCIe devices and PCI, then things are messy.
>> Currently we propose dropping PCI support. mach-virt pretty much
>> exclusively uses virtio, which can be set to PCIe mode (virtio-1.0)
>>
>
> I have a libvirt patch just about ACKed for pushing upstream that will
> automatically assign virtio-pci devices to a PCIe slot (if the qemu binary
> supports virtio-1.0):
>
> https://www.redhat.com/archives/libvir-list/2016-August/msg00852.html
>

What's the minimum version of  Qemu that support virito-1.0? Does Qemu 2.6
works?
Also as  I see your patch for automatically assign virtio-pci to PCIE
slot,after it merged  I think thing will go much more easier.
Now I will manually add the slots and bus to pcie. Because I am not
familiar with it,  if it convenient, could you give me an available xml
file which PCIE disk and PCIE
net device can work for machine virt ?

Thanks~

Separate patches do the same for the e1000e emulated network device (which
> you probably don't care about) and the nec-usb-xhci (USB3) controller (more
> useful):
>
> https://www.redhat.com/archives/libvir-list/2016-August/msg00732.html
>
> Once these are in place, the only type of device of any consequence that I
> can see still having no PCIe alternative is audio (even though only the
> virgl video device is PCIe, libvirt has always assigned the primary video
> to slot 1 on pcie-root anyway (although you shouldn't put a legacy PCI
> device on a pcie-root-port or pcie-switch-downstream-port, it is acceptable
> to plug it directly into pcie-root (as long as you know you won't need to
> hotplug it).
>
> 2) root complex ports, switches (upstream/downstream ports) are currently
>> based on Intel parts. Marcel is thinking about creating generic
>> models.
>>
>
> I say this every time it comes up, so just to be consistent: +1 :-)
>
> 3) libvirt needs to learn how to plug everything together, in proper PCIe
>> fashion, leaving holes for hotplug.
>>
>
> See above about virtio, although that doesn't cover the whole story. The
> other part (which I'm working on right now) is that libvirt needs to
> automatically add pcie-root-port, pcie-switch-upstream-port, and
> pcie-switch-downstream-port devices as necessary. With the patches I
> mentioned above, you still have to manually add enough pcie-*-port
> controllers to the config, and then libvirt will plug the PCIe devices into
> the right place. This is simple enough to do, but it does require
> intervention.
>
> As far as leaving holes for hotplug, there's actually still a bit of an
> open question there - with machinetypes that use only legacy-PCI, *all*
> slots are hotpluggable, and they're added 31 at a time, so there was never
> 

  1   2   >