from:"linzhecheng"

[Qemu-devel] [PATCH v2] usb-host: reset and close libusb_device_handle before qemu exit

2018-12-10 Thread linzhecheng

we should perform these actions as same as usb_host_close.

Signed-off-by: linzhecheng 

diff --git a/hw/usb/host-libusb.c b/hw/usb/host-libusb.c
index b6602ded4e..833250a886 100644
--- a/hw/usb/host-libusb.c
+++ b/hw/usb/host-libusb.c
@@ -988,7 +988,9 @@ static void usb_host_exit_notifier(struct Notifier *n, void 
*data)
 
 if (s->dh) {
 usb_host_release_interfaces(s);
+libusb_reset_device(s->dh);
 usb_host_attach_kernel(s);
+libusb_close(s->dh);
 }
 }
 
-- 
2.12.2.windows.2

[Qemu-devel] [PATCH] usb-host: reset and close libusb_device_handle before qemu exit

2018-11-29 Thread linzhecheng

we should perform these things as same as usb_host_close.

Signed-off-by: linzhecheng 

diff --git a/hw/usb/host-libusb.c b/hw/usb/host-libusb.c
index b6602ded4e..2016375e6b 100644
--- a/hw/usb/host-libusb.c
+++ b/hw/usb/host-libusb.c
@@ -988,7 +988,9 @@ static void usb_host_exit_notifier(struct Notifier *n, void 
*data)
 
 if (s->dh) {
 usb_host_release_interfaces(s);
+   libusb_reset_device(s->dh);
 usb_host_attach_kernel(s);
+   libusb_close(s->dh);
 }
 }
 
-- 
2.12.2.windows.2

Re: [Qemu-devel] [BUG] qemu stuck when detach host-usb device

2018-11-26 Thread linzhecheng




> -Original Message-
> From: Gerd Hoffmann [mailto:kra...@redhat.com]
> Sent: Tuesday, November 27, 2018 2:09 PM
> To: linzhecheng 
> Cc: qemu-devel@nongnu.org; wangxin (U) ;
> Zhoujian (jay) ; libusb-de...@lists.sourceforge.net
> Subject: Re: [Qemu-devel] [BUG] qemu stuck when detach host-usb device
> 
> On Tue, Nov 27, 2018 at 01:26:24AM +, linzhecheng wrote:
> > Description of problem:
> > The guest has a host-usb device(Kingston Technology DataTraveler 100
> > G3/G4/SE9 G2), which is attached to xhci controller(on host). Qemu will 
> > stuck
> if I detach it from guest.
> >
> > How reproducible:
> > 100%
> >
> > Steps to Reproduce:
> > 1.Use usb stick to copy files in guest , make it busy working.
> > 2.virsh detach-device vm_name usb.xml
> >
> > Then qemu will stuck for 20s, I found this is because 
> > libusb_release_interface
> block for 20s.
> > Dmesg prints:
> >
> > [35442.034861] usb 4-2.1: Disable of device-initiated U1 failed.
> > [35447.034993] usb 4-2.1: Disable of device-initiated U2 failed.
> > [35452.035131] usb 4-2.1: Set SEL for device-initiated U1 failed.
> > [35457.035259] usb 4-2.1: Set SEL for device-initiated U2 failed.
> >
> > Is this a hardware error or software's bug?
> 
> I'd guess software error, could be is libusb or (host) linux kernel.
> Cc'ing libusb-devel.
Perhaps it's usb driver's bug. Could you also reproduce it?
> 
> cheers,
>   Gerd

[Qemu-devel] [BUG] qemu stuck when detach host-usb device

2018-11-26 Thread linzhecheng

Description of problem:
The guest has a host-usb device(Kingston Technology DataTraveler 100 G3/G4/SE9 
G2), which is attached
to xhci controller(on host). Qemu will stuck if I detach it from guest.

How reproducible:
100%

Steps to Reproduce:
1.Use usb stick to copy files in guest , make it busy working.
2.virsh detach-device vm_name usb.xml

Then qemu will stuck for 20s, I found this is because libusb_release_interface 
block for 20s.
Dmesg prints:

[35442.034861] usb 4-2.1: Disable of device-initiated U1 failed.
[35447.034993] usb 4-2.1: Disable of device-initiated U2 failed.
[35452.035131] usb 4-2.1: Set SEL for device-initiated U1 failed.
[35457.035259] usb 4-2.1: Set SEL for device-initiated U2 failed.

Is this a hardware error or software's bug?

Re: [Qemu-devel] [question] live migration about redir-usb

2018-11-26 Thread linzhecheng




> -Original Message-
> From: Gerd Hoffmann [mailto:kra...@redhat.com]
> Sent: Monday, November 26, 2018 4:50 PM
> To: linzhecheng 
> Cc: qemu-devel@nongnu.org; wangxin (U) ;
> Zhoujian (jay) 
> Subject: Re: [Qemu-devel] [question] live migration about redir-usb
> 
> On Mon, Nov 26, 2018 at 06:53:23AM +, linzhecheng wrote:
> >
> >
> > > -Original Message-
> > > From: Gerd Hoffmann [mailto:kra...@redhat.com]
> > > Sent: Monday, November 26, 2018 2:46 PM
> > > To: linzhecheng 
> > > Cc: qemu-devel@nongnu.org; wangxin (U)
> ;
> > > Zhoujian (jay) 
> > > Subject: Re: [Qemu-devel] [question] live migration about redir-usb
> > >
> > > On Mon, Nov 26, 2018 at 02:29:12AM +, linzhecheng wrote:
> > > > Hi, Gerd
> > > > I have read this discussion thread about live migration in tcp mode.
> > > > https://bugzilla.redhat.com/show_bug.cgi?id=1254971
> > > > Only spice redirection supports live migration not, but why not tcp 
> > > > mode?
> > > > Does usbredirserver not support it or qemu?
> > >
> > > usbredirserver.
> > >
> > > spice client holds a connection to both source and target host for a
> > > seamless connection handover, and usb-redir needs that too to hand
> > > over the usb device without interruption.
> > >
> > > usbredirserver can't do that.
> 
> > So if we realize the handover between src and dst chardev of qemu, we
> > can do live migration in tcp mode.
> 
> > Is it similar with openvswtich who both accept the connection of src and dst
> qemu?
> 
> It is more complicated than just accepting two connections.  Network can just
> throw away packages if needed, typicaly they will be resent by the guest.  
> That
> doesn't work with usb, you must properly keep track of every in-flight usb
> transfer.
Why will network throw away packets? I think inflight usb
 Packets have already been tracked by redir-usb in live migration. What's more 
do we have to do?

> 
> What exactly do you want do?
Realize live migration for redir-usb in tcp mode.
> 
> Maybe it will be simpler to build a stripped-down spice client with only the 
> usb
> redir code, which doesn't require a UI and which you can start from the
> command line like usbredirserver?  Most spice client code is actually in 
> shared
> libraries, so it should not be too much work.
> 
> cheers,
>   Gerd

Re: [Qemu-devel] [question] live migration about redir-usb

2018-11-25 Thread linzhecheng




> -Original Message-
> From: Gerd Hoffmann [mailto:kra...@redhat.com]
> Sent: Monday, November 26, 2018 2:46 PM
> To: linzhecheng 
> Cc: qemu-devel@nongnu.org; wangxin (U) ;
> Zhoujian (jay) 
> Subject: Re: [Qemu-devel] [question] live migration about redir-usb
> 
> On Mon, Nov 26, 2018 at 02:29:12AM +, linzhecheng wrote:
> > Hi, Gerd
> > I have read this discussion thread about live migration in tcp mode.
> > https://bugzilla.redhat.com/show_bug.cgi?id=1254971
> > Only spice redirection supports live migration not, but why not tcp mode?
> > Does usbredirserver not support it or qemu?
> 
> usbredirserver.
> 
> spice client holds a connection to both source and target host for a seamless
> connection handover, and usb-redir needs that too to hand over the usb device
> without interruption.
> 
> usbredirserver can't do that.
So if we realize the handover between src and dst chardev of qemu, we can do 
live migration in tcp mode.
Is it similar with openvswtich who both accept the connection of src and dst 
qemu?
> 
> cheers,
>   Gerd

[Qemu-devel] [question] live migration about redir-usb

2018-11-25 Thread linzhecheng

Hi, Gerd
I have read this discussion thread about live migration in tcp mode.
https://bugzilla.redhat.com/show_bug.cgi?id=1254971
Only spice redirection supports live migration not, but why not tcp mode?
Does usbredirserver not support it or qemu?

[Qemu-devel] [PATCH] cirrus_vga/migration: update the bank offset before use

2018-11-22 Thread linzhecheng

From: Wang Xin 

The cirrus bank0/1 offset should be updated before we update the vram's alias
offset.

Signed-off-by: Wang Xin 

diff --git a/hw/display/cirrus_vga.c b/hw/display/cirrus_vga.c
index d9b854d..a0e7146 100644
--- a/hw/display/cirrus_vga.c
+++ b/hw/display/cirrus_vga.c
@@ -2746,11 +2746,12 @@ static int cirrus_post_load(void *opaque, int 
version_id)
 s->vga.gr[0x00] = s->cirrus_shadow_gr0 & 0x0f;
 s->vga.gr[0x01] = s->cirrus_shadow_gr1 & 0x0f;
 
+cirrus_update_bank_ptr(s, 0);
+cirrus_update_bank_ptr(s, 1);
 cirrus_update_memory_access(s);
 /* force refresh */
 s->vga.graphic_mode = -1;
-cirrus_update_bank_ptr(s, 0);
-cirrus_update_bank_ptr(s, 1);
+
 return 0;
 }
 
-- 
2.8.1.windows.1

Re: [Qemu-devel] [PATCH] usb-host: set ifs.detached as true if kernel driver is not active

2018-11-20 Thread linzhecheng




> -Original Message-
> From: Gerd Hoffmann [mailto:kra...@redhat.com]
> Sent: Tuesday, November 20, 2018 4:25 PM
> To: linzhecheng 
> Cc: qemu-devel@nongnu.org; Zhoujian (jay) ;
> wangxin (U) 
> Subject: Re: [PATCH] usb-host: set ifs.detached as true if kernel driver is 
> not
> active
> 
> On Tue, Nov 20, 2018 at 09:18:15AM +0800, linzhecheng wrote:
> > If no kernel driver is active, we can already claim and perform I/O on
> > it without detaching it.
> >
> > Signed-off-by: linzhecheng 
> >
> > diff --git a/hw/usb/host-libusb.c b/hw/usb/host-libusb.c index
> > f31e9cbbb8..db4ae1e6e8 100644
> > --- a/hw/usb/host-libusb.c
> > +++ b/hw/usb/host-libusb.c
> > @@ -1119,6 +1119,10 @@ static void
> usb_host_detach_kernel(USBHostDevice *s)
> >  for (i = 0; i < USB_MAX_INTERFACES; i++) {
> >  rc = libusb_kernel_driver_active(s->dh, i);
> >  usb_host_libusb_error("libusb_kernel_driver_active", rc);
> > +if (rc == 0) {
> > +s->ifs[i].detached = true;
> > +continue;
> > +}
> >  if (rc != 1) {
> 
> Can't we just add detached = true here?
Yes, it's better.
> 
> >  continue;
> >  }
> 
> cheers,
>   Gerd

[Qemu-devel] [PATCH v2] usb-host: set ifs.detached as true if kernel driver is not active

2018-11-20 Thread linzhecheng

If no kernel driver is active, we can already claim and perform I/O on
it without detaching it.

Signed-off-by: linzhecheng 

diff --git a/hw/usb/host-libusb.c b/hw/usb/host-libusb.c
index f31e9cbbb8..b6602ded4e 100644
--- a/hw/usb/host-libusb.c
+++ b/hw/usb/host-libusb.c
@@ -1120,6 +1120,9 @@ static void usb_host_detach_kernel(USBHostDevice *s)
 rc = libusb_kernel_driver_active(s->dh, i);
 usb_host_libusb_error("libusb_kernel_driver_active", rc);
 if (rc != 1) {
+if (rc == 0) {
+s->ifs[i].detached = true;
+}
 continue;
 }
 trace_usb_host_detach_kernel(s->bus_num, s->addr, i);
-- 
2.12.2.windows.2

[Qemu-devel] [PATCH] usb-host: set ifs.detached as true if kernel driver is not active

2018-11-19 Thread linzhecheng

If no kernel driver is active, we can already claim and perform I/O on
it without detaching it.

Signed-off-by: linzhecheng 

diff --git a/hw/usb/host-libusb.c b/hw/usb/host-libusb.c
index f31e9cbbb8..db4ae1e6e8 100644
--- a/hw/usb/host-libusb.c
+++ b/hw/usb/host-libusb.c
@@ -1119,6 +1119,10 @@ static void usb_host_detach_kernel(USBHostDevice *s)
 for (i = 0; i < USB_MAX_INTERFACES; i++) {
 rc = libusb_kernel_driver_active(s->dh, i);
 usb_host_libusb_error("libusb_kernel_driver_active", rc);
+if (rc == 0) {
+s->ifs[i].detached = true;
+continue;
+}
 if (rc != 1) {
 continue;
 }
-- 
2.12.2.windows.2

[Qemu-devel] one issue about usb passsthrough devices

2018-10-31 Thread linzhecheng

Hi, all
A found a problem about libusb, the reproducing steps are as followed:
1. start up a vm with host-usb devices
2. kill -9 `pgrep qemu`
We can not see the usb device on host any more, I think this is because qemu 
has no chance to release
resources to kernel in this case. How can we recover environment if we 
encounter this problem?

[Qemu-devel] question about usb passthrough migration

2018-09-28 Thread linzhecheng

Hi, Gerd
I'm still trying to realize local live-migration with usb passthrough devices  
for purpose of upgrading qemu.
I've made sure that source and target vms will not access host usb devices at 
the same time.
But when I test USB flash disk copying files during live-migration stage, I 
encounter a probabilistic problem: copying task may stuck for a while.
This problem happened just after source vm stoping and target vm starting to 
run, which caused by the long time pending transfer.
In other words, target vm submit the first bulk transfer may not trigger 
callback function. I cannot figure out whether it is a usb driver's bug.
Could you give me some advice or whether you have met the same problem before?

[Qemu-devel] question about usb passthrough migration

2018-09-24 Thread linzhecheng

Hi, Gerd
I'm still trying to realize local live-migration with usb passthrough devices  
for purpose of upgrading qemu.
I've make sure source and target vms will not access host usb devices at the 
same time.
But when I test USB flash disk copying files during live-migration stage, I 
encounter a 
probabilistic
 problem: copying task may stuck for a while.
This problem happened just after source vm stoping and target vm starting to 
run, which caused by the long time pending transfer.
In other words, target vm submit the first bulk transfer may not trigger 
callback function. I cannot figure out whether it is a usb driver's bug.
Could give me some advice or whether you have met the same problem before?

Re: [Qemu-devel] Some confusion about live migration of usb device

2018-09-07 Thread linzhecheng



> -Original Message-
> From: gerd hoffmann [mailto:kra...@redhat.com]
> Sent: Friday, September 07, 2018 2:23 PM
> To: CheneyLin 
> Cc: linzhecheng ; wangxin (U)
> ; qemu-devel@nongnu.org
> Subject: Re: [Qemu-devel] Some confusion about live migration of usb device
> 
> On Thu, Sep 06, 2018 at 10:23:45PM +0800, CheneyLin wrote:
> >
> >
> >
> > > -原始邮件-
> > > 发件人: "gerd hoffmann" 
> > > 发送时间: 2018-09-06 21:52:23 (星期四)
> > > 收件人: linzhecheng 
> > > 抄送: "wangxin (U)" , CheneyLin
> > > , "qemu-devel@nongnu.org" 
> > > 主题: Re: [Qemu-devel] Some confusion about live migration of usb
> > > device
> > >
> > > On Thu, Sep 06, 2018 at 12:10:08PM +, linzhecheng wrote:
> > > > You had said that copying vmstate of usb-host is pointless, so
> > > > just unpulg and plug it after migration is all right,  but will other 
> > > > usb
> devices like usb-storage devices lose pending USBPackets then?
> > >
> > > Ah, emulated usb devices.  The usb host adapters will re-submit any
> > > unfinished usb transfers on the target machine.
> > You mean target vm usb controllers will re-submit them?
> 
> Yes.
> 
> > And what about usb-redir, btw?
> 
> Basically the same.  But some extra care is needed to make sure the physical
> device will not see requests twice when the target starts replaying unfinished
> requests.  All usb requests get a unique tag
> (USBPacket->id) so the usbredir server (typically spice client) has a chance 
> to
> handle this correctly.
I got it! Thank you for patiently answering my questions.
> 
> cheers,
>   Gerd

Re: [Qemu-devel] Some confusion about live migration of usb device

2018-09-06 Thread linzhecheng

You had said that copying vmstate of usb-host is pointless, so just unpulg and 
plug it after migration is all right,
 but will other usb devices like usb-storage devices lose pending USBPackets 
then?  

> -Original Message-
> From: gerd hoffmann [mailto:kra...@redhat.com]
> Sent: Thursday, September 06, 2018 8:04 PM
> To: linzhecheng 
> Cc: CheneyLin ; wangxin (U)
> ; qemu-devel@nongnu.org
> Subject: Re: Some confusion about live migration of usb device
> 
> On Thu, Sep 06, 2018 at 10:25:20AM +, linzhecheng wrote:
> > Hi, Gerd
> >
> > I'm going through relevant codes about live migration of usb devices,
> > it seems that we will not save/load USBpacket in any vmstate, so
> > pending usb packets will be lost after live migration, is it a
> > problem?
> 
> With usb-host or usb-redir?
> 
> cheers,
>   Gerd

[Qemu-devel] Some confusion about live migration of usb device

2018-09-06 Thread linzhecheng

Hi, Gerd

I'm going through relevant codes about live migration of usb devices, it seems 
that we will not save/load USBpacket in any vmstate, so pending usb packets 
will be lost after live migration, is it a problem?

Re: [Qemu-devel] [PATCH] usb-host: insert usb device into hostdevs to be scaned

2018-08-19 Thread linzhecheng




> -Original Message-
> From: gerd hoffmann [mailto:kra...@redhat.com]
> Sent: Friday, August 17, 2018 2:08 PM
> To: CheneyLin 
> Cc: linzhecheng ; wangxin (U)
> ; qemu-devel@nongnu.org
> Subject: Re: [Qemu-devel] [PATCH] usb-host: insert usb device into hostdevs to
> be scaned
> 
> > > Why live-migrate to localhost?  That is rather tricky due to both
> > > source and target qemu accessing host resources at the same time,
> > > and I guess this is the reason you are seeing the problems here.
> > Thanks for your reply, but I'm still confused about these issues:
> > 1. If local live migration is not supported for host-usb device, why
> > we have to detach it and rescan it in usb_host_post_load_bh? If a
> > device's property *needs_autoscan* is false, how can we scan it(because we
> have detached it)?
> 
> autoscan is for hotplug.  You can un-plug and re-plug a usb device on the
> physical host, and the guest will see these un-plugs and re-plugs.
> 
> autoscan is turned off in case the usb device is specified using
> hostbus+hostaddr attributes, because hostaddr is a moving target.
> Each time the device is plugged in it gets a different address, for hotplug to
> work properly you must specify the device using attributes which don't change.
> You can use hostport instead of hostaddr for example.
> 
> > 2. Normal live migration between two remote hostes makes a vm access
> > differnt host usb devices, how can we copy states and data of them?
> 
> We don't copy any state.  It is rather pointless, even for save/resume on the
> same machine we don't know what state the usb device has and whenever it is
> still in the state the guest has left it.
I have tried stop/cont a vm with host-usb device during coping large 
files(makes usb device busy woking), 
but usb device was not detached/attached and coping task was completed properly.
 It seems that we can restart it where we left it off.

> 
> So instead we virtually un-plug and re-plug the device, to make the guest 
> driver
> re-initialize the device.
> 
> HTH,
>   Gerd

Re: [Qemu-devel] [PATCH] usb-host: insert usb device into hostdevs to be scaned

2018-08-16 Thread linzhecheng



> -Original Message-
> From: Gerd Hoffmann [mailto:kra...@redhat.com]
> Sent: Thursday, August 16, 2018 6:46 PM
> To: linzhecheng 
> Cc: qemu-devel@nongnu.org; wangxin (U) 
> Subject: Re: [PATCH] usb-host: insert usb device into hostdevs to be scaned
> 
> On Thu, Aug 16, 2018 at 05:43:56PM +0800, linzhecheng wrote:
> > According to the comment of usb_host_post_load_bh, after removing the
> > usb device passed through, we will kick host to scan it again, but the
> > emulated usb device is not added to global list hostdevs, so it can never be
> discovered then.
> > So just do it before usb_host_auto_check. What's more, it's futile to
> > walk devs in usb_host_auto_check periodically if hostdevs is empty, so let's
> delete the usb_auto_timer.
> 
> Hmm, needs_autoscan is a fixed property of the device (depending on
> configuration), it should not be switched on and off.
> 
> What problem you are trying to solve?
I'm trying to live migrate a vm to local host, but found the host-usb device 
was removed then.
Is it possible not to remove it after migrating vm?
> 
> cheers,
>   Gerd

[Qemu-devel] [PATCH] usb-host: insert usb device into hostdevs to be scaned

2018-08-16 Thread linzhecheng

According to the comment of usb_host_post_load_bh, after removing the
usb device passed through, we will kick host to scan it again, but the emulated
usb device is not added to global list hostdevs, so it can never be discovered 
then.
So just do it before usb_host_auto_check. What's more, it's futile to walk devs 
in
usb_host_auto_check periodically if hostdevs is empty, so let's delete the 
usb_auto_timer.

Signed-off-by: linzhecheng 

diff --git a/hw/usb/host-libusb.c b/hw/usb/host-libusb.c
index f31e9cbbb8..632abaa390 100644
--- a/hw/usb/host-libusb.c
+++ b/hw/usb/host-libusb.c
@@ -1534,6 +1534,10 @@ static void usb_host_post_load_bh(void *opaque)
 usb_device_detach(udev);
 }
 dev->bh_postld_pending = false;
+if (!dev->needs_autoscan) {
+dev->needs_autoscan = true;
+QTAILQ_INSERT_TAIL(, dev, next);
+}
 usb_host_auto_check(NULL);
 }
 
@@ -1631,6 +1635,14 @@ static void usb_host_auto_check(void *unused)
 int unconnected = 0;
 int i, n;
 
+if (QTAILQ_EMPTY()) {
+if (usb_auto_timer) {
+timer_del(usb_auto_timer);
+usb_auto_timer = NULL;
+}
+return;
+}
+
 if (usb_host_init() != 0) {
 return;
 }
@@ -1682,6 +1694,8 @@ static void usb_host_auto_check(void *unused)
 s->errcount++;
 continue;
 }
+s->needs_autoscan = false;
+QTAILQ_REMOVE(, s, next);
 break;
 }
 }
-- 
2.12.2.windows.2

[Qemu-devel] issue about numa configure

2018-07-19 Thread linzhecheng

Hi, all
I found that qemu has a constraint in function numa_node_parse now:
 If (node->has_memdev != have_memdevs) {
 Error_setg(errp, "qemu: memdev option must be specified for either "
"all or no nodes");
 Return;
 }
This restricts us from being able to configure an empty numa node (without 
memory and cpus). But if I delete these codes, I can start a VM with cmdline:
qemu-system-x86_64 --enable-kvm  -m size=2G,slots=256,maxmem=300G -smp 
2,maxcpus=4,sockets=4,cores=1,threads=1 -numa node,nodeid=0,cpus=0-1,mem=2048 
-numa node,nodeid=1,cpus=2-3 ...
We can see only one numa node inside the VM(I have tested both linux and 
windows) after beginning.
And if I hot-plug the dimm memory devices into the empty node, vm will present 
a new numa node inside and the new memory is online then.
I'm wondering if you have any related issue before? Or can we remove this 
constraint?
Looking forward to your answers, thanks.

[Qemu-devel] Some issues of hotpluging memory and cpus

2018-06-25 Thread linzhecheng

Hi, all,
I had some problems when doing memory hot plugging. After a lot of tests, I was 
able to stably reproduce the following issues with Windows 2012 R2:

1. If I hot-plug the memory immediately after hot-plugging the CPUs, these 
memory devices will not be detected by windows OS because I cannot see them in 
the device manager.
2. If I add a certain delay (for example, 2s or more) between the hot-plugging 
memory and the CPU, the memory devices will be detected but some of them may 
not be online.
And Memory Module Properties displaying the following contents:

This device cannot start. (Code 10)
{Conflicting Address Range}
The specified address range conflicts with the address space.

3. usb tablet mouse would not work probability after hot adding memory.

Due to the closed nature of windows OS, I can't know what's going on inside.
Is it related to acpi tables exposure?
I hope you can give me some advice, thank you.

[Qemu-devel] [PATCH v2] vhost-user: delete net client if necessary

2018-06-11 Thread linzhecheng

As qemu_new_net_client create new ncs but error happens later,
ncs will be left in global net_clients list and we can't use them any
more, so we need to cleanup them.

Signed-off-by: linzhecheng 

diff --git a/net/vhost-user.c b/net/vhost-user.c
index 608b837175..a39f9c9974 100644
--- a/net/vhost-user.c
+++ b/net/vhost-user.c
@@ -345,6 +345,9 @@ err:
 s->vhost_user = NULL;
 }
 }
+if (nc0) {
+qemu_del_net_client(nc0);
+}
 
 return -1;
 }
-- 
2.12.2.windows.2

[Qemu-devel] [PATCH] vhost-user: delete net client if necessary

2018-06-11 Thread linzhecheng

As qemu_new_net_client create new ncs but error happens later,
ncs will be left in global net_clients list, so we need to cleanup them.

Signed-off-by: linzhecheng 

diff --git a/net/vhost-user.c b/net/vhost-user.c
index 608b837175..1c7ee48b60 100644
--- a/net/vhost-user.c
+++ b/net/vhost-user.c
@@ -345,6 +345,7 @@ err:
 s->vhost_user = NULL;
 }
 }
+qemu_del_net_client(nc0);
 
 return -1;
 }
-- 
2.12.2.windows.2

Re: [Qemu-devel] [PATCH] socket: dont't free msgfds if error equals EAGAIN

2018-05-29 Thread linzhecheng

I think this patch doesn't fix my issue. For more details, please see Gonglei's 
reply.
https://lists.gnu.org/archive/html/qemu-devel/2018-05/msg06296.html

> -邮件原件-
> 发件人: Marc-André Lureau [mailto:marcandre.lur...@gmail.com]
> 发送时间: 2018年5月29日 17:11
> 收件人: linzhecheng 
> 抄送: QEMU ; Paolo Bonzini
> ; wangxin (U) 
> 主题: Re: [Qemu-devel] [PATCH] socket: dont't free msgfds if error equals
> EAGAIN
> 
> Hi
> 
> On Tue, May 29, 2018 at 4:52 AM, linzhecheng 
> wrote:
> > Signed-off-by: linzhecheng 
> >
> > diff --git a/chardev/char-socket.c b/chardev/char-socket.c index
> > 159e69c3b1..17519ec589 100644
> > --- a/chardev/char-socket.c
> > +++ b/chardev/char-socket.c
> > @@ -134,8 +134,8 @@ static int tcp_chr_write(Chardev *chr, const uint8_t
> *buf, int len)
> >  s->write_msgfds,
> >  s->write_msgfds_num);
> >
> > -/* free the written msgfds, no matter what */
> > -if (s->write_msgfds_num) {
> > +/* free the written msgfds in any cases other than errno==EAGAIN */
> > +if (EAGAIN != errno && s->write_msgfds_num) {
> >  g_free(s->write_msgfds);
> >  s->write_msgfds = 0;
> >  s->write_msgfds_num = 0;
> > --
> 
> It is already fix since v2.12:
> 
> commit c863fdec6aff6b5a4ca8fff1537b80d9f8b97726
> Author: Daniel P. Berrangé 
> Date:   Thu Feb 22 12:13:51 2018 +
> 
> chardev: fix handling of EAGAIN for TCP chardev
> 
> > 2.12.2.windows.2
> >
> >
> >
> 
> 
> 
> --
> Marc-André Lureau

Re: [Qemu-devel] [PATCH] socket: dont't free msgfds if error equals EAGAIN

2018-05-29 Thread linzhecheng

CC'ing Daniel P. Berrangé , Peter Xu, Marc-André Lureau, Eric Blake, Gonglei

> -邮件原件-
> 发件人: linzhecheng
> 发送时间: 2018年5月29日 10:53
> 收件人: qemu-devel@nongnu.org
> 抄送: pbonz...@redhat.com; wangxin (U) ;
> linzhecheng 
> 主题: [PATCH] socket: dont't free msgfds if error equals EAGAIN
> 
> Signed-off-by: linzhecheng 
> 
> diff --git a/chardev/char-socket.c b/chardev/char-socket.c index
> 159e69c3b1..17519ec589 100644
> --- a/chardev/char-socket.c
> +++ b/chardev/char-socket.c
> @@ -134,8 +134,8 @@ static int tcp_chr_write(Chardev *chr, const uint8_t
> *buf, int len)
>  s->write_msgfds,
>  s->write_msgfds_num);
> 
> -/* free the written msgfds, no matter what */
> -if (s->write_msgfds_num) {
> +/* free the written msgfds in any cases other than errno==EAGAIN */
> +if (EAGAIN != errno && s->write_msgfds_num) {
>  g_free(s->write_msgfds);
>  s->write_msgfds = 0;
>  s->write_msgfds_num = 0;
> --
> 2.12.2.windows.2
>

[Qemu-devel] [PATCH] socket: dont't free msgfds if error equals EAGAIN

2018-05-28 Thread linzhecheng

Signed-off-by: linzhecheng 

diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index 159e69c3b1..17519ec589 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -134,8 +134,8 @@ static int tcp_chr_write(Chardev *chr, const uint8_t *buf, 
int len)
 s->write_msgfds,
 s->write_msgfds_num);
 
-/* free the written msgfds, no matter what */
-if (s->write_msgfds_num) {
+/* free the written msgfds in any cases other than errno==EAGAIN */
+if (EAGAIN != errno && s->write_msgfds_num) {
 g_free(s->write_msgfds);
 s->write_msgfds = 0;
 s->write_msgfds_num = 0;
-- 
2.12.2.windows.2

[Qemu-devel] [PATCH] cpus: remove useless cond signal

2018-05-14 Thread linzhecheng

commit dbadee4 removed qemu_cond_wait in cpu_remove_sync, so it is
useless to keep qemu_cond_signal here.

Signed-off-by: linzhecheng <linzhech...@huawei.com>

diff --git a/cpus.c b/cpus.c
index 5bcd3ecf38..c7262484f3 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1222,7 +1222,6 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
 
 qemu_kvm_destroy_vcpu(cpu);
 cpu->created = false;
-qemu_cond_signal(_cpu_cond);
 qemu_mutex_unlock_iothread();
 rcu_unregister_thread();
 return NULL;
-- 
2.12.2.windows.2

[Qemu-devel] [Bug 1575607] Re: vm startup failed, qemu returned "kvm run failed Bad address"

2018-05-10 Thread linzhecheng

have resolved it？

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1575607

Title:
   vm startup failed, qemu returned "kvm run failed Bad address"

Status in QEMU:
  New

Bug description:
create a virtual machine and start by libvirt. vm startup failed, qemu 
returned "kvm run failed Bad address"
the error log is :

  error: kvm run failed Bad address

  EAX=7ffc EBX=7ffbffd0 ECX=fff0 EDX=002c

  ESI=6f5c EDI=7ffbffd0 EBP=7ffc ESP=6f34

  EIP=000dec7b EFL=00010046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0

  ES =0010   00c09300 DPL=0 DS   [-WA]

  CS =0008   00c09b00 DPL=0 CS32 [-RA]

  SS =0010   00c09300 DPL=0 DS   [-WA]

  DS =0010   00c09300 DPL=0 DS   [-WA]

  FS =0010   00c09300 DPL=0 DS   [-WA]

  GS =0010   00c09300 DPL=0 DS   [-WA]

  LDT=   8200 DPL=0 LDT

  TR =   8b00 DPL=0 TSS32-busy

  GDT= 000f6a80 0037

  IDT= 000f6abe 

  CR0=6011 CR2= CR3= CR4=

  DR0= DR1= DR2=
  DR3=

  DR6=0ff0 DR7=0400

  EFER=

  Code=c3 29 d3 21 cb 39 c3 77 27 3b 5e 0c 72 22 85 ff 75 02 89 df <89>
  5f 08 01 da 89 57 0c 89 47 10 89 5e 10 8b 56 04 89 f8 e8 8c fc ff ff
  89 d8 eb 06 8b 36

we add numa in the vm, the numatune info is:


  



   the version of qemu is 2.5.0.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1575607/+subscriptions

Re: [Qemu-devel] [Bug Report] vm paused after succeeding to migrate

2018-04-12 Thread linzhecheng



> -邮件原件-
> 发件人: Dr. David Alan Gilbert [mailto:dgilb...@redhat.com]
> 发送时间: 2018年4月12日 20:37
> 收件人: linzhecheng <linzhech...@huawei.com>; pbonz...@redhat.com
> 抄送: qemu-devel@nongnu.org; wangxin (U) <wangxinxin.w...@huawei.com>;
> Zhoujian (jay) <jianjay.z...@huawei.com>; quint...@redhat.com
> 主题: Re: [Qemu-devel] [Bug Report] vm paused after succeeding to migrate
> 
> * linzhecheng (linzhech...@huawei.com) wrote:
> > Hi, all
> > I encounterd a bug when I try to migrate a windows vm.
> >
> > Enviroment information:
> > host A: cpu E5620(model WestmereEP without flag xsave) host B: cpu
> > E5-2643(model SandyBridgeEP with xsave)
> >
> > The reproduce steps is :
> > 1. Start a windows 2008 vm with -cpu host(which means host-passthrough).
> > 2. Migrate the vm to host B when cr4.OSXSAVE=0 (successfully).
> > 3. Vm runs on host B for a while so that cr4.OSXSAVE changes to 1.
> > 4. Then migrate the vm to host A (successfully), but vm was paused, and
> qemu printed log as followed:
> 
> Remember that migrating using -cpu host  across different CPU models is NOT
> expected to work.
> 
> > KVM: entry failed, hardware error 0x8021
> >
> > If you're running a guest on an Intel machine without unrestricted
> > mode support, the failure can be most likely due to the guest entering
> > an invalid state for Intel VT. For example, the guest maybe running in
> > big real mode which is not supported on less recent Intel processors.
> >
> > EAX=019b3bb0 EBX=01a3ae80 ECX=01a61ce8 EDX=
> > ESI=01a62000 EDI= EBP= ESP=01718b20
> > EIP=0185d982 EFL=0286 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0 ES
> > =   9300 CS =f000   9b00
> > SS =   9300 DS =  
> > 9300 FS =   9300 GS = 
> >  9300
> > LDT=   8200
> > TR =   8b00
> > GDT=  
> > IDT=  
> > CR0=6010 CR2= CR3= CR4=
> > DR0= DR1= DR2=
> > DR3=
> > DR6=0ff0 DR7=0400
> > EFER=
> > Code=00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00>
> > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > 00 00 00 00 00 00
> >
> > I have found that problem happened when kvm_put_sregs returns err -
> 22(called by kvm_arch_put_registers(qemu)).
> > Because kvm_arch_vcpu_ioctl_set_sregs(kvm-mod) checked that
> guest_cpuid_has no X86_FEATURE_XSAVE but cr4.OSXSAVE=1.
> > So should we cancel migration when kvm_arch_put_registers returns error?
> 
> It would seem good if we can make the migration fail there rather than hitting
> that KVM error.
> It looks like we need to do a bit of plumbing to convert the places that call 
> it to
> return a bool rather than void.
I think we should return a int value of run_on_cpu which callback 
run_on_cpu_func, but run_on_cpu_func is the prototype of many functions,
Is it overkill?
> 
> Dave
> 
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

[Qemu-devel] [Bug Report] vm paused after succeeding to migrate

2018-04-12 Thread linzhecheng

Hi, all
I encounterd a bug when I try to migrate a windows vm.

Enviroment information:
host A: cpu E5620(model WestmereEP without flag xsave)
host B: cpu E5-2643(model SandyBridgeEP with xsave)

The reproduce steps is :
1. Start a windows 2008 vm with -cpu host(which means host-passthrough).
2. Migrate the vm to host B when cr4.OSXSAVE=0 (successfully).
3. Vm runs on host B for a while so that cr4.OSXSAVE changes to 1.
4. Then migrate the vm to host A (successfully), but vm was paused, and qemu 
printed log as followed:

KVM: entry failed, hardware error 0x8021

If you're running a guest on an Intel machine without unrestricted mode
support, the failure can be most likely due to the guest entering an invalid
state for Intel VT. For example, the guest maybe running in big real mode
which is not supported on less recent Intel processors.

EAX=019b3bb0 EBX=01a3ae80 ECX=01a61ce8 EDX=
ESI=01a62000 EDI= EBP= ESP=01718b20
EIP=0185d982 EFL=0286 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =   9300
CS =f000   9b00
SS =   9300
DS =   9300
FS =   9300
GS =   9300
LDT=   8200
TR =   8b00
GDT=  
IDT=  
CR0=6010 CR2= CR3= CR4=
DR0= DR1= DR2= 
DR3=
DR6=0ff0 DR7=0400
EFER=
Code=00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

I have found that problem happened when kvm_put_sregs returns err -22(called by 
kvm_arch_put_registers(qemu)).
Because kvm_arch_vcpu_ioctl_set_sregs(kvm-mod) checked that guest_cpuid_has no 
X86_FEATURE_XSAVE but cr4.OSXSAVE=1.
So should we cancel migration when kvm_arch_put_registers returns error?

[Qemu-devel] [PATCH] cpu: skip unpluged cpu when querying cpus

2018-04-11 Thread linzhecheng

From: XuYandong 

After vcpu1 thread exiting, vcpu0 thread (received notification) is still 
waiting for
holding qemu_global_mutex in cpu_remove_sync, at this moment, vcpu1 is still in 
global cpus list.
If main thread grab qemu_global_mutex in order to handle qmp command "info 
cpus",
qmp_query_cpus visit unpluged vcpu1 will lead qemu process to exit.

Signed-off-by: XuYandong 
---
 cpus.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/cpus.c b/cpus.c
index 2cb0af9..9b3a6c4 100644
--- a/cpus.c
+++ b/cpus.c
@@ -2018,6 +2018,11 @@ CpuInfoList *qmp_query_cpus(Error **errp)
 
 CPU_FOREACH(cpu) {
 CpuInfoList *info;
+
+if (cpu->unplug) {
+continue;
+}
+
 #if defined(TARGET_I386)
 X86CPU *x86_cpu = X86_CPU(cpu);
 CPUX86State *env = _cpu->env;
-- 
1.8.3.1

[Qemu-devel] [PATCH v2] virtio-serial: fix heap-over-flow

2018-03-28 Thread linzhecheng

Check device having the feature of VIRTIO_CONSOLE_F_EMERG_WRITE before
get config->emerg_wr. It is neccessary because sizeof(virtio_console_config)
is 8 byte if VirtIOSerial doesn't have the feature of
VIRTIO_CONSOLE_F_EMERG_WRITE(see virtio_serial_device_realize),
read/write emerg_wr will lead to heap-over-flow.

Signed-off-by: linzhecheng <linzhech...@huawei.com>

diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index 9470bd7be7..d2dd8ab502 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -580,13 +580,16 @@ static void set_config(VirtIODevice *vdev, const uint8_t 
*config_data)
 VirtIOSerial *vser = VIRTIO_SERIAL(vdev);
 struct virtio_console_config *config =
 (struct virtio_console_config *)config_data;
-uint8_t emerg_wr_lo = le32_to_cpu(config->emerg_wr);
 VirtIOSerialPort *port = find_first_connected_console(vser);
 VirtIOSerialPortClass *vsc;
+uint8_t emerg_wr_lo;
 
-if (!config->emerg_wr) {
+if (!virtio_has_feature(vser->host_features,
+VIRTIO_CONSOLE_F_EMERG_WRITE) || !config->emerg_wr) {
 return;
 }
+
+emerg_wr_lo = le32_to_cpu(config->emerg_wr);
 /* Make sure we don't misdetect an emergency write when the guest
  * does a short config write after an emergency write. */
 config->emerg_wr = 0;
-- 
2.12.2.windows.2

[Qemu-devel] [PATCH] virtio-serial: fix heap-over-flow

2018-03-27 Thread linzhecheng

Check device having the feature of VIRTIO_CONSOLE_F_EMERG_WRITE before
get config->emerg_wr. It is neccessary because sizeof(virtio_console_config)
is 8 byte if VirtIOSerial doesn't have the feature of
VIRTIO_CONSOLE_F_EMERG_WRITE(see virtio_serial_device_realize),
read/write emerg_wr will lead to heap-over-flow.

Signed-off-by: linzhecheng <linzhech...@huawei.com>

diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index 9470bd7be7..3695172f37 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -580,11 +580,12 @@ static void set_config(VirtIODevice *vdev, const uint8_t 
*config_data)
 VirtIOSerial *vser = VIRTIO_SERIAL(vdev);
 struct virtio_console_config *config =
 (struct virtio_console_config *)config_data;
-uint8_t emerg_wr_lo = le32_to_cpu(config->emerg_wr);
 VirtIOSerialPort *port = find_first_connected_console(vser);
 VirtIOSerialPortClass *vsc;
+uint8_t emerg_wr_lo;
 
-if (!config->emerg_wr) {
+if (!virtio_has_feature(vser->host_features,
+VIRTIO_CONSOLE_F_EMERG_WRITE) || !config->emerg_wr) {
 return;
 }
 /* Make sure we don't misdetect an emergency write when the guest
@@ -594,6 +595,7 @@ static void set_config(VirtIODevice *vdev, const uint8_t 
*config_data)
 return;
 }
 vsc = VIRTIO_SERIAL_PORT_GET_CLASS(port);
+emerg_wr_lo = le32_to_cpu(config->emerg_wr);
 (void)vsc->have_data(port, _wr_lo, 1);
 }
 
-- 
2.12.2.windows.2

[Qemu-devel] [PATCH] tap: delete tap(net client) if net_init_tap_one failed

2018-03-05 Thread linzhecheng

If net_init_tap_one failed but net_tap_fd_init succeeded, we should
delete the TAPState *s without vhostforce and has_vhostforce flag.

Signed-off-by: linzhecheng <linzhech...@huawei.com>

diff --git a/net/tap.c b/net/tap.c
index 2b3a36f9b5..1cb8eaf31f 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -651,7 +651,7 @@ static void net_init_tap_one(const NetdevTapOptions *tap, 
NetClientState *peer,
 tap_set_sndbuf(s->fd, tap, );
 if (err) {
 error_propagate(errp, err);
-return;
+goto fail;
 }
 
 if (tap->has_fd || tap->has_fds) {
@@ -688,6 +688,7 @@ static void net_init_tap_one(const NetdevTapOptions *tap, 
NetClientState *peer,
 if (vhostfd == -1) {
 if (tap->has_vhostforce && tap->vhostforce) {
 error_propagate(errp, err);
+goto fail;
 } else {
 warn_report_err(err);
 }
@@ -699,6 +700,7 @@ static void net_init_tap_one(const NetdevTapOptions *tap, 
NetClientState *peer,
 if (tap->has_vhostforce && tap->vhostforce) {
 error_setg_errno(errp, errno,
  "tap: open vhost char device failed");
+goto fail;
 } else {
 warn_report("tap: open vhost char device failed: %s",
 strerror(errno));
@@ -713,6 +715,7 @@ static void net_init_tap_one(const NetdevTapOptions *tap, 
NetClientState *peer,
 if (!s->vhost_net) {
 if (tap->has_vhostforce && tap->vhostforce) {
 error_setg(errp, VHOST_NET_INIT_FAILED);
+goto fail;
 } else {
 warn_report(VHOST_NET_INIT_FAILED);
 }
@@ -720,7 +723,11 @@ static void net_init_tap_one(const NetdevTapOptions *tap, 
NetClientState *peer,
 }
 } else if (vhostfdname) {
 error_setg(errp, "vhostfd(s)= is not valid without vhost");
+goto fail;
 }
+return;
+fail:
+qemu_del_net_client(>nc);
 }
 
 static int get_fds(char *str, char *fds[], int max)
-- 
2.12.2.windows.2

[Qemu-devel] [PATCH v2] vhost-user: fix memory leak

2018-02-12 Thread linzhecheng

g_free() was moved from vhost_net_cleanup in commit e6bcb1b, so we should
free net after vhost_net_cleanup

Signed-off-by: linzhecheng <linzhech...@huawei.com>

diff --git a/net/vhost-user.c b/net/vhost-user.c
index cb45512506..d024573e45 100644
--- a/net/vhost-user.c
+++ b/net/vhost-user.c
@@ -109,6 +109,7 @@ static int vhost_user_start(int queues, NetClientState 
*ncs[], CharBackend *be)
 err:
 if (net) {
 vhost_net_cleanup(net);
+g_free(net);
 }
 vhost_user_stop(i, ncs);
 return -1;
-- 
2.12.2.windows.2

[Qemu-devel] 答复: [PATCH] vhost-user: fix memory leak

2018-02-12 Thread linzhecheng



> -邮件原件-
> 发件人: Philippe Mathieu-Daudé [mailto:philippe.mathieu.da...@gmail.com]
> 代表 Philippe Mathieu-Daudé
> 发送时间: 2018年2月13日 11:54
> 收件人: linzhecheng <linzhech...@huawei.com>; qemu-devel@nongnu.org
> 抄送: pbonz...@redhat.com; wangxin (U) <wangxinxin.w...@huawei.com>;
> lidonglin <lidong...@huawei.com>; m...@redhat.com
> 主题: Re: [Qemu-devel] [PATCH] vhost-user: fix memory leak
> 
> Hi Linzhecheng,
> 
> On 02/12/2018 11:53 PM, linzhecheng wrote:
> > fix memory leak
> >
> > Signed-off-by: linzhecheng <linzhech...@huawei.com>
> >
> > diff --git a/net/vhost-user.c b/net/vhost-user.c index
> > cb45512506..d024573e45 100644
> > --- a/net/vhost-user.c
> > +++ b/net/vhost-user.c
> > @@ -109,6 +109,7 @@ static int vhost_user_start(int queues,
> > NetClientState *ncs[], CharBackend *be)
> >  err:
> >  if (net) {
> >  vhost_net_cleanup(net);
> > +g_free(net);
> 
> I think this g_free() belongs to vhost_net_cleanup() in net/vhost_net.c:
I think your qemu version is out of date,  g_free was moved from 
vhost_net_cleanup in commit e6bcb1b
> 
> void vhost_net_cleanup(struct vhost_net *net) {
> vhost_dev_cleanup(>dev);
> g_free(net);
> }
> 
> Regards,
> 
> Phil.
> 
> >  }
> >  vhost_user_stop(i, ncs);
> >  return -1;
> >

[Qemu-devel] [PATCH] vhost-user: fix memory leak

2018-02-12 Thread linzhecheng

fix memory leak

Signed-off-by: linzhecheng <linzhech...@huawei.com>

diff --git a/net/vhost-user.c b/net/vhost-user.c
index cb45512506..d024573e45 100644
--- a/net/vhost-user.c
+++ b/net/vhost-user.c
@@ -109,6 +109,7 @@ static int vhost_user_start(int queues, NetClientState 
*ncs[], CharBackend *be)
 err:
 if (net) {
 vhost_net_cleanup(net);
+g_free(net);
 }
 vhost_user_stop(i, ncs);
 return -1;
-- 
2.12.2.windows.2

[Qemu-devel] [PATCH v2] vcpu: join vcpu thread after it exiting

2018-02-01 Thread linzhecheng

As we create vcpu thread with QEMU_THREAD_JOINABLE mode,
we should join it after it exiting to cleanup resources.

Signed-off-by: linzhecheng <linzhech...@huawei.com>

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index f290f48..5cc1ba2 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -282,9 +282,9 @@ err:
 
 static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
 {
-struct KVMParkedVcpu *cpu;
+struct KVMParkedVcpu *cpu, *next_cpu;
 
-QLIST_FOREACH(cpu, >kvm_parked_vcpus, node) {
+QLIST_FOREACH_SAFE(cpu, >kvm_parked_vcpus, node, next_cpu) {
 if (cpu->vcpu_id == vcpu_id) {
 int kvm_fd;
 
diff --git a/cpus.c b/cpus.c
index 2cb0af9..1890bfe 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1205,6 +1205,7 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
 cpu->created = false;
 qemu_cond_signal(_cpu_cond);
 qemu_mutex_unlock_iothread();
+rcu_unregister_thread();
 return NULL;
 }
 
@@ -1759,6 +1760,7 @@ void cpu_remove_sync(CPUState *cpu)
 cpu_remove(cpu);
 while (cpu->created) {
 qemu_cond_wait(_cpu_cond, _global_mutex);
+qemu_thread_join(cpu->thread);
 }
 }
 
-- 
1.8.3.1

[Qemu-devel] [PATCH] vcpu: join vcpu thread after it exiting

2018-01-31 Thread linzhecheng

As we create vcpu thread with QEMU_THREAD_JOINABLE mode,
we should join it after it exiting to cleanup resources.

Signed-off-by: linzhecheng <linzhech...@huawei.com>

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index f290f48..6ff71e4 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -282,9 +282,9 @@ err:
 
 static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
 {
-struct KVMParkedVcpu *cpu;
+struct KVMParkedVcpu *cpu, *next_cpu;
 
-QLIST_FOREACH(cpu, >kvm_parked_vcpus, node) {
+QLIST_FOREACH_SAFE(cpu, >kvm_parked_vcpus, node, *next_cpu) {
 if (cpu->vcpu_id == vcpu_id) {
 int kvm_fd;
 
diff --git a/cpus.c b/cpus.c
index 2cb0af9..1890bfe 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1205,6 +1205,7 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
 cpu->created = false;
 qemu_cond_signal(_cpu_cond);
 qemu_mutex_unlock_iothread();
+rcu_unregister_thread();
 return NULL;
 }
 
@@ -1759,6 +1760,7 @@ void cpu_remove_sync(CPUState *cpu)
 cpu_remove(cpu);
 while (cpu->created) {
 qemu_cond_wait(_cpu_cond, _global_mutex);
+qemu_thread_join(cpu->thread);
 }
 }
 
-- 
1.8.3.1

[Qemu-devel] [PATCH] vcpu: create vcpu thread with QEMU_THREAD_DETACHED mode

2018-01-19 Thread linzhecheng

1. If we create vcpu thread with QEMU_THREAD_JOINABLE mode,
we will get memory leak when vcpu thread exits, which will happen
when hot-unplug vcpus.
2. We should use QLIST_FOREACH_SAFE instead of QLIST_FOREACH
if we need to remove the entry in QLIST.

Signed-off-by: linzhecheng <li...@zju.edu.cn>

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 071f4f5..fd95b18 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -282,9 +282,9 @@ err:
 
 static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
 {
-struct KVMParkedVcpu *cpu;
+struct KVMParkedVcpu *cpu, *next_cpu;
 
-QLIST_FOREACH(cpu, >kvm_parked_vcpus, node) {
+QLIST_FOREACH_SAFE(cpu, >kvm_parked_vcpus, node, next_cpu) {
 if (cpu->vcpu_id == vcpu_id) {
 int kvm_fd;
 
diff --git a/cpus.c b/cpus.c
index 2cb0af9..de3a96b 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1113,6 +1113,9 @@ static void qemu_kvm_destroy_vcpu(CPUState *cpu)
 error_report("kvm_destroy_vcpu failed");
 exit(EXIT_FAILURE);
 }
+g_free(cpu->thread);
+g_free(cpu->halt_cond);
+g_free(cpu->cpu_ases);
 }
 
 static void qemu_tcg_destroy_vcpu(CPUState *cpu)
@@ -1205,6 +1208,7 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
 cpu->created = false;
 qemu_cond_signal(_cpu_cond);
 qemu_mutex_unlock_iothread();
+rcu_unregister_thread();
 return NULL;
 }
 
@@ -1850,7 +1854,7 @@ static void qemu_kvm_start_vcpu(CPUState *cpu)
 snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/KVM",
  cpu->cpu_index);
 qemu_thread_create(cpu->thread, thread_name, qemu_kvm_cpu_thread_fn,
-   cpu, QEMU_THREAD_JOINABLE);
+   cpu, QEMU_THREAD_DETACHED);
 while (!cpu->created) {
 qemu_cond_wait(_cpu_cond, _global_mutex);
 }
-- 
1.8.3.1

[Qemu-devel] [PATCH] scsi: handle the special parameters

2018-01-14 Thread linzhecheng

scsi_disk_emulate_command calls
scsi_build_sense(NULL, 0, outbuf, r->buflen,
  (req->cmd.buf[1] & 1) == 0);
But scsi_convert_sense doesn't handle the case when in_buf is NULL
or in_len is 0, which will lead to segfault.

Signed-off-by: linzhecheng <linzhech...@huawei.com>
---
 scsi/utils.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/scsi/utils.c b/scsi/utils.c
index ddae650a99..b769e80c12 100644
--- a/scsi/utils.c
+++ b/scsi/utils.c
@@ -322,6 +322,10 @@ int scsi_convert_sense(uint8_t *in_buf, int in_len,
 SCSISense sense;
 bool fixed_in;
 
+if (!in_buf || !in_len) {
+return 0;
+}
+
 fixed_in = (in_buf[0] & 2) == 0;
 if (in_len && fixed == fixed_in) {
 memcpy(buf, in_buf, MIN(len, in_len));
-- 
2.12.2.windows.2

[Qemu-devel] [PATCH] memory: set ioeventfd_update_pending after address_space_update_ioeventfds

2018-01-14 Thread linzhecheng

We should set ioeventfd_update_pending same as memory_region_update_pending.

Signed-off-by: linzhecheng <li...@zju.edu.cn>
---
 memory.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/memory.c b/memory.c
index 4b41fb8..0cf39d0 100644
--- a/memory.c
+++ b/memory.c
@@ -1091,6 +1091,7 @@ void memory_region_transaction_commit(void)
 address_space_update_ioeventfds(as);
 }
 memory_region_update_pending = false;
+ioeventfd_update_pending = false;
 MEMORY_LISTENER_CALL_GLOBAL(commit, Forward);
 } else if (ioeventfd_update_pending) {
 QTAILQ_FOREACH(as, _spaces, address_spaces_link) {
-- 
1.8.3.1

[Qemu-devel] [PATCH v3] vga: check the validation of memory addr when draw text

2018-01-11 Thread linzhecheng

Start a vm with qemu-kvm -enable-kvm -vnc :66 -smp 1 -m 1024 -hda
redhat_5.11.qcow2  -device pcnet -vga cirrus,
then use VNC client to connect to VM, and excute the code below in guest
OS will lead to qemu crash:

int main()
 {
iopl(3);
srand(time(NULL));
int a,b;
while(1){
a = rand()%0x100;
b = 0x3c0 + (rand()%0x20);
outb(a,b);
}
return 0;
}

The above code is writing the registers of VGA randomly.
We can write VGA CRT controller registers index 0x0C or 0x0D
(which is the start address register) to modify the
the display memory address of the upper left pixel
or character of the screen. The address may be out of the
range of vga ram. So we should check the validation of memory address
when reading or writing it to avoid segfault.

Signed-off-by: linzhecheng <linzhech...@huawei.com>
---
 hw/display/vga.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/display/vga.c b/hw/display/vga.c
index a0412000a5..6e78a4e156 100644
--- a/hw/display/vga.c
+++ b/hw/display/vga.c
@@ -1279,6 +1279,9 @@ static void vga_draw_text(VGACommonState *s, int 
full_update)
 cx_min = width;
 cx_max = -1;
 for(cx = 0; cx < width; cx++) {
+if (src + sizeof(uint16_t) > s->vram_ptr + s->vram_size) {
+break;
+}
 ch_attr = *(uint16_t *)src;
 if (full_update || ch_attr != *ch_attr_ptr || src == cursor_ptr) {
 if (cx < cx_min)
-- 
2.12.2.windows.2

[Qemu-devel] [PATCH v2] vga: check the validation of memory addr when draw text

2018-01-11 Thread linzhecheng

From: fangying <fangyi...@huawei.com>

Start a vm with qemu-kvm -enable-kvm -vnc :66 -smp 1 -m 1024 -hda
redhat_5.11.qcow2  -device pcnet -vga cirrus,
then use VNC client to connect to VM, and excute the code below in guest
OS will lead to qemu crash:

int main()
 {
iopl(3);
srand(time(NULL));
int a,b;
while(1){
a = rand()%0x100;
b = 0x3c0 + (rand()%0x20);
outb(a,b);
}
return 0;
}

The above code is writing the registers of VGA randomly.
We can write VGA CRT controller registers index 0x0C or 0x0D
(which is the start address register) to modify the
the display memory address of the upper left pixel
or character of the screen. The address may be out of the
range of vga ram. So we should check the validation of memory address
when reading or writing it to avoid segfault.

Signed-off-by: linzhecheng <linzhech...@huawei.com>
---
 hw/display/vga.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/display/vga.c b/hw/display/vga.c
index a041200..6e78a4e 100644
--- a/hw/display/vga.c
+++ b/hw/display/vga.c
@@ -1279,6 +1279,9 @@ static void vga_draw_text(VGACommonState *s, int 
full_update)
 cx_min = width;
 cx_max = -1;
 for(cx = 0; cx < width; cx++) {
+if (src + sizeof(uint16_t) > s->vram_ptr + s->vram_size) {
+break;
+}
 ch_attr = *(uint16_t *)src;
 if (full_update || ch_attr != *ch_attr_ptr || src == cursor_ptr) {
 if (cx < cx_min)
-- 
1.8.3.1

[Qemu-devel] [PATCH] irq: fix memory leak

2017-12-24 Thread linzhecheng

entry is moved from list but is not freed.

Signed-off-by: linzhecheng <linzhech...@huawei.com>

diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 351b64f77c..3c920db79a 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -3494,6 +3494,7 @@ int kvm_arch_release_virq_post(int virq)
 if (entry->virq == virq) {
 trace_kvm_x86_remove_msi_route(virq);
 QLIST_REMOVE(entry, list);
+g_free(entry);
 break;
 }
 }
-- 
2.12.2.windows.2

[Qemu-devel] [PATCH] input: fix memory leak

2017-12-24 Thread linzhecheng

If kbd_queue is not empty and queue_count >= queue_limit,
we should free evt.

Change-Id: Ieeacf90d5e7e370a40452ec79031912d8b864d83
Signed-off-by: linzhecheng <linzhech...@huawei.com>

diff --git a/ui/input.c b/ui/input.c
index 3e2d324278..e5b78aae9e 100644
--- a/ui/input.c
+++ b/ui/input.c
@@ -421,6 +421,8 @@ void qemu_input_event_send_key(QemuConsole *src, KeyValue 
*key, bool down)
 } else if (queue_count < queue_limit) {
 qemu_input_queue_event(_queue, src, evt);
 qemu_input_queue_sync(_queue);
+} else {
+qapi_free_InputEvent(evt);
 }
 }
 
-- 
2.12.2.windows.2

[Qemu-devel] [PATCH] vga: check the validation of memory addr when draw text

2017-12-24 Thread linzhecheng

Start a vm with qemu-kvm -enable-kvm -vnc :66 -smp 1 -m 1024 -hda 
redhat_5.11.qcow2  -device pcnet -vga cirrus,
then use VNC client to connect to VM, and excute the code below in guest OS 
will lead to qemu crash:

int main()
 {
iopl(3);
srand(time(NULL));
int a,b;
while(1){
a = rand()%0x100;
b = 0x3c0 + (rand()%0x20);
outb(a,b);
}
return 0;
}

The backtrace is:
#0  0x55defdf28dd1 in vga_draw_text (s=0x55deffe19a80, full_update=1)
at /mnt/sdb/lzc/code/open/qemu/hw/display/vga.c:1283
#1  0x55defdf2a371 in vga_update_display (opaque=0x55deffe19a80)
at /mnt/sdb/lzc/code/open/qemu/hw/display/vga.c:1766
#2  0x55defe28098e in graphic_hw_update (con=0x55deffeeb770) at 
ui/console.c:263
#3  0x55defe29360f in vnc_refresh (dcl=0x55deffc54860) at ui/vnc.c:2855
#4  0x55defe2842fe in dpy_refresh (s=0x55deffeeb700) at ui/console.c:1595
#5  0x55defe2806ca in gui_update (opaque=0x55deffeeb700) at ui/console.c:201
#6  0x55defe3ca875 in timerlist_run_timers (timer_list=0x55deff420100) at 
util/qemu-timer.c:536
#7  0x55defe3ca8bd in qemu_clock_run_timers (type=QEMU_CLOCK_REALTIME) at 
util/qemu-timer.c:547
#8  0x55defe3cac83 in qemu_clock_run_all_timers () at util/qemu-timer.c:662
#9  0x55defe3cb430 in main_loop_wait (nonblocking=0) at util/main-loop.c:521
#10 0x55defe029838 in main_loop () at vl.c:1951
#11 0x55defe031720 in main (argc=16, argv=0x7ffe5fb600e8, 
envp=0x7ffe5fb60170) at vl.c:4867

The above code is writing the registers of VGA randomly.
We can write VGA CRT controller registers index 0x0C or 0x0D 
(which is the start address register) to modify the 
the display memory address of the upper left pixel 
or character of the screen. The address may be out of the 
range of vga ram. So we should check the validation of memory address 
when reading or writing it to avoid segfault.

Signed-off-by: linzhecheng <linzhech...@huawei.com>

Change-Id: Ib7466361b18e0a232fc068aad50d2113701786ab

diff --git a/hw/display/vga.c b/hw/display/vga.c
index a0412000a5..c265572bf3 100644
--- a/hw/display/vga.c
+++ b/hw/display/vga.c
@@ -1279,6 +1279,10 @@ static void vga_draw_text(VGACommonState *s, int 
full_update)
 cx_min = width;
 cx_max = -1;
 for(cx = 0; cx < width; cx++) {
+if (src + sizeof(uint16_t) > s->vram_ptr + s->vram_size) {
+printf("src is out of the range of vga ram.\n");
+return;
+ }
 ch_attr = *(uint16_t *)src;
 if (full_update || ch_attr != *ch_attr_ptr || src == cursor_ptr) {
 if (cx < cx_min)
-- 
2.12.2.windows.2

Re: [Qemu-devel] [PULL 02/46] qemu-thread: fix races on threads that exit very quickly

2017-12-20 Thread linzhecheng



> -邮件原件-
> 发件人: Eric Blake [mailto:ebl...@redhat.com]
> 发送时间: 2017年12月21日 11:36
> 收件人: linzhecheng <linzhech...@huawei.com>; Paolo Bonzini
> <pbonz...@redhat.com>; qemu-devel@nongnu.org; f...@redhat.com
> 抄送: wangxin (U) <wangxinxin.w...@huawei.com>
> 主题: Re: [Qemu-devel] [PULL 02/46] qemu-thread: fix races on threads that
> exit very quickly
> 
> On 12/20/2017 09:29 PM, linzhecheng wrote:
> 
> >> +} QemuThreadArgs;
> >> +
> >> +static void *qemu_thread_start(void *args) {
> >> +QemuThreadArgs *qemu_thread_args = args;
> >> +void *(*start_routine)(void *) = qemu_thread_args->start_routine;
> >> +void *arg = qemu_thread_args->arg;
> >> +
> >> +/* Attempt to set the threads name; note that this is for debug, so
> >> + * we're not going to fail if we can't set it.
> >> + */
> >> +pthread_setname_np(pthread_self(), qemu_thread_args->name);
> >> +g_free(qemu_thread_args->name);
> >> +g_free(qemu_thread_args);
> > If qemu_thread_args is freed here, start_routine(arg) will lead to use
> > after free because arg equals to qemu_thread_args
> 
> No, we explicitly copied qemu_thread_args->arg into a local variable prior to
> freeing qemu_thread_args, so that we do not have to dereference the freed
> variable.
OK, that's true. 
> 
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.   +1-919-301-3266
> Virtualization:  qemu.org | libvirt.org

Re: [Qemu-devel] [PULL 02/46] qemu-thread: fix races on threads that exit very quickly

2017-12-20 Thread linzhecheng



> -邮件原件-
> 发件人: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] 代表 Paolo Bonzini
> 发送时间: 2017年12月21日 1:14
> 收件人: qemu-devel@nongnu.org
> 抄送: linzhecheng <linzhech...@huawei.com>
> 主题: [PULL 02/46] qemu-thread: fix races on threads that exit very quickly
> 
> From: linzhecheng <linzhech...@huawei.com>
> 
> If we create a thread with QEMU_THREAD_DETACHED mode, QEMU may get a
> segfault with low probability.
> 
> The backtrace is:
>#0  0x7f46c60291d7 in __GI_raise (sig=sig@entry=6)
> at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
>#1  0x7f46c602a8c8 in __GI_abort () at abort.c:90
>#2  0x008543c9 in PAT_abort ()
>#3  0x0085140d in patchIllInsHandler ()
>#4  
>#5  pthread_detach (th=139933037614848) at pthread_detach.c:50
>#6  0x00829759 in qemu_thread_create
> (thread=thread@entry=0x7ffdaa8205e0, name=name@entry=0x94d94a "io-
> task-worker", start_routine=start_routine@entry=0x7eb9a0
> ,
>arg=arg@entry=0x3f5cf70, mode=mode@entry=1) at
> util/qemu_thread_posix.c:512
>#7  0x007ebc96 in qio_task_run_in_thread (task=0x31db2c0,
> worker=worker@entry=0x7e7e40 ,
> opaque=0xcd23380, destroy=0x7f1180 )
>at io/task.c:141
>#8  0x007e7f33 in qio_channel_socket_connect_async
> (ioc=ioc@entry=0x626c0b0, addr=,
> callback=callback@entry=0x55e080 ,
> opaque=opaque@entry=0x42862c0,
>destroy=destroy@entry=0x0) at io/channel_socket.c:194
>#9  0x0055bdd1 in socket_reconnect_timeout (opaque=0x42862c0)
> at qemu_char.c:4744
>#10 0x7f46c72483b3 in g_timeout_dispatch () from /usr/lib64/libglib-
> 2.0.so.0
>#11 0x7f46c724799a in g_main_context_dispatch () from
> /usr/lib64/libglib-2.0.so.0
>#12 0x0076c646 in glib_pollfds_poll () at main_loop.c:228
>#13 0x0076c6eb in os_host_main_loop_wait (timeout=34800)
> at main_loop.c:273
>#14 0x0076c815 in main_loop_wait
> (nonblocking=nonblocking@entry=0) at main_loop.c:521
>#15 0x0056a511 in main_loop () at vl.c:2076
>#16 0x00420705 in main (argc=, argv= out>, envp=) at vl.c:4940
> 
> The cause of this problem is a glibc bug; for more information, see
> https://sourceware.org/bugzilla/show_bug.cgi?id=19951.
> The solution for this bug is to use pthread_attr_setdetachstate.
> 
> There is a similar issue with pthread_setname_np, which is moved from
> creating thread to created thread.
> 
> Signed-off-by: linzhecheng <linzhech...@huawei.com>
> Message-Id: <20171128044656.10592-1-linzhech...@huawei.com>
> Reviewed-by: Fam Zheng <f...@redhat.com> [Simplify the code by removing
> qemu_thread_set_name, and free the arguments  before invoking the start
> routine. - Paolo]
> Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
> ---
>  util/qemu-thread-posix.c | 59 ++-
> -
>  1 file changed, 42 insertions(+), 17 deletions(-)
> 
> diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c index
> 7306475..fcd369b 100644
> --- a/util/qemu-thread-posix.c
> +++ b/util/qemu-thread-posix.c
> @@ -479,15 +479,29 @@ static void __attribute__((constructor))
> qemu_thread_atexit_init(void)  }
> 
> 
> -/* Attempt to set the threads name; note that this is for debug, so
> - * we're not going to fail if we can't set it.
> - */
> -static void qemu_thread_set_name(QemuThread *thread, const char *name) -
> {  #ifdef CONFIG_PTHREAD_SETNAME_NP
> -pthread_setname_np(thread->thread, name);
> -#endif
> +typedef struct {
> +void *(*start_routine)(void *);
> +void *arg;
> +char *name;
> +} QemuThreadArgs;
> +
> +static void *qemu_thread_start(void *args) {
> +QemuThreadArgs *qemu_thread_args = args;
> +void *(*start_routine)(void *) = qemu_thread_args->start_routine;
> +void *arg = qemu_thread_args->arg;
> +
> +/* Attempt to set the threads name; note that this is for debug, so
> + * we're not going to fail if we can't set it.
> + */
> +pthread_setname_np(pthread_self(), qemu_thread_args->name);
> +g_free(qemu_thread_args->name);
> +g_free(qemu_thread_args);
If qemu_thread_args is freed here, start_routine(arg) will lead to use after 
free because arg equals to qemu_thread_args
> +return start_routine(arg);
>  }
> +#endif
> +
> 
>  void qemu_thread_create(QemuThread *thread, const char *name,
> void *(*start_routine)(void*), @@ -496,29 +510,40 @@ 
> void
> qemu_thread_create(QemuThread *thread, const char *name,
>  sigset_t set, oldset;
>  int err;
>  pthread_attr_t attr;
> +QemuThreadAr

[Qemu-devel] [Question] vhost-user: hot-unplug vhost-user nic for windows guest OS will fail with 100% reproduce rate

2017-12-07 Thread linzhecheng

Hi, guys

I met a problem when hot-unplug vhost-user nic for Windows 2008 rc2 sp1 64 
(Guest OS)

The xml of nic is as followed:

  
  
  
  
  
  


Firstly, I use virsh attach-device win2008 vif.xml to hot-plug a nic for Guest 
OS. This operation returns success.
After guest OS discover nic successfully, I use virsh detach-device win2008 
vif.xml to hot-unplug it. This operation will fail with 100% reproduce rate.

However, if I hot-plug and hot-unplug virtio-net nic , it will not fail.

I have analysis the process of qmp_device_del , I found that qemu have inject 
interrupt to acpi to let it notice guest OS to remove nic.
I guess there is something wrong in Windows when handle the interrupt.

[Qemu-devel] [BUG] vhost-user: hot-unplug vhost-user nic for windows guest OS will fail with 100% reproduce rate

2017-12-07 Thread linzhecheng

Hi, guys

I met a problem when hot-unplug vhost-user nic for Windows 2008 rc2 sp1 64 
(Guest OS)

The xml of nic is as followed:

  
  
  
  
  
  


Firstly, I use virsh attach-device win2008 vif.xml to hot-plug a nic for Guest 
OS. This operation returns success.
After guest OS discover nic successfully, I use virsh detach-device win2008 
vif.xml to hot-unplug it. This operation will fail with 100% reproduce rate.

However, if I hot-plug and hot-unplug virtio-net nic , it will not fail.

I have analysis the process of qmp_device_del , I found that qemu have inject 
interrupt to acpi to let it notice guest OS to remove nic.
I guess there is something wrong in Windows when handle the interrupt.

[Qemu-devel] [PATCH v4] thread: move detach_thread from creating thread to created thread

2017-11-27 Thread linzhecheng

If we create a thread with QEMU_THREAD_DETACHED mode, QEMU may get a segfault 
in a low probability.

The backtrace is:
#0  0x7f46c60291d7 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x7f46c602a8c8 in __GI_abort () at abort.c:90
#2  0x008543c9 in PAT_abort ()
#3  0x0085140d in patchIllInsHandler ()
#4  
#5  pthread_detach (th=139933037614848) at pthread_detach.c:50
#6  0x00829759 in qemu_thread_create 
(thread=thread@entry=0x7ffdaa8205e0, name=name@entry=0x94d94a "io-task-worker", 
start_routine=start_routine@entry=0x7eb9a0 , 
arg=arg@entry=0x3f5cf70, mode=mode@entry=1) at util/qemu_thread_posix.c:512
#7  0x007ebc96 in qio_task_run_in_thread (task=0x31db2c0, 
worker=worker@entry=0x7e7e40 , 
opaque=0xcd23380, destroy=0x7f1180 )
at io/task.c:141
#8  0x007e7f33 in qio_channel_socket_connect_async 
(ioc=ioc@entry=0x626c0b0, addr=, 
callback=callback@entry=0x55e080 , 
opaque=opaque@entry=0x42862c0, 
destroy=destroy@entry=0x0) at io/channel_socket.c:194
#9  0x0055bdd1 in socket_reconnect_timeout (opaque=0x42862c0) at 
qemu_char.c:4744
#10 0x7f46c72483b3 in g_timeout_dispatch () from /usr/lib64/libglib-2.0.so.0
#11 0x7f46c724799a in g_main_context_dispatch () from 
/usr/lib64/libglib-2.0.so.0
#12 0x0076c646 in glib_pollfds_poll () at main_loop.c:228
#13 0x0076c6eb in os_host_main_loop_wait (timeout=34800) at 
main_loop.c:273
#14 0x0076c815 in main_loop_wait (nonblocking=nonblocking@entry=0) at 
main_loop.c:521
#15 0x0056a511 in main_loop () at vl.c:2076
#16 0x00420705 in main (argc=, argv=, 
envp=) at vl.c:4940

The root cause of this problem is a bug of glibc(version 2.17,the latest 
version has the same bug),
let's see what happened in glibc's code.
Here is the code slice of pthread_detach.c

25 int
26 pthread_detach (pthread_t th)
27 {
28  struct pthread *pd = (struct pthread *) th;
29
30  /* Make sure the descriptor is valid.  */
31  if (INVALID_NOT_TERMINATED_TD_P (pd))
32/* Not a valid thread handle.  */
34return ESRCH;
35
36  int result = 0;
37  /* Mark the thread as detached.  */
38  if (atomic_compare_and_exchange_bool_acq (>joinid, pd, NULL))
39{
40  /* There are two possibilities here.  First, the thread might
41   already be detached.  In this case we return EINVAL.
42   Otherwise there might already be a waiter.  The standard does
43   not mention what happens in this case.  */
44 if (IS_DETACHED (pd))
45   result = EINVAL;
46}
47  else
48/* Check whether the thread terminated meanwhile.  In this case we
49   will just free the TCB.  */
50if ((pd->cancelhandling & EXITING_BITMASK) != 0)
51  /* Note that the code in __free_tcb makes sure each thread
52   control block is freed only once.  */
53  __free_tcb (pd);
54  return result;
55}

QEMU get a segfault at line 50, becasue pd is an invalid address.
pd is still valid at line 38 when set pd->joinid = pd, at this moment,
created thread is just exiting(only keeps runing for a short time),
created thread is running in code of start_thread:

404  /* If the thread is detached free the TCB.  */
405  if (IS_DETACHED (pd))
406/* Free the TCB.  */
407__free_tcb (pd);
created thread found that pd is detached, so it freed pd, in this case,
pd became an invalid address.

I rewrite qemu_thread_create to move detach_thread from creating thread to 
created
to avoid this concurrency problem.

Change-Id: I2293d5be1526241cf58785d701b922f2ffc6491b
Signed-off-by: linzhecheng <linzhech...@huawei.com>

diff --git a/include/qemu/thread-posix.h b/include/qemu/thread-posix.h
index f3f47e4..4c6dbb8 100644
--- a/include/qemu/thread-posix.h
+++ b/include/qemu/thread-posix.h
@@ -44,4 +44,12 @@ struct QemuThread {
 pthread_t thread;
 };
 
+typedef struct {
+void *(*start_routine)(void *);
+void *arg;
+char *name;
+int mode;
+} QemuThreadArgs;
+
+
 #endif
diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index 7306475..e5bdb6b 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -489,6 +489,30 @@ static void qemu_thread_set_name(QemuThread *thread, const 
char *name)
 #endif
 }
 
+static void *qemu_thread_start(void *args)
+{
+QemuThreadArgs *qemu_thread_args;
+void *ret;
+QemuThread qemu_thread;
+
+qemu_thread_args = args;
+qemu_thread_get_self(_thread);
+
+if (qemu_thread_args->name) {
+qemu_thread_set_name(_thread, qemu_thread_args->name);
+g_free(qemu_thread_args->name);
+}
+
+if (qemu_thread_args->mode == QEMU_THREAD_DETACHED) {
+pthread_detach(qemu_thread.thread);
+}
+ret = qemu_thread_args->start_routine(qemu_thread_args->arg);
+
+g_free(qemu_thread_args);
+return ret;
+}
+
+
 void qemu_thread_create(QemuThread *thread, const char *name,
void *(*star

[Qemu-devel] [PATCH v3] thread: move detach_thread from creating thread to created thread

2017-11-27 Thread linzhecheng

If we create a thread with QEMU_THREAD_DETACHED mode, QEMU may get a segfault 
in a low probability.

The backtrace is:
#0  0x7f46c60291d7 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x7f46c602a8c8 in __GI_abort () at abort.c:90
#2  0x008543c9 in PAT_abort ()
#3  0x0085140d in patchIllInsHandler ()
#4  
#5  pthread_detach (th=139933037614848) at pthread_detach.c:50
#6  0x00829759 in qemu_thread_create 
(thread=thread@entry=0x7ffdaa8205e0, name=name@entry=0x94d94a "io-task-worker", 
start_routine=start_routine@entry=0x7eb9a0 , 
arg=arg@entry=0x3f5cf70, mode=mode@entry=1) at util/qemu_thread_posix.c:512
#7  0x007ebc96 in qio_task_run_in_thread (task=0x31db2c0, 
worker=worker@entry=0x7e7e40 , 
opaque=0xcd23380, destroy=0x7f1180 )
at io/task.c:141
#8  0x007e7f33 in qio_channel_socket_connect_async 
(ioc=ioc@entry=0x626c0b0, addr=, 
callback=callback@entry=0x55e080 , 
opaque=opaque@entry=0x42862c0, 
destroy=destroy@entry=0x0) at io/channel_socket.c:194
#9  0x0055bdd1 in socket_reconnect_timeout (opaque=0x42862c0) at 
qemu_char.c:4744
#10 0x7f46c72483b3 in g_timeout_dispatch () from /usr/lib64/libglib-2.0.so.0
#11 0x7f46c724799a in g_main_context_dispatch () from 
/usr/lib64/libglib-2.0.so.0
#12 0x0076c646 in glib_pollfds_poll () at main_loop.c:228
#13 0x0076c6eb in os_host_main_loop_wait (timeout=34800) at 
main_loop.c:273
#14 0x0076c815 in main_loop_wait (nonblocking=nonblocking@entry=0) at 
main_loop.c:521
#15 0x0056a511 in main_loop () at vl.c:2076
#16 0x00420705 in main (argc=, argv=, 
envp=) at vl.c:4940

The root cause of this problem is a bug of glibc(version 2.17,the latest 
version has the same bug),
let's see what happened in glibc's code.
Here is the code slice of pthread_detach.c

25 int
26 pthread_detach (pthread_t th)
27 {
28  struct pthread *pd = (struct pthread *) th;
29
30  /* Make sure the descriptor is valid.  */
31  if (INVALID_NOT_TERMINATED_TD_P (pd))
32/* Not a valid thread handle.  */
34return ESRCH;
35
36  int result = 0;
37  /* Mark the thread as detached.  */
38  if (atomic_compare_and_exchange_bool_acq (>joinid, pd, NULL))
39{
40  /* There are two possibilities here.  First, the thread might
41   already be detached.  In this case we return EINVAL.
42   Otherwise there might already be a waiter.  The standard does
43   not mention what happens in this case.  */
44 if (IS_DETACHED (pd))
45   result = EINVAL;
46}
47  else
48/* Check whether the thread terminated meanwhile.  In this case we
49   will just free the TCB.  */
50if ((pd->cancelhandling & EXITING_BITMASK) != 0)
51  /* Note that the code in __free_tcb makes sure each thread
52   control block is freed only once.  */
53  __free_tcb (pd);
54  return result;
55}

QEMU get a segfault at line 50, becasue pd is an invalid address.
pd is still valid at line 38 when set pd->joinid = pd, at this moment,
created thread is just exiting(only keeps runing for a short time),
created thread is running in code of start_thread:

404  /* If the thread is detached free the TCB.  */
405  if (IS_DETACHED (pd))
406/* Free the TCB.  */
407__free_tcb (pd);
created thread found that pd is detached, so it freed pd, in this case,
pd became an invalid address.

I rewrite qemu_thread_create to move detach_thread from creating thread to 
created
to avoid this concurrency problem.

Change-Id: I2293d5be1526241cf58785d701b922f2ffc6491b
Signed-off-by: linzhecheng <linzhech...@huawei.com>
---
 include/qemu/thread-posix.h |  8 
 include/qemu/thread.h   |  1 +
 util/qemu-thread-posix.c| 45 ++---
 3 files changed, 43 insertions(+), 11 deletions(-)

diff --git a/include/qemu/thread-posix.h b/include/qemu/thread-posix.h
index f3f47e426f..d855c15dab 100644
--- a/include/qemu/thread-posix.h
+++ b/include/qemu/thread-posix.h
@@ -44,4 +44,12 @@ struct QemuThread {
 pthread_t thread;
 };
 
+struct QemuThread_args {
+void *(*start_routine)(void *);
+void *arg;
+char *name;
+int mode;
+};
+
+
 #endif
diff --git a/include/qemu/thread.h b/include/qemu/thread.h
index 9910f49b3a..db365242da 100644
--- a/include/qemu/thread.h
+++ b/include/qemu/thread.h
@@ -10,6 +10,7 @@ typedef struct QemuSemaphore QemuSemaphore;
 typedef struct QemuEvent QemuEvent;
 typedef struct QemuLockCnt QemuLockCnt;
 typedef struct QemuThread QemuThread;
+typedef struct QemuThread_args QemuThread_args;
 
 #ifdef _WIN32
 #include "qemu/thread-win32.h"
diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index 7306475899..07b5838862 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -489,6 +489,30 @@ static void qemu_thread_set_name(QemuThread *thread, const 
char *name)
 #endif
 }
 
+static void *qemu_thread_start(void *args)
+{

[Qemu-devel] [PATCH v2] thread: move detach_thread from creating thread to created thread

2017-11-27 Thread linzhecheng

If we create a thread with QEMU_THREAD_DETACHED mode, QEMU may get a segfault 
in a low probability.

The backtrace is:
arg=arg@entry=0x3f5cf70, mode=mode@entry=1) at util/qemu_thread_posix.c:512
at io/task.c:141
destroy=destroy@entry=0x0) at io/channel_socket.c:194

The root cause of this problem is a bug of glibc(version 2.17,the latest 
version have the same bug),
let's see what happened in glibc's code.
Here is the code slice of pthread_detach.c

25 int
26 pthread_detach (pthread_t th)
27 {
28  struct pthread *pd = (struct pthread *) th;
29
30  /* Make sure the descriptor is valid.  */
31  if (INVALID_NOT_TERMINATED_TD_P (pd))
32/* Not a valid thread handle.  */
34return ESRCH;
35
36  int result = 0;
37  /* Mark the thread as detached.  */
38  if (atomic_compare_and_exchange_bool_acq (>joinid, pd, NULL))
39{
40  /* There are two possibilities here.  First, the thread might
41   already be detached.  In this case we return EINVAL.
42   Otherwise there might already be a waiter.  The standard does
43   not mention what happens in this case.  */
44 if (IS_DETACHED (pd))
45   result = EINVAL;
46}
47  else
48/* Check whether the thread terminated meanwhile.  In this case we
49   will just free the TCB.  */
50if ((pd->cancelhandling & EXITING_BITMASK) != 0)
51  /* Note that the code in __free_tcb makes sure each thread
52   control block is freed only once.  */
53  __free_tcb (pd);

54  return result;
55}

QEMU get a segfault at line 50, becasue pd is an invalid address.
pd is still valid at line 38 when set pd->joinid = pd, at this moment,
created thread is just exiting(only keeps runing for a short time),
created thread is running in code of start_thread:

404  /* If the thread is detached free the TCB.  */
405  if (IS_DETACHED (pd))
406/* Free the TCB.  */
407__free_tcb (pd);
created thread found that pd is detached, so it freed pd, in this case,
pd became an invalid address.

I rewrite qemu_thread_create to move detach_thread from creating thread to 
created
to avoid this concurrency problem.

Change-Id: I2293d5be1526241cf58785d701b922f2ffc6491b
Signed-off-by: linzhecheng <linzhech...@huawei.com>
---
 include/qemu/thread-posix.h |  8 
 include/qemu/thread.h   |  1 +
 util/qemu-thread-posix.c| 45 ++---
 3 files changed, 43 insertions(+), 11 deletions(-)

diff --git a/include/qemu/thread-posix.h b/include/qemu/thread-posix.h
index f3f47e426f..d855c15dab 100644
--- a/include/qemu/thread-posix.h
+++ b/include/qemu/thread-posix.h
@@ -44,4 +44,12 @@ struct QemuThread {
 pthread_t thread;
 };
 
+struct QemuThread_args {
+void *(*start_routine)(void *);
+void *arg;
+char *name;
+int mode;
+};
+
+
 #endif
diff --git a/include/qemu/thread.h b/include/qemu/thread.h
index 9910f49b3a..db365242da 100644
--- a/include/qemu/thread.h
+++ b/include/qemu/thread.h
@@ -10,6 +10,7 @@ typedef struct QemuSemaphore QemuSemaphore;
 typedef struct QemuEvent QemuEvent;
 typedef struct QemuLockCnt QemuLockCnt;
 typedef struct QemuThread QemuThread;
+typedef struct QemuThread_args QemuThread_args;
 
 #ifdef _WIN32
 #include "qemu/thread-win32.h"
diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index 7306475899..07b5838862 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -489,6 +489,30 @@ static void qemu_thread_set_name(QemuThread *thread, const 
char *name)
 #endif
 }
 
+static void *qemu_thread_start(void *args)
+{
+QemuThread_args *qemu_thread_args;
+void *ret;
+QemuThread qemu_thread;
+
+qemu_thread_args = (QemuThread_args *)args;
+qemu_thread_get_self(_thread);
+
+if (qemu_thread_args->name) {
+qemu_thread_set_name(_thread, qemu_thread_args->name);
+g_free(qemu_thread_args->name);
+}
+
+if (qemu_thread_args->mode == QEMU_THREAD_DETACHED) {
+pthread_detach(qemu_thread.thread);
+}
+ret = qemu_thread_args->start_routine(qemu_thread_args->arg);
+
+g_free(qemu_thread_args);
+return ret;
+}
+
+
 void qemu_thread_create(QemuThread *thread, const char *name,
void *(*start_routine)(void*),
void *arg, int mode)
@@ -496,6 +520,7 @@ void qemu_thread_create(QemuThread *thread, const char 
*name,
 sigset_t set, oldset;
 int err;
 pthread_attr_t attr;
+QemuThread_args *qemu_thread_args;
 
 err = pthread_attr_init();
 if (err) {
@@ -505,20 +530,18 @@ void qemu_thread_create(QemuThread *thread, const char 
*name,
 /* Leave signal handling to the iothread.  */
 sigfillset();
 pthread_sigmask(SIG_SETMASK, , );
-err = pthread_create(>thread, , start_routine, arg);
+
+qemu_thread_args = g_new0(QemuThread_args, 1);
+qemu_thread_args->mode = mode;
+qemu_thread_args->name = name_threads ? g_strdup_printf("%s",

[Qemu-devel] [PATCH] thread: move detach_thread from creating thread to created thread

2017-11-27 Thread linzhecheng

If we create a thread with QEMU_THREAD_DETACHED mode, QEMU may get a segfault 
in a low probability.

The backtrace is:
#0  0x7f46c60291d7 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x7f46c602a8c8 in __GI_abort () at abort.c:90
#2  0x008543c9 in PAT_abort ()
#3  0x0085140d in patchIllInsHandler ()
#4  
#5  pthread_detach (th=139933037614848) at pthread_detach.c:50
#6  0x00829759 in qemu_thread_create 
(thread=thread@entry=0x7ffdaa8205e0, name=name@entry=0x94d94a "io-task-worker", 
start_routine=start_routine@entry=0x7eb9a0 , 
arg=arg@entry=0x3f5cf70, mode=mode@entry=1) at util/qemu_thread_posix.c:512
#7  0x007ebc96 in qio_task_run_in_thread (task=0x31db2c0, 
worker=worker@entry=0x7e7e40 , 
opaque=0xcd23380, destroy=0x7f1180 )
at io/task.c:141
#8  0x007e7f33 in qio_channel_socket_connect_async 
(ioc=ioc@entry=0x626c0b0, addr=, 
callback=callback@entry=0x55e080 , 
opaque=opaque@entry=0x42862c0, 
destroy=destroy@entry=0x0) at io/channel_socket.c:194
#9  0x0055bdd1 in socket_reconnect_timeout (opaque=0x42862c0) at 
qemu_char.c:4744
#10 0x7f46c72483b3 in g_timeout_dispatch () from /usr/lib64/libglib-2.0.so.0
#11 0x7f46c724799a in g_main_context_dispatch () from 
/usr/lib64/libglib-2.0.so.0
#12 0x0076c646 in glib_pollfds_poll () at main_loop.c:228
#13 0x0076c6eb in os_host_main_loop_wait (timeout=34800) at 
main_loop.c:273
#14 0x0076c815 in main_loop_wait (nonblocking=nonblocking@entry=0) at 
main_loop.c:521
#15 0x0056a511 in main_loop () at vl.c:2076
#16 0x00420705 in main (argc=, argv=, 
envp=) at vl.c:4940

The root cause of this problem is a bug of glibc(version 2.17,the lastest 
version have the same bug),
let's see what happened in glibc's code.
Here is the code slice of pthread_detach.c

25 int
26 pthread_detach (pthread_t th)
27 {
28  struct pthread *pd = (struct pthread *) th;
29
30  /* Make sure the descriptor is valid.  */
31  if (INVALID_NOT_TERMINATED_TD_P (pd))
32/* Not a valid thread handle.  */
34return ESRCH;
35
36  int result = 0;
37  /* Mark the thread as detached.  */
38  if (atomic_compare_and_exchange_bool_acq (>joinid, pd, NULL))
39{
40  /* There are two possibilities here.  First, the thread might
41   already be detached.  In this case we return EINVAL.
42   Otherwise there might already be a waiter.  The standard does
43   not mention what happens in this case.  */
44 if (IS_DETACHED (pd))
45   result = EINVAL;
46}
47  else
48/* Check whether the thread terminated meanwhile.  In this case we
49   will just free the TCB.  */
50if ((pd->cancelhandling & EXITING_BITMASK) != 0)
51  /* Note that the code in __free_tcb makes sure each thread
52   control block is freed only once.  */
53  __free_tcb (pd);

54  return result;
55}

QEMU get a segfault at line 50, becasue pd is an invalid address.
pd is still valid at line 38 when set pd->joinid = pd, at this moment,
created thread is just exiting(only keeps runing for a short time), 
created thread is running in code of start_thread:

404  /* If the thread is detached free the TCB.  */
405  if (IS_DETACHED (pd))
406/* Free the TCB.  */
407__free_tcb (pd);
created thread found that pd is detached, so it freed pd, in this case,
pd became an invalid address.

I rewrite qemu_thread_create to move detach_thread from creating thread to 
created
to avoid this concurrency problem.

Signed-off-by: linzhecheng <linzhech...@huawei.com>


diff --git a/include/qemu/thread-posix.h b/include/qemu/thread-posix.h
index f3f47e426f..d855c15dab 100644
--- a/include/qemu/thread-posix.h
+++ b/include/qemu/thread-posix.h
@@ -44,4 +44,12 @@ struct QemuThread {
 pthread_t thread;
 };
 
+struct QemuThread_args {
+void *(*start_routine)(void *);
+void *arg;
+char *name;
+int mode;
+};
+
+
 #endif
diff --git a/include/qemu/thread.h b/include/qemu/thread.h
index 9910f49b3a..db365242da 100644
--- a/include/qemu/thread.h
+++ b/include/qemu/thread.h
@@ -10,6 +10,7 @@ typedef struct QemuSemaphore QemuSemaphore;
 typedef struct QemuEvent QemuEvent;
 typedef struct QemuLockCnt QemuLockCnt;
 typedef struct QemuThread QemuThread;
+typedef struct QemuThread_args QemuThread_args;
 
 #ifdef _WIN32
 #include "qemu/thread-win32.h"
diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index 7306475899..8c72fb12a8 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -482,13 +482,34 @@ static void __attribute__((constructor)) 
qemu_thread_atexit_init(void)
 /* Attempt to set the threads name; note that this is for debug, so
  * we're not going to fail if we can't set it.
  */
-static void qemu_thread_set_name(QemuThread *thread, const char *name)
+static void qemu_thread_set_name(pthread_t thread, const char *name)
 {
 #ifdef CONFIG_PTHREAD_SETNAME_NP
-

[Qemu-devel] [PATCH] fix: unrealize virtio device if we fail to hotplug it

2017-10-31 Thread linzhecheng

If we fail to hotplug virtio-blk device and then suspend
or shutdown VM, qemu is likely to crash.

Re-production steps:
1. Run VM named vm001
2. Create a virtio-blk.xml which contains wrong configurations:

  
  
  

3. Run command : virsh attach-device vm001 virtio-blk.xml
error: Failed to attach device from blk-scsi.xml
error: internal error: unable to execute QEMU command 'device_add': Please set 
scsi=off for virtio-blk devices in order to use virtio 1.0
it means hotplug virtio-blk device failed.
4. Suspend or shutdown VM will leads to qemu crash

Problem happens in virtio_vmstate_change which is called by
vm_state_notify:
vdev’s parent_bus is NULL, so qdev_get_parent_bus(DEVICE(vdev)) will crash.
virtio_vmstate_change is added to the list vm_change_state_head at 
virtio_blk_device_realize(virtio_init),
but after hotplug virtio-blk failed, virtio_vmstate_change will not be removed 
from vm_change_state_head.
Adding unrealize function of virtio-blk device can solve this problem.

Signed-off-by: linzhecheng <linzhech...@huawei.com>
---
 hw/virtio/virtio.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 5884ce3480..ea532dc35f 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -2491,6 +2491,7 @@ static void virtio_device_realize(DeviceState *dev, Error 
**errp)
 virtio_bus_device_plugged(vdev, );
 if (err != NULL) {
 error_propagate(errp, err);
+vdc->unrealize(dev, NULL);
 return;
 }
 
-- 
2.12.2.windows.2

[Qemu-devel] [Bug] virtio-blk: qemu will crash if hotplug virtio-blk device failed

2017-10-30 Thread linzhecheng

I found that hotplug virtio-blk device will lead to qemu crash.

Re-production steps:

1.   Run VM named vm001

2.   Create a virtio-blk.xml which contains wrong configurations:

  
  
  


3.   Run command : virsh attach-device vm001 vm001

Libvirt will return err msg:

error: Failed to attach device from blk-scsi.xml

error: internal error: unable to execute QEMU command 'device_add': Please set 
scsi=off for virtio-blk devices in order to use virtio 1.0

it means hotplug virtio-blk device failed.

4.   Suspend or shutdown VM will leads to qemu crash



from gdb:


(gdb) bt
#0  object_get_class (obj=obj@entry=0x0) at qom/object.c:750
#1  0x7f9a72582e01 in virtio_vmstate_change (opaque=0x7f9a73d10960, 
running=0, state=) at 
/mnt/sdb/lzc/code/open/qemu/hw/virtio/virtio.c:2203
#2  0x7f9a7261ef52 in vm_state_notify (running=running@entry=0, 
state=state@entry=RUN_STATE_PAUSED) at vl.c:1685
#3  0x7f9a7252603a in do_vm_stop (state=RUN_STATE_PAUSED) at 
/mnt/sdb/lzc/code/open/qemu/cpus.c:941
#4  vm_stop (state=state@entry=RUN_STATE_PAUSED) at 
/mnt/sdb/lzc/code/open/qemu/cpus.c:1807
#5  0x7f9a7262eb1b in qmp_stop (errp=errp@entry=0x7ffe63e25590) at qmp.c:102
#6  0x7f9a7262c70a in qmp_marshal_stop (args=, 
ret=, errp=0x7ffe63e255d8) at qmp-marshal.c:5854
#7  0x7f9a72897e79 in do_qmp_dispatch (errp=0x7ffe63e255d0, 
request=0x7f9a76510120, cmds=0x7f9a72ee7980 ) at 
qapi/qmp-dispatch.c:104
#8  qmp_dispatch (cmds=0x7f9a72ee7980 , 
request=request@entry=0x7f9a76510120) at qapi/qmp-dispatch.c:131
#9  0x7f9a725288d5 in handle_qmp_command (parser=, 
tokens=) at /mnt/sdb/lzc/code/open/qemu/monitor.c:3852
#10 0x7f9a7289d514 in json_message_process_token (lexer=0x7f9a73ce4498, 
input=0x7f9a73cc6880, type=JSON_RCURLY, x=36, y=17) at 
qobject/json-streamer.c:105
#11 0x7f9a728bb69b in json_lexer_feed_char 
(lexer=lexer@entry=0x7f9a73ce4498, ch=125 '}', flush=flush@entry=false) at 
qobject/json-lexer.c:323
#12 0x7f9a728bb75e in json_lexer_feed (lexer=0x7f9a73ce4498, 
buffer=, size=) at qobject/json-lexer.c:373
#13 0x7f9a7289d5d9 in json_message_parser_feed (parser=, 
buffer=, size=) at qobject/json-streamer.c:124
#14 0x7f9a7252722e in monitor_qmp_read (opaque=, 
buf=, size=) at 
/mnt/sdb/lzc/code/open/qemu/monitor.c:3894
#15 0x7f9a7284ee1b in tcp_chr_read (chan=, cond=, opaque=) at chardev/char-socket.c:441
#16 0x7f9a6e03e99a in g_main_context_dispatch () from 
/usr/lib64/libglib-2.0.so.0
#17 0x7f9a728a342c in glib_pollfds_poll () at util/main-loop.c:214
#18 os_host_main_loop_wait (timeout=) at util/main-loop.c:261
#19 main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:515
#20 0x7f9a724e7547 in main_loop () at vl.c:1999
#21 main (argc=, argv=, envp=) at 
vl.c:4877

Problem happens in virtio_vmstate_change which is called by vm_state_notify,
static void virtio_vmstate_change(void *opaque, int running, RunState state)
{
VirtIODevice *vdev = opaque;
BusState *qbus = qdev_get_parent_bus(DEVICE(vdev));
VirtioBusClass *k = VIRTIO_BUS_GET_CLASS(qbus);
bool backend_run = running && (vdev->status & VIRTIO_CONFIG_S_DRIVER_OK);
vdev->vm_running = running;

if (backend_run) {
virtio_set_status(vdev, vdev->status);
}

if (k->vmstate_change) {
k->vmstate_change(qbus->parent, backend_run);
}

if (!backend_run) {
virtio_set_status(vdev, vdev->status);
}
}

Vdev's parent_bus is NULL, so qdev_get_parent_bus(DEVICE(vdev)) will crash.
virtio_vmstate_change is added to the list vm_change_state_head at 
virtio_blk_device_realize(virtio_init),
but after hotplug virtio-blk failed, virtio_vmstate_change will not be removed 
from vm_change_state_head.


I apply a patch as follews:

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 5884ce3..ea532dc 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -2491,6 +2491,7 @@ static void virtio_device_realize(DeviceState *dev, Error 
**errp)
 virtio_bus_device_plugged(vdev, );
 if (err != NULL) {
 error_propagate(errp, err);
+vdc->unrealize(dev, NULL);
 return;
 }

58 matches

Mail list logo