At 03/09/2011 02:12 PM, Ryan Harper wrote:
> * Wen Congyang <we...@cn.fujitsu.com> [2011-03-08 23:09]:
>> At 03/09/2011 12:08 PM, Ryan Harper wrote:
>>> * Wen Congyang <we...@cn.fujitsu.com> [2011-02-27 20:56]:
>>>> Hi Markus Armbruster
>>>>
>>>> At 02/23/2011 04:30 PM, Markus Armbruster wrote:
>>>>> Isaku Yamahata <yamah...@valinux.co.jp> writes:
>>>>>
>>>>
>>>> <snip>
>>>>
>>>>>
>>>>> I don't think this patch is correct.  Let me explain.
>>>>>
>>>>> Device hot unplug is *not* guaranteed to succeed.
>>>>>
>>>>> For some buses, such as USB, it always succeeds immediately, i.e. when
>>>>> the device_del monitor command finishes, the device is gone.  Life is
>>>>> good.
>>>>>
>>>>> But for PCI, device_del merely initiates the ACPI unplug rain dance.  It
>>>>> doesn't wait for the dance to complete.  Why?  The dance can take an
>>>>> unpredictable amount of time, including forever.
>>>>>
>>>>> Problem: Subsequent device_add can fail if it reuses the qdev ID or PCI
>>>>> slot, and the unplug has not yet completed (race condition), or it
>>>>> failed.  Yes, Virginia, PCI hotplug *can* fail.
>>>>>
>>>>> When unplug succeeds, the qdev is automatically destroyed.
>>>>> pciej_write() does that for PIIX4.  Looks like pcie_cap_slot_event()
>>>>> does it for PCIE.
>>>>
>>>> I got a similar problem. When I unplug a pci device by hand, it works
>>>> as expected, and I can hotplug it again. But when I use a script to
>>>> do the same thing, sometimes it fails. I think I may have found another bug.
>>>>
>>>> Steps to reproduce this bug:
>>>> 1. cat ./test-e1000.sh # RHEL6RC is domain name
>>>>    #! /bin/bash
>>>>
>>>>    while true; do
>>>>        virsh attach-interface RHEL6RC network default --mac 52:54:00:1f:db:c7 --model e1000
>>>>        if [[ $? -ne 0 ]]; then
>>>>            break
>>>>        fi
>>>>        virsh detach-interface RHEL6RC network --mac 52:54:00:1f:db:c7
>>>>        if [[ $? -ne 0 ]]; then
>>>>            break
>>>>        fi
>>>>        sleep 5
>>>
>>> How do you know that the guest has responded at this point before you
>>> attempt to attach again at the top of the loop?  Any attach/detach
>>> requires the guest to respond to the request and it may not respond at
>>> all.
>>
>> When I attach/detach the interface by hand, it works fine: I can see the new
>> interface when I attach it, and it disappears when I detach it.
>
> The point is that since the attach and detach require guest
> participation, this interface isn't reliable.  You have a sleep 5 in
> your loop, hoping to wait long enough for the guest to respond, but
> after a number of iterations in your loop it fails.  You can bump the
> sleep to 3600 seconds and the guest *still* might not respond...
We use the SCI interrupt to tell the guest that a device has been
attached or detached. But the SCI interrupt is *lost* in qemu, so the
guest does not know that a device has been attached or detached, and
does not respond to it. If the SCI interrupt were not lost, the guest
could respond to it.
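For the test script itself, the fixed "sleep 5" could be replaced by a
bounded poll, so that a guest that never responds shows up as an explicit
timeout rather than as a failed attach on the next iteration. The following
is only a rough sketch: wait_until is a hypothetical helper, not part of
virsh or qemu, and the check command passed to it is left to the caller.

```bash
#!/bin/bash
# Hypothetical helper (not part of virsh or qemu):
#   wait_until TIMEOUT INTERVAL CMD [ARGS...]
# Re-runs CMD every INTERVAL seconds until it succeeds or until
# TIMEOUT seconds have elapsed.
# Returns 0 if CMD succeeded, 1 if the timeout was reached first.
wait_until() {
    local timeout=$1 interval=$2
    shift 2
    # SECONDS is a bash builtin counting seconds since the shell started.
    local deadline=$(( SECONDS + timeout ))
    while true; do
        if "$@"; then
            return 0
        fi
        if (( SECONDS >= deadline )); then
            return 1
        fi
        sleep "$interval"
    done
}
```

In the loop, something like "wait_until 60 1 nic_is_gone || break" could
follow the detach, where nic_is_gone stands for whatever check confirms
that the guest has really released the device (also hypothetical here).
This does not make the unplug reliable; it only lets the script tell a
completed unplug apart from one that never finished.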