On Tue, Nov 02, 2010 at 08:46:22AM -0500, Ryan Harper wrote: > * Markus Armbruster <arm...@redhat.com> [2010-11-02 04:40]: > > Ryan Harper <ry...@us.ibm.com> writes: > > > > > * Markus Armbruster <arm...@redhat.com> [2010-10-29 11:11]: > > >> Ryan Harper <ry...@us.ibm.com> writes: > > >> > > >> > * Markus Armbruster <arm...@redhat.com> [2010-10-29 09:13]: > > >> >> [Note cc: Michael] > > >> >> > > >> >> Ryan Harper <ry...@us.ibm.com> writes: > > >> >> > > >> >> > > >> >> If I understand your patch correctly, the difference between your > > >> >> drive_unplug and my blockdev_del is as follows: > > >> >> > > >> >> * drive_unplug forcefully severs the connection between the host part > > >> >> of > > >> >> the block device and its BlockDriverState. A shell of the host part > > >> >> remains, to be cleaned up later. You need forceful disconnect > > >> >> operation to be able to revoke access to an image whether the guest > > >> >> cooperates or not. Fair enough. > > >> >> > > >> >> * blockdev_del deletes a host part. My current version fails when the > > >> >> host part is in use. I patterned that after netdev_del, which used > > >> >> to > > >> >> work that way, until commit 2ffcb18d: > > >> >> > > >> >> Make netdev_del delete the netdev even when it's in use > > >> >> > > >> >> To hot-unplug guest and host part of a network device, you do: > > >> >> > > >> >> device_del NIC-ID > > >> >> netdev_del NETDEV-ID > > >> >> > > >> >> For PCI devices, device_del merely tells ACPI to unplug the > > >> >> device. > > >> >> The device goes away for real only after the guest processed the > > >> >> ACPI > > >> >> unplug event. > > >> >> > > >> >> You have to wait until then (e.g. by polling info pci) before you > > >> >> can > > >> >> unplug the netdev. Not good. > > >> >> > > >> >> Fix by removing the "in use" check from do_netdev_del(). > > >> >> Deleting a > > >> >> netdev while it's in use is safe; packets simply get routed to > > >> >> the bit > > >> >> bucket. > > >> >> > > >> >> Isn't this the very same problem that's behind your drive_unplug? > > >> > > > >> > Yes it is. > > >> > > > >> >> > > >> >> I'd like to have some consistency among net, block and char device > > >> >> commands, i.e. a common set of operations that work the same for all > > >> >> of > > >> >> them. Can we agree on such a set? > > >> > > > >> > Yeah; the current trouble (or at least what I perceive to be trouble) > > >> > is > > >> > that in the case where the guest responds to device_del induced ACPI > > >> > removal event; the current qdev code already does the host-side device > > >> > tear down. Not sure if it is OK to do a blockdev_del() immediately > > >> > after the device_del. What happens when we do: > > >> > > > >> > device_del > > >> > ACPI to guest > > >> > blockdev_del /* removes host-side device */ > > >> > > >> Fails in my tree, because the blockdev's still in use. See below. > > >> > > >> > guest responds to ACPI > > >> > qdev calls pci device removal code > > >> > qemu attempts to destroy the associated host-side block > > >> > > > >> > That may just work today; and if not, it shouldn't be hard to fix up > > >> > the > > >> > code to check for NULLs > > >> > > >> I hate the automatic deletion of host part along with the guest part. > > >> device_del should undo device_add. {block,net,char}dev_{add,del} should > > >> be similarly paired. > > > > > > Agreed. > > >> > > >> In my blockdev branch, I keep the automatic delete only for backwards > > >> compatibility: if you create the drive with drive_add, it gets > > >> auto-deleted, but if you use blockdev_add, it stays around. > > > > > > But what to do about the case where we're doing drive_add and then a > > > device_del() That's the urgent situation that needs to be resolved. > > > > What's the exact problem we need to solve urgently? > > > > Is it "provide means to cut the connection to the host part immediately, > > even with an uncooperative guest"? > > Yes, need to ensure that if the mgmt layer (libvirt) has done what it > believes should have disassociated the host block device from the guest, > we want to ensure that the host block device is no longer accessible > from the guest. > > > > > Does this need to be separate from device_del? > > no, it doesn't have to be. Honestly, I didn't see a clear way to do > something like unplug early in the device_del because that's all pci > device code which has no knowledge of host block devices; having it > disconnect seemed like a layering violation.
We invoke the cleanup callback, isn't that enough? > > > > >> >> Even if your drive_unplug shouldn't fit in that set, we might want it > > >> >> as > > >> >> a stop-gap. Depends on how urgent the need for it is. Yet another > > >> >> special-purpose command to be deprecated later. > > >> > > > >> > The fix is urgent; but I'm willing to spin a couple patches if it helps > > >> > get this into better shape. > > >> > > >> Can we agree on a common solution for block and net? That's why I cc'ed > > >> Michael. > > > > > > I didn't see a good way to have block behave the same as net; though I > > > do agree that it would be good to have this be common, long term. > > > > If we can't make them behave 100% the same, then the next best thing is > > to offer a preferred way to do things that works similarly enough to let > > users ignore the differences. > > > > Possible preferred ways to revoke access to a host part: > > > > A. device_del > > > > Need to make device_del cut the connection right away instead of when > > the guest completes unplug. > > > > device_del changes behavior. Any problems with that? > > I don't think so; current mgmt consumers assume a cooperative guest and > don't handle an uncooperative one right now in the case where Selinux is > disabled; so modifying this path to do a disconnect shouldn't be a > problem. > > > > > Not an option if we need "cut the connection" to be separate from > > device_del. > > > > B. FOO_del > > > > Got netdev_del. > > > > Need drive_del. If drive is in use, replace it by a special "dead > > drive" without a device name (so the ID becomes available for new > > drives), then delete the original. > > > > Wart: drive_del doesn't work reliably after device_del, because the > > drive is auto-deleted when the guest completes unplug. Not a problem > > for my blockdev_add/blockdev_del work-in-progress, because host parts > > created with blockdev_add don't auto-delete. > > > > C. FOO_unplug > > > > You got a patch for drive_unplug. > > > > Need netdev_unplug. > > > > By the way, I hate "unplug", because it suggests relation to hot > > unplug. What about "disconnect"? > > > > Any preferences? > > disconnect is fine. > > > > > >> Currently, we have two different ways: > > >> > > >> * The netdev way: "del" always succeeds > > >> > > >> How can it succeed if the host part is in use? > > >> > > >> If all device models are prepared to deal with a missing host part, we > > >> can delete it right away. > > >> > > >> Else, we need to replace it with a suitable zombie, which is > > >> auto-deleted when it goes out of use. Such zombies are not be visible > > >> elsewhere, in particular, the ID becomes available immediately. > > >> > > >> * The unplug way: "del" fails while in use, "unplug" always succeeds > > >> > > >> Feels a bit cleaner to me. But changing netdev_del might not be > > >> acceptable. > > >> > > >> Either way works for me as an user interface. But I'd rather not have > > >> both. > > >> > > >> Next, we need to consider how to integrate this with the automatic > > >> deletion of drives on qdev destruction. That's too late for unplug, we > > >> want that right in device_del. I'd leave the stupid automatic delete > > >> where it is now, in qdev destruction. The C API need unplug and delete > > >> separate for that. > > >> > > >> > > >> Regardless of the way we choose, we need to think very clearly on how > > >> exactly device models should behave when their host part is missing or a > > >> zombie, and how that behavior appears in the guest. > > >> > > >> For net, making it look exactly like a yanked out network cable would > > >> make sense to me. > > >> > > >> What about block? > > > > > > It seems to me that for block it's like cdrom with no disk, floppy with > > > no media, hard disk that's gone bad. I think we we throw EIO back; it's > > > handled gracefully enough. This is what happens when you do a > > > drive_unplug with my patch; the application using the device gets IO > > > errors. That's expected if a drive were to suddently fail (which is > > > what this looks like). And certainly there is some responsibility > > > at the mgmt console to ensure you're not unplugging a drive that you are > > > currently using. > > > > Total drive failure works for me. > > OK > > > > > "No media" is cute, but it's possible only for drives with removable > > media. I'd rather have all drives behave the same, whether their media > > is removable or not. > > right, I don't think there is any "no media" for hard disks. > > > -- > Ryan Harper > Software Engineer; Linux Technology Center > IBM Corp., Austin, Tx > ry...@us.ibm.com