RE: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code
-Original Message- From: Christoph Hellwig [mailto:h...@infradead.org] Sent: Sunday, May 01, 2011 4:53 PM To: KY Srinivasan Cc: Greg KH; gre...@suse.de; linux-ker...@vger.kernel.org; de...@linuxdriverproject.org; virtualizat...@lists.osdl.org Subject: Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code On Sun, May 01, 2011 at 06:08:37PM +0000, KY Srinivasan wrote: Could you elaborate on the problems/issues when the block driver registers for the IDE majors? On the Qemu side, we have a mechanism to disable the emulation when PV drivers load. I don't think there is an equivalent mechanism on the Windows side. So, as far as I know, registering for the IDE majors is the only way to also prevent native drivers in Linux from taking control of the emulated device. What qemu are you talking about for the qemu side? Upstream qemu doesn't have any way to provide the same image as multiple devices, never mind dynamically unplugging bits in that case. Nor does it support the hyperv devices. I am talking about the qemu that was (is) shipping with Xen. In Hyper-V, the block devices configured as IDE devices for the guest will be taken over by the native drivers if the PV drivers don't load first and take over the IDE majors. If you want to have the root device be managed by the PV drivers, this appears to be the only way to ensure that native IDE drivers don't take over the root device. Granted, this depends on ensuring the PV drivers load first, but I don't know if there is another way to achieve this. When you steal majors you rely on: a) loading earlier than the driver you steal them from b) the driver not simply using other numbers c) if it doesn't, preventing it from working at all, also for devices you don't replace with your PV devices. These are exactly the issues that had to be solved to have the PV drivers manage the root device. d) that the guest actually uses the minors you claim, e.g. any current linux distribution uses libata anyway, so your old IDE major claim wouldn't do anything. Nor would claiming sd majors, as the low-level libata driver would still drive the hardware even if SD doesn't bind to it. By setting up appropriate modprobe rules, this can be addressed. You really must never present the same device as two emulated devices instead of doing such hacks. Agreed; I am not sure what the right solution for Hyper-V is other than (a) preventing the native IDE drivers from loading and (b) having the right modprobe rules to ensure libata would not present these same devices to the guest as scsi devices. Regards, K. Y
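For reference, a minimal sketch of what "claiming the IDE majors" amounts to for a PV block driver. This is an illustration only, not the actual blkvsc code; the pv_* names are hypothetical.

#include <linux/fs.h>
#include <linux/init.h>
#include <linux/major.h>        /* IDE0_MAJOR, IDE1_MAJOR */
#include <linux/module.h>

/*
 * Claim the legacy IDE block majors so the native IDE driver cannot
 * bind to them; register_blkdev() fails with -EBUSY if a native
 * driver registered that major first.
 */
static int __init pv_blk_claim_ide_majors(void)
{
        int ret;

        ret = register_blkdev(IDE0_MAJOR, "pv_ide0");
        if (ret < 0)
                return ret;

        ret = register_blkdev(IDE1_MAJOR, "pv_ide1");
        if (ret < 0) {
                unregister_blkdev(IDE0_MAJOR, "pv_ide0");
                return ret;
        }
        return 0;
}
module_init(pv_blk_claim_ide_majors);

As the rest of the thread points out, this only helps against drivers that actually use those majors; it does nothing about libata-based setups.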
Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code
On Mon, May 02, 2011 at 07:48:38PM +0000, KY Srinivasan wrote: By setting up appropriate modprobe rules, this can be addressed. That assumes libata is a module, which it is not for many popular distributions.
RE: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code
-Original Message- From: Christoph Hellwig [mailto:h...@infradead.org] Sent: Monday, May 02, 2011 4:00 PM To: KY Srinivasan Cc: Christoph Hellwig; Greg KH; gre...@suse.de; linux-ker...@vger.kernel.org; de...@linuxdriverproject.org; virtualizat...@lists.osdl.org Subject: Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code On Mon, May 02, 2011 at 07:48:38PM +0000, KY Srinivasan wrote: By setting up appropriate modprobe rules, this can be addressed. That assumes libata is a module, which it is not for many popular distributions. As long as you can prevent ata_piix from loading, it should be fine. Regards, K. Y
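For reference, the modprobe-based workaround being discussed is essentially a blacklist entry; a sketch, assuming ata_piix (and the legacy piix IDE driver) are built as modules - which, as the following messages note, is often not the case. The file name is hypothetical.

# /etc/modprobe.d/hyperv-pv.conf  (hypothetical file name)
# Keep the emulated IDE path from binding when the PV driver owns the disk.
blacklist ata_piix
blacklist piix
# "blacklist" only suppresses alias-based autoloading; a stronger form is:
# install ata_piix /bin/true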
Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code
On Mon, May 02, 2011 at 09:16:36PM +0000, KY Srinivasan wrote: That assumes libata is a module, which it is not for many popular distributions. As long as you can prevent ata_piix from loading, it should be fine. Again, this might very well be built in, e.g. take a look at: http://pkgs.fedoraproject.org/gitweb/?p=kernel.git;a=blob;f=config-generic;h=779415bcc036b922ba92de9c4b15b9da64e9707c;hb=HEAD http://gitorious.org/opensuse/kernel-source/blobs/master/config/x86_64/default
RE: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code
-Original Message- From: Christoph Hellwig [mailto:h...@infradead.org] Sent: Monday, May 02, 2011 5:35 PM To: KY Srinivasan Cc: Greg KH; gre...@suse.de; linux-ker...@vger.kernel.org; de...@linuxdriverproject.org; virtualizat...@lists.osdl.org Subject: Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code On Mon, May 02, 2011 at 09:16:36PM +0000, KY Srinivasan wrote: That assumes libata is a module, which it is not for many popular distributions. As long as you can prevent ata_piix from loading, it should be fine. Again, this might very well be built in, e.g. take a look at: http://pkgs.fedoraproject.org/gitweb/?p=kernel.git;a=blob;f=config-generic;h=779415bcc036b922ba92de9c4b15b9da64e9707c;hb=HEAD http://gitorious.org/opensuse/kernel-source/blobs/master/config/x86_64/default Good point! For what it is worth, last night I hacked up code to present the block devices currently managed by the blkvsc driver as scsi devices. I have still retained the blkvsc driver to handshake with the host and set up the channel etc. Rather than presenting this device as an IDE device to the guest, as you had suggested, I am adding this device as a scsi device under the HBA implemented by the storvsc driver. I have assigned a special channel number to distinguish these IDE disks, so that on the I/O paths we can communicate over the appropriate channels. Given that the host is completely oblivious to this arrangement on the guest, I suspect we don't need to worry about future versions of Windows breaking this. From the very minimal testing I have done, things appear to work well. However, the motherboard emulation in Hyper-V requires the boot device to be an IDE device and other than taking over the IDE majors, I don't know of a way to prevent the native drivers from taking over the boot device. On SLES, I had implemented modprobe rules to deal with the issue you had mentioned; it is not clear what the general solution might be for this problem, if any, other than changes to the host. Regards, K. Y
Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code
On Fri, Apr 29, 2011 at 04:32:35PM +0000, KY Srinivasan wrote: On the host-side, as part of configuring a guest you can specify block devices as being under an IDE controller or under a SCSI controller. Those are the only options you have. Devices configured under the IDE controller cannot be seen in the guest under the emulated SCSI front-end which is the scsi driver (storvsc_drv). So, when you do a bus scan in the emulated scsi front-end, the devices enumerated will not include block devices configured under the IDE controller. So, it is not clear to me how I can do what you are proposing given the restrictions imposed by the host. Just because a device is not reported by REPORT_LUNS doesn't mean you can't talk to it using a SCSI LLDD. We have SCSI transports with all kinds of strange ways to discover devices. Using scsi_add_device you can add LUNs found by your own discovery methods, and use all the existing scsi command handling.
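To make the suggestion concrete, here is a rough sketch - not the actual storvsc code - of a SCSI LLDD that registers its host and then adds a disk found by its own, non-REPORT-LUNS discovery via scsi_add_device. The hv_sketch_* names and the reserved channel number are assumptions for illustration only.

#include <linux/module.h>
#include <scsi/scsi.h>
#include <scsi/scsi_cmnd.h>
#include <scsi/scsi_device.h>
#include <scsi/scsi_host.h>

#define HV_IDE_CHANNEL 1        /* assumed: channel reserved for IDE-configured disks */

/* Placeholder: a real driver would package the CDB and sg list here and
 * send them to the host over its vmbus channel. */
static int hv_sketch_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *scmnd)
{
        scmnd->result = DID_NO_CONNECT << 16;
        scmnd->scsi_done(scmnd);
        return 0;
}

static struct scsi_host_template hv_sketch_template = {
        .module         = THIS_MODULE,
        .name           = "hv_storvsc_sketch",
        .queuecommand   = hv_sketch_queuecommand,
        .this_id        = -1,
        .can_queue      = 32,
        .cmd_per_lun    = 1,
        .sg_tablesize   = SG_ALL,
};

static int hv_sketch_probe(struct device *parent)
{
        struct Scsi_Host *host;
        int ret;

        host = scsi_host_alloc(&hv_sketch_template, 0);
        if (!host)
                return -ENOMEM;

        ret = scsi_add_host(host, parent);
        if (ret) {
                scsi_host_put(host);
                return ret;
        }

        /* Devices the host reports through the SCSI controller show up here. */
        scsi_scan_host(host);

        /*
         * Disks configured under the IDE controller are not reported by the
         * scan, so add them explicitly once the driver's own discovery (over
         * its vmbus channel) has found them.
         */
        return scsi_add_device(host, HV_IDE_CHANNEL, 0 /* target */, 0 /* lun */);
}

With this approach, every device - however it was configured on the host side - ends up as an sd-managed disk with the standard naming and the full midlayer command handling.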
Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code
On Fri, Apr 29, 2011 at 09:40:25AM -0700, Greg KH wrote: Are you sure the libata core can't see this ide controller and connect to it? That way you would use the scsi system if you do that and you would need a much smaller ide driver, perhaps being able to merge it with your scsi driver. We really don't want to write new IDE drivers anymore that don't use libata. The blkvsc driver isn't an IDE driver, although it currently claims the old IDE drivers major numbers, which is a no-no and can't work in most usual setups. I'm pretty sure I already complained about this in a previous review round. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/virtualization
RE: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code
-Original Message- From: Christoph Hellwig [mailto:h...@infradead.org] Sent: Sunday, May 01, 2011 11:41 AM To: Greg KH Cc: KY Srinivasan; Christoph Hellwig; gre...@suse.de; linux- ker...@vger.kernel.org; de...@linuxdriverproject.org; virtualizat...@lists.osdl.org Subject: Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code On Fri, Apr 29, 2011 at 09:40:25AM -0700, Greg KH wrote: Are you sure the libata core can't see this ide controller and connect to it? That way you would use the scsi system if you do that and you would need a much smaller ide driver, perhaps being able to merge it with your scsi driver. We really don't want to write new IDE drivers anymore that don't use libata. The blkvsc driver isn't an IDE driver, although it currently claims the old IDE drivers major numbers, which is a no-no and can't work in most usual setups. What is the issue here? This is no different than what is done in other Virtualization platforms. For instance, the Xen blkfront driver is no different - if you specify the block device to be presented to the guest as an ide device, it will register for the appropriate ide major number. Regards, K. Y I'm pretty sure I already complained about this in a previous review round. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/virtualization
Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code
On Sun, May 01, 2011 at 11:39:21AM -0400, Christoph Hellwig wrote: On Fri, Apr 29, 2011 at 04:32:35PM +, KY Srinivasan wrote: On the host-side, as part of configuring a guest you can specify block devices as being under an IDE controller or under a SCSI controller. Those are the only options you have. Devices configured under the IDE controller cannot be seen in the guest under the emulated SCSI front-end which is the scsi driver (storvsc_drv). So, when you do a bus scan in the emulated scsi front-end, the devices enumerated will not include block devices configured under the IDE controller. So, it is not clear to me how I can do what you are proposing given the restrictions imposed by the host. Just because a device is not reported by REPORT_LUNS doesn't mean you can't talk to it using a SCSI LLDD. We have SCSI transports with all kinds of strange ways to discover devices. Using scsi_add_device you can add LUNs found by your own discovery methods, and use all the existing scsi command handling. Yeah, it seems to me that no matter how the user specifies the disk type for the guest configuration, we should use the same Linux driver, with the same naming scheme for both ways. As Christoph points out, it's just a matter of hooking the device up to the scsi subsystem. We do that today for ide, usb, scsi, and loads of other types of devices all with the common goal of making it easier for userspace to handle the devices in a standard manner. thanks, greg k-h ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/virtualization
Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code
On Sun, May 01, 2011 at 03:46:23PM +0000, KY Srinivasan wrote: What is the issue here? This is no different than what is done in other Virtualization platforms. For instance, the Xen blkfront driver is no different - if you specify the block device to be presented to the guest as an ide device, it will register for the appropriate ide major number. No, it won't - at least not in mainline, just because it's so buggy. If distros keep that crap around I can only recommend you to not use them.
RE: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code
-Original Message- From: Christoph Hellwig [mailto:h...@infradead.org] Sent: Sunday, May 01, 2011 12:07 PM To: KY Srinivasan Cc: Christoph Hellwig; Greg KH; gre...@suse.de; linux-ker...@vger.kernel.org; de...@linuxdriverproject.org; virtualizat...@lists.osdl.org Subject: Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code On Sun, May 01, 2011 at 03:46:23PM +, KY Srinivasan wrote: What is the issue here? This is no different than what is done in other Virtualization platforms. For instance, the Xen blkfront driver is no different - if you specify the block device to be presented to the guest as an ide device, it will register for the appropriate ide major number. No, it won't - at least not in mainline just because it's so buggy. If distros keep that crap around I can only recommed you to not use them. Christoph, Could you elaborate on the problems/issues when the block driver registers for the IDE majors. On the Qemu side, we have a mechanism to disable the emulation when PV drivers load. I don't think there is an equivalent mechanism on the Windows side. So, as far as I know, registering for the IDE majors is the only way to also prevent native drivers in Linux from taking control of the emulated device. Regards, K. Y ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/virtualization
RE: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code
-Original Message- From: Greg KH [mailto:g...@kroah.com] Sent: Sunday, May 01, 2011 11:48 AM To: KY Srinivasan Cc: Christoph Hellwig; gre...@suse.de; linux-ker...@vger.kernel.org; de...@linuxdriverproject.org; virtualizat...@lists.osdl.org Subject: Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code On Sun, May 01, 2011 at 11:39:21AM -0400, Christoph Hellwig wrote: On Fri, Apr 29, 2011 at 04:32:35PM +, KY Srinivasan wrote: On the host-side, as part of configuring a guest you can specify block devices as being under an IDE controller or under a SCSI controller. Those are the only options you have. Devices configured under the IDE controller cannot be seen in the guest under the emulated SCSI front- end which is the scsi driver (storvsc_drv). So, when you do a bus scan in the emulated scsi front-end, the devices enumerated will not include block devices configured under the IDE controller. So, it is not clear to me how I can do what you are proposing given the restrictions imposed by the host. Just because a device is not reported by REPORT_LUNS doesn't mean you can't talk to it using a SCSI LLDD. We have SCSI transports with all kinds of strange ways to discover devices. Using scsi_add_device you can add LUNs found by your own discovery methods, and use all the existing scsi command handling. Yeah, it seems to me that no matter how the user specifies the disk type for the guest configuration, we should use the same Linux driver, with the same naming scheme for both ways. As Christoph points out, it's just a matter of hooking the device up to the scsi subsystem. We do that today for ide, usb, scsi, and loads of other types of devices all with the common goal of making it easier for userspace to handle the devices in a standard manner. This is not what is being done in Xen and KVM - they both have a PV front-end block drivers that is not managed by the scsi stack. The Hyper-V block driver is equivalent to what we have in Xen and KVM in this respect. Regards, K. Y ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/virtualization
Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code
On Sun, May 01, 2011 at 06:56:58PM +0000, KY Srinivasan wrote: Yeah, it seems to me that no matter how the user specifies the disk type for the guest configuration, we should use the same Linux driver, with the same naming scheme for both ways. As Christoph points out, it's just a matter of hooking the device up to the scsi subsystem. We do that today for ide, usb, scsi, and loads of other types of devices all with the common goal of making it easier for userspace to handle the devices in a standard manner. This is not what is being done in Xen and KVM - they both have PV front-end block drivers that are not managed by the scsi stack. The Hyper-V block driver is equivalent to what we have in Xen and KVM in this respect. Xen also has a PV SCSI driver, although that isn't used very much. For virtio we think it was a mistake to not speak SCSI these days, and ponder introducing a virtio-scsi to replace virtio-blk. But that's not the point here at all. The point is that blockvsc speaks a SCSI protocol over the wire, so it should be implemented as a SCSI LLDD unless you have a good reason not to do it. This is especially important to get advanced features like block level cache flush and FUA support, device topology, discard support, for free. Cache flush and FUA are a good example of something that blkvsc currently gets wrong, btw.
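To make the cache flush/FUA point concrete: an sd-managed SCSI device gets flush and FUA handling set up by the midlayer from the device's reported cache settings, whereas a stand-alone block driver has to opt in itself. A sketch of the block-driver side using the 2.6.37+ queue API follows; the pv_* names are hypothetical and this is not the actual blkvsc code.

#include <linux/blkdev.h>

/*
 * What a stand-alone block driver would have to do itself to advertise
 * cache flush + FUA: ask the block layer to generate REQ_FLUSH/REQ_FUA
 * requests, which the driver's (hypothetical) request function must then
 * turn into SYNCHRONIZE CACHE / FUA-tagged writes on the wire.
 */
static void pv_blk_enable_flush(struct request_queue *q)
{
        blk_queue_flush(q, REQ_FLUSH | REQ_FUA);
}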
Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code
On Sun, May 01, 2011 at 06:08:37PM +0000, KY Srinivasan wrote: Could you elaborate on the problems/issues when the block driver registers for the IDE majors? On the Qemu side, we have a mechanism to disable the emulation when PV drivers load. I don't think there is an equivalent mechanism on the Windows side. So, as far as I know, registering for the IDE majors is the only way to also prevent native drivers in Linux from taking control of the emulated device. What qemu are you talking about for the qemu side? Upstream qemu doesn't have any way to provide the same image as multiple devices, never mind dynamically unplugging bits in that case. Nor does it support the hyperv devices. When you steal majors you rely on: a) loading earlier than the driver you steal them from b) the driver not simply using other numbers c) if it doesn't, preventing it from working at all, also for devices you don't replace with your PV devices. d) that the guest actually uses the minors you claim, e.g. any current linux distribution uses libata anyway, so your old IDE major claim wouldn't do anything. Nor would claiming sd majors, as the low-level libata driver would still drive the hardware even if SD doesn't bind to it. You really must never present the same device as two emulated devices instead of doing such hacks.
Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code
On Fri, Apr 29, 2011 at 02:26:13PM +, KY Srinivasan wrote: Perhaps I did not properly formulate my question here. The review process itself may be open-ended, and that is fine - we will fix all legitimate issues/concerns in our drivers whether they are in the staging area or not. My question was specifically with regards to the review process that may gate exiting staging. I am hoping to re-spin the remaining patches of the last patch-set and send it to you by early next week and ask for a review. I fully intend to address whatever review comments I may get in a very timely manner. Assuming at some point in time after I ask for this review there are no outstanding issues, would that be sufficient to exit staging? If it looks acceptable to me, and there are no other objections from other developers, then yes, that would be sufficient to move it out of staging. thanks, greg k-h ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/virtualization
RE: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code
-Original Message- From: Christoph Hellwig [mailto:h...@infradead.org] Sent: Wednesday, April 27, 2011 8:19 AM To: KY Srinivasan Cc: Christoph Hellwig; Greg KH; gre...@suse.de; linux-ker...@vger.kernel.org; de...@linuxdriverproject.org; virtualizat...@lists.osdl.org Subject: Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code On Wed, Apr 27, 2011 at 11:47:03AM +, KY Srinivasan wrote: On the host side, Windows emulates the standard PC hardware to permit hosting of fully virtualized operating systems. To enhance disk I/O performance, we support a virtual block driver. This block driver currently handles disks that have been setup as IDE disks for the guest - as specified in the guest configuration. On the SCSI side, we emulate a SCSI HBA. Devices configured under the SCSI controller for the guest are handled via this emulated HBA (SCSI front-end). So, SCSI disks configured for the guest are handled through native SCSI upper-level drivers. If this SCSI front-end driver is not loaded, currently, the guest cannot see devices that have been configured as SCSI devices. So, while the virtual block driver described earlier could potentially handle all block devices, the implementation choices made on the host will not permit it. Also, the only SCSI device that can be currently configured for the guest is a disk device. Both the block device driver (hv_blkvsc) and the SCSI front-end driver (hv_storvsc) communicate with the host via unique channels that are implemented as bi-directional ring buffers. Each (storage) channel carries with it enough state to uniquely identify the device on the host side. Microsoft has chosen to use SCSI verbs for this storage channel communication. This doesn't really explain much at all. The only important piece of information I can read from this statement is that both blkvsc and storvsc only support disks, but not any other kind of device, and that chosing either one is an arbitrary seletin when setting up a VM configuration. But this still isn't an excuse to implement a block layer driver for a SCSI protocol, and it doesn't not explain in what way the two protocols actually differ. You really should implement blksvs as a SCSI LLDD, too - and from the looks of it it doesn't even have to be a separate one, but just adding the ids to storvsc would do the work. On the host-side, as part of configuring a guest you can specify block devices as being under an IDE controller or under a SCSI controller. Those are the only options you have. Devices configured under the IDE controller cannot be seen in the guest under the emulated SCSI front-end which is the scsi driver (storvsc_drv). So, when you do a bus scan in the emulated scsi front-end, the devices enumerated will not include block devices configured under the IDE controller. So, it is not clear to me how I can do what you are proposing given the restrictions imposed by the host. Regards, K. Y ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/virtualization
Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code
On Fri, Apr 29, 2011 at 04:32:35PM +, KY Srinivasan wrote: -Original Message- From: Christoph Hellwig [mailto:h...@infradead.org] Sent: Wednesday, April 27, 2011 8:19 AM To: KY Srinivasan Cc: Christoph Hellwig; Greg KH; gre...@suse.de; linux-ker...@vger.kernel.org; de...@linuxdriverproject.org; virtualizat...@lists.osdl.org Subject: Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code On Wed, Apr 27, 2011 at 11:47:03AM +, KY Srinivasan wrote: On the host side, Windows emulates the standard PC hardware to permit hosting of fully virtualized operating systems. To enhance disk I/O performance, we support a virtual block driver. This block driver currently handles disks that have been setup as IDE disks for the guest - as specified in the guest configuration. On the SCSI side, we emulate a SCSI HBA. Devices configured under the SCSI controller for the guest are handled via this emulated HBA (SCSI front-end). So, SCSI disks configured for the guest are handled through native SCSI upper-level drivers. If this SCSI front-end driver is not loaded, currently, the guest cannot see devices that have been configured as SCSI devices. So, while the virtual block driver described earlier could potentially handle all block devices, the implementation choices made on the host will not permit it. Also, the only SCSI device that can be currently configured for the guest is a disk device. Both the block device driver (hv_blkvsc) and the SCSI front-end driver (hv_storvsc) communicate with the host via unique channels that are implemented as bi-directional ring buffers. Each (storage) channel carries with it enough state to uniquely identify the device on the host side. Microsoft has chosen to use SCSI verbs for this storage channel communication. This doesn't really explain much at all. The only important piece of information I can read from this statement is that both blkvsc and storvsc only support disks, but not any other kind of device, and that chosing either one is an arbitrary seletin when setting up a VM configuration. But this still isn't an excuse to implement a block layer driver for a SCSI protocol, and it doesn't not explain in what way the two protocols actually differ. You really should implement blksvs as a SCSI LLDD, too - and from the looks of it it doesn't even have to be a separate one, but just adding the ids to storvsc would do the work. On the host-side, as part of configuring a guest you can specify block devices as being under an IDE controller or under a SCSI controller. Those are the only options you have. Devices configured under the IDE controller cannot be seen in the guest under the emulated SCSI front-end which is the scsi driver (storvsc_drv). Are you sure the libata core can't see this ide controller and connect to it? That way you would use the scsi system if you do that and you would need a much smaller ide driver, perhaps being able to merge it with your scsi driver. We really don't want to write new IDE drivers anymore that don't use libata. thanks, greg k-h ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/virtualization
RE: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code
-Original Message- From: Greg KH [mailto:g...@kroah.com] Sent: Friday, April 29, 2011 12:40 PM To: KY Srinivasan Cc: Christoph Hellwig; gre...@suse.de; linux-ker...@vger.kernel.org; de...@linuxdriverproject.org; virtualizat...@lists.osdl.org Subject: Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code On Fri, Apr 29, 2011 at 04:32:35PM +, KY Srinivasan wrote: -Original Message- From: Christoph Hellwig [mailto:h...@infradead.org] Sent: Wednesday, April 27, 2011 8:19 AM To: KY Srinivasan Cc: Christoph Hellwig; Greg KH; gre...@suse.de; linux- ker...@vger.kernel.org; de...@linuxdriverproject.org; virtualizat...@lists.osdl.org Subject: Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code On Wed, Apr 27, 2011 at 11:47:03AM +, KY Srinivasan wrote: On the host side, Windows emulates the standard PC hardware to permit hosting of fully virtualized operating systems. To enhance disk I/O performance, we support a virtual block driver. This block driver currently handles disks that have been setup as IDE disks for the guest - as specified in the guest configuration. On the SCSI side, we emulate a SCSI HBA. Devices configured under the SCSI controller for the guest are handled via this emulated HBA (SCSI front-end). So, SCSI disks configured for the guest are handled through native SCSI upper-level drivers. If this SCSI front-end driver is not loaded, currently, the guest cannot see devices that have been configured as SCSI devices. So, while the virtual block driver described earlier could potentially handle all block devices, the implementation choices made on the host will not permit it. Also, the only SCSI device that can be currently configured for the guest is a disk device. Both the block device driver (hv_blkvsc) and the SCSI front-end driver (hv_storvsc) communicate with the host via unique channels that are implemented as bi-directional ring buffers. Each (storage) channel carries with it enough state to uniquely identify the device on the host side. Microsoft has chosen to use SCSI verbs for this storage channel communication. This doesn't really explain much at all. The only important piece of information I can read from this statement is that both blkvsc and storvsc only support disks, but not any other kind of device, and that chosing either one is an arbitrary seletin when setting up a VM configuration. But this still isn't an excuse to implement a block layer driver for a SCSI protocol, and it doesn't not explain in what way the two protocols actually differ. You really should implement blksvs as a SCSI LLDD, too - and from the looks of it it doesn't even have to be a separate one, but just adding the ids to storvsc would do the work. On the host-side, as part of configuring a guest you can specify block devices as being under an IDE controller or under a SCSI controller. Those are the only options you have. Devices configured under the IDE controller cannot be seen in the guest under the emulated SCSI front- end which is the scsi driver (storvsc_drv). Are you sure the libata core can't see this ide controller and connect to it? That way you would use the scsi system if you do that and you would need a much smaller ide driver, perhaps being able to merge it with your scsi driver. If we don't load the blkvsc driver, the emulated IDE controller exposed to the guest can and will be seen by the libata core. In this case though, your disk I/O will be taking the emulated path with the usual performance hit. 
When you load the blkvsc driver, the device access does not go through the emulated IDE controller. Blkvsc is truly a generic block driver that registers as a block driver in the guest and talks to an appropriate device driver on the host, communicating over the vmbus. In this respect, it is identical to block drivers we have for guests in other virtualization platforms (Xen etc.). The only difference is that on the host side, the only way you can assign a scsi disk to the guest is to configure this scsi disk under the scsi controller. So, while blkvsc is a generic block driver, because of the restrictions on the host side, it only ends up managing block devices that have IDE majors. We really don't want to write new IDE drivers anymore that don't use libata. As I noted earlier, it is incorrect to view Hyper-V blkvsc driver as an IDE driver. There is nothing IDE specific about it. It is very much like other block front-end drivers (like in Xen) that get their device information from the host and register the block device accordingly with the guest. It just happens that in the current version of the Windows host, only devices that are configured as IDE devices in the host end up being managed by this driver. To make this clear, in my recent
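(For reference: a minimal sketch of what "registering as a generic block driver in the guest" involves, using the 2.6.39-era block API. The pv_* names are hypothetical and this is not the actual blkvsc code.)

#include <linux/blkdev.h>
#include <linux/fs.h>
#include <linux/genhd.h>
#include <linux/module.h>
#include <linux/spinlock.h>

static DEFINE_SPINLOCK(pv_blk_lock);

/* Placeholder request function: a real PV driver would translate each
 * request into a message on its vmbus channel and complete it when the
 * host responds. */
static void pv_blk_request(struct request_queue *q)
{
        struct request *req;

        while ((req = blk_fetch_request(q)) != NULL)
                __blk_end_request_all(req, -EIO);
}

static const struct block_device_operations pv_blk_fops = {
        .owner = THIS_MODULE,
};

static int pv_blk_add_disk(sector_t capacity)
{
        struct request_queue *q;
        struct gendisk *gd;
        int major;

        major = register_blkdev(0, "pvblk");    /* 0: ask for a dynamic major */
        if (major < 0)
                return major;

        q = blk_init_queue(pv_blk_request, &pv_blk_lock);
        if (!q) {
                unregister_blkdev(major, "pvblk");
                return -ENOMEM;
        }

        gd = alloc_disk(16);                    /* whole disk + 15 partitions */
        if (!gd) {
                blk_cleanup_queue(q);
                unregister_blkdev(major, "pvblk");
                return -ENOMEM;
        }

        gd->major = major;
        gd->first_minor = 0;
        gd->fops = &pv_blk_fops;
        gd->queue = q;
        sprintf(gd->disk_name, "pvblk0");
        set_capacity(gd, capacity);             /* capacity in 512-byte sectors */
        add_disk(gd);
        return 0;
}

Nothing in this path is IDE-specific; the IDE association exists only because of how the host configuration exposes the disks.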
Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code
On Tue, Apr 26, 2011 at 09:19:45AM -0700, K. Y. Srinivasan wrote: This patch-set addresses some of the bus/driver model cleanup that Greg suggested over the last couple of days. In this patch-set we deal with the following issues: 1) Cleanup unnecessary state in struct hv_device and struct hv_driver to be compliant with the Linux Driver model. 2) Cleanup the vmbus_match() function to conform with the Linux Driver model. 3) Cleanup error handling in the vmbus_probe() and vmbus_child_device_register() functions. Fixed a bug in the probe failure path as part of this cleanup. 4) The Windows host cannot handle the vmbus_driver being unloaded and subsequently loaded. Cleanup the driver with this in mind. I've stopped at this patch (well, I applied one more, but you can see that.) I'd like to get some confirmation that this is really what you all want to do here before applying it. If it is, care to resend them with a bit more information about this issue and why you all are making it? Anyway, other than this one, the series looks good. But you should follow up with some driver structure changes like what Christoph said to do. After that, do you want another round of review of the code, or do you have more things you want to send in (like the name[64] removal?) thanks, greg k-h
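For context on item 2 above: a bus match routine that conforms to the driver model is just a comparison of the child device's identity against the driver's id table. A generic sketch follows (hypothetical my_* names, not the actual vmbus code, which matches on GUIDs rather than name strings).

#include <linux/device.h>
#include <linux/string.h>

/* Hypothetical per-bus types; a real bus would carry a GUID or similar
 * identity instead of a name string. */
struct my_device_id { char name[32]; };

struct my_device {
        const char *type_name;
        struct device dev;
};

struct my_driver {
        const struct my_device_id *id_table;
        struct device_driver driver;
};

/* ->match pairs a child device with a driver; the core then calls probe(). */
static int my_bus_match(struct device *dev, struct device_driver *drv)
{
        struct my_device *mdev = container_of(dev, struct my_device, dev);
        struct my_driver *mdrv = container_of(drv, struct my_driver, driver);
        const struct my_device_id *id;

        for (id = mdrv->id_table; id->name[0]; id++)
                if (!strcmp(id->name, mdev->type_name))
                        return 1;
        return 0;
}

static struct bus_type my_bus_type = {
        .name   = "mybus",
        .match  = my_bus_match,
};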
RE: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code
-Original Message- From: Greg KH [mailto:g...@kroah.com] Sent: Tuesday, April 26, 2011 7:29 PM To: KY Srinivasan Cc: gre...@suse.de; linux-ker...@vger.kernel.org; de...@linuxdriverproject.org; virtualizat...@lists.osdl.org Subject: Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code On Tue, Apr 26, 2011 at 09:19:45AM -0700, K. Y. Srinivasan wrote: This patch-set addresses some of the bus/driver model cleanup that Greg suggested over the last couple of days. In this patch-set we deal with the following issues: 1) Cleanup unnecessary state in struct hv_device and struct hv_driver to be compliant with the Linux Driver model. 2) Cleanup the vmbus_match() function to conform with the Linux Driver model. 3) Cleanup error handling in the vmbus_probe() and vmbus_child_device_register() functions. Fixed a bug in the probe failure path as part of this cleanup. 4) The Windows host cannot handle the vmbus_driver being unloaded and subsequently loaded. Cleanup the driver with this in mind. I've stopped at this patch (well, I applied one more, but you can see that.) I'd like to get some confirmation that this is really what you all want to do here before applying it. If it is, care to resend them with a bit more information about this issue and why you all are making it? Greg, this is a restriction imposed by the Windows host: you cannot reload the Vmbus driver without rebooting the guest. If you cannot re-load, what good is it to be able to unload? Distros that integrate these drivers will load these drivers automatically on boot and there is not much point in being able to unload this since most likely the root device will be handled by these drivers. For systems that don't integrate these drivers, I don't see much point in allowing the driver to be unloaded, if you cannot reload the driver without rebooting the guest. If and when the Windows host supports reloading the vmbus driver, we can very easily add this functionality. The situation currently is at best very misleading - you think you can unload the vmbus driver, only to discover that you have to reboot the guest! Anyway, other than this one, the series looks good. But you should follow up with some driver structure changes like what Christoph said to do. I will send you a patch for this. After that, do you want another round of review of the code, or do you have more things you want to send in (like the name[64] removal?) I would prefer that we go through the review process. What is the process for this review? Is there a time window for people to respond? I am hoping I will be able to address all the review comments well in advance of the next closing of the tree, with the hope of taking the vmbus driver out of staging this go around (hope springs eternal in the human breast ...)! Regards, K. Y thanks, greg k-h
Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code
On Wed, Apr 27, 2011 at 01:54:02AM +0000, KY Srinivasan wrote: I would prefer that we go through the review process. What is the process for this review? Is there a time window for people to respond? I am hoping I will be able to address all the review comments well in advance of the next closing of the tree, with the hope of taking the vmbus driver out of staging this go around (hope springs eternal in the human breast ...)! It would be useful if you'd send one driver at a time to the list as the full source to review. Did we make any progress on the naming discussion? In my opinion hv is a far too generic name for your drivers. Why not call it mshv for the driver directory and prefixes? As far as the core code is concerned, can you explain the use of the dev_add, dev_rm and cleanup methods and how they relate to the normal probe/remove/shutdown methods? As far as the storage drivers are concerned I still have issues with the architecture. I haven't seen any good explanation why you want to have the blkvsc and storvsc drivers different from each other. They both speak the same vmbus-level protocol and tunnel scsi commands over it. Why would you sometimes expose this SCSI protocol as a SCSI LLDD and sometimes as a block driver? What decides whether a device is exported in a way such that blkvsc is bound to it vs storvsc? What do they look like on the windows side? From my understanding of the windows driver models both the recent storport model and the older scsiport model are more or less talking scsi to the driver anyway, so what is the difference between the two for a Windows guest? Also please get rid of struct storvsc_driver_object, it's just a very strange way to store file-scope variables, and useless indirection for the I/O submission handler.
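For reference, the "normal" methods referred to here are the entry points struct device_driver already provides; a generic sketch of the shape that a bus-specific dev_add/dev_rm/cleanup scheme would usually map onto (hypothetical my_* names, not the vmbus code).

#include <linux/device.h>

static struct bus_type my_bus_type = {
        .name = "mybus",        /* the bus this driver binds to */
};

static int my_probe(struct device *dev)
{
        /* bind: allocate per-device state, open the channel to the host, etc. */
        return 0;
}

static int my_remove(struct device *dev)
{
        /* unbind: close the channel, free per-device state */
        return 0;
}

static void my_shutdown(struct device *dev)
{
        /* quiesce the device for reboot/poweroff */
}

static struct device_driver my_driver = {
        .name           = "my_driver",
        .bus            = &my_bus_type,
        .probe          = my_probe,
        .remove         = my_remove,
        .shutdown       = my_shutdown,
};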
RE: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code
-Original Message- From: Christoph Hellwig [mailto:h...@infradead.org] Sent: Wednesday, April 27, 2011 2:46 AM To: KY Srinivasan Cc: Greg KH; gre...@suse.de; linux-ker...@vger.kernel.org; de...@linuxdriverproject.org; virtualizat...@lists.osdl.org Subject: Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code On Wed, Apr 27, 2011 at 01:54:02AM +, KY Srinivasan wrote: I would prefer that we go through the review process. What is the process for this review? Is there a time window for people to respond. I am hoping I will be able to address all the review comments well in advance of the next closing of the tree, with the hope of taking the vmbus driver out of staging this go around (hope springs eternal in the human breast ...)! It would be useful if you'd send one driver at a time to the list as the full source to review. Did we make any progress on the naming discussion? In my opinion hv is a far to generic name for your drivers. Why not call it mshv dor the driver directory and prefixes? This topic was discussed at some great length back in Feb/March when I did a bunch of cleanup with regards how the driver and device data structures were layered. At that point, the consensus was to keep the hv prefix. As far as the core code is concerned, can you explain the use of the dev_add, dev_rm and cleanup methods and how they relate to the normal probe/remove/shutdown methods? While I am currently cleaning up our block drivers, my goal this go around is to work on getting the vmbus driver out of staging. I am hoping when I am ready for having you guys review the storage drivers, I will have dealt with the issues you raise here. As far as the storage drivers are concerned I still have issues with the architecture. I haven't seen any good explanation why you want to have the blkvsc and storvsc drivers different from each other. They both speak the same vmbus-level protocol and tunnel scsi commands over it. Why would you sometimes expose this SCSI protocol as a SCSI LLDD and sometimes as a block driver? What decides that a device is exported in a way to that blkvsc is bound to them vs storvsc? How do they look like on the windows side? From my understanding of the windows driver models both the recent storport model and the older scsiport model are more or less talking scsi to the driver anyway, so what is the difference between the two for a Windows guest? I had written up a brief note that I had sent out setting the stage for the first patch-set for cleaning up the block drivers. I am copying it here for your convenience: From: K. Y. Srinivasan k...@microsoft.com Date: Tue, 22 Mar 2011 11:54:46 -0700 Subject: [PATCH 00/16] Staging: hv: Cleanup storage drivers - Phase I This is first in a series of patch-sets aimed at cleaning up the storage drivers for Hyper-V. Before I get into the details of this patch-set, I think it is useful to give a brief overview of the storage related front-end drivers currently in the tree for Linux on Hyper-V: On the host side, Windows emulates the standard PC hardware to permit hosting of fully virtualized operating systems. To enhance disk I/O performance, we support a virtual block driver. This block driver currently handles disks that have been setup as IDE disks for the guest - as specified in the guest configuration. On the SCSI side, we emulate a SCSI HBA. Devices configured under the SCSI controller for the guest are handled via this emulated HBA (SCSI front-end). So, SCSI disks configured for the guest are handled through native SCSI upper-level drivers. 
If this SCSI front-end driver is not loaded, currently, the guest cannot see devices that have been configured as SCSI devices. So, while the virtual block driver described earlier could potentially handle all block devices, the implementation choices made on the host will not permit it. Also, the only SCSI device that can be currently configured for the guest is a disk device. Both the block device driver (hv_blkvsc) and the SCSI front-end driver (hv_storvsc) communicate with the host via unique channels that are implemented as bi-directional ring buffers. Each (storage) channel carries with it enough state to uniquely identify the device on the host side. Microsoft has chosen to use SCSI verbs for this storage channel communication. Also pleae get rid of struct storvsc_driver_object, it's just a very strange way to store file-scope variables, and useless indirection for the I/O submission handler. I will do this as part of storage cleanup I am currently doing. Thank you for taking the time to review the code. Regards, K. Y ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/virtualization
Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code
On Wed, Apr 27, 2011 at 11:47:03AM +0000, KY Srinivasan wrote: On the host side, Windows emulates the standard PC hardware to permit hosting of fully virtualized operating systems. To enhance disk I/O performance, we support a virtual block driver. This block driver currently handles disks that have been set up as IDE disks for the guest - as specified in the guest configuration. On the SCSI side, we emulate a SCSI HBA. Devices configured under the SCSI controller for the guest are handled via this emulated HBA (SCSI front-end). So, SCSI disks configured for the guest are handled through native SCSI upper-level drivers. If this SCSI front-end driver is not loaded, currently, the guest cannot see devices that have been configured as SCSI devices. So, while the virtual block driver described earlier could potentially handle all block devices, the implementation choices made on the host will not permit it. Also, the only SCSI device that can be currently configured for the guest is a disk device. Both the block device driver (hv_blkvsc) and the SCSI front-end driver (hv_storvsc) communicate with the host via unique channels that are implemented as bi-directional ring buffers. Each (storage) channel carries with it enough state to uniquely identify the device on the host side. Microsoft has chosen to use SCSI verbs for this storage channel communication. This doesn't really explain much at all. The only important piece of information I can read from this statement is that both blkvsc and storvsc only support disks, but not any other kind of device, and that choosing either one is an arbitrary selection when setting up a VM configuration. But this still isn't an excuse to implement a block layer driver for a SCSI protocol, and it doesn't explain in what way the two protocols actually differ. You really should implement blkvsc as a SCSI LLDD, too - and from the looks of it it doesn't even have to be a separate one, but just adding the ids to storvsc would do the work.
Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code
On Wed, Apr 27, 2011 at 01:54:02AM +0000, KY Srinivasan wrote: After that, do you want another round of review of the code, or do you have more things you want to send in (like the name[64] removal?) I would prefer that we go through the review process. What is the process for this review? The same as always, just ask. Is there a time window for people to respond? No. We don't have time limits here, this is a community, we don't have deadlines, you know that. I am hoping I will be able to address all the review comments well in advance of the next closing of the tree, with the hope of taking the vmbus driver out of staging this go around (hope springs eternal in the human breast ...)! Yes, it would be nice, and I understand the corporate pressures you are under to get this done, and I am doing my best to fit the patch review and apply cycle into my very-very-limited-at-the-moment spare time. As always, if you miss this kernel release, there's always another one 3 months away, so it's no big deal in the long run. thanks, greg k-h
Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code
Do you have a repository containing the current state of your patches somewhere? There's been so much cleanup that it's hard to review these patches against the current mainline codebase.
RE: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code
-Original Message- From: Christoph Hellwig [mailto:h...@infradead.org] Sent: Tuesday, April 26, 2011 12:57 PM To: KY Srinivasan Cc: gre...@suse.de; linux-ker...@vger.kernel.org; de...@linuxdriverproject.org; virtualizat...@lists.osdl.org Subject: Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code Do you have a repository containing the current state of your patches somewhere? There's been so much cleanup that it's hard to review these patches against the current mainline codebase. Christoph, Yesterday (April 25, 2011), Greg checked in all of the outstanding hv patches. So, if you check out Greg's tree today, you will get the most recent hv codebase. This current patch-set is against Greg's current tree. Regards, K. Y
Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code
On Tue, Apr 26, 2011 at 05:04:36PM +, KY Srinivasan wrote: -Original Message- From: Christoph Hellwig [mailto:h...@infradead.org] Sent: Tuesday, April 26, 2011 12:57 PM To: KY Srinivasan Cc: gre...@suse.de; linux-ker...@vger.kernel.org; de...@linuxdriverproject.org; virtualizat...@lists.osdl.org Subject: Re: [PATCH 00/25] Staging: hv: Cleanup vmbus driver code Do you have a repository containing the current state of your patche somewhere? There's been so much cleanup that it's hard to review these patches against the current mainline codebase. Christoph, Yesterday (April 25, 2011), Greg checked in all of the outstanding hv patches. So, if You checkout Greg's tree today, you will get the most recent hv codebase. This current patch-set is against Greg's current tree. It's also always in the linux-next tree, which is easier for most people to work off of. thanks, greg k-h ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linux-foundation.org/mailman/listinfo/virtualization