Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
On Thu, 2012-07-19 at 08:00 +0200, Paolo Bonzini wrote:
> On 18/07/2012 21:12, Anthony Liguori wrote:
> > Windows does this with a points system and I do believe that INQUIRY responses from any local disks are included in this tally.
>
> INQUIRY responses (at least vendor/product/type) should not change.

INQUIRY responses often change for arrays because a firmware upgrade enables new features and new features have to declare themselves, usually in the INQUIRY data. What you mean, I think, is that previously exposed features in INQUIRY data, as well as strings (vendor/product/type, as you say), shouldn't change, but unexposed data (read 0 in the fields) may.

James

--
To unsubscribe from this list: send the line unsubscribe linux-scsi in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
On 19/07/2012 09:28, James Bottomley wrote:
> > INQUIRY responses (at least vendor/product/type) should not change.
>
> INQUIRY responses often change for arrays because a firmware upgrade enables new features and new features have to declare themselves, usually in the INQUIRY data. What you mean, I think, is that previously exposed features in INQUIRY data, as well as strings (vendor/product/type, as you say), shouldn't change, but unexposed data (read 0 in the fields) may.

What I meant is that it's unlikely that Windows fingerprinting is using anything but vendor/product/type, because everything else can change.

Paolo
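The distinction drawn above (identity strings stay fixed, previously zero feature fields may change) can be sketched in Python. This is only an illustration, not code from the series: the field offsets are those of the standard 36-byte INQUIRY data layout, while `make_inquiry`, the "ACME" strings and the use of byte 5 as a stand-in feature field are assumptions made up for the example. The `include_revision` switch reflects the report elsewhere in the thread that even a version change upset a Windows initiator.

```python
# Sketch: decide whether two standard INQUIRY responses differ in the
# identity fields discussed in the thread (type, vendor, product, and
# optionally the revision string).

def identity_fields(inquiry: bytes) -> dict:
    """Extract the identity fields from standard INQUIRY data."""
    if len(inquiry) < 36:
        raise ValueError("standard INQUIRY data is at least 36 bytes")
    return {
        "type": inquiry[0] & 0x1F,                           # peripheral device type
        "vendor": inquiry[8:16].decode("ascii").rstrip(),    # T10 vendor identification
        "product": inquiry[16:32].decode("ascii").rstrip(),  # product identification
        "revision": inquiry[32:36].decode("ascii").rstrip(), # product revision level
    }


def changed_identity_fields(old: bytes, new: bytes, include_revision: bool = False) -> list:
    """Return the names of identity fields that differ between old and new."""
    keys = ["type", "vendor", "product"]
    if include_revision:
        keys.append("revision")
    a, b = identity_fields(old), identity_fields(new)
    return [k for k in keys if a[k] != b[k]]


def make_inquiry(vendor: str, product: str, revision: str,
                 dev_type: int = 0, feature_byte: int = 0) -> bytes:
    """Hypothetical helper: build a 36-byte response for the demo below."""
    data = bytearray(36)
    data[0] = dev_type & 0x1F
    data[4] = 31              # additional length: 36 bytes total minus 5
    data[5] = feature_byte    # stand-in for a newly declared feature
    data[8:16] = vendor.ljust(8)[:8].encode("ascii")
    data[16:32] = product.ljust(16)[:16].encode("ascii")
    data[32:36] = revision.ljust(4)[:4].encode("ascii")
    return bytes(data)


if __name__ == "__main__":
    before = make_inquiry("ACME", "VDISK", "1.0")
    after = make_inquiry("ACME", "VDISK", "1.1", feature_byte=0x10)
    print(changed_identity_fields(before, after))
    print(changed_identity_fields(before, after, include_revision=True))
```

A newly set feature byte does not show up in either result; only the revision string does, and only when the caller asks for it to be treated as part of the identity.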
Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
On 07/17/2012 04:50 PM, Nicholas A. Bellinger wrote:
> On Tue, 2012-07-17 at 13:55 -0500, Anthony Liguori wrote:
> > On 07/17/2012 10:05 AM, Michael S. Tsirkin wrote:
> > > On Wed, Jul 11, 2012 at 09:15:00PM +, Nicholas A. Bellinger wrote:
>
> SNIP
>
> > > It still seems not 100% clear whether this driver will have major userspace using it. And if not, it would be very hard to support a driver when recent userspace does not use it in the end.
> >
> > I don't think this is a good reason to exclude something from the kernel. However, there are good reasons why this doesn't make sense for something like QEMU--specifically because we have a large number of features in our block layer that tcm_vhost would bypass.
>
> I can definitely appreciate your concern here as the QEMU maintainer.
>
> > But perhaps it makes sense for something like native kvm tool. And if it did go into the kernel, we would certainly support it in QEMU.
>
> ...
>
> > But I do think the kernel should carefully consider whether it wants to support an interface like this. This is an extremely complicated ABI with a lot of subtle details around state and compatibility.
> >
> > Are you absolutely confident that you can support a userspace application that expects to get exactly the same response from all possible commands in 20 kernel versions from now? Virtualization requires absolutely precise compatibility in terms of bugs and features. This is probably not something the TCM stack has had to consider yet.
>
> We most certainly have thought about long term userspace compatibility with TCM. Our userspace code (that's now available in all major distros) is completely forward-compatible with new fabric modules such as tcm_vhost. No update required.

I'm not sure we're talking about the same thing when we say compatibility. I'm not talking about the API. I'm talking about the behavior of the commands that tcm_vhost supports.

If you add support for a new command, you need to provide userspace a way to disable this command. If you change what gets reported for VPD, you need to provide userspace a way to make VPD look like what it did in a previous version.

Basically, you need to be able to make a TCM device behave 100% the same as it did in an older version of the kernel.

This is unique to virtualization due to live migration. If you migrate from a 3.6 kernel to a 3.8 kernel, you need to make sure that the 3.8 kernel's TCM device behaves exactly like the 3.6 kernel because the guest that is interacting with it does not realize that live migration happened.

Yes, you can add knobs via configfs to control this behavior, but I think the question is, what's the plan for this?

BTW, I think this is a good thing to cover in Documentation/vhost/tcm_vhost.txt. I think that's probably the only change that's needed here.

Regards,

Anthony Liguori

> Also, by virtue of the fact that we are using configfs + rtslib (python object library) on top, it's very easy to keep any type of compatibility logic around in python code. With rtslib, we are able to hide configfs ABI changes from higher level apps.
>
> So far we've had a track record of 100% userspace ABI compatibility in mainline since .38, and I don't intend to merge a patch that breaks this any time soon. But if that ever happens, apps using rtslib are not going to be affected.
>
> > > I think a good idea for 3.6 would be to make it depend on CONFIG_STAGING. Then we don't commit to an ABI.
> >
> > I think this is a good idea. Even if it goes in, a really clear policy would be needed wrt the userspace ABI.
> >
> > While tcm_vhost is probably more useful than vhost_blk, it's a much more complex ABI to maintain.
>
> As far as I am concerned, the kernel API (eg: configfs directory layout) as it is now in /sys/kernel/config/target/vhost/ is not going to change. It's based on the same drivers/target/target_core_fabric_configfs.c generic layout that we've had since .38.
>
> The basic functional fabric layout in configfs is identical (with fabric dependent WWPN naming of course) regardless of fabric driver, and by virtue of being generic it means we can add things like fabric dependent attributes + parameters in the future for existing fabrics without breaking userspace.
>
> So while I agree the ABI is more complex than vhost-blk, the logic in target_core_fabric_configfs.c is a basic ABI fabric definition that we are enforcing across all fabric modules in mainline for long term compatibility.
>
> --nab
Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
On 18/07/2012 15:42, Anthony Liguori wrote:
> If you add support for a new command, you need to provide userspace a way to disable this command. If you change what gets reported for VPD, you need to provide userspace a way to make VPD look like what it did in a previous version.

The QEMU target is not enforcing this to this level. We didn't for CD-ROM ATAPI, and we're not doing it for SCSI.

It may indeed be useful for changes to VPD pages or major features. However, so far we've never introduced any feature that deserved it. This is also because OSes typically don't care: they use a small subset of the features and all the remaining decorations are only needed to be pedantically compliant to the spec.

Paolo
Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
On Wed, Jul 18, 2012 at 08:42:21AM -0500, Anthony Liguori wrote:
> On 07/17/2012 04:50 PM, Nicholas A. Bellinger wrote:
> > On Tue, 2012-07-17 at 13:55 -0500, Anthony Liguori wrote:
> > > On 07/17/2012 10:05 AM, Michael S. Tsirkin wrote:
> > > > On Wed, Jul 11, 2012 at 09:15:00PM +, Nicholas A. Bellinger wrote:
> >
> > SNIP
> >
> > > > It still seems not 100% clear whether this driver will have major userspace using it. And if not, it would be very hard to support a driver when recent userspace does not use it in the end.
> > >
> > > I don't think this is a good reason to exclude something from the kernel. However, there are good reasons why this doesn't make sense for something like QEMU--specifically because we have a large number of features in our block layer that tcm_vhost would bypass.
> >
> > I can definitely appreciate your concern here as the QEMU maintainer.
> >
> > > But perhaps it makes sense for something like native kvm tool. And if it did go into the kernel, we would certainly support it in QEMU.
> >
> > ...
> >
> > > But I do think the kernel should carefully consider whether it wants to support an interface like this. This is an extremely complicated ABI with a lot of subtle details around state and compatibility.
> > >
> > > Are you absolutely confident that you can support a userspace application that expects to get exactly the same response from all possible commands in 20 kernel versions from now? Virtualization requires absolutely precise compatibility in terms of bugs and features. This is probably not something the TCM stack has had to consider yet.
> >
> > We most certainly have thought about long term userspace compatibility with TCM. Our userspace code (that's now available in all major distros) is completely forward-compatible with new fabric modules such as tcm_vhost. No update required.
>
> I'm not sure we're talking about the same thing when we say compatibility. I'm not talking about the API. I'm talking about the behavior of the commands that tcm_vhost supports.
>
> If you add support for a new command, you need to provide userspace a way to disable this command. If you change what gets reported for VPD, you need to provide userspace a way to make VPD look like what it did in a previous version.
>
> Basically, you need to be able to make a TCM device behave 100% the same as it did in an older version of the kernel.
>
> This is unique to virtualization due to live migration. If you migrate from a 3.6 kernel to a 3.8 kernel, you need to make sure that the 3.8 kernel's TCM device behaves exactly like the 3.6 kernel because the guest that is interacting with it does not realize that live migration happened.
>
> Yes, you can add knobs via configfs to control this behavior, but I think the question is, what's the plan for this?
>
> BTW, I think this is a good thing to cover in Documentation/vhost/tcm_vhost.txt. I think that's probably the only change that's needed here.
>
> Regards,
>
> Anthony Liguori

I agree it's needed but it's not a requirement for merging IMHO. As a first step we can disable live migration.

> > Also, by virtue of the fact that we are using configfs + rtslib (python object library) on top, it's very easy to keep any type of compatibility logic around in python code. With rtslib, we are able to hide configfs ABI changes from higher level apps.
> >
> > So far we've had a track record of 100% userspace ABI compatibility in mainline since .38, and I don't intend to merge a patch that breaks this any time soon. But if that ever happens, apps using rtslib are not going to be affected.
> >
> > > > I think a good idea for 3.6 would be to make it depend on CONFIG_STAGING. Then we don't commit to an ABI.
> > >
> > > I think this is a good idea. Even if it goes in, a really clear policy would be needed wrt the userspace ABI.
> > >
> > > While tcm_vhost is probably more useful than vhost_blk, it's a much more complex ABI to maintain.
> >
> > As far as I am concerned, the kernel API (eg: configfs directory layout) as it is now in /sys/kernel/config/target/vhost/ is not going to change. It's based on the same drivers/target/target_core_fabric_configfs.c generic layout that we've had since .38.
> >
> > The basic functional fabric layout in configfs is identical (with fabric dependent WWPN naming of course) regardless of fabric driver, and by virtue of being generic it means we can add things like fabric dependent attributes + parameters in the future for existing fabrics without breaking userspace.
> >
> > So while I agree the ABI is more complex than vhost-blk, the logic in target_core_fabric_configfs.c is a basic ABI fabric definition that we are enforcing across all fabric modules in mainline for long term compatibility.
> >
> > --nab
Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
On Wed, Jul 18, 2012 at 08:42:21AM -0500, Anthony Liguori wrote:
> If you add support for a new command, you need to provide userspace a way to disable this command. If you change what gets reported for VPD, you need to provide userspace a way to make VPD look like what it did in a previous version.
>
> Basically, you need to be able to make a TCM device behave 100% the same as it did in an older version of the kernel.
>
> This is unique to virtualization due to live migration. If you migrate from a 3.6 kernel to a 3.8 kernel, you need to make sure that the 3.8 kernel's TCM device behaves exactly like the 3.6 kernel because the guest that is interacting with it does not realize that live migration happened.

I don't think these strict live migration rules apply to SCSI targets.

Real life storage systems get new features and different behaviour with firmware upgrades all the time, and SCSI initiators deal with that just fine. I don't see any reason to be more picky just because we're virtualized.
Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
On Wed, Jul 18, 2012 at 11:53:38AM -0400, Christoph Hellwig wrote:
> On Wed, Jul 18, 2012 at 08:42:21AM -0500, Anthony Liguori wrote:
> > If you add support for a new command, you need to provide userspace a way to disable this command. If you change what gets reported for VPD, you need to provide userspace a way to make VPD look like what it did in a previous version.
> >
> > Basically, you need to be able to make a TCM device behave 100% the same as it did in an older version of the kernel.
> >
> > This is unique to virtualization due to live migration. If you migrate from a 3.6 kernel to a 3.8 kernel, you need to make sure that the 3.8 kernel's TCM device behaves exactly like the 3.6 kernel because the guest that is interacting with it does not realize that live migration happened.
>
> I don't think these strict live migration rules apply to SCSI targets.
>
> Real life storage systems get new features and different behaviour with firmware upgrades all the time, and SCSI initiators deal with that just fine. I don't see any reason to be more picky just because we're virtualized.

Presumably initiators are shut down for target firmware upgrades? With virtualization your host can change without guest shutdown. You can also *lose* commands when migrating to an older host.

--
MST
Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
On 07/18/2012 10:53 AM, Christoph Hellwig wrote:
> On Wed, Jul 18, 2012 at 08:42:21AM -0500, Anthony Liguori wrote:
> > If you add support for a new command, you need to provide userspace a way to disable this command. If you change what gets reported for VPD, you need to provide userspace a way to make VPD look like what it did in a previous version.
> >
> > Basically, you need to be able to make a TCM device behave 100% the same as it did in an older version of the kernel.
> >
> > This is unique to virtualization due to live migration. If you migrate from a 3.6 kernel to a 3.8 kernel, you need to make sure that the 3.8 kernel's TCM device behaves exactly like the 3.6 kernel because the guest that is interacting with it does not realize that live migration happened.
>
> I don't think these strict live migration rules apply to SCSI targets.
>
> Real life storage systems get new features and different behaviour with firmware upgrades all the time, and SCSI initiators deal with that just fine. I don't see any reason to be more picky just because we're virtualized.

But would this happen while a system is running live?

I agree that in general, SCSI targets don't need this, but I'm pretty sure that if a guest probes for a command, you migrate to an old version, and that command is no longer there, badness will ensue.

It's different when you're talking about a reboot happening or a disconnect/reconnect due to firmware upgrade. The OS would naturally be reprobing in this case.

Regards,

Anthony Liguori
Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
On Jul 18, 2012, at 9:00 AM, Michael S. Tsirkin wrote:
> On Wed, Jul 18, 2012 at 11:53:38AM -0400, Christoph Hellwig wrote:
> > On Wed, Jul 18, 2012 at 08:42:21AM -0500, Anthony Liguori wrote:
> > > If you add support for a new command, you need to provide userspace a way to disable this command. If you change what gets reported for VPD, you need to provide userspace a way to make VPD look like what it did in a previous version.
> > >
> > > Basically, you need to be able to make a TCM device behave 100% the same as it did in an older version of the kernel.
> > >
> > > This is unique to virtualization due to live migration. If you migrate from a 3.6 kernel to a 3.8 kernel, you need to make sure that the 3.8 kernel's TCM device behaves exactly like the 3.6 kernel because the guest that is interacting with it does not realize that live migration happened.
> >
> > I don't think these strict live migration rules apply to SCSI targets.
> >
> > Real life storage systems get new features and different behaviour with firmware upgrades all the time, and SCSI initiators deal with that just fine. I don't see any reason to be more picky just because we're virtualized.
>
> Presumably initiators are shut down for target firmware upgrades? With virtualization your host can change without guest shutdown. You can also *lose* commands when migrating to an older host.

Actually no. Storage vendors do not want to impose a need to take initiators down for any reason. I have worked for a storage system vendor that routinely did firmware upgrades on-the-fly. This is done by multi-pathing and taking one path down, upgrade, bring up, repeat. There was even one non-redundant system that I am aware of that could upgrade firmware and reboot fast enough that the initiators would not notice.

You do have to pay very close attention to some things however. Don't change the device identity in any way - even version information, otherwise a Windows initiator will blue-screen. I made that mistake myself, so I remember it well. It seemed like such an innocent change. I don't recall there being any issue with adding commands and we did do that on occasion.

--
Mark Rustad, LAN Access Division, Intel Corporation
Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
On Wed, 2012-07-18 at 11:00 -0500, Anthony Liguori wrote:
> On 07/18/2012 10:53 AM, Christoph Hellwig wrote:
> > On Wed, Jul 18, 2012 at 08:42:21AM -0500, Anthony Liguori wrote:
> > > If you add support for a new command, you need to provide userspace a way to disable this command. If you change what gets reported for VPD, you need to provide userspace a way to make VPD look like what it did in a previous version.
> > >
> > > Basically, you need to be able to make a TCM device behave 100% the same as it did in an older version of the kernel.
> > >
> > > This is unique to virtualization due to live migration. If you migrate from a 3.6 kernel to a 3.8 kernel, you need to make sure that the 3.8 kernel's TCM device behaves exactly like the 3.6 kernel because the guest that is interacting with it does not realize that live migration happened.
> >
> > I don't think these strict live migration rules apply to SCSI targets.
> >
> > Real life storage systems get new features and different behaviour with firmware upgrades all the time, and SCSI initiators deal with that just fine. I don't see any reason to be more picky just because we're virtualized.
>
> But would this happen while a system is running live?

Of course: Think about the consequences: you want to upgrade one array on your SAN. You definitely don't want to shut down your entire data centre to achieve it. In place upgrades on running SANs have been common in enterprise environments for a while.

> I agree that in general, SCSI targets don't need this, but I'm pretty sure that if a guest probes for a command, you migrate to an old version, and that command is no longer there, badness will ensue.

What command are we talking about? Operation of initiators is usually just READ and WRITE. So perhaps we might have inline UNMAP ... but the world wouldn't come to an end even if the latter stopped working.

Most of the complex SCSI stuff is done at start of day; it's actually only then we'd notice things like changes in INQUIRY strings or mode pages.

Failover, which is what you're talking about, requires reinstatement of all the operating parameters of the source/target system, but that's not wholly the responsibility of the storage system ...

James

> It's different when you're talking about a reboot happening or a disconnect/reconnect due to firmware upgrade. The OS would naturally be reprobing in this case.
Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
On Wed, Jul 18, 2012 at 04:42:33PM +, Rustad, Mark D wrote:
> On Jul 18, 2012, at 9:00 AM, Michael S. Tsirkin wrote:
> > On Wed, Jul 18, 2012 at 11:53:38AM -0400, Christoph Hellwig wrote:
> > > On Wed, Jul 18, 2012 at 08:42:21AM -0500, Anthony Liguori wrote:
> > > > If you add support for a new command, you need to provide userspace a way to disable this command. If you change what gets reported for VPD, you need to provide userspace a way to make VPD look like what it did in a previous version.
> > > >
> > > > Basically, you need to be able to make a TCM device behave 100% the same as it did in an older version of the kernel.
> > > >
> > > > This is unique to virtualization due to live migration. If you migrate from a 3.6 kernel to a 3.8 kernel, you need to make sure that the 3.8 kernel's TCM device behaves exactly like the 3.6 kernel because the guest that is interacting with it does not realize that live migration happened.
> > >
> > > I don't think these strict live migration rules apply to SCSI targets.
> > >
> > > Real life storage systems get new features and different behaviour with firmware upgrades all the time, and SCSI initiators deal with that just fine. I don't see any reason to be more picky just because we're virtualized.
> >
> > Presumably initiators are shut down for target firmware upgrades? With virtualization your host can change without guest shutdown. You can also *lose* commands when migrating to an older host.
>
> Actually no. Storage vendors do not want to impose a need to take initiators down for any reason. I have worked for a storage system vendor that routinely did firmware upgrades on-the-fly. This is done by multi-pathing and taking one path down, upgrade, bring up, repeat.

With live migration even that does not happen.

> There was even one non-redundant system that I am aware of that could upgrade firmware and reboot fast enough that the initiators would not notice.
>
> You do have to pay very close attention to some things however. Don't change the device identity in any way - even version information, otherwise a Windows initiator will blue-screen. I made that mistake myself, so I remember it well. It seemed like such an innocent change. I don't recall there being any issue with adding commands and we did do that on occasion.

How about removing commands?

> --
> Mark Rustad, LAN Access Division, Intel Corporation
Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
On 07/18/2012 11:47 AM, James Bottomley wrote:
> On Wed, 2012-07-18 at 11:00 -0500, Anthony Liguori wrote:
>
> Of course: Think about the consequences: you want to upgrade one array on your SAN. You definitely don't want to shut down your entire data centre to achieve it. In place upgrades on running SANs have been common in enterprise environments for a while.

Would firmware upgrades ever result in major OS visible changes though?

Maybe OSes are more robust with SCSI than with other types of buses, but I don't think it's safe to completely ignore the problem.

> > I agree that in general, SCSI targets don't need this, but I'm pretty sure that if a guest probes for a command, you migrate to an old version, and that command is no longer there, badness will ensue.
>
> What command are we talking about? Operation of initiators is usually just READ and WRITE. So perhaps we might have inline UNMAP ... but the world wouldn't come to an end even if the latter stopped working.

Is that true for all OSes? Linux may handle things gracefully if UNMAP starts throwing errors but that doesn't mean that Windows will.

There are other cases where this creates problems. Windows (and some other OSes) fingerprint the hardware profile in order to do license enforcement. If the hardware changes beyond a certain amount, then they refuse to boot.

Windows does this with a points system and I do believe that INQUIRY responses from any local disks are included in this tally.

> Most of the complex SCSI stuff is done at start of day; it's actually only then we'd notice things like changes in INQUIRY strings or mode pages.
>
> Failover, which is what you're talking about, requires reinstatement of all the operating parameters of the source/target system, but that's not wholly the responsibility of the storage system ...

It's the responsibility of the hypervisor when dealing with live migration.

Regards,

Anthony Liguori

> James
>
> > It's different when you're talking about a reboot happening or a disconnect/reconnect due to firmware upgrade. The OS would naturally be reprobing in this case.
Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
On Jul 18, 2012, at 10:17 AM, Michael S. Tsirkin wrote:

> snip
>
> > You do have to pay very close attention to some things however. Don't change the device identity in any way - even version information, otherwise a Windows initiator will blue-screen. I made that mistake myself, so I remember it well. It seemed like such an innocent change. I don't recall there being any issue with adding commands and we did do that on occasion.
>
> How about removing commands?

Good question. With the storage system I am familiar with, that would only be a risk if firmware were downgraded. Downgrading would never have been recommended. I am sure that if something like persistent reserve suddenly went away it would cause big trouble for some initiators.

--
Mark Rustad, LAN Access Division, Intel Corporation
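The asymmetry discussed above (adding commands was harmless in practice, removing them is the risk) can be written down as a set difference. The following Python sketch is illustrative only and is not something proposed in the thread: the idea of describing each host by the set of commands its target accepts, and the function names, are assumptions. The opcode values are the standard SCSI ones for the commands named in this discussion.

```python
# Sketch: flag commands a guest could lose when it is live-migrated
# from one host to another. Each host is described by the set of SCSI
# commands its target accepts (a hypothetical representation).

OPCODES = {
    "INQUIRY": 0x12,
    "READ(10)": 0x28,
    "WRITE(10)": 0x2A,
    "UNMAP": 0x42,
    "PERSISTENT RESERVE IN": 0x5E,
    "PERSISTENT RESERVE OUT": 0x5F,
}


def lost_commands(source: set, destination: set) -> list:
    """Commands the guest may already have probed on the source host
    that the destination host would reject."""
    return sorted(source - destination)


def migration_keeps_commands(source: set, destination: set) -> bool:
    """True when the destination accepts everything the source did."""
    return not lost_commands(source, destination)


if __name__ == "__main__":
    newer_host = set(OPCODES)
    older_host = newer_host - {"UNMAP"}

    # Upgrade direction: the command set only grows.
    print(migration_keeps_commands(older_host, newer_host))
    # Downgrade direction: UNMAP disappears under a running guest.
    for name in lost_commands(newer_host, older_host):
        print(f"destination lacks {name} (opcode 0x{OPCODES[name]:02X})")
```

Whether a lost command actually matters is the open question in the thread; the check only identifies which commands would need a knob or a policy decision.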
Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
On Wed, 2012-07-18 at 08:42 -0500, Anthony Liguori wrote:
> On 07/17/2012 04:50 PM, Nicholas A. Bellinger wrote:
> > On Tue, 2012-07-17 at 13:55 -0500, Anthony Liguori wrote:
> > > On 07/17/2012 10:05 AM, Michael S. Tsirkin wrote:
> > > > On Wed, Jul 11, 2012 at 09:15:00PM +, Nicholas A. Bellinger wrote:

SNIP

> > > But I do think the kernel should carefully consider whether it wants to support an interface like this. This is an extremely complicated ABI with a lot of subtle details around state and compatibility.
> > >
> > > Are you absolutely confident that you can support a userspace application that expects to get exactly the same response from all possible commands in 20 kernel versions from now? Virtualization requires absolutely precise compatibility in terms of bugs and features. This is probably not something the TCM stack has had to consider yet.
> >
> > We most certainly have thought about long term userspace compatibility with TCM. Our userspace code (that's now available in all major distros) is completely forward-compatible with new fabric modules such as tcm_vhost. No update required.
>
> I'm not sure we're talking about the same thing when we say compatibility.
>
> I'm not talking about the API. I'm talking about the behavior of the commands that tcm_vhost supports.

OK, I understand what you're getting at now..

> If you add support for a new command, you need to provide userspace a way to disable this command. If you change what gets reported for VPD, you need to provide userspace a way to make VPD look like what it did in a previous version.
>
> Basically, you need to be able to make a TCM device behave 100% the same as it did in an older version of the kernel.
>
> This is unique to virtualization due to live migration. If you migrate from a 3.6 kernel to a 3.8 kernel, you need to make sure that the 3.8 kernel's TCM device behaves exactly like the 3.6 kernel because the guest that is interacting with it does not realize that live migration happened.
>
> Yes, you can add knobs via configfs to control this behavior, but I think the question is, what's the plan for this?

So we already allow for some types of CDB emulation to be toggled via backend device attributes:

root@tifa:/usr/src/target-pending.git# tree /sys/kernel/config/target/core/iblock_2/fioa/attrib/
/sys/kernel/config/target/core/iblock_2/fioa/attrib/
├── block_size
├── emulate_dpo
├── emulate_fua_read
├── emulate_fua_write
├── emulate_rest_reord
├── emulate_tas
├── emulate_tpu
├── emulate_tpws
├── emulate_ua_intlck_ctrl
├── emulate_write_cache
├── enforce_pr_isids

SNIP

Some things like SPC-3 persistent reservations + implicit/explicit ALUA multipath currently can't be disabled, but adding two more backend attributes to disable/enable this logic individually is easy enough to do.

So that said, I don't have a problem with adding the necessary device attributes to limit what type of CDBs a backend device is capable of processing.

Trying to limit this per-guest (instead of per-device) is where things get a little more tricky..

> BTW, I think this is a good thing to cover in Documentation/vhost/tcm_vhost.txt. I think that's probably the only change that's needed here.

Sure, but I'll need to know what else you'd like to optionally restrict in terms of CDB processing that's not already there..

Thanks for your feedback!

--nab
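Since the attributes listed above are plain files in configfs, pinning a backend device to a known set of emulation toggles can be sketched with ordinary file I/O. This Python sketch is an assumption-laden illustration, not rtslib and not part of the series: `snapshot_attribs` and `apply_attribs` are hypothetical helpers, and the demo runs against a temporary directory that mimics an `attrib/` directory rather than touching a real `/sys/kernel/config` tree.

```python
# Sketch: record the attribute values of a backend device and re-apply
# them elsewhere, reporting attributes the destination does not have.
import tempfile
from pathlib import Path


def snapshot_attribs(attrib_dir) -> dict:
    """Read every attribute file in the directory into a name -> value dict."""
    directory = Path(attrib_dir)
    return {
        entry.name: entry.read_text().strip()
        for entry in sorted(directory.iterdir())
        if entry.is_file()
    }


def apply_attribs(attrib_dir, wanted: dict) -> list:
    """Write the wanted values; return names missing on this host.

    Attributes are never created here, only written when they already
    exist, because configfs attribute files are provided by the kernel.
    """
    directory = Path(attrib_dir)
    missing = []
    for name, value in wanted.items():
        target = directory / name
        if not target.is_file():
            missing.append(name)
            continue
        target.write_text(f"{value}\n")
    return missing


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as fake_attrib:
        # Stand-in for .../target/core/<hba>/<dev>/attrib/ on one host.
        for name, value in {"emulate_tpu": "1", "emulate_tpws": "0"}.items():
            (Path(fake_attrib) / name).write_text(f"{value}\n")

        print(snapshot_attribs(fake_attrib))
        # "emulate_future_feature" is a made-up name for an attribute
        # that only a newer kernel would expose.
        print(apply_attribs(fake_attrib, {"emulate_tpu": "0",
                                          "emulate_future_feature": "0"}))
        print(snapshot_attribs(fake_attrib))
```

A management layer could store such a snapshot alongside the guest configuration and refuse or warn when `apply_attribs` reports missing names on the destination; that policy is outside what the thread settles.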
Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
On Wed, Jul 11, 2012 at 09:15:00PM +, Nicholas A. Bellinger wrote: From: Nicholas Bellinger n...@linux-iscsi.org Hi folks, The following is a RFC-v2 series of tcm_vhost target fabric driver code currently in-flight for-3.6 mainline code. After last week's developments along with the help of some new folks, the changelog v1 - v2 so far looks like: *) Fix drivers/vhost/test.c to use VHOST_NET_FEATURES in patch #1 (Asias He) *) Fix tv_cmd completion - release SGL memory leak (nab) *) Fix sparse warnings for static variable usage (Fengguang Wu) *) Fix sparse warnings for min() typing + printk format specs (Fengguang Wu) *) Convert to cmwq submission for I/O dispatch (nab + hch) Also following Paolo's request, a patch for hw/virtio-scsi.c that sets scsi_host-max_target=0 that removes the need for virtio-scsi LLD to hardcode VirtIOSCSIConfig-max_id=1 in order to function with tcm_vhost. Note this series has been pushed into target-pending.git/for-next-merge, and should be getting picked up for tomorrow's linux-next build. Please let us know if you have any concerns and/or additional review feedback. Thank you! It still seems not 100% clear whether this driver will have major userspace using it. And if not, it would be very hard to support a driver when recent userspace does not use it in the end. I think a good idea for 3.6 would be to make it depend on CONFIG_STAGING. Then we don't commit to an ABI. For this, you can add a separate Kconfig and source it from drivers/staging/Kconfig. Maybe it needs to be in a separate directory drivers/vhost/staging/Kconfig. 
Nicholas Bellinger (2):
  vhost: Add vhost_scsi specific defines
  tcm_vhost: Initial merge for vhost level target fabric driver

Stefan Hajnoczi (2):
  vhost: Separate vhost-net features from vhost features
  vhost: make vhost work queue visible

 drivers/vhost/Kconfig     |    6 +
 drivers/vhost/Makefile    |    1 +
 drivers/vhost/net.c       |    4 +-
 drivers/vhost/tcm_vhost.c | 1609 +
 drivers/vhost/tcm_vhost.h |   74 ++
 drivers/vhost/test.c      |    4 +-
 drivers/vhost/vhost.c     |    5 +-
 drivers/vhost/vhost.h     |    6 +-
 include/linux/vhost.h     |    9 +
 9 files changed, 1710 insertions(+), 8 deletions(-)
 create mode 100644 drivers/vhost/tcm_vhost.c
 create mode 100644 drivers/vhost/tcm_vhost.h

--
1.7.2.5
Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
On 07/17/2012 10:05 AM, Michael S. Tsirkin wrote: On Wed, Jul 11, 2012 at 09:15:00PM +, Nicholas A. Bellinger wrote: From: Nicholas Bellingern...@linux-iscsi.org Hi folks, The following is a RFC-v2 series of tcm_vhost target fabric driver code currently in-flight for-3.6 mainline code. After last week's developments along with the help of some new folks, the changelog v1 - v2 so far looks like: *) Fix drivers/vhost/test.c to use VHOST_NET_FEATURES in patch #1 (Asias He) *) Fix tv_cmd completion - release SGL memory leak (nab) *) Fix sparse warnings for static variable usage (Fengguang Wu) *) Fix sparse warnings for min() typing + printk format specs (Fengguang Wu) *) Convert to cmwq submission for I/O dispatch (nab + hch) Also following Paolo's request, a patch for hw/virtio-scsi.c that sets scsi_host-max_target=0 that removes the need for virtio-scsi LLD to hardcode VirtIOSCSIConfig-max_id=1 in order to function with tcm_vhost. Note this series has been pushed into target-pending.git/for-next-merge, and should be getting picked up for tomorrow's linux-next build. Please let us know if you have any concerns and/or additional review feedback. Thank you! It still seems not 100% clear whether this driver will have major userspace using it. And if not, it would be very hard to support a driver when recent userspace does not use it in the end. I don't think this is a good reason to exclude something from the kernel. However, there are good reasons why this doesn't make sense for something like QEMU--specifically because we have a large number of features in our block layer that tcm_vhost would bypass. But perhaps it makes sense for something like native kvm tool. And if it did go into the kernel, we would certainly support it in QEMU. But I do think the kernel should carefully consider whether it wants to support an interface like this. This an extremely complicated ABI with a lot of subtle details around state and compatibility. 
Are you absolutely confident that you can support a userspace application that expects to get exactly the same response from all possible commands in 20 kernel versions from now? Virtualization requires absolutely precise compatibility in terms of bugs and features. This is probably not something the TCM stack has had to consider yet.

I think a good idea for 3.6 would be to make it depend on CONFIG_STAGING. Then we don't commit to an ABI.

I think this is a good idea. Even if it goes in, a really clear policy would be needed wrt the userspace ABI. While tcm_vhost is probably more useful than vhost_blk, it's a much more complex ABI to maintain.

Regards,

Anthony Liguori

For this, you can add a separate Kconfig and source it from drivers/staging/Kconfig. Maybe it needs to be in a separate directory drivers/vhost/staging/Kconfig.

Nicholas Bellinger (2):
  vhost: Add vhost_scsi specific defines
  tcm_vhost: Initial merge for vhost level target fabric driver

Stefan Hajnoczi (2):
  vhost: Separate vhost-net features from vhost features
  vhost: make vhost work queue visible

 drivers/vhost/Kconfig     |    6 +
 drivers/vhost/Makefile    |    1 +
 drivers/vhost/net.c       |    4 +-
 drivers/vhost/tcm_vhost.c | 1609 +
 drivers/vhost/tcm_vhost.h |   74 ++
 drivers/vhost/test.c      |    4 +-
 drivers/vhost/vhost.c     |    5 +-
 drivers/vhost/vhost.h     |    6 +-
 include/linux/vhost.h     |    9 +
 9 files changed, 1710 insertions(+), 8 deletions(-)
 create mode 100644 drivers/vhost/tcm_vhost.c
 create mode 100644 drivers/vhost/tcm_vhost.h

--
1.7.2.5
Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
On Tue, Jul 17, 2012 at 01:55:42PM -0500, Anthony Liguori wrote: On 07/17/2012 10:05 AM, Michael S. Tsirkin wrote: On Wed, Jul 11, 2012 at 09:15:00PM +, Nicholas A. Bellinger wrote: From: Nicholas Bellingern...@linux-iscsi.org Hi folks, The following is a RFC-v2 series of tcm_vhost target fabric driver code currently in-flight for-3.6 mainline code. After last week's developments along with the help of some new folks, the changelog v1 - v2 so far looks like: *) Fix drivers/vhost/test.c to use VHOST_NET_FEATURES in patch #1 (Asias He) *) Fix tv_cmd completion - release SGL memory leak (nab) *) Fix sparse warnings for static variable usage (Fengguang Wu) *) Fix sparse warnings for min() typing + printk format specs (Fengguang Wu) *) Convert to cmwq submission for I/O dispatch (nab + hch) Also following Paolo's request, a patch for hw/virtio-scsi.c that sets scsi_host-max_target=0 that removes the need for virtio-scsi LLD to hardcode VirtIOSCSIConfig-max_id=1 in order to function with tcm_vhost. Note this series has been pushed into target-pending.git/for-next-merge, and should be getting picked up for tomorrow's linux-next build. Please let us know if you have any concerns and/or additional review feedback. Thank you! It still seems not 100% clear whether this driver will have major userspace using it. And if not, it would be very hard to support a driver when recent userspace does not use it in the end. I don't think this is a good reason to exclude something from the kernel. However, there are good reasons why this doesn't make sense for something like QEMU--specifically because we have a large number of features in our block layer that tcm_vhost would bypass. But perhaps it makes sense for something like native kvm tool. And if it did go into the kernel, we would certainly support it in QEMU. But I do think the kernel should carefully consider whether it wants to support an interface like this. 
This is an extremely complicated ABI with a lot of subtle details around state and compatibility. Are you absolutely confident that you can support a userspace application that expects to get exactly the same response from all possible commands in 20 kernel versions from now? Virtualization requires absolutely precise compatibility in terms of bugs and features. This is probably not something the TCM stack has had to consider yet.

I think a good idea for 3.6 would be to make it depend on CONFIG_STAGING. Then we don't commit to an ABI.

I think this is a good idea. Even if it goes in, a really clear policy would be needed wrt the userspace ABI. While tcm_vhost is probably more useful than vhost_blk, it's a much more complex ABI to maintain.

Regards,

Anthony Liguori

Maybe something like a whitelist of features will help? Might even be a good idea to make it user controllable.

For this, you can add a separate Kconfig and source it from drivers/staging/Kconfig. Maybe it needs to be in a separate directory drivers/vhost/staging/Kconfig.

Nicholas Bellinger (2):
  vhost: Add vhost_scsi specific defines
  tcm_vhost: Initial merge for vhost level target fabric driver

Stefan Hajnoczi (2):
  vhost: Separate vhost-net features from vhost features
  vhost: make vhost work queue visible

 drivers/vhost/Kconfig     |    6 +
 drivers/vhost/Makefile    |    1 +
 drivers/vhost/net.c       |    4 +-
 drivers/vhost/tcm_vhost.c | 1609 +
 drivers/vhost/tcm_vhost.h |   74 ++
 drivers/vhost/test.c      |    4 +-
 drivers/vhost/vhost.c     |    5 +-
 drivers/vhost/vhost.h     |    6 +-
 include/linux/vhost.h     |    9 +
 9 files changed, 1710 insertions(+), 8 deletions(-)
 create mode 100644 drivers/vhost/tcm_vhost.c
 create mode 100644 drivers/vhost/tcm_vhost.h

--
1.7.2.5
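To make the whitelist idea from the message above concrete, here is a rough sketch of a user-controllable filter keyed on the SCSI operation code (the first CDB byte). This is only an illustration of the concept, not how tcm_vhost or target-core handle it; the CdbWhitelist class and its check() method are hypothetical names. The opcode values and the ILLEGAL REQUEST / INVALID COMMAND OPERATION CODE sense values are the standard ones from the SCSI command sets.

```python
# SCSI operation codes (first CDB byte).
TEST_UNIT_READY = 0x00
INQUIRY = 0x12
READ_10 = 0x28
WRITE_10 = 0x2A
UNMAP = 0x42
WRITE_SAME_16 = 0x93

# Sense data for a rejected command:
# sense key ILLEGAL REQUEST, ASC INVALID COMMAND OPERATION CODE.
ILLEGAL_REQUEST = 0x05
ASC_INVALID_OPCODE = 0x20


class CdbWhitelist:
    """Hypothetical per-device filter: only listed opcodes are passed on."""

    def __init__(self, allowed):
        self.allowed = frozenset(allowed)

    def check(self, cdb):
        """Return None if the CDB may proceed, else a (sense_key, asc) tuple."""
        if not cdb:
            raise ValueError("empty CDB")
        if cdb[0] in self.allowed:
            return None
        return (ILLEGAL_REQUEST, ASC_INVALID_OPCODE)
```

A guest configured with only TEST UNIT READY, INQUIRY, READ(10) and WRITE(10) would then keep seeing UNMAP rejected even after a newer kernel learns to emulate it, which is the kind of stable behavior the compatibility concern is about.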
Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
On Tue, 2012-07-17 at 18:05 +0300, Michael S. Tsirkin wrote: On Wed, Jul 11, 2012 at 09:15:00PM +, Nicholas A. Bellinger wrote: From: Nicholas Bellinger n...@linux-iscsi.org Hi folks, The following is a RFC-v2 series of tcm_vhost target fabric driver code currently in-flight for-3.6 mainline code. After last week's developments along with the help of some new folks, the changelog v1 - v2 so far looks like: *) Fix drivers/vhost/test.c to use VHOST_NET_FEATURES in patch #1 (Asias He) *) Fix tv_cmd completion - release SGL memory leak (nab) *) Fix sparse warnings for static variable usage (Fengguang Wu) *) Fix sparse warnings for min() typing + printk format specs (Fengguang Wu) *) Convert to cmwq submission for I/O dispatch (nab + hch) Also following Paolo's request, a patch for hw/virtio-scsi.c that sets scsi_host-max_target=0 that removes the need for virtio-scsi LLD to hardcode VirtIOSCSIConfig-max_id=1 in order to function with tcm_vhost. Note this series has been pushed into target-pending.git/for-next-merge, and should be getting picked up for tomorrow's linux-next build. Please let us know if you have any concerns and/or additional review feedback. Thank you! It still seems not 100% clear whether this driver will have major userspace using it. And if not, it would be very hard to support a driver when recent userspace does not use it in the end. I'm happy to commit to working with QEMU + kvm-tool folks to get to a series that can (eventually) see vhost-scsi support merged into upstream userspace code. It took roughly 2 years to get the megasas HBA emulation from Dr. Hannes merged, but certainly vhost-scsi has alot less moving pieces and hopefully alot less controversial bits than the buffer - SGL conversion.. The key word being here 'hopefully'.. ;) I think a good idea for 3.6 would be to make it depend on CONFIG_STAGING. Then we don't commit to an ABI. For this, you can add a separate Kconfig and source it from drivers/staging/Kconfig. 
Maybe it needs to be in a separate directory drivers/vhost/staging/Kconfig.

So tcm_vhost has been marked as Experimental following virtio-scsi.

Wrt staging, I'd like to avoid mucking with staging because:

*) The code has been posted for review
*) The code has been converted to use the latest target-core primitives
*) The code does not require cleanups between staging -> merge
*) The code has been stable the last 7 days since RFC-v2 with heavy

Also, tcm_vhost has been marked as Experimental following virtio-scsi. I'd much rather leave it at Experimental until we merge upstream userspace support. If userspace support never ends up materializing, I'm fine with dropping it altogether.

However at this point, given that there is a 3x performance gap between virtio-scsi-raw + virtio-scsi+tcm_vhost for random mixed small block I/O, and we still need the latter to do proper SCSI CDB passthrough for non TYPE_DISK devices, I'm hoping that we can agree on userspace bits once tcm_vhost is merged.

--nab
Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
On Tue, Jul 17, 2012 at 02:17:22PM -0700, Nicholas A. Bellinger wrote: On Tue, 2012-07-17 at 18:05 +0300, Michael S. Tsirkin wrote: On Wed, Jul 11, 2012 at 09:15:00PM +, Nicholas A. Bellinger wrote: From: Nicholas Bellinger n...@linux-iscsi.org Hi folks, The following is a RFC-v2 series of tcm_vhost target fabric driver code currently in-flight for-3.6 mainline code. After last week's developments along with the help of some new folks, the changelog v1 - v2 so far looks like: *) Fix drivers/vhost/test.c to use VHOST_NET_FEATURES in patch #1 (Asias He) *) Fix tv_cmd completion - release SGL memory leak (nab) *) Fix sparse warnings for static variable usage (Fengguang Wu) *) Fix sparse warnings for min() typing + printk format specs (Fengguang Wu) *) Convert to cmwq submission for I/O dispatch (nab + hch) Also following Paolo's request, a patch for hw/virtio-scsi.c that sets scsi_host-max_target=0 that removes the need for virtio-scsi LLD to hardcode VirtIOSCSIConfig-max_id=1 in order to function with tcm_vhost. Note this series has been pushed into target-pending.git/for-next-merge, and should be getting picked up for tomorrow's linux-next build. Please let us know if you have any concerns and/or additional review feedback. Thank you! It still seems not 100% clear whether this driver will have major userspace using it. And if not, it would be very hard to support a driver when recent userspace does not use it in the end. I'm happy to commit to working with QEMU + kvm-tool folks to get to a series that can (eventually) see vhost-scsi support merged into upstream userspace code. It took roughly 2 years to get the megasas HBA emulation from Dr. Hannes merged, but certainly vhost-scsi has alot less moving pieces and hopefully alot less controversial bits than the buffer - SGL conversion.. The key word being here 'hopefully'.. ;) I think a good idea for 3.6 would be to make it depend on CONFIG_STAGING. Then we don't commit to an ABI. 
For this, you can add a separate Kconfig and source it from drivers/staging/Kconfig. Maybe it needs to be in a separate directory drivers/vhost/staging/Kconfig.

So tcm_vhost has been marked as Experimental following virtio-scsi.

Wrt staging, I'd like to avoid mucking with staging because:

*) The code has been posted for review
*) The code has been converted to use the latest target-core primitives
*) The code does not require cleanups between staging -> merge
*) The code has been stable the last 7 days since RFC-v2 with heavy

staging is not just for code that needs cleanups. It's for anything that does not guarantee ABI stability yet. And I think it's a bit early to guarantee ABI stability - 7 days is not all that long. See for example Anthony's comments that raise exactly the ABI issues.

Also, tcm_vhost has been marked as Experimental following virtio-scsi. I'd much rather leave it at Experimental until we merge upstream userspace support. If userspace support never ends up materializing, I'm fine with dropping it altogether.

Once it's in kernel you never know who will use this driver. Experimental does not mean driver can be dropped, staging does.

However at this point, given that there is a 3x performance gap between virtio-scsi-raw + virtio-scsi+tcm_vhost for random mixed small block I/O, and we still need the latter to do proper SCSI CDB passthrough for non TYPE_DISK devices, I'm hoping that we can agree on userspace bits once tcm_vhost is merged.

--nab

I do think upstream kernel would help you nail userspace issues too but at this point it looks like either staging material or 3.6 is too early.

--
MST
Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
On Tue, 2012-07-17 at 13:55 -0500, Anthony Liguori wrote: On 07/17/2012 10:05 AM, Michael S. Tsirkin wrote: On Wed, Jul 11, 2012 at 09:15:00PM +, Nicholas A. Bellinger wrote: SNIP It still seems not 100% clear whether this driver will have major userspace using it. And if not, it would be very hard to support a driver when recent userspace does not use it in the end. I don't think this is a good reason to exclude something from the kernel. However, there are good reasons why this doesn't make sense for something like QEMU--specifically because we have a large number of features in our block layer that tcm_vhost would bypass. I can definitely appreciate your concern here as the QEMU maintainer. But perhaps it makes sense for something like native kvm tool. And if it did go into the kernel, we would certainly support it in QEMU. ... But I do think the kernel should carefully consider whether it wants to support an interface like this. This an extremely complicated ABI with a lot of subtle details around state and compatibility. Are you absolutely confident that you can support a userspace application that expects to get exactly the same response from all possible commands in 20 kernel versions from now? Virtualization requires absolutely precise compatibility in terms of bugs and features. This is probably not something the TCM stack has had to consider yet. We most certainly have thought about long term userspace compatibility with TCM. Our userspace code (that's now available in all major distros) is completely forward-compatible with new fabric modules such as tcm_vhost. No update required. Also, by virtue of the fact that we are using configfs + rtslib (python object library) on top, it's very easy to keep any type of compatibility logic around in python code. With rtslib, we are able to hide configfs ABI changes from higher level apps. 
So far we've had a track record of 100% userspace ABI compatibility in mainline since .38, and I don't intend to merge a patch that breaks this any time soon. But if that ever happens, apps using rtslib are not going to be affected.

I think a good idea for 3.6 would be to make it depend on CONFIG_STAGING. Then we don't commit to an ABI.

I think this is a good idea. Even if it goes in, a really clear policy would be needed wrt the userspace ABI. While tcm_vhost is probably more useful than vhost_blk, it's a much more complex ABI to maintain.

As far as I am concerned, the kernel API (eg: configfs directory layout) as it is now in /sys/kernel/config/target/vhost/ is not going to change. It's based on the same drivers/target/target_core_fabric_configfs.c generic layout that we've had since .38.

The basic functional fabric layout in configfs is identical (with fabric dependent WWPN naming of course) regardless of fabric driver, and by virtue of being generic it means we can add things like fabric dependent attributes + parameters in the future for existing fabrics without breaking userspace.

So while I agree the ABI is more complex than vhost-blk, the logic in target_core_fabric_configfs.c is a basic ABI fabric definition that we are enforcing across all fabric modules in mainline for long term compatibility.

--nab
Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
On Tue, Jul 17, 2012 at 02:17:22PM -0700, Nicholas A. Bellinger wrote:

Wrt staging, I'd like to avoid mucking with staging because:

*) The code has been posted for review
*) The code has been converted to use the latest target-core primitives
*) The code does not require cleanups between staging -> merge
*) The code has been stable the last 7 days since RFC-v2 with heavy

BTW I don't suggest putting code itself in staging. Just the config flag to enable it. Once we are more or less sure multiple userspaces are using this driver, we'll move the config hopefully already in 3.7. What's the downside?

--
MST
Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
On Wed, 2012-07-18 at 00:34 +0300, Michael S. Tsirkin wrote: On Tue, Jul 17, 2012 at 02:17:22PM -0700, Nicholas A. Bellinger wrote: On Tue, 2012-07-17 at 18:05 +0300, Michael S. Tsirkin wrote: On Wed, Jul 11, 2012 at 09:15:00PM +, Nicholas A. Bellinger wrote: From: Nicholas Bellinger n...@linux-iscsi.org SNIP I'm happy to commit to working with QEMU + kvm-tool folks to get to a series that can (eventually) see vhost-scsi support merged into upstream userspace code. It took roughly 2 years to get the megasas HBA emulation from Dr. Hannes merged, but certainly vhost-scsi has alot less moving pieces and hopefully alot less controversial bits than the buffer - SGL conversion.. The key word being here 'hopefully'.. ;) I think a good idea for 3.6 would be to make it depend on CONFIG_STAGING. Then we don't commit to an ABI. For this, you can add a separate Kconfig and source it from drivers/staging/Kconfig. Maybe it needs to be in a separate directory drivers/vhost/staging/Kconfig. So tcm_vhost has been marked as Experimental following virtio-scsi. Wrt to staging, I'd like to avoid mucking with staging because: *) The code has been posted for review *) The code has been converted to use the latest target-core primitives *) The code does not require cleanups between staging - merge *) The code has been stable the last 7 days since RFC-v2 with heavy staging is not just for code that needs cleanups. It's for anything that does not guarantee ABI stability yet. And I think it's a bit early to guarantee ABI stability - 7 days is not all that long. I was talking about the new I/O path has been running for 7 days. See for example Anthony's comments that raise exactly the ABI issues. As mentioned in the response to Anthony, we are using the same generic fabric ABI in drivers/target/target_core_fabric_configfs.c since .38. That part is not going to change, and it has not changed for any of the other 7 target fabric modules we've merged into mainline since then. 
Also, tcm_vhost has been marked as Experimental following virtio-scsi. I'd much rather leave it at Experimental until we merge upstream userspace support. If userspace support never ends up materializing, I'm fine with dropping it altogether.

Once it's in kernel you never know who will use this driver. Experimental does not mean driver can be dropped, staging does.

Yes, that's the point of being in mainline. People using the code, right..?

However at this point, given that there is a 3x performance gap between virtio-scsi-raw + virtio-scsi+tcm_vhost for random mixed small block I/O, and we still need the latter to do proper SCSI CDB passthrough for non TYPE_DISK devices, I'm hoping that we can agree on userspace bits once tcm_vhost is merged.

--nab

I do think upstream kernel would help you nail userspace issues too but at this point it looks like either staging material or 3.6 is too early.

I think for-3.6 is just the right time for this kernel code. Seriously, the basic ABI fabric layout for /sys/kernel/config/target/vhost/ is going to be the same now for-3.6, the same for-3.7, and the same for .38 code.

I'd be happy to move tcm_vhost back to drivers/target/ for now, and we move it to drivers/vhost/ once the userspace bits are worked out..? Would that be a reasonable compromise to move forward..?

--nab
Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
On Wed, 2012-07-18 at 00:58 +0300, Michael S. Tsirkin wrote:
On Tue, Jul 17, 2012 at 02:17:22PM -0700, Nicholas A. Bellinger wrote:

Wrt staging, I'd like to avoid mucking with staging because:

*) The code has been posted for review
*) The code has been converted to use the latest target-core primitives
*) The code does not require cleanups between staging -> merge
*) The code has been stable the last 7 days since RFC-v2 with heavy

BTW I don't suggest putting code itself in staging. Just the config flag to enable it. Once we are more or less sure multiple userspaces are using this driver, we'll move the config hopefully already in 3.7. What's the downside?

Ahh, sorry I managed to miss that part.. ;)

If it's just a CONFIG_STAGING flag for a release or two until we work out the userspace bits, I don't have an objection to doing something like that if it helps get the code exposed to a large set of eyes in mainline.

--nab
Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
On Tue, Jul 17, 2012 at 03:02:08PM -0700, Nicholas A. Bellinger wrote: On Wed, 2012-07-18 at 00:34 +0300, Michael S. Tsirkin wrote: On Tue, Jul 17, 2012 at 02:17:22PM -0700, Nicholas A. Bellinger wrote: On Tue, 2012-07-17 at 18:05 +0300, Michael S. Tsirkin wrote: On Wed, Jul 11, 2012 at 09:15:00PM +, Nicholas A. Bellinger wrote: From: Nicholas Bellinger n...@linux-iscsi.org SNIP I'm happy to commit to working with QEMU + kvm-tool folks to get to a series that can (eventually) see vhost-scsi support merged into upstream userspace code. It took roughly 2 years to get the megasas HBA emulation from Dr. Hannes merged, but certainly vhost-scsi has alot less moving pieces and hopefully alot less controversial bits than the buffer - SGL conversion.. The key word being here 'hopefully'.. ;) I think a good idea for 3.6 would be to make it depend on CONFIG_STAGING. Then we don't commit to an ABI. For this, you can add a separate Kconfig and source it from drivers/staging/Kconfig. Maybe it needs to be in a separate directory drivers/vhost/staging/Kconfig. So tcm_vhost has been marked as Experimental following virtio-scsi. Wrt to staging, I'd like to avoid mucking with staging because: *) The code has been posted for review *) The code has been converted to use the latest target-core primitives *) The code does not require cleanups between staging - merge *) The code has been stable the last 7 days since RFC-v2 with heavy staging is not just for code that needs cleanups. It's for anything that does not guarantee ABI stability yet. And I think it's a bit early to guarantee ABI stability - 7 days is not all that long. I was talking about the new I/O path has been running for 7 days. See for example Anthony's comments that raise exactly the ABI issues. As mentioned in the response to Anthony, we are using the same generic fabric ABI in drivers/target/target_core_fabric_configfs.c since .38. 
That part is not going to change, and it has not changed for any of the other 7 target fabric modules we've merged into mainline since then.

Also, tcm_vhost has been marked as Experimental following virtio-scsi. I'd much rather leave it at Experimental until we merge upstream userspace support. If userspace support never ends up materializing, I'm fine with dropping it altogether.

Once it's in kernel you never know who will use this driver. Experimental does not mean driver can be dropped, staging does.

Yes, that's the point of being in mainline. People using the code, right..?

Exactly. I am just worried about there being no major users in the end and us being stuck with a niche driver that as a result is very hard to test. And the reason for the fear is the initial negative reaction from the qemu side. And no, if it's there we can't just drop it.

However at this point, given that there is a 3x performance gap between virtio-scsi-raw + virtio-scsi+tcm_vhost for random mixed small block I/O, and we still need the latter to do proper SCSI CDB passthrough for non TYPE_DISK devices, I'm hoping that we can agree on userspace bits once tcm_vhost is merged.

--nab

I do think upstream kernel would help you nail userspace issues too but at this point it looks like either staging material or 3.6 is too early.

I think for-3.6 is just the right time for this kernel code. Seriously, the basic ABI fabric layout for /sys/kernel/config/target/vhost/ is going to be the same now for-3.6, the same for-3.7, and the same for .38 code.

I'd be happy to move tcm_vhost back to drivers/target/ for now, and we move it to drivers/vhost/ once the userspace bits are worked out..? Would that be a reasonable compromise to move forward..?

--nab

I don't see how it helps. The driver is either a guaranteed ABI or not. I'd prefer not to have vhost users outside drivers/vhost/ since it is harder for me to keep track of them. What's the problem with the staging proposal?
It's just another hoop to jump through to enable it?

--
MST
Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
On Wed, 2012-07-18 at 01:18 +0300, Michael S. Tsirkin wrote:
On Tue, Jul 17, 2012 at 03:02:08PM -0700, Nicholas A. Bellinger wrote:
On Wed, 2012-07-18 at 00:34 +0300, Michael S. Tsirkin wrote:
On Tue, Jul 17, 2012 at 02:17:22PM -0700, Nicholas A. Bellinger wrote:
On Tue, 2012-07-17 at 18:05 +0300, Michael S. Tsirkin wrote:
On Wed, Jul 11, 2012 at 09:15:00PM +, Nicholas A. Bellinger wrote:

From: Nicholas Bellinger n...@linux-iscsi.org

SNIP

As mentioned in the response to Anthony, we are using the same generic fabric ABI in drivers/target/target_core_fabric_configfs.c since .38. That part is not going to change, and it has not changed for any of the other 7 target fabric modules we've merged into mainline since then.

Also, tcm_vhost has been marked as Experimental following virtio-scsi. I'd much rather leave it at Experimental until we merge upstream userspace support. If userspace support never ends up materializing, I'm fine with dropping it altogether.

Once it's in kernel you never know who will use this driver. Experimental does not mean driver can be dropped, staging does.

Yes, that's the point of being in mainline. People using the code, right..?

Exactly. I am just worried about there being no major users in the end and us being stuck with a niche driver that as a result is very hard to test. And the reason for the fear is the initial negative reaction from the qemu side. And no, if it's there we can't just drop it.

That is certainly a reasonable concern..

However at this point, given that there is a 3x performance gap between virtio-scsi-raw + virtio-scsi+tcm_vhost for random mixed small block I/O, and we still need the latter to do proper SCSI CDB passthrough for non TYPE_DISK devices, I'm hoping that we can agree on userspace bits once tcm_vhost is merged.

--nab

I do think upstream kernel would help you nail userspace issues too but at this point it looks like either staging material or 3.6 is too early.
I think for-3.6 is just the right time for this kernel code. Seriously, the basic ABI fabric layout for /sys/kernel/config/target/vhost/ is going to be the same now for-3.6, the same for-3.7, and the same for .38 code.

I'd be happy to move tcm_vhost back to drivers/target/ for now, and we move it to drivers/vhost/ once the userspace bits are worked out..? Would that be a reasonable compromise to move forward..?

--nab

I don't see how it helps. The driver is either a guaranteed ABI or not. I'd prefer not to have vhost users outside drivers/vhost/ since it is harder for me to keep track of them. What's the problem with staging proposal? It's just another hoop to jump through to enable it?

Yeah, I'm OK with just adding a CONFIG_STAGING tag as a reasonable step forward for-3.6. Adding the following patch into target-pending/for-next-merge now:

diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
index ccbeb6f..2cd7135 100644
--- a/drivers/vhost/Kconfig
+++ b/drivers/vhost/Kconfig
@@ -11,7 +11,7 @@ config VHOST_NET
 
 config TCM_VHOST
 	tristate "TCM_VHOST fabric module (EXPERIMENTAL)"
-	depends on TARGET_CORE && EVENTFD && EXPERIMENTAL && m
+	depends on TARGET_CORE && EVENTFD && EXPERIMENTAL && STAGING && m
 	default n
 	---help---
 	Say M here to enable the TCM_VHOST fabric module for use with virtio-scsi guests
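As a rough illustration of what the extra dependency in the patch above changes, the sketch below models a "depends on" line as a plain conjunction of symbols and checks it against a set of enabled options. It assumes the symbols in the patch are joined with "&&", as Kconfig syntax requires, and it treats "m" as just another symbol that is present when modules are supported, which is a simplification of the real Kconfig semantics. The selectable() helper is a made-up name, not part of any kernel tooling.

```python
def selectable(depends_on, enabled):
    """True when every symbol in an '&&'-joined dependency list is enabled."""
    symbols = [s.strip() for s in depends_on.split("&&")]
    return all(sym in enabled for sym in symbols)


# Dependency lines before and after the patch (assumed '&&' separators).
BEFORE = "TARGET_CORE && EVENTFD && EXPERIMENTAL && m"
AFTER = "TARGET_CORE && EVENTFD && EXPERIMENTAL && STAGING && m"

# A hypothetical kernel configuration without CONFIG_STAGING.
base_config = {"TARGET_CORE", "EVENTFD", "EXPERIMENTAL", "m"}

offered_before = selectable(BEFORE, base_config)
offered_after = selectable(AFTER, base_config)
offered_with_staging = selectable(AFTER, base_config | {"STAGING"})
```

With the patch, TCM_VHOST is no longer offered unless STAGING is also enabled; the check says nothing about where the option shows up in the configuration menu.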
Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
On Tue, Jul 17, 2012 at 03:37:20PM -0700, Nicholas A. Bellinger wrote:
On Wed, 2012-07-18 at 01:18 +0300, Michael S. Tsirkin wrote:
On Tue, Jul 17, 2012 at 03:02:08PM -0700, Nicholas A. Bellinger wrote:
On Wed, 2012-07-18 at 00:34 +0300, Michael S. Tsirkin wrote:
On Tue, Jul 17, 2012 at 02:17:22PM -0700, Nicholas A. Bellinger wrote:
On Tue, 2012-07-17 at 18:05 +0300, Michael S. Tsirkin wrote:
On Wed, Jul 11, 2012 at 09:15:00PM +, Nicholas A. Bellinger wrote:
From: Nicholas Bellinger n...@linux-iscsi.org

SNIP

As mentioned in the response to Anthony, we have been using the same generic fabric ABI in drivers/target/target_core_fabric_configfs.c since .38. That part is not going to change, and it has not changed for any of the other 7 target fabric modules we've merged into mainline since then.

Also, tcm_vhost has been marked as Experimental following virtio-scsi. I'd much rather leave it at Experimental until we merge upstream userspace support. If userspace support never ends up materializing, I'm fine with dropping it altogether.

Once it's in kernel you never know who will use this driver. Experimental does not mean driver can be dropped, staging does.

Yes, that's the point of being in mainline. People using the code, right..?

Exactly. I am just worried about there being no major users in the end, and about us being stuck with a niche driver that as a result is very hard to test. And the reason for the fear is the initial negative reaction from the qemu side. And no, if it's there we can't just drop it.

That is certainly a reasonable concern.. However, at this point, given that there is a 3x performance gap between virtio-scsi-raw and virtio-scsi+tcm_vhost for random mixed small block I/O, and that we still need the latter to do proper SCSI CDB passthrough for non TYPE_DISK devices, I'm hoping that we can agree on userspace bits once tcm_vhost is merged.

--nab

I do think upstream kernel would help you nail userspace issues too, but at this point it looks like either staging material or 3.6 is too early.

I think for-3.6 is just the right time for this kernel code. Seriously, the basic ABI fabric layout for /sys/kernel/config/target/vhost/ is going to be the same now for-3.6, the same for-3.7, and the same for .38 code.

I'd be happy to move tcm_vhost back to drivers/target/ for now, and we move it to drivers/vhost/ once the userspace bits are worked out..? Would that be a reasonable compromise to move forward..?

--nab

I don't see how it helps. The driver is either a guaranteed ABI or not. I'd prefer not to have vhost users outside drivers/vhost/ since it is harder for me to keep track of them. What's the problem with the staging proposal? It's just another hoop to jump through to enable it?

Yeah, I'm OK with that; just adding a CONFIG_STAGING tag is a reasonable step forward for-3.6. Adding the following patch into target-pending/for-next-merge now:

diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
index ccbeb6f..2cd7135 100644
--- a/drivers/vhost/Kconfig
+++ b/drivers/vhost/Kconfig
@@ -11,7 +11,7 @@ config VHOST_NET
 config TCM_VHOST
 	tristate "TCM_VHOST fabric module (EXPERIMENTAL)"
-	depends on TARGET_CORE && EVENTFD && EXPERIMENTAL && m
+	depends on TARGET_CORE && EVENTFD && EXPERIMENTAL && STAGING && m
 	default n
 	---help---
 	Say M here to enable the TCM_VHOST fabric module for use with virtio-scsi guests

Hmm, that's not explicit enough: someone might enable CONFIG_STAGING for some other reason and not notice the dependency. We need it to appear with the other staging drivers in the menu, so there needs to be a Kconfig that is included from drivers/staging/Kconfig.

For example, we can create drivers/vhost/staging/Kconfig or drivers/vhost/tcm/Kconfig and include it from drivers/staging/Kconfig. nouveau did something like this for a while, see f3c93cbde7eab38671ae085cb1027b08f5f36757. No need to move the rest of the code.
--
MST
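The arrangement suggested in this message could look roughly like the fragment below. This is a sketch, not the patch that was later posted: drivers/vhost/tcm/Kconfig is one of the two file names floated above, the TCM_VHOST entry is written out with the quoting and `&&` operators that normal Kconfig syntax requires, and the position of the `source` line inside drivers/staging/Kconfig is an assumption.

```kconfig
# drivers/vhost/tcm/Kconfig (new file; path taken from the suggestion above)
config TCM_VHOST
	tristate "TCM_VHOST fabric module (EXPERIMENTAL)"
	depends on TARGET_CORE && EVENTFD && EXPERIMENTAL && m
	default n
	---help---
	Say M here to enable the TCM_VHOST fabric module for use with virtio-scsi guests

# drivers/staging/Kconfig (one added line, assumed to sit inside the
# existing "if STAGING" block next to the other staging drivers)
source "drivers/vhost/tcm/Kconfig"
```

With the entry sourced from inside the staging menu, the option is listed alongside the other staging drivers and inherits the STAGING dependency from the enclosing block, so the entry itself would not need `&& STAGING`. The driver sources stay in drivers/vhost/; only the Kconfig entry moves.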
Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
On Wed, 2012-07-18 at 02:11 +0300, Michael S. Tsirkin wrote:
On Tue, Jul 17, 2012 at 03:37:20PM -0700, Nicholas A. Bellinger wrote:
On Wed, 2012-07-18 at 01:18 +0300, Michael S. Tsirkin wrote:
On Tue, Jul 17, 2012 at 03:02:08PM -0700, Nicholas A. Bellinger wrote:
On Wed, 2012-07-18 at 00:34 +0300, Michael S. Tsirkin wrote:

SNIP

I don't see how it helps. The driver is either a guaranteed ABI or not. I'd prefer not to have vhost users outside drivers/vhost/ since it is harder for me to keep track of them. What's the problem with the staging proposal? It's just another hoop to jump through to enable it?

Yeah, I'm OK with that; just adding a CONFIG_STAGING tag is a reasonable step forward for-3.6. Adding the following patch into target-pending/for-next-merge now:

diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
index ccbeb6f..2cd7135 100644
--- a/drivers/vhost/Kconfig
+++ b/drivers/vhost/Kconfig
@@ -11,7 +11,7 @@ config VHOST_NET
 config TCM_VHOST
 	tristate "TCM_VHOST fabric module (EXPERIMENTAL)"
-	depends on TARGET_CORE && EVENTFD && EXPERIMENTAL && m
+	depends on TARGET_CORE && EVENTFD && EXPERIMENTAL && STAGING && m
 	default n
 	---help---
 	Say M here to enable the TCM_VHOST fabric module for use with virtio-scsi guests

Hmm, that's not explicit enough: someone might enable CONFIG_STAGING for some other reason and not notice the dependency. We need it to appear with the other staging drivers in the menu, so there needs to be a Kconfig that is included from drivers/staging/Kconfig.

Hmm, I am sensing a linux-next merge conflict with staging-next and/or another for-next-merge rebase coming on..

For example, we can create drivers/vhost/staging/Kconfig or drivers/vhost/tcm/Kconfig and include it from drivers/staging/Kconfig. nouveau did something like this for a while, see f3c93cbde7eab38671ae085cb1027b08f5f36757. No need to move the rest of the code.

OK, let's use drivers/vhost/tcm/Kconfig, and I'll post an incremental patch to make it appear under staging shortly. (CC'ing Greg-KH for good measure.)
[RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
From: Nicholas Bellinger n...@linux-iscsi.org

Hi folks,

The following is an RFC-v2 series of tcm_vhost target fabric driver code currently in-flight for-3.6 mainline code.

After last week's developments along with the help of some new folks, the changelog v1 -> v2 so far looks like:

*) Fix drivers/vhost/test.c to use VHOST_NET_FEATURES in patch #1 (Asias He)
*) Fix tv_cmd completion - release SGL memory leak (nab)
*) Fix sparse warnings for static variable usage (Fengguang Wu)
*) Fix sparse warnings for min() typing + printk format specs (Fengguang Wu)
*) Convert to cmwq submission for I/O dispatch (nab + hch)

Also, following Paolo's request, there is a patch for hw/virtio-scsi.c that sets scsi_host->max_target=0, which removes the need for the virtio-scsi LLD to hardcode VirtIOSCSIConfig->max_id=1 in order to function with tcm_vhost.

Note this series has been pushed into target-pending.git/for-next-merge, and should be getting picked up for tomorrow's linux-next build.

Please let us know if you have any concerns and/or additional review feedback.

Thank you!

Nicholas Bellinger (2):
  vhost: Add vhost_scsi specific defines
  tcm_vhost: Initial merge for vhost level target fabric driver

Stefan Hajnoczi (2):
  vhost: Separate vhost-net features from vhost features
  vhost: make vhost work queue visible

 drivers/vhost/Kconfig     |    6 +
 drivers/vhost/Makefile    |    1 +
 drivers/vhost/net.c       |    4 +-
 drivers/vhost/tcm_vhost.c | 1609 +
 drivers/vhost/tcm_vhost.h |   74 ++
 drivers/vhost/test.c      |    4 +-
 drivers/vhost/vhost.c     |    5 +-
 drivers/vhost/vhost.h     |    6 +-
 include/linux/vhost.h     |    9 +
 9 files changed, 1710 insertions(+), 8 deletions(-)
 create mode 100644 drivers/vhost/tcm_vhost.c
 create mode 100644 drivers/vhost/tcm_vhost.h

--
1.7.2.5