Re: [openstack-dev] vhost-scsi support in Nova
On Thu, 2014-07-24 at 11:06 +0100, Daniel P. Berrange wrote: On Wed, Jul 23, 2014 at 10:32:44PM -0700, Nicholas A. Bellinger wrote:

*) vhost-scsi doesn't support migration

Since its initial merge in QEMU v1.5, vhost-scsi has had a migration blocker set. This is primarily due to requiring some external orchestration in order to set up the necessary vhost-scsi endpoints on the migration destination to match what's running on the migration source.

Here are a couple of points that Stefan detailed some time ago about what's involved in properly supporting live migration with vhost-scsi:

(1) vhost-scsi needs to tell QEMU when it dirties memory pages, either by DMAing to guest memory buffers or by modifying the virtio vring (which also lives in guest memory). This should be straightforward since the infrastructure is already present in vhost (it's called the log) and used by drivers/vhost/net.c.

(2) The harder part is seamless target handover to the destination host. vhost-scsi needs to serialize any SCSI target state from the source machine and load it on the destination machine. We could be in the middle of emulating a SCSI command. An obvious solution is to only support active-passive or active-active HA setups where tcm already knows how to fail over. This typically requires shared storage and maybe some communication for the clustering mechanism. There are more sophisticated approaches, so this straightforward one is just an example.

That said, we do intend to support live migration for vhost-scsi using iSCSI/iSER/FC shared storage.

*) vhost-scsi doesn't support qcow2

Given that all other Cinder drivers, with the exception of the NetApp and Gluster drivers, do not use QEMU qcow2 to access storage blocks, this argument is not particularly relevant here. However, this doesn't mean that vhost-scsi (and target-core itself) cannot support qcow2 images. There is currently an effort to add a userspace backend driver for the upstream target (tcm_core_user [3]) that will allow for supporting various disk formats in userspace.

The important part for vhost-scsi is that regardless of what type of target backend driver is put behind the fabric LUNs (raw block devices using IBLOCK, qcow2 images using target_core_user, etc.), the changes required in Nova and libvirt to support vhost-scsi remain the same. They do not change based on the backend driver.

*) vhost-scsi is not intended for production

vhost-scsi has been included in the upstream kernel since the v3.6 release, and in QEMU since v1.5. vhost-scsi runs unmodified out of the box on a number of popular distributions including Fedora, Ubuntu, and openSUSE. It also works as a QEMU boot device with SeaBIOS, and even with the Windows virtio-scsi miniport driver. There is at least one vendor who has already posted libvirt patches to support vhost-scsi, so vhost-scsi is already being pushed beyond a debugging and development tool.

For instance, here are a few specific use cases where vhost-scsi is currently the only option for virtio-scsi guests:

- Low (sub-100 usec) latencies for AIO reads/writes with small-iodepth workloads
- 1M+ small-block IOPS at low CPU utilization with large-iodepth workloads
- End-to-end data integrity using T10 protection information (DIF)

IIUC, there is also missing support for block jobs like drive-mirror, which is needed by Nova.

This limitation can be considered an acceptable trade-off in initial support by some users, given the already considerable performance + efficiency gains that vhost logic provides to KVM/virtio guests. Others would like to utilize virtio-scsi to access features like SPC-3 Persistent Reservations, ALUA Explicit/Implicit Multipath, and EXTENDED_COPY offload provided by the LIO target subsystem. Note these three SCSI features are enabled (by default) on all LIO target fabric LUNs, starting with v3.12 kernel code.

From a functionality POV, migration and drive-mirror support are the two core roadblocks to including vhost-scsi in Nova (as well as libvirt support for it, of course). Realistically, it doesn't sound like these are likely to be solved soon enough to give us confidence in taking this for the Juno release cycle.

The spec is for initial support of vhost-scsi controller endpoints during Juno and, as mentioned earlier by Vish, should be considered an experimental feature given the caveats you've highlighted above. We also understand that the code will ultimately have to pass Nova + libvirt upstream review in order to be merged, and that the approval of any vhost-scsi spec now is not a guarantee that the feature will actually make it into any official Juno release.

Thanks, --nab

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org
Re: [openstack-dev] vhost-scsi support in Nova
On Fri, Jul 25, 2014 at 01:18:33AM -0700, Nicholas A. Bellinger wrote:
[snip]
> The spec is for initial support of vhost-scsi controller endpoints during Juno, and as mentioned earlier by Vish, should be considered an experimental feature given the caveats you've highlighted above.

To accept something as a feature marked experimental, IMHO we need to have some confidence that it will be able to be marked non-experimental in the future. The vhost-scsi code has been around in QEMU for over a
Re: [openstack-dev] vhost-scsi support in Nova
On Wed, Jul 23, 2014 at 10:32:44PM -0700, Nicholas A. Bellinger wrote:
[snip]

IIUC, there is also missing support for block jobs like drive-mirror, which is needed by Nova.

From a functionality POV, migration and drive-mirror support are the two core roadblocks to including vhost-scsi in Nova (as well as libvirt support for it, of course). Realistically, it doesn't sound like these are likely to be solved soon enough to give us confidence in taking this for the Juno release cycle.
Regards, Daniel

-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] vhost-scsi support in Nova
Hi Nova folks,

Please let me address some of the outstanding technical points that have been raised recently within the following spec [1] for supporting vhost-scsi [2] within Nova. Mike and Daniel have been going back and forth on various details, so I thought it might be helpful to open the discussion to a wider audience.

First, some background. I'm the target (LIO) subsystem maintainer for the upstream Linux kernel, and have been one of the primary contributors in that community for a number of years. This includes the target-core subsystem, the backend drivers that communicate with kernel storage subsystems, and a number of frontend fabric protocol drivers. vhost-scsi is one of those frontend fabric protocol drivers included upstream, which I and others have contributed to and improved over the past three years. Given this experience and commitment to supporting upstream code, I'd like to address some of the specific points wrt vhost-scsi here.

*) vhost-scsi doesn't support migration

Since its initial merge in QEMU v1.5, vhost-scsi has had a migration blocker set. This is primarily due to requiring some external orchestration in order to set up the necessary vhost-scsi endpoints on the migration destination to match what's running on the migration source.

Here are a couple of points that Stefan detailed some time ago about what's involved in properly supporting live migration with vhost-scsi:

(1) vhost-scsi needs to tell QEMU when it dirties memory pages, either by DMAing to guest memory buffers or by modifying the virtio vring (which also lives in guest memory). This should be straightforward since the infrastructure is already present in vhost (it's called the log) and used by drivers/vhost/net.c.

(2) The harder part is seamless target handover to the destination host. vhost-scsi needs to serialize any SCSI target state from the source machine and load it on the destination machine. We could be in the middle of emulating a SCSI command.
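To make the handover problem in point (2) concrete, here is a toy sketch (hypothetical Python, not the actual tcm code; the state fields and helper names are illustrative assumptions): the source host serializes per-LUN target state, including any command caught mid-emulation, and the destination loads that state before the guest is allowed to resume.

```python
import json

def serialize_target_state(luns):
    """Source host: snapshot the per-LUN SCSI target state that must
    survive handover (reservations, commands caught mid-emulation, ...)."""
    return json.dumps({
        lun: {"reservation": st.get("reservation"),
              "inflight": st.get("inflight", [])}
        for lun, st in luns.items()})

def load_target_state(blob):
    """Destination host: restore the state before the guest resumes."""
    state = json.loads(blob)
    for lun, st in state.items():
        for cmd in st["inflight"]:
            # A command caught mid-emulation must be completed or re-queued
            print(f"{lun}: re-queueing in-flight {cmd}")
    return state

src = {"lun_0": {"reservation": "host-A", "inflight": ["WRITE_10"]}}
dst = load_target_state(serialize_target_state(src))
assert dst["lun_0"]["reservation"] == "host-A"
```

The hard part in practice is deciding what belongs in that blob and when it is safe to snapshot it, which is exactly why the clustering shortcut below is attractive.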
An obvious solution is to only support active-passive or active-active HA setups where tcm already knows how to fail over. This typically requires shared storage and maybe some communication for the clustering mechanism. There are more sophisticated approaches, so this straightforward one is just an example. That said, we do intend to support live migration for vhost-scsi using iSCSI/iSER/FC shared storage.

*) vhost-scsi doesn't support qcow2

Given that all other Cinder drivers, with the exception of the NetApp and Gluster drivers, do not use QEMU qcow2 to access storage blocks, this argument is not particularly relevant here. However, this doesn't mean that vhost-scsi (and target-core itself) cannot support qcow2 images. There is currently an effort to add a userspace backend driver for the upstream target (tcm_core_user [3]) that will allow for supporting various disk formats in userspace.

The important part for vhost-scsi is that regardless of what type of target backend driver is put behind the fabric LUNs (raw block devices using IBLOCK, qcow2 images using target_core_user, etc.), the changes required in Nova and libvirt to support vhost-scsi remain the same. They do not change based on the backend driver.

*) vhost-scsi is not intended for production

vhost-scsi has been included in the upstream kernel since the v3.6 release, and in QEMU since v1.5. vhost-scsi runs unmodified out of the box on a number of popular distributions including Fedora, Ubuntu, and openSUSE. It also works as a QEMU boot device with SeaBIOS, and even with the Windows virtio-scsi miniport driver. There is at least one vendor who has already posted libvirt patches to support vhost-scsi, so vhost-scsi is already being pushed beyond a debugging and development tool.
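The backend-agnostic point above is visible in LIO's configfs layout: the vhost fabric LUN that orchestration (and hence Nova/libvirt) deals with lives at the same path regardless of which backstore sits behind it. A minimal sketch, assuming the standard configfs mount point; the WWPN and backstore names are made up for illustration, and nothing here touches a real configfs:

```python
import os.path

CONFIGFS = "/sys/kernel/config/target"

def backstore_path(driver, name, index=0):
    # Backend side, e.g. core/iblock_0/rawdev or core/user_0/qcow2dev
    return os.path.join(CONFIGFS, "core", f"{driver}_{index}", name)

def vhost_lun_path(wwpn, tpgt=1, lun=0):
    # Fabric side: the vhost LUN the guest sees, identical for any backstore
    return os.path.join(CONFIGFS, "vhost", wwpn, f"tpgt_{tpgt}", "lun", f"lun_{lun}")

wwpn = "naa.60014055555d0000"  # made-up example WWPN
for driver, name in [("iblock", "rawdev"), ("user", "qcow2dev")]:
    print(backstore_path(driver, name), "->", vhost_lun_path(wwpn))
```

In a real deployment these directories are created with mkdir/symlink (or via a tool like targetcli); the point is that swapping IBLOCK for target_core_user changes only the core/ side, never the vhost/ side that Nova and libvirt configure.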
For instance, here are a few specific use cases where vhost-scsi is currently the only option for virtio-scsi guests:

- Low (sub-100 usec) latencies for AIO reads/writes with small-iodepth workloads
- 1M+ small-block IOPS at low CPU utilization with large-iodepth workloads
- End-to-end data integrity using T10 protection information (DIF)

So vhost-scsi can and will support essential features like live migration and qcow2, and the virtio-scsi data-plane effort should not block existing alternatives already in upstream. With that, we'd like to see Nova officially support vhost-scsi because of its wide availability in the Linux ecosystem, and the considerable performance, efficiency, and end-to-end data-integrity benefits that it already brings to the table. We are committed to addressing the short- and long-term items for this driver, and making it a success in OpenStack Nova.

Thank you, --nab

[1] https://review.openstack.org/#/c/103797/5/specs/juno/virtio-scsi-settings.rst
[2] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/vhost/scsi.c
[3] http://www.spinics.net/lists/target-devel/msg07339.html
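P.S. For the curious, the end-to-end data-integrity use case rests on the T10 PI guard tag, a CRC16 computed over each sector's data. A toy, bit-at-a-time re-implementation of that guard-tag CRC (polynomial 0x8BB7, init 0, no reflection); the kernel's crc_t10dif does the same thing with lookup tables:

```python
def crc16_t10dif(data: bytes) -> int:
    """Bit-at-a-time CRC16 with the T10-DIF polynomial 0x8BB7
    (init 0, no reflection) -- a slow model of the kernel's crc_t10dif."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8BB7) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

# Guard tag for an all-zero 512-byte sector, and for the classic CRC check string
print(hex(crc16_t10dif(bytes(512))))
print(hex(crc16_t10dif(b"123456789")))
```

The target computes (or verifies) this tag per sector, which is what makes the integrity guarantee end-to-end rather than per-hop.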