Re: [openstack-dev] vhost-scsi support in Nova

2014-07-25 Thread Nicholas A. Bellinger
On Thu, 2014-07-24 at 11:06 +0100, Daniel P. Berrange wrote:
 On Wed, Jul 23, 2014 at 10:32:44PM -0700, Nicholas A. Bellinger wrote:
  *) vhost-scsi doesn't support migration
  
  Since its initial merge in QEMU v1.5, vhost-scsi has a migration blocker
  set.  This is primarily due to requiring some external orchestration in
  order to set up the necessary vhost-scsi endpoints on the migration
  destination to match what's running on the migration source.
  
  Here are a couple of points that Stefan detailed some time ago about what's
  involved for properly supporting live migration with vhost-scsi:
  
  (1) vhost-scsi needs to tell QEMU when it dirties memory pages, either by
  DMAing to guest memory buffers or by modifying the virtio vring (which also
  lives in guest memory).  This should be straightforward since the
  infrastructure is already present in vhost (it's called the log) and used
  by drivers/vhost/net.c.
  
  (2) The harder part is seamless target handover to the destination host.
  vhost-scsi needs to serialize any SCSI target state from the source machine
  and load it on the destination machine.  We could be in the middle of
  emulating a SCSI command.
  
  An obvious solution is to only support active-passive or active-active HA
  setups where tcm already knows how to fail over.  This typically requires
  shared storage and maybe some communication for the clustering mechanism.
  There are more sophisticated approaches, so this straightforward one is just
  an example.
  
  That said, we do intend to support live migration for vhost-scsi using
  iSCSI/iSER/FC shared storage.
  
  *) vhost-scsi doesn't support qcow2
  
  Given that Cinder drivers other than the NetApp and Gluster drivers do not
  use QEMU qcow2 to access storage blocks, this argument is not particularly
  relevant here.
  
  However, this doesn't mean that vhost-scsi (and target-core itself) cannot
  support qcow2 images.  There is currently an effort to add a userspace
  backend driver for the upstream target (tcm_core_user [3]), which will allow
  various disk formats to be supported in userspace.
  
  The important part for vhost-scsi is that regardless of what type of target
  backend driver is put behind the fabric LUNs (raw block devices using
  IBLOCK, qcow2 images using target_core_user, etc) the changes required in
  Nova and libvirt to support vhost-scsi remain the same.  They do not change
  based on the backend driver.
  
  *) vhost-scsi is not intended for production
  
  vhost-scsi has been included in the upstream kernel since the v3.6 release,
  and in QEMU since v1.5.  vhost-scsi runs unmodified out of the box on a
  number of popular distributions including Fedora, Ubuntu, and openSUSE.  It
  also works as a QEMU boot device with SeaBIOS, and even with the Windows
  virtio-scsi mini-port driver.
  
  There is at least one vendor who has already posted libvirt patches to
  support vhost-scsi, so vhost-scsi is already being pushed beyond a debugging
  and development tool.
  
  For instance, here are a few specific use cases where vhost-scsi is
  currently the only option for virtio-scsi guests:
  
  - Low (sub-100 usec) latencies for AIO reads/writes with small-iodepth
    workloads
  - 1M+ small-block IOPS workloads at low CPU utilization with large-iodepth
    workloads
  - End-to-end data integrity using T10 protection information (DIF)
 
 IIUC, there is also missing support for block jobs like drive-mirror
 which is needed by Nova.
 

Some users may consider this limitation an acceptable trade-off for initial
support, given the already considerable performance and efficiency gains that
vhost logic provides to KVM/virtio guests.

Others would like to utilize virtio-scsi to access features like SPC-3
Persistent Reservations, ALUA Explicit/Implicit Multipath, and
EXTENDED_COPY offload provided by the LIO target subsystem.

Note these three SCSI features are enabled (by default) on all LIO
target fabric LUNs, starting with v3.12 kernel code.
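
For anyone who wants to verify what their kernel exposes, here is a rough
Python sketch that walks LIO's configfs layout and prints the emulation
attributes and ALUA groups of a single backstore.  The backstore name
(iblock_0/my_lun) is a made-up example, and the exact attribute names (e.g.
emulate_3pc for EXTENDED_COPY) vary by kernel version, so the script just
prints whatever the backstore exposes rather than assuming a fixed set.

#!/usr/bin/env python3
# Hedged sketch: list the SCSI emulation attributes and ALUA groups that LIO
# exposes for one backstore via configfs.  Paths assume configfs is mounted at
# /sys/kernel/config and that a backstore named iblock_0/my_lun exists (both
# are example assumptions).
import os

backstore = "/sys/kernel/config/target/core/iblock_0/my_lun"

def read_attr(path):
    with open(path) as f:
        return f.read().strip()

# Emulation toggles (EXTENDED_COPY and friends, on kernels that support them)
# live as one file per attribute under attrib/.
attrib_dir = os.path.join(backstore, "attrib")
for name in sorted(os.listdir(attrib_dir)):
    try:
        print("%-28s %s" % (name, read_attr(os.path.join(attrib_dir, name))))
    except OSError:
        pass  # a few attributes are write-only

# ALUA state is kept in a parallel directory of target port groups.
alua_dir = os.path.join(backstore, "alua")
if os.path.isdir(alua_dir):
    print("ALUA target port groups:", sorted(os.listdir(alua_dir)))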

 From a functionality POV, migration and drive-mirror support are the two
 core roadblocks to including vhost-scsi in Nova (as well as libvirt
 support for it of course). Realistically it doesn't sound like these
 are likely to be solved soon enough to give us confidence in taking
 this for the Juno release cycle.
 

The spec is for initial support of vhost-scsi controller endpoints during
Juno and, as mentioned earlier by Vish, it should be considered an
experimental feature given the caveats you've highlighted above.

We also understand that the code will ultimately have to pass Nova and libvirt
upstream review in order to be merged, and that the approval of any
vhost-scsi spec now is not a guarantee that the feature will actually make it
into any official Juno release.

Thanks,

--nab


Re: [openstack-dev] vhost-scsi support in Nova

2014-07-25 Thread Daniel P. Berrange
On Fri, Jul 25, 2014 at 01:18:33AM -0700, Nicholas A. Bellinger wrote:
 On Thu, 2014-07-24 at 11:06 +0100, Daniel P. Berrange wrote:
  On Wed, Jul 23, 2014 at 10:32:44PM -0700, Nicholas A. Bellinger wrote:
   *) vhost-scsi doesn't support migration

   Since its initial merge in QEMU v1.5, vhost-scsi has a migration blocker
   set.  This is primarily due to requiring some external orchestration in
   order to set up the necessary vhost-scsi endpoints on the migration
   destination to match what's running on the migration source.

   Here are a couple of points that Stefan detailed some time ago about what's
   involved for properly supporting live migration with vhost-scsi:

   (1) vhost-scsi needs to tell QEMU when it dirties memory pages, either by
   DMAing to guest memory buffers or by modifying the virtio vring (which also
   lives in guest memory).  This should be straightforward since the
   infrastructure is already present in vhost (it's called the log) and used
   by drivers/vhost/net.c.

   (2) The harder part is seamless target handover to the destination host.
   vhost-scsi needs to serialize any SCSI target state from the source machine
   and load it on the destination machine.  We could be in the middle of
   emulating a SCSI command.

   An obvious solution is to only support active-passive or active-active HA
   setups where tcm already knows how to fail over.  This typically requires
   shared storage and maybe some communication for the clustering mechanism.
   There are more sophisticated approaches, so this straightforward one is just
   an example.

   That said, we do intend to support live migration for vhost-scsi using
   iSCSI/iSER/FC shared storage.

   *) vhost-scsi doesn't support qcow2

   Given that Cinder drivers other than the NetApp and Gluster drivers do not
   use QEMU qcow2 to access storage blocks, this argument is not particularly
   relevant here.

   However, this doesn't mean that vhost-scsi (and target-core itself) cannot
   support qcow2 images.  There is currently an effort to add a userspace
   backend driver for the upstream target (tcm_core_user [3]), which will allow
   various disk formats to be supported in userspace.

   The important part for vhost-scsi is that regardless of what type of target
   backend driver is put behind the fabric LUNs (raw block devices using
   IBLOCK, qcow2 images using target_core_user, etc) the changes required in
   Nova and libvirt to support vhost-scsi remain the same.  They do not change
   based on the backend driver.

   *) vhost-scsi is not intended for production

   vhost-scsi has been included in the upstream kernel since the v3.6 release,
   and in QEMU since v1.5.  vhost-scsi runs unmodified out of the box on a
   number of popular distributions including Fedora, Ubuntu, and openSUSE.  It
   also works as a QEMU boot device with SeaBIOS, and even with the Windows
   virtio-scsi mini-port driver.

   There is at least one vendor who has already posted libvirt patches to
   support vhost-scsi, so vhost-scsi is already being pushed beyond a debugging
   and development tool.

   For instance, here are a few specific use cases where vhost-scsi is
   currently the only option for virtio-scsi guests:

     - Low (sub-100 usec) latencies for AIO reads/writes with small-iodepth
       workloads
     - 1M+ small-block IOPS workloads at low CPU utilization with large-iodepth
       workloads
     - End-to-end data integrity using T10 protection information (DIF)
  
  IIUC, there is also missing support for block jobs like drive-mirror
  which is needed by Nova.
  
 
 Some users may consider this limitation an acceptable trade-off for initial
 support, given the already considerable performance and efficiency gains that
 vhost logic provides to KVM/virtio guests.
 
 Others would like to utilize virtio-scsi to access features like SPC-3
 Persistent Reservations, ALUA Explicit/Implicit Multipath, and
 EXTENDED_COPY offload provided by the LIO target subsystem.
 
 Note these three SCSI features are enabled (by default) on all LIO
 target fabric LUNs, starting with v3.12 kernel code.
 
  From a functionality POV, migration and drive-mirror support are the two
  core roadblocks to including vhost-scsi in Nova (as well as libvirt
  support for it of course). Realistically it doesn't sound like these
  are likely to be solved soon enough to give us confidence in taking
  this for the Juno release cycle.
  
 
 The spec is for initial support of vhost-scsi controller endpoints during
 Juno and, as mentioned earlier by Vish, it should be considered an
 experimental feature given the caveats you've highlighted above.

To accept something as a feature marked experimental, IMHO we need to
have some confidence that it will be able to be marked non-experimental
in the future.  The vhost-scsi code has been around in QEMU for over a

Re: [openstack-dev] vhost-scsi support in Nova

2014-07-24 Thread Daniel P. Berrange
On Wed, Jul 23, 2014 at 10:32:44PM -0700, Nicholas A. Bellinger wrote:
 *) vhost-scsi doesn't support migration
 
 Since its initial merge in QEMU v1.5, vhost-scsi has a migration blocker
 set.  This is primarily due to requiring some external orchestration in
 order to set up the necessary vhost-scsi endpoints on the migration
 destination to match what's running on the migration source.
 
 Here are a couple of points that Stefan detailed some time ago about what's
 involved for properly supporting live migration with vhost-scsi:
 
 (1) vhost-scsi needs to tell QEMU when it dirties memory pages, either by
 DMAing to guest memory buffers or by modifying the virtio vring (which also
 lives in guest memory).  This should be straightforward since the
 infrastructure is already present in vhost (it's called the log) and used
 by drivers/vhost/net.c.
 
 (2) The harder part is seamless target handover to the destination host.
 vhost-scsi needs to serialize any SCSI target state from the source machine
 and load it on the destination machine.  We could be in the middle of
 emulating a SCSI command.
 
 An obvious solution is to only support active-passive or active-active HA
 setups where tcm already knows how to fail over.  This typically requires
 shared storage and maybe some communication for the clustering mechanism.
 There are more sophisticated approaches, so this straightforward one is just
 an example.
 
 That said, we do intend to support live migration for vhost-scsi using
 iSCSI/iSER/FC shared storage.
 
 *) vhost-scsi doesn't support qcow2
 
 Given that Cinder drivers other than the NetApp and Gluster drivers do not
 use QEMU qcow2 to access storage blocks, this argument is not particularly
 relevant here.
 
 However, this doesn't mean that vhost-scsi (and target-core itself) cannot
 support qcow2 images.  There is currently an effort to add a userspace
 backend driver for the upstream target (tcm_core_user [3]), which will allow
 various disk formats to be supported in userspace.
 
 The important part for vhost-scsi is that regardless of what type of target
 backend driver is put behind the fabric LUNs (raw block devices using
 IBLOCK, qcow2 images using target_core_user, etc) the changes required in
 Nova and libvirt to support vhost-scsi remain the same.  They do not change
 based on the backend driver.
 
 *) vhost-scsi is not intended for production
 
 vhost-scsi has been included in the upstream kernel since the v3.6 release,
 and in QEMU since v1.5.  vhost-scsi runs unmodified out of the box on a
 number of popular distributions including Fedora, Ubuntu, and openSUSE.  It
 also works as a QEMU boot device with SeaBIOS, and even with the Windows
 virtio-scsi mini-port driver.
 
 There is at least one vendor who has already posted libvirt patches to
 support vhost-scsi, so vhost-scsi is already being pushed beyond a debugging
 and development tool.
 
 For instance, here are a few specific use cases where vhost-scsi is
 currently the only option for virtio-scsi guests:
 
   - Low (sub-100 usec) latencies for AIO reads/writes with small-iodepth
     workloads
   - 1M+ small-block IOPS workloads at low CPU utilization with large-iodepth
     workloads
   - End-to-end data integrity using T10 protection information (DIF)

IIUC, there is also missing support for block jobs like drive-mirror
which is needed by Nova.

From a functionality POV, migration and drive-mirror support are the two
core roadblocks to including vhost-scsi in Nova (as well as libvirt
support for it of course). Realistically it doesn't sound like these
are likely to be solved soon enough to give us confidence in taking
this for the Juno release cycle.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



[openstack-dev] vhost-scsi support in Nova

2014-07-23 Thread Nicholas A. Bellinger
Hi Nova folks,

Please let me address some of the outstanding technical points that have
been raised recently within the following spec [1] for supporting vhost-scsi
[2] within Nova.

Mike and Daniel have been going back and forth on various details, so I
thought it might be helpful to open the discussion to a wider audience.

First, some background.  I'm the target (LIO) subsystem maintainer for the
upstream Linux kernel, and have been one of the primary contributors in that
community for a number of years.  This includes the target-core subsystem,
the backend drivers that communicate with kernel storage subsystems, and a
number of frontend fabric protocol drivers.

vhost-scsi is one of those frontend fabric protocol drivers that has been
included upstream, that myself and others have contributed to and improved
over the past three years.  Given this experience and commitment to
supporting upstream code, I'd like to address some of the specific points
wrt vhost-scsi here.

*) vhost-scsi doesn't support migration

Since its initial merge in QEMU v1.5, vhost-scsi has a migration blocker
set.  This is primarily due to requiring some external orchestration in
order to set up the necessary vhost-scsi endpoints on the migration
destination to match what's running on the migration source.
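
To give a concrete feel for what that orchestration involves, here is a rough
Python sketch of the configfs steps a management layer would have to perform
on the destination host so that a matching vhost-scsi endpoint exists before
QEMU is started there.  The WWN, TPG number, backstore path, and the per-TPG
"nexus" attribute name are illustrative assumptions; the real layout should be
checked against the kernel in use (targetcli/rtslib do this same work in
practice).

#!/usr/bin/env python3
# Hedged sketch: recreate a vhost-scsi endpoint on the migration destination
# via LIO's configfs.  All names below (WWN, TPG, backstore) are examples and
# must match whatever is configured on the migration source.
import os

TARGET = "/sys/kernel/config/target"
wwn = "naa.6001405abcdef001"                                  # example endpoint WWN
backstore = os.path.join(TARGET, "core", "iblock_0", "vol0")  # pre-created backstore

# Create the fabric endpoint, its target portal group and a LUN directory.
tpg = os.path.join(TARGET, "vhost", wwn, "tpgt_1")
os.makedirs(os.path.join(tpg, "lun", "lun_0"), exist_ok=True)

# Expose the backstore through the fabric LUN; configfs models this as a
# symlink from the LUN directory to the backstore (the link name is arbitrary).
os.symlink(backstore, os.path.join(tpg, "lun", "lun_0", "vol0_port"))

# vhost TPGs also need an I_T nexus before the guest can do I/O (attribute
# name assumed here).
with open(os.path.join(tpg, "nexus"), "w") as f:
    f.write(wwn + "\n")

print("vhost-scsi endpoint ready; start QEMU on this host with wwpn=" + wwn)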

Here are a couple of points that Stefan detailed some time ago about what's
involved for properly supporting live migration with vhost-scsi:

(1) vhost-scsi needs to tell QEMU when it dirties memory pages, either by
DMAing to guest memory buffers or by modifying the virtio vring (which also
lives in guest memory).  This should be straightforward since the
infrastructure is already present in vhost (it's called the log) and used
by drivers/vhost/net.c.

(2) The harder part is seamless target handover to the destination host.
vhost-scsi needs to serialize any SCSI target state from the source machine
and load it on the destination machine.  We could be in the middle of
emulating a SCSI command.

An obvious solution is to only support active-passive or active-active HA
setups where tcm already knows how to fail over.  This typically requires
shared storage and maybe some communication for the clustering mechanism.
There are more sophisticated approaches, so this straightforward one is just
an example.

That said, we do intend to support live migration for vhost-scsi using
iSCSI/iSER/FC shared storage.

*) vhost-scsi doesn't support qcow2

Given that Cinder drivers other than the NetApp and Gluster drivers do not
use QEMU qcow2 to access storage blocks, this argument is not particularly
relevant here.

However, this doesn't mean that vhost-scsi (and target-core itself) cannot
support qcow2 images.  There is currently an effort to add a userspace
backend driver for the upstream target (tcm_core_user [3]), which will allow
various disk formats to be supported in userspace.

The important part for vhost-scsi is that regardless of what type of target
backend driver is put behind the fabric LUNs (raw block devices using
IBLOCK, qcow2 images using target_core_user, etc) the changes required in
Nova and libvirt to support vhost-scsi remain the same.  They do not change
based on the backend driver.
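
As a rough illustration of how small that guest-facing surface is, the Python
sketch below builds the kind of libvirt <hostdev> fragment used to attach a
vhost-scsi endpoint, keyed on nothing but the target WWPN.  The element and
attribute names follow the scsi_host/vhost form that libvirt eventually
adopted; whether the patches under discussion here used exactly the same XML
is an assumption.

#!/usr/bin/env python3
# Hedged sketch: the guest-facing configuration for a vhost-scsi controller
# reduces to a single WWPN, regardless of which LIO backend (IBLOCK, FILEIO,
# target_core_user, ...) sits behind the fabric LUNs.
import xml.etree.ElementTree as ET

def vhost_scsi_hostdev(wwpn):
    """Return a libvirt <hostdev> fragment attaching a vhost-scsi endpoint."""
    hostdev = ET.Element("hostdev", mode="subsystem", type="scsi_host")
    ET.SubElement(hostdev, "source", protocol="vhost", wwpn=wwpn)
    return ET.tostring(hostdev, encoding="unicode")

# The same fragment applies whatever backend driver backs the LUN; the backend
# choice is invisible to Nova and libvirt.
print(vhost_scsi_hostdev("naa.6001405abcdef001"))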

*) vhost-scsi is not intended for production

vhost-scsi has been included in the upstream kernel since the v3.6 release,
and in QEMU since v1.5.  vhost-scsi runs unmodified out of the box on a
number of popular distributions including Fedora, Ubuntu, and openSUSE.  It
also works as a QEMU boot device with SeaBIOS, and even with the Windows
virtio-scsi mini-port driver.
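
For completeness, attaching an already-configured endpoint to a guest is a
one-device affair on the QEMU side.  The short Python sketch below assembles
an example command line; the WWPN is made up, and the device property names
should be double-checked against the QEMU release in use.

#!/usr/bin/env python3
# Hedged sketch: launch arguments for a guest that boots from an ordinary
# virtio disk and additionally gets a vhost-scsi controller.  The WWPN must
# match the endpoint configured in LIO on this host.
import shlex

wwpn = "naa.6001405abcdef001"   # example; must match the LIO vhost endpoint

qemu_cmd = [
    "qemu-system-x86_64",
    "-enable-kvm",
    "-m", "2048",
    "-drive", "file=guest.img,format=raw,if=virtio",      # example boot disk
    # vhost-scsi controller: QEMU opens /dev/vhost-scsi and binds it to the
    # endpoint identified by the WWPN configured through configfs.
    "-device", "vhost-scsi-pci,wwpn=" + wwpn,
]

print("Would run:", " ".join(shlex.quote(a) for a in qemu_cmd))
# Replace the print with subprocess.check_call(qemu_cmd) to actually launch.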

There is at least one vendor who has already posted libvirt patches to
support vhost-scsi, so vhost-scsi is already being pushed beyond a debugging
and development tool.

For instance, here are a few specific use cases where vhost-scsi is
currently the only option for virtio-scsi guests:

  - Low (sub-100 usec) latencies for AIO reads/writes with small-iodepth
    workloads
  - 1M+ small-block IOPS workloads at low CPU utilization with large-iodepth
    workloads
  - End-to-end data integrity using T10 protection information (DIF)

So vhost-scsi can and will support essential features like live migration and
qcow2, and the virtio-scsi data plane effort should not block existing
alternatives that are already upstream.

With that, we'd like to see Nova officially support vhost-scsi because of
its wide availability in the Linux ecosystem, and the considerable
performance, efficiency, and end-to-end data-integrity benefits that it
already brings to the table.

We are committed to addressing the short and long-term items for this
driver, and making it a success in OpenStack Nova.

Thank you,

--nab

[1] https://review.openstack.org/#/c/103797/5/specs/juno/virtio-scsi-settings.rst
[2] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/vhost/scsi.c
[3] http://www.spinics.net/lists/target-devel/msg07339.html